Lecture Notes for FY3452

Gravitation and Cosmology

M. Kachelrieß
Institutt for fysikk
NTNU, Trondheim
Norway
email: [email protected]

Watch out for errors, most was written late in the evening.
Corrections, feedback and any suggestions always welcome!

Copyright © M. Kachelrieß 2010–2012, 2020.

Last update November 23, 2020
Contents

1 Special relativity
  1.1 Newtonian mechanics and gravity
  1.2 Minkowski space
  1.3 Relativistic mechanics
  1.A Appendix: Comments and examples on tensor and index notation

2 Lagrangian mechanics and symmetries
  2.1 Calculus of variations
  2.2 Hamilton’s principle and the Lagrange function
  2.3 Symmetries and conservation laws
  2.4 Free relativistic particle

3 Basic differential geometry
  3.1 Manifolds and tensor fields
  3.2 Tensor analysis
    3.2.1 Metric connection and covariant derivative
    3.2.2 Geodesics
  3.A Appendix: a bit more...
    3.A.1 Affine connection and covariant derivative
    3.A.2 Riemannian normal coordinates

4 Schwarzschild solution
  4.1 Spacetime symmetries and Killing vectors
  4.2 Schwarzschild metric
  4.3 Gravitational redshift
  4.4 Orbits of massive particles
  4.5 Orbits of photons
  4.6 Post-Newtonian parameters
  4.A Appendix: General stationary isotropic metric

5 Gravitational lensing

6 Black holes
  6.1 Rindler spacetime and the Unruh effect
  6.2 Schwarzschild black holes
  6.3 Reissner-Nordström black hole
  6.4 Kerr black holes
  6.5 Black hole thermodynamics and Hawking radiation
  6.A Appendix: Conformal flatness for d = 2

7 Classical field theory
  7.1 Lagrange formalism
  7.2 Noether’s theorem and conservation laws
  7.3 Perfect fluid
  7.4 Klein-Gordon field
  7.5 Maxwell field

8 Einstein’s field equation
  8.1 Curvature and the Riemann tensor
  8.2 Integration, metric determinant g, and differential operators
  8.3 Einstein-Hilbert action
  8.4 Dynamical stress tensor
    8.4.1 Cosmological constant
    8.4.2 Equations of motion
  8.5 Alternative theories

9 Linearized gravity and gravitational waves
  9.1 Linearized gravity
    9.1.1 Metric perturbations as a tensor field
    9.1.2 Linearized Einstein equation in vacuum
    9.1.3 Linearized Einstein equation with sources
    9.1.4 Polarization states
  9.2 Stress pseudo-tensor for gravity
  9.3 Emission of gravitational waves
  9.4 Gravitational waves from binary systems
    9.4.1 Weak field limit
    9.4.2 Strong field limit and binary merger
  9.A Appendix: Projection operator
  9.B Appendix: Derivation of the retarded Green function

10 Cosmological models for an homogeneous, isotropic universe
  10.1 Friedmann-Robertson-Walker metric for an homogeneous, isotropic universe
  10.2 Geometry of the Friedmann-Robertson-Walker metric
  10.3 Friedmann equations
  10.4 Scale-dependence of different energy forms
  10.5 Cosmological models with one energy component
  10.6 The ΛCDM model
  10.7 Determining Λ and the curvature R0 from ρm,0, H0, q0
  10.8 Particle horizons

11 Cosmic relics
  11.1 Time-line of important dates in the early universe
  11.2 Equilibrium statistical physics in a nut-shell
  11.3 Big Bang Nucleosynthesis
    11.3.1 Equilibrium distributions
    11.3.2 Proton-neutron ratio
    11.3.3 Estimate of helium abundance
    11.3.4 Results from detailed calculations
  11.4 Dark matter
    11.4.1 Freeze-out of thermal relic particles
    11.4.2 Hot dark matter
    11.4.3 Cold dark matter

12 Inflation and structure formation
  12.1 Inflation
    12.1.1 Scalar fields in the expanding universe
    12.1.2 Generation of perturbations
    12.1.3 Models for inflation
  12.2 Structure formation
    12.2.1 Overview and data
    12.2.2 Jeans mass of baryons
    12.2.3 Damping scales
    12.2.4 Growth of perturbations in an expanding Universe
    12.2.5 Recipes for structure formation
    12.2.6 Results

Preface
These notes summarise the lectures for FY3452 Gravitation and Cosmology that I gave in 2009, 2010 and 2020. Asked which of the three more advanced topics (black holes, gravitational waves and cosmology) should be given more time, students in 2009 voted for cosmology, while in 2010 and 2020 black holes and gravitational waves were their favourites. As a result, the notes probably contain more material than is manageable in a one-semester course.
I’m updating the notes throughout the semester. Compared to the last (2015) version, the order of topics has changed, some sections have been streamlined to make space for new material (e.g. the GW discovery), some, like the one about Noether’s theorem, have been improved, and conventions will be unified. At the moment, chapters 1–2, 4 and 6–9 are updated.
Several different sign conventions are possible in general relativity, and all of them are in use. One can define these choices as follows:

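$$ \eta^{\alpha\beta} = S_1 \times [-1, +1, +1, +1], \tag{0.1a} $$
$$ R^{\alpha}{}_{\beta\rho\sigma} = S_2 \times [\partial_\rho \Gamma^{\alpha}{}_{\beta\sigma} - \partial_\sigma \Gamma^{\alpha}{}_{\beta\rho} + \Gamma^{\alpha}{}_{\kappa\rho} \Gamma^{\kappa}{}_{\beta\sigma} - \Gamma^{\alpha}{}_{\kappa\sigma} \Gamma^{\kappa}{}_{\beta\rho}], \tag{0.1b} $$
$$ G_{\alpha\beta} = S_3 \times 8\pi G \, T_{\alpha\beta}, \tag{0.1c} $$
$$ R_{\alpha\beta} = S_2 S_3 \times R^{\rho}{}_{\alpha\rho\beta}. \tag{0.1d} $$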

We choose these three signs as $S_i = \{-, +, +\}$. Conventions of other authors are summarised in the following table:

         HEL    dI,R    MTW, H    W
[S1]      -      -        +       +
[S2]      +      +        +       -
[S3]      -      -        +       -

Some useful books:


H: J. B. Hartle. Gravity: An Introduction to Einstein’s General Relativity (Benjamin Cummings)

HEL: Hobson, M.P., Efstathiou, G.P., Lasenby, A.N.: General relativity: an introduction for physicists. Cambridge University Press 2006. [On a somewhat higher level than Hartle.]

• Robert M. Wald: General Relativity. University of Chicago Press 1986. [Uses a modern mathematical language.]

• Landau, Lev D.; Lifshitz, Evgenij M.: Course of theoretical physics 2 - The classical theory of fields. Pergamon Press Oxford, 1975.

MTW: Misner, Charles W.; Thorne, Kip S.; Wheeler, John A.: Gravitation. Freeman New York, 1998. [Entertaining and nice description of differential geometry - but lengthy.]

• Schutz, Bernard F.: A first course in general relativity. Cambridge Univ. Press, 2004.

• Stephani, Hans: Relativity: an introduction to special and general relativity. Cambridge Univ. Press, 2004.

W: Weinberg, Steven: Gravitation and cosmology. Wiley New York, 1972. [A classic. Many applications; outdated concerning cosmology.]

• Weyl, Hermann: Raum, Zeit, Materie. Springer Berlin, 1918 (Space, Time, Matter, Dover New York, 1952). [The classic.]

Finally: If you find typos (if not, you haven’t read carefully enough) in the part which is already updated, find conceptual errors, or have suggestions, send me an email!

1 Special relativity

1.1 Newtonian mechanics and gravity


Inertial frames and the principle of relativity. Newton presented his mechanics in an axiomatic form. His Lex Prima (or the Galilean law of inertia) states: Each force-less mass point stays at rest or moves on a straight line at constant speed. Distinguishing between straight and curved lines requires an affine structure of space, while measuring velocities relies on a metric structure that allows one to measure distances. In addition, we have to be able to compare time measurements made at different space points. Thus, in order to apply Newton’s first law, we have to add some assumptions on space and time. Implicitly, Newton assumed a Euclidean structure for space, and thus the distance between two points $P_1 = (x_1, y_1, z_1)$ and $P_2 = (x_2, y_2, z_2)$ in a Cartesian coordinate system is

$$ \Delta l_{12}^2 = (x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2 \tag{1.1} $$

or, for infinitesimal distances,

$$ dl^2 = dx^2 + dy^2 + dz^2 . \tag{1.2} $$

Moreover, he assumed the existence of an absolute time $t$ on which all observers can agree. In a Cartesian inertial coordinate system, Newton’s Lex Prima then becomes

$$ \frac{d^2 x}{dt^2} = \frac{d^2 y}{dt^2} = \frac{d^2 z}{dt^2} = 0 . \tag{1.3} $$

Most often, we call such a coordinate system just an inertial frame. Newton’s first law is not just a trivial consequence of his second law, but may be seen as a practical definition of those reference frames for which his subsequent laws are valid.
Which transformations connect these inertial frames or, in other words, what are the symmetries of empty space and time? We know that translations $\boldsymbol{a}$ and rotations $R$ are symmetries of Euclidean space: This means that using two different Cartesian coordinate systems, say a primed and an unprimed one, to label the points $P_1$ and $P_2$, their distance defined by Eq. (1.1) remains invariant, cf. Fig. 1.1. The condition that the norm of the distance vector $\boldsymbol{l}_{12}$ is invariant, $l_{12} = l'_{12}$, implies

$$ \boldsymbol{l}'^T \boldsymbol{l}' = \boldsymbol{l}^T R^T R \, \boldsymbol{l} = \boldsymbol{l}^T \boldsymbol{l} \tag{1.4} $$

or $R^T R = 1$. Thus rotations acting on a three-vector $\boldsymbol{x}$ are represented by orthogonal matrices, $R \in O(3)$. All frames connected by $\boldsymbol{x}' = R\boldsymbol{x} + \boldsymbol{a}$ to an inertial frame are inertial frames too.
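The invariance of the Euclidean distance under $\boldsymbol{x}' = R\boldsymbol{x} + \boldsymbol{a}$ can be checked numerically. The following is a minimal sketch in plain Python; the helper names (`rotation_z`, `transform`, `distance`, `gram`) are illustrative choices that do not appear in the notes, and only a rotation around the z axis is considered.

```python
import math


def rotation_z(angle):
    """Orthogonal matrix R for a rotation by `angle` (radians) around the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def transform(R, a, x):
    """Apply x' = R x + a to a three-vector x."""
    return [sum(R[i][j] * x[j] for j in range(3)) + a[i] for i in range(3)]


def distance(p, q):
    """Euclidean distance between two points, cf. Eq. (1.1)."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


def gram(R):
    """Return R^T R, which is the unit matrix for an orthogonal R."""
    return [[sum(R[k][i] * R[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]
```

Transforming two points with the same $R$ and $\boldsymbol{a}$ and comparing `distance` before and after should give the same value up to rounding errors, while `gram(R)` should reproduce the unit matrix.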
In addition, there may be transformations which connect inertial frames that move with a given relative velocity. In order to determine them, we consider two frames with relative velocity $v$ along the $x$ direction: The most general linear¹ transformation between these two frames is given by

$$ \begin{pmatrix} t' \\ x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} At + Bx \\ Dt + Ex \\ y \\ z \end{pmatrix} = \begin{pmatrix} At + Bx \\ A(x - vt) \\ y \\ z \end{pmatrix} . \tag{1.5} $$

In the second step, we used that the transformation matrix depends only on two constants, as you should show in Ex. ??.

¹ A non-linear transformation would destroy translation invariance, cf. Ex.xx

Figure 1.1: The point P is invariant, with the coordinates $(x, y)$ and $(x', y')$ in the two coordinate systems.

Newton assumed the existence of an absolute time, $t = t'$, and thus $A = 1$ and $B = 0$. Then the proper Galilean transformations $x' = x - vt$ connect inertial frames moving with relative speed $v$. Taking a time derivative leads to the classical addition law for velocities, $\dot{x}' = \dot{x} - v$. Time differences $\Delta t_{12}$ and space differences $\Delta l_{12}$ are separately invariant under these transformations.
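As a small illustration, the sketch below applies a Galilean transformation to events $(t, x)$ in one spatial dimension. It assumes the sign convention of Eq. (1.5) with $A = 1$ and $B = 0$, i.e. $x' = x - vt$; the function names are hypothetical and chosen only for this example.

```python
def galilean_boost(event, v):
    """Map an event (t, x) into a frame moving with velocity v along x.

    Assumes t' = t and x' = x - v t (Eq. (1.5) with A = 1, B = 0).
    """
    t, x = event
    return (t, x - v * t)


def mean_velocity(event_a, event_b):
    """Velocity of a particle moving uniformly from event_a to event_b."""
    (ta, xa), (tb, xb) = event_a, event_b
    return (xb - xa) / (tb - ta)


if __name__ == "__main__":
    a, b = (0.0, 0.0), (2.0, 6.0)   # particle moving with velocity 3
    a2, b2 = galilean_boost(a, 1.0), galilean_boost(b, 1.0)
    print(mean_velocity(a, b), mean_velocity(a2, b2))
```

With these conventions, the velocity measured in the moving frame is reduced by $v$, the time difference between two events is unchanged, and the spatial distance between two simultaneous events is unchanged as well.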
The Principle of Relativity states that identical experiments performed in different inertial frames give identical results. Galilean transformations keep (1.3) invariant, hence Newton’s first law does not allow one to distinguish between different inertial frames. Before the advent of special relativity, it was thought that this principle applies only to mechanical experiments. In particular, it was thought that electromagnetic waves require a medium (the “aether”) to propagate: hence the rest frame of the aether could be used to single out a preferred frame.
Newton’s Lex Secunda states that, observed from an inertial reference frame, the net force on a particle is proportional to the time rate of change of its linear momentum,

$$ \boldsymbol{F} = \frac{d\boldsymbol{p}}{dt} \tag{1.6} $$

where $\boldsymbol{p} = m_{\rm in} \boldsymbol{v}$ and $m_{\rm in}$ denotes the inertial mass of the body.

Newtonian gravity. Newton’s gravitational law as well as Coulomb’s law are examples of an instantaneous action,

$$ \boldsymbol{F}(\boldsymbol{x}) = \sum_i K_i \frac{\boldsymbol{x} - \boldsymbol{x}_i}{|\boldsymbol{x} - \boldsymbol{x}_i|^3} . \tag{1.7} $$

The force $\boldsymbol{F}(\boldsymbol{x}, t)$ depends on the distance $\boldsymbol{x}(t) - \boldsymbol{x}_i(t)$ to all sources $i$ (electric charges or masses) at the same time $t$, i.e. the force needs no time to be transmitted from $\boldsymbol{x}_i$ to $\boldsymbol{x}$. The factor $K$ in Newton’s law is $-G m_g M_g$, where we introduced, in analogy to the electric charge in Coulomb’s law, the gravitational “charge” $m_g$ characterizing the strength of the gravitational force between different particles. Surprisingly, one finds $m_{\rm in} = m_g$ and we can drop the index.
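Since the gravitational field is conservative, $\nabla \times \boldsymbol{F} = 0$, we can introduce a potential $\phi$ via

$$ \boldsymbol{F} = -m \nabla \phi \tag{1.8} $$

with

$$ \phi(\boldsymbol{x}) = -\frac{GM}{|\boldsymbol{x} - \boldsymbol{x}'|} . \tag{1.9} $$

In analogy to the electric field $\boldsymbol{E} = -\nabla\phi$, we can introduce a gravitational field, $\boldsymbol{g} = -\nabla\phi$. We then obtain $\nabla \cdot \boldsymbol{g}(\boldsymbol{x}) = -4\pi G \rho(\boldsymbol{x})$ and, as Poisson equation,

$$ \Delta\phi(\boldsymbol{x}) = 4\pi G \rho(\boldsymbol{x}) , \tag{1.10} $$

where $\rho$ is the mass density, $\rho = dm/d^3x$. Just as the full Maxwell equations reduce in the limit $v/c \to 0$ to the electrostatic Poisson equation, a relativistic generalisation of Newtonian gravity should exist.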
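The relation between the potential and the field of point masses can be illustrated with a few lines of Python. This is only a sketch under stated assumptions: units with $G = 1$ by default, a list of hypothetical `(mass, position)` pairs as sources, and a central-difference gradient for the numerical check; none of these names appear in the notes.

```python
import math


def potential(x, sources, G=1.0):
    """Newtonian potential phi(x) = -sum_i G M_i / |x - x_i| of point masses."""
    return -sum(G * M / math.dist(x, xi) for M, xi in sources)


def field(x, sources, G=1.0):
    """Gravitational field g(x) = -sum_i G M_i (x - x_i) / |x - x_i|^3."""
    g = [0.0, 0.0, 0.0]
    for M, xi in sources:
        r = math.dist(x, xi)
        for k in range(3):
            g[k] -= G * M * (x[k] - xi[k]) / r**3
    return g


def numerical_gradient(f, x, h=1e-5):
    """Central-difference approximation to the gradient of a scalar function f."""
    grad = []
    for k in range(3):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        grad.append((f(xp) - f(xm)) / (2 * h))
    return grad
```

Comparing `field(x, sources)` with minus the numerical gradient of `potential` at the same point gives a quick consistency check of $\boldsymbol{g} = -\nabla\phi$ away from the sources.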

1.2 Minkowski space


Light cone and metric tensor. A light signal emitted at the point $\boldsymbol{x}_1$ at the time $t_1$ propagates along a cone defined by

$$ (ct_1 - ct_2)^2 - (x_1 - x_2)^2 - (y_1 - y_2)^2 - (z_1 - z_2)^2 = 0 . \tag{1.11} $$

In special relativity, we postulate that the speed of light is universal, i.e. that all observers measure $c = c'$. A condition which guarantees this and resembles Eq. (1.1) is that the squared distance in an inertial frame

$$ \Delta s^2 \equiv (ct_1 - ct_2)^2 - (x_1 - x_2)^2 - (y_1 - y_2)^2 - (z_1 - z_2)^2 \tag{1.12} $$

between two spacetime events $x_1^\mu = (ct_1, \boldsymbol{x}_1)$ and $x_2^\mu = (ct_2, \boldsymbol{x}_2)$ is invariant. Hence the symmetry group of space and time is given by all those coordinate transformations $x^\mu \to \tilde{x}^\mu = \Lambda^\mu{}_\nu x^\nu$ that keep $\Delta s^2$ invariant. Since these transformations mix space and time, we speak about spacetime or, to honour the inventor of this geometrical interpretation, about Minkowski space.
The distance of two infinitesimally close spacetime events is called the line-element $ds$ of the spacetime. In Minkowski space, it is given by

$$ ds^2 = c^2 dt^2 - dx^2 - dy^2 - dz^2 \tag{1.13} $$

using a Cartesian inertial frame. More precisely, the line-element $ds$ is defined as the norm of the displacement vector

$$ d\boldsymbol{s} = ds^\mu \boldsymbol{e}_\mu . \tag{1.14} $$

Choosing as basis the coordinate vectors to $x^\mu = (ct, \boldsymbol{x})$, its components are

$$ ds^\mu = dx^\mu = (c\,dt, d\boldsymbol{x}) . \tag{1.15} $$

We now compare our physical requirement on the distance of spacetime events, Eq. (1.13), with the general result for the scalar product of two vectors $\boldsymbol{a}$ and $\boldsymbol{b}$. If these vectors have the components $a^\mu$ and $b^\mu$ in a certain basis $\boldsymbol{e}_\mu$, then we can write

$$ \boldsymbol{a} \cdot \boldsymbol{b} = \sum_{\mu,\nu=0}^{3} (a^\mu \boldsymbol{e}_\mu) \cdot (b^\nu \boldsymbol{e}_\nu) = \sum_{\mu,\nu=0}^{3} a^\mu b^\nu (\boldsymbol{e}_\mu \cdot \boldsymbol{e}_\nu) . \tag{1.16} $$

Thus we can evaluate the scalar product between any two vectors, if we know the symmetric matrix $g$ composed of the products of the basis vectors at all spacetime points $x^\mu$,

$$ g_{\mu\nu}(x) = \boldsymbol{e}_\mu(x) \cdot \boldsymbol{e}_\nu(x) = g_{\nu\mu}(x) . \tag{1.17} $$

This symmetric matrix $g_{\mu\nu}$ is called the metric tensor.
Applying this now to the displacement vector, we obtain

$$ ds^2 = d\boldsymbol{s} \cdot d\boldsymbol{s} = \sum_{\mu,\nu=0}^{3} g_{\mu\nu} \, dx^\mu dx^\nu \overset{!}{=} c^2 dt^2 - dx^2 - dy^2 - dz^2 . \tag{1.18} $$

Hence the metric tensor $g_{\mu\nu}$ becomes, for the special case of a Cartesian inertial frame in Minkowski space, diagonal with elements

$$ g_{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \equiv \eta_{\mu\nu} . \tag{1.19} $$

Introducing Einstein’s summation convention (cf. the box for details), we can rewrite the scalar product of two vectors with components $a^\mu$ and $b^\mu$ as

$$ \boldsymbol{a} \cdot \boldsymbol{b} \equiv \eta_{\mu\nu} a^\mu b^\nu = a_\mu b^\mu = a^\mu b_\mu . \tag{1.20} $$

In the last part of (1.20), we “lowered an index”: $a_\mu = \eta_{\mu\nu} a^\nu$ or $b_\mu = \eta_{\mu\nu} b^\nu$. Next we introduce the opposite operation of raising an index by $a^\mu = \eta^{\mu\nu} a_\nu$. Since raising and lowering are inverse operations, we have $\eta_{\mu\nu} \eta^{\nu\sigma} = \delta_\mu^\sigma$. Thus the elements of $\eta_{\mu\nu}$ and $\eta^{\mu\nu}$ form inverse matrices, which both agree with (1.19) for a Cartesian inertial coordinate frame in Minkowski space.
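The operations of lowering and raising an index and the scalar product (1.20) can be written out explicitly. The sketch below assumes a Cartesian inertial frame, so that $\eta_{\mu\nu}$ and $\eta^{\mu\nu}$ have the same entries as in (1.19); the names `ETA`, `lower`, `raise_index` and `dot` are illustrative only.

```python
# Minkowski metric eta_{mu nu} in a Cartesian inertial frame, Eq. (1.19)
ETA = [[1, 0, 0, 0],
       [0, -1, 0, 0],
       [0, 0, -1, 0],
       [0, 0, 0, -1]]


def lower(a_up):
    """a_mu = eta_{mu nu} a^nu: flips the sign of the spatial components."""
    return [sum(ETA[mu][nu] * a_up[nu] for nu in range(4)) for mu in range(4)]


def raise_index(a_down):
    """a^mu = eta^{mu nu} a_nu; eta^{mu nu} has the same entries as eta_{mu nu} here."""
    return [sum(ETA[mu][nu] * a_down[nu] for nu in range(4)) for mu in range(4)]


def dot(a_up, b_up):
    """Scalar product a . b = a_mu b^mu of two vectors given by upper-index components."""
    a_down = lower(a_up)
    return sum(a_down[mu] * b_up[mu] for mu in range(4))
```

Applying `raise_index` after `lower` returns the original components, which is the statement $\eta_{\mu\nu}\eta^{\nu\sigma} = \delta_\mu^\sigma$ in component form.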

Figure 1.2: Light-cone at the point $y$ generated by light-like vectors, $(x - y)^2 = 0$. Contained in the light-cone are the time-like vectors, $(x - y)^2 > 0$, outside the space-like ones, $(x - y)^2 < 0$.

Einstein’s summation convention:

1. Two equal indices, of which one has to be an upper and one a lower index, imply summation. We use Greek letters for indices from zero to three, $\mu = 0, 1, 2, 3$, and Latin letters for indices from one to three, $i = 1, 2, 3$. Thus

$$ a_\mu b^\mu \equiv \sum_{\mu=0}^{3} a_\mu b^\mu = a^0 b^0 - a^1 b^1 - a^2 b^2 - a^3 b^3 = a^0 b^0 - \boldsymbol{a} \cdot \boldsymbol{b} = a^0 b^0 - a^i b^i . $$

2. Summation indices are dummy indices which can be freely exchanged; the remaining free indices of the LHS and RHS of an equation have to agree. Hence

$$ 8 = a_\mu{}^\mu = c_{\mu\nu} d^{\mu\nu} = c_{\mu\sigma} d^{\mu\sigma} $$

is okay, while $a_\mu = b^\mu$ or $a_\mu = b_{\mu\nu}$ compares apples to oranges.

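Since the metric $\eta_{\mu\nu}$ is indefinite, the norm of a vector $a^\mu$ can be

$$ a_\mu a^\mu > 0, \quad \text{time-like}, \tag{1.21} $$
$$ a_\mu a^\mu = 0, \quad \text{light-like or null-vector}, \tag{1.22} $$
$$ a_\mu a^\mu < 0, \quad \text{space-like}. \tag{1.23} $$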

The cone of all light-like vectors starting from a point P is called light-cone, cf. Fig. 1.2. The
time-like region inside the light-cone consists of two parts, past and future. Only events inside
the past light-cone can influence the physics at point P , while P can influence only its future
light-cone.
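The classification (1.21) to (1.23) and the causal statement about the light-cone can be turned into a short check. The sketch below is a minimal illustration, assuming components $(ct, x, y, z)$ with signature $(+,-,-,-)$; the function names and the tolerance `tol` are hypothetical, and `can_influence` counts events on the cone itself (connected by a light signal) as able to influence $P$.

```python
def norm_squared(a):
    """a_mu a^mu for upper-index components a = (a^0, a^1, a^2, a^3)."""
    return a[0] ** 2 - a[1] ** 2 - a[2] ** 2 - a[3] ** 2


def classify(a, tol=1e-12):
    """Return 'time-like', 'light-like' or 'space-like' according to the sign of the norm."""
    n = norm_squared(a)
    if n > tol:
        return "time-like"
    if n < -tol:
        return "space-like"
    return "light-like"


def can_influence(event_q, event_p, tol=1e-12):
    """True if event_q lies inside or on the past light-cone of event_p."""
    d = [p - q for p, q in zip(event_p, event_q)]
    return d[0] >= 0 and norm_squared(d) >= -tol
```

For example, an event at the origin can influence the event $(2, 1, 0, 0)$, but not the event $(1, 2, 0, 0)$, which is space-like separated from it.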
The line describing the position of an observer is called the world-line. The proper-time $\tau$ is the time displayed by a clock moving with the observer. How can we determine the correct definition of $\tau$? First, we require that in the rest system of the observer, proper- and coordinate-time agree, $d\tau = dt$. But for a clock at rest, $ds^\mu/c = (dt, \boldsymbol{0})$ and thus $ds/c = dt$. Since the RHS of $d\tau = ds/c$ is an invariant expression, it has to be valid in any frame and thus also for a moving clock. For finite times, we have to integrate the line-element,

\begin{align}
\tau_{12} = \int_1^2 d\tau &= \int_1^2 [dt^2 - (dx^2 + dy^2 + dz^2)/c^2]^{1/2} \tag{1.24} \\
&= \int_1^2 dt \, [1 - (1/c^2)((dx/dt)^2 + (dy/dt)^2 + (dz/dt)^2)]^{1/2} \tag{1.25} \\
&= \int_1^2 dt \, [1 - v^2/c^2]^{1/2} < t_2 - t_1 , \tag{1.26}
\end{align}

to obtain the proper-time. The last part of this equation, where we introduced the three-velocity $v^i = dx^i/dt$ of the clock, shows explicitly the relativistic effect of time dilation, as well as the connection between coordinate time $t$ and the proper-time $\tau$ of a moving clock, $d\tau = (1 - (v/c)^2)^{1/2} dt \equiv dt/\gamma$.
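Equation (1.26) can be evaluated numerically for a clock whose speed changes with time. The following sketch uses a simple midpoint rule; the function name `proper_time`, the callable `speed` and the number of steps are assumptions made for this illustration, and the speed has to stay below `c` throughout.

```python
import math


def proper_time(speed, t1, t2, steps=10000, c=1.0):
    """Integrate d(tau) = sqrt(1 - v(t)^2 / c^2) dt from t1 to t2 (midpoint rule).

    `speed` is a function returning the magnitude of the three-velocity at time t.
    """
    dt = (t2 - t1) / steps
    tau = 0.0
    for k in range(steps):
        t_mid = t1 + (k + 0.5) * dt
        v = speed(t_mid)
        tau += math.sqrt(1.0 - (v / c) ** 2) * dt
    return tau


if __name__ == "__main__":
    # A clock moving with constant v = 0.6 c for a coordinate time of 10
    print(proper_time(lambda t: 0.6, 0.0, 10.0))
```

For constant speed the result reduces to $\Delta t/\gamma$; for any non-zero speed the proper time comes out smaller than the coordinate time difference.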

Lorentz transformations. If we replace $t$ by $-it$ in $\Delta s^2$, the difference between two spacetime events becomes (minus) the normal Euclidean distance. Similarly, the identity $\cos^2\alpha + \sin^2\alpha = 1$ for an imaginary angle $\eta = i\alpha$ becomes $\cosh^2\eta - \sinh^2\eta = 1$. Thus a close correspondence exists between rotations $R_{ij}$ in Euclidean space which leave $\Delta\boldsymbol{x}^2$ invariant and Lorentz transformations $\Lambda^\mu{}_\nu$ which leave $\Delta s^2$ invariant. We therefore try as a guess for a boost along the $x$ direction

$$ c\tilde{t} = ct \cosh\eta + x \sinh\eta , \tag{1.27} $$
$$ \tilde{x} = ct \sinh\eta + x \cosh\eta , \tag{1.28} $$

with $\tilde{y} = y$ and $\tilde{z} = z$. Direct calculation shows that $\Delta s^2$ is invariant as desired. Consider now in the system $\tilde{K}$ the origin of the system $K$. Then $x = 0$ and

$$ \tilde{x} = ct \sinh\eta \quad \text{and} \quad c\tilde{t} = ct \cosh\eta . \tag{1.29} $$

Dividing the two equations gives $\tilde{x}/c\tilde{t} = \tanh\eta$. Since $\beta = \tilde{x}/c\tilde{t}$ is the relative velocity of the two systems measured in units of $c$, the imaginary “rotation angle” $\eta$ equals the rapidity

$$ \eta = \operatorname{arctanh} \beta . \tag{1.30} $$

Note that the rapidity $\eta$ is a more natural variable than $v$ or $\beta$ to characterise a Lorentz boost, because $\eta$ is additive: Boosting a particle with rapidity $\eta_1$ by $\eta$ leads to the rapidity $\eta_2 = \eta_1 + \eta$. Using the identities

$$ \cosh\eta = \frac{1}{\sqrt{1 - \tanh^2\eta}} = \frac{1}{\sqrt{1 - \beta^2}} \equiv \gamma \tag{1.31} $$
$$ \sinh\eta = \frac{\tanh\eta}{\sqrt{1 - \tanh^2\eta}} = \frac{\beta}{\sqrt{1 - \beta^2}} = \gamma\beta \tag{1.32} $$

in (1.27) and (1.28) gives the standard form of the Lorentz transformations,

$$ \tilde{x} = \frac{x + vt}{\sqrt{1 - \beta^2}} = \gamma (x + \beta ct) \tag{1.33} $$
$$ c\tilde{t} = \frac{ct + vx/c}{\sqrt{1 - \beta^2}} = \gamma (ct + \beta x) . \tag{1.34} $$

The inverse transformation is obtained by replacing $v \to -v$ and exchanging quantities with and without tilde.
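The properties of boosts derived above (invariance of $\Delta s^2$, additivity of rapidities, and the inverse transformation) can be verified with a few lines of code. This is a sketch in units with $c = 1$, restricted to one spatial dimension; all function names are illustrative and not taken from the notes.

```python
import math


def rapidity(beta):
    """Rapidity eta = arctanh(beta) of a boost with velocity beta = v/c, Eq. (1.30)."""
    return math.atanh(beta)


def boost(ct, x, eta):
    """Boost along x with rapidity eta, Eqs. (1.27) and (1.28)."""
    return (ct * math.cosh(eta) + x * math.sinh(eta),
            ct * math.sinh(eta) + x * math.cosh(eta))


def interval(ct, x):
    """Squared spacetime distance of the event (ct, x) from the origin."""
    return ct ** 2 - x ** 2


def add_velocities(beta1, beta2):
    """Combine two collinear velocities by adding their rapidities."""
    return math.tanh(rapidity(beta1) + rapidity(beta2))
```

Boosting an event with `eta` and then with `-eta` returns the original coordinates, and `add_velocities` reproduces the familiar relativistic addition law $(\beta_1 + \beta_2)/(1 + \beta_1\beta_2)$ as a consequence of the additivity of $\eta$.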
In addition to boosts parametrised by the rapidity $\eta$, rotations parametrised by the angle $\alpha$ keep the spacetime distance invariant and are thus Lorentz transformations. For the special case of a boost along and a rotation around the $x^1$ axis, they are given in matrix form by

$$ \Lambda^\mu{}_\nu(\eta_x) = \begin{pmatrix} \cosh\eta & \sinh\eta & 0 & 0 \\ \sinh\eta & \cosh\eta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} , \quad \text{and} \quad \Lambda^\mu{}_\nu(\alpha_x) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \cos\alpha & \sin\alpha \\ 0 & 0 & -\sin\alpha & \cos\alpha \end{pmatrix} . \tag{1.35} $$

Four-vectors and tensors. In Minkowski space, we call a four-vector any four-tuple $V^\mu$ that transforms as $\tilde{V}^\mu = \Lambda^\mu{}_\nu V^\nu$. By convention, we associate three-vectors with the spatial part of vectors with upper indices, e.g. we set $x^\mu = \{ct, x, y, z\}$ or $A^\mu = \{\phi, \boldsymbol{A}\}$. Lowering then the index by contraction with the metric tensor results in a minus sign of the spatial components of a four-vector, $x_\mu = \eta_{\mu\nu} x^\nu = \{ct, -x, -y, -z\}$ or $A_\mu = \{\phi, -\boldsymbol{A}\}$. When summing over a pair of Lorentz indices, one index always occurs in an upper and one in a lower position. In addition to four-vectors, we will meet tensors $T^{\mu_1 \cdots \mu_n}$ of rank $n$ which transform as $\tilde{T}^{\mu_1 \cdots \mu_n} = \Lambda^{\mu_1}{}_{\nu_1} \cdots \Lambda^{\mu_n}{}_{\nu_n} T^{\nu_1 \cdots \nu_n}$. Every tensor index can be raised and lowered, using the metric tensors $\eta^{\mu\nu}$ and $\eta_{\mu\nu}$.
Special tensors are the Kronecker delta, $\delta_\mu^\nu = \eta_\mu{}^\nu$ with $\delta_\mu^\nu = 1$ for $\mu = \nu$ and 0 otherwise, and the Levi–Civita tensor $\varepsilon_{\mu\nu\rho\sigma}$. The latter tensor is completely antisymmetric and has in four dimensions the elements $+1$ for even permutations of $\varepsilon_{0123}$, $-1$ for odd permutations and zero otherwise. In three dimensions, we define the Levi–Civita tensor by $\varepsilon^{123} = \varepsilon_{123} = 1$.
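Next consider differential operators. Forming the differential of a function $f$ defined on Minkowski space $x^\mu$,

$$ df = \frac{\partial f}{\partial t} dt + \frac{\partial f}{\partial x} dx + \frac{\partial f}{\partial y} dy + \frac{\partial f}{\partial z} dz = \frac{\partial f}{\partial x^\mu} dx^\mu , \tag{1.36} $$

we see that an upper index in the denominator counts as a lower index, and vice versa. We define the four-dimensional nabla operator as

$$ \partial_\mu \equiv \frac{\partial}{\partial x^\mu} = \left( \frac{1}{c} \frac{\partial}{\partial t}, \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right) . $$

Note the “missing” minus sign in the spatial components, which is consistent with $\partial_\mu = \partial/\partial x^\mu$ and the rule for the differential in Eq. (1.36). The d’Alembert or wave operator is

$$ \Box \equiv \eta_{\mu\nu} \partial^\mu \partial^\nu = \partial_\mu \partial^\mu = \frac{1}{c^2} \frac{\partial^2}{\partial t^2} - \Delta . \tag{1.37} $$

This operator is a scalar, i.e. all the Lorentz indices are contracted, and thus invariant under Lorentz transformations.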

1.3 Relativistic mechanics


From now on, we set $c = \hbar = 1$.

Four-velocity and four-momentum. What is the relativistic generalization of the three-velocity $\boldsymbol{v} = d\boldsymbol{x}/dt$? The numerator $d\boldsymbol{x}$ already has the right behaviour to become part of a four-vector, if the denominator were invariant. We therefore use instead of $dt$ the invariant proper time $d\tau$ and write

$$ u^\alpha = \frac{dx^\alpha}{d\tau} . \tag{1.38} $$

The four-velocity is thus the tangent vector to the world-line $x^\alpha(\tau)$ parametrised by the proper-time $\tau$ of a particle. Written explicitly, we have

$$ u^0 = \frac{dt}{d\tau} = \frac{1}{\sqrt{1 - v^2}} = \gamma \tag{1.39} $$

and

$$ u^i = \frac{dx^i}{d\tau} = \frac{dx^i}{dt} \frac{dt}{d\tau} = \frac{v^i}{\sqrt{1 - v^2}} = \gamma v^i . \tag{1.40} $$

Hence the four-velocity is $u^\alpha = (\gamma, \gamma\boldsymbol{v})$ and its norm is

$$ u \cdot u = u^0 u^0 - u^i u^i = \gamma^2 - \gamma^2 v^2 = \gamma^2 (1 - v^2) = 1 . \tag{1.41} $$

The fact that its norm is constant confirms that $u^\alpha$ is a four-vector.

Energy and momentum. After having constructed the four-velocity, the simplest guess for the four-momentum is

$$ p^\alpha = m u^\alpha = (\gamma m, \gamma m \boldsymbol{v}) . \tag{1.42} $$

For small velocities, $v \ll 1$, we obtain

$$ p^i = \left( 1 + \frac{v^2}{2} + \ldots \right) m v^i \tag{1.43} $$
$$ p^0 = m + \frac{m v^2}{2} + \ldots = m + E_{\rm kin,nr} + \ldots \tag{1.44} $$

Thus we can interpret the components as $p^\alpha = (E, \boldsymbol{p})$. The norm follows with (1.41) immediately as

$$ p \cdot p = m^2 . \tag{1.45} $$
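The relations (1.41), (1.44) and (1.45) are easy to check numerically. The sketch below works in units with $c = 1$ and takes the three-velocity as a list of three components; the function names are hypothetical helpers introduced only for this example.

```python
import math


def gamma(v):
    """Lorentz factor for a three-velocity v = [vx, vy, vz] in units with c = 1."""
    return 1.0 / math.sqrt(1.0 - sum(vi * vi for vi in v))


def four_velocity(v):
    """u^alpha = (gamma, gamma v), Eqs. (1.39) and (1.40)."""
    g = gamma(v)
    return [g] + [g * vi for vi in v]


def four_momentum(m, v):
    """p^alpha = m u^alpha, Eq. (1.42)."""
    return [m * u for u in four_velocity(v)]


def minkowski_dot(a, b):
    """Scalar product with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]
```

For any velocity below the speed of light, `minkowski_dot(u, u)` should equal 1 and `minkowski_dot(p, p)` should equal $m^2$ up to rounding; for small velocities, `p[0] - m` approaches the non-relativistic kinetic energy $mv^2/2$.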

Solving for the energy, we obtain

$$ E = \pm \sqrt{m^2 + \boldsymbol{p}^2} \tag{1.46} $$

including the famous $E = mc^2$ as special case for a particle at rest. Note that (1.46) predicts the existence of solutions with negative energy, which would undermine the stability of the universe. According to Feynman, we should view these negative-energy solutions as positive-energy solutions moving backward in time, $\exp(-i(-\sqrt{m^2 + \boldsymbol{p}^2})t) = \exp[-i(+\sqrt{m^2 + \boldsymbol{p}^2})(-t)]$.

Four-forces. We postulate now that in relativistic mechanics Newton’s law becomes

$$ f^\alpha = \frac{dp^\alpha}{d\tau} \tag{1.47} $$

where we introduced the four-force $f^\alpha$. Since both $u^\alpha$ and $p^\alpha$ consist of only three independent components, we expect that there exists also a constraint on the four-force $f^\alpha$. We form the scalar product

$$ u \cdot f = u \cdot \frac{d(mu)}{d\tau} = u \cdot u \, \frac{dm}{d\tau} + m u \cdot \frac{du}{d\tau} = \frac{dm}{d\tau} . \tag{1.48} $$

In the last step we used twice that $u \cdot u = 1$: once directly, and once in the form $u \cdot du/d\tau = \frac{1}{2} d(u \cdot u)/d\tau = 0$. Since all electrons ever observed have the same mass, no force should exist which changes $m$. As a consequence, we have to require that all physically acceptable force-laws satisfy $u \cdot f = 0$; such forces are called pure forces.

Observer. The world-line $x^\mu(\tau)$ of an observer, or of any massive particle, is time-like: By this we mean not that $x^\mu$ as a vector is time-like (a statement not invariant under translations) but that the distance $ds^2$ between any two points of the world-line is time-like. Equivalently, the four-velocity $u^\alpha$ of a massive particle is a time-like vector. At each instant, we can choose an instantaneous Cartesian inertial frame with the four basis vectors $\{\boldsymbol{e}_\mu(\tau)\} = \{\boldsymbol{e}_0(\tau), \boldsymbol{e}_1(\tau), \boldsymbol{e}_2(\tau), \boldsymbol{e}_3(\tau)\}$ in which the observer is at rest. Then the time-like basis vector $\boldsymbol{e}_0(\tau)$ agrees with the four-velocity $u_{\rm obs}$ of the observer. Moreover, the scalar product of the basis vectors satisfies $\boldsymbol{e}_\mu \cdot \boldsymbol{e}_\nu = \eta_{\mu\nu}$. A measurement of a particle with four-momentum $k^\mu = (\omega, \boldsymbol{k})$ performed by the observer at rest results in the energy $\omega$ and the momenta $k^i = -k \cdot \boldsymbol{e}_i$. We can rewrite this as a tensor equation,

$$ \omega = k \cdot u_{\rm obs} \quad \text{and} \quad k^i = -k \cdot \boldsymbol{e}_i , \tag{1.49} $$

and thus the RHSs are valid also for a moving observer.

1.A Appendix: Comments and examples on tensor and index notation

How to guess physical tensors. Classical electrodynamics is typically taught using a formulation which is valid in a specific frame. Thus one uses scalars like the charge density $\rho$, vectors like the electric and magnetic field strengths $\boldsymbol{E}$ and $\boldsymbol{B}$ and tensors like Maxwell’s stress tensor $\sigma_{ij}$, defining their transformation properties with respect to rotations in three-dimensional space. This leads to the question how we can guess how the four-dimensional tensors are composed out of their three-dimensional relatives.
In the simplest cases, we may guess this by considering quantities which are related by a physical law. An example is current conservation,

$$ \partial_t \rho + \nabla \cdot \boldsymbol{j} = 0 . \tag{1.50} $$

We know that any 4-vector $a^\mu$ has 4 = 3 + 1 components, which transform as a scalar ($a^0$) and a vector ($\boldsymbol{a}$) under rotations. This suggests combining $(\rho, \boldsymbol{j}) = j^\mu$ and $\partial_\mu = (\partial_t, \nabla)$ into four-vectors (consistent with our definition of the nabla operator), leading to $\partial_\mu j^\mu = 0$. Similarly, we combine the scalar potential $\phi$ and the vector potential $\boldsymbol{A}$ into a four-vector $A^\mu = (\phi, \boldsymbol{A})$. If we move to tensors of rank two, i.e. 4 × 4 matrices, it is useful to formalise the splitting of such a tensor into components.
Reducible and irreducible tensors. An object which contains invariant subgroups with respect to a symmetry operation is called reducible. In our case at hand, we want to determine the reducible subgroups of a tensor of rank $n$ with respect to spatial rotations. For a four-vector, the splitting is $A^\mu = (A^0, \boldsymbol{A})$. Next, we consider the reducible subgroups of an arbitrary tensor $T^{\mu\nu}$ of rank two. First, we note that we can split any tensor $T^{\mu\nu}$ into a symmetric and an antisymmetric piece, $T^{\mu\nu} = S^{\mu\nu} + A^{\mu\nu}$ with $S^{\mu\nu} = S^{\nu\mu}$ and $A^{\mu\nu} = -A^{\nu\mu}$, writing

$$ T_{\mu\nu} = \frac{1}{2} (T_{\mu\nu} + T_{\nu\mu}) + \frac{1}{2} (T_{\mu\nu} - T_{\nu\mu}) \equiv T_{\{\mu\nu\}} + T_{[\mu\nu]} \equiv S_{\mu\nu} + A_{\mu\nu} . \tag{1.51} $$

This splitting is invariant under general coordinate transformations, and thus also under rotations, Ex. ??. Physically this is expected, since our equations tell us that some quantities are antisymmetric (e.g. the field-strength tensor $F^{\mu\nu}$), while others are symmetric (e.g. Maxwell’s stress tensor $\sigma_{ij}$) and all observers should agree on this.
Thus we can examine the symmetric and antisymmetric tensors separately, and we start with the former. We can split $S^{\mu\nu}$ into a scalar $S^{00}$, a vector $S^{0i}$ and a tensor $S^{ij}$,

$$ S^{\mu\nu} = \begin{pmatrix} S^{00} & S^{0i} \\ S^{i0} & S^{ij} \end{pmatrix} . \tag{1.52} $$

To show this, calculate the effect of a rotation, $\tilde{S}^{\mu\nu} = \Lambda^\mu{}_\rho \Lambda^\nu{}_\sigma S^{\rho\sigma}$, or in matrix notation $\tilde{S} = \Lambda S \Lambda^T$, where for a rotation

$$ \Lambda^\mu{}_\nu = \begin{pmatrix} 1 & 0 \\ 0^T & R \end{pmatrix} . \tag{1.53} $$

The tensor $S^{ij}$ is again reducible, since its trace is a scalar. Thus we can decompose $S^{ij}$ into its trace $s = S^{ii}$ and its traceless part $S^{ij} - s\delta^{ij}/(d - 1)$.
An antisymmetric tensor $A^{\mu\nu}$ has 3 + 2 + 1 = 6 components, i.e. combines two 3-vectors, or more precisely a polar vector like $\boldsymbol{E}$ and an axial vector like $\boldsymbol{B}$,

$$ A^{\mu\nu} = \begin{pmatrix} 0 & -E_x & -E_y & -E_z \\ E_x & 0 & -B_z & B_y \\ E_y & B_z & 0 & -B_x \\ E_z & -B_y & B_x & 0 \end{pmatrix} . \tag{1.54} $$

To show this, calculate again the effect of a rotation, and of a parity transformation.

(Anti-) symmetrisation Finally let us note some useful relations for contractions involving
symmetric and antisymmetric tensors. First, they are “orthogonal” in the sense that the
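contraction of a symmetric tensor $S_{\mu\nu}$ with an antisymmetric tensor $A^{\mu\nu}$ gives zero,

$$ S_{\mu\nu} A^{\mu\nu} = 0 . \tag{1.55} $$

This allows one to (anti-) symmetrize the contraction of an arbitrary tensor $C^{\mu\nu}$ with an
(anti-) symmetric tensor: First split $C^{\mu\nu}$ into symmetric and antisymmetric parts,
$$ C^{\mu\nu} = \frac{1}{2}\left(C^{\mu\nu} + C^{\nu\mu}\right) + \frac{1}{2}\left(C^{\mu\nu} - C^{\nu\mu}\right) \equiv C^{\{\mu\nu\}} + C^{[\mu\nu]} . \tag{1.56} $$
Then
$$ S_{\mu\nu} C^{\mu\nu} = S_{\mu\nu} C^{\{\mu\nu\}} \quad\text{and}\quad A_{\mu\nu} C^{\mu\nu} = A_{\mu\nu} C^{[\mu\nu]} . \tag{1.57} $$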
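
The relations (1.55) and (1.57) can be checked numerically. The following is a minimal sketch in plain Python; the helper names (`symmetric_part`, `antisymmetric_part`, `contract`) and the random test tensors are illustrative choices, not part of the notes.

```python
import random

def symmetric_part(T):
    """Return (T[i][j] + T[j][i]) / 2, the symmetric part of a rank-two tensor."""
    n = len(T)
    return [[0.5 * (T[i][j] + T[j][i]) for j in range(n)] for i in range(n)]

def antisymmetric_part(T):
    """Return (T[i][j] - T[j][i]) / 2, the antisymmetric part of a rank-two tensor."""
    n = len(T)
    return [[0.5 * (T[i][j] - T[j][i]) for j in range(n)] for i in range(n)]

def contract(X, Y):
    """Full contraction X_{mu nu} Y^{mu nu}: sum over both index pairs."""
    n = len(X)
    return sum(X[i][j] * Y[i][j] for i in range(n) for j in range(n))

random.seed(1)

def random_tensor():
    return [[random.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(4)]

C = random_tensor()            # arbitrary tensor
D = random_tensor()            # only used to build S and A
S = symmetric_part(D)
A = antisymmetric_part(D)

sa = contract(S, A)                                             # Eq. (1.55)
sc_diff = contract(S, C) - contract(S, symmetric_part(C))       # Eq. (1.57), first part
ac_diff = contract(A, C) - contract(A, antisymmetric_part(C))   # Eq. (1.57), second part
print(sa, sc_diff, ac_diff)    # all three vanish up to rounding errors
```

Since the symmetry properties do not depend on the index position, the same check works for any diagonal metric.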


Index gymnastics We are mainly concerned with vectors and tensors of rank two. In this
case we can express all equations as matrix operations. For instance, lowering the index of a
vector, $A_\mu = \eta_{\mu\nu} A^\nu$, becomes
$$ A_\mu = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \begin{pmatrix} A^0 \\ A^1 \\ A^2 \\ A^3 \end{pmatrix} = \begin{pmatrix} A^0 \\ -A^1 \\ -A^2 \\ -A^3 \end{pmatrix} . $$

Raising and lowering an index are inverse operations, and thus $\eta_{\mu\nu}\eta^{\nu\sigma} = \delta_\mu^\sigma$. In matrix notation,

$$ \eta\,\eta^{-1} = 1 . $$

We can view $\eta_{\mu\nu}\eta^{\nu\sigma} = \delta_\mu^\sigma$ as the operation of raising an index of $\eta_{\mu\nu}$ (or lowering an index of
$\eta^{\mu\nu}$): in both cases, we see that the Kronecker delta corresponds to the metric tensor with
mixed indices, $\delta_\mu^\sigma = \eta_\mu{}^\sigma$.
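The expression for the line-element becomes
$$ \begin{aligned} ds^2 = \eta_{\mu\nu}\,dx^\mu dx^\nu = dx^\mu \eta_{\mu\nu}\, dx^\nu &= \left(dx^0, dx^1, dx^2, dx^3\right) \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \begin{pmatrix} dx^0 \\ dx^1 \\ dx^2 \\ dx^3 \end{pmatrix} \\ &= (dx^0)^2 - (dx^1)^2 - (dx^2)^2 - (dx^3)^2 . \end{aligned} $$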

For a second-rank tensor, lowering one index gives

$$ T_\mu{}^\nu = \eta_{\mu\rho} T^{\rho\nu} = T^{\rho\nu} \eta_{\mu\rho} \neq \eta_{\mu\rho} T^{\nu\rho} = T^\nu{}_\mu . $$

Note that the order of tensors does not matter, but the order of indices does. If we move to
matrix notation, we have to restore the right order. Lowering next the second index,

$$ T_{\mu\nu} = \eta_{\mu\rho}\eta_{\nu\sigma} T^{\rho\sigma} , $$

we have to re-order it as $T_{\mu\nu} = \eta_{\mu\rho} T^{\rho\sigma} \eta_{\sigma\nu}$ in matrix notation (using that $\eta$ is symmetric).
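We apply this to the field-strength tensor: Starting from $F^{\mu\nu}$, we want to construct $F_{\mu\nu} =
\eta_{\mu\rho} F^{\rho\sigma} \eta_{\sigma\nu}$,
$$ \begin{aligned} F_{\mu\nu} &= \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \begin{pmatrix} 0 & -E_x & -E_y & -E_z \\ E_x & 0 & -B_z & B_y \\ E_y & B_z & 0 & -B_x \\ E_z & -B_y & B_x & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \\ &= \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \begin{pmatrix} 0 & E_x & E_y & E_z \\ E_x & 0 & B_z & -B_y \\ E_y & -B_z & 0 & B_x \\ E_z & B_y & -B_x & 0 \end{pmatrix} = \begin{pmatrix} 0 & E_x & E_y & E_z \\ -E_x & 0 & -B_z & B_y \\ -E_y & B_z & 0 & -B_x \\ -E_z & -B_y & B_x & 0 \end{pmatrix} . \end{aligned} \tag{1.58} $$

Note the general behaviour: The $F^{00}$ element and the 3-tensor $F^{ik}$ are multiplied by $1^2$ and
$(-1)^2$, respectively, and do not change sign. The 3-vector $F^{0k}$ is multiplied by $(-1)(+1)$ and
does change sign.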


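Next we want to construct a Lorentz scalar out of $F^{\mu\nu}$. A Lorentz scalar has no indices, so
we contract the two indices, $\eta_{\mu\nu} F^{\mu\nu} = F_\mu{}^\mu$. This is invariant, but zero (and thus not useful)
because $F^{\mu\nu}$ is antisymmetric. As next try, we construct a Lorentz scalar S using two F’s:
Multiplying the two matrices $F_{\mu\nu}$ and $F^{\nu\rho}$, and taking then the trace, gives
$$ S = F_{\mu\nu} F^{\mu\nu} = -\mathrm{tr}\{F_{\mu\nu} F^{\nu\rho}\} = -\mathrm{tr} \begin{pmatrix} \mathbf{E}\cdot\mathbf{E} & \ast & \ast & \ast \\ \ast & E_x^2 - B_z^2 - B_y^2 & \ast & \ast \\ \ast & \ast & E_y^2 - B_z^2 - B_x^2 & \ast \\ \ast & \ast & \ast & E_z^2 - B_y^2 - B_x^2 \end{pmatrix} , $$
where the off-diagonal elements, which do not contribute to the trace, are not shown,
i.e. $S = -2(\mathbf{E}\cdot\mathbf{E} - \mathbf{B}\cdot\mathbf{B})$. Note the minus sign, which arises since we have to change the order of indices in
the second F.
Note also that S has to be bilinear in E and B and invariant under rotations. Thus the
only possible terms entering S are the scalar products E · E, B · B and E · B. Since B is an
axial (or pseudo-) vector, P B = B, while E is a polar vector, P E = −E, the last term is a
pseudo-scalar and cannot enter the scalar S.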
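
The matrix manipulations of Eq. (1.58) and the value of the invariant S can be reproduced with a few lines of code. This is a sketch in plain Python under the conventions used above (metric diag(1, −1, −1, −1), units with c = 1); the function names and the numerical values chosen for E and B are arbitrary illustrations.

```python
ETA = [[1, 0, 0, 0],
       [0, -1, 0, 0],
       [0, 0, -1, 0],
       [0, 0, 0, -1]]

def matmul(X, Y):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def field_strength_upper(E, B):
    """F^{mu nu} in the form used in Eq. (1.58)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [[0.0, -Ex, -Ey, -Ez],
            [Ex, 0.0, -Bz, By],
            [Ey, Bz, 0.0, -Bx],
            [Ez, -By, Bx, 0.0]]

def lower_both_indices(F_up):
    """F_{mu nu} = eta F eta in matrix notation."""
    return matmul(matmul(ETA, F_up), ETA)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

E = (1.0, 2.0, 3.0)    # arbitrary test values
B = (0.5, -1.0, 2.0)

F_up = field_strength_upper(E, B)
F_low = lower_both_indices(F_up)

# S = F_{mu nu} F^{mu nu}, summed over both indices
S = sum(F_low[m][n] * F_up[m][n] for m in range(4) for n in range(4))
# trace of the matrix product F_{mu nu} F^{nu rho}
product = matmul(F_low, F_up)
trace = sum(product[m][m] for m in range(4))

print(S, -trace, -2.0 * (dot(E, E) - dot(B, B)))   # the three numbers agree
```

The electric components of `F_low` have the opposite sign compared to `F_up`, while the magnetic components are unchanged, as stated below Eq. (1.58).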
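Now we become more ambitious, looking at a tensor with 4 indices, the Levi–Civita or
completely antisymmetric tensor $\varepsilon^{\alpha\beta\gamma\delta}$ in four dimensions, with

$$ \varepsilon^{0123} = +1 \tag{1.59} $$

and likewise +1 for all even permutations of 0123, −1 for odd permutations and zero otherwise. We lower its indices,

$$ \varepsilon_{\alpha\beta\gamma\delta} = \varepsilon^{\bar\alpha\bar\beta\bar\gamma\bar\delta}\, \eta_{\bar\alpha\alpha}\, \eta_{\bar\beta\beta}\, \eta_{\bar\gamma\gamma}\, \eta_{\bar\delta\delta} , $$

and consider the 0123 element using that the metric is diagonal,

$$ \varepsilon_{0123} = +1\, \eta_{00}\, \eta_{11}\, \eta_{22}\, \eta_{33} = -1 . \tag{1.60} $$

Thus in 4 dimensions, $\varepsilon^{\alpha\beta\gamma\delta}$ and $\varepsilon_{\alpha\beta\gamma\delta}$ have opposite signs.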
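
The sign flip in Eq. (1.60) holds for every non-vanishing component, since each of them contains the indices 0, 1, 2, 3 exactly once. A minimal sketch in plain Python that builds the non-zero components and lowers all four indices with the diagonal metric; the dictionary representation and the helper names are illustrative choices.

```python
from itertools import permutations

ETA_DIAG = (1, -1, -1, -1)   # diagonal elements of the Minkowski metric

def permutation_sign(p):
    """Return +1 for an even and -1 for an odd permutation of (0, 1, 2, 3)."""
    p = list(p)
    sign = 1
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]   # each swap puts the value j at position j
            sign = -sign
    return sign

# Non-vanishing components of the tensor with upper indices, epsilon^{0123} = +1.
eps_upper = {p: permutation_sign(p) for p in permutations(range(4))}

def lower_all_indices(eps_up):
    """Lower all four indices with a diagonal metric, as in Eq. (1.60)."""
    lowered = {}
    for p, value in eps_up.items():
        factor = 1
        for index in p:
            factor *= ETA_DIAG[index]
        lowered[p] = value * factor
    return lowered

eps_lower = lower_all_indices(eps_upper)
print(eps_upper[(0, 1, 2, 3)], eps_lower[(0, 1, 2, 3)])   # prints: 1 -1
```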


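We can use the Levi-Civita tensor to define the dual field-strength tensor
$$ \tilde F^{\alpha\beta} = \frac{1}{2}\, \varepsilon^{\alpha\beta\gamma\delta} F_{\gamma\delta} . $$
How to find the elements of this? Using simply the definitions,
$$ \tilde F^{01} = \frac{1}{2}\Big( \underbrace{\varepsilon^{0123}}_{+1}\, \underbrace{F_{23}}_{-B_x} + \varepsilon^{0132} F_{32} \Big) = -B_x , $$
$$ \tilde F^{12} = \frac{1}{2}\left( \varepsilon^{1203} F_{03} + \varepsilon^{1230} F_{30} \right) = E_z , $$
etc., gives
$$ \tilde F^{\mu\nu} = \begin{pmatrix} 0 & -B_x & -B_y & -B_z \\ B_x & 0 & E_z & -E_y \\ B_y & -E_z & 0 & E_x \\ B_z & E_y & -E_x & 0 \end{pmatrix} \quad\text{and}\quad \tilde F_{\mu\nu} = \begin{pmatrix} 0 & B_x & B_y & B_z \\ -B_x & 0 & E_z & -E_y \\ -B_y & -E_z & 0 & E_x \\ -B_z & E_y & -E_x & 0 \end{pmatrix} . $$

The dual field-strength tensor is useful, because the homogeneous Maxwell equation

$$ \partial_\alpha F_{\beta\gamma} + \partial_\beta F_{\gamma\alpha} + \partial_\gamma F_{\alpha\beta} = 0 \tag{1.61} $$

becomes simply
$$ \partial_\alpha \tilde F^{\alpha\beta} = 0 . \tag{1.62} $$
Inserting the potential, we obtain zero,
$$ \partial_\alpha \tilde F^{\alpha\beta} = \frac{1}{2}\, \varepsilon^{\alpha\beta\gamma\delta}\, \partial_\alpha F_{\gamma\delta} = \varepsilon^{\alpha\beta\gamma\delta}\, \partial_\alpha \partial_\gamma A_\delta = 0 , \tag{1.63} $$
because we contract a symmetric tensor ($\partial_\alpha \partial_\gamma$) with an anti-symmetric one ($\varepsilon^{\alpha\beta\gamma\delta}$).
Having $F^{\mu\nu}$ and $\tilde F_{\mu\nu}$, we can form another (pseudo-) scalar, $A = \tilde F_{\mu\nu} F^{\mu\nu}$. Multiplying the
two matrices $\tilde F_{\mu\nu}$ and $F^{\nu\rho}$, and taking then the trace, gives
$$ \tilde F_{\mu\nu} F^{\mu\nu} = -\mathrm{tr}\{\tilde F_{\mu\nu} F^{\nu\rho}\} = -\mathrm{tr} \begin{pmatrix} \mathbf{B}\cdot\mathbf{E} & 0 & 0 & 0 \\ 0 & \mathbf{B}\cdot\mathbf{E} & 0 & 0 \\ 0 & 0 & \mathbf{B}\cdot\mathbf{E} & 0 \\ 0 & 0 & 0 & \mathbf{B}\cdot\mathbf{E} \end{pmatrix} , $$
i.e. $\tilde F_{\mu\nu} F^{\mu\nu} = -4\, \mathbf{E}\cdot\mathbf{B}$. We know that E · B is a pseudo-scalar. This tells us that including the
Levi-Civita tensor converts a tensor into a pseudo-tensor, which acquires an additional minus sign
under a parity transformation P x = −x compared to a true tensor of the same rank. (This is
analogous to $B_i = \varepsilon_{ijk} \partial_j A_k$, which converts two polar vectors into an axial one.)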

2 Lagrangian mechanics and symmetries
We review briefly the Lagrangian formulation of classical mechanics and its connection to
symmetries.

2.1 Calculus of variations


A map F[f(x)] from a certain space of functions f(x) into R is called a functional. We will
consider functionals from the space $C^2[a, b]$ of (at least) twice differentiable functions between
fixed points a and b. Extrema of functionals are obtained by the calculus of variations. Let
us consider as functional the action S defined by
$$ S[L(q^i, \dot q^i)] = \int_a^b dt\, L(q^i, \dot q^i, t) , \tag{2.1} $$

where L is a function of the 2n independent functions $q^i$ and $\dot q^i = dq^i/dt$ as well as of the
parameter t. In classical mechanics, we call L the Lagrange function of the system, $q^i$ are its
generalised coordinates, $\dot q^i$ the corresponding velocities and t is the time. The extremum of
this action gives the paths from a to b which are solutions of the equations of motion for the
system described by L. We discuss in the next section how one derives the correct L given a
set of interactions and constraints.
The calculus of variations shows how one finds those paths that extremize such functionals:
Consider an infinitesimal variation of the path, $q^i(t) \to q^i(t) + \delta q^i(t)$ with $\delta q^i(t) = \varepsilon \eta^i(t)$, that
keeps the endpoints fixed, but is otherwise arbitrary. The resulting variation of the functional
is
$$ \delta S = \int_a^b dt\, \delta L(q^i, \dot q^i, t) = \int_a^b dt \left( \frac{\partial L}{\partial q^i}\, \delta q^i + \frac{\partial L}{\partial \dot q^i}\, \delta \dot q^i \right) . \tag{2.2} $$
We can eliminate the variation $\delta \dot q^i$ of the velocities, integrating the second term by parts
using $\delta(\dot q^i) = d/dt(\delta q^i)$,
$$ \delta S = \int_a^b dt \left[ \frac{\partial L}{\partial q^i} - \frac{d}{dt} \frac{\partial L}{\partial \dot q^i} \right] \delta q^i + \left[ \frac{\partial L}{\partial \dot q^i}\, \delta q^i \right]_a^b . \tag{2.3} $$

The boundary term vanishes, because we required that the variations $\delta q^i$ are zero at the
endpoints a and b. Since the variations are otherwise arbitrary, the terms in the first bracket
have to be zero for an extremal curve, δS = 0. Paths that satisfy δS = 0 are classically
allowed. The equations resulting from the condition δS = 0 are called the Euler-Lagrange
equations of the action S,
$$ \frac{\delta S}{\delta q^i} = \frac{\partial L}{\partial q^i} - \frac{d}{dt} \frac{\partial L}{\partial \dot q^i} = 0 , \tag{2.4} $$
and give the equations of motion of the system specified by L. Physicists often call these equations
simply Lagrange equations or, especially in classical mechanics, Lagrange equations of
the second kind.
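
The statement that a solution of the equations of motion extremizes the action can be illustrated numerically. The sketch below is an illustration and not part of the notes: it assumes a harmonic oscillator, L = q̇²/2 − q²/2 (with m = k = 1), a simple midpoint discretisation of the action, and the variation η(t) = sin(πt/T), which vanishes at both endpoints.

```python
import math

def action(path, dt, m=1.0, k=1.0):
    """Discretised action for L = m*qdot^2/2 - k*q^2/2 on a uniform time grid."""
    S = 0.0
    for q0, q1 in zip(path[:-1], path[1:]):
        qdot = (q1 - q0) / dt
        qmid = 0.5 * (q0 + q1)
        S += (0.5 * m * qdot**2 - 0.5 * k * qmid**2) * dt
    return S

T, N = 1.0, 2000          # total time and number of time steps
dt = T / N
times = [n * dt for n in range(N + 1)]

def classical(t):
    """Solution of qddot = -q with q(0) = 0 and q(T) = 1."""
    return math.sin(t) / math.sin(T)

def varied_path(eps):
    """Classical path plus eps * eta(t), with eta(0) = eta(T) = 0."""
    return [classical(t) + eps * math.sin(math.pi * t / T) for t in times]

S0 = action(varied_path(0.0), dt)
# Term linear in eps: vanishes (up to discretisation errors) for the classical path.
first_order = (action(varied_path(1e-3), dt) - action(varied_path(-1e-3), dt)) / 2e-3
# Change of the action for a finite variation: of second order in eps.
second_order = action(varied_path(0.1), dt) - S0

print(S0, first_order, second_order)
```

For this choice of T the second-order change is positive, so the classical path is a minimum of the action; for sufficiently long time intervals the extremum of the oscillator action becomes a saddle point instead.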


The Lagrangian L is not uniquely fixed: Adding a total time-derivative, $L' = L + df(q, t)/dt$,
does not change the resulting Lagrange equations,
$$ S' = S + \int_a^b dt\, \frac{df}{dt} = S + f(q(b), t_b) - f(q(a), t_a) , \tag{2.5} $$
since the variation of the last two terms vanishes when the action is varied with fixed endpoints a
and b.
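
As a quick check of this statement, consider the illustrative choice f = q²/2 (an assumption made only for this example), so that L′ = L + q q̇. The extra terms cancel in the Lagrange equations:

```latex
% Example with f(q) = q^2/2, i.e. L' = L + q \dot q
\frac{\partial L'}{\partial q} - \frac{d}{dt}\frac{\partial L'}{\partial \dot q}
  = \left( \frac{\partial L}{\partial q} + \dot q \right)
    - \frac{d}{dt}\left( \frac{\partial L}{\partial \dot q} + q \right)
  = \frac{\partial L}{\partial q} - \frac{d}{dt}\frac{\partial L}{\partial \dot q} .
```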

Infinitesimal variations: If you are worried about the meaning of “infinitesimal” variations,
the following definition may help: Consider a one-parameter family of paths,

$$ q^i(t, \varepsilon) = q^i(t, 0) + \varepsilon \eta^i(t) . $$

Then the “infinitesimal” variation corresponds to the change linear in ε,

$$ \delta q \equiv \lim_{\varepsilon \to 0} \frac{q(t, \varepsilon) - q(t, 0)}{\varepsilon} = \left. \frac{\partial q(t, \varepsilon)}{\partial \varepsilon} \right|_{\varepsilon = 0} \tag{2.6} $$

and similarly for functions and functionals of q. Moreover, it is obvious from Eq. (2.6) that
the assumption of time-independent ε implies that the variation δ and the time-derivative d/dt
acting on q commute,
$$ \delta(\dot q) = \left. \frac{\partial \dot q(t, \varepsilon)}{\partial \varepsilon} \right|_{\varepsilon = 0} = \frac{d}{dt}(\delta q) . $$

2.2 Hamilton’s principle and the Lagrange function


The observation that the solutions of the equations of motion can be obtained as the extrema
of an appropriate functional (“the action S”) of the Lagrangian L subject to the conditions
$\delta q^i(a) = \delta q^i(b) = 0$ is called Hamilton’s principle or the principle of least action. Note that
the last name is a misnomer, since the extremum can also be a maximum or a saddle-point
of the action.
We derive now the Lagrangian L of a free non-relativistic particle from the Galilean principle
of inertia. More precisely, we use that the homogeneity of space and time forbids that L
depends on x and t, while the isotropy of space implies that L depends only on the norm of
the velocity vector, but not on its direction,

$$ L = L(v^2) . $$

Let us consider two inertial frames moving with the infinitesimal velocity ε relative to each
other. Then a Galilean transformation connects the velocities measured in the two frames as
v′ = v + ε. The Galilean principle of relativity requires that the laws of motion have the same
form in both frames, and thus the Lagrangians can differ only by a total time-derivative.
Expanding the difference δL in ε gives with $\delta v^2 = 2 \mathbf{v}\cdot\boldsymbol{\varepsilon}$
$$ \delta L = \frac{\partial L}{\partial v^2}\, \delta v^2 = 2\, \mathbf{v}\cdot\boldsymbol{\varepsilon}\, \frac{\partial L}{\partial v^2} . \tag{2.7} $$
The difference has to be a total time-derivative. Since v = q̇, the derivative term $\partial L/\partial v^2$ has
to be independent of v. Hence, $L \propto v^2$ and we call the proportionality constant m/2, and
the total expression kinetic energy T,

$$ L = T = \frac{1}{2} m v^2 . \tag{2.8} $$

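Example: Check the relativity principle for finite relative velocities:

For a finite relative velocity V, we have
$$ L' = \frac{1}{2} m v'^2 = \frac{1}{2} m (\mathbf{v} + \mathbf{V})^2 = \frac{1}{2} m v^2 + m\, \mathbf{v}\cdot\mathbf{V} + \frac{1}{2} m V^2 $$
or
$$ L' = L + \frac{d}{dt} \left( m\, \mathbf{x}\cdot\mathbf{V} + \frac{1}{2} m V^2 t \right) . $$
Thus the difference is indeed a total time derivative.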

We can write the velocity with $dl^2 = dx^2 + dy^2 + dz^2$ as

$$ v^2 = \frac{dl^2}{dt^2} = g_{ik}\, \frac{dx^i}{dt} \frac{dx^k}{dt} , \tag{2.9} $$
where the quadratic form $g_{ik}$ is the metric tensor. For instance, in spherical coordinates
$dl^2 = dr^2 + r^2 \sin^2\vartheta\, d\phi^2 + r^2 d\vartheta^2$ and thus
$$ T = \frac{1}{2} m \left( \dot r^2 + r^2 \sin^2\vartheta\, \dot\phi^2 + r^2 \dot\vartheta^2 \right) . \tag{2.10} $$
Choosing the appropriate coordinates, we can account for constraints: The kinetic energy of a
particle moving on a sphere with radius R would be simply given by $T = m R^2 (\sin^2\vartheta\, \dot\phi^2 + \dot\vartheta^2)/2$.
For a system of non-interacting particles, L is additive, $L = \sum_a \frac{1}{2} m_a v_a^2$. If there are
interactions (assumed for the moment to depend only on the coordinates), then we subtract
a function $V(\mathbf{r}_1, \mathbf{r}_2, \ldots)$ called potential energy.
We can now derive the equations of motion for a system of n interacting particles,
$$ L = \sum_{a=1}^{n} \frac{1}{2} m_a v_a^2 - V(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_n) , \tag{2.11} $$

using the Lagrange equations,

$$ m_a \frac{d\mathbf{v}_a}{dt} = -\frac{\partial V}{\partial \mathbf{r}_a} = \mathbf{F}_a . \tag{2.12} $$
We can change from Cartesian coordinates to arbitrary (or “generalized”) coordinates for
the n particles,
$$ x^a = f^a(q^1, \ldots, q^n) , \qquad \dot x^a = \frac{\partial f^a}{\partial q^k}\, \dot q^k . \tag{2.13} $$
Substituting gives
$$ L = \frac{1}{2}\, a_{ik}\, \dot q^i \dot q^k - V(q^i) , \tag{2.14} $$
where the kinetic term is a quadratic function of the velocities $\dot q^i$ and the matrix $a_{ik}(q)$ is, apart from the
factors $m_a$, identical to the metric tensor on the configuration space. Finally, we define the
canonically conjugated momentum $p_i$ as
$$ p_i = \frac{\partial L}{\partial \dot q^i} . \tag{2.15} $$


A coordinate qi that does not appear explicitly in L is called cyclic. The Lagrange equations
imply then ∂L/∂ q̇i = const., so that the corresponding canonically conjugated momentum
pi = ∂L/∂ q̇ i is conserved.
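
As a standard illustration (not taken from the notes), consider a particle moving in a plane in a central potential V(r), described by polar coordinates (r, φ). The angle φ does not appear in L and is therefore cyclic, so that its conjugate momentum, the angular momentum, is conserved:

```latex
% Planar motion in a central potential V(r): the angle \varphi is cyclic
L = \frac{1}{2} m \left( \dot r^2 + r^2 \dot\varphi^2 \right) - V(r) ,
\qquad
\frac{\partial L}{\partial \varphi} = 0
\quad\Rightarrow\quad
p_\varphi = \frac{\partial L}{\partial \dot\varphi} = m r^2 \dot\varphi = \text{const.}
```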

Feynman’s approach to quantum theory:


The whole information about a quantum mechanical system is contained in its time-evolution
operator U(t, t′). Its matrix elements K(x′, t′; x, t) (propagator or Green’s function) in the
coordinate basis connect wavefunctions at different times as
$$ \psi(x', t') = \int d^3x\, K(x', t'; x, t)\, \psi(x, t) . $$

Feynman proposed the following connection between the propagator K and the classical action
S,
$$ K(x', t'; x, t) = N \int_q^{q'} \mathcal{D}q\, \exp(iS) , $$

where Dq denotes the “integration over all paths.” Hence the difference between the classical
and quantum world is that in the former only paths extremizing the action S are allowed while
in the latter all paths weighted by exp(iS) contribute.
For a readable introduction see R. P. Feynman, A. R. Hibbs: Quantum mechanics and path integrals or R. P.
Feynman (editor: Laurie M. Brown), Feynman’s thesis : a new approach to quantum theory.

2.3 Symmetries and conservation laws


Quantities that remain constant during the evolution of a mechanical system are called
integrals of motion. Seven of them that are connected to the fundamental symmetries of space
and time are of special importance: These are the conserved quantities energy, momentum
and angular momentum.

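Energy The Lagrangian of a closed system does not depend explicitly on time, because of the
homogeneity of time. Its total time derivative is
$$ \frac{dL}{dt} = \frac{\partial L}{\partial q^i}\, \dot q^i + \frac{\partial L}{\partial \dot q^i}\, \ddot q^i . \tag{2.16} $$
Replacing $\partial L/\partial q^i$ by $(d/dt)\, \partial L/\partial \dot q^i$, it follows
$$ \frac{dL}{dt} = \dot q^i\, \frac{d}{dt} \frac{\partial L}{\partial \dot q^i} + \frac{\partial L}{\partial \dot q^i}\, \ddot q^i = \frac{d}{dt} \left( \dot q^i\, \frac{\partial L}{\partial \dot q^i} \right) . \tag{2.17} $$
Hence the quantity
$$ E \equiv \dot q^i\, \frac{\partial L}{\partial \dot q^i} - L \tag{2.18} $$
remains constant during the evolution of a closed system. This holds also more generally, e.g.
in the presence of static external fields, as long as the Lagrangian is not time-dependent.
We have still to show that E coincides indeed with the usual definition of energy. Using
L = T(q, q̇) − U(q), where T is quadratic in the velocities, we have
$$ \dot q^i\, \frac{\partial L}{\partial \dot q^i} = \dot q^i\, \frac{\partial T}{\partial \dot q^i} = 2T \tag{2.19} $$
and thus E = 2T − L = T + U.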


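Momentum Homogeneity of space implies that a translation by a constant vector of a
closed system does not change its properties. Thus an infinitesimal translation from r to
r + ε should not change L. Since velocities are unchanged, we have (summing over the
particles a)
$$ \delta L = \sum_a \frac{\partial L}{\partial \mathbf{r}_a} \cdot \delta \mathbf{r}_a = \boldsymbol{\varepsilon} \cdot \sum_a \frac{\partial L}{\partial \mathbf{r}_a} . \tag{2.20} $$

The condition δL = 0 is true for arbitrary ε, if

$$ \sum_a \frac{\partial L}{\partial \mathbf{r}_a} = 0 . \tag{2.21} $$

Using again Lagrange’s equations, we obtain

$$ \sum_a \frac{d}{dt} \frac{\partial L}{\partial \mathbf{v}_a} = \frac{d}{dt} \sum_a \frac{\partial L}{\partial \mathbf{v}_a} = 0 . \tag{2.22} $$

Hence, in a closed mechanical system the momentum vector of the system

$$ \mathbf{p}_{\rm tot} = \sum_a \frac{\partial L}{\partial \mathbf{v}_a} = \sum_a m_a \mathbf{v}_a = \text{const.} \tag{2.23} $$

is conserved.
The condition (2.21) signifies with $\partial L/\partial \mathbf{r}_a = -\partial V/\partial \mathbf{r}_a$ that the sum of forces on all
particles is zero, $\sum_a \mathbf{F}_a = 0$. For the particular case of a two-particle system, $\mathbf{F}_a = -\mathbf{F}_b$, we
have thus derived Newton’s third law, the equality of action and reaction.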

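Isotropy We consider now the consequences of the isotropy of space, i.e. search for the conserved
quantity that follows from a Lagrangian invariant under rotations. Under an infinitesimal
rotation by δφ both coordinates and velocities change,
\begin{align}
\delta \mathbf{r} &= \delta\boldsymbol{\phi} \times \mathbf{r} , \tag{2.24} \\
\delta \mathbf{v} &= \delta\boldsymbol{\phi} \times \mathbf{v} . \tag{2.25}
\end{align}
Inserting the expression into
$$ \delta L = \sum_a \left( \frac{\partial L}{\partial \mathbf{r}_a} \cdot \delta \mathbf{r}_a + \frac{\partial L}{\partial \mathbf{v}_a} \cdot \delta \mathbf{v}_a \right) = 0 \tag{2.26} $$

gives, using also the definition $\mathbf{p}_a = \partial L/\partial \mathbf{v}_a$ as well as the Lagrange equation $\dot{\mathbf{p}}_a = \partial L/\partial \mathbf{r}_a$,
$$ \delta L = \sum_a \left( \dot{\mathbf{p}}_a \cdot \delta\boldsymbol{\phi} \times \mathbf{r}_a + \mathbf{p}_a \cdot \delta\boldsymbol{\phi} \times \mathbf{v}_a \right) = 0 . \tag{2.27} $$

Permuting the factors and extracting δφ gives

$$ \delta\boldsymbol{\phi} \cdot \sum_a \left( \mathbf{r}_a \times \dot{\mathbf{p}}_a + \mathbf{v}_a \times \mathbf{p}_a \right) = \delta\boldsymbol{\phi} \cdot \frac{d}{dt} \sum_a \mathbf{r}_a \times \mathbf{p}_a = 0 . \tag{2.28} $$

Thus the angular momentum

$$ \mathbf{M} = \sum_a \mathbf{r}_a \times \mathbf{p}_a = \text{const.} \tag{2.29} $$

is conserved.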
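
The three conservation laws derived above can be observed in a small simulation. The following sketch is an illustration under assumptions that are not part of the notes: two particles interact via the potential V = K|r₁ − r₂|²/2, which depends only on their relative distance, and the equations of motion (2.12) are integrated with the velocity-Verlet scheme; all numerical values are arbitrary.

```python
def add(u, v):
    return [a + b for a, b in zip(u, v)]

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def scale(c, u):
    return [c * a for a in u]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

MASSES = [1.0, 2.0]   # hypothetical masses
K = 1.0               # hypothetical coupling in V = K |r_1 - r_2|^2 / 2

def forces(r):
    f1 = scale(-K, sub(r[0], r[1]))    # F_1 = -dV/dr_1
    return [f1, scale(-1.0, f1)]       # F_2 = -F_1

def total_momentum(v):
    p = [0.0, 0.0, 0.0]
    for m, va in zip(MASSES, v):
        p = add(p, scale(m, va))
    return p

def total_angular_momentum(r, v):
    M = [0.0, 0.0, 0.0]
    for m, ra, va in zip(MASSES, r, v):
        M = add(M, cross(ra, scale(m, va)))
    return M

def energy(r, v):
    kinetic = sum(0.5 * m * dot(va, va) for m, va in zip(MASSES, v))
    d = sub(r[0], r[1])
    return kinetic + 0.5 * K * dot(d, d)

def evolve(r, v, dt, steps):
    """Velocity-Verlet integration: half kick, drift, half kick."""
    f = forces(r)
    for _ in range(steps):
        v = [add(va, scale(0.5 * dt / m, fa)) for m, va, fa in zip(MASSES, v, f)]
        r = [add(ra, scale(dt, va)) for ra, va in zip(r, v)]
        f = forces(r)
        v = [add(va, scale(0.5 * dt / m, fa)) for m, va, fa in zip(MASSES, v, f)]
    return r, v

r0 = [[1.0, 0.0, 0.0], [-0.5, 0.2, 0.1]]
v0 = [[0.0, 0.3, 0.1], [0.1, -0.2, 0.0]]
r1, v1 = evolve(r0, v0, dt=1e-3, steps=5000)

print(total_momentum(v0), total_momentum(v1))
print(total_angular_momentum(r0, v0), total_angular_momentum(r1, v1))
print(energy(r0, v0), energy(r1, v1))
```

With this integrator the total momentum and angular momentum agree up to rounding errors, while the energy is conserved only up to a small error that depends on the time step.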


2.4 Free relativistic particle


Massive particles We introduced the proper-time τ to measure the time along the worldline
of a massive particle,
\begin{align}
\tau_{12} = \int_1^2 d\tau &= \int_1^2 \left[ dt^2 - (dx^2 + dy^2 + dz^2) \right]^{1/2} \tag{2.30} \\
&= \int_1^2 \left[ \eta_{\mu\nu}\, dx^\mu dx^\nu \right]^{1/2} . \tag{2.31}
\end{align}

If we use a different parameter σ, e.g. such that σ(τ = 1) = 0 and σ(τ = 2) = 1, then
$$ \tau_{12} = \int_0^1 d\sigma \left[ \eta_{\mu\nu}\, \frac{dx^\mu}{d\sigma} \frac{dx^\nu}{d\sigma} \right]^{1/2} . \tag{2.32} $$

Note that $\tau_{12}$ is invariant under a reparameterisation σ′ = f(σ).


We check now if the choice
$$ L = \left[ \eta_{\mu\nu}\, \frac{dx^\mu}{d\sigma} \frac{dx^\nu}{d\sigma} \right]^{1/2} \tag{2.33} $$
is sensible for a free particle: L is Lorentz-invariant with $x^\mu = (t, \mathbf{x})$ as dynamical variables,
while σ plays the role of the parameter time t in the non-relativistic case. The Lagrange
equations are
$$ \frac{d}{d\sigma} \frac{\partial L}{\partial (dx^\alpha/d\sigma)} = \frac{\partial L}{\partial x^\alpha} . \tag{2.34} $$
Consider e.g. the $x^1$ component, then
$$ \frac{d}{d\sigma} \frac{\partial L}{\partial (dx^1/d\sigma)} = -\frac{d}{d\sigma} \left( \frac{1}{L} \frac{dx^1}{d\sigma} \right) = 0 . \tag{2.35} $$

Since L = dτ/dσ, it follows after multiplication with dσ/dτ

$$ \frac{d^2 x^1}{d\tau^2} = 0 \tag{2.36} $$
and the same for the other coordinates.
An alternative which we use later more often is

$$ L = \eta_{\mu\nu}\, \dot x^\mu \dot x^\nu \tag{2.37} $$

with $\dot x^\mu = dx^\mu/d\tau$. Since this Lagrangian is the square of the one defined in Eq. (2.33)
for the special choice σ = τ, it is clear that the same equations of motion result. While this
Lagrangian is more useful in calculations, it is invariant only under affine transformations,
τ → Aτ + B.

Massless particles The energy-momentum relation of massless particles like the photon
becomes ω = |k|. Thus their four-velocity and four-momenta are light-like, u2 = p2 = 0, and
light signals form the future light-cone of the emission point P . Since ds = dτ = 0 on the
light-cone, we cannot use the Lagrangians (2.33) or (2.37).


To find an alternative, consider how we can parametrise the curve x = t. Choosing $u^\alpha =
(1, 1, 0, 0)$, we can set
$$ x^\alpha(\lambda) = \lambda u^\alpha . \tag{2.38} $$
Then the four-velocity becomes the tangent vector $u^\alpha = dx^\alpha(\lambda)/d\lambda$, similar to the
definition (1.38) for massive particles. With the choice (2.38), the four-velocity for a massless
particle satisfies
$$ \frac{du^\alpha}{d\lambda} = 0 . \tag{2.39} $$

Such parameters are called affine, and the set of these parameters is invariant under affine
transformations, λ → Aλ + B. In this case, we can use the same equations of motion for
massive and massless particles, only replacing u · u = 1 with u · u = 0.
Let us now show that a time-like geodesic C is a local maximum of the proper-time: We
note first that the path of a light-ray satisfies $ds^2 = 0$, and thus the proper-time along a
light-like geodesic is zero. Next we use that we can approximate C by the curve D using
zigzagging light-like paths, as shown in Fig. ??. Increasing the number of these paths, the
approximation becomes arbitrarily precise, but τ(D) = 0 < τ(C). Thus the time-like geodesic
C cannot be a minimum of the proper-time.
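
The zigzag argument can be made quantitative with a small numerical experiment. The sketch below, an illustration in 1+1 dimensions with c = 1, compares the proper time along a straight worldline with that along zigzag worldlines of speed v connecting the same two events; the function names and numerical values are illustrative assumptions.

```python
import math

def proper_time(events):
    """Sum sqrt(dt^2 - dx^2) over the straight segments joining the (t, x) events."""
    tau = 0.0
    for (t0, x0), (t1, x1) in zip(events[:-1], events[1:]):
        ds2 = (t1 - t0)**2 - (x1 - x0)**2
        tau += math.sqrt(max(ds2, 0.0))   # guard against rounding for light-like legs
    return tau

def zigzag(total_time, speed, n_legs):
    """Worldline alternating between the velocities +speed and -speed.

    For an even number of legs it returns to x = 0 at t = total_time.
    """
    dt = total_time / n_legs
    events, x = [(0.0, 0.0)], 0.0
    for leg in range(n_legs):
        x += speed * dt if leg % 2 == 0 else -speed * dt
        events.append(((leg + 1) * dt, x))
    return events

straight = proper_time([(0.0, 0.0), (10.0, 0.0)])
slow = proper_time(zigzag(10.0, 0.6, 8))
fast = proper_time(zigzag(10.0, 0.999, 1000))
print(straight, slow, fast)
```

The proper time along a zigzag path is T(1 − v²)^{1/2}, independent of the number of legs, so it tends to zero for v → 1 while the path itself stays arbitrarily close to the straight worldline when the number of legs is large.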

3 Basic differential geometry
We motivate this chapter about differential geometry by giving some arguments why a rela-
tivistic theory of gravity should replace Minkowski space by a curved manifold. Let us start
by reviewing three basic properties of gravitation.

1.) The idea underlying the equivalence principle emerged in the 16th century, when among
others Galileo Galilei found experimentally that the acceleration g of a test mass in
a gravitational field is universal. Because of this universality, the gravitating mass
mg = F/g and the inertial mass mi = F/a are identical in classical mechanics, a fact
that puzzled already Newton. While mi = mg can be achieved for one material by a
convenient choice of units, there should be in general deviations for test bodies with
differing compositions.
Knowing more forces, this puzzle becomes even stronger: Contrast the acceleration of
a particle in a gravitational field to the one in a Coulomb field. In the latter case, two
independent properties of the particle, namely its charge q determining the strength of
the electric force acting on it and its mass mi , i.e. the inertia of the particle, are needed
as input in the equation of motion. In the case of gravity, the “gravitational charge”
$m_g$ coincides with the inertial mass $m_i$.
The equivalence of gravitating and inertial masses has been tested already by Newton
and Bessel, comparing the period P of pendula of different materials,
$$ P = 2\pi \sqrt{\frac{m_i\, l}{m_g\, g}} , \tag{3.1} $$

but finding no measurable differences. The first precision experiment giving an upper
limit on deviations from the equivalence principle was performed by Loránd Eötvös in
1908 using a torsion balance. Current limits for departures from universal gravitational
attraction for different materials are $|\Delta g_i/g| < 10^{-12}$.

2.) Newton’s gravitational law postulates, like the later Coulomb law, an instantaneous
interaction. Such an interaction is in contradiction to special relativity. Thus, as interactions
of currents with electromagnetic fields replace the Coulomb law, a corresponding
description should be found for gravity. Moreover, the equivalence of mass and energy
found in special relativity requires that, in a loose sense, energy and not only mass should
couple to gravity: Imagine a particle-antiparticle pair falling down a gravitational
potential well, gaining energy and finally annihilating into two photons moving outwards
through the gravitational potential well. If the two photons did not lose energy climbing up
the gravitational potential well, a perpetuum mobile could be constructed. If all forms
of energy act as sources of gravity, then the gravitational field itself is gravitating. Thus
the theory is non-linear and its mathematical structure is much more complicated than
the one of electrodynamics.


3.) Gravity can be switched-off locally, just by cutting the rope of an elevator: Inside a
freely falling elevator, one does not feel any gravitational effects except for tidal forces.
The latter arise if the gravitational field is non-uniform and tries to stretch the elevator.
Inside a sufficiently small freely falling system, tidal effects also play no role. This
allows us to perform experiments like the growing of crystals in “zero-gravity” on the
International Space Station which is orbiting only at an altitude of 300 km.
Motivated by 2.), Einstein used 1.), the principle of equivalence, and 3.) to derive general
relativity, a theory that describes the effect of gravity as a deformation of the space-time
known from special relativity.
In general relativity, the gravitational force of Newton’s theory that accelerates particles
in a Euclidean space is replaced by a curved space-time in which particles move force-free
along geodesic lines. In particular, photons still move, as in special relativity, along curves
satisfying ds2 = 0, while all effects of gravity are now encoded in the form of the line-element
ds. Thus all information about the geometry of a space-time is contained in the metric gµν .

3.1 Manifolds and tensor fields


Manifolds A manifold M is any set that can be continuously parametrized. The number
of independent parameters needed to specify uniquely any point of M is its dimension, the
parameters are called coordinates. Examples are the group of rotations in $R^3$ (with 3
Euler angles, dim = 3) or the phase space $(q^i, p_i)$ of classical mechanics with dim = 2n.
We require the manifold to be smooth: the transitions from one set of coordinates to another
one, $x^i = f^i(\tilde x^1, \ldots, \tilde x^n)$, should be $C^\infty$. In general, it is impossible to cover all M with one
coordinate system that is well-defined everywhere. (An example is given by spherical coordinates on
a sphere $S^2$, where φ is ill-defined at the poles.) Instead one has to use patches of different
coordinates that (at least partially) overlap.
A Riemannian manifold is a differentiable manifold with a symmetric, positive-definite
tensor-field gij . Space-time in general relativity is a four-dimensional pseudo-Riemannian
(also called Lorentzian) manifold, where the metric has the signature (1,3).

Covariant and contravariant tensors Consider two n dimensional coordinate systems x and
x̃ and assume that we can express the $x^i$ as functions of the $\tilde x^i$,

$$ x^i = f^i(\tilde x^1, \ldots, \tilde x^n) \tag{3.2} $$

or more briefly $x^j = x^j(\tilde x^i)$. Forming the differentials, we obtain

$$ dx^i = \frac{\partial x^i}{\partial \tilde x^j}\, d\tilde x^j . \tag{3.3} $$
The transformation matrix
$$ a^i{}_j = \frac{\partial x^i}{\partial \tilde x^j} \tag{3.4} $$
is an n × n matrix with determinant (“Jacobian”) J = det(a). If J ≠ 0 in the
point P, we can invert the transformation,

$$ d\tilde x^i = \frac{\partial \tilde x^i}{\partial x^j}\, dx^j = \tilde a^i{}_j\, dx^j . \tag{3.5} $$


The transformation matrices are inverse to each other, $\tilde a^i{}_j\, a^j{}_k = \delta^i_k$. According to the product
rule of determinants, J(a) = 1/J(ã).
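
A concrete example may help to see the two transformation matrices at work. The sketch below is an illustration, assuming that x = (x, y) are Cartesian and x̃ = (r, φ) polar coordinates in the plane; it evaluates both matrices at one point and checks that their product is the unit matrix and that the two Jacobians are inverse to each other.

```python
import math

def jac_cartesian_wrt_polar(r, phi):
    """a^i_j = d(x, y)/d(r, phi) for x = r cos(phi), y = r sin(phi)."""
    return [[math.cos(phi), -r * math.sin(phi)],
            [math.sin(phi), r * math.cos(phi)]]

def jac_polar_wrt_cartesian(x, y):
    """a~^i_j = d(r, phi)/d(x, y) for r = sqrt(x^2 + y^2), phi = atan2(y, x)."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r],
            [-y / r2, x / r2]]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def matmul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

r, phi = 2.0, 0.6                       # arbitrary point away from the origin
x, y = r * math.cos(phi), r * math.sin(phi)

a = jac_cartesian_wrt_polar(r, phi)
a_tilde = jac_polar_wrt_cartesian(x, y)

product = matmul2(a_tilde, a)           # should be the unit matrix
print(product)
print(det2(a), det2(a_tilde))           # J(a) = r and J(a~) = 1/r
```

At r = 0 the Jacobian vanishes and the transformation cannot be inverted, which reflects the fact that φ is not defined at the origin.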
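A contravariant vector X (or contravariant tensor of rank one) has an n-tuple of components
that transforms as
$$ \tilde X^i = \frac{\partial \tilde x^i}{\partial x^j}\, X^j . \tag{3.6} $$
This definition guarantees that the vector itself is an invariant object, since the transformation
of its components is cancelled by the transformation of the basis vectors,

$$ X = X^i e_i = \tilde X^i \tilde e_i = \tilde X . \tag{3.7} $$

By definition, a scalar field φ remains invariant under a coordinate transformation, i.e.
$\tilde\phi(\tilde x) = \phi(x)$ at all points. Consider now the derivative of φ,

$$ \frac{\partial \phi(x(\tilde x))}{\partial \tilde x^i} = \frac{\partial x^j}{\partial \tilde x^i}\, \frac{\partial \phi}{\partial x^j} . \tag{3.8} $$
The derivative thus transforms with the inverse transformation matrix, and we call a covariant vector (or covariant tensor
of rank one) any n-tuple transforming as

$$ \tilde X_i = \frac{\partial x^j}{\partial \tilde x^i}\, X_j . \tag{3.9} $$
More generally, we call an object T that transforms as
$$ \tilde T^{i \ldots n}_{j \ldots m} = \underbrace{\frac{\partial \tilde x^i}{\partial x^{i'}} \cdots \frac{\partial \tilde x^n}{\partial x^{n'}}}_{n}\; \underbrace{\frac{\partial x^{j'}}{\partial \tilde x^j} \cdots \frac{\partial x^{m'}}{\partial \tilde x^m}}_{m}\; T^{i' \ldots n'}_{j' \ldots m'} \tag{3.10} $$

a tensor of rank (n, m).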

Dual basis We defined earlier $g_{ij} = e_i \cdot e_j$. Now we define a dual basis $e^i$ with metric $g^{ij}$
via
$$ e^i \cdot e_j = \delta^i_j . \tag{3.11} $$
We want to determine the relation of $g^{ij}$ with $g_{ij}$. First we set

$$ e^i = A^{ij} e_j , \tag{3.12} $$

multiply then with $e^k$ and obtain

$$ g^{ik} = e^i \cdot e^k = A^{ij}\, e_j \cdot e^k = A^{ik} . \tag{3.13} $$

Hence the metric $g^{ij}$ maps covariant vectors $X_i$ into contravariant vectors $X^i$, while $g_{ij}$
provides a map into the opposite direction. In the same way, we can use g to raise and lower
indices of any tensor.
Next we multiply $e^i$ with $e_k = g_{kl}\, e^l$,

$$ \delta^i_k = e^i \cdot e_k = e^i \cdot g_{kl}\, e^l = g_{kl}\, g^{il} \tag{3.14} $$

or
$$ \delta^i_k = g_{kl}\, g^{li} . \tag{3.15} $$


Thus the components of the covariant and the contravariant metric tensors, $g_{ij}$ and $g^{ij}$, are
inverse matrices of each other.

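Example: Spherical coordinates 1:

Calculate for spherical coordinates x = (r, ϑ, φ) in $R^3$,

$$ x'^1 = r \sin\vartheta \cos\phi , \qquad x'^2 = r \sin\vartheta \sin\phi , \qquad x'^3 = r \cos\vartheta , $$

the components of $g_{ij}$ and $g^{ij}$, and $g \equiv \det(g_{ij})$.

From $e_i = (\partial x'^j/\partial x^i)\, e'_j$, it follows

$$ \begin{aligned} e_1 &= \frac{\partial x'^j}{\partial r}\, e'_j = \sin\vartheta \cos\phi\; e'_1 + \sin\vartheta \sin\phi\; e'_2 + \cos\vartheta\; e'_3 , \\ e_2 &= \frac{\partial x'^j}{\partial \vartheta}\, e'_j = r \cos\vartheta \cos\phi\; e'_1 + r \cos\vartheta \sin\phi\; e'_2 - r \sin\vartheta\; e'_3 , \\ e_3 &= \frac{\partial x'^j}{\partial \phi}\, e'_j = -r \sin\vartheta \sin\phi\; e'_1 + r \sin\vartheta \cos\phi\; e'_2 . \end{aligned} $$

Since the $e_i$ are orthogonal to each other, the matrices $g_{ij}$ and $g^{ij}$ are diagonal. From the definition
$g_{ij} = e_i \cdot e_j$ one finds $g_{ij} = \mathrm{diag}(1, r^2, r^2 \sin^2\vartheta)$. Inverting $g_{ij}$ gives $g^{ij} = \mathrm{diag}(1, r^{-2}, r^{-2} \sin^{-2}\vartheta)$. The
determinant is $g = \det(g_{ij}) = r^4 \sin^2\vartheta$. Note that the volume integral in spherical coordinates is given
by
$$ \int d^3x' = \int d^3x\, J = \int d^3x\, \sqrt{g} = \int dr\, d\vartheta\, d\phi\; r^2 \sin\vartheta , $$
since $g_{ij} = \frac{\partial x'^k}{\partial x^i} \frac{\partial x'^l}{\partial x^j}\, g'_{kl}$ and thus $\det(g) = J^2 \det(g') = J^2$ with $\det(g') = 1$.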
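
The result of this example can be cross-checked numerically by constructing the basis vectors $e_i$ as derivatives of the Cartesian coordinates and forming the scalar products $e_i \cdot e_j$. The sketch below approximates the derivatives by central differences; the step size h and the evaluation point are arbitrary choices.

```python
import math

def cartesian(q):
    """Cartesian coordinates x' as functions of q = (r, theta, phi)."""
    r, theta, phi = q
    return [r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta)]

def basis_vectors(q, h=1e-6):
    """e_i = d x'/d q^i, approximated by central differences."""
    basis = []
    for i in range(3):
        q_plus, q_minus = list(q), list(q)
        q_plus[i] += h
        q_minus[i] -= h
        x_plus, x_minus = cartesian(q_plus), cartesian(q_minus)
        basis.append([(a - b) / (2.0 * h) for a, b in zip(x_plus, x_minus)])
    return basis

def metric(q):
    """g_ij = e_i . e_j with the Euclidean scalar product of the embedding space."""
    e = basis_vectors(q)
    return [[sum(a * b for a, b in zip(e[i], e[j])) for j in range(3)]
            for i in range(3)]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

q = (2.0, 0.7, 1.1)                       # arbitrary point (r, theta, phi)
g = metric(q)
expected = [1.0, q[0]**2, (q[0] * math.sin(q[1]))**2]

print([g[i][i] for i in range(3)], expected)
print(det3(g), q[0]**4 * math.sin(q[1])**2)
```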

3.2 Tensor analysis


Doing analysis on a manifold requires an additional structure that makes it possible to com-
pare e.g. tangent vectors living in tangent spaces at different points of the manifold: A
prescription is required how a vector should be transported from point P to Q in order to
calculate a derivative. Mathematically, many different schemes are possible (and sensible),
but we should require the following:
• Any derivative has to be linear and satisfy the Leibniz rule. In addition, a derivative of
a tensor should be again a tensor. This may require a modification of the usual partial
derivative; this modification should however vanish for a flat space.
• These conditions define the affine connection and the corresponding derivative discussed
in the appendix. However, they do not fix the connection uniquely. One may add
therefore the following 2 additional constraints:
– The length of a vector should remain constant being transported along the mani-
fold. (Think about the four velocity |u| = 1 or |p| = m.)
– A vector should not be twisted “unnecessarily” being transported along the man-
ifold.


3.2.1 Metric connection and covariant derivative


Relations like $ds^2 = g_{ik}\, dx^i dx^k$ or $g_{ik}\, p^i p^k = m^2$ become invariant under parallel transport
only, if the metric tensor is covariantly constant,
$$ \nabla_c\, g_{ab} = \nabla_c\, g^{ab} = 0 . \tag{3.16} $$
A connection satisfying Eq. (3.16) is called metric compatible and leaves lengths and angles
invariant under parallel transport. This requirement guarantees that we can introduce
locally in the whole space-time Cartesian inertial coordinate systems where the laws of special
relativity are valid. Moreover, these local inertial systems can be consistently connected by
parallel transport using an affine connection satisfying the constraint (3.16).
Now we want to build this constraint into a new, more specific definition of the covariant
derivative: We consider again how a vector V and its components $V^a = e^a \cdot V$ transform
under a coordinate change. The derivative of a vector V transforms as a tensor,
$$ \partial_a V \to \tilde\partial_a \tilde V = \frac{\partial x^b}{\partial \tilde x^a}\, \partial_b V , \tag{3.17} $$
since V is an invariant object. If we consider however its components $V^a = e^a \cdot V$, then the
moving coordinate basis in curved space-time, $\partial_a e^b \neq 0$, introduces an additional term
$$ \partial_a V^b = e^b \cdot (\partial_a V) + V \cdot (\partial_a e^b) \tag{3.18} $$
in the derivative $\partial_a V^b$. The first term $e^b \cdot (\partial_a V)$ transforms as a tensor, since both $e^b$ and $\partial_a V$
are tensors. This implies that the combination of the two remaining terms has to transform
as a tensor too, which we define as (new) covariant derivative
$$ \nabla_a V^b \equiv e^b \cdot (\partial_a V) = \partial_a V^b - V \cdot (\partial_a e^b) . \tag{3.19} $$
The first relation tells us that we can view the covariant derivative $\nabla_a V^b$ as the projection of
$\partial_a V$ onto the direction $e^b$. The Leibniz rule applied to $\phi = X_a X^a$ implies that
$$ \nabla_a V_b = \partial_a V_b - V \cdot (\partial_a e_b) . \tag{3.20} $$
If we expand now the partial derivative of the basis vectors as a linear combination of the
basis vectors,
$$ \partial_l e^k = -\Gamma^k{}_{lj}\, e^j \quad\text{and}\quad \partial_l e_k = \Gamma^j{}_{kl}\, e_j , \tag{3.21} $$
and call the coefficients connection coefficients, our two definitions of the covariant derivative
seem to agree. However, we did not differentiate in (3.18) the “dot”, i.e. the scalar product.
As a consequence, the connection defined by (3.21) will be compatible with the metric, while
for a general affine connection the covariant derivative in (3.19) would contain an additional
term proportional to $\nabla_a g_{bc}$.
Now we differentiate the definition of the metric tensor, $g_{ab} = e_a \cdot e_b$, with respect to $x^c$,
\begin{align}
\partial_c g_{ab} &= (\partial_c e_a) \cdot e_b + e_a \cdot (\partial_c e_b) = \Gamma^d{}_{ac}\, e_d \cdot e_b + e_a \cdot \Gamma^d{}_{bc}\, e_d \tag{3.22} \\
&= \Gamma^d{}_{ac}\, g_{db} + \Gamma^d{}_{bc}\, g_{ad} . \tag{3.23}
\end{align}
We obtain two equivalent expressions by a cyclic permutation of the indices a, b, c,
\begin{align}
\partial_b g_{ca} &= \Gamma^d{}_{cb}\, g_{da} + \Gamma^d{}_{ab}\, g_{cd} , \tag{3.24} \\
\partial_a g_{bc} &= \Gamma^d{}_{ba}\, g_{dc} + \Gamma^d{}_{ca}\, g_{bd} . \tag{3.25}
\end{align}

We add the first two equations and subtract the last one. Using additionally the symmetries
$\Gamma^a{}_{bc} = \Gamma^a{}_{cb}$ and $g_{ab} = g_{ba}$, the four terms containing $\Gamma^d{}_{ab}$ and $\Gamma^d{}_{ac}$ cancel pairwise, and dividing by two we obtain

$$ \frac{1}{2} \left( \partial_c g_{ab} + \partial_b g_{ac} - \partial_a g_{bc} \right) = \Gamma^d{}_{cb}\, g_{ad} . \tag{3.26} $$
Multiplying by $g^{ea}$ and relabeling indices gives as final result

$$ \Gamma^a{}_{bc} = \left\{ {}^{\,a}_{bc} \right\} \equiv \frac{1}{2}\, g^{ad} \left( \partial_b g_{dc} + \partial_c g_{bd} - \partial_d g_{bc} \right) . \tag{3.27} $$

This equation defines the Christoffel symbols $\left\{ {}^{\,a}_{bc} \right\}$ (aka Levi-Civita connection aka Riemannian
connection): They define the unique connection on a Riemannian manifold which is metric
compatible and torsion-free (i.e. symmetric). Admitting torsion, on the RHS of Eq. (3.27)
three permutations of the torsion tensor $T^a{}_{bc}$ would appear. Such a connection would still
be a metric connection, but not torsion-free.
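
Equation (3.27) translates directly into a small numerical routine. The sketch below is an illustration under stated assumptions: it uses the metric diag(1, r², r² sin²ϑ) of spherical coordinates in flat space, approximates the partial derivatives by central differences, and also evaluates the combination $\partial_c g_{ab} - \Gamma^d{}_{ac} g_{db} - \Gamma^d{}_{bc} g_{ad}$, which has to vanish for a metric compatible connection.

```python
import math

N_DIM = 3

def metric(q):
    """g_ab for spherical coordinates q = (r, theta, phi) in flat space."""
    r, theta, _ = q
    return [[1.0, 0.0, 0.0],
            [0.0, r * r, 0.0],
            [0.0, 0.0, (r * math.sin(theta))**2]]

def inverse_diagonal(g):
    """Inverse of a diagonal metric."""
    return [[1.0 / g[i][i] if i == j else 0.0 for j in range(N_DIM)]
            for i in range(N_DIM)]

def d_metric(q, c, h=1e-5):
    """Central-difference estimate of partial_c g_ab."""
    q_plus, q_minus = list(q), list(q)
    q_plus[c] += h
    q_minus[c] -= h
    g_plus, g_minus = metric(q_plus), metric(q_minus)
    return [[(g_plus[a][b] - g_minus[a][b]) / (2.0 * h) for b in range(N_DIM)]
            for a in range(N_DIM)]

def christoffel(q):
    """gamma[a][b][c] = Gamma^a_{bc} from Eq. (3.27)."""
    g_inv = inverse_diagonal(metric(q))
    dg = [d_metric(q, c) for c in range(N_DIM)]   # dg[c][a][b] = partial_c g_ab
    gamma = [[[0.0] * N_DIM for _ in range(N_DIM)] for _ in range(N_DIM)]
    for a in range(N_DIM):
        for b in range(N_DIM):
            for c in range(N_DIM):
                gamma[a][b][c] = 0.5 * sum(
                    g_inv[a][d] * (dg[b][d][c] + dg[c][b][d] - dg[d][b][c])
                    for d in range(N_DIM))
    return gamma

def metric_compatibility_residual(q):
    """Largest |partial_c g_ab - Gamma^d_ac g_db - Gamma^d_bc g_ad| over all a, b, c."""
    g, gamma = metric(q), christoffel(q)
    dg = [d_metric(q, c) for c in range(N_DIM)]
    worst = 0.0
    for a in range(N_DIM):
        for b in range(N_DIM):
            for c in range(N_DIM):
                nabla = dg[c][a][b] - sum(
                    gamma[d][a][c] * g[d][b] + gamma[d][b][c] * g[a][d]
                    for d in range(N_DIM))
                worst = max(worst, abs(nabla))
    return worst

q = (2.0, 0.7, 1.1)                 # arbitrary point (r, theta, phi)
r, theta = q[0], q[1]
gamma = christoffel(q)
print(gamma[0][1][1], -r)                                # Gamma^r_{theta theta}
print(gamma[1][0][1], 1.0 / r)                           # Gamma^theta_{r theta}
print(gamma[2][1][2], math.cos(theta) / math.sin(theta)) # Gamma^phi_{theta phi}
print(metric_compatibility_residual(q))
```

Although the space is flat, the connection coefficients do not vanish in spherical coordinates; they vanish only in Cartesian coordinates.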
We now check our claim that the connection (3.27) is metric compatible. First, we define1

Γabc = gad Γdbc . (3.28)

Thus Γabc is symmetric in the last two indices. Then it follows


1
Γabc = (∂b gac + ∂c gba − ∂a gbc ) . (3.29)
2
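Adding $2\Gamma_{abc}$ and $2\Gamma_{bac}$ gives
\begin{align}
2(\Gamma_{abc} + \Gamma_{bac}) &= \partial_b g_{ac} + \partial_c g_{ba} - \partial_a g_{bc} \tag{3.30} \\
&\quad + \partial_a g_{bc} + \partial_c g_{ab} - \partial_b g_{ac} = 2\,\partial_c g_{ab} \tag{3.31}
\end{align}
or
\[ \partial_c g_{ab} = \Gamma_{abc} + \Gamma_{bac} . \tag{3.32} \]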
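Applying the general rule for covariant derivatives, Eq. (3.47), to the metric,
\[ \nabla_c g_{ab} = \partial_c g_{ab} - \Gamma^d_{ac}\, g_{db} - \Gamma^d_{bc}\, g_{ad} = \partial_c g_{ab} - \Gamma_{bac} - \Gamma_{abc} \tag{3.33} \]
and inserting Eq. (3.32) shows that
\[ \nabla_c g_{ab} = \nabla_c g^{ab} = 0 . \tag{3.34} \]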

Hence $\nabla_a$ commutes with contracting indices,
\[ \nabla_c (X^a X_a) = \nabla_c (g_{ab} X^a X^b) = g_{ab} \nabla_c (X^a X^b) \tag{3.35} \]
and "conserves" the norm of vectors. (Exercise: Repeat these steps including torsion.)
Since we can choose a Cartesian coordinate system for a flat space, the connection coefficients are zero there and thus $\nabla_a = \partial_a$. This suggests as a general rule that physical laws valid in Minkowski space hold in general relativity, if one replaces ordinary derivatives by covariant ones and $\eta_{ij}$ by $g_{ij}$.
¹ We showed that g can be used to raise or to lower tensor indices, but Γ is not a tensor.


3.2.2 Geodesics
A geodesic curve is the shortest or longest curve between two points on a manifold. Such a curve extremizes the action $S(L)$ of a free particle, $L = g_{ab}\, \dot x^a \dot x^b$ (setting $m = 2$ and $\dot x = dx/d\sigma$), along the path $x^a(\sigma)$. The parameter $\sigma$ plays the role of time $t$ in the non-relativistic case, while $t$ becomes part of the coordinates. The Lagrange equations are
\[ \frac{d}{d\sigma} \frac{\partial L}{\partial \dot x^c} - \frac{\partial L}{\partial x^c} = 0 . \tag{3.36} \]
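Only $g$ depends on $x$ and thus $\partial L/\partial x^c = g_{ab,c}\, \dot x^a \dot x^b$. With $\partial \dot x^a/\partial \dot x^b = \delta^a_b$ we obtain
\[ g_{ab,c}\, \dot x^a \dot x^b = 2 \frac{d}{d\sigma} (g_{ac}\, \dot x^a) = 2 (g_{ac,b}\, \dot x^a \dot x^b + g_{ac}\, \ddot x^a) \tag{3.37} \]
or
\[ g_{ac}\, \ddot x^a + \frac{1}{2} (2 g_{ac,b} - g_{ab,c})\, \dot x^a \dot x^b = 0 . \tag{3.38} \]
Next we use that $\dot x^a \dot x^b$ is symmetric in $a$ and $b$ to rewrite the second term as
\[ 2 g_{ca,b}\, \dot x^a \dot x^b = (g_{ca,b} + g_{cb,a})\, \dot x^a \dot x^b , \tag{3.39} \]
multiply everything by $g^{dc}$ and obtain
\[ \ddot x^d + \frac{1}{2}\, g^{dc} (g_{ca,b} + g_{cb,a} - g_{ab,c})\, \dot x^a \dot x^b = 0 . \tag{3.40} \]
We recognize the definition of the Levi-Civita connection and rewrite the geodesic equation as
\[ \ddot x^c + \Gamma^c_{ab}\, \dot x^a \dot x^b = 0 . \tag{3.41} \]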
The connection entering the equation for an extremal curve is the Levi-Civita connection,
because we used the Lagrangian of a classical spinless particle.
This result justifies the use of a torsionless connection which is metric compatible: Although a star consists of a collection of individual particles carrying spin $s_i$, its total spin sums up to zero, $\sum_i s_i \approx 0$, because the $s_i$ are uncorrelated. Thus we can describe macroscopic matter in general relativity as a classical spinless point particle (or fluid, if extended). In such a case, only the symmetric part of the connection influences the geodesic motion of the considered system.
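Example: Sphere $S^2$. Calculate the Christoffel symbols of the two-dimensional unit sphere $S^2$.
The line-element of the two-dimensional unit sphere $S^2$ is given by $ds^2 = d\vartheta^2 + \sin^2\vartheta\, d\phi^2$. A faster alternative to the definition (3.27) of the Christoffel coefficients is the use of the geodesic equation: From the Lagrange function $L = g_{ab}\, \dot x^a \dot x^b = \dot\vartheta^2 + \sin^2\vartheta\, \dot\phi^2$ we find
\begin{align*}
\frac{\partial L}{\partial \phi} &= 0 , & \frac{d}{dt} \frac{\partial L}{\partial \dot\phi} &= \frac{d}{dt} (2 \sin^2\vartheta\, \dot\phi) = 2 \sin^2\vartheta\, \ddot\phi + 4 \cos\vartheta \sin\vartheta\, \dot\vartheta \dot\phi , \\
\frac{\partial L}{\partial \vartheta} &= 2 \cos\vartheta \sin\vartheta\, \dot\phi^2 , & \frac{d}{dt} \frac{\partial L}{\partial \dot\vartheta} &= \frac{d}{dt} (2 \dot\vartheta) = 2 \ddot\vartheta ,
\end{align*}
and thus the Lagrange equations are
\[ \ddot\phi + 2 \cot\vartheta\, \dot\vartheta \dot\phi = 0 \qquad \text{and} \qquad \ddot\vartheta - \cos\vartheta \sin\vartheta\, \dot\phi^2 = 0 . \]
Comparing with the geodesic equation $\ddot x^\kappa + \Gamma^\kappa_{\mu\nu}\, \dot x^\mu \dot x^\nu = 0$, we can read off the non-vanishing Christoffel symbols as $\Gamma^\phi_{\vartheta\phi} = \Gamma^\phi_{\phi\vartheta} = \cot\vartheta$ and $\Gamma^\vartheta_{\phi\phi} = -\cos\vartheta \sin\vartheta$. (Note that $2\cot\vartheta = \Gamma^\phi_{\vartheta\phi} + \Gamma^\phi_{\phi\vartheta}$.)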
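The result of this example can be cross-checked numerically. The following is a minimal sketch, not part of the original notes: it evaluates the definition (3.27) for the unit sphere with central finite differences. The function names (`metric`, `dmetric`, `christoffel`) and the step size `h` are illustrative choices.

```python
import math

def metric(x):
    """Metric g_ab of the unit sphere in coordinates x = (theta, phi)."""
    theta, _phi = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def dmetric(x, c, h=1e-5):
    """Central-difference estimate of d_c g_ab at the point x."""
    xp, xm = list(x), list(x)
    xp[c] += h
    xm[c] -= h
    gp, gm = metric(xp), metric(xm)
    return [[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(2)] for a in range(2)]

def christoffel(x):
    """Gamma^a_{bc} = 1/2 g^{ad} (d_b g_dc + d_c g_bd - d_d g_bc), cf. Eq. (3.27)."""
    n = 2
    g = metric(x)
    # The metric is diagonal, so its inverse is the reciprocal of the diagonal.
    ginv = [[1.0 / g[0][0], 0.0], [0.0, 1.0 / g[1][1]]]
    dg = [dmetric(x, c) for c in range(n)]  # dg[c][a][b] = d_c g_ab
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for a in range(n):
        for b in range(n):
            for c in range(n):
                gamma[a][b][c] = 0.5 * sum(
                    ginv[a][d] * (dg[b][d][c] + dg[c][b][d] - dg[d][b][c])
                    for d in range(n)
                )
    return gamma

# Index 0 is theta, index 1 is phi.
gamma_at_point = christoffel((1.0, 0.3))
```

At $\vartheta = 1$, the entries `gamma_at_point[1][0][1]` and `gamma_at_point[0][1][1]` should reproduce $\cot\vartheta$ and $-\cos\vartheta\sin\vartheta$ up to the finite-difference error.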


3.A Appendix: a bit more...


3.A.1 Affine connection and covariant derivative
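Consider how the partial derivative of a vector field, $\partial_c X^a$, transforms under a change of coordinates,
\begin{align}
\partial'_c X'^a &= \frac{\partial}{\partial x'^c} \left( \frac{\partial x'^a}{\partial x^b}\, X^b \right) = \frac{\partial x^d}{\partial x'^c} \frac{\partial}{\partial x^d} \left( \frac{\partial x'^a}{\partial x^b}\, X^b \right) \tag{3.42} \\
&= \frac{\partial x'^a}{\partial x^b} \frac{\partial x^d}{\partial x'^c}\, \partial_d X^b + \underbrace{\frac{\partial^2 x'^a}{\partial x^b \partial x^d} \frac{\partial x^d}{\partial x'^c}}_{\equiv -\Gamma^a_{bc}} X^b . \tag{3.43}
\end{align}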

The first term transforms as desired as a tensor of rank (1,1), while the second term—caused
by the in general non-linear change of the coordinate basis—destroys the tensorial behavior.
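If we define a covariant derivative $\nabla_c X^a$ of a vector $X^a$ by requiring that the result is a tensor, we should set
\[ \nabla_c X^a = \partial_c X^a + \Gamma^a_{bc}\, X^b . \tag{3.44} \]
The $n^3$ quantities $\Gamma^a_{bc}$ ("affine connection") transform as
\[ \Gamma'^a_{bc} = \frac{\partial x'^a}{\partial x^d} \frac{\partial x^e}{\partial x'^b} \frac{\partial x^f}{\partial x'^c}\, \Gamma^d_{ef} + \frac{\partial^2 x^d}{\partial x'^b \partial x'^c} \frac{\partial x'^a}{\partial x^d} . \tag{3.45} \]
Using $\nabla_c \phi = \partial_c \phi$ and requiring that the usual Leibniz rule is valid for $\phi = X_a X^a$ leads to
\[ \nabla_c X_a = \partial_c X_a - \Gamma^b_{ac}\, X_b . \tag{3.46} \]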

For a general tensor, the covariant derivative is defined by the same reasoning as
\[ \nabla_c T^{a\ldots}_{b\ldots} = \partial_c T^{a\ldots}_{b\ldots} + \Gamma^a_{dc}\, T^{d\ldots}_{b\ldots} + \ldots - \Gamma^d_{bc}\, T^{a\ldots}_{d\ldots} - \ldots \tag{3.47} \]
Note that it is the last index of the connection coefficients that is the same as the index of the covariant derivative. The plus sign goes together with upper indices (superscripts), the minus sign with lower indices (subscripts).
From the transformation law (3.45) it is clear that the inhomogeneous term disappears for a combination of the connection coefficients $\Gamma$ that is antisymmetric in the lower indices. Thus this combination forms a tensor, called torsion,
\[ T^a_{bc} = \Gamma^a_{bc} - \Gamma^a_{cb} . \tag{3.48} \]

We consider only symmetric connections, Γabc = Γacb , or torsionless manifolds. We will justify
this choice later, when we consider the geodesic motion of a classical particle.

Parallel transport We say a tensor $T$ is parallel transported along the curve $x(\sigma)$, if its components $T^{a\ldots}_{b\ldots}$ stay constant. In flat space, this means simply
\[ \frac{d}{d\sigma}\, T^{a\ldots}_{b\ldots} = \frac{dx^c}{d\sigma}\, \partial_c T^{a\ldots}_{b\ldots} = 0 . \tag{3.49} \]


In curved space, we have to replace the normal derivative by a covariant one. We define the directional covariant derivative along $x(\sigma)$ as
\[ \frac{D}{d\sigma} = \frac{dx^c}{d\sigma}\, \nabla_c . \tag{3.50} \]
Then a tensor is parallel transported along the curve $x(\sigma)$, if
\[ \frac{D}{d\sigma}\, T^{a\ldots}_{b\ldots} = \frac{dx^c}{d\sigma}\, \nabla_c T^{a\ldots}_{b\ldots} = 0 . \tag{3.51} \]
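To illustrate the parallel-transport condition, the following sketch (an illustration added here, not taken from the notes) transports a vector once around the latitude circle $\vartheta = \vartheta_0$, $\phi = \sigma$ of the unit sphere, using the Christoffel symbols $\Gamma^\vartheta_{\phi\phi} = -\cos\vartheta\sin\vartheta$ and $\Gamma^\phi_{\vartheta\phi} = \cot\vartheta$ of the sphere example. The integrator (a fixed-step Runge-Kutta scheme) and the function names are assumptions made for the example.

```python
import math

def transport_around_latitude(theta0, v_theta, v_phi, steps=2000):
    """Parallel transport (V^theta, V^phi) once around the circle theta = theta0.

    Along the curve phi = sigma the condition D V^a / d sigma = 0 reads
        dV^theta/dsigma = -Gamma^theta_{phi phi} V^phi =  sin(theta0) cos(theta0) V^phi
        dV^phi/dsigma   = -Gamma^phi_{theta phi} V^theta = -cot(theta0) V^theta
    """
    sin_t, cos_t = math.sin(theta0), math.cos(theta0)

    def rhs(v):
        vt, vp = v
        return (sin_t * cos_t * vp, -(cos_t / sin_t) * vt)

    h = 2.0 * math.pi / steps
    v = (v_theta, v_phi)
    for _ in range(steps):  # classical fourth-order Runge-Kutta steps
        k1 = rhs(v)
        k2 = rhs((v[0] + 0.5 * h * k1[0], v[1] + 0.5 * h * k1[1]))
        k3 = rhs((v[0] + 0.5 * h * k2[0], v[1] + 0.5 * h * k2[1]))
        k4 = rhs((v[0] + h * k3[0], v[1] + h * k3[1]))
        v = (v[0] + h / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
             v[1] + h / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
    return v

def norm_squared(theta0, v):
    """g_ab V^a V^b on the unit sphere."""
    return v[0] ** 2 + math.sin(theta0) ** 2 * v[1] ** 2

theta0 = math.pi / 3
v_final = transport_around_latitude(theta0, 1.0, 0.0)
```

Solving the two equations by hand shows that the vector rotates by the angle $2\pi\cos\vartheta_0$ relative to the coordinate basis, so for $\vartheta_0 = \pi/3$ the vector should come back reversed, while its norm stays equal to one, in line with the metric compatibility shown in (3.35).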

3.A.2 Riemannian normal coordinates


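In a (pseudo-) Riemannian manifold, one can find in each point $P$ a coordinate system, called (Riemannian) normal or geodesic coordinates, with the following properties,
\begin{align}
\tilde g_{ab}(P) &= \eta_{ab} , \tag{3.52a} \\
\partial_c \tilde g_{ab}(P) &= 0 , \tag{3.52b} \\
\tilde\Gamma^a_{bc}(P) &= 0 . \tag{3.52c}
\end{align}
We prove this by construction. We choose new coordinates $\tilde x^a$ centered at $P$,
\[ \tilde x^a = x^a - x^a_P + \frac{1}{2}\, \Gamma^a_{bc}\, (x^b - x^b_P)(x^c - x^c_P) . \tag{3.53} \]
Here $\Gamma^a_{bc}$ are the connection coefficients in $P$ calculated in the original coordinates $x^a$. We differentiate
\[ \frac{\partial \tilde x^a}{\partial x^d} = \delta^a_d + \Gamma^a_{db}\, (x^b - x^b_P) . \tag{3.54} \]
Hence $\partial \tilde x^a/\partial x^d = \delta^a_d$ at the point $P$. Differentiating again,
\[ \frac{\partial^2 \tilde x^a}{\partial x^d \partial x^e} = \Gamma^a_{db}\, \delta^b_e = \Gamma^a_{de} . \tag{3.55} \]
Inserting these results into the transformation law (3.45) of the connection coefficients, where we swap in the second term derivatives of $x$ and $\tilde x$,
\[ \tilde\Gamma^a_{bc} = \frac{\partial \tilde x^a}{\partial x^d} \frac{\partial x^f}{\partial \tilde x^b} \frac{\partial x^g}{\partial \tilde x^c}\, \Gamma^d_{fg} - \frac{\partial^2 \tilde x^a}{\partial x^d \partial x^f} \frac{\partial x^d}{\partial \tilde x^b} \frac{\partial x^f}{\partial \tilde x^c} \tag{3.56} \]
gives
\[ \tilde\Gamma^a_{bc} = \delta^a_d\, \delta^f_b\, \delta^g_c\, \Gamma^d_{fg} - \Gamma^a_{df}\, \delta^d_b\, \delta^f_c = \Gamma^a_{bc} - \Gamma^a_{bc} \tag{3.57} \]
or
\[ \tilde\Gamma^a_{de}(P) = 0 . \tag{3.58} \]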
Thus we have found a coordinate system with vanishing connection coefficients at P . By
a linear transformation (that does not affect ∂gab ) we can bring finally gab into the form
ηab : As required by the equivalence principle, we can introduce in each spacetime point P a
free-falling coordinate system in which physics is described by the known physical laws in the
absence of gravity.
Note that the introduction of Riemannian normal coordinates is in general only possible if the connection is symmetric: Since the antisymmetric part of the connection coefficients, the torsion, transforms as a tensor, it cannot be eliminated by a coordinate change. This does not necessarily imply a contradiction to the equivalence principle, as long as the torsion is properly generated by source terms in the equations of motion of the matter fields. In particular, the spin current of fermions leads to non-zero torsion. As the elementary spins in macroscopic bodies cancel, torsion is negligible in all relevant astrophysical and cosmological applications. This justifies our choice of a symmetric connection.

4 Schwarzschild solution
In the next three chapters, we investigate the solutions of Einstein's field equations that describe the gravitational field outside a spherical mass distribution. The metric valid for a static mass distribution was found by Karl Schwarzschild in 1915, only one month after the publication of Einstein's field equations. A real understanding of the physical significance of the singularities contained in the solution was obtained only in the 1960s. The solution for a rotating mass distribution was found by Kerr only in 1963.

4.1 Spacetime symmetries and Killing vectors


A spacetime possesses a symmetry if it looks the same moving from a point $P$ along a vector field $\xi$ to a different point $\tilde P$. More precisely, we mean with "looking the same" that the metric tensor transported along $\xi$ remains the same. Thus the shifted metric $\tilde g_{\mu\nu}(\tilde x)$ at the new point $\tilde P$ has to be the same function of its argument $\tilde x$ as the original metric $g_{\mu\nu}(x)$ of its argument $x$,
\[ \tilde g_{\mu\nu}(x) = g_{\mu\nu}(x) \qquad \text{for all } x . \tag{4.1} \]

Note the difference to the definition of a scalar, φ̃(x̃) = φ(x). In the latter case, we require
that a scalar field has the same value at a point P which in turn changes coordinates from x
to x̃.
Mathematically, the transport of a tensor $T$ along a vector field $\xi$ is described by the Lie derivative $\mathcal{L}_\xi T$. Instead of introducing this new derivative (which we do not use later), we consider the change of the metric under an infinitesimal coordinate transformation,
\[ \tilde x^\mu = x^\mu + \varepsilon\, \xi^\mu(x^\nu) + O(\varepsilon^2) . \tag{4.2} \]
Then we can identify the translation $\xi^\mu$ with the vector field $\xi$ at $x$. Next we connect the metric tensor at the two different points by a Taylor expansion,
\[ g_{\mu\nu}(\tilde x) = g_{\mu\nu}(x + \varepsilon \xi) = g_{\mu\nu}(x) + \varepsilon\, \xi^\alpha \partial_\alpha g_{\mu\nu}(x) + O(\varepsilon^2) . \tag{4.3} \]

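On the other hand, we can use the usual transformation law for a tensor of rank two under an arbitrary coordinate transformation,
\[ \tilde g_{\mu\nu}(\tilde x) = \frac{\partial x^\alpha}{\partial \tilde x^\mu} \frac{\partial x^\beta}{\partial \tilde x^\nu}\, g_{\alpha\beta}(x) , \tag{4.4} \]
or, exchanging quantities with and without a tilde,
\[ g_{\mu\nu}(x) = \frac{\partial \tilde x^\alpha}{\partial x^\mu} \frac{\partial \tilde x^\beta}{\partial x^\nu}\, \tilde g_{\alpha\beta}(\tilde x) . \tag{4.5} \]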


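If the transformation (4.2) is a spacetime symmetry, then $\tilde g_{\mu\nu}(x) = g_{\mu\nu}(x)$. Evaluating the transformation matrices, $\partial \tilde x^\alpha/\partial x^\mu = \delta^\alpha_\mu + \varepsilon\, \partial_\mu \xi^\alpha$, and inserting the Taylor expansion, we obtain
\begin{align}
g_{\mu\nu}(x) &= \frac{\partial \tilde x^\alpha}{\partial x^\mu} \frac{\partial \tilde x^\beta}{\partial x^\nu}\, g_{\alpha\beta}(\tilde x) = (\delta^\alpha_\mu + \varepsilon\, \partial_\mu \xi^\alpha)(\delta^\beta_\nu + \varepsilon\, \partial_\nu \xi^\beta) \left[ g_{\alpha\beta}(x) + \varepsilon\, \xi^\rho \partial_\rho g_{\alpha\beta}(x) \right] + O(\varepsilon^2) \tag{4.6} \\
&= g_{\mu\nu}(x) + \varepsilon \left[ g_{\alpha\nu}\, \partial_\mu \xi^\alpha + g_{\mu\alpha}\, \partial_\nu \xi^\alpha + \xi^\alpha \partial_\alpha g_{\mu\nu}(x) \right] + O(\varepsilon^2) . \tag{4.7}
\end{align}
Thus the metric is kept invariant, if the condition
\[ \delta g_{\mu\nu} = g_{\alpha\nu}\, \partial_\mu \xi^\alpha + g_{\mu\alpha}\, \partial_\nu \xi^\alpha + \xi^\alpha \partial_\alpha g_{\mu\nu} = 0 \tag{4.8} \]
is satisfied. Writing $g_{\alpha\nu}\, \partial_\mu \xi^\alpha = \partial_\mu \xi_\nu - \xi^\alpha \partial_\mu g_{\alpha\nu}$ and inserting Eq. (3.32) for the partial derivatives of the metric tensor, we can combine the Christoffel symbols with the partial derivatives into covariant derivatives of the vector field, obtaining the Killing equation¹
\[ \delta g_{\mu\nu} = \nabla_\mu \xi_\nu + \nabla_\nu \xi_\mu = 0 . \tag{4.9} \]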

Its solutions ξ are the Killing vector fields of the metric. Moving along a Killing vector field,
the metric is kept invariant.
Since Eq. (4.9) is a tensor equation, the previous Eq. (4.8) is also invariant under arbitrary coordinate transformations, although it contains only partial derivatives. It is the Lie derivative of a tensor of rank two.

Example: Killing vectors of R3 :


Choosing Cartesian coordinates, dl2 = dx2 + dy 2 + dz 2 , makes it obvious that translations correspond
to Killing vectors ξ1 = (1, 0, 0), ξ2 = (0, 1, 0), and ξ3 = (0, 0, 1). We find the Killing vectors describing
rotational symmetry by writing for an infinitesimal rotation around, e.g., the z axis,

x′ = cos αx − sin αy ≈ x − αy ,
y′ = sin αx + cos αy ≈ y + αx ,
z′ = z.

Hence ξz = (−y, x, 0) and the other two follow by cyclic permutation. One of them, ξz , we could
have also identified by rewriting the line-element in spherical coordinates and noting that dl does not
contain φ dependent terms.
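The Killing vectors of this example can be checked with a few lines of code. This is a hedged sketch with illustrative names: in Cartesian coordinates the metric is constant, so the Killing condition reduces to $\partial_i \xi_j + \partial_j \xi_i = 0$, which is evaluated below with central differences. The dilation field is added only as a counterexample.

```python
def killing_residual(xi, point, h=1e-5):
    """Largest |d_i xi_j + d_j xi_i| at `point` for the flat Cartesian metric of R^3."""
    def partial(i, j):
        # central-difference estimate of d_i xi_j
        plus, minus = list(point), list(point)
        plus[i] += h
        minus[i] -= h
        return (xi(plus)[j] - xi(minus)[j]) / (2 * h)

    return max(abs(partial(i, j) + partial(j, i))
               for i in range(3) for j in range(3))

def rotation_z(p):
    """Generator of rotations around the z axis, xi_z = (-y, x, 0)."""
    return (-p[1], p[0], 0.0)

def translation_x(p):
    """Generator of translations in x direction, xi_1 = (1, 0, 0)."""
    return (1.0, 0.0, 0.0)

def dilation(p):
    """Counterexample: xi = (x, y, z) rescales lengths and is not a Killing vector."""
    return (p[0], p[1], p[2])

point = (0.3, -1.2, 2.0)
residuals = {
    "rotation_z": killing_residual(rotation_z, point),
    "translation_x": killing_residual(translation_x, point),
    "dilation": killing_residual(dilation, point),
}
```

The residual should vanish up to rounding for the rotation and the translation, while it equals 2 for the dilation.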

Conserved quantities along geodesics Assume that the metric is independent of one coordinate, e.g. $x^0$. Then there exists a corresponding Killing vector, $\xi = (1, 0, 0, 0)$, and $x^0$ is a cyclic coordinate, $\partial L/\partial x^0 = 0$. With $L = d\tau/d\sigma$, the resulting conserved quantity $\partial L/\partial \dot x^0 = \text{const.}$ can be written as
\[ \frac{\partial L}{\partial \dot x^0} = g_{0\beta}\, \frac{dx^\beta}{L\, d\sigma} = g_{0\beta}\, \frac{dx^\beta}{d\tau} = \xi \cdot u . \tag{4.10} \]
Hence the quantity $\xi \cdot u$ is conserved along the solutions $x^\mu(\sigma)$ of the Lagrange equation, i.e. along geodesics.
¹ This equation is a much stronger constraint than it looks: its solutions are uniquely determined by the values of $\xi_\mu$ and $\nabla_\mu \xi_\nu$ at a single point.


4.2 Schwarzschild metric


The metric outside of a spherically symmetric mass distribution is given in Schwarzschild coordinates as
\[ ds^2 = dt^2 \left( 1 - \frac{2M}{r} \right) - \frac{dr^2}{1 - \frac{2M}{r}} - r^2 (d\vartheta^2 + \sin^2\vartheta\, d\phi^2) . \tag{4.11} \]

Its main properties are


• symmetries: The metric is time-independent and spherically symmetric. Hence two (out of the four) Killing vectors are $\xi = (1, 0, 0, 0)$ and $\eta = (0, 0, 0, 1)$, where we order coordinates as $\{t, r, \vartheta, \phi\}$.
• asymptotically flat: we recover Minkowski space for $M/r \to 0$.
• the metric is diagonal.
• potential singularities at $r = 2M$ and $r = 0$. The radius $2M$ is called Schwarzschild radius and has the value
\[ R_s = \frac{2GM}{c^2} = 3\,\text{km}\, \frac{M}{M_\odot} . \tag{4.12} \]
• Crossing $R_s$, the coordinate $t$ becomes space-like, while $r$ becomes time-like.

4.3 Gravitational redshift


Redshift formula According to Eq. (1.49), an observer with four-velocity $u_{\rm obs}$ measures the frequency
\[ \omega = p \cdot u_{\rm obs} \tag{4.13} \]
of a photon with four-momentum $p$. For an observer at rest,
\[ u_{\rm obs} \cdot u_{\rm obs} = 1 = g_{tt}\, (u^t)^2 . \tag{4.14} \]
Hence
\[ u_{\rm obs} = (1 - 2M/r)^{-1/2}\, \xi . \tag{4.15} \]
Inserting this into (4.13), we find for the frequency measured by an observer at position $r$,
\[ \omega(r) = (1 - 2M/r)^{-1/2}\, \xi \cdot p . \tag{4.16} \]
Since $\xi \cdot p$ is conserved and $\omega_\infty = \xi \cdot p$, we obtain
\[ \omega_\infty = \omega(r) \sqrt{1 - \frac{2M}{r}} . \tag{4.17} \]
Thus a photon climbing out of the potential well of the mass $M$ loses energy, in agreement with the principle of equivalence. The information sent towards an observer at infinity by a spaceship falling towards $r = 2M$ will be more and more redshifted, with $\omega_\infty \to 0$ for $r \to 2M$. This indicates that $r = 2M$ is an event horizon hiding all processes inside from the outside.
If $M/r \ll 1$, we can expand the square root. Inserting also $G$ and $c$, we find
\[ \omega_\infty \approx \omega(r) \left( 1 - \frac{GM}{rc^2} \right) = \omega(r) \left( 1 + \frac{V_N}{c^2} \right) , \tag{4.18} \]
where $V_N = -GM/r$ is the Newtonian potential.
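As a numerical illustration of Eqs. (4.12), (4.17) and (4.18), the following sketch evaluates the Schwarzschild radius of the Sun and compares the exact redshift factor with its weak-field expansion. The constants are rounded reference values inserted for this example and the function names are illustrative.

```python
import math

# Rounded SI values, assumed here for illustration only.
G = 6.674e-11        # m^3 kg^-1 s^-2
C = 2.998e8          # m/s
M_SUN = 1.989e30     # kg
R_SUN = 6.957e8      # m

def schwarzschild_radius(mass_kg):
    """R_s = 2GM/c^2, Eq. (4.12)."""
    return 2.0 * G * mass_kg / C ** 2

def redshift_factor(r, mass_kg):
    """omega_infinity / omega(r) = sqrt(1 - R_s/r), Eq. (4.17)."""
    return math.sqrt(1.0 - schwarzschild_radius(mass_kg) / r)

def redshift_factor_weak(r, mass_kg):
    """Weak-field limit 1 - GM/(r c^2), Eq. (4.18)."""
    return 1.0 - G * mass_kg / (r * C ** 2)

rs_sun = schwarzschild_radius(M_SUN)
exact_at_surface = redshift_factor(R_SUN, M_SUN)
weak_at_surface = redshift_factor_weak(R_SUN, M_SUN)
near_horizon = redshift_factor(1.0001 * rs_sun, M_SUN)
```

At the solar surface the two expressions differ only at order $(R_s/r)^2$, while the factor drops towards zero close to $r = R_s$.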


4.4 Orbits of massive particles


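Radial equation and effective potential for massive particles Spherical symmetry means that the movement of a test particle is contained in a plane. We choose $\vartheta = \pi/2$ and $u^\vartheta = 0$. We replace in the normalization condition $u \cdot u = 1$ written out for the Schwarzschild metric,
\[ 1 = \left( 1 - \frac{2M}{r} \right) \left( \frac{dt}{d\tau} \right)^2 - \left( 1 - \frac{2M}{r} \right)^{-1} \left( \frac{dr}{d\tau} \right)^2 - r^2 \left( \frac{d\phi}{d\tau} \right)^2 , \tag{4.19} \]
the velocities $u^t$ and $u^\phi$ by the conserved quantities
\begin{align}
e &\equiv \xi \cdot u = \left( 1 - \frac{2M}{r} \right) \frac{dt}{d\tau} , \tag{4.20} \\
l &\equiv -\eta \cdot u = r^2 \sin^2\vartheta\, \frac{d\phi}{d\tau} . \tag{4.21}
\end{align}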

Setting then $A = 1 - 2M/r$, we find
\[ 1 = \frac{e^2}{A} - \frac{1}{A} \left( \frac{dr}{d\tau} \right)^2 - \frac{l^2}{r^2} . \tag{4.22} \]
We want to rewrite this equation in a form similar to the energy equation in the Newtonian case. Multiplying by $A/2$ makes the $dr/d\tau$ term similar to a kinetic energy term. Bringing also all constant terms to the LHS and calling them $E \equiv (e^2 - 1)/2$, we obtain
\[ E \equiv \frac{e^2 - 1}{2} = \frac{1}{2} \left( \frac{dr}{d\tau} \right)^2 + V_{\rm eff} \tag{4.23} \]
with
\[ V_{\rm eff} = -\frac{M}{r} + \frac{l^2}{2r^2} - \frac{M l^2}{r^3} = V_0 - \frac{M l^2}{r^3} . \tag{4.24} \]
Hence the energy² of a test particle in the Schwarzschild metric can be, as in the Newtonian case, divided into kinetic energy and potential energy. The latter contains the additional term $M l^2/r^3$, suppressed by $1/c^2$, that becomes important at small $r$.
The asymptotic behavior of $V_{\rm eff}$ for $r \to 0$ and $r \to \infty$ is
\[ V_{\rm eff}(r \to \infty) \to -\frac{M}{r} \qquad \text{and} \qquad V_{\rm eff}(r \to 0) \to -\frac{M l^2}{r^3} , \tag{4.25} \]
while the potential at the Schwarzschild radius, $V_{\rm eff}(2M) = -1/2$, is independent of $M$ and $l$.
We determine the extrema of $V_{\rm eff}$ by solving $dV_{\rm eff}/dr = 0$ and find
\[ r_{1,2} = \frac{l^2}{2M} \left[ 1 \pm \sqrt{1 - 12 M^2/l^2} \right] . \tag{4.26} \]
Hence the potential has no extrema for $l/M < \sqrt{12}$ and is always negative: A particle can reach $r = 0$ for small enough but finite angular momentum, in contrast to the Newtonian case. By the same argument, there exists a last stable orbit at $r = 6M$, when the two extrema $r_1$ and $r_2$ coincide for $l/M = \sqrt{12}$.
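The location of the extrema can be verified with a short script. The sketch below works in units with $M = 1$ by default and takes $l^2$ rather than $l$ as argument; both are conventions chosen for this example, as are the function names.

```python
import math

def v_eff(r, l_squared, M=1.0):
    """Effective potential of Eq. (4.24) in geometric units."""
    return -M / r + l_squared / (2.0 * r ** 2) - M * l_squared / r ** 3

def extrema(l_squared, M=1.0):
    """Radii (r_inner, r_outer) of Eq. (4.26), or None if no extremum exists."""
    disc = 1.0 - 12.0 * M ** 2 / l_squared
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    prefactor = l_squared / (2.0 * M)
    return prefactor * (1.0 - root), prefactor * (1.0 + root)

def slope(r, l_squared, M=1.0, h=1e-6):
    """Central-difference estimate of dV_eff/dr."""
    return (v_eff(r + h, l_squared, M) - v_eff(r - h, l_squared, M)) / (2.0 * h)

r_inner, r_outer = extrema(16.0)   # l = 4M: maximum at r = 4M, minimum at r = 12M
r_last_stable = extrema(12.0)      # l^2 = 12 M^2: both extrema merge at r = 6M
no_extrema = extrema(9.0)          # l = 3M: the potential is monotonic
```

For $l = 4M$ the maximum sits at $r = 4M$ with $V_{\rm eff} = 0$ and the minimum at $r = 12M$, which can be compared with the corresponding curve in Fig. 4.1.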
The orbits can be classified according to the relative size of $E$ and $V_{\rm eff}$ for a given $l$:
² More precisely, $e$ and $l$ are the energy and the angular momentum per unit mass. Thus the $-1$ in $E$ corresponds to the rest mass of the test particle.


[Plot, two panels: $V_{\rm eff}$ versus $r/M$ for $0 \le r/M \le 14$, with curves labelled $l = 2M$, $3.7M$, $4M$, $4.6M$ and $6M$.]

Figure 4.1: The effective potential Veff for various values of l/M as function of distance r/M ,
for two different scales.

• Bound orbits exist for $E < 0$: two circular orbits, a stable one at the minimum of $V_{\rm eff}$ and an unstable one at the maximum of $V_{\rm eff}$, as well as orbits that oscillate between the two turning points.
• Scattering orbits exist for $E > 0$: If $E > \max\{V_{\rm eff}\}$, the particle hits the singularity $r = 0$ after a finite time. For $0 < E < \max\{V_{\rm eff}\}$, the particle turns at the radius where $E = V_{\rm eff}(r)$ and escapes to $r \to \infty$.

We derive below a differential equation for $r(\phi)$, from which the orbits in the Schwarzschild metric can be calculated. For the lazy student, several webpages exist where such orbits can be visualised, see e.g. http://www.fourmilab.ch/gravitation/orbits/.

Radial infall We consider the free fall of a particle that is at rest at infinity, $dt/d\tau = 1$, $E = 0$ and $l = 0$. The radial equation (4.23) simplifies to
\[ \frac{1}{2} \left( \frac{dr}{d\tau} \right)^2 = \frac{M}{r} \tag{4.27} \]
and can be integrated by separation of variables,
\[ \int_r^0 dr\, r^{1/2} = \sqrt{2M} \int_{\tau_*}^{\tau} d\tau \tag{4.28} \]
with the result
\[ \frac{2}{3}\, r^{3/2} = \sqrt{2M}\, (\tau_* - \tau) . \tag{4.29} \]
Hence a freely falling particle needs only a finite proper-time to fall from finite $r$ to $r = 0$. In particular, it passes the Schwarzschild radius $2M$ in finite proper time.
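A quick numerical cross-check of Eq. (4.29), given as a sketch in geometric units with illustrative function names: the closed-form proper time is compared with a direct midpoint-rule integration of $d\tau = dr/\sqrt{2M/r}$ obtained from Eq. (4.27).

```python
import math

def proper_time_to_center(r, M=1.0):
    """tau_* - tau from Eq. (4.29): proper time to fall from r to r = 0."""
    return (2.0 / 3.0) * r ** 1.5 / math.sqrt(2.0 * M)

def proper_time_numeric(r_start, r_end, M=1.0, n=100000):
    """Midpoint-rule integral of d tau = dr / sqrt(2M/r) between r_end and r_start."""
    h = (r_start - r_end) / n
    total = 0.0
    for i in range(n):
        r_mid = r_end + (i + 0.5) * h
        total += math.sqrt(r_mid / (2.0 * M))
    return total * h

# Fall from r = 10M to the Schwarzschild radius r = 2M (with M = 1).
analytic = proper_time_to_center(10.0) - proper_time_to_center(2.0)
numeric = proper_time_numeric(10.0, 2.0)
```

Both values agree and are finite, illustrating that nothing special happens at $r = 2M$ when the fall is described in proper time.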
We can answer the same question using the coordinate time $t$ by combining Eqs. (4.20) [with $E = 0$ and thus $e = 1$] and (4.27),
\[ \frac{dt}{dr} = -\left( \frac{2M}{r} \right)^{-1/2} \left( 1 - \frac{2M}{r} \right)^{-1} . \tag{4.30} \]

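Integrating gives
\begin{align}
t &= -\int dr \left( \frac{2M}{r} \right)^{-1/2} \left( 1 - \frac{2M}{r} \right)^{-1} \nonumber \\
&= t_0 + 2M \left\{ -\frac{2}{3} \left( \frac{r}{2M} \right)^{3/2} - 2 \left( \frac{r}{2M} \right)^{1/2} + \ln \left[ \frac{\sqrt{r/2M} + 1}{\sqrt{r/2M} - 1} \right] \right\} \tag{4.31} \\
&\to \infty \qquad \text{for } r \to 2M , \nonumber
\end{align}
where $t_0$ is an integration constant.
Since the coordinate time $t$ equals the proper-time for an observer at infinity, a freely falling particle reaches the Schwarzschild radius $r = 2M$ only for $t \to \infty$ for such an observer.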
The last result can be derived immediately for light-rays. Choosing a light-ray in radial direction with $d\phi = d\vartheta = 0$, the metric (4.11) simplifies with $ds^2 = 0$ to
\[ \frac{dr}{dt} = \pm \left( 1 - \frac{2M}{r} \right) . \tag{4.32} \]
Thus light travelling towards the star, as seen from the outside, will travel slower and slower as it comes closer to the Schwarzschild radius $r = 2M$. The coordinate time is $\propto \ln |1 - 2M/r|$ and thus for an observer at infinity the signal will reach $r = 2M$ again only asymptotically for $t \to \infty$.

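Perihelion precession We recall first the derivation of the law of motion $r = r(\phi)$ in the Newtonian case. We solve the Lagrange equations for $L = (1/2)\, m (\dot r^2 + r^2 \dot\phi^2) + GMm/r$, obtaining
\begin{align}
r^2 \dot\phi &= l , \tag{4.33} \\
\ddot r &= \frac{l^2}{r^3} - \frac{GM}{r^2} . \tag{4.34}
\end{align}
We eliminate $t$ by
\[ \frac{dr}{dt} = \frac{dr}{d\phi} \frac{d\phi}{dt} = \frac{dr}{d\phi} \frac{l}{r^2} \equiv r' \frac{l}{r^2} \tag{4.35} \]
and introduce $u = 1/r$,
\[ u'' + u = \frac{GM}{l^2} . \tag{4.36} \]
The solution follows as
\[ u = \frac{GM}{l^2}\, (1 + e \cos\phi) . \tag{4.37} \]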
We redo the same steps, starting from Eq. (4.23) for the Schwarzschild metric,
\[ \dot r^2 + \frac{l^2}{r^2} = e^2 - 1 + \frac{2M}{r} + \frac{2M l^2}{r^3} . \tag{4.38} \]
We first eliminate $\tau$ and then introduce $u = 1/r$,
\[ (u')^2 + u^2 = \frac{e^2 - 1}{l^2} + \frac{2Mu}{l^2} + 2M u^3 . \tag{4.39} \]


We can transform this into a second-order differential equation by differentiating with respect to $\phi$. Thereby we also eliminate the constant $(e^2 - 1)/l^2$, and dividing³ by $2u'$ it follows
\[ u'' + u = \frac{M}{l^2} + 3M u^2 = \frac{GM}{l^2} + \frac{3GM}{c^2}\, u^2 . \tag{4.40} \]
In the last step we reintroduced $c$ and $G$. Hence we see that the Newtonian limit corresponds to $c \to \infty$ ("instantaneous interactions") or $v/c \to 0$ ("static limit"). The latter statement becomes clear, if one uses the virial theorem: $GMu = GM/r \sim v^2$.
In most situations, the relativistic correction is tiny. We therefore use perturbation theory to determine an approximate solution, setting $u = u_0 + \delta u$, where $u_0$ is the Newtonian solution. Inserting $u$ into Eq. (4.40), we obtain
\[ (\delta u)'' + \delta u = \frac{3GM}{c^2}\, (u_0^2 + 2 u_0\, \delta u + \delta u^2) . \tag{4.41} \]
Here we used that $u_0$ solves the Newtonian equation of motion (4.36). Keeping on the RHS only the leading term $u_0^2$ results in
\[ (\delta u)'' + \delta u = \frac{3(GM)^3}{c^2 l^4}\, (1 + 2e \cos\phi + e^2 \cos^2\phi) . \tag{4.42} \]
Its solution is
\[ \delta u = \frac{3(GM)^3}{c^2 l^4} \left[ 1 + e \phi \sin\phi + e^2 \left( \frac{1}{2} - \frac{1}{6} \cos(2\phi) \right) \right] . \tag{4.43} \]

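The solution of the linear inhomogeneous differential equation (4.42) is found by adding the particular solutions of the three inhomogeneous terms. With $A$, $B$ and $C$ being constant, it is
\begin{align}
u'' + u = A \quad &\Rightarrow \quad u = A , \tag{4.44} \\
u'' + u = B \cos\phi \quad &\Rightarrow \quad u = \frac{1}{2}\, B \phi \sin\phi , \tag{4.45} \\
u'' + u = C \cos^2\phi \quad &\Rightarrow \quad u = C \left( \frac{1}{2} - \frac{1}{6} \cos(2\phi) \right) . \tag{4.46}
\end{align}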

While the first and third term in the square bracket lead only to extremely tiny changes in the orbital parameters, the second term is linear in $\phi$ and its effect therefore accumulates with time. Thus we include only $\delta u \propto e \phi \sin\phi$ in the approximate solution. Introducing $\alpha = 3(GM)^2/(cl)^2 \ll 1$ and employing
\[ \cos[\phi (1 - \alpha)] = \cos\phi \cos(\alpha\phi) + \sin\phi \sin(\alpha\phi) \simeq \cos\phi + \alpha \phi \sin\phi , \tag{4.47} \]
we find
\[ u = u_0 + \delta u \simeq \frac{GM}{l^2} \left[ 1 + e (\cos\phi + \alpha \phi \sin\phi) \right] \simeq \frac{GM}{l^2} \left[ 1 + e \cos(\phi (1 - \alpha)) \right] . \tag{4.48} \]
Hence the period is $2\pi/(1 - \alpha)$, and the ellipse precesses with
\[ \Delta\phi = \frac{2\pi}{1 - \alpha} - 2\pi \simeq 2\pi\alpha = \frac{6\pi (GM)^2}{(lc)^2} = \frac{6\pi GM}{a (1 - e^2) c^2} . \tag{4.49} \]
³ The case $u' = 0$ corresponds to a circular orbit.


The effect increases for orbits with small major axis $a$ and large eccentricity $e$. Urbain Le Verrier first recognized in 1859 that the precession of Mercury's perihelion deviates from the Newtonian prediction: Perturbations by other planets lead to $\Delta\phi = 532.3''$/century, compared to the observed value of $\Delta\phi = 574.1''$/century. The main part of the discrepancy is explained by the effect of Eq. (4.49), predicting a shift of $\Delta\phi = 43.0''$/century. (Tiny additional corrections are induced by the quadrupole moment of the Sun ($0.02''$/century) and the Lense-Thirring effect ($-0.002''$/century).)
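The number quoted for Mercury follows directly from Eq. (4.49). The sketch below uses rounded orbital elements of Mercury and a rounded value of $GM_\odot$; these inputs and the function name are assumptions made for this example and are not taken from the notes.

```python
import math

# Rounded reference values, assumed for illustration.
GM_SUN = 1.32712e20        # m^3/s^2
C = 2.99792458e8           # m/s
A_MERCURY = 5.791e10       # semi-major axis in m
E_MERCURY = 0.2056         # eccentricity
PERIOD_DAYS = 87.969       # orbital period in days

def perihelion_shift_per_orbit(a, e, gm=GM_SUN):
    """Delta phi = 6 pi G M / (a (1 - e^2) c^2) in radians, Eq. (4.49)."""
    return 6.0 * math.pi * gm / (a * (1.0 - e ** 2) * C ** 2)

orbits_per_century = 36525.0 / PERIOD_DAYS
shift_rad = perihelion_shift_per_orbit(A_MERCURY, E_MERCURY) * orbits_per_century
shift_arcsec = math.degrees(shift_rad) * 3600.0
```

With these inputs `shift_arcsec` comes out close to 43 arcseconds per century.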

4.5 Orbits of photons


We repeat the discussion of geodesics of massive particles for massless ones by changing $u \cdot u = 1$ into $u \cdot u = 0$ and by using an affine parameter $\lambda$ instead of the proper-time $\tau$. Reordering gives
\[ \frac{1}{b^2} \equiv \frac{e^2}{l^2} = \frac{1}{l^2} \left( \frac{dr}{d\lambda} \right)^2 + W_{\rm eff} \tag{4.50} \]
with the impact parameter $b = |l/e|$ and
\[ W_{\rm eff} = \frac{1}{r^2} \left( 1 - \frac{2M}{r} \right) . \tag{4.51} \]

The radial equation (4.50) is invariant under reparametrisations of the affine parameter, $\lambda \to A\lambda + B$, since the change cancels both in $b$ and in $l\, d\lambda$. Consequently, the orbit of a photon does not depend separately on the energy $e$ and the angular momentum $l$, but only on the impact parameter $b$ of the photon.
The maximum of $W_{\rm eff}$ is at $r = 3M$ with height $1/(27M^2)$. For impact parameters $b > \sqrt{27}\, M$, photon orbits have a turning point and photons escape to infinity. For $b < \sqrt{27}\, M$, they hit $r = 0$, while for $b = \sqrt{27}\, M$ an (unstable) circular orbit is possible.
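The position and height of the maximum of $W_{\rm eff}$, and the resulting critical impact parameter, can be confirmed with a few lines. This is a sketch in units with $M = 1$; the grid scan and the function names are choices made for this example.

```python
import math

def w_eff(r, M=1.0):
    """Effective potential for photons, Eq. (4.51)."""
    return (1.0 - 2.0 * M / r) / r ** 2

def scan_maximum(r_min=2.0, r_max=10.0, n=80000):
    """Coarse grid search for the maximum of W_eff outside the horizon (M = 1)."""
    best_r, best_w = r_min, w_eff(r_min)
    for i in range(1, n + 1):
        r = r_min + (r_max - r_min) * i / n
        w = w_eff(r)
        if w > best_w:
            best_r, best_w = r, w
    return best_r, best_w

r_peak, w_peak = scan_maximum()
# A photon with 1/b^2 equal to the peak height sits on the circular orbit.
b_critical = 1.0 / math.sqrt(w_peak)
```

The scan returns a peak at $r \approx 3M$ with height $1/27$, so that `b_critical` is close to $\sqrt{27} \approx 5.196$ in units of $M$.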

Light deflection We transform Eq. (4.50) as in the $m > 0$ case into a differential equation for $u(\phi)$. For small deflections, we use again perturbation theory. In zeroth order in $v/c$, we can set the RHS of
\[ u'' + u = \frac{3GM}{c^2}\, u^2 \tag{4.52} \]
to zero. The solution $u_0$ is a straight line,
\[ u_0 = \frac{\sin\phi}{b} . \tag{4.53} \]
Inserting $u = u_0 + \delta u$ gives
\[ (\delta u)'' + \delta u = \frac{3GM}{c^2} \frac{\sin^2\phi}{b^2} . \tag{4.54} \]
A particular solution is
\[ \delta u = \frac{3GM}{2 c^2 b^2} \left( 1 + \frac{1}{3} \cos(2\phi) \right) . \tag{4.55} \]
Thus the complete approximate solution is
\[ u = u_0 + \delta u = \frac{\sin\phi}{b} + \frac{3GM}{2 c^2 b^2} \left( 1 + \frac{1}{3} \cos(2\phi) \right) . \tag{4.56} \]


Considering the limit $r \to \infty$ or $u \to 0$ of this equation gives half of the deflection angle of a light-ray with impact parameter $b$ by a point mass $M$,
\[ \Delta\phi = \frac{4GM}{c^2 b} = \frac{2R_s}{b} . \tag{4.57} \]
For a light-ray grazing the Solar surface, $b = R_\odot$, we obtain as numerical estimate
\[ \Delta\phi_\odot = \frac{4GM_\odot}{c^2 R_\odot} = \frac{2R_s}{R_\odot} \simeq 10^{-5} \approx 2'' . \tag{4.58} \]
For a recollection of the 1919 results see https://arxiv.org/abs/2010.13744.
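The estimate (4.58) can be made more precise with a short calculation. The sketch below uses rounded values for $GM_\odot$ and $R_\odot$, which are assumptions inserted for this example.

```python
import math

# Rounded reference values, assumed for illustration.
GM_SUN = 1.32712e20     # m^3/s^2
C = 2.99792458e8        # m/s
R_SUN = 6.957e8         # m

def deflection_angle(b, gm=GM_SUN):
    """Total light deflection 4GM/(c^2 b) in radians, Eq. (4.57)."""
    return 4.0 * gm / (C ** 2 * b)

deflection_rad = deflection_angle(R_SUN)
deflection_arcsec = math.degrees(deflection_rad) * 3600.0
```

With these inputs the deflection of a ray grazing the Sun is about $8.5 \times 10^{-6}$ rad, i.e. roughly 1.75 arcseconds, consistent with the rounded value in (4.58).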

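Shapiro effect Shapiro suggested to use the time-delay of a radar signal as a test of general relativity. Suppose we send a radar signal from the Earth to Venus where it is reflected back to Earth. The point $r_0$ of closest approach to the Sun is characterized by $dr/dt|_{r_0} = 0$. Rewriting Eq. (4.50) as
\[ \dot r^2 + \frac{l^2}{r^2} \left( 1 - \frac{2M}{r} \right) = e^2 \tag{4.59} \]
and introducing the conserved quantity $e$ in $\dot r^2$,
\[ \dot r^2 = \left( \frac{dr}{dt} \frac{dt}{d\lambda} \right)^2 = \frac{e^2}{(1 - 2M/r)^2} \left( \frac{dr}{dt} \right)^2 , \tag{4.60} \]
we find
\[ \frac{1}{(1 - 2M/r)^3} \left( \frac{dr}{dt} \right)^2 + \frac{l^2}{e^2 r^2} - \frac{1}{1 - 2M/r} = 0 . \tag{4.61} \]
We now evaluate this equation at the point of closest approach, i.e. for $dr/dt|_{r_0} = 0$,
\[ \frac{l^2}{e^2} = \frac{r_0^2}{1 - 2M/r_0} , \tag{4.62} \]
and use this equation to eliminate $l^2/e^2$ in (4.61). Then we obtain
\[ \frac{dr}{dt} = \left( 1 - \frac{2M}{r} \right) \left[ 1 - \frac{r_0^2\, (1 - 2M/r)}{r^2\, (1 - 2M/r_0)} \right]^{1/2} \tag{4.63} \]
or
\[ t(r, r_0) = \int_{r_0}^{r} \frac{dr}{1 - 2M/r} \left[ 1 - \frac{r_0^2\, (1 - 2M/r)}{r^2\, (1 - 2M/r_0)} \right]^{-1/2} . \tag{4.64} \]
Next we expand this expression in $M/r \ll 1$,
\begin{align}
t(r, r_0) &= \int_{r_0}^{r} dr\, \frac{r}{(r^2 - r_0^2)^{1/2}} \left[ 1 + \frac{2M}{r} + \frac{M r_0}{r (r + r_0)} \right] \tag{4.65} \\
&= \frac{(r^2 - r_0^2)^{1/2}}{c} + \frac{2GM}{c^3} \ln \left[ \frac{r + (r^2 - r_0^2)^{1/2}}{r_0} \right] + \frac{GM}{c^3} \left( \frac{r - r_0}{r + r_0} \right)^{1/2} , \tag{4.66}
\end{align}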
where we also restored $G$ and $c$ in the last step. The first term corresponds to straight line propagation and thus the excess time $\Delta t$ is given by the second and third term. Finally, we can use that the orbits both of Earth and Venus are much more distant from the Sun than the point of closest approach, $R_E, R_V \gg r_0$. Hence we obtain for the time delay
\[ \Delta t = \frac{4GM}{c^3} \left[ \ln \left( \frac{4 R_E R_V}{r_0^2} \right) + 1 \right] . \tag{4.67} \]
In Fig. 4.2, one of the first measurements of the Shapiro time-delay is shown together with the prediction using Eq. (4.67); an excellent agreement is visible.

Figure 4.2: Measurement of the Shapiro time-delay compared to the prediction in GR.
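To get a feeling for the size of the effect, Eq. (4.67) can be evaluated for a signal grazing the Sun. The following sketch uses rounded orbital radii for Earth and Venus and sets $r_0 = R_\odot$; all numerical inputs and the function name are assumptions for this example.

```python
import math

# Rounded reference values, assumed for illustration.
GM_SUN = 1.32712e20     # m^3/s^2
C = 2.99792458e8        # m/s
R_EARTH_ORBIT = 1.496e11    # m
R_VENUS_ORBIT = 1.082e11    # m
R_SUN = 6.957e8             # m, closest approach of a grazing signal

def shapiro_delay(r_e, r_v, r0, gm=GM_SUN):
    """Excess travel time of Eq. (4.67) in seconds."""
    return 4.0 * gm / C ** 3 * (math.log(4.0 * r_e * r_v / r0 ** 2) + 1.0)

delay_seconds = shapiro_delay(R_EARTH_ORBIT, R_VENUS_ORBIT, R_SUN)
delay_microseconds = delay_seconds * 1e6
```

For these inputs the delay is of the order of 250 microseconds, small compared to the total travel time of roughly half an hour but well within reach of radar timing.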

4.6 Post-Newtonian parameters


In order to search for deviations from general relativity one uses the post-Newtonian approximation, i.e. an expansion around Minkowski space. Any spherically symmetric, static spacetime can be expressed as
\[ ds^2 = A(r)\, dt^2 - B(r)\, dr^2 - r^2 (d\vartheta^2 + \sin^2\vartheta\, d\phi^2) \tag{4.68} \]
with two unknown functions $A(r)$ and $B(r)$. Since the only available length is $R_g$, $A$ and $B$ can be expanded as power series in $R_g/r$,
\begin{align}
A(r) &= 1 + a_1 \frac{R_g}{r} + a_2 \left( \frac{R_g}{r} \right)^2 + \ldots = 1 - \frac{2GM}{c^2 r} + 2(\beta - \gamma) \left( \frac{2GM}{c^2 r} \right)^2 + \ldots \tag{4.69} \\
B(r) &= 1 + b_1 \frac{R_g}{r} + b_2 \left( \frac{R_g}{r} \right)^2 + \ldots = 1 + \gamma\, \frac{2GM}{c^2 r} + \ldots \tag{4.70}
\end{align}
Agreement with Newtonian gravity is achieved, if $a_1$ is the only non-zero expansion coefficient and satisfies $a_1 R_g = -2GM/c^2$, i.e. for $A = 1 - 2GM/(rc^2)$ and $B = 1$. Searching for deviations from GR, one therefore keeps $a_1$ fixed and introduces the "post-Newtonian" parameters $\beta$ and $\gamma$ such that agreement with Einstein gravity is achieved for $\gamma = 1$ and $\beta - \gamma = 0$, since the function $A$ of the Schwarzschild metric (4.11) contains no term quadratic in $GM/(c^2 r)$. The predictions for the three classical tests of GR we have discussed can be redone using the metric (4.68) with the expansions (4.69) and (4.70). Alternative theories of gravity predict the numerical values of the post-Newtonian parameters and can thereby easily be compared to experimental results.

4.A Appendix: General stationary isotropic metric

5 Gravitational lensing
One distinguishes three different cases of gravitational lensing, depending on the strength of
the lensing effect:
1. Strong lensing occurs when the lens is very massive and the source is close to it: In
this case light can take different paths to the observer and more than one image of
the source will appear, either as multiple images or deformed arcs of a source. In the
extreme case that a point-like source, lens and observer are aligned the image forms an
“Einstein ring”.
2. Weak Lensing: In many cases the lens is not strong enough to form multiple images or
arcs. However, the source can still be distorted and its image may be both stretched
(shear) and magnified (convergence). If all sources were well known in size and shape,
one could just use the shear and convergence to deduce the properties of the lens.
3. Microlensing: One observes only the usual point-like image of the source. However,
the additional light bent towards the observer leads to brightening of the source. Thus
microlensing is only observable as a transient phenomenon, when the lens crosses ap-
proximately the axis observer-source.

Lens equation We consider the simplest case of a point-like mass $M$, the lens, between the observer O and the source S as shown in Fig. 5.1. The angle $\beta$ denotes the (unobservable) angle between the true position of the source and the direction to the lens, while $\vartheta_\pm$ are the angles between the image positions and the direction to the lens. The corresponding distances $D_{\rm os}$, $D_{\rm ol}$, and $D_{\rm ls}$ are also depicted in Fig. 5.1 and, since $D_{\rm ol} + D_{\rm ls} = D_{\rm os}$ does not hold in cosmology, we keep all three distances. Finally, the impact parameter $b$ is as usual the smallest distance between the light-ray and the lens.
Then the lens equation in the "thin lens" ($b \ll D_i$) and weak deflection ($\alpha \ll 1$) limit follows from AS + SB = AB as
\[ \vartheta D_{\rm os} = \alpha D_{\rm ls} + \beta D_{\rm os} . \tag{5.1} \]
The thin lens approximation implies $\vartheta \ll 1$, and since $\beta < \vartheta$, also $\beta$ is small. Solving for $\beta$ and inserting for the deflection angle $\alpha = 4GM/(c^2 b)$ as well as $b = \vartheta D_{\rm ol}$, we find first
\[ \beta = \vartheta - \frac{4GM}{c^2} \frac{D_{\rm ls}}{D_{\rm os} D_{\rm ol}} \frac{1}{\vartheta} . \tag{5.2} \]
Multiplying by $\vartheta$, we obtain then a quadratic equation,
\[ \vartheta^2 - \beta\vartheta - \vartheta_E^2 = 0 , \tag{5.3} \]
where we introduced the Einstein angle
\[ \vartheta_E = \left( 2R_S\, \frac{D_{\rm ls}}{D_{\rm os} D_{\rm ol}} \right)^{1/2} . \tag{5.4} \]

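[Sketch of the lensing geometry with the lens L, the source S, the points A and B, the angles $\alpha$, $\beta$ and $\vartheta_\pm$, and the distances $D_{\rm os}$, $D_{\rm ls}$ and $D_{\rm ol}$.]

Figure 5.1: The source S that is off the optical axis OL by the angle $\beta$ appears as two images on opposite sides of the optical axis OL. The two images are separated by the angles $\vartheta_\pm$ from the optical axis OL.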

The two images of the source are deflected by the angles
\[ \vartheta_\pm = \frac{1}{2} \left[ \beta \pm (\beta^2 + 4\vartheta_E^2)^{1/2} \right] \tag{5.5} \]
from the line-of-sight to the lens. If we do not know the lens location, measuring the separation of the two lensed images, $\vartheta_+ + |\vartheta_-|$, provides only an upper bound on the lens mass. If observer, lens and source are aligned, then symmetry implies that $\vartheta_+ = |\vartheta_-| = \vartheta_E$, i.e. the image becomes a circle with radius $\vartheta_E$. Deviations from this perfectly symmetric situation break the circle into arcs as shown in an image of the galaxy cluster Abell 2218 in Fig. 5.2.
For a numerical estimate of the Einstein angle in case of a stellar object in our own galaxy, we set $M = M_\odot$ and $D_{\rm ls}/D_{\rm os} \approx 1/2$ and obtain
\[ \vartheta_E = 6.4'' \times 10^{-4} \left( \frac{M}{M_\odot} \right)^{1/2} \left( \frac{D_{\rm ol}}{10\,\text{kpc}} \right)^{-1/2} . \tag{5.6} \]
The numerical value, a fraction of $10^{-3}$ of an arc-second for the deflection, led to the name "microlensing."

Magnification Without scattering or absorption of photons, the conservation of photon


number implies that the intensity along the trajectory of a light-ray stays constant, 1
2 dN dN I
f (x, p) = 3 3 = = . (5.7)
h3 d xd p dA dt dΩ dE hcp2
In particular, we found that the observed intensity I equals the surface brightness B of the
source: The F ∝ 1/r 2 law follows since the solid angle dΩ seen by a detector of size A
decreases as 1/r 2 .
1
We define here intensity as connected to the energy flux F, while often the particle flux is used.

Figure 5.2: Gravitational lensing of the galaxy cluster Abell 2218.

Gravity can affect this result in two ways: First, gravity can redshift the frequency of photons, $\nu_{\rm sr} = \nu_{\rm obs} (1 + z)$. This can be either the gravitational redshift as in Sec. 4.3 or a cosmological redshift due to the expansion of the universe (that will be discussed in Sec. 10.2). Thus the intensity $I_{\rm obs}$ at the observed frequency $\nu_{\rm obs}$ is the emitted intensity evaluated at $\nu_{\rm obs} (1 + z)$ and reduced by $(1 + z)^3$,
\[ I_{\rm obs}(\nu_{\rm obs}) = \frac{I(\nu_{\rm sr})}{(1 + z)^3} . \tag{5.8} \]
In both cases, this redshift depends only on the initial and the final point of the photon trajectory, but not on the actual path in-between. Thus the redshift cancels if one considers the relative magnification of a source by gravitational lensing.
Second, gravitational lensing affects the solid angle under which the source is seen in a detector of fixed size. As a result, the apparent brightness of a source increases proportionally to the increase of the visible solid angle, if the source cannot be resolved as an extended object (cf. Sec...). Hence we can compute the magnification of a source by calculating the ratio of the solid angle visible with and without lensing.
In Fig. 5.3, we sketch how the two lensed images are stretched: An infinitesimally small
surface element $2\pi \sin\beta\, d\beta\, d\phi \approx 2\pi\beta\, d\beta\, d\phi$ of the unlensed source becomes in the lens plane
$2\pi\vartheta_\pm\, d\vartheta_\pm\, d\phi$. Thus the images are tangentially stretched by $\vartheta_\pm/\beta$, while the radial size is
changed by $d\vartheta_\pm/d\beta$. Thus the magnification $a_\pm$ of the source is

$$ a_\pm = \frac{\vartheta_\pm\, d\vartheta_\pm}{\beta\, d\beta}. \tag{5.9} $$

Differentiating Eq. (5.5) gives

$$ \frac{d\vartheta_\pm}{d\beta} = \frac{1}{2}\left[1 \pm \frac{\beta}{(\beta^2 + 4\vartheta_E^2)^{1/2}}\right] \tag{5.10} $$


Figure 5.3: The effect of gravitational lensing on the shape of an extended source: The surface
element 2πβdβdφ of the unlensed image at position β is transformed into the two
lensed images of size 2πϑ± dϑ± dφ at position ϑ± .

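and thus, adding the absolute values of the two magnifications (the image at $\vartheta_-$ has
negative parity, $a_- < 0$),
$$ a_{\rm tot} = |a_+| + |a_-| = \frac{x^2+2}{x(x^2+4)^{1/2}} > 1 \tag{5.11} $$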
with x = β/ϑE . For large separation x, the magnification atot goes to one, while the mag-
nification diverges for x → 0 as atot ∼ 1/x: In this limit we would receive light from an
infinite number of images on the Einstein circle. Physically, the approximation of a point
source breaks down when x reaches the extension of the source. Since atot is larger than one,
gravitational lensing always increases the total flux observed from a lensed source, facilitating
the observation of very faint objects. As compensation, the source appears slightly dimmed
to all those observers who do not see the source lensed.
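
As a quick numerical cross-check of Eqs. (5.9) to (5.11), the following minimal sketch evaluates the two image magnifications of a point lens. It assumes that Eq. (5.5) has the standard point-lens form $\vartheta_\pm = \frac{1}{2}[\beta \pm (\beta^2 + 4\vartheta_E^2)^{1/2}]$; the function names are illustrative only. With signed magnifications the sum $a_+ + a_-$ equals one, while the sum of the absolute values reproduces the closed form (5.11).

```python
import math


def image_positions(beta, theta_e):
    """Angular positions (theta_plus, theta_minus) of the two images.

    Assumes the standard point-lens solution for Eq. (5.5).
    """
    root = math.sqrt(beta**2 + 4.0 * theta_e**2)
    return 0.5 * (beta + root), 0.5 * (beta - root)


def signed_magnifications(beta, theta_e):
    """a_pm = (theta_pm / beta) * d(theta_pm)/d(beta), Eqs. (5.9) and (5.10)."""
    root = math.sqrt(beta**2 + 4.0 * theta_e**2)
    theta_plus, theta_minus = image_positions(beta, theta_e)
    a_plus = (theta_plus / beta) * 0.5 * (1.0 + beta / root)
    a_minus = (theta_minus / beta) * 0.5 * (1.0 - beta / root)
    return a_plus, a_minus


def total_magnification(x):
    """Closed form (5.11) as a function of x = beta / theta_E."""
    return (x**2 + 2.0) / (x * math.sqrt(x**2 + 4.0))


if __name__ == "__main__":
    for x in (0.1, 0.5, 1.0, 5.0):
        a_plus, a_minus = signed_magnifications(x, 1.0)
        # Second and third column should agree for every x.
        print(x, abs(a_plus) + abs(a_minus), total_magnification(x))
```

For small x the printed values grow like 1/x, and for large x they approach one, in line with the limits discussed above.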
Two important applications of gravitational lensing are the search for dark matter in the
form of black holes or brown dwarfs in our own galaxy by microlensing and the determination
of the value of the cosmological constant by weak lensing observations.
In microlensing experiments that have tried to detect dark matter in the form of MACHOs
(black holes, brown dwarfs,. . . ) one monitors stars of the LMC. If a MACHO with speed
v ≈ 220 km/s moves through the line-of-sight to a monitored star, the light-curve of the star
is magnified temporarily. If v is the velocity of the MACHO perpendicular to the line-of-sight,
$$ \beta(t) = \left[\beta_0^2 + \frac{v^2}{D_{ol}^2}(t - t_0)^2\right]^{1/2}. \tag{5.12} $$

The magnification a(t) is symmetric around $t_0$ and its shape can be determined by inserting
typical values for $D_{ol}$ and the MACHO mass.
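
The shape of such a light curve can be sketched by combining Eqs. (5.11) and (5.12). The snippet below is an illustration under stated assumptions, not part of the original analysis: it measures angles in units of $\vartheta_E$, so that $x(t) = [x_0^2 + ((t-t_0)/t_E)^2]^{1/2}$, where the crossing time $t_E = D_{ol}\vartheta_E/v$ is a parameter introduced here for convenience and passed in directly.

```python
import math


def total_magnification(x):
    """Total point-lens magnification, Eq. (5.11), with x = beta / theta_E."""
    return (x**2 + 2.0) / (x * math.sqrt(x**2 + 4.0))


def light_curve(t, t0, x0, t_e):
    """Magnification a(t) for a lens moving with constant transverse speed.

    t0  : time of closest approach
    x0  : minimal impact parameter beta_0 / theta_E
    t_e : assumed crossing time D_ol * theta_E / v (same units as t)
    """
    x = math.sqrt(x0**2 + ((t - t0) / t_e) ** 2)
    return total_magnification(x)


if __name__ == "__main__":
    # Hypothetical event: closest approach at day 50, x0 = 0.5, t_E = 20 days.
    for day in range(0, 101, 10):
        print(day, round(light_curve(day, 50.0, 0.5, 20.0), 4))
```

The curve depends on the lens mass and distance only through $t_E$ and $x_0$, which is why a single event does not determine the MACHO mass uniquely.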

6 Black holes
A black hole is a solution of Einstein’s equations containing a physical singularity which in
turn is covered by an event horizon. Such a horizon acts classically as a perfect unidirectional
membrane which any causal influence can cross only towards the singularity.

Some definitions: A conformal transformation of the metric,

$$ g_{\mu\nu}(x) \to \tilde g_{\mu\nu}(x) = \Omega^2(x)\, g_{\mu\nu}(x), \tag{6.1} $$

changes distances, but keeps angles invariant. Thus the causal structure of two conformally
related spacetimes is identical.
A spacetime is called conformally flat if it is connected by a conformal transformation to
Minkowski space,
$$ g_{\mu\nu}(x) = \Omega^2(x)\,\eta_{\mu\nu}(x) = e^{2\omega(x)}\,\eta_{\mu\nu}(x). \tag{6.2} $$
In particular, light-rays also propagate in conformally flat spacetimes along straight lines at
±45 degrees to the time axis.
We add two additional definitions for spacetimes with special symmetries. A stationary
spacetime has a time-like Killing vector field. In appropriate coordinates, the metric tensor
is independent of the time coordinate,

$$ ds^2 = g_{00}(x)\,dt^2 + 2g_{0i}(x)\,dt\,dx^i + g_{ij}(x)\,dx^i dx^j. \tag{6.3} $$

A stationary spacetime is static if it is invariant under time reversal. Thus the off-diagonal
terms $g_{0i}$ have to vanish, and the metric simplifies to

$$ ds^2 = g_{00}(x)\,dt^2 + g_{ij}(x)\,dx^i dx^j. \tag{6.4} $$

An example of a stationary spacetime is the metric around a spherically symmetric mass
distribution which rotates with constant velocity. If the mass distribution is at rest then the
spacetime becomes static.

6.1 Rindler spacetime and the Unruh effect


Rindler spacetime Recall from exercise 2.3 that the trajectory of an accelerated observer
(suppressing the transverse coordinates y and z) is given by

$$ t(\tau) = \frac{1}{a}\sinh(a\tau) \quad\text{and}\quad x(\tau) = \frac{1}{a}\cosh(a\tau). \tag{6.5} $$
It describes one branch of the hyperbola $x^2 - t^2 = a^{-2}$. Introducing light-cone coordinates,

$$ u = t - x \quad\text{and}\quad v = t + x, \tag{6.6} $$


it follows
$$ u(\tau) = -\frac{1}{a}\exp(-a\tau). \tag{6.7} $$
Our aim is to determine how the uniformly accelerated observer experiences Minkowski
space. As a first step, we try to find a frame {ξ, χ} comoving with the observer. In this
frame, the observer is at rest, χ(τ) = 0, and the coordinate time ξ agrees with the proper
time, ξ = τ. Introducing comoving light-cone coordinates,
$$ \tilde u = \xi - \chi \quad\text{and}\quad \tilde v = \xi + \chi, \tag{6.8} $$
these conditions become
$$ \tilde u(\tau) = \tilde v(\tau) = \tau. \tag{6.9} $$
Moreover, we can choose the comoving coordinates such that the metric is conformally flat,
$$ ds^2 = \Omega^2(\xi,\chi)\,(d\xi^2 - d\chi^2) = \Omega^2(\tilde u,\tilde v)\, d\tilde u\, d\tilde v. \tag{6.10} $$
Next we have to relate the comoving coordinates {ũ, ṽ} to Minkowski coordinates {t, x}.
Since $d\tilde u^2$ and $d\tilde v^2$ are missing in the line element, the functions u(ũ, ṽ) and v(ũ, ṽ) can depend
only on one of their two arguments. We can therefore set u(ũ) and v(ṽ). Expressing $\dot u$ as
$$ \frac{du}{d\tau} = \frac{du}{d\tilde u}\,\frac{d\tilde u}{d\tau}, \tag{6.11} $$
and inserting $\dot u = -au$ and $\dot{\tilde u} = 1$, we arrive at
$$ -au = \frac{du}{d\tilde u}. \tag{6.12} $$
Separating variables and integrating we end up with $u = C_1 e^{-a\tilde u}$. In the same way, we find
$v = C_2 e^{a\tilde v}$. Since the line element has to agree along the trajectory with the proper-time,
$ds^2 = d\tau^2 = du\,dv$, the two integration constants $C_1$ and $C_2$ have to satisfy the constraint
$-a^2 C_1 C_2 = 1$. Choosing $C_1 = -C_2$, the desired relation between the two sets of coordinates
becomes
$$ u = -\frac{1}{a}e^{-a\tilde u} \quad\text{and}\quad v = \frac{1}{a}e^{a\tilde v}, \tag{6.13} $$
or using Cartesian coordinates,
$$ t = \frac{1}{a}e^{a\chi}\sinh(a\xi) \quad\text{and}\quad x = \frac{1}{a}e^{a\chi}\cosh(a\xi). \tag{6.14} $$
The spacetime described by the coordinates defining the comoving frame of the accelerated
observer,
$$ ds^2 = e^{2a\chi}(d\xi^2 - d\chi^2), \tag{6.15} $$
is called Rindler spacetime. It is locally equivalent to Minkowski space but differs globally.
If we vary the Rindler coordinates over their full range, ξ ∈ R and χ ∈ R, then we cover
only the quarter of Minkowski space with x > |t|. Thus for an accelerated observer an
event horizon exists: Evaluating on a hypersurface of constant comoving time, ξ = const., the
physical distance from χ = −∞ to the observer placed at χ = 0 gives
$$ d = \int_{-\infty}^{0} d\chi\, \sqrt{|g_{\chi\chi}|} = \frac{1}{a}. \tag{6.16} $$
This corresponds to the coordinate distance between the observer and the horizon in
Minkowski coordinates.
Definition: The particle horizon is the maximal distance from which we can receive signals,
while the event horizon defines the maximal distance to which we can send signals.
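
The relations (6.14) to (6.16) are easy to test numerically. The following sketch (function names and step sizes are illustrative choices) maps Rindler to Minkowski coordinates, compares the Minkowski interval $dt^2 - dx^2$ of a small displacement with the conformally flat form (6.15), and integrates the proper distance to the horizon with a simple midpoint rule.

```python
import math


def rindler_to_minkowski(xi, chi, a):
    """Coordinate map (6.14) from Rindler (xi, chi) to Minkowski (t, x)."""
    scale = math.exp(a * chi) / a
    return scale * math.sinh(a * xi), scale * math.cosh(a * xi)


def interval_ratio(xi, chi, a, d_xi, d_chi):
    """(dt^2 - dx^2) / [e^{2 a chi} (d_xi^2 - d_chi^2)] for a small step.

    The step is centred on (xi, chi); the ratio tends to 1 as the step shrinks.
    """
    t1, x1 = rindler_to_minkowski(xi - 0.5 * d_xi, chi - 0.5 * d_chi, a)
    t2, x2 = rindler_to_minkowski(xi + 0.5 * d_xi, chi + 0.5 * d_chi, a)
    minkowski = (t2 - t1) ** 2 - (x2 - x1) ** 2
    rindler = math.exp(2.0 * a * chi) * (d_xi**2 - d_chi**2)
    return minkowski / rindler


def distance_to_horizon(a, chi_min=-40.0, steps=200_000):
    """Midpoint-rule estimate of Eq. (6.16).

    chi_min stands in for -infinity and must satisfy a*|chi_min| >> 1.
    """
    h = -chi_min / steps
    return h * sum(math.exp(a * (chi_min + (k + 0.5) * h)) for k in range(steps))


if __name__ == "__main__":
    a = 2.0
    print(interval_ratio(0.3, -0.2, a, 2e-4, 1e-4))  # close to 1
    print(distance_to_horizon(a), 1.0 / a)           # both close to 0.5
```

Every point produced by the map satisfies x > |t|, which is the statement that the Rindler chart covers only one quarter of Minkowski space.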


Exponential redshift Later we will discuss gravitational particle production as the effect
of a non-trivial Bogolyubov transformation between different vacua. Before we apply this
formalism, we will examine the basis of this physical phenomenon in a classical picture. As a
starter, we want to derive the formula for the relativistic Doppler effect. Consider an observer
who is moving with constant velocity v relative to the Cartesian inertial system $x^\mu = (t, x)$
where we neglect the two transverse dimensions. We can parameterise the trajectory of the
observer as

$$ x^\mu(\tau) = (t(\tau), x(\tau)) = (\tau\gamma, \tau\gamma v), \tag{6.17} $$

where γ denotes its Lorentz factor. A monochromatic wave of a scalar, massless field
$\phi(k) \propto \exp[-i\omega(t - x)]$ will be seen by the moving observer as

$$ \phi(\tau) \equiv \phi(x^\mu(\tau)) \propto \exp[-i\omega\tau(\gamma - \gamma v)] = \exp\left[-i\omega\tau\sqrt{\frac{1-v}{1+v}}\,\right]. \tag{6.18} $$

Thus this simple calculation reproduces the usual Doppler formula, where the frequency ω of
the scalar wave is shifted as
$$ \omega' = \sqrt{\frac{1-v}{1+v}}\;\omega. \tag{6.19} $$
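
The step from $\gamma(1 - v)$ in Eq. (6.18) to the square root in Eq. (6.19) uses $\gamma = (1 - v^2)^{-1/2}$. A minimal numerical check of this identity, with illustrative function names and units c = 1, could look as follows.

```python
import math


def doppler_from_trajectory(v):
    """Factor gamma * (1 - v) that multiplies omega * tau in Eq. (6.18)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (1.0 - v)


def doppler_closed_form(v):
    """Closed form sqrt((1 - v) / (1 + v)) of Eq. (6.19)."""
    return math.sqrt((1.0 - v) / (1.0 + v))


if __name__ == "__main__":
    for v in (-0.9, -0.3, 0.0, 0.3, 0.6, 0.99):
        print(v, doppler_from_trajectory(v), doppler_closed_form(v))
```

Negative v (observer moving against the wave) gives a blueshift, positive v a redshift.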

Next we apply the same method to the case of an accelerated observer. Then
$t(\tau) = a^{-1}\sinh(a\tau)$ and $x(\tau) = a^{-1}\cosh(a\tau)$. Inserting this trajectory again into a
monochromatic wave with $\phi(k) \propto \exp[-i\omega(t - x)]$ now gives

$$ \phi(\tau) \propto \exp\left\{-\frac{i\omega}{a}\left[\sinh(a\tau) - \cosh(a\tau)\right]\right\} = \exp\left[\frac{i\omega}{a}\exp(-a\tau)\right] \equiv e^{-i\vartheta}. \tag{6.20} $$

Thus an accelerated observer does not see a monochromatic wave, but a superposition of
plane waves with varying frequencies. Defining the instantaneous frequency by

$$ \omega(\tau) = \frac{d\vartheta}{d\tau} = \omega\exp(-a\tau), \tag{6.21} $$

we see that the phase measured by the accelerated observer is exponentially redshifted. As a
next step, we want to determine the power spectrum $P(\nu) = |\phi(\nu)|^2$ measured by the observer,
for which we have to calculate the Fourier transform φ(ν).
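
The exponential redshift (6.21) can be made concrete by differentiating the phase $\vartheta(\tau) = \omega[t(\tau) - x(\tau)]$ numerically along the accelerated trajectory. The sketch below uses a central finite difference; the step size h and the function names are illustrative assumptions.

```python
import math


def phase(tau, omega, a):
    """Phase theta(tau) = omega * (t - x) along the trajectory (6.5)."""
    t = math.sinh(a * tau) / a
    x = math.cosh(a * tau) / a
    return omega * (t - x)


def instantaneous_frequency(tau, omega, a, h=1e-5):
    """Central-difference estimate of d(theta)/d(tau)."""
    return (phase(tau + h, omega, a) - phase(tau - h, omega, a)) / (2.0 * h)


if __name__ == "__main__":
    omega, a = 2.0, 0.5
    for tau in (0.0, 1.0, 2.0, 3.0):
        # Second and third column should agree: omega * exp(-a * tau).
        print(tau, instantaneous_frequency(tau, omega, a), omega * math.exp(-a * tau))
```

Each interval of proper time 1/a lowers the observed frequency by a factor e, in contrast to the constant shift of the inertial Doppler effect.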


Determine the Fourier transform of the wave φ(τ ).


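Substituting y = exp(−aτ) in
$$ \phi(\nu) = \int_{-\infty}^{\infty} d\tau\, \phi(\tau)\, e^{i\nu\tau} = \int_{-\infty}^{\infty} d\tau\, \exp\left[\frac{i\omega}{a}\exp(-a\tau)\right] e^{i\nu\tau} \tag{6.22} $$
gives
$$ \phi(\nu) = \frac{1}{a}\int_0^{\infty} dy\, y^{-i\nu/a - 1}\, e^{i(\omega/a)y}. \tag{6.23} $$
On the other hand, we can rewrite Euler’s integral representation of the Gamma function as
$$ \int_0^{\infty} dt\, t^{z-1} e^{-bt} = b^{-z}\,\Gamma(z) = \exp(-z\ln b)\,\Gamma(z) \tag{6.24} $$
for ℜ(z) > 0 and ℜ(b) > 0. Comparing these two expressions, we see that they agree setting
z = −iν/a + ε and b = −iω/a + ε. Here we added an infinitesimal positive real quantity ε > 0
to ensure the convergence of the integral. In order to determine the correct phase of $b^{-z}$, we
have rewritten this factor as exp(−z ln b) and have used
$$ \ln b = \lim_{\varepsilon\to 0} \ln\left(-\frac{i\omega}{a} + \varepsilon\right) = \ln\left|\frac{\omega}{a}\right| - \frac{i\pi}{2}\,{\rm sign}(\omega/a). \tag{6.25} $$

Thus the Fourier transform φ(ν) is given by

$$ \phi(\nu) = \frac{1}{a}\left(\frac{\omega}{a}\right)^{i\nu/a} \Gamma(-i\nu/a)\, e^{\pi\nu/(2a)}. \tag{6.26} $$

The Fourier transform φ(ν) contains negative frequencies,

$$ \phi(-\nu) = \phi(\nu)\, e^{-\pi\nu/a} = \frac{1}{a}\left(\frac{\omega}{a}\right)^{i\nu/a} \Gamma(-i\nu/a)\, e^{-\pi\nu/(2a)}. \tag{6.27} $$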

Using the reflection formula of the Gamma function for imaginary arguments,
$$ \Gamma(ix)\,\Gamma(-ix) = \frac{\pi}{x\sinh(\pi x)}, \tag{6.28} $$

we find the power spectrum at negative frequencies as

$$ P(-\nu) = \frac{\pi}{a^2}\,\frac{e^{-\pi\nu/a}}{(\nu/a)\sinh(\pi\nu/a)} = \frac{\beta}{\nu}\,\frac{1}{e^{\beta\nu} - 1} \tag{6.29} $$

with β = 2π/a. Remarkably, the dependence on the frequency ω of the scalar wave, which is
still present in the Fourier transform φ(ν), has dropped from the negative frequency part of
the power spectrum P(−ν), which corresponds to a thermal Planck law with temperature
T = 1/β = a/(2π).
The occurrence of negative frequencies is the classical analogue for the mixing of positive
and negative frequencies in the Bogolyubov method. Therefore we expect that on the
quantum level a uniformly accelerated detector will measure a thermal Planck spectrum with
temperature T = 1/β = a/(2π). This phenomenon is called the Unruh effect and T = a/(2π) the
Unruh temperature.
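
Two things in this result can be checked with a few lines of code: the algebraic step in Eq. (6.29) from the sinh form to the Planck form, and the size of the Unruh temperature in everyday units. The sketch below is illustrative only; restoring units as $T = \hbar a/(2\pi c k_B)$ and the numerical constants (exact SI values of $\hbar$, c and $k_B$) are additions not spelled out in the text, which works with $\hbar = c = k_B = 1$.

```python
import math

HBAR = 1.054571817e-34  # J s
C = 299792458.0         # m / s
K_B = 1.380649e-23      # J / K


def power_sinh_form(nu, a):
    """Left-hand form of Eq. (6.29)."""
    x = nu / a
    return math.pi / a**2 * math.exp(-math.pi * x) / (x * math.sinh(math.pi * x))


def power_planck_form(nu, a):
    """Right-hand (Planck) form of Eq. (6.29) with beta = 2 pi / a."""
    beta = 2.0 * math.pi / a
    return beta / nu / math.expm1(beta * nu)


def unruh_temperature_kelvin(acceleration):
    """Unruh temperature in kelvin for an acceleration given in m/s^2."""
    return HBAR * acceleration / (2.0 * math.pi * C * K_B)


if __name__ == "__main__":
    print(power_sinh_form(0.7, 1.3), power_planck_form(0.7, 1.3))
    print(unruh_temperature_kelvin(9.81))  # of order 1e-20 K
```

For the gravitational acceleration at the surface of the Earth the temperature is of order $10^{-20}$ K, which illustrates why the effect is so hard to observe directly.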


6.2 Schwarzschild black holes


Next, we recall our definition of an event horizon as a three-dimensional hypersurface which
limits a region of a spacetime which can never influence an observer. The event horizon is
formed by light-rays and is therefore a null surface. Hence we require that at each point of
such a surface defined by $f(x^\mu) = 0$ a null tangent vector $n^\mu$ exists that is orthogonal to two
space-like tangent vectors. The normal $n^\mu$ to this surface is parallel to the gradient along the
surface, $n^\mu = h\nabla^\mu f = h\partial^\mu f$, where h is an arbitrary non-zero function. From

$$ 0 = n_\mu n^\mu = g_{\mu\nu}\, n^\mu n^\nu \tag{6.30} $$

we see that the line element vanishes on the horizon, ds = 0. Hence the (future) light-cones
at each point of an event horizon are tangential to the horizon.

Eddington–Finkelstein coordinates We next try to find new coordinates which are regular
at r = 2M and valid in the whole range 0 < r < ∞. Such a coordinate transformation
has to be singular at r = 2M, otherwise we cannot hope to cancel the singularity present in
the Schwarzschild coordinates. We can eliminate the troublesome factor $g_{rr} = (1 - 2M/r)^{-1}$
by introducing a new radial coordinate $r^*$ defined by
$$ dr^* = \frac{dr}{1 - \frac{2M}{r}}. \tag{6.31} $$

Integrating (6.31) results in
$$ r^*(r) = r + 2M \ln\left|\frac{r}{2M} - 1\right| + A, \tag{6.32} $$
with A ≡ −2Ma as integration constant. The coordinate $r^*(r)$ is often called tortoise
coordinate, because $r^*(r)$ changes only logarithmically close to the horizon. This coordinate
change maps the range r ∈ [2M, ∞] of the radial coordinate onto $r^*$ ∈ [−∞, ∞]. Radial null
geodesics satisfy $d(t \pm r^*) = 0$, and thus in- and out-going light-rays are given by
$$ \tilde u \equiv t - r^* = t - r - 2M \ln\left|\frac{r}{2M} - 1\right| - A, \qquad \text{outgoing rays}, \tag{6.33} $$
$$ \tilde v \equiv t + r^* = t + r + 2M \ln\left|\frac{r}{2M} - 1\right| + A, \qquad \text{ingoing rays}. \tag{6.34} $$
For r > 2M , Eq. (4.32) implies that dr/dt > 0 so that r increases with t. Therefore (6.33)
describes outgoing light-rays, while (6.34) corresponds to ingoing light-rays for r > 2M .
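
A short numerical illustration of the tortoise coordinate (6.32) is given below. It is a sketch with illustrative names, in units G = c = 1: it checks by finite differences that $dr^*/dr = (1 - 2M/r)^{-1}$ on both sides of the horizon, and shows how slowly $r^*$ diverges as r approaches 2M.

```python
import math


def tortoise(r, M, A=0.0):
    """Tortoise coordinate r*(r) of Eq. (6.32); A is the integration constant."""
    return r + 2.0 * M * math.log(abs(r / (2.0 * M) - 1.0)) + A


def tortoise_slope(r, M, h=1e-6):
    """Central-difference estimate of dr*/dr."""
    return (tortoise(r + h, M) - tortoise(r - h, M)) / (2.0 * h)


if __name__ == "__main__":
    M = 1.0
    for r in (1.0, 1.5, 3.0, 10.0):
        print(r, tortoise_slope(r, M), 1.0 / (1.0 - 2.0 * M / r))
    # Approaching the horizon: r* decreases only logarithmically in (r - 2M).
    for eps in (1e-2, 1e-4, 1e-8):
        print(eps, tortoise(2.0 * M + eps, M))
```

Reducing the distance to the horizon by six orders of magnitude shifts $r^*$ by only about $2M \ln 10^6 \approx 28M$, which is the origin of the name.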
We can now extend the Schwarzschild metric using as coordinate the “advanced time
parameter ṽ” instead of t. Forming the differential,
$$ d\tilde v = dt + dr + \left(\frac{r}{2M} - 1\right)^{-1} dr = dt + \left(1 - \frac{2M}{r}\right)^{-1} dr, \tag{6.35} $$
we can eliminate dt from the Schwarzschild metric and find
$$ ds^2 = \left(1 - \frac{2M}{r}\right) d\tilde v^2 - 2\, d\tilde v\, dr - r^2 d\Omega. \tag{6.36} $$
This metric was found first by Eddington and was later rediscovered by Finkelstein. Although
$g_{\tilde v\tilde v}$ vanishes at r = 2M, the determinant $g = r^4 \sin^2\vartheta$ is non-zero at the horizon and thus


Figure 6.1: Left: The Schwarzschild spacetime using advanced Eddington–Finkelstein
coordinates; the singularity is shown by a zigzag line, the horizon by a thick line and
geodesics by thin lines. Right: Collapse of a star modelled by pressureless matter;
dashed lines show geodesics, the thin solid line encompasses the collapsing stellar
surface.

the metric is invertible. Moreover, $r^*$ was defined by (6.32) initially only for r > 2M, but we
can use this definition also for r < 2M, arriving at the same expression (6.36). Therefore,
the metric using the advanced time parameter ṽ is regular at 2M and valid for all r > 0. We
can hence view this metric as an extension of the r > 2M part of the Schwarzschild solution,
similar to the process of analytic continuation of complex functions. The price we have to
pay for a non-zero determinant at r = 2M are non-diagonal terms in the metric. As a result,
the spacetime described by (6.36) is not symmetric under the exchange t → −t. We will see
shortly the consequences of this asymmetry.
We now study the behaviour of radial light-rays, which are determined by $ds^2 = 0$ and
dφ = dϑ = 0. Thus radial light-rays satisfy $(1 - 2M/r)\, d\tilde v^2 - 2\, d\tilde v\, dr = 0$, which is trivially
solved by ingoing light-rays, dṽ = 0 and thus ṽ = const. The solutions for dṽ ≠ 0 are given by (6.33).
Additionally, the horizon r = 2M which is formed by stationary light-rays satisfies $ds^2 = 0$.
In order to draw a spacetime diagram, it is more convenient to replace the light-like coordinate
ṽ by a new time-like coordinate. We show in the left panel of Fig. 6.1 geodesics using as new
time coordinate t̃ = ṽ − r. Then the ingoing light-rays are straight lines at 45° to the r axis.
Radial light-rays which are outgoing for r > 2M and ingoing for r < 2M follow Eq. (6.33).
A few future light-cones are indicated: they are formed by the intersection of light-rays, and
they tilt towards r = 0 as they approach the horizon. At r = 2M, one light-ray forming the
light-cone becomes stationary and part of the horizon, while the remaining part of the cone
lies completely inside the horizon.
Let us now discuss how Fig. 6.1 would look using the retarded Eddington–Finkelstein
coordinate ũ. Now the outgoing radial null geodesics are straight lines at 45°. They start from
the singularity, cross r = 2M smoothly and continue to spatial infinity. Such a situation,
where the singularity is not covered by an event horizon, is called a “white hole”. The cosmic
censorship hypothesis postulates that singularities formed in gravitational collapse are always
covered by event horizons. This implies that the time-invariance of the Einstein equations is
broken by its solutions. In particular, only the BH solution using the advanced
Eddington–Finkelstein coordinates should be realised by nature; otherwise we should expect
causality to be violated. This behaviour may be compared to classical electrodynamics, where all
solutions are described by the retarded Green function, while the advanced Green function
seems to have no relevance.

Collapse to a BH After a star has consumed its nuclear fuel, gravity can be balanced only
by the Fermi degeneracy pressure of its constituents. Increasing the total mass of the star
remnant, the stellar EoS is driven towards the relativistic regime until the star becomes
unstable. As a result, the collapse of its core to a BH seems to be inevitable for a sufficiently
heavy star.
Let us consider a toy model for such a gravitational collapse. We describe the star by
a spherically symmetric cloud of pressureless matter. While the assumption of negligible
pressure is unrealistic, it implies that particles at the surface of the star follow radial geodesics
in the Schwarzschild spacetime. Thus we do not have to bother about the interior solution
of the star, where $T_{\mu\nu} \neq 0$ and our vacuum solution does not apply. In advanced
Eddington–Finkelstein coordinates, the collapse is schematically shown in the right panel of Fig. 6.1.
At the end of the collapse, a stationary Schwarzschild BH has formed. Note that in our toy
model the event horizon forms before the singularity, as required by the cosmic censorship
hypothesis. The horizon grows from r = 0 following the light-like geodesic a shown by the
thin black line until it reaches its final size $R_s = 2M$. What happens if we drop a lump
of matter δM on a radial geodesic into the BH? Since we do not add angular momentum
to the BH, the final stage is, according to Birkhoff’s theorem, still a Schwarzschild BH.
All deviations from spherical symmetry corresponding to gradient energy in the intermediate
regime are radiated away as gravitational waves. Thus in the final stage, the only
change is an increase of the horizon size, $R_s \to 2(M + \delta M)$. Therefore some light-rays (e.g. b)
which we expected to escape to spatial infinity will be trapped. Similarly, light-ray a, which
we thought to form the horizon, will be deflected by the increased gravitational attraction
towards the singularity. In essence, knowing only the spacetime up to a fixed time t, we
are not able to decide which light-rays form the horizon. The event horizon of a black hole
is a global property of the spacetime: It is not only independent of the observer but also
influenced by the complete spacetime.
What does the stellar collapse look like for an observer at large distances? Let us assume
that the observer uses a neutrino detector and is able to measure the neutrino luminosity
$L_\nu(r) = dE_\nu/dt = \omega_\nu\, dN_\nu/dt$ emitted by a shell of stellar material at radius r. In order to
determine the luminosity $L_\nu(r)$, we have to connect r and t. Linearising Eq. (??) around
r = 2M gives
$$ \frac{r - 2M}{r_0 - 2M} = e^{-(t - t_0)/2M}. \tag{6.37} $$
For an observer at large distance $r_0$, the time difference between two pulses sent by a shell
falling into a BH thus increases exponentially for r → 2M. As a result the energy $\omega_\nu$ of an
individual neutrino is also exponentially redshifted,

$$ \omega_\nu(r) = \omega_\nu(r_0)\, e^{-(t - t_0)/2M}. \tag{6.38} $$

A more detailed analysis confirms the expectation that then also the luminosity decreases
exponentially. Thus an observer at infinity will not see shells which slow down logarithmically
as they fall towards r → 2M, as suggested by Eq. (??). Instead the signal emitted by the shell
will fade away exponentially, with the short characteristic time scale of
$M = (M/M_{\rm Pl})\, t_{\rm Pl} \approx 10^{-5}$ s for a stellar-size BH.
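
The quoted time scale is easy to reproduce by restoring units: in SI units the mass M corresponds to the time $GM/c^3$. The following sketch uses rounded values of G, c and the solar mass, which are inputs added here for illustration and are not taken from the text.

```python
G_NEWTON = 6.67430e-11  # m^3 kg^-1 s^-2
C = 299792458.0         # m / s
M_SUN = 1.98847e30      # kg (approximate)


def mass_in_seconds(mass_kg):
    """Geometric time scale G M / c^3 associated with a mass M."""
    return G_NEWTON * mass_kg / C**3


def fading_time(mass_kg):
    """e-folding time 2M of the exponential redshift in Eqs. (6.37) and (6.38)."""
    return 2.0 * mass_in_seconds(mass_kg)


if __name__ == "__main__":
    for n_solar in (1.0, 3.0, 10.0):
        print(n_solar, mass_in_seconds(n_solar * M_SUN), fading_time(n_solar * M_SUN))
```

One solar mass corresponds to about 5 microseconds, so the signal from a collapsing stellar core fades on time scales of $10^{-5}$ to $10^{-4}$ s.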

Kruskal coordinates We have been able to extend the Schwarzschild solution into two dif-
ferent branches; a BH solution using the advanced time parameter ṽ and a white hole solution
using the retarded time parameter ũ. The analogy with the analytic continuation of complex
functions leads naturally to the question of whether we can combine these two branches into
one common solution. Moreover, our experience with the Rindler metric suggests that an
event horizon where energies are exponentially redshifted implies the emission of a thermal
spectrum. If true, our BH would not be black after all. One way to test this suggestion is to
relate the vacua as defined by different observers via a Bogolyubov transformation. In order
to simplify this process, we would like to find new coordinates for which the Schwarzschild
spacetime is conformally flat.
An obvious attempt to proceed is to use both the advanced and the retarded time
parameters. For most of our discussion, it is sufficient to concentrate on the t, r coordinates in the
line element $ds^2 = d\bar s^2 - r^2 d\Omega$, and to neglect the angular dependence from the $r^2 d\Omega$ part.
We start by eliminating r in favour of $r^*$,
$$ d\bar s^2 = \left(1 - \frac{2M}{r(r^*)}\right)(dt^2 - dr^{*2}), \tag{6.39} $$
where r has to be expressed through $r^*$. This metric is conformally flat but the definition
of $r(r^*)$ on the horizon contains the ill-defined factor ln(r/2M − 1). Clearly, a new set of
coordinates where this factor is exponentiated is what we are seeking.
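This is achieved by introducing both Eddington–Finkelstein parameters,
$$ \tilde u = t - r^*, \qquad \tilde v = t + r^*, \tag{6.40} $$
for which the metric simplifies to
$$ d\bar s^2 = \left(1 - \frac{2M}{r(\tilde u,\tilde v)}\right) d\tilde u\, d\tilde v. \tag{6.41} $$
From (6.32) and (6.40), it follows
$$ \frac{\tilde v - \tilde u}{2} = r^*(r) = r + 2M\ln\left|\frac{r}{2M} - 1\right| - 2Ma, \tag{6.42} $$
or
$$ 1 - \frac{2M}{r} = \frac{2M}{r}\exp\left(\frac{\tilde v - \tilde u}{4M}\right)\exp\left(a - \frac{r}{2M}\right). \tag{6.43} $$
This allows us to eliminate the singular factor 1 − 2M/r in (6.41), obtaining
$$ d\bar s^2 = \frac{2M}{r}\exp\left(a - \frac{r}{2M}\right)\exp\left(-\frac{\tilde u}{4M}\right) d\tilde u\, \exp\left(\frac{\tilde v}{4M}\right) d\tilde v. \tag{6.44} $$
Finally, we change to Kruskal light-cone coordinates u and v defined by
$$ u = -4M\exp\left(-\frac{\tilde u}{4M}\right) \quad\text{and}\quad v = 4M\exp\left(\frac{\tilde v}{4M}\right), \tag{6.45} $$
arriving at
$$ ds^2 = \frac{2M}{r}\exp\left(a - \frac{r}{2M}\right) du\, dv - r^2 d\Omega. \tag{6.46} $$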


Figure 6.2: Spacetime diagram for the Kruskal coordinates T and R.

Kruskal diagram The coordinates ũ, ṽ cover only the exterior r > 2M of the Schwarzschild
spacetime, and thus u, v are initially only defined for r > 2M. Since they are regular at the
Schwarzschild radius, we can extend these coordinates towards r = 0. In order to draw the
spacetime diagram of the full Schwarzschild spacetime shown in Fig. 6.2, it is useful to go
back to time- and space-like coordinates via

$$ u = T - R \quad\text{and}\quad v = T + R. \tag{6.47} $$

Then the connection between the pairs of coordinates {T, R}, {u, v} and {t, r} is given by
$$ uv = T^2 - R^2 = -16M^2 \exp\left(\frac{r^*}{2M}\right) = -16M^2\left(\frac{r}{2M} - 1\right)\exp\left(\frac{r}{2M} - a\right), \tag{6.48a} $$
$$ \frac{u}{v} = \frac{T - R}{T + R} = -\exp\left[-t/(2M)\right]. \tag{6.48b} $$

Lines with r = const. are given by $uv = T^2 - R^2 = {\rm const.}$ They are thus hyperbolae, shown
as dotted lines in Fig. 6.2. Lines with t = const. are determined by u/v = const. and are
thus given by straight (solid) lines through the origin. In particular, null geodesics correspond to
straight lines with angle 45° in the R − T diagram. The horizon r = 2M is given by u = 0
or v = 0. Hence two separate horizons exist: a past horizon at t = −∞ (for v = 0 and thus
T = −R) and a future horizon at t = +∞ (for u = 0 and thus T = R). Also, the singularity
at r = 0 corresponds to two separate lines in the R − T Kruskal diagram¹ and is given (for a = 0) by
$$ T = \pm\sqrt{16M^2 + R^2}. \tag{6.49} $$
¹ Recall that we suppress two space dimensions: Thus a point in the R − T Kruskal diagram corresponds to a
sphere $S^2$, and a line to $R \times S^2$.
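
The chain of coordinate changes from Schwarzschild (t, r) via (ũ, ṽ) and (u, v) to the Kruskal pair (T, R) can be followed numerically. The sketch below is restricted to the exterior region r > 2M, where Eqs. (6.40), (6.42), (6.45) and (6.47) apply directly; the function name is an illustrative choice. It can be used to check the relation (6.48a) and the value of u/v, which is negative in the exterior since u < 0 < v there.

```python
import math


def kruskal_from_schwarzschild(t, r, M, a=0.0):
    """Return (T, R, u, v) for an exterior point r > 2M.

    a is the dimensionless integration constant of Eq. (6.42).
    """
    if r <= 2.0 * M:
        raise ValueError("this sketch covers only the exterior region r > 2M")
    r_star = r + 2.0 * M * math.log(r / (2.0 * M) - 1.0) - 2.0 * M * a
    u_tilde, v_tilde = t - r_star, t + r_star          # Eq. (6.40)
    u = -4.0 * M * math.exp(-u_tilde / (4.0 * M))      # Eq. (6.45)
    v = 4.0 * M * math.exp(v_tilde / (4.0 * M))
    return 0.5 * (u + v), 0.5 * (v - u), u, v          # Eq. (6.47)


if __name__ == "__main__":
    M = 1.0
    for t in (-2.0, 0.0, 0.7, 2.0):
        T, R, u, v = kruskal_from_schwarzschild(t, 3.0, M)
        # T^2 - R^2 is the same for all t at fixed r: a hyperbola in the diagram.
        print(t, T, R, T * T - R * R)
```

At fixed r the points trace out one branch of a hyperbola, while at fixed t the ratio T/R is constant, giving the straight lines through the origin.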


The horizon lines {t = −∞, r = 2M} and {t = ∞, r = 2M} divide the spacetime into four
parts. The future singularity is unavoidable in region II, while in region II’ all trajectories start
at the past singularity. Region I corresponds to the original Schwarzschild solution outside
the horizon r > 2M, while regions I and II encompass the advanced Eddington–Finkelstein
solution. The regions I and II’ represent the retarded Eddington–Finkelstein solution, where
II’ corresponds to a white hole. Note that I’ represents a new asymptotically flat Schwarzschild
exterior solution.
The presence of a past horizon v = 0 at t = −∞ makes the complete BH solutions time-
symmetric and corresponds to an eternal BH. If we model a realistic BH, that is, one that
was created at finite t by a collapsing mass distribution, with Kruskal coordinates, then any
effect induced by the past horizon should be considered as unphysical.

6.3 Reissner-Nordström black hole


The solution of the coupled Einstein-Maxwell equations for a point-like particle with mass M
and electric charge Q was found by Reissner and Nordström. Its line-element is

$$ ds^2 = A(r)\,dt^2 - \frac{dr^2}{A(r)} - r^2(d\vartheta^2 + \sin^2\vartheta\, d\phi^2) \tag{6.50} $$
with
$$ A(r) = 1 - \frac{2GM}{r} + \frac{GQ^2}{4\pi r^2}. \tag{6.51} $$

6.4 Kerr black holes


The stationary spacetime outside a rotating mass distribution can be derived by symmetry
arguments similarly (but much more tortuous. . . ) to the case of the Schwarzschild metric.
It was found first accidentally by R. Kerr in 1963. The black hole solution of this spacetime
is fully characterised by two quantities, the mass M and the angular momentum L of the
Kerr BH. Both parameters can be manipulated, at least in a gedankenexperiment, by dropping
material into the BH. Examining the response of a Kerr black hole to such changes was crucial
for the discovery of “black hole thermodynamics”.
In Boyer–Lindquist coordinates, the metric outside of a rotating mass distribution is given
by
$$ ds^2 = \left(1 - \frac{2Mr}{\rho^2}\right) dt^2 + \frac{4Mar\sin^2\vartheta}{\rho^2}\, d\phi\, dt - \frac{\rho^2}{\Delta}\, dr^2 - \rho^2 d\vartheta^2 - \left(r^2 + a^2 + \frac{2Mra^2\sin^2\vartheta}{\rho^2}\right)\sin^2\vartheta\, d\phi^2, \tag{6.52} $$
with the abbreviations

$$ a = L/M, \qquad \rho^2 = r^2 + a^2\cos^2\vartheta, \qquad \Delta = r^2 - 2Mr + a^2. \tag{6.53} $$

The metric is time-independent and axially symmetric. Hence two obvious Killing vectors
are, as in the Schwarzschild case, ξ = (1, 0, 0, 0) and η = (0, 0, 0, 1), where we again order
coordinates as {t, r, ϑ, φ}.


The presence of the mixed term gtφ means that the metric is stationary, but not static—as
one expects for a star or BH rotating with constant rotation velocity. Finally, the metric is
asymptotically flat and the weak-field limit shows that L is the angular momentum of the
rotating black hole.
Its main properties are
• The metric is asymptotically flat.
• Potential singularities at ρ = 0 and ∆ = 0.
• The weak-field limit shows that L is the angular momentum of the rotating black hole.
• The presence of the mixed term $g_{t\phi}$ means that infalling particles (and thus space-time)
are dragged around the rotating black hole.
Orbits in the equatorial plane ϑ = π/2 could be derived in the same way as for the
Schwarzschild case; for ϑ ≠ π/2 the discussion becomes much more involved.

Singularity First we examine the potential singularities at ρ = 0 and ∆ = 0. The calculation
of the scalar invariants formed from the Riemann tensor shows that only ρ = 0 is a physical
singularity, while ∆ = 0 corresponds to a coordinate singularity. The physical singularity at
$\rho^2 = 0 = r^2 + a^2\cos^2\vartheta$ corresponds to r = 0 and ϑ = π/2. Thus the value r = 0 is surprisingly
not compatible with all ϑ values. To understand this point, we consider the M → 0 limit of
the Kerr metric (6.52) keeping a = L/M fixed,

$$ ds^2 = dt^2 - \frac{\rho^2}{r^2 + a^2}\, dr^2 - \rho^2 d\vartheta^2 - (r^2 + a^2)\sin^2\vartheta\, d\phi^2. \tag{6.54} $$
The comparison with the Minkowski metric shows that
$$ x = \sqrt{r^2 + a^2}\,\sin\vartheta\cos\phi, \qquad y = \sqrt{r^2 + a^2}\,\sin\vartheta\sin\phi, \qquad z = r\cos\vartheta. \tag{6.55} $$

Hence the singularity at r = 0 and ϑ = π/2 corresponds to a ring of radius a in the equatorial
plane z = 0 of the Kerr black hole.
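
The geometric meaning of the coordinates (6.55) can be explored with a few lines of code. In this illustrative sketch (the function name is not from the text), the surface r = 0 is seen to be a disc of radius a in the plane z = 0, whose rim at ϑ = π/2 is the ring singularity, while surfaces with r > 0 are oblate ellipsoids.

```python
import math


def spheroidal_to_cartesian(r, theta, phi, a):
    """Cartesian (x, y, z) from the coordinates (r, theta, phi) of Eq. (6.55)."""
    s = math.sqrt(r * r + a * a)
    return (s * math.sin(theta) * math.cos(phi),
            s * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))


if __name__ == "__main__":
    a = 2.0
    # r = 0: cylindrical radius a*sin(theta), z = 0, i.e. a disc of radius a.
    for theta in (0.0, math.pi / 6, math.pi / 3, math.pi / 2):
        x, y, z = spheroidal_to_cartesian(0.0, theta, 0.3, a)
        print(theta, math.hypot(x, y), z)
```

Only the last line of the output, with ϑ = π/2, lies on the ring $x^2 + y^2 = a^2$.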

Horizons We have defined an event horizon as a three-dimensional hypersurface, $f(x^\mu) = 0$,
that is null. In a stationary, axisymmetric spacetime the general equation of a surface,
$f(x^\mu) = 0$, simplifies to f(r, ϑ) = 0. The condition for a null surface becomes

$$ 0 = g^{\mu\nu}(\partial_\mu f)(\partial_\nu f) = g^{rr}(\partial_r f)^2 + g^{\vartheta\vartheta}(\partial_\vartheta f)^2. \tag{6.56} $$

In the case of the surface defined by the coordinate singularity $\Delta = r^2 - 2Mr + a^2 = 0$ that
depends only on r,
$$ r_\pm = M \pm \sqrt{M^2 - a^2}, \tag{6.57} $$
the condition defining a horizon becomes simply $g^{rr} = 0$ or $g_{rr} = 1/g^{rr} = \infty$. Hence, $r_-$ and
$r_+$ define an inner and outer horizon around a Kerr black hole.
The surface A of the outer horizon follows from inserting $r_+$ together with dr = dt = 0
into the metric,
$$ ds^2 = \rho_+^2\, d\vartheta^2 + \left(r_+^2 + a^2 + \frac{2Mr_+ a^2\sin^2\vartheta}{\rho_+^2}\right)\sin^2\vartheta\, d\phi^2, \tag{6.58} $$


Figure 6.3: Structure of a Kerr black hole: The ergoregion (grey area) is bounded by the
outer ergosurface $r_+ = M + \sqrt{M^2 - a^2\cos^2\vartheta}$ and the outer event horizon
$r_h = M + \sqrt{M^2 - a^2}$, followed by the inner event horizon $r_h = M - \sqrt{M^2 - a^2}$, the
inner ergosurface $r_- = M - \sqrt{M^2 - a^2\cos^2\vartheta}$ and the ring singularity
$\{x^2 + y^2 = a^2, z = 0\}$.

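Using $r_\pm^2 + a^2 = 2Mr_\pm$, we obtain
$$ ds^2 = \rho_+^2\, d\vartheta^2 + \left(\frac{2Mr_+}{\rho_+}\right)^2 \sin^2\vartheta\, d\phi^2. \tag{6.59} $$

Hence the metric determinant $g_2$ restricted to the angular variables is given by
$\sqrt{g_2} = \sqrt{g_{\vartheta\vartheta}\, g_{\phi\phi}} = 2Mr_+\sin\vartheta$ and integration gives the area A of the horizon as
$$ A = \int_0^{2\pi} d\phi \int_0^{\pi} d\vartheta\, \sqrt{g_2} = 8\pi M r_+ = 8\pi M\left(M + \sqrt{M^2 - a^2}\right). \tag{6.60} $$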
Note that the area depends on the angular momentum of the black hole that can in turn
be manipulated by dropping material into the hole. The horizon area A for fixed mass M
becomes maximal for a non-rotating black hole, $A = 16\pi M^2$, and decreases to $A = 8\pi M^2$ for
a maximally rotating one with a = M. For a > M, the metric component $g^{rr} \propto \Delta$ has no
real zero and thus no event horizon exists.
(For an interpretation see the space-time diagram in Fig. 6.4 that uses coordinates of the
advanced Eddington-Finkelstein type.)
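
The area formula (6.60) can be checked by integrating the induced metric (6.58) numerically, without using the simplification (6.59). The following is a minimal sketch in units G = c = 1 with illustrative function names; the midpoint rule and the number of steps are arbitrary choices.

```python
import math


def horizon_radii(M, a):
    """Outer and inner horizon radii r_+ and r_- from Eq. (6.57), for a <= M."""
    root = math.sqrt(M * M - a * a)
    return M + root, M - root


def horizon_area(M, a):
    """Closed-form area A = 8 pi M r_+ of Eq. (6.60)."""
    r_plus, _ = horizon_radii(M, a)
    return 8.0 * math.pi * M * r_plus


def horizon_area_numeric(M, a, steps=20_000):
    """Midpoint-rule integral of sqrt(g_thth * g_phph) from Eq. (6.58)."""
    r_plus, _ = horizon_radii(M, a)
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * h
        sin2 = math.sin(theta) ** 2
        rho2 = r_plus**2 + a * a * math.cos(theta) ** 2
        g_phph = (r_plus**2 + a * a + 2.0 * M * r_plus * a * a * sin2 / rho2) * sin2
        total += math.sqrt(rho2 * g_phph)
    return 2.0 * math.pi * h * total  # the phi integral contributes 2 pi


if __name__ == "__main__":
    for a in (0.0, 0.6, 0.9, 1.0):
        print(a, horizon_area(1.0, a), horizon_area_numeric(1.0, a))
```

The printed areas decrease monotonically from $16\pi M^2$ at a = 0 to $8\pi M^2$ at a = M.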

Ergosphere and dragging of inertial frames The Kerr metric is a special case of a metric
with $g_{t\phi} \neq 0$. As a result, both massive and massless particles with zero angular momentum
falling into a Kerr black hole will acquire a non-zero angular rotation velocity ω = dφ/dt as
seen by an observer from infinity.
We consider a light-ray with dϑ = dr = 0. Then the line element becomes
$$ g_{tt}\, dt^2 + 2g_{t\phi}\, dt\, d\phi + g_{\phi\phi}\, d\phi^2 = 0. \tag{6.61} $$
Dividing by $g_{\phi\phi}\, dt^2$, we obtain a quadratic equation for the angular rotation velocity ω =
dφ/dt,
$$ \omega^2 + 2\frac{g_{t\phi}}{g_{\phi\phi}}\,\omega + \frac{g_{tt}}{g_{\phi\phi}} = 0 \tag{6.62} $$


Figure 6.4: Space-time diagram in advanced Eddington-Finkelstein coordinates for a Kerr
black hole with a < M. Between the two horizons $r_- < r < r_+$, light cones are
oriented towards $r_-$, particles have to cross $r_-$. Inside the inner horizon, geodesics
are possible that do not reach r = 0 in finite time. The behavior for r → 0 (and
ϑ ≠ π/2) suggests that one can extend the space-time to r < 0.


with the two solutions
$$ \omega_{1/2} = -\frac{g_{t\phi}}{g_{\phi\phi}} \pm \sqrt{\left(\frac{g_{t\phi}}{g_{\phi\phi}}\right)^2 - \frac{g_{tt}}{g_{\phi\phi}}}. \tag{6.63} $$
There are two interesting special cases of this equation. First, on the surface $g_{tt} = 0$, the two
possible solutions of ω = dφ/dt for light-rays satisfy²
$$ \omega_1 = 0 \quad\text{and}\quad \omega_2 = -2\frac{g_{t\phi}}{g_{\phi\phi}}. \tag{6.64} $$
Hence, the rotating black hole drags spacetime at $g_{tt} = 0$ so strongly that even a photon can
only co-rotate. Similarly, this condition specifies a surface inside which no stationary observers
are possible. The normalisation condition u · u = 1 is inconsistent with $u^a = (1, 0, 0, 0)$ and
$g_{tt} < 0$: however strong your rocket engines are, your space-ship will not be able to hover
at the same point (r, ϑ, φ) inside the region with $g_{tt} < 0$. Therefore one calls a surface with
$g_{tt} = 0$ a stationary limit surface. Solving
$$ g_{tt} = 1 - \frac{2Mr}{\rho^2} = 0, \tag{6.65} $$
we find the position of the two stationary limit surfaces at
$$ r_{1/2} = M \pm \sqrt{M^2 - a^2\cos^2\vartheta}. \tag{6.66} $$

The ergosphere is the region between the outer stationary limit surface and the outer event
horizon.


The other interesting special case of Eq. (6.63) occurs when the allowed range of values,
$\omega_1 \le \omega \le \omega_2$, shrinks to a single value, i.e. when
$$ \omega^2 = \frac{g_{tt}}{g_{\phi\phi}} = \left(\frac{g_{t\phi}}{g_{\phi\phi}}\right)^2. \tag{6.67} $$
This happens at the outer horizon $r_+$ and defines the rotation velocity $\omega_H$ of the black hole.
In the case of a Kerr black hole, we find
$$ \omega_H = \frac{a}{2Mr_+}. \tag{6.68} $$
Thus the rotation velocity of the black hole corresponds to the rotation velocity of the
light-rays forming its horizon, as seen by an observer at spatial infinity.
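
The two special cases of Eq. (6.63) can be reproduced numerically from the metric components of Eq. (6.52). The sketch below (illustrative names, units G = c = 1) reads off $g_{tt}$, $g_{t\phi}$ and $g_{\phi\phi}$, solves the quadratic equation (6.62) for the two photon angular velocities, and compares with $\omega_H$ from Eq. (6.68). The clamp of the discriminant at zero is a numerical safeguard added here against rounding errors at the horizon.

```python
import math


def kerr_components(r, theta, M, a):
    """Return (g_tt, g_tphi, g_phiphi) of the Kerr metric (6.52)."""
    sin2 = math.sin(theta) ** 2
    rho2 = r * r + a * a * math.cos(theta) ** 2
    g_tt = 1.0 - 2.0 * M * r / rho2
    g_tphi = 2.0 * M * a * r * sin2 / rho2
    g_phiphi = -(r * r + a * a + 2.0 * M * r * a * a * sin2 / rho2) * sin2
    return g_tt, g_tphi, g_phiphi


def photon_angular_velocities(r, theta, M, a):
    """Smaller and larger root of Eq. (6.62) for light-rays with dr = dtheta = 0."""
    g_tt, g_tphi, g_phiphi = kerr_components(r, theta, M, a)
    mean = -g_tphi / g_phiphi
    disc = max(mean * mean - g_tt / g_phiphi, 0.0)  # clamp rounding noise
    root = math.sqrt(disc)
    return mean - root, mean + root


def horizon_angular_velocity(M, a):
    """omega_H = a / (2 M r_+), Eq. (6.68)."""
    r_plus = M + math.sqrt(M * M - a * a)
    return a / (2.0 * M * r_plus)


if __name__ == "__main__":
    M, a = 1.0, 0.6
    r_plus = M + math.sqrt(M * M - a * a)
    for r in (10.0, 3.0, 2.0, 1.9, r_plus):
        print(r, photon_angular_velocities(r, math.pi / 2, M, a))
    print("omega_H =", horizon_angular_velocity(M, a))
```

Far away the two roots have opposite signs, at the stationary limit surface (r = 2M in the equatorial plane) the smaller one reaches zero, and at $r_+$ both roots merge at $\omega_H$.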

Extension of the Kerr metric The behavior of geodesics for r → 0 (and ϑ ≠ π/2) suggests
that one can extend the space-time to r < 0. For r → −∞, the extension becomes
asymptotically flat, i.e. there exists a second Minkowski space that is connected to ours via the Kerr
black hole. Since for negative r, ∆ is always positive, $\Delta = r^2 - 2Mr + a^2 > 0$, the singularity
is not protected by an event horizon in the “other” Minkowski space. Moreover, there exist
closed time-like curves: Consider a curve depending only on φ in the equatorial plane. By
Eq. (6.52), the line-element for small, negative r,
$$ ds^2 = -\left(r^2 + a^2 + \frac{2Ma^2}{r}\right) d\phi^2 \sim -\frac{2Ma^2}{r}\, d\phi^2 > 0, \tag{6.69} $$
is time-like.
² Note that $\omega_1 < \omega_2$, because of $g_{\phi\phi} < 0$. Hence photons (and thus also spacetime) are corotating, as expected.
The cosmic censorship hypothesis postulates that singularities formed in gravitational col-
lapse are always covered by event horizons. Thus we are in the “r > 0” Minkowski space of
all Kerr black holes – and the r < 0 is simply a mathematical artefact of a highly symmetrical
manifold, not showing up in real physical situations.

Penrose process and the area theorem The total energy of a Kerr BH consists of its rest
energy and its rotational energy. These two quantities control the size of the event horizon
and therefore it is important to understand how they change when matter is dropped into the BH.
The energy of any particle moving on a geodesic is conserved, E = p · ξ. Inside the
ergosphere, the Killing vector ξ is space-like and the quantity E is thus the component of a
spatial momentum which can have both signs. This led Penrose to entertain the following
gedankenexperiment: Suppose the spacecraft A starts at infinity and falls into the ergosphere.
There it splits into two parts: B is dropped into the BH, while C escapes to infinity. In the
splitting process, four-momentum has to be conserved, pA = pB + pC . We can now choose
a time-like geodesic for B falling into the BH such that EB < 0. Then EC > EA and the
escaping part C of the spacecraft has at infinity a higher energy than initially.
The Penrose process decreases both the mass and the angular momentum of the BH by an
amount equal to that of the spacecraft part B falling into the BH. Now we want to show that the
changes are correlated in such a way that the area of the BH increases. Let us first define a
new Killing vector,
K = ξ + ωH η.
This Killing vector is null on the horizon and time-like outside. It corresponds to the four-
velocity with the maximal possible rotation velocity. Now we use EB = pB ·ξ and LB = −pB ·η
and
pB · K = pB · (ξ + ωH η) = EB − ωH LB > 0, (6.70)
to obtain the bound LB < EB /ωH . Since EB < 0, the added angular momentum is negative,
LB < 0.
The mass and the angular momentum of the BH change by δM = EB and δL = LB , when
particle B drops into the BH. Thus
\[ \delta M > \omega_H\,\delta L = \frac{a\,\delta L}{r_+^2 + a^2} . \tag{6.71} \]
Now we define the irreducible mass of a BH as the mass of that Schwarzschild BH whose event
horizon has the same area,
\[ M_{\rm irr}^2 = \frac{1}{2}\left(M^2 + \sqrt{M^4 - L^2}\right) \tag{6.72} \]
or
\[ M^2 = M_{\rm irr}^2 + \left(\frac{L}{2M_{\rm irr}}\right)^2 . \tag{6.73} \]
Thus we can interpret the total mass as the Pythagorean sum of the irreducible mass and a
contribution related to the rotational energy. Differentiating the relation (6.72) results in
\[ \delta M_{\rm irr} = \frac{a}{4M_{\rm irr}\sqrt{M^2 - a^2}}\left(\omega_H^{-1}\,\delta M - \delta L\right) . \tag{6.74} \]
Our bound implies now δMirr > 0 or δA > 0. Thus the surface of a Kerr BH can only increase,
even when its mass decreases.
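
The statement can be checked numerically with a small sketch (units G = c = 1). It uses Eqs. (6.72) and (6.68) together with L = aM and the assumed standard horizon radius r₊ = M + √(M² − a²); the function names and the sample numbers are illustrative, not taken from the text.

```python
import math

def irreducible_mass(M, L):
    """M_irr from Eq. (6.72): M_irr^2 = (M^2 + sqrt(M^4 - L^2)) / 2."""
    return math.sqrt(0.5 * (M**2 + math.sqrt(M**4 - L**2)))

def horizon_angular_velocity(M, L):
    """omega_H = a / (2 M r_+) with a = L / M and r_+ = M + sqrt(M^2 - a^2)."""
    a = L / M
    r_plus = M + math.sqrt(M**2 - a**2)
    return a / (2.0 * M * r_plus)

def irreducible_mass_change(M, L, dM, dL):
    """Change of M_irr when the BH absorbs (dM, dL) obeying the bound dM > omega_H dL."""
    if dM <= horizon_angular_velocity(M, L) * dL:
        raise ValueError("the absorbed particle violates dM > omega_H dL")
    return irreducible_mass(M + dM, L + dL) - irreducible_mass(M, L)

if __name__ == "__main__":
    M, L = 1.0, 0.8
    # Penrose process: the black hole loses mass and angular momentum.
    print(irreducible_mass_change(M, L, dM=-1e-3, dL=-5e-3))
```

For M = 1 and L = 0.8 one has ω_H = 1/4, so the absorbed particle with δM = −10⁻³ needs δL < −4·10⁻³. The mass of the black hole decreases, while its irreducible mass grows.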


6.5 Black hole thermodynamics and Hawking radiation


Bekenstein entropy We have shown that classically the horizon of a black hole can only
increase with time. The only other quantity in physics with the same property is the entropy,
dS ≥ 0. This suggests a connection between the horizon area and its entropy. To derive this
relation, we apply the first law of thermodynamics dU = T dS − P dV + . . . to a Kerr black
hole. Its internal energy U is given by U = M and thus

dU = dM = T dS − ωdL , (6.75)

where ωdL denotes the mechanical work done on a rotating macroscopic body.
Our experience with the thermodynamics of non-gravitating systems suggests that the
entropy is an extensive quantity and thus proportional to the volume, S ∝ V . We now offer
an argument that shows that the entropy S of a black hole is proportional to its area A. We
introduce the “rationalised area” α = A/4π = 2M r+ , cf. (6.60), or
\[ \alpha = 2M^2 + 2\sqrt{M^4 - L^2} . \tag{6.76} \]

The parameters describing a Kerr black hole are its mass M and its angular momentum L and
thus α = α(M, L). We form the differential dα and find after some algebra (problem 25.??)

\[ \frac{\sqrt{M^2 - a^2}}{2\alpha}\,d\alpha = dM + \frac{a}{\alpha}\,dL . \tag{6.77} \]
Using now Eq. (6.60) and (6.68), we can rewrite the RHS as
\[ \frac{\sqrt{M^2 - a^2}}{2\alpha}\,d\alpha = dM + \omega_H\,dL . \tag{6.78} \]

Thus the first law of black hole thermodynamics predicts the correct angular velocity ωH of
a Kerr black hole. Including the term Φdq representing the work done by adding the charge
dq to a black hole, the area law of a charged black hole together with the first law of BH
thermodynamics reproduces the correct surface potential Φ of a charged black hole.
The factor in front of dα is positive, as its interpretation as temperature requires. We
identify
\[ T\,dS = \frac{\sqrt{M^2 - a^2}}{2\alpha}\,d\alpha \tag{6.79} \]

and thus S = f (A). The validity of the area theorem requires that f is a linear function;
the proportionality coefficient between S and A can be determined only by calculating the
temperature of the black hole. Hawking showed in 1974 that a black hole in vacuum emits
black-body radiation (“Hawking radiation”) with temperature
\[ T = \frac{2\sqrt{M^2 - a^2}}{A} \tag{6.80} \]
and thus
\[ S = \frac{kc^3}{4\hbar G}\,A = \frac{A}{4L_{\rm Pl}^2} . \tag{6.81} \]
The entropy of a black hole is not extensive but is proportional to its surface. It is large,
because its basic unit of entropy, $4L_{\rm Pl}^2$, is so tiny. The presence of ħ in the first formula, where
we have inserted the natural constants, signals that the black hole entropy is a quantum
property.
The heat capacity CV of a Schwarzschild black hole follows with U = M = 1/(8πT ) from
the definition
\[ C_V = \frac{\partial U}{\partial T} = -\frac{1}{8\pi T^2} < 0 . \tag{6.82} \]
As is typical for self-gravitating systems, its heat capacity is negative. Thus a black hole
surrounded by a cooler medium emits radiation, heats up the environment and becomes
hotter.

Hawking radiation Hawking showed in 1974 that a black hole in vacuum emits black-body
radiation (“Hawking radiation”) with temperature
\[ T = \frac{2\sqrt{M^2 - a^2}}{A} \tag{6.83} \]
and thus
\[ S = \frac{kc^3}{4\hbar G}\,A = \frac{A}{4L_{\rm Pl}^2} . \tag{6.84} \]
A black hole surrounded by a cooler medium emits radiation and heats up the environment.
The entropy of a black hole is large, because its basic unit of entropy, $4L_{\rm Pl}^2$, is so tiny.
We can understand this result by considering an observer in the Schwarzschild metric. The
acceleration of a stationary observer,
\[ a \equiv (-a\cdot a)^{1/2} = \left(1 - \frac{2M}{r}\right)^{-1/2}\frac{M}{r^2} = \left(1 - \frac{R_s}{r}\right)^{-1/2}\frac{R_s/2}{r^2} , \tag{6.85} \]

diverges approaching the horizon, r → Rs = 2M . The acceleration a close to the horizon, i.e.
for r1 − Rs ≪ Rs , is thus much larger than the curvature ∝ 1/Rs . We can therefore use the
approximation of an accelerated observer in flat space, who sees according to the Unruh
effect a thermal spectrum with temperature T1 = a1 /2π at r1 . Assume now that the observer
moves from r1 to r2 > r1 . Then the spectrum is redshifted by V1 /V2 with Vi = √(1 − Rs /ri ).
For r2 → ∞, V2 → 1 and thus T2 → V1 T1 . Letting r1 also approach the horizon, the temperature
becomes
\[ T = \lim_{r_1\to R_s} V_1 T_1 = \lim_{r_1\to R_s} \sqrt{1 - \frac{R_s}{r_1}}\;\frac{1}{2\pi}\,\frac{R_s/2}{r_1^2\sqrt{1 - R_s/r_1}} = \frac{1}{4\pi R_s} = \frac{1}{8\pi M} . \tag{6.86} \]
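
To get a feeling for the numbers, the result T = 1/(8πM) can be converted to SI units by restoring the constants, T = ħc³/(8πGMk). The sketch below does this for a Schwarzschild black hole; the numerical values of the constants and of the solar mass are inserted here as assumptions (rounded CODATA-style values) and do not appear in the text.

```python
import math

# Physical constants in SI units (assumed values, not from the notes).
HBAR = 1.054571817e-34   # J s
C = 2.99792458e8         # m / s
G = 6.67430e-11          # m^3 kg^-1 s^-2
K_B = 1.380649e-23       # J / K
M_SUN = 1.98847e30       # kg

def hawking_temperature(mass_kg):
    """Hawking temperature of a Schwarzschild black hole, T = hbar c^3 / (8 pi G M k)."""
    return HBAR * C**3 / (8.0 * math.pi * G * mass_kg * K_B)

if __name__ == "__main__":
    print(f"T(M_sun) = {hawking_temperature(M_SUN):.3e} K")
```

A solar-mass black hole comes out at roughly 6 × 10⁻⁸ K, far below the temperature of the cosmic microwave background. Since T ∝ 1/M, the black hole becomes hotter as it loses mass, in line with the negative heat capacity (6.82).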

connection firewall, Kerr

6.A Appendix: Conformal flatness for d = 2

7 Classical field theory

7.1 Lagrange formalism


A relativistic field associates to each spacetime point xµ a set of values. The space of field
values at each point can be characterized by its transformation properties under Lorentz
transformations (a scalar φ, vector Aµ , tensor gµν , or spinor ψa field) and internal symme-
try groups which are (typically) Lie groups like U(1), SU(n),. . . Thus we have to generalize
Hamilton’s principle to a collection of fields φa (xµ ), a = 1, . . . , k, where the index a includes
both Lorentz and group indices. To ensure Lorentz invariance, we consider a scalar Lagrange
density $\mathscr{L}$ that may, analogously to L(q, q̇), depend on the fields and their first derivatives ∂µ φa .
There is no explicit time-dependence, since “everything” should be explained by the fields
and their interactions. The Lagrangian L(φa , ∂µ φa ) is obtained by integrating $\mathscr{L}$ over a given
space volume V .
The action S is thus the four-dimensional integral
\[ S[\mathscr{L}(\phi_a, \partial_\mu\phi_a)] = \int_a^b dt\, L(\phi_a, \partial_\mu\phi_a) = \int_\Omega d^4x\, \mathscr{L}(\phi_a, \partial_\mu\phi_a) , \tag{7.1} \]
where Ω = V × [ta : tb ]. If the Lorentz scalar $\mathscr{L}$ is in addition a local function, i.e. it is
a function of the fields and their gradients at the same spacetime point xµ , we will obtain
automatically Lorentz-invariant equations of motion.
A variation εφa ≡ δφa of the fields leads to a variation of the action,
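\[ \delta S = \int_\Omega d^4x \left[ \frac{\partial\mathscr{L}}{\partial\phi_a}\,\delta\phi_a + \frac{\partial\mathscr{L}}{\partial(\partial_\mu\phi_a)}\,\delta(\partial_\mu\phi_a) \right] , \tag{7.2} \]
where we have to sum over fields (a = 1, . . . , k) and the Lorentz index µ = 0, . . . , 3. We
eliminate again the variation of the field gradients ∂µ φa by a partial integration using Gauß’
theorem,
\[ \delta S = \int_\Omega d^4x \left[ \frac{\partial\mathscr{L}}{\partial\phi_a} - \partial_\mu\left(\frac{\partial\mathscr{L}}{\partial(\partial_\mu\phi_a)}\right) \right] \delta\phi_a = 0 . \tag{7.3} \]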

The boundary term vanishes, since we require that the variation is zero on the boundary ∂Ω.
Thus the Lagrange equations for the fields φa are
 
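\[ \frac{\partial\mathscr{L}}{\partial\phi_a} - \partial_\mu\left(\frac{\partial\mathscr{L}}{\partial(\partial_\mu\phi_a)}\right) = 0 . \tag{7.4} \]

If the Lagrange density $\mathscr{L}$ is changed by a four-dimensional divergence, the same equations
of motion result.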


7.2 Noether’s theorem and conservation laws


Conservation laws Let j µ be a conserved vector field in Minkowski space,

∂µ j µ = 0 . (7.5)

Then
\[ \frac{d}{dt}\int_V d^3x\, j^0 = -\int_{\partial V} d\boldsymbol{S}\cdot\boldsymbol{j} \tag{7.6} \]
and
\[ Q = \int_V d^3x\, j^0 \tag{7.7} \]
is a globally conserved quantity, if there is no outgoing flux j through the boundary ∂V . To
show that Q is a Lorentz invariant quantity, we have to rewrite Eq. (7.7) as a tensor equation.
Consider
\[ Q(t = 0) = \int d^4x\, j^\mu(x)\,\partial_\mu\vartheta(n\cdot x) \tag{7.8} \]
with ϑ the step function and n a unit vector in time direction, n · x = x0 = t. Then
\[ Q(t = 0) = \int d^4x\, j^0(x)\,\partial_0\vartheta(x^0) = \int d^4x\, j^0(x)\,\delta(x^0) = \int d^3x\, j^0(x) \tag{7.9} \]

and hence Eqs. (7.7) and (7.8) are equivalent. Since one of them is a tensor equation, Q is
Lorentz invariant.
In the same way, we can construct in Minkowski space globally conserved quantities Q for
conserved tensors: If for instance ∂µ T µν = 0, then
\[ P^\nu = \int d^3x\, T^{0\nu} \tag{7.10} \]

is a globally conserved vector, and similarly for higher-rank tensors.

Symmetries and Noether’s theorem Noether’s theorem gives a formal connection between
global, continuous symmetries of a physical system and the resulting conservation laws. Such
symmetries can be divided into space-time and internal symmetries. We derive this theorem
in two steps, considering in the first one only internal symmetries.
We assume that our collection of fields φa has a continuous symmetry group. Thus we can
consider an infinitesimal change δφa that keeps L (φa , ∂µ φa ) invariant,

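\[ 0 = \delta\mathscr{L} = \frac{\delta\mathscr{L}}{\delta\phi_a}\,\delta_0\phi_a + \frac{\delta\mathscr{L}}{\delta\partial_\mu\phi_a}\,\delta_0\partial_\mu\phi_a . \tag{7.11} \]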

Here, we used the notation δ0 to stress that we exclude variations due to the change of
spacetime point. Now we exchange δ∂µ against ∂µ δ in the second term and use then the
Lagrange equations, δL /δφa = ∂µ (δL /δ∂µ φa ), in the first term. Then we can combine the
two terms using the Leibniz rule,
   
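\[ 0 = \delta\mathscr{L} = \partial_\mu\left(\frac{\delta\mathscr{L}}{\delta\partial_\mu\phi_a}\right)\delta_0\phi_a + \frac{\delta\mathscr{L}}{\delta\partial_\mu\phi_a}\,\partial_\mu\delta_0\phi_a = \partial_\mu\left(\frac{\delta\mathscr{L}}{\delta\partial_\mu\phi_a}\,\delta_0\phi_a\right) . \tag{7.12} \]

Hence the invariance of $\mathscr{L}$ under the change δ0 φa implies the existence of a conserved current,
∂µ j µ = 0, with
\[ j^\mu = \frac{\delta\mathscr{L}}{\delta\partial_\mu\phi_a}\,\delta_0\phi_a . \tag{7.13} \]
If the transformation δ0 φa leads to a change in $\mathscr{L}$ that is a total four-divergence, $\delta_0\mathscr{L} = \partial_\mu K^\mu$,
and boundary terms can be dropped, then the equations of motion are still invariant. The
conserved current is changed to $j^\mu = (\delta\mathscr{L}/\delta\partial_\mu\phi_a)\,\delta_0\phi_a - K^\mu$.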
In the second step, we consider in addition a variation of the coordinates, x′µ = xµ + δxµ .
Such a variation implies a change of the fields

φ′a (x′µ ) = φa (xµ ) + δφa (xµ ) (7.14)

and thus also of the Lagrange density. Note that we compare now the field at different points.
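In order to be able to recycle our old result, we split the total variation δφa (xµ ) as follows
\begin{align}
\delta\phi_a(x^\mu) &= \phi'_a(x'^\mu) - \phi_a(x^\mu) = \phi'_a(x^\mu + \delta x^\mu) - \phi_a(x^\mu) \tag{7.15} \\
&= \phi'_a(x^\mu) + \delta x^\mu\,\partial_\mu\phi'_a(x^\mu) - \phi_a(x^\mu) = \delta_0\phi_a(x^\mu) + \delta x^\mu\,\partial_\mu\phi'_a(x^\mu) \tag{7.16} \\
&= \delta_0\phi_a(x^\mu) + \delta x^\mu\,\partial_\mu\phi_a(x^\mu) . \tag{7.17}
\end{align}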

Here we made in the second line first a Taylor expansion, and introduced then the local
variation δ0 φa (xµ ) = φ′a (xµ ) − φa (xµ ) which we calculated previously. Since δxµ is already
a linear term, we could replace in the third line φ′a (xµ ) ≃ φa (xµ ), neglecting thereby only a
quadratic term.
We consider now the variation of the action S implied by the coordinate change x̃µ =
xµ + δxµ . Such a variation implies not only a variation of L but also of the integration
measure d⁴x,
\[ \delta S = \int \left[ d^4x\,(\delta\mathscr{L}) + (\delta d^4x)\,\mathscr{L} \right] . \tag{7.18} \]

The two integration measures d4 x and d4 x̃ are connected by the Jacobian, i.e. the determinant
of the transformation matrix
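\[ a^\mu{}_\nu = \frac{\partial\tilde{x}^\mu}{\partial x^\nu} . \tag{7.19} \]
Using again that the variation is infinitesimal, we find
\[ J = \left|\frac{\partial\tilde{x}^\mu}{\partial x^\nu}\right| = \begin{vmatrix} 1 + \frac{\partial\delta x^0}{\partial x^0} & \frac{\partial\delta x^0}{\partial x^1} & \cdots \\ \frac{\partial\delta x^1}{\partial x^0} & 1 + \frac{\partial\delta x^1}{\partial x^1} & \cdots \\ \vdots & \vdots & \ddots \end{vmatrix} = 1 + \frac{\partial\delta x^\mu}{\partial x^\mu} . \tag{7.20} \]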

Inserting first this result and using then Eq. (7.17) applied to L gives
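\[ \delta S = \int_\Omega d^4x \left( \delta\mathscr{L} + \mathscr{L}\,\frac{\partial\delta x^\mu}{\partial x^\mu} \right) = \int_\Omega d^4x \left( \delta_0\mathscr{L} + \frac{\partial\mathscr{L}}{\partial x^\mu}\,\delta x^\mu + \mathscr{L}\,\frac{\partial\delta x^\mu}{\partial x^\mu} \right) . \tag{7.21} \]

We combine the last two terms using the Leibniz rule, and insert the known variation $\delta_0\mathscr{L}$
at the same point from Eq. (7.12), obtaining
\[ \delta S = \int_\Omega d^4x\, \frac{\partial}{\partial x_\mu} \left[ \frac{\partial\mathscr{L}}{\partial(\partial^\mu\phi_a)}\,\delta_0\phi_a + \mathscr{L}\,\delta x_\mu \right] . \tag{7.22} \]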


If the system is invariant under these transformations, the variation of the action is zero,
δS = 0, and the square bracket represents a conserved current j µ . As last step, we change
from the local variation δ0 to the full variation δ using Eq. (7.17), obtaining as final expression
for the Noether current
 
\[ j_\mu = \frac{\partial\mathscr{L}}{\partial(\partial^\mu\phi_a)}\,\delta\phi_a - \left[ \frac{\partial\mathscr{L}}{\partial(\partial^\mu\phi_a)}\,\frac{\partial\phi_a}{\partial x^\nu} - \eta_{\mu\nu}\mathscr{L} \right] \delta x^\nu . \tag{7.23} \]

Translations Invariance under translations x′µ = xµ + εµ means φ′a (x′ ) = φa (x) or δφa = 0.
Hence we obtain a conserved tensor
\[ \Theta_{\mu\nu} = \frac{\partial\mathscr{L}}{\partial(\partial^\mu\phi_a)}\,\frac{\partial\phi_a}{\partial x^\nu} - \eta_{\mu\nu}\mathscr{L} \tag{7.24} \]
called the energy-momentum stress tensor or in short the stress tensor. We will see in the
next chapter that this tensor sources gravity, being thus of crucial interest for us. If the
stress tensor is derived via the Noether procedure (7.24), it is called canonical. In general,
the canonical stress tensor is not symmetric, Θµν ≠ Θνµ , as it should be as source of gravity
in Einstein’s theory. Note however that the Noether procedure does not uniquely specify
the stress tensor, because we can add any tensor ∂λ f λµν which is antisymmetric in µ and
λ: such a term drops out of the conservation law because of ∂µ ∂λ f λµν = 0. This freedom
allows us to obtain always a symmetric stress tensor. We will learn later a different method,
leading directly to a symmetric energy-momentum tensor Tµν (called the dynamical energy-
momentum tensor).

Stress tensor: The invention of the three-dimensional stress tensor σij goes back to Pascal
and Euler. Recall that σij is determined via dFi = σij dAj as the response of a material to
the force Fi on its surface element Aj . This implies that we can view the stress tensor also as
an (anisotropic) pressure tensor. Moreover, it follows with fi = dFi /dV for the force density
fj = ∂i σij as equilibrium condition (or equation of motion) of the system.
The relativistic stress tensor Tµν was introduced by Minkowski in 1908 for electrodynamics,
combining Maxwell’s stress tensor (in vacuum)
\[ \sigma_{ij} = E_i E_j + B_i B_j - \frac{1}{2}(E^2 + B^2)\,\delta_{ij} \]
with the energy density ρ = (E² + B²)/2, the Poynting vector (or energy flux) S = E × B,
and the momentum density π into
\[ T^{\mu\nu} = \begin{pmatrix} \rho & \boldsymbol{S} \\ \boldsymbol{\pi} & \sigma_{ij} \end{pmatrix} . \]
In a relativistic theory, the energy flux equals the momentum density. Then T 0i = T i0 , what
is sufficient to show the symmetry of the full tensor.

Integrating we obtain four conserved Noether charges,
\[ p^\nu = \int d^3x\, \Theta^{0\nu} . \tag{7.25} \]

From the example, we know that Θ00 corresponds to the energy density ρ. Therefore p0 is
the energy, and thus pµ the four-momentum of the field. This is in line with the fact that
translations are generated by the four-momentum operator.


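Lorentz transformations Lorentz transformations, i.e. rotations and boosts, lead to a linear
change of coordinates,
\[ \tilde{x}^\mu = x^\mu + \delta\omega^{\mu\nu} x_\nu . \tag{7.26} \]
They preserve the norm of vectors, implying that
\begin{align}
x^\mu x_\mu = \tilde{x}^\mu \tilde{x}_\mu &= (x^\mu + \delta\omega^{\mu\sigma} x_\sigma)(x_\mu + \delta\omega_{\mu\tau} x^\tau) \tag{7.27} \\
&= x^\mu x_\mu + \delta\omega^{\mu\sigma} x_\sigma x_\mu + \delta\omega_{\mu\tau} x^\mu x^\tau + O(\omega^2) \tag{7.28} \\
&= x^\mu x_\mu + (\delta\omega^{\mu\nu} + \delta\omega^{\nu\mu})\, x_\mu x_\nu . \tag{7.29}
\end{align}
Thus the matrix parameterising Lorentz transformations is antisymmetric,
\[ \omega^{\mu\nu} = -\omega^{\nu\mu} , \tag{7.30} \]
and has six independent elements. For an infinitesimal transformation, the transformed fields
φ̃a (x̃) depend linearly on¹ $\delta\omega^{\mu\nu}$ and φa (x),
\[ \tilde{\phi}_a(\tilde{x}) = \phi_a(x) + \frac{1}{2}\,\delta\omega_{\mu\nu}\,(I^{\mu\nu})_{ab}\,\phi_b(x) . \tag{7.31} \]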
The symmetric part of (I µν )ab does not contribute, because of the antisymmetry of the δω µν .
Hence we can choose also the (I µν )ab as antisymmetric and thus there exists six generators
(I µν )ab corresponding to the three boosts and the three rotations. The explicit form of the
generators Iab (the “matrix representation of the Lorentz group” for spin s) depends on the
spin of the considered field, as the known different transformation properties of scalar (s = 0),
spinor (s = 1/2) and vector (s = 1) fields under rotations show.
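
The role of the antisymmetry (7.30) can be made concrete with a small numerical sketch. It applies Eq. (7.26) to a sample vector, once with an antisymmetric δω (an infinitesimal boost along x) and once with a symmetric one, and compares the Minkowski norms. The metric signature (+,−,−,−), the sample vector and all function names are choices made here for illustration.

```python
ETA = [1.0, -1.0, -1.0, -1.0]  # diagonal of the Minkowski metric, signature (+,-,-,-)

def minkowski_norm(x):
    """x^mu x_mu for a contravariant four-vector x."""
    return sum(ETA[m] * x[m] * x[m] for m in range(4))

def infinitesimal_transform(x, omega):
    """Eq. (7.26): x~^mu = x^mu + omega^{mu nu} x_nu, with x_nu = eta_{nu nu} x^nu."""
    x_lower = [ETA[n] * x[n] for n in range(4)]
    return [x[m] + sum(omega[m][n] * x_lower[n] for n in range(4)) for m in range(4)]

def boost_x(eps):
    """Antisymmetric omega^{mu nu} describing an infinitesimal boost along x."""
    omega = [[0.0] * 4 for _ in range(4)]
    omega[0][1] = eps
    omega[1][0] = -eps
    return omega

def symmetric_example(eps):
    """A symmetric omega^{mu nu}; it is NOT a Lorentz transformation."""
    omega = [[0.0] * 4 for _ in range(4)]
    omega[0][1] = eps
    omega[1][0] = eps
    return omega

if __name__ == "__main__":
    x = [2.0, 1.0, 0.5, -0.3]
    eps = 1e-4
    for name, omega in (("antisymmetric", boost_x(eps)), ("symmetric", symmetric_example(eps))):
        change = minkowski_norm(infinitesimal_transform(x, omega)) - minkowski_norm(x)
        print(f"{name}: change of norm = {change:.3e}")
```

For the antisymmetric choice the norm changes only at order ε², while the symmetric choice changes it already at order ε.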
We evaluate now the Noether current (7.23), inserting first the definition of the stress
tensor,
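\[ j_\mu = \frac{\partial\mathscr{L}}{\partial(\partial^\mu\phi_a)}\,\delta\phi_a - \Theta_{\mu\nu}\,\delta x^\nu . \tag{7.32} \]
Next we use $\delta x^\mu = \delta\omega^{\mu\nu} x_\nu$ and $\delta\phi_a = \frac{1}{2}\delta\omega_{\mu\nu}(I^{\mu\nu})_{ab}\phi_b(x)$ as well as the antisymmetry of $\delta\omega^{\mu\nu}$,
to obtain
\[ j_\mu = \frac{\partial\mathscr{L}}{\partial(\partial^\mu\phi_a)}\,\frac{1}{2}\,\delta\omega^{\nu\lambda}(I_{\nu\lambda})_{ab}\,\phi_b(x) - \underbrace{\Theta_{\mu\nu}\,\delta\omega^{\nu\lambda} x_\lambda}_{\frac{1}{2}\delta\omega^{\nu\lambda}(\Theta_{\mu\nu} x_\lambda - \Theta_{\mu\lambda} x_\nu)} = \frac{1}{2}\,\delta\omega^{\nu\lambda} M_{\mu\nu\lambda} \tag{7.33} \]
with the definition
\[ M_{\mu\nu\lambda} = \Theta_{\mu\lambda} x_\nu - \Theta_{\mu\nu} x_\lambda + \frac{\partial\mathscr{L}}{\partial(\partial^\mu\phi_a)}\,(I_{\nu\lambda})_{ab}\,\phi_b . \tag{7.34} \]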
This tensor of rank three is antisymmetric in the index pair νλ and conserved with respect
to the index µ. In order to understand its meaning, let us consider first a scalar field φ(x).
Then φ̃(x̃) = φ(x), the last term is thus absent, and the conservation law becomes
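\[ 0 = \partial_\mu M^{\mu\nu\lambda} = \delta_\mu^\nu\,\Theta^{\mu\lambda} - \delta_\mu^\lambda\,\Theta^{\mu\nu} = \Theta^{\nu\lambda} - \Theta^{\lambda\nu} . \tag{7.35} \]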

Hence for a scalar field, the canonical stress tensor is symmetric, Θνλ = Θλν , and agrees with
the dynamical stress tensor, Θµν = T µν . The corresponding Noether charges are
\[ M^{\nu\mu} = \int d^3x\, M^{0\nu\mu} = \int d^3x \left( x^\nu\Theta^{0\mu} - x^\mu\Theta^{0\nu} \right) \equiv L^{\nu\mu} . \tag{7.36} \]

¹ We add a factor 1/2, because in the summation two terms contribute for each transformation parameter.


Recalling Eq. (7.25), we see that these charges agree with the relativistic orbital angular mo-
mentum tensor Lµν . Since Lµν is antisymmetric, Eq. (7.36) defines six conserved quantities,
one for each of the generators of the Lorentz group. Choosing spatial indices, Lij agrees with
the non-relativistic orbital angular momentum, while the conservation of Li0 leads to the
relativistic version of the constant center-of-mass motion.
For a field with non-zero spin, the last term in Eq. (7.34) does not vanish. It represents
therefore the intrinsic or spin angular momentum density S µν of the field. In this case, only
the total angular momentum M µν is conserved, not however the orbital and spin angular
momentum individually. Moreover, the canonical stress tensor is not symmetric.

Spin, helicity and the representation of SO(1,3): ffff

7.3 Perfect fluid


In cosmology, the various contributions to the energy content of the universe can be modelled
as fluids, averaging over sufficiently large scales such that N ≫ 1 particles (photons, dark
matter particles, . . . , galaxies) are contained in a “fluid element”. In almost all cases, viscosity
is negligible and the state of such an ideal or perfect fluid is fully parametrised by its energy
density ρ and pressure P .
We construct the stress tensor of a perfect fluid considering first the simplest case of
pressureless matter, traditionally called dust. Consider now how the energy density ρ
of dust transforms. An observer moving relative to the rest frame of dust measures
ρ′ = γ dm/(γ⁻¹ dV ) = γ²ρ. Hence the energy density should be the 00 component of the
stress tensor $T^{\alpha\beta}$, with $T^{00} = \rho$ in the rest frame. In order to find the expression
valid in any frame we can use the tensor method: We express $T^{\alpha\beta}$ as a linear combination of
all relevant tensors, which are in our case the four-velocity $u^\alpha$ plus the invariant tensors of
Minkowski space, i.e. the metric tensor and the Levi-Civita symbol. Additionally, we impose
the constraint that $T^{\alpha\beta}$ is symmetric, leading to
\[ T^{\alpha\beta} = A\rho\, u^\alpha u^\beta + B\rho\, \eta^{\alpha\beta} . \tag{7.37} \]
In the rest-frame, $u^\alpha = (1, \boldsymbol{0})$, the condition $T^{00} = \rho$ leads to A + B = 1, while $T^{11} = 0$ implies
B = 0. Thus the stress tensor of dust is
\[ T^{\alpha\beta} = \rho\, u^\alpha u^\beta . \tag{7.38} \]

Writing uα = (γ, γv), we can identify T 00 = γ 2 ρ with the energy density, T 0i = γ 2 ρv i with
the energy/momentum density flux in direction i, and T ij = γ 2 ρv i v j with the flow of the
momentum density component i through the area with normal direction j.
Let us now check the consequences of ∂α T αβ = 0, assuming for simplicity the non-relativistic
limit. We look first at the α = 0 component,

∂t ρ + ∇ · (ρu) = 0 . (7.39)

This corresponds to the mass continuity equation and, because of E = m for dust, at the
same time to energy conservation. Next we consider the α = 1, 2, 3 = i components,

∂t (ρuj ) + ∂i (ui uj ρ) = 0 (7.40)


or
u∂t ρ + ρ∂t u + u∇ · (ρu) + (u · ∇)uρ = 0 . (7.41)
Taking the continuity equation into account, we obtain the Euler equation for a force-free
fluid without viscosity,
ρ∂t u + (u · ∇)uρ = 0 . (7.42)
Hence, as announced, the condition ∂µ T µν = f ν gives the equations of motion.
Finally we include the effect of pressure. We know that the pressure tensor coincides with
the σij part of the stress tensor. Moreover, for a perfect fluid in its rest-frame, the pressure is
isotropic Pij = P δij . This corresponds to Pij = −P ηij and adds −P to T 00 . Compensating
for this gives
T αβ = (ρ + P )uα uβ − P η αβ . (7.43)
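
The following sketch builds the tensor (7.43) for a fluid moving with three-velocity v (units c = 1, signature (+,−,−,−)) and can be used to check its limits. The function names and sample values are illustrative choices, not part of the notes.

```python
import math

ETA = [[1.0, 0.0, 0.0, 0.0],
       [0.0, -1.0, 0.0, 0.0],
       [0.0, 0.0, -1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]

def four_velocity(v):
    """u^alpha = (gamma, gamma v) for a three-velocity v with |v| < 1."""
    gamma = 1.0 / math.sqrt(1.0 - sum(c * c for c in v))
    return [gamma] + [gamma * c for c in v]

def perfect_fluid_tensor(rho, P, v):
    """T^{alpha beta} = (rho + P) u^alpha u^beta - P eta^{alpha beta}, Eq. (7.43)."""
    u = four_velocity(v)
    return [[(rho + P) * u[a] * u[b] - P * ETA[a][b] for b in range(4)]
            for a in range(4)]

def trace(T):
    """Lorentz-invariant trace eta_{alpha beta} T^{alpha beta}."""
    return sum(ETA[a][a] * T[a][a] for a in range(4))

if __name__ == "__main__":
    T_rest = perfect_fluid_tensor(rho=1.0, P=0.3, v=(0.0, 0.0, 0.0))
    print([T_rest[a][a] for a in range(4)])   # diagonal (rho, P, P, P)
    T_moving = perfect_fluid_tensor(rho=1.0, P=0.3, v=(0.6, 0.0, 0.0))
    print(T_moving[0][0], trace(T_moving))
```

In the rest frame the tensor is diag(ρ, P, P, P). For a moving fluid the 00 component becomes γ²(ρ + P v²), which reduces to the dust result γ²ρ for P = 0, while the trace ρ − 3P is the same in every frame.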

7.4 Klein-Gordon field


Real field The Klein-Gordon equation is a relativistic wave equation describing a scalar
field. We first consider a real field φ. Just as the free Schrödinger equation,
\[ i\partial_t\psi = \frac{p^2}{2m}\,\psi = -\frac{\Delta}{2m}\,\psi , \tag{7.44} \]
can be “derived” using the replacements
\[ E \to i\partial_t , \qquad \boldsymbol{p} \to -i\nabla_x \tag{7.45} \]
from the non-relativistic energy-momentum relation E = p²/(2m), we obtain from the relativistic
E² = m² + p²
\[ (\Box + m^2)\,\phi = 0 \quad\text{with}\quad \Box = \eta_{\mu\nu}\,\partial^\mu\partial^\nu . \tag{7.46} \]

Translation invariance implies that we can choose the solutions as eigenstates of the momentum
operator, p̂φ = pφ. These states are plane waves with positive and negative energies
±√(k² + m²). Interpreting the Klein–Gordon equation as a relativistic wave equation for a
single particle cannot therefore be fully satisfactory, since the energy of its solutions is not
bounded from below.
How do we guess the correct Lagrange density L ? The correspondence q̇ ↔ ∂µ φ means
that the kinetic field energy is quadratic in the field derivatives. In contrast, the mass term
m2 is potential energy, V (φ) ∝ m2 . The relativistic energy-momentum relation E 2 = m2 + p2
suggests that V (φ) is also quadratic, with the same numerical coefficient as the kinetic energy.
Therefore we try as Lagrange density
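\[ \mathscr{L} = \frac{1}{2}\,\eta_{\mu\nu}(\partial^\mu\phi)(\partial^\nu\phi) - V(\phi) = \frac{1}{2}\,\eta_{\mu\nu}(\partial^\mu\phi)(\partial^\nu\phi) - \frac{1}{2}\,m^2\phi^2 \equiv \frac{1}{2}(\partial_\mu\phi)^2 - \frac{1}{2}\,m^2\phi^2 , \tag{7.47} \]
where the factor 1/2 is convention: The kinetic energy of a canonically normalised real field
carries the prefactor 1/2. With
\[ \frac{\partial}{\partial(\partial_\alpha\phi)}\left( \eta^{\mu\nu}\,\partial_\mu\phi\,\partial_\nu\phi \right) = \eta^{\mu\nu}\left( \delta_\mu^\alpha\,\partial_\nu\phi + \delta_\nu^\alpha\,\partial_\mu\phi \right) = \eta^{\alpha\nu}\partial_\nu\phi + \eta^{\mu\alpha}\partial_\mu\phi = 2\,\partial^\alpha\phi , \tag{7.48} \]
the Lagrange equation becomes
\[ \frac{\partial\mathscr{L}}{\partial\phi} - \partial_\alpha\left( \frac{\partial\mathscr{L}}{\partial(\partial_\alpha\phi)} \right) = -m^2\phi - \partial_\alpha\partial^\alpha\phi = 0 . \tag{7.49} \]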

Thus the Lagrange density (7.47) leads to the Klein-Gordon equation. We can check if we
have correctly chosen the signs by calculating the stress tensor,

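\[ T^{\mu\nu} = \frac{\partial\mathscr{L}}{\partial\phi_{,\mu}}\,\phi^{,\nu} - \eta^{\mu\nu}\mathscr{L} = \phi^{,\mu}\phi^{,\nu} - \eta^{\mu\nu}\mathscr{L} , \tag{7.50} \]
which is already symmetric. The corresponding 00 component,
\[ T^{00} = \phi^{,0}\phi^{,0} - \mathscr{L} = \frac{1}{2}\left[ (\partial_t\phi)^2 + (\nabla\phi)^2 + m^2\phi^2 \right] > 0 , \tag{7.51} \]
is positive definite. Thus the energy density of a scalar field is, in contrast to the energy of the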
single-particle solution, bounded from below.

Complex field and internal symmetries If two fields exist with the same mass m, one might
wish to combine the two real fields into one complex field,
\[ \phi = \frac{1}{\sqrt{2}}\,(\phi_1 + i\phi_2) . \tag{7.52} \]

Then one can interpret φ and φ† as a particle and its antiparticle, which are Hermitian
conjugated fields.
The resulting Lagrangian density is just the sum,

L = ∂µ φ† ∂ µ φ − m2 φ† φ (7.53)

The presence of two fields sharing some quantum numbers (here the mass) opens up the
possibility of internal symmetries. The Lagrangian (7.53) is invariant under global phase
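transformations, $\phi \to e^{i\vartheta}\phi$ and $\phi^\dagger \to e^{-i\vartheta}\phi^\dagger$. With δφ = iφ and δφ† = −iφ† , the conserved
current follows as
\[ j^\mu = i\left[ \phi^\dagger\,\partial^\mu\phi - (\partial^\mu\phi^\dagger)\,\phi \right] . \tag{7.54} \]
The conserved charge $Q = \int d^3x\, j^0$ can be also negative and thus we cannot interpret $j^0$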
as the probability density to observe a φ particle. Instead, we should associate Q with a
conserved additive quantum number as, for example, the electric charge.
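
As a short worked example of why the charge can have either sign, we can evaluate the current (7.54) for a plane wave. The ansatz $\phi = N e^{-ikx}$ with $kx = k_\mu x^\mu$ is introduced here only for illustration:

```latex
\begin{align}
\phi &= N e^{-ikx} , \qquad
\partial^\mu\phi = -ik^\mu\phi , \qquad
\partial^\mu\phi^\dagger = +ik^\mu\phi^\dagger , \\
j^\mu &= i\left[ \phi^\dagger(-ik^\mu)\phi - (ik^\mu\phi^\dagger)\phi \right]
       = 2k^\mu\,|N|^2 .
\end{align}
```

Hence $j^0 = 2k^0|N|^2$ is positive for a solution with $k^0 = +\sqrt{\boldsymbol{k}^2 + m^2}$ and negative for one with $k^0 = -\sqrt{\boldsymbol{k}^2 + m^2}$, so that $j^0$ behaves like a charge density and not like a probability density.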
Next we calculate the stress tensor,

T 00 = 2∂t φ† ∂t φ − L = |∂t φ|2 + |∇φ|2 + m2 |φ|2 > 0 . (7.55)

We consider now plane-wave solutions to the Klein-Gordon equation,

φ = N e−ikx . (7.56)

If we insert ∂µ φ = −ikµ φ into L , we find L = 0 and thus

T 00 = 2|N |2 k0 k0 . (7.57)


Relativistic one-particle states are usually normalised as N⁻² = 2ωV . Hence the energy
density T 00 = ω/V agrees with the expectation for one particle with energy ω per volume V .
The other components are necessarily

T µν = 2|N |2 kµ kν . (7.58)

Since $T^{\mu\nu}$ is symmetric, we can find a frame in which $T^{\mu\nu}$ is diagonal with
T ∝ diag(ω, vx kx , vy ky , vz kz )/V . This agrees with the contribution of a single particle to the
energy density and pressure of an ideal fluid. This holds also for other fields, and thus we can
model them as ideal fluids, distinguished only by their equation of state (E.o.S.), w = P/ρ.
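
A minimal sketch of the resulting equation of state: taking ρ ∝ ω and P_i ∝ v_i k_i from the diagonal form above, with v = k/ω, and assuming in addition an isotropic average over directions (an assumption made here, not stated in the text), one finds P = k²/(3ω) and thus w = k²/(3ω²). The function name is illustrative.

```python
import math

def equation_of_state(k, m):
    """w = P / rho for particles with momentum magnitude k and mass m (isotropic average assumed)."""
    omega = math.sqrt(k * k + m * m)
    rho = omega                       # energy density, up to the common factor 1/V
    pressure = k * k / (3.0 * omega)  # <v_i k_i> = v.k / 3 = k^2 / (3 omega)
    return pressure / rho

if __name__ == "__main__":
    print(equation_of_state(k=1.0, m=0.0))    # radiation
    print(equation_of_state(k=1e-3, m=1.0))   # non-relativistic matter
```

Massless particles give w = 1/3 (radiation), while for k ≪ m the pressure becomes negligible and w → 0 (dust).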

7.5 Maxwell field


Field tensor We start by considering a charged point particle interacting with an external
electromagnetic field described by the vector potential Aµ = (φ, A). As Lagrangian for the free
particle we use L = −m ds or
\[ S_0 = -\int_a^b ds\, m . \tag{7.59} \]
What can the interaction term of a charged particle with an electromagnetic field look like? The
action should be a scalar and the simplest choice is
\[ S_{\rm em} = -q\int dx^\mu A_\mu(x) = -q\int d\sigma\, \frac{dx^\mu}{d\sigma}\, A_\mu(x) . \tag{7.60} \]

Note that this choice for Sem is invariant under a change of gauge,

Aµ (x) → Aµ (x) + ∂µ Λ(x) . (7.61)

The resulting change in the action,


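\[ \delta_\Lambda S_{\rm em} = -q\int_1^2 d\sigma\, \frac{dx^\mu}{d\sigma}\,\frac{\partial\Lambda(x)}{\partial x^\mu} = -q\int_1^2 d\Lambda = -q\,[\Lambda(2) - \Lambda(1)] , \tag{7.62} \]
drops out from δS for fixed endpoints, thus not affecting the resulting equation of motion.
With $ds = \sqrt{dx^\mu dx_\mu}$, the variation of the action is
\[ \delta S = -\delta\int_a^b \left( m\,ds + qA_\mu\,dx^\mu \right) = -\int_a^b \left( m\,\frac{dx_\mu\,\delta dx^\mu}{ds} + qA_\mu\,d(\delta x^\mu) + q\,\delta A_\mu\,dx^\mu \right) . \tag{7.63} \]
We use δd = dδ in the first term and integrate then the first two terms partially,
\[ \delta S = \int_a^b \left[ m\,d\!\left(\frac{dx_\mu}{ds}\right)\delta x^\mu + q\,\delta x^\mu\,dA_\mu - q\,\delta A_\mu\,dx^\mu \right] , \tag{7.64} \]
where we have used as “always” that the boundary terms vanish. Next we introduce $u^\mu = dx^\mu/ds$
and use
\[ \delta A_\mu = \frac{\partial A_\mu}{\partial x^\nu}\,\delta x^\nu , \qquad dA_\mu = \frac{\partial A_\mu}{\partial x^\nu}\,dx^\nu . \tag{7.65} \]
Then
\[ \delta S = \int_a^b \left( m\,du_\mu\,\delta x^\mu + q\,\frac{\partial A_\mu}{\partial x^\nu}\,\delta x^\mu\,dx^\nu - q\,\frac{\partial A_\mu}{\partial x^\nu}\,\delta x^\nu\,dx^\mu \right) . \tag{7.66} \]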


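Finally, we rewrite in the first term $du_\mu = (du_\mu/ds)\,ds$, in the second and third $dx^\alpha = u^\alpha ds$
and exchange the summation indices µ and ν in the third term. Then
\[ \delta S = \int_a^b \left[ m\,\frac{du_\mu}{ds} - q\left( \frac{\partial A_\nu}{\partial x^\mu} - \frac{\partial A_\mu}{\partial x^\nu} \right) u^\nu \right] \delta x^\mu\, ds = 0 . \tag{7.67} \]
For arbitrary variations, the bracket has to be zero and we obtain as equation of motion
\[ m\,\frac{du_\mu}{ds} = f_\mu = q\left( \frac{\partial A_\nu}{\partial x^\mu} - \frac{\partial A_\mu}{\partial x^\nu} \right) u^\nu \equiv qF_{\mu\nu}u^\nu . \tag{7.68} \]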
This is the relativistic form of the Lorentz force.

Connection between 3- and 4-dim. formulation of electrodynamics


The first row of Fµν = ∂µ Aν − ∂ν Aµ reads with Aµ = (φ, −Ak ) and ∂µ = ∂/∂xµ = (∂/∂t, ∇k )
as
\[ F_{0k} = \partial_0 A_k - \partial_k A_0 . \]
Setting $F_{0k} = E^k$ gives
\[ \boldsymbol{E} = -\nabla\phi - \partial_t\boldsymbol{A} , \]
which agrees with the first row of Fµν given in Eq. (1.58). We go in the opposite direction for
B = ∇ × A. In components, we have e.g.
\[ B^1 = \partial_2 A^3 - \partial_3 A^2 = \partial_3 A_2 - \partial_2 A_3 = F_{32} \]

and similarly for the other components.


The force law fµ = eFµν uν becomes simplest in a frame with u = (1, 0, 0, 0). Then fµ = eFµ0
or −F = −eE.

Now we can rewrite the Maxwell equations as


∂α F αβ = j β (7.69)
and
∂α Fβγ + ∂β Fγα + ∂γ Fαβ = 0 . (7.70)
The last equation is completely antisymmetric in all three indices, and contains therefore only
four independent equations. It is equivalent to
∂α F̃ αβ = 0 , (7.71)
where
1
F̃ αβ = εαβγδ Fγδ (7.72)
2
is the dual field-strength tensor.
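The components of the electromagnetic field-strength tensor $F^{\mu\nu}$ and its dual
$\tilde{F}_{\alpha\beta} = \frac{1}{2}\varepsilon_{\alpha\beta\mu\nu}F^{\mu\nu}$ are given by (see also appendix to chapter 1)
\[ F^{\mu\nu} = \begin{pmatrix} 0 & -E_x & -E_y & -E_z \\ E_x & 0 & -B_z & B_y \\ E_y & B_z & 0 & -B_x \\ E_z & -B_y & B_x & 0 \end{pmatrix} \quad\text{and}\quad \tilde{F}^{\mu\nu} = \begin{pmatrix} 0 & -B_x & -B_y & -B_z \\ B_x & 0 & E_z & -E_y \\ B_y & -E_z & 0 & E_x \\ B_z & E_y & -E_x & 0 \end{pmatrix} . \]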
They are connected to the electric and magnetic fields measured by an observer with four-
velocity uα as Eα = Fαβ uβ and Bα = F̃αβ uβ .
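
As a consistency check of these components, the sketch below evaluates the four-force $f^\mu = qF^{\mu\nu}u_\nu$ for a particle with three-velocity v and compares its spatial part with the familiar Lorentz force γq(E + v × B). It assumes units c = 1, the signature (+,−,−,−) so that $u_\nu = (\gamma, -\gamma\boldsymbol{v})$, and the layout of $F^{\mu\nu}$ given above; all function names and sample values are illustrative.

```python
import math

def field_tensor(E, B):
    """Contravariant F^{mu nu} built from E and B, laid out as in the matrix above."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [
        [0.0, -Ex, -Ey, -Ez],
        [Ex, 0.0, -Bz, By],
        [Ey, Bz, 0.0, -Bx],
        [Ez, -By, Bx, 0.0],
    ]

def four_force(q, E, B, v):
    """f^mu = q F^{mu nu} u_nu for a particle with three-velocity v (c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - sum(c * c for c in v))
    u_lower = [gamma] + [-gamma * c for c in v]   # u_nu in signature (+,-,-,-)
    F = field_tensor(E, B)
    return [q * sum(F[m][n] * u_lower[n] for n in range(4)) for m in range(4)]

def cross(a, b):
    """Ordinary three-dimensional cross product a x b."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

if __name__ == "__main__":
    q, E, B, v = 2.0, (1.0, 2.0, 3.0), (0.5, -1.0, 2.0), (0.1, 0.2, -0.3)
    print(four_force(q, E, B, v))
```

The time component comes out as γq E · v, the power delivered to the particle, and the spatial components equal γq(E + v × B).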


Current conservation and gauge invariance We take the divergence of Maxwell’s equation
(7.69),
∂ν ∂µ F µν = ∂ν j ν . (7.73)

Since ∂ν ∂µ is symmetric and F µν antisymmetric, the summation of the two factors has to be
zero,
∂ν ∂µ F µν = −∂ν ∂µ F νµ = −∂µ ∂ν F νµ = −∂ν ∂µ F µν . (7.74)

Thus current conservation,


∂ν j ν = 0 , (7.75)

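follows from the antisymmetry of F. The latter followed in turn from the assumed
gauge-invariant action $S_{\rm em} = -q\int dx^\mu A_\mu$.
Consider next the transformation of F under a gauge transformation,
\[ A_\mu \to A'_\mu = A_\mu + \partial_\mu\chi , \tag{7.76} \]
\[ F'_{\mu\nu} = \partial_\mu A'_\nu - \partial_\nu A'_\mu = F_{\mu\nu} + \partial_\mu\partial_\nu\chi - \partial_\nu\partial_\mu\chi = F_{\mu\nu} . \tag{7.77} \]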

Thus the gauge invariance of F is again closely connected to the fact that it is an antisym-
metric tensor, formed by derivatives of A.

Differential forms:
A surface in R3 can be described at any point either by its two tangent vectors e1 and e2 or
by the normal n. They are connected by a cross product, n = e1 × e2 , or in index notation,

\[ n_i = \varepsilon_{ijk}\, e_{1,j}\, e_{2,k} . \tag{7.78} \]

In four dimensions, the ε tensor defines a map between 1-3 and 2-2 tensors. Since ε is antisym-
metric, the symmetric part of tensors would be lost; Hence the map is suited for antisymmetric
tensors.
Antisymmetric tensors of rank n can be seen also as differential forms: Functions are forms of
order n = 0; differential of functions are an example of order n = 1,
\[ df = \frac{\partial f}{\partial x^i}\,dx^i . \tag{7.79} \]
Thus the dxi form a basis, and one can write in general

A = Ai dxi . (7.80)

For n > 1, the basis has to be antisymmetrized,


\[ F = \frac{1}{2}\,F_{\mu\nu}\,dx^\mu\wedge dx^\nu \tag{7.81} \]
with dxµ ∧ dxν = −dxν ∧ dxµ . Looking at df , we can define a differentiation of a form ω with
coefficients w and degree n as an operation that increases its degree by one to n + 1,

dω = dwα,...,β dxn+1 ∧ dxα ∧ . . . ∧ dxβ (7.82)

Thus we have F = dA. Moreover, it follows d²ω = 0 for all forms. Hence a gauge transformation
leaves F invariant, F ′ = d(A + dχ) = F .


Wave equation The Maxwell equation (7.69) consists of four equations for the six compo-
nents of F . Thus we need either a second equation, i.e. Eq. (7.70), or we should transform
Eq. (7.69) into an equation for the four components of the four-potential A. In this case,
Eq. (7.70) is automatically satisfied. Let us do the latter and insert the definition of A,

\[ \partial_\mu F^{\mu\nu} = \partial_\mu(\partial^\mu A^\nu - \partial^\nu A^\mu) = \Box A^\nu - \partial_\mu\partial^\nu A^\mu = j^\nu . \tag{7.83} \]

Gauge invariance allows us to choose a potential $A^\mu$ such that $\partial_\mu A^\mu = 0$. Such a choice is
called fixing the gauge, and the particular case $\partial_\mu A^\mu = 0$ is denoted as the Lorenz gauge. In
this gauge, the wave equation simplifies to
\[ \Box A^\mu = j^\mu . \tag{7.84} \]

Inserting then a plane wave $A^\mu \propto \varepsilon^\mu e^{ikx}$ into the free wave equation, $\Box A^\nu = 0$, we find
that k is a light-like vector, while the Lorenz gauge condition $\partial_\mu A^\mu = 0$ results in $\varepsilon^\mu k_\mu = 0$.
Imposing the Lorenz gauge, we can still add to the potential $A^\mu$ any function $\partial^\mu\chi$ satisfying
$\Box\chi = 0$. We can use this freedom to set $A^0 = 0$, obtaining thereby $\varepsilon^\mu k_\mu = -\boldsymbol{\varepsilon}\cdot\boldsymbol{k} = 0$.
Thus the photon propagates with the speed of light, is transversely polarised and has two
polarisation states as expected for a massless particle.
Let us discuss now why gauge invariance is necessary for a massless spin-1 particle. First
we consider a linearly polarised photon with polarisation vectors $\varepsilon_\mu^{(r)}$ lying in the plane
perpendicular to its momentum vector k. If we perform a Lorentz boost on $\varepsilon_\mu^{(1)}$, we will find
\[ \tilde{\varepsilon}_\mu^{(1)} = \Lambda^\nu{}_\mu\,\varepsilon_\nu^{(1)} = a_1\varepsilon_\mu^{(1)} + a_2\varepsilon_\mu^{(2)} + a_3 k_\mu , \tag{7.85} \]

where the coefficients $a_i$ depend on the direction $\boldsymbol{\beta}$ of the boost. Thus, in general the
polarisation vector will no longer be perpendicular to $k$. Similarly, if we perform a gauge
transformation
$$A_\mu(x) \to A'_\mu(x) = A_\mu(x) - \partial_\mu\Lambda(x) \tag{7.86}$$

with
$$\Lambda(x) = -i\lambda\exp(-ikx) + {\rm h.c.} , \tag{7.87}$$

then
$$A'_\mu(x) = (\varepsilon_\mu + \lambda k_\mu)\exp(-ikx) + {\rm h.c.} = \varepsilon'_\mu\exp(-ikx) + {\rm h.c.} \tag{7.88}$$

Choosing, for example, a photon propagating in z direction, kµ = (ω, 0, 0, ω), we see that
the gauge transformation does not affect the transverse components ε1 and ε2 . Thus only
the components of εµ transverse to k can have physical significance. On the other hand, the
time-like and longitudinal components depend on the arbitrary parameter λ and are therefore
unphysical. In particular, they can be set to zero by a gauge transformation. First, ε′µ k′µ = 0
implies (again for a photon propagating in z direction) ε′0 = −ε′3 . From ε′3 = ε3 + λω, we see
that λ = −ε3 /ω sets ε′3 = −ε′0 = 0. Thus the transformation law (7.85) for the polarisation
vector of a massless spin-1 particle requires the existence of the gauge symmetry (7.86). The
gauge symmetry in turn implies that the massless spin-1 particle couples only to conserved
currents.


Lagrange density The free field equation is

∂µ F µν = 0 . (7.89)

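In order to find $\mathscr{L}$, we multiply by a variation $\delta A_\nu$ that vanishes on the boundary $\partial\Omega$. Then
we integrate over $\Omega = V \times [t_a, t_b]$, and perform a partial integration,
$$\int_\Omega d^4x\, \partial_\mu F^{\mu\nu}\,\delta A_\nu = -\int_\Omega d^4x\, F^{\mu\nu}\,\delta(\partial_\mu A_\nu) = 0 . \tag{7.90}$$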

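Next we note that

$$(A_{\alpha,\beta} - A_{\beta,\alpha})(A^{\alpha,\beta} - A^{\beta,\alpha}) = 2(A_{\alpha,\beta} - A_{\beta,\alpha})A^{\alpha,\beta} \tag{7.91}$$

and thus
$$F^{\mu\nu}\,\delta(\partial_\mu A_\nu) = \frac{1}{2} F^{\mu\nu}\,\delta F_{\mu\nu} . \tag{7.92}$$
Applying the product rule, we obtain as final result
$$-\frac{1}{4}\,\delta\int_\Omega d^4x\, F_{\mu\nu}F^{\mu\nu} = 0 \tag{7.93}$$

and
$$\mathscr{L} = -\frac{1}{4} F_{\mu\nu}F^{\mu\nu} . \tag{7.94}$$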
Note that we expressed $\mathscr{L}$ through $F$, but $\mathscr{L}$ should nevertheless be viewed as a function of $A$:
We are varying the action with respect to $A_\mu$, giving us a second-order (wave) equation.
This is in accordance with the fact that Aµ determines the interaction (7.60) with charged
particles.

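Stress tensor According to Eq. (7.24) we have

$$\Theta_\mu{}^\nu = \frac{\partial A_\sigma}{\partial x^\mu}\,\frac{\partial\mathscr{L}}{\partial(\partial A_\sigma/\partial x^\nu)} - \delta_\mu^\nu\,\mathscr{L} . \tag{7.95}$$

Since $\mathscr{L}$ depends only on the derivatives $A_{\mu,\nu}$, we can use the following short-cut: We know
already that
$$\delta\mathscr{L} = -\frac{1}{4}\,\delta(F_{\mu\nu}F^{\mu\nu}) = F^{\mu\nu}\,\delta(\partial_\nu A_\mu) . \tag{7.96}$$
Thus
$$\frac{\partial\mathscr{L}}{\partial(\partial A_\sigma/\partial x^\nu)} = F^{\sigma\nu} = -F^{\nu\sigma} \tag{7.97}$$
and
$$\Theta_\mu{}^\nu = -\frac{\partial A_\sigma}{\partial x^\mu}\,F^{\nu\sigma} + \frac{1}{4}\,\delta_\mu^\nu\,F_{\sigma\tau}F^{\sigma\tau} . \tag{7.98}$$
Raising the index $\mu$ and rearranging $\sigma$, we have

$$\Theta^{\mu\nu} = -\frac{\partial A^\sigma}{\partial x_\mu}\,F^\nu{}_\sigma + \frac{1}{4}\,\eta^{\mu\nu}F_{\sigma\tau}F^{\sigma\tau} . \tag{7.99}$$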


This result is neither gauge invariant (it contains $A$) nor symmetric. To symmetrise it, we
should add
$$\frac{\partial A^\mu}{\partial x_\sigma}\,F^\nu{}_\sigma = \frac{\partial}{\partial x_\sigma}\left(A^\mu F^\nu{}_\sigma\right) . \tag{7.100}$$
The last step is possible for a free electromagnetic field, $\partial_\sigma F^{\nu\sigma} = 0$, and shows that we are
allowed to add the LHS. Then the two terms combine to $F$, and we get
$$\Theta^{\mu\nu} = -F^{\mu\sigma}F^\nu{}_\sigma + \frac{1}{4}\,\eta^{\mu\nu}F_{\sigma\tau}F^{\sigma\tau} . \tag{7.101}$$
In this form, the stress tensor is symmetric and gauge invariant. We can thus identify the
expression (7.101) with the dynamical stress tensor, $\Theta^{\mu\nu} = T^{\mu\nu}$. Note that its trace is zero,
$T^\mu{}_\mu = 0$.
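The vanishing trace can be checked in one line. This is a sketch using only $\eta^{\mu\nu}\eta_{\mu\nu} = \delta^\mu_\mu = 4$ in four space-time dimensions:

```latex
\begin{equation}
T^\mu{}_\mu = -F^{\mu\sigma}F_{\mu\sigma} + \frac{1}{4}\,\delta^\mu_\mu\,F_{\sigma\tau}F^{\sigma\tau}
            = -F_{\sigma\tau}F^{\sigma\tau} + F_{\sigma\tau}F^{\sigma\tau} = 0 .
\end{equation}
```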

8 Einstein’s field equation
Up to now, we have investigated the behaviour of test-particles and light-rays in a given
curved spacetime determined by the metric tensor gµν . The transition from point mechanics
to field theory means that the role of the mass m as the source of gravity should be taken by
the mass density ρ, or in the relativistic case, by the stress tensor Tµν . Thus we expect field
equations of the type Gµν = κTµν , where κ is proportional to Newton’s constant G and Gµν
is a function of gµν and its derivatives.

Riemann tensor via area... analogue to non-abelian field-strength tensor.. equation of geodesic
deviation

8.1 Curvature and the Riemann tensor


We are looking for an invariant characterisation of a manifold curved by gravity. As the
discussion of normal coordinates showed, the first derivatives of the metric can always be
chosen to be zero at one point. Hence this quantity will contain second derivatives of the
metric, i.e. first derivatives of the Christoffel symbols.
The commutator of covariant derivatives will in general not vanish,
$$(\nabla_\alpha\nabla_\beta - \nabla_\beta\nabla_\alpha)\,T^{\mu\ldots}_{\nu\ldots} = [\nabla_\alpha,\nabla_\beta]\,T^{\mu\ldots}_{\nu\ldots} \neq 0 \tag{8.1}$$
(think of the parallel transport on a sphere from A first along $e_a$, then along $e_b$ to B and then back to A
along $-e_a$ and $-e_b$). It is obviously a tensor and contains second derivatives of the
metric. The statement $[\nabla_\alpha,\nabla_\beta]\,T^{\mu\ldots}_{\nu\ldots} = 0$ is coordinate independent, and can thus be used to
characterise in an invariant way whether a manifold is flat.
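For the special case of a vector $V^\alpha$ we obtain with

$$\nabla_\rho V^\alpha = \partial_\rho V^\alpha + \Gamma^\alpha_{\beta\rho}V^\beta \tag{8.2}$$

first

$$\nabla_\sigma\nabla_\rho V^\alpha = \partial_\sigma(\partial_\rho V^\alpha + \Gamma^\alpha_{\beta\rho}V^\beta) + \Gamma^\alpha_{\kappa\sigma}(\partial_\rho V^\kappa + \Gamma^\kappa_{\beta\rho}V^\beta) - \Gamma^\kappa_{\rho\sigma}(\partial_\kappa V^\alpha + \Gamma^\alpha_{\beta\kappa}V^\beta). \tag{8.3}$$

The second part of the commutator follows by relabelling $\sigma \leftrightarrow \rho$ as

$$\nabla_\rho\nabla_\sigma V^\alpha = \partial_\rho(\partial_\sigma V^\alpha + \Gamma^\alpha_{\beta\sigma}V^\beta) + \Gamma^\alpha_{\kappa\rho}(\partial_\sigma V^\kappa + \Gamma^\kappa_{\beta\sigma}V^\beta) - \Gamma^\kappa_{\sigma\rho}(\partial_\kappa V^\alpha + \Gamma^\alpha_{\beta\kappa}V^\beta). \tag{8.4}$$

Now we subtract the two equations using that $\partial_\rho\partial_\sigma = \partial_\sigma\partial_\rho$ and $\Gamma^\alpha_{\beta\rho} = \Gamma^\alpha_{\rho\beta}$,

$$[\nabla_\rho,\nabla_\sigma]V^\alpha = \left[\partial_\rho\Gamma^\alpha_{\beta\sigma} - \partial_\sigma\Gamma^\alpha_{\beta\rho} + \Gamma^\alpha_{\kappa\rho}\Gamma^\kappa_{\beta\sigma} - \Gamma^\alpha_{\kappa\sigma}\Gamma^\kappa_{\beta\rho}\right]V^\beta \equiv R^\alpha{}_{\beta\rho\sigma}V^\beta . \tag{8.5}$$

The tensor $R^\alpha{}_{\beta\rho\sigma}$ is called Riemann or curvature tensor.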


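Its symmetry properties imply that we can construct out of the Riemann tensor only one
non-zero tensor of rank two, contracting $\alpha$ either with the third or fourth index, $R^\rho{}_{\alpha\rho\beta} = -R^\rho{}_{\alpha\beta\rho}$. We define the Ricci tensor by

$$R_{\alpha\beta} = R^\rho{}_{\alpha\rho\beta} = -R^\rho{}_{\alpha\beta\rho} = \partial_\rho\Gamma^\rho_{\alpha\beta} - \partial_\beta\Gamma^\rho_{\alpha\rho} + \Gamma^\rho_{\alpha\beta}\Gamma^\sigma_{\rho\sigma} - \Gamma^\sigma_{\beta\rho}\Gamma^\rho_{\alpha\sigma} . \tag{8.6}$$

A further contraction gives the curvature scalar,

$$R = R_{\alpha\beta}\,g^{\alpha\beta} . \tag{8.7}$$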

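Symmetry properties Inserting the definition of the Christoffel symbols and using normal
coordinates, the Riemann tensor defined in Eq. (8.5) becomes
$$R_{\alpha\beta\rho\sigma} = \frac{1}{2}\left\{\partial_\rho\partial_\beta g_{\alpha\sigma} + \partial_\sigma\partial_\alpha g_{\beta\rho} - \partial_\rho\partial_\alpha g_{\beta\sigma} - \partial_\sigma\partial_\beta g_{\alpha\rho}\right\} . \tag{8.8}$$
The tensor is antisymmetric in the indices $\rho \leftrightarrow \sigma$, antisymmetric in $\alpha \leftrightarrow \beta$ and symmetric
against an exchange of the index pairs $(\alpha\beta) \leftrightarrow (\rho\sigma)$. Moreover, there exists one algebraic
identity,
$$R_{\alpha\beta\rho\sigma} + R_{\alpha\sigma\beta\rho} + R_{\alpha\rho\sigma\beta} = 0 . \tag{8.9}$$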
Since each pair of indices $(\alpha\beta)$ and $(\rho\sigma)$ can take six values, we can combine the
antisymmetrized components of $R_{[\alpha\beta][\rho\sigma]}$ in a symmetric six-dimensional matrix. The number of
independent components of this matrix is thus for $d = 4$ space-time dimensions

$$\frac{n(n+1)}{2} - 1 = \frac{6\times 7}{2} - 1 = 20 ,$$
where we accounted also for the constraint (8.9). In general, the number $n$ of independent
components is in $d$ space-time dimensions given by $n = d^2(d^2-1)/12$, while the number $m$
of field equations is $m = d(d+1)/2$. Thus we find

    d    1    2    3    4
    n    0    1    6   20
    m    -    3    6   10

This implies that a one-dimensional manifold is always flat (ask yourself why). Moreover,
the number of independent components of the Riemann tensor is smaller than or equal to the number
of field equations for $d = 2$ and $d = 3$. Hence the Riemann tensor vanishes in empty space, if
$d = 2, 3$. Starting from $d = 4$, already an empty space can be curved and gravitational waves
exist.
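The counting in the table can be reproduced with a few lines of Python. This is only an illustrative sketch; the function names `riemann_components` and `field_equations` are made up for this example and simply evaluate the two formulas quoted above.

```python
def riemann_components(d: int) -> int:
    """Independent components of the Riemann tensor, n = d^2 (d^2 - 1) / 12."""
    return d * d * (d * d - 1) // 12


def field_equations(d: int) -> int:
    """Independent components of a symmetric rank-2 tensor, m = d (d + 1) / 2."""
    return d * (d + 1) // 2


# Tabulate n and m for d = 1, ..., 4 space-time dimensions.
for d in range(1, 5):
    print(f"d = {d}: n = {riemann_components(d)}, m = {field_equations(d)}")
```

For $d = 2$ and $d = 3$ one finds $n \leq m$, while $d = 4$ is the first case with $n > m$.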
The Bianchi identity is a differential constraint,

$$\nabla_\kappa R_{\alpha\beta\rho\sigma} + \nabla_\rho R_{\alpha\beta\sigma\kappa} + \nabla_\sigma R_{\alpha\beta\kappa\rho} = 0 , \tag{8.10}$$

which is again checked most simply using normal coordinates. In the context of general
relativity, the Bianchi identities are a necessary consequence of the Einstein-Hilbert action and the
requirement of general covariance.


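Example: Sphere $S^2$. Calculate the Ricci tensor $R_{ij}$ and the scalar curvature $R$ of the
two-dimensional unit sphere $S^2$.
We have already determined the non-vanishing Christoffel symbols of the sphere $S^2$ as $\Gamma^\phi_{\vartheta\phi} = \Gamma^\phi_{\phi\vartheta} = \cot\vartheta$
and $\Gamma^\vartheta_{\phi\phi} = -\cos\vartheta\sin\vartheta$. We will show later that the Ricci tensor of a maximally symmetric
space such as a sphere satisfies $R_{ab} = K g_{ab}$. Since the metric is diagonal, the non-diagonal elements of the
Ricci tensor are zero too, $R_{\phi\vartheta} = R_{\vartheta\phi} = 0$. We calculate with
$$R_{ab} = R^c{}_{acb} = \partial_c\Gamma^c_{ab} - \partial_b\Gamma^c_{ac} + \Gamma^c_{ab}\Gamma^d_{cd} - \Gamma^d_{bc}\Gamma^c_{ad}$$
the $\vartheta\vartheta$ component, obtaining
$$R_{\vartheta\vartheta} = 0 - \partial_\vartheta(\Gamma^\phi_{\vartheta\phi} + \Gamma^\vartheta_{\vartheta\vartheta}) + 0 - \Gamma^d_{\vartheta c}\Gamma^c_{\vartheta d} = 0 - \partial_\vartheta\cot\vartheta - \Gamma^\phi_{\vartheta\phi}\Gamma^\phi_{\vartheta\phi} = \frac{1}{\sin^2\vartheta} - \cot^2\vartheta = 1 .$$

From $R_{ab} = K g_{ab}$, we find $R_{\vartheta\vartheta} = K g_{\vartheta\vartheta}$ and thus $K = 1$. Hence $R_{\phi\phi} = g_{\phi\phi} = \sin^2\vartheta$.
The scalar curvature is (diagonal metric with $g^{\phi\phi} = 1/\sin^2\vartheta$ and $g^{\vartheta\vartheta} = 1$)
$$R = g^{ab}R_{ab} = g^{\phi\phi}R_{\phi\phi} + g^{\vartheta\vartheta}R_{\vartheta\vartheta} = \frac{1}{\sin^2\vartheta}\,\sin^2\vartheta + 1\times 1 = 2 .$$
Note that our definition of the Ricci tensor guarantees that the curvature of a sphere is also positive,
if we consider it as a subspace of a four-dimensional space-time.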
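The result of this example can be cross-checked symbolically. The following is a minimal sketch, assuming the SymPy library is available; the helper names `christoffel` and `ricci` are chosen for this illustration only, and the sign and index conventions are those of the formula for $R_{ab}$ used in the example.

```python
import sympy as sp

theta, phi = sp.symbols("theta phi", positive=True)
coords = [theta, phi]
n = len(coords)

# Metric of the unit sphere, ds^2 = d theta^2 + sin^2(theta) d phi^2
g = sp.Matrix([[1, 0], [0, sp.sin(theta) ** 2]])
g_inv = g.inv()


def christoffel(a, b, c):
    """Gamma^a_{bc} = (1/2) g^{ad} (d_c g_{db} + d_b g_{dc} - d_d g_{bc})."""
    return sp.Rational(1, 2) * sum(
        g_inv[a, d]
        * (sp.diff(g[d, b], coords[c]) + sp.diff(g[d, c], coords[b]) - sp.diff(g[b, c], coords[d]))
        for d in range(n)
    )


# Gamma[a][b][c] holds Gamma^a_{bc}
Gamma = [[[sp.simplify(christoffel(a, b, c)) for c in range(n)] for b in range(n)] for a in range(n)]


def ricci(a, b):
    """R_ab = d_c Gamma^c_ab - d_b Gamma^c_ac + Gamma^c_ab Gamma^d_cd - Gamma^d_bc Gamma^c_ad."""
    expr = 0
    for c in range(n):
        expr += sp.diff(Gamma[c][a][b], coords[c]) - sp.diff(Gamma[c][a][c], coords[b])
        for d in range(n):
            expr += Gamma[c][a][b] * Gamma[d][c][d] - Gamma[d][b][c] * Gamma[c][a][d]
    return sp.simplify(expr)


Ricci = sp.Matrix([[ricci(a, b) for b in range(n)] for a in range(n)])
R_scalar = sp.simplify(sum(g_inv[a, b] * Ricci[a, b] for a in range(n) for b in range(n)))

print(Ricci)     # Ricci tensor in the (theta, phi) basis
print(R_scalar)  # scalar curvature
```

The same two helper functions work for any metric given as a SymPy matrix, so the sketch can be reused to check other examples by replacing `coords` and `g`.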

8.2 Integration, metric determinant g, and differential operators


In special relativity, Lorentz transformations left the volume element $d^4x$ invariant,
$d^4x' = dt'\,d^2x'_\perp\,dx'_\parallel = (\gamma\,dt)\,d^2x_\perp\,(dx_\parallel/\gamma) = d^4x$. We allow now for arbitrary coordinate
transformations for which the Jacobi determinant can deviate from one. Thus the action of a field with
Lagrange density $\mathscr{L}'$ becomes
$$S = \int_\Omega d^4x\,\sqrt{|g|}\,\mathscr{L}' = \int_\Omega d^4x\,\mathscr{L} , \tag{8.11}$$

where $g$ denotes the determinant of the metric tensor $g_{\mu\nu}$. Often, as in the second step, we
prefer to include the factor $\sqrt{|g|}$ into the definition of $\mathscr{L}$. In order to find the equations of
motion, we have to determine the variation of the metric determinant $g$.
In Lm , the effects of gravity are accounted for by the replacements {∂a , ηab } → {∇a , gab }.
Note that this transition is not unique: For instance, in the case of a scalar field we can add
a term ξR2 φ2 to the usual Lagrangian. Since this term vanishes in Minkowski space, we have
no way to determine the value of ξ from experiments in flat space.

Variation of the metric determinant g We consider a variation of a matrix $M$ with elements
$m_{ij}(x)$ under an infinitesimal change of the coordinates, $\delta x^a = \varepsilon x^a$,
$$\delta\ln\det M \equiv \ln\det(M + \delta M) - \ln\det(M) \tag{8.12a}$$
$$= \ln\det[M^{-1}(M + \delta M)] = \ln\det[1 + M^{-1}\delta M] \tag{8.12b}$$
$$= \ln[1 + {\rm tr}(M^{-1}\delta M)] + \mathcal{O}(\varepsilon^2) = {\rm tr}(M^{-1}\delta M) + \mathcal{O}(\varepsilon^2). \tag{8.12c}$$
In the last step, we used $\ln(1+\varepsilon) = \varepsilon + \mathcal{O}(\varepsilon^2)$. Expressing now both the LHS and the RHS
as $\delta f = \partial_\mu f\,\delta x^\mu$ and comparing then the coefficients of $\delta x^\mu$ gives
$$\partial_\mu\ln\det M = {\rm tr}(M^{-1}\partial_\mu M). \tag{8.13}$$
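The first-order relation $\delta\ln\det M = {\rm tr}(M^{-1}\delta M)$ can be checked numerically. The sketch below uses only the Python standard library and an arbitrarily chosen $2\times 2$ example; the matrix entries and the helper names are assumptions made for this illustration.

```python
import math


def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]


def trace_of_product(a, b):
    """tr(a b) for 2x2 matrices."""
    return sum(a[i][j] * b[j][i] for i in range(2) for j in range(2))


# Arbitrary symmetric matrix with positive determinant and a small variation.
M = [[2.0, 0.3], [0.3, 1.5]]
eps = 1e-6
dM = [[eps * 1.0, eps * 0.5], [eps * 0.5, eps * (-2.0)]]

M_new = [[M[i][j] + dM[i][j] for j in range(2)] for i in range(2)]
lhs = math.log(det2(M_new)) - math.log(det2(M))  # delta ln det M
rhs = trace_of_product(inv2(M), dM)              # tr(M^-1 delta M)

print(lhs, rhs, abs(lhs - rhs))  # the difference is of order eps^2
```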


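Useful formula for derivatives Applied to derivatives of $\sqrt{|g|}$, we obtain
$$\frac{1}{2}\,g^{\mu\nu}\partial_\lambda g_{\mu\nu} = \frac{1}{2}\,\partial_\lambda\ln|g| = \frac{1}{\sqrt{|g|}}\,\partial_\lambda(\sqrt{|g|}), \tag{8.14}$$
while we find for contracted Christoffel symbols
$$\Gamma^\mu_{\mu\nu} = \frac{1}{2}\,g^{\mu\kappa}(\partial_\mu g_{\kappa\nu} + \partial_\nu g_{\mu\kappa} - \partial_\kappa g_{\mu\nu}) = \frac{1}{2}\,g^{\mu\kappa}\partial_\nu g_{\mu\kappa} = \frac{1}{2}\,\partial_\nu\ln|g| = \frac{1}{\sqrt{|g|}}\,\partial_\nu(\sqrt{|g|}). \tag{8.15}$$
Next we consider the divergence of a vector field,
$$\nabla_\mu V^\mu = \partial_\mu V^\mu + \Gamma^\mu_{\lambda\mu}V^\lambda = \partial_\mu V^\mu + \frac{1}{\sqrt{|g|}}\,(\partial_\mu\sqrt{|g|})\,V^\mu = \frac{1}{\sqrt{|g|}}\,\partial_\mu(\sqrt{|g|}\,V^\mu), \tag{8.16}$$
and of antisymmetric tensors of rank 2,
$$\nabla_\mu A^{\mu\nu} = \partial_\mu A^{\mu\nu} + \Gamma^\mu_{\lambda\mu}A^{\lambda\nu} + \Gamma^\nu_{\lambda\mu}A^{\mu\lambda} = \frac{1}{\sqrt{|g|}}\,\partial_\mu(\sqrt{|g|}\,A^{\mu\nu}) . \tag{8.17}$$
In the latter case, the third term $\Gamma^\nu_{\lambda\mu}A^{\mu\lambda}$ vanishes because of the antisymmetry of $A^{\mu\lambda}$ so
that we could combine the first two as in the vector case. This generalises to completely
anti-symmetric tensors of all orders. For a symmetric tensor, we find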
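$$\nabla_\mu S^{\mu\nu} = \partial_\mu S^{\mu\nu} + \Gamma^\mu_{\lambda\mu}S^{\lambda\nu} + \Gamma^\nu_{\lambda\mu}S^{\mu\lambda} \tag{8.18}$$
$$= \frac{1}{\sqrt{|g|}}\,\partial_\mu(\sqrt{|g|}\,S^{\mu\nu}) + \Gamma^\nu_{\lambda\mu}S^{\mu\lambda} . \tag{8.19}$$
Here the second term does not vanish. For the mixed components $S^\mu{}_\nu$, however, we can express the
remaining Christoffel symbols as derivatives of the metric tensor,
$$\nabla_\mu S^\mu{}_\nu = \frac{1}{\sqrt{|g|}}\,\partial_\mu(\sqrt{|g|}\,S^\mu{}_\nu) - \frac{1}{2}\,(\partial_\nu g_{\lambda\mu})\,S^{\lambda\mu} .$$
Thus we can perform the covariant derivative of $S^\mu{}_\nu$ without the need to know the Christoffel
symbols.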
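Example: Spherical coordinates 3:
Calculate for spherical coordinates $x = (r, \vartheta, \phi)$ in $\mathbb{R}^3$ the gradient, divergence, and the Laplace
operator. Note that one normally uses normalised unit vectors in the case of a diagonal metric: this
corresponds to a rescaling of vector components $V^i \to V^i/\sqrt{g_{ii}}$ (no summation in $i$) or basis vectors.
(Recall the analogous rescaling in the exercise “acceleration of a stationary observer in SW BH”.)
We express the gradient of a scalar function $f$ first as
$$\partial^i f\,e_i = g^{ij}\,\frac{\partial f}{\partial x^j}\,e_i = \frac{\partial f}{\partial r}\,e_r + \frac{1}{r^2}\frac{\partial f}{\partial\vartheta}\,e_\vartheta + \frac{1}{r^2\sin^2\vartheta}\frac{\partial f}{\partial\phi}\,e_\phi$$
and rescale then the basis, $e^*_i = e_i/\sqrt{g_{ii}}$, or $e^*_r = e_r$, $e^*_\vartheta = e_\vartheta/r$, and $e^*_\phi = e_\phi/(r\sin\vartheta)$. In this new
(“physical”) basis, the gradient is given by
$$\partial^i f\,e_i = \frac{\partial f}{\partial r}\,e^*_r + \frac{1}{r}\frac{\partial f}{\partial\vartheta}\,e^*_\vartheta + \frac{1}{r\sin\vartheta}\frac{\partial f}{\partial\phi}\,e^*_\phi .$$
The covariant divergence of a vector field with rescaled components $X^i = X_i/\sqrt{g_{ii}}$, i.e. $X^r = X_r$, $X^\vartheta = X_\vartheta/r$
and $X^\phi = X_\phi/(r\sin\vartheta)$, is with $\sqrt{g} = r^2\sin\vartheta$ given by
$$\nabla_i X^i = \frac{1}{\sqrt{|g|}}\,\partial_i(\sqrt{|g|}\,X^i) = \frac{1}{r^2\sin\vartheta}\left[\frac{\partial(r^2\sin\vartheta\,X_r)}{\partial r} + \frac{\partial(r\sin\vartheta\,X_\vartheta)}{\partial\vartheta} + \frac{\partial(r\,X_\phi)}{\partial\phi}\right]$$
$$= \frac{1}{r^2}\frac{\partial(r^2 X_r)}{\partial r} + \frac{1}{r\sin\vartheta}\frac{\partial(\sin\vartheta\,X_\vartheta)}{\partial\vartheta} + \frac{1}{r\sin\vartheta}\frac{\partial X_\phi}{\partial\phi}$$
$$= \left(\frac{\partial}{\partial r} + \frac{2}{r}\right)X_r + \left(\frac{1}{r}\frac{\partial}{\partial\vartheta} + \frac{\cot\vartheta}{r}\right)X_\vartheta + \frac{1}{r\sin\vartheta}\frac{\partial X_\phi}{\partial\phi} .$$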
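The Laplace operator asked for in the example follows by applying the divergence formula (8.16) to the gradient $V^i = g^{ij}\partial_j f$. A sketch of this remaining step, using only $\sqrt{|g|} = r^2\sin\vartheta$ and the diagonal inverse metric $g^{ij} = {\rm diag}(1, 1/r^2, 1/(r^2\sin^2\vartheta))$:

```latex
\begin{align}
\Delta f = \nabla_i\nabla^i f
  &= \frac{1}{\sqrt{|g|}}\,\partial_i\!\left(\sqrt{|g|}\,g^{ij}\partial_j f\right) \\
  &= \frac{1}{r^2}\frac{\partial}{\partial r}\!\left(r^2\frac{\partial f}{\partial r}\right)
   + \frac{1}{r^2\sin\vartheta}\frac{\partial}{\partial\vartheta}\!\left(\sin\vartheta\,\frac{\partial f}{\partial\vartheta}\right)
   + \frac{1}{r^2\sin^2\vartheta}\frac{\partial^2 f}{\partial\phi^2} .
\end{align}
```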


Global conservation laws An immediate consequence of Eq. (8.16) is a covariant form of
Gauß’ theorem for vector fields. In particular, we can conclude from local current
conservation, $\nabla_\mu j^\mu = 0$, the existence of a globally conserved charge. If the conserved current $j^\mu$
vanishes at infinity, then we obtain also in a general space-time
$$\int_\Omega d^4x\,\sqrt{|g|}\,\nabla_\mu j^\mu = \int_\Omega d^4x\,\partial_\mu(\sqrt{|g|}\,j^\mu) = \int_{\partial\Omega} dS_\mu\,\sqrt{|g|}\,j^\mu = 0 . \tag{8.20}$$

For a non-zero current, the volume integral over the charge density $j^0$ remains constant,
$$\int_\Omega d^4x\,\sqrt{|g|}\,\nabla_\mu j^\mu = \int_{V(t_2)} d^3x\,\sqrt{|g|}\,j^0 - \int_{V(t_1)} d^3x\,\sqrt{|g|}\,j^0 = 0 . \tag{8.21}$$

Thus the conservation of Noether charges of internal symmetries as the electric charge, baryon
number, etc., is not affected by an expanding universe.
Next we consider the stress tensor as an example of a locally conserved symmetric tensor of
rank two. Now, the second term in Eq. (8.19) prevents us from converting the local conservation
law into a global one. If the space-time admits however a Killing field $\xi$, then we can form
the vector field $P^\mu = T^{\mu\nu}\xi_\nu$ with

$$\nabla_\mu P^\mu = \nabla_\mu(T^{\mu\nu}\xi_\nu) = \xi_\nu\nabla_\mu T^{\mu\nu} + T^{\mu\nu}\nabla_\mu\xi_\nu = 0. \tag{8.22}$$

Here, the first term vanishes since $T^{\mu\nu}$ is conserved and the second because $T^{\mu\nu}$ is symmetric,
while $\nabla_\mu\xi_\nu$ is antisymmetric. Therefore the vector field $P^\mu = T^{\mu\nu}\xi_\nu$ is also conserved,
$\nabla_\mu P^\mu = 0$, and we thus obtain the conservation of the component of the energy-momentum vector in
the direction of $\xi$.
In summary, global energy conservation requires the existence of a time-like Killing vector
field. Moving along such a Killing field, the metric would be invariant. Since we expect in an
expanding universe a time-dependence of the metric, a time-like Killing vector field does not
exist and the energy contained in a “comoving” volume changes with time.

8.3 Einstein-Hilbert action


Einstein equation in vacuum Our main guide in choosing the appropriate action for the
gravitational field is simplicity. A Lagrange density has mass dimension four (or length$^{-4}$)
such that the action is dimensionless. In the case of gravity, we have to account for the
dimensionful coupling, Newton’s constant $G$, and require therefore that the Lagrange density
without coupling has mass dimension two. Among the possible terms we can select are
$$\begin{aligned}\mathscr{L} = \sqrt{|g|}\,\big\{&\Lambda + bR + c\,\nabla_a\nabla_b R^{ab} + d\,(\nabla_a\nabla_b R^{ab})^2 + \ldots \\ &+ f(R) + \ldots\big\}\end{aligned} \tag{8.23}$$

Note that the terms in the first line are ordered according to the number of derivatives: $\Lambda: \partial^0$,
$b: \partial^2$, $c: \partial^4$. Choosing only the first term, a constant, will not give dynamical equations.
The next simplest possibility is to pick out only the second term, as it was done originally


by Hilbert. The following $c$ term will be suppressed relative to $b$ for dimensional reasons as
$\nabla_a\nabla_b/M^2 \sim E^2/M^2$. Here, $E$ is the characteristic energy of the process considered, while
we expect $1/M^2 \sim G_N$ for a theory of gravity. Thus at low energies, the first two terms
should dominate the gravitational interactions. In contrast, a term like $f(R)$ in the second
line is a modification of the simple $R$ term; the allowed size of this modification has to be
constrained by experiments. We will see later that, if we do not include a constant term $\Lambda$
in the gravitational action, it will pop up on the matter side. Thus we add $\Lambda$ right from the
start and define as the Einstein-Hilbert Lagrange density for the gravitational field
$$\mathscr{L}_{EH} = -\sqrt{|g|}\,(R + 2\Lambda) . \tag{8.24}$$

The Lagrangian is a function of the metric, its first and second derivatives,$^1$
$\mathscr{L}_{EH}(g_{\mu\nu}, \partial_\rho g_{\mu\nu}, \partial_\rho\partial_\sigma g_{\mu\nu})$. The resulting action
$$S_{EH}[g_{\mu\nu}] = -\int_\Omega d^4x\,\sqrt{|g|}\,\{R + 2\Lambda\} \tag{8.25}$$

is a functional of the metric tensor $g_{\mu\nu}$, and a variation of the action with respect to the metric
gives the field equations for the gravitational field. We allow for variations of the metric $g_{\mu\nu}$
restricted by the condition that the variation of $g_{\mu\nu}$ and its first derivatives vanish on the
boundary $\partial\Omega$. Requiring that the variation is zero, we obtain
$$0 = \delta S_{EH} = -\delta\int_\Omega d^4x\,\sqrt{|g|}\,(R + 2\Lambda) \tag{8.26a}$$
$$= -\delta\int_\Omega d^4x\,\sqrt{|g|}\,(g^{\mu\nu}R_{\mu\nu} + 2\Lambda) \tag{8.26b}$$
$$= -\int_\Omega d^4x\,\left\{\sqrt{|g|}\,g^{\mu\nu}\delta R_{\mu\nu} + \sqrt{|g|}\,R_{\mu\nu}\,\delta g^{\mu\nu} + (R + 2\Lambda)\,\delta\sqrt{|g|}\right\} . \tag{8.26c}$$
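Our task is to rewrite the first and third term in terms of the variation $\delta g^{\mu\nu}$ or to show that they are
equivalent to boundary terms. Let us start with the first term. Choosing inertial coordinates,
the Ricci tensor at the considered point $P$ becomes

$$R_{\mu\nu} = \partial_\rho\Gamma^\rho_{\mu\nu} - \partial_\nu\Gamma^\rho_{\mu\rho} . \tag{8.27}$$

Hence
$$g^{\mu\nu}\delta R_{\mu\nu} = g^{\mu\nu}(\partial_\rho\delta\Gamma^\rho_{\mu\nu} - \partial_\nu\delta\Gamma^\rho_{\mu\rho}) = g^{\mu\nu}\partial_\rho\delta\Gamma^\rho_{\mu\nu} - g^{\mu\rho}\partial_\rho\delta\Gamma^\nu_{\mu\nu} , \tag{8.28}$$
where we exchanged the indices $\nu$ and $\rho$ in the last term. Since $\partial_\rho g_{\mu\nu} = 0$ at $P$, we can
rewrite the expression as

$$g^{\mu\nu}\delta R_{\mu\nu} = \partial_\rho(g^{\mu\nu}\delta\Gamma^\rho_{\mu\nu} - g^{\mu\rho}\delta\Gamma^\nu_{\mu\nu}) = \partial_\rho V^\rho . \tag{8.29}$$

The quantity $V^\rho$ is a vector, since the difference of two connection coefficients transforms as a
tensor. Replacing in Eq. (8.29) the partial derivative by a covariant one therefore promotes it
to a valid tensor equation,
$$g^{\mu\nu}\delta R_{\mu\nu} = \nabla_\mu V^\mu = \frac{1}{\sqrt{|g|}}\,\partial_\mu(\sqrt{|g|}\,V^\mu). \tag{8.30}$$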
$^1$ Recall that the Lagrange equations are modified in the case of higher derivatives, which is one reason why
we directly vary the action in order to obtain the field equations.


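Thus this term corresponds to a surface term which we assume to vanish. Next we rewrite
the third term using
$$\delta\sqrt{|g|} = \frac{1}{2\sqrt{|g|}}\,\delta|g| = \frac{1}{2}\sqrt{|g|}\,g^{\mu\nu}\delta g_{\mu\nu} = -\frac{1}{2}\sqrt{|g|}\,g_{\mu\nu}\,\delta g^{\mu\nu} \tag{8.31}$$
and obtain
$$\delta S_{EH} = -\int_\Omega d^4x\,\sqrt{|g|}\left\{R_{\mu\nu} - \frac{1}{2}\,g_{\mu\nu}R - \Lambda\,g_{\mu\nu}\right\}\delta g^{\mu\nu} = 0. \tag{8.32}$$
Hence the metric fulfils in vacuum the equation

$$-\frac{1}{\sqrt{|g|}}\frac{\delta S_{EH}}{\delta g^{\mu\nu}} = R_{\mu\nu} - \frac{1}{2}\,R\,g_{\mu\nu} - \Lambda g_{\mu\nu} \equiv G_{\mu\nu} - \Lambda g_{\mu\nu} = 0, \tag{8.33}$$

where we introduced the Einstein tensor $G_{\mu\nu}$. The constant $\Lambda$ is called the cosmological
constant. It has the dimension of an inverse length squared: If the cosmological constant is non-zero,
empty space is curved with a curvature radius $\Lambda^{-1/2}$.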

Einstein equation with matter We consider now the combined action of gravity and matter,
as the sum of the Einstein-Hilbert Lagrange density $\mathscr{L}_{EH}/2\kappa$ and the Lagrange density $\mathscr{L}_m$
including all relevant matter fields,
$$\mathscr{L} = \frac{1}{2\kappa}\,\mathscr{L}_{EH} + \mathscr{L}_m = -\frac{1}{2\kappa}\sqrt{|g|}\,(R + 2\Lambda) + \mathscr{L}_m . \tag{8.34}$$
In $\mathscr{L}_m$, the effects of gravity are accounted for by the replacements $\{\partial_\mu, \eta_{\mu\nu}\} \to \{\nabla_\mu, g_{\mu\nu}\}$,
while we have to adjust later the constant $\kappa$ such that we reproduce Newtonian dynamics
in the weak-field limit. We expect that the source of the gravitational field is the
energy-momentum tensor. More precisely, the Einstein tensor (“geometry”) should be determined
by the matter, $G_{\mu\nu} = \kappa T_{\mu\nu}$. Since we know already the result of the variation of $S_{EH}$, we
conclude that the variation of $S_m$ should give

$$\frac{2}{\sqrt{|g|}}\frac{\delta S_m}{\delta g^{\mu\nu}} = T_{\mu\nu} . \tag{8.35}$$

The tensor $T_{\mu\nu}$ defined by this equation is called the dynamical energy-momentum stress tensor.
In order to show that this definition makes sense, we have to prove that $\nabla^\mu T_{\mu\nu} = 0$ and
we have to convince ourselves that this definition reproduces the standard results we know
already. Einstein’s field equation follows then as

$$G_{\mu\nu} - \Lambda g_{\mu\nu} = \kappa T_{\mu\nu} . \tag{8.36}$$

Alternative form of the Einstein equation We can rewrite the Einstein equation such that
the only geometrical term on the LHS is the Ricci tensor. Because of
$$R^\mu{}_\mu - \frac{1}{2}\,g^\mu{}_\mu\,(R + 2\Lambda) = R - 2(R + 2\Lambda) = -R - 4\Lambda = \kappa T^\mu{}_\mu \tag{8.37}$$
we can perform with $T \equiv T^\mu{}_\mu$ the replacement $R = -4\Lambda - \kappa T$ in the Einstein equation and
obtain
$$R_{\mu\nu} = \kappa\left(T_{\mu\nu} - \frac{1}{2}\,g_{\mu\nu}T\right) - g_{\mu\nu}\Lambda . \tag{8.38}$$


This form of the Einstein equations is often useful, when it is easier to calculate T than R.
Note also that Eq. (8.38) informs us that an empty universe with Λ = 0 has a vanishing Ricci
tensor, Rµν = 0.
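The substitution leading to Eq. (8.38) can be spelled out as follows. This is a worked step that uses only Eq. (8.36), the definition of $G_{\mu\nu}$ and the trace relation $R = -4\Lambda - \kappa T$ from Eq. (8.37):

```latex
\begin{align}
R_{\mu\nu} &= \kappa T_{\mu\nu} + \frac{1}{2}\,g_{\mu\nu}R + \Lambda g_{\mu\nu} \\
           &= \kappa T_{\mu\nu} + \frac{1}{2}\,g_{\mu\nu}\left(-4\Lambda - \kappa T\right) + \Lambda g_{\mu\nu} \\
           &= \kappa\left(T_{\mu\nu} - \frac{1}{2}\,g_{\mu\nu}T\right) - \Lambda g_{\mu\nu} .
\end{align}
```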

8.4 Dynamical stress tensor


We start by proving that the dynamical stress tensor defined by Eq. (8.35) is conserved.
We consider the change of the matter action under variations of the metric,
$$\delta S_m = \frac{1}{2}\int_\Omega d^4x\,\sqrt{|g|}\,T_{\alpha\beta}\,\delta g^{\alpha\beta} = -\frac{1}{2}\int_\Omega d^4x\,\sqrt{|g|}\,T^{\alpha\beta}\,\delta g_{\alpha\beta} . \tag{8.39}$$
We allow infinitesimal but otherwise arbitrary coordinate transformations,
$$\tilde x^\alpha = x^\alpha + \xi^\alpha(x^\beta) . \tag{8.40}$$
For the resulting change in the metric $\delta g_{\alpha\beta}$ we can use the Killing Eq. (4.9),
$$\delta g_{\alpha\beta} = \nabla_\alpha\xi_\beta + \nabla_\beta\xi_\alpha . \tag{8.41}$$
We use that $T^{\alpha\beta}$ is symmetric and that general covariance guarantees that $\delta S_m = 0$ for a
coordinate transformation,
$$\delta S_m = -\int_\Omega d^4x\,\sqrt{|g|}\,T^{\alpha\beta}\,\nabla_\alpha\xi_\beta = 0 . \tag{8.42}$$

Next we apply the product rule,
$$\delta S_m = \int_\Omega d^4x\,\sqrt{|g|}\,(\nabla_\alpha T^{\alpha\beta})\,\xi_\beta - \int_\Omega d^4x\,\sqrt{|g|}\,\nabla_\alpha(T^{\alpha\beta}\xi_\beta) = 0 . \tag{8.43}$$

The second term is a four-divergence and thus a boundary term that we can neglect. The
remaining first term vanishes for arbitrary $\xi$, if the stress tensor is conserved,

$$\nabla_\alpha T^{\alpha\beta} = 0 . \tag{8.44}$$
Hence the local conservation of energy-momentum is a consequence of the general covariance
of the gravitational field equations, in the same way as current conservation follows from
gauge invariance in electromagnetism.
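We now evaluate the dynamical stress tensor for the examples of the Klein-Gordon and the
photon field. Note that the replacement $\eta_{\alpha\beta} \to g_{\alpha\beta}$ also requires that we express
summation indices as contractions with the metric tensor, i.e. we have to replace e.g. $A_\alpha B^\alpha$
by $g^{\alpha\beta}A_\alpha B_\beta$. Thus we rewrite Eq. (7.47) including a potential $V(\phi)$, that could be also a
mass term, $V(\phi) = m^2\phi^2/2$, as
$$\mathscr{L} = \frac{1}{2}\,g^{\alpha\beta}\nabla_\alpha\phi\nabla_\beta\phi - V(\phi) . \tag{8.45}$$
With $\nabla_\alpha\phi = \partial_\alpha\phi$ for a scalar field, the variation of the action gives
$$\delta S_{KG} = \frac{1}{2}\int_\Omega d^4x\,\left\{\sqrt{|g|}\,\nabla_\alpha\phi\nabla_\beta\phi\,\delta g^{\alpha\beta} + \left[g^{\alpha\beta}\nabla_\alpha\phi\nabla_\beta\phi - 2V(\phi)\right]\delta\sqrt{|g|}\right\}$$
$$= \int_\Omega d^4x\,\sqrt{|g|}\,\delta g^{\alpha\beta}\left\{\frac{1}{2}\,\nabla_\alpha\phi\nabla_\beta\phi - \frac{1}{2}\,g_{\alpha\beta}\,\mathscr{L}\right\} \tag{8.46}$$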


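and thus
$$T_{\alpha\beta} = \frac{2}{\sqrt{|g|}}\frac{\delta S_m}{\delta g^{\alpha\beta}} = \nabla_\alpha\phi\nabla_\beta\phi - g_{\alpha\beta}\,\mathscr{L} . \tag{8.47}$$
Next we consider the free electromagnetic action,
$$S_{em} = -\frac{1}{4}\int_\Omega d^4x\,\sqrt{|g|}\,F_{ab}F^{ab} = -\frac{1}{4}\int_\Omega d^4x\,\sqrt{|g|}\,g^{ac}g^{bd}F_{ab}F_{cd} . \tag{8.48}$$
Noting that $F_{\alpha\beta} = \nabla_\alpha A_\beta - \nabla_\beta A_\alpha = \partial_\alpha A_\beta - \partial_\beta A_\alpha$ does not depend on the metric, we obtain
$$\delta S_{em} = -\frac{1}{4}\int_\Omega d^4x\,\left\{(\delta\sqrt{|g|})\,F_{\rho\sigma}F^{\rho\sigma} + \sqrt{|g|}\,\delta(g^{\alpha\rho}g^{\beta\sigma})\,F_{\alpha\beta}F_{\rho\sigma}\right\} \tag{8.49a}$$
$$= -\frac{1}{4}\int_\Omega d^4x\,\sqrt{|g|}\,\delta g^{\alpha\beta}\left\{-\frac{1}{2}\,g_{\alpha\beta}F_{\rho\sigma}F^{\rho\sigma} + 2\,g^{\rho\sigma}F_{\alpha\rho}F_{\beta\sigma}\right\} . \tag{8.49b}$$
Hence the dynamical stress tensor is
$$T_{\alpha\beta} = -F_{\alpha\rho}F_\beta{}^\rho + \frac{1}{4}\,g_{\alpha\beta}F_{\rho\sigma}F^{\rho\sigma} . \tag{8.50}$$
Thus we reproduced in both cases the (symmetrised) canonical stress tensor.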

8.4.1 Cosmological constant


To understand better the meaning of the constant $\Lambda$, we ask now if one of the known
energy-momentum tensors could mimic a term $g_{\alpha\beta}\Lambda$. First we consider a scalar field. The constancy of
$\Lambda$ requires clearly $\nabla_\alpha\phi = 0$ and thus

$$T_{\alpha\beta} = g_{\alpha\beta}\,V_0(\phi) , \tag{8.51}$$

where $V_0$ is the minimum of the potential $V(\phi)$. Hence a scalar field with a non-zero minimum
of its potential acts as a cosmological constant.
Next we consider a perfect fluid described by the two parameters density $\rho$ and pressure
$P$. We know already that $T^{\alpha\beta} = {\rm diag}\{\rho, P, P, P\}$ for a perfect fluid in its rest frame. Hence
a fluid with $P = -\rho$, i.e. marginally fulfilling the null energy condition $\rho + P \geq 0$, has the same
property as a cosmological constant.
Is it possible to distinguish a term like $T_{ab} = g_{ab}V_0(\phi)$ in $S_m$ from a non-zero $\Lambda$ in $S_{EH}$? In
principle yes, since a cosmological constant fulfils $P = -\rho$ exactly and independently of all
external parameters like temperature or density. The latter change with time in the universe
and therefore there may be detectable differences to a fluid with $P = P(\rho, T, \ldots)$ and a scalar
field with potential $V = V(\rho, T, \ldots)$, even if they mimic today very well a cosmological
constant with $P = -\rho$.

8.4.2 Equations of motion


We show now that Einstein’s equations imply that particles move along geodesics. By analogy
with a pressureless fluid, $T^{\alpha\beta} = \rho u^\alpha u^\beta$, we postulate$^2$
$$T^{\alpha\beta}(\tilde x) = \frac{m}{\sqrt{|g|}}\int d\tau\,\frac{dx^\alpha}{d\tau}\frac{dx^\beta}{d\tau}\,\delta^{(4)}(\tilde x - x(\tau)) \tag{8.52}$$
$^2$ Note that the delta function is accompanied by a factor $1/\sqrt{|g|}$ such that $\int d^4x\,\sqrt{|g|}\,f(x)\,\delta(x - x_0)/\sqrt{|g|} = f(x_0)$.


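for a point-particle moving along $x(\tau)$ with proper time $\tau$. Inserting this into
$$\nabla_\alpha T^{\alpha\beta} = \partial_\alpha T^{\alpha\beta} + \Gamma^\alpha_{\sigma\alpha}T^{\sigma\beta} + \Gamma^\beta_{\sigma\alpha}T^{\alpha\sigma} = \frac{1}{\sqrt{|g|}}\,\partial_\alpha(\sqrt{|g|}\,T^{\alpha\beta}) + \Gamma^\beta_{\sigma\alpha}T^{\alpha\sigma} = 0 \tag{8.53}$$
gives
$$\int d\tau\,\dot x^\alpha\dot x^\beta\,\frac{\partial}{\partial\tilde x^\alpha}\,\delta^{(4)}(\tilde x - x(\tau)) + \Gamma^\beta_{\sigma\alpha}\int d\tau\,\dot x^\alpha\dot x^\sigma\,\delta^{(4)}(\tilde x - x(\tau)) = 0 . \tag{8.54}$$
We can replace $\partial/\partial\tilde x^\alpha = -\partial/\partial x^\alpha$ acting on $\delta^{(4)}(\tilde x - x(\tau))$ and use moreover
$$\dot x^\alpha\,\frac{\partial}{\partial x^\alpha}\,\delta^{(4)}(\tilde x - x(\tau)) = \frac{d}{d\tau}\,\delta^{(4)}(\tilde x - x(\tau)) \tag{8.55}$$
to obtain
$$-\int d\tau\,\dot x^\beta\,\frac{d}{d\tau}\,\delta^{(4)}(\tilde x - x(\tau)) + \Gamma^\beta_{\sigma\alpha}\int d\tau\,\dot x^\alpha\dot x^\sigma\,\delta^{(4)}(\tilde x - x(\tau)) = 0 . \tag{8.56}$$

Integrating the first term by parts we obtain
$$\int d\tau\,\left[\ddot x^\beta + \Gamma^\beta_{\sigma\alpha}\dot x^\alpha\dot x^\sigma\right]\delta^{(4)}(\tilde x - x(\tau)) = 0 . \tag{8.57}$$

The integral vanishes only when the world-line $x^\alpha(\tau)$ is a geodesic. Hence Einstein’s equations
imply already the equation of motion of a point particle, in contrast to Maxwell’s theory,
where the Lorentz force law has to be postulated separately.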

8.5 Alternative theories


The Einstein-Hilbert action (8.24) is most likely only the low-energy limit of either the “true”
action of gravity or of a unified theory of all interactions. It is therefore interesting to
examine modifications of the Einstein-Hilbert action and to compare their predictions to
observations.

Tensor-scalar theories The field equations for a purely scalar theory of gravity would be
$$\Box\phi = -4\pi G\,T^a{}_a . \tag{8.58}$$
It predicts no coupling between photons and gravitation, since $T^a{}_a = 0$ for the electromagnetic
field. A purely vector theory for gravity fails, since it predicts not attraction but repulsion
for two masses.
However, it may well be that gravity is a mixture of scalar, vector and tensor exchange,
dominated by the latter. An important example for a tensor-scalar theory is the Brans-Dicke
theory. Here one uses $g_{\mu\nu}$ to describe gravitational interactions but assumes that the strength,
$G$, is determined by a scalar field $\phi$,
$$S = \int d^4x\,\sqrt{|g|}\left\{-\frac{1}{2}\,\phi^2 R + \alpha(\partial_\mu\phi)^2 + \mathscr{L}_m(g_{\mu\nu}, \psi)\right\} , \tag{8.59}$$
where $\psi$ represents all matter fields. Rescaling the metric by
$$\tilde g_{\mu\nu} = g_{\mu\nu}\,\frac{\kappa}{\phi^2}$$
we are back to Einstein gravity, but now $\phi$ couples universally to all matter fields $\psi$.


f (R) gravity Another important class of modified gravity models are the so-called $f(R)$
gravity models, which generalise the Einstein–Hilbert action replacing $R$ by a general function
$f(R)$. Thus the action of $f(R)$ gravity coupled to matter has the form
$$S = \int d^4x\,\sqrt{|g|}\left\{-\frac{1}{2\tilde\kappa}\,f(R) + \mathscr{L}_m\right\} , \tag{8.60}$$

where the matter part $\mathscr{L}_m$ may contain both non-relativistic matter and radiation. Note that for $f(R) \neq R$, the
gravitational constant $\tilde\kappa = 8\pi\tilde G$ deviates from Newton’s constant $G$ measured in a Cavendish
experiment. The field equations can be derived from the action (8.60) either by a variation
w.r.t. the metric or the connection. The dynamics and the number of the resulting degrees
of freedom differ in the two treatments. Following the first approach, generalising our old
derivation one obtains
$$F(R)\,R_{\mu\nu} - \frac{1}{2}\,f(R)\,g_{\mu\nu} - \nabla_\mu\nabla_\nu F(R) + g_{\mu\nu}\,\Box F(R) = \kappa T_{\mu\nu} \tag{8.61}$$
with $F \equiv df/dR$. Taking the trace of this expression, we find

$$F(R)\,R - 2f(R) + 3\,\Box F(R) = \kappa T. \tag{8.62}$$

The term $\Box F(R)$ acts as a kinetic term so that these models contain an additional propagating
scalar degree of freedom, $\phi = F(R)$.

Extra dimensions and Kaluza-Klein theories String theory suggests that we live in a world
with $d = 10$ spacetime dimensions. There are two obvious answers to this result: first, one
may conclude that string theory is disproven by nature or, second, one may adjust reality.
Consistency of the second approach with experimental data could be achieved, if the $d - 4$
dimensions are compactified with a sufficiently small radius $R$, such that they are not visible
in experiments sensitive to wavelengths $\lambda \gg R$.
Let us check what happens to a scalar particle with mass $m$, if we add a fifth compact
dimension $y$. The Klein–Gordon equation for a scalar field $\phi(x^\mu, y)$ becomes

$$(\Box_5 + m^2)\,\phi(x^\mu, y) = 0 \tag{8.63}$$

with the five-dimensional d’Alembert operator $\Box_5 = \Box - \partial_y^2$. The equation can be separated,
$\phi(x^\mu, y) = \phi(x^\mu)f(y)$, and since the fifth dimension is compact, the spectrum of $f$ is discrete.
Assuming periodic boundary conditions, $f(y) = f(y + R)$, gives

$$\phi(x^\mu, y) = \phi(x^\mu)\cos(n\pi y/R). \tag{8.64}$$

The energy eigenvalues of these solutions are $\omega_{k,n}^2 = k^2 + m^2 + (n\pi/R)^2$. From a
four-dimensional point of view, the term $(n\pi/R)^2$ appears as a mass term, $m_n^2 = m^2 + (n\pi/R)^2$.
Since we usually consider states with different masses as different particles, we see the
five-dimensional particle as a tower of particles with mass $m_n$ but otherwise identical quantum
numbers. Such theories are called Kaluza–Klein theories, and the tower of particles Kaluza–
Klein particles. If $R \ll \lambda$, where $\lambda$ is the length-scale experimentally probed, only the $n = 0$
particle is visible and physics appears to be four-dimensional.
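The Kaluza–Klein tower can be tabulated directly from the mass formula $m_n^2 = m^2 + (n\pi/R)^2$ quoted above. The following is a small sketch in natural units; the function names and the numerical values of $m$ and $R$ are arbitrary choices made for illustration.

```python
import math


def kk_mass(m: float, n: int, radius: float) -> float:
    """Mass of the n-th Kaluza-Klein mode, m_n = sqrt(m^2 + (n pi / R)^2)."""
    return math.sqrt(m**2 + (n * math.pi / radius) ** 2)


def kk_tower(m: float, radius: float, n_max: int) -> list:
    """Masses of the modes n = 0, ..., n_max."""
    return [kk_mass(m, n, radius) for n in range(n_max + 1)]


# Example values (hypothetical): m = 1 and R = 0.5 in the same units.
tower = kk_tower(m=1.0, radius=0.5, n_max=3)
for n, mass in enumerate(tower):
    print(f"n = {n}: m_n = {mass:.4f}")
```

For small $R$ the spacing $\pi/R$ between neighbouring modes becomes large compared to $m$, so that at low energies only the $n = 0$ mode is accessible.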
Since string theory includes gravity, one often assumes that the radius $R$ of the extra
dimensions is determined by the Planck length, $R = 1/M_{\rm Pl} = (8\pi G_N)^{1/2} \sim 10^{-32}$ cm. In this


case it is difficult to imagine any observational consequences of the additional dimensions. Of
greater interest is the possibility that some of the extra dimensions are large,

$$R_{1,\ldots,\delta} \gg R_{\delta+1,\ldots,6} = 1/M_{\rm Pl} .$$

Since the $1/r^2$ behaviour of the gravitational force is not tested below $d_* \sim$ mm scales, one
can imagine that large extra dimensions exist that are only visible to gravity: Relating the
$d = 4$ and $d > 4$ Newton’s law $F \sim m_1 m_2/r^{2+\delta}$ at the intermediate scale $r = R$, we can derive
the “true” value of the Planck scale in this model: Matching of Newton’s law in 4 and $4 + \delta$
dimensions at $r = R$ gives
$$F(r = R) = G_N\,\frac{m_1 m_2}{R^2} = \frac{1}{M_D^{2+\delta}}\,\frac{m_1 m_2}{R^{2+\delta}} . \tag{8.65}$$

This equation relates the size $R$ of the large extra dimensions to the true fundamental scale
$M_D$ of gravity in this model,

$$G_N^{-1} = 8\pi M_{\rm Pl}^2 = R^\delta M_D^{\delta+2} , \tag{8.66}$$

while Newton’s constant $G_N$ becomes just an auxiliary quantity useful to describe physics
at $r \gtrsim R$. (You may compare this to the case of weak interactions where Fermi’s constant
$G_F \propto g^2/m_W^2$ is determined by the weak coupling constant $g$ and the mass $m_W$ of the
$W$-boson.) Thus in such a set-up, gravity is much weaker than the weak interaction because the
gravitational field is diluted into a large volume.
Next we ask if $M_D \sim$ TeV is possible, which would allow one to test such theories at
accelerators such as the LHC. Inserting the measured value of $G_N$ and $M_D = 1$ TeV in Eq. (8.66) we
find the required value for the size $R$ of the large extra dimension as $10^{13}$ cm and 0.1 cm for
$\delta = 1$ and 2, respectively. Thus the case $\delta = 1$ is excluded by the agreement of the dynamics
of the solar system with four-dimensional Newtonian physics. The cases $\delta \geq 2$ are possible,
because Newton’s law is experimentally tested only for scales $r \gtrsim 1$ mm.

9 Linearized gravity and gravitational waves
In any relativistic theory of gravity, the effects of an accelerated point mass on the surround-
ing spacetime can propagate maximally with the speed of light. Thus one expects that, in
close analogy to electromagnetic waves, gravitational waves exist. Such waves correspond to
ripples in spacetime which lead to local stresses and transport energy. Although gravitational
waves were already predicted by Einstein in 1916, their existence was questioned until the
1950s: Since locally the effects of gravity can be eliminated, it was doubted that they cause
any measurable effects. Similarly, the non-existence of a stress tensor for the gravitational
field raised the question how, e.g., the momentum and energy flux of gravitational waves can
be properly defined. Only in 1957, at the now famous “Chapel Hill Conference”, was this
controversy decided: First, Pirani presented a formalism how coordinate independent effects
of a gravitational wave could be deduced. Second, Feynman suggested the following simple
gedankenexperiment: A gravitational wave passing a rod with sticky beads would move the
beads along the rod; friction would then produce heat, implying that the gravitational wave
had done work. Soon after that the first gravitational wave detectors were developed, but
only in 2015 the first detection was accomplished.

9.1 Linearized gravity


In electrodynamics, the photon is uncharged and the Maxwell equations are thus linear. In
contrast, gravitational fields carry energy, are thus self-interacting and in turn the Einstein
equations are non-linear. In order to derive a wave equation, we therefore have to linearize
the field equations as a first step.

9.1.1 Metric perturbations as a tensor field


We are looking for small perturbations hµν around the Minkowski¹ metric ηµν ,

gµν = ηµν + hµν , with |hµν | ≪ 1. (9.1)

These perturbations may be caused either by the propagation of gravitational waves or by the
gravitational potential of a star. In the first case, current experiments show that we should
not hope for h larger than O(h) ∼ 10^−22. Keeping only terms linear in h is therefore an
excellent approximation. Choosing in the second case as application the final phase of the
spiral-in of a neutron star binary system, deviations from the Newtonian limit can become large.
Hence one needs a systematic “post-Newtonian” expansion or even a numerical analysis to
describe such cases properly.
We choose a Cartesian coordinate system xµ and ask ourselves which transformations are
compatible with the splitting (9.1) of the metric. If we consider global Lorentz transformations
¹ The same analysis could be performed for small perturbations around an arbitrary metric $g^{(0)}_{\mu\nu}$, adding
however considerable technical complexity.


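$\Box A^\alpha = j^\alpha$                                                       wave equation      $\Box \bar h^{\alpha\beta} = T^{\alpha\beta}$
$\partial_\alpha A^\alpha = 0$                                                   gauge condition    $\partial_\alpha \bar h^{\alpha\beta} = 0$
transverse                                                                       polarization       transverse, traceless
$A^\alpha(x) = \int d^3x'\, \frac{j^\alpha(t_r, \mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}$      solution           $\bar h^{\alpha\beta}(x) = \int d^3x'\, \frac{T^{\alpha\beta}(t_r, \mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}$
$L_{\rm em} = -\frac{2}{3}\, \ddot d_a \ddot d^a$                                energy loss        $L_{\rm gr} = -\frac{G}{5}\, \dddot I_{ab} \dddot I^{ab}$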

Table 9.1: Comparison of basic formulas for electromagnetic and gravitational radiation.

Λν µ , then x̃ν = Λν µ xµ , and the metric tensor transforms as

g̃αβ = Λρα Λσβ gρσ = Λρα Λσβ (ηρσ + hρσ ) = ηαβ + Λρα Λσβ hρσ = η̃αβ + Λρα Λσβ hρσ . (9.2)

Since h̃αβ = Λρα Λσβ hρσ , we see that global Lorentz transformations respect the splitting (9.1).
Thus hµν transforms as a rank-2 tensor under global Lorentz transformations. We can therefore
view the perturbation hµν as a symmetric rank-2 tensor field defined on Minkowski space
that satisfies the linearized Einstein equation as its wave equation, similar to the photon field,
which fulfills a wave equation derived from Maxwell’s equations.
The splitting (9.1) is however clearly not invariant under general coordinate transforma-
tions, as they allow, for example, the finite rescaling gµν → Ωgµν . We restrict therefore
ourselves to infinitesimal coordinate transformations,

x̃µ = xµ + ξ µ (xν ) (9.3)

with |ξ µ | ≪ 1. Then the Killing equation (4.8) simplifies to

h̃µν = hµν + ∂µ ξν + ∂ν ξµ , (9.4)

because the term ξ ρ ∂ρ hµν is quadratic in the small quantities hµν and ξµ and can be neglected.
Recall that the ξ ρ ∂ρ hµν term appeared, because we compared the metric tensor at different
points. In its absence, it is more fruitful to view Eq. (9.4) not as a coordinate but as a gauge
transformation analogous to Eq. (7.88). In this interpretation, we stay in Minkowski space
and the fields h̃µν and hµν describe the same physics, since the gravitational field equations
do not fix uniquely hµν for a given source.

Comparison with electromagnetism In Table 9.1, basic properties of electromagnetic and
gravitational waves are compared.

9.1.2 Linearized Einstein equation in vacuum


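From ∂µ ηνρ = 0 and the definition
\[ \Gamma^{\mu}{}_{\nu\lambda} = \frac{1}{2}\, g^{\mu\kappa} \left( \partial_\nu g_{\kappa\lambda} + \partial_\lambda g_{\nu\kappa} - \partial_\kappa g_{\nu\lambda} \right) \tag{9.5} \]
we find for the change of the connection linear in hµν
\[ \delta\Gamma^{\mu}{}_{\nu\lambda} = \frac{1}{2}\, \eta^{\mu\kappa} \left( \partial_\nu h_{\kappa\lambda} + \partial_\lambda h_{\nu\kappa} - \partial_\kappa h_{\nu\lambda} \right) = \frac{1}{2} \left( \partial_\nu h^{\mu}{}_{\lambda} + \partial_\lambda h^{\mu}{}_{\nu} - \partial^\mu h_{\nu\lambda} \right). \tag{9.6} \]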


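Note that we used $\eta^{\mu\nu}$ to raise indices, which is appropriate in the linear approximation.
Remembering the definition of the Riemann tensor,
\[ R^{\mu}{}_{\nu\lambda\kappa} = \partial_\lambda \Gamma^{\mu}{}_{\nu\kappa} - \partial_\kappa \Gamma^{\mu}{}_{\nu\lambda} + \Gamma^{\mu}{}_{\rho\lambda} \Gamma^{\rho}{}_{\nu\kappa} - \Gamma^{\mu}{}_{\rho\kappa} \Gamma^{\rho}{}_{\nu\lambda}, \tag{9.7} \]
we see that we can neglect the terms quadratic in the connection. Thus we find for the
change
\begin{align}
\delta R^{\mu}{}_{\nu\lambda\kappa} &= \partial_\lambda \delta\Gamma^{\mu}{}_{\nu\kappa} - \partial_\kappa \delta\Gamma^{\mu}{}_{\nu\lambda} \tag{9.8a} \\
&= \frac{1}{2} \left\{ \partial_\lambda \partial_\nu h^{\mu}{}_{\kappa} + \partial_\lambda \partial_\kappa h^{\mu}{}_{\nu} - \partial_\lambda \partial^\mu h_{\nu\kappa} - \left( \partial_\kappa \partial_\nu h^{\mu}{}_{\lambda} + \partial_\kappa \partial_\lambda h^{\mu}{}_{\nu} - \partial_\kappa \partial^\mu h_{\nu\lambda} \right) \right\} \tag{9.8b} \\
&= \frac{1}{2} \left\{ \partial_\lambda \partial_\nu h^{\mu}{}_{\kappa} + \partial_\kappa \partial^\mu h_{\nu\lambda} - \partial_\lambda \partial^\mu h_{\nu\kappa} - \partial_\kappa \partial_\nu h^{\mu}{}_{\lambda} \right\}. \tag{9.8c}
\end{align}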
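The change in the Ricci tensor follows by contracting µ and λ,
\[ \delta R^{\lambda}{}_{\nu\lambda\kappa} = \frac{1}{2} \left\{ \partial_\lambda \partial_\nu h^{\lambda}{}_{\kappa} + \partial_\kappa \partial^\lambda h_{\nu\lambda} - \partial_\lambda \partial^\lambda h_{\nu\kappa} - \partial_\kappa \partial_\nu h^{\lambda}{}_{\lambda} \right\}. \tag{9.9} \]
Next we introduce $h \equiv h^{\mu}{}_{\mu}$, $\Box = \partial_\mu \partial^\mu$, and relabel the indices,
\[ \delta R_{\mu\nu} = \frac{1}{2} \left\{ \partial_\mu \partial_\rho h^{\rho}{}_{\nu} + \partial_\nu \partial_\rho h^{\rho}{}_{\mu} - \Box h_{\mu\nu} - \partial_\mu \partial_\nu h \right\}. \tag{9.10} \]
We now rewrite all terms apart from $\Box h_{\mu\nu}$ as derivatives of the vector
\[ \xi_\mu = \partial_\nu h^{\nu}{}_{\mu} - \frac{1}{2}\, \partial_\mu h, \tag{9.11} \]
obtaining
\[ \delta R_{\mu\nu} = \frac{1}{2} \left\{ -\Box h_{\mu\nu} + \partial_\mu \xi_\nu + \partial_\nu \xi_\mu \right\}. \tag{9.12} \]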
Looking back at the properties of hµν under gauge transformations, Eq. (9.4), we see that we
can gauge away the second and third term. Thus the linearised Einstein equation in vacuum,
δRµν = 0, becomes simply
\[ \Box h_{\mu\nu} = 0, \tag{9.13} \]
if the harmonic gauge²
\[ \xi_\mu = \partial_\nu h^{\nu}{}_{\mu} - \frac{1}{2}\, \partial_\mu h = 0 \tag{9.14} \]
is chosen. Hence the familiar wave equation holds for all independent components of
hµν , and the perturbations propagate with the speed of light. Inserting plane waves
hµν = εµν exp(−ikx) into the wave equation, one finds immediately that k is a null vector.
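That the terms $\partial_\mu \xi_\nu + \partial_\nu \xi_\mu$ can be gauged away rests on the invariance of $\delta R_{\mu\nu}$ under the transformation (9.4). The following sketch checks this numerically for plane waves, where every derivative becomes a factor $-\mathrm{i}k_\mu$. The function names, the random test values and the signature convention (+,−,−,−) encoded in `ETA` are assumptions made for this illustration only.

```python
import random

ETA = [1.0, -1.0, -1.0, -1.0]  # diagonal of eta_{mu nu}, signature (+,-,-,-)


def ricci_bracket(eps, k):
    """Curly bracket of Eq. (9.10) for h_{mu nu} = eps_{mu nu} exp(-ikx).

    Each derivative acting on the plane wave gives a factor -i k_mu; the common
    factor (-i)^2 and the exponential are dropped. eps and k carry lower indices.
    """
    k_up = [ETA[m] * k[m] for m in range(4)]
    k_sq = sum(k_up[m] * k[m] for m in range(4))
    trace = sum(ETA[m] * eps[m][m] for m in range(4))
    # k^rho eps_{rho nu}
    k_eps = [sum(k_up[r] * eps[r][n] for r in range(4)) for n in range(4)]
    return [[k[m] * k_eps[n] + k[n] * k_eps[m] - k_sq * eps[m][n] - k[m] * k[n] * trace
             for n in range(4)] for m in range(4)]


def max_abs(matrix):
    """Largest absolute entry of a nested list."""
    return max(abs(entry) for row in matrix for entry in row)


random.seed(1)
k = [random.uniform(-1, 1) for _ in range(4)]    # arbitrary wave vector, not necessarily null
lam = [random.uniform(-1, 1) for _ in range(4)]  # hypothetical gauge amplitudes

# Pure-gauge polarisation tensor generated by Eq. (9.4)
eps_gauge = [[k[m] * lam[n] + k[n] * lam[m] for n in range(4)] for m in range(4)]
# Generic symmetric polarisation tensor for comparison
a = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(4)]
eps_generic = [[a[m][n] + a[n][m] for n in range(4)] for m in range(4)]

print("pure gauge:", max_abs(ricci_bracket(eps_gauge, k)))
print("generic   :", max_abs(ricci_bracket(eps_generic, k)))
```

The pure-gauge perturbation gives a vanishing bracket up to rounding errors, while a generic perturbation does not.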
The characteristic property of gravity that we can introduce at each point an inertial
coordinate system implies that we can set the perturbation hµν equal to zero at a single
point. This ambiguity was one of the reasons that the existence of gravitational waves was
doubted for a long time. In Section 8.1, we therefore introduced the Riemann tensor as an
unambiguous signature for the non-zero curvature of space-time. The derivation of a wave
equation for the Riemann tensor,
\[ \Box R^{\mu}{}_{\nu\lambda\kappa} = 0, \tag{9.15} \]
by Pirani in 1956 (which follows by using (9.13) in (9.8c)), can therefore be seen as the
theoretical proof for the existence of gravitational waves in Einstein gravity: Ripples in spacetime,
or more formally perturbations of the curvature tensor, propagate with the speed of light.
² Alternatively, this gauge is called Hilbert, Loren(t)z, de Donder, . . . , gauge.


9.1.3 Linearized Einstein equation with sources


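We found $2\delta R_{\mu\nu} = -\Box h_{\mu\nu}$. By contraction it follows $2\delta R = -\Box h$. Combining then both
terms gives
\[ \Box \left( h_{\mu\nu} - \frac{1}{2}\, \eta_{\mu\nu} h \right) = -2 \left( \delta R_{\mu\nu} - \frac{1}{2}\, \eta_{\mu\nu} \delta R \right) = -2\kappa\, \delta T_{\mu\nu}. \tag{9.16} \]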

Since we assumed an empty universe in zeroth order, δTµν is the complete contribution to the
stress tensor. We omit therefore in the following the δ in δTµν . Next we introduce as useful
short-hand notation the “trace-reversed” amplitude as
\[ \bar h_{\mu\nu} \equiv h_{\mu\nu} - \frac{1}{2}\, \eta_{\mu\nu} h. \tag{9.17} \]
The harmonic gauge condition becomes then
\[ \partial^\mu \bar h_{\mu\nu} = 0 \tag{9.18} \]
and the linearised Einstein equation in the harmonic gauge follows as
\[ \Box \bar h_{\mu\nu} = -2\kappa T_{\mu\nu}. \tag{9.19} \]
Because of $\bar{\bar h}_{\mu\nu} = h_{\mu\nu}$ and Eq. (8.38), we can rewrite this wave equation also as
\[ \Box h_{\mu\nu} = -2\kappa \bar T_{\mu\nu} \tag{9.20} \]
with the trace-reversed stress tensor $\bar T_{\mu\nu} \equiv T_{\mu\nu} - \frac{1}{2}\, \eta_{\mu\nu} T$.

Newtonian limit We are now in the position to fix the value of the constant κ, comparing
the wave equation (9.19) with the Schwarzschild metric in the Newtonian limit. This limit
corresponds to v/c → 0 and thus the only non-zero element of the stress tensor becomes
$T^{00} = \rho$. Moreover, the d’Alembert operator can be approximated by minus the Laplace
operator, $\Box \to -\Delta$. The Schwarzschild metric in the weak-field limit is
\[ ds^2 = (1 + 2\Phi)\, dt^2 - (1 - 2\Phi) \left( dx^2 + dy^2 + dz^2 \right) \tag{9.21} \]
with $\Phi = -GM/r$ as the Newtonian gravitational potential. Comparing this metric to
Eq. (9.1), we find as (static) metric perturbations
\[ h_{00} = 2\Phi, \qquad h_{ij} = 2\delta_{ij}\Phi, \qquad h_{0i} = 0. \tag{9.22} \]
With $h^{j}{}_{i} = -h_{ij}$ and thus $h = -4\Phi$, it follows
\[ -\Delta \left( h_{00} - \frac{1}{2}\, \eta_{00} h \right) = -4\Delta\Phi = -2\kappa\rho. \tag{9.23} \]
Hence the linearised Einstein equation has in the Newtonian limit the same form as the
Poisson equation $\Delta\Phi = 4\pi G\rho$, and the constant κ equals $\kappa = 8\pi G$.


9.1.4 Polarization states


TT gauge We consider a plane wave $h_{\mu\nu} = \varepsilon_{\mu\nu} \exp(-\mathrm{i}kx)$. The symmetric matrix $\varepsilon_{\mu\nu}$
is called polarization tensor. Its ten independent components are constrained both by the
wave equation and the gauge condition. The harmonic gauge $\partial^\mu \bar h_{\mu\nu} = 0$ corresponds to four
constraints and thereby reduces the number of independent polarisation states from ten to six.
Even after fixing the harmonic gauge, we can still perform a gauge transformation
using four functions $\xi_\mu$ satisfying $\Box \xi_\mu = 0$. We can choose them such that four additional
components of $h_{\mu\nu}$ vanish. In the transverse traceless (TT) gauge, we set (i = 1, 2, 3)
\[ h_{0i} = 0, \quad \text{and} \quad h = 0. \tag{9.24} \]
The harmonic gauge condition becomes $\xi_\alpha = \partial_\beta h^{\beta}{}_{\alpha} = 0$ or
\begin{align}
\xi_0 &= \partial_\beta h^{\beta}{}_{0} = \partial_0 h^{0}{}_{0} = -\mathrm{i}\omega\, \varepsilon_{00}\, e^{-\mathrm{i}kx} = 0, \tag{9.25a} \\
\xi_a &= \partial_\beta h^{\beta}{}_{a} = \partial_b h^{b}{}_{a} = \mathrm{i} k_b\, \varepsilon_{ab}\, e^{-\mathrm{i}kx} = 0. \tag{9.25b}
\end{align}
Thus $\varepsilon_{00} = 0$ and the polarisation tensor is transverse, $k_a \varepsilon_{ab} = k_b \varepsilon_{ab} = 0$. If we choose
the plane wave propagating in the z direction, $\mathbf{k} = k \mathbf{e}_z$, the last row and column of the
polarisation tensor vanish too. Accounting for h = 0 and $\varepsilon_{\alpha\beta} = \varepsilon_{\beta\alpha}$, only two independent
elements are left,
\[ \varepsilon_{\alpha\beta} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & \varepsilon_{11} & \varepsilon_{12} & 0 \\ 0 & \varepsilon_{12} & -\varepsilon_{11} & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}. \tag{9.26} \]
In general, one can construct the polarisation tensor in the TT gauge by first setting the
non-transverse part to zero and then subtracting the trace. The resulting two independent
elements are (again for $\mathbf{k} = k \mathbf{e}_z$) then $\varepsilon_{11} = (\varepsilon_{xx} - \varepsilon_{yy})/2$ and $\varepsilon_{12}$.
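The two-step recipe for the TT gauge (drop the non-transverse part, then subtract the trace) can be written as a projection of the spatial part of the polarisation tensor. The sketch below assumes the standard transverse projector $P_{ij} = \delta_{ij} - n_i n_j$ for a unit vector n along the propagation direction; the function name and the test matrix are purely illustrative.

```python
def tt_projection(eps, n):
    """Transverse-traceless part of a symmetric 3x3 matrix eps for unit direction n."""
    idx = range(3)
    # Projector onto the plane transverse to n: P_ij = delta_ij - n_i n_j
    P = [[(1.0 if i == j else 0.0) - n[i] * n[j] for j in idx] for i in idx]
    # Step 1: set the non-transverse part to zero, eps_T = P eps P
    eps_t = [[sum(P[i][k] * eps[k][l] * P[l][j] for k in idx for l in idx)
              for j in idx] for i in idx]
    # Step 2: subtract the trace within the transverse plane
    trace = sum(eps_t[i][i] for i in idx)
    return [[eps_t[i][j] - 0.5 * P[i][j] * trace for j in idx] for i in idx]


# Illustrative symmetric matrix with eps_xx = 1, eps_yy = 5, eps_xy = 2
eps = [[1.0, 2.0, 3.0],
       [2.0, 5.0, 6.0],
       [3.0, 6.0, 9.0]]
for row in tt_projection(eps, n=(0.0, 0.0, 1.0)):
    print(row)
```

For propagation along the z axis the projected matrix has the form (9.26), with the 11 element equal to $(\varepsilon_{xx} - \varepsilon_{yy})/2 = -2$ and the 12 element unchanged.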
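Let us re-discuss the procedure of determining the physical polarisation states of a gravitational
wave following the same approach that we used for the photon in Eqs. (7.86)–(7.88).
We consider first as gauge transformation
\[ \xi^\mu(x) = -\mathrm{i}\lambda^\mu \exp(-\mathrm{i}kx), \tag{9.27} \]
obtaining³
\[ \tilde h^{\mu\nu} = h^{\mu\nu} + \partial^\mu \xi^\nu + \partial^\nu \xi^\mu = \left( \varepsilon^{\mu\nu} + \lambda^\mu k^\nu + \lambda^\nu k^\mu \right) \exp(-\mathrm{i}kx) = \tilde\varepsilon^{\mu\nu} \exp(-\mathrm{i}kx). \tag{9.28} \]
Choosing again a wave propagating in the z direction, $k^\mu = (\omega, 0, 0, \omega)$, it follows
\[ \lambda^\mu k^\nu + \lambda^\nu k^\mu = \omega \begin{pmatrix} 2\lambda^0 & \lambda^1 & \lambda^2 & \lambda^0 + \lambda^3 \\ \lambda^1 & 0 & 0 & \lambda^1 \\ \lambda^2 & 0 & 0 & \lambda^2 \\ \lambda^0 + \lambda^3 & \lambda^1 & \lambda^2 & 2\lambda^3 \end{pmatrix}. \tag{9.29} \]
Thus the gauge transformation does not affect the non-zero components in the TT gauge,
which are therefore the only physical ones. On the other hand, the arbitrariness of $\lambda^\mu$ allows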
³ Simpler to consider trace-reversed h̄µν ...


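us to set all other elements of the polarisation tensor to zero. To see this, we note that the
harmonic gauge implies
\[ k^\mu \varepsilon_{\mu\nu} = \frac{1}{2}\, k_\nu\, \varepsilon^{\mu}{}_{\mu}. \tag{9.30} \]
Then it follows for ν = {1, 2}
\[ \varepsilon_{01} + \varepsilon_{31} = \varepsilon_{02} + \varepsilon_{32} = 0, \tag{9.31} \]
while the ν = {0, 3} components result with $\varepsilon^{i}{}_{j} = -\varepsilon_{ij}$ and $k_\mu = (\omega, 0, 0, -\omega)$ in
\[ \varepsilon_{00} + \varepsilon_{30} = \frac{1}{2} \left( \varepsilon_{00} - \varepsilon_{11} - \varepsilon_{22} - \varepsilon_{33} \right) = -(\varepsilon_{03} + \varepsilon_{33}). \tag{9.32} \]
Thus we can eliminate four elements of the polarisation tensor. We choose to eliminate $\varepsilon_{0i}$,
using first $\varepsilon_{01} = -\varepsilon_{31}$ and $\varepsilon_{02} = -\varepsilon_{32}$. Next we combine the LHS and the RHS of Eq. (9.32)
using $\varepsilon_{30} = \varepsilon_{03}$, obtaining
\[ \varepsilon_{03} = -\frac{1}{2} \left( \varepsilon_{00} + \varepsilon_{33} \right). \tag{9.33} \]
Finally, we use this relation to eliminate $\varepsilon_{03}$ in the ν = 3 equation,
\[ \frac{1}{2}\, \varepsilon_{00} - \frac{1}{2}\, \varepsilon_{33} = \frac{1}{2} \left( \varepsilon_{00} - \varepsilon_{11} - \varepsilon_{22} - \varepsilon_{33} \right), \tag{9.34} \]
and thus $\varepsilon_{11} = -\varepsilon_{22}$. Apart from the invariant physical elements, $\tilde\varepsilon^{11} = \varepsilon^{11}$ and $\tilde\varepsilon^{12} = \varepsilon^{12}$,
the remaining four elements of the polarisation tensor transform as
\begin{align}
\tilde\varepsilon^{13} &= \varepsilon^{13} + \omega\lambda^1, & \tilde\varepsilon^{23} &= \varepsilon^{23} + \omega\lambda^2, \tag{9.35} \\
\tilde\varepsilon^{33} &= \varepsilon^{33} + 2\omega\lambda^3, & \tilde\varepsilon^{00} &= \varepsilon^{00} + 2\omega\lambda^0. \tag{9.36}
\end{align}
Since each of the four elements depends on a different $\lambda^\mu$, they can be set to zero choosing
\[ \lambda^0 = -\frac{\varepsilon^{00}}{2\omega}, \qquad \lambda^1 = -\frac{\varepsilon^{13}}{\omega}, \qquad \lambda^2 = -\frac{\varepsilon^{23}}{\omega}, \qquad \lambda^3 = -\frac{\varepsilon^{33}}{2\omega}. \tag{9.37} \]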
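The elimination of the unphysical elements can be checked numerically. The sketch below works throughout with upper indices, in which the harmonic gauge conditions for $k^\mu = (\omega, 0, 0, \omega)$ read $\varepsilon^{01} = \varepsilon^{13}$, $\varepsilon^{02} = \varepsilon^{23}$, $\varepsilon^{03} = (\varepsilon^{00} + \varepsilon^{33})/2$ and $\varepsilon^{22} = -\varepsilon^{11}$; this index placement, the random input and the value of ω are assumptions of the example.

```python
import random

random.seed(2)
omega = 1.7  # arbitrary angular frequency (illustrative value)
e00, e33, e11, e12, e13, e23 = (random.uniform(-1, 1) for _ in range(6))

# Symmetric eps^{mu nu} obeying the harmonic gauge for k^mu = (omega, 0, 0, omega)
e01, e02, e03, e22 = e13, e23, 0.5 * (e00 + e33), -e11
eps = [[e00, e01, e02, e03],
       [e01, e11, e12, e13],
       [e02, e12, e22, e23],
       [e03, e13, e23, e33]]

k = [omega, 0.0, 0.0, omega]
# Gauge parameters chosen as in Eq. (9.37)
lam = [-e00 / (2 * omega), -e13 / omega, -e23 / omega, -e33 / (2 * omega)]

# Transformed polarisation tensor, Eq. (9.28)
eps_new = [[eps[m][n] + lam[m] * k[n] + lam[n] * k[m] for n in range(4)]
           for m in range(4)]
for row in eps_new:
    print(["%+.3f" % entry for entry in row])
```

Only the 11, 12, 21 and 22 entries survive, and they coincide with the input values $\varepsilon^{11}$, $\varepsilon^{12}$ and $\varepsilon^{22} = -\varepsilon^{11}$.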

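Helicity We determine now how a metric perturbation $h_{\mu\nu}$ transforms under a rotation with
the angle α. We choose the wave propagating in z direction, $\mathbf{k} = k \mathbf{e}_z$, the TT gauge, and the
rotation in the xy plane. Then the general Lorentz transformation Λ becomes
\[ \Lambda^{\mu}{}_{\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\alpha & \sin\alpha & 0 \\ 0 & -\sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \tag{9.38} \]
Since $\mathbf{k} = k \mathbf{e}_z$ and thus $\Lambda^{\mu}{}_{\nu} k^\nu = k^\mu$, the rotation affects only the polarisation tensor. We
rewrite $\varepsilon'^{\mu\nu} = \Lambda^{\mu}{}_{\rho} \Lambda^{\nu}{}_{\sigma} \varepsilon^{\rho\sigma}$ in matrix notation, $\varepsilon' = \Lambda \varepsilon \Lambda^T$. It is sufficient to perform the
calculation for the xy sub-matrices. The result after introducing circular polarisation states
$\varepsilon_\pm = \varepsilon_{11} \pm \mathrm{i}\varepsilon_{12}$ is
\[ \varepsilon'^{\mu\nu}_{\pm} = \exp(\mp 2\mathrm{i}\alpha)\, \varepsilon^{\mu\nu}_{\pm}. \tag{9.39} \]
The same calculation for a circularly polarised photon gives $\varepsilon'^{\mu}_{\pm} = \exp(\mp\mathrm{i}\alpha)\, \varepsilon^{\mu}_{\pm}$. Any plane
wave ψ which is transformed into $\psi' = e^{-\mathrm{i}h\alpha} \psi$ by a rotation of an angle α around its
propagation axis is said to have helicity h. Thus if we say that a photon has spin 1 and a graviton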


Figure 9.1: The effect of a right-handed polarised gravitational wave on a ring of transverse
test particles as function of time; the dashed line shows the state without gravi-
tational wave.

has spin 2, we mean more precisely that electromagnetic and gravitational plane waves have
helicity 1 and 2, respectively. Doing the same calculation in an arbitrary gauge, one finds that
the remaining, unphysical degrees of freedom transform as helicity 1 and 0 (problem 9.??).
In general, a massive tensor field of rank n contains states with helicity h = −n, . . . , n, con-
taining thus 2n + 1 polarisation states. In contrast, a massless tensor field of rank n contains
only the two polarisation states with maximal helicity, h = −n and h = n.
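The transformation law (9.39) is easily verified numerically for the xy block of the polarisation tensor. In the sketch below the angle and the two amplitudes are arbitrary test values, and the function name is chosen only for this illustration.

```python
import cmath
import math


def rotate_xy(eps, alpha):
    """Return Lambda eps Lambda^T for the xy block of the rotation (9.38)."""
    c, s = math.cos(alpha), math.sin(alpha)
    lam = [[c, s], [-s, c]]
    return [[sum(lam[i][k] * eps[k][l] * lam[j][l] for k in range(2) for l in range(2))
             for j in range(2)] for i in range(2)]


alpha = 0.3            # arbitrary rotation angle
e11, e12 = 0.8, -0.5   # arbitrary amplitudes in the TT gauge
eps = [[e11, e12], [e12, -e11]]
rot = rotate_xy(eps, alpha)

# Circular polarisation states eps_pm = eps_11 +- i eps_12 before and after the rotation
eps_plus, eps_minus = complex(e11, e12), complex(e11, -e12)
rot_plus, rot_minus = complex(rot[0][0], rot[0][1]), complex(rot[0][0], -rot[0][1])

print(abs(rot_plus - cmath.exp(-2j * alpha) * eps_plus))
print(abs(rot_minus - cmath.exp(+2j * alpha) * eps_minus))
```

Both differences vanish up to rounding errors, i.e. the two circular states pick up the phases $\exp(\mp 2\mathrm{i}\alpha)$ expected for helicity ±2.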

Detection principle of gravitational waves Let us consider the effect of a gravitational
wave on a free test particle that is initially at rest, $u^\alpha = (1, 0, 0, 0)$. Then the geodesic
equation simplifies to $\dot u^\alpha = -\Gamma^{\alpha}{}_{00}$. The four relevant Christoffel symbols are in the linearised
approximation, cf. Eq. (9.6),
\[ \Gamma^{\alpha}{}_{00} = \frac{1}{2} \left( \partial_0 h^{\alpha}{}_{0} + \partial_0 h^{\alpha}{}_{0} - \partial^\alpha h_{00} \right). \tag{9.40} \]
We are free to choose the TT gauge in which all components of $h_{\alpha\beta}$ appearing on the RHS
are zero. Hence the acceleration of the test particle is zero and its coordinate position is
unaffected by the gravitational wave: the TT gauge defines a comoving coordinate system.
The physical distance l between two test particles is given by integrating
\[ dl^2 = -g_{ab}\, d\xi^a d\xi^b = (\delta_{ab} - h_{ab})\, d\xi^a d\xi^b, \tag{9.41} \]
where $g_{ab}$ is the spatial part of the metric and dξ the spatial coordinate distance between
infinitesimally separated test particles. Hence the passage of a gravitational wave, $h_{\alpha\beta} \propto
\varepsilon_{\alpha\beta} \cos(\omega t)$, results in a periodic change of the separation of freely moving test particles.
Figure 9.1 shows that a gravitational wave exerts tidal forces, stretching and squashing test
particles in the transverse plane. The relative size of the change, ∆L/L, is given by the
amplitude h of the gravitational wave. It is this tiny periodic change, $\Delta L/L \lesssim 10^{-21} \cos(\omega t)$,
which gravitational wave experiments aim to detect. There are two basic types of gravitational
wave experiments. In the first, one uses the fact that the tidal forces of a passing gravitational
wave excite lattice vibrations in a solid. If the wave frequency is resonant with a lattice
mode, the vibrations might be amplified to detectable levels. In the second type of experiment,
the free test particles are replaced by mirrors. Between the mirrors, a laser beam is reflected
multiple times, thereby increasing the effective length L and thus ∆L, before two beams at
90° are brought to interference.
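To get a feeling for the numbers, the following sketch evaluates the change in proper length to first order in h. It assumes the positive-definite spatial line element $dl^2 = (\delta_{ab} - h_{ab})\, d\xi^a d\xi^b$, so that $\Delta L/L \simeq -\frac{1}{2} h_{ab} n^a n^b$ for a separation along the unit vector n, a wave travelling along z in the TT gauge, and a hypothetical detector arm of 4 km. The linearised expression is used on purpose, since $1 - 10^{-21}$ cannot be distinguished from 1 in double precision.

```python
import math


def relative_length_change(angle, h_plus, h_cross=0.0):
    """Delta L / L to first order in h for a separation in the transverse plane.

    angle is measured from the x axis; h_xx = -h_yy = h_plus and h_xy = h_cross.
    """
    nx, ny = math.cos(angle), math.sin(angle)
    h_nn = h_plus * (nx**2 - ny**2) + 2 * h_cross * nx * ny
    return -0.5 * h_nn


h = 1e-21      # illustrative strain amplitude
arm = 4.0e3    # hypothetical arm length in metres
for phase in (0.0, math.pi / 2, math.pi):
    h_t = h * math.cos(phase)
    dl_x = arm * relative_length_change(0.0, h_t)
    dl_y = arm * relative_length_change(math.pi / 2, h_t)
    print(f"omega*t = {phase:4.2f}: dL_x = {dl_x:+.1e} m, dL_y = {dl_y:+.1e} m")
```

The two perpendicular arms change in opposite directions, which is the differential signal that an interferometer with beams at 90° is sensitive to.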


Figure 9.2: Sensitivity of present and future experiments compared to the expectations for
the amplitude h = ∆L/L for various gravitational wave sources.

A collection of potential gravitational wave sources is compared to the sensitivity of present
and future experiments in Fig. 9.2. As the most promising gravitational wave source, the
in-spiral of binary systems composed of neutron stars or black holes has been suggested. In
September 2015, the Advanced Laser Interferometer Gravitational-Wave Observatory (Advanced
LIGO) detected such a signal for the first time [1, 2]. Since then, such mergers have
been observed on a regular basis: Currently, 47 compact binary mergers have been detected.
Other, weaker sources in the frequency range ∼ 100 Hz are supernova explosions. The
coalescence of supermassive black holes during the merger of two galaxies proceeds on much longer
time scales. Correspondingly, the frequency of these events is much lower, and experiments
searching for them are space-based interferometers. Additionally, a stochastic background of
gravitational waves might be produced during inflation and phase transitions in the early
universe.

9.2 Stress pseudo-tensor for gravity


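Stress pseudo-tensor We consider again the splitting (9.1) of the metric, but we take into
account now terms of second order in $h_{\alpha\beta}$. We rewrite the Einstein equation by bringing the
Einstein tensor to the RHS and adding the linearized Einstein equation,
\[ R^{(1)}_{\alpha\beta} - \frac{1}{2}\, R^{(1)} \eta_{\alpha\beta} = -\kappa T_{\alpha\beta} + \left( -R_{\alpha\beta} + \frac{1}{2}\, R\, g_{\alpha\beta} + R^{(1)}_{\alpha\beta} - \frac{1}{2}\, R^{(1)} \eta_{\alpha\beta} \right). \tag{9.42} \]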


The LHS of this equation is the LHS of the usual gravitational wave equation, while the RHS
now includes as source not only matter but also the gravitational field itself. It is therefore
natural to define
\[ R^{(1)}_{\alpha\beta} - \frac{1}{2}\, R^{(1)} \eta_{\alpha\beta} = -\kappa \left( T_{\alpha\beta} + t_{\alpha\beta} \right) \tag{9.43} \]
with $t_{\alpha\beta}$ as the stress pseudo-tensor for gravity. If we expand all quantities,
\[ g_{\alpha\beta} = \eta_{\alpha\beta} + h^{(1)}_{\alpha\beta} + h^{(2)}_{\alpha\beta} + \mathcal{O}(h^3), \qquad R_{\alpha\beta} = R^{(1)}_{\alpha\beta} + R^{(2)}_{\alpha\beta} + \mathcal{O}(h^3), \tag{9.44} \]
we can set, assuming $h_{\alpha\beta} \ll 1$, $R_{\alpha\beta} - R^{(1)}_{\alpha\beta} = R^{(2)}_{\alpha\beta} + \mathcal{O}(h^3)$, etc. Hence we find as stress
pseudo-tensor for the metric perturbations $h^{(1)}_{\alpha\beta}$, up to terms of $\mathcal{O}(h^3)$,
\[ t_{\alpha\beta} = -\frac{1}{\kappa} \left( R^{(2)}_{\alpha\beta} - \frac{1}{2}\, R^{(2)} \eta_{\alpha\beta} \right). \tag{9.45} \]
This tensor is symmetric, quadratic in $h^{(1)}_{\alpha\beta}$ and conserved because of the Bianchi identity.
Moreover, it transforms as a tensor in Minkowski space. This implies that one can derive
global conservation laws for the energy and the angular momentum of the gravitational field,
if we assume $|h_{\alpha\beta}| \to 0$ for x → ∞. However, $t_{\alpha\beta}$ does not transform as a tensor under
general coordinate transformations, since it can be made identically zero at each point
by a suitable coordinate transformation. In the case of gravitational waves we may expect
that averaging $t_{\alpha\beta}$ over a volume large compared to the wave-length considered solves this
problem. Moreover, such an averaging simplifies the calculation of $t_{\alpha\beta}$, since all terms odd in
kx cancel. Nevertheless, the calculation is extremely messy. We will therefore use a short-cut
via the following two digressions.

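Quadratic Einstein-Hilbert action We construct the action of gravity quadratic in $h_{\mu\nu}$ from
the wave equation (9.19), following the same logic as in the Maxwell case. We multiply by a
variation $\delta h^{\mu\nu}$ and integrate, obtaining
\[ 0 = \delta S^{\rm harm}_{\rm EH} + \delta S^{\rm harm}_{m} = \int d^4x \sqrt{|g|} \left\{ \frac{1}{4\kappa}\, \delta h^{\mu\nu}\, \Box \bar h_{\mu\nu} + \frac{1}{2}\, \delta h^{\mu\nu}\, T_{\mu\nu} \right\}. \tag{9.46} \]
Here, we divided by two such that we obtain the correctly normalised stress tensor of matter
using (8.35). Next, we massage the first term into a form similar to the kinetic energy of a
scalar field in Minkowski space: We insert first the definition of $\bar h_{\mu\nu}$, use then the product
rule and perform finally a partial integration,
\begin{align}
\delta h^{\mu\nu}\, \Box \bar h_{\mu\nu} &= \delta h^{\mu\nu}\, \Box h_{\mu\nu} - \frac{1}{2}\, \delta h^{\mu\nu} \eta_{\mu\nu}\, \Box h = \delta h^{\mu\nu}\, \Box h_{\mu\nu} - \frac{1}{2}\, \delta h\, \Box h \tag{9.47a} \\
&= \delta \left( \frac{1}{2}\, h^{\mu\nu}\, \Box h_{\mu\nu} - \frac{1}{4}\, h\, \Box h \right) = -\delta \left( \frac{1}{2}\, (\partial_\kappa h^{\mu\nu})^2 - \frac{1}{4}\, (\partial_\kappa h)^2 \right). \tag{9.47b}
\end{align}
Thus the quadratic Einstein-Hilbert action in the harmonic gauge becomes
\[ S^{\rm harm}_{\rm EH} = -\frac{1}{32\pi G} \int d^4x \left\{ \frac{1}{2}\, (\partial_\rho h_{\mu\nu})^2 - \frac{1}{4}\, (\partial_\rho h)^2 \right\}. \tag{9.48} \]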


Specializing (9.48) to the TT gauge, we obtain
\[ S^{\rm TT}_{\rm EH} = -\frac{1}{32\pi G} \int d^4x\, \frac{1}{2}\, (\partial_\rho h_{ij})^2. \tag{9.49} \]
We can express an arbitrary polarisation state as the sum over the polarisation tensors for
circularly polarised waves,
\[ h_{\mu\nu} = \sum_{a=+,-} h^{(a)} \varepsilon^{(a)}_{\mu\nu}. \tag{9.50} \]
Inserting this decomposition into (9.49) and using $\varepsilon^{(a)}_{\mu\nu} \varepsilon^{\mu\nu(b)} = \delta^{ab}$, the action becomes
\[ S^{\rm TT}_{\rm EH} = -\frac{1}{32\pi G} \sum_a \int d^4x\, \frac{1}{2} \left( \partial_\rho h^{(a)} \right)^2. \tag{9.51} \]
Thus the gravitational action in the TT gauge consists of two degrees of freedom, $h^{(+)}$ and
$h^{(-)}$, which determine the contribution of left- and right-circularly polarised waves. Apart from
the pre-factor, the action is the same as the one of two scalar fields. This means that we can
shortcut many calculations involving gravitational waves by using simply the corresponding
results for scalar fields. We can understand this equivalence by recalling that the part of the
action quadratic in the fields just enforces the relativistic energy–momentum relation
via a Klein–Gordon equation for each field component. The remaining content of (9.48) is
just the rule how the unphysical components in $h_{\mu\nu}$ have to be eliminated. In the TT gauge,
we have already applied this information, and thus the two scalar wave equations for $h^{(\pm)}$
summarise the Einstein equation at $\mathcal{O}(h^2)$.

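Averaged stress tensor The stress tensor of a scalar field is in general given by
\[ T_{\alpha\beta} = \frac{2}{\sqrt{|g|}}\, \frac{\delta S_m}{\delta g^{\alpha\beta}} = \partial_\alpha \phi\, \partial_\beta \phi - g_{\alpha\beta}\, \mathcal{L}. \tag{9.52} \]
We consider now a free field, i.e. set V(φ) = 0, and take the average over a volume Ω
large compared to the typical wavelength of the field,
\[ \langle T_{\alpha\beta} \rangle = \frac{1}{\Omega} \int d^4x\, T_{\alpha\beta} = \langle \partial_\alpha \phi\, \partial_\beta \phi \rangle - \frac{1}{2}\, \eta_{\alpha\beta}\, \langle (\partial_\rho \phi)^2 \rangle. \tag{9.53} \]
Performing a partial integration of the second term, we can drop the surface term, and use
then the equation of motion,
\[ \langle (\partial_\rho \phi)^2 \rangle = -\langle \phi\, \Box \phi \rangle = 0. \tag{9.54} \]
Hence $\langle T_{\alpha\beta} \rangle = \langle \partial_\alpha \phi\, \partial_\beta \phi \rangle$. Comparing now $S_{\rm KG}$ and $S_{\rm EH}$ in the TT gauge suggests that the
averaged stress pseudo-tensor of the gravitational field is given in this gauge by
\[ \langle t_{\alpha\beta} \rangle = \frac{1}{32\pi G}\, \langle \partial_\alpha h_{ij}\, \partial_\beta h^{ij} \rangle. \tag{9.55} \]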

Bootstrap Bootstrap of full Einstein equation out of linear ansatz.


9.3 Emission of gravitational waves


The first, indirect, evidence for gravitational waves has been the observation of close neutron
star-neutron star binaries showing that such systems lose energy, leading to a shrinkage of
their orbit with time. These observations are consistent with the prediction for the energy
loss by the emission of gravitational waves.
The steps in deriving this energy loss formula are similar to the corresponding derivation
for the dipole emission formula of electromagnetic radiation. Step one, the derivation of the
Green function for the wave equation (9.19), is exactly the same, after having fixed the gauge
freedom. In the second step, we have to connect the amplitude of the field at large distances
(“in the wave zone”) to the source, i.e. the current $j^a$ and the stress tensor $T^{ab}$, respectively.
Finally, we use the connection between the field and its (pseudo) stress tensor ($T^{ab}_{\rm em}$ or $t^{ab}$) to
derive the energy flux through a sphere around the source.

Quadrupole formula Gravitational waves in the linearized approximation fulfil the
superposition principle. Hence, if the solution for a point source is known,
\[ -\Box_x\, G(x - x') = \delta(x - x'), \tag{9.56} \]
the general solution can be obtained by integrating the Green function over the sources,
\[ \bar h_{\alpha\beta}(x) = -2\kappa \int d^4x'\, G(x - x')\, T_{\alpha\beta}(x'). \tag{9.57} \]
The Green function G(x − x′) is not completely specified by Eq. (9.56): We can add solutions
of the homogeneous wave equation and we have to specify how the poles of G(x − x′) are
treated. In classical physics, one chooses the retarded Green function $G^{(+)}(x - x')$ defined by
\[ G^{(+)}(x - x') = -\frac{1}{4\pi |\mathbf{x} - \mathbf{x}'|}\, \delta\big[ |\mathbf{x} - \mathbf{x}'| - (t - t') \big]\, \vartheta(t - t'), \tag{9.58} \]
picking up the contributions along the past light-cone; for a derivation see appendix 9.B.
Inserting the retarded Green function into Eq. (9.57), we can perform the time integral
using the delta function and obtain
\[ \bar h_{\alpha\beta}(x) = 4G \int d^3x'\, \frac{T_{\alpha\beta}(t - |\mathbf{x} - \mathbf{x}'|, \mathbf{x}')}{|\mathbf{x} - \mathbf{x}'|}. \tag{9.59} \]
The retarded time $t_r \equiv t - |\mathbf{x} - \mathbf{x}'|$ denotes the emission time of a signal emitted at x′ that
reaches x at time t propagating with the speed of light.
We perform now a Fourier transformation from time to angular frequency,
\[ \bar h_{\alpha\beta}(\omega, \mathbf{x}) = \frac{1}{\sqrt{2\pi}} \int dt\, e^{\mathrm{i}\omega t}\, \bar h_{\alpha\beta}(t, \mathbf{x}) = \frac{4G}{\sqrt{2\pi}} \int dt \int d^3x'\, e^{\mathrm{i}\omega t}\, \frac{T_{\alpha\beta}(t_r, \mathbf{x}')}{|\mathbf{x} - \mathbf{x}'|}. \tag{9.60} \]
Next we change from the integration variable t to $t_r$,
\[ \bar h_{\alpha\beta}(\omega, \mathbf{x}) = \frac{4G}{\sqrt{2\pi}} \int dt_r \int d^3x'\, e^{\mathrm{i}\omega t_r}\, e^{\mathrm{i}\omega |\mathbf{x} - \mathbf{x}'|}\, \frac{T_{\alpha\beta}(t_r, \mathbf{x}')}{|\mathbf{x} - \mathbf{x}'|}, \tag{9.61} \]


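and introduce the Fourier transformed $T_{\alpha\beta}(\omega, \mathbf{x}')$,
\[ \bar h_{\alpha\beta}(\omega, \mathbf{x}) = 4G \int d^3x'\, e^{\mathrm{i}\omega |\mathbf{x} - \mathbf{x}'|}\, \frac{T_{\alpha\beta}(\omega, \mathbf{x}')}{|\mathbf{x} - \mathbf{x}'|}. \tag{9.62} \]
We proceed using the same approximations as in electrodynamics: We restrict ourselves to
slowly moving, compact sources observed in the wave zone and choose the coordinate system
such that $|\mathbf{x}'| \ll |\mathbf{x}|$. Then most radiation is emitted at frequencies such that $|\mathbf{x} - \mathbf{x}'| \simeq |\mathbf{x}| \equiv r$
and thus
\[ \bar h_{\alpha\beta}(\omega, \mathbf{x}) = 4G\, \frac{e^{\mathrm{i}\omega r}}{r} \int d^3x'\, T_{\alpha\beta}(\omega, \mathbf{x}'). \tag{9.63} \]
Finally, we Fourier transform back to the retarded time $t_r = t - r$,
\begin{align}
\bar h_{\alpha\beta}(t, \mathbf{x}) &= \frac{1}{\sqrt{2\pi}} \int d\omega\, e^{-\mathrm{i}\omega t}\, \bar h_{\alpha\beta}(\omega, \mathbf{x}) \tag{9.64a} \\
&= \frac{4G}{\sqrt{2\pi}\, r} \int d\omega\, e^{-\mathrm{i}\omega (t - r)} \int d^3x'\, T_{\alpha\beta}(\omega, \mathbf{x}') \tag{9.64b} \\
&= \frac{4G}{r} \int d^3x'\, T_{\alpha\beta}(t_r, \mathbf{x}'). \tag{9.64c}
\end{align}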

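Next we want to eliminate all elements of $T_{\alpha\beta}$ except $T_{00}$. We use first (flat-space)
energy-momentum conservation,
\begin{align}
\frac{\partial}{\partial t}\, T^{00} + \frac{\partial}{\partial x^b}\, T^{0b} &= 0, \tag{9.65a} \\
\frac{\partial}{\partial t}\, T^{a0} + \frac{\partial}{\partial x^b}\, T^{ab} &= 0. \tag{9.65b}
\end{align}
Then we differentiate Eq. (9.65a) with respect to time and use Eq. (9.65b), obtaining
\[ \frac{\partial^2}{\partial t^2}\, T^{00} = -\frac{\partial^2}{\partial x^b \partial t}\, T^{0b} = \frac{\partial^2}{\partial x^a \partial x^b}\, T^{ab}. \tag{9.66} \]
Multiplying with $x^a x^b$ and integrating gives thus
\[ \frac{\partial^2}{\partial t^2} \int d^3x\, x^a x^b\, T^{00} = \int d^3x\, x^a x^b\, \frac{\partial^2}{\partial x^i \partial x^j}\, T^{ij} = 2 \int d^3x\, T^{ab}. \tag{9.67} \]
Here we dropped also surface terms, using that the source is compact. In the harmonic gauge,
we need to calculate only the components $h_{ij}(t_r, \mathbf{x})$. We define as quadrupole moment of the
source stress tensor
\[ I^{ab}(t_r) = \int d^3x\, x^a x^b\, T^{00}(t_r, \mathbf{x}). \tag{9.68} \]
Then the quadrupole formula for the emission of gravitational waves results,
\[ \bar h_{ab}(t, \mathbf{x}) = \frac{2G}{c^6 r}\, \ddot I_{ab}(t_r), \tag{9.69} \]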

where we added also c. Since hαβ is traceless, the trace of I αβ does not produce gravitational
waves: It is connected to the dipole moment and its time derivative vanishes because of


conservation of linear momentum. Thus it is more convenient to replace $I^{\alpha\beta}$ by the reduced
(trace-less, irreducible) quadrupole moment
\[ Q^{ab} = \int d^3x \left( x^a x^b - \frac{1}{3}\, \delta^{ab} r^2 \right) T^{00}(x). \tag{9.70} \]

Our derivation neglected perturbations of flat space and seems therefore not applicable to a
self-gravitating system. However, our final result depends only on the motion of the particles,
not how it is produced. An analysis at next order in perturbation theory shows indeed that
our result applies to self-gravitating systems like binary stars.
Note the following peculiarity of a gravitational wave experiment: Such an experiment
measures the amplitude $h_{ab} \propto 1/r$ of a metric perturbation, while the sensitivity of all other
experiments (light, neutrinos, cosmic rays, . . . ) is proportional to the energy flux $\propto 1/r^2$ of
radiation. This difference is connected to the fact that a gravitational wave is caused by the
coherent motion of the source, and can thus be observed as a coherent wave over time. In
particular, one can measure the phase of $h_{ab}$ as function of time. In contrast, light observed
from an astrophysical source is an incoherent superposition of individual photons. As a result,
increasing the sensitivity of a gravitational wave detector by a factor ten increases the number
of potential sources by a factor 1000, in contrast to a factor $10^{3/2}$ for other detectors.
One may wonder if this behavior contradicts the fact that also the energy flux of a
gravitational wave follows a $1/r^2$ law. However, the energy dissipated from a gravitational wave
crossing the Earth (including our experimental set-up) is extremely tiny, while the energy
density of a gravitational wave with an amplitude as small as $h \sim 10^{-22}$ is surprisingly large (check it
e.g. with (9.75)).

Energy loss We evaluate now Eq. (9.55) for a plane-wave
\[ h_{ij} = A_{ij} \cos(kx) \tag{9.71} \]
with amplitudes $A_{ij}$ which we choose to be real. Using $\langle \sin^2(kx) \rangle = 1/2$, we obtain
\[ \langle t_{\alpha\beta} \rangle = \frac{1}{64\pi G}\, k_\alpha k_\beta\, A_{ij} A^{ij}. \tag{9.72} \]
The energy-flux $\mathcal{F}$, i.e. the energy crossing a unit area per unit time, in the direction n is
in general $\mathcal{F} = c\, t^{0i} n^i$. For a plane-wave with wave-vector $k^\mu$, it follows
\[ \mathcal{F} = t^{0i} \hat k^i = \frac{1}{64\pi G}\, k^0 k^i \hat k^i\, A_{ij} A^{ij} = c\, t^{00}, \tag{9.73} \]
where we used $k^0 = -k_i \hat k^i$. Thus we got the reasonable result that the energy-flux is simply
the energy-density $t^{00}$ multiplied with the wave-speed c. Expressing the wave as the sum over linearly
polarised waves,
\[ h_{\mu\nu} = \sum_{a=+,\times} h^{(a)} \varepsilon^{(a)}_{\mu\nu}, \tag{9.74} \]
it follows with $a = h^{(+)}$ and $b = h^{(\times)}$
\[ \mathcal{F} = \frac{\omega^2}{32\pi G} \left( a^2 + b^2 \right). \tag{9.75} \]
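As a quick numerical check of the claim that even a tiny amplitude carries a sizeable energy flux, the following sketch evaluates Eq. (9.75) in SI units. Restoring the factor $c^3$ on dimensional grounds, the rounded constants and the example frequency of 100 Hz are assumptions of this illustration.

```python
import math

G = 6.674e-11   # Newton's constant in m^3 kg^-1 s^-2 (rounded)
C = 2.998e8     # speed of light in m/s (rounded)


def gw_energy_flux(freq_hz, a, b=0.0):
    """Energy flux of Eq. (9.75) in W/m^2, with the factor c^3 restored."""
    omega = 2 * math.pi * freq_hz
    return C**3 * omega**2 * (a**2 + b**2) / (32 * math.pi * G)


flux = gw_energy_flux(100.0, 1e-22)   # h ~ 1e-22 at an example frequency of 100 Hz
print(f"F = {flux:.2e} W/m^2")
```

The result is of order $10^{-5}$ W/m², which is large for a signal that changes lengths by only one part in $10^{22}$.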


In the case of a spherical wave emitted from the origin, we choose n = er . Then
1 1 c5
F(er ) = ht0i ni i = h(∂t hij )(n · ∇)hij i = h(∂t hij )(∂r hij )i . (9.76)
64πG 64πG 64πG
Inserting the quadrupole formula, one finds (cf. the appendix for details)
\[ L_{\rm gr} = -\frac{dE}{dt} = -\int d\Omega\, r^2\, \mathcal{F}(\mathbf{e}_r) = \frac{G}{5c^5}\, \dddot Q_{ij}\, \dddot Q^{ij}, \tag{9.77} \]
where we added c.

9.4 Gravitational waves from binary systems


9.4.1 Weak field limit
The emission of gravitational radiation is negligible for all systems where Newtonian gravity is
a good approximation. One of the rare examples where general relativistic effects can become
important are close binary systems of compact stars. The first such example was found in 1974
by Hulse and Taylor who discovered a pulsar in a binary system via the Doppler-shift of its
radio pulses. The extreme precision of the periodicity of the pulsar signal makes this binary
system an ideal laboratory to test various effects of special and general relativity:
• The pulsar’s orbital speed changes by a factor of four during its orbit and allows us to
test the usual (special relativistic) Doppler effect.
• At the same time, the gravitational field alternately strengthens at periastron and
weakens at apastron, leading to a periodic gravitational redshift of the pulse.
• The small size of the orbit leads to a precession of the periastron by 4.2°/yr.
• The system emits gravitational waves and thereby loses energy. As a result the orbit
of the binary shrinks by about 3.5 m/yr.

Kepler problem We start by recalling the basic formulae of the Kepler problem. For a system
of two stars with total mass $M = m_1 + m_2$ we introduce the reduced mass
\[ \mu = \frac{m_1 m_2}{m_1 + m_2} \tag{9.78} \]
and c.m. coordinates. The orbit equation for u = 1/r as function of the angle ϑ then reads
\[ u'' + u = \frac{G M \mu^2}{L^2}. \tag{9.79} \]
Inserting as trial solution the equation of a conic section,
\[ u = \frac{1}{r} = \frac{1 + e \cos\vartheta}{a(1 - e^2)}, \tag{9.80} \]
we find
\[ u'' + u = \frac{1 - e \cos\vartheta + e \cos\vartheta}{a(1 - e^2)} \overset{!}{=} \frac{G M \mu^2}{L^2}. \tag{9.81} \]
Thus we obtain as constraint for the angular momentum
\[ L = \mu \sqrt{G M a(1 - e^2)}. \tag{9.82} \]
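The statement that the conic section solves the orbit equation for the angular momentum (9.82) can be checked with a finite-difference estimate of $u''$. The numerical values below (units with G = 1, masses, semi-major axis and eccentricity) are hypothetical and serve only as an illustration.

```python
import math

G = 1.0               # units with G = 1 (illustrative choice)
m1, m2 = 1.4, 1.3     # hypothetical masses
a, e = 2.0, 0.6       # hypothetical semi-major axis and eccentricity
M = m1 + m2
mu = m1 * m2 / M
L = mu * math.sqrt(G * M * a * (1 - e**2))   # Eq. (9.82)


def u(theta):
    """Inverse radius along the conic section, Eq. (9.80)."""
    return (1 + e * math.cos(theta)) / (a * (1 - e**2))


def orbit_residual(theta, step=1e-4):
    """u'' + u - G M mu^2 / L^2, with u'' from a central finite difference."""
    u_pp = (u(theta + step) - 2 * u(theta) + u(theta - step)) / step**2
    return u_pp + u(theta) - G * M * mu**2 / L**2


worst = max(abs(orbit_residual(0.1 * i)) for i in range(63))
print(f"largest residual along the orbit: {worst:.1e}")
```

The residual stays at the level of the finite-difference error for all angles, as expected if Eqs. (9.80) and (9.82) are consistent with Eq. (9.79).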


Gravitational wave emission In the first step, we derive the instantaneous energy loss of
the binary system due to gravitational wave emission. Since we assume that the losses are
small, we can treat the orbital parameters a and e as constant. The quadrupole moments
follow as
\begin{align}
I_{xx} &= m_1 x_1^2 + m_2 x_2^2 = \mu r^2 \cos^2\phi, \tag{9.83a} \\
I_{yy} &= \mu r^2 \sin^2\phi, \tag{9.83b} \\
I_{xy} &= \mu r^2 \cos\phi \sin\phi, \tag{9.83c} \\
I &\equiv I_{xx} + I_{yy} = \mu r^2. \tag{9.83d}
\end{align}
In order to find the derivatives of $I_{ik}$, we have to determine first $\dot r$ and $\dot\phi$. Eliminating L using
Eq. (9.82) we obtain (in units with G = 1)
\[ \dot\phi = \frac{L}{\mu r^2} = \frac{[a(1 - e^2) M]^{1/2}}{r^2}. \tag{9.84} \]
Differentiating then Eq. (9.80) and inserting $\dot\phi$, we find
\[ \dot r = \frac{a(1 - e^2)\, e \sin\phi\, \dot\phi}{(1 + e \cos\phi)^2} = \left[ \frac{M}{a(1 - e^2)} \right]^{1/2} e \sin\phi. \tag{9.85} \]
We are now in the position to calculate, e.g.,
\[ \dot I_{xx} = 2\mu \cos\phi \left( r \dot r \cos\phi - r^2 \dot\phi \sin\phi \right). \tag{9.86} \]
With
\[ r^2 \dot\phi = [a(1 - e^2) M]^{1/2} \equiv A \tag{9.87} \]
and
\[ r \dot r = A\, \frac{e \sin\phi}{1 + e \cos\phi}, \tag{9.88} \]
it follows
\[ r \dot r \cos\phi - r^2 \dot\phi \sin\phi = A \sin\phi \left( \frac{e \cos\phi}{1 + e \cos\phi} - 1 \right) = -\frac{M^{1/2}\, r}{[a(1 - e^2)]^{1/2}}\, \sin\phi. \tag{9.89} \]
Thus we obtain
\[ \dot I_{xx} = -\frac{2 m_1 m_2\, r}{[a(1 - e^2) M]^{1/2}}\, \cos\phi \sin\phi. \tag{9.90} \]


The calculation of the other elements and the higher derivatives proceeds in the same way,
leading to
2m1 m2 
I¨xx = − 2
cos 2φ + e cos3 φ , (9.91a)
a(1 − e )
... 2m1 m2 
I xx = 2
2 sin 2φ + 3e cos2 φ sin φ φ̇, (9.91b)
a(1 − e )
2m1 m2
I˙yy = r (cos φ sin φ + e sin φ) , (9.91c)
[a(1 − e2 )M ]1/2
2m1 m2 
I¨yy = 2
cos 2φ + e cos φ + e cos3 φ + e2 , (9.91d)
a(1 − e )
... 2m1 m2 2 2

I yy = − 2 sin φ + e sin φ + 3e cos φ sin φ φ̇, (9.91e)
a(1 − e2 )
m1 m2 r 
I˙xy = 2 1/2
cos2 φ − sin2 φ + e cos φ (9.91f)
[a(1 − e )M ]
2m1 m2 
I¨xy = − 2
sin 2φ + e sin φ + e sin φ cos2 φ (9.91g)
a(1 − e )
... 2m1 m2 
I xy = − 2
2 cos 2φ − e cos φ + 3e cos3 φ φ̇, (9.91h)
a(1 − e )
... ... ... 2m1 m2
I = I xx + I yy = − e sin φφ̇. (9.91i)
a(1 − e2 )
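Inserting these expressions into

\[ L_{\rm gr} = -\frac{dE}{dt} = \frac{G}{5}\, \dddot Q_{ij}\dddot Q^{ij} = \frac{G}{5} \left( \dddot I_{xx}^{\,2} + 2\dddot I_{xy}^{\,2} + \dddot I_{yy}^{\,2} - \frac{1}{3}\dddot I^{\,2} \right) \tag{9.92} \]

results in

\[ -\frac{dE}{dt} = \frac{8m_1^2m_2^2}{15a^2(1-e^2)^2} \left[ 12(1+e\cos\phi)^2 + e^2\sin^2\phi \right] \dot\phi^2 \tag{9.93} \]

for the instantaneous energy loss. In order to obtain the average energy loss, we have to average
this expression over one period,

\[ \left\langle -\frac{dE}{dt} \right\rangle = -\frac{1}{T}\int_0^T dt\, \frac{dE}{dt} = -\frac{1}{T}\int_0^{2\pi} \frac{d\phi}{\dot\phi}\, \frac{dE}{dt} = \frac{32}{5}\frac{m_1^2m_2^2M}{a^5}\, f(e) \tag{9.94} \]

with

\[ f(e) = \frac{1 + \frac{73}{24}e^2 + \frac{37}{96}e^4}{(1-e^2)^{7/2}} . \tag{9.95} \]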
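The orbit average leading from (9.93) to (9.95) can be checked numerically. The following is a minimal sketch, not part of the original notes: it assumes $\dot\phi \propto (1+e\cos\phi)^2$ from Eq. (9.84) together with Kepler's third law $T = 2\pi a^{3/2}/M^{1/2}$, so that all dimensionful factors cancel against the circular result. The function names `f_closed` and `f_numeric` are hypothetical.

```python
import math

def f_closed(e):
    """Enhancement factor f(e) as given in Eq. (9.95)."""
    return (1 + 73 / 24 * e**2 + 37 / 96 * e**4) / (1 - e**2) ** 3.5

def f_numeric(e, n=2000):
    """Orbit average of Eq. (9.93) divided by the circular-orbit result.

    With dt = dphi / phidot the average reduces to
    f(e) = (1-e^2)^(-7/2) / (24 pi) * integral over one revolution of
           [12 (1+e cos phi)^2 + e^2 sin^2 phi] (1+e cos phi)^2 dphi.
    A uniform Riemann sum is used; the integrand is a trigonometric polynomial.
    """
    dphi = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        phi = i * dphi
        u = 1 + e * math.cos(phi)
        total += (12 * u**2 + (e * math.sin(phi)) ** 2) * u**2
    return total * dphi / (24 * math.pi) / (1 - e**2) ** 3.5

if __name__ == "__main__":
    for ecc in (0.0, 0.3, 0.617):
        print(ecc, f_closed(ecc), f_numeric(ecc))
```

Both functions should agree to rounding accuracy for any $0 \le e < 1$, which is a quick consistency check on the coefficients 73/24 and 37/96.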

Time evolution Now we can determine how the orbital parameters change over time. The
major axis $a$ decreases with time as

\[ \frac{da}{dt} = \frac{m_1m_2}{2E^2}\frac{dE}{dt} = \frac{2a^2}{m_1m_2}\frac{dE}{dt} , \tag{9.96} \]

or averaged over one period,

\[ \left\langle -\frac{da}{dt} \right\rangle = \frac{64}{5}\frac{m_1m_2M}{a^3}\, f(e) . \tag{9.97} \]


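The orbital period changes as

\[ \frac{\dot P}{P} = -\frac{3}{2}\frac{\dot E}{E} = \frac{3}{2}\frac{\dot a}{a} = -\frac{96}{5}\frac{m_1m_2M}{a^4}\, f(e) . \tag{9.98} \]

What remains to do is to work out the change of the eccentricity,

\[ \frac{de}{dt} = \frac{M}{m_1^3m_2^3\, e} \left( L^2\frac{dE}{dt} + 2EL\frac{dL}{dt} \right) . \tag{9.99} \]

Determining the loss of angular momentum $L$ due to gravitational wave emission is more
involved than the energy loss: Since $L = r \times p$ contains a factor $r$, we have to take into
account terms $h \propto 1/r^2$, which requires including terms of $O(h^3)$ in $\langle t_{\mu\nu} \rangle$. Therefore we simply
cite the result obtained by Peters 1964 [3, 4],

\[ -\frac{dL}{dt} = \frac{2G}{5}\, \varepsilon^{3ij}\, \ddot Q_i{}^{k}\, \dddot Q_{jk} . \tag{9.100} \]

Then the instantaneous loss of angular momentum follows as

\[ -\frac{dL}{dt} = \frac{2G}{5} \left[ \ddot I_{xy} \left( \dddot I_{yy} - \dddot I_{xx} \right) + \dddot I_{xy} \left( \ddot I_{xx} - \ddot I_{yy} \right) \right] , \tag{9.101} \]

leading to

\[ \left\langle -\frac{de}{dt} \right\rangle = \frac{304}{15}\frac{m_1m_2M\, e}{a^4}\, g(e) \tag{9.102} \]

with

\[ g(e) = \frac{1 + \frac{121}{304}e^2}{(1-e^2)^{5/2}} . \tag{9.103} \]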

Hulse-Taylor pulsar The binary system found by Hulse and Taylor consists of a pulsar
with mass $m_1 = 1.44M_\odot$ and a companion with mass $m_2 = 1.34M_\odot$. Their orbital period
is P = 7h40min on an orbit with rather strong eccentricity, e = 0.617. In this case, the
emission of gravitational radiation is strongly enhanced compared to a circular orbit. Let
us now compare the observed change in the orbit of the binary with the prediction of general
relativity. The prediction of Einstein’s general relativity,

\[ \dot P(e)\big|_{\rm th} = f(e)\, \dot P(0) \simeq 11.7 \times \dot P(0) \simeq (-2.403 \pm 0.002) \times 10^{-12} , \tag{9.104} \]

is in excellent agreement with the observed value,

\[ \dot P(e)\big|_{\rm obs} \simeq (-2.40 \pm 0.05) \times 10^{-12} . \tag{9.105} \]

A comparison of the predicted and observed accumulated shift in the period is shown in Fig. ??.
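As a rough numerical illustration of (9.104), the period derivative can be estimated from the rounded values quoted above. This sketch is not from the original notes: it assumes the circular-orbit rate $\dot P(0) = -\frac{96}{5}(2\pi)^{8/3} G^{5/3}\mu M^{2/3}P^{-5/3}/c^5$ (the factor $c^{-5}$ restored by hand), a rounded value of $GM_\odot/c^3$, and hypothetical function names.

```python
import math

G_MSUN_OVER_C3 = 4.925e-6  # G * M_sun / c^3 in seconds (rounded, assumed value)

def enhancement(e):
    """f(e) from Eq. (9.95)."""
    return (1 + 73 / 24 * e**2 + 37 / 96 * e**4) / (1 - e**2) ** 3.5

def period_derivative(m1, m2, period_s, e):
    """Orbit-averaged dP/dt (dimensionless); masses in solar masses, period in seconds."""
    total = m1 + m2
    mu = m1 * m2 / total
    # (2 pi G M_sun / (c^3 P))^(5/3) carries all the units
    x = (2 * math.pi * G_MSUN_OVER_C3 / period_s) ** (5 / 3)
    circular = -(192 * math.pi / 5) * mu * total ** (2 / 3) * x
    return circular * enhancement(e)

if __name__ == "__main__":
    p = 7 * 3600 + 40 * 60  # 7h40min as quoted in the text
    print(enhancement(0.617), period_derivative(1.44, 1.34, p, 0.617))
```

With these rounded inputs the estimate should land within a few per cent of the value in (9.104); a closer comparison would need the precisely measured masses and period.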

9.4.2 Strong field limit and binary merger


Post-Newtonian approximation and beyond In the previous section, we used the orbits
obtained in the Newtonian limit. This approximation corresponds to the limit c → ∞ and
neglects all retardation effects. Since the energy loss due to gravitational wave emission is
of order $O(1/c^5)$, cf. Eq. (9.77), we should be able to improve this approximation using
a Lagrange function only of coordinates and velocities but including post-Newtonian (PN)
corrections up to order $(v/c)^4$. The first relativistic terms, at the 1PN order, were derived in
1937–39, the 2PN approximation was tackled by Ohta et al. in 1973–74, while results for the
3PN order were obtained starting from 1998. Alternatives to this brute-force approach such
as the effective one-body theory have been developed, where one maps the two-body problem
of GR onto a one-body problem in an effective metric. However, all these approaches are
restricted to the inspiral phase of a merger. In contrast, numerical simulations of the merging
phase of binaries give accurate results, but can take months even on super-computers. Thus
their extension towards early times of the inspiral phase is restricted, and the set of parameters
$\{m_1, m_2, s_1, s_2, e, \ldots\}$ for which simulations exist is sparse. For instance, simulations for large
mass ratios $m_1/m_2$ are numerically still prohibitive. As a result, a combination of the different
approaches is needed to describe the coalescence of binary systems accurately.

Qualitative discussion Let us now discuss qualitatively the final stage in the time evolution
of a close binary system. We can assume that the emission of GWs has led to a circularisation
of the orbits. Then

\[ L_{\rm gw} = \frac{32}{5}\frac{G^4\mu^2M^3}{a^5} . \tag{9.106} \]

Next we can relate the relative changes per time in the orbital period $P$, the separation $a$ and
the energy $E$ using $|E| \propto 1/a$ and $P \propto a^{3/2}$ as

\[ \frac{\dot E}{E} = -\frac{\dot a}{a} = -\frac{2}{3}\frac{\dot P}{P} . \tag{9.107} \]

Solving first for the change in the period, with $|E| = G\mu M/(2a)$,

\[ \dot P = -\frac{3}{2}\frac{L_{\rm gw}}{|E|}\, P = -\frac{96}{5}\frac{G^3\mu M^2}{a^4}\, P , \tag{9.108} \]

and eliminating then $a$ with Kepler's third law gives

\[ \dot P = -\frac{96}{5}(2\pi)^{8/3}\, G^{5/3}\mu M^{2/3} P^{-5/3} . \tag{9.109} \]

Combining (9.107) and (9.108), we obtain

\[ \dot a = \frac{2}{3}\frac{\dot P}{P}\, a = -\frac{64}{5}\frac{G^3\mu M^2}{a^3} . \tag{9.110} \]

Separating variables and integrating, we find

\[ a^4 = \frac{256}{5}\, G^3\mu M^2\, (t_c - t) . \tag{9.111} \]

Here, $t_c$ denotes the (theoretical) coalescence time for point-like stars. With the initial
condition $a(t=0) = a_0$, it follows

\[ a(t) = a_0 \left( 1 - \frac{t}{t_c} \right)^{1/4} \tag{9.112} \]

and

\[ t_c = \frac{5}{256}\frac{a_0^4}{G^3\mu M^2} . \tag{9.113} \]
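To get a feeling for the numbers in (9.112) and (9.113), the coalescence time can be evaluated in SI units. The sketch below is an illustration only: it assumes the factor $c^5$ restored by dimensional analysis, rounded constants, and hypothetical function names; the masses and separation in the usage example are made up.

```python
G = 6.674e-11     # m^3 kg^-1 s^-2 (rounded)
C = 2.998e8       # m/s (rounded)
M_SUN = 1.989e30  # kg (rounded)

def coalescence_time(m1, m2, a0):
    """t_c = (5/256) c^5 a0^4 / (G^3 mu M^2), Eq. (9.113) with c restored; SI units."""
    total = m1 + m2
    mu = m1 * m2 / total
    return 5 / 256 * C**5 * a0**4 / (G**3 * mu * total**2)

def separation(t, a0, tc):
    """a(t) = a0 (1 - t/tc)^(1/4), Eq. (9.112), valid for 0 <= t <= tc."""
    return a0 * (1 - t / tc) ** 0.25

if __name__ == "__main__":
    # hypothetical example: two 1.4 M_sun stars separated by 10^9 m
    tc = coalescence_time(1.4 * M_SUN, 1.4 * M_SUN, 1.0e9)
    print(tc / 3.156e7, "years")
    print(separation(0.5 * tc, 1.0e9, tc))
```

The steep $a_0^4$ dependence means that doubling the initial separation lengthens the inspiral by a factor of 16.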


Figure 9.3: Example of waveforms from black hole (upper panel) and neutron star (lower panel)
binaries.

As a rule of thumb, our approximations (slow velocities and weak fields) break down at
$r \simeq r_{\rm ISCO}$. Since the last stage of the merger is fast, the estimate (9.113) is quite reliable.
From the exercise, we know that the amplitude is

\[ h_{ij} = \frac{4G^2\mu M}{ar}\, A_{ij} = \frac{h}{r}\, A_{ij} , \tag{9.114} \]

where the non-zero amplitudes are $A_{ij} \propto \sin(2\omega t + \phi)$. Thus the emitted gravitational wave
is monochromatic, with frequency twice the orbital frequency of the binaries,

\[ \nu_{\rm GW} = \frac{2\omega}{2\pi} = \frac{2}{P} = \frac{(GM)^{1/2}}{\pi a^{3/2}} = \nu_0 \left( 1 - \frac{t}{t_c} \right)^{-3/8} . \tag{9.115} \]

Moreover, the amplitude of the gravitational wave signal increases with time as

\[ h(t) \propto \frac{1}{a} \propto (t_c - t)^{-1/4} . \tag{9.116} \]

Expressed as function of the frequency $\nu_{\rm GW}$, the amplitude becomes

\[ h(t) = \frac{4G^2\mu M}{a} = 4G^2\mu M\, \frac{\omega^{2/3}}{(GM)^{1/3}} \equiv 4\pi^{2/3} G^{5/3} \mathcal{M}^{5/3} \nu_{\rm GW}^{2/3} . \tag{9.117} \]

In the last step, we introduced the chirp mass,

\[ \mathcal{M} \equiv \mu^{3/5} M^{2/5} = \frac{(m_1m_2)^{3/5}}{(m_1+m_2)^{1/5}} , \tag{9.118} \]

which is the combination of the masses $m_1$ and $m_2$ easiest to extract from the gravitational
wave signal. Finally, we have to replace the instantaneous phase in the polarisation tensor by
the time-integrated phase, since $\omega$ depends on time,

\[ \Phi(t) = \int dt\, 2\omega = -2 \left( \frac{t_c - t}{5G\mathcal{M}} \right)^{5/8} + \phi_0 . \tag{9.119} \]

Thus both the amplitude and the phase evolution of the gravitational wave signal provide
information on the chirp mass M.
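The dependence of the signal on the chirp mass alone can be made concrete with a few lines of code. This is a minimal sketch under stated assumptions: combining (9.113) and (9.115) gives $\nu_{\rm GW} = \pi^{-1}\,[5/(256\,\tau)]^{3/8}\,(G\mathcal{M}/c^3)^{-5/8}$ with $\tau = t_c - t$, where the factor $c^{-3}$ and the rounded constant are inserted by hand; the function names and the example masses are hypothetical.

```python
import math

G_MSUN_OVER_C3 = 4.925e-6  # G * M_sun / c^3 in seconds (rounded, assumed value)

def chirp_mass(m1, m2):
    """Chirp mass of Eq. (9.118), in the same units as m1 and m2."""
    return (m1 * m2) ** 0.6 / (m1 + m2) ** 0.2

def gw_frequency(m1, m2, tau):
    """GW frequency in Hz at time tau (seconds) before coalescence; masses in M_sun."""
    mc_seconds = chirp_mass(m1, m2) * G_MSUN_OVER_C3
    return (5 / (256 * tau)) ** 0.375 / (math.pi * mc_seconds ** 0.625)

if __name__ == "__main__":
    # hypothetical equal-mass binary of 30 + 30 solar masses
    for tau in (10.0, 1.0, 0.1):
        print(tau, gw_frequency(30.0, 30.0, tau))
```

Two binaries with different component masses but equal chirp mass give the same frequency evolution in this approximation, which is why the inspiral phase constrains $\mathcal{M}$ much better than $m_1$ and $m_2$ separately.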
A typical wave-form of the merger of a black hole binary is shown in the upper panel
of Fig. 9.3. It consists of the waves emitted during the inspiral (“the chirp”), the merger,
and the ring-down. In this last phase, oscillations of the BH formed during the merger are
damped by the emission of GWs and decay exponentially, leading to a standard Kerr BH. The
frequencies and the damping times of the eigenmodes of a BH can be calculated, and thus the
ring-down provides additional opportunities to test GR. The lower panel of Fig. 9.3 shows
a typical wave-form of a neutron star merger: When tidal interactions start to deform the
neutron stars, the gravitational wave signal is not monochromatic anymore and the structure
of the stars has to be accounted for.

9.A Appendix: Projection operator


We want to find the traceless transverse part $M^{TT}_{ik}$ of an arbitrary tensor $M_{ik}$. We start by searching
for an operator which projects any tensor on the two-dimensional subspace orthogonal to
the unit vector $n$. Any projection operator should satisfy

\[ P_\pm^2 = P_\pm , \qquad P_\pm P_\mp = 0 , \qquad \text{and} \qquad P_+ + P_- = 1 . \]

In our case, the desired projection operator is

\[ P_i{}^{j} = \delta_i^j - n_i n^j . \tag{9.120} \]

First, we show that this operator satisfies $P^2 = P$,

\[ P_i{}^{j} P_j{}^{k} = (\delta_i^j - n_i n^j)(\delta_j^k - n_j n^k) = \delta_i^k - n_i n^k = P_i{}^{k} . \tag{9.121} \]

Moreover, $n^i P_i{}^{j} v_j = 0$ holds for all vectors $v$; thus $P$ indeed projects any vector onto the subspace
orthogonal to $n$. Since a tensor is a multi-linear map, we have to apply a projection operator
on each of its indices,

\[ M^{T}_{kl} = P_k{}^{i} P_l{}^{j} M_{ij} . \tag{9.122} \]

The tensor $M^{T}_{kl}$ is transverse, $n^k M^{T}_{kl} = n^l M^{T}_{kl} = 0$, but in general not traceless,

\[ M^{T}{}_k{}^{k} = P_k{}^{i} P^{kj} M_{ij} = P^{ij} M_{ij} . \tag{9.123} \]

Subtracting the trace, we obtain the transverse, traceless part of $M$,

\[ M^{TT}_{kl} = \left( P_k{}^{i} P_l{}^{j} - \frac{1}{2} P_{kl} P^{ij} \right) M_{ij} . \tag{9.124} \]
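The projection (9.124) is easy to try out on a concrete tensor. The following sketch is an illustration in plain Python with nested lists (the function names `projector` and `tt_part` are hypothetical); it assumes Cartesian components, so that upper and lower spatial indices need not be distinguished.

```python
def projector(n):
    """P_ij = delta_ij - n_i n_j for a unit vector n, Eq. (9.120)."""
    return [[(1.0 if i == j else 0.0) - n[i] * n[j] for j in range(3)]
            for i in range(3)]

def tt_part(m, n):
    """Transverse-traceless part of a 3x3 tensor m with respect to n, Eq. (9.124)."""
    p = projector(n)
    idx = range(3)
    trace = sum(p[i][j] * m[i][j] for i in idx for j in idx)  # P^ij M_ij
    return [[sum(p[k][i] * p[l][j] * m[i][j] for i in idx for j in idx)
             - 0.5 * p[k][l] * trace
             for l in idx] for k in idx]

if __name__ == "__main__":
    m = [[1.0, 2.0, 3.0], [2.0, 5.0, 6.0], [3.0, 6.0, 9.0]]
    n = (0.0, 0.0, 1.0)
    for row in tt_part(m, n):
        print(row)
```

For propagation along the z axis only the xx, xy and yy components survive, and the xx and yy entries come out equal and opposite, as expected for the two polarisation states of a gravitational wave.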


Next we insert this projection operator into Eq. (9.77),

\[ L_{\rm gr} = \frac{G}{8\pi} \int d\Omega\, \dddot Q^{TT}_{ij}\, \dddot Q^{TT\, ij} . \tag{9.125} \]

We have to find the projection onto the radial unit vector $e_r$, whose Cartesian components
we denote as $(\hat x_1, \hat x_2, \hat x_3)$. Then

\[ \dddot Q^{TT}_{ij}\, \dddot Q^{TT\, ij} = \dddot Q_{ij}\dddot Q^{ij} - 2\, \dddot Q_i{}^{j}\, \dddot Q^{ik}\, \hat x_j \hat x_k + \frac{1}{2}\, \dddot Q^{ij}\, \dddot Q^{kl}\, \hat x_i \hat x_j \hat x_k \hat x_l . \tag{9.126} \]

Since $Q_{ij}$ is an integral over space, it does not depend on $e_r$ and can be taken out of the
integral. Then it follows

\[ \int d\Omega\, \hat x_i \hat x_j = \frac{4\pi}{3}\, \delta_{ij} \qquad \text{and} \qquad \int d\Omega\, \hat x_i \hat x_j \hat x_k \hat x_l = \frac{4\pi}{15} \left( \delta_{ij}\delta_{kl} + \delta_{ik}\delta_{jl} + \delta_{il}\delta_{jk} \right) . \tag{9.127} \]

To see the first result, we note that the only available symmetric tensor of rank two is
$\delta_{ij}$. Contracting the indices in $\int d\Omega\, \hat x_i \hat x_j = A\delta_{ij}$ and using $\int d\Omega = 4\pi$, it follows $A = 4\pi/3$. Using the same
line of argument, the integral with four $\hat x_i$ is evaluated. Combining everything, we obtain the
quadrupole formula (9.77) for the emission of gravitational waves.
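The two angular integrals in (9.127) can also be verified by brute force. The sketch below is not part of the original notes; it uses a simple midpoint rule in $\cos\vartheta$ and $\phi$, and the function name `sphere_integral` is hypothetical.

```python
import math

def sphere_integral(func, n_theta=200, n_phi=200):
    """Midpoint-rule approximation of the integral of func(x_hat) over the unit sphere."""
    d_cos = 2.0 / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for a in range(n_theta):
        c = -1.0 + (a + 0.5) * d_cos        # cos(theta) at the cell centre
        s = math.sqrt(1.0 - c * c)          # sin(theta)
        for b in range(n_phi):
            phi = (b + 0.5) * d_phi
            total += func((s * math.cos(phi), s * math.sin(phi), c))
    return total * d_cos * d_phi

if __name__ == "__main__":
    print(sphere_integral(lambda x: x[2] * x[2]), 4 * math.pi / 3)
    print(sphere_integral(lambda x: x[0] ** 2 * x[1] ** 2), 4 * math.pi / 15)
    print(sphere_integral(lambda x: x[2] ** 4), 3 * 4 * math.pi / 15)
```

The last line checks the case $i=j=k=l$, where all three products of Kronecker deltas contribute and the result is $3 \times 4\pi/15$.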

9.B Appendix: Derivation of the retarded Green function


We want to find the Green function for the gravitational wave equation (9.13). Starting from
Eq. (9.56),

\[ -\left( \frac{\partial^2}{\partial t^2} - \Delta \right) G(x - x') = \delta(\mathbf{x} - \mathbf{x}')\, \delta(t - t') , \tag{9.128} \]

we introduce relative coordinates, $\mathbf{r} = \mathbf{x} - \mathbf{x}'$ and $\tau = t - t'$, and perform an (asymmetric) Fourier
transformation in the time $t$,

\[ \int dt \left( \Delta - \frac{\partial^2}{\partial t^2} \right) G(\mathbf{r}, \tau)\, e^{i\omega t} = \delta(\mathbf{r}) \int dt\, \delta(t - t')\, e^{i\omega t} . \tag{9.129} \]

The integral on the RHS is trivial, and the one on the LHS defines $G(\mathbf{r}, \omega)$. Next we perform
the time derivatives, obtaining

\[ (\Delta + \omega^2)\, G(\mathbf{r}, \omega) = \delta(\mathbf{r})\, e^{i\omega t'} . \tag{9.130} \]

Thus the time dependence of the Green function is

\[ G(\mathbf{r}, \omega) = G_\omega(\mathbf{r})\, e^{i\omega t'} . \tag{9.131} \]

We are interested in spherically symmetric solutions, emitted by a source at $r = |\mathbf{r}| = 0$.
Then

\[ \frac{1}{r}\frac{d^2[rG_\omega(r)]}{dr^2} + \omega^2 G_\omega(r) = \delta(\mathbf{r}) . \tag{9.132} \]

For $r > 0$, the solution of

\[ \frac{d^2[rG_\omega(r)]}{dr^2} + \omega^2\, rG_\omega(r) = 0 \tag{9.133} \]

is

\[ G_\omega(r) = \frac{A e^{ikr}}{r} + \frac{B e^{-ikr}}{r} \equiv A\, G^{(+)}_\omega(r) + B\, G^{(-)}_\omega(r) . \tag{9.134} \]

Thus the solution consists of out- and in-going spherical waves. Next we consider the limit
$r \to 0$ (or the static limit) of the wave equation. Integrating over a small sphere of radius $r$,
we obtain

\[ \int d^3x\, \Delta G(r, \omega) = \oint dS_i\, \partial_i G(r, \omega) = 4\pi r^2\, \partial_r G(r, \omega) = 1 . \tag{9.135} \]

Here, we used Gauss’ theorem to convert the volume into a surface integral, while we could
neglect $\int d^3x\, \omega^2 G(r, \omega) \propto r^2$ for $r \to 0$. Moreover, we used that the integral over the delta
function on the RHS gives one. Thus the Green function for small $r$ satisfies

\[ G(r, \omega) = -\frac{1}{4\pi r} + C . \tag{9.136} \]

Comparing this to Eq. (9.134) fixes $A + B = -1/(4\pi)$ and $C = 0$. Finally, we transform back to
time,

\[ G^{(\pm)}(r, \tau) = \int \frac{d\omega}{2\pi}\, G^{(\pm)}(r, \omega)\, e^{-i\omega\tau} = -\frac{1}{4\pi r} \int \frac{d\omega}{2\pi}\, e^{-i\omega(\tau \mp r)} , \tag{9.137} \]

where we used $\omega = |k|$. Then it follows

\[ G^{(\pm)}(r, \tau) = -\frac{\delta(\tau \mp r)}{4\pi r} . \tag{9.138} \]

The delta function enforces $\tau = \pm r$. Since $r > 0$, the Green function $G^{(+)}$ includes only
sources with $\tau > 0$, i.e. along the past light-cone of the observer at $\{t, \mathbf{x}\}$, while the Green
function $G^{(-)}$ includes only sources with $\tau < 0$, i.e. along the future light-cone.
Finally, we comment on the differences between the classical and the quantum case:

• In classical physics, we use only positive energy solutions and the causal propagator
is the retarded one, which propagates these solutions forward in time. A relativistic
quantum theory contains in addition negative energy solutions. The causal or Feynman
propagator then propagates positive energy solutions (particles) forward, and negative energy
solutions (antiparticles) backward in time, in a way consistent with the CPT theorem.

• In the classical case, one eliminates the gauge freedom completely such that only physical
degrees of freedom propagate. Then it is sufficient to use a scalar Green function,
which propagates the physical polarisation states in the same way. Such gauges (like
the Coulomb or TT gauge) are however valid only in a specific frame. Therefore one
prefers in the quantum case a covariant gauge (like the Lorenz or harmonic gauge). These
gauges also include the instantaneous Coulomb or Newtonian interactions. The Green
function of a tensor of rank n becomes then a tensor of rank 2n.

• spherical versus plane waves.

10 Cosmological models for an homogeneous,
isotropic universe

10.1 Friedmann-Robertson-Walker metric for an homogeneous, isotropic universe

Einstein’s cosmological principle Einstein postulated that the Universe is homogeneous and
isotropic at each moment of its evolution. Note that a space isotropic around at least two
points is also homogeneous, while a homogeneous space is not necessarily isotropic. The CMB
provides excellent evidence that the universe is isotropic around us. Barring suggestions that
we live at a special place, the universe is also homogeneous.

Weyl’s postulate In 1923, Hermann Weyl postulated the existence of a privileged class
of observers in the universe, namely those following the “average” motion of galaxies. He
postulated that these observers follow time-like geodesics that never intersect. They may
however diverge from a point in the (finite or infinite) past or converge towards such a point
in the future.
Weyl’s postulate implies that we can find coordinates such that galaxies are at rest. These
coordinates are called comoving coordinates and can be constructed as follows: One chooses
first a space-like hypersurface. Through each point in this hypersurface lies a unique worldline
of a privileged observer. We choose the coordinate time such that it agrees with the proper-time
of all observers, $g_{00} = 1$, and the spatial coordinate vectors such that they are constant
and lie in the tangent space $T$ at this point. Then $u^a = \delta^a_0$ and for $n \in T$ it follows $n^a = (0, \mathbf{n})$
and

\[ 0 = u_a n^a = g_{ab} u^a n^b = g_{0\beta} n^\beta . \tag{10.1} \]

Since $\mathbf{n}$ is arbitrary, it follows $g_{0\beta} = 0$. Hence as a consequence of Weyl’s postulate we may
choose the metric as

\[ ds^2 = dt^2 - dl^2 = dt^2 - g_{\alpha\beta}\, dx^\alpha dx^\beta . \tag{10.2} \]

The cosmological principle constrains further the form of $dl^2$: Homogeneity requires that
the $g_{\alpha\beta}$ can depend on time only via a common factor $S^2(t)$, while isotropy requires that only
$\mathbf{x}\cdot\mathbf{x}$, $d\mathbf{x}\cdot\mathbf{x}$, and $d\mathbf{x}\cdot d\mathbf{x}$ enter $dl^2$. Hence

\[ dl^2 = C(r)(\mathbf{x}\cdot d\mathbf{x})^2 + D(r)\, (d\mathbf{x}\cdot d\mathbf{x}) = C(r)\, r^2 dr^2 + D(r) \left[ dr^2 + r^2 d\vartheta^2 + r^2\sin^2\vartheta\, d\phi^2 \right] . \tag{10.3} \]

We can eliminate the function $D(r)$ by the rescaling $r^2 \to Dr^2$. Thus the line-element becomes

\[ dl^2 = S^2(t) \left[ B(r)\, dr^2 + r^2 d\Omega \right] \tag{10.4} \]

with $d\Omega = d\vartheta^2 + \sin^2\vartheta\, d\phi^2$, while $B(r)$ is a function that we have still to specify.

Maximally symmetric spaces are spaces with constant curvature. Hence the Riemann ten-
sor of such spaces can depend only on the metric tensor and a constant K specifying the
curvature. The only form that respects the (anti-)symmetries of the Riemann tensor is

\[ R_{abcd} = K (g_{ac} g_{bd} - g_{ad} g_{bc}) . \tag{10.5} \]

Contracting $R_{abcd}$ with $g^{ac}$, we obtain in three dimensions for the Ricci tensor

\[ R_{bd} = g^{ac} R_{abcd} = K g^{ac} (g_{ac} g_{bd} - g_{ad} g_{bc}) = K (3g_{bd} - g_{bd}) = 2K g_{bd} . \tag{10.6} \]

A final contraction gives as curvature $R$ of a three-dimensional maximally symmetric space

\[ R = g^{ab} R_{ab} = 2K \delta^a_a = 6K . \tag{10.7} \]

A comparison of Eq. (10.6) with the Ricci tensor for the metric (10.4) will fix the still
unknown function $B(r)$. We proceed in the standard way: Calculation of the Christoffel
symbols with the help of the geodesic equations, then use of the definition (8.6) for the Ricci
tensor,

\begin{align}
R_{rr} &= \frac{1}{rB}\frac{dB}{dr} = 2K g_{rr} = 2KB , \tag{10.8}\\
R_{\vartheta\vartheta} &= 1 + \frac{r}{2B^2}\frac{dB}{dr} - \frac{1}{B} = 2K g_{\vartheta\vartheta} = 2K r^2 . \tag{10.9}
\end{align}

(The $\phi\phi$ equation contains no additional information.) Integration of (10.8) gives

\[ B = \frac{1}{A - Kr^2} \tag{10.10} \]

with $A$ as integration constant. Inserting the result into (10.9) determines $A$ as $A = 1$. Thus
we have determined the line-element of a maximally symmetric 3-space with curvature $K$ as

\[ dl^2 = \frac{dr^2}{1 - Kr^2} + r^2 (\sin^2\vartheta\, d\phi^2 + d\vartheta^2) . \tag{10.11} \]
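For readers who want to see the two steps between (10.8) and (10.11) spelled out, the following short worked derivation may help. It only uses Eqs. (10.8) and (10.9) as written above and assumes $K$ is constant.

```latex
\begin{align}
\frac{1}{rB}\frac{dB}{dr} = 2KB
\;&\Rightarrow\; -\frac{d}{dr}\frac{1}{B} = \frac{1}{B^2}\frac{dB}{dr} = 2Kr
\;\Rightarrow\; \frac{1}{B} = A - Kr^2 , \\
1 + \frac{r}{2B^2}\frac{dB}{dr} - \frac{1}{B}
&= 1 + Kr^2 - (A - Kr^2) = (1 - A) + 2Kr^2 \overset{!}{=} 2Kr^2
\;\Rightarrow\; A = 1 .
\end{align}
```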
Going over to the full four-dimensional line-element, we rescale for $K \neq 0$ the $r$ coordinate
by $r \to |K|^{1/2} r$. Then we absorb the factor $1/|K|$ in front of $dl^2$ by defining the scale factor
$R(t)$ as

\[ R(t) = \begin{cases} S(t)/|K|^{1/2} , & K \neq 0 , \\ S(t) , & K = 0 . \end{cases} \tag{10.12} \]

As result we obtain the Friedmann-Robertson-Walker (FRW) metric for an homogeneous,
isotropic universe

\[ ds^2 = dt^2 - R^2(t) \left[ \frac{dr^2}{1 - kr^2} + r^2 (\sin^2\vartheta\, d\phi^2 + d\vartheta^2) \right] \tag{10.13} \]

with $k = \pm 1$ (positive/negative curvature) or $k = 0$ (flat three-dimensional space). Finally,
we give two alternative forms of the FRW metric that are also often used. The first one uses
the conformal time $d\eta = dt/R$,

\[ ds^2 = R^2(\eta) \left( d\eta^2 - dl^2 \right) \tag{10.14} \]

and gives for $k = 0$ a conformally flat metric. In the second one, one introduces $r = \sin\chi$ for
$k = 1$. Then $dr = \cos\chi\, d\chi = (1 - r^2)^{1/2} d\chi$ and

\[ ds^2 = dt^2 - R^2(t) \left[ d\chi^2 + S^2(\chi) (\sin^2\vartheta\, d\phi^2 + d\vartheta^2) \right] \tag{10.15} \]

with $S(\chi) = \sin\chi = r$. Defining

\[ S(\chi) = \begin{cases} \sin\chi & \text{for } k = 1 , \\ \chi & \text{for } k = 0 , \\ \sinh\chi & \text{for } k = -1 , \end{cases} \tag{10.16} \]

the metric (10.15) is valid for all three values of $k$.

Note that the rescaling $r \to |K|^{1/2} r$ makes $r$ dimensionless, while $R$ has the dimension of a
length. Therefore one often introduces additionally a dimensionless scale factor $a(t) \equiv R(t)/R_0$.

10.2 Geometry of the Friedmann-Robertson-Walker metric


Geometry of the FRW spaces Let us consider a sphere of fixed radius at fixed time, $dr = dt = 0$.
The line-element $ds^2$ simplifies then to $R^2(t) r^2 (\sin^2\vartheta\, d\phi^2 + d\vartheta^2)$, which is the usual
line-element of a sphere $S^2$ with radius $rR(t)$. Thus the area of the sphere is $A = 4\pi (rR(t))^2 = 4\pi [S(\chi)R(t)]^2$
and the circumference of a circle is $L = 2\pi rR(t)$, while $rR(t)$ has the physical
meaning of a length.
By contrast, the radial distance between two points $(r, \vartheta, \phi)$ and $(r + dr, \vartheta, \phi)$ is
$dl = R(t)\, dr/\sqrt{1 - kr^2}$. Thus the radius of a sphere centered at $r = 0$ is

\[ l = R(t) \int_0^r \frac{dr'}{\sqrt{1 - kr'^2}} = R(t) \times \begin{cases} \arcsin(r) & \text{for } k = 1 , \\ r & \text{for } k = 0 , \\ \mathrm{arcsinh}(r) & \text{for } k = -1 . \end{cases} \tag{10.17} \]

Using $\chi$ as coordinate, the same result follows immediately,

\[ l = R(t) \int_0^{\chi(r)} d\chi = R(t)\chi . \tag{10.18} \]

Hence for $k = 0$, i.e. a flat space, one obtains the usual result $L/l = 2\pi$, while for $k = 1$
(spherical geometry) $L/l = 2\pi r/\arcsin(r) < 2\pi$ and for $k = -1$ (hyperbolic geometry)
$L/l = 2\pi r/\mathrm{arcsinh}(r) > 2\pi$.
For $k = 0$ and $k = -1$, $l$ is unbounded, while for $k = +1$ there exists a maximal distance
$l_{\max}(t)$. Hence the first two cases correspond to open spaces with an infinite volume, while the
latter is a closed space with finite volume.
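The ratio of circumference to radius for the three geometries can be tabulated with a few lines of code. This is a minimal sketch with hypothetical function names; it uses $r = S(\chi)$ from (10.16) and $l = R\chi$ from (10.18), so the scale factor drops out of the ratio.

```python
import math

def s_chi(chi, k):
    """S(chi) of Eq. (10.16) for k = +1, 0, -1."""
    if k == 1:
        return math.sin(chi)
    if k == 0:
        return chi
    if k == -1:
        return math.sinh(chi)
    raise ValueError("k must be +1, 0 or -1")

def circumference_over_radius(chi, k):
    """L / l = 2 pi S(chi) / chi for a circle of coordinate radius chi > 0."""
    return 2.0 * math.pi * s_chi(chi, k) / chi

if __name__ == "__main__":
    for k in (1, 0, -1):
        print(k, circumference_over_radius(0.5, k) / (2.0 * math.pi))
```

For small $\chi$ all three ratios approach $2\pi$, reflecting that any of these spaces looks flat on scales much smaller than the curvature radius.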

Hubble’s law Hubble found empirically that the spectral lines of “distant” galaxies are
redshifted, $z = \Delta\lambda/\lambda_0 > 0$, with a rate proportional to their distance $d$,

\[ cz = H_0 d . \tag{10.19} \]

If this redshift is interpreted as Doppler effect, $z = \Delta\lambda/\lambda_0 = v_r/c$, then the recession velocity
of galaxies follows as

\[ v = H_0 d . \tag{10.20} \]

The restriction “distant galaxies” means more precisely that $H_0 d \gg v_{\rm pec} \sim$ few × 100 km/s.
In other words, the peculiar motion of galaxies caused by the gravitational attraction of
nearby galaxy clusters should be small compared to the Hubble flow $H_0 d$. Note that the
interpretation of $v$ as recession velocity is problematic. The validity of such an interpretation
is certainly limited to $v \ll c$.
The parameter $H_0$ is called Hubble constant and has the value $H_0 \approx 71^{+4}_{-3}$ km/s/Mpc. We
will see soon that the Hubble law Eq. (10.20) is an approximation valid for $z \ll 1$. In general,
the Hubble constant is not constant but depends on time, $H = H(t)$, and we will call it
therefore Hubble parameter for $t \neq t_0$.
We can derive Hubble’s law by a Taylor expansion of $R(t)$,

\begin{align}
R(t) &= R(t_0) + (t - t_0)\dot R(t_0) + \frac{1}{2}(t - t_0)^2 \ddot R(t_0) + \ldots \tag{10.21}\\
&= R(t_0) \left[ 1 + (t - t_0) H_0 - \frac{1}{2}(t - t_0)^2 q_0 H_0^2 + \ldots \right] , \tag{10.22}
\end{align}

where

\[ H_0 \equiv \frac{\dot R(t_0)}{R(t_0)} \qquad \text{and} \qquad q_0 \equiv -\frac{\ddot R(t_0) R(t_0)}{\dot R^2(t_0)} , \tag{10.23} \]

and $q_0$ is called deceleration parameter: If the expansion is slowing down, $\ddot R < 0$ and $q_0 > 0$.
Hubble’s law follows now as an approximation for small redshift: For not too large
time-differences, we can use the expansion Eq. (10.21) and write

\[ 1 - z \approx \frac{1}{1+z} = \frac{R(t)}{R_0} \approx 1 + (t - t_0) H_0 . \tag{10.24} \]

Hence Hubble’s law, $z = (t_0 - t) H_0 = dH_0/c$, is valid as long as $z \approx H_0 (t_0 - t) \ll 1$. Deviations
from its linear form arise for $z \gtrsim 1$ and can be used to determine $q_0$.
Hubble’s law as consequence of homogeneity Consider Hubble’s law as a vector equation
with us at the center of the coordinate system,

\[ \mathbf{v} = H \mathbf{d} . \tag{10.25} \]

What does a different observer at position $\mathbf{d}'$ see? He has the velocity $\mathbf{v}' = H\mathbf{d}'$ relative to us.
We are assuming that velocities are small and thus

\[ \mathbf{v}'' \equiv \mathbf{v} - \mathbf{v}' = H(\mathbf{d} - \mathbf{d}') = H\mathbf{d}'' , \tag{10.26} \]

where $\mathbf{v}''$ and $\mathbf{d}''$ denote the velocity and the position relative to the new observer. A linear relation
between $\mathbf{v}$ and $\mathbf{d}$ as in Hubble’s law is the only relation compatible with homogeneity and thus the
“cosmological principle”.

Figure 10.1: An observer at position d′ sees the galaxy G receding with the speed
H(d − d′) = Hd′′, if the Hubble relation is linear.

Figure 10.2: World lines of a galaxy emitting light and an observer at comoving coordinates
r = 0 and r, respectively.


Lemaître’s redshift formula

A light-ray propagates with $v = c$ or $ds^2 = 0$. Assuming a galaxy at $r = 0$ and an observer
at $r$, i.e. light rays with $d\phi = d\vartheta = 0$, we rewrite the FRW metric as

\[ \frac{dt}{R} = \frac{dr}{\sqrt{1 - kr^2}} . \tag{10.27} \]

We integrate this expression between the emission and absorption times $t_1$ and $t_2$ of the first
light-ray,

\[ \int_{t_1}^{t_2} \frac{dt}{R} = \int_0^r \frac{dr}{\sqrt{1 - kr^2}} \tag{10.28} \]

and between $t_1 + \delta t_1$ and $t_2 + \delta t_2$ for the second light-ray (see also Fig. 10.2),

\[ \int_{t_1 + \delta t_1}^{t_2 + \delta t_2} \frac{dt}{R} = \int_0^r \frac{dr}{\sqrt{1 - kr^2}} . \tag{10.29} \]

The RHS’s are the same and thus we can equate the LHS’s,

\[ \int_{t_1}^{t_2} \frac{dt}{R} = \int_{t_1 + \delta t_1}^{t_2 + \delta t_2} \frac{dt}{R} . \tag{10.30} \]

We change the integration limits, subtracting the common interval $[t_1 + \delta t_1, t_2]$, and obtain

\[ \int_{t_1}^{t_1 + \delta t_1} \frac{dt}{R} = \int_{t_2}^{t_2 + \delta t_2} \frac{dt}{R} . \tag{10.31} \]

Now we choose the time intervals $\delta t_i$ as the time between two wave crests separated by the
wave lengths $\lambda_i$ of an electromagnetic wave. Since these time intervals are extremely short
compared to cosmological times, $\delta t_i = \lambda_i/c \ll t_i$, we can assume $R(t)$ as constant performing
the integrals and obtain

\[ \frac{\delta t_1}{R_1} = \frac{\delta t_2}{R_2} \qquad \text{or} \qquad \frac{\lambda_1}{R_1} = \frac{\lambda_2}{R_2} . \tag{10.32} \]

The redshift $z$ of an object is defined as the relative change in the wavelength between emission
and detection,

\[ z = \frac{\lambda_2 - \lambda_1}{\lambda_1} = \frac{\lambda_2}{\lambda_1} - 1 \tag{10.33} \]

or

\[ 1 + z = \frac{\lambda_2}{\lambda_1} = \frac{R_2}{R_1} . \tag{10.34} \]

Typically, the observation happens at the present epoch, and thus we set $1 + z = R_0/R(t)$.
This result is intuitively understandable, since the expansion of the universe stretches all
lengths including the wave-length of a photon. For a massless particle like the photon, $\nu = c/\lambda$
and $E = cp$, and thus its frequency (energy) and its wave-length (momentum) are affected in
the same way. By contrast, the energy of a non-relativistic particle with $E \approx mc^2$ is nearly
fixed.
A similar calculation as for the photon can be done for massive particles. Since the geodesic
equation for massive particles leads to a more involved calculation, we use in this case however
a different approach. We consider two comoving observers separated by the proper distance
$\delta l$. A massive particle with velocity $v$ needs the time $\delta t = \delta l/v$ to travel from observer one
to observer two. The relative velocity of the two observers is

\[ \delta u = \frac{\dot R}{R}\, \delta l = \frac{\dot R}{R}\, v\, \delta t = v\, \frac{\delta R}{R} . \tag{10.35} \]

Since we assume that the two observers are separated only infinitesimally, we can use the
addition law for velocities from special relativity for the calculation of the velocity $v'$ measured
by the second observer,

\[ v' = \frac{v - \delta u}{1 - v\, \delta u} = v - (1 - v^2)\, \delta u + O(\delta u^2) = v - (1 - v^2)\, v\, \frac{\delta R}{R} . \tag{10.36} \]

Introducing $\delta v = v - v'$, we obtain

\[ \frac{\delta v}{v (1 - v^2)} = \frac{\delta R}{R} , \tag{10.37} \]

and integrating this equation results in

\[ p = \frac{mv}{\sqrt{1 - v^2}} = \frac{\text{const.}}{R} . \tag{10.38} \]

Thus not the energy but the momentum $p = \hbar/\lambda$ of massive particles is red-shifted: The
kinetic energy of massive particles goes quadratically to zero, and hence peculiar velocities
relative to the Hubble flow are strongly damped by the expansion of the universe.

Luminosity and angular diameter distance


In an expanding universe, the distance to an object depends on the expansion history, the
behaviour of the scale factor $R(t)$, between the time of emission $t$ of the observed light and
its reception at $t_0$. From the metric (10.15) we can define the (radial) coordinate distance

\[ \chi = \int_t^{t_0} \frac{dt}{R(t)} \tag{10.39} \]

as well as the proper distance $d = \sqrt{|g_{\chi\chi}|}\, \chi = R(t)\chi$. The proper distance is however a
measurable quantity only for a static metric, and cosmologists therefore use other, operationally
defined measures for the distance. The two most important examples are the luminosity and
the angular diameter distances.

Luminosity distance The luminosity distance $d_L$ is defined such that the inverse-square law
between the luminosity $L$ of a source at distance $d$ and the received energy flux $\mathcal{F}$ is valid,

\[ d_L = \left( \frac{L}{4\pi\mathcal{F}} \right)^{1/2} . \tag{10.40} \]

Assume now that an (isotropically emitting) source with luminosity $L(t)$ and comoving coordinate
$\chi$ is observed at $t_0$ by an observer at O. The cut at O through the forward light cone
of the source emitted at $t_e$ defines a sphere $S^2$ with proper area

\[ A = 4\pi R^2(t_0)\, S^2(\chi) . \tag{10.41} \]

Two additional effects are that the frequency of a single photon is redshifted, $\nu_0 = \nu_e/(1+z)$,
and that the arrival rate of photons is reduced by the same factor due to time-dilation. Hence
the received flux is

\[ \mathcal{F}(t_0) = \frac{1}{(1+z)^2}\, \frac{L(t_e)}{4\pi R_0^2 S^2(\chi)} \tag{10.42} \]

and the luminosity distance in a FRW universe follows as

\[ d_L = (1+z)\, R_0\, S(\chi) . \tag{10.43} \]

Note that $d_L$ depends via $\chi$ on the expansion history of the universe between $t_e$ and $t_0$.
The observable is not the coordinate $\chi$ or $r$, but the redshift $z$ of a galaxy. Differentiating
$1 + z = R_0/R(t)$, we obtain

\[ dz = -\frac{R_0}{R^2}\, dR = -\frac{R_0}{R^2}\frac{dR}{dt}\, dt = -(1+z) H\, dt \tag{10.44} \]

or

\[ t_0 - t = \int_t^{t_0} dt = \int_0^z \frac{dz}{H(z)(1+z)} . \tag{10.45} \]

Inserting the relation (10.44) into Eq. (10.39), we find the coordinate $\chi$ of a galaxy at redshift
$z$ as

\[ \chi = \int_t^{t_0} \frac{dt}{R(t)} = \frac{1}{R_0} \int_0^z \frac{dz}{H(z)} . \tag{10.46} \]

For small redshift $z \ll 1$, we can use the expansion (10.22),

\begin{align}
\chi &= \int_t^{t_0} \frac{dt'}{R_0} \left[ 1 - (t_0 - t') H_0 + \ldots \right]^{-1} \tag{10.47}\\
&\approx \frac{1}{R_0} \left[ (t_0 - t) + \frac{1}{2}(t_0 - t)^2 H_0 + \ldots \right] = \frac{1}{R_0 H_0} \left[ z - \frac{1}{2}(1 + q_0) z^2 + \ldots \right] . \tag{10.48}
\end{align}
In practice, one observes only the luminosity within a certain frequency range instead of the
total (or bolometric) luminosity. A correction for this effect requires the knowledge of the
intrinsic source spectrum.

Angular diameter distance Instead of basing a distance measurement on standard candles,
one may use standard rods with known proper length $l$ whose angular diameter $\Delta\vartheta$ can be
observed. Then we define the angular diameter distance as

\[ d_A = \frac{l}{\Delta\vartheta} . \tag{10.49} \]

Since the rod has the proper length $l = R(t_e) S(\chi) \Delta\vartheta$ at the time of emission, it follows

\[ d_A = \frac{R_0 S(\chi)}{1+z} . \tag{10.50} \]

Thus at small distances, $z \ll 1$, the two definitions agree by construction, while for large
redshift the differences increase as $(1+z)^2$.


10.3 Friedmann equations


The FRW metric together with a perfect fluid as energy-momentum tensor gives for the
time-time component of the Einstein equation

\[ \ddot R = -\frac{4\pi G}{3} (\rho + 3P) R , \tag{10.51} \]

for the space-space components

\[ R\ddot R + 2\dot R^2 + 2k = 4\pi G (\rho - P) R^2 , \tag{10.52} \]

and 0 = 0 for the space-time components. Eliminating $\ddot R$ and showing explicitly the
contribution of a cosmological constant to the energy density $\rho$, the usual Friedmann equation
follows as

\[ H^2 \equiv \left( \frac{\dot R}{R} \right)^2 = \frac{8\pi}{3} G\rho - \frac{k}{R^2} + \frac{\Lambda}{3} , \tag{10.53} \]

while the “acceleration equation” is

\[ \frac{\ddot R}{R} = \frac{\Lambda}{3} - \frac{4\pi G}{3} (\rho + 3P) . \tag{10.54} \]

This equation determines the (de-)acceleration of the Universe as function of its matter and
energy content. “Normal” matter is characterized by $\rho > 0$ and $P \geq 0$. Thus a static solution
is impossible for a universe with $\Lambda = 0$. Such a universe is decelerating and since today $\dot R > 0$,
$\ddot R$ was always negative and there was a “big bang”.
We define the critical density $\rho_{\rm cr}$ as the density for which the spatial geometry of the
universe is flat. From $k = 0$, it follows

\[ \rho_{\rm cr} = \frac{3H_0^2}{8\pi G} \tag{10.55} \]

and thus $\rho_{\rm cr}$ is uniquely fixed by the value of $H_0$. One “hides” this dependence by introducing
$h$,

\[ H_0 = 100\, h\ \text{km/(s Mpc)} . \]

Then one can express the critical density as function of $h$,

\[ \rho_{\rm cr} = 2.77 \times 10^{11} h^2\, M_\odot/\text{Mpc}^3 = 1.88 \times 10^{-29} h^2\, \text{g/cm}^3 = 1.05 \times 10^{-5} h^2\, \text{GeV/cm}^3 . \]

Thus a flat universe with $H_0 = 100h$ km/s/Mpc requires an energy density of ∼ 10 protons
per cubic meter. We define the abundance $\Omega_i$ of the different players in cosmology as their
energy density relative to $\rho_{\rm cr}$, $\Omega_i = \rho_i/\rho_{\rm cr}$.
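The numerical values quoted for $\rho_{\rm cr}$ follow directly from (10.55). The sketch below redoes the unit conversion; the rounded constants and the function name `critical_density` are assumptions made for this illustration.

```python
import math

G = 6.674e-11          # m^3 kg^-1 s^-2 (rounded)
MPC_IN_M = 3.0857e22   # metres per Mpc (rounded)
M_SUN = 1.989e30       # kg (rounded)
M_PROTON = 1.6726e-27  # kg (rounded)

def critical_density(h):
    """rho_cr = 3 H0^2 / (8 pi G) in kg/m^3 for H0 = 100 h km/s/Mpc, Eq. (10.55)."""
    h0 = 100.0 * h * 1.0e3 / MPC_IN_M  # Hubble constant in 1/s
    return 3.0 * h0**2 / (8.0 * math.pi * G)

if __name__ == "__main__":
    rho = critical_density(1.0)
    print(rho * 1.0e-3, "g/cm^3")                     # kg/m^3 -> g/cm^3
    print(rho * MPC_IN_M**3 / M_SUN, "M_sun/Mpc^3")
    print(rho / M_PROTON, "protons per m^3")
```

Because $\rho_{\rm cr} \propto h^2$, a measured value such as $h \approx 0.7$ lowers all three numbers by roughly a factor of two.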
In the following, we will often include $\Lambda$ as another contribution to the energy density $\rho$ via

\[ \frac{8\pi}{3} G\rho_\Lambda = \frac{\Lambda}{3} . \tag{10.56} \]

Thereby one recognizes also that the cosmological constant acts as a constant energy density

\[ \rho_\Lambda = \frac{\Lambda}{8\pi G} \qquad \text{or} \qquad \Omega_\Lambda = \frac{\Lambda}{3H_0^2} . \tag{10.57} \]

We can understand better the physical properties of the cosmological constant by replacing
$\Lambda$ by $(8\pi G)\rho_\Lambda$. Now we can compare the effect of normal matter and of the $\Lambda$ term on the
acceleration,

\[ \frac{\ddot R}{R} = \frac{8\pi G}{3}\rho_\Lambda - \frac{4\pi G}{3}(\rho + 3P) . \tag{10.58} \]

Thus $\Lambda$ is equivalent to matter with an E.o.S. $w_\Lambda = P/\rho = -1$. This property can be checked
using only thermodynamics: With $P = -(\partial U/\partial V)_S$ and $U_\Lambda = \rho_\Lambda V$, it follows $P = -\rho$.
The borderline between an accelerating and decelerating universe is given by $\rho = -3P$ or
$w = -1/3$. The condition $\rho < -3P$ violates the so-called strong energy condition for “normal”
matter in equilibrium. An accelerating universe requires therefore a positive cosmological
constant or a dominating form of matter that is not in equilibrium.
Note that the energy contribution of relativistic matter, photons and possibly neutrinos,
is today much smaller than the one of non-relativistic matter (stars and cold dark matter).
Thus the pressure term in the acceleration equation can be neglected at the present epoch.
Measuring $\ddot R/R$, $\dot R/R$ and $\rho$ fixes therefore the geometry of the universe.

Thermodynamics The first law of thermodynamics becomes for a perfect fluid with $dS = 0$
simply

\[ dU = T dS - P dV = -P dV \tag{10.59} \]

or

\[ d(\rho R^3) = -P\, d(R^3) . \tag{10.60} \]

Dividing by $dt$,

\[ R\dot\rho + 3(\rho + P)\dot R = 0 , \tag{10.61} \]

we obtain our old result,

\[ \dot\rho = -3(\rho + P) H . \tag{10.62} \]

This result could be also derived from $\nabla_a T^{ab} = 0$. Moreover, the three equations are not
independent.

10.4 Scale-dependence of different energy forms


The dependence of different energy forms as function of the scale factor $R$ can be derived from
energy conservation, $dU = -P dV$, if an E.o.S. $P = P(\rho) = w\rho$ is specified. For $w$ = const.,
it follows

\[ d(\rho R^3) = -3P R^2 dR \tag{10.63} \]

or eliminating $P$

\[ \frac{d\rho}{dR} R^3 + 3\rho R^2 = -3w\rho R^2 . \tag{10.64} \]

Separating the variables,

\[ -3(1 + w) \frac{dR}{R} = \frac{d\rho}{\rho} , \tag{10.65} \]

we can integrate and obtain

\[ \rho \propto R^{-3(1+w)} = \begin{cases} R^{-3} & \text{for matter } (w = 0) , \\ R^{-4} & \text{for radiation } (w = 1/3) , \\ \text{const.} & \text{for } \Lambda\ (w = -1) . \end{cases} \tag{10.66} \]

This result can be understood also from heuristic arguments:

• (Non-relativistic) matter means that $kT \ll m$. Thus $\rho = nm \gg nT = P$ and non-relativistic
matter is pressure-less, $w = 0$. The mass $m$ is constant and $n \propto 1/R^3$, hence
$\rho$ is just diluted by the expansion of the universe, $\rho \propto 1/R^3$.
• Radiation is not only diluted but the energy of each single photon is additionally redshifted,
$E \propto 1/R$. Thus the energy density of radiation scales as $\propto 1/R^4$. Alternatively,
one can use that $\rho = aT^4$ and $T \propto \langle E \rangle \propto 1/R$.
• Cosmological constant $\Lambda$: From $\frac{8\pi}{3} G\rho_\Lambda = \frac{\Lambda}{3}$ one obtains that the cosmological constant
acts as an energy density $\rho_\Lambda = \frac{\Lambda}{8\pi G}$ that is constant in time, independent from a possible
expansion or contraction of the universe.
• Note that the scaling of the different energy forms is very different. It is therefore
surprising that “just today”, the energy in matter and due to the cosmological constant
is of the same order (“coincidence problem”).

Let us rewrite the Friedmann equation for the present epoch as


 
k 2 8πG Λ
= H0 ρ0 + − 1 = H02 (Ωtot,0 − 1) . (10.67)
R02 3H02 3H02

We express the curvature term for arbitrary times through Ωtot,0 and the redshift z as

\[ \frac{k}{R^2} = \frac{k}{R_0^2} (1+z)^2 = H_0^2 (\Omega_{\rm tot,0} - 1)(1+z)^2 . \tag{10.68} \]

Dividing the Friedmann equation (10.53) by H_0^2 = 8πGρ_cr/3, we obtain

\[ \frac{H^2(z)}{H_0^2} = \sum_i \Omega_i(z) - (\Omega_{\rm tot,0} - 1)(1+z)^2
 = \Omega_{\rm rad,0}(1+z)^4 + \Omega_{m,0}(1+z)^3 + \Omega_\Lambda - (\Omega_{\rm tot,0} - 1)(1+z)^2 . \tag{10.69} \]

This expression allows us to calculate the age of the universe (10.45), distances (10.43), etc. for
a given cosmological model, i.e. specifying the energy content Ωi,0 and the Hubble parameter
H0 at the present epoch.
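
As an illustration of how (10.69) is used, the following sketch evaluates H²(z)/H0² and the dimensionless age t0 H0. It assumes that the age integral referred to as (10.45) has the standard form t0 = ∫ da/(aH); the function names and the simple midpoint rule are choices made here, not part of the notes.

```python
import math


def hubble_ratio_sq(z, omega_m, omega_rad, omega_lambda):
    """H^2(z)/H_0^2 from Eq. (10.69); the curvature term follows from Omega_tot,0."""
    omega_tot = omega_m + omega_rad + omega_lambda
    zp1 = 1.0 + z
    return (omega_rad * zp1**4 + omega_m * zp1**3 + omega_lambda
            - (omega_tot - 1.0) * zp1**2)


def age_times_h0(omega_m, omega_rad, omega_lambda, steps=20_000):
    """t_0 H_0 = int_0^1 da / (a H/H_0), evaluated with the midpoint rule."""
    da = 1.0 / steps
    total = 0.0
    for i in range(steps):
        a = (i + 0.5) * da
        z = 1.0 / a - 1.0
        total += da / (a * math.sqrt(hubble_ratio_sq(z, omega_m, omega_rad, omega_lambda)))
    return total


if __name__ == "__main__":
    print(age_times_h0(1.0, 0.0, 0.0))   # flat, matter only
    print(age_times_h0(0.3, 0.0, 0.7))   # flat, matter plus Lambda
```

For a flat matter-only universe the integral reproduces t0 H0 = 2/3; adding a cosmological constant at fixed H0 makes the universe older.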

10.5 Cosmological models with one energy component


We consider a flat universe, k = 0, with one dominating energy component with E.o.S. w =
P/ρ = const. With ρ = ρ_cr (R/R_0)^{-3(1+w)}, the Friedmann equation becomes

\[ \dot R^2 = \frac{8\pi}{3} G \rho R^2 = H_0^2 R_0^{3+3w} R^{-(1+3w)} , \tag{10.70} \]

where we inserted the definition of ρ_cr = 3H_0^2/(8πG). Separating variables we obtain
\[ R_0^{-(3+3w)/2} \int_0^{R_0} dR \, R^{(1+3w)/2} = H_0 \int_0^{t_0} dt = t_0 H_0 \tag{10.71} \]


and hence the age of the Universe follows as
\[ t_0 H_0 = \frac{2}{3+3w} =
\begin{cases}
2/3 & \text{for matter } (w = 0) , \\
1/2 & \text{for radiation } (w = 1/3) , \\
\to \infty & \text{for } \Lambda \ (w = -1) .
\end{cases} \tag{10.72} \]

Models with w > −1 needed a finite time to expand from the initial singularity R(t = 0) = 0
to the current size R0 , while a Universe with only a Λ has no “beginning”.
In models with a hot big-bang, ρ, T → ∞ for t → 0, and we should expect that classical
gravity breaks down at some moment t∗ . As long as R ∝ tα with α < 1, most time elapsed
during the last fractions of t0 H0 . Hence our result for the age of the universe does not depend
on unknown physics close to the big-bang as long as w > −1/3.
If we integrate (10.71) up to an arbitrary time t, we obtain the time-dependence of the scale
factor,
\[ R(t) \propto t^{2/(3+3w)} =
\begin{cases}
t^{2/3} & \text{for matter } (w = 0) , \\
t^{1/2} & \text{for radiation } (w = 1/3) , \\
\exp(t) & \text{for } \Lambda \ (w = -1) .
\end{cases} \tag{10.73} \]

Age problem of the universe. The age of a matter-dominated universe is (expanded around
Ω0 = 1)
\[ t_0 = \frac{2}{3H_0} \left[ 1 - \frac{1}{5}(\Omega_0 - 1) + \ldots \right] . \tag{10.74} \]
Globular cluster ages require t0 ≥ 13 Gyr. Using Ω0 = 1 leads to H0 ≤ 2/(3 × 13 Gyr) =
1/(19.5 Gyr) or h ≤ 0.50. Thus a flat universe with t0 = 13 Gyr without cosmological constant
requires too small a value of H0. Choosing Ωm ≈ 0.3 increases the age by just 14%.
We derive the age t0 of a flat Universe with Ωm + ΩΛ = 1 in the next section as

\[ \frac{3 t_0 H_0}{2} = \frac{1}{\sqrt{\Omega_\Lambda}} \ln \frac{1 + \sqrt{\Omega_\Lambda}}{\sqrt{1 - \Omega_\Lambda}} . \tag{10.75} \]
Requiring H0 ≥ 65 km/s/Mpc and t0 ≥ 13 Gyr means that the function on the RHS should
be larger than 3 × 13 Gyr × 0.65/(2 × 9.8 Gyr) ≈ 1.3 or ΩΛ ≥ 0.55.
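
The two numerical statements of the age problem can be reproduced with a short sketch. It uses the Hubble time 9.8 Gyr/h quoted above; the function names and the bisection are illustrative choices and not part of the notes.

```python
import math

HUBBLE_TIME_GYR = 9.8  # 1/H_0 for h = 1 in Gyr, the value used in the text


def max_h_matter_only(t0_gyr):
    """Largest h compatible with the age t0 in a flat matter-only universe (t0 H0 = 2/3)."""
    return (2.0 / 3.0) * HUBBLE_TIME_GYR / t0_gyr


def age_factor(omega_lambda):
    """RHS of Eq. (10.75), i.e. (3/2) t0 H0 for a flat universe with matter and Lambda."""
    s = math.sqrt(omega_lambda)
    return math.log((1.0 + s) / math.sqrt(1.0 - omega_lambda)) / s


def min_omega_lambda(t0_gyr, h, lo=1e-6, hi=1.0 - 1e-9):
    """Bisect for the Omega_Lambda at which the age equals t0 (age_factor is increasing)."""
    target = 1.5 * t0_gyr * h / HUBBLE_TIME_GYR
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if age_factor(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(max_h_matter_only(13.0))
    print(min_omega_lambda(13.0, 0.65))
```

With these inputs the bound on h is close to 0.50 and the required ΩΛ lies slightly above 0.55, in line with the rounded numbers in the text.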

10.6 The ΛCDM model


We consider a flat Universe containing as its only two components pressure-less matter and
a cosmological constant, Ωm + ΩΛ = 1. Then the curvature term in the Friedmann equation
and the pressure term in the deceleration equation play no role and we can hope to solve
these equations for a(t). Multiplying the deceleration equation (10.54) by two and adding it
to the Friedmann equation (10.53), we eliminate ρ_m,
\[ 2\frac{\ddot a}{a} + \left(\frac{\dot a}{a}\right)^2 = \Lambda . \tag{10.76} \]
(We denote the scale factor in this section with a.) Next we rewrite first the LHS and then
the RHS as total time derivatives: With
\[ \frac{d}{dt}(a\dot a^2) = \dot a^3 + 2a\dot a\ddot a = \dot a a^2 \left[ \left(\frac{\dot a}{a}\right)^2 + 2\frac{\ddot a}{a} \right] , \tag{10.77} \]


Figure 10.3: The product t0 H0 as function of Ωm for an open universe containing only matter
(dotted blue line) and for a flat cosmological model with ΩΛ + Ωm = 1 (solid red line).

we obtain
\[ \frac{d}{dt}(a\dot a^2) = \dot a a^2 \Lambda = \frac{1}{3}\frac{d}{dt}(a^3)\,\Lambda . \tag{10.78} \]
Integrating is now trivial,
\[ a\dot a^2 = \frac{\Lambda}{3} a^3 + C . \tag{10.79} \]
The constant C can be determined most easily by setting a(t0 ) = 1 and comparing the
Friedmann equation (10.53) with (10.79) for t = t0 as C = 8πGρm,0 /3.
Next we introduce the new variable x = a^{3/2}. Then
\[ \frac{da}{dt} = \frac{dx}{dt}\frac{da}{dx} = \frac{dx}{dt}\,\frac{2x^{-1/3}}{3} , \tag{10.80} \]
and we obtain as new differential equation

\[ \dot x^2 - \frac{3\Lambda}{4} x^2 - \frac{9C}{4} = 0 . \tag{10.81} \]

Inserting the solution x(t) = A \sinh(\sqrt{3\Lambda}\,t/2) of the homogeneous equation fixes the constant
A as A = \sqrt{3C/\Lambda}. We can express A also by the current values of Ω_i as A^2 = Ω_m/Ω_Λ =
(1 − Ω_Λ)/Ω_Λ. Hence the time-dependence of the scale factor is

\[ a(t) = A^{2/3} \sinh^{2/3}(\sqrt{3\Lambda}\,t/2) . \tag{10.82} \]

The time-scale of the expansion is set by t_Λ = 2/\sqrt{3\Lambda}.
The present age t0 of the universe follows by setting a(t0) = 1 as
\[ t_0 = t_\Lambda \, {\rm arctanh}(\sqrt{\Omega_\Lambda}) . \tag{10.83} \]

The deceleration parameter q = −ä/(aH^2) is an important quantity for observational tests
of the ΛCDM model. We calculate first the Hubble parameter
\[ H(t) = \frac{\dot a}{a} = \frac{2}{3t_\Lambda} \coth(t/t_\Lambda) \tag{10.84} \]


Figure 10.4: The deceleration parameter q as function of t/t0 for the ΛCDM model and
various values for ΩΛ (0.1, 0.3, 0.5, 0.7 and 0.9 from the top to the bottom).

and then
\[ q(t) = \frac{1}{2}\left[ 1 - 3\tanh^2(t/t_\Lambda) \right] . \tag{10.85} \]
As expected, the limiting behavior of q, q = 1/2 for t → 0 and q = −1 for t → ∞, corresponds
to that of a flat universe with Ωm = 1 and with ΩΛ = 1, respectively. More interesting is the transition
region: as shown in Fig. 10.4, the transition from a decelerating to an accelerating universe
happens for ΩΛ = 0.7 at t ≈ 0.55 t0. This can easily be converted to a redshift, z∗ = a(t0)/a(t∗) −
1 ≈ 0.7, that is directly measured in supernova observations.
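
The transition time and redshift quoted above follow directly from (10.83) and (10.85) together with a(t) ∝ sinh^{2/3}(t/t_Λ). The sketch below works in units of t_Λ; the function names are illustrative only.

```python
import math


def t0_over_t_lambda(omega_lambda):
    """Present age in units of t_Lambda, Eq. (10.83)."""
    return math.atanh(math.sqrt(omega_lambda))


def deceleration(t_over_t_lambda):
    """q(t) from Eq. (10.85)."""
    return 0.5 * (1.0 - 3.0 * math.tanh(t_over_t_lambda) ** 2)


def transition(omega_lambda):
    """Return (t_*/t_0, z_*) at which q changes sign, using a ~ sinh^(2/3)(t/t_Lambda)."""
    t_star = math.atanh(1.0 / math.sqrt(3.0))   # q = 0  <=>  tanh^2(t/t_Lambda) = 1/3
    t_now = t0_over_t_lambda(omega_lambda)
    a_ratio = (math.sinh(t_now) / math.sinh(t_star)) ** (2.0 / 3.0)   # a(t0)/a(t*)
    return t_star / t_now, a_ratio - 1.0


if __name__ == "__main__":
    print(transition(0.7))
```

For ΩΛ = 0.7 this gives t∗/t0 close to 0.55 and z∗ close to 0.7, as stated in the text.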

10.7 Determining Λ and the curvature R0 from ρm,0 , H0 , q0


General discussion: We apply now the Friedmann and the acceleration equation to the
present time. Thus \dot R_0 = R_0 H_0, \ddot R_0 = −q_0 H_0^2 R_0 and we can neglect the pressure term in
Eq. (10.54),
\[ \frac{\ddot R_0}{R_0} = -q_0 H_0^2 = \frac{\Lambda}{3} - \frac{4\pi G}{3}\rho_{m,0} . \tag{10.86} \]
Thus we can determine the value of the cosmological constant from the observables ρm,0 , H0
and q0 via
Λ = 4πGρm,0 − 3q0 H02 . (10.87)
Solving next the Friedmann equation (10.53) for k/R_0^2,
\[ \frac{k}{R_0^2} = \frac{8\pi G}{3}\rho_{m,0} + \frac{\Lambda}{3} - H_0^2 , \tag{10.88} \]
we write ρ_{m,0} = Ω_m ρ_cr and insert Eq. (10.87) for Λ. Then we obtain for the curvature term
\[ \frac{k}{R_0^2} = \frac{H_0^2}{2}(3\Omega_m - 2q_0 - 2) . \tag{10.89} \]


Hence the sign of 3Ωm − 2q0 − 2 decides about the sign of k and thus the curvature of
the universe. For a universe without cosmological constant, Λ = 0, equation (10.87) gives
Ωm = 2q0 and thus

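\[ \begin{aligned}
k = -1 \ &\Leftrightarrow \ \Omega_m < 1 \ \Leftrightarrow \ q_0 < 1/2 , \\
k = 0 \ &\Leftrightarrow \ \Omega_m = 1 \ \Leftrightarrow \ q_0 = 1/2 , \\
k = +1 \ &\Leftrightarrow \ \Omega_m > 1 \ \Leftrightarrow \ q_0 > 1/2 .
\end{aligned} \tag{10.90} \]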

For a flat universe with Λ = 0, ρ_{m,0} = ρ_cr and k = 0,
\[ 0 = 4\pi G\,\frac{3H_0^2}{8\pi G} - H_0^2 (q_0 + 1) = H_0^2 \left( \frac{3}{2} - q_0 - 1 \right) , \tag{10.91} \]
and thus q0 = 1/2. In this special case, q0 < 1/2 means k = −1 and thus an infinite space
with negative curvature, while a finite space with positive curvature has q0 > 1/2.

Example: Comparison with observations: Use the Friedmann equations applied to the present
time to derive central values of Λ and k, R0 from the observables H0 ≈ (71 ± 4) km/s/Mpc and
ρ0 = (0.27 ± 0.04)ρcr , and q0 = −0.6. Discuss the allowed range and significance of the values.
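We evaluate first
\[ H_0^2 \approx \left( \frac{7.1 \times 10^6\,{\rm cm}}{{\rm s} \; 3.1 \times 10^{24}\,{\rm cm}} \right)^2 \approx 5.2 \times 10^{-36}\,{\rm s}^{-2} . \]
The value of the cosmological constant Λ follows as
\[ \Lambda = 4\pi G\rho_{m,0} - 3q_0 H_0^2 = 3H_0^2 \left( \frac{\rho}{2\rho_{\rm cr}} - q_0 \right) \approx 3H_0^2 \times \left( \frac{1}{2} \times 0.27 + 0.6 \right) \approx 0.73 \times 3H_0^2 \]
or ΩΛ = 0.73.
The curvature radius R follows as
\[ \frac{k}{R_0^2} = 4\pi G\rho_{m,0} - H_0^2 (q_0 + 1) = 3H_0^2 \left( \frac{\rho}{2\rho_{\rm cr}} - \frac{q_0 + 1}{3} \right) \tag{10.92} \]
\[ \phantom{\frac{k}{R_0^2}} = 3H_0^2 (0.135 \pm 0.02 - 0.4/3) = 3H_0^2 (0.002 \pm 0.02) , \tag{10.93} \]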

thus a flat universe (k = 0) is consistent with observations.
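
The arithmetic of this example can be condensed into two one-line functions based on (10.87) and (10.89), both expressed in units of H0². The function names are chosen here for illustration.

```python
def omega_lambda_from(omega_m, q0):
    """Lambda/(3 H0^2) from Eq. (10.87) with rho_m,0 = Omega_m rho_cr."""
    return 0.5 * omega_m - q0


def curvature_term(omega_m, q0):
    """k/(R0^2 H0^2) from Eq. (10.89)."""
    return 0.5 * (3.0 * omega_m - 2.0 * q0 - 2.0)


if __name__ == "__main__":
    om, q0 = 0.27, -0.6
    print(omega_lambda_from(om, q0))      # Omega_Lambda
    print(curvature_term(om, q0) / 3.0)   # curvature term in units of 3 H0^2
```

The curvature term equals Ωm + ΩΛ − 1, which gives a quick consistency check of the two formulas; for the central values it is much smaller than its uncertainty.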

10.8 Particle horizons


The particle horizon l_H is defined as the distance out to which one can observe a particle by
exchange of a light signal, i.e. it is the border of the region causally connected to the observer.
Without expansion, l_H = ct0, where t0 is the age of the universe. In an expanding universe,
the path the light has to travel will be stretched, dl_H = [R_0/R(t)] c dt, and thus
\[ l_H = cR_0 \int \frac{dt'}{R(t')} . \]
For a matter- or radiation-dominated universe R(t) = R_0 (t/t_0)^α with α = 2/3 and 1/2,
respectively. Both models start with an initial singularity at t = 0, and thus
\[ l_H(t_0) = c \int_0^{t_0} dt \left( \frac{t}{t_0} \right)^{-\alpha} = \frac{ct_0}{1-\alpha} . \]


The ratio
\[ \frac{l_H(t)}{R(t)} \propto \frac{t}{t^\alpha} \propto t^{1-\alpha} \]
gives the fraction of the Hubble horizon that was causally connected at time t < t0. Since
0 < α < 1, this fraction decreases going back in time.
For a universe dominated by a cosmological constant Λ > 0, R(t) = R_0 \exp(\sqrt{\Lambda/3}\,t) =
R_0 \exp(Ht) and thus
\[ l_H(t) = cR_0 \int_t^{t_0} \frac{dt'}{\exp(Ht')} = \frac{cR_0}{H} \left[ \exp(-Ht) - \exp(-Ht_0) \right] . \]

With R(t0) = R0 and thus t0 = 0,

\[ \frac{l_H(t)}{R_0} = \frac{c}{H} \left[ \exp(-Ht) - 1 \right] . \]
Since t < t0 = 0, the expression in the bracket is always larger than one and the causally
connected region is larger than the Hubble horizon. If exponential expansion would have
persisted for all times, then lH (t) → ∞ for t → −∞ and thus the whole universe would be
causally connected.

11 Cosmic relics

11.1 Time-line of important dates in the early universe


Different energy forms today. Let us summarize the relative importance of the various energy
forms today. The critical density ρ_cr = 3H_0^2/(8πG) has with h = 0.7 today the numerical
value ρ_cr ≈ 5.2 × 10^{-6} GeV/cm^3. This corresponds to roughly 5 to 6 protons per cubic
meter. However, the main player today is the cosmological constant with ΩΛ ≈ 0.73. Next comes
(pressure-less) matter with Ωm ≈ 0.27 that consists mostly of non-baryonic dark matter,
while only Ωb = 4% of the total energy density of the universe consists of matter that we
know. The energy density of cosmic microwave background (CMB) photons with temperature
T = 2.7 K = 2.3 × 10^{-4} eV is ρ_γ = aT^4 ≈ 0.25 eV/cm^3 or Ωγ ≈ 5 × 10^{-5}.
The contribution of the three neutrino flavors to the energy density depends on the unknown
absolute neutrino mass scale, 5 × 10^{-5} ≲ Ων ≲ 0.05. The lower bound corresponds to three
(effectively) massless neutrinos, the upper to one massive neutrino flavor with mν ∼ 0.3 eV.

Different energy forms as function of time The scaling of Ω_i with redshift z, 1 + z =
R_0/R(t), is given by

\[ H^2(z)/H_0^2 = \Omega_{m,0}(1+z)^3 + \Omega_{\rm rad,0}(1+z)^4 + \Omega_\Lambda - \underbrace{(\Omega_{\rm tot,0} - 1)(1+z)^2}_{\approx 0} . \tag{11.1} \]

Thus the relative importance of the different energy forms changes: Going back in time, one
enters first the matter-dominated and then the radiation-dominated epoch.
The cosmic triangle shown in Fig. 11.1 illustrates the evolution in time of the various energy
components and the resulting coincidence problem: Any universe with a non-zero positive
cosmological constant will be driven with time to a fix-point with Ωm , Ωk → 0. The only
other non-evolving state is a flat universe containing only matter—however, this solution is
unstable. Hence, the question arises why we live in an epoch where all energy components
have comparable size.

Temperature increase as T ∼ 1/R has three main effects: Firstly, bound states like atoms
and nuclei are dissolved when the temperature reaches their binding energy, T ≳ E_b. Secondly,
particles with mass m_X can be produced when T ≳ 2m_X, in reactions like γγ → X̄X. Thus
the early Universe consists of a plasma containing more and more heavy particles that are in
thermal equilibrium. Finally, most reaction rates Γ = nσv increase faster than the expansion
rate of the universe for t → 0, since n ∝ T^3 for relativistic particles, while H ∝ ρ_rad^{1/2} ∝ T^2.
Therefore, reactions that have become ineffective today were important in the early Universe.

Figure 11.1: The cosmic triangle (axes Ωm, Ωk and ΩΛ; regions OPEN, FLAT and CLOSED; models
OCDM, SCDM and ΛCDM) showing the time evolution of the various energy components.

Matter-radiation equilibrium zeq : The density of matter decreases slower than the energy
density of radiation. Going backward in time, there will therefore be a time when the densities
of matter and radiation were equal. Before that time with redshift z_eq, the universe was
radiation-dominated,
\[ \Omega_{\rm rad,0}(1 + z_{\rm eq})^4 = \Omega_{m,0}(1 + z_{\rm eq})^3 \tag{11.2} \]
or
\[ z_{\rm eq} = \frac{\Omega_{m,0}}{\Omega_{\rm rad,0}} - 1 \approx 5400 . \tag{11.3} \]

This time is important, because i) the time-dependence of the scale factor changes from
R ∝ t^{2/3} for a matter-dominated to R ∝ t^{1/2} for a radiation-dominated universe, ii) the E.o.S. and thus the
speed of sound changed from w ≈ 1/3, v_s^2 = (∂P/∂ρ)_S = c^2/3 to w ≈ 0, v_s^2 = 5kT/(3m) ≪ c^2.
The latter quantity determines the Jeans length and thus which structures in the Universe
can collapse.

Recombination zrec : Today, most hydrogen and helium in the interstellar and intergalactic
medium is neutral. Increasing the temperature, the fraction of ions and free electrons increases,
i.e. the reaction H + γ ↔ H^+ + e^− that is mainly controlled by the factor exp(−E_b/kT) will be
shifted to the right. By definition, we call recombination the time when 50% of all atoms are
ionized. A naive estimate gives kT ∼ E_b ≈ 13.6 eV ≈ 160,000 K or z_rec ≈ 60,000. However,
there are many more photons than hydrogen atoms, and therefore recombination happens
later: A more detailed calculation gives z_rec ∼ 1000.
Since the interaction probability of photons with neutral hydrogen is much smaller than with
electrons and protons, recombination marks the time when the Universe became transparent
to light.


Big Bang Nucleosynthesis At T_ns ∼ ∆ ≡ m_n − m_p ≈ 1.3 MeV or t ∼ 1 s, part of the protons
and neutrons form nuclei, mainly ^4He. As in the case of recombination, the large number
of photons delays nucleosynthesis relative to the estimate T_ns ≈ ∆ to T_ns ≈ 0.1 MeV.

Quark-hadron or QCD transition Above T ∼ mπ ∼ 100 MeV, hadrons like protons, neu-
trons or pions dissolve into their fundamental constituents, quarks q and gluons g.

Baryogenesis All the matter observed in the Universe consists of matter (protons and elec-
trons), and not of anti-matter (anti-protons and positrons). Thus the baryon-to-photon ratio
is
\[ \eta = \frac{n_b - n_{\bar b}}{n_\gamma} = \frac{n_b}{n_\gamma} = \frac{\Omega_b \rho_{\rm cr}/m_N}{2\zeta(3) T_\gamma^3/\pi^2} \approx 7 \times 10^{-10} . \tag{11.4} \]

The early plasma of quarks q and anti-quarks q̄ contained a tiny surplus of quarks. After all
anti-matter annihilated with matter, only the small surplus of matter remained. The tiny
asymmetry can be explained by interactions in the early Universe that were not completely
symmetric with respect to an exchange of matter-antimatter.

11.2 Equilibrium statistical physics in a nut-shell


The distribution function f(p) of a free gas of fermions or bosons in kinetic equilibrium is
\[ f(p) = \frac{1}{\exp[\beta(E - \mu)] \pm 1} \tag{11.5} \]
where β = 1/T denotes the inverse temperature, E = \sqrt{m^2 + p^2}, and +1 refers to fermions
and −1 to bosons, respectively. As we will see later, photons as massless particles stay in
equilibrium also in an expanding universe and may therefore serve as a thermal bath for other
particles. A species X stays in kinetic equilibrium, if e.g. in the reaction X + γ → X + γ the
energy exchange with photons is fast enough.
The chemical potential µ is the average energy needed if an additional particle is added,
dU = \sum_i µ_i dN_i. If the species X is also in chemical equilibrium with other species,
e.g. via the reaction X + X̄ ↔ γ + γ with photons, then their chemical potentials are related
by µ_X + µ_X̄ = 2µ_γ = 0.
The number density n, energy density ρ and pressure P of a species X follow as
\[ n = \frac{g}{(2\pi)^3} \int d^3p \, f(p) , \tag{11.6} \]
\[ \rho = \frac{g}{(2\pi)^3} \int d^3p \, E f(p) , \tag{11.7} \]
\[ P = \frac{g}{(2\pi)^3} \int d^3p \, \frac{p^2}{3E} f(p) . \tag{11.8} \]

The factor g takes into account the internal degrees of freedom like spin or color. Thus for a
photon, a massless spin-1 particle g = 2, for an electron g = 4, etc.


Derivation of the pressure integral for free quantum gas:


Comparing the first law of thermodynamics, dU = T dS − P dV, with the total differential
dU = (∂U/∂S)_V dS + (∂U/∂V)_S dV gives P = −(∂U/∂V)_S.
Since U = V \int E f(p) and S ∝ ln(V f(p)), differentiating U keeping S constant means P =
−V \int (∂E/∂V) f(p).
We write ∂E/∂V = (∂E/∂p)(∂p/∂L)(∂L/∂V). To evaluate this we note that ∂E/∂p = p/E,
that from V = L^3 it follows ∂L/∂V = 1/(3L^2) and that finally the quantization condition of
free particles, p_k = 2πk/L, implies ∂p/∂L = −p/L. Combined this gives ∂E/∂V = −p^2/(3EV).

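In the non-relativistic limit T ≪ m, e^{β(m−µ)} ≫ 1 and thus differences between bosons and
fermions disappear,
\[ n = \frac{g}{2\pi^2}\, e^{-\beta(m-\mu)} \int_0^\infty dp \, p^2 e^{-\beta \frac{p^2}{2m}} = g \left( \frac{mT}{2\pi} \right)^{3/2} \exp[-\beta(m - \mu)] , \tag{11.9} \]
\[ \rho = mn , \tag{11.10} \]
\[ P = nT \ll \rho . \tag{11.11} \]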

These expressions correspond to the classical Maxwell-Boltzmann statistics^1. The number of
non-relativistic particles is exponentially suppressed, if their chemical potential is small. Since
the number of protons per photon is indeed very small in the universe, cf. Eq. (11.4), and
therefore also the number of electrons (the universe should be neutral), the chemical potential
µ can be neglected in cosmology at least for protons and electrons.
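In the relativistic limit T ≫ m with T ≫ µ all properties of a gas are determined by its
temperature T,
\[ n = \frac{g}{2\pi^2}\, T^3 \int_0^\infty dx \, \frac{x^2}{e^x \pm 1} = \varepsilon_1 \frac{\zeta(3)}{\pi^2}\, g T^3 , \tag{11.12} \]
\[ \rho = \frac{g}{2\pi^2}\, T^4 \int_0^\infty dx \, \frac{x^3}{e^x \pm 1} = \varepsilon_2 \frac{\pi^2}{30}\, g T^4 , \tag{11.13} \]
\[ P = \rho/3 , \tag{11.14} \]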

where for bosons ε1 = ε2 = 1 and for fermions ε1 = 3/4 and ε2 = 7/8, respectively.
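
The numerical factors in (11.12) and (11.13) can be verified by evaluating the dimensionless integrals directly. The following is a minimal sketch with a midpoint rule; the function name, the cut-off `x_max` and the number of steps are arbitrary choices made here.

```python
import math

ZETA3 = 1.2020569031595943  # Riemann zeta(3)


def thermal_integral(power, sign, x_max=60.0, steps=20_000):
    """Midpoint-rule estimate of int_0^x_max dx x**power / (exp(x) + sign).

    sign = -1 corresponds to bosons, sign = +1 to fermions.
    """
    dx = x_max / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        total += x**power / (math.exp(x) + sign) * dx
    return total


if __name__ == "__main__":
    # Ratios fermion/boson: epsilon_1 for the number density, epsilon_2 for the energy density.
    print(thermal_integral(2, +1) / thermal_integral(2, -1))
    print(thermal_integral(3, +1) / thermal_integral(3, -1))
```

The boson integrals equal 2ζ(3) and π⁴/15, respectively, which leads to the prefactors ζ(3)/π² and π²/30 after multiplying with g/(2π²).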
Since the energy density and the pressure of non-relativistic species are exponentially sup-
pressed, the total energy density and the pressure of all species present in the universe can
be well-approximated including only relativistic ones,

\[ \rho_{\rm rad} = \frac{\pi^2}{30}\, g_* T^4 , \tag{11.15} \]
\[ P_{\rm rad} = \rho_{\rm rad}/3 = \frac{\pi^2}{90}\, g_* T^4 , \tag{11.16} \]
where
\[ g_* = \sum_{\rm bosons} g_i \left( \frac{T_i}{T} \right)^4 + \frac{7}{8} \sum_{\rm fermions} g_i \left( \frac{T_i}{T} \right)^4 . \tag{11.17} \]

Here we took into account that the temperature of different particle species can differ.
1 Integrals of the type \int_0^\infty dx \, x^{2n} e^{-ax^2} can be reduced to a Gaussian integral by differentiating with respect
to the parameter a.


Entropy Rewriting the first law of thermodynamics, dU = T dS − P dV, as
\[ dS = \frac{dU}{T} + \frac{P}{T}\, dV = \frac{d(V\rho)}{T} + \frac{P}{T}\, dV = \frac{V}{T}\frac{d\rho}{dT}\, dT + \frac{\rho + P}{T}\, dV \tag{11.18} \]
and comparing this expression with the total differential dS(T, V), one obtains
\[ \left( \frac{\partial S}{\partial V} \right)_T = \frac{\rho + P}{T} . \tag{11.19} \]
Since the RHS is independent of V for constant T, we can integrate and obtain
\[ S = \frac{\rho + P}{T}\, V + f(T) . \tag{11.20} \]
The integration constant f (T ) has to vanish to ensure that S is an extensive variable, S ∝ V .
The total entropy density s ≡ S/V of the universe can again be approximated by the rela-
tivistic species,
\[ s = \frac{2\pi^2}{45}\, g_{*S} T^3 , \tag{11.21} \]
where now
\[ g_{*S} = \sum_{\rm bosons} g_i \left( \frac{T_i}{T} \right)^3 + \frac{7}{8} \sum_{\rm fermions} g_i \left( \frac{T_i}{T} \right)^3 . \tag{11.22} \]
The entropy S is an important quantity because it is conserved during the evolution of the
universe. Conservation of S implies that S ∝ g_{*S} R^3 T^3 = const. and thus the temperature
of the Universe evolves as
\[ T \propto g_{*S}^{-1/3} R^{-1} . \tag{11.23} \]

When g∗ is constant, the temperature T ∝ 1/R. Consider now the case that a particle
species, e.g. electrons, becomes non-relativistic at T ∼ m_e. Then the particles annihilate,
e^+ + e^− → γγ, and their entropy is transferred to photons. Formally, g_{*S} decreases and therefore
the temperature decreases for a short period more slowly than T ∝ 1/R.
Since s ∝ R^{-3} and also the net number of particles with a conserved charge, e.g. n_B ≡
n_B − n_B̄ ∝ R^{-3} if baryon number B is conserved, the ratio n_B/s remains constant.

Relativistic degrees of freedom. To obtain the number of relativistic degrees of freedom
g∗ in the universe as function of T, we have to know the degrees of freedom of the various
particle species:
• The spin degrees of freedom of massive particles with spin s are 2s + 1, and of neutrinos
1, where we count particles and anti-particles separately. Massless bosons like photons
and gravitons are their own anti-particle and have 2 spin states.
• Below TQCD ∼ 250 MeV strongly interacting particles are bound in hadrons, while
above TQCD free quarks and gluons exist.
• Quarks have as additional label 3 colors, there are eight gluons.
• We assume that all species have the same temperature and approximate their contribu-
tion to g∗ by a step function ϑ(T − m).
Using the “Particle Data Book” to find the masses of the various particles, we can construct
g∗ as function of T as shown in table 11.1.


Temperature        new particles     4∆g∗                          4g∗
T < me             γ + νi            4 × (2 + 3 × 2 × 7/8)         29
me < T < mµ        e±                14                            43
mµ < T < mπ        µ±                14                            57
mπ < T < Tc        π±, π0            12                            69
Tc < T < ms        u, ū, d, d̄, g     6 × 14 + 4 × 8 × 2 − 12       205
ms < T < mc        s, s̄              3 × 14 = 42                   247
mc < T < mτ        c, c̄              42                            289
mτ < T < mb        τ±                14                            303
mb < T < mW,Z      b, b̄              42                            345
mW,Z < T < mh      W±, Z             4 × 3 × 3 = 36                381
mh < T < mt        h                 4                             385
mt < T < ?         t, t̄              42                            427

Table 11.1: The number of relativistic degrees of freedom g∗ present in the universe as function
of its temperature.
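
The bookkeeping behind Table 11.1 is a cumulative sum of the listed increments. The sketch below repeats it with exact fractions; the labels and the function name are illustrative, and the increments are those of the table.

```python
from fractions import Fraction

# (threshold label, increment of 4*g_* as listed in Table 11.1)
INCREMENTS = [
    ("T < m_e: photons + 3 neutrinos", 4 * (2 + Fraction(7, 8) * 3 * 2)),
    ("e+-", 14), ("mu+-", 14), ("pions", 12),
    ("u, d quarks and gluons (pions removed)", 6 * 14 + 4 * 8 * 2 - 12),
    ("s", 42), ("c", 42), ("tau+-", 14), ("b", 42),
    ("W+-, Z", 4 * 3 * 3), ("h", 4), ("t", 42),
]


def g_star_steps():
    """Return a list of (label, g_*) with the cumulative g_* after each threshold."""
    total = Fraction(0)
    steps = []
    for label, delta in INCREMENTS:
        total += delta
        steps.append((label, total / 4))
    return steps


if __name__ == "__main__":
    for label, g_star in g_star_steps():
        print(f"{label:45s} g_* = {float(g_star):7.2f}")
```

Two useful reference points are g∗ = 10.75 between the electron and the muon threshold, which is the value used for neutrino decoupling below, and g∗ = 106.75 when all particles of the standard model are relativistic.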

11.3 Big Bang Nucleosynthesis


Nuclear reactions in stars are supposed to produce all the observed heavier elements. However,
stellar reactions can explain at most a fraction of 5% of ^4He, while the production of the weakly
bound deuterium and Lithium-7 in stars is impossible. Thus the light elements up to Li-7 are
primordial: Y(D) = few × 10^{-5}, Y(^3H) = few × 10^{-5}, Y(^4He) ≈ 0.25, Y(^7Li) ≈ (1−2) × 10^{-7}.
The observational challenge is to find stars or gas clouds that are as “old” as possible and then to extrapolate
back to primordial values.

Estimate of 4 He production by stars:


The binding energy of ^4He is E_b = 28.3 MeV. If 1/4 of all nucleons were fused into ^4He during
t ∼ 10 Gyr, the luminosity-mass ratio would be
\[ \frac{L}{M_b} = \frac{1}{4}\, \frac{E_b}{4 m_p t} = 5\, \frac{\rm erg}{\rm g\, s} \approx 2.5\, \frac{L_\odot}{M_\odot} . \]

The observed luminosity-mass ratio is however only L/M_b ≤ 0.05 L_⊙/M_⊙. Assuming a roughly
constant luminosity of stars, they can produce only 0.05/2.5 ≈ 2% of the observed ^4He.
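
The estimate can be repeated numerically. The unit conversions and solar values below are standard rounded numbers inserted here as assumptions; the function name is illustrative.

```python
MEV_IN_ERG = 1.602e-6                  # 1 MeV in erg
PROTON_MASS_G = 1.673e-24              # proton mass in g
GYR_IN_S = 3.156e16                    # 1 Gyr in s
SOLAR_L_OVER_M = 3.83e33 / 1.989e33    # L_sun / M_sun in erg / (g s)


def helium_luminosity_to_mass(fraction=0.25, binding_mev=28.3, time_gyr=10.0):
    """Mean L/M_b in erg/(g s) if `fraction` of all nucleons is fused into 4He."""
    # Each nucleon bound in 4He releases binding_mev / 4.
    energy_per_gram = fraction * binding_mev * MEV_IN_ERG / (4.0 * PROTON_MASS_G)
    return energy_per_gram / (time_gyr * GYR_IN_S)


if __name__ == "__main__":
    l_over_m = helium_luminosity_to_mass()
    print(l_over_m, l_over_m / SOLAR_L_OVER_M)
```

The result is about 5 erg/(g s), i.e. between two and three times the solar luminosity-mass ratio, so that the observed value of 0.05 L_⊙/M_⊙ corresponds to a helium fraction of order 2%.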

Big Bang Nucleosynthesis (BBN) is controlled by two parameters: The mass difference
between protons and neutrons, ∆ ≡ m_n − m_p ≈ 1.3 MeV, and the freeze-out temperature T_f
of the reactions converting protons into neutrons and vice versa.

11.3.1 Equilibrium distributions


In the non-relativistic limit T ≪ m, the number density of the nuclear species with mass
number A and charge Z is
\[ n_A = g_A \left( \frac{m_A T}{2\pi} \right)^{3/2} \exp[\beta(\mu_A - m_A)] . \tag{11.24} \]
In chemical equilibrium, µ_A = Zµ_p + (A − Z)µ_n and we can eliminate µ_A by inserting the
equivalent expression of (11.24) for protons and neutrons,
\[ \exp(\beta\mu_A) = \exp[\beta(Z\mu_p + (A - Z)\mu_n)] = \frac{n_p^Z n_n^{A-Z}}{2^A} \left( \frac{2\pi}{m_N T} \right)^{3A/2} \exp[\beta(Z m_p + (A - Z) m_n)] . \tag{11.25} \]
Here and in the following we can set in the pre-factors m_p ≈ m_n ≈ m_N and m_A ≈ A m_N,
keeping the exact masses only in the exponentials. Inserting this expression for exp(βµ_A)
together with the definition of the binding energy of a nucleus, B_A = Z m_p + (A − Z) m_n − m_A,
we obtain
\[ n_A = g_A \left( \frac{2\pi}{m_N T} \right)^{3(A-1)/2} \frac{A^{3/2}}{2^A}\, n_p^Z n_n^{A-Z} \exp(\beta B_A) . \tag{11.26} \]
The mass fraction X_A contributed by a nuclear species is
\[ X_A = \frac{A n_A}{n_B} \quad \text{with} \quad n_B = n_p + n_n + \sum_i A_i n_{A_i} \quad \text{and} \quad \sum_i X_i = 1 . \tag{11.27} \]

With n_p^Z n_n^{A−Z}/n_N = X_p^Z X_n^{A−Z} n_N^{A−1} and n_B = η n_γ ∝ η T^3 and thus n_B^{A−1} ∝ η^{A−1} T^{3(A−1)}, we have
\[ X_A \propto \eta^{A-1} X_p^Z X_n^{A-Z} \left( \frac{T}{m_N} \right)^{3(A-1)/2} \exp(\beta B_A) . \tag{11.28} \]

The fact that η ≪ 1, i.e. that the number of photons per baryon is extremely large, means
that nuclei with A > 1 are much less abundant and that nucleosynthesis takes place later
than naively expected. Let us consider the particular case of deuterium in Eq. (11.28),
\[ \frac{X_D}{X_p X_n} = \frac{24\zeta(3)}{\sqrt{\pi}}\, \eta \left( \frac{T}{m_N} \right)^{3/2} \exp(\beta B_D) \tag{11.29} \]

with B_D = 2.23 MeV. The start of nucleosynthesis could be defined approximately by the
condition X_D/(X_p X_n) = 1, or T ≈ 0.1 MeV according to the left panel in Fig. 11.2.
The right panel of the same figure shows the results, if the equations (11.28) together with \sum_i X_i = 1
are solved for the lightest and stablest nuclei. Now it becomes clear that in thermal equi-
librium between 0.1 ≲ T ≲ 0.2 MeV essentially all free neutrons will bind to ^4He. For low
temperatures one cannot expect that the true abundance follows the equilibrium abundance,
Eq. (11.28), shown in Fig. 11.2. First, in the expanding universe the weak reactions that
convert protons and neutrons into each other will freeze out as soon as their rate drops below the expansion
rate of the universe. This effect will be discussed in the following in more detail. Second, the
Coulomb barrier will prevent the production of nuclei with Z ≫ 1. Third, neutrons are not
stable and decay.
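
The condition X_D/(X_p X_n) = 1 can be solved numerically for T from Eq. (11.29). The sketch below uses η = 7 × 10^{-10} from Eq. (11.4) and m_N ≈ 939 MeV, which is a value assumed here; the function names and the bisection bracket are illustrative.

```python
import math

ZETA3 = 1.2020569031595943  # Riemann zeta(3)


def deuterium_ratio(t_mev, eta=7e-10, b_d=2.23, m_n=939.0):
    """X_D / (X_p X_n) from Eq. (11.29); temperature, B_D and m_N in MeV."""
    prefactor = 24.0 * ZETA3 / math.sqrt(math.pi)
    return prefactor * eta * (t_mev / m_n) ** 1.5 * math.exp(b_d / t_mev)


def nucleosynthesis_start(lo=0.03, hi=1.0):
    """Bisect for the temperature (in MeV) at which the ratio equals one."""
    # For T < B_D / 1.5 the ratio decreases with T, so the root is bracketed by lo < hi.
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if deuterium_ratio(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(nucleosynthesis_start())
```

The root lies somewhat below 0.1 MeV, far below the binding energy B_D, which makes the delay caused by the small value of η explicit.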

11.3.2 Proton-neutron ratio


Gamow criterion The interaction depth τ = nlσ gives the probability that a test particle
interacts with cross section σ in a slab of length l filled with targets of density n. If τ ≫ 1,
interactions are efficient and the test particle is in thermal equilibrium with its surroundings.
We can apply the same criterion to the Universe: We say a particle species A is in thermal
equilibrium, as long as τ = nlσ = nσvt ≫ 1. The time t corresponds to the typical time-scale
for the expansion of the universe, t = (\dot R/R)^{-1} = H^{-1}. Note that this is also the typical
time-scale for changes in the temperature T. Thus we can rewrite this condition as

\[ \Gamma \equiv n\sigma v \gg H . \tag{11.30} \]

Figure 11.2: Relative equilibrium abundance X_D/(X_p X_n) of deuterium as function of tem-
perature T (left) and equilibrium mass fractions X_A of nucleons, D, ^3He, ^4He and
^12C as function of T (right).

A particle species ”goes out of equilibrium” when its interaction rate Γ becomes smaller than
the expansion rate H of the universe.

Decoupling of neutrinos The cross section of neutrinos in processes like n ↔ p + e^− + ν_e
or e^+ e^− ↔ ν̄ν is σ ∼ G_F^2 E^2. If we approximate the energy of all particle species by their
temperature T, their velocity by c and their density by n ∼ T^3, then the interaction rate of
weak processes is
\[ \Gamma \approx \langle v\sigma n_\nu \rangle \approx G_F^2 T^5 . \tag{11.31} \]
The early universe is radiation-dominated with ρ_rad ∝ 1/R^4, H = 1/(2t) and negligible cur-
vature k/R^2. Thus the Friedmann equation simplifies to H^2 = (8π/3)Gρ with ρ = g∗ (π^2/30) T^4,
or
\[ \frac{1}{2t} = H = 1.66 \sqrt{g_*}\, \frac{T^2}{M_{\rm Pl}} . \tag{11.32} \]

Here, we introduced also the Planck mass M_Pl = 1/\sqrt{G_N} ≈ 1.2 × 10^{19} GeV. Requiring
Γ(T_fr) = H(T_fr) gives as freeze-out temperature T_fr of weak processes
\[ T_{\rm fr} \approx \left( \frac{1.66 \sqrt{g_*}}{G_F^2 M_{\rm Pl}} \right)^{1/3} \approx 1\,{\rm MeV} \tag{11.33} \]
with g∗ = 10.75. The relation between time and temperature follows as
\[ \frac{t}{\rm s} = \frac{2.4}{\sqrt{g_*}} \left( \frac{\rm MeV}{T} \right)^2 . \tag{11.34} \]
Thus the time-sequence is as follows
• at T_fr ≈ 1 MeV: the neutron-proton ratio freezes-in and can be approximated by the
ratio of their equilibrium distribution in the non-relativistic limit.
• as the universe cools down from T_fr to T_ns, neutrons decay with lifetime τ_n ≈ 886 s.
• at T_ns, practically all neutrons are bound to ^4He, with only a small admixture of other
elements.
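
The numbers entering this time-line follow from Eqs. (11.32) and (11.33). The sketch below inserts rounded values for the Fermi constant, the Planck mass and ħ as assumptions; the function names are illustrative.

```python
import math

G_FERMI = 1.166e-5       # Fermi constant in GeV^-2
M_PLANCK = 1.22e19       # Planck mass in GeV
HBAR_MEV_S = 6.582e-22   # hbar in MeV s, converts 1/MeV into seconds


def freeze_out_temperature_mev(g_star=10.75):
    """T_fr in MeV from the condition Gamma = H, Eq. (11.33)."""
    t_gev = (1.66 * math.sqrt(g_star) / (G_FERMI**2 * M_PLANCK)) ** (1.0 / 3.0)
    return 1e3 * t_gev


def time_seconds(t_mev, g_star=10.75):
    """t = 1/(2H) in seconds, with H from Eq. (11.32)."""
    hubble_mev = 1.66 * math.sqrt(g_star) * t_mev**2 / (M_PLANCK * 1e3)
    return HBAR_MEV_S / (2.0 * hubble_mev)


if __name__ == "__main__":
    print(freeze_out_temperature_mev())
    print(time_seconds(1.0), time_seconds(0.1))
```

The freeze-out temperature comes out between 1 and 2 MeV, consistent with the order-of-magnitude value T_fr ≈ 1 MeV, and for g∗ = 1 the function `time_seconds` reproduces the coefficient 2.4 of Eq. (11.34).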

Proton-neutron ratio Above T_f, reactions like ν_e + n ↔ p + e^− keep nucleons in thermal
equilibrium. As we have seen, T_f ∼ 1 MeV and thus we can treat nucleons in the non-
relativistic limit. Then their relative abundance is given by the Boltzmann factor exp(−∆/T)
for T ≳ T_f with ∆ = m_n − m_p = 1.29 MeV for the mass difference of neutrons and protons.
Hence at T_f,
\[ \left. \frac{n_n}{n_p} \right|_{t=t_{\rm fr}} = \exp\left( -\frac{\Delta}{T_f} \right) \approx \frac{1}{6} . \tag{11.35} \]
As the universe cools down to T_ns, neutrons decay with lifetime τ_n ≈ 886 s,
\[ \left. \frac{n_n}{n_p} \right|_{t=t_{\rm ns}} \approx \frac{1}{6} \exp\left( -\frac{t_{\rm ns}}{\tau_n} \right) \approx \frac{2}{15} . \tag{11.36} \]

11.3.3 Estimate of helium abundance


The synthesis of ^4He proceeds through a chain of reactions, pn → dγ, dp → ^3He γ, d ^3He →
^4He p. Let's assume that ^4He formation takes place instantaneously. Moreover, we assume
that all neutrons are bound in ^4He. We need two neutrons to form one helium nucleus, n(^4He) =
n_n/2, and thus

\[ Y(^4{\rm He}) \equiv \frac{M(^4{\rm He})}{M_{\rm tot}} = \frac{4 m_N \times n_n/2}{m_N (n_p + n_n)} = \frac{2 n_n/n_p}{1 + n_n/n_p} = \frac{4}{17} \sim 0.235 . \tag{11.37} \]

Our naive estimate is not too far from Y ∼ 0.245.
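
The chain of estimates (11.35) to (11.37) is easy to follow numerically. In the sketch below the inputs T_f = 0.72 MeV and t_ns = 198 s are not taken from the notes; they are illustrative values chosen here such that the quoted ratios 1/6 and 2/15 are reproduced.

```python
import math

DELTA_MEV = 1.29   # neutron-proton mass difference in MeV
TAU_N_S = 886.0    # neutron lifetime in s


def ratio_at_freeze_out(t_f_mev):
    """n_n/n_p at freeze-out, Eq. (11.35)."""
    return math.exp(-DELTA_MEV / t_f_mev)


def ratio_at_nucleosynthesis(ratio_fr, t_ns_s):
    """n_n/n_p after free neutron decay until t_ns, Eq. (11.36)."""
    return ratio_fr * math.exp(-t_ns_s / TAU_N_S)


def helium_mass_fraction(ratio):
    """Y(4He) if all neutrons end up in 4He, Eq. (11.37)."""
    return 2.0 * ratio / (1.0 + ratio)


if __name__ == "__main__":
    r_fr = ratio_at_freeze_out(0.72)               # illustrative T_f in MeV
    r_ns = ratio_at_nucleosynthesis(r_fr, 198.0)   # illustrative t_ns in s
    print(r_fr, r_ns, helium_mass_fraction(r_ns))
```

Because n_n/n_p depends exponentially on ∆/T_f, small changes of the freeze-out temperature shift Y noticeably, which is the basis of the constraints discussed next.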


The dependence of Y(^4He) on the input physics is rather remarkable.
• The helium abundance depends exponentially on ∆ and T_f:
– The mass difference ∆ depends on both electromagnetic and strong interactions.
BBN tests therefore the time-dependence of fundamental interactions expected e.g.
in string theories.
– The freeze-out temperature T_fr depends on the number of relativistic degrees of freedom
g∗ and restricts thereby additional light particles.
– Y(^4He) is also sensitive to a non-zero chemical potential of neutrinos.
• A weaker dependence on the start of nucleosynthesis T_ns and thus η_b or Ω_b.

11.3.4 Results from detailed calculations


Detailed calculations predict not only the relative amount of light elements produced, but
also their absolute amount as function of e.g. the baryon-photon ratio η. Requiring that the
relative fraction of helium-4, deuterium and lithium-7 compared to hydrogen is consistent with
observation allows one to determine η or equivalently the baryon content, Ωb h2 = 0.019±0.001.
Although the binding energy per nucleon of Carbon-12 and Oxygen-16 is higher than that of
^4He, they are not produced: at the time of ^4He production the Coulomb barrier already prevents
fusion. Also, a stable element with A = 5 is missing.


Figure 11.3: Abundances of light-elements as function of η (left) and of the number of light
neutrino species (right).

11.4 Dark matter


11.4.1 Freeze-out of thermal relic particles
When the number density n_X of a particle species X is not changed by interactions, then it is
diluted just by the expansion of space, n_X ∝ R^{-3}. It is convenient to account for this trivial
expansion effect by dividing n_X through the entropy density s ∝ R^{-3}, i.e. to use the quantity
Y = n/s. We first consider again the equilibrium distribution Y_eq for µ_X = 0,
\[ Y_{\rm eq} = \frac{n_X}{s} =
\begin{cases}
\frac{45}{2\pi^4} \left( \frac{\pi}{8} \right)^{1/2} \frac{g_X}{g_{*S}}\, x^{3/2} \exp(-x) = 0.145\, \frac{g_X}{g_{*S}}\, x^{3/2} \exp(-x) & \text{for } x \gg 3 , \\
\frac{45\zeta(3)}{2\pi^4}\, \frac{\varepsilon g_X}{g_{*S}} = 0.278\, \frac{\varepsilon g_X}{g_{*S}} & \text{for } x \ll 3 ,
\end{cases} \tag{11.38} \]

where x = m/T and g_eff = εg_X with ε = 3/4 (ε = 1) for fermions (bosons). If the particle X is in chemical
equilibrium, its abundance is determined for T ≫ m by its contribution to the total number of
degrees of freedom of the plasma, while Y_eq is exponentially suppressed for T ≪ m (assuming
µ_X = 0). In an expanding universe, one may expect that the reaction rate Γ for processes
like γγ ↔ X̄X drops below the expansion rate H mainly for two reasons: i) Cross sections
may depend on energy as, e.g., weak processes σ ∝ s ∝ T^2 for s ≲ m_W^2, ii) the density n_X
decreases at least as n ∝ T^3. Around the freeze-out time x_f, the true abundance Y starts
to deviate from the equilibrium abundance Y_eq and becomes constant, Y(x) ≈ Y_eq(x_f) for
x ≳ x_f. This behavior is illustrated in Fig. 11.4.

Boltzmann equation When the number N = nV of a particle species is not changed by
interactions, then the expansion of the Universe dilutes their number density as n ∝ R^{-3}.
The corresponding change in time is connected with the expansion rate of the universe, the
Hubble parameter H = \dot R/R, as

\[ \frac{dn}{dt} = \frac{dn}{dR}\frac{dR}{dt} = -3n\, \frac{\dot R}{R} = -3Hn . \tag{11.39} \]


Additionally, there might be production and annihilation processes. While the annihilation
rate βn^2 = ⟨σ_ann v⟩ n^2 has to be proportional to n^2, we allow for an arbitrary function as
production rate ψ,
\[ \frac{dn}{dt} = -3Hn - \beta n^2 + \psi . \tag{11.40} \]
In a static Universe, dn/dt = 0 defines equilibrium distributions n_eq. Detailed balance requires
that the number of X particles produced in reactions like e^+ e^− → X̄X is in equilibrium equal
to the number that is destroyed in X̄X → e^+ e^−, or βn_eq^2 = ψ_eq. Since the reaction partners
(like the electrons in our example) are assumed to be in equilibrium, we can replace ψ = ψ_eq
by βn_eq^2 and obtain
\[ \frac{dn}{dt} = -3Hn - \langle \sigma_{\rm ann} v \rangle (n^2 - n_{\rm eq}^2) . \tag{11.41} \]
This equation together with the initial condition n ≈ n_eq for T → ∞ determines n(t) for a
given annihilation cross section σ_ann.
Next we rewrite the evolution equation for n(t) using the dimensionless variables Y and x.
Changing from n = sY to Y we can eliminate the 3Hn term,
\[ \frac{dn}{dt} = -3Hn + s\, \frac{dY}{dt} . \tag{11.42} \]
With (2t)−2 = H 2 ∝ ρ ∝ T 4 ∝ x−4 or t = t∗ x2 , we obtain
dY sx 
= − hσann vi Y 2 − Yeq
2
. (11.43)
dx H
Finally we recast the Boltzmann equation in a form that makes our intuitive Gamov criterion
explicit, " #

x dY ΓA Y 2
=− −1 (11.44)
Yeq dx H Yeq
with ΓA = neq hσann vi: The relative change of Y is controlled by the factor ΓA /H times
the deviation from equilibrium. The evolution of Y = nX /s is shown schematically in
Fig. 11.4: As the universe expands and cools down, nX decreases at least as R−3 . Therefore,
the annihilation rate ∝ n2 quenches and the abundance “freezes-out:” The reaction rates are
not longer sufficient to keep the particle in equilibrium and the ratio nX /s stays constant.
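The freeze-out behaviour described above can be illustrated numerically. The following is only a sketch under stated assumptions: the evolution equation is written as $dY/dx = -(\lambda/x^2)(Y^2 - Y_{\rm eq}^2)$, the form it takes for a temperature-independent cross section when $s \propto x^{-3}$, $H \propto x^{-2}$ and $dt/dx = 1/(xH)$, with a constant $\lambda$. The non-relativistic shape $Y_{\rm eq} = a\,x^{3/2}e^{-x}$, the value of `a`, the function names and the implicit Euler scheme are illustrative choices, not taken from the notes.

```python
import math


def y_eq(x, a=0.145):
    """Non-relativistic equilibrium abundance; the normalisation a is an assumed value."""
    return a * x**1.5 * math.exp(-x)


def integrate_freeze_out(lam, x_start=1.0, x_end=1000.0, steps=20000, a=0.145):
    """Integrate dY/dx = -(lam / x**2) * (Y**2 - Yeq**2) with implicit Euler steps.

    The grid is logarithmic in x. Each implicit step leads to the quadratic
    c*Y**2 + Y - rhs = 0, whose positive root is written in a form that
    avoids cancellation for small c.
    """
    log_step = math.log(x_end / x_start) / steps
    x = x_start
    y = y_eq(x, a)  # start in equilibrium
    for i in range(steps):
        x_new = x_start * math.exp((i + 1) * log_step)
        c = (x_new - x) * lam / x_new**2
        rhs = y + c * y_eq(x_new, a) ** 2
        y = 2.0 * rhs / (1.0 + math.sqrt(1.0 + 4.0 * c * rhs))
        x = x_new
    return y


if __name__ == "__main__":
    for lam in (1e5, 1e7, 1e9):  # hypothetical interaction strengths
        print(lam, integrate_freeze_out(lam))
```

A larger `lam`, i.e. a larger annihilation cross section, keeps $Y$ close to $Y_{\rm eq}$ for longer and should give a smaller asymptotic abundance, which is the trend sketched in Fig. 11.4.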
For the discussion of approximate solutions to this equation, it is convenient to distinguish
according to the freeze-out temperature: hot dark matter (HDM) with xf ≪ 3, cold dark
matter (CDM) with xf ≫ 3 and the intermediate case of warm dark matter with xf ∼ 3.

11.4.2 Hot dark matter


For $x_f \ll 3$, freeze-out occurs when the particle is still relativistic and $Y_{\rm eq}$ is not changing
with time. The asymptotic value of $Y$, $Y(x \to \infty) \equiv Y_\infty$, is just the equilibrium value at
freeze-out,
\[ Y_\infty = Y_{\rm eq}(x_f) = 0.278\,\frac{g_{\rm eff}}{g_{*S}} , \tag{11.45} \]
where the only temperature-dependence is contained in $g_{*S}$. The number density today is
then
\[ n_0 = s_0 Y_\infty = 2970\,Y_\infty\,{\rm cm}^{-3} = 825\,\frac{g_{\rm eff}}{g_{*S}}\,{\rm cm}^{-3} . \tag{11.46} \]

Figure 11.4: Illustration of the freeze-out process. The quantity $Y = n_X/s$ is $n_X$ divided by
the entropy density $s \propto R^{-3}$ to scale out the trivial effect of expansion. [Plot: $Y(x)/Y(x=0)$
versus $x = m/T$ on logarithmic axes; the curves leave equilibrium at $x_f$ and level off at $Y_\infty$,
which decreases for increasing $\sigma$.]

The numerical value of $s_0$ used will be discussed in the next paragraph. Although a HDM
particle was relativistic at freeze-out, it is non-relativistic today if its mass $m$ satisfies
$m \gg 3\,{\rm K} \approx 0.2$ meV. In this case its energy density is simply $\rho_0 = m s_0 Y_\infty$ and its abundance
$\Omega h^2 = \rho_0/\rho_{\rm cr}$ or
\[ \Omega h^2 = 7.8\times 10^{-2}\,\frac{m}{\rm eV}\,\frac{g_{\rm eff}}{g_{*S}} . \tag{11.47} \]
Hence HDM particles heavier than $O(100\,{\rm eV})$ overclose the universe.
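To get a feeling for the numbers, Eqs. (11.46) and (11.47) can be evaluated directly. The sketch below only wraps these two formulas; the function names and the example values $g_{\rm eff} = 1.5$ and $g_{*S} = 10.75$ are illustrative assumptions and not taken from the text.

```python
def hdm_number_density_cm3(g_eff, g_star_s):
    """Present-day number density of a hot relic in cm^-3, following Eq. (11.46)."""
    return 825.0 * g_eff / g_star_s


def hdm_omega_h2(mass_ev, g_eff, g_star_s):
    """Relic abundance Omega h^2 of a hot relic of mass mass_ev (in eV), Eq. (11.47)."""
    return 7.8e-2 * mass_ev * g_eff / g_star_s


def hdm_max_mass_ev(g_eff, g_star_s, omega_h2_max=1.0):
    """Mass in eV at which Eq. (11.47) reaches omega_h2_max."""
    return omega_h2_max * g_star_s / (7.8e-2 * g_eff)


if __name__ == "__main__":
    # Assumed example values, chosen only for illustration.
    g_eff, g_star_s = 1.5, 10.75
    print(hdm_number_density_cm3(g_eff, g_star_s))
    print(hdm_omega_h2(1.0, g_eff, g_star_s))
    print(hdm_max_mass_ev(g_eff, g_star_s))
```

For these assumed values the mass bound comes out at the order of 100 eV, in line with the estimate quoted above.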

11.4.3 Cold dark matter


Abundance of CDM For CDM with $x_f \gg 3$, freeze-out occurs when the particles are already
non-relativistic and $Y_{\rm eq}$ is changing exponentially with time. Thus the main problem is
to find $x_f$; for late times we use again $Y(x \to \infty) \equiv Y_\infty \approx Y(x_f)$, i.e. the equilibrium
value at freeze-out. We parametrize the temperature-dependence of the cross section as
$\langle\sigma_{\rm ann} v\rangle = \sigma_0 (T/m)^n = \sigma_0/x^n$. For simplicity, we consider only the most relevant case for CDM, $n = 0$
or s-wave annihilation. Then the Gamov criterion becomes with $H = 1.66\sqrt{g_*}\,T^2/M_{\rm Pl}$ and
$\Gamma_A = n_{\rm eq}\langle\sigma_{\rm ann} v\rangle$,
\[ g\left(\frac{m T_f}{2\pi}\right)^{3/2} \exp(-m/T_f)\,\sigma_0 = 1.66\sqrt{g_*}\,\frac{T_f^2}{M_{\rm Pl}} \tag{11.48} \]
or
\[ x_f^{-1/2}\exp(x_f) = 0.038\,\frac{g}{\sqrt{g_*}}\,M_{\rm Pl}\,m\,\sigma_0 \equiv C . \tag{11.49} \]
To obtain an approximate solution, we neglect first in
\[ \ln C = -\frac{1}{2}\ln x_f + x_f \tag{11.50} \]
the slowly varying term $\ln x_f$. Inserting next $x_f \approx \ln C$ into Eq. (11.50) to improve the
approximation gives then
\[ x_f = \ln C + \frac{1}{2}\ln(\ln C) . \tag{11.51} \]
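The quality of the approximation (11.51) can be checked by solving Eq. (11.50) numerically. The sketch below uses a simple fixed-point iteration $x \to \ln C + \frac{1}{2}\ln x$; the function names and the example value of $C$ are hypothetical and serve only as illustration.

```python
import math


def freeze_out_x_iterative(c_value, tol=1e-12, max_iter=200):
    """Solve ln C = x - 0.5 * ln x for x by fixed-point iteration (needs C > e)."""
    log_c = math.log(c_value)
    x = log_c  # zeroth approximation: neglect the slowly varying ln x term
    for _ in range(max_iter):
        x_new = log_c + 0.5 * math.log(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x


def freeze_out_x_approx(c_value):
    """One-step improvement of x = ln C, as in Eq. (11.51)."""
    log_c = math.log(c_value)
    return log_c + 0.5 * math.log(log_c)


if __name__ == "__main__":
    c_example = 1e10  # hypothetical value of C
    print(freeze_out_x_iterative(c_example), freeze_out_x_approx(c_example))
```

Because the map $x \to \ln C + \frac{1}{2}\ln x$ has a slope of only $1/(2x)$, the iteration converges within a few steps, and the single improvement step of Eq. (11.51) is already close to the converged value.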

The relic abundance for CDM follows from $n(x_f) = 1.66\sqrt{g_*}\,T_f^2/(\sigma_0 M_{\rm Pl})$ and
$n_0 = n(x_f)[R(x_f)/R_0]^3 = n(x_f)[g_{*,0}/g_{*,f}][T_0/T(x_f)]^3$ as
\[ \rho_0 = m n_0 \approx 10\,\frac{x_f T_0^3}{\sqrt{g_{*,f}}\,\sigma_0 M_{\rm Pl}} \tag{11.52} \]
or
\[ \Omega_X h^2 = \frac{m n_0}{\rho_{\rm cr}} \approx \frac{4\times 10^{-39}\,{\rm cm}^2}{\sigma_0}\,x_f . \tag{11.53} \]
Thus the abundance of a CDM particle is inversely proportional to its annihilation cross
section, since a more strongly interacting particle stays longer in equilibrium. Note that the
abundance depends only logarithmically on the mass $m$ via Eq. (11.51) and implicitly via
$g_{*,f}$ on the freeze-out temperature $T_f$. Typical values of $x_f$ found numerically for weakly
interacting massive particles (WIMPs) are $x_f \sim 20$. Partial-wave unitarity bounds $\sigma_{\rm ann}$ as
$\sigma_{\rm ann} \le c/m^2$. Requiring $\Omega < 0.3$ leads to $m < 20$-$50$ TeV. This bounds the mass of any
stable particle that was once in thermal equilibrium.
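Equation (11.53) is simple enough to evaluate in both directions. The following sketch wraps it in two small functions; the names are hypothetical, the default $x_f = 20$ is the typical WIMP value quoted above, and the target abundance used in the example is an assumed round number.

```python
def cdm_omega_h2(sigma0_cm2, x_f=20.0):
    """Relic abundance Omega_X h^2 for an s-wave cross section sigma0 in cm^2, Eq. (11.53)."""
    return 4e-39 * x_f / sigma0_cm2


def sigma0_for_abundance(omega_h2, x_f=20.0):
    """Cross section in cm^2 that yields the abundance omega_h2 according to Eq. (11.53)."""
    return 4e-39 * x_f / omega_h2


if __name__ == "__main__":
    target = 0.1  # assumed target value of Omega_X h^2
    sigma0 = sigma0_for_abundance(target)
    print(sigma0, cdm_omega_h2(sigma0))
```

The inverse proportionality is explicit here: doubling `sigma0_cm2` halves the returned abundance.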

Baryon abundance from freeze-out:

We can calculate the expected baryon abundance for a zero chemical potential using the formulas
derived above. Nucleons interact via pions; their annihilation cross section can be approximated
as $\langle\sigma v\rangle \approx m_\pi^{-2}$. With $C \approx 2\times 10^{19}$, it follows $x_f \approx 44$, $T_f \sim 22$ MeV and $Y_\infty = 7\times 10^{-20}$.
The observed baryon abundance is much larger and cannot be explained by a usual freeze-out
process.

Cold dark matter candidates


A particle suitable as CDM candidate should interact according to Eq. (11.53) with
$\sigma \sim 10^{-37}\,{\rm cm}^2$. It is surprising that the numerical values of $T_0$ and $M_{\rm Pl}$ conspire in Eq. (11.53) to
lead to a numerical value of $\sigma_0$ typical for weak interactions. Cold dark matter particles with
masses and interaction strengths around the weak scale were dubbed
“WIMPs”. An obvious candidate was a heavy neutrino, $m_\nu \sim 10$ GeV, which was excluded early by direct
DM searches, neutrino mass limits, and accelerator searches. Presently, the candidate with
most supporters is the lightest supersymmetric particle (LSP). Depending on the details of
the theory, it could be a neutralino (most favorable for detection) or other options. The
mass range open for thermal CDM particles is rather narrow: If a particle is too light, it becomes a
warm or hot dark matter particle. If it is too heavy, it overcloses the universe. There exists
however also the possibility that DM was never in thermal equilibrium. Two examples are
the axion (a particle proposed to solve the CP problem of QCD) and superheavy particles
(generically produced at the end of inflation). An overview of different CDM candidates is
given in Fig. 11.5.

Figure 11.5: Particles proposed as DM particle with $\Omega \sim 1$, the expected size of their cross
section and their mass. Red excluded; blue thermally and black non-thermally
produced. [Plot: $\log(\sigma/{\rm pbarn})$ versus $\log(m/{\rm GeV})$, showing SM neutrinos, WIMP, axion,
axino, SHDM and gravitino.]

12 Inflation and structure formation

12.1 Inflation
Shortcomings of the standard big-bang model

• Causality or horizon problem: why are even causally disconnected regions of the universe
homogeneous, as we discussed for the CMB?
The horizon grows like $t$, but the scale factor in the radiation- or matter-dominated epoch
only as $t^{1/2}$ or $t^{2/3}$, respectively. Thus for any scale $l$ contained today completely inside
the horizon, there exists a time $t < t_0$ where it crossed the horizon. A solution to the
horizon problem requires that $R$ grows faster than the horizon $t$. Since $R \propto t^{2/[3(1+w)]}$,
we need $w < -1/3$ (or $q < 0$, i.e. accelerated expansion of the universe).

• Flatness problem: the curvature term in the Friedmann equation is $k/R^2$. Thus this
term decreases slower than matter ($\propto 1/R^3$) or radiation ($\propto 1/R^4$), but faster than vacuum
energy. Let us rewrite the Friedmann equation as
\[ \frac{k}{R^2} = H^2\left(\frac{8\pi G}{3H^2}\rho + \frac{\Lambda}{3H^2} - 1\right) = H^2(\Omega_{\rm tot} - 1) . \tag{12.1} \]
The LHS scales as $(1+z)^2$, while $H^2$ scales for MD as $(1+z)^3$ and for RD
as $(1+z)^4$. General relativity is supposed to be valid until the energy scale $M_{\rm Pl}$.
Most of the time was RD, so we can estimate $1 + z_{\rm Pl} = (t_0/t_{\rm Pl})^{1/2} \sim 10^{30}$ ($t_{\rm Pl} \sim 10^{-43}$ s).
Thus if today $|\Omega_{\rm tot} - 1| \lesssim 1\%$, then the deviation had to be extremely small at $t_{\rm Pl}$,
$|\Omega_{\rm tot} - 1| \lesssim 10^{-2}/(1+z_{\rm Pl})^2 \approx 10^{-62}$!

Taking the time-derivative of
\[ |\Omega_{\rm tot} - 1| = \frac{|k|}{H^2R^2} = \frac{|k|}{\dot R^2} \tag{12.2} \]
gives
\[ \frac{d}{dt}|\Omega_{\rm tot} - 1| = \frac{d}{dt}\frac{|k|}{\dot R^2} = -\frac{2|k|\ddot R}{\dot R^3} < 0 \tag{12.3} \]
for $\ddot R > 0$. Thus $|\Omega_{\rm tot} - 1|$ increases if the universe decelerates, i.e. $\dot R$ decreases
(radiation/matter dominates), and decreases if the universe accelerates, i.e. $\dot R$ increases
(vacuum energy dominates). Thus again $q < 0$ (or $w < -1/3$) is needed.

• The standard big-bang model contains no source for the initial fluctuations required for
structure formation.
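The fine-tuning estimate in the flatness problem can be reproduced with a few lines. This is a sketch that assumes radiation domination throughout, so that $1 + z = (t_0/t)^{1/2}$ and $|\Omega_{\rm tot} - 1| \propto (1+z)^{-2}$; the function names and the round input values ($t_0 \sim 10^{17}$ s, $t_{\rm Pl} \sim 10^{-43}$ s, a present deviation of 1%) are illustrative.

```python
import math


def one_plus_z_radiation(t0_s, t_s):
    """Redshift factor 1 + z = (t0 / t)**0.5 for a purely radiation-dominated universe."""
    return math.sqrt(t0_s / t_s)


def flatness_deviation(t_s, deviation_today=1e-2, t0_s=1e17):
    """Scale |Omega_tot - 1| back in time using |Omega_tot - 1| ~ (1 + z)**-2."""
    return deviation_today / one_plus_z_radiation(t0_s, t_s) ** 2


if __name__ == "__main__":
    t_planck = 1e-43  # order-of-magnitude Planck time in seconds
    print(one_plus_z_radiation(1e17, t_planck))
    print(flatness_deviation(t_planck))
```

With these inputs the redshift factor is of order $10^{30}$ and the required initial deviation of order $10^{-62}$, matching the estimate in the text.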


Solution by inflation Inflation is a modification of the standard big-bang model where a
phase of accelerated expansion in the very early universe is introduced. A field called the
inflaton, with equation of state (E.o.S.) $w < -1/3$, is responsible for the expansion. We discuss briefly how inflation
solves the shortcomings of the standard big-bang model for the special case $w = -1$:
• Horizon problem: In contrast to the radiation or matter-dominated phase, the scale
factor grows during inflation faster than the horizon scale, $R(t_2)/R(t_1) = \exp[(t_2 - t_1)H] \gg t_2/t_1$.
Thus one can blow up a small region, causally connected at time $t_1$, to
superhorizon scales.
• Flatness problem: During inflation $\dot R = HR$, $R = R_0\exp(Ht)$ and thus
\[ \Omega_{\rm tot} - 1 = \frac{k}{\dot R^2} \propto \exp(-2Ht) . \tag{12.4} \]
Thus $\Omega_{\rm tot} - 1$ is driven exponentially towards zero.

• Inflation blows up quantum fluctuations to astronomical scales, generating initial
fluctuations without scale, $P_0(k) \propto k^{n_s}$ with $n_s \approx 1$, as required by observations.

12.1.1 Scalar fields in the expanding universe


Equation of state We consider a scalar field, Eq. (7.47), including a potential $V(\phi)$,
\[ \mathcal{L} = \frac{1}{2}\,g^{\mu\nu}\nabla_\mu\phi\,\nabla_\nu\phi - V(\phi) , \tag{12.5} \]
that could also be a mass term, $V(\phi) = m^2\phi^2/2$. We recall first the expressions for the
energy density $\rho = T^{00}$ and the pressure $P$,
\[ \rho = \frac{1}{2}\dot\phi^2 + V , \qquad P = \frac{1}{2}\dot\phi^2 - V \tag{12.6} \]
and as equation of state
\[ w = \frac{P}{\rho} = \frac{\dot\phi^2 - 2V(\phi)}{\dot\phi^2 + 2V(\phi)} \in [-1, 1] . \tag{12.7} \]
Thus a classical scalar field may act as dark energy, $w < 0$, leading to an accelerated expansion
of the Universe. A necessary condition is that the field is “slowly rolling”, i.e. that its kinetic
energy is smaller than its potential energy, $\dot\phi^2/2 < V(\phi)$.
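The range of the equation of state in Eq. (12.7) is easy to explore numerically. The helper below is a minimal sketch for a homogeneous field; the function name and the sample values of $\dot\phi$ and $V$ (in arbitrary units) are hypothetical.

```python
def scalar_field_eos(phi_dot, potential):
    """Return (rho, P, w) of a homogeneous scalar field, following Eqs. (12.6) and (12.7)."""
    kinetic = 0.5 * phi_dot**2
    rho = kinetic + potential
    pressure = kinetic - potential
    return rho, pressure, pressure / rho


if __name__ == "__main__":
    # Sample states: potential dominated, intermediate, kinetic dominated.
    for phi_dot, potential in [(0.0, 1.0), (1.0, 1.0), (1.0, 0.0)]:
        print(phi_dot, potential, scalar_field_eos(phi_dot, potential))
```

A field at rest gives $w = -1$ and a purely kinetic field $w = +1$. The value $w = -1/3$, which marks the onset of accelerated expansion, is reached for $\dot\phi^2 = V$.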

Field equation in a FRW background We use Eq. (7.47) including a potential $V(\phi)$ (that
could also be a mass term, $V(\phi) = m^2\phi^2/2$),
\[ \mathcal{L} = \frac{1}{2}\,g^{\mu\nu}\nabla_\mu\phi\,\nabla_\nu\phi - V(\phi) , \tag{12.8} \]
to derive the equations of motion for a scalar field in a flat FRW metric,
$g_{ab} = {\rm diag}(1, -a^2, -a^2, -a^2)$, $g^{ab} = {\rm diag}(1, -a^{-2}, -a^{-2}, -a^{-2})$, and $\sqrt{|g|} = a^3$. Varying the action
\[ S_{\rm KG} = \int_\Omega d^4x\, a^3\left[\frac{1}{2}\dot\phi^2 - \frac{1}{2a^2}(\nabla\phi)^2 - V(\phi)\right] \tag{12.9} \]
gives
\[ \begin{aligned}
\delta S_{\rm KG} &= \int_\Omega d^4x\, a^3\left[\dot\phi\,\delta\dot\phi - \frac{1}{a^2}(\nabla\phi)\cdot\delta(\nabla\phi) - V'\delta\phi\right] \\
&= \int_\Omega d^4x \left[-\frac{d}{dt}(a^3\dot\phi) + a\nabla^2\phi - a^3V'\right]\delta\phi \\
&= \int_\Omega d^4x\, a^3\left[-\ddot\phi - 3H\dot\phi + \frac{1}{a^2}\nabla^2\phi - V'\right]\delta\phi \overset{!}{=} 0 .
\end{aligned} \tag{12.10} \]
Thus the field equation for a Klein-Gordon field in a FRW background is
\[ \ddot\phi + 3H\dot\phi - \frac{1}{a^2}\nabla^2\phi + V' = 0 . \tag{12.11} \]

The term $3H\dot\phi$ acts in an expanding universe as a friction term for the oscillating $\phi$ field.
Moreover, the gradient of $\phi$ is also suppressed for increasing $a$; this term can be therefore
often neglected in an expanding universe.

Number of e-foldings and slow roll conditions We can integrate $\dot R = RH$ for an arbitrary
time-evolution of $H$,
\[ R(t) = R(t_0)\exp\left(\int dt\, H(t)\right) . \tag{12.12} \]
If we define the number $N$ of e-foldings as $N = \ln(R_2/R_1)$, then
\[ N = \ln\frac{R_2}{R_1} = \int dt\, H(t) = \int \frac{d\phi}{\dot\phi}\, H(t) . \tag{12.13} \]
With $\ddot\phi + 3H\dot\phi + V' = 0$ or
\[ \dot\phi = -\frac{\ddot\phi + V'}{3H} \approx -\frac{V'}{3H} \tag{12.14} \]
and the Friedmann equation $H^2 = 8\pi G V/3$ it follows
\[ N = -\int d\phi\,\frac{3H^2}{V'} = -\int d\phi\,\frac{8\pi G V}{V'} \gg 1 . \tag{12.15} \]
Successful inflation requires $N \gtrsim 40$ and thus
\[ \varepsilon \equiv \frac{1}{2}\,\frac{1}{8\pi G}\left(\frac{V'}{V}\right)^2 \ll 1 . \tag{12.16} \]
In addition to a large $V$ and a flat slope $V'$, the potential energy can dominate only if
$|\ddot\phi| \ll |V'|$. Then the field equation reduces to $V' \approx -3H\dot\phi$, or after differentiating to $V''\dot\phi \approx -3H\ddot\phi$.
Thus another condition for inflation is
\[ 1 \gg \frac{|\ddot\phi|}{|V'|} \approx \frac{|V''\dot\phi|}{|3HV'|} \approx \frac{|V''|}{24\pi G V} \tag{12.17} \]
and one defines as second slow-roll condition
\[ \eta \equiv \frac{V''}{8\pi G V} \ll 1 . \tag{12.18} \]
Hence inflation requires a large $V$, a flat slope $V'$ and a small curvature $V''$ of the potential.
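As a concrete illustration of the slow-roll conditions, consider the mass term $V = m^2\phi^2/2$ mentioned above. The sketch below works in units with $8\pi G = 1$, an assumption made here for brevity; in these units $V'/V = 2/\phi$, so that $\varepsilon = \eta = 2/\phi^2$, and the integral in Eq. (12.15) between $\phi_{\rm start}$ and $\phi_{\rm end}$ gives $N = (\phi_{\rm start}^2 - \phi_{\rm end}^2)/4$. The function names and the choice to end inflation at $\varepsilon = 1$ are illustrative.

```python
import math

PHI_END = math.sqrt(2.0)  # field value where epsilon = 1 (units with 8*pi*G = 1)


def epsilon(phi):
    """First slow-roll parameter for V = m^2 phi^2 / 2 in units with 8*pi*G = 1."""
    return 2.0 / phi**2


def eta(phi):
    """Second slow-roll parameter for the same potential and units."""
    return 2.0 / phi**2


def efolds(phi_start, phi_end=PHI_END):
    """Number of e-foldings while the field rolls from phi_start down to phi_end."""
    return (phi_start**2 - phi_end**2) / 4.0


def phi_start_for_efolds(n_efolds, phi_end=PHI_END):
    """Initial field value needed for n_efolds e-foldings."""
    return math.sqrt(4.0 * n_efolds + phi_end**2)


if __name__ == "__main__":
    phi0 = phi_start_for_efolds(60.0)
    print(phi0, epsilon(phi0), eta(phi0), efolds(phi0))
```

In this toy case the mass $m$ drops out of $\varepsilon$, $\eta$ and $N$; only the displacement of the field in units of $(8\pi G)^{-1/2}$ matters.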


Solutions of the KG field equation in a FRW background Next we want to rewrite the KG
equation as the one for a harmonic oscillator with a time-dependent oscillation frequency.
We introduce first the conformal time $d\eta = dt/a$ and denote derivatives with respect to $\eta$ by a prime,
\[ \dot\phi = \frac{d\phi}{dt} = \frac{d\phi}{d\eta}\frac{d\eta}{dt} = \frac{1}{a}\,\phi' , \tag{12.19} \]
\[ \ddot\phi = \frac{1}{a}\frac{d}{d\eta}\left(\frac{1}{a}\,\phi'\right) = \frac{1}{a^2}\,\phi'' - \frac{a'}{a^3}\,\phi' , \tag{12.20} \]
and express also the Hubble parameter as function of $\eta$,
\[ H = \frac{\dot a}{a} = \frac{a'}{a^2} \equiv \frac{\mathcal{H}}{a} . \tag{12.21} \]
Inserting these expressions into Eq. (12.11) and multiplying with $a^2$ gives
\[ \phi'' + 2\mathcal{H}\phi' - \nabla^2\phi + a^2 V_{,\phi} = 0 , \tag{12.22} \]
where $V_{,\phi} = dV/d\phi$. Performing then a Fourier transformation,
\[ \phi(x, t) = \sum_k \phi_k(t)\,e^{ikx} , \tag{12.23} \]
and using as potential a mass term, $V_{,\phi} = m^2\phi$, we obtain
\[ \phi_k'' + 2\mathcal{H}\phi_k' + (k^2 + m^2a^2)\phi_k = 0 . \tag{12.24} \]
Finally, we can eliminate the friction term $2\mathcal{H}\phi_k'$ by introducing $\phi_k(\eta) = u_k(\eta)/a$. Then a
harmonic oscillator equation for $u_k$,
\[ u_k'' + \omega_k^2 u_k = 0 , \tag{12.25} \]
with the time-dependent frequency
\[ \omega_k^2(\eta) = k^2 + m^2a^2 - \frac{a''}{a} \tag{12.26} \]
results. You can check that the action for the field $u$ using conformal coordinates $\eta, x$ is
mathematically equivalent to the one of a scalar field in Minkowski space with time-dependent
mass $m_{\rm eff}^2(\eta) = m^2a^2 - a''/a$. This time-dependence appears, because the gravitational field
can perform work on the field $u$. Alternatively, we could show that “the” vacuum at different
times $\eta$ is not the same, because we compare the vacuum for fields with different effective
masses, leading to particle production. For an excellent introduction into this subject see the
book by V. F. Mukhanov and S. Winitzki, “Introduction to Quantum Effects in Gravity”; for a
free pdf file of the draft version see http://sites.google.com/site/winitzki/.
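The elimination of the friction term in the step from Eq. (12.24) to Eq. (12.25) can be checked explicitly. The following short derivation is a sketch that uses only the definitions $\phi_k = u_k/a$ and $\mathcal{H} = a'/a$:

```latex
\begin{align*}
\phi_k' &= \frac{u_k'}{a} - \frac{a'}{a^2}\,u_k , \\
\phi_k'' &= \frac{u_k''}{a} - 2\,\frac{a'}{a^2}\,u_k' - \frac{a''}{a^2}\,u_k + 2\,\frac{a'^2}{a^3}\,u_k , \\
\phi_k'' + 2\mathcal{H}\phi_k' &= \frac{1}{a}\left(u_k'' - \frac{a''}{a}\,u_k\right) .
\end{align*}
```

The terms proportional to $u_k'$ and to $a'^2$ cancel between $\phi_k''$ and $2\mathcal{H}\phi_k'$. Multiplying Eq. (12.24) by $a$ then leaves $u_k'' + (k^2 + m^2a^2 - a''/a)\,u_k = 0$, which is Eq. (12.25) with the frequency (12.26).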
We consider now as two limiting cases the short- and the long-wavelength limit. In the
first case, $k^2 + m^2a^2 \gg a''/a$, the field equation is conformally equivalent to the one in normal
Minkowski space, with solution
\[ u_k(\eta, x) = \frac{1}{\sqrt{2k}}\left(A_k e^{-ikx} + A_k e^{ikx}\right) . \tag{12.27} \]
In the opposite limit, $a''u_k = a u_k''$, with the solution $u_k \propto a$ or $\phi_k = {\rm const}$. The complete solution is
given by Hankel functions $H_{3/2}(\eta)$,
\[ u_k(\eta) = A_k e^{-ik\eta}\left(1 - \frac{i}{k\eta}\right) + B_k e^{ik\eta}\left(1 + \frac{i}{k\eta}\right) . \tag{12.28} \]
Modes outside the horizon are frozen in with amplitude
\[ |\phi_k| = \left|\frac{u_k}{a}\right| = \frac{H}{\sqrt{2k^3}} . \tag{12.29} \]

12.1.2 Generation of perturbations


We treated the scalar field driving inflation, the “inflaton”, as a classical field. Like every
system, it is subject to quantum fluctuations. These fluctuations are of the order $\delta\phi \sim H$, i.e.
the same for all Fourier modes $\phi_k$. Thus a generic prediction of inflation is a “scale-invariant”
spectrum of primordial perturbations, $P_0(k) \propto k^{n_s}$ with $n_s \approx 1$. Primordial perturbations
with such a spectrum are indeed required to explain the CMB anisotropies.
We consider fluctuations of the inflaton field $\phi$ around its classical average value,
\[ \phi(x, t) = \phi_0(t) + \delta\phi(x, t) . \tag{12.30} \]
Inserting this into the field equation (12.11) gives six terms. We evaluate first the potential
term, assuming that the potential has its minimum for $\phi = \phi_0 = 0$. Then
\[ V(\phi) = V(0) + \frac{1}{2}V''\phi^2 + O(\phi^3) \tag{12.31} \]
and we see that the second derivative of the potential acts as an effective mass term, $m_{\rm eff}^2 = V''$,
for the $\phi$ field. Thus
\[ V'(\phi_0 + \delta\phi) = V'(\phi_0) + V''(\phi_0)\,\delta\phi = V'(\phi_0) + m_\phi^2\,\delta\phi . \tag{12.32} \]
Taking into account that the classical term $\phi_0$ satisfies separately the field equation (12.11)
gives as equation for the fluctuations
\[ \left[\frac{\partial^2}{\partial t^2} - \frac{1}{a^2}\nabla^2 + 3H\frac{\partial}{\partial t} + m_\phi^2\right]\delta\phi = 0 . \tag{12.33} \]
We perform next a Fourier expansion of the fluctuations,
\[ \delta\phi(x, t) = \sum_k \phi_k(t)\,e^{ikx} , \tag{12.34} \]
with $k$ as comoving wave-number. Since the proper distance varies as $ax$, the momentum is
$p = k/a$, and we obtain
\[ \ddot\phi_k + 3H\dot\phi_k + \left(\frac{k^2}{a^2} + m_\phi^2\right)\phi_k = 0 . \tag{12.35} \]
Comparing this equation with (12.11), we see that the fluctuations obey basically the same
equation as the average field. The only difference is the effective mass term.

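Going over to conformal time and neglecting the mass term,
\[ \phi_k'' + 2\mathcal{H}\phi_k' + k^2\phi_k = 0 , \tag{12.36} \]
and to $\phi_k(\eta) = u_k(\eta)/a$ gives
\[ u_k'' + \left(k^2 - \frac{a''}{a}\right)u_k = 0 . \tag{12.37} \]
Combining $a = -1/(H\eta)$ and $a'' = -2/(H\eta^3)$, valid for constant $H$, or
\[ \frac{a''}{a} = \frac{2}{\eta^2} \tag{12.38} \]
gives
\[ u_k'' + \left(k^2 - \frac{2}{\eta^2}\right)u_k = 0 . \tag{12.39} \]
Hence fluctuations satisfy also
\[ u_k(\eta) = A_k e^{-ik\eta}\left(1 - \frac{i}{k\eta}\right) + B_k e^{ik\eta}\left(1 + \frac{i}{k\eta}\right) . \tag{12.40} \]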

Power spectrum of perturbations The two-point correlation function of the field $\phi$ is
\[ \langle\phi(x', t')\phi(x, t)\rangle = \sum_k \langle\phi(x', t')|k\rangle\langle k|\phi(x, t)\rangle = \int\frac{d^3k}{(2\pi)^3}\,|\phi_k|^2 e^{ik(x'-x)} . \tag{12.41} \]
We introduce spherical coordinates in Fourier space and choose $x = x'$,
\[ \langle\phi^2(x, t)\rangle = \int\frac{4\pi k^2dk}{(2\pi)^3}\,\underbrace{|\phi_k|^2}_{\equiv P(k)} = \int dk\,\frac{k^2}{2\pi^2}\,|\phi_k|^2 = \int\frac{dk}{k}\,\Delta_\phi^2(k) . \tag{12.42} \]
The function $P(k)$ is the power spectrum, but often $\Delta_\phi^2(k)$ is called by the same name.
The spectrum of fluctuations $\Delta_\phi^2(k)$ outside of the horizon is
\[ \Delta_\phi^2(k) = \frac{k^3}{2\pi^2}\,|\phi_k|^2 = \frac{H^2}{4\pi^2} . \tag{12.43} \]
Hence, the power spectrum of superhorizon fluctuations is independent of the wave-number
in the approximation that $H$ is constant during inflation. The total area below the function
$\Delta_\phi^2(k) = {\rm const.}$ plotted versus $\ln(k)$ gives $\langle\phi^2(x, t)\rangle$, as shown by the last part of Eq. (12.42).
Hence a spectrum with $\Delta_\phi^2(k) = {\rm const.}$ contains the same amount of fluctuation on all angular
scales. Such a spectrum of fluctuations is called a Harrison-Zel’dovich spectrum, and is
produced by inflation in the limit of infinitely slow rolling of the inflaton.
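The scale invariance of the superhorizon spectrum can be verified in a few lines by combining the frozen amplitude $|\phi_k| = H/\sqrt{2k^3}$ from Eq. (12.29) with the definition in Eq. (12.43). The function names and the numerical value of $H$ below are hypothetical and in arbitrary units.

```python
import math


def mode_amplitude(k, hubble):
    """Frozen superhorizon amplitude |phi_k| = H / sqrt(2 k^3), Eq. (12.29)."""
    return hubble / math.sqrt(2.0 * k**3)


def dimensionless_power(k, hubble):
    """Delta^2(k) = k^3 |phi_k|^2 / (2 pi^2), Eq. (12.43)."""
    return k**3 * mode_amplitude(k, hubble) ** 2 / (2.0 * math.pi**2)


if __name__ == "__main__":
    hubble = 1e-5  # hypothetical Hubble rate during inflation, arbitrary units
    for k in (1e-3, 1.0, 1e3):
        print(k, dimensionless_power(k, hubble), hubble**2 / (4.0 * math.pi**2))
```

For every wave-number the two printed numbers agree up to rounding, i.e. $\Delta_\phi^2$ does not depend on $k$ as long as $H$ is treated as constant.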
Fluctuations in the inflaton field, $\phi = \phi_0 + \delta\phi$, lead to fluctuations in the energy-momentum
tensor $T^{ab} = T_0^{ab} + \delta T^{ab}$, and thus to metric perturbations $g^{ab} = g_0^{ab} + \delta g^{ab}$. These metric
perturbations $h^{ab}$ affect in turn all matter fields present.

Figure 12.1: Left: A slowly rolling scalar field as model for inflation. Right: The evolution
of the scale factor R including an inflationary phase in the early universe. [Left panel: potential
$V$ versus $\varphi$; right panel: scale factor $a$ versus $t$ with the stages inflation, reheating and hot
universe, from $t_P \sim 10^{-43}$ s over $t \sim 10^{-35}$ s to $t_0 \sim 10^{17}$ s.]

12.1.3 Models for inflation


Inflation has to start and to stop (“graceful exit problem”). In order to start inflation, the
inflaton has to be displaced from its equilibrium position.
Original idea of Guth: Symmetry restoration at a first-order (discontinuous) phase transition
with bubble creation, or at a second-order transition. The latent heat of the phase transition is used to reheat the universe
(expansion leads to a cool, empty state!) and to create particles. Too inhomogeneous.
Modern ideas: Chaotic inflation: quantum fluctuation in a patch of the universe. The field rolls
back, and inflation ends when $\phi$ is back in its minimum. It oscillates around the minimum; coupling to other
particles leads to particle production.
If the coupling to other particles is “large”, then reheating is instantaneous, $aT_{\rm rh}^4 = V$.
Generically, the coupling should be small. The delay leads to $aT_{\rm rh}^4 = V(R/R')^3$.

12.2 Structure formation


12.2.1 Overview and data
• Structure formation operates via gravitational instability, but needs as starting point a
seed of primordial fluctuations (generated in inflation).
• Growth of structure is inhibited by many factors, e.g. pressure.
The distance travelled by a freely falling particle is $R \sim gt^2/2$ with $g = GM/R^2$; or
$t \sim \sqrt{R^3/GM} \sim \sqrt{1/G\rho}$. Thus $\tau_{\rm ff} \sim 1/\sqrt{G\rho}$.
Pressure can balance gravity if $\tau_{\rm ff} \gtrsim \lambda/v_s$. This defines a critical length (“Jeans length”)
\[ \lambda \sim \frac{v_s}{\sqrt{G\rho}} , \]
below which pressure can counteract density perturbations (resulting in acoustic oscillations),
while above it density perturbations grow. This shows already that structure formation
is sensitive to the E.o.S. (compare e.g. radiation with $v_s^2 = 1/3$ with baryonic matter,
$v_s^2 = 5T/(3m)$).
• If the growth of a perturbation leads to $\Omega \ge 1$ in a region, the region decouples from the
Hubble expansion and collapses.

• Assume $\rho = \rho_m + \rho_\gamma$. If perturbations in $\rho$ are adiabatic, i.e. the entropy per baryon is
conserved, $\delta(\rho_m/s) = 0$ or $\delta\ln(\rho_m/T^3) = 0$, then $\delta\ln\rho_m - 3\delta\ln T = 0$ or
\[ \frac{\delta\rho_m}{\rho_m} = 3\,\frac{\delta T}{T} . \]
[Another possibility would be $\delta\rho = 0$ or $\delta\rho_m = -\delta\rho_\gamma = -4aT^3\delta T = -4\rho_\gamma\,\delta T/T$ and
$4\delta T/T = -\delta\rho_m/\rho_\gamma = -(\rho_m/\rho_\gamma)(\delta\rho_m/\rho_m)$. In the radiation epoch $\rho_m/\rho_\gamma \ll 1$ and
temperature fluctuations are suppressed.]
⇒ Temperature fluctuations in the CMB at $z \approx 1100$ and matter fluctuations today, $0 \le z \lesssim 5$,
have the same origin, if primordial fluctuations are adiabatic.
• Basics of structure formation:
assume initial fluctuations and examine how they are transformed by gravitational
instability, interactions and free-streaming of different particle species.
• Comparison with observations via i) the power spectrum $P(k) = |\delta_k|^2$, where
\[ \delta_k \propto \int d^3x\, e^{-ikx}\,\delta(x) \propto k^{n_s} \quad\text{with}\quad \delta(x) \equiv \frac{\rho(x) - \bar\rho}{\bar\rho} , \]
or ii) the correlation function $\int d^3x\, n(x)n(x + x_0)$ or, normalized,
\[ \xi(x_0) \equiv \frac{(1/V)\int d^3x\, n(x)n(x + x_0)}{\left[(1/V)\int d^3x\, n(x)\right]^2} - 1 . \]
The correlation function is the Fourier transform of the power spectrum.
Typically, $\xi \approx (r/r_0)^{-\gamma}$ with $\gamma \sim 1.8$ for $0.1 \lesssim r \lesssim 10$ Mpc.
• An example of the status in 1995 is shown in Fig. 12.2. The field is driven by a
tremendous growth of data:
galaxy catalogues: Hubble ’32: 1250, Abell ’58: 2712 clusters, 2dF: 250,000, SDSS (-’08): $10^6$.
CMB experiments: ’65 detection, COBE ’92: anisotropies, towards ’00: first peak, . . .
N-body simulations: Peebles ’70: N = 100, Efstathiou, Eastwood ’81: N = 20,000, 2005:
Virgo: Millennium simulation $N = 10^{10}$.

12.2.2 Jeans mass of baryons


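Consider a mixture of radiation and non-relativistic nucleons after $e^+e^-$ annihilations, i.e.
$T \approx 0.5$ MeV. With $\rho = \rho_m + \rho_\gamma$ and $P \approx P_\gamma = \rho_\gamma/3$, we have
\[ v_s = \left(\frac{\partial P}{\partial\rho}\right)_S^{1/2} = \frac{1}{\sqrt{3}}\left(1 + \frac{\partial\rho_m}{\partial\rho_\gamma}\right)_S^{-1/2} = \frac{1}{\sqrt{3}}\left(1 + \frac{3\rho_m}{4\rho_\gamma}\right)^{-1/2} \tag{12.44} \]
where we used $\frac{\delta\rho_m}{\rho_m} = 3\,\frac{\delta T}{T} = \frac{3\,\delta\rho_\gamma}{4\rho_\gamma}$.
For $t \ll t_{\rm eq}$, the adiabatic sound speed is close to $v_s = 1/\sqrt{3}$, while $v_s = 0.76/\sqrt{3}$ for $t = t_{\rm eq}$.
The Jeans mass of baryons is close to the horizon size until recombination. Then $v_s$ drops to
the value for a mono-atomic gas, $v_s^2 = \frac{5T_b}{3m}$, where $m \sim m_H \sim 1$ GeV.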

Figure 12.2: Comparison of the predicted power spectrum normalized to COBE data in several
models popular around ’95 with observations: HDM ($\Omega_\nu = 1$), CDM ($\Omega_m = 1$)
and MDM ($\Omega_m = 0.8$, $\Omega_\nu = 0.2$). [Plot: $P(k)$ in $h^{-3}\,{\rm Mpc}^3$ versus $k$ in $h\,{\rm Mpc}^{-1}$, with the
corresponding scale $d$ in $h^{-1}$ Mpc running from the microwave background over superclusters
and clusters to galaxies.]

The total mass $M_J$ contained within a sphere of radius $\lambda_J/2 = \pi/k_J$,
\[ M_J = \frac{4\pi}{3}\left(\frac{\pi}{k_J}\right)^3\rho_0 = \frac{\pi^{5/2}}{6}\,\frac{v_s^3}{G^{3/2}\rho_0^{1/2}} , \tag{12.45} \]
is called the Jeans mass. It is unchanged by the expansion of the universe, since the
wave-number $k_J \propto R$ and $\rho_0 \propto 1/R^3$.
Let us compare the Jeans mass just before and after recombination,
\[ M_J(z_{\rm eq,>}) = \frac{\pi^{5/2}}{6}\,\frac{v_s^3}{G^{3/2}\rho^{1/2}} \sim 10^{15}\,(\Omega h^2)^{-2}\,M_\odot \tag{12.46} \]
and
\[ M_J(z_{\rm eq,<}) \sim 10^{5}\,(\Omega h^2)^{-1/2}\,M_\odot . \tag{12.47} \]
The Jeans mass of baryons does not coincide with the observed mass of galaxies, nor does
the corresponding length scale fit the break in the power spectrum around $k \approx 0.04\,h/{\rm Mpc}$.
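The two forms of the Jeans mass in Eq. (12.45) can be cross-checked numerically. The sketch below assumes the Jeans wave-number $k_J = \sqrt{4\pi G\rho}/v_s$, which is the definition that makes both sides of Eq. (12.45) agree; SI units are used, and the function names as well as the sample sound speed and density are hypothetical.

```python
import math

G_SI = 6.674e-11  # Newton's constant in m^3 kg^-1 s^-2


def jeans_wavenumber(v_s, rho):
    """Assumed Jeans wave-number k_J = sqrt(4 pi G rho) / v_s in 1/m."""
    return math.sqrt(4.0 * math.pi * G_SI * rho) / v_s


def jeans_mass_from_wavenumber(v_s, rho):
    """Mass inside a sphere of radius pi / k_J, first form of Eq. (12.45), in kg."""
    k_j = jeans_wavenumber(v_s, rho)
    return 4.0 * math.pi / 3.0 * (math.pi / k_j) ** 3 * rho


def jeans_mass_closed_form(v_s, rho):
    """Closed form pi^(5/2) v_s^3 / (6 G^(3/2) rho^(1/2)), second form of Eq. (12.45), in kg."""
    return math.pi**2.5 * v_s**3 / (6.0 * G_SI**1.5 * math.sqrt(rho))


if __name__ == "__main__":
    v_s, rho = 1.0e4, 1.0e-18  # hypothetical sound speed (m/s) and density (kg/m^3)
    print(jeans_mass_from_wavenumber(v_s, rho), jeans_mass_closed_form(v_s, rho))
```

The strong dependence $M_J \propto v_s^3$ is what makes the Jeans mass drop by many orders of magnitude when the sound speed falls at recombination.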

12.2.3 Damping scales


Collisional or Silk damping Consider which fluctuations are damped by dissipative
processes, e.g. by Thomson scattering. (The mean free path of photons is always much larger
than the one of electrons, $l_\gamma = (n_e\sigma_T)^{-1} \gg l_e = (n_\gamma\sigma_T)^{-1}$, since $n_\gamma \sim 10^{10} n_e$. Thus photon
diffusion is much more important than electron diffusion.)
A sound wave with wavelength $\lambda$ can be damped if the diffusion time $\tau_{\rm diff}$ is smaller than
the Hubble time $t_H$. Estimate $\tau_{\rm diff}$ by a random walk with $N = \lambda^2/l_{\rm int}^2$ steps, each with size
$l_{\rm int} = 1/(n_e\sigma_{\rm th})$,
\[ \tau_{\rm diff} = N l_{\rm int} = \lambda^2/l_{\rm int} < t_H . \tag{12.48} \]


Figure 12.3: Acoustic baryon oscillation in the correlation function of galaxies with large
redshift of the SDSS, astro-ph/0501171.

Thus the damping scale is $\lambda_D = (l_{\rm int} l_H)^{1/2}$. If there were a baryon-dominated epoch,
then $n_e \propto \rho$ and $\rho \propto 1/t^2$, hence $l_{\rm int} l_H \propto \rho^{-1-1/2}$ and $\lambda_D \propto \rho^{-3/4}$. Finally, the corresponding
mass scale is $M_D \propto \lambda_D^3\rho \propto \rho^{-9/4}\rho \propto \rho^{-5/4}$. Numerically,
\[ M_D = 10^{12}\,(\Omega h^2)^{-5/4}\,M_\odot \tag{12.49} \]
and the corresponding length scale is (taking into account $\Omega_b \approx 0.04 < \Omega_m \approx 0.27$ and
$h \approx 0.7$)
\[ \lambda_D = 3.5\,(\Omega_m/\Omega_b)^{1/2}(\Omega h^2)^{-3/4}\,{\rm Mpc} \approx 40\,{\rm Mpc} . \tag{12.50} \]
Thus the Silk scale $\lambda_D$ has the right numerical value to explain the break in the power
spectrum at $k \approx 0.04\,h/{\rm Mpc}$. Fluctuations are damped on scales $\lambda \lesssim \lambda_D$, the stronger the
larger $\Omega_b$. Since $M_D \gg M_J$, acoustic oscillations should be visible for $k \gtrsim k_D$ in the power
spectrum of galaxies. First evidence was found around 2005, cf. Fig. 12.3.
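The numerical value quoted in Eq. (12.50) can be reproduced directly. The sketch below assumes that the $\Omega$ in the factor $(\Omega h^2)$ is the total matter density $\Omega_m$, which is the reading that gives roughly 40 Mpc for the quoted parameters; the function names are hypothetical.

```python
def silk_length_mpc(omega_m, omega_b, h):
    """Silk damping length in Mpc, Eq. (12.50), reading Omega in (Omega h^2) as omega_m."""
    return 3.5 * (omega_m / omega_b) ** 0.5 * (omega_m * h**2) ** -0.75


def silk_mass_msun(omega_m, h):
    """Silk damping mass in solar masses, Eq. (12.49), with the same reading of Omega."""
    return 1e12 * (omega_m * h**2) ** -1.25


if __name__ == "__main__":
    # Parameter values quoted in the text.
    omega_m, omega_b, h = 0.27, 0.04, 0.7
    print(silk_length_mpc(omega_m, omega_b, h), silk_mass_msun(omega_m, h))
```

Since $\lambda_D$ falls as $(\Omega h^2)^{-3/4}$, a denser universe shifts the damping scale to smaller lengths.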

12.2.4 Growth of perturbations in an expanding Universe:


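We restrict ourselves to the simplest case: perturbations in a pressure-less, expanding medium.
Starting from a homogeneous universe, we add matter inside a sphere of radius $R$, $\bar\rho \to \bar\rho(1+\delta)$.
Then the acceleration on the surface of this sphere is
\[ \frac{\ddot R}{R} = -\frac{4\pi}{3}\,G\bar\rho\,(1+\delta) . \tag{12.51} \]
The time evolution of the mass density is
\[ \rho(t) = \bar\rho(t)[1 + \delta(t)] = \frac{\bar\rho_0}{a^3(t)}\,[1 + \delta(t)] \tag{12.52} \]
and thus mass conservation
\[ M = \frac{4\pi}{3}\,\bar\rho\,[1 + \delta]\,R^3 = {\rm const.} \tag{12.53} \]
implies
\[ R(t) \propto a(t)[1 + \delta]^{-1/3} . \tag{12.54} \]
Expanding for $\delta \ll 1$ and differentiating gives
\[ \frac{\ddot R}{R} = \frac{\ddot a}{a} - \frac{2}{3}\frac{\dot a}{a}\,\dot\delta - \frac{1}{3}\,\ddot\delta . \tag{12.55} \]
Combined with (12.51) and the unperturbed equation $\ddot a/a = -(4\pi/3)G\bar\rho$, we obtain
\[ \ddot\delta + 2H\dot\delta - 4\pi G\rho\,\delta = 0 . \tag{12.56} \]
For a matter-dominated universe, $H = 2/(3t)$ and $\rho = 1/(6\pi Gt^2)$. Inserting the trial solution
$\delta \propto t^\alpha$ gives
\[ \alpha(\alpha - 1)\,t^{\alpha-2} + \frac{4}{3}\,\alpha\,t^{\alpha-2} - \frac{2}{3}\,t^{\alpha-2} = 0 \tag{12.57} \]
or
\[ \alpha^2 + \frac{1}{3}\,\alpha - \frac{2}{3} = 0 \tag{12.58} \]
and finally $\alpha = -1$ and $2/3$. Thus the general solution $\delta(t) = At^{-1} + Bt^{2/3}$ consists of a
decaying mode $\delta \propto 1/t$ and a mode growing like $\delta \propto t^{2/3} \propto R$.
During the radiation-dominated epoch, with $\delta_\gamma = 0$, one can neglect the term $4\pi G\rho\delta$. With
$H = 1/(2t)$ one obtains
\[ \ddot\delta + \frac{1}{t}\,\dot\delta = 0 \tag{12.59} \]
with solution $\delta(t) = \delta(t_i)[1 + a\ln(t/t_i)]$. Thus perturbations do not grow until $z_{\rm eq}$.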
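The two exponents of the matter-dominated solution can be confirmed with a short check. The sketch below solves the quadratic (12.58) and then inserts $\delta = t^\alpha$ back into Eq. (12.56) with $H = 2/(3t)$ and $4\pi G\rho = 2/(3t^2)$; the function names and the test time are hypothetical.

```python
import math


def growth_exponents():
    """Roots of alpha^2 + alpha/3 - 2/3 = 0, Eq. (12.58), returned in ascending order."""
    b, c = 1.0 / 3.0, -2.0 / 3.0
    disc = math.sqrt(b * b - 4.0 * c)
    return (-b - disc) / 2.0, (-b + disc) / 2.0


def ode_residual(alpha, t):
    """Left-hand side of Eq. (12.56) for delta = t**alpha in a matter-dominated universe."""
    delta = t**alpha
    delta_dot = alpha * t ** (alpha - 1.0)
    delta_ddot = alpha * (alpha - 1.0) * t ** (alpha - 2.0)
    hubble = 2.0 / (3.0 * t)
    four_pi_g_rho = 2.0 / (3.0 * t**2)
    return delta_ddot + 2.0 * hubble * delta_dot - four_pi_g_rho * delta


if __name__ == "__main__":
    decaying, growing = growth_exponents()
    print(decaying, growing)
    print(ode_residual(decaying, 2.5), ode_residual(growing, 2.5))
```

Both residuals vanish up to rounding, confirming the decaying mode $\delta \propto 1/t$ and the growing mode $\delta \propto t^{2/3}$.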

Non-linear regime N-body simulations are mainly used to study structure formation on the
smallest scale, e.g. the dark matter profile of a galaxy.

12.2.5 Recipes for structure formation


Summary of different effects

• On sub-horizon scales our Newtonian analysis applies. During the radiation-dominated
epoch, perturbations do not grow. During the matter-dominated epoch, perturbations
grow on scales larger than the Jeans scale as $\delta \propto t^{2/3} \propto R$. Perturbations on smaller
scales oscillate as acoustic waves.

• Before recombination, baryons are tightly coupled to radiation. The baryon Jeans scale
is of the order of the horizon size. After recombination, it drops by a factor $10^{10}$.

• Silk damping reduces power on scales smaller than 40 Mpc.

• Free-streaming of HDM suppresses exponentially the power on scales smaller than a
few Mpc (for $\Omega_\nu = 1$).

• CDM with baryons would be affected only by Silk damping.

Figure 12.4: Neutrino mass limits from the 2dF galaxy survey: For $\Omega_\nu \gtrsim 0.05$ there is too
little power on small scales (or, since the normalization is arbitrary, the slope is too
steep). [Plot: galaxy power spectrum $P_g(k)$ in $h^{-3}\,{\rm Mpc}^3$ versus $k$ in $h\,{\rm Mpc}^{-1}$ for
$\Omega_\nu = 0$, 0.01 and 0.05; ⇒ $\sum m_\nu \lesssim 2.2$ eV at 95% C.L.]

Recipe
• The connection between the initial perturbation spectrum $P_i(k) = |\delta_{k,i}|^2$ and the
observed power spectrum $P(k)$ today is formally given by the transfer function $T(k)$,
\[ P(k) = T^2(k)\,P_i(k) . \tag{12.60} \]

• Inflation predicts an initial perturbation spectrum $P_i(k) \propto k^{n_s}$ with $n_s \approx 1$; the
perturbations are generally adiabatic.

• Normalize $P_i(k)$ to the COBE data.

• Choose a set of cosmological parameters $\{h, \Omega_{\rm CDM}, \Omega_b, \Omega_\Lambda, \Omega_\nu, n_s\}$.

• Calculate $T(k)$.

• Fix a prescription to relate $\rho \approx \rho_{\rm CDM}$ to the $\rho_b$ measured in observations (“bias”).

• Derive statistical quantities to be compared to observations; perform a likelihood
analysis.

12.2.6 Results
• The three models without cosmological constant shown in Fig. 12.2 all fail.

• The exponential suppression on small scales typical for HDM is not observed; this can be
used to derive the limit $\Omega_\nu \lesssim 0.05$ or $\sum_i m_{\nu_i} \lesssim 2.2$ eV.
• Acoustic baryon oscillations are only a tiny sub-dominant effect, but are now observed,
cf. Fig. 12.3.

Bibliography
[1] B.P. Abbott et al. Observation of Gravitational Waves from a Binary Black Hole Merger.
Phys. Rev. Lett., 116:061102, 2016.

[2] M. Coleman Miller. Implications of the Gravitational Wave Event GW150914. Gen. Rel.
Grav., 48:95, 2016.

[3] P.C. Peters. Gravitational Radiation and the Motion of Two Point Masses. PhD thesis,
Caltech, 1964.

[4] P.C. Peters. Gravitational Radiation and the Motion of Two Point Masses. Phys. Rev.,
136:B1224–B1232, 1964.

