Lecture Notes For FY3452 Gravitation and Cosmology: M. Kachelrieß
M. Kachelrieß
Institutt for fysikk
NTNU, Trondheim
Norway
email: [email protected]
Watch out for errors, most was written late in the evening.
Corrections, feedback and any suggestions always welcome!
1 Special relativity
1.1 Newtonian mechanics and gravity
1.2 Minkowski space
1.3 Relativistic mechanics
1.A Appendix: Comments and examples on tensor and index notation
4 Schwarzschild solution
4.1 Spacetime symmetries and Killing vectors
4.2 Schwarzschild metric
4.3 Gravitational redshift
4.4 Orbits of massive particles
4.5 Orbits of photons
4.6 Post-Newtonian parameters
4.A Appendix: General stationary isotropic metric
5 Gravitational lensing
6 Black holes
6.1 Rindler spacetime and the Unruh effect
6.2 Schwarzschild black holes
6.3 Reissner-Nordström black hole
6.4 Kerr black holes
6.5 Black hole thermodynamics and Hawking radiation
6.A Appendix: Conformal flatness for d = 2
Preface
These notes summarise the lectures for FY3452 Gravitation and Cosmology I gave in 2009,
2010 and 2020. Asked which of the three more advanced topics (black holes, gravitational
waves and cosmology) should be given more time, students in 2009 voted for cosmology,
while in 2010 and 2020 black holes and gravitational waves were their favourites. As a result,
the notes probably contain more material than is manageable in a one-semester course.
I’m updating the notes throughout the semester. Compared to the last (2015) version,
the order of topics has changed, some sections have been streamlined to make space for new
material (e.g. the GW discovery), some, like the one about Noether’s theorem, have been
improved, and conventions will be unified. At the moment, chapters 1–2, 4, 6–9 are updated.
Various differing sign conventions are possible in general relativity, and all of them are
in use. One can define these choices as follows
We choose these three signs as $S_i = \{-, +, +\}$. Conventions of other authors are summarised
in the following table:

         HEL    dI,R    MTW, H    W
[S1]      -      -        +       +
[S2]      +      +        +       -
[S3]      -      -        +       -

HEL: Hobson, M.P., Efstathiou, G.P., Lasenby, A.N.: General relativity: an introduction for
physicists. Cambridge University Press 2006. [On a somewhat higher level than Hartle.]
• Robert M. Wald: General Relativity. University of Chicago Press 1986. [Uses a modern
mathematical language]
• Landau, Lev D.; Lifshitz, Evgenij M.: Course of theoretical physics 2 - The classical
theory of fields. Pergamon Press Oxford, 1975.
MTW: Misner, Charles W.; Thorne, Kip S.; Wheeler, John A.: Gravitation. Freeman New
York, 1998. [Entertaining and nice description of differential geometry - but lengthy.]
• Schutz, Bernard F.: A first course in general relativity. Cambridge Univ. Press, 2004.
W: Weinberg, Steven: Gravitation and cosmology. Wiley New York, 1972. [A classic.
Many applications; outdated concerning cosmology.]
• Weyl, Hermann: Raum, Zeit, Materie. Springer Berlin, 1918 (Space, Time, Matter,
Dover New York, 1952). [The classic.]
Finally: If you find typos (if not, you haven’t read carefully enough) in the part which is
already updated, conceptual errors or have suggestions, send me an email!
1 Special relativity
$$ \frac{d^2 x}{dt^2} = \frac{d^2 y}{dt^2} = \frac{d^2 z}{dt^2} = 0 . \tag{1.3} $$
Most often, we call such a coordinate system just an inertial frame. Newton’s first law is not
just a trivial consequence of his second one, but may be seen as a practical definition of those
reference frames for which his following laws are valid.
Which transformations connect these inertial frames or, in other words, what
are the symmetries of empty space and time? We know that translations a and rotations R
are symmetries of Euclidean space: This means that using two different Cartesian coordinate
systems, say a primed and an unprimed one, to label the points $P_1$ and $P_2$, their distance
defined by Eq. (1.3) remains invariant, cf. Fig. 1.1. The condition that the norm of the
distance vector $l_{12}$ is invariant, $l_{12} = l'_{12}$, implies
$$ l'^T l' = l^T R^T R\, l = l^T l \tag{1.4} $$
1.1 Newtonian mechanics and gravity
Figure 1.1: The point P is invariant, with the coordinates (x, y) and (x′, y′) in the two
coordinate systems.
frames is given by
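$$ \begin{pmatrix} t' \\ x' \\ y' \\ z' \end{pmatrix}
 = \begin{pmatrix} At + Bx \\ Dt + Ex \\ y \\ z \end{pmatrix}
 = \begin{pmatrix} At + Bx \\ A(x - vt) \\ y \\ z \end{pmatrix} . \tag{1.5} $$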
In the second step, we used that the transformation matrix depends only on two constants,
as you should show in Ex. ??.
Newton assumed the existence of an absolute time, t = t′, and thus A = 1 and B = 0.
Then proper Galilean transformations x′ = x + vt connect inertial frames moving with
relative speed v. Taking a time derivative leads to the classical addition law for velocities,
ẋ′ = ẋ + v. Time differences $\Delta t_{12}$ and space differences $\Delta l_{12}$ are separately invariant under
these transformations.
The Principle of Relativity states that identical experiments performed in different inertial
frames give identical results. Galilean transformations keep (1.3) invariant, hence Newton’s
first law does not allow one to distinguish between different inertial frames. Before the advent of
special relativity, it was thought that this principle applies only to mechanical experiments.
In particular, it was thought that electromagnetic waves require a medium (the “aether”) to
propagate: hence the rest frame of the aether could be used to single out a preferred frame.
Newton’s Lex Secunda states that observed from an inertial reference frame, the net force
on a particle is proportional to the time rate of change of its linear momentum,
$$ \mathbf{F} = \frac{d\mathbf{p}}{dt} \tag{1.6} $$
where $\mathbf{p} = m_{\rm in}\mathbf{v}$ and $m_{\rm in}$ denotes the inertial mass of the body.
Newtonian gravity Newton’s gravitational law as well as Coulomb’s law are examples of
an instantaneous action,
$$ \mathbf{F}(\mathbf{x}) = \sum_i K_i \frac{\mathbf{x} - \mathbf{x}_i}{|\mathbf{x} - \mathbf{x}_i|^3} . \tag{1.7} $$
The force $\mathbf{F}(\mathbf{x}, t)$ depends on the distance $\mathbf{x}(t) - \mathbf{x}_i(t)$ to all sources i (electric charges or
masses) at the same time t, i.e. the force needs no time to be transmitted from $\mathbf{x}_i$ to $\mathbf{x}$.
The factor K in Newton’s law is $-G m_g M_g$, where we introduced, in analogy to the electric
charge in the Coulomb law, the gravitational “charge” $m_g$ characterizing the strength of the
gravitational force between different particles. Surprisingly, one finds $m_{\rm in} = m_g$ and we can
drop the index.
Since the gravitational field is conservative, $\nabla \times \mathbf{F} = 0$, we can introduce a potential φ via
$$ \mathbf{F} = -m \nabla\phi \tag{1.8} $$
with
$$ \phi(\mathbf{x}) = -\frac{GM}{|\mathbf{x} - \mathbf{x}'|} . \tag{1.9} $$
In analogy to the electric field $\mathbf{E} = -\nabla\phi$ we can introduce a gravitational field, $\mathbf{g} = -\nabla\phi$.
We then obtain $\nabla \cdot \mathbf{g}(\mathbf{x}) = -4\pi G\rho(\mathbf{x})$ and as Poisson equation,
$$ \Delta\phi(\mathbf{x}) = 4\pi G \rho(\mathbf{x}) , $$
where ρ is the mass density, $\rho = dm/d^3x$. Similarly to how the full Maxwell equations reduce in
the limit v/c → 0 to the electrostatic Poisson equation, a relativistic generalisation of Newtonian
gravity should exist.
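As a numerical illustration of Eqs. (1.7)–(1.9), the following minimal Python sketch compares the force obtained by summing over point sources with the force obtained from the potential via F = −m∇φ. The source masses, positions and the finite-difference step `h` are arbitrary choices made for this example only and do not come from the notes.

```python
import math

G = 6.674e-11  # gravitational constant in SI units (rounded)

def potential(x, sources):
    """phi(x) = -G * sum_i M_i / |x - x_i| for point masses (M_i, x_i)."""
    return -G * sum(M / math.dist(x, xi) for M, xi in sources)

def force(x, m, sources):
    """Direct sum over sources, Eq. (1.7) with K_i = -G m M_i."""
    F = [0.0, 0.0, 0.0]
    for M, xi in sources:
        r = math.dist(x, xi)
        for k in range(3):
            F[k] -= G * m * M * (x[k] - xi[k]) / r**3
    return F

def force_from_potential(x, m, sources, h=1e-3):
    """F = -m grad(phi), Eq. (1.8), using central finite differences."""
    F = []
    for k in range(3):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        dphi = (potential(xp, sources) - potential(xm, sources)) / (2 * h)
        F.append(-m * dphi)
    return F

# Two illustrative point masses (kg) at illustrative positions (m).
sources = [(5.0e10, (0.0, 0.0, 0.0)), (2.0e10, (3.0, 1.0, 0.0))]
x = (1.0, 2.0, 0.5)
F_direct = force(x, 1.0, sources)
F_grad = force_from_potential(x, 1.0, sources)
```

Both routes should agree up to the discretisation error of the finite difference, which is the content of the statement that the field is conservative.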
In special relativity, we postulate that the speed of light is universal, i.e. that all observers
measure c = c′. A condition which guarantees this and resembles Eq. (1.1) is that the squared
distance in an inertial frame
$$ \Delta s^2 \equiv c^2 (t_1 - t_2)^2 - (x_1 - x_2)^2 - (y_1 - y_2)^2 - (z_1 - z_2)^2 $$
between two spacetime events $x_1^\mu = (ct_1, \mathbf{x}_1)$ and $x_2^\mu = (ct_2, \mathbf{x}_2)$ is invariant. Hence the
symmetry group of space and time is given by all those coordinate transformations
$x^\mu \to \tilde{x}^\mu = \Lambda^\mu{}_\nu x^\nu$ that keep $\Delta s^2$ invariant. Since these transformations mix space and time, we
speak about spacetime or, to honour the inventor of this geometrical interpretation, about
Minkowski space.
The distance of two infinitesimally close spacetime events is called the line-element ds of
the spacetime. In Minkowski space, it is given by
$$ ds^2 = c^2 dt^2 - dx^2 - dy^2 - dz^2 \tag{1.13} $$
using a Cartesian inertial frame. More precisely, the line-element ds is defined as the norm of the
displacement vector
$$ d\boldsymbol{s} = ds^\mu \boldsymbol{e}_\mu \tag{1.14} $$
1.2 Minkowski space
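Choosing as basis the coordinate vectors to $x^\mu = (ct, \mathbf{x})$, its components are
$$ ds^\mu = dx^\mu = (c\,dt, dx, dy, dz) . \tag{1.15} $$
We compare now our physical requirement on the distance of spacetime events, Eq. (1.13),
with the general result for the scalar product of two vectors a and b. If these vectors have
the coordinates $a^\mu$ and $b^\mu$ in a certain basis $\boldsymbol{e}_\mu$, then we can write
$$ \boldsymbol{a} \cdot \boldsymbol{b} = \sum_{\mu,\nu=0}^{3} (a^\mu \boldsymbol{e}_\mu) \cdot (b^\nu \boldsymbol{e}_\nu) = \sum_{\mu,\nu=0}^{3} a^\mu b^\nu (\boldsymbol{e}_\mu \cdot \boldsymbol{e}_\nu) . \tag{1.16} $$
Thus we can evaluate the scalar product between any two vectors, if we know the symmetric
matrix g composed of the products of the basis vectors at all spacetime points $x^\mu$,
$$ g_{\mu\nu}(x) = \boldsymbol{e}_\mu(x) \cdot \boldsymbol{e}_\nu(x) . \tag{1.17} $$
Applying this to the displacement vector, we obtain
$$ ds^2 = d\boldsymbol{s} \cdot d\boldsymbol{s} = \sum_{\mu,\nu=0}^{3} g_{\mu\nu}\, dx^\mu dx^\nu \overset{!}{=} c^2 dt^2 - dx^2 - dy^2 - dz^2 . \tag{1.18} $$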
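Hence the metric tensor $g_{\mu\nu}$ becomes for the special case of a Cartesian inertial frame in
Minkowski space diagonal with elements
$$ g_{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \equiv \eta_{\mu\nu} \tag{1.19} $$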
Introducing Einstein’s summation convention (cf. the box for details), we can rewrite the
scalar product of two vectors with coordinates $a^\mu$ and $b^\mu$ as
$$ \boldsymbol{a} \cdot \boldsymbol{b} \equiv \eta_{\mu\nu} a^\mu b^\nu = a_\mu b^\mu = a^\mu b_\mu . \tag{1.20} $$
In the last part of (1.20), we “lowered an index”: $a_\mu = \eta_{\mu\nu} a^\nu$ or $b_\mu = \eta_{\mu\nu} b^\nu$. Next we introduce
the opposite operation of raising an index by $a^\mu = \eta^{\mu\nu} a_\nu$. Since raising and lowering are inverse
operations, we have $\eta_{\mu\nu}\eta^{\nu\sigma} = \delta_\mu^\sigma$. Thus the elements of $\eta_{\mu\nu}$ and $\eta^{\mu\nu}$ form inverse matrices,
which agree with (1.19) for a Cartesian inertial coordinate frame in Minkowski space.
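The index operations of Eq. (1.20) can be sketched in a few lines of Python. The helper names `lower`, `raise_index` and `dot`, as well as the sample components of `a` and `b`, are illustrative choices and not part of the notes.

```python
# Minkowski metric eta_{mu nu} in a Cartesian inertial frame, Eq. (1.19).
# Its inverse eta^{mu nu} has the same numerical entries.
ETA = [[1, 0, 0, 0],
       [0, -1, 0, 0],
       [0, 0, -1, 0],
       [0, 0, 0, -1]]

def lower(a_up):
    """a_mu = eta_{mu nu} a^nu."""
    return [sum(ETA[mu][nu] * a_up[nu] for nu in range(4)) for mu in range(4)]

def raise_index(a_down):
    """a^mu = eta^{mu nu} a_nu (same matrix entries as ETA)."""
    return [sum(ETA[mu][nu] * a_down[nu] for nu in range(4)) for mu in range(4)]

def dot(a_up, b_up):
    """a . b = eta_{mu nu} a^mu b^nu."""
    return sum(ETA[mu][nu] * a_up[mu] * b_up[nu]
               for mu in range(4) for nu in range(4))

a = [2.0, 1.0, 0.0, -1.0]   # components a^mu (illustrative)
b = [3.0, 0.5, 2.0, 1.0]    # components b^mu (illustrative)

a_down = lower(a)
product_full = dot(a, b)                                  # eta_{mu nu} a^mu b^nu
product_mixed = sum(a_down[mu] * b[mu] for mu in range(4))  # a_mu b^mu
```

Raising the lowered components returns the original ones, which is the statement that the two metric tensors are inverse matrices.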
Figure 1.2: Light-cone at the point y generated by light-like vectors, (x − y)² = 0. Contained
in the light-cone are the time-like vectors, (x − y)² > 0; outside are the space-like ones,
(x − y)² < 0.
2. Summation indices are dummy indices which can be freely exchanged; the remaining free
indices of the LHS and RHS of an equation have to agree. Hence
The cone of all light-like vectors starting from a point P is called the light-cone, cf. Fig. 1.2. The
time-like region inside the light-cone consists of two parts, past and future. Only events inside
the past light-cone can influence the physics at point P, while P can influence only its future
light-cone.
The line describing the position of an observer is called its world-line. The proper-time τ is
the time displayed by a clock moving with the observer. How can we determine the correct
definition of τ? First, we ask that in the rest system of the observer, proper- and
coordinate-time agree, dτ = dt. But for a clock at rest, it is $ds^\mu/c = (dt, \mathbf{0})$ and thus ds/c = dt. Since
the RHS of dτ = ds/c is an invariant expression, it has to be valid in any frame and thus also
for a moving clock. For finite times, we have to integrate the line-element,
\begin{align}
\tau_{12} = \int_1^2 d\tau &= \int_1^2 \left[ dt^2 - (dx^2 + dy^2 + dz^2)/c^2 \right]^{1/2} \tag{1.24} \\
 &= \int_1^2 dt \left[ 1 - \frac{1}{c^2}\left( (dx/dt)^2 + (dy/dt)^2 + (dz/dt)^2 \right) \right]^{1/2} \tag{1.25} \\
 &= \int_1^2 dt \left[ 1 - v^2/c^2 \right]^{1/2} < t_2 - t_1 . \tag{1.26}
\end{align}
to obtain the proper-time. The last part of this equation, where we introduced the
three-velocity $v^i = dx^i/dt$ of the clock, shows explicitly the relativistic effect of time dilation, as
well as the connection between coordinate time t and the proper-time τ of a moving clock,
$d\tau = (1 - (v/c)^2)^{1/2}\, dt \equiv dt/\gamma$.
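The integral (1.26) is easy to evaluate numerically. The sketch below uses a simple midpoint rule; the function name `proper_time`, the step count and the two velocity profiles are assumptions made for illustration only.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def proper_time(velocity, t1, t2, steps=10_000):
    """Integrate d tau = sqrt(1 - v(t)^2/c^2) dt, Eq. (1.26), by the midpoint rule.

    `velocity` is a function t -> speed of the clock (m/s) in the inertial frame.
    """
    dt = (t2 - t1) / steps
    tau = 0.0
    for k in range(steps):
        t_mid = t1 + (k + 0.5) * dt
        v = velocity(t_mid)
        tau += math.sqrt(1.0 - (v / C) ** 2) * dt
    return tau

# Clock moving with constant speed 0.6 c for 10 s of coordinate time:
# gamma = 1.25, so the clock shows 8 s.
tau_const = proper_time(lambda t: 0.6 * C, 0.0, 10.0)

# Clock with an oscillating speed (illustrative profile); the elapsed proper
# time is again smaller than the coordinate time difference.
tau_osc = proper_time(lambda t: 0.8 * C * math.sin(t), 0.0, 10.0)
```

In both cases the result is smaller than t2 − t1 = 10 s, as the inequality in (1.26) requires.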
with ỹ = y and z̃ = z. Direct calculation shows that $\Delta s^2$ is invariant as desired. Consider
now in the system K̃ the origin of the system K. Then x = 0 and
$$ \tilde{x} = ct \sinh\eta \qquad \text{and} \qquad c\tilde{t} = ct \cosh\eta . $$
Dividing the two equations gives $\tilde{x}/(c\tilde{t}) = \tanh\eta$. Since $\beta = \tilde{x}/(c\tilde{t})$ is the relative velocity of the
two systems measured in units of c, the imaginary “rotation angle” η equals the rapidity
$$ \eta = \operatorname{arctanh} \beta . \tag{1.30} $$
Note that the rapidity η is a more natural variable than v or β to characterise a Lorentz
boost, because η is additive: Boosting a particle with rapidity $\eta_1$ by η leads to the rapidity
$\eta_2 = \eta_1 + \eta$. Using the following identities,
$$ \cosh\eta = \frac{1}{\sqrt{1 - \tanh^2\eta}} = \frac{1}{\sqrt{1 - \beta^2}} \equiv \gamma \tag{1.31} $$
$$ \sinh\eta = \frac{\tanh\eta}{\sqrt{1 - \tanh^2\eta}} = \frac{\beta}{\sqrt{1 - \beta^2}} = \gamma\beta \tag{1.32} $$
in (1.27) gives the standard form of the Lorentz transformations,
$$ \tilde{x} = \frac{x + vt}{\sqrt{1 - \beta^2}} = \gamma (x + \beta ct) \tag{1.33} $$
$$ c\tilde{t} = \frac{ct + vx/c}{\sqrt{1 - \beta^2}} = \gamma (ct + \beta x) . \tag{1.34} $$
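The additivity of the rapidity and the invariance of the interval can be checked numerically with the boost in the form (1.33)–(1.34). This is a small sketch; the function names and the sample values of β and of the event coordinates are illustrative.

```python
import math

def boost(ct, x, beta):
    """Lorentz boost along x in the form of Eqs. (1.33)-(1.34)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return gamma * (ct + beta * x), gamma * (x + beta * ct)

def interval2(ct, x):
    """Squared interval of the event (ct, x) from the origin (y = z = 0)."""
    return ct**2 - x**2

beta1, beta2 = 0.6, 0.7
eta1, eta2 = math.atanh(beta1), math.atanh(beta2)   # rapidities, Eq. (1.30)

# Adding rapidities reproduces the relativistic velocity addition law.
beta_total = math.tanh(eta1 + eta2)

ct, x = 5.0, 3.0
ct1, x1 = boost(ct, x, beta1)          # first boost
ct2, x2 = boost(ct1, x1, beta2)        # second boost on top of the first
ct_direct, x_direct = boost(ct, x, beta_total)   # single boost with eta1 + eta2
```

Two successive boosts along the same axis agree with a single boost whose rapidity is the sum, while neither β nor v is additive.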
Four-vectors and tensors In Minkowski space, we call a four-vector any four-tuple $V^\mu$ that
transforms as $\tilde{V}^\mu = \Lambda^\mu{}_\nu V^\nu$. By convention, we associate three-vectors with the spatial
part of vectors with upper indices, e.g. we set $x^\mu = \{ct, x, y, z\}$ or $A^\mu = \{\phi, \mathbf{A}\}$. Lowering
then the index by contraction with the metric tensor results in a minus sign of the spatial
components of a four-vector, $x_\mu = \eta_{\mu\nu} x^\nu = \{ct, -x, -y, -z\}$ or $A_\mu = \{\phi, -\mathbf{A}\}$. Summing over
a pair of Lorentz indices, always one index occurs in an upper and one in a lower position.
Additionally to four-vectors, we will meet tensors $T^{\mu_1 \cdots \mu_n}$ of rank n which transform as
$\tilde{T}^{\mu_1 \cdots \mu_n} = \Lambda^{\mu_1}{}_{\nu_1} \cdots \Lambda^{\mu_n}{}_{\nu_n} T^{\nu_1 \cdots \nu_n}$. Every tensor index can be raised and lowered, using the
metric tensors $\eta^{\mu\nu}$ and $\eta_{\mu\nu}$.
Special tensors are the Kronecker delta, $\delta_\mu^\nu = \eta_\mu{}^\nu$ with $\delta_\mu^\nu = 1$ for μ = ν and 0 otherwise,
and the Levi–Civita tensor εµνρσ. The latter tensor is completely antisymmetric and has in
four dimensions the elements +1 for an even permutation of ε0123, −1 for odd permutations
and zero otherwise. In three dimensions, we define the Levi–Civita tensor by $\varepsilon^{123} = \varepsilon_{123} = 1$.
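Next consider differential operators. Forming the differential of a function f defined on
Minkowski space $x^\mu$,
$$ df = \frac{\partial f}{\partial t} dt + \frac{\partial f}{\partial x} dx + \frac{\partial f}{\partial y} dy + \frac{\partial f}{\partial z} dz = \frac{\partial f}{\partial x^\mu} dx^\mu , \tag{1.36} $$
we see that an upper index in the denominator counts as lower index, and vice versa. We
define the four-dimensional nabla operator as
$$ \partial_\mu \equiv \frac{\partial}{\partial x^\mu} = \left( \frac{1}{c}\frac{\partial}{\partial t}, \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right) . $$
Note the “missing” minus sign in the spatial components, which is consistent with $\partial_\mu = \partial/\partial x^\mu$
and the rule for the differential in Eq. (1.36). The d’Alembert or wave operator is
$$ \Box \equiv \eta_{\mu\nu} \partial^\mu \partial^\nu = \partial_\mu \partial^\mu = \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \Delta . \tag{1.37} $$
This operator is a scalar, i.e. all the Lorentz indices are contracted, and thus invariant under
Lorentz transformations.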
1.3 Relativistic mechanics
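$$ u^0 = \frac{dt}{d\tau} = \frac{1}{\sqrt{1 - v^2}} = \gamma \tag{1.39} $$
and
$$ u^i = \frac{dx^i}{d\tau} = \frac{dx^i}{dt}\frac{dt}{d\tau} = \frac{v^i}{\sqrt{1 - v^2}} = \gamma v^i . \tag{1.40} $$
Hence the four-velocity is $u^\alpha = (\gamma, \gamma\mathbf{v})$ and its norm is
$$ u \cdot u = u^0 u^0 - u^i u^i = \gamma^2 - \gamma^2 v^2 = \gamma^2 (1 - v^2) = 1 . \tag{1.41} $$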
Energy and momentum After having constructed the four-velocity, the simplest guess for
the four-momentum is
$$ p^\alpha = m u^\alpha = (\gamma m, \gamma m \mathbf{v}) . \tag{1.42} $$
Thus we can interpret the components as $p^\alpha = (E, \mathbf{p})$. The norm follows with (1.41)
immediately as
$$ p \cdot p = m^2 . \tag{1.45} $$
Written out in components, this is the energy-momentum relation
$$ E^2 = m^2 + \mathbf{p}^2 , \tag{1.46} $$
including the famous $E = mc^2$ as special case for a particle at rest. Note that (1.46) predicts
the existence of solutions with negative energy, undermining the stability of the universe.
According to Feynman, we should view these negative energy solutions as positive energy
solutions moving backward in time, $\exp(-i(-\sqrt{m^2 + p^2})\,t) = \exp[-i(+\sqrt{m^2 + p^2})(-t)]$.
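The normalisations (1.41) and (1.45) can be verified for any sub-luminal three-velocity. The following sketch works in units with c = 1, as the equations above do; the function names, the mass and the velocity components are illustrative values only.

```python
import math

def minkowski_dot(a, b):
    """a . b = a^0 b^0 - a^i b^i for components given with upper indices."""
    return a[0] * b[0] - sum(ai * bi for ai, bi in zip(a[1:], b[1:]))

def four_velocity(v):
    """u^alpha = (gamma, gamma v), Eqs. (1.39)-(1.40), in units with c = 1."""
    v2 = sum(vi * vi for vi in v)
    gamma = 1.0 / math.sqrt(1.0 - v2)
    return [gamma] + [gamma * vi for vi in v]

def four_momentum(m, v):
    """p^alpha = m u^alpha, Eq. (1.42)."""
    return [m * u for u in four_velocity(v)]

m = 2.0
v = (0.3, -0.4, 0.5)          # |v|^2 = 0.5 < 1
u = four_velocity(v)
p = four_momentum(m, v)
E, p3 = p[0], p[1:]
p3_squared = sum(pi * pi for pi in p3)
```

Here `minkowski_dot(u, u)` returns 1 and `minkowski_dot(p, p)` returns m², independent of the chosen velocity, and E² equals m² plus the squared three-momentum.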
and thus the RHSs are valid also for a moving observer.
∂t ρ + ∇ · j = 0. (1.50)
We know that any 4-vector $a^\mu$ has 4 = 3 + 1 components, which transform as a scalar ($a^0$)
and a vector ($\mathbf{a}$) under rotations. This suggests combining $(\rho, \mathbf{j}) = j^\mu$ and $\partial_\mu = (\partial_t, \nabla)$
into four-vectors (consistent with our definition of the nabla operator), leading to $\partial_\mu j^\mu = 0$.
Similarly, we combine the scalar potential φ and the vector potential $\mathbf{A}$ into a four-vector
$A^\mu = (\phi, \mathbf{A})$. If we move to tensors of rank two, i.e. 4 × 4 matrices, it is useful to formalise
the splitting of such a tensor in components.
1.A Appendix: Comments and examples on tensor and index notation
Reducible and irreducible tensors An object which contains invariant subgroups with
respect to a symmetry operation is called reducible. In our case at hand, we want to determine
the reducible subgroups of a tensor of rank n with respect to spatial rotations. For a
four-vector, the splitting is $A^\mu = (A^0, \mathbf{A})$. Next, we consider the reducible subgroups of an
arbitrary tensor $T^{\mu\nu}$ of rank two. First, we note that we can split any tensor $T^{\mu\nu}$ into a
symmetric and antisymmetric piece, $T^{\mu\nu} = S^{\mu\nu} + A^{\mu\nu}$ with $S^{\mu\nu} = S^{\nu\mu}$ and $A^{\mu\nu} = -A^{\nu\mu}$,
writing
$$ T_{\mu\nu} = \frac{1}{2}(T_{\mu\nu} + T_{\nu\mu}) + \frac{1}{2}(T_{\mu\nu} - T_{\nu\mu}) \equiv T_{\{\mu\nu\}} + T_{[\mu\nu]} \equiv S_{\mu\nu} + A_{\mu\nu} . \tag{1.51} $$
This splitting is invariant under general coordinate transformations, and thus also under
rotations, Ex. ??. Physically, this is expected, since our equations tell us that some quantities are
antisymmetric (e.g. the field-strength tensor $F^{\mu\nu}$), while others are symmetric (e.g. Maxwell’s
stress tensor $\sigma_{ij}$) and all observers should agree on this.
Thus we can examine the symmetric and antisymmetric tensors separately, and we start
with the former. We can split $S^{\mu\nu}$ into a scalar $S^{00}$, a vector $S^{0i}$ and a tensor $S^{ij}$,
$$ S^{\mu\nu} = \begin{pmatrix} S^{00} & S^{0i} \\ S^{i0} & S^{ij} \end{pmatrix} . \tag{1.52} $$
To show this, calculate the effect of a rotation, $\tilde{S}_{\mu\nu} = \Lambda_\mu{}^\rho \Lambda_\nu{}^\sigma S_{\rho\sigma}$, or in matrix notation
$S' = \Lambda S \Lambda^T$, where for a rotation
$$ \Lambda_\mu{}^\nu = \begin{pmatrix} 1 & 0 \\ 0^T & R \end{pmatrix} . \tag{1.53} $$
The tensor $S^{ij}$ is again reducible, since its trace is a scalar. Thus we can decompose $S^{ij}$ into
its trace $s = S^i{}_i$ and its traceless part $S^i{}_j - s\,\delta^i_j/(d - 1)$.
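An antisymmetric tensor $A^{\mu\nu}$ (such as the field-strength tensor) has 3 + 2 + 1 = 6 components,
i.e. combines two 3-vectors, or more precisely a pure vector like $\mathbf{E}$ and an axial vector like $\mathbf{B}$,
$$ A^{\mu\nu} = \begin{pmatrix} 0 & -E_x & -E_y & -E_z \\ E_x & 0 & -B_z & B_y \\ E_y & B_z & 0 & -B_x \\ E_z & -B_y & B_x & 0 \end{pmatrix} . \tag{1.54} $$
To show this, calculate again the effect of a rotation, and of a parity transformation.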
(Anti-) symmetrisation Finally let us note some useful relations for contractions involving
symmetric and antisymmetric tensors. First, they are “orthogonal” in the sense that the
contraction of a symmetric tensor $S_{\mu\nu}$ with an antisymmetric tensor $A^{\mu\nu}$ gives zero,
$$ S_{\mu\nu} A^{\mu\nu} = 0 . \tag{1.55} $$
This allows one to (anti-) symmetrize the contraction of an arbitrary tensor $C_{\mu\nu}$ with an
(anti-) symmetric tensor: First split $C_{\mu\nu}$ into symmetric and antisymmetric parts,
$$ C_{\mu\nu} = \frac{1}{2}(C_{\mu\nu} + C_{\nu\mu}) + \frac{1}{2}(C_{\mu\nu} - C_{\nu\mu}) \equiv C_{\{\mu\nu\}} + C_{[\mu\nu]} . \tag{1.56} $$
Then
$$ S_{\mu\nu} C^{\mu\nu} = S_{\mu\nu} C^{\{\mu\nu\}} \qquad \text{and} \qquad A_{\mu\nu} C^{\mu\nu} = A_{\mu\nu} C^{[\mu\nu]} . \tag{1.57} $$
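The relations (1.56) and (1.57) hold for any components, which the following sketch checks for arbitrary 4 × 4 arrays. The arrays are treated simply as tables of components (the contraction sums over both index pairs); the matrix entries and helper names are illustrative.

```python
def symmetric_part(C):
    """C_{(mu nu)} = (C_{mu nu} + C_{nu mu}) / 2."""
    n = len(C)
    return [[0.5 * (C[i][j] + C[j][i]) for j in range(n)] for i in range(n)]

def antisymmetric_part(C):
    """C_{[mu nu]} = (C_{mu nu} - C_{nu mu}) / 2."""
    n = len(C)
    return [[0.5 * (C[i][j] - C[j][i]) for j in range(n)] for i in range(n)]

def contract(X, Y):
    """Full contraction X_{mu nu} Y^{mu nu} of two component tables."""
    n = len(X)
    return sum(X[i][j] * Y[i][j] for i in range(n) for j in range(n))

# An arbitrary tensor C and an arbitrary seed from which we build S and A.
C = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
seed = [[2, 0, 1, 3], [4, 1, 0, 2], [5, 5, 3, 1], [0, 2, 2, 7]]
S = symmetric_part(seed)        # symmetric tensor
A = antisymmetric_part(seed)    # antisymmetric tensor

orthogonality = contract(S, A)                       # vanishes
lhs_sym, rhs_sym = contract(S, C), contract(S, symmetric_part(C))
lhs_anti, rhs_anti = contract(A, C), contract(A, antisymmetric_part(C))
```

Only the symmetric part of C survives the contraction with S, and only the antisymmetric part survives the contraction with A.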
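Index gymnastics We are mainly concerned with vectors and tensors of rank two. In this
case we can express all equations as matrix operations. For instance, lowering the index of a
vector, $A_\mu = \eta_{\mu\nu} A^\nu$, becomes
$$ A_\mu = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}
 \begin{pmatrix} A^0 \\ A^1 \\ A^2 \\ A^3 \end{pmatrix}
 = \begin{pmatrix} A^0 \\ -A^1 \\ -A^2 \\ -A^3 \end{pmatrix} . $$
Raising and lowering indices are inverse operations, and thus $\eta_{\mu\nu}\eta^{\nu\sigma} = \delta_\mu^\sigma$. In matrix notation,
$\eta\eta^{-1} = 1$.
We can view $\eta_{\mu\nu}\eta^{\nu\sigma} = \delta_\mu^\sigma$ as the operation of raising an index of $\eta_{\mu\nu}$ (or lowering an index of
$\eta^{\mu\nu}$): in both cases, we see that the Kronecker delta corresponds to the metric tensor with
mixed indices, $\delta_\mu^\sigma = \eta_\mu{}^\sigma$.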
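The expression for the line-element becomes
\begin{align}
ds^2 = \eta_{\mu\nu} dx^\mu dx^\nu = dx^\mu \eta_{\mu\nu} dx^\nu
 &= \left( dx^0, dx^1, dx^2, dx^3 \right)
 \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}
 \begin{pmatrix} dx^0 \\ dx^1 \\ dx^2 \\ dx^3 \end{pmatrix} \\
 &= (dx^0)^2 - (dx^1)^2 - (dx^2)^2 - (dx^3)^2 .
\end{align}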
Note that the order of tensors does not matter, but the order of indices does. If we move to
matrix notation, we have to restore the right order. Raising next the second index,
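we have to re-order it as $T_{\mu\nu} = \eta_{\mu\rho} T^{\rho\sigma} \eta_{\sigma\nu}$ in matrix notation (using that η is symmetric).
We apply this to the field-strength tensor: Starting from $F^{\mu\nu}$, we want to construct
$F_{\mu\nu} = \eta_{\mu\rho} F^{\rho\sigma} \eta_{\sigma\nu}$,
\begin{align}
F_{\mu\nu} &= \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}
 \begin{pmatrix} 0 & -E_x & -E_y & -E_z \\ E_x & 0 & -B_z & B_y \\ E_y & B_z & 0 & -B_x \\ E_z & -B_y & B_x & 0 \end{pmatrix}
 \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \nonumber \\
 &= \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}
 \begin{pmatrix} 0 & E_x & E_y & E_z \\ E_x & 0 & B_z & -B_y \\ E_y & -B_z & 0 & B_x \\ E_z & B_y & -B_x & 0 \end{pmatrix}
 = \begin{pmatrix} 0 & E_x & E_y & E_z \\ -E_x & 0 & -B_z & B_y \\ -E_y & B_z & 0 & -B_x \\ -E_z & -B_y & B_x & 0 \end{pmatrix} . \tag{1.58}
\end{align}
Note the general behaviour: The $F^{00}$ element and the 3-tensor $F^{ik}$ are multiplied by $1^2$ and
$(-1)^2$, respectively, and do not change sign. The 3-vector $F^{0k}$ is multiplied by (−1)(+1) and
does change sign.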
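The matrix form of lowering both indices, Eq. (1.58), can be reproduced with plain nested lists. In this sketch the helper names and the integer test values for E and B are illustrative choices.

```python
ETA = [[1, 0, 0, 0],
       [0, -1, 0, 0],
       [0, 0, -1, 0],
       [0, 0, 0, -1]]

def matmul(X, Y):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def field_strength_upper(E, B):
    """F^{mu nu} in terms of E and B, as in the first matrix of Eq. (1.58)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [[0, -Ex, -Ey, -Ez],
            [Ex, 0, -Bz, By],
            [Ey, Bz, 0, -Bx],
            [Ez, -By, Bx, 0]]

def lower_both(F_up):
    """F_{mu nu} = eta_{mu rho} F^{rho sigma} eta_{sigma nu} as eta F eta."""
    return matmul(matmul(ETA, F_up), ETA)

E = (1, 2, 3)   # illustrative field components
B = (4, 5, 6)
F_up = field_strength_upper(E, B)
F_down = lower_both(F_up)
```

The electric entries in the first row and column flip sign, while the purely spatial block containing B is unchanged, in agreement with the remark above.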
Next we want to construct a Lorentz scalar out of $F^{\mu\nu}$. A Lorentz scalar has no indices, so
we contract the two indices, $\eta_{\mu\nu} F^{\mu\nu} = F_\mu{}^\mu$. This is invariant, but zero (and thus not useful)
because $F^{\mu\nu}$ is antisymmetric. As next try, we construct a Lorentz scalar S using two F’s:
Multiplying the two matrices $F_{\mu\nu}$ and $F^{\mu\nu}$, and taking then the trace, gives
$$ S = F_{\mu\nu} F^{\mu\nu} = -\mathrm{tr}\{F_{\mu\nu} F^{\nu\rho}\} = -\mathrm{tr}
 \begin{pmatrix} \mathbf{E} \cdot \mathbf{E} & & & \\ & E_x^2 - B_z^2 - B_y^2 & & \\ & & E_y^2 - B_z^2 - B_x^2 & \\ & & & E_z^2 - B_y^2 - B_x^2 \end{pmatrix} $$
i.e. $S = -2(\mathbf{E} \cdot \mathbf{E} - \mathbf{B} \cdot \mathbf{B})$. Note the minus, since we have to change the order of indices in
the second F.
Note also that S has to be bilinear in $\mathbf{E}$ and $\mathbf{B}$ and invariant under rotations. Thus the
only possible terms entering S are the scalar products $\mathbf{E} \cdot \mathbf{E}$, $\mathbf{B} \cdot \mathbf{B}$ and $\mathbf{E} \cdot \mathbf{B}$. Since $\mathbf{B}$ is a
pseudo- (or axial) vector, $P\mathbf{B} = \mathbf{B}$, the last term is a pseudo-scalar and cannot enter the scalar
S.
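Now we become more ambitious, looking at a tensor with 4 indices, the Levi–Civita or
completely antisymmetric tensor $\varepsilon^{\alpha\beta\gamma\delta}$ in four dimensions, with
and all even permutations, −1 for odd permutations and zero otherwise. We lower its indices,
and consider the 0123 element using that the metric is diagonal,
$$ \tilde{F}_{12} = \frac{1}{2}\left( \varepsilon_{1203} F^{03} + \varepsilon_{1230} F^{30} \right) = -E_z $$
etc., gives
$$ \tilde{F}_{\mu\nu} = \begin{pmatrix} 0 & -B_x & -B_y & -B_z \\ B_x & 0 & -E_z & E_y \\ B_y & E_z & 0 & -E_x \\ B_z & -E_y & E_x & 0 \end{pmatrix}
 \qquad \text{and} \qquad
 \tilde{F}^{\mu\nu} = \begin{pmatrix} 0 & B_x & B_y & B_z \\ -B_x & 0 & -E_z & E_y \\ -B_y & E_z & 0 & -E_x \\ -B_z & -E_y & E_x & 0 \end{pmatrix} . $$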
The dual field-strength tensor is useful, because the homogeneous Maxwell equation
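becomes simply
$$ \partial_\alpha \tilde{F}^{\alpha\beta} = 0 . \tag{1.62} $$
Inserting the potential, we obtain zero,
$$ \partial_\alpha \tilde{F}^{\alpha\beta} = \frac{1}{2} \varepsilon^{\alpha\beta\gamma\delta} \partial_\alpha F_{\gamma\delta} = \varepsilon^{\alpha\beta\gamma\delta} \partial_\alpha \partial_\gamma A_\delta = 0 , \tag{1.63} $$
because we contract a symmetric tensor ($\partial_\alpha \partial_\gamma$) with an anti-symmetric one ($\varepsilon^{\alpha\beta\gamma\delta}$).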
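Having $F^{\mu\nu}$ and $\tilde{F}_{\mu\nu}$, we can form another (pseudo-) scalar, $A = \tilde{F}_{\mu\nu} F^{\mu\nu}$. Multiplying the
two matrices $\tilde{F}_{\mu\nu}$ and $F^{\mu\nu}$, and taking then the trace, gives
$$ \tilde{F}_{\mu\nu} F^{\mu\nu} = -\mathrm{tr}\{\tilde{F}_{\mu\nu} F^{\nu\rho}\} = \mathrm{tr}
 \begin{pmatrix} \mathbf{B} \cdot \mathbf{E} & & & \\ & \mathbf{B} \cdot \mathbf{E} & & \\ & & \mathbf{B} \cdot \mathbf{E} & \\ & & & \mathbf{B} \cdot \mathbf{E} \end{pmatrix} $$
i.e. $\tilde{F}_{\mu\nu} F^{\mu\nu} = 4\,\mathbf{E} \cdot \mathbf{B}$. We know that $\mathbf{E} \cdot \mathbf{B}$ is a pseudo-scalar. This tells us that including the
Levi-Civita tensor converts a tensor into a pseudo-tensor, which does not change sign under
a parity transformation $P\mathbf{x} = -\mathbf{x}$. (This is analogous to $B_i = \varepsilon_{ijk} \partial_j A_k$, which converts two
pure vectors into an axial one.)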
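The two invariants quoted in the text, $F_{\mu\nu}F^{\mu\nu} = -2(\mathbf{E}\cdot\mathbf{E} - \mathbf{B}\cdot\mathbf{B})$ and $\tilde{F}_{\mu\nu}F^{\mu\nu} = 4\,\mathbf{E}\cdot\mathbf{B}$, can be checked by contracting the component matrices given above. The sketch below copies those matrices; the numerical values of E and B and the helper names are illustrative.

```python
def contract(X_down, Y_up):
    """Full contraction X_{mu nu} Y^{mu nu}."""
    return sum(X_down[i][j] * Y_up[i][j] for i in range(4) for j in range(4))

def F_upper(E, B):
    """F^{mu nu}, first matrix in Eq. (1.58)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [[0, -Ex, -Ey, -Ez], [Ex, 0, -Bz, By],
            [Ey, Bz, 0, -Bx], [Ez, -By, Bx, 0]]

def F_lower(E, B):
    """F_{mu nu}, last matrix in Eq. (1.58)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [[0, Ex, Ey, Ez], [-Ex, 0, -Bz, By],
            [-Ey, Bz, 0, -Bx], [-Ez, -By, Bx, 0]]

def F_dual_lower(E, B):
    """Dual tensor with lower indices, as given in the text."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [[0, -Bx, -By, -Bz], [Bx, 0, -Ez, Ey],
            [By, Ez, 0, -Ex], [Bz, -Ey, Ex, 0]]

def dot3(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

E = (1, 2, 3)   # illustrative field components
B = (4, 5, 6)
scalar = contract(F_lower(E, B), F_upper(E, B))             # F_{mu nu} F^{mu nu}
pseudo_scalar = contract(F_dual_lower(E, B), F_upper(E, B))  # dual contraction
```

For these values one obtains 126 = −2(14 − 77) for the scalar and 128 = 4 · 32 for the pseudo-scalar.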
2 Lagrangian mechanics and symmetries
We briefly review the Lagrangian formulation of classical mechanics and its connection to
symmetries.
The boundary term vanishes, because we required that the variations δq i are zero at the
endpoints a and b. Since the variations are otherwise arbitrary, the terms in the first bracket
have to be zero for an extremal curve, δS = 0. Paths that satisfy δS = 0 are classically
allowed. The equations resulting from the condition δS = 0 are called the Euler-Lagrange
equations of the action S,
$$ \frac{\delta S}{\delta q^i} = \frac{\partial L}{\partial q^i} - \frac{d}{dt}\frac{\partial L}{\partial \dot{q}^i} = 0 , \tag{2.4} $$
and give the equations of motion of the system specified by L. Physicists often call these
equations simply Lagrange equations or, especially in classical mechanics, Lagrange equations of
the second kind.
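The statement that classical paths extremize the action can be illustrated with a discretised action. The sketch below assumes a free particle with Lagrangian L = m q̇²/2 in one dimension, whose Euler-Lagrange equation gives a straight line; the family of varied paths q(t) = t + ε sin(πt), the step count and all names are choices made for this example only.

```python
import math

def action(path, t_a, t_b, m=1.0, steps=1000):
    """Discretised action S = sum (m/2) (dq/dt)^2 dt of a free particle."""
    dt = (t_b - t_a) / steps
    S = 0.0
    for k in range(steps):
        q0 = path(t_a + k * dt)
        q1 = path(t_a + (k + 1) * dt)
        S += 0.5 * m * ((q1 - q0) / dt) ** 2 * dt
    return S

def varied_path(eps):
    """Straight line from q(0)=0 to q(1)=1 plus a variation vanishing at the endpoints."""
    return lambda t: t + eps * math.sin(math.pi * t)

S_classical = action(varied_path(0.0), 0.0, 1.0)
S_plus = action(varied_path(0.1), 0.0, 1.0)
S_minus = action(varied_path(-0.1), 0.0, 1.0)

# First-order variation dS/d(eps) at eps = 0, estimated by a central difference.
h = 1e-4
dS_deps = (action(varied_path(h), 0.0, 1.0)
           - action(varied_path(-h), 0.0, 1.0)) / (2 * h)
```

The first-order variation vanishes for the straight line, while both varied paths have a larger action; for this Lagrangian the extremum is a minimum.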
The Lagrangian L is not uniquely fixed: Adding a total time-derivative, $L' = L + df(q, t)/dt$,
does not change the resulting Lagrange equations,
$$ S' = S + \int_a^b dt\, \frac{df}{dt} = S + f(q(b), t_b) - f(q(a), t_a) , \tag{2.5} $$
since the last two terms vanish when varying the action with the restriction of fixed endpoints a
and b.
Infinitesimal variations: If you are worried about the meaning of “infinitesimal” variations,
the following definition may help: Consider a one-parameter family of paths,
and similarly for functions and functionals of q. Moreover, it is obvious from Eq. (2.6) that
the assumption of time-independent ε implies that the variation δ and the time-derivative d/dt
acting on q commute,
$$ \delta(\dot{q}) = \left. \frac{\partial \dot{q}(t, \varepsilon)}{\partial \varepsilon} \right|_{\varepsilon=0} = \frac{d}{dt}(\delta q) , $$
$L = L(v^2)$.
Let us consider two inertial frames moving with the infinitesimal velocity ε relative to each
other. Then a Galilean transformation connects the velocities measured in the two frames as
v′ = v + ε. The Galilean principle of relativity requires that the laws of motion have the same
form in both frames, and thus the Lagrangians can differ only by a total time-derivative.
Expanding the difference δL in ε gives with $\delta v^2 = 2\mathbf{v} \cdot \boldsymbol{\varepsilon}$
$$ \delta L = \frac{\partial L}{\partial v^2} \delta v^2 = 2\mathbf{v} \cdot \boldsymbol{\varepsilon}\, \frac{\partial L}{\partial v^2} . \tag{2.7} $$
The difference has to be a total time-derivative. Since v = q̇, the derivative term $\partial L/\partial v^2$ has
to be independent of v. Hence, $L \propto v^2$ and we call the proportionality constant m/2, and
2.2 Hamilton’s principle and the Lagrange function
A coordinate $q^i$ that does not appear explicitly in L is called cyclic. The Lagrange equations
then imply $\partial L/\partial \dot{q}^i = \text{const.}$, so that the corresponding canonically conjugate momentum
$p_i = \partial L/\partial \dot{q}^i$ is conserved.
Feynman proposed the following connection between the propagator K and the classical action
S,
$$ K(x', t'; x, t) = N \int_q^{q'} \mathcal{D}q\, \exp(iS) , $$
where $\mathcal{D}q$ denotes the “integration over all paths.” Hence the difference between the classical
and quantum world is that in the former only paths extremizing the action S are allowed while
in the latter all paths weighted by exp(iS) contribute.
For a readable introduction see R. P. Feynman, A. R. Hibbs: Quantum mechanics and path integrals or R. P.
Feynman (editor: Laurie M. Brown), Feynman’s thesis: a new approach to quantum theory.
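Energy The Lagrangian of a closed system does not depend explicitly on time, because of
the homogeneity of time. Its total time derivative is
$$ \frac{dL}{dt} = \frac{\partial L}{\partial q^i} \dot{q}^i + \frac{\partial L}{\partial \dot{q}^i} \ddot{q}^i . \tag{2.16} $$
Replacing $\partial L/\partial q^i$ by $(d/dt)\,\partial L/\partial \dot{q}^i$, it follows
$$ \frac{dL}{dt} = \dot{q}^i \frac{d}{dt}\frac{\partial L}{\partial \dot{q}^i} + \frac{\partial L}{\partial \dot{q}^i} \ddot{q}^i = \frac{d}{dt}\left( \dot{q}^i \frac{\partial L}{\partial \dot{q}^i} \right) . \tag{2.17} $$
Hence the quantity
$$ E \equiv \dot{q}^i \frac{\partial L}{\partial \dot{q}^i} - L \tag{2.18} $$
remains constant during the evolution of a closed system. This holds also more generally, e.g.
in the presence of static external fields, as long as the Lagrangian is not time-dependent.
We still have to show that E indeed coincides with the usual definition of energy. Using
$L = T(q, \dot{q}) - U(q)$, where T is quadratic in the velocities, we have
$$ \dot{q}^i \frac{\partial L}{\partial \dot{q}^i} = \dot{q}^i \frac{\partial T}{\partial \dot{q}^i} = 2T \tag{2.19} $$
and thus E = 2T − L = T + U.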
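The conservation of the quantity E defined in Eq. (2.18) can be watched along a numerically integrated trajectory. The sketch assumes a one-dimensional harmonic oscillator, L = m q̇²/2 − k q²/2, and a velocity-Verlet integrator; these choices, the parameter values and the function names are assumptions for illustration and are not taken from the notes.

```python
def lagrangian(q, qdot, m=1.0, k=4.0):
    """L = T - U for a harmonic oscillator (illustrative system)."""
    return 0.5 * m * qdot**2 - 0.5 * k * q**2

def energy(q, qdot, h=1e-6):
    """E = qdot * dL/dqdot - L, Eq. (2.18), with a numerical derivative."""
    dL_dqdot = (lagrangian(q, qdot + h) - lagrangian(q, qdot - h)) / (2 * h)
    return qdot * dL_dqdot - lagrangian(q, qdot)

def evolve(q, qdot, dt=1e-3, steps=5000, m=1.0, k=4.0):
    """Integrate m q'' = -k q with velocity Verlet and record E along the path."""
    energies = [energy(q, qdot)]
    for _ in range(steps):
        a = -k * q / m
        q += qdot * dt + 0.5 * a * dt * dt
        a_new = -k * q / m
        qdot += 0.5 * (a + a_new) * dt
        energies.append(energy(q, qdot))
    return energies

energies = evolve(q=1.0, qdot=0.0)
drift = max(energies) - min(energies)
```

The initial value is E = T + U = 2 for these parameters, and the recorded values stay constant up to the small discretisation error of the integrator.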
2.3 Symmetries and conservation laws
is conserved.
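The condition (2.21) signifies with $\partial L/\partial \mathbf{r}_a = -\partial V/\partial \mathbf{r}_a$ that the sum of forces on all
particles is zero, $\sum_a \mathbf{F}_a = 0$. For the particular case of a two-particle system, $\mathbf{F}_a = -\mathbf{F}_b$, we
have thus derived Newton’s third law, the equality of action and reaction.
Isotropy We consider now the consequences of the isotropy of space, i.e. search for the conserved
quantity that follows from a Lagrangian invariant under rotations. Under an infinitesimal
rotation by δφ both coordinates and velocities change,
$$ \delta\mathbf{r} = \delta\boldsymbol{\phi} \times \mathbf{r} , \tag{2.24} $$
$$ \delta\mathbf{v} = \delta\boldsymbol{\phi} \times \mathbf{v} . \tag{2.25} $$
Inserting the expression into
$$ \delta L = \sum_a \left( \frac{\partial L}{\partial \mathbf{r}_a} \cdot \delta\mathbf{r}_a + \frac{\partial L}{\partial \mathbf{v}_a} \cdot \delta\mathbf{v}_a \right) = 0 \tag{2.26} $$
gives, using also the definition $\mathbf{p}_a = \partial L/\partial \mathbf{v}_a$ as well as the Lagrange equation $\dot{\mathbf{p}}_a = \partial L/\partial \mathbf{r}_a$,
$$ \delta L = \sum_a \left( \dot{\mathbf{p}}_a \cdot \delta\boldsymbol{\phi} \times \mathbf{r}_a + \mathbf{p}_a \cdot \delta\boldsymbol{\phi} \times \mathbf{v}_a \right) = 0 . \tag{2.27} $$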
is conserved.
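The standard outcome of this argument is the conservation of the total angular momentum, the sum of $\mathbf{r}_a \times \mathbf{p}_a$ over all particles; the corresponding equation is not reproduced in this part of the notes, so the formula is used here as an assumption. The sketch below follows a single particle of unit mass in the rotationally invariant potential V = −1/r in the plane and records the z-component of r × p. The potential, the integrator (velocity Verlet) and the initial conditions are illustrative choices.

```python
import math

def acceleration(r):
    """Central force for V = -1/|r| and unit mass: a = -r / |r|^3."""
    d = math.hypot(r[0], r[1])
    return (-r[0] / d**3, -r[1] / d**3)

def angular_momentum(r, v, m=1.0):
    """z-component of r x p for motion in the x-y plane."""
    return m * (r[0] * v[1] - r[1] * v[0])

def orbit(r, v, dt=1e-3, steps=10_000):
    """Velocity-Verlet integration; returns the angular momentum at each step."""
    Lz = [angular_momentum(r, v)]
    a = acceleration(r)
    for _ in range(steps):
        r = (r[0] + v[0] * dt + 0.5 * a[0] * dt * dt,
             r[1] + v[1] * dt + 0.5 * a[1] * dt * dt)
        a_new = acceleration(r)
        v = (v[0] + 0.5 * (a[0] + a_new[0]) * dt,
             v[1] + 0.5 * (a[1] + a_new[1]) * dt)
        a = a_new
        Lz.append(angular_momentum(r, v))
    return Lz

Lz = orbit(r=(1.0, 0.0), v=(0.0, 0.8))   # bound, non-circular orbit
spread = max(Lz) - min(Lz)
```

Although position and velocity change continuously along the elliptical orbit, the recorded angular momentum stays at its initial value 0.8 up to rounding errors.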
If we use a different parameter σ, e.g. such that σ(τ = 1) = 0 and σ(τ = 2) = 1, then
$$ \tau_{12} = \int_0^1 d\sigma \left[ \eta_{\mu\nu} \frac{dx^\mu}{d\sigma} \frac{dx^\nu}{d\sigma} \right]^{1/2} . \tag{2.32} $$
$$ \frac{d^2 x^1}{d\tau^2} = 0 \tag{2.36} $$
and the same for the other coordinates.
An alternative which we use later more often is
with $\dot{x}^\mu = dx^\mu/d\tau$. Since this Lagrangian is the square-root of the one defined in Eq. (2.33)
for the special choice σ = τ, it is clear that the same equations of motion result. While this
Lagrangian is more useful in calculations, it is invariant only under affine transformations,
τ → Aτ + B.
Massless particles The energy-momentum relation of massless particles like the photon
becomes $\omega = |\mathbf{k}|$. Thus their four-velocity and four-momenta are light-like, $u^2 = p^2 = 0$, and
light signals form the future light-cone of the emission point P. Since ds = dτ = 0 on the
light-cone, we cannot use the Lagrangians (2.33) or (2.37).
2.4 Free relativistic particle
3 Basic differential geometry
We motivate this chapter about differential geometry by giving some arguments why a rela-
tivistic theory of gravity should replace Minkowski space by a curved manifold. Let us start
by reviewing three basic properties of gravitation.
1.) The idea underlying the equivalence principle emerged in the 16th century, when among
others Galileo Galilei found experimentally that the acceleration g of a test mass in
a gravitational field is universal. Because of this universality, the gravitating mass
$m_g = F/g$ and the inertial mass $m_i = F/a$ are identical in classical mechanics, a fact
that already puzzled Newton. While $m_i = m_g$ can be achieved for one material by a
convenient choice of units, there should be in general deviations for test bodies with
differing compositions.
Knowing more forces, this puzzle becomes even stronger: Contrast the acceleration of
a particle in a gravitational field to the one in a Coulomb field. In the latter case, two
independent properties of the particle, namely its charge q determining the strength of
the electric force acting on it and its mass $m_i$, i.e. the inertia of the particle, are needed
as input in the equation of motion. In the case of gravity, the “gravitational charge”
$m_g$ coincides with the inertial mass $m_i$.
The equivalence of gravitating and inertial masses has been tested already by Newton
and Bessel, comparing the period P of pendula of different materials,
$$ P = 2\pi \sqrt{\frac{m_i\, l}{m_g\, g}} , \tag{3.1} $$
but finding no measurable differences. The first precision experiment giving an upper
limit on deviations from the equivalence principle was performed by Loránd Eötvös in
1908 using a torsion balance. Current limits for departures from universal gravitational
attraction for different materials are $|\Delta g_i/g| < 10^{-12}$.
2.) Newton’s gravitational law postulates, like the later Coulomb law, an instantaneous
interaction. Such an interaction is in contradiction to special relativity. Thus, as interactions
of currents with electromagnetic fields replace the Coulomb law, a corresponding
description should be found for gravity. Moreover, the equivalence of mass and energy
found in special relativity requires that, in a loose sense, energy, not only mass, should
couple to gravity: Imagine a particle-antiparticle pair falling down a gravitational
potential well, gaining energy and finally annihilating into two photons moving outwards
out of the gravitational potential well. If the two photons did not lose energy climbing up
the gravitational potential well, a perpetuum mobile could be constructed. If all forms
of energy act as sources of gravity, then the gravitational field itself is gravitating. Thus
the theory is non-linear and its mathematical structure is much more complicated than
the one of electrodynamics.
3.1 Manifolds and tensor fields
3.) Gravity can be switched off locally, just by cutting the rope of an elevator: Inside a
freely falling elevator, one does not feel any gravitational effects except for tidal forces.
The latter arise if the gravitational field is non-uniform and tries to stretch the elevator.
Inside a sufficiently small freely falling system, tidal effects also play no role. This
allows us to perform experiments like the growing of crystals in “zero-gravity” on the
International Space Station which is orbiting only at an altitude of 300 km.
Motivated by 2.), Einstein used 1.), the principle of equivalence, and 3.) to derive general
relativity, a theory that describes the effect of gravity as a deformation of the space-time
known from special relativity.
In general relativity, the gravitational force of Newton’s theory that accelerates particles
in a Euclidean space is replaced by a curved space-time in which particles move force-free
along geodesic lines. In particular, photons move still as in special relativity along curves
satisfying $ds^2 = 0$, while all effects of gravity are now encoded in the form of the line-element
ds. Thus all information about the geometry of a space-time is contained in the metric $g_{\mu\nu}$.
Covariant and contravariant tensors Consider two n-dimensional coordinate systems $x$ and
$\tilde x$ and assume that we can express the $x^i$ as functions of the $\tilde x^i$,
$$ dx^i = \frac{\partial x^i}{\partial \tilde x^j}\, d\tilde x^j . \tag{3.3} $$
The transformation matrix
$$ a^i_{\;j} = \frac{\partial x^i}{\partial \tilde x^j} \tag{3.4} $$
is an $n \times n$ matrix with determinant (“Jacobian”) $J = \det(a)$. If $J \neq 0$ at the
point $P$, we can invert the transformation,
$$ d\tilde x^i = \frac{\partial \tilde x^i}{\partial x^j}\, dx^j = \tilde a^i_{\;j}\, dx^j . \tag{3.5} $$
3 Basic differential geometry
The transformation matrices are inverse to each other, $\tilde a^i_{\;j}\, a^j_{\;k} = \delta^i_k$. According to the product
rule of determinants, $J(a) = 1/J(\tilde a)$.
A contravariant vector X (or contravariant tensor of rank one) has an n-tuple of components
that transforms as
$$ \tilde X^i = \frac{\partial \tilde x^i}{\partial x^j}\, X^j . \tag{3.6} $$
This definition guarantees that the tensor itself is an invariant object, since the transformation
of its components is cancelled by the transformation of the basis vectors,
$$ \frac{\partial \phi(x(\tilde x))}{\partial \tilde x^i} = \frac{\partial x^j}{\partial \tilde x^i}\, \frac{\partial \phi}{\partial x^j} . \tag{3.8} $$
This is the inverse transformation matrix and we call a covariant vector (or covariant tensor
of rank one) any n-tuple transforming as
$$ \tilde X_i = \frac{\partial x^j}{\partial \tilde x^i}\, X_j . \tag{3.9} $$
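More generally, we call an object T that transforms as
$$ \tilde T^{i,\ldots,n}_{j,\ldots,m} = \underbrace{\frac{\partial \tilde x^i}{\partial x^{i'}} \cdots \frac{\partial \tilde x^n}{\partial x^{n'}}}_{n}\; \underbrace{\frac{\partial x^{j'}}{\partial \tilde x^j} \cdots \frac{\partial x^{m'}}{\partial \tilde x^m}}_{m}\; T^{i',\ldots,n'}_{j',\ldots,m'} \tag{3.10} $$
a tensor of rank (n, m).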
Dual basis We defined earlier $g_{ij} = e_i \cdot e_j$. Now we define a dual basis $e^i$ with metric $g^{ij}$
via
$$ e^i \cdot e_j = \delta^i_j . \tag{3.11} $$
We want to determine the relation of $g^{ij}$ with $g_{ij}$. First we set
$$ e^i = A^{ij} e_j , \tag{3.12} $$
Hence the metric $g^{ij}$ maps covariant vectors $X_i$ into contravariant vectors $X^i$, while $g_{ij}$
provides a map in the opposite direction. In the same way, we can use g to raise and lower
indices of any tensor.
Next we multiply $e^i$ with $e_k = g_{kl} e^l$,
or
$$ \delta^i_k = g_{kl}\, g^{li} . \tag{3.15} $$
3.2 Tensor analysis
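Thus the components of the covariant and the contravariant metric tensors, $g_{ij}$ and $g^{ij}$, are
inverse matrices of each other.
$$ e_1 = \frac{\partial x'^j}{\partial r}\, e'_j = \sin\vartheta \cos\phi\, e'_1 + \sin\vartheta \sin\phi\, e'_2 + \cos\vartheta\, e'_3 , $$
$$ e_2 = \frac{\partial x'^j}{\partial \vartheta}\, e'_j = r\cos\vartheta \cos\phi\, e'_1 + r\cos\vartheta \sin\phi\, e'_2 - r\sin\vartheta\, e'_3 , $$
$$ e_3 = \frac{\partial x'^j}{\partial \phi}\, e'_j = -r\sin\vartheta \sin\phi\, e'_1 + r\sin\vartheta \cos\phi\, e'_2 . $$
Since the $e_i$ are orthogonal to each other, the matrices $g_{ij}$ and $g^{ij}$ are diagonal. From the definition
$g_{ij} = e_i \cdot e_j$ one finds $g_{ij} = \mathrm{diag}(1, r^2, r^2 \sin^2\vartheta)$. Inverting $g_{ij}$ gives $g^{ij} = \mathrm{diag}(1, r^{-2}, r^{-2} \sin^{-2}\vartheta)$. The
determinant is $g = \det(g_{ij}) = r^4 \sin^2\vartheta$. Note that the volume integral in spherical coordinates is given
by
$$ \int d^3x' = \int d^3x\, J = \int d^3x\, \sqrt{g} = \int dr\, d\vartheta\, d\phi\; r^2 \sin\vartheta , $$
since $g_{ij} = \frac{\partial x'^k}{\partial \tilde x^i} \frac{\partial x'^l}{\partial \tilde x^j}\, g'_{kl}$ and thus $\det(g) = J^2 \det(g') = J^2$ with $\det(g') = 1$.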
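The example can be checked numerically. The following Python sketch is an illustration added to these notes, not part of the original derivation: the helper names `cartesian`, `basis_vectors` and `metric` and the sample point are arbitrary choices. It approximates the basis vectors $e_i = (\partial x'^j/\partial q^i)\, e'_j$ by central differences and forms $g_{ij} = e_i \cdot e_j$.

```python
import math

def cartesian(r, theta, phi):
    """Cartesian coordinates x'^j of the point with spherical coordinates (r, theta, phi)."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def basis_vectors(q, h=1e-6):
    """Coordinate basis e_i = dx'^j/dq^i e'_j, estimated by central differences."""
    basis = []
    for i in range(3):
        up = list(q); up[i] += h
        dn = list(q); dn[i] -= h
        xu, xd = cartesian(*up), cartesian(*dn)
        basis.append([(a - b) / (2 * h) for a, b in zip(xu, xd)])
    return basis

def metric(q):
    """Metric components g_ij = e_i . e_j."""
    e = basis_vectors(q)
    return [[sum(a * b for a, b in zip(e[i], e[j])) for j in range(3)]
            for i in range(3)]

r, theta, phi = 2.0, 0.7, 1.3          # arbitrary sample point
g = metric((r, theta, phi))
# The metric is diagonal here, so det(g) is the product of the diagonal entries.
sqrt_det = math.sqrt(g[0][0] * g[1][1] * g[2][2])
```

Within the finite-difference error, `g` should agree with diag(1, r², r² sin²ϑ) and `sqrt_det` with the volume factor r² sin ϑ.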
We add the first two terms and subtract the last one. Using additionally the symmetries
$\Gamma^a_{\;bc} = \Gamma^a_{\;cb}$ and $g_{ab} = g_{ba}$, the underlined terms cancel, and dividing by two we obtain
$$ \frac{1}{2} (\partial_c g_{ab} + \partial_b g_{ac} - \partial_a g_{bc}) = \Gamma^d_{\;cb}\, g_{ad} . \tag{3.26} $$
Multiplying by $g^{ea}$ and relabeling indices gives as final result
$$ \Gamma^a_{\;bc} = \{{}^{\,a}_{bc}\} \equiv \frac{1}{2}\, g^{ad} (\partial_b g_{dc} + \partial_c g_{bd} - \partial_d g_{bc}) . \tag{3.27} $$
This equation defines the Christoffel symbols $\{{}^{\,a}_{bc}\}$ (aka Levi-Civita connection aka Riemannian
connection): It is the unique connection on a Riemannian manifold which is metric
compatible and torsion-free (i.e. symmetric). Admitting torsion, three permutations of the
torsion tensor $T^a_{\;bc}$ would appear on the RHS of Eq. (3.27). Such a connection would still
be a metric connection, but not torsion-free.
We now check our claim that the connection (3.27) is metric compatible. First, we define¹
or
$$ \partial_c g_{ab} = \Gamma_{abc} + \Gamma_{bac} . \tag{3.32} $$
Applying the general rule for covariant derivatives, Eq. (3.47), to the metric,
$$ \nabla_c g_{ab} = \partial_c g_{ab} - \Gamma^d_{\;ac}\, g_{db} - \Gamma^d_{\;bc}\, g_{ad} = \partial_c g_{ab} - \Gamma_{bac} - \Gamma_{abc} \tag{3.33} $$
and “conserves” the norm of vectors. (Exercise: Repeat these steps including torsion.)
Since we can choose a Cartesian coordinate system for a flat space, the connection
coefficients are zero and thus $\nabla_a = \partial_a$. This suggests as general rule that physical laws valid in
Minkowski space hold in general relativity, if one replaces ordinary derivatives by covariant
ones and $\eta_{ij}$ by $g_{ij}$.
¹ We showed that g can be used to raise or to lower tensor indices, but Γ is not a tensor.
3.2.2 Geodesics
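A geodesic curve is the shortest or longest curve between two points on a manifold. Such
a curve extremizes the action S(L) of a free particle, $L = g_{ab}\, \dot x^a \dot x^b$ (setting m = 2 and
$\dot x = dx/d\sigma$), along the path $x^a(\sigma)$. The parameter σ plays the role of time t in the
non-relativistic case, while t becomes part of the coordinates. The Lagrange equations are
$$ \frac{d}{d\sigma} \frac{\partial L}{\partial \dot x^c} - \frac{\partial L}{\partial x^c} = 0 . \tag{3.36} $$
Only g depends on x and thus $\partial L/\partial x^c = g_{ab,c}\, \dot x^a \dot x^b$. With $\partial \dot x^a/\partial \dot x^b = \delta^a_b$ we obtain
$$ g_{ab,c}\, \dot x^a \dot x^b = 2 \frac{d}{d\sigma} (g_{ac}\, \dot x^a) = 2 (g_{ac,b}\, \dot x^a \dot x^b + g_{ac}\, \ddot x^a) \tag{3.37} $$
or
$$ g_{ac}\, \ddot x^a + \frac{1}{2} (2 g_{ac,b} - g_{ab,c})\, \dot x^a \dot x^b = 0 . \tag{3.38} $$
Next we rewrite the second term as
$$ 2 g_{ca,b}\, \dot x^a \dot x^b = (g_{ca,b} + g_{cb,a})\, \dot x^a \dot x^b , \tag{3.39} $$
multiply everything by $g^{dc}$ and obtain
$$ \ddot x^d + \frac{1}{2}\, g^{dc} (g_{ca,b} + g_{cb,a} - g_{ab,c})\, \dot x^a \dot x^b = 0 . \tag{3.40} $$
We recognize the definition of the Levi-Civita connection and rewrite the equation of a
geodesic as
$$ \ddot x^c + \Gamma^c_{\;ab}\, \dot x^a \dot x^b = 0 . \tag{3.41} $$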
The connection entering the equation for an extremal curve is the Levi-Civita connection,
because we used the Lagrangian of a classical spinless particle.
This result justifies the use of a torsionless connection which is metric compatible: Although
a star consists of a collection of individual particles carrying spin $s_i$, its total spin sums up to
zero, $\sum_i s_i \approx 0$, because the $s_i$ are uncorrelated. Thus we can describe macroscopic matter in
general relativity as a classical spinless point particle (or fluid, if extended). In such a case,
only the symmetric part of the connection influences the geodesic motion of the considered
system.
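Example: Sphere $S^2$. Calculate the Christoffel symbols of the two-dimensional unit sphere $S^2$.
The line-element of the two-dimensional unit sphere $S^2$ is given by $ds^2 = d\vartheta^2 + \sin^2\vartheta\, d\phi^2$. A faster
alternative to the definition (3.27) of the Christoffel coefficients is the use of the geodesic equation:
From the Lagrange function $L = g_{ab}\, \dot x^a \dot x^b = \dot\vartheta^2 + \sin^2\vartheta\, \dot\phi^2$ we find
$$ \frac{\partial L}{\partial \phi} = 0 , \qquad \frac{d}{dt} \frac{\partial L}{\partial \dot\phi} = \frac{d}{dt} (2 \sin^2\vartheta\, \dot\phi) = 2 \sin^2\vartheta\, \ddot\phi + 4 \cos\vartheta \sin\vartheta\, \dot\vartheta \dot\phi , $$
$$ \frac{\partial L}{\partial \vartheta} = 2 \cos\vartheta \sin\vartheta\, \dot\phi^2 , \qquad \frac{d}{dt} \frac{\partial L}{\partial \dot\vartheta} = \frac{d}{dt} (2 \dot\vartheta) = 2 \ddot\vartheta , $$
and thus the Lagrange equations are
$$ \ddot\phi + 2 \cot\vartheta\, \dot\vartheta \dot\phi = 0 \quad\text{and}\quad \ddot\vartheta - \cos\vartheta \sin\vartheta\, \dot\phi^2 = 0 . $$
Comparing with the geodesic equation $\ddot x^\kappa + \Gamma^\kappa_{\;\mu\nu}\, \dot x^\mu \dot x^\nu = 0$, we can read off the non-vanishing Christoffel
symbols as $\Gamma^\phi_{\;\vartheta\phi} = \Gamma^\phi_{\;\phi\vartheta} = \cot\vartheta$ and $\Gamma^\vartheta_{\;\phi\phi} = -\cos\vartheta \sin\vartheta$. (Note that $2 \cot\vartheta = \Gamma^\phi_{\;\vartheta\phi} + \Gamma^\phi_{\;\phi\vartheta}$.)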
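As a cross-check of the symbols read off from the geodesic equation, the definition (3.27) can also be evaluated directly. The sketch below is an added illustration under stated assumptions: the metric derivatives are taken by central differences, and the function names and the sample point are arbitrary.

```python
import math

def metric(x):
    """Metric g_ab of the unit sphere in coordinates x = (theta, phi)."""
    theta, _ = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def inverse_metric(x):
    """Inverse metric g^ab; valid because the metric is diagonal."""
    g = metric(x)
    return [[1.0 / g[0][0], 0.0], [0.0, 1.0 / g[1][1]]]

def d_metric(x, c, h=1e-6):
    """Partial derivative of g_ab with respect to x^c (central difference)."""
    up = list(x); up[c] += h
    dn = list(x); dn[c] -= h
    gu, gd = metric(up), metric(dn)
    return [[(gu[a][b] - gd[a][b]) / (2 * h) for b in range(2)] for a in range(2)]

def christoffel(x):
    """Gamma^a_bc = 1/2 g^ad (d_b g_dc + d_c g_bd - d_d g_bc), Eq. (3.27)."""
    ginv = inverse_metric(x)
    dg = [d_metric(x, c) for c in range(2)]      # dg[c][a][b] = d_c g_ab
    n = range(2)
    return [[[0.5 * sum(ginv[a][d] * (dg[b][d][c] + dg[c][b][d] - dg[d][b][c])
                        for d in n)
              for c in n] for b in n] for a in n]

THETA, PHI = 0, 1                                 # index labels
x = (0.9, 0.4)                                    # arbitrary sample point
G = christoffel(x)                                # G[a][b][c] = Gamma^a_bc
```

At the sample point, `G[PHI][THETA][PHI]` should reproduce cot ϑ and `G[THETA][PHI][PHI]` should reproduce −cos ϑ sin ϑ, while all other components vanish up to the discretisation error.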
3.A Appendix: a bit more...
The first term transforms as desired as a tensor of rank (1,1), while the second term (caused
by the in general non-linear change of the coordinate basis) destroys the tensorial behavior.
If we define a covariant derivative $\nabla_c X^a$ of a vector $X^a$ by requiring that the result is a tensor,
we should set
$$ \nabla_c X^a = \partial_c X^a + \Gamma^a_{\;bc}\, X^b . \tag{3.44} $$
$$ \nabla_c X_a = \partial_c X_a - \Gamma^b_{\;ac}\, X_b . \tag{3.46} $$
For a general tensor, the covariant derivative is defined by the same reasoning as
$$ \nabla_c T^{a\ldots}_{b\ldots} = \partial_c T^{a\ldots}_{b\ldots} + \Gamma^a_{\;dc}\, T^{d\ldots}_{b\ldots} + \ldots - \Gamma^d_{\;bc}\, T^{a\ldots}_{d\ldots} - \ldots \tag{3.47} $$
Note that it is the last index of the connection coefficients that is the same as the index of
the covariant derivative. The plus sign goes together with upper indices (superscripts), the minus
with lower indices.
From the transformation law (3.45) it is clear that the inhomogeneous term disappears for
an antisymmetric combination of the connection coefficients Γ in the lower indices. Thus this
combination forms a tensor, called torsion,
$$ T^a_{\;bc} = \Gamma^a_{\;bc} - \Gamma^a_{\;cb} . \tag{3.48} $$
We consider only symmetric connections, $\Gamma^a_{\;bc} = \Gamma^a_{\;cb}$, or torsionless manifolds. We will justify
this choice later, when we consider the geodesic motion of a classical particle.
Parallel transport We say a tensor T is parallel transported along the curve x(σ), if its
components $T^{a\ldots}_{b\ldots}$ stay constant. In flat space, this means simply
In curved space, we have to replace the normal derivative by a covariant one. We define the
directional covariant derivative along x(σ) as
$$ \frac{D}{d\sigma} = \frac{dx^c}{d\sigma}\, \nabla_c . \tag{3.50} $$
Then a tensor is parallel transported along the curve x(σ), if
$$ \frac{D}{d\sigma}\, T^{a\ldots}_{b\ldots} = \frac{dx^c}{d\sigma}\, \nabla_c T^{a\ldots}_{b\ldots} = 0 . \tag{3.51} $$
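To make the parallel transport condition concrete, the following sketch (an added illustration, not taken from the lecture) transports a vector once around a circle of constant ϑ on the unit sphere, using the Christoffel symbols of the $S^2$ example. The integrator (classical Runge-Kutta), the step number and the starting vector are assumptions made for the illustration.

```python
import math

def transport_around_latitude(theta0, steps=20000):
    """Parallel transport X = (X^theta, X^phi) once around the circle theta = theta0.

    Along the curve x(sigma) = (theta0, sigma), Eq. (3.51) for a vector reads
        dX^theta/dsigma = -Gamma^theta_{phi phi} X^phi   =  sin(theta0) cos(theta0) X^phi
        dX^phi/dsigma   = -Gamma^phi_{theta phi} X^theta = -cot(theta0) X^theta
    """
    s, c = math.sin(theta0), math.cos(theta0)

    def rhs(X):
        return (s * c * X[1], -(c / s) * X[0])

    X = (1.0, 0.0)                      # start pointing in the theta direction
    h = 2 * math.pi / steps
    for _ in range(steps):              # classical fourth-order Runge-Kutta
        k1 = rhs(X)
        k2 = rhs((X[0] + 0.5 * h * k1[0], X[1] + 0.5 * h * k1[1]))
        k3 = rhs((X[0] + 0.5 * h * k2[0], X[1] + 0.5 * h * k2[1]))
        k4 = rhs((X[0] + h * k3[0], X[1] + h * k3[1]))
        X = (X[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
             X[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)
    return X

theta0 = math.pi / 3
X = transport_around_latitude(theta0)
norm2 = X[0] ** 2 + math.sin(theta0) ** 2 * X[1] ** 2   # g_ab X^a X^b
```

The norm `norm2` stays equal to one, while the vector does not return to its initial direction: solving the two linear equations analytically gives a rotation by the angle 2π cos ϑ₀ relative to the starting vector.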
$$ \frac{\partial^2 \tilde x^a}{\partial x^d\, \partial x^e} = \Gamma^a_{\;db}\, \delta^b_e = \Gamma^a_{\;de} . \tag{3.55} $$
Inserting these results into the transformation law (3.45) of the connection coefficients, where
we swap in the second term derivatives of x and x̃,
torsion, transforms as a tensor, it cannot be eliminated by a coordinate change. This does
not necessarily imply a contradiction to the equivalence principle, as long as the torsion is properly
generated by source terms in the equations of motion of the matter fields. In particular, the
spin current of fermions leads to non-zero torsion. As the elementary spins in macroscopic
bodies cancel, torsion is negligible in all relevant astrophysical and cosmological applications.
This justifies our choice of a symmetric connection.
4 Schwarzschild solution
In the next three chapters, we investigate the solutions of Einstein's field equations that
describe the gravitational field outside a spherical mass distribution. The metric valid for a
static mass distribution was found by Karl Schwarzschild in 1915, only one month after the
publication of Einstein's field equations. A real understanding of the physical significance of
the singularities contained in the solution was obtained only in the 1960s. The solution for a
rotating mass distribution was found by Kerr only in 1963.
Note the difference to the definition of a scalar, φ̃(x̃) = φ(x). In the latter case, we require
that a scalar field has the same value at a point P which in turn changes coordinates from x
to x̃.
Mathematically, the transport of a tensor T along a vector field ξ is described by the Lie
derivative $L_\xi T$. Instead of introducing this new derivative (which we do not use later), we
consider the change of the metric under an infinitesimal coordinate transformation,
Then we can identify the translation $\xi^\mu$ with the vector field ξ at x. Next we connect the
metric tensor at the two different points by a Taylor expansion,
$$ g_{\mu\nu}(\tilde x) = g_{\mu\nu}(x + \varepsilon\xi) = g_{\mu\nu}(x) + \varepsilon\, \xi^\alpha \partial_\alpha g_{\mu\nu}(x) + O(\varepsilon^2) . \tag{4.3} $$
On the other hand, we can use the usual transformation law for a tensor of rank two under
an arbitrary coordinate transformation,
$$ \tilde g_{\mu\nu}(\tilde x) = \frac{\partial x^\alpha}{\partial \tilde x^\mu} \frac{\partial x^\beta}{\partial \tilde x^\nu}\, g_{\alpha\beta}(x) , \tag{4.4} $$
$$ g_{\mu\nu}(x) = \frac{\partial \tilde x^\alpha}{\partial x^\mu} \frac{\partial \tilde x^\beta}{\partial x^\nu}\, \tilde g_{\alpha\beta}(\tilde x) . \tag{4.5} $$
4.1 Spacetime symmetries and Killing vectors
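If the transformation (4.2) is a spacetime symmetry, then $\tilde g_{\mu\nu}(x) = g_{\mu\nu}(x)$. Evaluating the
transformation matrices and inserting the Taylor expansion, we obtain
$$ g_{\mu\nu}(x) = \frac{\partial \tilde x^\alpha}{\partial x^\mu} \frac{\partial \tilde x^\beta}{\partial x^\nu}\, g_{\alpha\beta}(\tilde x) = (\delta^\alpha_\mu + \varepsilon\, \partial_\mu \xi^\alpha)(\delta^\beta_\nu + \varepsilon\, \partial_\nu \xi^\beta) \left[ g_{\alpha\beta}(x) + \varepsilon\, \xi^\rho \partial_\rho g_{\alpha\beta}(x) \right] + O(\varepsilon^2) \tag{4.6} $$
$$ \phantom{g_{\mu\nu}(x)} = g_{\mu\nu}(x) + \varepsilon \left[ g_{\alpha\nu}\, \partial_\mu \xi^\alpha + g_{\mu\alpha}\, \partial_\nu \xi^\alpha + \xi^\alpha \partial_\alpha g_{\mu\nu}(x) \right] + O(\varepsilon^2) . \tag{4.7} $$
is satisfied. Inserting Eq. (3.32) for the partial derivative of the metric tensor, we can combine
the Christoffel symbols with the partial derivatives into covariant derivatives of the vector
field, obtaining the Killing equation¹
$$ \delta g_{\mu\nu} = \nabla_\mu \xi_\nu + \nabla_\nu \xi_\mu = 0 . \tag{4.9} $$
Its solutions ξ are the Killing vector fields of the metric. Moving along a Killing vector field,
the metric is kept invariant.
Since Eq. (4.9) is a tensor equation, the previous Eq. (4.8) is also invariant under arbitrary
coordinate transformations, although it contains only partial derivatives. It is the Lie derivative
of a tensor of rank two.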
x′ = cos αx − sin αy ≈ x − αy ,
y′ = sin αx + cos αy ≈ y + αx ,
z′ = z.
Hence ξz = (−y, x, 0) and the other two follow by cyclic permutation. One of them, ξz , we could
have also identified by rewriting the line-element in spherical coordinates and noting that dl does not
contain φ dependent terms.
Conserved quantities along geodesics Assume that the metric is independent of one
coordinate, e.g. $x^0$. Then there exists a corresponding Killing vector, ξ = (1, 0, 0, 0), and
$x^0$ is a cyclic coordinate, $\partial L/\partial x^0 = 0$. With $L = d\tau/d\sigma$, the resulting conserved quantity
$\partial L/\partial \dot x^0 = \text{const.}$ can be written as
$$ \frac{\partial L}{\partial \dot x^0} = g_{0\beta}\, \frac{dx^\beta}{L\, d\sigma} = g_{0\beta}\, \frac{dx^\beta}{d\tau} = \xi \cdot u . \tag{4.10} $$
Hence the quantity ξ · u is conserved along the solutions $x^\mu(\sigma)$ of the Lagrange equation, i.e.
along geodesics.
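The conservation law can be illustrated with the unit sphere of Section 3.2.2, whose metric does not depend on φ. The sketch below is an added example under stated assumptions: it integrates the geodesic equations of $S^2$ with a Runge-Kutta scheme and compares $\xi \cdot u = \sin^2\vartheta\, \dot\phi$ for the Killing vector $\xi = \partial_\phi$ at the start and at the end of the curve. The initial data and step size are arbitrary choices.

```python
import math

def rhs(s):
    """Geodesic equations of the unit sphere for s = (theta, phi, dtheta, dphi)."""
    theta, _, dtheta, dphi = s
    return (dtheta,
            dphi,
            math.cos(theta) * math.sin(theta) * dphi ** 2,
            -2.0 * dtheta * dphi * math.cos(theta) / math.sin(theta))

def rk4_step(s, h):
    """One classical Runge-Kutta step of size h in the curve parameter sigma."""
    def shifted(k, f):
        return tuple(a + f * b for a, b in zip(s, k))
    k1 = rhs(s)
    k2 = rhs(shifted(k1, 0.5 * h))
    k3 = rhs(shifted(k2, 0.5 * h))
    k4 = rhs(shifted(k3, h))
    return tuple(a + h * (p + 2 * q + 2 * r + w) / 6
                 for a, p, q, r, w in zip(s, k1, k2, k3, k4))

def killing_charge(s):
    """xi . u = g_{phi phi} dphi/dsigma for the Killing vector xi = d/dphi."""
    return math.sin(s[0]) ** 2 * s[3]

state = (1.0, 0.0, 0.3, 0.8)            # arbitrary initial position and velocity
charge_start = killing_charge(state)
for _ in range(5000):
    state = rk4_step(state, 1e-3)
charge_end = killing_charge(state)
```

Up to the integration error, `charge_start` and `charge_end` coincide, although ϑ, φ and both velocity components change along the curve.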
¹ This equation is a much stronger constraint than it looks: its solutions are uniquely determined by the
value at a single point.
4.4 Orbits of massive particles
$$ r_{1,2} = \frac{l^2}{2M} \left[ 1 \pm \sqrt{1 - 12 M^2/l^2} \right] \tag{4.26} $$
Hence the potential has no extrema for $l/M < \sqrt{12}$ and is always negative: A particle can
reach r = 0 for small enough but finite angular momentum, in contrast to the Newtonian
case. By the same argument, there exists a last stable orbit at r = 6M, when the two extrema
$r_1$ and $r_2$ coincide for $l/M = \sqrt{12}$.
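A short numerical check of Eq. (4.26) may be helpful. The sketch below is an added illustration: it assumes the effective potential per unit mass in the form $V_{\rm eff} = -M/r + l^2/(2r^2) - M l^2/r^3$, which is the form consistent with the radial equation used later in this section, and works in units with M = 1. The function names are hypothetical.

```python
import math

def v_eff(r, l, M=1.0):
    """Assumed effective potential V = -M/r + l^2/(2 r^2) - M l^2/r^3."""
    return -M / r + l ** 2 / (2 * r ** 2) - M * l ** 2 / r ** 3

def extrema(l, M=1.0):
    """Radii r_{1,2} of Eq. (4.26); returns None if l^2 < 12 M^2 (no extrema)."""
    disc = 1.0 - 12.0 * M ** 2 / l ** 2
    if disc < -1e-12:                   # small tolerance for rounding errors
        return None
    root = math.sqrt(max(disc, 0.0))
    pref = l ** 2 / (2 * M)
    return pref * (1 - root), pref * (1 + root)   # (maximum, minimum)

def slope(r, l, h=1e-6):
    """dV_eff/dr by central differences."""
    return (v_eff(r + h, l) - v_eff(r - h, l)) / (2 * h)

r_max, r_min = extrema(4.0)             # l = 4M: unstable and stable circular orbit
r_isco = extrema(math.sqrt(12.0))       # l = sqrt(12) M: both radii merge at 6M
no_extrema = extrema(3.0)               # l < sqrt(12) M: no circular orbit
```

For l = 4M the two extrema lie at r = 4M (maximum) and r = 12M (minimum), and for $l = \sqrt{12}\, M$ both radii coincide at the last stable orbit r = 6M.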
The orbits can be classified according to the relative size of E and Veff for a given l:
² More precisely, e and l are the energy and the angular momentum per unit mass. Thus the −1 in E corresponds
to the rest mass of the test particle.
[Figure 4.1, two panels: Veff versus r/M for 0 ≤ r/M ≤ 14, with curves labelled l = 2M, 3.7M, 4M, 4.6M and 6M; vertical ranges −0.6 to 0.4 and −0.1 to 0.1.]
Figure 4.1: The effective potential Veff for various values of l/M as function of distance r/M ,
for two different scales.
• Bound orbits exist for E < 0: two circular orbits, a stable one at the minimum of Veff and
an unstable one at the maximum of Veff, and orbits that oscillate between the two turning
points.
• Scattering orbits exist for E > 0: If E > max{Veff}, the particle hits the singularity r = 0
after a finite time. For 0 < E < max{Veff}, the particle turns at the radius where E = Veff and
escapes to r → ∞.
We derive below a differential equation for r(φ), from which the orbits in the Schwarzschild
metric can be calculated. For the lazy student, several webpages exist where such orbits can
be visualised, see e.g. http://www.fourmilab.ch/gravitation/orbits/.
Radial infall We consider the free fall of a particle that is at rest at infinity, dt/dτ = 1,
E = 0 and l = 0. The radial equation (4.23) simplifies to
$$ \frac{1}{2} \left( \frac{dr}{d\tau} \right)^2 = \frac{M}{r} \tag{4.27} $$
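Integrating gives
$$ \begin{aligned} t &= -\int dr \left( \frac{2M}{r} \right)^{-1/2} \left( 1 - \frac{2M}{r} \right)^{-1} \\ &= t' + 2M \left\{ -\frac{2}{3} \left( \frac{r}{2M} \right)^{3/2} - 2 \left( \frac{r}{2M} \right)^{1/2} + \ln \frac{\sqrt{r/2M} + 1}{\sqrt{r/2M} - 1} \right\} \\ &\to \infty \quad \text{for } r \to 2M . \end{aligned} \tag{4.31} $$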
Since the coordinate time t equals the proper-time for an observer at infinity, a freely falling
particle reaches the Schwarzschild radius r = 2M only for t → ∞ for such an observer.
The last result can be derived immediately for light-rays. Choosing a light-ray in radial
direction with dφ = dϑ = 0, the metric (4.11) simplifies with $ds^2 = 0$ to
$$ \frac{dr}{dt} = 1 - \frac{2M}{r} . \tag{4.32} $$
Thus light travelling towards the star, as seen from the outside, will travel slower and slower
as it comes closer to the Schwarzschild radius r = 2M. The coordinate time is ∝ ln |1 − 2M/r|
and thus for an observer at infinity the signal will reach r = 2M again only asymptotically
for t → ∞.
Perihelion precession We recall first the derivation of the law of motion r = r(φ) in the
Newtonian case. We solve the Lagrange equations for $L = \frac{1}{2} m (\dot r^2 + r^2 \dot\phi^2) + GMm/r$,
obtaining
$$ r^2 \dot\phi = l , \tag{4.33} $$
$$ \ddot r = \frac{l^2}{r^3} - \frac{GM}{r^2} . \tag{4.34} $$
We eliminate t by
$$ \frac{dr}{dt} = \frac{dr}{d\phi} \frac{d\phi}{dt} = \frac{dr}{d\phi} \frac{l}{r^2} \equiv r' \frac{l}{r^2} \tag{4.35} $$
and introduce u = 1/r,
$$ u'' + u = \frac{GM}{l^2} . \tag{4.36} $$
The solution follows as
$$ u = \frac{GM}{l^2} (1 + e \cos\phi) . \tag{4.37} $$
We redo the same steps, starting from Eq. (4.23) for the Schwarzschild metric,
$$ \dot r^2 + \frac{l^2}{r^2} = e^2 - 1 + \frac{2M}{r} + \frac{2M l^2}{r^3} . \tag{4.38} $$
We first eliminate the time parameter and then introduce u = 1/r,
$$ (u')^2 + u^2 = \frac{e^2 - 1}{l^2} + \frac{2Mu}{l^2} + 2M u^3 . \tag{4.39} $$
We can transform this into a simpler second-order differential equation by differentiating with respect to φ.
Thereby we also eliminate the constant $(e^2 - 1)/l^2$, and dividing³ by 2u′ it follows
$$ u'' + u = \frac{M}{l^2} + 3M u^2 = \frac{GM}{l^2} + \frac{3GM}{c^2}\, u^2 . \tag{4.40} $$
In the last step we reintroduced c and G. Hence we see that the Newtonian limit corresponds
to c → ∞ (“instantaneous interactions”) or v/c → 0 (“static limit”). The latter statement
becomes clear, if one uses the virial theorem: $GMu = GM/r \sim v^2$.
In most situations, the relativistic correction is tiny. We use therefore perturbation theory
to determine an approximate solution, setting $u = u_0 + \delta u$, where $u_0$ is the Newtonian solution.
Inserting u into Eq. (4.40), we obtain
$$ (\delta u)'' + \delta u = \frac{3GM}{c^2} \left( u_0^2 + 2 u_0\, \delta u + \delta u^2 \right) . \tag{4.41} $$
Here we used that $u_0$ solves the Newtonian equation of motion (4.36). Keeping on the RHS
only the leading term $u_0^2$ results in
$$ (\delta u)'' + \delta u = \frac{3(GM)^3}{c^2 l^4} \left( 1 + 2e \cos\phi + e^2 \cos^2\phi \right) . \tag{4.42} $$
Its solution is
$$ \delta u = \frac{3(GM)^3}{c^2 l^4} \left[ 1 + e\phi \sin\phi + e^2 \left( \frac{1}{2} - \frac{1}{6} \cos(2\phi) \right) \right] . \tag{4.43} $$
The solution of the linear inhomogeneous differential equation (4.42) is found by adding the
particular solutions of the three inhomogeneous terms. With A, B and C being constant, it is
$$ u'' + u = A \;\Rightarrow\; u = A , \tag{4.44} $$
$$ u'' + u = B \cos\phi \;\Rightarrow\; u = \frac{1}{2} B \phi \sin\phi , \tag{4.45} $$
$$ u'' + u = C \cos^2\phi \;\Rightarrow\; u = C \left( \frac{1}{2} - \frac{1}{6} \cos(2\phi) \right) . \tag{4.46} $$
While the first and third term in the square bracket lead only to extremely tiny changes
in the orbital parameters, the second term is linear in φ and its effect accumulates therefore
with time. Thus we include only δu ∝ eφ sin φ in the approximate solution. Introducing
$\alpha = 3(GM)^2/(cl)^2 \ll 1$ and employing
$$ \cos(\phi(1 - \alpha)) \simeq \cos\phi + \alpha\phi \sin\phi , \tag{4.47} $$
we find
$$ u = u_0 + \delta u \simeq \frac{GM}{l^2} \left[ 1 + e (\cos\phi + \alpha\phi \sin\phi) \right] \simeq \frac{GM}{l^2} \left[ 1 + e \cos(\phi(1 - \alpha)) \right] . \tag{4.48} $$
Hence the period is 2π/(1 − α), and the ellipse precesses with
$$ \Delta\phi = \frac{2\pi}{1 - \alpha} - 2\pi \simeq 2\pi\alpha = \frac{6\pi (GM)^2}{(lc)^2} = \frac{6\pi GM}{a (1 - e^2) c^2} . \tag{4.49} $$
³ The case u′ = 0 corresponds to circular orbits, for which r is constant.
4.5 Orbits of photons
The effect increases for orbits with small major axis a and large eccentricity e. Urbain Le
Verrier first recognized in 1859 that the precession of Mercury’s perihelion deviates from
the Newtonian prediction: Perturbations by other planets lead to ∆φ = 532.3′′/century,
compared to the observed value of ∆φ = 574.1′′/century. The main part of the discrepancy
is explained by the effect of Eq. (4.49), predicting a shift of ∆φ = 43.0′′/century. (Tiny
additional corrections are induced by the quadrupole moment of the Sun (0.02′′/century) and
the Lense-Thirring effect (−0.002′′/century).)
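The quoted shift of about 43′′ per century follows directly from Eq. (4.49). The sketch below is an added numerical illustration: the values for the solar gravitational parameter and for Mercury's orbital elements are standard reference values that are not given in these notes and are therefore assumptions of the example.

```python
import math

GM_SUN = 1.32712440018e20     # gravitational parameter of the Sun in m^3/s^2 (assumed value)
C = 299_792_458.0             # speed of light in m/s
A_MERCURY = 5.7909e10         # semi-major axis of Mercury in m (assumed value)
E_MERCURY = 0.2056            # eccentricity of Mercury's orbit (assumed value)
PERIOD_DAYS = 87.969          # orbital period of Mercury in days (assumed value)

def precession_per_orbit(gm, a, e, c=C):
    """Perihelion shift in radians per revolution, Eq. (4.49)."""
    return 6 * math.pi * gm / (a * (1 - e ** 2) * c ** 2)

orbits_per_century = 36525.0 / PERIOD_DAYS
shift_rad = precession_per_orbit(GM_SUN, A_MERCURY, E_MERCURY) * orbits_per_century
shift_arcsec = math.degrees(shift_rad) * 3600
```

With these inputs, `shift_arcsec` comes out close to 43 arcseconds per century, while the shift per single orbit is only about 5 × 10⁻⁷ rad.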
The radial equation (4.50) is invariant under reparametrisations of the affine parameter,
λ → Aλ + B, since the change cancels both in b and l dλ. Consequently, the orbit of a photon
does not depend separately on the energy e and the angular momentum l, but only on the
impact parameter b of the photon.
The maximum of Weff is at r = 3M with height $1/(27M^2)$. For impact parameters $b > \sqrt{27}\, M$,
photon orbits have a turning point and photons escape to infinity. For $b < \sqrt{27}\, M$, they hit
r = 0, while for $b = \sqrt{27}\, M$ an (unstable) circular orbit is possible.
Light deflection We transform Eq. (4.50) as in the m > 0 case into a differential equation
for u(φ). For small deflections, we use again perturbation theory. In zeroth order in v/c, we
can set the RHS of
$$ u'' + u = \frac{3GM}{c^2}\, u^2 \tag{4.52} $$
to zero. The solution $u_0$ is a straight line,
$$ u_0 = \frac{\sin\phi}{b} . \tag{4.53} $$
Inserting $u = u_0 + \delta u$ gives
$$ (\delta u)'' + \delta u = \frac{3GM}{c^2} \frac{\sin^2\phi}{b^2} . \tag{4.54} $$
A particular solution is
$$ \delta u = \frac{3GM}{2 c^2 b^2} \left( 1 + \tfrac{1}{3} \cos(2\phi) \right) . \tag{4.55} $$
Thus the complete approximate solution is
$$ u = u_0 + \delta u = \frac{\sin\phi}{b} + \frac{3GM}{2 c^2 b^2} \left( 1 + \tfrac{1}{3} \cos(2\phi) \right) . \tag{4.56} $$
Considering the limit r → ∞ or u → 0 of this equation gives half of the deflection angle. The
full deflection of a light-ray with impact parameter b by a point mass M is thus
$$ \Delta\phi = \frac{4GM}{c^2 b} = \frac{2R_s}{b} . \tag{4.57} $$
For a light-ray grazing the Solar surface, $b = R_\odot$, we obtain as numerical estimate
$$ \Delta\phi_\odot = \frac{4GM_\odot}{c^2 R_\odot} = \frac{2R_s}{R_\odot} \simeq 8.5 \times 10^{-6} \approx 1.75'' . \tag{4.58} $$
For a recollection of the 1919 results see https://arxiv.org/abs/2010.13744.
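The numerical estimate (4.58) can be reproduced in a few lines. This is an added sketch: the solar gravitational parameter and the solar radius are standard reference values not quoted in these notes, and the function name is a free choice.

```python
import math

GM_SUN = 1.32712440018e20     # gravitational parameter of the Sun in m^3/s^2 (assumed value)
C = 299_792_458.0             # speed of light in m/s
R_SUN = 6.957e8               # solar radius in m (assumed value)

def deflection_angle(gm, b, c=C):
    """Deflection angle 4GM/(c^2 b) of Eq. (4.57), in radians."""
    return 4 * gm / (c ** 2 * b)

alpha_rad = deflection_angle(GM_SUN, R_SUN)        # light-ray grazing the Sun
alpha_arcsec = math.degrees(alpha_rad) * 3600
```

The result is about 8.5 × 10⁻⁶ rad, or roughly 1.75 arcseconds. Since the angle scales as 1/b, a ray passing at two solar radii is deflected by half this amount.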
Shapiro effect Shapiro suggested using the time-delay of a radar signal as a test of general
relativity. Suppose we send a radar signal from the Earth to Venus where it is reflected back
to Earth. The point $r_0$ of closest approach to the Sun is characterized by $dr/dt|_{r_0} = 0$.
Rewriting Eq. (4.50) as
$$ \dot r^2 + \frac{l^2}{r^2} \left( 1 - \frac{2M}{r} \right) = e^2 \tag{4.59} $$
and introducing the conserved quantity e in $\dot r^2$,
$$ \dot r^2 = \left( \frac{dr}{dt} \frac{dt}{d\lambda} \right)^2 = \frac{e^2}{(1 - 2M/r)^2} \left( \frac{dr}{dt} \right)^2 , \tag{4.60} $$
we find
$$ \frac{1}{(1 - 2M/r)^3} \left( \frac{dr}{dt} \right)^2 + \frac{l^2}{e^2 r^2} - \frac{1}{1 - 2M/r} = 0 . \tag{4.61} $$
We now evaluate this equation at the point of closest approach, i.e. for $dr/dt|_{r_0} = 0$,
$$ \frac{l^2}{e^2} = \frac{r_0^2}{1 - 2M/r_0} , \tag{4.62} $$
where we also restored G and c in the last step. The first term corresponds to straight-line
propagation and thus the excess time ∆t is given by the second and third term. Finally, we
can use that the orbits of both Earth and Venus are much more distant from the Sun than
the point of closest approach, $R_E, R_V \gg r_0$. Hence we obtain for the time delay
$$ \Delta t = \frac{4GM}{c^3} \left[ \ln \left( \frac{4 R_E R_V}{r_0^2} \right) + 1 \right] . \tag{4.67} $$
In Fig. 4.2, one of the first measurements of the Shapiro time-delay is shown together with
the prediction using Eq. (4.67); an excellent agreement is visible.
Figure 4.2: Measurement of the Shapiro time-delay compared to the prediction in GR.
4.6 Post-Newtonian parameters
with two unknown functions A(r) and B(r). Since the only available length is $R_g$, A and B
can be expanded as power series in $R_g/r$,
$$ A(r) = 1 + a_1 R_g/r + a_2 (R_g/r)^2 + \ldots = 1 - \frac{2GM}{c^2 r} + 2(\beta - \gamma) \left( \frac{2GM}{c^2 r} \right)^2 + \ldots \tag{4.69} $$
$$ B(r) = 1 + b_1 R_g/r + b_2 (R_g/r)^2 + \ldots = 1 + \gamma\, \frac{2GM}{c^2 r} + \ldots \tag{4.70} $$
Agreement with Newtonian gravity is achieved, if $a_1$ is the only non-zero expansion coefficient,
i.e. for $A = 1 - 2GM/(rc^2)$ and B = 1. Searching for deviations from GR, one
keeps therefore $a_1$ fixed and introduces the “post-Newtonian” parameters β and γ such
that agreement with Einstein gravity is achieved for γ = 1 and β = 1. The predictions
for the three classical tests of GR we have discussed can be redone using the metric (4.69).
Alternative theories of gravity predict the numerical values of the post-Newtonian parameters
and can thereby easily be compared to experimental results.
5 Gravitational lensing
One distinguishes three different cases of gravitational lensing, depending on the strength of
the lensing effect:
1. Strong lensing occurs when the lens is very massive and the source is close to it: In
this case light can take different paths to the observer and more than one image of
the source will appear, either as multiple images or deformed arcs of a source. In the
extreme case that a point-like source, lens and observer are aligned the image forms an
“Einstein ring”.
2. Weak Lensing: In many cases the lens is not strong enough to form multiple images or
arcs. However, the source can still be distorted and its image may be both stretched
(shear) and magnified (convergence). If all sources were well known in size and shape,
one could just use the shear and convergence to deduce the properties of the lens.
3. Microlensing: One observes only the usual point-like image of the source. However,
the additional light bent towards the observer leads to brightening of the source. Thus
microlensing is only observable as a transient phenomenon, when the lens crosses ap-
proximately the axis observer-source.
Lens equation We consider the simplest case of a point-like mass M, the lens, between the
observer O and the source S as shown in Fig. 5.1. The angle β denotes the (unobservable)
angle between the true position of the source and the direction to the lens, while $\vartheta_\pm$ are the
angles between the image positions and the direction to the lens. The corresponding distances $D_{os}$, $D_{ol}$,
and $D_{ls}$ are also depicted in Fig. 5.1 and, since $D_{ol} + D_{ls} = D_{os}$ does not hold in cosmology,
we keep all three distances. Finally, the impact parameter b is as usual the smallest distance
between the light-ray and the lens.
Then the lens equation in the “thin lens” ($b \ll D_i$) and weak deflection (α ≪ 1) limit
follows from AS + SB = AB as
$$ \beta D_{os} + \alpha D_{ls} = \vartheta D_{os} . \tag{5.1} $$
The thin lens approximation implies ϑ ≪ 1, and since β < ϑ, also β is small. Solving for β
and inserting for the deflection angle $\alpha = 4GM/(c^2 b)$ as well as $b = \vartheta D_{ol}$, we find first
$$ \beta = \vartheta - \frac{4GM}{c^2} \frac{D_{ls}}{D_{os} D_{ol}} \frac{1}{\vartheta} . \tag{5.2} $$
Multiplying by ϑ, we obtain then a quadratic equation,
$$ \vartheta^2 - \beta\vartheta - \vartheta_E^2 = 0 , \tag{5.3} $$
Figure 5.1: The source S that is off the optical axis OL by the angle β appears as two images
on opposite sides of the optical axis OL. The two images are separated by the
angles ϑ± from the optical axis OL.
Figure 5.2: Gravitational lensing of the galaxy cluster Abell 2218.
Gravity can affect this result in two ways: First, gravity can redshift the frequency of
photons, $\nu_{sr} = \nu_{obs}(1 + z)$. This can be either the gravitational redshift as in Sec. 4.3
or a cosmological redshift due to the expansion of the universe (that will be discussed in
Sec. 10.2). Thus the intensity $I_{obs}$ at the observed frequency $\nu_{obs}$ is the emitted
intensity evaluated at $\nu_{obs}(1 + z)$ and reduced by $(1 + z)^3$,
$$ I_{obs}(\nu_{obs}) = \frac{I(\nu_{sr})}{(1 + z)^3} . \tag{5.8} $$
In both cases, this redshift depends only on the initial and the final point of the photon
trajectory, but not on the actual path in-between. Thus the redshift cancels if one considers
the relative magnification of a source by gravitational lensing.
Second, gravitational lensing affects the solid angle under which the source is seen in a detector of fixed
size. As a result, the apparent brightness of a source increases proportionally to the increase
of the visible solid angle, if the source cannot be resolved as an extended object (cf. Sec...).
Hence we can compute the magnification of a source by calculating the ratio of the solid angles
visible with and without lensing.
In Fig. 5.3, we sketch how the two lensed images are stretched: An infinitesimally small
surface element 2π sin β dβ dφ ≈ 2πβ dβ dφ of the unlensed source becomes in the lens plane
$2\pi \vartheta_\pm\, d\vartheta_\pm\, d\phi$. Thus the images are tangentially stretched by $\vartheta_\pm/\beta$, while the radial size is
changed by $d\vartheta_\pm/d\beta$. Thus the magnification $a_\pm$ of the source is
$$ a_\pm = \frac{\vartheta_\pm\, d\vartheta_\pm}{\beta\, d\beta} . \tag{5.9} $$
Figure 5.3: The effect of gravitational lensing on the shape of an extended source: The surface
element 2πβdβdφ of the unlensed image at position β is transformed into the two
lensed images of size 2πϑ± dϑ± dφ at position ϑ± .
and thus
$$ a_{\rm tot} = a_+ + a_- = \frac{x^2 + 2}{x (x^2 + 4)^{1/2}} > 1 \tag{5.11} $$
with x = β/ϑE . For large separation x, the magnification atot goes to one, while the mag-
nification diverges for x → 0 as atot ∼ 1/x: In this limit we would receive light from an
infinite number of images on the Einstein circle. Physically, the approximation of a point
source breaks down when x reaches the extension of the source. Since atot is larger than one,
gravitational lensing always increases the total flux observed from a lensed source, facilitating
the observation of very faint objects. As compensation, the source appears slightly dimmed
to all those observers who do not see the source lensed.
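The total magnification can be verified numerically from the image positions. The sketch below is an added illustration under stated assumptions: it measures all angles in units of $\vartheta_E$, takes the two images as the roots of $y^2 - xy - 1 = 0$ with $y = \vartheta/\vartheta_E$ and $x = \beta/\vartheta_E$, and uses absolute values for the individual magnifications, since the image on the opposite side of the lens has negative parity. The function names are hypothetical.

```python
import math

def image_positions(x):
    """Image positions theta_+- / theta_E, the two roots of y^2 - x y - 1 = 0."""
    root = math.sqrt(x * x + 4.0)
    return 0.5 * (x + root), 0.5 * (x - root)

def magnifications(x):
    """|a_+-| = |(theta_+- / beta) d theta_+- / d beta|, following Eq. (5.9)."""
    root = math.sqrt(x * x + 4.0)
    y_plus, y_minus = image_positions(x)
    a_plus = abs(y_plus / x * 0.5 * (1.0 + x / root))
    a_minus = abs(y_minus / x * 0.5 * (1.0 - x / root))
    return a_plus, a_minus

def total_magnification(x):
    """Closed form of Eq. (5.11)."""
    return (x * x + 2.0) / (x * math.sqrt(x * x + 4.0))

x = 1.0                                   # source at one Einstein angle from the axis
a_plus, a_minus = magnifications(x)
a_tot = a_plus + a_minus
```

For x = 1 the two contributions are about 1.17 and 0.17, adding up to $3/\sqrt{5} \approx 1.34$. Evaluating `total_magnification` for decreasing x shows the 1/x growth discussed above.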
Two important applications of gravitational lensing are the search for dark matter in the
form of black holes or brown dwarfs in our own galaxy by microlensing and the determination
of the value of the cosmological constant by weak lensing observations.
In microlensing experiments that have tried to detect dark matter in the form of MACHOs
(black holes, brown dwarfs, . . . ), one observed stars of the LMC. If a MACHO with speed
v ≈ 220 km/s moves through the line-of-sight of a monitored star, the light-curve of the star is magnified
temporarily. If v is the velocity of the lens perpendicular to the line-of-sight,
$$ \beta(t) = \left[ \beta_0^2 + \frac{v^2}{D_{ol}^2} (t - t_0)^2 \right]^{1/2} \tag{5.12} $$
The magnification a(t) is symmetric around $t_0$ and its shape can be determined by inserting
typical values for $D_{ol}$ and the MACHO mass.
6 Black holes
A black hole is a solution of Einstein’s equations containing a physical singularity which in
turn is covered by an event horizon. Such a horizon acts classically as a perfect unidirectional
membrane which any causal influence can cross only towards the singularity.
changes distances, but keeps angles invariant. Thus the causal structure of two conformally
related spacetimes is identical.
A spacetime is called conformally flat if it is connected by a conformal transformation to
Minkowski space,
$$ g_{\mu\nu}(x) = \Omega^2(x)\, \eta_{\mu\nu}(x) = e^{2\omega(x)}\, \eta_{\mu\nu}(x) . \tag{6.2} $$
In particular, light-rays also propagate in conformally flat spacetimes along straight lines at
±45 degrees to the time axis.
We add two additional definitions for spacetimes with special symmetries. A stationary
spacetime has a time-like Killing vector field. In appropriate coordinates, the metric tensor
is independent of the time coordinate,
A stationary spacetime is static if it is invariant under time reversal. Thus the off-diagonal
terms g0i have to vanish, and the metric simplifies to
$$ t(\tau) = \frac{1}{a} \sinh(a\tau) \quad\text{and}\quad x(\tau) = \frac{1}{a} \cosh(a\tau) . \tag{6.5} $$
It describes one branch of the hyperbola $x^2 - t^2 = a^{-2}$. Introducing light-cone coordinates,
it follows
$$ u(\tau) = -\frac{1}{a} \exp(-a\tau) . \tag{6.7} $$
Our aim is to determine how the uniformly accelerated observer experiences Minkowski
space. As a first step, we try to find a frame {ξ, χ} comoving with the observer. In this
frame, the observer is at rest, χ(τ ) = 0, and the coordinate time ξ agrees with the proper
time, ξ = τ . Introducing comoving light-cone coordinates,
ũ = ξ − χ and ṽ = ξ + χ, (6.8)
these conditions become
ũ(τ ) = ṽ(τ ) = τ. (6.9)
Moreover, we can choose the comoving coordinates such that the metric is conformally flat,
$$ ds^2 = \Omega^2(\xi, \chi) (d\xi^2 - d\chi^2) = \Omega^2(\tilde u, \tilde v)\, d\tilde u\, d\tilde v . \tag{6.10} $$
Next we have to relate the comoving coordinates $\{\tilde u, \tilde v\}$ to Minkowski coordinates {t, x}.
Since $d\tilde u^2$ and $d\tilde v^2$ are missing in the line element, the functions $u(\tilde u, \tilde v)$ and $v(\tilde u, \tilde v)$ can depend
only on one of their two arguments. We can set therefore $u(\tilde u)$ and $v(\tilde v)$. Expressing $\dot u$ as
$$ \frac{du}{d\tau} = \frac{du}{d\tilde u} \frac{d\tilde u}{d\tau} , \tag{6.11} $$
inserting $\dot u = -au$ and $\dot{\tilde u} = 1$ we arrive at
$$ -au = \frac{du}{d\tilde u} . \tag{6.12} $$
Separating variables and integrating we end up with $u = C_1 e^{-a\tilde u}$. In the same way, we find
$v = C_2 e^{a\tilde v}$. Since the line element has to agree along the trajectory with the proper-time,
$ds^2 = d\tau^2 = du\, dv$, the two integration constants $C_1$ and $C_2$ have to satisfy the constraint
$-a^2 C_1 C_2 = 1$. Choosing $C_1 = -C_2$, the desired relation between the two sets of coordinates
becomes
$$ u = -\frac{1}{a}\, e^{-a\tilde u} \quad\text{and}\quad v = \frac{1}{a}\, e^{a\tilde v} , \tag{6.13} $$
or using Cartesian coordinates,
$$ t = \frac{1}{a}\, e^{a\chi} \sinh(a\xi) \quad\text{and}\quad x = \frac{1}{a}\, e^{a\chi} \cosh(a\xi) . \tag{6.14} $$
The spacetime described by the coordinates defining the comoving frame of the accelerated
observer,
$$ ds^2 = e^{2a\chi} (d\xi^2 - d\chi^2) , \tag{6.15} $$
is called Rindler spacetime. It is locally equivalent to Minkowski space but differs globally.
If we vary the Rindler coordinates over their full range, ξ ∈ R and χ ∈ R, then we cover
only the one quarter of Minkowski space with x > |t|. Thus for an accelerated observer an
event horizon exists: Evaluating on a hypersurface of constant comoving time, ξ = const., the
physical distance from χ = −∞ to the observer placed at χ = 0 gives
$$ d = \int_{-\infty}^{0} d\chi\, \sqrt{|g_{\chi\chi}|} = \frac{1}{a} . \tag{6.16} $$
This corresponds to the coordinate distance between the observer and the horizon in
Minkowski coordinates.
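The relation between Rindler and Minkowski coordinates can be tested numerically. The following sketch is an added illustration, with arbitrary values for the acceleration and the sample point: it pulls the Minkowski line element $dt^2 - dx^2$ back to the coordinates (ξ, χ) via the map (6.14), using central differences for the Jacobian, and compares the result with Eq. (6.15).

```python
import math

def minkowski(xi, chi, a):
    """Map Rindler coordinates (xi, chi) to Minkowski coordinates (t, x), Eq. (6.14)."""
    f = math.exp(a * chi) / a
    return f * math.sinh(a * xi), f * math.cosh(a * xi)

def induced_metric(xi, chi, a, h=1e-6):
    """Components (g_xixi, g_xichi, g_chichi) of dt^2 - dx^2 in Rindler coordinates."""
    t_p, x_p = minkowski(xi + h, chi, a)
    t_m, x_m = minkowski(xi - h, chi, a)
    t_xi, x_xi = (t_p - t_m) / (2 * h), (x_p - x_m) / (2 * h)
    t_p, x_p = minkowski(xi, chi + h, a)
    t_m, x_m = minkowski(xi, chi - h, a)
    t_chi, x_chi = (t_p - t_m) / (2 * h), (x_p - x_m) / (2 * h)
    return (t_xi ** 2 - x_xi ** 2,
            t_xi * t_chi - x_xi * x_chi,
            t_chi ** 2 - x_chi ** 2)

a, xi, chi = 0.5, 0.8, -0.3               # arbitrary acceleration and sample point
g_xixi, g_xichi, g_chichi = induced_metric(xi, chi, a)
conformal_factor = math.exp(2 * a * chi)  # expected e^{2 a chi} of Eq. (6.15)
t, x = minkowski(xi, chi, a)
```

The components should come out as $g_{\xi\xi} = e^{2a\chi}$, $g_{\chi\chi} = -e^{2a\chi}$ and $g_{\xi\chi} = 0$, and every image point satisfies x > |t|, in line with the statement that Rindler coordinates cover only one quarter of Minkowski space.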
Definition: The particle horizon is the maximal distance from which we can receive signals,
while the event horizon defines the maximal distance to which we can send signals.
6.1 Rindler spacetime and the Unruh effect
Exponential redshift Later we will discuss gravitational particle production as the effect
of a non-trivial Bogolyubov transformation between different vacua. Before we apply this
formalism, we will examine the basis of this physical phenomenon in a classical picture. As a
starter, we want to derive the formula for the relativistic Doppler effect. Consider an observer
who is moving with constant velocity v relative to the Cartesian inertial system xµ = (t, x)
where we neglect the two transverse dimensions. We can parameterise the trajectory of the
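observer as
$$t(\tau) = \gamma\tau, \qquad x(\tau) = \gamma v\tau, \tag{6.17}$$
where γ denotes its Lorentz factor. A monochromatic wave of a scalar, massless field φ(k) ∝
exp[−iω(t − x)] will be seen by the moving observer as
$$\phi(\tau) \equiv \phi(x^\mu(\tau)) \propto \exp\left[-i\omega\tau(\gamma-\gamma v)\right] = \exp\left[-i\omega\tau\sqrt{\frac{1-v}{1+v}}\,\right]. \tag{6.18}$$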
Thus this simple calculation reproduces the usual Doppler formula, where the frequency ω of
the scalar wave is shifted as
$$\omega' = \sqrt{\frac{1-v}{1+v}}\;\omega. \tag{6.19}$$
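The step from γ(1 − v) to the square root in the Doppler formula is easy to verify numerically. This is a small sketch in units with c = 1; the two function names are hypothetical helpers introduced only for this check.

```python
import math

def doppler_factor(v):
    """omega'/omega computed as gamma * (1 - v), for |v| < 1 (units c = 1)."""
    if not -1.0 < v < 1.0:
        raise ValueError("velocity must satisfy |v| < 1")
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (1.0 - v)

def doppler_factor_closed_form(v):
    """The same ratio written as sqrt((1 - v) / (1 + v))."""
    return math.sqrt((1.0 - v) / (1.0 + v))

# Both expressions agree, e.g. v = 0.6 gives a redshift factor of 0.5.
ratio = doppler_factor(0.6)
```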
Next we apply the same method to the case of an accelerated observer. Then t(τ) = a⁻¹ sinh(aτ)
and x(τ) = a⁻¹ cosh(aτ). Inserting this trajectory again into a monochromatic
wave with φ(k) ∝ exp(−iω(t − x)) now gives
$$\phi(\tau) \propto \exp\left(-\frac{i\omega}{a}\left[\sinh(a\tau)-\cosh(a\tau)\right]\right) = \exp\left(\frac{i\omega}{a}\exp(-a\tau)\right) \equiv e^{-i\vartheta}. \tag{6.20}$$
Thus an accelerated observer does not see a monochromatic wave, but a superposition of
plane waves with varying frequencies. Defining the instantaneous frequency by
$$\omega(\tau) = \frac{d\vartheta}{d\tau} = \omega\exp(-a\tau), \tag{6.21}$$
we see that the frequency measured by the accelerated observer is exponentially redshifted. As
the next step, we want to determine the power spectrum P(ν) = |φ(ν)|² measured by the observer,
for which we have to calculate the Fourier transform φ(ν).
gives
$$\phi(\nu) = \frac{1}{a}\int_0^\infty dy\; y^{-i\nu/a-1}\,e^{i(\omega/a)y}. \tag{6.23}$$
On the other hand, we can rewrite Euler’s integral representation of the Gamma function as
$$\int_0^\infty dt\; t^{z-1}e^{-bt} = b^{-z}\,\Gamma(z) = \exp(-z\ln b)\,\Gamma(z) \tag{6.24}$$
for ℜ(z) > 0 and ℜ(b) > 0. Comparing these two expressions, we see that they agree if we set
z = −iν/a + ε and b = −iω/a + ε. Here we added an infinitesimal positive real quantity ε > 0
to ensure the convergence of the integral. In order to determine the correct phase of b−z , we
have rewritten this factor as exp(−z ln b) and have used
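$$\ln b = \lim_{\varepsilon\to 0}\ln\left(-\frac{i\omega}{a}+\varepsilon\right) = \ln\left|\frac{\omega}{a}\right| - \frac{i\pi}{2}\,\mathrm{sign}(\omega/a). \tag{6.25}$$
$$\phi(\nu) = \frac{1}{a}\left(\frac{\omega}{a}\right)^{i\nu/a}\Gamma(-i\nu/a)\,e^{\pi\nu/(2a)}. \tag{6.26}$$
$$\phi(-\nu) = \phi(\nu)\,e^{-\pi\nu/a} = \frac{1}{a}\left(\frac{\omega}{a}\right)^{i\nu/a}\Gamma(-i\nu/a)\,e^{-\pi\nu/(2a)}. \tag{6.27}$$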
Using the reflection formula of the Gamma function for imaginary arguments,
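$$\Gamma(ix)\,\Gamma(-ix) = \frac{\pi}{x\sinh(\pi x)}, \tag{6.28}$$
$$P(-\nu) = \frac{\pi}{a^2}\,\frac{e^{-\pi\nu/a}}{(\nu/a)\sinh(\pi\nu/a)} = \frac{\beta}{\nu}\,\frac{1}{e^{\beta\nu}-1} \tag{6.29}$$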
with β = 2π/a. Remarkably, the dependence on the frequency ω of the scalar wave (still
present in the Fourier transform φ(ν)) has dropped out of the negative frequency part of
the power spectrum P(−ν), which corresponds to a thermal Planck law with temperature
T = 1/β = a/(2π).
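The two forms of P(−ν) in (6.29) can be compared numerically, and the temperature T = a/(2π) can be converted to SI units. The sketch below assumes the standard restoration of constants, T = ħa/(2πc k_B), which is not written out in the text; the function names and the rounded constant values are illustrative.

```python
import math

HBAR = 1.054571817e-34   # J s
C = 299_792_458.0        # m / s
K_B = 1.380649e-23       # J / K

def power_from_gamma(nu, a):
    """P(-nu) in the form obtained from the Gamma-function reflection formula."""
    x = nu / a
    return (math.pi / a**2) * math.exp(-math.pi * x) / (x * math.sinh(math.pi * x))

def power_planck(nu, a):
    """P(-nu) written as a Planck factor with beta = 2*pi/a."""
    beta = 2.0 * math.pi / a
    return (beta / nu) / math.expm1(beta * nu)

def unruh_temperature_si(acceleration):
    """T = hbar * a / (2 * pi * c * k_B), in kelvin for a in m/s^2."""
    return HBAR * acceleration / (2.0 * math.pi * C * K_B)

# An acceleration of 9.81 m/s^2 corresponds to a temperature of order 1e-20 K.
t_earth = unruh_temperature_si(9.81)
```

The tiny value for everyday accelerations illustrates why the effect is far from direct observation.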
The occurrence of negative frequencies is the classical analogue for the mixing of posi-
tive and negative frequencies in the Bogolyubov method. Therefore we expect that on the
quantum level a uniformly accelerated detector will measure a thermal Planck spectrum with
temperature T = 1/β = a/(2π). This phenomenon is called the Unruh effect and T = a/(2π) the
Unruh temperature.
6.2 Schwarzschild black holes
$$0 = n_\mu n^\mu = g_{\mu\nu}\,n^\mu n^\nu \tag{6.30}$$
we see that the line element vanishes on the horizon, ds = 0. Hence the (future) light-cones
at each point of an event horizon are tangential to the horizon.
Eddington–Finkelstein coordinates We next try to find new coordinates which are regular
at r = 2M and valid in the whole range 0 < r < ∞. Such a coordinate transformation
has to be singular at r = 2M , otherwise we cannot hope to cancel the singularity present in
the Schwarzschild coordinates. We can eliminate the troublesome factor g_rr = (1 − 2M/r)⁻¹ by
introducing a new radial coordinate r* defined by
$$dr^* = \frac{dr}{1-\frac{2M}{r}}. \tag{6.31}$$
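As a quick check, one can integrate this definition and differentiate the result numerically. The sketch below assumes the integrated form r* = r + 2M ln|r/2M − 1| with the integration constant set to zero; the function names and step size are illustrative.

```python
import math

def tortoise(r, M):
    """r* = r + 2M ln|r/(2M) - 1|, integration constant chosen as zero (assumption)."""
    return r + 2.0 * M * math.log(abs(r / (2.0 * M) - 1.0))

def dtortoise_dr(r, M, h=1e-6):
    """Central finite-difference estimate of dr*/dr."""
    return (tortoise(r + h, M) - tortoise(r - h, M)) / (2.0 * h)

# Outside the horizon (r = 3M) the derivative should equal 1/(1 - 2M/r) = 3,
# inside (r = M) the same formula gives -1.
outside = dtortoise_dr(3.0, 1.0)
inside = dtortoise_dr(1.0, 1.0)
```

The logarithm diverges at r = 2M, which is exactly the singular behaviour the coordinate transformation needs.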
Figure 6.1: Left: The Schwarzschild spacetime using advanced Eddington–Finkelstein coor-
dinates; the singularity is shown by a zigzag line, the horizon by a thick line and
geodesics by thin lines. Right: Collapse of a star modelled by pressureless matter;
dashed lines show geodesics, the thin solid line encompasses the collapsing stellar
surface.
the metric is invertible. Moreover, r ∗ was defined by (6.32) initially only for r > 2M , but we
can use this definition also for r < 2M , arriving at the same expression (6.36). Therefore,
the metric using the advanced time parameter ṽ is regular at 2M and valid for all r > 0. We
can view this metric hence as an extension of the r > 2M part of the Schwarzschild solution,
similar to the process of analytic continuation of complex functions. The price we have to
pay for a non-zero determinant at r = 2M are non-diagonal terms in the metric. As a result,
the spacetime described by (6.36) is not symmetric under the exchange t → −t. We will see
shortly the consequences of this asymmetry.
We now study the behaviour of radial light-rays, which are determined by ds2 = 0 and
dφ = dϑ = 0. Thus radial light-rays satisfy Adṽ 2 − 2dṽdr = 0, which is trivially solved by
ingoing light-rays, dṽ = 0 and thus ṽ = const. The solutions for dṽ ≠ 0 are given by (6.33).
Additionally, the horizon r = 2M which is formed by stationary light-rays satisfies ds2 = 0.
In order to draw a spacetime diagram, it is more convenient to replace the light-like coordinate
ṽ by a new time-like coordinate. We show in the left panel of Fig. 6.1 geodesics using as new
time coordinate t̃ = ṽ − r. Then the ingoing light-rays are straight lines at 45° to the r axis.
Radial light-rays which are outgoing for r > 2M and ingoing for r < 2M follow Eq. (6.34).
A few future light-cones are indicated: they are formed by the intersection of light-rays, and
they tilt towards r = 0 as they approach the horizon. At r = 2M , one light-ray forming the
light-cone becomes stationary and part of the horizon, while the remaining part of the cone
lies completely inside the horizon.
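The tilting of the light-cones can be made quantitative. The sketch below assumes A = 1 − 2M/r in the light-ray condition A dṽ² − 2 dṽ dr = 0 and uses t̃ = ṽ − r, so that the second family of rays has slope dt̃/dr = 2/A − 1; the function names are illustrative.

```python
def ingoing_slope():
    """dt~/dr for ingoing rays (v~ = const.): straight lines at 45 degrees."""
    return -1.0

def outgoing_slope(r, M):
    """dt~/dr = 2/A - 1 with A = 1 - 2M/r for the second family of radial light-rays."""
    A = 1.0 - 2.0 * M / r
    if A == 0.0:
        raise ValueError("at r = 2M this light-ray is stationary (dr = 0)")
    return 2.0 / A - 1.0

# Far away the cone opens symmetrically (slope close to +1); inside the horizon
# the slope is negative, so both families move towards r = 0.
far = outgoing_slope(1.0e9, 1.0)
inside = outgoing_slope(1.0, 1.0)
```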
Let us now discuss how Fig. 6.1 would look using the retarded Eddington–Finkelstein coordinate
ũ. Now the outgoing radial null geodesics are straight lines at 45°. They start from
the singularity, cross r = 2M smoothly and continue to spatial infinity. Such a situation,
where the singularity is not covered by an event horizon, is called a “white hole”. The cosmic
censorship hypothesis postulates that singularities formed in gravitational collapse are always
covered by event horizons. This implies that the time-reversal invariance of the Einstein
equations is broken by their solutions. In particular, only the BH solution using the advanced
Eddington–Finkelstein coordinates should be realised by nature; otherwise we should expect causality
to be violated. This behaviour may be compared to classical electrodynamics, where all
solutions are described by the retarded Green function, while the advanced Green function
seems to have no relevance.
Collapse to a BH After a star has consumed its nuclear fuel, gravity can be balanced only
by the Fermi degeneracy pressure of its constituents. Increasing the total mass of the star
remnant, the stellar EoS is driven towards the relativistic regime until the star becomes
unstable. As a result, the collapse of its core to a BH seems to be inevitable for a sufficiently
heavy star.
Let us consider a toy model for such a gravitational collapse. We describe the star by
a spherically symmetric cloud of pressureless matter. While the assumption of negligible
pressure is unrealistic, it implies that particles at the surface of the star follow radial geodesics
in the Schwarzschild spacetime. Thus we do not have to bother about the interior solution
of the star, where T_µν ≠ 0 and our vacuum solution does not apply. In advanced
Eddington–Finkelstein coordinates, the collapse is schematically shown in the right panel of Fig. 6.1.
At the end of the collapse, a stationary Schwarzschild BH has formed. Note that in our toy
model the event horizon forms before the singularity, as required by the cosmic censorship
hypothesis. The horizon grows from r = 0 following the light-like geodesic a shown by the
thin black line until it reaches its final size Rs = 2M . What happens if we drop a lump
of matter δM on a radial geodesic into the BH? Since we do not add angular momentum
to the BH, the final stage is, according to Birkhoff’s theorem, still a Schwarzschild BH.
All deviations from spherical symmetry corresponding to gradient energy in the intermediate
regime are radiated away as gravitational waves. Thus in the final stage, the only
change is an increase of the horizon size, Rs → 2(M + δM). Therefore some light-rays (e.g. b)
which we expected to escape to spatial infinity will be trapped. Similarly, light-ray a, which
we thought to form the horizon, will be deflected by the increased gravitational attraction
towards the singularity. In essence, knowing only the spacetime up to a fixed time t, we
are not able to decide which light-rays form the horizon. The event horizon of a black hole
is a global property of the spacetime: It is not only independent of the observer but also
influenced by the complete spacetime.
What does the stellar collapse look like for an observer at large distances? Let us assume
that the observer uses a neutrino detector and is able to measure the neutrino luminosity
L_ν(r) = dE_ν/dt = d(N_ν ω_ν)/dt emitted by a shell of stellar material at radius r. In order to
determine the luminosity L_ν(r), we have to connect r and t. Linearising Eq. (??) around
r = 2M gives
$$\frac{r-2M}{r_0-2M} = e^{-(t-t_0)/2M}. \tag{6.37}$$
For an observer at large distance r0 , the time difference between two pulses sent by a shell
falling into a BH increases thus exponentially for r → 2M . As a result the energy ων of an
individual neutrino is also exponentially redshifted
A more detailed analysis confirms the expectation that then also the luminosity decreases
exponentially. Thus an observer at infinity will not see shells which slow down logarithmically
as they fall towards r → 2M, as suggested by Eq. (??). Instead the signal emitted by the shell
will fade away exponentially, with the short characteristic time scale of M = (M/M_Pl) t_Pl ≈
10⁻⁵ s for a stellar-size BH.
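The quoted time scale can be reproduced by restoring units, M → GM/c³. This is a rough sketch with rounded SI constants; the function name and the choice of one solar mass as a "stellar-size" BH are assumptions made for illustration.

```python
G = 6.67430e-11        # m^3 kg^-1 s^-2
C = 299_792_458.0      # m / s
M_SUN = 1.98847e30     # kg

def gravitational_time(mass_kg):
    """Time scale G M / c^3 associated with a mass M, in seconds."""
    return G * mass_kg / C**3

# For one solar mass G M / c^3 is about 5e-6 s; the e-folding time 2M
# appearing in Eq. (6.37) is then about 1e-5 s.
t_sun = gravitational_time(M_SUN)
e_folding_time = 2.0 * t_sun
```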
Kruskal coordinates We have been able to extend the Schwarzschild solution into two dif-
ferent branches; a BH solution using the advanced time parameter ṽ and a white hole solution
using the retarded time parameter ũ. The analogy with the analytic continuation of complex
functions leads naturally to the question of whether we can combine these two branches into
one common solution. Moreover, our experience with the Rindler metric suggests that an
event horizon where energies are exponentially redshifted implies the emission of a thermal
spectrum. If true, our BH would not be black after all. One way to test this suggestion is to
relate the vacua as defined by different observers via a Bogolyubov transformation. In order
to simplify this process, we would like to find new coordinates for which the Schwarzschild
spacetime is conformally flat.
An obvious attempt to proceed is to use both the advanced and the retarded time param-
eters. For most of our discussion, it is sufficient to concentrate on the t, r coordinates in the
line element ds2 = ds̄2 + r 2 dΩ, and to neglect the angular dependence from the r 2 dΩ part.
We start by eliminating r in favour of r ∗ ,
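$$d\bar s^2 = \left(1-\frac{2M}{r(r^*)}\right)\left(dt^2 - dr^{*2}\right), \tag{6.39}$$
where r has to be expressed through r*. This metric is conformally flat but the definition
of r(r*) on the horizon contains the ill-defined factor ln(r/2M − 1). Clearly, a new set of
coordinates where this factor is exponentiated is what we are seeking.
This is achieved by introducing both Eddington–Finkelstein parameters,
$$\tilde u = t - r^*, \qquad \tilde v = t + r^*, \tag{6.40}$$
for which the metric simplifies to
$$d\bar s^2 = \left(1-\frac{2M}{r(\tilde u,\tilde v)}\right)d\tilde u\,d\tilde v. \tag{6.41}$$
From (6.32) and (6.40), it follows
$$\frac{\tilde v-\tilde u}{2} = r^*(r) = r + 2M\ln\left(\frac{r}{2M}-1\right) - 2Ma, \tag{6.42}$$
or
$$1-\frac{2M}{r} = \frac{2M}{r}\exp\left(\frac{\tilde v-\tilde u}{4M}\right)\exp\left(a-\frac{r}{2M}\right). \tag{6.43}$$
This allows us to eliminate the singular factor 1 − 2M/r in (6.41), obtaining
$$d\bar s^2 = \frac{2M}{r}\exp\left(a-\frac{r}{2M}\right)\exp\left(-\frac{\tilde u}{4M}\right)d\tilde u\,\exp\left(\frac{\tilde v}{4M}\right)d\tilde v. \tag{6.44}$$
Finally, we change to Kruskal light-cone coordinates u and v defined by
$$u = -4M\exp\left(-\frac{\tilde u}{4M}\right) \quad\text{and}\quad v = 4M\exp\left(\frac{\tilde v}{4M}\right), \tag{6.45}$$
arriving at
$$ds^2 = \frac{2M}{r}\exp\left(a-\frac{r}{2M}\right)du\,dv + r^2 d\Omega. \tag{6.46}$$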
[Figure 6.2: Kruskal diagram of the Schwarzschild spacetime in the R−T plane, showing the
regions I, II, I′ and II′ together with lines of constant t and constant r.]
Kruskal diagram The coordinates ũ, ṽ cover only the exterior r > 2M of the Schwarzschild
spacetime, and thus u, v are initially only defined for r > 2M . Since they are regular at the
Schwarzschild radius, we can extend these coordinates towards r = 0. In order to draw the
spacetime diagram of the full Schwarzschild spacetime shown in Fig. 6.2, it is useful to go
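back to time- and space-like coordinates via
$$u = T - R, \qquad v = T + R. \tag{6.47}$$
Then the connection between the pairs of coordinates {T, R}, {u, v} and {t, r} is given by
$$uv = T^2 - R^2 = -16M^2\exp\left(\frac{r^*}{2M}\right) = -16M^2\left(\frac{r}{2M}-1\right)\exp\left(\frac{r}{2M}-a\right), \tag{6.48a}$$
$$\frac{u}{v} = \frac{T-R}{T+R} = -\exp\left[-t/(2M)\right]. \tag{6.48b}$$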
Lines with r = const. are given by uv = T² − R² = const. They are thus hyperbolas, shown
as dotted lines in Fig. 6.2. Lines with t = const. are determined by u/v = const. and are
thus given by straight (solid) lines through the origin. In particular, null geodesics correspond to
straight lines with angle 45° in the R−T diagram. The horizon r = 2M is given by u = 0
or v = 0. Hence two separate horizons exist: a past horizon at t = −∞ (for v = 0 and thus
T = −R) and a future horizon at t = +∞ (for u = 0 and thus T = R). Also, the singularity
at r = 0 corresponds to two separate lines in the R−T Kruskal diagram¹ and is given by
$$T = \pm\sqrt{16M^2 + R^2}. \tag{6.49}$$
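The chain of coordinate changes from (t, r) to the Kruskal coordinates can be traced step by step for a point outside the horizon. The sketch below follows (6.40), (6.42) and (6.45) and assumes T = (u + v)/2 and R = (v − u)/2, which is the inversion of uv = T² − R²; the function name and the default a = 0 for the integration constant are illustrative choices.

```python
import math

def kruskal_from_schwarzschild(t, r, M, a=0.0):
    """Map exterior Schwarzschild coordinates (t, r > 2M) to (u, v, T, R)."""
    if r <= 2.0 * M:
        raise ValueError("this chart is initially defined only for r > 2M")
    r_star = r + 2.0 * M * math.log(r / (2.0 * M) - 1.0) - 2.0 * M * a
    u_tilde = t - r_star                       # retarded time parameter
    v_tilde = t + r_star                       # advanced time parameter
    u = -4.0 * M * math.exp(-u_tilde / (4.0 * M))
    v = 4.0 * M * math.exp(v_tilde / (4.0 * M))
    T = 0.5 * (u + v)                          # time-like combination (assumption)
    R = 0.5 * (v - u)                          # space-like combination (assumption)
    return u, v, T, R

# For any t, the product uv = T^2 - R^2 depends only on r.
u, v, T, R = kruskal_from_schwarzschild(t=0.7, r=3.0, M=1.0)
```

Evaluating the function for several values of t at fixed r traces out one of the r = const. curves of the diagram.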
¹ Recall that we suppress two space dimensions: thus a point in the R−T Kruskal diagram
corresponds to a sphere S², and a line to R × S².
$$ds^2 = A(r)\,dt^2 - \frac{dr^2}{A(r)} - r^2\left(d\vartheta^2 + \sin^2\vartheta\,d\phi^2\right) \tag{6.50}$$
with
$$A(r) = 1 - \frac{2GM}{r} + \frac{GQ^2}{4\pi r^2}. \tag{6.51}$$
The metric is time-independent and axially symmetric. Hence two obvious Killing vectors
are, as in the Schwarzschild case, ξ = (1, 0, 0, 0) and η = (0, 0, 0, 1), where we again order
coordinates as {t, r, ϑ, φ}.
6.4 Kerr black holes
The presence of the mixed term gtφ means that the metric is stationary, but not static—as
one expects for a star or BH rotating with constant rotation velocity. Finally, the metric is
asymptotically flat and the weak-field limit shows that L is the angular momentum of the
rotating black hole.
Its main properties are
• The metric is asymptotically flat.
• Potential singularities at ρ = 0 and ∆ = 0.
• The weak-field limit shows that L is the angular momentum of the rotating black hole.
• The presence of the mixed term g_tφ means that infalling particles (and thus space-time)
are dragged around the rotating black hole.
Orbits in the equatorial plane ϑ = π/2 could be derived in the same way as for the
Schwarzschild case; for ϑ ≠ π/2 the discussion becomes much more involved.
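$$ds^2 = dt^2 - \frac{\rho^2}{r^2+a^2}\,dr^2 - \rho^2 d\vartheta^2 - (r^2+a^2)\sin^2\vartheta\,d\phi^2. \tag{6.54}$$
The comparison with the Minkowski metric shows that
$$x = \sqrt{r^2+a^2}\,\sin\vartheta\cos\phi, \qquad y = \sqrt{r^2+a^2}\,\sin\vartheta\sin\phi, \qquad z = r\cos\vartheta. \tag{6.55}$$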
Hence the singularity at r = 0 and ϑ = π/2 corresponds to a ring of radius a in the equatorial
plane z = 0 of the Kerr black hole.
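The ring structure follows directly from the coordinate map (6.55). A minimal sketch, with a hypothetical function name, evaluates the map at r = 0 and ϑ = π/2 for several angles φ.

```python
import math

def kerr_m0_to_cartesian(r, theta, phi, a):
    """Cartesian (x, y, z) for the coordinates of Eq. (6.55) with rotation parameter a."""
    rho_xy = math.sqrt(r * r + a * a) * math.sin(theta)
    return rho_xy * math.cos(phi), rho_xy * math.sin(phi), r * math.cos(theta)

# All points with r = 0 and theta = pi/2 lie on a circle of radius a in the plane z = 0.
ring = [kerr_m0_to_cartesian(0.0, math.pi / 2, phi, a=0.8) for phi in (0.0, 1.2, 3.0)]
```

In contrast, r = 0 with ϑ ≠ π/2 maps to points inside the disc bounded by this ring.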
Figure 6.3: Structure of a Kerr black hole: The ergoregion (grey area) is bounded by the
outer ergosurface r₊ = M + √(M² − a² cos²ϑ) and the outer event horizon r_h = M + √(M² − a²),
followed by the inner event horizon r_h = M − √(M² − a²), the ...
Using r_±² + a² = 2M r_±, we obtain
$$ds^2 = \rho_+^2\,d\vartheta^2 + \left(\frac{2Mr_+}{\rho_+}\right)^2\sin^2\vartheta\,d\phi^2. \tag{6.59}$$
Hence the metric determinant g₂ restricted to the angular variables is given by √g₂ =
√(g_ϑϑ g_φφ) = 2M r₊ sin ϑ and integration gives the area A of the horizon as
$$A = \int_0^{2\pi} d\phi\int_0^{\pi} d\vartheta\,\sqrt{g_2} = 8\pi M r_+ = 8\pi M\left(M + \sqrt{M^2-a^2}\right). \tag{6.60}$$
Note that the area depends on the angular momentum of the black hole that can in turn
be manipulated by dropping material into the hole. The horizon area A for fixed mass M
becomes maximal for a non-rotating black hole, A = 16πM 2 , and decreases to A = 8πM 2 for
a maximally rotating one with a = M . For a > M , the metric component grr = ∆ has no
real zero and thus no event horizon exists.
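The limiting values quoted above follow from the area formula (6.60). A short sketch in units G = c = 1, with illustrative function names:

```python
import math

def outer_horizon_radius(M, a):
    """r_+ = M + sqrt(M^2 - a^2); no horizon exists for |a| > M."""
    if abs(a) > M:
        raise ValueError("no event horizon for a > M")
    return M + math.sqrt(M * M - a * a)

def horizon_area(M, a):
    """Horizon area A = 8 pi M r_+ of a Kerr black hole."""
    return 8.0 * math.pi * M * outer_horizon_radius(M, a)

# Non-rotating: A = 16 pi M^2; maximally rotating (a = M): A = 8 pi M^2.
area_static = horizon_area(1.0, 0.0)
area_extremal = horizon_area(1.0, 1.0)
```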
(For an interpretation see the space-time diagram 6.4 that uses coordinates of the advanced
Eddington-Finkelstein type.)
Ergosphere and dragging of inertial frames The Kerr metric is a special case of a metric
with g_tφ ≠ 0. As a result, both massive and massless particles with zero angular momentum
falling into a Kerr black hole will acquire a non-zero angular rotation velocity ω = dφ/dt as
seen by an observer from infinity.
We consider a light-ray with dϑ = dr = 0. Then the line element becomes
$$g_{tt}\,dt^2 + 2g_{t\phi}\,dt\,d\phi + g_{\phi\phi}\,d\phi^2 = 0. \tag{6.61}$$
Dividing by g_φφ dt², we obtain a quadratic equation for the angular rotation velocity ω =
dφ/dt,
$$\omega^2 + 2\,\frac{g_{t\phi}}{g_{\phi\phi}}\,\omega + \frac{g_{tt}}{g_{\phi\phi}} = 0. \tag{6.62}$$
Extension of the Kerr metric The behavior of geodesics for r → 0 (and ϑ 6= π/2) suggests
that one can extend the space-time to r < 0. For r → −∞, the extension becomes asymptot-
ically flat, i.e. there exists a second Minkowski space that is connected to ours via the Kerr
black hole. Since for negative r, ∆ is always positive, ∆ = r 2 − 2M r + a2 > 0, the singularity
is not protected by an event horizon in the “other” Minkowski space. Moreover, there exist
closed time-like curves: Consider a curve depending only on φ in the equatorial plane, the
line-element for small, negative r is
$$ds^2 = \left(r^2 + a^2 + \frac{2Ma^2}{r}\right)d\phi^2 \sim \frac{2Ma^2}{r}\,d\phi^2 < 0 \tag{6.69}$$
² Note that ω₁ < ω₂, because of g_φφ < 0. Hence photons (and thus also spacetime) are corotating, as expected.
time-like.
The cosmic censorship hypothesis postulates that singularities formed in gravitational
collapse are always covered by event horizons. Thus we are in the “r > 0” Minkowski space of
all Kerr black holes, and the r < 0 region is simply a mathematical artefact of a highly
symmetrical manifold, not showing up in real physical situations.
Penrose process and the area theorem The total energy of a Kerr BH consists of its rest
energy and its rotational energy. These two quantities control the size of the event horizon
and therefore it is important to understand how they change when matter is dropped into the BH.
The energy of any particle moving on a geodesic is conserved, E = p · ξ. Inside the
ergosphere, the Killing vector ξ is space-like and the quantity E is thus the component of a
spatial momentum which can have both signs. This led Penrose to entertain the following
gedankenexperiment: Suppose the spacecraft A starts at infinity and falls into the ergosphere.
There it splits into two parts: B is dropped into the BH, while C escapes to infinity. In the
splitting process, four-momentum has to be conserved, pA = pB + pC . We can now choose
a time-like geodesic for B falling into the BH such that EB < 0. Then EC > EA and the
escaping part C of the spacecraft has at infinity a higher energy than initially.
The Penrose process decreases both the mass and the angular momentum of the BH by an
amount equal to that of the spacecraft B falling into the BH. Now we want to show that the
changes are correlated in such a way that the area of the BH increases. Let us first define a
new Killing vector,
K = ξ + ωH η.
This Killing vector is null on the horizon and time-like outside. It corresponds to the four-
velocity with the maximal possible rotation velocity. Now we use EB = pB ·ξ and LB = −pB ·η
and
pB · K = pB · (ξ + ωH η) = EB − ωH LB > 0, (6.70)
to obtain the bound LB < EB /ωH . Since EB < 0, the added angular momentum is negative,
LB < 0.
The mass and the angular momentum of the BH change by δM = EB and δL = LB , when
particle B drops into the BH. Thus
$$\delta M > \omega_H\,\delta L = \frac{a\,\delta L}{r_+^2 + a^2}. \tag{6.71}$$
Now we define the irreducible mass of BH as the mass of that Schwarzschild BH whose event
horizon has the same area,
$$M_{\rm irr}^2 = \frac{1}{2}\left(M^2 + \sqrt{M^4 - L^2}\right) \tag{6.72}$$
or
$$M^2 = M_{\rm irr}^2 + \left(\frac{L}{2M_{\rm irr}}\right)^2. \tag{6.73}$$
Thus we can interpret the total mass as the Pythagorean sum of the irreducible mass and a
contribution related to the rotational energy. Differentiating the relation (6.72) results in
$$\delta M_{\rm irr} = \frac{a}{4M_{\rm irr}\sqrt{M^2-a^2}}\left(\omega_H^{-1}\,\delta M - \delta L\right). \tag{6.74}$$
Our bound implies now δMirr > 0 or δA > 0. Thus the surface of a Kerr BH can only increase,
even when its mass decreases.
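The statement can be illustrated with numbers. The sketch below assumes the irreducible mass in the form M_irr² = ½(M² + √(M⁴ − L²)), which follows from the horizon area (6.60) with L = aM, and ω_H = a/(r₊² + a²) as in (6.71); the function names and the sample Penrose step are illustrative.

```python
import math

def irreducible_mass(M, L):
    """M_irr from M_irr^2 = (M^2 + sqrt(M^4 - L^2)) / 2 (units G = c = 1)."""
    return math.sqrt(0.5 * (M * M + math.sqrt(M**4 - L * L)))

def mass_from_irreducible(M_irr, L):
    """Pythagorean sum M^2 = M_irr^2 + (L / (2 M_irr))^2."""
    return math.sqrt(M_irr**2 + (L / (2.0 * M_irr))**2)

def horizon_angular_velocity(M, L):
    """omega_H = a / (r_+^2 + a^2) with a = L / M."""
    a = L / M
    r_plus = M + math.sqrt(M * M - a * a)
    return a / (r_plus**2 + a * a)

# A Penrose-type step: delta_L < 0 and omega_H * delta_L < delta_M < 0.
M, L = 1.0, 0.8
delta_L = -0.01
delta_M = 0.5 * horizon_angular_velocity(M, L) * delta_L
m_irr_before = irreducible_mass(M, L)
m_irr_after = irreducible_mass(M + delta_M, L + delta_L)
```

In this example the total mass decreases while the irreducible mass, and hence the horizon area, grows.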
dU = dM = T dS − ωdL , (6.75)
where ωdL denotes the mechanical work done on a rotating macroscopic body.
Our experience with the thermodynamics of non-gravitating systems suggests that the
entropy is an extensive quantity and thus proportional to the volume, S ∝ V . We now offer
an argument that shows that the entropy S of a black hole is proportional to its area A. We
introduce the “rationalised area” α = A/4π = 2M r+ , cf. (6.60), or
$$\alpha = 2M^2 + 2\sqrt{M^4 - L^2}. \tag{6.76}$$
The parameters describing a Kerr black hole are its mass M and its angular momentum L and
thus α = α(M, L). We form the differential dα and find after some algebra (problem 25.??)
$$\frac{\sqrt{M^2-a^2}}{2\alpha}\,d\alpha = dM + \frac{a}{\alpha}\,dL. \tag{6.77}$$
Using now Eq. (6.60) and (6.68), we can rewrite the RHS as
$$\frac{\sqrt{M^2-a^2}}{2\alpha}\,d\alpha = dM + \omega_H\,dL. \tag{6.78}$$
Thus the first law of black hole thermodynamics predicts the correct angular velocity ωH of
a Kerr black hole. Including the term Φdq representing the work done by adding the charge
dq to a black hole, the area law of a charged black hole together with the first law of BH
thermodynamics reproduces the correct surface potential Φ of a charged black hole.
The factor in front of dα is positive, as its interpretation as temperature requires. We
identify
$$T\,dS = \frac{\sqrt{M^2-a^2}}{2\alpha}\,d\alpha \tag{6.79}$$
and thus S = f(A). The validity of the area theorem requires that f is a linear function;
the proportionality coefficient between S and A can only be determined by calculating the
temperature of a black hole. Hawking showed in 1974 that a black hole in vacuum emits
black-body radiation (“Hawking radiation”) with temperature
$$T = \frac{2\sqrt{M^2-a^2}}{A} \tag{6.80}$$
and thus
$$S = \frac{kc^3}{4\hbar G}\,A = \frac{A}{4L_{\rm Pl}^2}. \tag{6.81}$$
The entropy of a black hole is not extensive but is proportional to its surface. It is large,
because its basic unit of entropy, 4L_Pl², is so tiny. The presence of ħ in the first formula, where
we have inserted the natural constants, signals that the black hole entropy is a quantum
property.
The heat capacity CV of a Schwarzschild black hole follows with U = M = 1/(8πT ) from
the definition
$$C_V = \frac{\partial U}{\partial T} = -\frac{1}{8\pi T^2} < 0. \tag{6.82}$$
As is typical for self-gravitating systems, its heat capacity is negative. Thus a black hole
surrounded by a cooler medium emits radiation, heats up the environment and becomes
hotter.
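To get a feeling for the numbers, the relation M = 1/(8πT) can be converted to SI units. The sketch assumes the standard restoration of constants, T = ħc³/(8πG M k_B), which is not spelled out in the text; the function name and the rounded constants are illustrative.

```python
import math

HBAR = 1.054571817e-34   # J s
C = 299_792_458.0        # m / s
G = 6.67430e-11          # m^3 kg^-1 s^-2
K_B = 1.380649e-23       # J / K
M_SUN = 1.98847e30       # kg

def hawking_temperature_si(mass_kg):
    """T = hbar c^3 / (8 pi G M k_B) for a Schwarzschild black hole, in kelvin."""
    return HBAR * C**3 / (8.0 * math.pi * G * mass_kg * K_B)

# A solar-mass black hole has a temperature of order 1e-7 K; doubling the mass
# halves the temperature, which is the negative heat capacity in action.
t_sun = hawking_temperature_si(M_SUN)
t_two_sun = hawking_temperature_si(2.0 * M_SUN)
```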
Hawking radiation Hawking showed in 1974 that a black hole in vacuum emits black-body
radiation (“Hawking radiation”) with temperature
$$T = \frac{2\sqrt{M^2-a^2}}{A} \tag{6.83}$$
and thus
$$S = \frac{kc^3}{4\hbar G}\,A = \frac{A}{4L_{\rm Pl}^2}. \tag{6.84}$$
A black hole surrounded by a cooler medium emits radiation and heats up the environment.
The entropy of a black hole is large, because its basic unit of entropy, 4L2Pl , is so tiny.
We can understand this result considering an observer in the Schwarzschild metric. The
acceleration of a stationary observer,
$$a \equiv (-a\cdot a)^{1/2} = \left(1-\frac{2M}{r}\right)^{-1/2}\frac{M}{r^2} = \left(1-\frac{R_s}{r}\right)^{-1/2}\frac{R_s/2}{r^2}, \tag{6.85}$$
diverges approaching the horizon, r → Rs = 2M. The acceleration a close to the horizon, i.e.
for r₁ − Rs ≪ Rs, is thus much larger than the curvature ∝ 1/Rs. We can therefore use the
approximation of an accelerated observer in flat space, who sees according to the Unruh
effect a thermal spectrum with temperature T₁ = a₁/(2π) at r₁. Assume now that the observer
moves from r₁ to r₂ > r₁. Then the spectrum is redshifted by V₁/V₂ with V_i = √(1 − Rs/r_i).
For r₂ → ∞, it is V₂ → 1 and thus T₂ → V₁T₁. Approaching also the horizon, the temperature
becomes
$$T = \lim_{r_1\to R_s} V_1T_1 = \lim_{r_1\to R_s}\sqrt{1-R_s/r_1}\;\frac{1}{2\pi}\,\frac{R_s/2}{r_1^2\sqrt{1-R_s/r_1}} = \frac{1}{4\pi R_s} = \frac{1}{8\pi M}. \tag{6.86}$$
7 Classical field theory
The boundary term vanishes, since we require that the variation is zero on the boundary ∂Ω.
Thus the Lagrange equations for the fields φa are
$$\frac{\partial\mathcal L}{\partial\phi_a} - \partial_\mu\frac{\partial\mathcal L}{\partial(\partial_\mu\phi_a)} = 0. \tag{7.4}$$
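As a short worked example of Eq. (7.4), consider a single real scalar field with the standard free Lagrange density. The choice of this particular Lagrangian is an assumption made here for illustration, using the signature (+,−,−,−) of these notes.

```latex
\begin{align}
\mathcal{L} &= \tfrac{1}{2}\,\partial_\mu\phi\,\partial^\mu\phi - \tfrac{1}{2}\,m^2\phi^2 , \\
\frac{\partial\mathcal{L}}{\partial\phi} &= -m^2\phi , \qquad
\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)} = \partial^\mu\phi , \\
0 &= -m^2\phi - \partial_\mu\partial^\mu\phi
\quad\Longleftrightarrow\quad
(\Box + m^2)\,\phi = 0 .
\end{align}
```

The Lagrange equation thus reproduces the Klein-Gordon equation.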
7.2 Noether’s theorem and conservation laws
∂µ j µ = 0 . (7.5)
Then
$$\frac{d}{dt}\int_V d^3x\; j^0 = -\int_{\partial V} d\mathbf{S}\cdot\mathbf{j} \tag{7.6}$$
and
$$Q = \int_V d^3x\; j^0 \tag{7.7}$$
is a globally conserved quantity, if there is no outgoing flux j through the boundary ∂V . To
show that Q is a Lorentz invariant quantity, we have to rewrite Eq. (7.7) as a tensor equation.
Consider
$$Q(t=0) = \int d^4x\; j^\mu(x)\,\partial_\mu\vartheta(n\cdot x) \tag{7.8}$$
with ϑ the step function and n a unit vector in time direction, n · x = x⁰ = t. Then
$$Q(t=0) = \int d^4x\; j^0(x)\,\partial_0\vartheta(x^0) = \int d^4x\; j^0(x)\,\delta(x^0) = \int d^3x\; j^0(x) \tag{7.9}$$
and hence Eqs. (7.7) and (7.8) are equivalent. Since one of them is a tensor equation, Q is
Lorentz invariant.
In the same way, we can construct in Minkowski space globally conserved quantities Q for
conserved tensors: If for instance ∂µ T µν = 0, then
$$P^\nu = \int d^3x\; T^{0\nu} \tag{7.10}$$
Symmetries and Noether’s theorem Noether’s theorem gives a formal connection between
global, continuous symmetries of a physical system and the resulting conservation laws. Such
symmetries can be divided into space-time and internal symmetries. We derive this theorem
in two steps, considering in the first one only internal symmetries.
We assume that our collection of fields φa has a continuous symmetry group. Thus we can
consider an infinitesimal change δφa that keeps L (φa , ∂µ φa ) invariant,
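$$0 = \delta\mathcal L = \frac{\delta\mathcal L}{\delta\phi_a}\,\delta_0\phi_a + \frac{\delta\mathcal L}{\delta\partial_\mu\phi_a}\,\delta_0\partial_\mu\phi_a. \tag{7.11}$$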
Here, we used the notation δ0 to stress that we exclude variations due to the change of
spacetime point. Now we exchange δ∂µ against ∂µ δ in the second term and use then the
Lagrange equations, δL /δφa = ∂µ (δL /δ∂µ φa ), in the first term. Then we can combine the
two terms using the Leibniz rule,
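$$0 = \delta\mathcal L = \partial_\mu\left(\frac{\delta\mathcal L}{\delta\partial_\mu\phi_a}\right)\delta_0\phi_a + \frac{\delta\mathcal L}{\delta\partial_\mu\phi_a}\,\partial_\mu\delta_0\phi_a = \partial_\mu\left(\frac{\delta\mathcal L}{\delta\partial_\mu\phi_a}\,\delta_0\phi_a\right). \tag{7.12}$$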
Hence the invariance of L under the change δ0 φa implies the existence of a conserved current,
∂µ j µ = 0, with
$$j^\mu = \frac{\delta\mathcal L}{\delta\partial_\mu\phi_a}\,\delta_0\phi_a. \tag{7.13}$$
If the transformation δ₀φ_a leads to a change in ℒ that is a total four-divergence, δ₀ℒ = ∂_µK^µ,
and boundary terms can be dropped, then the equations of motion are still invariant. The
conserved current is changed to j^µ = (δℒ/δ∂_µφ_a) δ₀φ_a − K^µ.
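A standard illustration of the current (7.13) is a complex scalar field with a global phase symmetry. The Lagrangian below is an example chosen here, not taken from the text; φ and φ* are treated as independent fields and α is an infinitesimal constant parameter.

```latex
\begin{align}
\mathcal{L} &= \partial_\mu\phi^*\,\partial^\mu\phi - m^2\phi^*\phi , \\
\delta_0\phi &= i\alpha\,\phi , \qquad \delta_0\phi^* = -i\alpha\,\phi^* , \\
j^\mu &= \frac{\delta\mathcal{L}}{\delta\partial_\mu\phi}\,\delta_0\phi
       + \frac{\delta\mathcal{L}}{\delta\partial_\mu\phi^*}\,\delta_0\phi^*
       = i\alpha\left(\phi\,\partial^\mu\phi^* - \phi^*\,\partial^\mu\phi\right) , \\
\partial_\mu j^\mu &= i\alpha\left(\phi\,\Box\phi^* - \phi^*\,\Box\phi\right) = 0 ,
\end{align}
```

where the last step uses the equations of motion (□ + m²)φ = 0 and (□ + m²)φ* = 0.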
In the second step, we consider in addition a variation of the coordinates, x′µ = xµ + δxµ .
Such a variation implies a change of the fields
and thus also of the Lagrange density. Note that we compare now the field at different points.
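In order to be able to recycle our old result, we split the total variation δφ_a(x^µ) as follows
$$\delta\phi_a(x^\mu) = \phi'_a(x'^\mu) - \phi_a(x^\mu) = \phi'_a(x^\mu + \delta x^\mu) - \phi_a(x^\mu) \tag{7.15}$$
$$= \phi'_a(x^\mu) + \delta x^\mu\partial_\mu\phi'_a(x^\mu) - \phi_a(x^\mu) = \delta_0\phi_a(x^\mu) + \delta x^\mu\partial_\mu\phi'_a(x^\mu) \tag{7.16}$$
$$= \delta_0\phi_a(x^\mu) + \delta x^\mu\partial_\mu\phi_a(x^\mu). \tag{7.17}$$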
Here we made in the second line first a Taylor expansion, and introduced then the local
variation δ0 φa (xµ ) = φ′a (xµ ) − φa (xµ ) which we calculated previously. Since δxµ is already
a linear term, we could replace in the third line φ′a (xµ ) ≃ φa (xµ ), neglecting thereby only a
quadratic term.
We consider now the variation of the action S implied by the coordinate change x̃µ =
xµ + δxµ . Such a variation implies not only a variation of L but also of the integration
measure d⁴x,
$$\delta S = \int_\Omega\left[d^4x\,(\delta\mathcal L) + (\delta d^4x)\,\mathcal L\right]. \tag{7.18}$$
The two integration measures d4 x and d4 x̃ are connected by the Jacobian, i.e. the determinant
of the transformation matrix
$$a^\mu{}_\nu = \frac{\partial\tilde x^\mu}{\partial x^\nu}. \tag{7.19}$$
∂xν
Using again that the variation is infinitesimal, we find
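$$J = \left|\frac{\partial\tilde x^\mu}{\partial x^\nu}\right| = \begin{vmatrix} 1+\frac{\partial\delta x^0}{\partial x^0} & \frac{\partial\delta x^0}{\partial x^1} & \cdots \\ \frac{\partial\delta x^1}{\partial x^0} & 1+\frac{\partial\delta x^1}{\partial x^1} & \cdots \\ \vdots & \vdots & \ddots \end{vmatrix} = 1 + \frac{\partial\delta x^\mu}{\partial x^\mu}. \tag{7.20}$$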
Inserting first this result and using then Eq. (7.17) applied to L gives
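$$\delta S = \int_\Omega d^4x\left(\delta\mathcal L + \mathcal L\,\frac{\partial\delta x^\mu}{\partial x^\mu}\right) = \int_\Omega d^4x\left(\delta_0\mathcal L + \frac{\partial\mathcal L}{\partial x^\mu}\,\delta x^\mu + \mathcal L\,\frac{\partial\delta x^\mu}{\partial x^\mu}\right). \tag{7.21}$$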
We combine the last two terms using the Leibniz rule, and insert the known variation δ0 L
at the same point from Eq. (7.12), obtaining
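$$\delta S = \int_\Omega d^4x\;\frac{\partial}{\partial x_\mu}\left[\frac{\partial\mathcal L}{\partial(\partial^\mu\phi_a)}\,\delta_0\phi_a + \mathcal L\,\delta x_\mu\right]. \tag{7.22}$$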
If the system is invariant under these transformations, the variation of the action is zero,
δS = 0, and the square bracket represents a conserved current j µ . As last step, we change
from the local variation δ0 to the full variation δ using Eq. (7.17), obtaining as final expression
for the Noether current
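$$j_\mu = \frac{\partial\mathcal L}{\partial(\partial^\mu\phi_a)}\,\delta\phi_a - \left[\frac{\partial\mathcal L}{\partial(\partial^\mu\phi_a)}\,\frac{\partial\phi_a}{\partial x^\nu} - \eta_{\mu\nu}\mathcal L\right]\delta x^\nu. \tag{7.23}$$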
Translations Invariance under translations x′µ = xµ + εµ means φ′a (x′ ) = φa (x) or δφa = 0.
Hence we obtain a conserved tensor
$$\Theta_{\mu\nu} = \frac{\partial\mathcal L}{\partial(\partial^\mu\phi_a)}\,\frac{\partial\phi_a}{\partial x^\nu} - \eta_{\mu\nu}\mathcal L \tag{7.24}$$
called the energy-momentum stress tensor or in short the stress tensor. We will see in the
next chapter that this tensor sources gravity, being thus of crucial interest for us. If the
stress tensor is derived via the Noether procedure (7.24), it is called canonical. In general,
the canonical stress tensor is not symmetric, Θ_µν ≠ Θ_νµ, although it should be symmetric as
source of gravity in Einstein’s theory. Note however that the Noether procedure does not
uniquely specify the stress tensor, because we can add any tensor ∂_λ f^{λµν} which is
antisymmetric in µ and λ: such a term drops out of the conservation law because of
∂_µ∂_λ f^{λµν} = 0. This freedom allows us to obtain always a symmetric stress tensor. We will
learn later a different method, leading directly to a symmetric energy-momentum tensor T_µν
(called the dynamical energy-momentum tensor).
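For a concrete case, one can evaluate (7.24) for a free real scalar field. The Lagrangian is again an illustrative assumption, not part of the text at this point.

```latex
\begin{align}
\mathcal{L} &= \tfrac{1}{2}\,\partial_\mu\phi\,\partial^\mu\phi - \tfrac{1}{2}\,m^2\phi^2 , \\
\Theta_{\mu\nu} &= \partial_\mu\phi\,\partial_\nu\phi - \eta_{\mu\nu}\,\mathcal{L} , \\
\Theta_{00} &= \dot\phi^2 - \mathcal{L}
 = \tfrac{1}{2}\,\dot\phi^2 + \tfrac{1}{2}\,(\nabla\phi)^2 + \tfrac{1}{2}\,m^2\phi^2 \;\geq\; 0 .
\end{align}
```

In this example the canonical stress tensor is manifestly symmetric, and its 00-component is a positive energy density.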
Stress tensor: The invention of the three-dimensional stress tensor σ_ij goes back to Pascal
and Euler. Recall that σ_ij is determined via dF_i = σ_ij dA_j as the response of a material to
the force F_i on its surface element A_j. This implies that we can view the stress tensor also as
an (anisotropic) pressure tensor. Moreover, it follows with f_i = dF_i/dV for the force density
f_j = ∂_i σ_ij as equilibrium condition (or equation of motion) of the system.
The relativistic stress tensor T_µν was introduced by Minkowski in 1908 for electrodynamics,
combining Maxwell’s stress tensor (in vacuum)
$$\sigma_{ij} = E_iE_j + B_iB_j - \frac{1}{2}\left(E^2 + B^2\right)\delta_{ij}$$
with the energy density ρ = (E² + B²)/2, the Poynting vector (or energy flux) S = E × B,
and the momentum density π into
$$T^{\mu\nu} = \begin{pmatrix}\rho & \mathbf{S} \\ \boldsymbol{\pi} & \sigma_{ij}\end{pmatrix}.$$
In a relativistic theory, the energy flux equals the momentum density. Then T^{0i} = T^{i0}, which
is sufficient to show the symmetry of the full tensor.
From the example, we know that Θ00 corresponds to the energy density ρ. Therefore p0 is
the energy, and thus pµ the four-momentum of the field. This is in line with the fact that
translations are generated by the four-momentum operator.
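Lorentz transformations Lorentz transformations, i.e. rotations and boosts, lead to a linear
change of coordinates,
$$\tilde x^\mu = x^\mu + \delta\omega^{\mu\nu}x_\nu. \tag{7.26}$$
They preserve the norm of vectors, implying that
$$x^\mu x_\mu = \tilde x^\mu\tilde x_\mu = \left(x^\mu + \delta\omega^{\mu\sigma}x_\sigma\right)\left(x_\mu + \delta\omega_{\mu\tau}x^\tau\right) \tag{7.27}$$
$$= x^\mu x_\mu + \delta\omega^{\mu\sigma}x_\sigma x_\mu + \delta\omega_{\mu\tau}x^\mu x^\tau + O(\omega^2) \tag{7.28}$$
$$= x^\mu x_\mu + \left(\delta\omega^{\mu\nu} + \delta\omega^{\nu\mu}\right)x_\mu x_\nu. \tag{7.29}$$
Thus the matrix parameterising Lorentz transformations is antisymmetric,
$$\omega^{\mu\nu} = -\omega^{\nu\mu}, \tag{7.30}$$
and has six independent elements. For an infinitesimal transformation, the transformed fields
φ̃_a(x̃) depend linearly on¹ δω^{µν} and φ_a(x),
$$\tilde\phi_a(\tilde x) = \phi_a(x) + \frac{1}{2}\,\delta\omega_{\mu\nu}\,(I^{\mu\nu})_{ab}\,\phi_b(x). \tag{7.31}$$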
The symmetric part of (I^{µν})_ab does not contribute, because of the antisymmetry of the δω^{µν}.
Hence we can choose also the (I^{µν})_ab as antisymmetric and thus there exist six generators
(I^{µν})_ab corresponding to the three boosts and the three rotations. The explicit form of the
generators I_ab (the “matrix representation of the Lorentz group” for spin s) depends on the
spin of the considered field, as the known different transformation properties of scalar (s = 0),
spinor (s = 1/2) and vector (s = 1) fields under rotations show.
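That an antisymmetric δω^{µν} leaves the norm unchanged to first order can be checked with a few lines of code. This is a sketch using plain Python lists; the parameterisation by a boost vector and a rotation vector, and all function names, are illustrative assumptions.

```python
ETA = (1.0, -1.0, -1.0, -1.0)  # diagonal of the Minkowski metric, signature (+,-,-,-)

def minkowski_norm(x):
    """x^mu x_mu for a four-vector with upper components x."""
    return sum(ETA[m] * x[m] * x[m] for m in range(4))

def antisymmetric_omega(boost, rotation, eps):
    """Build omega^{mu nu} = -omega^{nu mu} from three boost and three rotation parameters."""
    w = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        w[0][i + 1] = eps * boost[i]
        w[i + 1][0] = -eps * boost[i]
    for i, j, k in ((1, 2, 2), (2, 3, 0), (3, 1, 1)):
        w[i][j] = eps * rotation[k]
        w[j][i] = -eps * rotation[k]
    return w

def infinitesimal_lorentz(x, omega):
    """x~^mu = x^mu + omega^{mu nu} x_nu, with x_nu obtained by lowering the index."""
    x_lower = [ETA[n] * x[n] for n in range(4)]
    return [x[m] + sum(omega[m][n] * x_lower[n] for n in range(4)) for m in range(4)]

x = [2.0, 0.3, -1.1, 0.7]
omega = antisymmetric_omega((0.3, -0.2, 0.5), (0.1, 0.4, -0.6), eps=1e-6)
change = minkowski_norm(infinitesimal_lorentz(x, omega)) - minkowski_norm(x)
```

The remaining change in the norm is of second order in the small parameter, while a symmetric δω^{µν} would already change it at first order.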
We evaluate now the Noether current (7.23), inserting first the definition of the stress
tensor,
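$$j_\mu = \frac{\partial\mathcal L}{\partial(\partial^\mu\phi_a)}\,\delta\phi_a - \Theta_{\mu\nu}\,\delta x^\nu. \tag{7.32}$$
Next we use δx^µ = δω^{µν}x_ν and δφ_a = ½ δω_{µν}(I^{µν})_ab φ_b(x) as well as the antisymmetry of δω^{µν},
to obtain
$$j_\mu = \frac{\partial\mathcal L}{\partial(\partial^\mu\phi_a)}\,\frac{1}{2}\,\delta\omega^{\nu\lambda}(I_{\nu\lambda})_{ab}\,\phi_b(x) - \underbrace{\Theta_{\mu\nu}\,\delta\omega^{\nu\lambda}x_\lambda}_{\frac{1}{2}\delta\omega^{\nu\lambda}\left(\Theta_{\mu\nu}x_\lambda - \Theta_{\mu\lambda}x_\nu\right)} = \frac{1}{2}\,\delta\omega^{\nu\lambda}M_{\mu\nu\lambda} \tag{7.33}$$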
Hence for a scalar field, the canonical stress tensor is symmetric, Θνλ = Θλν , and agrees with
the dynamical stress tensor, Θµν = T µν . The corresponding Noether charges are
$$M^{\nu\mu} = \int d^3x\; M^{0\nu\mu} = \int d^3x\left(x^\nu\Theta^{0\mu} - x^\mu\Theta^{0\nu}\right) \equiv L^{\mu\nu}. \tag{7.36}$$
¹ We add a factor 1/2, because in the summation two terms contribute for each transformation parameter.
7.3 Perfect fluid
Recalling Eq. (7.25), we see that these charges agree with the relativistic orbital angular mo-
mentum tensor Lµν . Since Lµν is antisymmetric, Eq. (7.36) defines six conserved quantities,
one for each of the generators of the Lorentz group. Choosing spatial indices, Lij agrees with
the non-relativistic orbital angular momentum, while the conservation of Li0 leads to the
relativistic version of the constant center-of-mass motion.
For a field with non-zero spin, the last term in Eq. (7.34) does not vanish. It represents
therefore the intrinsic or spin angular momentum density S µν of the field. In this case, only
the total angular momentum M µν is conserved, not however the orbital and spin angular
momentum individually. Moreover, the canonical stress tensor is not symmetric.
T αβ = ρuα uβ . (7.38)
Writing uα = (γ, γv), we can identify T 00 = γ 2 ρ with the energy density, T 0i = γ 2 ρv i with
the energy/momentum density flux in direction i, and T ij = γ 2 ρv i v j with the flow of the
momentum density component i through the area with normal direction j.
Let us now check the consequences of ∂α T αβ = 0, assuming for simplicity the non-relativistic
limit. We look first at the α = 0 component,
∂t ρ + ∇ · (ρu) = 0 . (7.39)
This corresponds to the mass continuity equation and, because of E = m for dust, at the
same time to energy conservation. Next we consider the α = 1, 2, 3 = i components,
7 Classical field theory
or
u∂t ρ + ρ∂t u + u∇ · (ρu) + (u · ∇)uρ = 0 . (7.41)
Taking the continuity equation into account, we obtain the Euler equation for a force-free
fluid without viscosity,
ρ∂t u + (u · ∇)uρ = 0 . (7.42)
Hence, as announced, the condition ∂µ T µν = f ν gives the equations of motion.
Finally we include the effect of pressure. We know that the pressure tensor coincides with
the σij part of the stress tensor. Moreover, for a perfect fluid in its rest-frame, the pressure is
isotropic Pij = P δij . This corresponds to Pij = −P ηij and adds −P to T 00 . Compensating
for this gives
T αβ = (ρ + P )uα uβ − P η αβ . (7.43)
\[ i\partial_t\psi = \frac{p^2}{2m}\,\psi = -\frac{\Delta}{2m}\,\psi , \tag{7.44} \]
can be “derived” using the replacements
from the non-relativistic energy-momentum relation E = p2 /(2m), we obtain from the rela-
tivistic E 2 = m2 + p2
Translation invariance implies that we can choose the solutions as eigenstates of the momentum
operator, p̂φ = pφ. These states are plane waves with positive and negative energies
$\pm\sqrt{k^2 + m^2}$. Interpreting the Klein–Gordon equation as a relativistic wave equation for a
single particle cannot therefore be fully satisfactory, since the energy of its solutions is not
bounded from below.
How do we guess the correct Lagrange density L ? The correspondence q̇ ↔ ∂µ φ means
that the kinetic field energy is quadratic in the field derivatives. In contrast, the mass term
m2 is potential energy, V (φ) ∝ m2 . The relativistic energy-momentum relation E 2 = m2 + p2
suggests that V (φ) is also quadratic, with the same numerical coefficient as the kinetic energy.
Therefore we try as Lagrange density
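\[ \mathcal{L} = \frac{1}{2}\eta_{\mu\nu}(\partial^\mu\phi)(\partial^\nu\phi) - V(\phi) = \frac{1}{2}\eta_{\mu\nu}(\partial^\mu\phi)(\partial^\nu\phi) - \frac{1}{2}m^2\phi^2 \equiv \frac{1}{2}(\partial_\mu\phi)^2 - \frac{1}{2}m^2\phi^2 , \tag{7.47} \]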
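where the factor 1/2 is convention: The kinetic energy of a canonically normalised real field
carries the prefactor 1/2. With
\[ \frac{\partial}{\partial(\partial_\alpha\phi)}\left(\eta^{\mu\nu}\partial_\mu\phi\,\partial_\nu\phi\right) = \eta^{\mu\nu}\left(\delta^\alpha_\mu\,\partial_\nu\phi + \delta^\alpha_\nu\,\partial_\mu\phi\right) = \eta^{\alpha\nu}\partial_\nu\phi + \eta^{\mu\alpha}\partial_\mu\phi = 2\partial^\alpha\phi , \tag{7.48} \]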
7.4 Klein-Gordon field
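Thus the Lagrange density (7.47) leads to the Klein-Gordon equation. We can check if we
have correctly chosen the signs by calculating the stress tensor,
\[ T^{\mu\nu} = \frac{\partial\mathcal{L}}{\partial\phi_{,\mu}}\,\phi^{,\nu} - \eta^{\mu\nu}\mathcal{L} = \phi^{,\mu}\phi^{,\nu} - \eta^{\mu\nu}\mathcal{L} . \tag{7.50} \]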
Complex field and internal symmetries If two fields exist with the same mass m, one might
wish to combine the two real fields into one complex field,
\[ \phi = \frac{1}{\sqrt{2}}\,(\phi_1 + i\phi_2) . \tag{7.52} \]
Then one can interpret the Hermitian conjugate fields φ and φ† as describing a particle and
its antiparticle.
The resulting Lagrangian density is just the sum,
L = ∂µ φ† ∂ µ φ − m2 φ† φ (7.53)
The presence of two fields sharing some quantum numbers (here the mass) opens up the
possibility of internal symmetries. The Lagrangian (7.53) is invariant under global phase
transformations, φ → eiϑ φ and φ† → e−iϑ φ† . With δφ = iφ and δφ† = −iφ† , the conserved
current follows as
\[ j^\mu = i\left[\phi^\dagger\,\partial^\mu\phi - (\partial^\mu\phi^\dagger)\,\phi\right] . \tag{7.54} \]
The conserved charge $Q = \int d^3x\, j^0$ can also be negative and thus we cannot interpret $j^0$
as the probability density to observe a φ particle. Instead, we should associate Q with a
conserved additive quantum number as, for example, the electric charge.
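That the charge density can have either sign is easily illustrated with plane waves. The sketch below works in 1+1 dimensions with N = 1 and evaluates the time component of Eq. (7.54) by a central difference; the function names and the numerical values are illustrative assumptions only.

```python
import cmath

def phi(t, x, omega, k, sign):
    """Plane wave exp(-i*sign*(omega*t - k*x)); sign = +1 or -1 selects the frequency sign."""
    return cmath.exp(-1j * sign * (omega * t - k * x))

def charge_density(t, x, omega, k, sign, h=1e-6):
    """j^0 = i [phi^* d_t phi - (d_t phi^*) phi], time derivative by central difference."""
    f = phi(t, x, omega, k, sign)
    dt_f = (phi(t + h, x, omega, k, sign) - phi(t - h, x, omega, k, sign)) / (2 * h)
    j0 = 1j * (f.conjugate() * dt_f - dt_f.conjugate() * f)
    return j0.real

m, k = 1.0, 0.75
omega = (k * k + m * m) ** 0.5
j0_positive = charge_density(0.3, 0.1, omega, k, +1)
j0_negative = charge_density(0.3, 0.1, omega, k, -1)
print(j0_positive, j0_negative)
```

Analytically one finds j⁰ = ±2ω|N|² for the two frequency signs, so the two printed values should be close to +2ω and −2ω.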
Next we calculate the stress tensor,
φ = N e−ikx . (7.56)
T 00 = 2|N |2 k0 k0 . (7.57)
Relativistic one-particle states are usually normalised as N −2 = 2ωV . Thence the energy
density T 00 = ω/V agrees with the expectation for one particle with energy ω per volume V .
The other components are necessarily
T µν = 2|N |2 kµ kν . (7.58)
7.5 Maxwell field
Finally, we rewrite in the first term duα = duα /ds ds, in the second and third dxα = uα ds
and exchange the summation indices µ and ν in the third term. Then
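\[ \delta S = \int_a^b \left[ m\,\frac{du_\mu}{ds} - q\left(\frac{\partial A_\nu}{\partial x^\mu} - \frac{\partial A_\mu}{\partial x^\nu}\right)u^\nu \right]\delta x^\mu\, ds = 0 . \tag{7.67} \]
For arbitrary variations, the bracket has to be zero and we obtain as equation of motion
\[ m\,\frac{du_\mu}{ds} = f_\mu = q\left(\frac{\partial A_\nu}{\partial x^\mu} - \frac{\partial A_\mu}{\partial x^\nu}\right)u^\nu \equiv q\,F_{\mu\nu}\,u^\nu . \tag{7.68} \]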
This is the relativistic form of the Lorentz force.
\[ B^1 = \partial_2 A^3 - \partial_3 A^2 = \partial_3 A_2 - \partial_2 A_3 = F_{32} \]
Current conservation and gauge invariance We take the divergence of Maxwell’s equation
(7.69),
∂ν ∂µ F µν = ∂ν j ν . (7.73)
Since ∂ν ∂µ is symmetric and F µν antisymmetric, the contraction of the two has to be
zero,
∂ν ∂µ F µν = −∂ν ∂µ F νµ = −∂µ ∂ν F νµ = −∂ν ∂µ F µν . (7.74)
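\[ A_\mu \to A'_\mu = A_\mu + \partial_\mu\chi . \tag{7.76} \]
\[ F'_{\mu\nu} = \partial_\mu A'_\nu - \partial_\nu A'_\mu = F_{\mu\nu} + \partial_\mu\partial_\nu\chi - \partial_\nu\partial_\mu\chi = F_{\mu\nu} . \tag{7.77} \]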
Thus the gauge invariance of F is again closely connected to the fact that it is an antisym-
metric tensor, formed by derivatives of A.
Differential forms:
A surface in R3 can be described at any point either by its two tangent vectors e1 and e2 or
by the normal n. They are connected by a cross product, n = e1 × e2 , or in index notation,
In four dimensions, the ε tensor defines a map between 1-3 and 2-2 tensors. Since ε is antisymmetric,
the symmetric part of tensors would be lost; hence the map is suited for antisymmetric
tensors.
Antisymmetric tensors of rank n can be seen also as differential forms: Functions are forms of
order n = 0; differentials of functions are an example of order n = 1,
\[ df = \frac{\partial f}{\partial x^i}\,dx^i . \tag{7.79} \]
Thus the dxi form a basis, and one can write in general
A = Ai dxi . (7.80)
Thus we have F = dA. Moreover, it follows d²ω = 0 for all forms. Hence a gauge transformation
A → A + dχ leaves F invariant, F ′ = d(A + dχ) = F .
Wave equation The Maxwell equation (7.69) consists of four equations for the six compo-
nents of F . Thus we need either a second equation, i.e. Eq. (7.70), or we should transform
Eq. (7.69) into an equation for the four components of the four-potential A. In this case,
Eq. (7.70) is automatically satisfied. Let us do the latter and insert the definition of A,
\[ \partial_\mu F^{\mu\nu} = \partial_\mu(\partial^\mu A^\nu - \partial^\nu A^\mu) = \Box A^\nu - \partial_\mu\partial^\nu A^\mu = j^\nu . \tag{7.83} \]
\[ \Box A^\mu = j^\mu . \tag{7.84} \]
Inserting then a plane wave $A^\mu \propto \varepsilon^\mu e^{ikx}$ into the free wave equation, $\Box A^\nu = 0$, we find
that k is a light-like vector, while the Lorenz gauge condition ∂µ Aµ = 0 results in εµ kµ = 0.
Imposing the Lorenz gauge, we can still add to the potential Aµ any function ∂ µ χ satisfying
$\Box\chi = 0$. We can use this freedom to set A0 = 0, obtaining thereby εµ kµ = −ε · k = 0.
Thus the photon propagates with the speed of light, is transversely polarised and has two
polarisation states as expected for a massless particle.
Let us discuss now why gauge invariance is necessary for a massless spin-1 particle. First
we consider a linearly polarised photon with polarisation vectors $\varepsilon^{(r)}_\mu$ lying in the plane
perpendicular to its momentum vector k. If we perform a Lorentz boost on $\varepsilon^{(1)}_\mu$, we will find
where the coefficients ai depend on the direction β of the boost. Thus, in general the
polarisation vector will no longer be perpendicular to k. Similarly, if we perform a gauge
transformation
Aµ (x) → A′µ (x) = Aµ (x) − ∂µ Λ(x) (7.86)
with
Λ(x) = −iλ exp(−ikx) + h.c. , (7.87)
then
A′µ (x) = (εµ + λkµ ) exp(−ikx) + h.c. = ε′µ exp(−ikx) + h.c. (7.88)
Choosing, for example, a photon propagating in z direction, kµ = (ω, 0, 0, ω), we see that
the gauge transformation does not affect the transverse components ε1 and ε2 . Thus only
the components of εµ transverse to k can have physical significance. On the other hand, the
time-like and longitudinal components depend on the arbitrary parameter λ and are therefore
unphysical. In particular, they can be set to zero by a gauge transformation. First, ε′µ k′µ = 0
implies (again for a photon propagating in z direction) ε′0 = −ε′3 . From ε′3 = ε3 + λω, we see
that λ = −ε3 /ω sets ε′3 = −ε′0 = 0. Thus the transformation law (7.85) for the polarisation
vector of a massless spin-1 particle requires the existence of the gauge symmetry (7.86). The
gauge symmetry in turn implies that the massless spin-1 particle couples only to conserved
currents.
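The removal of the time-like and longitudinal components can be made concrete with a few lines of code. The sketch below uses contravariant components throughout and the signature (+,−,−,−), so that the shift reads ε'³ = ε³ + λω as in the text and transversality corresponds to ε⁰ = ε³; the relative sign between ε⁰ and ε³ depends on this index placement, which is an assumption of the example. The numerical values are illustrative.

```python
ETA = [1.0, -1.0, -1.0, -1.0]   # diagonal of the Minkowski metric (+,-,-,-)

def dot(a, b):
    """Minkowski product a^mu b_mu of two contravariant four-vectors."""
    return sum(g * x * y for g, x, y in zip(ETA, a, b))

def gauge_transform(eps, k, lam):
    """eps'^mu = eps^mu + lam * k^mu, cf. Eq. (7.88)."""
    return [e + lam * kk for e, kk in zip(eps, k)]

omega = 2.0
k = [omega, 0.0, 0.0, omega]      # photon propagating in z direction
eps = [0.7, 0.3, -0.4, 0.7]       # transverse to k in the four-dimensional sense
lam = -eps[3] / omega             # the choice of lambda made in the text
eps_new = gauge_transform(eps, k, lam)
print(dot(k, k), dot(eps, k), eps_new)
```

The transverse components ε¹ and ε² are untouched, while the time-like and longitudinal components are set to zero by the transformation.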
∂µ F µν = 0 . (7.89)
In order to find L , we multiply by a variation δAν that vanishes on the boundary ∂Ω. Then
we integrate over Ω = V × [ta : tb ], and perform a partial integration,
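\[ \int_\Omega d^4x\,\partial_\mu F^{\mu\nu}\,\delta A_\nu = -\int_\Omega d^4x\,F^{\mu\nu}\,\delta(\partial_\mu A_\nu) = 0 . \tag{7.90} \]
and thus
\[ F^{\mu\nu}\,\delta(\partial_\mu A_\nu) = \frac{1}{2}\,F^{\mu\nu}\,\delta F_{\mu\nu} . \tag{7.92} \]
Applying the product rule, we obtain as final result
\[ -\frac{1}{4}\,\delta\int_\Omega d^4x\,F_{\mu\nu}F^{\mu\nu} = 0 \tag{7.93} \]
and
\[ \mathcal{L} = -\frac{1}{4}\,F_{\mu\nu}F^{\mu\nu} . \tag{7.94} \]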
Note that we expressed L through F , but L should be viewed nevertheless as function of A:
We are varying the action with respect to Aµ , giving us a second-order (wave) equation.
This is in accordance with the fact that Aµ determines the interaction (7.60) with charged
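particles.
\[ \Theta_\mu{}^\nu = \frac{\partial A_\sigma}{\partial x^\mu}\,\frac{\partial\mathcal{L}}{\partial(\partial A_\sigma/\partial x^\nu)} - \delta_\mu^\nu\,\mathcal{L} . \tag{7.95} \]
Since L depends only on the derivatives Aµ,ν , we can use the following short-cut: We know
already that
\[ \delta\mathcal{L} = -\frac{1}{4}\,\delta(F_{\mu\nu}F^{\mu\nu}) = F^{\mu\nu}\,\delta(\partial_\nu A_\mu) . \tag{7.96} \]
Thus
\[ \frac{\partial\mathcal{L}}{\partial(\partial A_\sigma/\partial x^\nu)} = F^{\sigma\nu} = -F^{\nu\sigma} \tag{7.97} \]
and
\[ \Theta_\mu{}^\nu = -\frac{\partial A_\sigma}{\partial x^\mu}\,F^{\nu\sigma} + \frac{1}{4}\,\delta_\mu^\nu\,F_{\sigma\tau}F^{\sigma\tau} . \tag{7.98} \]
Raising the index µ and rearranging σ, we have
\[ \Theta^{\mu\nu} = -\frac{\partial A^\sigma}{\partial x_\mu}\,F^\nu{}_\sigma + \frac{1}{4}\,\eta^{\mu\nu}\,F_{\sigma\tau}F^{\sigma\tau} . \tag{7.99} \]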
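This result is neither gauge invariant (it contains A) nor symmetric. To symmetrize it, we
should add
\[ \frac{\partial A^\mu}{\partial x_\sigma}\,F^\nu{}_\sigma = \frac{\partial}{\partial x_\sigma}\left(A^\mu F^\nu{}_\sigma\right) . \tag{7.100} \]
The last step is possible for a free electromagnetic field, ∂σ F νσ = 0, and shows that we are
allowed to add the LHS. Then the two terms combine to F , and we get
\[ \Theta^{\mu\nu} = -F^{\mu\sigma}F^\nu{}_\sigma + \frac{1}{4}\,\eta^{\mu\nu}\,F_{\sigma\tau}F^{\sigma\tau} . \tag{7.101} \]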
In this form, the stress tensor is symmetric and gauge invariant. We can thus identify the
expression (7.101) with the dynamical stress tensor, Θµν = T µν . Note that its trace is zero,
T µµ = 0.
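The properties of the stress tensor (7.101) can be verified for arbitrary field values. The sketch below assumes the component convention F^{0i} = −E^i and F^{ij} = −ε_{ijk}B^k with signature (+,−,−,−), which is a common choice but not fixed by the text above; the function names and the field values are illustrative.

```python
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
IDX = range(4)

def field_tensor(E, B):
    """F^{mu nu} with F^{0i} = -E^i and F^{ij} = -eps_{ijk} B^k (assumed convention)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [[0.0, -Ex, -Ey, -Ez],
            [Ex, 0.0, -Bz, By],
            [Ey, Bz, 0.0, -Bx],
            [Ez, -By, Bx, 0.0]]

def stress_tensor(F):
    """T^{mu nu} = -F^{mu sigma} F^nu_sigma + (1/4) eta^{mu nu} F_{sigma tau} F^{sigma tau}."""
    # Lower the second index: F^mu_sigma = F^{mu rho} eta_{rho sigma}.
    F_mixed = [[sum(F[m][r] * ETA[r][s] for r in IDX) for s in IDX] for m in IDX]
    # Lower both indices to form the invariant F_{sigma tau} F^{sigma tau}.
    F_low = [[sum(ETA[s][a] * F[a][b] * ETA[b][t] for a in IDX for b in IDX)
              for t in IDX] for s in IDX]
    inv = sum(F_low[s][t] * F[s][t] for s in IDX for t in IDX)
    return [[-sum(F[m][s] * F_mixed[n][s] for s in IDX) + 0.25 * ETA[m][n] * inv
             for n in IDX] for m in IDX]

E, B = (1.0, 2.0, 3.0), (-0.5, 0.4, 2.0)
T = stress_tensor(field_tensor(E, B))
trace = sum(ETA[m][m] * T[m][m] for m in IDX)
energy_density = 0.5 * (sum(e * e for e in E) + sum(b * b for b in B))
print(trace, T[0][0], energy_density)
```

The trace vanishes, the tensor is symmetric, and T⁰⁰ reproduces the familiar energy density (E² + B²)/2.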
8 Einstein’s field equation
Up to now, we have investigated the behaviour of test-particles and light-rays in a given
curved spacetime determined by the metric tensor gµν . The transition from point mechanics
to field theory means that the role of the mass m as the source of gravity should be taken by
the mass density ρ, or in the relativistic case, by the stress tensor Tµν . Thus we expect field
equations of the type Gµν = κTµν , where κ is proportional to Newton’s constant G and Gµν
is a function of gµν and its derivatives.
Riemann tensor via area... analogue to non-abelian field-strength tensor.. equation of geodesic
deviation
(think of the parallel transport from A first along ea , then along eb to B and then back to A
along −ea and −eb on a sphere), is obviously a tensor and contains second derivatives of the
metric. The statement $[\nabla_\alpha, \nabla_\beta]\,T^{\mu\ldots}{}_{\nu\ldots} = 0$ is coordinate independent, and can thus be used to
characterize in an invariant way whether a manifold is flat.
For the special case of a vector V α we obtain with
∇ρ V α = ∂ρ V α + Γαβρ V β (8.2)
first
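Now we subtract the two equations using that ∂ρ ∂σ = ∂σ ∂ρ and Γαβρ = Γαρβ ,
\[ [\nabla_\rho, \nabla_\sigma]V^\alpha = \left[\partial_\rho\Gamma^\alpha{}_{\beta\sigma} - \partial_\sigma\Gamma^\alpha{}_{\beta\rho} + \Gamma^\alpha{}_{\kappa\rho}\Gamma^\kappa{}_{\beta\sigma} - \Gamma^\alpha{}_{\kappa\sigma}\Gamma^\kappa{}_{\beta\rho}\right]V^\beta \equiv R^\alpha{}_{\beta\rho\sigma}\,V^\beta . \tag{8.5} \]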
8.1 Curvature and the Riemann tensor
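Its symmetry properties imply that we can construct out of the Riemann tensor only one
non-zero tensor of rank two, contracting α either with the third or fourth index,
$R^\rho{}_{\alpha\rho\beta} = -R^\rho{}_{\alpha\beta\rho}$. We define the Ricci tensor by
\[ R_{\alpha\beta} = R^\rho{}_{\alpha\rho\beta} = -R^\rho{}_{\alpha\beta\rho} = \partial_\rho\Gamma^\rho{}_{\alpha\beta} - \partial_\beta\Gamma^\rho{}_{\alpha\rho} + \Gamma^\rho{}_{\alpha\beta}\Gamma^\sigma{}_{\rho\sigma} - \Gamma^\sigma{}_{\beta\rho}\Gamma^\rho{}_{\alpha\sigma} . \tag{8.6} \]
\[ R = R_{\alpha\beta}\,g^{\alpha\beta} . \tag{8.7} \]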
Symmetry properties Inserting the definition of the Christoffel symbols and using normal
coordinates, the Riemann tensor becomes
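\[ R_{\alpha\beta\rho\sigma} = \frac{1}{2}\left\{\partial_\sigma\partial_\beta g_{\alpha\rho} + \partial_\rho\partial_\alpha g_{\beta\sigma} - \partial_\sigma\partial_\alpha g_{\beta\rho} - \partial_\rho\partial_\beta g_{\alpha\sigma}\right\} . \tag{8.8} \]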
The tensor is antisymmetric in the indices ρ ↔ σ, antisymmetric in α ↔ β and symmetric
against an exchange of the index pairs (αβ) ↔ (ρσ). Moreover, there exists one algebraic
identity,
Rαβρσ + Rασβρ + Rαρσβ = 0 . (8.9)
Since each pair of indices (αβ) and (ρσ) can take six values, we can combine the antisymmetrized
components of R[αβ][ρσ] in a symmetric six-dimensional matrix. The number of
independent components of this matrix is thus for d = 4 space-time dimensions
\[ \frac{n(n+1)}{2} - 1 = \frac{6\times 7}{2} - 1 = 20 , \]
where we accounted also for the constraint (8.9). In general, the number n of independent
components is in d space-time dimensions given by n = d²(d² − 1)/12, while the number m
of field equations is m = d(d + 1)/2. Thus we find

    d    1    2    3    4
    n    0    1    6    20
    m    -    3    6    10

This implies that a one-dimensional manifold is always flat (ask yourself why?). Moreover,
the number of independent components of the Riemann tensor is smaller than or equal to the number
of field equations for d = 2 and d = 3. Hence the Riemann tensor vanishes in empty space, if
d = 2, 3. Starting from d = 4, already an empty space can be curved and gravitational waves
exist.
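The counting above is easily reproduced. The following sketch evaluates both the closed formula and the count via the symmetric matrix of antisymmetric index pairs; that the cyclic identity (8.9) removes binomial(d, 4) components is a standard result which we assume here, and the function names are illustrative.

```python
def riemann_components(d):
    """Independent components of the Riemann tensor, n = d^2 (d^2 - 1) / 12."""
    return d * d * (d * d - 1) // 12

def field_equations(d):
    """Independent components of a symmetric rank-two tensor, m = d (d + 1) / 2."""
    return d * (d + 1) // 2

def count_via_pairs(d):
    """Symmetric matrix in the antisymmetric index pairs, minus the cyclic constraints."""
    pairs = d * (d - 1) // 2                          # values of an antisymmetric pair
    cyclic = d * (d - 1) * (d - 2) * (d - 3) // 24    # binomial(d, 4) constraints (assumed)
    return pairs * (pairs + 1) // 2 - cyclic

for d in range(1, 5):
    print(d, riemann_components(d), count_via_pairs(d), field_equations(d))
```

For d = 4 the pair count gives 6 × 7/2 − 1 = 20, in agreement with the closed formula.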
The Bianchi identity is a differential constraint,
which is again most easily checked using normal coordinates. In the context of general relativity,
the Bianchi identities are a necessary consequence of the Einstein-Hilbert action and the
requirement of general covariance.
Example: Sphere S 2 . Calculate the Ricci tensor Rij and the scalar curvature R of the two-
dimensional unit sphere S 2 .
We have already determined the non-vanishing Christoffel symbols of the sphere S 2 as Γφϑφ = Γφφϑ =
cot ϑ and Γϑφφ = − cos ϑ sin ϑ. We will show later that the Ricci tensor of a maximally symmetric
space as a sphere satisfies Rab = Kgab . Since the metric is diagonal, the non-diagonal elements of the
Ricci tensor are zero too, Rφϑ = Rϑφ = 0. We calculate with
Rab = Rc acb = ∂c Γcab − ∂b Γcac + Γcab Γdcd − Γdbc Γcad
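the ϑϑ component, obtaining
\begin{align}
R_{\vartheta\vartheta} &= 0 - \partial_\vartheta(\Gamma^\phi{}_{\vartheta\phi} + \Gamma^\vartheta{}_{\vartheta\vartheta}) + 0 - \Gamma^d{}_{\vartheta c}\Gamma^c{}_{\vartheta d} = 0 - \partial_\vartheta\cot\vartheta - \Gamma^\phi{}_{\vartheta\phi}\Gamma^\phi{}_{\vartheta\phi} \\
&= 0 - \partial_\vartheta\cot\vartheta - \cot^2\vartheta = 1 .
\end{align}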
From Rab = Kgab , we find Rϑϑ = Kgϑϑ and thus K = 1. Hence Rφφ = gφφ = sin2 ϑ.
The scalar curvature is (diagonal metric with g φφ = 1/ sin2 ϑ and g ϑϑ = 1)
\[ R = g^{ab}R_{ab} = g^{\phi\phi}R_{\phi\phi} + g^{\vartheta\vartheta}R_{\vartheta\vartheta} = \frac{1}{\sin^2\vartheta}\,\sin^2\vartheta + 1\times 1 = 2 . \]
Note that our definition of the Ricci tensor guarantees that the curvature of a sphere is also positive,
if we consider it as subspace of a four-dimensional space-time.
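The result of this example can be cross-checked numerically without using the analytic Christoffel symbols. The sketch below starts from the metric alone, obtains the connection and the Ricci tensor by central differences, and evaluates them at one point; the step size h and the helper names are assumptions of this illustration, and the finite differences limit the accuracy to several digits.

```python
import math

def metric(x):
    """Metric of the unit sphere in coordinates x = (theta, phi)."""
    return [[1.0, 0.0], [0.0, math.sin(x[0]) ** 2]]

def inverse(g):
    """Inverse of a 2x2 matrix."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]

def _diff(a, b, step):
    """Element-wise (a - b) / step for arbitrarily nested lists."""
    if isinstance(a, list):
        return [_diff(u, v, step) for u, v in zip(a, b)]
    return (a - b) / step

def partial(f, x, c, h=1e-4):
    """Central difference of the list-valued function f along coordinate c."""
    xp, xm = list(x), list(x)
    xp[c] += h
    xm[c] -= h
    return _diff(f(xp), f(xm), 2 * h)

def christoffel(x):
    """Gamma^a_{bc} = (1/2) g^{ad} (d_b g_{dc} + d_c g_{bd} - d_d g_{bc})."""
    ginv = inverse(metric(x))
    dg = [partial(metric, x, c) for c in range(2)]        # dg[c][a][b] = d_c g_{ab}
    return [[[0.5 * sum(ginv[a][d] * (dg[b][d][c] + dg[c][b][d] - dg[d][b][c])
                        for d in range(2))
              for c in range(2)] for b in range(2)] for a in range(2)]

def ricci(x):
    """R_ab = d_c G^c_ab - d_b G^c_ac + G^c_ab G^d_cd - G^d_bc G^c_ad."""
    G = christoffel(x)
    dG = [partial(christoffel, x, c) for c in range(2)]   # dG[e][a][b][c] = d_e G^a_bc
    r = range(2)
    return [[sum(dG[c][c][a][b] - dG[b][c][a][c] for c in r)
             + sum(G[c][a][b] * G[d][c][d] - G[d][b][c] * G[c][a][d] for c in r for d in r)
             for b in r] for a in r]

x = [1.0, 0.5]
R = ricci(x)
ginv = inverse(metric(x))
scalar = sum(ginv[a][b] * R[a][b] for a in range(2) for b in range(2))
print(R[0][0], R[1][1] / math.sin(x[0]) ** 2, scalar)
```

Within the numerical accuracy, the output reproduces R_ϑϑ = 1, R_φφ = sin²ϑ and R = 2.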
8.2 Integration, metric determinant g, and differential operators
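Useful formula for derivatives Applied to derivatives of $\sqrt{|g|}$, we obtain
\[ \frac{1}{2}\,g^{\mu\nu}\partial_\lambda g_{\mu\nu} = \frac{1}{2}\,\partial_\lambda\ln g = \frac{1}{\sqrt{|g|}}\,\partial_\lambda(\sqrt{|g|}) , \tag{8.14} \]
while we find for contracted Christoffel symbols
\[ \Gamma^\mu{}_{\mu\nu} = \frac{1}{2}\,g^{\mu\kappa}(\partial_\mu g_{\kappa\nu} + \partial_\nu g_{\mu\kappa} - \partial_\kappa g_{\mu\nu}) = \frac{1}{2}\,g^{\mu\kappa}\partial_\nu g_{\mu\kappa} = \frac{1}{2}\,\partial_\nu\ln g = \frac{1}{\sqrt{|g|}}\,\partial_\nu(\sqrt{|g|}) . \tag{8.15} \]
Next we consider the divergence of a vector field,
\[ \nabla_\mu V^\mu = \partial_\mu V^\mu + \Gamma^\mu{}_{\lambda\mu}V^\lambda = \partial_\mu V^\mu + \frac{1}{\sqrt{|g|}}\,(\partial_\mu\sqrt{|g|})\,V^\mu = \frac{1}{\sqrt{|g|}}\,\partial_\mu(\sqrt{|g|}\,V^\mu) , \tag{8.16} \]
and of antisymmetric tensors of rank 2,
\[ \nabla_\mu A^{\mu\nu} = \partial_\mu A^{\mu\nu} + \Gamma^\mu{}_{\lambda\mu}A^{\lambda\nu} + \Gamma^\nu{}_{\lambda\mu}A^{\mu\lambda} = \frac{1}{\sqrt{|g|}}\,\partial_\mu(\sqrt{|g|}\,A^{\mu\nu}) . \tag{8.17} \]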
In the latter case, the third term Γν λµ Aµλ vanishes because of the antisymmetry of Aµλ so
that we could combine the first two as in the vector case. This generalises to completely
anti-symmetric tensors of all orders. For a symmetric tensor, we find
∇µ S µν = ∂µ S µν + Γµλµ S λν + Γν λµ S µλ = (8.18)
We can express Γbca as derivative of the metric tensor,
\[ \nabla_\mu S^{\mu\nu} = \frac{1}{\sqrt{|g|}}\,\partial_\mu(\sqrt{|g|}\,S^{\mu\nu}) + \Gamma^\nu{}_{\lambda\mu}S^{\mu\lambda} . \tag{8.19} \]
Thus we can perform the covariant derivative of S µν without the need to know the Christoffel
symbols.
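Example: Spherical coordinates 3:
Calculate for spherical coordinates x = (r, ϑ, φ) in R3 the gradient, divergence, and the Laplace
operator. Note that one normally uses normalized unit vectors in case of a diagonal metric: this
corresponds to a rescaling of vector components $V^i \to V^i/\sqrt{g_{ii}}$ (no summation in i) or basis vectors.
(Recall the analogous rescaling in the exercise “acceleration of a stationary observer in SW BH”.)
We express the gradient of a scalar function f first as
\[ \partial^i f\, e_i = g^{ij}\,\frac{\partial f}{\partial x^j}\,e_i = \frac{\partial f}{\partial r}\,e_r + \frac{1}{r^2}\,\frac{\partial f}{\partial\vartheta}\,e_\vartheta + \frac{1}{r^2\sin^2\vartheta}\,\frac{\partial f}{\partial\phi}\,e_\phi \]
and rescale then the basis, $e^*_i = e_i/\sqrt{g_{ii}}$, or $e^*_r = e_r$, $e^*_\vartheta = e_\vartheta/r$, and $e^*_\phi = e_\phi/(r\sin\vartheta)$. In this new
(“physical”) basis, the gradient is given by
\[ \partial^i f\, e^*_i = \frac{\partial f}{\partial r}\,e^*_r + \frac{1}{r}\,\frac{\partial f}{\partial\vartheta}\,e^*_\vartheta + \frac{1}{r\sin\vartheta}\,\frac{\partial f}{\partial\phi}\,e^*_\phi . \]
The covariant divergence of a vector field with rescaled components $X^i/\sqrt{g_{ii}}$ is with $\sqrt{g} = r^2\sin\vartheta$
given by
\begin{align}
\nabla_i X^i = \frac{1}{\sqrt{|g|}}\,\partial_i(\sqrt{|g|}\,X^i) &= \frac{1}{r^2\sin\vartheta}\left[\frac{\partial(r^2\sin\vartheta\,X_r)}{\partial r} + \frac{\partial(r^2\sin\vartheta\,X_\vartheta)}{r\,\partial\vartheta} + \frac{\partial(r^2\sin\vartheta\,X_\phi)}{r\sin\vartheta\,\partial\phi}\right] \\
&= \frac{1}{r^2}\,\frac{\partial(r^2 X_r)}{\partial r} + \frac{1}{r\sin\vartheta}\,\frac{\partial(\sin\vartheta\,X_\vartheta)}{\partial\vartheta} + \frac{1}{r\sin\vartheta}\,\frac{\partial X_\phi}{\partial\phi} \\
&= \left(\frac{\partial}{\partial r} + \frac{2}{r}\right)X_r + \frac{1}{r}\left(\frac{\partial}{\partial\vartheta} + \cot\vartheta\right)X_\vartheta + \frac{1}{r\sin\vartheta}\,\frac{\partial X_\phi}{\partial\phi} .
\end{align}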
For a non-zero current, the volume integral over the charge density j 0 remains constant,
\[ \int_\Omega d^4x\,\sqrt{|g|}\,\nabla_\mu j^\mu = \int_{V(t_2)} d^3x\,\sqrt{|g|}\,j^0 - \int_{V(t_1)} d^3x\,\sqrt{|g|}\,j^0 = 0 . \tag{8.21} \]
Thus the conservation of Noether charges of internal symmetries such as the electric charge, baryon
number, etc., is not affected by an expanding universe.
Next we consider the stress tensor as example for a locally conserved symmetric tensor of
rank two. Now, the second term in Eq. (8.19) prevents us from converting the local conservation
law into a global one. If the space-time admits however a Killing field ξ, then we can form
the vector field P µ = T µν ξν with
∇µ P µ = ∇µ (T µν ξν ) = ξν ∇µ T µν + T µν ∇µ ξν = 0. (8.22)
Here, the first term vanishes since T µν is conserved and the second because T µν is symmetric,
while ∇µ ξν is antisymmetric. Therefore the vector field P µ = T µν ξν is also conserved, ∇µ P µ =
0, and we obtain thus the conservation of the component of the energy-momentum vector in
the direction of ξ.
In summary, global energy conservation requires the existence of a time-like Killing vector
field. Moving along such a Killing field, the metric would be invariant. Since we expect in an
expanding universe a time-dependence of the metric, a time-like Killing vector field does not
exist and the energy contained in a “comoving” volume changes with time.
Note that the terms in the first line are ordered according to the number of derivatives: Λ : ∂ 0 ,
b : ∂ 2 , c : ∂ 4 . Choosing only the first term, a constant, will not give dynamical equations.
The next simplest possibility is to pick out only the second term, as it was done originally
8.3 Einstein-Hilbert action
The Lagrangian is a function of the metric, its first and second derivatives,¹
LEH (gµν , ∂ρ gµν , ∂ρ ∂σ gµν ). The resulting action
\[ S_{\rm EH}[g_{\mu\nu}] = -\int_\Omega d^4x\,\sqrt{|g|}\,\{R + 2\Lambda\} \tag{8.25} \]
is a functional of the metric tensor gµν , and a variation of the action with respect to the metric
gives the field equations for the gravitational field. We allow for variations of the metric gµν
restricted by the condition that the variation of gµν and its first derivatives vanish on the
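boundary ∂Ω. Requiring that the variation vanishes, we obtain
\begin{align}
0 = \delta S_{\rm EH} &= -\delta\int_\Omega d^4x\,\sqrt{|g|}\,(R + 2\Lambda) \tag{8.26a}\\
&= -\delta\int_\Omega d^4x\,\sqrt{|g|}\,(g^{\mu\nu}R_{\mu\nu} + 2\Lambda) \tag{8.26b}\\
&= -\int_\Omega d^4x\,\left\{\sqrt{|g|}\,g^{\mu\nu}\,\delta R_{\mu\nu} + \sqrt{|g|}\,R_{\mu\nu}\,\delta g^{\mu\nu} + (R + 2\Lambda)\,\delta\sqrt{|g|}\right\} . \tag{8.26c}
\end{align}
Our task is to rewrite the first and third term as terms proportional to the variation δg µν or to show that they are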
equivalent to boundary terms. Let us start with the first term. Choosing inertial coordinates,
the Ricci tensor at the considered point P becomes
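Hence
\[ g^{\mu\nu}\,\delta R_{\mu\nu} = g^{\mu\nu}\left(\partial_\rho\,\delta\Gamma^\rho{}_{\mu\nu} - \partial_\nu\,\delta\Gamma^\rho{}_{\mu\rho}\right) = g^{\mu\nu}\,\partial_\rho\,\delta\Gamma^\rho{}_{\mu\nu} - g^{\mu\rho}\,\partial_\rho\,\delta\Gamma^\nu{}_{\mu\nu} , \tag{8.28} \]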
where we exchanged the indices ν and ρ in the last term. Since ∂ρ gµν = 0 at P , we can
rewrite the expression as
The quantity V ρ is a vector, since the difference of two connection coefficients transforms as a
tensor. Replacing in Eq. (8.29) the partial derivative by a covariant one therefore promotes it
to a valid tensor equation,
\[ g^{\mu\nu}\,\delta R_{\mu\nu} = \nabla_\mu V^\mu = \frac{1}{\sqrt{|g|}}\,\partial_\mu(\sqrt{|g|}\,V^\mu) . \tag{8.30} \]
¹ Recall that the Lagrange equations are modified in the case of higher derivatives, which is one reason why
we directly vary the action in order to obtain the field equations.
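Thus this term corresponds to a surface term which we assume to vanish. Next we rewrite
the third term using
\[ \delta\sqrt{|g|} = \frac{1}{2\sqrt{|g|}}\,\delta|g| = \frac{1}{2}\,\sqrt{|g|}\,g^{\mu\nu}\,\delta g_{\mu\nu} = -\frac{1}{2}\,\sqrt{|g|}\,g_{\mu\nu}\,\delta g^{\mu\nu} \tag{8.31} \]
and obtain
\[ \delta S_{\rm EH} = -\int_\Omega d^4x\,\sqrt{|g|}\,\left\{R_{\mu\nu} - \frac{1}{2}\,g_{\mu\nu}R - \Lambda\,g_{\mu\nu}\right\}\delta g^{\mu\nu} = 0 . \tag{8.32} \]
Hence the metric fulfils in vacuum the equation
\[ -\frac{1}{\sqrt{|g|}}\,\frac{\delta S_{\rm EH}}{\delta g^{\mu\nu}} = R_{\mu\nu} - \frac{1}{2}\,R\,g_{\mu\nu} - \Lambda g_{\mu\nu} \equiv G_{\mu\nu} - \Lambda g_{\mu\nu} = 0 , \tag{8.33} \]
where we introduced the Einstein tensor Gµν . The constant Λ is called the cosmological
constant. It has the dimension of an inverse length squared: If the cosmological constant is non-zero,
empty space is curved with a curvature radius $\Lambda^{-1/2}$.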
Einstein equation with matter We consider now the combined action of gravity and matter,
as the sum of the Einstein-Hilbert Lagrange density LEH /2κ and the Lagrange density Lm
including all relevant matter fields,
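\[ \mathcal{L} = \frac{1}{2\kappa}\,\mathcal{L}_{\rm EH} + \mathcal{L}_m = -\frac{1}{2\kappa}\,\sqrt{|g|}\,(R + 2\Lambda) + \mathcal{L}_m . \tag{8.34} \]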
In Lm , the effects of gravity are accounted for by the replacements {∂µ , ηµν } → {∇µ , gµν },
while we have to adjust later the constant κ such that we reproduce Newtonian dynamics
in the weak-field limit. We expect that the source of the gravitational field is the energy-
momentum tensor. More precisely, the Einstein tensor (“geometry”) should be determined
by the matter, Gµν = κTµν . Since we know already the result of the variation of SEH , we
conclude that the variation of Sm should give
\[ \frac{2}{\sqrt{|g|}}\,\frac{\delta S_m}{\delta g^{\mu\nu}} = T_{\mu\nu} . \tag{8.35} \]
The tensor Tµν defined by this equation is called dynamical energy-momentum stress tensor.
In order to show that this definition makes sense, we have to prove that ∇µ Tµν = 0 and
we have to convince ourselves that this definition reproduces the standard results we know
already. Einstein’s field equation follows then as
Alternative form of the Einstein equation We can rewrite the Einstein equation such that
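the only geometrical term on the LHS is the Ricci tensor. Because of
\[ R^\mu{}_\mu - \frac{1}{2}\,g^\mu{}_\mu\,(R + 2\Lambda) = R - 2(R + 2\Lambda) = -R - 4\Lambda = \kappa\,T^\mu{}_\mu \tag{8.37} \]
we can perform with $T \equiv T^\mu{}_\mu$ the replacement R = −4Λ − κT in the Einstein equation and
obtain
\[ R_{\mu\nu} = \kappa\left(T_{\mu\nu} - \frac{1}{2}\,g_{\mu\nu}T\right) - g_{\mu\nu}\Lambda . \tag{8.38} \]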
8.4 Dynamical stress tensor
This form of the Einstein equations is often useful, when it is easier to calculate T than R.
Note also that Eq. (8.38) informs us that an empty universe with Λ = 0 has a vanishing Ricci
tensor, Rµν = 0.
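The equivalence of Eq. (8.38) with the original form of the field equation is purely algebraic and can be tested with random numbers. The sketch below uses, as an assumption made only for simplicity, the flat metric η as g and a random symmetric matrix as T; it builds the Ricci tensor from Eq. (8.38) and checks that the combination appearing in Eq. (8.33) equals κT. The values of κ and Λ are arbitrary.

```python
import random

# Diagonal metric used as g_{mu nu}; for eta the inverse metric has the same entries.
ETA = [[1.0, 0.0, 0.0, 0.0], [0.0, -1.0, 0.0, 0.0],
       [0.0, 0.0, -1.0, 0.0], [0.0, 0.0, 0.0, -1.0]]

def trace(A):
    """g^{mu nu} A_{mu nu} for the diagonal metric ETA."""
    return sum(ETA[m][m] * A[m][m] for m in range(4))

def ricci_from_matter(T, kappa, lam):
    """Eq. (8.38): R_{mu nu} = kappa (T_{mu nu} - g_{mu nu} T / 2) - g_{mu nu} Lambda."""
    t = trace(T)
    return [[kappa * (T[m][n] - 0.5 * ETA[m][n] * t) - ETA[m][n] * lam
             for n in range(4)] for m in range(4)]

def einstein_lhs(R, lam):
    """R_{mu nu} - g_{mu nu} R / 2 - Lambda g_{mu nu}, cf. Eq. (8.33)."""
    r = trace(R)
    return [[R[m][n] - 0.5 * ETA[m][n] * r - lam * ETA[m][n]
             for n in range(4)] for m in range(4)]

random.seed(1)
T = [[0.0] * 4 for _ in range(4)]
for m in range(4):
    for n in range(m, 4):
        T[m][n] = T[n][m] = random.uniform(-1, 1)
kappa, lam = 0.7, 0.3
R = ricci_from_matter(T, kappa, lam)
lhs = einstein_lhs(R, lam)
residual = max(abs(lhs[m][n] - kappa * T[m][n]) for m in range(4) for n in range(4))
print(trace(R), -4 * lam - kappa * trace(T), residual)
```

The first two printed numbers agree, reproducing the trace relation (8.37), and the residual vanishes up to rounding errors.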
The second term is a four-divergence and thus a boundary term that we can neglect. The
remaining first term vanishes for arbitrary ξ, if the stress tensor is conserved,
∇α T αβ = 0 . (8.44)
Hence the local conservation of energy-momentum is a consequence of the general covariance
of the gravitational field equations, in the same way as current conservation follows from
gauge invariance in electromagnetism.
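We now evaluate the dynamical stress tensor for the examples of the Klein-Gordon and the
photon field. Note that the replacement ηαβ → gαβ also requires that we express
summation indices as contractions with the metric tensor, i.e. we have to replace e.g. $A_\alpha B^\alpha$
by $g^{\alpha\beta}A_\alpha B_\beta$. Thus we rewrite Eq. (7.47) including a potential V (φ), that could be also a
mass term, V (φ) = m2 φ2 /2, as
\[ \mathcal{L} = \frac{1}{2}\,g^{\alpha\beta}\,\nabla_\alpha\phi\,\nabla_\beta\phi - V(\phi) . \tag{8.45} \]
With ∇α φ = ∂α φ for a scalar field, the variation of the action gives
\begin{align}
\delta S_{\rm KG} &= \frac{1}{2}\int_\Omega d^4x\,\left\{\sqrt{|g|}\,\nabla_\alpha\phi\,\nabla_\beta\phi\,\delta g^{\alpha\beta} + \left[g^{\alpha\beta}\,\nabla_\alpha\phi\,\nabla_\beta\phi - 2V(\phi)\right]\delta\sqrt{|g|}\right\} \\
&= \int_\Omega d^4x\,\sqrt{|g|}\,\delta g^{\alpha\beta}\left\{\frac{1}{2}\,\nabla_\alpha\phi\,\nabla_\beta\phi - \frac{1}{2}\,g_{\alpha\beta}\,\mathcal{L}\right\} \tag{8.46a}
\end{align}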
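and thus
\[ T_{\alpha\beta} = \frac{2}{\sqrt{|g|}}\,\frac{\delta S_m}{\delta g^{\alpha\beta}} = \nabla_\alpha\phi\,\nabla_\beta\phi - g_{\alpha\beta}\,\mathcal{L} . \tag{8.47} \]
Next we consider the free electromagnetic action,
\[ S_{\rm em} = -\frac{1}{4}\int_\Omega d^4x\,\sqrt{|g|}\,F_{ab}F^{ab} = -\frac{1}{4}\int_\Omega d^4x\,\sqrt{|g|}\,g^{ac}g^{bd}\,F_{ab}F_{cd} . \tag{8.48} \]
Noting that Fαβ = ∇α Aβ − ∇β Aα = ∂α Aβ − ∂β Aα , we obtain
\begin{align}
\delta S_{\rm em} &= -\frac{1}{4}\int_\Omega d^4x\,\left\{(\delta\sqrt{|g|})\,F_{\rho\sigma}F^{\rho\sigma} + \sqrt{|g|}\,\delta(g^{\alpha\rho}g^{\beta\sigma})\,F_{\alpha\beta}F_{\rho\sigma}\right\} \tag{8.49a}\\
&= -\frac{1}{4}\int_\Omega d^4x\,\sqrt{|g|}\,\delta g^{\alpha\beta}\left\{-\frac{1}{2}\,g_{\alpha\beta}\,F_{\rho\sigma}F^{\rho\sigma} + 2g^{\rho\sigma}\,F_{\alpha\rho}F_{\beta\sigma}\right\} . \tag{8.49b}
\end{align}
Hence the dynamical stress tensor is
\[ T_{\alpha\beta} = -F_{\alpha\rho}F_\beta{}^\rho + \frac{1}{4}\,g_{\alpha\beta}\,F_{\rho\sigma}F^{\rho\sigma} . \tag{8.50} \]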
Thus we reproduced in both cases the (symmetrised) canonical stress tensor.
where V0 is the minimum of the potential V (φ). Hence a scalar field with a non-zero minimum
of its potential acts as a cosmological constant.
Next we consider a perfect fluid described by the two parameters density ρ and pressure
P . We know already that T αβ = diag{ρ, P, P, P } for a perfect fluid in its rest frame. Hence
and a fluid with P = −ρ, i.e. marginally fulfilling the strong energy condition, has the same
property as a cosmological constant.
Is it possible to distinguish a term like Tab = gab V0 (φ) in Sm from a non-zero Λ in SEH ? In
principle yes, since a cosmological constant fulfils P = −ρ exactly and independently of all
external parameters like temperature or density. The latter change with time in the universe
and therefore there may be detectable differences to a fluid with P = P (ρ, T, . . .) and a scalar
field with potential V = V (ρ, T, . . .), even if they mimic today very well a cosmological
constant with P = −ρ.
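The statement that a fluid with P = −ρ behaves like a cosmological constant can be illustrated with the flat-space stress tensor (7.43). In the sketch below, the signature (+,−,−,−) and the numerical values of ρ, P and the velocity are illustrative assumptions; the point is that for P = −ρ the tensor reduces to ρ η^{αβ} in every frame.

```python
import math

ETA = [[1.0, 0.0, 0.0, 0.0], [0.0, -1.0, 0.0, 0.0],
       [0.0, 0.0, -1.0, 0.0], [0.0, 0.0, 0.0, -1.0]]

def four_velocity(v):
    """u^a = (gamma, gamma v) for a three-velocity v with |v| < 1."""
    gamma = 1.0 / math.sqrt(1.0 - sum(c * c for c in v))
    return [gamma] + [gamma * c for c in v]

def perfect_fluid(rho, P, v):
    """T^{ab} = (rho + P) u^a u^b - P eta^{ab}, Eq. (7.43)."""
    u = four_velocity(v)
    return [[(rho + P) * u[a] * u[b] - P * ETA[a][b] for b in range(4)]
            for a in range(4)]

rest = perfect_fluid(2.0, 0.5, (0.0, 0.0, 0.0))
vacuum_like = perfect_fluid(2.0, -2.0, (0.3, -0.2, 0.6))
print([rest[a][a] for a in range(4)])
print([vacuum_like[a][a] for a in range(4)])
```

The first line reproduces diag{ρ, P, P, P} in the rest frame, while the second shows that for P = −ρ the result is proportional to the metric although the fluid is moving.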
8.5 Alternative theories
for a point-particle moving along x(τ ) with proper time τ . Inserting this into
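\[ \nabla_\alpha T^{\alpha\beta} = \partial_\alpha T^{\alpha\beta} + \Gamma^\alpha{}_{\sigma\alpha}T^{\sigma\beta} + \Gamma^\beta{}_{\sigma\alpha}T^{\alpha\sigma} = \frac{1}{\sqrt{|g|}}\,\partial_\alpha(\sqrt{|g|}\,T^{\alpha\beta}) + \Gamma^\beta{}_{\sigma\alpha}T^{\alpha\sigma} = 0 \tag{8.53} \]
gives
\[ \frac{\partial}{\partial\tilde x^\alpha}\int d\tau\,\dot x^\alpha\dot x^\beta\,\delta^{(4)}(\tilde x - x(\tau)) + \Gamma^\beta{}_{\sigma\alpha}\int d\tau\,\dot x^\alpha\dot x^\sigma\,\delta^{(4)}(\tilde x - x(\tau)) = 0 . \tag{8.54} \]
We can replace $\partial/\partial\tilde x^\alpha = -\partial/\partial x^\alpha$ acting on $\delta^{(4)}(\tilde x - x(\tau))$ and use moreover
\[ \dot x^\alpha\,\frac{\partial}{\partial x^\alpha}\,\delta^{(4)}(\tilde x - x(\tau)) = \frac{d}{d\tau}\,\delta^{(4)}(\tilde x - x(\tau)) \tag{8.55} \]
to obtain
\[ -\int d\tau\,\dot x^\beta\,\frac{d}{d\tau}\,\delta^{(4)}(\tilde x - x(\tau)) + \Gamma^\beta{}_{\sigma\alpha}\int d\tau\,\dot x^\alpha\dot x^\sigma\,\delta^{(4)}(\tilde x - x(\tau)) = 0 . \tag{8.56} \]
Integrating the first term by parts we obtain
\[ \int d\tau\,\left(\ddot x^\beta + \Gamma^\beta{}_{\sigma\alpha}\,\dot x^\alpha\dot x^\sigma\right)\delta^{(4)}(\tilde x - x(\tau)) = 0 . \tag{8.57} \]
The integral vanishes only when the world-line xα (τ ) is a geodesic. Hence Einstein’s equation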
implies already the equation of motion of a point particle, in contrast to Maxwell’s theory,
where the Lorentz force law has to be postulated separately.
Tensor-scalar theories The field equations for a purely scalar theory of gravity would be
\[ \Box\phi = -4\pi G\,T^a{}_a . \tag{8.58} \]
It predicts no coupling between photons and gravitation, since Taa = 0 for the electromagnetic
field. A purely vector theory for gravity fails, since it predicts not attraction but repulsion
for two masses.
However, it may well be that gravity is a mixture of scalar, vector and tensor exchange,
dominated by the latter. An important example for a tensor-scalar theory is the Brans-Dicke
theory. Here one uses gµν to describe gravitational interactions but assumes that the strength,
G, is determined by a scalar field φ,
\[ S = \int d^4x\,\sqrt{|g|}\,\left\{-\frac{1}{2}\,\phi^2 R + \alpha(\partial_\mu\phi)^2 + \mathcal{L}_m(g_{\mu\nu},\psi)\right\} , \tag{8.59} \]
where ψ represents all matter fields. Rescaling the metric by
\[ \tilde g_{\mu\nu} = g_{\mu\nu}\,\frac{\kappa}{\phi^2} \]
we are back to Einstein gravity, but now φ couples universally to all matter fields ψ.
f (R) gravity Another important class of modified gravity models are the so-called f (R)
gravity models, which generalise the Einstein–Hilbert action replacing R by a general function
f (R). Thus the action of f (R) gravity coupled to matter has the form
\[ S = \int d^4x\,\sqrt{|g|}\,\left\{-\frac{1}{2\tilde\kappa}\,f(R) + \mathcal{L}_m\right\} , \tag{8.60} \]
where Sm may contain both non-relativistic matter and radiation. Note that for f (R) ≠ R, the
gravitational constant κ̃ = 8π G̃ deviates from Newton’s constant G measured in a Cavendish
experiment. The field equations can be derived from the action (8.60) either by a variation
w.r.t. the metric or the connection. The dynamics and the number of the resulting degrees
of freedom differ in the two treatments. Following the first approach, generalising our old
derivation one obtains
\[ F(R)\,R_{\mu\nu} - \frac{1}{2}\,f(R)\,g_{\mu\nu} - \nabla_\mu\nabla_\nu F(R) + g_{\mu\nu}\,\Box F(R) = \kappa\,T_{\mu\nu} \tag{8.61} \]
with F ≡ df /dR. Taking the trace of this expression, we find
The term $\Box F(R)$ acts as a kinetic term so that these models contain an additional propagating
scalar degree of freedom, φ = F (R).
Extra dimensions and Kaluza-Klein theories String theory suggests that we live in a world
with d = 10 spacetime dimensions. There are two obvious answers to this result: first, one
may conclude that string theory is disproven by nature or, second, one may adjust reality.
Consistency of the second approach with experimental data could be achieved, if the d − 4
dimensions are compactified with a sufficiently small radius R, such that they are not visible
in experiments sensitive to wavelengths λ ≫ R.
Let us check what happens to a scalar particle with mass m, if we add a fifth compact
dimension y. The Klein–Gordon equation for a scalar field φ(xµ , y) becomes
with the five-dimensional d’Alembert operator $\Box_5 = \Box - \partial_y^2$. The equation can be separated,
φ(xµ , y) = φ(xµ )f (y), and since the fifth dimension is compact, the spectrum of f is discrete.
Assuming periodic boundary conditions, f (y) = f (y + R), gives
The energy eigenvalues of these solutions are $\omega_{k,n}^2 = k^2 + m^2 + (n\pi/R)^2$. From a four-
dimensional point of view, the term $(n\pi/R)^2$ appears as a mass term, $m_n^2 = m^2 + (n\pi/R)^2$.
Since we usually consider states with different masses as different particles, we see the five-
dimensional particle as a tower of particles with mass mn but otherwise identical quantum
numbers. Such theories are called Kaluza–Klein theories, and the tower of particles Kaluza–
Klein particles. If R ≪ λ, where λ is the length-scale experimentally probed, only the n = 0
particle is visible and physics appears to be four-dimensional.
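The decoupling of the heavy modes can be made explicit with a small numerical example. The sketch below uses the mass formula m_n² = m² + (nπ/R)² in natural units; the values of m, R and the maximal energy probed by a hypothetical experiment are illustrative assumptions, as are the function names.

```python
import math

def kk_mass(m, n, R):
    """Mass of the n-th Kaluza-Klein mode, m_n^2 = m^2 + (n pi / R)^2."""
    return math.sqrt(m * m + (n * math.pi / R) ** 2)

def energy(k, m, n, R):
    """Energy eigenvalue, omega_{k,n}^2 = k^2 + m_n^2."""
    return math.sqrt(k * k + kk_mass(m, n, R) ** 2)

def visible_modes(m, R, e_max):
    """Modes whose mass lies below the largest energy e_max probed by an experiment."""
    n, modes = 0, []
    while kk_mass(m, n, R) <= e_max:
        modes.append(n)
        n += 1
    return modes

m, R = 1.0, 0.01        # illustrative values with R much smaller than 1/m
tower = [kk_mass(m, n, R) for n in range(4)]
low_energy = visible_modes(m, R, e_max=10.0)
high_energy = visible_modes(m, R, e_max=700.0)
print(tower, low_energy, high_energy)
```

As long as the probed energies stay well below π/R, only the n = 0 mode is accessible; raising the energy opens up successively more members of the tower.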
Since string theory includes gravity, one often assumes that the radius R of the extra
dimensions is determined by the Planck length, $R = 1/M_{\rm Pl} = (8\pi G_N)^{1/2} \sim 10^{-34}$ cm. In this
Since the 1/r² behaviour of the gravitational force is not tested below d∗ ∼ mm scales, one
can imagine that large extra dimensions exist that are only visible to gravity: Relating the
d = 4 and d > 4 Newton’s law $F \sim m_1 m_2/r^{2+\delta}$ at the intermediate scale r = R, we can derive
the “true” value of the Planck scale in this model: Matching of Newton’s law in 4 and 4 + δ
dimensions at r = R gives
\[ F(r = R) = G_N\,\frac{m_1 m_2}{R^2} = \frac{1}{M_D^{2+\delta}}\,\frac{m_1 m_2}{R^{2+\delta}} . \tag{8.65} \]
This equation relates the size R of the large extra dimensions to the true fundamental scale
MD of gravity in this model,
\[ G_N^{-1} = 8\pi M_{\rm Pl}^2 = R^\delta M_D^{\delta+2} , \tag{8.66} \]
while Newton’s constant GN becomes just an auxiliary quantity useful to describe physics
at r ≳ R. (You may compare this to the case of weak interactions where Fermi’s constant
$G_F \propto g^2/m_W^2$ is determined by the weak coupling constant g and the mass mW of the W-boson.)
Thus in such a set-up, gravity is much weaker than the weak interaction because the
gravitational field is diluted into a large volume.
Next we ask if MD ∼ TeV is possible, which would allow one to test such theories at
accelerators such as the LHC. Inserting the measured value of GN and MD = 1 TeV in Eq. (8.66) we
find the required value for the size R of the large extra dimension as $10^{13}$ cm and 0.1 cm for
δ = 1 and 2, respectively. Thus the case δ = 1 is excluded by the agreement of the dynamics
of the solar system with four-dimensional Newtonian physics. The cases δ ≥ 2 are possible,
because Newton’s law is experimentally tested only for scales r ≳ 1 mm.
9 Linearized gravity and gravitational waves
In any relativistic theory of gravity, the effects of an accelerated point mass on the surrounding
spacetime can propagate at most with the speed of light. Thus one expects that, in
close analogy to electromagnetic waves, gravitational waves exist. Such waves correspond to
ripples in spacetime which lead to local stresses and transport energy. Although gravitational
waves were already predicted by Einstein in 1916, their existence was questioned until the
1950s: Since locally the effects of gravity can be eliminated, it was doubted that they cause
any measurable effects. Similarly, the non-existence of a stress tensor for the gravitational
field raised the question of how, e.g., the momentum and energy flux of gravitational waves can
be properly defined. Only in 1957, at the now famous “Chapel Hill Conference”, was this
controversy settled: First, Pirani presented a formalism showing how coordinate-independent effects
of a gravitational wave could be deduced. Second, Feynman suggested the following simple
gedankenexperiment: A gravitational wave passing a rod with sticky beads would move the
beads along the rod; friction would then produce heat, implying that the gravitational wave
had done work. Soon after that the first gravitational wave detectors were developed, but
the first detection was accomplished only in 2015.
These perturbations may be caused either by the propagation of gravitational waves or by the
gravitational potential of a star. In the first case, current experiments show that we should
not hope for $h$ larger than $\mathcal{O}(h) \sim 10^{-22}$. Keeping only terms linear in $h$ is therefore an
excellent approximation. Choosing in the second case as application the final phase of the
spiral-in of a neutron star binary system, deviations from the Newtonian limit can become large.
Hence one needs a systematic “post-Newtonian” expansion or even a numerical analysis to
describe such cases properly.
We choose a Cartesian coordinate system $x^\mu$ and ask ourselves which transformations are
compatible with the splitting (9.1) of the metric. If we consider global Lorentz transformations
1 The same analysis could be performed for small perturbations around an arbitrary metric $g^{(0)}_{\mu\nu}$, adding
however considerable technical complexity.
9.1 Linearized gravity
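energy loss: $L_{\rm em} = -\frac{2}{3}\,\ddot d^a \ddot d_a$ (electromagnetic), $L_{\rm gr} = -\frac{G}{5}\,\dddot I_{ab}\,\dddot I^{ab}$ (gravitational)
Table 9.1: Comparison of basic formulas for electromagnetic and gravitational radiation.
$$ \tilde g_{\alpha\beta} = \Lambda^\rho{}_\alpha \Lambda^\sigma{}_\beta\, g_{\rho\sigma} = \Lambda^\rho{}_\alpha \Lambda^\sigma{}_\beta (\eta_{\rho\sigma} + h_{\rho\sigma}) = \eta_{\alpha\beta} + \Lambda^\rho{}_\alpha \Lambda^\sigma{}_\beta\, h_{\rho\sigma} = \tilde\eta_{\alpha\beta} + \Lambda^\rho{}_\alpha \Lambda^\sigma{}_\beta\, h_{\rho\sigma} . \tag{9.2} $$
Since $\tilde h_{\alpha\beta} = \Lambda^\rho{}_\alpha \Lambda^\sigma{}_\beta\, h_{\rho\sigma}$, we see that global Lorentz transformations respect the splitting (9.1).
Thus $h_{\mu\nu}$ transforms as a rank-2 tensor under global Lorentz transformations. We can
therefore view the perturbation $h_{\mu\nu}$ as a symmetric rank-2 tensor field defined on Minkowski space
that satisfies the linearized Einstein equation as its wave equation, in the same way as the photon field
fulfils a wave equation derived from Maxwell's equations.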
The splitting (9.1) is however clearly not invariant under general coordinate transformations,
as they allow, for example, the finite rescaling $g_{\mu\nu} \to \Omega g_{\mu\nu}$. We therefore restrict
ourselves to infinitesimal coordinate transformations,
because the term $\xi^\rho \partial_\rho h_{\mu\nu}$ is quadratic in the small quantities $h_{\mu\nu}$ and $\xi_\mu$ and can be neglected.
Recall that the $\xi^\rho \partial_\rho h_{\mu\nu}$ term appeared because we compared the metric tensor at different
points. In its absence, it is more fruitful to view Eq. (9.4) not as a coordinate but as a gauge
transformation analogous to Eq. (7.88). In this interpretation, we stay in Minkowski space
and the fields $\tilde h_{\mu\nu}$ and $h_{\mu\nu}$ describe the same physics, since the gravitational field equations
do not fix $h_{\mu\nu}$ uniquely for a given source.
Note that we used $\eta^{\mu\nu}$ to raise indices, which is appropriate in the linear approximation.
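Remembering the definition of the Riemann tensor,
$$ R^\mu{}_{\nu\lambda\kappa} = \partial_\lambda \Gamma^\mu{}_{\nu\kappa} - \partial_\kappa \Gamma^\mu{}_{\nu\lambda} + \Gamma^\mu{}_{\rho\lambda}\Gamma^\rho{}_{\nu\kappa} - \Gamma^\mu{}_{\rho\kappa}\Gamma^\rho{}_{\nu\lambda} , \tag{9.7} $$
we see that we can neglect the terms quadratic in the connection. Thus we find for the
change
$$ \begin{aligned}
\delta R^\mu{}_{\nu\lambda\kappa} &= \partial_\lambda \delta\Gamma^\mu{}_{\nu\kappa} - \partial_\kappa \delta\Gamma^\mu{}_{\nu\lambda} &\text{(9.8a)} \\
&= \frac12 \left\{ \partial_\lambda\partial_\nu h^\mu{}_\kappa + \partial_\lambda\partial_\kappa h^\mu{}_\nu - \partial_\lambda\partial^\mu h_{\nu\kappa} - \left( \partial_\kappa\partial_\nu h^\mu{}_\lambda + \partial_\kappa\partial_\lambda h^\mu{}_\nu - \partial_\kappa\partial^\mu h_{\nu\lambda} \right) \right\} &\text{(9.8b)} \\
&= \frac12 \left\{ \partial_\lambda\partial_\nu h^\mu{}_\kappa + \partial_\kappa\partial^\mu h_{\nu\lambda} - \partial_\lambda\partial^\mu h_{\nu\kappa} - \partial_\kappa\partial_\nu h^\mu{}_\lambda \right\} . &\text{(9.8c)}
\end{aligned} $$
The change in the Ricci tensor follows by contracting $\mu$ and $\lambda$,
$$ \delta R^\lambda{}_{\nu\lambda\kappa} = \frac12 \left\{ \partial_\lambda\partial_\nu h^\lambda{}_\kappa + \partial_\kappa\partial^\lambda h_{\nu\lambda} - \partial_\lambda\partial^\lambda h_{\nu\kappa} - \partial_\kappa\partial_\nu h^\lambda{}_\lambda \right\} . \tag{9.9} $$
Next we introduce $h \equiv h^\mu{}_\mu$, $\Box = \partial_\mu\partial^\mu$, and relabel the indices,
$$ \delta R_{\mu\nu} = \frac12 \left\{ \partial_\mu\partial_\rho h^\rho{}_\nu + \partial_\nu\partial_\rho h^\rho{}_\mu - \Box h_{\mu\nu} - \partial_\mu\partial_\nu h \right\} . \tag{9.10} $$
We now rewrite all terms apart from $\Box h_{\mu\nu}$ as derivatives of the vector
$$ \xi_\mu = \partial_\nu h^\nu{}_\mu - \frac12 \partial_\mu h , \tag{9.11} $$
obtaining
$$ \delta R_{\mu\nu} = \frac12 \left\{ -\Box h_{\mu\nu} + \partial_\mu \xi_\nu + \partial_\nu \xi_\mu \right\} . \tag{9.12} $$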
Looking back at the properties of $h_{\mu\nu}$ under gauge transformations, Eq. (9.4), we see that we
can gauge away the second and third term. Thus the linearised Einstein equation in vacuum,
$\delta R_{\mu\nu} = 0$, becomes simply
$$ \Box h_{\mu\nu} = 0 , \tag{9.13} $$
if the harmonic gauge²
$$ \xi_\mu = \partial_\nu h^\nu{}_\mu - \frac12 \partial_\mu h = 0 \tag{9.14} $$
is chosen. Hence the familiar wave equation holds for all independent components of
$h_{\mu\nu}$, and the perturbations propagate with the speed of light. Inserting plane waves
$h_{\mu\nu} = \varepsilon_{\mu\nu} \exp(-ikx)$ into the wave equation, one finds immediately that $k$ is a null vector.
The characteristic property of gravity that we can introduce in each point an inertial
coordinate system implies that we can set the perturbation $h_{\mu\nu}$ equal to zero in a single
point. This ambiguity was one of the reasons that the existence of gravitational waves was
doubted for a long time. In Section 8.1, we therefore introduced the Riemann tensor as an
unambiguous signature for the non-zero curvature of space-time. The derivation of a wave
equation for the Riemann tensor,
$$ \Box R^\mu{}_{\nu\lambda\kappa} = 0 , \tag{9.15} $$
by Pirani in 1956 (which follows by using (9.13) in (9.8c)), can therefore be seen as the
theoretical proof for the existence of gravitational waves in Einstein gravity: Ripples in spacetime,
or more formally perturbations of the curvature tensor, propagate with the speed of light.
2 Alternatively, this gauge is called Hilbert, Loren(t)z, de Donder, . . . , gauge.
Since we assumed an empty universe in zeroth order, $\delta T_{\mu\nu}$ is the complete contribution to the
stress tensor. We omit therefore in the following the $\delta$ in $\delta T_{\mu\nu}$. Next we introduce as useful
short-hand notation the “trace-reversed” amplitude as
$$ \bar h_{\mu\nu} \equiv h_{\mu\nu} - \frac12 \eta_{\mu\nu} h . \tag{9.17} $$
The harmonic gauge condition becomes then
$$ \partial^\mu \bar h_{\mu\nu} = 0 . \tag{9.18} $$
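The name “trace-reversed” can be checked directly. The following is a minimal sketch, assuming the signature (+,−,−,−) used in Eq. (9.21) and an arbitrary illustrative set of components for $h_{\mu\nu}$ (the helper names `trace` and `trace_reverse` are hypothetical): in four dimensions the trace of $\bar h_{\mu\nu}$ is $-h$, and applying the operation twice returns the original tensor.

```python
# Minkowski metric with signature (+,-,-,-); it is diagonal and its own inverse.
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]


def trace(h):
    """Return h = eta^{mu nu} h_{mu nu} for covariant components h[mu][nu]."""
    return sum(ETA[m][m] * h[m][m] for m in range(4))


def trace_reverse(h):
    """Return hbar_{mu nu} = h_{mu nu} - (1/2) eta_{mu nu} h, cf. Eq. (9.17)."""
    tr = trace(h)
    return [[h[m][n] - 0.5 * ETA[m][n] * tr for n in range(4)] for m in range(4)]


# Arbitrary symmetric example components (illustrative values only).
h = [[0.30, 0.10, 0.00, 0.20],
     [0.10, -0.40, 0.05, 0.00],
     [0.00, 0.05, 0.70, 0.10],
     [0.20, 0.00, 0.10, 0.25]]

hbar = trace_reverse(h)
print("trace of h    :", trace(h))
print("trace of hbar :", trace(hbar))
```

The sign flip of the trace relies on $\eta^{\mu\nu}\eta_{\mu\nu} = 4$ and would not hold in other dimensions.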
Newtonian limit We are now in the position to fix the value of the constant $\kappa$, comparing
the wave equation (9.19) with the Schwarzschild metric in the Newtonian limit. This limit
corresponds to $v/c \to 0$ and thus the only non-zero element of the stress tensor becomes
$T^{00} = \rho$. Moreover, the d'Alembert operator can be approximated by minus the Laplace
operator, $\Box \to -\Delta$. The Schwarzschild metric in the weak-field limit is
$$ ds^2 = (1 + 2\Phi)\,dt^2 - (1 - 2\Phi)\left( dx^2 + dy^2 + dz^2 \right) . \tag{9.21} $$
Hence the linearised Einstein equation has in the Newtonian limit the same form as the
Poisson equation $\Delta\Phi = 4\pi G\rho$, and the constant $\kappa$ equals $\kappa = 8\pi G$.
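Thus $\varepsilon_{00} = 0$ and the polarisation tensor is transverse, $k^a \varepsilon_{ab} = k^b \varepsilon_{ab} = 0$. If we choose
the plane wave propagating in the $z$ direction, $\mathbf{k} = k\mathbf{e}_z$, the last row and column of the
polarisation tensor vanish too. Accounting for $h = 0$ and $\varepsilon_{\alpha\beta} = \varepsilon_{\beta\alpha}$, only two independent
elements are left,
$$ \varepsilon_{\alpha\beta} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & \varepsilon_{11} & \varepsilon_{12} & 0 \\ 0 & \varepsilon_{12} & -\varepsilon_{11} & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} . \tag{9.26} $$
In general, one can construct the polarisation tensor in the TT gauge by first setting the
non-transverse part to zero and then subtracting the trace. The resulting two independent
elements are (again for $\mathbf{k} = k\mathbf{e}_z$) then $\varepsilon_{11} = (\varepsilon_{xx} - \varepsilon_{yy})/2$ and $\varepsilon_{12}$.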
Let us re-discuss the procedure of determining the physical polarisation states of a
gravitational wave following the same approach that we used for the photon in Eqs. (7.86)–(7.88).
We consider first as gauge transformation
obtaining³
Thus the gauge transformation does not affect the non-zero components in the TT gauge,
which are therefore the only physical ones. On the other hand, the arbitrariness of $\lambda_\mu$ allows
us to set all other elements of the polarisation tensor to zero. To see this, we note that the
harmonic gauge implies
$$ k^\mu \varepsilon_{\mu\nu} = \frac12 k_\nu\, \varepsilon^\mu{}_\mu . \tag{9.30} $$
3 Simpler to consider the trace-reversed $\bar h_{\mu\nu}$ ...
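Then it follows for $\nu = \{1, 2\}$
while the $\nu = \{0, 3\}$ components result with $\varepsilon^{i}{}_{j} = -\varepsilon_{ij}$ and $k_\mu = (\omega, 0, 0, -\omega)$ in
$$ \varepsilon_{00} + \varepsilon_{30} = \frac12 \left( \varepsilon_{00} - \varepsilon_{11} - \varepsilon_{22} - \varepsilon_{33} \right) = -(\varepsilon_{03} + \varepsilon_{33}) . \tag{9.32} $$
Thus we can eliminate four elements of the polarisation tensor. We choose to eliminate $\varepsilon_{0i}$,
using first $\varepsilon_{01} = -\varepsilon_{31}$ and $\varepsilon_{02} = -\varepsilon_{32}$. Next we combine the LHS and the RHS of Eq. (9.32)
using $\varepsilon_{30} = \varepsilon_{03}$, obtaining
$$ \varepsilon_{03} = -\frac12 (\varepsilon_{00} + \varepsilon_{33}) . \tag{9.33} $$
Finally, we use this relation to eliminate $\varepsilon_{03}$ in the $\nu = 3$ equation,
$$ \frac12 \varepsilon_{00} - \frac12 \varepsilon_{33} = \frac12 \left( \varepsilon_{00} - \varepsilon_{11} - \varepsilon_{22} - \varepsilon_{33} \right) , \tag{9.34} $$
and thus $\varepsilon_{11} = -\varepsilon_{22}$. Apart from the invariant physical elements, $\tilde\varepsilon_{11} = \varepsilon_{11}$ and $\tilde\varepsilon_{12} = \varepsilon_{12}$,
the remaining four elements of the polarisation tensor transform as
Since each of the four elements depends on a different $\lambda_\mu$, they can be set to zero choosing
$$ \lambda_0 = -\frac{\varepsilon_{00}}{2\omega} , \quad \lambda_1 = -\frac{\varepsilon_{13}}{\omega} , \quad \lambda_2 = -\frac{\varepsilon_{23}}{\omega} , \quad \lambda_3 = -\frac{\varepsilon_{33}}{2\omega} . \tag{9.37} $$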
Figure 9.1: The effect of a right-handed polarised gravitational wave on a ring of transverse
test particles as function of time; the dashed line shows the state without gravitational wave.
Helicity We determine now how a metric perturbation $h_{ab}$ transforms under a rotation by
the angle $\alpha$. We choose the wave propagating in the $z$ direction, $\mathbf{k} = k\mathbf{e}_z$, the TT gauge, and the
rotation in the $xy$ plane. Then the general Lorentz transformation $\Lambda$ becomes
$$ \Lambda_\mu{}^\nu = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\alpha & \sin\alpha & 0 \\ 0 & -\sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} . \tag{9.38} $$
Since $\mathbf{k} = k\mathbf{e}_z$ and thus $\Lambda_\mu{}^\nu k_\nu = k_\mu$, the rotation affects only the polarisation tensor. We
rewrite $\varepsilon'_{\mu\nu} = \Lambda_\mu{}^\rho \Lambda_\nu{}^\sigma \varepsilon_{\rho\sigma}$ in matrix notation, $\varepsilon' = \Lambda \varepsilon \Lambda^T$. It is sufficient to perform the
calculation for the $xy$ sub-matrices. The result after introducing circular polarisation states
$\varepsilon_\pm = \varepsilon_{11} \pm i\varepsilon_{12}$ is
$$ \varepsilon'^{\mu\nu}_\pm = \exp(\mp 2i\alpha)\, \varepsilon^{\mu\nu}_\pm . \tag{9.39} $$
The same calculation for a circularly polarised photon gives $\varepsilon'^{\mu}_\pm = \exp(\mp i\alpha)\, \varepsilon^\mu_\pm$. Any plane
wave $\psi$ which is transformed into $\psi' = e^{-ih\alpha}\psi$ by a rotation of an angle $\alpha$ around its
propagation axis is said to have helicity $h$. Thus if we say that a photon has spin 1 and a graviton
has spin 2, we mean more precisely that electromagnetic and gravitational plane waves have
helicity 1 and 2, respectively. Doing the same calculation in an arbitrary gauge, one finds that
the remaining, unphysical degrees of freedom transform as helicity 1 and 0 (problem 9.??).
In general, a massive tensor field of rank $n$ contains states with helicity $h = -n, \ldots, n$, and
thus $2n + 1$ polarisation states. In contrast, a massless tensor field of rank $n$ contains
only the two polarisation states with maximal helicity, $h = -n$ and $h = n$.
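The helicity-2 transformation law of Eq. (9.39) can be verified numerically. The sketch below is only an illustration: it applies $\varepsilon' = \Lambda\varepsilon\Lambda^T$ to the $xy$ block of a TT polarisation tensor with arbitrarily chosen values for $\varepsilon_{11}$, $\varepsilon_{12}$ and $\alpha$ (the function names are hypothetical) and compares the phases picked up by $\varepsilon_\pm = \varepsilon_{11} \pm i\varepsilon_{12}$ with $\exp(\mp 2i\alpha)$.

```python
import cmath
import math


def rotate_xy(eps, alpha):
    """Return eps' = R eps R^T, with R the xy block of the rotation in Eq. (9.38)."""
    c, s = math.cos(alpha), math.sin(alpha)
    rot = [[c, s], [-s, c]]
    # tmp = R eps
    tmp = [[sum(rot[i][k] * eps[k][j] for k in range(2)) for j in range(2)]
           for i in range(2)]
    # (R eps R^T)_{ij} = sum_k tmp_{ik} R_{jk}
    return [[sum(tmp[i][k] * rot[j][k] for k in range(2)) for j in range(2)]
            for i in range(2)]


def circular(eps):
    """Return the circular combinations (eps_+, eps_-) = eps_11 +- i eps_12."""
    return eps[0][0] + 1j * eps[0][1], eps[0][0] - 1j * eps[0][1]


e11, e12, alpha = 0.8, -0.3, 0.37          # illustrative values
eps = [[e11, e12], [e12, -e11]]            # traceless, symmetric xy block

plus, minus = circular(eps)
plus_rot, minus_rot = circular(rotate_xy(eps, alpha))

phase_plus = plus_rot / plus               # to be compared with exp(-2i alpha)
phase_minus = minus_rot / minus            # to be compared with exp(+2i alpha)
print(phase_plus, cmath.exp(-2j * alpha))
print(phase_minus, cmath.exp(2j * alpha))
```

Replacing the rank-2 tensor by a transverse vector $(\varepsilon_1, \varepsilon_2)$ and applying the rotation once gives the phase $\exp(\mp i\alpha)$ quoted for the photon.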
$$ \Gamma^\alpha{}_{00} = \frac12 \left( \partial_0 h^\alpha{}_0 + \partial_0 h^\alpha{}_0 - \partial^\alpha h_{00} \right) . \tag{9.40} $$
We are free to choose the TT gauge in which all components of $h_{\alpha\beta}$ appearing on the RHS
are zero. Hence the acceleration of the test particle is zero and its coordinate position is
unaffected by the gravitational wave: the TT gauge defines a comoving coordinate system.
The physical distance $l$ between two test particles is given by integrating
where $g_{ab}$ is the spatial part of the metric and $d\xi$ the spatial coordinate distance between
infinitesimally separated test particles. Hence the passage of a gravitational wave, $h_{\alpha\beta} \propto
\varepsilon_{\alpha\beta} \cos(\omega t)$, results in a periodic change of the separation of freely moving test particles.
Figure 9.1 shows that a gravitational wave exerts tidal forces, stretching and squashing test
particles in the transverse plane. The relative size of the change, $\Delta L/L$, is given by the
amplitude $h$ of the gravitational wave. It is this tiny periodic change, $\Delta L/L \lesssim 10^{-21} \cos(\omega t)$,
which gravitational wave experiments aim to detect. There are two basic types of gravitational
wave experiments. In the first, one uses the fact that the tidal forces of a passing gravitational
wave excite lattice vibrations in a solid. If the wave frequency is resonant with a lattice
mode, the vibrations might be amplified to detectable levels. In the second type of experiment,
the free test particles are replaced by mirrors. Between the mirrors, a laser beam is reflected
multiple times, thereby increasing the effective length $L$ and thus $\Delta L$, before two beams at
90° are brought to interference.
9.2 Stress pseudo-tensor for gravity
Figure 9.2: Sensitivity of present and future experiments compared to the expectations for
the amplitude h = ∆L/L for various gravitational wave sources.
The LHS of this equation is the LHS of the usual gravitational wave equation, while the RHS
now includes as source not only matter but also the gravitational field itself. It is therefore
natural to define
$$ R^{(1)}_{\alpha\beta} - \frac12 R^{(1)} \eta_{\alpha\beta} = -\kappa \left( T_{\alpha\beta} + t_{\alpha\beta} \right) \tag{9.43} $$
with $t_{\alpha\beta}$ as the stress pseudo-tensor for gravity. If we expand all quantities,
we can set, assuming $h_{\alpha\beta} \ll 1$, $R_{\alpha\beta} - R^{(1)}_{\alpha\beta} = R^{(2)}_{\alpha\beta} + \mathcal{O}(h^3)$, etc. Hence we find as stress
pseudo-tensor for the metric perturbations $h^{(1)}_{\alpha\beta}$ at $\mathcal{O}(h^3)$
$$ t_{\alpha\beta} = -\frac{1}{\kappa} \left( R^{(2)}_{\alpha\beta} - \frac12 R^{(2)} \eta_{\alpha\beta} \right) . \tag{9.45} $$
This tensor is symmetric, quadratic in $h^{(1)}_{\alpha\beta}$ and conserved because of the Bianchi identity.
Moreover, it transforms as a tensor in Minkowski space. This implies that one can derive
global conservation laws for the energy and the angular momentum of the gravitational field,
if we assume $|h_{\alpha\beta}| \to 0$ for $x \to \infty$. However, $t_{\alpha\beta}$ does not transform as a tensor under
general coordinate transformations, since it can be made to vanish identically at each point
by a suitable coordinate transformation. In the case of gravitational waves we may expect
that averaging $t_{\alpha\beta}$ over a volume large compared to the wavelength considered solves this
problem. Moreover, such an averaging simplifies the calculation of $t_{\alpha\beta}$, since all terms odd in
$kx$ cancel. Nevertheless, the calculation is extremely messy. We will therefore use a short-cut
via the following two digressions.
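Quadratic Einstein-Hilbert action We construct the action of gravity quadratic in $h_{\mu\nu}$ from
the wave equation (9.19), following the same logic as in the Maxwell case. We multiply by a
variation $\delta h^{\mu\nu}$ and integrate, obtaining
$$ 0 = \delta S^{\rm harm}_{\rm EH} + \delta S^{\rm harm}_{m} = \int d^4x \sqrt{|g|} \left\{ \frac{1}{4\kappa}\, \delta h^{\mu\nu}\, \Box \bar h_{\mu\nu} + \frac12\, \delta h^{\mu\nu}\, T_{\mu\nu} \right\} . \tag{9.46} $$
Here, we divided by two such that we obtain the correctly normalised stress tensor of matter
using (8.35). Next, we massage the first term into a form similar to the kinetic energy of a
scalar field in Minkowski space: We insert first the definition of $\bar h_{\mu\nu}$, use then the product
rule and perform finally a partial integration,
$$ \begin{aligned}
\delta h^{\mu\nu}\, \Box \bar h_{\mu\nu} &= \delta h^{\mu\nu}\, \Box h_{\mu\nu} - \frac12\, \delta h^{\mu\nu} \eta_{\mu\nu}\, \Box h = \delta h^{\mu\nu}\, \Box h_{\mu\nu} - \frac12\, \delta h\, \Box h &\text{(9.47a)} \\
&= \delta \left( \frac12 h^{\mu\nu}\, \Box h_{\mu\nu} - \frac14 h\, \Box h \right) = -\delta \left( \frac12 (\partial_\kappa h^{\mu\nu})^2 - \frac14 (\partial_\kappa h)^2 \right) . &\text{(9.47b)}
\end{aligned} $$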
We can express an arbitrary polarisation state as the sum over the polarisation tensors for
circularly polarised waves,
$$ h_{\mu\nu} = \sum_{a=+,-} h^{(a)} \varepsilon^{(a)}_{\mu\nu} . \tag{9.50} $$
Inserting this decomposition into (9.49) and using $\varepsilon^{(a)}_{\mu\nu} \varepsilon^{\mu\nu(b)} = \delta^{ab}$, the action becomes
$$ S^{\rm TT}_{\rm EH} = -\frac{1}{32\pi G} \sum_a \int d^4x\, \frac12 \left( \partial_\rho h^{(a)} \right)^2 . \tag{9.51} $$
Thus the gravitational action in the TT gauge consists of two degrees of freedom, $h_+$ and
$h_-$, which determine the contribution of left- and right-circularly polarised waves. Apart from
the pre-factor, the action is the same as the one of two scalar fields. This means that we can
shortcut many calculations involving gravitational waves by using simply the corresponding
results for scalar fields. We can understand this equivalence by recalling that the part of the
action quadratic in the fields just enforces the relativistic energy–momentum relation
via a Klein–Gordon equation for each field component. The remaining content of (9.48) is
just the rule for how the unphysical components in $h_{\mu\nu}$ have to be eliminated. In the TT gauge,
we have already applied this information, and thus the two scalar wave equations for $h^{(\pm)}$
summarise the Einstein equation at $\mathcal{O}(h^2)$.
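Averaged stress tensor The stress tensor of a scalar field is in general given by
$$ T_{\alpha\beta} = \frac{2}{\sqrt{|g|}} \frac{\delta S_m}{\delta g^{\alpha\beta}} = \partial_\alpha\phi\, \partial_\beta\phi - g_{\alpha\beta} \mathcal{L} . \tag{9.52} $$
We consider now a free field, i.e. set $V(\phi) = 0$, and take the average over a volume $\Omega$
large compared to the typical wavelength of the field,
$$ \langle T_{\alpha\beta} \rangle = \frac{1}{\Omega} \int d^4x\, T_{\alpha\beta} = \langle \partial_\alpha\phi\, \partial_\beta\phi \rangle - \frac12 \eta_{\alpha\beta} \langle (\partial_\rho\phi)^2 \rangle . \tag{9.53} $$
Performing a partial integration of the second term, we can drop the surface term, and use
then the equations of motion,
Hence $\langle T_{\alpha\beta} \rangle = \langle \partial_\alpha\phi\, \partial_\beta\phi \rangle$. Comparing now $S_{\rm KG}$ and $S_{\rm EH}$ in the TT gauge suggests that the
averaged stress pseudo-tensor of the gravitational field is given in this gauge by
$$ \langle t_{\alpha\beta} \rangle = \frac{1}{32\pi G} \langle \partial_\alpha h_{ij}\, \partial_\beta h^{ij} \rangle . \tag{9.55} $$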
Quadrupole formula Gravitational waves in the linearized approximation fulfil the
superposition principle. Hence, if the solution for a point source is known,
the general solution can be obtained by integrating the Green function over the sources,
$$ \bar h_{\alpha\beta}(x) = -2\kappa \int d^4x'\, G(x - x')\, T_{\alpha\beta}(x') . \tag{9.57} $$
The Green function $G(x - x')$ is not completely specified by Eq. (9.56): We can add solutions
of the homogeneous wave equation and we have to specify how the poles of $G(x - x')$ are
treated. In classical physics, one chooses the retarded Green function $G^{(+)}(x - x')$ defined by
$$ G^{(+)}(x - x') = -\frac{1}{4\pi |\mathbf{x} - \mathbf{x}'|}\, \delta\left[ |\mathbf{x} - \mathbf{x}'| - (t - t') \right] \vartheta(t - t') , \tag{9.58} $$
picking up the contributions along the past light-cone; for a derivation see appendix 9.B.
Inserting the retarded Green function into Eq. (9.57), we can perform the time integral
using the delta function and obtain
$$ \bar h_{\alpha\beta}(x) = 4G \int d^3x'\, \frac{T_{\alpha\beta}(t - |\mathbf{x} - \mathbf{x}'|, \mathbf{x}')}{|\mathbf{x} - \mathbf{x}'|} . \tag{9.59} $$
The retarded time $t_r \equiv t - |\mathbf{x} - \mathbf{x}'|$ denotes the emission time of a signal emitted at $\mathbf{x}'$ that
reaches $\mathbf{x}$ at time $t$ propagating with the speed of light.
We perform now a Fourier transformation from time to angular frequency,
$$ \bar h_{\alpha\beta}(\omega, \mathbf{x}) = \frac{1}{\sqrt{2\pi}} \int dt\, e^{i\omega t}\, \bar h_{\alpha\beta}(t, \mathbf{x}) = \frac{4G}{\sqrt{2\pi}} \int dt \int d^3x'\, e^{i\omega t}\, \frac{T_{\alpha\beta}(t_r, \mathbf{x}')}{|\mathbf{x} - \mathbf{x}'|} . \tag{9.60} $$
9.3 Emission of gravitational waves
Next we want to eliminate all elements of $T_{\alpha\beta}$ except $T_{00}$. We use first (flat-space)
energy-momentum conservation,
$$ \frac{\partial}{\partial t} T^{00} + \frac{\partial}{\partial x^b} T^{0b} = 0 , \tag{9.65a} $$
$$ \frac{\partial}{\partial t} T^{a0} + \frac{\partial}{\partial x^b} T^{ab} = 0 . \tag{9.65b} $$
Then we differentiate Eq. (9.65a) with respect to time and use Eq. (9.65b), obtaining
$$ \frac{\partial^2}{\partial t^2} T^{00} = -\frac{\partial^2}{\partial x^b \partial t} T^{0b} = \frac{\partial^2}{\partial x^a \partial x^b} T^{ab} . \tag{9.66} $$
Here we dropped also surface terms, using that the source is compact. In the harmonic gauge,
we need to calculate only the components $h_{ij}(t_r, \mathbf{x})$. We define as quadrupole moment of the
source stress tensor
$$ I^{ab}(t_r) = \int d^3x\, x^a x^b\, T^{00}(t_r, \mathbf{x}) . \tag{9.68} $$
Then the quadrupole formula for the emission of gravitational waves results,
$$ \bar h_{ab}(t, \mathbf{x}) = \frac{2G}{c^6 r}\, \ddot I_{ab}(t_r) , \tag{9.69} $$
where we added also $c$. Since $h_{\alpha\beta}$ is traceless, the trace of $I^{\alpha\beta}$ does not produce gravitational
waves: It is connected to the dipole moment and its time derivative vanishes because of
Our derivation neglected perturbations of flat space and therefore seems not applicable to a
self-gravitating system. However, our final result depends only on the motion of the particles,
not on how it is produced. An analysis at next order in perturbation theory shows indeed that
our result applies to self-gravitating systems like binary stars.
Note the following peculiarity of a gravitational wave experiment: Such an experiment
measures the amplitude $h_{ab} \propto 1/r$ of a metric perturbation, while the sensitivity of all other
experiments (light, neutrinos, cosmic rays, . . . ) is proportional to the energy flux $\propto 1/r^2$ of
radiation. This difference is connected to the fact that a gravitational wave is caused by the
coherent motion of the source, and can thus be observed as a coherent wave over time. In
particular, one can measure the phase of $h_{ab}$ as function of time. In contrast, light observed
from an astrophysical source is an incoherent superposition of individual photons. As a result,
increasing the sensitivity of a gravitational wave detector by a factor ten increases the number
of potential sources by a factor 1000, in contrast to a factor $10^{3/2}$ for other detectors.
One may wonder if this behaviour contradicts the fact that the energy flux of a
gravitational wave also follows a $1/r^2$ law. However, the energy dissipated from a gravitational wave
crossing the Earth (including our experimental set-up) is extremely tiny, while the energy
density of a gravitational wave with an amplitude as small as $h \sim 10^{-22}$ is surprisingly large (check it
e.g. with (9.75)).
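The suggested check can be sketched numerically. Since only Eqs. (9.72) and (9.73) are used here, the following is an illustration under stated assumptions rather than the evaluation intended with (9.75): a monochromatic TT wave propagating along $z$, for which $A_{ij}A^{ij} = 2(h_+^2 + h_\times^2)$, factors of $c$ restored so that the flux is $c^3\omega^2 A_{ij}A^{ij}/(64\pi G)$ in SI units, and an illustrative frequency of 100 Hz. The function name `gw_energy_flux` is hypothetical.

```python
import math

G = 6.674e-11   # Newton's constant in m^3 kg^-1 s^-2
C = 2.998e8     # speed of light in m/s


def gw_energy_flux(h_plus, h_cross, frequency_hz):
    """Averaged energy flux c * t^{00} in W/m^2, from Eqs. (9.72)-(9.73) with c restored."""
    omega = 2.0 * math.pi * frequency_hz
    # A_ij A^ij for a TT wave along z: A_11 = -A_22 = h_plus, A_12 = A_21 = h_cross
    a_squared = 2.0 * (h_plus**2 + h_cross**2)
    return C**3 * omega**2 * a_squared / (64.0 * math.pi * G)


flux = gw_energy_flux(1e-22, 0.0, 100.0)   # amplitude from the text, 100 Hz assumed
energy_density = flux / C                   # J/m^3
print("flux [W/m^2]          :", flux)
print("energy density [J/m^3]:", energy_density)
```

The large prefactor $c^3/G$ is what makes the flux sizeable even for tiny amplitudes; the flux scales as $h^2\omega^2$.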
with amplitudes $A_{ij}$ which we choose to be real. Using $\langle \sin^2(kx) \rangle = 1/2$, we obtain
$$ \langle t_{\alpha\beta} \rangle = \frac{1}{64\pi G}\, k_\alpha k_\beta\, A_{ij} A^{ij} . \tag{9.72} $$
The energy flux $\mathcal{F}$, i.e. the energy crossing a unit area per unit time, in the direction $\mathbf{n}$ is
in general $\mathcal{F} = c\, t^{0i} n_i$. For a plane wave with wave-vector $k^\mu$, it follows
$$ \mathcal{F} = t^{0i} \hat k_i = \frac{1}{64\pi G}\, k^0 k^i \hat k_i\, A_{ij} A^{ij} = c\, t^{00} , \tag{9.73} $$
where we used $k^0 = -k^i \hat k_i$. Thus we obtain the reasonable result that the energy flux is simply
the energy density $t^{00}$ multiplied by the wave speed $c$. Expressing $h_{\mu\nu}$ as the sum over linearly
polarised waves,
$$ h_{\mu\nu} = \sum_{a=+,\times} h^{(a)} \varepsilon^{(a)}_{\mu\nu} . \tag{9.74} $$
9.4 Gravitational waves from binary systems
In the case of a spherical wave emitted from the origin, we choose n = er . Then
1 1 c5
F(er ) = ht0i ni i = h(∂t hij )(n · ∇)hij i = h(∂t hij )(∂r hij )i . (9.76)
64πG 64πG 64πG
Inserting the quadrupole formula, one finds (cf. the appendix for details)
$$ L_{\rm gr} = -\frac{dE}{dt} = -\int d\Omega\, r^2\, \mathcal{F}(\mathbf{e}_r) = \frac{G}{5c^5}\, \dddot Q_{ij}\, \dddot Q^{ij} , \tag{9.77} $$
where we added $c$.
Kepler problem We start by recalling the basic formulae of the Kepler problem. For a system
of two stars with total mass $M = m_1 + m_2$ we introduce the reduced mass
$$ \mu = \frac{m_1 m_2}{m_1 + m_2} \tag{9.78} $$
and c.m. coordinates.
$$ u'' + u = \frac{GM\mu^2}{L^2} . \tag{9.79} $$
Inserting as trial solution the equation of a conic section,
$$ u = \frac{1}{r} = \frac{1 + e\cos\vartheta}{a(1 - e^2)} , \tag{9.80} $$
we find
$$ u'' + u = \frac{1 - e\cos\vartheta + e\cos\vartheta}{a(1 - e^2)} \overset{!}{=} \frac{GM\mu^2}{L^2} . \tag{9.81} $$
Thus we obtain as constraint for the angular momentum
$$ L = \mu \sqrt{GMa(1 - e^2)} . \tag{9.82} $$
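As a numerical cross-check of Eqs. (9.79) to (9.82), the sketch below evaluates $u'' + u$ for the conic section with a central finite difference and compares it with $GM\mu^2/L^2$ for the angular momentum of Eq. (9.82). The masses, semi-major axis and eccentricity are illustrative choices in units with $G = 1$, and the function names are hypothetical.

```python
import math


def u_conic(theta, a, e):
    """Conic section u = 1/r of Eq. (9.80)."""
    return (1.0 + e * math.cos(theta)) / (a * (1.0 - e**2))


def lhs_orbit_equation(theta, a, e, step=1e-4):
    """Return u'' + u, with u'' approximated by a central finite difference."""
    second = (u_conic(theta + step, a, e) - 2.0 * u_conic(theta, a, e)
              + u_conic(theta - step, a, e)) / step**2
    return second + u_conic(theta, a, e)


def angular_momentum(mu, total_mass, a, e, grav_const=1.0):
    """Angular momentum L of Eq. (9.82)."""
    return mu * math.sqrt(grav_const * total_mass * a * (1.0 - e**2))


m1, m2, a, e = 1.44, 1.34, 3.0, 0.617      # illustrative values, G = 1
M = m1 + m2
mu = m1 * m2 / M
L = angular_momentum(mu, M, a, e)
rhs = M * mu**2 / L**2                      # right-hand side of Eq. (9.79)

# Residual of the orbit equation sampled over one revolution.
residuals = [abs(lhs_orbit_equation(0.1 * k, a, e) - rhs) for k in range(63)]
print("largest residual:", max(residuals))
```

The residual is limited only by the finite-difference step, since the $e\cos\vartheta$ terms cancel exactly as in Eq. (9.81).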
Gravitational wave emission In the first step, we derive the instantaneous energy loss of
the binary system due to gravitational wave emission. Since we assume that the losses are
small, we can treat the orbital parameters $a$ and $e$ as constant. The quadrupole moments
follow as
In order to find the derivatives of $I_{ik}$, we have to determine first $\dot r$ and $\dot\phi$. Eliminating $L$ using
Eq. (9.82) we obtain
$$ \dot\phi = \frac{L}{\mu r^2} = \frac{[a(1 - e^2)M]^{1/2}}{r^2} . \tag{9.84} $$
$$ \dot r = \frac{a(1 - e^2)\, e \sin\phi\, \dot\phi}{(1 + e\cos\phi)^2} = \left[ \frac{M}{a(1 - e^2)} \right]^{1/2} e \sin\phi . \tag{9.85} $$
$$ \dot I_{xx} = 2\mu \cos\phi \left( r\dot r \cos\phi - r^2 \dot\phi \sin\phi \right) . \tag{9.86} $$
With
and
$$ r\dot r = A\, \frac{e\sin\phi}{1 + e\cos\phi} , \tag{9.88} $$
it follows
$$ r\dot r\cos\phi - r^2\dot\phi\sin\phi = A\sin\phi \left( \frac{e\cos\phi}{1 + e\cos\phi} - 1 \right) = -\left[ \frac{M}{a(1 - e^2)} \right]^{1/2} r \sin\phi . \tag{9.89} $$
Thus we obtain
$$ \dot I_{xx} = -\frac{2 m_1 m_2\, r}{[a(1 - e^2)M]^{1/2}} \cos\phi \sin\phi . \tag{9.90} $$
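The calculation of the other elements and the higher derivatives proceeds in the same way,
leading to
$$ \begin{aligned}
\ddot I_{xx} &= -\frac{2 m_1 m_2}{a(1 - e^2)} \left( \cos 2\phi + e\cos^3\phi \right) , &\text{(9.91a)} \\
\dddot I_{xx} &= \frac{2 m_1 m_2}{a(1 - e^2)} \left( 2\sin 2\phi + 3e\cos^2\phi \sin\phi \right) \dot\phi , &\text{(9.91b)} \\
\dot I_{yy} &= \frac{2 m_1 m_2\, r}{[a(1 - e^2)M]^{1/2}} \left( \cos\phi\sin\phi + e\sin\phi \right) , &\text{(9.91c)} \\
\ddot I_{yy} &= \frac{2 m_1 m_2}{a(1 - e^2)} \left( \cos 2\phi + e\cos\phi + e\cos^3\phi + e^2 \right) , &\text{(9.91d)} \\
\dddot I_{yy} &= -\frac{2 m_1 m_2}{a(1 - e^2)} \left( 2\sin 2\phi + e\sin\phi + 3e\cos^2\phi \sin\phi \right) \dot\phi , &\text{(9.91e)} \\
\dot I_{xy} &= \frac{m_1 m_2\, r}{[a(1 - e^2)M]^{1/2}} \left( \cos^2\phi - \sin^2\phi + e\cos\phi \right) , &\text{(9.91f)} \\
\ddot I_{xy} &= -\frac{2 m_1 m_2}{a(1 - e^2)} \left( \sin 2\phi + e\sin\phi + e\sin\phi\cos^2\phi \right) , &\text{(9.91g)} \\
\dddot I_{xy} &= -\frac{2 m_1 m_2}{a(1 - e^2)} \left( 2\cos 2\phi - e\cos\phi + 3e\cos^3\phi \right) \dot\phi , &\text{(9.91h)} \\
\dddot I &= \dddot I_{xx} + \dddot I_{yy} = -\frac{2 m_1 m_2}{a(1 - e^2)}\, e\sin\phi\, \dot\phi . &\text{(9.91i)}
\end{aligned} $$
Inserting these expressions into
$$ L_{\rm gr} = -\frac{dE}{dt} = \frac{G}{5}\, \dddot Q_{ij}\, \dddot Q^{ij} = \frac{G}{5} \left( \dddot I_{xx}^{\,2} + 2\, \dddot I_{xy}^{\,2} + \dddot I_{yy}^{\,2} - \frac13\, \dddot I^{\,2} \right) \tag{9.92} $$
results in
$$ -\frac{dE}{dt} = \frac{8 m_1^2 m_2^2}{15 a^2 (1 - e^2)^2} \left[ 12 (1 + e\cos\phi)^2 + e^2 \sin^2\phi \right] \dot\phi^2 \tag{9.93} $$
for the instantaneous energy loss. In order to obtain the average energy loss, we have to average
this expression over one period,
$$ -\frac{dE}{dt} = -\frac{1}{T} \int_0^T dt\, \frac{dE}{dt} = -\frac{1}{T} \int_0^{2\pi} \frac{d\phi}{\dot\phi}\, \frac{dE}{dt} = \frac{32}{5}\, \frac{m_1^2 m_2^2 M}{a^5}\, f(e) \tag{9.94} $$
with
$$ f(e) = \frac{1 + \frac{73}{24} e^2 + \frac{37}{96} e^4}{(1 - e^2)^{7/2}} . \tag{9.95} $$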
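The orbit average leading from Eq. (9.93) to Eqs. (9.94) and (9.95) can be checked numerically. The sketch below integrates the instantaneous loss over one revolution using $dt = d\phi/\dot\phi$ and compares the result with the closed form. It works in units $G = c = 1$ with illustrative parameter values, and it assumes Kepler's third law, $T = 2\pi (a^3/M)^{1/2}$, for the period, which is not stated explicitly in the text above. Function names are hypothetical.

```python
import math


def f_enhancement(e):
    """Eccentricity enhancement factor f(e) of Eq. (9.95)."""
    return (1.0 + 73.0 / 24.0 * e**2 + 37.0 / 96.0 * e**4) / (1.0 - e**2) ** 3.5


def averaged_loss_numeric(m1, m2, a, e, n=2000):
    """Average Eq. (9.93) over one period via dt = dphi / phidot (G = c = 1)."""
    total_mass = m1 + m2
    p = a * (1.0 - e**2)
    period = 2.0 * math.pi * math.sqrt(a**3 / total_mass)   # Kepler's third law (assumed)
    dphi = 2.0 * math.pi / n
    integral = 0.0
    for k in range(n):
        phi = (k + 0.5) * dphi                               # midpoint rule
        r = p / (1.0 + e * math.cos(phi))
        phidot = math.sqrt(p * total_mass) / r**2            # Eq. (9.84)
        bracket = 12.0 * (1.0 + e * math.cos(phi)) ** 2 + (e * math.sin(phi)) ** 2
        loss = 8.0 * m1**2 * m2**2 / (15.0 * p**2) * bracket * phidot**2   # Eq. (9.93)
        integral += loss / phidot * dphi
    return integral / period


def averaged_loss_closed(m1, m2, a, e):
    """Closed-form average of Eq. (9.94)."""
    return 32.0 / 5.0 * m1**2 * m2**2 * (m1 + m2) / a**5 * f_enhancement(e)


numeric = averaged_loss_numeric(1.44, 1.34, 3.0, 0.617)
closed = averaged_loss_closed(1.44, 1.34, 3.0, 0.617)
print(numeric, closed)
```

Because the integrand is a trigonometric polynomial in $\phi$, the midpoint rule reproduces the closed form to rounding accuracy.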
Time evolution Now we can determine how the orbital parameters change over time. The
major axis $a$ decreases with time as
$$ \frac{da}{dt} = \frac{m_1 m_2}{2E^2}\, \frac{dE}{dt} = \frac{2a^2}{m_1 m_2}\, \frac{dE}{dt} , \tag{9.96} $$
or averaged over one period,
$$ -\frac{da}{dt} = \frac{64}{5}\, \frac{m_1 m_2 M}{a^3}\, f(e) . \tag{9.97} $$
$$ \frac{\dot P}{P} = -\frac32\, \frac{\dot E}{E} = \frac32\, \frac{\dot a}{a} = -\frac{96}{5}\, \frac{m_1 m_2 M}{a^4}\, f(e) . \tag{9.98} $$
What remains to be done is to work out the change of the eccentricity,
$$ \frac{de}{dt} = \frac{M}{m_1^3 m_2^3\, e} \left( L^2\, \frac{dE}{dt} + 2EL\, \frac{dL}{dt} \right) . \tag{9.99} $$
Determining the loss of angular momentum $L$ due to gravitational wave emission is more
involved than the energy loss: Since $\mathbf{L} = \mathbf{r} \times \mathbf{p}$ contains a factor $r$, we have to take into
account terms $h \propto 1/r^2$, which requires including terms of $\mathcal{O}(h^3)$ in $\langle t_{\mu\nu} \rangle$. Therefore we simply
cite the result obtained by Peters 1964 [3, 4],
$$ -\frac{dL}{dt} = \frac{2G}{5}\, \varepsilon^{3ij}\, \ddot Q_i{}^k\, \dddot Q_{jk} . \tag{9.100} $$
Then the instantaneous loss of angular momentum follows as
$$ -\frac{dL}{dt} = \frac{2G}{5} \left[ \ddot I_{xy} \left( \dddot I_{yy} - \dddot I_{xx} \right) + \dddot I_{xy} \left( \ddot I_{xx} - \ddot I_{yy} \right) \right] , \tag{9.101} $$
leading to
$$ -\frac{de}{dt} = \frac{304}{15}\, \frac{m_1 m_2 M e}{a^4}\, g(e) \tag{9.102} $$
with
$$ g(e) = \frac{1 + \frac{121}{304} e^2}{(1 - e^2)^{5/2}} . \tag{9.103} $$
Hulse-Taylor pulsar The binary system found by Hulse and Taylor consists of a pulsar
with mass $m_1 = 1.44 M_\odot$ and a companion with mass $m_2 = 1.34 M_\odot$. Their orbital period
is $P = 7\,{\rm h}\,40\,{\rm min}$ on an orbit with rather strong eccentricity, $e = 0.617$. In this case, the
emission of gravitational radiation is strongly enhanced compared to a circular orbit. Let
us now compare the observed change in the orbit of the binary with the prediction of general
relativity. The prediction of Einstein's general relativity,
$$ \dot P_{\rm th}(e) = f(e)\, \dot P(0) \simeq 11.7 \times \dot P(0) \simeq (-2.403 \pm 0.002) \times 10^{-12} , \tag{9.104} $$
A comparison of the predicted and observed accumulated shift in the period is shown in Fig. ??.
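The order of magnitude of this prediction can be reproduced with a few lines. The sketch below is a rough estimate only: it takes $\dot P(0)$ from the circular-orbit expression of Eq. (9.109), assuming the exponent $8/3$ on $2\pi$ that follows from Kepler's third law and restoring a factor $1/c^5$, multiplies by $f(e)$ from Eq. (9.95), and uses the rounded masses, period and eccentricity quoted above together with standard SI constants. Since these inputs are rounded, agreement with Eq. (9.104) is expected only at the level of a few percent. The function names are hypothetical.

```python
import math

G = 6.674e-11        # Newton's constant in m^3 kg^-1 s^-2
C = 2.998e8          # speed of light in m/s
M_SUN = 1.989e30     # solar mass in kg


def f_enhancement(e):
    """Eccentricity enhancement factor f(e) of Eq. (9.95)."""
    return (1.0 + 73.0 / 24.0 * e**2 + 37.0 / 96.0 * e**4) / (1.0 - e**2) ** 3.5


def period_derivative(m1, m2, period, e):
    """Dimensionless Pdot = f(e) * Pdot(e=0), with Pdot(0) as in Eq. (9.109) and c restored."""
    total_mass = m1 + m2
    mu = m1 * m2 / total_mass
    circular = (-96.0 / 5.0 * (2.0 * math.pi) ** (8.0 / 3.0)
                * G ** (5.0 / 3.0) * mu * total_mass ** (2.0 / 3.0)
                * period ** (-5.0 / 3.0) / C**5)
    return f_enhancement(e) * circular


period_s = 7 * 3600 + 40 * 60                       # 7 h 40 min in seconds
pdot = period_derivative(1.44 * M_SUN, 1.34 * M_SUN, period_s, 0.617)
print("f(e) =", f_enhancement(0.617))
print("Pdot =", pdot)
```

The result is most sensitive to the period and the eccentricity, so more precise orbital parameters would be needed for a comparison at the accuracy quoted in Eq. (9.104).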
a Lagrange function only of coordinates and velocities but including post-Newtonian (PN)
corrections up to order $(v/c)^4$. The first relativistic terms, at the 1PN order, were derived in
1937–39, the 2PN approximation was tackled by Ohta et al. in 1973–74, while results for the
3PN order were obtained starting from 1998. Alternatives to this brute-force approach such
as the effective one-body theory have been developed, where one maps the two-body problem
of GR onto a one-body problem in an effective metric. However, all these approaches are
restricted to the inspiral phase of a merger. In contrast, numerical simulations of the merging
phase of binaries give accurate results, but can take months even on super-computers. Thus
their extension towards early times of the inspiral phase is restricted, and the set of parameters
$\{m_1, m_2, s_1, s_2, e, \ldots\}$ for which simulations exist is sparse. For instance, simulations for large
mass ratios $m_1/m_2$ are numerically still prohibitive. As a result, a combination of the different
approaches is needed to describe the coalescence of binary systems accurately.
Qualitative discussion Let us now discuss qualitatively the final stage in the time evolution
of a close binary system. We can assume that the emission of GWs has led to a circularisation
of the orbits. Then
$$ L_{\rm gw} = \frac{32}{5}\, \frac{G^4 \mu^2 M^3}{a^5} . \tag{9.106} $$
Next we can relate the relative changes per time in the orbital period $P$, the separation $a$ and
the energy $E$ using $E \propto 1/a$ and $P \propto a^{3/2}$ as
$$ \frac{\dot E}{E} = -\frac{\dot a}{a} = -\frac23\, \frac{\dot P}{P} . \tag{9.107} $$
Solving first for the change in the period, with $\dot E = -L_{\rm gw}$ and $E = -G\mu M/(2a)$,
$$ \dot P = \frac32\, \frac{L_{\rm gw}}{E}\, P = -\frac{96}{5}\, \frac{G^3 \mu M^2}{a^4}\, P , \tag{9.108} $$
and eliminating then $a$ gives
$$ \dot P = -\frac{96}{5}\, (2\pi)^{8/3}\, G^{5/3} \mu M^{2/3} P^{-5/3} . \tag{9.109} $$
Combining (9.107) and (9.108), we obtain
$$ \dot a = \frac23\, \frac{\dot P}{P}\, a = -\frac{64}{5}\, \frac{G^3 \mu M^2}{a^3} . \tag{9.110} $$
Separating variables and integrating, we find
$$ a^4 = \frac{256}{5}\, G^3 \mu M^2\, (t_c - t) . \tag{9.111} $$
Here, $t_c$ denotes the (theoretical) coalescence time for point-like stars. With the initial
condition $a(t = 0) = a_0$, it follows
$$ a(t) = a_0 \left( 1 - \frac{t}{t_c} \right)^{1/4} \tag{9.112} $$
and
$$ t_c = \frac{5}{256}\, \frac{a_0^4}{G^3 \mu M^2} . \tag{9.113} $$
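To get a feeling for the time scales, Eqs. (9.112) and (9.113) can be evaluated numerically. The sketch below restores the factor $c^5$ that makes $t_c$ a time in SI units; the neutron-star masses and the initial separation are illustrative assumptions, not values taken from the text, and the function names are hypothetical.

```python
G = 6.674e-11        # Newton's constant in m^3 kg^-1 s^-2
C = 2.998e8          # speed of light in m/s
M_SUN = 1.989e30     # solar mass in kg
YEAR = 3.156e7       # seconds per year


def coalescence_time(a0, m1, m2):
    """Coalescence time t_c of Eq. (9.113) in seconds, with c^5 restored."""
    total_mass = m1 + m2
    mu = m1 * m2 / total_mass
    return 5.0 * C**5 * a0**4 / (256.0 * G**3 * mu * total_mass**2)


def separation(t, a0, t_c):
    """Orbital separation a(t) of Eq. (9.112), valid for 0 <= t <= t_c."""
    return a0 * (1.0 - t / t_c) ** 0.25


a0 = 1.0e9                                            # m, illustrative
t_c = coalescence_time(a0, 1.4 * M_SUN, 1.4 * M_SUN)
print("t_c [yr]      :", t_c / YEAR)
print("a(t_c/2) / a0 :", separation(0.5 * t_c, a0, t_c) / a0)
```

The steep dependence $t_c \propto a_0^4$ means that halving the separation shortens the remaining lifetime by a factor of 16, so most of the inspiral time is spent at large separations.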
Figure 9.3: Example of waveforms from black hole (upper) and neutron star (lower panel)
binaries.
As a rule of thumb, our approximations (slow velocities and weak fields) break down at
$r \simeq r_{\rm ISCO}$. Since the last stage of the merger is fast, the estimate (9.113) is quite reliable.
From the exercise, we know that the amplitude is
$$ h_{ij} = \frac{4G\mu M}{a r}\, A_{ij} = \frac{h}{r}\, A_{ij} , \tag{9.114} $$
where the non-zero amplitudes are $A_{ij} \propto \sin(2\omega t + \phi)$. Thus the emitted gravitational wave
is monochromatic, with frequency twice the orbital frequency of the binaries,
$$ \nu_{\rm GW} = \frac{2\omega}{2\pi} = \frac{2}{P} = \frac{(GM)^{1/2}}{\pi a^{3/2}} = \nu_0 \left( 1 - \frac{t}{t_c} \right)^{-3/8} . \tag{9.115} $$
Moreover, the amplitude of the gravitational wave signal increases with time as
$$ h(t) \propto \frac{1}{a} \propto (t_c - t)^{-1/4} . \tag{9.116} $$
Expressed as function of the frequency $\nu_{\rm GW}$, the amplitude becomes
$$ \mathcal{M} \equiv \mu^{3/5} M^{2/5} = \frac{(m_1 m_2)^{3/5}}{(m_1 + m_2)^{1/5}} , \tag{9.118} $$
9.A Appendix: Projection operator
which is the combination of the masses $m_1$ and $m_2$ easiest to extract from the gravitational
wave signal. Finally, we have to replace the instantaneous phase in the polarisation tensor by
the time-integrated phase, since $\omega$ depends on time,
$$ \Phi(t) = \int dt\, 2\omega = -2 \left( \frac{t_c - t}{5G\mathcal{M}} \right)^{5/8} + \phi_0 . \tag{9.119} $$
Thus both the amplitude and the phase evolution of the gravitational wave signal provide
information on the chirp mass $\mathcal{M}$.
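The two equivalent forms of the chirp mass in Eq. (9.118) are easy to evaluate. The following minimal sketch (function names are hypothetical, and the masses are the rounded Hulse-Taylor values quoted earlier, in solar masses) computes both and illustrates that for equal masses the chirp mass is $2^{-1/5} \approx 0.87$ times the individual mass.

```python
def chirp_mass(m1, m2):
    """Chirp mass (m1 m2)^(3/5) / (m1 + m2)^(1/5), cf. Eq. (9.118)."""
    return (m1 * m2) ** 0.6 / (m1 + m2) ** 0.2


def chirp_mass_from_reduced(m1, m2):
    """Equivalent form mu^(3/5) M^(2/5) in terms of reduced and total mass."""
    total_mass = m1 + m2
    mu = m1 * m2 / total_mass
    return mu ** 0.6 * total_mass ** 0.4


print(chirp_mass(1.44, 1.34), chirp_mass_from_reduced(1.44, 1.34))
print(chirp_mass(30.0, 30.0) / 30.0)
```

Because only this combination enters the leading-order amplitude and phase, the individual masses cannot be separated from the inspiral signal at this order.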
A typical wave-form of the merger of a black hole binary is shown in the upper panel
of Fig. 9.3. It consists of the waves emitted during the inspiral (“the chirp”), the merger,
and the ring-down. In this last phase, oscillations of the BH formed during the merger are
damped by the emission of GWs and decay exponentially, leading to a standard Kerr BH. The
frequencies and the damping times of the eigenmodes of a BH can be calculated, and thus the
ring-down provides additional opportunities to test GR. The lower panel of Fig. 9.3 shows
a typical wave-form of a neutron star merger: When tidal interactions start to deform the
neutron stars, the gravitational wave signal is not monochromatic anymore and the structure
of the stars has to be accounted for.
$P_\pm^2 = P_\pm$, $P_\pm P_\mp = 0$, and $P_+ + P_- = 1$,
$$ P_i{}^j = \delta_i^j - n_i n^j . \tag{9.120} $$
Moreover, $n^i P_i{}^j v_j = 0$ for all vectors $v$. Thus $P$ indeed projects any vector onto the subspace
orthogonal to $n$. Since a tensor is a multi-linear map, we have to apply a projection operator
on each of its indices,
$$ M^{T}_{kl} = P_k{}^i P_l{}^j M_{ij} . \tag{9.122} $$
The tensor $M^T_{kl}$ is transverse, $n^k M^T_{kl} = n^l M^T_{kl} = 0$, but in general not traceless
The integral on the RHS is trivial, and the one on the LHS defines G(r, ω). Next we perform
the time derivatives, obtaining
9.B Appendix: Derivation of the retarded Green function
is
$$ G_\omega(r) = \frac{A e^{ikr}}{r} + \frac{B e^{-ikr}}{r} \equiv A\, G^{(+)}_\omega(r) + B\, G^{(-)}_\omega(r) . \tag{9.134} $$
Thus the solution consists of out- and in-going spherical waves. Next we consider the limit
$r \to 0$ (or the static limit) of the wave equation. Integrating over a small sphere of radius $r$,
we obtain
$$ \int d^3x\, \Delta G(r, \omega) = \oint dS_i\, \partial_i G(r, \omega) = 4\pi r^2\, \partial_r G(r, \omega) = 1 . \tag{9.135} $$
Here, we used Gauss' theorem to convert the volume into a surface integral, while we could
neglect $\int d^3x\, \omega^2 G(r, \omega) \propto r^3$ for $r \to 0$. Moreover, we used that the integral over the delta
function on the RHS gives one. Thus the Green function for small $r$ satisfies
$$ G(r, \omega) = -\frac{1}{4\pi r} + C . \tag{9.136} $$
Comparing this to Eq. (9.134) fixes $A + B = -1/(4\pi)$ and $C = 0$. Finally, we transform back to
time,
$$ G^{(\pm)}(r, t) = \int \frac{d\omega}{2\pi}\, G^{(\pm)}(r, \omega)\, e^{-i\omega\tau} = -\frac{1}{4\pi r} \int \frac{d\omega}{2\pi}\, e^{-i\omega(\tau \mp r)} , \tag{9.137} $$
where we used $\omega = |\mathbf{k}|$. Then it follows
$$ G^{(\pm)}(r, t) = -\frac{\delta(\tau \mp r)}{4\pi r} . \tag{9.138} $$
The delta function enforces $\tau = \pm r$. Since $r > 0$, the Green function $G^{(+)}$ includes only
sources with $\tau > 0$, i.e. along the past light-cone of the observer at $\{t, \mathbf{x}\}$, while the Green
function $G^{(-)}$ includes only sources with $\tau < 0$, i.e. along the future light-cone.
Finally, we comment on the differences between the classical and the quantum case:
• In classical physics, we use only positive energy solutions and the causal propagator
is the retarded one, which propagates these solutions forward in time. A relativistic
quantum theory contains in addition negative energy solutions. The causal or Feynman
propagator then propagates positive energy solutions (particles) forward, and negative energy
solutions (antiparticles) backward in time, in a way consistent with the CPT theorem.
• In the classical case, one eliminates the gauge freedom completely such that only physical
degrees of freedom propagate. Then it is sufficient to use a scalar Green function,
which propagates the physical polarisation states in the same way. Such gauges (like
the Coulomb or TT gauge) are however valid only in a specific frame. Therefore one
prefers in the quantum case a covariant gauge (like the Lorenz or harmonic gauge). These
gauges include also the instantaneous Coulomb or Newtonian interactions. The Green
function of a tensor of rank $n$ becomes then a tensor of rank $2n$.
10 Cosmological models for an homogeneous,
isotropic universe
Weyl’s postulate In 1923, Hermann Weyl postulated the existence of a privileged class
of observers in the universe, namely those following the “average” motion of galaxies. He
postulated that these observers follow time-like geodesics that never intersect. They may
however diverge from a point in the (finite or infinite) past or converge towards such a point
in the future.
Weyl’s postulate implies that we can find coordinates such that galaxies are at rest. These
coordinates are called comoving coordinates and can be constructed as follows: One chooses
first a space-like hypersurface. Through each point in this hypersurface lies a unique worldline
of a privileged observer. We choose the coordinate time such that it agrees with the proper-
time of all observers, g00 = 1, and the spatial coordinate vectors such that they are constant
and lie in the tangent space T at this point. Then $u^a = \delta^a_0$ and for n ∈ T it follows $n^a = (0, \mathbf{n})$
and
$$0 = u_a n^a = g_{ab} u^a n^b = g_{0\beta} n^\beta . \tag{10.1}$$
The cosmological principle constrains further the form of dl2 : Homogeneity requires that
the gαβ can depend on time only via a common factor S(t), while isotropy requires that only
x · x, dx · x, and dx · dx enter dl². Hence
$$dl^2 = C(r)(\mathbf{x}\cdot d\mathbf{x})^2 + D(r)\, d\mathbf{x}\cdot d\mathbf{x} = C(r) r^2 dr^2 + D(r)[dr^2 + r^2 d\vartheta^2 + r^2 \sin^2\vartheta\, d\phi^2] . \tag{10.3}$$
We can eliminate the function D(r) by the rescaling r² → Dr². Thus the line-element becomes
$$dl^2 = S(t) \left[ B(r)\, dr^2 + r^2 d\Omega \right] \tag{10.4}$$
with dΩ = dϑ2 + sin2 ϑdφ2 , while B(r) is a function that we have still to specify.
10.1 Friedmann-Robertson-Walker metric for an homogeneous, isotropic universe
Maximally symmetric spaces are spaces with constant curvature. Hence the Riemann tensor
of such spaces can depend only on the metric tensor and a constant K specifying the
curvature. The only form that respects the (anti-)symmetries of the Riemann tensor is
$$R_{abcd} = K (g_{ac} g_{bd} - g_{ad} g_{bc}) . \tag{10.5}$$
Contracting $R_{abcd}$ with $g^{ac}$, we obtain in three dimensions for the Ricci tensor
$$R_{bd} = g^{ac} R_{abcd} = K g^{ac} (g_{ac} g_{bd} - g_{ad} g_{bc}) = K(3 g_{bd} - g_{bd}) = 2K g_{bd} . \tag{10.6}$$
A comparison of Eq. (10.6) with the Ricci tensor for the metric (10.4) will fix the still
unknown function B(r). We proceed in the standard way: Calculation of the Christoffel
symbols with the help of the geodesic equations, then use of the definition (8.6) for the Ricci
tensor,
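$$R_{rr} = \frac{1}{rB} \frac{dB}{dr} = 2K g_{rr} = 2KB \tag{10.8}$$
$$R_{\vartheta\vartheta} = 1 + \frac{r}{2B^2} \frac{dB}{dr} - \frac{1}{B} = 2K g_{\vartheta\vartheta} = 2K r^2 . \tag{10.9}$$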
(The φφ equation contains no additional information.) Integration of (10.8) gives
$$B = \frac{1}{A - K r^2} \tag{10.10}$$
with A as integration constant. Inserting the result into (10.9) determines A as A = 1. Thus
we have determined the line-element of a maximally symmetric 3-space with curvature K as
$$dl^2 = \frac{dr^2}{1 - K r^2} + r^2 (\sin^2\vartheta\, d\phi^2 + d\vartheta^2) . \tag{10.11}$$
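The determination of B(r) can be cross-checked numerically. The following minimal sketch (the function name `ricci_residuals` and the finite-difference step are illustrative choices, not part of the notes) evaluates the left-hand minus the right-hand sides of Eqs. (10.8) and (10.9) for the trial function B = 1/(A − Kr²); both residuals should vanish only for A = 1.

```python
def ricci_residuals(K, r, A=1.0, h=1e-6):
    """Residuals of Eqs. (10.8) and (10.9) for the trial function B = 1/(A - K r^2)."""
    def B(x):
        return 1.0 / (A - K * x * x)

    dB = (B(r + h) - B(r - h)) / (2.0 * h)        # central finite difference for dB/dr
    res_rr = dB / (r * B(r)) - 2.0 * K * B(r)     # Eq. (10.8): R_rr - 2 K B
    res_thth = (1.0 + r * dB / (2.0 * B(r) ** 2)  # Eq. (10.9): R_theta-theta - 2 K r^2
                - 1.0 / B(r) - 2.0 * K * r * r)
    return res_rr, res_thth


for K in (-1.0, 0.0, 1.0):
    print(K, ricci_residuals(K, r=0.5))           # both residuals close to zero
print(ricci_residuals(1.0, r=0.5, A=2.0))         # second residual equals 1 - A
```

For A ≠ 1 the rr equation is still satisfied, while the ϑϑ residual equals 1 − A, which is how Eq. (10.9) fixes the integration constant.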
Going over to the full four-dimensional line-element, we rescale for K ≠ 0 the r coordinate
by r → |K|^{1/2} r. Then we absorb the factor 1/|K| in front of dl² by defining the scale factor
R(t) as
$$R(t) = \begin{cases} S(t)/|K|^{1/2} , & K \neq 0 \\ S(t) , & K = 0 \end{cases} \tag{10.12}$$
As result we obtain the Friedmann-Robertson-Walker (FRW) metric for an homogeneous,
isotropic universe
$$ds^2 = dt^2 - R^2(t) \left[ \frac{dr^2}{1 - k r^2} + r^2 (\sin^2\vartheta\, d\phi^2 + d\vartheta^2) \right] \tag{10.13}$$
and gives for k = 0 a conformally flat metric. In the second one, one introduces r = sin χ for
k = 1. Then dr = cos χdχ = (1 − r 2 )1/2 dχ and
$$ds^2 = dt^2 - R^2(t) \left[ d\chi^2 + S^2(\chi) (\sin^2\vartheta\, d\phi^2 + d\vartheta^2) \right] \tag{10.15}$$
Note that the rescaling r → |K|1/2 r makes r dimensionless, while R has the dimension of a
length. Therefore one often introduces additionally a dimensionless scale factor a(t) ≡ R(t)/R0 .
Hence for k = 0, i.e. a flat space, one obtains the usual result L/l = 2π, while for k = 1
(spherical geometry) L/l = 2πr/ arcsin(r) < 2π and for k = −1 (hyperbolic geometry)
L/l = 2πr/arcsinh(r) > 2π.
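The three cases can be compared numerically. The sketch below simply evaluates the quoted ratios L/l for a given dimensionless coordinate radius r; the function name `circumference_over_radius` is a hypothetical helper introduced only for this illustration.

```python
import math


def circumference_over_radius(r, k):
    """Ratio L/l of circumference to proper radius for coordinate radius r (scale factor set to one)."""
    if k == 0:
        l = r                    # flat space
    elif k == 1:
        l = math.asin(r)         # spherical geometry, needs 0 < r <= 1
    elif k == -1:
        l = math.asinh(r)        # hyperbolic geometry
    else:
        raise ValueError("k must be -1, 0 or +1")
    return 2.0 * math.pi * r / l


for k in (1, 0, -1):
    print(k, circumference_over_radius(0.5, k) / (2.0 * math.pi))
```

The printed numbers are the ratios in units of 2π, so they are below one for k = 1, equal to one for k = 0 and above one for k = −1.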
For k = 0 and k = −1, l is unbounded, while for k = +1 there exists a maximal distance
lmax(t). Hence the first two cases correspond to open spaces with an infinite volume, while the
latter is a closed space with finite volume.
Hubble’s law Hubble found empirically that the spectral lines of “distant” galaxies are
redshifted, z = ∆λ/λ0 > 0, with a rate proportional to their distance d,
cz = H0 d . (10.19)
10.2 Geometry of the Friedmann-Robertson-Walker metric
If this redshift is interpreted as Doppler effect, z = ∆λ/λ0 = vr /c, then the recession velocity
of galaxies follows as
v = H0 d . (10.20)
The restriction “distant galaxies” means more precisely that H0 d ≫ vpec ∼ few × 100 km/s.
In other words, the peculiar motion of galaxies caused by the gravitational attraction of
nearby galaxy clusters should be small compared to the Hubble flow H0 d. Note that the
interpretation of v as recession velocity is problematic. The validity of such an interpretation
is certainly limited to v ≪ c.
The parameter H0 is called Hubble constant and has the value $H_0 \approx 71^{+4}_{-3}$ km/s/Mpc. We
will see soon that the Hubble law Eq. (10.20) is an approximation valid for z ≪ 1. In general,
the Hubble constant is not constant but depends on time, H = H(t), and we will call it
therefore Hubble parameter for t ≠ t0.
We can derive Hubble’s law by a Taylor expansion of R(t),
$$R(t) = R(t_0) + (t - t_0) \dot R(t_0) + \frac{1}{2} (t - t_0)^2 \ddot R(t_0) + \ldots \tag{10.21}$$
$$R(t) = R(t_0) \left[ 1 + (t - t_0) H_0 - \frac{1}{2} (t - t_0)^2 q_0 H_0^2 + \ldots \right] , \tag{10.22}$$
where
$$H_0 \equiv \frac{\dot R(t_0)}{R(t_0)} \quad\text{and}\quad q_0 \equiv -\frac{\ddot R(t_0) R(t_0)}{\dot R^2(t_0)} \tag{10.23}$$
is called deceleration parameter: If the expansion is slowing down, $\ddot R < 0$ and q0 > 0.
Hubble’s law follows now as an approximation for small redshift: For not too large
time-differences, we can use the expansion Eq. (10.21) and write
$$1 - z \approx \frac{1}{1+z} = \frac{R(t)}{R_0} \approx 1 + (t - t_0) H_0 . \tag{10.24}$$
Hence Hubble’s law, z = (t0 − t)H0 = (d/c)H0, is valid as long as z ≈ H0(t0 − t) ≪ 1. Deviations
from its linear form arise for z ≳ 1 and can be used to determine q0.
v = Hd . (10.25)
What does a different observer at position d′ see? He has the velocity v′ = Hd′ relative to us.
We are assuming that velocities are small and thus
$$v'' \equiv v - v' = H(d - d') = H d'' ,$$
where v′′ and d′′ denote the velocity and the position relative to the new observer. A linear relation
between v and d like Hubble’s law is the only relation compatible with homogeneity and thus the
“cosmological principle”.
Figure 10.1: An observer at position d′ sees the galaxy G receding with the speed
H(d − d′) = Hd′′, if the Hubble relation is linear.
Figure 10.2: World lines of a galaxy emitting light and an observer at comoving coordinates
r = 0 and r, respectively.
We change the integration limits, subtracting the common interval [t1 + δt1 : t2 ] and obtain
$$\int_{t_1}^{t_1+\delta t_1} \frac{dt}{R} = \int_{t_2}^{t_2+\delta t_2} \frac{dt}{R} . \tag{10.31}$$
Now we choose the time intervals δti as the time between two wave crests separated by the
wave lengths λi of an electromagnetic wave. Since these time intervals are extremely short
compared to cosmological times, δti = λi/c ≪ ti, we can treat R(t) as constant when performing
the integrals and obtain
$$\frac{\delta t_1}{R_1} = \frac{\delta t_2}{R_2} \quad\text{or}\quad \frac{\lambda_1}{R_1} = \frac{\lambda_2}{R_2} . \tag{10.32}$$
The redshift z of an object is defined as the relative change in the wavelength between emission
and detection,
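$$z = \frac{\lambda_2 - \lambda_1}{\lambda_1} = \frac{\lambda_2}{\lambda_1} - 1 \tag{10.33}$$
or
$$1 + z = \frac{\lambda_2}{\lambda_1} = \frac{R_2}{R_1} . \tag{10.34}$$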
Typically, the observation happens at the present epoch, and thus we set 1 + z = R0 /R(t).
This result is intuitively understandable, since the expansion of the universe stretches all
lengths including the wave-length of a photon. For a massless particle like the photon, ν = c/λ
and E = cp, and thus its frequency (energy) and its wave-length (momentum) are affected in
the same way. By contrast, the energy of a non-relativistic particle with E ≈ mc2 is nearly
fixed.
A similar calculation as for the photon can be done for massive particles. Since the geodesic
equation for massive particles leads to a more involved calculation, we use in this case however
a different approach. We consider two comoving observers separated by the proper distance
δl. A massive particle with velocity v needs the time δt = δl/v to travel from observer one
to observer two. The relative velocity of the two observers is
$$\delta u = \frac{\dot R}{R}\, \delta l = \frac{\dot R}{R}\, v\, \delta t = v\, \frac{\delta R}{R} . \tag{10.35}$$
Since we assume that the two observers are separated only infinitesimally, we can use the
addition law for velocities from special relativity for the calculation of the velocity v′ measured
by the second observer,
$$v' = \frac{v - \delta u}{1 - v\, \delta u} = v - (1 - v^2)\, \delta u + O(\delta u^2) = v - (1 - v^2)\, v\, \frac{\delta R}{R} . \tag{10.36}$$
Introducing δv = v − v′, we obtain
$$\frac{\delta v}{v(1 - v^2)} = \frac{\delta R}{R} \tag{10.37}$$
and integrating this equation results in
$$p = \frac{mv}{\sqrt{1 - v^2}} = \frac{\text{const.}}{R} . \tag{10.38}$$
Thus not the energy but the momentum p = ℏ/λ of massive particles is red-shifted: The
kinetic energy of massive particles goes quadratically to zero, and hence peculiar velocities
relative to the Hubble flow are strongly damped by the expansion of the universe.
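The scaling p ∝ 1/R can be checked by integrating Eq. (10.37) numerically. In the sketch below, δv = v − v′ is read as the decrease of v while R grows, so that dv/d ln R = −v(1 − v²); the function names and the midpoint integrator are illustrative choices, with c = 1.

```python
import math


def evolve_velocity(v0, growth, steps=100000):
    """Integrate dv/dlnR = -v (1 - v^2) while the scale factor grows by the factor `growth`."""
    dlnR = math.log(growth) / steps
    v = v0
    for _ in range(steps):
        k1 = -v * (1.0 - v * v)          # slope at the start of the step
        vm = v + 0.5 * dlnR * k1         # midpoint estimate
        k2 = -vm * (1.0 - vm * vm)       # slope at the midpoint
        v += dlnR * k2
    return v


def momentum_per_mass(v):
    """p/m = v / sqrt(1 - v^2) in units with c = 1."""
    return v / math.sqrt(1.0 - v * v)


v1 = evolve_velocity(0.9, growth=10.0)
print(momentum_per_mass(0.9), 10.0 * momentum_per_mass(v1))   # the two numbers should agree
```

The product pR stays constant while the velocity itself drops from 0.9 to roughly 0.2, illustrating how peculiar velocities are damped.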
Luminosity distance The luminosity distance dL is defined such that the inverse-square law
between the luminosity L of a source at distance d and the received energy flux F is valid,
$$d_L = \left( \frac{L}{4\pi \mathcal{F}} \right)^{1/2} . \tag{10.40}$$
Assume now that a (isotropically emitting) source with luminosity L(t) and comoving coor-
dinate χ is observed at t0 by an observer at O. The cut at O through the forward light cone
of the source emitted at te defines a sphere S² with proper area $A = 4\pi R_0^2 S^2(\chi)$.
Two additional effects are that the frequency of a single photon is redshifted, ν0 = νe /(1 + z),
and that the arrival rate of photons is reduced by the same factor due to time-dilation. Hence
the received flux is
$$\mathcal{F}(t_0) = \frac{1}{(1+z)^2} \frac{L(t_e)}{4\pi R_0^2 S^2(\chi)} \tag{10.42}$$
and the luminosity distance becomes
$$d_L = (1 + z) R_0 S(\chi) . \tag{10.43}$$
Note that dL depends via χ on the expansion history of the universe between te and t0 .
Observable are not the coordinates χ or r, but the redshift z of a galaxy. Differentiating
1 + z = R0 /R(t), we obtain
$$dz = -\frac{R_0}{R^2}\, dR = -\frac{R_0}{R^2} \frac{dR}{dt}\, dt = -(1+z) H\, dt \tag{10.44}$$
or
$$t_0 - t = \int_t^{t_0} dt = \int_0^z \frac{dz}{H(z)(1+z)} . \tag{10.45}$$
Inserting the relation (10.44) into Eq. (10.39), we find the coordinate χ of a galaxy at redshift
z as
$$\chi = \int_t^{t_0} \frac{dt}{R(t)} = \frac{1}{R_0} \int_0^z \frac{dz}{H(z)} . \tag{10.46}$$
For small redshift z ≪ 1, we can use the expansion (10.22)
$$\chi = \int_t^{t_0} \frac{dt}{R_0}\, [1 + (t - t_0) H_0 + \ldots]^{-1} \tag{10.47}$$
$$\approx \frac{1}{R_0} \left[ (t_0 - t) + \frac{1}{2} (t_0 - t)^2 H_0 + \ldots \right] = \frac{1}{R_0 H_0} \left[ z - \frac{1}{2} (1 + q_0) z^2 + \ldots \right] . \tag{10.48}$$
In practice, one observes only the luminosity within a certain frequency range instead of the
total (or bolometric) luminosity. A correction for this effect requires the knowledge of the
intrinsic source spectrum.
$$d_A = \frac{l}{\Delta\vartheta} . \tag{10.49}$$
$$d_A = \frac{R_0 S(\chi)}{1 + z} . \tag{10.50}$$
Thus at small distances, z ≪ 1, the two definitions agree by construction, while for large
redshift the differences increase as (1 + z)2 .
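As an illustration, the two distance measures can be evaluated numerically. The sketch below assumes a spatially flat model, so that S(χ) = χ, containing only matter and a cosmological constant, with H(z) = H0[Ωm(1 + z)³ + 1 − Ωm]^{1/2}; the function names, the default parameter values and the use of Simpson's rule are choices made for this example.

```python
import math

C_KM_S = 299792.458   # speed of light in km/s


def comoving_distance(z, h0=71.0, omega_m=0.27, n=1000):
    """R0*chi in Mpc from Eq. (10.46) with c restored; Simpson's rule with n (even) intervals."""
    omega_l = 1.0 - omega_m

    def inv_e(x):
        return 1.0 / math.sqrt(omega_m * (1.0 + x) ** 3 + omega_l)   # H0 / H(x)

    dz = z / n
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * inv_e(i * dz)
    return C_KM_S / h0 * total * dz / 3.0


def luminosity_distance(z, **kw):
    return (1.0 + z) * comoving_distance(z, **kw)     # Eq. (10.43) with S(chi) = chi


def angular_diameter_distance(z, **kw):
    return comoving_distance(z, **kw) / (1.0 + z)     # Eq. (10.50)


for z in (0.01, 0.5, 1.0, 3.0):
    print(z, luminosity_distance(z), angular_diameter_distance(z))
```

At z = 0.01 both distances are close to cz/H0, while their ratio grows as (1 + z)² at larger redshift.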
and 0 = 0 for the space-space components. Eliminating R̈ and showing explicitly the con-
tribution of a cosmological constant to the energy density ρ, the usual Friedmann equation
follows as
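$$H^2 \equiv \left( \frac{\dot R}{R} \right)^2 = \frac{8\pi}{3} G\rho - \frac{k}{R^2} + \frac{\Lambda}{3} . \tag{10.53}$$
$$\frac{\ddot R}{R} = \frac{\Lambda}{3} - \frac{4\pi G}{3} (\rho + 3P) . \tag{10.54}$$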
This equation determines the (de-) acceleration of the Universe as function of its matter and
energy content. “Normal” matter is characterized by ρ > 0 and P ≥ 0. Thus a static solution
is impossible for a universe with Λ = 0. Such a universe is decelerating and since today Ṙ > 0,
R̈ was always negative and there was a “big bang”.
We define the critical density ρcr as the density for which the spatial geometry of the
universe is flat. From k = 0, it follows
$$\rho_{cr} = \frac{3 H_0^2}{8\pi G} \tag{10.55}$$
and thus ρcr is uniquely fixed by the value of H0 . One “hides” this dependence by introducing
h,
H0 = 100 h km/(s Mpc) .
Then one can express the critical density as function of h,
$\rho_{cr} = 2.77 \times 10^{11}\, h^2\, M_\odot/\mathrm{Mpc}^3 = 1.88 \times 10^{-29}\, h^2\, \mathrm{g/cm^3} = 1.05 \times 10^{-5}\, h^2\, \mathrm{GeV/cm^3}$ .
Thus a flat universe with H0 = 100h km/s/Mpc requires an energy density of ∼ 10 protons
per cubic meter. We define the abundance Ωi of the different players in cosmology as their
energy density relative to ρcr, Ωi = ρi/ρcr.
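The numerical value of the critical density follows directly from Eq. (10.55). A minimal sketch in CGS units, where the constants are standard reference values rounded to a few digits and the function name is chosen for this example:

```python
import math

G_CGS = 6.674e-8          # Newton's constant in cm^3 g^-1 s^-2
MPC_CM = 3.0857e24        # one megaparsec in cm
M_PROTON_G = 1.6726e-24   # proton mass in g


def critical_density(h):
    """rho_cr = 3 H0^2 / (8 pi G) in g/cm^3 for H0 = 100 h km/s/Mpc."""
    h0 = 100.0 * h * 1.0e5 / MPC_CM      # H0 converted to 1/s
    return 3.0 * h0 ** 2 / (8.0 * math.pi * G_CGS)


rho = critical_density(1.0)
print(rho)                               # in g/cm^3
print(rho / M_PROTON_G * 1.0e6)          # equivalent number of protons per cubic metre
```

For h = 1 this gives about 1.9 × 10⁻²⁹ g/cm³, or roughly eleven proton masses per cubic metre.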
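In the following, we will often include Λ with the other contributions to the energy density ρ via
$$\frac{8\pi}{3} G \rho_\Lambda = \frac{\Lambda}{3} . \tag{10.56}$$
Thereby one recognizes also that the cosmological constant acts as a constant energy density
$$\rho_\Lambda = \frac{\Lambda}{8\pi G} \quad\text{or}\quad \Omega_\Lambda = \frac{\Lambda}{3 H_0^2} . \tag{10.57}$$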
10.4 Scale-dependence of different energy forms
We can understand better the physical properties of the cosmological constant by replacing
Λ by (8πG)ρΛ . Now we can compare the effect of normal matter and of the Λ term on the
acceleration,
$$\frac{\ddot R}{R} = \frac{8\pi G}{3} \rho_\Lambda - \frac{4\pi G}{3} (\rho + 3P) . \tag{10.58}$$
Thus Λ is equivalent to matter with an E.o.S. wΛ = P/ρ = −1. This property can be checked
using only thermodynamics: With P = −(∂U/∂V )S and UΛ = ρΛ V , it follows P = −ρ.
The borderline between an accelerating and decelerating universe is given by ρ = −3P or
w = −1/3. The condition ρ < −3P violates the so-called strong energy condition for “normal”
matter in equilibrium. An accelerating universe requires therefore a positive cosmological
constant or a dominating form of matter that is not in equilibrium.
Note that the energy contribution of relativistic matter, photons and possibly neutrinos,
is today much smaller than the one of non-relativistic matter (stars and cold dark matter).
Thus the pressure term in the acceleration equation can be neglected at the present epoch.
Measuring R̈/R, Ṙ/R and ρ fixes therefore the geometry of the universe.
Thermodynamics The first law of thermodynamics becomes for a perfect fluid with dS = 0
simply
dU = T dS − P dV = −P dV (10.59)
or
d(ρR3 ) = −P d(R3 ) . (10.60)
Dividing by dt,
Rρ̇ + 3(ρ + P )Ṙ = 0 , (10.61)
we obtain our old result,
ρ̇ = −3(ρ + P )H . (10.62)
This result could be also derived from ∇a T ab = 0. Moreover, the three equations are not
independent.
We express the curvature term for arbitrary times through Ωtot,0 and the redshift z as
$$\frac{k}{R^2} = \frac{k}{R_0^2} (1+z)^2 = H_0^2 (\Omega_{tot,0} - 1)(1+z)^2 . \tag{10.68}$$
$$\frac{H^2(z)}{H_0^2} = \sum_i \Omega_i(z) - (\Omega_{tot,0} - 1)(1+z)^2 = \Omega_{rad,0} (1+z)^4 + \Omega_{m,0} (1+z)^3 + \Omega_\Lambda - (\Omega_{tot,0} - 1)(1+z)^2 \tag{10.69}$$
This expression allows us to calculate the age of the universe (10.45), distances (10.43), etc. for
a given cosmological model, i.e. specifying the energy content Ωi,0 and the Hubble parameter
H0 at the present epoch.
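A short numerical sketch shows how Eq. (10.69) is used in practice. It evaluates H(z)/H0 for given present-day abundances and integrates Eq. (10.45) for the age of the universe; the function names are illustrative, and the integral is rewritten in terms of a = 1/(1 + z), so that dz/[(1 + z)H] becomes da/(aH).

```python
import math


def hubble_ratio(z, omega_m, omega_l, omega_rad=0.0):
    """H(z)/H0 from Eq. (10.69)."""
    omega_tot = omega_m + omega_l + omega_rad
    e2 = (omega_rad * (1.0 + z) ** 4 + omega_m * (1.0 + z) ** 3 + omega_l
          - (omega_tot - 1.0) * (1.0 + z) ** 2)
    return math.sqrt(e2)


def age_times_h0(omega_m, omega_l, omega_rad=0.0, n=20000):
    """t0*H0 from Eq. (10.45), midpoint rule in the variable a = 1/(1+z) on 0 < a < 1."""
    da = 1.0 / n
    total = 0.0
    for i in range(n):
        a = (i + 0.5) * da
        total += da / (a * hubble_ratio(1.0 / a - 1.0, omega_m, omega_l, omega_rad))
    return total


print(age_times_h0(1.0, 0.0))      # flat, matter only
print(age_times_h0(0.3, 0.0))      # open, matter only
print(age_times_h0(0.27, 0.73))    # flat with cosmological constant
```

The flat matter-only case reproduces t0 H0 = 2/3, and adding a cosmological constant at fixed H0 increases the age.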
$$\dot R^2 = \frac{8\pi}{3} G \rho R^2 = H_0^2 R_0^{3+3w} R^{-(1+3w)} , \tag{10.70}$$
where we inserted the definition of ρcr = 3H0²/(8πG). Separating variables we obtain
$$R_0^{-(3+3w)/2} \int_0^{R_0} dR\, R^{(1+3w)/2} = H_0 \int_0^{t_0} dt = t_0 H_0 \tag{10.71}$$
10.6 The ΛCDM model
Models with w > −1 needed a finite time to expand from the initial singularity R(t = 0) = 0
to the current size R0 , while a Universe with only a Λ has no “beginning”.
In models with a hot big-bang, ρ, T → ∞ for t → 0, and we should expect that classical
gravity breaks down at some moment t∗ . As long as R ∝ tα with α < 1, most time elapsed
during the last fractions of t0 H0 . Hence our result for the age of the universe does not depend
on unknown physics close to the big-bang as long as w > −1/3.
If we integrate (10.71) to the arbitrary time t, we obtain the time-dependence of the scale
factor,
$$R(t) \propto t^{2/(3+3w)} = \begin{cases} t^{2/3} & \text{for matter } (w = 0) , \\ t^{1/2} & \text{for radiation } (w = 1/3) , \\ \exp(t) & \text{for } \Lambda\ (w = -1) . \end{cases} \tag{10.73}$$
Age problem of the universe. The age of a matter-dominated universe is (expanded around
Ω0 = 1)
$$t_0 = \frac{2}{3H_0} \left[ 1 - \frac{1}{5} (\Omega_0 - 1) + \ldots \right] . \tag{10.74}$$
Globular cluster ages require t0 ≥ 13 Gyr. Using Ω0 = 1 leads to H0 ≤ 2/(3 × 13 Gyr) =
1/(19.5 Gyr) or h ≤ 0.50. Thus a flat universe with t0 = 13 Gyr without cosmological constant
requires a too small value of H0. Choosing Ωm ≈ 0.3 increases the age by just 14%.
We derive the age t0 of a flat Universe with Ωm + ΩΛ = 1 in the next section as
$$\frac{3 t_0 H_0}{2} = \frac{1}{\sqrt{\Omega_\Lambda}} \ln \frac{1 + \sqrt{\Omega_\Lambda}}{\sqrt{1 - \Omega_\Lambda}} . \tag{10.75}$$
Requiring H0 ≥ 65 km/s/Mpc and t0 ≥ 13 Gyr means that the function on the RHS should
be larger than 3 × 13 Gyr × 0.65/(2 × 9.8 Gyr) ≈ 1.3 or ΩΛ ≥ 0.55.
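The bound on ΩΛ can be reproduced by inverting Eq. (10.75) numerically. The sketch below uses bisection, relying on the fact that the age grows monotonically with ΩΛ; the function names are hypothetical, and 1/H0 = 9.8 Gyr/h is the conversion used in the estimate above.

```python
import math


def age_flat_lcdm(omega_l):
    """t0*H0 for a flat universe with Omega_m + Omega_Lambda = 1, Eq. (10.75)."""
    s = math.sqrt(omega_l)
    return 2.0 / (3.0 * s) * math.log((1.0 + s) / math.sqrt(1.0 - omega_l))


def min_omega_lambda(t0_h0, lo=1e-6, hi=1.0 - 1e-9, tol=1e-10):
    """Smallest Omega_Lambda that gives at least the requested t0*H0 (bisection)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if age_flat_lcdm(mid) < t0_h0:
            lo = mid
        else:
            hi = mid
    return hi


target = 13.0 * 0.65 / 9.8     # t0*H0 for t0 = 13 Gyr and h = 0.65
print(min_omega_lambda(target))
```

The result lies slightly above 0.55, in line with the estimate in the text.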
Figure 10.3: The product t0 H0 as function of Ωm for an open universe containing only matter
(dotted blue line) and for a flat cosmological model with ΩΛ + Ωm = 1 (solid red line).
we obtain
$$\frac{d}{dt} (a \dot a^2) = \dot a a^2 \Lambda = \frac{1}{3} \frac{d}{dt} (a^3)\, \Lambda . \tag{10.78}$$
Integrating is now trivial,
$$a \dot a^2 = \frac{\Lambda}{3} a^3 + C . \tag{10.79}$$
The constant C can be determined most easily by setting a(t0 ) = 1 and comparing the
Friedmann equation (10.53) with (10.79) for t = t0 as C = 8πGρm,0 /3.
Next we introduce the new variable x = a3/2 . Then
$$\frac{da}{dt} = \frac{dx}{dt} \frac{da}{dx} = \frac{dx}{dt} \frac{2 x^{-1/3}}{3} , \tag{10.80}$$
and we obtain as new differential equation
10.7 Determining Λ and the curvature R0 from ρm,0 , H0 , q0
Figure 10.4: The deceleration parameter q as function of t/t0 for the ΛCDM model and
various values for ΩΛ (0.1, 0.3, 0.5, 0.7 and 0.9 from the top to the bottom).
and then
$$q(t) = \frac{1}{2} \left[ 1 - 3 \tanh^2(t/t_\Lambda) \right] . \tag{10.85}$$
The limiting behavior of q, q = 1/2 for t → 0 and q = −1 for t → ∞, corresponds as
expected to the one of a flat Ωm = 1 and of a ΩΛ = 1 universe. More interesting is the transition
region and, as shown in Fig. 10.4, the transition from a decelerating to an accelerating universe
happens for ΩΛ = 0.7 at t ≈ 0.55t0. This can easily be converted to a redshift, z∗ = a(t0)/a(t∗) −
1 ≈ 0.7, that is directly measured in supernova observations.
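The quoted transition time and redshift can be checked with a few lines of code. The sketch assumes the standard flat ΛCDM solution a(t) ∝ sinh^{2/3}(t/tΛ) with t0/tΛ = artanh(√ΩΛ), which is not spelled out in the passage above and should be compared with the derivation preceding Eq. (10.85); the function names are illustrative.

```python
import math


def q_lcdm(t_over_t_lambda):
    """Deceleration parameter from Eq. (10.85)."""
    return 0.5 * (1.0 - 3.0 * math.tanh(t_over_t_lambda) ** 2)


def transition(omega_l):
    """Return (t*/t0, z*) at which q changes sign in a flat LambdaCDM model."""
    # assumption: a(t) ∝ sinh^(2/3)(t/t_Lambda) and t0/t_Lambda = artanh(sqrt(Omega_Lambda))
    x0 = math.atanh(math.sqrt(omega_l))        # t0 / t_Lambda
    xs = math.atanh(1.0 / math.sqrt(3.0))      # q = 0 when tanh^2(t/t_Lambda) = 1/3
    z_star = (math.sinh(x0) / math.sinh(xs)) ** (2.0 / 3.0) - 1.0
    return xs / x0, z_star


print(transition(0.7))
```

For ΩΛ = 0.7 the transition happens near t ≈ 0.54 t0 and z∗ ≈ 0.67, consistent with the rounded values quoted above.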
Hence the sign of 3Ωm − 2q0 − 2 decides about the sign of k and thus the curvature of
the universe. For a universe without cosmological constant, Λ = 0, equation (10.87) gives
Ωm = 2q0 and thus
Example: Comparison with observations: Use the Friedmann equations applied to the present
time to derive central values of Λ and k, R0 from the observables H0 ≈ (71 ± 4) km/s/Mpc and
ρ0 = (0.27 ± 0.04)ρcr , and q0 = −0.6. Discuss the allowed range and significance of the values.
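We evaluate first
$$H_0^2 \approx \left( \frac{7.1 \times 10^6\, \mathrm{cm}}{\mathrm{s} \times 3.1 \times 10^{24}\, \mathrm{cm}} \right)^2 \approx 5.2 \times 10^{-36}\, \mathrm{s}^{-2} .$$
The value of the cosmological constant Λ follows as
$$\Lambda = 4\pi G \rho_{m,0} - 3 q_0 H_0^2 = 3 H_0^2 \left( \frac{\rho}{2\rho_{cr}} - q_0 \right) \approx 3 H_0^2 \times \left( \frac{1}{2} \times 0.27 + 0.6 \right) \approx 0.73 \times 3 H_0^2$$
or ΩΛ = 0.73.
The curvature radius R0 follows as
$$\frac{k}{R_0^2} = 4\pi G \rho_{m,0} - H_0^2 (q_0 + 1) = 3 H_0^2 \left( \frac{\rho}{2\rho_{cr}} - \frac{q_0 + 1}{3} \right) \tag{10.92}$$
$$= 3 H_0^2 (0.135 \pm 0.02 - 0.4/3) = 3 H_0^2 (0.002 \pm 0.02) \tag{10.93}$$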
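The arithmetic of this example is easily repeated in code. The sketch below takes the central values of the observables and returns H0², Λ/(3H0²) and k/(3H0²R0²); the function name and the value used for the megaparsec are choices made for this illustration.

```python
MPC_CM = 3.0857e24    # one megaparsec in cm


def example_values(h0_km_s_mpc=71.0, omega_m=0.27, q0=-0.6):
    """Return (H0^2 in s^-2, Lambda/(3 H0^2), k/(3 H0^2 R0^2)) for the central values."""
    h0 = h0_km_s_mpc * 1.0e5 / MPC_CM              # H0 in 1/s
    omega_lambda = 0.5 * omega_m - q0              # from Lambda = 4 pi G rho - 3 q0 H0^2
    curvature = 0.5 * omega_m - (q0 + 1.0) / 3.0   # bracket of Eq. (10.92)
    return h0 ** 2, omega_lambda, curvature


print(example_values())
```

The curvature term comes out much smaller than its uncertainty of ±0.02, so the central values are compatible with a flat universe.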
10.8 Particle horizons
The ratio
$$\frac{l_H(t)}{R(t)} \propto \frac{t}{t^\alpha} \propto t^{1-\alpha}$$
gives the fraction of the Hubble horizon that was causally connected at time t < t0. Since
0 < α < 1, this fraction decreases going back in time.
For a universe dominated by a cosmological constant Λ > 0, $R(t) = R_0 \exp(\sqrt{\Lambda/3}\, t) = R_0 \exp(Ht)$ and thus
$$l_H(t) = c R_0 \int_t^{t_0} \frac{dt'}{\exp(Ht')} = \frac{c R_0}{H} [\exp(-Ht) - \exp(-Ht_0)]$$
$$\frac{l_H(t)}{R_0} = \frac{c}{H} [\exp(-Ht) - 1] .$$
Since t < t0 = 0, the expression in the bracket is always larger than one and the causally
connected region is larger than the Hubble horizon. If exponential expansion had
persisted for all times, then lH(t) → ∞ for t → −∞ and thus the whole universe would be
causally connected.
11 Cosmic relics
Thus the relative importance of the different energy forms changes: Going back in time, one
enters first the matter-dominated and then the radiation-dominated epoch.
The cosmic triangle shown in Fig. 11.1 illustrates the evolution in time of the various energy
components and the resulting coincidence problem: Any universe with a non-zero positive
cosmological constant will be driven with time to a fixed point with Ωm , Ωk → 0. The only
other non-evolving state is a flat universe containing only matter—however, this solution is
unstable. Hence, the question arises why we live in an epoch where all energy components
have comparable size.
The temperature increase as T ∼ 1/R has three main effects: Firstly, bound states like atoms
and nuclei are dissolved when the temperature reaches their binding energy, T ≳ Eb. Secondly,
particles with mass mX can be produced, when T ≳ 2mX, in reactions like $\gamma\gamma \to \bar X X$. Thus
the early Universe consists of a plasma containing more and more heavy particles that are in
thermal equilibrium. Finally, most reaction rates Γ = nσv increase faster than the expansion
rate of the universe for t → 0, since n ∝ T³ for relativistic particles, while $H \propto \rho_{rad}^{1/2} \propto T^2$.
Therefore, reactions that have become ineffective today were important in the early Universe.
11.1 Time-line of important dates in the early universe
Figure 11.1: The cosmic triangle showing the time evolution of the various energy components.
Matter-radiation equilibrium zeq : The density of matter decreases slower than the energy
density of radiation. Going backward in time, there will therefore be a time when the densities
of matter and radiation were equal. Before that time with redshift zeq, the universe was
radiation-dominated,
$$\Omega_{rad,0} (1 + z_{eq})^4 = \Omega_{m,0} (1 + z_{eq})^3 \tag{11.2}$$
or
$$z_{eq} = \frac{\Omega_{m,0}}{\Omega_{rad,0}} - 1 \approx 5400 . \tag{11.3}$$
This time is important, because i) the time-dependence of the scale factor changes from
R ∝ t^{2/3} for a matter to R ∝ t^{1/2} for a radiation dominated universe, ii) the E.o.S. and thus the
speed of sound changed from w ≈ 1/3, $v_s^2 = (\partial P/\partial\rho)_S = c^2/3$ to w ≈ 0, $v_s^2 = 5kT/(3m) \ll c^2$.
The latter quantity determines the Jeans length and thus which structures in the Universe
can collapse.
Recombination zrec : Today, most hydrogen and helium in the interstellar and intergalactic
medium is neutral. Increasing the temperature, the fraction of ions and free electrons increases,
i.e. the reaction H + γ ↔ H⁺ + e⁻ that is mainly controlled by the factor exp(−Eb/kT) will be
shifted to the right. By definition, we call recombination the time when 50% of all atoms are
ionized. A naive estimate gives kT ∼ Eb ≈ 13.6 eV ≈ 160,000 K or zrec = 60,000. However,
there are many more photons than hydrogen atoms, and therefore recombination happens
later: A more detailed calculation gives zrec ∼ 1000.
Since the interaction probability of photons with neutral hydrogen is much smaller than with
electrons and protons, recombination marks the time when the Universe became transparent
to light.
Quark-hadron or QCD transition Above T ∼ mπ ∼ 100 MeV, hadrons like protons, neu-
trons or pions dissolve into their fundamental constituents, quarks q and gluons g.
Baryogenesis All the matter observed in the Universe consists of matter (protons and electrons),
and not of anti-matter (anti-protons and positrons). Thus the baryon-to-photon ratio
is
$$\eta = \frac{n_b - n_{\bar b}}{n_\gamma} = \frac{n_b}{n_\gamma} = \frac{\Omega_b \rho_{cr}/m_N}{2\zeta(3) T_\gamma^3/\pi^2} \approx 7 \times 10^{-10} . \tag{11.4}$$
The early plasma of quarks q and anti-quarks q̄ contained a tiny surplus of quarks. After all
anti-matter annihilated with matter, only the small surplus of matter remained. The tiny
asymmetry can be explained by interactions in the early Universe that were not completely
symmetric with respect to an exchange of matter-antimatter.
The factor g takes into account the internal degrees of freedom like spin or color. Thus for a
photon, a massless spin-1 particle g = 2, for an electron g = 4, etc.
11.2 Equilibrium statistical physics in a nut-shell
In the non-relativistic limit T ≪ m, eβ(m−µ) ≫ 1 and thus differences between bosons and
fermions disappear,
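$$n = \frac{g}{2\pi^2}\, e^{-\beta(m-\mu)} \int_0^\infty dp\, p^2\, e^{-\beta \frac{p^2}{2m}} = g \left( \frac{mT}{2\pi} \right)^{3/2} \exp[-\beta(m - \mu)] , \tag{11.9}$$
$$\rho = mn , \tag{11.10}$$
$$P = nT \ll \rho . \tag{11.11}$$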
where for bosons ε1 = ε2 = 1 and for fermions ε1 = 3/4 and ε2 = 7/8, respectively.
Since the energy density and the pressure of non-relativistic species are exponentially suppressed,
the total energy density and the pressure of all species present in the universe can
be well-approximated including only relativistic ones,
$$\rho_{rad} = \frac{\pi^2}{30}\, g_* T^4 , \tag{11.15}$$
$$P_{rad} = \rho_{rad}/3 = \frac{\pi^2}{90}\, g_* T^4 , \tag{11.16}$$
where
$$g_* = \sum_{\text{bosons}} g_i \left( \frac{T_i}{T} \right)^4 + \frac{7}{8} \sum_{\text{fermions}} g_i \left( \frac{T_i}{T} \right)^4 . \tag{11.17}$$
Here we took into account that the temperature of different particle species can differ.
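Eq. (11.17) is straightforward to evaluate for a given particle content. In the sketch below each species is described by a pair (g_i, T_i/T); the function names and the two example particle contents are illustrative, and the neutrino temperature ratio (4/11)^{1/3} after electron-positron annihilation is a standard result that is assumed here rather than derived in this passage.

```python
import math


def g_star(bosons, fermions):
    """Effective number of relativistic degrees of freedom, Eq. (11.17)."""
    gb = sum(g * ratio ** 4 for g, ratio in bosons)
    gf = sum(g * ratio ** 4 for g, ratio in fermions)
    return gb + 7.0 / 8.0 * gf


def rho_rad(g, temperature):
    """Energy density from Eq. (11.15) in natural units (same units as T^4)."""
    return math.pi ** 2 / 30.0 * g * temperature ** 4


# photons (g = 2), electrons and positrons (g = 4), three neutrino species (g = 6 in total)
g_mev = g_star(bosons=[(2, 1.0)], fermions=[(4, 1.0), (6, 1.0)])

# after e+ e- annihilation: photons plus neutrinos with T_nu/T = (4/11)^(1/3)  (assumed)
g_today = g_star(bosons=[(2, 1.0)], fermions=[(6, (4.0 / 11.0) ** (1.0 / 3.0))])

print(g_mev, g_today)
```

The first value is 10.75 and the second is about 3.36, showing how a lower temperature of one species reduces its weight in g∗.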
¹ Integrals of the type $\int_0^\infty dx\, x^{2n} e^{-a x^2}$ can be reduced to a Gaussian integral by differentiating with respect
to the parameter a.
When g∗ is constant, the temperature T ∝ 1/R. Consider now the case that a particle
species, e.g. electrons, becomes non-relativistic at T ∼ me . Then the particles annihilate,
e⁺ + e⁻ → γγ, and their entropy is transferred to photons. Formally, g∗,S decreases and therefore
the temperature decreases for a short period more slowly than T ∝ 1/R.
Since s ∝ R⁻³ and also the net number of particles with a conserved charge, e.g. $n_B \equiv
n_b - n_{\bar b} \propto R^{-3}$ if baryon number B is conserved, the ratio $n_B/s$ remains constant.
11.3 Big Bang Nucleosynthesis
Table 11.1: The number of relativistic degrees of freedom g∗ present in the universe as func-
tion of its temperature.
The observed luminosity-mass ratio is however only L/Mb ≤ 0.05 L⊙/M⊙. Assuming a roughly
constant luminosity of stars, they can produce only 0.05/2.5 ≈ 2% of the observed ⁴He.
Big Bang Nucleosynthesis (BBN) is controlled by two parameters: The mass difference
between protons and neutrons, ∆ ≡ mn − mp ≈ 1.3 MeV and the freeze-out temperature Tf
of the reactions converting protons into neutrons and vice versa.
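With $n_p^Z n_n^{A-Z}/n_N = X_p^Z X_n^{A-Z} n_N^{A-1}$ and $n_B = \eta n_\gamma \propto \eta T^3$, and thus $n_B^{A-1} \propto \eta^{A-1} T^{3(A-1)}$, we have
$$X_A \propto \eta^{A-1} X_p^Z X_n^{A-Z} \left( \frac{T}{m_N} \right)^{3(A-1)/2} \exp(\beta B_A) . \tag{11.28}$$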
The fact that η ≪ 1, i.e. that the number of photons per baryon is extremely large, means
that nuclei with A > 1 are much less abundant and that nucleosynthesis takes place later
than naively expected. Let us consider the particular case of deuterium in Eq. (11.28),
$$\frac{X_D}{X_p X_n} = \frac{24\zeta(3)}{\sqrt{\pi}}\, \eta \left( \frac{T}{m_N} \right)^{3/2} \exp(\beta B_D) \tag{11.29}$$
with BD = 2.23 MeV. The start of nucleosynthesis could be defined approximately by the
condition XD/(XpXn) = 1, or T ≈ 0.1 MeV according to the left panel in Fig. 11.2. The right
panel of the same figure shows the results, if the equations (11.28) together with $\sum_i X_i = 1$
are solved for the lightest and most stable nuclei. Now it becomes clear that in thermal equilibrium
between 0.1 ≲ T ≲ 0.2 MeV essentially all free neutrons will bind to ⁴He. For low
temperatures one cannot expect that the true abundance follows the equilibrium abundance,
Eq. (11.28), shown in Fig. 11.2. First, in the expanding universe the weak reactions that
convert protons and neutrons will freeze out as soon as their rate drops below the expansion
rate of the universe. This effect will be discussed in the following in more detail. Second, the
Coulomb barrier will prevent the production of nuclei with Z ≫ 1. Third, neutrons are not
stable and decay.
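The temperature at which the deuterium ratio of Eq. (11.29) reaches one can be found numerically. The sketch below solves the condition by bisection for η = 7 × 10⁻¹⁰; the function names, the bracketing interval and the rounded nucleon mass are choices made for this example.

```python
import math

ZETA3 = 1.2020569     # Riemann zeta(3)
M_N = 938.9           # nucleon mass in MeV (rounded)
B_D = 2.23            # deuterium binding energy in MeV


def deuterium_ratio(T, eta=7e-10):
    """X_D / (X_p X_n) from Eq. (11.29) for a temperature T in MeV."""
    return 24.0 * ZETA3 / math.sqrt(math.pi) * eta * (T / M_N) ** 1.5 * math.exp(B_D / T)


def onset_temperature(eta=7e-10, lo=0.03, hi=1.0):
    """Temperature in MeV where the ratio equals one, found by bisection."""
    # in this interval the ratio falls monotonically with increasing T
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if deuterium_ratio(mid, eta) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


print(onset_temperature())
```

The solution lies somewhat below 0.1 MeV, far below the binding energy BD, which reflects the large number of photons per baryon. A larger η shifts the onset to higher temperature.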
[Figure 11.2: Left panel: XD/(Xp Xn) as function of T/MeV. Right panel: equilibrium abundances XA
of n = p, D, ³He, ⁴He and ¹²C as function of T/MeV.]
for the expansion of the universe, τ = (Ṙ/R)−1 = H −1 . Note that this is also the typical
time-scale for changes in the temperature T . Thus we can rewrite this condition as
Γ ≡ nσv ≫ H . (11.30)
A particle species ”goes out of equilibrium” when its interaction rate Γ becomes smaller than
the expansion rate H of the universe.
• as the universe cools down from Tfr to Tns, neutrons decay with lifetime τn ≈ 886 s.
• at Tns, practically all neutrons are bound to ⁴He, with only a small admixture of other
elements.
11.4 Dark matter
Figure 11.3: Abundances of light-elements as function of η (left) and of the number of light
neutrino species (right).
where x = m/T and geff = 3/4 (geff = 1) for fermions (bosons). If the particle X is in chemical
equilibrium, its abundance is determined for T ≫ m by its contribution to the total number of
degrees of freedom of the plasma, while Yeq is exponentially suppressed for T ≪ m (assuming
µX = 0). In an expanding universe, one may expect that the reaction rate Γ for processes
like $\gamma\gamma \leftrightarrow \bar X X$ drops below the expansion rate H mainly for two reasons: i) Cross sections
may depend on energy as, e.g., weak processes σ ∝ s ∝ T² for $s \lesssim m_W^2$, ii) the density nX
decreases at least as n ∝ T³. Around the freeze-out time xf, the true abundance Y starts
to deviate from the equilibrium abundance Yeq and becomes constant, Y(x) ≈ Yeq(xf) for
x ≳ xf. This behavior is illustrated in Fig. 11.4.
$$\frac{dn}{dt} = \frac{dn}{dR} \frac{dR}{dt} = -3n \frac{\dot R}{R} = -3Hn . \tag{11.39}$$
Additionally, there might be production and annihilation processes. While the annihilation
rate $\beta n^2 = \langle\sigma_{ann} v\rangle n^2$ has to be proportional to n², we allow for an arbitrary function as
production rate ψ,
$$\frac{dn}{dt} = -3Hn - \beta n^2 + \psi . \tag{11.40}$$
In a static Universe, dn/dt = 0 defines equilibrium distributions neq. Detailed balance requires
that the number of X particles produced in reactions like $e^+ e^- \to \bar X X$ is in equilibrium equal
to the number that is destroyed in $\bar X X \to e^+ e^-$, or $\beta n_{eq}^2 = \psi_{eq}$. Since the reaction partners
(like the electrons in our example) are assumed to be in equilibrium, we can replace ψ = ψeq
by $\beta n_{eq}^2$ and obtain
$$\frac{dn}{dt} = -3Hn - \langle\sigma_{ann} v\rangle (n^2 - n_{eq}^2) . \tag{11.41}$$
This equation together with the initial condition n ≈ neq for T → ∞ determines n(t) for a
given annihilation cross section σann .
Next we rewrite the evolution equation for n(t) using the dimensionless variables Y and x.
Changing from n = sY to Y we can eliminate the 3Hn term,
$$\frac{dn}{dt} = -3Hn + s \frac{dY}{dt} . \tag{11.42}$$
With (2t)⁻² = H² ∝ ρ ∝ T⁴ ∝ x⁻⁴ or t = t∗x², we obtain
$$\frac{dY}{dx} = -\frac{s}{Hx} \langle\sigma_{ann} v\rangle \left( Y^2 - Y_{eq}^2 \right) . \tag{11.43}$$
Finally we recast the Boltzmann equation in a form that makes our intuitive Gamow criterion
explicit,
$$\frac{x}{Y_{eq}} \frac{dY}{dx} = -\frac{\Gamma_A}{H} \left[ \left( \frac{Y}{Y_{eq}} \right)^2 - 1 \right] \tag{11.44}$$
with ΓA = neq hσann vi: The relative change of Y is controlled by the factor ΓA /H times
the deviation from equilibrium. The evolution of Y = nX /s is shown schematically in
Fig. 11.4: As the universe expands and cools down, nX decreases at least as R−3 . Therefore,
the annihilation rate ∝ n² quenches and the abundance “freezes out”: The reaction rates are
no longer sufficient to keep the particle in equilibrium and the ratio nX/s stays constant.
For the discussion of approximate solutions to this equation, it is convenient to distinguish
according to the freeze-out temperature: hot dark matter (HDM) with xf ≪ 3, cold dark
matter (CDM) with xf ≫ 3 and the intermediate case of warm dark matter with xf ∼ 3.
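The freeze-out behaviour can be illustrated by integrating the evolution equation for Y numerically. The sketch below is a toy model under stated assumptions: a constant ⟨σann v⟩, s ∝ x⁻³ and H ∝ x⁻², so that the equation takes the form dY/dx = −(λ/x²)(Y² − Yeq²) with a constant λ, and a non-relativistic equilibrium abundance Yeq = a x^{3/2} e^{−x} with an arbitrary normalisation a. The function names and parameter values are hypothetical. Because the equation is stiff while Y tracks Yeq, an implicit (backward Euler) step is used.

```python
import math


def y_eq(x, a=0.1):
    """Toy equilibrium abundance for x = m/T >> 1 (a is an arbitrary normalisation)."""
    return a * x ** 1.5 * math.exp(-x)


def relic_abundance(lam, x_start=1.0, x_end=1000.0, steps=200000):
    """Integrate dY/dx = -(lam/x^2) (Y^2 - Y_eq^2) with the backward Euler method."""
    # lam plays the role of s <sigma v> / H evaluated at x = 1 (assumed constant cross section)
    h = (x_end - x_start) / steps
    y = y_eq(x_start)
    for i in range(1, steps + 1):
        x = x_start + i * h
        c = h * lam / (x * x)
        # implicit step: c*Y^2 + Y - (y + c*Y_eq^2) = 0, positive root in a stable form
        rhs = y + c * y_eq(x) ** 2
        y = 2.0 * rhs / (1.0 + math.sqrt(1.0 + 4.0 * c * rhs))
    return y


for lam in (1e5, 1e7):
    print(lam, relic_abundance(lam))
```

The final abundance is far above the equilibrium value at x = 1000 and decreases when λ, i.e. the annihilation cross section, is increased, which is the trend indicated in Fig. 11.4.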
[Figure 11.4: Y(x)/Y(x = 0) as function of x = m/T; the freeze-out point xf and the asymptotic
abundance Y∞ are marked, and Y∞ decreases for increasing σ.]
The numerical value of s0 used will be discussed in the next paragraph. Although a HDM
particle was relativistic at freeze-out, it is today non-relativistic if its mass m is m ≫ 3K ≈
0.2meV. In this case its energy density is simply ρ0 = ms0 Y∞ and its abundance Ωh2 = ρ0 /ρcr
or
$$\Omega h^2 = 7.8 \times 10^{-2}\, \frac{m}{\mathrm{eV}}\, \frac{g_{eff}}{g_{*S}} . \tag{11.47}$$
Hence HDM particles heavier than O(100eV) overclose the universe.
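The relic abundance for CDM follows from $n(x_f) = 1.66\sqrt{g_*}\,T_f^2/(\sigma_0 M_{\rm Pl})$ and $n_0 =
n(x_f)[R(x_f)/R_0]^3 = n(x_f)[g_{*,0}/g_{*,f}][T_0/T(x_f)]^3$ as
\[
  \rho_0 = m n_0 \approx 10\, \frac{x_f T_0^3}{\sqrt{g_{*,f}}\,\sigma_0 M_{\rm Pl}} \tag{11.52}
\]
or
\[
  \Omega_X h^2 = \frac{m n_0}{\rho_{\rm cr}} \approx \frac{4 \times 10^{-39}\,{\rm cm}^2}{\sigma_0}\, x_f . \tag{11.53}
\]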
Thus the abundance of a CDM particle is inversely proportional to its annihilation cross
section, since a more strongly interacting particle stays longer in equilibrium. Note that the
abundance depends only logarithmically on the mass $m$ via Eq. (11.51) and implicitly via
$g_{*,f}$ on the freeze-out temperature $T_f$. Typical values of $x_f$ found numerically for weakly
interacting massive particles (WIMPs) are $x_f \sim 20$. Partial-wave unitarity bounds $\sigma_{\rm ann}$ as
$\sigma_{\rm ann} \leq c/m^2$. Requiring $\Omega < 0.3$ leads to $m < 20$-$50$ TeV. This bounds the mass of any
stable particle that was once in thermal equilibrium.
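As a quick numerical illustration of Eq. (11.53), the sketch below evaluates the abundance for a given cross section and inverts the relation. The function names are hypothetical, and the target value $\Omega_X h^2 = 0.1$ is only an illustrative choice, not a number taken from these notes.

```python
def omega_h2(sigma0_cm2, x_f=20.0):
    """Eq. (11.53): Omega_X h^2 ~ x_f * 4e-39 cm^2 / sigma_0."""
    return x_f * 4e-39 / sigma0_cm2


def sigma_for_abundance(target_omega_h2, x_f=20.0):
    """Cross section (in cm^2) that yields the requested Omega_X h^2 via Eq. (11.53)."""
    return x_f * 4e-39 / target_omega_h2


if __name__ == "__main__":
    sigma0 = sigma_for_abundance(0.1)  # illustrative target abundance
    print(f"sigma_0 ~ {sigma0:.1e} cm^2 gives Omega_X h^2 = {omega_h2(sigma0):.2f}")
```

For $x_f \sim 20$ the required cross section comes out at roughly $10^{-36}\,{\rm cm}^2$, and halving $\sigma_0$ doubles the abundance.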
[Plot for Figure 11.5 not reproduced: log(σ/pbarn) versus log(m/GeV); labelled candidates are SM neutrinos, WIMP, axion, axino, SHDM and gravitino.]
Figure 11.5: Particles proposed as DM particle with Ω ∼ 1, the expected size of their cross
section and their mass. Red excluded; blue thermally and black non-thermally
produced.
12 Inflation and structure formation
12.1 Inflation
Shortcomings of the standard big-bang model
• Causality or horizon problem: why are even causally disconnected regions of the universe
homogeneous, as we discussed for CMB?
The horizon grows like $t$, but the scale factor in the radiation or matter dominated epoch
grows only as $t^{1/2}$ or $t^{2/3}$, respectively. Thus for any scale $l$ contained today completely inside
the horizon, there exists a time $t < t_0$ where it crossed the horizon. A solution to the
horizon problem requires that $R$ grows faster than the horizon $t$. Since $R \propto t^{2/[3(1+w)]}$,
we need $w < -1/3$ (or $q < 0$, i.e. an accelerated expansion of the universe).
• Flatness problem: the curvature term in the Friedmann equation is k/R2 . Thus this
term decreases slower than matter (∝ 1/R3 ) or radiation (1/R4 ), but faster than vacuum
energy. Let us rewrite the Friedmann equation as
\[
  \frac{k}{R^2} = H^2\left(\frac{8\pi G}{3H^2}\,\rho + \frac{\Lambda}{3H^2} - 1\right) = H^2\left(\Omega_{\rm tot} - 1\right) . \tag{12.1}
\]
The LHS scales as $(1+z)^2$, while $H^2$ scales for MD as $(1+z)^3$ and for RD
as $(1+z)^4$. General relativity is supposed to be valid until the energy scale $M_{\rm Pl}$.
Most of the time was RD, so we can estimate $1 + z_{\rm Pl} = (t_0/t_{\rm Pl})^{1/2} \sim 10^{30}$ ($t_{\rm Pl} \sim 10^{-43}$ s).
Thus if today $|\Omega_{\rm tot} - 1| \lesssim 1\%$, then the deviation had to be extremely small at $t_{\rm Pl}$,
$|\Omega_{\rm tot} - 1| \lesssim 10^{-2}/(1 + z_{\rm Pl})^2 \approx 10^{-62}$!
Taking the time derivative of
\[
  |\Omega_{\rm tot} - 1| = \frac{|k|}{H^2R^2} = \frac{|k|}{\dot R^2} \tag{12.2}
\]
gives
\[
  \frac{d}{dt}|\Omega_{\rm tot} - 1| = \frac{d}{dt}\frac{|k|}{\dot R^2} = -\frac{2|k|\ddot R}{\dot R^3} < 0 \tag{12.3}
\]
for $\ddot R > 0$. Thus $|\Omega_{\rm tot} - 1|$ increases if the universe decelerates, i.e. $\dot R$ decreases
(radiation/matter dominates), and decreases if the universe accelerates, i.e. $\dot R$ increases
(vacuum energy dominates). Thus again $q < 0$ (or $w < -1/3$) is needed.
• The standard big-bang model contains no source for the initial fluctuations required for
structure formation.
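The two numerical statements in the list above can be checked with a few lines. This is a rough sketch only: the round values $t_0 \sim 10^{17}$ s and $t_{\rm Pl} \sim 10^{-43}$ s and the treatment of the whole history as radiation dominated follow the estimate in the text, while the function names are hypothetical.

```python
def expansion_exponent(w):
    """Exponent p in R ∝ t^p for a constant equation of state w > -1: p = 2 / [3 (1 + w)]."""
    return 2.0 / (3.0 * (1.0 + w))


def accelerates(w):
    """R grows faster than the horizon (∝ t) only if p > 1, i.e. w < -1/3."""
    return expansion_exponent(w) > 1.0


def flatness_at_planck(dev_today=1e-2, t0=1e17, t_pl=1e-43):
    """|Omega_tot - 1| at t_Pl, assuming |Omega_tot - 1| ∝ 1/(1+z)^2 and 1+z = (t0/t)^(1/2)."""
    one_plus_z_pl = (t0 / t_pl) ** 0.5
    return dev_today / one_plus_z_pl**2


if __name__ == "__main__":
    for w in (1.0 / 3.0, 0.0, -0.5):
        print(f"w = {w:+.2f}: R ∝ t^{expansion_exponent(w):.3f}, accelerating: {accelerates(w)}")
    print(f"|Omega_tot - 1| at t_Pl ~ {flatness_at_planck():.0e}")
```

Radiation and matter give exponents 1/2 and 2/3, both below one, whereas any $w < -1/3$ gives an exponent above one.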
\[
  \Omega_{\rm tot} - 1 = \frac{k}{\dot R^2} \propto \exp(-2Ht) . \tag{12.4}
\]
Thus $\Omega_{\rm tot} - 1$ is driven exponentially towards zero.
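\[
  \mathcal{L} = \frac{1}{2}\, g^{\mu\nu}\nabla_\mu\phi\nabla_\nu\phi - V(\phi) , \tag{12.5}
\]
where $V(\phi)$ could also be a mass term, $V(\phi) = m^2\phi^2/2$. We recall first the expressions for the
energy density $\rho = T^{00}$ and the pressure $P$,
\[
  \rho = \frac{1}{2}\dot\phi^2 + V , \qquad P = \frac{1}{2}\dot\phi^2 - V \tag{12.6}
\]
and as equation of state
\[
  w = \frac{P}{\rho} = \frac{\dot\phi^2 - 2V(\phi)}{\dot\phi^2 + 2V(\phi)} \in [-1, 1] . \tag{12.7}
\]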
Thus a classical scalar field may act as dark energy, w < 0, leading to an accelerated expansion
of the Universe. A necessary condition is that the field is “slowly rolling”, i.e. that its kinetic
energy is smaller than its potential energy, φ̇2 /2 < V (φ).
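A short numerical check of Eqs. (12.6) and (12.7) may help here. The sketch below (with a hypothetical function name) evaluates $w$ for a homogeneous field and shows that $w < 0$ corresponds to $\dot\phi^2/2 < V$, while the stronger condition $w < -1/3$ required for accelerated expansion corresponds to $\dot\phi^2 < V$.

```python
def equation_of_state(phi_dot, V):
    """w = P / rho for a homogeneous scalar field, using Eq. (12.6)."""
    kinetic = 0.5 * phi_dot**2
    rho = kinetic + V
    pressure = kinetic - V
    return pressure / rho


if __name__ == "__main__":
    V = 1.0
    for phi_dot in (0.0, 0.5, 1.0, 2.0 ** 0.5, 10.0):
        print(f"phi_dot^2 / V = {phi_dot**2 / V:6.2f}  ->  w = {equation_of_state(phi_dot, V):+.3f}")
```

The limits $w = -1$ (potential dominated) and $w \to 1$ (kinetic dominated) reproduce the interval quoted in Eq. (12.7).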
Field equation in a FRW background We use Eq. (7.47) including a potential $V(\phi)$ (that
could also be a mass term, $V(\phi) = m^2\phi^2/2$),
\[
  \mathcal{L} = \frac{1}{2}\, g^{\mu\nu}\nabla_\mu\phi\nabla_\nu\phi - V(\phi) , \tag{12.8}
\]
to derive the equation of motion for a scalar field in a flat FRW metric, $g_{ab} =
{\rm diag}(1, -a^2, -a^2, -a^2)$, $g^{ab} = {\rm diag}(1, -a^{-2}, -a^{-2}, -a^{-2})$, and $\sqrt{|g|} = a^3$. Varying the action
\[
  S_{\rm KG} = \int_\Omega d^4x\, a^3 \left[\frac{1}{2}\dot\phi^2 - \frac{1}{2a^2}(\nabla\phi)^2 - V(\phi)\right] \tag{12.9}
\]
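gives
\[
\begin{aligned}
  \delta S_{\rm KG} &= \int_\Omega d^4x\, a^3 \left[\dot\phi\,\delta\dot\phi - \frac{1}{a^2}(\nabla\phi)\cdot\delta(\nabla\phi) - V'\delta\phi\right] \\
  &= \int_\Omega d^4x \left[-\frac{d}{dt}(a^3\dot\phi) + a\nabla^2\phi - a^3V'\right]\delta\phi \\
  &= \int_\Omega d^4x\, a^3 \left[-\ddot\phi - 3H\dot\phi + \frac{1}{a^2}\nabla^2\phi - V'\right]\delta\phi \overset{!}{=} 0 .
\end{aligned} \tag{12.10}
\]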
Thus the field equation for a Klein-Gordon field in a FRW background is
\[
  \ddot\phi + 3H\dot\phi - \frac{1}{a^2}\nabla^2\phi + V' = 0 . \tag{12.11}
\]
The term $3H\dot\phi$ acts in an expanding universe as a friction term for the oscillating $\phi$ field.
Moreover, the gradient of $\phi$ is also suppressed for increasing $a$; this term can therefore
often be neglected in an expanding universe.
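The role of $3H\dot\phi$ as a friction term can be made concrete with a small numerical experiment. The sketch below integrates Eq. (12.11) for a homogeneous field (gradient term dropped) with $V = m^2\phi^2/2$ and, as a simplifying assumption, a constant Hubble rate $H$; it compares a hand-written fourth-order Runge-Kutta solution with the exact damped-oscillator solution. The parameter values and function names are purely illustrative.

```python
import math


def evolve_field(m=1.0, H=0.1, phi0=1.0, t_end=50.0, steps=5000):
    """RK4 integration of phi'' + 3 H phi' + m^2 phi = 0 with phi(0) = phi0, phi'(0) = 0."""

    def rhs(phi, v):
        return v, -3.0 * H * v - m**2 * phi

    dt = t_end / steps
    phi, v = phi0, 0.0
    for _ in range(steps):
        k1p, k1v = rhs(phi, v)
        k2p, k2v = rhs(phi + 0.5 * dt * k1p, v + 0.5 * dt * k1v)
        k3p, k3v = rhs(phi + 0.5 * dt * k2p, v + 0.5 * dt * k2v)
        k4p, k4v = rhs(phi + dt * k3p, v + dt * k3v)
        phi += dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6.0
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
    return phi


def analytic(m, H, phi0, t):
    """Exact underdamped solution (requires m > 3H/2): envelope decays as exp(-3 H t / 2)."""
    g = 1.5 * H
    w = math.sqrt(m**2 - g**2)
    return phi0 * math.exp(-g * t) * (math.cos(w * t) + g / w * math.sin(w * t))


if __name__ == "__main__":
    print("numerical:", evolve_field())
    print("analytic: ", analytic(1.0, 0.1, 1.0, 50.0))
```

In this toy setting the oscillation amplitude is damped as $\exp(-3Ht/2)$, which is the friction effect described above.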
Number of e-foldings and slow roll conditions We can integrate $\dot R = RH$ for an arbitrary
time-evolution of $H$,
\[
  R(t) = R(t_0) \exp\left(\int dt\, H(t)\right) . \tag{12.12}
\]
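Equation (12.12) is easy to evaluate numerically for any given $H(t)$. The following sketch (hypothetical helper `efolds`, trapezoidal rule) computes the exponent $N = \int dt\, H(t)$ and checks it against two cases where the answer is known: a constant $H$, and the matter-dominated law $H = 2/(3t)$, for which $R \propto t^{2/3}$. The numerical values are illustrative only.

```python
import math


def efolds(hubble, t0, t1, steps=100_000):
    """N = integral of H dt from t0 to t1 (trapezoidal rule); then R(t1)/R(t0) = exp(N)."""
    h = (t1 - t0) / steps
    total = 0.5 * (hubble(t0) + hubble(t1))
    for i in range(1, steps):
        total += hubble(t0 + i * h)
    return total * h


if __name__ == "__main__":
    # constant H: N = H * (t1 - t0)
    print("constant H:", efolds(lambda t: 2.0, 0.0, 30.0))
    # matter domination: R(8)/R(1) = 8^(2/3) = 4
    n_md = efolds(lambda t: 2.0 / (3.0 * t), 1.0, 8.0)
    print("matter dominated, R(t1)/R(t0):", math.exp(n_md))
```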
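Solutions of the KG field equation in a FRW background Next we want to rewrite the KG
equation as the one for a harmonic oscillator with a time-dependent oscillation frequency.
We introduce first the conformal time $d\eta = dt/a$,
\[
  \dot\phi = \frac{d\phi}{dt} = \frac{d\phi}{d\eta}\frac{d\eta}{dt} = \frac{1}{a}\,\phi' , \tag{12.19}
\]
\[
  \ddot\phi = \frac{1}{a}\frac{d}{d\eta}\left(\frac{1}{a}\,\phi'\right) = \frac{1}{a^2}\,\phi'' - \frac{a'}{a^3}\,\phi' , \tag{12.20}
\]
\[
  H = \frac{\dot a}{a} = \frac{a'}{a^2} \equiv \frac{\mathcal{H}}{a} . \tag{12.21}
\]
Inserting these expressions into Eq. (12.11) and multiplying with $a^2$ gives
\[
  \phi'' + 2\mathcal{H}\phi' - \nabla^2\phi + a^2 V' = 0 . \tag{12.22}
\]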
Finally, we can eliminate the friction term $2\mathcal{H}\phi_k'$ by introducing $\phi_k(\eta) = u_k(\eta)/a$. Then a
harmonic oscillator equation for $u_k$, $u_k'' + \omega_k^2(\eta)\,u_k = 0$, with
\[
  \omega_k^2(\eta) = k^2 + m^2a^2 - \frac{a''}{a} \tag{12.26}
\]
results. You can check that the action for the field $u$ using conformal coordinates $\eta$, $x$ is
mathematically equivalent to the one of a scalar field in Minkowski space with time-dependent
mass m2eff (η) = m2 a2 − a′′ /a. This time-dependence appears, because the gravitational field
can perform work on the field u. Alternatively, we could show that “the” vacuum at different
times η is not the same, because we compare the vacuum for fields with different effective
masses, leading to particle production. For an excellent introduction into this subject see the
book by V. F. Mukhanov and S. Winitzki, "Introduction to quantum fields in gravity"; for a
free pdf file of the draft version see http://sites.google.com/site/winitzki/.
We consider now as two limiting cases the short and the long-wavelength limit. In the
first case, $k^2 + m^2a^2 \gg a''/a$, the field equation is conformally equivalent to the one in normal
Minkowski space, with solution
\[
  u_k(\eta, x) = \frac{1}{\sqrt{2k}}\left(A_k e^{-ikx} + A_k e^{ikx}\right) . \tag{12.27}
\]
In the opposite limit, $a''u_k = a u_k''$, with the solution $\phi_k = {\rm const}$. The complete solution is
given by Hankel functions $H_{3/2}(\eta)$,
\[
  u_k(\eta) = A_k e^{-ikx}\left(1 - \frac{i}{k\eta}\right) + B_k e^{ikx}\left(1 + \frac{i}{k\eta}\right) . \tag{12.28}
\]
Inserting this into the field equation (12.11) gives six terms. We evaluate first the potential
term, assuming that the potential has its minimum for φ = φ0 = 0. Then
\[
  V(\phi) = V(0) + \frac{1}{2}V''\phi^2 + O(\phi^3) \tag{12.31}
\]
and we see that the second derivative of the potential acts as an effective mass term, $m_{\rm eff}^2 = V''$
for the φ field. Thus
Taking into account that the classical term φ0 satisfies separately the field equation (12.11)
gives as equation for the fluctuations
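\[
  \left[\frac{\partial^2}{\partial t^2} - \frac{1}{a^2}\nabla^2 + 3H\frac{\partial}{\partial t} + m_\phi^2\right]\delta\phi = 0 . \tag{12.33}
\]
with $k$ as comoving wave-number. Since the proper distance varies as $ax$, the momentum is
$p = k/a$.
\[
  \ddot\phi_k + 3H\dot\phi_k + \left(\frac{k^2}{a^2} + m_\phi^2\right)\phi_k = 0 . \tag{12.35}
\]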
Comparing this equation with (12.11), we see that the fluctuations obey basically the same
equation as the average field. The only difference is the effective mass term.
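\[
  \frac{a''}{a} = \frac{2}{\eta^2} \tag{12.38}
\]
gives
\[
  u_k'' + \left(k^2 - \frac{2}{\eta^2}\right)u_k = 0 . \tag{12.39}
\]
Hence fluctuations satisfy also
\[
  u_k(\eta) = A_k e^{-ikx}\left(1 - \frac{i}{k\eta}\right) + B_k e^{ikx}\left(1 + \frac{i}{k\eta}\right) . \tag{12.40}
\]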
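That a mode function of this form solves Eq. (12.39) can be verified numerically. The sketch below is a check under one explicit assumption: the exponent written as $kx$ is read as $k\eta$, i.e. only the time dependence of the mode is kept. The second derivative is approximated by a central finite difference, and the names `mode` and `residual` are hypothetical.

```python
import cmath


def mode(eta, k, A=1.0, B=0.0):
    """u_k(eta) of the form (12.40), with the exponent read as k * eta (assumption)."""
    return (A * cmath.exp(-1j * k * eta) * (1 - 1j / (k * eta))
            + B * cmath.exp(1j * k * eta) * (1 + 1j / (k * eta)))


def residual(eta, k, A=1.0, B=0.0, h=1e-4):
    """|u'' + (k^2 - 2/eta^2) u| with u'' from a central finite difference of step h."""
    u_pp = (mode(eta + h, k, A, B) - 2 * mode(eta, k, A, B) + mode(eta - h, k, A, B)) / h**2
    return abs(u_pp + (k**2 - 2 / eta**2) * mode(eta, k, A, B))


if __name__ == "__main__":
    # conformal time is negative during a de Sitter phase
    for eta in (-5.0, -2.0, -0.5):
        print(f"eta = {eta:+.1f}: residual = {residual(eta, k=1.5, A=0.3, B=0.7):.2e}")
```

The residual is at the level of the finite-difference error for arbitrary coefficients $A_k$ and $B_k$, as expected for the general solution of a linear second-order equation.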
The function $P(k)$ is the power spectrum, but often $\Delta_\phi^2(k)$ is also called by the same name.
The spectrum of fluctuations $\Delta_\phi^2(k)$ outside of the horizon is
\[
  \Delta_\phi^2(k) = \frac{k^3}{2\pi^2}\,|\phi_k|^2 = \frac{H^2}{4\pi^2} . \tag{12.43}
\]
Hence, the power spectrum of superhorizon fluctuations is independent of the wave-number
in the approximation that $H$ is constant during inflation. The total area below the function
$\Delta_\phi^2(k) = {\rm const.}$ plotted versus $\ln(k)$ gives $\langle\phi^2(x, t)\rangle$, as shown by the last part of Eq. (12.42).
Hence a spectrum with $\Delta_\phi^2(k) = {\rm const.}$ contains the same amount of fluctuation on all angular
scales. Such a spectrum of fluctuations is called a Harrison-Zel'dovich spectrum, and is
produced by inflation in the limit of infinitely slow rolling of the inflaton.
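The result (12.43) can be reproduced from the mode functions with a few lines of arithmetic. The sketch below rests on assumptions that are not spelled out in this passage: the de Sitter scale factor $a = -1/(H\eta)$ (which satisfies $a''/a = 2/\eta^2$), the normalisation $A_k = 1/\sqrt{2k}$ taken over from Eq. (12.27), and $B_k = 0$. With these, $|u_k|^2 = [1 + 1/(k\eta)^2]/(2k)$ and $\phi_k = u_k/a$.

```python
import math


def delta2_phi(k, eta, H):
    """Delta_phi^2 = k^3 |phi_k|^2 / (2 pi^2) for the de Sitter mode function."""
    a = -1.0 / (H * eta)                            # scale factor, eta < 0 (assumption)
    u_sq = (1.0 + 1.0 / (k * eta) ** 2) / (2.0 * k)  # |u_k|^2 for A_k = 1/sqrt(2k), B_k = 0
    return k**3 * u_sq / a**2 / (2.0 * math.pi**2)


if __name__ == "__main__":
    H = 2.0
    target = H**2 / (4.0 * math.pi**2)
    for k in (0.1, 3.0):
        ratio = delta2_phi(k, eta=-1e-4, H=H) / target
        print(f"k = {k}: Delta^2 / (H^2 / 4 pi^2) = {ratio:.8f}")
```

Under these assumptions the ratio equals $1 + (k\eta)^2$, so outside the horizon ($|k\eta| \ll 1$) the spectrum is independent of $k$.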
Fluctuations in the inflaton field, $\phi = \phi_0 + \delta\phi$, lead to fluctuations in the energy-momentum
tensor $T^{ab} = T_0^{ab} + \delta T^{ab}$, and thus to metric perturbations $g^{ab} = g_0^{ab} + \delta g^{ab}$. These metric
perturbations $h^{ab}$ affect in turn all matter fields present.
[Plots for Figure 12.1 not reproduced. Left panel: potential V versus field ϕ. Right panel: scale factor a versus t with the phases inflation, reheating and hot universe; marked times are t_P ~ 10^-43 sec, t ~ 10^-35 sec and t_0 ~ 10^17 sec.]
Figure 12.1: Left: A slowly rolling scalar field as model for inflation. Right: The evolution
of the scale factor R including an inflationary phase in the early universe.
12.2 Structure formation
\[
  \frac{\delta\rho_m}{\rho_m} = 3\,\frac{\delta T}{T} ,
\]
where we used $\delta\rho_m/\rho_m = 3\,\delta T/T = 3\,\delta\rho_\gamma/(4\rho_\gamma)$.
For $t \ll t_{\rm eq}$, the adiabatic sound speed is close to $v_s = 1/\sqrt{3}$, while $v_s = 0.76/\sqrt{3}$ for $t = t_{\rm eq}$.
The Jeans mass of baryons is close to the horizon size until recombination. Then $v_s$ drops to
the value for a mono-atomic gas, $v_s^2 = 5T_b/(3m)$, where $m \sim m_H \sim 1$ GeV.
[Plot for Figure 12.2 not reproduced: P(k) (h^-3 Mpc^3) versus k (h Mpc^-1), with the corresponding scale d (h^-1 Mpc) on the upper axis covering Microwave Background, Superclusters, Clusters and Galaxies; curves: CDM (n = 1), TCDM (n = 0.8), HDM (n = 1), MDM (n = 1), together with the COBE normalization.]
Figure 12.2: Comparison of the predicted power spectrum normalized to COBE data in several
models popular around ’95 with observations: HDM (Ων = 1), CDM (Ωm = 1)
and MDM (Ωm = 0.8, Ων = 0.2).
is called the Jeans mass. It is unchanged by the expansion of the universe, since the
wave-number $k_J \propto R$ and $\rho_0 \propto 1/R^3$.
Let us compare the Jeans mass just before and after recombination,
\[
  M_J(z_{\rm eq,>}) = \frac{\pi^{5/2} v_s^3}{6\,G^{3/2}\rho^{1/2}} \sim 10^{15}\left(\Omega h^2\right)^{-2} M_\odot \tag{12.46}
\]
and
\[
  M_J(z_{\rm eq,<}) \sim 10^{5}\left(\Omega h^2\right)^{-1/2} M_\odot . \tag{12.47}
\]
The Jeans mass of baryons does not coincide with the observed mass of galaxies, nor does
the corresponding length scale fit the break in the power spectrum around $k \approx 0.04h/{\rm Mpc}$.
Figure 12.3: Acoustic baryon oscillation in the correlation function of galaxies with large
redshift of the SDSS, astro-ph/0501171.
Thus the damping scale is $\lambda_D = (l_{\rm int} l_H)^{1/2}$. If there were a baryon-dominated epoch,
then $n_e \propto \rho$ and $\rho \propto 1/t^2$, hence $l_{\rm int} l_H \propto \rho^{-1-1/2}$ and $\lambda_D \propto \rho^{-3/4}$. Finally, the corresponding
mass scale is $M_D \propto \lambda_D^3\rho \propto \rho^{-9/4}\rho \propto \rho^{-5/4}$. Numerically,
the corresponding length scale is (taking into account $\Omega_b \approx 0.04 < \Omega_m \approx 0.27$ and
$h \approx 0.7$)
\[
  \lambda_D = 3.5\,(\Omega_m/\Omega_b)^{1/2}\,(\Omega h^2)^{-3/4}\,{\rm Mpc} \approx 40\,{\rm Mpc} . \tag{12.50}
\]
Thus the Silk scale $\lambda_D$ has the right numerical value to explain the break in the power
spectrum at $k \approx 0.04h/{\rm Mpc}$. Fluctuations are damped on scales $\lambda \lesssim \lambda_D$, the more strongly the
larger $\Omega_b$. Since $M_D \gg M_J$, acoustic oscillations should be visible for $k \gtrsim k_D$ in the power
spectrum of galaxies. First evidence was found around 2005, cf. Fig. 12.3.
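The numerical value in Eq. (12.50) can be reproduced directly. In the sketch below, the $\Omega$ appearing in the factor $(\Omega h^2)^{-3/4}$ is interpreted as $\Omega_m$; this is an assumption, as are the function names. For comparison, the inverse wave-number of the break at $k \approx 0.04h/{\rm Mpc}$ is also printed.

```python
def silk_length_mpc(omega_m=0.27, omega_b=0.04, h=0.7):
    """Eq. (12.50): lambda_D = 3.5 (Omega_m/Omega_b)^(1/2) (Omega_m h^2)^(-3/4) Mpc."""
    return 3.5 * (omega_m / omega_b) ** 0.5 * (omega_m * h**2) ** (-0.75)


def inverse_break_scale_mpc(k_over_h=0.04, h=0.7):
    """1/k in Mpc for a wave-number quoted in units of h/Mpc."""
    return 1.0 / (k_over_h * h)


if __name__ == "__main__":
    print(f"lambda_D ~ {silk_length_mpc():.1f} Mpc")
    print(f"1/k at the break ~ {inverse_break_scale_mpc():.1f} Mpc")
```

Both numbers come out close to 40 Mpc, which is the agreement referred to in the text.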
\[
  \frac{\ddot R}{R} = -\frac{4\pi}{3}\,G\bar\rho\,\delta . \tag{12.51}
\]
The time evolution of the mass density is
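\[
  \frac{\ddot R}{R} = \frac{\ddot a}{a} - \frac{2}{3}\frac{\dot a}{a}\,\dot\delta - \frac{1}{3}\,\ddot\delta . \tag{12.55}
\]
\[
  \ddot\delta + 2H\dot\delta - 4\pi G\rho\,\delta = 0 . \tag{12.56}
\]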
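For a matter-dominated universe, $H = 2/(3t)$ and $\rho = 1/(6\pi G t^2)$. Inserting the trial solution
$\delta \propto t^\alpha$ gives
\[
  \alpha(\alpha - 1)\,t^{\alpha-2} + \frac{4}{3}\,\alpha\,t^{\alpha-2} - \frac{2}{3}\,t^{\alpha-2} = 0 \tag{12.57}
\]
or
\[
  \alpha^2 + \frac{1}{3}\,\alpha - \frac{2}{3} = 0 \tag{12.58}
\]
and finally $\alpha = -1$ and $2/3$. Thus the general solution $\delta(t) = At^{-1} + Bt^{2/3}$ consists of a
decaying mode $\delta \propto 1/t$ and a mode growing like $\delta \propto t^{2/3} \propto R$.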
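The algebra leading to the two modes can be double-checked numerically. The sketch below (hypothetical function names) solves the quadratic (12.58) and inserts $\delta = t^\alpha$ back into Eq. (12.56) with $H = 2/(3t)$ and $4\pi G\rho = 2/(3t^2)$.

```python
import math


def growth_exponents():
    """Roots of alpha^2 + alpha/3 - 2/3 = 0, Eq. (12.58)."""
    b, c = 1.0 / 3.0, -2.0 / 3.0
    disc = math.sqrt(b * b - 4.0 * c)
    return (-b - disc) / 2.0, (-b + disc) / 2.0


def ode_residual(alpha, t=2.0):
    """Left-hand side of Eq. (12.56) for delta = t^alpha in a matter-dominated universe."""
    d = t**alpha
    d1 = alpha * t ** (alpha - 1)
    d2 = alpha * (alpha - 1) * t ** (alpha - 2)
    hubble = 2.0 / (3.0 * t)
    four_pi_g_rho = 2.0 / (3.0 * t**2)
    return d2 + 2.0 * hubble * d1 - four_pi_g_rho * d


if __name__ == "__main__":
    decaying, growing = growth_exponents()
    print("alpha =", decaying, "and", growing)
    print("residuals:", ode_residual(decaying), ode_residual(growing))
```

A generic exponent such as $\alpha = 1$ leaves a non-zero residual, so only the two roots give solutions.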
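During the radiation-dominated epoch, with $\delta_\gamma = 0$, one can neglect the term $4\pi G\rho\delta$. With
$H = 1/(2t)$ one obtains
\[
  \ddot\delta + \frac{1}{t}\,\dot\delta = 0 \tag{12.59}
\]
with solution $\delta(t) = \delta(t_i)[1 + a\ln(t/t_i)]$. Thus perturbations grow at most logarithmically,
i.e. practically not at all, until $z_{\rm eq}$.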
Non-linear regime N-body simulations are mainly used to study structure formation on the
smallest scales, e.g. the dark matter profile of a galaxy.
• Before recombination, baryons are tightly coupled to radiation. The baryon Jeans scale
is of the order of the horizon size. After recombination, it drops by a factor $10^{10}$.
[Plot for Figure 12.4 not reproduced: galaxy power spectrum P_g(k) (h^-3 Mpc^3) versus k (h Mpc^-1) for Ω_ν = 0, 0.01 and 0.05; annotation: ⇒ Σ m_ν ≲ 2.2 eV at 95% C.L.]
Figure 12.4: Neutrino mass limits from the 2dF galaxy survey: for $\Omega_\nu \gtrsim 0.05$ there is too
little power on small scales (or, since the normalization is arbitrary, the slope is too
steep).
Recipe
• The connection between the initial perturbation spectrum $P_i(k) = |\delta_{k,i}|^2$ and the
observed power spectrum $P(k)$ today is formally given by the transfer function $T(k)$.
• Inflation predicts an initial perturbation spectrum $P_i(k) \propto k^{n_s}$ with $n_s \approx 1$; the
perturbations are generally adiabatic.
• Calculate $T(k)$.
12.2.6 Results
• The three models without cosmological constant shown in Fig. 12.2 all fail.
• The exponential suppression on small scales typical for HDM is not observed; this can be
used to derive the limit $\Omega_\nu \lesssim 0.05$ or $\sum_i m_{\nu_i} \lesssim 2.2$ eV.
• Acoustic baryon oscillations are only a tiny sub-dominant effect, but are now observed,
cf. Fig. 12.3.
160
Bibliography
[1] B.P. Abbott et al. Observation of Gravitational Waves from a Binary Black Hole Merger.
Phys. Rev. Lett., 116:061102, 2016.
[2] M. Coleman Miller. Implications of the Gravitational Wave Event GW150914. Gen. Rel.
Grav., 48:95, 2016.
[3] P.C. Peters. Gravitational Radiation and the Motion of Two Point Masses. PhD thesis,
Caltech, 1964.
[4] P.C. Peters. Gravitational Radiation and the Motion of Two Point Masses. Phys. Rev.,
136:B1224–B1232, 1964.