A Stroll Through

QUANTUM FIELDS

FRANÇOIS GELIS

Institut de Physique Théorique
CEA-Saclay
Contents

1 Basics of Quantum Field Theory
1.1 Special relativity
1.2 Free scalar fields, Mode decomposition
1.3 Interacting scalar fields
1.4 LSZ reduction formulas
1.5 From transition amplitudes to reaction rates
1.6 Generating functional
1.7 Perturbative expansion and Feynman rules
1.8 Calculation of loop integrals
1.9 Källén-Lehmann spectral representation
1.10 Ultraviolet divergences and renormalization
1.11 Spin 1/2 fields
1.12 Spin 1 fields
1.13 Abelian gauge invariance, QED
1.14 Charge conservation, Ward-Takahashi identities
1.15 Spontaneous symmetry breaking
1.16 Perturbative unitarity

2 Functional quantization
2.1 Path integral in quantum mechanics
2.2 Classical limit, Least action principle
2.3 More functional machinery
2.4 Path integral in scalar field theory
2.5 Functional determinants
2.6 Quantum effective action
2.7 Two-particle irreducible effective action
2.8 Euclidean path integral and Statistical mechanics

3 Path integrals for fermions and photons
3.1 Grassmann variables
3.2 Path integral for fermions
3.3 Path integral for photons
3.4 Schwinger-Dyson equations
3.5 Quantum anomalies

4 Non-Abelian gauge symmetry
4.1 Non-abelian Lie groups and algebras
4.2 Yang-Mills Lagrangian
4.3 Non-Abelian gauge theories
4.4 Spontaneous gauge symmetry breaking
4.5 θ-term and strong-CP problem
4.6 Non-local gauge invariant operators

5 Quantization of Yang-Mills theory
5.1 Introduction
5.2 Gauge fixing
5.3 Faddeev-Popov quantization and Ghost fields
5.4 Feynman rules for non-abelian gauge theories
5.5 On-shell non-Abelian Ward identities
5.6 Ghosts and unitarity

6 Renormalization of gauge theories
6.1 Ultraviolet power counting
6.2 Symmetries of the quantum effective action
6.3 Renormalizability
6.4 Background field method

7 Renormalization group
7.1 Callan-Symanzik equations
7.2 Correlators containing composite operators
7.3 Operator product expansion
7.4 Example: QCD corrections to weak decays
7.5 Non-perturbative renormalization group

8 Effective field theories
8.1 General principles of effective theories
8.2 Example: Fermi theory of weak decays
8.3 Standard model as an effective field theory
8.4 Effective theories in QCD
8.5 EFT of spontaneous symmetry breaking

9 Quantum anomalies
9.1 Axial anomalies in a gauge background
9.2 Generalizations
9.3 Wess-Zumino consistency conditions
9.4 ’t Hooft anomaly matching
9.5 Scale anomalies

10 Localized field configurations
10.1 Domain walls
10.2 Skyrmions
10.3 Monopoles
10.4 Instantons

11 Modern tools for tree level amplitudes
11.1 Shortcomings of the usual approach
11.2 Colour ordering of gluonic amplitudes
11.3 Spinor-helicity formalism
11.4 Britto-Cachazo-Feng-Witten on-shell recursion
11.5 Tree-level gravitational amplitudes
11.6 Cachazo-Svrcek-Witten rules

12 Worldline formalism
12.1 Worldline representation
12.2 Quantum electrodynamics
12.3 Schwinger mechanism
12.4 Calculation of one-loop amplitudes

13 Lattice field theory
13.1 Discretization of bosonic actions
13.2 Fermions
13.3 Hadron mass determination on the lattice
13.4 Wilson loops and confinement
13.5 Gauge fixing on the lattice
13.6 Lattice worldline formalism

14 Quantum field theory at finite temperature
14.1 Canonical thermal ensemble
14.2 Finite-T perturbation theory
14.3 Large distance effective theories
14.4 Out-of-equilibrium systems

15 Strong fields and semi-classical methods
15.1 Introduction
15.2 Expectation values in a coherent state
15.3 Quantum field theory with external sources
15.4 Observables at LO and NLO
15.5 Green’s formulas
15.6 Mode functions
15.7 Multi-point correlation functions at tree level

Chapter 1

Basics of Quantum Field Theory

1.1 Special relativity

1.1.1 Lorentz transformations

Special relativity plays a crucial role in quantum field theories¹. Various observers in
frames that are moving at a constant velocity relative to each other should be able to
describe physical phenomena using the same laws of physics. This does not imply
that the equations governing these phenomena are independent of the observer’s
frame, but that these equations transform in a constrained fashion (depending on the
nature of the objects they contain) under a change of reference frame.
Let us consider two frames $F$ and $F'$, in which the coordinates of a given event
are respectively $x^\mu$ and $x'^\mu$. A Lorentz transformation is a linear transformation such
that the interval $ds^2 \equiv dt^2 - d\mathbf{x}^2$ is the same in the two frames². If we denote the
coordinate transformation by

$$ x'^\mu = \Lambda^\mu{}_\nu\, x^\nu , \tag{1.1} $$

the matrix $\Lambda$ of the transformation must obey

$$ g_{\mu\nu}\,\Lambda^\mu{}_\rho\,\Lambda^\nu{}_\sigma = g_{\rho\sigma} , \tag{1.2} $$

where $g_{\mu\nu}$ is the Minkowski metric tensor

$$ g_{\mu\nu} \equiv \begin{pmatrix} +1 & & & \\ & -1 & & \\ & & -1 & \\ & & & -1 \end{pmatrix} . \tag{1.3} $$

¹ An exception to this assertion is for quantum field models applied to condensed matter physics, where
the basic degrees of freedom are to a very good level of approximation described by Galilean kinematics.

² The physical premises of special relativity require that the speed of light be the same in all inertial
frames, which implies solely that $ds^2 = 0$ be preserved in all inertial frames. The group of transformations
that achieves this is called the conformal group. In four space-time dimensions, the conformal group is 15
dimensional, and in addition to the 6 orthochronous Lorentz transformations it contains dilatations as well
as non-linear transformations called special conformal transformations.

Note that eq. (1.2) implies that

$$ \Lambda_\mu{}^\nu = \left(\Lambda^{-1}\right)^\nu{}_\mu . \tag{1.4} $$
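
Eqs. (1.2) and (1.4) can be checked numerically on a concrete example. The sketch below is only an illustration: the boost along the $x^1$ axis, the rapidity value 0.7 and the helper name `boost_x` are choices made here, not taken from the text.

```python
import numpy as np

# Minkowski metric of eq. (1.3), signature (+,-,-,-)
g = np.diag([1.0, -1.0, -1.0, -1.0])

def boost_x(rapidity):
    """Boost along the x^1 axis as a matrix Lambda^mu_nu (illustrative helper)."""
    ch, sh = np.cosh(rapidity), np.sinh(rapidity)
    lam = np.eye(4)
    lam[0, 0] = lam[1, 1] = ch
    lam[0, 1] = lam[1, 0] = sh
    return lam

lam = boost_x(0.7)

# Eq. (1.2) in matrix form: Lambda^T g Lambda = g
metric_preserved = np.allclose(lam.T @ g @ lam, g)

# Eq. (1.4): lowering the first index and raising the second one,
# (g Lambda g^{-1})[mu, nu], must equal the transposed inverse of Lambda.
lam_lowered_raised = g @ lam @ np.linalg.inv(g)
inverse_relation = np.allclose(lam_lowered_raised, np.linalg.inv(lam).T)

print(metric_preserved, inverse_relation)
```

Any other matrix obeying (1.2), for instance a rotation or a product of boosts, can be substituted for `lam` in the same check.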

If we consider an infinitesimal Lorentz transformation,

$$ \Lambda^\mu{}_\nu = \delta^\mu{}_\nu + \omega^\mu{}_\nu \tag{1.5} $$

(with all components of $\omega$ much smaller than unity), this implies that

$$ \omega_{\mu\nu} = -\omega_{\nu\mu} \tag{1.6} $$

(with all indices down). Consequently, there are 6 independent Lorentz transformations,
three of which are ordinary rotations and three are boosts. Note that the
infinitesimal transformations (1.5) have a determinant³ equal to +1 (they are called
proper transformations), and do not change the direction of the time axis since
$\Lambda^0{}_0 = 1 \ge 0$ (they are called orthochronous). Any combination of such infinitesimal
transformations shares the same properties, and their set forms a subgroup of the full
group of transformations that preserve the Minkowski metric.

³ From eq. (1.2), the determinant may be equal to ±1.

1.1.2 Representations of the Lorentz group

More generally, a Lorentz transformation acts on a quantum system via a transformation
$U(\Lambda)$ that forms a representation of the Lorentz group, i.e.

$$ U(\Lambda\Lambda') = U(\Lambda)\,U(\Lambda') . \tag{1.7} $$

For an infinitesimal Lorentz transformation, we can write

$$ U(1+\omega) = I + \frac{i}{2}\,\omega_{\mu\nu}M^{\mu\nu} . \tag{1.8} $$

(The prefactor i/2 in the second term of the right hand side is conventional.) Since
the $\omega_{\mu\nu}$ are antisymmetric, the generators $M^{\mu\nu}$ can also be chosen antisymmetric.
By using eq. (1.7) for the Lorentz transformation $\Lambda^{-1}\Lambda'\Lambda$, we arrive at

$$ U^{-1}(\Lambda)\,M^{\mu\nu}\,U(\Lambda) = \Lambda^\mu{}_\rho\,\Lambda^\nu{}_\sigma\,M^{\rho\sigma} , \tag{1.9} $$

indicating that $M^{\mu\nu}$ transforms as a rank-2 tensor. When used with an infinitesimal
transformation $\Lambda = 1 + \omega$, this identity leads to the commutation relation that defines
the Lie algebra of the Lorentz group

$$ \big[M^{\mu\nu}, M^{\rho\sigma}\big] = i\big(g^{\mu\rho}M^{\nu\sigma} - g^{\nu\rho}M^{\mu\sigma}\big) - i\big(g^{\mu\sigma}M^{\nu\rho} - g^{\nu\sigma}M^{\mu\rho}\big) . \tag{1.10} $$
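
The algebra (1.10) can be checked on an explicit set of matrices. The sketch below assumes the 4-vector representation $(M^{\mu\nu})^\alpha{}_\beta = -i\,(g^{\mu\alpha}\delta^\nu{}_\beta - g^{\nu\alpha}\delta^\mu{}_\beta)$, which is not written in the text but is the choice for which eq. (1.8) reduces to eq. (1.5). The function names are illustrative.

```python
import itertools
import numpy as np

g_inv = np.diag([1.0, -1.0, -1.0, -1.0])  # g^{mu nu}, signature (+,-,-,-)
delta = np.eye(4)

def vector_generator(mu, nu):
    """(M^{mu nu})^alpha_beta = -i (g^{mu alpha} delta^nu_beta - g^{nu alpha} delta^mu_beta).

    Assumed 4-vector representation (not given in the text)."""
    return -1j * (np.outer(g_inv[mu], delta[nu]) - np.outer(g_inv[nu], delta[mu]))

M = [[vector_generator(mu, nu) for nu in range(4)] for mu in range(4)]

def commutator(a, b):
    return a @ b - b @ a

def lie_algebra_violation():
    """Largest entry of (lhs - rhs) of eq. (1.10) over all index choices."""
    worst = 0.0
    for mu, nu, rho, sigma in itertools.product(range(4), repeat=4):
        lhs = commutator(M[mu][nu], M[rho][sigma])
        rhs = (1j * (g_inv[mu, rho] * M[nu][sigma] - g_inv[nu, rho] * M[mu][sigma])
               - 1j * (g_inv[mu, sigma] * M[nu][rho] - g_inv[nu, sigma] * M[mu][rho]))
        worst = max(worst, np.abs(lhs - rhs).max())
    return worst

# Eq. (1.8) in this representation: I + (i/2) omega_{mu nu} M^{mu nu}
# should reproduce delta^mu_nu + omega^mu_nu of eq. (1.5).
rng = np.random.default_rng(0)
a = 1e-3 * rng.normal(size=(4, 4))
omega_low = a - a.T                      # antisymmetric omega_{mu nu}, eq. (1.6)
u = np.eye(4) + 0.5j * sum(omega_low[mu, nu] * M[mu][nu]
                           for mu in range(4) for nu in range(4))
omega_mixed = g_inv @ omega_low          # omega^mu_nu = g^{mu alpha} omega_{alpha nu}
reproduces_eq_1_5 = np.allclose(u, np.eye(4) + omega_mixed)

print(lie_algebra_violation(), reproduces_eq_1_5)
```

The violation should vanish up to rounding errors. The check uses only the index structure of (1.10), so it does not depend on the signature chosen for the metric.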

When necessary, it is possible to divide the six generators $M^{\mu\nu}$ into three generators
$J^i$ for ordinary spatial rotations, and three generators $K^i$ for the Lorentz boosts along
each of the spatial directions:

$$ \text{Rotations:}\quad J^i \equiv \tfrac{1}{2}\,\epsilon^{ijk}M^{jk} , \qquad\quad \text{Lorentz boosts:}\quad K^i \equiv M^{i0} . \tag{1.11} $$

In a fashion similar to eq. (1.9), we obtain the transformation of the 4-momentum $P^\mu$,

$$ U^{-1}(\Lambda)\,P^\mu\,U(\Lambda) = \Lambda^\mu{}_\rho\,P^\rho , \tag{1.12} $$

which leads to the following commutation relations between $P^\mu$ and $M^{\rho\sigma}$,

$$ \big[P^\mu, M^{\rho\sigma}\big] = i\big(g^{\mu\sigma}P^\rho - g^{\mu\rho}P^\sigma\big) , \qquad \big[P^\mu, P^\nu\big] = 0 . \tag{1.13} $$

1.1.3 One-particle states


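Let us denote by $|\mathbf{p},\sigma\rangle$ a one-particle state, where $\mathbf{p}$ is the 3-momentum of that particle,
and $\sigma$ denotes its other quantum numbers. Since this state contains a particle with a
definite momentum, it is an eigenstate of the momentum operator $P^\mu$, namely

$$ P^\mu\,|\mathbf{p},\sigma\rangle = p^\mu\,|\mathbf{p},\sigma\rangle , \quad\text{with}\quad p^0 \equiv \sqrt{\mathbf{p}^2 + m^2} . \tag{1.14} $$

Consider now the state $U(\Lambda)|\mathbf{p},\sigma\rangle$. We have

$$ P^\mu\,U(\Lambda)|\mathbf{p},\sigma\rangle = U(\Lambda)\,\underbrace{U^{-1}(\Lambda)P^\mu U(\Lambda)}_{\Lambda^\mu{}_\nu P^\nu}\,|\mathbf{p},\sigma\rangle = \Lambda^\mu{}_\nu\,p^\nu\;U(\Lambda)|\mathbf{p},\sigma\rangle . \tag{1.15} $$

Therefore, $U(\Lambda)|\mathbf{p},\sigma\rangle$ is an eigenstate of momentum with eigenvalue $(\Lambda p)^\mu$, and
we may write it as a linear combination of all the states with momentum $\Lambda p$,

$$ U(\Lambda)|\mathbf{p},\sigma\rangle = \sum_{\sigma'} C_{\sigma\sigma'}(\Lambda;p)\,|\Lambda p,\sigma'\rangle . \tag{1.16} $$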

1.1.4 Little group


Any positive energy on-shell momentum $p^\mu$ can be obtained by applying an orthochronous
Lorentz transformation to some reference momentum $q^\mu$ that lives on
the same mass-shell,

$$ p^\mu \equiv L^\mu{}_\nu(p)\,q^\nu . \tag{1.17} $$

The choice of the reference 4-vector is not important, but depends on whether the
particle under consideration is massive or not. Convenient choices are the following:

• $m > 0$ : $q^\mu \equiv (m, 0, 0, 0)$, the 4-momentum of a massive particle at rest,

• $m = 0$ : $q^\mu \equiv (\omega, 0, 0, \omega)$, the 4-momentum of a massless particle moving in
the third direction of space.

Then, we may define a generic one-particle state from those corresponding to the
reference momentum as follows

$$ |\mathbf{p},\sigma\rangle \equiv N_p\,U(L(p))\,|\mathbf{q},\sigma\rangle , \tag{1.18} $$

where $N_p$ is a numerical prefactor that may be necessary to properly normalize the
states. This definition leads to

$$ U(\Lambda)|\mathbf{p},\sigma\rangle = N_p\,U\big(L(\Lambda p)\big)\,U\big(\underbrace{L^{-1}(\Lambda p)\,\Lambda\,L(p)}_{\Sigma}\big)\,|\mathbf{q},\sigma\rangle . \tag{1.19} $$

Note that the Lorentz transformation $\Sigma \equiv L^{-1}(\Lambda p)\,\Lambda\,L(p)$ maps $q^\mu$ into itself, and
therefore belongs to the subgroup of the Lorentz group that leaves $q^\mu$ invariant, called
the little group of $q^\mu$. Thus, when $U(\Sigma)$ acts on the reference state, the momentum
remains unchanged and only the other quantum numbers may vary

$$ U(\Sigma)|\mathbf{q},\sigma\rangle = \sum_{\sigma'} C_{\sigma\sigma'}(\Sigma)\,|\mathbf{q},\sigma'\rangle . \tag{1.20} $$

Moreover, the coefficients $C_{\sigma\sigma'}(\Sigma)$ in the right hand side of this formula define a
representation of the little group,

$$ C_{\sigma\sigma'}(\Sigma_2\Sigma_1) = \sum_{\sigma''} C_{\sigma\sigma''}(\Sigma_2)\,C_{\sigma''\sigma'}(\Sigma_1) . \tag{1.21} $$
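
To make the construction of $\Sigma$ concrete, here is a small numerical sketch for the massive case. It assumes that $L(p)$ is the pure boost taking $(m,0,0,0)$ to $p^\mu$ (one common choice, the text leaves $L(p)$ unspecified) and takes an arbitrary boost for $\Lambda$. The momenta are illustrative values.

```python
import numpy as np

g = np.diag([1.0, -1.0, -1.0, -1.0])   # Minkowski metric, eq. (1.3)
m = 1.0                                 # particle mass (arbitrary units)
q = np.array([m, 0.0, 0.0, 0.0])        # reference momentum for m > 0

def standard_boost(p3):
    """Pure boost L(p) with L(p) q = (E_p, p); an assumed choice of L(p)."""
    p3 = np.asarray(p3, dtype=float)
    energy = np.sqrt(p3 @ p3 + m * m)
    boost = np.eye(4)
    boost[0, 0] = energy / m
    boost[0, 1:] = boost[1:, 0] = p3 / m
    boost[1:, 1:] += np.outer(p3, p3) / (m * (energy + m))
    return boost

p3 = np.array([0.3, -1.2, 0.5])         # 3-momentum of the state
lam = standard_boost([0.8, 0.0, 0.4])   # the transformation Lambda (here a boost)

p = standard_boost(p3) @ q              # eq. (1.17)
lam_p = lam @ p                         # transformed 4-momentum (Lambda p)
# Sigma = L^{-1}(Lambda p) Lambda L(p), as in eq. (1.19)
sigma = np.linalg.inv(standard_boost(lam_p[1:])) @ lam @ standard_boost(p3)

leaves_q_invariant = np.allclose(sigma @ q, q)
is_lorentz = np.allclose(sigma.T @ g @ sigma, g)
rotation = sigma[1:, 1:]                # spatial block of Sigma
is_rotation = (np.allclose(rotation @ rotation.T, np.eye(3))
               and np.isclose(np.linalg.det(rotation), 1.0))
print(leaves_q_invariant, is_lorentz, is_rotation)
```

For two non-collinear boosts the spatial block of `sigma` is a non-trivial rotation, which is the little group element discussed for massive particles in the next paragraph.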

Massive particles : In the case of a massive particle, the little group is made of
the Lorentz transformations that leave the vector $q^\mu = (m, 0, 0, 0)$ invariant, which
is the group of all rotations in 3-dimensional space. The additional quantum number
$\sigma$ is therefore a label that enumerates the possible states in a given representation of
SO(3). These representations correspond to the angular momentum, but since we are
in the rest frame of the particle, this is in fact the spin of the particle. For a spin $s$, the
dimension of the representation is $2s + 1$, and $\sigma$ takes the values $-s, 1 - s, \cdots, +s$.


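Massless particles : In the massless case, we look for Lorentz transformations
$\Sigma^\mu{}_\nu$ that leave $q^\nu = (\omega, 0, 0, \omega)$ invariant. For an infinitesimal transformation,
$\Sigma^\mu{}_\nu \approx \delta^\mu{}_\nu + \omega^\mu{}_\nu$, this gives the following general form

$$ \omega_{\mu\nu} = \begin{pmatrix} 0 & \alpha_1 & \alpha_2 & 0 \\ -\alpha_1 & 0 & -\theta & \alpha_1 \\ -\alpha_2 & \theta & 0 & \alpha_2 \\ 0 & -\alpha_1 & -\alpha_2 & 0 \end{pmatrix} , \tag{1.22} $$

where $\alpha_1, \alpha_2, \theta$ are three real infinitesimal parameters. Thus, an infinitesimal
transformation $U(\Sigma)$ reads

$$ U(\Sigma) \approx 1 - i\theta\,\underbrace{M^{12}}_{J^3} - i\alpha_1\,\underbrace{\big(M^{10} + M^{31}\big)}_{K^1 + J^2 \,\equiv\, B^1} - i\alpha_2\,\underbrace{\big(M^{20} - M^{23}\big)}_{K^2 - J^1 \,\equiv\, B^2} . \tag{1.23} $$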
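
The form (1.22) can be checked directly: raising the first index gives $\omega^\mu{}_\nu$, which must annihilate $q^\nu$, and exponentiating it should give a finite Lorentz transformation that leaves $q^\nu$ unchanged. In the sketch below the parameter values are arbitrary, and the exponential is summed as a plain Taylor series to keep the example dependent on NumPy only.

```python
import numpy as np

g = np.diag([1.0, -1.0, -1.0, -1.0])     # metric (equal to its own inverse)
energy = 2.5                              # the energy called omega in q^mu
q = np.array([energy, 0.0, 0.0, energy])  # massless reference momentum

def omega_lower(alpha1, alpha2, theta):
    """omega_{mu nu} of eq. (1.22)."""
    return np.array([
        [0.0,      alpha1,  alpha2, 0.0],
        [-alpha1,  0.0,    -theta,  alpha1],
        [-alpha2,  theta,   0.0,    alpha2],
        [0.0,     -alpha1, -alpha2, 0.0],
    ])

def exp_series(matrix, terms=60):
    """Matrix exponential by its Taylor series (adequate for small matrices)."""
    result = np.eye(len(matrix))
    term = np.eye(len(matrix))
    for n in range(1, terms):
        term = term @ matrix / n
        result = result + term
    return result

omega_mixed = g @ omega_lower(0.3, -0.2, 0.9)   # omega^mu_nu, first index raised
sigma = exp_series(omega_mixed)                 # finite little-group element

annihilates_q = np.allclose(omega_mixed @ q, 0.0)
leaves_q_invariant = np.allclose(sigma @ q, q)
is_lorentz = np.allclose(sigma.T @ g @ sigma, g)
print(annihilates_q, leaves_q_invariant, is_lorentz)
```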

Thus, the little group for massless particles is three dimensional, with generators $J^3$
(the projection of the angular momentum in the direction of the momentum) and⁴
$B^{1,2}$. Using eq. (1.10), we have

$$ \big[J^3, B^1\big] = i\,B^2 , \qquad \big[J^3, B^2\big] = -i\,B^1 , \qquad \big[B^1, B^2\big] = 0 . \tag{1.24} $$

The last commutator implies that we may choose states that are simultaneous
eigenstates of $B^1$ and $B^2$. However, non-zero eigenvalues for $B^{1,2}$ would lead to a
continuum of states with the same momentum, which are not realized in Nature. The
remaining transformation, generated by $J^3$, can be viewed as a rotation about the
direction of the momentum, and the corresponding group is SO(2). Therefore, the
only eigenvalue that labels the massless states is that of $J^3$,

$$ J^3\,|\mathbf{q},\sigma\rangle = \sigma\,|\mathbf{q},\sigma\rangle , \qquad U(\Sigma)\big|_{\alpha_{1,2}=0}\,|\mathbf{q},\sigma\rangle = e^{-i\sigma\theta}\,|\mathbf{q},\sigma\rangle . \tag{1.25} $$

The number $\sigma$ is called the helicity of the particle. After a rotation of angle $\theta = 2\pi$,
the state must return to itself (bosons) or its opposite (fermions), implying that the
helicity must be an integer or a half-integer:

$$ \text{bosons :}\quad \sigma = 0, \pm 1, \pm 2, \cdots \qquad\quad \text{fermions :}\quad \sigma = \pm\tfrac{1}{2}, \pm\tfrac{3}{2}, \cdots \tag{1.26} $$

⁴ The generators $B^{1,2}$ are the generators of Galilean boosts in the $(x^1, x^2)$ plane transverse to the
particle momentum, i.e. the transformations that shift the transverse velocity, $v^j \to v^j + \delta v^j$. The physical
reason for their appearance in the discussion of massless particles is time dilation: in the observer’s frame,
the transverse dynamics of a particle moving at the speed of light is infinitely slowed down by time dilation,
and is therefore non-relativistic (this intuitive idea can be further substantiated by light-cone quantization).


1.1.5 Scalar field


A scalar field $\phi(x)$ is a (number or operator valued) object that depends on a spacetime
coordinate $x$ and is invariant under a Lorentz transformation, except for the change of
coordinate induced by the transformation:

$$ U^{-1}(\Lambda)\,\phi(x)\,U(\Lambda) = \phi(\Lambda^{-1}x) . \tag{1.27} $$

This formula just reflects the fact that the point $x$ where the transformed field is
evaluated was located at the point $\Lambda^{-1}x$ before the transformation. The first derivative
$\partial^\mu\phi$ of the field transforms as a 4-vector,

$$ U^{-1}(\Lambda)\,\partial^\mu\phi(x)\,U(\Lambda) = \Lambda^\mu{}_\nu\,\bar\partial^\nu\phi(\Lambda^{-1}x) , \tag{1.28} $$

where the bar in $\bar\partial^\nu$ indicates that we are differentiating with respect to the whole
argument of $\phi$, i.e. $\Lambda^{-1}x$. Likewise, the second derivative $\partial^\mu\partial^\nu\phi$ transforms like a
rank-2 tensor, but the d’Alembertian $\square\phi$ transforms as a scalar.

1.2 Free scalar fields, Mode decomposition

1.2.1 Quantum harmonic oscillators


Let us consider a continuous collection of quantum harmonic oscillators, each of them
corresponding to particles with a given momentum $\mathbf{p}$. These harmonic oscillators
can be defined by a pair of creation and annihilation operators $a^\dagger_{\mathbf{p}}, a_{\mathbf{p}}$, where $\mathbf{p}$ is a
3-momentum that labels the corresponding mode. Note that the energy of the particles
is fixed from their 3-momentum by the relativistic dispersion relation,

$$ p^0 = E_{\mathbf{p}} \equiv \sqrt{\mathbf{p}^2 + m^2} . \tag{1.29} $$

The operators creating or destroying particles with a given momentum $\mathbf{p}$ obey usual
commutation relations,

$$ \big[a_{\mathbf{p}}, a_{\mathbf{p}}\big] = \big[a^\dagger_{\mathbf{p}}, a^\dagger_{\mathbf{p}}\big] = 0 , \qquad \big[a_{\mathbf{p}}, a^\dagger_{\mathbf{p}}\big] \sim 1 . \tag{1.30} $$

(In the last commutator, the precise normalization will be defined later.) In contrast,
operators acting on different momenta always commute:

$$ \big[a_{\mathbf{p}}, a_{\mathbf{q}}\big] = \big[a^\dagger_{\mathbf{p}}, a^\dagger_{\mathbf{q}}\big] = \big[a_{\mathbf{p}}, a^\dagger_{\mathbf{q}}\big] = 0 \qquad (\mathbf{p} \neq \mathbf{q}) . \tag{1.31} $$

If we denote by $H$ the Hamiltonian operator of such a system, the property that
$a^\dagger_{\mathbf{p}}$ creates a particle of momentum $\mathbf{p}$ (and therefore of energy $E_{\mathbf{p}}$) implies that

$$ \big[H, a^\dagger_{\mathbf{p}}\big] = +E_{\mathbf{p}}\,a^\dagger_{\mathbf{p}} . \tag{1.32} $$


Likewise, since $a_{\mathbf{p}}$ destroys a particle with the same energy, we have

$$ \big[H, a_{\mathbf{p}}\big] = -E_{\mathbf{p}}\,a_{\mathbf{p}} . \tag{1.33} $$

(Implicit in these equations is the fact that particles are non-interacting, so that
adding or removing a particle of momentum $\mathbf{p}$ does not affect the rest of the system.)
In these lectures, we will adopt the following normalization for the free Hamiltonian⁵,

$$ H = \int \frac{d^3p}{(2\pi)^3\,2E_{\mathbf{p}}}\;E_{\mathbf{p}}\,\Big(a^\dagger_{\mathbf{p}}a_{\mathbf{p}} + V\,E_{\mathbf{p}}\Big) , \tag{1.34} $$

where $V$ is the volume of the system. To make contact with the usual treatment⁶ of a
harmonic oscillator in quantum mechanics, it is useful to introduce the occupation
number $f_{\mathbf{p}}$ defined by

$$ 2E_{\mathbf{p}}\,V\,f_{\mathbf{p}} \equiv a^\dagger_{\mathbf{p}}a_{\mathbf{p}} . \tag{1.35} $$

In terms of $f_{\mathbf{p}}$, the above Hamiltonian reads

$$ H = V\int \frac{d^3p}{(2\pi)^3}\;E_{\mathbf{p}}\,\Big(f_{\mathbf{p}} + \tfrac{1}{2}\Big) . \tag{1.36} $$

The expectation value of $f_{\mathbf{p}}$ has the interpretation of the number of particles per unit
of phase-space (i.e. per unit of volume in coordinate space and per unit of volume
in momentum space), and the 1/2 in $f_{\mathbf{p}} + \tfrac{1}{2}$ is the ground state occupation of each
oscillator⁷. Of course, this additive constant is to a large extent irrelevant since only
energy differences have a physical meaning. Given eq. (1.34), the commutation
relations (1.32) and (1.33) are fulfilled provided that

$$ \big[a_{\mathbf{p}}, a^\dagger_{\mathbf{q}}\big] = (2\pi)^3\,2E_{\mathbf{p}}\,\delta(\mathbf{p} - \mathbf{q}) . \tag{1.37} $$

⁵ In a relativistic setting, the measure $d^3p/((2\pi)^3\,2E_{\mathbf{p}})$ has the important benefit of being Lorentz
invariant. Moreover, it results naturally from the 4-dimensional momentum integration $d^4p/(2\pi)^4$
constrained by the positive energy mass-shell condition $2\pi\,\theta(p^0)\,\delta(p^2 - m^2)$.

⁶ In relativistic quantum field theory, it is customary to use a system of units in which $\hbar = 1$, $c = 1$ (and
also $k_B = 1$ when the Boltzmann constant is needed to relate energies and temperature). In this system of
units, the action $S$ is dimensionless. Mass, energy, momentum and temperature have the same dimension,
which is the inverse of the dimension of length and duration:

$$ [\text{mass}] = [\text{energy}] = [\text{momentum}] = [\text{temperature}] = [\text{length}^{-1}] = [\text{duration}^{-1}] . $$

Moreover, in four dimensions, the creation and annihilation operators introduced in eq. (1.34) have the
dimension of an inverse energy:

$$ [a_{\mathbf{p}}] = [a^\dagger_{\mathbf{p}}] = [\text{energy}^{-1}] $$

(the occupation number $f_{\mathbf{p}}$ is dimensionless).

⁷ This is reminiscent of the fact that the energy of the level $n$ in a quantized harmonic oscillator of base
energy $\omega$ is $E_n = (n + \tfrac{1}{2})\,\omega$.

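
The Lorentz invariance of the measure $d^3p/2E_{\mathbf{p}}$ can be illustrated numerically. For a boost along the third axis the transverse momentum is unchanged, so it is enough to check that $dp'_z/E' = dp_z/E$. The sketch below estimates the Jacobian by a finite difference; the function names, the sample momenta and the rapidities are illustrative choices.

```python
import numpy as np

m = 1.0  # particle mass (arbitrary units)

def energy(pz, pt2=0.0):
    """On-shell energy E_p for longitudinal momentum pz and transverse p_T^2."""
    return np.sqrt(pz * pz + pt2 + m * m)

def boosted_pz(pz, rapidity, pt2=0.0):
    """Longitudinal momentum after a boost along z (transverse part unchanged)."""
    return np.cosh(rapidity) * pz + np.sinh(rapidity) * energy(pz, pt2)

def measure_ratio(pz, rapidity, pt2=0.0, h=1e-6):
    """(dp'_z / E') / (dp_z / E), estimated with a central finite difference."""
    jacobian = (boosted_pz(pz + h, rapidity, pt2)
                - boosted_pz(pz - h, rapidity, pt2)) / (2 * h)
    pz_new = boosted_pz(pz, rapidity, pt2)
    # the boosted momentum is on shell with the same transverse momentum
    return jacobian * energy(pz, pt2) / energy(pz_new, pt2)

ratios = [measure_ratio(pz, y, pt2)
          for pz in (-2.0, 0.0, 0.7)
          for y in (-1.5, 0.4, 2.0)
          for pt2 in (0.0, 3.0)]
print(max(abs(r - 1.0) for r in ratios))
```

The printed deviation from 1 should be limited by the accuracy of the finite difference.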

1.2.2 Scalar field operator, Canonical commutation relations

Note that in quantum mechanics, a particle with a well defined momentum $\mathbf{p}$ is not
localized at a specific point in space, due to the uncertainty principle. Thus, when we
say that $a^\dagger_{\mathbf{p}}$ creates a particle of momentum $\mathbf{p}$, this production process may happen
anywhere in space and at any time since the energy is also well defined. Instead of
using the momentum basis, one may introduce an operator that depends on space-time
in order to give preeminence to the time and position at which a particle is created or
destroyed. It is possible to encapsulate all the $a_{\mathbf{p}}, a^\dagger_{\mathbf{p}}$ into the following Hermitean
operator⁸

$$ \phi(x) \equiv \int \frac{d^3p}{(2\pi)^3\,2E_{\mathbf{p}}}\,\Big[a^\dagger_{\mathbf{p}}\,e^{+ip\cdot x} + a_{\mathbf{p}}\,e^{-ip\cdot x}\Big] , \tag{1.38} $$

where $p\cdot x \equiv p_\mu x^\mu$ with $p^0 = +E_{\mathbf{p}}$. In the following, we will also need the time
derivative of this operator, denoted $\Pi(x)$,

$$ \Pi(x) \equiv \partial_0\phi(x) = i\int \frac{d^3p}{(2\pi)^3\,2E_{\mathbf{p}}}\;E_{\mathbf{p}}\,\Big[a^\dagger_{\mathbf{p}}\,e^{+ip\cdot x} - a_{\mathbf{p}}\,e^{-ip\cdot x}\Big] . \tag{1.39} $$

Given the commutation relation (1.37), we obtain the following equal-time
commutation relations for $\phi$ and $\Pi$,

$$ \big[\phi(x), \phi(y)\big]_{x^0=y^0} = \big[\Pi(x), \Pi(y)\big]_{x^0=y^0} = 0 , \qquad \big[\phi(x), \Pi(y)\big]_{x^0=y^0} = i\,\delta(\mathbf{x} - \mathbf{y}) . \tag{1.40} $$

These are called the canonical field commutation relations. In this approach (known
as canonical quantization), the quantization of a field theory corresponds to promoting
the classical Poisson bracket between a dynamical variable and its conjugate
momentum to a commutator:

$$ \big\{Q_i, P_j\big\} = \delta_{ij} \quad\to\quad \big[\hat{Q}_i, \hat{P}_j\big] = i\hbar\,\delta_{ij} . \tag{1.41} $$

In addition to these relations that hold for equal times, one may prove that $\phi(x)$ and
$\Pi(y)$ commute for space-like intervals $(x - y)^2 < 0$. Physically, this is related to the
absence of causal relation between two measurements performed at space-time points
with a space-like separation.
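
As a worked check, the non-trivial equal-time commutator can be obtained in a few lines from the mode decompositions of $\phi$ and $\Pi$ and the commutator of $a_{\mathbf{p}}$ with $a^\dagger_{\mathbf{q}}$. The shorthand $\int_p \equiv \int d^3p/((2\pi)^3 2E_{\mathbf{p}})$ is introduced here only for brevity; at equal times, $p\cdot(x-y) = -\mathbf{p}\cdot(\mathbf{x}-\mathbf{y})$.

```latex
\begin{align*}
\big[\phi(x),\Pi(y)\big]_{x^0=y^0}
 &= i\int_{p}\int_{q} E_{\mathbf{q}}\,\Big\{
      -\big[a^\dagger_{\mathbf{p}},a_{\mathbf{q}}\big]\,e^{+i(p\cdot x-q\cdot y)}
      +\big[a_{\mathbf{p}},a^\dagger_{\mathbf{q}}\big]\,e^{-i(p\cdot x-q\cdot y)}\Big\} \\
 &= i\int \frac{d^3p}{(2\pi)^3\,2E_{\mathbf{p}}}\;E_{\mathbf{p}}\,\Big\{
      e^{-i\mathbf{p}\cdot(\mathbf{x}-\mathbf{y})}
      +e^{+i\mathbf{p}\cdot(\mathbf{x}-\mathbf{y})}\Big\} \\
 &= i\,\delta(\mathbf{x}-\mathbf{y}) .
\end{align*}
```

The two other commutators vanish for the same reason that the $E_{\mathbf{p}}$-weighted terms cancel pairwise when the relative sign between the two exponentials is flipped.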
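⁸ In four space-time dimensions, this field has the same dimension as energy: $[\phi(x)] = [\text{energy}]$.

It is possible to invert eqs. (1.38) and (1.39) in order to obtain the creation and
annihilation operators given the operators $\phi$ and $\Pi$. These inversion formulas read

$$ a^\dagger_{\mathbf{p}} = -i\int d^3x\; e^{-ip\cdot x}\,\Big[\Pi(x) + iE_{\mathbf{p}}\,\phi(x)\Big] = -i\int d^3x\; e^{-ip\cdot x}\,\overleftrightarrow{\partial_0}\,\phi(x) , $$
$$ a_{\mathbf{p}} = +i\int d^3x\; e^{+ip\cdot x}\,\Big[\Pi(x) - iE_{\mathbf{p}}\,\phi(x)\Big] = +i\int d^3x\; e^{+ip\cdot x}\,\overleftrightarrow{\partial_0}\,\phi(x) , \tag{1.42} $$

where the operator $\overleftrightarrow{\partial_0}$ is defined as

$$ A\,\overleftrightarrow{\partial_0}\,B \equiv A\,\big(\partial_0 B\big) - \big(\partial_0 A\big)\,B . \tag{1.43} $$

Note that these expressions, although they appear to contain $x^0$, do not actually
depend on time. Using these formulas, we can rewrite the Hamiltonian in terms of $\phi$
and $\Pi$,

$$ H = \int d^3x\,\Big\{\tfrac{1}{2}\Pi^2(x) + \tfrac{1}{2}\big(\nabla\phi(x)\big)^2 + \tfrac{1}{2}m^2\phi^2(x)\Big\} . \tag{1.44} $$

From this Hamiltonian, one may obtain equations of motion in the form of Hamilton’s
equations. Formally, they read

$$ \partial_0\phi(x) = \frac{\delta H}{\delta\Pi(x)} = \Pi(x) , \qquad \partial_0\Pi(x) = -\frac{\delta H}{\delta\phi(x)} = \big(\nabla^2 - m^2\big)\,\phi(x) . \tag{1.45} $$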
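
The classical counterpart of these equations of motion is easy to explore numerically. The sketch below discretizes the Hamiltonian (1.44) on a periodic one-dimensional lattice and integrates eqs. (1.45) with a leapfrog scheme, checking that the energy is conserved up to the discretization error. The lattice size, spacing, time step and initial profile are arbitrary illustrative choices, and the reduction to one space dimension is a simplification made here.

```python
import numpy as np

# Illustrative parameters (assumptions, not from the text)
n_sites, spacing, mass, dt, n_steps = 64, 0.5, 1.0, 0.01, 2000

def laplacian(phi):
    """Second-order lattice Laplacian with periodic boundary conditions."""
    return (np.roll(phi, 1) - 2.0 * phi + np.roll(phi, -1)) / spacing**2

def hamiltonian(phi, pi):
    """Discretized eq. (1.44) in one space dimension."""
    gradient = (np.roll(phi, -1) - phi) / spacing
    density = 0.5 * pi**2 + 0.5 * gradient**2 + 0.5 * mass**2 * phi**2
    return spacing * density.sum()

def force(phi):
    """Right hand side of the equation for d(Pi)/dt in eq. (1.45)."""
    return laplacian(phi) - mass**2 * phi

x = spacing * np.arange(n_sites)
phi = np.exp(-((x - x.mean()) ** 2) / 4.0)   # smooth initial profile
pi = np.zeros(n_sites)                       # field initially at rest

initial_energy = hamiltonian(phi, pi)
for _ in range(n_steps):                     # leapfrog (kick-drift-kick)
    pi = pi + 0.5 * dt * force(phi)
    phi = phi + dt * pi                      # d(phi)/dt = Pi
    pi = pi + 0.5 * dt * force(phi)

relative_drift = abs(hamiltonian(phi, pi) - initial_energy) / initial_energy
print(relative_drift)
```

The leapfrog scheme is used because it is symplectic, so the energy error stays bounded and shrinks like the square of the time step instead of growing steadily.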

1.2.3 Lagrangian formulation


One may also obtain a Lagrangian $L(\phi, \partial_0\phi)$ that leads to the Hamiltonian (1.44)
by the usual manipulations. Firstly, the momentum canonically conjugate to $\phi(x)$
should be given by

$$ \Pi(x) \equiv \frac{\delta L}{\delta\,\partial_0\phi(x)} . \tag{1.46} $$

For this to be consistent with the first of Hamilton’s equations, the Lagrangian must
contain the following kinetic term

$$ L = \int d^3x\;\tfrac{1}{2}\big(\partial_0\phi(x)\big)^2 + \cdots \tag{1.47} $$

The missing potential term of the Lagrangian is obtained by requiring that we have

$$ H = \int d^3x\;\Pi(x)\,\partial_0\phi(x) - L . \tag{1.48} $$

This gives the following Lagrangian,

$$ L = \int d^3x\,\Big\{\tfrac{1}{2}\big(\partial_\mu\phi(x)\big)\big(\partial^\mu\phi(x)\big) - \tfrac{1}{2}m^2\phi^2(x)\Big\} . \tag{1.49} $$

Note that the action,

$$ S = \int dx^0\; L , \tag{1.50} $$

is a Lorentz scalar (this is not true of the Hamiltonian, which may be considered as
the time component of a 4-vector from the point of view of Lorentz transformations).
The Lagrangian (1.49) leads to the following Euler-Lagrange equation of motion,

$$ \big(\square_x + m^2\big)\,\phi(x) = 0 , \tag{1.51} $$

which is known as the Klein-Gordon equation. This equation is of course equivalent
to the pair of Hamilton’s equations derived earlier.

1.2.4 Noether’s theorem

Conservation laws in a physical theory are intimately related to the continuous
symmetries of the system. This is well known in Lagrangian mechanics, and can be
extended to quantum field theory. Consider a generic Lagrangian $\mathcal{L}(\phi, \partial_\mu\phi)$ that
depends on fields and their derivatives with respect to the spacetime coordinates, and
assume that the theory is invariant under the following variation of the field,

$$ \phi(x) \to \phi(x) + \varepsilon\,\Psi(x) . \tag{1.52} $$

Such an invariance is said to be continuous when it is valid for any value of the
infinitesimal parameter $\varepsilon$. If the Lagrangian is unchanged by this transformation, we
can write

$$ 0 = \delta\mathcal{L} = \frac{\partial\mathcal{L}}{\partial\phi}\,\varepsilon\Psi + \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\,\varepsilon\,\partial_\mu\Psi
 = \partial_\mu\Big(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\Big)\,\varepsilon\Psi + \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\,\varepsilon\,\partial_\mu\Psi
 = \varepsilon\,\partial_\mu\underbrace{\Big(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\,\Psi\Big)}_{J^\mu} . \tag{1.53} $$

In the second equality, we have used the Euler-Lagrange equation obeyed by the field.
The 4-vector $J^\mu$ is known as the Noether current associated to this symmetry. The fact
that the variation of the Lagrangian is zero implies the following continuity equation
for this current

$$ \partial_\mu J^\mu = 0 . \tag{1.54} $$

This is the simplest case of Noether’s theorem, where the Lagrangian itself is invariant.
But for the theory to be unmodified by the transformation of eq. (1.52), it is only
necessary that the action be invariant, which is also realized if the Lagrangian is
modified by a total derivative, i.e.

$$ \delta\mathcal{L} = \varepsilon\,\partial_\mu K^\mu . \tag{1.55} $$

(The proportionality to $\varepsilon$ follows from the fact that the variation must vanish when
$\varepsilon \to 0$.) When the variation of the Lagrangian is a total derivative instead of zero, the
continuity equation is modified into:

$$ \partial_\mu\big(J^\mu - K^\mu\big) = 0 , \tag{1.56} $$

where $J^\mu$ is the same current as before. As we shall see later, there are situations
where a conservation equation such as (1.54) is violated by quantum effects, due to a
delicate interplay between the symmetry responsible for the conservation law and the
ultraviolet structure of the theory.
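
As a minimal illustration (an example chosen here, not one worked out in the text), consider the massless limit of the free Lagrangian density, which is invariant under a constant shift of the field, i.e. $\Psi(x) = 1$. The Noether current and its conservation then read:

```latex
\begin{align*}
\mathcal{L} &= \tfrac{1}{2}\,(\partial_\mu\phi)(\partial^\mu\phi) , \qquad
\phi(x)\to\phi(x)+\varepsilon \quad \big(\Psi(x)=1\big) , \\
J^\mu &= \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\,\Psi = \partial^\mu\phi , \\
\partial_\mu J^\mu &= \square\,\phi = 0 .
\end{align*}
```

In this case the continuity equation is nothing but the massless Klein-Gordon equation; a mass term $-\tfrac{1}{2}m^2\phi^2$ would break the shift symmetry and spoil the conservation law.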

1.3 Interacting scalar fields

1.3.1 Interaction term


Until now, we have only considered non-interacting particles, which is of course of
very limited use in practice. That the Hamiltonian (1.34) does not contain interactions
follows from the fact that the only non-trivial term it contains is of the form $a^\dagger_{\mathbf{p}}a_{\mathbf{p}}$, that
destroys a particle of momentum $\mathbf{p}$ and then creates a particle of momentum $\mathbf{p}$ (hence
nothing changes in the state of the system under consideration). By momentum
conservation, this is the only allowed Hermitean operator which is quadratic in
the creation and annihilation operators. Therefore, in order to include interactions,
we must include in the Hamiltonian terms of higher degree in the creation and
annihilation operators. The additional term must be Hermitean, since $H$ generates the
time evolution, which must be unitary.
The simplest Hermitean addition to the Hamiltonian is a term of the form

$$ H_I = \int d^3x\;\frac{\lambda}{n!}\,\phi^n(x) , \tag{1.57} $$

where $n$ is a power larger than 2. The real constant $\lambda$ is called a coupling constant
and controls the strength of the interactions, while the denominator $n!$ is a symmetry
factor that will prove convenient later on. At this point, it seems that any degree $n$
may provide a reasonable interaction term. However, theories with an odd $n$ have an
unstable vacuum, and theories with $n > 4$ are non-renormalizable in four space-time
dimensions, as we shall see later. For these reasons, $n = 4$ is the only case which is
widely studied in practice, and we will stick to this value in the rest of this chapter.
With this choice, the Hamiltonian and Lagrangian read

$$ H = \int d^3x\,\Big\{\tfrac{1}{2}\Pi^2(x) + \tfrac{1}{2}\big(\nabla\phi(x)\big)^2 + \tfrac{1}{2}m^2\phi^2(x) + \tfrac{\lambda}{4!}\phi^4(x)\Big\} , $$
$$ L = \int d^3x\,\Big\{\tfrac{1}{2}\big(\partial_\mu\phi(x)\big)\big(\partial^\mu\phi(x)\big) - \tfrac{1}{2}m^2\phi^2(x) - \tfrac{\lambda}{4!}\phi^4(x)\Big\} , \tag{1.58} $$

and the Klein-Gordon equation is modified into

$$ \big(\square_x + m^2\big)\,\phi(x) + \frac{\lambda}{6}\,\phi^3(x) = 0 . \tag{1.59} $$
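
For completeness, the modified equation of motion follows from the Euler-Lagrange equation applied to the integrand $\mathcal{L}$ of the Lagrangian with the quartic term (the symbol $\mathcal{L}$ for the Lagrangian density is introduced here for convenience):

```latex
\begin{align*}
\frac{\partial\mathcal{L}}{\partial\phi} &= -m^2\phi-\frac{\lambda}{3!}\,\phi^3 , \qquad
\partial_\mu\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}
   = \partial_\mu\partial^\mu\phi = \square_x\phi , \\
0 &= \partial_\mu\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}
     -\frac{\partial\mathcal{L}}{\partial\phi}
   = \big(\square_x+m^2\big)\phi+\frac{\lambda}{6}\,\phi^3 .
\end{align*}
```

The factor $1/6$ is simply $4/4!$, which is one reason why the $n!$ normalization of the coupling is convenient.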

1.3.2 Interaction representation

A field operator that obeys this non-linear equation of motion can no longer be
represented as a linear superposition of plane waves such as (1.38). Let us assume
that the coupling constant is very slowly time-dependent, in such a way that

$$ \lim_{x^0\to\pm\infty} \lambda = 0 . \tag{1.60} $$

What we have in mind here is that $\lambda$ goes to zero adiabatically at asymptotic times,
i.e. much slower than all the physically relevant timescales of the theory under
consideration. Therefore, at $x^0 = \pm\infty$, the theory is a free theory whose spectrum is
made of the eigenstates of the free Hamiltonian. Likewise, the field $\phi(x)$ should be
in a certain sense “close to a free field” in these limits. In the case of the $x^0 \to -\infty$
limit, let us denote this by⁹

$$ \lim_{x^0\to-\infty} \phi(x) = \phi_{\rm in}(x) , \tag{1.61} $$

where $\phi_{\rm in}$ is a free field operator that admits a Fourier decomposition similar to
eq. (1.38),

$$ \phi_{\rm in}(x) \equiv \int \frac{d^3p}{(2\pi)^3\,2E_{\mathbf{p}}}\,\Big[a^\dagger_{\mathbf{p},{\rm in}}\,e^{+ip\cdot x} + a_{\mathbf{p},{\rm in}}\,e^{-ip\cdot x}\Big] . \tag{1.62} $$

⁹ In this equation, we ignore for now the issue of field renormalization, onto which we shall come back
later (see section 1.9).

Eq. (1.61) can be made more explicit by writing

$$ \phi(x) = U(-\infty, x^0)\;\phi_{\rm in}(x)\;U(x^0, -\infty) , \tag{1.63} $$

where $U$ is a unitary time evolution operator defined as a time ordered exponential of
the interaction term in the Lagrangian, evaluated with the $\phi_{\rm in}$ field:

$$ U(t_2, t_1) \equiv T\,\exp\, i\int_{t_1}^{t_2} dx^0\,d^3x\;\mathcal{L}_I\big(\phi_{\rm in}(x)\big) , \tag{1.64} $$

where

$$ \mathcal{L}_I\big(\phi(x)\big) \equiv -\frac{\lambda}{4!}\,\phi^4(x) . \tag{1.65} $$

This time evolution operator satisfies the following properties

$$ U(t, t) = 1 , $$
$$ U(t_3, t_1) = U(t_3, t_2)\,U(t_2, t_1) \quad (\text{for all } t_2) , $$
$$ U(t_1, t_2) = U^{-1}(t_2, t_1) = U^\dagger(t_2, t_1) . \tag{1.66} $$
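
These properties are generic features of time ordered exponentials, and can be illustrated on a finite-dimensional toy model. The sketch below replaces the field-theoretic interaction by a random time-dependent Hermitian matrix `h_int(t)` (an assumption made purely for illustration), uses the fact that for an interaction without derivatives the interaction Hamiltonian is minus the spatial integral of the interaction Lagrangian, so that $U = T\exp(-i\int dt\, H_I)$, and approximates the time ordering by an ordered product of short-time factors. Backward evolution is defined through the last of the properties listed above.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_hermitian(n):
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return 0.5 * (a + a.conj().T)

H_A, H_B = random_hermitian(4), random_hermitian(4)

def h_int(t):
    """Toy interaction Hamiltonian; h_int(t) and h_int(t') do not commute."""
    return np.cos(t) * H_A + np.sin(2.0 * t) * H_B

def exp_minus_i(h, dt):
    """exp(-i h dt) for Hermitian h, via its eigendecomposition."""
    w, v = np.linalg.eigh(h)
    return (v * np.exp(-1j * w * dt)) @ v.conj().T

def evolution(t2, t1, dt=1e-3):
    """U(t2, t1) as an ordered product of short-time factors."""
    if t2 < t1:
        return evolution(t1, t2, dt).conj().T      # U(t2,t1) = U^dagger(t1,t2)
    n = int(round((t2 - t1) / dt))
    u = np.eye(4, dtype=complex)
    for k in range(n):
        t_mid = t1 + (k + 0.5) * dt
        u = exp_minus_i(h_int(t_mid), dt) @ u      # later times act on the left
    return u

t1, t2, t3 = 0.0, 0.4, 1.0
u31 = evolution(t3, t1)
identity_at_equal_times = np.allclose(evolution(t2, t2), np.eye(4))
composition = np.allclose(u31, evolution(t3, t2) @ evolution(t2, t1))
unitarity = np.allclose(u31.conj().T @ u31, np.eye(4))
inverse = np.allclose(evolution(t1, t3) @ u31, np.eye(4))
print(identity_at_equal_times, composition, unitarity, inverse)
```

Because each short-time factor is exactly unitary and the intermediate time is placed on the time grid, the composition and unitarity properties hold here up to rounding errors, independently of the step size.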

One can then prove that

$$ \big(\square_x + m^2\big)\phi(x) + \frac{\lambda}{6}\,\phi^3(x) = U(-\infty, x^0)\,\Big[\big(\square_x + m^2\big)\phi_{\rm in}(x)\Big]\,U(x^0, -\infty) . \tag{1.67} $$

This equation shows that $\phi_{\rm in}$ obeys the free Klein-Gordon equation if $\phi$ obeys the
non-linear interacting one, and justifies a posteriori our choice of the unitary operator
$U$ that connects $\phi$ and $\phi_{\rm in}$.

1.3.3 In and Out states

The in creation and annihilation operators can be used to define a space of eigenstates
of the free Hamiltonian, starting from a ground state (vacuum) denoted $|0_{\rm in}\rangle$. For
instance, one particle states would be defined as

$$|\mathbf p_{\rm in}\rangle=a^\dagger_{\mathbf p,{\rm in}}\,|0_{\rm in}\rangle\,. \tag{1.68}$$

The physical interpretation of these states is that they are states with a definite particle
content at $x^0=-\infty$, before the interactions are turned on¹⁰.

In the same way as we have constructed in field operators, creation and annihilation
operators and states, we may construct out ones such that the field $\phi_{\rm out}(x)$ is a
free field that coincides with the interacting field $\phi(x)$ in the limit $x^0\to+\infty$ (with
the same caveat about field renormalization). Starting from a vacuum state $|0_{\rm out}\rangle$, we
may also define a full set of states, such as $|\mathbf p_{\rm out}\rangle$, that have a definite particle content
at $x^0=+\infty$. It is crucial to observe that the in and out states are not identical:

$$|0_{\rm out}\rangle\neq|0_{\rm in}\rangle\ \ \big(\text{they differ by the phase }\langle 0_{\rm out}|0_{\rm in}\rangle\big)\,,\quad|\mathbf p_{\rm out}\rangle\neq|\mathbf p_{\rm in}\rangle\,,\ \cdots \tag{1.69}$$

¹⁰ For an interacting system, it is not possible to enumerate the particle content of states, because of quantum fluctuations that may temporarily create additional virtual particles.

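Taking the limit $x^0\to+\infty$ in eq. (1.63), we first see that¹¹

$$\begin{aligned}
a_{\mathbf p,{\rm out}}&=U(-\infty,+\infty)\;a_{\mathbf p,{\rm in}}\;U(+\infty,-\infty)\,,\\
a^\dagger_{\mathbf p,{\rm out}}&=U(-\infty,+\infty)\;a^\dagger_{\mathbf p,{\rm in}}\;U(+\infty,-\infty)\,,
\end{aligned} \tag{1.70}$$

from which we deduce that the in and out states must be related by

$$|\alpha_{\rm out}\rangle=U(-\infty,+\infty)\,|\alpha_{\rm in}\rangle\,. \tag{1.71}$$

The two sets of states are identical for a free theory, since the evolution operator
reduces to the identity in this case.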

1.4 LSZ reduction formulas


Among the most interesting physical quantities are the transition amplitudes

$$\langle\mathbf q_1\mathbf q_2\cdots{}_{\rm out}|\mathbf p_1\mathbf p_2\cdots{}_{\rm in}\rangle\,, \tag{1.72}$$

whose squared modulus enters in cross-sections that are measurable in scattering
experiments. Up to a normalization factor, the square of this amplitude gives the
probability that particles with momenta $\mathbf p_1\mathbf p_2\cdots$ in the initial state evolve into
particles with momenta $\mathbf q_1\mathbf q_2\cdots$ in the final state.
A first step in view of calculating transition amplitudes is to relate them to
expectation values involving the field operator $\phi(x)$. In order to illustrate the main
steps in deriving such a relationship, let us consider the simple case of the transition
amplitude between two 1-particle states,

$$\langle\mathbf q_{\rm out}|\mathbf p_{\rm in}\rangle\,. \tag{1.73}$$

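Firstly, we write the state $|\mathbf p_{\rm in}\rangle$ as the action of a creation operator on the corresponding
vacuum state, and we replace the creation operator by its expression in terms of $\phi_{\rm in}$,

$$\begin{aligned}
\langle\mathbf q_{\rm out}|\mathbf p_{\rm in}\rangle&=\langle\mathbf q_{\rm out}|\,a^\dagger_{\mathbf p,{\rm in}}\,|0_{\rm in}\rangle\\
&=-i\int d^3\mathbf x\;e^{-ip\cdot x}\,\langle\mathbf q_{\rm out}|\,\Pi_{\rm in}(x)+iE_{\mathbf p}\,\phi_{\rm in}(x)\,|0_{\rm in}\rangle\,.
\end{aligned} \tag{1.74}$$

¹¹ The evolution operator from $x^0=-\infty$ to $x^0=+\infty$ is sometimes called the S-matrix: $S\equiv U(+\infty,-\infty)$.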

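Next, we use the fact that $\phi_{\rm in},\Pi_{\rm in}$ are the limits when $x^0\to-\infty$ of the interacting
fields $\phi,\Pi$, and we express this limit by means of the following trick:

$$\lim_{x^0\to-\infty}F(x^0)=\lim_{x^0\to+\infty}F(x^0)-\int_{-\infty}^{+\infty}dx^0\;\partial_{x^0}F(x^0)\,. \tag{1.75}$$

The term with the limit $x^0\to+\infty$ produces a term identical to the r.h.s. of the first
line of eq. (1.74), but with an $a^\dagger_{\mathbf p,{\rm out}}$ instead of $a^\dagger_{\mathbf p,{\rm in}}$. At this stage we have

$$\begin{aligned}
\langle\mathbf q_{\rm out}|\mathbf p_{\rm in}\rangle&=\langle 0_{\rm out}|\,a_{\mathbf q,{\rm out}}\,a^\dagger_{\mathbf p,{\rm out}}\,|0_{\rm in}\rangle\\
&\quad+i\int d^4x\;\partial_{x^0}\Big[e^{-ip\cdot x}\,\langle\mathbf q_{\rm out}|\,\Pi(x)+iE_{\mathbf p}\,\phi(x)\,|0_{\rm in}\rangle\Big]\,.
\end{aligned} \tag{1.76}$$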

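In the first line, we use the commutation relation between creation and annihilation
operators to obtain

$$\langle 0_{\rm out}|\,a_{\mathbf q,{\rm out}}\,a^\dagger_{\mathbf p,{\rm out}}\,|0_{\rm in}\rangle=(2\pi)^3\,2E_{\mathbf p}\,\delta(\mathbf p-\mathbf q)\,. \tag{1.77}$$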

This term does not involve any interaction, since the initial state particle simply goes
through to the final state (in other words, this particle just acts as a spectator in the
process). Such trivial terms always appear when expressing transition amplitudes in
terms of the field operator, and they are usually dropped since they do not carry any
interesting physical information. We can then perform explicitly the time derivative
in the second line to obtain¹²

$$\langle\mathbf q_{\rm out}|\mathbf p_{\rm in}\rangle\doteq i\int d^4x\;e^{-ip\cdot x}\,(\Box_x+m^2)\,\langle\mathbf q_{\rm out}|\,\phi(x)\,|0_{\rm in}\rangle\,, \tag{1.78}$$

where we use the symbol $\doteq$ to indicate that the trivial non-interacting terms have been
dropped.
Next, we repeat the same procedure for the final state particle: (i) replace the
annihilation operator $a_{\mathbf q,{\rm out}}$ by its expression in terms of $\phi_{\rm out}$, (ii) write $\phi_{\rm out}$ as a limit
of $\phi$ when $x^0\to+\infty$, (iii) write this limit as an integral of a time derivative plus a
term at $x^0\to-\infty$, that we rewrite as the annihilation operator $a_{\mathbf q,{\rm in}}$:

$$\begin{aligned}
\langle\mathbf q_{\rm out}|\mathbf p_{\rm in}\rangle\doteq i\int d^4x\;e^{-ip\cdot x}\,(\Box_x+m^2)\,\Big\{&\langle 0_{\rm out}|\,a_{\mathbf q,{\rm in}}\,\phi(x)\,|0_{\rm in}\rangle\\
&+i\int d^4y\;\partial_{y^0}\Big[e^{iq\cdot y}\,\langle 0_{\rm out}|\,\big(\Pi(y)-iE_{\mathbf q}\,\phi(y)\big)\,\phi(x)\,|0_{\rm in}\rangle\Big]\Big\}\,.
\end{aligned} \tag{1.79}$$

¹² We use here the dispersion relation $p_0^2-\mathbf p^2=m^2$ of the incoming particle to arrive at this expression. The mass that should enter in this formula is the physical mass of the particles. This remark will become important when we discuss renormalization.

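However, at this point we are stuck because we would like to bring the $a_{\mathbf q,{\rm in}}$ to the
right where it would annihilate $|0_{\rm in}\rangle$, but we do not know the commutator between
$a_{\mathbf q,{\rm in}}$ and the interacting field operator $\phi(x)$. The remedy is to go one step back, and
note that we are free to insert a T-product in

$$\big(\Pi_{\rm out}(y)-iE_{\mathbf q}\,\phi_{\rm out}(y)\big)\,\phi(x)=T\Big[\big(\Pi(y)-iE_{\mathbf q}\,\phi(y)\big)\,\phi(x)\Big]_{y^0\to+\infty} \tag{1.80}$$

since the time $y^0\to+\infty$ is obviously larger than $x^0$. Then the boundary term at
$y^0\to-\infty$ will automatically lead to the desired ordering $\phi(x)\,a_{\mathbf q,{\rm in}}$,

$$\begin{aligned}
\langle\mathbf q_{\rm out}|\mathbf p_{\rm in}\rangle\doteq i\int d^4x\;e^{-ip\cdot x}\,(\Box_x+m^2)\,\Big\{&\underbrace{\langle 0_{\rm out}|\,\phi(x)\,a_{\mathbf q,{\rm in}}\,|0_{\rm in}\rangle}_{0}\\
&+i\int d^4y\;\partial_{y^0}\Big[e^{iq\cdot y}\,\langle 0_{\rm out}|\,T\,\big(\Pi(y)-iE_{\mathbf q}\,\phi(y)\big)\,\phi(x)\,|0_{\rm in}\rangle\Big]\Big\}\,.
\end{aligned} \tag{1.81}$$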

Performing the derivative with respect to y0 , we finally arrive at


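$$\langle\mathbf q_{\rm out}|\mathbf p_{\rm in}\rangle\doteq i^2\int d^4x\,d^4y\;e^{i(q\cdot y-p\cdot x)}\,(\Box_x+m^2)(\Box_y+m^2)\,\langle 0_{\rm out}|\,T\,\phi(x)\phi(y)\,|0_{\rm in}\rangle\,. \tag{1.82}$$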

Such a formula is known as a (Lehmann-Symanzik-Zimmermann) reduction formula.


The method that we have presented above on a simple case can easily be applied
to the most general transition amplitude, with the following result for the part of the
amplitude that does not involve any spectator particle:

$$\begin{aligned}
\langle\mathbf q_1\cdots\mathbf q_n\,{}_{\rm out}|\mathbf p_1\cdots\mathbf p_m\,{}_{\rm in}\rangle\doteq\; i^{m+n}&\int\prod_{i=1}^{m}d^4x_i\;e^{-ip_i\cdot x_i}\,(\Box_{x_i}+m^2)\\
\times&\int\prod_{j=1}^{n}d^4y_j\;e^{iq_j\cdot y_j}\,(\Box_{y_j}+m^2)\\
\times&\;\langle 0_{\rm out}|\,T\,\phi(x_1)\cdots\phi(x_m)\,\phi(y_1)\cdots\phi(y_n)\,|0_{\rm in}\rangle\,.
\end{aligned} \tag{1.83}$$

The bottom line is that an amplitude with $m+n$ particles is related to the vacuum
expectation value of a time-ordered product of $m+n$ interacting field operators (a
slight but important modification to this formula will be introduced in the section 1.9,
in order to account for field renormalization). Note that the vacuum states on the left
and on the right of the expectation value are respectively the out and the in vacua.

1.5 From transition amplitudes to reaction rates

All experiments in particle physics amount to a measurement that answers the follow-
ing question: given a certain setup that defines an initial state, how many reactions
of a certain type occur per unit time? The concept of “reaction of a certain type”
may vary widely depending on the number of criteria that are imposed on the final
state for the reaction to be worth counting. For instance, one may consider the re-
action e+ e− → anything, the reaction e+ e− → µ+ µ− , or even a reaction with the
same particles in the initial and final states, but where in addition the final muons
are required to have momenta in a certain range. As we have seen in the previous
section, the LSZ reduction formulas express transition amplitudes between states with
a definite particle content in terms of correlation functions of the field operators that
are calculable in quantum field theory. The missing link to connect this to experi-
mental measurements is an explicit formula relating reaction rates to these transition
amplitudes.

1.5.1 Invariant cross-sections

Definition of a cross-section: In a scattering experiment such as those performed
in a particle collider, the observed reaction rate results from a combination of some
factors that depend on the accelerator design (the fluxes of particles in the colliding
beams), and a factor that contains the genuine microscopic information about the
reaction. In general, this microscopic input is given in terms of a quantity called
a cross-section, that has the dimension of an area. Consider two colliding beams,
containing particles of type 1 and 2, respectively. For simplicity, assume that the two
beams have a uniform particle density, and let us denote S their common transverse
area. If during the experiment, N1 particles of the first beam and N2 particles of the
second beam fly by the interaction zone, the cross-section for the process 1 + 2 → F,
where F is some final state, is the quantity $\sigma_{12\to F}$ defined by

$$\Big(\text{Number of times } F \text{ is seen in the experiment}\Big)=\frac{N_1N_2}{S}\;\sigma_{12\to F}\,. \tag{1.84}$$

In this formula, the left hand side is measured experimentally, while in the right hand
side the ratio N1 N2 /S depends only on the setup of the collider13 . Therefore, the
cross-section can be obtained as the ratio of two known quantities. Note that the
cross-section in general depends on the momenta p1,2 of the particles participating in
the collision (and on the momenta of the particles in the final state F), but in a Lorentz
covariant way, i.e. only through Lorentz scalars such as (p1 + p2 )2 .
13 In practice, the beam conditions are monitored by measuring in parallel the event rate of another

reaction, whose cross-section is already accurately known.
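
As a small illustration of how eq. (1.84) is used in practice, the sketch below inverts it to extract a cross-section from an event count. The function name and the beam parameters are hypothetical values chosen only for the example; they do not come from the text or from any experiment.

```python
def cross_section(n_events, n1, n2, area):
    """Invert eq. (1.84): sigma = (number of events) * S / (N1 * N2).

    n_events : number of times the final state F was observed
    n1, n2   : numbers of particles of each beam crossing the interaction zone
    area     : common transverse area S of the beams
    """
    if n1 <= 0 or n2 <= 0 or area <= 0:
        raise ValueError("N1, N2 and S must be positive")
    return n_events * area / (n1 * n2)


# Hypothetical setup: S = 1e-6 m^2, 1e11 particles per beam, 25 events observed.
sigma = cross_section(n_events=25, n1=1e11, n2=1e11, area=1e-6)
print(sigma)  # an area, here in m^2
```

The result scales linearly with the event count and inversely with the product of the beam populations, which is why the beam conditions must be known accurately.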


18 F. G ELIS – A S TROLL T HROUGH Q UANTUM F IELDS

Normalization of 1-particle states: An important point in the determination of
the cross-section is the normalization of the 1-particle states. We have

$$\langle\mathbf p_{\rm in}|\mathbf p_{\rm in}\rangle=\int d^3\mathbf x\;\big|\langle\mathbf x|\mathbf p_{\rm in}\rangle\big|^2=2E_{\mathbf p}\,\underbrace{(2\pi)^3\,\delta(0)}_{V}\,. \tag{1.85}$$

In the first equality, we have inserted a complete set of position eigenstates in order to
highlight the interpretation of $\langle\mathbf p_{\rm in}|\mathbf p_{\rm in}\rangle$ as the integral of the square of a wavefunction.
The second equality follows from the canonical commutation relation between
creation and annihilation operators. This equation means that our convention of nor-
malization of the states corresponds to “2E particles per unit volume”. We are using
quotes here because 2E does not have the correct dimension to be a proper density of
particles. This is mostly an aesthetic problem: this convention of normalization will
cancel out eventually, since cross-sections are defined in such a way that they do not
depend on the incoming fluxes of particles.

Example of a non-interacting theory: These normalization issues can be clarified
by considering the trivial example of a non-interacting theory. In this case, the exact
result for the transition amplitude between two 1-particle states is

$$\langle\mathbf q_{\rm out}|\mathbf p_{\rm in}\rangle=2E_{\mathbf p}\,(2\pi)^3\,\delta(\mathbf q-\mathbf p)\,. \tag{1.86}$$

By squaring this amplitude, we obtain

$$\big|\langle\mathbf q_{\rm out}|\mathbf p_{\rm in}\rangle\big|^2=4E_{\mathbf p}E_{\mathbf q}\,V\,(2\pi)^3\,\delta(\mathbf q-\mathbf p)\,, \tag{1.87}$$

and integrating over $\mathbf q$ with the Lorentz invariant measure $d^3\mathbf q/((2\pi)^3\,2E_{\mathbf q})$ gives

$$\int\frac{d^3\mathbf q}{(2\pi)^3\,2E_{\mathbf q}}\,\big|\langle\mathbf q_{\rm out}|\mathbf p_{\rm in}\rangle\big|^2=\langle\mathbf p_{\rm in}|\mathbf p_{\rm in}\rangle=\text{initial number of particles}\,. \tag{1.88}$$

Since we are considering a non-interacting theory, we know without any calculation
that every particle in the initial state should be present in the final state with the same
momentum. Therefore, the integral in the left hand side of the previous equation is
the number of particles in the final state, and the quantity

$$\big|\langle\mathbf q_{\rm out}|\mathbf p_{\rm in}\rangle\big|^2\;\frac{d^3\mathbf q}{(2\pi)^3\,2E_{\mathbf q}} \tag{1.89}$$

counts those that have their momentum in a volume $d^3\mathbf q$ centered around $\mathbf q$. More
generally, for an n-particle final state,

$$\big|\langle\mathbf q_1\cdots\mathbf q_n\,{}_{\rm out}|\cdots{}_{\rm in}\rangle\big|^2\;\prod_{j=1}^{n}\frac{d^3\mathbf q_j}{(2\pi)^3\,2E_{\mathbf q_j}} \tag{1.90}$$

is the number of events where the final state particles have their momenta in the
volume $d^3\mathbf q_1\cdots d^3\mathbf q_n$ centered on $(\mathbf q_1,\cdots,\mathbf q_n)$.

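General squared amplitude: Consider now a transition amplitude from a 2-particle
state to a final state with n particles, $\langle\mathbf q_1\cdots\mathbf q_n\,{}_{\rm out}|\mathbf p_1\mathbf p_2\,{}_{\rm in}\rangle$. By momentum
conservation, all the contributions to this amplitude are proportional to a delta function,

$$\langle\mathbf q_1\cdots\mathbf q_n\,{}_{\rm out}|\mathbf p_1\mathbf p_2\,{}_{\rm in}\rangle\equiv(2\pi)^4\,\delta\Big(p_1+p_2-\sum_{j=1}^{n}q_j\Big)\;\mathcal T(q_{1,\cdots,n}|p_{1,2})\,, \tag{1.91}$$

and its squared modulus reads

$$\big|\langle\mathbf q_1\cdots\mathbf q_n\,{}_{\rm out}|\mathbf p_1\mathbf p_2\,{}_{\rm in}\rangle\big|^2=(2\pi)^4\,\delta\Big(p_1+p_2-\sum_{j=1}^{n}q_j\Big)\times\underbrace{(2\pi)^4\,\delta(0)}_{VT}\;\big|\mathcal T(q_{1,\cdots,n}|p_{1,2})\big|^2\,. \tag{1.92}$$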

This expression contains the square of the delta function. One of these factors becomes
a delta of zero, which has the interpretation of space-time volume VT in which the
process takes place. Since the initial state contains a fixed number of particles of each
kind (1 and 2) per unit volume in all space, we expect the total number of events to
be extensive, because interactions may happen in all the volume at any time. This is
the meaning of the factor VT that appears in this square.
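From the insight gained by studying the non-interacting theory, this square
weighted by the Lorentz invariant phase-space measure of the final state counts
the number of events in which the final state particles have momenta in the volume
$d^3\mathbf q_1\cdots d^3\mathbf q_n$ centered on $(\mathbf q_1,\cdots,\mathbf q_n)$:

$$\begin{aligned}
\text{Number of events}&=\big|\langle\mathbf q_1\cdots\mathbf q_n\,{}_{\rm out}|\mathbf p_1\mathbf p_2\,{}_{\rm in}\rangle\big|^2\;\prod_{j=1}^{n}\frac{d^3\mathbf q_j}{(2\pi)^3\,2E_{\mathbf q_j}}\\
&=VT\;\big|\mathcal T(q_{1,\cdots,n}|p_{1,2})\big|^2\;\underbrace{(2\pi)^4\,\delta\Big(p_1+p_2-\sum_{j=1}^{n}q_j\Big)\prod_{j=1}^{n}\frac{d^3\mathbf q_j}{(2\pi)^3\,2E_{\mathbf q_j}}}_{d\Gamma_n(p_{1,2})}\,.
\end{aligned} \tag{1.93}$$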

(dΓn (p1,2 ) is the invariant final state measure subject to the constraint of momentum
conservation.)

Cross-section in the target frame: At this point, the relationship with the cross-section
of this transition is most easily established in the rest frame of one of the initial
state particles, e.g. the particle 2 (this frame is called the target frame). Consider a thin
slice of this target, of transverse section $S$ and infinitesimal thickness $\ell$, as shown in
the figure 1.1. The interaction volume is the volume of this target, i.e. $V=S\ell$. Given
the normalization of the 1-particle states, it contains $N_2=2m_2\,S\ell$ particles of type 2.
In the target frame, the particles 1 have a velocity $v_1=|\mathbf p_1|/E_{\mathbf p_1}$. Therefore, within a
time interval $T$, $N_1=2E_{\mathbf p_1}\,S\,v_1T=2|\mathbf p_1|\,ST$ of them travel through the interaction
zone. Using eqs. (1.84) and (1.93), we thus obtain the following expression for the
cross-section in the target frame¹⁴

$$\sigma_{12\to1\cdots n}\Big|_{\text{target frame}}=\frac{1}{4\,m_2\,|\mathbf p_1|}\int d\Gamma_n(p_{1,2})\;\big|\mathcal T(q_{1,\cdots,n}|p_{1,2})\big|^2\,. \tag{1.94}$$

[Figure 1.1: Geometry of a two-body cross-section in the target frame. The two volumes represent the particles that can take part in the reaction in the duration $T$.]

Note that this is the total cross-section for a final state with n particles, since we have
integrated over all the final state momenta. By undoing some of these integrations,
we may obtain a cross-section which is differential with respect to some of the
kinematical variables that characterize the final state (e.g., the angle at which a final
state particle is scattered).

Cross-section in the center of momentum frame: Another important frame in
view of the setup of many experiments in particle physics is the center of momentum
frame. This is the observer’s frame in experiments where two beams of same-mass
particles and equal energies are collided head on. In this frame, we have

$$\mathbf p_1+\mathbf p_2=0\,,\qquad E_1=E_2\,,\qquad s\equiv(p_1+p_2)^2=4E_1^2\,. \tag{1.95}$$

Since $s$ is Lorentz invariant, its expression in the rest frame of the particle 2 is
$s=(m_2+E_1')^2-\mathbf p_1'^{\,2}$ (in this paragraph, the primes indicate kinematical variables
in the target frame). Moreover, simple kinematics show that the combination $m_2\,|\mathbf p_1'|$
in the target frame becomes

$$m_2\,|\mathbf p_1'|=\sqrt{s}\;|\mathbf p_1| \tag{1.96}$$

in the center of momentum frame. Therefore, the expression of the cross-section in
this frame reads

$$\sigma_{12\to1\cdots n}\Big|_{\text{center of momentum}}=\frac{1}{4\sqrt{s}\;|\mathbf p_1|}\int d\Gamma_n(p_{1,2})\;\big|\mathcal T(q_{1,\cdots,n}|p_{1,2})\big|^2\,. \tag{1.97}$$

¹⁴ Regarding dimensions, $\langle\mathbf q_1\cdots\mathbf q_n\,{}_{\rm out}|\mathbf p_1\mathbf p_2\,{}_{\rm in}\rangle\sim(\text{mass})^{-(2+n)}$, $\mathcal T(q_{1,\cdots,n}|p_{1,2})\sim(\text{mass})^{2-n}$ and $d\Gamma_n\sim(\text{mass})^{2n-4}$, and therefore this formula indeed gives an area.

Likewise, obtaining the expression of a cross-section in a frame where the two beams
have different momenta is a simple matter of relativistic kinematics. This is useful
when the frame of the detector is neither the rest frame of one of the particles nor the
center of momentum frame, and one counts events in terms of some kinematical
variable measured in this frame. Alternatively, one may boost all the measured final
state momenta in order to convert them to momenta in one of the above two frames.
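
The frame independence of the flux factor can be checked numerically. The sketch below assumes two beams of equal mass, as in eq. (1.95), and verifies that the denominator $4m_2|\mathbf p_1'|$ of eq. (1.94) equals the denominator $4\sqrt{s}\,|\mathbf p_1|$ of eq. (1.97). The function name and the numerical values are illustrative assumptions, not taken from the text.

```python
import math


def com_to_target(mass, p_com):
    """Equal-mass beams: return (sqrt(s), |p1'|) given the CoM momentum |p1|."""
    e_com = math.sqrt(p_com**2 + mass**2)          # E1 = E2 in the CoM frame
    s = 4.0 * e_com**2                             # eq. (1.95)
    # In the rest frame of particle 2: s = (m + E1')^2 - p1'^2 = 2 m^2 + 2 m E1'
    e_target = (s - 2.0 * mass**2) / (2.0 * mass)
    p_target = math.sqrt(e_target**2 - mass**2)
    return math.sqrt(s), p_target


mass, p_com = 0.938, 3.0          # illustrative values, in units where c = 1
sqrt_s, p_target = com_to_target(mass, p_com)

flux_target = 4.0 * mass * p_target   # denominator of eq. (1.94)
flux_com = 4.0 * sqrt_s * p_com       # denominator of eq. (1.97)
print(flux_target, flux_com)          # the two numbers should agree
```

The agreement holds for any positive mass and momentum, up to floating-point rounding.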

1.5.2 Decay rates

Another very common type of observable is the decay rate $\Gamma$ of a particle, defined
so that $\Gamma\,dt$ is the decay probability of a particle at rest in the time interval $dt$. The
decay rate can be obtained from matrix elements with a 1-particle initial state,

$$\langle\mathbf q_1\cdots\mathbf q_n\,{}_{\rm out}|\mathbf p_1\,{}_{\rm in}\rangle\equiv(2\pi)^4\,\delta\Big(p_1-\sum_{j=1}^{n}q_j\Big)\;\mathcal T(q_{1,\cdots,n}|p_1)\,. \tag{1.98}$$

Squaring this matrix element again produces a space-time volume factor $VT$, and
integrating over the invariant phase space of the final state particles gives

$$\underbrace{\int\prod_{j=1}^{n}\frac{d^3\mathbf q_j}{(2\pi)^3\,2E_{\mathbf q_j}}\;\big|\langle\mathbf q_1\cdots\mathbf q_n\,{}_{\rm out}|\mathbf p_1\,{}_{\rm in}\rangle\big|^2}_{\text{Total number of decays}}=VT\underbrace{\int d\Gamma_n(p_1)\;\big|\mathcal T(q_{1,\cdots,n}|p_1)\big|^2}_{\text{Decays per unit of time and volume}}\,. \tag{1.99}$$

Given the normalization of the 1-particle states, a sample of volume $V$ contains
$N_1=2E_{\mathbf p_1}V$ particles, and the average number of decays in the time interval $T$ is
therefore $2E_{\mathbf p_1}\,\Gamma\,VT$. From this, we get the following expression for the decay rate

$$\Gamma=\frac{1}{2E_{\mathbf p_1}}\int d\Gamma_n(p_1)\;\big|\mathcal T(q_{1,\cdots,n}|p_1)\big|^2\,. \tag{1.100}$$

A differential decay rate can be obtained by leaving some of the final state kinematical
variables unintegrated.

1.6 Generating functional

1.6.1 Definition
To facilitate the bookkeeping, it is useful to introduce a generating functional that
encapsulates all the expectation values, by defining

$$\begin{aligned}
Z[j]&\equiv\sum_{n=0}^{\infty}\frac{1}{n!}\int d^4x_1\cdots d^4x_n\;ij(x_1)\cdots ij(x_n)\;\langle 0_{\rm out}|\,T\,\phi(x_1)\cdots\phi(x_n)\,|0_{\rm in}\rangle\\
&=\langle 0_{\rm out}|\,T\exp\, i\int d^4x\;j(x)\phi(x)\,|0_{\rm in}\rangle\,.
\end{aligned} \tag{1.101}$$

Note that

$$Z[0]=\langle 0_{\rm out}|0_{\rm in}\rangle\neq1 \tag{1.102}$$

in an interacting theory (but if the vacuum state is stable, then this vacuum to vacuum
transition amplitude must be a pure phase whose squared modulus is one). From this
functional, the relevant expectation values are obtained by functional differentiation

$$\langle 0_{\rm out}|\,T\,\phi(x_1)\cdots\phi(x_n)\,|0_{\rm in}\rangle=\left.\frac{\delta^n Z[j]}{i\delta j(x_1)\cdots i\delta j(x_n)}\right|_{j=0}\,. \tag{1.103}$$

The knowledge of $Z[j]$ would therefore give access to all the transition amplitudes.
However, it is in general not possible to derive $Z[j]$ in closed form, and we need
to resort to perturbation theory, in which the answer is obtained as an expansion in
powers of the coupling constant.

1.6.2 Relation to the free generating functional


The generating functional can be brought to a more useful form by first writing

$$\phi(x_1)\cdots\phi(x_n)=U(-\infty,x_1^0)\,\phi_{\rm in}(x_1)\,U(x_1^0,x_2^0)\,\phi_{\rm in}(x_2)\cdots\phi_{\rm in}(x_n)\,U(x_n^0,-\infty)\,. \tag{1.104}$$

For convenience, we split the leftmost evolution operator as

$$U(-\infty,x_1^0)=U(-\infty,+\infty)\;U(+\infty,x_1^0)\,. \tag{1.105}$$

Noticing that the formula (1.104) is true for any ordering of the times $x_i^0$ and using
the expression of the $U$’s as a time-ordered exponential, we have

$$T\,\phi(x_1)\cdots\phi(x_n)=U(-\infty,+\infty)\;T\,\phi_{\rm in}(x_1)\cdots\phi_{\rm in}(x_n)\,\exp\, i\int d^4x\;\mathcal L_I(\phi_{\rm in}(x))\,, \tag{1.106}$$

where the time-ordering in the right-hand side applies to all the operators on its right.
This leads to the following representation of the generating functional

$$\begin{aligned}
Z[j]&=\underbrace{\langle 0_{\rm out}|\,U(-\infty,+\infty)}_{\langle 0_{\rm in}|}\;T\exp\, i\int d^4x\,\Big[j(x)\phi_{\rm in}(x)+\mathcal L_I(\phi_{\rm in}(x))\Big]\,|0_{\rm in}\rangle\\
&=\exp\Big[i\int d^4x\;\mathcal L_I\Big(\frac{\delta}{i\delta j(x)}\Big)\Big]\;\underbrace{\langle 0_{\rm in}|\,T\exp\, i\int d^4x\;j(x)\phi_{\rm in}(x)\,|0_{\rm in}\rangle}_{Z_0[j]}\,.
\end{aligned} \tag{1.107}$$

This expression of $Z[j]$ is the most useful, since it factorizes the interactions into a
(functional) differential operator acting on $Z_0[j]$, the generating functional for the
non-interacting theory.

1.6.3 Free generating functional

It turns out that the latter is calculable analytically. The main difficulty in evaluating
$Z_0[j]$ is to deal with the non-commuting objects contained in the exponential. A
central mathematical result that we shall need is a particular case of the Baker-Campbell-Hausdorff
formula (see the section 4.1.5 for a derivation),

$$\text{if }[A,[A,B]]=[B,[A,B]]=0\,,\qquad e^{A}\,e^{B}=e^{A+B}\,e^{\frac12[A,B]}\,. \tag{1.108}$$
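
Eq. (1.108) can be checked exactly in a finite-dimensional toy model. The sketch below is not the field-theory setting: it assumes $3\times3$ strictly upper-triangular matrices, whose commutator is central (it commutes with both $A$ and $B$), and uses exact rational arithmetic so that no rounding enters. All helper names and the two coefficients are illustrative choices.

```python
from fractions import Fraction


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(3)] for i in range(3)]


def scale(c, a):
    return [[c * a[i][j] for j in range(3)] for i in range(3)]


IDENTITY = [[Fraction(int(i == j)) for j in range(3)] for i in range(3)]


def exp_nilpotent(m):
    """Exact exponential of a strictly upper-triangular 3x3 matrix (m^3 = 0)."""
    return add(add(IDENTITY, m), scale(Fraction(1, 2), matmul(m, m)))


def commutator(a, b):
    return add(matmul(a, b), scale(Fraction(-1), matmul(b, a)))


a_coef, b_coef = Fraction(2, 3), Fraction(-5, 7)   # arbitrary illustrative numbers
A = [[0, a_coef, 0], [0, 0, 0], [0, 0, 0]]
B = [[0, 0, 0], [0, 0, b_coef], [0, 0, 0]]
C = commutator(A, B)                                # central: [A, C] = [B, C] = 0

lhs = matmul(exp_nilpotent(A), exp_nilpotent(B))
rhs = matmul(exp_nilpotent(add(A, B)), exp_nilpotent(scale(Fraction(1, 2), C)))
print(lhs == rhs)
```

The same algebraic structure, a commutator that commutes with everything else, is what makes the formula usable for exponentials of creation and annihilation operators.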

This formula is applicable here because commutators $[a,a^\dagger]$ are c-numbers that
commute with everything else. In order to apply it, let us slice the time axis into an
infinite number of small intervals, by writing

$$T\exp\int_{-\infty}^{+\infty}d^4x\;O(x)=\prod_{i=-\infty}^{+\infty}T\exp\int_{x_i^0}^{x_{i+1}^0}d^4x\;O(x)\,, \tag{1.109}$$

where the intermediate times are ordered according to $\cdots x_i^0<x_{i+1}^0<\cdots$. The
product in the right hand side should be understood with the convention that the
factors are ordered from left to right when the index $i$ decreases. When the size
$\Delta\equiv x_{i+1}^0-x_i^0$ of these intervals goes to zero, the time-ordering can be removed in
the individual factors¹⁵:

$$T\exp\int_{-\infty}^{+\infty}d^4x\;O(x)=\lim_{\Delta\to0^+}\prod_{i=-\infty}^{+\infty}\exp\int_{x_i^0}^{x_{i+1}^0}d^4x\;O(x)\,. \tag{1.110}$$

A first application of the Baker-Campbell-Hausdorff formula leads to

$$\begin{aligned}
T\exp\, i\int d^4x\;j(x)\phi_{\rm in}(x)=\;&\exp\, i\int d^4x\;j(x)\phi_{\rm in}(x)\\
\times&\exp\Big\{-\frac12\int d^4x\,d^4y\;\theta(x^0-y^0)\,j(x)j(y)\,\big[\phi_{\rm in}(x),\phi_{\rm in}(y)\big]\Big\}\,.
\end{aligned} \tag{1.111}$$

Note that the exponential in the second line is a c-number. In the end, we will need to
evaluate the expectation value of this operator in the $|0_{\rm in}\rangle$ vacuum state. Therefore, it
is desirable to transform it in such a way that the annihilation operators are on the
right and the creation operators are on the left. This can be achieved by writing

$$\begin{aligned}
\phi_{\rm in}(x)&=\phi^{(+)}_{\rm in}(x)+\phi^{(-)}_{\rm in}(x)\,,\\
\phi^{(+)}_{\rm in}(x)&\equiv\int\frac{d^3\mathbf p}{(2\pi)^3\,2E_{\mathbf p}}\;a^\dagger_{\mathbf p,{\rm in}}\,e^{+ip\cdot x}\,,\\
\phi^{(-)}_{\rm in}(x)&\equiv\int\frac{d^3\mathbf p}{(2\pi)^3\,2E_{\mathbf p}}\;a_{\mathbf p,{\rm in}}\,e^{-ip\cdot x}\,,
\end{aligned} \tag{1.112}$$

and by using once again the Baker-Campbell-Hausdorff formula. We obtain

$$\begin{aligned}
T\exp\, i\int d^4x\;j(x)\phi_{\rm in}(x)=\;&\exp\Big[i\int d^4x\;j(x)\phi^{(+)}_{\rm in}(x)\Big]\;\exp\Big[i\int d^4x\;j(x)\phi^{(-)}_{\rm in}(x)\Big]\\
\times&\exp\Big\{\frac12\int d^4x\,d^4y\;j(x)j(y)\,\big[\phi^{(+)}_{\rm in}(x),\phi^{(-)}_{\rm in}(y)\big]\Big\}\\
\times&\exp\Big\{-\frac12\int d^4x\,d^4y\;\theta(x^0-y^0)\,j(x)j(y)\,\big[\phi_{\rm in}(x),\phi_{\rm in}(y)\big]\Big\}\,.
\end{aligned} \tag{1.113}$$

¹⁵ Field operators commute for space-like intervals,
$$\big[O(x),O(y)\big]=0\quad\text{if }(x-y)^2<0\,.$$
Moreover, when $\Delta\to0$, the separation between any pair of points $x,y$ with $x_i^0<x^0,y^0<x_{i+1}^0$ is always space-like.

The operator that appears in the right hand side of the first line is called a normal-ordered
exponential, and is denoted by bracketing the exponential between a pair of
colons (: · · · :):

$$:\exp\, i\int d^4x\;j(x)\phi_{\rm in}(x):\;\equiv\;\exp\Big[i\int d^4x\;j(x)\phi^{(+)}_{\rm in}(x)\Big]\;\exp\Big[i\int d^4x\;j(x)\phi^{(-)}_{\rm in}(x)\Big]\,. \tag{1.114}$$

A crucial property of the normal ordered exponential is that its in-vacuum expectation
value is equal to unity:

$$\langle 0_{\rm in}|:\exp\, i\int d^4x\;j(x)\phi_{\rm in}(x):|0_{\rm in}\rangle=1\,. \tag{1.115}$$

Therefore, we have proven that the generating functional of the free theory is a
Gaussian in $j(x)$,

$$Z_0[j]=\exp\Big\{-\frac12\int d^4x\,d^4y\;j(x)j(y)\;G^0_F(x,y)\Big\}\,, \tag{1.116}$$

where $G^0_F(x,y)$ is a 2-point function called the free Feynman propagator and defined
as

$$G^0_F(x,y)=\theta(x^0-y^0)\,\big[\phi_{\rm in}(x),\phi_{\rm in}(y)\big]-\big[\phi^{(+)}_{\rm in}(x),\phi^{(-)}_{\rm in}(y)\big]\,. \tag{1.117}$$

1.6.4 Feynman propagator

Since the commutators in the right hand side of eq. (1.117) are c-numbers, we can
also write

$$\begin{aligned}
G^0_F(x,y)&=\langle 0_{\rm in}|\,\theta(x^0-y^0)\,\big[\phi_{\rm in}(x),\phi_{\rm in}(y)\big]-\big[\phi^{(+)}_{\rm in}(x),\phi^{(-)}_{\rm in}(y)\big]\,|0_{\rm in}\rangle\\
&=\langle 0_{\rm in}|\,T\,\phi_{\rm in}(x)\phi_{\rm in}(y)\,|0_{\rm in}\rangle\,.
\end{aligned} \tag{1.118}$$

In other words, the free Feynman propagator is the in-vacuum expectation value of
the time-ordered product of two free fields. Using the Fourier mode decomposition of
$\phi_{\rm in}$ and the commutation relation between creation and annihilation operators, the
Feynman propagator can be rewritten as follows

$$G^0_F(x,y)=\int\frac{d^3\mathbf p}{(2\pi)^3\,2E_{\mathbf p}}\,\Big\{\theta(x^0-y^0)\,e^{-ip\cdot(x-y)}+\theta(y^0-x^0)\,e^{+ip\cdot(x-y)}\Big\}\,. \tag{1.119}$$

In the following, we will also make an extensive use of the Fourier transform of
this propagator (with respect to the difference of coordinates $x^\mu-y^\mu$, since it is
translation invariant):

$$\begin{aligned}
\widetilde G^0_F(k)&\equiv\int d^4(x-y)\;e^{ik\cdot(x-y)}\;G^0_F(x,y)\\
&=\frac{1}{2E_{\mathbf k}}\,\Big\{\int_0^{+\infty}dz^0\;e^{i(k^0-E_{\mathbf k})z^0}+\int_{-\infty}^{0}dz^0\;e^{i(k^0+E_{\mathbf k})z^0}\Big\}\,.
\end{aligned} \tag{1.120}$$

The remaining Fourier integrals over $z^0$ are not defined as ordinary functions. Instead,
they are distributions, that can also be viewed as the limiting value of a family of
ordinary functions. In order to see this, let us write

$$\int_0^{+\infty}dz^0\;e^{iaz^0}=\lim_{\epsilon\to0^+}\int_0^{+\infty}dz^0\;e^{i(a+i\epsilon)z^0}=\frac{i}{a+i0^+}\,. \tag{1.121}$$

Likewise

$$\int_{-\infty}^{0}dz^0\;e^{iaz^0}=\lim_{\epsilon\to0^+}\int_{-\infty}^{0}dz^0\;e^{i(a-i\epsilon)z^0}=-\frac{i}{a-i0^+}\,. \tag{1.122}$$

Therefore, the Fourier space Feynman propagator reads

$$\widetilde G^0_F(k)=\frac{i}{k^2-m^2+i0^+}\,. \tag{1.123}$$
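
The step from eqs. (1.120)-(1.122) to eq. (1.123) can be checked numerically by keeping the regulator $\epsilon$ small but finite. In the sketch below (function names and numerical values are illustrative assumptions), the sum of the two regulated time integrals is compared with $i/(k_0^2-E_{\mathbf k}^2+2iE_{\mathbf k}\epsilon)$; the finite width $2E_{\mathbf k}\epsilon$ plays the role of $i0^+$, and the two expressions differ only by terms of relative order $\epsilon/E_{\mathbf k}$.

```python
def propagator_from_time_integrals(k0, energy, eps):
    """Right-hand side of eq. (1.120), using the regulated results (1.121)-(1.122)."""
    forward = 1j / (k0 - energy + 1j * eps)     # integral over z0 from 0 to +infinity
    backward = -1j / (k0 + energy - 1j * eps)   # integral over z0 from -infinity to 0
    return (forward + backward) / (2.0 * energy)


def feynman_propagator(k0, energy, eps):
    """Eq. (1.123) with k^2 - m^2 = k0^2 - E_k^2 and a finite regulator 2*E_k*eps."""
    return 1j / (k0**2 - energy**2 + 2j * energy * eps)


k0, energy, eps = 1.3, 2.0, 1e-8    # off-shell point, small regulator
lhs = propagator_from_time_integrals(k0, energy, eps)
rhs = feynman_propagator(k0, energy, eps)
print(abs(lhs - rhs) / abs(rhs))    # relative difference, of order eps / energy
```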

Note that $\widetilde G^0_F(k)$ is Lorentz invariant. Hence, $G^0_F(x,y)$ is also Lorentz invariant¹⁶.
It is sometimes useful to have a representation of eq. (1.123) in terms of distributions.
This is provided by the following identity:

$$\frac{i}{z+i0^+}=i\,\mathcal P\Big(\frac1z\Big)+\pi\,\delta(z)\,, \tag{1.124}$$

where $\mathcal P(1/z)$ is the principal value of $1/z$ (i.e. the distribution obtained by cutting
out symmetrically an infinitesimal interval around $z=0$). As far as integration
over the variable $z$ is concerned, this prescription amounts to shifting the pole slightly
below the real axis, or equivalently to going around the pole at $z=0$ from above (the
term in $\pi\delta(z)$ can be viewed as the result of the integral on the infinitesimally small
half-circle around the pole).

[Figure: the two equivalent integration contours in the complex $z$ plane, one with the pole shifted by $i0^+$ off the real axis, the other with the contour deformed around the pole at $0$.]

¹⁶ This is somewhat obfuscated by the fact that the step functions $\theta(\pm(x^0-y^0))$ that enter in the definition of the time-ordered product are not Lorentz invariant. The Lorentz invariance of time-ordered products follows from the following properties:
• if $(x-y)^2<0$, then the two fields commute and the time ordering is irrelevant,
• if $(x-y)^2\ge0$, then the sign of $x^0-y^0$ is Lorentz invariant.


From eq. (1.123), it is trivial to check that $G^0_F(x,y)$ is a Green’s function of the
operator $\Box_x+m^2$ (up to a normalization factor $-i$):

$$(\Box_x+m^2)\,G^0_F(x,y)=-i\,\delta(x-y)\,. \tag{1.125}$$

Strictly speaking, the operator $\Box_x+m^2$ is not invertible, since it admits as zero modes
all the plane waves $\exp(\pm ik\cdot x)$ with an on-shell momentum $k_0^2=\mathbf k^2+m^2$. The
$i0^+$ prescription in the denominator of eq. (1.123) amounts to shifting infinitesimally
the zeroes of $k_0^2=\mathbf k^2+m^2$ in the complex $k_0$ plane, in order to have a well
defined inverse. The regularization of eq. (1.123) is specific to the time-ordered
propagator. Other regularizations would provide different propagators; for instance
the free retarded propagator is given by

$$\widetilde G^0_R(k)=\frac{i}{(k_0+i0^+)^2-(\mathbf k^2+m^2)}\,. \tag{1.126}$$

One can easily check that its inverse Fourier transform is a function $G^0_R(x,y)$ that
satisfies

$$\begin{aligned}
(\Box_x+m^2)\,G^0_R(x,y)&=-i\,\delta(x-y)\,,\\
G^0_R(x,y)&=0\quad\text{if }x^0<y^0\,.
\end{aligned} \tag{1.127}$$

In other words, $G^0_R$ is also a Green’s function of the operator $\Box_x+m^2$, but with
boundary conditions that differ from those of $G^0_F$.

1.7 Perturbative expansion and Feynman rules


The generating functional Z[j] is usually not known analytically in closed form, but
is given indirectly by eq. (1.107) as the action of a functional differential operator
that acts on the generating functional of the free theory. The latter is a Gaussian in j,
whose variance is given by the free Feynman propagator $G^0_F$. Although not explicit,
this formula provides a straightforward method for obtaining vacuum expectation
values of T-products of fields to a given order in the coupling constant λ.

1.7.1 Examples

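Let us first illustrate this by computing to order $\lambda^1$ the following two functions:
$\langle 0_{\rm out}|0_{\rm in}\rangle$ and $\langle 0_{\rm out}|\,T\,\phi(x)\phi(y)\,|0_{\rm in}\rangle$. In order to make the notations a bit lighter, we
denote $G^0_{xy}\equiv G^0_F(x,y)$. At order one in $\lambda$, we have

$$\begin{aligned}
\langle 0_{\rm out}|0_{\rm in}\rangle=Z[0]&=\Big[1-i\frac{\lambda}{4!}\int d^4z\,\Big(\frac{\delta}{i\delta j(z)}\Big)^4+\mathcal O(\lambda^2)\Big]\;Z_0[j]\Big|_{j=0}\\
&=1-i\frac{\lambda}{8}\int d^4z\;\big(G^0_{zz}\big)^2+\mathcal O(\lambda^2)\,,
\end{aligned} \tag{1.128}$$

and

$$\begin{aligned}
\langle 0_{\rm out}|\,T\,\phi(x)\phi(y)\,|0_{\rm in}\rangle
&=\Big[1-i\frac{\lambda}{4!}\int d^4z\,\Big(\frac{\delta}{i\delta j(z)}\Big)^4+\mathcal O(\lambda^2)\Big]\;\left.\frac{\delta^2Z_0[j]}{i^2\,\delta j(x)\delta j(y)}\right|_{j=0}\\
&=G^0_{xy}-i\frac{\lambda}{8}\,G^0_{xy}\int d^4z\;\big(G^0_{zz}\big)^2-i\frac{\lambda}{2}\int d^4z\;G^0_{xz}\,G^0_{zz}\,G^0_{zy}+\mathcal O(\lambda^2)\\
&=\underbrace{\Big[1-i\frac{\lambda}{8}\int d^4z\;\big(G^0_{zz}\big)^2+\mathcal O(\lambda^2)\Big]}_{Z[0]}\times\Big[G^0_{xy}-i\frac{\lambda}{2}\int d^4z\;G^0_{xz}\,G^0_{zz}\,G^0_{zy}+\mathcal O(\lambda^2)\Big]\,.
\end{aligned} \tag{1.129}$$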

Although the final expressions at order one are rather simple, the intermediate steps
are quite cumbersome due to the necessity of taking a large number of functional
derivatives. Moreover, the expression of the 2-point function $\langle 0_{\rm out}|\,T\,\phi(x)\phi(y)\,|0_{\rm in}\rangle$
becomes simpler after we notice that one can factor out Z[0]. This property is in fact
completely general; all transition amplitudes contain a factor Z[0]. From the remark
made after eq. (1.102), this factor is a pure phase and its squared modulus is one
and will have no effect in transition probabilities. Therefore, it would be desirable
to identify from the start the terms that lead to this prefactor, to avoid unnecessary
calculations.
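
The numerical prefactors 1/8 and 1/2 of eqs. (1.128) and (1.129) can be recovered by counting pairings. Because $Z_0[j]$ is Gaussian, the functional derivatives at $j=0$ produce one product of propagators per way of grouping the fields into pairs. The sketch below enumerates these pairings for the single vertex $\phi^4(z)$, treating its four legs as distinguishable; the helper names are illustrative, and only the counting (not the space-time integrals) is performed.

```python
from fractions import Fraction


def pairings(labels):
    """Yield every way of grouping the labels into unordered pairs."""
    if not labels:
        yield []
        return
    first, rest = labels[0], labels[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


vertex_legs = ["z1", "z2", "z3", "z4"]   # the four fields of phi^4(z)
vertex_weight = Fraction(1, 24)          # the 1/4! of eq. (1.65)

# Vacuum graph of eq. (1.128): the four legs are paired among themselves.
vacuum = vertex_weight * sum(1 for _ in pairings(vertex_legs))

# Two-point function of eq. (1.129): split according to whether x is paired with y.
disconnected = connected = 0
for pairing in pairings(["x", "y"] + vertex_legs):
    if ("x", "y") in pairing:
        disconnected += 1    # G_xy times (G_zz)^2
    else:
        connected += 1       # G_xz G_zz G_zy
print(vacuum, vertex_weight * disconnected, vertex_weight * connected)
```

There are 3 pairings of the four vertex legs, and the 15 pairings of the six fields split into 3 disconnected and 12 connected ones, which gives 3/4! = 1/8 and 12/4! = 1/2.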

1.7.2 Diagrammatic representation


This simplification follows a quite transparent rule if we represent the above expressions
diagrammatically, by introducing the following notation

$$G^0_{xy}\equiv\big[\text{line joining }x\text{ and }y\big]\,. \tag{1.130}$$

The functions considered above can be represented as follows:

$$\begin{aligned}
Z[0]&=1+\frac18\,\big[\text{two closed loops attached to the vertex }z\big]+\mathcal O(\lambda^2)\,,\\
\langle 0_{\rm out}|\,T\,\phi(x)\phi(y)\,|0_{\rm in}\rangle&=\big[\text{line }x\text{–}y\big]+\frac18\,\big[\text{line }x\text{–}y\big]\times\big[\text{two closed loops attached to }z\big]\\
&\quad+\frac12\,\big[\text{line }x\text{–}z\text{–}y\text{ with one closed loop attached to }z\big]+\mathcal O(\lambda^2)\,.
\end{aligned} \tag{1.131}$$

The graphs that appear in the right hand side of these equations are called Feynman
diagrams. If we add to eq. (1.130) the rule that each vertex should have a factor −iλ
and an integration over the entire space-time, these graphs are in one-to-one
correspondence with the expressions of eqs. (1.128) and (1.129). For now, we have
recalled explicitly the numerical prefactors (1/8, 1/2,...) but they can in fact be
recovered simply from the symmetries of the graphs.
In the second of eqs. (1.131), the second term of the right hand side contains a
factor which is not connected to any of the points x and y. These disconnected graphs
are precisely the ones responsible for the factor Z[0] that appears in all transition
amplitudes. We can therefore disregard this type of graph altogether.

1.7.3 Feynman rules


The diagrammatic representation of eqs. (1.131) can in fact be used to completely
bypass the explicit calculation of the functional derivatives of Z0 [j]. The rules that
govern this construction are called Feynman rules. The contributions of order λp to a
n-point time-ordered product of fields 0out Tφ(x1 ) · · · φ(xn ) 0in can be obtained
as follows:

1. Draw all the graphs (with only vertices of valence 4) that connect the n points
x1 to xn and have exactly p vertices. Graphs that contain a subgraph which is
not connected to any of the xi ’s should be ignored.
2. Each line of a graph represents a free Feynman propagator G0F .
3. Each vertex represents a factor −iλ and an integral over the space-time coordi-
nate assigned to this vertex.
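4. The numerical prefactor for a given graph is the inverse of the order of its
discrete symmetry group. As an illustration, we indicate below the orders of
these symmetry groups for the graphs that appear in eqs. (1.131):

$$\begin{aligned}
\big[\text{two closed loops attached to the vertex }z\big]&\;\longrightarrow\;\text{order }8\;\longrightarrow\;\frac18\,,\\
\big[\text{line }x\text{–}z\text{–}y\text{ with one closed loop attached to }z\big]&\;\longrightarrow\;\text{order }2\;\longrightarrow\;\frac12\,.
\end{aligned} \tag{1.132}$$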

Note that this rule for obtaining the symmetry factor associated to a given graph
is correct only if the corresponding term in the Lagrangian has been properly
symmetrized. For instance, the operator φ4 should appear in the Lagrangian
with a prefactor 1/4!.

1.7.4 Connected graphs

At the step 1, graphs made of several disconnected subgraphs can usually appear
in certain functions, provided that each subgraph is connected to at least one of the
points xi . For instance, a 4-point function contains a piece which is simply made of
the product of two 2-point functions. In addition, it contains terms that correspond to
a genuine 4-point function, not factorizable in a product of 2-point functions. The
factorizable pieces are usually less interesting because they can be recovered from
already calculated simpler building blocks17 . For this reason, it is sometimes useful
to introduce the generating function of the connected graphs, denoted W[j]. This
functional is very simply related to Z[j] by

W[j] = log Z[j] . (1.133)

To give a glimpse of this identity, let us write



X Z
1
W[j] = d4 x1 · · · d4 xn Cn (x1 , · · · , xn ) j(x1 ) · · · j(xn ) , (1.134)
n!
n=1

17 Moreover, in scattering amplitudes, these disconnected contributions are not physically interesting.

For instance, if an m + n → p + q amplitude factorizes into the product of two sub-amplitudes (m → p


and n → q, respectively), then the corresponding sub-processes can happen at very distant locations in
space-time, which is usually not what one wants.

where the C_n(x_1, · · · , x_n) are n-point functions whose diagrammatic representation
contains only connected graphs. If we expand Z[j] = exp W[j], we obtain
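\[
\begin{aligned}
Z[j] ={}& 1 + \int d^4x\; C_1(x)\, j(x) \\
&+ \frac{1}{2!} \int d^4x\, d^4y\;
\underbrace{\big[ C_2(x,y) + C_1(x)C_1(y) \big]}_{\langle 0_{\rm out}|\, T\,\phi(x)\phi(y)\, |0_{\rm in}\rangle}\;
j(x)j(y) \\
&+ \frac{1}{3!} \int d^4x\, d^4y\, d^4z\;
\underbrace{\big[ C_3(x,y,z) + C_2(x,y)C_1(z) + C_2(y,z)C_1(x) + C_2(z,x)C_1(y)
+ C_1(x)C_1(y)C_1(z) \big]}_{\langle 0_{\rm out}|\, T\,\phi(x)\phi(y)\phi(z)\, |0_{\rm in}\rangle}\;
j(x)j(y)j(z) \\
&+ \cdots
\end{aligned}
\tag{1.135}
\]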

This expansion highlights how the vacuum expectation values of time-ordered products
of fields can be factorized into products of connected contributions.
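The combinatorics of eq. (1.135) can be made explicit with a short sketch: the full n-point function is a sum over all the ways of splitting the n points into groups, each group carrying one connected function. The helper names `set_partitions` and `full_from_connected` are illustrative choices, not notation from the text, and the functions C_n are only handled as symbolic labels.

```python
def set_partitions(items):
    """Yield every partition of the list `items` into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # `first` sits alone in a new block ...
        yield [[first]] + partition
        # ... or joins one of the existing blocks
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]


def full_from_connected(points):
    """List the products of connected functions that build the full n-point function."""
    terms = []
    for partition in set_partitions(list(points)):
        factors = sorted("C%d(%s)" % (len(block), ",".join(sorted(block)))
                         for block in partition)
        terms.append(" ".join(factors))
    return sorted(terms)


if __name__ == "__main__":
    for points in ("x", "xy", "xyz"):
        print(points, "->", " + ".join(full_from_connected(points)))
```

For three points this reproduces the five terms inside the second bracket of eq. (1.135); since the C_n are symmetric in their arguments, the arguments are simply printed in alphabetical order.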

1.7.5 Feynman rules in momentum space


Until now, we have obtained Feynman rules in terms of objects that depend on space-
time coordinates, leading to expressions for the perturbative expansion of the vacuum
expectation value of time-ordered products of fields. However, in most practical
applications, we need subsequently to use the LSZ reduction formula (1.83) to turn
these expectation values into transition amplitudes. This involves the application of
the operator i(□ + m^2) to each external point, and a Fourier transform. Firstly, note
that thanks to eq. (1.125), the application of i(□ + m^2) simply removes the external
line to which it is applied:

    i(□_x + m^2) [line joining z to x] = [point z, with the line to x removed] .    (1.136)

Thus, these operators just produce Feynman graphs that are amputated of all their
external lines. Then, the Fourier transform can be propagated to all the internal lines
of the graph, leading to an expression that involves propagators and vertices that
depend only on momenta. The Feynman rules for obtaining directly these momentum
space expressions are:

1′. The graph topologies that must be considered are of course unchanged. The
    momenta of the initial state particles enter the graph, and the
    momenta of the final state particles leave the graph.

2′. Each line of a graph represents a free Feynman propagator in momentum space,
    \tilde{G}^0_F(k).

3′. Each vertex represents a factor −iλ(2π)^4 δ(k_1 + · · · + k_4), where the k_i are
    the four momenta entering this vertex.

3′′. All the internal momenta that are not constrained by these delta functions
    should be integrated over with a measure d^4k/(2π)^4.

4′. Symmetry factors are computed as before.

For instance, these rules lead to:

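\[
\begin{aligned}
[\text{one-loop 2-point graph, external momentum } P]
&= -i\,\frac{\lambda}{2} \int \frac{d^4k}{(2\pi)^4}\; \frac{i}{k^2 - m^2 + i0^+} \,, \\
[\text{one-loop 4-point graph, } p_1, p_2 \to q_1, q_2,\ \text{loop momentum } k]
&= \frac{(-i\lambda)^2}{2} \int \frac{d^4k}{(2\pi)^4}\;
\frac{i}{k^2 - m^2 + i0^+}\; \frac{i}{(p_1 + p_2 - k)^2 - m^2 + i0^+} \,.
\end{aligned}
\tag{1.137}
\]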

1.7.6 Counting the powers of λ and h̄

The order in λ of a (connected) graph G is of course related to the number of vertices
n_V in the graph,

\[ G \sim \lambda^{n_V} . \tag{1.138} \]

This can also be related to the number of loops of the graph, which is a better measure
of its complexity since it determines how many momentum integrals it contains. Let
us denote nE the number of external lines, nI the number of internal lines and nL the
number of loops. These parameters are related by the following two identities:

4nV = 2nI + nE
nL = nI − nV + 1 . (1.139)

The first of these equations equates the number of “handles” carried by the vertices
and the number of propagator endpoints that must be attached to them. The right hand
side of the second equation counts the number of internal momenta that are not
constrained by the delta functions of momentum conservation carried by the vertices (the
+1 comes from the fact that not all these delta functions are independent: a linear
combination of them simply states that the sum of the external momenta must be
zero, and therefore does not constrain the internal ones in any way). From these two
identities, one obtains

\[ n_V = n_L - 1 + \frac{n_E}{2} , \tag{1.140} \]

and the order in λ of the graph is also

\[ G \sim \lambda^{\,n_L - 1 + n_E/2} . \tag{1.141} \]

According to this formula, the order of a graph depends only on the number of
external lines nE (i.e. on the number of particles involved in the transition amplitude
under consideration), and on the number of loops. Thus, the perturbative expansion is
also a loop expansion, with the leading order being given by tree diagrams, the first
correction in λ by one-loop graphs, etc...
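As a quick sanity check of the counting identities (1.139) and (1.140), the following sketch computes the number of internal lines and loops of a φ^4 graph from its numbers of vertices and external lines. The function names `graph_counts` and `order_in_lambda` are hypothetical helpers introduced for this illustration only.

```python
def graph_counts(n_vertices, n_external):
    """Return (n_internal, n_loops) for a connected phi^4 graph, using eq. (1.139)."""
    handles = 4 * n_vertices - n_external      # endpoints left for internal lines
    if handles < 0 or handles % 2:
        raise ValueError("no phi^4 graph with these numbers of vertices and external lines")
    n_internal = handles // 2                  # 4 n_V = 2 n_I + n_E
    n_loops = n_internal - n_vertices + 1      # n_L = n_I - n_V + 1
    return n_internal, n_loops


def order_in_lambda(n_loops, n_external):
    """Power of lambda from eq. (1.141); n_external is even in this theory."""
    return n_loops - 1 + n_external // 2


if __name__ == "__main__":
    # (n_V, n_E): bare vertex, one-loop 2-point graph, one-loop 4-point graph
    for n_v, n_e in [(1, 4), (1, 2), (2, 4)]:
        n_i, n_l = graph_counts(n_v, n_e)
        print(n_v, n_e, "-> n_I =", n_i, ", n_L =", n_l,
              ", order in lambda =", order_in_lambda(n_l, n_e))
```

In each case the order in λ returned by `order_in_lambda` coincides with the number of vertices, as eq. (1.140) requires.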
It turns out that the number of loops also counts the order in the Planck constant ħ
of a graph. Although we have been using a system of units in which ħ = 1, it is easy
to reinstate ħ by the substitution

\[
S \;\to\; \frac{S}{\hbar} = - \int d^4x \left\{ \frac{1}{2}\, \phi(x)\, \frac{\Box_x + m^2}{\hbar}\, \phi(x)
+ \frac{\lambda}{4!\,\hbar}\, \phi^4(x) \right\} .
\tag{1.142}
\]

From this, we see that ħ enters in the Feynman rules as follows

\[
\text{Propagator}: \; \frac{i\,\hbar}{p^2 - m^2 + i0^+} \,, \qquad
\text{Vertex}: \; -i\,\frac{\lambda}{\hbar} \,,
\tag{1.143}
\]

and the order in ħ of a graph is given by

\[ G \sim \hbar^{\,n_I - n_V} \sim \hbar^{\,n_L - 1} . \tag{1.144} \]

Therefore, each additional loop brings a power of ħ, and the loop expansion can also
be viewed as an expansion in powers of ħ.

1.8 Calculation of loop integrals

1.8.1 Wick’s rotation

Let us consider the first of the examples given in eq. (1.137) and define
\[
-i\Sigma(P) \equiv -i\,\frac{\lambda}{2} \int \frac{d^4k}{(2\pi)^4}\; \frac{i}{k^2 - m^2 + i0^+} \,.
\tag{1.145}
\]

[Figure 1.2: Wick rotation in the complex k_0 plane. The poles of the propagator are
located at k_0 = −E_p + i0^+ and k_0 = E_p − i0^+.]

In order to calculate the momentum integral, it is useful to perform a Wick rotation, in
which we rotate the k_0 integration axis by 90 degrees to bring it along the imaginary
axis, as illustrated in figure 1.2. The integrals along the horizontal and vertical
axes are opposite because the shaded domain does not contain any of the poles of the
Feynman propagator, and because the propagator vanishes as k_0^{-2} when |k_0| → ∞.
The integral along the vertical axis amounts to writing k_0 = −iκ with κ varying from
−∞ to +∞. After this transformation, the integral of eq. (1.145) becomes

\[
\Sigma(P) = \frac{\lambda}{2} \int \frac{d^4k_E}{(2\pi)^4}\; \frac{1}{k_E^2 + m^2} \,,
\tag{1.146}
\]

where k_E is the Euclidean 4-vector defined by k_E^i = k^i (i = 1, 2, 3) and k_E^4 = κ,
with squared norm k_E^2 = \mathbf{k}^2 + κ^2.
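The Wick rotation can be illustrated numerically on the k_0 integral alone. The sketch below is a toy check, not part of the original derivation: it keeps a small but finite ε in place of 0^+, fixes the energy E_p to an arbitrary value, and compares the integral along the real k_0 axis with the rotated (Euclidean) one. The function name `integrate_on_real_line` and the parameter values are assumptions made for this example.

```python
import cmath
import math


def integrate_on_real_line(f, n=20000):
    """Midpoint rule for the integral of f(u) over (-inf, +inf), with u = tan(t)."""
    h = math.pi / n
    total = 0.0
    for j in range(n):
        t = -math.pi / 2 + (j + 0.5) * h
        u = math.tan(t)
        total += f(u) / math.cos(t) ** 2     # du = dt / cos(t)^2
    return total * h


E = 1.0      # toy value of E_p = sqrt(p^2 + m^2)
EPS = 0.05   # finite stand-in for the infinitesimal 0^+

# Integral of i / (k0^2 - E^2 + i eps) along the real k0 axis
minkowski = integrate_on_real_line(lambda k0: 1j / (k0 ** 2 - E ** 2 + 1j * EPS))

# Same integral after the rotation: 1 / (kappa^2 + E^2 - i eps) over real kappa
euclidean = integrate_on_real_line(lambda kappa: 1 / (kappa ** 2 + E ** 2 - 1j * EPS))

# Closed form of the Euclidean integral
exact = math.pi / cmath.sqrt(E ** 2 - 1j * EPS)

if __name__ == "__main__":
    print("real axis      :", minkowski)
    print("imaginary axis :", euclidean)
    print("closed form    :", exact)
```

With a finite ε the two poles sit in the second and fourth quadrants, so the rotation sweeps only pole-free regions and the two numbers agree up to the quadrature error.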

1.8.2 Volume element in D dimensions

When the integrand depends only on the norm |kE |, we can separate the radial
integration on |kE | from the angular integration over the orientation of the vector in 4-
dimensional Euclidean space. In D dimensions, the volume measure for a rotationally
invariant integrand reads

\[ d^Dk_E = D\, V_D(1)\; k_E^{D-1}\, dk_E \,, \tag{1.147} \]

where V_D(k_E) is the volume of the D-dimensional ball of radius k_E. These volumes
can be determined recursively by

\[
V_1(k_E) = 2 k_E \,, \qquad
V_D(k_E) = k_E \int_0^\pi d\theta\; \sin\theta\;\, V_{D-1}(k_E \sin\theta) \,.
\tag{1.148}
\]

Therefore, we have

\[
V_2(k_E) = \pi k_E^2 \,, \qquad V_3(k_E) = \frac{4\pi}{3}\, k_E^3 \,, \qquad
V_4(k_E) = \frac{\pi^2}{2}\, k_E^4 \,.
\tag{1.149}
\]

Although knowing V_4(k_E) is sufficient for performing a radial momentum integral in
four dimensions, it is interesting to have the formula for an arbitrary dimension, in
view of applications to dimensional regularization. More generally, we have

\[
V_{D+1}(1) = V_D(1)\; \pi^{1/2}\, \frac{\Gamma(\frac{D}{2} + 1)}{\Gamma(\frac{D}{2} + \frac{3}{2})}
\qquad\text{and}\qquad
V_D(1) = \frac{2\, \pi^{D/2}}{D\; \Gamma(\frac{D}{2})} \,.
\tag{1.150}
\]
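The recursion (1.148) and the closed form (1.150) can be cross-checked numerically. The sketch below uses only the scaling V_{D−1}(r) = r^{D−1} V_{D−1}(1), which reduces each step of the recursion to an integral of sin^D θ; the function names are illustrative and the midpoint quadrature is an arbitrary choice.

```python
import math


def unit_ball_volume(D):
    """Closed form of eq. (1.150): V_D(1) = 2 pi^(D/2) / (D Gamma(D/2))."""
    return 2 * math.pi ** (D / 2) / (D * math.gamma(D / 2))


def unit_ball_volume_recursive(D, n=2000):
    """V_D(1) from the recursion (1.148), starting from V_1(1) = 2."""
    if D == 1:
        return 2.0
    # V_D(1) = int_0^pi dtheta sin(theta) V_{D-1}(sin(theta))
    #        = V_{D-1}(1) * int_0^pi dtheta sin(theta)^D
    h = math.pi / n
    angular = sum(math.sin((j + 0.5) * h) ** D for j in range(n)) * h
    return unit_ball_volume_recursive(D - 1, n) * angular


if __name__ == "__main__":
    for D in range(1, 7):
        print(D, unit_ball_volume(D), unit_ball_volume_recursive(D))
```

For D = 2, 3, 4 the output can be compared with π, 4π/3 and π^2/2 from eq. (1.149).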

1.8.3 Feynman parameterization of denominators

Let us now consider the second diagram of eq. (1.137) (with the notation P ≡ p1 +p2 ),

\[
-i\Gamma_4(P) \equiv \frac{(-i\lambda)^2}{2} \int \frac{d^4k}{(2\pi)^4}\;
\frac{i}{k^2 - m^2 + i0^+}\; \frac{i}{(P - k)^2 - m^2 + i0^+} \,.
\tag{1.151}
\]

In this more complicated example, an extra difficulty is that the integrand is not
rotationally invariant. The following trick, known as Feynman parameterization, can
be used to rearrange the denominators18 :

\[
\frac{1}{AB} = \int_0^1 \frac{dx}{[x A + (1 - x) B]^2} \,.
\tag{1.152}
\]

The denominator resulting from this transformation is

\[
x\,(k^2 - m^2 + i0^+) + (1 - x)\,((P - k)^2 - m^2 + i0^+) = l^2 - m^2 - \Delta(x, P) + i0^+ \,,
\tag{1.153}
\]

where we denote l ≡ k − (1 − x)P and ∆(x, P) ≡ −x(1 − x)P^2. At this point, we
can apply a Wick rotation19 to the shifted integration variable l, in order to obtain

\[
\Gamma_4(P) = -\frac{\lambda^2}{2} \int_0^1 dx \int \frac{d^4l_E}{(2\pi)^4}\;
\frac{1}{[l_E^2 + m^2 + \Delta(x, P)]^2} \,,
\tag{1.154}
\]

where the integrand is again invariant by rotation in 4-dimensional Euclidean space.
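Both steps of this manipulation lend themselves to a quick numerical check. The sketch below, with arbitrary test values and hypothetical function names, verifies the Feynman parameter formula (1.152) by quadrature, and the completion of the square in eq. (1.153) for sample 4-vectors (the i0^+ terms are dropped, since they are identical on both sides).

```python
def feynman_parameter_integral(A, B, n=2000):
    """Midpoint-rule estimate of int_0^1 dx / [x A + (1 - x) B]^2."""
    h = 1.0 / n
    total = 0.0
    for j in range(n):
        x = (j + 0.5) * h
        total += h / (x * A + (1 - x) * B) ** 2
    return total


def minkowski_dot(a, b):
    """Scalar product with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def combined_denominator(k, P, m2, x):
    """Left-hand side of eq. (1.153), without the i0^+."""
    P_minus_k = [Pi - ki for Pi, ki in zip(P, k)]
    return x * (minkowski_dot(k, k) - m2) + (1 - x) * (minkowski_dot(P_minus_k, P_minus_k) - m2)


def shifted_denominator(k, P, m2, x):
    """Right-hand side of eq. (1.153): l^2 - m^2 - Delta(x, P)."""
    l = [ki - (1 - x) * Pi for ki, Pi in zip(k, P)]
    delta = -x * (1 - x) * minkowski_dot(P, P)
    return minkowski_dot(l, l) - m2 - delta


if __name__ == "__main__":
    print(feynman_parameter_integral(2.0, 5.0), 1 / (2.0 * 5.0))
    k, P = [0.3, -1.2, 0.7, 2.1], [1.5, 0.4, -0.6, 0.9]
    print(combined_denominator(k, P, 1.0, 0.37), shifted_denominator(k, P, 1.0, 0.37))
```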

18 For n denominators, this formula can be generalized into

\[
\frac{1}{A_1 A_2 \cdots A_n} = \Gamma(n) \int_0^1 dx_1 \cdots dx_n\;
\delta\Big(1 - \sum_i x_i\Big)\; \frac{1}{[x_1 A_1 + \cdots + x_n A_n]^n} \,.
\]

19 It is allowed because the integration axis can be rotated counterclockwise without passing through the
poles in the variable l_0.

1.9 Källén-Lehmann spectral representation


As we shall see now, the limit in eq. (1.61) that relates the interacting field φ and the
free field of the interaction picture φin is too naive. One of the consequences is that
we will have to make a slight modification to the reduction formula (1.83).
Consider the time-ordered 2-point function,

\[
\langle 0_{\rm out}|\, T\, \phi(x)\phi(y)\, |0_{\rm in}\rangle
= \theta(x^0 - y^0)\, \langle 0_{\rm out}|\, \phi(x)\phi(y)\, |0_{\rm in}\rangle
+ \theta(y^0 - x^0)\, \langle 0_{\rm out}|\, \phi(y)\phi(x)\, |0_{\rm in}\rangle \,.
\tag{1.155}
\]

For each of the expectation values in the right hand side, let us insert an identity
operator between the two field operators, written in the form of a sum over all the
possible physical states,

\[ 1 = \sum_{\text{states } \lambda} |\lambda\rangle \langle\lambda| \,. \tag{1.156} \]

The states λ can be arranged into classes inside which the states differ only by a boost.
A class of states, that we will denote α, is characterized by its particle content and
by the relative momenta of these particles. Within a class, the total momentum of
the state can be varied by applying a Lorentz boost. For a class α, we will denote
|α_p⟩ the state of total momentum p. Each class of states has an invariant mass m_α,
such that the total energy p_0 and total momentum p of the states in this class obey
p_0^2 − p^2 = m_α^2. In addition, it is useful to isolate the vacuum in the sum over the
states. Therefore, the identity operator can be rewritten as

\[
1 = |0\rangle \langle 0| + \sum_{\text{classes } \alpha} \int
\frac{d^3\mathbf{p}}{(2\pi)^3\, 2\sqrt{\mathbf{p}^2 + m_\alpha^2}}\;
|\alpha_{\mathbf{p}}\rangle \langle \alpha_{\mathbf{p}}| \,,
\tag{1.157}
\]
where we have written the integral over the total momentum of the states in a Lorentz
invariant fashion. (We need not specify if we are using in or out states here.)
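When we insert this identity operator between the two field operators, the vacuum
does not contribute. For instance

\[ \langle 0_{\rm out}|\, \phi(x)\, |0\rangle = 0 \,. \tag{1.158} \]

(φ creates or destroys a particle, and therefore has a vanishing matrix element between
vacuum states.) Using the momentum operator \hat{P}, we can write

\[
\begin{aligned}
\langle 0_{\rm out}|\, \phi(x)\, |\alpha_{\mathbf{p}}\rangle
&= \langle 0_{\rm out}|\, e^{i\hat{P}\cdot x}\, \phi(0)\, e^{-i\hat{P}\cdot x}\, |\alpha_{\mathbf{p}}\rangle \\
&= \langle 0_{\rm out}|\, \phi(0)\, |\alpha_{\mathbf{p}}\rangle\; e^{-ip\cdot x} \\
&= \langle 0_{\rm out}|\, \phi(0)\, |\alpha_{\mathbf{0}}\rangle\; e^{-ip\cdot x} \,.
\end{aligned}
\tag{1.159}
\]

The second line uses the fact that the total momentum in the vacuum state is zero,
and is p for the state |α_p⟩. In the last equality, we have applied a boost that cancels the
total momentum p, and used the fact that the vacuum is invariant, as well as the scalar
field φ(0). Therefore, we obtain the following representation for the time-ordered
2-point function

\[
\begin{aligned}
\langle 0_{\rm out}|\, T\, \phi(x)\phi(y)\, |0_{\rm in}\rangle
={}& \sum_{\text{classes } \alpha}
\langle 0_{\rm out}|\, \phi(0)\, |\alpha_{\mathbf{0}}\rangle
\langle \alpha_{\mathbf{0}}|\, \phi(0)\, |0_{\rm in}\rangle \\
&\times \underbrace{\int \frac{d^3\mathbf{p}}{(2\pi)^3\, 2\sqrt{\mathbf{p}^2 + m_\alpha^2}}
\Big[ \theta(x^0 - y^0)\, e^{-ip\cdot(x - y)} + \theta(y^0 - x^0)\, e^{ip\cdot(x - y)} \Big]}_{G_F^0(x,\, y;\, m_\alpha^2)} \,,
\end{aligned}
\tag{1.160}
\]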

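where the underbraced integral, G_F^0(x, y; m_α^2), is the Feynman propagator for a
hypothetical scalar field of mass m_α (compare this integral with eq. (1.119)). It is
customary to rewrite the above representation as

\[
\langle 0_{\rm out}|\, T\, \phi(x)\phi(y)\, |0_{\rm in}\rangle
= \int_0^\infty \frac{dM^2}{2\pi}\; \rho(M^2)\; G_F^0(x, y; M^2) \,,
\tag{1.161}
\]

where ρ(M^2) is the spectral function defined as

\[
\rho(M^2) \equiv 2\pi \sum_{\text{classes } \alpha} \delta(M^2 - m_\alpha^2)\;
\langle 0_{\rm out}|\, \phi(0)\, |\alpha_{\mathbf{0}}\rangle
\langle \alpha_{\mathbf{0}}|\, \phi(0)\, |0_{\rm in}\rangle \,.
\tag{1.162}
\]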
This function describes the invariant mass distribution of the non-empty states of
the theory under consideration, and the exact Feynman propagator is a sum of free
Feynman propagators with varying masses, weighted by this mass distribution.
In a theory of massive particles, the spectral function has a delta function corre-
sponding to states containing a single particle of mass m, and a continuum distribu-
tion20 that starts at the minimal invariant mass (2m) of a 2-particle state:

\[
\rho(M^2) = 2\pi\, Z\, \delta(M^2 - m^2) + \text{continuum for } M^2 \ge 4m^2 \,,
\tag{1.163}
\]

where Z is the product of matrix elements that appear in eq. (1.162), in the case of
1-particle states. In a theory with interactions, Z in general differs from unity (in fact,
it may be infinite). Note that in this equation, m must be the physical mass of the
particles, as it would be inferred from the simultaneous measurement of their energy
and momentum. As we shall see shortly, this is not the same as the parameter we
denoted m in the Lagrangian.
20 Between the 1-particle delta function and the 2-particle continuum, there may be additional delta

functions corresponding to multi-particle bound states (to have a stable bound state, the binding energy
should decrease the mass of the state compared to the mass 2m of two free particles at rest).

Taking the Fourier transform of eq. (1.161) and using eq. (1.163) for the spectral
function, we obtain the following pole structure for the exact Feynman propagator:

\[
\tilde{G}_F(p) = \frac{i\, Z}{p^2 - m^2 + i0^+} + \text{terms without poles} \,.
\tag{1.164}
\]

Therefore, the parameter Z that appears in the spectral function has also the interpre-
tation of the residue of the single particle pole in the exact Feynman propagator.
The fact that Z ≠ 1 calls for a slight modification of the LSZ reduction formulas.
Eq. (1.163) implies that a factor √Z appears in the overlap between the state φ(x)|0_in⟩
and the 1-particle state |p_in⟩. In other words, φ(x) creates a particle with probability
Z rather than 1. Therefore, there should be a factor Z^{-1/2} for each incoming and
outgoing particle in the LSZ reduction formulas that relate transition amplitudes to
products of fields φ:

\[
\begin{aligned}
\langle q_1 \cdots q_{n\,{\rm out}} | p_1 \cdots p_{m\,{\rm in}} \rangle
\doteq{}& \left( \frac{i}{\sqrt{Z}} \right)^{m+n}
\int \prod_{i=1}^m d^4x_i\; e^{-ip_i\cdot x_i}\, (\Box_{x_i} + m^2)\;
\prod_{j=1}^n d^4y_j\; e^{iq_j\cdot y_j}\, (\Box_{y_j} + m^2) \\
&\times \langle 0_{\rm out}|\, T\, \phi(x_1) \cdots \phi(x_m)\, \phi(y_1) \cdots \phi(y_n)\, |0_{\rm in}\rangle \,.
\end{aligned}
\tag{1.165}
\]

In practical calculations, the factor Z at a given order of perturbation theory is obtained


by studying the 1-particle pole of the dressed propagator, as the residue of this pole.
It is common to introduce a renormalized field φr defined as a rescaling of φ,

\[ \phi \equiv \sqrt{Z}\; \phi_r \,. \tag{1.166} \]

By construction, the Feynman propagator defined from the 2-point time-ordered
product of φ_r has a single-particle pole of residue 1. In other words, we may replace in
the right hand side of the LSZ reduction formula (1.165) all the fields by renormalized
fields, and at the same time remove all the factors Z^{-1/2}.

1.10 Ultraviolet divergences and renormalization

Until now, we have not attempted to calculate explicitly the integrals over the
Euclidean momentum k_E in eqs. (1.146) and (1.154). In fact, these integrals do not
converge when |k_E| → ∞, and they are therefore infinite. These infinities are
called ultraviolet divergences.

1.10.1 Regularization of divergent integrals


As we shall see shortly, these divergences have very deep implications on how we should interpret
the theory. However, before we can discuss this, it is crucial to make the integrals
temporarily finite in order to secure the subsequent manipulations. This procedure,
called regularization, amounts to altering the theory to make all the integrals finite.
There is no unique method for achieving this, and the most common ones are the
following:

• Pauli-Villars method : modify the Feynman propagator according to

\[
\frac{i}{k^2 - m^2 + i0^+} \;\to\;
\frac{i}{k^2 - m^2 + i0^+} - \frac{i}{k^2 - M^2 + i0^+} \,.
\tag{1.167}
\]

When |k_E| ≫ M, this modified propagator decreases as |k_E|^{-4} instead of
|k_E|^{-2} for the unmodified propagator, which is usually sufficient to render the
integrals convergent. The original theory (with its ultraviolet divergences) is
recovered in the limit M → ∞.
• Lattice regularization : replace continuous space-time by a regular lattice of
points, for instance a cubic lattice with a spacing a between the nearest neighbor
sites. On such a lattice, the momenta are themselves discrete, with a maximal
momentum of order a−1 . Therefore, the momentum integrals are replaced by
discrete sums that are all finite. The original theory is recovered in the limit
a → 0. A shortcoming of lattice regularization is that the discrete momentum
sums are usually much more difficult to evaluate than continuum integrals, and
that it breaks the usual space-time symmetries such as translation and rotation
invariance. This is nevertheless the basis of numerical Monte-Carlo methods
(lattice field theory).
• Cutoff regularization : restrict the integration over the norm of the Euclidean
momentum to |k_E| ≤ Λ. The underlying theory is recovered in the limit
Λ → ∞. This is a commonly used regularization in scalar theories, due to its
simplicity and because it preserves all the symmetries of the theory.
• Dimensional regularization : this method is based on the observation that the
integral

\[
\int_0^\infty dk_E\; \frac{k_E^{D-1}}{[k_E^2 + \Delta]^n}
= \frac{1}{2} \int_0^\infty du\; \frac{u^{\frac{D}{2} - 1}}{[u + \Delta]^n}
= \frac{\Delta^{\frac{D}{2} - n}}{2}\;
\underbrace{\int_0^1 dx\; x^{n - \frac{D}{2} - 1}\, (1 - x)^{\frac{D}{2} - 1}}_{\Gamma(n - \frac{D}{2})\, \Gamma(\frac{D}{2}) / \Gamma(n)}
\tag{1.168}
\]

is well defined for almost any D except for D = 2n, 2n + 2, 2n + 4, · · · and
D = 0, −2, −4, · · · thanks to the analytical properties of the Gamma function21 .
Dimensional regularization keeps the number of space-time dimensions D
arbitrary in all the intermediate calculations, and at the end one usually writes
D = 4 − 2ε with ε ≪ 1. This regularization does not break any of the
symmetries of the theory, including gauge invariance (which is not the case of
cutoff regularization). There is an extra complication: the coupling constant
λ is a priori dimensionless only when D = 4. In order to keep the dimension
of λ unchanged, we must introduce a parameter µ that has the dimension of a
mass, and replace λ by λ µ^{4−D}. Note that the field φ(x) has the dimension of
a mass to the power (D − 2)/2. Setting D = 4 − 2ε, the singular part of the
integrals Σ(P) and Γ_4(P) introduced above as examples is

\[
\begin{aligned}
\Sigma(P) &= -\frac{\lambda}{2}\, \frac{m^2}{(4\pi)^2}\, \frac{1}{\epsilon} + O(1) \,, \\
\Gamma_4(P) &= -\frac{\lambda^2}{2}\, \frac{1}{(4\pi)^2}\, \frac{1}{\epsilon} + O(1) \,.
\end{aligned}
\tag{1.169}
\]
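The radial formula (1.168) and the origin of the 1/ε poles in eq. (1.169) can be checked numerically. The sketch below compares a direct quadrature of the radial integral with the Gamma-function expression for a convergent case (D = 3, n = 2 is an arbitrary choice), and then looks at the behaviour of Γ(1 − D/2) near D = 4. Function names and test values are assumptions made for the illustration.

```python
import math


def radial_integral_closed(D, n, delta):
    """Right-hand side of eq. (1.168)."""
    return (delta ** (D / 2 - n) / 2
            * math.gamma(n - D / 2) * math.gamma(D / 2) / math.gamma(n))


def radial_integral_numeric(D, n, delta, steps=20000):
    """Midpoint rule for int_0^inf dk k^(D-1) / (k^2 + delta)^n, with k = tan(t)."""
    h = (math.pi / 2) / steps
    total = 0.0
    for j in range(steps):
        t = (j + 0.5) * h
        k = math.tan(t)
        total += k ** (D - 1) / (k * k + delta) ** n / math.cos(t) ** 2
    return total * h


if __name__ == "__main__":
    print(radial_integral_numeric(3, 2, 1.5), radial_integral_closed(3, 2, 1.5))
    # Near D = 4 - 2 eps, Gamma(1 - D/2) = Gamma(-1 + eps) behaves as -1/eps,
    # which is the origin of the pole in Sigma(P).
    for eps in (1e-2, 1e-4, 1e-6):
        print(eps, eps * math.gamma(-1 + eps))
```

For D = 3 and n = 2 both evaluations should agree with π/(4√∆), and the product ε Γ(−1 + ε) approaches −1 as ε decreases, in line with the sign of the poles in eq. (1.169).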

1.10.2 Mass renormalization


Let us now make a few observations:
• The above divergent terms are momentum independent22 ,
• They appear in 2-point and 4-point functions only.
Moreover, it is important to realize that the parameters (m2 and λ) in the Lagrangian
are not directly observable quantities by themselves23 . For instance, the mass of a
particle is a measurable property of the particle (e.g. by measuring both its energy
and its momentum, via p_0^2 − p^2). In quantum field theory, this definition of the mass
corresponds to the location of the poles of the propagator in the complex p0 plane.
However, as we shall see, loop corrections modify substantially the propagator, and it
turns out that the parameter m in the free propagator has in fact little to do with this
physical mass. If we dress the propagator by summing the multiple insertions of the
1-loop correction −iΣ,

\[
\tilde{G}_F(P) \equiv [\text{free propagator}] + [\text{one insertion of } -i\Sigma]
+ [\text{two insertions of } -i\Sigma] + [\text{three insertions of } -i\Sigma] + \cdots \,,
\tag{1.170}
\]

21 Γ(z) is analytic in the complex plane, with the exception of a discrete series of simple poles, located at
z_n = −n for n ∈ ℕ, with residues (−1)^n/n!.

22 These examples are not completely general. As we shall see later, divergent terms proportional to P^2
may also appear in the 2-point function.

23 In this regard, it is important to realize that the renormalization of the parameters of the Lagrangian
would be necessary even in a theory that has no divergent loop integrals.

we obtain

\[
\tilde{G}_F(P) = \frac{i}{p_0^2 - \mathbf{p}^2 - m^2 - \Sigma + i0^+} \,,
\tag{1.171}
\]

from which it is immediate to see that this loop correction alters the location of the
pole, now given by

\[
p_0^2 - \mathbf{p}^2 = \underbrace{m^2 + \Sigma}_{\text{new squared mass}} \,.
\tag{1.172}
\]

Since the propagator given in eq. (1.171) includes loop corrections, its poles ought to
give a value of the mass closer to the physical one. Therefore, it is tempting to write:

\[ m_{\rm phys}^2 = m^2 + \Sigma + O(\lambda^2) \,. \tag{1.173} \]

Of course, since Σ is infinite, the only way this can be satisfied is that the parameter
m2 that appears in the Lagrangian be itself infinite, with an opposite sign in order
to cancel the infinity from Σ. To further distinguish it from the physical mass, the
parameter m in the Lagrangian is usually called the bare mass, while m_phys is the
physical –or renormalized– mass.
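The resummation that leads from the series (1.170) to eq. (1.171) is a geometric series, which the following sketch verifies with plain complex numbers. It is only a toy check: Σ is replaced by a finite constant, the i0^+ is dropped, and the momentum is chosen away from the pole so that the series converges. The function names are illustrative.

```python
def free_propagator(p2, m2):
    """Free Feynman propagator i / (p^2 - m^2), away from the pole."""
    return 1j / (p2 - m2)


def dressed_propagator_series(p2, m2, sigma, n_insertions):
    """Sum of the series (1.170) truncated after n_insertions insertions of -i Sigma."""
    g0 = free_propagator(p2, m2)
    term = g0
    total = 0j
    for _ in range(n_insertions + 1):
        total += term
        term = term * (-1j * sigma) * g0   # attach one more insertion and propagator
    return total


def dressed_propagator_closed(p2, m2, sigma):
    """Resummed propagator of eq. (1.171)."""
    return 1j / (p2 - m2 - sigma)


if __name__ == "__main__":
    p2, m2, sigma = 10.0, 1.0, 0.5     # arbitrary finite test values
    for n in (0, 1, 2, 5, 20):
        print(n, dressed_propagator_series(p2, m2, sigma, n))
    print("closed form:", dressed_propagator_closed(p2, m2, sigma))
```

Each additional insertion multiplies the previous term by Σ/(p^2 − m^2), so the partial sums converge to the closed form whenever this ratio is smaller than one in modulus.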

1.10.3 Field renormalization


Note that the 1-loop function Σ in a theory with a φ^4 interaction is somewhat special,
because at this order it is independent of the momentum P. Being a constant, the
infinity it contains can be absorbed entirely into a redefinition of the bare mass, but
the residue of the pole remains equal to 1. However, starting at two loops, the 2-point
functions that correct the propagator are usually momentum dependent, as is the case
for instance with this graph:

It is convenient to expand Σ(P^2) around the physical mass:

\[
\Sigma(P^2) = \Sigma(m_{\rm phys}^2) + (P^2 - m_{\rm phys}^2)\, \Sigma'(m_{\rm phys}^2)
+ \tfrac{1}{2}\, (P^2 - m_{\rm phys}^2)^2\, \Sigma''(m_{\rm phys}^2) + \cdots
\tag{1.174}
\]

For the resummed propagator \tilde{G}_F to have a pole at P^2 = m_phys^2, we need to impose

\[ m_{\rm phys}^2 = m^2 + \Sigma(m_{\rm phys}^2) \,, \tag{1.175} \]

which generalizes eq. (1.173) to a momentum-dependent Σ. Then, in the vicinity of the
pole, the dressed propagator behaves as

\[
\tilde{G}_F(P) \underset{P^2 \to m_{\rm phys}^2}{\approx}
\frac{i}{(1 - \Sigma'(m_{\rm phys}^2))\, (P^2 - m_{\rm phys}^2) + i0^+} \,.
\tag{1.176}
\]

This indicates that the field renormalization factor Z cannot be equal to 1 when the
propagator is corrected by a momentum-dependent loop. Instead, we have

\[ Z = \frac{1}{1 - \Sigma'(m_{\rm phys}^2)} \,. \tag{1.177} \]

Moreover, Weinberg’s theorem implies that the ultraviolet divergences of the 2-point
function Σ(P^2) arise only in Σ(m_phys^2) and in the first derivative Σ′(m_phys^2), while
higher derivatives are all finite. Eqs. (1.175) and (1.177) therefore indicate that
these infinities can be “hidden” in the bare mass m^2 and in a multiplicative field
renormalization factor Z.
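To see eqs. (1.175) and (1.177) at work, one can take a finite, made-up self-energy and locate the pole and its residue numerically. In the sketch below, `toy_self_energy` is a hypothetical quadratic polynomial in s = P^2 chosen only for illustration (it is not the self-energy of any particular theory), the pole condition is solved by fixed-point iteration, and the residue is estimated by evaluating the propagator just next to the pole.

```python
def toy_self_energy(s):
    """Hypothetical momentum-dependent self-energy Sigma(s), with s = P^2."""
    return 0.2 + 0.1 * s + 0.01 * s * s


def toy_self_energy_derivative(s):
    return 0.1 + 0.02 * s


def physical_mass_squared(m2, n_iter=200):
    """Solve m_phys^2 = m^2 + Sigma(m_phys^2), eq. (1.175), by fixed-point iteration."""
    s = m2
    for _ in range(n_iter):
        s = m2 + toy_self_energy(s)
    return s


def residue_from_formula(m2):
    """Z = 1 / (1 - Sigma'(m_phys^2)), eq. (1.177)."""
    return 1.0 / (1.0 - toy_self_energy_derivative(physical_mass_squared(m2)))


def residue_numerical(m2, offset=1e-6):
    """Estimate Z as (s - m_phys^2) / (s - m^2 - Sigma(s)) just next to the pole."""
    s = physical_mass_squared(m2) + offset
    return offset / (s - m2 - toy_self_energy(s))


if __name__ == "__main__":
    m2_bare = 1.0
    print("m_phys^2 =", physical_mass_squared(m2_bare))
    print("Z (formula)   =", residue_from_formula(m2_bare))
    print("Z (numerical) =", residue_numerical(m2_bare))
```

The iteration converges here because the derivative of the toy self-energy is much smaller than one near the pole; a different toy function might require a proper root finder.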

1.10.4 Ultraviolet power counting

From the above considerations, it appears crucial that Σ has divergences only in its
0th and 1st order Taylor coefficients and Γ4 only in the 0th order, in order to be able to
absorb the divergences by a proper definition of m2 , Z and λ. A simple dimensional
argument gives plausibility to this assertion (of which Weinberg’s theorem provides a
more rigorous justification). Let us assume that we scale up all the internal momenta
of a graph by some factor ξ. In doing this, a graph G with nV vertices and nI internal
lines will scale as

\[ G \sim \xi^{\,D\, n_L - 2 n_I} \,, \tag{1.178} \]

assuming D space-time dimensions for more generality. The exponent ω(G) ≡
D n_L − 2 n_I is called the superficial degree of divergence of the graph. This exponent
characterizes how the graph diverges when all its internal momenta are rescaled
uniformly:

• ω(G) ≥ 0 : The graph has an intrinsic divergence.

• ω(G) < 0 : The graph may be finite, or may contain a divergent subgraph.
More precisely, the convergence theorem states that a graph G is finite if
ω(G) < 0, and the degrees of divergence of all its subgraphs are negative as
well. Of course, subgraphs do not always satisfy this condition. But in the
renormalization process, the divergent subgraphs will have been dealt with at
an earlier stage since they occur at a lower order of the perturbative expansion.

The superficial degree of divergence signals all the n-point functions that may have
ultraviolet divergences of their own (as opposed to being divergent because of a
divergent subgraph). Using eqs. (1.139), ω(G) can be rewritten in the following way

\[ \omega(G) = 4 - n_E + (D - 4)\, n_L \,. \tag{1.179} \]

An important consequence of this formula is that in 4 dimensions the superficial
degree of divergence of a graph does not depend on the number of loops, but only on
the number of external lines. When D = 4, the only functions that have a non-negative
ω are the 2-point function and the 4-point function24 . It is important to realize that
this does not mean that a 6-point function cannot be divergent. However, it can diverge only
if it contains a divergent 2-point or 4-point subgraph. Moreover, the value of the
superficial degree of divergence indicates the maximal power of the ultraviolet cutoff
that may appear in these functions:

• 2-point: up to Λ^2

• 4-point: up to log(Λ)

Note also that if we differentiate a graph with respect to the invariant norm P^2 of one
of its external momenta, we get

\[
\omega\!\left( \frac{\partial G}{\partial P^2} \right) = 2 - n_E + (D - 4)\, n_L \,.
\tag{1.180}
\]

(ω further decreases by two units with each additional derivative with respect to
P^2.) Therefore, the momentum derivative Σ′(P^2) of the 2-point function has ω = 0
in D = 4, and its higher derivatives all have ω < 0. The fact that only Γ_4(m_phys^2),
Σ(m_phys^2) and Σ′(m_phys^2) have ω ≥ 0 is the reason why it is possible to get rid of all
the divergences of this theory (in 4 dimensions) by a redefinition of the parameters of
the Lagrangian. This theory is said to be renormalizable.
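The equivalence between the definition ω(G) = D n_L − 2 n_I and the form (1.179) is easy to confirm by brute force over the allowed values of n_V and n_E. The sketch below does this for φ^4 graphs; the function names are illustrative, and the counting of internal lines and loops is taken from eq. (1.139).

```python
def superficial_degree(n_vertices, n_external, dimension=4):
    """omega(G) = D n_L - 2 n_I, with n_I and n_L obtained from eq. (1.139)."""
    handles = 4 * n_vertices - n_external
    if handles < 0 or handles % 2:
        raise ValueError("no phi^4 graph with these numbers of vertices and external lines")
    n_internal = handles // 2
    n_loops = n_internal - n_vertices + 1
    return dimension * n_loops - 2 * n_internal


def superficial_degree_from_legs(n_external, n_loops, dimension=4):
    """omega(G) = 4 - n_E + (D - 4) n_L, eq. (1.179)."""
    return 4 - n_external + (dimension - 4) * n_loops


if __name__ == "__main__":
    # In D = 4 the result depends only on the number of external lines.
    for n_e in (0, 2, 4, 6):
        print(n_e, [superficial_degree(n_v, n_e) for n_v in range(3, 7)])
```

In four dimensions the printed rows are constant (4, 2, 0 and −2 respectively), which is the statement that only vacuum graphs, 2-point and 4-point functions have ω ≥ 0.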

1.10.5 Ultraviolet classification of quantum field theories

In dimensions lower than 4, ω(G) is a strictly decreasing function of the number of
loops, which indicates that graphs with a given n_E do not develop new divergences
beyond a certain loop order. Such theories are said to be super-renormalizable because
they only have a finite number of divergent graphs. Conversely, in dimensions higher
than 4, ω(G) increases with the number of loops, and any function will eventually
become divergent at some loop order. These theories are usually25 non-renormalizable.
One may think of introducing, as they become necessary, additional operators in the
Lagrangian with a coupling constant adjusted to cancel the new divergences that arise
at a given loop order. However, an infinite number of such parameters would need
to be introduced, thereby reducing drastically the predictive power of this type of
theory26 .

24 Functions with an odd number of external lines vanish in the theory under consideration. Note also
that 0-point functions (vacuum graphs) have a superficial degree of divergence equal to 4, indicating that
they may contain up to quartic divergences ∼ Λ^4.
As we have seen, the renormalizability of a field theory depends both on the
interaction terms it contains, and on the dimensionality of space-time. In fact, a
simpler equivalent criterion is the mass dimension of the coupling constant in front of
the interaction term:

• dim > 0 : super-renormalizable,

• dim = 0 : renormalizable,

• dim < 0 : non-renormalizable.

For instance, the “coupling constant” m^2 in front of the mass term always has a mass
dimension equal to two, and this term is therefore super-renormalizable. In contrast,
the coupling constant λ in front of a φ^4 interaction has a mass dimension 4 − D, and
is (super)renormalizable in dimensions less than or equal to four.
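The criterion based on the mass dimension of the coupling can be turned into a few lines of code. The sketch below assumes an interaction of the form g φ^n without derivatives, a Lagrangian density of mass dimension D, and a scalar field of mass dimension (D − 2)/2 as stated earlier; the function names are hypothetical.

```python
from fractions import Fraction


def coupling_mass_dimension(field_power, dimension):
    """Mass dimension of the coupling in front of phi^field_power in D dimensions."""
    field_dimension = Fraction(dimension - 2, 2)     # dimension of the scalar field
    return dimension - field_power * field_dimension


def classify(field_power, dimension):
    """Apply the dim > 0 / dim = 0 / dim < 0 criterion to a phi^n interaction."""
    dim = coupling_mass_dimension(field_power, dimension)
    if dim > 0:
        return "super-renormalizable"
    if dim == 0:
        return "renormalizable"
    return "non-renormalizable"


if __name__ == "__main__":
    for power, D in [(2, 4), (4, 3), (4, 4), (4, 6), (3, 6), (6, 4)]:
        print("phi^%d in D=%d: dim = %s, %s"
              % (power, D, coupling_mass_dimension(power, D), classify(power, D)))
```

For the φ^4 interaction this reproduces the mass dimension 4 − D quoted above, and for the mass term it gives two in any dimension.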

1.10.6 Renormalization in perturbation theory, Counterterms

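A convenient setup for casting the renormalization procedure within perturbation
theory is to write the bare Lagrangian,

\[
\mathcal{L} = \frac{1}{2}\, (\partial_\mu \phi_b)(\partial^\mu \phi_b) - \frac{1}{2}\, m_b^2\, \phi_b^2
- \frac{\lambda_b}{4!}\, \phi_b^4 \,,
\tag{1.181}
\]

(here we denote φ_b, m_b and λ_b the bare field, mass and coupling, to stress that they
are not the physical ones) as the sum of a renormalized Lagrangian and a correction:

\[
\begin{aligned}
\mathcal{L} &= \mathcal{L}_r + \Delta\mathcal{L} \\
\mathcal{L}_r &\equiv \frac{1}{2}\, (\partial_\mu \phi_r)(\partial^\mu \phi_r) - \frac{1}{2}\, m_r^2\, \phi_r^2
- \frac{\lambda_r}{4!}\, \phi_r^4 \\
\Delta\mathcal{L} &\equiv \frac{1}{2}\, \Delta_Z\, (\partial_\mu \phi_r)(\partial^\mu \phi_r)
- \frac{1}{2}\, \Delta_m\, \phi_r^2 - \frac{1}{4!}\, \Delta_\lambda\, \phi_r^4 \,.
\end{aligned}
\tag{1.182}
\]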
25 It may happen that an internal symmetry, such as a gauge symmetry, renders a function finite while its

superficial degree of divergence is non negative.


26 Non-renormalizable field theories may nevertheless be used as low energy effective field theories,

where they approximate below a certain cutoff a more fundamental –possibly unknown– theory supposedly
valid above the cutoff.

L_r contains the renormalized (i.e. physical) mass m_r and coupling constant λ_r
(the latter may be defined from the measurement of some cross-section chosen as
reference). In ∆L, the coefficients ∆_Z, ∆_m, ∆_λ are called counterterms. Recalling
that φ_b = √Z φ_r, the bare and physical parameters and the counterterms must be
related by

\[
\begin{aligned}
\Delta_Z &= Z - 1 \\
\Delta_m &= Z\, m_b^2 - m_r^2 \\
\Delta_\lambda &= Z^2 \lambda_b - \lambda_r \,.
\end{aligned}
\tag{1.183}
\]
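The relations between counterterms and bare parameters follow from a one-line substitution, which may be worth spelling out. The block below only uses the bare Lagrangian (1.181) and the rescaling of the bare field by √Z; no additional assumption is made.

```latex
\begin{align*}
\mathcal{L}
&= \frac{Z}{2}\,(\partial_\mu\phi_r)(\partial^\mu\phi_r)
 - \frac{Z\,m_b^2}{2}\,\phi_r^2
 - \frac{Z^2\lambda_b}{4!}\,\phi_r^4
 && (\phi_b = \sqrt{Z}\,\phi_r) \\
&= \underbrace{\frac{1}{2}\,(\partial_\mu\phi_r)(\partial^\mu\phi_r)
 - \frac{m_r^2}{2}\,\phi_r^2 - \frac{\lambda_r}{4!}\,\phi_r^4}_{\mathcal{L}_r}
 + \frac{Z-1}{2}\,(\partial_\mu\phi_r)(\partial^\mu\phi_r)
 - \frac{Z\,m_b^2 - m_r^2}{2}\,\phi_r^2
 - \frac{Z^2\lambda_b - \lambda_r}{4!}\,\phi_r^4 \,.
\end{align*}
```

Identifying the last three terms with ∆L term by term gives the three counterterms.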

The terms in ∆L are treated as a perturbation to Lr , and one may introduce extra
Feynman rules for the various terms it contains:

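\[
\begin{aligned}
\frac{1}{2}\, \Delta_Z\, (\partial_\mu \phi_r)(\partial^\mu \phi_r) - \frac{1}{2}\, \Delta_m\, \phi_r^2
\;&\to\; [\text{2-point counterterm vertex, momentum } P] = -i\, \big( \Delta_Z\, P^2 + \Delta_m \big) \\
-\frac{1}{4!}\, \Delta_\lambda\, \phi_r^4
\;&\to\; [\text{4-point counterterm vertex}] = -i\, \Delta_\lambda
\end{aligned}
\tag{1.184}
\]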

At tree level, only the term Lr is used, and by construction the physical quantities
computed at this order will depend only on physical parameters. Higher orders
involve divergent loop corrections. The counterterms ∆_Z, ∆_m, ∆_λ should be adjusted
at every order to cancel the new divergences that arise at this order. In particular,
after having included the contribution of the counterterms, the self-energy Σ(P^2) is
usually required to satisfy the following conditions27 :

\[ \Sigma(m_r^2) = 0 \,, \qquad \Sigma'(m_r^2) = 0 \,. \tag{1.185} \]

With this choice, it is not necessary to dress the external lines with the self-energy in
the LSZ reduction formulas for transition amplitudes. Indeed, the renormalization
conditions (1.185) imply that

\[
i\,(\Box + m_r^2)\, G_F = 1 \,, \qquad \lim_{p^2 \to m_r^2}\, (-i\Sigma)\, G_F = 0 \,.
\tag{1.186}
\]

For each external line, the reduction formula contains an operator i(□_x + m_r^2) acting
on the corresponding external propagator. If this propagator is dressed, this gives

\[
i\,(\Box + m_r^2)\,
\underbrace{\Big[ G_F + G_F\, (-i\Sigma)\, G_F + G_F\, (-i\Sigma)\, G_F\, (-i\Sigma)\, G_F + \cdots \Big]}_{\text{dressed propagator}}
= 1 \,.
\tag{1.187}
\]

Therefore, all the terms are zero except the first one, and we can ignore self-energy
corrections on the external lines.

27 Strictly speaking, the only requirement is that the counterterms cancel the infinities, which does not fix
uniquely their finite part. Various renormalization schemes are possible, that differ in how these finite parts
are chosen.

1.10.7 BPHZ renormalization

The actual proof of renormalizability is more complicated than what this superficial
discussion based on power counting may suggest. Indeed, a crucial aspect is to show
that the divergences can be removed via the subtraction of local terms only, i.e. that
the divergences are polynomial in the external momenta. While this is trivial in all
the one-loop graphs we have considered, it is not obviously true beyond one-loop. As
an illustration, let us consider the following example of a two-loop contribution to the
4-point function:
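[Graphs A, B and C: A is a two-loop graph with external legs 1, 2, 3, 4 and loop
momenta k and l; B and C are the subtraction graphs discussed below.]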

In the graph A, the loop we have represented with a thicker line is divergent, and is
multiplied by a non-polynomial function of P2 ≡ (p1 + p2 )2 coming from the rest
of the graph. The Feynman rules give the following integrand for this graph:

\[
I_A = \frac{(-i\lambda)^3}{2}\; G_F^0(k)\, G_F^0(k - P)\; G_F^0(l)\, G_F^0(l + k + p_3) \,.
\tag{1.188}
\]
The superficial degree of divergence of the integration over l is ω(A; l) = 0 (at
fixed k), and therefore the boldface loop is logarithmically divergent. The diagram B
consists in subtracting from this loop a polynomial in its external momenta, whose
degree is precisely equal to its superficial degree of divergence. Since ω(A; l) = 0,
the subtraction is the zeroth order of the Taylor expansion of that loop (underlined in
the following equation):

\[
I_{A+B} = \frac{(-i\lambda)^3}{2}\; G_F^0(k)\, G_F^0(k - P)\,
\Big[ G_F^0(l)\, G_F^0(l + k + p_3) - \underline{G_F^0(l)\, G_F^0(l)} \Big] \,.
\tag{1.189}
\]
Now, the degree of divergence in l of the combination inside the square brackets
is ω(A + B; l) = −1, and the integration over l is therefore convergent in four
dimensions. After the momentum l has been integrated out, we are left with a function
of k whose behaviour is k0 , up to logarithms, whose integral is thus divergent. Since
the degree of divergence in k is ω(A + B; k) = 0, this overall divergence can again
be removed by subtracting the zeroth order of the Taylor expansion with respect to
the external momenta, i.e.

\[
\begin{aligned}
I_{A+B+C} = \frac{(-i\lambda)^3}{2}\, \Big\{
& G_F^0(k)\, G_F^0(k - P)\, \Big[ G_F^0(l)\, G_F^0(l + k + p_3) - G_F^0(l)\, G_F^0(l) \Big] \\
& - G_F^0(k)\, G_F^0(k)\, \Big[ G_F^0(l)\, G_F^0(l + k) - G_F^0(l)\, G_F^0(l) \Big] \Big\} \,.
\end{aligned}
\tag{1.190}
\]

After these two successive subtractions, we have obtained a function whose integral
on both k and l is completely finite. Moreover, at each step, we have subtracted
only quantities that are polynomial in the external momenta of the corresponding
loop (with a degree equal to the superficial degree of divergence of the loop). This
recursive procedure for constructing a subtracted integrand is known as
Bogoliubov-Parasiuk-Hepp-Zimmermann renormalization.

1.11 Spin 1/2 fields

1.11.1 Dimension-2 representation of the rotation group

In ordinary quantum mechanics, the spin s is related to the dimension n of
representations of the rotation group by

n = 2s + 1 . (1.191)

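Thus, spin 1/2 corresponds to representations of dimension 2. Such a representation
is based on the (Hermitean) Pauli matrices:

$$ \sigma^1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , \quad \sigma^2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} , \quad \sigma^3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} , \tag{1.192} $$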

from which we can construct the following unitary 2 × 2 matrices

$$ U \equiv \exp\Big( -\frac{i}{2}\, \theta_i\, \sigma^i \Big) \,. \tag{1.193} $$

That the Pauli matrices (up to a factor 2) are generators of the Lie algebra of rotations
can be seen from

$$ \big[ J^i, J^j \big] = i\, \epsilon^{ijk} J^k \quad \text{with} \quad J^i \equiv \frac{\sigma^i}{2} \,. \tag{1.194} $$
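Eq. (1.194) can be checked numerically. The following is a minimal sketch in plain Python (no external libraries); the helper names `matmul`, `levi_civita` and `su2_algebra_holds` are illustrative choices, not notation from the text.

```python
# Pauli matrices of eq. (1.192), stored as nested lists (rows of 2x2 matrices).
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    """Entry-wise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    """Multiply every entry of the matrix a by the number c."""
    return [[c * x for x in row] for row in a]


def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices taken in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


# Generators J^i = sigma^i / 2.
J = [scale(0.5, s) for s in SIGMA]


def su2_algebra_holds(tol=1e-12):
    """Check [J^i, J^j] = i eps^{ijk} J^k for all pairs (i, j)."""
    for i in range(3):
        for j in range(3):
            lhs = add(matmul(J[i], J[j]), scale(-1, matmul(J[j], J[i])))
            rhs = [[0, 0], [0, 0]]
            for k in range(3):
                rhs = add(rhs, scale(1j * levi_civita(i, j, k), J[k]))
            for r in range(2):
                for c in range(2):
                    if abs(lhs[r][c] - rhs[r][c]) > tol:
                        return False
    return True


print(su2_algebra_holds())
```

The indices run over 0, 1, 2 in the code, which correspond to the labels 1, 2, 3 of the text.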

1.11.2 Spinor representation of the Lorentz group

This idea can be extended to quantum field theory in order to encompass all the
Lorentz transformations rather than just the spatial rotations. We are therefore seeking
a dimension 2 representation of the commutation relations (1.10). Firstly, let us
assume that we know a set of four n × n matrices $\gamma^\mu$ that satisfy the following
anti-commutation relation:

$$ \big\{ \gamma^\mu, \gamma^\nu \big\} = 2\, g^{\mu\nu}\, \mathbf{1}_{n\times n} \,. \tag{1.195} $$

Such matrices are called Dirac matrices. From these matrices, it is easy to check that
the matrices

$$ M^{\mu\nu} \equiv \frac{i}{4} \big[ \gamma^\mu, \gamma^\nu \big] \tag{1.196} $$
form an n-dimensional representation of the Lorentz algebra. However, an exhaustive
search indicates that the smallest matrices that fulfill eq. (1.195) (in four space-time
dimensions, i.e. for µ, ν = 0, · · · , 3) are 4 × 4. Several unitarily equivalent choices
exist for these matrices. A possible representation (known as the Weyl or chiral
representation) is the following (see footnote 28):

$$ \gamma^0 \equiv \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , \quad \gamma^i \equiv \begin{pmatrix} 0 & \sigma^i \\ -\sigma^i & 0 \end{pmatrix} \,. \tag{1.197} $$

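In this representation, the generators for the boosts and for the rotations are

$$ M^{0i} = -\frac{i}{2} \begin{pmatrix} \sigma^i & 0 \\ 0 & -\sigma^i \end{pmatrix} , \quad M^{ij} = \frac{1}{2}\, \epsilon^{ijk} \begin{pmatrix} \sigma^k & 0 \\ 0 & \sigma^k \end{pmatrix} \,. \tag{1.198} $$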
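The statements of eqs. (1.195) to (1.198) lend themselves to a direct numerical check. The sketch below, in plain Python, builds the Weyl-representation matrices from 2 × 2 blocks and verifies the anti-commutation relation and the explicit form of the generators. All function and variable names are illustrative, and the metric is assumed to be diag(+1, −1, −1, −1).

```python
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
METRIC = [1, -1, -1, -1]  # diagonal of g^{mu nu} (assumed signature)


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def close(a, b, tol=1e-12):
    return all(abs(x - y) <= tol
               for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def levi_civita(i, j, k):
    return (i - j) * (j - k) * (k - i) // 2


# Weyl (chiral) representation, eq. (1.197).
GAMMA = [block(Z2, I2, I2, Z2)] + [block(Z2, s, scale(-1, s), Z2) for s in SIGMA]
I4 = block(I2, Z2, Z2, I2)


def clifford_holds():
    """Check {gamma^mu, gamma^nu} = 2 g^{mu nu} for all mu, nu."""
    for mu in range(4):
        for nu in range(4):
            anti = add(matmul(GAMMA[mu], GAMMA[nu]), matmul(GAMMA[nu], GAMMA[mu]))
            g = METRIC[mu] if mu == nu else 0
            if not close(anti, scale(2 * g, I4)):
                return False
    return True


def lorentz_generator(mu, nu):
    """M^{mu nu} = (i/4) [gamma^mu, gamma^nu], eq. (1.196)."""
    comm = add(matmul(GAMMA[mu], GAMMA[nu]),
               scale(-1, matmul(GAMMA[nu], GAMMA[mu])))
    return scale(0.25j, comm)


def generators_match_eq_1_198():
    """Compare M^{0i} and M^{ij} with the block forms of eq. (1.198)."""
    for i in range(3):
        s = SIGMA[i]
        boost = scale(-0.5j, block(s, Z2, Z2, scale(-1, s)))
        if not close(lorentz_generator(0, i + 1), boost):
            return False
        for j in range(3):
            rot = scale(0, I4)
            for k in range(3):
                sk = SIGMA[k]
                rot = add(rot, scale(0.5 * levi_civita(i, j, k),
                                     block(sk, Z2, Z2, sk)))
            if not close(lorentz_generator(i + 1, j + 1), rot):
                return False
    return True


print(clifford_holds(), generators_match_eq_1_198())
```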

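Given a Lorentz transformation Λ defined by the parameters $\omega_{\mu\nu}$, let us define

$$ U_{1/2}(\Lambda) \equiv \exp\Big( -\frac{i}{2}\, \omega_{\mu\nu} M^{\mu\nu} \Big) \,. \tag{1.199} $$

A Dirac spinor is a 4-component field ψ(x) that transforms as follows:

$$ \psi(x) \to U_{1/2}(\Lambda)\, \psi(\Lambda^{-1} x) \,. \tag{1.200} $$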

In other words, the matrix U1/2 defines how the four components of this field
transform under a Lorentz transformation (since these four components mix, ψ(x) is
not the juxtaposition of four scalar fields). The fact that the lowest dimension for the
Dirac matrices is 4 indicates that the spinor ψ(x) describes two spin-1/2 particles: a
particle and its antiparticle, that are distinct from each other.

1.11.3 Dirac equation and Lagrangian

Let us now determine an equation of motion obeyed by this field, such that it is
invariant under Lorentz transformations. Since the Mµν ’s act only on the Dirac
indices, a trivial answer could be the Klein-Gordon equation,

$$ \big( \Box_x + m^2 \big)\, \psi(x) = 0 \,. \tag{1.201} $$
28 Although it is sometimes convenient to have an explicit representation of the Dirac matrices, most
manipulations only rely on the fact that they obey the anti-commutation relations (1.195).

But there is in fact a stronger equation that remains invariant when ψ is transformed
according to eq. (1.200). Notice first that

$$ U_{1/2}^{-1}(\Lambda)\, \gamma^\mu\, U_{1/2}(\Lambda) = \Lambda^\mu{}_\nu\, \gamma^\nu \,. \tag{1.202} $$

This equation indicates that rotating the Dirac indices of γµ with U1/2 is equivalent
to transforming the µ index as one would do for a normal 4-vector. Using this identity,
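we can check that under the same Lorentz transformation we have

$$ \big( i\gamma^\mu \partial_\mu - m \big)\, \psi(x) \to U_{1/2}(\Lambda)\, \big( i\gamma^\mu \partial_\mu - m \big)\, \psi(\Lambda^{-1} x) \,. \tag{1.203} $$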

Therefore, the Dirac equation,

$$ \big( i\gamma^\mu \partial_\mu - m \big)\, \psi(x) = 0 \,, \tag{1.204} $$

is Lorentz invariant. This equation implies the Klein-Gordon equation (to see it, apply
the operator iγµ ∂µ + m on the left), and is therefore stronger.
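The step mentioned in parentheses can be spelled out as follows (a short sketch that uses only the anti-commutation relation (1.195) and the symmetry of $\partial_\mu\partial_\nu$):

```latex
\begin{aligned}
0 &= \big( i\gamma^\nu \partial_\nu + m \big) \big( i\gamma^\mu \partial_\mu - m \big)\, \psi(x) \\
  &= \big( -\gamma^\nu \gamma^\mu\, \partial_\nu \partial_\mu - m^2 \big)\, \psi(x) \\
  &= -\Big( \tfrac{1}{2} \big\{ \gamma^\mu, \gamma^\nu \big\}\, \partial_\mu \partial_\nu + m^2 \Big)\, \psi(x) \\
  &= -\big( g^{\mu\nu} \partial_\mu \partial_\nu + m^2 \big)\, \psi(x)
   = -\big( \Box_x + m^2 \big)\, \psi(x) \,.
\end{aligned}
```

The two terms linear in m cancel against each other, and each of the four components of ψ separately obeys the Klein-Gordon equation (1.201).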
The Dirac matrices are not Hermitean. Instead, they satisfy

$$ \big( \gamma^\mu \big)^\dagger = \gamma^0\, \gamma^\mu\, \gamma^0 \,. \tag{1.205} $$

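Therefore, the Hermitean conjugate of $U_{1/2}(\Lambda)$ is

$$ U_{1/2}^\dagger(\Lambda) = \exp\Big( \frac{i}{2}\, \omega_{\mu\nu} \big( M^{\mu\nu} \big)^\dagger \Big) = \gamma^0 \exp\Big( \frac{i}{2}\, \omega_{\mu\nu} M^{\mu\nu} \Big)\, \gamma^0 = \gamma^0\, U_{1/2}^{-1}(\Lambda)\, \gamma^0 \,. \tag{1.206} $$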

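Because of this, the simplest Lorentz scalar bilinear combination of ψ's is $\psi^\dagger \gamma^0 \psi$
(instead of the naive $\psi^\dagger \psi$). It is common to denote $\overline{\psi} \equiv \psi^\dagger \gamma^0$. From this, we
conclude that the Lorentz scalar Lagrangian density that leads to the Dirac equation
reads

$$ \mathcal{L} = \overline{\psi}\, \big( i\gamma^\mu \partial_\mu - m \big)\, \psi(x) \,. \tag{1.207} $$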

1.11.4 Basis of free spinors


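Before quantizing the spinor field in a similar fashion as the scalar field, we need to
find plane wave solutions of the Dirac equation. There are two types of solutions:

$$ \begin{aligned} \psi(x) &= u(p)\, e^{-ip\cdot x} \quad \text{with} \quad \big( p_\mu \gamma^\mu - m \big)\, u(p) = 0 \,, \\ \psi(x) &= v(p)\, e^{+ip\cdot x} \quad \text{with} \quad \big( p_\mu \gamma^\mu + m \big)\, v(p) = 0 \,. \end{aligned} \tag{1.208} $$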

The solutions u(p) and v(p) each form a 2-dimensional linear space, and it is
customary to denote a basis by $u_s(p)$ and $v_s(p)$ (the index s, that takes two values
s = ±, is interpreted as the two spin states for a spin 1/2 particle). A convenient
normalization of the base vectors is

$$ \begin{aligned} \overline{u}_r(p)\, u_s(p) &= 2m\, \delta_{rs} \,, & u_r^\dagger(p)\, u_s(p) &= 2E_p\, \delta_{rs} \,, \\ \overline{v}_r(p)\, v_s(p) &= -2m\, \delta_{rs} \,, & v_r^\dagger(p)\, v_s(p) &= 2E_p\, \delta_{rs} \,, \\ \overline{u}_r(p)\, v_s(p) &= \overline{v}_r(p)\, u_s(p) = 0 \,. \end{aligned} \tag{1.209} $$

When summing over the spin states, we have:

$$ \sum_{s=\pm} u_s(p)\, \overline{u}_s(p) = \slashed{p} + m \,, \quad \sum_{s=\pm} v_s(p)\, \overline{v}_s(p) = \slashed{p} - m \,, \tag{1.210} $$

where we have introduced the notation $\slashed{p} \equiv p_\mu \gamma^\mu$.
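A quick consistency check between the normalization (1.209) and the spin sums (1.210) is obtained by taking the trace over Dirac indices. The sketch below only assumes that the Dirac matrices are traceless, which is manifest for the matrices of eq. (1.197):

```latex
\begin{aligned}
\mathrm{tr}\Big( \sum_{s=\pm} u_s(p)\, \overline{u}_s(p) \Big)
  &= \sum_{s=\pm} \overline{u}_s(p)\, u_s(p) = 2 \times 2m = 4m \,, \\
\mathrm{tr}\big( \slashed{p} + m \big)
  &= p_\mu\, \mathrm{tr}\big( \gamma^\mu \big) + m\, \mathrm{tr}\big( \mathbf{1}_{4\times 4} \big) = 4m \,.
\end{aligned}
```

The same manipulation applied to the $v_s(p)$ gives $-4m$ on both sides.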

1.11.5 Canonical quantization


From the Lagrangian (1.207), the momentum canonically conjugated to ψ(x) is

$$ \Pi(x) = i\, \psi^\dagger(x) \,. \tag{1.211} $$

Trying to generalize the canonical commutation relation of scalar field operators
(1.40) would lead to

$$ \big[ \psi_a(x), \psi_b^\dagger(y) \big]_{x^0=y^0} = \delta(\boldsymbol{x} - \boldsymbol{y})\, \delta_{ab} \,, \tag{1.212} $$

where we have written explicitly the Dirac indices a, b. However, by decomposing
ψ(x) on a basis of plane waves by introducing creation and annihilation operators,

$$ \psi(x) \equiv \sum_{s=\pm} \int \frac{d^3p}{(2\pi)^3\, 2E_p} \Big\{ a^\dagger_{sp}\, v_s(p)\, e^{+ip\cdot x} + b_{sp}\, u_s(p)\, e^{-ip\cdot x} \Big\} \,, \tag{1.213} $$

one would find a Hamiltonian which is not bounded from below. The resolution
of this paradox is that the commutation relation (1.212) is incorrect, and should be
replaced by an anti-commutation relation,

$$ \big\{ \psi_a(x), \psi_b^\dagger(y) \big\}_{x^0=y^0} = \delta(\boldsymbol{x} - \boldsymbol{y})\, \delta_{ab} \,, \tag{1.214} $$

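which leads to anti-commutation relations for the creation and annihilation operators

$$ \big\{ a_{rp}, a^\dagger_{sq} \big\} = \big\{ b_{rp}, b^\dagger_{sq} \big\} = (2\pi)^3\, 2E_p\, \delta(\boldsymbol{p} - \boldsymbol{q})\, \delta_{rs} \,. \tag{1.215} $$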
(All other combinations are zero.) These anti-commutation relations imply that the
square of creation operators is zero, which means that it is not possible to have two
particles with the same momentum and spin in a quantum state. This is nothing
but the Pauli exclusion principle. This is the simplest example of the spin-statistics
theorem, which states that half-integer spin particles must obey Fermi statistics.
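The link between anti-commutation and the exclusion principle can be illustrated on a toy model: a single fermionic mode, represented by 2 × 2 matrices in the basis (empty, occupied). This is only a sketch with a unit normalization, {a, a†} = 1, instead of the continuum normalization of eq. (1.215); the names `A`, `A_DAG` and `anticommutator` are illustrative.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    """Entry-wise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def anticommutator(x, y):
    return add(matmul(x, y), matmul(y, x))


# Basis: column (1, 0) is the empty state, column (0, 1) the occupied state.
A = [[0, 1], [0, 0]]      # annihilation operator: occupied -> empty
A_DAG = [[0, 0], [1, 0]]  # creation operator: empty -> occupied

IDENTITY = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]

# Number operator a^dagger a, with eigenvalues 0 and 1 only.
NUMBER = matmul(A_DAG, A)

print(anticommutator(A, A_DAG) == IDENTITY)  # toy version of eq. (1.215)
print(matmul(A_DAG, A_DAG) == ZERO)          # creating twice gives zero
print(NUMBER)
```

Since the square of the creation operator vanishes, the mode can hold at most one particle, which is the content of the exclusion principle in this simplified setting.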

1.11.6 Free spin-1/2 propagator

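From eq. (1.213), we obtain the following expression for the free Feynman propagator
of the Dirac field (see footnote 29):

$$ S_F^0(x, y) \equiv \big\langle 0 \big| \underbrace{\theta(x^0 - y^0)\, \psi_a(x)\, \overline{\psi}_b(y) - \theta(y^0 - x^0)\, \overline{\psi}_b(y)\, \psi_a(x)}_{T(\psi_a(x)\, \overline{\psi}_b(y))} \big| 0 \big\rangle = \int \frac{d^4p}{(2\pi)^4}\, e^{-ip\cdot(x-y)}\, \underbrace{\frac{i\, (\slashed{p} + m)}{p^2 - m^2 + i0^+}}_{S_F^0(p)} \,. \tag{1.216} $$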

The diagrammatic representation of this propagator is a line with an arrow:

$$ S_F^0(p) = [\text{line with an arrow, carrying the momentum } p] \,. \tag{1.217} $$

1.11.7 LSZ reduction formula for spin-1/2

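The LSZ reduction formula for transition amplitudes with fermions and/or
anti-fermions in the initial and final states reads:

$$ \begin{aligned} \big\langle \underbrace{q\sigma\; \overline{q}\,\overline{\sigma} \cdots}_{n \text{ particles}}{}_{\rm out} \big| \underbrace{p s\; \overline{p}\,\overline{s} \cdots}_{m \text{ particles}}{}_{\rm in} \big\rangle \doteq{} & \Big( \frac{i}{Z^{1/2}} \Big)^{m+n} \int d^4x\; e^{-ip\cdot x} \int d^4\overline{x}\; e^{-i\overline{p}\cdot \overline{x}} \cdots \\ & \times \int d^4y\; e^{+iq\cdot y} \int d^4\overline{y}\; e^{+i\overline{q}\cdot \overline{y}} \cdots\; \overline{v}_{\overline{s}}(\overline{p})\, \big( i \overrightarrow{\slashed{\partial}}_{\overline{x}} - m \big)\; \overline{u}_\sigma(q)\, \big( -i \overrightarrow{\slashed{\partial}}_y + m \big) \\ & \times \big\langle 0_{\rm out} \big| T\, \psi(\overline{x})\, \psi(y)\, \overline{\psi}(x)\, \overline{\psi}(\overline{y}) \cdots \big| 0_{\rm in} \big\rangle \\ & \times \big( i \overleftarrow{\slashed{\partial}}_x + m \big)\, u_s(p)\; \big( -i \overleftarrow{\slashed{\partial}}_{\overline{y}} - m \big)\, v_{\overline{\sigma}}(\overline{q}) \,, \end{aligned} \tag{1.218} $$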

where we give examples for fermions and anti-fermions (indicated by a bar over the
momentum and spin), both for the initial and final states. Besides the requirement
that the external lines of the Feynman graphs should be amputated, this formula leads

29 We have introduced a minus sign in the definition of the time-ordered product of Dirac fields. One

would have to mimic the derivation of the section 1.6 in order to see that this is the propagator that naturally
appears in the generating functional for the amplitudes with fermions.

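to the following prescriptions for the open ends of fermionic lines:

Incoming fermion : [line carrying momentum p] = $u(p)$

Incoming anti-fermion : [line carrying momentum p] = $\overline{v}(p)$

Outgoing fermion : [line carrying momentum p] = $\overline{u}(p)$

Outgoing anti-fermion : [line carrying momentum p] = $v(p)$ .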

Note that when writing the expression corresponding to a given Feynman graph, the
fermion lines it contains must be read in the direction opposite to the arrow carried by
the lines.

1.12 Spin 1 fields

1.12.1 Classical electrodynamics

The best known spin-1 particle is the photon. In classical electrodynamics, the electric
field E and magnetic field B obey Maxwell’s equations,

∇·E=ρ
∇ × B − ∂t E = J
∇ × E + ∂t B = 0
∇·B=0, (1.219)

written here in terms of charge density ρ and current J. The local conservation of
electrical charge implies the following continuity equation

∂t ρ + ∇ · J = 0 . (1.220)

The last two Maxwell’s equations are automatically satisfied if we write the E, B
fields in terms of potentials V and A,

E ≡ ∂t A + ∇V , B ≡ −∇ × A . (1.221)

This representation is not unique, since E and B are unchanged if we transform the
potentials as follows:

V → V + ∂t χ , A → A − ∇χ , (1.222)

where χ is an arbitrary function of space and time. Eq. (1.222) is called an (Abelian)
gauge transformation. Quantities that do not change under (1.222) are said to be
gauge invariant. For instance, the electrical and magnetic fields are invariant.
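The invariance of the fields can be verified in one line each, using the definitions (1.221) with the sign conventions of this section, the commutation of partial derivatives, and the fact that the curl of a gradient vanishes:

```latex
\begin{aligned}
\boldsymbol{E} &\to \partial_t \big( \boldsymbol{A} - \nabla\chi \big) + \nabla \big( V + \partial_t \chi \big)
  = \partial_t \boldsymbol{A} + \nabla V - \partial_t \nabla \chi + \nabla \partial_t \chi = \boldsymbol{E} \,, \\
\boldsymbol{B} &\to -\nabla \times \big( \boldsymbol{A} - \nabla\chi \big)
  = -\nabla \times \boldsymbol{A} + \nabla \times \nabla \chi = \boldsymbol{B} \,.
\end{aligned}
```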

1.12.2 Classical electrodynamics in Lorentz covariant form

In order to make manifest the properties of Maxwell's equations under Lorentz
transformations, let us firstly rewrite them in covariant form. Introduce a 4-vector $A^\mu$
and a rank-2 tensor $F^{\mu\nu}$,

$$ A^\mu \equiv (V, \boldsymbol{A}) \,, \quad F^{\mu\nu} \equiv \partial^\mu A^\nu - \partial^\nu A^\mu \,. \tag{1.223} $$

($F^{\mu\nu}$ is called the field strength.) Recalling that $\partial^\mu = (\partial_t, -\nabla)$, gauge
transformations take the following form

$$ A^\mu \to A^\mu + \partial^\mu \chi \,, \tag{1.224} $$

and $F^{\mu\nu}$ is gauge invariant. Moreover, we see that

$$ E^i = F^{0i} \,, \quad B^i = \tfrac{1}{2}\, \epsilon^{ijk} F^{jk} \,. \tag{1.225} $$

If we also encapsulate ρ and J in a 4-vector,

$$ J^\mu \equiv (\rho, \boldsymbol{J}) \,, \tag{1.226} $$

the first two Maxwell's equations and the continuity equation read

$$ \partial_\mu F^{\mu\nu} = -J^\nu \,, \quad \partial_\mu J^\mu = 0 \,. \tag{1.227} $$

The last two Maxwell's equations become

$$ \epsilon^{\mu\nu\rho\sigma}\, \partial_\nu F_{\rho\sigma} = 0 \,. \tag{1.228} $$

(It is automatically satisfied thanks to the antisymmetric structure of $F^{\mu\nu}$.)


A Lorentz scalar Lagrangian density whose Euler-Lagrange equations of motion
are the Maxwell's equations is

$$ \mathcal{L} \equiv -\frac{1}{4}\, F_{\mu\nu} F^{\mu\nu} + J_\mu A^\mu \,. \tag{1.229} $$

Because of the term $J_\mu A^\mu$ that couples the potential to the sources, this Lagrangian
density is not gauge invariant, but the action (integral of $\mathcal{L}$ over all space-time) is,
provided that the current is conserved (i.e. satisfies the continuity equation). Indeed,
we have

$$ \int d^4x\; J_\mu A^\mu \to \int d^4x\; J_\mu \big( A^\mu + \partial^\mu \chi \big) = \int d^4x\; J^\mu A_\mu - \int d^4x\; \chi\, \underbrace{\partial_\mu J^\mu}_{0} + \text{boundary term} \,. \tag{1.230} $$

(The boundary term is zero if we assume that there are no sources at infinity.)

1.12.3 Canonical quantization in Coulomb gauge


Although it leads to Maxwell's equations, the above Lagrangian has an unusual
property, related to gauge invariance: the conjugate momentum of the potential $A^0$ is
identically zero,

$$ \Pi^0(x) \equiv \frac{\delta \mathcal{L}}{\delta\, \partial_0 A^0(x)} = 0 \,. \tag{1.231} $$

Therefore, we cannot quantize electrodynamics simply by promoting the Poisson
bracket between $A^0$ and its conjugate momentum to a commutator. However, this
problem is not intrinsic to quantum mechanics: the very same issue arises when
trying to formulate classical electrodynamics in Hamiltonian form. The resolution of this
problem is to fix the gauge, i.e. to impose an extra condition on the potential $A^\mu$ such
that a unique $A^\mu$ corresponds to given E and B fields. Possible gauge conditions are:

$$ \begin{aligned} \text{Axial gauge} &: \quad n_\mu A^\mu = 0 \quad (n^\mu \text{ is a fixed 4-vector}) \,, \\ \text{Lorenz gauge} &: \quad \partial^\mu A_\mu = 0 \,, \\ \text{Coulomb gauge} &: \quad \nabla \cdot \boldsymbol{A} = 0 \,. \end{aligned} \tag{1.232} $$

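Let us illustrate this procedure in Coulomb gauge (see footnote 30). Firstly, let us decompose the
vector potential $A^i$ into longitudinal and transverse components:

$$ A^i = A^i_\parallel + A^i_\perp \,, \tag{1.233} $$

with

$$ A^i_\parallel \equiv \frac{\partial^i \partial^j}{\partial^2}\, A^j \,, \quad A^i_\perp \equiv \Big( \delta^{ij} - \frac{\partial^i \partial^j}{\partial^2} \Big)\, A^j \,. \tag{1.234} $$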
30 One may start from another gauge condition, and follow a similar line of reasoning in order to derive a

quantized theory of the photon field in another gauge. However, as we shall see later, we can make the
gauge fixing much more transparent by using functional quantization.

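The Coulomb gauge condition is equivalent to $A^i_\parallel = 0$. The remaining components of
$A^\mu$ are therefore $A^0$ and the two components of $A^i_\perp$, in terms of which the Lagrangian
reads:

$$ \begin{aligned} \mathcal{L} ={} & \frac{1}{2}\, (\partial_t A^i_\perp)(\partial_t A^i_\perp) - \frac{1}{2}\, (\partial_j A^i_\perp)(\partial_j A^i_\perp) + \frac{1}{2}\, (\partial_i A^0)(\partial_i A^0) \\ & + \underline{(\partial_t A^i_\perp)(\partial_i A^0)} + \underline{\frac{1}{2}\, (\partial_i A^j_\perp)(\partial_j A^i_\perp)} + J^0 A^0 - J^i A^i_\perp \,. \end{aligned} \tag{1.235} $$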

Note that the two underlined terms will vanish in the action, after an integration by
parts (thanks to the transversality of $A^i_\perp$). The Euler-Lagrange equation for the field
$A^0$ is

$$ \partial^2 A^0 = J^0 \,, \tag{1.236} $$

i.e. the Poisson equation with source term J0 . Note that this equation has no time
derivative. Therefore, A0 reflects instantaneously the changes of the charge density
J0 (this does not contradict special relativity, since A0 is not an observable – only E
and B are). Ignoring all the terms that would vanish in the action upon integration by
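parts, we may thus rewrite the Lagrangian as

$$ \mathcal{L} = \frac{1}{2}\, (\partial_t A^i_\perp)(\partial_t A^i_\perp) - \frac{1}{2}\, (\partial_j A^i_\perp)(\partial_j A^i_\perp) - J^i A^i_\perp + \frac{1}{2}\, J^0\, \frac{1}{\partial^2}\, J^0 \,, \tag{1.237} $$

and obtain the following Euler-Lagrange equation of motion for the field $A^i_\perp$:

$$ \Box\, A^i_\perp = -\Big( \delta^{ij} - \frac{\partial^i \partial^j}{\partial^2} \Big)\, J^j \,, \tag{1.238} $$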

i.e. a massless Klein-Gordon equation with the transverse projection of the charge
current as source term.
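The last term of eq. (1.237) can be recovered explicitly. The sketch below solves eq. (1.236) formally as $A^0 = \partial^{-2} J^0$ and drops total derivatives, which is legitimate inside the action (the symbol $\simeq$ denotes equality up to such terms):

```latex
\begin{aligned}
\frac{1}{2}\, (\partial_i A^0)(\partial_i A^0) + J^0 A^0
  &\simeq -\frac{1}{2}\, A^0\, \partial^2 A^0 + J^0 A^0 \\
  &= -\frac{1}{2}\, A^0 J^0 + J^0 A^0
   = \frac{1}{2}\, J^0 A^0
   = \frac{1}{2}\, J^0\, \frac{1}{\partial^2}\, J^0 \,.
\end{aligned}
```

The field $A^0$ has thus been eliminated in favour of a non-local, instantaneous interaction between charge densities.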
In this form, electrodynamics has no redundant degrees of freedom, and can now
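be quantized in the vacuum ($J^0 = J^i = 0$) in the canonical way. Firstly, we define the
momentum conjugated to $A^i_\perp$,

$$ \Pi^i_\perp(x) \equiv \frac{\delta \mathcal{L}}{\delta\, \partial_t A^i_\perp(x)} = \partial_t A^i_\perp(x) \,. \tag{1.239} $$

Then, we promote $A^i_\perp$ and $\Pi^i_\perp$ to quantum operators, and we impose on them the
following canonical equal-time commutation relations,

$$ \begin{aligned} \big[ A^i_\perp(x), \Pi^j_\perp(y) \big]_{x^0=y^0} &= i\, \Big( \delta^{ij} - \frac{\partial^i \partial^j}{\partial^2} \Big)\, \delta(\boldsymbol{x} - \boldsymbol{y}) \,, \\ \big[ A^i_\perp(x), A^j_\perp(y) \big]_{x^0=y^0} &= \big[ \Pi^i_\perp(x), \Pi^j_\perp(y) \big]_{x^0=y^0} = 0 \,. \end{aligned} \tag{1.240} $$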

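(In the first of these relations, the transverse projector in the right hand side follows
from the fact that $A^i_\perp$ and $\Pi^j_\perp$ are both transverse.) These commutation relations can
be realized by decomposing $A^i_\perp$ on a basis of solutions of the Klein-Gordon equation,
i.e. plane waves:

$$ A^i_\perp(x) \equiv \sum_{\lambda=1,2} \int \frac{d^3p}{(2\pi)^3\, 2|\boldsymbol{p}|} \Big[ \epsilon^i_\lambda(\boldsymbol{p})\, a^\dagger_{\lambda p}\, e^{+ip\cdot x} + \epsilon^{i*}_\lambda(\boldsymbol{p})\, a_{\lambda p}\, e^{-ip\cdot x} \Big] \,, \tag{1.241} $$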

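where the two vectors $\epsilon^i_{1,2}(\boldsymbol{p})$ are polarization vectors orthogonal to p,

$$ \boldsymbol{p} \cdot \boldsymbol{\epsilon}_\lambda(\boldsymbol{p}) = 0 \,. \tag{1.242} $$

In 3 spatial dimensions, a basis of such vectors has two elements, that we have labeled
with λ = 1, 2. In addition, it is convenient to normalize the polarization vectors as
follows

$$ \boldsymbol{\epsilon}_\lambda(\boldsymbol{p}) \cdot \boldsymbol{\epsilon}^*_{\lambda'}(\boldsymbol{p}) = \delta_{\lambda\lambda'} \,, \quad \sum_{\lambda=1,2} \epsilon^{i*}_\lambda(\boldsymbol{p})\, \epsilon^j_\lambda(\boldsymbol{p}) = \delta^{ij} - \frac{p^i p^j}{\boldsymbol{p}^2} \,. \tag{1.243} $$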

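With this choice, the commutation relations of eqs. (1.240) are equivalent to the
following commutation relations between creation and annihilation operators:

$$ \begin{aligned} \big[ a_{\lambda p}, a_{\lambda' q} \big] &= \big[ a^\dagger_{\lambda p}, a^\dagger_{\lambda' q} \big] = 0 \,, \\ \big[ a_{\lambda p}, a^\dagger_{\lambda' q} \big] &= (2\pi)^3\, 2|\boldsymbol{p}|\, \delta_{\lambda\lambda'}\, \delta(\boldsymbol{p} - \boldsymbol{q}) \,. \end{aligned} \tag{1.244} $$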

1.12.4 Feynman rules for photons


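Eq. (1.241) can be inverted to obtain the creation and annihilation operators as

$$ \begin{aligned} a^\dagger_{\lambda p} &= -i\, \epsilon^{i*}_\lambda(\boldsymbol{p}) \int d^3x\; e^{-ip\cdot x}\, \overleftrightarrow{\partial}_0\, A^i_\perp(x) \,, \\ a_{\lambda p} &= +i\, \epsilon^i_\lambda(\boldsymbol{p}) \int d^3x\; e^{+ip\cdot x}\, \overleftrightarrow{\partial}_0\, A^i_\perp(x) \,. \end{aligned} \tag{1.245} $$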

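With these formulas, it is easy to derive the LSZ reduction formulas for photons in
the initial and final states,

$$ \begin{aligned} \big\langle \underbrace{q\lambda' \cdots}_{n \text{ photons}}{}_{\rm out} \big| \underbrace{p\lambda \cdots}_{m \text{ photons}}{}_{\rm in} \big\rangle \doteq{} & \Big( \frac{i}{Z^{1/2}} \Big)^{m+n} \int d^4x\; e^{-ip\cdot x}\, \epsilon^{i*}_\lambda(\boldsymbol{p})\, \Box_x \cdots \\ & \times \int d^4y\; e^{+iq\cdot y}\, \epsilon^j_{\lambda'}(\boldsymbol{q})\, \Box_y \cdots\; \big\langle 0_{\rm out} \big| T\, A^i_\perp(x)\, A^j_\perp(y) \cdots \big| 0_{\rm in} \big\rangle \,. \end{aligned} \tag{1.246} $$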

The free Feynman propagator of the photon (in Coulomb gauge) can be read off the
quadratic part of the Lagrangian (1.237). In momentum space, it reads

$$ G_F^{0\, ij}(p) = [\text{photon line carrying the momentum } p \text{, with indices } i \text{ and } j] = \frac{i\, \Big( \delta^{ij} - \dfrac{p^i p^j}{\boldsymbol{p}^2} \Big)}{p^2 + i0^+} \,. \tag{1.247} $$

The operator $\epsilon^i_\lambda(\boldsymbol{p})\, \Box_x$ in the reduction formula simply amputates the external photon
line to which it is applied (see footnote 31). Transition amplitudes with incoming and outgoing
photons are therefore given by amputated graphs, with a polarization vector contracted
to the Lorentz index of each external photon.

1.13 Abelian gauge invariance, QED

So far, we have derived a quantized field theory for spin 1/2 fermions and a quantized
field theory of photons (in the absence of charged sources), but they appear as
unrelated constructions. The next step is to combine the two into a quantum theory of
charged fermions that interact electromagnetically via photon exchanges.

1.13.1 Global U(1) symmetry of the Dirac Lagrangian

Firstly, note that the fermion Lagrangian is invariant under the following transforma-
tion of the fermion field

ψ → Ω† ψ , (1.248)

where Ω is a phase (i.e. an element of the group U(1)), provided that we consider
only rigid transformations (i.e. independent of the space-time point x). By Noether’s
theorem (see the section 1.2.4), this continuous symmetry corresponds to the existence
of a conserved current,

$$ J^\mu = \overline{\psi}\, \gamma^\mu\, \psi \,. \tag{1.249} $$

It is indeed straightforward to check from Dirac's equation that

$$ \partial_\mu J^\mu = 0 \,. \tag{1.250} $$
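For completeness, here is a sketch of this check. It uses the Dirac equation (1.204) and its Hermitean conjugate, multiplied on the right by $\gamma^0$ and simplified with eq. (1.205):

```latex
\begin{aligned}
\gamma^\mu \partial_\mu \psi &= -i m\, \psi \,, \qquad
\big( \partial_\mu \overline{\psi} \big)\, \gamma^\mu = +i m\, \overline{\psi} \,, \\
\partial_\mu \big( \overline{\psi}\, \gamma^\mu\, \psi \big)
  &= \big( \partial_\mu \overline{\psi} \big)\, \gamma^\mu\, \psi + \overline{\psi}\, \gamma^\mu \partial_\mu \psi
   = i m\, \overline{\psi} \psi - i m\, \overline{\psi} \psi = 0 \,.
\end{aligned}
```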
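31 Note that
$$ \Big( \delta^{ij} - \frac{p^i p^j}{\boldsymbol{p}^2} \Big)\, \epsilon^j_\lambda(\boldsymbol{p}) = \epsilon^i_\lambda(\boldsymbol{p}) \,. $$
Therefore, the transverse projectors attached to the external photon lines can be dropped.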

The physical interpretation of this current emerges from the spatial integral of the
time component $J^0$,

$$ Q \equiv \int d^3x\; J^0(x) \,. \tag{1.251} $$

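Using the Fourier mode decomposition (1.213) of the spinor ψ(x), we obtain the
following expression:

$$ \begin{aligned} Q &= \sum_{s=\pm} \int \frac{d^3p}{(2\pi)^3\, 2E_p} \Big\{ a_{sp}\, a^\dagger_{sp} + b^\dagger_{sp}\, b_{sp} \Big\} \\ &= \sum_{s=\pm} \int \frac{d^3p}{(2\pi)^3\, 2E_p} \Big\{ b^\dagger_{sp}\, b_{sp} - a^\dagger_{sp}\, a_{sp} \Big\} + \text{(infinite) constant} \,. \end{aligned} \tag{1.252} $$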

Thus, the operator Q counts the number of particles created by b† minus the number
of particles created by a† . If we assign a charge +1 to the former and −1 to the latter,
we can interpret Q as the operator that measures the total charge in the system.

1.13.2 Minimal coupling to a spin-1 field


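Secondly, the gauge transformation of the potential $A^\mu$ given in eq. (1.224) can also
be written in the following form (see footnote 32):

$$ A^\mu \to \Omega^\dagger A^\mu\, \Omega + i\, \Omega^\dagger \partial^\mu \Omega \,, \tag{1.253} $$

with

$$ \Omega(x) \equiv e^{-i\chi(x)} \,. \tag{1.254} $$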
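That eq. (1.253) reproduces eq. (1.224) follows from a one-line computation, since Ω is a mere phase that commutes with $A^\mu$:

```latex
\Omega^\dagger A^\mu\, \Omega = A^\mu \,, \qquad
i\, \Omega^\dagger \partial^\mu \Omega
  = i\, e^{+i\chi}\, \big( -i\, \partial^\mu \chi \big)\, e^{-i\chi}
  = \partial^\mu \chi
\quad \Longrightarrow \quad
A^\mu \to A^\mu + \partial^\mu \chi \,.
```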
When written in this form, the gauge transformation of the photon field appears to
be also generated by the group U(1). Unlike the quantum field theory for fermions,
the photon Lagrangian is invariant under local gauge transformations, i.e. where Ω
depends on x in an arbitrary fashion. Therefore, at this point we have two disjoint
quantum field theories: a theory of non-interacting charged fermions that has a
global U(1) invariance, and a theory of non-interacting photons that has a local U(1)
invariance, but no coupling between the two.
Let us see what minimal modification would be necessary in order to promote the
U(1) symmetry of the fermion sector into a local symmetry. An immediate obstacle
is that

$$ \Omega(x)\, \partial_\mu\, \Omega^\dagger(x) \neq \partial_\mu \,. \tag{1.255} $$
32 Naturally, Ω† Aµ Ω = Aµ . We have used this somewhat more complicated form to highlight the

analogy with the non-Abelian gauge theories that we will study later.

Equivalently, the problem comes from the fact that the derivative ∂µ ψ does not
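transform in the same way as ψ itself when Ω depends on x. Instead, we have

$$ \partial_\mu \psi \to \partial_\mu \big( \Omega^\dagger \psi \big) = \Omega^\dagger\, \partial_\mu \psi + \big( \partial_\mu \Omega^\dagger \big)\, \psi \,. \tag{1.256} $$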

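But we see that the second term can be connected to the variation of a photon field
under the same transformation. This suggests that the combination $(\partial_\mu - iA_\mu)\psi$ has
a simpler transformation law:

$$ \begin{aligned} \big( \partial_\mu - iA_\mu \big)\, \psi &\to \Big( \partial_\mu - i\, \big( \Omega^\dagger A_\mu \Omega + i\, \Omega^\dagger \partial_\mu \Omega \big) \Big)\, \Omega^\dagger \psi \\ &= \Omega^\dagger\, \big( \partial_\mu - iA_\mu \big)\, \psi + \Omega^\dagger\, \underbrace{\Big( \Omega\, (\partial_\mu \Omega^\dagger) + (\partial_\mu \Omega)\, \Omega^\dagger \Big)}_{\partial_\mu (\Omega \Omega^\dagger) = 0}\, \psi \,. \end{aligned} \tag{1.257} $$

The operator $D_\mu \equiv \partial_\mu - iA_\mu$ is called a covariant derivative. The above calculation
shows that $\overline{\psi}\, D_\mu \psi$ is invariant under local gauge transformations.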

1.13.3 Abelian gauge theories

This observation is the basis of (Abelian) gauge theories: the minimal change to the
Dirac Lagrangian that makes it locally gauge invariant introduces a coupling $\overline{\psi} A_\mu \psi$
between two fermion fields and a spin-1 field such as the photon. The complete
Lagrangian of this theory therefore reads:

$$ \mathcal{L} = -\frac{1}{4}\, F_{\mu\nu} F^{\mu\nu} + \overline{\psi}\, \big( i \slashed{D} - m \big)\, \psi \,. \tag{1.258} $$
We already know the Feynman rules for the photon and fermion propagators, and the
prescription for external photon and fermion lines. The only additional Feynman rule
is the following interaction vertex,

$$ [\text{fermion-photon vertex, with photon index } \mu] = -i\, \gamma^\mu \,, \tag{1.259} $$

that can be read off directly from the Lagrangian.


Quantum Electrodynamics (QED) is a quantum field theory that describes the
interactions between electromagnetic radiation (photons) and charged particles (elec-
trons and positrons for instance), whose Lagrangian is of the form (1.258). The
only necessary generalization compared to the previous discussion is to introduce a

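parameter e that represents the (bare) electrical charge of the electron, which leads to
the following changes:

$$ \begin{aligned} \text{Covariant derivative} &: \quad D^\mu \equiv \partial^\mu - i\, e\, A^\mu \\ \text{Gauge transformation of the photon} &: \quad A^\mu \to \Omega^\dagger A^\mu\, \Omega + \frac{i}{e}\, \Omega^\dagger \partial^\mu \Omega \\ \text{Electrical current} &: \quad e\, \overline{\psi}\, \gamma^\mu\, \psi \\ \text{Photon-electron vertex} &: \quad -i\, e\, \gamma^\mu \,. \end{aligned} \tag{1.260} $$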

1.14 Charge conservation, Ward-Takahashi identities

1.14.1 Charge of 1-particle states

The charge operator Q defined in eq. (1.251) is invariant under translations in time
(because $J^\mu$ is a conserved current) and in space (because it is integrated over all space).
Since the current $J^\mu$ is a 4-vector, Q is also invariant under Lorentz transformations.
Therefore Q conserves the energy and momentum of the states on which it acts. When
acting on the vacuum state, one has

$$ Q\, \big| 0 \big\rangle = 0 \,. \tag{1.261} $$

When acting on a 1-particle state $|\alpha_p\rangle$, Q gives another state with the same
4-momentum, and therefore the same invariant mass. But since single particle states are
separated from states with a higher occupancy in the spectral function of the theory,
$Q\, |\alpha_p\rangle$ must in fact be proportional to $|\alpha_p\rangle$ itself,

$$ Q\, \big| \alpha_p \big\rangle = q_{\alpha,p}\, \big| \alpha_p \big\rangle \,. \tag{1.262} $$

In other words, 1-particle states are eigenvectors of the charge operator. Since Q is
Lorentz invariant, the eigenvalue qα,p cannot depend on the momentum p (nor on
the spin state of the particle), and it can only depend on the species of particle α. We
will thus denote it qα , and call it the electrical charge of the particle of type α.
In theories with 1-particle states that do not correspond to the fundamental fields
of the Lagrangian (e.g. composite bound states made of several elementary particles),
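one may go a bit further. The canonical anti-commutation relations imply

$$ \big[ J^0(x), \psi(y) \big]_{x^0=y^0} = -e\, \psi(x)\, \delta(\boldsymbol{x} - \boldsymbol{y}) \,, \quad \big[ Q, \psi(y) \big] = -e\, \psi(y) \,. \tag{1.263} $$

More generally, for any local function $F(\psi(x), \psi^\dagger(x))$, we have

$$ \big[ Q, F(\psi(y), \psi^\dagger(y)) \big]_{x^0=y^0} = -e\, (n_+ - n_-)\, F(\psi(y), \psi^\dagger(y)) \tag{1.264} $$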

where $n_+$ is the number of ψ's in F and $n_-$ the number of ψ†'s. If we evaluate this
identity between the vacuum and a 1-particle state $|\alpha_p\rangle$, we obtain

$$ \big\langle 0 \big| F(\psi(y), \psi^\dagger(y)) \big| \alpha_p \big\rangle\, \big( q_\alpha - (n_+ - n_-)\, e \big) = 0 \,. \tag{1.265} $$

Therefore, if the operator $F(\psi, \psi^\dagger)$ can create the particle α from the vacuum (i.e. the
matrix element in the left hand side is non-zero), then we must have

$$ q_\alpha = (n_+ - n_-)\, e \,. \tag{1.266} $$
In other words, the charge of the particle α is the number of ψ’s it contains, minus
the number of ψ† ’s, times the electrical charge e of the field ψ (as it appears in
the Lagrangian). The non-trivial aspect of this assertion comes from the fact that it
does not depend on the (usually complicated and non-perturbative) interactions that
produce the binding.
So far, we have not discussed the renormalization of the parameter e. Its
renormalized value $e_r$ should be such that the covariant derivative retains its form (see footnote 33) when
expressed in terms of the renormalized photon field $A^\mu_r$, i.e.

$$ \partial^\mu - i\, e_r\, A^\mu_r \,. \tag{1.267} $$

Since the field $A^\mu_r$ is related to the bare photon field $A^\mu_b$ by

$$ A^\mu_b = Z_3^{1/2}\, A^\mu_r \,, \tag{1.268} $$

the bare and renormalized charges must be related by

$$ e_b = Z_3^{-1/2}\, e_r \,. \tag{1.269} $$

In combination with eq. (1.266), this means that the charges of all 1-particle states are
renormalized by the same factor $Z_3^{-1/2}$, regardless of the species of particle contained
in the state. For this to work, cancellations between various Feynman graphs are
necessary. These cancellations are a consequence of the local gauge invariance of the
theory, and in their simplest form they can be encapsulated in the Ward-Takahashi
identities, that we shall derive now.
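The relation (1.269) is simply the statement that the product of the charge and the photon field is left unchanged by the renormalization, as can be seen by combining it with eq. (1.268):

```latex
e_b\, A^\mu_b
  = \big( Z_3^{-1/2}\, e_r \big) \big( Z_3^{1/2}\, A^\mu_r \big)
  = e_r\, A^\mu_r
\quad \Longrightarrow \quad
\partial^\mu - i\, e_b\, A^\mu_b = \partial^\mu - i\, e_r\, A^\mu_r \,.
```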

1.14.2 Ward-Takahashi identities


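Amplitudes with amputated external photon lines can be obtained as follows:

$$ \mathcal{M}^{\mu_1 \mu_2 \cdots}(q_1, q_2, \cdots) = \int d^4x_1\, d^4x_2 \cdots\; e^{-iq_1\cdot x_1}\, e^{-iq_2\cdot x_2} \cdots\; \big\langle \beta_{\rm out} \big| T\, J^{\mu_1}(x_1)\, J^{\mu_2}(x_2) \cdots \big| \alpha_{\rm in} \big\rangle \,, \tag{1.270} $$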
33 The implicit assumption of this sentence is that the renormalization of QED preserves its local gauge

invariance.

where only electromagnetic currents appear inside the T-product, and all the external
charged particles are kept in the initial and final states α and β (and are therefore
on-shell).
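Let us contract the Lorentz index $\mu_1$ with the momentum $q_1^{\mu_1}$ of the first photon.
After an integration by parts, this reads

$$ q_{1,\mu_1}\, \mathcal{M}^{\mu_1 \mu_2 \cdots}(q_1, q_2, \cdots) = -i \int d^4x_1\, d^4x_2 \cdots\; e^{-iq_1\cdot x_1}\, e^{-iq_2\cdot x_2} \cdots\; \big\langle \beta_{\rm out} \big| \partial_{\mu_1} T\, J^{\mu_1}(x_1)\, J^{\mu_2}(x_2) \cdots \big| \alpha_{\rm in} \big\rangle \,. \tag{1.271} $$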

The derivative of the T-product involves two types of terms: (i) terms where the
derivative acts directly on the current $J^{\mu_1}(x_1)$, that are zero thanks to current
conservation, and (ii) terms where it acts on the theta functions that order the times inside
the T-product. With two currents, the latter term reads (see footnote 34)

$$ \frac{\partial}{\partial x^\mu}\, T\big( J^\mu(x)\, J^\nu(y) \big) = \delta(x^0 - y^0)\, \big[ J^0(x), J^\nu(y) \big] = 0 \,. \tag{1.272} $$

This generalizes to more than two currents, and we therefore have quite generally

$$ q_{1,\mu_1}\, \mathcal{M}^{\mu_1 \mu_2 \cdots}(q_1, q_2, \cdots) = 0 \,. \tag{1.273} $$

The same property would hold for all the external photon lines of the amplitude. This
equation is known as the Ward-Takahashi identity.
A consequence of eq. (1.273) is that QED transition amplitudes are unchanged if
the photon propagators or polarization vectors are modified by terms proportional to
the momentum $p^\mu$,

$$ \begin{aligned} G_F^{0\, \mu\nu}(p) &\to G_F^{0\, \mu\nu}(p) + a^\mu p^\nu + b^\nu p^\mu \,, \\ \epsilon^\mu_\lambda(p) &\to \epsilon^\mu_\lambda(p) + c\, p^\mu \,. \end{aligned} \tag{1.274} $$

This is precisely the modification of the Feynman rules one would encounter by using
a different gauge fixing in the quantization of the theory. Thus, the Ward-Takahashi
identities imply the gauge invariance of the transition amplitudes in QED.
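As a minimal illustration for a single external photon of momentum q and polarization λ (a sketch, with all other indices and momenta of the amplitude left implicit), the shift of the polarization vector in eq. (1.274) drops out thanks to eq. (1.273):

```latex
\big( \epsilon_{\lambda,\mu}(q) + c\, q_\mu \big)\, \mathcal{M}^{\mu \cdots}(q, \cdots)
  = \epsilon_{\lambda,\mu}(q)\, \mathcal{M}^{\mu \cdots}(q, \cdots)
  + c\, \underbrace{q_\mu\, \mathcal{M}^{\mu \cdots}(q, \cdots)}_{0}
  = \epsilon_{\lambda,\mu}(q)\, \mathcal{M}^{\mu \cdots}(q, \cdots) \,.
```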

34 This step of the argument would fail if we had kept charged field operators inside the T-product,

because their equal-time commutator with J0 is non-zero. Therefore, the Ward-Takahashi identities are
valid provided all the external charged particles are on-shell, but there is no such requirement for the neutral
external particles (e.g. the photons).

1.15 Spontaneous symmetry breaking

1.15.1 Introduction
Until now, our discussion of the symmetry of a theory has been limited to a study of
its Lagrangian or Hamiltonian, and we have tacitly assumed that the symmetry of
the Lagrangian implies that the physics of this system exhibits the symmetry under
consideration to its full extent. However, strictly speaking, a symmetric Lagrangian
only implies that the corresponding equations of motion are symmetric, i.e. that
a symmetry transformation applied to a solution of the equations of motion gives
another solution. In other words, the symmetry of the Lagrangian implies that the set
of the solutions of the equations of motion is symmetric, not that every individual
solution is symmetric. A spontaneously broken symmetry is a symmetry of the
Lagrangian which is not realized by the ground state.

Let us first recall a standard result of quantum mechanics, that on the surface
seems to forbid the possibility of non-symmetric ground states. Consider a quantum
system of Hamiltonian H, which is also invariant under a discrete symmetry R such
that $R^2 = 1$ (such as a mirror symmetry). The Hamiltonian commutes with the
symmetry generator,

$$ \big[ R, H \big] = 0 \,, \tag{1.275} $$

which implies that H and R are diagonalizable simultaneously. Since $R^2 = 1$,
the eigenvalues of R are ±1, and the eigenstates of H are all either symmetric or
antisymmetric under R.

Figure 1.3: Left to right: potential (as a function of φ) for the Hamiltonians $H_0$, $H_0 + V$ and $H_0 + V + \widetilde{V}$.

In order to see how this result is circumvented in quantum field theory, let us
consider a simple explicit realization of this situation by a potential made of two
infinite wells centered at φ = ±φ∗ , mirror symmetric with respect to φ = 0. Let us

denote by $H_0$ the Hamiltonian of this system. Each of the wells has its own ground state,
that we denote $|0_+\rangle$ and $|0_-\rangle$, respectively. They are degenerate in energy, transform
into one another by the action of R, and have a vanishing overlap,

$$ R\, \big| 0_+ \big\rangle = \big| 0_- \big\rangle \,, \quad \big\langle 0_+ \big| 0_- \big\rangle = 0 \,. \tag{1.276} $$

Then, we introduce a perturbation V, also mirror symmetric, such that the energy
barrier between the two wells becomes finite (this interaction acts as a kind of
coupling between the two wells). With this perturbation, we have

$$ \begin{aligned} \langle 0_+ | H_0 + V | 0_+ \rangle &= \langle 0_- | H_0 + V | 0_- \rangle = a \,, \\ \langle 0_+ | H_0 + V | 0_- \rangle &= \langle 0_+ | V | 0_- \rangle = b \neq 0 \,, \\ \langle 0_- | H_0 + V | 0_+ \rangle &= \langle 0_- | V | 0_+ \rangle = b \,, \end{aligned} \tag{1.277} $$

with a, b real. With $b \neq 0$, the eigenstates of the full Hamiltonian $H \equiv H_0 + V$
are no longer the states $|0_+\rangle$ and $|0_-\rangle$, but the combinations (see footnote 35) $|0_+\rangle \pm |0_-\rangle$, whose
eigenvalues are a ± b. These eigenstates are symmetric or anti-symmetric under R.
Note now that b is proportional to the time derivative of the tunneling amplitude
between the states $|0_+\rangle$ and $|0_-\rangle$. Therefore, it is exponentially suppressed when
the system has a large spatial volume, since the field must go from $+\phi_*$ to $-\phi_*$ in
the entire volume during this transition. In fact, b is identically zero if the volume
is infinite, and the two eigenstates $|0_+\rangle \pm |0_-\rangle$ are degenerate. If this system is then
perturbed by an anti-symmetric term $\widetilde{V}$, the diagonal matrix elements (see footnote 36) $\langle 0_\pm | \widetilde{V} | 0_\pm \rangle$
of the perturbation are much larger than the splitting 2b between the eigenvalues of H.
Therefore, under the effect of this perturbation, the ground state (now unique since
the perturbation $\widetilde{V}$ breaks the symmetry R) of the Hamiltonian is very close to one of
the original states $|0_\pm\rangle$.
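This mechanism can be made concrete with a two-level toy model restricted to the states $|0_+\rangle$ and $|0_-\rangle$ (that is, neglecting the excited states, as in footnote 35). In the sketch below, which uses only the Python standard library, the perturbed Hamiltonian is the matrix [[a + v, b], [b, a − v]], where ±v stands for the diagonal matrix elements of the anti-symmetric perturbation; the function name `ground_state` and the numerical values are illustrative assumptions.

```python
import math


def ground_state(a, b, v):
    """Lowest eigenvalue and normalized eigenvector of [[a + v, b], [b, a - v]].

    The eigenvector is returned as (c_plus, c_minus), the components on the
    states |0+> and |0->. The parameters b and v must not both vanish.
    """
    if b == 0 and v == 0:
        raise ValueError("degenerate case: b and v cannot both vanish")
    r = math.hypot(b, v)
    energy = a - r
    # Two equivalent forms of the eigenvector; pick the one that cannot vanish.
    if v >= 0:
        c_plus, c_minus = -b, v + r
    else:
        c_plus, c_minus = r - v, -b
    norm = math.hypot(c_plus, c_minus)
    return energy, (c_plus / norm, c_minus / norm)


# Without the anti-symmetric term, the ground state is the anti-symmetric
# combination (for b > 0), with energy a - b.
print(ground_state(0.0, 0.5, 0.0))

# With a small fixed v, the weight on |0-> tends to 1 as the tunneling
# matrix element b decreases (i.e. as the volume grows).
for b in (1e-1, 1e-3, 1e-6):
    energy, (c_plus, c_minus) = ground_state(0.0, b, 1e-3)
    print(b, c_minus ** 2)
```

The exact eigenvalues of this matrix are $a \pm \sqrt{b^2 + v^2}$, which reduce to a ± b when v = 0 and to a ± v when b = 0.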

1.15.2 Degenerate vacua with a continuous symmetry

Let us now return to the case of a continuous symmetry. When the volume is infinite,
a ground state $|v\rangle$ is characterized by the fact that it is an eigenstate of the momentum
$P^i$ with a null eigenvalue (see footnote 37)

$$ \boldsymbol{P}\, | v \rangle = 0 \,. \tag{1.278} $$
35 For this conclusion to hold, the matrix elements of H between $|0_\pm\rangle$ and the excited states should
be negligible. Otherwise, the ground state of the perturbed Hamiltonian will be a more general linear
combination of the eigenstates of $H_0$.
36 The non-diagonal matrix elements of $\widetilde{V}$ are zero per our assumption that $\widetilde{V}$ is odd under R.
37 Multiparticle states whose total momentum is zero can be excluded by the fact that they are separated
from ground states by a finite threshold.



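There is in general a whole set of such states, that we may choose as orthogonal,

$$ \langle u | v \rangle = \delta_{uv} \,. \tag{1.279} $$

For any matrix element $\langle u | A(\boldsymbol{x}) B(0) | v \rangle$ of the equal-time product of two local
operators, we may insert a complete basis of states in order to get

$$ \begin{aligned} \langle u | A(\boldsymbol{x}) B(0) | v \rangle ={} & \sum_{\text{vacua } w} \langle u | A(0) | w \rangle\, \langle w | B(0) | v \rangle \\ & + \int \frac{d^3p}{(2\pi)^3} \sum_N \langle u | A(0) | N, \boldsymbol{p} \rangle\, \langle N, \boldsymbol{p} | B(0) | v \rangle\; e^{-i\boldsymbol{p}\cdot \boldsymbol{x}} \,, \end{aligned} \tag{1.280} $$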

where we have separated the ground states $|w\rangle$ from the continuum of populated
states $|N,\mathbf{p}\rangle$ (the label $N$, possibly continuous, distinguishes all those states that
have the same total momentum $\mathbf{p}$). To obtain this relationship, we have used the
translation invariance of the ground states, and the fact that $\mathbf{P}$ is the generator of spatial
translations. Since the states $|N,\mathbf{p}\rangle$ belong to a continuum of states, the integrand on
the second line is smooth enough for the integral to vanish when $|\mathbf{x}| \to \infty$, by the
Riemann-Lebesgue lemma. Therefore, we have:

$$ \lim_{|\mathbf{x}|\to+\infty} \langle u|A(\mathbf{x})B(0)|v\rangle = \sum_{\text{vacua } w} \langle u|A(0)|w\rangle \langle w|B(0)|v\rangle . \tag{1.281} $$

Likewise, we may prove:

$$ \lim_{|\mathbf{x}|\to+\infty} \langle u|B(0)A(\mathbf{x})|v\rangle = \sum_{\text{vacua } w} \langle u|B(0)|w\rangle \langle w|A(0)|v\rangle . \tag{1.282} $$

Causality implies that $A(\mathbf{x})B(0) = B(0)A(\mathbf{x})$ since the separation between the two
points is space-like. Comparing eqs. (1.281) and (1.282), the matrix elements
$\langle u|A(0)|v\rangle$ and $\langle u|B(0)|v\rangle$ may therefore be viewed as commuting Hermitean
matrices, that we can diagonalize simultaneously. Moreover, since $A$ and $B$ are arbitrary
local Hermitean operators, this property is in fact true for all such operators. By choosing
properly the basis of the vacua when the volume is infinite, all the local Hermitean
operators have vanishing matrix elements between distinct vacua:

$$ \langle u|A(0)|v\rangle = \delta_{uv}\, a_v . \tag{1.283} $$

Consequently, any local interaction term that breaks the symmetry responsible for the
degeneracy of these vacua is diagonal in this basis. Therefore, it lifts the degeneracy
and promotes one of the states $|v\rangle$ to the status of true ground state of the system
(instead of a symmetric linear combination of the $|v\rangle$'s).

1.15.3 Conserved currents and charges


The fact that the Lagrangian is invariant under a continuous symmetry implies the
existence of conserved currents $J^\mu_a(x)$ such that

$$ \partial_\mu J^\mu_a(x) = 0 . \tag{1.284} $$

Let us assume that the symmetry transformation is of the form

$$ \phi_i(x) \to \phi_i(x) + i\,\epsilon_a\, t^a_{ij}\, \phi_j(x) , \tag{1.285} $$

where the $t^a_{ij}$ are the generators of the Lie algebra of the group of transformations,
in the representation where the fields $\phi_i$ live. For the fields $\phi_i$ to be Hermitean, the
numbers $t^a_{ij}$ must be purely imaginary (this would be the case if the $\phi_i$ are in the
adjoint representation of the Lie algebra). From Noether's theorem, the conserved
currents read:

$$ J^\mu_a(x) = \sum_i \frac{\delta\mathcal{L}}{\delta\partial_\mu\phi_i(x)}\, \frac{\delta\phi_i(x)}{\delta\epsilon_a} . \tag{1.286} $$

By integrating over all space, we obtain conserved charges

$$ Q_a(x^0) \equiv \int d^3\mathbf{x}\; J^0_a(x^0,\mathbf{x}) . \tag{1.287} $$

The time component of the currents has the form

$$ J^0_a(x) = i\,\pi_i(x)\, t^a_{ij}\, \phi_j(x) , \tag{1.288} $$

where $\pi_i(x)$ is the canonical momentum associated with $\phi_i(x)$. Since the matrices
$t^a$ have imaginary components, these currents are Hermitean, as well as the charges
$Q_a$. Using the canonical commutation relations,

$$
\begin{aligned}
\big[\phi_i(x),\phi_j(y)\big]_{x^0=y^0} &= 0 , \\
\big[\pi_i(x),\pi_j(y)\big]_{x^0=y^0} &= 0 , \\
\big[\phi_i(x),\pi_j(y)\big]_{x^0=y^0} &= i\,\delta_{ij}\,\delta(\mathbf{x}-\mathbf{y}) ,
\end{aligned}
\tag{1.289}
$$

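we find the following equal-time commutator between components of the conserved
currents:

$$
\begin{aligned}
\big[J^0_a(x),J^0_b(y)\big]_{x^0=y^0}
&= t^a_{ij}\, t^b_{kl}\, \big[\pi_i(x)\phi_j(x),\pi_k(y)\phi_l(y)\big] \\
&= i\,\delta(\mathbf{x}-\mathbf{y})\, \big[t^a,t^b\big]_{ij}\, \pi_i(x)\phi_j(x) .
\end{aligned}
\tag{1.290}
$$

Using also the commutation relation that defines the Lie bracket,

$$ \big[t^a,t^b\big] = i\, f_{abc}\, t^c , \tag{1.291} $$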

(the $f_{abc}$ are real numbers) we get

$$ \big[J^0_a(x),J^0_b(y)\big]_{x^0=y^0} = \delta(\mathbf{x}-\mathbf{y})\, f_{abc}\, J^0_c(x) . \tag{1.292} $$

By integrating over the positions $\mathbf{x}$ and $\mathbf{y}$, this becomes a commutator between the
conserved charges,

$$ \big[Q_a(x^0),Q_b(x^0)\big] = f_{abc}\, Q_c(x^0) . \tag{1.293} $$

In other words, the charges $Q_a(x^0)$ form a real representation of the Lie algebra. In
addition, the commutator between the conserved charges and the field operators is
given by$^{38}$:

$$
\begin{aligned}
\big[Q_a(x^0),\phi_i(x)\big]
&= i \int d^3\mathbf{y}\; \big[\pi_k(y)\, t^a_{kl}\, \phi_l(y),\phi_i(x)\big]_{x^0=y^0} \\
&= i \int d^3\mathbf{y}\; (-i)\,\delta(\mathbf{x}-\mathbf{y})\,\delta_{ki}\, t^a_{kl}\, \phi_l(x) \\
&= t^a_{ij}\, \phi_j(x) .
\end{aligned}
\tag{1.294}
$$

Note that the above commutation relations are not affected by the spontaneous
breaking of symmetry, since they follow from the properties of the field operators,
regardless of the nature of the ground state of the system.

1.15.4 Ground state

The ground state of the system is characterized by the expectation values of the field
operators:

$$ \langle\phi_i\rangle \equiv \langle 0|\phi_i(x)|0\rangle . \tag{1.295} $$

In order to see whether the ground state is invariant under the action of the symmetry
transformations, let us study the variation of the quantities $\langle\phi_i\rangle$:

$$
\begin{aligned}
\delta\langle\phi_i\rangle
&= \langle 0|\delta\phi_i(x)|0\rangle \\
&= i\,\epsilon_a\, t^a_{ij}\, \langle 0|\phi_j(x)|0\rangle \\
&= i\,\epsilon_a\, \langle 0|\big[Q_a(x^0),\phi_i(x)\big]|0\rangle \\
&= i\,\epsilon_a\, \langle 0|Q_a\,\phi_i(x) - \phi_i(x)\,Q_a|0\rangle .
\end{aligned}
\tag{1.296}
$$

Thus, it is clear that these expectation values are invariant if the ground state is
annihilated by all the generators of the Lie algebra (i.e. if $Q_a|0\rangle = 0$ for all $a$).

38 Since the charges are conserved, we are free to evaluate them at the same time as the field φi .

1.15.5 Spectral properties

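Consider now the expectation value in the ground state of the commutator between
the conserved currents and the field operators:

$$
\begin{aligned}
\langle 0|\big[J^\mu_a(x),\phi_i(y)\big]|0\rangle
&= \langle 0|\big[J^\mu_a(x-y),\phi_i(0)\big]|0\rangle \\
&= \sum_N \int d^4p\; \delta(p-p_N) \Big[ \langle 0|J^\mu_a(x-y)|N\rangle \langle N|\phi_i(0)|0\rangle \\
&\qquad\qquad - \langle 0|\phi_i(0)|N\rangle \langle N|J^\mu_a(x-y)|0\rangle \Big] \\
&= \sum_N \int d^4p\; \delta(p-p_N) \Big[ \langle 0|J^\mu_a(0)|N\rangle \langle N|\phi_i(0)|0\rangle\, e^{ip\cdot(x-y)} \\
&\qquad\qquad - \langle 0|\phi_i(0)|N\rangle \langle N|J^\mu_a(0)|0\rangle\, e^{ip\cdot(y-x)} \Big] .
\end{aligned}
\tag{1.297}
$$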

In the second line, we have summed over a complete set of states $|N\rangle$, arranged
according to their 4-momentum $p_N$. We have also used the translation invariance
of the ground state, and the properties of states with a definite momentum under
translations. If we define

$$
\begin{aligned}
i\,F^\mu_{a,i}(p) &\equiv (2\pi)^3 \sum_N \delta(p-p_N)\, \langle 0|J^\mu_a(0)|N\rangle \langle N|\phi_i(0)|0\rangle , \\
i\,\widetilde{F}^\mu_{a,i}(p) &\equiv (2\pi)^3 \sum_N \delta(p-p_N)\, \langle 0|\phi_i(0)|N\rangle \langle N|J^\mu_a(0)|0\rangle ,
\end{aligned}
\tag{1.298}
$$

we have

$$ F^\mu_{a,i}(p) = -\big(\widetilde{F}^\mu_{a,i}(p)\big)^* , \tag{1.299} $$

since $J^\mu_a$ and $\phi_i$ are Hermitean. Moreover, Lorentz invariance implies that these
objects have the following form:

$$
\begin{aligned}
F^\mu_{a,i}(p) &= p^\mu\, \theta(p^0)\, \rho_{a,i}(p^2) , \\
\widetilde{F}^\mu_{a,i}(p) &= p^\mu\, \theta(p^0)\, \widetilde{\rho}_{a,i}(p^2) ,
\end{aligned}
\tag{1.300}
$$

where $\rho_{a,i}$ and $\widetilde{\rho}_{a,i}$ are functions (so far unspecified) depending only on the invariant
$p^2$. The factor $\theta(p^0)$ follows from the fact that the physical states $|N\rangle$ have a positive
energy. Then, by inserting a unit factor given by

$$ 1 = \int ds\; \delta(p^2-s) , \tag{1.301} $$

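we obtain

$$ \langle 0|\big[J^\mu_a(x),\phi_i(y)\big]|0\rangle = -\partial^\mu_x \int ds\, \Big[ \rho_{a,i}(s)\, \Delta(x-y;s) + \widetilde{\rho}_{a,i}(s)\, \Delta(y-x;s) \Big] , \tag{1.302} $$

where we denote

$$ \Delta(x-y;s) \equiv \int \frac{d^4p}{(2\pi)^4}\; 2\pi\,\theta(p^0)\, \delta(p^2-s)\, e^{ip\cdot(x-y)} . \tag{1.303} $$

This function obeys the Klein-Gordon equation with the squared “mass” $s$:

$$ (\square_x + s)\,\Delta(x-y;s) = 0 , \tag{1.304} $$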

and is Lorentz invariant. When the interval $x-y$ is space-like, it cannot depend
separately on $x^0-y^0$ since the sign of $x^0-y^0$ is not invariant for a space-like
separation. Therefore, for such an interval, it can depend only on $(x-y)^2$ and $s$, and
we have

$$ \Delta(x-y;s) = \Delta(y-x;s) \quad\text{if } (x-y)^2 < 0 . \tag{1.305} $$

Therefore,

$$ \langle 0|\big[J^\mu_a(x),\phi_i(y)\big]|0\rangle = -\partial^\mu_x \int ds\, \Big[ \rho_{a,i}(s) + \widetilde{\rho}_{a,i}(s) \Big]\, \Delta(y-x;s) \quad\text{if } (x-y)^2 < 0 . \tag{1.306} $$

Since the commutator in the left hand side vanishes for local operators with a space-like
separation, we get$^{39}$:

$$ \rho_{a,i}(s) + \widetilde{\rho}_{a,i}(s) = 0 . \tag{1.307} $$

Returning to the case of a generic interval $x-y$, we thus have:

$$ \langle 0|\big[J^\mu_a(x),\phi_i(y)\big]|0\rangle = -\partial^\mu_x \int ds\; \rho_{a,i}(s)\, \Big[ \Delta(x-y;s) - \Delta(y-x;s) \Big] . \tag{1.308} $$

By applying the derivative $\partial^x_\mu$ to both sides of this equation, and using the Klein-Gordon
equation and the fact that the current $J^\mu_a(x)$ is conserved, we get

$$ 0 = \int ds\; s\, \rho_{a,i}(s)\, \Big[ \Delta(x-y;s) - \Delta(y-x;s) \Big] , \tag{1.309} $$

39 This property, combined with eq. (1.299), implies that $\rho_{a,i}(p^2)$ is real.

which implies

$$ s\, \rho_{a,i}(s) = 0 . \tag{1.310} $$

Therefore, $\rho_{a,i}(s) = 0$ for all $s \neq 0$, and the only possible support of $\rho_{a,i}(s)$ is
localized at $s = 0$ (in the form of a delta function so that integrals over $s$ are non-zero).
Let us now show that it is not possible that $\rho_{a,i}(s)$ be identically zero everywhere
(including at $s = 0$) when the symmetry is spontaneously broken. By setting $\mu = 0$
and $x^0 = y^0$, and using eq. (1.303), we obtain

$$
\begin{aligned}
\langle 0|\big[J^0_a(x),\phi_i(y)\big]|0\rangle_{x^0=y^0}
&= 2i \int ds\, d^4p\; \sqrt{\mathbf{p}^2+s}\;\, \rho_{a,i}(s)\, e^{ip\cdot(x-y)}\, \delta(p^2-s) \\
&= i\,\delta(\mathbf{x}-\mathbf{y}) \int ds\; \rho_{a,i}(s) .
\end{aligned}
\tag{1.311}
$$

Then, we can integrate over $\mathbf{x}$ and use the commutation relation (1.294) in order to
get

$$ t^a_{ij}\, \langle\phi_j\rangle = i \int ds\; \rho_{a,i}(s) . \tag{1.312} $$

Thus, the functions $\rho_{a,i}(s)$ that have a non-zero integral are in one-to-one correspondence
with the non-zero $t^a_{ij}\langle\phi_j\rangle$, i.e. with the fact that the ground state is not
invariant under the action of some of the symmetry generators. When this happens,
we must have

$$ \rho_{a,i}(s) = -i\,\delta(s)\, t^a_{ij}\, \langle\phi_j\rangle . \tag{1.313} $$

This equation is the essence of Goldstone's theorem. Note now that $\rho_{a,i}$ is a spectral
function similar to the one defined in section 1.9. Therefore, the presence of
a $\delta(s)$ in this function signals the existence of a one-particle state with zero mass
in the sum of eq. (1.298) (multiparticle states with a null total momentum would
produce a continuum extending down to $s = 0$ rather than a delta function). Moreover,
this result indicates that there are as many such massless particles (called Nambu-Goldstone
modes) as there are symmetries broken by the ground state.
Finally, let us note that the state $\phi_i(0)|0\rangle$ is invariant under rotations, which
implies that the matrix element $\langle N|\phi_i(0)|0\rangle$ is zero unless the state $|N\rangle$ has a
vanishing helicity. Thus, only spin 0 particles can contribute to the $\delta(s)$ in the non-zero
spectral functions. Moreover, $\langle 0|J^0_a(0)|N\rangle$ vanishes for any state $|N\rangle$ whose
quantum numbers differ from those of $J^0_a$. Thus, the Nambu-Goldstone modes are
spin-0 particles that have the same internal quantum numbers as $J^0_a$.

1.16 Perturbative unitarity


Unitarity is one of the pillars of quantum mechanics, since it is tightly related to the
conservation of probability. A completely general consequence of unitarity is the
optical theorem, whose perturbative translation becomes manifest in the so-called
Cutkosky's cutting rules.

1.16.1 Optical theorem


The “S-matrix” is the name given to the evolution operator that relates the in and out
states:

$$ \langle\alpha_{\rm out}| \equiv \langle\alpha_{\rm in}|\, S . \tag{1.314} $$

In a unitary field theory, the S-matrix is a unitary operator on the space of physical
states:

$$ S S^\dagger = S^\dagger S = 1 . \tag{1.315} $$

This property means that for a properly normalized initial physical state $|\alpha_{\rm in}\rangle$, we
have

$$ \sum_{\text{states } \beta} \big| \langle\beta_{\rm out}|\alpha_{\rm in}\rangle \big|^2 = 1 , \tag{1.316} $$

where the sum includes only physical states. In other words, in any interaction process,
the state $\alpha$ must evolve with probability one into other physical states. In general, one
subtracts from the S-matrix the identity operator, that corresponds to the absence of
interactions, and one writes:

$$ S \equiv 1 + iT . \tag{1.317} $$

Therefore, one has

$$ 1 = (1+iT)(1-iT^\dagger) = 1 + iT - iT^\dagger + TT^\dagger , \tag{1.318} $$

or equivalently

$$ -i\,(T - T^\dagger) = TT^\dagger . \tag{1.319} $$

Let us now take the expectation value of this identity in the state $|\alpha_{\rm in}\rangle$, and insert the
identity operator written as a complete sum over physical states between $T$ and $T^\dagger$ in
the right hand side. This leads to:

$$ -i\, \langle\alpha_{\rm in}|T - T^\dagger|\alpha_{\rm in}\rangle = \sum_{\text{states } \beta} \big| \langle\alpha_{\rm in}|T|\beta_{\rm in}\rangle \big|^2 . \tag{1.320} $$

Equivalently, this identity reads

$$ {\rm Im}\, \langle\alpha_{\rm in}|T|\alpha_{\rm in}\rangle = \frac{1}{2} \sum_{\text{states } \beta} \big| \langle\alpha_{\rm in}|T|\beta_{\rm in}\rangle \big|^2 . \tag{1.321} $$

This identity is known as the optical theorem. It implies that the total probability to
scatter from the state $\alpha$ to any state $\beta$ equals twice the imaginary part of the forward
transition amplitude $\alpha \to \alpha$.
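
As a quick numerical illustration of eq. (1.321), the sketch below builds a toy two-channel unitary matrix, extracts $T$ from $S = 1 + iT$, and compares both sides of the optical theorem. The specific parametrization of $S$ (angles `theta` and `alpha`) and all function names are assumptions chosen for convenience; any unitary matrix would do.

```python
import cmath
import math


def two_channel_s_matrix(theta, alpha):
    """A toy 2x2 unitary matrix (assumed parametrization, for illustration)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * cmath.exp(1j * alpha), 1j * s],
            [1j * s, c * cmath.exp(-1j * alpha)]]


def t_matrix(s_matrix):
    """T defined by S = 1 + i T, i.e. T = -i (S - 1)."""
    n = len(s_matrix)
    return [[-1j * (s_matrix[a][b] - (1.0 if a == b else 0.0))
             for b in range(n)] for a in range(n)]


def optical_theorem_sides(t, a):
    """Return (Im T_aa, 1/2 * sum_b |T_ab|^2) for the initial state a."""
    lhs = t[a][a].imag
    rhs = 0.5 * sum(abs(t[a][b]) ** 2 for b in range(len(t)))
    return lhs, rhs


t = t_matrix(two_channel_s_matrix(theta=0.7, alpha=0.3))
for a in range(2):
    print(optical_theorem_sides(t, a))  # the two numbers of each pair agree
```

The agreement relies only on the unitarity of $S$: replacing the matrix by a non-unitary one breaks the equality.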

1.16.2 Cutkosky’s cutting rules

Eq. (1.321) is valid to all orders in the interactions. But as we shall see it also
manifests itself in some properties of the perturbative expansion. Let us first consider
as an example a scalar field theory, with a cubic interaction in $-\frac{i}{3!}\lambda\phi^3(x)$.
Firstly, decompose the free Feynman propagator in two terms, depending on the
ordering between the times at the two endpoints:

$$ G^0_F(x,y) \equiv \theta(x^0-y^0)\, G^0_{-+}(x,y) + \theta(y^0-x^0)\, G^0_{+-}(x,y) . \tag{1.322} $$

The 2-point functions $G^0_{-+}$ and $G^0_{+-}$ are therefore defined as

$$ G^0_{-+}(x,y) \equiv \langle 0_{\rm in}|\phi_{\rm in}(x)\phi_{\rm in}(y)|0_{\rm in}\rangle , \qquad G^0_{+-}(x,y) \equiv \langle 0_{\rm in}|\phi_{\rm in}(y)\phi_{\rm in}(x)|0_{\rm in}\rangle . \tag{1.323} $$

In order to streamline the notations, it is convenient to rename $G^0_F$ as $G^0_{++}$, and to
introduce another propagator with a reversed time ordering:

$$ G^0_{--}(x,y) \equiv \theta(x^0-y^0)\, G^0_{+-}(x,y) + \theta(y^0-x^0)\, G^0_{-+}(x,y) . \tag{1.324} $$

The usual Feynman rules in coordinate space amount to connecting a vertex at $x$
and a vertex at $y$ by the propagator $G^0_{++}(x,y)$. The coordinate $x$ of each vertex is
integrated out over all space-time, and a factor $-i\lambda$ is attached to each vertex. We
will call $+$ this type of vertex. Thus, the Feynman rules for calculating transition
amplitudes involve only the $+$ vertex and the $G^0_{++}$ propagator.
Let us then introduce a vertex of type $-$, to which a factor $+i\lambda$ is assigned (instead
of $-i\lambda$ for the vertex of type $+$). The integrand of a Feynman graph $G$ is a function
$G(x_1,x_2,\cdots)$ of the coordinates $x_i$ of its vertices. We will generalize this function
by assigning $+$ or $-$ indices to all the vertices,

$$ G(x_1,x_2,\cdots) \to G_{\epsilon_1\epsilon_2\cdots}(x_1,x_2,\cdots) , \tag{1.325} $$


where the indices $\epsilon_i = \pm$ indicate the type of the $i$-th vertex. The usual
Feynman rules thus correspond to the function $G_{++\cdots}$. These generalized integrands
are constructed according to the following rules:

    + vertex : $-i\lambda$ ,
    − vertex : $+i\lambda$ ,
    Propagator from $\epsilon$ to $\epsilon'$ : $G^0_{\epsilon\epsilon'}(x,y)$ .      (1.326)

Let us assume that the $i$-th vertex carries the largest time among all the vertices
of the graph. Since $x^0_i$ is larger than all the other times, the propagator that
connects this vertex to an adjacent vertex of type $\epsilon$ at the position $x$ is given by

$$ G^0_{\pm\epsilon}(x_i,x) = G^0_{-\epsilon,\epsilon}(x_i,x) , \tag{1.327} $$

where the first index $-\epsilon$ denotes the type opposite to $\epsilon$. In other words, this
propagator depends only on the type $\epsilon$ of the neighboring vertex,
but not on the type of the $i$-th vertex. Therefore, we have

$$ G_{\cdots[+_i]\cdots}(x_1,x_2,\cdots) + G_{\cdots[-_i]\cdots}(x_1,x_2,\cdots) = 0 , \tag{1.328} $$

where the notation $[\pm_i]$ indicates that the $i$-th vertex has type $+$ or $-$ (the types of the
vertices not written explicitly are the same in the two terms, but otherwise arbitrary).
This identity, known as the largest time equation, follows from eq. (1.327) and from
the sign change when a vertex changes from $+$ to $-$.
A similar identity also applies to the sum extended to all the possible assignments
of the $+$ and $-$ indices:

$$ \sum_{\{\epsilon_i=\pm\}} G_{\epsilon_1\epsilon_2\cdots}(x_1,x_2,\cdots) = 0 . \tag{1.329} $$

This is obtained by pairing the terms and using eq. (1.328). It is crucial to observe
that this identity is now valid for any ordering of the times at the vertices of the graph.
Therefore, it is also valid in momentum space after a Fourier transform. If we isolate
the two terms where all the vertices are of type $+$ or all of type $-$, this also reads

$$ G_{++\cdots} + G_{--\cdots} = - \sum_{\{\epsilon_i=\pm\}'} G_{\epsilon_1\epsilon_2\cdots} , \tag{1.330} $$

where the symbol $\{\epsilon_i=\pm\}'$ indicates the set of all the vertex assignments, except
$++\cdots$ and $--\cdots$.
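
The cancellation in eqs. (1.328) and (1.329) can be checked numerically on a toy model. The sketch below is not the field theory of the text: it assumes a single mode of frequency `omega` in 0+1 dimensions, for which $G^0_{-+}(t_1,t_2) = e^{-i\omega(t_1-t_2)}/(2\omega)$, and it evaluates the integrand of a graph (here a triangle) for every assignment of $\pm$ types, using the rules (1.326) together with the definitions (1.322) and (1.324). All function names, the graph and the numerical values are illustrative assumptions.

```python
import cmath
import itertools


def cut_propagator(t1, t2, omega):
    """Toy G_{-+}(t1, t2) = <phi(t1) phi(t2)> for a single mode (assumption)."""
    return cmath.exp(-1j * omega * (t1 - t2)) / (2.0 * omega)


def propagator(e1, e2, t1, t2, omega):
    """G_{e1 e2}(t1, t2), built from the cut propagators as in (1.322)-(1.324)."""
    g_mp = cut_propagator(t1, t2, omega)  # G_{-+}(t1, t2)
    g_pm = cut_propagator(t2, t1, omega)  # G_{+-}(t1, t2) = G_{-+}(t2, t1)
    if e1 == e2 == "+":
        return g_mp if t1 > t2 else g_pm  # time-ordered
    if e1 == e2 == "-":
        return g_pm if t1 > t2 else g_mp  # anti-time-ordered
    return g_mp if e1 == "-" else g_pm


def integrand(types, times, edges, lam, omega):
    """Product of vertex factors (-/+ i*lam) and of propagators on the edges."""
    value = 1.0 + 0.0j
    for e in types:
        value *= (-1j * lam) if e == "+" else (1j * lam)
    for i, j in edges:
        value *= propagator(types[i], types[j], times[i], times[j], omega)
    return value


def sum_over_assignments(times, edges, lam=1.0, omega=1.0):
    """Left hand side of eq. (1.329) for the given vertex times."""
    return sum(integrand(types, times, edges, lam, omega)
               for types in itertools.product("+-", repeat=len(times)))


triangle_times = (0.3, 1.7, -0.4)            # arbitrary vertex times
triangle_edges = [(0, 1), (1, 2), (2, 0)]
print(abs(sum_over_assignments(triangle_times, triangle_edges)))  # ~ 0
```

Individual terms are non-zero; only the sum over assignments vanishes, and it does so for any choice of the vertex times, because flipping the type of the vertex with the largest time changes nothing but the sign of its vertex factor.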
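Using eq. (1.119),

$$ G^0_{++}(x,y) = \int \frac{d^3\mathbf{p}}{(2\pi)^3 2E_{\mathbf{p}}} \Big[ \theta(x^0-y^0)\, e^{-ip\cdot(x-y)} + \theta(y^0-x^0)\, e^{+ip\cdot(x-y)} \Big] , \tag{1.331} $$

and comparing with eq. (1.322), we can read off the following representations for
$G^0_{-+}$ and $G^0_{+-}$,

$$
\begin{aligned}
G^0_{-+}(x,y) &= \int \frac{d^3\mathbf{p}}{(2\pi)^3 2E_{\mathbf{p}}}\; e^{-ip\cdot(x-y)} , \\
G^0_{+-}(x,y) &= \int \frac{d^3\mathbf{p}}{(2\pi)^3 2E_{\mathbf{p}}}\; e^{+ip\cdot(x-y)} .
\end{aligned}
\tag{1.332}
$$

Likewise, we obtain

$$ G^0_{--}(x,y) = \int \frac{d^3\mathbf{p}}{(2\pi)^3 2E_{\mathbf{p}}} \Big[ \theta(x^0-y^0)\, e^{+ip\cdot(x-y)} + \theta(y^0-x^0)\, e^{-ip\cdot(x-y)} \Big] . \tag{1.333} $$

Note that $G^0_{++} + G^0_{--} = G^0_{-+} + G^0_{+-}$.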


Using the following representation for step functions:

$$
\begin{aligned}
\theta(x^0-y^0) &= \int \frac{dp_0}{2\pi}\; \frac{-i}{p_0 - i0^+}\; e^{ip_0(x^0-y^0)} , \\
\theta(y^0-x^0) &= \int \frac{dp_0}{2\pi}\; \frac{i}{p_0 + i0^+}\; e^{ip_0(x^0-y^0)} ,
\end{aligned}
\tag{1.334}
$$

we can derive the momentum space expressions of these propagators:

$$
\begin{aligned}
G^0_{++}(p) &= \frac{i}{p^2-m^2+i0^+} , \\
G^0_{--}(p) &= \frac{-i}{p^2-m^2-i0^+} = \big[ G^0_{++}(p) \big]^* , \\
G^0_{-+}(p) &= 2\pi\, \theta(+p^0)\, \delta(p^2-m^2) , \\
G^0_{+-}(p) &= 2\pi\, \theta(-p^0)\, \delta(p^2-m^2) .
\end{aligned}
\tag{1.335}
$$

Therefore, the momentum space Feynman rules for the $-$ sector are the complex
conjugate of those for the $+$ sector, since we have also $+i\lambda = (-i\lambda)^*$. Note that for
this assertion to be true, it is crucial that the coupling constant $\lambda$ be real, which is a
condition for unitarity.
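
The relation $G^0_{++} + G^0_{--} = G^0_{-+} + G^0_{+-}$ can be made concrete in momentum space. In the sketch below, $0^+$ is replaced by a small finite `eps` (an assumption made for the numerics) and $x$ stands for $p^2 - m^2$. The sum $G^0_{++} + G^0_{--}$ is then the real function $2\,\mathrm{eps}/(x^2 + \mathrm{eps}^2)$, whose integral over $x$ approaches $2\pi$, as expected from $2\pi\,\delta(p^2-m^2)\,[\theta(p^0)+\theta(-p^0)]$. Function names and numerical parameters are illustrative.

```python
import math


def g_plus_plus(x, eps):
    """Regularized G_{++}: i / (x + i*eps), with x = p^2 - m^2."""
    return 1j / (x + 1j * eps)


def g_minus_minus(x, eps):
    """Regularized G_{--}: -i / (x - i*eps), the complex conjugate of G_{++}."""
    return -1j / (x - 1j * eps)


def integrated_sum(eps, half_width, n_steps):
    """Trapezoidal integral of G_{++} + G_{--} over x in [-half_width, half_width]."""
    h = 2.0 * half_width / n_steps
    total = 0.0
    for n in range(n_steps + 1):
        x = -half_width + n * h
        value = g_plus_plus(x, eps) + g_minus_minus(x, eps)
        weight = 0.5 if n in (0, n_steps) else 1.0
        total += weight * value.real * h
    return total


print(integrated_sum(eps=0.1, half_width=100.0, n_steps=100000), 2.0 * math.pi)
```

The small residual difference with $2\pi$ comes from truncating the integration range and shrinks as `half_width / eps` grows.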
The Fourier transform of an amputated Feynman graph $G$ gives a contribution to a
transition amplitude (recall the LSZ reduction formula), i.e. a matrix element of the $S$
operator. Consequently, $\Gamma \equiv iG$ gives a matrix element of the $T$ operator and, after
Fourier transform, eq. (1.330) becomes

$$ {\rm Im}\, \Gamma_{++\cdots} = \frac{1}{2} \sum_{\{\epsilon_i=\pm\}'} \big[ i\,\Gamma_{\epsilon_1\epsilon_2\cdots} \big] . \tag{1.336} $$

If the graph contains $N$ vertices, there are a priori $2^N - 2$ terms in the right hand side
of this equation. However, this number is considerably reduced if we notice that the
$+-$ and $-+$ propagators can carry energy only in one direction (from the $+$ vertex to
the $-$ vertex), because of the factors $\theta(\pm p^0)$. This constraint on energy flow forbids
“islands” of vertices of type $+$ surrounded by only type $-$ vertices, or the reverse.
From the LSZ reduction formula (1.82) and the definition (1.120) of the Fourier
transformed propagators, we see that the notation $G_{-+}(p)$ implies a momentum $p$
defined as flowing from the $+$ endpoint to the $-$ endpoint:

    $G_{-+}(p)$ = [diagram: line carrying the momentum $p$ from its + endpoint to its − endpoint] .      (1.337)

Thus, the proportionality $G_{-+}(p) \propto \theta(p^0)$ indicates that the energy flows from the
$+$ endpoint to the $-$ endpoint.
Let us consider the example of a very simple 1-loop two-point function$^{40}$ $\Gamma(p)$,

    $-i\Gamma(p)$ = [diagram: 1-loop two-point graph, with the momentum $p$ entering from the left] .      (1.338)

Because of the constrained energy flow direction in the propagators $G_{-+}$, $G_{+-}$, if
the momentum $p$ is entering into the graph from the left with $p^0 > 0$, the only
assignments that mix $+$ and $-$ vertices must divide the graph into two connected
subgraphs: a connected part made only of $+$ vertices that comprises the vertex
where $p^0 > 0$ enters in the graph, and a connected part containing only $-$ vertices
comprising the vertex where the energy leaves the graph. For the topology shown in
eq. (1.338), there is only one possibility,

    $-i\Gamma_{+-}(p)$ = [diagram: same graph, with the right vertex circled and a gray line cutting the two internal propagators] ,      (1.339)

where the vertex of type $-$ is circled in the diagrammatic representation. The division
of the graph into these two subgraphs may be materialized by drawing a line (shown
in gray above) through the graph. This line is called a cut, and the rules for calculating
the value of a graph with a given assignment of $+$ and $-$ vertices are called Cutkosky's
cutting rules. For instance, in the case of the above example, they lead immediately
to the following expression$^{41}$ for the imaginary part of $\Gamma_{++}$,

$$ {\rm Im}\, \Gamma_{++}(p) = \frac{\lambda^2}{2}\, \frac{1}{2} \int \frac{d^4k}{(2\pi)^4}\; G_{-+}(k)\, G_{-+}(p-k) , \tag{1.340} $$

that can be rewritten as

$$
\begin{aligned}
{\rm Im}\, \Gamma_{++}(p) = \frac{\lambda^2}{4}
&\int \frac{d^4k_1}{(2\pi)^4}\; 2\pi\,\theta(k_1^0)\,\delta(k_1^2-m^2) \\
\times &\int \frac{d^4k_2}{(2\pi)^4}\; 2\pi\,\theta(k_2^0)\,\delta(k_2^2-m^2)\; (2\pi)^4\, \delta(p-k_1-k_2) .
\end{aligned}
\tag{1.341}
$$

In the right hand side of this equation, we recognize the square of the transition
amplitude $\langle \mathbf{k}_1\mathbf{k}_{2\,{\rm out}}|\mathbf{p}_{\rm in}\rangle$ (whose value at tree level is simply $\lambda$), integrated over the
(symmetrized) accessible phase-space for a 2-particle final state. We can therefore
view this equation as a perturbative realization of the optical theorem at order $\lambda^2$.
Indeed, at this order, the only states $\beta$ that may be included in the sum over final
states are 2-particle states$^{42}$.

40 Momentum conservation implies that it depends on a single momentum $p$.
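
For orientation, the integral in eq. (1.341) can be evaluated explicitly. The sketch below rests on a calculation that is not in the text: in the rest frame of $p$, with $s \equiv p^2 > 4m^2$, the two delta functions and the momentum conservation constraint reduce the right hand side to the standard two-body phase space, which gives ${\rm Im}\,\Gamma_{++} = \frac{\lambda^2}{32\pi}\sqrt{1-4m^2/s}$ (and zero below threshold). The code compares this closed form with a direct radial integration in which the energy-conserving delta function is smeared into a narrow Gaussian of width `sigma`. Function names and numerical parameters are illustrative assumptions.

```python
import math


def im_gamma_closed_form(s, m, lam):
    """lam^2/(32 pi) * sqrt(1 - 4 m^2 / s) above threshold, 0 below (assumed result)."""
    if s <= 4.0 * m * m:
        return 0.0
    return lam**2 / (32.0 * math.pi) * math.sqrt(1.0 - 4.0 * m * m / s)


def im_gamma_numeric(s, m, lam, sigma=0.01, n_steps=40000):
    """Radial integral of eq. (1.341) in the rest frame of p.

    After using the 3-momentum delta function, what remains is
    (1 / 4 pi) * int dk k^2 / E^2 * delta(sqrt(s) - 2 E),  E = sqrt(k^2 + m^2).
    The delta function is approximated by a Gaussian of width sigma.
    """
    sqrt_s = math.sqrt(s)
    k_max = sqrt_s  # safely above the on-shell value sqrt(s/4 - m^2)
    h = k_max / n_steps
    total = 0.0
    for n in range(n_steps):
        k = (n + 0.5) * h  # midpoint rule
        energy = math.sqrt(k * k + m * m)
        z = sqrt_s - 2.0 * energy
        delta = math.exp(-0.5 * (z / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        total += (k * k) / (energy * energy) * delta * h
    phase_space = total / (4.0 * math.pi)
    return lam**2 / 4.0 * phase_space


print(im_gamma_closed_form(9.0, 1.0, 1.0), im_gamma_numeric(9.0, 1.0, 1.0))
```

The factor $\sqrt{1-4m^2/s}$ is the velocity of the final particles in the rest frame, so the imaginary part switches on at the two-particle threshold $p^2 = 4m^2$.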
The considerations developed on this example can be generalized to the 2-point
function at any loop order. We can write

$$ {\rm Im}\, \Gamma_{++}(p) = \frac{1}{2} \sum_{\text{cuts } \gamma} \big( i\,\Gamma_\gamma(p) \big) , \tag{1.342} $$

where the sum is now reduced to a sum over all the possible cuts (with the $+$
vertices left of the cut and the $-$ vertices right of the cut). As an illustration, let us
consider the following 2-loop example, for which three cuts are possible:

    ${\rm Im}\,\Gamma_{++}(p)$ = [diagram: sum of the three possible cuts of a 2-loop two-point graph] .      (1.343)

At this order, several contributions to the right hand side of eq. (1.321) start to appear:
the central cut corresponds to a 3-body final state, while the other two cuts correspond
to an interference between the tree level and the 1-loop correction to a 2-body decay.

41 The first factor 1/2 comes from eq. (1.336), and the second 1/2 is the symmetry factor of the graph for
a scalar loop. In the formula for ${\rm Im}\,\Gamma_{++}$, it has the interpretation of the factor that symmetrizes a 2-particle
final state.
42 This result is consistent with the formula (1.100) for a decay rate, if we note that the decay rate $\Gamma$ of a
particle is related to the imaginary part of the corresponding self-energy by $\Gamma = {\rm Im}\,\Gamma_{++}(p)/E_{\mathbf{p}}$. This
can be seen as follows: after resumming the self-energy $\Gamma_{++}(p)$ on the propagator, the imaginary part
makes it decay as $G_{++}(x,y) \sim \exp(-({\rm Im}\,\Gamma_{++})|x^0-y^0|/2E_{\mathbf{p}})$, and the particle density, quadratic in the
field operator, decays as the square of the propagator.

1.16.3 Fermions

In the case of spin 1/2 fermions, the propagators connecting the various types of
vertices are given by

$$
\begin{aligned}
S^0_{++}(p) &= \frac{i(\slashed{p}+m)}{p^2-m^2+i0^+} , \\
S^0_{--}(p) &= \frac{-i(\slashed{p}+m)}{p^2-m^2-i0^+} , \\
S^0_{-+}(p) &= 2\pi\, (\slashed{p}+m)\, \theta(+p^0)\, \delta(p^2-m^2) , \\
S^0_{+-}(p) &= 2\pi\, (\slashed{p}+m)\, \theta(-p^0)\, \delta(p^2-m^2) .
\end{aligned}
\tag{1.344}
$$

The cutting rules for fermions are therefore similar to those for scalar particles. The
possibility to interpret the cut fermion propagators in terms of on-shell final state
fermions is a consequence of the following identities:

$$
\begin{aligned}
\slashed{p}+m &= \sum_{\text{spin } s} u_s(\mathbf{p})\, \overline{u}_s(\mathbf{p}) , \\
\slashed{p}-m &= \sum_{\text{spin } s} v_s(\mathbf{p})\, \overline{v}_s(\mathbf{p}) ,
\end{aligned}
\tag{1.345}
$$

that are valid when $p^0 = \sqrt{\mathbf{p}^2+m^2} > 0$. In the case of the propagator $S^0_{-+}(p)$, we
may attach the spinor $u_s(\mathbf{p})$ to the amplitude on the right of the cut, and the spinor
$\overline{u}_s(\mathbf{p})$ to the amplitude on the left, which are precisely the spinors required by the
LSZ formula for a fermion of momentum $\mathbf{p}$ in the final state. In the case of $S^0_{+-}(p)$,
for which $p^0 < 0$, we should first write

$$
\begin{aligned}
S^0_{+-}(p) &= -2\pi\, (-\slashed{p}-m)\, \theta(-p^0)\, \delta(p^2-m^2) \\
&= -2\pi \sum_{\text{spin } s} v_s(-\mathbf{p})\, \overline{v}_s(-\mathbf{p})\, \theta(-p^0)\, \delta(p^2-m^2) ,
\end{aligned}
\tag{1.346}
$$

in order to see that it corresponds to an anti-fermion in the final state.

1.16.4 Photons
Coulomb gauge : For photons in Coulomb gauge, the reasoning is very similar to
the case of fermions. Firstly, the four different types of propagators read

$$
\begin{aligned}
G^{0\,ij}_{++}(p) &= \frac{i\left(\delta^{ij} - \frac{p^i p^j}{\mathbf{p}^2}\right)}{p^2+i0^+} , \\
G^{0\,ij}_{--}(p) &= \frac{-i\left(\delta^{ij} - \frac{p^i p^j}{\mathbf{p}^2}\right)}{p^2-i0^+} , \\
G^{0\,ij}_{-+}(p) &= 2\pi\, \theta(+p^0) \left(\delta^{ij} - \frac{p^i p^j}{\mathbf{p}^2}\right) \delta(p^2) , \\
G^{0\,ij}_{+-}(p) &= 2\pi\, \theta(-p^0) \left(\delta^{ij} - \frac{p^i p^j}{\mathbf{p}^2}\right) \delta(p^2) .
\end{aligned}
\tag{1.347}
$$

Recalling also that

$$ \sum_{\lambda=\pm} \epsilon^{i*}_\lambda(\mathbf{p})\, \epsilon^j_\lambda(\mathbf{p}) = \delta^{ij} - \frac{p^i p^j}{\mathbf{p}^2} , \tag{1.348} $$

we see that the projector that appears in the cut propagators can be interpreted as the
sum over the polarization vectors that should be attached to the amplitudes for each final
state photon. Therefore, the cutting rules in Coulomb gauge have a direct interpretation in
terms of the optical theorem. This simplicity follows from the fact that the only propagating
modes are physical modes in Coulomb gauge.

Feynman gauge : This interpretation is not so direct in covariant gauges, such as
the Feynman gauge. In this gauge, the free photon propagator is given by:

$$ G^{0\,\mu\nu}_{++}(p) = -g^{\mu\nu}\, \frac{i}{p^2+i0^+} . \tag{1.349} $$

The factor $-g^{\mu\nu}$ does not change anything in the cutting rules, and simply appears as
a prefactor in all propagators:

$$
\begin{aligned}
G^{0\,\mu\nu}_{--}(p) &= -g^{\mu\nu}\, \frac{-i}{p^2-i0^+} , \\
G^{0\,\mu\nu}_{-+}(p) &= -2\pi\, g^{\mu\nu}\, \theta(+p^0)\, \delta(p^2) , \\
G^{0\,\mu\nu}_{+-}(p) &= -2\pi\, g^{\mu\nu}\, \theta(-p^0)\, \delta(p^2) .
\end{aligned}
\tag{1.350}
$$

Let us assume for definiteness that the photon momentum $\mathbf{p}$ is in the $\hat{z}$ direction, i.e.
$\mathbf{p} = (0,0,p)$. Therefore, the two physical polarization vectors, orthogonal to $\mathbf{p}$, can
be chosen as follows

$$
\begin{aligned}
\epsilon^\mu_1(\mathbf{p}) &\equiv (0,1,0,0) , \\
\epsilon^\mu_2(\mathbf{p}) &\equiv (0,0,1,0) .
\end{aligned}
\tag{1.351}
$$

They are orthonormal

$$ \epsilon_\lambda(\mathbf{p}) \cdot \epsilon_{\lambda'}(\mathbf{p}) = -\delta_{\lambda\lambda'} , \tag{1.352} $$

and transverse: $p_\mu\, \epsilon^\mu_{1,2}(\mathbf{p}) = 0$. However, the tensor $-g^{\mu\nu}$ that appears in the cut
photon propagators cannot be written as a sum over physical polarizations:

$$ -g^{\mu\nu} \neq \sum_{\lambda=1,2} \epsilon^\mu_\lambda(\mathbf{p})\, \epsilon^\nu_\lambda(\mathbf{p}) . \tag{1.353} $$

Only the $\mu,\nu = 1,2$ components of these tensors are equal. As a consequence, it seems
that Cutkosky's cutting rules may lead to terms that we cannot interpret as physical
final photon states, which would violate the optical theorem. If this was the case, then
perturbation theory would not be consistent with unitarity. To see how this paradox is
resolved, let us introduce two more (unphysical) polarization vectors$^{43}$:

$$
\begin{aligned}
\epsilon^\mu_+(\mathbf{p}) &\equiv \frac{1}{\sqrt{2}}\, (1,0,0,1) , \\
\epsilon^\mu_-(\mathbf{p}) &\equiv \frac{1}{\sqrt{2}}\, (1,0,0,-1) ,
\end{aligned}
\tag{1.355}
$$

thanks to which we may now write

$$ g^{\mu\nu} = \epsilon^\mu_+(\mathbf{p})\, \epsilon^{\nu*}_-(\mathbf{p}) + \epsilon^\mu_-(\mathbf{p})\, \epsilon^{\nu*}_+(\mathbf{p}) - \sum_{\lambda=1,2} \epsilon^\mu_\lambda(\mathbf{p})\, \epsilon^{\nu*}_\lambda(\mathbf{p}) . \tag{1.356} $$

In other words, the physical polarization sum in the right hand side of eq. (1.353) is
equal to $-g^{\mu\nu}$, plus some extra terms that are proportional to $p^\mu$ or $p^\nu$.
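
The decomposition (1.356) is easy to verify component by component. The short sketch below does so for the explicit vectors of eqs. (1.351) and (1.355), assuming the metric signature $(+,-,-,-)$ implied by eq. (1.352); since all the vectors are real here, the complex conjugation is trivial. The function and variable names are illustrative.

```python
import math

# Metric g^{mu nu} with signature (+, -, -, -), as implied by eq. (1.352).
METRIC = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

R = 1.0 / math.sqrt(2.0)
EPS = {
    "1": (0.0, 1.0, 0.0, 0.0),   # physical polarizations, eq. (1.351)
    "2": (0.0, 0.0, 1.0, 0.0),
    "+": (R, 0.0, 0.0, R),       # unphysical polarizations, eq. (1.355)
    "-": (R, 0.0, 0.0, -R),
}


def outer(a, b):
    """Tensor a^mu b^nu as a 4x4 nested list."""
    return [[a[m] * b[n] for n in range(4)] for m in range(4)]


def physical_sum():
    """sum_{lambda=1,2} eps_lambda^mu eps_lambda^nu, the right side of (1.353)."""
    t1, t2 = outer(EPS["1"], EPS["1"]), outer(EPS["2"], EPS["2"])
    return [[t1[m][n] + t2[m][n] for n in range(4)] for m in range(4)]


def reconstructed_metric():
    """Right hand side of eq. (1.356)."""
    pm, mp = outer(EPS["+"], EPS["-"]), outer(EPS["-"], EPS["+"])
    phys = physical_sum()
    return [[pm[m][n] + mp[m][n] - phys[m][n] for n in range(4)]
            for m in range(4)]


for row in reconstructed_metric():
    print([round(x, 12) for x in row])
```

The printed matrix reproduces $g^{\mu\nu}$, while `physical_sum()` alone only matches $-g^{\mu\nu}$ in its $1,2$ block, which is the content of eq. (1.353).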
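When we use Cutkosky's cutting rules in order to calculate the imaginary part of a
graph, a cut photon line carrying the momentum $p^\mu$ leads to an expression that has
the following structure:

$$ \big( i\mathcal{M}_{1\mu}(p) \big)\, \big[ -g^{\mu\nu} \big]\, \big( i\mathcal{M}_{2\nu}(p) \big) , \tag{1.357} $$

where $i\mathcal{M}_{1\mu}$ and $i\mathcal{M}_{2\nu}$ are the amplitudes on the left and on the right of the cut,
respectively. Here, we have highlighted only one of the cut photons, and the other
cut lines have not been written explicitly since they do not play any role in the
argument. Moreover, only the tensor structure of the cut propagator matters, and we
have therefore only written the factor $-g^{\mu\nu}$. The above quantity can be rewritten as

$$
\begin{aligned}
&\big( i\mathcal{M}_{1\mu}(p) \big) \Big[ \sum_{\lambda=1,2} \epsilon^\mu_\lambda(\mathbf{p})\, \epsilon^{\nu*}_\lambda(\mathbf{p}) - \epsilon^\mu_+(\mathbf{p})\, \epsilon^{\nu*}_-(\mathbf{p}) - \epsilon^\mu_-(\mathbf{p})\, \epsilon^{\nu*}_+(\mathbf{p}) \Big] \big( i\mathcal{M}_{2\nu}(p) \big) \\
&\qquad = \big( i\mathcal{M}_{1\mu}(p) \big) \Big[ \sum_{\lambda=1,2} \epsilon^\mu_\lambda(\mathbf{p})\, \epsilon^{\nu*}_\lambda(\mathbf{p}) \Big] \big( i\mathcal{M}_{2\nu}(p) \big) .
\end{aligned}
\tag{1.358}
$$

Indeed, the last two terms are zero thanks to the Ward identity satisfied$^{44}$ by the
amplitudes $i\mathcal{M}_{1\mu}$ and $i\mathcal{M}_{2\nu}$:

$$ p^\mu\, \mathcal{M}_{1\mu}(p) = p^\nu\, \mathcal{M}_{2\nu}(p) = 0 , \tag{1.359} $$

and because $\epsilon^\mu_+(\mathbf{p})$ is proportional to $p^\mu$. Therefore, the non-physical photon degrees
of freedom, that may appear in the cutting rules in covariant gauges, are in fact
canceled by the Ward identities satisfied by QED amplitudes.

43 For an arbitrary momentum $\mathbf{p}$, these polarization vectors read:

$$
\begin{aligned}
\epsilon^\mu_+(\mathbf{p}) &\equiv \frac{1}{\sqrt{2}\,|\mathbf{p}|}\, (p^0, \mathbf{p}) , \\
\epsilon^\mu_-(\mathbf{p}) &\equiv \frac{1}{\sqrt{2}\,|\mathbf{p}|}\, (p^0, -\mathbf{p}) .
\end{aligned}
\tag{1.354}
$$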

1.16.5 Schwinger-Keldysh formalism


Perturbation theory provides a way of computing order by order transition amplitudes
like $\langle \mathbf{p}'\mathbf{q}'_{\rm out}|\mathbf{p}\mathbf{q}_{\rm in}\rangle$. Via the LSZ reduction formulas, the calculation of these
matrix elements reduces to that of the expectation value of time-ordered products of field
operators, between the in- and out- vacuum states, for instance

$$ \langle 0_{\rm out}|\, T\, \phi(x_1)\phi(x_2)\phi(x_3)\phi(x_4)\, |0_{\rm in}\rangle , $$

the calculation of which can be performed with the usual Feynman rules.
However, there is a class of more general problems that cannot be addressed by
this standard perturbation theory. One of the simplest problems of that kind is the
evaluation of the expectation value of the number operator $\langle\alpha_{\rm in}|a^\dagger_{\rm out}(\mathbf{p})\, a_{\rm out}(\mathbf{p})|\alpha_{\rm in}\rangle$,
that counts the particles of momentum $\mathbf{p}$ in the final state, given that the initial state
was the state $\alpha$. To evaluate this matrix element, one needs to calculate the amplitude
$\langle\alpha_{\rm in}|\phi(x)\phi(y)|\alpha_{\rm in}\rangle$, that has no time ordering, and where one has in states on both
sides. More generally, one sometimes needs the amplitudes

$$ \langle 0_{\rm in}|\, \overline{T}\big( \phi(x_1)\cdots\phi(x_n) \big)\; T\big( \phi(y_1)\cdots\phi(y_p) \big)\, |0_{\rm in}\rangle , $$

where $\overline{T}$ denotes the anti-time ordering. The Schwinger-Keldysh formalism is tailored
for addressing these more general questions. Moreover, as we shall see, it is formally
identical to Cutkosky's cutting rules.

44 When an amplitude has external charged particles, the Ward identity is satisfied only if these particles
are on-shell. This is indeed the case here, because all the cut lines are on-shell, as well as all the incoming
particles.

Schwinger-Keldysh perturbation theory : Consider the expectation value

$$ \langle 0_{\rm in}|\, \overline{T}\big( \phi(x_1)\cdots\phi(x_n) \big)\; T\big( \phi(y_1)\cdots\phi(y_p) \big)\, |0_{\rm in}\rangle . \tag{1.360} $$

As we did in the derivation of ordinary perturbation theory, let us first replace each
Heisenberg field operator by its counterpart in the interaction representation, using
eq. (1.63). After some rearrangement of the evolution operators, we get:

$$
\begin{aligned}
&\langle 0_{\rm in}|\, \overline{T}\big( \phi(x_1)\cdots\phi(x_n) \big)\; T\big( \phi(y_1)\cdots\phi(y_p) \big)\, |0_{\rm in}\rangle = \\
&\qquad = \langle 0_{\rm in}|\, \overline{T}\Big[ \phi_{\rm in}(x_1)\cdots\phi_{\rm in}(x_n)\, \exp\Big( -i \int_{-\infty}^{+\infty} d^4x\; \mathcal{L}_I(\phi_{\rm in}(x)) \Big) \Big] \\
&\qquad\quad \times T\Big[ \phi_{\rm in}(y_1)\cdots\phi_{\rm in}(y_p)\, \exp\Big( i \int_{-\infty}^{+\infty} d^4x\; \mathcal{L}_I(\phi_{\rm in}(x)) \Big) \Big]\, |0_{\rm in}\rangle .
\end{aligned}
\tag{1.361}
$$

Here, we have exploited the fact that the factor $U(-\infty,+\infty)$ that appears in these
manipulations is the anti-time ordered exponential of the interaction term, in order to
write this formula in a more symmetric way. To go further, it is useful to imagine that
the time axis is in fact a contour $C$ made of two branches labeled $+$ and $-$ running
parallel to the real axis, as illustrated in figure 1.4. This contour is oriented, with
the $+$ branch running in the direction of increasing time, followed by the $-$ branch
running in the direction of decreasing time.

    Figure 1.4: Time contour in the Schwinger-Keldysh formalism.

Then, it is convenient to introduce a path
ordering, denoted by $P$ and defined as a standard ordering along the contour $C$. In
more detail, one has

$$
P\, A(x)B(y) =
\begin{cases}
T\, A(x)B(y) & \text{if } x^0, y^0 \in C_+ , \\
\overline{T}\, A(x)B(y) & \text{if } x^0, y^0 \in C_- , \\
A(x)B(y) & \text{if } x^0 \in C_- ,\; y^0 \in C_+ , \\
B(y)A(x) & \text{if } x^0 \in C_+ ,\; y^0 \in C_- .
\end{cases}
\tag{1.362}
$$
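
The four cases of eq. (1.362) amount to a single sorting rule along the contour. The following sketch implements it for bosonic operators represented by plain labels; the tuple layout `(label, time, branch)` and the function names are assumptions made for the illustration. An operator sits further along $C$ if it is on the $-$ branch, or if it is on the same branch and later along the orientation of that branch; operators further along the contour are placed to the left.

```python
def contour_position(branch, time):
    """Position along C: the + branch (increasing time) comes first,
    followed by the - branch (decreasing time)."""
    return (0, time) if branch == "+" else (1, -time)


def path_order(operators):
    """Order bosonic operators along the contour C.

    operators: list of (label, time, branch) with branch in {"+", "-"}.
    Returns the labels from left to right, i.e. from the latest to the
    earliest position along the contour.
    """
    ordered = sorted(operators,
                     key=lambda op: contour_position(op[2], op[1]),
                     reverse=True)
    return [label for label, _, _ in ordered]


print(path_order([("A", 2.0, "+"), ("B", 1.0, "+")]))  # time ordering
print(path_order([("A", 2.0, "-"), ("B", 1.0, "-")]))  # anti-time ordering
print(path_order([("A", 0.0, "+"), ("B", 5.0, "-")]))  # - branch goes to the left
```

For fermionic operators one would also have to keep track of the sign of the permutation, which this sketch ignores.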

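One can use this contour ordering to write the previous equations in a much more
compact way. In particular, eq. (1.361) can be generalized into:

$$
\begin{aligned}
&\langle 0_{\rm in}|\, P\, \phi^-(x_1)\cdots\phi^-(x_n)\, \phi^+(y_1)\cdots\phi^+(y_p)\, |0_{\rm in}\rangle = \\
&\qquad = \langle 0_{\rm in}|\, P\, \phi^-_{\rm in}(x_1)\cdots\phi^-_{\rm in}(x_n)\, \phi^+_{\rm in}(y_1)\cdots\phi^+_{\rm in}(y_p)\, \exp\Big( i \int_C d^4x\; \mathcal{L}_I(\phi_{\rm in}(x)) \Big)\, |0_{\rm in}\rangle .
\end{aligned}
\tag{1.363}
$$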

The differences compared to eq. (1.361) are threefold :

i. A single overall path ordering takes care automatically of both the time ordering
and the anti-time ordering contained in the original formula,
ii. For this trick to work, one must (temporarily) assume that the fields on the
upper and lower branch of the contour C are distinct: φ+ and φ− respectively,
iii. The time integration in the exponential is now running over both branches of
the contour C.

The advantage of having introduced this more complicated time contour is that it
leads to expressions that are formally identical to those of ordinary perturbation
theory, provided one replaces the time ordering by the path ordering and provided
one extends the time integration from $\mathbb{R}$ to $C$. In particular, one can first define a
generating functional,

$$ Z_{\rm SK}[j] \equiv \langle 0_{\rm in}|\, P \exp\Big( i \int_C d^4x\; j(x)\phi(x) \Big)\, |0_{\rm in}\rangle , \tag{1.364} $$

that encodes all the correlators considered in this section, provided the external source
$j$ has distinct values $j_+$ and $j_-$ on the two branches of the contour (the subscript SK
is used to distinguish this generating functional from the standard one). As in the case
of Feynman perturbation theory, one can write this generating functional as:

$$ Z_{\rm SK}[j] = \exp\Big( i \int_C d^4x\; \mathcal{L}_I\Big( \frac{\delta}{i\,\delta j(x)} \Big) \Big)\; \underbrace{\langle 0_{\rm in}|\, P \exp\Big( i \int_C d^4x\; j(x)\phi_{\rm in}(x) \Big)\, |0_{\rm in}\rangle}_{Z^0_{\rm SK}[j]} , \tag{1.365} $$

with

$$
\begin{aligned}
Z^0_{\rm SK}[j] &= \exp\Big( -\frac{1}{2} \int_C d^4x\, d^4y\; j(x)\, j(y)\, G^0_C(x,y) \Big) , \\
G^0_C(x,y) &\equiv \langle 0_{\rm in}|\, P\, \phi_{\rm in}(x)\phi_{\rm in}(y)\, |0_{\rm in}\rangle .
\end{aligned}
\tag{1.366}
$$
The free propagator G0C , defined on the contour C, is a natural extension of the
Feynman propagator (in particular, it coincides with the Feynman propagator if the
1. BASICS OF Q UANTUM F IELD T HEORY 83

two time arguments are on the + branch of the contour). Besides the propagator,
the other change to the perturbative expansion in the Schwinger-Keldysh formalism
is that the time integration at the vertices of a diagram must run over the contour C
instead of the real axis.
The connection with Cutkosky’s cutting rules appears when we break down the
propagator into 4 components G0±± (x, y), depending on whether the times x0 , y0
are on the upper or lower branch of the contour. An explicit calculation of these free
propagators leads to
Z
d4 p e−ip·(x−y)
G0++ (x, y) = i ,
(2π)4 p2 − m2 + iǫ
Z 4
d p e−ip·(x−y)
G0−− (x, y) = −i ,
(2π) p − m2 − iǫ
4 2
Z 4
d p −ip·(x−y)
G0+− (x, y) = e 2πθ(−p0 )δ(p2 − m2 ) ,
(2π)4
Z 4
d p −ip·(x−y)
G0−+ (x, y) = e 2πθ(+p0 )δ(p2 − m2 ) . (1.367)
(2π)4

The time integration on the contour C is also split into two terms, the upper branch
corresponding to a vertex + (−iλ) and the lower branch to a vertex − (+iλ, because
of the minus sign due to integrating from +∞ to −∞).
In the Schwinger-Keldysh formalism, the vacuum-vacuum diagrams are simpler
than in conventional perturbation theory. Here, one has

\[
Z_{SK}[0] = \big\langle 0_{\rm in}\big| 0_{\rm in}\big\rangle = 1 ,
\tag{1.368}
\]

which means that all the connected vacuum-vacuum diagrams are zero. This is due to
the fact that in this formalism one is calculating correlators that have the in-vacuum
on both sides. This cancellation works individually for each diagram topology, and
results from a cancellation between the various ways of assigning the + and − indices
to the vertices of a diagram (a vacuum-vacuum diagram with a fixed assignment
of + and − vertices is not zero in general). This cancellation can be viewed as a
consequence of eq. (1.329).

Relation between the functionals Z[j] and ZSK [j] : There is a useful functional
relation between the generating functional of conventional perturbation theory Z[j],
and that of the Schwinger-Keldysh formalism:
\[
Z_{SK}[j_+, j_-] = \exp\Big[\int d^4x\, d^4y\;\; G^0_{+-}(x,y)\; \Box_x\, \Box_y\; \frac{\delta^2}{\delta j_+(x)\,\delta j_-(y)}\Big]\; Z[j_+]\; Z^*[j_-] .
\tag{1.369}
\]

(Here, in order to avoid any confusion, we write explicitly the two components +
and − of the source j in the Schwinger-Keldysh generating functional.) Thanks
to this formula, one can construct diagrams in the Schwinger-Keldysh formalism
by stitching an ordinary Feynman diagram and the complex conjugate of another
Feynman diagram. In order to prove this relation, it is sufficient to establish it for the
free theory, since the interactions are always trivially factorizable (see eqs. (1.107)
and (1.365)).
Chapter 2

Functional quantization

2.1 Path integral in quantum mechanics


Let us consider a quantum mechanical system with a single degree of freedom, whose
Hamiltonian is
\[
H \equiv \frac{P^2}{2m} + V(Q) .
\tag{2.1}
\]
The position and momentum operators Q and P obey the commutation relation [Q, P] = i.
We would like to calculate the probability for the system to start at the position qi at a
time ti and end at the position qf at the time tf. The answer may be obtained as
|ψ(qf, tf)|² by solving Schrödinger's equation with an initial wavefunction localized at qi,
\[
i\partial_t \psi(q,t) = H\, \psi(q,t) , \qquad \psi(q,t_i) \equiv \delta(q - q_i) .
\tag{2.2}
\]
More formally, in the Schrödinger picture, it is given by the squared modulus of the
following transition amplitude
\[
\big\langle q_f\big|\, e^{-iH(t_f - t_i)}\, \big|q_i\big\rangle ,
\tag{2.3}
\]
where |q⟩ denotes the eigenstate of the position operator with eigenvalue q. Let us
subdivide the time interval [ti, tf] into N equal sub-intervals, by introducing:
\[
\Delta \equiv \frac{t_f - t_i}{N} , \qquad t_n \equiv t_i + n\,\Delta .
\tag{2.4}
\]
(Therefore, we have t0 = ti and tN = tf.) The time evolution operator can be
factorized as
\[
e^{-iH(t_f - t_i)} = e^{-iH(t_N - t_{N-1})} \times e^{-iH(t_{N-1} - t_{N-2})} \times \cdots \times e^{-iH(t_1 - t_0)} .
\tag{2.5}
\]


Between the successive factors in the right hand side, we can insert the identity
operator written as a complete sum over the position eigenstates:
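\[
1 = \int_{-\infty}^{+\infty} dq\; \big|q\big\rangle \big\langle q\big| ,
\tag{2.6}
\]
and the transition amplitude (2.3) becomes
\[
\big\langle q_f\big|\, e^{-iH(t_f - t_i)}\, \big|q_i\big\rangle
= \int \prod_{j=1}^{N-1} dq_j\;
\big\langle q_f\big| e^{-i\Delta H} \big|q_{N-1}\big\rangle
\big\langle q_{N-1}\big| e^{-i\Delta H} \big|q_{N-2}\big\rangle
\cdots
\big\langle q_1\big| e^{-i\Delta H} \big|q_i\big\rangle .
\tag{2.7}
\]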

Note that this formula, illustrated in figure 2.1, is exact for any value of N.

Figure 2.1: Illustration of eq. (2.7) with 10 and 200 intermediate points (horizontal axis: t, vertical axis: q). The
endpoints are fixed, while the intermediate points are integrated over. The line
segments connecting the points are just a help to guide the eye, but there is no “path”
at this stage.

In the Hamiltonian (2.1), the kinetic energy and potential energy terms do not commute,
which complicates the evaluation of its exponential. We can remedy this situation by
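using the Baker-Campbell-Hausdorff formula, which we shall write here as follows
\[
e^{\Delta(A+B)} = e^{\Delta A}\; e^{\Delta B}\; e^{-\frac{\Delta^2}{2}[A,B] + \mathcal{O}(\Delta^3)} .
\tag{2.8}
\]
In the limit Δ → 0 (i.e. N → ∞), we may neglect the last factor since the product of
all such factors goes to unity¹ when N → ∞. Therefore, each elementary factor of
eq. (2.7) is rewritten as
\[
\begin{aligned}
\big\langle q_{i+1}\big|\, e^{-i\Delta H}\, \big|q_i\big\rangle
&\approx \big\langle q_{i+1}\big|\, e^{-i\Delta \frac{P^2}{2m}}\; e^{-i\Delta V(Q)}\, \big|q_i\big\rangle \\
&= \int \frac{dp_i}{2\pi}\;
\big\langle q_{i+1}\big|\, e^{-i\Delta \frac{P^2}{2m}}\, \big|p_i\big\rangle
\big\langle p_i\big|\, e^{-i\Delta V(Q)}\, \big|q_i\big\rangle ,
\end{aligned}
\tag{2.9}
\]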

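where we have introduced the identity operator, written this time as a complete sum
over momentum eigenstates:
\[
1 \equiv \int \frac{dp}{2\pi}\; \big|p\big\rangle \big\langle p\big| .
\tag{2.10}
\]
In the two factors, the exponential operator depends only on P or Q, and the matrix
elements are trivial to evaluate by using the fact that the operators are enclosed
between momentum and position eigenstates:
\[
\begin{aligned}
\big\langle q_{i+1}\big|\, e^{-i\Delta \frac{P^2}{2m}}\, \big|p_i\big\rangle &= e^{-i\Delta \frac{p_i^2}{2m}}\; \big\langle q_{i+1}\big| p_i\big\rangle , \\
\big\langle p_i\big|\, e^{-i\Delta V(Q)}\, \big|q_i\big\rangle &= e^{-i\Delta V(q_i)}\; \big\langle p_i\big| q_i\big\rangle .
\end{aligned}
\tag{2.11}
\]
Using now
\[
\big\langle q\big| p\big\rangle = e^{ipq} ,
\tag{2.12}
\]
we arrive at the formula²
\[
\big\langle q_{i+1}\big|\, e^{-i\Delta H}\, \big|q_i\big\rangle
= \int \frac{dp_i}{2\pi}\; e^{-i\Delta H(p_i, q_i)}\; e^{i\, p_i (q_{i+1} - q_i)}\; \big(1 + \mathcal{O}(\Delta^2)\big) .
\tag{2.13}
\]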
Footnote 1: We use
\[
\lim_{N\to\infty} e^{\alpha_1/N^2}\, e^{\alpha_2/N^2} \cdots e^{\alpha_N/N^2} = 1 ,
\]
provided that the sum \sum_i \alpha_i does not diverge too quickly.

Footnote 2: A bit more care is necessary for Hamiltonians that are not separable into a sum of a P-dependent term
and a Q-dependent term. A proper treatment should use Weyl's prescription for defining the quantum
Hamiltonian operator from the classical Hamiltonian. In eq. (2.13), one would obtain H(p_i, (q_i + q_{i+1})/2)
instead of H(p_i, q_i).
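
The splitting used in eqs. (2.8) and (2.9) can be checked numerically on a toy example. The Python sketch below is not taken from the text: the 2x2 matrices `A` and `B` are arbitrary non-commuting stand-ins for the kinetic and potential exponents, and all function names are illustrative. It compares the N-fold product of split factors with the exact exponential; under these assumptions the mismatch should shrink roughly like 1/N.

```python
# Finite-dimensional sketch of the splitting in eqs. (2.8)-(2.9).
# A and B are arbitrary non-commuting 2x2 matrices (illustrative stand-ins
# for the kinetic and potential exponents; they are not from the text).

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]


def mat_mul(x, y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_add(x, y):
    """Sum of two 2x2 matrices."""
    return [[x[i][j] + y[i][j] for j in range(2)] for i in range(2)]


def mat_scale(x, c):
    """Multiply a 2x2 matrix by the scalar c."""
    return [[c * x[i][j] for j in range(2)] for i in range(2)]


def mat_exp(x, terms=40):
    """exp(x) from its Taylor series (adequate for small 2x2 matrices)."""
    result = IDENTITY
    term = IDENTITY
    for k in range(1, terms):
        term = mat_scale(mat_mul(term, x), 1.0 / k)
        result = mat_add(result, term)
    return result


def mat_pow(x, n):
    """n-th power of a 2x2 matrix by repeated multiplication."""
    result = IDENTITY
    for _ in range(n):
        result = mat_mul(result, x)
    return result


def max_abs_diff(x, y):
    """Largest modulus among the entries of x - y."""
    return max(abs(x[i][j] - y[i][j]) for i in range(2) for j in range(2))


def trotter_error(a, b, n):
    """Distance between (exp(a/n) exp(b/n))^n and exp(a + b)."""
    exact = mat_exp(mat_add(a, b))
    step = mat_mul(mat_exp(mat_scale(a, 1.0 / n)),
                   mat_exp(mat_scale(b, 1.0 / n)))
    return max_abs_diff(mat_pow(step, n), exact)


A = [[0.0, 1j], [1j, 0.0]]     # i * sigma_x
B = [[1j, 0.0], [0.0, -1j]]    # i * sigma_z

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, trotter_error(A, B, n))
```

The neglected factor in eq. (2.8) is of order Δ² per step, so N steps accumulate an error of order 1/N, which is what the printed numbers are expected to illustrate.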

If we define q̇i ≡ (qi+1 − qi)/Δ as the slope of the line segments in figure 2.1, and
take the limit N → ∞, we may write the transition amplitude as a path integral:
\[
\big\langle q_f\big|\, e^{-iH(t_f - t_i)}\, \big|q_i\big\rangle
= \int_{\substack{q(t_i)=q_i \\ q(t_f)=q_f}} \big[Dp(t)\,Dq(t)\big]\;
\exp\Big\{ i\int_{t_i}^{t_f} dt\; \Big( p(t)\,\dot{q}(t) - H\big(p(t), q(t)\big) \Big) \Big\} .
\tag{2.14}
\]
One should be aware of the fact that the functional measure [Dq(t)Dp(t)] in general
lacks solid mathematical foundations, although it allows for some powerful manipulations
that would be extremely cumbersome to perform at the level of quantum
operators. Note that at the boundaries ti,f the position is well defined, and therefore
the momentum is not constrained (by the uncertainty principle). A crucial aspect
of eq. (2.14) is that all the objects that appear in the right hand side are ordinary
c-numbers that commute, while the left hand side is made of quantum operators and
states. In this section, we have started from the conventional formulation of transition
amplitudes in quantum mechanics, in order to arrive at the formula (2.14). However,
one may now “forget” the canonical formalism and view the path integral expression
of transition amplitudes as another way of going from a classical Hamiltonian H to a
quantized theory.

For a Hamiltonian where the P dependence has no powers higher than quadratic,
as in the example of eq. (2.1), it is possible to perform exactly the integral over p(t).
This type of integral is called a Gaussian path integral. Gaussian path integrals can
be evaluated in the same way as their ordinary counterparts, using the following
formulas,
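\[
\int_{-\infty}^{+\infty} dx\; e^{-x^2/(2\sigma)} = \sqrt{2\pi\sigma} , \qquad
\int_{-\infty}^{+\infty} dx\; e^{\pm i x^2/(2\sigma)} = e^{\pm i\frac{\pi}{4}}\, \sqrt{2\pi\sigma} ,
\tag{2.15}
\]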

and treating each p(t) as an independent variable. In the present case, we need the
integral

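\[
\int dp\; e^{i\Delta\left(p\dot{q} - \frac{p^2}{2m}\right)}
= \underbrace{e^{-i\frac{\pi}{4}}\, \sqrt{\frac{2\pi m}{\Delta}}}_{\text{prefactor independent of } q,\,\dot{q}}\;\;
e^{\,i\Delta\, \frac{m\dot{q}^2}{2}} .
\tag{2.16}
\]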
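
The step from eq. (2.15) to eq. (2.16) amounts to completing the square in p. The lines below spell it out as a sketch; the only identification made is σ = m/Δ in eq. (2.15), taken with the lower sign.

```latex
\begin{align*}
i\Delta\Big(p\,\dot{q} - \frac{p^2}{2m}\Big)
  &= -\frac{i\Delta}{2m}\,\big(p - m\dot{q}\big)^2 + i\Delta\,\frac{m\dot{q}^2}{2} , \\
\int_{-\infty}^{+\infty} dp\; e^{-\frac{i\Delta}{2m}(p - m\dot{q})^2}
  &= e^{-i\frac{\pi}{4}}\,\sqrt{\frac{2\pi m}{\Delta}}
  \qquad \text{(shift } p \to p + m\dot{q}\text{, then eq. (2.15) with } \sigma = m/\Delta\text{)} .
\end{align*}
```

The shift of the integration variable is what makes the prefactor independent of q̇.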

 
The (infinite in the limit Δ → 0) prefactors can be hidden in the measure [Dq(t)]
since they do not depend on the path, and we are therefore led to the following
formula:
\[
\begin{aligned}
\big\langle q_f\big|\, e^{-iH(t_f - t_i)}\, \big|q_i\big\rangle
&= \int_{\substack{q(t_i)=q_i \\ q(t_f)=q_f}} \big[Dq(t)\big]\; \exp\Big\{ i\int_{t_i}^{t_f} dt\; L(q(t)) \Big\} \\
&= \int_{\substack{q(t_i)=q_i \\ q(t_f)=q_f}} \big[Dq(t)\big]\; e^{iS[q(t)]} ,
\end{aligned}
\tag{2.17}
\]
where L(q) is the classical Lagrangian:
\[
L(q) \equiv \frac{m\,\dot{q}^2}{2} - V(q)
\tag{2.18}
\]
and S[q] the corresponding classical action.

2.2 Classical limit, Least action principle


In the previous section, we have written all the formulas with h̄ = 1. Had we kept the
Planck constant, the final formula would have been:
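\[
\big\langle q_f\big|\, e^{-iH(t_f - t_i)}\, \big|q_i\big\rangle
= \int_{\substack{q(t_i)=q_i \\ q(t_f)=q_f}} \big[Dq(t)\big]\; e^{\frac{i}{\hbar} S[q(t)]} .
\tag{2.19}
\]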

(This can be guessed a posteriori based on the fact that h̄ has the dimension of an
action.) Because of the factor i inside the exponential, this integral is wildly oscillating,
except in the immediate vicinity of the function qc (t) that realizes the extremum
of the action. Note that this function is precisely the solution of the classical Euler-
Lagrange equations of motion. Roughly speaking, the phase oscillations become
significant when
\[
\big| S[q(t)] - S[q_c(t)] \big| \geq 2\pi\hbar ,
\tag{2.20}
\]
and paths that fulfill this inequality do not contribute to the path integral. Therefore,
in the limit h̄ → 0, the path integral is dominated by the unique path qc (t), i.e. by the
classical trajectory of the system. The path integral formalism thus provides a very
intuitive way of connecting smoothly quantum and classical mechanics.
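
As a reminder of why the extremum of the action is the classical trajectory, the first-order variation can be written out for the Lagrangian of eq. (2.18). This is a standard sketch; the deviation η(t) is a notation introduced here, assumed to vanish at both endpoints.

```latex
\begin{align*}
S[q_c + \eta] - S[q_c]
  &= \int_{t_i}^{t_f} dt\; \Big( m\,\dot{q}_c\,\dot{\eta} - V'(q_c)\,\eta \Big) + \mathcal{O}(\eta^2) \\
  &= -\int_{t_i}^{t_f} dt\;\; \eta(t)\, \Big( m\,\ddot{q}_c + V'(q_c) \Big) + \mathcal{O}(\eta^2) ,
  \qquad \eta(t_i) = \eta(t_f) = 0 .
\end{align*}
```

Requiring the linear term to vanish for every η gives m q̈c = −V′(qc), i.e. Newton's equation.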

2.3 More functional machinery


2.3.1 Time-ordered products
Figure 2.2: Illustration of eq. (2.19). The paths whose action is far from
the classical extremum are plotted in fainter colours. The solid black line is the
classical trajectory.

Consider the matrix element
\[
\big\langle q_f\big|\, e^{-iH(t_f - t_1)}\; Q\; e^{-iH(t_1 - t_i)}\, \big|q_i\big\rangle ,
\tag{2.21}
\]
that measures the expectation value of the position at the time t1. In order to evaluate
this object, we need to insert on either side of the position operator Q an identity
operator written as a complete sum over position eigenstates, i.e.
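\[
Q \;\to\; \int dq\, dq'\;\; \big|q\big\rangle\, \underbrace{\big\langle q\big|\, Q\, \big|q'\big\rangle}_{q\,\delta(q - q')}\, \big\langle q'\big|
= \int dq\;\; q\; \big|q\big\rangle \big\langle q\big| .
\tag{2.22}
\]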

This leads immediately to the following path integral representation:


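\[
\big\langle q_f\big|\, e^{-iH(t_f - t_1)}\; Q\; e^{-iH(t_1 - t_i)}\, \big|q_i\big\rangle
= \int_{\substack{q(t_i)=q_i \\ q(t_f)=q_f}} \big[Dq(t)\big]\;\; q(t_1)\;\; e^{iS[q(t)]} .
\tag{2.23}
\]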

Likewise, if t2 > t1 , we have:

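\[
\big\langle q_f\big|\, e^{-iH(t_f - t_2)}\; Q\; e^{-iH(t_2 - t_1)}\; Q\; e^{-iH(t_1 - t_i)}\, \big|q_i\big\rangle
= \int_{\substack{q(t_i)=q_i \\ q(t_f)=q_f}} \big[Dq(t)\big]\;\; q(t_1)\, q(t_2)\;\; e^{iS[q(t)]} .
\tag{2.24}
\]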

If we introduce a time-dependent position operator

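\[
Q(t) \equiv e^{iHt}\; Q\; e^{-iHt} ,
\tag{2.25}
\]
and its eigenstates
\[
\big|q, t\big\rangle \equiv e^{iHt}\, \big|q\big\rangle ,
\tag{2.26}
\]
the previous equation takes a much more compact form
\[
\big\langle q_f, t_f\big|\, Q(t_2)\, Q(t_1)\, \big|q_i, t_i\big\rangle
\underset{t_2 > t_1}{=} \int_{\substack{q(t_i)=q_i \\ q(t_f)=q_f}} \big[Dq(t)\big]\;\; q(t_1)\, q(t_2)\;\; e^{iS[q(t)]} .
\tag{2.27}
\]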

The condition t2 > t1 is crucial here, because the left hand side would be quite
different if the times are ordered differently. In contrast, the objects q(t1 ) and q(t2 )
in the right hand side are ordinary numbers that commute. One may render this
formula true for any ordering between t1 and t2 by introducing a T-product, that
ensures that the operator with the largest time is always on the left:
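\[
\big\langle q_f, t_f\big|\, T\big( Q(t_1)\, Q(t_2) \big)\, \big|q_i, t_i\big\rangle
= \int_{\substack{q(t_i)=q_i \\ q(t_f)=q_f}} \big[Dq(t)\big]\;\; q(t_1)\, q(t_2)\;\; e^{iS[q(t)]} .
\tag{2.28}
\]
This formula generalizes to n factors:
\[
\big\langle q_f, t_f\big|\, T\big( Q(t_1) \cdots Q(t_n) \big)\, \big|q_i, t_i\big\rangle
= \int_{\substack{q(t_i)=q_i \\ q(t_f)=q_f}} \big[Dq(t)\big]\;\; q(t_1) \cdots q(t_n)\;\; e^{iS[q(t)]} .
\tag{2.29}
\]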

This result is extremely important in applications to quantum field theory, since
time-ordered products of field operators are the central objects that appear in the
LSZ reduction formulas. One may also apply differential operators containing time
derivatives on this equation, for instance:
\[
\frac{\partial}{\partial t_1}\, \big\langle q_f, t_f\big|\, T\big( Q(t_1) \cdots Q(t_n) \big)\, \big|q_i, t_i\big\rangle
= \int_{\substack{q(t_i)=q_i \\ q(t_f)=q_f}} \big[Dq(t)\big]\;\; \dot{q}(t_1) \cdots q(t_n)\;\; e^{iS[q(t)]} .
\tag{2.30}
\]
In other words, a time derivative in the integrand of the path integral also applies to
the step functions that enforce the time ordering in the left hand side.


2.3.2 Functional sources and derivatives


The amplitudes of the form (2.29) can all be encapsulated into the following generating
functional:
\[
Z_{fi}[j(t)] \equiv \big\langle q_f, t_f\big|\, T \exp i\int_{t_i}^{t_f} dt\;\, j(t)\, Q(t)\, \big|q_i, t_i\big\rangle ,
\tag{2.31}
\]

where j(t) is some arbitrary function of time. From Zfi [j], the amplitudes can be
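recovered by functional differentiation:
\[
\big\langle q_f, t_f\big|\, T\big( Q(t_1) \cdots Q(t_n) \big)\, \big|q_i, t_i\big\rangle
= \left. \frac{\delta^n Z_{fi}[j]}{i^n\, \delta j(t_1) \cdots \delta j(t_n)} \right|_{j \equiv 0} .
\tag{2.32}
\]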

Functional derivatives obey the usual rules of differentiation, with the additional
property that the values of the function j(t) at different times should be viewed as
independent variables, i.e.
\[
\frac{\delta j(t)}{\delta j(t')} = \delta(t - t') .
\tag{2.33}
\]
From this formula, one may also read the dimension of a functional derivative:
\[
\dim\Big[\frac{\delta}{\delta j(t)}\Big] = -\dim\big[j(t)\big] - \dim\big[t\big] .
\tag{2.34}
\]
From eq. (2.29), we can derive an expression of the generating functional Zfi as a
path integral,
\[
Z_{fi}[j(t)] = \int_{\substack{q(t_i)=q_i \\ q(t_f)=q_f}} \big[Dq(t)\big]\;\; e^{\,iS[q(t)] + i\int_{t_i}^{t_f} dt\; j(t)\, q(t)} ,
\tag{2.35}
\]

that involves only the commuting c-number q(t) and no time-ordering. Note also
that there is a Hamiltonian version of this path integral:
\[
Z_{fi}[j(t)] = \int_{\substack{q(t_i)=q_i \\ q(t_f)=q_f}} \big[Dp(t)\,Dq(t)\big]\;
\exp\Big\{ i\int_{t_i}^{t_f} dt\; \Big( p(t)\,\dot{q}(t) - H\big(p(t), q(t)\big) + j(t)\, q(t) \Big) \Big\} .
\tag{2.36}
\]

2.3.3 Projection on the ground state at asymptotic times


So far in this section, we have considered amplitudes where the initial and final states
are position eigenstates. However, the path integral formalism is not limited to this
situation. Let us assume for instance that the system is in a state |ψi⟩ at the time ti
and in the state |ψf⟩ at the time tf. For any operator O, the expectation value between
these two states can be related to transitions between position eigenstates by writing
\[
\big\langle \psi_f, t_f\big|\, O\, \big|\psi_i, t_i\big\rangle
= \int dq_i\, dq_f\;\; \psi_f^*(q_f)\, \psi_i(q_i)\;\; \big\langle q_f, t_f\big|\, O\, \big|q_i, t_i\big\rangle ,
\tag{2.37}
\]
where
\[
\psi(q) \equiv \big\langle q\big| \psi\big\rangle
\tag{2.38}
\]
is the position representation of the wavefunction of the state |ψ⟩. However, the use
of this formula is cumbersome in practice, because of the integrations over qi,f.
In the special case where the initial and final states are the ground state of the
Hamiltonian, |0⟩, and the initial and final times are −∞ and +∞, there is a trick to
circumvent this difficulty. Let us introduce the eigenstates |n⟩ of the Hamiltonian,
with eigenvalue En and eigenfunction ψn(q) ≡ ⟨q|n⟩, and write
\[
\begin{aligned}
\big|q_i, t_i\big\rangle &= e^{iHt_i}\, \big|q_i\big\rangle \\
&= \sum_{n=0}^{\infty} e^{iHt_i}\, \big|n\big\rangle \big\langle n\big| q_i\big\rangle \\
&= \sum_{n=0}^{\infty} \psi_n^*(q_i)\; e^{iE_n t_i}\, \big|n\big\rangle .
\end{aligned}
\tag{2.39}
\]

We will assume that the Hamiltonian is shifted by a constant so that the energy of the
ground state |0⟩ is E0 = 0. Now, we multiply the Hamiltonian by 1 − i0+, where
0+ denotes some positive infinitesimal number. All the factors exp(i(1 − i0+)En ti)
go to zero when ti → −∞, except for n = 0. Therefore, after this alteration of the
Hamiltonian, we have:
\[
\lim_{t_i \to -\infty} \big|q_i, t_i\big\rangle = \psi_0^*(q_i)\, \big|0\big\rangle .
\tag{2.40}
\]
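
The damping mechanism behind eq. (2.40) is easy to see numerically. In the Python sketch below, which is not from the text, the infinitesimal 0+ is replaced by a small finite number `eps` (an assumption made purely for illustration), and `damping_factor` is a hypothetical helper name. The factor multiplying |n⟩ in eq. (2.39) then has modulus exp(eps · En · ti), which dies off at large negative ti unless En = 0.

```python
import cmath


def damping_factor(energy, time, eps=0.05):
    """exp(i (1 - i eps) E t): the coefficient of |n> in eq. (2.39)
    after the replacement H -> (1 - i eps) H.

    eps is a finite stand-in for the infinitesimal 0+ (illustrative only).
    """
    return cmath.exp(1j * (1 - 1j * eps) * energy * time)


if __name__ == "__main__":
    for t in (-10.0, -100.0, -200.0):
        # Ground state (E = 0) is untouched, excited state (E = 1) is damped.
        print(t, abs(damping_factor(0.0, t)), abs(damping_factor(1.0, t)))
```

Only the ground-state term keeps unit modulus, which is the content of eq. (2.40).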

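We can then weight this equation by a function ϕ(qi),
\[
\lim_{t_i \to -\infty} \int dq_i\;\; \varphi(q_i)\, \big|q_i, t_i\big\rangle
= \underbrace{\int dq_i\;\; \varphi(q_i)\, \psi_0^*(q_i)}_{\langle 0 | \varphi \rangle}\;\; \big|0\big\rangle ,
\tag{2.41}
\]
i.e.
\[
\big|0\big\rangle = \lim_{t_i \to -\infty} \frac{1}{\big\langle 0\big| \varphi\big\rangle} \int dq_i\;\; \varphi(q_i)\, \big|q_i, t_i\big\rangle .
\tag{2.42}
\]
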

Any function ϕ(q) such that the state |ϕ⟩ has a non-zero overlap with the ground
state |0⟩ is appropriate in this role, but the simplest expressions are obtained with
the constant function ϕ(q) = 1, corresponding to the momentum eigenstate |p = 0⟩.
Likewise, changing H → (1 − i0+)H has a similar effect on the final state in the
limit tf → +∞,
\[
\lim_{t_f \to +\infty} \big\langle q_f, t_f\big| = \psi_0(q_f)\, \big\langle 0\big| .
\tag{2.43}
\]

From these considerations, when the initial and final states at ±∞ are the ground
state, we can write the generating functional in the following simple path integral
form:
\[
Z[j(t)] = \int \big[Dp(t)\,Dq(t)\big]\;
\exp\Big\{ i\int dt\; \Big( p(t)\,\dot{q}(t) - (1 - i0^+)\, H\big(p(t), q(t)\big) + j(t)\, q(t) \Big) \Big\} .
\tag{2.44}
\]
From the discussion after eq. (2.42), we see that the boundary conditions on the paths
are not important. They only affect an overall prefactor, that can be adjusted by hand
in such a way that Z[0] = 1. After performing the Gaussian functional integral over
p(t), we can rewrite this expression in Lagrangian form:
\[
Z[j(t)] = \int \big[Dq(t)\big]\;
\exp\Big\{ i\int dt\; \Big( (1 + i0^+)\, \frac{m\,\dot{q}^2(t)}{2} - (1 - i0^+)\, V(q(t)) + j(t)\, q(t) \Big) \Big\} .
\tag{2.45}
\]
The term in (i0+)q̇² may be viewed as contributing to the convergence of the integral
at large velocities. Likewise, for a confining potential such that V(q) → +∞ when
q → ∞, the term in (i0+)V(q) contributes to the convergence at large coordinates.

2.3.4 Functional Fourier transform


Given a functional F[q(t)], its functional Fourier transform is defined by
\[
\widetilde{F}[p(t)] \equiv \int \big[Dq(t)\big]\;\; F[q(t)]\; \exp\Big\{ i\int dt\;\, p(t)\, q(t) \Big\} .
\tag{2.46}
\]

In other words, the Fourier conjugate of the “variable” q(t) is another function of
time, p(t). Eq. (2.46) may be inverted by
\[
F[q(t)] \equiv \int \big[Dp(t)\big]\;\; \widetilde{F}[p(t)]\; \exp\Big\{ -i\int dt\;\, p(t)\, q(t) \Big\} .
\tag{2.47}
\]

The usual properties of ordinary Fourier transforms extend to the functional case, e.g.:

• The Fourier transform of a constant is a delta function,
• The Fourier transform of a Gaussian is another Gaussian,
• The Fourier transform of a product is the convolution product of the Fourier
transforms.

2.3.5 Functional translation operator

The functional derivative δ/δj(t) may be viewed as the generator of translations in
the space of the functions j(t). Its exponential provides a translation operator:
\[
\exp\Big\{ \int dt\;\, a(t)\, \frac{\delta}{\delta j(t)} \Big\}\; F[j(t)] = F[j(t) + a(t)] ,
\tag{2.48}
\]

for any functional F[j(t)]. Another extremely important formula is


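\[
\underbrace{\exp\Big\{ \lambda \int dt\; \Big(\frac{\delta}{\delta j(t)}\Big)^n \Big\}\; \exp\Big\{ \int dt\;\, j(t)\, q(t) \Big\}}_{A[j,q;\lambda]}
= \underbrace{\exp\Big\{ \int dt\; \Big( j(t)\, q(t) + \lambda\, q^n(t) \Big) \Big\}}_{B[j,q;\lambda]} .
\tag{2.49}
\]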

The proof of this formula consists in noticing that A[j, q; λ = 0] = B[j, q; λ = 0], and
in comparing their (ordinary) derivatives with respect to λ:
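\[
\begin{aligned}
\partial_\lambda A[j,q;\lambda] &= \int dt\; \Big(\frac{\delta}{\delta j(t)}\Big)^n A[j,q;\lambda] = \int dt\;\, q^n(t)\;\, A[j,q;\lambda] , \\
\partial_\lambda B[j,q;\lambda] &= \int dt\;\, q^n(t)\;\, B[j,q;\lambda] .
\end{aligned}
\tag{2.50}
\]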

Therefore A[j, q; λ] and B[j, q; λ] are equal at λ = 0 and obey the same differential
equation.
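
A single-variable analogue of the translation formula (2.48) can be tested directly: for an ordinary function, exp(a d/dx) f(x) = f(x + a). The Python sketch below is an illustration rather than anything from the text. It assumes f is a polynomial stored as a coefficient list, so that the exponential series terminates, and the function names are hypothetical.

```python
import math


def derivative(coeffs):
    """Derivative of the polynomial c0 + c1 x + c2 x^2 + ... ."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Value of the polynomial at the point x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


def translate_via_exponential(coeffs, a):
    """Apply exp(a d/dx) = sum_k a^k / k! (d/dx)^k to a polynomial.

    The series stops by itself because high derivatives of a
    polynomial vanish.
    """
    result = [0.0] * len(coeffs)
    current = list(coeffs)
    k = 0
    while current:
        weight = a ** k / math.factorial(k)
        for i, c in enumerate(current):
            result[i] += weight * c
        current = derivative(current)
        k += 1
    return result


if __name__ == "__main__":
    f = [1.0, -2.0, 0.0, 3.0]          # 1 - 2 x + 3 x^3
    shifted = translate_via_exponential(f, 0.7)
    print(evaluate(shifted, 1.3), evaluate(f, 1.3 + 0.7))
```

The two printed numbers should agree up to rounding, which is the finite-dimensional content of eq. (2.48).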

2.3.6 Functional diffusion operator

It is sometimes useful to evaluate the action of an operator which is quadratic in
functional derivatives. The result is given by
\[
\exp\Big\{ \int dt\; \frac{\sigma(t)}{2}\, \Big(\frac{\delta}{\delta j(t)}\Big)^2 \Big\}\; F[j]
= \int \big[Da(t)\big]\; \exp\Big\{ -\int dt\; \frac{a^2(t)}{2\sigma(t)} \Big\}\; F[j + a] .
\tag{2.51}
\]


In order to establish this formula, consider the following differential equation,


\[
\partial_z F[j(t); z] = \int dt\; \frac{\sigma(t)}{2}\, \Big(\frac{\delta}{\delta j(t)}\Big)^2 F[j; z] ,
\tag{2.52}
\]
where z is an ordinary real-valued variable. One may view this equation as a diffusion
equation in the space of the functions j(t), and F[j; z] as a density functional on this
space. The left hand side of eq. (2.51) is the formal expression of the solution of this
equation at z = 1, if we interpret F[j] as its initial condition at z = 0. In order to show
that it is equal to the right hand side, one should first transform the diffusion equation
(2.52) by performing a functional Fourier transform,
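\[
\begin{aligned}
\widetilde{F}[k(t); z] &\equiv \int \big[Dj(t)\big]\; \exp\Big\{ i\int dt\;\, j(t)\, k(t) \Big\}\; F[j(t); z] , \\
\partial_z \widetilde{F}[k(t); z] &= -\int dt\; \frac{\sigma(t)}{2}\, k^2(t)\;\; \widetilde{F}[k(t); z] .
\end{aligned}
\tag{2.53}
\]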

The solution of the latter equation is simply


\[
\widetilde{F}[k(t); z = 1] = \exp\Big\{ -\int dt\; \frac{\sigma(t)}{2}\, k^2(t) \Big\}\;\; \widetilde{F}[k(t); z = 0] ,
\tag{2.54}
\]
and the inverse Fourier transform of this solution leads to the right hand side of
eq. (2.51).
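
The one-variable analogue of eq. (2.51) reads exp((σ/2) d²/dx²) f(x) = ∫ da exp(−a²/(2σ)) f(x + a) / √(2πσ), where the normalization that eq. (2.51) hides in the measure [Da(t)] is written explicitly. The Python sketch below checks this for a polynomial; the choice of test function, the quadrature parameters and the function names are all assumptions made for illustration.

```python
import math


def derivative(coeffs):
    """Derivative of the polynomial c0 + c1 x + c2 x^2 + ... ."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Value of the polynomial at the point x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


def diffuse_via_exponential(coeffs, sigma):
    """Apply exp((sigma/2) d^2/dx^2) to a polynomial (terminating series)."""
    result = [0.0] * len(coeffs)
    current = list(coeffs)
    k = 0
    while current:
        weight = (0.5 * sigma) ** k / math.factorial(k)
        for i, c in enumerate(current):
            result[i] += weight * c
        current = derivative(derivative(current))
        k += 1
    return result


def gaussian_average(coeffs, x, sigma, half_width=12.0, steps=4000):
    """Midpoint estimate of the Gaussian-weighted average of f(x + a)."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        a = -half_width + (i + 0.5) * h
        total += math.exp(-a * a / (2.0 * sigma)) * evaluate(coeffs, x + a)
    return total * h / math.sqrt(2.0 * math.pi * sigma)


if __name__ == "__main__":
    f = [0.0, 0.0, 0.0, 0.0, 1.0]      # f(x) = x^4
    print(evaluate(diffuse_via_exponential(f, 0.5), 0.3))
    print(gaussian_average(f, 0.3, 0.5))
```

For f(x) = x⁴ both sides equal x⁴ + 6σx² + 3σ², so the two printed values should coincide up to quadrature error.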

2.4 Path integral in scalar field theory


The functional formalism that we have exposed in the context of quantum mechanics
can now be extended to quantum field theory. The main change is that the functions
over which one integrates are functions of time and space (as opposed to functions
of time only in quantum mechanics). All the results of the previous section can be
translated into analogous formulas in quantum field theory, thanks to the following
correspondence:
\[
\begin{aligned}
q(t) &\longleftrightarrow \phi(x) \\
p(t) &\longleftrightarrow \Pi(x) \\
j(t) &\longleftrightarrow j(x)
\end{aligned}
\tag{2.55}
\]

The main results of the previous section, namely that time-ordered products of
operators in the canonical formalism become simple products of ordinary functions
in the path integral representation, and that the ground state at ±∞ can be obtained
by relaxing the boundary conditions and multiplying the Hamiltonian by 1 − i0+,
remain true in this new context. Thus, the analogue of eq. (2.44) in a real scalar field
theory is:
\[
Z[j] = \int \big[D\Pi(x)\,D\phi(x)\big]\;
\exp\Big\{ i\int d^4x\; \Big( \Pi(x)\,\dot{\phi}(x) - (1 - i0^+)\, \mathcal{H}(\Pi, \phi) + j(x)\,\phi(x) \Big) \Big\} .
\tag{2.56}
\]

Since the Hamiltonian is quadratic in Π,

\[
\mathcal{H} = \frac{1}{2}\,\Pi^2 + \frac{1}{2}\,(\nabla\phi)\cdot(\nabla\phi) + \frac{1}{2}\, m^2\phi^2 + V(\phi) ,
\tag{2.57}
\]
it is easy to perform the (Gaussian) functional integration on Π, to obtain:
\[
Z[j] = \int \big[D\phi(x)\big]\; \exp\Big\{ i\int d^4x\; \Big( \mathcal{L}(\phi) + j(x)\,\phi(x) \Big) \Big\} ,
\tag{2.58}
\]

where
\[
\mathcal{L}(\phi) \equiv \frac{1}{2}\,(1 + i0^+)\,\dot{\phi}^2 - \frac{1}{2}\,(1 - i0^+)\,\Big( (\nabla\phi)\cdot(\nabla\phi) + m^2\phi^2 \Big) - (1 - i0^+)\, V(\phi) .
\tag{2.59}
\]

Note that the 1 − i0+ in front of the interaction potential plays no role if we turn off
adiabatically the coupling constant when |x0 | → ∞. Using the analogue of eq. (2.49),
we can separate the interactions as follows
\[
Z[j] = \exp\Big\{ -i\int d^4x\;\, V\Big(\frac{\delta}{i\delta j(x)}\Big) \Big\}\;\; Z_0[j] ,
\tag{2.60}
\]

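with
\[
\begin{aligned}
Z_0[j] &\equiv \int \big[D\phi(x)\big]\; \exp\Big\{ i\int d^4x\; \Big( \mathcal{L}_0(\phi) + j(x)\,\phi(x) \Big) \Big\} , \\
\mathcal{L}_0(\phi) &= \frac{1}{2}\,(1 + i0^+)\,\dot{\phi}^2 - \frac{1}{2}\,(1 - i0^+)\,\Big( (\nabla\phi)\cdot(\nabla\phi) + m^2\phi^2 \Big) .
\end{aligned}
\tag{2.61}
\]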

The functional integral that gives Z0 [j] in eq. (2.61) is Gaussian in φ and can be
performed in a straightforward manner, giving
\[
Z_0[j] = \exp\Big\{ -\frac{1}{2} \int d^4x\, d^4y\;\; j(x)\, j(y)\;\, G^0_F(x,y) \Big\} ,
\tag{2.62}
\]

where G0F(x, y) is the inverse of the operator
\[
i\,\Big[ (1 + i0^+)\,\partial_0^2 - (1 - i0^+)\,(\nabla^2 - m^2) \Big] .
\tag{2.63}
\]

Note that the terms in i0+ ensure the existence of this inverse. Going to momentum
space, we see that the Fourier transform of this inverse is
\[
\frac{i}{(1 + i0^+)\,k_0^2 - (1 - i0^+)\,(\mathbf{k}^2 + m^2)} ,
\tag{2.64}
\]
which after some rearrangement of the i0+'s appears to be nothing but eq. (1.123).
Although the canonical quantization of a scalar field theory was tractable, we see in
this example that the path integral approach provides a much quicker way of obtaining
the expression of the free generating functional, with the correct pole prescription for
the free Feynman propagator.
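
The rearrangement of the i0+'s mentioned above can be made explicit. The sketch below assumes that eq. (1.123), which lies outside this excerpt, is the standard Feynman propagator i/(k² − m² + i0+) with k² ≡ k₀² − 𝐤²; it uses only the fact that a positive infinitesimal multiplied by a positive quantity is again a positive infinitesimal.

```latex
\begin{align*}
(1 + i0^+)\,k_0^2 - (1 - i0^+)\,(\mathbf{k}^2 + m^2)
  &= k_0^2 - \mathbf{k}^2 - m^2 + i0^+\,\underbrace{\big(k_0^2 + \mathbf{k}^2 + m^2\big)}_{>\,0} \\
  &\simeq k^2 - m^2 + i0^+ ,
  \qquad\text{so that}\qquad
  \frac{i}{(1 + i0^+)\,k_0^2 - (1 - i0^+)\,(\mathbf{k}^2 + m^2)} \;\simeq\; \frac{i}{k^2 - m^2 + i0^+} .
\end{align*}
```

Only the sign of the imaginary part matters for the pole prescription, which is why the positive factor can be absorbed.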

2.5 Functional determinants


In the earlier sections of this chapter, we have been a bit cavalier with Gaussian
integrations, since we have disregarded the constant prefactors they produce. This
was legitimate in the problems we were considering, since the normalization of the
generating functional can be fixed by hand. However, in certain situations, these
prefactors depend crucially on quantities that have a physical significance, e.g. on a
background field.
In order to compute this prefactor, let us start from a simple 1-dimensional
Gaussian integral,
\[
\int_{-\infty}^{+\infty} dx\;\, e^{-\frac{1}{2} a x^2} = \sqrt{\frac{2\pi}{a}} .
\tag{2.65}
\]
The first stage of generalization is to replace x by an n-component vector x ≡
(x1 , · · · , xn ), and the positive number a by a positive definite symmetric matrix A,
and to consider the integral
\[
I(A) \equiv \int \prod_{i=1}^{n} dx_i\;\; e^{-\frac{1}{2}\, x^T A\, x} .
\tag{2.66}
\]

This integral can be calculated by representing the vector x in the orthonormal basis
made of the eigenvectors of A (such a basis exists, since A is symmetric). The
measure ∏i dxi is unchanged, because the diagonalization of the matrix can be done
by an orthogonal transformation. Therefore, the above integral also reads
\[
I(A) = \int \prod_{i=1}^{n} dy_i\;\; e^{-\frac{1}{2} \sum_i a_i\, y_i^2} = \prod_{i=1}^{n} \sqrt{\frac{2\pi}{a_i}} ,
\tag{2.67}
\]

where the numbers ai are the eigenvalues of A. This result can be written in a much
more compact form:

\[
I(A) = \frac{(2\pi)^{n/2}}{\sqrt{\det A}} .
\tag{2.68}
\]
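
Eq. (2.68) can be checked by brute force in the smallest non-trivial case, n = 2. The Python sketch below is an illustration only: the symmetric matrix [[a, b], [b, d]], the integration box and the grid step are arbitrary choices, and the function names are hypothetical. It compares a midpoint-rule estimate of the integral (2.66) with the closed form.

```python
import math


def gaussian_integral_2d(a, b, d, half_width=8.0, step=0.05):
    """Midpoint-rule estimate of
    int dx dy exp(-(a x^2 + 2 b x y + d y^2) / 2),
    i.e. eq. (2.66) for the symmetric matrix A = [[a, b], [b, d]].
    """
    n = int(round(2.0 * half_width / step))
    total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * step
        for j in range(n):
            y = -half_width + (j + 0.5) * step
            total += math.exp(-0.5 * (a * x * x + 2.0 * b * x * y + d * y * y))
    return total * step * step


def closed_form(a, b, d):
    """(2 pi)^(n/2) / sqrt(det A) for n = 2, as in eq. (2.68)."""
    return 2.0 * math.pi / math.sqrt(a * d - b * b)


if __name__ == "__main__":
    print(gaussian_integral_2d(2.0, 1.0, 3.0), closed_form(2.0, 1.0, 3.0))
```

The matrix must be positive definite for the integral to converge; here det A = 5 and both eigenvalues are positive.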
This reasoning can be generalized to the functional case by writing:
\[
\int \big[D\phi(x)\big]\; \exp\Big\{ -\frac{1}{2} \int d^4x\, d^4y\;\; \phi(x)\, A(x,y)\, \phi(y) \Big\} = \Big[\det(A)\Big]^{-1/2} ,
\tag{2.69}
\]

where A(x, y) is a symmetric operator. In this formula, we have still disregarded
some truly constant (and infinite) prefactors, made of powers of 2π. One can also
generalize this Gaussian integral to the case where the vector x is complex,
\[
J(A) \equiv \int \prod_{i=1}^{n} dx_i\, dx_i^*\;\; e^{-x^\dagger A\, x} = \frac{(2\pi)^n}{\det A} ,
\tag{2.70}
\]
where A is a Hermitian matrix. The functional analogue of this integral is
\[
\int \big[D\phi(x)\,D\phi^*(x)\big]\; \exp\Big\{ -\int d^4x\, d^4y\;\; \phi^*(x)\, A(x,y)\, \phi(y) \Big\} = \Big[\det(A)\Big]^{-1} .
\tag{2.71}
\]

Zeta function regularization : Despite the elegance of this formula, one should
keep in mind that the functional determinant det A is most often infinite, because the
spectrum of the operator extends to infinity. A common regularization technique for
functional determinants is based on a generalization of Riemann's ζ function. Let the
λn be the eigenvalues of A, and define:
\[
\zeta_A(s) \equiv \mathrm{tr}\,\big(A^{-s}\big) = \sum_n \frac{1}{\lambda_n^s} .
\tag{2.72}
\]

(The function ζA is called the zeta function of the operator A.) The determinant of A
is related to this function by
\[
\det A = \exp\big( -\zeta_A'(0) \big) .
\tag{2.73}
\]

The sum over n in the definition of ζA usually converges only if Re (s) is large
enough (how large depends on the distribution of eigenvalues at large n), but not for
s = 0. However, like in the case of Riemann’s zeta function, ζA (s) can be analytically
continued to most of the complex s-plane, which provides a regularized definition of
the determinant.
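
A classic illustration of eq. (2.73), added here as a sketch and not taken from the text, is the hypothetical operator whose eigenvalues are the positive integers, λn = n. Its zeta function is Riemann's, and the known value ζ′(0) = −½ log(2π) assigns a finite value to the divergent product of all integers:

```latex
\begin{align*}
\zeta_A(s) &= \sum_{n=1}^{\infty} \frac{1}{n^s} = \zeta(s)
  \qquad (\text{convergent for } \mathrm{Re}\,(s) > 1) , \\
\det A &= \text{``}\prod_{n=1}^{\infty} n\,\text{''}
  = \exp\big(-\zeta'(0)\big)
  = \exp\Big(\tfrac{1}{2}\log(2\pi)\Big)
  = \sqrt{2\pi} .
\end{align*}
```

The formal identity behind eq. (2.73) is ζ_A′(s) = −Σn log(λn) λn^{−s}, which at s = 0 would give −log det A if the sum converged.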

Diagrammatic interpretation : Let us consider as an example the operator
Aϕ ≡ □ + (λ/2)ϕ², where ϕ(x) is a background field. The inverse of this operator is the
propagator of a scalar particle (with a φ⁴ interaction) over the background field ϕ.
We can skip the regularization step if we make a ratio with the determinant of the
similar operator with no background field:

\[
R \equiv \frac{\det\big(\Box\big)}{\det\big(\Box + \frac{\lambda}{2}\varphi^2\big)} .
\tag{2.74}
\]
A very useful formula relates the determinant of an operator to the trace of its
logarithm,
\[
\det A = \exp\big( \mathrm{Tr}\, \log A \big) .
\tag{2.75}
\]
This formula can be proven (heuristically, since the objects we are manipulating may
not be finite) by writing both sides of the equation in terms of the eigenvalues of A:
\[
\det A = \prod_n \lambda_n = \exp\Big( \sum_n \log \lambda_n \Big) = \exp\big( \mathrm{Tr}\, \log A \big) .
\tag{2.76}
\]
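
For a finite matrix, eq. (2.76) is an exact identity that can be verified in a few lines. The Python sketch below uses a real symmetric 2x2 matrix [[a, b], [b, d]] with positive eigenvalues; the matrix entries and function names are illustrative choices, not from the text.

```python
import math


def eigenvalues_sym2(a, b, d):
    """Eigenvalues of the real symmetric matrix [[a, b], [b, d]]."""
    mean = 0.5 * (a + d)
    radius = math.sqrt(0.25 * (a - d) ** 2 + b ** 2)
    return mean - radius, mean + radius


def det_direct(a, b, d):
    """Determinant computed from the matrix entries."""
    return a * d - b * b


def det_from_trace_log(a, b, d):
    """exp(Tr log A), with the trace taken in the eigenbasis.

    Requires both eigenvalues to be positive so that the logarithms
    are real.
    """
    lam1, lam2 = eigenvalues_sym2(a, b, d)
    return math.exp(math.log(lam1) + math.log(lam2))


if __name__ == "__main__":
    print(det_direct(2.0, 1.0, 3.0), det_from_trace_log(2.0, 1.0, 3.0))
```

In the functional case the same manipulation is only heuristic, as noted above, because the sum over eigenvalues need not converge.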

Therefore, the ratio defined in eq. (2.74) can be rewritten as
\[
R = \exp\Big\{ -\mathrm{Tr}\, \log\Big( 1 + \frac{\lambda}{2}\varphi^2\, \Box^{-1} \Big) \Big\} .
\tag{2.77}
\]

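Writing □⁻¹ = iG0F, and expanding the logarithm gives
\[
R = \exp\Big\{ \sum_{n=1}^{\infty} \frac{1}{n}\; \mathrm{Tr}\, \Big[ \Big( -i\,\frac{\lambda}{2}\varphi^2\, G^0_F \Big)^n \Big] \Big\} .
\tag{2.78}
\]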

The argument of the exponential has a simple interpretation as a 1-loop diagram made
of a line dressed with insertions of the background field, the index n being the number
of such insertions:

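\[
\frac{1}{n}\; \mathrm{Tr}\, \Big[ \Big( -i\,\frac{\lambda}{2}\varphi^2\, G^0_F \Big)^n \Big]
= \Big[ \text{1-loop diagram: a closed propagator loop with } n \text{ insertions of the background field} \Big] .
\tag{2.79}
\]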

Each of the insertions of the background field (shown by lines terminated by a dot
in the above diagram) corresponds to a factor −i(λ/2)ϕ². The prefactor 1/n is the
symmetry factor for the cyclic permutations of the n insertions. The argument of the
exponential is a sum of connected 1-loop diagrams. Taking the exponential to obtain
the ratio R simply produces all the multiply connected graphs made of products of
such 1-loop diagrams.

2.6 Quantum effective action


2.6.1 Definition
The action S[φ] that enters in the path integral representation of the generating
functional Z[j] is the classical action. Its parameters reflect the interactions among the
constituents of the system at tree level, but loop corrections are necessary in order to
express higher order corrections. The quantum effective action, denoted Γ [φ], is defined
as the functional that would produce the all-orders value of the interactions solely
from tree-level contributions. Γ [φ] should coincide with the classical action at lowest
order of perturbation theory, but also encapsulates all the higher order corrections.
One may write Γ [φ] formally as
\[
\Gamma[\phi] \equiv \sum_{n=2}^{\infty} \frac{1}{n!} \int d^4x_1 \cdots d^4x_n\;\; \phi(x_1) \cdots \phi(x_n)\;\; \Gamma_n(x_1, \cdots, x_n) .
\tag{2.80}
\]

Γ2 (x1 , x2 ) is therefore the inverse of the exact propagator, Γ4 (x1 , · · · , x4 ) is the exact
4-point function (in coordinate space), etc.

2.6.2 Relation between Γ [φ] and W[j]


Until now, we have introduced the generating functional of the vacuum expectation
value of time-ordered products of fields, Z[j], as well as the functional W[j] ≡ log Z[j]
that generates the subset made of connected Feynman graphs. Recall that in terms of
path integrals,
\[
Z[j] = e^{W[j]} = \int \big[D\phi(x)\big]\; \exp\Big[ iS[\phi(x)] + i\int d^4x\;\, j(x)\,\phi(x) \Big] .
\tag{2.81}
\]

Let us replace the classical action S[φ] by the quantum effective action Γ [φ] in the
previous formula, to define
\[
Z_\Gamma[j] = e^{W_\Gamma[j]} = \int \big[D\phi(x)\big]\; \exp\Big[ i\Gamma[\phi(x)] + i\int d^4x\;\, j(x)\,\phi(x) \Big] .
\tag{2.82}
\]

This functional generates graphs whose building blocks are the exact propagator
(Γ2⁻¹), and the exact vertices (Γ3, Γ4, ···). From the definition of Γ [φ] as the “action”
that would generate the exact theory at tree level, we conclude that
\[
W_\Gamma[j]\big|_{\rm tree} = W[j] .
\tag{2.83}
\]
In other words, the tree diagrams of WΓ [j] should be equal to the all-orders W[j]. The
tree diagrams may be isolated by reintroducing Planck’s constant in the definition of
ZΓ [j] as follows
\[
Z_\Gamma[j; \hbar] = e^{W_\Gamma[j;\hbar]} = \int \big[D\phi(x)\big]\; \exp\Big[ \frac{i}{\hbar}\Big( \Gamma[\phi(x)] + \int d^4x\;\, j(x)\,\phi(x) \Big) \Big] .
\tag{2.84}
\]


As we have discussed in section 1.7.6, the order in ħ of a connected graph is
\[
\hbar^{\,n_L - 1} ,
\tag{2.85}
\]

where nL is the number of loops of the graph. Therefore, the functional WΓ [j; h̄] has
the following loop expansion:

\[
W_\Gamma[j; \hbar] = \sum_{n_L = 0}^{\infty} \hbar^{\,n_L - 1}\; \underbrace{W_{\Gamma, n_L}[j]}_{n_L \text{ loops}} ,
\tag{2.86}
\]

and the tree level contributions in WΓ [j] are the terms that survive in the formal limit
h̄ → 0:

\[
W_\Gamma[j]\big|_{\rm tree} = \lim_{\hbar \to 0}\; \hbar\; W_\Gamma[j; \hbar] .
\tag{2.87}
\]

But from our discussion of the classical limit of path integrals in section 2.2, we know
that the limit h̄ → 0 corresponds to the extremum of the argument of the exponential,
i.e.
\[
\frac{\delta \Gamma[\phi]}{\delta \phi(x)} + j(x) = 0 .
\tag{2.88}
\]

Note that this equation is the analogue of the usual Euler-Lagrange equation of
motion, with the quantum effective action in place of the classical action. This
equation implicitly defines φ as a function of j, that we will denote φj , in terms of
which we can write
\[
e^{W_\Gamma[j;\hbar]} \underset{\hbar \to 0}{\approx} \exp\Big[ \frac{i}{\hbar}\Big( \Gamma[\phi_j(x)] + \int d^4x\;\, j(x)\,\phi_j(x) \Big) \Big] ,
\tag{2.89}
\]

which leads to the following relationship between the quantum effective action and
the generating functional of connected graphs:
\[
\Gamma[\phi_j] = -i\, W[j] - \int d^4x\;\, j(x)\,\phi_j(x) .
\tag{2.90}
\]

Therefore, Γ [φ] can be obtained as the Legendre transform of the generating functional
W[j] of the connected graphs.
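
As a consistency check of eqs. (2.88) and (2.90), one can work out the free theory, where everything is Gaussian. The sketch below starts from W₀[j] = log Z₀[j] given by eq. (2.62), writes G ≡ G⁰_F, uses a condensed notation in which integrals over repeated arguments are implicit, and takes G⁻¹ = i(□ + m²) up to the i0⁺ terms, as follows from the free Lagrangian in eq. (2.61). The trial form of Γ₀ is an ansatz that is then verified.

```latex
\begin{align*}
W_0[j] &= -\tfrac{1}{2}\, j\, G\, j ,
\qquad
\Gamma_0[\phi] \equiv \tfrac{i}{2}\, \phi\, G^{-1} \phi \quad \text{(trial form)} , \\
\text{eq. (2.88):}\quad
0 &= \frac{\delta \Gamma_0}{\delta \phi} + j = i\, G^{-1} \phi + j
\quad\Longrightarrow\quad \phi_j = i\, G\, j , \\
\Gamma_0[\phi_j] + j\, \phi_j
 &= \tfrac{i}{2}\,(i\,G\,j)\, G^{-1} (i\,G\,j) + i\, j\, G\, j
  = \tfrac{i}{2}\, j\, G\, j
  = -i\, W_0[j] \quad \text{(eq. (2.90))} , \\
\Gamma_0[\phi] &= \tfrac{i}{2}\, \phi\; i\,(\Box + m^2)\, \phi
  = -\tfrac{1}{2} \int d^4x\;\, \phi\,(\Box + m^2)\,\phi
  = S_0[\phi] .
\end{align*}
```

In the absence of interactions the effective action therefore reduces to the classical action, as expected since there are no loop corrections to resum.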
Note that the “quantum equation of motion” (2.88) may also be viewed as defining
j in terms of φ, that we shall denote jφ . Eq. (2.90) may therefore also be written as
\[
\Gamma[\phi] = -i\, W[j_\phi] - \int d^4x\;\, j_\phi(x)\,\phi(x) .
\tag{2.91}
\]


Taking a functional derivative of this equation with respect to φ(y) and using the
chain rule, we obtain
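\[
\underbrace{\frac{\delta \Gamma[\phi]}{\delta \phi(y)}}_{-j_\phi(y)}
= -i \int d^4x\; \left. \frac{\delta W[j]}{\delta j(x)} \right|_{j = j_\phi} \frac{\delta j_\phi(x)}{\delta \phi(y)}
\; - \; j_\phi(y) \; - \int d^4x\; \frac{\delta j_\phi(x)}{\delta \phi(y)}\; \phi(x) .
\tag{2.92}
\]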

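This leads to
\[
\phi(x) = -i \left. \frac{\delta W[j]}{\delta j(x)} \right|_{j = j_\phi} ,
\qquad \text{or equivalently} \qquad
\phi_j(x) = -i\, \frac{\delta W[j]}{\delta j(x)} = \big\langle \phi(x) \big\rangle_j .
\tag{2.93}
\]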

In other words, φj is the connected 1-point function (i.e. the vacuum expectation
value of the field) in the presence of the source j.

2.6.3 Second derivative of the effective action

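Differentiating eq. (2.88) with respect to j(y) gives:
\[
\begin{aligned}
\delta(x - y) &= -\frac{\delta}{\delta j(y)}\, \frac{\delta \Gamma[\phi_j]}{\delta \phi_j(x)} \\
&= -\int d^4z\;\; \frac{\delta \phi_j(z)}{\delta j(y)}\;\; \frac{\delta^2 \Gamma[\phi_j]}{\delta \phi_j(x)\, \delta \phi_j(z)} \\
&= i \int d^4z\;\; \underbrace{\frac{\delta^2 W[j]}{\delta j(y)\, \delta j(z)}}_{G(y,z)_{\rm connected}}\;\; \underbrace{\frac{\delta^2 \Gamma[\phi_j]}{\delta \phi_j(z)\, \delta \phi_j(x)}}_{\Gamma_2(z,x)} .
\end{aligned}
\tag{2.94}
\]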

This formula shows a posteriori that (up to a factor i) the coefficient Γ2 in the expansion
(2.80) is indeed the inverse of the exact connected 2-point function, as was expected
from our request that the effective action Γ [φ] reproduces the full content of the
theory.
By parameterizing the inverse propagator in terms of a self-energy Σ as follows,

\[
G^{-1} = G_0^{-1} + i\,\Sigma ,
\tag{2.95}
\]

we see that the second derivative of the quantum effective action is nothing but the
self-energy. An important class of diagrams in this discussion are the one-particle
irreducible (1PI) diagrams, which are those that remain connected if one cuts any one
of their internal propagators. For instance, the first of these diagrams is 1PI while the
second one is not:

1PI diagram : [diagram not reproduced]

Non-1PI diagram : [diagram not reproduced]

The concept of 1PI diagrams is crucial in the summation of a self-energy to all orders.
Indeed, repeated insertions of a non-1PI self-energy would lead to the erroneous
multiple countings of identical graphs. To avoid this, a self-energy should only
contain 1PI graphs, and we conclude that the second derivative of the quantum
effective action is one-particle irreducible.

2.6.4 One-particle irreducibility

The quantum effective action Γ [φ] is in fact one-particle irreducible at all orders in φ,
not just at quadratic order in φ as the above argument suggests. By exponentiating
eq. (2.91) and using the path integral definition of exp(W[j]), we first obtain

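\[
\begin{aligned}
e^{i\,\Gamma[\phi]} &= \int \big[D\varphi\big]\; \exp i\Big( S[\varphi] + \int d^4x\;\, j(x)\,\big(\varphi(x) - \phi(x)\big) \Big)_{j = j_\phi} \\
&= \int \big[D\varphi\big]\; \exp i\Big( S[\varphi + \phi] + \int d^4x\;\, j(x)\,\varphi(x) \Big)_{j = j_\phi} .
\end{aligned}
\tag{2.96}
\]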

(In the second line, we have shifted by φ the integration variable ϕ.) Thus, the
quantum effective action can be obtained from a shifted classical action, to which is
added a source jφ that implicitly depends on φ via the quantum equation of motion
(2.88). The expansion of the shifted classical action S[ϕ + φ] leads to a number of
vertices, some of which are φ-dependent. Thus, Γ [φ] is the sum of the connected
(because we must take the logarithm in order to extract Γ [φ]) vacuum graphs built
with these φ-dependent vertices and the φ-dependent source jφ . To every line of
such a graph is associated a free propagator G0 , determined from the quadratic term
in the action.

A very important property is the fact that the expectation value of ϕ(x) with this
shifted action vanishes:


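$$ \begin{aligned} \big\langle \phi(x) \big\rangle &\equiv \int [D\phi]\; \phi(x)\, \exp\Big\{ i\Big( S[\phi+\varphi] + \int j\phi \Big)\Big\}_{j=j_\varphi} \\ &= \int [D\phi]\; \big(\phi(x)-\varphi(x)\big)\, \exp\Big\{ i\Big( S[\phi] + \int d^4x\; j\,(\phi-\varphi) \Big)\Big\}_{j=j_\varphi} \\ &= -\varphi(x)\, e^{i\Gamma[\varphi]} + e^{-i\int j\varphi}\, \frac{\delta}{i\,\delta j(x)}\, e^{W[j]}\, \bigg|_{j=j_\varphi} \\ &= e^{i\Gamma[\varphi]}\, \underbrace{\Big( \frac{\delta W[j]}{i\,\delta j(x)} - \varphi(x) \Big)_{j=j_\varphi}}_{0} = 0 . \end{aligned} \tag{2.97} $$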

Note that in order to obtain the final zero, it is crucial that j be set to jφ at the end. Let
us now consider a one-particle reducible vacuum graph G that may possibly contribute
to Γ [φ]. Because it is reducible, this graph contains at least one bare propagator that
connects two subgraphs A and B, such that the two subgraphs become disconnected
when removing this propagator,
$$ G_{AB} \equiv \int_{x,y} A(x)\; G_0(x,y)\; B(y) . \tag{2.98} $$

When summing over all graphs that may enter in B, we get


$$ \sum_B G_{AB} = \int d^4x\, d^4y\; A(x)\; G_0(x,y)\; \big\langle \phi(y) \big\rangle = 0 , \tag{2.99} $$

thanks to the previous result on the expectation value of ϕ. Therefore, the one-particle
reducible graphs do not contribute to Γ [φ], which generalizes to all orders in φ what
we had already seen for the quadratic terms.

2.6.5 One-loop effective action


At one loop, one may obtain a closed expression for the quantum effective action. For
this, write the Lagrangian as a renormalized Lagrangian plus counter-terms:

$$ \mathcal{L} \equiv \mathcal{L}_r(\varphi_r) + \Delta\mathcal{L}(\varphi_r) , \tag{2.100} $$

both depending on the renormalized field φr . We will denote Sr and ∆S the corresponding
actions. Likewise, we write the external source j = jr + ∆j, where jr is the
current that solves the following equation:

$$ \frac{\delta S_r[\varphi_r]}{\delta\varphi_r(x)}\bigg|_{\phi} + j_r(x) = 0 , \tag{2.101} $$
i.e. the current that solves at lowest order the defining equation of the effective action.
The correction ∆j is then adjusted order by order so that the expectation value of the
field remains equal to ϕ at all orders,

$$ \phi(x) = \big\langle \varphi_r(x) \big\rangle_{j_r+\Delta j} . \tag{2.102} $$

In the path integral representation of the generating functional Z[j], we write the field
as φr = ϕ + η:
$$ Z[j] = \int [D\eta(x)]\; e^{\, i\left( S_r[\phi+\eta] + \Delta S[\phi+\eta] + \int d^4x\, (j_r+\Delta j)(\phi+\eta) \right)} , \tag{2.103} $$

and we expand the argument of the exponential in powers of η up to quadratic order:


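$$ \begin{aligned} S_r[\phi+\eta] + \int d^4x\; j_r\,(\phi+\eta) &= S_r[\phi] + \int d^4x\; j_r(x)\,\phi(x) \\ &\quad + \int d^4x\; \Big( \frac{\delta S_r[\varphi_r]}{\delta\varphi_r(x)}\bigg|_{\phi} + j_r \Big)\, \eta(x) \\ &\quad + \frac{1}{2} \int d^4x\, d^4y\; \eta(x)\, \frac{\delta^2 S_r[\varphi_r]}{\delta\varphi_r(x)\,\delta\varphi_r(y)}\bigg|_{\phi}\, \eta(y) \\ &\quad + \cdots \end{aligned} \tag{2.104} $$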

Note that the term linear in η is zero by virtue of eq. (2.101). Therefore, we may
rewrite Z[j] as follows
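$$ Z[j] = e^{\, i\left( S_r[\phi] + \Delta S[\phi] + \int d^4x\, j\phi \right)} \int [D\eta(x)]\; e^{\, i\left( S_\phi[\eta] + \Delta S_\phi[\eta] \right)} , \tag{2.105} $$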

where we denote
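$$ S_\phi[\eta] \equiv \frac{1}{2} \int d^4x\, d^4y\; \eta(x)\, \frac{\delta^2 S_r[\varphi_r]}{\delta\varphi_r(x)\,\delta\varphi_r(y)}\bigg|_{\phi}\, \eta(y) + \cdots \tag{2.106} $$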

(Likewise, ∆Sϕ [η] results from the expansion in powers of η of the counter-terms.)
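At one loop, it is sufficient to keep only the quadratic terms in η, and the path integral
gives a determinant:

$$ \left[ \det\Big( -\frac{i}{2}\, \frac{\delta^2 S_r[\varphi_r]}{\delta\varphi_r(x)\,\delta\varphi_r(y)}\bigg|_{\phi} \Big) \right]^{-1/2} = \exp\left\{ -\frac{1}{2}\, {\rm tr}\, \ln\Big( -\frac{i}{2}\, \frac{\delta^2 S_r[\varphi_r]}{\delta\varphi_r(x)\,\delta\varphi_r(y)}\bigg|_{\phi} \Big) \right\} . \tag{2.107} $$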
At this order, the generating functional of connected graphs reads3


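$$ W[j] = i\Big( S_r[\phi] + \Delta S[\phi] + \int d^4x\; j\phi \Big) - \frac{1}{2}\, {\rm tr}\, \ln\Big( \frac{\delta^2 S_r[\varphi_r]}{\delta\varphi_r(x)\,\delta\varphi_r(y)}\bigg|_{\phi} \Big) + \cdots \tag{2.108} $$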

from which we obtain the following quantum effective action

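$$ \Gamma[\phi] = S_r[\phi] + \Delta S[\phi] + \frac{i}{2}\, {\rm tr}\, \ln\Big( \frac{\delta^2 S_r[\varphi_r]}{\delta\varphi_r(x)\,\delta\varphi_r(y)}\bigg|_{\phi} \Big) + \cdots \tag{2.109} $$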

Note that the object inside the logarithm is the inverse of the propagator dressed by
the background field ϕ.

2.7 Two-particle irreducible effective action

2.7.1 Definition and equations of motion

The quantum effective action Γ [φ] studied in the previous section can be extended
into a functional Γ [φ, G] that depends on a field φ and a propagator G. The starting
point of this derivation is to introduce a second source k(x, y) that couples to a pair
of fields ϕ(x)ϕ(y). The corresponding generating functional W[j, k] for connected
graphs is given by
$$ e^{W[j,k]} = \int [D\phi]\, \exp\Big\{ i\Big( S[\phi] + \int_x j(x)\,\phi(x) + \frac{1}{2} \int_{x,y} k(x,y)\, \phi(x)\phi(y) \Big)\Big\} . \tag{2.110} $$

In terms of graphs, W[j, k] is the sum of the connected vacuum graphs built with the
bare propagator and the vertices defined by the classical action S[ϕ], with the external
source j, and with a kind of non-local mass term k(x, y). Let us denote

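$$ \begin{aligned} \frac{\delta W[j,k]}{i\,\delta j(x)} &\equiv \varphi_{j,k}(x) , \\ \frac{\delta W[j,k]}{i\,\delta k(x,y)} &\equiv \frac{1}{2} \Big( \varphi_{j,k}(x)\,\varphi_{j,k}(y) + G_{j,k}(x,y) \Big) . \end{aligned} \tag{2.111} $$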

In the second equation, we have separated a disconnected part φj,k (x)φj,k (y) and a
connected two-point function Gj,k (x, y). Both the field φj,k and the propagator Gj,k
depend on the sources, which we indicate by the subscript j, k. Conversely, we may
formally invert these equations to define φ, G dependent sources, jφ,G and kφ,G .
3 We have dropped a factor −i/2 inside the tr ln, since it only produces an additive constant.
Then, the Legendre transform that defined Γ [φ] from W[j] can be generalized into
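$$ \begin{aligned} \Gamma[\varphi,G] = {} & -i\, W[j_{\varphi,G}, k_{\varphi,G}] - \int d^4x\; j_{\varphi,G}(x)\,\varphi(x) \\ & - \frac{1}{2} \int d^4x\, d^4y\; k_{\varphi,G}(x,y)\, \Big( \varphi(x)\varphi(y) + G(x,y) \Big) . \end{aligned} \tag{2.112} $$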

By taking derivatives with respect to φ(x) or G(x, y), we obtain the following
equations:
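$$ \begin{aligned} \frac{\delta\Gamma[\varphi,G]}{\delta\varphi(x)} + j_{\varphi,G}(x) + \int d^4y\; k_{\varphi,G}(x,y)\,\varphi(y) &= 0 , \\ \frac{\delta\Gamma[\varphi,G]}{\delta G(x,y)} + \frac{1}{2}\, k_{\varphi,G}(x,y) &= 0 . \end{aligned} \tag{2.113} $$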

Note that the first of these equations generalizes the quantum equation of motion
(2.88) with the addition of a self-energy kφ,G (x, y).

2.7.2 Two-particle irreducibility

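From the Legendre transform of eq. (2.112), we obtain the following path integral
representation of the functional Γ [φ, G]:

$$ \begin{aligned} e^{i\Gamma[\varphi,G]} &= \int [D\phi]\, \exp\Big\{ i\Big( S[\phi] + \int_x j(x)\big(\phi(x)-\varphi(x)\big) \\ &\qquad\qquad + \frac{1}{2} \int_{x,y} k(x,y)\, \big( \phi(x)\phi(y) - \varphi(x)\varphi(y) - G(x,y) \big) \Big)\Big\} \\ &= \int [D\phi]\, \exp\Big\{ i\Big( S[\phi+\varphi] + \int_x j(x)\,\phi(x) \\ &\qquad\qquad + \frac{1}{2} \int_{x,y} k(x,y)\, \big( \phi(x)\phi(y) + 2\,\phi(x)\varphi(y) - G(x,y) \big) \Big)\Big\} , \end{aligned} \tag{2.114} $$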

where we have omitted the subscript φ, G on the sources j, k for the sake of brevity.
From the second equation, we first obtain
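$$ \begin{aligned} \big\langle \phi(x) \big\rangle &= \int [D\phi]\; \phi(x)\, \exp\Big\{ i\Big( S[\phi+\varphi] + \int \Big( j\phi + \frac{k}{2}\, \big( \phi\phi + 2\phi\varphi - G \big) \Big) \Big)\Big\} \\ &= e^{i\Gamma[\varphi,G]}\, \Big( \frac{\delta W[j,k]}{i\,\delta j(x)} - \varphi(x) \Big)_{j=j_{\varphi,G},\; k=k_{\varphi,G}} = 0 . \end{aligned} \tag{2.115} $$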
Like in the case of the 1PI functional Γ [φ], this identity ensures that the one-particle
reducible graphs do not contribute to Γ [φ, G]. But as we shall see now, the functional
Γ [φ, G] is limited to a much more restricted set of graphs, since only the two-particle
irreducible graphs contribute, i.e. the graphs that cannot be made disconnected by
removing two arbitrary propagators. Consider a 2-particle reducible graph,

Z x
GAB ≡ A B , (2.116)
x,y y

in which we have exhibited the two bare propagators that would disconnect the graph
if removed. Summing over the graphs that can contribute to B, we may write this as
$$ \sum_B G_{AB} = \int d^4x\, d^4y\; A(x,y)\; \big\langle \phi(x)\phi(y) \big\rangle_c , \tag{2.117} $$

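(the subscript c indicates that we keep only the connected part of the two point
function) with

$$ \begin{aligned} \big\langle \phi(x)\phi(y) \big\rangle_c &\equiv e^{-i\Gamma[\varphi,G]} \int [D\phi]\; \phi(x)\phi(y)\; e^{\, i\left( S[\phi+\varphi] + \int \left( j\phi + \frac{k}{2} [\phi\phi + 2\phi\varphi - G] \right) \right)} \\ &= -\varphi(x)\varphi(y) + 2\, \underbrace{e^{-i\Gamma[\varphi,G]}\, e^{-i \int \left( j\varphi + \frac{k}{2} (G + \varphi\varphi) \right)}}_{e^{-W[j,k]}}\; \frac{\delta\, e^{W[j,k]}}{i\,\delta k(x,y)} \\ &= G(x,y) . \end{aligned} \tag{2.118} $$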

In the second equality, we have ignored some terms that have already been shown to
vanish when studying ⟨ϕ(x)⟩, and we have extracted the combination ϕ(x)ϕ(y) by
differentiating with respect to k(x, y). From this identity, we obtain

XZ x Z
A B = A . (2.119)
B x,y y x,y
G(x,y)

In other words, when summing over all the possible graphs contributing to B, the
2-particle reducible block is replaced by a single propagator G(x, y), and the resulting
graph is two-particle irreducible. Thus, the functional Γ [φ, G], when expressed in
terms of the 2-point function G, is made only of 2-particle irreducible graphs, and its
derivatives with respect to the field φ are the 2-particle irreducible n-point functions.
Thus, Γ [φ, G] is the generating functional in φ of the 2PI correlation functions.

2.7.3 Loop expansion of Γ [φ, G]

2PI functional at null field : Consider first the 2PI effective action at null field,
Γ [0, G]. At φ ≡ 0, we have

$$ -2\, \frac{\delta\Gamma[0,G]}{\delta G} = k_{0,G} . \tag{2.120} $$
Using the path integral representation (2.114), and replacing k0,G by the above
equality, we get4

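$$ \begin{aligned} e^{i\Gamma[0,G]} &= \int [D\phi]\; e^{\, i\left( S[\phi] + {\rm tr}\left( \frac{\delta\Gamma[0,G]}{\delta G}\, [G - \phi\phi] \right) + \int j_{0,G}\,\phi \right)} \\ &= \sum \Big\{ \text{2PI vacuum diagrams built with the propagator } G \text{ and the vertices from } S[\phi] \Big\} . \end{aligned} \tag{2.121} $$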

The first equality is an implicit identity obeyed by Γ [0, G]. The diagrammatic
interpretation in the second line follows from the discussion at the end of the previous
subsection. In the right hand side, the term in j0,G ϕ cancels the 1-particle reducible
tadpole contributions, while the term in G − ϕϕ is the one that eliminates the 2-
particle reducible ones (by replacing chains like G0 ΣG0 Σ · · · ΣG0 by G). Note that
the bare propagator G0 defined by the quadratic part of the classical action S[ϕ] does
not appear in the final result for Γ [0, G], since it is replaced systematically by G. Only
the interaction terms of the classical action matter, since they define the vertices that
connect the G’s in the diagrammatic representation of Γ [0, G].

Legendre transform at fixed k : Then, it is useful to introduce the first Legendre
transform of W[j, k], i.e. the 1PI effective action at fixed k,

$$ \Gamma_k[\varphi] \equiv -i\, W[j_{\varphi,k}, k] - \int d^4x\; j_{\varphi,k}(x)\,\varphi(x) , \tag{2.122} $$

where the source jφ,k now has an implicit dependence on the field φ and on the
second source k. This functional obeys:
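$$ \begin{aligned} \frac{\delta\Gamma_k[\varphi]}{\delta k(x,y)} &= \frac{\delta W}{i\,\delta k(x,y)} + \int d^4z\; \underbrace{\frac{\delta W}{i\,\delta j(z)}\bigg|_{j=j_{\varphi,k}}}_{\varphi(z)}\, \frac{\delta j_{\varphi,k}(z)}{\delta k(x,y)} - \int d^4z\; \frac{\delta j_{\varphi,k}(z)}{\delta k(x,y)}\, \varphi(z) \\ &= \frac{\delta W}{i\,\delta k(x,y)} . \end{aligned} \tag{2.123} $$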
4 We use the compact notation $\int_{x,y} A(x,y)\,B(x,y) = {\rm tr}\,(AB)$.
Given this identity, it is natural to define G(x, y) as follows,


$$ \frac{\delta\Gamma_k[\varphi]}{\delta k(x,y)} \equiv \frac{1}{2} \Big( \varphi(x)\varphi(y) + G(x,y) \Big) . \tag{2.124} $$
In this equation, we may either view G as a function of k, φ or k as a function (that
we denote kφ,G ) of φ, G. Adopting the latter point of view, we may perform a second
Legendre transform to obtain a functional of φ and G:
$$ \Gamma[\varphi,G] \equiv \Gamma_{k_{\varphi,G}}[\varphi] - \frac{1}{2}\, {\rm tr}\, \Big( k_{\varphi,G}\, [G + \varphi\varphi] \Big) . \tag{2.125} $$
(We use the same notation Γ [φ, G] because we shall prove shortly that this functional
is identical to the one we have defined earlier by a double Legendre transform.) This
definition leads to
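$$ \begin{aligned} \frac{\delta\Gamma[\varphi,G]}{\delta G(x,y)} &= -\frac{1}{2}\, k(x,y) + \underbrace{{\rm tr}\, \Big( \frac{\delta\Gamma_k}{\delta k}\bigg|_{k_{\varphi,G}} \frac{\delta k_{\varphi,G}}{\delta G} \Big) - \frac{1}{2}\, {\rm tr}\, \Big( \frac{\delta k_{\varphi,G}}{\delta G}\, [G + \varphi\varphi] \Big)}_{0} , \\ \frac{\delta\Gamma[\varphi,G]}{\delta\varphi(x)} &= -j_{\varphi,k} - \int_y k_{\varphi,G}(x,y)\,\varphi(y) \\ &\quad + \underbrace{{\rm tr}\, \Big( \frac{\delta\Gamma_k}{\delta k}\bigg|_{k_{\varphi,G}} \frac{\delta k_{\varphi,G}}{\delta\varphi} \Big) - \frac{1}{2}\, {\rm tr}\, \Big( \frac{\delta k_{\varphi,G}}{\delta\varphi}\, [G + \varphi\varphi] \Big)}_{0} , \end{aligned} \tag{2.126} $$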

which are identical to eqs. (2.113). This proves that the definition (2.125) of the 2PI
effective action is equivalent to the original definition (2.112).

Path integral representation of Γk [φ] : Since we have


$$ e^{W[j,k]} = \int [D\phi]\; e^{\, i\left( S[\phi] + \int j\phi + \frac{1}{2} \int \phi\, k\, \phi \right)} , \tag{2.127} $$

Γk [φ] is the 1PI effective action of the modified classical action


$$ S_k[\phi] \equiv S[\phi] + \frac{1}{2} \int d^4x\, d^4y\; \phi(x)\, k(x,y)\, \phi(y) , \tag{2.128} $$
and it admits the following path integral representation
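$$ e^{i\Gamma_k[\varphi]} = \int [D\phi]\; e^{\, i\left( S_k[\varphi+\phi] - \int \phi\, \frac{\delta\Gamma_k[\varphi]}{\delta\varphi} \right)} . \tag{2.129} $$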

Thus, if we denote Γk,1 [φ] ≡ Γk [φ] − Sk [φ] the terms at 1-loop and higher orders,
we have
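$$ e^{i\Gamma_{k,1}[\varphi]} = \int [D\phi]\; e^{\, i\left( S_k[\varphi+\phi] - S_k[\varphi] - \int \phi\, \frac{\delta (S_k[\varphi] + \Gamma_{k,1}[\varphi])}{\delta\varphi} \right)} . \tag{2.130} $$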
Let us now expand the shifted action Sk [φ + ϕ]

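$$ S_k[\varphi+\phi] \equiv S_k[\varphi] + \int \phi\, \frac{\delta S_k[\varphi]}{\delta\varphi} + \frac{1}{2} \int \phi\, \underbrace{\frac{\delta^2 S_k[\varphi]}{\delta\varphi\,\delta\varphi}}_{k + i\,G_\varphi^{-1}}\, \phi + S_{\rm int}[\varphi;\phi] , \tag{2.131} $$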

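where Sint [φ; ϕ] denotes the terms of degree at least three in ϕ in the Taylor expansion
of S[φ + ϕ], and $G_\varphi^{-1}$ is the inverse of the tree-level propagator in the background
field φ. Therefore, Γk,1 [φ] can also be written as

$$ e^{i\Gamma_{k,1}[\varphi]} = \int [D\phi]\; e^{\, i\left( \frac{1}{2} \int \phi\, [k + i\,G_\varphi^{-1}]\, \phi + S_{\rm int}[\varphi;\phi] - \int \phi\, \frac{\delta\Gamma_{k,1}[\varphi]}{\delta\varphi} \right)} . \tag{2.132} $$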

Diagrammatic interpretation of Γ [φ, G] : Let us now write Γ [φ, G] as follows:

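$$ \Gamma[\varphi,G] = {\rm const} + S[\varphi] + \frac{i}{2}\, {\rm tr}\, \ln(G^{-1}) + \frac{i}{2}\, {\rm tr}\, (G_\varphi^{-1} G) + \Gamma_2[\varphi,G] . \tag{2.133} $$

This equation defines Γ2 [φ, G] so that the left hand side is indeed Γ [φ, G].
Combining eqs. (2.125), (2.132) and (2.133), we must have

$$ \Gamma_2[\varphi,G] = {\rm const} - \frac{1}{2}\, {\rm tr}\, \big( [k_{\varphi,G} + i\,G_\varphi^{-1}]\, G \big) - \frac{i}{2}\, {\rm tr}\, \ln(G^{-1}) + \Gamma_{k,1}[\varphi] . \tag{2.134} $$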

In order to eliminate kφ,G , we may use

$$ \frac{\delta\Gamma_2[\varphi,G]}{\delta G} + \frac{1}{2}\, k_{\varphi,G} + \frac{i}{2} \big( G_\varphi^{-1} - G^{-1} \big) = 0 , \tag{2.135} $$
that follows from eqs. (2.113) and (2.133). We thus obtain
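$$ e^{i\Gamma_2[\varphi,G]} = \int [D\phi]\; e^{\, i\left( S_{\varphi,G}[\phi] + {\rm tr}\left( [G - \phi\phi]\, \frac{\delta\Gamma_2}{\delta G} \right) - \int \phi\, \frac{\delta\Gamma_{k,1}[\varphi]}{\delta\varphi} \right)} , \tag{2.136} $$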

with
$$ S_{\varphi,G}[\phi] \equiv \frac{i}{2} \int \phi\, G^{-1} \phi + S_{\rm int}[\varphi;\phi] . \tag{2.137} $$

Finally, using the fact that −jφ,G = δΓk,1 /δφ and comparing with eq. (2.121), we
see that Γ2 [φ, G] is the sum of the 2PI vacuum graphs built with the propagator G
and the vertices obtained from the expansion of S[φ + ϕ]. The first four terms of the
expansion of Γ2 [φ, G] in a scalar field theory with φ4 interaction are shown in the
figure 2.3.
Figure 2.3: Beginning of the diagrammatic expansion of Γ2 [φ, G] in a scalar field


theory with quartic interaction. The lines terminated by a cross are the field φ and
the black lines represent the propagator G.

2.7.4 Dyson equation

After setting the source k = 0, the equation of motion for the propagator (2.135)
becomes

$$ -i\, G^{-1} = -i\, G_\varphi^{-1} - \underbrace{2\, \frac{\delta\Gamma_2[\varphi,G]}{\delta G}}_{-\Sigma} , \tag{2.138} $$

which is known as the Dyson equation; it resums the self-energy Σ on the propagator.
Convolving this equation with G on the right gives

$$ -i\, G_\varphi^{-1}\, G + \Sigma\, G = -i . \tag{2.139} $$

Recall that Gφ is the tree-level propagator in a background field φ, i.e.

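$$ i\, G_\varphi^{-1} = \frac{\delta^2 S[\varphi]}{\delta\varphi\,\delta\varphi} = -(\square + m^2) - V''(\varphi) , \tag{2.140} $$

where V(φ) is the interaction potential in the Lagrangian. Therefore, the equation of
motion has the following more explicit form

$$ \Big[ \square + m^2 + V''(\varphi) + \Sigma \Big]\, G = -i . \tag{2.141} $$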

A closed system of equations is obtained by adding the equation of motion of φ,


obtained as δΓ/δφ = 0 (in a system with symmetry φ → −φ, and assuming that
there is no spontaneous breaking of this symmetry, the field expectation value is zero
and we may simply set φ = 0 in the above equation for G). Note that the self-energy
Σ is itself a functional of φ and G. In the 2PI framework, the self-energy is the
derivative of Γ2 [φ, G] with respect to the propagator,

$$ \Sigma = -2\, \frac{\delta\Gamma_2[\varphi,G]}{\delta G} . \tag{2.142} $$
Diagrammatically, this derivative amounts to opening one internal line of the graphs
that contribute to Γ2 . For instance the graphs of the figure 2.3 give the following
topologies in Σ:

Σ∼ .

Note that the 2-particle irreducibility of Γ2 is equivalent to the 1-particle irreducibility


of Σ, which is crucial in order to avoid including multiple times the same contributions
when summing the self-energy to all orders. In practice, a truncation is necessary in
order to obtain equations that have a finite number of terms. For instance, keeping
only the first diagram in Γ2 gives the tadpole diagram in Σ (but bear in mind that since
this tadpole contains the full propagator G, the solution of the corresponding Dyson
equation resums an infinite set of Feynman graphs).

2.8 Euclidean path integral and Statistical mechanics


2.8.1 Statistical mechanics in path integral form
A path integral formalism also exists for statistical mechanics. In order to illustrate this,
let us consider again the quantum mechanical system described by the Hamiltonian of
eq. (2.1). Our goal is to calculate the partition function in the canonical ensemble5 ,

$$ Z_\beta \equiv {\rm Tr}\, \big( e^{-\beta H} \big) , \tag{2.143} $$
where β is the inverse temperature (it is customary to use a system of units in which
Boltzmann’s constant kB is equal to unity – therefore temperature has the same
dimension as energy.) More generally, one may want to calculate the following
canonical ensemble expectation values,

$$ \big\langle O \big\rangle_\beta \equiv Z_\beta^{-1}\; {\rm Tr}\, \big( e^{-\beta H}\, O \big) . \tag{2.144} $$
The cyclicity of the trace leads to an important identity for expectation values of
products of operators:

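$$ \begin{aligned} \big\langle O_1(t)\, O_2(t') \big\rangle_\beta &\equiv Z_\beta^{-1}\; {\rm Tr}\, \big( e^{-\beta H}\, O_1(t)\, O_2(t') \big) \\ &= Z_\beta^{-1}\; {\rm Tr}\, \big( e^{-\beta H}\, O_1(t)\, \underbrace{e^{+\beta H} e^{-\beta H}}_{1}\, O_2(t') \big) \\ &= Z_\beta^{-1}\; {\rm Tr}\, \big( e^{-\beta H}\, O_2(t')\, \underbrace{e^{-\beta H}\, O_1(t)\, e^{+\beta H}}_{O_1(t+i\beta)} \big) \\ &= \big\langle O_2(t')\, O_1(t+i\beta) \big\rangle_\beta , \end{aligned} \tag{2.145} $$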
5 In theories with a conserved quantity, it is also possible to study the grand canonical ensemble. One

needs to substitute H → H − µQ in the definition of the partition function, where Q is the operator of the
conserved charge and µ the associated chemical potential.
where we have formally identified the density operator exp(−βH) with a time evolu-
tion operator for an imaginary time iβ. This relationship is called the Kubo-Martin-
Schwinger (KMS) identity. Although we have established it for an expectation value
of a product of two operators, it is completely general.
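
The KMS identity (2.145) can be checked numerically on a finite-dimensional system. The following sketch assumes a hypothetical three-level Hamiltonian written in its own eigenbasis, with arbitrary illustrative operator matrices; none of these numbers come from the text. In that basis O(t) has matrix elements exp(i(E_m − E_n)t) O_mn, and the formula is simply continued to complex t.

```python
import cmath
import math

# Hypothetical three-level system, written in the eigenbasis of H.
energies = [0.0, 0.7, 1.9]
beta = 1.3
# Arbitrary (illustrative) operator matrices in that basis.
O1 = [[0.2, 1.0, 0.5j], [1.0, -0.4, 0.3], [-0.5j, 0.3, 0.1]]
O2 = [[1.0, 0.2j, 0.0], [-0.2j, 0.5, 0.8], [0.0, 0.8, -1.0]]

def heisenberg(op, t):
    """O(t)_{mn} = exp(i (E_m - E_n) t) O_{mn}; t may be complex."""
    n = len(energies)
    return [[cmath.exp(1j * (energies[m] - energies[k]) * t) * op[m][k]
             for k in range(n)] for m in range(n)]

def thermal_average_of_product(a, b):
    """Z^{-1} Tr(exp(-beta H) a b), evaluated in the energy eigenbasis."""
    n = len(energies)
    z = sum(math.exp(-beta * e) for e in energies)
    tr = sum(math.exp(-beta * energies[m]) * a[m][k] * b[k][m]
             for m in range(n) for k in range(n))
    return tr / z

t, tp = 0.4, -1.1
lhs = thermal_average_of_product(heisenberg(O1, t), heisenberg(O2, tp))
rhs = thermal_average_of_product(heisenberg(O2, tp),
                                 heisenberg(O1, t + 1j * beta))
print(abs(lhs - rhs))
```

The two sides agree to rounding error for any choice of the matrices, since only the cyclicity of the trace is used.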
The identification of the density operator with an imaginary time evolution opera-
tor is at the heart of the formalism to evaluate canonical ensemble expectation values.
If we represent the trace that appears in the partition function in the coordinate basis,
$$ Z_\beta = \int dq\; \big\langle q \big|\, e^{-\beta H}\, \big| q \big\rangle , \tag{2.146} $$

the integrand in the right hand side is a transition amplitude similar to eq. (2.3), except
that initial and final coordinates are identical, and the time interval is imaginary. We
can nevertheless formally reproduce all the manipulations of the section 2.1, with
an initial time ti ≡ 0 and a final time tf ≡ −iβ. It is common to introduce the
Euclidean time τ ≡ it, with τ varying from 0 to β. The only change to our original
derivation of the path integral is that the path q(t) must be replaced by a path q(τ)
whose time derivative is the Euclidean velocity q̇E , related to the usual velocity q̇ by

$$ \dot q \equiv \frac{dq}{dt} = i\, \underbrace{\frac{dq}{d\tau}}_{\dot q_E} . \tag{2.147} $$

We obtain the following path integral representation of the partition function:


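$$ \begin{aligned} Z_\beta &= \int dq \int\limits_{\substack{q(0)=q \\ q(\beta)=q}} [Dp(\tau)\, Dq(\tau)]\; \exp\Big\{ \int_0^\beta d\tau\; \Big( i\, p(\tau)\, \dot q_E(\tau) - H(p(\tau), q(\tau)) \Big) \Big\} \\ &= \int\limits_{q(0)=q(\beta)} [Dp(\tau)\, Dq(\tau)]\; \exp\Big\{ \int_0^\beta d\tau\; \Big( i\, p(\tau)\, \dot q_E(\tau) - H(p(\tau), q(\tau)) \Big) \Big\} . \end{aligned} \tag{2.148} $$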

In the second line, we have simplified the boundary conditions of the path q(τ), since
the only constraint it must obey is to be β-periodic in imaginary time. The integration
over the momentum p(τ) is again Gaussian, and after performing it we obtain the
following expression
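$$ Z_\beta = \int\limits_{q(0)=q(\beta)} [Dq(\tau)]\; \exp\Big\{ - \underbrace{\int_0^\beta d\tau\; \Big( \frac{m}{2}\, \dot q_E^2(\tau) + V(q(\tau)) \Big)}_{S_E[q(\tau)]} \Big\} . \tag{2.149} $$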

The quantity SE [q] is called the Euclidean action.
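
The Gaussian integration over the momentum can be made explicit on a single time slice. The worked step below is a sketch that assumes a Hamiltonian of the form H = p²/2m + V(q), as the kinetic term of eq. (2.149) suggests, and a slice of Euclidean width ε (a symbol introduced here for illustration); the normalization of the measure is left aside.

$$ \int_{-\infty}^{+\infty} dp\; \exp\Big\{ \epsilon \Big( i\, p\, \dot q_E - \frac{p^2}{2m} \Big) \Big\} = \sqrt{\frac{2\pi m}{\epsilon}}\; \exp\Big\{ -\epsilon\, \frac{m}{2}\, \dot q_E^2 \Big\} . $$

The exponent on the right hand side is the kinetic part of −SE for that slice, while the prefactor is independent of the path and can be absorbed into the measure.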


Then, we can generalize this formalism to calculate ensemble averages of time-


ordered (in imaginary time) products of position operators. For instance, the analogue
of eq. (2.28) is
$$ {\rm Tr}\, \Big( e^{-\beta H}\; T_\tau\, Q(\tau_1)\, Q(\tau_2) \Big) = \int\limits_{q(0)=q(\beta)} [Dq(\tau)]\; e^{-S_E[q(\tau)]}\; q(\tau_1)\, q(\tau_2) , \tag{2.150} $$

where the symbol Tτ denotes the time-ordering in the imaginary time τ. Likewise,
we may define a generating functional for these expectation values
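$$ {\rm Tr}\, \Big( e^{-\beta H}\; T_\tau \exp \int_0^\beta d\tau\; j(\tau)\, Q(\tau) \Big) = \int\limits_{q(0)=q(\beta)} [Dq(\tau)]\; e^{-S_E[q(\tau)] + \int_0^\beta d\tau\, j(\tau)\, q(\tau)} . \tag{2.151} $$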

2.8.2 Statistical field theory

This formalism can be extended readily to a quantum field theory. In this context,
it can be used to calculate canonical ensemble expectation values of operators for a
system of relativistic particles. One can write directly the following generalization of
eq. (2.151),


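$$ \underbrace{{\rm Tr}\, \Big( e^{-\beta H}\; T_\tau \exp \int_0^\beta d^4x_E\; j(x)\, \varphi(x) \Big)}_{Z[j;\beta]} = \int\limits_{\varphi(0,\boldsymbol{x})=\varphi(\beta,\boldsymbol{x})} [D\varphi(x)]\; e^{-S_E[\varphi(x)] + \int_0^\beta d^4x_E\, j(x)\, \varphi(x)} , \tag{2.152} $$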

where the measure d4 xE stands for dτ d3 x. As in the case of ordinary QFT in
Minkowski space-time, we can isolate the interactions by writing:

$$ Z[j;\beta] = \exp\Big\{ - \int d^4x_E\; \mathcal{L}_{E,I}\Big( \frac{\delta}{\delta j(x)} \Big) \Big\}\; Z_0[j;\beta] , \tag{2.153} $$

where LE,I is the interaction term in the Euclidean Lagrangian density, and Z0 [j; β]
is the generating functional of the non-interacting theory:
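$$ Z_0[j;\beta] = \int\limits_{\varphi(0,\boldsymbol{x})=\varphi(\beta,\boldsymbol{x})} [D\varphi(x)]\; \exp\Big[ - \int_0^\beta d^4x_E\; \Big( \frac{1}{2} \big( (\partial_\tau\varphi)^2 + (\boldsymbol{\nabla}\varphi)^2 + m^2\varphi^2 \big) - j\varphi \Big) \Big] . \tag{2.154} $$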

The Gaussian path integral in this expression leads to:



$$ Z_0[j;\beta] = \exp\Big\{ \frac{1}{2} \int_0^\beta d^4x_E\, d^4y_E\; j(x)\, G_E^0(x,y)\, j(y) \Big\} , \tag{2.155} $$

where the free Euclidean propagator G0E (x, y) is the inverse of the operator m2 −
∂2τ −∇2 over the space of functions that are β-periodic in the imaginary time variable.
Because of this periodicity, the “energy” variable, conjugate to the Euclidean time, is
discrete:
$$ \omega_n \equiv \frac{2\pi n}{\beta} \qquad (n \in \mathbb{Z}) . \tag{2.156} $$
In terms of these energies, called Matsubara frequencies, the free Euclidean propagator
in momentum space reads

$$ \widetilde{G}_E^0(\omega_n, \boldsymbol{p}) = \frac{1}{\omega_n^2 + \boldsymbol{p}^2 + m^2} . \tag{2.157} $$

Note that the denominator cannot vanish, and therefore this propagator does not
need an i0+ prescription for being fully defined. Eqs. (2.153) and (2.155) lead to a
perturbative expansion that can be cast into an expansion in terms of Feynman dia-
grams. The Feynman rules associated to these graphs are very similar to those already
encountered when calculating scattering amplitudes, with only a few modifications:

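$$ \text{Propagators} : \quad \frac{1}{\omega_n^2 + \boldsymbol{p}^2 + m^2} , \tag{2.158} $$

$$ \text{Vertices} : \quad -\lambda\; 2\pi\, \delta\Big( \sum_i \omega_{n_i} \Big)\, (2\pi)^3\, \delta\Big( \sum_i \boldsymbol{p}_i \Big) , \tag{2.159} $$

$$ \text{Loops} : \quad \frac{1}{\beta} \sum_{n \in \mathbb{Z}} \int \frac{d^3\boldsymbol{p}}{(2\pi)^3} . \tag{2.160} $$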

In other words, the main difference with the usual perturbative expansion is that
the energies are replaced by the discrete Matsubara frequencies, and that the loop
integration on p0 is replaced by a discrete sum over these frequencies.
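
As an illustration of the loop rule (2.160), the simplest Matsubara sum, that of one free propagator (2.157) at fixed momentum, can be evaluated numerically and compared with its known closed form. In the sketch below, E² stands for p² + m², and the values of β and E as well as the function names are illustrative assumptions.

```python
import math

def matsubara_sum_truncated(beta, energy, n_max):
    """(1/beta) * sum over |n| <= n_max of 1 / (omega_n^2 + E^2),
    with omega_n = 2 pi n / beta."""
    total = 1.0 / energy**2                      # n = 0 term
    for n in range(1, n_max + 1):
        omega = 2.0 * math.pi * n / beta
        total += 2.0 / (omega**2 + energy**2)    # terms n and -n are equal
    return total / beta

def matsubara_sum_closed(beta, energy):
    """Closed form coth(beta E / 2) / (2 E) = (1 + 2 n_B(E)) / (2 E),
    where n_B is the Bose-Einstein distribution."""
    return 1.0 / (2.0 * energy * math.tanh(0.5 * beta * energy))

beta, energy = 2.0, 1.5     # illustrative values, units with k_B = 1
numeric = matsubara_sum_truncated(beta, energy, 200_000)
exact = matsubara_sum_closed(beta, energy)
print(numeric, exact)
```

The truncated sum converges slowly, with a remainder that decreases like the inverse of the cutoff, which is why closed forms or contour methods are preferred in practice.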
Chapter 3

Path integrals for fermions and photons

In the previous chapter, we have learned that the quantization of a scalar field may be
performed by means of the path integral representation. This leads to a much more
concise derivation of the generating functional, and of the free propagator, compared
to the canonical approach. In this chapter, we will therefore seek a similar path
integral formalism for other types of fields, in view of the functional quantization of
a gauge theory such as QED (and later, of non-Abelian gauge theories, for which a
canonical approach would be extremely difficult to implement).

3.1 Grassmann variables

3.1.1 Definition

In the functional formulation of a scalar field theory, we saw that time-ordered


products of field operators correspond to the ordinary product of the integration
variable in the integrand of the path integral (see the eq. (2.29)). Ultimately, a path
integral representation of the time-ordered product of fermion field operators should
allow the same, but with a catch: the T-product for fermions involves a minus sign
when two operators are exchanged (see 1.216), that we need to be able to generate
in the integrand of a would-be fermionic path integral. This can be achieved with
Grassmann numbers1 , that are anti-commuting variables. In a sense, Grassmann
1 Although we call them “numbers”, they are not representable as scalar (e.g. real or complex) variables.

A Grassmann number may be represented by a nilpotent 2 × 2 matrix, and the Grassmann algebra with N


numbers are the classical analogue of anti-commuting quantum operators. For a set
of Grassmann variables ψi (i = 1 · · · N), we have

$$ \big\{ \psi_i, \psi_j \big\} = 0 . \tag{3.1} $$

The linear space spanned by the ψi ’s is called a Grassmann algebra.
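
The N = 2 matrix representation quoted in footnote 1 can be checked directly. The short sketch below uses plain nested lists (the helper names are illustrative) to verify that the two 4 × 4 matrices square to zero and anticommute, while their product is non-zero.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(a, b):
    """Entry-wise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def anticommutator(a, b):
    """{a, b} = ab + ba."""
    return matadd(matmul(a, b), matmul(b, a))

# Matrices of footnote 1 (N = 2 generators).
psi1 = [[0, 0, 0, 0],
        [1, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 1, 0]]
psi2 = [[0, 0, 0, 0],
        [0, 0, 0, 0],
        [1, 0, 0, 0],
        [0, -1, 0, 0]]
zero = [[0] * 4 for _ in range(4)]

print(anticommutator(psi1, psi2) == zero)   # {psi1, psi2} = 0
print(matmul(psi1, psi1) == zero)           # psi1 squared vanishes
print(matmul(psi1, psi2) != zero)           # but the product psi1 psi2 does not
```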

3.1.2 Functions of a single Grassmann variable

Consider first the case N = 1. The square of a Grassmann number ψ is therefore zero,
ψ2 = 0, and by induction all higher powers of ψ are also zero. The Taylor expansion
of a function of ψ is therefore limited to the first two terms,

f(ψ) = a + ψb . (3.2)

In general, we need to deal with functions f(ψ) that are themselves commuting objects.
Therefore, the coefficient a is an ordinary number, while b is another Grassmann
number, {b, b} = {b, ψ} = 0. This implies that

f(ψ) = a + ψb = a − bψ . (3.3)

Because of the non-commuting nature of b and ψ, we may define left and right
derivatives, denoted by:
$$ \overrightarrow{\partial}_{\!\psi}\, f(\psi) = b , \qquad f(\psi)\, \overleftarrow{\partial}_{\!\psi} = -b . \tag{3.4} $$

One may define a linear mapping on functions of a Grassmann variable, that


behaves for most purposes as an integration (although it is not an integral in the
Lebesgue sense), called the Berezin integral. We require two basic axioms:

• Linearity :
$$ \int d\psi\; \alpha\, f(\psi) = \alpha \int d\psi\; f(\psi) , \tag{3.5} $$

generators admits a representation in terms of $2^N \times 2^N$ matrices, that may be viewed as operators acting
on the Hilbert space of N identical fermions of spin 1/2 (of dimension $2^N$ since each spin has two states).
For instance, when N = 2, one may represent the Grassmann numbers ψ1,2 as

$$ \psi_1 = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix} , \qquad \psi_2 = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{pmatrix} . $$
• The integral of a total derivative is zero:


$$ \int d\psi\; \partial_\psi f(\psi) = 0 . \tag{3.6} $$

The only definition consistent with these requirements is


$$ \int d\psi\; f(\psi) = b , \tag{3.7} $$

up to an overall constant that should be the same for all functions. Thus, integration
and differentiation of functions of a Grassmann variable are essentially the same thing.
In particular, the Berezin integral satisfies:
$$ \int d\psi\; 1 = 0 , \qquad \int d\psi\; \psi = 1 . \tag{3.8} $$

3.1.3 Functions of N Grassmann variables


Taylor expansion : We will denote collectively ψ ≡ (ψ1 , · · · , ψN ). The most
general function of N Grassmann variables can be written as

$$ f(\psi) = \sum_{p=0}^{N} \frac{1}{p!}\; \psi_{i_1} \psi_{i_2} \cdots \psi_{i_p}\; C_{i_1 i_2 \cdots i_p} , \tag{3.9} $$

with implicit summations on the indices $i_n$. Terms of degree higher than N cannot
exist because they would contain the square of at least one of the ψi ’s, and therefore be
zero. We have chosen to write the Grassmann variables on the left of the coefficients
in order to simplify the calculation of the left derivatives $\overrightarrow{\partial}_{\!\psi}$. Note that the last
coefficient $C_{i_1 \cdots i_N}$ must be proportional to the Levi-Civita tensor:

$$ C_{i_1 \cdots i_N} \equiv \gamma\; \epsilon_{i_1 \cdots i_N} . \tag{3.10} $$

Note that this last term can also be written as:


$$ \frac{1}{N!}\; \psi_{i_1} \cdots \psi_{i_N}\; \gamma\; \epsilon_{i_1 \cdots i_N} = \psi_1 \cdots \psi_N\; \gamma . \tag{3.11} $$

Integration : In order to be consistent with eqs. (3.8), the integral of f(ψ) over the
N Grassmann variables ψ1 , · · · , ψN , must be given by
$$ \int d^N\psi\; f(\psi) = \gamma . \tag{3.12} $$

The terms of degree 0 through N − 1 in the “Taylor expansion” of f(ψ) cannot


contribute to the integral, since at least one of the ψi is absent in these terms, and
the integral over this ψi will therefore give zero. A somewhat more explicit for-
mulation of an integral over N Grassmann variables is to write the measure as
dN ψ ≡ dψN dψN−1 · · · dψ1 (in this order), and to perform the N integrals succes-
sively, starting with the innermost one (i.e. dψ1 ). Therefore
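$$ \int d^N\psi\; \psi_1 \cdots \psi_N = \int d\psi_N \cdots \underbrace{\int d\psi_2\; \underbrace{\Big( \int d\psi_1\; \psi_1 \Big)}_{1}\; \psi_2}_{1}\; \cdots\; \psi_N = 1 . \tag{3.13} $$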

Change of variables : Let us now consider a linear change of variables:

ψi ≡ Jij θj , (3.14)

where θ1 · · · θN are N Grassmann variables. The last term of the expansion of f(ψ),
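the only one relevant for integration, can be rewritten as

$$ \begin{aligned} \psi_{i_1} \cdots \psi_{i_N}\; \epsilon_{i_1 \cdots i_N}\; \gamma &= \big( J_{i_1 j_1} \theta_{j_1} \big) \cdots \big( J_{i_N j_N} \theta_{j_N} \big)\; \epsilon_{i_1 \cdots i_N}\; \gamma \\ &= \det(J)\; \theta_{j_1} \cdots \theta_{j_N}\; \epsilon_{j_1 \cdots j_N}\; \gamma . \end{aligned} \tag{3.15} $$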

From this relationship, we conclude that


$$ \underbrace{\int d^N\psi\; f(\psi)}_{\gamma} = \big( \det J \big)^{-1}\; \underbrace{\int d^N\theta\; f(\psi(\theta))}_{\det(J)\; \gamma} . \tag{3.16} $$

Thus, a change of variables in a Grassmann integral involves the inverse of the


Jacobian that would normally appear in the same change of variables for a scalar
integral.
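
The case N = 1 makes the inverse Jacobian explicit. As a minimal worked example, take ψ = a θ with a an ordinary non-zero number (a symbol introduced here for illustration) and f(ψ) = c0 + ψ c1:

$$ \int d\theta\; f(a\theta) = \int d\theta\; \big( c_0 + a\, \theta\, c_1 \big) = a\, c_1 , \qquad \int d\psi\; f(\psi) = c_1 = a^{-1} \int d\theta\; f(a\theta) . $$

In other words dψ = a⁻¹ dθ, whereas for an ordinary real variable x = a y one would have dx = a dy.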

Gaussian integrals : Let ψ1 , · · · , ψN be N Grassmann variables, and consider the


following integral
$$ I(M) \equiv \int d^N\psi\; \exp\Big( \frac{1}{2}\, \psi_i M_{ij} \psi_j \Big) , \tag{3.17} $$

where M is an antisymmetric N × N matrix made of commuting numbers (real or


complex). Firstly, note that such an integral is non-zero only if N is even. For N = 2,
this matrix is of the form

$$ M = \begin{pmatrix} 0 & \mu \\ -\mu & 0 \end{pmatrix} , \tag{3.18} $$
and the exponential in the integral reads



$$ \exp\Big( \frac{1}{2}\, \psi_i M_{ij} \psi_j \Big) = 1 + \mu\, \psi_1 \psi_2 . \tag{3.19} $$

(Recall that functions of two Grassmann variables are in fact polynomials of degree
two.) Therefore, in the case N = 2, the Gaussian integral (3.17) reads2

$$ I(M) = \mu = \big[ \det(M) \big]^{1/2} . \tag{3.20} $$

In the case of a general even N, the matrix M may be written in the following block
diagonal form,

$$ M = Q\; \underbrace{\begin{pmatrix} 0 & \mu_1 & & & \\ -\mu_1 & 0 & & & \\ & & 0 & \mu_2 & \\ & & -\mu_2 & 0 & \\ & & & & \ddots \end{pmatrix}}_{D}\; Q^T , \tag{3.21} $$
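where Q is a special3 orthogonal matrix. Defining $Q^T \psi \equiv \theta$, we have

$$ I(M) = \big( \det(Q) \big)^{-1}\; \underbrace{\int d^N\theta\; \exp\Big( \frac{1}{2}\, \theta^T D\, \theta \Big)}_{\mu_1 \mu_2 \cdots\, =\, [\det(D)]^{1/2}} . \tag{3.22} $$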

But since det (Q) = +1, this becomes

$$ I(M) = \big[ \det(D) \big]^{1/2} = \big[ \det(M) \big]^{1/2} . \tag{3.23} $$

Contrast this with the result of a Gaussian integral in the case of ordinary real variables,
eq. (2.68), where the square root of the determinant appeared in the denominator.
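
The Gaussian integral (3.17) can be checked by brute force for small N. The sketch below is a minimal Grassmann algebra with ordinary numbers as coefficients, not an optimized implementation; all function names are illustrative. Elements are stored as dictionaries mapping an increasing tuple of generator indices to a coefficient, and the Berezin integral with measure dψN · · · dψ1 picks the coefficient of ψ1 · · · ψN, as in eq. (3.13). For N = 4 the result is the Pfaffian of M, whose square is det M.

```python
def multiply_monomials(a, b):
    """Product of two monomials, each an increasing tuple of generator indices.

    Returns (sign, sorted_tuple); the sign is 0 if a generator is repeated.
    """
    if set(a) & set(b):
        return 0, ()
    seq = list(a + b)
    sign = 1
    # Bubble sort; every swap of neighbours is one anticommutation.
    for i in range(len(seq)):
        for j in range(len(seq) - 1 - i):
            if seq[j] > seq[j + 1]:
                seq[j], seq[j + 1] = seq[j + 1], seq[j]
                sign = -sign
    return sign, tuple(seq)

def multiply(f, g):
    """Product of two algebra elements stored as {monomial: coefficient}."""
    out = {}
    for a, ca in f.items():
        for b, cb in g.items():
            sign, mono = multiply_monomials(a, b)
            if sign:
                out[mono] = out.get(mono, 0.0) + sign * ca * cb
    return out

def exponential(q, n_vars):
    """exp(q) for q of degree two; the series stops at order n_vars // 2."""
    result = {(): 1.0}
    term = {(): 1.0}
    for k in range(1, n_vars // 2 + 1):
        term = {m: c / k for m, c in multiply(term, q).items()}
        for m, c in term.items():
            result[m] = result.get(m, 0.0) + c
    return result

def berezin_integral(f, n_vars):
    """Measure d psi_N ... d psi_1: coefficient of psi_1 ... psi_N."""
    return f.get(tuple(range(1, n_vars + 1)), 0.0)

def quadratic_form(m):
    """(1/2) psi_i M_ij psi_j for an antisymmetric matrix M."""
    q = {}
    n = len(m)
    for i in range(n):
        for j in range(n):
            if i != j:
                sign, mono = multiply_monomials((i + 1,), (j + 1,))
                q[mono] = q.get(mono, 0.0) + 0.5 * sign * m[i][j]
    return q

def gaussian_integral(m):
    """I(M) of eq. (3.17), evaluated by expanding the exponential."""
    n = len(m)
    return berezin_integral(exponential(quadratic_form(m), n), n)

mu = 3.0
M2 = [[0.0, mu], [-mu, 0.0]]
M4 = [[0, 1, 2, 3], [-1, 0, 4, 5], [-2, -4, 0, 6], [-3, -5, -6, 0]]
print(gaussian_integral(M2))   # N = 2 case, eq. (3.20)
print(gaussian_integral(M4))   # N = 4: M12*M34 - M13*M24 + M14*M23
```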
It is often necessary to perform a Gaussian integral in the presence of a source
that shifts the minimum of the quadratic form in the exponential,
$$ I(M, \eta) \equiv \int d^N\psi\; \exp\Big( \frac{1}{2}\, \psi_i M_{ij} \psi_j + \eta_i \psi_i \Big) , \tag{3.24} $$

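where η is a set of N Grassmann sources. By introducing the new Grassmann variable
$\psi'_i \equiv \psi_i - M^{-1}_{ij} \eta_j$, this integral falls back to the previous type. Since
$\tfrac{1}{2}\psi M \psi + \eta\psi = \tfrac{1}{2}\psi' M \psi' + \tfrac{1}{2}\eta^T M^{-1} \eta$ (for N = 2 the integral is
$\mu - \eta_1\eta_2$), we obtain:

$$ I(M, \eta) = \big[ \det(M) \big]^{1/2}\; \exp\Big( \frac{1}{2}\, \eta^T M^{-1} \eta \Big) . \tag{3.25} $$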
2 The determinant of a real antisymmetric matrix of even size is the square of its Pfaffian and is therefore
positive.
3 Orthogonal matrices have determinant +1 or −1. The special orthogonal matrices are the subgroup of

those that have determinant +1.


Gaussian integral with 2N variables : Another useful type of Gaussian integral is

$$ J(M) \equiv \int d^N\xi\, d^N\psi\; \exp\big( \psi_i M_{ij} \xi_j \big) , \tag{3.26} $$

where M is an N × N matrix of commuting numbers, and ψ and ξ are independent


Grassmann variables. The only non-zero contribution to this integral comes from the
term of order N in the Taylor expansion of the exponential,
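$$ \begin{aligned} J(M) &= \frac{1}{N!} \int d^N\xi\, d^N\psi\; \big( \psi_{i_1} M_{i_1 j_1} \xi_{j_1} \big) \cdots \big( \psi_{i_N} M_{i_N j_N} \xi_{j_N} \big) \\ &= \frac{(-1)^{\frac{N(N-1)}{2}}}{N!} \int d^N\xi\, d^N\psi\; \big( \psi_{i_1} \cdots \psi_{i_N}\; \xi_{j_1} \cdots \xi_{j_N} \big)\; M_{i_1 j_1} \cdots M_{i_N j_N} \\ &= \frac{(-1)^{\frac{N(N-1)}{2}}}{N!}\; \epsilon_{i_1 \cdots i_N}\, \epsilon_{j_1 \cdots j_N}\; M_{i_1 j_1} \cdots M_{i_N j_N} . \end{aligned} \tag{3.27} $$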
In the second line, we have reordered the Grassmann variables in order to bring
all the ψi ’s on the left, and the sign in the prefactor keeps track of the number of
permutations that are necessary to achieve this. To give a non-zero result, the indices
{in } and {jn } must be permutations of [1 · · · N]:
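$$ \begin{aligned} J(M) &= \frac{(-1)^{\frac{N(N-1)}{2}}}{N!} \sum_{\sigma, \rho \in S_N} \epsilon(\sigma)\, \epsilon(\rho)\; M_{\sigma(1)\rho(1)} \cdots M_{\sigma(N)\rho(N)} \\ &= \frac{(-1)^{\frac{N(N-1)}{2}}}{N!} \sum_{\sigma, \tau \in S_N} \epsilon(\sigma)\, \epsilon(\tau\sigma)\; M_{1\tau(1)} \cdots M_{N\tau(N)} \end{aligned} \tag{3.28} $$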

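where ε(σ) is the signature of the permutation σ, and with $\tau \equiv \rho\,\sigma^{-1}$ in the second
line. Using ε(σ)ε(τσ) = ε(τ), this becomes:

$$ J(M) = (-1)^{\frac{N(N-1)}{2}}\; \underbrace{\Big( \frac{1}{N!} \sum_{\sigma \in S_N} 1 \Big)}_{1}\; \underbrace{\sum_{\tau \in S_N} \epsilon(\tau)\; M_{1\tau(1)} \cdots M_{N\tau(N)}}_{\det(M)} . \tag{3.29} $$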

Note that this overall sign may be absorbed into a reordering of the measure, since:
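$$ d^N\xi\; d^N\psi = (-1)^{\frac{N(N-1)}{2}}\; d\xi_N\, d\psi_N \cdots d\xi_1\, d\psi_1 . \tag{3.30} $$

Therefore, we have

$$ \int d\xi_N\, d\psi_N \cdots d\xi_1\, d\psi_1\; \exp\big( \psi_i M_{ij} \xi_j \big) = \det M . \tag{3.31} $$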
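
The combinatorial step between eqs. (3.28) and (3.29) can be verified numerically for a small matrix. The sketch below, with illustrative function names and an arbitrary 3 × 3 example, compares the double sum over permutations divided by N! with the single-permutation (Leibniz) formula for the determinant.

```python
from itertools import permutations
from math import factorial

def signature(perm):
    """Signature of a permutation given as a tuple of 0-based images."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def leibniz_det(m):
    """Sum over tau of eps(tau) M[0][tau(0)] ... M[n-1][tau(n-1)]."""
    n = len(m)
    total = 0
    for tau in permutations(range(n)):
        term = signature(tau)
        for i in range(n):
            term *= m[i][tau[i]]
        total += term
    return total

def double_permutation_sum(m):
    """(1/N!) sum over sigma, rho of eps(sigma) eps(rho) prod_i M[sigma(i)][rho(i)]."""
    n = len(m)
    total = 0
    for sigma in permutations(range(n)):
        for rho in permutations(range(n)):
            term = signature(sigma) * signature(rho)
            for i in range(n):
                term *= m[sigma[i]][rho[i]]
            total += term
    return total / factorial(n)

M = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]   # arbitrary example matrix
print(leibniz_det(M), double_permutation_sum(M))
```

Both expressions give the same number because relabelling the rows by σ only multiplies each term by ε(σ)², which is the content of the identity ε(σ)ε(τσ) = ε(τ).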
3.1.4 Complex Grassmann variables

Now, let us define complex Grassmann variables, from two of the previously defined
Grassmann variables ψ and ξ:

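$$ \chi \equiv \frac{\psi + i\xi}{\sqrt{2}} , \qquad \bar\chi \equiv \frac{\psi - i\xi}{\sqrt{2}} . \tag{3.32} $$

Conversely, we have

$$ \psi = \frac{\chi + \bar\chi}{\sqrt{2}} , \qquad \xi = \frac{i\, (\bar\chi - \chi)}{\sqrt{2}} , \tag{3.33} $$

and the integrations over these variables are related by

$$ \begin{aligned} d\xi\, d\psi &= i\; d\chi\, d\bar\chi , \\ \psi\, \xi &= -i\; \bar\chi\, \chi , \\ \int d\chi\, d\bar\chi\;\; \bar\chi\, \chi &= \int d\xi\, d\psi\;\; \psi\, \xi = 1 . \end{aligned} \tag{3.34} $$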

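From this, we obtain

$$ \int d\chi\, d\bar\chi\; \exp\big( \mu\, \bar\chi \chi \big) = \mu , \tag{3.35} $$

that can be generalized into

$$ \int d\chi_N\, d\bar\chi_N \cdots d\chi_1\, d\bar\chi_1\; \exp\big( \bar\chi^T M \chi \big) = \det M . \tag{3.36} $$

In the presence of sources $\bar\eta$ and $\eta$, we obtain the following Gaussian integral:

$$ \int d\chi_N\, d\bar\chi_N \cdots d\chi_1\, d\bar\chi_1\; \exp\big( \bar\chi^T M \chi + \bar\eta^T \chi + \bar\chi^T \eta \big) = \det M\; \exp\big( -\bar\eta^T M^{-1} \eta \big) . \tag{3.37} $$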

3.2 Path integral for fermions


We now have all the ingredients for building a path integral for spin 1/2 fermions.
Let us work our way backwards, starting from a generating functional that generates
the free time-ordered products of spinors,
$$ Z_0[\bar\eta, \eta] \equiv \exp\Big\{ - \int d^4x\, d^4y\; \bar\eta(x)\, S_F^0(x,y)\, \eta(y) \Big\} , \tag{3.38} $$
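where $S_F^0(x,y)$ is the free Dirac time-ordered propagator and $\bar\eta$ and $\eta$ are a pair of
complex Grassmann-valued sources. Indeed, we have

$$ \frac{\overrightarrow{\delta}}{i\,\delta\bar\eta(x)}\; Z_0[\bar\eta, \eta]\; \frac{\overleftarrow{\delta}}{i\,\delta\eta(y)}\, \bigg|_{\eta=\bar\eta=0} = S_F^0(x,y) . \tag{3.39} $$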

Taking more than two derivatives (but with an equal number of derivatives with respect
to $\eta$ and with respect to $\bar\eta$) will lead to all the contributions in the free time-ordered
product of spinors, with the correct signs to account for their anti-commuting nature.
Note that using Grassmann-valued sources was necessary in order to get these signs.
Then, by comparing eqs. (3.37) and (3.38), we can represent this free generating
functional as a path integral over Grassmann variables:
\[
\begin{aligned}
Z_0[\eta,\bar\eta] &= \int \big[ D\psi(x) D\bar\psi(x) \big] \exp\Big\{ i \int d^4x\, \Big( \bar\psi(x) (i\slashed{\partial} - m) \psi(x) + \bar\eta(x)\psi(x) + \bar\psi(x)\eta(x) \Big) \Big\} \\
&= \int \big[ D\psi(x) D\bar\psi(x) \big]\; e^{iS[\psi,\bar\psi]}\; e^{i \int d^4x\, \left( \bar\eta(x)\psi(x) + \bar\psi(x)\eta(x) \right)} .
\end{aligned}
\tag{3.40}
\]
We have ignored the determinant, since it is independent of the sources. Instead, one
simply adjusts the normalization of the generating functional so that $Z_0[0,0] = 1$.
The second line shows that the path integral formulation of a field theory of spin 1/2
fermions takes the same form as that of scalar fields, provided we use Grassmann
variables instead of commuting c-numbers.

In quantum electrodynamics, fermions interact only through their minimal coupling to
the photon field,
\[
\mathcal{L}_I = -i\, e\, \bar\psi \gamma^\mu A_\mu \psi . \tag{3.41}
\]

As in the scalar case, this interaction can be factored out of the generating functional,
by writing:

Z → ←
δ δ
Z[η, η] = exp − ie d4 x Aµ (x) γµ Z0 [η, η] . (3.42)
iδη(x) iδη(x)

Here, we are treating the photon field as a fixed background. When we consider
the path integral representation of dynamical photons in the next section, the Aµ (x)
inside the exponential will also be replaced by a functional derivative.

3.3 Path integral for photons

3.3.1 Problems with the naive path integral

In the case of photons, the difficulties encountered in the path integral formulation
are of a different nature. Since photons are bosons, we expect that they can be
represented by a functional integration over commuting functions Aµ (x). But the
gauge invariance of the theory implies that there is an unavoidable redundancy in this
representation: the naive path integral over [DAµ (x)] would integrate over infinitely
many copies of the same physical configurations. Therefore, we need a way to cut
through this redundancy, which is achieved by gauge fixing.
In order to better see the nature of this difficulty, let us assume that we can treat
Aµ (x) as four scalar fields, and write the following path integral,
\[
Z_0[j^\mu] \equiv \int \big[ DA_\mu(x) \big] \exp\Big\{ i \int d^4x\, \Big( -\tfrac{1}{4} F^{\mu\nu} F_{\mu\nu} + j^\mu A_\mu \Big) \Big\} . \tag{3.43}
\]

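This is a Gaussian integral, since $F^{\mu\nu}F_{\mu\nu}$ is quadratic in the field $A_\mu$:
\[
\begin{aligned}
-\frac{1}{4} \int d^4x\; F^{\mu\nu} F_{\mu\nu} &= -\frac{1}{4} \int d^4x\, \big( \partial^\mu A^\nu - \partial^\nu A^\mu \big) \big( \partial_\mu A_\nu - \partial_\nu A_\mu \big) \\
&= +\frac{1}{2} \int d^4x\; A^\mu \big( g_{\mu\nu} \Box - \partial_\mu \partial_\nu \big) A^\nu \\
&= -\frac{1}{2} \int \frac{d^4k}{(2\pi)^4}\; \tilde{A}^\mu(k) \big( g_{\mu\nu} k^2 - k_\mu k_\nu \big) \tilde{A}^\nu(-k) .
\end{aligned}
\tag{3.44}
\]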

Performing this Gaussian integral requires the inverse of the object $g_{\mu\nu}k^2 - k_\mu k_\nu$,
which one may seek as a linear combination of the metric tensor $g^{\mu\nu}$ and $k^\mu k^\nu/k^2$, i.e.
we are looking for coefficients α and β such that:
\[
\underbrace{\big( g_{\mu\nu} k^2 - k_\mu k_\nu \big) \Big( \alpha\, g^{\nu\rho} + \beta\, \frac{k^\nu k^\rho}{k^2} \Big)}_{\alpha\, k^2\, \delta_\mu{}^\rho\; -\; \alpha\, k_\mu k^\rho} = \delta_\mu{}^\rho . \tag{3.45}
\]
This equation clearly has no solution: β drops out of the left hand side, and no value of α
can remove the term in $k_\mu k^\rho$ while keeping the term in $\delta_\mu{}^\rho$. It is therefore impossible
to invert $g_{\mu\nu}k^2 - k_\mu k_\nu$. This means that some eigenvalues of this operator are zero, and that the
quadratic form $\tilde{A}^\mu(k) \big( g_{\mu\nu}k^2 - k_\mu k_\nu \big) \tilde{A}^\nu(-k)$ has flat directions. Along these flat
directions, the exponential in the path integral (3.43) does not decrease, which spoils
its convergence. These flat directions correspond to the projection of $\tilde{A}^\mu(k)$ along $k^\mu$.
Note that they also do not contribute to the linear term $j^\mu A_\mu$, for a conserved current
that satisfies $\partial_\mu j^\mu = 0$. Therefore, one should not integrate over these components of
$A^\mu$ in eq. (3.43).

3.3.2 Path integral in Landau gauge

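A simple way out is to decompose $A^\mu$ as follows:
\[
\begin{aligned}
A^\mu &= A^\mu_\perp + A^\mu_\parallel , \\
\tilde{A}^\mu_\perp(k) &\equiv \Big( g^{\mu\nu} - \frac{k^\mu k^\nu}{k^2} \Big) \tilde{A}_\nu(k) , \\
\tilde{A}^\mu_\parallel(k) &\equiv \Big( \frac{k^\mu k^\nu}{k^2} \Big) \tilde{A}_\nu(k) .
\end{aligned}
\tag{3.46}
\]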

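The functional measure can be factorized as follows
\[
\big[ DA^\mu \big] = \big[ DA^\mu_\perp \big] \big[ DA^\mu_\parallel \big] , \tag{3.47}
\]
and since nothing depends on $A^\mu_\parallel$ in the photon kinetic term, we can write
\[
\begin{aligned}
Z_0[j^\mu] \equiv{} & \int \big[ DA^\mu_\parallel(x) \big] \exp\Big\{ i \int d^4x\; j_\mu A^\mu_\parallel \Big\} \\
& \times \int \big[ DA^\mu_\perp(x) \big] \exp\Big\{ i \int d^4x\, \Big( -\tfrac{1}{4} F^{\mu\nu} F_{\mu\nu} + j_\mu A^\mu_\perp \Big) \Big\} .
\end{aligned}
\tag{3.48}
\]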

By integrating by parts the argument of the exponential in the integral over $A^\mu_\parallel$, we
obtain a delta function of $\partial_\mu j^\mu$. Thus, for external currents that obey the continuity
equation $\partial_\mu j^\mu = 0$, this prefactor is an infinite constant that can be ignored. When
restricted to the subspace of $A^\mu_\perp$, the operator $g_{\mu\nu}k^2 - k_\mu k_\nu$ is invertible, and we
can now perform the Gaussian integral, to obtain:
\[
Z_0[j^\mu] = \exp\Big\{ -\frac{1}{2} \int d^4x\, d^4y\; j_\mu(x)\, G^{0\,\mu\nu}_F(x,y)\, j_\nu(y) \Big\} , \tag{3.49}
\]
with the free photon propagator in momentum space given by
\[
G^{0\,\mu\nu}_F(p) \equiv \frac{-i}{p^2 + i0^+} \Big( g^{\mu\nu} - \frac{p^\mu p^\nu}{p^2} \Big) . \tag{3.50}
\]
(We have introduced the $i0^+$ prescription that selects the ground state at $x^0 \to \pm\infty$,
using the same argument as in section 2.3.3.) The procedure used here is equivalent
to imposing the gauge fixing condition $\partial_\mu A^\mu = 0$, called Lorenz gauge or Landau
gauge. As one can see, the resulting propagator (3.50) differs from the Coulomb
gauge propagator given in eq. (1.247).

3.3.3 General covariant gauges


All gauge fixings amount to constraining in some way the quantity $\partial_\mu A^\mu$, since it
does not appear in the integrand of the photon path integral. Rather than imposing
$\partial_\mu A^\mu = 0$, one may impose the more general condition

∂µ Aµ (x) = ω(x) , (3.51)

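where ω(x) is some arbitrary function of space-time. This can be done by introducing
a functional delta function, $\delta[\partial_\mu A^\mu - \omega]$, inside the path integral. However, the
introduction of the function ω(x) breaks Lorentz invariance. To mitigate this problem,
one integrates over all the functions ω(x), with a Gaussian weight. This amounts to
defining the generating functional as follows⁴,
\[
\begin{aligned}
Z_0[j^\mu] \equiv{} & \int \big[ D\omega(x) \big] \exp\Big\{ -i\, \frac{\xi}{2} \int d^4x\; \omega^2(x) \Big\} \\
& \times \int \big[ DA_\mu(x) \big]\; \delta\big[ \partial_\mu A^\mu - \omega \big]\; \exp\Big\{ i \int d^4x\, \Big( -\tfrac{1}{4} F^{\mu\nu} F_{\mu\nu} + j^\mu A_\mu \Big) \Big\} ,
\end{aligned}
\tag{3.52}
\]
where ξ is an arbitrary constant. Performing the integration over ω(x) thanks to the
delta functional, and integrating by parts, this becomes
\[
Z_0[j^\mu] = \int \big[ DA_\mu(x) \big] \exp\Big\{ i \int d^4x\, \Big( \tfrac{1}{2} A^\mu \big( g_{\mu\nu} \Box - (1-\xi)\, \partial_\mu \partial_\nu \big) A^\nu + j^\mu A_\mu \Big) \Big\} . \tag{3.53}
\]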

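From this formula, a standard Gaussian integration tells us that the corresponding
photon propagator in momentum space should be the inverse of
\[
i\, \big( g_{\mu\nu}\, p^2 - (1-\xi)\, p_\mu p_\nu \big) . \tag{3.54}
\]
Looking for an inverse of the form $\alpha\, g^{\nu\rho} + \beta\, p^\nu p^\rho / p^2$, we find
\[
G^{0\,\mu\nu}_F(p) = \frac{-i\, g^{\mu\nu}}{p^2 + i0^+} + \frac{i}{p^2 + i0^+} \Big( 1 - \frac{1}{\xi} \Big) \frac{p^\mu p^\nu}{p^2} . \tag{3.55}
\]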
4 Since the argument of the delta function is linear in the variable $A^\mu_\parallel$ that does not appear in the
integrand, we do not need a Jacobian. It is possible to impose non-linear gauge conditions of the form
$\delta[F(\partial_\mu A^\mu) - \omega]$, but this should be done by writing the path integral as follows
\[
\int \big[ D\omega(x) \big] \exp\Big\{ -i\, \frac{\xi}{2} \int d^4x\; \omega^2(x) \Big\} \int \big[ DA_\mu(x) \big]\; \underbrace{F'(\partial_\mu A^\mu)}_{\text{Jacobian}}\; \delta\big[ F(\partial_\mu A^\mu) - \omega \big] \cdots
\]
In general, the Jacobian cannot be ignored since it depends on the gauge field, but it can be expressed in
terms of ghost fields. Doing this would be a useless complication in QED, but is an essential step in the
quantization of non-Abelian gauge theories.

The gauge fixing parameter ξ appears in the propagator, but only in the term
proportional to $p^\mu p^\nu$. Thanks to the Ward-Takahashi identities, it has no
effect on physical results, provided that all the external charged particles are on
mass-shell. The Landau gauge of the previous subsection corresponds to ξ → ∞.
Another popular choice is the Feynman gauge, obtained for ξ = 1,
\[
G^{0\,\mu\nu}_F(p) \Big|_{\xi=1} = \frac{-i\, g^{\mu\nu}}{p^2 + i0^+} . \tag{3.56}
\]
Note that one could also introduce a non Lorentz covariant condition inside the delta
function, such as $\delta[\partial_i A^i - \omega]$, in order to derive the photon propagator in Coulomb
gauge via the path integral.
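The algebra of this section can be verified numerically. The following sketch assumes a metric of signature (+,−,−,−) and an arbitrary off-shell momentum; the function names are hypothetical. It checks that the tensor structure of the propagator (3.55) inverts the operator of eq. (3.54) (the factors i and −i cancel in the product), and that without gauge fixing the operator $g_{\mu\nu}p^2 - p_\mu p_\nu$ is singular, with $p^\mu$ as a flat direction.

```python
import numpy as np

# Assumed metric signature (+,-,-,-); g_{mu nu} and g^{mu nu} have the same entries.
g = np.diag([1.0, -1.0, -1.0, -1.0])

def kinetic_operator(p_up, xi):
    """Lower-index tensor g_{mu nu} p^2 - (1 - xi) p_mu p_nu, cf. eq. (3.54) without the factor i."""
    p_low = g @ p_up
    p2 = p_up @ p_low
    return g * p2 - (1.0 - xi) * np.outer(p_low, p_low)

def propagator_tensor(p_up, xi):
    """Upper-index tensor (1/p^2) [g^{nu rho} - (1 - 1/xi) p^nu p^rho / p^2], cf. eq. (3.55) without -i."""
    p2 = p_up @ g @ p_up
    return (g - (1.0 - 1.0 / xi) * np.outer(p_up, p_up) / p2) / p2

p = np.array([1.3, 0.2, -0.7, 0.4])    # arbitrary momentum with p^2 != 0
xi = 2.5                               # arbitrary gauge parameter

# Contraction over nu: should reproduce delta_mu^rho.
product = kinetic_operator(p, xi) @ propagator_tensor(p, xi)

# xi = 0 corresponds to the operator without gauge fixing.
unfixed_det = np.linalg.det(kinetic_operator(p, 0.0))
null_vector = kinetic_operator(p, 0.0) @ p     # flat direction along p^mu

print(np.round(product, 12))
print(unfixed_det, null_vector)
```

Changing `xi` to any non-zero value should leave `product` equal to the identity, while `xi = 0` reproduces the non-invertible operator of section 3.3.1.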

3.4 Schwinger-Dyson equations


3.4.1 Functional derivation
Consider a Lagrangian density $\mathcal{L}(\phi, \partial_\mu\phi)$ (φ may be a collection of fields, but we do
not write any index on it to keep the notation light), and $S \equiv \int_x \mathcal{L}$ the corresponding
action. The generating functional of time-ordered products of fields has the following
path integral representation:
\[
Z[j] = \int \big[ D\phi(x) \big]\; e^{iS[\phi] + i \int j\phi} . \tag{3.57}
\]

In the right hand side, φ(x) should be viewed as a dummy integration variable, and
the result of the integral should be unmodified if we change φ(x) → φ(x) + δφ(x).
This translates into
\[
0 = \delta Z[j] = i \int \big[ D\phi(x) \big]\; e^{iS[\phi] + i \int j\phi} \int d^4x\; \delta\phi(x) \Big( j(x) + \frac{\delta S}{\delta\phi(x)} \Big) . \tag{3.58}
\]
Taking n functional derivatives of this identity with respect to $ij(x_1), \ldots, ij(x_n)$ and
then setting j to zero gives:
\[
0 = \int \big[ D\phi(x) \big]\; e^{iS[\phi]} \int d^4x\; \delta\phi(x) \Big\{ i\, \frac{\delta S}{\delta\phi(x)}\, \phi(x_1) \cdots \phi(x_n) + \sum_{i=1}^{n} \delta(x - x_i) \prod_{j \neq i} \phi(x_j) \Big\} . \tag{3.59}
\]
Since in this discussion the variation δφ(x) is arbitrary, this implies the following
identities
\[
0 = \int \big[ D\phi(x) \big]\; e^{iS[\phi]} \Big\{ i\, \frac{\delta S}{\delta\phi(x)}\, \phi(x_1) \cdots \phi(x_n) + \sum_{i=1}^{n} \delta(x - x_i) \prod_{j \neq i} \phi(x_j) \Big\} , \tag{3.60}
\]

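known as the Schwinger-Dyson equations (here written in functional form). For
instance, in the case of a scalar field theory with a $\phi^4$ interaction term, this leads to
\[
\begin{aligned}
& i \big( \Box_x + m^2 \big) \big\langle 0_{\rm out} \big| T\, \phi(x_1) \cdots \phi(x_n) \phi(x) \big| 0_{\rm in} \big\rangle
+ i\, \frac{\lambda}{3!} \big\langle 0_{\rm out} \big| T\, \phi(x_1) \cdots \phi(x_n) \phi^3(x) \big| 0_{\rm in} \big\rangle \\
& \qquad = \sum_{i=1}^{n} \delta(x - x_i)\, \big\langle 0_{\rm out} \big| T \prod_{j \neq i} \phi(x_j) \big| 0_{\rm in} \big\rangle .
\end{aligned}
\tag{3.61}
\]
(We have used the remark following eq. (2.30) in order to let the operator $\Box + m^2$
act also on the step functions that order the operators in the time-ordered product.)
If we convolve this equation with the free Feynman propagator (i.e. the inverse of
the operator $\Box_x + m^2$), the above Schwinger-Dyson equation can be represented
diagrammatically as follows:

[Diagram not reproduced: graphical form of eq. (3.61), in which the sum over i = 1, ..., n on the right hand side is labelled "contact terms".] (3.62)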
The Schwinger-Dyson equations have several simple consequences. When applied
to a free theory (λ = 0) in the case n = 1, we get
\[
\big( \Box_x + m^2 \big) \big\langle 0_{\rm out} \big| T\, \phi(x_1) \phi(x) \big| 0_{\rm in} \big\rangle = -i\, \delta(x - x_1) , \tag{3.63}
\]
which is nothing but the equation of motion satisfied by the Feynman propagator. In
the general case, if x differs from all the $x_i$'s, we obtain
\[
\big( \Box_x + m^2 \big) \big\langle 0_{\rm out} \big| T\, \phi(x_1) \cdots \phi(x_n) \phi(x) \big| 0_{\rm in} \big\rangle
+ \frac{\lambda}{3!} \big\langle 0_{\rm out} \big| T\, \phi(x_1) \cdots \phi(x_n) \phi^3(x) \big| 0_{\rm in} \big\rangle = 0 . \tag{3.64}
\]
Thus, in a certain sense⁵, we can say that time-ordered products of fields satisfy the
Euler-Lagrange equation of motion.
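The mechanism behind these identities, invariance of an integral under a shift of its dummy variable, can be illustrated in a toy setting. The sketch below is not the field theory of the text: it assumes a zero-dimensional Euclidean analogue, in which the path integral is replaced by an ordinary integral with weight $e^{-S(\phi)}$ and $S(\phi) = \frac{1}{2} m^2 \phi^2 + \frac{\lambda}{4!} \phi^4$. Integrating the total derivative of $\phi^n e^{-S}$ gives $\langle S'(\phi)\, \phi^n \rangle = n\, \langle \phi^{n-1} \rangle$, whose right hand side plays the role of the contact terms. The parameter values and function names are hypothetical.

```python
import numpy as np

m2, lam = 1.0, 0.8                      # hypothetical mass squared and quartic coupling

def action(phi):
    """Zero-dimensional Euclidean action S(phi)."""
    return 0.5 * m2 * phi**2 + lam / 24.0 * phi**4

def action_prime(phi):
    """Derivative S'(phi), the analogue of the equation of motion."""
    return m2 * phi + lam / 6.0 * phi**3

# Midpoint rule on [-L, L]; exp(-S) is negligible well before |phi| reaches L.
L, n_pts = 12.0, 400_000
h = 2 * L / n_pts
phi = -L + h * (np.arange(n_pts) + 0.5)
weight = np.exp(-action(phi)) * h

def expectation(values):
    """Average of `values` with the normalized weight exp(-S)."""
    return np.sum(values * weight) / np.sum(weight)

def sd_residual(n):
    """<S'(phi) phi^n> - n <phi^(n-1)>, predicted to vanish."""
    return expectation(action_prime(phi) * phi**n) - n * expectation(phi**(n - 1))

residuals = [sd_residual(n) for n in (1, 2, 3, 4)]
print(residuals)
```

All four residuals should vanish up to the quadrature error, for any positive values of `m2` and `lam`.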

3.4.2 Schwinger-Dyson equations and conserved currents


The functional derivative of the action S with respect to φ(x) is given by
\[
\frac{\delta S}{\delta\phi(x)} = \frac{\partial\mathcal{L}}{\partial\phi(x)} - \partial_\mu \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi(x))} . \tag{3.65}
\]
5 I.e., up to the terms in $\delta(x - x_i)$ that may appear in the right hand side, called contact terms. These
contact terms in fact take care of the action of the time derivative on the theta functions of the time ordering
operator T.

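When we equate this to zero, we recover the Euler-Lagrange equation of motion.
Under an infinitesimal variation δφ(x) of the field, the Lagrangian density varies by
\[
\begin{aligned}
\delta\mathcal{L} &= \frac{\partial\mathcal{L}}{\partial\phi(x)}\, \delta\phi(x) + \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi(x))}\, \partial_\mu(\delta\phi(x)) \\
&= \partial_\mu \Big( \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi(x))}\, \delta\phi(x) \Big) + \frac{\delta S}{\delta\phi(x)}\, \delta\phi(x) .
\end{aligned}
\tag{3.66}
\]
When the variation δφ(x) corresponds to a symmetry of the Lagrangian, we have
$\delta\mathcal{L} = 0$, and therefore
\[
\frac{\delta S}{\delta\phi(x)}\, \delta\phi(x) = -\partial_\mu \underbrace{\Big( \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi(x))}\, \delta\phi(x) \Big)}_{J^\mu(x)} , \tag{3.67}
\]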

where $J^\mu$ is the Noether current associated to this continuous symmetry. In the
classical theory, this current is conserved, i.e. $\partial_\mu J^\mu = 0$, if the fields obey the
Euler-Lagrange equation of motion. The Schwinger-Dyson equations provide a
quantum analogue of this conservation law, at the level of the expectation values of
time-ordered products of fields. In eq. (3.59), we can replace $\delta\phi\, (\delta S/\delta\phi)$ by $-\partial_\mu J^\mu$.
When the resulting identity is rewritten in terms of operators, the derivative $\partial_\mu$ should
go outside the time-ordering, and we obtain
\[
\partial_\mu \big\langle 0_{\rm out} \big| T\, J^\mu(x) \phi(x_1) \cdots \phi(x_n) \big| 0_{\rm in} \big\rangle
+ i \sum_{i=1}^{n} \delta(x - x_i)\, \big\langle 0_{\rm out} \big| T\, \delta\phi(x) \prod_{j \neq i} \phi(x_j) \big| 0_{\rm in} \big\rangle = 0 . \tag{3.68}
\]
Therefore, when a Noether current operator is inserted inside a time-ordered product,
it satisfies the continuity equation up to contact terms (coming from the action of $\partial_0$
on the theta functions of the T product). Eq. (3.68) is a generalization of the
Ward-Takahashi identities, already discussed in the context of electric charge conservation.

Note that in some cases, a continuous symmetry does not leave the Lagrangian
density invariant, but modifies it by a total derivative,
\[
\delta\mathcal{L} = \partial_\mu K^\mu , \tag{3.69}
\]
so that only the action is invariant. There is still a conserved current, given by
\[
J^\mu(x) \equiv \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi(x))}\, \delta\phi(x) - K^\mu(x) . \tag{3.70}
\]
This however does not modify eq. (3.68).

3.5 Quantum anomalies


3.5.1 General considerations
It may happen that some symmetries of the Lagrangian (i.e. symmetries of the
classical theory) are broken by quantum corrections. This phenomenon is called a
quantum anomaly. One way this may appear is via the introduction of a regularization
(e.g. a cutoff), whose effect leaves an imprint on physical results even after the cutoff
has been taken to infinity. Here we will adopt a functional point of view on this issue.
In the previous section, a crucial point in the derivation of the Schwinger-Dyson
equations was that the functional measure must be invariant under the symmetry under
consideration. Quantum anomalies may be viewed as an obstruction to defining a
functional measure that is invariant under certain symmetries, e.g. axial symmetry.

Let us consider a set of fermion fields $\psi_n(x)$, that we encapsulate into a multiplet
denoted ψ(x), and assume that they interact with a gauge potential $A^a_\mu(x)$ in a
non-chiral way (this is the case of electromagnetic interactions and of strong interactions).
Consider now the following transformation of the fermion fields:
\[
\psi(x) \to U(x)\, \psi(x) . \tag{3.71}
\]
The Hermitean conjugate of ψ transforms as:
\[
\psi^\dagger(x) \to \psi^\dagger(x)\, U^\dagger(x) , \tag{3.72}
\]
so that we have
\[
\bar\psi(x) \equiv \psi^\dagger(x) \gamma^0 \to \psi^\dagger(x) U^\dagger(x) \gamma^0 = \bar\psi(x)\, \gamma^0 U^\dagger(x) \gamma^0 . \tag{3.73}
\]
Since $\psi$ and $\bar\psi$ are Grassmann variables, the measure should be transformed with the
inverse of the determinant of the transformation. Since the transformation under
consideration is local in x, it reads
\[
\big[ D\psi D\bar\psi \big] \to \frac{1}{\det(U)\, \det(\bar{U})}\, \big[ D\psi D\bar\psi \big] , \tag{3.74}
\]
where the matrices $U$ and $\bar{U}$ carry both indices for the fermion species and space-time
indices:
\[
\begin{aligned}
U_{xm,yn} &\equiv U_{mn}(x)\, \delta(x - y) , \\
\bar{U}_{xm,yn} &\equiv \big( \gamma^0 U^\dagger(x) \gamma^0 \big)_{mn}\, \delta(x - y) .
\end{aligned}
\tag{3.75}
\]

3.5.2 Non-chiral transformations


Let us consider the following transformation:
\[
U(x) = e^{i\alpha(x) t} , \tag{3.76}
\]
where $\alpha(x) \in \mathbb{R}$ and where t is a Hermitean matrix that does not contain
$\gamma^5 \equiv i \gamma^0 \gamma^1 \gamma^2 \gamma^3$. Therefore:
\[
U^\dagger(x) = e^{-i\alpha(x) t} , \tag{3.77}
\]
and
\[
\begin{aligned}
(\bar{U} U)_{xm,yn} &= \int d^4z \sum_p \bar{U}_{xm,zp}\, U_{zp,yn} \\
&= \int d^4z \sum_p \delta(x - z)\, \delta(z - y)\, \big( e^{-i\alpha(z) t} \big)_{mp} \big( e^{i\alpha(z) t} \big)_{pn} \\
&= \delta_{mn}\, \delta(x - y) .
\end{aligned}
\tag{3.78}
\]
Thus $\bar{U} U = 1$, which implies $\det \bar{U}\, \det U = 1$, and the fermion measure is invariant
under this kind of transformation. This means that this symmetry does not exhibit
quantum anomalies.

3.5.3 Chiral transformations


Let us now define the right-handed and left-handed projections of a spinor,
\[
\psi_R \equiv \Big( \frac{1 + \gamma^5}{2} \Big) \psi , \qquad \psi_L \equiv \Big( \frac{1 - \gamma^5}{2} \Big) \psi , \tag{3.79}
\]
and consider a transformation that acts differently on these two components:
\[
U(x) = e^{i\alpha(x) \gamma^5 t} , \tag{3.80}
\]
where t is again a Hermitean matrix. Such transformations are called chiral
transformations. The matrix $\gamma^5 \equiv i \gamma^0 \gamma^1 \gamma^2 \gamma^3$ satisfies
\[
\big( \gamma^5 \big)^2 = 1 , \qquad \gamma^{5\dagger} = \gamma^5 , \qquad \{ \gamma^5, \gamma^0 \} = 0 , \tag{3.81}
\]
which implies:
\[
\gamma^0 U^\dagger(x) \gamma^0 = \gamma^0 e^{-i\alpha(x) \gamma^5 t} \gamma^0 = e^{i\alpha(x) \gamma^5 t} = U(x) . \tag{3.82}
\]
Thus $\bar{U} = U$, and $\det \bar{U} = \det U$. Unless this determinant is equal to one, the measure
is not invariant and transforms according to:
\[
\big[ D\psi D\bar\psi \big] \to \frac{1}{(\det U)^2}\, \big[ D\psi D\bar\psi \big] . \tag{3.83}
\]

Consider an infinitesimal transformation of the form given in eq. (3.80). We can
write:
\[
(U - 1)_{xm,yn} = i\, \alpha(x)\, (\gamma^5 t)_{mn}\, \delta(x - y) . \tag{3.84}
\]
In order to calculate $(\det U)^{-2}$, we use the formula⁶:
\[
(\det U)^{-2} = e^{-2\, \mathrm{tr}\, \ln U} . \tag{3.85}
\]
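The identity (3.85) is easy to test on a finite matrix. The sketch below assumes a random 4 × 4 Hermitean matrix standing in for the generator (here simply called `t`) and a constant angle `alpha`; both are hypothetical choices made for illustration. It builds $U = e^{i\alpha t}$ from the spectral decomposition of `t` and compares $(\det U)^{-2}$ with $e^{-2\, \mathrm{tr} \ln U}$.

```python
import numpy as np

rng = np.random.default_rng(1)           # arbitrary seed
n = 4
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
t = 0.5 * (A + A.conj().T)               # random Hermitean matrix (illustrative generator)
alpha = 0.3                              # illustrative constant angle

# U = exp(i alpha t), built from the eigen-decomposition of t.
evals, evecs = np.linalg.eigh(t)
U = evecs @ np.diag(np.exp(1j * alpha * evals)) @ evecs.conj().T

# The eigenvalues of U are exp(i alpha e_k), so tr ln U = i alpha * sum_k e_k = i alpha tr(t).
tr_ln_U = 1j * alpha * np.sum(evals)

lhs = np.linalg.det(U) ** (-2)
rhs = np.exp(-2 * tr_ln_U)
print(lhs, rhs)
```

Since `tr_ln_U` equals $i\alpha\, \mathrm{tr}(t)$, the check also shows why only the trace of the generator matters in the determinant.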

In the present case, we have:
\[
\begin{aligned}
(\det U)^{-2} &= \exp\Big[ -2\, \mathrm{tr}\, \ln\big( 1 + i\alpha(x)\, \gamma^5 t\; \delta(x - y) \big) \Big] \\
&\underset{\alpha \ll 1}{\approx} \exp\Big[ -2i\, \mathrm{tr}\, \big( \alpha(x)\, \gamma^5 t\; \delta(x - y) \big) \Big] \\
&= \exp\Big[ i \int d^4x\; \alpha(x)\, A(x) \Big] ,
\end{aligned}
\tag{3.86}
\]
with a function A(x) whose formal expression is
\[
A(x) \equiv -2\, \mathrm{tr}\, (\gamma^5 t)\; \delta(x - x) . \tag{3.87}
\]
In this equation, the trace symbol tr denotes both a trace on the indices carried by
the Dirac matrices and a trace on the fermion species. In terms of this function, the
measure transforms as
\[
\big[ D\psi D\bar\psi \big] \to e^{i \int d^4x\, \alpha(x) A(x)}\, \big[ D\psi D\bar\psi \big] . \tag{3.88}
\]
The fact that this measure is not invariant under the transformation (3.80) implies
that there exist fermion loop corrections that break the invariance under chiral
transformations, even if the Dirac Lagrangian itself is invariant (this is the case when
one considers a global transformation, i.e. a constant α(x), and the fermions are
massless). The prefactor that alters the measure can be absorbed into a redefinition of
the Lagrangian,
\[
\mathcal{L}(x) \to \mathcal{L}(x) + \alpha(x)\, A(x) . \tag{3.89}
\]
Everything happens as if the Lagrangian itself were not invariant under this transformation.
If one integrates out the fermion fields in order to obtain an effective theory for the
other fields, the term in α(x)A(x) must be included in the Lagrangian of this effective
theory in order to correctly account for the quantum anomalies.

6 If the $\lambda_i$ are the eigenvalues of U, we have:
\[
\det U = \prod_i \lambda_i = \exp\Big( \sum_i \ln \lambda_i \Big) = e^{\mathrm{tr}\, \ln U} .
\]

3.5.4 Calculation of A(x)

At first sight, the expression (3.87) of the anomaly function A(x) is very poorly
defined: the trace is zero, but it is multiplied by an infinite δ(0). In order to manipulate
finite expressions, we must first regularize the delta function. This can be done by
writing:
\[
A(x) = -2 \lim_{y \to x,\, M \to +\infty} \mathrm{tr}\, \Big\{ \gamma^5 t\; F\Big( -\frac{\slashed{D}_x^2}{M^2} \Big) \Big\}\, \delta(x - y) , \tag{3.90}
\]
where $\slashed{D}_x$ is the Dirac operator⁷
\[
\slashed{D}_x \equiv \gamma^\mu \big( \partial_\mu - i g\, t^a A^a_\mu(x) \big) , \tag{3.91}
\]
and where F(s) is a function such that
\[
F(0) = 1 , \qquad F(+\infty) = 0 , \qquad s\, F'(s) = 0 \ \text{ at } s = 0 \text{ and at } s = +\infty . \tag{3.92}
\]

A covariant derivative is mandatory in eq. (3.90), since an ordinary derivative would
break gauge invariance. Then, we replace the delta function by its Fourier representation:
\[
\delta(x - y) = \int \frac{d^4k}{(2\pi)^4}\; e^{ik(x - y)} , \tag{3.93}
\]
which leads to
\[
\begin{aligned}
A(x) &= -2 \lim_{y \to x,\, M \to +\infty} \int \frac{d^4k}{(2\pi)^4}\; \mathrm{tr}\, \Big\{ \gamma^5 t\; F\Big( -\frac{\slashed{D}_x^2}{M^2} \Big) \Big\}\, e^{ik(x - y)} \\
&= -2 \lim_{M \to +\infty} \int \frac{d^4k}{(2\pi)^4}\; \mathrm{tr}\, \Big\{ \gamma^5 t\; F\Big( -\frac{(i\slashed{k} + \slashed{D}_x)^2}{M^2} \Big) \Big\} .
\end{aligned}
\tag{3.94}
\]
The second equality follows from
\[
\lim_{y \to x} F(\partial_x)\, e^{ik \cdot (x - y)} = F(ik + \partial_x) . \tag{3.95}
\]
7 We are considering here the case where the fermions are coupled to a non-Abelian gauge field. The
index a carried by $A^a_\mu$ is a “colour” index, and the $t^a$'s are the generators of the Lie algebra representation
where the fermions live. g is the coupling of the fermions to the gauge fields. See the next chapter for more
details.

The function A(x) can then be rewritten as follows:
\[
A(x) = -2 \lim_{M \to +\infty} M^4 \int \frac{d^4k}{(2\pi)^4}\; \mathrm{tr}\, \Big\{ \gamma^5 t\; F\Big( -\Big( i\slashed{k} + \frac{\slashed{D}_x}{M} \Big)^2 \Big) \Big\} , \tag{3.96}
\]
by redefining the integration variable, k → Mk. Then, we can write:
\[
-\Big( i\slashed{k} + \frac{\slashed{D}_x}{M} \Big)^2 = k^2 - 2i\, \frac{k \cdot D_x}{M} - \frac{\slashed{D}_x^2}{M^2} , \tag{3.97}
\]
and expand the function F(·) in powers of 1/M. The only terms that give a non-zero
contribution to A(x) should not go to zero too quickly when M → +∞: because of the
prefactor $M^4$, only the terms that decrease no faster than $1/M^4$ should be kept. Moreover,
the Dirac trace should be non-zero, which implies that the matrix $\gamma^5$ must be accompanied
by at least four ordinary $\gamma^\mu$ matrices. The matrices $\gamma^\mu$ come from the term $\slashed{D}_x^2$ in eq. (3.97), that
brings two of them⁸, and we therefore need to go to the second order in the Taylor
expansion of the function F(·). In fact, a single term fulfills all these constraints:
\[
A(x) = -\int \frac{d^4k}{(2\pi)^4}\; F''(k^2)\; \mathrm{tr}\, \big( \gamma^5 t\, \slashed{D}_x^4 \big) . \tag{3.98}
\]
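By a Wick rotation ($k \to i\kappa$, $k^2 \to \kappa^2$), we obtain⁹:
\[
\int d^4k\; F''(k^2) = 2i\pi^2 \int_0^{+\infty} d\kappa\; \kappa^3\, F''(\kappa^2) = i\pi^2 . \tag{3.99}
\]
The last equality is obtained by two successive integrations by parts: with $s \equiv \kappa^2$,
the radial integral is $\frac{1}{2} \int_0^{+\infty} ds\; s\, F''(s) = \frac{1}{2} \big[ s F'(s) - F(s) \big]_0^{+\infty} = \frac{1}{2}$, thanks to
the conditions (3.92). We also have:
\[
\begin{aligned}
\slashed{D}_x^2 &= D^\mu_x D^\nu_x\, \gamma_\mu \gamma_\nu \\
&= \frac{1}{2}\, D^\mu_x D^\nu_x\, \big( \{ \gamma_\mu, \gamma_\nu \} + [ \gamma_\mu, \gamma_\nu ] \big) \\
&= D_x^2 + \frac{1}{4}\, [ D^\mu_x, D^\nu_x ]\, [ \gamma_\mu, \gamma_\nu ] \\
&= D_x^2 - \frac{ig}{4}\, t^a F^{\mu\nu}_a\, [ \gamma_\mu, \gamma_\nu ] .
\end{aligned}
\tag{3.100}
\]
Using
\[
\mathrm{tr}\, \big( \gamma^5 \gamma_\mu \gamma_\nu \gamma_\rho \gamma_\sigma \big) = -4i\, \epsilon_{\mu\nu\rho\sigma} , \tag{3.101}
\]
we obtain
\[
A(x) = -\frac{g^2}{16\pi^2}\; \epsilon_{\mu\nu\rho\sigma}\, F^{\mu\nu}_a(x)\, F^{\rho\sigma}_b(x)\; \mathrm{tr}\, (t^a t^b t) , \tag{3.102}
\]
where the trace is now only on the fermion species. When t is the identity matrix,
the integral of A(x) depends only on topological properties of the gauge field
configuration and takes discrete values. In the context of anomalies, it is called the
Chern-Pontryagin index. Moreover, the Atiyah-Singer theorem relates this invariant
to the zero modes of the Euclidean Dirac operator in this gauge field (see section
3.5.7).

8 In this counting, we assume that the matrix t does not contain Dirac matrices.
9 Recall that the rotationally invariant measure in 4-dimensional Euclidean space is $2\pi^2 \kappa^3 d\kappa$.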
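Two ingredients of this calculation lend themselves to a quick numerical check. The sketch below makes several assumptions that are not in the text: two sample regulators, $F(s) = e^{-s}$ and $F(s) = 1/(1+s)^2$, both satisfying the conditions (3.92); the Dirac representation of the gamma matrices with upper indices; and the convention $\epsilon^{0123} = +1$. It verifies that the radial integral of eq. (3.99) equals 1/2 independently of F, and that the trace of eq. (3.101) equals −4i for the index ordering 0123.

```python
import numpy as np

# --- Radial integral of eq. (3.99): int_0^inf dkappa kappa^3 F''(kappa^2) = 1/2 ---
def radial_integral(F_second, n_pts=200_000):
    """Midpoint rule after the change of variable kappa = u / (1 - u), with u in (0, 1)."""
    h = 1.0 / n_pts
    u = h * (np.arange(n_pts) + 0.5)
    kappa = u / (1.0 - u)
    jacobian = 1.0 / (1.0 - u) ** 2
    return np.sum(kappa**3 * F_second(kappa**2) * jacobian) * h

I_exp = radial_integral(lambda s: np.exp(-s))              # F(s) = exp(-s), F'' = exp(-s)
I_rat = radial_integral(lambda s: 6.0 / (1.0 + s) ** 4)    # F(s) = 1/(1+s)^2, F'' = 6/(1+s)^4

# --- Dirac trace of eq. (3.101), Dirac representation (an assumed choice) ---
I2, Z2 = np.eye(2), np.zeros((2, 2))
pauli = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]]),
         np.array([[1, 0], [0, -1]], dtype=complex)]
gamma = [np.block([[I2, Z2], [Z2, -I2]]).astype(complex)]       # gamma^0
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in pauli]         # gamma^1, gamma^2, gamma^3
gamma5 = 1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]

trace_0123 = np.trace(gamma5 @ gamma[0] @ gamma[1] @ gamma[2] @ gamma[3])
trace_1023 = np.trace(gamma5 @ gamma[1] @ gamma[0] @ gamma[2] @ gamma[3])   # odd permutation
trace_two = np.trace(gamma5 @ gamma[0] @ gamma[1])                          # fewer than four gammas

print(I_exp, I_rat)
print(trace_0123, trace_1023, trace_two)
```

The vanishing of `trace_two` illustrates why $\gamma^5$ must be accompanied by at least four gamma matrices, and the agreement of `I_exp` and `I_rat` illustrates that the anomaly does not depend on the choice of regulator.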

3.5.5 Anomaly of the axial current


When the action is invariant under global chiral transformations, its variation under
a local chiral transformation may be written as
\[
\delta S = \int d^4x\; J^\mu_5(x)\, \partial_\mu \alpha(x) , \tag{3.103}
\]
where $J^\mu_5(x)$ is the axial current. Integrating by parts, and identifying this variation
with the term obtained in the previous section, we should have
\[
\big\langle \partial_\mu J^\mu_5(x) \big\rangle_A = -\frac{g^2}{16\pi^2}\; \epsilon_{\mu\nu\rho\sigma}\, F^{\mu\nu}_a(x)\, F^{\rho\sigma}_b(x)\; \mathrm{tr}\, (t^a t^b t) , \tag{3.104}
\]
where $\langle \cdot \rangle_A$ is an average over the fermion fields, in a fixed gauge field configuration.

3.5.6 Axial anomaly in the u and d quarks sector


Consider the sector of the two lightest quark flavours, u and d. If one neglects their
masses, the corresponding action is invariant under the following chiral transformations:
\[
\delta u = i\, \alpha \gamma^5 u , \qquad \delta d = -i\, \alpha \gamma^5 d . \tag{3.105}
\]
The matrix t in quark flavour space that corresponds to this transformation is
\[
t = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} . \tag{3.106}
\]

Strong interaction: Through the strong interactions, all quark flavours couple
identically to the gluons (i.e. all quarks belong to the same representation of the
SU(3) algebra). In other words, the matrices $t^a$ that describe this coupling do not
depend on the quark flavour (equivalently, one may say that they are proportional to
the identity in quark flavour space). The trace that appears in the anomaly function
can be factored into separate flavour and colour factors
\[
\mathrm{tr}\, (t^a t^b t) = \mathrm{tr}_{\rm colour} (t^a t^b) \times \underbrace{\mathrm{tr}_{\rm flavour} (t)}_{1 - 1 = 0} = 0 . \tag{3.107}
\]
This means that the anomalies that may occur in the gluon-gluon term cancel between
the u and d flavours of quarks.

Electromagnetic interaction: The situation is different with electromagnetic
interactions, because the u and d quarks have different electrical charges. Now the
matrices $t^a$ are the direct product of a charge matrix in flavour space,
\[
Q \equiv \begin{pmatrix} \frac{2}{3} & 0 \\ 0 & -\frac{1}{3} \end{pmatrix} , \tag{3.108}
\]
and the identity $1_{\rm colour}$ in colour space, since all the quark colours couple identically
to photons. Therefore, the trace in the anomaly function is
\[
\mathrm{tr}_{\rm flavour} (Q^2 t) \times \mathrm{tr}_{\rm colour} (1_{\rm colour}) = \frac{N_c}{3} , \tag{3.109}
\]
where $N_c = 3$ is the number of colours. This leads to
\[
A(x) = -\frac{e^2 N_c}{48\pi^2}\; \epsilon_{\mu\nu\rho\sigma}\, F^{\mu\nu}(x)\, F^{\rho\sigma}(x) , \tag{3.110}
\]
where $F^{\mu\nu}$ is the electromagnetic field strength.
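The flavour and colour traces of eqs. (3.107) and (3.109) involve only diagonal matrices, so they can be checked with exact rational arithmetic. In the sketch below the variable names are illustrative; the lists hold the diagonal entries of t and Q in the (u, d) basis.

```python
from fractions import Fraction

N_c = 3
t = [Fraction(1), Fraction(-1)]            # diagonal of t, eq. (3.106), in the (u, d) basis
Q = [Fraction(2, 3), Fraction(-1, 3)]      # diagonal of Q, eq. (3.108)

# Strong interactions: the flavour trace of t alone appears, eq. (3.107).
trace_flavour_t = sum(t)

# Electromagnetic interactions: flavour trace of Q^2 t, times tr_colour(1) = N_c, eq. (3.109).
trace_flavour_Q2t = sum(q * q * ti for q, ti in zip(Q, t))
trace_photon = trace_flavour_Q2t * N_c

# Replacing g^2 tr(t^a t^b t) / 16 by e^2 * trace_photon / 16 gives the coefficient of eq. (3.110).
coefficient = Fraction(1, 16) * trace_photon

print(trace_flavour_t, trace_flavour_Q2t, trace_photon, coefficient)
```

The last number reproduces the factor $N_c/48$ of eq. (3.110) once the overall $-e^2/\pi^2$ is restored.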

Decay of the neutral pion into two photons: At low energy, the strong interactions
may be described by an effective theory that couples a doublet of fermions ψ (the u
and d quarks), the three pions $\boldsymbol{\pi}$ and a field σ. The interaction term in this model is
\[
\mathcal{L}_I \equiv \lambda\, \bar\psi \big( \sigma + i\, \boldsymbol{\pi} \cdot \boldsymbol{\sigma}\, \gamma^5 \big) \psi , \tag{3.111}
\]
where $\sigma^i$ (i = 1, 2, 3) are the Pauli matrices. Note that $\pi^3$ must be the neutral pion,
since it couples diagonally to the two components of the doublet ($\sigma^3$ is a diagonal
matrix). This interaction term is invariant under the transformation (3.105) provided
that the fields σ and $\boldsymbol{\pi}$ transform as
\[
\sigma \to \sigma - \alpha\, \pi^3 , \qquad \pi^{1,2} \to \pi^{1,2} , \qquad \pi^3 \to \pi^3 + \alpha\, \sigma . \tag{3.112}
\]
Moreover, the masses of nucleons are due to a spontaneous breaking of this symmetry,
in which the σ field has a non-zero expectation value in the ground state: $\langle \sigma \rangle = f_\pi$.
Thus the variation of the field $\pi^3$ is $\delta\pi^3 = f_\pi\, \alpha$.
When photons are added to this model, there is no direct coupling between the
neutral pion and the photon. Let us now consider the theory that would result from
integrating out the quark fields. The anomaly (3.110) would produce a term
\[
\mathcal{L}_{\rm anom}(x) = -\frac{e^2 N_c}{48\pi^2}\; \epsilon_{\mu\nu\rho\sigma}\, F^{\mu\nu}(x)\, F^{\rho\sigma}(x)\; \alpha(x) \tag{3.113}
\]
in the Lagrangian. This term should be canceled somehow, because we are now
talking about an effective theory of pions and photons, that should be chiral invariant.

The resolution of this issue is that this effective theory contains a coupling between
the neutral pion and two photons, of the form:
\[
\mathcal{L}_{\pi^0 \gamma\gamma} = -\frac{e^2 N_c}{48\pi^2 f_\pi}\; \epsilon_{\mu\nu\rho\sigma}\, F^{\mu\nu}(x)\, F^{\rho\sigma}(x)\; \pi^3(x) . \tag{3.114}
\]
The decay rate of a neutral pion into two photons can be easily determined from the
effective coupling (3.114):
\[
\Gamma(\pi^0 \to 2\gamma) = \frac{N_c^2\, \alpha_{\rm em}^2\, m_\pi^3}{144\pi^3 f_\pi^2} . \tag{3.115}
\]
This result could also be obtained by computing the transition amplitude at one loop
from a neutral pion to two photons in the effective model we started from. The present
considerations show that this decay is in fact controlled to a large extent by a quantum
anomaly.

3.5.7 Atiyah-Singer index theorem

Covariant derivatives $D_\mu = \partial_\mu - i g\, A^a_\mu t^a$ are anti-Hermitean, because the gauge
potential $A^a_\mu$ is real and the colour matrices $t^a$ are Hermitean (recall that an ordinary
derivative is anti-Hermitean). However, $\gamma^0$ is Hermitean, while $\gamma^{1,2,3}$ are
anti-Hermitean. Therefore, the Dirac operator $D_\mu \gamma^\mu$ in Minkowski space is neither
Hermitean nor anti-Hermitean.
Let us introduce a Euclidean time via $x^4 \equiv i x^0$. Likewise, we also have:
\[
\partial_4 = i\, \partial_0 , \qquad A_4 = i A_0 , \qquad \gamma^4 = i \gamma^0 , \tag{3.116}
\]
and the measure over space-time becomes $d^4x = i\, d^4x_E$ where $d^4x_E$ is the measure
over 4-dimensional Euclidean space ($d^4x_E = dx^1 dx^2 dx^3 dx^4$). The Dirac operator
becomes:
\[
\slashed{D} = \sum_{i=1}^{4} \big( \partial_i - i g\, A^a_i t^a \big)\, \gamma^i , \tag{3.117}
\]
where the index i runs from 1 to 4. Now, the Dirac matrices $\gamma^i$ are all anti-Hermitean,
which implies that the Euclidean Dirac operator is Hermitean. It can therefore be
diagonalized in an orthonormal basis of eigenfunctions $\phi_k$:
\[
\begin{aligned}
\slashed{D}_x\, \phi_k(x) &= \lambda_k\, \phi_k(x) , \\
\int d^4x_E\; \phi_k^\dagger(x)\, \phi_{k'}(x) &= \delta_{kk'} ,
\end{aligned}
\tag{3.118}
\]

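with real eigenvalues $\lambda_k$. Note also that these eigenfunctions must obey the following
completeness relation:
\[
\sum_k \phi_k(x)\, \phi_k^\dagger(y) = \delta(x - y) . \tag{3.119}
\]
Consider now the case where t is the identity,
\[
t\, \phi_k(x) = \phi_k(x) , \tag{3.120}
\]
and use the completeness identity in order to express the delta function in the anomaly
function A(x) in eq. (3.90):
\[
\begin{aligned}
A(x) &= -2 \lim_{y \to x,\, M \to +\infty} \mathrm{tr}\, \Big\{ \gamma^5\, F\Big( -\frac{\slashed{D}_x^2}{M^2} \Big) \sum_k \phi_k(x)\, \phi_k^\dagger(y) \Big\} \\
&= -2 \lim_{y \to x,\, M \to +\infty} \sum_k \mathrm{tr}\, \Big\{ \phi_k^\dagger(y)\, \gamma^5\, F\Big( -\frac{\slashed{D}_x^2}{M^2} \Big) \phi_k(x) \Big\} \\
&= -2 \lim_{M \to +\infty} \sum_k F\Big( -\frac{\lambda_k^2}{M^2} \Big)\, \phi_k^\dagger(x)\, \gamma^5\, \phi_k(x) .
\end{aligned}
\tag{3.121}
\]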

Thus, we obtain the following relationship,


Z
g2
d4 xE ǫijkl Fa b a b
ij (x) Fkl (x) tr(t t )
32π2
1
Z X  λ2  Z
=− d4 xE A(x) = lim F − k2 d4 xE φ†k (x)γ5 φk (x) ,
2 M→+∞ M
k
(3.122)

between an integral that involves the field strength of a gauge field configuration and
a sum over the spectrum of the Euclidean Dirac operator (in the same gauge field). c sileG siocnarF

Since γ5 anticommutes with the Dirac operator,


 5
γ ,D/ =0, (3.123)

the state φk′ ≡ γ5 φk (x) is also an eigenfunction of D


/ x with the eigenvalue −λk :

/ x (γ5 φk (x)) = −λk (γ5 φk (x)) .


D (3.124)

/ x is Hermitean,
When λk 6= 0, the state φk′ is distinct from the state φk (x). Since D
they are in fact orthogonal:
Z Z
d xE φk (x)γ φk (x) = d4 xE φ†k (x)φk′ (x) = 0 .
4 † 5
(3.125)
142 F. G ELIS – A S TROLL T HROUGH Q UANTUM F IELDS

This implies that none of the eigenfunctions $\phi_k$ with a non-zero eigenvalue can
contribute to the right hand side of eq. (3.122). The only contributions to eq. (3.122)
come from the eigenfunctions for which $\lambda_k = 0$, i.e. the zero modes of the Euclidean
Dirac operator. Since we have assumed that F(0) = 1, we have:
\[
\frac{g^2}{32\pi^2} \int d^4x_E\; \epsilon_{ijkl}\, F^a_{ij}(x)\, F^b_{kl}(x)\; \mathrm{tr}\, (t^a t^b) = \sum_{k | \lambda_k = 0} \int d^4x_E\; \phi_k^\dagger(x)\, \gamma^5\, \phi_k(x) . \tag{3.126}
\]
Since $\{ \gamma^5, \slashed{D}_x \} = 0$, we can choose these zero modes in such a way that they are also
eigenmodes of $\gamma^5$, with eigenvalues +1 or −1. We can thus divide the zero modes in
two families, the right-handed and the left-handed zero modes:
\[
\begin{aligned}
\slashed{D}_x\, \phi_R(x) &= 0 , & \gamma^5 \phi_R(x) &= +\phi_R(x) , \\
\slashed{D}_x\, \phi_L(x) &= 0 , & \gamma^5 \phi_L(x) &= -\phi_L(x) .
\end{aligned}
\tag{3.127}
\]
Using also the fact that the eigenfunctions are normalized as follows,
\[
\int d^4x_E\; \phi_R^\dagger(x)\, \phi_R(x) = 1 , \qquad \int d^4x_E\; \phi_L^\dagger(x)\, \phi_L(x) = 1 , \tag{3.128}
\]
we obtain the following identity
\[
\frac{g^2}{32\pi^2} \int d^4x_E\; \epsilon_{ijkl}\, F^a_{ij}(x)\, F^b_{kl}(x)\; \mathrm{tr}\, (t^a t^b) = n_R - n_L , \tag{3.129}
\]
where $n_R$ and $n_L$ are the numbers of right-handed and left-handed zero modes,
respectively. This formula is the Atiyah-Singer index theorem. It tells us that the
integral in the left hand side is an integer, despite being the integral of a quantity that
changes continuously when one deforms the gauge field. Different considerations,
from the study of Euclidean gauge field configurations known as instantons, provide
another insight on this integral by relating it to the third homotopy group of the gauge
group, $\pi_3(SU(N_c)) = \mathbb{Z}$.
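The pairing argument of eqs. (3.123) to (3.126) can be illustrated with finite matrices. The sketch below is only a toy model under stated assumptions: a matrix `Gamma` with eigenvalues ±1 stands in for $\gamma^5$, a random Hermitean matrix `D` that anticommutes with it stands in for the Euclidean Dirac operator, and the block sizes `n_plus` and `n_minus` are hypothetical. A Gaussian damping factor $e^{-\lambda_k^2/M^2}$ plays the role of the regulator. The regulated sum over the spectrum does not depend on M and counts the difference between the numbers of zero modes of each chirality.

```python
import numpy as np

rng = np.random.default_rng(2)          # arbitrary seed
n_plus, n_minus = 5, 3                  # hypothetical sizes of the chirality +1 and -1 blocks

# Toy stand-ins: Gamma for gamma^5, D for the Euclidean Dirac operator.
B = rng.normal(size=(n_plus, n_minus)) + 1j * rng.normal(size=(n_plus, n_minus))
D = np.block([[np.zeros((n_plus, n_plus)), B],
              [B.conj().T, np.zeros((n_minus, n_minus))]])
Gamma = np.diag([1.0] * n_plus + [-1.0] * n_minus)

anticommutator = Gamma @ D + D @ Gamma              # vanishes by construction

evals, evecs = np.linalg.eigh(D)                    # D is Hermitean: real eigenvalues
# chirality[k] = phi_k^dagger Gamma phi_k for each eigenvector phi_k
chirality = np.real(np.einsum("ik,ij,jk->k", evecs.conj(), Gamma, evecs))

def regulated_sum(M):
    """sum_k exp(-lambda_k^2 / M^2) * phi_k^dagger Gamma phi_k."""
    return np.sum(np.exp(-evals**2 / M**2) * chirality)

sums = [regulated_sum(M) for M in (0.5, 2.0, 50.0)]
print(sums)
```

Each entry of `sums` should equal `n_plus - n_minus`: the modes with non-zero eigenvalue come in pairs ±λ and drop out, as in eq. (3.125), so that only the zero modes contribute.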
Chapter 4

Non-Abelian gauge symmetry

Gauge theories are quantum field theories with matter fields (usually spin 1/2
fermions, but also possibly scalars) and gauge potentials, arranged in such a way that the
Lagrangian is invariant under the action of a local continuous transformation. Quantum
Electrodynamics is the simplest such theory, with a local U(1) invariance. Given
Ω(x) ∈ U(1), the various objects that enter in the theory transform as follows:
\[
\begin{aligned}
\psi &\to \Omega^{-1} \psi , \\
A_\mu &\to A_\mu + \frac{i}{e}\, \Omega^{-1} \partial_\mu \Omega , \\
F_{\mu\nu} &\to F_{\mu\nu} , \\
D_\mu &\to \Omega^{-1} D_\mu \Omega = \partial_\mu - i e \Big( A_\mu + \frac{i}{e}\, \Omega^{-1} \partial_\mu \Omega \Big) .
\end{aligned}
\tag{4.1}
\]

In this construction, the gauge transformation of $A_\mu$ could have been found by
requiring that $D_\mu \psi$ transforms as ψ itself,
\[
D_\mu \psi \to \Omega^{-1}(x)\, D_\mu \psi , \tag{4.2}
\]
with $D_\mu \equiv \partial_\mu - i e A_\mu$ the covariant derivative. The field strength $F_{\mu\nu}$ would then
be defined as $\partial_\mu A_\nu - \partial_\nu A_\mu$.
Our goal is now to generalize the concept of gauge theory to more general groups
of transformations, in view of applications to the electroweak and to the strong
interactions. In these two cases, the internal group of transformations is SU(2) and
SU(3), respectively, but we will consider in most of this chapter a general Lie group:
we want to construct a consistent field theory that generalizes eqs. (4.1) to the case
where Ω(x) belongs to some general Lie group G.

4.1 Non-abelian Lie groups and algebras


4.1.1 Lie groups
Let us start by recalling that a Lie group is a group which is also a smooth manifold.
The group operation will be denoted multiplicatively, as in $\Omega_2 \Omega_1$, and we will denote
the identity element by 1 and the inverse of a group element Ω by $\Omega^{-1}$. The fact
that a Lie group G is also a manifold allows one to use concepts of differential geometry
in its study.
Matrix Lie groups, which will be our main concern in view of applications to
quantum field theory, are closed subgroups of $GL(n, \mathbb{C})$, the general linear group of
invertible n × n matrices on the field of complex numbers. Here is a list of some classical
examples of matrix Lie groups, along with their definition:
\[
\begin{array}{lll}
\text{Special linear groups:} & SL(n, \mathbb{C}),\ SL(n, \mathbb{R}) & \det(\Omega) = 1 \\
\text{Special orthogonal group:} & SO(n) & \Omega^T \Omega = 1 ,\ \det \Omega = 1 \\
\text{Unitary group:} & U(n) & \Omega^\dagger \Omega = 1 \\
\text{Special unitary group:} & SU(n) & \Omega^\dagger \Omega = 1 ,\ \det \Omega = 1
\end{array}
\tag{4.3}
\]

4.1.2 Lie algebras


Geometrically, the Lie algebra g is a vector space that may be viewed as tangent to the
group at the identity Ω = 1. Therefore, its dimension is the same as that of the group
manifold. The group multiplication induces on the tangent space a non-associative
multiplication, the Lie bracket, thereby turning it into an algebra. The Lie algebra
completely encapsulates the local properties of the underlying Lie group, and if the
group is simply connected its Lie algebra defines it globally. Because they are linear
spaces, Lie algebras are usually easier to study than their group counterpart, although
they provide most of the information.
In the specific case of matrix Lie groups, the corresponding Lie algebra can be
defined as the following set of matrices1

$$ \mathfrak{g} \equiv \left\{ X \;\middle|\; e^{itX} \in G \ \text{for all real } t \right\} . \tag{4.4} $$

The matrix exponential eX is defined from the Taylor series of the exponential by

$$ e^{X} \equiv \sum_{n=0}^{\infty} \frac{X^n}{n!} , \tag{4.5} $$

1 The prefactor i inside the exponential is common in the quantum physics literature, but seldom used in
mathematics. Its main benefit is to make X a Hermitean matrix when the group elements are unitary.
4. N ON -A BELIAN GAUGE SYMMETRY 145

which converges for all finite-size matrices X since this series has an infinite radius of
convergence. A crucial property of the matrix exponential is that

$$ e^{X+Y} \neq e^{X}\, e^{Y} \quad \text{if } [X, Y] \neq 0 . \tag{4.6} $$

Instead, one may use Trotter's formula (also known as the Lie product formula)2 :

$$ e^{X+Y} = \lim_{n\to\infty} \left( e^{X/n}\, e^{Y/n} \right)^{n} . \tag{4.7} $$

(See the figure 4.2 for a geometrical illustration of this formula.)
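Trotter's formula (4.7) can be explored numerically. The following is a minimal sketch in pure Python, not part of the original text: the helper functions and the choice X = iσ1, Y = iσ2 (with σ1, σ2 the Pauli matrices) are illustrative assumptions. It compares the naive product e^X e^Y and the Trotter product with the exact exponential of X + Y.

```python
# Illustrative check of Trotter's formula (4.7) with 2x2 matrices.
# All helper names below are hypothetical, chosen for this sketch only.

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(a, terms=40):
    """Matrix exponential from the Taylor series (4.5), truncated."""
    n = len(a)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def scale(c, a):
    return [[c * x for x in row] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def max_diff(a, b):
    """Largest entry-wise distance between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def trotter(x, y, n):
    """(e^{X/n} e^{Y/n})^n, computed by repeated multiplication."""
    step = matmul(expm(scale(1.0 / n, x)), expm(scale(1.0 / n, y)))
    out = [[complex(i == j) for j in range(len(x))] for i in range(len(x))]
    for _ in range(n):
        out = matmul(out, step)
    return out

# Two non-commuting matrices (assumed example): X = i*sigma_1, Y = i*sigma_2.
X = [[0, 1j], [1j, 0]]
Y = [[0, 1], [-1, 0]]

exact = expm(add(X, Y))
naive_error = max_diff(matmul(expm(X), expm(Y)), exact)
trotter_errors = {n: max_diff(trotter(X, Y, n), exact) for n in (10, 100, 1000)}
```

In this example the naive product misses e^{X+Y} by an amount of order one, while the Trotter error decreases roughly like 1/n.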


Note that for X to be in the algebra, it is sufficient that exp(itX) ∈ G for t in a
neighbourhood of t = 0. Then, since the group G is closed under multiplication, this
property extends to all t's on the real axis. In fact, the mapping t → exp(itX) is a
group homomorphism (from the additive group ℝ to G) that spans a one-dimensional
subgroup of G. From the definition (4.4) of the Lie algebra, and using Trotter's
formula, one can check that any real linear combination of elements of g is in g, i.e.
that g is a real vector space. Therefore, every element of g can be written as a linear
combination of some basis elements t^a ,

$$ X = X_a\, t^a \qquad (X_a \in \mathbb{R}) , \tag{4.8} $$

with an implicit sum on the index a. The t^a 's are called the generators of the algebra.

Thanks to the exponential mapping (4.4), the properties of the Lie groups listed
in eqs. (4.3) translate into specific properties of the matrices X in the corresponding
algebras:

Special linear groups    : sl(n, ℂ), sl(n, ℝ)    tr (X) = 0
Special orthogonal group : so(n)                 XT = −X
Unitary group            : u(n)                  X† = X
Special unitary group    : su(n)                 X† = X , tr (X) = 0
                                                                     (4.9)

Note that the conditions imposed on Ω in eqs. (4.3) are non-linear, in contrast with
the linear conditions obeyed by the matrices X in eqs. (4.9). This is why a Lie group
is a curved manifold, while a Lie algebra is a linear space.
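The correspondence between the linear conditions (4.9) and the non-linear conditions (4.3) can be illustrated on a small example. The sketch below is an assumption-laden illustration (the matrix X and the helper names are made up for this purpose): it exponentiates a Hermitean traceless 2 × 2 matrix and checks that the result is unitary with unit determinant, i.e. an element of SU(2).

```python
# Illustrative check: X Hermitean and traceless  =>  exp(iX) is in SU(2).

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(a, terms=40):
    """Truncated Taylor series of the matrix exponential, as in (4.5)."""
    n = len(a)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def dagger(a):
    """Hermitean conjugate of a square matrix with complex entries."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

# Hypothetical element of su(2): Hermitean, with vanishing trace.
X = [[0.3, 0.5 - 0.2j], [0.5 + 0.2j, -0.3]]
omega = expm([[1j * x for x in row] for row in X])

product = matmul(dagger(omega), omega)
unitarity_error = max(abs(product[i][j] - (i == j))
                      for i in range(2) for j in range(2))
determinant = omega[0][0] * omega[1][1] - omega[0][1] * omega[1][0]
```

The unit determinant reflects the identity det exp(iX) = exp(i tr X), which is why the trace condition on X is linear while the determinant condition on Ω is not.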
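2 A sketch of the proof is the following:

$$ e^{X/n}\, e^{Y/n} = 1 + \frac{X}{n} + \frac{Y}{n} + O(n^{-2}) = \exp\left( \frac{X+Y}{n} + O(n^{-2}) \right) , $$

$$ \left( e^{X/n}\, e^{Y/n} \right)^{n} = \exp\left( X + Y + O(n^{-1}) \right) . $$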

Figure 4.1: Lie group and Lie algebra. (Labels in the figure: the Lie group manifold G, the curve
e^{itX} passing through the identity 1, the tangent vector iX, and the Lie algebra g.)

4.1.3 Geometrical interpretation


First note that we have

$$ iX = \left. \frac{d}{dt}\, e^{itX} \right|_{t=0} . \tag{4.10} $$

The group elements exp(itX) form a smooth curve on the group manifold (t = 0
corresponds to the identity), and iX may be viewed as the vector tangent to this curve
at the identity, as illustrated in the figure 4.1. The non-commutativity of the group is
related to the curvature of the corresponding manifold3 . Because of this curvature, a
displacement eiX followed by a displacement eiY does not lead to the same point as
the two displacements performed in reverse order. This geometrical representation
also provides an interpretation of Trotter’s formula for the exponential of a sum, as
shown in the figure 4.2.
The dimension of the Lie algebra equals the number of independent directions on
the group manifold. From the conditions listed in (4.9) on the matrices X ∈ g, it is
easy to determine the dimension of these algebras (viewed as algebras over the field
ℝ). The dimensions are listed in the table 4.1 for some common cases.
Moreover, as we shall see in the section 4.1.5, the group multiplication can be
inferred from that on the Lie algebra, via the Baker-Campbell-Hausdorff formula.
Despite these correspondences, the Lie algebra may not reflect the global properties of
the group (e.g. whether it is connected), and distinct Lie groups may have isomorphic
Lie algebras. This is for instance the case of U(1) and SO(2), SO(3) and SU(2), or
SU(2) × SU(2) and SO(4).

3 This assertion could be made more precise as follows: it is possible to define a metric tensor on the

group manifold, and the corresponding Ricci curvature tensor. This curvature may then be expressed in
terms of the constants that define the commutators between the generators of the algebra (see eq. (4.14)).

Table 4.1: Dimensions of a few common Lie algebras.

Lie algebra     Dimension
sl(n, ℝ)        n² − 1
so(n)           n(n − 1)/2
u(n)            n²
su(n)           n² − 1
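The entries of table 4.1 follow from counting the independent real parameters left by the conditions (4.9). The short sketch below spells out this counting; the function name and its string labels are hypothetical and serve only this illustration.

```python
# Counting the independent real parameters of the matrices X in (4.9).

def real_parameters(algebra, n):
    """Dimension (over the reals) of a few classical Lie algebras."""
    diagonal = n                      # number of diagonal entries
    pairs = n * (n - 1) // 2          # independent entries above the diagonal
    if algebra == "u":                # Hermitean: real diagonal, complex pairs
        return diagonal + 2 * pairs
    if algebra == "su":               # Hermitean and traceless: one less
        return diagonal + 2 * pairs - 1
    if algebra == "so":               # antisymmetric: one number per pair
        return pairs
    if algebra == "sl":               # sl(n, R): n*n numbers, zero trace
        return n * n - 1
    raise ValueError("unknown algebra label: " + algebra)

dimensions = {
    ("su", 2): real_parameters("su", 2),
    ("su", 3): real_parameters("su", 3),
    ("so", 3): real_parameters("so", 3),
    ("so", 4): real_parameters("so", 4),
    ("u", 1): real_parameters("u", 1),
}
```

For instance, so(4) and su(2) ⊕ su(2) both have dimension 6, which is at least consistent with the isomorphism of their Lie algebras.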

Figure 4.2: Geometrical interpretation of Trotter's formula: the broken path, made of a
succession of elementary steps e^{itX/n} and e^{itY/n}, approximates better and better the
curve e^{it(X+Y)} on the group manifold as n → ∞.

4.1.4 Lie bracket and structure constants


Consider an element Ω of the Lie group and an element X of the Lie algebra. For any
real number t, we have

$$ \exp\left( i t\, \Omega^{-1} X \Omega \right) = \underbrace{\Omega^{-1}\, \underbrace{e^{itX}}_{\in G}\, \Omega}_{\in G} , \tag{4.11} $$

where the equality follows from the Taylor series of the exponential. From the
definition of the Lie algebra, this implies that Ω−1 X Ω ∈ g. Therefore, if X, Y ∈ g
we also have

$$ e^{-itX}\, Y\, e^{itX} \in \mathfrak{g} , \tag{4.12} $$

and the derivative with respect to t at t = 0 is also an element of the algebra,

$$ -i\, [X, Y] \in \mathfrak{g} . \tag{4.13} $$

In other words, −i times the commutator of two elements of a Lie algebra is another
element of the algebra. Thus −i[·, ·] is the multiplication law4 in g (it is also called
the Lie bracket). Therefore, the commutators between its generators can be written as

$$ \left[ t^a, t^b \right] = i\, f^{abc}\, t^c , \tag{4.14} $$

4 In contrast, the ordinary product of two elements of the algebra is in general not in the algebra.

where the f^{abc} are real numbers called the structure constants. The antisymmetry of
the commutator implies that f^{abc} = −f^{bac} . Given three elements X, Y, Z ∈ g of the
algebra, their commutator satisfies the Jacobi identity

$$ \left[ X, [Y, Z] \right] + \left[ Y, [Z, X] \right] + \left[ Z, [X, Y] \right] = 0 , \tag{4.15} $$

which implies the following relationship among the structure constants:

$$ f^{ade} f^{bcd} + f^{bde} f^{cad} + f^{cde} f^{abd} = 0 . \tag{4.16} $$
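As a concrete illustration of eqs. (4.14) and (4.16), one may extract the structure constants of su(2) from explicit matrices. This is only a sketch under stated assumptions: the generators are taken to be t^a = σ^a/2 (Pauli matrices), normalized so that tr(t^a t^b) = δ^{ab}/2, and the function names are hypothetical.

```python
# Structure constants of su(2) from [t^a, t^b] = i f^{abc} t^c.
from itertools import product

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(ab, ba)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

# Assumed generators: t^a = sigma^a / 2, so that tr(t^a t^b) = delta^{ab} / 2.
T = [
    [[0, 0.5], [0.5, 0]],
    [[0, -0.5j], [0.5j, 0]],
    [[0.5, 0], [0, -0.5]],
]

def structure_constants(t):
    """f^{abc} = -2i tr([t^a, t^b] t^c), valid for tr(t^a t^b) = delta/2."""
    n = len(t)
    f = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for a, b, c in product(range(n), repeat=3):
        value = -2j * trace(matmul(commutator(t[a], t[b]), t[c]))
        f[a][b][c] = value.real
    return f

f = structure_constants(T)
n = len(T)

# Antisymmetry in the first two indices.
antisymmetric = all(abs(f[a][b][c] + f[b][a][c]) < 1e-12
                    for a, b, c in product(range(n), repeat=3))

# Largest violation of the Jacobi identity (4.16).
jacobi_residual = max(
    abs(sum(f[a][d][e] * f[b][c][d] + f[b][d][e] * f[c][a][d]
            + f[c][d][e] * f[a][b][d] for d in range(n)))
    for a, b, c, e in product(range(n), repeat=4))
```

For su(2) this reproduces the Levi-Civita symbol, f^{abc} = ε^{abc}.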

4.1.5 Baker-Campbell-Hausdorff formula


Given an element X ∈ g, we may define a function from g to g as follows:

$$ \mathrm{ad}_X(Y) \equiv -i\, [X, Y] . \tag{4.17} $$

The function adX is called the adjoint mapping at the point X. The exponential of the
adjoint mapping plays an important role, thanks to the following formula5

$$ e^{\mathrm{ad}_X}\, Y = e^{-iX}\, Y\, e^{iX} . \tag{4.18} $$

This allows one to write the derivative of the exponential of a (matrix-valued) function as
follows:

$$ \frac{d}{dt}\, e^{iX(t)} = i\, e^{iX(t)}\; \frac{e^{\mathrm{ad}_{X(t)}} - 1}{\mathrm{ad}_{X(t)}}\; \frac{dX(t)}{dt} . \tag{4.19} $$

(This is known as Duhamel's formula6 .) The non-trivial aspect of this formula is that
it is true even when X(t) does not commute with its derivative. Then, given X, Y ∈ g,
let us define a matrix Z(t) by

$$ e^{iZ(t)} \equiv e^{iX}\, e^{itY} . \tag{4.20} $$

5 This may be proven by considering a one-parameter family of such equalities:

$$ e^{t\, \mathrm{ad}_X}\, Y = e^{-itX}\, Y\, e^{itX} , $$

and by noting that the left and right hand sides coincide at t = 0, and obey identical differential equations
with respect to the parameter t.

6 Note that this formula is equivalent to:

$$ \frac{d}{dt}\, e^{iX(t)} = i \int_0^1 ds\; e^{i(1-s)X(t)}\; \frac{dX(t)}{dt}\; e^{isX(t)} . $$

This latter form can be proven by writing

$$ e^{iX(t+\varepsilon)} = e^{i \left( X(t) + \varepsilon X'(t) + O(\varepsilon^2) \right)} = \lim_{n\to\infty} \left( e^{\frac{i}{n} X(t)}\; e^{\frac{i \varepsilon}{n} X'(t) + O(\varepsilon^2)} \right)^{n} , $$

(we use Trotter's formula to obtain the second equality) and by expanding the right hand side to first order
in ε.

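Differentiating both sides with respect to t (using eq. (4.19) for the left hand side),
we obtain

$$ \frac{dZ(t)}{dt} = \left[ \frac{e^{\mathrm{ad}_{Z(t)}} - 1}{\mathrm{ad}_{Z(t)}} \right]^{-1} Y . \tag{4.21} $$

From eq. (4.18), we can also see that

$$ e^{\mathrm{ad}_{Z(t)}} = e^{t\, \mathrm{ad}_Y}\, e^{\mathrm{ad}_X} . \tag{4.22} $$

Integrating eq. (4.21) from t = 0 to t = 1, we obtain the following identity:

$$ \ln\left( e^{iX}\, e^{iY} \right) = iX + i \int_0^1 dt\; F\!\left( e^{t\, \mathrm{ad}_Y}\, e^{\mathrm{ad}_X} \right) Y , \tag{4.23} $$

where the function F(·) is defined by

$$ F(z) \equiv \frac{\ln(z)}{z - 1} . \tag{4.24} $$

Eq. (4.23) is the integral form of the Baker-Campbell-Hausdorff formula. In order to
recover the more familiar expansion in nested commutators, note that

$$ \begin{aligned} e^{t\, \mathrm{ad}_Y}\, e^{\mathrm{ad}_X} &= 1 + t\, \mathrm{ad}_Y + \mathrm{ad}_X + \tfrac{1}{2} \left( t^2\, \mathrm{ad}_Y^2 + \mathrm{ad}_X^2 \right) + t\, \mathrm{ad}_Y\, \mathrm{ad}_X + \cdots \\ F(z) &= 1 - \tfrac{1}{2} (z - 1) + \tfrac{1}{3} (z - 1)^2 + \cdots . \end{aligned} \tag{4.25} $$

This leads to

$$ \ln\left( e^{iX}\, e^{iY} \right) = i (X + Y) - \frac{1}{2} [X, Y] - \frac{i}{12} \Big( \left[ X, [X, Y] \right] - \left[ Y, [X, Y] \right] \Big) + \cdots \tag{4.26} $$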
(Explicit expressions for all the coefficients of this series are given by Dynkin’s
formula.) In applications to quantum field theory, we usually need only the first
two terms of this expansion because the commutators we encounter are commuting
numbers and all the subsequent terms are zero. Besides being an intermediate step
in the derivation of eq. (4.26), the integral form (4.23) shows that the group product
can be reconstructed from Lie algebra manipulations (since the right hand side of this
equation contains only objects that belong to the algebra).
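The truncated series (4.26) can be tested numerically by exponentiating its right hand side and comparing with the product e^{iX} e^{iY}. The following sketch is illustrative only: the helper names, the small parameter EPS and the choice X = EPS σ1, Y = EPS σ2 are assumptions made for this example. Keeping more terms of the series should reduce the mismatch.

```python
# Illustrative check of the Baker-Campbell-Hausdorff expansion (4.26).

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(a, terms=40):
    """Truncated Taylor series of the matrix exponential, as in (4.5)."""
    n = len(a)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def scale(c, a):
    return [[c * x for x in row] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def comm(a, b):
    return add(matmul(a, b), scale(-1, matmul(b, a)))

def max_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def bch_exponent(x, y, order):
    """Right hand side of (4.26), truncated at the given order in X and Y."""
    z = scale(1j, add(x, y))
    if order >= 2:
        z = add(z, scale(-0.5, comm(x, y)))
    if order >= 3:
        nested = add(comm(x, comm(x, y)), scale(-1, comm(y, comm(x, y))))
        z = add(z, scale(-1j / 12, nested))
    return z

EPS = 0.05                                   # assumed small parameter
X = [[0, EPS], [EPS, 0]]                     # EPS * sigma_1
Y = [[0, -1j * EPS], [1j * EPS, 0]]          # EPS * sigma_2

exact = matmul(expm(scale(1j, X)), expm(scale(1j, Y)))
errors = [max_diff(expm(bch_exponent(X, Y, order)), exact)
          for order in (1, 2, 3)]
```

The list `errors` holds the mismatch when the series is cut after the first, second and third order terms, respectively.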

4.1.6 Representations
A real representation of a Lie group G is a group homomorphism from elements of
G to elements of GL(n, ℝ), i.e. a mapping π from G to GL(n, ℝ) that preserves the
group structure:

$$ \pi(1) = 1 , \qquad \pi(\Omega_2 \Omega_1) = \pi(\Omega_2)\, \pi(\Omega_1) . \tag{4.27} $$

A representation is said to be faithful if it is a one-to-one mapping.
Likewise, one may define representations of a Lie algebra as homomorphisms
from g to gl(n, ℝ), i.e. mappings π that preserve the Lie algebra structure:

$$ \pi(\alpha X + \beta Y) = \alpha\, \pi(X) + \beta\, \pi(Y) , \qquad \pi([X, Y]) = \left[ \pi(X), \pi(Y) \right] . \tag{4.28} $$

Note that if we define t^a_π ≡ π(t^a) the images of the generators, then they obey
[t^a_π , t^b_π ] = i f^{abc} t^c_π with the same structure constants as in the original Lie algebra.
Since we are focusing on matrix Lie groups, their elements are already matrices,
and one may wonder what representations are good for. In fact, it is often important
to know how a given group (e.g. the rotation group SO(3)) acts on a more general
linear space. In the example of SO(3), even though the “defining” action is on ℝ³ in
terms of 3 × 3 matrices, the group has many other matrix representations made of
objects that act on spaces other than ℝ³ .

Singlet representation : The singlet representation, or trivial representation, is the


representation for which the mapping is π(Ω) = 1 for all Ω’s. The objects that
belong to this representation space are invariant under the transformations of the
group G. In quantum field theory, one says that these objects are “neutral” (under the
group G).

Fundamental representation : The fundamental representation, or standard
representation, is the smallest faithful representation. It is also the representation obtained
when π is the identity map. In other words, in the fundamental representation, the
elements of a matrix Lie group are simply represented by themselves. In the case of
compact simple Lie algebras (that give consistent non-Abelian gauge theories, as we
shall see in the next section), it is possible to choose the generators t^a_f (the subscript f
denotes the fundamental representation) in such a way that tr (t^a_f t^b_f ) is proportional
to the identity δ^{ab} . A customary choice of their normalization is to impose:

$$ \mathrm{tr}\left( t^a_f\, t^b_f \right) = \frac{\delta^{ab}}{2} . \tag{4.29} $$
This sets the normalization of the structure constants, through eq. (4.14). Then, one
usually normalizes the generators of other representations in such a way that they
fulfill the commutation relation (4.14) with the same structure constants (but the
trace formula (4.29) with a prefactor 1/2 is in general only valid in the fundamental
representation).

Adjoint representation : The adjoint representation of a Lie group G is a
representation as linear operators that act on the Lie algebra g, defined by the following
mapping:

$$ \Omega \in G \;\to\; \mathrm{Ad}_\Omega \in GL(\mathfrak{g}) \quad \text{such that} \quad \mathrm{Ad}_\Omega(X) = \Omega^{-1} X \Omega . \tag{4.30} $$

If the dimension of the Lie algebra is d, then AdΩ may be viewed as a d × d matrix.
We may also define the adjoint representation of the algebra g, as follows:

$$ X \in \mathfrak{g} \;\to\; \mathrm{ad}_X \in GL(\mathfrak{g}) \quad \text{such that} \quad \mathrm{ad}_X(Y) = -i\, [X, Y] . \tag{4.31} $$

It is sufficient to know the adjoint representation of the generators t^a , for which one
often uses the following notation

$$ i\, \mathrm{ad}_{t^a} \equiv T^a_{\rm adj} . \tag{4.32} $$

Note that T^a_adj can be represented by a d × d matrix. Using Jacobi's identity, one may
check that [ad_{t^a} , ad_{t^b} ] = −ad_{i[t^a ,t^b ]} = f^{abc} ad_{t^c} . Therefore, the T^a_adj 's fulfill the
same commutation relations as the t^a 's themselves:

$$ \left[ T^a_{\rm adj}, T^b_{\rm adj} \right] = i\, f^{abc}\, T^c_{\rm adj} . \tag{4.33} $$

Using eq. (4.14), we find that the components of these matrices are given by

$$ \left( T^a_{\rm adj} \right)_{bc} = -i\, f^{cab} . \tag{4.34} $$

In other words, the adjoint representation is a representation by matrices whose size


is the dimension of the algebra, and in which the components of the generators are
the structure constants. That eqs. (4.33) and (4.34) are consistent is a consequence of
the Jacobi identity (4.16) satisfied by the structure constants.
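The consistency between eqs. (4.33) and (4.34) can be checked explicitly in a simple case. The sketch below assumes the algebra su(2), whose structure constants are the Levi-Civita symbol; the function names are hypothetical. It builds the 3 × 3 adjoint generators from the structure constants and measures how well they satisfy the commutation relations.

```python
# Adjoint generators of su(2) built from the structure constants, eq. (4.34).
from itertools import product

def levi_civita(a, b, c):
    """epsilon^{abc} for indices in {0, 1, 2} (assumed f^{abc} of su(2))."""
    return (a - b) * (b - c) * (c - a) / 2

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(ab, ba)]

n = 3
f = [[[levi_civita(a, b, c) for c in range(n)] for b in range(n)]
     for a in range(n)]

# (T^a_adj)_{bc} = -i f^{cab}
T_adj = [[[-1j * f[c][a][b] for c in range(n)] for b in range(n)]
         for a in range(n)]

# Largest violation of [T^a, T^b] = i f^{abc} T^c, eq. (4.33).
residual = 0.0
for a, b in product(range(n), repeat=2):
    lhs = commutator(T_adj[a], T_adj[b])
    for r, s in product(range(n), repeat=2):
        rhs = sum(1j * f[a][b][c] * T_adj[c][r][s] for c in range(n))
        residual = max(residual, abs(lhs[r][s] - rhs))

# The adjoint generators are Hermitean when f is real and antisymmetric.
hermitean = all(abs(T_adj[a][r][s] - T_adj[a][s][r].conjugate()) < 1e-12
                for a, r, s in product(range(n), repeat=3))
```

The same construction applies to any set of structure constants that obeys the Jacobi identity (4.16), since that identity is what makes the residual vanish.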

A common use of the adjoint representation is to rearrange expressions such as

$$ e^{-iX}\, Y\, e^{iX} = e^{\mathrm{ad}_X}\, Y , \tag{4.35} $$

where X and Y are in some representation r of the Lie algebra. Using X = X_a t^a_r and
Y = Y_a t^a_r , and ad_{t^a} = −i T^a_adj from eq. (4.32), we can rewrite this as follows

$$ e^{\mathrm{ad}_X}\, Y = e^{X_a\, \mathrm{ad}_{t^a}}\; Y_b\, t^b_r = t^c_r \left[ e^{-i X_a T^a_{\rm adj}} \right]_{cb} Y_b . \tag{4.36} $$

Thus, we have

$$ \left[ e^{-iX}\, Y\, e^{iX} \right]_c = \left[ e^{-i X_{\rm adj}} \right]_{cb} Y_b , \tag{4.37} $$

where the left hand side may be in any representation r. In other words, the right
and left multiplication by a group element and its inverse can be rewritten as a left
multiplication by a matrix of the adjoint representation.

4.2 Yang-Mills Lagrangian

4.2.1 Covariant derivative and gauge transformations

When trying to extend the concept of gauge theory to a non-Abelian symmetry,
it is useful to first construct a covariant derivative. This object, denoted Dµ , is a
deformation of the ordinary derivative that transforms as follows:

$$ D_\mu \;\to\; \Omega^{-1}(x)\, D_\mu\, \Omega(x) , \tag{4.38} $$

where Ω(x) is a spacetime dependent element of a Lie group G. Let us look for a
covariant derivative of the form

$$ D_\mu \equiv \partial_\mu - i g\, A_\mu(x) , \tag{4.39} $$

where g is a coupling constant similar to the constant e in QED and Aµ (x) is a 4-vector
(in quantum field theory, this field is called a gauge field). The transformation law
(4.38) is satisfied provided that Aµ (x) transforms in a very specific way. Note first
that the ordinary derivative ∂µ is invariant (i.e. it belongs to the singlet representation).
If we denote by A^Ω_µ (x) the transformed Aµ (x), then we must have:

$$ \begin{aligned} \partial_\mu - i g\, A^\Omega_\mu(x) &= \Omega^{-1}(x) \left( \partial_\mu - i g\, A_\mu(x) \right) \Omega(x) \\ &= \partial_\mu + \Omega^{-1}(x) \left( \partial_\mu \Omega(x) \right) - i g\, \Omega^{-1}(x)\, A_\mu(x)\, \Omega(x) , \end{aligned} \tag{4.40} $$

from which we obtain the transformation law7 of Aµ (x):

$$ A_\mu(x) \;\to\; A^\Omega_\mu(x) \equiv \Omega^{-1}(x)\, A_\mu(x)\, \Omega(x) + \frac{i}{g}\, \Omega^{-1}(x) \left( \partial_\mu \Omega(x) \right) . \tag{4.41} $$

From eqs. (4.17), (4.18) and (4.19), we see that if Ω is an element of a Lie group
G, then i Ω−1 ∂µ Ω belongs to the Lie algebra g. Thus, if the second term in the right
hand side of eq. (4.41) belongs to the representation r of the Lie algebra, the first term
should also be in this representation for consistency. The same applies to Aµ , that we
can decompose as follows:

$$ A_\mu(x) \equiv A^a_\mu(x)\, t^a_r , \tag{4.42} $$

where the t^a_r are the generators of the algebra in the representation r.
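The statement that the inhomogeneous term of eq. (4.41) lies in the Lie algebra can be probed numerically. In the sketch below, everything is an illustrative assumption: the SU(2)-valued function omega(x) of a single coordinate, the coupling g = 1, and the use of a central finite difference in place of the derivative. The resulting matrix (i/g) Ω^{-1} ∂Ω should be Hermitean and traceless, as required for an element of su(2) in the convention of eq. (4.4).

```python
# Illustrative check that (i/g) Omega^{-1} d(Omega)/dx is Hermitean, traceless.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(a, terms=40):
    """Truncated Taylor series of the matrix exponential, as in (4.5)."""
    n = len(a)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def dagger(a):
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def omega(x):
    """Hypothetical SU(2) element depending on a single coordinate x."""
    generator = [[0.3 * x, x * x - 0.5j * x],
                 [x * x + 0.5j * x, -0.3 * x]]   # Hermitean and traceless
    return expm([[1j * entry for entry in row] for row in generator])

def pure_gauge(x, g=1.0, h=1e-5):
    """(i/g) Omega^{-1} d(Omega)/dx, using a central finite difference."""
    plus, minus = omega(x + h), omega(x - h)
    d_omega = [[(p - m) / (2 * h) for p, m in zip(rp, rm)]
               for rp, rm in zip(plus, minus)]
    inverse = dagger(omega(x))       # Omega is unitary: inverse = dagger
    return [[1j / g * v for v in row] for row in matmul(inverse, d_omega)]

A = pure_gauge(0.7)
hermiticity_error = max(abs(A[i][j] - A[j][i].conjugate())
                        for i in range(2) for j in range(2))
trace_size = abs(A[0][0] + A[1][1])
```

Both quantities vanish up to the discretization error of the finite difference, which is of order h².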

7 From this transformation law, we see that field configurations of the form i g^{-1} Ω^{-1} ∂µ Ω may be
transformed into the null field Aµ ≡ 0. Such configurations are called pure gauge fields.

Infinitesimal transformations : Eq. (4.41) specifies how the field Aµ changes
under any transformation of G. However, it is sometimes useful to consider infinitesimal
transformations, i.e. Ω close to 1. This is done by writing Ω = exp(ig θa t^a_r ), with
|θa | ≪ 1, and by expanding eq. (4.41) to order one in θa . The variation of Aµ is
given by:

$$ \delta A_\mu = -\partial_\mu \theta_r(x) + i g \left[ A_\mu(x), \theta_r(x) \right] = -\left[ D_\mu, \theta_r(x) \right] , \tag{4.43} $$

where we have defined θr ≡ θa t^a_r . This can also be written more explicitly as

$$ \delta A^a_\mu = -\partial_\mu \theta^a(x) + g\, f^{abc}\, \theta^b(x)\, A^c_\mu(x) = -\left( D^{\rm adj}_\mu \right)_{ab} \theta^b(x) , \tag{4.44} $$

where D^adj_µ is the covariant derivative in the adjoint representation.
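The step from the matrix form (4.43) to the component form (4.44) can be verified on an example. The sketch below is restricted to the commutator term, and assumes su(2) in the fundamental representation (t^a = σ^a/2, f^{abc} = ε^{abc}); the numerical values of g, A and θ are arbitrary placeholders.

```python
# Illustrative check: i g [A, theta] has components g f^{abc} theta^b A^c.
from itertools import product

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def levi_civita(a, b, c):
    """epsilon^{abc} for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) / 2

# Assumed generators of su(2): t^a = sigma^a / 2.
T = [
    [[0, 0.5], [0.5, 0]],
    [[0, -0.5j], [0.5j, 0]],
    [[0.5, 0], [0, -0.5]],
]

def lie_element(components):
    """Matrix X = X_a t^a built from its components."""
    return [[sum(components[a] * T[a][i][j] for a in range(3))
             for j in range(2)] for i in range(2)]

def components(matrix):
    """X_a = 2 tr(X t^a), using tr(t^a t^b) = delta^{ab} / 2."""
    return [(2 * trace(matmul(matrix, T[a]))).real for a in range(3)]

g = 1.3                       # placeholder coupling
A = [0.2, -0.4, 0.7]          # placeholder components A^a of the gauge field
theta = [0.1, 0.3, -0.5]      # placeholder gauge parameters theta^a

A_mat, theta_mat = lie_element(A), lie_element(theta)
ab, ba = matmul(A_mat, theta_mat), matmul(theta_mat, A_mat)
variation = [[1j * g * (x - y) for x, y in zip(ra, rb)]
             for ra, rb in zip(ab, ba)]

lhs = components(variation)
rhs = [g * sum(levi_civita(a, b, c) * theta[b] * A[c]
               for b, c in product(range(3), repeat=2)) for a in range(3)]
mismatch = max(abs(x - y) for x, y in zip(lhs, rhs))
```

For su(2), the component form is simply g times the cross product of the vectors θ and A.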

4.2.2 Non-Abelian field strength

In the previous section, we have introduced a vector field Aµ in order to define
a covariant derivative. To interpret Aµ as describing a spin-1 particle, we should
construct a kinetic term for this field, with the constraint that it is invariant under the
transformations (4.41). In the case of quantum electrodynamics, this Lagrangian was
−(1/4) Fµν F^µν , where the field strength was defined as Fµν ≡ ∂µ Aν − ∂ν Aµ . However,
a direct verification indicates that this expression of the field strength cannot lead to
an invariant Lagrangian in the case of a theory with non-Abelian symmetry.
In order to mimic QED, we aim at constructing a Lagrangian with second order
derivatives. Indeed, since the field Aµ (x) has the dimension of a mass, two derivatives
and two powers of the field would provide the required dimension 4 for a Lagrangian
in four space-time dimensions. A useful intermediate step is the construction of a
field that depends only on Aµ (x) and has a simple transformation law. From the
transformation law of the covariant derivative, we find that the commutator [Dµ , Dν ]
transforms as

$$ \left[ D_\mu, D_\nu \right] \;\to\; \Omega^{-1}(x) \left[ D_\mu, D_\nu \right] \Omega(x) . \tag{4.45} $$

More explicitly, this commutator reads

$$ \left[ D_\mu, D_\nu \right] = -i g\, \underbrace{\left( \partial_\mu A_\nu - \partial_\nu A_\mu - i g \left[ A_\mu, A_\nu \right] \right)}_{F_{\mu\nu}} . \tag{4.46} $$

This generalizes the field strength Fµν to an arbitrary gauge group G. Note the
commutator between gauge fields, that did not exist in QED. By construction, the
field strength is an element of the algebra, in the same representation as Aµ ,

$$ F_{\mu\nu}(x) \equiv F^a_{\mu\nu}(x)\, t^a_r , \tag{4.47} $$

and its transformation law8 is

$$ F_{\mu\nu}(x) \;\to\; \Omega^{-1}(x)\, F_{\mu\nu}(x)\, \Omega(x) . \tag{4.48} $$

As in QED, one may define non-Abelian electrical and magnetic fields by

$$ E^i_a = F^{0i}_a , \qquad B^i_a = \frac{1}{2}\, \epsilon^{ijk}\, F^{jk}_a , \tag{4.49} $$

but with an important difference: these E and B fields are not gauge invariant (they
belong to the representation r of the Lie algebra). Instead, they transform covariantly
as follows:

$$ E^i(x) \;\to\; \Omega^{-1}(x)\, E^i(x)\, \Omega(x) , \qquad B^i(x) \;\to\; \Omega^{-1}(x)\, B^i(x)\, \Omega(x) . \tag{4.50} $$

4.2.3 Lagrangian
In order to build a kinetic term for Aµ from Fµν , we must contract all the Lorentz
indices to have a Lorentz invariant Lagrangian. This forces us to have at least two
F's, since g^µν Fµν = 0. Therefore, if we restrict to objects of mass dimension 4,
this kinetic term should be quadratic in F, with a dimensionless prefactor. The most
general9 term of this kind is

$$ \mathcal{L}_A \equiv -h_{ab}\, F^{a\, \mu\nu}\, F^b_{\mu\nu} , \tag{4.51} $$

where hab is a constant real symmetric matrix in the group indices. In addition, for
this Lagrangian to define a consistent field theory, the matrix hab should be positive
definite (otherwise some parts of the kinetic term would have the wrong sign and
the energy of the system would not be bounded from below). Under an infinitesimal
gauge transformation, the variation of this Lagrangian is

$$ \delta \mathcal{L}_A = -2\, h_{ab}\, F^{a\, \mu\nu}\, \theta_c\, f^{dcb}\, F^d_{\mu\nu} , \tag{4.52} $$

and for the kinetic term to be gauge invariant we must have

$$ h_{ab}\, f^{dcb}\, \underbrace{F^{a\, \mu\nu}\, F^d_{\mu\nu}}_{\text{sym. in } a,d} = 0 . \tag{4.53} $$

This condition is satisfied for any gauge field configuration provided that

$$ f^{dcb}\, h_{ba} + f^{acb}\, h_{bd} = 0 . \tag{4.54} $$

Note that hab ≡ tr (t^a t^b ) is a solution of this constraint, since tr (Fµν F^µν ) is
obviously gauge invariant given the transformation law (4.48) for the field strength,
but the positivity condition imposes some restrictions on the kind of Lie algebra we
may use.

8 The field strength associated to a pure gauge field is zero, since there exists a transformation Ω for
which Aµ becomes the null field.

9 We ignore for now the operator ε_{µνρσ} hab F^{a µν} F^{b ρσ} . This term will be discussed in the section 4.5.

4.2.4 Constraints on the Lie algebra


Eq. (4.54), combined with the fact that hab is a real symmetric positive-definite
matrix, strongly constrains the Lie algebras that lead to consistent (gauge invariant,
with a positive definite kinetic energy) non-Abelian gauge theories.

Complete antisymmetry of the structure constants : Let us start from a
diagonalization of the matrix hab ,

$$ h_{ab} \equiv O^{t}_{ac}\, \lambda_c\, O_{cb} = O_{ca}\, O_{cb}\, \lambda_c , \tag{4.55} $$

where Oac is a real orthogonal matrix. Since the matrix hab is positive definite, all
the eigenvalues λc are positive, and we can define a square root of the matrix by

$$ \Omega_{ab} \equiv O_{ca}\, O_{cb}\, \lambda_c^{1/2} , \qquad h_{ab} = \Omega_{ac}\, \Omega_{cb} . \tag{4.56} $$

Note that Ωab is a real symmetric matrix. Now, let us introduce a new basis for the
algebra, defined by

$$ \bar{t}^{\,a} \equiv \Omega^{-1}_{ab}\, t^b , \qquad t^a = \Omega_{ab}\, \bar{t}^{\,b} . \tag{4.57} $$

This is a legitimate change of basis for a real algebra since the matrix Ω is real and
has no vanishing eigenvalue (all the eigenvalues λc are strictly positive since hab is
positive definite). The commutator of two of these new generators is

$$ \left[ \bar{t}^{\,a}, \bar{t}^{\,b} \right] = i\, \underbrace{\Omega^{-1}_{aa'}\, \Omega^{-1}_{bb'}\, f^{a'b'c'}\, \Omega_{c'c}}_{\bar{f}^{abc}}\; \bar{t}^{\,c} . \tag{4.58} $$

By rewriting eq. (4.54) in terms of the new structure constants $\bar{f}^{abc}$ and by using the
fact that Ω is invertible, we get

$$ \bar{f}^{dca} + \bar{f}^{acd} = 0 . \tag{4.59} $$

In other words, eq. (4.54) implies that there exists a basis in which the structure
constants are also antisymmetric under the exchange of the first and third indices.
From this, we conclude that they are in fact completely antisymmetric10 (and not just
in the first two indices, as implied by their definition in terms of the Lie bracket).

Allowed sub-algebras : Consider now the generators in the adjoint representation,
whose components are given by

$$ \left( T^a_{\rm adj} \right)_{bc} = -i\, \bar{f}^{cab} = -i\, \bar{f}^{abc} . \tag{4.60} $$

10 Assuming they are non-zero, this requires an algebra with at least 3 generators, since it is not possible
to construct an antisymmetric rank-3 tensor with indices that take less than 3 values.

Since the structure constants are real and antisymmetric, these generators are
Hermitean matrices. Thanks to this property, there exists a basis in which all the adjoint
generators have a common block diagonal structure:

$$ T^a_{\rm adj} = \begin{pmatrix} D^a_{(1)} & 0 & 0 & \cdots \\ 0 & D^a_{(2)} & 0 & \cdots \\ 0 & 0 & D^a_{(3)} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix} , \tag{4.61} $$

where the sizes of the blocks are the same for all the T^a_adj 's. This block decomposition
can be obtained recursively, until one gets blocks that are not further reducible. If d is
the dimension of the adjoint representation (i.e. also the dimension of the Lie group),
it corresponds to a decomposition of ℝ^d into orthogonal subspaces that are invariant
under the action of all the generators. Regarding the Lie algebra, this indicates that
it is a direct sum of simple sub-algebras11 , and u(1) sub-algebras (if some diagonal
blocks D^a_(n) are zero for all a's). In addition, these simple sub-algebras are compact
because Kab ≡ tr (T^a_adj T^b_adj ), restricted to the corresponding subspace, is positive
definite12 . Indeed, when the structure constants are totally antisymmetric, we have
Kab Xa Xb = Σ_{c,d} (Xa f^{acd})² ≥ 0 for any vector X. Moreover, there is no non-zero
vector X for which this quadratic form is zero, because otherwise we would have
T^c_adj X = 0 for all c, which means that this vector X would define a u(1) sub-algebra
and cannot be part of the subspace associated with a simple sub-algebra.
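The positivity of Kab can be made concrete in the simplest non-Abelian case. The sketch below assumes the algebra su(2), with f^{abc} = ε^{abc}, and uses hypothetical helper names and an arbitrary test vector X. It computes Kab from the adjoint generators and compares the quadratic form Kab Xa Xb with the sum of squares quoted above.

```python
# Illustrative computation of K_ab = tr(T^a_adj T^b_adj) for su(2).
from itertools import product

def levi_civita(a, b, c):
    """epsilon^{abc} for indices in {0, 1, 2} (assumed f^{abc} of su(2))."""
    return (a - b) * (b - c) * (c - a) / 2

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

n = 3
f = [[[levi_civita(a, b, c) for c in range(n)] for b in range(n)]
     for a in range(n)]

# Adjoint generators, (T^a_adj)_{bc} = -i f^{cab}.
T_adj = [[[-1j * f[c][a][b] for c in range(n)] for b in range(n)]
         for a in range(n)]

K = [[trace(matmul(T_adj[a], T_adj[b])).real for b in range(n)]
     for a in range(n)]

X = [0.4, -1.1, 0.6]          # arbitrary test vector
quadratic_form = sum(K[a][b] * X[a] * X[b]
                     for a, b in product(range(n), repeat=2))
sum_of_squares = sum(sum(X[a] * f[a][c][d] for a in range(n)) ** 2
                     for c, d in product(range(n), repeat=2))
```

In this example Kab comes out proportional to the identity, with a positive coefficient, as expected for a compact simple algebra.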

Standard form of the Lagrangian : Note now that the constraint (4.54) can also
be written as

$$ \left[ T^c_{\rm adj}, h \right] = 0 , \quad \text{for all } c . \tag{4.62} $$

This implies that hab has the same block decomposition as the adjoint generators (see
eq. (4.61)), with diagonal blocks that are proportional to the identity (with positive
prefactors)

$$ h = \begin{pmatrix} \alpha^2_{(1)}\, 1 & 0 & 0 & \cdots \\ 0 & \alpha^2_{(2)}\, 1 & 0 & \cdots \\ 0 & 0 & \alpha^2_{(3)}\, 1 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix} . \tag{4.63} $$

The prefactors α²(i) can be absorbed into the normalization of the gauge field and the
coupling constant of the corresponding sub-algebra, by writing

$$ \alpha_{(i)}\, F_{\mu\nu} = \partial_\mu A'_\nu - \partial_\nu A'_\mu - i g' \left[ A'_\mu, A'_\nu \right] \tag{4.64} $$

with A′µ ≡ α(i) Aµ and g′ ≡ α(i)^{-1} g. Therefore, we can always write the Lagrangian
as a sum of terms (one for each simple and u(1) sub-algebra) having the following
standard form

$$ \mathcal{L}_A = -\frac{1}{4}\, F^a_{\mu\nu}(x)\, F^{a\, \mu\nu}(x) . \tag{4.65} $$

Despite its resemblance with the photon kinetic term in QED, this Lagrangian has a
quite remarkable feature in the case of simple sub-algebras: due to the commutator
term in Fµν , LA contains terms that are cubic in Aµ and terms which are quartic
in Aµ . These terms are interactions between three and four of the spin-1 particles
described by Aµ , respectively. Thus, unlike in QED, the Lagrangian (4.65) has a
very rich structure, and defines in itself a very interesting quantum field theory, called
Yang-Mills theory.

11 A Lie algebra is not simple if there exists a set of generators T^α (the number of which is strictly
smaller than the dimension of the algebra) which is closed under commutation with the algebra, i.e.
[T^a_adj , T^α ] = g^{aαβ} T^β for all a, α. If we write these new generators as linear combinations T^α ≡ V^α_a T^a_adj ,
the closure of the sub-algebra under commutation implies that the set of vectors {V^α } is the basis of a
subspace invariant under the action of all the T^a_adj 's. Conversely, if we have an invariant subspace that
cannot be reduced to a smaller one, then it corresponds to a simple sub-algebra.

12 The coefficients Kab define the Killing form, K(X, Y) ≡ tr (adX adY ) = −Kab X^a Y^b . Since Kab is
positive definite in compact Lie algebras, it naturally defines a distance d²(X, Y) ≡ Kab (X − Y)^a (X − Y)^b .
This distance in g gives rise locally to a distance in the underlying Lie group G, which can then be extended
to the entire group into a metric invariant under the group action.

4.3 Non-Abelian gauge theories


A non-Abelian gauge theory is a quantum field theory that has at least a gauge
field whose symmetry group is a non-Abelian group G. Thus, the Lagrangian of all
non-Abelian gauge theories contains a Yang-Mills term:

$$ \mathcal{L}_A \equiv -\frac{1}{4}\, F^a_{\mu\nu}(x)\, F^{a\, \mu\nu}(x) . \tag{4.66} $$

If Aµ is the only field of the theory, then it is a plain Yang-Mills theory.

4.3.1 Fermions
However, useful gauge theories in particle physics must also have matter fields, i.e.
fermions. Under the action of a Lie group G, a fermion field transforms as

$$ \psi(x) \;\to\; \Omega^{-1}(x)\, \psi(x) , \qquad \bar\psi(x) \;\to\; \bar\psi(x)\, \Omega^{-1\dagger}(x) , \tag{4.67} $$

where Ω is an element of some representation r of G. By an abuse of language, it is
often said that the spinor ψ lives in the representation r (although strictly speaking, the
spinor is in the space on which the elements of the group representation are acting).
Consider now the Dirac Lagrangian constructed with a covariant derivative,

$$ \mathcal{L}_D = \bar\psi(x) \left( i \gamma^\mu D_\mu - m \right) \psi(x) . \tag{4.68} $$

Under a gauge transformation, it becomes

$$ \mathcal{L}_D \;\to\; \bar\psi(x)\, \underbrace{\Omega^{-1\dagger}(x)\, \Omega^{-1}(x)}_{(\Omega \Omega^\dagger)^{-1}} \left( i \gamma^\mu D_\mu - m \right) \psi(x) . \tag{4.69} $$

For this Lagrangian to be gauge invariant, Ω must be a unitary matrix, which restricts
to unitary representations of the gauge group (all finite dimensional representations
of compact Lie groups are equivalent to a unitary representation).
Like in electrodynamics, the necessity of using a covariant derivative in order
to have a Dirac Lagrangian invariant under local gauge transformations completely
specifies the coupling between the fermions and the gauge field Aµ . Since
i γ^µ Dµ = i γ^µ ∂µ + g γ^µ Aµ , this coupling reads

$$ \mathcal{L}_I = g\, \bar\psi_i\, \gamma^\mu A^a_\mu \left( t^a_r \right)_{ij} \psi_j , \tag{4.70} $$

where we have written explicitly the indices i, j of all the objects. These
indices, that run from 1 to the size of the representation r, label the “charge” carried
by the fermions, while the index a may be viewed as the charge carried by the spin-1
particle associated to the vector field A^a_µ (this index runs from 1 to the dimension of
the group).

4.3.2 Standard Model


Two important interactions in Nature are described by non-Abelian gauge theories:
• Quantum chromodynamics, the quantum field theory of strong interactions, is
of this type: the gauge fields are the gluons, and the matter fields are the quarks,
of which there are 6 flavours (up, down, strange, charm, bottom, top).
The charge associated to this gauge interaction is called colour. The gauge
group of QCD is SU(3), and the quarks live in the fundamental representation
(therefore, they can have three different colours). In QCD, the gluons interact
equally with the right-handed and left-handed projections of the spinors: it is
said to be a non-chiral interaction.
• Likewise, the Electroweak theory is a non-Abelian gauge theory with the gauge
group SU(2)×U(1), but with the peculiarity that the SU(2) acts only on the left-
handed projection of the fermions. In other words, the right-handed fermions
belong to the singlet representation of SU(2) (while the left-handed fermions
are arranged in doublets, corresponding to the fundamental representation of
SU(2)).

It is also possible to couple a (charged) scalar field φ(x) to a gauge potential Aµ .
Under a local gauge transformation, φ transforms as follows

$$ \phi(x) \;\to\; \Omega^\dagger(x)\, \phi(x) , \tag{4.71} $$

and therefore the following Lagrangian density is invariant under local gauge
transformations:

$$ \mathcal{L}_{\rm scalar} = \left( D_\mu \phi(x) \right)^\dagger \left( D^\mu \phi(x) \right) - m^2\, \phi^\dagger(x) \phi(x) - V\!\left( \phi^\dagger(x) \phi(x) \right) . \tag{4.72} $$

(The potential should depend on the scalar field via the combination φ† φ in order to
be gauge invariant). The most important example of such a scalar in particle physics is
the Higgs boson. In the Standard Model, the potential of the Higgs field is symmetric
under the gauge transformations, but has minima at non-zero value of the field φ,
leading to spontaneous symmetry breaking. Because of its coupling to the gauge
potentials and to the fermions, the Higgs field vacuum expectation value turns them
into massive particles (see the next section for a discussion of this phenomenon).

4.3.3 Classical equations of motion

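From the Lagrangians (4.66), (4.68) and (4.72), it is straightforward to obtain the
classical Euler-Lagrange equations of motion. For the fermions, we simply obtain the
Dirac equation

$$ \left( i \gamma^\mu D_\mu - m \right) \psi(x) = 0 . \tag{4.73} $$

For scalar fields, the classical equation of motion is a deformation of the Klein-Gordon
equation, in which the ordinary derivatives are replaced by covariant derivatives:

$$ \left[ D_\mu D^\mu + m^2 + V'\!\left( \phi^\dagger(x) \phi(x) \right) \right] \phi(x) = 0 . \tag{4.74} $$

For the gauge field Aµ , the derivatives of the various pieces of the Lagrangian read:

$$ \begin{aligned} \partial_\mu \frac{\partial \mathcal{L}_A}{\partial (\partial_\mu A^a_\nu)} &= -\partial_\mu F_a^{\mu\nu} , \\ \frac{\partial \mathcal{L}_A}{\partial A^a_\nu} &= g\, f^{abc}\, A^b_\mu\, F^{c\, \mu\nu} , \\ \frac{\partial \mathcal{L}_D}{\partial A^a_\nu} &= g\, \bar\psi\, \gamma^\nu t^a\, \psi , \\ \frac{\partial \mathcal{L}_{\rm scalar}}{\partial A^a_\nu} &= i g \left( \phi^\dagger t^a \left( D^\nu \phi \right) - \left( D^\nu \phi \right)^\dagger t^a \phi \right) . \end{aligned} \tag{4.75} $$

This leads to the following equation of motion

$$ \begin{aligned} \left[ D_\mu, F^{\mu\nu} \right]_a &= -J^\nu_a , \\ J^\nu_a &= g\, \bar\psi\, \gamma^\nu t^a\, \psi + i g \left( \phi^\dagger t^a \left( D^\nu \phi \right) - \left( D^\nu \phi \right)^\dagger t^a \phi \right) , \end{aligned} \tag{4.76} $$

known as Yang-Mills equation. From the Dirac and Klein-Gordon equations, one
may check that the colour current J^ν_a is covariantly conserved:

$$ \left[ D_\nu, J^\nu \right] = 0 . \tag{4.77} $$

The field strength also obeys another equation, known as the Bianchi identity,

$$ \left[ D_\mu, F_{\nu\rho} \right] + \left[ D_\nu, F_{\rho\mu} \right] + \left[ D_\rho, F_{\mu\nu} \right] = 0 , \tag{4.78} $$

that follows from the Jacobi identity between covariant derivatives.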

4.3.4 Useful su(N) identities

Feynman graphs relevant for the Standard Model involve manipulations of the su(N)
generators (for N = 2, 3), mostly in the fundamental representation (since all matter
fields are in this representation). In this section, we derive some useful formulas that
help in these calculations.

Fierz identity : In the case of su(N), there are N2 − 1 generators ta f , while the
linear space of all N × N Hermitean matrices has a dimension N2 . A basis of the
latter can be obtained by adding the identity matrix to the ta
f ’s. Thus, any N × N
Hermitean matrix M can be written as

M = m0 1 + ma ta
f . (4.79)

Since the ta
f ’s are traceless, we have

1
m0 = tr (M) , ma = 2 tr (M ta ) . (4.80)
N

Considering the entry ij of the matrix M, we can write

1  
Mij = Mkk δij + 2 Mlk ta a
f kl tf ij
N
h1   i
= Mlk δkl δij + 2 ta ta
f kl f ij . (4.81)
N
4. N ON -A BELIAN GAUGE SYMMETRY 161

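Since this is true for any Hermitean matrix M, we must have

$$\frac{1}{N}\,\delta_{kl}\,\delta_{ij} + 2\,\big(t^a_f\big)_{kl}\big(t^a_f\big)_{ij} = \delta_{il}\,\delta_{jk} , \tag{4.82}$$

which is usually written as follows

$$\big(t^a_f\big)_{ij}\big(t^a_f\big)_{kl} = \frac{1}{2}\Big[\delta_{il}\,\delta_{jk} - \frac{1}{N}\,\delta_{ij}\,\delta_{kl}\Big] . \tag{4.83}$$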
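This formula is called a Fierz identity. It has a convenient diagrammatic representation,

$$\big(t^a_f\big)_{ij}\big(t^a_f\big)_{kl} = [\text{diagram: lines } ij \text{ and } kl \text{ joined by a wavy line}] = \frac{1}{2}\,[\text{diagram: } i \text{ joined to } l,\ j \text{ joined to } k] - \frac{1}{2N}\,[\text{diagram: } i \text{ joined to } j,\ k \text{ joined to } l] , \tag{4.84}$$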

in which the solid blobs represent the $t^a_f$ matrices, and the wavy line indicates that
the indices a are contracted. In the right hand side, the solid lines indicate how the
indices ijkl are connected by the delta symbols. By contracting the indices jk in the
Fierz identity (4.83), we obtain:

$$\big(t^a_f\, t^a_f\big)_{il} = \frac{N^2 - 1}{2N}\,\delta_{il} . \tag{4.85}$$

The quadratic combination $t^a_f t^a_f$, called the fundamental Casimir operator, is
proportional to the identity (and therefore commutes with everything). The prefactor is
sometimes denoted $C_f \equiv (N^2 - 1)/(2N)$.
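The Fierz identity (4.83) and the Casimir relation (4.85) can be checked numerically. The sketch below is only an illustration: it assumes a particular basis of generators (generalized Gell-Mann matrices divided by 2, normalized so that tr(t^a t^b) = δ^{ab}/2), and the function names are hypothetical helpers, not anything defined in the text.

```python
import itertools
import math

def su_n_generators(n):
    """Assumed basis: generalized Gell-Mann matrices / 2, with tr(t^a t^b) = delta^{ab}/2."""
    gens = []
    for j in range(n):
        for k in range(j + 1, n):
            sym = [[0j] * n for _ in range(n)]
            sym[j][k] = sym[k][j] = 0.5            # symmetric off-diagonal generator
            gens.append(sym)
            asym = [[0j] * n for _ in range(n)]
            asym[j][k], asym[k][j] = -0.5j, 0.5j   # antisymmetric off-diagonal generator
            gens.append(asym)
    for l in range(1, n):
        norm = 1.0 / math.sqrt(2 * l * (l + 1))
        diag = [[0j] * n for _ in range(n)]
        for m in range(l):
            diag[m][m] = norm
        diag[l][l] = -l * norm                     # traceless diagonal generator
        gens.append(diag)
    return gens

def fierz_max_deviation(n):
    """Largest |LHS - RHS| of eq. (4.83) over all index choices i, j, k, l."""
    t = su_n_generators(n)
    dev = 0.0
    for i, j, k, l in itertools.product(range(n), repeat=4):
        lhs = sum(ta[i][j] * ta[k][l] for ta in t)
        rhs = 0.5 * ((i == l) * (j == k) - (i == j) * (k == l) / n)
        dev = max(dev, abs(lhs - rhs))
    return dev

def casimir_max_deviation(n):
    """Largest entry of (t^a t^a) - C_f * 1, with C_f = (N^2 - 1) / (2N) as in eq. (4.85)."""
    t = su_n_generators(n)
    c_f = (n * n - 1) / (2 * n)
    dev = 0.0
    for i, l in itertools.product(range(n), repeat=2):
        entry = sum(ta[i][j] * ta[j][l] for ta in t for j in range(n))
        dev = max(dev, abs(entry - c_f * (i == l)))
    return dev

for n in (2, 3):
    print(n, fierz_max_deviation(n), casimir_max_deviation(n))
```

Both deviations should vanish up to floating-point rounding for N = 2 and N = 3.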
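The diagrammatic representation (4.84) provides a very convenient way of obtaining
certain identities involving the generators of the fundamental representation. As
an illustration, let us consider the following example:

$$\begin{aligned}
t^a_f\, t^b_f\, t^a_f &= [\text{diagram: } t^b \text{ between two } t^a \text{ joined by a wavy line}] = \frac{1}{2}\,[\text{diagram}] - \frac{1}{2N}\,[\text{diagram}] \\
&= \frac{1}{2}\,{\rm tr}\,\big(t^b_f\big)\,\mathbf{1} - \frac{1}{2N}\, t^b_f = -\frac{1}{2N}\, t^b_f .
\end{aligned} \tag{4.86}$$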

For the first term, we have used the fact that a closed loop in this diagrammatic
representation corresponds to a trace over the colour indices, and the tracelessness of the
generators. Likewise, one would obtain

$$t^a_f\, t^b_f\, t^c_f\, t^a_f\, t^b_f = [\text{diagram: } t^c \text{ with two crossed wavy lines } a \text{ and } b] = \frac{1}{4}\Big(1 + \frac{1}{N^2}\Big)\, t^c_f . \tag{4.87}$$

More su(N) formulas : Contrary to the commutator, the anti-commutator of two
matrices of the algebra does not belong to the algebra. However, in the case of su(N),
it can be decomposed as a linear combination of the identity and the generators of the
algebra:

$$\big\{t^a_f, t^b_f\big\} = \frac{\delta^{ab}}{N}\,\mathbf{1} + d^{abc}\, t^c_f . \tag{4.88}$$

The first term is obtained by taking the trace of the equation, using eq. (4.29) and
the fact that the generators are traceless. The constants $d^{abc}$ are sometimes called
the symmetric structure constants. Therefore, the product of two generators of the
fundamental representation can be written as

$$t^a_f\, t^b_f = \frac{1}{2}\Big[\frac{\delta^{ab}}{N}\,\mathbf{1} + \big(d^{abc} + i\, f^{abc}\big)\, t^c_f\Big] . \tag{4.89}$$
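From this, we deduce the following identities

$$\begin{aligned}
{\rm tr}\,\big(t^a_f\, t^b_f\, t^c_f\big) &= \frac{1}{4}\big(d^{abc} + i\, f^{abc}\big) , \\
{\rm tr}\,\big(t^a_f\, t^b_f\, t^a_f\, t^c_f\big) &= -\frac{1}{4N}\,\delta^{bc} , \\
f^{acd} f^{bcd} &= N\,\delta^{ab} , \\
d^{acd} d^{bcd} &= \Big(N - \frac{4}{N}\Big)\,\delta^{ab} , \\
f^{acd} d^{bcd} &= 0 , \\
f^{ade} f^{bef} f^{cfd} &= \frac{N}{2}\, f^{abc} .
\end{aligned} \tag{4.90}$$

Note that the third of these equations provides the trace of the product of two generators
in the adjoint representation:

$${\rm tr}\,\big(T^a_{\rm adj}\, T^b_{\rm adj}\big) = N\,\delta^{ab} . \tag{4.91}$$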
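The contractions of structure constants in eq. (4.90) can be verified by brute force. The following sketch assumes the conventions $[t^a, t^b] = i f^{abc} t^c$ and eq. (4.88), so that $f^{abc} = -2i\,{\rm tr}([t^a,t^b]t^c)$ and $d^{abc} = 2\,{\rm tr}(\{t^a,t^b\}t^c)$; the generator basis and all function names are illustrative choices, not part of the text.

```python
import itertools
import math

def su_n_generators(n):
    """Assumed basis: generalized Gell-Mann matrices / 2, with tr(t^a t^b) = delta^{ab}/2."""
    gens = []
    for j in range(n):
        for k in range(j + 1, n):
            sym = [[0j] * n for _ in range(n)]
            sym[j][k] = sym[k][j] = 0.5
            gens.append(sym)
            asym = [[0j] * n for _ in range(n)]
            asym[j][k], asym[k][j] = -0.5j, 0.5j
            gens.append(asym)
    for l in range(1, n):
        norm = 1.0 / math.sqrt(2 * l * (l + 1))
        diag = [[0j] * n for _ in range(n)]
        for m in range(l):
            diag[m][m] = norm
        diag[l][l] = -l * norm
        gens.append(diag)
    return gens

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def structure_constants(t):
    """Return (f, d) with f = -2i tr([t^a,t^b] t^c) and d = 2 tr({t^a,t^b} t^c)."""
    dim = len(t)
    f = [[[0.0] * dim for _ in range(dim)] for _ in range(dim)]
    d = [[[0.0] * dim for _ in range(dim)] for _ in range(dim)]
    for a, b in itertools.product(range(dim), repeat=2):
        ab, ba = matmul(t[a], t[b]), matmul(t[b], t[a])
        for c in range(dim):
            tr_abc, tr_bac = trace(matmul(ab, t[c])), trace(matmul(ba, t[c]))
            f[a][b][c] = (-2j * (tr_abc - tr_bac)).real
            d[a][b][c] = (2 * (tr_abc + tr_bac)).real
    return f, d

def max_deviations(n):
    """Largest violation of the last four lines of eq. (4.90) for su(n)."""
    f, d = structure_constants(su_n_generators(n))
    r = range(n * n - 1)
    def pair(x, y, a, b):
        return sum(x[a][c][e] * y[b][c][e] for c in r for e in r)
    dev = {"ff": 0.0, "dd": 0.0, "fd": 0.0, "fff": 0.0}
    for a, b in itertools.product(r, repeat=2):
        delta = 1.0 if a == b else 0.0
        dev["ff"] = max(dev["ff"], abs(pair(f, f, a, b) - n * delta))
        dev["dd"] = max(dev["dd"], abs(pair(d, d, a, b) - (n - 4 / n) * delta))
        dev["fd"] = max(dev["fd"], abs(pair(f, d, a, b)))
    for a, b, c in itertools.product(r, repeat=3):
        triple = sum(f[a][p][q] * f[b][q][s] * f[c][s][p] for p in r for q in r for s in r)
        dev["fff"] = max(dev["fff"], abs(triple - 0.5 * n * f[a][b][c]))
    return dev

print(max_deviations(3))
```

For su(2) the symmetric constants vanish identically, which is consistent with the factor N − 4/N being zero at N = 2.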

4.4 Spontaneous gauge symmetry breaking


4.4.1 Dirac fermion masses and chiral symmetry
Chiral gauge theories, i.e. gauge theories in which the left and right handed fermions
belong to distinct representations, are rather special regarding the masses of these
fermions. Let us recall that the left and right spinors are defined by

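$$\begin{aligned}
\psi_R &\equiv \frac{1+\gamma^5}{2}\,\psi , & \psi_L &\equiv \frac{1-\gamma^5}{2}\,\psi , \\
\overline{\psi}_R &= \psi^\dagger\,\frac{1+\gamma^5}{2}\,\gamma^0 , & \overline{\psi}_L &= \psi^\dagger\,\frac{1-\gamma^5}{2}\,\gamma^0 .
\end{aligned} \tag{4.92}$$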

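Consider first the term of the Dirac Lagrangian that does not depend on the mass. It
can be decomposed as follows in terms of the left and right handed spinors:

$$\begin{aligned}
\overline{\psi}\,\not{D}\,\psi &= \sum_{\epsilon,\epsilon'=\pm} \psi^\dagger\,\frac{1+\epsilon\gamma^5}{2}\,\gamma^0\,\not{D}\,\frac{1+\epsilon'\gamma^5}{2}\,\psi \\
&= \sum_{\epsilon=\epsilon'=\pm} \psi^\dagger\,\frac{1+\epsilon\gamma^5}{2}\,\gamma^0\,\not{D}\,\psi = \overline{\psi}_R\,\not{D}\,\psi_R + \overline{\psi}_L\,\not{D}\,\psi_L .
\end{aligned} \tag{4.93}$$

Only the terms with $\epsilon = \epsilon'$ survive because $\gamma^5$ commutes with $\gamma^0\gamma^\mu$.
Therefore, this term does not mix the left and right spinors, and is invariant under
independent gauge transformations of the two spinor helicities. In particular, it is
perfectly possible that they belong to different representations of the Lie algebra.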

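In contrast, a Dirac mass term $m\,\overline{\psi}\psi$ has the following decomposition in terms of
the left and right handed spinors:

$$\begin{aligned}
m\,\overline{\psi}\psi &= m \sum_{\epsilon,\epsilon'=\pm} \psi^\dagger\,\frac{1+\epsilon\gamma^5}{2}\,\gamma^0\,\frac{1+\epsilon'\gamma^5}{2}\,\psi \\
&= m \sum_{\epsilon=-\epsilon'=\pm} \psi^\dagger\,\frac{1+\epsilon\gamma^5}{2}\,\gamma^0\,\psi = m\,\overline{\psi}_R\,\psi_L + m\,\overline{\psi}_L\,\psi_R .
\end{aligned} \tag{4.94}$$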

If ψR and ψL belong to different representations and transform independently under


the gauge transformations, such a term is not gauge invariant and is therefore not
allowed. Therefore, generically, fermions must be massless in a chiral gauge theory.
The most prominent example of this situation is the Standard Model, where the
gauge group is SU(3) × SU(2) × U(1), and where the left and right handed fermions
transform differently under the SU(2) × U(1) part. More precisely, the two chiral
components have different charges (called hypercharge in this context) under U(1),
and the left handed fermions form SU(2) doublets while the right handed ones are
singlets under SU(2). Thus, in such a gauge theory, fermions should naively be
massless, while experimental evidence shows that they are massive.
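The selection rules behind eqs. (4.93) and (4.94) reduce to two matrix statements: $P_\epsilon\,\gamma^0\gamma^\mu\,P_{-\epsilon} = 0$ (the kinetic term does not mix chiralities) and $P_\epsilon\,\gamma^0\,P_\epsilon = 0$ (the mass term only mixes them), with $P_\pm = (1 \pm \gamma^5)/2$. The sketch below checks them in the Dirac representation of the gamma matrices, which is an assumption made here for concreteness; any representation would do.

```python
def block(a, b, c, d):
    """Assemble a 4x4 matrix from four 2x2 blocks [[a, b], [c, d]]."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

def neg(m):
    return [[-x for x in row] for row in m]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def max_abs(m):
    return max(abs(x) for row in m for x in row)

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

# Dirac representation (assumed): gamma^0 = diag(1, -1), gamma^i = [[0, s], [-s, 0]]
GAMMA = [block(I2, Z2, Z2, neg(I2))] + [block(Z2, s, neg(s), Z2) for s in SIGMA]
GAMMA5 = block(Z2, I2, I2, Z2)

def projector(eps):
    """Chirality projector (1 + eps * gamma^5) / 2, with eps = +1 (right) or -1 (left)."""
    return [[((i == j) + eps * GAMMA5[i][j]) / 2 for j in range(4)] for i in range(4)]

def kinetic_mixing(eps, mu):
    """P_eps gamma^0 gamma^mu P_{-eps}: should vanish."""
    return matmul(matmul(projector(eps), matmul(GAMMA[0], GAMMA[mu])), projector(-eps))

def mass_same_chirality(eps):
    """P_eps gamma^0 P_eps: should vanish."""
    return matmul(matmul(projector(eps), GAMMA[0]), projector(eps))

def mass_opposite_chirality(eps):
    """P_eps gamma^0 P_{-eps}: should NOT vanish."""
    return matmul(matmul(projector(eps), GAMMA[0]), projector(-eps))

for eps in (+1, -1):
    print(eps,
          max(max_abs(kinetic_mixing(eps, mu)) for mu in range(4)),
          max_abs(mass_same_chirality(eps)),
          max_abs(mass_opposite_chirality(eps)))
```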

4.4.2 Coupling to a scalar field, Yukawa terms

Let us focus on the case where the right handed fermions are singlets under the gauge
group, while the left handed ones belong to a non trivial representation. This means
that they transform as follows

$$\begin{aligned}
\psi_R &\to \psi_R , & \overline{\psi}_R &\to \overline{\psi}_R , \\
\psi_L &\to \Omega^{-1}\,\psi_L , & \overline{\psi}_L &\to \overline{\psi}_L\,\Omega .
\end{aligned} \tag{4.95}$$

Thus, one way to construct an operator which is bilinear in the fermions, mixes the
left and right components, and does not contain derivatives is to introduce a scalar
field Φ that transforms in the same way as $\psi_L$,

$$\Phi \to \Omega^{-1}\,\Phi . \tag{4.96}$$

This operator, called a Yukawa term, reads

$$\lambda\,\big(\overline{\psi}_L\big)_{ri}\,\Phi_i\,\big(\psi_R\big)_r , \tag{4.97}$$

where λ is a coupling constant, r is a Dirac index, and i is the index that labels the
components of the Lie algebra representation to which ψL and Φ both belong. From
the way the indices are contracted, this term is both gauge and Lorentz invariant.
Note that since the contraction of the Dirac indices between the two spinors already
produces a Lorentz invariant object, the field Φ must be Lorentz invariant on its own,
and thus must be a scalar.
At this point, the term of eq. (4.97) is not yet a mass term, but simply a tri-linear
interaction term between fermions and the newly introduced scalar field. However, a
mass term is generated if the vacuum expectation value $\langle\Phi_i\rangle$ is non-zero (as we shall
see, this is related to the spontaneous breaking of the gauge symmetry). Therefore,
we may redefine the scalar field by writing (for the sake of this example, we choose
this expectation value to point in the direction i = 1)

$$\Phi \equiv \Phi_v + \varphi , \qquad \Phi_v \equiv \begin{pmatrix} v \\ 0 \\ \vdots \\ 0 \end{pmatrix} , \tag{4.98}$$

and the term (4.97) becomes

$$\lambda\,\big(\overline{\psi}_L\big)_{ri}\,\Phi_i\,\big(\psi_R\big)_r = \underbrace{\lambda\, v}_{m}\,\big(\overline{\psi}_L\big)_{r1}\big(\psi_R\big)_r + \lambda\,\big(\overline{\psi}_L\big)_{ri}\,\varphi_i\,\big(\psi_R\big)_r . \tag{4.99}$$

If the fermion in the right handed singlet matches the first component of the left
handed multiplet, then the first term in the right hand side is a Dirac mass term for
this fermion (the fermions corresponding to the other components of the multiplet
remain massless13 ). The second term in the right hand side is a genuine interaction
term between the fermions and the fluctuating part of the scalar. Interestingly, with
this mechanism, the strength of this interaction is proportional to the mass of the
fermion. In a theory where several fermions acquire their masses by coupling to the
expectation value of the same scalar field, this leads to a definite prediction: the ratios
of the couplings must equal the ratios of the masses (but the masses themselves are
not predicted, since the Yukawa couplings λ are free parameters).

13 In the Standard Model, where the left handed fermions belong to the fundamental representation
of SU(2), it is possible to give a mass to the second component of the doublet. Indeed, by noting
that $\Omega\, t^2\, \Omega^T = t^2$ for any matrix Ω in the fundamental representation of SU(2), we see that the term
$i\lambda\,\overline{\psi}_L\, t^2\, \Phi^*\, \psi_R$ is gauge invariant and gives a mass λv to the second component of the left handed
doublet when the vacuum expectation value of the scalar field has a non-zero first component.

Family mixing : When there are several families of fermions (that we label by an
extra index f in this paragraph), the Yukawa term of eq. (4.97) can be generalized into

$$\lambda_{ff'}\,\big(\overline{\psi}_L\big)_{fri}\,\Phi_i\,\big(\psi_R\big)_{f'r} , \tag{4.100}$$

without spoiling Lorentz or gauge invariance. Thus, with a non-zero vacuum expectation
value of the scalar field, we get a fermion mass matrix which is in general not
diagonal in the fermion families. Note that here, we are implicitly choosing a basis
of fermion fields in which the couplings to the gauge bosons are diagonal, i.e. for
which the vertex with one gauge boson and two fermions does not mix the fermion
families. Conversely, we could choose a basis of fermion fields in which the mass
matrix is diagonal. In this alternate basis, the interactions with the gauge bosons
are no longer diagonal, i.e. the coupling to a gauge boson may change the type of
fermion. In the quark sector, these non-diagonal interactions are described by a matrix
known as the Cabibbo-Kobayashi-Maskawa (CKM) matrix.

4.4.3 Higgs mechanism


Until now, we have not made explicit the mechanism by which the scalar field Φ may
have a non-zero vacuum expectation value. The simplest gauge invariant Lagrangian
that exhibits this phenomenon is

$$\mathcal{L}_{\rm scalar} = \big(D_\mu\Phi(x)\big)^\dagger\big(D^\mu\Phi(x)\big) + m^2\,\Phi^\dagger(x)\Phi(x) - \frac{\lambda}{4}\,\big(\Phi^\dagger(x)\Phi(x)\big)^2 . \tag{4.101}$$

Note the unusual sign of the mass term. Because of this feature, the value Φ = 0 is a
local maximum of the potential, and cannot be a stable field configuration. Instead,
this potential has minima for $\Phi^\dagger\Phi = 2m^2/\lambda$, which corresponds to a gauge invariant
shell of non-trivial minima. The general arguments developed in section 1.15 also
apply here: in an infinite volume, the system chooses as its ground state one of these
minima (as opposed to a symmetric linear combination of all the minima).

Let us denote $\Phi_v$ the ground state on which the system settles. The gauge group
G contains a subgroup H that leaves $\Phi_v$ invariant, called the stabilizer of $\Phi_v$, and the
set of the minima of the potential can be identified with the coset space14 G/H. Then,
the generators $t^a$ of the Lie algebra g can be divided in two sets: a basis of h (for
a > n), and a complementary set (for 1 ≤ a ≤ n):

$$\begin{aligned}
1 \le a \le n &: \quad t^a_{ij}\,\Phi_{vj} \neq 0 , \\
a > n &: \quad t^a_{ij}\,\Phi_{vj} = 0 .
\end{aligned} \tag{4.102}$$

14 Given a group G and H one of its subgroups, two elements Ω and Ω′ are said to be H-equivalent
if $\Omega^{-1}\Omega' \in H$. The quotient of the group G by this equivalence relationship, also called coset space
and denoted G/H, is the set of the resulting equivalence classes. When H is a normal subgroup (i.e.
$\Omega H \Omega^{-1} = H$ for all Ω ∈ G), the coset space is itself a group.

Figure 4.3: Illustration of the symmetry breaking pattern in the case of a potential with G = O(3)
symmetry. The set of minima of the potential is a 2-dimensional sphere. The stabilizer of the
minimum $\Phi_v$ is H = O(2). The dark circular arrow shows the action of the generator of h, while
the lighter arrows show the action of the generators of the complementary set.

In the cases of interest in quantum field theory, the Lie algebra g is a direct sum

$$g = h \oplus m , \quad \text{with } \big[h, m\big] \subset m , \tag{4.103}$$

and the complementary set of generators is a basis of m. (This is called a reductive
decomposition of g). In this case, the tangent space to G/H at the origin can be
identified with m, via the following mapping

$$X \in m \;\to\; e^{itX} H \in G/H . \tag{4.104}$$

Thus, G/H is obtained by exponentiation of the elements of m, and any configuration
of the scalar field may be parameterized as follows:

$$\Phi(x) \equiv \underbrace{\exp\Big(i \sum_{a=1}^{n} \vartheta_a(x)\, t^a\Big)}_{\Omega^{-1}(x)}\,\big(\Phi_v + r(x)\big) . \tag{4.105}$$

In this representation, r(x) denotes the “radial” field variables, while the ϑa (x) are
the “angular” ones. From eqs. (4.102), the latter correspond to the generators broken
by the spontaneous symmetry breaking. Therefore, if it were not for the coupling
of Φ to the gauge fields through the covariant derivatives, we would conclude from

Goldstone’s theorem that the modes r(x) are massive while the modes ϑa (x) are
the massless Nambu-Goldstone bosons. However, this conclusion is altered by the
minimal coupling to a gauge field because it is possible to absorb the matrix Ω−1 (x),
that contains the would-be Nambu-Goldstone modes, into a gauge transformation of
that field. Indeed, we may write

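$$\begin{aligned}
D_\mu \Phi &= D_\mu\, \Omega^{-1}\,(\Phi_v + r) \\
&= \Omega^{-1}\,\underbrace{\Omega\, D_\mu\, \Omega^{-1}}_{D'_\mu}\,(\Phi_v + r) ,
\end{aligned} \tag{4.106}$$

with

$$D'_\mu \equiv \partial_\mu - ig\,A'_\mu , \qquad A'_\mu \equiv \Omega\, A_\mu\, \Omega^{-1} + \frac{i}{g}\,\Omega\,\partial_\mu \Omega^{-1} . \tag{4.107}$$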

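We see that after this gauge transformation of $A_\mu$ (this choice of gauge is known as
the unitary gauge), only the modes r(x) can still be considered as physical dynamical
modes of the scalar field, and the kinetic term of the scalar Lagrangian can thus be
rewritten as

$$\begin{aligned}
\big(D_\mu\Phi\big)^\dagger\big(D^\mu\Phi\big) &= \big(D'_\mu(\Phi_v + r)\big)^\dagger\big(D'^\mu(\Phi_v + r)\big) \\
&= \big(D'_\mu r\big)^\dagger\big(D'^\mu r\big) \\
&\quad + \big({-ig}A'_\mu\Phi_v\big)^\dagger\big(D'^\mu r\big) + \big(D'_\mu r\big)^\dagger\big({-ig}A'^\mu\Phi_v\big) \\
&\quad + \underbrace{\big({-ig}A'_\mu\Phi_v\big)^\dagger\big({-ig}A'^\mu\Phi_v\big)}_{\frac{1}{2} M_{ab}\, A'^a_\mu A'^{b\mu}} .
\end{aligned} \tag{4.108}$$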

In this expression, the last term is particularly interesting, since it provides a mass for
some of the gauge bosons. More explicitly, the mass matrix is given by

$$M_{ab} \equiv 2\, g^2\, \Phi^\dagger_{vi}\, t^a_{ik}\, t^b_{kj}\, \Phi_{vj} . \tag{4.109}$$

Note that since

$$\frac{1}{2}\, M_{ab}\, X_a X_b = g^2 \sum_k \Big| X_b\, t^b_{kj}\, \Phi_{vj} \Big|^2 \ge 0 , \tag{4.110}$$

this mass matrix is positive and has a number of flat directions equal to the number
of generators ta that annihilate Φv . From this, we conclude that the gauge bosons
that become massive via this mechanism are those that couple to the generators of the
broken symmetries.
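As an illustration of eq. (4.109), consider the O(3) example of Figure 4.3. The sketch below assumes a real triplet with generators $(T^a)_{ij} = -i\,\epsilon_{aij}$ and a vacuum $\Phi_v = (0, 0, v)$; the numerical values of g and v are arbitrary, and eq. (4.109) is applied as written, without adjusting normalizations for a real field. One expects two massive directions and one flat direction, the latter corresponding to the unbroken generator of H = O(2).

```python
def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def so3_generators():
    """Assumed generators of the vector representation: (T^a)_{ij} = -i eps_{aij}."""
    return [[[-1j * levi_civita(a, i, j) for j in range(3)] for i in range(3)]
            for a in range(3)]

def mass_matrix(g, phi_v):
    """M_ab = 2 g^2 Phi_v^dagger T^a T^b Phi_v, following eq. (4.109)."""
    t = so3_generators()
    m = [[0.0] * 3 for _ in range(3)]
    for a in range(3):
        for b in range(3):
            total = sum(phi_v[i].conjugate() * t[a][i][k] * t[b][k][j] * phi_v[j]
                        for i in range(3) for k in range(3) for j in range(3))
            m[a][b] = (2 * g * g * total).real
    return m

def unbroken_generators(phi_v):
    """Indices a for which T^a Phi_v = 0 (generators of the stabilizer H)."""
    t = so3_generators()
    return [a for a in range(3)
            if all(abs(sum(t[a][i][j] * phi_v[j] for j in range(3))) < 1e-12
                   for i in range(3))]

g, v = 0.5, 2.0                       # arbitrary illustrative values
phi_v = [0j, 0j, complex(v)]
for row in mass_matrix(g, phi_v):
    print(row)
print("unbroken generators:", unbroken_generators(phi_v))
```

With these values the matrix is diagonal, with entries $2g^2v^2$ for the two broken generators and 0 for the unbroken one.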

4.5 θ-term and strong-CP problem

4.5.1 CP-odd gauge invariant operator

In the construction of the Lagrangian of Yang-Mills theory, we have argued that the
only dimension four gauge invariant local operator is an operator quadratic in the
field strength $F^{\mu\nu}_a$. All the Lorentz indices should be contracted in order to obtain
a Lorentz invariant Lagrangian density. An obvious possibility is $F^{\mu\nu}_a F^a_{\mu\nu}$, which is
the combination that appears in the Yang-Mills action. However, there exists another
Lorentz invariant contraction, obtained by introducing the Levi-Civita tensor,

$$\mathcal{L}_\theta \equiv \frac{g^2\theta}{32\pi^2}\,\epsilon^{\mu\nu\rho\sigma}\,{\rm tr}\,\big(F_{\mu\nu} F_{\rho\sigma}\big) . \tag{4.111}$$

The prefactor $1/32\pi^2$ will prove convenient later, and the coupling constant in front
of this term is usually denoted θ. Consequently, this term is referred to as the θ-term.

4.5.2 Expression as a total derivative

Firstly, we should clarify why we have not considered this term right away when we
listed the possible gauge invariant operators that may enter in a non-Abelian gauge
theory. As we shall prove now, the θ-term is a total derivative. Therefore, it does not
enter in the field equations of motion, and has also no influence on perturbation theory.
Since our discussion has been so far centered on the perturbative expansion, this
term was irrelevant. However, the θ-term –that we cannot exclude on the grounds of
symmetries– may lead to non-perturbative effects that we shall discuss in this section.

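Let us consider the following vector15 :

$$K^\mu \equiv \epsilon^{\mu\nu\rho\sigma}\Big[A^a_\nu F^a_{\rho\sigma} - \frac{g}{3}\, f^{abc} A^a_\nu A^b_\rho A^c_\sigma\Big] . \tag{4.112}$$

15 Note that this vector can also be expressed as a trace:

$$K^\mu \equiv 2\,\epsilon^{\mu\nu\rho\sigma}\,{\rm tr}\,\Big(A_\nu F_{\rho\sigma} + \frac{2ig}{3}\, A_\nu A_\rho A_\sigma\Big) .$$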

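The divergence of this vector is given by

$$\begin{aligned}
\partial_\mu K^\mu = \epsilon^{\mu\nu\rho\sigma}\Big[ & \big(\partial_\mu A^a_\nu\big)\big(\partial_\rho A^a_\sigma - \partial_\sigma A^a_\rho + g f^{abc} A^b_\rho A^c_\sigma\big) \\
& + A^a_\nu\big(\partial_\mu\partial_\rho A^a_\sigma - \partial_\mu\partial_\sigma A^a_\rho + g f^{abc} (\partial_\mu A^b_\rho) A^c_\sigma + g f^{abc} A^b_\rho (\partial_\mu A^c_\sigma)\big) \\
& - \frac{g}{3} f^{abc} \big(\partial_\mu A^a_\nu\big) A^b_\rho A^c_\sigma - \frac{g}{3} f^{abc} A^a_\nu \big(\partial_\mu A^b_\rho\big) A^c_\sigma - \frac{g}{3} f^{abc} A^a_\nu A^b_\rho \big(\partial_\mu A^c_\sigma\big) \Big] \\
= \frac{1}{2}\,\epsilon^{\mu\nu\rho\sigma}\Big[ & F^a_{\mu\nu} F^a_{\rho\sigma} - g^2 f^{abc} f^{ade} A^b_\mu A^c_\nu A^d_\rho A^e_\sigma \\
& + \frac{g}{3} f^{abc}\Big( A^b_\mu A^c_\nu \big(\partial_\rho A^a_\sigma - \partial_\sigma A^a_\rho\big) - A^b_\rho A^c_\sigma \big(\partial_\mu A^a_\nu - \partial_\nu A^a_\mu\big)\Big)\Big] .
\end{aligned} \tag{4.113}$$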

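In the final expression, the two terms proportional to g/3 are antisymmetric under the
exchange (µν) ↔ (ρσ), while the prefactor $\epsilon^{\mu\nu\rho\sigma}$ is symmetric under this exchange.
These terms are therefore zero after summing over the indices µνρσ. Then, the term
quartic in the gauge field is proportional to

$$g^2\,\epsilon^{\mu\nu\rho\sigma}\,{\rm tr}\,\big([A_\mu, A_\nu][A_\rho, A_\sigma]\big) = g^2\,\epsilon^{\mu\nu\rho\sigma}\,{\rm tr}\,\big(A_\mu A_\nu A_\rho A_\sigma + A_\nu A_\mu A_\sigma A_\rho - A_\nu A_\mu A_\rho A_\sigma - A_\mu A_\nu A_\sigma A_\rho\big) . \tag{4.114}$$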

Each term is a trace of four factors, and is invariant under cyclic permutations of
the indices. Since a cyclic permutation of four objects is odd, the $\epsilon^{\mu\nu\rho\sigma}$ tensor
changes sign under such a permutation, and the contraction with the trace is zero.
Therefore, we obtain:

$$\partial_\mu K^\mu = \frac{1}{2}\,\epsilon^{\mu\nu\rho\sigma} F^a_{\mu\nu} F^a_{\rho\sigma} , \tag{4.115}$$

which is proportional to the θ-term. More precisely, we have

$$\mathcal{L}_\theta = \frac{g^2\theta}{32\pi^2}\,\partial_\mu K^\mu . \tag{4.116}$$
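The parity argument used for eq. (4.114) can be checked directly: a cyclic shift of four objects is an odd permutation, so the fully antisymmetrized trace of four matrices vanishes. The sketch below uses four arbitrary integer 2 × 2 matrices (chosen only for illustration, with no physical meaning) so that the cancellation is exact.

```python
from itertools import permutations

def perm_sign(p):
    """Sign of a permutation given as a tuple, computed by counting inversions."""
    sign = 1
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:
                sign = -sign
    return sign

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def antisymmetrized_trace(mats):
    """Sum over permutations p of sign(p) * tr(M_p0 M_p1 M_p2 M_p3)."""
    total = 0
    for p in permutations(range(4)):
        prod = matmul(matmul(mats[p[0]], mats[p[1]]), matmul(mats[p[2]], mats[p[3]]))
        total += perm_sign(p) * trace(prod)
    return total

# Arbitrary non-commuting integer matrices standing in for A_mu, mu = 0..3
MATS = [[[1, 2], [3, 4]], [[0, 1], [5, -2]], [[2, -1], [1, 3]], [[-3, 0], [4, 1]]]

print(perm_sign((1, 2, 3, 0)))       # cyclic shift of four objects
print(perm_sign((1, 2, 0)))          # cyclic shift of three objects, for comparison
print(antisymmetrized_trace(MATS))
```

The same mechanism does not apply to three factors, where the cyclic shift is even; this is why the cubic term of $K^\mu$ survives.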

4.5.3 Proof in terms of differential forms

The elementary proof of this result that we have presented in the previous subsection
is arguably rather cumbersome. This could have been made much more compact by
using the language of differential forms. The simplest differential forms are 1-forms,
that one may think of as the contraction of a spacetime dependent vector $a_\mu(x)$ and
of the differential element $dx^\mu$, as in

$$A \equiv a_\mu(x)\, dx^\mu . \tag{4.117}$$

1-forms measure the variation of a function along an infinitesimal one-dimensional
path (thus, 1-forms may be integrated along a path γ, yielding a number).
Higher degree forms may be constructed thanks to the exterior product, denoted
∧. A basis of 2-forms is provided by the products $dx^\mu \wedge dx^\nu$, that are the areas of the
infinitesimal quadrangles of edges $(dx^\mu, dx^\nu, -dx^\mu, -dx^\nu)$. The exterior product is
defined to be antisymmetric,

$$dx^\mu \wedge dx^\nu = -\,dx^\nu \wedge dx^\mu , \tag{4.118}$$

which corresponds to a definition of oriented areas, depending on the order in which
the edges of the quadrangle are traveled:

[Figure: the quadrangle traveled along $dx^\mu$ then $dx^\nu$ has the opposite orientation to the one traveled along $dx^\nu$ then $dx^\mu$.]

This also naturally implies that $dx^\mu \wedge dx^\mu = 0$, in accordance with the fact that a
quadrangle of edges $(dx^\mu, dx^\mu, -dx^\mu, -dx^\mu)$ is reduced to a line segment and thus
has zero area. The most general 2-form can be written as

$$F \equiv f_{\mu\nu}(x)\, dx^\mu \wedge dx^\nu , \tag{4.119}$$

where $f_{\mu\nu}$ is antisymmetric under the exchange of the µ, ν indices. 2-forms can be
integrated over a two-dimensional manifold, to give a number. In d-dimensional
space, one may iteratively construct p-forms for any p ≤ d (higher degree forms are
zero by antisymmetry of the exterior product). In particular, in four dimensions, the
volume element weighted by the fully antisymmetric tensor $\epsilon^{\mu\nu\rho\sigma}$ can be written as

$$d^4x\;\epsilon^{\mu\nu\rho\sigma} = dx^\mu \wedge dx^\nu \wedge dx^\rho \wedge dx^\sigma , \tag{4.120}$$

which allows the following compact notation

$$\int d^4x\;\epsilon^{\mu\nu\rho\sigma} A_\mu A_\nu A_\rho A_\sigma = \int A \wedge A \wedge A \wedge A . \tag{4.121}$$

Here, it is important to note that $A \wedge A \neq 0$ when $A_\mu$ belongs to a non-Abelian Lie
algebra, despite the antisymmetry of the exterior product.

Another important operation on differential forms is the exterior derivative d,
defined as

$$d\omega \equiv dx^\mu \wedge \partial_\mu \omega . \tag{4.122}$$

Thus, the exterior derivative of a p-form is a (p + 1)-form. For instance, given a
1-form $A = a_\mu\, dx^\mu$, we have

$$dA = dx^\mu \wedge \big(\partial_\mu a_\nu\big)\, dx^\nu = \frac{1}{2}\big(\partial_\mu a_\nu - \partial_\nu a_\mu\big)\, dx^\mu \wedge dx^\nu . \tag{4.123}$$

Note that since ordinary derivatives commute, we have16

$$d^2\omega = 0 . \tag{4.124}$$

When applying the exterior derivative to the exterior product of two forms, one should
distribute the partial derivative on the two factors, and account for the fact that the
exterior derivative contains an anticommuting $dx^\mu$. Thus, if A is a p-form, we have

$$d\big(A \wedge B\big) = dA \wedge B + (-1)^p\, A \wedge dB . \tag{4.125}$$

Differential forms also provide a unified version of various formulas of vector calculus
(e.g., Kelvin–Stokes and Ostrogradsky–Gauss theorems), known as Stokes theorem.
Given a form ω and a manifold M, Stokes theorem states that

$$\int_{\partial M} \omega = \int_{M} d\omega , \tag{4.126}$$

where ∂M is the boundary of M.

In order to cast the θ-term in the language of differential forms, let us firstly
introduce the gauge potential 1-form:

$$A \equiv ig\, A_\mu\, dx^\mu . \tag{4.127}$$

Then, we have

$$\begin{aligned}
dA &= \frac{ig}{2}\big(\partial_\mu A_\nu - \partial_\nu A_\mu\big)\, dx^\mu \wedge dx^\nu , \\
A \wedge A &= -\frac{g^2}{2}\big[A_\mu, A_\nu\big]\, dx^\mu \wedge dx^\nu ,
\end{aligned} \tag{4.128}$$

and we see that the field strength $F_{\mu\nu}$ appears in the coefficients of the following
2-form,

$$F \equiv dA - A \wedge A = \frac{ig}{2}\,\underbrace{\big(\partial_\mu A_\nu - \partial_\nu A_\mu - ig\,[A_\mu, A_\nu]\big)}_{F_{\mu\nu}}\, dx^\mu \wedge dx^\nu . \tag{4.129}$$

16 A differential form ω whose exterior derivative is zero (dω = 0) is said to be closed. A differential
form χ which is the exterior derivative of another form (χ = dω) is said to be exact.

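Therefore, the integrand of the θ-term may be written compactly as

$$d^4x\;\epsilon^{\mu\nu\rho\sigma}\,{\rm tr}\,\big(F_{\mu\nu} F_{\rho\sigma}\big) = -\frac{4}{g^2}\,{\rm tr}\,\big(F \wedge F\big) . \tag{4.130}$$

Likewise, we have

$$d^4x\; K^\mu = -\frac{4}{g^2}\, dx^\mu \wedge {\rm tr}\,\Big(A \wedge F + \frac{1}{3}\, A \wedge A \wedge A\Big) . \tag{4.131}$$

Then, note that

$$\begin{aligned}
d\,{\rm tr}\,\Big(A \wedge F + \frac{1}{3}\, A \wedge A \wedge A\Big) &= {\rm tr}\,\big(dA \wedge F - A \wedge dF + (dA) \wedge A \wedge A\big) \\
&= {\rm tr}\,\big(dA \wedge dA - 2\,(dA) \wedge A \wedge A\big) \\
&= {\rm tr}\,\big(F \wedge F\big) ,
\end{aligned} \tag{4.132}$$

where we have used the cyclicity of the trace and the fact that commuting dA with
other forms does not bring any sign since it is a 2-form. In order to obtain the last
line, we have used

$${\rm tr}\,\big(A \wedge A \wedge A \wedge A\big) = 0 , \tag{4.133}$$

which is a consequence of the fact that a cyclic permutation of four objects is odd.
Eq. (4.132) is the translation in terms of forms of the fact that the θ-term is the
derivative of the vector $K^\mu$. Thanks to Stokes theorem, the integral of the θ-term
over a four-dimensional manifold M can be rewritten as an integral over its boundary
(located at infinity if M is the entire $\mathbb{R}^4$),

$$\int_{M} {\rm tr}\,\big(F \wedge F\big) = \int_{\partial M} {\rm tr}\,\Big(A \wedge F + \frac{1}{3}\, A \wedge A \wedge A\Big) . \tag{4.134}$$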

4.5.4 Effect of the θ-term on the Euclidean path integral


We have already encountered the integral of the θ-term over Euclidean spacetime in
the context of anomalies and the Atiyah-Singer index theorem (see eq. (3.129)):

$$\int d^4x_E\; \mathcal{L}_\theta = n\,\theta , \qquad n \in \mathbb{Z} , \tag{4.135}$$

where the integer n is related to the chirality of the zero modes of the Dirac operator
in the gauge field configuration. When added to the Yang-Mills action, the integral of
the θ-term modifies the Euclidean path integral as follows

$$\begin{aligned}
\int \big[DA_\mu \cdots\big]\; e^{-S[A,\cdots]} \;\to\; & \int \big[DA_\mu \cdots\big]\; e^{-S[A,\cdots] - \int d^4x_E\, \mathcal{L}_\theta} \\
& = \sum_{n \in \mathbb{Z}} e^{-n\theta} \int \big[DA_\mu \cdots\big]_n\; e^{-S[A,\cdots]} ,
\end{aligned} \tag{4.136}$$

where the measure $[DA_\mu]_n$ is restricted to the gauge fields of index n. Thus, the
effect of the θ-term is to reweight the gauge field configurations by a factor $(e^{-\theta})^n$
that depends only on θ and on the index n. Note that since n is an integer, the path
integral is periodic in θ, with a period 2iπ.

4.5.5 Strong CP-problem

As we have seen in section 3.5.6, an effective description of the interactions of
nucleons with pions is provided by the linear σ model, whose interaction term is

$$\mathcal{L}_I \equiv \lambda\,\overline{\psi}\,\big(\sigma + i\,\boldsymbol{\pi}\cdot\boldsymbol{\sigma}\,\gamma^5\big)\,\psi . \tag{4.137}$$

However, this does not include any CP-violating interactions, such as those that may
result from the θ-term. Its effects may be included in the effective theory by generalizing
the interaction term into

$$\mathcal{L}_I \equiv \overline{\psi}\,\big(\lambda\,\sigma + \boldsymbol{\pi}\cdot\boldsymbol{\sigma}\,(i\lambda\gamma^5 + \overline{\lambda})\big)\,\psi . \tag{4.138}$$

By a matching with the underlying theory, the new coupling $\overline{\lambda}$ can be related to the
parameter θ by the following estimate

$$|\overline{\lambda}| \approx 0.038\,|\theta| . \tag{4.139}$$

Then, the effective theory (4.138) can be used to estimate the neutron electric dipole
moment $D_N$ (in the chiral limit where the pion mass $m_\pi$ is much smaller than the
nucleon mass $m_N$). This leads to

$$D_N \approx \lambda\,\overline{\lambda}\; e\;\frac{\ln\big(m_N/m_\pi\big)}{4\pi^2\, m_N} \approx 5 \times 10^{-16}\;\theta\;\; e\cdot{\rm cm} . \tag{4.140}$$

Current experimental limits on the neutron electric dipole moment indicate that

$$D_N \le 3 \times 10^{-26}\;\; e\cdot{\rm cm} , \tag{4.141}$$

implying that

$$|\theta| \lesssim 10^{-10} . \tag{4.142}$$
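The bound (4.142) is simple arithmetic on the two numbers quoted in eqs. (4.140) and (4.141). A minimal sketch, with hypothetical variable names, makes the step explicit:

```python
# Numbers taken from eqs. (4.140) and (4.141); units are e*cm.
DIPOLE_PER_UNIT_THETA = 5e-16   # |D_N| is about 5e-16 * |theta|
DIPOLE_UPPER_LIMIT = 3e-26      # experimental limit on |D_N| quoted in the text

def theta_upper_bound(dipole_limit, dipole_per_theta):
    """Largest |theta| compatible with |D_N| <= dipole_limit."""
    return dipole_limit / dipole_per_theta

theta_max = theta_upper_bound(DIPOLE_UPPER_LIMIT, DIPOLE_PER_UNIT_THETA)
print(f"|theta| <= {theta_max:.1e}")
```

The ratio is 6 × 10⁻¹¹, which is the origin of the order-of-magnitude statement $|\theta| \lesssim 10^{-10}$.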

We thus face a paradoxical situation. The gauge symmetry of quantum chromody-


namics allows the addition of the θ-term to the Yang-Mills action, and without any
prior knowledge of the coupling θ, one may expect that natural values are of order
unity. This constitutes the strong-CP problem: lacking a symmetry principle that
would force θ to be zero, why is it nevertheless extremely small?

4.5.6 θ-term and quark masses

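There is an interesting interplay between the θ-term and chiral transformations of
quark fields:

$$\psi_f \;\longrightarrow\; e^{i\gamma^5\alpha_f}\,\psi_f , \tag{4.143}$$

where f is an index labeling the quark flavours and the $\alpha_f$ are real phases. Under this
transformation, the functional measure for the quarks is not invariant, but transforms
as follows

$$\big[D\psi D\overline{\psi}\big] \;\longrightarrow\; \exp\Big(-\frac{i}{32\pi^2}\int d^4x\;\epsilon^{\mu\nu\rho\sigma} F^a_{\mu\nu} F^a_{\rho\sigma} \sum_f \alpha_f\Big)\,\big[D\psi D\overline{\psi}\big] . \tag{4.144}$$

The same effect would have been obtained by a change of the angle θ:

$$\theta \to \theta - 2\sum_f \alpha_f . \tag{4.145}$$

For the quarks, we can write generically the following mass term17

$$\sum_f M_f\,\overline{\psi}_f\,\frac{1+\gamma^5}{2}\,\psi_f + \sum_f M^*_f\,\overline{\psi}_f\,\frac{1-\gamma^5}{2}\,\psi_f , \tag{4.146}$$

that transforms into the following under the above chiral transformation

$$\sum_f e^{2i\alpha_f} M_f\,\overline{\psi}_f\,\frac{1+\gamma^5}{2}\,\psi_f + \sum_f e^{-2i\alpha_f} M^*_f\,\overline{\psi}_f\,\frac{1-\gamma^5}{2}\,\psi_f . \tag{4.147}$$

This is equivalent to transforming the quark masses as follows:

$$M_f \to e^{2i\alpha_f}\, M_f . \tag{4.148}$$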

Since any change of θ can be absorbed by a chiral transformation of the quarks,
whose effect is to multiply the quark masses by phases, physical quantities cannot
depend separately on θ and on the quark masses. Instead, they can depend only on
the following combination

$$e^{i\theta}\,\prod_f M_f , \tag{4.149}$$

which is invariant. This discussion indicates that the θ-term has no effect if at least
one of the quarks is massless. Unfortunately, a massless up quark (the lightest quark)
does not seem consistent with existing experimental and lattice evidence.
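The invariance of the combination (4.149) under the simultaneous changes (4.145) and (4.148) is easy to confirm numerically. In the sketch below the masses, phases and the value of θ are arbitrary illustrative numbers, and the function names are hypothetical.

```python
import cmath

def invariant_combination(theta, masses):
    """exp(i theta) * prod_f M_f, the combination of eq. (4.149)."""
    result = cmath.exp(1j * theta)
    for m in masses:
        result *= m
    return result

def chiral_rotation(theta, masses, alphas):
    """Apply theta -> theta - 2 sum_f alpha_f and M_f -> exp(2i alpha_f) M_f."""
    new_theta = theta - 2 * sum(alphas)
    new_masses = [cmath.exp(2j * a) * m for a, m in zip(alphas, masses)]
    return new_theta, new_masses

theta = 0.7                                     # arbitrary
masses = [2.2 + 0.0j, 4.7 * cmath.exp(0.3j), 95.0 + 0.0j]   # arbitrary complex masses
alphas = [0.1, -0.4, 1.3]                       # arbitrary real phases

before = invariant_combination(theta, masses)
after = invariant_combination(*chiral_rotation(theta, masses, alphas))
print(before, after)

# With one massless quark the combination vanishes, whatever the value of theta
print(invariant_combination(theta, [0j] + masses[1:]))
```

The last line illustrates the remark about a massless quark: the only combination on which physics can depend is then identically zero, so θ drops out.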

17 If the masses are complex, then the symmetries P and CP are explicitly broken.

4.5.7 Link with the topology of gauge fields

Using Stokes' theorem, the integral of the θ-term over Euclidean spacetime may be
rewritten as an integral over a surface localized at infinity:

$$\int d^4x_E\; \mathcal{L}_\theta = \frac{g^2\theta}{32\pi^2}\int d^4x_E\;\partial_\mu K^\mu = \frac{g^2\theta}{32\pi^2}\,\lim_{R\to\infty}\int_{S_{3,R}} dS_\mu\; K^\mu , \tag{4.150}$$

where $S_{3,R}$ is a 3-dimensional sphere of radius R and $dS_\mu$ the measure on this surface.
Let us now assume that the coloured objects of the problem are comprised in a
finite region of space-time, so that the gauge field configuration goes to a pure gauge
at infinity. Such a field can be written as

$$A_\mu(x) = a_\mu(x) + \frac{i}{g}\,\Omega^\dagger(\hat{x})\,\partial_\mu\Omega(\hat{x}) , \tag{4.151}$$

where $\Omega(\hat{x})$ is an element of the gauge group that depends only on the direction of
the vector $x^\mu$, and $a_\mu(x)$ is the deviation from the asymptotic pure gauge. For the
total field to be a pure gauge at infinity, this deviation must decrease faster than $|x|^{-1}$.
When |x| → +∞, $A_\nu(x)$ goes to 0 as $|x|^{-1}$, while $F_{\rho\sigma}(x)$ goes to 0 faster than $|x|^{-2}$
(since $A_\nu(x)$ goes to a pure gauge), and we have:

$$K^\mu \underset{|x|\to+\infty}{\longrightarrow} \frac{4ig}{3}\,\epsilon^{\mu\nu\rho\sigma}\,{\rm tr}\,\big(A_\nu A_\rho A_\sigma\big) \sim |x|^{-3} , \tag{4.152}$$
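and

$$\int d^4x_E\; \mathcal{L}_\theta = \frac{\theta}{24\pi^2}\,\lim_{R\to\infty}\int_{S_{3,R}} dS\;\hat{x}_\mu\,\epsilon^{\mu\nu\rho\sigma}\,{\rm tr}\,\Big(\Omega^\dagger(\partial_\nu\Omega)\,\Omega^\dagger(\partial_\rho\Omega)\,\Omega^\dagger(\partial_\sigma\Omega)\Big) , \tag{4.153}$$

where we have used $dS_\mu = \hat{x}_\mu\, dS$, with dS the element of area on the 3-sphere. Note
that the integrand decreases as $R^{-3}$ because of the three derivatives, while $dS \sim R^3$.
Therefore, the integral is in fact independent of the radius R and we can drop the limit:

$$\int d^4x_E\; \mathcal{L}_\theta = \frac{\theta}{24\pi^2}\int_{S_3} dS\;\hat{x}_\mu\,\epsilon^{\mu\nu\rho\sigma}\,{\rm tr}\,\Big(\Omega^\dagger(\partial_\nu\Omega)\,\Omega^\dagger(\partial_\rho\Omega)\,\Omega^\dagger(\partial_\sigma\Omega)\Big) . \tag{4.154}$$

Thus, the integral of the θ-term depends only on the function $\Omega(\hat{x})$, that maps the
3-dimensional sphere $S_3$ into the gauge group:

$$\Omega : S_3 \longmapsto G . \tag{4.155}$$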

It turns out that these mappings can be grouped in equivalence classes of Ω’s that
can be deformed continuously into one another. On the contrary, Ω’s that belong to
distinct classes cannot be related by a continuous deformation. The set of these classes
possesses a group structure, and is called the third homotopy group of G, denoted
$\pi_3(G)$. For all SU(N) groups with N ≥ 2, the third homotopy group is isomorphic
to $(\mathbb{Z}, +)$. The interpretation of eq. (4.154) is that the integral of the θ-term depends
only on the class to which Ω belongs, and is therefore a topological quantity that can
change only in discrete amounts. This discussion provides another point of view on
the Atiyah-Singer index theorem, where the same integral was related to the chirality
imbalance between the zero modes of the Euclidean Dirac operator in a background
gauge field.

4.6 Non-local gauge invariant operators

4.6.1 Two-fermion non-local operator


The discussion in the previous sections exhausts the local gauge invariant objects of
dimension less than or equal to 4. However, it is sometimes useful to construct gauge
invariant non-local operators, for instance in the definition of parton distributions.
The simplest operator of this type is an operator with two spinor fields at different
space-time positions, $\overline{\psi}(y)\, W(y, x)\, \psi(x)$. Since the transformation laws of the two
spinors involve different Ω's, such an operator is gauge invariant only if the object
W(y, x) between the spinors transforms as follows:

$$W(y, x) \to \Omega^\dagger(y)\; W(y, x)\; \Omega(x) . \tag{4.156}$$

4.6.2 Wilson lines


In order to construct such an object, let us define a path $\gamma^\mu(s)$ that goes from x to y,

$$\gamma^\mu(0) = x^\mu , \qquad \gamma^\mu(1) = y^\mu , \tag{4.157}$$

and consider the following differential equation

$$\frac{d\gamma^\mu}{ds}\,\Big[D_\mu(\gamma(s))\, W(s)\Big] \equiv \frac{dW}{ds} - ig\,\frac{d\gamma^\mu}{ds}\, A_\mu(\gamma(s))\, W(s) = 0 , \quad \text{with initial condition } W(0) = 1 , \tag{4.158}$$

where the notation $D_\mu(\gamma(s))$ indicates that the gauge field in the covariant derivative
$D_\mu = \partial_\mu - ig A_\mu$ must be evaluated at the point $\gamma^\mu(s)$. In other words, the covariant
derivative of W, projected along the tangent vector to the path $\gamma^\mu(s)$, is zero. From this
definition, it follows that W(s) is an element of the representation r of the gauge group
if $A_\mu$ is in the representation r of the algebra.

Note that when the gauge field $A_\mu$ is zero everywhere, then the solution is trivially
W(s) = 1. For a generic gauge field, the value of the solution18 at s = 1 is a property
of the path $\gamma^\mu$ and of the gauge potential $A_\mu$. This object, that we will denote as

$$W_{yx}[A; \gamma] \equiv W(1) , \tag{4.159}$$

is called a Wilson line. Let us now study how it changes under a gauge transformation
Ω. From the transformation law of the covariant derivative, the differential equation
that defines the transformed $W_\Omega(s)$ is

$$\frac{d\gamma^\mu}{ds}\,\Omega^\dagger(\gamma(s))\, D_\mu(\gamma(s))\,\Omega(\gamma(s))\, W_\Omega(s) = 0 , \quad \text{with initial condition } W_\Omega(0) = 1 . \tag{4.160}$$

If we define $Z(s) \equiv \Omega(\gamma(s))\, W_\Omega(s)$, this equation is equivalent to

$$\frac{d\gamma^\mu}{ds}\, D_\mu(\gamma(s))\, Z(s) = 0 , \quad \text{with initial condition } Z(0) = \Omega(x) . \tag{4.161}$$

Comparing this equation with the original equation (4.158), we obtain

$$Z(s) = W(s)\,\Omega(x) , \quad \text{i.e.} \quad W_\Omega(s) = \Omega^\dagger(\gamma(s))\; W(s)\; \Omega(x) . \tag{4.162}$$

Looking now at the point s = 1, we see that the Wilson line transforms as

$$W_{yx}[A; \gamma] \to \Omega^\dagger(y)\; W_{yx}[A; \gamma]\; \Omega(x) . \tag{4.163}$$

Thus, the Wilson line transforms precisely as we wanted in eq. (4.156), and we
conclude that the operator $\overline{\psi}(y)\, W_{yx}[A; \gamma]\, \psi(x)$ is gauge invariant. Note that the
Wilson line $W_{yx}[A; \gamma]$, solution of eq. (4.158) at s = 1, can also be written as a
path-ordered exponential,

$$W_{yx}[A; \gamma] = {\rm P}\,\exp\Big(ig\int_\gamma dx^\mu\, A_\mu(x)\Big) . \tag{4.164}$$

Although this compact notation is suggestive, it is often useful to return to the defining
differential equation (4.158).
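A discretized version of the path-ordered exponential (4.164) is an ordered product of small link factors, with later points of the path placed to the left. The sketch below does this for SU(2) in two dimensions, using the closed form $\exp(i\,c\cdot\sigma) = \cos|c| + i\,\sin|c|\;\hat{c}\cdot\sigma$. The gauge field is an arbitrary illustrative choice, not one discussed in the text, and every function name is hypothetical. It compares two paths with the same endpoints.

```python
import math

def su2_exp(c):
    """exp(i c.sigma) for a real 3-vector c, as a 2x2 complex matrix."""
    norm = math.sqrt(sum(x * x for x in c))
    cos_n = math.cos(norm)
    s = math.sin(norm) / norm if norm > 0 else 1.0
    c1, c2, c3 = (s * x for x in c)
    return [[cos_n + 1j * c3, 1j * c1 + c2],
            [1j * c1 - c2, cos_n - 1j * c3]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def example_field(x, y):
    """Illustrative field: colour components (A^1, A^2, A^3) of A_x and of A_y."""
    return (0.0, 0.0, -y), (x, 0.0, 0.0)

def zero_field(x, y):
    return (0.0, 0.0, 0.0), (0.0, 0.0, 0.0)

def wilson_line(path, field, g=1.0, steps=200):
    """Ordered product of links exp(i g dx^mu A_mu^a sigma^a / 2), A taken at midpoints."""
    w = [[1 + 0j, 0j], [0j, 1 + 0j]]
    for k in range(steps):
        (x0, y0), (x1, y1) = path(k / steps), path((k + 1) / steps)
        ax, ay = field((x0 + x1) / 2, (y0 + y1) / 2)
        coeffs = [0.5 * g * ((x1 - x0) * ax[a] + (y1 - y0) * ay[a]) for a in range(3)]
        w = matmul(su2_exp(coeffs), w)   # later points go to the left
    return w

def path_x_then_y(s):
    """From (0, 0) to (1, 1): first along x, then along y."""
    return (2 * s, 0.0) if s < 0.5 else (1.0, 2 * s - 1)

def path_y_then_x(s):
    """From (0, 0) to (1, 1): first along y, then along x."""
    return (0.0, 2 * s) if s < 0.5 else (2 * s - 1, 1.0)

def unitarity_deviation(w):
    return max(abs(sum(w[k][i].conjugate() * w[k][j] for k in range(2)) - (i == j))
               for i in range(2) for j in range(2))

def max_difference(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

w1 = wilson_line(path_x_then_y, example_field)
w2 = wilson_line(path_y_then_x, example_field)
print(unitarity_deviation(w1), unitarity_deviation(w2), max_difference(w1, w2))
```

For this field the two results are both unitary but differ from each other, and a vanishing field gives the identity on either path.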

4.6.3 Path dependence

By inserting a Wilson line between the points x and y, we can construct a gauge
invariant non-local operator $\overline{\psi}(y) \cdots \psi(x)$. However, in doing so, we have introduced
a path γ, for which there are infinitely many possible choices since only its endpoints
are fixed. It turns out that in general, the Wilson line depends on the path γ, i.e.

$$W_{yx}[A; \gamma] \neq W_{yx}[A; \gamma'] . \tag{4.165}$$

18 Note that if the initial condition is $W(0) = \Omega_0$ instead of 1, then the solution would be changed as
follows $W(s) \to W(s)\,\Omega_0$.

This implies that, although we may define gauge invariant non-local bilinear operators,
their definition is not unique and each choice of the path connecting the two points
leads to a different operator.

4.6.4 Case of pure gauge fields

When the gauge potential is a pure gauge field, there exists a function Ω(x) such that

$$A^\Omega_\mu(x) = \frac{i}{g}\,\Omega^\dagger(x)\,\partial_\mu\Omega(x) . \tag{4.166}$$

Since this field is a gauge transformation of the null field $A_\mu \equiv 0$, Wilson lines in
this pure gauge field are given by

$$W_{yx}[A^\Omega; \gamma] = \Omega^\dagger(y)\;\Omega(x) . \tag{4.167}$$

In other words, in a pure gauge field, the Wilson lines depend only on their endpoints,
but not on the path chosen to connect them. This is the only exception to the remark
of the previous paragraph.
Conversely, a gauge potential $A_\mu(x)$ in which the Wilson lines depend only on
the endpoints is a pure gauge. A function Ω(x) that gives this gauge potential through
eq. (4.166) can be constructed as a Wilson line from x to some arbitrary base point
$x_0$:

$$\Omega(x) = W_{x_0 x}[A; \gamma] . \tag{4.168}$$

(The path γ can be chosen arbitrarily.)
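The path independence of eq. (4.167) can be checked numerically in a simplified setting. The sketch below is not part of the original text: it assumes an Abelian U(1) gauge function Ω = exp(iθ) in two dimensions, for which eq. (4.166) reduces to $A_\mu = -\partial_\mu\theta/g$ and no path ordering is needed. The function `theta`, the coupling value `G` and the two paths are arbitrary illustrative choices.

```python
import cmath
import math

G = 1.3  # coupling g (arbitrary illustrative value)

def theta(x, y):
    """Phase of a hypothetical U(1) gauge function Omega = exp(i*theta)."""
    return math.sin(x) * y + 0.5 * x * x

def pure_gauge_A(x, y, h=1e-6):
    """A_mu = (i/g) Omega^dagger d_mu Omega = -(1/g) d_mu theta, by central differences."""
    a1 = -(theta(x + h, y) - theta(x - h, y)) / (2 * h * G)
    a2 = -(theta(x, y + h) - theta(x, y - h)) / (2 * h * G)
    return a1, a2

def wilson_line(path, n=2000):
    """exp(i g * integral of A.dx) along path(s), s in [0, 1], using the midpoint rule."""
    phase = 0.0
    for k in range(n):
        x0, y0 = path(k / n)
        x1, y1 = path((k + 1) / n)
        a1, a2 = pure_gauge_A(0.5 * (x0 + x1), 0.5 * (y0 + y1))
        phase += a1 * (x1 - x0) + a2 * (y1 - y0)
    return cmath.exp(1j * G * phase)

start, end = (0.0, 0.0), (1.0, 2.0)

def straight(s):
    return (start[0] + s * (end[0] - start[0]), start[1] + s * (end[1] - start[1]))

def curved(s):
    # Same endpoints as the straight path, plus a bump that vanishes at s = 0 and s = 1.
    x, y = straight(s)
    return (x + math.sin(math.pi * s), y - 0.7 * math.sin(math.pi * s))

# Omega^dagger(end) * Omega(start), the right-hand side of eq. (4.167)
expected = cmath.exp(-1j * (theta(*end) - theta(*start)))
w_straight = wilson_line(straight)
w_curved = wilson_line(curved)
```

Both `w_straight` and `w_curved` should agree with `expected` up to discretization errors, whereas a generic (non pure gauge) field would give two different phases.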

4.6.5 Wilson loops

A Wilson loop is a special kind of Wilson line, where the initial point and endpoint
are identical, x = y, and therefore the path γ is a closed loop:

$$ W[A;\gamma] = P\,\exp\Big( i g \oint_\gamma dx^\mu\, A_\mu(x) \Big) . \tag{4.169} $$

Note that they are a property of the closed loop γ, and do not depend on the choice of
the starting point x. Because they have identical endpoints, the trace of a Wilson loop
is gauge invariant. From the result of the previous paragraph, they are equal to the
identity in a pure gauge field, but they depend non-trivially on the path in a generic
gauge field (see footnote 19).
In Abelian gauge theories, the Wilson loop can be rewritten in terms of the integral
of the field strength $F_{\mu\nu}$ over a surface Σ of boundary γ, by using Stokes' theorem:

$$ \exp\Big( i g \oint_\gamma dx^\mu\, A_\mu(x) \Big) \underset{\text{Abelian}}{=} \exp\Big( i\,\frac{g}{2} \int_\Sigma dx^\mu \wedge dx^\nu\, F_{\mu\nu}(x) \Big) . \tag{4.170} $$
Generalizations of this formula to the non-Abelian case exist, that involve a path-
ordering in the left hand side (thus giving a Wilson loop) and a surface-ordering in
the right hand side. For infinitesimally small closed loops, a more direct connection
to the field strength may be established. Consider for instance a small square closed
path in the (12) plane,

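[Figure: square closed path γ of side a in the (12) plane, with one corner at the point x.]

The Wilson loop along this path may be approximated by

$$ \begin{aligned} W[A;\gamma] \approx{}& \exp\big( -i g a\, A_2(x + \tfrac{a}{2}\hat{\jmath}) \big)\, \exp\big( -i g a\, A_1(x + \tfrac{a}{2}\hat{\imath} + a\hat{\jmath}) \big) \\ &\times \exp\big( i g a\, A_2(x + a\hat{\imath} + \tfrac{a}{2}\hat{\jmath}) \big)\, \exp\big( i g a\, A_1(x + \tfrac{a}{2}\hat{\imath}) \big) , \end{aligned} \tag{4.171} $$

where we make an error of order $a^3$ on each of the Wilson lines at the edges of the
square. By expanding the exponentials, we obtain

$$ \begin{aligned} W[A;\gamma] &= 1 + i g a^2 \big( \partial_1 A_2(x) - \partial_2 A_1(x) \big) - g^2 a^2 \big( A_2(x) A_1(x) - A_1(x) A_2(x) \big) + O(a^3) \\ &= 1 + i g\, a^2\, F_{12}(x) + O(a^3) . \end{aligned} \tag{4.172} $$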
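As a numerical illustration of eqs. (4.171) and (4.172), the sketch below evaluates the four-link product for an SU(2) field and compares it with $1 + i g a^2 F_{12}$. It is not taken from the original text and rests on simplifying assumptions: the fields are constant, $A_1 = c_1\sigma_1/2$ and $A_2 = c_2\sigma_2/2$ (so that only the commutator term of $F_{12}$ survives, $F_{12} = -ig[A_1,A_2] = g c_1 c_2 \sigma_3/2$), and all function names and numerical values are illustrative.

```python
import math

# 2x2 complex matrices stored as nested tuples; identity and Pauli matrices
I2 = ((1, 0), (0, 1))
S1 = ((0, 1), (1, 0))
S2 = ((0, -1j), (1j, 0))
S3 = ((1, 0), (0, -1))

def matmul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def lincomb(ca, a, cb, b):
    return tuple(tuple(ca * a[i][j] + cb * b[i][j] for j in range(2)) for i in range(2))

def exp_i_pauli(alpha, sigma):
    """exp(i*alpha*sigma) = cos(alpha) 1 + i sin(alpha) sigma, valid because sigma^2 = 1."""
    return lincomb(math.cos(alpha), I2, 1j * math.sin(alpha), sigma)

def plaquette(g, a, c1, c2):
    """Product of eq. (4.171) for constant fields A_1 = c1*sigma_1/2, A_2 = c2*sigma_2/2."""
    up1 = exp_i_pauli(g * a * c1 / 2, S1)    # exp(+i g a A_1), bottom edge
    up2 = exp_i_pauli(g * a * c2 / 2, S2)    # exp(+i g a A_2), right edge
    dn1 = exp_i_pauli(-g * a * c1 / 2, S1)   # exp(-i g a A_1), top edge
    dn2 = exp_i_pauli(-g * a * c2 / 2, S2)   # exp(-i g a A_2), left edge
    return matmul(dn2, matmul(dn1, matmul(up2, up1)))

def leading_order(g, a, c1, c2):
    """1 + i g a^2 F_12 with F_12 = -i g [A_1, A_2] = g c1 c2 sigma_3 / 2."""
    return lincomb(1, I2, 1j * g * a ** 2 * (g * c1 * c2 / 2.0), S3)

def max_diff(m, n):
    return max(abs(m[i][j] - n[i][j]) for i in range(2) for j in range(2))

g, c1, c2 = 1.0, 0.7, 1.3
err_a = max_diff(plaquette(g, 0.01, c1, c2), leading_order(g, 0.01, c1, c2))
err_half = max_diff(plaquette(g, 0.005, c1, c2), leading_order(g, 0.005, c1, c2))
```

Halving the lattice spacing a should reduce the mismatch by roughly a factor 8, consistent with the $O(a^3)$ remainder quoted in eq. (4.172).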
Thus, the first non-trivial correction to a small Wilson loop is the area of the loop times
the field strength projected on the plane of the loop. Since W[A; γ] is an element of
the representation r of the group, in the vicinity of the identity, it may be represented
as

$$ \begin{aligned} W[A;\gamma] &= \exp\Big( i \big( \epsilon\, \alpha^a t^a_r + \epsilon^2 \beta^a t^a_r + O(\epsilon^3) \big) \Big) \\ &= 1_r + i \big( \epsilon\, \alpha^a t^a_r + \epsilon^2 \beta^a t^a_r \big) - \frac{\epsilon^2}{2}\, \alpha^a \alpha^b\, t^a_r t^b_r + O(\epsilon^3) , \end{aligned} \tag{4.173} $$

where ε is an infinitesimal parameter quantifying how close W[A; γ] is to the
identity, and $1_r$ is the identity matrix in the representation r. Comparing eqs. (4.172)
and (4.173), we see that we must identify $\epsilon \equiv g a^2$ and $\alpha^a \equiv F^a_{12}(x)$. The formula
(4.172) is insufficient in order to determine $\beta^a$, since this term gives a contribution
of order $a^4$ in the Wilson loop. But we can nevertheless use eq. (4.173) in order to
determine the lowest order correction to the trace of the Wilson loop,

$$ \mathrm{tr}\,\big( W[A;\gamma] \big) = \mathrm{tr}\,(1_r) - \frac{g^2 a^4}{2}\, F^a_{12}(x)\, F^b_{12}(x)\, \mathrm{tr}\,\big( t^a_r t^b_r \big) + O(a^6) , \tag{4.174} $$

where we have used the fact that the generators $t^a_r$ are traceless for the su(N) algebra.
Eq. (4.174) is the basis of the discretization of the Yang-Mills action, the first step in
the formulation of lattice gauge theories.

Footnote 19: Wilson loops are extensively used in lattice gauge theories. Moreover, Giles' theorem states that all
the gauge invariant information contained in a gauge potential $A_\mu$ can be reconstructed from the trace of
Wilson loops (assuming we know Wilson loops for arbitrary loops).

4.6.6 Wilson lines and eikonal scattering

Wilson lines also appear in the high energy limit of scattering by an external potential,
known as the eikonal limit. Consider the following S-matrix element,

$$ S_{\beta\alpha} \equiv \big\langle \beta_{\rm out} \big| \alpha_{\rm in} \big\rangle = \big\langle \beta_{\rm in} \big|\, U(+\infty,-\infty)\, \big| \alpha_{\rm in} \big\rangle , \tag{4.175} $$

for the transition between two arbitrary states made of quarks, antiquarks and gluons,
α and β. In the second equality, U(+∞, −∞) is the evolution operator from the
initial to the final state. It can be expressed as the time ordered exponential of the
interaction part of the Lagrangian,

$$ U(+\infty,-\infty) = T \exp\Big[ i \int d^4x\; \mathcal{L}_I\big( \phi_{\rm in}(x) \big) \Big] , \tag{4.176} $$

where $\phi_{\rm in}$ denotes generically the fields in the interaction picture. In this discussion,
$\mathcal{L}_I$ contains both the self-interactions of the fields, and their interaction with the
external field. Consider now the high energy limit of this scattering amplitude,

$$ S^{(\infty)}_{\beta\alpha} \equiv \lim_{\omega\to+\infty} \big\langle \beta_{\rm in} \big|\, e^{-i\omega K^3}\, U(+\infty,-\infty)\, \underbrace{e^{+i\omega K^3} \big| \alpha_{\rm in} \big\rangle}_{\text{boosted state}} \tag{4.177} $$

where $K^3$ is the generator of Lorentz boosts in the +z direction.


Before doing any calculation, a simple argument can help understand what happens
in this limit. Quite generally, scattering amplitudes are proportional to the
overlap in space-time between the wavefunctions of the two colliding objects. In
the present case, it should scale as the time spent by the incoming state in the region
occupied by the external field. This duration is inversely proportional to the energy of
the incoming state, and goes to zero in the limit ω → +∞. If the interaction between
the projectile and the external field were via a scalar exchange, then the conclusion
would be that the scattering amplitude vanishes in the high energy limit (in other
words, S-matrix elements would go to unity). However, interactions with a colour
field involve a vector exchange, i.e. the external field couples to a four-vector $J^\mu$ that
represents the colour current carried by the projectile, by a term of the form $A_\mu J^\mu$. At
high energy, the longitudinal component of this four-vector increases proportionally
to the energy, and compensates the small time spent in the interaction zone. Thus, for
states that interact via a vector exchange (see footnote 20), we expect that scattering
amplitudes have a finite high energy limit (neither zero nor infinite).

Footnote 20: By the same reasoning, gravitational interactions, that involve a spin two exchange, would lead to
scattering amplitudes that grow linearly with energy.

This calculation is best done using light-cone coordinates. For any four-vector
$a^\mu$, one defines

$$ a^+ \equiv \frac{a^0 + a^3}{\sqrt{2}} , \qquad a^- \equiv \frac{a^0 - a^3}{\sqrt{2}} . \tag{4.178} $$

These coordinates satisfy the following formulas,

$$ \begin{aligned} x \cdot y &= x^+ y^- + x^- y^+ - x_\perp \cdot y_\perp \\ d^4x &= dx^+\, dx^-\, d^2x_\perp \\ \square &= 2\,\partial^+ \partial^- - \nabla_\perp^2 \quad \text{with} \quad \partial^+ \equiv \frac{\partial}{\partial x^-} , \;\; \partial^- \equiv \frac{\partial}{\partial x^+} . \end{aligned} \tag{4.179} $$

Note also that the non-zero components of the metric tensor are

$$ g_{+-} = g_{-+} = 1 , \qquad g_{11} = g_{22} = -1 . \tag{4.180} $$

For a highly boosted projectile in the +z direction, $x^+$ plays the role of the time,
and the Hamiltonian is the $P^-$ component of the momentum. The generator of
longitudinal boosts in light-cone coordinates is

$$ K^3 = M^{+-} . \tag{4.181} $$

Using the commutation relations of the Poincaré algebra, this leads to the following
identities:

$$ \begin{aligned} e^{-i\omega K^3}\, P^-\, e^{i\omega K^3} &= e^{-\omega}\, P^- \\ e^{-i\omega K^3}\, P^+\, e^{i\omega K^3} &= e^{+\omega}\, P^+ \\ e^{-i\omega K^3}\, P^j\, e^{i\omega K^3} &= P^j . \end{aligned} \tag{4.182} $$

They express the fact that, under longitudinal boosts, the components $P^\pm$ of a four-vector
are simply rescaled, while the transverse components are left unchanged.
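The rescaling of the ± components can be illustrated at the level of ordinary four-vector components (rather than operators). The following sketch is an illustration added here, with arbitrary numerical values and hypothetical helper names; it checks the scalar product formula of eq. (4.179) and the fact that a boost of rapidity ω along +z multiplies the + component by $e^{\omega}$ and the − component by $e^{-\omega}$.

```python
import math

SQRT2 = math.sqrt(2.0)

def boost_z(v, omega):
    """Boost the four-vector v = (v0, v1, v2, v3) along +z with rapidity omega."""
    v0, v1, v2, v3 = v
    ch, sh = math.cosh(omega), math.sinh(omega)
    return (ch * v0 + sh * v3, v1, v2, sh * v0 + ch * v3)

def light_cone(v):
    """Return (v+, v-, v1, v2), following the definition of eq. (4.178)."""
    v0, v1, v2, v3 = v
    return ((v0 + v3) / SQRT2, (v0 - v3) / SQRT2, v1, v2)

def dot_cartesian(x, y):
    """Minkowski product with signature (+, -, -, -)."""
    return x[0] * y[0] - x[1] * y[1] - x[2] * y[2] - x[3] * y[3]

def dot_light_cone(x, y):
    """x.y = x+ y- + x- y+ - x_perp . y_perp, first line of eq. (4.179)."""
    xp, xm, x1, x2 = light_cone(x)
    yp, ym, y1, y2 = light_cone(y)
    return xp * ym + xm * yp - x1 * y1 - x2 * y2

p = (5.0, 0.3, -1.2, 2.0)   # arbitrary four-vectors
q = (3.0, 1.0, 0.4, -0.5)
omega = 1.7                 # arbitrary rapidity

p_plus, p_minus, _, _ = light_cone(p)
b_plus, b_minus, b1, b2 = light_cone(boost_z(p, omega))
```

The product $p^+ p^-$ is therefore boost invariant, which is why only the ratio of the two light-cone components carries information about the longitudinal frame.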
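Likewise, states, creation operators and field operators are transformed as follows,

$$ \begin{aligned} e^{i\omega K^3}\, \big| p \cdots {}_{\rm in} \big\rangle &= \big| (e^{\omega} p^+, p_\perp) \cdots {}_{\rm in} \big\rangle \\ e^{i\omega K^3}\, a^\dagger_{\rm in}(q)\, e^{-i\omega K^3} &= a^\dagger_{\rm in}(e^{\omega} q^+, e^{-\omega} q^-, q_\perp) \\ e^{i\omega K^3}\, \phi_{\rm in}(x)\, e^{-i\omega K^3} &= \phi_{\rm in}(e^{-\omega} x^+, e^{\omega} x^-, x_\perp) . \end{aligned} \tag{4.183} $$

Note that the last equation is valid only for a scalar field, or for the transverse
components of a vector field. In addition, the ± components of a vector field receive
an overall rescaling by a factor $e^{\pm\omega}$. Moreover, since a longitudinal boost does not
alter the time ordering, we can also write

$$ e^{-i\omega K^3}\, U(+\infty,-\infty)\, e^{i\omega K^3} = T \exp\, i \int d^4x\; \mathcal{L}_I\big( e^{-i\omega K^3}\, \phi_{\rm in}(x)\, e^{i\omega K^3} \big) . \tag{4.184} $$

The components of the vector current that couples to the target field transform as

$$ \begin{aligned} e^{-i\omega K^3}\, J^i(x)\, e^{i\omega K^3} &= J^i(e^{-\omega} x^+, e^{\omega} x^-, x_\perp) \\ e^{-i\omega K^3}\, J^-(x)\, e^{i\omega K^3} &= e^{-\omega}\, J^-(e^{-\omega} x^+, e^{\omega} x^-, x_\perp) \\ e^{-i\omega K^3}\, J^+(x)\, e^{i\omega K^3} &= e^{\omega}\, J^+(e^{-\omega} x^+, e^{\omega} x^-, x_\perp) . \end{aligned} \tag{4.185} $$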

Naturally, the target field Aµ does not change when we boost the projectile. For
simplicity, let us assume that Aµ is confined in the region −L ≤ x+ ≤ +L. We can
thus split the evolution operator into three factors,

U(+∞, −∞) = U(+∞, +L) U(+L, −L) U(−L, −∞) . (4.186)

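The factors U(+∞, +L) and U(−L, −∞) do not contain the external potential. For
these two factors, the change of variables $e^{-\omega} x^+ \to x^+$, $e^{\omega} x^- \to x^-$ leads to

$$ \begin{aligned} \lim_{\omega\to+\infty} e^{-i\omega K^3}\, U(+\infty,+L)\, e^{i\omega K^3} &= U_0(+\infty, 0) \\ \lim_{\omega\to+\infty} e^{-i\omega K^3}\, U(-L,-\infty)\, e^{i\omega K^3} &= U_0(0, -\infty) , \end{aligned} \tag{4.187} $$

where $U_0$ is the same as U, but defined with the self-interactions only (since these
two factors correspond to the evolution of the projectile while outside of the target
field). For the factor U(+L, −L), the change $e^{\omega} x^- \to x^-$ gives

$$ \lim_{\omega\to+\infty} e^{-i\omega K^3}\, U(+L,-L)\, e^{i\omega K^3} = \exp\Big[ i \int d^2x_\perp\; \chi(x_\perp)\, \rho(x_\perp) \Big] , \tag{4.188} $$

with

$$ \chi(x_\perp) \equiv \int dx^+\; A^-(x^+, 0, x_\perp) , \qquad \rho(x_\perp) \equiv \int dx^-\; J^+(0, x^-, x_\perp) . \tag{4.189} $$

Thus, the high-energy limit of the scattering amplitude is

$$ S^{(\infty)}_{\beta\alpha} = \big\langle \beta_{\rm in} \big|\, U_0(+\infty, 0)\, \exp\Big[ i \int d^2x_\perp\; \chi(x_\perp)\, \rho(x_\perp) \Big]\, U_0(0, -\infty)\, \big| \alpha_{\rm in} \big\rangle . \tag{4.190} $$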

This formula is an exact result in the limit ω → +∞. One may also note the following
important properties:

• Only the $A^-$ component of the external vector potential, integrated along the
trajectory of the projectile, matters.

• The self-interactions and the interactions with the external potential are
factorized into three separate factors; this is a generic property of high energy
scattering. The role of the longitudinal boost in this factorization is illustrated
in figure 4.4.

Figure 4.4: Illustration of the role of kinematics in the factorization of eq. (4.190).
Left: before the boost is applied, quantum fluctuations of the incoming projectile
may occur in the region of the external field. Right: after the boost, the region of
the external field shrinks due to Lorentz contraction (in the frame of the projectile),
and the effect of quantum fluctuations inside this region goes to zero.

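Eq. (4.190) is an operator formula that still contains the self-interactions of the fields
to all orders. In order to evaluate it, one must insert the identity operator written as a
sum over a complete set of states on each side of the exponential,

$$ \begin{aligned} S^{(\infty)}_{\beta\alpha} = \sum_{\gamma,\delta}\; & \big\langle \beta_{\rm in} \big|\, U_0(+\infty, 0)\, \big| \gamma_{\rm in} \big\rangle \\ &\times \big\langle \gamma_{\rm in} \big|\, \exp\Big[ i \int d^2x_\perp\; \chi(x_\perp)\, \rho(x_\perp) \Big]\, \big| \delta_{\rm in} \big\rangle \\ &\times \big\langle \delta_{\rm in} \big|\, U_0(0, -\infty)\, \big| \alpha_{\rm in} \big\rangle . \end{aligned} \tag{4.191} $$

The factor

$$ \sum_{\delta}\; \big| \delta_{\rm in} \big\rangle \big\langle \delta_{\rm in} \big|\, U_0(0, -\infty)\, \big| \alpha_{\rm in} \big\rangle \tag{4.192} $$

is the Fock expansion of the initial state: it accounts for the fact that the state α
prepared at $x^+ = -\infty$ may have fluctuated into another state δ before it interacts
with the external potential. The matrix elements of $U_0$ that appear in this expansion
can be calculated perturbatively to any desired order. There is a similar factor for the
final state evolution.
The interactions with the external field are in the central factor, $\langle \gamma_{\rm in} | \exp[\cdots] | \delta_{\rm in} \rangle$.
In order to rewrite it into a more intuitive form, let us first rewrite the operator ρ in
terms of creation and annihilation operators. For instance, the fermionic part of the
current gives

$$ \begin{aligned} \rho^a(x_\perp) = g \int \frac{dp^+}{4\pi p^+}\, \frac{d^2p_\perp}{(2\pi)^2}\, \frac{d^2q_\perp}{(2\pi)^2}\; \big( t^a_f \big)_{ij}\, \Big\{ & b^\dagger_{si}(p^+, p_\perp)\, b_{sj}(p^+, q_\perp)\, e^{i(p_\perp - q_\perp)\cdot x_\perp} \\ & - d^\dagger_{si}(p^+, p_\perp)\, d_{sj}(p^+, q_\perp)\, e^{-i(p_\perp - q_\perp)\cdot x_\perp} \Big\} , \end{aligned} \tag{4.193} $$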

where the $t^a_f$ are the generators of the fundamental representation of the su(N)
algebra and $b, d, b^\dagger, d^\dagger$ are the annihilation and creation operators for quarks and
antiquarks. $\rho^a$ also receives a contribution from gluons, not written here, obtained
with the generators in the adjoint representation and the annihilation and creation
operators for gluons instead. This formula captures the essence of eikonal scattering:

• Each annihilation operator has a matching creation operator; therefore, the
number of quarks and gluons in the state does not change during the scattering,
nor their flavour.

• The $p^+$ components of the momenta are not affected by the scattering.

• The spins are unchanged during the scattering.

• The colours and transverse momenta of the constituents of the state may change
during the scattering.

Scattering amplitudes in the eikonal limit take a very simple form if one trades
transverse momentum for a transverse position by a Fourier transform. For each
intermediate state $|\delta_{\rm in}\rangle \equiv |\{k_i^+, k_{i\perp}\}\rangle$, we first define the corresponding light-cone
wave function by:

$$ \Psi_{\delta\alpha}(\{k_i^+, x_{i\perp}\}) \equiv \prod_{i\in\delta} \int \frac{d^2k_{i\perp}}{(2\pi)^2}\; e^{-i k_{i\perp}\cdot x_{i\perp}}\; \big\langle \delta_{\rm in} \big|\, U_0(0, -\infty)\, \big| \alpha_{\rm in} \big\rangle , \tag{4.194} $$

where the index i runs over all the constituents of the state δ. Then, each charged
particle going through the external field acquires an SU(N) phase that depends on
the representation in which it lives,

$$ \begin{aligned} \Psi_{\delta\alpha}(\{k_i^+, x_{i\perp}\}) &\;\longrightarrow\; \Psi_{\delta\alpha}(\{k_i^+, x_{i\perp}\})\, \prod_{i\in\delta} U_i(x_\perp) \\ U_i(x_\perp) &\equiv T \exp\Big[ i g \int dx^+\; A^-_a(x^+, 0, x_{i\perp})\; t^a_{r_i} \Big] , \end{aligned} \tag{4.195} $$

where $r_i$ is the representation corresponding to the constituent i. We recognize in
this formula Wilson lines defined on the light-cone direction that corresponds to the
boosted projectile. The simplicity of this result is entirely due to kinematics: thanks
to the longitudinal boost, the external field is crossed in an infinitesimally short time,
during which the transverse positions of the incoming quanta cannot vary.
Chapter 5

Quantization of Yang-Mills theory

5.1 Introduction

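Generically, the Lagrangian density of a non-Abelian gauge theory reads:

$$ \begin{aligned} \mathcal{L} \equiv{}& \big( D_\mu \phi(x) \big)^\dagger \big( D^\mu \phi(x) \big) - m^2\, \phi^\dagger(x) \phi(x) - V\big( \phi^\dagger(x) \phi(x) \big) \\ &+ \overline{\psi}(x) \big( i \slashed{D}_x - m \big) \psi(x) \\ &- \tfrac{1}{4}\, F^a_{\mu\nu}(x)\, F^{a\,\mu\nu}(x) . \end{aligned} \tag{5.1} $$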

The local non-Abelian gauge invariance of this Lagrangian does not change anything
in the quantization of the scalar field φ and of the spinor ψ, for which we may use
the standard canonical or path integral approaches, with the result that the usual
Feynman rules still apply. The main complication resides in the pure Yang-Mills part
(third term) of this Lagrangian, i.e. with the quantization of the gauge potential Aµ .
The identification of the degrees of freedom that are made redundant by the gauge
symmetry is much more complicated than in QED, and a lot more care is necessary
in order to isolate the genuine dynamical variables of the theory.
In order to get a sense of the difficulty, let us try to mimic the QED case in order
to guess the Feynman rules for non-Abelian gauge fields. Using the explicit form of
the field strength,

$$ F^a_{\mu\nu} = \partial_\mu A^a_\nu - \partial_\nu A^a_\mu + g\, f^{abc}\, A^b_\mu A^c_\nu , \tag{5.2} $$

we can rewrite the Yang-Mills Lagrangian as follows

$$ \begin{aligned} \mathcal{L}_A ={}& \tfrac{1}{2}\, A^a_\mu \big( g^{\mu\nu}\, \square - \partial^\mu \partial^\nu \big) A^a_\nu \\ &- g\, f^{abc}\, \big( \partial_\mu A^a_\nu \big)\, A^{b\,\mu} A^{c\,\nu} \\ &- \tfrac{1}{4}\, g^2\, f^{abc} f^{ade}\, A^b_\mu A^c_\nu\, A^{d\,\mu} A^{e\,\nu} , \end{aligned} \tag{5.3} $$

where we have anticipated an integration by parts in the first (kinetic) term. Note
that the kinetic term is formally identical to the kinetic term of a photon, except for
the colour index a carried by the gauge potential. Therefore, one may be tempted
to generalize the QED Feynman rules to a non-Abelian gauge boson. As in the
QED case, the quadratic part of the Lagrangian (5.3) poses a difficulty when trying
to determine the free propagator, because the operator between $A_\mu \cdots A_\nu$ is not
invertible. If we take for granted that a similar gauge fixing procedure (more on this
later, as this is in fact the heart of the problem) can be applied here, we may assume
that the free gauge boson propagator (see footnote 1) in Feynman gauge is

$$ G^{0\,\mu\nu}_{F\,ab}(p) = \big[ \text{gluon line, momentum } p \big] = \frac{-i\, g^{\mu\nu}\, \delta_{ab}}{p^2 + i0^+} , \tag{5.4} $$

and one may read off directly from the Lagrangian (5.3) the following 3-gluon and
4-gluon vertices:

$$ \big[ \text{3-gluon vertex, legs } (a\mu, k),\, (b\nu, p),\, (c\rho, q) \big] = g\, f^{abc}\, \Big\{ g^{\mu\nu} (k - p)^\rho + g^{\nu\rho} (p - q)^\mu + g^{\rho\mu} (q - k)^\nu \Big\} \tag{5.5} $$

$$ \begin{aligned} \big[ \text{4-gluon vertex, legs } a\mu,\, b\nu,\, c\rho,\, d\sigma \big] = -i\, g^2\, \Big\{ & f^{abe} f^{cde}\, ( g^{\mu\rho} g^{\nu\sigma} - g^{\mu\sigma} g^{\nu\rho} ) \\ & + f^{ace} f^{bde}\, ( g^{\mu\nu} g^{\rho\sigma} - g^{\mu\sigma} g^{\nu\rho} ) \\ & + f^{ade} f^{bce}\, ( g^{\mu\nu} g^{\rho\sigma} - g^{\mu\rho} g^{\nu\sigma} ) \Big\} \end{aligned} \tag{5.6} $$

All this seems fine, except for a rather subtle problem that would appear when
using this perturbation theory: these Feynman rules lead to amplitudes that do not
fulfill Ward identities, even when all the external coloured particles are on their mass-
shell. From the discussion of perturbative unitarity for amplitudes with external gauge
bosons in section 1.16.4, the lack of Ward identities seems to imply a violation of unitarity in
perturbation theory. Since unitarity is one of the cornerstones of any quantum theory,
this is not a conclusion we are ready to accept, and we must conclude that something
is missing in the above Feynman rules.

Footnote 1: In this chapter, we use the diagrammatic convention of QCD, where the gauge bosons (gluons) are
represented as springs in Feynman diagrams. In the electroweak theory, it is more common to represent
them as wavy lines, like the photon in QED.

5.2 Gauge fixing


In our naive attempt to guess the Feynman rules appropriate for non-Abelian gauge
bosons, we have implicitly assumed that the gauge fixing works in the same way as in
QED, namely that the gauge fixing trivially leads to the factorization of an infinite
factor in the path integral, with no other change to the degrees of freedom that are not
constrained by the gauge condition. It turns out that this assumption is incorrect. Let
us start from the path integral representation of the expectation value of some gauge
invariant operator $O(A_\mu)$:

$$ \big\langle O \big\rangle \equiv \int \big[ DA^a_\mu(x) \big]\; O(A_\mu)\; \exp\Big\{ i \underbrace{\int d^4x\, \Big( -\frac{1}{4}\, F^a_{\mu\nu} F^{a\,\mu\nu} \Big)}_{S_{\rm YM}[A_\mu]} \Big\} . \tag{5.7} $$

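Local gauge transformations of the field $A_\mu$,

$$ A_\mu(x) \;\to\; A^\Omega_\mu(x) \equiv \Omega^\dagger(x)\, A_\mu(x)\, \Omega(x) + \frac{i}{g}\, \Omega^\dagger(x)\, \partial_\mu \Omega(x) , \tag{5.8} $$

leave the action and the observable unchanged. Moreover, the functional measure is
also invariant, since

$$ \big[ DA^{\Omega\,a}_\mu(x) \big] = \big[ DA^a_\mu(x) \big]\; \det\Big( \frac{\delta A^{\Omega\,a}_\mu(x)}{\delta A^b_\nu(y)} \Big) , \tag{5.9} $$

where the determinant is the Jacobian of the change of coordinates. Using eq. (4.37),
this determinant can be rewritten as follows

$$ \det\Big( \frac{\delta A^{\Omega\,a}_\mu(x)}{\delta A^b_\nu(y)} \Big) = \det\Big( \delta_\mu{}^\nu\, \delta(x - y)\, \big[ \Omega_{\rm adj}(x) \big]_{ab} \Big) = 1 , \tag{5.10} $$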

since the group element $\Omega_{\rm adj}$ is a unitary matrix. Therefore, there is a large amount of
redundancy in the above path integral, and it is in fact infinite. By applying a gauge
transformation, each field configuration $A_\mu$ develops into a gauge orbit (see figure
5.1), along which the physics is invariant. In order to eliminate this redundancy, we
would like to impose a condition at every space-time point x on the gauge fields,

$$ G^a\big( A_\mu(x) \big) = 0 , \tag{5.11} $$

in order to select a unique (see footnote 2) field configuration along each orbit. Geometrically, the
gauge condition (5.11) defines a manifold that intersects each orbit, as shown in
figure 5.1, and we choose this intersection as the representative of this field
configuration.

Figure 5.1: Illustration of the gauge fixing procedure. The lines represent the
gauge field configurations spanned when varying Ω (the gauge orbits). The shaded surface is the
manifold where the gauge condition $G(A_\mu) = 0$ is satisfied, and the black dots are the gauge-
fixed field configurations.

Footnote 2: It turns out that this is not possible, due to the Gribov ambiguity: all gauge conditions of the form
(5.11) have several solutions, called Gribov copies. However, only one of these solutions is a “small field”,
while the others are proportional to the inverse coupling $g^{-1}$. Since perturbation theory is an expansion
around the vacuum (i.e. in the small field regime), these non-perturbatively large copies do not play any
role in perturbation theory.

5.3 Faddeev-Popov quantization and Ghost fields

Thus, we would like to split the integration measure in eq. (5.7) into a physical
component in the manifold G(A) = 0, and a component along the gauge orbits that
we should factor out. Unfortunately, achieving this in a non-Abelian gauge theory is
far more complicated than in QED, because the modification of the gauge potential
under a gauge transformation is non-linear. In order to see the difficulty, let us define

$$ \Delta^{-1}[A_\mu] \equiv \int \big[ D\Omega(x) \big]\; \delta\big[ G^a(A^\Omega_\mu) \big] . \tag{5.12} $$

$\Delta[A_\mu]$ is the determinant of the derivative of the constraint $G(A_\mu)$ with respect to
the gauge transformation Ω, at the point where $G(A_\mu) = 0$,

$$ \Delta[A_\mu] = \det\Big( \frac{\delta G^a}{\delta \Omega} \Big)_{G^a(A^\Omega_\mu) = 0} . \tag{5.13} $$

In QED, for linear gauge fixing conditions, this derivative (and therefore the determinant)
is independent of the gauge field, and can be trivially factored out of the path
integral. This is not the case in non-Abelian gauge theories, and this determinant
is the source of significant complications. One can first prove that the determinant
$\Delta[A_\mu]$ is gauge invariant. Indeed, changing $A_\mu \to A^\Theta_\mu$, we have:

$$ \begin{aligned} \Delta^{-1}[A^\Theta_\mu] &= \int \big[ D\Omega(x) \big]\; \delta\big[ G^a(A^{\Theta\Omega}_\mu) \big] \qquad (\Omega' \equiv \Theta\Omega) \\ &= \int \big[ D(\Theta^\dagger(x)\, \Omega'(x)) \big]\; \delta\big[ G^a(A^{\Omega'}_\mu) \big] \\ &= \int \big[ D\Omega'(x) \big]\; \delta\big[ G^a(A^{\Omega'}_\mu) \big] = \Delta^{-1}[A_\mu] . \end{aligned} \tag{5.14} $$

Here, we have used the fact that there exists a group invariant integration measure on
a Lie group. By inserting

$$ 1 = \Delta[A_\mu] \int \big[ D\Omega(x) \big]\; \delta\big[ G^a(A^\Omega_\mu) \big] \tag{5.15} $$

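inside the path integral (5.7), we obtain

$$ \big\langle O \big\rangle = \int \big[ D\Omega(x) \big] \int \big[ DA^a_\mu(x) \big]\; \Delta[A_\mu]\; \delta\big[ G^a(A^\Omega_\mu) \big]\; O(A_\mu)\; e^{i S_{\rm YM}[A_\mu]} . \tag{5.16} $$

Now, we change the integration variable of the second integral according to
$A_\mu \to A^{\Omega^\dagger}_\mu$. In this transformation, the measure $[DA_\mu]$, the Yang-Mills action $S_{\rm YM}[A_\mu]$,
the observable $O(A_\mu)$ and the determinant $\Delta[A_\mu]$ are all unchanged (because they
are gauge invariant):

$$ \begin{aligned} \big[ DA^{\Omega^\dagger}_\mu \big] &= \big[ DA_\mu \big] , \\ S_{\rm YM}[A^{\Omega^\dagger}_\mu] &= S_{\rm YM}[A_\mu] , \\ O[A^{\Omega^\dagger}_\mu] &= O[A_\mu] , \\ \Delta[A^{\Omega^\dagger}_\mu] &= \Delta[A_\mu] , \end{aligned} \tag{5.17} $$

while the field $A^\Omega_\mu$ becomes $A_\mu$. Therefore, we have

$$ \big\langle O \big\rangle = \int \big[ D\Omega(x) \big] \int \big[ DA^a_\mu(x) \big]\; \Delta[A_\mu]\; \delta\big[ G^a(A_\mu) \big]\; O(A_\mu)\; e^{i S_{\rm YM}[A_\mu]} . \tag{5.18} $$

At this point, the second integral does not contain the gauge transformation Ω anymore,
and therefore we have managed to factorize the “integral along the orbits” in the
form of the first integral over $[D\Omega]$. Dropping this constant factor, we can therefore
write an integral free of any redundancy:

$$ \big\langle O \big\rangle = \int \big[ DA^a_\mu(x) \big]\; \Delta[A_\mu]\; \delta\big[ G^a(A_\mu) \big]\; O(A_\mu)\; e^{i S_{\rm YM}[A_\mu]} . \tag{5.19} $$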
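The logic leading from eq. (5.12) to eq. (5.19) can be illustrated with a finite-dimensional toy model, which is an addition to the text and not the author's construction. Assume the "field" is a point x of the plane, the "gauge group" is the group of rotations (orbit volume 2π), and the "gauge invariant" integrand depends only on the radius r. With the gauge condition $G(x) = x_2 = 0$, the analogue of eq. (5.12) gives $\Delta^{-1} = 2/r$, because each orbit crosses the line $x_2 = 0$ twice (a toy analogue of the copies mentioned in footnote 2), hence $\Delta = r/2$. All names below are hypothetical, and the delta function is replaced by a narrow Gaussian.

```python
import math

def inv_delta(r, eps=0.01, n=200_000):
    """Toy analogue of eq. (5.12): integral over the orbit of delta(G(x^theta)),
    with G(x) = x_2, x^theta = r (cos theta, sin theta), and the delta function
    approximated by a Gaussian of width eps."""
    norm = 1.0 / (eps * math.sqrt(2.0 * math.pi))
    dtheta = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        g_val = r * math.sin((k + 0.5) * dtheta)
        total += norm * math.exp(-0.5 * (g_val / eps) ** 2)
    return total * dtheta

def f(r):
    """A 'gauge invariant' integrand: it depends on the orbit (the radius) only."""
    return math.exp(-r * r)

def gauge_fixed_integral(xmax=8.0, n=200_000):
    """Toy analogue of eq. (5.19): integral of Delta(x) delta(x_2) f over the plane.
    The delta function removes x_2, leaving a 1D integral with Delta = |x_1| / 2."""
    dx = 2.0 * xmax / n
    total = 0.0
    for k in range(n):
        x1 = -xmax + (k + 0.5) * dx
        total += 0.5 * abs(x1) * f(abs(x1))
    return total * dx

orbit_volume = 2.0 * math.pi   # the factor dropped between eqs. (5.18) and (5.19)
full_integral = math.pi        # exact integral of exp(-r^2) over the plane
```

In this toy model, `orbit_volume * gauge_fixed_integral()` reproduces `full_integral`, which is the finite-dimensional counterpart of the statement that eq. (5.19) differs from eq. (5.7) only by a constant factor. Unlike in Yang-Mills theory, Δ here depends on the "field" only through the invariant r.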

In the above formula, the determinant $\Delta[A_\mu]$ depends on the gauge field and must
therefore have an effect on the Feynman rules. The Faddeev-Popov method consists
in rewriting this determinant as a path integral. Note that since $\Delta[A_\mu]$ appears in the
numerator, we need Grassmann variables in order to represent it as a path integral (see footnote 3),
according to eq. (3.36):

$$ \det\big( i M \big) = \int \big[ D\overline{\chi}_a(x)\, D\chi_a(x) \big]\; \exp\Big\{ i \int d^4x\, d^4y\; \overline{\chi}_a(x)\, M_{ab}(x, y)\, \chi_b(y) \Big\} . \tag{5.20} $$

Footnote 3: The factor i in det(iM) has been included for aesthetic reasons, but does not change anything. In
fact any rescaling M → κ M would leave the results unchanged. Indeed, such a change would alter
the ghost propagator according to $S \to \kappa^{-1} S$, and the ghost-gauge boson vertex by V → κV. Since the
ghosts appear only in closed loops, that contain an equal number of propagators and vertices, these factors
κ would cancel out.

An extra generalization, that we have already used in the path integral quantization
of the photon (see eq. (3.52)), is to shift the gauge condition from $G^a(A) = 0$ to
$G^a(A) = \omega^a$ and to perform a Gaussian integration over $\omega^a$. The final result takes
the following form:

$$ \begin{aligned} \big\langle O \big\rangle = \int & \big[ DA^a_\mu(x) \big] \big[ D\overline{\chi}_a(x)\, D\chi_a(x) \big]\; O(A_\mu) \\ &\times \exp\Big\{ i \int d^4x\, \Big( \underbrace{-\frac{1}{4}\, F^a_{\mu\nu} F^{a\,\mu\nu}}_{\mathcal{L}_{\rm YM}}\; \underbrace{-\, \frac{\xi}{2}\, \big( G^a(A_\mu) \big)^2}_{\mathcal{L}_{\rm GF}}\; + \underbrace{\overline{\chi}_a M_{ab}\, \chi_b}_{\mathcal{L}_{\rm FPG}} \Big) \Big\} , \end{aligned} \tag{5.21} $$

where $M_{ab}$ is the derivative of $G^a(A^\Omega)$ with respect to the gauge transformation Ω,
at the point Ω = 1 (here, we use the fact that the determinant is gauge invariant to
choose freely the Ω at which we compute the derivative). The unphysical Grassmann
fields $\chi$ and $\overline{\chi}$ introduced as a trick to express the determinant are called Faddeev-Popov
ghosts, or simply ghosts. Although physical observables do not depend on these
fictitious fields, there is in general a coupling between the ghosts and the gauge fields,
because the matrix $M_{ab}$ may contain the gauge field. This implies that the ghosts
may appear in the form of loop corrections in the perturbative expansion. As we
shall see shortly, they are in fact crucial for the consistency of perturbation theory in
non-Abelian gauge theories. In particular, the ghosts ensure that the theory is unitary.

5.4 Feynman rules for non-Abelian gauge theories


Eq. (5.21) contains all the necessary ingredients to complete the Feynman rules that
we have started to derive heuristically at the beginning of this chapter. To turn this
formula into explicit Feynman rules, we should first choose the gauge fixing function
$G^a(A)$, since it enters directly in the term $\frac{\xi}{2} (G^a(A))^2$, and implicitly in the matrix
$M_{ab}$ that defines the ghost term. In the common situation where this gauge fixing
function is linear in $A_\mu$ (all our examples will be of this type), then the terms that are
quadratic in the gauge field are the same as in QED, and therefore the gauge boson
propagator is also the same (except for an extra factor $\delta_{ab}$ that expresses the fact that
the free propagation of a gluon does not change its colour). Thus our guess (5.4) for
the Feynman gauge propagator was in fact correct. In addition, the gauge fixing term
and the ghost term cannot contain terms of degree 3 or 4 in the gauge field, which
implies that the vertices given in eqs. (5.5) and (5.6) are also correct.

5.4.1 Covariant gauge


Let us now consider the general covariant gauge, also known as the $R_\xi$-gauge, already
introduced in eq. (3.51) for QED. This amounts to choosing the gauge fixing function
as

$$ G^a(A) \equiv \partial^\mu A^a_\mu - \omega^a(x) . \tag{5.22} $$

With this gauge fixing, the free gauge boson propagator is

$$ G^{0\,\mu\nu}_{F\,ab}(p) = \big[ \text{gluon line, momentum } p \big] = \frac{-i\, g^{\mu\nu}\, \delta_{ab}}{p^2 + i0^+} + \frac{i\, \delta_{ab}}{p^2 + i0^+} \Big( 1 - \frac{1}{\xi} \Big) \frac{p^\mu p^\nu}{p^2} . \tag{5.23} $$

(The simplest form is obtained in the limit ξ → 1, giving the Feynman gauge; see footnote 4.) The
matrix $M_{ab}$ can be calculated by applying an infinitesimal gauge transformation
$\Omega = \exp(i \theta_a t^a)$ to $A_\mu$. The variation of the gauge field is

$$ \delta A^a_\mu(x) = g\, f^{abc}\, \theta_b(x)\, A^c_\mu(x) - \partial_\mu \theta_a(x) , \tag{5.24} $$

and the variation of $G^a(A)$ at the point x is

$$ \delta G^a = g\, f^{abc}\, \big( \partial^\mu \theta_b(x) \big)\, A^c_\mu(x) + g\, f^{abc}\, \theta_b(x)\, \big( \partial^\mu A^c_\mu(x) \big) - \square\, \theta_a(x) . \tag{5.25} $$

Therefore, we have

$$ M_{ab} = \frac{\delta G^a(A)}{\delta \theta_b} = g\, f^{abc}\, \big( \partial^\mu A^c_\mu(x) \big) + g\, f^{abc}\, A^c_\mu(x)\, \partial^\mu - \delta_{ab}\, \square , \tag{5.26} $$

and the terms that depend on the Faddeev-Popov ghosts can be encapsulated in the
following effective Lagrangian:

$$ \mathcal{L}_{\rm FPG} = \overline{\chi}_a \Big( - \delta_{ab}\, \square + g\, f^{abc}\, \big( \partial^\mu A^c_\mu(x) \big) + g\, f^{abc}\, A^c_\mu(x)\, \partial^\mu \Big) \chi_b \tag{5.27} $$

The first term leads to the following propagator for the ghosts:

$$ G^0_F(p) = \big[ \text{ghost line, momentum } p \big] = \frac{i\, \delta_{ab}}{p^2 + i0^+} . \tag{5.28} $$

Note that it has the form of a scalar propagator, although the ghosts are anti-commuting
Grassmann variables. The vertex between ghosts and gauge bosons reads

$$ \big[ \text{ghost-gauge boson vertex; colour labels } a, b \text{ and momentum labels } p, q, r \big] = g\, f^{abc}\, ( p_\mu + q_\mu ) = g\, f^{abc}\, r_\mu . \tag{5.29} $$

The Feynman rules for non-Abelian gauge theories in covariant gauge are summarized
in figure 5.2, where we have added for completeness the rules relative to fermions.

Footnote 4: Another popular choice is the Landau gauge, obtained in the limit ξ → +∞, that corresponds to a
strict enforcement of the condition $\partial^\mu A_\mu = 0$. Indeed, in this limit the exponential of $i \frac{\xi}{2} (\partial^\mu A_\mu)^2$ in the
gauge fixed Lagrangian oscillates wildly (and produces cancellations) unless $\partial^\mu A_\mu = 0$. Equivalently,
the Gaussian distribution for the function $\omega^a(x)$ has a vanishing width in this limit, which forces the strict
equality $\partial^\mu A^a_\mu = 0$.
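• Gauge boson propagator (momentum p): $\dfrac{-i\, g^{\mu\nu}\, \delta_{ab}}{p^2 + i0^+} + \dfrac{i\, \delta_{ab}}{p^2 + i0^+} \Big( 1 - \dfrac{1}{\xi} \Big) \dfrac{p^\mu p^\nu}{p^2}$

• Fermion propagator (momentum p): $\dfrac{i\, \delta_{ij}}{\slashed{p} - m + i0^+}$

• Ghost propagator (momentum p): $\dfrac{i\, \delta_{ab}}{p^2 + i0^+}$

• 3-gluon vertex (legs $(a\mu, k)$, $(b\nu, p)$, $(c\rho, q)$): $g\, f^{abc}\, \big\{ g^{\mu\nu} (k - p)^\rho + g^{\nu\rho} (p - q)^\mu + g^{\rho\mu} (q - k)^\nu \big\}$

• 4-gluon vertex (legs $a\mu$, $b\nu$, $c\rho$, $d\sigma$): $-i\, g^2\, \big\{ f^{abe} f^{cde} ( g^{\mu\rho} g^{\nu\sigma} - g^{\mu\sigma} g^{\nu\rho} ) + f^{ace} f^{bde} ( g^{\mu\nu} g^{\rho\sigma} - g^{\mu\sigma} g^{\nu\rho} ) + f^{ade} f^{bce} ( g^{\mu\nu} g^{\rho\sigma} - g^{\mu\rho} g^{\nu\sigma} ) \big\}$

• Fermion-gauge boson vertex (fermion indices i, j): $-i\, g\, \gamma^\mu\, \big( t^a_r \big)_{ij}$

• Ghost-gauge boson vertex (colour labels a, b; momentum labels p, q, r): $g\, f^{abc}\, ( p_\mu + q_\mu ) = g\, f^{abc}\, r_\mu$

Figure 5.2: Feynman rules of non-Abelian gauge theories in covariant gauge. We
also list the rules involving fermions for completeness. Latin characters a, b, c refer
to the adjoint representation, while the letters i, j refer to the representation r in
which the fermions live.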

5.4.2 Axial gauge

The axial gauge fixing consists in constraining the value of $n^\mu A^a_\mu$, where $n^\mu$ is a
fixed 4-vector (when this vector is time-like, this gauge is called the temporal gauge,
and when it is light-like, it is called the light-cone gauge). Therefore, the gauge fixing
function is

$$ G^a(A) \equiv n^\mu A^a_\mu - \omega^a(x) . \tag{5.30} $$

After gauge fixing, the quadratic part of the effective Lagrangian reads

$$ \frac{1}{2}\, A^a_\mu \big( g^{\mu\nu}\, \square - \partial^\mu \partial^\nu - \xi\, n^\mu n^\nu \big) A^a_\nu , \tag{5.31} $$

and the free gauge boson propagator is obtained in momentum space by inverting

$$ g^{\mu\nu}\, p^2 - p^\mu p^\nu + \xi\, n^\mu n^\nu . \tag{5.32} $$

The inverse of this matrix must be of the form

$$ A\, g^{\mu\nu} + B\, p^\mu p^\nu + C\, n^\mu n^\nu + D\, ( n^\mu p^\nu + n^\nu p^\mu ) . \tag{5.33} $$

(This is the most general symmetric tensor that one may construct with $g^{\mu\nu}$, $p^\mu$ and
$n^\mu$.) This leads to the following propagator

$$ G^{0\,\mu\nu}_{F\,ab}(p) = \frac{-i\, \delta_{ab}}{p^2 + i0^+} \Big[ g^{\mu\nu} - \frac{p^\mu n^\nu + p^\nu n^\mu}{p \cdot n} + \big( n^2 + \xi^{-1} p^2 \big) \frac{p^\mu p^\nu}{(p \cdot n)^2} \Big] . \tag{5.34} $$

Note that this propagator does not decrease as $p^{-2}$ at large momentum, because of the
term proportional to $\xi^{-1}$. With this gauge fixing, the variation of the gauge fixing
function under an infinitesimal gauge transformation is given by

$$ \delta G^a = g\, f^{abc}\, \theta_b(x)\, n^\mu A^c_\mu(x) - n^\mu \partial_\mu \theta_a(x) , \tag{5.35} $$

and the matrix M reads

$$ M_{ab} = g\, f^{abc}\, n^\mu A^c_\mu(x) - \delta_{ab}\, n^\mu \partial_\mu . \tag{5.36} $$

Therefore, the Faddeev-Popov term in the effective Lagrangian is

$$ \mathcal{L}_{\rm FPG} = \overline{\chi}_a \Big( - \delta_{ab}\, n^\mu \partial_\mu + g\, f^{abc}\, n^\mu A^c_\mu(x) \Big) \chi_b , \tag{5.37} $$

which leads to the following expressions for the ghost propagator and its coupling to
the gauge boson:

$$ \begin{aligned} G^0_F(p) = \big[ \text{ghost line, momentum } p \big] &= - \frac{\delta_{ab}}{p \cdot n + i0^+} \\ \big[ \text{ghost-gauge boson vertex; colour labels } a, b \text{ and momentum labels } p, q, r \big] &= i\, g\, f^{abc}\, n^\mu . \end{aligned} \tag{5.38} $$

A significant simplification of these Feynman rules occurs in the limit ξ → ∞ (that
one may call the strict axial gauge, since the condition $n^\mu A^a_\mu = 0$ holds exactly in
this limit). In this limit, the gauge boson propagator becomes

$$ G^{0\,\mu\nu}_{F\,ab}(p) = \frac{-i\, \delta_{ab}}{p^2 + i0^+} \Big[ g^{\mu\nu} - \frac{p^\mu n^\nu + p^\nu n^\mu}{p \cdot n} + \frac{p^\mu p^\nu\, n^2}{(p \cdot n)^2} \Big] , \tag{5.39} $$

and satisfies

$$ n_\mu\, G^{0\,\mu\nu}_{F\,ab}(p) = n_\nu\, G^{0\,\mu\nu}_{F\,ab}(p) = 0 . \tag{5.40} $$

Therefore, the gauge boson propagator gives zero when contracted into the ghost-
gauge boson vertex, which effectively decouples the ghosts from the gauge bosons.
Thus, the limit ξ → ∞ of the axial gauge is ghost-free (but its propagator is arguably
much more complicated than the Feynman gauge propagator).
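The axial gauge propagator can be verified numerically. The sketch below is an illustrative check added here, with arbitrary numerical values and hypothetical function names: it confirms that the square bracket of eq. (5.34), divided by $p^2$, is the inverse of the matrix (5.32), and that in the limit ξ → ∞ the bracket of eq. (5.39) is annihilated by $n_\mu$, as stated in eq. (5.40). The colour factor $\delta_{ab}$, the overall factor −i and the $i0^+$ prescription are left out.

```python
METRIC = (1.0, -1.0, -1.0, -1.0)   # diagonal of g_{mu nu} = g^{mu nu}

def lower(v):
    """Lower the index of a four-vector given by its contravariant components."""
    return tuple(METRIC[m] * v[m] for m in range(4))

def dot(u, v):
    return sum(METRIC[m] * u[m] * v[m] for m in range(4))

def axial_operator(p, n, xi):
    """Matrix of eq. (5.32), upper indices: g^{mu nu} p^2 - p^mu p^nu + xi n^mu n^nu."""
    p2 = dot(p, p)
    return [[(METRIC[m] if m == r else 0.0) * p2 - p[m] * p[r] + xi * n[m] * n[r]
             for r in range(4)] for m in range(4)]

def axial_bracket(p, n, xi=None):
    """Square bracket of eq. (5.34) with lower indices.
    xi=None stands for the limit xi -> infinity, i.e. the bracket of eq. (5.39)."""
    pl, nl = lower(p), lower(n)
    pn, p2, n2 = dot(p, n), dot(p, p), dot(n, n)
    beta = n2 if xi is None else n2 + p2 / xi
    return [[(METRIC[m] if m == r else 0.0)
             - (pl[m] * nl[r] + pl[r] * nl[m]) / pn
             + beta * pl[m] * pl[r] / pn ** 2
             for r in range(4)] for m in range(4)]

def contract(K, X):
    """K^{mu nu} X_{nu rho}: upper index contracted with lower index, no metric needed."""
    return [[sum(K[m][s] * X[s][r] for s in range(4)) for r in range(4)] for m in range(4)]

p = (1.3, 0.2, -0.7, 0.5)   # arbitrary off-shell momentum
n = (1.0, 0.0, 0.0, 0.4)    # arbitrary fixed vector with p.n != 0
xi = 2.5
p2 = dot(p, p)

product = contract(axial_operator(p, n, xi), axial_bracket(p, n, xi))
strict = axial_bracket(p, n)   # limit xi -> infinity
n_contracted = [sum(n[m] * strict[m][r] for m in range(4)) for r in range(4)]
```

Here `product` should equal $p^2$ times the identity matrix, and `n_contracted` should vanish; for finite ξ the contraction with n does not vanish, which is why the ghosts decouple only in the strict axial gauge.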

5.5 On-shell non-Abelian Ward identities


In Quantum Electrodynamics, the interpretation of Cutkosky’s cutting rules as a
perturbative realization of unitarity depends crucially on the Ward-Takahashi identities
satisfied by amplitudes with external photons, namely

$$ k_1^{\mu_1}\, \Gamma_{\mu_1 \mu_2 \cdots}(k_1, k_2, \cdots) = 0 , \tag{5.41} $$

valid when all the charged external lines are on-shell. Note that this identity does not
require contracting the remaining photon lines with polarization vectors.
In Yang-Mills theory, it turns out that the identity (5.41) is in general not satisfied.
Instead, it is replaced by a different on-shell identity, discovered by ’t Hooft. In order
to derive it, let us consider a generalized covariant gauge condition of the form

$$ \partial_\mu A^\mu_a(x) - \zeta_a(x) = \omega_a(x) . \tag{5.42} $$
As in the Faddeev-Popov quantization, we integrate over the function $\omega_a$ with a
Gaussian weight, which leads to the following gauge fixed Lagrangian:

$$ \mathcal{L} \equiv -\frac{1}{4}\, F^a_{\mu\nu} F^{a\,\mu\nu} - \frac{\xi}{2}\, \big( \partial_\mu A^\mu_a - \zeta_a \big)^2 + \overline{\chi}_a M_{ab}\, \chi_b . \tag{5.43} $$

This Lagrangian is the same as the one encountered before (with $\zeta_a \equiv 0$), with the
addition of two terms that depend on the arbitrary function $\zeta_a(x)$ introduced in the
gauge fixing:

$$ -\frac{\xi}{2}\, \zeta_a \zeta_a + \xi\, \zeta_a\, \partial_\mu A^\mu_a . \tag{5.44} $$

Since $\zeta_a$ is not a dynamical field but merely a parameter, the first term can be factored
out of the generating functional for correlation functions, and cancels once it is
properly normalized. The second term acts as a source coupled to the divergence of
the gauge field. In momentum space, the Feynman rule for the insertion of such a
source is

$$ \big[ \text{source insertion, momentum } k \big] = i\, \xi\, k_\mu\, \tilde{\zeta}_a(k) , \tag{5.45} $$

where $\tilde{\zeta}_a$ is the Fourier transform of $\zeta_a$. This source is always contracted into an
external gluon propagator, leading to the combination (see footnote 5)

$$ i\, \xi\, k_\mu\, \tilde{\zeta}_a(k)\; G^{0\,\mu\nu}_{F\,ab}(k) = \frac{\tilde{\zeta}_b(k)\, k^\nu}{k^2} . \tag{5.46} $$
(The propagator is given in eq. (5.23).) In this contraction, the external gluon prop-
agator is replaced by a factor kν directly contracted into the amputated correlation
function, independently of the gauge parameter.
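Both the covariant gauge propagator (5.23) and the contraction (5.46) lend themselves to a quick numerical check. The sketch below is an illustration added here; it assumes the usual convention that the propagator is i times the inverse of the quadratic operator, $K^{\mu\nu} = -g^{\mu\nu} p^2 + (1 - \xi)\, p^\mu p^\nu$, obtained from the gauge fixed Lagrangian in momentum space. Colour factors, the source $\tilde{\zeta}$ and the $i0^+$ prescription are dropped, and all function names and numerical values are hypothetical.

```python
METRIC = (1.0, -1.0, -1.0, -1.0)   # diagonal of g_{mu nu} = g^{mu nu}

def lower(v):
    return tuple(METRIC[m] * v[m] for m in range(4))

def dot(u, v):
    return sum(METRIC[m] * u[m] * v[m] for m in range(4))

def kinetic_operator(p, xi):
    """K^{mu nu} = -g^{mu nu} p^2 + (1 - xi) p^mu p^nu (assumed convention, see lead-in)."""
    p2 = dot(p, p)
    return [[-(METRIC[m] if m == r else 0.0) * p2 + (1.0 - xi) * p[m] * p[r]
             for r in range(4)] for m in range(4)]

def propagator_lower(p, xi):
    """Eq. (5.23) with both Lorentz indices lowered, without delta_ab and i0+."""
    p2 = dot(p, p)
    pl = lower(p)
    return [[-1j * (METRIC[m] if m == r else 0.0) / p2
             + 1j / p2 * (1.0 - 1.0 / xi) * pl[m] * pl[r] / p2
             for r in range(4)] for m in range(4)]

def contract(K, G):
    """K^{mu nu} G_{nu rho}."""
    return [[sum(K[m][s] * G[s][r] for s in range(4)) for r in range(4)] for m in range(4)]

def source_contraction(k, xi):
    """i xi k_mu G^{mu nu}(k), to be compared with k^nu / k^2 as in eq. (5.46)."""
    G = propagator_lower(k, xi)
    # k[m] carries an upper index and contracts with the first (lower) index of G;
    # the remaining index is raised with the diagonal metric.
    return [sum(1j * xi * k[m] * G[m][r] for m in range(4)) * METRIC[r] for r in range(4)]

p = (1.3, 0.2, -0.7, 0.5)   # arbitrary off-shell momentum
xi = 2.5
p2 = dot(p, p)

product = contract(kinetic_operator(p, xi), propagator_lower(p, xi))
contracted = source_contraction(p, xi)
```

With these conventions, `product` equals i times the identity matrix, and `contracted` equals $k^\nu / k^2$ whatever the value of ξ, which is the gauge parameter independence noted in the text.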
Since the function ζa has been introduced as part of our choice of gauge fixing
condition, gauge invariant quantities should not depend on it. Consequently, the
sum of the graphs contributing to gauge invariant quantities with a given non-zero
number of insertions of the source ζa must be zero. Consider S-matrix elements,
i.e. transition amplitudes between physical states. The graphs contributing to such a
matrix element have a number of external gluons corresponding to the in- and out-
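states of the amplitude, plus possibly some insertions of the source ζa :

$$\begin{aligned}
&\big\langle{\rm out}\big|\Gamma\big|{\rm in}\big\rangle=\\
&\quad\Gamma_{\{\mu_i a_i\}_{i\in[1,n]}\,\{\nu_j b_j\}_{j\in[1,p]}}\big(\underbrace{k_1\cdots k_n}_{\text{on-shell}}\,;\,\underbrace{q_1\cdots q_p}_{\text{off-shell}}\big)\\
&\quad\times\Big[\prod_{i=1}^{n}\underbrace{\epsilon^{\mu_i}(k_i)}_{\text{physical}}\Big]\times\Big[\prod_{j=1}^{p}\frac{q_j^{\nu_j}\,\tilde\zeta_{b_j}(q_j)}{q_j^2}\Big]=0\;.
\end{aligned} \tag{5.47}$$

5 In axial gauge, the insertion of a source ζa leads to a factor $\tilde\zeta_b(k)\,k^\nu/(k\cdot n)$. Thanks to the factor
$k^\nu$, the identity (5.47) is also valid in axial gauges.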

In this expression, Γ is an (n + p)-gluon amplitude, with amputated external lines. n of
these gluons correspond to the in- and out- states, with colour ai . The corresponding
momentum ki is on-shell and the Lorentz index µi is contracted with a physical
polarization vector. In contrast, the lines to which the sources $\zeta_{b_j}$ are attached are
off-shell and contracted with their own momentum qj . When including all the graphs
contributing to a given order in the coupling and in ζ, this expression must vanish
if p ≥ 1 because it is a contribution to a gauge invariant quantity. Thus, the Ward
identity of eq. (5.41) can be adapted to non-Abelian gauge theories, with the restriction
that all the gluon lines not contracted with $q_j^{\nu_j}$ must be on-shell and contracted with a
physical polarization vector. For instance, the gluon self-energy obeys the following
two relations:

$$k_\mu\,\epsilon_\nu(k)\,\Pi^{\mu\nu}(k)\Big|_{k^2=0}=0\;,\qquad k_\mu\,k_\nu\,\Pi^{\mu\nu}(k)=0\;, \tag{5.48}$$

while in QED it is sufficient to contract the self-energy with a single k (even off-shell)
to obtain zero. Note that these identities are not sufficient to obtain a unitary S-matrix
with only internal gluons, because the tensor structure of the internal cut gluon
propagators also involves polarizations which are neither physical nor proportional
to qν . The Zinn-Justin equation, that we shall derive in the next chapter, may be
viewed as a generalization of these Ward identities to off-shell momenta and arbitrary
polarizations.

5.6 Ghosts and unitarity


5.6.1 Explicit example
In Abelian gauge theories, we were able to show that cutting rules provide a per-
turbative realization of the optical theorem, by using the Ward identities obeyed by
amplitudes when all the external charged particles are on-shell. These identities were
sufficient to conclude that the unphysical polarizations carried by the internal photon
lines of a graph cancel when these lines are cut. But in non-Abelian gauge theories,
this reasoning faces two difficulties:

i. There are no Ward identities similar to those of QED, that could be used to
prove unitarity.
ii. Higher order graphs in general have ghost loops, whose interpretation is at the
moment unclear when such loops are cut.

As we shall see, these two issues are in fact related: the cut ghost lines precisely
cancel the unphysical polarizations of the cut gluons. Let us first work out an explicit
example that illustrates this assertion: the tree level annihilation of a quark and an
antiquark into two gluons in QCD. The corresponding diagrams are the following:

We denote by p and q the momenta of the incoming quark and antiquark, respectively, and
k1,2 the momenta of the outgoing gluons (with Lorentz indices µ, ν and colours a, b,
respectively).
The contribution of the first two graphs is very similar to that of the analogous
graphs in QED for the emission of two photons, except for the extra colour matrices
at the quark-gluon vertices:

$$i\mathcal{M}^{\mu\nu}_{ab}\big|_{1+2}(p,q|k_1,k_2)=(i\,g)^2\;\bar v(q)\Big[\gamma^\mu t^a\,\frac{i}{\slashed{k}_1-\slashed{q}-m}\,\gamma^\nu t^b+\gamma^\nu t^b\,\frac{i}{\slashed{p}-\slashed{k}_1-m}\,\gamma^\mu t^a\Big]u(p)\;. \tag{5.49}$$

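By contracting this amplitude with the gluon momentum $k_{1\mu}$, we get:

$$k_{1\mu}\;i\mathcal{M}^{\mu\nu}_{ab}\big|_{1+2}(p,q|k_1,k_2)=(i\,g)^2\;\bar v(q)\Big[\slashed{k}_1 t^a\,\frac{i}{\slashed{k}_1-\slashed{q}-m}\,\gamma^\nu t^b+\gamma^\nu t^b\,\frac{i}{\slashed{p}-\slashed{k}_1-m}\,\slashed{k}_1 t^a\Big]u(p)\;. \tag{5.50}$$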

In the numerator of the first term, we may write

$$\slashed{k}_1=(\slashed{k}_1-\slashed{q}-m)+(\slashed{q}+m)\;, \tag{5.51}$$

and use the Dirac equation $\bar v(q)(\slashed{q}+m)=0$. Likewise, we may simplify the second
term by using

$$\slashed{k}_1=(\slashed{p}-m)-(\slashed{p}-\slashed{k}_1-m)\;,\qquad(\slashed{p}-m)\,u(p)=0\;, \tag{5.52}$$

which leads to

$$k_{1\mu}\;i\mathcal{M}^{\mu\nu}_{ab}\big|_{1+2}(p,q|k_1,k_2)=i\,(i\,g)^2\;\bar v(q)\,\gamma^\nu\,[t^a,t^b]\,u(p)\;. \tag{5.53}$$

This is non-zero, because of the non-commutativity of the Lie generators in a non-Abelian
gauge theory. However, by using $[t^a,t^b]=i\,f^{abc}\,t^c$, this result may be
related to the third graph, that contains a 3-gluon vertex. If we use the Feynman gauge
for the internal gluon propagator, its contribution can be written as

$$\begin{aligned}
i\mathcal{M}^{\mu\nu}_{ab}\big|_3(p,q|k_1,k_2)={}&i\,g\;\bar v(q)\,\gamma_\rho\,t^c\,u(p)\;\frac{-i}{k_3^2}\\
&\times g\,f^{abc}\,\big[g^{\mu\nu}(k_2-k_1)^\rho+g^{\nu\rho}(k_3-k_2)^\mu+g^{\rho\mu}(k_1-k_3)^\nu\big]\;,
\end{aligned} \tag{5.54}$$

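where we denote k3 ≡ −k1 − k2 . Contracting this amplitude with $k_{1\mu}$ gives

$$\begin{aligned}
k_{1\mu}\;i\mathcal{M}^{\mu\nu}_{ab}\big|_3(p,q|k_1,k_2)={}&i\,g\;\bar v(q)\,\gamma_\rho\,t^c\,u(p)\;\frac{-i}{k_3^2}\\
&\times g\,f^{abc}\,\big[g^{\nu\rho}\,k_2^2-k_2^\nu k_2^\rho-g^{\nu\rho}\,k_3^2+k_3^\nu k_3^\rho\big]\;.
\end{aligned} \tag{5.55}$$

In this equation, the term in $k_3^\nu k_3^\rho$ vanishes once contracted with $\gamma_\rho$, since we can
write

$$\bar v(q)\,\gamma_\rho\,t^c\,u(p)\,k_3^\rho=-\bar v(q)\big[(\slashed{p}-m)+(\slashed{q}+m)\big]\,t^c\,u(p)=0\;. \tag{5.56}$$

However, this is not sufficient for (5.55) to fully cancel (5.53).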
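
The purely kinematical step from eq. (5.54) to eq. (5.55) can be verified with a short script.
The following sketch (all names are hypothetical) contracts the Lorentz bracket of the 3-gluon
vertex with k1µ for random integer momenta, with k3 = −k1 − k2, and compares it with the bracket
of eq. (5.55). No on-shell condition is imposed, since none is needed for this step.

```python
import random

# Minkowski metric, signature (+,-,-,-); numerically g^{mu nu} = g_{mu nu}.
METRIC = [[1 if m == n == 0 else (-1 if m == n else 0) for n in range(4)]
          for m in range(4)]


def minkowski_dot(a, b):
    """Return a.b = a^mu g_{mu nu} b^nu."""
    return sum(METRIC[m][m] * a[m] * b[m] for m in range(4))


def contracted_vertex(k1, k2):
    """Return k1_mu V^{mu nu rho}, where V is the bracket of eq. (5.54)."""
    k3 = [-(a + b) for a, b in zip(k1, k2)]
    k1_lower = [METRIC[m][m] * k1[m] for m in range(4)]

    def vertex(m, n, r):
        return (METRIC[m][n] * (k2[r] - k1[r])
                + METRIC[n][r] * (k3[m] - k2[m])
                + METRIC[r][m] * (k1[n] - k3[n]))

    return [[sum(k1_lower[m] * vertex(m, n, r) for m in range(4))
             for r in range(4)] for n in range(4)]


def bracket_of_5_55(k1, k2):
    """Return g^{nu rho} k2^2 - k2^nu k2^rho - g^{nu rho} k3^2 + k3^nu k3^rho."""
    k3 = [-(a + b) for a, b in zip(k1, k2)]
    k2_sq = minkowski_dot(k2, k2)
    k3_sq = minkowski_dot(k3, k3)
    return [[METRIC[n][r] * k2_sq - k2[n] * k2[r]
             - METRIC[n][r] * k3_sq + k3[n] * k3[r]
             for r in range(4)] for n in range(4)]


def identity_holds(trials=100, seed=0):
    """Compare both sides for random integer momenta (exact integer arithmetic)."""
    rng = random.Random(seed)
    for _ in range(trials):
        k1 = [rng.randint(-9, 9) for _ in range(4)]
        k2 = [rng.randint(-9, 9) for _ in range(4)]
        if contracted_vertex(k1, k2) != bracket_of_5_55(k1, k2):
            return False
    return True
```

Integer momenta are used so that the comparison is exact rather than subject to rounding.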
Setting $k_2^2=0$ kills another term in eq. (5.55). The term in $k_2^\nu k_2^\rho$ would be
canceled if in addition we contract the amplitudes with a transverse polarization
vector $\epsilon_{1,2\,\nu}(k_2)$, since $k_2^\nu\,\epsilon_{1,2\,\nu}(k_2)=0$. We indeed have:

$$k_{1\mu}\,\epsilon_{1,2\,\nu}(k_2)\,\Big[i\mathcal{M}^{\mu\nu}_{ab}\big|_{1+2}(p,q|k_1,k_2)+i\mathcal{M}^{\mu\nu}_{ab}\big|_3(p,q|k_1,k_2)\Big]_{k_2^2=0}=0\;. \tag{5.57}$$

The same cancellation happens if we contract the amplitudes simultaneously with
$k_{1\mu}$ and $k_{2\nu}$:

$$k_{1\mu}\,k_{2\nu}\,\Big[i\mathcal{M}^{\mu\nu}_{ab}\big|_{1+2}(p,q|k_1,k_2)+i\mathcal{M}^{\mu\nu}_{ab}\big|_3(p,q|k_1,k_2)\Big]=0\;, \tag{5.58}$$
even if the momentum k2 is not on-shell. Thus, we obtain for this process a Ward
identity similar to the QED one, provided certain extra conditions are satisfied by
the second gluon (both eqs. (5.57) and (5.58) are special cases of the non-Abelian
Ward identity (5.47)). These restrictions weaken the resulting identity, and it is not
sufficient to eliminate the longitudinal gluon polarizations when we try to recover the
amplitude from the imaginary part of the qq̄ → qq̄ forward amplitude at one loop.
In particular, some unphysical polarizations will not cancel in the following cut:

Except for a graph with a quark loop that does not play any role in the present
discussion (since it does not give any 2-gluon final state when cut), the complete list
of graphs contributing to the qq̄ → qq̄ forward amplitude at one loop is shown in
figure 5.3.

Figure 5.3: One-loop diagrams contributing to qq̄ → qq̄.

The contribution of the first 5 graphs (i.e. those with gluon internal lines) to the
optical theorem can be calculated easily by noting that it can be expressed in terms of
the amplitude we have just calculated:

$$i\mathcal{M}^{\mu\nu}_{ab}(p,q|k_1,k_2)\equiv i\mathcal{M}^{\mu\nu}_{ab}\big|_{1+2}(p,q|k_1,k_2)+i\mathcal{M}^{\mu\nu}_{ab}\big|_3(p,q|k_1,k_2)\;, \tag{5.59}$$

as follows⁶

$$\begin{aligned}
&\frac{1}{2}\int\frac{d^4k_1}{(2\pi)^4}\int\frac{d^4k_2}{(2\pi)^4}\;(2\pi)^4\delta^{(4)}(p+q-k_1-k_2)\\
&\quad\times 2\pi(-g_{\mu\rho})\,\theta(k_1^0)\,\delta(k_1^2-m^2)\;2\pi(-g_{\nu\sigma})\,\theta(k_2^0)\,\delta(k_2^2-m^2)\\
&\quad\times i\mathcal{M}^{\mu\nu}_{ab}(p,q|k_1,k_2)\;\big(i\mathcal{M}^{\rho\sigma}_{ab}(p,q|k_1,k_2)\big)^*\;.
\end{aligned} \tag{5.60}$$

For a successful interpretation of this formula as a physical contribution in the optical
theorem, only physical polarizations should survive after we have replaced the tensors
$-g_{\mu\rho}$ and $-g_{\nu\sigma}$ by using (see eq. (1.356))

$$g^{\mu\nu}=\epsilon^\mu_+(k)\,\epsilon^{\nu*}_-(k)+\epsilon^\mu_-(k)\,\epsilon^{\nu*}_+(k)-\sum_{\lambda=1,2}\epsilon^\mu_\lambda(k)\,\epsilon^{\nu*}_\lambda(k)\;, \tag{5.61}$$

where $\epsilon^\mu_\pm(k)$ are unphysical polarizations (with $\epsilon^\mu_+(k)$ proportional to $k^\mu$). After this
substitution, several terms are not problematic:

• The terms that contain only the polarizations $\epsilon^\mu_{1,2}$, since they are fully physical.
• The terms containing $\epsilon^\mu_{1,2}\,\epsilon^\nu_+$ or $\epsilon^\mu_+\,\epsilon^\nu_+$ vanish by virtue of eqs. (5.57) and
  (5.58).

6 The factor 1/2 is a symmetry factor due to the presence of two identical gluons in the final state.

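Thus, we need only study the following term

$$\frac{1}{2}\Big[\big(i\mathcal{M}^{\mu\nu}_{ab}\,\epsilon_{-\mu}\epsilon_{+\nu}\big)\big(i\mathcal{M}^{\rho\sigma}_{ab}\,\epsilon_{+\rho}\epsilon_{-\sigma}\big)^*+\big(i\mathcal{M}^{\mu\nu}_{ab}\,\epsilon_{+\mu}\epsilon_{-\nu}\big)\big(i\mathcal{M}^{\rho\sigma}_{ab}\,\epsilon_{-\rho}\epsilon_{+\sigma}\big)^*\Big]\;, \tag{5.62}$$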

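integrated over the on-shell momenta k1 and k2 . Using $\epsilon^\mu_+(k)=k^\mu/\sqrt{2}\,|\mathbf{k}|$ and
eqs. (5.53) and (5.55), we obtain

$$\epsilon_{+\mu}(k_1)\;i\mathcal{M}^{\mu\nu}_{ab}=-\frac{g^2}{\sqrt{2}\,|\mathbf{k}_1|}\,\frac{1}{k_3^2}\;\bar v(q)\,\slashed{k}_2\,k_2^\nu\,f^{abc}\,t^c\,u(p)\;. \tag{5.63}$$

Likewise with the other gluon, we have

$$\epsilon_{+\nu}(k_2)\;i\mathcal{M}^{\mu\nu}_{ab}=\frac{g^2}{\sqrt{2}\,|\mathbf{k}_2|}\,\frac{1}{k_3^2}\;\bar v(q)\,\slashed{k}_1\,k_1^\mu\,f^{abc}\,t^c\,u(p)\;. \tag{5.64}$$

Using then $\epsilon^\mu_-(k)=(k^0,-\mathbf{k})/\sqrt{2}\,|\mathbf{k}|$, we get

$$\begin{aligned}
\epsilon_{-\nu}(k_2)\,\epsilon_{+\mu}(k_1)\;i\mathcal{M}^{\mu\nu}_{ab}&=-g^2\,\frac{|\mathbf{k}_2|}{|\mathbf{k}_1|}\,\frac{1}{k_3^2}\;\bar v(q)\,\slashed{k}_2\,f^{abc}\,t^c\,u(p)\;,\\
\epsilon_{+\nu}(k_2)\,\epsilon_{-\mu}(k_1)\;i\mathcal{M}^{\mu\nu}_{ab}&=+g^2\,\frac{|\mathbf{k}_1|}{|\mathbf{k}_2|}\,\frac{1}{k_3^2}\;\bar v(q)\,\slashed{k}_1\,f^{abc}\,t^c\,u(p)\;.
\end{aligned} \tag{5.65}$$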

Furthermore, notice that

$$\bar v(q)\,(\slashed{k}_1+\slashed{k}_2)\,u(p)=\bar v(q)\,(\slashed{q}+m+\slashed{p}-m)\,u(p)=0\;. \tag{5.66}$$

Combining these equations, the non-physical contribution to the optical theorem of
the diagrams with a gluon loop, (5.62), can be written as follows:

$$g^4\,\frac{1}{(k_3^2)^2}\,\Big[\bar v(q)\,\slashed{k}_1\,f^{abc}\,t^c\,u(p)\Big]\Big[\bar v(q)\,\slashed{k}_1\,f^{abd}\,t^d\,u(p)\Big]^*\;. \tag{5.67}$$
If this was all there is, as the naive Feynman rules we tried to guess at the beginning of
this chapter would suggest, then we would have to conclude that Yang-Mills theories
are inconsistent because they violate unitarity. Fortunately, there is one more graph in
figure 5.3, with a ghost loop. Let us first evaluate the annihilation amplitude of the
quark-antiquark pair into a ghost-antighost pair:

$$i\mathcal{M}_{q\bar q\to\chi\bar\chi}=i\,g\;\bar v(q)\,\gamma_\rho\,t^c\,u(p)\;\frac{i}{k_3^2}\;\big(g\,f^{abc}\,k_1^\rho\big)\;. \tag{5.68}$$

Squaring this amplitude, and including the − sign⁷ associated to a ghost loop⁸, the
contribution of the last graph of fig. 5.3 to the optical theorem becomes

$$-g^4\,\frac{1}{(k_3^2)^2}\,\Big[\bar v(q)\,\slashed{k}_1\,f^{abc}\,t^c\,u(p)\Big]\Big[\bar v(q)\,\slashed{k}_1\,f^{abd}\,t^d\,u(p)\Big]^*\;, \tag{5.69}$$

that exactly cancels the unphysical gluon contribution of eq. (5.67). In other words,
the optical theorem is satisfied with only physical modes in the final state sum, thanks
to a crucial cancellation that involves ghosts.
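
The decomposition (5.61) of the metric tensor, on which this example relies, can be checked
explicitly for a given momentum. The sketch below is an illustration under simplifying
assumptions: the momentum is taken along the z axis, kµ = (|kz|, 0, 0, kz), and the two physical
polarizations are chosen real and along the x and y axes. The function name is hypothetical.

```python
import math


def polarization_sum(kz):
    """Right-hand side of eq. (5.61) for k^mu = (|kz|, 0, 0, kz), as a 4x4 list.

    eps_plus  = (|k|,  k) / (sqrt(2) |k|)   (proportional to k^mu)
    eps_minus = (|k|, -k) / (sqrt(2) |k|)
    The physical polarizations are taken real, along x and y (an assumption).
    """
    k_abs = abs(kz)
    norm = math.sqrt(2.0) * k_abs
    eps_plus = [k_abs / norm, 0.0, 0.0, kz / norm]
    eps_minus = [k_abs / norm, 0.0, 0.0, -kz / norm]
    physical = [[0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
    return [[eps_plus[m] * eps_minus[n] + eps_minus[m] * eps_plus[n]
             - sum(eps[m] * eps[n] for eps in physical)
             for n in range(4)] for m in range(4)]
```

For any non-zero `kz`, the returned matrix should agree with diag(1, −1, −1, −1) up to rounding
errors, i.e. with the metric tensor in the signature used throughout.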

5.6.2 Becchi-Rouet-Stora-Tyutin symmetry

The cancellation that occurred in the previous example is in fact general: for every
gluon loop, there is a graph of identical topology where this loop is replaced by a
ghost loop, that cancels the contribution from the unphysical gluon polarizations in the
optical theorem. However, it is difficult to turn the calculation of the previous subsec-
tion into a general proof. It turns out that this cancellation originates from a residual
symmetry of the gauge fixed Lagrangian: although the gauge fixing term explicitly
breaks the gauge symmetry, the effective Lagrangian that appears in eq. (5.21) has a
remnant of the original gauge symmetry, known as the Becchi-Rouet-Stora-Tyutin
symmetry (BRST).
Under an infinitesimal gauge transformation parameterized by θa (x), the gauge
field and fermion field vary by

$$\begin{aligned}
\delta A^a_\mu(x)&=-\big(D^{\rm adj}_\mu\big)_{ab}\,\theta_b(x)\\
\delta\psi(x)&=-i\,g\,\theta_a(x)\,t^a_r\,\psi(x)\;,
\end{aligned} \tag{5.70}$$

where r is the representation in which the fermions live. A BRST transformation is
similar to the above transformation, but with the substitution θa (x) → −ϑ χa (x),
where ϑ is a Grassmann constant⁹,

$$\begin{aligned}
\delta_{\rm BRST}A^a_\mu(x)&=\big(D^{\rm adj}_\mu\big)_{ab}\,\vartheta\,\chi_b(x)\\
\delta_{\rm BRST}\psi(x)&=i\,g\,\vartheta\,\chi_a(x)\,t^a_r\,\psi(x)\;.
\end{aligned} \tag{5.71}$$

7 There is no 1/2 symmetry factor for a ghost-antighost final state, because they are not identical.
8 We see here how essential it is that ghosts are anti-commuting fields – otherwise, their contribution
would not have the proper sign to cancel the unphysical gluon polarizations in the optical theorem.
9 This Grassmann constant makes ϑ χa (x) a commuting object like θa .

Since the BRST transformation is structurally identical to a local gauge transformation,
any gauge invariant combination of gauge fields and fermions is also BRST-invariant.
This is therefore the case of the Yang-Mills Lagrangian and the Dirac Lagrangian with
a minimal coupling of the fermions to the gauge fields. It is customary to introduce a
generator QBRST for this transformation, by denoting δBRST = ϑ QBRST . Thus

$$Q_{\rm BRST}A^a_\mu(x)=\big(D^{\rm adj}_\mu\big)_{ab}\,\chi_b(x)\;,\qquad Q_{\rm BRST}\psi(x)=i\,g\,\chi_a(x)\,t^a_r\,\psi(x)\;. \tag{5.72}$$

Eqs. (5.71) do not tell how ghost and antighost fields transform under BRST. For
reasons that will become clear later, we shall impose that the BRST transformation
is nilpotent, i.e. that $Q^2_{\rm BRST}=0$ when applied to any of the fields of the theory. This
requirement constrains the BRST transformation of the ghosts. Indeed, a double
BRST transformation applied to fermions reads

$$\begin{aligned}
Q^2_{\rm BRST}\psi(x)&=i\,g\,\Big[\big(Q_{\rm BRST}\chi_a(x)\big)\,t^a_r\,\psi(x)-\chi_a(x)\,t^a_r\,Q_{\rm BRST}\psi(x)\Big]\\
&=i\,g\,\big(Q_{\rm BRST}\chi_a(x)\big)\,t^a_r\,\psi(x)+g^2\,\chi_a(x)\chi_b(x)\,t^a_r t^b_r\,\psi(x)\;.
\end{aligned} \tag{5.73}$$

(The BRST generator is an anti-commuting object, which leads to a minus sign in the
second term of the first line when we push it through the Grassmann field χa .) Since
χa and χb anti-commute, we can replace $t^a_r t^b_r$ by $\frac{1}{2}[t^a_r,t^b_r]=\frac{i}{2}f^{abc}t^c_r$. We see that
eq. (5.73) will identically vanish provided that

$$Q_{\rm BRST}\chi_a(x)=-\frac{1}{2}\,g\,f^{abc}\,\chi_b(x)\,\chi_c(x)\;. \tag{5.74}$$

Then, we can calculate the action of a double BRST transformation on the gauge
field,

$$\begin{aligned}
Q^2_{\rm BRST}A^a_\mu&=\big(D^{\rm adj}_\mu\big)_{ab}\,Q_{\rm BRST}\chi_b-g\,f^{abc}\,\big(Q_{\rm BRST}A^c_\mu\big)\,\chi_b\\
&=\big(D^{\rm adj}_\mu\big)_{ab}\Big(-\frac{g}{2}\,f^{bcd}\,\chi_c(x)\chi_d(x)\Big)-g\,f^{abc}\,\big(\partial_\mu\chi_c-g\,f^{cde}A^e_\mu\chi_d\big)\,\chi_b\;.
\end{aligned} \tag{5.75}$$

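The terms linear in the gauge field cancel by using the anti-commuting nature of the
χ’s and the Jacobi identity satisfied by the structure constants:

$$\begin{aligned}
&-\tfrac{1}{2}\,g^2 f^{abe}f^{bcd}\,A^e_\mu\,\chi_c(x)\chi_d(x)+g^2 f^{abc}f^{cde}\,A^e_\mu\,\chi_d\chi_b\\
&\qquad=\tfrac{1}{2}\,g^2\,\underbrace{\big(-f^{ace}f^{cbd}+f^{abc}f^{cde}-f^{adc}f^{cbe}\big)}_{0}\,A^e_\mu\,\chi_b\chi_d\;.
\end{aligned} \tag{5.76}$$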
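
The vanishing of the combination of structure constants in eq. (5.76) can be checked by brute
force in a simple case. The sketch below uses SU(2), for which the structure constants are the
Levi-Civita symbol; this choice of group and the function names are assumptions made for
illustration only (for SU(3), one would substitute the corresponding table of structure
constants).

```python
from itertools import product


def f_su2(a, b, c):
    """Structure constants of SU(2): the Levi-Civita symbol, indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2


def jacobi_combination(a, b, d, e, f=f_su2, dim=3):
    """Return sum_c [ -f^{ace} f^{cbd} + f^{abc} f^{cde} - f^{adc} f^{cbe} ],
    i.e. the bracket underbraced in eq. (5.76)."""
    return sum(-f(a, c, e) * f(c, b, d)
               + f(a, b, c) * f(c, d, e)
               - f(a, d, c) * f(c, b, e)
               for c in range(dim))


def jacobi_identity_holds(dim=3):
    """Check that the combination vanishes for every choice of free indices."""
    return all(jacobi_combination(a, b, d, e, dim=dim) == 0
               for a, b, d, e in product(range(dim), repeat=4))
```

Since the structure constants are integers in this example, the check is exact.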

The terms with the derivative ∂µ read

$$\begin{aligned}
&-\tfrac{1}{2}\,g\,f^{acd}\,\partial_\mu\big(\chi_c\chi_d\big)-g\,f^{abc}\,\big(\partial_\mu\chi_c\big)\chi_b\\
&\qquad=\tfrac{1}{2}\,g\,f^{abc}\,\Big[\partial_\mu(\chi_c\chi_b)\underbrace{-(\partial_\mu\chi_c)\chi_b+(\partial_\mu\chi_b)\chi_c}_{-(\partial_\mu\chi_c)\chi_b-\chi_c(\partial_\mu\chi_b)\;=\;-\partial_\mu(\chi_c\chi_b)}\Big]=0\;.
\end{aligned} \tag{5.77}$$

The double transformation of the ghost field also vanishes

$$Q^2_{\rm BRST}\chi_a=\frac{g^2}{4}\,\underbrace{\big(f^{abc}f^{bde}+f^{acb}f^{bde}\big)}_{0}\,\chi_c\chi_d\chi_e\;. \tag{5.78}$$

Therefore, the prescription (5.74) for the BRST transformation of a ghost field leads
to

$$Q^2_{\rm BRST}\psi=0\;,\qquad Q^2_{\rm BRST}A^a_\mu=0\;,\qquad Q^2_{\rm BRST}\chi_a=0\;. \tag{5.79}$$

We need now to specify the BRST transformation of the antighost field. Note that in
the path integral that gives the Faddeev-Popov determinant, the ghost and antighost
fields are treated as independent; therefore the BRST transformation of the antighost
does not have to be related to that of the ghost. Let us denote:

$$Q_{\rm BRST}\bar\chi_a(x)\equiv B_a(x)\;, \tag{5.80}$$

where Ba (x) is a commuting field. For QBRST to be nilpotent, we must have in addition:

$$Q_{\rm BRST}B_a(x)=0\;. \tag{5.81}$$

(And of course $Q^2_{\rm BRST}B_a(x)=0$.)


Consider now a local function Ξ of all the fields (including Ba ), and add its BRST
variation to the Yang-Mills and Dirac Lagrangians:

$$\mathcal{L}\equiv\underbrace{\mathcal{L}_{\rm YM}+\mathcal{L}_{\rm D}}_{\text{BRST-invariant}}+Q_{\rm BRST}\,\Xi\;. \tag{5.82}$$

Since QBRST is nilpotent, this Lagrangian is BRST-invariant. Let us choose

$$\Xi\equiv\bar\chi_a(x)\,\Big[\frac{1}{2\xi}\,B^a(x)+G^a(A(x))\Big]\;, \tag{5.83}$$

where ξ is a parameter and Ga (A) is the gauge fixing function. We can write¹⁰

$$\begin{aligned}
Q_{\rm BRST}\,\Xi&=\big(Q_{\rm BRST}\bar\chi_a\big)\Big[\frac{1}{2\xi}\,B^a+G^a\Big]-\bar\chi_a\,\Big[\frac{1}{2\xi}\,Q_{\rm BRST}B^a+\frac{\partial G^a}{\partial A^b_\mu}\,Q_{\rm BRST}A^b_\mu\Big]\\
&=\frac{1}{2\xi}\,B^aB^a+B^aG^a+\underbrace{\bar\chi_a\,\frac{\partial G^a}{\partial A^b_\mu}\,\big(-D^{\rm adj}_\mu\big)_{bc}\,\chi_c}_{\mathcal{L}_{\rm FPG}}\;.
\end{aligned} \tag{5.84}$$

10 Note that a minus sign arises when moving QBRST through the anti-commuting field $\bar\chi_a$.

Note that the last term is nothing but the Faddeev-Popov part of the Lagrangian we
have derived earlier in this chapter. Moreover, the field Ba enters only quadratically
in this Lagrangian. Therefore, the path integral on Ba can be performed trivially¹¹,

$$\int\big[DB^a(x)\big]\;e^{\,i\int d^4x\,\left(\frac{1}{2\xi}B^aB^a+B^aG^a\right)}=e^{-i\,\frac{\xi}{2}\int d^4x\;G^aG^a}\;. \tag{5.85}$$
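
The exponent on the right-hand side of eq. (5.85) is simply the value of the quadratic form at
its stationary point, as a pointwise calculation shows. The following sketch (with hypothetical
function names, and treating B, G and ξ as ordinary rational numbers at a single spacetime
point) solves B/ξ + G = 0 and evaluates the quadratic form there.

```python
from fractions import Fraction


def exponent(b, g, xi):
    """Pointwise integrand of the exponent in eq. (5.85): B^2/(2 xi) + B G."""
    return b * b / (2 * xi) + b * g


def stationary_point(g, xi):
    """Solve d/dB [B^2/(2 xi) + B G] = B/xi + G = 0."""
    return -xi * g


def stationary_value(g, xi):
    """Value of the quadratic form at its stationary point, expected to be -xi G^2 / 2."""
    return exponent(stationary_point(g, xi), g, xi)
```

For instance, with G = 3 and ξ = 2 the stationary point is B = −6 and the value is −9, which is
indeed −ξ G²/2. Shifting B away from the stationary point by h changes the value by h²/(2ξ),
with no linear term, as expected for a Gaussian integral.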

Therefore, after integrating out the auxiliary field Ba , the resulting theory has exactly
the same effective Lagrangian as the one resulting from the Faddeev-Popov procedure:

$$\mathcal{L}_{\rm eff}=\mathcal{L}_{\rm YM}+\mathcal{L}_{\rm D}-\frac{\xi}{2}\,G^aG^a+\bar\chi_a\,\frac{\partial G^a}{\partial A^b_\mu}\,\big(-D^{\rm adj}_\mu\big)_{bc}\,\chi_c\;. \tag{5.86}$$

The formal construction we have followed in this section proves that Leff is BRST
invariant, but in a somewhat obfuscated manner after the auxiliary field Ba has been
integrated out. The BRST invariance of eq. (5.86) is realized if we define the BRST
variation of the antighost field as follows

$$Q_{\rm BRST}\bar\chi_a=-\xi\,G_a\;, \tag{5.87}$$

which is reminiscent of the relationship between Ba and Ga when we do the Gaussian
integration on Ba .

5.6.3 BRST current and charge


The Lagrangian (5.82), with the choice (5.83) for the function Ξ, possesses the
following symmetries:

• Global gauge invariance (because all the colour indices are contracted).
• BRST invariance.
• Ghost number conservation, if we assign a ghost number +1 to χ’s and −1 to
  $\bar\chi$’s.

The BRST invariance implies the existence of a conserved current:

$$J^\mu_{\rm BRST}\equiv\sum_{\Phi\in\{A_\mu,\psi,\chi,\bar\chi,B\}}\frac{\partial\mathcal{L}}{\partial\big(\partial_\mu\Phi\big)}\;Q_{\rm BRST}\Phi\;. \tag{5.88}$$

From the 0-th component of this current, we may obtain the BRST charge

$$Q_{\rm BRST}\equiv\int d^3\mathbf{x}\;J^0_{\rm BRST}(x^0,\mathbf{x})\;. \tag{5.89}$$

11 Note that this is equivalent to evaluating the argument of the exponential at the stationary point
Ba = −ξ Ga , since the stationary phase approximation is exact for Gaussian integrals.

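In fact, this charge generates the BRST transformation in the following sense:

$$i\,\big[Q_{\rm BRST},\Phi\big]_\pm=Q_{\rm BRST}\Phi\qquad\big(\Phi\in\{A_\mu,\psi,\chi,\bar\chi,B\}\big)\;, \tag{5.90}$$

where [·, ·]± is a commutator if Φ is a commuting field and an anti-commutator if
Φ is anti-commuting. If we consider free fields (i.e. we set g = 0), and we Fourier
decompose all the fields that appear in the (anti)-commutation relations (5.90),

$$\begin{aligned}
A^\mu_a(x)&\equiv\sum_{\lambda=1,2,+,-}\int\frac{d^3\mathbf{p}}{(2\pi)^3\,2|\mathbf{p}|}\;\Big[\epsilon^\mu_\lambda(p)\,a^\dagger_{a\lambda p}\,e^{+ip\cdot x}+\epsilon^{\mu*}_\lambda(p)\,a_{a\lambda p}\,e^{-ip\cdot x}\Big]\\
\psi(x)&\equiv\sum_{s=\pm}\int\frac{d^3\mathbf{p}}{(2\pi)^3\,2E_p}\;\Big[d^\dagger_{sp}\,v_s(p)\,e^{+ip\cdot x}+b_{sp}\,u_s(p)\,e^{-ip\cdot x}\Big]\\
\chi_a(x)&\equiv\int\frac{d^3\mathbf{p}}{(2\pi)^3\,2|\mathbf{p}|}\;\Big[\alpha^\dagger_{ap}\,e^{+ip\cdot x}+\alpha_{ap}\,e^{-ip\cdot x}\Big]\\
\bar\chi_a(x)&\equiv\int\frac{d^3\mathbf{p}}{(2\pi)^3\,2|\mathbf{p}|}\;\Big[\beta^\dagger_{ap}\,e^{+ip\cdot x}+\beta_{ap}\,e^{-ip\cdot x}\Big]\;,
\end{aligned} \tag{5.91}$$

we obtain

$$\begin{aligned}
\big[Q_{\rm BRST},a^\dagger_{a\lambda p}\big]_\pm&\propto\delta_{\lambda+}\,\alpha^\dagger_{ap}\;,\\
\big[Q_{\rm BRST},\alpha_{ap}\big]_\pm&=0\;,\\
\big[Q_{\rm BRST},\beta_{ap}\big]_\pm&\propto a^\dagger_{a-p}\;,\\
\big[Q_{\rm BRST},b^\dagger_{sp}\big]_\pm&=\big[Q_{\rm BRST},d^\dagger_{sp}\big]_\pm=0\;.
\end{aligned} \tag{5.92}$$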

5.6.4 BRST cohomology, Physical states and Unitarity

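The fact that the BRST charge is nilpotent, $Q^2_{\rm BRST}=0$, has profound implications on
the states of the system. The kernel of QBRST is the set of states annihilated by QBRST ,

$${\rm Ker}\big(Q_{\rm BRST}\big)\equiv\Big\{\big|\psi\big\rangle\;\Big|\;Q_{\rm BRST}\big|\psi\big\rangle=0\Big\}\;. \tag{5.93}$$

The set of states that can be obtained by the action of QBRST on another state is called
the image of QBRST ,

$${\rm Im}\big(Q_{\rm BRST}\big)\equiv\Big\{Q_{\rm BRST}\big|\psi\big\rangle\Big\}\;. \tag{5.94}$$

Because QBRST is nilpotent, the image is a subset of the kernel,

$${\rm Im}\big(Q_{\rm BRST}\big)\subset{\rm Ker}\big(Q_{\rm BRST}\big)\;. \tag{5.95}$$

Note that states in the image cannot be physical states, because they have a null norm:
for a state $|\psi\rangle=Q_{\rm BRST}|\phi\rangle$,

$$\big\langle\psi\big|\psi\big\rangle=\big\langle\phi\big|\underbrace{Q_{\rm BRST}\,Q_{\rm BRST}}_{0}\big|\phi\big\rangle=0\;. \tag{5.96}$$

Consider now the following equivalence relationship between states in the kernel:
two states are considered equivalent if their difference is in the image,

$$\big|\psi\big\rangle\sim\big|\psi'\big\rangle\quad\text{if}\quad\big|\psi\big\rangle-\big|\psi'\big\rangle\in{\rm Im}\big(Q_{\rm BRST}\big)\;. \tag{5.97}$$

The cohomology of QBRST is the set of classes of equivalent states,

$$H\big(Q_{\rm BRST}\big)\equiv{\rm Ker}\big(Q_{\rm BRST}\big)\,/\,{\rm Im}\big(Q_{\rm BRST}\big)\;. \tag{5.98}$$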
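It turns out that the physical states are the elements of the cohomology with non-zero
norm¹². Indeed, using eqs. (5.92), it is easy to prove that if $|\psi\rangle$ is a state in the
cohomology, then

$$\begin{aligned}
a^\dagger_{a\{1,2\}p}\,\big|\psi\big\rangle&\in H\big(Q_{\rm BRST}\big)\\
b^\dagger_{sp}\,\big|\psi\big\rangle&\in H\big(Q_{\rm BRST}\big)\\
d^\dagger_{sp}\,\big|\psi\big\rangle&\in H\big(Q_{\rm BRST}\big)\;,
\end{aligned} \tag{5.99}$$

while

$$\begin{aligned}
a^\dagger_{a\pm p}\,\big|\psi\big\rangle&\notin H\big(Q_{\rm BRST}\big)\\
\alpha^\dagger_{ap}\,\big|\psi\big\rangle&\notin H\big(Q_{\rm BRST}\big)\\
\beta^\dagger_{ap}\,\big|\psi\big\rangle&\notin H\big(Q_{\rm BRST}\big)\;.
\end{aligned} \tag{5.100}$$

In other words, adding to the state a physical particle (gluon with a physical polarization,
or quark or antiquark) gives another state in the cohomology, while adding
to the state a nonphysical quantum (gluon with a non-physical polarization, ghost or
antighost) takes the state out of the cohomology.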
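
The structure Im ⊂ Ker and the counting of cohomology classes can be illustrated on a
finite-dimensional toy model. In the sketch below, which is not the actual BRST charge but a
hypothetical 4 × 4 nilpotent matrix, the basis vector e0 is mapped to e1 and all other basis
vectors are annihilated: e0 is outside the kernel, e1 spans the image (the analogue of a
zero-norm state), and e2, e3 play the role of physical states. The dimension of the cohomology
is dim Ker − dim Im.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    found = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(found, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(found + 1, len(rows)):
            factor = rows[r][col] / rows[found][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[found])]
        found += 1
    return found


def is_nilpotent(q):
    """Check that the square matrix q satisfies q^2 = 0."""
    n = len(q)
    return all(sum(q[i][k] * q[k][j] for k in range(n)) == 0
               for i in range(n) for j in range(n))


def cohomology_dimension(q):
    """Return dim Ker(q) - dim Im(q) for a nilpotent square matrix q."""
    if not is_nilpotent(q):
        raise ValueError("q must square to zero")
    dim_image = rank(q)
    dim_kernel = len(q) - dim_image  # rank-nullity theorem
    return dim_kernel - dim_image


# Toy operator acting on column vectors: Q e0 = e1, Q e1 = Q e2 = Q e3 = 0.
TOY_Q = [
    [0, 0, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
```

Here `cohomology_dimension(TOY_Q)` counts two independent classes, corresponding to e2 and e3:
the pair (e0, e1) drops out, in the same way as the unphysical quanta do in eq. (5.100).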

Furthermore, since the effective Lagrangian is BRST invariant, it corresponds
to a Hamiltonian H that commutes with QBRST . Therefore, a state in the kernel (i.e.
for which $Q_{\rm BRST}|\psi\rangle=0$) stays in the kernel under the time evolution generated by
this Hamiltonian. Moreover, the time evolution preserves the norm, so that
states in the cohomology stay in the cohomology at all times. Consequently, starting from
a physical state, the time evolution cannot produce unphysical objects in the final
state. This explains why unphysical modes cancel in the final states sum in the optical
theorem, despite the fact that the internal lines of Feynman graphs may propagate all
sorts of unphysical excitations.

12 This restriction is necessary, because one of the classes in $H(Q_{\rm BRST})$ is ${\rm Im}(Q_{\rm BRST})$ itself, that we
know has only zero-norm states.
Chapter 6

Renormalization of
gauge theories

6.1 Ultraviolet power counting


Before studying in more detail the renormalizability of gauge theories, one may
assess the plausibility of this renormalizability by calculating the superficial degree
of ultraviolet divergence of graphs in such a theory. Furthermore, this will guide us
regarding which classes of graphs may contain divergences. For simplicity, we will
consider here a pure Yang-Mills theory, without matter fields (keeping fermions would
force us to distinguish the fermion propagators from the gluon and ghost propagators
in the counting, because they have different behaviours at large momentum, but
would not change the final conclusion). Note that the gluon propagator decreases as
$(\text{momentum})^{-2}$ in the ultraviolet, both in covariant and strict axial gauge. This is
also the behaviour of the ghost propagator¹. Moreover, the 3-gluon vertex and the
gluon-ghost-antighost vertex have the same scaling with momentum. Therefore, we
need not distinguish in the ultraviolet power counting the ghosts and the gluons. Thus,
let us consider a generic connected graph G with the following list of propagators and
vertices:

• nE external lines (gluons or ghosts),
• nI internal lines (gluons or ghosts),
• n3 trivalent vertices (3-gluon or gluon-ghost-antighost),
• n4 four-gluon vertices,
• nL loops.

1 In strict axial gauge, the ghost propagator behaves differently, but the ghosts decouple completely
from the gluons.

These quantities are related by the following identities:

nE + 2nI = 3n3 + 4n4 (6.1)


nL = nI − (n3 + n4 ) + 1 . (6.2)

The first equation states that each vertex must have all its “handles” attached to the
endpoint of a propagator, and the second equation counts the number of internal
momenta that are not determined by energy momentum conservation. In terms of
these parameters, the ultraviolet degree of divergence of this graph (in four space-time
dimensions) is

ω(G) = 4nL − 2nI + n3 . (6.3)

Note that each trivalent vertex contains one power of momentum and therefore
contributes +1 to this counting. Adding eq. (6.1) and four times eq. (6.2), we obtain

ω(G) = 4 − nE , (6.4)

that does not depend on any of the internal details of the graph. Moreover, the only
functions that have intrinsic ultraviolet divergences are the 2-point, 3-point and 4-point
functions, which suggests that Yang-Mills theories may indeed be renormalizable.
However, a Yang-Mills theory is not simply the addition of gluon and ghost kinetic
terms, 3- and 4-gluon vertices, and a ghost-antighost-gluon vertex: all these terms
of the Lagrangian are tightly constrained by gauge symmetry. For instance (but this
is not the only constraint), all the vertices depend on a unique coupling constant g.
Therefore, in order to establish the renormalizability of Yang-Mills theories, one
needs to prove that the structure of the divergences in the above listed functions is
such that they can be absorbed into a redefinition of the classical Lagrangian that does
not upset these tight constraints (up to a renormalization of the fields).
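
The counting of eqs. (6.1) to (6.4) is easy to automate. The sketch below (function names are
hypothetical) takes the numbers of trivalent vertices, four-gluon vertices and internal lines of
a connected graph, deduces the numbers of external lines and loops from eqs. (6.1) and (6.2), and
evaluates the superficial degree of divergence (6.3). Combinations that would give a negative
number of external lines or loops do not correspond to any graph and are skipped.

```python
def graph_counts(n3, n4, n_internal):
    """Return (n_external, n_loops, omega) for a connected graph, using
    eq. (6.1): nE + 2 nI = 3 n3 + 4 n4,
    eq. (6.2): nL = nI - (n3 + n4) + 1,
    eq. (6.3): omega = 4 nL - 2 nI + n3."""
    n_external = 3 * n3 + 4 * n4 - 2 * n_internal
    n_loops = n_internal - (n3 + n4) + 1
    omega = 4 * n_loops - 2 * n_internal + n3
    return n_external, n_loops, omega


def check_power_counting(max_vertices=4, max_internal=8):
    """Check omega = 4 - nE, eq. (6.4), on all admissible small combinations."""
    for n3 in range(max_vertices + 1):
        for n4 in range(max_vertices + 1):
            for n_internal in range(max_internal + 1):
                n_external, n_loops, omega = graph_counts(n3, n4, n_internal)
                if n_external < 0 or n_loops < 0:
                    continue  # not a valid graph
                if omega != 4 - n_external:
                    return False
    return True
```

For example, the one-loop gluon self-energy built from two 3-gluon vertices has n3 = 2, n4 = 0
and nI = 2, so that `graph_counts(2, 0, 2)` gives nE = 2, nL = 1 and ω = 2 = 4 − nE.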

6.2 Symmetries of the quantum effective action

6.2.1 Linearly realized symmetries


After fixing the gauge with the Faddeev-Popov procedure, we have obtained the
following effective Lagrangian:

$$\mathcal{L}_{\rm eff}=\mathcal{L}_{\rm YM}+\mathcal{L}_{\rm D}-\frac{\xi}{2}\,G^aG^a+\bar\chi_a\,\frac{\partial G^a}{\partial A^b_\mu}\,\big(-D^{\rm adj}_\mu\big)_{bc}\,\chi_c\;. \tag{6.5}$$

Although the local gauge invariance of the Yang-Mills Lagrangian is now broken (this
was precisely the goal of the gauge fixing procedure), this effective Lagrangian has a
number of symmetries. One of them is the BRST symmetry, that we have exhibited
in the previous chapter. In addition, Leff has the following symmetries:

• Ghost number conservation : the effective Lagrangian is invariant under global
  phase transformations of the ghost and antighost,

  $$\chi\to e^{i\alpha}\,\chi\;,\qquad\bar\chi\to e^{-i\alpha}\,\bar\chi\;. \tag{6.6}$$

  Therefore, if we assign a ghost number +1 to the field χ and −1 to the field $\bar\chi$,
  this quantity is conserved by the Feynman rules of the gauge fixed theory.
• Global gauge invariance : since all colour indices are contracted in the effective
Lagrangian, it is invariant under gauge transformations that do not depend on
spacetime.
• Lorentz invariance is of course also present in the effective Lagrangian.

For these three symmetries, the infinitesimal variation of the fields is linear in the fields
(which is not the case of the BRST symmetry). These linearly realized symmetries of
the classical action are inherited directly by the quantum effective action.
In order to prove this assertion, let us consider a generic infinitesimal
transformation of the fields

φn (x) → φn (x) + ε Fn [x; φ] , (6.7)

where φ1 , φ2 , · · · denote the various fields of the theory (gauge fields, ghosts, ...) and
Fn [x; φ] is a local function of the fields (for now, we do not assume that it is linear in
the fields). We assume that both the classical action and the functional measure are
invariant under this symmetry. Consider now the generating functional Z[j],

$$Z[j]\equiv\int\big[D\phi_n(x)\big]\;e^{\,i\left(S[\phi_n]+\int d^4x\;j_n(x)\phi_n(x)\right)}\;, \tag{6.8}$$

where there is one external source jn for each field φn . Since φn (x) is a dummy
integration variable in this path integral, we should obtain the same result after
performing the change of variable (6.7). Using the fact that this transformation
preserves the measure and the classical action, this implies that

$$\begin{aligned}
Z[j]&=\int\big[D\phi_n(x)\big]\;e^{\,i\left(S[\phi_n]+\int d^4x\;j_n(x)\phi_n(x)+\varepsilon\int d^4x\;j_n(x)F_n[x;\phi]\right)}\\
&\approx Z[j]+i\,\varepsilon\int\big[D\phi_n(x)\big]\;e^{\,i\left(S[\phi_n]+\int d^4x\;j_n(x)\phi_n(x)\right)}\int d^4x\;j_n(x)\,F_n[x;\phi]\;.
\end{aligned} \tag{6.9}$$

Therefore, for any sources jn , we must have

$$\int d^4x\;j_n(x)\,\big\langle F_n[x;\phi(x)]\big\rangle_j=0\;, \tag{6.10}$$

where $\langle\cdots\rangle_j$ denotes the quantum average in the presence of an external source j,

$$\big\langle\mathcal{O}[\phi]\big\rangle_j\equiv\frac{1}{Z[j]}\int\big[D\phi_n(x)\big]\;e^{\,i\left(S[\phi_n]+\int d^4x\;j_n(x)\phi_n(x)\right)}\;\mathcal{O}[\phi]\;. \tag{6.11}$$

(We have normalized it so that $\langle 1\rangle_j=1$.) Recall now that the sources and fields can
be related implicitly by using the quantum effective action:

$$j_{n;\phi}(x)=-\frac{\delta\Gamma[\phi]}{\delta\phi_n(x)}\;. \tag{6.12}$$

Therefore, the condition (6.10) is equivalent to

$$\int d^4x\;\big\langle F_n[x;\phi(x)]\big\rangle_{j_\phi}\;\frac{\delta\Gamma[\phi]}{\delta\phi_n(x)}=0\;, \tag{6.13}$$

now satisfied for any fields φn . This is known as the Slavnov-Taylor identity. In
other words, the functional Γ [φ] is invariant under the transformation

$$\phi_n(x)\to\phi_n(x)+\varepsilon\,\big\langle F_n[x;\phi]\big\rangle_{j_\phi}\;. \tag{6.14}$$

It is crucial to note that, because the quantum average in the right hand side is
performed with the external source jn;φ that depends implicitly on the fields φn , this
is a priori not the same transformation as in eq. (6.7).

Let us now consider the special case of a transformation of type (6.7) which is
linear in the fields. In this case, we may write

$$F_n[x;\phi]=\int d^4y\;f_{nm}(x,y)\,\phi_m(y)\;. \tag{6.15}$$

(In most practical cases, the transformation will be local and the coefficients proportional
to δ(x − y), but this restriction is not necessary for the following argument.)
For such a linear transformation, we have

$$\big\langle F_n[x;\phi]\big\rangle_{j_\phi}=\int d^4y\;f_{nm}(x,y)\,\big\langle\phi_m(y)\big\rangle_{j_\phi}\;. \tag{6.16}$$

Recalling that jφ is the configuration of the source j such that the quantum average
$\langle\phi(x)\rangle_j$ precisely equals φ(x), this in fact reads

$$\big\langle F_n[x;\phi]\big\rangle_{j_\phi}=F_n[x;\phi]\;. \tag{6.17}$$

It is this last step that fails when Fn is nonlinear in the fields. From eq. (6.17), we
see that the transformations (6.14) and (6.7) are identical. We have thus proven that
all linearly realized symmetries of the classical action are also symmetries of the
quantum effective action.

6.2.2 BRST symmetry and Zinn-Justin equation

Since an infinitesimal BRST variation is not linear in the fields, the BRST symmetry
of the classical action is not inherited so simply by the quantum effective action.
Instead, it leads to a set of identities that may be viewed as the analogue of Ward
identities for the BRST invariance. Their derivation follows the method of the section
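3.4.2. Since we need to apply a BRST transformation to the Yang-Mills path integral,
we should first study how this transformation affects the measure $[DA_\mu D\chi D\bar\chi]$.
Under such a transformation, the fields transform into

$$\begin{aligned}
A^a_\mu&\to A^{a\prime}_\mu\equiv A^a_\mu+\vartheta\,\big(D^{\rm adj}_\mu\big)_{ab}\,\chi_b=A^a_\mu+\vartheta\,\big(\partial_\mu\delta_{ab}+g\,f^{abc}A^c_\mu\big)\,\chi_b\\
\chi_a&\to\chi'_a\equiv\chi_a-\frac{\vartheta}{2}\,g\,f^{abc}\,\chi_b\chi_c\\
\bar\chi_a&\to\bar\chi'_a\equiv\bar\chi_a+\vartheta\,B_a=\bar\chi_a-\xi\,\vartheta\,G_a\;,
\end{aligned} \tag{6.18}$$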

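where ϑ is a Grassmann constant. The Jacobian matrix has the following block
structure:

$$\frac{\partial\big(A^{a\prime}_\mu,\chi'_a,\bar\chi'_a\big)}{\partial\big(A^b_\nu,\chi_b,\bar\chi_b\big)}=\delta(x-y)\begin{pmatrix}
\delta_\mu{}^\nu\,\big(\delta_{ab}-g\,\vartheta\,f^{abc}\chi_c\big)&*&0\\
0&\delta_{ab}+g\,\vartheta\,f^{abc}\chi_c&0\\
-\xi\,\vartheta\,\dfrac{\partial G^a}{\partial A^b_\nu}&0&\delta_{ab}
\end{pmatrix}\;, \tag{6.19}$$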
where the ∗ denotes a non-zero element that we do not need to calculate because it
does not contribute to the determinant. From this structure, we see that the determinant
is given by the product of the diagonal elements, and is therefore equal to 1 (recall
that ϑ2 = 0).
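In the derivation, it is convenient to introduce sources $j^\mu_a,\bar\eta_a,\eta_a$ that couple
respectively to $A^a_\mu,\chi_a,\bar\chi_a$, but also two extra sources that couple directly to $Q_{\rm BRST}A^a_\mu$
and $Q_{\rm BRST}\chi_a$:

$$\begin{aligned}
Z[j,\bar\eta,\eta;\zeta,\kappa]&\equiv\int\big[DA_\mu D\chi D\bar\chi\big]\;\exp\Big\{i\int d^4x\;\Big(\mathcal{L}_{\rm eff}+j^\mu_aA^a_\mu+\bar\eta_a\chi_a+\bar\chi_a\eta_a\\
&\qquad\qquad\qquad\qquad+\zeta^\mu_a\,Q_{\rm BRST}A^a_\mu-\kappa_a\,Q_{\rm BRST}\chi_a\Big)\Big\}\\
&=\int\big[DA_\mu D\chi D\bar\chi\big]\;\exp\Big\{i\int d^4x\;\mathcal{L}_{\rm tot}\Big\}\;,
\end{aligned} \tag{6.20}$$

where we use the shorthand Ltot for the sum of terms inside the exponential. Note
that the coefficients of the new sources $\zeta^\mu_a$ and κa are BRST invariant since the
BRST transformation is nilpotent. Let us now perform a BRST transformation of the
integration variables inside the path integral. This is just a change of variables, that
does not change the value of the path integral. Using the fact that the measure and the
Lagrangian Leff are BRST invariant, we obtain

$$\begin{aligned}
Z[j,\bar\eta,\eta;\zeta,\kappa]&=\int\big[DA_\mu D\chi D\bar\chi\big]\;\exp\Big\{i\int d^4x\;\mathcal{L}_{\rm tot}\Big\}\\
&\quad\times\Big[1+i\int d^4x\;\Big(j^\mu_a\,\vartheta\,Q_{\rm BRST}A^a_\mu+\bar\eta_a\,\vartheta\,Q_{\rm BRST}\chi_a+\big(\vartheta\,Q_{\rm BRST}\bar\chi_a\big)\,\eta_a\Big)\Big]\\
&=Z[j,\bar\eta,\eta;\zeta,\kappa]+i\,\vartheta\int d^4x\;\Big[j^\mu_a(x)\,\frac{\delta Z}{i\delta\zeta^\mu_a(x)}+\bar\eta_a(x)\,\frac{\delta Z}{i\delta\kappa_a(x)}\\
&\qquad\qquad\qquad\qquad-\xi\,G^a\Big(\frac{\delta Z}{i\delta j(x)}\Big)\,\eta_a(x)\Big]\;.
\end{aligned} \tag{6.21}$$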

(Note that ϑ anticommutes with ηa .) Therefore, we conclude that

$$\int d^4x\;\Big[j^\mu_a(x)\,\frac{\delta Z}{i\delta\zeta^\mu_a(x)}+\bar\eta_a(x)\,\frac{\delta Z}{i\delta\kappa_a(x)}-\xi\,G^a\Big(\frac{\delta Z}{i\delta j(x)}\Big)\,\eta_a(x)\Big]=0\;. \tag{6.22}$$

This is one of the forms of the conservation identities. In this derivation, we see that
having introduced sources specifically coupled to the BRST variation of the gauge
field $A^a_\mu$ and of the ghost χa avoided the need for terms with higher order derivatives
(indeed, these variations are non-linear in the fields, and would have required more
derivatives to be expressed as functional derivatives with respect to sources coupled
to elementary fields). By writing Z = exp(W), we see that the same identity applies
to W,

$$\int d^4x\;\Big[j^\mu_a(x)\,\frac{\delta W}{i\delta\zeta^\mu_a(x)}+\bar\eta_a(x)\,\frac{\delta W}{i\delta\kappa_a(x)}-\xi\,G^a\Big(\frac{\delta W}{i\delta j(x)}\Big)\,\eta_a(x)\Big]=0\;. \tag{6.23}$$

(Here, we have assumed that the gauge fixing function is linear in the gauge field.)

The next step is to convert this into an identity for the quantum effective action
Γ that generates the 1PI graphs. In this transformation, we will keep the auxiliary
sources ζa
µ and κa unmodified, as parameters. Thus, Γ and W are related by

−i W[j, η, η; ζ, κ] = Γ [A, χ, χ; ζ, κ]
Z  
+ d4 x jµ a a a
a (x)Aµ (x) + χa (x)η (x) + η (x)χa (x) . (6.24)
6. R ENORMALIZATION OF GAUGE THEORIES 217

Fields and sources are related by the following quantum equations of motion:

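$$
\begin{aligned}
\frac{\delta\Gamma}{\delta A^a_\mu(x)} + j^\mu_a(x) &= 0 , \\
\frac{\delta\Gamma}{\delta\chi_a(x)} + \bar\eta_a(x) &= 0 , \\
\frac{\delta\Gamma}{\delta\bar\chi_a(x)} + \eta_a(x) &= 0 ,
\end{aligned}
\tag{6.25}
$$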

and we also have

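$$
\begin{aligned}
\frac{\delta W}{\delta j^\mu_a(x)} &= i\,A^a_\mu(x) , \\
\frac{\delta W}{\delta\zeta^\mu_a(x)} &= i\,\frac{\delta\Gamma}{\delta\zeta^\mu_a(x)} , \\
\frac{\delta W}{\delta\kappa_a(x)} &= i\,\frac{\delta\Gamma}{\delta\kappa_a(x)} .
\end{aligned}
\tag{6.26}
$$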

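Therefore, the conservation identity expressed in terms of the functional $\Gamma$ reads

$$
\int d^4x\,\Big( \frac{\delta\Gamma}{\delta A^a_\mu(x)}\frac{\delta\Gamma}{\delta\zeta^\mu_a(x)}
 + \frac{\delta\Gamma}{\delta\chi_a(x)}\frac{\delta\Gamma}{\delta\kappa_a(x)}
 - \xi\, G^a(A)\,\frac{\delta\Gamma}{\delta\bar\chi_a(x)}\Big) = 0 . \tag{6.27}
$$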

This equation can be simplified a bit as follows. By inserting a derivative $\delta/\delta\bar\chi_a(x)$
under the integral in the definition (6.21) of $Z$, we obtain zero since we now have the
integral of a total derivative. Recalling that the Faddeev-Popov term in the effective
Lagrangian is

$$
\mathcal{L}_{\rm FPG} = \bar\chi_a\,\frac{\partial G^a}{\partial A^b_\mu}\,\big(-D^{\rm adj}_\mu\big)_{bc}\,\chi_c , \tag{6.28}
$$

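we can perform this derivative explicitly to obtain

$$
0 = \int DA_\mu\, D\chi\, D\bar\chi\;
\Big[ \frac{\partial G^a}{\partial A^b_\mu}\,
\underbrace{\big(-D^{\rm adj}_\mu\big)_{bc}\,\chi_c(x)}_{-Q_{\rm BRST}A^b_\mu(x)\;=\;i\,\delta/\delta\zeta^\mu_b(x)}
 + \eta_a(x)\Big]\; e^{\,i\int d^4x\,\mathcal{L}_{\rm tot}} . \tag{6.29}
$$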

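This implies the following functional identity

$$
\Big[ \eta_a(x) + i\,\frac{\partial G^a}{\partial A^b_\mu}\,\frac{\delta}{\delta\zeta^\mu_b(x)}\Big]\, Z = 0 , \tag{6.30}
$$

or equivalent identities for $W$ or $\Gamma$:

$$
\eta_a(x) + i\,\frac{\partial G^a}{\partial A^b_\mu}\,\frac{\delta W}{\delta\zeta^\mu_b(x)} = 0 ,
\qquad
\frac{\delta\Gamma}{\delta\bar\chi_a(x)} + \frac{\partial G^a}{\partial A^b_\mu}\,\frac{\delta\Gamma}{\delta\zeta^\mu_b(x)} = 0 . \tag{6.31}
$$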

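Furthermore, define a slightly modified effective action, written here in boldface to
distinguish it from $\Gamma$:

$$
\boldsymbol{\Gamma} \equiv \Gamma + \frac{\xi}{2}\int d^4x\; G^a(A)\,G^a(A) . \tag{6.32}
$$

Now the BRST conservation identity takes the following more compact form, known
as the Zinn-Justin equation:

$$
\int d^4x\,\Big( \frac{\delta\boldsymbol{\Gamma}}{\delta A^a_\mu(x)}\frac{\delta\boldsymbol{\Gamma}}{\delta\zeta^\mu_a(x)}
 + \frac{\delta\boldsymbol{\Gamma}}{\delta\chi_a(x)}\frac{\delta\boldsymbol{\Gamma}}{\delta\kappa_a(x)}\Big) = 0 , \tag{6.33}
$$

from which any explicit reference to the gauge fixing function $G^a(A)$ has disappeared,
as well as the coupling constant $g$.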
Eq. (6.33) applies to the full quantum effective action, that encapsulates the results
from all-order perturbation theory. In the next section, we will show that this identity
(combined with the other symmetries of the effective action) completely constrains
the structure of its local terms of dimension less than or equal to four, forcing them to
be identical to those in the classical action (up to a rescaling of the fields and of the
coupling constant).

6.3 Renormalizability

6.3.1 Constraints on the counterterms

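By taking the $\hbar \to 0$ limit in eq. (6.33), one immediately concludes that it is also
satisfied by the classical action, $S$, supplemented with ghosts as well as the sources
$\zeta^\mu_a$ and $\kappa_a$:

$$
S[A,\chi,\bar\chi;\zeta,\kappa] = \int d^4x\,\Big[ -\frac{1}{4}\,F^{\mu\nu}_a F^a_{\mu\nu}
 + \big(\zeta^\mu_a + \partial^\mu\bar\chi_a\big)\big(D^{\rm adj}_\mu\big)_{ab}\,\chi_b
 + \frac{g}{2}\, f_{abc}\,\kappa_a\chi_b\chi_c \Big] . \tag{6.34}
$$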

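By introducing the following compact notation for any two functionals $\mathcal{A}$ and $\mathcal{B}$,

$$
\big\{\mathcal{A},\mathcal{B}\big\} \equiv \int d^4x\,\Big(
 \frac{\delta\mathcal{A}}{\delta A^a_\mu(x)}\frac{\delta\mathcal{B}}{\delta\zeta^\mu_a(x)}
 + \frac{\delta\mathcal{A}}{\delta\chi_a(x)}\frac{\delta\mathcal{B}}{\delta\kappa_a(x)}\Big) , \tag{6.35}
$$

we therefore have

$$
\big\{S,S\big\} = 0 , \qquad \big\{\boldsymbol{\Gamma},\boldsymbol{\Gamma}\big\} = 0 . \tag{6.36}
$$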

The first equation may be viewed as a constraint on the terms that can appear in the
classical action, while the second equation constrains which divergences may appear
in higher orders.
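Let us now write the effective action as a loop expansion,

$$
\boldsymbol{\Gamma} \equiv S + \sum_{l=1}^{\infty} \boldsymbol{\Gamma}_l , \tag{6.37}
$$

where $S$ is given in eq. (6.34), and the subsequent terms $\boldsymbol{\Gamma}_l$ are of order $l$ in $\hbar$. The
Zinn-Justin equation at order $L$ thus reads (with $\boldsymbol{\Gamma}_0 \equiv S$)

$$
\sum_{p+q=L} \big\{\boldsymbol{\Gamma}_p,\boldsymbol{\Gamma}_q\big\} = 0 . \tag{6.38}
$$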

The renormalization procedure amounts to correcting the classical action $S$ order by
order with counterterms,

$$
S \to S_{(L)} , \tag{6.39}
$$

such that $S_{(L)}$ contains counterterms up to order $L$, and gives finite $\boldsymbol{\Gamma}_l$'s for $l \le L$ (but
in general not beyond the order $L$).
The first step is to prove that it is possible to find counterterms such that the
equation $\{S,S\} = 0$ is preserved at every order. Let us assume that we have achieved
this up to the order $L-1$. All $\boldsymbol{\Gamma}_l$ for $l \le L-1$ are now finite, while $\boldsymbol{\Gamma}_L$ still contains
a divergent part, that we denote $\boldsymbol{\Gamma}_{L,{\rm div}}$. We can rewrite the Zinn-Justin equation at
order $L$ as follows,

$$
\big\{S,\boldsymbol{\Gamma}_L\big\} + \big\{\boldsymbol{\Gamma}_L,S\big\}
 = -\sum_{l=1}^{L-1} \big\{\boldsymbol{\Gamma}_l,\boldsymbol{\Gamma}_{L-l}\big\} . \tag{6.40}
$$

Only the left hand side contains divergences, and we therefore have

$$
\big\{S,\boldsymbol{\Gamma}_{L,{\rm div}}\big\} + \big\{\boldsymbol{\Gamma}_{L,{\rm div}},S\big\} = 0 , \tag{6.41}
$$

which constrains the structure of the divergences at order $L$. A natural candidate for
the counterterm at order $L$ is to simply add $-\boldsymbol{\Gamma}_{L,{\rm div}}$ to the classical action,

$$
S \to S - \boldsymbol{\Gamma}_{L,{\rm div}} , \tag{6.42}
$$

since this automatically cancels the superficial divergence of $\boldsymbol{\Gamma}_L$ without affecting
anything in the lower orders. However, this modified classical action does not obey
exactly the Zinn-Justin equation, since

$$
\big\{S-\boldsymbol{\Gamma}_{L,{\rm div}},\,S-\boldsymbol{\Gamma}_{L,{\rm div}}\big\}
 = \underbrace{\big\{S,S\big\}}_{0\ \text{from lower orders}}
 - \underbrace{\Big[\big\{S,\boldsymbol{\Gamma}_{L,{\rm div}}\big\} + \big\{\boldsymbol{\Gamma}_{L,{\rm div}},S\big\}\Big]}_{0\ \text{from eq. (6.41)}}
 + \underbrace{\big\{\boldsymbol{\Gamma}_{L,{\rm div}},\boldsymbol{\Gamma}_{L,{\rm div}}\big\}}_{\neq 0} . \tag{6.43}
$$

Note that the non-zero term in the right hand side is of order strictly greater than $L$. It
is possible to make it vanish by adding to the shift of eq. (6.42) some terms of higher
order than $L$, that do not change anything for any order $\le L$. The conclusion of this
inductive argument is that one can shift the classical action at each order in such a
way that the divergences in $\boldsymbol{\Gamma}$ are canceled, while always preserving $\{S,S\} = 0$.

6.3.2 Allowed terms in the classical action


The second step in the discussion of the renormalization of Yang-Mills theory is to
determine the terms that are allowed in the classical action. This action must satisfy
the constraint $\{S,S\} = 0$, as well as Lorentz invariance, global gauge symmetry and
ghost number conservation. In addition, from the power counting of section 6.1
and Weinberg’s theorem, we know that all the ultraviolet divergences in Yang-Mills
theory will occur in local operators of dimension 4 at most.
In order to discuss the form of the allowed terms, let us first list the mass dimension
and ghost number of the various fields that enter in $S$:

field            $A^\mu_a$   $\chi_a$   $\bar\chi_a$   $\zeta^\mu_a$   $\kappa_a$
mass dimension       1           1           1               2              2
ghost number         0          +1          -1              -1             -2
All the allowed terms in S must obey the following conditions:

• mass dimension 4 or less,


• ghost number 0,
• Lorentz invariance,
• global gauge invariance.

In addition, eq. (6.31) implies that the $\bar\chi$ and $\zeta$ dependences come in the form of a
dependence on the combination

$$
\zeta^\mu - \frac{\partial G}{\partial A_\mu}\,\bar\chi = \zeta^\mu + \partial^\mu\bar\chi , \tag{6.44}
$$

where in the right hand side we have assumed the covariant gauge condition $G(A) =
\partial_\mu A^\mu$ and anticipated an integration by parts. Finally, the Zinn-Justin equation
$\{S,S\} = 0$ must be satisfied.
Since the sources $\zeta^\mu_a$ and $\kappa_a$ have mass dimension 2, at most two of them may
appear. However, terms with two such sources cannot contain any other field since
the mass dimension 4 is already reached, and they cannot have ghost number zero.
Therefore, $S$ can only contain terms that have degree 0 or 1 in $\zeta^\mu_a$ and $\kappa_a$.
The source $\zeta^\mu_a$ must be multiplied by a combination of fields that has
one Lorentz index, one colour index, mass dimension at most 2, and ghost number
+1. The only operators that fulfill these conditions are

$$
f_{abc}\,\zeta^\mu_a A^b_\mu \chi_c \qquad\text{and}\qquad \zeta^\mu_a\,\partial_\mu\chi_a . \tag{6.45}
$$

Once the dependence on $\zeta^\mu_a$ is fixed, the dependence on the antighosts will be
completely known from eq. (6.44). Likewise, $\kappa_a$ must be combined with an object that
has one colour index, mass dimension at most 2 and ghost number +2. The only
possibility is

$$
f_{abc}\,\kappa_a\chi_b\chi_c . \tag{6.46}
$$
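The counting argument above can be cross-checked by brute force. The following Python sketch enumerates all monomials built from the fields of the table (plus a derivative, counted with mass dimension 1 and ghost number 0) that contain at least one source, have mass dimension at most 4 and ghost number 0. It is only a bookkeeping aid: Lorentz and colour contractions are not checked, and the names `BLOCKS`, `SOURCES` and `candidate_source_terms` are illustrative choices, not notation from the text.

```python
from itertools import combinations_with_replacement

# (mass dimension, ghost number) of each building block; "d" stands for a derivative
BLOCKS = {
    "A": (1, 0),
    "chi": (1, +1),
    "chibar": (1, -1),
    "zeta": (2, -1),
    "kappa": (2, -2),
    "d": (1, 0),
}
SOURCES = {"zeta", "kappa"}


def candidate_source_terms(max_dim=4):
    """List monomials with >= 1 source, mass dimension <= max_dim, ghost number 0."""
    found = []
    # every block has dimension >= 1, so at most max_dim blocks can appear
    for n in range(1, max_dim + 1):
        for combo in combinations_with_replacement(sorted(BLOCKS), n):
            dim = sum(BLOCKS[b][0] for b in combo)
            ghost = sum(BLOCKS[b][1] for b in combo)
            n_sources = sum(b in SOURCES for b in combo)
            if dim <= max_dim and ghost == 0 and n_sources >= 1:
                found.append(combo)
    return found


if __name__ == "__main__":
    for term in candidate_source_terms():
        print(" ".join(term))
```

Under these assumptions the search returns only four candidates, each containing exactly one source: `chi zeta`, `A chi zeta`, `chi d zeta` and `chi chi kappa`. The first one carries a free Lorentz index and is therefore excluded by Lorentz invariance, which the script does not test; the remaining three are the field contents of eqs. (6.45) and (6.46).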

From the information gathered so far, the classical action must have the following
general form:

$$
S[A,\chi,\bar\chi;\zeta,\kappa] = \Sigma[A] + \int d^4x\,\Big[
 g\alpha\, f_{abc}\,\big(\zeta^\mu_a + \partial^\mu\bar\chi_a\big) A^b_\mu \chi_c
 + \beta\,\big(\zeta^\mu_a + \partial^\mu\bar\chi_a\big)\,\partial_\mu\chi_a
 + \frac{\gamma}{2}\, f^{abc}\,\kappa_a\chi_b\chi_c \Big] , \tag{6.47}
$$

where $\alpha, \beta, \gamma$ are three arbitrary constants. The term $\Sigma$ cannot depend on the sources
$\zeta^\mu_a$ and $\kappa_a$ because we have already constructed explicitly all the allowed terms that
contain these sources, and cannot depend on $\bar\chi$ because the antighost dependence is
already encapsulated in the combination $\zeta^\mu_a + \partial^\mu\bar\chi_a$. A dependence on $\chi$ in $\Sigma$ is also
forbidden because $\chi$ would be the only field in $\Sigma$ with a non-zero ghost number. Our
next step is to constrain the coefficients $g\alpha, \beta, \gamma$ and the functional $\Sigma[A]$ in order to
satisfy the Zinn-Justin equation (6.33). The functional derivatives that enter in (6.33)
are given by:
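$$
\begin{aligned}
\frac{\delta S}{\delta A^\mu_a} &= \frac{\delta\Sigma}{\delta A^\mu_a}
 - g\alpha\, f_{abc}\,\big(\zeta_{\mu b} + \partial_\mu\bar\chi_b\big)\,\chi_c , \\
\frac{\delta S}{\delta\zeta_{\mu a}} &= g\alpha\, f_{ade}\, A^\mu_d\chi_e + \beta\,\partial^\mu\chi_a , \\
\frac{\delta S}{\delta\chi_a} &= g\alpha\, f_{abc}\,\big(\zeta_{\mu b} + \partial_\mu\bar\chi_b\big) A^\mu_c
 + \beta\,\big(\zeta_{\mu a} + \partial_\mu\bar\chi_a\big)\overleftarrow{\partial}^\mu
 + \gamma\, f^{abc}\,\kappa_b\chi_c , \\
\frac{\delta S}{\delta\kappa_a} &= \frac{\gamma}{2}\, f^{ade}\,\chi_d\chi_e .
\end{aligned}
\tag{6.48}
$$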
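
Thus, the Zinn-Justin equation reads

$$
\begin{aligned}
0 = \int d^4x\,\Big[\;
 & \frac{\delta\Sigma}{\delta A^\mu_a}\,\Big( g\alpha\, f^{ade} A^\mu_d\chi_e + \beta\,\partial^\mu\chi_a \Big) \\
 & + \big(\zeta_{\mu b} + \partial_\mu\bar\chi_b\big)\,\Big(
   - g\alpha\, f_{abc}\,\chi_c\,\big( g\alpha\, f_{ade} A^\mu_d\chi_e + \beta\,\partial^\mu\chi_a \big)
   + \big( g\alpha\, f_{abc} A^\mu_c + \beta\,\delta_{ab}\,\partial^\mu \big)\,\frac{\gamma}{2}\, f^{ade}\chi_d\chi_e \Big) \\
 & + \frac{\gamma^2}{2}\, f^{abc} f^{ade}\,\kappa_b\chi_c\chi_d\chi_e \;\Big] .
\end{aligned}
\tag{6.49}
$$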

Using the Jacobi identity satisfied by the structure constants, one may first check
that the last term, in $\kappa\chi\chi\chi$, is identically zero, and therefore does not provide any
constraint. Consider now the terms in $\zeta A\chi\chi$:

$$
g\alpha\,\zeta_{\mu b}\,\Big( - g\alpha\, f_{abc} f_{ade}\, A^\mu_d\chi_c\chi_e
 + \frac{\gamma}{2}\,\underbrace{f^{abc} f^{ade}}_{-f^{abd}f^{aec} - f^{abe}f^{acd}}\, A^\mu_c\chi_d\chi_e \Big)
 = g\alpha\,(\gamma - g\alpha)\, f^{abc} f^{ade}\,\zeta_{\mu b} A^\mu_d\chi_c\chi_e . \tag{6.50}
$$

Since this is the only term containing this combination of fields, it cannot be canceled
by other terms, and therefore we must have

$$
g\alpha = \gamma . \tag{6.51}
$$
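The Jacobi identity used under the brace of eq. (6.50) is easy to test numerically. The sketch below does so for su(3), building the totally antisymmetric structure constants from their standard non-zero values in the Gell-Mann basis (an input quoted from standard references, not from this section). The function names are illustrative. The same cyclic identity is what makes the $\kappa\chi\chi\chi$ term vanish, because the product of three anticommuting ghosts is totally antisymmetric in its colour indices.

```python
import math
from itertools import permutations, product


def perm_sign(p):
    """Sign of a permutation given as a tuple of 0..n-1 (parity of inversions)."""
    sign = 1
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:
                sign = -sign
    return sign


def su3_structure_constants():
    """Totally antisymmetric f[a][b][c] of su(3), Gell-Mann basis, indices 0..7."""
    r = math.sqrt(3.0) / 2.0
    independent = {
        (1, 2, 3): 1.0,
        (1, 4, 7): 0.5, (1, 5, 6): -0.5,
        (2, 4, 6): 0.5, (2, 5, 7): 0.5,
        (3, 4, 5): 0.5, (3, 6, 7): -0.5,
        (4, 5, 8): r, (6, 7, 8): r,
    }
    f = [[[0.0] * 8 for _ in range(8)] for _ in range(8)]
    for abc, value in independent.items():
        for p in permutations(range(3)):
            a, b, c = abc[p[0]] - 1, abc[p[1]] - 1, abc[p[2]] - 1
            f[a][b][c] = perm_sign(p) * value
    return f


def max_jacobi_violation(f):
    """max over (b,c,d,e) of |f_abc f_ade + f_abd f_aec + f_abe f_acd| (sum over a)."""
    n = len(f)
    worst = 0.0
    for b, c, d, e in product(range(n), repeat=4):
        total = sum(
            f[a][b][c] * f[a][d][e]
            + f[a][b][d] * f[a][e][c]
            + f[a][b][e] * f[a][c][d]
            for a in range(n)
        )
        worst = max(worst, abs(total))
    return worst


if __name__ == "__main__":
    print(max_jacobi_violation(su3_structure_constants()))
```

The printed number should vanish up to floating-point rounding.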

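Let us now study the terms in $\zeta\chi(\partial\chi)$,

$$
\beta\,\zeta_{\mu b}\,\Big( - g\alpha\, f_{abc}\,\chi_c\,\partial^\mu\chi_a
 + \frac{\gamma}{2}\, f_{bde}\,\partial^\mu(\chi_d\chi_e) \Big)
 = \beta\,(\gamma - g\alpha)\, f_{bac}\,\zeta_{\mu b}\,(\partial^\mu\chi_a)\,\chi_c . \tag{6.52}
$$

Thus, the cancellation of this term does not bring any additional constraint beyond
eq. (6.51). At this point, all the terms containing $\zeta^\mu_a$ have been canceled (and by
extension also the terms with $\partial_\mu\bar\chi_a$), and the Zinn-Justin equation reduces to

$$
0 = \int d^4x\; \frac{\delta\Sigma}{\delta A^\mu_a}\,\Big( g\alpha\, f_{acb}\, A^\mu_c\chi_b + \beta\,\partial^\mu\chi_a \Big) . \tag{6.53}
$$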

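Let us first rewrite the second factor as follows

$$
\beta\,\Big( \partial^\mu\delta_{ab} - i\,g\alpha\beta^{-1}\,\underbrace{(-i f_{cab})\,A^\mu_c}_{(A^\mu_{\rm adj})_{ab}} \Big)\,\chi_b , \tag{6.54}
$$

and note that it has the structure of an adjoint covariant derivative acting on $\chi_b$,

$$
\big(D^\mu_{\rm adj}\big)_{ab} \equiv \partial^\mu\delta_{ab} - i\,g\alpha\beta^{-1}\,\big(A^\mu_{\rm adj}\big)_{ab} . \tag{6.55}
$$

Thus, eq. (6.53) is equivalent to

$$
0 = \int d^4x\; \frac{\delta\Sigma}{\delta A^\mu_a}\,\big(D^\mu_{\rm adj}\big)_{ab}\,\chi_b . \tag{6.56}
$$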
The second factor may be viewed as the variation of the gauge field under an
infinitesimal gauge transformation,

$$
A^\mu_a \to A^\mu_a + \vartheta\,\big(D^\mu_{\rm adj}\big)_{ab}\,\chi_b , \tag{6.57}
$$

where we have introduced a constant Grassmann variable $\vartheta$ to make the second term
a commuting object. Therefore, for the integral to be zero for an arbitrary $\chi_b(x)$, the
functional $\Sigma[A]$ must be invariant under this transformation. Recalling our discussion
of the local gauge invariant operators of mass dimension four or less, we conclude
that the only possible form for $\Sigma$ is

$$
\Sigma[A] = -\frac{\delta}{4}\int d^4x\; F^{\mu\nu}_a F^a_{\mu\nu} , \tag{6.58}
$$

where $F^{\mu\nu}$ is the field strength constructed with the covariant derivative $D^\mu$ and $\delta$
another constant. Given all the above constraints, we must have

$$
S[A,\chi,\bar\chi;\zeta,\kappa] = \int d^4x\,\Big[ -\frac{\delta}{4}\, F^{\mu\nu}_a F^a_{\mu\nu}
 + \beta\,\big(\zeta^\mu_a + \partial^\mu\bar\chi_a\big)\big(D^{\rm adj}_\mu\big)_{ab}\,\chi_b
 + \frac{g\alpha}{2}\, f_{abc}\,\kappa_a\chi_b\chi_c \Big] . \tag{6.59}
$$

Up to rescalings of the various fields and of the coupling constant $g$, this is structurally
identical to the bare classical action of eq. (6.34). Note that this equation implies
that the field renormalization factors for the gauge field $A^\mu_a$ and for the source $\kappa_a$ are
equal, $Z_A = Z_\kappa$.

6.4 Background field method

6.4.1 Rescaled fields

In this section, we describe the calculation of the one-loop quantum corrections to the
coupling constant by a method based on the quantum effective action combined with
the so-called background field method.
The first step of this method is to rescale the gauge field by the inverse of the
coupling constant:

$$
g\,A^\mu \to A^\mu . \tag{6.60}
$$

By doing this, the various objects that appear in the Yang-Mills action are transformed
as follows:

$$
\begin{aligned}
F^{\mu\nu} &\to \frac{1}{g}\,\Big( \partial^\mu A^\nu - \partial^\nu A^\mu - i\,[A^\mu, A^\nu] \Big) , \\
D^\mu &\to \partial^\mu - i\,A^\mu .
\end{aligned}
\tag{6.61}
$$

In other words, up to a rescaling in the case of the field strength $F^{\mu\nu}$, these objects
are transformed into their counterparts for a coupling equal to unity. In the rest of this
section, the notation $A^\mu$, $D^\mu$, $F^{\mu\nu}$ will refer to the rescaled quantities. In terms of the
rescaled fields, the Yang-Mills action simply reads

$$
S_{\rm YM} = -\frac{1}{4g^2}\int d^4x\; \underbrace{F^{\mu\nu}_a F^a_{\mu\nu}}_{\text{no } g} , \tag{6.62}
$$

where all the dependence on the coupling constant appears now in the prefactor $g^{-2}$.
This action has a local non-Abelian gauge invariance analogous to the original one,
but with $g = 1$:

$$
A_\mu \to A^\Omega_\mu \equiv \Omega^\dagger A_\mu\,\Omega + i\,\Omega^\dagger\partial_\mu\Omega . \tag{6.63}
$$

6.4.2 Background field gauge


The background field method consists in choosing a background field $\mathcal{A}^a_\mu(x)$, and in
writing the gauge field $A^a_\mu(x)$ as a deviation around this background

$$
A^a_\mu \equiv \mathcal{A}^a_\mu + a^a_\mu . \tag{6.64}
$$

In this decomposition, the background field $\mathcal{A}_\mu$ is not a dynamical field: it will just
act as a parameter that we shall not quantize, and the path integration is thus only on
the deviation $a^a_\mu$ (one may thus view this as a shift of the integration variable). In
terms of $\mathcal{A}^a_\mu$ and $a^a_\mu$, the field strength that enters in the Yang-Mills action can be
written as

$$
F^{\mu\nu} = \mathcal{F}^{\mu\nu} + \big( \partial^\mu a^\nu - i\,[\mathcal{A}^\mu, a^\nu] \big)
 - \big( \partial^\nu a^\mu - i\,[\mathcal{A}^\nu, a^\mu] \big) - i\,[a^\mu, a^\nu] , \tag{6.65}
$$

where $\mathcal{F}^{\mu\nu}$ is the field strength constructed with the background field. With explicit
colour indices, this reads

$$
F^{\mu\nu}_a = \mathcal{F}^{\mu\nu}_a + \big(\mathcal{D}^\mu_{\rm adj}\big)_{ab}\, a^\nu_b
 - \big(\mathcal{D}^\nu_{\rm adj}\big)_{ab}\, a^\mu_b + f^{abc}\, a^\mu_b a^\nu_c , \tag{6.66}
$$

where $\mathcal{D}^\mu_{\rm adj} = \partial^\mu - i\,[\mathcal{A}^\mu, \cdot\,]$ is the adjoint covariant derivative associated to the
background field $\mathcal{A}^\mu$. If we view the background field as a constant, the original
gauge transformation on $A^\mu$ corresponds to the following transformation on $a^\mu$,

$$
a^\mu \to \Omega^\dagger a^\mu\,\Omega + \Omega^\dagger \mathcal{A}^\mu\,\Omega - \mathcal{A}^\mu + i\,\Omega^\dagger\partial^\mu\Omega . \tag{6.67}
$$

If we parameterize $\Omega = \exp(i\theta_a t^a)$ and expand to first order in $\theta_a$, an infinitesimal
gauge transformation of $a^\mu_a$ reads

$$
a^\mu_a \to a^\mu_a - \big(\mathcal{D}^\mu_{\rm adj}\big)_{ab}\,\theta_b + f^{abc}\,\theta_b\, a^\mu_c . \tag{6.68}
$$

This invariance leads to the same pathologies as in the original theory, and we
must fix the gauge in order to have a well defined path integral. The background field
gauge corresponds to the following condition on $a^a_\mu$,

$$
G^a(A) \equiv \big(\mathcal{D}^\mu_{\rm adj}\big)_{ab}\, a^b_\mu = \omega^a . \tag{6.69}
$$

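Let us recall that a gauge fixing function $G^a(A)$ leads to the following terms in the
effective Lagrangian:

$$
\begin{aligned}
\mathcal{L}_{\rm GF} &= -\frac{\xi}{2g^2}\, G^a(A)\,G^a(A) && \text{(gauge fixing term)} , \\
\mathcal{L}_{\rm FPG} &= -\bar\chi_a\,\frac{\partial G^a}{\partial A^b_\mu}\,\big(D^{\rm adj}_\mu\big)_{bc}\,\chi_c && \text{(Faddeev-Popov ghosts)} .
\end{aligned}
\tag{6.70}
$$

With the choice of eq. (6.69), the Faddeev-Popov term becomes

$$
\mathcal{L}_{\rm FPG} = -\bar\chi_a\,\big(\mathcal{D}^\mu_{\rm adj}\big)_{ab}\big(D^{\rm adj}_\mu\big)_{bc}\,\chi_c
 = \big(\mathcal{D}^\mu_{\rm adj}\bar\chi\big)_a\,\big(D^{\rm adj}_\mu\chi\big)_a , \tag{6.71}
$$

where in the second equality we have anticipated an integration by parts and used the
notation $(D^{\rm adj}_\mu\chi)_a \equiv (D^{\rm adj}_\mu)_{ab}\,\chi_b$ (and a similar notation for $(\mathcal{D}^\mu_{\rm adj}\bar\chi)_a$).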

6.4.3 Residual symmetry of the gauge fixed Lagrangian


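The effective Lagrangian $\mathcal{L}_{\rm YM} + \mathcal{L}_{\rm GF} + \mathcal{L}_{\rm FPG}$ possesses a residual gauge symmetry
that corresponds to gauge transforming in the same way the background field $\mathcal{A}^\mu$ and
the total field $A^\mu$,

$$
\begin{aligned}
\mathcal{A}^\mu &\to \Omega^\dagger \mathcal{A}^\mu\,\Omega + i\,\Omega^\dagger\partial^\mu\Omega , \\
A^\mu &\to \Omega^\dagger A^\mu\,\Omega + i\,\Omega^\dagger\partial^\mu\Omega .
\end{aligned}
\tag{6.72}
$$

Indeed, under this joint transformation we have

$$
\begin{aligned}
a^\mu &\to \Omega^\dagger a^\mu\,\Omega , \\
D^\mu &\to \Omega^\dagger D^\mu\,\Omega , \\
\mathcal{D}^\mu &\to \Omega^\dagger \mathcal{D}^\mu\,\Omega , \\
\chi &\to \Omega^\dagger \chi , \\
\bar\chi &\to \bar\chi\,\Omega , \\
G(A) &\to \Omega^\dagger G(A)\,\Omega .
\end{aligned}
\tag{6.73}
$$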

From this, we conclude that the gauge fixing Lagrangian $\mathcal{L}_{\rm GF}$ and the Faddeev-Popov
Lagrangian $\mathcal{L}_{\rm FPG}$ are both invariant under this transformation, as well as the Yang-Mills
Lagrangian. Since the path integration measure over $a^\mu$, $\chi$, $\bar\chi$ is also invariant under
this transformation, the result of the path integral must be invariant under local gauge
transformations of the background field $\mathcal{A}^\mu$.

6.4.4 One-loop running coupling

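Let us now turn to the calculation of the quantum effective action at one-loop. For
this, we use the results of section 2.6.5, where we have shown that these one-loop
corrections are obtained by expanding the classical action to quadratic order
in deviations with respect to a background field, and by performing the resulting
Gaussian path integration with respect to the deviations (which gives a functional
determinant).
The first step is to expand the three terms of the gauge fixed Lagrangian to second
order in the deviation $a^\mu$. In this calculation, we choose the gauge fixing parameter
$\xi = 1$. The quadratic terms in the combined Yang-Mills and gauge fixing terms read

$$
\begin{aligned}
\mathcal{L}_{\rm YM} + \mathcal{L}_{\rm GF}
 &= -\frac{1}{2g^2}\,\Big\{ \frac{1}{2}\,\Big( \big(\mathcal{D}^\mu_{\rm adj}a^\nu\big)_a - \big(\mathcal{D}^\nu_{\rm adj}a^\mu\big)_a \Big)^2
   + f^{abc}\,\mathcal{F}^{\mu\nu}_a\, a^b_\mu a^c_\nu
   + \big(\mathcal{D}^\mu_{\rm adj}a_\mu\big)_a^2 \Big\} \\
 &= -\frac{1}{2g^2}\; a^a_\mu\,\Big[ -\big(\mathcal{D}^2_{\rm adj}\big)_{ac}\, g^{\mu\nu} - 2 f^{abc}\,\mathcal{F}^{\mu\nu}_b \Big]\, a^c_\nu \\
 &= -\frac{1}{2g^2}\; a^a_\mu\,\Big[ -\big(\mathcal{D}^2_{\rm adj}\big)_{ac}\, g^{\mu\nu}
   + \big(\mathcal{F}^{\rho\sigma}_{\rm adj}\big)_{ac}\,\big(M^{(1)}_{\rho\sigma}\big)^{\mu\nu} \Big]\, a^c_\nu ,
\end{aligned}
\tag{6.74}
$$

where we have introduced $(M^{(1)}_{\rho\sigma})^{\mu\nu} \equiv i\,(\delta_\rho{}^\mu\delta_\sigma{}^\nu - \delta_\rho{}^\nu\delta_\sigma{}^\mu)$, the generators of the
Lorentz transformations for 4-vectors (the Lorentz transformation corresponding to
the transformation parameters $\omega^{\rho\sigma}$ reads $\Lambda^{\mu\nu} = \exp\big(\tfrac{i}{2}\,\omega^{\rho\sigma}(M^{(1)}_{\rho\sigma})^{\mu\nu}\big)$). For the
ghost term, the quadratic part is

$$
\mathcal{L}_{\rm FPG} = \bar\chi_a\,\Big[ -\big(\mathcal{D}^2_{\rm adj}\big)_{ab} \Big]\,\chi_b . \tag{6.75}
$$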

Note that the operator that appears between the two ghost fields is the spin-0 analogue
of the one that appears in eq. (6.74), since the generators of Lorentz transformations
for spin-0 objects are identically zero ($M^{(0)}_{\rho\sigma} \equiv 0$). Although we have not considered
fermions so far in this chapter, the Dirac Lagrangian would give a contribution equal
to the determinant of $i\slashed{\mathcal{D}}$, or equivalently the square root of the determinant of $(i\slashed{\mathcal{D}})^2$.
We have

$$
\begin{aligned}
\big(i\slashed{\mathcal{D}}\big)^2 &= -\mathcal{D}^2 + i\,\frac{i}{2}\,[\gamma^\mu,\gamma^\nu]\,\mathcal{D}_\mu\mathcal{D}_\nu \\
 &= -\mathcal{D}^2 + \mathcal{F}^{\rho\sigma}\, M^{(1/2)}_{\rho\sigma} ,
\end{aligned}
\tag{6.76}
$$

where the $M^{(1/2)}_{\rho\sigma} \equiv \tfrac{i}{4}\,[\gamma_\rho,\gamma_\sigma]$ are the generators of Lorentz transformations for
spin-1/2 fields. Note that the covariant derivatives and the field strength are in
the fundamental representation (assuming fermions that transform according to the
fundamental representation, like quarks). Therefore, for each of the fields that appear
in the quantum effective action (gauge fields, ghosts, fermions), we get a determinant
$\Delta_{r,s}$ of an operator containing $-\mathcal{D}^2$ (in the representation $r$ corresponding to the field
under consideration) plus a “spin connection”² made of the contraction of the field
strength with the Lorentz generators corresponding to the spin $s$ of the field:

$$
\begin{aligned}
\text{gauge fields}: \quad & \Delta_{{\rm adj},s=1} \equiv \det\Big( -\mathcal{D}^2_{\rm adj} + \mathcal{F}^{\rho\sigma}_{\rm adj}\, M^{(1)}_{\rho\sigma} \Big) , \\
\text{ghosts}: \quad & \Delta_{{\rm adj},s=0} \equiv \det\Big( -\mathcal{D}^2_{\rm adj} + \underbrace{\mathcal{F}^{\rho\sigma}_{\rm adj}\, M^{(0)}_{\rho\sigma}}_{=0} \Big) , \\
\text{fermions}: \quad & \Delta_{f,s=1/2} \equiv \det\Big( -\mathcal{D}^2_f + \mathcal{F}^{\rho\sigma}_f\, M^{(1/2)}_{\rho\sigma} \Big) .
\end{aligned}
\tag{6.77}
$$

In terms of these determinants, the 1-loop quantum effective action is given by

$$
\Gamma[\mathcal{A},\chi,\psi] = S_r + \Delta S + \frac{i}{2}\,\ln\Delta_{{\rm adj},s=1}
 - \frac{i\,n_f}{2}\,\ln\Delta_{f,s=1/2} - i\,\ln\Delta_{{\rm adj},s=0} , \tag{6.78}
$$

where $\Delta S$ denotes the 1-loop counterterms, and $n_f$ is the number of fermion flavours.
Using the invariance with respect to local gauge transformations of the background
field, we must have

$$
\ln\Delta_{r,s} = \frac{i}{4}\, C_{r,s}\int d^4x\; \mathcal{F}^{\mu\nu}_a \mathcal{F}^a_{\mu\nu} + \cdots , \tag{6.79}
$$

where the dots represent higher dimensional gauge invariant operators. Being of
dimension higher than four, these operators do not contribute to the renormalization
of the coupling. The constant $C_{r,s}$ depends on the group representation $r$ and spin $s$
of the field. These coefficients are ultraviolet divergent,

$$
C_{r,s} = c_{r,s}\,\ln\frac{\Lambda^2}{\kappa^2} , \tag{6.80}
$$

where $\Lambda$ is an ultraviolet scale and $\kappa$ the typical scale of inhomogeneities of the
background field. After combining them with the counterterms from $\Delta S$, the ultraviolet
scale is replaced by a renormalization scale $\mu$,

$$
C_{r,s} \to C_{r,s} = c_{r,s}\,\ln\frac{\mu^2}{\kappa^2} . \tag{6.81}
$$

² This term describes the coupling between the magnetic moment of the particle and the background
field. Its detailed form depends on the spin of the particle.

From eq. (6.78), we see that the 1-loop renormalized coupling at the scale $\mu$ and the
bare coupling must be related by

$$
\begin{aligned}
\frac{1}{g_b^2} &= \frac{1}{g_r^2(\mu)} + \frac{1}{2}\, C_{{\rm adj},1} - \frac{n_f}{2}\, C_{f,1/2} - C_{{\rm adj},0} \\
 &= \frac{1}{g_r^2(\mu)} + \Big( \frac{1}{2}\, c_{{\rm adj},1} - \frac{n_f}{2}\, c_{f,1/2} - c_{{\rm adj},0} \Big)\,\ln\frac{\mu^2}{\kappa^2} .
\end{aligned}
\tag{6.82}
$$

The explicit calculation of the constants $c_{r,s}$ requires expanding the logarithm of the
functional determinants to second order in the background field strength $\mathcal{F}^{\mu\nu}$. Thanks
to the organization of eqs. (6.77), this calculation needs to be performed only once,
for generic gauge group and Lorentz representations. This leads to

$$
c_{r,s} = \frac{1}{(4\pi)^2}\,\Big[ \frac{1}{3}\, d(s) - 4\,C(s) \Big]\, N(r) , \tag{6.83}
$$

where $d(s)$ is the number of spin components (respectively 1, 4, 4 for scalars,
fermions, and vector particles), $C(s)$ is the normalization of the trace of two Lorentz
generators³,

$$
{\rm tr}\,\Big( M^{(s)}_{\rho\sigma}\, M^{(s)}_{\alpha\beta} \Big) = C(s)\,\big( g_{\rho\alpha}\, g_{\sigma\beta} - g_{\rho\beta}\, g_{\sigma\alpha} \big) , \tag{6.84}
$$

and $N(r)$ is the normalization of the trace of two generators of the Lie algebra in
representation $r$,

$$
{\rm tr}\,\big( t^a_r\, t^b_r \big) = N(r)\,\delta_{ab} . \tag{6.85}
$$

For the fundamental and adjoint representations of su($N$), we have $N(f) = \tfrac{1}{2}$ and
$N({\rm adj}) = N$. Therefore, the constants involved in the 1-loop running coupling are

$$
c_{{\rm adj},0} = \frac{N}{3(4\pi)^2} , \qquad
c_{{\rm adj},1} = -\frac{20\,N}{3(4\pi)^2} , \qquad
c_{f,1/2} = -\frac{4}{3(4\pi)^2} , \tag{6.86}
$$

and the coupling evolves according to

$$
\frac{1}{g_r^2(\mu)} = \frac{1}{g_b^2} + \frac{1}{(4\pi)^2}\,
\underbrace{\Big( \frac{11}{3}\,N - \frac{2}{3}\,n_f \Big)}_{>0\ \text{for}\ n_f < \frac{11 N}{2}}\,\ln\frac{\mu^2}{\kappa^2} . \tag{6.87}
$$

³ For spin 0, 1/2 and 1, this constant is respectively 0, 1 and 2.

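The arithmetic leading from eq. (6.83) to eqs. (6.86) and (6.87) can be cross-checked with exact rational numbers. The sketch below takes the values of $d(s)$, $C(s)$ and $N(r)$ quoted in the text as inputs; the function names and the dictionary `SPIN_DATA` are illustrative, and the overall factor $(4\pi)^{-2}$ is left out.

```python
from fractions import Fraction

# spin label -> (d(s), C(s)): number of spin components and Lorentz trace normalization
SPIN_DATA = {"0": (1, 0), "1/2": (4, 1), "1": (4, 2)}


def c_times_16pi2(spin, N_r):
    """(4*pi)^2 * c_{r,s} from eq. (6.83), for a group trace normalization N_r = N(r)."""
    d, C = SPIN_DATA[spin]
    return (Fraction(d, 3) - 4 * C) * Fraction(N_r)


def one_loop_coefficient(N, nf):
    """(4*pi)^2 times the coefficient of ln(mu^2/kappa^2) in 1/g_r^2(mu), from eq. (6.82)."""
    c_adj1 = c_times_16pi2("1", N)                # gauge fields, N(adj) = N
    c_adj0 = c_times_16pi2("0", N)                # ghosts, N(adj) = N
    c_f = c_times_16pi2("1/2", Fraction(1, 2))    # fermions, N(f) = 1/2
    return -(Fraction(1, 2) * c_adj1 - Fraction(nf, 2) * c_f - c_adj0)


if __name__ == "__main__":
    N, nf = 3, 6
    print(c_times_16pi2("0", N), c_times_16pi2("1", N), c_times_16pi2("1/2", Fraction(1, 2)))
    print(one_loop_coefficient(N, nf))  # equals 11*N/3 - 2*nf/3
```

For $N = 3$ and $n_f = 6$ the combination evaluates to 7, which is the positive coefficient $11N/3 - 2n_f/3$ of eq. (6.87).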

Given two scales $\mu$ and $\mu_0$, the renormalized couplings at these scales are related by

$$
\frac{1}{g_r^2(\mu)} - \frac{1}{g_r^2(\mu_0)} = \frac{1}{(4\pi)^2}\,\Big( \frac{11}{3}\,N - \frac{2}{3}\,n_f \Big)\,\ln\frac{\mu^2}{\mu_0^2} , \tag{6.88}
$$

which may be rewritten as

$$
g_r^2(\mu) = \frac{g_r^2(\mu_0)}{1 + \dfrac{g_r^2(\mu_0)}{(4\pi)^2}\,\Big( \dfrac{11}{3}\,N - \dfrac{2}{3}\,n_f \Big)\,\ln\dfrac{\mu^2}{\mu_0^2}} . \tag{6.89}
$$

In quantum chromodynamics, where the gauge group is SU(3) (i.e. $N = 3$) and where
there are 6 flavours of quarks in the fundamental representation, the coefficient in
front of the logarithm is positive, which indicates that the coupling constant decreases
as the scale $\mu$ increases. The coupling constant in fact goes to zero when $\mu \to \infty$, a
property known as asymptotic freedom. Thanks to the formula (6.83), it would have
been easy to determine the one-loop running of the coupling in the presence of matter
fields in arbitrary representations.
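As a numerical illustration of eqs. (6.88) and (6.89), the following sketch evolves a coupling between two scales. The reference value `g2_ref = 1.5` at `mu_ref = 91.0` is only an illustrative input (an assumption made here, not a number from the text), and the number of flavours is kept fixed at all scales, as in the formulas above.

```python
import math


def b0(N, nf):
    """Coefficient 11 N / 3 - 2 nf / 3 multiplying the logarithm in eq. (6.88)."""
    return (11.0 * N - 2.0 * nf) / 3.0


def g2_running(g2_ref, mu, mu_ref, N=3, nf=6):
    """One-loop g_r^2(mu) from eq. (6.89), given g_r^2(mu_ref) = g2_ref."""
    log = math.log(mu**2 / mu_ref**2)
    return g2_ref / (1.0 + g2_ref / (4.0 * math.pi) ** 2 * b0(N, nf) * log)


if __name__ == "__main__":
    g2_ref, mu_ref = 1.5, 91.0  # illustrative input values
    for mu in (91.0, 1.0e3, 1.0e4, 1.0e6):
        print(mu, g2_running(g2_ref, mu, mu_ref))
```

The printed values decrease monotonically with $\mu$, which is the asymptotic freedom discussed above. Evolving in two steps through an intermediate scale gives the same result as a single step, since eq. (6.88) is additive in the logarithms.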
Chapter 7

Renormalization group

In quantum field theory, the renormalization group refers to a set of tools for in-
vestigating the changes of a system when observed at varying distance scales, akin
to varying the magnifying power of a microscope in order to uncover new features
that were not visible at lesser resolution scales. For renormalizable theories, such a
change of scale merely amounts to a change in a few parameters of the theory (masses,
coupling, field normalization), but the use of the renormalization group is not limited
to this class of theories, as we shall discuss in the last section.

7.1 Callan-Symanzik equations


Let us consider a renormalizable quantum field theory, for instance a scalar theory with
a $\phi^4$ interaction (renormalizable in $d \le 4$ space-time dimensions). For simplicity,
assume first that this field is massless, and denote by $M$ the scale at which the
renormalization conditions are imposed. For instance, these conditions can be chosen
as follows:

$$
\begin{aligned}
& \Gamma^{(4)}(p_1,p_2,p_3,p_4) = -i\lambda \quad
 \text{for } (p_1+p_2)^2 = (p_1+p_3)^2 = (p_1+p_4)^2 = -M^2 , \\
& \Pi(p)\big|_{p^2=-M^2} = 0 , \\
& \frac{d\Pi(p)}{dp^2}\Big|_{p^2=-M^2} = 0 ,
\end{aligned}
\tag{7.1}
$$

where $\Pi(p)$ is the self-energy and $\Gamma^{(4)}$ the 1-particle irreducible 4-point function.
There is a large amount of freedom in the choice of the renormalization conditions.
Two sets of renormalization conditions may correspond to the same physical theory
provided that the bare Green’s functions, expressed in terms of the bare parameters
of the Lagrangian, are identical. Indeed, the renormalization scale $M$ appears only
when we replace the bare field $\phi_b$ by the renormalized field $\phi_r \equiv \phi_b/\sqrt{Z}$ and the
bare coupling constant $\lambda_b$ by the renormalized coupling constant $\lambda_r$. The bare and
renormalized Green’s functions are related by

$$
G^{(n)}_r(x_1,\cdots,x_n) = Z^{-n/2}\, G^{(n)}_b(x_1,\cdots,x_n) . \tag{7.2}
$$

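In order to have the same physical theory, we must change $Z$ and $\lambda$ when varying the
renormalization scale $M$. With such a variation of the scale $M$, we can write:

$$
\frac{dG^{(n)}_r}{dM} = \frac{\partial G^{(n)}_r}{\partial M} + \frac{\partial G^{(n)}_r}{\partial\lambda}\,\frac{\partial\lambda}{\partial M} . \tag{7.3}
$$

On the other hand, we may obtain this derivative from the right hand side of eq. (7.2)
and the fact that bare Green’s functions must remain unchanged:

$$
\frac{dG^{(n)}_r}{dM} = -\frac{n}{2Z}\,\frac{\partial Z}{\partial M}\, G^{(n)}_r . \tag{7.4}
$$

Combining the previous two results, we obtain

$$
\Big[ M\frac{\partial}{\partial M} + \beta\,\frac{\partial}{\partial\lambda} + n\gamma \Big]\, G^{(n)}_r = 0 , \tag{7.5}
$$

where we have defined

$$
\beta \equiv M\,\frac{\partial\lambda}{\partial M} , \qquad \gamma \equiv \frac{M}{2Z}\,\frac{\partial Z}{\partial M} . \tag{7.6}
$$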
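For readers who want the intermediate step spelled out, the combination of eqs. (7.3) and (7.4) can be sketched as follows (each is multiplied by $M$, and the definitions (7.6) are then recognized):

```latex
\begin{align*}
M\frac{dG_r^{(n)}}{dM}
 &= M\frac{\partial G_r^{(n)}}{\partial M}
  + \underbrace{M\frac{\partial\lambda}{\partial M}}_{\beta}\,
    \frac{\partial G_r^{(n)}}{\partial\lambda}
 && \text{(from eq. (7.3))} \\
 &= -\,n\,\underbrace{\frac{M}{2Z}\frac{\partial Z}{\partial M}}_{\gamma}\; G_r^{(n)}
 && \text{(from eq. (7.4))} \\
\Longrightarrow\quad
 0 &= \Big[ M\frac{\partial}{\partial M} + \beta\,\frac{\partial}{\partial\lambda} + n\gamma \Big]\, G_r^{(n)} .
\end{align*}
```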

Eq. (7.5) is known as the Callan-Symanzik equation, or renormalization group (RG)
equation. The quantity $\beta$, called the beta function of the theory, controls how the
coupling constant varies with the scale $M$. The quantity $\gamma$, known as the anomalous dimension
of the field $\phi$, controls the rescaling of the field when the scale $M$ is changed. The
“group” terminology comes from the following considerations. Formally, the solutions
of eq. (7.5) can be written as follows:

$$
G^{(n)}_r(\cdots; M) = U(M, M_0)\; G^{(n)}_r(\cdots; M_0) , \tag{7.7}
$$

where the evolution operator $U(M, M_0)$ is a Green’s function of the operator between
the square brackets in the left hand side of eq. (7.5). A 1-dimensional group structure
can be attached to this evolution by noting that

$$
U(M_2, M_0) = U(M_2, M_1)\; U(M_1, M_0) . \tag{7.8}
$$

In other words, a finite rescaling can be broken down into several smaller rescalings
without affecting the final result.
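The composition law (7.8) can be illustrated on the simplest quantity that flows with $M$, the coupling itself. The sketch below assumes a one-loop beta function of the form $\beta = b\,\lambda^2$, with $b = 3/(16\pi^2)$, the standard one-loop value for a $\lambda\phi^4/4!$ interaction (quoted from standard textbook results, not derived in this section). The function name `evolve` is illustrative.

```python
import math

# one-loop beta function beta = B_PHI4 * lambda**2 (standard phi^4 value, assumed here)
B_PHI4 = 3.0 / (16.0 * math.pi**2)


def evolve(lam0, M, M0, b=B_PHI4):
    """Solution of M d(lambda)/dM = b * lambda**2 with lambda(M0) = lam0."""
    return lam0 / (1.0 - b * lam0 * math.log(M / M0))


if __name__ == "__main__":
    lam0 = 0.5
    one_step = evolve(lam0, 100.0, 1.0)
    two_steps = evolve(evolve(lam0, 10.0, 1.0), 100.0, 10.0)
    print(one_step, two_steps)  # the two numbers agree up to rounding
```

Evolving from $M_0$ to $M_2$ directly, or through an intermediate scale $M_1$, gives the same coupling because $1/\lambda$ changes by $-b\,\ln(M/M_0)$ and logarithms add.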

7.1.1 One-loop calculation of β and γ


In practice, one can determine the anomalous dimension $\gamma$ and the $\beta$ function at
one-loop from the wavefunction and vertex counterterms $\delta_Z$ and $\delta_\lambda$. Since $Z = 1 + \delta_Z$,
we can directly write

$$
\gamma = \frac{M}{2}\,\frac{\partial\delta_Z}{\partial M} . \tag{7.9}
$$

In order to determine the $\beta$ function for a 4-legs vertex, one should start from the
renormalized 4-point function $G^{(4)}_r(p_1,\cdots,p_4)$. Diagrammatically, this function
reads

$$
G^{(4)}_r = [\text{tree-level vertex}] + [\text{1PI vertex correction}] + [\text{vertex counterterm}]
 + \sum_i \Big( [\text{self-energy on line } i] + [\text{counterterm on line } i] \Big) , \tag{7.10}
$$

where the first term in the right hand side is the tree-level vertex, the second and
third terms are respectively the 1PI vertex correction and the associated counterterm.
The fourth and fifth terms are the self-energy corrections on the external lines and the
corresponding counterterms. Up to one-loop, this equation can be written as follows:

$$
\begin{aligned}
G^{(4)}_r(p_1,\cdots,p_4) = \Big( \prod_i \frac{i}{p_i^2} \Big)\,\Big[\,
 & -i\lambda_b \\
 & + \Gamma^{(4)}_b - i\,\delta_\lambda \\
 & - i\lambda_b \sum_i \frac{1}{p_i^2}\,\big( \Gamma^{(2)}_b(p_i) - p_i^2\,\delta^i_Z \big) \Big] .
\end{aligned}
\tag{7.11}
$$

In this equation, the first line is the tree-level 4-point function, the second line contains
the one-loop 1PI vertex correction and the vertex counterterm (necessary in order to
fulfill the renormalization condition for the vertex at the scale $M$), and the last line
is the sum of the 1-loop corrections on the external lines (the counterterms $\delta^i_Z$ are
determined by the normalization condition of the propagator at the scale $M$). The
dependence of this renormalized Green’s function on the renormalization scale $M$
arises from the counterterms $\delta_\lambda$ and $\delta^i_Z$. By applying the Callan-Symanzik equation
to this Green’s function, we obtain at leading order

$$
M\frac{\partial}{\partial M}\Big( \delta_\lambda - \lambda \sum_i \delta^i_Z \Big) + \beta
 + \frac{\lambda}{2} \sum_i M\,\frac{\partial\delta^i_Z}{\partial M} = 0 , \tag{7.12}
$$

where we have replaced the anomalous dimensions $\gamma_i$ attached to the external lines
by their expression given by eq. (7.9) in terms of the corresponding counterterms $\delta^i_Z$.
Therefore, we obtain the following formula for the $\beta$ function:

$$
\beta = M\frac{\partial}{\partial M}\Big( -\delta_\lambda + \frac{\lambda}{2} \sum_i \delta^i_Z \Big) . \tag{7.13}
$$

7.1.2 Solution for the 2-point function $G^{(2)}_r$
In a massless theory, we may always parameterize the 2-point function as follows:

$$
G^{(2)}_r(p) = \frac{i}{p^2}\; g(-p^2/M^2) , \tag{7.14}
$$

where $g(-p^2/M^2)$ is a so far arbitrary function. Since the $M$ dependence arises
solely from the ratio $-p^2/M^2$, we can rewrite the derivative with respect to $M$ in the
Callan-Symanzik equation in the form of a derivative with respect to $p$:

$$
\Big[ p\,\frac{\partial}{\partial p} - \beta\,\frac{\partial}{\partial\lambda} + 2 - 2\gamma \Big]\, G^{(2)}_r(p) = 0 . \tag{7.15}
$$

In order to solve this equation, let us introduce a function $\bar\lambda(p,\lambda)$ defined by:

$$
\frac{d\bar\lambda(p,\lambda)}{d\ln(p/M)} = \beta(\bar\lambda) , \qquad \bar\lambda(M,\lambda) = \lambda . \tag{7.16}
$$

In other words, $\bar\lambda$ is the running coupling constant that takes the value $\lambda$ at the
momentum scale $M$. We can then write the solution of the Callan-Symanzik equation
in the following form:

$$
G^{(2)}_r(p) = \frac{i}{p^2}\; \mathcal{G}\big(\bar\lambda(p,\lambda)\big)\;
 \exp\Big( 2\int_M^p \frac{dp'}{p'}\;\gamma\big(\bar\lambda(p',\lambda)\big) \Big) , \tag{7.17}
$$
where $\mathcal{G}(\bar\lambda(p,\lambda))$ is an arbitrary function that cannot be determined from the
renormalization group equations¹. This function must be determined order by order from
perturbative calculations. In the case of the 2-point function, we have $\mathcal{G}(\bar\lambda(p,\lambda)) = 1 + O(\lambda)$.
The exponential in eq. (7.17) is the cumulative field renormalization between the
scales $M$ and $p$. In particular, for a constant anomalous dimension, this factor is
$(p/M)^{2\gamma}$, and we see that it alters the power law dependence of the propagator with
respect to momentum, changing a power $-2$ into $-2 + 2\gamma$.
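One can check numerically that the form (7.17) does solve eq. (7.15). The sketch below uses toy inputs chosen purely for illustration (they are assumptions, not results from the text): $\beta(\lambda) = B\lambda^2$, $\gamma(\lambda) = C\lambda$, and the prefactor $\mathcal{G}(\bar\lambda) = 1 + \bar\lambda$. With these choices the running coupling is $\bar\lambda = \lambda/(1 - B\lambda t)$ with $t = \ln(p/M)$, the exponential in eq. (7.17) has the closed form $(1 - B\lambda t)^{-2C/B}$, and the derivatives in eq. (7.15) are approximated by central finite differences.

```python
import math

B, C = 0.3, 0.1  # toy coefficients: beta(l) = B * l**2, gamma(l) = C * l (assumptions)
M = 1.0          # renormalization scale


def lam_bar(p, lam):
    """Running coupling solving eq. (7.16) for the toy beta function."""
    return lam / (1.0 - B * lam * math.log(p / M))


def G2_over_i(p, lam):
    """Eq. (7.17) divided by i, with the toy prefactor G(lam_bar) = 1 + lam_bar."""
    t = math.log(p / M)
    # closed form of exp(2 * int_M^p dp'/p' gamma(lam_bar(p'))) for the toy beta, gamma
    field_factor = (1.0 - B * lam * t) ** (-2.0 * C / B)
    return (1.0 + lam_bar(p, lam)) * field_factor / p**2


def cs_residual(p, lam, h=1e-5):
    """Left hand side of eq. (7.15) divided by G, from central finite differences."""
    dG_dlnp = (G2_over_i(p * math.exp(h), lam) - G2_over_i(p * math.exp(-h), lam)) / (2 * h)
    dG_dlam = (G2_over_i(p, lam + h) - G2_over_i(p, lam - h)) / (2 * h)
    G = G2_over_i(p, lam)
    return (dG_dlnp - B * lam**2 * dG_dlam + (2.0 - 2.0 * C * lam) * G) / G


if __name__ == "__main__":
    print(cs_residual(2.0, 0.5), cs_residual(0.3, 0.2))
```

Both residuals should be zero up to the discretization error of the finite differences. Note that $\beta$ and $\gamma$ in eq. (7.15) are evaluated at the coupling $\lambda$ defined at the scale $M$, not at the running coupling.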

¹ An arbitrary function of the running coupling is allowed as a prefactor, since we have:

$$
\Big[ p\,\frac{\partial}{\partial p} - \beta\,\frac{\partial}{\partial\lambda} \Big]\, \mathcal{G}\big(\bar\lambda(p,\lambda)\big) = 0 .
$$

7.2 Correlators containing composite operators


7.2.1 Callan-Symanzik equations
A very useful extension of the previous formalism concerns the case of correlators
that contain one or more composite operators, i.e. operators made of several fields evaluated at
the same space-time point. Similarly to the case of elementary operators, we must
introduce a renormalization factor $Z_O$, determined order by order in perturbation theory
in order to fulfill a certain renormalization condition at the scale $M$. The renormalized
operator $O_r$ is related to the bare operator $O_b$ by the relationship $O_r = O_b/Z_O$. Let us
consider now a renormalized correlation function involving a composite operator $O$
and $n$ elementary fields:

$$
G^{(n;1)}_r(x_1,\cdots,x_n; y) \equiv \big\langle \phi(x_1)\cdots\phi(x_n)\, O(y) \big\rangle . \tag{7.18}
$$

The corresponding bare correlation function is related to it by

$$
G^{(n;1)}_r(x_1,\cdots,x_n; y) = Z^{-n/2}\, Z_O^{-1}\; G^{(n;1)}_b(x_1,\cdots,x_n; y) . \tag{7.19}
$$

By requesting that the bare correlation function remains unchanged upon changes
of the renormalization scale $M$, we obtain the following equation satisfied by the
renormalized correlation function

$$
\Big[ M\frac{\partial}{\partial M} + \beta\,\frac{\partial}{\partial\lambda} + n\gamma + \gamma_O \Big]\, G^{(n;1)}_r = 0 , \tag{7.20}
$$

where we have defined the anomalous dimension of the composite operator $O$ as
follows

$$
\gamma_O \equiv \frac{M}{Z_O}\,\frac{\partial Z_O}{\partial M} . \tag{7.21}
$$

7.2.2 Anomalous dimension of the operator O


The practical determination of the anomalous dimension $\gamma_O$ of a composite operator
$O$ made of $m$ elementary fields $\phi$ can be done by studying the correlation function
$G^{(m;1)}_r$ and by applying to it the Callan-Symanzik equation. This method is identical
to the one used in the determination of the expression (7.13) for the beta function, and
leads to

$$
\gamma_O = M\frac{\partial}{\partial M}\Big( -\delta_O + \frac{1}{2} \sum_i \delta^i_Z \Big) , \tag{7.22}
$$

where $\delta_O$ is the counterterm that one must adjust in order to satisfy the renormalization
condition of the operator $O$ at the scale $M$.

7.2.3 Anomalous dimension of a conserved current


A very useful example in practice is that of a current such as

$$
J^\mu \equiv \bar\psi\,\gamma^\mu\,\psi . \tag{7.23}
$$

The anomalous dimension of such an operator is given by

$$
\gamma_J = M\frac{\partial}{\partial M}\Big( -\delta_J + \delta_\psi \Big) = 0 . \tag{7.24}
$$

The equality of the counterterms $\delta_J$ and $\delta_\psi$ is a consequence of the Ward identities,
i.e. of the gauge symmetry associated to charge conservation and to the conservation
of the current $J^\mu$.

7.2.4 Renormalization of operators of arbitrary dimensions


Let us denote by $\mathcal{L}_M$ the renormalized Lagrangian at the scale $M$. Consider now
adding to this Lagrangian the following sum of interaction terms:

$$
\mathcal{L}_M \to \mathcal{L}_M + \sum_i c_i\, O_i(x) , \tag{7.25}
$$

where the $O_i$’s are arbitrary local operators, not necessarily renormalizable in four
dimensions. The Callan-Symanzik equation for a correlator containing $n$ elementary
fields $\phi$ and an arbitrary number of these new interaction terms reads:

$$
\Big[ M\frac{\partial}{\partial M} + \beta\,\frac{\partial}{\partial\lambda} + n\gamma + \sum_i \gamma_i\, c_i\,\frac{\partial}{\partial c_i} \Big]\, G^{(n)}_r = 0 . \tag{7.26}
$$

In this equation, $\gamma_i$ is the anomalous dimension of the operator $O_i$ and the operator
$c_i\,\partial/\partial c_i$ counts the number of occurrences of $O_i$ inside the function $G^{(n)}_r$. If $d_i$ is the
dimension of the operator $O_i$ (in mass units), it is convenient to define a dimensionless
coupling constant $\rho_i$ by the following relation,

$$
c_i \equiv \rho_i\, M^{4-d_i} . \tag{7.27}
$$

Thanks to this definition, the previous Callan-Symanzik equation becomes:

$$
\Big[ M\frac{\partial}{\partial M} + \beta\,\frac{\partial}{\partial\lambda} + n\gamma + \sum_i \beta_i\,\frac{\partial}{\partial\rho_i} \Big]\, G^{(n)}_r = 0 , \tag{7.28}
$$

where we denote $\beta_i \equiv \rho_i\,(\gamma_i + d_i - 4)$. With these notations, we see that the
additional couplings $\rho_i$ play exactly the same role as the original coupling $\lambda$. We can
therefore mimic the explicit solution found in the case of the two-point function in
section 7.1.2. Let us first introduce running couplings $\bar\lambda$, $\bar\rho_i$, as solutions of the
following differential equations

$$
\begin{aligned}
\frac{d\bar\lambda(p,\lambda)}{d\ln(p/M)} &= \beta(\bar\lambda,\bar\rho_i) , & \bar\lambda(M,\lambda) &= \lambda , \\
\frac{d\bar\rho_i(p,\rho_i)}{d\ln(p/M)} &= \beta_i(\bar\lambda,\bar\rho_i) , & \bar\rho_i(M,\rho_i) &= \rho_i .
\end{aligned}
\tag{7.29}
$$
7. R ENORMALIZATION GROUP 237

In the weak coupling limit, the functions βi are given at lowest order by

βi ≈ (di − 4)ρi , (7.30)

and the solution of the previous equations for ρi reads

\[ \rho_i(p) = \rho_i(M) \left( \frac{p}{M} \right)^{d_i-4} . \tag{7.31} \]
This result sheds some light on the fact that all fundamental interactions (except
gravity, for which the proper quantum theory is not known) appear to be described by
renormalizable quantum field theories at the energy scales relevant for the Standard
Model (i.e. p ≲ 1 TeV). Indeed, let us assume that there exists at a much higher scale
(typically M ∼ 10^16 GeV, the conjectured scale for the unification of all couplings)
a more fundamental quantum field theory, comprising all sorts of interactions and
whose couplings are of order one (at this unification scale, couplings that are allowed
by symmetries have no reason to be much smaller than unity). After evolving
the scale down to the sub-TeV scale of the Standard Model, all the couplings for
which di − 4 > 0, i.e. all the operators that are not renormalizable in four space-
time dimensions, have become much smaller than the others and have effectively
disappeared from the Lagrangian.
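The size of this suppression can be made concrete with a few numbers. The following is a minimal sketch that evaluates the lowest-order solution (7.31) between the unification scale quoted above and 1 TeV, assuming couplings of order one at the high scale; the function name `running_rho` and the list of dimensions are illustrative choices, not part of the text.

```python
# Lowest-order running of a dimensionless coupling rho_i attached to an
# operator of mass dimension d_i, following eq. (7.31).
# Assumption: rho_i = 1 at the high scale, anomalous dimensions neglected.

def running_rho(rho_at_M: float, p: float, M: float, d_i: float) -> float:
    """Return rho_i(p) = rho_i(M) * (p / M)**(d_i - 4)."""
    return rho_at_M * (p / M) ** (d_i - 4)

M_UNIFICATION = 1.0e16  # GeV, illustrative unification scale from the text
P_LOW = 1.0e3           # GeV, i.e. 1 TeV

for d_i in (4, 5, 6, 8):
    rho = running_rho(1.0, P_LOW, M_UNIFICATION, d_i)
    print(f"d_i = {d_i}: rho_i(1 TeV) = {rho:.3e}")
```

In this sketch a dimension-4 coupling is unchanged, while each additional unit of operator dimension costs a factor p/M = 10^-13.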

7.3 Operator product expansion

7.3.1 Introduction

The operator product expansion (OPE) is a tool for studying the renormalization
flow at the level of the operators themselves, instead of encapsulating them
inside a correlator (although the derivation still requires that we consider a correlator).
The intuitive idea is that a non-local product of operators may be approximated by a
local composite operator when the separations between the original operators go to
zero, possibly with a numerical prefactor that depends on the separation between the
operators in the original product. However, since limits of operators are difficult to
handle, it is convenient to consider a weaker form of limit, in which the product of
operators under consideration is encapsulated into a correlation function of the form

\[ G_{12}^{(n)}(x; y_1, \cdots, y_n) \equiv \langle A_1(x) A_2(0) \phi(y_1) \cdots \phi(y_n) \rangle , \tag{7.32} \]

where A1 and A2 are local operators, and φ an elementary field. Let us consider a
limit where the coordinates yi are fixed, while x → 0. We can already note that, since
the product of operators at the same point is ill-defined in general, we may expect
divergences in this limit.

It turns out that the behaviour of G_{12}^{(n)} when x → 0 is entirely determined by the
operators A1 and A2 themselves, in a way that does not depend on the other fields
φ(yi ) (provided they are kept at a finite distance from the points 0 and x). In order
to determine this behaviour, Wilson proposed to expand the product A1 (x)A2 (0) as
a sum of composite local operators, with x dependent coefficients:

\[ A_1(x) A_2(0) = \sum_i C_{12}^i(x) \, O_i(0) , \tag{7.33} \]

where the Oi are a basis of composite local operators that have the same quantum
numbers as the product A1 A2 . All the x dependence is carried by the Wilson
coefficients C_{12}^i(x). This decomposition can then be used in any correlation function
where the product A1 (x)A2 (0) appears. For instance, the correlation function G_{12}^{(n)}
introduced at the beginning of this section would read

\[ G_{12}^{(n)}(x; y_1, \cdots, y_n) = \sum_i C_{12}^i(x) \, G_i^{(n)}(y_1, \cdots, y_n) , \tag{7.34} \]

where we denote

\[ G_i^{(n)}(y_1, \cdots, y_n) \equiv \langle O_i(0) \phi(y_1) \cdots \phi(y_n) \rangle . \tag{7.35} \]

7.3.2 Callan-Symanzik equation for Ci12 (x)

Let us assume that we have defined the normalization of the operators A1 , A2 , Oi
at the scale M. The coefficients C_{12}^i(x) in eq. (7.33) should a priori also depend on
M. In order to determine this dependence, let us first write the Callan-Symanzik
equation for the renormalized correlator² G_{12}^{(n)}:

\[ \left[ M \frac{\partial}{\partial M} + \beta \frac{\partial}{\partial\lambda} + n\gamma + \gamma_{A_1} + \gamma_{A_2} \right] G_{12}^{(n)} = 0 , \tag{7.36} \]

where γ, γA1 and γA2 are the anomalous dimensions of the operators φ, A1 and A2 ,
respectively. Concerning the correlation functions G_i^{(n)} that enter in the right hand
side of eq. (7.34), we have the following equations:

\[ \left[ M \frac{\partial}{\partial M} + \beta \frac{\partial}{\partial\lambda} + n\gamma + \gamma_i \right] G_i^{(n)} = 0 , \tag{7.37} \]

where γi is the anomalous dimension of Oi . The left and right hand sides
of eq. (7.34) are consistent provided that the coefficients C_{12}^i obey the following
equation:

\[ \left[ M \frac{\partial}{\partial M} + \beta \frac{\partial}{\partial\lambda} + \gamma_{A_1} + \gamma_{A_2} - \gamma_i \right] C_{12}^i = 0 . \tag{7.38} \]

² In the rest of this chapter, we do not write explicitly the subscript r to indicate the renormalized
quantities, in order to simplify the notations. From the context, it is always clear when a quantity is
renormalized.

This equation confirms a posteriori the fact that the coefficients C_{12}^i must depend on
the renormalization scale M. Moreover, we see that this dependence is controlled only by
the anomalous dimensions of the operators A1 , A2 and Oi , and not by the specific
correlation function G_{12}^{(n)} that was used in the derivation (in particular, eq. (7.38) does
not depend on the number n of fields φ, nor on their anomalous dimension). It is this
property that renders the operator product expansion universal.

7.3.3 Separation dependence of Ci12 (x)

If the dimensions of A1 , A2 and Oi are respectively D1 , D2 and di , then the
dimension of C_{12}^i is D1 + D2 − di . Therefore, we may write

\[ C_{12}^i(x; M) \equiv \frac{1}{|x|^{D_1+D_2-d_i}} \, \tilde{C}_{12}^i(M|x|) , \tag{7.39} \]

where \tilde{C}_{12}^i(M|x|) is a dimensionless function of the sole variable M|x|. One can
determine this function similarly to the case of the 2-point function considered in the
section 7.1.2, by introducing the running coupling \bar\lambda(1/|x|). We obtain the following
structure for the coefficient C_{12}^i:

\[ C_{12}^i(x; M) = \frac{\mathcal{C}_{12}^i(\bar\lambda(1/|x|))}{|x|^{D_1+D_2-d_i}} \, \exp\left[ \int_{1/|x|}^{M} \frac{dp'}{p'} \left( \gamma_i(\bar\lambda(p')) - \gamma_{A_1}(\bar\lambda(p')) - \gamma_{A_2}(\bar\lambda(p')) \right) \right] , \tag{7.40} \]

where \mathcal{C}_{12}^i is a function of the running coupling that can be obtained by a matching
to perturbative calculations. We see that the leading short distance behaviour is
controlled by the prefactor |x|di −D1 −D2 , that becomes singular if di < D1 + D2 .
Moreover, the contribution of the operators Oi whose dimension obeys di > D1 +D2
goes to zero when x → 0. One does not need to consider such operators in the OPE
when studying the short distance limit.
In asymptotically free theories where the coupling goes to zero at short distance,
such as QCD, we may carry a bit further the determination of the Wilson coefficients.

Indeed, at the first order of perturbation theory, the anomalous dimensions are
proportional to g^2, and we may write the anomalous dimension of any operator O as
follows:

\[ \gamma_O \equiv -a_O \frac{g^2}{(4\pi)^2} , \tag{7.41} \]

where aO is a numerical constant (the minus sign is conventional). Therefore, we
have

\[ \gamma_i - \gamma_{A_1} - \gamma_{A_2} = \frac{\alpha_s}{4\pi} \left( a_{A_1} + a_{A_2} - a_i \right) , \tag{7.42} \]

with αs ≡ g^2/4π. At one loop, the running coupling αs is given by:

\[ \frac{\alpha_s(Q^2)}{4\pi} = \frac{1}{\beta_0 \ln\left( Q^2 / \Lambda_{QCD}^2 \right)} , \tag{7.43} \]

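where β0 is the first Taylor coefficient of the QCD β function. From this, we get

\[ C_{12}^i(x; M) = \frac{\mathcal{C}_{12}^i(g(1/|x|))}{|x|^{D_1+D_2-d_i}} \left[ \frac{\ln\left( 1 / (|x|^2 \Lambda_{QCD}^2) \right)}{\ln\left( M^2 / \Lambda_{QCD}^2 \right)} \right]^{\frac{a_i - a_{A_1} - a_{A_2}}{2\beta_0}} . \tag{7.44} \]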

We see that, besides the trivial power law prefactor in |x|di −D1 −D2 , there are cor-
rections in the form of powers of logarithms that may be large when x → 0. When
di = D1 + D2 , these logarithms are in fact the main source of |x| dependence.
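The step from the exponential of eq. (7.40) to the power of logarithms in eq. (7.44) can be checked numerically. The sketch below integrates the one-loop integrand of eqs. (7.42) and (7.43) with a midpoint rule and compares the result to the closed form; the numerical values (separation scale, M, Λ_QCD, β0 and the combination a_sum = a_A1 + a_A2 − a_i) are illustrative assumptions, and the function names are hypothetical.

```python
import math

def alpha_s_over_4pi(p, lam_qcd, beta0):
    """One-loop running coupling of eq. (7.43), evaluated at Q = p."""
    return 1.0 / (beta0 * math.log(p * p / (lam_qcd * lam_qcd)))

def rg_exponent_numeric(x_inv, M, lam_qcd, beta0, a_sum, n_steps=20000):
    """Midpoint-rule estimate of the integral in eq. (7.40),
    int_{1/|x|}^{M} dp/p (gamma_i - gamma_A1 - gamma_A2),
    with the integrand of eq. (7.42); a_sum = a_A1 + a_A2 - a_i."""
    t0 = math.log(x_inv)          # integration variable t = ln p
    t1 = math.log(M)
    h = (t1 - t0) / n_steps       # negative when M < 1/|x|
    total = 0.0
    for k in range(n_steps):
        p = math.exp(t0 + (k + 0.5) * h)
        total += a_sum * alpha_s_over_4pi(p, lam_qcd, beta0) * h
    return total

def log_factor_closed_form(x_inv, M, lam_qcd, beta0, a_sum):
    """Bracket of eq. (7.44) raised to (a_i - a_A1 - a_A2) / (2 beta0)."""
    ratio = math.log(x_inv**2 / lam_qcd**2) / math.log(M**2 / lam_qcd**2)
    return ratio ** (-a_sum / (2.0 * beta0))

# Illustrative inputs (assumptions): scales in GeV, five quark flavours.
X_INV, M_SCALE, LAMBDA_QCD = 80.0, 0.5, 0.15
BETA0 = 11.0 - 2.0 * 5 / 3.0
A_SUM = -8.0   # i.e. a_i = 8 and conserved currents, a_A1 = a_A2 = 0

numeric = math.exp(rg_exponent_numeric(X_INV, M_SCALE, LAMBDA_QCD, BETA0, A_SUM))
closed = log_factor_closed_form(X_INV, M_SCALE, LAMBDA_QCD, BETA0, A_SUM)
print(f"exp(integral) = {numeric:.6f}, closed form = {closed:.6f}")
```

The two numbers agree up to the discretization error of the quadrature, which is the content of the integration leading to eq. (7.44).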

7.3.4 Operator mixing

It may happen that several of the operators Oi that enter in the OPE basis for the
product A1 (x)A2 (0) mix under the evolution of the scale M. This means that the
anomalous dimensions γi are in fact a matrix γij (when there is no mixing, this
matrix is diagonal and the γi ’s that we have used so far are its diagonal elements) and
the Callan-Symanzik equations for the correlators G_i^{(n)} are coupled:

\[ \sum_j \left[ \delta_{ij} \left( M \frac{\partial}{\partial M} + \beta \frac{\partial}{\partial\lambda} + n\gamma \right) + \gamma_{ij} \right] G_j^{(n)} = 0 . \tag{7.45} \]

The equation for G_{12}^{(n)} is unchanged, and we obtain the following equation for the
Wilson coefficients

\[ \left[ M \frac{\partial}{\partial M} + \beta \frac{\partial}{\partial\lambda} + \gamma_{A_1} + \gamma_{A_2} \right] C_{12}^j - \sum_i \gamma_{ij} C_{12}^i = 0 . \tag{7.46} \]

Note that when the operators A1 and A2 are conserved currents, their anomalous
dimensions are zero, and this equation simplifies into

\[ \left[ M \frac{\partial}{\partial M} + \beta \frac{\partial}{\partial\lambda} \right] C_{12}^j - \sum_i \gamma_{ij} C_{12}^i = 0 . \tag{7.47} \]

This situation turns out to be quite frequent in applications of the OPE.

7.4 Example: QCD corrections to weak decays

7.4.1 Fermi theory

In order to illustrate the use of the operator product expansion on a concrete case, let
us consider the weak interactions between quarks and leptons. In the standard model,
the interactions between charged currents take the following form:

\[ \mathcal{L}_I = \frac{g^2}{2} J_L^\mu(0) \, D_{\mu\nu}(0, x) \, J_L^{\nu\dagger}(x) + \text{h.c.} , \tag{7.48} \]

where J_L^µ is the left handed charged current (containing a leptonic term and a term
due to quarks) and Dµν (0, x) is the propagator of the W ± boson between the points
0 and x.
At low energy, we may neglect the momentum carried by the W ± boson propagator
compared to the W ± mass. In this approximation, the propagator becomes
momentum independent, and its Fourier transform is proportional to δ(x). We may
then replace the non-local interaction term of eq. (7.48) by a 4-fermion (local) contact
interaction, which is nothing but the interaction term of Fermi’s theory. The prefactor
of this interaction term, g^2/(2 M_W^2), is usually denoted 4G_F/√2 where GF is Fermi’s
constant:

\[ \mathcal{L}_{int} \approx \frac{4 G_F}{\sqrt{2}} J_L^\mu(0) \, J_{L\mu}^\dagger(0) + \text{h.c.} . \tag{7.49} \]

Thanks to the operator product expansion, one may study in greater detail the limit
from the electroweak theory to Fermi’s theory, i.e. the process by which one replaces
the non-local product of two currents by one or more local interaction terms. This
example will also illustrate how this decomposition in local operators depends on the
energy scale of the processes under consideration, by including the strong interaction
corrections at one loop.
Let us discuss first two trivial cases regarding the effect of QCD corrections
at one loop. Firstly, purely leptonic weak interactions are not affected by strong
interactions at this order since leptons do not couple directly to gluons (but QCD
corrections do exist at two loops and beyond). The other simple case is that of
semi-leptonic weak interactions, involving a leptonic current and a current made of
quarks. Indeed, the leptonic current is not renormalized by strong interactions. The
quark current, conserved at leading order, is also not affected by strong interactions
since its anomalous dimension is zero. Finally, a gluon cannot connect the lepton
and the quark currents. Thus, semi-leptonic weak interactions are not affected by
QCD corrections at one loop. The only non-trivial case, to which we will devote
the rest of this section, is that of weak interactions between quark currents, i.e. the
non-leptonic weak interactions. As an example, let us consider the QCD corrections
to the weak decay of the strange quark, which in Fermi’s theory comes from the
following coupling: (\bar d_L γ^µ u_L)(\bar u_L γ_µ s_L).

7.4.2 Operator product expansion


Let us consider the OPE of the following product of currents A_1^µ(x) A_{2µ}(0), with:

\[ A_1^\mu \equiv \bar d_L \gamma^\mu u_L , \quad A_2^\mu \equiv \bar u_L \gamma^\mu s_L . \tag{7.50} \]

When going from the standard model to Fermi’s theory, the non-local dependence
of the W ± propagator is captured by the Wilson coefficients C_{12}^i(x). Therefore,
the typical separation x is x ∼ M_W^{-1} (since the mass MW is the only dimensionful
parameter in the propagator). On the other hand, the scale M characteristic of Kaon
decays is of the order of the mass of a Kaon, around 500 MeV. The simplest operators
on which we may expand the product A1 (x)A2 (0) are the following:

\[ O_1 \equiv (\bar d_L \gamma^\mu u_L)(\bar u_L \gamma_\mu s_L) , \]
\[ O_2 \equiv (\bar d_L \gamma^\mu s_L)(\bar u_L \gamma_\mu u_L) , \tag{7.51} \]

where in the second one two quark operators of different flavours have been inter-
changed. Note that the mass dimension of the operators A1 and A2 is 3, while that
of O1 and O2 is 6. Therefore, we have dA1 + dA2 − di = 0, which means that the
x dependence of the Wilson coefficients comes entirely from the logarithms in the
expression (7.44). The more complicated operators that may enter in this expansion
all have a larger mass dimension, so that dA1 + dA2 − di < 0. Thanks to the
prefactor in eq. (7.44), the corresponding Wilson coefficients are very small since
M|x| ∼ M/MW ≪ 1. Thus, one can restrict the OPE of A1 (x)A2 (0) to the sole
operators O1 and O2 when applied to the physics of Kaon decays.

7.4.3 Wilson coefficients


In order to determine the Wilson coefficients C_{12}^i for the operators Oi with equation
(7.44), we first need to calculate the anomalous dimensions γA1 , γA2 , as well as γ1 ,
γ2 , for the operators A1 , A2 , O1 and O2 . Since A1 and A2 are conserved at the first
order, their anomalous dimension is zero:

\[ \gamma_{A_1} = \gamma_{A_2} = 0 . \tag{7.52} \]

In order to obtain the anomalous dimensions of the operators O1 and O2 , let us
introduce the following graphical representation for these operators:

[Eq. (7.53): diagrams representing O1 and O2 as four-quark vertices, each with external quark lines labelled u, d, u, s.]
This representation renders explicit the fact that these operators are products of two
currents. Thanks to eq. (7.22), the anomalous dimension of these operators is obtained
by calculating the vertex counterterm and the counterterms associated to the external
lines. All the order-g^2 strong interaction corrections are listed in the figure 7.1 in
the case of O1 .

Figure 7.1: Order-g^2 QCD corrections to the operator O1 .

The contributions to γ1 of the first three diagrams on the first line
cancel, because their sum gives the anomalous dimension of a conserved (at first
order) current. The same conclusion holds for the remaining three graphs of the first
line. Thus, we need only to consider the diagrams of the second line. In Feynman
gauge, the expression of the first diagram of the second line is given by:

\[ \text{(first diagram, second line)} = (-ig)^2 \int \frac{d^D k}{(2\pi)^D} \, \frac{-i}{k^2} \left[ \bar d_L \gamma^\mu \frac{i \slashed{k}}{(k+p)^2} t_f^a \gamma^\lambda u_L \right] \left[ \bar u_L \gamma_\lambda t_f^a \frac{i \slashed{k}}{(k-q)^2} \gamma_\mu s_L \right] , \tag{7.54} \]

where p and q are the (incoming) momenta carried by the quark lines to which the
gluon is attached. The t_f^a are the generators of the fundamental representation of the
su(3) algebra, to which the quarks belong. In the numerator, some terms in \slashed{p} and \slashed{q} have
been dropped because they do not contribute to the ultraviolet divergence of the graph.
The integral over k can be rewritten as follows³:

\[ \int \frac{d^D k}{(2\pi)^D} \frac{k^\nu k^{\nu'}}{k^2 (k+p)^2 (k-q)^2} = \frac{g^{\nu\nu'}}{d} \int \frac{d^D k}{(2\pi)^D} \frac{1}{(k+p)^2 (k-q)^2} \]
\[ = \frac{g^{\nu\nu'}}{d} \int_0^1 dx \int \frac{d^D \bar k}{(2\pi)^D} \frac{1}{(\bar k^2 + \Delta)^2} \]
\[ = i \, \frac{g^{\nu\nu'}}{d} \int_0^1 dx \, \frac{\Gamma(2 - \frac{D}{2})}{(4\pi)^{D/2}} \frac{1}{\Delta^{2-D/2}} , \tag{7.55} \]

where we denote \bar k ≡ k + xp − (1 − x)q and ∆ ≡ x(1 − x)(p + q)^2 . Since


the renormalization scale is M, we may impose that the Lorentz invariant quantity
(p + q)2 is equal to −M2 , so that ∆ is proportional to M2 . Since the power 2 − D/2
to which the denominator ∆ is raised goes to zero in four dimensions, we may neglect
the prefactor x(1 − x) inside ∆ and the integral over the Feynman parameter x simply
gives a factor equal to unity. If we take the limit D → 4 in all the factors that do not
diverge and do not depend on M, we obtain:

\[ \text{(first diagram, second line)} = \frac{g^2}{4} \frac{\Gamma(2 - \frac{D}{2})}{(4\pi)^2} \frac{1}{M^{4-D}} \left[ \bar d_L \gamma^\mu \gamma^\nu t_f^a \gamma^\lambda u_L \right] \left[ \bar u_L \gamma_\lambda t_f^a \gamma_\nu \gamma_\mu s_L \right] . \tag{7.56} \]

The contribution of this graph to the counterterm for the normalization of O1 is given
by the opposite of this result.
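The statement that the Feynman-parameter integral reduces to unity near four dimensions can be verified directly. Once the x-dependent prefactor of ∆ is kept, the integral over x is an Euler Beta function, B(D/2 − 1, D/2 − 1), which the sketch below evaluates with the Gamma function; the function name `feynman_x_integral` is a hypothetical label, and the identification with the Beta function is a standard result rather than something spelled out in the text.

```python
import math

def feynman_x_integral(D: float) -> float:
    """Return int_0^1 dx [x (1 - x)]**(-(2 - D/2)).

    This is the Euler Beta function B(a, a) with a = D/2 - 1,
    written as Gamma(a)**2 / Gamma(2 a)."""
    a = D / 2.0 - 1.0
    return math.gamma(a) ** 2 / math.gamma(2.0 * a)

for D in (4.0, 3.99, 3.9, 3.5):
    print(f"D = {D}: integral over x = {feynman_x_integral(D):.5f}")
```

The value is exactly 1 at D = 4 and departs from it only at order (4 − D), which is why the factor x(1 − x) does not affect the divergent part.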

³ The first equality disregards some terms that are ultraviolet finite.

In order to simplify the combination of spinors, Dirac and colour matrices that
appears in the result of eq. (7.56), it is useful to use the chiral representation (also
known as Weyl’s representation) since only the left handed components of the spinors
enter in this expression. In this representation, the Dirac matrices are given by

\[ \gamma^\mu = \begin{pmatrix} 0 & \sigma^\mu \\ \bar\sigma^\mu & 0 \end{pmatrix} , \quad \gamma^5 = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} , \tag{7.57} \]

with σ^µ ≡ (1, σ) and \bar\sigma^µ ≡ (1, −σ) where σ is a vector made of the three Pauli
matrices. In this representation, the left handed projector PL ≡ (1 − γ5 )/2 and the
right handed one PR ≡ (1 + γ5 )/2 simplify into:

\[ P_L = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} , \quad P_R = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} , \tag{7.58} \]

so that any 4-component spinor can be viewed as two 2-component spinors, one of
which is right handed and the other one left handed:

\[ \psi = \begin{pmatrix} \psi_L \\ \psi_R \end{pmatrix} . \tag{7.59} \]

Using this representation, we can for instance easily obtain

\[ \bar d_L \gamma^\mu \gamma^\nu \gamma^\lambda t_f^a u_L = d_L^\dagger \bar\sigma^\mu \sigma^\nu \bar\sigma^\lambda t_f^a u_L . \tag{7.60} \]

This equation contains a small abuse of notations, since it contains the 4-component
spinors (ψL , 0) in the left hand side, while the right hand side contains only the
2-component left handed spinors ψL .
In order to reduce the combination of spinors that appear in eq. (7.56), we need
to simplify the products (σ^µ)_{αβ} (σ_µ)_{γδ} and (\bar\sigma^µ)_{αβ} (\bar\sigma_µ)_{γδ} as well as (t_f^a)_{ij} (t_f^a)_{kl}.
In both cases, this can be done by using the Fierz identity for the generators of the
fundamental representation of the su(n) algebra, introduced in the section 4.1.6. Let
us recall this identity here:

\[ (t_f^a)_{ij} (t_f^a)_{kl} = \frac{1}{2} \left[ \delta_{il} \delta_{jk} - \frac{1}{n} \delta_{ij} \delta_{kl} \right] . \tag{7.61} \]

For the contraction of colour matrices t_f^a, we can apply it directly with n = 3:

\[ (t_f^a)_{ij} (t_f^a)_{kl} = \frac{1}{2} \left[ \delta_{il} \delta_{jk} - \frac{1}{3} \delta_{ij} \delta_{kl} \right] . \tag{7.62} \]
For the contraction of the σ^µ or the \bar\sigma^µ, let us recall that the Pauli matrices σ^i are
related to the su(2) fundamental generators τ^i by

\[ \sigma^i = 2 \tau^i . \tag{7.63} \]

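Using this relation and the Fierz identity for the fundamental representation of su(2),
we obtain:

\[ (\sigma^\mu)_{\alpha\beta} (\sigma_\mu)_{\gamma\delta} = (\bar\sigma^\mu)_{\alpha\beta} (\bar\sigma_\mu)_{\gamma\delta} = \delta_{\alpha\beta} \delta_{\gamma\delta} - 4 (\tau^i)_{\alpha\beta} (\tau^i)_{\gamma\delta} \]
\[ = \delta_{\alpha\beta} \delta_{\gamma\delta} - 2 \left[ \delta_{\alpha\delta} \delta_{\beta\gamma} - \frac{1}{2} \delta_{\alpha\beta} \delta_{\gamma\delta} \right] \]
\[ = 2 \left[ \delta_{\alpha\beta} \delta_{\gamma\delta} - \delta_{\alpha\delta} \delta_{\beta\gamma} \right] . \tag{7.64} \]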
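Identities of this kind are easy to confirm by brute force over all index values. The sketch below checks eq. (7.64) and the su(2) case of eq. (7.61) with explicit Pauli matrices; the helper names (`sigma_contraction`, `tau_contraction`, `delta`) are hypothetical, and the metric signature (+,−,−,−) is assumed for the contraction over µ.

```python
# Pauli matrices as nested lists; index 0 is the identity (sigma^0).
I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]
PAULI = (SX, SY, SZ)

def delta(a, b):
    """Kronecker delta."""
    return 1 if a == b else 0

def sigma_contraction(a, b, c, d):
    """(sigma^mu)_{ab} (sigma_mu)_{cd}, metric (+,-,-,-) assumed.
    The barred matrices (1, -sigma) give the same contraction."""
    return I2[a][b] * I2[c][d] - sum(s[a][b] * s[c][d] for s in PAULI)

def tau_contraction(a, b, c, d):
    """(tau^i)_{ab} (tau^i)_{cd} with tau^i = sigma^i / 2, eq. (7.63)."""
    return sum(s[a][b] * s[c][d] for s in PAULI) / 4

indices = [(a, b, c, d) for a in range(2) for b in range(2)
           for c in range(2) for d in range(2)]

# Eq. (7.64): 2 [delta_ab delta_cd - delta_ad delta_bc]
ok_sigma = all(
    sigma_contraction(a, b, c, d)
    == 2 * (delta(a, b) * delta(c, d) - delta(a, d) * delta(b, c))
    for a, b, c, d in indices)

# Eq. (7.61) with n = 2: (1/2) [delta_ad delta_bc - (1/2) delta_ab delta_cd]
ok_tau = all(
    tau_contraction(a, b, c, d)
    == 0.5 * (delta(a, d) * delta(b, c) - 0.5 * delta(a, b) * delta(c, d))
    for a, b, c, d in indices)

print("eq. (7.64) holds:", ok_sigma)
print("su(2) Fierz identity holds:", ok_tau)
```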

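Thanks to eqs. (7.62) and (7.64), we obtain⁴

\[ \left[ \bar d_L \gamma^\mu \gamma^\nu t_f^a \gamma^\lambda u_L \right] \left[ \bar u_L \gamma_\lambda t_f^a \gamma_\nu \gamma_\mu s_L \right] = 2 (\bar u_L \gamma^\mu u_L)(\bar d_L \gamma_\mu s_L) - \frac{2}{3} (\bar d_L \gamma^\mu u_L)(\bar u_L \gamma_\mu s_L) . \tag{7.65} \]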

We recognize the operators O1 and O2 in this expression. We are therefore in a


situation where renormalization introduces a mixing between operators. The second
diagram is identical to the one we have just calculated.
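The third diagram of the second line reads:

\[ \text{(third diagram, second line)} = (-ig)^2 \int \frac{d^D k}{(2\pi)^D} \, \frac{-i}{k^2} \left[ \bar d_L \gamma^\mu \frac{i \slashed{k}}{(k+p)^2} t_f^a \gamma^\lambda u_L \right] \left[ \bar u_L \gamma_\mu \frac{-i \slashed{k}}{(k-r)^2} t_f^a \gamma_\lambda s_L \right] , \tag{7.66} \]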

where r is the momentum that flows into the diagram by the line carrying the s quark.
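The integration over k is similar to the previous case, and leads to

\[ \text{(third diagram, second line)} = -\frac{g^2}{4} \frac{\Gamma(2 - \frac{D}{2})}{(4\pi)^2} \frac{1}{M^{4-D}} \left[ \bar d_L \gamma^\mu \gamma^\nu t_f^a \gamma^\lambda u_L \right] \left[ \bar u_L \gamma_\mu \gamma_\nu t_f^a \gamma_\lambda s_L \right] . \tag{7.67} \]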

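Likewise, we can simplify the Dirac and colour matrices by using Fierz identities:

\[ \left[ \bar d_L \gamma^\mu \gamma^\nu t_f^a \gamma^\lambda u_L \right] \left[ \bar u_L \gamma_\mu \gamma_\nu t_f^a \gamma_\lambda s_L \right] = 8 (\bar u_L \gamma^\mu u_L)(\bar d_L \gamma_\mu s_L) - \frac{8}{3} (\bar d_L \gamma^\mu u_L)(\bar u_L \gamma_\mu s_L) , \tag{7.68} \]

which is again a linear combination of O1 and O2 . The last diagram gives the same
result.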

⁴ The derivation can be made easier by using the graphical form (4.84) of the Fierz identity.

By combining the four contributions, we obtain the following form for the operator
O1 , renormalized at the scale M, in terms of the bare operators:

\[ O_1^r = O_1^b - \delta_{11} O_1^b - \delta_{12} O_2^b , \tag{7.69} \]

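where the counterterms δij are given by

\[ \delta_{11} \equiv \frac{g^2}{(4\pi)^2} \frac{\Gamma(2 - \frac{D}{2})}{M^{4-D}} , \quad \delta_{12} \equiv -3 \, \frac{g^2}{(4\pi)^2} \frac{\Gamma(2 - \frac{D}{2})}{M^{4-D}} . \tag{7.70} \]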
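The bookkeeping behind these two coefficients can be reproduced with exact fractions. The sketch below adds up the four diagrams of the second line, using the prefactors of eqs. (7.56) and (7.67) and the decompositions (7.65) and (7.68), all expressed in units of g^2 Γ(2 − D/2) / ((4π)^2 M^{4−D}); the variable names are hypothetical, and the sign convention relating the sum of graphs to δ11 and δ12 is the one implied by eq. (7.69).

```python
from fractions import Fraction

# Decomposition on (O1, O2) of the two spinor structures.
STRUCTURE_765 = (Fraction(-2, 3), Fraction(2))   # eq. (7.65)
STRUCTURE_768 = (Fraction(-8, 3), Fraction(8))   # eq. (7.68)

# Prefactors of the diagrams, in units of g^2 Gamma(2 - D/2) / ((4 pi)^2 M^(4-D)).
PREFACTOR_756 = Fraction(1, 4)    # eq. (7.56), first and second diagrams
PREFACTOR_767 = Fraction(-1, 4)   # eq. (7.67), third and fourth diagrams

def sum_of_graphs():
    """Coefficients of O1 and O2 in the sum of the four diagrams."""
    coefficients = []
    for k in range(2):
        total = (2 * PREFACTOR_756 * STRUCTURE_765[k]
                 + 2 * PREFACTOR_767 * STRUCTURE_768[k])
        coefficients.append(total)
    return tuple(coefficients)

delta_11, delta_12 = sum_of_graphs()
print("coefficient of O1:", delta_11)
print("coefficient of O2:", delta_12)
```

With these inputs the sum of the four graphs is O1 − 3 O2 in the chosen units, matching the relative weights of δ11 and δ12 in eq. (7.70).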

By calculating in the same way the one-loop corrections to the operator O2 , we obtain
the counterterms δ22 and δ21 , that are equal to

δ21 = δ12 , δ22 = δ11 . (7.71)

Because of the mixing, the anomalous dimensions for the operators O1,2 form a
non-diagonal matrix

\[ \gamma_{ij} = M \frac{\partial \delta_{ij}}{\partial M} = \frac{g^2}{(4\pi)^2} \begin{pmatrix} -2 & 6 \\ 6 & -2 \end{pmatrix} . \tag{7.72} \]

In order to solve the coupled Callan-Symanzik equations (7.47), we must find a basis
of operators in which the matrix of anomalous dimensions becomes diagonal. This is
achieved by choosing⁵:

\[ O_{1/2} \equiv \frac{1}{2} \left[ O_1 - O_2 \right] , \]
\[ O_{3/2} \equiv \frac{1}{2} \left[ O_1 + O_2 \right] . \tag{7.73} \]

The corresponding eigenvalues of the matrix γij are

\[ \gamma_{1/2} = -8 \, \frac{g^2}{(4\pi)^2} , \quad \gamma_{3/2} = 4 \, \frac{g^2}{(4\pi)^2} . \tag{7.74} \]
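Both the matrix (7.72) and its eigenvalues (7.74) can be recovered numerically from the counterterms. The sketch below differentiates the M-dependence of eq. (7.70) close to D = 4 and then applies the resulting matrix to the combinations of eq. (7.73); everything is expressed in units of g^2/(4π)^2, and the helper names and the small parameter `eps` are illustrative assumptions.

```python
import math

def m_d_dm_of_pole(eps: float) -> float:
    """M d/dM of Gamma(2 - D/2) / M**(4 - D) at M = 1, for D = 4 - eps.
    It tends to -2 when eps goes to 0."""
    return -eps * math.gamma(eps / 2.0)

# Counterterm coefficients of eqs. (7.70) and (7.71), without the common factor.
DELTA = [[1.0, -3.0], [-3.0, 1.0]]

def anomalous_dimension_matrix(eps: float = 1e-8):
    """Matrix gamma_ij of eq. (7.72), in units of g^2 / (4 pi)^2."""
    slope = m_d_dm_of_pole(eps)
    return [[slope * entry for entry in row] for row in DELTA]

def apply(matrix, vector):
    """Matrix-vector product for 2 x 2 nested lists."""
    return [sum(matrix[i][j] * vector[j] for j in range(2)) for i in range(2)]

gamma = anomalous_dimension_matrix()
o_half = [0.5, -0.5]        # O_{1/2} = (O1 - O2) / 2
o_three_half = [0.5, 0.5]   # O_{3/2} = (O1 + O2) / 2

print("gamma_ij =", [[round(g, 6) for g in row] for row in gamma])
print("gamma . O_1/2 =", [round(c, 6) for c in apply(gamma, o_half)])
print("gamma . O_3/2 =", [round(c, 6) for c in apply(gamma, o_three_half)])
```

Applying the matrix to O_{1/2} rescales it by −8, and applying it to O_{3/2} rescales it by 4, in agreement with eq. (7.74).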

Using the equation (7.44) (the functions \mathcal{C}_{12}^i are equal to 1 at the first order of
perturbation theory) at a distance scale x ≈ M_W^{-1}, we obtain the following values for
the Wilson coefficients:

\[ C_{12}^{1/2}(M_W^{-1}; M) = \left[ \frac{\ln(M_W^2 / \Lambda_{QCD}^2)}{\ln(M^2 / \Lambda_{QCD}^2)} \right]^{\frac{4}{\beta_0}} , \]
\[ C_{12}^{3/2}(M_W^{-1}; M) = \left[ \frac{\ln(M_W^2 / \Lambda_{QCD}^2)}{\ln(M^2 / \Lambda_{QCD}^2)} \right]^{-\frac{2}{\beta_0}} . \tag{7.75} \]

⁵ The subscripts 1/2 and 3/2 are related to the isospin variation in the s quark decay mediated by these
operators.

Since MW ≫ M and β0 = 11 − 2Nf /3 is positive⁶, the operator A1 (x)A2 (0)
responsible for the weak decay of the quark s receives a larger contribution from the
operator O1/2 than from O3/2 (roughly by a factor 3.6 if we use M ≈ 500 MeV,
ΛQCD ≈ 150 MeV, and 5 quark flavours). This calculation qualitatively⁷ corroborates
the empirical observation that weak decays of Kaons correspond predominantly to an
isospin variation of 1/2.
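The quoted factor can be reproduced from eq. (7.75) in a few lines. The sketch below uses the values of M, Λ_QCD and Nf given above; the W mass of 80.4 GeV is an assumption added here, since the text does not quote it, and the variable names are illustrative.

```python
import math

M_W = 80.4          # GeV, assumed value of the W boson mass
M_KAON = 0.5        # GeV, scale M of Kaon decays quoted in the text
LAMBDA_QCD = 0.15   # GeV, value quoted in the text
N_F = 5             # number of quark flavours quoted in the text

beta0 = 11.0 - 2.0 * N_F / 3.0

# Ratio of logarithms entering eq. (7.75).
log_ratio = (math.log(M_W**2 / LAMBDA_QCD**2)
             / math.log(M_KAON**2 / LAMBDA_QCD**2))

c_half = log_ratio ** (4.0 / beta0)          # Wilson coefficient of O_{1/2}
c_three_half = log_ratio ** (-2.0 / beta0)   # Wilson coefficient of O_{3/2}
enhancement = c_half / c_three_half

print(f"beta0 = {beta0:.4f}")
print(f"C^(1/2) = {c_half:.3f}, C^(3/2) = {c_three_half:.3f}")
print(f"C^(1/2) / C^(3/2) = {enhancement:.2f}")
```

With these inputs the coefficient of O_{1/2} is enhanced above 1 and that of O_{3/2} is suppressed below 1, and their ratio comes out close to the factor 3.6 quoted in the text.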

7.5 Non-perturbative renormalization group


Until now, our discussion of renormalization has been strictly rooted in perturbation
theory and limited to the context of renormalizable theories, with the exception of the
section 7.2.4 where we discussed the running of the couplings in front of operators
of any dimension. In this framework, the renormalization flow is formalized by
the Callan-Symanzik equations, that describe the scale dependence of correlation
functions. However, the ideas behind renormalization have a much wider range of
application: they are also relevant non-perturbatively, and they may be applied directly
at the level of actions rather than correlation functions. In this section, we first develop
heuristically some general concepts related to the renormalization flow in an abstract
space of theories. These ideas are then made more tangible in the form of a functional
flow equation for the quantum effective action, whose solution interpolates between
the classical action and the full quantum action.

7.5.1 Kadanoff’s blocking for lattice spin systems

The general concepts of renormalization that we aim at introducing in this section can
be first exposed by considering the simple example of a system of spins on a lattice,
the simplest of which is the Ising model in two dimensions, which is exactly solvable
for interactions among nearest neighbors. This model is known to have a disordered
phase at high temperature, a ferromagnetic order at low temperature (where spins
align with an external magnetic field, no matter how small), and a second order phase
transition at a critical temperature T∗ . At the second order transition, the correlation
length of the system becomes infinite, despite the fact that the interactions are short
ranged. Roughly speaking, a measure of the complexity of the study of a discrete
physical system (at least if one attempts to do it from the theory that describes the
interactions among the microscopic degrees of freedom) is the number of elementary
degrees of freedom per correlation length. By this account, second order phase
transitions are among the hardest problems to analyze.
⁶ In this problem, Nf = 5 flavours of quarks should be taken into account in the running of the strong
coupling constant, in order to include all the quarks up to the mass of the W ± bosons.
⁷ The measured imbalance between the isospin variations 1/2 and 3/2 is even larger, but a quantitative
explanation would involve non-perturbative aspects of QCD.

Figure 7.2: Kadanoff’s block-spin renormalization. Top: the spins are grouped into 3 × 3
blocks. Middle: each block of 9 spins is replaced by a single spin determined by the
rule of majority. Bottom: the lattice is scaled down (new spins come into the picture,
that were previously outside of the represented area).

Kadanoff devised a method, called block-spin renormalization, to facilitate the
study of such a situation. The basic ideas of this method are illustrated in the figure
7.2. Firstly, one groups the spins into connected sets, for instance in 3 × 3 blocks as
shown in the figure. Then, the spins inside each of these blocks are replaced by some
sort of average spin. One possibility is to use the “rule of majority”: the new spin is
chosen to be up if five or more of the original spins were up, and down otherwise.
The physical motivation for this replacement is that the calculation of macroscopic
observables (e.g. the total magnetization in a large sample of the material under
consideration) does not require knowing in detail the value of each of the elementary
spins, and should be doable from these coarse grained variables. Of course, one
should adjust carefully the interactions among the newly introduced averaged spins,
so that the macroscopic properties of the system are unchanged. One may for instance
require that the partition function of the system is unmodified. In general, even if
the original Hamiltonian had only short range nearest-neighbor interactions, the
Hamiltonian that describes the coarse-grained spins may have long range interactions.
The block-spin renormalization comprises a third step, which consists in a rescaling of
distances so that each of the coarse-grained spins occupies the same area as one of the
original elementary spins (this step is necessary for the transformation to have fixed
points).
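The grouping and majority steps described above can be sketched in a few lines. This is only an illustration of the coarse graining of a spin configuration: it does not construct the Hamiltonian of the coarse-grained spins, and the rescaling step is implicit in the fact that the output is indexed on a lattice three times smaller. The function name `block_spin` and the random test configuration are assumptions made for the example.

```python
import random

def block_spin(lattice, b=3):
    """Replace each b x b block of spins (+1 or -1) by the majority spin.

    With b odd the block sum never vanishes, so the rule
    'up if five or more of nine spins are up' is the sign of the sum."""
    n = len(lattice)
    if n % b != 0:
        raise ValueError("lattice size must be a multiple of the block size")
    coarse = []
    for i in range(0, n, b):
        row = []
        for j in range(0, n, b):
            total = sum(lattice[i + di][j + dj]
                        for di in range(b) for dj in range(b))
            row.append(1 if total > 0 else -1)
        coarse.append(row)
    return coarse

# Illustrative use: iterate the step on a random 27 x 27 configuration.
random.seed(0)
spins = [[random.choice((-1, 1)) for _ in range(27)] for _ in range(27)]
while len(spins) > 1:
    spins = block_spin(spins)
    print(f"coarse-grained lattice: {len(spins)} x {len(spins)}")
```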
The combination of these three steps, called a (discrete) renormalization group
step R, may be viewed as transforming a bare action S0 into a renormalized action:

Sr ≡ R S0 . (7.76)

However, the real power of this idea comes by iterating the renormalization group
steps R until there are only a few of the coarse-grained spins in a macroscopic area
of the system. Under such a sequence of renormalization steps, the actions are
sequentially transformed as follows:

\[ S_0 \overset{R}{\longmapsto} S_1 \overset{R}{\longmapsto} S_2 \overset{R}{\longmapsto} S_3 \overset{R}{\longmapsto} \cdots \tag{7.77} \]

The behaviour of the mapping R^n for large n contains all the information we may
need about the macroscopic properties of the system. In particular, a critical point,
where the system has an infinite correlation length and is self-similar, corresponds to
a fixed point of this transformation, i.e. to an action S∗ that satisfies

S∗ = R S∗ . (7.78)
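A case where the iteration (7.77) and its fixed points can be followed explicitly is the one-dimensional Ising chain, which is not the two-dimensional model discussed above and is used here only as an illustrative assumption. Summing over every other spin maps the nearest-neighbor coupling K = J/T to K' = (1/2) ln cosh(2K), a standard exact result; the function names below are hypothetical.

```python
import math

def decimate(K: float) -> float:
    """One RG step for the 1D Ising chain: sum over every other spin.
    The new nearest-neighbor coupling is K' = (1/2) ln cosh(2 K)."""
    return 0.5 * math.log(math.cosh(2.0 * K))

def flow(K0: float, n_steps: int):
    """Return the sequence K0, R K0, R^2 K0, ... of eq. (7.77)."""
    couplings = [K0]
    for _ in range(n_steps):
        couplings.append(decimate(couplings[-1]))
    return couplings

for K in flow(1.0, 6):
    print(f"K = {K:.6f}")
```

In this example K = 0 satisfies the fixed point condition (7.78) and attracts every finite coupling, while the other fixed point sits at infinite K, which reflects the absence of a phase transition at non-zero temperature in one dimension.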

The concept of renormalization group introduced so far in the case of a discrete


system, consisting in a coarse graining followed by a rescaling, can be generalized to
a continuous system such as a quantum field theory. In this case, one introduces a
length scale ℓ, and the renormalization group transformation consists in integrating
out the smaller length scales. One may denote τ ≡ ln(ℓ/ℓ0 ), where ℓ0 is the initial
short distance scale. Thus, τ = 0 corresponds to the bare action at short distance, and
τ = +∞ corresponds to macroscopic distances, and the discrete steps of eq. (7.77)
are replaced by an equation of the form

\[ \partial_\tau S_\tau = H S_\tau , \tag{7.79} \]

where the RG flow for an infinitesimal step ∆τ is R = 1 + ∆τ H.

7.5.2 Wilsonian RG flow in theory space

One may view a given action S as a point in an abstract space, where each axis
corresponds to the coupling constant in front of a given operator. For instance, in the
case of a lattice spin system, there would be an axis for the strength of the interactions
among nearest neighbors, an axis for the strength of the interactions among sites
whose distance is √2 lattice units, and so on. In a scalar quantum field theory, these
could be the couplings for the operators φ□φ, φ^2, φ^4, φ^6, ... A renormalization
group transformation such as (7.76) defines a mapping of the points in this theory
space, either discrete or continuous depending on the system. We have illustrated this
in the continuous case in the figure 7.3, where the thick gray line shows how a bare
action S0 at short distance flows as the distance scale ℓ increases, leading to a theory
that may have very different couplings at macroscopic scales. Note that only three
out of many (possibly infinitely many for a continuous system) dimensions are shown
in the figure.

Figure 7.3: Renormalization group flow in theory space (the arrows go from UV to
IR scales). The black dot is a critical fixed point S∗ . The gray surface is the critical
surface, i.e. the universality class made of all the theories that flow into the critical
point. The light colored line, flowing away from the critical point, corresponds to the
direction of a relevant operator. The thick gray line illustrates the flow from a generic
initial action S0 .

As we have already mentioned in the previous section, a critical point must be a
fixed point of this mapping, e.g. the point S∗ in the figure 7.3. Important properties of
the renormalization group flow may be learned by linearizing the flow in the vicinity
of such a fixed point, by writing

\[ S \equiv S_* + \Delta S , \quad H S_* = 0 , \quad H S = L \, \Delta S + \cdots , \tag{7.80} \]

where L is a linear mapping. Then, one may define the eigenoperators of L,

\[ L \, O_n = \lambda_n O_n , \tag{7.81} \]

where λn is the corresponding eigenvalue. In the vicinity of the fixed point, we thus
have

\[ S \approx S_* + \sum_n c_n e^{\lambda_n \tau} O_n , \tag{7.82} \]

where the cn are coefficients determined by initial conditions. This expression leads
to the following classification of operators⁸:

• λn < 0 : such an operator corresponds to an attractive direction in the vicinity


of the fixed point. Even if the action contains this operator at some short
distance scale, its coupling vanishes as one gets close to the critical point. This
operator is said to be irrelevant, because it plays no role in the long distance
critical phenomena.

• λn > 0 : this operator corresponds to a repulsive direction in the vicinity of S∗ .


Any admixture of this operator will grow as one goes to larger distance scales.
An operator with a positive eigenvalue is called relevant.

• λn = 0 : such an operator is called marginal. Usually, it means that the operator


may either grow or shrink, but slower than exponentially (and a more refined
calculation that goes beyond this linear analysis is necessary in order to decide
between the two behaviours).
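The classification above amounts to reading off the sign of each eigenvalue in eq. (7.82). The sketch below does this for a made-up set of eigenvalues and follows the corresponding couplings as τ increases; the function names, the tolerance `tol` and the sample values are all illustrative assumptions, and complex eigenvalues are not handled.

```python
import math

def classify(eigenvalue: float, tol: float = 1e-12) -> str:
    """Classify an eigenoperator of the linearized flow by the sign of lambda_n."""
    if eigenvalue < -tol:
        return "irrelevant"
    if eigenvalue > tol:
        return "relevant"
    return "marginal"

def coupling_at(c_n: float, eigenvalue: float, tau: float) -> float:
    """Coefficient c_n * exp(lambda_n * tau) of O_n in eq. (7.82)."""
    return c_n * math.exp(eigenvalue * tau)

# Made-up eigenvalues, each with an initial coupling c_n = 0.1 at tau = 0.
SAMPLE = {"O_a": -2.0, "O_b": 0.0, "O_c": 1.0}

for name, lam in SAMPLE.items():
    values = [coupling_at(0.1, lam, tau) for tau in (0.0, 2.0, 4.0)]
    trend = ", ".join(f"{v:.4f}" for v in values)
    print(f"{name}: lambda = {lam:+.1f} ({classify(lam)}), couplings: {trend}")
```

At this linear level a marginal coupling stays constant, so deciding whether it actually grows or shrinks requires going beyond eq. (7.82), as noted in the last item of the list.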

The previous discussion, based on a linear analysis near the critical point, may be
extended globally as follows. One defines the critical surface as the domain of theory
space which is attracted into the critical point as the length scale goes to infinity.
All the bare actions that lie in this domain (the shaded surface in the figure 7.3)
describe systems that have the same long distance behaviour. Despite the fact that
these systems may correspond to completely different microscopic degrees of freedom
and interactions, they are described by the same action S∗ at large distances. For
this reason, this domain is also called the universality class of the critical point. The
relevant operators correspond to the directions of theory space that are “orthogonal”
to the critical surface. The term relevant follows from the fact that the coupling of
these operators must be fine-tuned in order to be on the critical surface: in other
words, the relevant couplings matter for making the system critical. A remarkable
aspect of phase transitions is that the number of these relevant operators is small⁹,
despite the fact that the microscopic interactions may require a very large number of
distinct couplings. Heuristically, this follows from a dimensional argument: since
the action is dimensionless, the coupling constants of higher dimensional operators
must have a negative mass dimension, and therefore they scale as inverse powers of
the ultraviolet cutoff. Thus, these operators are irrelevant. Only operators of low
dimensionality can be relevant, and there is usually a (small) finite number of them¹⁰.

⁸ This discussion does not exhaust all the possibilities. Firstly, in a theory space with two or more
dimensions, eigenvalues can be complex valued, corresponding to RG trajectories that spiral around the
fixed point (spiraling inwards if the real part is negative and outwards if it is positive). Another possibility
is limit cycles (i.e. closed RG trajectories), that play a role for instance in the Efimov effect (a scaling
law in the binding energies of 3-boson bound states when the 2-body interaction is too weak to have a
two-body bound state).
⁹ In the case of the 2-dimensional Ising model, the only parameters that need to be adjusted in order to
reach the critical point are the temperature (T_*^{-1} ≈ 0.44) and the external field (equal to zero).
Let us now consider the domain that originates from the fixed point (the light
colored line in the figure 7.3), sometimes called the ultraviolet critical surface. This
is the domain spanned by the renormalization group flow if one starts from an
infinitesimal region around the fixed point. Any theory that lies on the UV critical
surface is renormalizable, since it evolves into the fixed point at short distance: this
indeed means that one may safely send the ultraviolet cutoff to infinity in such a
theory (this corresponds to moving in the direction opposite to the arrows in the figure
7.3). Note also that theories on the UV critical surface transform into one another
under the renormalization flow, but the couplings of the various relevant operators
depend on the scale. The following situations may occur:

• For such a theory to be renormalizable in the perturbative sense, the couplings
should remain small all the way to the ultraviolet scales. This happens when
the fixed point is a Gaussian fixed point, whose action S∗ contains only a
kinetic term (i.e. is Gaussian in the fields). This is the case for quantum
chromodynamics, thanks to asymptotic freedom.

• It may also happen that around a Gaussian fixed point, the only relevant operators
are quadratic in the fields, like mass and kinetic terms. In this case,
there is no interacting renormalizable action, and the theory is said to suffer
from triviality. There is nowadays strong evidence that, in a pure real scalar
field theory, the operator $\phi^4$ is not relevant in four space-time dimensions (it is
relevant in three dimensions or less) and therefore such a field theory is trivial
because only the non-interacting theory makes sense.

• When the fixed point is a non-trivial interacting fixed point instead of a Gaussian
one, the theories on the UV critical surface are also renormalizable, but their
high energy behaviour cannot be studied by perturbative means. This situation
is called asymptotic safety¹¹.

To conclude this discussion, let us say a word about generic RG trajectories,
i.e. neither located on the critical surface nor on the UV critical surface, such as
the line originating from the short distance action $S_0$ in the figure 7.3. Generically,
when evolving towards larger length scales, the irrelevant couplings decrease and the
relevant ones increase, and the action approaches that of a renormalizable theory. This
sets in a more general framework our observation of the section 7.2.4 (there, it was
largely based on dimensional analysis). Moreover, if the microscopic action $S_0$ starts
close to but not exactly on the critical surface, the theory first approaches the critical
point upon increasing the length scale, but instead of reaching it, it departs from it
on even larger scales to follow one of the repulsive directions. In such a system, the
correlation length may be large but not infinite as it would be at the critical point (the
turning point between the approach of the critical point and the subsequent departure
from it happens roughly when the RG scale equals the correlation length).

¹⁰ An exception to this assertion is the renormalization group on the light-cone used in the study of deep
inelastic scattering. There, peculiarities of the kinematics lead to an infinite number of relevant operators.
¹¹ The concept of asymptotic safety was introduced by Weinberg, as a logical possibility for a
renormalizable quantum field theory of gravity.
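
The approach to the critical point followed by a departure from it can be illustrated with a toy model. The sketch below assumes a flow linearized around the fixed point, with one relevant and one irrelevant coupling; the function names, the exponents `y_rel`, `y_irr` and the initial values are hypothetical choices made for illustration, not quantities taken from the text.

```python
import math


def linear_flow(g_rel0, g_irr0, y_rel, y_irr, t):
    """Couplings at RG 'time' t = ln(length scale) for a linearized flow.

    The relevant coupling grows like exp(+y_rel * t), the irrelevant one
    decays like exp(-y_irr * t). Both exponents are taken positive.
    """
    return g_rel0 * math.exp(y_rel * t), g_irr0 * math.exp(-y_irr * t)


def distance_to_fixed_point(g_rel0, g_irr0, y_rel, y_irr, t):
    """Euclidean distance to the fixed point (located at the origin)."""
    g_rel, g_irr = linear_flow(g_rel0, g_irr0, y_rel, y_irr, t)
    return math.hypot(g_rel, g_irr)


def turning_time(g_rel0, g_irr0, y_rel, y_irr):
    """RG time at which the distance to the fixed point is minimal.

    Obtained by setting d/dt (g_rel^2 + g_irr^2) = 0.
    """
    ratio = y_irr * g_irr0**2 / (y_rel * g_rel0**2)
    return math.log(ratio) / (2.0 * (y_rel + y_irr))


if __name__ == "__main__":
    # Start very close to the critical surface (tiny relevant coupling).
    t_star = turning_time(1e-4, 1.0, 1.0, 1.0)
    print("turning point at t =", t_star)
    for t in (0.0, t_star, 2.0 * t_star):
        print(t, distance_to_fixed_point(1e-4, 1.0, 1.0, 1.0, t))
```

In this toy flow, tuning the initial relevant coupling closer to zero pushes the turning point to larger RG times, which mirrors a correlation length that grows but stays finite.
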

7.5.3 Functional RG equation for scalar theories

The block-spin renormalization procedure that we have discussed in the section 7.5.1
can be extended to the case of a continuous system such as a quantum field theory.
Moreover, while our discussion has been so far qualitative, we shall now derive an
explicit RG flow equation for the quantum effective action, the solution of which
would provide the full quantum content from tree level contributions only.

Reminders about the quantum effective action : Let us first recall some basic
results about the quantum effective action Γ [φ], taken from the section 2.6. It is
related to the generating functional W[j] of connected Feynman graphs by
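$$ i\,\Gamma[\phi] = W[j_\phi] - i\int d^4x\; j_\phi(x)\,\phi(x) , \tag{7.83} $$

where the current $j_\phi$ is defined implicitly by

$$ \frac{\delta\Gamma[\phi]}{\delta\phi(x)} + j_\phi(x) = 0 , \tag{7.84} $$

or equivalently in terms of $W$ by

$$ \phi(x) = \left.\frac{\delta W[j]}{i\,\delta j(x)}\right|_{j=j_\phi} . \tag{7.85} $$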

In other words, jφ (x) is the external source such that the expectation value of the
field is φ(x). By combining the path integral representation of W,
$$ e^{W[j]} = \int \big[D\phi(x)\big]\; \exp\Big[ iS[\phi(x)] + i\int d^4x\; j(x)\,\phi(x) \Big] , \tag{7.86} $$

with eqs. (7.83) and (7.84) we obtain the following functional equation satisfied by
the effective action $\Gamma$:

$$ e^{i\,\Gamma[\varphi]} = \int \big[D\phi(x)\big]\; \exp\Big[ iS[\phi+\varphi] - i\int d^4x\; \phi(x)\,\frac{\delta\Gamma[\varphi]}{\delta\varphi(x)} \Big] . \tag{7.87} $$

(We have performed a shift $\phi \to \phi + \varphi$ in the dummy functional integration variable.)
Although this equation formally defines the quantum effective action, its use is not
convenient because it still contains a path integration. Physically, this difficulty is
related to the fact that the equation integrates out all the length scales at once. The
functional RG equation that we derive now circumvents this problem by integrating
out quantum fluctuations only in a small range of scales at a time.

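Regularized generating functional : Let us introduce a momentum scale $\kappa$ and
define

$$ e^{W_\kappa[j]} \equiv \exp\Big( i\,\Delta S_\kappa\Big[\frac{\delta}{i\,\delta j}\Big] \Big)\, Z[j]
 = \int \big[D\phi(x)\big]\; \exp\Big[ i\big( S[\phi] + \Delta S_\kappa[\phi] \big) + i\int j\,\phi \Big] , \tag{7.88} $$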

where $Z[j]$ is the usual generating functional for time ordered correlation functions
and $\Delta S_\kappa$ is defined in terms of the Fourier transform of the fields as follows:

$$ \Delta S_\kappa[\phi] \equiv \frac{1}{2}\int \frac{d^4p}{(2\pi)^4}\; \widetilde{\phi}(-p)\, R_\kappa(p)\, \widetilde{\phi}(p) . \tag{7.89} $$

$R_\kappa$ is an ordinary function that plays the role of a cutoff in momentum. At low
momentum $p/\kappa \ll 1$, it should be positive in order to give a mass for the soft modes,
and thus provide an infrared regulator:

$$ \lim_{p/\kappa\to 0} R_\kappa(p) = \mu^2 > 0 . \tag{7.90} $$

Moreover, this function is assumed to go to zero when the scale $\kappa \to 0$,

$$ \lim_{\kappa\to 0} R_\kappa(p) = 0 , \tag{7.91} $$

which means that the cutoff plays no role in this limit and we recover the full quantum
theory. This is the limit we aim at reaching at the end of the RG flow. In contrast, it
should become large when $\kappa \to \infty$:

$$ \lim_{\kappa\to\infty} R_\kappa(p) = \infty . \tag{7.92} $$

This property ensures that when κ is large, the right hand side of eq. (7.88) is
dominated by the saddle point, so that the corresponding effective action equals the
classical action.
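
As an illustration of the conditions (7.90-7.92), the sketch below checks them numerically for one possible choice of cutoff function. The exponential form used here, the Euclidean momentum squared `p2`, and the function name `regulator` are assumptions made for the example (the text does not single out a specific regulator); for this choice the infrared mass scale is $\mu^2 = \kappa^2$.

```python
import math


def regulator(p2, kappa):
    """Exponential cutoff R_kappa(p) = p^2 / (exp(p^2 / kappa^2) - 1).

    p2 is the (Euclidean) momentum squared, kappa > 0 is the RG scale.
    This specific form is an illustrative assumption.
    """
    if p2 == 0.0:
        return kappa**2           # limit p -> 0 of the expression below
    x = p2 / kappa**2
    if x > 700.0:                 # exp(x) would overflow; R is negligible here
        return 0.0
    return p2 / math.expm1(x)     # expm1 stays accurate for small x


if __name__ == "__main__":
    # Condition (7.90): soft modes get a positive mass term (here kappa^2).
    print(regulator(1e-8, 2.0))
    # Condition (7.91): the cutoff disappears when kappa -> 0 at fixed p.
    print(regulator(1.0, 0.1))
    # Condition (7.92): the cutoff grows without bound when kappa -> infinity.
    print(regulator(1.0, 1e3))
```
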

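Scale dependence of $W_\kappa$ : By denoting $\tau \equiv \ln(\kappa/\Lambda)$ (where $\Lambda$ is the ultraviolet
scale at which the classical action is defined), we have

$$ \partial_\tau W_\kappa[j] = i\,\partial_\tau \Delta S_\kappa\big[\langle\phi\rangle_\kappa\big]
 + \frac{i}{2}\int \frac{d^4p}{(2\pi)^4}\; \partial_\tau R_\kappa(p)\; \widetilde{G}_\kappa(p) , \tag{7.93} $$

where $\widetilde{G}_\kappa(p)$ is the Fourier transform of the connected 2-point function obtained from $W_\kappa[j]$,

$$ G_\kappa(x,y) \equiv \frac{\delta^2 W_\kappa[j]}{i\,\delta j(x)\; i\,\delta j(y)} , \tag{7.94} $$

and $\langle\phi\rangle_\kappa$ is the corresponding 1-point function:

$$ \langle\phi(x)\rangle_\kappa \equiv \frac{\delta W_\kappa[j]}{i\,\delta j(x)} . \tag{7.95} $$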

Scale dependent effective action : Let us now alter the definition (7.83) in order
to make it depend on the scale $\kappa$, by writing

$$ \Gamma_\kappa[\phi] + \Delta S_\kappa[\phi] = -i\,W_\kappa[j_\phi] - \int d^4x\; j_\phi(x)\,\phi(x) . \tag{7.96} $$

The left hand side is written as $\Gamma_\kappa + \Delta S_\kappa$ in order not to include in the definition of
the effective action the unphysical regulator $\Delta S_\kappa$. As in the original definition, the
field $\phi$ and the current $j_\phi$ are related by

$$ \phi(x) = \left.\frac{\delta W_\kappa[j]}{i\,\delta j(x)}\right|_{j=j_\phi} . \tag{7.97} $$

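In terms of $\Gamma_\kappa$ this relationship reads

$$ j_\phi(x) + \frac{\delta\Gamma_\kappa[\phi]}{\delta\phi(x)} + \big[R_\kappa\,\phi\big](x) = 0 , \tag{7.98} $$

where $[R_\kappa\,\phi](x) \equiv \int d^4y\; R_\kappa(x,y)\,\phi(y)$ denotes the regulator acting on the field in coordinate space.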

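Differentiating eq. (7.97) with respect to $j(y)$ and eq. (7.98) with respect to $\phi(y)$, and
multiplying the results, we obtain the following identity:

$$ i\,\delta(x-y) = \int d^4z\; \underbrace{\frac{\delta^2 W_\kappa[j]}{i\,\delta j(y)\; i\,\delta j(z)}}_{G_\kappa(y,z)}
 \Bigg[ \underbrace{\frac{\delta^2 \Gamma_\kappa[\phi_j]}{\delta\phi_j(z)\,\delta\phi_j(x)}}_{\Gamma_{\kappa,2}(z,x)} + R_\kappa(z,x) \Bigg] , \tag{7.99} $$

that generalizes eq. (2.94).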



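Flow equation for $\Gamma_\kappa$ : Now, we can differentiate eq. (7.96) with respect to the
scale:

$$
\begin{aligned}
\partial_\tau \Gamma_\kappa[\phi]
&= -\partial_\tau \Delta S_\kappa[\phi] - i\,\partial_\tau W_\kappa[j_\phi] - \int d^4x\; \phi(x)\,\partial_\tau j_\phi(x) \\
&= -\partial_\tau \Delta S_\kappa[\phi] - i\,\Big[\partial_\tau W_\kappa[j]\Big]_{j=j_\phi}
   - i\int d^4x\; \frac{\delta W_\kappa[j_\phi]}{\delta j_\phi(x)}\,\partial_\tau j_\phi(x)
   - \int d^4x\; \phi(x)\,\partial_\tau j_\phi(x) \\
&= \frac{1}{2}\int \frac{d^4p}{(2\pi)^4}\; \partial_\tau R_\kappa(p)\; \widetilde{G}_\kappa(p) .
\end{aligned}
\tag{7.100}
$$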

In the second line, we have made explicit the fact that $W_\kappa[j_\phi]$ contains both an
intrinsic scale dependence and an implicit one from the $\kappa$ dependence of its argument
$j_\phi$. Using eq. (7.99), this can be put into the following form:

$$ \partial_\tau \Gamma_\kappa = \frac{i}{2}\,\mathrm{Tr}\,\Bigg( (\partial_\tau R_\kappa)\,\bigg[ \frac{\delta^2 \Gamma_\kappa[\phi]}{\delta\phi\,\delta\phi} + R_\kappa \bigg]^{-1} \Bigg) , \tag{7.101} $$

that depends only on $\Gamma_\kappa$ (the integral over the momentum $p$ has been written compactly
in the form of a trace). Let us make a few remarks concerning this equation:

• It describes the renormalization group trajectory of the effective action in theory
space, starting from the bare classical action at $\kappa = \infty$ and going to the full
quantum effective action when $\kappa \to 0$.

• This equation is a functional differential equation, that does not involve any
functional integral, unlike eq. (7.87). Nevertheless, it cannot be solved exactly
in general, and various truncation schemes have been devised in order to obtain
physical results.
• The term Rκ in the denominator provides an infrared regularization (by adding
a kind of mass term to the inverse propagator).
• The factor ∂τ Rκ is peaked around momentum modes of order κ. Thus, the
right hand side is rather localized in momentum space, in contrast with the
equation (7.87) that includes all the momentum scales at once.
• The choice of the regularizing function Rκ is not unique, provided that it fulfills
the conditions (7.90-7.92). Consequently, the renormalization group trajectories
depend somehow on this choice (this may be viewed as a dependence on the
renormalization scheme). However, the fixed points of the renormalization
group flow do not depend on this choice.
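
The step from eq. (7.100) to eq. (7.101) may be spelled out as follows. This is only a sketch in a condensed operator notation (products stand for convolutions over the intermediate coordinate, and $\mathbf{1}$ for the delta function), which is a shorthand introduced here rather than the notation of the text.

```latex
% Eq. (7.99) in operator form: up to a factor i, the connected 2-point function
% is the inverse of the regularized second derivative of the effective action.
G_\kappa \cdot \big( \Gamma_{\kappa,2} + R_\kappa \big) = i\,\mathbf{1}
\quad\Longrightarrow\quad
G_\kappa = i\,\big( \Gamma_{\kappa,2} + R_\kappa \big)^{-1} .

% Inserting this into the last line of eq. (7.100), with the momentum
% integral written as a trace:
\partial_\tau \Gamma_\kappa
 = \frac{1}{2}\,\mathrm{Tr}\,\big[ (\partial_\tau R_\kappa)\, G_\kappa \big]
 = \frac{i}{2}\,\mathrm{Tr}\,\Big[ (\partial_\tau R_\kappa)\,
   \big( \Gamma_{\kappa,2} + R_\kappa \big)^{-1} \Big] .
```
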
Chapter 8

Effective field theories

Until now, we have discussed various quantum field theories (the electroweak theory
and quantum chromodynamics) that are believed to provide a unified description of
all particle physics up to the scale of electroweak symmetry breaking, i.e. roughly
$\Lambda_{EW} \sim 200$ GeV. However, it is hard to imagine that there are no new
physical phenomena (new particles, new interactions) at higher energy scales (so
far out of reach of experimental searches). An interesting question is therefore to
understand why the Standard Model is such a good description of physics below the
electroweak scale, despite the fact that it does not contain any of the physics at higher
scales. In other words, despite the fact that there is distinct physics on scales that
span many orders of magnitude, why can “low energy” phenomena be described by
ignoring most of the higher scales? The same question could be asked in other areas:
for instance, why can chemistry (i.e. phenomena of atomic bonding in molecules)
get away without any of the complications of quantum electrodynamics? The general
question is that of the separation between various physical scales.

In the context of quantum field theory, such a low energy description is called
an effective theory. The basic idea is that most of the details of an underlying more
fundamental (i.e. valid at higher energy) description are not important at lower
energies, except for a small number of parameters. As we shall see in this chapter,
effective field theories may occur in several situations:

• Top-down : the quantum field theory which is valid at higher energy is known,
but it is unnecessarily complicated to describe phenomena at lower energy
scales. A typical example is that of a theory that contains particles that are
much heavier than the energy scale of interest (e.g., the top quark in quantum
chromodynamics, while one is interested in interactions at the GeV scale). In
this case, the effective theory “integrates out” the higher mass particles in order
to obtain a simpler theory.

• Bottom-up : we have a theory believed to be valid at a given energy scale, but
have no clear idea of what may exist at higher scales. In this case, one may
view the existing theory as an effective description of some (so far unknown)
more fundamental theory at higher energy, and try to complete it by adding
new (higher dimensional, and therefore usually non-renormalizable in four
dimensions) local interactions to it.
• Symmetry driven : even when the underlying theory is known, its direct applica-
tion may be rendered very impractical because the physics of interest involves
some non-perturbative phenomena, such as the formation of bound states (for
instance, in QCD at low energy, the quarks and gluons cease to be the relevant
degrees of freedom and the physical excitations are the light hadrons). An effec-
tive theory for these bound states may be constructed from the requirement that
it should be consistent with the symmetries of the underlying theory. This case
differs from the top-down approach in the sense that the low energy description
is not constructed by integrating out the high scales, but solely from symmetry
considerations.

In the top-down approach, where the fundamental underlying theory is known, the
goal of obtaining an effective description for low energy phenomena could in principle
be achieved by the renormalization group. In particular, the functional renormalization
group introduced in the section 7.5.3 allows one to evolve from an ultraviolet classical
action towards a low energy quantum effective action, by progressively integrating
out layers of lower and lower momentum. There is nothing wrong with this approach,
but one has to keep in mind that the effective action obtained in this way is usually
extremely complicated and cumbersome to use in practical applications (in particular,
it could have infinitely many effective interactions, all of which are in general non-
local). In a sense, the quantum effective action that results from the RG evolution
is much more complex than the original ultraviolet action, and the gain in terms of
simplicity is rather dubious. In contrast, the concept of effective theory that we
are aiming at in this chapter is a field theory in which the ultraviolet physics is
encapsulated into a finite number of local operators, with coupling constants that may
depend on the energy scale and on the properties of the degrees of freedom that have
been integrated out.

8.1 General principles of effective theories

8.1.1 Low energy effective action


For the purpose of this general discussion, let us consider a quantum field theory in
which the fields are collectively denoted φ (this may be a single field, or a collection
of several fields) and a classical action S[φ]. We view this theory as the high energy
theory, and we wish to construct an alternative description applicable to low energy
phenomena, below some energy scale $\Lambda$. To this effect, let us assume that we can
split the field into a low frequency part (soft) and a high frequency part (hard),

$$ \phi \equiv \phi_S + \phi_H . \tag{8.1} $$

This separation may be achieved by a cutoff in Fourier space, but the details of how
this is done are not important at this level of discussion. The classical action of
the original theory is thus a function of $\phi_S$ and $\phi_H$ and the path integration is over
the soft and the hard components of the field. Now, assume that we are interested
in calculating the expectation value of an observable that depends only on the soft
component of the field, $O(\phi_S)$. Then, we may write

$$ \langle O \rangle = \int \big[D\phi_S\, D\phi_H\big]\; e^{iS[\phi_S,\phi_H]}\; O(\phi_S)
 = \int \big[D\phi_S\big]\; e^{iS_\Lambda[\phi_S]}\; O(\phi_S) , \tag{8.2} $$

where in the second equality we have defined

$$ e^{iS_\Lambda[\phi_S]} \equiv \int \big[D\phi_H\big]\; e^{iS[\phi_S,\phi_H]} . \tag{8.3} $$

$S_\Lambda[\phi_S]$ is the action of the low energy effective theory. Using the operator product
expansion, it may be written as a sum of local operators, possibly infinitely many of
them:

$$ S_\Lambda[\phi_S] \equiv \int d^dx\; \sum_n \lambda_n\, O_n . \tag{8.4} $$
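
How a tower of local operators can arise is easiest to see at tree level in a toy model. The sketch below is an illustrative assumption, not an example taken from the text: instead of a soft/hard split of a single field, it uses a light field $\phi$ coupled to a separate heavy field $\Phi$ of mass $M$ through a hypothetical coupling $g$, and it eliminates $\Phi$ with its classical equation of motion.

```latex
% Toy model (assumption): light field \phi, heavy field \Phi of mass M
\mathcal{L} = \tfrac{1}{2}(\partial_\mu\phi)(\partial^\mu\phi)
            + \tfrac{1}{2}(\partial_\mu\Phi)(\partial^\mu\Phi)
            - \tfrac{1}{2}M^2\,\Phi^2 - g\,\phi^2\,\Phi .

% Tree level: solve the classical equation of motion of the heavy field
% and expand the inverse operator for momenta much smaller than M
(\square + M^2)\,\Phi = -g\,\phi^2
\quad\Longrightarrow\quad
\Phi = -\frac{g}{\square + M^2}\,\phi^2
     = -\frac{g}{M^2}\Big( 1 - \frac{\square}{M^2} + \cdots \Big)\,\phi^2 .

% Substituting back gives local operators suppressed by powers of 1/M^2
\mathcal{L}_{\rm eff} = \tfrac{1}{2}(\partial_\mu\phi)(\partial^\mu\phi)
  + \frac{g^2}{2M^2}\,\phi^4
  - \frac{g^2}{2M^4}\,\phi^2\,\square\,\phi^2 + \cdots
```

Each additional pair of derivatives comes with an extra factor $1/M^2$, so only the first few operators matter at energies far below $M$.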

8.1.2 Power counting


The behaviour of the couplings $\lambda_n$ can be inferred from dimensional analysis. For
the sake of this discussion, let us consider the case where $\phi$ is a scalar field, whose
mass dimension is $\phi \sim (\text{mass})^{(d-2)/2}$ in $d$ spacetime dimensions. If the operator $O_n$
contains $N_n$ powers of the field $\phi$ and $D_n$ derivatives, its dimension is

$$ O_n \sim (\text{mass})^{d_n} \quad\text{with}\quad d_n = D_n + N_n\,\frac{d-2}{2} , \tag{8.5} $$

and it must be accompanied with a coupling $\lambda_n$ whose dimension is $(\text{mass})^{d-d_n}$.
Assuming that the cutoff $\Lambda$ is the only dimensionful parameter that enters in the
construction of the effective theory (except for the field operator and derivatives,
that enter in the operators $O_n$), we must have $\lambda_n = \Lambda^{d-d_n}\, g_n$, where $g_n$ is a
dimensionless constant, whose numerical value is typically of order one.
Consider now the application of this effective theory to the study of a phenomenon
characterized by a single energy scale $E$. On dimensional grounds, we have

$$ \int d^dx\; O_n \sim E^{d_n - d} . \tag{8.6} $$

Combined with the corresponding coupling constant, the contribution of this operator
would be of order

$$ \lambda_n \int d^dx\; O_n \sim g_n \Big( \frac{\Lambda}{E} \Big)^{d-d_n} . \tag{8.7} $$
This estimate is the basis of the following classification of the operators that may
enter in the action of the effective theory:

• $d_n > d$ : the contribution of these operators is suppressed at low energy, i.e.
when $E \ll \Lambda$. For this reason, these operators are called irrelevant. This does
not mean that their contribution is not important and interesting, since there
may be observables for which they are the sole contribution. Note also that
these operators are non-renormalizable by the standard power counting rules.

• $d_n = d$ : the contribution of these operators does not depend on the ratio of
scales $E/\Lambda$, except perhaps via logarithms. These operators are called marginal,
and correspond to renormalizable operators.

• $d_n < d$ : the contribution of these operators becomes more and more important
as the energy scale decreases. These operators, called relevant, are super-
renormalizable.

Recall also that a higher dimension $d_n$ corresponds to operators of greater complexity
(since in $d > 2$ the dimension increases with more powers of the field or more
derivatives). Therefore, there is in general only a finite number of operators whose
dimension is below a given value. For a given cutoff $\Lambda$ and an energy scale $E$, one
must therefore only consider a finite number of operators in order to reach a given
accuracy.
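
The classification above lends itself to a small numerical sketch for scalar operators, based on eqs. (8.5) and (8.7). The helper names below are hypothetical, and the dimensionless coupling is set to `g = 1` as a stand-in for "of order one".

```python
from fractions import Fraction


def operator_dimension(n_fields, n_derivatives, d=4):
    """Mass dimension d_n = D_n + N_n (d - 2) / 2 of a scalar operator, eq. (8.5)."""
    return n_derivatives + n_fields * Fraction(d - 2, 2)


def classify(n_fields, n_derivatives, d=4):
    """Return 'relevant', 'marginal' or 'irrelevant' by naive power counting."""
    dn = operator_dimension(n_fields, n_derivatives, d)
    if dn > d:
        return "irrelevant"
    if dn == d:
        return "marginal"
    return "relevant"


def relative_size(n_fields, n_derivatives, energy, cutoff, d=4, g=1.0):
    """Estimate g_n (Lambda / E)^(d - d_n) of eq. (8.7)."""
    dn = operator_dimension(n_fields, n_derivatives, d)
    return g * (cutoff / energy) ** float(d - dn)


if __name__ == "__main__":
    for n_fields in (2, 4, 6):
        print("phi^%d in d=4:" % n_fields, classify(n_fields, 0),
              relative_size(n_fields, 0, energy=1.0, cutoff=10.0))
    print("phi^4 in d=3:", classify(4, 0, d=3))
```

With a cutoff ten times larger than the energy of interest, the $\phi^6$ operator in four dimensions is suppressed by $(E/\Lambda)^2 = 10^{-2}$, while the mass operator $\phi^2$ is enhanced by the inverse factor.
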
In a conventional quantum field theory, one usually insists on including only
renormalizable operators, in order to avoid the proliferation of new couplings at each
order of perturbation theory, and the usual statement of renormalizability amounts to
saying that all infinities may be absorbed into the redefinition of a finite number of
parameters of the theory, at every order of perturbation theory. In contrast, since a
low energy effective theory may contain operators of dimension $d_n > d$, it is usually
not renormalizable in this usual sense, but the cutoff $\Lambda$ provides a natural way of
keeping all the contributions finite. In this case, the power counting is organized by
the fact that the cutoff $\Lambda$ is also the dimensionful scale that enters in the couplings of
negative mass dimension that come with operators of mass dimension greater than
four. For instance, an operator of dimension 6 has a coupling constant that scales
as $\Lambda^{-2}$, and physical observables may be expanded in powers of $E/\Lambda$, where $E$ is
some low energy scale. In the presence of such higher dimensional operators, the
usual statement of renormalizability must now be replaced by a weaker assertion:
namely, that all the ultraviolet divergences that occur at a given order in $E/\Lambda$ can be
absorbed into the redefinition of a finite number of parameters. More precisely, in
order to calculate consistently effects of order $\Lambda^{-r}$, we must include all operators up
to a mass dimension of $4 + r$. Thus, the number of constants that must be adjusted in
the renormalization process grows as we go to higher order.
In the case of top-down effective theories, the renormalizability of the underlying
field theory implies that the low energy physics depends on the ultraviolet only
through the values of the relevant and marginal couplings. In addition, a small number
of irrelevant couplings may matter in certain specific observables (e.g., if an irrelevant
operator is the only one that contributes). In fact, if the cutoff of the effective theory
is high enough compared to the physical energy scale of interest, the effective theory
can have a very strong predictive power, despite the fact that it a priori contains an
infinity of operators. But conversely, in a bottom-up approach where we try to extend
a renormalizable theory by adding to it higher dimensional operators, the fact that the
low energy theory is renormalizable implies that it is not sensitive to the scale of new
physics (in other words, a renormalizable low energy theory cannot predict at which
high energy scale it breaks down and is superseded by another theory).

8.1.3 Relevant operators

In fact, in an effective theory, the relevant operators (super-renormalizable) are often
more troublesome than the irrelevant ones (non-renormalizable). Consider for instance
the operator $\phi^2$, that corresponds to the mass term in the effective Lagrangian and
has dimension $\phi^2 \sim (\text{mass})^{d-2}$, and whose corresponding coupling has dimension
$(\text{mass})^2$, i.e. $\lambda = g\,\Lambda^2$. Thus, small masses are not natural in a low energy effective
theory: the natural scale of a mass is that of the cutoff $\Lambda$ (the dimensionless coupling
$g$ is generically of order one). In order to obtain small masses in a low energy effective
field theory, there must be some symmetry that prevents the corresponding mass term,
such as:

• A gauge symmetry for spin 1 particles.

• A chiral symmetry for fermions.

• A spontaneous breaking of symmetry, so that some scalars are the corresponding
massless Nambu-Goldstone bosons.

• Supersymmetry may also forbid certain types of mass terms (if unbroken, the
mass must be strictly zero, and if broken, the mass will settle to a value close
to the scale of supersymmetry breaking).

By that account, the Standard Model (without any supersymmetric extension) is not
natural, since it does not contain any mechanism to prevent the mass of the Higgs
scalar boson from being at the cutoff scale (possibly much higher than the electroweak scale)
where the Standard Model is superseded by a more fundamental theory.

Likewise, relevant interaction terms have a large contribution to low energy
observables, that scales like

$$ \Big( \frac{\Lambda}{E} \Big)^{d-d_n} \gg 1 \quad\text{with}\quad d > d_n . \tag{8.8} $$

Therefore, the existence of relevant interaction terms implies that the dynamics is
strongly coupled at low energy. This may lead to the formation of bound states
or condensates, which calls for a low energy effective theory that contains different
degrees of freedom. An example is that of the identity operator, which is not forbidden
by any symmetry and has mass dimension 0 (therefore, it is a relevant operator).
Although this operator has no effect if added to the Lagrangian of a field theory (since
it amounts to adding a constant to the potential energy), its coefficient becomes a
cosmological constant if this field theory is minimally coupled to gravity¹. From the
power counting of the previous section, the natural value of the coupling constant
in front of this operator is $\Lambda^d$. Thus, if we view the Standard Model as an effective
theory, the cosmological constant should be at least as large as the fourth ($d = 4$)
power of the cutoff at which the Standard Model is replaced by some other theory.
This is in sharp contrast with observations. Indeed, if the dark energy inferred from the
measured acceleration of the expansion of the Universe is attributed to a cosmological
constant, its value is many orders of magnitude below its natural value in quantum
field theory (it corresponds to an energy density of the vacuum of the order of
$10^{-47}$ GeV$^4$).
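
The size of this mismatch can be made concrete with a short estimate. The sketch below assumes the natural value $\Lambda^d$ with a coefficient of exactly one and uses the order of magnitude $10^{-47}$ GeV$^4$ quoted above; the two cutoffs (1 TeV and a Planck-scale value of about $1.22 \times 10^{19}$ GeV) are illustrative choices, not values advocated in the text.

```python
import math

OBSERVED_VACUUM_ENERGY = 1e-47  # GeV^4, order of magnitude quoted in the text


def natural_vacuum_energy(cutoff_gev, d=4):
    """Natural size Lambda^d of the coefficient of the identity operator."""
    return cutoff_gev ** d


def orders_of_magnitude_mismatch(cutoff_gev):
    """log10 of the ratio between the natural and the observed values."""
    return math.log10(natural_vacuum_energy(cutoff_gev) / OBSERVED_VACUUM_ENERGY)


if __name__ == "__main__":
    for label, cutoff in (("1 TeV", 1e3), ("Planck scale", 1.22e19)):
        print(label, "->", round(orders_of_magnitude_mismatch(cutoff)),
              "orders of magnitude")
```
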

8.2 Example: Fermi theory of weak decays


As a first illustration of the concept of effective field theory, let us consider the case
of Fermi’s theory of weak interactions. Historically, this model was constructed
before the advent of the electroweak gauge theory, and therefore it may be viewed
as a bottom-up construction. Nowadays, since the electroweak theory provides us with a
more fundamental description of weak interactions, we may derive Fermi’s theory in
a top-down fashion, as a low energy approximation of a known high energy theory.

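¹ This example illustrates an ambiguity one faces when coupling a field theory to gravity: only energy
differences matter for the dynamics of the field theory, but the absolute value of the energy enters in the
energy-momentum tensor that acts as a source in Einstein’s equations.

8.2.1 Fermi theory as a phenomenological description

If we consider the Standard Model at a scale of the order of the nucleon mass, i.e.
around a GeV, it contains only the leptons, the light quarks, and the massless gauge
bosons (photon and gluons). Thus, this low energy truncation has no mechanism
for weak decays. Nevertheless, one may write an effective coupling involving a
proton, a neutron (here, we prefer to use hadrons, that are the states encountered in
actual experimental situations), an electron and the corresponding neutrino. The most
general local operator combining these four fields may be written as

$$ \frac{g_{12}}{\Lambda^2}\, \big( \overline{\psi}_p\, \Gamma_1\, \psi_n \big) \big( \overline{\psi}_e\, \Gamma_2\, \psi_\nu \big) , \tag{8.9} $$

where $g_{12}$ is a dimensionless constant, $\Lambda$ is a dimensionful scale, and $\Gamma_{1,2}$ are matrices
chosen in the following set

$$ \Gamma_{1,2} \in \Big\{ 1,\ \gamma^5,\ \gamma^\mu,\ \gamma^\mu\gamma^5,\ \underbrace{\tfrac{i}{4}\,[\gamma^\mu,\gamma^\nu]}_{\sigma^{\mu\nu}} \Big\} . \tag{8.10} $$

Note that $\sigma^{\mu\nu}\gamma^5$ is not linearly independent from these matrices, since $\sigma^{\mu\nu}\gamma^5 \propto
\epsilon^{\mu\nu\rho\sigma}\sigma_{\rho\sigma}$, and therefore need not be included in this list. Thus, the most general
Lorentz invariant Lagrangian involving these four fields reads

$$
\begin{aligned}
\mathcal{L}_{\rm eff} ={}&
 \underbrace{\big( \overline{\psi}_p \gamma^\mu \psi_n \big) \big( \overline{\psi}_e \gamma_\mu (C_V + C_V' \gamma^5) \psi_\nu \big)
 + \big( \overline{\psi}_p \gamma^\mu\gamma^5 \psi_n \big) \big( \overline{\psi}_e \gamma_\mu\gamma^5 (C_A + C_A' \gamma^5) \psi_\nu \big)}_{\text{vector, axial}} \\
&+ \underbrace{\big( \overline{\psi}_p \psi_n \big) \big( \overline{\psi}_e (C_S + C_S' \gamma^5) \psi_\nu \big)
 + \big( \overline{\psi}_p \gamma^5 \psi_n \big) \big( \overline{\psi}_e \gamma^5 (C_P + C_P' \gamma^5) \psi_\nu \big)}_{\text{scalar, pseudo-scalar}} \\
&+ \underbrace{\big( \overline{\psi}_p \sigma^{\mu\nu} \psi_n \big) \big( \overline{\psi}_e \sigma_{\mu\nu} (C_T + C_T' \gamma^5) \psi_\nu \big)}_{\text{tensor}} .
\end{aligned}
\tag{8.11}
$$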

Note that the presence of certain terms violates some discrete symmetries. For instance,
the primed terms $C'_{V,A,S,P,T}$ all violate parity, and $T$-invariance requires that the ratio
$C_i/C_i'$ be real for all $i \in \{V, A, S, P, T\}$. On the other hand, by confronting this
effective Lagrangian with the existing data on weak decays, we learn that

$$
\begin{aligned}
& C_V = \Lambda^{-2} \quad\text{with}\quad \Lambda \sim 350~\text{GeV} , \\
& C_A \approx 1.25 \times C_V , \\
& C_V \sim C_V' , \quad C_A \sim C_A' , \\
& \frac{C_{S,P,T}}{C_V} ,\ \frac{C'_{S,P,T}}{C_V} \lesssim 1\% .
\end{aligned}
\tag{8.12}
$$

The first of these results is an indication of the energy scale at which the Fermi theory
breaks down and should be replaced by a more accurate microscopic description of
weak decays, and the second one implies that this underlying theory is chiral. The
fact that $C_{V,A} \sim C'_{V,A}$ is a sign of parity violation in weak interactions. Finally, the
last property tells us that this microscopic interaction is not mediated by a scalar
or a tensor with a mass less than ∼ 2 TeV. All this information may be used in
constraining the possible form of the theory that describes weak interactions at higher
energies.

8.2.2 Fermi theory from the electroweak model

Let us now consider the opposite exercise: namely, start from the Lagrangian of the
Standard Model and obtain the low energy effective theory of weak interactions by
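a matching procedure. We know that the $W^\pm$ bosons responsible for weak decays
couple to left-handed fermions arranged in SU(2) doublets:

$$ \begin{pmatrix} \nu_e \\ e \end{pmatrix}_L , \qquad \begin{pmatrix} u \\ d \end{pmatrix}_L , \tag{8.13} $$

where we have written only the relevant doublets for the decay $n \to p\,e\,\overline{\nu}_e$. In
addition, we have to keep in mind that the mass eigenstates are misaligned with
the weak interaction eigenstates in the quark sector. Thus, the vertex $Wud$ contains
a factor $V_{ud}$ from the CKM matrix. With these ingredients, the tree level decay
amplitude $d \to u\,e\,\overline{\nu}_e$ reads:

$$ A = V_{ud}\, \frac{g^2}{8}\, \frac{i}{k^2 - M_W^2}\, \big( \overline{u}\, \gamma^\mu (1 - \gamma^5)\, d \big) \big( \overline{e}\, \gamma_\mu (1 - \gamma^5)\, \nu_e \big) , \tag{8.14} $$

where $k^\mu$ is the 4-momentum carried by the intermediate $W$ boson. In the low
momentum limit, $k^2 \ll M_W^2$, this amplitude becomes independent of the momentum
transfer and could have been generated by the following contact interaction

$$ \mathcal{L}_{\rm eff} = \frac{G_F}{\sqrt{2}}\, V_{ud}\, \big( \overline{\psi}_u \gamma^\mu (1 - \gamma^5) \psi_d \big) \big( \overline{\psi}_e \gamma_\mu (1 - \gamma^5) \psi_\nu \big)
 \quad\text{with}\quad \frac{G_F}{\sqrt{2}} \equiv \frac{g^2}{8\, M_W^2} . \tag{8.15} $$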

In order to obtain from this the physical decay amplitude $n \to p\,e\,\overline{\nu}_e$, we need the
matrix element

$$ \big\langle p \big|\, \overline{\psi}_u \gamma^\mu (1 - \gamma^5) \psi_d \,\big| n \big\rangle \tag{8.16} $$

with initial and final nucleons instead of quarks. In the low momentum limit, it may
be related to a similar matrix element with the spinors of the proton and neutron by

$$ \big\langle p \big|\, \overline{\psi}_u \gamma^\mu (1 - \gamma^5) \psi_d \,\big| n \big\rangle
 = \big\langle p \big|\, \overline{\psi}_p \gamma^\mu (g_V - g_A \gamma^5) \psi_n \,\big| n \big\rangle + O(k^\mu) , \tag{8.17} $$

where $g_{V,A}$ are two constants that may be viewed as the zero momentum limit of
some form factors. Then, by comparing the decay amplitudes obtained from the low
energy effective theory guessed on the basis of phenomenological considerations, and
the one obtained by starting from the electroweak theory, we obtain

$$
\begin{aligned}
& C_V = -C_V' = g_V\, \frac{g^2}{8\, M_W^2}\, V_{ud} = \frac{1}{\Lambda^2} , \\
& C_A = -C_A' = -g_A\, \frac{g^2}{8\, M_W^2}\, V_{ud} , \\
& C_{S,P,T} = C'_{S,P,T} = 0 .
\end{aligned}
\tag{8.18}
$$

In this top-down approach, we see that the parity violation inferred from experimental
evidence is in fact maximal in the electroweak theory, and that the scalar and tensor
contributions are exactly zero. Note also that the scale Λ that we introduced by hand
in the low energy effective theory does not coincide exactly with the mass of the
heavy particle which is integrated out (in the present case, the W boson), but has
the same order of magnitude. Finally, even though we performed here the matching
at tree level, it is in principle possible to correct the coefficients of the low energy
effective theories by electroweak and QCD loop corrections.
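
The matching relations of eqs. (8.15) and (8.18) can be checked against numbers. The sketch below assumes approximate input values ($G_F \approx 1.166 \times 10^{-5}$ GeV$^{-2}$, $M_W \approx 80.4$ GeV, $V_{ud} \approx 0.974$) and sets $g_V = 1$; these inputs and the function names are assumptions introduced for illustration.

```python
import math

G_F = 1.1663787e-5   # GeV^-2, Fermi constant (approximate input)
M_W = 80.38          # GeV, W boson mass (approximate input)
V_UD = 0.974         # CKM matrix element (approximate input)


def weak_coupling_squared(g_f=G_F, m_w=M_W):
    """g^2 from the matching condition G_F / sqrt(2) = g^2 / (8 M_W^2), eq. (8.15)."""
    return 8.0 * m_w**2 * g_f / math.sqrt(2.0)


def fermi_scale(g_v=1.0, v_ud=V_UD, g_f=G_F):
    """Scale Lambda defined by C_V = g_V (G_F / sqrt(2)) V_ud = 1 / Lambda^2, eq. (8.18)."""
    c_v = g_v * v_ud * g_f / math.sqrt(2.0)
    return 1.0 / math.sqrt(c_v)


def contact_approximation_error(k2, m_w=M_W):
    """Relative error made when 1 / (k^2 - M_W^2) is replaced by -1 / M_W^2."""
    exact = 1.0 / (k2 - m_w**2)
    contact = -1.0 / m_w**2
    return abs(contact - exact) / abs(exact)


if __name__ == "__main__":
    print("g      =", math.sqrt(weak_coupling_squared()))
    print("Lambda =", fermi_scale(), "GeV")
    print("error at k^2 = 1 GeV^2:", contact_approximation_error(1.0))
```

With these inputs one finds $g \approx 0.65$ and a scale $\Lambda$ of about 350 GeV, a few times larger than $M_W$, while the error of the contact approximation is $k^2/M_W^2$, i.e. of order $10^{-4}$ for $k^2 \sim 1$ GeV$^2$.
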

8.3 Standard model as an effective field theory


8.3.1 Standard Model

The Standard Model unifies the strong and electroweak interactions into a unique
renormalizable field theory. Although it agrees with most observed phenomena²,
it is unreasonable to expect that the Standard Model remains an accurate description
of particle physics to arbitrarily high energy scales. A more modest point of view
is to consider the Standard Model as a low energy approximation of some more
fundamental theory that we do not yet know. In this perspective, it would just be the
zeroth order of some expansion,

$$ \mathcal{L} = \underbrace{\mathcal{L}_{SM}}_{\Lambda^0} + \underbrace{\mathcal{L}^{(1)}}_{\Lambda^{-1}} + \underbrace{\mathcal{L}^{(2)}}_{\Lambda^{-2}} + \cdots , \tag{8.19} $$

and a natural endeavor is to construct the terms $\mathcal{L}^{(1,2,\cdots)}$, made of operators with
mass dimension greater than four. By power counting, these operators must be
suppressed by coupling constants that are inversely proportional to powers of some
high energy scale $\Lambda$ at which corrections to the Standard Model become important. In
the construction of these corrections, one usually abides by the following constraints:

• Lorentz invariance is preserved to all orders in $\Lambda^{-1}$,

• The SU(3) × SU(2) × U(1) gauge symmetry of the Standard Model remains
a symmetry of the higher order corrections (the idea being that whatever is
the more fundamental theory that underlies the Standard Model, it is more
symmetric, not less),

• The corrections are built with the degrees of freedom of the Standard Model,

• The vacuum expectation value of the Higgs is not modified by the corrections.

² One exception is the fact that neutrinos have masses, which does not have a very compelling explanation
in the Standard Model (we shall return to this issue in the next subsection).

Figure 8.1: Left: a higher dimensional operator provides a correction (light) to an observable
which is non-zero in the Standard Model (dark). Right: the higher dimensional operator allows a
process that was impossible in the Standard Model. In the latter case, experiments usually provide an
upper value for the yield of these very rare processes, that decreases as the sensitivity improves, thereby
pushing higher up the energy scale of this new physics.

As we have mentioned earlier, since the Standard Model is renormalizable, there is no way to determine the scale Λ within the Standard Model itself. Instead, one should enumerate the higher dimensional operators up to a certain mass dimension (which corresponds to a certain order in $\Lambda^{-1}$) and investigate their possible observable consequences. Experiments can then search for these effects, and either provide the values of some of the parameters introduced in $\mathcal{L}^{(1,2,\cdots)}$, or give lower bounds on the scale of new physics in case of a null observation. Note that there are two main classes of higher dimensional operators, illustrated in figure 8.1:

• Operators that lead to corrections to processes already allowed in the Standard Model. These corrections may become visible in more precise experiments.

• Operators that allow processes that were forbidden in the Standard Model.
In this case, what is needed are more sensitive experiments, able to detect
extremely rare events.

8.3.2 Dimension 5 operators and neutrino masses

The right handed neutrinos are singlets under SU(3) and SU(2) and have a null electrical charge, which means that they do not feel any of the interactions of the Standard Model. As a consequence, all the neutrinos detected in experiments (via their weak interactions with the matter of the detector) are left handed neutrinos, implying that there is no direct evidence for the existence of right handed neutrinos. For this reason, right handed neutrinos are usually not considered as part of the Standard Model.
The observation of neutrino oscillations, i.e. the fact that the flavour of a neutrino
can change as it propagates, implies that there are non-zero mass differences between
neutrinos3 . Therefore, at most one of the neutrinos can be massless, and at least two
of them must be massive.

Neutrino masses from the Higgs mechanism : Since the electroweak theory is chiral (right handed leptons are SU(2) singlets, while the left handed ones belong to SU(2) doublets), a naive Dirac mass term of the form $m_D\,\overline{\psi}_L \psi_R$ is not invariant under SU(2). However, we may construct such a Dirac mass in the same way as for the other leptons, by starting from a Yukawa coupling involving the Higgs boson:

\[ \lambda\, \overline{\psi}_{L,i\alpha}\, \epsilon_{ij}\, \Phi^*_j\, \psi_{R,\alpha} , \tag{8.20} \]

where i, j are indices in the fundamental representation of SU(2) and α is a Dirac index. The matrix $\epsilon \equiv i t^2$ is built from the second generator $t^2$ of the fundamental representation
of SU(2). Thanks to the contraction of the left handed spinor doublet with the Higgs
field, we now have an SU(2) invariant combination. Then, spontaneous symmetry
breaking gives a non-zero expectation value v to the Higgs field, and this interaction
term becomes a Dirac mass term for the neutrino, with a mass mD = λ v. Generating
the neutrino mass by this mechanism would place the neutrinos almost on the same
footing as the other leptons, provided we add right handed neutrinos to the degrees
of freedom of the Standard Model4 . The only distinctive feature of the right handed
3 Consider for instance a β decay: it produces an electron anti-neutrino (i.e. a weak interaction eigenstate) of definite momentum. If mass eigenstates are misaligned with the weak interaction eigenstates,
then this neutrino may project on several mass eigenstates. Since the time evolution of the phase of a
wavefunction depends on the mass of the particle, these mass eigenstates evolve slightly differently in time
(unless all the neutrino masses are identical). At the detection time, this leads to a flavour decomposition
which is different from the one at the time of production. Thus, the original electron anti-neutrino will be a
mixture of electron, muon and tau anti-neutrinos. Conversely, the observation of this change of flavour
implies mass differences in the neutrino sector.
4 Whether this type of term is “beyond the Standard Model” is to a large extent a matter of definition. Before the observation of neutrino oscillations, the Standard Model was most often defined without right
handed neutrinos, and therefore massless neutrinos. But it would have been equally acceptable to include
right handed neutrinos from the start, with Yukawa couplings so small that their masses were too small to
detect.

neutrinos would be that they do not feel any of the gauge interactions of the Standard
Model. For this reason, they are sometimes called sterile neutrinos. The main
drawback of this solution is that it requires an even larger range of values of the
Yukawa couplings, with no natural explanation.

Majorana neutrino masses : An alternative would be to have a Majorana mass for the left handed neutrinos of the Standard Model. Instead of introducing this mass term by hand, it can be generated via spontaneous symmetry breaking from a Weinberg operator:
\[ \frac{c}{\Lambda}\,\big(\psi^t_{L,i\alpha}\,\epsilon_{ij}\,\Phi_j\big)\, C_{\alpha\beta}\, \big(\Phi^t_k\,\epsilon_{kl}\,\psi_{L,l\beta}\big) , \tag{8.21} \]
where $C \equiv \gamma^0 \gamma^2$ is the charge conjugation operator. Firstly, note that this operator has mass dimension 5, hence the coupling constant proportional to $\Lambda^{-1}$. In fact, this operator is the only lepton number violating dimension-5 operator that obeys the constraints listed in the previous section5 . After spontaneous symmetry breaking, the
Higgs field acquires a vacuum expectation value, leading to a Majorana mass term for
the left handed neutrinos,

\[ \frac{c\,v^2}{\Lambda}\; \nu_L^t\, C\, \nu_L , \tag{8.22} \]
which corresponds to a Majorana mass $m_M = c\,v^2 \Lambda^{-1}$. The appeal of this mechanism is that a small mass of the neutrinos is naturally explained by a high scale Λ for the new physics. For instance, a neutrino mass of the order of 1 eV or below corresponds to $\Lambda \gtrsim 10^{13}$ GeV. As we have already mentioned, the operator in eq. (8.21)
does not conserve lepton number, since it is not invariant under the following global
transformation

\[ \psi \to e^{i\alpha}\,\psi , \qquad \psi^\dagger \to e^{-i\alpha}\,\psi^\dagger , \qquad \psi^t \to e^{i\alpha}\,\psi^t . \tag{8.23} \]

For this reason, this alternative mechanism is clearly beyond the Standard Model.
However, as long as gauge symmetries are preserved, the violation of lepton number is
not considered particularly dramatic. In a sense, one may view the lepton number conservation
that exists in the Standard Model as accidental, being a consequence of the fact that
only dimension-four operators are included.
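
The order of magnitude quoted after eq. (8.22) can be checked with a few lines of arithmetic. The sketch below simply inverts $m_M = c\,v^2/\Lambda$; the inputs v = 246 GeV and c = 1 are assumptions made for illustration (the text does not fix the normalization of v nor the value of c), and the function name is hypothetical.

```python
# Rough estimate of the new-physics scale from m_M = c v^2 / Lambda (eq. 8.22).
# Assumed inputs (not taken from the text): v = 246 GeV and c = 1.

GEV_PER_EV = 1.0e-9  # 1 eV expressed in GeV


def seesaw_scale_gev(m_nu_ev, v_gev=246.0, c=1.0):
    """Return Lambda (in GeV) such that c * v^2 / Lambda equals m_nu."""
    m_nu_gev = m_nu_ev * GEV_PER_EV
    return c * v_gev**2 / m_nu_gev


if __name__ == "__main__":
    for m_nu_ev in (1.0, 0.1, 0.01):
        scale = seesaw_scale_gev(m_nu_ev)
        print(f"m_nu = {m_nu_ev:5.2f} eV  ->  Lambda ~ {scale:.1e} GeV")
```

With these assumed inputs, a mass of 1 eV gives a scale of a few times $10^{13}$ GeV, and lighter neutrinos push Λ proportionally higher.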

Weinberg operator from the low energy limit of another QFT : In the spirit
of the bottom-up construction of an effective theory, the operator of eq. (8.21) can
5 $\psi^t_{L,i\alpha}\,\epsilon_{ij}\,\Phi_j$ and $\Phi^t_k\,\epsilon_{kl}\,\psi_{L,l\beta}$ are both SU(2) invariant (but not Lorentz invariant), and the combination $\psi^t_{L,i\alpha}\, C_{\alpha\beta}\, \psi_{L,l\beta}$ is Lorentz invariant. This combination is SU(3) invariant only for the leptons (not for the quarks).

be obtained by exploring all the possibilities for dimension 5 operators built with
the degrees of freedom of the Standard Model and some symmetry requirements.
However, this operator can also be obtained in the low energy limit of a renormalizable
quantum field theory. Consider an extension of the field content of the Standard Model,
where we add a right handed neutrino νR with a very large Majorana mass MR (much
heavier than the electroweak scale), that also couples to the SU(2) doublet containing
the left handed neutrino and to the Higgs field via a Yukawa coupling,

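\[ \mathcal{L} = \mathcal{L}_{\rm SM} + \mathcal{L}_{\nu_R} , \]
\[ \mathcal{L}_{\nu_R} \equiv i\,\overline{\nu}_R\, \slashed{\partial}\, \nu_R - y\,\big(\overline{\psi}_L\, \epsilon\, \Phi^*\big)\, \nu_R - y^*\, \overline{\nu}_R\, \big(\Phi^t \epsilon^\dagger \psi_L\big) + \frac{1}{2}\Big(M_R\, \nu_R^t\, C\, \nu_R + M_R^*\, \nu_R^{t*}\, C\, \nu_R^*\Big) . \tag{8.24} \]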

Figure 8.2: Diagrammatic illustration of the see-saw mechanism. Left: $\Phi\Phi\psi_L^t\psi_L$ 4-point function made with two Yukawa vertices and one insertion of the $\nu_R$ propagator. Middle: $\Phi\Phi\psi_L^t\psi_L$ local vertex obtained after integrating out the right handed neutrino (valid for $p \ll M_R$). Right: Majorana mass term for $\nu_L$ obtained after spontaneous symmetry breaking ($\Phi = v$).

With two instances of the Yukawa coupling and a propagator of the heavy Majorana neutrino, it is possible to build a (non-local) four-point function involving two Higgs fields and two left handed leptons (see figure 8.2). At energies much lower than the mass $M_R$ of the right handed neutrino, the intermediate propagator may be approximated by a constant6
\[ \frac{i\,(\slashed{p} + M_R)\,C}{p^2 - M_R^2} \;\underset{p \ll M_R}{\longrightarrow}\; -i\,\frac{C}{M_R} , \tag{8.25} \]
which leads to the (local) Weinberg operator. The latter gives a Majorana mass for the
left handed neutrino after the Higgs field has acquired a non-zero vacuum expectation
value through spontaneous symmetry breaking. This mechanism is known as the
see-saw mechanism7 .
6 The propagator of a Majorana fermion is that of a Dirac fermion multiplied by the charge conjugation matrix C.
7 More precisely, it corresponds to the Type-I see-saw mechanism. Type-II and Type-III see-saw mechanisms also exist; they differ in the nature of the heavy particle that connects the $\Phi\Phi\psi_L^t\psi_L$ fields in the original four point function.

8.3.3 Higher dimensional operators


The number of operators of mass dimension 6 is much larger, and they lead to a broader array of possible phenomena. Even if we restrict to operators that conserve lepton and baryon number, the original classification lists 80 operators (a number that drops to 59 for one generation of fermions once all redundancies are removed). Instead of listing them, let us discuss an important result used to reduce the list of possible operators down to a smaller set of independent operators, thanks to the field equations of motion that result from the zeroth order Lagrangian (i.e. the Standard Model Lagrangian).

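Operator removal from the Lagrangian : As an illustration of the principles at work in this reduction, consider the Lagrangian of a real scalar field with a quartic interaction, extended by two operators of dimension 6,
\[ \mathcal{L} = \underbrace{\frac{1}{2}\big(\partial_\mu\phi\big)\big(\partial^\mu\phi\big) - \frac{1}{2} m^2\phi^2 - \frac{\lambda}{4!}\phi^4}_{\mathcal{L}^{(0)}} + \frac{1}{\Lambda^2}\Big(\lambda_1\,\phi^6 + \lambda_2\,\phi^3\,\square\phi\Big) . \tag{8.26} \]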

The equation of motion that follows from the zeroth order Lagrangian is
\[ \big(\square + m^2\big)\phi + \frac{\lambda}{6}\,\phi^3 = 0 . \tag{8.27} \]
Naively, it is tempting to replace the last term of the effective Lagrangian, $\phi^3\,\square\phi$, by a sum of terms in $\phi^4$ and $\phi^6$. However, it is not totally clear that this is legitimate when this interaction term is inserted in a more complicated graph. A more robust justification goes as follows. Consider a new scalar field ψ related to φ by

\[ \phi = \psi + \lambda_2\,\Lambda^{-2}\,\psi^3 . \tag{8.28} \]

Note that both terms in the right hand side have mass dimension 1 and transform
as Lorentz scalars. Rewriting the terms of the above Lagrangian in terms of ψ, we
obtain
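\[ \frac{1}{2}\big(\partial_\mu\phi\big)\big(\partial^\mu\phi\big) = \frac{1}{2}\big(\partial_\mu\psi\big)\big(\partial^\mu\psi\big) - \lambda_2\,\Lambda^{-2}\,\psi^3\,\square\psi + O(\Lambda^{-4}) , \]
\[ m^2\phi^2 = m^2\psi^2 + 2\,m^2\lambda_2\,\Lambda^{-2}\,\psi^4 + O(\Lambda^{-4}) , \]
\[ \frac{\lambda}{4!}\,\phi^4 = \frac{\lambda}{4!}\,\psi^4 + \frac{4\lambda}{4!}\,\lambda_2\,\Lambda^{-2}\,\psi^6 + O(\Lambda^{-4}) , \tag{8.29} \]
(the first equality holds up to a total derivative),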
and finally
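\[ \mathcal{L} = \frac{1}{2}\big(\partial_\mu\psi\big)\big(\partial^\mu\psi\big) - \frac{1}{2} m^2\psi^2 - \frac{\lambda'}{4!}\,\psi^4 + \frac{1}{\Lambda^2}\,\lambda_1'\,\psi^6 + O(\Lambda^{-4}) , \tag{8.30} \]
where $\lambda'$, $\lambda_1'$ are new coupling constants for the quartic and sextic terms. In the spirit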
of an effective field theory, we do not care about the terms of order Λ−4 since they

come with operators of dimension 8, that we are not considering here. Thus, by the
change of variable of eq. (8.28), we can eliminate the term that seemed redundant in
the Lagrangian. More generally, any term of the form

\[ \Lambda^{-2}\, f(\phi)\, \underbrace{\Big(\square\phi + m^2\phi + \frac{\lambda}{6}\,\phi^3\Big)}_{\text{l.h.s. of the EOM}} , \tag{8.31} \]

where f(φ) is any local function of the fields of mass dimension 3 (e.g., $\phi^3$, $m^2\phi$, $\square\phi$), can be removed from the effective Lagrangian by the following field redefinition

φ = ψ + Λ−2 f(φ) . (8.32)
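
The bookkeeping behind eqs. (8.28)–(8.30) can be checked numerically for the non-derivative terms. The sketch below is only an illustration under stated assumptions: it ignores the kinetic and $\phi^3\square\phi$ terms (which cancel against each other at order $\Lambda^{-2}$), and it uses the shifted couplings $\lambda' = \lambda + 24\,m^2\lambda_2\Lambda^{-2}$ and $\lambda_1' = \lambda_1 - \lambda\lambda_2/6$, which are worked out here by direct substitution rather than quoted from the text. The numerical inputs are arbitrary. If the substitution is correct, the mismatch between the two expressions must scale as $\Lambda^{-4}$.

```python
from fractions import Fraction as F


def potential_phi(phi, m2, lam, lam1, Lam):
    """Non-derivative terms of eq. (8.26), evaluated at the field value phi."""
    return -F(1, 2) * m2 * phi**2 - lam / 24 * phi**4 + lam1 * phi**6 / Lam**2


def potential_psi(psi, m2, lam, lam1, lam2, Lam):
    """Same terms in the psi variable, truncated at order Lambda^-2 (eq. 8.30)."""
    lam_prime = lam + 24 * m2 * lam2 / Lam**2  # shifted quartic coupling (derived here)
    lam1_prime = lam1 - lam * lam2 / 6         # shifted sextic coupling (derived here)
    return (-F(1, 2) * m2 * psi**2 - lam_prime / 24 * psi**4
            + lam1_prime * psi**6 / Lam**2)


def residual(Lam, psi=F(13, 10), m2=F(1), lam=F(1, 2), lam1=F(3, 10), lam2=F(7, 10)):
    """Exact mismatch between the two forms; it should be of order Lambda^-4."""
    phi = psi + lam2 * psi**3 / Lam**2  # field redefinition of eq. (8.28)
    return (potential_phi(phi, m2, lam, lam1, Lam)
            - potential_psi(psi, m2, lam, lam1, lam2, Lam))


if __name__ == "__main__":
    ratio = residual(50) / residual(100)
    # Doubling Lambda should reduce the residual by about 2**4 = 16.
    print(float(ratio))
```

Exact rational arithmetic is used so that the tiny residual is not polluted by rounding errors.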

Functional Jacobian : Having removed the operator $\phi^3\,\square\phi$ from the Lagrangian is not enough, because the change of variable (8.28) also has implications elsewhere. Firstly, in the path integral representation of the generating functional, this change of variable introduces a Jacobian since the functional integration measure is modified as follows
\[ \big[D\phi(x)\big] = \big[D\psi(x)\big]\; \det\left(\frac{\delta\phi(x)}{\delta\psi(y)}\right) . \tag{8.33} \]

For a transformation of the type (8.28), the determinant depends on the field since we
have
\[ \frac{\delta\phi(x)}{\delta\psi(y)} = \Lambda^{-2}\,\delta(x-y)\,\big(\Lambda^2 + 3\,\lambda_2\,\psi^2(x)\big) , \tag{8.34} \]

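and therefore this determinant should not be disregarded. Like in the Faddeev-Popov quantization procedure, we may express it as a path integral over fictitious Grassmann fields $\chi$, $\overline{\chi}$, by writing:

\[ \det\left(\frac{\delta\phi(x)}{\delta\psi(y)}\right) = \int \big[D\overline{\chi}(x)\, D\chi(x)\big]\; \exp\left\{ i\int d^4x\; \overline{\chi}(x)\,\big(\Lambda^2 + 3\,\lambda_2\,\psi^2(x)\big)\,\chi(x) \right\} . \tag{8.35} \]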

In the case of our simple example, the kinetic term of this ghost field is a bit peculiar
since it does not contain any derivatives. However, it exhibits a feature which is
completely generic, namely the fact that its mass is of order Λ. Since the ghosts can
only appear in closed loops, their contribution is suppressed by inverse powers of
Λ. In other words, the determinant depends on the field ψ, but this dependence is of
higher order in Λ−2 and will not affect our effective theory.

Modifications at the external points : Secondly, the change of variables (8.28) modifies the coupling to the fictitious source in the generating functional:
\[ \int d^4x\; J(x)\,\phi(x) = \int d^4x\; J(x)\,\Big(\psi(x) + \lambda_2\,\Lambda^{-2}\,\psi^3(x)\Big) . \tag{8.36} \]

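Thus, every functional derivative with respect to J brings a factor $\psi + \lambda_2\Lambda^{-2}\psi^3$ in the time-ordered product of interest:

\[ \big\langle 0_{\rm out}\big|\, T\,\phi(x_1)\cdots \big| 0_{\rm in}\big\rangle = \big\langle 0_{\rm out}\big|\, T\,\Big(\psi(x_1) + \lambda_2\,\Lambda^{-2}\,\psi^3(x_1)\Big)\cdots \big| 0_{\rm in}\big\rangle . \tag{8.37} \]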

At this point, we have shown that the only possible effect of the term we have removed
from the effective Lagrangian is to modify the operators inside a time-ordered product
of fields (in the form of extra terms that will appear on the external legs of the
corresponding Feynman graphs). However, the physical quantities are not the above
correlation functions themselves, but the on-shell transition amplitudes obtained with
the LSZ reduction formulas, i.e. the residue of the 1-particle poles in the Fourier
transform of eq. (8.37). For instance, in a 4-point function contributing to a 2 → 2
scattering amplitude, one would have a graph such as the following

[Diagram: a 4-point graph with three external points labelled ψ and one external point labelled $\psi^3$.]

where one of the operators in the T-product is a ψ3 (in this diagrammatic representa-
tion, we have not yet amputated the external propagators.) We can readily see that one
point of this function is not terminated by a propagator, and therefore does not exhibit
a 1-particle pole. Thus, such a graph does not contribute to the on-shell transition
amplitude when inserted in the LSZ reduction formula. Although we have used a very
simple example to illustrate the chain of arguments leading to this result, it is in fact
completely general: if a term of the effective Lagrangian can be rewritten as a linear
combination of other operators thanks to the leading order equation of motion, then
this term can be ignored in the effective theory without changing the S-matrix.

8.4 Effective theories in QCD


Quantum Chromodynamics is also an area where effective field theories are quite useful. Indeed, since QCD contains many dimensionful scales (the scale $\Lambda_{\rm QCD}$ at which the coupling constant becomes large, and the masses of the six flavours of quarks, which

span a wide range of momentum scales), we may expect that some simplifications are
possible if one is interested in processes in which some of these scales are irrelevant.
Several effective theories have been developed in order to simplify the treatment of
strong interactions in some special kinematical situations, and we shall discuss two of
them in this section.

8.4.1 Heavy quark effective theory

Main ideas : There are six flavours of quarks in Nature, u, d, s, c, b, and t. The
u, d and s quarks are light in comparison to other QCD scales (in particular the
confinement scale ΛQCD ), while the c, b and t are considered heavy. Besides the well
known nucleons (proton, neutron) and light mesons (pions, rho), that are made of u
and d valence quarks, some hadrons contain heavy quarks (c and b only, since the t
quark decays before a bound state can form). An obvious source of simplification
in the presence of heavy quarks is asymptotic freedom, thanks to which the strong
coupling constant at the scale mQ is not very large and thus the strong interactions
are more like electromagnetic interactions. In particular, hadrons made of a pair of heavy quark and antiquark $Q\overline{Q}$ have a size of order $(\alpha_s m_Q)^{-1}$. When this size is much smaller than $\Lambda_{\rm QCD}^{-1}$, these bound states are quite similar to a hydrogen atom. However, hadrons mixing heavy and light quarks are not as simple, because their size is of order $\Lambda_{\rm QCD}^{-1}$ and the typical momentum transfer between the light and heavy quarks is of order $\Lambda_{\rm QCD}$. Thus, in these heavy-light hadrons, one may view the heavy quark as surrounded by a non-perturbative cloud of light quarks and gluons. Such systems are characterized by two different scales:

• The heavy quark mass $m_Q$ and the corresponding Compton wavelength $\lambda_Q = m_Q^{-1}$,

• The confinement scale $\Lambda_{\rm QCD}$, that controls the typical size of bound states, $R_h \sim \Lambda_{\rm QCD}^{-1}$.
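
To get a feeling for the hierarchy between these two scales, one can convert them into lengths. The numbers used below are illustrative assumptions that do not come from the text (quark masses of roughly 1.3 GeV and 4.2 GeV for c and b, and $\Lambda_{\rm QCD} \approx 0.2$ GeV); only the conversion factor $\hbar c \approx 0.1973$ GeV fm is standard. The function names are hypothetical.

```python
HBAR_C_GEV_FM = 0.1973  # hbar * c in GeV * fm, used to convert 1/energy into a length

# Illustrative inputs in GeV (assumptions, not values quoted in the text).
LAMBDA_QCD_GEV = 0.2
HEAVY_QUARK_MASSES_GEV = {"c": 1.3, "b": 4.2}


def length_fm(energy_gev):
    """Length scale (in fm) associated with an energy scale (in GeV)."""
    return HBAR_C_GEV_FM / energy_gev


def scale_hierarchy(m_q_gev, lambda_qcd_gev=LAMBDA_QCD_GEV):
    """Return (Compton wavelength, hadron size, their ratio) for a heavy quark."""
    compton = length_fm(m_q_gev)
    hadron_size = length_fm(lambda_qcd_gev)
    return compton, hadron_size, compton / hadron_size


if __name__ == "__main__":
    for flavour, mass in HEAVY_QUARK_MASSES_GEV.items():
        compton, size, ratio = scale_hierarchy(mass)
        print(f"{flavour}: lambda_Q = {compton:.3f} fm, R_h = {size:.2f} fm, "
              f"lambda_Q / R_h = {ratio:.3f}")
```

The ratio $\lambda_Q/R_h = \Lambda_{\rm QCD}/m_Q$ is the small parameter that organizes the expansion discussed in this section.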

For a heavy quark, one has λQ ≪ Rh . Thus, in a certain sense, the heavy quark
may be viewed as a point-like object inside a much larger hadron. Loosely speaking,
the quantum numbers of the heavy quark (flavour, spin) are confined in a volume of
order of its Compton wavelength $\lambda_Q$, but the accompanying cloud of light quarks and gluons can only resolve distances as small as $\Lambda_{\rm QCD}^{-1}$. Therefore, the light degrees of freedom are totally insensitive to the heavy quark quantum numbers, and they only feel its colour field. Moreover, for a heavy-light hadron, the rest frame of the hadron is almost equivalent to the rest frame of the heavy quark. In this frame, the colour field of the heavy quark is the Coulomb electric field produced by a static colour charge, that does not depend on the heavy quark mass. Thus, we expect that the configuration
of the light constituents is independent of mQ when mQ → ∞. These observations

constitute what is called heavy quark symmetry, that we shall derive more formally
later in this section. Note that, unlike chiral symmetry for massless quarks, it is
not a symmetry of the QCD Lagrangian, but rather an approximate symmetry that
arises in special kinematical conditions (namely, when a heavy quark interacts only
with light degrees of freedom via soft exchanges). Heavy quark symmetry provides
relationships between bound states that differ only in the flavour and/or the spin8 of
the heavy quark, for instance the B, D, B∗ , D∗ mesons, or the Λb , Λc baryons. Heavy
quark effective theory exploits this separation of scales in a systematic way in order
to calculate the dependence of various physical quantities on the mass of the heavy quark, by an expansion in powers of $m_Q^{-1}$.

Spinor decomposition : Let us assume that there is a large gap between the con-
finement scale ΛQCD and the heavy quark mass mQ , and introduce an intermediate
scale Λ such that ΛQCD ≪ Λ ≪ mQ . Our goal is to construct an effective theory
which is equivalent to QCD at long distance, i.e. for momenta below Λ (but may
differ from QCD above Λ). Heavy quark effective theory is somewhat special in that
we do not completely integrate out the heavy quarks (since one of its applications is
to describe bound states that contain heavy quarks), but we rather integrate out only a
part of the heavy quark degrees of freedom. This is done by writing the momentum
of a heavy quark as follows:

\[ p^\mu \equiv m_Q\, v^\mu + q^\mu , \tag{8.38} \]

where $v^\mu$ is the hadron velocity (satisfying $v_\mu v^\mu = 1$) and $q^\mu$ is a residual momentum whose components are much smaller than $m_Q$. This decomposition just highlights
the fact that the heavy quark moves almost with the hadron velocity. By definition,
the term mQ vµ does not change, while qµ fluctuates due to the interactions of the
heavy quark with the light degrees of freedom. However, the typical changes of qµ
are of order ΛQCD . Thus, the physical picture that emerges from this separation is that
of a heavy quark that moves almost along a straight line, just undergoing little kicks
from the surrounding cloud of light constituents.
By combining eq. (8.38) and the Dirac equation, we can see that the dominant
spacetime dependence of spinors is a phase exp(±i mQ v · x). Moreover, the velocity
vµ can be used to construct two spin projectors,

\[ P_\pm \equiv \frac{1 \pm \slashed{v}}{2} , \tag{8.39} \]
8 This is analogous to the fact that isotopes have almost identical chemistry, since the cloud of electrons surrounding the nucleus is almost independent of its mass (in a first approximation, it depends only on its electrical charge). Likewise, the independence with respect to the spin of the heavy quark is analogous to the near degeneracy of the hyperfine levels in atomic physics.

thanks to which we may decompose the spinor ψ of a heavy quark into

\[ q_v(x) \equiv e^{i m_Q v\cdot x}\, P_+\, \psi(x) , \qquad Q_v(x) \equiv e^{i m_Q v\cdot x}\, P_-\, \psi(x) , \tag{8.40} \]

or conversely
\[ \psi(x) = e^{-i m_Q v\cdot x}\,\big[ q_v(x) + Q_v(x) \big] . \tag{8.41} \]
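
The decomposition of eqs. (8.40)–(8.41) relies on $P_\pm$ being complementary projectors, which follows from $\slashed{v}^2 = v_\mu v^\mu = 1$. The following sketch checks this numerically for a boosted velocity. It assumes the Dirac representation of the gamma matrices and the metric signature (+, −, −, −); neither choice is specified in this section, and all helper names are hypothetical.

```python
import math

# 2x2 building blocks.
ZERO2 = [[0, 0], [0, 0]]
ID2 = [[1, 0], [0, 1]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def neg(m):
    return [[-x for x in row] for row in m]


def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


# Dirac representation (assumed convention), metric signature (+, -, -, -).
GAMMA0 = block(ID2, ZERO2, ZERO2, neg(ID2))
GAMMA_SPATIAL = [block(ZERO2, s, neg(s), ZERO2) for s in SIGMA]
ID4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ZERO4 = [[0] * 4 for _ in range(4)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def combine(a, b, ca, cb):
    """Return ca * a + cb * b for two 4x4 matrices."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(4)] for i in range(4)]


def is_close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))


def slash(v):
    """gamma^mu v_mu for contravariant components v = (v0, v1, v2, v3)."""
    result = [[v[0] * x for x in row] for row in GAMMA0]
    for gamma_i, v_i in zip(GAMMA_SPATIAL, v[1:]):
        result = combine(result, gamma_i, 1, -v_i)
    return result


def projectors(v):
    """P_+ and P_- of eq. (8.39) for a four-velocity v."""
    v_slash = slash(v)
    return combine(ID4, v_slash, 0.5, 0.5), combine(ID4, v_slash, 0.5, -0.5)


def four_velocity(beta):
    """Four-velocity (gamma, gamma * beta) for a 3-velocity with |beta| < 1."""
    gamma = 1.0 / math.sqrt(1.0 - sum(b * b for b in beta))
    return [gamma] + [gamma * b for b in beta]


if __name__ == "__main__":
    p_plus, p_minus = projectors(four_velocity([0.3, -0.2, 0.5]))
    print("P+ P+ = P+ :", is_close(matmul(p_plus, p_plus), p_plus))
    print("P+ P- = 0  :", is_close(matmul(p_plus, p_minus), ZERO4))
    print("P+ + P- = 1:", is_close(combine(p_plus, p_minus, 1, 1), ID4))
```

In the rest frame, $v = (1, 0, 0, 0)$, the projector $P_+$ reduces to $(1+\gamma^0)/2$, which in this representation keeps the upper two components of the spinor.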

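By introducing this decomposition in the Dirac Lagrangian, we obtain

\[ \mathcal{L} = \overline{\psi}\,\big(i\slashed{D} - m_Q\big)\,\psi \]
\[ \phantom{\mathcal{L}} = \big(\overline{q}_v + \overline{Q}_v\big)\,\big(i\slashed{D} - m_Q + m_Q\,\slashed{v}\big)\,\big(q_v + Q_v\big) \]
\[ \phantom{\mathcal{L}} = \overline{q}_v\,\big(i\,v\cdot D\big)\,q_v - \overline{Q}_v\,\big(i\,v\cdot D + 2m_Q\big)\,Q_v + \overline{q}_v\, i\slashed{D}_\perp\, Q_v + \overline{Q}_v\, i\slashed{D}_\perp\, q_v , \tag{8.42} \]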

where we have decomposed the covariant derivative as $D^\mu \equiv v^\mu\,(v\cdot D) + D^\mu_\perp$. From this form of the Lagrangian, we see that $q_v$ is a massless spinor while $Q_v$ has a mass $2m_Q$. Thus, the heavy quark effective theory will be obtained by integrating out $Q_v$.

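Effective Lagrangian : Let us consider the generating functional of the correlation functions of the “light” field $q_v$, defined as
\[ Z[\overline{\eta}, \eta] \equiv \int \big[Dq_v\, D\overline{q}_v\, DQ_v\, D\overline{Q}_v\big]\; e^{\,i\int d^4x\, \left(\mathcal{L} + \overline{\eta}\, q_v + \overline{q}_v\, \eta\right)} . \tag{8.43} \]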

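The path integration over the heavy field $Q_v$ is Gaussian and can be performed analytically, giving
\[ Z[\overline{\eta}, \eta] = \int \big[Dq_v\, D\overline{q}_v\big]\; \Delta_v[A]\; e^{\,i\int d^4x\, \left(\mathcal{L}_{\rm eff} + \overline{\eta}\, q_v + \overline{q}_v\, \eta\right)} , \tag{8.44} \]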

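with the following effective Lagrangian

\[ \mathcal{L}_{\rm eff} \equiv \overline{q}_v\,\big(i\,v\cdot D\big)\,q_v + \overline{q}_v\; i\slashed{D}_\perp\, \frac{1}{2m_Q + i\,v\cdot D}\; i\slashed{D}_\perp\; q_v , \tag{8.45} \]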

and where $\Delta_v[A]$ is the functional determinant produced by the Gaussian integral:
\[ \Delta_v[A] \equiv \Big[\det\big(2m_Q + i\,v\cdot D\big)\Big]^{1/2} . \tag{8.46} \]

Note that if one chooses the strict axial gauge v · A = 0, then this determinant is
constant and may be disregarded.

Derivative expansion : Because of the presence of derivatives in the denominator of eq. (8.45), the corresponding effective action is non-local. In order to obtain a local effective theory, we should expand this expression into a series of local operators. Such an expansion is legitimate because we have pulled out the fast phase $\exp(i\, m_Q\, v\cdot x)$ from the spinor. The resulting light field $q_v(x)$ has only a slow spacetime dependence associated with the residual momentum $q^\mu \sim \Lambda_{\rm QCD}$. Moreover, interactions with soft gluons involve a gauge field $A^\mu \sim \Lambda_{\rm QCD}$, and we thus have

\[ v\cdot D \sim \Lambda_{\rm QCD} \ll m_Q , \tag{8.47} \]

which allows the following expansion

\[ \frac{1}{2m_Q + i\,v\cdot D} = \frac{1}{2m_Q}\,\sum_{n=0}^{\infty} \left(-i\,\frac{v\cdot D}{2m_Q}\right)^{n} . \tag{8.48} \]
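
The convergence of this expansion can be illustrated by replacing the operator $i\,v\cdot D$ by a number k of order $\Lambda_{\rm QCD}$, as would happen when it acts on a plane wave of residual momentum q in the absence of gauge field (then $k = v\cdot q$). This replacement, the numerical values and the function names below are assumptions made for the sake of the example.

```python
def exact_inverse(m_q, k):
    """1 / (2 m_Q + k), where the number k stands for the value of i v.D."""
    return 1.0 / (2.0 * m_q + k)


def truncated_series(m_q, k, n_terms):
    """First n_terms of eq. (8.48), with i v.D replaced by the number k.

    With i v.D -> k, the expansion parameter -i v.D / (2 m_Q) becomes -k / (2 m_Q).
    """
    ratio = -k / (2.0 * m_q)
    return sum(ratio**n for n in range(n_terms)) / (2.0 * m_q)


if __name__ == "__main__":
    m_q, k = 4.2, 0.3  # GeV; illustrative values, not taken from the text
    target = exact_inverse(m_q, k)
    for n_terms in (1, 2, 3, 6):
        error = abs(truncated_series(m_q, k, n_terms) - target) / target
        print(f"{n_terms} term(s): relative error = {error:.2e}")
```

Each additional term reduces the error by a factor of order $k/(2m_Q)$, which is why keeping only the first few local operators is a good approximation when $\Lambda_{\rm QCD} \ll m_Q$.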

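Up to the terms of order $m_Q^{-1}$, the effective Lagrangian therefore reads
\[ \mathcal{L}_{\rm eff} = \underbrace{\overline{q}_v\,\big(i\,v\cdot D\big)\,q_v}_{\mathcal{L}_\infty} + \overline{q}_v\,\frac{(i D_\perp)^2}{2m_Q}\,q_v + \frac{g}{2m_Q}\,\overline{q}_v\, M_{ij}F^{ij}\, q_v + O(m_Q^{-2}) , \tag{8.49} \]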

where $F^{ij}$ is the QCD field strength tensor, and $M_{ij} \equiv \frac{i}{4}[\gamma_i, \gamma_j]$ are the generators of the Lorentz algebra in the spin 1/2 representation (latin indices i, j run only over the spatial components transverse to the velocity). The first term $\mathcal{L}_\infty$ is the only one that survives in the limit of infinite quark mass. The terms of order $m_Q^{-1}$ can be interpreted respectively as the contribution of the transverse motion to the kinetic energy and the interaction between the spin of the quark and the chromo-magnetic field.

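Heavy quark symmetry : The leading term in the effective Lagrangian, $\mathcal{L}_\infty$, corresponds to the following Feynman rules (respectively, the heavy quark propagator with momentum p and colour indices i, j, and the coupling of the heavy quark to a gluon of colour a and Lorentz index μ):

\[ \frac{i\,\delta_{ij}\,P_+}{v\cdot p + i0^+} , \qquad\qquad -i\,g\,v^\mu\,\big(t^a_r\big)_{ij} . \]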

Since there are no Dirac matrices in the expression of the vertex, the interactions with gluons do not alter the spin of the heavy quark at order $m_Q^0$. More formally, since $\mathcal{L}_\infty$ does not contain Dirac matrices, it is invariant under
\[ q_v \to e^{i\,\theta^i S^i}\, q_v , \quad\text{with}\quad S^i \equiv \frac{1}{2}\begin{pmatrix} \sigma^i & 0 \\ 0 & \sigma^i \end{pmatrix} . \tag{8.50} \]

(The $\sigma^i$ are the Pauli matrices.) Since we have $[S^i, S^j] = i\,\epsilon^{ijk} S^k$, this corresponds to an SU(2) invariance of $\mathcal{L}_\infty$. Moreover, since $\mathcal{L}_\infty$ is independent of $m_Q$, all heavy quarks play the same role. With $N_f$ flavours of heavy quarks, the leading effective Lagrangian
\[ \mathcal{L}_\infty = \sum_{f=1}^{N_f} \overline{q}_v^{\,f}\,\big(i\,v\cdot D\big)\,q_v^f \tag{8.51} \]

has an SU($2N_f$) symmetry, that constitutes the spin-flavour heavy quark symmetry. These symmetries are broken by the corrections in $m_Q^{-1}$, since they depend explicitly on the mass and contain Dirac matrices.

8.4.2 Colour Glass Condensate

Kinematics of high energy collisions : Another area of strong interactions which is hardly tractable in QCD itself, but where progress can be made with the help
of an effective description is that of collisions between hadrons (or nuclei) at very
high energy. Consider for instance a proton. A naive picture is that it is made of
three valence quarks, bound by gluon exchanges. However, in a relativistic quantum
description, these constituents can all fluctuate into virtual quarks, antiquarks and
gluons. The important point is that these fluctuations are short-lived: roughly speaking,
their lifetime spans all scales from zero to the proton size.
When a proton is probed in an experiment (for instance, by colliding it with
another proton) characterized by a certain time resolution, the fluctuations of the
proton whose lifetime is smaller than this resolution do not play an active role.
Through renormalization, their only effect is to set the values of the parameters of
the Lagrangian (in particular, the coupling constant) at the scale relevant for this
experiment. On the other hand, the fluctuations whose lifetime is large compared to
the characteristic timescale of the probe are seen as actual constituents of the proton.
For instance, if a quark has fluctuated into a long-lived (compared to the timescale
probed in the experiment) quark-gluon state, then the experiment will see a quark plus
a gluon, both on-shell. Note that on-shellness, i.e. the fact that a momentum satisfies $P^2 = m^2$ for a particle of mass m, should not be viewed in a strict sense in this context. On-shellness in principle requires that the energy be exactly $p^0 = \sqrt{\mathbf{p}^2 + m^2}$. But by the uncertainty principle, one would have to observe this particle for an infinitely long time to check an exact equality. Thus, in a measurement of duration Δt, any particle whose energy $p^0$ satisfies $\Delta t\,\big|p^0 - \sqrt{\mathbf{p}^2 + m^2}\big| \lesssim 1$ cannot be distinguished from an exactly on-shell particle.
This discussion provides the physical justification of the parton model, in which
bound states such as protons are described by means of distributions of quarks,
antiquarks and gluons (generically called “partons”). Except for the valence quarks,

these constituents are in fact quantum fluctuations, but their long lifetime (compared to the interaction time) allows one to treat them as on-shell. Moreover, these parton distributions must vary with the resolution scale (in space and time) with which the proton is probed, since a smaller resolution scale will resolve more partons in the measurement.

Figure 8.3: Cartoon of the fluctuations inside a nucleon. The shaded strip indicates
the time resolution of some external probe. Top: slow nucleon. Bottom: boosted
nucleon. All the internal time scales are dilated by a Lorentz factor, and new virtual
fluctuations become accessible to the probe.

In particular, in a collision between two hadrons, the duration of the collision scales as the inverse of the collision energy (roughly speaking, this is the time
necessary for the two Lorentz contracted hadrons to go through each other), and
therefore the parton distributions grow with the collision energy. Let us now assume
that in such a collision we are interested in processes characterized by a transverse
momentum Q. This means that this measurement resolves a fixed transverse distance
of the order of Q−1 , which may also be viewed as the minimal spatial extension of
the wavefunction of the partons probed in this collision. Combining this fact with the
increase of the number of partons with energy, we see that higher collision energy
leads eventually to a situation where the partons fully pack the available volume in
the hadron. This regime of strong interactions is known as (gluon) saturation. Note
that the gluon occupation number cannot grow above $\alpha_s^{-1}$, because this is the value at which gluon splittings and gluon recombinations roughly balance each other.

Degrees of freedom : In order to discuss the relevant degrees of freedom, let us consider the point of view of an observer at rest, while the hadron moves with a very large momentum in the z direction. Due to the special kinematics of this problem, it

is convenient to introduce light-cone coordinates, defined as

\[ x^+ \equiv \frac{x^0 + x^3}{\sqrt{2}} , \qquad x^- \equiv \frac{x^0 - x^3}{\sqrt{2}} . \tag{8.52} \]
(The remaining two coordinates are the transverse coordinates x⊥ .) Similar definitions
can be introduced for 4-momenta. These coordinates have the virtue of transforming
very simply under boosts in the z direction, since x± just undergo a rescaling:

\[ x^+ \to e^{\omega}\, x^+ , \qquad x^- \to e^{-\omega}\, x^- , \qquad x_\perp \to x_\perp . \tag{8.53} \]

In order to sort the constituents by their longitudinal momentum, the most convenient variable is rapidity, defined as $y \equiv \frac{1}{2}\ln(p^+/p^-)$, since it is shifted by an additive constant under a boost in the z direction. By definition, y = 0 (i.e. $p_z = 0$) corresponds to objects with no longitudinal momentum in the observer’s frame.
Quantum fluctuations with a large positive rapidity appear to the observer as nearly
on-shell constituents. At the largest rapidities (corresponding to the total pz of the
hadron), there are few constituents, mostly the valence quarks. Because of their large
longitudinal momentum, the dynamics of these constituents is considerably slowed
down by time dilation, and therefore they appear static to the observer. The only
relevant information about these fast partons is the colour current they carry. This
current is longitudinal, and because these constituents are static, it does not depend
on the light-cone variable9 $x^+$ and takes the following form:

\[ J^\mu_a(x) \equiv \delta^{\mu+}\,\rho_a(x^-, x_\perp) , \tag{8.54} \]

where the function ρa is the spatial distribution of colour charge. For a high energy
hadron, Lorentz contraction implies that the x− dependence of this function is very
peaked around x− ≈ 0. On the other hand, the x⊥ dependence reflects the distribution
of the constituents of the hadron in the plane transverse to the collision axis. Since
this depends on the particular spatial arrangement of the constituents at the time of the collision, the function $\rho_a(x^-, x_\perp)$ is not known and may be considered as a random variable with a probability distribution W[ρ]. When one repeats many similar collisions, the expectation value of an observable is obtained by a functional average
\[ \big\langle O \big\rangle = \int \big[D\rho\big]\; W[\rho]\; O[\rho] , \tag{8.55} \]

where O[ρ] is the value of this observable calculated with an arbitrary instance of the
distribution ρa .
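
The statement that light-cone momenta are simply rescaled under a longitudinal boost, and that rapidity is shifted by an additive constant, is easy to verify numerically. The sketch below applies a boost of parameter ω along z to the pair $(p^0, p^z)$ and compares the rapidity before and after; the particle mass, momentum and boost parameter are arbitrary illustrative values, and the function names are hypothetical.

```python
import math


def light_cone(p0, pz):
    """Light-cone components (p+, p-), defined as in eq. (8.52)."""
    return (p0 + pz) / math.sqrt(2.0), (p0 - pz) / math.sqrt(2.0)


def rapidity(p0, pz):
    """y = (1/2) ln(p+ / p-)."""
    p_plus, p_minus = light_cone(p0, pz)
    return 0.5 * math.log(p_plus / p_minus)


def boost_z(p0, pz, omega):
    """Boost along z such that p+ -> exp(omega) p+ and p- -> exp(-omega) p-."""
    return (p0 * math.cosh(omega) + pz * math.sinh(omega),
            pz * math.cosh(omega) + p0 * math.sinh(omega))


if __name__ == "__main__":
    mass, pz, omega = 1.0, 2.0, 1.5  # arbitrary illustrative values
    p0 = math.sqrt(pz**2 + mass**2)
    boosted = boost_z(p0, pz, omega)
    print("rapidity before:", rapidity(p0, pz))
    print("rapidity after :", rapidity(*boosted))
    print("shift          :", rapidity(*boosted) - rapidity(p0, pz))
```

The shift equals ω irrespective of the initial momentum, which is what makes rapidity a convenient variable to label modes when the frame of the observer is changed.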
In contrast, the constituents that lie at small rapidity in the observer’s frame have
a time evolution that cannot be neglected. These modes are thus described according
9 The evolution in $x^+$ is generated by the component $P^-$ of the momentum. However, for massless on-shell modes, we have $P^- = P_\perp^2/(2P^+) \to 0$ for the fast moving modes in the z direction.

fields sources

yobs ycut yproj


µ µ+
J = ρδ
1 W[ρ]
− F µν Fµν + A µ J µ
4

Figure 8.4: Degrees of freedom in the Colour Glass Condensate effective descrip-
tion of a high energy hadron.

to the original Yang-Mills action, as illustrated in the figure 8.4. Moreover, due to
the hierarchy between the longitudinal momenta of the modes described as a colour
current and those described as regular gauge fields, the coupling between them may
be approximated as eikonal, i.e. by a term of the form Jµ Aµ , and therefore the action
of the effective theory reads
Z  1 
S = d4 x − Fa F µν a
+ J µ a Aµ
a . (8.56)
4 µν
This effective theory is called the Colour Glass Condensate.

Power counting in the saturation regime : The power counting for the graphs that
appear in this effective theory is a bit peculiar in the saturation regime. Indeed, this
situation corresponds to a gluon occupation number of order $g^{-2}$, which is achieved
with a colour current of order $g^{-1}$. The order of a connected graph G with $n_E$ external
gluons, $n_L$ loops and $n_J$ insertions of the colour current is given by

$$ G \sim g^{-2}\, g^{n_E}\, g^{2 n_L}\, (gJ)^{n_J} , \tag{8.57} $$

where J denotes the typical magnitude of the current. Thus, in the saturated regime
where $J \sim g^{-1}$, the magnitude of connected graphs does not depend on $n_J$, which
means that all observables depend non-perturbatively on the colour current. In contrast,
the loop expansion still corresponds to an expansion in powers of $g^2$. Observables at
tree-level are given by an infinite sum of tree diagrams (corresponding to an arbitrary
number of insertions of $J^\mu$), that can be expressed in terms of classical solutions of
the Yang-Mills equations of motion in the presence of an external source:

$$ \big[ D_\mu , F^{\mu\nu} \big]_a = - J^\nu_a . \tag{8.58} $$

Figure 8.5: Typical graph in a hadron-hadron collision, in which both hadrons are described with the CGC effective theory. The solid dots represent insertions of the colour current $J^\mu$.

In order to have a unique solution, these equations must be supplemented with
boundary conditions. One may show that in the case of inclusive observables (footnote 10), the
appropriate boundary condition is a retarded one, in which the initial fields (and
their time derivative) are zero in the remote past (i.e. long before the collision). The
classical field obtained by solving the above equation of motion is a strong field,

$$ A^\mu \sim g^{-1} , \tag{8.59} $$

which leads to several technical complications. Some of these issues are discussed
in the chapter 15. Higher order contributions correspond to loops evaluated in the
presence of this classical field as a background.
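
The power counting of eq. (8.57) is simple enough to tabulate. The following is a minimal sketch, not taken from the text: the function name `graph_order` and the parameter `source_power` (the assumed scaling exponent of the current, J ~ g**source_power) are hypothetical helpers introduced only for illustration.

```python
def graph_order(n_external, n_loops, n_sources, source_power=-1):
    """Power of g carried by a connected graph, following eq. (8.57).

    The colour current is assumed to scale as J ~ g**source_power; the
    saturated regime of the text corresponds to source_power = -1.
    """
    # g^-2 * g^n_E * g^(2 n_L) * (g J)^n_J, with (g J) ~ g^(1 + source_power)
    return -2 + n_external + 2 * n_loops + n_sources * (1 + source_power)


if __name__ == "__main__":
    # Saturated regime: the order does not depend on the number of insertions.
    print([graph_order(1, 0, n_j) for n_j in (1, 5, 50)])
    # Each additional loop costs a factor g^2.
    print(graph_order(1, 1, 5) - graph_order(1, 0, 5))
    # Hypothetical dilute regime J ~ g^0: every insertion costs one power of g.
    print([graph_order(1, 0, n_j, source_power=0) for n_j in (1, 2, 3)])
```

In the saturated regime the exponent is independent of `n_sources`, which is the statement that tree-level observables require summing an arbitrary number of current insertions, i.e. solving the classical equation (8.58).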

Cutoff dependence : In addition, this effective description must be endowed with
a cutoff (denoted $y_{\rm cut}$ in figure 8.4) in rapidity that separates the two types of
degrees of freedom. This cutoff does not appear explicitly in the above classical
action (8.56), and therefore observables do not depend on it at tree level. But it enters
in loop corrections as an upper limit in the integral over the longitudinal momentum
circulating in the loop. Indeed, including in the loop modes that have a rapidity larger
than ycut would lead to a double counting, since these modes are already included via
the colour current Jµ . Generically, this cutoff introduces a linear dependence on ycut
in the 1-loop correction of observables. In fact, one may show that for all inclusive
observables, the cutoff dependence at 1-loop can be written as

$$ \delta\mathcal{O}_{\rm NLO}[\rho] = y_{\rm cut}\; H\, \mathcal{O}_{\rm LO}[\rho] + \mbox{terms that do not depend on } y_{\rm cut} , \tag{8.60} $$

where H is a universal (i.e. the same for all inclusive observables) operator containing
second order derivatives with respect to $\rho_a$. An important property of this operator is
that it is self-adjoint:

$$ \int [D\rho]\; A[\rho]\, \big( H\, B[\rho] \big) = \int [D\rho]\; \big( H\, A[\rho] \big)\, B[\rho] . \tag{8.61} $$

Footnote 10: Inclusive observables are measurements for which one sums over all the possible final states without excluding any of them. For instance, the average particle multiplicity in the final state is an inclusive observable, while the probability of producing exactly 3 particles is not.

However, the cutoff is not a physical parameter, since it was just introduced by
hand in order to separate the two types of degrees of freedom, and therefore it should
not appear in physical quantities. The way out of this situation is to realize that by
changing the value of the cutoff, one is also modifying which modes are described
by the colour current Jµ . Consequently, the distribution W[ρ] should in fact depend
on ycut . Using eqs. (8.60) and (8.61), we see immediately that the cutoff dependence
coming from the loop correction to observables can be canceled if we also change

$$ W[\rho] \to W[\rho] - y_{\rm cut}\, H\, W[\rho] . \tag{8.62} $$

More precisely, this substitution cancels the linear dependence on ycut . A more
rigorous procedure is to apply it to an infinitesimal variation δycut of the cutoff, for
which the quadratic terms are truly negligible. By doing so, the change of eq. (8.62)
becomes a differential equation
$$ \frac{\partial W[\rho]}{\partial y_{\rm cut}} = - H\, W[\rho] , \tag{8.63} $$

that controls how the probability distribution W[ρ] changes as one varies the cutoff
(this equation is called the JIMWLK equation).

8.5 EFT of spontaneous symmetry breaking


8.5.1 Nambu-Goldstone bosons
Spontaneous breaking of a global continuous symmetry leads to the emergence of
massless spin 0 bosons, one for each broken generator of the original symmetry, the
Nambu-Goldstone bosons. The other fields of the theory remain massive. Therefore,
at low energy, we expect that the physics is dominated by the fluctuations of the
Nambu-Goldstone bosons, and that we may neglect all the other excitations. Non-linear
sigma models (footnote 11) provide an effective description that contains only the massless
particles.

Footnote 11: Their name comes from early applications to the physics of pions, that may be viewed as Goldstone bosons of the chiral symmetry SU(2) × SU(2) (approximate for small but non-zero quark masses) that exists in the light quark (u and d) sector of quantum chromodynamics. This symmetry is spontaneously broken to a residual SU(2) symmetry in the vacuum of QCD, leading to the appearance of three nearly massless scalar particles.

Figure 8.6: Illustration of the symmetry breaking pattern in the case of a model with G = O(3) symmetry. The set $M_0$ of the minima of the potential is a 2-dimensional sphere. The stabilizer of the minimum $\phi_c$ is H = O(2). The light colored arrows show the field configurations obtained by applying G/H to $\phi_c$, i.e. the allowed values of the Nambu-Goldstone fields.

Let our starting point be an action of the form

$$ S \equiv \int d^dx \,\Big\{ \frac{1}{2}\, \big(\partial_\mu \phi(x)\big) \big(\partial^\mu \phi(x)\big) - V\big(\phi(x)\big) \Big\} , \tag{8.64} $$

assumed to be invariant under the global action of a Lie group G. The metric of
d-dimensional spacetime is chosen to be Minkowskian (but this discussion is equally
applicable to Euclidean space). In addition, the potential V(φ) has non-trivial minima
at some φ ≠ 0. Due to the G-invariance of the action, the non-trivial minima cannot
be unique. Given a certain minimum $\phi_c$, all the field configurations that may be
reached from $\phi_c$ by the action of G are also minima. If we assume that there are no
accidental (i.e. not caused by the symmetry of the action) degeneracies of the minima,
the set of all minima can therefore be written as

$$ M_0 \equiv \big\{ g\, \phi_c \;\big|\; g \in G \big\} . \tag{8.65} $$
If H is the subgroup of G that leaves φc invariant (sometimes called the stabilizer of
φc ), then M0 is also the coset G/H.
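
To make the roles of H and G/H concrete, here is a small numerical sketch for the O(3) case of figure 8.6. The choice $\phi_c = (0, 0, 1)$ and the helper names (`rot_z`, `rot_x`, `apply`, `norm`) are assumptions made for illustration only: a rotation about the axis of $\phi_c$ belongs to the stabilizer, while a rotation about a perpendicular axis moves $\phi_c$ to another point of the sphere of minima.

```python
import math


def rot_z(angle):
    """Rotation about the z axis (an element of the stabilizer of PHI_C)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_x(angle):
    """Rotation about the x axis (moves PHI_C along the sphere of minima)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def apply(matrix, vector):
    """Matrix-vector product for 3x3 matrices stored as nested lists."""
    return [sum(matrix[i][j] * vector[j] for j in range(3)) for i in range(3)]


def norm(vector):
    return math.sqrt(sum(x * x for x in vector))


# Assumed choice of minimum: the "north pole" of the unit sphere.
PHI_C = [0.0, 0.0, 1.0]

if __name__ == "__main__":
    print("stabilizer element:", apply(rot_z(0.7), PHI_C))
    moved = apply(rot_x(0.7), PHI_C)
    print("coset direction:   ", moved, "norm =", norm(moved))
```

The second rotation changes the field but not its norm, so the result is another minimum of an O(3)-invariant potential: it labels a point of $G/H = S^2$.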

8.5.2 Non-linear sigma model


Definition : At low energy, the quantum fluctuations of the fields that have remained
massive can be neglected, and the massless components of φ may be obtained by
acting on φc by a matrix representation of G,
$$ \phi_i = R_{ij}(g)\, \phi_{cj} . \tag{8.66} $$

Note that, given an element h ∈ H, we have

$$ R_{ij}(gh)\, \phi_{cj} = R_{ij}(g)\, \underbrace{R_{jk}(h)\, \phi_{ck}}_{\phi_{cj}} = R_{ij}(g)\, \phi_{cj} . \tag{8.67} $$

Thus, the field φ given by eq. (8.66) is not really a function of the full group G, but
depends only on elements of the coset G/H. Let us now split the generators $t^a$ of the
Lie algebra $\mathfrak{g}$ into those (for a > n) that correspond to $\mathfrak{h}$, and the complement (for
1 ≤ a ≤ n). From the definition of H as the stabilizer of $\phi_c$, we have

$$ 1 \le a \le n : \quad t^a_{ij}\, \phi_{cj} \neq 0 , \qquad\qquad a > n : \quad t^a_{ij}\, \phi_{cj} = 0 . \tag{8.68} $$

Thus, the matrix R(g) can be written as

$$ R(\theta) = \exp\Big( i \sum_{a=1}^{n} \theta^a t^a \Big) . \tag{8.69} $$

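The value of the potential does not change under the action of G on $\phi_c$, and we are
free to choose the value of its minimum to be $V(\phi_c) = 0$. Thus, the action becomes

$$ S = \frac{1}{2} \int d^dx \; \phi_{ci}\, \big(\partial_\mu R^{-1}_{ik}(\theta)\big) \big(\partial^\mu R_{kj}(\theta)\big)\, \phi_{cj} = -\frac{1}{2} \int d^dx \; \phi_{ci}\, \big[ A_\mu(\theta) A^\mu(\theta) \big]_{ij}\, \phi_{cj} , \tag{8.70} $$

where in the second expression we have introduced $A_\mu \equiv R^{-1} \partial_\mu R$ (an element of
the algebra).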
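
The two forms of eq. (8.70) follow from a short manipulation, sketched below under the assumption (implicit in the appearance of $R^{-1}$) that the real fields transform under an orthogonal representation, $R^T = R^{-1}$.

```latex
\begin{align*}
\partial_\mu \phi_i\, \partial^\mu \phi_i
 &= \phi_{cj}\, \big(\partial_\mu R_{ij}\big) \big(\partial^\mu R_{ik}\big)\, \phi_{ck}
  = \phi_{cj}\, \big[ (\partial_\mu R^{T}) (\partial^\mu R) \big]_{jk}\, \phi_{ck}
  = \phi_{cj}\, \big[ (\partial_\mu R^{-1}) (\partial^\mu R) \big]_{jk}\, \phi_{ck} , \\
\partial_\mu R^{-1}
 &= - R^{-1}\, (\partial_\mu R)\, R^{-1}
 \qquad \big(\text{from } \partial_\mu (R^{-1} R) = 0 \big) , \\
(\partial_\mu R^{-1}) (\partial^\mu R)
 &= - \big( R^{-1} \partial_\mu R \big) \big( R^{-1} \partial^\mu R \big)
  = - A_\mu A^\mu .
\end{align*}
```

The minus sign in the second form of the action is therefore not a change of sign of the kinetic term: with hermitian generators, $A_\mu$ is anti-hermitian and $-A_\mu A^\mu$ is a positive operator.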
Eq. (8.70) gives the action in terms of the “coordinates” θa on the coset G/H,
corresponding to a certain choice of the generators ta . However, it is interesting to
express the action in terms of a completely arbitrary system of coordinates on G/H,
that we may denote $\vartheta^m$. Since eq. (8.70) has only two derivatives $\partial_\mu \cdots \partial^\mu$, the same
must be true of its expression in any system of coordinates. On the other hand, it may
contain terms of arbitrarily high degree in ϑ. Thus, the most general action is of the
form

$$ S = \frac{1}{2} \int d^dx \; g_{mn}(\vartheta)\, \big(\partial_\mu \vartheta^m\big) \big(\partial^\mu \vartheta^n\big) , \tag{8.71} $$
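where the coefficients $g_{mn}(\vartheta)$ can be related to R(θ) as follows:

$$ g_{mn}(\vartheta) \equiv -4\, \big( \phi_{ci}\, t^a_{ik}\, t^b_{kj}\, \phi_{cj} \big)\; {\rm tr}\Big( t^a R^{-1} \frac{\partial R}{\partial \vartheta^m} \Big)\; {\rm tr}\Big( t^b R^{-1} \frac{\partial R}{\partial \vartheta^n} \Big) . \tag{8.72} $$

They form a metric tensor on G/H, if the coset is viewed as a Riemannian manifold.
Indeed, if we use a different system of coordinates $\varpi^p$ on G/H, $g_{mn}(\vartheta)$ would be
replaced by

$$ g_{pq}(\varpi) \equiv -4\, \big( \phi_{ci}\, t^a_{ik}\, t^b_{kj}\, \phi_{cj} \big)\; {\rm tr}\Big( t^a R^{-1} \frac{\partial R}{\partial \varpi^p} \Big)\; {\rm tr}\Big( t^b R^{-1} \frac{\partial R}{\partial \varpi^q} \Big) = g_{mn}(\vartheta)\, \frac{\partial \vartheta^m}{\partial \varpi^p}\, \frac{\partial \vartheta^n}{\partial \varpi^q} , \tag{8.73} $$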

which is indeed the expected transformation law of a metric tensor under a change
of coordinates. The field theory described by the action (8.71) is called a non-linear
sigma model. Note that the derivative ∂µ ϑm of the coordinate ϑm is a vector that
lives on the tangent space to the manifold G/H at the point ϑ. Therefore, the action
(8.71), in which the tensor gmn is contracted with two vectors, is a scalar – invariant
under changes of coordinates on the manifold.
The Taylor expansion of the metric in powers of the field ϑ determines which
couplings exist in the classical action. Interestingly, even though the kinetic term of
the original action was quadratic in the fields, we now have a term with two derivatives
and possibly arbitrarily high orders in the field. Loosely speaking, this is due to the
fact that spontaneous symmetry breaking has restricted the fields from a space $\mathbb{R}^n$ in
which the symmetry G was linearly realized, down to a curved manifold in which it is
realized non-linearly. In addition, it is worth stressing that the final action is uniquely
determined from eq. (8.69), but may take various explicit forms depending on the
choice of coordinates ϑm on G/H. In other words, the non-linear sigma model has
an intrinsic geometrical meaning, that does not depend on the system of coordinates
one uses.

Path integral quantization : The quantization of the non-linear sigma model can
be achieved via path integration. The action is quadratic in derivatives of the field, but
with the unusual feature that these derivatives are multiplied by a function of the field.
In order to ascertain the consequence of this property, it is necessary to start from the
Hamilton formulation of the path integral, and to perform explicitly the integral over
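the conjugate momenta. For a Lagrangian density

$$ \mathcal{L} = \frac{1}{2}\, g_{mn}(\vartheta)\, \big(\partial_\mu \vartheta^m\big) \big(\partial^\mu \vartheta^n\big) , \tag{8.74} $$

the conjugate momenta read

$$ \pi_m \equiv \frac{\partial \mathcal{L}}{\partial(\partial_0 \vartheta^m)} = g_{mn}(\vartheta)\, \partial_0 \vartheta^n , \tag{8.75} $$

and the Hamiltonian is given by

$$ \mathcal{H} = \pi_m\, \partial_0 \vartheta^m - \mathcal{L} = \frac{1}{2}\, g^{mn}(\vartheta)\, \pi_m \pi_n + \frac{1}{2}\, g_{mn}(\vartheta)\, \big(\nabla \vartheta^m\big) \cdot \big(\nabla \vartheta^n\big) , \tag{8.76} $$

where $g^{mn}$ is the inverse of the metric tensor, $g^{mn} g_{np} = \delta^m{}_p$. The Hamiltonian is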
quadratic in the momenta, but since the coefficient in front of πm πn depends on the
field, the determinant produced in the Gaussian integration over the momenta cannot
be disregarded. After this integral has been performed, the generating functional is
given by the following formula

$$ Z[j_m] = \int \Big[ \sqrt{g(x)}\; \prod_m D\vartheta^m(x) \Big] \; \exp\Big\{ \frac{i}{\hbar} \int d^dx \,\big( \mathcal{L}(\vartheta) + j_m \vartheta^m \big) \Big\} , \tag{8.77} $$

where we denote $g(x) \equiv \det\big(g_{mn}(\vartheta(x))\big)$. Interestingly, the field dependence of
$g_{mn}(\vartheta)$ alters the path integral in a rather natural way: the measure $\prod_m D\vartheta^m$ is
replaced by $\sqrt{g}\, \prod_m D\vartheta^m$, which is invariant under changes of coordinates on the
manifold G/H.

Figure 8.7: Perturbative expansion in the non-linear sigma model: only field configurations near $\phi_c$ (on the manifold G/H) are explored.
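
The origin of the factor $\sqrt{g}$ can be seen from the momentum integral at a single spacetime point. The following is only a sketch: it ignores the subtleties of the time discretization, keeps N fields, and drops all field-independent normalization constants.

```latex
\begin{align*}
\pi_m\, \partial_0 \vartheta^m - \frac{1}{2}\, g^{mn} \pi_m \pi_n
 &= \frac{1}{2}\, g_{mn}\, \partial_0 \vartheta^m\, \partial_0 \vartheta^n
  - \frac{1}{2}\, g^{mn}\, \pi'_m \pi'_n ,
 \qquad \pi'_m \equiv \pi_m - g_{mn}\, \partial_0 \vartheta^n , \\
\int \prod_{m=1}^{N} d\pi'_m \; \exp\Big\{ -\frac{i}{2\hbar}\, g^{mn}(\vartheta)\, \pi'_m \pi'_n \Big\}
 &\propto \big[ \det g^{mn}(\vartheta) \big]^{-1/2}
  = \sqrt{\det g_{mn}(\vartheta)} .
\end{align*}
```

The first line reconstructs the time-derivative part of the Lagrangian (8.74), while the second shows that the Gaussian integral leaves behind one factor of $\sqrt{g}$ per spacetime point, which is the field-dependent measure of eq. (8.77).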
Note that in eq. (8.77), we have introduced an explicit h̄, that will be useful later
to keep track of the number of loops. The perturbative expansion in the non-linear
sigma model corresponds to an expansion in powers of h̄. From the path integral, we
can infer that the typical field amplitudes scale as

$$ \vartheta \sim \sqrt{\hbar} , \tag{8.78} $$

which means that the perturbative expansion is also an expansion around ϑ = 0 (i.e.
around φ = φc ). For such small fields, the effects of the curvature of the manifold are
perturbative, and we can expand the metric tensor in powers of the field (an explicit
choice of coordinates must be made for this). The bare propagator of the ϑ fields is
given by

$$ G_{mn}(p) = \frac{i\, \delta_{mn}}{p^2 + i0^+} . \tag{8.79} $$

Renormalization : Dimensional analysis tells that the field ϑ has the dimension

$$ \vartheta \sim ({\rm mass})^{(d-2)/2} \tag{8.80} $$

(in a system of units where h̄ = 1). From this, we see that there are three cases
regarding the ultraviolet power counting in the non-linear sigma model:

• d < 2 : the Taylor coefficients of the metric tensor all have a positive mass
dimension, and are therefore super renormalizable.
• d = 2 : the Taylor coefficients are dimensionless, and the theory is renormaliz-
able.
• d > 2 : the Taylor coefficients have a negative mass dimension and are all
non-renormalizable by power counting.
The most interesting situation is therefore the two-dimensional case. It differs some-
what from the renormalization of the quantum field theories we have encountered
until now, since the action contains an infinite series of terms (of increasing degree
in ϑ), and an important question is whether the action (8.71) conserves its structure
under renormalization.
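
The three cases listed above follow from a one-line dimensional count. In the sketch below, the symbol $g^{(k)}$ for the k-th Taylor coefficient of the metric is a notation introduced here for illustration; the action (8.71) is dimensionless in units where $\hbar = 1$.

```latex
\begin{align*}
g_{mn}(\vartheta) &= \sum_{k \ge 0} \frac{1}{k!}\;
   g^{(k)}_{mn;\, p_1 \cdots p_k}\; \vartheta^{p_1} \cdots \vartheta^{p_k} , \\
0 = \big[ S \big]
 &= \underbrace{-d}_{d^dx}
  + \big[ g^{(k)} \big]
  + \underbrace{(k+2)\, \frac{d-2}{2}}_{k+2 \text{ fields}}
  + \underbrace{2}_{\partial_\mu \partial^\mu}
 \quad \Longrightarrow \quad
 \big[ g^{(k)} \big] = -\, k\, \frac{d-2}{2} .
\end{align*}
```

The mass dimension of every coefficient with $k \ge 1$ is thus positive for d < 2, zero for d = 2 and negative for d > 2.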
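Recall that the fields $\vartheta^m$ transform under a non-linear representation of the group
G. Thus, their variation under an infinitesimal transformation of parameters $\epsilon^a$ may
be written as

$$ \delta\vartheta^m \equiv \epsilon^a\, T_a^m(\vartheta) , \tag{8.81} $$

where the $T_a^m(\vartheta)$ are smooth functions of the fields. Under the same transformation,
the variation of the action reads

$$ \delta S = \epsilon^a \int d^2x \; T_a^m(\vartheta)\, \frac{\delta S}{\delta \vartheta^m(x)} , \tag{8.82} $$

and the invariance of the action under G thus requires that

$$ T_a^p\, \frac{\partial g_{mn}}{\partial \vartheta^p} + g_{pn}\, \frac{\partial T_a^p}{\partial \vartheta^m} + g_{pm}\, \frac{\partial T_a^p}{\partial \vartheta^n} = 0 . \tag{8.83} $$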
In other words, the possible forms of the metric tensor are constrained by the symmetry
G. Indeed, the coset G/H is a homogeneous space (footnote 12), i.e. a manifold that possesses
additional symmetries that reduce the dimension of the space of allowed metrics.
More precisely, a homogeneous space is such that given any pair of points ϑ and
ϑ′ on the manifold, there is an isometry (i.e. a distance preserving transformation)
that maps ϑ to ϑ′. If in addition the space is isotropic, then it is said to be maximally
symmetric (footnote 13). In an N-dimensional maximally symmetric space, there is a particularly
simple relationship between the metric and curvature tensors:

$$ R_{mn} = \frac{R}{N}\, g_{mn} \quad (R \equiv R^m{}_m) , \qquad R_{mnpq} = \frac{R}{N(N-1)}\, \big( g_{mp}\, g_{nq} - g_{mq}\, g_{np} \big) . \tag{8.84} $$

(These two identities imply that the scalar curvature R is constant over the entire
manifold for a dimension N > 2.)

Footnote 12: Thanks to their connections to Lie algebras, a systematic classification of homogeneous spaces is possible.

Footnote 13: A maximally symmetric manifold of dimension N has N(N + 1)/2 distinct isometries. In Euclidean space, this corresponds to N translations and N(N − 1)/2 rotations, but this maximal number of isometries is the same in N-dimensional manifolds with curvature.
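
As a quick consistency check, the two identities of eq. (8.84) are compatible with each other. The sketch below assumes the convention $R_{nq} \equiv g^{mp} R_{mnpq}$ for the Ricci tensor, which the text does not spell out.

```latex
\begin{align*}
g^{mp}\, R_{mnpq}
 &= \frac{R}{N(N-1)}\, \big( g^{mp} g_{mp}\; g_{nq} - g^{mp} g_{mq}\, g_{np} \big)
  = \frac{R}{N(N-1)}\, \big( N\, g_{nq} - g_{nq} \big)
  = \frac{R}{N}\, g_{nq} , \\
g^{nq}\, R_{nq} &= \frac{R}{N}\, g^{nq} g_{nq} = R .
\end{align*}
```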
A possible strategy for studying the renormalization of the sigma model is to
introduce an analogue of the BRST transformation of non-Abelian gauge theories,
and the associated Slavnov-Taylor identities obeyed by the quantum effective action.
These identities, combined with dimensional and symmetry arguments that restrict
the terms that may arise in the renormalized action, are sufficient to show that the
renormalized action is structurally identical to eq. (8.71), with a group-invariant
metric tensor that obeys a renormalized version of eq. (8.83).

Example of G = O(n) : A scalar field $\phi_i$ with n components has an O(n)
symmetry if the action depends only on the combination $\phi_i \phi_i$. Potentials with
non-trivial minima (i.e. at φ ≠ 0) in fact have infinitely degenerate minima that
form an (n − 1)-dimensional sphere $S^{n-1}$ (see figure 8.6 for an illustration in the
case n = 3). Each minimum has a stabilizer subgroup H = O(n − 1) (the smaller
group of rotations around the direction fixed by this minimum), and we indeed have
$S^{n-1} = O(n)/O(n-1)$. A possible explicit parameterization of the field φ consists
in writing

$$ \phi \equiv \big( \sigma, \boldsymbol{\xi} \big) , \tag{8.85} $$

where σ has one component and ξ has n − 1 components. Assuming the parameters
of the potential are adjusted so that the sphere $S^{n-1}$ of minima has radius |φ| = 1,
we must impose the constraint $\sigma^2 + \boldsymbol{\xi}^2 = 1$, which means that σ may be viewed as a
dependent field that depends non-linearly on ξ. Usually, these coordinates are chosen
in such a way that the symmetry-breaking vacuum is $\phi_c = (\sigma = 1, \boldsymbol{\xi} = 0)$. In the
vicinity of φc , σ is the “radial” massive field, while the ξi are the “angular” variables
corresponding to the massless Nambu-Goldstone bosons.
Then, we may split the generators of the o(n) algebra into those of the stabilizer
o(n − 1) and the complementary set of generators:

• The generators of o(n − 1) act linearly on ξ. More precisely, they leave $\sigma^2 + \boldsymbol{\xi}^2$
invariant by leaving both σ and $\boldsymbol{\xi}^2$ unchanged (thus simply rotating the n − 1
components of ξ).
• In contrast, the generators of the complementary set preserve $\sigma^2 + \boldsymbol{\xi}^2$, but mix
σ and ξ as follows:

$$ \sigma \to \sigma - \epsilon^i \xi^i , \qquad \xi^i \to \xi^i + \epsilon^i \sqrt{1 - \boldsymbol{\xi}^2} , \tag{8.86} $$

and therefore they act non-linearly on ξ.
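
That the transformation (8.86) preserves the constraint can be checked directly, to first order in the infinitesimal parameters and using $\sqrt{1 - \boldsymbol{\xi}^2} = \sigma$ on the sphere:

```latex
\begin{align*}
\sigma'^2 + \boldsymbol{\xi}'^2
 &= \big( \sigma - \epsilon^i \xi^i \big)^2
  + \big( \xi^i + \epsilon^i \sigma \big) \big( \xi^i + \epsilon^i \sigma \big) \\
 &= \sigma^2 + \boldsymbol{\xi}^2
  - 2\, \sigma\, \epsilon^i \xi^i + 2\, \sigma\, \epsilon^i \xi^i
  + \mathcal{O}(\epsilon^2)
  = \sigma^2 + \boldsymbol{\xi}^2 + \mathcal{O}(\epsilon^2) .
\end{align*}
```

The non-linearity comes entirely from the square root: once σ is eliminated, the variation of $\xi^i$ is no longer a linear function of ξ.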


Figure 8.8: Illustration of the (σ, ξ) coordinates for an O(3) model (axes σ, $\xi^1$, $\xi^2$). The dark circle corresponds to the transformations that preserve σ and act linearly on ξ (as an O(2) rotation). The light colored circles are the transformations that mix σ and ξ (and transform the latter non-linearly).

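The most general O(n)-invariant action with $\sigma^2 + \boldsymbol{\xi}^2 = 1$ reads

$$ S = \frac{1}{2} \int d^dx \,\Big[ (\partial_\mu \sigma)(\partial^\mu \sigma) + (\partial_\mu \boldsymbol{\xi}) \cdot (\partial^\mu \boldsymbol{\xi}) \Big] = \frac{1}{2} \int d^dx \; g_{ij}(\boldsymbol{\xi})\, (\partial_\mu \xi^i)(\partial^\mu \xi^j) , \tag{8.87} $$

where in the second equality we have eliminated σ and we have defined

$$ g_{ij}(\boldsymbol{\xi}) \equiv \delta_{ij} + \frac{\xi^i \xi^j}{1 - \boldsymbol{\xi}^2} . \tag{8.88} $$

The tensor $g_{ij}$ is the metric on the $S^{n-1}$ sphere, in the system of coordinates provided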
by the ξi . The couplings of this theory are determined by the Taylor expansion of
the metric tensor, which in this case is completely specified by the choice of the
coordinates and by the symmetries of the problem. In d = 2 dimensions, this theory
is renormalizable by power counting. Although it contains an infinite number of
couplings, it is not necessary to renormalize each of them individually. Instead, the
renormalization preserves the structure of the action (8.87) with a metric tensor that
remains dictated by the O(n) symmetry.
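
The elimination of σ that produces the metric (8.88) can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the text: it treats a single derivative direction, represents $\partial \xi$ by a plain vector `v`, and compares the embedding form $(\partial\sigma)^2 + (\partial\xi)^2$, with $\partial\sigma$ obtained by a finite difference of $\sigma = \sqrt{1 - \xi^2}$, against $g_{ij}\, \partial\xi^i\, \partial\xi^j$. All function names are hypothetical helpers.

```python
import math


def sigma_of(xi):
    """Dependent field sigma = sqrt(1 - xi^2) on the unit sphere."""
    return math.sqrt(1.0 - sum(x * x for x in xi))


def metric(xi):
    """Metric of eq. (8.88): delta_ij + xi_i xi_j / (1 - xi^2)."""
    n = len(xi)
    xi2 = sum(x * x for x in xi)
    return [[(1.0 if i == j else 0.0) + xi[i] * xi[j] / (1.0 - xi2)
             for j in range(n)] for i in range(n)]


def kinetic_from_metric(xi, v):
    """g_ij(xi) v^i v^j, where v stands for the derivative of xi."""
    g = metric(xi)
    n = len(xi)
    return sum(g[i][j] * v[i] * v[j] for i in range(n) for j in range(n))


def kinetic_from_embedding(xi, v, h=1e-6):
    """(d sigma)^2 + (d xi)^2, with d sigma from a central finite difference."""
    plus = [x + h * vi for x, vi in zip(xi, v)]
    minus = [x - h * vi for x, vi in zip(xi, v)]
    d_sigma = (sigma_of(plus) - sigma_of(minus)) / (2.0 * h)
    return d_sigma ** 2 + sum(vi * vi for vi in v)


if __name__ == "__main__":
    xi, v = [0.3, -0.2], [0.5, 1.1]
    print(kinetic_from_metric(xi, v), kinetic_from_embedding(xi, v))
```

The two numbers agree up to the finite-difference error, which reflects the identity $\partial\sigma = -\xi^i\, \partial\xi^i / \sigma$ behind eq. (8.88).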

8.5.3 Nonlinear sigma model on a generic Riemannian manifold

We have derived the non-linear sigma model as the effective action that describes the
dynamics of the massless Nambu-Goldstone bosons after a spontaneous breaking of
symmetry. In this case, the fields of the non-linear sigma model live on a manifold
which is also a homogeneous space thanks to the symmetries of the original problem.
These symmetries severely constrain the possible forms of the metric, and play an
important role in constraining the form of the loop corrections.
However, it is possible to consider an action of the form (8.71) for fields ϑm living
on a generic smooth Riemannian manifold that does not possess any special symmetry.
The power counting argument made earlier is unchanged, and we expect that this
more general kind of sigma model is also renormalizable in 2 dimensions. For these
generalized models, it has been shown that the dependence of the metric tensor (i.e.
the function that defines all the couplings of the model) on the renormalization scale
$\mu$ is governed by the following Callan-Symanzik equation:

$$ \mu \frac{\partial}{\partial \mu}\, g^{mn} = - \frac{1}{2\pi}\, R^{mn} - \frac{1}{8\pi^2}\, R^{mpqr}\, R^{n}{}_{pqr} + \mbox{higher orders} . \tag{8.89} $$

Note that if we apply this equation in the case of a maximally symmetric space, for
which the curvature tensors have simple expressions in terms of the metric tensor, it
reduces to

$$ \mu \frac{\partial}{\partial \mu}\, g^{mn} = - \frac{R}{2\pi N}\, g^{mn}\, \Big[ 1 + \frac{R}{2\pi N(N-1)} + \cdots \Big] . \tag{8.90} $$

Thus, in this special case, the metric is rescaled but retains its form under changes
of scale (because it is constrained by the isometries of the manifold). On a generic
manifold, the scale evolution governed by eq. (8.89) explores a much broader space
of metrics. Generally speaking, the renormalization flow tends to expand the regions
of negative curvature and to shrink those of positive curvature.
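
The reduction of eq. (8.89) to eq. (8.90) only requires the maximally symmetric form (8.84) of the curvature tensors. A sketch of the algebra, with all indices raised using the metric:

```latex
\begin{align*}
R^{mpqr}\, R^{n}{}_{pqr}
 &= \Big[ \frac{R}{N(N-1)} \Big]^2
    \big( g^{mq} g^{pr} - g^{mr} g^{pq} \big)
    \big( \delta^n{}_q\, g_{pr} - \delta^n{}_r\, g_{pq} \big) \\
 &= \Big[ \frac{R}{N(N-1)} \Big]^2 \big( N - 1 - 1 + N \big)\, g^{mn}
  = \frac{2\, R^2}{N^2 (N-1)}\, g^{mn} , \\
- \frac{1}{2\pi}\, R^{mn} - \frac{1}{8\pi^2}\, R^{mpqr}\, R^{n}{}_{pqr}
 &= - \frac{R}{2\pi N}\, g^{mn} - \frac{R^2}{4\pi^2 N^2 (N-1)}\, g^{mn}
  = - \frac{R}{2\pi N}\, g^{mn}\, \Big[ 1 + \frac{R}{2\pi N(N-1)} \Big] .
\end{align*}
```

Both terms are proportional to the metric itself, which is why the flow can only rescale a maximally symmetric metric.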

Figure 8.9: Left to right : successive stages of the Ricci flow on a 2-dimensional
manifold.

There is an interesting analogy between the renormalization group eq. (8.89) and
the Ricci flow,

$$ \partial_\tau\, g_{mn} = -2\, R_{mn} , \tag{8.91} $$

introduced independently in mathematics by Hamilton in 1981 as a tool for studying
the geometrical classification of 3-dimensional manifolds (footnote 14). In a sketchy way, the idea
is to start with a generic metric tensor on the manifold, and to smoothen this metric
by evolution with the Ricci flow (the Ricci flow is somewhat analogous to a heat
equation, that tends to uniformize the temperature distribution). For instance, if the
metric evolves into one that has a constant positive curvature, one would have proved
that the original manifold is homeomorphic to a sphere. For 2-dimensional manifolds,
this is indeed what happens: the Ricci flow evolves the metric tensor into one that
has a constant scalar curvature, corresponding to one of the three possible geometries.
Applications of Ricci flow to 3-dimensional manifolds turned out to be complicated by
singularities that develop as the metric evolves, and required additional steps known
as “surgery” to excise the singularities. There is nowadays some speculation about
whether the additional terms in eq. (8.89) compared to eq. (8.91) have a regularizing
effect that may prevent the appearance of these singularities and thus make the surgical
steps unnecessary.
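
The simplest explicit solution of the Ricci flow (8.91) is the round sphere, and it illustrates the shrinking of positively curved regions. In the sketch below, $\hat{g}_{mn}$ denotes the metric of the unit sphere $S^N$ and $r(\tau)$ its radius; both symbols are introduced here for illustration. The Ricci tensor of a round sphere does not depend on its radius.

```latex
\begin{align*}
g_{mn}(\tau) &= r^2(\tau)\; \hat{g}_{mn} ,
 \qquad R_{mn} = (N-1)\, \hat{g}_{mn} , \\
\partial_\tau g_{mn} = -2\, R_{mn}
 &\quad \Longrightarrow \quad
 \frac{d r^2}{d\tau} = -2\, (N-1)
 \quad \Longrightarrow \quad
 r^2(\tau) = r_0^2 - 2\, (N-1)\, \tau .
\end{align*}
```

The metric keeps its round shape while the radius decreases, and the sphere collapses to a point at the finite flow time $\tau_* = r_0^2 / (2(N-1))$, a very simple instance of the singularities mentioned above.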

Footnote 14: In 2 dimensions, connected manifolds are known to fall into three geometrical classes: flat, spherical or hyperbolic, depending on their curvature. More precisely, any such 2-dimensional manifold can be endowed with a metric that has a constant scalar curvature, either null, positive or negative. Thurston's geometrization conjecture proposed a similar (but much more complicated) classification of 3-dimensional manifolds. In particular, this conjecture contains as a special case Poincaré's conjecture, stating that every closed simply connected 3-dimensional manifold is homeomorphic to a 3-sphere. The geometrization conjecture was proved in 2003 by Perelman, with techniques in which the Ricci flow plays a central role.
Chapter 9

Quantum anomalies

Noether’s theorem states that for each continuous symmetry of a classical Lagrangian,
there exists a corresponding conserved current. By construction, this conservation law
holds at tree level, and a very important question is whether it is preserved by quantum
corrections in higher orders of the theory. Quantum anomalies are situations where
a classical symmetry is violated by quantum effects. We have already encountered
anomalies in the section 3.5, where we saw that the fermionic functional measure is
not invariant under chiral transformations of massless fermions, which had interesting
connections with the index of the Dirac operator (its zero modes in the presence of an
external field).

When such an anomaly arises in a global symmetry like chiral symmetry, its
effect is just to introduce a corrective term into the conservation equation of the
corresponding current (which may have some physical consequences, however).
But when it affects a local gauge symmetry, its effects are devastating, since the
renormalizability and unitarity of gauge theories rely on the validity to all orders of
the gauge symmetry. In general, gauge theories with an anomalous gauge symmetry
do not make sense, and it is therefore of utmost importance to check that no such
gauge anomaly is present in theories of phenomenological relevance.

9.1 Axial anomalies in a gauge background


9.1.1 Two dimensional example: Schwinger model
The simplest example of a theory that exhibits a quantum anomaly is quantum electrodynamics
in two dimensions with massless fermions, also known as the Schwinger
model. The Lagrangian of this theory reads:

$$ \mathcal{L} \equiv i\, \overline{\Psi}\, \slashed{D}\, \Psi - \frac{1}{4}\, F_{\mu\nu} F^{\mu\nu} , \tag{9.1} $$

where $D_\mu \equiv \partial_\mu - ie A_\mu$ and $F_{\mu\nu} \equiv \partial_\mu A_\nu - \partial_\nu A_\mu$. This Lagrangian is invariant
under (local) U(1) transformations,

$$ \Psi(x) \to e^{ie\chi(x)}\, \Psi(x) , \qquad A_\mu(x) \to A_\mu(x) + \partial_\mu \chi(x) , \tag{9.2} $$

which, by Noether’s theorem, implies the existence of a conserved electromagnetic
current:

$$ J^\mu \equiv -ie\, \overline{\Psi} \gamma^\mu \Psi , \qquad \partial_\mu J^\mu = 0 . \tag{9.3} $$

(In the following, this current will be called a vector current.) Being a gauge symmetry,
this invariance is crucial for the unitarity of the theory, since it ensures that longitudinal
photons do not contribute as initial or final states of physical amplitudes.
Because the fermions are massless, this theory has another symmetry. In order to
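see it, let us introduce (footnote 1) a matrix $\gamma^5$,

$$ \gamma^5 = \frac{1}{2}\, \epsilon_{\mu\nu}\, \gamma^\mu \gamma^\nu = \gamma^0 \gamma^1 , \tag{9.4} $$

where $\epsilon_{\mu\nu}$ is the 2-dimensional completely antisymmetric tensor, normalized by
$\epsilon_{01} = +1$. Using $\gamma^5$, one may decompose Ψ into its left and right handed components:

$$ \Psi = \Psi_R + \Psi_L , \qquad \Psi_R \equiv \frac{1 + \gamma^5}{2}\, \Psi , \qquad \Psi_L \equiv \frac{1 - \gamma^5}{2}\, \Psi , \tag{9.5} $$

and the fermionic part of the Lagrangian can be rewritten as

$$ i\, \overline{\Psi} \slashed{D} \Psi = i\, \Psi_R^\dagger \gamma^0 \slashed{D} \Psi_R + i\, \Psi_L^\dagger \gamma^0 \slashed{D} \Psi_L . \tag{9.6} $$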

In other words, the kinetic term does not mix the left and right components (this
would not be true with a mass term). As a consequence, the Lagrangian is invariant if
we multiply the left and right components by independent phases,

$$ \Psi_R \to e^{i\alpha}\, \Psi_R , \qquad \Psi_L \to e^{i\beta}\, \Psi_L . \tag{9.7} $$

Note that this is a global invariance, unlike the gauge symmetry discussed previously.
Equivalently, the massless Dirac Lagrangian is invariant under the following global
transformation,

$$ \Psi \to e^{i\theta\gamma^5}\, \Psi , \tag{9.8} $$

that amounts to multiplying by conjugate phases the left and right components
(because of the $\gamma^5$ in the exponential). Since this is a continuous symmetry, Noether’s
theorem also applies here and tells us that the axial current is conserved:

$$ J_5^\mu \equiv -ie\, \overline{\Psi} \gamma^5 \gamma^\mu \Psi , \qquad \partial_\mu J_5^\mu = 0 . \tag{9.9} $$

Footnote 1: It is possible to define $\gamma^5$ in any even space-time dimension D = 2r, as follows: $\gamma^5 \equiv \frac{i^{r-1}}{(2r)!}\, \epsilon_{\mu_1 \mu_2 \cdots \mu_{2r}}\, \gamma^{\mu_1} \gamma^{\mu_2} \cdots \gamma^{\mu_{2r}}$.
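
The algebraic facts used in this discussion can be verified with explicit 2×2 matrices. The sketch below assumes one particular representation of the two-dimensional Dirac matrices ($\gamma^0 = \sigma_1$, $\gamma^1 = i\sigma_2$), chosen here only for illustration; the text does not commit to any representation, and the helper names are hypothetical. It checks that $\gamma^5 = \gamma^0 \gamma^1$ anticommutes with both $\gamma^\mu$, that the kinetic term does not mix the two chiralities, and the value of one trace of the type that appears in eq. (9.16).

```python
# Assumed representation of the D = 2 Dirac matrices (illustration only).
G0 = [[0, 1], [1, 0]]    # gamma^0, squares to +1
G1 = [[0, 1], [-1, 0]]   # gamma^1, squares to -1
ID = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]


def trace(a):
    return a[0][0] + a[1][1]


def anticommutator(a, b):
    return add(matmul(a, b), matmul(b, a))


G5 = matmul(G0, G1)                    # eq. (9.4)
P_R = scale(0.5, add(ID, G5))          # right-handed projector, eq. (9.5)
P_L = scale(0.5, add(ID, scale(-1, G5)))


def chirality_mixing(gamma_mu):
    """P_R gamma^0 gamma^mu P_L, the matrix that would couple Psi_R^dagger to Psi_L."""
    return matmul(P_R, matmul(matmul(G0, gamma_mu), P_L))


if __name__ == "__main__":
    print("gamma5 =", G5)
    print("anticommutators:", anticommutator(G5, G0), anticommutator(G5, G1))
    print("mixing terms:", chirality_mixing(G0), chirality_mixing(G1))
    print("tr(gamma5 gamma0 gamma1) =", trace(matmul(G5, matmul(G0, G1))))
```

With $\epsilon_{01} = +1$ and the metric diag(+1, −1), raising both indices gives $\epsilon^{01} = -1$, so the last trace is consistent with $-D\, \epsilon^{01}$ for D = 2.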

Figure 9.1: Left: 1-loop contribution to the vector current in a background gauge
potential (the wavy line terminated by a cross represents the background field).
Right: 1-loop contribution to the axial current.

The conservation laws (9.3) and (9.9) have been obtained with Noether’s theorem,
from the fact that the classical Lagrangian possesses certain continuous symmetries.
Let us now study how the vector and axial currents are modified at 1-loop. Here, we
consider a fixed configuration of the gauge potential Aµ (x), that acts as a background
external field (this also means that the photon kinetic term plays no role in this
discussion). The lowest order 1-loop graphs that contribute to these currents are
shown in the figure 9.1. The expectation values of the currents resulting from these
graphs can be written as
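$$ \widetilde{J}^\mu(q) = \Pi^{\mu\nu}(q)\, \widetilde{A}_\nu(q) , \qquad \widetilde{J}_5^\mu(q) = \Pi_5^{\mu\nu}(q)\, \widetilde{A}_\nu(q) , \tag{9.10} $$

(the tilde denotes the Fourier transform of the external field) where the self-energies
$\Pi^{\mu\nu}$ and $\Pi_5^{\mu\nu}$ are given by

$$ i\, \Pi^{\mu\nu}(q) \equiv e^2 \int \frac{d^Dk}{(2\pi)^D}\; \frac{{\rm tr}\big( \gamma^\mu\, \slashed{k}\, \gamma^\nu\, (\slashed{k} + \slashed{q}) \big)}{(k^2 + i0^+)\big((k+q)^2 + i0^+\big)} , \qquad i\, \Pi_5^{\mu\nu}(q) \equiv e^2 \int \frac{d^Dk}{(2\pi)^D}\; \frac{{\rm tr}\big( \gamma^5 \gamma^\mu\, \slashed{k}\, \gamma^\nu\, (\slashed{k} + \slashed{q}) \big)}{(k^2 + i0^+)\big((k+q)^2 + i0^+\big)} . \tag{9.11} $$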

(The only difference between them is the γ5 inside the trace, that comes from the
definition of the axial current). In order to secure the subsequent manipulations, let
us assume that some regularization has been performed on the momentum integrals,
without specifying it for now. The denominators can be arranged into a single factor
by using Feynman’s parameterization,

$$ \frac{1}{(k^2 + i0^+)\big((k+q)^2 + i0^+\big)} = \int_0^1 dx\; \frac{1}{\big( l^2 + \Delta(x) \big)^2} , \tag{9.12} $$

where we have introduced $l \equiv k + x\, q$ and $\Delta(x) \equiv x(1-x)\, q^2$. After calculating
the trace, the vector-vector self-energy can be written as follows:

$$ \Pi^{\mu\nu}(q) = A(q^2)\, g^{\mu\nu} - B(q^2)\, \frac{q^\mu q^\nu}{q^2} , \tag{9.13} $$

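where the coefficients $A(q^2)$ and $B(q^2)$ are given by the following integrals:

$$ A(q^2) \equiv -i D e^2 \int \frac{d^Dl}{(2\pi)^D} \int_0^1 dx\; \frac{\Delta(x) + \big(\frac{2}{D} - 1\big)\, l^2}{\big( l^2 + \Delta(x) \big)^2} , \qquad B(q^2) \equiv -i D e^2 \int \frac{d^Dl}{(2\pi)^D} \int_0^1 dx\; \frac{2\, \Delta(x)}{\big( l^2 + \Delta(x) \big)^2} . \tag{9.14} $$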
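
For reference, the shifted momentum l and the function Δ(x) of eq. (9.12) arise from completing the square in the combined denominator. In the sketch below the $i0^+$ prescriptions are left implicit.

```latex
\begin{align*}
\frac{1}{a\, b} &= \int_0^1 \frac{dx}{\big[ x\, a + (1-x)\, b \big]^2} ,
 \qquad a \equiv (k+q)^2 , \quad b \equiv k^2 , \\
x\, (k+q)^2 + (1-x)\, k^2
 &= k^2 + 2x\, k \cdot q + x\, q^2
  = (k + x\, q)^2 + x(1-x)\, q^2
  = l^2 + \Delta(x) .
\end{align*}
```

The structure of the numerators in eq. (9.14) follows from the same shift, after replacing $l^\mu l^\nu$ by $g^{\mu\nu}\, l^2 / D$ under the integral, which is the origin of the factor $\frac{2}{D} - 1$.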

In D = 2 spacetime dimensions, the second integral is finite and gives:

$$ B(q^2)\Big|_{D=2} = \frac{e^2}{\pi} , \tag{9.15} $$

while the first integral is ambiguous. Indeed, the term in $l^2$ in the numerator leads
to an ultraviolet divergence, but it is multiplied by the factor $\frac{2}{D} - 1$ that vanishes
precisely when D = 2. If we use a cutoff as ultraviolet regulator, this term would
vanish and we would have A = B/2, which would violate the conservation of the
vector current at one-loop. In dimensional regularization, in contrast, the factor
$\frac{2}{D} - 1$ compensates a pole in 1/(D − 2) that comes from evaluating the integral in D
dimensions, leaving a finite but non-zero result. In fact, in dimensional regularization
we obtain A = B, and the conservation of the vector current holds at one-loop. No
matter which regularization procedure we adopt, it must give A = B for vector current
conservation, i.e. for preserving gauge symmetry at 1-loop.
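
The link between current conservation and the condition A = B is a one-line contraction of the tensor structure (9.13):

```latex
\begin{align*}
q_\mu\, \Pi^{\mu\nu}(q)
 &= A(q^2)\, q^\nu - B(q^2)\, \frac{q^2\, q^\nu}{q^2}
  = \big[ A(q^2) - B(q^2) \big]\, q^\nu , \\
q_\mu\, \widetilde{J}^\mu(q)
 &= \big[ A(q^2) - B(q^2) \big]\, q^\nu \widetilde{A}_\nu(q) .
\end{align*}
```

For a generic background field this vanishes only if A = B; with the cutoff value A = B/2 one would be left with $-\frac{1}{2} B\, q^\nu \widetilde{A}_\nu$, i.e. a non-conserved vector current.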

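Let us now turn to the axial-vector self-energy $\Pi_5^{\mu\nu}$. Using the definition of $\gamma^5$,
we obtain

$$ {\rm tr}\big( \gamma^5 \gamma^\mu \gamma^\nu \big) = -D\, \epsilon^{\mu\nu} , \tag{9.16} $$

and

$$ {\rm tr}\big( \gamma^5 \gamma^\mu \slashed{A} \gamma^\nu \slashed{B} \big) = A^\nu\, {\rm tr}\big( \gamma^5 \gamma^\mu \slashed{B} \big) + B^\nu\, {\rm tr}\big( \gamma^5 \gamma^\mu \slashed{A} \big) - A \cdot B\; {\rm tr}\big( \gamma^5 \gamma^\mu \gamma^\nu \big) = -D\, \epsilon^{\mu\sigma}\, \Big[ B_\sigma A^\nu + A_\sigma B^\nu - (A \cdot B)\, g_\sigma{}^\nu \Big] . \tag{9.17} $$

This identity leads to

$$ \Pi_5^{\mu\nu}(q) = -\epsilon^{\mu\sigma}\, \Pi_\sigma{}^\nu(q) = -\epsilon^{\mu\sigma}\, \Big[ A(q^2)\, g_\sigma{}^\nu - B(q^2)\, \frac{q_\sigma q^\nu}{q^2} \Big] , \tag{9.18} $$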

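where A and B are the same coefficients as in eq. (9.13). Therefore, the divergence of
the axial current is given by

$$ q_\mu\, \widetilde{J}_5^\mu(q) = -A(q^2)\, \epsilon^{\mu\nu}\, q_\mu\, \widetilde{A}_\nu(q) . \tag{9.19} $$

If we have adopted a regularization that preserves gauge symmetry, i.e. such that
A = B, this divergence is non-zero and reads

$$ q_\mu\, \widetilde{J}_5^\mu(q) = -\frac{e^2}{\pi}\, \epsilon^{\mu\nu}\, q_\mu\, \widetilde{A}_\nu(q) , \tag{9.20} $$

or, going back to coordinate space:

$$ \partial_\mu J_5^\mu(x) = -\frac{e^2}{\pi}\, \epsilon^{\mu\nu}\, \partial_\mu A_\nu(x) = -\frac{e^2}{2\pi}\, \epsilon^{\mu\nu}\, F_{\mu\nu}(x) . \tag{9.21} $$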
π 2π
The non-conservation of the axial current at one loop is the unavoidable conclusion
in any regularization scheme that preserves the conservation of the vector current.
Moreover, since when this is the case A becomes equal to the ultraviolet finite
coefficient B, it does not suffer from any scheme dependence, and the above result
may thus be viewed as a scheme-free result. The result (9.21) is known as an axial
anomaly. A somewhat milder conclusion of this 2-dimensional exercise is that it is not
possible to preserve both vector and axial current conservation at one-loop. We could
in principle adopt a regularization scheme that conserves the axial current, which
requires A = 0. But the price to pay would be the loss of gauge invariance at 1-loop.
Since gauge invariance is deemed more fundamental (in particular, it ensures the
unitarity of the theory), this route is generally not considered further.
Note that ultraviolet divergences are necessary (see footnote 2) for the existence of this anomaly.
Indeed, at the classical level, the Lagrangian density is invariant under the global
transformation:
$$\Psi \to e^{i\theta\gamma^5}\,\Psi\,,\qquad \Psi^\dagger \to \Psi^\dagger\,e^{-i\theta\gamma^5}\,. \tag{9.22}$$

The Feynman graphs that contribute to the expectation value of the axial current in a
background electromagnetic field have an equal number of Ψ’s and Ψ†’s (this statement
is true to all orders of perturbation theory). Since the axial symmetry is global,
when we apply the above axial transformation to a graph, all the factors exp(±iθγ5)
should naively cancel, leaving a result that does not depend on θ. This conclusion
would indeed be correct if all the integrals were finite, but may be invalidated by the
subtraction procedure necessary to obtain finite results in the presence of divergences.
In the explicit example that we have studied, the ultraviolet regularizations that are
consistent with gauge symmetry all spoil axial symmetry.
Footnote 2: In a certain sense, the axial anomaly is also an infrared effect since it exists only
for massless fermions (for massive fermions, there is no axial symmetry to begin with). Moreover,
as we have already seen when discussing the Atiyah-Singer index theorem, the axial anomaly is
related to the zero modes of the Dirac operator in a background field.

Beyond one loop, a graph contributing to the expectation value of the axial
current may contain subgraphs that are ultraviolet divergent. However, since QED
is renormalizable, these sub-divergences will all have been made finite thanks to
counterterms calculated in the previous orders of the perturbative expansion. Thus, we
need only to study the intrinsic ultraviolet divergence of the graph under consideration,
an indicator of which is given by its superficial degree of divergence. For the sake of
definiteness, let us assume that the graph G has nψ fermion propagators, nγ photon
propagators, nV photon-fermion-fermion vertices, nA insertions of the external
electromagnetic field and nL loops (plus one extra vertex where the axial current is
attached). These quantities are not independent, but obey the following identities:

$$\begin{aligned}
2n_\gamma &= n_V\,,\\
2n_\psi &= 2 + 2(n_V + n_A)\,,\\
n_L &= n_\psi + n_\gamma - n_A - n_V\,.
\end{aligned} \tag{9.23}$$

Using these relations, the superficial degree of divergence of the graph reads:

$$\omega(G) \equiv 2n_L - n_\psi - 2n_\gamma = 2 - n_\psi\,. \tag{9.24}$$

The simplest graph that contributes to the axial current, shown in figure 9.1, has
nψ = 2 and therefore has a logarithmic ultraviolet divergence. More complicated
graphs, either with more insertions of the external field or with more than one loop,
all have nψ > 2 and are therefore convergent after all their sub-divergences have
been subtracted. This argument, although it lacks some rigor, indicates that the
axial anomaly does not receive any correction beyond the one-loop result, and that
eq. (9.21) is therefore an exact result. An alternate justification of this property is
based on the derivation of the axial anomaly from the fermionic path integral, which
gives the determinant of the Dirac operator in the background field. Indeed, as we
have seen in section 2.5, functional determinants correspond to 1-loop diagrams.
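The power counting of eqs. (9.23) and (9.24) is easy to tabulate. The following is a minimal sketch in Python; the function name `superficial_degree` and its argument names are illustrative choices, not notation from the text. It rebuilds $n_\gamma$, $n_\psi$ and $n_L$ from the number of vertices, and checks that the two expressions for $\omega(G)$ in eq. (9.24) agree.

```python
def superficial_degree(n_V: int, n_A: int) -> int:
    """Superficial degree of divergence, in two dimensions, of a graph with
    one axial-current vertex, n_V photon-fermion-fermion vertices and n_A
    insertions of the external field, following eqs. (9.23)-(9.24)."""
    if n_V < 0 or n_A < 0:
        raise ValueError("vertex counts must be non-negative")
    if n_V % 2 != 0:
        raise ValueError("n_V must be even: each photon propagator joins two vertices")
    n_gamma = n_V // 2                    # 2 n_gamma = n_V
    n_psi = 1 + n_V + n_A                 # 2 n_psi = 2 + 2 (n_V + n_A)
    n_L = n_psi + n_gamma - n_A - n_V     # number of independent loops
    omega = 2 * n_L - n_psi - 2 * n_gamma
    # The counting identities must reproduce the closed form of eq. (9.24).
    assert omega == 2 - n_psi
    return omega


if __name__ == "__main__":
    for n_V, n_A in [(0, 1), (0, 2), (2, 1)]:
        print(n_V, n_A, superficial_degree(n_V, n_A))
```

Only the graph with `n_V = 0` and `n_A = 1` reaches $\omega(G) = 0$; every additional vertex lowers the degree of divergence, in line with the argument above.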

9.1.2 Axial anomaly in four dimensions


γ5 in four dimensions : Let us now turn to a more realistic 4-dimensional example,
that also has some relevance in understanding the decay of pseudo-scalar mesons
like the π0. The setup is exactly the same as in the previous section, except that we
now consider four space-time dimensions. The main modification is the definition of
the γ5 matrix,

$$\gamma^5 \underset{D=4}{\equiv} \frac{i}{4!}\,\epsilon_{\mu\nu\rho\sigma}\,\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma
 = i\,\gamma^0\gamma^1\gamma^2\gamma^3\,. \tag{9.25}$$
The traces of a γ5 with any odd number of ordinary Dirac matrices are all zero,

$$\mathrm{tr}\,\big(\gamma^5\gamma^{\mu_1}\cdots\gamma^{\mu_{2n+1}}\big) = 0\,. \tag{9.26}$$

In order to evaluate the traces of γ5 with an even number of Dirac matrices, let us
first recall the general formula for a trace of an even number of Dirac matrices:

$$\mathrm{tr}\,\big(\gamma^{\mu_1}\cdots\gamma^{\mu_{2n}}\big)
 = D \sum_{\text{pairings } P} \mathrm{sign}\,(P) \prod_{s\in P} g^{\mu_{s_1}\mu_{s_2}}\,, \tag{9.27}$$

where a pairing P is a set of pairs P = {(s1 s2), (s1′ s2′), · · · } made of the integers
in [1, 2n]. The signature of P, denoted sign (P), is the signature of the permutation
that reorders the sequence s1 s2 s1′ s2′ · · · into 1234 · · · . Since the Minkowski metric
tensor gµν is diagonal, each Lorentz index carried by one of the Dirac matrices must
coincide with the Lorentz index of another matrix in order to obtain a non vanishing
result. Hence, we have
 
$$\mathrm{tr}\,\big(\gamma^5\big) = i\,\mathrm{tr}\,\big(\gamma^0\gamma^1\gamma^2\gamma^3\big) = 0\,. \tag{9.28}$$

The same is true if the γ5 is accompanied by only two ordinary Dirac matrices,
 
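$$\mathrm{tr}\,\big(\gamma^5\gamma^\mu\gamma^\nu\big) = i\,\mathrm{tr}\,\big(\gamma^0\gamma^1\gamma^2\gamma^3\gamma^\mu\gamma^\nu\big) = 0\,, \tag{9.29}$$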

and the simplest non-zero trace is tr (γ5 γµ γν γρ γσ). By the previous argument, each
of the indices µνρσ must match one of the indices 0123 hidden in γ5 = i γ0 γ1 γ2 γ3.
Therefore, µνρσ must be a permutation of 0123. Since the four Dirac matrices are all
distinct, they all anticommute, and the result is completely antisymmetric in µνρσ,
so that we have

$$\mathrm{tr}\,\big(\gamma^5\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma\big) = A\,\epsilon^{\mu\nu\rho\sigma}\,. \tag{9.30}$$

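In order to calculate the prefactor, we just need to evaluate the trace for a particular
assignment of the indices, for instance µνρσ = 3210,

$$A\,\underbrace{\epsilon^{3210}}_{+1}
 = \mathrm{tr}\,\big(\gamma^5\gamma^3\gamma^2\gamma^1\gamma^0\big)
 = i\,\mathrm{tr}\,\Big(\underbrace{\gamma^0\underbrace{\gamma^1\underbrace{\gamma^2\underbrace{\gamma^3\gamma^3}_{-1}\gamma^2}_{+1}\gamma^1}_{-1}\gamma^0}_{-1}\Big)
 = -4\,i\,. \tag{9.31}$$

This gives A = −4 i, i.e.

$$\mathrm{tr}\,\big(\gamma^5\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma\big) = -4\,i\,\epsilon^{\mu\nu\rho\sigma}\,. \tag{9.32}$$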
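The trace identities (9.28), (9.29) and (9.32) can be checked by brute force with explicit matrices. The sketch below assumes the Dirac representation of the γ matrices (any representation should give the same traces) and the normalization ε^{0123} = +1 implied by eq. (9.31); the helper names `levi_civita`, `trace_gamma5` and `max_deviation_from_9_32` are illustrative.

```python
import itertools
import numpy as np

# Pauli matrices and the Dirac representation of the gamma matrices
# (an assumed, conventional choice of representation).
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
gamma = [np.block([[I2, Z2], [Z2, -I2]])]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]
gamma5 = 1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]


def levi_civita(indices):
    """Totally antisymmetric symbol, normalized so that epsilon^{0123} = +1."""
    if len(set(indices)) != len(indices):
        return 0
    sign = 1
    for i, j in itertools.combinations(range(len(indices)), 2):
        if indices[i] > indices[j]:
            sign = -sign
    return sign


def trace_gamma5(*indices):
    """tr(gamma^5 gamma^{mu_1} ... gamma^{mu_n}) for upper Lorentz indices."""
    product = gamma5
    for mu in indices:
        product = product @ gamma[mu]
    return np.trace(product)


def max_deviation_from_9_32():
    """Largest |tr(g5 g^mu g^nu g^rho g^sigma) + 4i eps^{mu nu rho sigma}|."""
    return max(
        abs(trace_gamma5(*idx) + 4j * levi_civita(idx))
        for idx in itertools.product(range(4), repeat=4)
    )


if __name__ == "__main__":
    print("tr(g5)               :", trace_gamma5())
    print("tr(g5 g3 g2 g1 g0)   :", trace_gamma5(3, 2, 1, 0))
    print("max deviation (9.32) :", max_deviation_from_9_32())
```

Looping over all 256 index assignments also covers the cases with repeated indices, where both sides of eq. (9.32) vanish.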

Order 1 in the external field : Let us now turn to the calculation of the expectation
value of the axial current in four dimensions. The simplest graph to consider is again
the graph on the right of figure 9.1. Its contribution to the axial current is

$$\widetilde{J}_5^\mu(q) = \Pi_5^{\mu\nu}(q)\,\widetilde{A}_\nu(q)\,, \tag{9.33}$$

Figure 9.2: Graph contributing to the chiral anomaly in a gauge background in
four space-time dimensions.

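with

$$\begin{aligned}
i\,\Pi_5^{\mu\nu}(q) &\equiv e^2 \int \frac{d^Dk}{(2\pi)^D}\,
 \frac{\mathrm{tr}\,\big(\gamma^5\gamma^\mu\slashed{k}\gamma^\nu(\slashed{k}+\slashed{q})\big)}
      {(k^2+i0^+)\,((k+q)^2+i0^+)} \\
&= e^2 \int \frac{d^Dl}{(2\pi)^D}\int_0^1 dx\;
 \frac{\mathrm{tr}\,\big(\gamma^5\gamma^\mu(\slashed{l}-x\slashed{q})\gamma^\nu(\slashed{l}+(1-x)\slashed{q})\big)}
      {(l^2+\Delta(x))^2}\,,
\end{aligned} \tag{9.34}$$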
where we have introduced the Feynman parameterization in the second line, and the
new integration variable l ≡ k + xq. The trace that appears in the numerator is
proportional to

$$\epsilon^{\mu\alpha\nu\beta}\,(l-xq)_\alpha\,(l+(1-x)q)_\beta \;\propto\; \epsilon^{\mu\alpha\nu\beta}\,l_\alpha\,q_\beta\,, \tag{9.35}$$

and is therefore odd in the momentum l. Consequently, the momentum integral vanishes,
and this graph does not contribute to the axial current.
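The reduction in eq. (9.35) uses only the antisymmetry of the ε tensor: the terms in $l_\alpha l_\beta$ and $q_\alpha q_\beta$ drop out, and the two cross terms combine. A minimal numerical sketch, with randomly chosen test vectors and a Feynman parameter x = 0.3 that are purely hypothetical inputs, is:

```python
import itertools
import numpy as np


def levi_civita_tensor() -> np.ndarray:
    """Rank-4 totally antisymmetric symbol with eps[0, 1, 2, 3] = +1."""
    eps = np.zeros((4, 4, 4, 4))
    for perm in itertools.permutations(range(4)):
        inversions = sum(
            1 for i, j in itertools.combinations(range(4), 2) if perm[i] > perm[j]
        )
        eps[perm] = (-1) ** inversions
    return eps


def contract(eps: np.ndarray, u: np.ndarray, v: np.ndarray) -> np.ndarray:
    """eps^{mu alpha nu beta} u_alpha v_beta, as a 4x4 array indexed by (mu, nu)."""
    return np.einsum("manb,a,b->mn", eps, u, v)


rng = np.random.default_rng(1)   # arbitrary test momenta (hypothetical values)
l = rng.normal(size=4)
q = rng.normal(size=4)
x = 0.3                          # any value of the Feynman parameter in [0, 1]
eps = levi_civita_tensor()

full = contract(eps, l - x * q, l + (1 - x) * q)   # left-hand side of eq. (9.35)
reduced = contract(eps, l, q)                      # right-hand side of eq. (9.35)

if __name__ == "__main__":
    print("largest difference:", np.max(np.abs(full - reduced)))
```

With this normalization the two sides are equal, not merely proportional, and flipping l → −l flips the sign of the result, which is the property used to conclude that the integral vanishes.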

Order 2 in the external field : At second order in the external field, we encounter
the graph of figure 9.2. Its contribution to the expectation value of the axial current
reads

$$\widetilde{J}_5^\mu(q) = \frac{1}{2!}\int \frac{d^4k_1\,d^4k_2}{(2\pi)^8}\;(2\pi)^4\,\delta(q+k_1+k_2)\;
 \Gamma_5^{\mu\nu\rho}(q,k_1,k_2)\;\widetilde{A}_\nu(k_1)\,\widetilde{A}_\rho(k_2)\,, \tag{9.36}$$

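where we have introduced the following three-point function:

$$\begin{aligned}
i\,\Gamma_5^{\mu\nu\rho}(q,k_1,k_2) \equiv{}&
 e^3\int\frac{d^Dk}{(2\pi)^D}\,
 \frac{\mathrm{tr}\,\big(\gamma^5\gamma^\mu(\slashed{k}+\slashed{a}+\slashed{k}_1)\gamma^\nu(\slashed{k}+\slashed{a})\gamma^\rho(\slashed{k}+\slashed{a}-\slashed{k}_2)\big)}
      {((k+a+k_1)^2+i0^+)\,((k+a)^2+i0^+)\,((k+a-k_2)^2+i0^+)} \\
&+ e^3\int\frac{d^Dk}{(2\pi)^D}\,
 \frac{\mathrm{tr}\,\big(\gamma^5\gamma^\mu(\slashed{k}+\slashed{b}+\slashed{k}_2)\gamma^\rho(\slashed{k}+\slashed{b})\gamma^\nu(\slashed{k}+\slashed{b}-\slashed{k}_1)\big)}
      {((k+b+k_2)^2+i0^+)\,((k+b)^2+i0^+)\,((k+b-k_1)^2+i0^+)}\,.
\end{aligned} \tag{9.37}$$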

The two terms correspond to the two ways of attaching the fields with momenta k1
and k2 to the external photon lines. For a reason that will become clear later, we have
taken the freedom to introduce independent shifts a and b of the integration variables
in the two terms. Such shifts would of course have no effect on convergent integrals,
since they just correspond to a linear change of variable. However, we are dealing here
with linearly divergent integrals, and these shifts have a nontrivial interplay
with the ultraviolet regularization. Note that since $\{\gamma^5,\gamma^\alpha\} = 0$, we may move the
γ5 just before the matrices γν or γρ without changing the integrand, as if the axial
current was attached at the other vertices of the triangle (where the momenta k1 or
k2 enter, respectively).

Next, in order to test the conservation of the axial current, we contract this
amplitude with $q_\mu$, which we may rewrite as follows:

$$\begin{aligned}
q_\mu &= -(k_1+k_2)_\mu \\
&= (k+a-k_2)_\mu - (k+a+k_1)_\mu \\
&= (k+b-k_1)_\mu - (k+b+k_2)_\mu\,.
\end{aligned} \tag{9.38}$$

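This leads to

$$\begin{aligned}
q_\mu \Gamma_5^{\mu\nu\rho}(q,k_1,k_2) = 4e^3 \int\frac{d^Dk}{(2\pi)^D}\;\epsilon^{\alpha\nu\beta\rho}\,\Big\{
& \frac{(k_1)_\alpha\,(k+a)_\beta}{((k+a)^2+i0^+)\,((k+a+k_1)^2+i0^+)} \\
&+ \frac{(k_2)_\alpha\,(k+a)_\beta}{((k+a)^2+i0^+)\,((k+a-k_2)^2+i0^+)} \\
&- \frac{(k_1)_\alpha\,(k+b)_\beta}{((k+b)^2+i0^+)\,((k+b-k_1)^2+i0^+)} \\
&- \frac{(k_2)_\alpha\,(k+b)_\beta}{((k+b)^2+i0^+)\,((k+b+k_2)^2+i0^+)}\Big\}\,.
\end{aligned} \tag{9.39}$$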

By taking a = b = 0, and assuming a regularization that preserves Lorentz invariance,
each term leads to a vanishing integral. Consider for instance the first term. Since
k1 is the only 4-vector that enters in the integrand besides the integration variable k,
the result of its integral is proportional to $\epsilon^{\alpha\nu\beta\rho}(k_1)_\alpha(k_1)_\beta = 0$. Since the same
reasoning applies to the four terms, we would therefore naively conclude that the
axial current is conserved. However, we should make sure that the vector currents are
also conserved. For this, we also need to calculate $(k_1)_\nu\Gamma_5^{\mu\nu\rho}$ and $(k_2)_\rho\Gamma_5^{\mu\nu\rho}$. The
same method as above gives


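$$\begin{aligned}
(k_1)_\nu \Gamma_5^{\mu\nu\rho}(q,k_1,k_2) = -4e^3 \int\frac{d^Dk}{(2\pi)^D}\;\epsilon^{\alpha\mu\beta\rho}\,\Big\{
& \frac{(k+a)_\alpha\,(k+a-k_2)_\beta}{((k+a)^2+i0^+)\,((k+a-k_2)^2+i0^+)} \\
&- \frac{(k+a+k_1)_\alpha\,(k+a-k_2)_\beta}{((k+a+k_1)^2+i0^+)\,((k+a-k_2)^2+i0^+)} \\
&+ \frac{(k+b+k_2)_\alpha\,(k+b-k_1)_\beta}{((k+b+k_2)^2+i0^+)\,((k+b-k_1)^2+i0^+)} \\
&- \frac{(k+b+k_2)_\alpha\,(k+b)_\beta}{((k+b)^2+i0^+)\,((k+b+k_2)^2+i0^+)}\Big\}\,,
\end{aligned} \tag{9.40}$$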

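and

$$\begin{aligned}
(k_2)_\rho \Gamma_5^{\mu\nu\rho}(q,k_1,k_2) = -4e^3 \int\frac{d^Dk}{(2\pi)^D}\;\epsilon^{\alpha\mu\beta\nu}\,\Big\{
& \frac{(k+a+k_1)_\alpha\,(k+a-k_2)_\beta}{((k+a+k_1)^2+i0^+)\,((k+a-k_2)^2+i0^+)} \\
&- \frac{(k+a+k_1)_\alpha\,(k+a)_\beta}{((k+a+k_1)^2+i0^+)\,((k+a)^2+i0^+)} \\
&+ \frac{(k+b)_\alpha\,(k+b-k_1)_\beta}{((k+b)^2+i0^+)\,((k+b-k_1)^2+i0^+)} \\
&- \frac{(k+b+k_2)_\alpha\,(k+b-k_1)_\beta}{((k+b+k_2)^2+i0^+)\,((k+b-k_1)^2+i0^+)}\Big\}\,.
\end{aligned} \tag{9.41}$$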

It turns out that with the choice a = b = 0 the vector currents are not conserved.
Consider for instance $(k_1)_\nu\Gamma_5^{\mu\nu\rho}$. With a = b = 0
and a regularization that preserves Lorentz invariance as well as the reflection symmetry
k → −k, we have:

$$\begin{aligned}
(k_1)_\nu \Gamma_5^{\mu\nu\rho}(q,k_1,k_2) &= -8e^3 \int\frac{d^Dk}{(2\pi)^D}\;\epsilon^{\alpha\mu\beta\rho}\,
 \frac{(k+k_2)_\alpha\,(k-k_1)_\beta}{((k+k_2)^2+i0^+)\,((k-k_1)^2+i0^+)} \\
&\propto \epsilon^{\alpha\mu\beta\rho}\,(k_2)_\alpha\,(k_1)_\beta \neq 0\,.
\end{aligned} \tag{9.42}$$
A systematic search indicates that the only choice of a and b that gives a null result
for both eqs. (9.40) and (9.41) is

$$a = -b = k_2 - k_1\,. \tag{9.43}$$

Since the conservation of the vector current is necessary in order to preserve gauge
symmetry, and since the latter is a requirement for unitarity, we must adopt this choice.
Returning to eq. (9.39) for the axial current with these values of a and b, we obtain:

$$q_\mu \Gamma_5^{\mu\nu\rho}(q,k_1,k_2) = 16e^3 \int\frac{d^Dk}{(2\pi)^D}\;\epsilon^{\alpha\nu\beta\rho}\,
 \frac{(k_1)_\alpha\,(k+k_2-k_1)_\beta}{((k+k_2)^2+i0^+)\,((k+k_2-k_1)^2+i0^+)}\,. \tag{9.44}$$

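Let us define

$$F^{\nu\rho}(k) \equiv \epsilon^{\alpha\nu\beta\rho}\,\frac{(k_1)_\alpha\,(k-k_1)_\beta}{(k^2+i0^+)\,((k-k_1)^2+i0^+)}\,, \tag{9.45}$$

and note that

$$\int\frac{d^Dk}{(2\pi)^D}\;F^{\nu\rho}(k) = 0 \tag{9.46}$$

(because with a Lorentz invariant regularization the result can only depend on the
vector k1, which would unavoidably give zero when contracted with the two free
slots of the $\epsilon^{\alpha\nu\beta\rho}$). Therefore, we can write

$$\begin{aligned}
q_\mu \Gamma_5^{\mu\nu\rho}(q,k_1,k_2) &= 16e^3 \int\frac{d^Dk}{(2\pi)^D}\,\Big[F^{\nu\rho}(k+k_2) - F^{\nu\rho}(k)\Big] \\
&= 16e^3 \int\frac{d^Dk}{(2\pi)^D}\,\Big[k_2^\sigma\,\frac{\partial F^{\nu\rho}(k)}{\partial k^\sigma}
 + \frac{k_2^\sigma k_2^\tau}{2}\,\frac{\partial^2 F^{\nu\rho}(k)}{\partial k^\sigma\,\partial k^\tau} + \cdots\Big]\,.
\end{aligned} \tag{9.47}$$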

Since the integrand now contains only derivatives, we can use Stokes’s theorem
in order to rewrite the divergence of the axial current as a surface integral on the
boundary at infinity of momentum space. If we view this boundary as the limit
$k_*\to\infty$ of a sphere of radius $k_*$, the “area” of this boundary grows like $k_*^3$ in
D = 4. On the other hand, the function $F^{\nu\rho}(k)$ behaves as $k^{-3}$, and each subsequent
derivative decreases faster by one additional power of $k^{-1}$. Therefore, the result is
given in full by the first term of the expansion:

$$\begin{aligned}
q_\mu \Gamma_5^{\mu\nu\rho}(q,k_1,k_2) &= 16e^3 \int\frac{d^Dk}{(2\pi)^D}\;k_2^\sigma\,\frac{\partial F^{\nu\rho}(k)}{\partial k^\sigma} \\
&= \frac{16\,i\,e^3}{(2\pi)^4}\;\epsilon^{\alpha\nu\beta\rho}\,(k_1)_\alpha\,(k_2)^\sigma\;
 \underbrace{\lim_{k_*\to\infty}\int_{S_3(k_*)} d^3S\;\frac{k_\sigma}{k}\,\frac{k_\beta}{k^4}}_{\frac{\pi^2}{2}\,g_{\sigma\beta}} \\
&= -i\,\frac{e^3}{2\pi^2}\;\epsilon^{\nu\rho\alpha\beta}\,(k_1)_\alpha\,(k_2)_\beta\,.
\end{aligned} \tag{9.48}$$

In the second line, $S_3(k_*)$ is the 3-sphere of radius $k_*$ (i.e. the boundary of a 4-ball of
radius $k_*$), $k_\sigma/k$ is the unit vector normal to the sphere, and the factor i arises when
going to Euclidean momentum space. Note that we have anticipated the limit k → ∞
in order to simplify the function $F^{\nu\rho}(k)$. Therefore, the contribution of the triangle
graph to the divergence of the axial current reads

$$q_\mu \widetilde{J}_5^\mu(q) = -i\,\frac{e^3}{4\pi^2}\;\epsilon^{\nu\rho\alpha\beta}\int\frac{d^4k_1\,d^4k_2}{(2\pi)^4}\;\delta(q+k_1+k_2)\;
 (k_1)_\alpha\,(k_2)_\beta\;\widetilde{A}_\nu(k_1)\,\widetilde{A}_\rho(k_2)\,, \tag{9.49}$$

or in coordinate space

$$\partial_\mu J_5^\mu(x) = -\frac{e^3}{16\pi^2}\;\epsilon^{\alpha\nu\beta\rho}\,F_{\alpha\nu}(x)\,F_{\beta\rho}(x)\,. \tag{9.50}$$
This is the main result of this section, namely the existence of an anomalous divergence
of the axial current in the presence of a background electromagnetic field. In the course
of the calculation, we have seen that depending on the labeling of the integration
momentum, we can make the anomaly appear in any of the three external currents.
In the situation considered here, with one axial current corresponding to a global
symmetry, and two vector currents stemming from a local gauge symmetry, we must
enforce the conservation of the vector currents and therefore assign in full the anomaly
to the axial one. But the same calculation would arise in the context of a chiral gauge
theory (where the left and right handed fermions belong to different representations of
the gauge group). In this case, the natural choice would be to regularize the triangle
so that the symmetry among the three currents is preserved, and the anomaly would
then be equally shared by the three currents.
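The numerical coefficient of the anomaly rests on the angular integral that appears under the brace in eq. (9.48). As a sanity check, the sketch below estimates it by Monte Carlo sampling in Euclidean signature, where $g_{\sigma\beta}$ becomes $\delta_{\sigma\beta}$ (an assumption about the Wick-rotated form; the function names, sample size and seed are illustrative). The area of the 3-sphere, $2\pi^2 k_*^3$, times the average of $n_\sigma n_\beta$ over unit vectors, $\delta_{\sigma\beta}/4$, should give $\pi^2\delta_{\sigma\beta}/2$ independently of $k_*$.

```python
import math
import numpy as np


def sphere_area(dim: int, radius: float = 1.0) -> float:
    """Area of the sphere that bounds a ball of the given radius in R^dim."""
    return 2.0 * math.pi ** (dim / 2) / math.gamma(dim / 2) * radius ** (dim - 1)


def surface_integral(k_star: float, samples: int = 200_000, seed: int = 0) -> np.ndarray:
    """Monte Carlo estimate of the integral over S_3(k_star) of
    (k_sigma / k) (k_beta / k^4) d^3S in Euclidean R^4, as a 4x4 array."""
    rng = np.random.default_rng(seed)
    n = rng.normal(size=(samples, 4))
    n /= np.linalg.norm(n, axis=1, keepdims=True)   # uniform unit vectors on S_3
    k = k_star * n                                  # points on the sphere of radius k_star
    mean_integrand = np.einsum("is,ib->sb", n, k) / samples / k_star**4
    return sphere_area(4, k_star) * mean_integrand


if __name__ == "__main__":
    estimate = surface_integral(10.0)
    print(np.round(estimate, 3))
    print("expected diagonal value:", math.pi**2 / 2)
```

The statistical error of this estimate is of order 0.01 with the default sample size, so agreement is only expected to about two decimal places.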

Corrections : Let us now discuss potential corrections to the result (9.50). Firstly,
we should examine one-loop graphs with more than two photons in addition to the
insertion of the axial current. A simple dimensional argument can exclude that such
graphs contribute to the divergence of the axial current. Indeed, $\partial_\mu J_5^\mu$ has mass
dimension 4. In an abelian gauge theory, each external photon must appear in the
right hand side in the form of the field strength Fµν , that has mass dimension 2. A
term with n photons would thus have mass dimension 2n, and require a prefactor
of mass dimension 4 − 2n to be a valid contribution to the divergence of the axial
current. But since the fermions we are considering are massless and the coupling
constant is dimensionless in four dimensions, there is no dimensionful parameter in
the theory for making up such a prefactor.
Let us now consider higher loop corrections. From the calculation that led to
eq. (9.50), the anomaly results from the integration over the momentum that runs in
the fermion loop, provided that the integrand has mass dimension 4 or higher. Note

that some of the higher order corrections just renormalize the objects that appear in
the right hand side of eq. (9.50), such as the photon field strength and the coupling
constant, without changing the structure of the anomaly (including the numerical
prefactor). Quite generally however, adding an internal photon line requires adding
more fermion propagators in the main loop, which reduces its degree of ultraviolet
divergence. Of course, the integration over the momentum of this internal photon
may itself be ultraviolet divergent, but it can be regularized in a way that does not
interfere with axial symmetry and thus does not contribute to the anomaly.

9.2 Generalizations
9.2.1 Axial anomaly in a non-abelian background
In the previous section, we have discussed axial anomalies in an abelian gauge theory.
However, a similar anomaly arises in the presence of a non-abelian background gauge
field. Let us assume that the fermions are in a representation of the gauge algebra
where the generators are $t^a$. The calculation of the triangle graph proceeds almost
in the same way as in the abelian case, except for the Lie algebra generators, and
eq. (9.50) becomes

$$\partial_\mu J_5^\mu(x) = -\frac{e^3}{4\pi^2}\;\mathrm{tr}\,\big(t^a t^b\big)\;\epsilon^{\alpha\nu\beta\rho}\;
 \partial_\alpha A^a_\nu(x)\;\partial_\beta A^b_\rho(x)\,. \tag{9.51}$$

This is not gauge invariant, but it is easy to guess what should be the right hand side
to restore gauge invariance:

$$\partial_\mu J_5^\mu(x) = -\frac{e^3}{16\pi^2}\;\mathrm{tr}\,\big(t^a t^b\big)\;\epsilon^{\alpha\nu\beta\rho}\;
 F^a_{\alpha\nu}(x)\;F^b_{\beta\rho}(x)\,. \tag{9.52}$$
The same dimensional argument that we have used in the abelian case also applies
here: there cannot be contributions to the anomaly of degree higher than two in
the field strength. Note that when expanded in terms of the gauge potential $A^a_\mu$,
eq. (9.52) contains terms of degree 3 and 4, that exist only in a non-abelian background.
Diagrammatically, they correspond to contributions coming from the following two
diagrams:

[Figure: the two corresponding one-loop diagrams]

(But the direct extraction of the anomaly contained in these graphs would be very
cumbersome, due to the numerous terms arising from permutations of the external
gauge fields.)

9.2.2 Axial anomaly in a gravitational background


Another situation where an axial anomaly is present is the case of a gravitational
background. Of course, this is to a large extent an academic exercise since the resulting
anomaly is extremely small, due to the weakness of the gravitational coupling at the
usual scales of particle physics. Nevertheless, since every field is in principle coupled
to gravity, the anomalies caused by a gravitational background are unavoidable unless
the matter fields of the theory are arranged in a specific way. Interestingly, the
calculation of this gravitational anomaly can be performed even if we do not have a
consistent quantum theory of gravity, since it does not involve quantum fluctuations
of the gravitational field (the only loop is a fermion loop).
At tree level, the couplings between gravity and ordinary fields are determined
from the principle of general covariance. Let us sketch here how such a calculation
is done, without entering into too many technical details. The first step is to obtain a
generally covariant generalization of the Dirac operator, for an arbitrary metric tensor
gµν , from which we can read off the coupling of the fermion to the background
gravitational field. In a curved spacetime, we wish to generalize the Dirac matrices so
that they satisfy
$$\big\{\gamma^\mu(x),\gamma^\nu(x)\big\} = 2\,g^{\mu\nu}(x)\,. \tag{9.53}$$
(In this section, we use the Greek letters µ, ν, ρ, σ for indices related to curved
coordinates, and Greek letters from the beginning of the alphabet α, β, γ, δ for
indices related to flat Minkowski coordinates.) In a curved spacetime, the covariant
derivative of the metric tensor vanishes, and it is therefore natural to request the same
for the Dirac matrices. However, this requires that we introduce a spin connection,
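which is a matrix $\Gamma_\mu$ defined so that

$$\nabla_\mu\gamma_\nu \equiv \partial_\mu\gamma_\nu - \Gamma^\lambda_{\mu\nu}\,\gamma_\lambda - \Gamma_\mu\gamma_\nu + \gamma_\nu\Gamma_\mu = 0\,, \tag{9.54}$$

where $\Gamma^\lambda_{\mu\nu}$ is the usual Christoffel symbol. The covariant derivative acting on a
spinor is $(\partial_\mu - \Gamma_\mu)\Psi$ and the generally covariant Dirac equation for a massless
fermion reads

$$i\,\gamma^\mu\,(\partial_\mu - \Gamma_\mu)\,\Psi = 0\,. \tag{9.55}$$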
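In order to construct a Lagrangian that transforms as a scalar, we need a matrix Γ such
that $\Psi^\dagger\Gamma\Psi$ is a real scalar. This is the case if the following conditions are satisfied

$$\Gamma = \Gamma^\dagger\,,\qquad
\Gamma\,\gamma^\mu = \gamma^{\mu\dagger}\,\Gamma\,,\qquad
\nabla_\mu\Gamma \equiv \partial_\mu\Gamma + \Gamma_\mu^\dagger\,\Gamma + \Gamma\,\Gamma_\mu = 0\,. \tag{9.56}$$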

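We then define $\overline{\Psi} \equiv \Psi^\dagger\Gamma$, and the Lagrangian density is

$$\mathcal{L} \equiv i\,\sqrt{-g}\;\overline{\Psi}\,\gamma^\mu\nabla_\mu\Psi\,. \tag{9.57}$$

(g is the determinant of the metric tensor.) The vector current and its conservation
law generalize into

$$J^\mu \equiv \overline{\Psi}\,\gamma^\mu\,\Psi\,,\qquad \nabla_\mu J^\mu = 0\,. \tag{9.58}$$

In the massless case, we can in addition define a conserved axial current:

$$J_5^\mu \equiv \overline{\Psi}\,\gamma^5\gamma^\mu\,\Psi\,,\qquad \nabla_\mu J_5^\mu = 0\,. \tag{9.59}$$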

However, as we shall see, this conservation law suffers from an anomaly in a
curved spacetime. Firstly, let us introduce a representation of the Dirac matrices for a
generic curved spacetime, that makes an explicit connection with the metric tensor.
This is achieved by introducing four vector fields $e^\alpha{}_\mu(x)$ (called a vierbein, or tetrad)
such that (see footnote 3)

$$g_{\mu\nu}(x) = \eta_{\alpha\beta}\;e^\alpha{}_\mu(x)\;e^\beta{}_\nu(x)\,, \tag{9.60}$$

where in this section we use the notation $\eta_{\alpha\beta}$ for the Minkowski metric tensor. This
is equivalent to introducing at each point x a local Minkowski frame with coordinates
$y^\alpha$. Note that $e^\alpha{}_\mu$ transforms as a vector under diffeomorphisms (a coordinate vector)
with respect to the index µ, and as an ordinary 4-vector under Lorentz transformations
(called a tetrad vector in this context) with respect to the index α. The indices
α, β, · · · are raised and lowered with the Minkowski metric tensor, while the indices
µ, ν, · · · are raised and lowered with the curved space metric $g_{\mu\nu}(x)$. Since in the
right hand side of eq. (9.60) the indices α and β are contracted with the Lorentz
tensor $\eta_{\alpha\beta}$, the result is a scalar under Lorentz transformations, but a rank-2 tensor
under diffeomorphisms. The Dirac matrices in curved spacetime ($\gamma_\mu(x)$) can then be
related to those in flat spacetime ($\gamma_\alpha$) by

$$\gamma_\mu(x) = e^\alpha{}_\mu(x)\;\gamma_\alpha\,, \tag{9.61}$$

and a spin connection $\Gamma_\mu$ that satisfies eq. (9.54) (and reduces to zero in flat spacetime)
is given by

$$\Gamma_\mu(x) = -\frac{1}{4}\;\gamma_\alpha\gamma_\beta\;e^{\alpha\rho}(x)\;\nabla_\mu e^\beta{}_\rho(x)\,, \tag{9.62}$$

with $\nabla_\mu e^\beta{}_\rho = \partial_\mu e^\beta{}_\rho - \Gamma^\nu_{\mu\rho}\,e^\beta{}_\nu$ (since $e^\beta{}_\rho$ is a coordinate vector with respect to the
index ρ). A matrix Γ that fulfills eqs. (9.56) is the flat spacetime γ0, and the matrix
γ5 is still given in terms of the flat spacetime Dirac matrices by γ5 = i γ0 γ1 γ2 γ3.

Footnote 3: In this section, we denote $\eta_{\alpha\beta} \equiv \mathrm{diag}\,(1,-1,-1,-1)$ the flat spacetime Minkowski
metric, in order to distinguish it from $g_{\mu\nu}$.
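To make the vierbein construction concrete, the following sketch builds the metric of eq. (9.60) and the curved-space Dirac matrices of eq. (9.61) from a given vierbein, and checks numerically that they satisfy the Clifford algebra with the curved metric, in the lowered-index form $\{\gamma_\mu(x),\gamma_\nu(x)\} = 2 g_{\mu\nu}(x)$. The Dirac representation, the array layout `e[alpha, mu]` for $e^\alpha{}_\mu$, and the sample vierbeins are assumptions made for illustration.

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])   # flat metric eta_{alpha beta}

# Flat-space gamma^alpha in the Dirac representation (an assumed choice).
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
gamma_up = np.array(
    [np.block([[I2, Z2], [Z2, -I2]])]
    + [np.block([[Z2, s], [-s, Z2]]) for s in sigma]
)                                         # gamma_up[alpha] = gamma^alpha


def metric_from_vierbein(e: np.ndarray) -> np.ndarray:
    """g_{mu nu} = eta_{alpha beta} e^alpha_mu e^beta_nu, with e[alpha, mu]."""
    return np.einsum("ab,am,bn->mn", eta, e, e)


def curved_gammas(e: np.ndarray) -> np.ndarray:
    """gamma_mu(x) = e^alpha_mu(x) gamma_alpha, where gamma_alpha = eta_{alpha beta} gamma^beta."""
    gamma_low = np.einsum("ab,bij->aij", eta, gamma_up)
    return np.einsum("am,aij->mij", e, gamma_low)


def clifford_residual(e: np.ndarray) -> float:
    """Largest entry of {gamma_mu, gamma_nu} - 2 g_{mu nu} 1 over all mu, nu."""
    g = metric_from_vierbein(e)
    gam = curved_gammas(e)
    anticommutator = (
        np.einsum("mij,njk->mnik", gam, gam) + np.einsum("nij,mjk->mnik", gam, gam)
    )
    target = 2.0 * np.einsum("mn,ik->mnik", g, np.eye(4))
    return float(np.max(np.abs(anticommutator - target)))


if __name__ == "__main__":
    a = 2.0                                # hypothetical scale factor
    e_diag = np.diag([1.0, a, a, a])       # gives g = diag(1, -a^2, -a^2, -a^2)
    print(metric_from_vierbein(e_diag))
    print("Clifford residual:", clifford_residual(e_diag))
```

The check succeeds for any vierbein, since the anticommutator is bilinear in $e^\alpha{}_\mu$ and the flat matrices obey $\{\gamma_\alpha,\gamma_\beta\} = 2\eta_{\alpha\beta}$.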
Figure 9.3: Graph contributing to the chiral anomaly in a gravitational background
in four space-time dimensions.

We now have a representation of the Dirac operator in an arbitrary curved spacetime,
expressed in terms of the vierbein $e^\alpha{}_\mu$ that encodes the curved metric, from
which we may read off the coupling between a spin 1/2 field and the external
gravitational field. We will not go into the technology required for calculating a fermion
loop like the graph of figure 9.3, and just quote the final result for the divergence
of the axial current:

$$\nabla_\mu J_5^\mu(x) = \frac{1}{384\pi^2}\;\epsilon^{\alpha\beta\gamma\delta}\;R_{\alpha\beta}{}^{\mu\nu}(x)\;R_{\gamma\delta\mu\nu}(x)\,, \tag{9.63}$$
where Rµνρσ is the curvature tensor (it plays in gravity the same role as the field
strength Fµν in a non-abelian gauge theory). This formula indicates that a curved
spacetime, i.e. an external gravitational field, leads to an anomalous contribution to
the divergence of the axial current. This effect is of course tiny in ordinary situations
where gravity is weak. But it should in principle be kept in mind when attempting
to construct an anomaly free chiral gauge theory, if one wishes this theory to remain
consistent all the way up to the Planck scale.

9.2.3 Anomalies in chiral gauge theories


In all the examples that we have considered until now in this chapter, the anomaly
appeared in the conservation of a current associated to a global symmetry such as
chiral symmetry. Although it indicates a violation of this symmetry by quantum
corrections, the anomaly does not make the theory inconsistent in this case. However,
in graphs mixing the axial current and insertions of external gauge fields, we made
sure that the ultraviolet regularization does not spoil the Ward identity associated to
the gauge symmetry.
But we may also consider chiral gauge theories, in which the left and right handed
components of the fermions belong to different representations of the gauge algebra.
This is for instance the case in the Standard Model, where the electroweak interaction
is chiral (the left handed fermions form SU(2) doublets, while the right handed
fermions are singlets under SU(2)). In such a theory, the gauge couplings between
fermions and gauge fields involve the right or left projectors $P_{R,L} \equiv (1 \pm \gamma^5)/2$, and
the generators of the Lie algebra that appear in these vertices are $t^a_{R,L}$, respectively
(the left and right generators would be equal in a theory where the two fermion
chiralities belong to the same representation).
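The triangle diagram that gave the axial anomaly in four dimensions is replaced
by a graph with three external gauge bosons, with chiral couplings to the fermion
loop. When the fermion in the loop is massless, the left and right chiralities do not
mix, and the multiple occurrences of the projectors simplify into a single one, thanks
to

$$\begin{aligned}
& P_R P_L = 0\,,\qquad P_L^2 = P_L\,,\qquad P_R^2 = P_R\,,\\
& \big[P_{R,L}\,,\,\gamma^\mu\gamma^\nu\big] = 0\,,\\
& \mathrm{tr}\,\big(P_R\gamma^\mu\slashed{p}_1 P_R\gamma^\nu\slashed{p}_2 P_R\gamma^\rho\slashed{p}_3\big)
 = \mathrm{tr}\,\big(P_R\gamma^\mu\slashed{p}_1\gamma^\nu\slashed{p}_2\gamma^\rho\slashed{p}_3\big)\,,\\
& \mathrm{tr}\,\big(P_L\gamma^\mu\slashed{p}_1 P_L\gamma^\nu\slashed{p}_2 P_L\gamma^\rho\slashed{p}_3\big)
 = \mathrm{tr}\,\big(P_L\gamma^\mu\slashed{p}_1\gamma^\nu\slashed{p}_2\gamma^\rho\slashed{p}_3\big)\,.
\end{aligned} \tag{9.64}$$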
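The identities in eq. (9.64) can be verified with explicit matrices. The sketch below assumes the Dirac representation and uses randomly generated momenta as hypothetical inputs; `slash` and `chiral_trace` are illustrative helper names.

```python
import itertools
import numpy as np

# Dirac representation of the gamma matrices (an assumed choice).
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
gamma = [np.block([[I2, Z2], [Z2, -I2]])]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]
gamma5 = 1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]

I4 = np.eye(4, dtype=complex)
P_R = (I4 + gamma5) / 2
P_L = (I4 - gamma5) / 2


def slash(p):
    """p-slash = p_mu gamma^mu, for covariant components p_mu."""
    return sum(p[mu] * gamma[mu] for mu in range(4))


def chiral_trace(P, mu, nu, rho, p1, p2, p3, projector_at_each_vertex):
    """Trace of eq. (9.64), with the projector P at every vertex or only once."""
    Q = P if projector_at_each_vertex else I4
    return np.trace(
        P @ gamma[mu] @ slash(p1) @ Q @ gamma[nu] @ slash(p2) @ Q @ gamma[rho] @ slash(p3)
    )


rng = np.random.default_rng(3)           # hypothetical test momenta
p1, p2, p3 = rng.normal(size=(3, 4))

if __name__ == "__main__":
    print("P_R P_L = 0 :", np.allclose(P_R @ P_L, 0))
    print("example     :", chiral_trace(P_R, 0, 1, 2, p1, p2, p3, True),
          chiral_trace(P_R, 0, 1, 2, p1, p2, p3, False))
```

The reason the extra projectors can be dropped is visible in the second line of eq. (9.64): each projector commutes with a product of two γ matrices, so all of them can be moved next to each other and merged using $P^2 = P$.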

The γ5 contained in the projectors $P_{R,L}$ may lead to an anomaly, with a relative sign
between the right and left chiralities. The calculation is almost identical to the case
of a global axial symmetry, except that now we should choose the shifts a and b so
that the resulting 3-point function is symmetric in the external fields, since they play
identical roles. But this choice does not eliminate the anomaly; it just distributes
it evenly among the three external currents, leading to an anomaly proportional to
$\mathrm{tr}\,\big(t^a\{t^b,t^c\}\big)$. When there are both right and left fermions in the loop, the anomaly
is proportional to

$$d^{abc} \equiv \mathrm{tr}\,\big(t^a_R\,\{t^b_R,t^c_R\}\big) - \mathrm{tr}\,\big(t^a_L\,\{t^b_L,t^c_L\}\big)\,. \tag{9.65}$$

Obviously, this is zero in a vector theory, where the right and left fermions couple in
the same way to the gauge bosons.

Anomaly cancellation in the Standard Model : Unlike anomalies of global symmetries,
an anomaly of a gauge symmetry makes the theory inconsistent, because
it would for instance spoil its unitarity and renormalizability. For this reason, most
chiral gauge theories do not make sense. The only ones that actually do are those for
which the fermion fields are arranged in representations of the gauge group such that
$d^{abc} = 0$. This turns out to be the case for the Standard Model with its known matter
fields: all the gauge anomalies cancel (within each generation of fermions) thanks in
particular to the peculiar values of the weak hypercharges of the quarks and leptons.
In order to proceed with this verification, we need the quantum numbers listed in
table 9.1 for the fermions of the Standard Model.
                          Weak isospin T3   Weak hypercharge Y   Elec. charge Q
Left handed fermions
  νe, νµ, ντ                   +1/2                −1                  0
  e, µ, τ                      −1/2                −1                 −1
  u, c, t                      +1/2               +1/3               +2/3
  d, s, b                      −1/2               +1/3               −1/3
Right handed fermions
  eR, µR, τR                     0                 −2                 −1
  uR, cR, tR                     0                +4/3               +2/3
  dR, sR, bR                     0                −2/3               −1/3

Table 9.1: Weak isospin, hypercharge and electrical charge of the fermions of the
Standard Model.

The weak isospin and hypercharge are the quantum numbers of the fermion under
SU(2) × U(1). Both of these gauge interactions are chiral, since the charges T3 and
Y of the left and right handed fermions are different. After spontaneous symmetry
breaking via the Higgs mechanism, the fields $B^3_\mu$ (third component of SU(2)) and
$A_\mu$ (U(1)) mix to give the Z boson and the photon fields. The electrical charges of
the fermions are then given by Q = T3 + Y/2 (since the electrical charges are the same
for left and right fermions, the resulting U(1)em of electromagnetism is a non-chiral
gauge interaction).
The simplest case of anomaly cancellation is the 3-gluon triangle, which is not
anomalous because the strong interaction vertex is a vector coupling:

[Diagram: triangle with three su(3) bosons a, b, c]    R − L cancellation (see eq. (9.65)).

For the triangle involving three SU(2) bosons, the anomaly cancels thanks to a
peculiar identity obeyed by the su(2) generators:

[Diagram: triangle with three su(2) bosons i, j, k]    $\mathrm{tr}_{su(2)}\big(t^i\{t^j,t^k\}\big) = 0$ .
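This su(2) identity can be checked directly. The sketch below evaluates $\mathrm{tr}\,(t^i\{t^j,t^k\})$ for the doublet representation ($t^i = \sigma^i/2$, the one relevant for the left handed fermions) and, as an additional illustration that is not needed for the Standard Model, for the adjoint representation $(T^i)_{jk} = -i\,\epsilon_{ijk}$; the function name `anomaly_coefficient` is an illustrative choice.

```python
import itertools
import numpy as np


def anomaly_coefficient(t, a, b, c):
    """tr(t^a {t^b, t^c}) for a list t of generator matrices."""
    return np.trace(t[a] @ (t[b] @ t[c] + t[c] @ t[b]))


# Doublet representation: t^i = sigma^i / 2.
pauli = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
su2_doublet = [s / 2 for s in pauli]

# Adjoint representation: (T^i)_{jk} = -i epsilon_{ijk}.
eps3 = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps3[i, j, k] = 1.0
    eps3[i, k, j] = -1.0
su2_adjoint = [-1j * eps3[i] for i in range(3)]


def largest_coefficient(generators):
    """Largest |tr(t^i {t^j, t^k})| over all index assignments."""
    return max(
        abs(anomaly_coefficient(generators, i, j, k))
        for i, j, k in itertools.product(range(3), repeat=3)
    )


if __name__ == "__main__":
    print("doublet:", largest_coefficient(su2_doublet))
    print("adjoint:", largest_coefficient(su2_adjoint))
```

For the doublet the result follows from $\{\sigma^j,\sigma^k\} = 2\delta^{jk}$ together with the tracelessness of $\sigma^i$.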

In triangles that have a single SU(3) or a single SU(2) boson, the anomaly cancels
because the corresponding generators are traceless:

[Diagrams: triangles with one su(3) boson a and either two su(2) bosons i, j or two u(1) bosons]    $\mathrm{tr}_{su(3)}\big(t^a\big) = 0$ ,

[Diagrams: triangles with one su(2) boson i and either two su(3) bosons a, b or two u(1) bosons]    $\mathrm{tr}_{su(2)}\big(t^i\big) = 0$ .

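In triangles with a single U(1) boson and a pair of SU(2) or SU(3) bosons, the
anomaly cancels thanks to the specific linear combination of weak hypercharges one
gets by summing over all the allowed fermions in the loop:

[Diagram: triangle with one u(1) boson and two su(3) bosons a, b]
$$\sum_{\text{quarks}} y = 2\Big(-\frac{1}{3}\Big) + \frac{4}{3} - \frac{2}{3} = 0\,,$$

[Diagram: triangle with one u(1) boson and two su(2) bosons i, j]
$$\sum_{\substack{\text{left handed}\\ \text{fermions}}} y = 3\Big(-\frac{1}{3}\Big) + 1 = 0\,.$$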

(In the first of these cancellations, there is a factor of 2 in the first term to account for
the fact that the left handed quarks form SU(2) doublets, and in the second equality
the first term has a factor 3 because the quarks can have three colours.) Note also that
loops with left handed fermions should be counted with a minus sign, according to
eq. (9.65). Finally, the triangle with three U(1) bosons has no anomaly, thanks to the
fact that the sum of the cubes of the weak hypercharges over all fermions is zero:

[Diagram: triangle with three u(1) bosons]
$$\sum y^3 = 6\Big(-\frac{1}{3}\Big)^3 + 3\Big(\frac{4}{3}\Big)^3 + 3\Big(-\frac{2}{3}\Big)^3 + 2 + \big(-2\big)^3 = 0\,.$$

(Again, the numerical prefactors count the number of SU(3) and SU(2) states for
each fermion.) Interestingly, gravitational anomalies also cancel in the Standard
Model. Indeed, an anomaly may potentially exist in the triangle with a U(1) boson
and two gravitons. But this anomaly would be proportional to the sum of the weak
hypercharges of all fermions, which turns out to be zero:

[Diagram: triangle with one u(1) boson and two gravitons G]
$$\sum y = 6\Big(-\frac{1}{3}\Big) + 3\Big(\frac{4}{3}\Big) + 3\Big(-\frac{2}{3}\Big) + 2 + \big(-2\big) = 0\,.$$

One can see the crucial role played by the weak hypercharges assigned to the various
fermions of the Standard Model in these cancellations. Conversely, one may try to

determine these hypercharges so that all anomalies cancel. Up to a permutation of


the right handed quarks (uR and dR in the first family), there are only two solutions,
say U(1)A and U(1)B (one of them corresponds to the Standard Model). Moreover,
these two solutions cannot be mixed in the same theory, because one would have a
non-canceling anomaly in a triangle that mixes the U(1)A and U(1)B gauge bosons.
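The hypercharge sums quoted above can be reproduced with exact rational arithmetic. The sketch below encodes one generation from table 9.1 and applies the sign convention of eq. (9.65), namely right handed fermions counted with +1 and left handed ones with −1. The `Fermion` record, the `anomaly_sum` helper and its arguments are illustrative constructions, not notation from the text.

```python
from collections import namedtuple
from fractions import Fraction

Fermion = namedtuple("Fermion", "name chirality Y n_colour n_isospin")

# One generation of Standard Model fermions, read off table 9.1.
FERMIONS = [
    Fermion("lepton doublet (nu, e)_L", "L", Fraction(-1), 1, 2),
    Fermion("quark doublet (u, d)_L", "L", Fraction(1, 3), 3, 2),
    Fermion("e_R", "R", Fraction(-2), 1, 1),
    Fermion("u_R", "R", Fraction(4, 3), 3, 1),
    Fermion("d_R", "R", Fraction(-2, 3), 3, 1),
]


def chirality_sign(fermion):
    """Relative sign of eq. (9.65): right handed minus left handed."""
    return 1 if fermion.chirality == "R" else -1


def anomaly_sum(power, select, weight):
    """Sum of sign * weight * Y**power over the selected fermions."""
    return sum(
        chirality_sign(f) * weight(f) * f.Y**power for f in FERMIONS if select(f)
    )


# U(1) x SU(3)^2: only quarks run in the loop; count the SU(2) states.
u1_su3_su3 = anomaly_sum(1, lambda f: f.n_colour == 3, lambda f: f.n_isospin)
# U(1) x SU(2)^2: only doublets run in the loop; count the colour states.
u1_su2_su2 = anomaly_sum(1, lambda f: f.n_isospin == 2, lambda f: f.n_colour)
# U(1)^3 and U(1) x gravity^2: all fermions, all SU(3) and SU(2) states.
u1_cubed = anomaly_sum(3, lambda f: True, lambda f: f.n_colour * f.n_isospin)
u1_gravity = anomaly_sum(1, lambda f: True, lambda f: f.n_colour * f.n_isospin)

if __name__ == "__main__":
    print(u1_su3_su3, u1_su2_su2, u1_cubed, u1_gravity)
```

Changing any single hypercharge in `FERMIONS` makes at least one of the four sums non-zero, which illustrates how tightly the cancellation constrains the assignments.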

9.3 Wess-Zumino consistency conditions

9.3.1 Consistency conditions

In the subsection 9.2.1, where we have derived the axial anomaly in a non-abelian
background field, we first obtained a partial answer with only the terms quadratic
in the external field, and then we used gauge symmetry in order to reconstruct the
missing terms (of order 3 and 4 in the external field). However, how to promote
such a partial result into the full expression of the anomaly is not always so obvious,
for instance in the case of chiral gauge theories where the gauge symmetry itself
is anomalous (in this case, we cannot invoke gauge invariance to restore the full
answer). The Wess-Zumino consistency conditions are a set of equations satisfied by
the anomaly function, that are powerful enough to allow reconstructing the anomaly
from the knowledge of its lowest order in the gauge fields.
Even in the case where the anomalous symmetry is global, it is convenient to
couple a (fictitious in that case) gauge field Aµ to the corresponding current Jµ whose
conservation is violated by the anomaly. By doing this, we promote the symmetry
to a local gauge invariance (violated by the anomaly), and we may return to a global
symmetry by letting the gauge coupling go to zero. Let us denote Γ [A] the effective
action for the gauge field (i.e. the effective action in which the fermions are included
only in the form of loop corrections). In the absence of anomaly, Γ [A] would be
invariant under gauge transformations of the field Aµ ,
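$$\begin{aligned}
0 \underset{\text{no anomaly}}{=} \delta_\theta\,\Gamma[A]
&= \int d^4x\;\big(D^{\rm adj}_\mu\big)_{ab}\,\theta_b(x)\;\frac{\delta\Gamma[A]}{\delta A^a_\mu(x)} \\
&= -\int d^4x\;\theta_b(x)\;\underbrace{\big(D^{\rm adj}_\mu\big)_{ba}\,\frac{\delta}{\delta A^a_\mu(x)}}_{i\,T_b(x)}\;\Gamma[A]\,.
\end{aligned} \tag{9.66}$$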
When this symmetry is spoiled by an anomaly, the effective action is no longer
invariant, and we may write

\[ T_a(x)\,\Gamma[A] \equiv G_a[x;A]\; , \tag{9.67} \]



where the function Ga [x; A] encodes the anomaly. This function is closely related
to the non-zero right hand side of the anomalous conservation law for the current
associated to the symmetry, since the effective action and the current are related by

\[ J^\mu_a(x) + \frac{\delta\Gamma[A]}{\delta A^a_\mu(x)} = 0\; , \tag{9.68} \]

which implies

\[ \big(D^{\rm adj}_\mu\big)_{ba}\, J^\mu_a(x) = -G_b[x;A]\; . \tag{9.69} \]

Since the anomaly is local, Gb [x; A] should be a local (at the point x) polynomial in
the gauge field and its derivatives. One may then check that the operators Ta (x) obey
the following commutation relation,
 
\[ \big[ T_a(x), T_b(y) \big] = i\, g\, f_{abc}\, \delta(x-y)\, T_c(x)\; , \tag{9.70} \]

where the fabc are the structure constants of the gauge group. From this, we deduce
the following identity

\[ T_a(x)\, G_b[y;A] - T_b(y)\, G_a[x;A] = i\, g\, f_{abc}\, \delta(x-y)\, G_c[x;A]\; , \tag{9.71} \]

called the Wess-Zumino consistency conditions. Since this identity is linear in the
anomaly function Ga , it cannot constrain its overall normalization (for this, it is
usually necessary to compute the triangle diagram). However, this equation is strong
enough to fully constrain its dependence on the gauge field from the term of lowest
order in A.
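
As a short worked step (a sketch that uses only eqs. (9.67) and (9.70)), the consistency conditions follow from applying the commutator of two operators T to the effective action:

```latex
\begin{aligned}
\big[ T_a(x), T_b(y) \big]\,\Gamma[A]
  &= T_a(x)\,\big( T_b(y)\,\Gamma[A] \big) - T_b(y)\,\big( T_a(x)\,\Gamma[A] \big) \\
  &= T_a(x)\, G_b[y;A] - T_b(y)\, G_a[x;A] \qquad \text{(by eq. (9.67))} \\
  &= i\, g\, f_{abc}\, \delta(x-y)\, T_c(x)\,\Gamma[A] \qquad \text{(by eq. (9.70))} \\
  &= i\, g\, f_{abc}\, \delta(x-y)\, G_c[x;A] \; .
\end{aligned}
```

Equating the second and fourth lines reproduces eq. (9.71).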

9.3.2 BRST form of the Wess-Zumino condition

The consistency condition can be recast into a more convenient form that involves
BRST symmetry. Let us introduce a ghost field χa , and recall that the BRST
transformation reads:

\[
Q_{\rm BRST}\, A^a_\mu(x) = \big(D^{\rm adj}_\mu\big)_{ab}\, \chi_b(x)\; , \qquad
Q_{\rm BRST}\, \chi_a(x) = -\frac{g}{2}\, f_{abc}\, \chi_b(x)\, \chi_c(x)\; .
\tag{9.72}
\]

Then, let us encapsulate the anomaly function into the following local functional of
ghost number +1:
\[ G[A,\chi] \equiv \int d^4x\; \chi_a(x)\, G_a[x;A]\; . \tag{9.73} \]

We obtain:
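\[
\begin{aligned}
Q_{\rm BRST}\, G[A,\chi]
&= i \int d^4x\, d^4y\; \chi_a(x)\chi_b(y)\; T_b(y)\, G_a[x;A]
 - \frac{g}{2} \int d^4x\; f_{abc}\, \chi_a(x)\chi_b(x)\, G_c[x;A] \\
&= \frac{i}{2} \int d^4x\, d^4y\; \chi_a(x)\chi_b(y)\,
 \underbrace{\Big( T_b(y)\, G_a[x;A] - T_a(x)\, G_b[y;A] + i\, g\, \delta(x-y)\, f_{abc}\, G_c[x;A] \Big)}_{=0}\; .
\end{aligned}
\tag{9.74}
\]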

Therefore, the Wess-Zumino consistency conditions are equivalent to the statement
that the functional G[A, χ] is BRST-invariant:

\[ Q_{\rm BRST}\, G[A,\chi] = 0\; . \tag{9.75} \]

Since QBRST is nilpotent, a trivial solution of this equation is of course

G[A, χ] = QBRST h[A] , (9.76)

where h[A] does not depend on the ghost field (indeed, QBRST increases the ghost
number by one unit, and G[A, χ] must have ghost number unity). But since h[A] is
a local functional of the gauge field, it may be subtracted from the action to cancel
the anomaly. Thus, genuine anomalies are given by local functionals G[A, χ] of ghost
number +1 that satisfy the consistency condition (9.75), modulo a term obtained
by acting with QBRST on a functional of A only. Note that if we write G[A, χ] as the
integral of a local density,
\[ G[A,\chi] \equiv \int d^4x\; G(x)\; , \tag{9.77} \]

then the BRST action on the density should be a total derivative

QBRST G(x) = ∂µ ζµ . (9.78)

9.3.3 Solution of the consistency condition

In order to determine how the Wess-Zumino equation constrains G(x), the language of
differential forms introduced in the section 4.5.3 is very handy, as a way to encapsulate
both Lorentz and group indices in compact objects. The 1-forms dxµ anticommute
among themselves under the exterior product ∧. In addition, they also anticommute

with the ghost field and the BRST generator QBRST . The volume element weighted by
the fully antisymmetric tensor εµνρσ can therefore be written as

\[ d^4x\; \epsilon^{\mu\nu\rho\sigma} = dx^\mu \wedge dx^\nu \wedge dx^\rho \wedge dx^\sigma\; . \tag{9.79} \]

Then, given a vector Vµ and the corresponding 1-form

\[ V \equiv V_\mu\, dx^\mu\; , \tag{9.80} \]

we may write in a compact manner

\[ \int d^4x\; \epsilon^{\mu\nu\rho\sigma}\, V_\mu V_\nu V_\rho V_\sigma = \int V \wedge V \wedge V \wedge V\; . \tag{9.81} \]

The exterior derivative d ≡ ∂µ dxµ ∧ satisfies

\[ d^2 = 0\; , \qquad Q_{\rm BRST}\, d + d\, Q_{\rm BRST} = 0\; . \tag{9.82} \]


If we also denote

\[ A \equiv i g\, A^a_\mu\, t^a\, dx^\mu\; , \qquad \chi \equiv i g\, \chi_a\, t^a\; , \tag{9.83} \]

(for later convenience, we absorb a factor i in the definitions of A and χ) the BRST
transformations take the following form

QBRST A = −dχ + A ∧ χ + χ ∧ A ,
QBRST χ = χ ∧ χ . (9.84)

On dimensional grounds, the anomaly function G[A, χ] may contain the following
terms:
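\[
\begin{aligned}
G[A,\chi] = -iC \int d^4x\; \epsilon^{\mu\nu\rho\sigma}\, \chi_a\, {\rm tr}\Big( t^a \Big[
 & (\partial_\mu A_\nu)(\partial_\rho A_\sigma) \\
 & + i a_1\, (\partial_\mu A_\nu) A_\rho A_\sigma + i a_2\, A_\mu (\partial_\nu A_\rho) A_\sigma + i a_3\, A_\mu A_\nu (\partial_\rho A_\sigma) \\
 & - b\, A_\mu A_\nu A_\rho A_\sigma \Big] \Big)\; .
\end{aligned}
\tag{9.85}
\]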

The term on the first line comes from the triangle diagram, whose explicit calculation
gives the overall coefficient C. The terms of the second and third lines come from the
square and pentagon diagrams, respectively. Alternatively, they can be obtained from
the consistency conditions. Firstly, the previous equation may be rewritten as a sum
of forms:
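\[
\begin{aligned}
G[A,\chi] = \gamma \int {\rm tr}\Big( \chi \wedge \Big[
 & (dA)\wedge(dA) + \alpha_1\, (dA)\wedge A\wedge A + \alpha_2\, A\wedge(dA)\wedge A \\
 & + \alpha_3\, A\wedge A\wedge(dA) + \beta\, A\wedge A\wedge A\wedge A \Big] \Big)\; ,
\end{aligned}
\tag{9.86}
\]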

where γ, α1,2,3 , β are constants related to C, a1,2,3 , b. Consider first the BRST
transform of the last term,

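\[
\begin{aligned}
Q_{\rm BRST}\, {\rm tr}\big( \chi\wedge A\wedge A\wedge A\wedge A \big)
 = {} & {\rm tr}\big( \chi\wedge\chi\wedge A\wedge A\wedge A\wedge A \big) \\
 & + \text{terms in } \chi\wedge(d\chi)\wedge A\wedge A\wedge A\; .
\end{aligned}
\tag{9.87}
\]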

Since QBRST cannot increase the degree in A, the term in χ ∧ χ ∧ A ∧ A ∧ A ∧ A
cannot be canceled by the terms in α1,2,3 , and therefore we must have β = 0. We
need then to evaluate the BRST transformation of the other terms. For instance,

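\[
\begin{aligned}
Q_{\rm BRST}\, {\rm tr}\big( \chi\wedge(dA)\wedge(dA) \big) = {\rm tr}\Big(
 & -\chi\wedge\chi\wedge(dA)\wedge(dA) \\
 & + \chi\wedge(d\chi)\wedge A\wedge(dA) \\
 & - (d\chi)\wedge\chi\wedge(dA)\wedge A \\
 & - A\wedge\chi\wedge(dA)\wedge(d\chi) \\
 & - \chi\wedge A\wedge(d\chi)\wedge(dA) \Big)\; .
\end{aligned}
\tag{9.88}
\]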

By evaluating similarly the BRST transforms of the other terms, one can check that
when α1 = −α2 = α3 = −1/2 the BRST transform of the anomaly functional is the
integral of an exact form and therefore vanishes:
\[ Q_{\rm BRST}\, G[A,\chi] = \gamma \int_{\mathbb{R}^4} dF = \gamma \int_{\partial\mathbb{R}^4} F = 0\; . \tag{9.89} \]

This is in fact the only possibility. Introducing the field strength 2-form,
\[ F \equiv dA - A\wedge A = \frac{i g}{2}\, t^a F^a_{\mu\nu}\, dx^\mu \wedge dx^\nu\; , \tag{9.90} \]
the anomaly functional for these values of the coefficients can then be rewritten as
\[ G[A,\chi] = \gamma \int {\rm tr}\Big( \chi \wedge d\Big[ A\wedge F + \frac{1}{2}\, A\wedge A\wedge A \Big] \Big)\; . \tag{9.91} \]
Therefore, except for the prefactor γ, whose determination requires calculating the
triangle diagram, the consistency relations completely determine the dependence of
the anomaly function on the gauge field.
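
As a consistency check (a sketch that assumes only d² = 0 and the graded Leibniz rule for the 1-form A), one can expand the exterior derivative in eq. (9.91) and read off the coefficients of eq. (9.86):

```latex
\begin{aligned}
A\wedge F + \tfrac{1}{2}\, A\wedge A\wedge A
  &= A\wedge dA - \tfrac{1}{2}\, A\wedge A\wedge A \; , \\
d\big( A\wedge dA \big) &= (dA)\wedge(dA) \; , \\
d\big( A\wedge A\wedge A \big)
  &= (dA)\wedge A\wedge A - A\wedge(dA)\wedge A + A\wedge A\wedge(dA) \; , \\
d\Big[ A\wedge F + \tfrac{1}{2}\, A\wedge A\wedge A \Big]
  &= (dA)\wedge(dA) - \tfrac{1}{2}\,(dA)\wedge A\wedge A
     + \tfrac{1}{2}\, A\wedge(dA)\wedge A - \tfrac{1}{2}\, A\wedge A\wedge(dA) \; .
\end{aligned}
```

Comparing with eq. (9.86) gives α1 = −1/2, α2 = +1/2, α3 = −1/2 and β = 0, in agreement with the values quoted above.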

9.4 ’t Hooft anomaly matching


Some models of physics beyond the Standard Model conjecture that the quarks and
leptons are bound states of more fundamental degrees of freedom, confined by some

strong gauge interaction at a scale Λ ≫ Λelectroweak . A difficulty with this picture is to
explain the fact that quarks and leptons are light (in fact, massless, if it were not for
electroweak symmetry breaking), while being bound states of some strong interaction
at a much higher scale. Indeed, the naive mass of these confined states is naturally of
order Λ (the Goldstone mechanism cannot give light fermions, only scalar particles).

As shown by ’t Hooft, one way this may happen is to have in the underlying
fundamental theory a global chiral symmetry with generators T a , such that the
anomaly function tr (T a {T b , T c }) is non-zero. In the low energy sector of the spectrum
of this theory, there must be spin 1/2 massless bound states, on which this chiral
symmetry acts with generators 𝕋a , and whose anomaly coefficients are identical to
the high energy ones:

\[ {\rm tr}\big( \mathbb{T}^a \{ \mathbb{T}^b, \mathbb{T}^c \} \big) = {\rm tr}\big( T^a \{ T^b, T^c \} \big)\; . \tag{9.92} \]

The proof of this assertion goes as follows. Let us first couple a fictitious weakly
coupled gauge boson to the generators T a . We also introduce additional fictitious
massless fermions coupled only to the fictitious gauge boson, but not to the strongly
interacting gauge bosons responsible for the confinement, tuned so that their contribu-
tion exactly cancels the anomaly:

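\[
\Big[ {\rm tr}\big( T^a \{ T^b, T^c \} \big) \Big]_{\substack{\text{physical} \\ \text{high energy fermions}}}
+ \Big[ {\rm tr}\big( T^a \{ T^b, T^c \} \big) \Big]_{\substack{\text{fictitious} \\ \text{fermions}}}
= 0\; .
\tag{9.93}
\]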

Let us now examine the low energy part of the spectrum of this theory, i.e. at energies
much lower than the strong scale Λ. Since they are not coupled in any way to the
strong sector, this low energy spectrum contains the fictitious gauge bosons and
massless fermions, unmodified compared to what we have introduced at high energy.
In addition, this spectrum contains the bound states made of the trapped fermions and
strongly interacting gauge bosons. For consistency, this low energy description must
also be anomaly-free, which means that the bound states must transform under the
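chiral symmetry with generators 𝕋a , such that

\[
\Big[ {\rm tr}\big( \mathbb{T}^a \{ \mathbb{T}^b, \mathbb{T}^c \} \big) \Big]_{\substack{\text{physical} \\ \text{bound states}}}
+ \Big[ {\rm tr}\big( T^a \{ T^b, T^c \} \big) \Big]_{\substack{\text{fictitious} \\ \text{fermions}}}
= 0\; .
\tag{9.94}
\]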

The crucial point in this argument is that the contribution of the fictitious fermions is
the same in the equations (9.93) and (9.94), because these fermions are not coupled
to the strongly interacting sector. Eqs. (9.93) and (9.94) immediately give (9.92). In
other words, the anomalies of the trapped elementary fermions must be mimicked by
those of the massless spin 1/2 bound states they are confined into.

9.5 Scale anomalies

9.5.1 Classical scale invariance


Until now, the quantum anomalies we have encountered in this chapter are related
to chiral couplings of fermions. But there exists another anomaly, ubiquitous in most
quantum field theories, related to quantum violations of scale invariance. Consider a
quantum field theory whose Lagrangian does not contain any dimensionful parameter.
In four spacetime dimensions, this means that it contains only operators of mass
dimension exactly equal to 4, which excludes all mass terms. This is the case for
instance for a massless scalar field theory with a quartic coupling, whose action reads:

\[
S[\phi] \equiv \int d^4x\; \Big[ \frac{g_{\mu\nu}}{2}\, \big(\partial^\mu_x \phi(x)\big)\big(\partial^\nu_x \phi(x)\big) - \frac{\lambda}{4!}\, \phi^4(x) \Big]\; .
\tag{9.95}
\]
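A scaling transformation amounts to multiplying all length scales by some factor

\[ x^\mu \to y^\mu \equiv e^{\vartheta}\, x^\mu\; . \tag{9.96} \]

In this transformation, a field φ(x) of dimension (mass)^dφ and its derivative
transform as:

\[
\begin{aligned}
\phi(x) &\to \phi'(y) \equiv e^{-\vartheta d_\phi}\, \phi(e^{-\vartheta} y)\; , \\
\partial^\mu_x \phi(x) &\to \partial^\mu_y \phi'(y) = e^{-\vartheta (d_\phi + 1)}\, \partial^\mu_x \phi(x) \Big|_{x = e^{-\vartheta} y}\; ,
\end{aligned}
\tag{9.97}
\]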

while the integration measure over spacetime is rescaled by

\[ d^4x \to d^4y = e^{4\vartheta}\, d^4x\; . \tag{9.98} \]

Consider now the transformed action,


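\[
\begin{aligned}
S[\phi'] &= \int d^4y\; \Big[ \frac{g_{\mu\nu}}{2}\, \big(\partial^\mu_y \phi'(y)\big)\big(\partial^\nu_y \phi'(y)\big) - \frac{\lambda}{4!}\, \phi'^4(y) \Big] \\
&= e^{4\vartheta} \int d^4x\; \Big[ e^{-2\vartheta(1 + d_\phi)}\, \frac{g_{\mu\nu}}{2}\, \big(\partial^\mu_x \phi(x)\big)\big(\partial^\nu_x \phi(x)\big) - e^{-4\vartheta d_\phi}\, \frac{\lambda}{4!}\, \phi^4(x) \Big] \\
&= S[\phi]\; ,
\end{aligned}
\tag{9.99}
\]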

where we have used the fact that the mass dimension of φ is dφ = 1 in four spacetime
dimensions. The action defined in eq. (9.95) is thus invariant under scale transfor-
mations. The same conclusion holds for any classical action that does not contain
any dimensionful parameter, provided the appropriate dimension dφ is used for each
field. This is for instance the case of pure Yang-Mills theory in four dimensions, or
quantum chromodynamics in which we neglect the quark masses.
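
The bookkeeping of powers of e^ϑ in eq. (9.99) can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the extension to a general spacetime dimension D, with the canonical scalar dimension dφ = (D − 2)/2, is an assumption that goes beyond the D = 4 case treated in the text.

```python
from fractions import Fraction


def field_dimension(spacetime_dim):
    """Canonical mass dimension of a scalar field, d_phi = (D - 2) / 2 (assumed)."""
    return Fraction(spacetime_dim - 2, 2)


def scaling_exponents(spacetime_dim):
    """Net powers of e^theta multiplying each term of the rescaled action.

    The measure contributes +D, each differentiated field -(d_phi + 1),
    and each undifferentiated field -d_phi, as in eqs. (9.97)-(9.99).
    """
    d_phi = field_dimension(spacetime_dim)
    kinetic = spacetime_dim - 2 * (d_phi + 1)  # (d phi)(d phi) term
    quartic = spacetime_dim - 4 * d_phi        # phi^4 term
    return kinetic, quartic


for dim in (3, 4, 6):
    kinetic, quartic = scaling_exponents(dim)
    print(f"D = {dim}: kinetic exponent = {kinetic}, quartic exponent = {quartic}")
```

Both exponents vanish only for D = 4, which is the statement that the quartic coupling is dimensionless there. In other dimensions the quartic term picks up a factor e^{(4−D)ϑ}, reflecting a dimensionful coupling.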

9.5.2 Dilatation current

Since the transformation (9.97) is continuous, Noether’s theorem implies that there is
a corresponding conserved current. On the one hand, the infinitesimal variation of the
field is

\[ \delta\phi(x) \equiv \phi'(x) - \phi(x) = -\vartheta\, \big( d_\phi + x^\mu \partial_\mu \big)\, \phi(x) + O(\vartheta^2)\; . \tag{9.100} \]

On the other hand, the scale transformation (9.96) directly applied to the integrand of
the action gives a variation
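\[
\begin{aligned}
\delta \big[ d^4x\; \mathcal{L}(x) \big] &= -\vartheta\, d^4x\; \big( 4 + x^\mu \partial_\mu \big)\, \mathcal{L}(x) + O(\vartheta^2) \\
&= -\vartheta\, d^4x\; \partial_\mu \big( x^\mu \mathcal{L}(x) \big) + O(\vartheta^2)\; .
\end{aligned}
\tag{9.101}
\]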

It is important to include the measure in this calculation, since it is not invariant under
scale transformations. The variation of the measure gives the 4 in the first line, which
is crucial for obtaining a total derivative in the second line. Then, from the derivation
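of Noether’s theorem, we conclude that

\[
\partial_\mu\, \underbrace{\Big[ \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi)}\, \big( d_\phi + x^\nu \partial_\nu \big)\phi - x^\mu \mathcal{L} \Big]}_{D^\mu} = 0\; .
\tag{9.102}
\]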

The vector Dµ is called the dilatation current. In the case of the scalar field theory
used earlier as an example, the explicit form of Dµ is
 
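\[
D^\mu = x_\nu\, \underbrace{\Big[ (\partial^\mu \phi)(\partial^\nu \phi) - g^{\mu\nu} \mathcal{L} \Big]}_{\Theta^{\mu\nu}}
 + \underbrace{\phi\, (\partial^\mu \phi)}_{\frac{1}{2} \partial^\mu \phi^2}\; .
\tag{9.103}
\]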

In this formula, we recognize that the factor multiplying xν is the energy-momentum
tensor Θµν , whose divergence is zero thanks to translation invariance, i.e. ∂µ Θµν = 0.

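This observation facilitates the calculation of ∂µ Dµ , since we have

\[
\begin{aligned}
\partial_\mu D^\mu &= \Theta^\mu{}_\mu + (\partial_\mu \phi)(\partial^\mu \phi) + \phi\, (\square \phi) \\
&= 2\, (\partial_\mu \phi)(\partial^\mu \phi) + \phi\, (\square \phi) - 4\, \mathcal{L} \\
&= \phi\, \underbrace{\Big( \square \phi + \frac{\lambda}{6}\, \phi^3 \Big)}_{=0}\; .
\end{aligned}
\tag{9.104}
\]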

The final zero follows from the classical equation of motion of the field.

9.5.3 Link with the energy-momentum tensor


In the previous section, we have seen that the energy-momentum tensor appears in the
expression of the dilatation current. More precisely, this energy-momentum tensor
is the canonical one (i.e. the one obtained as the Noether current associated to
translation invariance). Then, the divergence of the dilatation current is the trace of
this energy-momentum tensor (Θµ µ = −½ □φ² ), plus an additional term that turns
out to cancel it exactly.
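
The value of the trace quoted above can be checked in a few lines. The sketch below uses only the Lagrangian of eq. (9.95) and the classical equation of motion □φ = −(λ/6) φ³ from eq. (9.104):

```latex
\begin{aligned}
\Theta^\mu{}_\mu &= (\partial_\mu\phi)(\partial^\mu\phi) - 4\,\mathcal{L}
  = -(\partial_\mu\phi)(\partial^\mu\phi) + \frac{\lambda}{6}\,\phi^4 \\
 &= -(\partial_\mu\phi)(\partial^\mu\phi) - \phi\,\square\phi
  \qquad \Big(\text{using } \square\phi = -\tfrac{\lambda}{6}\,\phi^3\Big) \\
 &= -\tfrac{1}{2}\,\square\phi^2 \; ,
 \qquad \text{since } \square\phi^2 = 2\,(\partial_\mu\phi)(\partial^\mu\phi) + 2\,\phi\,\square\phi \; .
\end{aligned}
```

The extra term in eq. (9.104), (∂µ φ)(∂µ φ) + φ □φ = +½ □φ², is exactly the opposite of this trace.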
In fact, in such a scale invariant theory, it is possible to introduce a traceless
definition of the energy-momentum tensor, that we shall denote T µν , such that

\[ T^\mu{}_\mu = 0\; , \qquad \partial_\mu T^{\mu\nu} = 0\; , \tag{9.105} \]

and a valid definition of the dilatation current is

\[ D^\mu \equiv x_\nu\, T^{\mu\nu}\; . \tag{9.106} \]

(As we shall see shortly, this new dilatation current gives the same conserved charge
as the current Dµ introduced earlier.) The tracelessness of T µν is then equivalent to
the conservation of Dµ .
In the case of a massless φ4 scalar field theory in four dimensions, this improved
energy-momentum tensor reads

\[ T^{\mu\nu} \equiv \Theta^{\mu\nu} + \frac{1}{6}\, \big( g^{\mu\nu}\, \square - \partial^\mu \partial^\nu \big)\, \phi^2\; . \tag{9.107} \]
This tensor is traceless, because the trace of the additional term is +½ □φ² . Moreover,
this additional term has a null divergence, and therefore we have ∂µ T µν = 0. Note
also that the component µ = 0 of the added term is a total spatial derivative,

\[
\big( g^{0\nu}\, \square - \partial^0 \partial^\nu \big)\, \phi^2 =
\begin{cases}
 -\partial^i \partial^i \phi^2 & (\nu = 0) \\
 -\partial^i \partial^0 \phi^2 & (\nu = i)
\end{cases}\; ,
\tag{9.108}
\]
which implies that the conserved charges (i.e. the momenta Pν ) obtained from Θµν
and T µν are the same.
Since T µν is traceless, the current Dµ = xν T µν is conserved. But we should
also check that we have not modified the corresponding conserved charge. We have
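\[
\begin{aligned}
D^0 = x_\nu T^{0\nu} &= x_\nu \Theta^{0\nu} + \frac{1}{6}\, x_\nu \big( g^{0\nu}\, \square - \partial^0 \partial^\nu \big)\, \phi^2 \\
&= x_\nu \Theta^{0\nu} + \frac{1}{6}\, \big( x^i \partial^0 \partial^i - x^0 \partial^i \partial^i \big)\, \phi^2 \\
&= x_\nu \Theta^{0\nu} + \frac{1}{6}\, \Big( - \underbrace{(\partial^i x^i)}_{=-3}\, \partial^0 + \partial^i \big( x^i \partial^0 - x^0 \partial^i \big) \Big)\, \phi^2 \\
&= \underbrace{x_\nu \Theta^{0\nu} + \frac{1}{2}\, \partial^0 \phi^2}_{D^0} + \partial^i \Big[ \frac{1}{6}\, \big( x^i \partial^0 - x^0 \partial^i \big)\, \phi^2 \Big]\; .
\end{aligned}
\tag{9.109}
\]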

The first two terms are identical to the original D0 of eq. (9.103), and the third term
is a total spatial derivative. Therefore, when we integrate this charge density over all
space, the new definition of the dilatation current gives the same conserved charge as
eq. (9.103).

9.5.4 Energy-momentum tensor via coupling to gravity


The discussion of the previous section highlights the fact that, even in classical field
theory, the energy-momentum tensor is not uniquely defined. It is possible to add a
term that does not alter its conservation and does not change the conserved charges,
but that modifies its trace. There are also cases (e.g., Yang-Mills theory), where the
canonical energy-momentum tensor Θµν is not even symmetric, but can be improved
into a symmetric one.
An alternate method of deriving the energy-momentum tensor, that leads directly
to a symmetric tensor, is to minimally couple the theory to gravity, and to vary the
metric. To that effect, consider an infinitesimal spacetime dependent translation

xµ → xµ + ξµ (x) . (9.110)

Under such a transformation, the metric tensor varies by⁴

\[ \delta g_{\mu\nu}(x) = \nabla_\mu \xi_\nu(x) + \nabla_\nu \xi_\mu(x)\; , \tag{9.111} \]

where ∇µ is the covariant derivative. Let us recall for later use an important identity

\[ \int d^4x\, \sqrt{-g}\; A\, \big( \nabla_\mu B \big) = - \int d^4x\, \sqrt{-g}\; \big( \nabla_\mu A \big)\, B\; , \tag{9.112} \]

where g ≡ det(gµν ). Since eq. (9.111) is merely a change of a dummy integration
variable, the action is not modified. Therefore, we may write


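\[
\begin{aligned}
0 = \delta S &= \int d^4x\; \Big( \frac{\delta S}{\delta g_{\mu\nu}(x)}\, \underbrace{\big( \nabla_\mu \xi_\nu(x) + \nabla_\nu \xi_\mu(x) \big)}_{\delta g_{\mu\nu}(x)} + \underbrace{\frac{\delta S}{\delta \phi(x)}}_{=0}\, \delta\phi(x) \Big) \\
&= - \int d^4x\, \sqrt{-g}\; \nabla_\mu \Big( \frac{2}{\sqrt{-g}}\, \frac{\delta S}{\delta g_{\mu\nu}(x)} \Big)\, \xi_\nu(x)\; .
\end{aligned}
\tag{9.113}
\]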

In the first line, the second term vanishes when the field φ is a solution of the classical
equation of motion. For this to be true for an arbitrary variation ξν (x), we must have
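\[ \nabla_\mu T^{\mu\nu} = 0\; , \quad \text{with} \quad T^{\mu\nu} \equiv \frac{2}{\sqrt{-g}}\, \frac{\delta S}{\delta g_{\mu\nu}}\; . \tag{9.114} \]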
4 Note that, although xµ is not a vector, the infinitesimal variation ξµ (x) is a vector, tangent to the

coordinate manifold at the point x. Therefore, it makes sense to act on it with a covariant derivative.

By construction, this tensor is symmetric and (covariantly) conserved, and the nature
of the coordinate transformation (9.111) makes it clear that it is related to translation
invariance5 . In order to obtain the flat space energy-momentum tensor, one should set
gµν to the Minkowski metric tensor after evaluating the derivative.
Moreover, if we apply a scale transformation to the coordinates,
xµ → eϑ xµ , (9.115)
the metric tensor is simply rescaled:
gµν → e−2ϑ gµν . (9.116)
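If the classical action does not contain any dimensionful parameter, it is
invariant under this rescaling, and we can write

\[
0 = \delta S = \int d^4x\; \Big( \frac{\delta S}{\delta g_{\mu\nu}(x)}\, e^{-2\vartheta}\, g_{\mu\nu}(x) + \underbrace{\frac{\delta S}{\delta \phi(x)}}_{=0}\, \delta\phi(x) \Big)\; .
\tag{9.117}
\]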

This equation implies that the derivative of the action with respect to the metric, and
therefore the energy-momentum tensor T µν , is traceless.
In order to illustrate this method, let us consider Yang-Mills theory, whose action
coupled to gravity reads

\[ S = -\frac{1}{4} \int d^4x\, \sqrt{-g}\; g_{\mu\rho}\, g_{\nu\sigma}\, F^{\mu\nu}_a F^{\rho\sigma}_a\; . \tag{9.118} \]

In order to calculate the derivative of this action with respect to the metric, we need
the variation of √−g, that can be obtained as follows:

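\[
\begin{aligned}
\det\big( g_{\mu\nu} + \delta g_{\mu\nu} \big) &= e^{{\rm tr} \ln ( g_{\mu\nu} + \delta g_{\mu\nu} )} \\
&= e^{{\rm tr} \ln ( g_{\mu\rho} ( \delta^\rho{}_\nu + g^{\rho\sigma} \delta g_{\sigma\nu} ) )} \\
&\approx \det\big( g_{\mu\nu} \big)\, \big( 1 + g^{\mu\nu}\, \delta g_{\mu\nu} \big)\; .
\end{aligned}
\tag{9.119}
\]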

Hence,

\[ \frac{\partial \sqrt{-g}}{\partial g_{\mu\nu}} = \sqrt{-g}\; \frac{g^{\mu\nu}}{2}\; , \tag{9.120} \]
and we obtain the following expression for the energy-momentum tensor:
\[ T^{\mu\nu} = F^{\mu\alpha\, a}\, F_\alpha{}^{\nu\, a} - \frac{g^{\mu\nu}}{4}\, F^a_{\alpha\beta}\, F^{\alpha\beta\, a}\; , \tag{9.121} \]
whose trace is obviously zero.
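
The linearized determinant formula (9.119) is easy to test numerically. The sketch below is an illustration only: it perturbs the Minkowski metric by a small symmetric matrix whose entries are arbitrary made-up numbers, and compares the exact relative change of the determinant with the first-order prediction g^{µν} δg_{µν}. All function names are hypothetical.

```python
def determinant(matrix):
    """Determinant by Laplace expansion along the first row (adequate for 4x4)."""
    size = len(matrix)
    if size == 1:
        return matrix[0][0]
    total = 0.0
    for col in range(size):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * matrix[0][col] * determinant(minor)
    return total


# Minkowski metric; it is its own inverse, so g^{mu nu} has the same entries.
ETA = [[1.0, 0.0, 0.0, 0.0],
       [0.0, -1.0, 0.0, 0.0],
       [0.0, 0.0, -1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]

# Arbitrary symmetric perturbation direction (illustrative values only).
H = [[0.3, 0.1, -0.2, 0.0],
     [0.1, -0.5, 0.4, 0.2],
     [-0.2, 0.4, 0.7, -0.1],
     [0.0, 0.2, -0.1, 0.6]]


def relative_det_change(eps):
    """Exact value of det(eta + eps*H) / det(eta) - 1."""
    perturbed = [[ETA[m][n] + eps * H[m][n] for n in range(4)] for m in range(4)]
    return determinant(perturbed) / determinant(ETA) - 1.0


def first_order_prediction(eps):
    """g^{mu nu} delta g_{mu nu} with delta g = eps*H, as in eq. (9.119)."""
    return eps * sum(ETA[m][m] * H[m][m] for m in range(4))


for eps in (1e-2, 1e-3, 1e-4):
    exact = relative_det_change(eps)
    linear = first_order_prediction(eps)
    print(f"eps = {eps:.0e}: exact = {exact:.3e}, linear = {linear:.3e}")
```

The difference between the two columns shrinks like eps², as expected for a first-order expansion. Taking the square root of the determinant then gives the factor 1/2 in eq. (9.120).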
5 It is important to note that the derivation implicitly assumes that the parameters in the action, such as

the coupling constants, do not depend explicitly on the position.



9.5.5 Scale anomaly and β function


Until now, our analysis of the dilatation current has been purely classical, since we
have shown its conservation from the classical action. At this level, it follows from the
absence of any dimensionful parameters in the theory. However, this may not remain
true when loop corrections are taken into account, because of ultraviolet divergences.
This is quite clear if we regularize these divergences by introducing an ultraviolet
cutoff, but it is also true in dimensional regularization. In the latter case, the fact that
d ≠ 4 implies that the coupling constants become dimensionful, which also breaks
scale invariance. Taking the trace of eq. (9.121) in d dimensions, we obtain

\[ T^\mu{}_\mu = \frac{4 - d}{4}\, F^a_{\alpha\beta}\, F^{\alpha\beta\, a}\; . \tag{9.122} \]

The expectation value of the right hand side has ultraviolet divergences that, in
dimensional regularization, become poles in (d − 4)^−1 . These terms cancel the
prefactor, leaving a non-zero result for the trace of the energy-momentum tensor even
in the limit d → 4.

Another point of view is to introduce counterterms to subtract the ultraviolet
divergences, and then remove the regulator that controlled the ultraviolet behaviour.
But after this procedure, the bare coupling constant in the action becomes scale
dependent, which also breaks scaling symmetry. From this hand-waving discussion,
we expect that the divergence of the dilatation current, i.e. the trace of the energy-
momentum tensor, is related to the β function that controls the running of the coupling:

\[ \mu\, \frac{\partial g}{\partial \mu} = \beta(g)\; . \tag{9.123} \]
Moreover, even if the classical scale invariance is broken by the renormalization
group flow, it should be recovered at the fixed points of the RG flow. For instance, a
quantum field theory is scale invariant at critical points.
In Yang-Mills theory, we can derive the form of this trace in the following (non-
rigorous) manner. Let us start from the Yang-Mills action, written in terms of rescaled
fields, so that the coupling appears in the form of a prefactor g^−2 :

\[ S = \int d^4x\, \sqrt{-g}\; \Big[ -\frac{1}{4\, g_b^2}\, F^a_{\mu\nu}\, F^{\mu\nu\, a} \Big]\; , \tag{9.124} \]

where gb is the bare coupling constant. When this theory is regularized by a gauge
invariant cutoff µ (e.g., a lattice regularization), the bare coupling becomes cutoff
dependent in order for the renormalized quantities to have a proper ultraviolet limit.
Then, consider again the scaling transformation defined in eqs. (9.115) and (9.116).
With a scale dependent coupling, the physics is invariant provided we also change the
scale at which the coupling is evaluated

\[ g_b(\mu) \to g_b(e^{-\vartheta} \mu)\; . \tag{9.125} \]

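The infinitesimal form of this transformation is

\[
\delta g_{\mu\nu} = -2\, \vartheta\, g_{\mu\nu}\; , \qquad
\delta g_b = -\vartheta\, \underbrace{\mu\, \frac{\partial g_b}{\partial \mu}}_{\beta\ \text{function}}\; .
\tag{9.126}
\]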

Then, by writing explicitly the two sources of ϑ dependence in the variation of the
action, we get
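\[
0 = \delta S \underset{g_{\mu\nu} = \eta_{\mu\nu}}{=} -\vartheta \int d^4x\; \Big[ \frac{1}{g_b^2}\, T^\mu{}_\mu - \frac{\beta(g_b)}{2\, g_b^3}\, F^a_{\mu\nu}\, F^{\mu\nu\, a} \Big]\; .
\tag{9.127}
\]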

Therefore, we obtain the following form of the anomalous divergence of the dilatation
current:
\[ \partial_\mu D^\mu = T^\mu{}_\mu = \frac{\beta(g)}{2\, g}\, F^a_{\mu\nu}\, F^{\mu\nu\, a}\; . \tag{9.128} \]
This derivation is only heuristic, but a more rigorous treatment using properly renor-
malized operators would lead to the same result. This anomaly can also be derived in
perturbation theory, from the loop corrections to the dilatation current,

[Figure: loop corrections to the vertex between two gluons and the dilatation current.]

(The dotted line terminated by the dark blob denotes the vertex between two gluons
and the dilatation current.) Note that, thanks to asymptotic freedom, Yang-Mills
theory becomes better and better scale invariant as the energy scale increases. Finally,
when one adds quarks in order to obtain QCD, the right hand side of the previous
equation also contains terms in m ψ̄ψ, due to the explicit breaking of scale invariance
(already in the classical theory, therefore this is not a quantum anomaly) by the masses
of the quarks.
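
To make the remark on asymptotic freedom concrete, here is a minimal numerical sketch. It assumes the standard one-loop β function of pure SU(N) Yang-Mills theory, β(g) = −(11N/3) g³/(16π²), which is not derived in this section. The reference scale mu0, the starting value g0, and the function names are hypothetical choices made only for illustration.

```python
import math


def one_loop_beta(g, n_colors=3):
    """Assumed one-loop beta function of pure SU(N) Yang-Mills theory."""
    return -(11.0 * n_colors / 3.0) * g**3 / (16.0 * math.pi**2)


def running_coupling(mu, mu0=1.0, g0=2.0, n_colors=3):
    """Closed-form solution of mu dg/dmu = one_loop_beta(g) with g(mu0) = g0."""
    b0 = 11.0 * n_colors / 3.0
    inverse_g2 = 1.0 / g0**2 + b0 / (8.0 * math.pi**2) * math.log(mu / mu0)
    if inverse_g2 <= 0.0:
        raise ValueError("scale lies below the one-loop Landau pole")
    return 1.0 / math.sqrt(inverse_g2)


def trace_anomaly_coefficient(g, n_colors=3):
    """Coefficient beta(g) / (2 g) multiplying F^2 in eq. (9.128)."""
    return one_loop_beta(g, n_colors) / (2.0 * g)


for mu in (1.0, 10.0, 100.0, 1000.0):
    g = running_coupling(mu)
    coeff = trace_anomaly_coefficient(g)
    print(f"mu = {mu:7.1f}   g = {g:.4f}   beta/(2g) = {coeff:.5f}")
```

In this approximation the coefficient of the anomaly shrinks logarithmically as µ grows, which is the sense in which scale invariance is progressively recovered at high energy.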
Chapter 10

Localized field configurations

All the applications of quantum field theory we have encountered so far amount to
studying situations that may be viewed as small perturbations above the vacuum state;
i.e. interactions involving states that contain only a few particles. Besides the fact that
these situations are actually encountered in scattering experiments, their importance
stems from the stability of the vacuum, that makes it a natural state to expand around.
In this chapter, we will study other field configurations, classically stable, that may
also be sensible substrates for expansions that differ from the standard perturbative
expansion that we have studied until now. However, under normal circumstances, a
localized “blob” of fields is not stable: it will usually decay into a field which is zero
everywhere. As we shall see, the stability of the field configurations considered in
this chapter is due to topological obstructions that prevent a smooth transformation
between the field configuration of interest and the null field that corresponds to the
vacuum. These field configurations can be classified according to their space-time
structure:

• Event-like : localized both in time and space (e.g., instantons). These may be
viewed as local extrema of the 4-dimensional action, and therefore may give a
(non-perturbative) contribution to path integrals.

• Worldline-like : localized in space, independent of time (e.g., skyrmions,
monopoles). These field configurations behave very much like stable particles
(at least classically), and their non-trivial topology confers on them conserved
charges.

• Strings, Domain walls : extended in one or two spatial dimensions, independent
of time.


Figure 10.1: Quartic potential (10.1) exhibiting spontaneous symmetry breaking.
[Plot of V(φ) as a function of φ.]

10.1 Domain walls


A domain wall is a 2-dimensional¹ interface between two regions of space where a
discrete symmetry is broken in different ways. Their simplest realization arises in a
real scalar field theory, symmetric under φ → −φ, but with a potential that leads to
spontaneous symmetry breaking, such as
\[ V(\phi) \equiv V_0 - \frac{\mu^2}{2}\, \phi^2 + \frac{\lambda}{4!}\, \phi^4\; , \tag{10.1} \]
2 4!
where the constant shift V0 is chosen so that the potential vanishes at its minima. There
are two such minima, at field values

\[ \phi = \pm \phi_*\; , \qquad \phi_* \equiv \sqrt{\frac{6\, \mu^2}{\lambda}}\; . \tag{10.2} \]

In order to simplify the discussion, let us consider field configurations that depend
only on x, and are independent of time, as well as of the transverse coordinates y, z.
We seek field configurations that obey the classical field equation of motion,
\[ -\partial_x^2 \phi + V'(\phi) = 0\; , \tag{10.3} \]
and have a finite energy (per unit of transverse area),
\[ \frac{dE}{dy\, dz} = \int_{-\infty}^{+\infty} dx\; \Big[ \frac{1}{2}\, \big( \partial_x \phi(x) \big)^2 + V(\phi(x)) \Big] < \infty\; . \tag{10.4} \]

1 This is for 4-dimensional spacetime. In D-dimensional spacetime, domain walls have dimension
D − 2.

This energy density is the sum of two positive definite terms (since we have adjusted
the potential so that its minima are V(±φ∗ ) = 0). For the integral over x to converge
when x → ±∞, it is necessary that φ(x) becomes constant when |x| → ∞, and that
this constant be +φ∗ or −φ∗ . There are therefore four possibilities for the values of
the field at x = ±∞:

(i) : φ(−∞) = +φ∗ , φ(+∞) = +φ∗ ,


(ii) : φ(−∞) = −φ∗ , φ(+∞) = −φ∗ ,
(iii) : φ(−∞) = −φ∗ , φ(+∞) = +φ∗ ,
(iv) : φ(−∞) = +φ∗ , φ(+∞) = −φ∗ . (10.5)

The first two of these possibilities do not lead to stable field configurations of positive
energy, because they can be continuously deformed (while holding the asymptotic
values unchanged) into the constant fields φ(x) = +φ∗ , or φ(x) = −φ∗ , respectively,
that have zero energy. Physically, this means that if one creates a field configuration
with these boundary values, it will decay into a constant field (i.e., the regions where
the field was excited to values different from ±φ∗ will dilute away to |x| = ∞).
The interesting cases are encountered when the field takes values corresponding to
opposite minima at x = −∞ and x = +∞. If one holds the asymptotic values of the
field fixed, then it is not possible to deform continuously such a field configuration into
one that would have zero energy. Thus, there must be stable field configurations of
positive energy with these boundary values. A very handy trick, due to Bogomol’nyi,
is to rewrite the energy density as follows:
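\[
\frac{dE}{dy\, dz} = \frac{1}{2} \int_{-\infty}^{+\infty} dx\; \Big( \partial_x \phi(x) \pm \sqrt{2\, V(\phi(x))} \Big)^2 \mp \int_{\phi(-\infty)}^{\phi(+\infty)} d\phi\; \sqrt{2\, V(\phi)}\; .
\tag{10.6}
\]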

In the cases i, ii, the second term vanishes, and the energy density is allowed to be
zero, by having a constant field equal to ±φ∗ . Let us consider now the case iii. In
this case, it is convenient to choose the minus sign in the first term, so that
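\[
\frac{dE}{dy\, dz} = \frac{1}{2} \int_{-\infty}^{+\infty} dx\; \Big( \partial_x \phi(x) - \sqrt{2\, V(\phi(x))} \Big)^2 + \underbrace{\int_{-\phi_*}^{+\phi_*} d\phi\; \sqrt{2\, V(\phi)}}_{>0}\; .
\tag{10.7}
\]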

The second term is now strictly positive, and does not depend on the details of φ(x)
(except its boundary values). Since the first term is the integral of a square, this
implies that there is no field configuration of zero energy with this boundary condition.
The minimal energy density possible with this boundary condition is
\[ \frac{dE}{dy\, dz} \bigg|_{\rm min} = \int_{-\phi_*}^{+\phi_*} d\phi\; \sqrt{2\, V(\phi)}\; , \tag{10.8} \]

reached for a field configuration that obeys

\[ \partial_x \phi(x) = \sqrt{2\, V(\phi(x))}\; . \tag{10.9} \]

Figure 10.2: Domain wall profile corresponding to the potential of the figure 10.1.
[Plot of φ as a function of x.]

Taking one more derivative implies that

\[ \partial_x^2 \phi = \frac{(\partial_x \phi)\, V'(\phi)}{\sqrt{2\, V(\phi)}} = V'(\phi)\; , \tag{10.10} \]

which is nothing but the classical equation of motion (10.3). Solutions of this equation
with prescribed boundary values ±φ∗ at x = ±∞ interpolate between the two ground
states of the potential of the figure 10.1. The ground state φ = +φ∗ is realized at
x → +∞, while the other ground state is realized at x → −∞. Since these two vacua
correspond to two different ways to spontaneously break the φ → −φ symmetry,
there must exist an interface between the two phases, called a domain wall. From
eq. (10.9), we may write


\[ x(\phi) = x_0 + \int_0^{\phi} \frac{d\xi}{\sqrt{2\, V(\xi)}}\; , \tag{10.11} \]

where x0 is an integration constant that can be interpreted as the coordinate where
the field φ is zero. In other words, x0 is the location of the center of the domain wall
that separates the regions of different vacua. The domain wall is a local minimum
of the energy density (and the absolute minimum for the mixed boundary conditions
iii). Moreover, it is separated from the (lower energy) configurations i, ii that have
a constant field by an infinite energy barrier² . Indeed, going from iii to i implies
shifting the value of the field from −φ∗ to +φ∗ in the (infinite) vicinity of x = −∞.
2 From this fact, we may infer that domain walls are also stable quantum mechanically.

In the middle of this process, the field in this region will be φ = 0, at which
V(φ) = V0 > 0, a configuration that has an infinite energy density. Thus, the domain
wall solution is stable, except for shifts of x0 (since the energy density is independent
of x0 ): the domain wall may move along the x axis, but cannot disappear.
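
For the quartic potential (10.1), the first-order equation (10.9) can be solved in closed form. With V0 = 3µ⁴/(2λ) the potential is V = (λ/4!)(φ² − φ∗²)², and one finds the profile φ(x) = φ∗ tanh(µ(x − x0)/√2), whose energy per unit area evaluates to 4√2 µ³/λ from eq. (10.8). These closed forms are worked out here from eqs. (10.1), (10.2), (10.8) and (10.9) rather than quoted from the text. The sketch below (hypothetical function names, x0 = 0, arbitrary default values µ = λ = 1) checks them numerically, and also shows that a profile with the wrong width costs more energy.

```python
import math


def potential(phi, mu=1.0, lam=1.0):
    """Quartic potential (10.1), with V0 = 3 mu^4 / (2 lam) so that V(phi_*) = 0."""
    v0 = 3.0 * mu**4 / (2.0 * lam)
    return v0 - 0.5 * mu**2 * phi**2 + lam * phi**4 / 24.0


def kink(x, mu=1.0, lam=1.0, stretch=1.0):
    """Trial profile phi_* tanh(stretch * mu * x / sqrt(2)); stretch = 1 solves (10.9)."""
    phi_star = math.sqrt(6.0 * mu**2 / lam)
    return phi_star * math.tanh(stretch * mu * x / math.sqrt(2.0))


def energy_per_area(mu=1.0, lam=1.0, stretch=1.0, half_width=30.0, steps=60000):
    """Midpoint-rule estimate of the integral (10.4) for the trial profile."""
    h = 2.0 * half_width / steps
    eps = 1e-5  # step of the central finite difference used for d(phi)/dx
    total = 0.0
    for i in range(steps):
        x = -half_width + (i + 0.5) * h
        dphi = (kink(x + eps, mu, lam, stretch) - kink(x - eps, mu, lam, stretch)) / (2.0 * eps)
        total += (0.5 * dphi**2 + potential(kink(x, mu, lam, stretch), mu, lam)) * h
    return total


def bogomolnyi_bound(mu=1.0, lam=1.0):
    """Integral (10.8) evaluated analytically for the potential (10.1)."""
    return 4.0 * math.sqrt(2.0) * mu**3 / lam


print("bound            :", bogomolnyi_bound())
print("exact kink       :", energy_per_area())
print("twice too narrow :", energy_per_area(stretch=2.0))
```

The profile with stretch = 1 saturates the bound, while the squeezed profile has the same boundary values but a higher energy, as expected from the decomposition (10.7).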
Let us finish with a note on the y, z dependence that has been neglected so far.
Reintroducing the transverse dependence adds the term ½ [(∂y φ)² + (∂z φ)²] to
the integrand of the energy density in eq. (10.4). This term is positive, or zero for
fields that do not depend on y and z. Therefore, the minimum of energy density is
reached for domain walls that are invariant by translation in the transverse directions.
Domain walls that are not translation invariant are not stable, but will relax to this
y, z-invariant configuration. Physically, one may view the term ½ [(∂y φ)² + (∂z φ)²]
as a surface tension energy, and the energetically favored configurations are those for
which the interface has the lowest curvature.

10.2 Skyrmions
Skyrmions are field configurations that arise in models resulting from a spontaneous
symmetry breaking, such as a non-linear sigma model. Consider for instance the
following action,
\[
S[\xi] = \int d^Dx\; \Big( \frac{1}{2} \sum_{a,b} g_{ab}(\xi)\, \big( \partial_i \xi^a \big)\big( \partial_i \xi^b \big) + \cdots \Big)\; ,
\tag{10.12}
\]

where the fields ξa are the Nambu-Goldstone bosons of a broken symmetry from the
symmetry group G down to H. The matrix gab (ξ) is positive definite, and in general
field dependent. The dots represent terms with higher derivatives, that we have not
written explicitly. In such a model, the Nambu-Goldstone fields ξa may be viewed as
elements of the coset G/H.
In order to have a finite action, the derivatives of the fields should decrease faster
than |x|^(−D/2) at large distance,

\[ \partial_i \xi^a(x) \underset{|x| \to \infty}{\lesssim} |x|^{-D/2}\; , \tag{10.13} \]

which means that the field ξa (x) should go to a constant, with a remainder that
decreases faster than |x|^(1−D/2) .
The constant value of ξa at infinity can be chosen to be some fixed predefined
element of G/H. Thus, we may view the field ξa (x) as a mapping

\[ \xi^a : S^D \mapsto G/H\; , \tag{10.14} \]

where S^D is the D-dimensional sphere, which is topologically equivalent to the
Euclidean space ℝ^D with all the points |x| = ∞ identified as a single point. This

Figure 10.3: Stereographic projection that maps the plane ℝ² to the sphere S² . All the
points at infinity in the plane are identified, and mapped to the north pole of the sphere.

equivalence may be made manifest by a stereographic projection, illustrated for
D = 2 in the figure 10.3.
These mappings, taking a fixed value at |x| = ∞, can be organized into topological
classes containing functions that can be continuously deformed into one another. The
set of these classes is a group, known as the D-th homotopy group of G/H, denoted
πD (G/H).
The original version of this model was intended to describe nucleons as a topo-
logically stable configuration of the pion field. In this case, there are D = 3 spatial
dimensions, and the chiral symmetry SU(2) × SU(2) is spontaneously broken to
SU(2). The coset in which ξa lives is SU(2), and the relevant homotopy group is
π3 (SU(2)) = ℤ. The integer that enumerates the topological classes is then identified
with the baryon number.
Note that the model defined by eq. (10.12), with only terms of second order in the derivatives, cannot have stable solutions, a result known as Derrick's theorem. In order to see this, consider a skyrmion solution $\xi_a(x)$, and construct another field by a rescaling:

$$ \xi_a^R(x) \equiv \xi_a(x/R) . \tag{10.15} $$

The action becomes $S[\xi^R] = R^{D-2} S[\xi]$. In D > 2 dimensions, we may make it decrease continuously to zero by letting R → 0, despite the fact that $\xi_a$ and $\xi_a^R$ have the same topology. Such a solution may be stabilized by adding a term with higher derivatives, such as

$$ V[\xi] \equiv \int d^D x \; h_{abcd}(\xi)\, \big(\partial_i \xi_a\big) \big(\partial_i \xi_b\big) \big(\partial_j \xi_c\big) \big(\partial_j \xi_d\big) . \tag{10.16} $$

Under the same rescaling, we now have $V[\xi^R] = R^{D-4} V[\xi]$. In D = 3 spatial dimensions, the term with two derivatives decreases to zero when R → 0, while the above quartic term increases to +∞. Their sum therefore exhibits an extremum at some finite scale $R_*$. Although we obtain in this way non-trivial stable solutions, there is a priori no reason to limit ourselves to terms with four derivatives, and therefore the predictive power of such a model is limited by the many possible choices for these higher order terms.
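
The scaling argument above can be illustrated numerically. The following is only a sketch: the values `s2` and `s4` stand for S[ξ] and V[ξ] evaluated on the unrescaled field and are hypothetical numbers chosen for illustration, and the function names are not taken from the text.

```python
import math

def rescaled_functional(R, s2, s4, D=3):
    """S[xi^R] + V[xi^R] = R^(D-2) * S[xi] + R^(D-4) * V[xi]."""
    return R ** (D - 2) * s2 + R ** (D - 4) * s4

def stationary_scale(s2, s4, D=3):
    """Scale R* at which the rescaled functional is stationary (needs 2 < D < 4)."""
    # d/dR = 0  =>  (D-2) R^(D-3) s2 = (4-D) R^(D-5) s4
    #          =>  R^2 = (4-D) s4 / ((D-2) s2)
    return math.sqrt((4 - D) * s4 / ((D - 2) * s2))

# Hypothetical positive values of S[xi] and V[xi] for the unrescaled field.
s2, s4 = 2.0, 0.5
r_star = stationary_scale(s2, s4)
for R in (0.25, r_star, 1.0):
    print(R, rescaled_functional(R, s2, s4))
```

With these inputs the two-derivative term alone would shrink to zero as R → 0, whereas the sum is smallest at the finite scale returned by `stationary_scale`.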

10.3 Monopoles

10.3.1 Dirac monopole

Magnetic monopoles are not forbidden in quantum electrodynamics, but their exis-
tence would automatically lead to the quantization of electrical charge, as first noted
by Dirac. Let us reproduce here this argument. Consider the radial magnetic field of a
would-be monopole:

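$$ \mathbf{B} = g\, \frac{\hat{\mathbf{x}}}{|\mathbf{x}|^2} . \tag{10.17} $$

Since this field has a non-zero flux through any closed surface enclosing the origin, it violates Maxwell's equation ∇ · B = 0 there, and we cannot find a vector potential A (with B = ∇ × A) for this magnetic field in all space. But it is possible to find one that works almost everywhere, for instance

$$ \mathbf{A}(\mathbf{x}) = g\, \frac{1 - \cos\theta}{|\mathbf{x}| \sin\theta}\; \mathbf{e}_\phi , \tag{10.18} $$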

where θ is the polar angle, φ the azimuthal angle, and eφ is the unit vector tangent to
the circle of constant |x| and θ. This vector potential is not defined on the semi-axis
θ = π (i.e. the semi-axis of negative z). One may argue that on this semi-axis, we
have in addition to the monopole field a singular Bz whose magnetic flux precisely
cancels the magnetic flux of the monopole, so that the total flux on any closed surface
containing the origin is zero, as illustrated in the figure 10.4. Thus, in this solution, the
magnetic flux Φm ≡ 4π g of the monopole is “brought from infinity” by an infinitely
thin “solenoid”. Even if it is infinitely thin, such a solenoid may in principle be
detected by looking for interferences between the wavefunctions of charged particles
that have propagated left and right of the solenoid (this corresponds to the Aharonov-
Bohm effect). For a particle of electrical charge e, the corresponding phase shift is
eΦm = 4πeg. Dirac pointed out that this interference is absent when the phase shift
is a multiple of 2π, i.e. when the electric and magnetic charges are related by
$$ g e = \frac{n}{2} , \quad n \in \mathbb{Z} . \tag{10.19} $$
Thus, electrodynamics can perfectly accommodate genuine magnetic monopoles, provided this condition is satisfied, since the annoying solenoid that comes with the above vector potential is totally undetectable. In particular, this implies that all electrical charges should be multiples of some elementary quantum of electrical charge if monopoles exist. Note that in quantum electrodynamics, while the electric and magnetic charges must be related by eq. (10.19), there is no constraint a priori on the mass of monopoles and it should be regarded as a free parameter.

Figure 10.4: Left: notations for the polar coordinates local frame used in eq. (10.18). Right: magnetic field lines of the Dirac monopole, corresponding to the vector potential of eq. (10.18).
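
As a rough numerical illustration of the solenoid argument (a sketch only; the function names and the sample values of e and g are assumptions made for this example), one can compare the circulation of the potential (10.18) around a circle of constant θ with the flux of the field (10.17) through the corresponding spherical cap, and then evaluate the Aharonov-Bohm phase factor for the full flux 4πg.

```python
import cmath
import math

def a_phi(g, r, theta):
    """e_phi component of the Dirac potential, eq. (10.18)."""
    return g * (1.0 - math.cos(theta)) / (r * math.sin(theta))

def circulation(g, r, theta, steps=1000):
    """Line integral of A along the circle of constant r and theta."""
    dphi = 2.0 * math.pi / steps
    # Along this circle, dx = r sin(theta) dphi e_phi.
    return sum(a_phi(g, r, theta) * r * math.sin(theta) * dphi
               for _ in range(steps))

def cap_flux(g, theta):
    """Flux of B = g x_hat / |x|^2 through the cap of half-angle theta."""
    return 2.0 * math.pi * g * (1.0 - math.cos(theta))

def solenoid_phase(e, g):
    """Aharonov-Bohm phase factor exp(i e Phi_m), with Phi_m = 4 pi g."""
    return cmath.exp(1j * e * 4.0 * math.pi * g)

print(circulation(1.0, 2.0, 1.0), cap_flux(1.0, 1.0))
print(abs(solenoid_phase(1.5, 1.0) - 1.0))  # e g = 3/2: no interference
print(abs(solenoid_phase(0.4, 1.0) - 1.0))  # e g = 0.4: solenoid detectable
```

As θ approaches π the circulation tends to 4πg, which is the flux that the infinitely thin solenoid must carry back.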

Let us mention briefly an alternative argument, that does not involve discussing the detectability of Dirac's solenoid. Instead of the vector potential of eq. (10.18), one could have chosen

$$ \mathbf{A}'(\mathbf{x}) = -g\, \frac{1 + \cos\theta}{|\mathbf{x}| \sin\theta}\; \mathbf{e}_\phi , \tag{10.20} $$

that has a singularity on the semi-axis θ = 0. When Dirac's quantization condition is satisfied, one may patch eqs. (10.18) and (10.20) in order to obtain a vector potential which is regular in all space (except at the origin, where the monopole is located). To see this, consider a region $\Omega_1$ corresponding to 0 ≤ θ ≤ 3π/4 and a region $\Omega_2$ corresponding to π/4 ≤ θ ≤ π. Then, we choose A in $\Omega_1$ and A′ in $\Omega_2$. In the overlap of the two regions, π/4 ≤ θ ≤ 3π/4, we have

$$ (\mathbf{A} - \mathbf{A}') \cdot d\mathbf{x} = 2g\, d\phi , \tag{10.21} $$

and we can write

$$ \mathbf{A} - \mathbf{A}' = \nabla\chi \quad \text{with} \quad \chi(\phi) \equiv 2g\, \phi . \tag{10.22} $$

For this to be an acceptable gauge transformation, the phase by which it multiplies the wavefunction of a charged particle should be single-valued, i.e.

$$ e^{i e\, \chi(\phi + 2\pi)} = e^{i e\, \chi(\phi)} , \tag{10.23} $$

which is precisely the case when the condition (10.19) is satisfied.
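
The patching argument can be checked with a few lines of code. This is a minimal sketch under stated assumptions: the helper names are invented for the example, only the e_φ components of eqs. (10.18) and (10.20) are used, and the numerical values of e, g, r and θ are arbitrary.

```python
import cmath
import math

def a_upper(g, r, theta):
    """e_phi component of A, eq. (10.18); singular at theta = pi."""
    return g * (1.0 - math.cos(theta)) / (r * math.sin(theta))

def a_lower(g, r, theta):
    """e_phi component of A', eq. (10.20); singular at theta = 0."""
    return -g * (1.0 + math.cos(theta)) / (r * math.sin(theta))

def chi_slope(g, r, theta):
    """(A - A') . dx / dphi on a circle of constant r and theta."""
    return (a_upper(g, r, theta) - a_lower(g, r, theta)) * r * math.sin(theta)

def transition_is_single_valued(e, g, tol=1e-9):
    """True when exp(i e chi), chi = 2 g phi, is unchanged by phi -> phi + 2 pi."""
    return abs(cmath.exp(1j * e * 2.0 * g * 2.0 * math.pi) - 1.0) < tol

print(chi_slope(1.5, 3.0, math.pi / 3))        # expected 2 g, from eq. (10.21)
print(transition_is_single_valued(0.5, 3.0))   # e g = 3/2
print(transition_is_single_valued(0.3, 1.0))   # e g = 0.3
```

The slope is independent of r and θ inside the overlap region, which is what allows χ to depend on φ only.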


This argument can even be made without any reference to the explicit solutions (10.18) and (10.20). Let us consider a large sphere surrounding the origin, divide it into an upper and a lower hemisphere (see figure 10.7), and denote A and A′ the vector potentials that represent the monopole in these two hemispheres. On the equator, their difference should be a pure gauge,

$$ \mathbf{A} - \mathbf{A}' = \frac{i}{e}\, \Omega^\dagger(\mathbf{x})\, \nabla\, \Omega(\mathbf{x}) . \tag{10.24} $$

Along the equator, we have

$$ \Omega(\phi) = \Omega(0)\, \exp\Big( -i e \int_{\gamma[0,\phi]} (\mathbf{A} - \mathbf{A}') \cdot d\mathbf{x} \Big) , \tag{10.25} $$

where the integration path γ[0, φ] is the portion of the equator that extends between the azimuthal angles 0 and φ. After a complete revolution, we have

$$
\begin{aligned}
\Omega(2\pi) &= \Omega(0)\, \exp\Big( -i e \oint_{\rm Equator} (\mathbf{A} - \mathbf{A}') \cdot d\mathbf{x} \Big) \\
&= \Omega(0)\, \exp\Big( -i e\, \underbrace{(\Phi_U + \Phi_L)}_{\text{flux}\, =\, 4\pi g} \Big) = \Omega(0)\, e^{-4\pi i\, e g} .
\end{aligned} \tag{10.26}
$$

To obtain the first equality on the second line, we use Stokes's theorem to rewrite the contour integrals of A and A′ as surface integrals of the corresponding magnetic field. Therefore, we obtain the magnetic fluxes through the upper and lower hemispheres, respectively, whose sum is the total flux 4πg of the monopole. Requiring the single-valuedness of Ω leads to Dirac's condition on eg.

10.3.2 Monopoles in non-Abelian gauge theories


There are also non-abelian field theories that exhibit U(1) magnetic monopoles, as classical solutions whose stability is ensured by topology. The simplest example is an SU(2) gauge theory coupled to a scalar field in the adjoint representation³, whose Lagrangian density reads

$$ \mathcal{L} \equiv -\frac{1}{4}\, F^a_{\mu\nu} F^{a,\mu\nu} + \frac{1}{2}\, \big(D_\mu \Phi^a\big) \big(D^\mu \Phi^a\big) - V(\Phi) , \tag{10.27} $$

with

$$
\begin{aligned}
V(\Phi) &\equiv \frac{\lambda}{8}\, \big( \Phi^a \Phi^a - v^2 \big)^2 , \\
D_\mu \Phi^a &= \partial_\mu \Phi^a - e\, \epsilon_{abc}\, A^b_\mu \Phi^c , \\
F^a_{\mu\nu} &= \partial_\mu A^a_\nu - \partial_\nu A^a_\mu - e\, \epsilon_{abc}\, A^b_\mu A^c_\nu ,
\end{aligned} \tag{10.28}
$$

where we have written explicitly the structure constants of the su(2) algebra. In order to study static classical solutions, it is simpler to consider the minima of the energy:

$$ E \equiv \int d^3x\, \Big\{ \frac{1}{2} \Big[ E^a_i E^a_i + B^a_i B^a_i + \big(D_i \Phi^a\big) \big(D_i \Phi^a\big) \Big] + V(\Phi) \Big\} , \tag{10.29} $$

where $E^a_i \equiv F^a_{0i}$ is the (non-abelian) electrical field and $B^a_i \equiv \frac{1}{2} \epsilon_{ijk} F^a_{jk}$ is the magnetic field.

3 This model is known as the Georgi-Glashow model. It was considered at some point as a possible candidate for a field theory of electroweak interactions, until the neutral vector boson $Z^0$ was discovered. Here, we use it as a didactical example of a theory with classical solutions that are magnetic monopoles.
It is possible to choose a gauge (called the unitary gauge) in which the scalar field triplet takes the form

$$ \Phi^a = \big( 0, 0, v + \varphi \big) . \tag{10.30} $$

In this equation, we have anticipated spontaneous symmetry breaking, that will give to the scalar field a vacuum expectation value v, and we have made a specific choice about the orientation of the vacuum in SU(2). The field ϕ is thus the quantum fluctuation of the scalar about its expectation value. In this process, the fields $A^{1,2}_\mu$ will become massive (with a mass $M_W = e\, v$), as well as the scalar field (with mass $M_H = \sqrt{\lambda}\, v$), while the field $A^3_\mu$ remains massless (it corresponds to a residual unbroken U(1) symmetry). The classical vacuum of this theory corresponds to

$$ \varphi = 0 , \qquad A^a_\mu = 0 . \tag{10.31} $$

Now, we seek stable classical field configurations that are local minima of the energy, but are not equivalent to the vacuum in the entire space. To prove the existence of such fields, it is sufficient to exhibit a field configuration of non-zero energy that cannot be continuously deformed into the null fields of eq. (10.31) (up to a gauge transformation). In order to have a finite energy, the scalar field $\Phi^a$ should reach a minimum of the potential V(Φ) at large distance |x| → ∞ (we have shifted the potential so that its minimum is zero), but it may approach different minima depending on the direction $\hat{x}$ in space. The allowed asymptotic behaviours of $\Phi^a$ define a mapping from the sphere $S_2$ (the orientations $\hat{x}$, for three spatial dimensions) to the sphere $\Phi^a \Phi^a = v^2$ of the minima of V(Φ). Since su(2) is 3-dimensional, the set of zeroes of the scalar potential is also a sphere $S_2$, and it is natural to consider the following configuration⁴:

$$ \Phi^a(\hat{x}) \equiv v\, \hat{x}_a , \tag{10.32} $$

sometimes called a "hedgehog field" because the direction of internal space pointed to by the scalar field is locked to the spatial direction, as shown in figure 10.5. Any smooth classical field $\Phi^a$ that obeys this boundary condition at infinite spatial distance must vanish at some point in the interior of the sphere. Therefore, it cannot simply be a gauge transform of the constant field $\Phi^a = v\, \delta^{3a}$ (the expectation value of the scalar field in the vacuum). Once again, the classes of fields that can be continuously deformed into one another are given by a homotopy group, in this case the group $\pi_2(M_0)$ where $M_0$ is the manifold of the minima of the scalar potential. For the SU(2) group, $M_0$ is topologically equivalent to the 2-sphere $S_2$, and the equivalence classes of the mappings $S_2 \mapsto S_2$ are indexed by the integers, since $\pi_2(S_2) = \mathbb{Z}$. The hedgehog field of eq. (10.32) has topological number +1, while the vacuum has topological number 0.

Figure 10.5: Cartoon of the hedgehog configuration of eq. (10.32). Each needle indicates the internal orientation of $\Phi^a$ at the corresponding point on the sphere.

At spatial infinity, the hedgehog configuration (10.32) is gauge equivalent to the standard scalar vacuum aligned with the third colour direction, $\Phi^a = v\, \delta^{3a}$. In order to see this, let us introduce the following SU(2) transformation, that depends on the polar angle θ and azimuthal angle φ as follows⁵:

$$ \Omega(\theta, \phi) \equiv -\cos\frac{\theta}{2}\, \sin\phi + 2i\, \Big( \sin\frac{\theta}{2}\; t^1_f + \cos\frac{\theta}{2}\, \cos\phi\; t^3_f \Big) , \tag{10.33} $$

where the $t^a_f$ are the generators of the fundamental representation of su(2). Then, one may check explicitly that

$$ \delta^{3a}\, \Omega^\dagger\, t^a_f\, \Omega = \sin\theta\, \big( \cos\phi\; t^1_f + \sin\phi\; t^2_f \big) + \cos\theta\; t^3_f = \hat{x}_a\, t^a_f . \tag{10.34} $$

Thus, Ω transforms the usual scalar vacuum into the hedgehog configuration at infinity. Note that (10.33) is not a valid gauge transformation over the entire space because it is not well defined at the origin.

4 Here, we see that it is crucial that the scalar potential has non-trivial minima. If $\Phi^a \equiv 0$ was the only minimum, it would not be possible to construct solutions of finite energy that are not topologically equivalent to the vacuum.
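
The "explicit check" of eq. (10.34) can be done numerically. The sketch below is an illustration only: it assumes the standard realization $t^a_f = \sigma^a/2$ in terms of Pauli matrices, reads Ω off eq. (10.33) in the form $u_0 + 2i\, u_a t^a_f$, and uses small hand-written 2×2 matrix helpers whose names are not from the text.

```python
import math

# Pauli matrices; the fundamental generators are taken to be t^a = sigma^a / 2.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def omega(theta, phi):
    """Omega = u0 + 2i u_a t^a = u0 + i u_a sigma^a, with u read off eq. (10.33)."""
    u0 = -math.cos(theta / 2) * math.sin(phi)
    u = [math.sin(theta / 2), 0.0, math.cos(theta / 2) * math.cos(phi)]
    return [[u0 * (i == j) + 1j * sum(u[a] * SIGMA[a][i][j] for a in range(3))
             for j in range(2)] for i in range(2)]

def hedgehog_mismatch(theta, phi):
    """Largest entry of Omega^dagger t^3 Omega - x_hat_a t^a."""
    om = omega(theta, phi)
    t3 = [[SIGMA[2][i][j] / 2 for j in range(2)] for i in range(2)]
    lhs = matmul(matmul(dagger(om), t3), om)
    x_hat = [math.sin(theta) * math.cos(phi),
             math.sin(theta) * math.sin(phi),
             math.cos(theta)]
    rhs = [[sum(x_hat[a] * SIGMA[a][i][j] / 2 for a in range(3))
            for j in range(2)] for i in range(2)]
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))

print(hedgehog_mismatch(0.7, 1.9))
```

A mismatch at the level of rounding errors for arbitrary angles indicates that Ω rotates the third colour direction into the radial one.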
The choice of eq. (10.32) for the asymptotic behaviour of the scalar field was motivated by the requirement that the potential V(Φ) gives a finite contribution to the energy. The term in $(D_i \Phi^a)^2$ should also give a finite contribution. However, note that

$$ \partial_i \Phi^a(\hat{x}) = \frac{v}{|x|}\, \big( \delta_{ia} - \hat{x}_i \hat{x}_a \big) \tag{10.35} $$

is not square integrable. We must therefore adjust the asymptotic behaviour of the gauge potential in order to cancel this term in the covariant derivative, by requiring that

$$ \epsilon_{abc}\, A^b_i\, \hat{x}_c \underset{|x|\to\infty}{=} \frac{\delta_{ia} - \hat{x}_i \hat{x}_a}{e\, |x|} , \tag{10.36} $$

which is satisfied if

$$ A^b_i \underset{|x|\to\infty}{=} \frac{\epsilon_{ibd}\, \hat{x}_d}{e\, |x|} + \text{term in } \hat{x}_b . \tag{10.37} $$
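
As a short worked check (a sketch that uses only eqs. (10.28), (10.32), (10.35) and (10.37), together with the standard contraction $\epsilon_{abc}\,\epsilon_{ibd} = \delta_{cd}\delta_{ai} - \delta_{ci}\delta_{ad}$), one can verify that the potential (10.37) fulfills (10.36) and cancels the gradient of the hedgehog field. The "term in $\hat{x}_b$" drops out because $\epsilon_{abc}\,\hat{x}_b \hat{x}_c = 0$.

```latex
\begin{aligned}
\epsilon_{abc}\, A^b_i\, \hat{x}_c
 &= \frac{1}{e|x|}\, \epsilon_{abc}\, \epsilon_{ibd}\, \hat{x}_d\, \hat{x}_c
  = \frac{1}{e|x|}\, \big( \delta_{cd}\delta_{ai} - \delta_{ci}\delta_{ad} \big)\, \hat{x}_d \hat{x}_c
  = \frac{\delta_{ia} - \hat{x}_i \hat{x}_a}{e|x|} , \\
D_i \Phi^a
 &= \partial_i \Phi^a - e\, \epsilon_{abc}\, A^b_i\, \Phi^c
  = \frac{v}{|x|}\, \big( \delta_{ia} - \hat{x}_i \hat{x}_a \big)
  - e\, v\, \frac{\delta_{ia} - \hat{x}_i \hat{x}_a}{e|x|}
  = 0 .
\end{aligned}
```

The cancellation holds only for the leading asymptotic terms; corrections at finite distance are not constrained by this argument.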
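The corresponding field strength and magnetic field are given by

$$
\begin{aligned}
F^a_{ij} &\underset{|x|\to\infty}{=} \frac{1}{e\, |x|^2}\, \Big[ 2\, \epsilon_{ija} + 2\, \big( \epsilon_{iad}\, \hat{x}_j - \epsilon_{jad}\, \hat{x}_i \big)\, \hat{x}_d - \epsilon_{ijd}\, \hat{x}_a \hat{x}_d \Big] , \\
B^a_i &\underset{|x|\to\infty}{=} \frac{\hat{x}_i\, \hat{x}_a}{e\, |x|^2} .
\end{aligned} \tag{10.38}
$$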

Therefore, at large distance (these considerations do not give the precise form of the fields at finite distance) there is a purely radial magnetic field that vanishes like $|x|^{-2}$, i.e. according to Coulomb's law, thus suggesting that a magnetic monopole is present at the origin. For a more robust interpretation, we should apply a gauge transformation that maps the asymptotic hedgehog scalar field into the usual scalar vacuum, aligned with the third colour direction. Thanks to eq. (10.34), we see that in this process the magnetic field of eq. (10.38), proportional to $\hat{x}_a$, will become proportional to $\delta^{3a}$. But the third colour direction precisely corresponds to the gauge potential that remains massless in the spontaneous symmetry breaking SU(2) → U(1). Therefore, eq. (10.38) is indeed the magnetic field of a U(1) magnetic monopole. Its flux through a sphere surrounding the origin is

$$ \Phi_m = \frac{4\pi}{e} , \tag{10.39} $$

equivalent to that of a magnetic charge $g \equiv e^{-1}$ at the origin.

5 When an SU(2) transformation in the fundamental representation is written as

$$ \Omega \equiv u_0 + 2i\, u_a t^a_f , $$

its unitarity ($\Omega^\dagger \Omega = 1$) is equivalent to $u_0^2 + u_1^2 + u_2^2 + u_3^2 = 1$.
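
The step from the field strength to the radial magnetic field in eq. (10.38) is a pure index contraction, and can be cross-checked numerically. The sketch below works in units where $e\,|x|^2 = 1$; the function names and the sample direction are choices made for this illustration.

```python
import math

def eps3(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def field_strength(x_hat):
    """e |x|^2 F^a_ij from the first line of eq. (10.38), stored as F[a][i][j]."""
    r = range(3)
    return [[[2 * eps3(i, j, a)
              + 2 * sum((eps3(i, a, d) * x_hat[j] - eps3(j, a, d) * x_hat[i]) * x_hat[d]
                        for d in r)
              - sum(eps3(i, j, d) * x_hat[a] * x_hat[d] for d in r)
              for j in r] for i in r] for a in r]

def magnetic_field(x_hat):
    """e |x|^2 B^a_i = (1/2) eps_ijk (e |x|^2 F^a_jk), stored as B[a][i]."""
    F = field_strength(x_hat)
    r = range(3)
    return [[0.5 * sum(eps3(i, j, k) * F[a][j][k] for j in r for k in r)
             for i in r] for a in r]

norm = math.sqrt(1.0 + 4.0 + 9.0)
x_hat = [1.0 / norm, 2.0 / norm, 3.0 / norm]   # arbitrary unit vector
B = magnetic_field(x_hat)
deviation = max(abs(B[a][i] - x_hat[i] * x_hat[a]) for a in range(3) for i in range(3))
print(deviation)
```

A deviation at the level of rounding errors supports the second line of eq. (10.38), from which the flux 4π/e follows by integrating over the sphere.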
Until now, we have only discussed the implications of requiring a finite energy on
the asymptotic form of the scalar field and of the gauge potentials. In order to obtain
their values at finite distance, one may make the following ansatz:

$$ \Phi^a(x) = v\, \hat{x}_a\, f(|x|) , \qquad A^a_i(x) = \epsilon_{iab}\, \frac{\hat{x}_b}{e\, |x|}\, g(|x|) , \tag{10.40} $$

where f, g are two functions that can be determined from the classical equations of
motion. From this solution over the entire space, one sees that the monopole is an
extended object made of two parts:

• A compact core, of radius $R_m \sim M_W^{-1}$, in which the SU(2) symmetry is unbroken and the vector bosons are all massless. One may view the core as a cloud of highly virtual gauge bosons and scalars.

• Beyond this radius, a halo in which the SU(2) symmetry is spontaneously broken. In this halo, up to a gauge transformation, the scalar field is that of the ordinary broken vacuum, the vector bosons $A^{1,2}$ are massive, and the $A^3$ field is massless, with a tail that corresponds to a radial U(1) magnetic field.

Given these fields, the total energy of the field configuration can be identified with the mass of the monopole, since it is static (in contrast with Dirac's point-like monopole in quantum electrodynamics, whose mass is not constrained). It takes the form

$$ M_m = \frac{4\pi}{e^2}\, M_W\, C(\lambda/e^2) , \tag{10.41} $$

where $C(\lambda/e^2)$ is a slowly varying function of the ratio of coupling constants, of order unity. Note that the core and the halo contribute comparable amounts to this mass. Interestingly, the size $M_W^{-1}$ of this monopole is much larger (by a factor $\alpha^{-1} = 4\pi/e^2$) than its Compton wavelength $M_m^{-1}$. Therefore, when α ≪ 1, the monopole receives very small quantum corrections and is essentially a classical object.
We have argued earlier that the topologically non-trivial configurations of the scalar field that lead to a finite energy can be classified according to the homotopy group $\pi_2(S_2)$. Since this group is the group ℤ of the integers, there are monopole solutions with any magnetic charge multiple of $e^{-1}$ (the solution we have constructed explicitly above has topological number 1), i.e.

$$ g e = n , \quad n \in \mathbb{Z} . \tag{10.42} $$

Therefore, in this field theoretical monopole solution, the electrical charge would also be naturally quantized. At first sight, eqs. (10.42) and (10.19) appear to differ by a factor 1/2. Note however, that in the SU(2) model we are considering in this section, it is possible to introduce matter fields in the fundamental representation⁶ that carry a U(1) electrical charge ±e/2 (this is the smallest possible electrical charge in this model). Thus, if rewritten in terms of this minimal electrical charge, the monopole quantization condition (10.42) is in fact identical to Dirac's condition. Although the Georgi-Glashow model studied in this section is no longer considered as phenomenologically relevant, theories that unify the strong and electroweak interactions into a unique compact Lie group (such as SU(5) for instance) do have magnetic monopoles.

6 If Ψ is a doublet that lives in this representation, the covariant derivative acting on it reads:

$$ D_\mu \Psi = \partial_\mu \Psi - i e\, A^a_\mu t^a_f\, \Psi = \partial_\mu \Psi - i\, \frac{e}{2}\, A^3_\mu \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \Psi . $$

10.3.3 Topological considerations

In the previous two subsections, we have encountered two seemingly different topological classifications of magnetic monopoles. The Dirac monopole appeared closely related to the mappings from a circle (the equator between the two hemispheres in figure 10.7) to the group U(1), whose classes are the elements of the homotopy group $\pi_1(U(1)) = \mathbb{Z}$. In contrast, the monopole discussed in the Georgi-Glashow model was related to the behaviour of the scalar field at large distance, i.e. to mappings from the 2-sphere $S_2$ to the manifold $M_0$ of the minima of the scalar potential V(Φ), whose equivalence classes are the elements of the homotopy group $\pi_2(M_0) = \mathbb{Z}$.

Let us now argue that these two ways of viewing monopoles are in fact equivalent. In order to make this discussion more general, consider a gauge theory with internal group G, coupled to a scalar, spontaneously broken to a residual gauge symmetry of group H. Let us denote $M_0$ the manifold of the minima of the scalar potential. This manifold is invariant under transformations of G. Given a minimum $\Phi_0$, the other minima can be obtained by multiplying $\Phi_0$ by the elements of G:

$$ M_0 = \big\{ \Phi \;\big|\; \Phi = \Omega\, \Phi_0 \,;\; \Omega \in G \big\} . \tag{10.43} $$

(Here, we are assuming that there are no accidental degeneracies among the minima, i.e. no minima $\Phi_0$ and $\Phi_0'$ that are not related by a gauge transformation.) The manifold defined in eq. (10.43) is in fact the coset G/H,

$$ M_0 = G/H . \tag{10.44} $$

This pattern of spontaneous symmetry breaking is illustrated in figure 10.6.

Figure 10.6: Illustration of the symmetry breaking pattern. H is the residual invariance after choosing a minimum $\Phi_0$. The coset G/H is the manifold that holds the minima of V(Φ) (a 2-sphere in the case of the Georgi-Glashow model).


The first way of classifying monopoles is to consider the gauge field on a sphere, as was done in subsection 10.3.1. At large distance compared to the inverse mass of the bosons that became massive due to spontaneous symmetry breaking, only the massless gauge bosons contribute, and the corresponding gauge fields live in the algebra h of the residual group H. We can reproduce the argument made at the end of subsection 10.3.1. The gauge potentials in the upper and lower hemispheres are related on the equator by a gauge transformation Ω(φ) ∈ H, that must be single-valued as the azimuthal angle φ wraps around the equator. Ω(φ) is therefore a mapping from the circle $S_1$ to the residual gauge group H. These mappings can be grouped into classes that differ by their winding number. In this general setting, we may adopt the winding number as the definition of the product eg of the electric charge by the magnetic charge enclosed by the sphere. Note that $\pi_1(H)$ is discrete, and therefore the winding number can vary only by finite jumps⁷. Moreover, the mapping Ω(φ) on the equator is a smooth function of the azimuthal angle φ and of the radius R of the sphere. Consequently, the winding number must be independent of the radius R. From this fact, two different situations may arise:

• The relevant gauge fields belong to h all the way down to zero radius. In this case, the magnetic charge is independent of the radius of the sphere at all R, which means that the monopole is a point-like singularity at the origin, like the original Dirac monopole.

• There exists a short-distance core in which the gauge fields live in an algebra which is larger than h (possibly the algebra g before symmetry breaking). Inside this core, the above argument is no longer valid, and the magnetic charge inside the sphere may vary continuously with the radius. In this case, the monopole is an extended object whose size is the radius of the core (its magnetic charge is spread out in the core).

7 Therefore, it must be conserved by time evolution. Indeed, time evolution is continuous, and the only way for a discrete quantity to evolve continuously is to be constant.

Figure 10.7: Decomposition of the sphere into two hemispheres with gauge potentials A and A′ (both valued in h), related on the equator by Ω(φ) ∈ H.

Alternatively, we may construct a monopole as a non-trivial classical field configuration that minimizes the energy, by starting from the behaviour at infinity of the scalar field. In order to have a finite energy, the scalar field should go to a minimum of V(Φ) when |x| → ∞. The asymptotic scalar field is therefore a mapping from the 2-sphere $S_2$ to $M_0 = G/H$, and it leads to a classification of the classical field configurations based on the homotopy group $\pi_2(G/H)$. The correspondence between the two points of view is based on the following relationship,

$$ \pi_2(G/H) = \pi_1(H)/\pi_1(G) . \tag{10.45} $$

For a simply connected Lie group G (e.g., all the SU(N)), the first homotopy group is trivial, $\pi_1(G) = \{0\}$, and we have

$$ \pi_2(G/H) = \pi_1(H) , \tag{10.46} $$

hence the equivalence between the two ways of classifying monopoles.
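
For H = U(1), the winding number that labels the classes of mappings from the equator to H can be computed by accumulating phase increments along the circle. The following is a minimal sketch: the function names are invented for the example, and the transition function is taken to be Ω(φ) = exp(−2i e g φ), as follows from eqs. (10.21) and (10.25) with Ω(0) = 1. The overall sign of the result depends on the orientation convention for the equator.

```python
import cmath
import math

def winding_number(omega, samples=2000):
    """Winding number of a U(1)-valued map phi -> omega(phi) on [0, 2 pi]."""
    total = 0.0
    prev = omega(0.0)
    for k in range(1, samples + 1):
        cur = omega(2.0 * math.pi * k / samples)
        total += cmath.phase(cur / prev)   # phase increment, in (-pi, pi]
        prev = cur
    return round(total / (2.0 * math.pi))

def dirac_transition(e, g):
    """Transition function on the equator for charges e and g."""
    return lambda phi: cmath.exp(-2j * e * g * phi)

print(winding_number(dirac_transition(0.5, 1.0)))   # e g = 1/2
print(winding_number(dirac_transition(1.0, 1.5)))   # e g = 3/2
# A smooth deformation of a map of winding number 2 leaves the integer unchanged.
print(winding_number(lambda phi: cmath.exp(1j * (2.0 * phi + 0.3 * math.sin(phi)))))
```

The integer returned equals −2eg for the Dirac transition function, which is one way of seeing why the winding number can serve as a definition of the product eg.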



10.4 Instantons
Until now, all the extended field configurations we have encountered were time
independent. After integration over time, their action is infinite, and therefore they do
not contribute to path integrals. In this section, we will discuss field configurations of
finite action, called instantons, that are localized both in space and in time. Consider
a Yang-Mills theory in D-dimensional Euclidean space, whose action reads
$$ S[A] \equiv \frac{1}{4} \int d^D x \; F^{ij}_a(x)\, F^{ij}_a(x) . \tag{10.47} $$

(We use Latin indices i, j, k, · · · for Lorentz indices in Euclidean space.) Instantons
are non-trivial (i.e. not pure gauges in the entire spacetime) gauge field configurations
that realize local minima of this action.

10.4.1 Asymptotic behaviour


In order to have a finite action, these fields must go to a pure gauge when |x| → ∞,

$$ A^i_a\, t^a \underset{|x|\to\infty}{\to} \frac{i}{g}\, \Omega^\dagger(\hat{x})\, \partial_i\, \Omega(\hat{x}) , \tag{10.48} $$

where $\Omega(\hat{x})$ is an element of the gauge group that depends only on the orientation $\hat{x}$. Since multiplying $\Omega(\hat{x})$ by a constant group element $\Omega_0$ does not change the asymptotic gauge potential, we can always arrange that $\Omega(\hat{x}_0) = 1$ for some fixed orientation $\hat{x}_0$. Note that a gauge potential such as (10.48), that becomes a pure gauge at large distance, must decrease at least as fast as $|x|^{-1}$. More precisely, we may write

$$ A^i_a(x)\, t^a = \underbrace{\frac{i}{g}\, \Omega^\dagger(\hat{x})\, \partial_i\, \Omega(\hat{x})}_{\sim\, |x|^{-1}} + \underbrace{a_i(x)}_{\ll\, |x|^{-1}} . \tag{10.49} $$

The field strength associated to such a field decreases faster than $|x|^{-2}$, and therefore the corresponding action is finite in D = 4 dimensions. There is in fact a scaling argument showing that instanton solutions can only exist in four dimensions. Given an instanton field configuration $A^i(x)$ and a scaling factor R, let us define

$$ A^i_R(x) \equiv \frac{1}{R}\, A^i(x/R) . \tag{10.50} $$

Since classical Yang-Mills theory is scale invariant, the field $A^i_R$ is also an extremum of the action (i.e. a solution of the classical Yang-Mills equations) if $A^i$ is. The action of this rescaled field is given by

$$ S[A_R] = R^{D-4}\, S[A] . \tag{10.51} $$

Therefore, given an instanton $A^i(x)$, we may continuously deform it into another field configuration $A^i_R(x)$ whose action is multiplied by $R^{D-4}$. Unless D = 4, this action has a higher or lower value, in contradiction with the fact that $A^i$ was a local extremum⁸. Thus, non-trivial local extrema of the classical Euclidean Yang-Mills action can only exist in D = 4. In four dimensions, if $A^i$ is an instanton, then $A^i_R$ is also an instanton (with the same value of the action). Thus, classical instantons can exist with any size. But this degeneracy is lifted by quantum corrections, that introduce a scale into Yang-Mills theory via the running coupling.

Figure 10.8: Cartoon of an instanton (the illustration is for D = 3, although instantons actually exist in D = 4). The sphere $S_3$ is in fact infinitely far away from the center of the instanton, and the field is a pure gauge ($F^{ij} = 0$) on it.
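
The power of R in eq. (10.51) can be traced explicitly. The lines below are a sketch of the counting, assuming only the definitions (10.47) and (10.50): each derivative and each power of the potential contributes one factor of 1/R, so both the linear and the quadratic terms of the field strength scale in the same way, and the change of variables y ≡ x/R supplies the factor $R^D$.

```latex
\begin{aligned}
F^{ij}_R(x) &= \frac{1}{R^2}\, F^{ij}(x/R) , \\
S[A_R] &= \frac{1}{4} \int d^D x \; \frac{1}{R^4}\, F^{ij}_a(x/R)\, F^{ij}_a(x/R)
        = \frac{R^{D}}{4\, R^{4}} \int d^D y \; F^{ij}_a(y)\, F^{ij}_a(y)
        = R^{D-4}\, S[A] , \qquad y \equiv x/R .
\end{aligned}
```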

10.4.2 Bogomol’nyi inequality and self-duality condition


In the study of instantons, a useful variant of Bogomol'nyi's trick is to start from the following obvious inequality,

$$ 0 \le \int d^4x\, \Big( F^a_{ij} \mp \frac{1}{2}\, \epsilon_{ijkl}\, F^a_{kl} \Big)^2 , \tag{10.52} $$

which leads to

$$
\begin{aligned}
0 &\le \int d^4x\, \Big[ F^a_{ij} F^a_{ij} \mp \epsilon_{ijkl}\, F^a_{ij} F^a_{kl} + \frac{1}{4}\, \epsilon_{ijkl}\, \epsilon_{ijmn}\, F^a_{kl} F^a_{mn} \Big] \\
&= \int d^4x\, \Big[ F^a_{ij} F^a_{ij} \mp \epsilon_{ijkl}\, F^a_{ij} F^a_{kl} + \frac{1}{2}\, \big( \delta_{km}\delta_{ln} - \delta_{kn}\delta_{lm} \big)\, F^a_{kl} F^a_{mn} \Big] \\
&= \int d^4x\, \Big[ 2\, F^a_{ij} F^a_{ij} \mp \epsilon_{ijkl}\, F^a_{ij} F^a_{kl} \Big] .
\end{aligned} \tag{10.53}
$$

By choosing appropriately the sign, this can be rearranged into a lower bound for the action:

$$ S[A] \ge \frac{1}{8}\, \epsilon_{ijkl} \int d^4x \; F^a_{ij} F^a_{kl} , \tag{10.54} $$

known as Bogomol'nyi's inequality. Interestingly, we recognize in the right hand side an integral identical to the one that enters in the θ-term of Yang-Mills theories (see section 4.5) or in the anomaly function (see section 3.5 and chapter 9). This inequality becomes an equality when:

$$ F^a_{ij} = \pm \frac{1}{2}\, \epsilon_{ijkl}\, F^a_{kl} . \tag{10.55} $$

A solution that obeys this condition is by construction a minimum of the Euclidean action S[A], and therefore a solution of the classical Yang-Mills equations. But like in the case of domain walls, finding field configurations that fulfill this self-duality condition is somewhat simpler than solving directly the Yang-Mills equations. Thus, from now on, we will look for gauge fields that fulfill eq. (10.55) and go to a pure gauge as |x| → ∞.

8 The only exception to this reasoning occurs if S[A] = 0. But this happens only in the trivial situation where $A^i$ is a pure gauge in the entire spacetime.
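
The algebra that leads from eq. (10.52) to the last line of eq. (10.53) holds pointwise, for any antisymmetric $F_{ij}$, and can be spot-checked numerically. The sketch below is illustrative only: it uses a single colour component, a random antisymmetric 4×4 array in place of the field strength at one point, and helper names that do not come from the text.

```python
import random

def eps4(i, j, k, l):
    """4-dimensional Levi-Civita symbol for indices in {0, 1, 2, 3}."""
    idx = (i, j, k, l)
    if len(set(idx)) < 4:
        return 0
    inversions = sum(1 for a in range(4) for b in range(a + 1, 4) if idx[a] > idx[b])
    return -1 if inversions % 2 else 1

def random_antisymmetric(rng):
    """Random antisymmetric 4x4 array, standing in for F_ij at one point."""
    F = [[0.0] * 4 for _ in range(4)]
    for i in range(4):
        for j in range(i + 1, 4):
            F[i][j] = rng.uniform(-1.0, 1.0)
            F[j][i] = -F[i][j]
    return F

def dual(F):
    """(1/2) eps_ijkl F_kl."""
    r = range(4)
    return [[0.5 * sum(eps4(i, j, k, l) * F[k][l] for k in r for l in r)
             for j in r] for i in r]

def both_sides(F, sign):
    """Integrand of (10.52) and of the last line of (10.53); sign=+1 means F - dual."""
    r = range(4)
    D = dual(F)
    lhs = sum((F[i][j] - sign * D[i][j]) ** 2 for i in r for j in r)
    ff = sum(F[i][j] ** 2 for i in r for j in r)
    eff = sum(eps4(i, j, k, l) * F[i][j] * F[k][l]
              for i in r for j in r for k in r for l in r)
    return lhs, 2.0 * ff - sign * eff

F = random_antisymmetric(random.Random(0))
for sign in (+1, -1):
    print(both_sides(F, sign))
```

Both entries of each printed pair agree and are non-negative, which is the pointwise content of the bound (10.54).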

10.4.3 Topological classification


In D = 4, the functions $\Omega(\hat{x})$ that define the asymptotic behaviour of instantons map the 3-sphere $S_3$ into the gauge group G,

$$ \Omega : S_3 \mapsto G , \tag{10.56} $$

with a fixed value $\Omega(\hat{x}_0) = 1$. These functions can be grouped into topological classes, such that mappings belonging to the same class can be continuously deformed into one another. The set of these classes can be endowed with a group structure, called the third homotopy group of G and denoted $\pi_3(G)$ (for any SU(N) group with N ≥ 2, we have $\pi_3(G) = \mathbb{Z}$). Note that the asymptotic forms of the fields $A^i$ and $A^i_R$ are identical, implying that these two instantons belong to the same topological class. Since their actions are identical in four dimensions, this scaling provides a continuous family of instantons that belong to the same topological class and have the same action. This is in fact more general: we will show later that the action of an instanton depends only on the topological class of the instanton, and therefore can only vary by discrete amounts.

10.4.4 Minimal action


Let us assume that we have found a self-dual gauge field configuration, that realizes the equality in eq. (10.54). In order to calculate its action, we can use the fact that $\epsilon_{ijkl} F^a_{ij} F^a_{kl}$ is a total derivative,

$$ \frac{1}{2}\, \epsilon_{ijkl}\, F^a_{ij} F^a_{kl} = \partial_i\, \underbrace{\Big[ \epsilon_{ijkl}\, \Big( A^a_j F^a_{kl} - \frac{g}{3}\, f^{abc}\, A^a_j A^b_k A^c_l \Big) \Big]}_{K_i} . \tag{10.57} $$

(This property was derived in section 4.5.) The vector $K_i$ can also be written as a trace of objects belonging to the fundamental representation:

$$ K_i = 2\, \epsilon_{ijkl}\; {\rm tr}\, \Big( A_j F_{kl} + \frac{2ig}{3}\, A_j A_k A_l \Big) . \tag{10.58} $$

Since the integrand in the right hand side of eq. (10.54) is a total derivative, one may use Stokes's theorem in order to rewrite the integral as a 3-dimensional integral extended to a spherical hypersurface $S_R$ of radius R → ∞:

$$ S_{\rm min}[A] \equiv \frac{1}{8}\, \epsilon_{ijkl} \int d^4x \; F^a_{ij} F^a_{kl} = \lim_{R\to\infty} \frac{1}{4} \int_{S_R} d^3S_i \; K_i . \tag{10.59} $$

Thus, the minimum of the action depends only on the behaviour of the gauge field at large distance (this does not mean that the action does not depend on details of the gauge field in the interior, but more simply that the gauge fields that realize the minima are fully determined in the bulk by their asymptotic behaviour). From the earlier discussion of the asymptotic behaviour of instanton solutions, we know that

$$ A_i(x) \underset{|x|\to\infty}{\sim} |x|^{-1} , \qquad F_{ij}(x) \underset{|x|\to\infty}{\ll} |x|^{-2} . \tag{10.60} $$

Therefore, in the current $K_i$, the term $A_j F_{kl}$ is negligible compared with the term $A_j A_k A_l$ at large distance, and we can also write

$$
\begin{aligned}
S_{\rm min}[A] &= \lim_{R\to\infty} \frac{ig}{3} \int_{S_R} d^3S_i \; \epsilon_{ijkl}\; {\rm tr}\, \big( A_j A_k A_l \big) \\
&= \lim_{R\to\infty} \frac{1}{3 g^2} \int_{S_R} d^3S_i \; \epsilon_{ijkl}\; {\rm tr}\, \big( \Omega^\dagger (\partial_j \Omega)\, \Omega^\dagger (\partial_k \Omega)\, \Omega^\dagger (\partial_l \Omega) \big) ,
\end{aligned} \tag{10.61}
$$

where $\Omega(\hat{x})$ is the group element that defines the asymptotic pure gauge behaviour of the gauge potential in the direction $\hat{x}$. In this expression, each derivative brings a factor $R^{-1}$, while the domain of integration scales as $R^3$. The result is therefore independent of the radius of the sphere and we can ignore the limit R → ∞.
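
The numerical prefactors in eq. (10.61) follow from simple bookkeeping, sketched here using only eqs. (10.48), (10.58) and (10.59): the factor 1/4 of eq. (10.59) multiplies the cubic term of $K_i$, and the asymptotic pure gauge form contributes one factor i/g per potential.

```latex
\begin{aligned}
\frac{1}{4}\, K_i
 &\;\underset{|x|\to\infty}{\to}\; \frac{1}{4} \cdot 2 \cdot \frac{2ig}{3}\; \epsilon_{ijkl}\, {\rm tr}\, \big( A_j A_k A_l \big)
  = \frac{ig}{3}\; \epsilon_{ijkl}\, {\rm tr}\, \big( A_j A_k A_l \big) , \\
\frac{ig}{3}\, \Big( \frac{i}{g} \Big)^{3}
 &= \frac{ig}{3} \cdot \frac{-i}{g^{3}} = \frac{1}{3 g^{2}} .
\end{aligned}
```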
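On this sphere, let us choose a system of coordinates made of three variables $(\theta_1, \theta_2, \theta_3)$, such that the volume element in $S_R$ is $d\theta_1 d\theta_2 d\theta_3$. To rewrite the previous integral more explicitly in terms of these variables, it is convenient to introduce a fourth (radial) coordinate $\theta_0 \equiv |x|$. The coordinates $(\theta_0, \theta_1, \theta_2, \theta_3)$ are thus coordinates in $\mathbb{R}^4$, and $d^4x = d\theta_0 d\theta_1 d\theta_2 d\theta_3$. The volume element on the sphere $S_R$ is $d\theta_1 d\theta_2 d\theta_3 = d^4x\, \delta(\theta_0 - R)$. Noting that $\hat{x}_i = \partial\theta_0/\partial x_i$, we can write

$$ d^3S_i = \hat{x}_i\, d\theta_1 d\theta_2 d\theta_3 = \frac{\partial\theta_0}{\partial x_i}\, d\theta_1 d\theta_2 d\theta_3 = d^4x \; \delta(\theta_0 - R)\, \frac{\partial\theta_0}{\partial x_i} \tag{10.62} $$

and the minimal action becomes

$$
\begin{aligned}
S_{\rm min}[A] = \frac{1}{3 g^2} \int d^4x \; \delta(\theta_0 - R)\; & \epsilon_{ijkl}\, \frac{\partial\theta_0}{\partial x_i} \frac{\partial\theta_a}{\partial x_j} \frac{\partial\theta_b}{\partial x_k} \frac{\partial\theta_c}{\partial x_l} \\
& \times\, {\rm tr}\, \Big( \Omega^\dagger(\theta)\, \frac{\partial\Omega(\theta)}{\partial\theta_a}\, \Omega^\dagger(\theta)\, \frac{\partial\Omega(\theta)}{\partial\theta_b}\, \Omega^\dagger(\theta)\, \frac{\partial\Omega(\theta)}{\partial\theta_c} \Big) ,
\end{aligned} \tag{10.63}
$$

where we have rewritten the derivatives with respect to $x_i$ in terms of derivatives with respect to $\theta_a$ (the implicit sums on a, b, c run over the indices 1, 2, 3 only, because the group element Ω depends only on the orientation $\hat{x}$). Finally, we may use:

$$ \epsilon_{ijkl}\, \frac{\partial\theta_0}{\partial x_i} \frac{\partial\theta_a}{\partial x_j} \frac{\partial\theta_b}{\partial x_k} \frac{\partial\theta_c}{\partial x_l} = \det \Big( \frac{\partial(\theta_0 \theta_1 \theta_2 \theta_3)}{\partial(x_1 x_2 x_3 x_4)} \Big)\; \underbrace{\epsilon_{0abc}}_{=\, \epsilon_{abc}} . \tag{10.64} $$

The determinant is nothing but the Jacobian of the coordinate transformation $\{x_i\} \to \{\theta_a\}$. Therefore, we obtain

$$ S_{\rm min}[A] = \frac{1}{3 g^2} \int d\theta_1 d\theta_2 d\theta_3 \; \epsilon_{abc}\; {\rm tr}\, \Big( \Omega^\dagger(\theta)\, \frac{\partial\Omega(\theta)}{\partial\theta_a}\, \Omega^\dagger(\theta)\, \frac{\partial\Omega(\theta)}{\partial\theta_b}\, \Omega^\dagger(\theta)\, \frac{\partial\Omega(\theta)}{\partial\theta_c} \Big) . \tag{10.65} $$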

10.4.5 Cartan-Maurer invariant


Definition : In order to calculate the integral that appears in eq. (10.65), let us make a mathematical digression. Consider a d-dimensional manifold S, of coordinates $(\theta_1, \theta_2, \cdots, \theta_d)$, a manifold M that may be viewed as a matrix representation of a Lie group, and a mapping Ω from S to M:

$$ (\theta_1, \theta_2, \cdots, \theta_d) \in S \;\longrightarrow\; \Omega(\theta_1, \theta_2, \cdots, \theta_d) \in M . \tag{10.66} $$

The Cartan-Maurer form F[Ω] is an integral that generalizes the one encountered earlier:

$$ F[\Omega] \equiv \int d\theta_1 \cdots d\theta_d \; \epsilon_{i_1 \cdots i_d}\; {\rm tr}\, \Big( \Omega^\dagger(\theta)\, \frac{\partial\Omega(\theta)}{\partial\theta_{i_1}} \cdots \Omega^\dagger(\theta)\, \frac{\partial\Omega(\theta)}{\partial\theta_{i_d}} \Big) , \tag{10.67} $$

where $\epsilon_{i_1 \cdots i_d}$ is the d-dimensional completely antisymmetric tensor, normalized according to $\epsilon_{12\cdots d} = +1$. In d dimensions, this tensor transforms as follows under circular permutations:

$$ \epsilon_{i_1 \cdots i_d} = (-1)^{d-1}\, \epsilon_{i_2 \cdots i_d i_1} . \tag{10.68} $$

Using the cyclicity of the trace, we conclude that F[Ω] = 0 if the dimension d is even. In the following, we thus restrict the discussion to the case where d is odd⁹.

Coordinate independence : Consider now another system of coordinates on S,


that we denote θ′i . We have:
 
i1 ···id
∂θ′j1 ∂θ′jd ∂(θ′i )
ǫ ··· = det ǫj1 ···jd , (10.69)
∂θi1 ∂θid ∂(θj )
and the determinant in the right hand side is the Jacobian of the coordinate transfor-
mation. We thus obtain
Z  
∂Ω(θ) ∂Ω(θ)
F[Ω] = dθ′1 · · · dθ′d ǫj1 ···jd tr Ω† (θ) · · · Ω† (θ) , (10.70)
∂θ′j1 ∂θ′jd

which is identical to eq. (10.67), except for the fact that it is expressed in terms of
the new coordinates θ′i . This proves that F[Ω] is independent of the choice of the
coordinate system on S, and is a property of the manifold S itself.

Change under a small variation of Ω : Let us now study the change of F[Ω]
when we vary the mapping Ω by δΩ. Thanks to the cyclicity of the trace, the
variation of each factor Ω† ∂Ω/∂θi gives the same contribution to the variation of
F[Ω]. Therefore, it is sufficient to consider one of these variations, and to multiply its
contribution by the number of factors, d:
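\[ \delta F[\Omega] = d\int d\theta_1\cdots d\theta_d\;\epsilon_{i_1\cdots i_d}\,\mathrm{tr}\Big(\Omega^\dagger(\theta)\frac{\partial\Omega(\theta)}{\partial\theta_{i_1}}\cdots\delta\Big[\Omega^\dagger(\theta)\frac{\partial\Omega(\theta)}{\partial\theta_{i_d}}\Big]\Big) . \tag{10.71} \]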
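The variation of the last factor inside the trace can be written as
\[ \delta\Big[\Omega^\dagger(\theta)\frac{\partial\Omega(\theta)}{\partial\theta_{i_d}}\Big] = -\Omega^\dagger(\theta)\,\delta\Omega(\theta)\,\underbrace{\Omega^\dagger(\theta)\frac{\partial\Omega(\theta)}{\partial\theta_{i_d}}}_{-\frac{\partial\Omega^\dagger(\theta)}{\partial\theta_{i_d}}\Omega(\theta)} + \Omega^\dagger(\theta)\frac{\partial\,\delta\Omega(\theta)}{\partial\theta_{i_d}} = \Omega^\dagger(\theta)\,\frac{\partial\big(\delta\Omega(\theta)\,\Omega^\dagger(\theta)\big)}{\partial\theta_{i_d}}\,\Omega(\theta) . \tag{10.72} \]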
9 This is the case in the study of instantons, since in this case the manifold S is the 3-sphere S3 .

Then, integrating by parts with respect to θid , we obtain:


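\[ \delta F[\Omega] = -d\int d\theta_1\cdots d\theta_d\;\epsilon_{i_1\cdots i_d}\,\mathrm{tr}\Big(\frac{\partial}{\partial\theta_{i_d}}\Big[\frac{\partial\Omega(\theta)}{\partial\theta_{i_1}}\Omega^\dagger(\theta)\cdots\frac{\partial\Omega(\theta)}{\partial\theta_{i_{d-1}}}\Omega^\dagger(\theta)\Big]\,\delta\Omega(\theta)\,\Omega^\dagger(\theta)\Big) . \tag{10.73} \]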

All the terms containing a factor ∂²Ω/∂θ_{id}∂θ_{ia} vanish because the second derivative
is symmetric under the exchange of the indices id and ia , while the prefactor
ε_{i1···id} is antisymmetric. The remaining terms are those where the derivative with
respect to θ_{id} acts on one of the factors Ω† . There are d − 1 such terms, which after
some reorganization can be written as
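\[ \delta F[\Omega] = -d\int d\theta_1\cdots d\theta_d\;\underbrace{\sum_{\sigma\ \text{cyclic perm. of}\ 2\cdots d}\epsilon_{i_1 i_{\sigma(2)}\cdots i_{\sigma(d)}}}_{0}\;\mathrm{tr}\Big(\frac{\partial\Omega^\dagger}{\partial\theta_{i_1}}\frac{\partial\Omega}{\partial\theta_{i_2}}\cdots\frac{\partial\Omega^\dagger}{\partial\theta_{i_d}}\,\delta\Omega\Big) . \tag{10.74} \]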

ǫi1 ···id changes sign under a one-step cyclic permutation of its last d − 1 indices.
Therefore, the d − 1 terms in the sum exactly cancel since d − 1 is even, and we have

δF[Ω] = 0 . (10.75)

Therefore, F[Ω] is invariant under small changes of Ω, which implies that F[Ω] can
only vary by discrete jumps. In particular, when S is the d-sphere Sd , F[Ω] depends
only on the homotopy class of Ω. These classes form a group πd (M). Moreover,
F[Ω] provides a representation of πd (M): if [Ω] denotes the homotopy class to which
Ω belongs, we have

F[[Ω1 ] × [Ω2 ]] = F[Ω1 ] + F[Ω2 ] . (10.76)

(We denote by × the group composition in πd (M).) As a consequence, if there exists
an Ω for which the Cartan-Maurer invariant is nonzero, then all its integer multiples
can also be obtained, thereby proving that the homotopy group πd (M) contains ℤ.

Case of a Lie group target manifold : Let us now specialize to the case where the
target manifold M is a d-dimensional Lie group H, and exploit its group structure
in order to obtain simpler expressions. In this case, the θa ’s can also be used as
coordinates on H. Consider two elements Ω1 and Ω2 of H, represented respectively
by the coordinates θa and φa . Their product Ω2 Ω1 is an element of H of coordinates
ψ(θ, φ) (the group multiplication determines how ψ depends on θ and φ). Since we

have shown that the choice of coordinates on S is irrelevant, we may choose them in
such a way that the function Ω(θ) is a representation of the group H, i.e.

Ω(φ)Ω(θ) = Ω(ψ(θ, φ)) . (10.77)

By differentiating this equality with respect to ψj at fixed φ, we obtain


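\[ \Omega(\phi)\,\frac{\partial\Omega(\theta)}{\partial\theta_i}\,\frac{\partial\theta_i}{\partial\psi_j} = \frac{\partial\Omega(\psi)}{\partial\psi_j} , \tag{10.78} \]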

and after left multiplication by Ω† (ψ), this leads to


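\[ \Omega^\dagger(\theta)\,\frac{\partial\Omega(\theta)}{\partial\theta_i} = \frac{\partial\psi_j}{\partial\theta_i}\,\Omega^\dagger(\psi)\,\frac{\partial\Omega(\psi)}{\partial\psi_j} . \tag{10.79} \]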
Using (10.69), the integrand of F[Ω] at the point θ can be expressed as

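\[ \epsilon_{i_1\cdots i_d}\,\mathrm{tr}\Big(\Omega^\dagger(\theta)\frac{\partial\Omega(\theta)}{\partial\theta_{i_1}}\cdots\Omega^\dagger(\theta)\frac{\partial\Omega(\theta)}{\partial\theta_{i_d}}\Big) = \det\!\left[\frac{\partial(\psi)}{\partial(\theta)}\right]\epsilon_{j_1\cdots j_d}\,\mathrm{tr}\Big(\Omega^\dagger(\psi)\frac{\partial\Omega(\psi)}{\partial\psi_{j_1}}\cdots\Omega^\dagger(\psi)\frac{\partial\Omega(\psi)}{\partial\psi_{j_d}}\Big) , \tag{10.80} \]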

where ψ can be any fixed reference point in the group. In the right side, the integration
variable θ now appears only inside the determinant.
The Lie group H being a smooth manifold, it can be endowed with a metric tensor
γij (θ), that transforms as follows in a change of coordinates
\[ \gamma_{ij}(\psi) = \gamma_{kl}(\theta)\,\frac{\partial\theta_k}{\partial\psi_i}\,\frac{\partial\theta_l}{\partial\psi_j} . \tag{10.81} \]
Given a mapping Ω(θ) between coordinates and group elements, a possible choice
for the metric is given by10
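\[ \gamma_{ij}(\theta) = -\frac12\,\mathrm{tr}\Big(\Omega^\dagger(\theta)\frac{\partial\Omega(\theta)}{\partial\theta_i}\,\Omega^\dagger(\theta)\frac{\partial\Omega(\theta)}{\partial\theta_j}\Big) . \tag{10.82} \]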
Moreover, for any such metric γij (θ), we have:
\[ \det\!\left[\frac{\partial(\psi)}{\partial(\theta)}\right] = \sqrt{\frac{\det\gamma(\theta)}{\det\gamma(\psi)}} . \tag{10.83} \]
10 In the algebra of a compact Lie group, the Killing form K(X, Y) ≡ tr adX adY is a negative definite
inner product, from which one can define a distance on the group manifold in the vicinity of the origin
(see section 4.2.4). Eq. (10.82) extends this definition globally to the entire group, in a way which is
invariant under left and right group action.

Therefore, the Cartan-Maurer invariant F[Ω] takes the following form

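\[ F[\Omega] = \epsilon_{j_1\cdots j_d}\,\mathrm{tr}\Big(\Omega^\dagger(\psi)\frac{\partial\Omega(\psi)}{\partial\psi_{j_1}}\cdots\Omega^\dagger(\psi)\frac{\partial\Omega(\psi)}{\partial\psi_{j_d}}\Big)\times\frac{1}{\sqrt{\det\gamma(\psi)}}\int d^d\theta\,\sqrt{\det\gamma(\theta)} , \tag{10.84} \]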

in which all the terms that do not depend on θ have been factored out in front of the
integral. In fact, d^dθ √(det γ(θ)) is an invariant measure on the Lie group, and the
integral is therefore the volume of the group. In other words, the previous formula
exploits the group invariance in order to rewrite the Cartan-Maurer invariant as the
product of the integrand evaluated at a fixed point by the volume of the group. Since
ψ is arbitrary in this expression, we may choose the value ψ0 that corresponds to the
group identity. Furthermore, group elements in the vicinity of the identity may be
written as

\[ \Omega(\psi) \underset{\psi\to\psi_0}{\approx} 1 + 2i\,(\psi-\psi_0)_a\,t^a , \tag{10.85} \]

where the ta ’s are the generators of the Lie algebra h. Then, the derivatives read
simply

\[ \frac{\partial\Omega(\psi)}{\partial\psi_a}\bigg|_{\psi_0} = 2i\,t^a . \tag{10.86} \]

From this, we obtain the following compact expression for F[Ω]:


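\[ F[\Omega] = (2i)^d\,\epsilon_{i_1\cdots i_d}\,\mathrm{tr}\big(t^{i_1}\cdots t^{i_d}\big)\,\frac{1}{\sqrt{\det\gamma(\psi_0)}}\int d^d\theta\,\sqrt{\det\gamma(\theta)} . \tag{10.87} \]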

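Cartan-Maurer invariant for H = SU(2) : Consider the following mapping from
the 3-sphere S3 to the fundamental representation of SU(2):
\[ \Omega(\theta) = \begin{pmatrix} \theta_4 + i\theta_3 & \theta_2 + i\theta_1 \\ -\theta_2 + i\theta_1 & \theta_4 - i\theta_3 \end{pmatrix} = \theta_4 + 2i\,\theta_a t^a , \tag{10.88} \]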

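with t^{1,2,3} the generators of the su(2) algebra (for the fundamental representation,
the Pauli matrices divided by 2) and θ1² + θ2² + θ3² + θ4² = 1. The following identities
hold:
\[ \det\Omega(\theta) = 1 , \qquad \Omega^\dagger(\theta) = \theta_4 - 2i\,\theta_a t^a , \qquad \forall i\in\{1,2,3\},\ \ \frac{\partial\Omega(\theta)}{\partial\theta_i} = 2i\,t^i - \frac{\theta_i}{\sqrt{1-\theta^2}} . \tag{10.89} \]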

(We denote θ² ≡ θ1² + θ2² + θ3².) In the evaluation of eq. (10.82), we need traces of
products of up to four ta matrices. In the fundamental representation, they can all be
obtained from

\[ \mathrm{tr}(t^i) = 0 , \qquad t^i t^j = \frac{i}{2}\,\epsilon^{ijk}\,t^k + \frac14\,\delta^{ij} , \tag{10.90} \]

which leads to

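\[ \mathrm{tr}(t^i t^j) = \frac12\,\delta^{ij} , \qquad \mathrm{tr}(t^i t^j t^k) = \frac{i}{4}\,\epsilon^{ijk} , \qquad \mathrm{tr}(t^i t^j t^k t^l) = \frac18\,\big(\delta^{ij}\delta^{kl} + \delta^{il}\delta^{jk} - \delta^{ik}\delta^{jl}\big) . \tag{10.91} \]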
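
As a sanity check, the three trace identities of eq. (10.91) can be verified numerically with t^i = σ^i/2. This is only an illustrative sketch using NumPy; the array names (`sigma`, `t`, `eps`, `tr2`, `tr3`, `tr4`) are ours.

```python
import numpy as np

# Pauli matrices and the fundamental su(2) generators t^i = sigma^i / 2.
sigma = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)
t = sigma / 2

# Three-dimensional Levi-Civita tensor and Kronecker delta.
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k] = 1.0
    eps[j, i, k] = -1.0
delta = np.eye(3)

# Traces of products of two, three and four generators.
tr2 = np.einsum("iab,jba->ij", t, t)
tr3 = np.einsum("iab,jbc,kca->ijk", t, t, t)
tr4 = np.einsum("iab,jbc,kcd,lda->ijkl", t, t, t, t)

expected4 = (np.einsum("ij,kl->ijkl", delta, delta)
             + np.einsum("il,jk->ijkl", delta, delta)
             - np.einsum("ik,jl->ijkl", delta, delta)) / 8

ok2 = np.allclose(tr2, delta / 2)
ok3 = np.allclose(tr3, 1j * eps / 4)
ok4 = np.allclose(tr4, expected4)
print(ok2, ok3, ok4)
```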

Then, the metric tensor of eq. (10.82) reads


\[ \gamma_{ij}(\theta) = \delta_{ij} + \frac{\theta_i\theta_j}{1-\theta^2} , \tag{10.92} \]
and its determinant is
\[ \det\gamma(\theta) = \frac{1}{1-\theta^2} . \tag{10.93} \]
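
Eqs. (10.92) and (10.93) can be cross-checked numerically by inserting the mapping (10.88) and its derivatives (10.89) into the definition (10.82). The sketch below does this at one arbitrarily chosen point inside the unit ball; the function names (`omega`, `d_omega`, `metric`) and the sample point are ours.

```python
import numpy as np

sigma = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)
t = sigma / 2
identity = np.eye(2, dtype=complex)


def omega(theta):
    """Mapping of eq. (10.88), on the branch theta_4 = +sqrt(1 - theta^2)."""
    theta4 = np.sqrt(1.0 - theta @ theta)
    return theta4 * identity + 2j * np.einsum("a,aij->ij", theta, t)


def d_omega(theta):
    """Derivatives dOmega/dtheta_i of eq. (10.89), for i = 1, 2, 3."""
    theta4 = np.sqrt(1.0 - theta @ theta)
    return np.array([2j * t[i] - theta[i] / theta4 * identity for i in range(3)])


def metric(theta):
    """Metric of eq. (10.82): -1/2 tr(Omega^+ d_i Omega Omega^+ d_j Omega)."""
    om_dag = omega(theta).conj().T
    currents = np.array([om_dag @ d for d in d_omega(theta)])
    return (-0.5 * np.einsum("iab,jba->ij", currents, currents)).real


theta = np.array([0.3, -0.2, 0.5])          # arbitrary point with theta^2 < 1
theta_sq = theta @ theta
gamma = metric(theta)
gamma_expected = np.eye(3) + np.outer(theta, theta) / (1.0 - theta_sq)
print(np.allclose(gamma, gamma_expected))
print(np.linalg.det(gamma), 1.0 / (1.0 - theta_sq))
```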

Combining the above results, we obtain the following expression for the Cartan-
Maurer invariant of the homotopy class of Ω in π3 (SU(2))
\[ F[\Omega] = (2i)^3\,\epsilon_{abc}\,\mathrm{tr}(t^a t^b t^c)\int\frac{2\,d^3\theta}{\sqrt{1-\theta^2}} . \tag{10.94} \]
The factor 2 comes from the fact that there are two allowed values of θ4 for each
θ1,2,3 . Finally, we have

\[ F[\Omega] = 96\pi\int_0^1\frac{d\theta\;\theta^2}{\sqrt{1-\theta^2}} = 24\pi^2 . \tag{10.95} \]

In fact, the mapping of eq. (10.88) wraps only once around SU(2), and the above result
therefore corresponds to the topological index +1. Since 24π² is non-zero, there are
other classes of Ω’s whose Cartan-Maurer invariants are the integer multiples of this
result, and the third homotopy group is π3 (SU(2)) = ℤ. Note also that this result
extends to any Lie group that contains an SU(2) subgroup.
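
The value 24π² can be reproduced numerically from eq. (10.94). In the sketch below (variable names are ours), the colour factor 2 (2i)³ ε_abc tr(t^a t^b t^c) is evaluated with the Pauli matrices, and the radial integral is computed with a midpoint rule after the substitution θ = sin u, which removes the integrable singularity at θ = 1.

```python
import math
import numpy as np

sigma = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)
t = sigma / 2

eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k] = 1.0
    eps[j, i, k] = -1.0

# Colour factor 2 (2i)^3 eps_abc tr(t^a t^b t^c); it should be real.
colour = (2 * (2j) ** 3 * np.einsum("abc,aij,bjk,cki->", eps, t, t, t)).real

# Radial integral: int_0^1 theta^2 / sqrt(1 - theta^2) dtheta
#                = int_0^{pi/2} sin^2(u) du   (theta = sin u), midpoint rule.
n_points = 200_000
step = (math.pi / 2) / n_points
u = (np.arange(n_points) + 0.5) * step
radial = float(np.sum(np.sin(u) ** 2) * step)

# The angular integration over the directions of theta gives 4 pi.
F = colour * 4 * math.pi * radial
print(colour, F, 24 * math.pi ** 2)
```

With this value, eq. (10.65) gives S_min = F/(3g²) = 8π²/g².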

10.4.6 Explicit instanton solution

In a gauge theory whose gauge group contains an SU(2) subgroup, the mapping of
eq. (10.88) can be used to construct the asymptotic form of an instanton of topological
index +1,
\[ A_i(x) \underset{|x|\to\infty}{=} \frac{i}{g}\,\Omega^\dagger(\hat{x})\,\partial_i\Omega(\hat{x}) , \tag{10.96} \]

with Ω(x̂) ≡ x̂4 + 2i x̂i t^i . One may then prove that the self-dual field configuration
in the bulk that has this large distance behaviour is given by
\[ A_i(x) = \frac{i}{g}\,\frac{r^2}{r^2+R^2}\,\Omega^\dagger(\hat{x})\,\partial_i\Omega(\hat{x}) , \tag{10.97} \]
with an arbitrary radius R. From the result (10.95) of the previous subsection, we find
that the minimum of the action that corresponds to this solution is:

\[ S_{\min}[A] = \frac{8\pi^2}{g^2} . \tag{10.98} \]
Up to translations, dilatations or gauge transformations, this is the only field con-
figuration that gives this action. The field strength corresponding to eq. (10.97) is
localized in Euclidean spacetime, with a size of order R. One may also superimpose
several such solutions. Provided that their centers are separated by distances much
larger than R, this sum is also a solution of the classical equations of motion, and its
action is a multiple of 8π2 /g2 .

10.4.7 Instantons and the θ-term in Yang-Mills theory

Since we have uncovered classical field configurations of non-zero topological index
with finite action, a legitimate question is their role in a Euclidean path integral,
since functional integration a priori sums over all classical field configurations. For
more generality, we may assume that in the path integral the fields of topological
index n are weighted with a factor P(n) that may vary with n (this generalization
would allow for instance to exclude fields of topological index different from zero).
Thus, the expectation value of an observable O may be written as
\[ \langle O\rangle = Z^{-1}\sum_{n\in\mathbb{Z}} P(n)\int[DA]_n\;O[A]\;e^{-S[A]} , \tag{10.99} \]

where [DA]n is the functional measure restricted to gauge fields of topological
index n. The normalization factor Z is given by the same path integral without the
observable.

The dependence of P(n) on the topological index cannot be arbitrary. In order
to see this, let us consider two spacetime subvolumes Ω1 and Ω2 , non overlapping
and such that Ω1 ∪ Ω2 = ℝ⁴. Assume further that the support of the observable O
is entirely inside Ω1 . The topological number, that may be obtained as the integral
over spacetime of ε_{ijkl} F^a_{ij} F^a_{kl}, is additive¹¹ and we may define topological numbers
n1 and n2 for Ω1 and Ω2 , respectively. The total topological number n is given
by n = n1 + n2 . In the expectation value of eq. (10.99), we can therefore split the
integration into the domains Ω1 and Ω2 as follows
\[ \langle O\rangle = Z^{-1}\sum_{n_1,n_2\in\mathbb{Z}} P(n_1+n_2)\int[DA]_{n_1}\,O[A]\,e^{-S_{\Omega_1}[A]}\int[DA]_{n_2}\,e^{-S_{\Omega_2}[A]} , \tag{10.100} \]
where [DA]ni is the functional measure for gauge fields with topological number
ni in the domain Ωi . Since the observable is localized inside the domain Ω1 , we
should be able to remove any dependence on the domain Ω2 from its expectation
value. This dependence cancels between the numerator and the factor Z−1 in the
previous expression provided that the weight P(n1 + n2 ) factorizes as follows:
P(n1 + n2 ) = P(n1 )P(n2 ) , (10.101)
which implies that
P(n) = e^{−nθ} , (10.102)
where θ is an arbitrary constant. From the previous results, the topological number of
a field configuration is given by the integral
\[ n = \frac{g^2}{64\pi^2}\int d^4x\;\epsilon_{ijkl}\,F^a_{ij}F^a_{kl} . \tag{10.103} \]
Therefore, we may capture the effect of the topological weight P(n) by adding to the
Lagrangian density the following term
\[ \mathcal{L}_\theta \equiv \frac{\theta\,g^2}{64\pi^2}\,\epsilon^{ijkl}\,F^a_{ij}F^a_{kl} . \tag{10.104} \]
After this term has been added, it is no longer necessary to split the path integral into
separate topological sectors. The previous Lagrangian is nothing but the θ-term that
we have already encountered in the discussion of non-Abelian gauge theories. There,
it appeared as a term that cannot be excluded on the grounds of gauge symmetry. In
the present discussion, we see that the θ-term results from a non-uniform weighting
of the field configurations of different topological index (θ = 0 corresponds to a path
integration where all the fields are weighted equally, regardless of their topological
index).
11 But note that this integral does not have to be an integer when the integration domain is not the entire
spacetime. However, it is approximately an integer when the size of the domain is much larger than the
instanton size.

10.4.8 Quantum fluctuations around an instanton

Consider an instanton solution A^µ_{n,α}(x), that provides a local minimum of the Euclidean
action, where the subscript n is the topological index of the instanton, and α
collectively denotes all the other parameters that characterize the instanton (its center,
its size, its orientation in colour space). The expectation value of an observable reads
\[ \langle O\rangle = Z^{-1}\int[DA]\;e^{-S[A]}\,O(A) = Z^{-1}\int[Da]\;e^{-S[A_{n,\alpha}+a]}\,O(A_{n,\alpha}+a) , \tag{10.105} \]

where we denote a^µ the difference A^µ − A^µ_{n,α}. Since the instanton is an extremum
of the action, the dependence of the action on a^µ begins with quadratic terms:
\[ S[A_{n,\alpha}+a] = \frac{8\pi^2|n|}{g^2} + \frac12\int d^4x\,d^4y\;G^{-1}_{n\alpha,m\beta}(x,y)\,a(x)\,a(y) + \cdots \tag{10.106} \]

It is important to note that the action has flat directions in the space of field con-
figurations, that correspond to changing the parameters of the instanton inside its
topological class. For instance, changing the center coordinates of the instanton does
not modify the value of its action. Along these directions, the second derivative of the
action vanishes. This means that the matrix of second-order coefficients G^{-1}_{nα,mβ}(x, y)
has a number of vanishing eigenvalues, corresponding to these flat directions.
If we expand the action only to quadratic order in aµ , which amounts to a one-
loop approximation in the background of the instanton, a typical contribution to the
expectation value of eq. (10.105) is a product of dressed propagators Gnα,mβ (x, y)
connecting pairwise the gauge fields contained in the observable O and a determinant:

\[ \langle O\rangle = Z^{-1}\,e^{-8\pi^2|n|/g^2}\,\big[\det G\big]^{1/2}\prod G + \cdots . \tag{10.107} \]

Our goal here is simply to extract the dependence of such an expectation value on the
topological index n. Besides the obvious exponential prefactor, a dependence on n
hides in the determinant. Let us rewrite it as a product on the spectrum of G^{-1}:
\[ \big[\det G\big]^{1/2} = \prod_s \lambda_s^{-1/2} , \tag{10.108} \]
s

where the λs are the eigenvalues of G^{-1}. If we rescale the gauge fields by a power
of the coupling g, g A → A, the only dependence on g in the Yang-Mills action is a
prefactor g^{-2}, and all the eigenvalues λs are also proportional to g^{-2}. Moreover, as
explained above, we should remove the zero modes from this product, since they do

not give a quadratic term in eq. (10.106). If we are interested only in the powers of g,
we may write
\[ \big[\det G\big]^{1/2} \sim \prod_{\text{all modes}} g \prod_{\text{zero modes}} g^{-1} . \tag{10.109} \]

The first factor, that involves a (continuous) infinity of modes, is not well defined but
it does not depend on the details of the instanton background. In contrast, the second
factor brings one factor of g^{-1} for each collective coordinate of the instanton. For an
instanton of topological number n = 1, these collective coordinates are:

• the 4 coordinates of the center of the instanton,


• the size R of the instanton,

• 3 angles that determine the orientation of the instanton,


• for SU(2), 3 parameters defining a global gauge rotation.

Of the last 6 parameters, 3 correspond to simultaneous spatial and colour rotations
that produce the same instanton solution, and they should not be counted. There
are therefore 8 collective coordinates for the n = 1 SU(2) instanton¹², and its
contribution to expectation values scales as
\[ \langle O\rangle_{n=1} \sim e^{-8\pi^2/g^2}\,g^{-8} . \tag{10.110} \]

Because of the exponential factor that contains the inverse coupling, all the Taylor
coefficients of this function are vanishing at g = 0. Thus, such a contribution never
shows up in perturbation theory.
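
To illustrate why such a term is invisible in a power series in g, one may compare the weight e^{-8π²/g²} g^{-8} with a power g^k as g → 0. The sketch below (function names are ours) works with logarithms, since the exponential underflows in double precision long before g becomes small.

```python
import math


def log_instanton_weight(g):
    """Logarithm of exp(-8 pi^2 / g^2) * g^(-8), cf. eq. (10.110)."""
    return -8.0 * math.pi ** 2 / g ** 2 - 8.0 * math.log(g)


def log_ratio_to_power(g, k):
    """Logarithm of the instanton weight divided by g^k."""
    return log_instanton_weight(g) - k * math.log(g)


for g in (1.0, 0.5, 0.2, 0.1):
    print(g, [round(log_ratio_to_power(g, k), 1) for k in (0, 10, 50)])
```

For every fixed k, the logarithm of the ratio goes to −∞ when g → 0: the weight vanishes faster than any power of the coupling.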

12 This counting is more involved for an SU(3) instanton. In this case, there are 7 collective coordinates
corresponding to rotations and gauge transformations, hence a total of 12 collective coordinates.


Chapter 11

Modern tools for tree level amplitudes

11.1 Shortcomings of the usual approach


Transition amplitudes play a central role in quantum field theory, since they are the
building blocks of most observables. Their square gives transition probabilities, that
enter in measurable cross-sections. Until now, we have exposed the traditional way of
calculating these amplitudes. Starting from a classical action that encapsulates the
bare couplings of a given quantum field theory, one can derive Feynman rules for
propagators and vertices (listed in the figure 5.2 for Yang-Mills theory in covariant
gauges), whose application provides a straightforward algorithm for the evaluation
of amplitudes. However, the use of these Feynman rules is very cumbersome for the
following reasons:

• Even at tree level, the number of distinct graphs contributing to a given ampli-
tude increases very rapidly with the number of external lines, as shown in the
table 11.1 for amplitudes with external gluons only.

• The internal gluon propagators of these diagrams carry unphysical degrees of
freedom, which contributes to the great complexity of each individual diagram.

• The Feynman rules are sufficiently general to compute amplitudes with arbitrary
external momenta (not necessarily on-shell) and polarizations (not necessarily
physical), although this is not useful for amplitudes that will be used in cross-
sections. One would hope for a leaner formalism, that only calculates what is
strictly necessary for physical quantities.


Table 11.1: Number of Feynman diagrams contributing to tree level amplitudes with
external gluons only. (The third column indicates the values of n!, for comparison.
We see that the number of graphs grows faster than the factorial of the number of
external gluons.)

# of gluons    # of diagrams    n!
4              4                24
5              25               120
6              220              720
7              2,485            5,040
8              34,300           40,320
9              559,405          362,880
10             10,525,900       3,628,800
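
The first entries of the second column can be recovered by a simple counting argument, sketched below under the assumption that a tree diagram is fully specified by a tree with cubic and quartic vertices and n distinguishable external legs. Singling out one external leg as a root, the number T(m) of such trees with m = n − 1 remaining legs obeys T = x + T²/2! + T³/3! at the level of exponential generating functions. The function name `count_tree_diagrams` is ours.

```python
from fractions import Fraction
from math import factorial


def count_tree_diagrams(n_max):
    """Count trees with 3- and 4-valent vertices and n labelled legs, n = 4..n_max.

    t[m] is the coefficient T(m)/m! of the exponential generating function,
    where m = n - 1 is the number of legs other than the root leg.
    """
    m_max = n_max - 1
    t = [Fraction(0)] * (m_max + 1)
    t[1] = Fraction(1)
    for m in range(2, m_max + 1):
        # Root vertex is cubic: the other legs split into 2 unordered subtrees.
        pairs = sum((t[a] * t[m - a] for a in range(1, m)), Fraction(0))
        # Root vertex is quartic: the other legs split into 3 unordered subtrees.
        triples = sum((t[a] * t[b] * t[m - a - b]
                       for a in range(1, m - 1)
                       for b in range(1, m - a)), Fraction(0))
        t[m] = pairs / 2 + triples / 6
    return {n: int(t[n - 1] * factorial(n - 1)) for n in range(4, n_max + 1)}


counts = count_tree_diagrams(10)
for n, c in counts.items():
    print(n, c)
```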

The situation becomes even worse with loop diagrams. Another situation with an
even higher degree of complexity, even at tree-level, is that of gravity. It would be
desirable to be able to calculate tree-level amplitudes with gravitons, since they enter
for instance in the study of the scattering of gravitational waves by a distribution of
masses. But because the graviton has spin 2, the corresponding Feynman rules are
considerably more complicated (especially the self-couplings of the graviton) than
those of Yang-Mills theory.
It turns out that physical on-shell amplitudes in gauge theories are considerably
simpler than one may expect from the Feynman rules and the intermediate steps of
their calculation by the usual perturbation theory, and a legitimate query is whether
there is a more direct route to reach these compact answers. The goal of this chapter
is to give a glimpse (in particular, our discussion will be restricted to tree-level
amplitudes, but a significant part of the many recent developments deal with loop
corrections) of some of the recent developments that led to powerful new methods for
calculating amplitudes. A recurring theme of these methods is to avoid as much as
possible references to the Lagrangian, which may be viewed as the main source of
the complications in standard perturbation theory (for instance, the gauge invariance
of the Lagrangian is the reason why non-physical gluon polarizations appear in the
Feynman rules). Instead, these methods try to gather as much information as possible
on amplitudes based on symmetries and kinematics.

11.2 Colour ordering of gluonic amplitudes


Let us first focus on the colour structure of tree Feynman diagrams, in order to
organize and simplify it. Although the techniques we expose here can be extended
to quarks, we consider tree amplitudes that contain only gluons for simplicity, in the
case of the SU(N) gauge group. The structure constants f^{abc} of the group appear
in the three-gluon and four-gluon vertices. The first step is to rewrite the structure
constants in terms of the generators t^a_f of the fundamental representation of su(N).

Using the following relations among the generators,
\[ \big[t^a_f, t^b_f\big] = i\,f^{abc}\,t^c_f , \qquad \mathrm{tr}\big(t^a_f t^b_f\big) = \frac{\delta^{ab}}{2} , \tag{11.1} \]
we can write

\[ i\,f^{abc} = 2\,\mathrm{tr}\big(t^a_f t^b_f t^c_f\big) - 2\,\mathrm{tr}\big(t^b_f t^a_f t^c_f\big) , \tag{11.2} \]

which also has the following diagrammatic representation

[Diagram: i f^{abc} = 2 × (loop carrying the generators a, b, c in one orientation − the same loop with the opposite orientation).] (11.3)

The black dots indicate the fundamental representation generators t^a_f . Note that the
“loops” in this representation are not actual fermion loops, they are just a graphical
cue indicating how the indices carried by the t^a_f ’s are contracted in the traces. We
may also apply this trick to the 4-gluon vertex, which from the point of view of its
colour structure (but not for what concerns its momentum dependence) is equivalent
to a sum of three terms with two 3-gluon vertices,

[Diagram: the 4-gluon vertex with legs a, b, c, d = sum of three graphs, each made of two 3-gluon vertices connected by an internal gluon.] (11.4)
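
Eq. (11.2) is easy to test numerically in the simplest case N = 2, where f^{abc} = ε^{abc} and t^a_f = σ^a/2. The following sketch (array names are ours) builds both sides as rank-3 arrays and compares them.

```python
import numpy as np

sigma = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)
t = sigma / 2

# Structure constants of su(2): f^{abc} = epsilon^{abc}.
f = np.zeros((3, 3, 3))
for a, b, c in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    f[a, b, c] = 1.0
    f[b, a, c] = -1.0

# tr3[a, b, c] = tr(t^a t^b t^c); swapping the first two axes gives tr(t^b t^a t^c).
tr3 = np.einsum("aij,bjk,cki->abc", t, t, t)
lhs = 1j * f
rhs = 2 * tr3 - 2 * np.transpose(tr3, (1, 0, 2))
print(np.allclose(lhs, rhs))
```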

Since the gluon propagators are diagonal in colour (i.e. proportional to a δ^{ab}),
the t^a_f that are attached to the endpoints of the internal gluon propagators have their
colour indices contracted and summed over. The result of this contraction is given by
the following su(N) Fierz identity:
\[ (t^a_f)_{ij}\,(t^a_f)_{kl} = \frac12\,\delta_{il}\,\delta_{jk} - \frac{1}{2N}\,\delta_{ij}\,\delta_{kl} . \tag{11.5} \]

Thus, it seems that these contractions produce 2^n terms for n internal gluon propagators,
but this can in fact be simplified tremendously by noticing that the second term
of the Fierz identity corresponds to the exchange of a colourless object¹, that does
not couple to gluons. All these terms in 1/N must therefore cancel in purely gluonic
amplitudes (this is not true anymore if quarks are involved, either as external lines or
via loop corrections).
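
The su(N) Fierz identity (11.5) can be checked numerically for small N. The sketch below assumes a generalized Gell-Mann basis normalized according to eq. (11.1), tr(t^a t^b) = δ^{ab}/2; the helper names `su_n_generators` and `fierz_holds` are ours.

```python
import numpy as np


def su_n_generators(n):
    """Generalized Gell-Mann matrices, normalized so that tr(t^a t^b) = delta^{ab}/2."""
    gens = []
    for j in range(n):
        for k in range(j + 1, n):
            sym = np.zeros((n, n), dtype=complex)
            sym[j, k] = sym[k, j] = 0.5
            asym = np.zeros((n, n), dtype=complex)
            asym[j, k], asym[k, j] = -0.5j, 0.5j
            gens += [sym, asym]
    for l in range(1, n):
        diag = np.zeros(n)
        diag[:l] = 1.0
        diag[l] = -l
        gens.append(np.diag(diag).astype(complex) / np.sqrt(2 * l * (l + 1)))
    return np.array(gens)


def fierz_holds(n):
    """Check sum_a (t^a)_ij (t^a)_kl = delta_il delta_jk / 2 - delta_ij delta_kl / (2N)."""
    t = su_n_generators(n)
    lhs = np.einsum("aij,akl->ijkl", t, t)
    d = np.eye(n)
    rhs = (0.5 * np.einsum("il,jk->ijkl", d, d)
           - np.einsum("ij,kl->ijkl", d, d) / (2 * n))
    return np.allclose(lhs, rhs)


fierz_results = {n: fierz_holds(n) for n in (2, 3, 4)}
print(fierz_results)
```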

1 A more rigorous justification is to note that SU(N) × U(1) = U(N), where U(N) is the group of the

We illustrate in the following equation a few of the colour structures generated by
this procedure in the case of a tree-level five-gluon diagram:

[Diagram: a five-gluon tree graph = sum of single-trace colour structures, some of them containing twists, + . . .] (11.6)

Each of the terms contains a single trace of five t^a_f , one for each external gluon (the
colour matrices attached to the internal gluon lines have all disappeared when using
the Fierz identity). The terms in the right hand side correspond to the various ways
of choosing the clockwise or counterclockwise loop for each f^{abc} (see eq. (11.3)).
“Twists” such as the one appearing in the second term of the previous equation arise
when two such adjacent loops have opposite orientations.
Quite generally, any n-gluon tree amplitude Mn (1 · · · n) can be decomposed
as a sum of terms corresponding to the allowed colour structures. These colour
structures are single traces of fundamental representation colour matrices carrying
the colour indices of the external gluons. A priori, these matrices could be reshuffled
by an arbitrary permutation in Sn , but thanks to the cyclic invariance of the trace
we can reduce the sum to the quotient set Sn /ℤn of permutations modulo a cyclic
permutation²:
\[ M_n(1\cdots n) \equiv 2\sum_{\sigma\in S_n/\mathbb{Z}_n}\mathrm{tr}\big(t_f^{a_{\sigma(1)}}\cdots t_f^{a_{\sigma(n)}}\big)\,A_n(\sigma(1)\cdots\sigma(n)) , \tag{11.7} \]

where the prefactor 2 combines the factors 2 from eq. (11.2) and the factors 1/2 from
the first term of the Fierz identity (11.5). The object An (σ(1) · · · σ(n)) is called a
colour-ordered partial amplitude. By construction, it depends only on the momenta
and polarizations of the external gluons, but not on their colours since they have
already been factored out in the trace. Therefore, the partial amplitudes are gauge
N × N unitary matrices. For the fundamental generators of the u(N) algebra, the Fierz identity is
\[ (t^a_f)_{ij}\,(t^a_f)_{kl}\Big|_{u(N)} = \frac12\,\delta_{il}\,\delta_{jk} . \]

The U(N) gauge theory differs from the SU(N) one by the extra U(1), and the comparison of their Fierz
identities indicates that the term in 1/2N in eq. (11.5) is due to this U(1) factor. Being Abelian, this extra
factor corresponds to a photon-like mode that does not couple to gluons.
2 This is equivalent to considering permutations that have the fixed point σ(1) = 1, i.e. permutations that
only reshuffle the set {2 · · · n}. For n external gluons, there are (n − 1)! independent colour structures. The
basis provided by these traces is over-complete, and there exist linear relationships among the tree-level
partial amplitudes, known as the Kleiss-Kuijf relations. These relations reduce the number of partial
amplitudes from (n − 1)! to (n − 2)!. Additional relationships known as the Bern-Carrasco-Johansson
relations further reduce this number to (n − 3)!.

Table 11.2: Comparison between the number of Feynman graphs and the number of
cyclic-ordered graphs for tree level amplitudes with external gluons only.

# (gluons)    # (graphs)     # (cyclic-ord.)
4             4              3
5             25             10
6             220            38
7             2,485          154
8             34,300         654
9             559,405        2,871
10            10,525,900     12,925

invariant. From eq. (11.7), the squared amplitude summed over all colours can be
written as
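\[ \sum_{\text{colours}}\big|M_n(1\cdots n)\big|^2 = 4\sum_{\sigma,\rho\in S_n/\mathbb{Z}_n}\ \sum_{\text{colours}}\mathrm{tr}\big(t_f^{a_{\sigma(1)}}\cdots t_f^{a_{\sigma(n)}}\big)\,\mathrm{tr}^*\big(t_f^{a_{\rho(1)}}\cdots t_f^{a_{\rho(n)}}\big)\times A_n(\sigma(1)\cdots\sigma(n))\,A_n^*(\rho(1)\cdots\rho(n)) . \tag{11.8} \]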

The sum over colours of the product of two traces that appears in the first line can be
performed using the su(N) Fierz identity (11.5). For instance

tr (t^a t^b t^c t^d t^e ) tr∗ (t^b t^a t^c t^d t^e ) = [diagram: the two traces with their colour indices contracted pairwise] , (11.9)

which can be then expressed as a function of N by repeated use of the Fierz identity.
At this point, we have isolated the colour dependence of the amplitude, from its
momentum and polarization dependences that are factorized into the partial ampli-
tudes. Of course, calculating the latter is still not easy, but the task is significantly
reduced for two reasons:

• The colour-ordered partial amplitudes only receive contributions from planar
graphs where the gluons are cyclic-ordered, whose number grows much more slowly
than the total number of graphs, as shown in the table 11.2. The graphs
contributing to the 4, 5, 6-point colour ordered amplitudes are listed in the
figure 11.1.
• The Feynman rules for calculating the cyclic colour-ordered amplitudes, listed
in the figure 11.2, are much simpler than the original Yang-Mills Feynman
rules because the vertices are stripped of all their colour factors.

Figure 11.1: Diagrams contributing to the 4-point, 5-point and 6-point colour
ordered amplitudes in Yang-Mills theory. The external points are labeled 1 to
n = 4, 5, 6 in the counterclockwise direction. The solid lines represent gluons.

 
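Gluon propagator (momentum p, Lorentz indices µ, ν):
\[ \frac{-i\,g_{\mu\nu}}{p^2+i0^+} + \frac{i}{p^2+i0^+}\Big(1-\frac{1}{\xi}\Big)\frac{p_\mu p_\nu}{p^2} \]

3-gluon vertex (momenta k, p, q; Lorentz indices µ, ν, ρ):
\[ g\,\Big[g_{\mu\nu}\,(k-p)_\rho + g_{\nu\rho}\,(p-q)_\mu + g_{\rho\mu}\,(q-k)_\nu\Big] \]

4-gluon vertex (Lorentz indices µ, ν, ρ, σ in cyclic order):
\[ -i\,g^2\,\big(2\,g_{\mu\rho}\,g_{\nu\sigma} - g_{\mu\sigma}\,g_{\nu\rho} - g_{\mu\nu}\,g_{\rho\sigma}\big) \]

Figure 11.2: Rules for colour-ordered graphs in Yang-Mills theory.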



In the case of the 4-gluon vertex, we have included only the terms that cor-
respond to the cyclic ordering µνρσ (note that it is invariant under cyclic
permutations, i.e. the Feynman rule is the same for the vertices νρσµ, ρσµν
and σµνρ). We can already see a considerable simplification of the Feynman
rules, since all the colour factors have disappeared, and the Lorentz structure of
the 4-gluon vertex is also much simpler than in the original Feynman rules.
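
The counts of cyclic-ordered graphs quoted in the table 11.2 can be reproduced by a short recursion, sketched here under the assumption that a colour-ordered tree graph is a planar tree with cubic and quartic vertices whose n external legs appear in a fixed cyclic order. Singling out leg 1 as a root, the number C(m) of such trees with m = n − 1 remaining ordered legs obeys C(m) = Σ C(a)C(b) + Σ C(a)C(b)C(c), with the sums running over ordered splittings of m. The function name is ours.

```python
def count_colour_ordered_graphs(n_max):
    """Count planar trees with 3- and 4-valent vertices and n cyclically ordered legs."""
    m_max = n_max - 1
    c = [0] * (m_max + 1)
    c[1] = 1
    for m in range(2, m_max + 1):
        # Root vertex is cubic: ordered split of the m legs into 2 consecutive blocks.
        pairs = sum(c[a] * c[m - a] for a in range(1, m))
        # Root vertex is quartic: ordered split into 3 consecutive blocks.
        triples = sum(c[a] * c[b] * c[m - a - b]
                      for a in range(1, m - 1)
                      for b in range(1, m - a))
        c[m] = pairs + triples
    return {n: c[n - 1] for n in range(4, n_max + 1)}


ordered_counts = count_colour_ordered_graphs(10)
for n, value in ordered_counts.items():
    print(n, value)
```

Compared with the counting of unordered trees, the only difference is that the subtrees attached to a vertex are ordered, which removes the symmetry factors 1/2! and 1/3! and forbids reshuffling the external legs.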

But even after having isolated the colour structure, the remaining colour-ordered
amplitudes are still complicated. As an illustration of the colour-ordered Feynman
rules, let us consider the partial amplitude A4 (1, 2, 3, 4) that contributes to one of the
colour structures in the gg → gg amplitude. Because of colour ordering, only three
graphs contribute to this partial amplitude:

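A4 (1, 2, 3, 4) = [diagrams: two graphs in which a gluon is exchanged between a pair of 3-gluon vertices, and one graph with a 4-gluon vertex; the external legs 1, 2, 3, 4 are cyclic-ordered] . (11.10)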

For definiteness, let us assume that the external momenta p1 · · · p4 are defined as
incoming, and denote ǫ1 · · · ǫ4 the four polarization vectors. Using the rules listed in
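the figure 11.2, we obtain:
\[ \begin{aligned} A_4(1,2,3,4) ={}& \frac{-i\,g^2}{(p_1+p_2)^2}\Big[(2p_2+p_1)\cdot\epsilon_1\;\epsilon_2^\lambda - (2p_1+p_2)\cdot\epsilon_2\;\epsilon_1^\lambda + \epsilon_1\cdot\epsilon_2\,(p_1-p_2)^\lambda\Big] \\ &\quad\times\Big[(p_3+2p_4)\cdot\epsilon_3\;\epsilon_{4\lambda} - (2p_3+p_4)\cdot\epsilon_4\;\epsilon_{3\lambda} + \epsilon_3\cdot\epsilon_4\,(p_3-p_4)_\lambda\Big] \\ &+ \frac{-i\,g^2}{(p_2+p_3)^2}\Big[(p_2+2p_3)\cdot\epsilon_2\;\epsilon_3^\lambda - (2p_2+p_3)\cdot\epsilon_3\;\epsilon_2^\lambda + \epsilon_2\cdot\epsilon_3\,(p_2-p_3)^\lambda\Big] \\ &\quad\times\Big[(2p_1+p_4)\cdot\epsilon_4\;\epsilon_{1\lambda} - (p_1+2p_4)\cdot\epsilon_1\;\epsilon_{4\lambda} + \epsilon_1\cdot\epsilon_4\,(p_4-p_1)_\lambda\Big] \\ &- i\,g^2\Big[2\,(\epsilon_1\cdot\epsilon_3)(\epsilon_2\cdot\epsilon_4) - (\epsilon_1\cdot\epsilon_4)(\epsilon_2\cdot\epsilon_3) - (\epsilon_1\cdot\epsilon_2)(\epsilon_3\cdot\epsilon_4)\Big] . \end{aligned} \tag{11.11} \]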

Although this is considerably simpler than the full 4-gluon amplitude, it remains quite
difficult to extract physical results from such an expression.

11.3 Spinor-helicity formalism

11.3.1 Motivation

Part of the complexity of eq. (11.11) lies in the fact that this formula still contains a
large amount of redundant and unnecessary information, since each polarization may
be shifted by a 4-vector proportional to the momentum of the corresponding external
gluon, thanks to gauge invariance. For instance, the transformation

\[ \epsilon_1^\mu \to \epsilon_1^\mu + \kappa\,p_1^\mu , \tag{11.12} \]

leaves the amplitude unchanged. However, it is not clear how to optimally choose the
polarization vectors in order to simplify an expression such as eq. (11.11). In other
words, the question is how to represent the spin degrees of freedom of the external
particles in order to make the amplitude as simple as possible. In the traditional
approach to the calculation of amplitudes, one usually refrains from introducing any
explicit form for the polarization vectors. Instead, one first squares the amplitude
written in terms of generic polarization vectors, such as eq. (11.11), and then the sum
over the polarizations of the external gluons is performed by using
\[ \sum_{\text{physical pol.}}\epsilon^{\mu*}(p)\,\epsilon^\nu(p) = -g^{\mu\nu} + \frac{p^\mu n^\nu}{p\cdot n} + \frac{n^\mu p^\nu}{p\cdot n} , \tag{11.13} \]

where nµ is some arbitrary light-like vector. Note that this is the formula for summing
over all physical polarizations, which is necessary when calculating unpolarized cross-
sections. For cross-sections involving polarized particles, one would perform only a
partial sum, which leads to a different projector in the right hand side. If the amplitude
is a sum of N_t terms, then this process generates 3 N_t² terms in the squared amplitude
summed over polarizations. In contrast, the spinor-helicity method that we shall
expose below aims at obtaining the amplitude with explicit polarization vectors, for a
given assignment of the helicities {h1 = ±, · · · , hn = ±} of the external gluons, in
the form of an expression made of N_t terms that can be easily evaluated (numerically
at least). The sum of these N_t terms is done first, and then squared, which is an O(1)
computational task (simply squaring a complex number). Thus, the total cost scales
as 2^n N_t in this approach (2^n helicity assignments, each requiring the evaluation of
N_t terms). Since N_t grows very quickly with n, this is usually better.

11.3.2 Representation of 4-vectors as bi-spinors

In the previous section, we have seen how the adjoint colour degrees of freedom may
be represented in terms of the smaller fundamental representation. Likewise, we will
now represent the Lorentz structure associated to spin-1 particles in terms of spin-1/2
variables. From a mathematical standpoint, this representation exploits the fact that

elements of the Lorentz group SO(3, 1) can be mapped to 2 × 2 complex matrices
of unit determinant, i.e. elements of the group SL(2, ℂ). Likewise, 4-momenta can
be mapped to 2 × 2 complex matrices. In order to make this mapping explicit, let us
introduce a set of four matrices σµ defined by

σµ ≡ (1, σi ) , (11.14)

where σ^{1,2,3} are the usual Pauli matrices. In terms of these matrices, a 4-vector p^µ
can be mapped into
\[ p^\mu \to P \equiv p_\mu\sigma^\mu = \begin{pmatrix} p_0+p_3 & p_1 - i p_2 \\ p_1 + i p_2 & p_0 - p_3 \end{pmatrix} . \tag{11.15} \]
(In the second equality, we have used the explicit representation of the Pauli matrices.)
For amplitudes involving only external gluons, the momentum pµ has a vanishing
invariant norm, pµ pµ = 0, which translates into

\[ 0 = p_\mu p^\mu = p_0^2 - p_1^2 - p_2^2 - p_3^2 = (p_0+p_3)(p_0-p_3) - (p_1+ip_2)(p_1-ip_2) = \det(P) . \tag{11.16} \]

Thus, the massless on-shell condition is equivalent to the determinant of the matrix
$P$ being zero. For a $2\times 2$ matrix, a null determinant means that the matrix can be
factorized as the direct product of two vectors:

\[ P_{ab} = \lambda_a\, \xi_b , \tag{11.17} \]

where $\lambda$, $\xi$ are complex vectors known as Weyl spinors. An explicit representation of
these vectors is

\[ \lambda_a \equiv \begin{pmatrix} \sqrt{p_0 + p_3} \\ \dfrac{p_1 + i p_2}{\sqrt{p_0 + p_3}} \end{pmatrix} , \qquad
   \xi_b \equiv \begin{pmatrix} \sqrt{p_0 + p_3} \\ \dfrac{p_1 - i p_2}{\sqrt{p_0 + p_3}} \end{pmatrix} . \tag{11.18} \]

For a real valued 4-vector, $\lambda_a$ and $\xi_a$ are mutual complex conjugates. However, when
we later analytically continue the external momenta in the complex plane, this will no
longer be the case. To make the notations more compact, it is customary to introduce
the following notations:

\[ |p] = \lambda_a , \qquad \langle p| = \xi_a , \tag{11.19} \]

so that the matrix $P$ may be written as:

\[ P = |p]\,\langle p| . \tag{11.20} \]
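The factorization of eqs. (11.17) and (11.18) is easy to check numerically. The following is a minimal sketch in plain Python; the helper names `spinors` and `matrix` and the sample momentum are illustrative choices, not part of the text, and the momentum is assumed to satisfy $p_0 + p_3 \neq 0$.

```python
import cmath

def spinors(p0, p1, p2, p3):
    """Return (lam, xi) of eq. (11.18) for a light-like momentum with p0 + p3 != 0."""
    s = cmath.sqrt(p0 + p3)
    lam = (s, (p1 + 1j * p2) / s)
    xi = (s, (p1 - 1j * p2) / s)
    return lam, xi

def matrix(p0, p1, p2, p3):
    """2x2 matrix P of eq. (11.15), stored as nested tuples."""
    return ((p0 + p3, p1 - 1j * p2), (p1 + 1j * p2, p0 - p3))

# Illustrative light-like momentum: p0 = 3 = sqrt(1 + 4 + 4).
p = (3.0, 1.0, 2.0, 2.0)
lam, xi = spinors(*p)
P = matrix(*p)

# det(P) vanishes on-shell, eq. (11.16).
det_P = P[0][0] * P[1][1] - P[0][1] * P[1][0]

# P_ab = lam_a * xi_b, eq. (11.17).
outer = tuple(tuple(lam[a] * xi[b] for b in range(2)) for a in range(2))
max_error = max(abs(P[a][b] - outer[a][b]) for a in range(2) for b in range(2))

# For a real momentum, lam and xi are complex conjugates of each other.
conj_error = max(abs(lam[a] - xi[a].conjugate()) for a in range(2))

print(abs(det_P), max_error, conj_error)
```

All three printed numbers should be zero up to rounding errors.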

It is also convenient to define spinors with raised indices, related to the previous ones
as follows,

\[ \lambda^a \equiv \epsilon^{ab} \lambda_b = [p| , \qquad \xi^a \equiv \epsilon^{ab} \xi_b = |p\rangle , \tag{11.21} \]

where $\epsilon^{ab}$ is the completely antisymmetric tensor in two dimensions, normalized
with $\epsilon^{12} = +1$. From these spinors with raised indices, we may define a $2\times 2$ matrix
representation of the 4-vector $p^\mu$ with raised indices:

\[ \bar{P} \equiv |p\rangle\, [p| . \tag{11.22} \]

Note that this alternative representation corresponds to the definition (see footnote 3)

\[ \bar{P} \equiv p_\mu \bar\sigma^\mu , \tag{11.23} \]

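with $\bar\sigma^\mu \equiv (1, -\sigma^i)$. In the Weyl representation, where the Dirac matrices read

\[ \gamma^\mu = \begin{pmatrix} 0 & \sigma^\mu \\ \bar\sigma^\mu & 0 \end{pmatrix} , \tag{11.24} \]

we thus have

\[ \slashed{p} \equiv p_\mu \gamma^\mu = \begin{pmatrix} 0 & P \\ \bar{P} & 0 \end{pmatrix} . \tag{11.25} \]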

The fact that we are dealing with on-shell momenta is already built in the factorized
representation of eq. (11.17). Amplitudes depend on kinematical invariants such as
$(p + q)^2$, for which it is straightforward to check that (see footnote 4)

\[ (p + q)^2 = 2\, p \cdot q = \langle pq \rangle\, [pq] , \tag{11.26} \]

where the brackets are defined by contracting upper and lower spinor indices, as in

\[ \langle pq \rangle \equiv \langle p|_a\, |q\rangle^a . \tag{11.27} \]

These brackets are antisymmetric ($\langle pq \rangle = -\langle qp \rangle$), since they may also be written
as:

\[ \langle pq \rangle = \epsilon^{ab}\, \xi_a(p)\, \xi_b(q) . \tag{11.28} \]

Note that the mixed brackets are zero, $\langle pq] = 0$, as well as the angle and square
brackets with twice the same momentum, $\langle pp \rangle = [pp] = 0$.
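As a quick numerical illustration of eq. (11.26) and of the antisymmetry of the brackets, here is a small sketch in plain Python. The function names and the sample momenta are hypothetical. The square bracket is taken to be $\epsilon^{ab}\lambda_a(p)\lambda_b(q)$, which is an assumed sign convention chosen so that eq. (11.26) holds with the spinors of eq. (11.18).

```python
import cmath

def spinors(p1, p2, p3):
    """(lam, xi) of eq. (11.18) for a real light-like momentum with p0 = |p|."""
    p0 = (p1 * p1 + p2 * p2 + p3 * p3) ** 0.5
    s = cmath.sqrt(p0 + p3)
    return (s, (p1 + 1j * p2) / s), (s, (p1 - 1j * p2) / s)

def eps_contract(u, v):
    """epsilon^{ab} u_a v_b with epsilon^{12} = +1."""
    return u[0] * v[1] - u[1] * v[0]

def angle(p, q):
    """Angle bracket <pq>, built from the xi spinors as in eq. (11.28)."""
    return eps_contract(spinors(*p)[1], spinors(*q)[1])

def square(p, q):
    """Square bracket [pq], built from the lam spinors (assumed sign convention)."""
    return eps_contract(spinors(*p)[0], spinors(*q)[0])

def dot(p, q):
    """Minkowski product of two light-like vectors given by their spatial parts."""
    energy = lambda v: (v[0] ** 2 + v[1] ** 2 + v[2] ** 2) ** 0.5
    return energy(p) * energy(q) - sum(a * b for a, b in zip(p, q))

# Two illustrative real light-like momenta (spatial components only).
p, q = (1.0, 2.0, 2.0), (-3.0, 0.5, 1.0)
lhs = 2 * dot(p, q)
rhs = angle(p, q) * square(p, q)
print(lhs, rhs)
```

For real momenta the square bracket computed this way is the complex conjugate of the angle bracket, so the product is real, as stated in footnote 4.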

It is useful to work out the form of momentum conservation in the spinor formalism.
For an amplitude with external momenta $\{p_i\}$, chosen to be all incoming, let us
denote $|i]$, $\langle i|$, $\cdots$ the corresponding spinors. For any arbitrary on-shell momenta $p$
and $q$, we may then write

\[ 0 = \sum_i \langle p|\, \bar{P}_i\, |q] = \sum_i \langle pi \rangle\, [iq] . \tag{11.29} \]

Footnote 3: We may use $\epsilon^{ac} \epsilon^{bd} \delta_{dc} = \delta^{ab}$ and $\epsilon^{ac} \epsilon^{bd} \sigma^i_{dc} = -\sigma^{i\,ab}$.
Footnote 4: For real momenta, angle and square brackets are complex conjugates, and $(p + q)^2$ is a real quantity.

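Another interesting identity follows from the fact that three 2-component spinors
cannot be linearly independent. Thus, given $|p\rangle$, $|q\rangle$ and $|r\rangle$, we must have a
relationship of the form:

\[ |r\rangle = \alpha\, |p\rangle + \beta\, |q\rangle . \tag{11.30} \]

Contracting this equation with $\langle p|$ and $\langle q|$ gives the explicit expression of the
coefficients $\alpha$ and $\beta$:

\[ \alpha = \frac{\langle qr \rangle}{\langle qp \rangle} , \qquad \beta = \frac{\langle pr \rangle}{\langle pq \rangle} . \tag{11.31} \]

This leads to

\[ |p\rangle \langle qr \rangle + |q\rangle \langle rp \rangle + |r\rangle \langle pq \rangle = 0 , \tag{11.32} \]

known as the Schouten identity. A similar identity holds with square brackets:

\[ |p]\, [qr] + |q]\, [rp] + |r]\, [pq] = 0 . \tag{11.33} \]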
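The Schouten identity only relies on the antisymmetry of the bracket and on the two-dimensional nature of the spinors, so it can be tested with arbitrary complex 2-component vectors. The sketch below (plain Python, with hypothetical helper names and random illustrative spinors) checks eqs. (11.30) to (11.32).

```python
import random

def bracket(u, v):
    """Antisymmetric contraction epsilon^{ab} u_a v_b with epsilon^{12} = +1."""
    return u[0] * v[1] - u[1] * v[0]

random.seed(0)

def rand_spinor():
    """A random complex 2-component spinor (illustrative input only)."""
    return tuple(complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(2))

p, q, r = rand_spinor(), rand_spinor(), rand_spinor()

# Coefficients of eq. (11.31), and the decomposition r = alpha p + beta q of eq. (11.30).
alpha = bracket(q, r) / bracket(q, p)
beta = bracket(p, r) / bracket(p, q)
decomposition_error = max(abs(r[a] - alpha * p[a] - beta * q[a]) for a in range(2))

# Left-hand side of the Schouten identity (11.32), component by component.
schouten = tuple(
    p[a] * bracket(q, r) + q[a] * bracket(r, p) + r[a] * bracket(p, q)
    for a in range(2)
)
schouten_error = max(abs(c) for c in schouten)
print(decomposition_error, schouten_error)
```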

11.3.3 Polarization vectors

At this point, we have a representation in terms of spinors for the on-shell momenta
that appear on the external legs of amplitudes. We also need a similar representation
for the polarization vectors. The polarization vectors for a gluon of momentum p
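with positive and negative helicities may be represented as follows:

\[ \epsilon_+^\mu(p; q) \equiv -\frac{\langle q|\, \bar\sigma^\mu\, |p]}{\sqrt{2}\, \langle qp \rangle} , \qquad
   \epsilon_-^\mu(p; q) \equiv -\frac{\langle p|\, \bar\sigma^\mu\, |q]}{\sqrt{2}\, [pq]} , \tag{11.34} \]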
where $q$ is an arbitrary reference momentum, whose presence is due to the gauge
invariance (eq. (11.12)). It does not have to correspond to any of the physical momenta
upon which the amplitude depends, and can be chosen in such a way that it simplifies
the amplitude. This auxiliary vector can be different for each external line, but it must
be the same in each contribution to a given process (this is because a single graph
usually does not give a gauge invariant contribution when considered alone). Let us
mention a useful Fierz identity for contracting two of the numerators that appear in
the above polarization vectors (see footnote 5):

\[ \langle 1|\, \bar\sigma^\mu\, |2]\; \langle 3|\, \bar\sigma_\mu\, |4] = 2\, \langle 13 \rangle\, [24] , \tag{11.35} \]

from which we obtain the contractions between polarization vectors

\[ \begin{aligned}
   \epsilon_+(p; q) \cdot \epsilon_+(p'; q') &= \frac{[pp']\, \langle qq' \rangle}{\langle qp \rangle\, \langle q'p' \rangle} , \\
   \epsilon_-(p; q) \cdot \epsilon_-(p'; q') &= \frac{\langle pp' \rangle\, [qq']}{[qp]\, [q'p']} , \\
   \epsilon_+(p; q) \cdot \epsilon_-(p'; q') &= \frac{\langle qp' \rangle\, [pq']}{\langle qp \rangle\, [p'q']} .
   \end{aligned} \tag{11.36} \]

Footnote 5: We may use $(\bar\sigma^\mu)^{ab} (\bar\sigma_\mu)^{cd} = 2 (\delta^{ab} \delta^{cd} - \delta^{ad} \delta^{bc}) = 2\, \epsilon^{ac} \epsilon^{bd}$.

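Using eq. (11.25), we also obtain the following identities:

\[ \begin{aligned}
   p \cdot \epsilon_\pm(p; q) &= q \cdot \epsilon_\pm(p; q) = 0 , \\
   k \cdot \epsilon_+(p; q) &= -\frac{\langle qk \rangle\, [kp]}{\sqrt{2}\, \langle qp \rangle} , \\
   k \cdot \epsilon_-(p; q) &= -\frac{\langle pk \rangle\, [kq]}{\sqrt{2}\, [pq]} .
   \end{aligned} \tag{11.37} \]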

11.3.4 Three-point amplitudes in Yang-Mills theory

Let us now discuss the very important case of 3-particle amplitudes in the massless
case, since they will appear later as the building blocks of more complicated amplitudes.
Such an amplitude depends on three on-shell momenta $p_{1,2,3}$ such that
$p_1 + p_2 + p_3 = 0$. This implies that

\[ \langle 12 \rangle\, [12] = 2\, p_1 \cdot p_2 = (p_1 + p_2)^2 = p_3^2 = 0 . \tag{11.38} \]

Therefore, either $\langle 12 \rangle = 0$ or $[12] = 0$. Let us assume that $\langle 12 \rangle \neq 0$. We also have:

\[ \langle 12 \rangle\, [23] = \langle 1|\, \bar{P}_2\, |3] = -\langle 1|\, \bar{P}_1 + \bar{P}_3\, |3]
   = -\underbrace{\langle 11 \rangle}_{0}\, [13] - \langle 13 \rangle\, \underbrace{[33]}_{0} = 0 , \tag{11.39} \]

which implies that $[23] = 0$. Likewise, $[13] = 0$. Therefore, all the square brackets
are zero if $\langle 12 \rangle \neq 0$. Conversely, all the angle brackets would be zero if instead we
had assumed that $[12] \neq 0$. From this discussion, we conclude that massless on-shell
3-point amplitudes may depend either on square brackets or on angle brackets, but not
on a mixture of both. Recall now that, for real momenta, angle and square brackets
are related by complex conjugation. Thus, 3-point amplitudes can only exist for
complex momenta. This is of course a trivial consequence of kinematics: momentum
conservation $p_1 + p_2 + p_3 = 0$ is impossible for three real-valued light-like momenta,
except on a measure-zero subset of exceptional configurations.
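To make this statement concrete, one can build an explicit complex 3-point configuration in which all the square brackets vanish while the angle brackets do not. The sketch below is one possible construction, stated as an assumption rather than as the method used in the text: the three square spinors are taken proportional to a common spinor `lam0`, and the proportionality coefficients are fixed by momentum conservation. All numerical values are illustrative.

```python
def bracket(u, v):
    """Antisymmetric contraction epsilon^{ab} u_a v_b with epsilon^{12} = +1."""
    return u[0] * v[1] - u[1] * v[0]

# Three arbitrary (illustrative) angle spinors xi_i and one common square spinor.
xi = [(1.0 + 0.5j, 2.0 + 0j), (0.3 + 0j, -1.0 + 1.0j), (2.0 - 1.0j, 0.7j)]
lam0 = (1.0 + 0j, 0.5 - 2.0j)

# Square spinors lam_i = c_i * lam0. The total momentum is then
# sum_i lam_i xi_i = lam0 (sum_i c_i xi_i), so we impose sum_i c_i xi_i = 0 with c3 = 1.
c3 = 1.0
c1 = -c3 * bracket(xi[2], xi[1]) / bracket(xi[0], xi[1])
c2 = -c3 * bracket(xi[0], xi[2]) / bracket(xi[0], xi[1])
lam = [tuple(c * x for x in lam0) for c in (c1, c2, c3)]

# Sum of the three bi-spinors P_i = lam_i xi_i (eq. (11.17)): must vanish.
total = [[sum(lam[i][a] * xi[i][b] for i in range(3)) for b in range(2)] for a in range(2)]
conservation_error = max(abs(total[a][b]) for a in range(2) for b in range(2))

pairs = ((0, 1), (1, 2), (2, 0))
squares = [abs(bracket(lam[i], lam[j])) for i, j in pairs]   # all zero
angles = [abs(bracket(xi[i], xi[j])) for i, j in pairs]      # all non-zero
print(conservation_error, squares, angles)
```

Since the square spinors are not the complex conjugates of the angle spinors here, the corresponding momenta are necessarily complex.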
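
Let us now be more explicit and calculate the 3-point amplitudes in Yang-Mills
theory. For generic polarization vectors $\epsilon_{1,2,3}$, the second Feynman rule of the figure
(11.2) leads to

\[ A_3(123) = 2g\, \Big[ (\epsilon_1 \cdot \epsilon_2)(p_1 \cdot \epsilon_3) + (\epsilon_2 \cdot \epsilon_3)(p_2 \cdot \epsilon_1) + (\epsilon_3 \cdot \epsilon_1)(p_3 \cdot \epsilon_2) \Big] , \tag{11.40} \]

where we have used $p_i \cdot \epsilon_i = 0$ to cancel several terms. Consider first the helicities
$- - +$. Using eqs. (11.35) and (11.37), we obtain

\[ \begin{aligned}
   A_3(1^- 2^- 3^+) = -\frac{\sqrt{2}\, g}{[q_1 1]\, [q_2 2]\, \langle q_3 3 \rangle}
   \Big\{ & \langle 12 \rangle\, [q_1 q_2]\, \langle q_3 1 \rangle\, [13]
          + \langle 2 q_3 \rangle\, [q_2 3]\, \langle 12 \rangle\, [2 q_1] \\
        & + \langle q_3 1 \rangle\, [3 q_1]\, \langle 23 \rangle\, [3 q_2] \Big\} .
   \end{aligned} \tag{11.41} \]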
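Each of the three terms contains in the numerator an angle bracket between the
external momenta (respectively $\langle 12 \rangle$, $\langle 12 \rangle$ and $\langle 23 \rangle$). Therefore, for this amplitude
to be non-zero, we must adopt the choice of spinor representation where it is the
square brackets that are zero. With this choice, the first term vanishes since it contains
$[13]$:

\[ A_3(1^- 2^- 3^+) = -\sqrt{2}\, g\;
   \frac{\langle 2 q_3 \rangle\, [q_2 3]\, \langle 12 \rangle\, [2 q_1] + \langle q_3 1 \rangle\, [3 q_1]\, \langle 23 \rangle\, [3 q_2]}
        {[q_1 1]\, [q_2 2]\, \langle q_3 3 \rangle} . \tag{11.42} \]

Using momentum conservation (11.29) in the form of

\[ \underbrace{\langle 11 \rangle}_{0}\, [1 q_1] + \langle 12 \rangle\, [2 q_1] + \langle 13 \rangle\, [3 q_1] = 0 , \tag{11.43} \]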
and the Schouten identity (11.32), we arrive at

\[ A_3(1^- 2^- 3^+) = \sqrt{2}\, g\; \langle 12 \rangle\, \frac{[q_1 3]\, [q_2 3]}{[q_1 1]\, [q_2 2]} . \tag{11.44} \]

Momentum conservation also implies

\[ \frac{[q_1 3]}{[q_1 1]} = \frac{\langle 12 \rangle}{\langle 23 \rangle} , \qquad
   \frac{[q_2 3]}{[q_2 2]} = \frac{\langle 12 \rangle}{\langle 31 \rangle} , \tag{11.45} \]

which leads to a form of the amplitude that does not contain the auxiliary vectors
$q_{1,2}$ anymore:

\[ A_3(1^- 2^- 3^+) = \sqrt{2}\, g\; \frac{\langle 12 \rangle^3}{\langle 23 \rangle\, \langle 31 \rangle} . \tag{11.46} \]

We have thus obtained a remarkably compact expression of the 3-point amplitude in
terms of spinor variables, which is explicitly independent of all the auxiliary vectors
$q_i$. Likewise, a similar calculation would give the following answer for the $+ + -$
amplitude:

\[ A_3(1^+ 2^+ 3^-) = \sqrt{2}\, g\; \frac{[12]^3}{[23]\, [31]} . \tag{11.47} \]

(The $+ + +$ and $- - -$ amplitudes are zero in Yang-Mills theory, as argued in the next
subsection.) Eqs. (11.46) and (11.47) are both much simpler than the Feynman rule
for the 3-gluon vertex. This is the simplest illustration of an assertion we made at the
beginning of this section, namely that on-shell amplitudes with physical polarizations
are much simpler than one may expect from the traditional perturbative expansion. In
the case of the 3-gluon amplitude, we may think that the simplicity comes from the
fact that it receives contributions from a single diagram. However, this is not true. As
a teaser for the next section, let us give the answers for some 4-gluon and 5-gluon
amplitudes in the spinor-helicity formalism:

\[ \begin{aligned}
   A_4(1^- 2^- 3^+ 4^+) &= i\, (\sqrt{2}\, g)^2\, \frac{\langle 12 \rangle^3}{\langle 23 \rangle\, \langle 34 \rangle\, \langle 41 \rangle} , \\
   A_5(1^- 2^- 3^+ 4^+ 5^+) &= i^2\, (\sqrt{2}\, g)^3\, \frac{\langle 12 \rangle^3}{\langle 23 \rangle\, \langle 34 \rangle\, \langle 45 \rangle\, \langle 51 \rangle} ,
   \end{aligned} \tag{11.48} \]

that appear to generalize trivially eq. (11.46) although they result from the sum of 3
and 10 Feynman graphs (see the figure 11.1), respectively. In this section, we have
followed a pedestrian approach that consists in starting from the usual Feynman rules,
and translating all their building blocks in the spinor-helicity language. However,
the simplicity of the results provides an important hint: there must be a better way
to obtain them, that bypasses the traditional Feynman rules and provides the answer
much more directly.

11.3.5 Little group scaling

It turns out that massless on-shell 3-point amplitudes are almost completely constrained
by a scaling argument, except for an overall prefactor. Thus, the Lagrangian
is in a sense not necessary for specifying their form (it only plays a marginal role
in setting their normalization). From eqs. (11.20) and (11.22), it is clear that the
representation of massless on-shell 4-momenta as bi-spinors is invariant under the
following rescaling:

\[ |p\rangle \to \lambda\, |p\rangle , \qquad |p] \to \lambda^{-1}\, |p] , \tag{11.49} \]

known as little group scaling. The terminology follows from the fact that there is
a one-parameter SO(2) subgroup (the rotations in the plane transverse to $p$) of the
Lorentz group that leaves invariant the vector $p^\mu$. Such a residual symmetry that
leaves a vector invariant is called the little group. In the spinor formulation, this residual
symmetry precisely corresponds to the transformation of eq. (11.49).

Under little group scaling of $|p\rangle$ and $|p]$, the polarization vectors of eq. (11.34)
scale as follows:

\[ \epsilon_+^\mu(p; q) \to \lambda^{-2}\, \epsilon_+^\mu(p; q) , \qquad
   \epsilon_-^\mu(p; q) \to \lambda^{2}\, \epsilon_-^\mu(p; q) , \tag{11.50} \]

i.e. a scaling by a factor $\lambda^{-2h}$ for a helicity $h$. Note that the polarization vectors are
invariant under little group scaling of the auxiliary vector $q$. In an amplitude, the
internal ingredients (propagators and vertices) are not affected by little group scaling.
Therefore, if we apply the little group scaling $\lambda_i$ to an external momentum $i$ of an
amplitude, its expression in terms of square and angle spinors must transform as

\[ A_n(1 \cdots i^{h_i} \cdots n) \to \lambda_i^{-2 h_i}\, A_n(1 \cdots i^{h_i} \cdots n) , \tag{11.51} \]

where $h_i$ is the helicity of the external line $i$ (we do not need to specify the helicities
of the other external lines).
It turns out that the structure of all 3-point amplitudes (see footnote 6) is completely fixed by this
property. Let us start from the following generic expression

\[ A_3(1^{h_1} 2^{h_2} 3^{h_3}) = C\, \langle 12 \rangle^{\alpha}\, \langle 23 \rangle^{\beta}\, \langle 31 \rangle^{\gamma} , \tag{11.52} \]

with $\alpha$, $\beta$, $\gamma$ undetermined exponents and $C$ a numerical prefactor. Little group scaling
implies that

\[ -2 h_1 = \alpha + \gamma , \qquad -2 h_2 = \alpha + \beta , \qquad -2 h_3 = \beta + \gamma , \tag{11.53} \]

whose solution is

\[ \alpha = h_3 - h_1 - h_2 , \qquad \beta = h_1 - h_2 - h_3 , \qquad \gamma = h_2 - h_3 - h_1 . \tag{11.54} \]

Therefore the 3-point amplitude must have the following structure

\[ A_3(1^{h_1} 2^{h_2} 3^{h_3}) = C\, \langle 12 \rangle^{h_3 - h_1 - h_2}\, \langle 23 \rangle^{h_1 - h_2 - h_3}\, \langle 31 \rangle^{h_2 - h_3 - h_1} , \tag{11.55} \]

in which only the numerical prefactor remains to be determined. The $- - +$ 3-gluon
amplitude derived in the previous subsection indeed has this structure.

Footnote 6: This reasoning cannot be extended to higher n-point amplitudes because they can depend on both
square and angle brackets, and because the number of constraints provided by the helicities of the external
lines is not sufficient to fix the unknown exponents.


Note that instead of eq. (11.52), we could have chosen an ansatz that involves the
square brackets,

\[ A_3(1^{h_1} 2^{h_2} 3^{h_3}) = C\, [12]^{\alpha'}\, [23]^{\beta'}\, [31]^{\gamma'} . \tag{11.56} \]

(This is the only alternative, since we are not allowed to mix square and angle brackets
in a 3-point amplitude for massless particles.) Little group scaling would now lead to

\[ \alpha' = -h_3 + h_1 + h_2 , \qquad \beta' = -h_1 + h_2 + h_3 , \qquad \gamma' = -h_2 + h_3 + h_1 , \tag{11.57} \]

and consequently

\[ A_3(1^{h_1} 2^{h_2} 3^{h_3}) = C\, [12]^{-h_3 + h_1 + h_2}\, [23]^{-h_1 + h_2 + h_3}\, [31]^{-h_2 + h_3 + h_1} . \tag{11.58} \]

The expected dimension of the amplitude is sufficient to choose between eqs. (11.55)
and (11.58). Indeed, both angle and square brackets have mass dimension 1, while the
3-gluon amplitude should have dimension 1 in 4-dimensional Yang-Mills theory (for
which the coupling constant is dimensionless). Since all the kinematical dependence
is carried by the brackets, the prefactor C can only be made of coupling constants
and numerical factors, and must therefore be dimensionless in Yang-Mills theory.
Consider first the − − + amplitude: eq. (11.55) gives a mass dimension +1, while
eq. (11.58) gives a mass dimension −1. Therefore, the − − + amplitude must be
expressed by eq. (11.55) in terms of angle brackets. The same argument tells us that
the + + − amplitude must be given by eq. (11.58), in terms of square brackets.
Let us consider now the $- - -$ amplitude, for which the little group scaling tells
us that

\[ A_3(1^- 2^- 3^-) = C\, \langle 12 \rangle\, \langle 23 \rangle\, \langle 31 \rangle . \tag{11.59} \]

Therefore, the prefactor $C$ should have mass dimension $-2$, which cannot be constructed
from the dimensionless coupling constant of Yang-Mills theory, unless $C = 0$
(the same conclusion holds if we try to construct this amplitude with square brackets).
Likewise, we conclude that the $+ + +$ amplitude is zero as well.
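The bookkeeping of this subsection (exponents fixed by little group scaling, then the mass dimension of the prefactor $C$) can be summarized in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, gluon helicities are encoded as $\pm 1$, and each bracket is assumed to carry mass dimension 1, as stated in the text.

```python
def angle_exponents(h1, h2, h3):
    """Exponents (alpha, beta, gamma) of eq. (11.54) for the angle-bracket ansatz."""
    return (h3 - h1 - h2, h1 - h2 - h3, h2 - h3 - h1)

def square_exponents(h1, h2, h3):
    """Exponents (alpha', beta', gamma') of eq. (11.57) for the square-bracket ansatz."""
    return (h1 + h2 - h3, h2 + h3 - h1, h3 + h1 - h2)

def prefactor_dimension(exponents, amplitude_dimension=1):
    """Mass dimension required of C when each bracket has mass dimension 1."""
    return amplitude_dimension - sum(exponents)

for helicities in [(-1, -1, +1), (+1, +1, -1), (-1, -1, -1), (+1, +1, +1)]:
    a = angle_exponents(*helicities)
    s = square_exponents(*helicities)
    # A dimensionless coupling requires prefactor_dimension == 0.
    print(helicities, a, prefactor_dimension(a), s, prefactor_dimension(s))
```

Only the angle form of $- - +$ and the square form of $+ + -$ come out with a dimensionless prefactor; the $- - -$ and $+ + +$ cases require a dimensionful $C$ in both forms, in agreement with the argument above.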

11.3.6 Maximally Helicity Violating amplitudes

Let us consider a tree Feynman diagram contributing to an n-point amplitude, with $n_3$
3-gluon vertices, $n_4$ 4-gluon vertices and $n_I$ internal propagators. These quantities
are related by:

\[ \begin{aligned}
   n + 2\, n_I &= 3\, n_3 + 4\, n_4 , \\
   n_I &= n_3 + n_4 - 1 .
   \end{aligned} \tag{11.60} \]

The second equation is the statement that this graph has no loops. From these equations
we get the following identities:

\[ n = n_3 + 2\, n_4 + 2 , \qquad n_3 - 2\, n_I = 4 - n . \tag{11.61} \]

The contribution of this Feynman graph to the amplitude is made of $n$ polarization
vectors, $n_I$ denominators coming from the internal propagators, and $n_3$ powers of
momentum in the numerator, that come from the 3-gluon vertices (see footnote 7):

\[ A_n(1 \cdots n) \sim
   \frac{\Big[ \prod_{i=1}^{n} \epsilon_i^{\mu_i} \Big] \Big[ \prod_{j=1}^{n_3} L_j^{\nu_j} \Big]}{\prod_{k=1}^{n_I} K_k^2}
   \sim \frac{\text{mass}^{\,n_3}}{\text{mass}^{\,2 n_I}} \sim \text{mass}^{\,4-n} . \tag{11.62} \]

Firstly, we see that the mass dimension of the n-point amplitude is $4 - n$. Moreover,
the amplitude $A_n$ does not carry any Lorentz index. Therefore, in the numerator all
the Lorentz indices $\mu_i$ and $\nu_j$ must be contracted pairwise. These contractions lead to
three types of factors:

\[ \epsilon_i \cdot \epsilon_{i'} , \qquad \epsilon_i \cdot L_j , \qquad L_j \cdot L_{j'} . \tag{11.63} \]

Only-+ amplitude : Now, consider an amplitude with only + helicities. From
eqs. (11.36), we see that all contractions between polarization vectors are proportional
to

\[ \epsilon_+(i; q_i) \cdot \epsilon_+(i'; q_{i'}) \propto \langle q_i q_{i'} \rangle . \tag{11.64} \]

By choosing the auxiliary momenta $q_i$ to be all equal to $q$, we make all these
contractions vanish. Therefore, to obtain a non-zero contribution, it is necessary
to contract all the polarization vectors with momenta from the 3-gluon vertices,
$\epsilon_i \cdot L_j$. But from the first of eqs. (11.61), we see that $n > n_3$, which means that it is
impossible to contract all the $n$ polarization vectors with the $n_3$ momenta from the
vertices. Thus, the all-plus amplitude is zero:

\[ A_n(1^+ 2^+ \cdots n^+) = 0 . \tag{11.65} \]

By the same reasoning, we conclude that the all-minus amplitude is also zero. We can
see here the power that stems from the freedom of choosing the auxiliary vectors $q_i$;
for generic $q_i$'s, this amplitude would still be zero (since it does not depend on the
$q_i$'s), but this zero would result from intricate cancellations among the many graphs
that contribute to $A_n$. Instead, with a smart choice of the auxiliary vectors, we can
make this cancellation happen graph by graph.

Footnote 7: We assume for simplicity Feynman gauge, in which the numerator of the gluon propagator does not
depend on momentum.


− + · · · + amplitude : Consider now an amplitude with one − helicity carried by
the first external leg, and $n - 1$ + helicities carried by the external legs 2 to $n$. Now,
it is convenient to choose the auxiliary vectors as follows

\[ q_2 = q_3 = \cdots = q_n = p_1 . \tag{11.66} \]

Again, all the contractions between pairs of polarization vectors cancel, since we have
$\epsilon_+(i; q_i) \cdot \epsilon_+(i'; q_{i'}) = 0$ for $i, i' \geq 2$, and $\epsilon_-(1; q_1) \cdot \epsilon_+(i; q_i) = 0$ for $i \geq 2$.
Since $n > n_3$, it is not possible to contract all the polarization vectors with momenta
from the 3-gluon vertices, and these amplitudes also vanish at tree level:

\[ A_n(1^- 2^+ \cdots n^+) = 0 . \tag{11.67} \]

(We also have $A_n(1^+ 2^- \cdots n^-) = 0$ at tree level.)

Maximally Helicity Violating amplitudes : Let us flip one more helicity, e.g. with
the assignment $1^- 2^- 3^+ \cdots n^+$. This time, a useful choice of auxiliary vectors is

\[ q_1 = q_2 = p_n , \qquad q_3 = q_4 = \cdots = q_n = p_1 . \tag{11.68} \]

With this choice, all the contractions of polarization vectors are zero, except:

\[ \epsilon_-(2; q_2) \cdot \epsilon_+(i; q_i) \neq 0 \quad \text{for } i = 3, \cdots, n - 1 . \tag{11.69} \]

Thus, this time, we need to contract the remaining $n - 2$ polarization vectors with
the $n_3$ momenta from the 3-gluon vertices, which is possible (provided that $n_4 = 0$,
which means that diagrams containing 4-gluon vertices do not contribute to the
$- - + \cdots +$ amplitude for our choice of auxiliary vectors). Therefore, this assignment
of helicities gives a non-zero amplitude:

\[ A_n(1^- 2^- 3^+ \cdots n^+) \neq 0 . \tag{11.70} \]

These amplitudes, called the Maximally Helicity Violating (MHV) amplitudes, are
the simplest non-zero amplitudes. As we shall see later, they are given at tree level by
very compact formulas in terms of square and angle brackets (note that up to $n = 5$
external lines, all the non-zero amplitudes are MHV amplitudes). Generically, the
complexity of amplitudes increases with the number of − helicities, culminating with
amplitudes that have comparable numbers of − and + helicities (increasing further
the number of − helicities then reduces the complexity).

11.4 Britto-Cachazo-Feng-Witten on-shell recursion


11.4.1 Main idea
As we have seen, the main obstacle to the calculation of amplitudes by the usual
Feynman rules is the proliferation of graphs as one increases the number of external
legs. This problem remains true even after one has factorized the colour factors, even
if it is somewhat mitigated by the fact that the number of cyclic-ordered graphs grows
at a slower pace.
This issue could be avoided if there was a way to break down a tree amplitude into
smaller pieces (themselves tree amplitudes) that have a smaller number of external
legs. It turns out that an amplitude naturally factorizes into two sub-amplitudes
when one of its internal propagators goes on-shell. The physical reason of such a
factorization is that on-shell momenta correspond to infinitely long-lived particles.
Thus, the two sub-amplitudes on each side of this on-shell propagator do not talk
to one another. The other advantage of this situation is that the two sub-amplitudes
would themselves be on-shell, and therefore we may use for them spinor-helicity
formulas that could have been previously obtained for amplitudes with fewer external
legs. If this were possible, we would thus obtain a recursive relationship (in the
number of external legs) for on-shell amplitudes.

11.4.2 Analytical properties of amplitudes with shifted momenta


Unfortunately, with fixed generic external momenta, tree amplitudes do not have
internal on-shell propagators. The trick is to consider a one-parameter complex deformation
of the external momenta, adjusted in order to make an internal denominator
vanish:

\[ A_n(12 \cdots n) \to A_n(12 \cdots n; z) , \tag{11.71} \]

where $z$ is a complex variable that controls the deformation. The singularities of tree
Feynman graphs come from the zeroes of the denominators of their internal propagators,
which give poles in $z$. Our goal will be to choose this deformation in such a way that
the total momentum remains conserved, and the deformed external momenta are still
on-shell. With such a choice, we will be able to reuse the on-shell formulas obtained
for smaller amplitudes.
Let us consider the ratio $A_n(\cdots; z)/z$. Besides the poles coming from the internal
propagators, the ratio also has a simple pole at $z = 0$. Let us assume that $A_n(\cdots; z)$
vanishes when $|z| \to \infty$, so that the integral of $A_n(\cdots; z)/z$ on a contour at infinity
in the complex plane vanishes. Then, we may write

\[ 0 = \oint_\gamma \frac{dz}{2\pi i}\, \frac{A_n(\cdots; z)}{z}
     = A_n(\cdots; 0) + \sum_{z_i \in \{\text{poles of } A_n\}} \mathrm{Res}\, \frac{A_n(\cdots; z)}{z} \bigg|_{z_i} . \tag{11.72} \]

The first term, $A_n(\cdots; z = 0)$, is nothing but the amplitude we aim at calculating.
This formula therefore expresses it in terms of the residues of $A_n(\cdots; z)/z$ at the
simple poles corresponding to the internal propagators of the amplitude. Moreover,
these residues will be factorizable into smaller on-shell amplitudes, precisely because
the poles $z_i$ correspond to the on-shellness of some internal propagator.
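The contour argument behind eq. (11.72) can be illustrated with a toy rational function standing in for the shifted amplitude. In the sketch below, the pole positions and residues are arbitrary illustrative numbers, not derived from any Feynman graph; the function has only simple poles and vanishes at large $|z|$, which are the two properties used in the text.

```python
import cmath

poles = [1.0 + 2.0j, -0.5j, 3.0 + 0j]     # hypothetical pole positions z_k
coeffs = [2.0 + 0j, 1.0 - 1.0j, 0.5j]     # hypothetical residues of A at z_k

def amplitude(z):
    """Toy 'shifted amplitude': simple poles only, vanishing as |z| -> infinity."""
    return sum(c / (z - zk) for zk, c in zip(poles, coeffs))

# The residue of A(z)/z at z_k is c_k / z_k, so eq. (11.72) predicts A(0) = -sum_k c_k / z_k.
from_residues = -sum(c / zk for zk, c in zip(poles, coeffs))

# Contour integral of A(z)/z on the circle |z| = R enclosing all poles.
# With z = R exp(i theta), dz / (2 pi i z) = d theta / (2 pi): the integral is the
# average of A over the circle, evaluated here with N equally spaced points.
R, N = 10.0, 2000
contour = sum(amplitude(R * cmath.exp(2j * cmath.pi * m / N)) for m in range(N)) / N

print(amplitude(0), from_residues, abs(contour))
```

The first two printed values should agree, and the contour integral should vanish up to rounding errors.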

11.4.3 Minimal momentum shifts


There are many ways to implement a complex shift of the external momenta, but all
of them must fulfill the following conditions:

• The sum of the shifted incoming momenta should remain zero. Therefore, we
must shift at least two momenta (and the simplest is to shift only two).
• The shifted momenta should stay on-shell at all z.
• The amplitude evaluated at the shifted momenta should go to zero as |z| → ∞.

The condition of momentum conservation is trivially satisfied by choosing two momenta
$i$, $j$ to be shifted, and by giving them opposite shifts:

\[ \begin{aligned}
   p_i \to \hat{p}_i = p_i(z) &\equiv p_i + z\, k , \\
   p_j \to \hat{p}_j = p_j(z) &\equiv p_j - z\, k ,
   \end{aligned} \tag{11.73} \]

where we denote with a hat the shifted momenta. All the momenta $p_k$ for $k \neq i, j$ are
left unmodified. The on-shell conditions for $\hat{p}_{i,j}$ are satisfied provided that

\[ k^2 = 0 , \qquad p_i \cdot k = 0 , \qquad p_j \cdot k = 0 . \tag{11.74} \]
It turns out that these equations have two solutions (up to an arbitrary prefactor),
provided we allow complex momenta. In the spinor notation, the first condition is
automatically satisfied if $K$ can be factorized as in eq. (11.17), while the second and
third conditions become

\[ \langle ik \rangle\, [ik] = 0 , \qquad \langle jk \rangle\, [jk] = 0 . \tag{11.75} \]

This explains why we need a complex momentum $k^\mu$. Indeed, for a real $k^\mu$, $|k\rangle$ and
$|k]$ are related by complex conjugation, and the above conditions reduce to $\langle ik \rangle =
\langle jk \rangle = 0$. With two-component spinors, this implies $|k\rangle \propto |i\rangle$ and $|k\rangle \propto |j\rangle$, which
is in general impossible. By allowing a complex momentum $k^\mu$, we let $|k\rangle$ and $|k]$
be independent, which allows us to solve the above conditions by having for instance:

\[ |k\rangle = |i\rangle , \qquad |k] = |j] . \tag{11.76} \]
(The other independent solution consists in exchanging the roles of $i$ and $j$.) The
bi-spinors corresponding to the shifted momenta are

\[ \hat{P}_i = |i] \langle i| + z\, |j] \langle i| = \big( |i] + z\, |j] \big) \langle i| , \qquad
   \hat{P}_j = |j] \langle j| - z\, |j] \langle i| = |j] \big( \langle j| - z\, \langle i| \big) , \tag{11.77} \]

from which we read the shifted spinors:

\[ \begin{aligned}
   |\hat\imath\rangle &= |i\rangle , & |\hat\jmath\rangle &= |j\rangle - z\, |i\rangle , \\
   |\hat\imath] &= |i] + z\, |j] , & |\hat\jmath] &= |j] .
   \end{aligned} \tag{11.78} \]

Figure 11.3: Propagators affected by the momentum shift (shown in dark) in a tree amplitude. The lighter colored lines do not depend on $z$. The propagators on the external lines are not actually part of the expression of the amplitude.
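The two defining properties of this shift (total momentum unchanged, shifted momenta still on-shell for every $z$) follow directly from eq. (11.77) and can be verified numerically. The sketch below uses arbitrary complex spinors and an arbitrary value of $z$; all names and numbers are illustrative.

```python
def outer(lam, xi):
    """Bi-spinor P_ab = lam_a xi_b, as in eq. (11.17)."""
    return [[lam[a] * xi[b] for b in range(2)] for a in range(2)]

def det(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def add(A, B):
    return [[A[a][b] + B[a][b] for b in range(2)] for a in range(2)]

# Illustrative spinors for the two shifted lines: lam = |.] and xi = <.|
lam_i, xi_i = (1.0 + 1.0j, 0.5 + 0j), (0.2 - 1.0j, 1.5 + 0j)
lam_j, xi_j = (-0.7 + 0j, 2.0 + 0.3j), (1.0 + 0j, -1.0 + 2.0j)
z = 0.8 - 1.3j

# Shifted spinors of eq. (11.78): |i^] = |i] + z|j]  and  <j^| = <j| - z<i|
lam_i_hat = tuple(a + z * b for a, b in zip(lam_i, lam_j))
xi_j_hat = tuple(a - z * b for a, b in zip(xi_j, xi_i))

P_i_hat = outer(lam_i_hat, xi_i)
P_j_hat = outer(lam_j, xi_j_hat)

before = add(outer(lam_i, xi_i), outer(lam_j, xi_j))
after = add(P_i_hat, P_j_hat)
conservation_error = max(abs(before[a][b] - after[a][b]) for a in range(2) for b in range(2))

# det = 0 is the on-shell condition, eq. (11.16).
on_shell_error = max(abs(det(P_i_hat)), abs(det(P_j_hat)))
print(conservation_error, on_shell_error)
```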

11.4.4 Behaviour at |z| → ∞


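Until now, our description of this method has been completely generic and applicable
to all sorts of quantum field theories, since no reference has been made to the details
of its Lagrangian. These details become important when discussing the condition that
$A_n(\cdots; z)$ vanishes at infinity. Let us discuss the behaviour at large $z$ in the case of
Yang-Mills theory. Firstly, a $z$ dependence enters in the polarization vectors of the
external lines $i$ and $j$. For generic auxiliary vectors, we have:

\[ \begin{aligned}
   \epsilon_+^\mu(\hat\imath; q) &= -\frac{\langle q|\, \bar\sigma^\mu\, |\hat\imath]}{\sqrt{2}\, \langle q \hat\imath \rangle} \sim z , &
   \epsilon_+^\mu(\hat\jmath; q) &= -\frac{\langle q|\, \bar\sigma^\mu\, |\hat\jmath]}{\sqrt{2}\, \langle q \hat\jmath \rangle} \sim z^{-1} , \\
   \epsilon_-^\mu(\hat\imath; q) &= -\frac{\langle \hat\imath|\, \bar\sigma^\mu\, |q]}{\sqrt{2}\, [\hat\imath q]} \sim z^{-1} , &
   \epsilon_-^\mu(\hat\jmath; q) &= -\frac{\langle \hat\jmath|\, \bar\sigma^\mu\, |q]}{\sqrt{2}\, [\hat\jmath q]} \sim z .
   \end{aligned} \tag{11.79} \]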

Inside a graph contributing to this n-point amplitude, we can follow a string of
propagators that all carry shifted momenta, from the external line $i$ to the external
line $j$, as illustrated in the figure 11.3. For all these propagators, since $k^2 = 0$, the
denominators are linear in $z$. In addition, the 3-gluon vertices along this string of
propagators are linear in the momenta, and therefore scale as $z$. Along this string,
there are $s$ vertices (3-gluon or 4-gluon vertices), and $s - 1$ propagators, hence a
global behaviour at most $\sim z$ at large $z$ (obtained when all these vertices are 3-gluon
vertices, that scale as $z$). For the assignment $\{h_i = -, h_j = +\}$ of polarizations, we
thus find an overall behaviour (see footnote 8) in $z^{-1}$, valid graph by graph.
For other combinations of polarizations on the lines $i$, $j$, this diagrammatic argument
suggests that they do not go to zero. However, the actual behaviour for
$\{h_i = -, h_j = -\}$ and $\{h_i = +, h_j = +\}$ is better than the one suggested by this
graph by graph estimate. Firstly, note that this problem is reminiscent of the eikonal
approximation, in which a hard on-shell particle punches through a background of
much softer particles that very mildly disturb its motion. This can be studied by
splitting the gauge field $A^\mu$ into a hard component $a^\mu$ that describes the gluons along
the string with shifted momenta and a soft background $\mathcal{A}^\mu$ (describing the unshifted
gluons attached to the hard ones),

\[ A^\mu \equiv \mathcal{A}^\mu + a^\mu . \tag{11.80} \]

When rewriting the Yang-Mills Lagrangian in terms of these fields, it is sufficient
to keep terms that are quadratic in the hard field, since in our problem exactly two
external lines are shifted:

\[ \mathcal{L}_{\rm YM} = \cdots - \frac{1}{4}\, \mathrm{tr}\, \Big( \big( D_\mu a_\nu - D_\nu a_\mu \big) \big( D^\mu a^\nu - D^\nu a^\mu \big) \Big)
   + \frac{ig}{2}\, \mathrm{tr}\, \Big( \big[ a_\mu , a_\nu \big] F^{\mu\nu} \Big) + \cdots , \tag{11.81} \]

where the covariant derivative $D_\mu$ is constructed with the background field. When
splitting the gauge potential as in eq. (11.80), one may fix independently the gauge
for the background and for the fluctuation $a^\mu$. For the latter, a convenient choice is
the background field gauge,

\[ D_\mu a^\mu(x) = \omega(x) . \tag{11.82} \]

After adding the gauge fixing term, the quadratic part of the Lagrangian becomes

\[ \mathcal{L}_{\rm YM+GF} = \cdots - \frac{1}{4}\, \mathrm{tr}\, \Big( \big( D_\mu a_\alpha \big) \big( D^\mu a^\alpha \big) \Big)
   + \frac{ig}{2}\, \mathrm{tr}\, \Big( \big[ a_\mu , a_\nu \big] F^{\mu\nu} \Big) + \cdots \tag{11.83} \]

In this equation, the first term possesses an extended Lorentz symmetry, since it is
invariant under independent Lorentz transformations of the fluctuations and of the
background, while the second term is only invariant under simultaneous transformations
of $\mathcal{A}^\mu$ and $a^\mu$.
Let us denote $M_{\alpha\beta}[\mathcal{A}]$ the propagator of the fluctuation $a^\mu$, amputated of its
final lines. This propagator contains 3-gluon couplings to the background field, that
come from the first term of eq. (11.83), and 4-gluon couplings to the background
field coming from the second term. With only 3-gluon couplings, we have $M_{\alpha\beta} \sim z$
(because there is one more vertex than propagators), and each 4-gluon vertex removes
one power of $z$. Given the Lorentz structure of eq. (11.83), we may write

\[ M_{\alpha\beta} = \big( c_1 z + c_0 + c_{-1} z^{-1} + \cdots \big)\, g_{\alpha\beta} + A_{\alpha\beta} + z^{-1} B_{\alpha\beta} + \cdots \tag{11.84} \]

In this formula, the first term comes entirely from the first term of eq. (11.83), whose
extended Lorentz symmetry leads to the factor $g_{\alpha\beta}$. All the coefficients in this
expansion are functionals of the soft field. The term $A_{\alpha\beta}$, that comes from a single
insertion of the 4-gluon vertex, is antisymmetric. The subsequent terms correspond to
2 or more insertions of the 4-gluon vertex. These terms have no definite symmetry,
but they are not needed in the discussion. The amputated 2-point function $M_{\alpha\beta}$ also
obeys the following on-shell Ward identities:

\[ \hat{p}_i^\alpha\, M_{\alpha\beta}\, \epsilon_{h_j}^\beta(\hat\jmath) = 0 , \qquad
   \epsilon_{h_i}^\alpha(\hat\imath)\, M_{\alpha\beta}\, \hat{p}_j^\beta = 0 , \tag{11.85} \]

with shifted on-shell momenta and polarization vectors. Note that, unlike in an
Abelian gauge theory, it is necessary to contract one side of the function with a
physical polarization vector for the identity to hold.

Footnote 8: When the shifted amplitude decreases faster than $z^{-1}$, one may obtain a more compact expression
by integrating $A_n(\cdots; z)(1 - z/z_*)/z$, where $z_*$ is one of the poles of $A_n$. There is no boundary term
thanks to the faster decrease of $A_n$, and the additional subtraction removes the contribution from the pole
$z_*$, leading to an expression with one less term.

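The shifted amplitude $A_n$ is obtained by keeping $n - 2$ powers of the background
field in $M_{\alpha\beta}$, and by contracting with the appropriate polarization vectors:

\[ A_n \sim \epsilon_{h_i}^\alpha(\hat\imath; q)\, M_{\alpha\beta}\, \epsilon_{h_j}^\beta(\hat\jmath; q') . \tag{11.86} \]

Choosing the auxiliary vectors to be $q \equiv p_j$ and $q' \equiv p_i$ (each external line uses the
momentum of the other shifted line as its reference vector), the explicit form of the
polarization vectors is (see footnote 9)

\[ \begin{aligned}
   \epsilon_+^\mu(\hat\imath; q) &= -\frac{\sqrt{2}}{\langle ji \rangle}\, \big( k^{*\mu} + z\, p_j^\mu \big) , &
   \epsilon_-^\mu(\hat\imath; q) &= -\frac{\sqrt{2}}{[ij]}\, k^\mu , \\
   \epsilon_+^\mu(\hat\jmath; q') &= -\frac{\sqrt{2}}{\langle ij \rangle}\, k^\mu , &
   \epsilon_-^\mu(\hat\jmath; q') &= -\frac{\sqrt{2}}{[ji]}\, \big( k^{*\mu} - z\, p_i^\mu \big) .
   \end{aligned} \tag{11.87} \]

Note that with this choice of auxiliary vectors, we have lost a power of $z$ in the
denominators of $\epsilon_-^\mu(\hat\imath; q)$ and $\epsilon_+^\mu(\hat\jmath; q')$. This will not change the final results, since
on-shell amplitudes do not depend on the auxiliary vectors. As a check of the insight
gained from Feynman diagrams, let us first consider the case $\{h_i = -, h_j = +\}$. The
shifted amplitude behaves as

\[ A_{n;-+} \sim \underbrace{(k \cdot k)}_{0}\, (c_1 z + \cdots) + \underbrace{k^\alpha k^\beta A_{\alpha\beta}}_{0} + O(z^{-1}) \sim O(z^{-1}) . \tag{11.88} \]

Footnote 9: In order to obtain these expressions, one may contract the polarization vectors with $\sigma_\mu$ to first obtain the
corresponding $2 \times 2$ matrix, which can be done by using $\bar\sigma_\mu^{ab}\, \sigma^\mu_{cd} = 2\, \delta^a{}_d\, \delta^b{}_c$, $|j] \langle i| = k_\mu \sigma^\mu$,
and $|i] \langle j| = k^{*\mu} \sigma_\mu$.
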

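The first term vanishes because $k$ is on-shell, and the second one thanks to the
antisymmetry of $A_{\alpha\beta}$. Next, consider the case $\{h_i = -, h_j = -\}$, for which we
obtain

\[ \begin{aligned}
   A_{n;--} &\sim k^\alpha\, M_{\alpha\beta}\, \epsilon_-^\beta(\hat\jmath; q') \\
   &= -z^{-1}\, p_i^\alpha\, M_{\alpha\beta}\, \epsilon_-^\beta(\hat\jmath; q') \\
   &\sim z^{-1}\, p_i \cdot (k^* - z\, p_i)\, (c_1 z + \cdots) + z^{-1}\, p_i^\alpha\, (k^{*\beta} - z\, p_i^\beta)\, A_{\alpha\beta} + O(z^{-1}) \\
   &\sim O(z^{-1}) .
   \end{aligned} \tag{11.89} \]

The second line is obtained by using the Ward identity, and in the third line all terms
that could be larger than $z^{-1}$ vanish due to $p_i^2 = p_i \cdot k^* = 0$ and thanks to the
antisymmetry of $A_{\alpha\beta}$. The case $\{h_i = +, h_j = +\}$ is very similar and leads to

\[ \begin{aligned}
   A_{n;++} &\sim \epsilon_+^\alpha(\hat\imath; q)\, M_{\alpha\beta}\, k^\beta \\
   &= z^{-1}\, \epsilon_+^\alpha(\hat\imath; q)\, M_{\alpha\beta}\, p_j^\beta \\
   &\sim z^{-1}\, (k^* + z\, p_j) \cdot p_j\, (c_1 z + \cdots) + z^{-1}\, (k^{*\alpha} + z\, p_j^\alpha)\, p_j^\beta\, A_{\alpha\beta} + O(z^{-1}) \\
   &\sim O(z^{-1}) .
   \end{aligned} \tag{11.90} \]

Finally, in the last case, $\{h_i = +, h_j = -\}$, we obtain $A_{n;+-} \sim O(z^3)$, and therefore
we cannot use such a shift in eq. (11.72).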

11.4.5 Recursion formula


Since all-+ amplitudes are zero, the assignment of helicities that we shall consider is
generically of the form $1^- \cdots r^- (r + 1)^+ \cdots n^+$, and the shift applied to the lines
$i = 1$, $j = n$ leads to a vanishing amplitude when $|z| \to \infty$. We can therefore apply
eq. (11.72) and write the amplitude in the following way:

\[ A_n(\cdots) = -\sum_{z_i \in \{\text{poles of } A_n\}} \mathrm{Res}\, \frac{A_n(\cdots; z)}{z} \bigg|_{z_i} . \tag{11.91} \]

As explained earlier, the poles $z_i$ come from the vanishing denominators of the
internal propagators, i.e. one of the dark colored propagators in the figure 11.4. Let
us denote $K_I$ the momentum (before the shift) carried by the propagator producing
the pole, with the convention that it is oriented in the same direction as $p_1$. The shift
changes this momentum into

\[ K_I \to \hat{K}_I \equiv K_I + z\, k , \tag{11.92} \]

and the condition that the denominator of the propagator vanishes after the shift is

\[ 0 = \hat{K}_I^2 = K_I^2 + 2\, z\, K_I \cdot k , \quad \text{i.e.} \quad z_I = -\frac{K_I^2}{2\, K_I \cdot k} . \tag{11.93} \]
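The location of the pole in eq. (11.93) can be checked numerically in the $2 \times 2$ matrix representation, where the invariant square of any 4-vector is the determinant of its matrix (the same algebra as in eq. (11.16), without imposing the on-shell condition). In the sketch below, the off-shell momentum `K` and the spinors defining the shift vector are arbitrary illustrative values, and $2 K_I \cdot k$ is extracted from determinants rather than from components.

```python
def det(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def to_matrix(p0, p1, p2, p3):
    """2x2 matrix of eq. (11.15); its determinant is the invariant square p^2."""
    return [[p0 + p3, p1 - 1j * p2], [p1 + 1j * p2, p0 - p3]]

def shifted(A, z, B):
    """Matrix A + z B."""
    return [[A[a][b] + z * B[a][b] for b in range(2)] for a in range(2)]

# Generic off-shell internal momentum K_I (illustrative values).
K = to_matrix(5.0, 1.0, -2.0, 0.5)

# Light-like shift vector k = |j]<i|, built from two illustrative spinors.
lam_j, xi_i = (1.0 + 0.5j, -2.0 + 0j), (0.3 + 0j, 1.0 - 1.0j)
k = [[lam_j[a] * xi_i[b] for b in range(2)] for a in range(2)]

K2 = det(K)                                       # K_I^2
two_K_dot_k = det(shifted(K, 1.0, k)) - K2 - det(k)   # 2 K_I.k from (K+k)^2 - K^2 - k^2
z_I = -K2 / two_K_dot_k                           # eq. (11.93)

on_shell_check = det(shifted(K, z_I, k))          # shifted K_I^2 at z = z_I
print(z_I, abs(on_shell_check))
```

Because $k^2 = 0$, the shifted invariant is linear in $z$, so there is exactly one pole per propagator.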
11. M ODERN TOOLS FOR TREE LEVEL AMPLITUDES 381

Figure 11.4: Setup for the BCFW recursion formula with shifts applied to the external lines 1 and n. The pole comes from the propagator carrying the momentum KI, highlighted in dark. This singular propagator divides the graph into left and right sub-amplitudes, AL and AR.

The singular propagator divides the amplitude into left and right sub-amplitudes, so
that we may write:

$$
A_n(\hat{1}\,2\cdots(n-1)\,\hat{n}; z) \equiv \sum_{h=\pm} A_L(\hat{1}\,2\cdots -\hat{K}_I^{+h}; z)\;\frac{i}{\hat{K}_I^2}\;A_R(\hat{K}_I^{-h}\cdots(n-1)\,\hat{n}; z)\,, \tag{11.94}
$$

with a sum over the helicity h of the intermediate gluon10 . From this expression, the
residue at the pole zI of An (· · · ; z)/z takes the form

$$
\mathrm{Res}_{z_I}\,\frac{A_n(\cdots; z)}{z} = -\sum_{h=\pm} A_L(\hat{1}\,2\cdots -\hat{K}_I^{+h}; z_I)\;\frac{i}{K_I^2}\;A_R(\hat{K}_I^{-h}\cdots(n-1)\,\hat{n}; z_I)\,. \tag{11.95}
$$

Both AL and AR have strictly fewer than n external lines, which means that the formula
is recursive: it expresses an amplitude in terms of smaller amplitudes, eventually
breaking it down to 3-point amplitudes. Moreover, the crucial point here is that,
when evaluated at the value zI that gives $\hat{K}_I^2 = 0$, the left and right sub-amplitudes
have only on-shell (but complex) external momenta. Therefore, this recursion never
requires off-shell amplitudes, which is of utmost importance for keeping out of the
calculation unnecessarily complicated kinematics and unphysical degrees of freedom.
Since each internal z-dependent propagator can be singular for some z, eq. (11.91)
contains one term for each such propagator. There are at most n − 3 terms in this sum,
corresponding to the partitions of [2, n−1] = [2, l]∪[l+1, n−1] with 2 ≤ l ≤ n−2.

10 Both AL and AR are defined with all gluons incoming. This is why one has argument $-\hat{K}_I$ and the
other one $+\hat{K}_I$. For the same reason, the helicity is +h on one side and −h on the other side.

Figure 11.5: Setup for applying the BCFW recursion formula to the calculation of the − − + · · · + MHV amplitude. We have indicated explicitly all the helicities. Note that only h = + is allowed in the sum over the helicity of the singular propagator (otherwise the right-side sub-amplitude would be a vanishing all-+ amplitude).

11.4.6 Parke-Taylor formula for MHV amplitudes

MHV recursion formula : As an illustration of the BCFW recursion formula, let


us determine the explicit expression of the MHV amplitudes11 An (1− 2− 3+ · · · n+ ).
We show all the helicity assignments, including those of the singular propagator, in
the figure 11.5. In order to avoid having an all-+ sub-amplitude on the right, we must
choose h = +. This choice makes AR an − + · · · + amplitude, which is also zero
unless it is a 3-point amplitude. Thus, the BCFW formula reduces to a single term:

$$
A_n(1^- 2^- 3^+\cdots n^+) = A_{n-1}(\hat{1}^- 2^- 3^+\cdots(n-2)^+\, {-\hat{K}_I^+}; z_I)\;\frac{i}{K_I^2}\;A_3(\hat{K}_I^-\,(n-1)^+\,\hat{n}^+; z_I)\,, \tag{11.96}
$$

where the momentum carried by the singular propagator is (before the shift)

KI = −(pn−1 + pn ) . (11.97)

In the right hand side of eq. (11.96), the factor on the right is an already known 3-point
amplitude, and the factor on the left is an MHV amplitude with n − 1 external legs.

11 This assignment of helicities, with the negative helicities carried by adjacent lines, is the simplest.
MHV amplitudes with non-adjacent negative helicities are also given by the Parke-Taylor formula, but the
proof is a bit more complicated in this case.

Four-point MHV amplitude : Let us now calculate the first few iterations of
this recursion, in order to guess a formula for the MHV amplitude that will be
our hypothesis for an inductive proof. Firstly, consider the − − ++ 4-point MHV
amplitude. In this case, the BCFW recursion formula gives

$$
A_4(1^- 2^- 3^+ 4^+) = A_3(\hat{1}^- 2^-\, {-\hat{K}_I^+}; z_I)\;\frac{i}{K_I^2}\;A_3(\hat{K}_I^-\, 3^+\, \hat{4}^+; z_I)\,, \tag{11.98}
$$

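and both amplitudes in the right hand side are known. This gives:

$$
A_4(1^- 2^- 3^+ 4^+) = 2\,i\,g^2\;
\frac{\langle\hat{1}2\rangle^3}{\langle 2\hat{K}_I\rangle\,\langle\hat{K}_I\hat{1}\rangle}\;
\frac{1}{\langle 12\rangle\,[12]}\;
\frac{[3\hat{4}]^3}{[\hat{4}\hat{K}_I]\,[\hat{K}_I 3]}\,. \tag{11.99}
$$

Using the fact that

$$
|\hat{1}\rangle = |1\rangle\,,\quad |\hat{1}] = |1] + z\,|4]\,,\quad |\hat{4}\rangle = |4\rangle - z\,|1\rangle\,,\quad |\hat{4}] = |4]\,, \tag{11.100}
$$

we obtain

$$
\begin{aligned}
|\hat{K}_I\rangle[\hat{K}_I| &= |1\rangle[1| + |2\rangle[2| + z_I\,|1\rangle[4|\,,\\
\langle 2\hat{K}_I\rangle\,[\hat{K}_I 4] &= \langle 21\rangle\,[14]\,,\\
\langle 1\hat{K}_I\rangle\,[\hat{K}_I 3] &= \langle 12\rangle\,[23]\,,
\end{aligned}
\tag{11.101}
$$

which leads to

$$
A_4(1^- 2^- 3^+ 4^+) = 2\,i\,g^2\;\frac{[34]^3}{[41]\,[12]\,[23]}\,. \tag{11.102}
$$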
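This formula, that depends only on square brackets, can also be expressed in terms
of angle brackets. Let us multiply the numerator and denominator by $\langle 12\rangle^3$. Then,
momentum conservation leads to

$$
\begin{aligned}
[41]\,\langle 12\rangle &= -[43]\,\langle 32\rangle\,,\\
\langle 12\rangle\,[23] &= -\langle 14\rangle\,[43]\,,\\
\langle 12\rangle\,[12] &= (p_1+p_2)^2 = (p_3+p_4)^2 = \langle 34\rangle\,[34]\,,
\end{aligned}
\tag{11.103}
$$

and we finally obtain

$$
A_4(1^- 2^- 3^+ 4^+) = 2\,i\,g^2\;\frac{\langle 12\rangle^3}{\langle 23\rangle\,\langle 34\rangle\,\langle 41\rangle}\,. \tag{11.104}
$$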
This formula could in principle have been obtained from eq. (11.11), by putting
the external lines on-shell and by using − − ++ polarization vectors, at the cost of
considerable effort. We see here the power of on-shell recursion: since one only
manipulates on-shell sub-amplitudes with physical polarizations, the complexity of
all the intermediate expressions is comparable to that of the final result, unlike with
the standard method.
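The equality of the square-bracket form (11.102) and the angle-bracket form (11.104) can be checked numerically. The sketch below is only an illustration under stated assumptions: the spinors are generic complex two-component vectors, both kinds of brackets are taken to be the antisymmetric contraction of two spinors (overall sign conventions drop out of the comparison), and momentum conservation is imposed by solving for the last two square spinors. The helper names are hypothetical.

```python
import random

def bracket(a, b):
    """Antisymmetric contraction of two 2-component spinors."""
    return a[0] * b[1] - a[1] * b[0]

def random_spinor(rng):
    return (complex(rng.uniform(-1, 1), rng.uniform(-1, 1)),
            complex(rng.uniform(-1, 1), rng.uniform(-1, 1)))

def four_point_spinors(seed=1):
    """Random complex spinors obeying sum_i lam_i^a lt_i^b = 0 (momentum conservation)."""
    rng = random.Random(seed)
    lam = [random_spinor(rng) for _ in range(4)]   # angle spinors of lines 1..4
    lt = [random_spinor(rng) for _ in range(2)]    # square spinors of lines 1 and 2
    M = [[lam[0][a] * lt[0][b] + lam[1][a] * lt[1][b] for b in range(2)]
         for a in range(2)]
    det = bracket(lam[2], lam[3])
    # Solve lam_3^a lt_3^b + lam_4^a lt_4^b = -M^{ab} for the remaining square spinors.
    lt3 = tuple(-(lam[3][1] * M[0][b] - lam[3][0] * M[1][b]) / det for b in range(2))
    lt4 = tuple(-(-lam[2][1] * M[0][b] + lam[2][0] * M[1][b]) / det for b in range(2))
    return lam, lt + [lt3, lt4]

lam, lt = four_point_spinors()

def ang(i, j):
    """<ij> with 1-based line labels."""
    return bracket(lam[i - 1], lam[j - 1])

def sq(i, j):
    """[ij] with 1-based line labels."""
    return bracket(lt[i - 1], lt[j - 1])

square_form = sq(3, 4) ** 3 / (sq(4, 1) * sq(1, 2) * sq(2, 3))
angle_form = ang(1, 2) ** 3 / (ang(2, 3) * ang(3, 4) * ang(4, 1))
print(abs(square_form - angle_form))
```

The common prefactor 2 i g² is omitted, since only the kinematic factors are compared.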
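
Five-point MHV amplitude : Consider now the amplitude A5 (1− 2− 3+ 4+ 5+ ).
The BCFW recursion formula (11.96) now reads:

$$
\begin{aligned}
A_5(1^- 2^- 3^+ 4^+ 5^+) &= A_4(\hat{1}^- 2^- 3^+\, {-\hat{K}_I^+}; z_I)\;\frac{i}{K_I^2}\;A_3(\hat{K}_I^-\, 4^+\, \hat{5}^+; z_I)\\
&= (\sqrt{2}\,g)^3\, i^2\;
\frac{\langle\hat{1}2\rangle^3}{\langle 23\rangle\,\langle 3\hat{K}_I\rangle\,\langle\hat{K}_I\hat{1}\rangle}\;
\frac{1}{\langle 45\rangle\,[45]}\;
\frac{[4\hat{5}]^3}{[\hat{5}\hat{K}_I]\,[\hat{K}_I 4]}\,,
\end{aligned}
\tag{11.105}
$$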

 
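where we have chosen to express $K_I^2$ as $(p_4+p_5)^2 = \langle 45\rangle\,[45]$. This time, we use

$$
\begin{aligned}
&|\hat{1}\rangle = |1\rangle\,,\quad |\hat{1}] = |1] + z\,|5]\,,\quad |\hat{5}\rangle = |5\rangle - z\,|1\rangle\,,\quad |\hat{5}] = |5]\,,\\
&|\hat{K}_I\rangle[\hat{K}_I| = -|4\rangle[4| - |5\rangle[5| + z_I\,|1\rangle[5|\,,\\
&\langle 3\hat{K}_I\rangle\,[\hat{K}_I\hat{5}] = -\langle 34\rangle\,[45]\,,\\
&\langle\hat{1}\hat{K}_I\rangle\,[\hat{K}_I 4] = -\langle 51\rangle\,[45]\,,
\end{aligned}
\tag{11.106}
$$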

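which gives

$$
A_5(1^- 2^- 3^+ 4^+ 5^+) = (\sqrt{2}\,g)^3\, i^2\;\frac{\langle 12\rangle^3}{\langle 23\rangle\,\langle 34\rangle\,\langle 45\rangle\,\langle 51\rangle}\,. \tag{11.107}
$$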

This remarkably simple formula, that encapsulates the sum of 10 cyclic-ordered


Feynman diagrams (in QCD, this corresponds to 25 diagrams before colour ordering),
in fact exhausts all the possibilities for 5-point functions (the + + − − − amplitude is
given by the same formula with square brackets instead of angle brackets).

Parke-Taylor formula : The previous results for 3, 4 and 5-point MHV amplitudes
lead us to conjecture the following general formula:

$$
A_n(1^- 2^- 3^+\cdots n^+) = (\sqrt{2}\,g)^{n-2}\, i^{n-3}\;\frac{\langle 12\rangle^3}{\langle 23\rangle\,\langle 34\rangle\cdots\langle (n-1)n\rangle\,\langle n1\rangle}\,, \tag{11.108}
$$

known as the Parke-Taylor formula. Let us assume the formula to be true for all
p < n, and consider now the case of the n-point MHV amplitude. The BCFW
recursion formula reads:

$$
\begin{aligned}
A_n(1^- 2^- 3^+\cdots n^+) &= i\,A_{n-1}(\hat{1}^- 2^- 3^+\cdots(n-2)^+\, {-\hat{K}_I^+}; z_I)\;\frac{1}{K_I^2}\;A_3(\hat{K}_I^-\,(n-1)^+\,\hat{n}^+; z_I)\\
&= (\sqrt{2}\,g)^{n-2}\, i^{n-3}\;
\frac{\langle\hat{1}2\rangle^3}{\langle 23\rangle\cdots\langle (n-2)\hat{K}_I\rangle\,\langle\hat{K}_I\hat{1}\rangle}\\
&\quad\times\frac{1}{\langle (n-1)n\rangle\,[(n-1)n]}\;
\frac{[(n-1)\hat{n}]^3}{[\hat{n}\hat{K}_I]\,[\hat{K}_I(n-1)]}\,,
\end{aligned}
\tag{11.109}
$$

where we have used our induction hypothesis for the (n − 1)-point MHV amplitude
that appears in the left sub-amplitude. The spinor manipulations that are necessary to
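simplify this expression are the same as in the case of the 5-point amplitude, and lead
to:

$$
\begin{aligned}
\langle (n-2)\hat{K}_I\rangle\,[\hat{K}_I\hat{n}] &= -\langle (n-2)(n-1)\rangle\,[(n-1)n]\,,\\
\langle\hat{1}\hat{K}_I\rangle\,[\hat{K}_I(n-1)] &= -\langle n1\rangle\,[(n-1)n]\,,
\end{aligned}
\tag{11.110}
$$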

thanks to which we obtain eq. (11.108) for n points. Up to 5 points, all amplitudes
are MHV (or anti-MHV, i.e. + + − − −). Beyond 5 points, there exist non-MHV
amplitudes, that are not given by the Parke-Taylor formula. However, multiple MHV
amplitudes can be sewn together in order to construct the non-MHV ones, with
a set of rules known as the Cachazo-Svrcek-Witten (CSW) rules, derived in the
section 11.6. Such an expansion is much more efficient than the textbook perturbation
theory, because it is in terms of on-shell gauge-invariant building blocks (the MHV
amplitudes) that already encapsulate a lot of the underlying complexity.
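As an illustration of how compact the Parke-Taylor formula is in practice, here is a minimal sketch of its kinematic factor (the coupling prefactor of eq. (11.108) is left out). It assumes that angle brackets are the antisymmetric contraction of two-component spinors, and it checks the scaling behaviour under a rescaling of a single angle spinor: a factor t² for a negative-helicity line and t⁻² for a positive-helicity line. The function names and the random test spinors are hypothetical.

```python
import random

def ang(a, b):
    """Angle bracket <ab> of two 2-component spinors."""
    return a[0] * b[1] - a[1] * b[0]

def parke_taylor_factor(lam):
    """<12>^3 / (<23><34>...<(n-1)n><n1>) for a list of n angle spinors."""
    n = len(lam)
    denom = 1
    for i in range(1, n):                      # pairs (2,3), ..., (n,1)
        denom *= ang(lam[i], lam[(i + 1) % n])
    return ang(lam[0], lam[1]) ** 3 / denom

rng = random.Random(7)
lam = [(complex(rng.uniform(-1, 1), rng.uniform(-1, 1)),
        complex(rng.uniform(-1, 1), rng.uniform(-1, 1))) for _ in range(6)]
base = parke_taylor_factor(lam)

t = 1.7
scaled_neg = list(lam)
scaled_neg[0] = (t * lam[0][0], t * lam[0][1])   # line 1 carries helicity -
scaled_pos = list(lam)
scaled_pos[2] = (t * lam[2][0], t * lam[2][1])   # line 3 carries helicity +

ratio_neg = parke_taylor_factor(scaled_neg) / base   # scales like t**2
ratio_pos = parke_taylor_factor(scaled_pos) / base   # scales like t**(-2)
print(ratio_neg, ratio_pos)
```

The cost of evaluating this expression grows linearly with n, in contrast with the rapidly growing number of Feynman diagrams.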

11.5 Tree-level gravitational amplitudes

11.5.1 Textbook approach for amplitudes with gravitons


In the previous section, we have derived the BCFW recursion formula and applied it
to the calculation of the tree-level MHV amplitudes in Yang-Mills theory. However,
the validity of this recursion is by no means limited to a gauge theory with spin-1
bosons such as gluons. It may in fact be applied to any quantum field theory provided
that:

• we have expressions for the on-shell 3-point amplitudes,

• the shifted amplitudes vanish when |z| → ∞.

In particular, it could be interesting to apply it to the calculation of scattering amplitudes
that involve gravitons12 . The Feynman rules for Einstein gravity can be obtained
from the Hilbert-Einstein action,

$$
S_{HE} \equiv \int d^4x\,\sqrt{-g}\,\Big[\frac{2}{\kappa^2}\,R - \frac{g^{\mu\nu}g^{\rho\sigma}}{4}\,F_{\mu\rho}F_{\nu\sigma} + \frac{g^{\mu\nu}}{2}\,(\partial_\mu\phi)(\partial_\nu\phi) - \frac{m^2}{2}\,\phi^2\Big]\,, \tag{11.111}
$$

where gµν is the metric tensor, R is the Ricci curvature and κ is a coupling constant
related to Newton’s constant by κ2 = 32π GN . In this action, we have also added
the minimal coupling to a gauge field and to a scalar field, in order to investigate
gravitational interactions with light and matter. The rules for the propagators and
vertices involving gravitons are obtained by expanding the metric around flat space:

gµν = ηµν + κ hµν . (11.112)

(ηµν is the flat space Minkowski metric.) Let us make a remark on dimensions:
Newton’s constant has mass dimension −2, κ has mass dimension −1, the Ricci
curvature has mass dimension 2, and hµν has mass dimension +1 (like the scalar φ
and the photon Aµ ). The expansion in powers of hµν leads to an infinite series of
terms (because the Ricci tensor contains the inverse $g^{\mu\nu}$ of the metric tensor, and also
because of the expansion of the square root $\sqrt{-g}$). Schematically, the expansion of
the Hilbert-Einstein action starts with the following terms:

$$
S_{HE} \sim \int d^4x\,\Big\{ h\,\partial^2 h + \kappa\, h^2\,\partial^2 h + \kappa^2\, h^3\,\partial^2 h + \cdots + \kappa\, h\,\phi\,\partial^2\phi + \kappa\, h\, F^2 + \cdots \Big\}\,. \tag{11.113}
$$

This sketch only indicates the number of powers of h and the number of derivatives
contained in each term, but of course the actual structure of these terms is much more
complicated. For instance, the vertex describing the coupling φφh between two
scalars and a graviton reads:

$$
\Gamma^{\mu\nu}(p_1, p_2) = -\frac{i\kappa}{2}\,\Big[ p_1^\mu p_2^\nu + p_1^\nu p_2^\mu - \eta^{\mu\nu}\,(p_1\cdot p_2 - m^2) \Big]\,, \tag{11.114}
$$

where p1,2 are the momenta carried by the two scalar lines (since the graviton has
spin 2, the graviton attached to this vertex carries two Lorentz indices). But the
γγh coupling is far more complicated, and the hhh tri-graviton vertex is even more
complex, leading to extremely cumbersome perturbative calculations if performed
within the traditional approach.

12 At tree-level, these amplitudes are completely prescribed by the equivalence principle and general
relativity, and their calculation does not require a consistent theory of gravitational quantum
fluctuations.
It turns out that tree amplitudes in Einstein gravity have a simple form in the
spinor-helicity formalism, very much like their Yang-Mills analogue. The goal of
this section is to illustrate with two examples the use of the spinor-helicity formalism,
combined with the BCFW recursion, in order to calculate some amplitudes that are
relevant in gravitational physics: (1) gravitational bending of light by a mass,
and (2) scattering of a gravitational wave by a mass. In both examples, the mass
acting as a source of gravitational field is taken to be a scalar particle. In the approach
based on conventional Feynman perturbation theory, these processes are given by the
diagrams shown in the figure 11.6. In particular, the second example (bending of a
gravitational wave by a mass) would be an extremely difficult calculation, because of
the complexity of the 3-graviton vertex.

Figure 11.6: Feynman diagrams contributing to the gravitational photon-scalar (Aφγ→φγ) and
graviton-scalar (Aφh→φh) scattering amplitudes. The wavy double lines represent gravitons.

11.5.2 Three-point amplitudes with gravitons


In order to obtain these amplitudes with the formalism previously presented in the
case of Yang-Mills theory, the first step is to obtain the 3-point amplitudes involving
scalars, photons and gravitons. External scalar particles must have helicity h = 0,
photons can have helicities h = ±1 and gravitons can have helicities h = ±2 with
polarization vectors that are “squares” of the gluon polarization vectors:

$$
\epsilon^{\mu\nu}_{2h}(p; q) = \epsilon^\mu_h(p; q)\,\epsilon^\nu_h(p; q)\,. \tag{11.115}
$$

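For 3-point amplitudes that involve only massless particles (photons and gravitons),
little group scaling is sufficient to constrain completely their form. We obtain:

$$
\begin{aligned}
A_{h\gamma\gamma}(1^{\pm 2} 2^+ 3^+) &= A_{h\gamma\gamma}(1^{\pm 2} 2^- 3^-) = 0\,,\\
A_{h\gamma\gamma}(1^{+2} 2^+ 3^-) &= -\tfrac{\kappa}{2}\,[12]^4\,[23]^{-2}\,,\\
A_{h\gamma\gamma}(1^{+2} 2^- 3^+) &= -\tfrac{\kappa}{2}\,[23]^{-2}\,[31]^4\,,\\
A_{h\gamma\gamma}(1^{-2} 2^+ 3^-) &= -\tfrac{\kappa}{2}\,\langle 23\rangle^{-2}\,\langle 31\rangle^4\,,\\
A_{h\gamma\gamma}(1^{-2} 2^- 3^+) &= -\tfrac{\kappa}{2}\,\langle 12\rangle^4\,\langle 23\rangle^{-2}\,.
\end{aligned}
\tag{11.116}
$$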

In order to obtain the zeroes of the first line, and to choose between square and angle
brackets for the non-zero results, we use the fact that the 3-point amplitude must have
mass dimension +1, with a prefactor made up only of numerical constants and one
power of κ (that has mass dimension −1). The value of the prefactor is obtained
by inspecting the term of order κ in the expansion of $\sqrt{-g}\,F^2$. For the 3-graviton
amplitudes, little group scaling leads to

$$
\begin{aligned}
A_{hhh}(1^{+2} 2^{+2} 3^{+2}) &= A_{hhh}(1^{-2} 2^{-2} 3^{-2}) = 0\,,\\
A_{hhh}(1^{-2} 2^{-2} 3^{+2}) &\propto \kappa\,\langle 12\rangle^6\,\langle 23\rangle^{-2}\,\langle 31\rangle^{-2}\,,\\
A_{hhh}(1^{+2} 2^{+2} 3^{-2}) &\propto \kappa\,[12]^6\,[23]^{-2}\,[31]^{-2}\,.
\end{aligned}
\tag{11.117}
$$

Interestingly, the kinematical part of the non-zero 3-graviton amplitudes is simply the
square13 of that of the 3-gluon amplitudes with like-sign helicities (see eqs. (11.46)
and (11.47)), despite a considerably more complicated Feynman rule for the 3-graviton
vertex. This is yet another illustration of the fact that traditional Feynman rules carry
a lot of unnecessary information that disappears in on-shell amplitudes with physical
polarizations.
For the φφh amplitude, we cannot rely on little group scaling because the scalar
field is massive. Instead, we simply contract eq. (11.114) with the polarization vector
(11.115) of the graviton, and take the external momenta on mass-shell. For a graviton
of helicity +2, we have

$$
\begin{aligned}
A_{\phi\phi h}(1^0 2^0 3^{+2}) &= -i\,\kappa\,\big(p_1\cdot\epsilon_+(p_3; q)\big)\,\big(p_2\cdot\epsilon_+(p_3; q)\big)\\
&= -\frac{i\kappa}{2}\;\frac{\langle q|P_1|p_3]\,\langle q|P_2|p_3]}{\langle q\,p_3\rangle^2}\,,
\end{aligned}
\tag{11.118}
$$

13 This property of 3-point purely gravitational amplitudes has a generalization for n-point amplitudes,

known as the Kawai-Lewellen-Tye (KLT) relations. These relations have also been interpreted as a form of
colour-kinematics duality by Bern, Carrasco and Johansson.

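where p1 + p2 + p3 = 0. With a graviton of helicity −2, we have

$$
A_{\phi\phi h}(1^0 2^0 3^{-2}) = -\frac{i\kappa}{2}\;\frac{\langle p_3|P_1|q]\,\langle p_3|P_2|q]}{[q\,p_3]^2}\,. \tag{11.119}
$$

Note that,

$$
\begin{aligned}
\langle p_3|P_2|q] &= -\langle p_3|P_1|q] - \underbrace{\langle p_3|P_3|q]}_{0} = -\langle p_3|P_1|q]\,,\\
\langle q|P_2|p_3] &= -\langle q|P_1|p_3] - \underbrace{\langle q|P_3|p_3]}_{0} = -\langle q|P_1|p_3]\,,
\end{aligned}
\tag{11.120}
$$

which allows the following simplification of the above 3-point amplitudes:

$$
A_{\phi\phi h}(1^0 2^0 3^{+2}) = \frac{i\kappa}{2}\;\frac{\langle q|P_1|p_3]^2}{\langle q\,p_3\rangle^2}\,,\qquad
A_{\phi\phi h}(1^0 2^0 3^{-2}) = \frac{i\kappa}{2}\;\frac{\langle p_3|P_1|q]^2}{[q\,p_3]^2}\,. \tag{11.121}
$$

Note that since $p_1^2 = m^2 \neq 0$, the bi-spinor P1 does not admit a factorized form, and
this cannot be simplified further.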

11.5.3 Gravitational bending of light

Shifted momenta : Consider now the amplitude Aγγφφ (1+ 2− 30 40 ), and apply
the shift to the lines 2 and 3, as illustrated in the figure14 11.7:

$$
\hat{p}_2 \equiv p_2 + z\,k\,,\qquad \hat{p}_3 \equiv p_3 - z\,k\,,\qquad k^2 = 0\,,\qquad k\cdot p_2 = k\cdot p_3 = 0\,. \tag{11.122}
$$

Since p2 is massless, the condition p2 · k = 0 can be satisfied by choosing for
instance:

$$
|k\rangle = |2\rangle\,. \tag{11.123}
$$

However, since $p_3^2 = m^2$, the bi-spinor P3 that represents the momentum p3 cannot
be factorized. Instead, we may write

$$
0 = 2\,k\cdot p_3 = -\langle k|P_3|k]\,, \tag{11.124}
$$

which can be solved by15

$$
|k] = P_3\,|2\rangle\,. \tag{11.125}
$$

14 Note that the factorization with one scalar and one photon on each side of the singular propagator is
not allowed: indeed, the intermediate propagator would need to carry a scalar, and we would have two
φφγ sub-amplitudes, that are zero per our assumption that the scalar field is not electrically charged.

Figure 11.7: BCFW shift for the calculation of the γγφφ amplitude.

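The shifted bi-spinors read

$$
\begin{aligned}
\hat{P}_2 &= |2]\langle 2| + z\,|k]\langle k| = \underbrace{\big(|2] + z\,P_3|2\rangle\big)}_{|\hat{2}]}\,\langle 2|\,,\\
\hat{P}_3 &= P_3 - z\,|k]\langle k| = P_3 - z\,P_3|2\rangle\langle 2|\,.
\end{aligned}
\tag{11.126}
$$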

Note that the second one is not factorizable, because the line 3 carries a massive
particle.

15 We use $\langle k|P_3|k] = \langle 2|P_3 P_3|2\rangle = m^2\,\langle 22\rangle = 0$. When p3 is massless, P3 factorizes as
$P_3 = |3]\langle 3|$, and this solution becomes $|k] = |3]\langle 32\rangle$. Up to a rescaling, this is the solution we have previously
used in the massless case.

Scattering amplitude : With this choice of shifts, the BCFW recursion formula
can be written as follows

$$
A_{\gamma\gamma\phi\phi}(1^+ 2^- 3^0 4^0) = \frac{i}{K_I^2}\sum_{h=\pm 2} A_{\gamma\gamma h}(1^+\,\hat{2}^-\,{-\hat{K}_I^{+h}}; z_I)\;A_{h\phi\phi}(\hat{K}_I^{-h}\,\hat{3}^0\,4^0; z_I)\,, \tag{11.127}
$$

where the shifted momenta in the 3-point amplitudes are evaluated at the zI for which
the shifted momentum $\hat{K}_I$ of the intermediate graviton is on-shell. The condition for
the intermediate momentum to be on-shell reads

$$
0 = \hat{K}_I^2 = (p_1 + \hat{p}_2)^2 = 2\,p_1\cdot\hat{p}_2 = \langle 12\rangle\,\underbrace{\big([12] + z_I\,[1|P_3|2\rangle\big)}_{[1\hat{2}]\,=\,0}\,, \tag{11.128}
$$

whose solution is $z_I = -[12]/[1|P_3|2\rangle$. Plugging in the results for the 3-point
amplitudes and summing explicitly over the two helicities of the intermediate graviton,
the γγφφ amplitude can be written as

$$
A_{\gamma\gamma\phi\phi}(1^+ 2^- 3^0 4^0) = \frac{\kappa^2}{4}\,\frac{1}{\langle 12\rangle\,[12]}\,
\left[\frac{[\hat{K}_I 1]^4\,\langle\hat{K}_I|P_4|q]^2}{[1\hat{2}]^2\,[q\hat{K}_I]^2}
+ \frac{\langle\hat{K}_I\hat{2}\rangle^4\,\langle q|P_4|\hat{K}_I]^2}{\langle 1\hat{2}\rangle^2\,\langle q\hat{K}_I\rangle^2}\right]\,. \tag{11.129}
$$

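For the first term, we may write

$$
\frac{[\hat{K}_I 1]^4}{[1\hat{2}]^2}
= \frac{[\hat{K}_I 1]^4\,\langle\hat{K}_I 1\rangle^4}{[1\hat{2}]^2\,\langle\hat{K}_I 1\rangle^4}
= \frac{(2\,p_1\cdot\hat{p}_2)^4}{[1\hat{2}]^2\,\langle\hat{K}_I 1\rangle^4}
= \frac{\langle 1\hat{2}\rangle^4\,[1\hat{2}]^2}{\langle\hat{K}_I 1\rangle^4} = 0\,. \tag{11.130}
$$

The final zero occurs when we evaluate the expression at zI , as a consequence of
eq. (11.128). Therefore, the amplitude reduces to a single term. Furthermore, we
are still free to choose the auxiliary vector q. A convenient choice turns out to be
q = p2 , which leads to:

$$
A_{\gamma\gamma\phi\phi}(1^+ 2^- 3^0 4^0) = \frac{\kappa^2}{4}\;\frac{\langle\hat{K}_I 2\rangle^2\,\langle 2|P_4|\hat{K}_I]^2}{\langle 12\rangle^3\,[12]}\,. \tag{11.131}
$$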

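Then, notice that

$$
\langle 2|P_4|\hat{K}_I]\,\langle\hat{K}_I 2\rangle = \langle 2|P_4|1]\,\langle 12\rangle\,, \tag{11.132}
$$

which gives the following extremely compact form for the amplitude:

$$
A_{\gamma\gamma\phi\phi}(1^+ 2^- 3^0 4^0) = \frac{\kappa^2}{4}\;\frac{\langle 2|P_4|1]^2}{\langle 12\rangle\,[12]}\,. \tag{11.133}
$$

Cross section and deflection angle : Using $\langle 2|P_4|1]^\dagger = \langle 1|P_4|2]$, the modulus
square of the amplitude is

$$
\big|A_{\gamma\gamma\phi\phi}(1^+ 2^- 3^0 4^0)\big|^2 = \frac{\kappa^4}{16}\;\frac{\langle 2|P_4|1]^2\,\langle 1|P_4|2]^2}{\langle 12\rangle^2\,[12]^2}\,. \tag{11.134}
$$

Note that

$$
\begin{aligned}
\langle 2|P_4|1]\,\langle 1|P_4|2]
&= \langle 2|_a\,P_4^{ab}\,|1]_b\,\langle 1|_c\,P_4^{cd}\,|2]_d
= P_4^{ab}\,|1]_b\,\langle 1|_c\,P_4^{cd}\,|2]_d\,\langle 2|_a\\
&= \mathrm{tr}\,\big(P_4 P_1 P_4 P_2\big) = p_{4\mu}\,p_{1\nu}\,p_{4\rho}\,p_{2\sigma}\;\mathrm{tr}\,\big(\sigma^\mu\sigma^\nu\sigma^\rho\sigma^\sigma\big)\\
&= 2\,p_{4\mu}\,p_{1\nu}\,p_{4\rho}\,p_{2\sigma}\,\big(\eta^{\mu\nu}\eta^{\rho\sigma} - \eta^{\mu\rho}\eta^{\nu\sigma} + \eta^{\mu\sigma}\eta^{\nu\rho} - i\,\epsilon^{\mu\nu\rho\sigma}\big)\\
&= 2\,\big(2\,(p_1\cdot p_4)(p_2\cdot p_4) - p_4^2\,(p_1\cdot p_2)\big)\\
&= s_{13}\,s_{14} - m^4\,,
\end{aligned}
\tag{11.135}
$$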

where we have introduced the Lorentz invariants $s_{ij} \equiv (p_i + p_j)^2$ and used $s_{24} = s_{13}$
and $s_{12} + s_{13} + s_{14} = 2\,m^2$ (both follow from momentum conservation). Therefore,
the squared amplitude reads

$$
\big|A_{\gamma\gamma\phi\phi}(1^+ 2^- 3^0 4^0)\big|^2 = \frac{\kappa^4\,(s_{13}\,s_{14} - m^4)^2}{16\,s_{12}^2}\,. \tag{11.136}
$$

The differential cross-section with respect to the solid angle of the outgoing photon is
given by

$$
\frac{d\sigma}{d\Omega} = \frac{1}{64\pi^2\,s_{14}}\,\big|A_{\gamma\gamma\phi\phi}(1^+ 2^- 3^0 4^0)\big|^2\,. \tag{11.137}
$$
Let us now consider the limit of long wavelength photons, namely $\omega = |\mathbf{p}_{1,2}| \ll m$.
In this limit, the Lorentz invariants that appear in the cross-section simplify into16

$$
\begin{aligned}
s_{12} &\approx 4\,\omega^2\,\sin^2\tfrac{\theta}{2}\,,\\
s_{13} &\approx m^2 - 2\,m\,\omega - 4\,\omega^2\,\sin^2\tfrac{\theta}{2}\,,\\
s_{14} &\approx m^2 + 2\,m\,\omega\,,
\end{aligned}
\tag{11.138}
$$

where ω is the photon energy and θ its deflection angle in the center of mass frame
(which is also the frame of the massive scalar particle in this limit). For large enough
impact parameters, the deflection angle is small, θ ≪ 1. Thus we obtain in this limit

$$
\frac{d\sigma}{d\Omega} \approx \frac{16\,G_N^2\,m^2}{\theta^4}\,. \tag{11.139}
$$
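The small-angle limit (11.139) can be cross-checked numerically by inserting the approximate invariants (11.138) into eqs. (11.136) and (11.137), with κ² = 32π G_N. The following is a minimal sketch in natural units; the numerical values chosen for ω, θ, m and G_N are hypothetical and only need to satisfy ω ≪ m and θ ≪ 1.

```python
import math

def cross_section(omega, theta, m, G):
    """d(sigma)/d(Omega) from eqs. (11.136)-(11.138), with kappa^2 = 32 pi G."""
    kappa2 = 32.0 * math.pi * G
    s12 = 4.0 * omega ** 2 * math.sin(theta / 2.0) ** 2
    s14 = m ** 2 + 2.0 * m * omega
    s13 = 2.0 * m ** 2 - s12 - s14            # from s12 + s13 + s14 = 2 m^2
    amp2 = kappa2 ** 2 * (s13 * s14 - m ** 4) ** 2 / (16.0 * s12 ** 2)
    return amp2 / (64.0 * math.pi ** 2 * s14)

def small_angle_limit(theta, m, G):
    """Right hand side of eq. (11.139)."""
    return 16.0 * G ** 2 * m ** 2 / theta ** 4

# Hypothetical values in natural units (omega << m, theta << 1).
omega, theta, m, G = 1e-4, 1e-2, 1.0, 1.0
ratio = cross_section(omega, theta, m, G) / small_angle_limit(theta, m, G)
print(ratio)   # close to 1, with corrections of order theta^2 and omega/m
```

The 1/θ⁴ behaviour is the same as in Rutherford scattering, as expected for a long-range 1/r interaction.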
16 If we are only interested in the limit of small energy ω and small deflection angle θ, then the somewhat
complicated calculation of the numerator done in eq. (11.135) can be avoided. Indeed, in this limit, the
massive scalar is at rest and $P_4^{ab} \approx m\,\delta_{ab}$. Moreover, the 3-momenta of the incoming and outgoing
photons are nearly parallel to the z axis. This implies that $|1\rangle_a \approx |1]_a \approx -|2\rangle_a \approx -|2]_a \approx (\sqrt{2\omega}, 0)^T$,
and $\langle 2|P_4|1] \approx \langle 1|P_4|2] \approx -2m\omega$.

In order to determine the deflection angle as a function of the impact parameter b,
consider a flux F of photons along the z direction, with the massive scalar at rest at
the origin. Out of this flux, consider specifically the incoming photons in a ring of
radius b and width db. The number of photons flowing per unit time through this
ring is

$$
2\pi\,b\,F\,db\,. \tag{11.140}
$$

All these photons are scattered in the range of polar angles [θ(b) + dθ, θ(b)] (note
that dθ is negative for db > 0, because the deflection angle decreases at larger b),
which corresponds to a solid angle:

$$
d\Omega = -2\pi\,\sin\theta(b)\,d\theta\,. \tag{11.141}
$$

By definition, the number of scattering events is the flux times the cross-section, i.e.

$$
2\pi\,b\,F\,db = F\,\frac{d\sigma}{d\Omega}\,d\Omega\,, \tag{11.142}
$$

that can be integrated for small angles into

$$
\theta(b) = \frac{4\,G_N\,m}{b}\,. \tag{11.143}
$$

(The integration constant is chosen so that the deflection vanishes when b → ∞.)
This is indeed the standard formula from general relativity, that can be derived by
considering geodesics in the Schwarzschild metric.
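As a sanity check of eq. (11.143), the sketch below verifies numerically that θ(b) = 4 G_N m / b satisfies the small-angle form of eq. (11.142), b |db/dθ| = (dσ/dΩ) θ, and then evaluates the deflection of a light ray grazing the Sun. The restoration of the factor c² and the rounded SI values of the constants are assumptions of this example, since the text works in natural units.

```python
import math

def deflection_angle(G, m, b, c=1.0):
    """theta(b) = 4 G m / (c^2 b); c = 1 corresponds to the natural units of the text."""
    return 4.0 * G * m / (c ** 2 * b)

def small_angle_cross_section(G, m, theta):
    """d(sigma)/d(Omega) = 16 G^2 m^2 / theta^4, in natural units."""
    return 16.0 * G ** 2 * m ** 2 / theta ** 4

# Check b |db/dtheta| = (dsigma/dOmega) * theta with a centered finite difference.
G, m, theta = 1.0, 1.0, 1e-3

def impact_parameter(th):
    return 4.0 * G * m / th

h = 1e-9
db_dtheta = (impact_parameter(theta + h) - impact_parameter(theta - h)) / (2.0 * h)
lhs = impact_parameter(theta) * abs(db_dtheta)
rhs = small_angle_cross_section(G, m, theta) * theta

# Light ray grazing the Sun (rounded SI values, assumed for illustration).
G_SI, M_SUN, R_SUN, C_SI = 6.674e-11, 1.989e30, 6.957e8, 2.998e8
theta_sun_arcsec = math.degrees(deflection_angle(G_SI, M_SUN, R_SUN, C_SI)) * 3600.0
print(lhs / rhs, theta_sun_arcsec)
```

With these inputs the deflection comes out at about 1.75 arcseconds, the classic general relativity value for the Sun.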

11.5.4 Scattering of gravitational waves by a mass

Let us now study the scattering amplitude between a scalar and a graviton, whose low
energy limit will provide us information about the scattering of a long wavelength
gravitational wave by a mass. A priori, each of the two gravitons may have a helicity
±2, but the cases {+2, +2} and {−2, −2} correspond to a helicity flip of the graviton,
which is suppressed at low frequency. Therefore, let us consider the amplitude
Ahhφφ (1−2 2+2 30 40 ). When writing the BCFW recursion for this amplitude, the
simplest shift is one that affects the lines 1 and 2, more specifically:

$$
\begin{aligned}
|\hat{2}] &= |2]\,, & |\hat{2}\rangle &= |2\rangle - z\,|1\rangle\,,\\
|\hat{1}] &= |1] + z\,|2]\,, & |\hat{1}\rangle &= |1\rangle\,.
\end{aligned}
\tag{11.144}
$$

Figure 11.8: BCFW shift for the calculation of the hhφφ amplitude.

Because the polarization vectors of the gravitons are squares of the spin-1 ones, this
shift can be proven to lead to a vanishing amplitude when |z| → ∞ simply by power
counting. With the shift (11.144), the intermediate propagator carries a scalar, and
therefore it has only the h = 0 helicity. The BCFW recursion formula contains two
terms,

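$$
\begin{aligned}
A_{hh\phi\phi}(1^{-2} 2^{+2} 3^0 4^0) ={}& A_{h\phi\phi}(\hat{1}^{-2}\,4^0\,\hat{K}_{23}^0)\;\frac{i}{K_{23}^2 - m^2}\;A_{h\phi\phi}(\hat{2}^{+2}\,3^0\,{-\hat{K}_{23}^0})\\
&+ A_{h\phi\phi}(\hat{1}^{-2}\,3^0\,\hat{K}_{24}^0)\;\frac{i}{K_{24}^2 - m^2}\;A_{h\phi\phi}(\hat{2}^{+2}\,4^0\,{-\hat{K}_{24}^0})\,,
\end{aligned}
\tag{11.145}
$$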

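that differ by a permutation of the external scalars. In the above equation, we have
made explicit the intermediate momentum, $\hat{K}_{23} \equiv \hat{p}_2 + p_3$ in the first term and
$\hat{K}_{24} \equiv \hat{p}_2 + p_4$ in the second one. The explicit forms of the first and second terms are

$$
\begin{aligned}
i\,\frac{A_{h\phi\phi}(\hat{1}^{-2}\,4^0\,\hat{K}_{23}^0)\;A_{h\phi\phi}(\hat{2}^{+2}\,3^0\,{-\hat{K}_{23}^0})}{K_{23}^2 - m^2}
&= \frac{-i\,\kappa^2}{4\,(K_{23}^2 - m^2)}\;\frac{\langle\hat{1}|P_4|q]^2\,\langle q'|P_3|\hat{2}]^2}{[\hat{1}q]^2\,\langle q'\hat{2}\rangle^2}\,,\\
i\,\frac{A_{h\phi\phi}(\hat{1}^{-2}\,3^0\,\hat{K}_{24}^0)\;A_{h\phi\phi}(\hat{2}^{+2}\,4^0\,{-\hat{K}_{24}^0})}{K_{24}^2 - m^2}
&= \frac{-i\,\kappa^2}{4\,(K_{24}^2 - m^2)}\;\frac{\langle\hat{1}|P_3|q]^2\,\langle q'|P_4|\hat{2}]^2}{[\hat{1}q]^2\,\langle q'\hat{2}\rangle^2}\,.
\end{aligned}
\tag{11.146}
$$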

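A convenient choice of auxiliary vectors is q = p2 and q ′ = p1 , which leads to

$$
\begin{aligned}
A_{hh\phi\phi}(1^{-2} 2^{+2} 3^0 4^0) &= -i\,\frac{\kappa^2}{4}\;\frac{\langle 1|P_3|2]^4}{\langle 12\rangle^2\,[12]^2}\,\left[\frac{1}{K_{23}^2 - m^2} + \frac{1}{K_{24}^2 - m^2}\right]\\
&= i\,\frac{\kappa^2}{16}\;\frac{\langle 1|P_3|2]^4}{\langle 12\rangle\,[12]}\;\frac{1}{(p_2\cdot p_3)\,(p_2\cdot p_4)}\,.
\end{aligned}
\tag{11.147}
$$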

The square of this amplitude can be related to that of photon-scalar gravitational
scattering by

$$
\big|A_{hh\phi\phi}(1^{-2} 2^{+2} 3^0 4^0)\big|^2 = \big|A_{\gamma\gamma\phi\phi}(1^- 2^+ 3^0 4^0)\big|^2\,\left[1 - \frac{m^2\,s_{12}}{(s_{13} - m^2)(s_{14} - m^2)}\right]^2\,. \tag{11.148}
$$

In the limit of a graviton of small energy (i.e. a gravitational wave of long wavelength)
and small deflection angle (i.e. at large impact parameter), the second factor in the
right hand side becomes equal to 1, and we have

$$
\big|A_{hh\phi\phi}(1^{-2} 2^{+2} 3^0 4^0)\big|^2 \underset{\omega\ll m,\ \theta\ll 1}{\approx} \big|A_{\gamma\gamma\phi\phi}(1^- 2^+ 3^0 4^0)\big|^2\,. \tag{11.149}
$$

This implies that in this limit the bending of a gravitational wave by a mass is the
same as the bending of a light ray (but there are some differences beyond this limit).

11.6 Cachazo-Svrcek-Witten rules

11.6.1 Off-shell continuation of MHV amplitudes

Our proof of the Parke-Taylor formula, based on BCFW recursion, is not faithful
to the actual chronology, since the formula was conjectured in 1986 and a proof
was found in 1988 using an off-shell recursion derived by Berends and Giele, well
before on-shell recursion. The Cachazo-Svrcek-Witten rules, which also predate BCFW
recursion, provide a way to construct the tree-level non-MHV amplitudes by an
expansion in which the MHV ones play the role of vertices, as in the following
diagram:

[Diagram: two MHV vertices connected by a propagator.]

where the two “vertices” are + + −− 4-point MHV amplitudes, sewn together in
order to make a contribution to a non-MHV 6-point amplitude. In this section, we
will use ideas inspired by the derivation of on-shell recursion in order to establish
these rules.
Firstly, for energy-momentum conservation to hold in the vertices of such a
diagram, the intermediate propagator linking the two vertices must generically carry
an off-shell momentum. This means that in such a construction, we need first to

generalize the MHV amplitudes to external off-shell momenta. As we shall see
shortly, the CSW rules only use as vertices the MHV amplitudes that have exactly two
negative helicities, i.e. those that are expressible in terms of angle brackets only. Thus,
we would like to have an off-shell extension of these brackets. The main issue here
is that for an off-shell momentum $p^\mu$, the 2 × 2 matrix $\mathbf{p}$ by which it is represented
in the spinor-helicity method has a non-vanishing determinant, and is therefore not
factorizable as $\mathbf{p} = |p]\langle p|$ (bold symbols denote here the 2 × 2 matrices that represent
four-vectors). Let us assume that $\eta^\mu$ is a light-like auxiliary vector,
then we may always write

$$
p^\mu \equiv P^\mu + z\,\eta^\mu \tag{11.150}
$$

with $P^2 = 0$. The on-shellness of $P^\mu$ determines uniquely the value of z,

$$
z = \frac{p^2}{2\,p\cdot\eta}\,. \tag{11.151}
$$

Since $P^\mu$ is on-shell, the corresponding matrix $\mathbf{P}$ is factorizable into the direct product
of a square and an angle spinor, $\mathbf{P} = |P]\langle P|$. Then, we may write

$$
[\eta|\,\mathbf{p} = [\eta|\,\mathbf{P} + z\,[\eta|\,\boldsymbol{\eta} = [\eta P]\,\langle P| + z\,\underbrace{[\eta\eta]}_{0}\,\langle\eta|\,, \tag{11.152}
$$

which gives

$$
\langle P| = \frac{[\eta|\,\mathbf{p}}{[\eta P]}\,. \tag{11.153}
$$

Note that this identity contains the angle spinor $\langle P|$ in the numerator and the square
spinor $|P]$ in the denominator, consistent with the fact that rescaling one with λ and
the other with λ−1 gives an equally valid representation. For this reason we may
simply ignore the denominator and define

$$
\langle P| \equiv [\eta|\,\mathbf{p}\,. \tag{11.154}
$$

In the following, we adopt this formula as the definition of the angle spinor associated
with the off-shell momentum $p^\mu$. Note that when $p^\mu$ is on-shell, then p = P, and the
angle spinor $\langle P|$ defined in eq. (11.154) is indeed proportional to the usual $\langle p|$.

11.6.2 CSW rules for next-to-MHV amplitudes


Consider now the case of amplitudes with exactly three negative helicities, carried by
the external lines i, j, k, and consider the following deformed square spinors

$$
\begin{aligned}
|\hat{\imath}] &\equiv |i] + z\,\langle jk\rangle\,|\eta]\,,\\
|\hat{\jmath}] &\equiv |j] + z\,\langle ki\rangle\,|\eta]\,,\\
|\hat{k}] &\equiv |k] + z\,\langle ij\rangle\,|\eta]\,,
\end{aligned}
\tag{11.155}
$$

while the corresponding angle spinors are left unchanged17 . The shifted external
momenta are defined as direct products of these shifted square spinors and the original
angle spinors, and are therefore still on-shell. Note also that

$$
|\hat{\imath}]\langle i| + |\hat{\jmath}]\langle j| + |\hat{k}]\langle k|
= \underbrace{|i]\langle i| + |j]\langle j| + |k]\langle k|}_{0}
+ z\,|\eta]\,\underbrace{\big(\langle jk\rangle\,\langle i| + \langle ki\rangle\,\langle j| + \langle ij\rangle\,\langle k|\big)}_{0}\,. \tag{11.156}
$$

The first zero is due to momentum conservation in the unshifted amplitude, and
the second one follows from Schouten identity. Thus, the above shift preserves
momentum conservation.

Figure 11.9: Propagators affected by the shift of eq. (11.155) in a generic amplitude.
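The Schouten identity invoked in eq. (11.156) holds for any three two-component spinors, which is easy to confirm numerically. The snippet below is a minimal sketch with random complex spinors, assuming that the angle bracket is the antisymmetric contraction of its two arguments; the variable names are hypothetical.

```python
import random

def ang(a, b):
    """Angle bracket <ab> of two 2-component spinors."""
    return a[0] * b[1] - a[1] * b[0]

rng = random.Random(3)

def random_spinor():
    return (complex(rng.uniform(-1, 1), rng.uniform(-1, 1)),
            complex(rng.uniform(-1, 1), rng.uniform(-1, 1)))

li, lj, lk = random_spinor(), random_spinor(), random_spinor()

# Schouten identity: <jk> <i| + <ki> <j| + <ij> <k| = 0, component by component.
combo = tuple(ang(lj, lk) * li[a] + ang(lk, li) * lj[a] + ang(li, lj) * lk[a]
              for a in range(2))
print(combo)   # both components vanish up to rounding errors
```

This is simply the statement that three vectors in a two-dimensional space are linearly dependent, with coefficients given by the brackets.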

17 Recall that the usual BCFW shift acts only on a pair i, j of external lines, by shifting the square spinor
of one and the angle spinor of the other.

The propagators affected by the shift of eq. (11.155) form three lines starting at the
three external points of negative helicity, that meet somewhere inside the graph. Since
the shift modifies only the square spinors, the polarization vectors of negative helicity
scale as z−1 . Moreover, from the figure 11.9, we see that there are p + 1 vertices and
p propagators along the affected lines. Even in the worst case where all these vertices
are 3-gluon vertices that scale like z, the overall scaling of the graph is bounded by
z−3+(p+1)−p ∼ z−2 , and therefore it goes to zero as |z| → ∞. Then, we proceed
like in the derivation of BCFW’s recursion formula, by integrating An (· · · ; z)/z over
z on a circle of infinite radius. The behaviour of the deformed amplitude at large z
ensures that there is no boundary term, and we obtain:

$$
\begin{aligned}
A_n(\cdots) &= -\sum_{z_* \in \{\text{poles of } A_n\}} \mathrm{Res}_{z_*}\,\frac{A_n(\cdots; z)}{z}\\
&= \sum_{\substack{I,\,h=\pm\\ i\in A_L}} A_L(\cdots\,{-\hat{K}_I^{-h}}; z_I)\;\frac{i}{K_I^2}\;A_R(\cdots\,\hat{K}_I^{h}; z_I)\,,
\end{aligned}
\tag{11.157}
$$
L

where KI is the momentum of the intermediate propagator producing the pole. By
construction, the singular propagator separating the left and right sub-amplitudes must
be one of the dark propagators in the figure 11.9. Thus, when writing this factorized
formula, we may decide that the external line i belongs to the left sub-amplitude, and
at least one of j, k belongs to the right sub-amplitude. The propagator carrying the
momentum KI can be on any of the three branches highlighted in the figure 11.9. In
all three cases, there is only one choice of the intermediate helicity h that gives a
non-zero result, and this h is such that both AL and AR are MHV amplitudes with
two negative helicities. The next-to-MHV amplitude under consideration takes the
following diagrammatic form:

An = [sum of three MHV diagrams, in which the intermediate propagator lies on the branch attached to i, j or k, respectively] (11.158)

where we have only indicated the relevant helicity assignments (all the thin lines carry
positive helicities). Thus, as far as the helicities are concerned, the vertices in these
graphs are MHV amplitudes with exactly two negative helicities.
Note that in eq. (11.158) the shift has no influence on the external lines because the
MHV amplitudes with two negative helicities depend only on the angle spinors. The
MHV vertices in this equation also depend on the angle spinor $\langle\hat{K}_I|$ that corresponds
to the on-shell (because we evaluate the amplitude at the value of z for which the
intermediate propagator is singular) shifted momentum. Consider for instance the
first diagram, obtained when the singular propagator is on the line that stems from the
external line i. In this case, we have

$$
|\hat{K}_I]\langle\hat{K}_I| = \hat{\mathbf{K}}_I = \mathbf{K}_I + z_*\,\langle jk\rangle\,|\eta]\langle i|\,, \tag{11.159}
$$

where z∗ is the value of z at the pole and $\mathbf{K}_I$ is the 2 × 2 matrix representing KI. By contracting this relation with $[\eta|$, we obtain

$$
[\eta\hat{K}_I]\,\langle\hat{K}_I| = [\eta|\,\mathbf{K}_I\,. \tag{11.160}
$$

Thus, the angle spinor $\langle\hat{K}_I|$ in the MHV vertices resulting from the theorem of
residues is proportional to the off-shell extension $[\eta|\,\mathbf{K}_I$ that we have proposed in the
previous subsection. Finally, note that the factors $[\eta\hat{K}_I]$ all cancel, because the line
of momentum KI has opposite helicities on either side of the propagator that links the
two MHV amplitudes18 . Therefore, in the MHV diagrams of eq. (11.158), we do not
need to find the poles z∗ and we may directly evaluate the MHV amplitudes that play
the role of vertices with the off-shell angle spinor $[\eta|\,\mathbf{K}_I$.
Let us summarize here the CSW rules for calculating amplitudes with exactly
three negative helicities:

• Start from the three skeleton diagrams of eq. (11.158), and interpret the vertices
as MHV amplitudes with one off-shell external leg. Note that the actual number
of MHV graphs depends on the number of positive helicity external lines, since
they may be attached to either of the two MHV vertices (provided we do not
change the cyclic ordering of the external lines, and that all the vertices obtained
in this way have at least three lines).

• The intermediate propagator is simply a scalar propagator $i/K_I^2$, with the value of $K_I$ determined by momentum conservation at the vertices.

• Replace the MHV vertices by their Parke-Taylor expression, with the angle spinor of the intermediate off-shell line given by $\langle K_I| \equiv [\eta|K_I$ (from now on, we omit the hat on $K_I$, since shifting the momenta was just an intermediate device for establishing the CSW rules).

Note that in this derivation of the CSW rules, we are guaranteed that the final result does not depend on the choice of the spinor $|\eta]$ introduced in the shift. Indeed, the residue of the pole at $z = 0$ of $A_n(\cdots; z)/z$ is by construction the amplitude we are looking for. The main advantage of the CSW rules is that they express amplitudes in terms of high-level building blocks that already contain a large number of colour ordered graphs, as illustrated in the following example of a two-vertex MHV graph contributing to the $1^-2^-3^-4^+5^+6^+$ six-point amplitude:

[Figure: example of a two-vertex MHV graph with external lines 1, ..., 6 and intermediate line I.]
18 Recall that under a rescaling by a factor $\lambda$ of an angle spinor, the MHV amplitudes with exactly two negative helicities scale by $\lambda^2$ if the scaling affects an external line of negative helicity, and by $\lambda^{-2}$ if it affects an external line of positive helicity.

11.6.3 Examples

Four-point − − −+ amplitude : let us first use the CSW rules to evaluate the
1− 2− 3− 4+ amplitude. Of course, we know beforehand that the result should be zero,
so this is no more than a trivial illustration of the rules. In this case, the CSW rules
give the following graphs:

$A_4(1^-2^-3^-4^+)$ = [diagrammatic equation: sum of three two-vertex MHV graphs] , (11.161)

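but the last one is trivially zero because one of the vertices ($--$) does not exist. In the first graph, the intermediate momentum (oriented from left to right) is $K_I = p_1 + p_4 = -(p_2 + p_3)$, and its angle spinor $\langle K_I| \equiv [\eta|K_I$ obeys
$$\begin{aligned}
\langle K_I 1\rangle &= -[\eta 4]\,\langle 14\rangle , & \langle K_I 2\rangle &= [\eta 3]\,\langle 23\rangle ,\\
\langle K_I 3\rangle &= -[\eta 2]\,\langle 23\rangle , & \langle K_I 4\rangle &= [\eta 1]\,\langle 14\rangle ,
\end{aligned} \tag{11.162}$$
while the CSW rules give
$$A_{4,1}(1^-2^-3^-4^+) = 2ig^2\,\frac{\langle 1K_I\rangle^3}{\langle K_I 4\rangle\langle 41\rangle}\,\frac{1}{K_I^2}\,\frac{\langle 23\rangle^3}{\langle 3K_I\rangle\langle K_I 2\rangle}
= -2ig^2\,\frac{[\eta 4]^3}{[\eta 1][\eta 2][\eta 3]}\,\frac{\langle 14\rangle}{[23]} . \tag{11.163}$$

In the second graph, the intermediate momentum is $L_I = p_1 + p_2 = -(p_3 + p_4)$, and the corresponding angle spinor satisfies
$$\begin{aligned}
\langle L_I 1\rangle &= -[\eta 2]\,\langle 12\rangle , & \langle L_I 2\rangle &= [\eta 1]\,\langle 12\rangle ,\\
\langle L_I 3\rangle &= [\eta 4]\,\langle 34\rangle , & \langle L_I 4\rangle &= -[\eta 3]\,\langle 34\rangle .
\end{aligned} \tag{11.164}$$

This time, the CSW rules give
$$A_{4,2}(1^-2^-3^-4^+) = 2ig^2\,\frac{\langle 12\rangle^3}{\langle 2L_I\rangle\langle L_I 1\rangle}\,\frac{1}{L_I^2}\,\frac{\langle L_I 3\rangle^3}{\langle 34\rangle\langle 4L_I\rangle}
= 2ig^2\,\frac{[\eta 4]^3}{[\eta 1][\eta 2][\eta 3]}\,\frac{\langle 34\rangle}{[12]} . \tag{11.165}$$

Therefore, the sum of these two contributions is
$$A_4(1^-2^-3^-4^+) = 2ig^2\,\frac{[\eta 4]^3}{[\eta 1][\eta 2][\eta 3]}\,\underbrace{\left(\frac{\langle 34\rangle}{[12]} - \frac{\langle 14\rangle}{[23]}\right)}_{0} = 0 , \tag{11.166}$$
where the final cancellation is due to momentum conservation.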
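The cancellation in eq. (11.166) can be checked numerically. The following is a minimal Python sketch, not part of the original derivation: it builds random complex four-point kinematics obeying momentum conservation and compares the two terms of the parenthesis. The spinor parametrization, the helper names and the sign convention chosen for the brackets are assumptions made for this illustration; the identity being tested does not depend on that convention.

```python
import random

def bracket(a, b):
    """Antisymmetric product a[0]*b[1] - a[1]*b[0] of two 2-component spinors."""
    return a[0] * b[1] - a[1] * b[0]

def random_spinor(rng):
    return (complex(rng.uniform(-1, 1), rng.uniform(-1, 1)),
            complex(rng.uniform(-1, 1), rng.uniform(-1, 1)))

def four_point_kinematics(seed=1):
    """Angle spinors lam[k] and square spinors lamt[k] (k = 0..3, i.e. lines 1..4)
    such that sum_k lam[k][a] * lamt[k][c] = 0 (complex momentum conservation)."""
    rng = random.Random(seed)
    lam = [random_spinor(rng) for _ in range(4)]
    lamt = [None, None, random_spinor(rng), random_spinor(rng)]
    det = bracket(lam[0], lam[1])
    lamt0, lamt1 = [], []
    for c in range(2):
        # Solve lam[0]*x + lam[1]*y = rhs by Cramer's rule, component c of lamt.
        rhs = tuple(-(lam[2][a] * lamt[2][c] + lam[3][a] * lamt[3][c])
                    for a in range(2))
        lamt0.append(bracket(rhs, lam[1]) / det)
        lamt1.append(bracket(lam[0], rhs) / det)
    lamt[0], lamt[1] = tuple(lamt0), tuple(lamt1)
    return lam, lamt

def csw_four_point_terms(seed=1):
    """Return (<34>/[12], <14>/[23]), the two terms compared in eq. (11.166)."""
    lam, lamt = four_point_kinematics(seed)
    ang = lambda i, j: bracket(lam[i - 1], lam[j - 1])
    sq = lambda i, j: bracket(lamt[i - 1], lamt[j - 1])
    return ang(3, 4) / sq(1, 2), ang(1, 4) / sq(2, 3)

first, second = csw_four_point_terms()
print("difference of the two terms:", abs(first - second))
```

Using generic complex spinors avoids having to continue the spinors of negative-energy momenta, at the price of working with complex momenta.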

Six-point − − − + ++ amplitude : Let us consider now a genuine example of next-to-MHV amplitude, the six-point $[123]^-[456]^+$ amplitude. The MHV graphs that contribute to this function are:

$A_6([123]^-[456]^+)$ = [diagrammatic equation: sum of two classes of two-vertex MHV graphs, with shaded ellipses on the lines (4, 5) in the left graph and (5, 6) in the right graph] , (11.167)

where the shaded ellipses indicate that the lines (4, 5) (in the left graph) or (5, 6) (in the right graph) can either be attached to the left or right MHV vertex, provided the cyclic order is not modified. Each term in eq. (11.167) therefore corresponds to three MHV graphs, i.e. a total of six$^{19}$. For this amplitude, the CSW rules give:

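$$\begin{aligned}
A_6([123]^-[456]^+) = -4ig^4\Bigg\{
&\sum_{i=3}^{5}\frac{\langle 1K_i\rangle^3}{\langle K_i\,i{+}1\rangle\langle i{+}1\,i{+}2\rangle\cdots\langle 61\rangle}\,\frac{1}{K_i^2}\,\frac{\langle 23\rangle^3}{\langle K_i 2\rangle\langle 34\rangle\cdots\langle iK_i\rangle}\\
+&\sum_{j=4}^{6}\frac{\langle 12\rangle^3}{\langle 2L_j\rangle\langle L_j\,j{+}1\rangle\cdots\langle 61\rangle}\,\frac{1}{L_j^2}\,\frac{\langle L_j 3\rangle^3}{\langle 34\rangle\cdots\langle j{-}1\,j\rangle\langle jL_j\rangle}\Bigg\} ,
\end{aligned} \tag{11.168}$$
where the intermediate momenta are respectively $K_i \equiv -(p_2 + p_3 + \cdots + p_i)$ and $L_j \equiv -(p_3 + p_4 + \cdots + p_j)$, and the corresponding angle spinors are $\langle K_i| \equiv [\eta|K_i$ and $\langle L_j| \equiv [\eta|L_j$. For a numerical evaluation of this amplitude, one may take any auxiliary spinor $|\eta]$, since the amplitude does not depend on this choice. A somewhat simpler analytic expression may also be obtained by choosing $|\eta] \equiv |2]$. Indeed, since $K_i = L_i - p_2$, this choice leads to the following simplification,
$$\langle K_i\,j\rangle = \langle L_i\,j\rangle , \quad \text{for all } i, j , \tag{11.169}$$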


19 This number should be contrasted with the 38 six-point colour ordered graphs (see the figure 11.1).

Moreover, each of these colour-ordered graphs is considerably more complicated than the MHV diagrams.

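thanks to which many factors become identical in the two terms of eq. (11.168). With this choice, the terms $i = 4, 5$ in the first line and $j = 4, 5$ in the second line combine in the following compact expression:
$$A_{6,1} = -\frac{4ig^4}{\langle 34\rangle\langle 45\rangle\langle 56\rangle\langle 61\rangle}\sum_{i=4,5}\frac{\langle i\,i{+}1\rangle}{\langle K_i\,i\rangle\langle K_i\,i{+}1\rangle\langle K_i 2\rangle}
\left[\frac{\langle 23\rangle^3\langle K_i 1\rangle^3}{K_i^2} + \frac{\langle 12\rangle^3\langle K_i 3\rangle^3}{(K_i+p_2)^2}\right] . \tag{11.170}$$

The terms $i = 3$ in the first line and $j = 6$ in the second line of eq. (11.168) must be handled more carefully. Indeed, they both contain a denominator that vanishes when $|\eta] \equiv |2]$, due to a factor $[\eta 2]$. In order to calculate these terms, we must leave $|\eta]$ unspecified, but such that
$$[\eta 2] \ll [\eta j] \quad \text{for } j \neq 2 , \tag{11.171}$$
and expand in powers of $[\eta 2]$. After simplifications involving the Schouten identity, the sum of these two terms is found to be finite when $|\eta] \to |2]$ and equal to
$$A_{6,2} = \frac{4ig^4\,\langle 13\rangle^2}{\langle 34\rangle\langle 45\rangle\langle 56\rangle\langle 61\rangle}
\left[\frac{s_{13} + 2(s_{12} + s_{32})}{[12][32]} + \frac{\langle 12\rangle\langle 36\rangle}{[12]\langle 16\rangle} + \frac{\langle 23\rangle\langle 14\rangle}{[23]\langle 34\rangle}\right] , \tag{11.172}$$
where we denote $s_{ij} \equiv \langle ij\rangle[ij] = (p_i + p_j)^2$. Combining all the contributions, the 6-point amplitude reads
$$\begin{aligned}
A_6([123]^-[456]^+) = \frac{4ig^4}{\langle 34\rangle\langle 45\rangle\langle 56\rangle\langle 61\rangle}
\Bigg\{&-\sum_{i=4,5}\frac{\langle i\,i{+}1\rangle}{\langle K_i\,i\rangle\langle K_i\,i{+}1\rangle\langle K_i 2\rangle}
\left[\frac{\langle 23\rangle^3\langle K_i 1\rangle^3}{K_i^2} + \frac{\langle 12\rangle^3\langle K_i 3\rangle^3}{(K_i+p_2)^2}\right]\\
&+\langle 13\rangle^2\left[\frac{s_{13} + 2(s_{12} + s_{32})}{[12][32]} + \frac{\langle 12\rangle\langle 36\rangle}{[12]\langle 16\rangle} + \frac{\langle 23\rangle\langle 14\rangle}{[23]\langle 34\rangle}\right]\Bigg\} .
\end{aligned} \tag{11.173}$$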
This is the simplest of the 6-point amplitudes with three negative helicities, because
the legs with negative helicities are adjacent. The non-adjacent cases lead to more
complex expressions, but nevertheless considerably simpler than what one would get
from traditional perturbation theory.

11.6.4 General CSW rules


In order to prove the CSW rules in general, we now proceed by induction, i.e. we assume that the CSW rules are applicable to all on-shell amplitudes with up to $N - 1$ negative helicities and we consider an amplitude with $N$ negative helicities. Let us denote $\mathcal{N}$ the set of external lines of negative helicity. We then introduce an auxiliary square spinor $|\eta]$, and we apply the following complex shift to these lines
$$|\hat\imath] \equiv |i] + r_i\, z\, |\eta] , \quad |\hat\imath\rangle \equiv |i\rangle \quad \text{for } i \in \mathcal{N} , \tag{11.174}$$
where the $r_i$ are coefficients chosen in such a way that momentum conservation is preserved at any $z$. Namely, they must satisfy
$$\sum_{i\in\mathcal{N}} r_i\, |i\rangle = 0 . \tag{11.175}$$

Figure 11.10: Propagators affected by the shift of eq. (11.174) in a generic amplitude. They correspond to the subgraph obtained by following the momentum that flows from the external lines of negative helicity, assuming that the momenta of the positive helicity ones are held fixed.


(We further assume that partial sums are not zero, so that the internal propagators
connected to the external negative helicities all carry a z-dependent momentum.)
When $|z| \to \infty$, the shifted amplitude goes to zero like $|z|^{1-N}$ (graph by graph),
which ensures that there is no boundary term when we integrate over z on the circle
at infinity. The propagators that contribute poles to this integral are among those
represented in dark in the figure 11.10 (we may assume that they become singular
at distinct values of z, so that all the poles are simple). When we assign helicities to
these singular propagators, two cases may arise:

• The singular propagator is directly connected to one of the external lines of


negative helicity (e.g. the propagator labeled “a” in the figure). In this case,
only one choice of helicity is allowed, and the amplitude factorizes into an
amplitude with 2 negative helicities and one with N − 1 negative helicities.
• The singular propagator is an inner propagator, such as the one labeled “b” in
the figure. It means that both the left and the right sub-amplitudes contain at
least two of the N external lines of negative helicity. In this case, both choices
of helicity are valid for the singular propagator, and in both cases the left and
right sub-amplitudes have at most N − 1 negative helicities.

In these two cases, the theorem of residues divides the amplitude into a left and right
on-shell sub-amplitudes that have at most N − 1 negative helicities, for which we
may use the CSW rules assumed to be valid by the induction hypothesis.
After the left and right sub-amplitudes have been replaced by using the CSW
rules proven for < N negative helicities, all the poles produce terms that correspond
to the same topology of MHV graph, but whose expressions differ because the value
of z∗ is different for each pole. How this sum of contributions produces the product
of denominators one would obtain by applying directly the CSW rules for amplitudes
with $N$ negative helicities requires some clarification. Let us consider a graph with $n_I$ internal shifted propagators. The application of the theorem of residues to the shifted amplitude divided by $z$ produces $n_I$ terms, whose sum corresponds to the following combination of denominators:
$$D_{\rm poles} = \sum_{I=1}^{n_I}\frac{1}{K_I^2}\prod_{J\neq I}\frac{1}{\hat K_{J,I}^2} . \tag{11.176}$$
Each term in this sum corresponds to the vanishing of one of the shifted internal propagators. The factor $1/K_I^2$ comes from the residue of the pole $z_I$ for which $\hat K_I^2 = 0$, and $\hat K_{J,I}^2$ denotes the value taken by $\hat K_J^2$ at this $z_I$.


From eq. (11.174), we can write the SL(2, ℂ) matrices that represent the shifted
internal momenta as follows,

Kb =K +z η I , (11.177)
I I

where I is a linear combination of the ri i that depends on the precise topology of


the MHV graph and of the internal propagator under consideration. This implies that

b 2 = K2 − z η K I ,
K (11.178)
I I I

and the corresponding denominator vanishes at zI = K2I / η KI I . Therefore, we
have
nI nI
X 1 Y 1 Y 1
Dpoles =  = . (11.179)
K2I η KJ J K2I
I=1 J6=I K2 − K2 
J I
I=1
η KI I

The last equality20 shows that the nI terms produced by the theorem of residues
combine into an expression which is nothing but the product of unshifted denominators
that would appear in the CSW rules. This completes the proof of the general CSW
diagrammatic rules:
20 This may be proven by integrating over a circle at infinity the following function
n
I
Y 1
f(z) ≡ z−1  .
I=1
K2I − z η KI I

• Draw all the diagrams with the required assignment of external helicities, and
such that all vertices have exactly two negative helicities. With N external
negative helicities, these graphs all have N − 1 vertices. For instance, the
[1234]− [5 · · · n]+ amplitude receives contributions from the following three
classes of MHV diagrams:

[Figure: the three classes of MHV diagrams contributing to the $[1234]^-[5\cdots n]^+$ amplitude.]

(Only the negative helicities are indicated explicitly.)

• The intermediate propagators are scalar propagators $i/K_I^2$, with the value of $K_I$ determined by momentum conservation at the vertices.
• Replace the vertices by MHV amplitudes, with some off-shell legs for which the angle spinor is defined as $\langle K_I| \equiv [\eta|K_I$, and use the Parke-Taylor formula to obtain their expression.
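The combination of denominators in eq. (11.179), on which the inductive proof relies, is a partial-fraction identity that is easy to test numerically. The sketch below is only an illustration: the lists `k` and `a` are hypothetical numbers standing for the unshifted denominators $K_I^2$ and for the coefficients of $z$ in the shifted denominators, and the function names are not from the text.

```python
def sum_over_poles(k, a):
    """Left-hand side of eq. (11.179): one term per pole z_I = k[I] / a[I] of the
    shifted propagators k[J] - z * a[J]."""
    total = 0.0
    for i in range(len(k)):
        term = 1.0 / k[i]
        for j in range(len(k)):
            if j != i:
                # value of the shifted denominator J at the pole of propagator I
                term /= k[j] - k[i] * a[j] / a[i]
        total += term
    return total

def product_of_unshifted(k):
    """Right-hand side of eq. (11.179): product of the unshifted propagators."""
    prod = 1.0
    for ki in k:
        prod /= ki
    return prod

# Hypothetical sample values with distinct poles k[I] / a[I].
k = [1.3, -0.7, 2.1, 0.9]
a = [0.5, 1.1, -0.8, 1.7]
print(sum_over_poles(k, a), product_of_unshifted(k))
```

The two printed numbers should agree up to rounding errors, which is the statement that the function $f(z)$ of footnote 20 has no residue at infinity.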
Chapter 12

Worldline formalism

In the previous chapter, we have exposed the spinor-helicity language in which the
building blocks of scattering amplitudes are expressed in terms of 2-component
spinors. As we have seen, when combined with techniques such as on-shell recursion,
this leads to great simplifications in the evaluation of on-shell tree amplitudes with
physical polarizations. To a large extent, this simplification stems from the fact that the
calculation of amplitudes based on these methods bypasses the usual representation
in terms of Feynman diagrams.
In fact, the spinor-helicity method is not the only one that relegates Feynman
diagrams to a minor secondary role. Another approach, that we shall discuss in
this chapter, is the worldline formalism. The name comes from the fact that in this
approach, Feynman graphs are replaced by a representation in terms of a path integral
over a function zµ (τ) (plus additional auxiliary variables in the case of fields with
internal degrees of freedom, such as spin or colour), that defines a line embedded in
spacetime. This function can be viewed as a parameterization of the whole history
of a point-like particle. Historically, this method was first derived by starting from a
string theory and by taking the limit of infinite string tension. Subsequently, it was
rederived in a more mundane manner, in a first quantized framework. This is the point
of view that we shall adopt in this chapter.

12.1 Worldline representation


12.1.1 Heat kernel
In order to illustrate the principles of the worldline representation, consider a scalar field theory of Lagrangian
$$\mathcal{L} \equiv \frac{1}{2}(\partial_\mu\phi)(\partial^\mu\phi) - \frac{1}{2}m^2\phi^2 - V(\phi) . \tag{12.1}$$


Let us assume that we wish to obtain the tree-level propagator $G(x, y)$ in a background field $\varphi$. Up to an irrelevant factor $i$, this propagator is the inverse of the operator $\square + m^2 + V''(\varphi)$,
$$\big(\square_x + m^2 + V''(\varphi(x))\big)\,G(x, y) = -i\,\delta(x - y) . \tag{12.2}$$
This equation must be supplemented by boundary conditions that depend on the type of propagator one wishes to obtain (time-ordered, retarded, etc.). Formally, we may write
$$\big(\square + m^2 + V''(\varphi)\big)^{-1} = \int_0^\infty dT\; e^{-T(\square + m^2 + V''(\varphi))} . \tag{12.3}$$
(The integrand in this formula is sometimes called a heat kernel, by analogy with the propagator of a heat equation.) However, for this integral to make sense in the limit $T \to \infty$, it is necessary that the eigenvalues of the operator $\square + m^2 + V''(\varphi)$ be positive. The high lying eigenvalues of this operator do not depend on the background field $\varphi(x)$ (assuming that it is smooth enough), and are of the form
$$-g^{\mu\nu}k_\mu k_\nu + m^2 . \tag{12.4}$$
In order to be positive for any momentum $k_\mu$, it is therefore necessary that the metric be Euclidean, with only minus signs. For this reason, we restrict our discussion to a Euclidean field theory from now on, so that we may write $-g^{\mu\nu}k_\mu k_\nu = k_i k_i$.

12.1.2 Propagator in a background field

The propagator $G(x, y)$ is obtained by evaluating eq. (12.3) between states of definite position:
$$G(x, y) = -i\int_0^\infty dT\; \big\langle y\big| e^{-T(\square + m^2 + V''(\varphi))}\big| x\big\rangle . \tag{12.5}$$
Such a matrix element is quite common in ordinary quantum mechanics, and its representation as a path integral is well known. For a non-relativistic Hamiltonian of the form
$$H \equiv \frac{P^2}{2M} + V(Q) , \tag{12.6}$$
we have
$$\big\langle y\big| e^{-i(t_1 - t_0)H}\big| x\big\rangle = \int_{\substack{q(t_0)=x\\ q(t_1)=y}} \big[Dq(t)\big]\; e^{\,i\int_{t_0}^{t_1} dt\,\left(\frac{M}{2}\dot q^2(t) - V(q(t))\right)} . \tag{12.7}$$

Eq. (12.5) can be similarly expressed as a path integral, if we use the following dictionary
$$\begin{aligned}
i(t_1 - t_0) &\to T\\
it &\to \tau\\
M &\to \tfrac{1}{2}\\
V(Q) &\to m^2 + V''(\varphi(x)) .
\end{aligned} \tag{12.8}$$
This leads to
$$G(x, y) = -i\int_0^\infty dT \int_{\substack{z(0)=x\\ z(T)=y}} \big[Dz(\tau)\big]\; e^{-\int_0^T d\tau\,\left(\frac{1}{4}\dot z^2(\tau) + m^2 + V''(\varphi(z(\tau)))\right)} , \tag{12.9}$$
where the dot denotes a derivative with respect to $\tau$. For simplicity, we denote the integration variable $z(\tau)$ instead of $z^\mu(\tau)$, although it takes values in a $d$-dimensional spacetime. This expression is known as the worldline representation of the tree propagator in a background field. Very much like in ordinary quantum mechanics, the function $z^\mu(\tau)$ explores all the paths that start at $x$ and end at $y$. Note also that the formula contains an integral over the “duration” (we use quotes here because $\tau$ is not a physical time) of this evolution.

12.1.3 Alternate derivation


Starting from eq. (12.3), it is possible to follow a slightly different route (that one may view as another derivation of the path integral formulation of quantum mechanics), that has the virtue of providing more control on all the prefactors. Consider first the case of the theory in the vacuum, i.e. with no background field. An important result is
$$e^{T\partial^2} f(x) = \int_{-\infty}^{+\infty}\frac{dz}{\sqrt{4\pi T}}\; e^{-\frac{z^2}{4T}}\, f(x + z) , \tag{12.10}$$
which may be proven by Fourier transform. In words, this formula means that an operator Gaussian in a derivative is equivalent to a Gaussian smearing. Note here that it is crucial that the squared derivative has a positive prefactor inside the exponential, otherwise in the right hand side we would have a Gaussian with a wrong sign and the result would be ill-defined. From this formula, we get
$$\big\langle y\big| e^{-T(\square + m^2)}\big| x\big\rangle = e^{-Tm^2}\int_{-\infty}^{+\infty}\frac{d^dz}{(4\pi T)^{d/2}}\; e^{-\frac{z^2}{4T}}\,\underbrace{\big\langle y\big| x + z\big\rangle}_{\delta(x+z-y)} = \frac{e^{-Tm^2}}{(4\pi T)^{d/2}}\; e^{-\frac{(y-x)^2}{4T}} , \tag{12.11}$$

which, apart from the prefactor $\exp(-Tm^2)$, is a Gaussian probability distribution normalized to unity. This Gaussian distribution may be viewed as a Green's function for the diffusion equation
$$\partial_T f(T, x) = -\square_x f(T, x) , \tag{12.12}$$
which highlights the connection that exists between the propagator associated to an elliptic differential operator and diffusion (i.e. Brownian motion). By comparing eqs. (12.9) (without external field) and (12.11), one can obtain the following formula for the absolute normalization of the integral over closed loops in $d$ dimensions
$$\int_{z(0)=z(T)} \big[Dz(\tau)\big]\; \exp\left(-\int_0^T d\tau\,\frac{\dot z^2}{4}\right) = \frac{1}{(4\pi T)^{d/2}} \underset{d=4}{=} \frac{1}{(4\pi T)^2} . \tag{12.13}$$
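The Gaussian smearing formula (12.10) can be tested on functions for which the left-hand side is known in closed form. The following Python sketch is an illustration under stated assumptions: it works in one dimension, uses a plain trapezoidal rule, and the function name and test functions are chosen here for convenience. For $f(x) = \cos(kx)$ one expects $e^{T\partial^2} f = e^{-Tk^2}\cos(kx)$, and for $f(x) = x^2$ one expects $x^2 + 2T$.

```python
import math

def gaussian_smearing(f, x, T, n=4000, width=12.0):
    """Right-hand side of eq. (12.10), by the trapezoidal rule on
    |z| <= width * sqrt(2 T) (sqrt(2 T) is the standard deviation of the kernel)."""
    zmax = width * math.sqrt(2.0 * T)
    h = 2.0 * zmax / n
    total = 0.0
    for i in range(n + 1):
        z = -zmax + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * math.exp(-z * z / (4.0 * T)) * f(x + z)
    return total * h / math.sqrt(4.0 * math.pi * T)

T, k, x = 0.3, 1.7, 0.4
smeared_cos = gaussian_smearing(lambda u: math.cos(k * u), x, T)
print(smeared_cos, math.exp(-T * k * k) * math.cos(k * x))
smeared_square = gaussian_smearing(lambda u: u * u, x, T)
print(smeared_square, x * x + 2.0 * T)
```

Flipping the sign of $T$ in this sketch makes the Gaussian weight grow with $|z|$, which is the ill-defined case mentioned in the text.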

The next step is to note that such a Gaussian distribution may be written as the convolution of two similar distributions defined on half the interval:
$$\frac{e^{-\frac{(y-x)^2}{4T}}}{(4\pi T)^{d/2}} = \int d^dz\; \frac{e^{-\frac{(y-z)^2}{2T}}}{(2\pi T)^{d/2}}\; \frac{e^{-\frac{(z-x)^2}{2T}}}{(2\pi T)^{d/2}} . \tag{12.14}$$
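The convolution property (12.14) is what allows the interval $[0, T]$ to be sliced. A minimal numerical check in $d = 1$ could look as follows; the function names, the integration window and the sample values are assumptions of this sketch.

```python
import math

def heat_kernel(u, T):
    """Free kernel exp(-u^2 / 4T) / sqrt(4 pi T) of eq. (12.11), without the mass factor."""
    return math.exp(-u * u / (4.0 * T)) / math.sqrt(4.0 * math.pi * T)

def convolved_kernel(x, y, T, n=4000, width=12.0):
    """Right-hand side of eq. (12.14) in d = 1: two kernels of duration T/2
    integrated over the intermediate point z (trapezoidal rule)."""
    center = 0.5 * (x + y)
    half_range = width * math.sqrt(T)  # the product of kernels is peaked at the midpoint
    h = 2.0 * half_range / n
    total = 0.0
    for i in range(n + 1):
        z = center - half_range + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * heat_kernel(y - z, T / 2.0) * heat_kernel(z - x, T / 2.0)
    return total * h

x, y, T = -0.3, 0.9, 0.7
print(convolved_kernel(x, y, T), heat_kernel(y - x, T))
```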

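By taking $n - 1$ of these intermediate points, we arrive at
$$\big\langle y\big| e^{-T(\square + m^2)}\big| x\big\rangle = e^{-Tm^2}\int\frac{d^dz_1\, d^dz_2\cdots d^dz_{n-1}}{(4\pi\epsilon)^{nd/2}}\; e^{-\frac{\epsilon}{4}\sum_{i=1}^{n}\frac{(z_i - z_{i-1})^2}{\epsilon^2}} , \tag{12.15}$$
where we denote $\epsilon \equiv T/n$, $z_0 \equiv x$ and $z_n \equiv y$. In the limit where $n \to \infty$, the argument of the exponential becomes an integral, and we obtain the following path integral
$$\big\langle y\big| e^{-T(\square + m^2)}\big| x\big\rangle = e^{-Tm^2}\int_{\substack{z(0)=x\\ z(T)=y}} \big[Dz(\tau)\big]\; e^{-\frac{1}{4}\int_0^T d\tau\,\dot z^2(\tau)} . \tag{12.16}$$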

Taking into account the term $V''(\varphi)$ due to a background field poses no difficulty if one breaks the interval $[0, T]$ into many small intervals. Indeed, even though $V''(\varphi(x))$ does not commute with $\square_x$, the Baker-Campbell-Hausdorff formula indicates that the exponential of their sum is equal to the product of their respective exponentials, up to terms of higher order in $\epsilon = T/n$, that do not matter in the limit $n \to \infty$.
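The role of the Baker-Campbell-Hausdorff formula in this slicing can be illustrated on a toy example. In the sketch below, two non-commuting $2\times 2$ matrices `A` and `B` are hypothetical stand-ins for the potential term and the kinetic term; the matrix exponential is computed by a truncated Taylor series, which is adequate only because the matrices are small. The error of the sliced product is expected to decrease roughly like $1/n$.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def scale(a, s):
    return [[s * a[i][j] for j in range(2)] for i in range(2)]

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

def expm(a, terms=40):
    """Matrix exponential by Taylor series (fine for small 2x2 matrices)."""
    result, term = IDENTITY, IDENTITY
    for k in range(1, terms):
        term = scale(matmul(term, a), 1.0 / k)
        result = add(result, term)
    return result

def sliced_exponential(A, B, T, n):
    """(exp(-eps A) exp(-eps B))^n with eps = T / n."""
    step = matmul(expm(scale(A, -T / n)), expm(scale(B, -T / n)))
    result = IDENTITY
    for _ in range(n):
        result = matmul(result, step)
    return result

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

A = [[1.0, 0.0], [0.0, 2.0]]     # toy "potential" term
B = [[1.0, -1.0], [-1.0, 1.0]]   # toy "kinetic" term, does not commute with A
exact = expm(scale(add(A, B), -1.0))
for n in (10, 100, 1000):
    print(n, max_abs_diff(sliced_exponential(A, B, 1.0, n), exact))
```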

12.1.4 One-loop effective action


A minor modification of this derivation also applies to the quantum effective action at one-loop in the background field $\varphi$,
$$\Gamma[\varphi] = -\frac{1}{2}\,{\rm Tr}\,\ln\left(\frac{\square + m^2 + V''(\varphi)}{\square + m^2}\right) . \tag{12.17}$$
The denominator inside the logarithm is not crucial since it is independent of the background field, but it produces an ultraviolet subtraction since the large eigenvalues of the numerator and denominator are almost equal. Firstly, the logarithm may be represented as follows,
$$\ln\left(\frac{\square + m^2 + V''(\varphi)}{\square + m^2}\right) = -\int_0^\infty\frac{dT}{T}\,\Big(e^{-T(\square + m^2 + V''(\varphi))} - e^{-T(\square + m^2)}\Big) , \tag{12.18}$$
which is very similar to eq. (12.3), except for the denominator $1/T$. The same restrictions on the signs of the eigenvalues apply here, forcing us to consider a Euclidean theory. The proof of this formula goes along the following lines:
$$\ln\left(\frac{A}{B}\right) = \int_B^A\frac{dY}{Y} = \int_B^A dY\int_0^\infty dT\; e^{-TY} = \int_0^\infty\frac{dT}{T}\,\Big(e^{-TB} - e^{-TA}\Big) . \tag{12.19}$$
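Eq. (12.19) can be verified numerically for positive numbers $A$ and $B$. The sketch below is an illustration with assumed integration bounds and step count; it integrates in the variable $u = \ln T$, so that the measure $dT/T$ becomes the flat measure $du$.

```python
import math

def log_ratio_from_proper_time(A, B, u_min=-40.0, u_max=8.0, n=20000):
    """Trapezoidal estimate of int_0^inf dT/T (exp(-T B) - exp(-T A)),
    written as an integral over u = ln(T)."""
    h = (u_max - u_min) / n
    total = 0.0
    for i in range(n + 1):
        T = math.exp(u_min + i * h)
        integrand = math.exp(-T * B) - math.exp(-T * A)
        total += integrand if 0 < i < n else 0.5 * integrand
    return total * h

A, B = 3.0, 0.5
print(log_ratio_from_proper_time(A, B), math.log(A / B))
```

Each of the two exponentials taken separately would give a logarithmically divergent integral at small $T$; only their difference is finite, which is the ultraviolet subtraction mentioned above.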
Then, the trace is obtained as
$${\rm Tr}\,\big(\cdots\big) = \int d^dx\; \big\langle x\big|\cdots\big| x\big\rangle . \tag{12.20}$$
Therefore, we obtain a path integral representation similar to eq. (12.9), but with a path that starts and ends at the same point:
$$\Gamma[\varphi] = {\rm const} + \frac{1}{2}\int_0^\infty\frac{dT}{T}\int_{z(0)=z(T)} \big[Dz(\tau)\big]\; e^{-\int_0^T d\tau\,\left(\frac{1}{4}\dot z^2(\tau) + m^2 + V''(\varphi(z(\tau)))\right)} . \tag{12.21}$$
(We have not written explicitly the term coming from the denominator $\square + m^2$; it is contained in the unspecified additive constant.) In this case, the worldlines are closed, and therefore form loops in spacetime.

12.1.5 Length scales


Let us now discuss some qualitative aspects of eqs. (12.9) and (12.21). Firstly, note that the parameter $T$ has the dimension of an inverse squared mass,
$$T \sim ({\rm mass})^{-2} . \tag{12.22}$$
On dimensional grounds, one sees that the typical diameter of the loops$^1$ $z(\tau)$ that appear in eq. (12.21) is
$$\Delta z \sim T^{1/2} . \tag{12.23}$$
In contrast, the perimeter of these loops scales as $T$. These scaling laws are consistent with a Brownian motion of duration $T$.

Figure 12.1: Typical worldloop that contributes in eq. (12.21). While its length scales as $T$, its extent in spacetime only grows like $T^{1/2}$.
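These scaling laws can be observed on the discretized weight of eq. (12.15). The following Python sketch is a rough Monte Carlo illustration under simplifying assumptions that are not in the text: it uses open paths in $d = 1$ rather than closed loops, a fixed step $\epsilon$, and hypothetical function names. With this weight each step is Gaussian with variance $2\epsilon$, so the mean squared end-to-end distance should be close to $2T$, while the length of the path measured at fixed $\epsilon$ grows linearly with $T$.

```python
import math
import random

def sample_open_path(T, eps, rng):
    """One discretized path with weight exp(-sum (dz)^2 / (4 eps)) in d = 1.
    Returns the end-to-end displacement and the total length of the path."""
    n = int(round(T / eps))
    sigma = math.sqrt(2.0 * eps)  # standard deviation of a single step
    z, length = 0.0, 0.0
    for _ in range(n):
        dz = rng.gauss(0.0, sigma)
        z += dz
        length += abs(dz)
    return z, length

def worldline_statistics(T, eps=0.01, samples=2000, seed=0):
    """Mean squared extent and mean length over an ensemble of paths."""
    rng = random.Random(seed)
    sum_z2, sum_len = 0.0, 0.0
    for _ in range(samples):
        z, length = sample_open_path(T, eps, rng)
        sum_z2 += z * z
        sum_len += length
    return sum_z2 / samples, sum_len / samples

for T in (1.0, 4.0):
    mean_z2, mean_len = worldline_statistics(T)
    print(T, math.sqrt(mean_z2), mean_len)
```

Multiplying $T$ by four should roughly double the root mean square extent and multiply the length by four.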

The integration measure $dT/T$ corresponds to a uniform distribution of the values of $\ln(T)$. Thus, the loop sizes are uniformly distributed on a log scale. Large loops (i.e. large values of $T$) encode the infrared sector of the theory. In a massless theory, loops of arbitrarily large size are allowed. With a non-zero mass, the factor $\exp(-Tm^2)$ suppresses the values $T \gg m^{-2}$, i.e. the loops of size larger than the inverse mass (the Compton wavelength of the particle). By suppressing the probability of occurrence of large loops, the mass thus regulates the infrared. Note also how the second derivative $V''(\varphi(z))$ acts as a position dependent squared mass.
On the other hand, small loops encode the ultraviolet behaviour of the theory. When the extent of the loop becomes smaller than the typical scale over which the background field $\varphi(x)$ varies, the loop sees only a constant background field, whose sole effect in eq. (12.21) is an overall rescaling. Therefore, these small loops behave as in the vacuum, and the ultraviolet sector of the theory does not depend on the background field.

1 For a uniform background field, the exponential depends only on the derivative $\dot z$. As a consequence, this exponential weight constrains the size of the loops, but not the possible location of their barycenter.

12.2 Quantum electrodynamics


12.2.1 Scalar QED
Let us now consider the case where the background field is an Abelian vector field $A^\mu(x)$, while the particle in the loop is a (complex) scalar. The one-loop effective action is now given by
$$\Gamma[A] \equiv -{\rm Tr}\,\ln\left(\frac{-(\partial - ieA)^2 + m^2}{\square + m^2}\right) . \tag{12.24}$$
Note that there is no prefactor 1/2 because the scalar field in this theory is a complex field. Firstly, we obtain
$$\Gamma[A] = {\rm const} + \int_0^\infty\frac{dT}{T}\int d^dx\; \big\langle x\big| e^{-T(m^2 - (\partial - ieA)^2)}\big| x\big\rangle . \tag{12.25}$$
Then, note that the exponential contains an operator which is very similar to the Hamiltonian of a charged particle in an external electromagnetic field. We can use this analogy in order to obtain a path integral representation of the matrix element under the integral in the previous equation. This leads to the following worldline representation:
$$\Gamma[A] = {\rm const} + \int_0^\infty\frac{dT}{T}\int_{z(0)=z(T)} \big[Dz(\tau)\big]\; e^{-\int_0^T d\tau\,\left(\frac{1}{4}\dot z^2(\tau) + ie\,\dot z(\tau)\cdot A(z(\tau)) + m^2\right)} . \tag{12.26}$$
Likewise, the tree-level scalar propagator in a background electromagnetic field is given by
$$G(x, y) = -i\int_0^\infty dT\int_{\substack{z(0)=x\\ z(T)=y}} \big[Dz(\tau)\big]\; e^{-\int_0^T d\tau\,\left(\frac{1}{4}\dot z^2(\tau) + ie\,\dot z(\tau)\cdot A(z(\tau)) + m^2\right)} . \tag{12.27}$$
Under an Abelian gauge transformation of the electromagnetic potential,
$$A^\mu(z) \to A^\mu(z) + \partial^\mu\chi(z) , \tag{12.28}$$
the term in $\dot z\cdot A$ transforms as follows
$$\dot z\cdot A \to \dot z\cdot\big(A + \partial_z\chi\big) = \dot z\cdot A + \partial_\tau\chi(z(\tau)) . \tag{12.29}$$
Thus, the gauge transformation modifies this term by the addition of a total derivative with respect to $\tau$, whose integral is
$$\int_0^T d\tau\;\partial_\tau\chi(z(\tau)) = \chi(z(T)) - \chi(z(0)) . \tag{12.30}$$

In the calculation of the one-loop effective action, the trajectories $z(\tau)$ have equal initial and final points, and this shift is therefore zero. Thus, the expression (12.26) of the one-loop effective action is explicitly gauge invariant. If we were considering instead the scalar propagator $G(x, y)$, this term would be
$$\int_0^T d\tau\;\partial_\tau\chi(z(\tau)) = \chi(y) - \chi(x) , \tag{12.31}$$
and as a consequence the propagator transforms as
$$G(x, y) \to e^{ie\chi(x)}\, G(x, y)\, e^{-ie\chi(y)} , \tag{12.32}$$
which is indeed the correct gauge transformation law of the scalar propagator.
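The statement that the gauge shift is a pure boundary term can be checked on a discretized worldline. In the sketch below, the gauge function `chi`, the two-dimensional setting and the circular path are arbitrary choices made for illustration only; the line integral of $\dot z\cdot\partial\chi$ is evaluated with a midpoint rule on a polygonal approximation of the path.

```python
import math

def chi(z):
    """Hypothetical gauge function chi(z) in two Euclidean dimensions."""
    return math.sin(z[0]) * z[1] + 0.3 * z[0] ** 2

def grad_chi(z):
    return (math.cos(z[0]) * z[1] + 0.6 * z[0], math.sin(z[0]))

def gauge_shift(path):
    """Discretized int dtau zdot . grad chi(z(tau)) along a polygonal path."""
    total = 0.0
    for a, b in zip(path[:-1], path[1:]):
        mid = (0.5 * (a[0] + b[0]), 0.5 * (a[1] + b[1]))
        g = grad_chi(mid)
        total += (b[0] - a[0]) * g[0] + (b[1] - a[1]) * g[1]
    return total

def circle_path(theta0, theta1, n=4000, radius=1.5, center=(0.2, -0.4)):
    """Arc of circle sampled at n + 1 points between the angles theta0 and theta1."""
    return [(center[0] + radius * math.cos(theta0 + (theta1 - theta0) * i / n),
             center[1] + radius * math.sin(theta0 + (theta1 - theta0) * i / n))
            for i in range(n + 1)]

loop = circle_path(0.0, 2.0 * math.pi)   # closed worldline, as in eq. (12.26)
arc = circle_path(0.0, 2.0)              # open worldline, as in eq. (12.27)
print("closed loop:", gauge_shift(loop))
print("open path:  ", gauge_shift(arc), chi(arc[-1]) - chi(arc[0]))
```

For the closed loop the shift vanishes up to discretization errors, while for the open path it reproduces $\chi(y) - \chi(x)$, in agreement with eqs. (12.30) and (12.31).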

12.2.2 Spinor QED


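When the particle in the loop is a spin 1/2 fermion, the one-loop effective action involves the Dirac operator
$$\Gamma[A] \equiv {\rm Tr}\,\ln\left(\frac{iD_\mu\gamma^\mu + m}{i\partial_\mu\gamma^\mu + m}\right) . \tag{12.33}$$
The first step is to note that
$$\det\big(m + i\slashed{D}\big) = \Big[\det\big(m^2 + \slashed{D}^2\big)\Big]^{1/2} = \det\big(m - i\slashed{D}\big) , \tag{12.34}$$
which leads to
$$\Gamma[A] \equiv \frac{1}{2}\,{\rm Tr}\,\ln\left[\frac{\slashed{D}^2 + m^2}{\slashed{\partial}^2 + m^2}\right] . \tag{12.35}$$
Then, we may use
$$\begin{aligned}
\slashed{D}^2 &= D_\mu D_\nu\,\gamma^\mu\gamma^\nu\\
&= D_\mu D_\nu\left(\tfrac{1}{2}\{\gamma^\mu, \gamma^\nu\} + \tfrac{1}{2}[\gamma^\mu, \gamma^\nu]\right)\\
&= -D^2 + \tfrac{1}{4}\,[D_\mu, D_\nu]\,[\gamma^\mu, \gamma^\nu]\\
&= -D^2 - e\,F_{\mu\nu}\,M^{\mu\nu} ,
\end{aligned} \tag{12.36}$$
where $M^{\mu\nu} \equiv \frac{i}{4}[\gamma^\mu, \gamma^\nu]$. (We have assumed a Euclidean metric tensor with only minus signs in the 3rd and 4th lines.) This gives the following representation of the one-loop effective action:
$$\Gamma[A] = {\rm const} - \frac{1}{2}\int_0^\infty\frac{dT}{T}\;{\rm tr}\; e^{-T\left(m^2 - D^2 - e\,F_{\mu\nu}M^{\mu\nu}\right)} . \tag{12.37}$$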

The term $m^2 - D^2$, identical to the operator encountered in the case of scalar QED, is now supplemented by a potential
$$U(x) \equiv -e\,F_{\mu\nu}(x)\,M^{\mu\nu} . \tag{12.38}$$
However, because $U(x)$ still contains non-commuting Dirac matrices, the worldline representation of the exponential is now more complicated, and the overall trace applies both to the spacetime dependence and to the Dirac indices. A first possibility is to reproduce the method used in the previous sections, where we introduce a path integral over classical trajectories $z^\mu(\tau)$. When doing this, the matrix Dirac structure inside the exponential is not altered, and is handled by a path ordering:
$$\big\langle x\big| e^{-T\left(m^2 - D^2 - e\,F_{\mu\nu}M^{\mu\nu}\right)}\big| x\big\rangle = \int_{\substack{z(0)=x\\ z(T)=x}} \big[Dz(\tau)\big]\; {\rm P}\Big(e^{-\int_0^T d\tau\,\left(\frac{1}{4}\dot z^2(\tau) + ie\,\dot z(\tau)\cdot A(z(\tau)) + m^2 + U(z(\tau))\right)}\Big) . \tag{12.39}$$
But it is in fact possible to remove the path ordering by introducing some auxiliary variables. In the procedure that leads to eq. (12.39), one breaks the interval $[0, T]$ into infinitesimal sub-intervals and one inserts a complete sum of states between each factor. When the evolution operator to be evaluated contains extra internal degrees of freedom (in the present case, the spin degree of freedom encoded in the Dirac matrices), the intermediate states inserted in the expression must contain information about this internal structure, for the matrix elements produced in the process to be c-numbers.

Let us define the following operators from the Dirac matrices:
$$c_1^\pm \equiv \frac{i\gamma^1 \pm \gamma^2}{2} , \quad c_2^\pm \equiv \frac{i\gamma^3 \pm \gamma^4}{2} . \tag{12.40}$$
From the anti-commutation relation obeyed by the Dirac matrices, we have
$$\{c_r^+, c_s^-\} = \delta_{rs} , \quad \{c_r^+, c_s^+\} = \{c_r^-, c_s^-\} = 0 . \tag{12.41}$$
Therefore the operators $c_r^+$ are fermionic creation operators (creating independent fermions), and the $c_r^-$ are the corresponding annihilation operators. By inverting eq. (12.40),
$$\begin{aligned}
\gamma^1 &= -i\,(c_1^+ + c_1^-) , & \gamma^2 &= c_1^+ - c_1^- ,\\
\gamma^3 &= -i\,(c_2^+ + c_2^-) , & \gamma^4 &= c_2^+ - c_2^- ,
\end{aligned} \tag{12.42}$$
the potential $U(x)$ may be viewed as a Hamiltonian quadratic in fermionic creation and annihilation operators, and a time evolution operator constructed with this Hamiltonian may be written as a Grassmann path integral. In order to see this, let us start from a state $|0\rangle$ which is annihilated by $c_{1,2}^-$,
$$c_{1,2}^-\,|0\rangle = 0 , \tag{12.43}$$

and construct populated states by applying $c_{1,2}^+$ to it,
$$|n_1 n_2\rangle \equiv \big(c_1^+\big)^{n_1}\big(c_2^+\big)^{n_2}\,|0\rangle . \tag{12.44}$$
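The algebra of eqs. (12.40)-(12.41) can be checked with any explicit set of Dirac matrices obeying $\{\gamma^\mu, \gamma^\nu\} = -2\delta^{\mu\nu}$. The representation used below (built from tensor products of Pauli matrices and multiplied by $i$) is one possible choice made for this sketch, not necessarily the one used elsewhere in the text, and all helper names are illustrative.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    m = len(b)
    size = len(a) * m
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(size)]
            for i in range(size)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(a, b, ca, cb):
    """Linear combination ca * a + cb * b."""
    n = len(a)
    return [[ca * a[i][j] + cb * b[i][j] for j in range(n)] for i in range(n)]

def anticommutator(a, b):
    return combine(matmul(a, b), matmul(b, a), 1, 1)

def is_multiple_of_identity(a, value, tol=1e-12):
    n = len(a)
    return all(abs(a[i][j] - (value if i == j else 0)) < tol
               for i in range(n) for j in range(n))

S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]
ID2 = [[1, 0], [0, 1]]

# Hermitian matrices with {G_mu, G_nu} = +2 delta, then gamma_mu = i G_mu.
hermitian = [kron(S2, S1), kron(S2, S2), kron(S2, S3), kron(S1, ID2)]
gamma = [[[1j * x for x in row] for row in g] for g in hermitian]

# Eq. (12.40): c_r^(+/-) = (i gamma^(2r-1) +/- gamma^(2r)) / 2, for r = 1, 2.
c_plus = [combine(gamma[2 * r], gamma[2 * r + 1], 0.5j, 0.5) for r in range(2)]
c_minus = [combine(gamma[2 * r], gamma[2 * r + 1], 0.5j, -0.5) for r in range(2)]

for r in range(2):
    for s in range(2):
        delta = 1 if r == s else 0
        print(r + 1, s + 1,
              is_multiple_of_identity(anticommutator(c_plus[r], c_minus[s]), delta),
              is_multiple_of_identity(anticommutator(c_plus[r], c_plus[s]), 0),
              is_multiple_of_identity(anticommutator(c_minus[r], c_minus[s]), 0))
```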

Consider now a pair of complex Grassmann variables $\xi_{1,2}$ such that
$$\begin{aligned}
&\{\xi_i, \xi_j\} = \{\xi_i, \bar\xi_j\} = \{\bar\xi_i, \xi_j\} = \{\bar\xi_i, \bar\xi_j\} = 0 ,\\
&\{\xi_i, c_j^\pm\} = \{\bar\xi_i, c_j^\pm\} = 0 .
\end{aligned} \tag{12.45}$$
A fermionic coherent state $|\xi\rangle$ may be defined as
$$|\xi\rangle \equiv e^{-\xi c^+}\,|0\rangle , \quad \langle\xi| = \langle 0|\,e^{\bar\xi c^-} . \tag{12.46}$$
As with bosonic coherent states, they are eigenstates of the annihilation operators,
$$c_i^-\,|\xi\rangle = \xi_i\,|\xi\rangle , \quad \langle\xi|\,c_i^+ = \bar\xi_i\,\langle\xi| . \tag{12.47}$$
In addition, the overlap between two such coherent states is given by
$$\langle\xi|\zeta\rangle = e^{\bar\xi\zeta} , \tag{12.48}$$
and one may construct the identity operator as a superposition of projectors on these coherent states:
$$1 = \int\underbrace{d\bar\xi_1\, d\xi_1\, d\bar\xi_2\, d\xi_2}_{\equiv\; d\bar\xi\, d\xi}\; e^{-\bar\xi\xi}\; |\xi\rangle\langle\xi| . \tag{12.49}$$
Moreover, if $A(c^+, c^-)$ is a normal ordered operator made of the creation and annihilation operators, then its matrix element between two coherent states is given by
$$\langle\xi|\,A(c^+, c^-)\,|\zeta\rangle = e^{\bar\xi\zeta}\, A(\bar\xi, \zeta) , \tag{12.50}$$
and the trace over the Dirac indices of an operator $A$ may be written as
$${\rm tr}\,A = \int d\bar\xi\, d\xi\; e^{-\bar\xi\xi}\; \langle -\xi|\,A\,|\xi\rangle . \tag{12.51}$$
(One may easily check that this gives 4 when $A$ is the identity.) Note that in the calculation of this trace, the coherent states that appear on the left and on the right are defined with opposite Grassmann variables $\xi$ and $-\xi$. This is a standard property of fermionic traces, whose path integral representation must obey an anti-periodic boundary condition in time.
This formalism can be used to transform the Dirac structure in eq. (12.39) into
a Grassmann path integral. To achieve this, we follow the standard procedure of

breaking the interval [0, T ] into N small sub-intervals, and we insert a unit operator
given by eq. (12.49) at the boundaries of the sub-intervals. This produces matrix
elements of the form
µν
ξi+1 e−ǫeFµν (z(τi )) M ξi , (12.52)

(ǫ ≡ T/N) that may be evaluated by replacing the Dirac matrices in Mµν by their
expression in terms of the operators c± 1,2 and by using the properties of fermionic
coherent states. This leads to the following worldline representation for the one-loop
effective action in QED, with a spin 1/2 field in the loop:
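$$\Gamma[A] = {\rm const} - \frac{1}{2}\int_0^\infty\frac{dT}{T}\int_{\substack{z(0)=z(T)\\ \psi(0)=-\psi(T)}} \big[Dz(\tau)\,D\psi(\tau)\big]\; e^{-\int_0^T d\tau\,\left(\frac{1}{4}\dot z^2 + ie\,\dot z\cdot A(z) + m^2 + \frac{1}{2}\psi_\mu\dot\psi^\mu - ie\,\psi^\mu F_{\mu\nu}(z)\psi^\nu\right)} , \tag{12.53}$$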
where ψµ is a collection of four Grassmann variables that combine the ξ1,2 , ξ1,2 at
each intermediate time. In this formula, the ordering that was necessary to handle
the non-commutative nature of the Dirac matrices has now been replaced by a path
integral over fermionic internal degrees of freedom.

12.3 Schwinger mechanism


Since it provides expressions for propagators and effective actions in a background field, the worldline formalism is well suited to study phenomena that occur in the presence of such an external field, for instance the splitting of a photon into two photons in an external magnetic field ($\gamma \to 2\gamma$ is forbidden in the vacuum, but becomes possible if an external electromagnetic field provides a fourth photon), or the bremsstrahlung radiation by a charged particle in a magnetic field.

Another interesting process which can be addressed by the worldline formalism is the Schwinger mechanism, which amounts to the spontaneous production of $e^+e^-$ pairs by a static and homogeneous electric field. That this is possible may be understood intuitively as follows. Without any external field, QED has an empty band of states with energy larger than $m$ corresponding to free electrons and anti-electrons, and a filled band of states with energy lower than $-m$ that corresponds to “trapped” particles (the Dirac sea). A minimal energy of $2m$ (the minimal energy of an $e^+e^-$ pair) must be provided to move one of these particles from the Dirac sea to the positive band of free particles. Consider now an electric field $E$ in the $x$ direction. If this field is static and homogeneous, we may find a gauge in which it is represented by a vector potential whose only non-zero component is $A^0 = -Ex$, which tilts the boundaries of the band of free states and of the Dirac sea, as shown in the figure 12.2. Now, a pair from the Dirac sea can move to the band of free particles by a tunneling process, that does not require any energy. Standard results of quantum mechanics indicate that the tunneling probability should behave as $\exp(-{\rm const}\times m^2/(eE))$. This expression is non-analytic in the coupling constant $e$, making it impossible to obtain in a standard perturbative expansion.

Figure 12.2: Schematic picture of the tunneling process involved in the Schwinger mechanism. The empty band is the gap between the anti-electron Dirac sea and the positive energy electron continuum, tilted by the potential energy $eV(x) = -eEx$ in the presence of an external electric field $E$.
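To get a feeling for the size of this exponential suppression, one may restore SI units, in which $m^2/(eE)$ becomes $E_c/E$ with $E_c = m^2c^3/(e\hbar)$. The sketch below is only indicative: the text leaves the constant in the exponent unspecified, and the value $\pi$ used as a default here is an assumption taken from Schwinger's classic result, as are the function names.

```python
import math

M_ELECTRON = 9.1093837015e-31   # kg
E_CHARGE = 1.602176634e-19      # C
HBAR = 1.054571817e-34          # J s
C_LIGHT = 2.99792458e8          # m / s

def critical_field():
    """Field E_c = m^2 c^3 / (e hbar), in V/m, at which e E equals m^2 in natural units."""
    return M_ELECTRON ** 2 * C_LIGHT ** 3 / (E_CHARGE * HBAR)

def tunneling_factor(e_field, const=math.pi):
    """exp(-const * m^2 / (e E)); const = pi is an assumption of this sketch."""
    return math.exp(-const * critical_field() / e_field)

print("critical field [V/m]:", critical_field())
for fraction in (1.0, 0.1, 0.01):
    print(fraction, tunneling_factor(fraction * critical_field()))
```

The factor collapses extremely quickly once the field drops below the critical value, faster than any power of the field, which is another way of seeing why no finite order of perturbation theory can reproduce it.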
Although the Schwinger mechanism was computed a long time ago by resummed
perturbation theory, the worldline formalism provides a straightforward way to cal-
culate it and offers very interesting new insights about the space-time development
of the particle production process. Let us consider the case of scalar QED in order
to illustrate this in a simpler setting. The probability of pair production may be
inferred from the vacuum-to-vacuum transition amplitude, that can be written as an
exponential,

$$\langle 0_{\rm out} | 0_{\rm in} \rangle = e^{i V} , \tag{12.54}$$

$iV$ being the sum of all the connected vacuum diagrams. The possibility of particle
production is intimately related to the imaginary part of $V$, since the total probability
of producing particles reads
$$P_{\rm prod} = 1 - \big|\langle 0_{\rm out} | 0_{\rm in} \rangle\big|^2 = 1 - e^{-2\,{\rm Im}\,V} . \tag{12.55}$$

In scalar QED, the graphs made of one scalar loop embedded in a background
electromagnetic field lead to the following contribution to V,

$$V_{\rm 1\,loop} = \ln\det\big(g_{\mu\nu} D^\mu D^\nu + m^2\big) , \tag{12.56}$$

where $D^\mu$ is the covariant derivative in the background field. The metric should
be Euclidean in order to apply the worldline formalism, i.e. $g_{\mu\nu} = -\delta_{\mu\nu}$. At
one-loop, the sum of the connected vacuum graphs in the presence of a background
electromagnetic field is given by

$$V_{\rm 1\,loop} = \int_0^\infty \frac{dT}{T}\, e^{-m^2 T} \int_{z(0)=z(T)} \big[Dz(\tau)\big]\; \exp\Big(-\int_0^T d\tau\, \big(\tfrac{1}{4}\dot{z}^2(\tau) + i e\, \dot{z}(\tau)\cdot A(z(\tau))\big)\Big) . \tag{12.57}$$

This formula involves a double integration: a path integral over all the worldlines
z(τ), i.e. closed paths in Euclidean space-time parameterized by the fictitious time
τ ∈ [0, T ], and an ordinary integral over the length T of these paths. The sum over
all the worldlines can be viewed as a materialization of the quantum fluctuations in
space-time, and the prefactor $\exp(-m^2 T)$ suppresses the very long worldlines that
explore regions of space-time that are much larger than the Compton wavelength of
the particles.
In eq. (12.57), the path integral can be factored into an integral over the barycenter
$Z$ of the worldline and the position $\zeta(\tau)$ about this barycenter,
$$z(\tau) \equiv Z + \zeta(\tau) , \qquad \int_0^T d\tau\, \zeta(\tau) = 0 . \tag{12.58}$$

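After this separation, all the information about the background field contained in
eq. (12.57) comes via a Wilson line,
$$W_Z[\zeta] \equiv \exp\Big(-i e \int_0^T d\tau\; \dot\zeta(\tau)\cdot A(Z+\zeta(\tau))\Big) , \tag{12.59}$$
averaged over all closed loops of length $T$,
$$\langle W_Z\rangle_T \equiv \frac{\displaystyle\int_{\zeta(0)=\zeta(T)} \big[D\zeta(\tau)\big]\; W_Z[\zeta]\; \exp\Big(-\int_0^T d\tau\, \frac{\dot\zeta^2}{4}\Big)}{\displaystyle\int_{\zeta(0)=\zeta(T)} \big[D\zeta(\tau)\big]\; \exp\Big(-\int_0^T d\tau\, \frac{\dot\zeta^2}{4}\Big)} . \tag{12.60}$$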

This path average is dominated by an ensemble of loops localized around the barycenter
$Z$, and $\langle W_Z\rangle_T$ encapsulates the local properties of the quantum field theory in
the vicinity of $Z$ (roughly up to a distance of order $T^{1/2}$). In terms of this averaged
Wilson loop, the 1-loop Euclidean connected vacuum amplitude reads
$$V_{\rm 1\,loop} = \frac{1}{(4\pi)^2} \int d^4 Z \int_0^\infty \frac{dT}{T^3}\; e^{-m^2 T}\, \langle W_Z\rangle_T . \tag{12.61}$$

(In this formula, the prefactor and power of $T$ in the measure assume 4 spacetime
dimensions.) The imaginary part of $V_{\rm 1\,loop}$ comes from the existence of poles in
$\langle W_Z\rangle_T$ at real values of the fictitious time $T$. In terms of these poles, the imaginary
part can be written as
$${\rm Im}\,\big(V_{\rm 1\,loop}\big) = \frac{\pi}{(4\pi)^2} \int d^4 Z \sum_{{\rm poles}\ T_n} \frac{e^{-m^2 T_n}}{T_n^3}\; {\rm Re}\,{\rm Res}\big(\langle W_Z\rangle_T , T_n\big) . \tag{12.62}$$

Let us now be more specific and consider a static and uniform electrical field $E$.
Since one can choose a gauge potential which is linear in the coordinates $Z, \zeta$, the
path integral that gives the average Wilson loop is Gaussian and can therefore be
performed in closed form, leading to

$$\langle W_Z\rangle_T = \frac{eET}{\sin(eET)} . \tag{12.63}$$

(Note that it does not depend on the barycenter $Z$ since the field is constant.) This
quantity has an infinite series of simple poles along the positive real axis, located at
$T_n = n\pi/(eE)$ ($n = 1, 2, 3, \cdots$), that give the following expression for the imaginary
part:

$${\rm Im}\,\big(V_{\rm 1\,loop}\big) = \frac{V_4}{16\pi^3}\, (eE)^2 \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n^2}\; e^{-n\pi m^2/(eE)} . \tag{12.64}$$

In this formula, $V_4$ is the volume in space-time over which the integration over the
barycenter $Z$ is carried out. After exponentiation, this formula gives the vacuum
survival probability $P_0 = \exp(-2\,{\rm Im}\,V)$. A more detailed study would reveal that
the term of index $n$ comes from Bose-Einstein correlations among $n$ produced pairs,
while the first pole $T_1$ only contains information about the uncorrelated part of the
spectrum. Given the origin of these terms in the present derivation, as coming from
poles $T_n$ that are more distant from $T = 0$, we see that increasingly intricate (the index
$n$ is the number of correlated particles) quantum correlations come from worldlines
that explore larger and larger portions of space-time. This supports the intuitive image
that quantum fluctuations and correlations are encoded in the fact that the worldlines
explore an extended region around the base point $Z$.
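To get a feeling for the relative size of the terms in eq. (12.64), the series can be summed numerically. The sketch below is only an illustration: the function names are hypothetical, the rate is expressed per unit space-time volume $V_4$, and $eE$ and $m$ are given in arbitrary but consistent units.

```python
import math

def im_v_per_volume(eE, m, n_max=200):
    """Partial sum of eq. (12.64), divided by the space-time volume V4."""
    prefactor = eE**2 / (16.0 * math.pi**3)
    return prefactor * sum(
        (-1) ** (n - 1) / n**2 * math.exp(-n * math.pi * m**2 / eE)
        for n in range(1, n_max + 1)
    )

def first_term(eE, m):
    """Contribution of the first pole T_1 alone."""
    return eE**2 / (16.0 * math.pi**3) * math.exp(-math.pi * m**2 / eE)

# Ratio of the full series to its first term, for a weak and for a strong field
weak = im_v_per_volume(0.1, 1.0) / first_term(0.1, 1.0)
strong = im_v_per_volume(10.0, 1.0) / first_term(10.0, 1.0)
```

For $eE \ll m^2$ the first pole dominates completely, whereas for $eE \gg m^2$ the higher poles give a visible correction. In the massless limit the series reduces to $\sum_n (-1)^{n-1}/n^2 = \pi^2/12$.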

12.4 Calculation of one-loop amplitudes

The worldline formalism can also be used in order to derive expressions for one-loop
amplitudes. The main difference, compared to the calculation of the Schwinger
mechanism, is that in the case of amplitudes the momenta carried by the lines attached
to the loop are fixed instead of integrated over. The expected result is therefore a
function of N momenta (or coordinates), rather than just a number.

12.4.1 $\phi^3$ scalar field theory

As a first simple illustration of the method, let us consider a scalar field theory with a
cubic coupling $V(\phi) = \frac{\lambda}{3!}\phi^3$, for which the worldline representation of the one-loop
effective action is
$$\Gamma[\varphi] = {\rm const} + \frac{1}{2}\int_0^\infty \frac{dT}{T} \int_{z(0)=z(T)} \big[Dz(\tau)\big]\; \exp\Big(-\int_0^T d\tau\, \big(\tfrac{1}{4}\dot{z}^2 + m^2 + \lambda\varphi(z)\big)\Big) . \tag{12.65}$$

Like in the previous section, we first split $z(\tau)$ into the barycenter and a deviation
about it:
$$z(\tau) \equiv Z + \zeta(\tau) . \tag{12.66}$$
In the case of amplitudes, the integration over $Z$ will simply produce the delta function
of overall energy-momentum conservation. Using the $T$-periodicity of the paths over
which we integrate, the term in $\dot\zeta^2$ inside the exponential can be integrated by parts,
$$-\frac{1}{4}\int_0^T d\tau\; \dot\zeta^2 = \frac{1}{4}\int_0^T d\tau\; \zeta\ddot\zeta , \tag{12.67}$$
and the path integral on $\zeta(\tau)$ involves the inverse $G(\tau,\tau')$ of the operator $\frac{1}{2}\partial_\tau^2$,
defined by
$$\partial_\tau^2\, G(\tau,\tau') = 2\,\delta(\tau-\tau') . \tag{12.68}$$
This inverse exists thanks to the fact that we have removed the barycenter from $z(\tau)$,
which amounts to removing the zero mode from $\zeta(\tau)$. Indeed, a general $T$-periodic
function can be written as
$$\zeta(\tau) = \sum_{n\in\mathbb{Z}} \zeta_n\, e^{2i\pi n \frac{\tau}{T}} , \tag{12.69}$$
and excluding the zero mode corresponds to $\zeta_0 = 0$. A very useful identity is
$$\sum_{n\in\mathbb{Z}} e^{2i\pi n \frac{\tau}{T}} = T \sum_{n\in\mathbb{Z}} \delta(\tau - nT) . \tag{12.70}$$
Using this formula, we can check that the propagator $G(\tau,\tau')$ is given by
$$G(\tau,\tau') = 2T \sum_{n\in\mathbb{Z}^*} \frac{1}{(2i\pi n)^2}\; e^{2i\pi n \frac{\tau-\tau'}{T}} . \tag{12.71}$$
Note that this function is even in $\tau-\tau'$ and $T$-periodic. Integrating$^2$ eq. (12.70) twice
from $0$ to $\tau-\tau'$, we obtain
$$G(\tau,\tau') = |\tau-\tau'| - \frac{(\tau-\tau')^2}{T} - \frac{T}{6} . \tag{12.72}$$

$^2$ We adopt a symmetric convention for handling the delta function $\delta(\tau)$, which amounts to
$\int_0^\tau d\tau'\, \delta(\tau') = \frac{1}{2}\theta(\tau)$.
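The agreement between the mode sum (12.71) and the closed form (12.72) can be checked numerically. In the sketch below (function names are illustrative), the identity $(2i\pi n)^2 = -4\pi^2 n^2$ is used and the terms $+n$ and $-n$ are combined into a cosine, so that the sum runs over $n \ge 1$ only. The closed form is used for $|\tau-\tau'| \le T$.

```python
import math

def g_closed(tau, tau_p, T):
    """Closed form of eq. (12.72), valid for |tau - tau_p| <= T."""
    u = tau - tau_p
    return abs(u) - u * u / T - T / 6.0

def g_series(tau, tau_p, T, n_max=20_000):
    """Truncated mode sum of eq. (12.71), with +n and -n combined into a cosine."""
    u = tau - tau_p
    s = sum(math.cos(2.0 * math.pi * n * u / T) / n**2 for n in range(1, n_max + 1))
    return -T / math.pi**2 * s

T = 2.0
pairs = [(0.3, 0.0), (1.7, 0.4), (0.2, 1.5)]
differences = [abs(g_series(t, tp, T) - g_closed(t, tp, T)) for t, tp in pairs]
```

The truncation error of the mode sum decreases like $1/n_{\rm max}$, so the two expressions agree only up to that accuracy. One can also check that the closed form averages to zero over one period, as it must once the zero mode has been removed.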

From the quantum effective action, one-particle irreducible amplitudes are obtained
by differentiating with respect to the field, as many times as there are external
legs, and by setting $\varphi \equiv 0$ afterwards. Thus, the $N$-point function is given by

$$\Gamma_N(x_1,\cdots,x_N) = \frac{(-\lambda)^N}{2} \int_0^\infty \frac{dT}{T}\, e^{-m^2 T} \int_{z(0)=z(T)} \big[Dz(\tau)\big]\; \prod_{i=1}^{N} \int_0^T d\tau_i\; \delta(z(\tau_i) - x_i)\;\; e^{-\int_0^T d\tau\, \frac{1}{4}\dot{z}^2} . \tag{12.73}$$

In this formula, the path integral is over all closed paths that pass through all the points
$x_1, \cdots, x_N$, in any order, which provides a rather intuitive picture of the worldline
representation of the amplitude. Let us now Fourier transform this expression in order
to obtain the amplitude in momentum space. The Fourier integrals over the $x_i$ are
trivial thanks to the $N$ delta functions, and we obtain
$$\Gamma_N(p_1,\cdots,p_N) = \frac{(-\lambda)^N}{2} \int_0^\infty \frac{dT}{T}\, e^{-m^2 T} \int d^d Z \int_{\zeta(0)=\zeta(T)} \big[D\zeta(\tau)\big]\; \prod_{i=1}^{N} \int_0^T d\tau_i\; e^{i p_i\cdot(Z+\zeta(\tau_i))}\;\; e^{\int_0^T d\tau\, \frac{1}{4}\zeta\ddot\zeta} , \tag{12.74}$$

where we also have separated the barycenter coordinate $Z$ from the deviation $\zeta$ and
integrated by parts the term in $\dot\zeta^2$. The integral over $Z$ produces a delta function of
the sum of the momenta, and the path integral over $\zeta$ is Gaussian, leading to

$$\Gamma_N(p_1,\cdots,p_N) = \frac{(-\lambda)^N}{2^{1+d/2}}\, (2\pi)^{d/2}\, \delta\Big(\sum_i p_i\Big) \int_0^\infty \frac{dT}{T^{1+d/2}}\, e^{-m^2 T} \int_0^T \prod_{i=1}^{N} d\tau_i\; \exp\Big(\frac{1}{2}\sum_{i,j} G(\tau_i,\tau_j)\, (p_i\cdot p_j)\Big) . \tag{12.75}$$

This is the worldline expression of a one-loop $N$-point scalar amplitude. One may
make a number of remarks about this formula:

• In contrast with the formulas obtained from Feynman diagrams, there is no
loop momentum. It is replaced by the variable $T$ that measures the length of the
worldloops. As we have said earlier, there is a loose connection between $T$ and
a momentum, since small values of $T$ correspond to the ultraviolet and large
values of $T$ to the infrared.

• The dependence on the external momenta is directly expressed in terms of the
Lorentz invariants $p_i\cdot p_j$. In a formula that contains a loop momentum $k$, one
would also have all the $k\cdot p_i$.

• The integral on $T$ may be divergent at small $T$, because of the factor $1/T^{1+d/2}$.
However, the integrals of the second line roughly behave as $T^N$ (since there
are $N$ integrals over an interval of size $T$, with an integrand of order one).
Thus, the overall behaviour of the $T$ integral is $dT\, T^{N-1-d/2}$. This integral is
convergent if $N - 1 - d/2 > -1$, i.e. $N > d/2$. In four spacetime dimensions,
this is $N > 2$, in agreement with conventional power counting that indicates
that all one-loop functions with $N \ge 3$ are finite in the $\phi^3$ scalar theory.

• Each ordering of the fictitious times $\tau_i$ corresponds to a given cyclic ordering
of the momenta $p_i$ around the loop. The term corresponding to one such ordering
in the formula (12.75) can be mapped to the expression of the corresponding
Feynman diagram. The $\tau_i$ correspond to the $N$ Feynman parameters introduced
to combine the $N$ denominators into a single one$^3$, and $T$ corresponds to the
squared momentum that appears in this unique denominator.

• The constant term $-T/6$ in the propagator of eq. (12.72) does not contribute in
eq. (12.75). Indeed, its contribution inside the exponential is

$$-\frac{T}{12}\sum_{i,j} p_i\cdot p_j = -\frac{T}{12}\Big(\sum_i p_i\Big)\cdot\Big(\sum_j p_j\Big) = 0 , \tag{12.76}$$

which is zero thanks to momentum conservation.
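The cancellation of eq. (12.76) is easy to illustrate numerically. The following sketch (with hypothetical names, random Euclidean momenta and an arbitrary value of $T$) builds $N$ momenta whose sum vanishes and evaluates the contribution of the constant term of the propagator.

```python
import random

def constant_term(momenta, T):
    """-T/12 * sum_{i,j} p_i . p_j for a list of d-dimensional Euclidean momenta."""
    def dot(p, q):
        return sum(a * b for a, b in zip(p, q))
    return -T / 12.0 * sum(dot(p, q) for p in momenta for q in momenta)

rng = random.Random(0)
d, N = 4, 5
free = [[rng.uniform(-1.0, 1.0) for _ in range(d)] for _ in range(N - 1)]
last = [-sum(p[k] for p in free) for k in range(d)]   # enforces sum_i p_i = 0
conserved = constant_term(free + [last], T=3.0)
not_conserved = constant_term(free, T=3.0)            # last momentum left out
```

With momentum conservation enforced the result vanishes up to rounding errors, while it is generically non-zero if one of the momenta is left out.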

12.4.2 Scalar quantum electrodynamics

As a slightly more complicated example of application, let us now derive the expression
of the one-loop $N$-photon amplitude in scalar QED. The starting point is the
one-loop quantum effective action in an Abelian background gauge field,

$$\Gamma[A] = {\rm const} + \int_0^\infty \frac{dT}{T}\, e^{-m^2 T} \int_{z(0)=z(T)} \big[Dz(\tau)\big]\; \exp\Big(-\int_0^T d\tau\, \big(\tfrac{1}{4}\dot{z}^2 + i e\, \dot{z}\cdot A(z)\big)\Big) . \tag{12.77}$$

$^3$ Note that there are only $N-1$ independent Feynman parameters, since their sum is constrained to be
one, but because of the periodicity and translation invariance of the propagators $G(\tau_i,\tau_j)$, it is possible to
choose one of the $\tau_i$'s to be equal to zero, hence only $N-1$ of them are truly independent.

Differentiating $N$ times with respect to $A_\mu(x)$ and setting the background field to
zero afterwards, we obtain

$$\Gamma_N^{\mu_1\cdots\mu_N}(x_1,\cdots,x_N) = (-ie)^N \int_0^\infty \frac{dT}{T}\, e^{-m^2 T} \int_{z(0)=z(T)} \big[Dz(\tau)\big]\; e^{-\int_0^T d\tau\, \frac{1}{4}\dot{z}^2}\; \int_0^T \prod_{i=1}^{N} d\tau_i\; \delta(z(\tau_i)-x_i)\; \dot{z}^{\mu_i}(\tau_i) . \tag{12.78}$$

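Next, we Fourier transform this expression and contract each external Lorentz index
with a polarization vector, and we isolate the integral over the barycenter $Z$,

$$\Gamma_N(p_1\epsilon_1,\cdots,p_N\epsilon_N) = (-ie)^N\, (2\pi)^d\, \delta\Big(\sum_i p_i\Big) \int_0^\infty \frac{dT}{T}\, e^{-m^2 T} \int_{\zeta(0)=\zeta(T)} \big[D\zeta(\tau)\big]\; e^{\int_0^T d\tau\, \frac{1}{4}\zeta\ddot\zeta}\; \int_0^T \prod_{i=1}^{N} d\tau_i\; e^{i p_i\cdot\zeta(\tau_i)}\, \big(\dot\zeta(\tau_i)\cdot\epsilon_i\big) . \tag{12.79}$$

The path integral is still Gaussian, but the factors $\dot\zeta(\tau_i)\cdot\epsilon_i$ complicate it significantly
compared to the $\phi^3$ theory. In particular, the answer will now contain derivatives of
the propagator

$$\dot{G}(\tau,\tau') \equiv \partial_\tau G(\tau,\tau') = {\rm sign}(\tau-\tau') - \frac{2\,(\tau-\tau')}{T} , \qquad \ddot{G}(\tau,\tau') \equiv \partial_\tau^2 G(\tau,\tau') = 2\,\delta(\tau-\tau') - \frac{2}{T} . \tag{12.80}$$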

Note that the first derivative of the propagator with respect to the second time is the
opposite of the $\dot{G}$ defined above since the propagator $G$ is even. Note again that the
term $-T/6$ in this propagator will not contribute, thanks to momentum conservation.
A convenient trick to perform this integral is to write

$$\prod_i \dot\zeta(\tau_i)\cdot\epsilon_i = \exp\Big(\sum_i \dot\zeta(\tau_i)\cdot\epsilon_i\Big)\Big|_{\rm multi\text{-}linear} , \tag{12.81}$$

where the subscript "multi-linear" means that we keep only the term in $\epsilon_1\epsilon_2\cdots\epsilon_N$
in the Taylor expansion of the exponential. This leads to
$$\Gamma_N(p_1\epsilon_1,\cdots,p_N\epsilon_N) = \frac{(-ie)^N}{2^{d/2}}\, (2\pi)^{d/2}\, \delta\Big(\sum_i p_i\Big) \int_0^\infty \frac{dT}{T^{1+d/2}}\, e^{-m^2 T} \int_0^T \prod_{i=1}^{N} d\tau_i\; \exp\sum_{i,j}\Big[\frac{1}{2} G(\tau_i,\tau_j)\,(p_i\cdot p_j) + i\,\dot{G}(\tau_i,\tau_j)\,(p_i\cdot\epsilon_j) + \frac{1}{2}\ddot{G}(\tau_i,\tau_j)\,(\epsilon_i\cdot\epsilon_j)\Big]\Big|_{\rm multi\text{-}linear} . \tag{12.82}$$

The expansion of the exponential and extraction of the term that contains each
polarization vector exactly once leads to an expression of the form
$$\exp\big[\cdots\big]\Big|_{\rm multi\text{-}linear} = P_N(\dot{G},\ddot{G})\; \exp\Big(\frac{1}{2}\sum_{i,j} G(\tau_i,\tau_j)\,(p_i\cdot p_j)\Big) , \tag{12.83}$$
where $P_N$ is a polynomial in the derivatives of the propagator, with coefficients made
of the Lorentz invariants $p_i\cdot\epsilon_j$ and $\epsilon_i\cdot\epsilon_j$. By integration by parts, it is possible
to replace the second derivatives $\ddot{G}$ by first derivatives $\dot{G}$. In this operation, the
polynomial $P_N$ is replaced by another polynomial $Q_N$ that depends only on the $\dot{G}$,
hence
$$\exp\big[\cdots\big]\Big|_{\rm multi\text{-}linear} = Q_N(\dot{G})\; \exp\Big(\frac{1}{2}\sum_{i,j} G(\tau_i,\tau_j)\,(p_i\cdot p_j)\Big) . \tag{12.84}$$
The polynomial $Q_N(\dot{G})$ corresponds to the combination of numerators that would
appear in the expression of this amplitude from the usual Feynman rules.

12.4.3 Spinor QED


In quantum electrodynamics with spin 1/2 matter fields, the one-loop effective action
in a photon background is given by:
$$\Gamma[A] = {\rm const} - \frac{1}{2}\int_0^\infty \frac{dT}{T} \int_{\substack{z(0)=z(T)\\ \psi(0)=-\psi(T)}} \big[Dz(\tau)\,D\psi(\tau)\big]\; \exp\Big(-\int_0^T d\tau\, \big(\tfrac{1}{4}\dot{z}^2 + i e\,\dot{z}\cdot A(z) + m^2 + \tfrac{1}{2}\psi_\mu\dot\psi^\mu - i e\,\psi^\mu F_{\mu\nu}(z)\,\psi^\nu\big)\Big) . \tag{12.85}$$

Now, we have a second path integral, that involves the anti-periodic Grassmann
variables $\psi^\mu$. This additional integral is also Gaussian, and its result can be expressed
in terms of the inverse of the operator $\frac{1}{2}\partial_\tau$ over the space of anti-periodic functions$^4$,
whose expression reads
$$S(\tau,\tau') = 2\sum_{n\in\mathbb{Z}} \frac{1}{2i\pi(n+\frac{1}{2})}\; e^{2i\pi(n+\frac{1}{2})\frac{\tau-\tau'}{T}} = {\rm sign}(\tau-\tau') . \tag{12.86}$$

One can then in principle follow the same sequence of steps as in the scalar QED
case, to obtain an expression of the one-loop $N$-photon amplitude in spinor QED in
terms of the propagators $G$, $\dot{G}$, $\ddot{G}$ and $S$. In fact, it was shown by Bern and Kosower
that this expression can be obtained from the corresponding scalar QED amplitude
by a simple substitution. Starting from the final scalar QED expression in terms of
the polynomial $Q_N(\dot{G})$ (see eq. (12.84)), one should arrange each term of this
polynomial as a product of cycles of the form
$$[1,2,3,\cdots,c]_G \equiv \dot{G}(\tau_1,\tau_2)\,\dot{G}(\tau_2,\tau_3)\cdots\dot{G}(\tau_{c-1},\tau_c) . \tag{12.87}$$
Then, the Bern-Kosower rule states that in order to obtain the analogous spinor QED
amplitude, one should perform the following substitution on each such cycle:
$$[1,2,3,\cdots,c]_G \;\to\; -2\,\Big([1,2,3,\cdots,c]_G - [1,2,3,\cdots,c]_S\Big) , \tag{12.88}$$
where $[\cdots]_S$ is the same cyclic product made of the propagator $S$ defined above
instead of $\dot{G}$.

12.4.4 Example: QED polarization tensor


Scalar QED: As an illustration of the calculation of amplitudes in the worldline
formalism, let us study the one-loop photon polarization tensor in quantum
electrodynamics, starting first from the simpler case of scalar QED. In $d$ dimensions,
the polarization tensor is related to the one-particle irreducible two-point function
$\Gamma_2^{\mu_1\mu_2}(p_1,p_2)$ by
$$\Gamma_2^{\mu\nu}(p,q) \equiv (2\pi)^d\, \delta(p+q)\; \Pi^{\mu\nu}(p) . \tag{12.89}$$
From eq. (12.79), we obtain the following expression in scalar QED

$$\Pi^{\mu\nu}_{\rm scalar}(p) = -e^2 \int_0^\infty \frac{dT}{T}\, e^{-m^2 T} \int_{\zeta(0)=\zeta(T)} \big[D\zeta(\tau)\big]\; e^{\int_0^T d\tau\, \frac{1}{4}\zeta\ddot\zeta}\; \int_0^T d\tau_1\, d\tau_2\; e^{i p\cdot\zeta(\tau_1)}\, e^{-i p\cdot\zeta(\tau_2)}\; \dot\zeta^\mu(\tau_1)\,\dot\zeta^\nu(\tau_2) . \tag{12.90}$$

$^4$ Anti-periodic functions defined over the interval $[0,T]$ can be written as
$$\psi(\tau) \equiv \sum_{n\in\mathbb{Z}} \psi_n\, e^{2i\pi(n+\frac{1}{2})\frac{\tau}{T}} .$$
When restricted to these functions, the derivative operator $\partial_\tau$ has no zero mode and is thus invertible.
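
Figure 12.3: Feynman graphs contributing to the one-loop photon polarization tensor in
scalar QED.

The path integration over $\zeta$ leads to

$$\begin{aligned}
&\int_{\zeta(0)=\zeta(T)} \big[D\zeta(\tau)\big]\; e^{\int_0^T d\tau\, \frac{1}{4}\zeta\ddot\zeta}\; e^{i p\cdot\zeta(\tau_1)}\, e^{-i p\cdot\zeta(\tau_2)}\; \dot\zeta^\mu(\tau_1)\,\dot\zeta^\nu(\tau_2) \\
&\qquad = \frac{1}{(4\pi T)^{d/2}}\, \Big[g^{\mu\nu}\,\ddot{G}(\tau_1,\tau_2) - p^\mu p^\nu\, \dot{G}^2(\tau_1,\tau_2)\Big]\; e^{-p^2 G(\tau_1,\tau_2)} \\
&\qquad = \frac{1}{(4\pi T)^{d/2}}\, \Big(g^{\mu\nu} p^2 - p^\mu p^\nu\Big)\, \dot{G}^2(\tau_1,\tau_2)\; e^{-p^2 G(\tau_1,\tau_2)} ,
\end{aligned} \tag{12.91}$$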
where in the third line we have anticipated an integration by parts on $\tau_1$ for the term
in $\ddot{G}$. We can already see that the polarization tensor is transverse. At this point, it is
convenient to use rescaled variables $\tau_i \equiv T\vartheta_i$. Moreover, thanks to the translation
invariance of the integrand in $\tau_{1,2}$ and to its $T$-periodicity, we are free to set $\vartheta_2 \equiv 0$.
Having done that, the propagator and its derivative become simple functions of $\vartheta_1$,

$$G(\tau_1,\tau_2) = T\,\vartheta_1 (1-\vartheta_1) , \qquad \dot{G}(\tau_1,\tau_2) = 1 - 2\,\vartheta_1 . \tag{12.92}$$

(We have already dropped the constant term in $-T/6$ from the propagator, since it
does not contribute to amplitudes thanks to momentum conservation.) At this point,
the polarization tensor reads

$$\begin{aligned}
\Pi^{\mu\nu}_{\rm scalar}(p) &= -\frac{e^2}{(4\pi)^{d/2}}\, \Big(g^{\mu\nu} p^2 - p^\mu p^\nu\Big) \int_0^1 d\vartheta_1\; (1-2\vartheta_1)^2 \int_0^\infty \frac{dT}{T}\; T^{2-d/2}\; e^{-T\,(m^2 + p^2\vartheta_1(1-\vartheta_1))} \\
&= -\frac{e^2}{(4\pi)^{d/2}}\, \Big(g^{\mu\nu} p^2 - p^\mu p^\nu\Big) \int_0^1 d\vartheta_1\; (1-2\vartheta_1)^2\; \Gamma\big(2-\tfrac{d}{2}\big)\, \Big[m^2 + p^2\vartheta_1(1-\vartheta_1)\Big]^{d/2-2} .
\end{aligned} \tag{12.93}$$

One may check that this expression is identical to the one we would have obtained
from the two Feynman diagrams of figure 12.3, after introducing Feynman
parameters and performing the integration over the loop momentum.
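The step from the first to the second line of eq. (12.93) is the integral $\int_0^\infty \frac{dT}{T}\, T^{2-d/2}\, e^{-T\Delta} = \Gamma(2-\frac{d}{2})\,\Delta^{d/2-2}$, with $\Delta \equiv m^2 + p^2\vartheta_1(1-\vartheta_1)$. The sketch below checks it numerically for $d = 1, 2, 3$, where the integral converges (in $d = 4$ it diverges at small $T$, which is the ultraviolet pole of $\Gamma(2-\frac{d}{2})$). The change of variable $T = s^2$ and the function names are choices made for this illustration only.

```python
import math

def proper_time_integral(delta, d, steps=200_000):
    """Midpoint estimate of int_0^inf dT/T T^(2-d/2) exp(-T*delta), using T = s^2."""
    s_max = math.sqrt(50.0 / delta)      # exp(-50) is negligible beyond this point
    ds = s_max / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * ds
        # dT/T * T^(2-d/2) = 2 * s^(3-d) ds
        total += 2.0 * s ** (3 - d) * math.exp(-s * s * delta) * ds
    return total

def gamma_form(delta, d):
    """Right hand side: Gamma(2 - d/2) * delta^(d/2 - 2)."""
    return math.gamma(2.0 - d / 2.0) * delta ** (d / 2.0 - 2.0)

m2, p2, theta = 1.0, 2.0, 0.3
delta = m2 + p2 * theta * (1.0 - theta)
checks = {d: (proper_time_integral(delta, d), gamma_form(delta, d)) for d in (1, 2, 3)}
```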

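Spinor QED: Let us now consider the same quantity in QED with a spin 1/2
fermion. Eq. (12.90) is replaced by

$$\begin{aligned}
\Pi^{\mu\nu}_{\rm spin\,1/2}(p) = \frac{e^2}{2} \int_0^\infty \frac{dT}{T}\, e^{-m^2 T} &\int_{\substack{\zeta(0)=\zeta(T)\\ \psi(0)=-\psi(T)}} \big[D\zeta(\tau)\,D\psi(\tau)\big]\; e^{\int_0^T d\tau\, (\frac{1}{4}\zeta\ddot\zeta - \frac{1}{2}\psi\cdot\dot\psi)} \int_0^T d\tau_1\, d\tau_2\; e^{i p\cdot\zeta(\tau_1)}\, e^{-i p\cdot\zeta(\tau_2)} \\
&\times \Big(\dot\zeta^\mu(\tau_1) + 2i\,\psi^\mu(\tau_1)\,(\psi(\tau_1)\cdot p)\Big) \\
&\times \Big(\dot\zeta^\nu(\tau_2) - 2i\,\psi^\nu(\tau_2)\,(\psi(\tau_2)\cdot p)\Big) .
\end{aligned} \tag{12.94}$$

The path integral for the term in $\dot\zeta^\mu\dot\zeta^\nu$ is the same as in eq. (12.91), but we should
now multiply the result by$^5$
$$\int_{\psi(0)=-\psi(T)} \big[D\psi(\tau)\big]\; e^{-\int_0^T d\tau\, \frac{1}{2}\psi\cdot\dot\psi} = 4 . \tag{12.95}$$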

For the terms involving the Grassmann variables, we have

$$4 \int_{\psi(0)=-\psi(T)} \big[D\psi(\tau)\big]\; e^{-\int_0^T d\tau\, \frac{1}{2}\psi\cdot\dot\psi}\; \psi^\mu(\tau_1)\,(\psi(\tau_1)\cdot p)\; \psi^\nu(\tau_2)\,(\psi(\tau_2)\cdot p) = -S^2(\tau_1,\tau_2)\, \Big(g^{\mu\nu} p^2 - p^\mu p^\nu\Big) , \tag{12.96}$$

and this result should be multiplied by $(4\pi T)^{-d/2}$ to account for the integration over
the variable $\zeta$. Thus, we see here an example of the Bern-Kosower substitution rule:
the spin 1/2 loop can be obtained from the scalar loop, by replacing $\dot{G}^2(\tau_1,\tau_2)$ by
$\dot{G}^2(\tau_1,\tau_2) - S^2(\tau_1,\tau_2)$ and by multiplying by an overall factor $-2$ (this comes from
a $-1/2$ due to the different prefactors in the scalar and spin 1/2 one-loop effective
actions, times the factor 4 from eq. (12.95)). In terms of the variables $\vartheta_{1,2}$ and after
setting $\vartheta_2 = 0$, we have simply

$$S(\tau_1,\tau_2) = 1 , \tag{12.97}$$

and therefore, the factor $(1-2\vartheta_1)^2$ from $\dot{G}^2$ becomes

$$(1-2\vartheta_1)^2 - 1 = -4\,\vartheta_1(1-\vartheta_1) . \tag{12.98}$$

$^5$ This formula may be obtained by $\zeta$ function regularization. If we denote $A \equiv -\partial_\tau$ (restricted to the
subspace of anti-periodic functions) and $\lambda_n = -2i\pi(n+\frac{1}{2})$ its eigenvalues, the $\zeta$ function of this operator
is $\zeta_A(s) \equiv \sum_{n\in\mathbb{Z}} \lambda_n^{-s}$. Since there are four variables $\psi^\mu$, the value of the path integral is $\big[\det A\big]^2 = \exp(-2\zeta_A'(0))$.
On the other hand, we have $\zeta_A(s) = (i\pi)^{-s}(1+e^{i\pi s})(1-2^{-s})\,\zeta(s)$, where $\zeta(s)$ is
Riemann's zeta function. This function can be expanded at small $s$, giving: $\zeta_A(s) = -\ln(2)\, s + O(s^2)$.

Therefore, the worldline expression of the one-loop photon polarization tensor in
spinor QED reads

$$\Pi^{\mu\nu}_{\rm spin\,1/2}(p) = \frac{8\, e^2}{(4\pi)^{d/2}}\, \Big(g^{\mu\nu} p^2 - p^\mu p^\nu\Big) \int_0^1 d\vartheta_1\; \vartheta_1(1-\vartheta_1)\; \Gamma\big(2-\tfrac{d}{2}\big)\, \Big[m^2 + p^2\vartheta_1(1-\vartheta_1)\Big]^{d/2-2} , \tag{12.99}$$

that agrees with the expression obtained from Feynman graphs (only the first topology
in figure 12.3, with the scalar loop replaced by a spinor loop, contributes in this
case). Remarkably, all the Dirac algebra usually involved in the calculation of fermion
loops is completely avoided in the worldline formalism, since it is encapsulated into
the Grassmann functional integration over $\psi^\mu$.
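The algebra of the substitution rule for the two-point function can be replayed with exact rational arithmetic. The sketch below represents polynomials in $\vartheta_1$ by their lists of coefficients (a representation chosen for this illustration, as are the variable names), applies $\dot{G}^2 \to -2\,(\dot{G}^2 - S^2)$ with $S^2 = 1$, and integrates over $\vartheta_1 \in [0,1]$. Only the magnitudes of the kernels are compared; overall signs and the common factor $e^2/(4\pi)^{d/2}$ are left aside.

```python
from fractions import Fraction

def integrate_poly(coeffs):
    """Exact integral over [0, 1] of sum_k coeffs[k] * x**k."""
    return sum(Fraction(c, k + 1) for k, c in enumerate(coeffs))

g_dot_sq = [1, -4, 4]                                    # (1 - 2x)^2 = 1 - 4x + 4x^2
s_sq = [1, 0, 0]                                         # S^2 = 1
substituted = [a - b for a, b in zip(g_dot_sq, s_sq)]    # Gdot^2 - S^2 = -4x(1 - x)
spinor_kernel = [-2 * c for c in substituted]            # overall factor -2: 8x(1 - x)

scalar_weight = integrate_poly(g_dot_sq)                 # weight of the scalar loop at p = 0
spinor_weight = integrate_poly(spinor_kernel)            # weight of the spinor loop at p = 0
```

The resulting kernel is $8\,\vartheta_1(1-\vartheta_1)$, and at zero momentum the spinor weight is four times the scalar one.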
Chapter 13

Lattice field theory

We have seen earlier that the running coupling in an SU(N) non-Abelian gauge
theory decreases at large energy (provided the number of quark flavours is less
than $11N/2$). The counterpart of asymptotic freedom is that the coupling increases
towards lower energies, precluding the use of perturbation theory to study phenomena
in this regime. Among such phenomena is colour confinement, i.e. the fact
that coloured states cannot exist as asymptotic states. Instead the quarks and gluons
arrange themselves into colour neutral bound states, that can be mesons (e.g. pions,
kaons) made of a quark and an antiquark or baryons (e.g. protons, neutrons) made of
three quarks$^1$. A legitimate question would be to determine the mass spectrum of the
asymptotic states of QCD from its Lagrangian.
Since the perturbative expansion is not applicable for this type of problem, one
would like to be able to attack it via some non-perturbative approach. By
non-perturbative, we mean a method by which observables would directly be obtained to all
orders in the coupling constant, without any expansion. One such method, known as
lattice field theory, consists in discretizing space-time in order to evaluate numerically
the path integral. The continuous space-time is replaced by a discrete grid of points,
the simplest arrangement being a hyper-cubic lattice such as the one shown in
figure 13.1. The distance between nearest neighbor sites is called the lattice spacing,
and usually denoted $a$. The lattice spacing, being the smallest distance that exists in
this setup, provides a natural ultraviolet regularization. Indeed, on a lattice
of spacing $a$, the largest conjugate momentum is of order $a^{-1}$. Moreover, one usually
uses periodic boundary conditions; if the lattice has $N$ spacings in all directions,
then we have $\phi(x + N\hat\mu) = \phi(x)$ for bosonic fields and $\phi(x + N\hat\mu) = -\phi(x)$ for
fermionic fields ($\hat\mu$ is the displacement vector by one lattice spacing in the direction $\mu$
of spacetime).

$^1$ More exotic bound states made of four (tetraquarks) or five (pentaquarks) quarks have also been
speculated, but the experimental evidence for these states is so far not fully conclusive. Likewise, there
may exist bound states without valence quarks, the glueballs.

Figure 13.1: Discretization of Euclidean space-time on a hyper-cubic lattice (here shown
in three dimensions).
13.1 Discretization of bosonic actions

13.1.1 Scalar field theory

As an illustration of some of the issues involved in the discretization of a quantum
field theory, let us consider a simple scalar field theory with a local interaction in $\phi^4$,
whose action in continuous space-time is
$$S = \int d^4x\; \Big\{ -\frac{1}{2}\,\phi(x)\,\big(\partial_\mu\partial^\mu + m^2\big)\phi(x) - \frac{\lambda}{4!}\,\phi^4(x) \Big\} . \tag{13.1}$$

A natural choice is to replace the integral over space-time by a discrete sum over the
sites of the lattice, weighted by the volume $a^4$ of the elementary cells of the lattice,
$$a^4 \sum_{x\in{\rm lattice}} \;\underset{a\to 0}{\longrightarrow}\; \int d^4x . \tag{13.2}$$

Then we replace the continuous function $\phi(x)$ by a discrete set of real numbers that
live on the lattice nodes. For simplicity, we keep denoting $\phi(x)$ the value of the field
on the lattice site $x$. The discretization of the mass and interaction terms is trivial, but
the discretization of the derivatives that appear in the D'Alembertian operator is not
unique. Using only two nearest neighbors, one may define forward or backward finite
differences,

$$\nabla_F^\mu f(x) \equiv \frac{f(x+\hat\mu) - f(x)}{a} , \qquad \nabla_B^\mu f(x) \equiv \frac{f(x) - f(x-\hat\mu)}{a} , \tag{13.3}$$

that both go to the continuum derivative in the limit $a \to 0$. However, unlike the
continuous derivative, $\nabla_F^\mu$ and $\nabla_B^\mu$ are not anti-adjoint. Instead, assuming periodic
boundary conditions, we have
$$\sum_{x\in{\rm lattice}} f(x)\, \big(\nabla_F^\mu\, g(x)\big) = -\sum_{x\in{\rm lattice}} \big(\nabla_B^\mu\, f(x)\big)\, g(x) . \tag{13.4}$$

In other words, $\nabla_F^{\mu\dagger} = -\nabla_B^\mu$. From this, we may construct a self-adjoint discrete
second derivative as follows:
$$\nabla_B^\mu \nabla_F^\mu f(x) = \frac{f(x+\hat\mu) + f(x-\hat\mu) - 2 f(x)}{a^2} \;\underset{a\to 0}{\longrightarrow}\; f''(x) . \tag{13.5}$$

(There is no summation on $\mu$ in the left hand side.) Thus, a self-adjoint discretization
of the scalar Lagrangian leads to
$$S_{\rm lattice} = a^4 \sum_{x\in{\rm lattice}} \Big\{ -\frac{1}{2}\,\phi(x)\,\big(g_{\mu\nu}\nabla_B^\mu\nabla_F^\nu + m^2\big)\phi(x) - \frac{\lambda}{4!}\,\phi^4(x) \Big\} . \tag{13.6}$$
Let us make a few remarks concerning the errors introduced by the discretization.
Firstly, the continuous spacetime symmetries (translation and rotation invariance) of
the underlying theory are now reduced to the subgroup of the discrete symmetries of
a cubic lattice. They are recovered in the limit a → 0. Another source of discrepancy
between the continuum and discrete theories is the dispersion relation that relates the
energy and momentum of an on-shell particle. In the continuum theory, this relation
is of course

$$E^2 = p^2 + m^2 , \tag{13.7}$$

where $-p^2$ is an eigenvalue of the Laplacian. In order to find its counterpart with the
above discretization, we must determine the spectrum of the finite difference operator
$\nabla_B^\mu\nabla_F^\mu$. On a lattice with $N$ sites and periodic boundary conditions, its eigenfunctions
are given by

$$\phi_k(x) \equiv e^{2i\pi \frac{k x}{N a}} \qquad \mbox{with } k\in\mathbb{Z} ,\quad -\frac{N}{2} \le k \le \frac{N}{2} . \tag{13.8}$$

The associated eigenvalue is

$$\lambda_k \equiv \frac{2}{a^2}\Big(\cos\frac{2\pi k}{N} - 1\Big) = -\frac{4}{a^2}\,\sin^2\frac{\pi k}{N} . \tag{13.9}$$
Thus, the one dimensional discrete analogue of the continuum $p^2 + m^2$ is
$$m^2 + \frac{4}{a^2}\,\sin^2\frac{\pi k}{N} . \tag{13.10}$$

Figure 13.2: Discrepancy between the continuous (solid curve) and discrete (points)
dispersion relations, on a one-dimensional lattice with $N = 40$ (energy $E$ as a function
of the mode index $k$).

As long as $k \ll N$, this agrees quite well with the continuum dispersion relation, but
the agreement is not good for larger values of $k$. This discrepancy is illustrated in
figure 13.2. This mismatch does not improve by increasing the number of lattice
points: only the center of the Brillouin zone has a dispersion relation that agrees
with the continuum one. In order to mitigate this problem, one should choose the
parameters of the lattice in such a way that the physically relevant scales correspond
to values of $k$ for which the distortion of the dispersion curve is small.
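The spectrum (13.9) and the distortion of the dispersion relation can be reproduced with a few lines of linear algebra. The sketch below builds the matrix of $\nabla_B\nabla_F$ on a periodic one-dimensional lattice, diagonalizes it, and compares the lattice and continuum energies. The values $N = 40$, $a = 1$ and $m = 0.5$ are illustrative choices (the mass used for the figure is not specified in the text).

```python
import numpy as np

def lattice_laplacian(N, a):
    """Matrix of the second difference operator on a periodic lattice with N sites."""
    lap = -2.0 * np.eye(N)
    for x in range(N):
        lap[x, (x + 1) % N] += 1.0
        lap[x, (x - 1) % N] += 1.0
    return lap / a**2

N, a, m = 40, 1.0, 0.5
numeric = np.sort(np.linalg.eigvalsh(lattice_laplacian(N, a)))
k = np.arange(-N // 2, N // 2)        # N distinct modes (k = -N/2 and k = +N/2 coincide)
analytic = np.sort(-4.0 / a**2 * np.sin(np.pi * k / N) ** 2)

energy_lattice = np.sqrt(m**2 + 4.0 / a**2 * np.sin(np.pi * k / N) ** 2)
energy_continuum = np.sqrt(m**2 + (2.0 * np.pi * k / (N * a)) ** 2)
```

Near $k = 0$ the two energies are almost indistinguishable, while at the edge of the Brillouin zone the lattice momentum saturates at $2/a$ instead of $\pi/a$.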

13.1.2 Gluons and Wilson action

Non-Abelian gauge theories pose an additional difficulty: since the local gauge
invariance plays a central role in their properties, any attempt at discretizing gauge
fields should preserve this symmetry. It turns out that there exists a discretization of
the Yang-Mills action that goes to the continuum action in the limit where $a \to 0$, and
has an exact gauge invariance. The main ingredient in this construction is eq. (4.174),
that relates the Wilson loop along a small square,

$$\square_{x;\mu\nu} \equiv U_\nu^\dagger(x)\; U_\mu^\dagger(x+\hat\nu)\; U_\nu(x+\hat\mu)\; U_\mu(x) , \tag{13.11}$$

to the squared field strength. These elementary lattice Wilson loops are called
plaquettes. In the fundamental representation of su(N), we have

$${\rm tr}\,\big(\square_{x;\mu\nu}\big) = N - \frac{g^2 a^4}{4}\, F_a^{\mu\nu}(x)\, F^a_{\mu\nu}(x) + O(a^6) . \tag{13.12}$$
Note that, although the first two terms in the right hand side are real valued, the
remainder (terms of order $a^6$ and beyond) may be complex. Therefore, it is convenient
to take the real part of the trace of the Wilson loop in order to construct a real valued
discrete action. By summing this equation over all the lattice points $x$ and all the pairs
of distinct directions $(\mu,\nu)$, we obtain
$$a^4 \sum_{x\in{\rm lattice}} \Big(-\frac{1}{4}\, F_a^{\mu\nu}(x)\, F^a_{\mu\nu}(x)\Big) = \underbrace{\frac{N}{g^2} \sum_{x\in{\rm lattice}}\; \sum_{(\mu,\nu)} \Big(N^{-1}\, {\rm tr}\,{\rm Re}\,\big(\square_{x;\mu\nu}\big) - 1\Big)}_{\mbox{Wilson action, denoted } \frac{1}{g^2} S_W[U]} + O(a^2) . \tag{13.13}$$

Note that the error term of order $a^6$ becomes a term of order $a^2$ after summation over
the lattice sites, since the number of sites grows like $a^{-4}$ if the volume is held fixed.
Thus, the sum of the traces of the Wilson loops over all the elementary plaquettes
of the lattice provides a discretization of the Yang-Mills action. In this discrete
formulation, the natural variables are not the gauge potentials $A_\mu(x)$ themselves, but
the Wilson lines $U_\mu(x)$ that live on the edges of the lattice, called link variables. In
this notation, $x$ is the starting point and $\mu$ the direction of the Wilson line, as illustrated
in the left panel of figure 13.3. The Wilson line oriented in the $-\hat\mu$ direction, i.e.
starting at the point $x+\hat\mu$ and ending at the point $x$, is simply the Hermitean conjugate
of $U_\mu(x)$. Under a local gauge transformation, the link variables are changed as
follows:
$$U_\mu(x) \;\to\; \Omega^\dagger(x+\hat\mu)\; U_\mu(x)\; \Omega(x) . \tag{13.14}$$
The plaquette variable, shown in the right panel of figure 13.3, can then be obtained by
multiplying four link variables, as indicated by eq. (13.11), and its trace is obviously
invariant under the transformation of eq. (13.14).

Figure 13.3: Left: link variable $U_\mu(x)$ between the sites $x$ and $x+\hat\mu$. Right: plaquette
on an elementary square of the lattice, with corners $x$, $x+\hat\mu$ and $x+\hat\nu$.
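The gauge invariance of the trace of the plaquette can be checked explicitly on a single elementary square. The sketch below draws random SU(3) matrices for the four links and for the gauge transformation at the four corners; generating them from a QR decomposition is merely a convenient choice for this illustration (no claim is made that they are Haar distributed), and all the names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_su(n):
    """Random SU(n) matrix from the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, _ = np.linalg.qr(z)
    return q / np.linalg.det(q) ** (1.0 / n)     # rescale so that det = 1

def dag(m):
    return m.conj().T

n = 3
sites = ["x", "x+mu", "x+nu", "x+mu+nu"]                 # corners of one plaquette
U_mu_x, U_nu_x = random_su(n), random_su(n)              # U_mu(x), U_nu(x)
U_nu_xmu, U_mu_xnu = random_su(n), random_su(n)          # U_nu(x+mu), U_mu(x+nu)
omega = {s: random_su(n) for s in sites}                 # gauge transformation

def plaquette(U_mu_x, U_nu_x, U_nu_xmu, U_mu_xnu):
    return dag(U_nu_x) @ dag(U_mu_xnu) @ U_nu_xmu @ U_mu_x      # eq. (13.11)

def transform(U, start, end):
    return dag(omega[end]) @ U @ omega[start]                   # eq. (13.14)

before = plaquette(U_mu_x, U_nu_x, U_nu_xmu, U_mu_xnu)
after = plaquette(
    transform(U_mu_x, "x", "x+mu"),
    transform(U_nu_x, "x", "x+nu"),
    transform(U_nu_xmu, "x+mu", "x+mu+nu"),
    transform(U_mu_xnu, "x+nu", "x+mu+nu"),
)
```

The plaquette itself is not invariant: it is conjugated by $\Omega(x)$, the transformation at its base point. Only its trace is unchanged.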

At this stage, the discrete analogue of the path integral that gives the expectation
value of a gauge invariant operator reads,
$$\langle O\rangle = \int \prod_{x,\mu} dU_\mu(x)\;\; O[U]\;\; \exp\Big( i\,\frac{N}{g^2} \sum_{x}\sum_{(\mu,\nu)} \Big(N^{-1}\,{\rm tr}\,{\rm Re}\,\big(\square_{x;\mu\nu}\big) - 1\Big)\Big) . \tag{13.15}$$

Since there exists a left- and right-invariant$^2$ group measure $dU_\mu(x)$, the left hand
side of this formula is gauge invariant. Moreover, it goes to the expectation value of
the continuum theory in the limit of zero lattice spacing.

13.1.3 Monte-Carlo sampling


Thanks to the discretization, the path integral of the original theory is replaced by
an ordinary integral over each of the link variables $U_\mu(x)$, whose number is finite.
A non-perturbative answer could be obtained if one were able to evaluate these
integrals numerically. However, because of the prefactor $i$ inside the exponential
in eq. (13.15), the integrand is a strongly oscillating function, whose numerical
evaluation is practically impossible except on lattices with a very small number
of sites. In order to be amenable to a numerical calculation, this integral must be
transformed into a Euclidean one,
$$\langle O\rangle_E = \int \prod_{x,\mu} dU_\mu(x)\;\; O[U]\;\; \exp\Big( \frac{N}{g^2} \sum_{x}\sum_{(\mu,\nu)} \Big(N^{-1}\,{\rm tr}\,{\rm Re}\,\big(\square_{x;\mu\nu}\big) - 1\Big)\Big) . \tag{13.16}$$

The exponential under the integral is now real-valued, and thus positive definite.
Note that numerical quadratures, such as Simpson's rule, are not practical for this
problem, given the huge number of dimensions of the integral to be evaluated. For
instance, for the 8-dimensional Lie group SU(3), in 4 space-time dimensions, on a
lattice with $N^4$ points, this dimension is $8\times 4\times N^4$. For $N = 32$, the path integral
is thus transformed into a $2^{25}$-dimensional ($2^{25} \sim 3\times 10^7$) ordinary integral. Instead,
one views the exponential of the Wilson action as a probability distribution (up to a
normalization constant) for the link variables, that may be sampled by a Monte-Carlo
algorithm (e.g. the Metropolis-Hastings algorithm) in order to estimate the integral.
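The principle of this sampling can be illustrated on a deliberately trivial toy model, which is not lattice QCD: a single angle $\theta$ (one may think of a single U(1) plaquette) distributed with a weight $\exp(\beta\cos\theta)$, where $\beta$ plays the role of the inverse squared coupling. The sketch below estimates $\langle\cos\theta\rangle$ with a Metropolis update and compares it with a direct quadrature, which is feasible here only because there is a single integration variable. All the names and parameter values are illustrative.

```python
import math
import random

def metropolis_cos(beta, n_steps=200_000, step=1.5, seed=12345):
    """Metropolis estimate of <cos(theta)> for the weight exp(beta * cos(theta))."""
    rng = random.Random(seed)
    theta, total = 0.0, 0.0
    for _ in range(n_steps):
        proposal = theta + rng.uniform(-step, step)          # symmetric proposal
        delta = beta * (math.cos(proposal) - math.cos(theta))
        if delta >= 0.0 or rng.random() < math.exp(delta):
            theta = proposal                                 # accept the move
        total += math.cos(theta)                             # measure at every step
    return total / n_steps

def quadrature_cos(beta, n=20_000):
    """Same average by direct integration over theta in [0, 2*pi]."""
    num = den = 0.0
    for i in range(n):
        t = 2.0 * math.pi * (i + 0.5) / n
        w = math.exp(beta * math.cos(t))
        num += math.cos(t) * w
        den += w
    return num / den

estimate, reference = metropolis_cos(2.0), quadrature_cos(2.0)
```

The statistical error of the Monte-Carlo estimate decreases like the inverse square root of the number of samples, independently of the number of integration variables, which is what makes the method viable for the very high dimensional integrals of lattice gauge theory.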
In this approach, as long as one is evaluating the expectation value of gauge
invariant observables, it is not necessary to fix the gauge in lattice QCD calculations.
2 This means that:
Z Z Z
dU f[U] = dU f[ΩU] = dU f[UΩ] .

Such a measure, known as the Haar measure, exists for compact Lie groups, like SU(N).
13. L ATTICE FIELD THEORY 437

Gauge fixing is necessary when calculating non-gauge invariant quantities, such as


propagators for instance. The Landau gauge is the most commonly used, because the
Landau gauge condition is realized at the extrema of a functional of the link variables,
However, the comparison between gauge-fixed lattice calculations and analytical
calculations is very delicate, because of the existence of Gribov copies (the problem
stems from the fact that the two setups may not select the same Gribov copy).
Although considering the Euclidean path-integral instead of the Minkowski one
allows for a numerical evaluation by Monte-Carlo sampling, this leads to a serious
limitation: only quantities that can be expressed as an Euclidean expectation value are
directly calculable. Others could in principle be reached by an analytic continuation
from imaginary to real time, but this turns out to be practically impossible numerically.
For instance, the masses of hadrons are accessible to lattice QCD calculations (see
section 13.3 for an example), while scattering amplitudes cannot be calculated by
this method.

13.2 Fermions

13.2.1 Discretization of the Dirac action


Consider now the Dirac action, whose expression in continuum space reads
\begin{equation}
S_D = \int d^4x\; \bar\psi(x)\, \big( i\gamma^\mu D_\mu - m \big)\, \psi(x) .
\tag{13.17}
\end{equation}

In the discretization, we assign a spinor ψ(x) to each site of the lattice. Under a gauge
transformation Ω(x), these spinors transform in the same way as in the continuous
theory,

\begin{equation}
\psi(x) \to \Omega^\dagger(x)\, \psi(x) , \qquad \bar\psi(x) \to \bar\psi(x)\, \Omega(x) .
\tag{13.18}
\end{equation}

The main difficulty in defining a discrete covariant derivative that transforms appropriately
under a gauge transformation is that $\psi(x)$ and $\psi(x \pm \hat\mu)$ transform differently
when $\Omega(x)$ depends on space-time. This problem can be remedied by using a link
variable between the point x and its neighbors. As with ordinary derivatives,
one may define forward and backward discrete derivatives,
\begin{align}
D^\mu_F \psi(x) &\equiv \frac{U^\dagger_\mu(x)\,\psi(x+\hat\mu) - \psi(x)}{a} , \notag\\
D^\mu_B \psi(x) &\equiv \frac{\psi(x) - U_\mu(x-\hat\mu)\,\psi(x-\hat\mu)}{a} ,
\tag{13.19}
\end{align}
that both transform like a spinor at the point x, and therefore are valid discretizations
of a covariant derivative. However, neither of these two operators is anti-adjoint, and
therefore they would not give a Hermitean Lagrangian density. This may be achieved
by using instead $\frac{1}{2}\big(D^\mu_F + D^\mu_B\big)$, which corresponds to a symmetric forward-backward
difference
\begin{equation}
\frac{1}{2}\big(D^\mu_F + D^\mu_B\big)\, \psi(x) = \frac{U^\dagger_\mu(x)\,\psi(x+\hat\mu) - U_\mu(x-\hat\mu)\,\psi(x-\hat\mu)}{2a} .
\tag{13.20}
\end{equation}

Figure 13.4: Discrepancy between the continuous (solid curve) and discrete (points) dispersion curves for fermions, on a one-dimensional lattice with N = 40. (Axes: E versus k.)

13.2.2 Fermion doublers


Let us now study how the dispersion relation of fermions is modified by this discretization.
This can easily be done in the vacuum, i.e. by setting all the link variables
to the identity. In this case, the eigenfunctions of the operator $\frac{1}{2}\big(D^\mu_F + D^\mu_B\big)$ are
\begin{equation}
\psi_k(x) = e^{2i\pi \frac{(k+1/2)\,x}{Na}} \quad \text{with } k \in \mathbb{Z} ,\; -\tfrac{N}{2} \le k \le \tfrac{N}{2} ,
\tag{13.21}
\end{equation}
and the corresponding eigenvalue is
\begin{equation}
\lambda_k = \frac{i}{a} \sin\frac{2\pi\,(k+1/2)}{N} ,
\tag{13.22}
\end{equation}
so that the dispersion relation is $E^2 = |\lambda_k|^2 + m^2$. This dispersion relation
is shown in figure 13.4. As in the bosonic case, the discrete dispersion relation
agrees with the continuous one only for small enough k. However, the discrepancy at
large k is now much more serious, because the discrete dispersion curve has another
minimum at the edge of the Brillouin zone. This additional minimum indicates the
existence of a second propagating mode of mass m. This spurious mode is called a
fermion doubler. In d dimensions, the number of these fermionic modes is $2^d$, while
our goal was to have only one. This problem is quite serious, because it affects all
quantities that depend on the number of quark flavours. In particular, this is the case
of the running of the coupling constant, whenever quark loops are included.
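The spectrum of eq. (13.22) and the degeneracy responsible for the doubler can be checked numerically. The sketch below builds the symmetric difference operator as a matrix on a one-dimensional lattice with trivial links; it assumes antiperiodic boundary conditions, which is what the half-integer momenta $k + 1/2$ of eq. (13.21) correspond to. The function names `naive_dirac_spectrum` and `lambda_k` are hypothetical.

```python
import numpy as np


def naive_dirac_spectrum(N=40, a=1.0):
    """Eigenvalues of (D_F + D_B)/2 in one dimension, with all links set to 1."""
    D = np.zeros((N, N))
    for j in range(N):
        D[j, (j + 1) % N] += 1.0 / (2 * a)
        D[j, (j - 1) % N] -= 1.0 / (2 * a)
    # Antiperiodic boundary conditions, psi(x + N a) = -psi(x): the two matrix
    # elements that wrap around the lattice change sign.
    D[N - 1, 0] *= -1.0
    D[0, N - 1] *= -1.0
    return np.linalg.eigvals(D)


def lambda_k(k, N=40, a=1.0):
    """Eigenvalue of eq. (13.22)."""
    return 1j * np.sin(2 * np.pi * (k + 0.5) / N) / a


if __name__ == "__main__":
    N = 40
    ks = np.arange(-N // 2, N // 2)
    numeric = np.sort(naive_dirac_spectrum(N).imag)
    analytic = np.sort(lambda_k(ks, N).imag)
    print("max deviation from eq. (13.22):", np.max(np.abs(numeric - analytic)))
    # The mode next to the edge of the Brillouin zone is as light as the k = 0 mode.
    print(abs(lambda_k(0, N)), abs(lambda_k(N // 2 - 1, N)))
```

The last line prints two equal numbers: the mode at $k = N/2 - 1$ has the same $|\lambda_k|$ as the mode at $k = 0$, which is the doubler in this one-dimensional setting.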
Figure 13.5: Discrepancy between the continuous (solid curve) and discrete (points) dispersion curves in the fermionic case, on a one-dimensional lattice with N = 40, after inclusion of the Wilson term. (Axes: E versus k.)

13.2.3 Wilson term


Various modifications of the discretized Dirac action have been proposed to remedy
the problem of fermion doublers. One of these modifications, known as the Wilson
term, consists in adding to the Lagrangian the following term (written here for the
direction µ),
\begin{equation}
-\frac{1}{2a}\, \bar\psi(x)\, \Big[ U^\dagger_\mu(x)\,\psi(x+\hat\mu) + U_\mu(x-\hat\mu)\,\psi(x-\hat\mu) - 2\psi(x) \Big] ,
\tag{13.23}
\end{equation}
which is nothing but a D’Alembertian (or a Laplacian in the Euclidean theory)
constructed with covariant derivatives. The corresponding operator in the continuum
theory is
\begin{equation}
\frac{a}{2}\; \bar\psi \big( D_\mu D^\mu \big) \psi .
\tag{13.24}
\end{equation}
Note that the denominator in eq. (13.23) has a single power of the lattice spacing a,
hence the prefactor a in the previous equation. Therefore, this term goes to zero in
the limit a → 0, and it should have no effect in the continuum limit. In the absence
of gauge field (Uµ (x) ≡ 1), the functions of eq. (13.21) are still eigenfunctions after
adding the Wilson term, but with modified eigenvalues,
\begin{equation}
\lambda_k = \frac{i}{a} \sin\frac{2\pi\,(k+1/2)}{N} + \frac{1}{a} \Big( 1 - \cos\frac{2\pi\,(k+1/2)}{N} \Big) .
\tag{13.25}
\end{equation}
Thus, the Wilson term does not modify the spectrum at small k, but lifts the spurious
minimum that existed at the edge of the Brillouin zone, as shown in the figure 13.5.
Roughly speaking, the Wilson term gives a mass of order a−1 to the fermion doublers,
making them decouple from the rest of the degrees of freedom when a → 0.
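The effect of the Wilson term on the spectrum can be made quantitative with a few lines of code. The sketch below compares the modulus of the eigenvalues of eqs. (13.22) and (13.25) at the center and at the edge of the Brillouin zone; the function names `naive_lambda` and `wilson_lambda` and the choice a = 1 are assumptions made for the illustration.

```python
import numpy as np


def naive_lambda(k, N=40, a=1.0):
    """Eigenvalue of eq. (13.22), without the Wilson term."""
    theta = 2 * np.pi * (k + 0.5) / N
    return 1j * np.sin(theta) / a


def wilson_lambda(k, N=40, a=1.0):
    """Eigenvalue of eq. (13.25), after adding the Wilson term."""
    theta = 2 * np.pi * (k + 0.5) / N
    return 1j * np.sin(theta) / a + (1.0 - np.cos(theta)) / a


if __name__ == "__main__":
    N = 40
    for k in (0, N // 2 - 1):
        print(k, abs(naive_lambda(k, N)), abs(wilson_lambda(k, N)))
```

Since $|\lambda_k|^2 = \sin^2\theta + (1-\cos\theta)^2 = 4\sin^2(\theta/2)$ in units of $a^{-2}$ (with $\theta \equiv 2\pi(k+1/2)/N$), the modified spectrum grows monotonically with $|\theta|$ and reaches a value close to $2/a$ at the edge of the zone, while it is almost unchanged near k = 0.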
However, the Wilson term has an important drawback: there is no Dirac matrix γµ
in eqs. (13.23) and (13.24) since the Lorentz indices are contracted directly between
the two covariant derivatives. Therefore, the Wilson term –like an ordinary mass
term– breaks explicitly the chiral symmetry of the Dirac Lagrangian in the case
of massless fermions. The fermion doublers are in fact intimately related to chiral
symmetry. Without the Wilson term, lattice QCD with massless quarks has an exact
chiral symmetry unbroken by the lattice regularization, and therefore there cannot be
a chiral anomaly. In fact, this absence of anomaly is precisely due to a cancellation
of anomalies among the multiple copies (the doublers) of the fermion modes. This
argument is completely general and not specific to the Wilson term: any mechanism
that lifts the degeneracy among the doublers will spoil the anomaly cancellation and
thus break chiral symmetry. For this reason, the study of phenomena related to chiral
symmetry is always delicate in lattice QCD.

13.2.4 Evaluation of the fermion path integral


The path integral representation for fermions uses anti-commuting Grassmann vari-
ables. However, such variables are not representable as ordinary numbers in a
numerical implementation. To circumvent this difficulty, one exploits the fact that
the Dirac action is quadratic in the fermion fields (this remains true after adding
the Wilson term to remove the fermion doublers). Therefore, the path integral over
the fermion fields can be done exactly. In addition to the fermion fields contained
in the action, there may be $\psi$’s and $\bar\psi$’s (in equal numbers) in the operator whose
expectation value is being evaluated. The result of such a fermionic path integral is
given by
\begin{equation}
\int \big[ D\psi\, D\bar\psi \big]\; \psi(x_1)\, \bar\psi(x_2)\; e^{i S_D[\psi,\bar\psi]} = S(x_1,x_2) \times \det\big( i\gamma^\mu D_\mu - m \big) ,
\tag{13.26}
\end{equation}
where $S(x_1,x_2)$ is the inverse of the Dirac operator $i\not\!\!D - m$ between the points $x_1$
and $x_2$. When there is more than one $\psi\bar\psi$ pair in the operator, one must sum over all
the ways of connecting the $\psi$’s and the $\bar\psi$’s by the fermion propagators $S(x,y)$. The
same can be done in the lattice formulation. In this case, the Dirac operator is simply
a (very large) matrix that depends on the configuration of link variables. Therefore
one needs the inverse of this matrix, and its determinant.

In eq. (13.26), the Dirac determinant provides closed quark loops, while the
propagator S(x1 , x2 ) connects the external points of the operator under consideration.
This observation, illustrated in the figure 13.6, clarifies the meaning of the quenched
approximation, in which the determinant of the Dirac operator is replaced by 1. This
approximation, motivated primarily by the computational difficulty of evaluating the
Dirac determinant, was widely used in lattice QCD computations until advances in
algorithms and computer hardware made it unnecessary. Note that, although quark
loops are not included in the quenched approximation, gluon loops are present to their
full extent. In contrast, lattice QCD calculations that include the Dirac determinant,
and thus the effect of quark loops, are said to use dynamical fermions.
Figure 13.6: Illustration of the two types of quark contributions. In dark: quark propagators (i.e. inverse of the Dirac operator) that connect the $\psi$’s and $\bar\psi$’s in the operator being evaluated. In lighter color: quark loops coming from the determinant of the Dirac operator.

13.3 Hadron mass determination on the lattice


Let us consider a hadronic state $|h\rangle$. Any operator $O$ that carries the same quantum
numbers as this hadron leads to a non-zero matrix element $\langle h|O|0\rangle$. The vacuum
expectation value of the product of two $O$ at different times 0 and T can be rewritten
as follows,
\begin{align}
\langle 0|O^\dagger(0)\,O(T)|0\rangle
&= \sum_n \langle 0|O^\dagger(0)|\Psi_n\rangle \langle\Psi_n|O(T)|0\rangle \notag\\
&= \sum_n \langle 0|O^\dagger(0)|\Psi_n\rangle \langle\Psi_n|O(0)|0\rangle\; e^{-M_n T} \notag\\
&= \sum_n \big| \langle\Psi_n|O(0)|0\rangle \big|^2\; e^{-M_n T} .
\tag{13.27}
\end{align}

In the first equality, we have inserted a complete basis of eigenstates of the QCD
Hamiltonian, and the second equality follows from the fact that $|\Psi_n\rangle$ is an eigenstate
of rest energy $M_n$ (there is no factor i inside the exponential because of the Euclidean
time used in lattice QCD). The sum in the last equality receives non-zero contributions
from all the states $|\Psi_n\rangle$ that possess the quantum numbers carried by the operator $O$.
However, taking the limit T → ∞ selects the one among these eigenstates that has
the smallest mass. This observation can be turned into a method to determine hadron
masses in lattice QCD:

1. Choose an operator $O$ that has the quantum numbers of the hadron $|h\rangle$ of interest.
The choice of the operator is not crucial, as long as the overlap $\langle h|O|0\rangle$ is not
zero. However, eq. (13.27) suggests that a better result, i.e. less noisy with
limited statistics, may be obtained by trying to maximize this overlap.

2. Evaluate the vacuum expectation value of $O^\dagger(0)\,O(T)$ by Monte-Carlo sampling,
as a function of T.

3. Fit the large T tail of this expectation value. The slope of the exponential gives
the mass of the lightest hadron that possesses these quantum numbers.

Figure 13.7: Hadron mass determination from lattice QCD. Open circles: masses used as input in order to set the lattice parameters. Filled circles: predictions of lattice QCD. Boxes: experimental values.
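The third step of the procedure above can be illustrated on a synthetic correlator. The sketch below builds the spectral sum of eq. (13.27) from a made-up list of masses and overlaps (these numbers are purely illustrative, not lattice data) and extracts the logarithmic slope between consecutive time slices, often called an effective mass. The function names `correlator` and `effective_mass` are hypothetical.

```python
import numpy as np


def correlator(T, masses, overlaps):
    """Spectral sum of eq. (13.27): sum_n |<Psi_n|O|0>|^2 exp(-M_n T)."""
    T = np.asarray(T, dtype=float)
    return sum(c ** 2 * np.exp(-M * T) for M, c in zip(masses, overlaps))


def effective_mass(C):
    """Logarithmic slope between consecutive time slices, ln(C(T) / C(T+1))."""
    C = np.asarray(C, dtype=float)
    return np.log(C[:-1] / C[1:])


if __name__ == "__main__":
    masses = [0.5, 1.2, 2.0]      # illustrative values, in lattice units
    overlaps = [1.0, 0.8, 0.6]    # illustrative matrix elements
    T = np.arange(0, 30)
    m_eff = effective_mass(correlator(T, masses, overlaps))
    print(m_eff[0], m_eff[-1])
```

At small T the slope is contaminated by the heavier states and overestimates the mass; at large T it settles on the lightest mass of the list. With actual Monte-Carlo data, the statistical noise grows at large T, which is why a larger overlap with the state of interest helps.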

The discretized QCD Lagrangian contains several dimensionful parameters: the
lattice spacing a and the quark masses $m_f$ (one for each quark flavour), whose values
need to be fixed before novel predictions can be made. One must choose (at least) an
equal number of physical quantities that are known experimentally. Their computed
values depend on a and on the $m_f$, and one should adjust these parameters so that the
computed values match the experimental ones. After this has been done, quantities computed in lattice QCD
do not contain any free parameter anymore and are thus predictions. Figure 13.7
shows hadron masses calculated using lattice QCD.

13.4 Wilson loops and confinement

13.4.1 Strong coupling expansion

While perturbation theory is an expansion in powers of $g^2$, it is possible to use the
lattice formulation of a Yang-Mills theory in order to perform an expansion in powers
of the quantity $\beta \equiv g^{-2}$ that appears as a prefactor in the Wilson action. This is
called a strong coupling expansion, since it becomes exact in the limit of infinite
coupling.

This expansion produces integrals over the gauge group such as
\begin{equation}
\int dU\; U_{i_1 j_1} \cdots U_{i_n j_n}\; U^\dagger_{k_1 l_1} \cdots U^\dagger_{k_m l_m} .
\tag{13.28}
\end{equation}
The simplest of these integrals,
\begin{equation}
\int dU = 1
\tag{13.29}
\end{equation}
is simply a choice of normalization of the invariant group measure. From the unitarity
of the group elements, one then obtains$^3$
\begin{equation}
\int dU\; U_{ij}\, U^\dagger_{kl} = \frac{1}{N}\, \delta_{jk}\, \delta_{il} .
\tag{13.30}
\end{equation}

In these integrals, the link variables on different edges of the lattice are independent
variables, and there is a separate integral for each of them. This is completely general:
integrals of the form (13.28) are non-zero only if the integrand contains an equal
number of U’s and U†’s, i.e. for n = m. Therefore, each link variable U that appears
in such a group integral must be matched by a corresponding U†. For instance, the
group integral of the Wilson loop defined on an elementary plaquette is zero,
\begin{equation}
\int \prod_{x,\mu} dU_\mu(x)\; \mathrm{tr}\,\Big( \underbrace{U^\dagger_\nu(x)\, U^\dagger_\mu(x+\hat\nu)\, U_\nu(x+\hat\mu)\, U_\mu(x)}_{\square_{x;\mu\nu}} \Big) = 0 ,
\tag{13.31}
\end{equation}

because the four link variables live on four distinct edges of the lattice. In contrast,
the integral of the trace of a plaquette times the trace of the conjugate plaquette is
non-zero:
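\begin{equation}
\int \prod_{x,\mu} dU_\mu(x)\; \mathrm{tr}\big( \square_{x;\mu\nu} \big)\; \mathrm{tr}\big( \square^\dagger_{x;\mu\nu} \big) = 1 .
\tag{13.32}
\end{equation}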

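Using these results, we can calculate to order β the expectation value of the trace of a
plaquette:
\begin{align}
\big\langle \mathrm{tr}\, \square_{x;\mu\nu} \big\rangle
&\equiv \frac{\int \prod_{x,\mu} dU_\mu(x)\; \mathrm{tr}\big( \square_{x;\mu\nu} \big)\, \exp\Big[ \beta N \sum_{y;\rho\sigma} \big( N^{-1}\, \mathrm{tr}\, \mathrm{Re}\, \square_{y;\rho\sigma} - 1 \big) \Big]}
{\int \prod_{x,\mu} dU_\mu(x)\; \exp\Big[ \beta N \sum_{y;\rho\sigma} \big( N^{-1}\, \mathrm{tr}\, \mathrm{Re}\, \square_{y;\rho\sigma} - 1 \big) \Big]} \notag\\
&= \frac{\beta}{2} + O(\beta^2) .
\tag{13.33}
\end{align}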

Consider now the trace of a more general Wilson loop along a path γ (planar, to
simplify the discussion). Each U and U† in the Wilson loop must be compensated
by a link variable coming from the β expansion of the exponential of the Wilson
action. The lowest order term in β corresponds to a minimal tiling of the Wilson
loop by elementary plaquettes, as illustrated in figure 13.8. The corresponding
contribution is
\begin{equation}
\big\langle \mathrm{tr}\, W_\gamma \big\rangle = \Big( \frac{\beta}{2} \Big)^{\mathrm{Area}(\gamma)} + \cdots ,
\tag{13.34}
\end{equation}
where Area(γ) is counted in units of elementary plaquettes, and where the dots are
terms of higher order in β (that can be constructed from non-minimal tilings of the
contour γ, such that all the U’s and U†’s are still paired appropriately).

Figure 13.8: Tiling of a closed loop by elementary plaquettes.

13.4.2 Heavy quark potential

Let us consider now a rectangular loop, with an extent R in the spatial direction 1 and
an extent T in the Euclidean time direction 4. The previous result indicates that the
expectation value of the trace of the corresponding Wilson loop has the following
form,
\begin{equation}
\big\langle \mathrm{tr}\, W_\gamma \big\rangle \sim e^{-\sigma R T} + \cdots ,
\tag{13.35}
\end{equation}
where σ is a constant. Although it is gauge invariant, this expectation value is easier
to interpret in an axial gauge where $A_4 \equiv 0$. Indeed, in this gauge, the Wilson loop
receives only contributions from gauge links along the spatial direction, as shown
in figure 13.9. Note that the remaining Wilson lines are precisely those that are
needed to make a non-local gauge invariant operator with a quark at x = R and an
antiquark at x = 0,
\begin{equation}
O_{q\bar q}(t) \equiv \bar\psi(t,0)\; W_{[0,R]}\; \psi(t,R) ,
\tag{13.36}
\end{equation}


3 For SU(2), one may parameterize the group elements in the fundamental representation by
$U = \theta_0 + 2i\,\theta^a t^a_f$ with $\theta_0^2 + \theta_1^2 + \theta_2^2 + \theta_3^2 = 1$, and the invariant group measure normalized according to
eq. (13.29) is $dU = d\theta_1\, d\theta_2\, d\theta_3 / \big( \pi^2 \sqrt{1 - \theta^2} \big)$ (with $\theta_0 = \sqrt{1 - \theta^2}$). By using this measure and the
Fierz identity satisfied by the generators $t^a_f$, an explicit calculation leads easily to eq. (13.30).
Figure 13.9: Rectangular Wilson loop in the $A_4 \equiv 0$ gauge. (The time direction t runs from 0 to T.)

where $W_{[0,R]}$ is a (spatial) Wilson line going from (t, R) to (t, 0). Consider now
the vacuum expectation value $\langle 0|O^\dagger_{q\bar q}(0)\, O_{q\bar q}(T)|0\rangle$. In this expectation value, the
fermionic path integral produces two quark propagators that connect the $\psi$’s to the
$\bar\psi$’s. However, in the limit of infinite quark mass M, the quarks are static and their
propagator is just a Wilson line in the temporal direction, that reduces to the identity
in the $A_4 = 0$ gauge (represented by the dotted lines in figure 13.9). Thus, we
have
\begin{equation}
\big\langle \mathrm{tr}\, W_\gamma \big\rangle \propto \lim_{M\to\infty} \langle 0|O^\dagger_{q\bar q}(0)\, O_{q\bar q}(T)|0\rangle .
\tag{13.37}
\end{equation}

By inserting a complete basis of eigenstates of the Hamiltonian in the right hand side
of eq. (13.37) and by taking the limit T → ∞, we find a result dominated by the
quark-antiquark state of lowest energy $E_0$,
\begin{equation}
\lim_{T\to\infty} \langle 0|O^\dagger_{q\bar q}(0)\, O_{q\bar q}(T)|0\rangle = \big| \langle 0|O^\dagger_{q\bar q}(0)|\Psi_0\rangle \big|^2\; e^{-E_0 T} .
\tag{13.38}
\end{equation}

Moreover, in the limit of large mass, the energy $E_0$ of this state is dominated by the
potential energy V(R) between the quark and the antiquark (the quark and antiquark
are non-relativistic, and their kinetic energy behaves as $P^2/2M \to 0$),
\begin{equation}
\lim_{M,T\to\infty} \langle 0|O^\dagger_{q\bar q}(0)\, O_{q\bar q}(T)|0\rangle = \big| \langle 0|O^\dagger_{q\bar q}(0)|\Psi_0\rangle \big|^2\; e^{-V(R)\, T} .
\tag{13.39}
\end{equation}

By comparing this result with that of the strong coupling expansion, eq. (13.35), we
conclude that
\begin{equation}
V(R) = \sigma R .
\tag{13.40}
\end{equation}

This linear potential indicates that the force between the quark and antiquark is constant
at large distance, in sharp contrast with a Coulomb potential in electrodynamics.
This is a consequence of the colour confinement property of QCD.
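To make the constant σ of eq. (13.35) explicit: if R and T are measured in units of the lattice spacing, the leading term $(\beta/2)^{RT}$ of eq. (13.34) gives $\sigma a^2 = -\ln(\beta/2)$. The sketch below evaluates this strong coupling estimate and, as a cross-check that is not part of the text, compares it with the compact U(1) theory in two dimensions, where the plaquettes are independent variables and the area law $\langle W \rangle = (I_1(\beta)/I_0(\beta))^{\mathrm{Area}}$ holds for all β. The function names are hypothetical, and the Bessel functions are computed from their integral representation to avoid any external dependency beyond NumPy.

```python
import numpy as np


def bessel_i(n, beta, points=20001):
    """Modified Bessel function I_n(beta) = (1/pi) int_0^pi exp(beta cos t) cos(n t) dt."""
    t = np.linspace(0.0, np.pi, points)
    f = np.exp(beta * np.cos(t)) * np.cos(n * t)
    dt = t[1] - t[0]
    return (f.sum() - 0.5 * (f[0] + f[-1])) * dt / np.pi  # trapezoid rule


def sigma_strong_coupling(beta):
    """String tension in lattice units from the leading term of eq. (13.34)."""
    return -np.log(beta / 2.0)


def sigma_u1_2d(beta):
    """String tension in lattice units for compact U(1) in two dimensions."""
    return -np.log(bessel_i(1, beta) / bessel_i(0, beta))


if __name__ == "__main__":
    for beta in (0.05, 0.1, 0.5, 1.0):
        print(beta, sigma_strong_coupling(beta), sigma_u1_2d(beta))
```

The two columns agree at small β and drift apart as β grows, which gives a feeling for the range of validity of the lowest order of the strong coupling expansion in this simple model.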

13.5 Gauge fixing on the lattice


Until now, we have described lattice field theory restricted to the evaluation of the
expectation value of gauge invariant operators. One may legitimately argue that this
is sufficient as far as the computation of physical observables is concerned. There
are however some applications that involve gauge dependent quantities, for which
a gauge fixing is necessary. This is the case for instance if one wishes to compare
the behaviour of propagators or vertex functions evaluated non-perturbatively on the
lattice and in perturbation theory.
For practical reasons, it is convenient to consider gauge conditions that may be
recast as the problem of finding the extrema of a functional. This is the reason
why the Landau gauge, i.e. the strict covariant gauge $\partial_\mu A^\mu = 0$, is most often
employed in these lattice studies. In the continuum theory, this condition is satisfied
at the extrema of the following functional:
\begin{equation}
F_{\rm Landau}[A,\Omega] \equiv \frac{1}{2} \int d^4x\; A^a_{\Omega\mu}(x)\, A^{\mu a}_\Omega(x) ,
\tag{13.41}
\end{equation}
where $A^\mu_\Omega$ is the gauge transform of the field configuration $A^\mu$ we aim at bringing to
Landau gauge. Indeed, if we apply an infinitesimal gauge transformation to the field
$A^\mu_\Omega$, the corresponding variation of $F_{\rm Landau}[A,\Omega]$ is
\begin{align}
\delta F_{\rm Landau}[A,\Omega]
&= -\int d^4x\; \big( D^{ab}_{\Omega\mu}\, \theta^b \big)\, A^{\mu a}_\Omega \notag\\
&= -\int d^4x\; \big( \partial_\mu \theta^a - g f^{cab} A^c_{\Omega\mu}\, \theta^b \big)\, A^{\mu a}_\Omega \notag\\
&= \int d^4x\; \theta^a\, \partial_\mu A^{\mu a}_\Omega .
\tag{13.42}
\end{align}

Therefore, if $A^\mu_\Omega$ realizes an extremum of the functional, then this variation must be
zero for all possible $\theta^a(x)$, which means that $A^\mu_\Omega$ obeys the Landau gauge condition.
The discrete analogue of the functional defined in eq. (13.41) reads
\begin{equation}
F_{\rm Landau}[U,\Omega] \equiv -2 a^2 \sum_x \sum_\mu \mathrm{Re}\, \mathrm{tr}\, \Big( \Omega(x)\, U_\mu(x)\, \Omega^\dagger(x+\hat\mu) \Big) .
\tag{13.43}
\end{equation}

Finding extrema of such a functional is a rather straightforward task, for instance with
the steepest descent algorithm.
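The following sketch shows what such a minimization can look like in the simplest setting. It assumes a compact U(1) gauge group on a two-dimensional periodic lattice, and it replaces steepest descent by a site-by-site relaxation (each site is given the local phase that minimizes the functional with all other sites held fixed), which is easier to write down. The function names and parameters are hypothetical. The gauge transformation is applied directly to the links, $U_\mu(x) \to \Omega(x)\, U_\mu(x)\, \Omega^*(x+\hat\mu)$.

```python
import cmath
import math
import random


def random_links(L, rng):
    """Random U(1) links u[mu][x][y] on an L x L periodic lattice."""
    return [[[cmath.rect(1.0, rng.uniform(-math.pi, math.pi)) for _ in range(L)]
             for _ in range(L)] for _ in range(2)]


def landau_functional(u, a=1.0):
    """Eq. (13.43) for U(1), evaluated on the already transformed links."""
    return -2 * a ** 2 * sum(link.real for mu in u for row in mu for link in row)


def plaquette(u, x, y):
    """Gauge invariant product of the four links around a plaquette."""
    L = len(u[0])
    return (u[0][x][y] * u[1][(x + 1) % L][y]
            * u[0][x][(y + 1) % L].conjugate() * u[1][x][y].conjugate())


def relax_sweep(u):
    """One sweep of local gauge transformations that lower the functional."""
    L = len(u[0])
    for x in range(L):
        for y in range(L):
            xm, ym = (x - 1) % L, (y - 1) % L
            # Sum of the four links attached to the site, oriented away from it.
            k = (u[0][x][y] + u[1][x][y]
                 + u[0][xm][y].conjugate() + u[1][x][ym].conjugate())
            if abs(k) == 0.0:
                continue
            omega = k.conjugate() / abs(k)  # phase maximizing Re(omega * k)
            u[0][x][y] *= omega
            u[1][x][y] *= omega
            u[0][xm][y] *= omega.conjugate()
            u[1][x][ym] *= omega.conjugate()


if __name__ == "__main__":
    links = random_links(6, random.Random(0))
    history = [landau_functional(links)]
    for _ in range(50):
        relax_sweep(links)
        history.append(landau_functional(links))
    print(history[0], history[-1])
```

Each local update can only lower the functional, so the sequence of values decreases monotonically, while gauge invariant quantities such as the plaquette are left unchanged. Nothing in this procedure selects a particular extremum: which one is reached depends on the starting configuration.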
Figure 13.10: Gauge orbit that intersects multiple times the gauge fixing manifold $G(A^\mu) = 0$.

Due to the existence of Gribov copies, the gauge fixed field configuration is not
defined uniquely by the gauge condition, which implies that this functional has more
than one extremum corresponding to the various solutions of $\partial_\mu A^\mu_\Omega = 0$ along the
same gauge orbit (see figure 13.10). A natural criterion to decide which extrema
to take into account is to try and reproduce the perturbative Faddeev-Popov procedure.
Let us recall here the starting point, which amounts to inserting under the path integral
the left hand side of the following equation
\begin{equation}
\int \big[ D\Omega(x) \big]\; \delta\big[ G^a(A^\mu_\Omega) \big]\; \det\Big( \frac{\delta G^a}{\delta\Omega} \Big)_{G^a(A^\mu_\Omega) = 0} = 1 ,
\tag{13.44}
\end{equation}

where $G^a(A) = 0$ is the gauge fixing condition. However, two conditions must be
met for this integral to be really equal to one:

• $G^a(A_\Omega)$ has a unique zero along each gauge orbit,

• The determinant is positive at this zero.

However, it was shown by Gribov that the uniqueness condition is generically not
satisfied: the gauge condition has multiple solutions, called Gribov copies. When these
conditions are not satisfied, the inserted factor is not one, but instead
\begin{align}
Z_{\rm FP} &= \sum_{{\rm zeroes}\ i} \det\Big( \frac{\delta G^a}{\delta\Omega} \Big)_i\; \Big| \det\Big( \frac{\delta G^a}{\delta\Omega} \Big)_i \Big|^{-1} \notag\\
&= \sum_{{\rm zeroes}\ i} \underbrace{\mathrm{sign}\Big( \det\Big( \frac{\delta G^a}{\delta\Omega} \Big)_i \Big)}_{\equiv\; \mathrm{sign}(i)} .
\tag{13.45}
\end{align}

Thus, one may try to mimic the perturbative Faddeev-Popov procedure by the following
definition of a gauge fixed operator on the lattice
\begin{equation}
O[A]_{\rm Landau} \equiv \frac{\sum_{{\rm extrema}\ i} \mathrm{sign}(i)\; O[A_{\Omega_i}]}{\sum_{{\rm extrema}\ i} \mathrm{sign}(i)} ,
\tag{13.46}
\end{equation}
where the denominator follows from the requirement that gauge invariant operators
should remain unaffected by the gauge fixing. However, it was shown by Neuberger
that this definition is flawed, because the distribution of Gribov copies is such that
both the numerator and the denominator are exactly zero. In order to see this, consider
a gauge invariant observable O[U], and let us try to mimic closely the continuum
BRST quantization, by introducing ghosts $\chi$ and antighosts $\bar\chi$, the BRST variation B of
the antighost ($B \equiv Q_{\rm BRST}\, \bar\chi$, $Q_{\rm BRST}\, B = 0$) and a gauge fixing parameter ξ. By doing
so, the expectation value of the observable would read
\begin{align}
\langle O \rangle &= Z^{-1} \int \big[ D(U,\chi,\bar\chi,B) \big]\; O[U]\;
\exp\Big( -\frac{1}{g^2} S_W[U] - \frac{1}{2\xi} \sum_x BB + Q_{\rm BRST} \sum_x \bar\chi\, G \Big) , \notag\\
Z &\equiv \int \big[ D(U,\chi,\bar\chi,B) \big]\;
\exp\Big( -\frac{1}{g^2} S_W[U] - \frac{1}{2\xi} \sum_x BB + Q_{\rm BRST} \sum_x \bar\chi\, G \Big) .
\tag{13.47}
\end{align}
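(Here, we are assuming a completely generic gauge fixing function G(U).) Note
that with a compact gauge group and a finite lattice, all the integrals involved in this
formula are finite. Consider now the following quantity:
\begin{equation}
F_O(t) \equiv \int \big[ D(U,\chi,\bar\chi,B) \big]\; O[U]\;
\exp\Big( -\frac{1}{g^2} S_W[U] - \frac{1}{2\xi} \sum_x BB + t\, Q_{\rm BRST} \sum_x \bar\chi\, G \Big) .
\tag{13.48}
\end{equation}

The numerator in the gauge fixed definition of $\langle O \rangle$ is nothing but $F_O(1)$. The
derivative of this function is given by
\begin{align}
\frac{dF_O}{dt}
&= \int \big[ D(U,\chi,\bar\chi,B) \big]\; \Big[ Q_{\rm BRST} \sum_x \bar\chi\, G \Big]\;
\underbrace{O[U]\, \exp\Big( -\frac{1}{g^2} S_W[U] - \frac{1}{2\xi} \sum_x BB + t\, Q_{\rm BRST} \sum_x \bar\chi\, G \Big)}_{\text{BRST invariant}} \notag\\
&= \int \big[ D(U,\chi,\bar\chi,B) \big]\; Q_{\rm BRST} \Big\{ \Big[ \sum_x \bar\chi\, G \Big]\, O[U]\,
\exp\Big( -\frac{1}{g^2} S_W[U] - \frac{1}{2\xi} \sum_x BB + t\, Q_{\rm BRST} \sum_x \bar\chi\, G \Big) \Big\} \notag\\
&= 0 .
\tag{13.49}
\end{align}

In the last equality, we have used the fact that the integral of a total BRST variation is
zero. Thus, we have
\begin{equation}
F_O(1) = F_O(0) = \int \big[ D(U,\chi,\bar\chi,B) \big]\; O[U]\;
\exp\Big( -\frac{1}{g^2} S_W[U] - \frac{1}{2\xi} \sum_x BB \Big) = 0 .
\tag{13.50}
\end{equation}

This time, the zero follows from the fact that the integrand does not depend on $\chi$ or $\bar\chi$,
hence the integrals over the ghost and antighost are equal to zero. The same reasoning
applies to the denominator Z in the gauge-fixed definition of $\langle O \rangle$, hence we have an
undefined ratio$^4$:
\begin{equation}
\langle O \rangle_{\text{gauge fixed}} = \frac{0}{0} .
\tag{13.51}
\end{equation}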
If we interpret this result in the light of eq. (13.46), we see that these zeroes result
from an even number of Gribov copies with alternating signs for the determinant of
the Faddeev-Popov operator. One may view this issue as a fundamental obstruction for
a non-perturbative definition of gauge fixing by the Faddeev-Popov procedure. Because
of this problem, the practical lattice definition of the Landau gauge fixing is simply to
pick one of the extrema of the functional (13.43), without any special selection rule.
One should be aware of this procedure when comparing with perturbative results,
since it is a priori not guaranteed that the solutions of the gauge condition used in the
perturbative and in the non-perturbative calculations are the same.

4 The same conclusion holds if the operator O is not gauge invariant, but simply BRST invariant.

13.6 Lattice worldline formalism

13.6.1 Discrete analogue of the heat kernel

In chapter 12, we presented the worldline representation for a quantum field
theory in a continuous spacetime. However, a similar representation is also possible
for propagators in a field theory defined on a discrete spacetime. For the sake of
simplicity, let us first consider a free scalar field theory, defined on a cubic lattice
instead of a continuous spacetime. As we have seen earlier in this chapter, the second
derivatives $\partial_\mu \partial^\mu$ that appear in the inverse propagator are replaced by centered finite
differences, and with a Euclidean metric we have
\begin{equation}
\big( \square + m^2 \big)\, \phi(x) = m^2 \phi(x) + \sum_\mu \frac{2\phi(x) - \phi(x+\hat\mu) - \phi(x-\hat\mu)}{a^2} .
\tag{13.52}
\end{equation}

We could in principle introduce a continuous fictitious time T as in eq. (12.3) in order
to write a heat-kernel representation of this finite difference operator. But it turns
out to be more convenient to introduce a discrete variable here as well, based on the
identity
\begin{equation}
A^{-1} = \sum_{n=0}^{\infty} (1 - A)^n .
\tag{13.53}
\end{equation}

13.6.2 Random walks on a cubic lattice

This formula should be applied to a dimensionless operator, for instance $a^2 (\square + m^2)$
instead of $\square + m^2$ itself. Since the discrete operator in eq. (13.52) contains a term
$(m^2 + 2d a^{-2})\, \phi(x)$ that acts as the identity, it is in fact more convenient to choose
$A = (m^2 + 2d a^{-2})^{-1} (\square + m^2)$, in order to make $1 - A$ simpler. Therefore, we
may write
\begin{equation}
\Big[ \big( \square + m^2 \big)^{-1} \Big]_{\rm lattice} = \frac{1}{m^2 + 2d a^{-2}} \sum_{n=0}^{\infty} \Big( \frac{\mathbb{D}}{2d + m^2 a^2} \Big)^n ,
\tag{13.54}
\end{equation}
where $\mathbb{D}$ is the discrete operator defined as follows
\begin{equation}
\mathbb{D}\, \phi(x) \equiv \sum_\mu \Big( \phi(x+\hat\mu) + \phi(x-\hat\mu) \Big) .
\tag{13.55}
\end{equation}

(In d dimensions, there are 2d terms in the sum of the right hand side.) The operator
$\mathbb{D}/(2d + m^2 a^2)$ realizes one hop from a lattice site to one of its nearest neighbors,
with a probability $(2d + m^2 a^2)^{-1}$ for a jump in any given direction. Raised to the
power n, we get an operator that performs n successive jumps. The probability of a
given sequence of n hops is $(2d + m^2 a^2)^{-n}$, and there are $(2d)^n$ such sequences,
hence a total probability $(1 + m^2 a^2/2d)^{-n}$. This is equal to unity in a massless
theory, but suppressed exponentially at large n with a non-zero mass (this observation
is the discrete analogue of the fact that long worldlines are suppressed exponentially
by a mass in the continuous case).

The propagator evaluated between the sites x and x′ is proportional to the total
probability to connect these two sites by a sequence of jumps, regardless of its length
(because of the sum on n):
\begin{equation}
\Big[ \big( \square + m^2 \big)^{-1} \Big]_{xx'}^{\rm lattice} = \frac{1}{m^2 + 2d a^{-2}} \sum_{n=0}^{\infty} \frac{1}{(2d + m^2 a^2)^n} \sum_{\gamma \in P_n(x,x')} 1 ,
\tag{13.56}
\end{equation}
where $P_n(x,x')$ is the set of all paths of length n drawn on the edges of the lattice
that connect x to x′ (see figure 13.11). Therefore, the second sum merely counts
the number of such paths. This number has an upper bound of $(2d)^n$, which implies
trivially the convergence of the sum on n in the massive case.

Figure 13.11: Worldlines on a cubic lattice. The points x and x′ are materialized by the two little balls, and we have represented three different paths on the lattice connecting these two points.
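The hopping expansion of eqs. (13.54) and (13.56) is easy to test numerically on a small lattice. The sketch below assumes a periodic lattice with $L^d$ sites (a choice made here for convenience), builds the matrix of the hop operator of eq. (13.55), sums the series up to a finite order, and compares the result with a direct inversion of the operator of eq. (13.52). The function names and the truncation order `n_max` are hypothetical.

```python
import numpy as np


def hop_matrix(L, d=2):
    """Matrix of the nearest-neighbour hop operator of eq. (13.55), periodic lattice."""
    shape = (L,) * d
    sites = L ** d
    D = np.zeros((sites, sites))
    for s in range(sites):
        coords = np.unravel_index(s, shape)
        for mu in range(d):
            for step in (+1, -1):
                nb = list(coords)
                nb[mu] = (nb[mu] + step) % L
                D[s, np.ravel_multi_index(tuple(nb), shape)] += 1.0
    return D


def propagator_from_walks(L=4, d=2, m=1.0, a=1.0, n_max=200):
    """Truncated sum over n of eq. (13.54)."""
    hop = hop_matrix(L, d) / (2 * d + (m * a) ** 2)
    total = np.zeros_like(hop)
    term = np.eye(L ** d)  # hop^0
    for _ in range(n_max + 1):
        total += term       # entry (x, x') of hop^n is the weighted count of n-step paths
        term = term @ hop
    return total / (m ** 2 + 2 * d / a ** 2)


def propagator_direct(L=4, d=2, m=1.0, a=1.0):
    """Direct inverse of the lattice operator of eq. (13.52)."""
    op = (m ** 2 + 2 * d / a ** 2) * np.eye(L ** d) - hop_matrix(L, d) / a ** 2
    return np.linalg.inv(op)


if __name__ == "__main__":
    diff = propagator_from_walks() - propagator_direct()
    print("largest discrepancy:", np.max(np.abs(diff)))
```

With m = a = 1 in two dimensions, each additional hop costs a factor 4/5 in total, so a few hundred terms are more than enough for double precision. The convergence slows down as $m a \to 0$, in line with the remark above about the massless theory.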

13.6.3 Scalar electrodynamics

Let us now consider a complex scalar field in a background Abelian gauge field.
The D’Alembertian is replaced by the square of the covariant derivative. In order to
maintain an exact gauge symmetry in the discrete lattice formulation, the gauge field
is represented by link variables $U_\mu(x)$ defined on each edge of the lattice, and the
forward and backward discrete covariant derivatives read
\begin{align}
D^\mu_F \phi(x) &\equiv \frac{U^\dagger_\mu(x)\,\phi(x+\hat\mu) - \phi(x)}{a} , \notag\\
D^\mu_B \phi(x) &\equiv \frac{\phi(x) - U_\mu(x-\hat\mu)\,\phi(x-\hat\mu)}{a} .
\tag{13.57}
\end{align}

In order to evaluate the scalar propagator in this gauge background, we can reproduce
the derivation of the previous subsection, and the only change will arise in the
definition of the operator $\mathbb{D}$, whose action now reads
\begin{equation}
\mathbb{D}\, \phi(x) \equiv \sum_\mu \Big( U^\dagger_\mu(x)\,\phi(x+\hat\mu) + U_\mu(x-\hat\mu)\,\phi(x-\hat\mu) \Big) .
\tag{13.58}
\end{equation}

Note that the right hand side transforms like a scalar field at the point x under a gauge
transformation. The consequence of this modification is that the jumps performed by
the operator $\mathbb{D}$ are now weighted by U(1) phases corresponding to the link variables
that appear in the right hand side of eq. (13.58). The lattice worldline representation
of the dressed scalar propagator is
\begin{equation}
\Big[ \big( D^2 + m^2 \big)^{-1} \Big]_{xx'}^{\rm lattice} = \frac{1}{m^2 + 2d a^{-2}} \sum_{n=0}^{\infty} \frac{1}{(2d + m^2 a^2)^n} \sum_{\gamma \in P_n(x,x')} W_\gamma[A] ,
\tag{13.59}
\end{equation}
where $W_\gamma[A]$ is the product of all the phases collected along the path γ, which
is nothing but the Wilson line defined on this path. This expression transforms as
expected for a dressed scalar propagator, thanks to the properties of Wilson lines.
A representation similar to eq. (13.59) is also possible for the one-loop quantum
effective action in the gauge background, that can be obtained by first noting that
\begin{equation}
\ln(A) = \ln\big( 1 - (1 - A) \big) = -\sum_{n=1}^{\infty} \frac{1}{n}\, (1 - A)^n .
\tag{13.60}
\end{equation}

The derivation mimics that of the propagator, and we finally obtain
\begin{equation}
\Gamma[A]_{\rm lattice} = -\frac{a^4}{m^2 + 2d a^{-2}} \sum_x \sum_{n=1}^{\infty} \frac{1}{2n\, (2d + m^2 a^2)^{2n}} \sum_{\gamma \in P_{2n}(x,x)} W_\gamma[A] .
\tag{13.61}
\end{equation}

Note that now the paths γ involved in the sum are the closed paths starting and ending
at the point x, which is why only even values of the length are allowed.

13.6.4 Combinatorics of loop areas on a planar square lattice

In two dimensions, the representation of eq. (13.59) may be used in order to study the
properties of charged particles on an atomic lattice, under the influence of an external
electromagnetic field. In particular, when this field is purely magnetic and transverse
to the plane of the lattice (i.e. the field strength $F_{12}$ is non-zero), this model is related
to the quantum Hall effect.

This relationship may also be exploited in order to derive explicit formulas for the
moments of the distribution of areas of random closed loops on a square lattice. For
this application, it is interesting to consider a 2-dimensional anisotropic lattice, with
lattice spacings $a_{1,2}$ in the two directions. On this lattice, consider the propagator of
a massless scalar at equal points in the presence of a transverse magnetic field. It is
straightforward to generalize the previous derivation to the anisotropic case, and one
obtains the following expression for the propagator
\begin{equation}
G(x,y) = \frac{a^2}{4} \sum_{n_{1,2}=0}^{\infty} \Big( \frac{h_1}{4} \Big)^{n_1} \Big( \frac{h_2}{4} \Big)^{n_2} \sum_{\gamma \in P_{n_1,n_2}(x,y)} W_\gamma[A] ,
\tag{13.62}
\end{equation}
where $P_{n_1,n_2}(x,y)$ is the set of paths drawn on the edges of the lattice, that connect x
to y, and contain $n_1$ jumps in the first direction and $n_2$ jumps in the second direction.
In this formula, we have also defined
\begin{equation}
\frac{2}{a^2} \equiv \frac{1}{a_1^2} + \frac{1}{a_2^2} , \qquad h_{1,2} \equiv \frac{a^2}{a_{1,2}^2} .
\tag{13.63}
\end{equation}

If we specialize to a closed path x = y = 0 and to a transverse magnetic field B, the
paths γ in the right hand side are closed loops, and we have
\begin{equation}
W_{{\rm closed}\ \gamma}[A] = e^{i \Phi\, \mathrm{Area}(\gamma)} ,
\tag{13.64}
\end{equation}
where we denote $\Phi \equiv B a_1 a_2$ the magnetic flux through an elementary plaquette of
the lattice, and where Area(γ) is the area enclosed by the loop γ. Note that this is an
algebraic area, whose sign depends on the orientation of the loop, and that accounts
for the winding number of the loop. Expanding the exponential, we have
\begin{equation}
G(0,0) = \frac{a^2}{4} \sum_{n_{1,2}=0}^{\infty} \Big( \frac{h_1}{4} \Big)^{2n_1} \Big( \frac{h_2}{4} \Big)^{2n_2} \sum_{l=0}^{\infty} \frac{(i\Phi)^{2l}}{(2l)!} \sum_{\gamma \in P_{2n_1,2n_2}(0,0)} \big( \mathrm{Area}(\gamma) \big)^{2l} .
\tag{13.65}
\end{equation}

(Because the area is algebraic, the odd moments are all zero.) In this formula, we have
made explicit the fact that a closed path must have an even number of hops in each
direction. On the other hand, the propagator can be determined perturbatively order
by order in Φ, since this corresponds to a weak field expansion. This calculation
requires that one chooses a gauge$^5$. A convenient choice is provided by the following
link variables
\begin{equation}
U_1(x) \equiv 1 , \qquad U_2(x) \equiv e^{i \Phi\, i_1} ,
\tag{13.66}
\end{equation}
where $i_1$ is the integer that labels the first coordinate ($x \equiv \sum_\mu i_\mu\, \hat\mu$). By performing
the expansion of G(0, 0) and identifying the coefficient of the term of order
$\Phi^{2l}\, h_1^{2n_1} h_2^{2n_2}$ on both sides of eq. (13.65), one can show that
\begin{equation}
\sum_{\gamma \in P_{2n_1,2n_2}(0,0)} \big( \mathrm{Area}(\gamma) \big)^{2l} = \frac{(2(n_1+n_2))!}{n_1!^2\, n_2!^2}\; P_{2l}(n_1,n_2) ,
\tag{13.67}
\end{equation}
where $P_{2l}$ is a symmetric polynomial in $n_1, n_2$ of degree 2l. Note that the combinatorial
factor $(2(n_1+n_2))!/(n_1!^2\, n_2!^2)$ is the number of loops in $P_{2n_1,2n_2}(0,0)$.
This expansion also provides a semi-explicit form of the polynomial $P_{2l}$, and the
evaluation of the first two terms gives
\begin{align}
P_2(n_1,n_2) &= \frac{n_1 n_2}{3} , \notag\\
P_4(n_1,n_2) &= \frac{n_1 n_2 \big( 7 n_1 n_2 - (n_1+n_2) \big)}{15} .
\tag{13.68}
\end{align}

All these polynomials satisfy two simple identities:
\begin{equation}
P_{2l}(n_1,0) = P_{2l}(0,n_2) = 0 , \qquad P_{2l}(1,1) = \frac{1}{3} .
\tag{13.69}
\end{equation}
The first one is a consequence of the fact that if n1 or n2 is zero, then all the closed
paths one can construct have a vanishing area. The second one follows from the fact
that for n1 = n2 = 1, all the closed paths have area −1, 0 or +1, and therefore
contribute equally to all the even moments.
⁵ The product of the link variables along a closed loop does not depend on the choice of the gauge, as
can be seen from the following gauge transformation formula for the link variables:
$U_\mu(x) \to \Omega(x)\, U_\mu(x)\, \Omega^*(x+\hat\mu)$.

By summing the above results over all $n_1+n_2=n$, we obtain the moments of
the algebraic area of closed loops of length $2n$, with no restriction on the respective
number of hops in each direction. Quite generally, these moments can be written
as a prefactor $(2n)!^2/n!^4$ (which is the number of closed loops of length $2n$ on a
2-dimensional lattice) multiplied by a rational fraction in $n$ of degree $2l$. For instance,
the first two moments are

\[
\sum_{\gamma\in P_{2n}(0,0)} \big(\mathrm{Area}(\gamma)\big)^{2} = \frac{(2n)!^2}{n!^4}\; \frac{n^2 (n-1)}{6(2n-1)} ,
\]
\[
\sum_{\gamma\in P_{2n}(0,0)} \big(\mathrm{Area}(\gamma)\big)^{4} = \frac{(2n)!^2}{n!^4}\; \frac{n^3 (n-1)(7n^2-18n+13)}{60(2n-1)(2n-3)} . \tag{13.70}
\]
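The closed forms of eq. (13.70) can be checked by brute force for small $n$. The following is a minimal sketch, not part of the original derivation: it enumerates all closed paths of $2n$ hops on the square lattice (lattice spacing set to 1) and accumulates the second and fourth moments of the algebraic area, computed with the shoelace formula. The function names `brute_force_moments` and `closed_form_moments` are illustrative choices.

```python
from fractions import Fraction
from itertools import product
from math import factorial

# The four unit hops on the square lattice (lattice spacing set to 1).
HOPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def brute_force_moments(n):
    """Enumerate all closed paths of 2n hops that start and end at the origin.

    Returns (number of closed paths, sum of Area^2, sum of Area^4).
    """
    count = m2 = m4 = 0
    for path in product(HOPS, repeat=2 * n):
        x = y = twice_area = 0
        for dx, dy in path:
            # Shoelace formula: each hop adds x*dy - y*dx to twice the algebraic area.
            twice_area += x * dy - y * dx
            x += dx
            y += dy
        if x == 0 and y == 0:
            # twice_area is even for a closed lattice path, so this division is exact.
            area = twice_area // 2
            count += 1
            m2 += area ** 2
            m4 += area ** 4
    return count, m2, m4


def closed_form_moments(n):
    """Number of closed paths and the two moments of eq. (13.70), as exact fractions."""
    prefactor = Fraction(factorial(2 * n) ** 2, factorial(n) ** 4)
    m2 = prefactor * Fraction(n ** 2 * (n - 1), 6 * (2 * n - 1))
    m4 = prefactor * Fraction(
        n ** 3 * (n - 1) * (7 * n ** 2 - 18 * n + 13),
        60 * (2 * n - 1) * (2 * n - 3),
    )
    return prefactor, m2, m4


for n in (1, 2, 3):
    print(n, brute_force_moments(n), closed_form_moments(n))
```

The enumeration cost grows as $4^{2n}$, so this check is only practical for the first few values of $n$.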
Chapter 14

Quantum field theory at finite temperature

Historically, the main area of development and application of Quantum Field
Theory has been high energy particle physics. This corresponds to situations where
the background is the vacuum, only perturbed by the presence of a few excitations
whose interactions one aims at studying. Consequently, most of the QFT tools we
have encountered so far are adequate for calculating transition amplitudes between
pure states that contain only a few particles.
However, there are interesting physical problems that depart from this simple
situation. For instance, in the early universe, particles are believed to be in thermal
equilibrium and form a hot and dense plasma. The typical energy of a particle in this
thermal bath is of the order of the temperature¹, which implies that this surrounding
medium may have an influence on all processes whose energy scale is comparable
or lower. As a consequence, these problems contain some element of many body
physics that was not present in applications of QFT to scattering reactions.
Another class of problems where many body effects are important is condensed
matter physics. When studied at some sufficiently large distance scale, where the
atomic discreteness is no longer important, these problems may be described in terms
of (non relativistic) quantum fields where collective effects are usually important.

14.1 Canonical thermal ensemble


Usually, the system one would like to study is a small part of a much larger system
(this is quite obvious in the case of the early universe, but is also generally true in
condensed matter physics). Thus, its energy and other conserved quantities are not
fixed. Instead, they fluctuate due to exchanges with the surroundings, which play the
role of a thermal reservoir. The appropriate statistical ensemble for describing this
situation is the (grand) canonical ensemble, in which the system is described by the
following density operator

\[
\rho \equiv \exp(-\beta H) , \tag{14.1}
\]

where $\beta = T^{-1}$ is the inverse temperature and $H$ is the Hamiltonian. Given an
operator $O$, one is usually interested in calculating its expectation value in the above
statistical ensemble,

\[
\langle O \rangle \equiv \frac{\mathrm{Tr}\,(\rho\, O)}{\mathrm{Tr}\,(\rho)} . \tag{14.2}
\]

¹ In this chapter, we extend the natural system of units we have used so far to also set the Boltzmann
constant $k_B = 1$. Therefore, the temperature has the dimension of an energy.

Let us span the Hilbert space by states $|n\rangle$ that are eigenstates of $H$,

\[
H\, |n\rangle = E_n\, |n\rangle . \tag{14.3}
\]

In terms of these states, the trace of $\rho\, O$ can be represented as follows

\[
\mathrm{Tr}\,(\rho\, O) = \sum_n e^{-\beta E_n}\, \langle n | O | n \rangle . \tag{14.4}
\]

From this representation, it is easy to see that the zero temperature limit selects the
state of lowest energy, i.e. the ground state of the Hamiltonian. Assuming that this
state is non-degenerate, the normalized average of eq. (14.2) becomes a vacuum expectation value:

\[
\lim_{T\to 0} \frac{\mathrm{Tr}\,(\rho\, O)}{\mathrm{Tr}\,(\rho)} = \langle 0 | O | 0 \rangle . \tag{14.5}
\]

In this sense, eq. (14.2) should be viewed as an extension of the formalism we
already know, rather than something entirely different. In this chapter, we discuss
various aspects of these thermal averages, starting with the necessary extensions to
the formalism in order to perform their perturbative calculation.

14.2 Finite-T perturbation theory

14.2.1 Naive approach

The extension of ordinary perturbation theory to calculate expectation values such as
(14.2) is usually called Quantum Field Theory at finite temperature. A first approach
for evaluating such an expectation value could be to use the representation of the trace
provided by eq. (14.4), and a similar formula for the denominator, which would fall
back to the perturbative rules we already know (since the temperature and chemical
potential appear only in the form of numerical prefactors). Note however a peculiarity
of the matrix elements that appear in eq. (14.4): $\langle n|$ and $|n\rangle$ are identical states
since they come from a trace (they are both in-states, since $\rho$ defines the initial state
of the system). This is a bit different from the transition amplitudes that enter in
scattering cross-sections, where the matrix elements are evaluated between an in-state
and an out-state. The perturbative rules to compute these in-in expectation values are
provided by the Schwinger-Keldysh formalism introduced in section 1.16.5.
A difficulty with this naive approach is that the number of states that contribute
significantly to the sum in eq. (14.4) is large at high temperature, especially when
the temperature is large compared to the masses of the fields (and even more so with
massless particles like photons). In fact, it is possible to encapsulate the sum over the
eigenstates $|n\rangle$ and the canonical weight $\exp(-\beta E_n)$ of these states directly into the
Schwinger-Keldysh rules, by a modest modification of their propagators.

14.2.2 Thermal time contour

To mimic closely the derivation of the Feynman rules at zero temperature, let us
consider an observable made of the time-ordered product of elementary fields:

\[
O \equiv T\, \phi(x_1) \cdots \phi(x_n) . \tag{14.6}
\]

Each Heisenberg representation field $\phi$ can be related to the corresponding field in
the interaction representation by

\[
\phi(x) = U(t_i, x^0)\, \phi_{\rm in}(x)\, U(x^0, t_i) , \tag{14.7}
\]

where $t_i$ is the time at which the system is prepared in equilibrium, and $U$ is the time
evolution operator defined by:

\[
U(t_2, t_1) \equiv T \exp\, i \int_{t_1}^{t_2} dx^0\, d^3\mathbf{x}\; \mathcal{L}_I(\phi_{\rm in}(x)) , \tag{14.8}
\]

with $\mathcal{L}_I$ the interaction term in the Lagrangian. Thanks to eq. (14.7), we remove all
the interactions from the field, and relegate them into the evolution operator where
they can easily be Taylor expanded.
In the canonical ensemble at non-zero temperature, there is another source of
dependence on the interactions, hidden in the Hamiltonian inside the density operator.
Indeed, for the system to be in statistical equilibrium, the canonical density operator
should be defined with the same Hamiltonian as the one that drives the time evolution,

i.e. a Hamiltonian that also contains the interactions of the system². If we decompose
the full Hamiltonian as $H \equiv H_0 + H_I$, we have

\[
e^{-\beta H} = e^{-\beta H_0}\; \underbrace{T \exp\, i \int_{t_i}^{t_i - i\beta} dx^0\, d^3\mathbf{x}\; \mathcal{L}_I(\phi_{\rm in}(x))}_{U(t_i - i\beta,\, t_i)} . \tag{14.9}
\]

(This formula in fact does not depend on $t_i$.) It can be proven by noticing that the right
and left hand sides are equal for $\beta = 0$, and by checking that their derivatives with
respect to $\beta$ are also equal (for this, we use the fact that the derivative of the time
evolution operator with respect to its final time is known).
From the previous formulas, we can write

\[
e^{-\beta H}\, T\, \phi(x_1) \cdots \phi(x_n) = e^{-\beta H_0}\, P\, \phi_{\rm in}(x_1) \cdots \phi_{\rm in}(x_n)\, \exp\, i \int_C dx^0\, d^3\mathbf{x}\; \mathcal{L}_I(\phi_{\rm in}(x)) , \tag{14.10}
\]

where the symbol $P$ indicates a path ordering, and where the time integration contour
is $C = [t_i, +\infty] \cup [+\infty, t_i] \cup [t_i, t_i - i\beta]$:

[Figure (14.11): the contour C. It runs from t_i to +∞ along the real axis, comes back to t_i, and then goes down vertically to t_i − iβ.]

In this contour, $t_i$ is the time at which the system is prepared in thermal equilibrium.
As we shall see shortly, all observables are independent of this time, which physically
means that a system in equilibrium has no memory of when it was put in equilibrium.
Note also that in eq. (14.10), the times $x_1^0, \cdots, x_n^0$ are on the upper branch of the
contour (but this constraint will be relaxed shortly).

14.2.3 Generating functional

The time contour (14.11) is very similar to the contour of figure 1.4, with the
addition of a vertical part that captures the interactions hidden in the density operator.
Since we had to extend the real time axis into the contour C, it is natural to also extend
the observable of eq. (14.6) to allow the field operators to be located anywhere
on C, with a path ordering instead of a time ordering,

\[
O \equiv P\, \phi(x_1) \cdots \phi(x_n) . \tag{14.12}
\]

² An alternative point of view is to decide that $\rho$ is the density operator of the system at $x^0 = -\infty$.
There, we may turn off adiabatically the interactions, and therefore use only the free Hamiltonian inside $\rho$.
In this section, we derive the formalism for an initial equilibrium state specified at a finite time $x^0 = t_i$.

The expectation values of these operators can be encapsulated in the following
generating functional,

\[
Z[j] \equiv \frac{\mathrm{Tr}\,\Big(\rho\; P \exp\, i \int_C d^4x\; j(x)\, \phi(x)\Big)}{\mathrm{Tr}\,(\rho)} , \tag{14.13}
\]

where the fictitious source $j(x)$ also lives on the contour C. In order to bring this
generating functional to a useful form, we can follow very closely the derivation of
section 1.6.2, by first pulling out a factor that contains the interactions, and by
rearranging the ordering of the free factor with two successive applications of the
Baker-Campbell-Hausdorff formula. This leads to

\[
Z[j] = \exp\Big( i \int_C d^4x\; \mathcal{L}_I\Big(\frac{\delta}{i\,\delta j(x)}\Big) \Big)\; \exp\Big( -\frac{1}{2} \int_C d^4x\, d^4y\; j(x)\, j(y)\, G^0(x,y) \Big) , \tag{14.14}
\]

where the free propagator $G^0(x,y)$, defined on the contour C, is given by

\[
G^0(x,y) \equiv \frac{\mathrm{Tr}\,\big(e^{-\beta H_0}\, P\, \phi_{\rm in}(x)\, \phi_{\rm in}(y)\big)}{\mathrm{Tr}\,\big(e^{-\beta H_0}\big)} . \tag{14.15}
\]

14.2.4 Expression of the free propagator


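In order to calculate the free propagator, we need the free Hamiltonian expressed in
terms of creation and annihilation operators³,

\[
H_0 = \int \frac{d^3\mathbf{p}}{(2\pi)^3\, 2E_p}\; E_p\; a^\dagger_{\mathbf{p},{\rm in}}\, a_{\mathbf{p},{\rm in}} , \tag{14.16}
\]

and the canonical commutation relation of the latter:

\[
\big[ a_{\mathbf{p},{\rm in}} , a^\dagger_{\mathbf{p}',{\rm in}} \big] = (2\pi)^3\, 2E_p\; \delta(\mathbf{p} - \mathbf{p}') . \tag{14.17}
\]

From this, we get

\[
\big[ e^{-\beta H_0} , a_{\mathbf{p},{\rm in}} \big] = e^{-\beta H_0} \big(1 - e^{-\beta E_p}\big)\, a_{\mathbf{p},{\rm in}} ,
\]
\[
\mathrm{Tr}\,\big(e^{-\beta H_0}\, a_{\mathbf{p},{\rm in}}\big) = 0 ,
\]
\[
\frac{\mathrm{Tr}\,\big(e^{-\beta H_0}\, a^\dagger_{\mathbf{p},{\rm in}}\, a_{\mathbf{p}',{\rm in}}\big)}{\mathrm{Tr}\,\big(e^{-\beta H_0}\big)} = (2\pi)^3\, 2E_p\; n_B(E_p)\; \delta(\mathbf{p} - \mathbf{p}') , \tag{14.18}
\]

where $n_B(E)$ is the Bose-Einstein distribution:

\[
n_B(E) \equiv \frac{1}{e^{\beta E} - 1} . \tag{14.19}
\]

³ We can drop the zero point energy here. It would simply multiply the density operator by a constant
factor, that would be canceled since all expectation values are normalized by the factor $1/\mathrm{Tr}\,(\rho)$.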
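As a quick illustration of how the Bose-Einstein distribution emerges from the canonical trace, here is a minimal sketch for a single mode. It assumes a discrete mode with levels $E_n = nE$ (zero point energy dropped) and unit-normalized operators, so the continuum factor $(2\pi)^3\, 2E_p\, \delta(\mathbf{p}-\mathbf{p}')$ of eq. (14.18) does not appear. The function names and the truncation `n_max` are illustrative choices.

```python
import math


def bose_einstein(energy, beta):
    """n_B(E) = 1 / (exp(beta * E) - 1), eq. (14.19)."""
    return 1.0 / math.expm1(beta * energy)


def thermal_occupation(energy, beta, n_max=2000):
    """<a^dagger a> for one mode, from the trace over Fock states |n> truncated at n_max.

    The state |n> has energy n * energy and canonical weight exp(-beta * n * energy).
    """
    weights = [math.exp(-beta * n * energy) for n in range(n_max + 1)]
    numerator = sum(n * w for n, w in enumerate(weights))
    return numerator / sum(weights)


# The truncated trace agrees with the closed form when beta * energy is not too small.
print(thermal_occupation(0.7, 2.0), bose_einstein(0.7, 2.0))
# At low temperature (large beta) the occupation vanishes: only the ground state survives.
print(thermal_occupation(1.0, 50.0))
```

The truncation is harmless as long as $\exp(-\beta E\, n_{\max})$ is negligible; for very high temperatures `n_max` would have to be increased.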
This leads to the following formula for the free propagator:

\[
G^0(x,y) = \int \frac{d^3\mathbf{p}}{(2\pi)^3\, 2E_p} \Big[ \big(\theta_c(x^0 - y^0) + n_B(E_p)\big)\, e^{-ip\cdot(x-y)} + \big(\theta_c(y^0 - x^0) + n_B(E_p)\big)\, e^{+ip\cdot(x-y)} \Big] , \tag{14.20}
\]

where $\theta_c$ generalizes the step function to the contour C (i.e. $\theta_c(x^0 - y^0)$ is non-zero
if $x^0$ is posterior to $y^0$ according to the contour ordering). This expression of the
propagator generalizes to a non-zero temperature the formula (1.119) (the Bose-Einstein
distribution goes to zero when $T \to 0$). Let us postpone a bit the calculation
of the propagator in momentum space. For now, we just note the following rules for
the perturbative expansion in coordinate space:

1. Draw all the graphs (with vertices corresponding to the interactions of the
theory under consideration) that connect the n points of the observable. Graphs
containing disconnected subgraphs should be ignored. Each graph should be
weighted by its symmetry factor.

2. Each line of a graph brings a free propagator G0 (x, y).

3. Each vertex brings a factor −iλ. The space-time coordinate of this vertex is
integrated out, but the time integration runs over the contour C.

Thus, the only differences with the zero temperature Feynman rules are the explicit
form of the free propagator, and the fact that the time integrations are over the contour C
instead of the real axis.

14.2.5 Kubo-Martin-Schwinger symmetry

The canonical density operator $\exp(-\beta H)$ can be viewed as an evolution operator for
an imaginary time shift, which implies the following formal identity

\[
e^{-\beta H}\, \phi(x^0 - i\beta, \mathbf{x})\, e^{\beta H} = \phi(x^0, \mathbf{x}) . \tag{14.21}
\]

Let us now consider the following correlator

\[
G(t_i, \cdots) \equiv \mathrm{Tr}\,\big( e^{-\beta H}\, P\, \phi(t_i, \mathbf{x}) \cdots \big) , \tag{14.22}
\]

that contains a field whose time argument is the initial time $t_i$ (the other fields it
contains need not be specified in this discussion). Since $t_i$ is the “smallest” time on
the contour C, the field operator that carries it is placed to the rightmost position by
the path ordering. Thus, we have

\[
G(t_i, \cdots) = \mathrm{Tr}\,\big( e^{-\beta H}\, \big[ P \cdots \big]\, \phi(t_i, \mathbf{x}) \big) , \tag{14.23}
\]

where the path ordering now applies only to the remaining (unwritten) fields. Using
the cyclic invariance of the trace and eq. (14.21), we then get

\[
G(t_i, \cdots) = \mathrm{Tr}\,\big( e^{-\beta H}\, \phi(t_i - i\beta, \mathbf{x})\, \big[ P \cdots \big] \big)
= \mathrm{Tr}\,\big( e^{-\beta H}\, P\, \phi(t_i - i\beta, \mathbf{x}) \cdots \big)
= G(t_i - i\beta, \cdots) , \tag{14.24}
\]

where in the second line we have used the fact that ti − iβ is the “latest” time on
the contour C in order to put back the operator carrying it inside the path ordering.
This equality is one of the forms of the Kubo-Martin-Schwinger (KMS) symmetry:
all bosonic path-ordered correlators take identical values at the two endpoints of the
contour C. Note that, although we have singled out the first field in the correlator, this
identity applies equally to all the fields.
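The KMS identity can be checked directly on the free propagator. The sketch below is an illustration under simplifying assumptions: it keeps a single momentum mode of the integrand of eq. (14.20) at zero spatial separation, and the contour ordering $\theta_c$ is passed by hand as a boolean. The names `mode_propagator` and `x_is_later` are hypothetical helpers, not notation from the text.

```python
import cmath
import math


def bose_einstein(energy, beta):
    """n_B(E) = 1 / (exp(beta * E) - 1)."""
    return 1.0 / math.expm1(beta * energy)


def mode_propagator(x0, y0, x_is_later, energy, beta):
    """Single-mode integrand of eq. (14.20), at zero spatial separation.

    x0 and y0 may be complex times on the contour C; x_is_later plays the role
    of theta_c(x0 - y0), i.e. it is True if x0 comes after y0 along the contour.
    """
    n = bose_einstein(energy, beta)
    theta_xy = 1.0 if x_is_later else 0.0
    theta_yx = 1.0 - theta_xy
    phase = cmath.exp(-1j * energy * (x0 - y0))
    return (theta_xy + n) * phase + (theta_yx + n) / phase


beta, energy = 1.7, 0.9
t_i, y0 = -3.0, 0.4  # y0 sits somewhere on the upper branch of the contour

at_start = mode_propagator(t_i, y0, False, energy, beta)           # t_i is the earliest time
at_end = mode_propagator(t_i - 1j * beta, y0, True, energy, beta)  # t_i - i*beta is the latest
print(at_start, at_end)
```

The two values coincide because $(1+n_B)\, e^{-\beta E} = n_B$ and $n_B\, e^{\beta E} = 1 + n_B$, which is specific to the equilibrium distribution.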
The KMS symmetry is very closely tied to the fact that the system is in thermal
equilibrium, since it is satisfied only when the density operator is the canonical
equilibrium one. One of its consequences is that all the equilibrium correlation
functions are independent of the initial time $t_i$. In order to prove this assertion, let us
first note that the free propagator satisfies the KMS symmetry, and does not contain $t_i$
explicitly. A generic Feynman graph leads to time integrations that have the following
structure:

\[
G(x_1, \cdots, x_n) = \int_C dy_1^0 \cdots dy_p^0\; F(y_1^0, \cdots, y_p^0 \,|\, x_1, \cdots, x_n) . \tag{14.25}
\]

(We assume that the integrals over the positions at every vertex have already been
performed.) Since the free propagator does not depend on $t_i$, the derivative of the
integral with respect to $t_i$ comes only from the endpoints of the integration contour,
and we can write

\[
\frac{\partial G(x_1, \cdots, x_n)}{\partial t_i} = \sum_{i=1}^{p} \int_C \prod_{j \neq i} dy_j^0\; \Big[ F(\cdots, y_i^0 = t_i, \cdots \,|\, x_1, \cdots, x_n) - F(\cdots, y_i^0 = t_i - i\beta, \cdots \,|\, x_1, \cdots, x_n) \Big] = 0 . \tag{14.26}
\]

The vanishing result follows from the fact that the bracket in the integrand is zero,
since it is built from objects that obey the KMS symmetry. The independence with
respect to $t_i$ merely reflects the fact that, in a system in thermal equilibrium, no
measurement can tell at what time the system was prepared in equilibrium. From the
analyticity properties of the integrand, the result of the integrations in eq. (14.25) is in
fact invariant under all the deformations of the contour C that preserve the spacing
$-i\beta$ between its endpoints.

14.2.6 Conserved charges


Until now, we have considered only the simplest case of a boson field coupled to a
thermal bath. Although energy is conserved, the system under consideration may
exchange energy with the environment, which translates into the canonical density
operator exp(−β H).
Let us now consider a Hermitian operator $Q$ that commutes with the Hamiltonian,
i.e. that corresponds to a conserved quantity. A field $\phi$ is said to carry a charge $q$ if it
obeys the following commutation relation:

\[
\big[ Q, \phi(x) \big] = -q\, \phi(x) . \tag{14.27}
\]

Note that if $\phi$ is a real field, then $q = -q^*$. Therefore, in order to have a non-zero
real valued charge, the field should be complex.
When there are additional conserved quantities such as $Q$, their conservation
constrains in a similar fashion how they may be exchanged with the heat bath. The
canonical equilibrium ensemble must be generalized into the grand canonical ensemble,
in which the density operator of the subsystem is given by

\[
\rho \equiv \exp\big( -\beta\, (H + \mu Q) \big) , \tag{14.28}
\]

where $\mu$ is the chemical potential associated with the charge $Q$. Although we have
introduced a single such charge, there could be any number of them, each accompanied
by its chemical potential. A first consequence of this generalization is that the KMS
symmetry is modified by the conserved charge. Now it reads:

\[
G(t_i, \cdots) = e^{\beta\mu q}\; G(t_i - i\beta, \cdots) , \tag{14.29}
\]

where $q$ is the charge carried by the field on which the identity applies. Thus, the
values of correlation functions at the endpoints are equal up to a twist factor that
depends on the chemical potential.

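The simplest field that can carry a non trivial charge is a complex scalar field. In
the interaction picture, it can be decomposed as follows on a basis of creation and
annihilation operators:

\[
\phi_{\rm in}(x) = \int \frac{d^3\mathbf{p}}{(2\pi)^3\, 2E_p} \Big[ a_{\mathbf{p},{\rm in}}\, e^{-ip\cdot x} + b^\dagger_{\mathbf{p},{\rm in}}\, e^{+ip\cdot x} \Big] .
\]

(This field requires two sets $\{a_{\mathbf{p},{\rm in}}, b_{\mathbf{p},{\rm in}}\}$ of such operators, because it describes
a particle which is distinct from its anti-particle.) With this field, it is possible to
construct a theory that has a global U(1) symmetry, corresponding to the conservation
of the following charge

\[
Q \equiv \int \frac{d^3\mathbf{p}}{(2\pi)^3\, 2E_p} \Big\{ b^\dagger_{\mathbf{p},{\rm in}}\, b_{\mathbf{p},{\rm in}} - a^\dagger_{\mathbf{p},{\rm in}}\, a_{\mathbf{p},{\rm in}} \Big\} . \tag{14.30}
\]

It is then easy to obtain the following grand-canonical averages (normalized by the
trace of the density operator, as in eq. (14.18)):

\[
\frac{\mathrm{Tr}\,\big( e^{-\beta(H_0 + \mu Q)}\, a^\dagger_{\mathbf{p},{\rm in}}\, a_{\mathbf{p}',{\rm in}} \big)}{\mathrm{Tr}\,\big( e^{-\beta(H_0 + \mu Q)} \big)} = (2\pi)^3\, 2E_p\; n_B(E_p - \mu q)\; \delta(\mathbf{p} - \mathbf{p}') ,
\]
\[
\frac{\mathrm{Tr}\,\big( e^{-\beta(H_0 + \mu Q)}\, b^\dagger_{\mathbf{p},{\rm in}}\, b_{\mathbf{p}',{\rm in}} \big)}{\mathrm{Tr}\,\big( e^{-\beta(H_0 + \mu Q)} \big)} = (2\pi)^3\, 2E_p\; n_B(E_p + \mu q)\; \delta(\mathbf{p} - \mathbf{p}') , \tag{14.31}
\]

and finally obtain the free propagator for a complex scalar carrying the charge $q$:

\[
G^0(x,y) = \int \frac{d^3\mathbf{p}}{(2\pi)^3\, 2E_p} \Big[ \big(\theta_c(x^0 - y^0) + n_B(E_p - \mu q)\big)\, e^{-ip\cdot(x-y)} + \big(\theta_c(y^0 - x^0) + n_B(E_p + \mu q)\big)\, e^{+ip\cdot(x-y)} \Big] . \tag{14.32}
\]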

14.2.7 Fermions

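Consider now spin 1/2 fermions, whose interaction picture representation reads

\[
\psi_{\rm in}(x) = \sum_{s=\pm} \int \frac{d^3\mathbf{p}}{(2\pi)^3\, 2E_p} \Big[ a^\dagger_{s\mathbf{p},{\rm in}}\, v_s(\mathbf{p})\, e^{+ip\cdot x} + b_{s\mathbf{p},{\rm in}}\, u_s(\mathbf{p})\, e^{-ip\cdot x} \Big] , \tag{14.33}
\]

where the creation and annihilation operators obey canonical anticommutation relations
(see eqs. (1.215)). Because they are anticommuting fields, a minus sign appears
in the derivation of the KMS identity:

\[
G(t_i, \cdots) = -e^{\beta\mu q}\; G(t_i - i\beta, \cdots) . \tag{14.34}
\]

Moreover, the eqs. (14.31) are modified into

\[
\frac{\mathrm{Tr}\,\big( e^{-\beta(H_0 + \mu Q)}\, a^\dagger_{\mathbf{p},{\rm in}}\, a_{\mathbf{p}',{\rm in}} \big)}{\mathrm{Tr}\,\big( e^{-\beta(H_0 + \mu Q)} \big)} = (2\pi)^3\, 2E_p\; n_F(E_p - \mu q)\; \delta(\mathbf{p} - \mathbf{p}') ,
\]
\[
\frac{\mathrm{Tr}\,\big( e^{-\beta(H_0 + \mu Q)}\, b^\dagger_{\mathbf{p},{\rm in}}\, b_{\mathbf{p}',{\rm in}} \big)}{\mathrm{Tr}\,\big( e^{-\beta(H_0 + \mu Q)} \big)} = (2\pi)^3\, 2E_p\; n_F(E_p + \mu q)\; \delta(\mathbf{p} - \mathbf{p}') , \tag{14.35}
\]

where $n_F(E)$ is the Fermi-Dirac distribution,

\[
n_F(E) \equiv \frac{1}{e^{\beta E} + 1} , \tag{14.36}
\]

and the free propagator reads

\[
S^0(x,y) = \int \frac{d^3\mathbf{p}}{(2\pi)^3\, 2E_p} \Big[ (\slashed{p}_+ + m) \big(\theta_c(x^0 - y^0) - n_F(E_p - \mu q)\big)\, e^{-ip\cdot(x-y)} + (\slashed{p}_- + m) \big(\theta_c(y^0 - x^0) - n_F(E_p + \mu q)\big)\, e^{+ip\cdot(x-y)} \Big] , \tag{14.37}
\]

with the notation $\slashed{p}_\pm \equiv \pm E_p\, \gamma^0 - \mathbf{p}\cdot\boldsymbol{\gamma}$.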

14.2.8 Examples of physical observables

Thermodynamical quantities : A central quantity that encapsulates the thermodynamical
properties of a system is its partition function, defined in the canonical
ensemble as

\[
Z \equiv \mathrm{Tr}\,\big( e^{-\beta H} \big) . \tag{14.38}
\]

In perturbation theory, the logarithm of $Z$ is obtained as the sum of all the connected
vacuum graphs at finite temperature. For instance, for a scalar field, its perturbative
expansion starts with the following diagrams:

[Figure: the first few connected vacuum diagrams of the expansion.]

From $Z$, one may access various thermodynamical quantities as follows:

\[
\text{Energy}: \quad E = -\frac{\partial \ln(Z)}{\partial \beta} ,
\]
\[
\text{Entropy}: \quad S = \beta E + \ln(Z) ,
\]
\[
\text{Free energy}: \quad F = E - TS = -\frac{1}{\beta} \ln(Z) . \tag{14.39}
\]

These quantities encode the bulk properties of the system, such as its equation of state
or the existence of phase transitions.
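To make these relations concrete, here is a minimal sketch for the simplest possible system, a single bosonic mode of frequency `omega` with levels $E_n = n\,\omega$ (zero point energy dropped), for which $\ln Z = -\ln(1 - e^{-\beta\omega})$. This toy model and the finite-difference step are assumptions made for illustration; they are not taken from the text.

```python
import math


def log_partition(beta, omega):
    """ln(Z) for one bosonic mode with levels E_n = n * omega."""
    return -math.log(-math.expm1(-beta * omega))


def energy(beta, omega, step=1e-5):
    """E = -d ln(Z) / d beta, evaluated with a central finite difference."""
    upper = log_partition(beta + step, omega)
    lower = log_partition(beta - step, omega)
    return -(upper - lower) / (2 * step)


def entropy(beta, omega):
    """S = beta * E + ln(Z)."""
    return beta * energy(beta, omega) + log_partition(beta, omega)


def free_energy(beta, omega):
    """F = E - T * S, with T = 1 / beta."""
    return energy(beta, omega) - entropy(beta, omega) / beta


beta, omega = 0.8, 1.3
print(energy(beta, omega), omega / math.expm1(beta * omega))  # E versus omega * n_B(omega)
print(free_energy(beta, omega), -log_partition(beta, omega) / beta)
```

For this mode the energy reduces to $\omega\, n_B(\omega)$, and the free energy reproduces $-\ln(Z)/\beta$ up to rounding.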

Production rates of weakly coupled particles : In a system at high temperature,
it is sometimes interesting to calculate the production rate of a given species of
particles. Firstly, note that this quantity is not interesting for particles that are in
thermal equilibrium, since by definition they are produced and destroyed in equal
amounts, so that their net production rate is zero. The real interest arises for weakly
coupled particles that are not in thermal equilibrium with the bulk of the system. For
instance, in a hot plasma of quarks and gluons interacting via the strong nuclear force,
photons are also produced. However, since they interact only electromagnetically,
they may not be thermalized. This is the case for instance when the system size is
small compared to the mean free path of photons (i.e. the average distance between
two interactions of a photon), because in this situation the produced photons escape
without re-interactions.
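A pedestrian method for calculating a production rate is the following formula:

\[
\omega \frac{dN_\gamma}{dt\, d^3\mathbf{x}\, d^3\mathbf{p}} \propto \int_{\substack{\text{unobserved} \\ \text{particles}}} \Big|\, \text{[gray blob]} \,\Big|^2\; n(\omega_1) \cdots n(\omega_n)\; \big(1 \pm n(\omega'_1)\big) \cdots \big(1 \pm n(\omega'_p)\big) , \tag{14.40}
\]

[Figure: in the original formula, the gray blob is a diagram representing the amplitude of the production process.]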

where the integration is over the invariant phase-space of the unobserved incoming
and outgoing particles, weighted by the appropriate occupation factor ($n_B$ or $n_F$ for a
particle in the initial state, and $1 + n_B$ or $1 - n_F$ for a particle in the final state). In
this formula, the gray blob should be calculated with the finite-T Feynman rules.
The previous approach rapidly becomes cumbersome as the number of initial and
final state particles increases. The bookkeeping may be simplified by using a finite-T
generalization of the formula that relates the decay rate of a particle to the imaginary
part of its self-energy:

\[
\omega \frac{dN_\gamma}{dt\, d^3\mathbf{x}\, d^3\mathbf{p}} \propto \frac{1}{e^{\omega/T} - 1}\; \underbrace{\mathrm{Im}\; \Pi^\mu{}_\mu(\omega, \mathbf{p})}_{\text{photon self-energy}} . \tag{14.41}
\]

Moreover, there exists a finite-T generalization of Cutkosky’s cutting rules, in
order to organize the perturbative calculation of the imaginary part that appears in the
right hand side.

Transport coefficients : Let us now discuss the case of transport coefficients. As
their name suggests, these quantities characterize the ability of the system to move
certain (locally conserved) quantities. For instance, the electrical conductivity encodes
the properties of the system with respect to the transport of electrical charges, the
shear viscosity tells us about how the system reacts to a shear stress (this coefficient
is related to the transport of momentum), etc. Note that in their simplest version,
these quantities do not depend on frequency (in fact, they are the zero frequency
limit of a 2-point function), and therefore they describe the response of the system to
an infinitely slow perturbation. They can be generalized into frequency dependent
quantities that also contain information about the response to a dynamical disturbance.

The standard approach for evaluating transport coefficients is to use the Green-Kubo
formula, that relates the transport coefficient to the 2-point correlation function
of a current $J$ that couples to the quantity of interest (electrical charge, momentum,
etc):

\[
\Big[ \text{transport coefficient} \Big] \sim \lim_{\omega \to 0} \frac{1}{\omega}\; \mathrm{Im} \int_0^{+\infty} dt\, d^3\mathbf{x}\; e^{-i\omega t}\, \Big\langle \big[ J(t, \mathbf{x}), J(0, \mathbf{0}) \big] \Big\rangle . \tag{14.42}
\]

The physical meaning of this formula is that the system is perturbed at the origin by a
current J, and one measures the linear response by evaluating the same current at a
generic point (t, x). The transport coefficient is proportional to the Fourier transform
of this correlation function at zero energy and momentum. Note that this formula
contains the commutator of the two currents, since one wants the two points to be
causally connected.

14.2.9 Matsubara formalism

The perturbative rules that we have derived so far are expressed in coordinate space,
which is usually not very appropriate for explicit calculations. The standard way of
turning them into a set of rules in momentum space is to Fourier transform all the
propagators, and to rely on the fact that the Fourier transform of a convolution product
is the ordinary product of the Fourier transforms, i.e. symbolically

\[
\mathrm{FT}\big( F * G \big) = \mathrm{FT}\big( F \big) \times \mathrm{FT}\big( G \big) . \tag{14.43}
\]

However, the main difficulty in doing this at finite temperature is that the time
integration in the “convolution product” involves an integration over the complex-
shaped contour C, which makes it unclear whether we may use the above identity.

Two main solutions to this problem have been devised. The first one is the
imaginary time formalism, also known as the Matsubara formalism, that we have
already presented briefly in section 2.8.2. The main motivation of this
formulation is that the quantities that describe the thermodynamics of a system in
thermal equilibrium are time independent. Therefore, one may exploit the freedom to
deform the contour C in order to simplify it, as shown in the following figure:

[Figure: the contour C, which starts at t_i and ends at t_i − iβ, is deformed into a vertical segment that goes from 0 to −iβ.]

It is customary to denote $x^0 = -i\tau$, so that the variable $\tau$ is real and spans the range
$[0, \beta)$ (the point $\tau = \beta$ should be removed: because of the KMS symmetry, it
is redundant with the point $\tau = 0$). The imaginary time formalism corresponds to the
Feynman rules derived earlier, specialized to this purely imaginary time contour. Note
that one could in principle use this formalism in order to calculate time-dependent
quantities. One would first obtain them as a function of imaginary times $\tau_1, \tau_2, \cdots$
and their dependence upon real times $x_1^0, x_2^0, \cdots$ may then be obtained by an analytic
continuation.
From the KMS symmetry, we see that the propagator, and more generally the
integrand of any Feynman diagram, is periodic (for bosons) in the variable $\tau$ with
period $\beta$. Therefore, one can go to Fourier space by decomposing the time dependence
in the form of a Fourier series and by doing an ordinary Fourier transform in space:

\[
G^0(\tau_x, \mathbf{x}, \tau_y, \mathbf{y}) \equiv T \sum_{n=-\infty}^{+\infty} \int \frac{d^3\mathbf{p}}{(2\pi)^3}\; e^{i\omega_n(\tau_x - \tau_y)}\, e^{-i\mathbf{p}\cdot(\mathbf{x}-\mathbf{y})}\; G^0(\omega_n, \mathbf{p}) , \tag{14.44}
\]

with $\omega_n \equiv 2\pi n T$. These discrete frequencies are called Matsubara frequencies.
Note that for fermions, the propagator is antiperiodic with period $\beta$, and the discrete
frequencies that appear in the Fourier series are $\omega_n = 2\pi(n + \tfrac{1}{2})T$. Moreover, if the
line carries a conserved charge $q$, the Matsubara frequencies are shifted by $-i\mu q$, i.e.
$\omega_n \to \omega_n - i\mu q$ ($\mu$ is the chemical potential associated with this conservation law).
In the case of scalar fields, an explicit calculation gives the following free bosonic
propagator in Fourier space,

\[
G^0(\omega_n, \mathbf{p}) = \frac{1}{\omega_n^2 + \mathbf{p}^2 + m^2} \equiv \frac{1}{P^2 + m^2} . \tag{14.45}
\]

(For the sake of brevity, we denote $P^2 \equiv \omega_n^2 + \mathbf{p}^2$.) Note that, up to a factor $-i$,
this propagator is the usual free zero temperature Feynman propagator in which
one has substituted $p^0 \to i\omega_n$. Let us list here the Feynman rules for perturbative
calculations in this formalism:
• Propagators :
\[
G^0(\omega_n, \mathbf{p}) = \frac{1}{P^2 + m^2} ,
\]
• Vertices : each vertex brings a factor $\lambda$. Moreover, the sum of the $\omega_n$’s and the sum of
the $\mathbf{p}$’s that enter into each vertex are zero,
• Loops :
\[
T \sum_{n \in \mathbb{Z}} \int \frac{d^3\mathbf{p}}{(2\pi)^3} \equiv \sum\!\!\!\!\!\!\!\int_P .
\]

(The right-hand side of this equation is a frequently used compact notation for
the combination of discrete sums and integrals that appear in the Matsubara
formalism. This notation includes a factor $T$ that makes its dimension equal to
four in four space-time dimensions.)

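As an illustration of the use of this formalism, let us give two examples of vacuum
graphs:

\[
\text{[vacuum graph 1]} = \frac{\lambda}{8} \sum\!\!\!\!\!\!\!\int_P \sum\!\!\!\!\!\!\!\int_Q \frac{1}{(P^2 + m^2)(Q^2 + m^2)} ,
\]
\[
\text{[vacuum graph 2]} = \frac{g^2}{6} \sum\!\!\!\!\!\!\!\int_P \sum\!\!\!\!\!\!\!\int_Q \frac{1}{(P^2 + m^2)(Q^2 + m^2)((P+Q)^2 + m^2)} . \tag{14.46}
\]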

The Fourier space version of the Matsubara formalism is structurally very similar
to the zero temperature Feynman rules, which makes it quite appealing. There is
one caveat however: the continuous integrations over energies are now replaced by
discrete sums, which are considerably harder to calculate. Let us present here two
general methods for evaluating these sums. The first one is based on the following
representation of the propagator of eq. (14.45):

\[
G^0(\omega_n, \mathbf{p}) = \frac{1}{2E_p} \int_0^\beta d\tau\; e^{-i\omega_n \tau} \Big[ \big(1 + n_B(E_p)\big)\, e^{-E_p \tau} + n_B(E_p)\, e^{E_p \tau} \Big] , \tag{14.47}
\]

where the integrand in the right hand side is a mixed representation that depends on
the momentum $\mathbf{p}$ and the imaginary time $\tau$. By replacing each propagator of a given
graph by this formula, the discrete sums can be easily performed since they are all of
the form

\[
\sum_{n \in \mathbb{Z}} e^{i\omega_n \tau} = \beta \sum_{n \in \mathbb{Z}} \delta(\tau - n\beta) . \tag{14.48}
\]

(The left hand side is obviously periodic in $\tau$ with period $\beta$, which is ensured in the
right hand side by the sum over infinitely many shifted copies of the delta function.)

At this point, one has to integrate over the $\tau$’s that have been introduced when
replacing the propagators by (14.47), but these integrals are straightforward since
the dependence on these times is in the form of delta functions and exponentials.
Moreover, only a finite number of the delta functions that appear in the right hand
side of eq. (14.48) actually contribute, due to the constraint that each $\tau$ must be in
the range $[0, \beta)$. As an illustration, consider the evaluation of the 1-loop tadpole in a
scalar theory with quartic coupling:

\[
\text{[1-loop tadpole]} = \frac{\lambda}{2} \sum\!\!\!\!\!\!\!\int_P \frac{1}{P^2 + m^2}
\]
\[
= \frac{\lambda}{2} \int \frac{d^3\mathbf{p}}{(2\pi)^3\, 2E_p} \int_0^\beta d\tau \sum_n \delta(\tau - n\beta) \Big[ \big(1 + n_B(E_p)\big)\, e^{-E_p \tau} + n_B(E_p)\, e^{E_p \tau} \Big]
\]
\[
= \frac{\lambda}{2} \int \frac{d^3\mathbf{p}}{(2\pi)^3\, 2E_p} \Big[ 1 + 2\, n_B(E_p) \Big]
\]
\[
= \lambda \Big[ \frac{\Lambda^2}{16\pi^2} + \frac{T^2}{24} + \cdots \Big] , \tag{14.49}
\]

where Λ is an ultraviolet cutoff that restricts the integration range |p| ≤ Λ (the final
formula assumes that Λ ≫ T , and we have not written the terms that depend on the
mass). The first term is the usual zero temperature ultraviolet divergence, while the
term coming from the Bose-Einstein distribution exists only at non-zero temperature.
This second term is ultraviolet finite, thanks to the exponential suppression of the
Bose-Einstein distribution at large energy. We can already note on this example that
the ultraviolet divergences are identical to the zero temperature ones. This is a general
property: if the action has already been renormalized at zero temperature, there are no
additional ultraviolet divergences at finite temperature. This is quite clear on physical
grounds: being at finite temperature means that one has a dense medium in which the
average inter-particle distance is $T^{-1}$. However, in the ultraviolet limit, one probes
distance scales that are much smaller than the inverse temperature, for which the
effects of the surrounding medium are irrelevant.
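Both steps of eq. (14.49) lend themselves to a quick numerical check. The sketch below is an illustration, not part of the original text: it first compares the truncated Matsubara sum $T\sum_n 1/(\omega_n^2 + E^2)$ with the closed form $(1 + 2 n_B(E))/(2E)$, and then evaluates the thermal part of the tadpole in the massless limit, $\frac{1}{4\pi^2}\int_0^\infty dp\; p\, n_B(p)$, which should reproduce $T^2/24$. The truncation `n_max`, the momentum cutoff and the midpoint rule are assumptions chosen for simplicity.

```python
import math


def bose_einstein(energy, temperature):
    """n_B(E) = 1 / (exp(E / T) - 1)."""
    return 1.0 / math.expm1(energy / temperature)


def matsubara_sum(energy, temperature, n_max=200_000):
    """T * sum_n 1 / (omega_n^2 + E^2), omega_n = 2 pi n T, truncated at |n| <= n_max."""
    terms = (
        1.0 / ((2 * math.pi * n * temperature) ** 2 + energy ** 2)
        for n in range(-n_max, n_max + 1)
    )
    return temperature * math.fsum(terms)


def matsubara_closed_form(energy, temperature):
    """(1 + 2 n_B(E)) / (2 E), the result of the sum used in eq. (14.49)."""
    return (1 + 2 * bose_einstein(energy, temperature)) / (2 * energy)


def thermal_tadpole_over_lambda(temperature, p_max_over_T=60.0, n_steps=200_000):
    """Massless thermal part of eq. (14.49), divided by lambda.

    Computes (1 / (4 pi^2)) * integral_0^p_max dp p n_B(p) with the midpoint rule.
    """
    p_max = p_max_over_T * temperature
    h = p_max / n_steps
    total = math.fsum(
        (k + 0.5) * h * bose_einstein((k + 0.5) * h, temperature) for k in range(n_steps)
    )
    return total * h / (4 * math.pi ** 2)


print(matsubara_sum(1.5, 1.0), matsubara_closed_form(1.5, 1.0))
print(thermal_tadpole_over_lambda(1.3), 1.3 ** 2 / 24)
```

The truncated sum converges only like $1/n_{\max}$, which is one practical reason why the analytic methods described in the text are preferred over a direct summation.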
An alternate method for evaluating the sums over the discrete Matsubara frequencies
is to note that the function

\[
P(z) \equiv \frac{\beta}{e^{\beta z} - 1} \tag{14.50}
\]

has simple poles of residue 1 at all the $z = i\omega_n$. Therefore, we can write

\[
\sum_{n \in \mathbb{Z}} f(i\omega_n) = \oint_\gamma \frac{dz}{2i\pi}\; P(z)\, f(z) , \tag{14.51}
\]

where $\gamma$ is an integration contour made of infinitesimal circles around each pole of
$P(z)$, as shown in the left part of figure 14.1. The second step is to deform the
contour $\gamma$ as shown in the middle of figure 14.1. For this transformation to hold
as is, with no extra term, the function $f(z)$ should not have any pole on the imaginary
axis, which is usually the case. Finally, a second deformation brings the contour along
the real axis. If the function $f(z)$ has poles, the new contour should wrap around these
poles, which produces an additional contribution. Thus, after these transformations, the discrete
sum over the Matsubara frequencies has been rewritten as a continuous integral along
the real axis (and the weight $P(z)$ becomes an ordinary Bose-Einstein distribution),
plus some isolated contributions coming from poles of the summand.

Figure 14.1: Successive deformations of the contour in order to calculate the
discrete sums over Matsubara frequencies. The cross denotes a pole of the function
$f(z)$, while the solid dots on the imaginary axis are the poles of $P(z)$.
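The starting point of this method, namely that a small circle around $z = i\omega_n$ picks up $f(i\omega_n)$ with unit weight, can be verified numerically. The following sketch is illustrative only: it discretizes the circle with equally spaced points and uses the test function $f(z) = 1/(E^2 - z^2)$, whose poles lie on the real axis. The function names, the radius and the number of points are assumptions.

```python
import cmath
import math


def weight(z, beta):
    """P(z) = beta / (exp(beta * z) - 1), eq. (14.50)."""
    return beta / (cmath.exp(beta * z) - 1)


def circle_integral(f, center, radius, beta, n_points=2000):
    """(1 / (2 i pi)) * integral of P(z) f(z) over a circle of given radius around center.

    With z = center + radius * exp(i * angle), dz = i * (z - center) * d(angle),
    so the factor i cancels against the 1 / (2 i pi) prefactor.
    """
    total = 0j
    for k in range(n_points):
        offset = radius * cmath.exp(2j * math.pi * k / n_points)
        total += weight(center + offset, beta) * f(center + offset) * offset
    return total / n_points


beta, energy = 2.0, 1.5


def f(z):
    """Test function with poles at z = +energy and z = -energy, away from the imaginary axis."""
    return 1.0 / (energy ** 2 - z ** 2)


for n in range(-2, 3):
    pole = 1j * 2 * math.pi * n / beta  # z = i * omega_n
    print(n, circle_integral(f, pole, 0.5, beta), f(pole))
```

The radius must stay smaller than both the spacing $2\pi/\beta$ between consecutive poles of $P(z)$ and the distance to the poles of $f(z)$.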

14.2.10 Momentum space Schwinger-Keldysh formalism


The imaginary time formalism is particularly well suited to calculate the time-in-
dependent thermodynamical properties of a system at finite temperature. However,
interesting dynamical information is also contained in time-dependent objects. In
principle, one could first evaluate them in the Matsubara formalism in terms of
imaginary times τ (or imaginary frequencies iωn ), and then perform an analytic
continuation to real times or energies. Beyond 2-point functions (i.e. for functions
that depend on more than one energy, taking into account energy conservation),
this analytic continuation is usually extremely complicated and for this reason it is
desirable to be able to obtain the result directly in terms of real energies.
In fact, we may ignore4 the vertical part of the contour C. A heuristic justification
is to let the initial time ti go to −∞ and turn off adiabatically the interactions in this
limit. Therefore, the canonical density operator becomes exp(−β H0 ) and there is no
4 A more careful treatment of the vertical part of the contour indicates that its effect is to replace the
statistical distribution $n_B(E_p)$ by $n_B(|p^0|)$ in the equations (14.52). See the discussion after eq. (14.60).

need for the vertical part of the time contour. Let us call + and − respectively the
upper and lower horizontal branches of the contour. We may then break down the free
propagator $G^0(x, y)$ into four propagators $G^0_{++}$, $G^0_{--}$, $G^0_{+-}$ and $G^0_{-+}$ depending on
where x, y are located, and Fourier transform each of them separately. For a scalar
field, this gives:

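$$\begin{aligned}
G^0_{++}(p) &= \frac{i}{p^2-m^2+i\epsilon} + 2\pi\, n_B(E_p)\,\delta(p^2-m^2) ,\\
G^0_{+-}(p) &= 2\pi\,\big(\theta(-p^0) + n_B(E_p)\big)\,\delta(p^2-m^2) ,\\
G^0_{--}(p) &= \big[G^0_{++}(p)\big]^* , \qquad G^0_{-+}(p) = G^0_{+-}(-p) .
\end{aligned}\tag{14.52}$$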

Note that these propagators are very closely related to those of the Schwinger-Keldysh
formalism at zero temperature (see eqs. (1.367)), since we have
$$\text{for } \epsilon,\epsilon' = \pm , \qquad G^0_{\epsilon\epsilon'}(p) = \Big[G^0_{\epsilon\epsilon'}(p)\Big]_{T=0} + 2\pi\, n_B(E_p)\,\delta(p^2-m^2) . \tag{14.53}$$

The rules for the vertices and loops are identical to those of the Schwinger-Keldysh
formalism at zero temperature, namely:

• One must assign types + and − to the vertices of a diagram in all the possible
ways,
• Each vertex of type + brings a factor −iλ and each type − vertex a factor +iλ,
• A vertex of type ε and a vertex of type ε′ are connected by the free propagator
$G^0_{\epsilon\epsilon'}$,
• Each loop momentum must be integrated with the measure $d^4p/(2\pi)^4$.

Since this formalism is a simple extension of the zero temperature Schwinger-Keldysh


formalism (the only difference being the propagators in eq. (14.53) ), it makes the
connection with perturbation theory at zero temperature more transparent.
In the Matsubara formalism, the KMS symmetry is trivially encoded in the fact
that all the objects depend only on the discrete frequencies ωn . In the Schwinger-
Keldysh formalism, it is somewhat more obfuscated. A generic n-point function
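$\Gamma^{\epsilon_1\cdots\epsilon_n}(k_1,\cdots,k_n)$, amputated of its external legs, obeys the following two identities:
$$\begin{aligned}
\sum_{\epsilon_1\cdots\epsilon_n=\pm} \Gamma^{\epsilon_1\cdots\epsilon_n}(k_1,\cdots,k_n) &= 0 ,\\
\sum_{\epsilon_1\cdots\epsilon_n=\pm} \Big[\prod_{\{i|\epsilon_i=-\}} e^{-\beta k_i^0}\Big]\, \Gamma^{\epsilon_1\cdots\epsilon_n}(k_1,\cdots,k_n) &= 0 .
\end{aligned}\tag{14.54}$$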

It is the second of these identities that encodes the KMS symmetry.



Finally, let us note for later use that the four propagators of eqs. (14.52) can be
related to the zero temperature Feynman propagator and its complex conjugate by the
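following formula:
$$\begin{pmatrix} G^0_{++} & G^0_{+-}\\ G^0_{-+} & G^0_{--} \end{pmatrix} = U \begin{pmatrix} G^0_F & 0\\ 0 & G^{0\,*}_F \end{pmatrix} U \tag{14.55}$$
with
$$U(p) \equiv \begin{pmatrix} \sqrt{1+n_B} & \dfrac{\theta(-p^0)+n_B}{\sqrt{1+n_B}}\\[2mm] \dfrac{\theta(+p^0)+n_B}{\sqrt{1+n_B}} & \sqrt{1+n_B} \end{pmatrix} \tag{14.56}$$
and
$$G^0_F(p) \equiv \frac{i}{p^2-m^2+i\epsilon} . \tag{14.57}$$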
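The factorization (14.55) can be checked numerically entry by entry. The sketch below is only an illustration: it treats $G^0_F$ as an arbitrary complex number (so that $G^0_F + G^{0*}_F = 2\,{\rm Re}\,G^0_F$ plays the role of $2\pi\delta(p^2-m^2)$ at finite ε), and the helper names are hypothetical.

```python
import math

def rotation_U(n_B, p0_positive):
    """Matrix U of eq. (14.56) for a given value of n_B and a given sign of p^0."""
    theta_minus = 0.0 if p0_positive else 1.0  # theta(-p^0)
    theta_plus = 1.0 - theta_minus             # theta(+p^0)
    s = math.sqrt(1.0 + n_B)
    return [[s, (theta_minus + n_B) / s],
            [(theta_plus + n_B) / s, s]]

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rotated_propagators(G_F, n_B, p0_positive):
    """Right-hand side of eq. (14.55): U diag(G_F, G_F^*) U."""
    U = rotation_U(n_B, p0_positive)
    D = [[G_F, 0.0], [0.0, G_F.conjugate()]]
    return matmul(matmul(U, D), U)

if __name__ == "__main__":
    G = rotated_propagators(complex(0.3, -1.7), 0.8, True)
    for row in G:
        print(row)
```

With these inputs the four entries reproduce the pattern of eqs. (14.52): the ++ entry is $G^0_F + n_B\,(G^0_F+G^{0*}_F)$, and the off-diagonal entries are $(\theta(\mp p^0)+n_B)\,(G^0_F+G^{0*}_F)$.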

Resummation of a mass : In eq. (14.56), we have deliberately not written the
argument of the Bose-Einstein distribution. Given its origin in the propagators listed
in eqs. (14.52), this argument could be Ep or |p0 |. It turns out that the second
possibility is the correct one. In order to see this, consider a mass term, but instead
of including the mass directly into the propagators (as in eqs. (14.52)) let us start
from massless propagators and resum the mass to all orders. In order to simplify this
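calculation, let us introduce the following compact notations:
$$\mathbb{G}_0 \equiv \begin{pmatrix} G^0_{++} & G^0_{+-}\\ G^0_{-+} & G^0_{--} \end{pmatrix}_{m=0} , \qquad \mathbb{G}_m \equiv \begin{pmatrix} G^0_{++} & G^0_{+-}\\ G^0_{-+} & G^0_{--} \end{pmatrix}_{m} , \tag{14.58}$$
the 2 × 2 matrix of Schwinger-Keldysh propagators, without and with the mass, and
$$\mathbb{D}_0 \equiv \begin{pmatrix} G^0_F & 0\\ 0 & G^{0\,*}_F \end{pmatrix}_{m=0} , \qquad \mathbb{D}_m \equiv \begin{pmatrix} G^0_F & 0\\ 0 & G^{0\,*}_F \end{pmatrix}_{m} \tag{14.59}$$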

the corresponding diagonal matrices made of the Feynman propagator and its complex
conjugate. The massive propagators obtained by explicitly summing the mass term
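are given by
$$\begin{aligned}
\mathbb{G}_m &= \sum_{n=0}^{\infty} \mathbb{G}_0 \Big[-i m^2 \sigma_3\, \mathbb{G}_0\Big]^n\\
&= U \sum_{n=0}^{\infty} \mathbb{D}_0 \Big[-i m^2\, U\sigma_3 U\, \mathbb{D}_0\Big]^n U\\
&= U \sum_{n=0}^{\infty} \mathbb{D}_0 \Big[-i m^2 \sigma_3\, \mathbb{D}_0\Big]^n U = U\, \mathbb{D}_m\, U .
\end{aligned}\tag{14.60}$$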

In the first line, the third Pauli matrix σ3 provides the necessary signs for the vertices
of types + and − in the Schwinger-Keldysh formalism. The third line uses the fact
that $U\sigma_3 U = \sigma_3$. In the final result, only the matrix $\mathbb{D}$ is affected by the mass,
while the matrix U has remained unchanged. If we use the on-shell energy $E_p$ as the
argument of $n_B$ in the matrix U, then this argument is |p| since we started from a
massless propagator. With this choice, the final result would be inconsistent, since the
poles of the massive propagator are at $p^0 = \pm\sqrt{\mathbf{p}^2+m^2}$ (since the matrix $\mathbb{D}_m$ in
the middle now contains the mass), but the statistical information contained in the
U’s is still massless. In contrast, using |p0 | as the argument of nB ensures that the
energy inside nB follows the poles of the propagator, and correctly picks the change
due to the mass. We also see that the (incorrect) prescription nB (Ep ) is equivalent to
neglecting the vertical path of the contour, since it amounts to keeping the interactions
(here, the mass term, treated as an interaction) in the time evolution of the system but
not in the density operator.
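For the diagonal matrix of Feynman propagators, the resummation of the mass is an ordinary geometric series. As a short worked check (written here for each diagonal entry separately, with the sign of the mass insertion supplied by $\sigma_3$):

```latex
\begin{aligned}
\frac{i}{p^2+i\epsilon}\,\sum_{n=0}^{\infty}\Big[-i m^2\,\frac{i}{p^2+i\epsilon}\Big]^n
  &= \frac{i}{p^2+i\epsilon}\,\sum_{n=0}^{\infty}\Big[\frac{m^2}{p^2+i\epsilon}\Big]^n
   = \frac{i}{p^2-m^2+i\epsilon}\,,\\
\frac{-i}{p^2-i\epsilon}\,\sum_{n=0}^{\infty}\Big[+i m^2\,\frac{-i}{p^2-i\epsilon}\Big]^n
  &= \frac{-i}{p^2-m^2-i\epsilon}\,.
\end{aligned}
```

The first line is the massive Feynman propagator and the second its complex conjugate, so the resummed diagonal matrix is indeed the massive one, while the matrices U on both sides are untouched.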

14.2.11 Retarded basis

Change of basis : All the objects that appear in the Schwinger-Keldysh formalism
carry indices that take the values + or −. Variants of this formalism may be obtained
by performing linear combinations of these two indices, akin to a change of basis.
For any n-point function $G_{\{\epsilon_i\}}$ in the ± basis, we may define
$$G_{\{X_i\}}(k_1,\cdots,k_n) \equiv \sum_{\{\epsilon_i=\pm\}} G_{\{\epsilon_i\}}(k_1,\cdots,k_n) \prod_{i=1}^{n} U_{X_i\epsilon_i}(k_i) , \tag{14.61}$$

where U is an invertible “rotation” matrix. The new indices Xi also take two values,
that we may denote 1 and 2. For consistency, the vertex functions obtained by
amputating Feynman graphs of their external lines must be related by
$$\Gamma^{\{X_i\}}(k_1,\cdots,k_n) \equiv \sum_{\{\epsilon_i=\pm\}} \Gamma^{\{\epsilon_i\}}(k_1,\cdots,k_n) \prod_{i=1}^{n} V^{X_i\epsilon_i}(k_i) , \tag{14.62}$$

where the matrix V is defined by
$$V^{X\epsilon}(k) \equiv \big[(U^{T})^{-1}\big]^{X\epsilon}(-k) . \tag{14.63}$$

In particular, this formula gives the expression of the vertices in the new formalism.
For instance, in a $\phi^4$ scalar theory, we have
$$-i\lambda^{ABCD}(k_1,\cdots,k_4) = -i\lambda \sum_{\epsilon=\pm} \epsilon\; V^{A\epsilon}(k_1)\,V^{B\epsilon}(k_2)\,V^{C\epsilon}(k_3)\,V^{D\epsilon}(k_4) . \tag{14.64}$$

Note that the new vertices may be momentum dependent if the rotation matrix is.
Moreover, there could be up to $2^4 = 16$ non-zero vertices, while there are only two in the
original Schwinger-Keldysh formalism (but we will see shortly that these rotations
may reduce the number of non-zero entries for the propagators, which is sometimes
an advantage). The n-point functions in the new basis may be obtained directly in
perturbation theory, in terms of Feynman diagrams made of the bare propagators and
vertices of the new basis.

Retarded-advanced formalism : A convenient choice of rotation consists in exploiting the two relations satisfied by the Schwinger-Keldysh propagators,
$$\begin{aligned}
G_{++}(p) + G_{--}(p) &= G_{-+}(p) + G_{+-}(p) ,\\
G_{-+}(p) &= e^{p^0/T}\, G_{+-}(p) ,
\end{aligned}\tag{14.65}$$
in order to generate two zero entries in the new propagators. Note that the second
of these identities is equivalent to KMS, and is therefore only valid in thermal
equilibrium. For bosons, the matrix U that achieves this is
$$U(k) = \frac{1}{a(-k^0)} \begin{pmatrix} a(k^0)\,a(-k^0) & -a(k^0)\,a(-k^0)\\ -n_B(-k^0) & -n_B(k^0) \end{pmatrix} , \tag{14.66}$$
where $a(k^0)$ is an arbitrary non-vanishing function. A similar transformation exists
for fermions. In all cases, it is such that the new propagators become
$$G_{XY}(k) = \begin{pmatrix} 0 & G_A(k)\\ G_R(k) & 0 \end{pmatrix} , \tag{14.67}$$
where $G_{R,A}$ are the vacuum retarded and advanced propagators. In this formalism,
it is customary to denote R and A the values taken by the indices X, Y (therefore,
the term in the upper right location is $G^{RA} \equiv G_A$, and the other non-zero term is
$G^{AR} \equiv G_R$). These rotated propagators do not depend on temperature, which is now
relegated into the vertices. A convenient choice is $a(k^0) = -n_B(k^0)$, which leads to
the following vertices in a $\phi^4$ scalar theory
$$\begin{aligned}
\lambda^{AAAA} &= \lambda^{RRRR} = 0 ,\\
\lambda^{ARRR} &= \lambda ,\\
\lambda^{AARR}(k_1,\cdots,k_4) &= -\lambda\,\big[1 + n_B(k_1^0) + n_B(k_2^0)\big] .
\end{aligned}\tag{14.68}$$
(The vertices we have not written explicitly are obtained by circular permutations.)
The general expression of the vertices in the rotated formalism is
$$\lambda^{\{X_i\}} = \lambda\, \frac{\prod_{i|X_i=A} n_B(-k_i^0)}{n_B\Big(\sum_{i|X_i=R} k_i^0\Big)} . \tag{14.69}$$

(For fermionic lines, we must replace nB by −nF , and shift the argument by −qµ
if the line carries a conserved charge.) This formalism, compared to the original
Schwinger-Keldysh one, has a number of advantages:

• Thanks to eq. (14.69), the Bose-Einstein (or Fermi-Dirac in the case of fermions)
functions are conveniently factorized in each Feynman graph,
• In this formalism, the two identities (14.54) satisfied by n-point functions take
a particularly simple form,
$$\Gamma^{A\cdots A} = \Gamma^{R\cdots R} = 0 , \tag{14.70}$$
which renders immediate the simplifications allowed by these identities.
• The retarded-advanced formalism has close connections to the Matsubara
formalism, since every R/A n-point function can be obtained as a linear combination
of the analytical continuations ($i\omega_n \to p^0 \pm i0^+$) of the corresponding
function in the Matsubara formalism.
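The special cases (14.68) follow from the general formula (14.69) once energy conservation is imposed. The sketch below checks this numerically; it assumes all four momenta are incoming (so the energies sum to zero), sets T = 1 by default, and uses $1/n_B(x) = e^{x/T}-1$ so that the all-A and all-R cases evaluate to zero without a division by zero. The function names are hypothetical.

```python
import math

def n_B(E, T=1.0):
    """Bose-Einstein distribution, also used here at negative argument."""
    return 1.0 / math.expm1(E / T)

def rotated_vertex(labels, energies, lam=1.0, T=1.0):
    """Eq. (14.69): lam * prod_{i in A} n_B(-k_i^0) / n_B(sum_{i in R} k_i^0)."""
    numerator = 1.0
    sum_R = 0.0
    for X, k0 in zip(labels, energies):
        if X == "A":
            numerator *= n_B(-k0, T)
        else:
            sum_R += k0
    # 1 / n_B(x) = exp(x/T) - 1, which vanishes when no energy flows in the R legs
    return lam * numerator * math.expm1(sum_R / T)

if __name__ == "__main__":
    k = (0.7, -1.9, 0.4, 0.8)  # energies summing to zero
    print(rotated_vertex("ARRR", k), rotated_vertex("AARR", k))
```

The AARR value can be compared with $-\lambda[1+n_B(k_1^0)+n_B(k_2^0)]$, which relies on the identity $n_B(a)\,n_B(b) = n_B(a+b)\,[1+n_B(a)+n_B(b)]$.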

14.3 Large distance effective theories


14.3.1 Infrared divergences
Quantum field theories with massless bosons at non-zero temperature suffer from
pathologies in the infrared sector, due to the low energy behaviour of the Bose-Einstein
distribution:
$$n_B(E) \underset{E\ll T}{\approx} \frac{T}{E} \gg 1 . \tag{14.71}$$
As we shall see now, using a massless $\phi^4$ scalar field theory as a playground, this
leads to loop contributions that exhibit soft divergences. The simplest graph that
suffers from this problem is the 2-loop graph that has two nested tadpoles. Let us
assume that the uppermost tadpole has already been combined with the corresponding
1-loop ultraviolet counterterm, so that only the finite part remains, and denote $\mu^2$
the finite remainder. From eq. (14.49), its expression is given by
$$\mu^2 \equiv \frac{\lambda T^2}{24} . \tag{14.72}$$

(This is the exact result for the temperature dependent part in a massless theory.) With
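this shorthand, we have
$$\begin{aligned}
\text{[2-loop graph]} &= \frac{\lambda\mu^2}{2}\, \sum\!\!\!\!\!\!\!\int_P \frac{1}{(P^2)^2}\\
&= \frac{\lambda\mu^2}{2} \int \frac{d^3p}{(2\pi)^3}\, \underbrace{\frac{n_B(p)\,(1+n_B(p))}{4p^2}\Big[\frac{2}{T} + \frac{e^{\beta p}-e^{-\beta p}}{p}\Big]}_{\approx\; T/p^4 \ \text{for}\ p\ll T}\\
&= \frac{\lambda\mu^2\, T}{4\pi^2\,\Lambda_{\rm IR}} + \text{infrared finite terms} ,
\end{aligned}\tag{14.73}$$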

where in the last line we have introduced an infrared cutoff ΛIR in order to prevent
a divergence at the lower end of the integration range. A similar calculation would
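indicate an even worse infrared singularity in the following 3-loop graph:
$$\text{[3-loop graph]} \sim \lambda T \mu\, \frac{\mu^3}{\Lambda_{\rm IR}^3} + \text{infrared finite terms} , \tag{14.74}$$
and more generally for n insertions of the base tadpole on the main loop,
$$\text{[graph with } n \text{ insertions]} \sim \lambda T \mu \left(\frac{\mu}{\Lambda_{\rm IR}}\right)^{2n-1} + \text{infrared finite terms} . \tag{14.75}$$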

Unlike ultraviolet divergences that can, in a renormalizable theory, be disposed of


systematically by a redefinition of the couplings in front of a few local operators in the
Lagrangian, it is not possible to handle infrared divergences in this manner because
they correspond to long distance phenomena. Fortunately, there is a simple way out
in the present case: the series of graphs that we have started evaluating are the first
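terms of a geometrical series, since the repeated insertions of a tadpole equal to $\mu^2$
(after subtraction of the appropriate counterterm) merely amount to dressing an
originally massless propagator with a squared mass $\mu^2$. Namely, we have
$$\begin{aligned}
&\text{[tadpole]} + \text{[tadpole, 1 insertion]} + \text{[tadpole, 2 insertions]} + \cdots = \text{[tadpole with thick propagator]}\\
&\qquad = \frac{\lambda}{2} \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{2\sqrt{p^2+\mu^2}}\, \Big[1 + 2\, n_B\big(\sqrt{p^2+\mu^2}\big)\Big] .
\end{aligned}\tag{14.76}$$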

(The thicker propagator indicates a massive scalar with squared mass $\mu^2$.) The procedure
used here, that consists in summing an infinite subset of (individually divergent)
perturbative contributions, is a simple form of resummation. We can readily see that
it leads to an infrared finite sum, since now the quantity $\mu^2$ plays the role of a cutoff
at small momentum.
Let us now estimate the contribution of the infrared sector to this integral. At
weak coupling, we have $\mu \sim \sqrt{\lambda}\, T \ll T$. Therefore, for momenta $p \sim \mu$, we have
$$\lambda \int dp\; p^2\, \frac{1 + 2\, n_B\big(\sqrt{p^2+\mu^2}\big)}{\sqrt{p^2+\mu^2}} \sim \lambda \int dp\; \frac{p^2\, T}{p^2+\mu^2} \sim \lambda T \mu \sim \lambda^{3/2}\, T^2 . \tag{14.77}$$

This contribution comes in addition to the ultraviolet divergence $\lambda\Lambda^2$ and the contribution
$\lambda T^2$ that are both contained in the first diagram of the resummed series (these
terms come from momenta of order T or above). We observe here an unexpected
feature: the appearance of half powers of the coupling constant λ. On the surface, this
is quite surprising since the power counting indicates that one power of λ should come
with each loop. This oddity is in fact a consequence of the infrared behaviour of the
Bose-Einstein distribution, in T/E, combined with the fact that the µ introduced in the
resummation is of order $\lambda^{1/2}$. Although the loop expansion generates a series which
is analytic in λ, this property may be broken if some parameters in the integrands
depend on $\lambda^{1/2}$.
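The coefficient of the half-integer power can be made explicit with a short worked estimate. This is only a sketch, under two assumptions: in the soft region the Bose-Einstein factor is replaced by its infrared limit $n_B(E)\approx T/E$, and the massless tadpole is subtracted in order to isolate the dependence on µ:

```latex
\begin{aligned}
\frac{\lambda}{2}\int\frac{d^3p}{(2\pi)^3}\; T\,\Big[\frac{1}{p^2+\mu^2}-\frac{1}{p^2}\Big]
  &= -\frac{\lambda T \mu^2}{4\pi^2}\int_0^\infty \frac{dp}{p^2+\mu^2}
   = -\frac{\lambda T \mu}{8\pi}\,,\\
\text{and with } \mu^2=\frac{\lambda T^2}{24}:\qquad
  -\frac{\lambda T \mu}{8\pi} &= -\frac{\lambda^{3/2}\,T^2}{8\pi\sqrt{24}}\,.
\end{aligned}
```

The integral converges at both ends, is dominated by momenta of order µ, and is not analytic in λ, in line with the estimate $\lambda T\mu \sim \lambda^{3/2}T^2$.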

14.3.2 Screened perturbation theory


The resummation of the finite part of the 1-loop tadpole is sufficient in order to
screen the infrared divergences in the graphs corresponding to a strict loop expansion.
However, since such a resummation amounts to a reorganization of perturbation
theory (here, by already including an infinite set of graphs into the propagator), it
should be done in a careful way that avoids any double countings, and ensures that
we are not modifying the original theory. This can be achieved by a method, called
screened perturbation theory, that consists in adding and subtracting a mass term to
the Lagrangian,
$$\mathcal{L} = \frac{1}{2}\big(\partial_\mu\phi\big)\big(\partial^\mu\phi\big) - \frac{\lambda}{4!}\phi^4 - \frac{1}{2}\mu^2\phi^2 + \frac{1}{2}\mu^2\phi^2 . \tag{14.78}$$
This manipulation clearly leaves the original theory unchanged. The
reorganization of perturbation theory allowed by this trick comes from treating the
two mass terms on different footings: the term $-\tfrac{1}{2}\mu^2\phi^2$ is treated non-perturbatively
by including it directly into the definition of the free propagator, while the term
$+\tfrac{1}{2}\mu^2\phi^2$ is treated order by order, as a finite counterterm.
In this reorganization, the value of µ2 has so far been left unspecified, and it could
a priori be chosen arbitrarily. A general rule governing this choice is to include in µ2

as much as possible of the large contributions coming from loop corrections to the
propagator. The 1-loop contribution in $\lambda T^2$ is an obvious candidate for including in
$\mu^2$, since for momenta $p^2 \lesssim \lambda T^2$ this is indeed a large correction to the denominator
of the propagator. At small coupling λ ≪ 1, this is the dominant one. However, when
the coupling increases, the propagator may receive additional large corrections from
higher order loop corrections, and an improved resummation scheme could include
these additional corrections.
A further improvement, considered in some applications, is to leave $\mu^2$
free and to use some reasonable condition to choose an “optimal” value. For instance,
this condition may be the minimization of the 1-loop correction, which in a sense
would indicate that the resummation has shifted most of this loop contribution into
the free propagator. For instance, one may try to achieve
$$0 = \text{[tadpole]} + \text{counterterms} = \frac{\lambda}{2} \int^{\Lambda} \frac{d^3p}{(2\pi)^3}\, \frac{1 + 2\, n_B\big(\sqrt{p^2+\mu^2}\big)}{2\sqrt{p^2+\mu^2}} - \frac{\lambda\Lambda^2}{16\pi^2} - \mu^2 , \tag{14.79}$$

where the two subtractions are respectively the ultraviolet counterterm and the finite
counterterm necessary in order not to overcount the mass $\mu^2$. The equation, that
provides an implicit definition of the mass $\mu^2$, is called a gap equation5 . Because this
equation is non-linear in $\mu^2$, its solution contains all orders in λ, but at small λ it is
dominated by the 1-loop result $\mu^2 = \lambda T^2/24$.
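To see how the non-linearity of a gap equation shifts the mass away from its 1-loop value, one can solve a simplified version numerically. The sketch below is an illustration under a stated assumption: only the thermal part of eq. (14.79) is kept (the vacuum part is taken to be cancelled by the counterterm, ignoring the residual logarithmic dependence on the cutoff), so that the equation reads $\mu^2 = \lambda \times \tfrac{1}{2}\int d^3p/(2\pi)^3\, n_B(E)/E$ with $E=\sqrt{p^2+\mu^2}$. The integration grid and iteration count are arbitrary choices.

```python
import math

def thermal_tadpole(mu, T, p_max_over_T=40.0, steps=40_000):
    """(1/2) * int d^3p/(2 pi)^3 n_B(E)/E with E = sqrt(p^2 + mu^2) (midpoint rule)."""
    h = p_max_over_T * T / steps
    total = 0.0
    for i in range(steps):
        p = (i + 0.5) * h
        E = math.sqrt(p * p + mu * mu)
        total += p * p / (E * math.expm1(E / T))
    # angular integration: d^3p / (2 pi)^3 -> p^2 dp / (2 pi^2)
    return 0.5 * total * h / (2.0 * math.pi ** 2)

def solve_gap(lam, T, iterations=20):
    """Fixed-point iteration of mu^2 = lam * thermal_tadpole(mu, T)."""
    mu2 = lam * T * T / 24.0  # start from the 1-loop value
    for _ in range(iterations):
        mu2 = lam * thermal_tadpole(math.sqrt(mu2), T)
    return mu2

if __name__ == "__main__":
    lam, T = 0.01, 1.0
    print(solve_gap(lam, T) / (lam * T * T / 24.0))
```

At small coupling the printed ratio is slightly below one, the deficit being of relative order $\mu/T \sim \lambda^{1/2}$, which is how the solution acquires a non-analytic dependence on λ.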
We show an application of this method to the calculation of the free energy F in
the figure 14.2. In this figure, the results obtained at 1-loop and 2-loops in screened
perturbation theory are compared to the first two orders (λ and $\lambda^{3/2}$) of the ordinary
perturbative expansion. Firstly, we can see that the latter is quite unstable except at
low coupling: the two successive orders differ substantially, and even the sign of the
correction due to the interactions flips. In contrast, screened perturbation theory leads
to a remarkably stable result, with very small changes when going from 1-loop to
2-loops. To a large extent, this success is due to the non trivial coupling dependence
of the mass $\mu^2$, acquired by solving the gap equation (14.79) (screened perturbation
theory with only the 1-loop mass would be better than strict perturbation theory, but
would encounter some difficulties at large coupling).

14.3.3 Symmetry restoration at high temperature


The thermal correction to the mass, $\mu^2 = \lambda T^2/24$, also explains why symmetries that
may be spontaneously broken at low temperature are generically restored at high
5 The terminology comes from the fact that the solution of this equation usually shifts the energy of a

particle, generating a “gap” in the spectrum, and thus requiring a non-zero energy to create such a particle.

Figure 14.2: Free energy at non-zero temperature in the $\phi^4$ scalar field theory (normalized to the free energy of the non-interacting theory). The horizontal axis is the coupling strength $g \equiv \lambda^{1/2}$. Curves “g2” and “g3”: orders λ and $\lambda^{3/2}$ in the original perturbative expansion. Curves “a” and “b”: screened perturbation theory at 1-loop and 2-loops, with the mass $\mu^2$ determined as the exact solution of the gap equation (14.79). [Plot: −F (vertical axis) versus g (horizontal axis, from 0 to 10).]

Figure 14.3: Evolution of a scalar potential with increasing temperature. Thick dark curve: potential with degenerate non-trivial minima at low temperature, leading to spontaneous symmetry breaking. Thick light curve: potential at the critical temperature. [Plot: V(φ) (vertical axis) versus φ (horizontal axis).]

temperature. Let us consider for instance a scalar theory whose potential at zero
temperature is
$$V(\phi) = -\frac{m_0^2}{2}\,\phi^2 + \frac{\lambda}{4!}\,\phi^4 . \tag{14.80}$$
Because of the sign of the mass term, this potential has two degenerate minima. The
true vacuum of this theory is at a non-zero value of φ, and the discrete symmetry
φ → −φ is thus spontaneously broken. When the temperature increases, the thermal
fluctuations generate a positive correction to the square of the mass, proportional to
$\lambda T^2$. Eventually the effective squared mass $m^2$ (that is, $-m_0^2$ plus this thermal
correction) becomes positive, i.e. the potential has a unique minimum at
φ = 0, and the symmetry is restored. The critical temperature, that separates the low
temperature broken phase and the high temperature symmetric phase, is the point at
which $m^2 = 0$.
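As a rough numerical illustration of this criterion, the sketch below assumes that the thermal correction is the 1-loop result $\lambda T^2/24$ quoted earlier, so that the effective squared mass is $m^2(T) = -m_0^2 + \lambda T^2/24$. This is a leading-order estimate, not a precise determination of the critical temperature; the function names are hypothetical.

```python
import math

def effective_mass_squared(m0, lam, T):
    """Effective squared mass -m0^2 + lam*T^2/24 (1-loop thermal correction assumed)."""
    return -m0 ** 2 + lam * T ** 2 / 24.0

def critical_temperature(m0, lam):
    """Temperature at which the effective squared mass vanishes."""
    return m0 * math.sqrt(24.0 / lam)

if __name__ == "__main__":
    m0, lam = 1.0, 0.1
    Tc = critical_temperature(m0, lam)
    print(Tc, effective_mass_squared(m0, lam, Tc))
```

Within this estimate the critical temperature scales like $m_0/\sqrt{\lambda}$, i.e. it is parametrically larger than the zero-temperature mass scale at weak coupling.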

14.3.4 Hard Thermal Loops


The scalar $\phi^4$ theory considered in the previous subsection is rather special because
the one-loop tadpole diagram that gives the thermal correction to the mass is momentum
independent, and because it is calculable analytically in a massless theory. In
gauge theories at finite temperature, there are also important thermal corrections to
the propagator of fermions and gauge bosons, but their structure is much richer. As
we shall see now, the calculation of the corresponding one-loop self-energies requires
an approximation based on the assumption that the loop momentum is of order T
while the external momentum is much smaller, p ≪ T . The resulting self-energies
are known as Hard Thermal Loops.

Photon hard thermal loop : A simple example of hard thermal loop is that of a
photon in QED, for which the only graph is shown on the right of the figure 14.4.

Figure 14.4: Fermion and photon self-energies at 1-loop.

In the Matsubara formalism, the expression of the photon polarization tensor is
$$\Pi^{\mu\nu}(K) = e^2\, \sum\!\!\!\!\!\!\!\int_P \frac{\mathrm{tr}\big(\gamma^\mu (\slashed{P}-\slashed{K}) \gamma^\nu \slashed{P}\big)}{P^2\,(P-K)^2} . \tag{14.81}$$

(We neglect the fermion mass in this expression.)


Let us pause a moment in order to discuss the possible form of this tensor. In
QED, $\Pi^{\mu\nu}(K)$ must be symmetric under the exchange of the Lorentz indices, and
must obey the following Ward-Takahashi identity:
$$K_\mu\, \Pi^{\mu\nu}(K) = K_\nu\, \Pi^{\mu\nu}(K) = 0 . \tag{14.82}$$
In the vacuum, this relation is sufficient to fully constrain the tensorial structure of
$\Pi^{\mu\nu}$, up to an overall function of $K^2$. In the presence of a surrounding thermal bath,
the situation is more complicated: besides the metric tensor $g^{\mu\nu}$ and the 4-momentum
$K^\mu$, this tensor may also contain the 4-velocity $U^\mu$ of the thermal bath (with respect
to the observer). Let us first introduce
$$V^\mu \equiv K^2\, U^\mu - (K\cdot U)\, K^\mu . \tag{14.83}$$

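Then, one may check that the Ward-Takahashi identity is satisfied by two symmetric
tensors
$$\begin{aligned}
P_T^{\mu\nu} &\equiv g^{\mu\nu} - \frac{K^\mu K^\nu}{K^2} - \frac{V^\mu V^\nu}{V^2} ,\\
P_L^{\mu\nu} &\equiv \frac{V^\mu V^\nu}{V^2} .
\end{aligned}\tag{14.84}$$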
V2

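Besides being transverse to $K^\mu$, these two tensors satisfy
$$\begin{aligned}
&P_T^{\mu}{}_{\nu}\, P_T^{\nu\rho} = P_T^{\mu\rho} , \qquad P_L^{\mu}{}_{\nu}\, P_L^{\nu\rho} = P_L^{\mu\rho} , \qquad P_T^{\mu}{}_{\nu}\, P_L^{\nu\rho} = 0 ,\\
&P_T^{\mu}{}_{\mu} = 2 , \qquad P_L^{\mu}{}_{\mu} = 1 ,
\end{aligned}\tag{14.85}$$
which means that they are mutually orthogonal projectors (the values of their traces
indicate that $P_T^{\mu\nu}$ encodes two degrees of freedom, while $P_L^{\mu\nu}$ contains only one).
Moreover, in the rest frame of the thermal bath, we have $U^\mu = \delta^{\mu 0}$, and the first of
these tensors reads
$$P_T^{00} = P_T^{i0} = P_T^{0i} = 0 , \qquad P_T^{ij} = \delta^{ij} - \frac{k^i k^j}{\mathbf{k}^2} . \tag{14.86}$$
Therefore, $P_T^{\mu\nu}$ is a projector orthogonal to the 3-momentum k.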
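The algebraic properties (14.85) are easy to verify numerically for a generic momentum. The following sketch builds the mixed-index matrices $P^{\mu}{}_{\nu}$ from eqs. (14.83) and (14.84); the metric signature (+,−,−,−), the bath velocity at rest and the test momentum are assumptions made for this example, and the helper names are hypothetical.

```python
METRIC = (1.0, -1.0, -1.0, -1.0)  # diagonal of g, signature assumed to be (+,-,-,-)

def dot(a, b):
    """Lorentz scalar product of two 4-vectors with upper indices."""
    return sum(g * x * y for g, x, y in zip(METRIC, a, b))

def projectors(K, U=(1.0, 0.0, 0.0, 0.0)):
    """Mixed-index matrices (P_T)^mu_nu and (P_L)^mu_nu built from eqs. (14.83)-(14.84)."""
    K2 = dot(K, K)
    KU = dot(K, U)
    V = [K2 * u - KU * k for u, k in zip(U, K)]
    V2 = dot(V, V)
    P_T = [[0.0] * 4 for _ in range(4)]
    P_L = [[0.0] * 4 for _ in range(4)]
    for mu in range(4):
        for nu in range(4):
            delta = 1.0 if mu == nu else 0.0
            # lowering the second index multiplies by the diagonal metric entry
            P_L[mu][nu] = V[mu] * V[nu] * METRIC[nu] / V2
            P_T[mu][nu] = delta - K[mu] * K[nu] * METRIC[nu] / K2 - P_L[mu][nu]
    return P_T, P_L

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def trace(a):
    return sum(a[i][i] for i in range(4))

if __name__ == "__main__":
    P_T, P_L = projectors((1.3, 0.4, -0.2, 0.9))
    print(trace(P_T), trace(P_L))
```

Multiplying the mixed-index matrices with `matmul` implements the contractions of eq. (14.85); the traces count the two transverse and the single longitudinal degrees of freedom.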
In terms of these projectors, the most general photon polarization tensor is of the
form:
$$\Pi^{\mu\nu}(K) = P_T^{\mu\nu}\, \Pi_T(K) + P_L^{\mu\nu}\, \Pi_L(K) . \tag{14.87}$$

Note that in the presence of a heat bath, the functions $\Pi_{T,L}(K)$ may depend on the
four components of $K^\mu$ separately (in the vacuum, the corresponding function would
depend only on the Lorentz invariant $K^2$). This complication is due to the fact that
the thermal bath imposes a preferred frame that breaks Lorentz invariance. If the
photon self-energy is resummed on the propagator, one obtains the following dressed
propagator in a generic covariant gauge:
$$-D^{\mu\nu}(K) = \frac{P_T^{\mu\nu}}{K^2 + \Pi_T(K)} + \frac{P_L^{\mu\nu}}{K^2 + \Pi_L(K)} + \xi\, \frac{K^\mu K^\nu}{K^2} , \tag{14.88}$$
thanks to the orthogonality properties of these projectors (the gauge dependent term
in the propagator is not affected by the resummation). The two functions $\Pi_{T,L}(K)$
may be obtained from $\Pi^{\mu}{}_{\mu}$ and $\Pi^{00}$ by using
$$\Pi^{\mu}{}_{\mu} = 2\,\Pi_T + \Pi_L , \qquad \Pi^{00} = -\frac{\mathbf{k}^2}{K^2}\, \Pi_L . \tag{14.89}$$
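The fully traced polarization tensor, $\Pi^{\mu}{}_{\mu}$, is the easiest to evaluate:
$$\begin{aligned}
\Pi^{\mu}{}_{\mu}(K) &= e^2\, \sum\!\!\!\!\!\!\!\int_P \frac{\mathrm{tr}\big(\gamma^\mu (\slashed{P}-\slashed{K}) \gamma_\mu \slashed{P}\big)}{P^2\,(P-K)^2}\\
&= 4e^2\, \sum\!\!\!\!\!\!\!\int_P \Big[\frac{K^2}{P^2\,(P-K)^2} - \frac{2}{P^2}\Big] .
\end{aligned}\tag{14.90}$$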

The hard thermal loop approximation consists in assuming that the external momentum
K is much smaller than the temperature, that controls the typical loop momentum.
In this approximation, we have
$$\Pi^{\mu}{}_{\mu}(K) \underset{\rm HTL}{=} -8e^2\, \underbrace{\sum\!\!\!\!\!\!\!\int_P \frac{1}{P^2}}_{-\frac{T^2}{24}} = \frac{e^2 T^2}{3} . \tag{14.91}$$
The sum-integral in this expression has a very simple tadpole structure, but note that
the Matsubara frequencies are the fermionic ones, hence the result $-T^2/24$ for its
thermal contribution (instead of $T^2/12$ in the bosonic case). The 00 component is a
bit more complicated,
$$\begin{aligned}
\Pi^{00}(K) &= e^2\, \sum\!\!\!\!\!\!\!\int_P \frac{\mathrm{tr}\big(\gamma^0 (\slashed{P}-\slashed{K}) \gamma^0 \slashed{P}\big)}{P^2\,(P-K)^2}\\
&\underset{\rm HTL}{=} e^2\, \sum\!\!\!\!\!\!\!\int_P \Big[\frac{8 P_0 (P_0-K_0)}{P^2\,(P-K)^2} - \frac{4}{P^2}\Big]\\
&\underset{\rm HTL}{=} \frac{e^2 T^2}{3}\, \Big[1 - \frac{k_0}{2k}\, \ln\Big(\frac{k_0+k}{k_0-k}\Big)\Big] .
\end{aligned}\tag{14.92}$$

In the second line, we have dropped a non-HTL term in $K^2/(P^2 (P-K)^2)$, and in the
last line we have analytically continued the discrete Matsubara frequency to a real
energy, $K_0 \to i k_0$. Therefore, the transverse and longitudinal self-energies of the
photon in the HTL approximation read
$$\begin{aligned}
\Pi_T(K) &= \frac{e^2 T^2}{6}\, \frac{k_0}{k}\, \Big[\frac{k_0}{k} + \frac{1}{2}\Big(1 - \frac{k_0^2}{k^2}\Big) \ln\Big(\frac{k_0+k}{k_0-k}\Big)\Big] ,\\
\Pi_L(K) &= \frac{e^2 T^2}{3}\, \Big(1 - \frac{k_0^2}{k^2}\Big) \Big[1 - \frac{k_0}{2k}\, \ln\Big(\frac{k_0+k}{k_0-k}\Big)\Big] .
\end{aligned}\tag{14.93}$$
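These two functions can be evaluated directly, which gives a quick consistency check of eqs. (14.89), (14.91) and (14.93). The sketch below is illustrative only: the default units e = T = 1 are arbitrary, and the branch of the logarithm used below the light-cone is simply the principal branch of `cmath.log`, which is an assumption (the sign of the imaginary part depends on the prescription chosen for the analytic continuation).

```python
import cmath

def pi_L(k0, k, e=1.0, T=1.0):
    """Longitudinal HTL self-energy of eq. (14.93); requires k0 != k."""
    x = k0 / k
    log = cmath.log((k0 + k) / (k0 - k))
    return (e * T) ** 2 / 3.0 * (1.0 - x * x) * (1.0 - 0.5 * x * log)

def pi_T(k0, k, e=1.0, T=1.0):
    """Transverse HTL self-energy of eq. (14.93); requires k0 != k."""
    x = k0 / k
    log = cmath.log((k0 + k) / (k0 - k))
    return (e * T) ** 2 / 6.0 * x * (x + 0.5 * (1.0 - x * x) * log)

if __name__ == "__main__":
    # trace relation 2*Pi_T + Pi_L = e^2 T^2 / 3, above the light-cone
    print(2.0 * pi_T(1.7, 0.6) + pi_L(1.7, 0.6))
    # long wavelength limit (k << k0): both functions approach e^2 T^2 / 9
    print(pi_T(1.0, 0.01), pi_L(1.0, 0.01))
    # static limit (k0 -> 0): Pi_L approaches e^2 T^2 / 3
    print(pi_L(1e-9, 0.5))
```

The trace relation holds for any $(k_0, k)$ by construction, while the two limits show that the same function $\Pi_L$ yields $e^2T^2/9$ at long wavelength and $e^2T^2/3$ in the static limit.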

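Electron hard thermal loop : A similar approximation can be used for fermions
in QED. Due to the breaking of Lorentz invariance caused by the thermal bath, the
self-energy may be decomposed as
$$\slashed{\Sigma}(K) \equiv \alpha(K)\, \gamma^0 + \beta(K)\, \hat{\mathbf{k}}\cdot\boldsymbol{\gamma} , \tag{14.94}$$
where $\hat{\mathbf{k}} \equiv \mathbf{k}/|\mathbf{k}|$. Using the same method as above, one finds
$$\begin{aligned}
\mathrm{tr}\big(\slashed{K}\, \slashed{\Sigma}(K)\big) &= 4\,(K_0\, \alpha + k\, \beta) \underset{\rm HTL}{=} \frac{e^2 T^2}{2} ,\\
\mathrm{tr}\big(\gamma^0\, \slashed{\Sigma}(K)\big) &= 4\alpha \underset{\rm HTL}{=} \frac{e^2 T^2}{4k}\, \ln\Big(\frac{k_0+k}{k_0-k}\Big) .
\end{aligned}\tag{14.95}$$
Moreover, the HTL approximation leads to a fermion self-energy that does not depend
on the gauge chosen for the photon propagator. After summation of this self-energy
to all orders, the fermion propagator becomes
$$S(K) = \frac{\gamma^0 + \hat{\mathbf{k}}\cdot\boldsymbol{\gamma}}{2\,(k_0 - k - \Sigma_+)} + \frac{\gamma^0 - \hat{\mathbf{k}}\cdot\boldsymbol{\gamma}}{2\,(k_0 + k + \Sigma_-)} , \tag{14.96}$$
where $\Sigma_\pm \equiv \beta \pm \alpha$.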

Non-Abelian gauge theories : In the case of a non-Abelian gauge theory such as
QCD, the fermion self-energy is given by the same graph as in QED, the only change
being the substitution $e^2 \to g^2\, t_f^a t_f^a = g^2 (N^2-1)/(2N)$ (assuming fermions
in the fundamental representation of su(N)). Interestingly, although it is given
by four graphs (see the second line in the figure 14.5), the gluon self-energy in
the HTL approximation has the same form as the photon one, modulo the change
$e^2 \to g^2 (N + N_f/2)$ with $N_f$ the number of quark flavours. In addition, there are
Hard Thermal Loops specific to non-Abelian gauge theories, both in the n-gluon
function, and in the function with a quark-antiquark pair and n − 2 gluons, as shown
in the figure 14.5.

Figure 14.5: List of Hard Thermal Loops in QCD.

Quasi-particles : One of the effects of the summation of Hard Thermal Loops


on the boson and fermion propagators is to shift their poles, i.e. to modify their
dispersion relations (such modified excitations are called quasi-particles). This can be
seen from the fact that the self energies ΠT,L and Σ± do not vanish on the original
mass-shell, at k0 = ±k. This modification is due to the multiple interactions of a
particle with those of the surrounding bath6 , which tend to make it heavier than it
would be in the vacuum.
In the case of bosons, the dispersion relations of the transverse and longitudinal
modes become distinct (except at zero momentum), as shown in the left part of the
figure 14.6. Another peculiarity of this change of the dispersion curves is that it does
not correspond to a constant mass, but rather to a momentum dependent one (in the
figure, $m_\gamma^2 \equiv e^2 T^2/9$ denotes the squared mass of long wavelength excitations). Moreover,
the residue of the longitudinal pole vanishes exponentially for k ≫ T , which means
that this mode decouples at low temperature. This is indeed expected, since the
longitudinal mode is unphysical in the vacuum. Thus, the longitudinal mode is a
purely collective phenomenon, that exists only in the presence of a dense medium.
In contrast, the residue of the transverse pole goes to one at large momentum, and
one thus recovers in this limit the in-vacuum gauge boson propagator. Furthermore,
the self-energies in eqs. (14.93) are purely real above the light-cone (they can have
an imaginary part only when the argument of the logarithm is negative), which
implies that the shifted poles remain on the real axis. In other words, in the HTL
approximation, the gauge boson excitations remain infinitely long-lived.

6 The same happens to electrons in a crystal, due to their interactions with phonons.

Figure 14.6: Gluon (left) and quark (right) dispersion relations. [Left panel: p0/mγ versus p/mγ, with branches (T) and (L). Right panel: p0/mf versus p/mf, with branches (+) and (−).]
In the case of fermions, there are also two distinct modes, denoted (+) and (−),
that merge at zero momentum and $k_0 = m_f$, with $m_f^2 \equiv e^2 T^2/8$. The + mode is the analogue
of the zero temperature fermion, modified by the surrounding thermal bath (the residue
of this pole goes to one when k ≫ T ). In contrast, the − mode is a purely collective
mode (the corresponding residue vanishes exponentially at low temperature). As for
bosons, these fermionic modes have an infinite lifetime in the HTL approximation.

Debye screening : The Hard Thermal Loop correction to the gauge boson propa-
gator also encodes interesting phenomena in the space-like region. In particular, by
taking the zero frequency limit of the photon self-energy, and then its zero momentum
limit (in this order), one can determine how the Coulomb potential of a static electrical
charge is modified at long distance. Simply recall that the Coulomb potential is given
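by the Fourier transform of the longitudinal7 term in the propagator,
$$A^0(r) \sim \int \frac{d^3k}{(2\pi)^3}\, \frac{e^{i\mathbf{k}\cdot\mathbf{r}}}{\mathbf{k}^2 + \Pi_L(0,\mathbf{k})} . \tag{14.97}$$
At large distance, we need the small k behaviour of $\Pi_L(0,\mathbf{k})$, which is given by
$$\lim_{k\to 0} \Pi_L(k_0=0,\mathbf{k}) = \underbrace{\frac{e^2 T^2}{3}}_{m_D^2} . \tag{14.98}$$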
7 The transverse projector does not couple to the electromagnetic current of a static charge, e.g. an

infinitely heavy charged particle.



(The mass $m_D$ is called the Debye mass.) The Fourier transform then gives the
following Coulomb potential at long distance,
$$A^0(r) \sim \frac{e^{-m_D r}}{r} , \tag{14.99}$$
which is exponentially attenuated compared to the vacuum Coulomb potential of a
point-like charge. The inverse of the Debye mass characterizes the typical distance
beyond which this screening is sizeable. Physically, this phenomenon is due to the
fact that the test charge polarizes the charged medium surrounding it, by attracting in
its vicinity charges of the opposite sign. Because of this, a distant observer sees an
effective charge which is much smaller than the bare charge visible at short distance
(see the figure 14.7).

Figure 14.7: Debye screening in QED. [The sketch shows the screened potential $A^0(r) = e^{-m_D r}/r$.]
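For completeness, the Fourier transform behind the screened potential can be spelled out. This is the standard Yukawa integral, written here under the assumption that $\Pi_L(0,\mathbf{k})$ is replaced by the constant $m_D^2$, which is its value in the static limit:

```latex
\int\frac{d^3k}{(2\pi)^3}\,\frac{e^{i\mathbf{k}\cdot\mathbf{r}}}{\mathbf{k}^2+m_D^2}
 = \frac{1}{2\pi^2 r}\int_0^\infty dk\,\frac{k\,\sin(kr)}{k^2+m_D^2}
 = \frac{1}{2\pi^2 r}\cdot\frac{\pi}{2}\,e^{-m_D r}
 = \frac{e^{-m_D r}}{4\pi r}\,.
```

Setting $m_D = 0$ recovers the unscreened $1/(4\pi r)$ Coulomb law, so the whole effect of the medium on a static charge is the exponential factor.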

Landau damping : The last collective phenomenon included in Hard Thermal


Loops is Landau damping, which manifests itself in the fact that the HTL self-energies
have an imaginary part in the space-like region. This imaginary part indicates that a
wave propagating in such a dense medium is attenuated over distance scales of order
$(eT)^{-1}$. In the case of photons, the microscopic mechanism of this damping is the
absorption of a photon by the surrounding electrical charges, by a process such as
$e^-\gamma \to e^-$.

Sum rules : Propagators resummed with HTL self-energies should be used in


processes involving soft momenta of order eT or below, in order to capture the main
collective effects. However, given the explicit form of the self-energies (recall for
instance eqs. (14.93)), the use of these dressed propagators complicates significantly
the calculations in which they appear. Nevertheless, some integrals in which these
propagators enter can be evaluated in closed form by exploiting their analytical
properties.

Let us return to Minkowski spacetime, and consider the retarded resummed
propagators, defined as
$$\Delta^R_{T,L}(k_0,\mathbf{k}) \equiv \frac{i}{k_0^2 - \mathbf{k}^2 - \Pi_{T,L}(k_0,\mathbf{k}) + i k_0 0^+} . \tag{14.100}$$

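This propagator admits the following spectral representation,
$$\frac{i}{k_0^2 - \mathbf{k}^2 - \Pi_{T,L}(k_0,\mathbf{k}) + i k_0 0^+} = \int_{-\infty}^{+\infty} \frac{d\omega}{2\pi}\; \omega\, \rho_{T,L}(\omega,\mathbf{k})\, \frac{i}{k_0^2 - \omega^2 + i k_0 0^+} , \tag{14.101}$$
where the spectral function $\rho_{T,L}$ is defined by
$$\rho_{T,L}(k_0,\mathbf{k}) \equiv 2\, i\; \mathrm{Im}\, \Delta^R_{T,L}(k_0,\mathbf{k}) . \tag{14.102}$$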

From eq. (14.101), we may derive other useful integrals that contain the spectral
functions $\rho_{T,L}$. The starting point is to take the imaginary part of eq. (14.101), by
denoting ω ≡ kx and $k_0$ ≡ ky, which gives the following identity
$$\int_{-\infty}^{+\infty} \frac{dx}{2\pi}\; x\, \rho_{T,L}(kx,\mathbf{k})\; \mathrm{P}\Big[\frac{1}{y^2 - x^2}\Big] = \frac{k^2 (y^2-1) - \mathrm{Re}\, \Pi_{T,L}(ky,\mathbf{k})}{\big(k^2 (y^2-1) - \mathrm{Re}\, \Pi_{T,L}(ky,\mathbf{k})\big)^2 + \big(\mathrm{Im}\, \Pi_{T,L}(ky,\mathbf{k})\big)^2} . \tag{14.103}$$

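Various interesting integrals can then be obtained by taking special values of y. With
y = 0, we obtain
$$\int_{-\infty}^{+\infty} \frac{dx}{2\pi}\, \frac{\rho_T(kx,\mathbf{k})}{x} = \frac{1}{k^2} , \qquad \int_{-\infty}^{+\infty} \frac{dx}{2\pi}\, \frac{\rho_L(kx,\mathbf{k})}{x} = \frac{1}{k^2 + m_D^2} , \tag{14.104}$$
while y = +∞ leads to
$$\int_{-\infty}^{+\infty} \frac{dx}{2\pi}\; x\, \rho_{T,L}(kx,\mathbf{k}) = \frac{1}{k^2} . \tag{14.105}$$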

Let us also mention another exact integral involving the HTL photon self-energies,
$$\int_0^1 \frac{dx}{x}\, \frac{2\, \mathrm{Im}\, \Pi(x)}{\big(z + \mathrm{Re}\, \Pi(x)\big)^2 + \big(\mathrm{Im}\, \Pi(x)\big)^2} = \pi \Big[\frac{1}{z + \mathrm{Re}\, \Pi(\infty)} - \frac{1}{z + \mathrm{Re}\, \Pi(0)}\Big] , \tag{14.106}$$

where Π(x) is any of ΠT,L (kx, k) (which does not depend on k since the bosonic
HTL self-energies depend only on the ratio k0 /k). The values at x = ∞ and x = 0
of these self-energies that appear in the right-hand side are easily determined from
eqs. (14.93). This integral, where the value of x is bounded by one, appears in
the scattering cross-section of a hard particle on a particle of the thermal bath, by
exchange of a soft photon (the momentum of this photon is space-like, hence |x| ≤ 1).

Relevant physical scales : When discussing the physics of a weakly coupled system
of particles at high temperature T (much larger than the masses), it is useful to have
in mind the following hierarchy of length scales:

• ℓ = T^{-1}. T is the typical momentum of a particle in this system, and its
inverse is the typical separation between two neighboring particles. At shorter
distance scales, a particle behaves exactly as if it were in the vacuum. This is
why ultraviolet renormalization at non-zero temperature can be done with the
zero-temperature counterterms.

• ℓ = (gT)^{-1}. This is the typical distance over which a particle “feels” modifications
of its dispersion relation. Besides the appearance of a thermal gap in the
spectrum of gauge bosons and matter fields, the HTL self-energies also encode
Debye screening and Landau damping.

• ℓ = (g²T)^{-1}. This is the mean distance between scatterings with a soft colour
exchange. These are forward scatterings, since the momentum transfer (of
order gT, the scale of the infrared cutoff provided by the dressing of the gluon
propagator) is much smaller than the momentum of the incoming particles
(typically T). A rough way to obtain this scale is by estimating the corresponding
scattering rate:
\[
\Gamma_{\text{soft collisions}}
= \Big|\,[\text{diagram: exchange with momentum transfer } p_\perp]\,\Big|^2
\sim g^4 T^3 \int_{gT \lesssim p_\perp} \frac{d^2 p_\perp}{p_\perp^4} \sim g^2 T , \tag{14.107}
\]

where p⊥ is the momentum transfer transverse to the momentum of the incom-


ing particles. Although these scatterings do not lead to an appreciable transport
of momentum, they reshuffle the colour of the particles and hence contribute to
the colour conductivity.

• ℓ = (g⁴T)^{-1}. This is the mean distance between scatterings with a momentum
transfer of order T, i.e. those that scatter particles at large angles. Estimating
this scale is done as above, but with a lower limit of order T for the momentum
transfer:
\[
\Gamma_{\text{hard collisions}}
= \Big|\,[\text{diagram: exchange with momentum transfer } p_\perp]\,\Big|^2
\sim g^4 T^3 \int_{T \lesssim p_\perp} \frac{d^2 p_\perp}{p_\perp^4} \sim g^4 T . \tag{14.108}
\]
This scale is usually called the mean free path. This is the relevant scale for
all transport phenomena that require significant momentum exchanges, for
instance the viscosity. Beyond this scale is the realm of collective effects such
as sound waves (on these scales, it is more appropriate to describe the system
as a fluid rather than in terms of elementary field excitations).

[Figure 14.8: Relevant distance scales in a relativistic plasma at high temperature (1/gT, 1/g²T, 1/g⁴T).]
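The hierarchy of scales and the two rate estimates can be checked with a few lines of Python. This is a rough sketch rather than a calculation from the text: all numerical prefactors of order one are dropped, the function names are illustrative, and the upper limit of the p⊥ integral is cut off at 10³ times the lower limit (the tail beyond that is negligible).

```python
import math

def length_scales(g: float, T: float = 1.0) -> dict:
    """Parametric length scales of the plasma (order-one prefactors dropped)."""
    return {
        "1/T": 1.0 / T,
        "1/(gT)": 1.0 / (g * T),
        "1/(g^2 T)": 1.0 / (g**2 * T),
        "1/(g^4 T)": 1.0 / (g**4 * T),
    }

def scattering_rate(g: float, T: float, p_min: float, steps: int = 20000) -> float:
    """g^4 T^3 times the integral of d^2p / p^4 over p > p_min.

    Midpoint rule in u = ln p; the upper cutoff 1e3 * p_min is an assumption.
    """
    u_lo, u_hi = math.log(p_min), math.log(1.0e3 * p_min)
    h = (u_hi - u_lo) / steps
    total = 0.0
    for i in range(steps):
        p = math.exp(u_lo + (i + 0.5) * h)
        total += 2.0 * math.pi / p**2  # d^2p / p^4 = 2 pi du / p^2
    return g**4 * T**3 * total * h

if __name__ == "__main__":
    g, T = 0.1, 1.0
    print(length_scales(g, T))
    # Lower limit gT: the rate scales like g^2 T (soft collisions).
    print("soft rate / (g^2 T):", scattering_rate(g, T, g * T) / (g**2 * T))
    # Lower limit T: the rate scales like g^4 T (hard collisions).
    print("hard rate / (g^4 T):", scattering_rate(g, T, T) / (g**4 * T))
```

Both ratios come out as the same pure number (π in this normalization), which shows that the only difference between the two rates is the lower limit of the momentum transfer.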

Perturbative and non-perturbative modes : Although it is in principle possible


to study any phenomenon at finite temperature in terms of the bare Lagrangian, this
becomes increasingly difficult at large distance because of non-trivial in-medium
effects. In order to circumvent this difficulty, various resummation schemes and
effective descriptions have been devised, one of which is the resummation of HTLs
discussed above.
Our goal here is not to give a detailed account of these various techniques, but to
provide general principles regarding what can and cannot be treated perturbatively,
focusing on gauge bosons. Let us first recall that a mode is perturbative if its kinetic
energy dominates its interaction energy. For a mode of momentum k, the kinetic
energy of a gauge field can be estimated as
\[
K \sim \langle (\partial A)^2 \rangle \sim k^2 \langle A^2 \rangle . \tag{14.109}
\]
For the interaction energy, we have
\[
I \sim g^2 \langle A^4 \rangle \sim g^2 \langle A^2 \rangle^2 . \tag{14.110}
\]
(The second part of the equation is of course not exact, but it gives the correct order
of magnitude.) Thus, a mode of momentum k is perturbative if k² ≫ g²⟨A²⟩. When
discussing the order of magnitude of ⟨A²⟩, it is useful to distinguish the contribution
of the various momentum scales by defining
\[
\langle A^2 \rangle_{\kappa^*} \sim \int^{\kappa^*} \frac{d^3 p}{E_p}\; n_B(E_p) ,
\]
the contribution of all the thermal modes up to the scale κ*. From these considerations,
we can now distinguish three types of modes:
• Hard modes : k ∼ T. For these modes, we have ⟨A²⟩_T ∼ T², and K ≫ I.
They are therefore fully perturbative.
• Soft modes : k ∼ gT. For these modes, k² ∼ g²⟨A²⟩_T, which implies that
the soft modes interact strongly with the hard modes. However, we also have
⟨A²⟩_{gT} ∼ gT², so that k² ≫ g²⟨A²⟩_{gT}. Thus, the soft modes interact
perturbatively among themselves. Consequently, it is possible to describe
the soft modes perturbatively, provided one has first performed a resummation
of the contribution of the hard modes. Screened perturbation theory is a
realization of this idea.
• Ultrasoft modes : k ∼ g²T. For these modes, we have ⟨A²⟩_{g²T} ∼ g²T², so
that k² ∼ g²⟨A²⟩_{g²T}. Therefore, the ultrasoft modes interact non-perturbatively
among themselves, and there is no way to treat them in a perturbative
approach. A non-perturbative approach, such as lattice field theory, is necessary
for this.
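The three regimes can be illustrated numerically. The sketch below makes several assumptions that are not in the text: the modes are massless (E_p = p), the angular integral is done trivially, factors of 2π are dropped, and the function names are hypothetical. It evaluates the ratio g²⟨A²⟩_κ / k², which must be small for a mode of momentum k to be perturbative with respect to the thermal modes below κ.

```python
import math

def a2_thermal(kappa: float, T: float, steps: int = 20000) -> float:
    """<A^2>_kappa ~ 4 pi * int_0^kappa dp p n_B(p), for massless modes."""
    h = kappa / steps
    total = 0.0
    for i in range(steps):
        p = (i + 0.5) * h                 # midpoint rule
        total += p / math.expm1(p / T)    # p * n_B(p)
    return 4.0 * math.pi * total * h

def coupling_ratio(g: float, T: float, k: float, kappa: float) -> float:
    """g^2 <A^2>_kappa / k^2: much smaller than 1 means perturbative."""
    return g**2 * a2_thermal(kappa, T) / k**2

if __name__ == "__main__":
    g, T = 0.01, 1.0
    print("hard modes vs hard modes      :", coupling_ratio(g, T, T, T))
    print("soft modes vs hard modes      :", coupling_ratio(g, T, g * T, T))
    print("soft modes vs soft modes      :", coupling_ratio(g, T, g * T, g * T))
    print("ultrasoft vs ultrasoft modes  :", coupling_ratio(g, T, g**2 * T, g**2 * T))
```

With these conventions the soft-soft ratio scales like g and vanishes at weak coupling, while the ultrasoft-ultrasoft ratio is a pure number that does not decrease with g. The specific order-one values should not be taken literally, only their dependence on g.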

14.4 Out-of-equilibrium systems


Until now, we have discussed only systems in equilibrium, whose initial state is
described by the canonical density operator ρ ≡ exp(−βH). However, many interesting
questions can also be asked about a system that is not initially in thermal
equilibrium, foremost among them how it relaxes towards equilibrium.
In this section, we discuss a few aspects of the quantum field theory treatment of
out-of-equilibrium systems.

14.4.1 Pathologies of the naive approach


Firstly, let us note that the Matsubara formalism does not seem amenable to a simple
out-of-equilibrium generalization, since the KMS symmetry (that encodes into the
correlation functions the fact that the system is in equilibrium) is in a sense hardwired
into the discrete Matsubara frequencies.
The Schwinger-Keldysh formalism appears to be a more adequate starting point
for such a generalization. Let us first discuss a simple extension that does not work,
because the reasons for its failure will teach us a useful lesson. Since in eqs. (14.53),
the only reference to the statistical state of the system is contained in the Bose-Einstein
distribution n_B(E_p), we may try to replace it by an arbitrary distribution f(p) that
describes the particle distribution in an out-of-equilibrium system⁸:
\[
\forall \epsilon, \epsilon' = \pm , \qquad
G^0_{\epsilon\epsilon'}(p) = \Big[ G^0_{\epsilon\epsilon'}(p) \Big]_{T=0} + 2\pi\, f(p)\, \delta(p^2 - m^2) . \tag{14.111}
\]

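Consider now the insertion of a self-energy Σ on the bare propagator,
\[
[\text{diagram: bare propagator with one insertion of } \Sigma]
= \sum_{\epsilon,\epsilon'=\pm} G^0_{+\epsilon}(p)\; \Sigma_{\epsilon\epsilon'}(p)\; G^0_{\epsilon'+}(p) . \tag{14.112}
\]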

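Such an expression is delicate to expand, because it involves products of distributions
that are notoriously ill-defined, such as δ²(p² − m²). Let us first determine which of
these products are well defined and which are not. For this, let us write
\[
\begin{aligned}
\Big[ i\,\mathrm{P}\Big(\frac{1}{z}\Big) + \pi\delta(z) \Big]^2
&= \pi^2 \delta^2(z) - \Big[\mathrm{P}\Big(\frac{1}{z}\Big)\Big]^2 + 2i\pi\,\delta(z)\,\mathrm{P}\Big(\frac{1}{z}\Big) \\
&= \Big[ \frac{i}{z + i0^+} \Big]^2 = -i\, \frac{d}{dz} \Big[ \frac{i}{z + i0^+} \Big] \\
&= -i\, \frac{d}{dz} \Big[ i\,\mathrm{P}\Big(\frac{1}{z}\Big) + \pi\delta(z) \Big]
= \Big[\mathrm{P}\Big(\frac{1}{z}\Big)\Big]' - i\pi\,\delta'(z) .
\end{aligned} \tag{14.113}
\]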

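From this exercise, we obtain the following two identities:
\[
\begin{aligned}
\pi^2 \delta^2(z) - \Big[\mathrm{P}\Big(\frac{1}{z}\Big)\Big]^2 &= \Big[\mathrm{P}\Big(\frac{1}{z}\Big)\Big]' , \\
2\,\delta(z)\, \mathrm{P}\Big(\frac{1}{z}\Big) &= -\delta'(z) .
\end{aligned} \tag{14.114}
\]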
⁸ Note that this would not encompass the most general initial states, but only those for which the
initial correlations are only 2-point correlations.


Since the derivative of a distribution is well-defined, this indicates that certain products
(or combinations of products) of delta functions and principal values are well defined,
but not all of them (for instance, the product δ²(z) makes no sense).
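These statements can be checked with a regularized version of the distributions. The sketch below is an illustration under an assumption that is not in the text: the infinitesimal 0⁺ is replaced by a small finite width η, so that δ(z) becomes a Lorentzian and P(1/z) becomes z/(z² + η²). With this choice i/(z + iη) = i P_η(z) + π δ_η(z) holds exactly, and both identities are satisfied at any η, while δ_η² on its own grows like 1/η² at z = 0.

```python
import math

def delta_eta(z: float, eta: float) -> float:
    """Lorentzian of width eta; tends to delta(z) when eta -> 0+."""
    return (eta / math.pi) / (z * z + eta * eta)

def pv_eta(z: float, eta: float) -> float:
    """z / (z^2 + eta^2); tends to P(1/z) when eta -> 0+."""
    return z / (z * z + eta * eta)

def derivative(f, z: float, eta: float, h: float = 1e-5) -> float:
    """Central finite difference with respect to z."""
    return (f(z + h, eta) - f(z - h, eta)) / (2.0 * h)

def residuals(z: float, eta: float):
    """Left-hand side minus right-hand side of the two identities."""
    r1 = (math.pi**2 * delta_eta(z, eta)**2 - pv_eta(z, eta)**2
          - derivative(pv_eta, z, eta))
    r2 = 2.0 * delta_eta(z, eta) * pv_eta(z, eta) + derivative(delta_eta, z, eta)
    return r1, r2

if __name__ == "__main__":
    for eta in (0.2, 0.1):
        print("eta =", eta, "residuals at z = 0.3:", residuals(0.3, eta))
        # The square of the regularized delta function alone has no limit:
        print("   delta_eta(0)^2 =", delta_eta(0.0, eta)**2)
```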
Returning now to eq. (14.112) and expanding the propagators, we see that it
contains terms that are ill-defined:
\[
[\text{diagram: bare propagator with one insertion of } \Sigma]
= \Big[ \text{well defined distributions} \Big]
+ \pi^2 \delta^2(p^2 - m^2) \Big[ (1 + f(p))\,\Sigma_{+-} - f(p)\,\Sigma_{-+} \Big] , \tag{14.115}
\]
where we have used the first of eqs. (14.54) in order to simplify the combination of
self-energies that appears in the square bracket. Note that the square bracket vanishes
in equilibrium thanks to the KMS symmetry. We are thus facing a very peculiar
pathology, that exists only out of equilibrium.
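We may learn a bit more about this issue by formally resumming the self-energy
Σ on the propagator. Let us introduce the following notations:
\[
\mathbb{G}^0 \equiv \begin{pmatrix} G^0_{++} & G^0_{+-} \\ G^0_{-+} & G^0_{--} \end{pmatrix} , \quad
\mathbb{D} \equiv \begin{pmatrix} G^0_F & 0 \\ 0 & G^{0*}_F \end{pmatrix} , \quad
\mathbb{S} \equiv \begin{pmatrix} \Sigma_{++} & \Sigma_{+-} \\ \Sigma_{-+} & \Sigma_{--} \end{pmatrix} , \tag{14.116}
\]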

and consider the resummed propagator defined by
\[
\mathbb{G} \equiv \sum_{n=0}^{\infty} \Big[ \mathbb{G}^0 (-i\,\mathbb{S}) \Big]^n\, \mathbb{G}^0 . \tag{14.117}
\]

A straightforward calculation shows that
\[
\mathbb{G} = U \begin{pmatrix} G_F & G_F\, \widetilde{\Sigma}\, G_F^* \\ 0 & G_F^* \end{pmatrix} U , \tag{14.118}
\]

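where U is the matrix defined in eq. (14.56), but with f(p) instead of the Bose-Einstein
distribution, and where we have used the following notations
\[
\begin{aligned}
G_F(p) &\equiv \frac{i}{p^2 - m^2 - \Sigma_F + i\epsilon} , \\
\Sigma_F &\equiv \Sigma_{++} + \Sigma_{+-} , \\
\widetilde{\Sigma} &\equiv \frac{1}{1 + f(p)} \Big[ (1 + f(p))\,\Sigma_{+-} - f(p)\,\Sigma_{-+} \Big] .
\end{aligned} \tag{14.119}
\]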

Note that the Feynman propagator and its complex conjugate have mirror poles on
each side of the real energy axis. If the self-energy Σ_F has no imaginary part, then
these poles “pinch” the real axis and lead to a singularity (this is in fact a pathology
of the same nature as the product δ² in eq. (14.115)). By performing explicitly the
multiplication with the matrix U, we obtain the resummed propagator in the following
form:
\[
\begin{aligned}
G_{\epsilon\epsilon'}(p) = {} & \Big[ G^0_{\epsilon\epsilon'}(p) \Big]_{T=0} + 2\pi\, f(p)\, \delta(p^2 - m^2) \\
& + \Big[ (1 + f(p))\,\Sigma_{+-} - f(p)\,\Sigma_{-+} \Big]\, G_F(p)\, G_F^*(p) .
\end{aligned} \tag{14.120}
\]

Since it does not depend on the indices ǫǫ ′ , the pathological term (on the second line)
appears on the same footing as the second term, that contains the distribution f(p).
Thus, the lesson of this calculation is that one may consider hiding this pathology into
a redefinition of the distribution f(p). However, the naive formalism that we have
tried to use so far is not adequate for doing this consistently, and must be amended in
a number of ways:

• The initial time ti should not be taken to −∞, as is done when using the
Schwinger-Keldysh formalism in momentum space. Indeed, this is the time at
which the system was prepared in an out-of-equilibrium state. If it were equal
to −∞, the system would have had an infinite amount of time for relaxing to
equilibrium at the finite time where a measurement is performed. Note that
observables will in general depend on the initial time ti , in contrast with what
happens in equilibrium.

• The Schwinger-Keldysh formalism in momentum space assumes that the system
is invariant under translations, in particular in the time direction. This is clearly
not the case when the system starts out of equilibrium, since it is expected
to evolve towards equilibrium. Thus, one should stick to the formalism in
coordinate space.

14.4.2 Kadanoff-Baym equations

The Kadanoff-Baym equations, that we shall derive now, may be viewed as a kind
of quantum kinetic equations. These equations are exact, but contain a self-energy
that must be truncated to a manageable number of diagrams in order to be usable in
practical applications. In the next subsection, we will show how the traditional kinetic
equations can be derived from the Kadanoff-Baym equations.
The starting point is the Dyson-Schwinger equation, written in coordinate space,
that expresses the resummation of a self-energy on the propagator:
\[
\begin{aligned}
G(x,y) &= G^0(x,y) + \int_C d^4u\, d^4v\; G^0(x,u)\, \big( -i\Sigma(u,v) \big)\, G(v,y) , \\
G(x,y) &= G^0(x,y) + \int_C d^4u\, d^4v\; G(x,u)\, \big( -i\Sigma(u,v) \big)\, G^0(v,y) ,
\end{aligned} \tag{14.121}
\]

where G⁰ is the free propagator and G is the resummed one. Note that the time
integrations run over the Schwinger-Keldysh contour C. Here, we have written the
equation in two ways, depending on whether the self-energy is inserted on the right or
on the left of the bare propagator (in the end, the resulting propagator G is the same in
both cases). Next, we apply the operator □_x + m² on the first equation and □_y + m²
on the second equation. This eliminates the bare propagators, and we obtain:
\[
\begin{aligned}
(\Box_x + m^2)\, G(x,y) &= -i\,\delta_c(x - y) - \int_C d^4v\; \Sigma(x,v)\, G(v,y) , \\
(\Box_y + m^2)\, G(x,y) &= -i\,\delta_c(x - y) - \int_C d^4v\; G(x,v)\, \Sigma(v,y) ,
\end{aligned} \tag{14.122}
\]

where δ_c(x − y) is the generalization of the delta function to the contour C. This is
one of the forms of the Kadanoff-Baym equations.

14.4.3 From QFT to kinetic theory

Kinetic theory is an approximation of the underlying dynamics in terms of a
space-time dependent distribution of particles f(x, p). One may note right away that
this is necessarily an approximate description, because it is not possible to define
simultaneously the position and momentum of a particle.
In the Kadanoff-Baym equations (14.122), the dressed propagator G and the
self-energy Σ are in general not invariant under translations, precisely because the
system is out of equilibrium. Therefore, one may not Fourier transform them in the
usual way. Instead, one uses a Wigner transform, defined as follows
\[
F(X,p) \equiv \int d^4s\; e^{ip\cdot s}\; F\Big( X + \frac{s}{2}, X - \frac{s}{2} \Big) , \tag{14.123}
\]
where F(x, y) is a generic 2-point function (we use the same symbol for its Wigner
transform, since the arguments are sufficient to distinguish them). In other words,
the Wigner transform is a usual Fourier transform with respect to the separation
s ≡ x − y, and the result still depends on the mid-point X ≡ (x + y)/2. Note that in
eq. (14.123), the time integration is over the real axis, not over the contour C. Wigner
transforms do not share the simple behaviour of Fourier transforms with respect to
convolution. Given two 2-point functions F and G, let us define:
\[
H(x,y) \equiv \int d^4z\; F(x,z)\, G(z,y) . \tag{14.124}
\]

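The Wigner transform of H is given by
\[
H(X,p) = F(X,p)\, \exp\Big\{ \frac{i}{2} \Big( \overleftarrow{\partial}_X \overrightarrow{\partial}_p
- \overrightarrow{\partial}_X \overleftarrow{\partial}_p \Big) \Big\}\, G(X,p) , \tag{14.125}
\]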
where the arrows indicate on which side the corresponding derivative acts. The right
hand side of this formula reduces to the ordinary product of the transforms when there
is no X dependence, i.e. when the functions F and G are translation invariant. The
first correction to the translation invariant case is proportional to the Poisson bracket
of F and G,
\[
H(X,p) = F(X,p)\, G(X,p) + \frac{i}{2} \big\{ F(X,p), G(X,p) \big\} + \cdots . \tag{14.126}
\]
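The derivatives with respect to x and y that appear in the Kadanoff-Baym equations
can be written in terms of derivatives with respect to X and s :
\[
\begin{aligned}
\partial_x &= \frac{1}{2} \partial_X + \partial_s , &
\partial_y &= \frac{1}{2} \partial_X - \partial_s , \\
\Box_x &= \frac{1}{4} \Box_X + \partial_X \cdot \partial_s + \Box_s , &
\Box_y &= \frac{1}{4} \Box_X - \partial_X \cdot \partial_s + \Box_s .
\end{aligned} \tag{14.127}
\]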

In these operators, the Wigner transform just amounts to a substitution
\[
\partial_s \to -ip , \qquad \Box_s \to -p^2 . \tag{14.128}
\]

In order to go from the Kadanoff-Baym equations to kinetic equations, two


approximations are necessary:

1. Gradient approximation : p ∼ ∂_s ≫ ∂_X. The derivatives with respect
to the mid-point X characterize the space and time scales over which the
properties of the system (e.g. its particle distribution) change significantly.
This approximation therefore means that these scales, that characterize the
departure of the system from equilibrium, should be much larger than the De Broglie
wavelength of the particles. Another way to state this approximation is that
the mean free path in the system should be much larger than the wavelength
of the particles, which amounts to a certain diluteness of the system. Using
this approximation in the two Kadanoff-Baym equations (14.122), taking their
difference, and breaking it down into its ++, −−, +− and −+ components,
one obtains
\[
\begin{aligned}
-2i\, p \cdot \partial_X \big( G_{+-}(X,p) - G_{-+}(X,p) \big) &= 0 , \\
-2i\, p \cdot \partial_X \big( G_{+-}(X,p) + G_{-+}(X,p) \big) &= 2 \big[ G_{-+} \Sigma_{+-} - G_{+-} \Sigma_{-+} \big] .
\end{aligned} \tag{14.129}
\]

2. Quasi-particle approximation : This approximation consists in assuming
that the dressed propagators G_{εε′} can be written in terms of a local particle
distribution f(X, p) as in eqs. (14.111). This is equivalent to
\[
\begin{aligned}
G_{-+}(X,p) &= (1 + f(X,p))\, \rho(X,p) , \\
G_{+-}(X,p) &= f(X,p)\, \rho(X,p) ,
\end{aligned} \tag{14.130}
\]

where ρ(X, p) ≡ G−+ (X, p) − G+− (X, p). This would be exact for non-
interacting, infinitely long-lived, particles. In the presence of interactions, the
approximation is justified when the time between two collisions of a particle is
large compared to its wavelength.

Using eqs. (14.129) and (14.130), we obtain an equation for f(X, p), which is
nothing but a Boltzmann equation:
\[
\big[ \partial_t + \boldsymbol{v}_p \cdot \nabla_x \big]\, f(X,p)
= \underbrace{\frac{i}{2E_p} \Big[ (1 + f(X,p))\,\Sigma_{+-} - f(X,p)\,\Sigma_{-+} \Big]}_{\mathcal{C}_p[f;X]} , \tag{14.131}
\]
where v_p ≡ p/E_p is the velocity vector for particles of momentum p. Note that the
Boltzmann equation is spatially local since all the objects it contains are evaluated
at the coordinate X, but its right hand side is non-local in momentum. The right
hand side, C_p[f; X], is called the collision term. The combination ∂_t + v_p · ∇_x that
appears in the left hand side is called the transport derivative. It is zero on any function
whose t and x dependence arises only in the combination x − v_p t (this is the case
for a distribution of non-interacting particles, that move at the constant velocity v_p
prescribed by their momentum).
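The intermediate steps between eqs. (14.129), (14.130) and the Boltzmann equation can be sketched as follows. This assumes that the momentum is on-shell with p⁰ = E_p, and that the metric convention is such that p · ∂_X = p⁰ ∂_t + **p** · ∇_x.

```latex
\begin{gather*}
G_{+-} - G_{-+} = -\rho \, , \qquad G_{+-} + G_{-+} = (1+2f)\,\rho \, , \\
\text{first equation:}\quad p\cdot\partial_X\,\rho = 0 \, , \\
\text{second equation:}\quad
-2i\,p\cdot\partial_X\big[(1+2f)\,\rho\big] = -4i\,\rho\;p\cdot\partial_X f
 = 2\rho\,\big[(1+f)\,\Sigma_{+-} - f\,\Sigma_{-+}\big] \, , \\
p\cdot\partial_X f = \frac{i}{2}\big[(1+f)\,\Sigma_{+-} - f\,\Sigma_{-+}\big] \, , \qquad
p\cdot\partial_X = E_p\,\big(\partial_t + \boldsymbol{v}_p\cdot\nabla_x\big) \, .
\end{gather*}
```

Dividing the last relation by E_p gives the transport derivative on the left and the factor i/(2E_p) on the right.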

In order to obtain an explicit expression of the collision term, it is necessary to
truncate the self-energies to a certain order (usually, the lowest order that gives a
non-zero result) in the loop expansion. In a scalar theory with a φ⁴ interaction, the
self-energies should be evaluated at two loops,
\[
\Sigma = [\text{two-loop self-energy diagram}] . \tag{14.132}
\]

Using the Feynman rules of the Schwinger-Keldysh formalism, this diagram leads to
the following collision term
\[
\begin{aligned}
\mathcal{C}_p[f;X] = {} & \frac{\lambda^2}{4E_p} \int
\frac{d^3p_1}{(2\pi)^3 2E_1}\, \frac{d^3p_2}{(2\pi)^3 2E_2}\, \frac{d^3p_3}{(2\pi)^3 2E_3}\;
(2\pi)^4 \delta(p + p_3 - p_1 - p_2) \\
& \times \Big[ f(X,p_1)\, f(X,p_2)\, (1 + f(X,p_3))\, (1 + f(X,p)) \\
& \qquad - f(X,p_3)\, f(X,p)\, (1 + f(X,p_1))\, (1 + f(X,p_2)) \Big] .
\end{aligned} \tag{14.133}
\]
This expression describes the rate of change of the particle distribution, under the
effect of 2-body elastic collisions. It is the difference between a production rate
(coming from the term in which the particle of momentum p is produced, and thus
weighted by a factor 1 + f(X, p)) and a destruction rate (from the term in which the
particle of momentum p is destroyed, and has a weight f(X, p)).
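A quick consistency check of the statistical factors is that production and destruction balance exactly for a Bose-Einstein distribution when energy is conserved in the collision, so that the equilibrium distribution is a fixed point. The sketch below is only an illustration: the energies are arbitrary sample values, the function names are hypothetical, and the non-thermal distribution is an arbitrary choice used for contrast.

```python
import math

def bose(E: float, T: float = 1.0) -> float:
    """Bose-Einstein distribution at temperature T (zero chemical potential)."""
    return 1.0 / math.expm1(E / T)

def gain_minus_loss(E1: float, E2: float, E3: float, E: float, f) -> float:
    """Statistical factor of the collision term for p1 + p2 <-> p3 + p."""
    gain = f(E1) * f(E2) * (1.0 + f(E3)) * (1.0 + f(E))   # p is produced
    loss = f(E3) * f(E) * (1.0 + f(E1)) * (1.0 + f(E2))   # p is destroyed
    return gain - loss

if __name__ == "__main__":
    E1, E2, E3 = 1.3, 0.7, 0.9
    E = E1 + E2 - E3                      # energy conservation
    print("thermal     :", gain_minus_loss(E1, E2, E3, E, bose))
    print("non-thermal :", gain_minus_loss(E1, E2, E3, E, lambda e: 1.0 / (1.0 + e * e)))
```

The cancellation in the thermal case follows from f/(1 + f) = exp(−E/T), which turns the ratio of the two terms into exp(−(E1 + E2 − E3 − E)/T) = 1.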
To close this section, let us mention an additional term that arises when the self-
energy contains a local part, i.e. a term proportional to a delta function in space-time:

\[
\Sigma(u,v) = \Phi(u)\, \delta_c(u - v) + \Pi(u,v) . \tag{14.134}
\]

When such a local term is present, the difference of the two Kadanoff-Baym equations
contains Φ(y)G(x, y) − Φ(x)G(x, y), whose Wigner transform at lowest order in
the gradient approximation is
\[
i\, \big( \partial_X \Phi(X) \big) \cdot \big( \partial_p G(X,p) \big) . \tag{14.135}
\]

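This extra term leads to a somewhat modified Boltzmann equation,
\[
\big[ \partial_t + \boldsymbol{v}_p \cdot \nabla_x \big] f
+ \underline{\frac{1}{2E_p}\, \partial_X \Phi \cdot \partial_p f}
= \frac{i}{2E_p} \Big[ (1 + f)\, \Pi_{+-} - f\, \Pi_{-+} \Big] . \tag{14.136}
\]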

In the new term (underlined), one may interpret ∂X Φ as a mean force field acting on
the particles. Under the action of this force, the particles accelerate which implies a
change of their momentum. The left hand side of the above equation thus describes
the change of the distribution of particles under the effect of this mean field, in the
absence of any collisions (that are described by the right hand side).
Chapter 15

Strong fields and semi-classical methods

15.1 Introduction
Until now, all our discussion of quantum field theory has been centered on an
expansion about the vacuum, i.e. on situations involving a system with few particles.
This is also a regime in which the fields are in a certain sense¹ small. The connection
between the field amplitude and the density of particles in a state may be grasped
by writing the LSZ reduction formula that gives the expectation value of the number
operator for a system whose initial state is Φ_in. By mimicking the derivation of
section 1.4, one easily obtains
\[
\begin{aligned}
\langle \Phi_{\rm in} |\, a^\dagger_{p,{\rm out}}\, a_{p,{\rm out}}\, | \Phi_{\rm in} \rangle
&= \frac{1}{Z} \int d^4x\, d^4y\; e^{ip\cdot(x-y)}\, (\Box_x + m^2)(\Box_y + m^2)\;
\langle \Phi_{\rm in} |\, \phi(x)\phi(y)\, | \Phi_{\rm in} \rangle , \\
\langle \Phi_{\rm in} |\, \phi(x)\phi(y)\, | \Phi_{\rm in} \rangle
&= \int \big[ D\phi_\pm(z) \big]\; \phi_-(x)\, \phi_+(y)\; e^{i (S[\phi_+] - S[\phi_-])} ,
\end{aligned} \tag{15.1}
\]

where in the second line we have sketched the path integral representation of the
matrix element that appears in the reduction formula. Note that, since there is no
time ordering in this matrix element, the Schwinger-Keldysh formalism must be used
here. This formula is only a sketch, because the boundary conditions of the path
integral at the initial time should be specified in order to properly account for the
initial state Φ_in. However, what we want to illustrate with these formulas is the
direct relationship between large particle occupation numbers (the left hand side of
the first equation), and large fields in a path integral. Moreover, in the path integral,
the magnitude of the fields is controlled by the boundary conditions (this is the only
thing that depends on the initial state of the system in the right hand side of the second
equation).

¹ When we talk of small or large fields, we are referring to the magnitude of the c-number field in a path
integral (it does not make sense to apply these qualifiers to the field operator itself).

There is an implicit assumption of weak fields in the perturbative machinery that
we have studied so far, which is best viewed in the path integral formalism. For
instance, in the second of eqs. (15.1), the perturbative expansion amounts to writing
S = S_0 + S_int, and expanding the exponentials in powers of S_int. In a scalar field theory
with a quartic coupling, the interaction part of the action reads
\[
S_{\rm int}[\phi] = -\frac{\lambda}{4!} \int d^4x\; \phi^4(x) , \tag{15.2}
\]
while the free action (that we keep inside the exponential) is given by
\[
S_0[\phi] = \frac{1}{2} \int d^4x\; \Big[ (\partial_\mu \phi)(\partial^\mu \phi) - m^2 \phi^2 \Big] . \tag{15.3}
\]
The common justification of the perturbative expansion is that, when the coupling
constant λ is small, we have Sint ≪ S0 . However, since S0 [φ] is quadratic in the field
while Sint [φ] contains higher powers of φ, this inequality may not be true if the field
is large, even at weak coupling. In order to make this statement more precise, we
must account for the fact that the field has mass dimension 1. Let us denote by Q
the typical momentum scale in the problem under consideration (for simplicity we
assume that there is only one), and then we write

φ(x) ∼ ϑ Q , (15.4)

where ϑ is a dimensionless number that encodes the order of magnitude of the field.
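Naive dimensional analysis tells us that
\[
(\partial_\mu \phi)(\partial^\mu \phi) \sim \vartheta^2 Q^4 , \qquad
\lambda \phi^4 \sim \lambda\, \vartheta^4 Q^4 . \tag{15.5}
\]
For the interaction term to be small compared to the kinetic term, we must have
\[
\lambda\, \vartheta^2 \ll 1 , \tag{15.6}
\]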

which is slightly different from the usual criterion of small λ, since this condition
depends on the field magnitude via ϑ. The purpose of this chapter is to explore
situations of weak coupling (i.e. λ ≪ 1) where the inequality (15.6) is not satisfied
because of strong fields. We call this the strong field regime of quantum field theory.
We will discuss two main situations where strong fields may occur:

• The initial state is a highly occupied state, such as a coherent state.

• The initial state is the ground state, but the system is driven by a strong external
source.

As we shall see, since the coupling constant is assumed to be small, there is neverthe-
less a loop expansion, but each loop order (including the tree level approximation) is
non-perturbative in a sense that we will clarify in the rest of the chapter.

15.2 Expectation values in a coherent state


In section 1.16.5, we have presented the Schwinger-Keldysh formalism, which
allows the evaluation of expectation values of an observable in the in-vacuum state,
⟨0_in| O |0_in⟩. In the previous chapter, we have generalized this technique to expectation
values in a thermal state, i.e. a mixed state whose density matrix is the canonical
equilibrium one, ρ ≡ exp(−βH). Another generalization, that we shall consider in
this section, is an expectation value in a coherent state, which may be
defined from the perturbative in-vacuum as follows
\[
| \chi_{\rm in} \rangle \equiv N_\chi\, \exp\Big\{ \int \frac{d^3k}{(2\pi)^3 2E_k}\; \chi(k)\, a^\dagger_{k,{\rm in}} \Big\}\, | 0_{\rm in} \rangle , \tag{15.7}
\]
where χ(k) is a function of 3-momentum and N_χ a normalization constant adjusted
so that ⟨χ_in|χ_in⟩ = 1. From the canonical commutation relation
\[
\big[ a_{p,{\rm in}}, a^\dagger_{q,{\rm in}} \big] = (2\pi)^3\, 2E_p\, \delta(p - q) , \tag{15.8}
\]

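it is easy to check the following identities
\[
\begin{aligned}
a_{p,{\rm in}}\, | \chi_{\rm in} \rangle &= \chi(p)\, | \chi_{\rm in} \rangle , \\
| N_\chi |^2 &= \exp\Big\{ - \int \frac{d^3k}{(2\pi)^3 2E_k}\; | \chi(k) |^2 \Big\} .
\end{aligned} \tag{15.9}
\]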

The first equation tells us that |χ_in⟩ is an eigenstate of annihilation operators, which
is another definition of coherent states, and the second one provides the value of the
normalization constant. The occupation number in the initial state is closely related
to the function χ(k). Indeed, we have
\[
\langle \chi_{\rm in} |\, a^\dagger_{p,{\rm in}}\, a_{p,{\rm in}}\, | \chi_{\rm in} \rangle = | \chi(p) |^2 . \tag{15.10}
\]
In other words, the number of particles in the mode of momentum p is the squared
modulus of the function χ(p). A large χ thus corresponds to a highly occupied initial
state (conversely, χ(p) ≡ 0 corresponds to the vacuum).
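These properties are easy to verify in a single-mode analogue. The sketch below is not the field-theoretic state itself: it assumes one harmonic-oscillator mode with [a, a†] = 1 (instead of the relativistic normalization used above), truncates the Fock space at a hypothetical cutoff n_max, and uses illustrative function names. The amplitudes of the coherent state on the number basis are N χⁿ/√(n!) with |N|² = exp(−|χ|²).

```python
import math

def coherent_amplitudes(chi: complex, n_max: int = 60) -> list:
    """Amplitudes c_n = exp(-|chi|^2 / 2) * chi^n / sqrt(n!) for n = 0..n_max."""
    norm = math.exp(-abs(chi) ** 2 / 2.0)
    amps = []
    term = complex(1.0)                  # chi^n / sqrt(n!), built recursively
    for n in range(n_max + 1):
        amps.append(norm * term)
        term = term * chi / math.sqrt(n + 1)
    return amps

def annihilate(amps: list) -> list:
    """Action of a on the state: (a c)_n = sqrt(n + 1) * c_{n+1}."""
    return [math.sqrt(n + 1) * amps[n + 1] for n in range(len(amps) - 1)]

def mean_occupation(amps: list) -> float:
    """Expectation value of the number operator."""
    return sum(n * abs(c) ** 2 for n, c in enumerate(amps))

if __name__ == "__main__":
    chi = 1.5 + 0.5j
    amps = coherent_amplitudes(chi)
    lowered = annihilate(amps)
    print("norm          :", sum(abs(c) ** 2 for c in amps))
    print("<a^dagger a>  :", mean_occupation(amps), " |chi|^2 =", abs(chi) ** 2)
    print("max |a c - chi c| :", max(abs(lowered[n] - chi * amps[n]) for n in range(len(lowered))))
```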

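Consider now the generating functional for the extension of the Schwinger-Keldysh
formalism in this coherent state,
\[
\begin{aligned}
Z_\chi[j] &\equiv \langle \chi_{\rm in} |\, {\rm P} \exp\Big\{ i \int_C d^4x\; j(x)\, \phi(x) \Big\}\, | \chi_{\rm in} \rangle \\
&= \langle \chi_{\rm in} |\, {\rm P} \exp\Big\{ i \int_C d^4x\; \Big[ \mathcal{L}_{\rm int}(\phi_{\rm in}(x)) + j(x)\, \phi_{\rm in}(x) \Big] \Big\}\, | \chi_{\rm in} \rangle ,
\end{aligned} \tag{15.11}
\]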

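where j(x) is a fictitious source that lives on the closed-time contour C introduced in
figure 1.4. As usual, the first step is to factor out the interactions as follows:
\[
Z_\chi[j] = \exp\Big\{ i \int_C d^4x\; \mathcal{L}_{\rm int}\Big( \frac{\delta}{i\delta j(x)} \Big) \Big\}\;
\underbrace{\langle \chi_{\rm in} |\, {\rm P} \exp\Big\{ i \int_C d^4x\; j(x)\, \phi_{\rm in}(x) \Big\}\, | \chi_{\rm in} \rangle}_{Z^0_\chi[j]} . \tag{15.12}
\]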

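A first application of the Baker-Campbell-Hausdorff formula enables one to remove
the path ordering, which gives
\[
\begin{aligned}
Z^0_\chi[j] = {} & \langle \chi_{\rm in} |\, \exp\Big\{ i \int_C d^4x\; j(x)\, \phi_{\rm in}(x) \Big\}\, | \chi_{\rm in} \rangle \\
& \times \exp\Big\{ -\frac{1}{2} \int_C d^4x\, d^4y\; j(x) j(y)\; \theta_c(x^0 - y^0)\, \big[ \phi_{\rm in}(x), \phi_{\rm in}(y) \big] \Big\} ,
\end{aligned} \tag{15.13}
\]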

where θ_c(x⁰ − y⁰) generalizes the step function to the ordered contour C. Note
that the factor on the second line is a commuting number and thus can be removed
from the expectation value. A second application of the Baker-Campbell-Hausdorff
formula allows us to normal-order the first factor. Decomposing the in-field as follows,
\[
\phi_{\rm in}(x) \equiv
\underbrace{\int \frac{d^3k}{(2\pi)^3 2E_k}\; a_{k,{\rm in}}\, e^{-ik\cdot x}}_{\phi^{(-)}_{\rm in}(x)}
+ \underbrace{\int \frac{d^3k}{(2\pi)^3 2E_k}\; a^\dagger_{k,{\rm in}}\, e^{+ik\cdot x}}_{\phi^{(+)}_{\rm in}(x)} , \tag{15.14}
\]

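we obtain the following expression for the free generating functional
\[
\begin{aligned}
Z^0_\chi[j] = {} & \langle \chi_{\rm in} |\, \exp\Big\{ i \int_C d^4x\; j(x)\, \phi^{(+)}_{\rm in}(x) \Big\}\,
\exp\Big\{ i \int_C d^4y\; j(y)\, \phi^{(-)}_{\rm in}(y) \Big\}\, | \chi_{\rm in} \rangle \\
& \times \exp\Big\{ +\frac{1}{2} \int_C d^4x\, d^4y\; j(x) j(y)\; \big[ \phi^{(+)}_{\rm in}(x), \phi^{(-)}_{\rm in}(y) \big] \Big\} \\
& \times \exp\Big\{ -\frac{1}{2} \int_C d^4x\, d^4y\; j(x) j(y)\; \theta_c(x^0 - y^0)\, \big[ \phi_{\rm in}(x), \phi_{\rm in}(y) \big] \Big\} .
\end{aligned} \tag{15.15}
\]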

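The factor of the first line can be evaluated by using the fact that the coherent state is
an eigenstate of annihilation operators:
\[
\begin{aligned}
& \langle \chi_{\rm in} |\, \exp\Big\{ i \int_C d^4x\; j(x)\, \phi^{(+)}_{\rm in}(x) \Big\}\,
\exp\Big\{ i \int_C d^4y\; j(y)\, \phi^{(-)}_{\rm in}(y) \Big\}\, | \chi_{\rm in} \rangle \\
& \qquad = \exp\Big\{ i \int_C d^4x\; j(x)\,
\underbrace{\int \frac{d^3k}{(2\pi)^3 2E_k}\; \Big( \chi(k)\, e^{-ik\cdot x} + \chi^*(k)\, e^{+ik\cdot x} \Big)}_{\Phi_\chi(x)} \Big\} .
\end{aligned} \tag{15.16}
\]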

We denote by Φ_χ(x) the field obtained by substituting the creation and annihilation
operators of the in-field by χ*(k) and χ(k) respectively. Note that this is no longer
an operator, but a (real valued) c-number field. Moreover, because it is a linear
superposition of plane waves, this field is a free field:
\[
(\Box_x + m^2)\, \Phi_\chi(x) = 0 . \tag{15.17}
\]
The second and third factors of eq. (15.15) are commuting numbers, provided we
do not attempt to disassemble the commutators. Using the decomposition of the
in-field in terms of creation and annihilation operators, and the canonical commutation
relation of the latter, we obtain
\[
\begin{aligned}
& \theta_c(x^0 - y^0)\, \big[ \phi_{\rm in}(x), \phi_{\rm in}(y) \big] - \big[ \phi^{(+)}_{\rm in}(x), \phi^{(-)}_{\rm in}(y) \big] \\
& \qquad = \underbrace{\theta_c(x^0 - y^0) \int \frac{d^3k}{(2\pi)^3 2E_k}\; e^{-ik\cdot(x-y)}
+ \theta_c(y^0 - x^0) \int \frac{d^3k}{(2\pi)^3 2E_k}\; e^{+ik\cdot(x-y)}}_{G^0_c(x,y)} ,
\end{aligned} \tag{15.18}
\]

which is nothing but the usual bare path-ordered propagator G⁰_c(x, y). Collecting
all the factors, the generating functional for path-ordered Green’s functions in the
Schwinger-Keldysh formalism with an initial coherent state reads
\[
\begin{aligned}
Z_\chi[j] = {} & \exp\Big\{ i \int_C d^4x\; \mathcal{L}_{\rm int}\Big( \frac{\delta}{i\delta j(x)} \Big) \Big\}\;
\underline{\exp\Big\{ i \int_C d^4x\; j(x)\, \Phi_\chi(x) \Big\}} \\
& \times \exp\Big\{ -\frac{1}{2} \int_C d^4x\, d^4y\; j(x) j(y)\; G^0_c(x,y) \Big\} .
\end{aligned} \tag{15.19}
\]
It differs from the corresponding functional with the perturbative vacuum² as initial
state only by the second factor, that we have underlined. This generating functional is
also equal to³
\[
\begin{aligned}
Z_\chi[j] = {} & \exp\Big\{ i \int_C d^4x\; j(x)\, \Phi_\chi(x) \Big\}\;
\exp\Big\{ i \int_C d^4x\; \mathcal{L}_{\rm int}\Big( \Phi_\chi(x) + \frac{\delta}{i\delta j(x)} \Big) \Big\} \\
& \times \exp\Big\{ -\frac{1}{2} \int_C d^4x\, d^4y\; j(x) j(y)\; G^0_c(x,y) \Big\} .
\end{aligned} \tag{15.20}
\]

² The vacuum initial state corresponds to the function χ(k) ≡ 0, i.e. to Φ_χ(x) = 0.

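The first factor has the effect of shifting the fields by Φ_χ(x). The simplest way to see
this is to write
\[
\phi \equiv \Phi_\chi + \zeta . \tag{15.21}
\]
In the definition (15.11), this leads to
\[
Z_\chi[j] = \exp\Big\{ i \int_C d^4x\; j(x)\, \Phi_\chi(x) \Big\}\;
\langle \chi_{\rm in} |\, {\rm P} \exp\Big\{ i \int_C d^4x\; j(x)\, \zeta(x) \Big\}\, | \chi_{\rm in} \rangle , \tag{15.22}
\]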

where the second factor in the right hand side is the generating functional for correla-
tors of ζ. Comparing with eq. (15.20), we see that the generating functional for ζ is
identical to the vacuum one, except that the argument φ of the interaction Lagrangian
is replaced by Φχ + ζ:

Lint (φ) → Lint (Φχ + ζ) . (15.23)

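In other words, the field ζ appears to be coupled to a background field Φ_χ. For
instance, for a φ⁴ interaction term, we have
\[
\mathcal{L}_{\rm int}(\Phi_\chi + \zeta) = -\lambda \left[
\frac{\zeta^4}{4!} + \frac{\zeta^3 \Phi_\chi}{3!} + \frac{\zeta^2 \Phi_\chi^2}{4}
+ \frac{\zeta\, \Phi_\chi^3}{3!} + \frac{\Phi_\chi^4}{4!} \right] . \tag{15.24}
\]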
The first term, in ζ⁴, gives the usual four-leg vertex in the Feynman rules, and the
following terms describe the interactions of ζ with the background field Φ_χ. The
last term plays no role since it does not contain the quantum field ζ. Except for the
appearance of these new vertices that involve a background field, the Feynman rules
are the same as in the Schwinger-Keldysh formalism for a vacuum initial state, with
+ and − vertices, and bare propagators G⁰_{++}, G⁰_{+−}, G⁰_{−+} and G⁰_{−−} to connect them.
In summary, replacing the vacuum initial state by a coherent state amounts to extending
the usual Schwinger-Keldysh formalism with a background field Φ_χ.
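The numerical coefficients of the vertices generated by the shift follow from the binomial expansion of (Φ_χ + ζ)⁴/4!: the term ζ^k Φ_χ^{4−k} comes with a factor 1/(k! (4−k)!). A minimal sketch that tabulates them with exact fractions (the function name is illustrative):

```python
from fractions import Fraction
from math import comb, factorial

def background_expansion(power: int = 4) -> dict:
    """Coefficient of zeta^k * Phi^(power - k) in (Phi + zeta)^power / power!."""
    return {k: Fraction(comb(power, k), factorial(power)) for k in range(power + 1)}

if __name__ == "__main__":
    for k, coeff in sorted(background_expansion(4).items(), reverse=True):
        print(f"zeta^{k} Phi^{4 - k} : {coeff}")
```

Each of these coefficients is to be multiplied by −λ to obtain the corresponding term of the interaction Lagrangian.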
As in eq. (15.4), let us assume for the purpose of power counting that
\[
\Phi_\chi \sim \vartheta\, Q , \tag{15.25}
\]
and consider a connected graph G made of n_E external lines, n_I internal lines, n_1
vertices ζΦ_χ³, n_2 vertices ζ²Φ_χ², n_3 vertices ζ³Φ_χ, n_4 vertices ζ⁴, and n_L loops.
These parameters are related by the following two identities:
\[
\begin{aligned}
n_E + 2 n_I &= 4 n_4 + 3 n_3 + 2 n_2 + n_1 , \\
n_L &= n_I - (n_1 + n_2 + n_3 + n_4) + 1 .
\end{aligned} \tag{15.26}
\]

[Figure 15.1: Vertices that appear in the perturbative expansion for the calculation
of expectation values with a coherent initial state. The circled cross denotes the
field Φ_χ.]

³ In this transformation, we use the functional analogue of
F(∂_x) e^{αx} G(x) = e^{αx} F(α + ∂_x) G(x).

Then, the order in λ and ϑ of this graph is given by
\[
G \sim \lambda^{n_1 + n_2 + n_3 + n_4}\; \vartheta^{3 n_1 + 2 n_2 + n_3}
\sim \lambda^{n_L - 1 + n_E/2}\, \big( \sqrt{\lambda}\, \vartheta \big)^{3 n_1 + 2 n_2 + n_3} . \tag{15.27}
\]
The first factor is nothing but the usual order in λ of a connected graph with n_E
external lines and n_L loops. The second factor counts the number of insertions
(3n_1 + 2n_2 + n_3) of the background field Φ_χ. Interestingly, it involves only the
combination √λ ϑ, that appears also in the inequality (15.6) that delineates the strong
field regime. From eq. (15.27), we can draw the following conclusions:

• When $\lambda\vartheta^2 \ll 1$, i.e. in the weak field regime, we can make a double
perturbative expansion in $\lambda$ and in $\vartheta$ (i.e. in the occupation of the initial
coherent state). Leading order results correspond to tree diagrams with zero (or the
minimal number necessary for the observable under consideration to be non-zero)
insertions of the background field.

• When $\lambda\vartheta^2 \gtrsim 1$, i.e. in the strong field regime, the expansion in powers of $\lambda$
is still possible (and is organized by the number of loops in the graphs). But
the expansion in powers of the background field becomes illegitimate, and
one should instead treat $\Phi_\chi$ to all orders. As we shall see now, this leads to
important modifications in the calculation of observables in the strong field
regime.
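The exponent bookkeeping behind eq. (15.27) can be checked mechanically. The following is only a sketch (the function names are ours and do not appear in the text): it takes the vertex multiplicities and the number of loops, rebuilds $n_I$ and $n_E$ from the two identities (15.26), and verifies that the two lines of eq. (15.27) give the same powers of $\lambda$ and $\vartheta$.

```python
from fractions import Fraction
from itertools import product


def order_from_vertices(n1, n2, n3, n4):
    """First line of eq. (15.27): each vertex carries one power of lambda,
    and a vertex zeta^k Phi_chi^(4-k) carries 4-k powers of theta."""
    return Fraction(n1 + n2 + n3 + n4), 3 * n1 + 2 * n2 + n3


def order_from_topology(n1, n2, n3, n4, n_loops):
    """Second line of eq. (15.27), with n_I and n_E rebuilt from eq. (15.26).
    Returns (power of lambda, power of theta, n_E)."""
    n_vertices = n1 + n2 + n3 + n4
    n_internal = n_loops + n_vertices - 1
    n_external = 4 * n4 + 3 * n3 + 2 * n2 + n1 - 2 * n_internal
    insertions = 3 * n1 + 2 * n2 + n3
    # lambda^(nL - 1 + nE/2) * (sqrt(lambda) * theta)^insertions
    lam_power = n_loops - 1 + Fraction(n_external, 2) + Fraction(insertions, 2)
    return lam_power, insertions, n_external


def count_consistent_graphs(max_count=4, max_loops=3):
    """Scan small vertex multiplicities and assert that both forms agree."""
    checked = 0
    for n1, n2, n3, n4, n_loops in product(
        range(max_count), range(max_count), range(max_count),
        range(max_count), range(max_loops)
    ):
        if n1 + n2 + n3 + n4 == 0:
            continue
        lam_power, theta_power, n_external = order_from_topology(n1, n2, n3, n4, n_loops)
        if n_external < 0:
            continue  # no graph has this combination of vertices and loops
        assert (lam_power, theta_power) == order_from_vertices(n1, n2, n3, n4)
        checked += 1
    return checked
```

Calling `count_consistent_graphs()` returns the number of combinations that were checked; an `AssertionError` would signal a mismatch between the two forms.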

Note that for a system prepared in a coherent initial state, it is the function χ(k)
that defines the coherent state that determines whether we are in the weak or strong
field regime.
In order to illustrate the changes to the perturbative expansion in the strong field
regime, let us consider a very simple observable, the expectation value of the field
operator,

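$$ \Phi(x) \equiv \langle\chi_{\rm in}|\,\phi(x)\,|\chi_{\rm in}\rangle = \Phi_\chi(x) + \langle\chi_{\rm in}|\,\zeta(x)\,|\chi_{\rm in}\rangle . \tag{15.28} $$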

The beginning of the diagrammatic representation of Φ(x) at tree level reads:

$$ \Phi(x)\Big|_{\rm tree} = \text{[sum of tree diagrams; graphs not reproduced]} \tag{15.29} $$

In fact, at tree level, Φ(x) is the sum of all the tree diagrams (weighted by the
appropriate symmetry factor) whose root is the point x and whose leaves are the
coherent field Φχ . This infinite set of trees can be generated recursively by the
following integral representation:
$$ \Phi(x) = \Phi_\chi(x) + i \int d^4y\; \underbrace{\Big[ G^0_{++}(x,y) - G^0_{+-}(x,y) \Big]}_{G^0_R(x,y)}\; \Big( - \underbrace{\frac{\lambda}{6}\, \Phi^3(y)}_{U'(\Phi(y))} \Big) . \tag{15.30} $$

Interestingly, after one has summed over the + and − indices carried by the vertices,
the propagators G0++ and G0+− of the Schwinger-Keldysh diagrammatic rules always
appear via their difference, which is nothing but the bare retarded propagator:

$$ G^0_{++}(x,y) - G^0_{+-}(x,y) = G^0_{-+}(x,y) - G^0_{--}(x,y) = G^0_R(x,y) . \tag{15.31} $$

Since this propagator obeys

$$ (\square_x + m^2)\, G^0_R(x,y) = -i\, \delta(x-y) , \qquad G^0_R(x,y) = 0 \ \text{ if } x^0 < y^0 , \tag{15.32} $$

the expectation value Φ(x) at tree level satisfies

$$ (\square_x + m^2)\, \Phi(x) + U'(\Phi(x)) = 0 , \qquad \lim_{x^0\to-\infty} \Phi(x) = \Phi_\chi(x) . \tag{15.33} $$

In other words, at tree level, the field expectation value obeys the classical field
equation of motion, with the boundary value $\Phi_\chi(x)$ at the initial time. The
non-linearity of this equation of motion is crucial in the strong field regime, and all the
terms of the series (15.29) have the same magnitude when $\lambda\vartheta^2 \sim 1$. Nevertheless,
the representation of this series as the solution of the classical field equation of
motion with a retarded boundary condition is very useful, since it turns the problem
of summing an infinite series of Feynman graphs into the much simpler (at least
numerically) problem of solving a partial differential equation.
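To give an idea of how simple the numerical problem can be, here is a minimal sketch for a toy version of eq. (15.33): a spatially homogeneous field $\Phi(t)$ with $U = \lambda\Phi^4/4!$, so that the equation reduces to $\ddot\Phi + m^2\Phi + \lambda\Phi^3/6 = 0$. The reduction to a single degree of freedom, the velocity-Verlet integrator and all the names below are assumptions made for the illustration; they are not taken from the text.

```python
import math


def solve_classical_field(phi0, pi0, m=1.0, lam=1.0, t_max=10.0, dt=1e-3):
    """Velocity-Verlet integration of  phi'' + m^2 phi + (lam/6) phi^3 = 0.
    Returns the list of (t, phi, pi) samples, with pi = d(phi)/dt."""
    def force(phi):
        return -(m * m * phi + lam * phi ** 3 / 6.0)

    n_steps = int(round(t_max / dt))
    phi, pi = phi0, pi0
    samples = [(0.0, phi, pi)]
    f = force(phi)
    for k in range(1, n_steps + 1):
        pi_half = pi + 0.5 * dt * f      # half kick
        phi = phi + dt * pi_half         # drift
        f = force(phi)
        pi = pi_half + 0.5 * dt * f      # half kick
        samples.append((k * dt, phi, pi))
    return samples


def energy(phi, pi, m=1.0, lam=1.0):
    """Conserved energy of the homogeneous field."""
    return 0.5 * pi ** 2 + 0.5 * m * m * phi ** 2 + lam * phi ** 4 / 24.0


def first_zero_crossing(samples):
    """Time of the first downward sign change of phi (linear interpolation),
    or None if phi never crosses zero."""
    for (t0, p0, _), (t1, p1, _) in zip(samples, samples[1:]):
        if p0 > 0.0 >= p1:
            return t0 + (t1 - t0) * p0 / (p0 - p1)
    return None
```

With $m = \lambda = 1$, a small initial amplitude (e.g. `phi0 = 0.01`) stays close to the free solution $\Phi_0 \cos t$, while a large one (e.g. `phi0 = 3`, for which $\lambda\Phi_0^2$ is of order one or larger) oscillates faster than the free field: the non-linear term can no longer be treated as a correction.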

This result for the expectation value of φ(x) generalizes to the expectation value
of any observable built from the field operator: at tree level, its expectation value is
obtained by replacing the operator φ(x) by the c-number classical field Φ(x) inside
the observable:
$$ \langle\chi_{\rm in}|\, O\big(\phi(x)\big)\, |\chi_{\rm in}\rangle \Big|_{\text{tree level}} = O\big(\Phi(x)\big) . \tag{15.34} $$

We will defer the study of loop corrections to these expectation values until section
15.4, because this discussion will be common with another strong field situation that
we shall discuss first, namely the case of quantum field theories coupled to a strong
external source.

15.3 Quantum field theory with external sources


Let us now consider a second way to reach the large field regime. This time, the
initial state of the system is the vacuum, but the field is coupled to an external source
that drives the system away from the ground state. When the external source is large,
the field expectation value will eventually become large itself, and the system will
again be in the strong field regime. Let us consider a scalar field theory with quartic
interaction coupled to a source J, whose Lagrangian is
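$$ \mathcal{L} \equiv \frac{1}{2} (\partial_\mu\phi)(\partial^\mu\phi) - \frac{1}{2} m^2\phi^2 - \underbrace{\frac{\lambda}{4!}\, \phi^4}_{U(\phi)} + J\phi . \tag{15.35} $$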

Although we consider here the example of a φ4 interaction term, we will often write
the equations for a generic potential U(φ), and sometimes diagrammatic illustrations
will be given for a cubic interaction for simplicity. These more general interaction
terms will be defined as $\lambda^{-1+n/2}\, Q^{4-n}\, \phi^n$, where Q is an object of mass dimension
1 (the power of Q is fixed by requiring that the term has mass dimension 4). The Feynman
rules for this theory are the usual ones, with the addition of a special
rule for the external current J. In momentum space, a source J attached to the end
of a propagator of momentum p contributes a factor $i\tilde{J}(p)$ (where $\tilde{J}$ is the Fourier
transform of J).
Figure 15.2: Generic connected graph in the strong source regime. In this example,
$n_E = 5$, $n_I = 11$, $n_J = 4$, $n_L = 1$, $n_3 = 5$ and $n_4 = 2$.

The source J(x) is a given function of space-time, fixed once and for all. As we
shall see shortly, the strong field regime corresponds to large sources $J \sim \lambda^{-1/2}$;
we will call this situation the strong source (or dense) regime. In contrast, the situation
where the external source J is small is called the weak source (or dilute) regime. Consider
a connected diagram (see figure 15.2), with $n_E$ external legs, $n_I$ internal lines, $n_L$
independent loops, $n_J$ sources, and $n_3$ cubic vertices, $n_4$ quartic vertices, etc.
These parameters are not all independent. First, the number of propagator endpoints
should match the available sites to which they can be attached. This leads to a first
identity,

$$ n_E + 2 n_I = n_J + 3 n_3 + 4 n_4 + 5 n_5 + \cdots \tag{15.36} $$

A second identity expresses the number of independent loops in terms of the other
parameters,

$$ n_L = n_I - (n_3 + n_4 + n_5 + \cdots) - n_J + 1 . \tag{15.37} $$

Thanks to these two relations, the order of a diagram G can be written as


$$ G \sim J^{n_J}\; \lambda^{\frac{1}{2} n_3 + n_4 + \frac{3}{2} n_5 + \cdots} = \lambda^{n_L - 1 + n_E/2}\; \big(\sqrt{\lambda}\, J\big)^{n_J} . \tag{15.38} $$

This formula is very similar to eq. (15.27). First, it does not depend on the number
of vertices and on the number of internal lines; only the number of external legs, the
number of loops and the number of sources appear in the result. The strong source
regime is the regime where it is not legitimate to expand in powers of J because the
factor $\sqrt{\lambda}\, J$ is not small. In this case, the order of a diagram does not depend on its
number of sources, and an infinite number of diagrams –with fixed nE and nL but
arbitrary nJ – contribute at each order.
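As a quick consistency check, the numbers quoted for the graph of figure 15.2 can be fed into eqs. (15.36)-(15.38). The sketch below is ours (helper names included); it only encodes the three formulas and uses exact rational arithmetic for the half-integer powers of $\lambda$.

```python
from fractions import Fraction


def n_loops(n_internal, n_sources, vertices):
    """Eq. (15.37); `vertices` maps the valence k of a vertex to its count n_k."""
    return n_internal - sum(vertices.values()) - n_sources + 1


def endpoints_balance(n_external, n_internal, n_sources, vertices):
    """Eq. (15.36): propagator endpoints versus available attachment sites."""
    sites = n_sources + sum(k * n for k, n in vertices.items())
    return n_external + 2 * n_internal == sites


def lambda_power_from_vertices(vertices):
    """Left form of eq. (15.38): a phi^k vertex carries lambda^(k/2 - 1)."""
    return sum((Fraction(k, 2) - 1) * n for k, n in vertices.items())


def order_from_topology(n_external, n_internal, n_sources, vertices):
    """Right form of eq. (15.38): returns the power of lambda and the power
    of the combination sqrt(lambda) * J."""
    loops = n_loops(n_internal, n_sources, vertices)
    return loops - 1 + Fraction(n_external, 2), n_sources


# Graph of figure 15.2: five cubic and two quartic vertices.
figure_15_2 = {3: 5, 4: 2}
lam_power, source_power = order_from_topology(5, 11, 4, figure_15_2)
```

For this graph both forms give $G \sim \lambda^{9/2} J^4$, i.e. $\lambda^{5/2} (\sqrt{\lambda} J)^4$, which is of order $\lambda^{5/2}$ when $J \sim \lambda^{-1/2}$ regardless of the number of sources.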

15.4 Observables at LO and NLO


Figure 15.3: The two contributions to observables at NLO in the strong field regime.

Leading order : Let us consider an observable O(φ), possibly non-local but with
fields only at the same time $t_f$ (the discussion could be generalized to fields with
only space-like separations), and let us compute its expectation value in the strong
field regime. As we have seen in the previous sections, this regime can be reached through
the presence of strong external sources, or by starting from a highly occupied coherent
state. In both cases, the calculation of expectation values is done with the
Schwinger-Keldysh formalism. Note that since the field operators in the observable are
taken at equal times, they commute and the result does not depend on the + or −
assignments for those fields. But it is crucial to sum over all the ± indices in the
internal vertices of the graphs.
At leading order in λ, its expectation value is obtained by simply replacing the
field operator φ by the solution Φ of the classical equations of motion,
$$ \langle O(\phi) \rangle_{\rm LO} = O(\Phi) , \tag{15.39} $$
with
$$ (\square_x + m^2)\, \Phi + U'(\Phi) = J , \qquad \lim_{x^0\to t_i} \Phi(x) = \Phi_\chi(x) . \tag{15.40} $$

(We have combined in a single description the two situations, with an external source
J and starting from a non-trivial coherent state χin .) Note that it is the internal
sums over the ± indices of the Schwinger-Keldysh formalism that lead to retarded
boundary conditions, by virtue of eq. (15.31).

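Next-to-leading order : For such an observable, the corresponding next-to-leading
order correction can be formally written as follows,
$$ \langle O(\phi) \rangle_{\rm NLO} = \int_{t_f} d^3x\; \delta\Phi(x)\, \frac{\delta O(\Phi)}{\delta\Phi(x)} + \frac{1}{2} \int_{t_f} d^3x\, d^3y\; G(x,y)\, \frac{\delta^2 O(\Phi)}{\delta\Phi(x)\, \delta\Phi(y)} , \tag{15.41} $$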
where δΦ is the 1-loop correction to the classical field Φ, and G(x, y) is the propagator
dressed by the background field Φ. The two contributions of eq. (15.41) are illustrated
in figure 15.3. Since the field operators in the observable O(φ) are all separated
by space-like intervals, it is not necessary to indicate the ± indices in δΦ and G, and
we have in fact:
$$ \delta\Phi_+(x) = \delta\Phi_-(x) , \qquad G_{++}(x,y) = G_{--}(x,y) = G_{-+}(x,y) = G_{+-}(x,y) \ \text{ if } (x-y)^2 < 0 . \tag{15.42} $$

Let us start with $\delta\Phi_\pm$. The propagators in the diagram on the left of figure
15.3 are the Schwinger-Keldysh propagators in the presence of a background field Φ,
i.e. the propagators $G_{\epsilon\epsilon'}$. For a generic interaction potential, we can write
$\delta\Phi_\pm(x)$ as follows:
$$ \delta\Phi_\epsilon(x) = -\frac{i}{2} \int d^4z \sum_{\epsilon'=\pm} \epsilon'\; G_{\epsilon\epsilon'}(x,z)\; U'''(\Phi(z))\; G_{\epsilon'\epsilon'}(z,z) . \tag{15.43} $$

In this formula, the 1/2 is a symmetry factor, the factor ǫ′ in the integrand takes
into account the fact that vertices of type − have an opposite sign in the Schwinger-
Keldysh formalism, and the factor −i U′′′ (Φ(z)) is the general form of the 3-particle
vertex in the presence of an external field (for an arbitrary interaction potential U).
Thus, we have reduced the calculation to that of the 2-point functions $G_{\pm\pm}$. These
four propagators are defined recursively by the following equations :
$$ G_{\epsilon\epsilon'}(x,y) = G^0_{\epsilon\epsilon'}(x,y) - i \sum_{\eta=\pm} \eta \int d^4z\; G^0_{\epsilon\eta}(x,z)\; U''(\Phi(z))\; G_{\eta\epsilon'}(z,y) . \tag{15.44} $$

Here, $-i\, U''(\Phi(z))$ is the general form for the insertion of a background field on a
propagator in a theory with potential U(Φ). From these equations, we obtain the
following equations :
$$
\begin{aligned}
\big[ \square_x + m^2 + U''(\Phi(x)) \big]\, G_{+-}(x,y) &= \big[ \square_y + m^2 + U''(\Phi(y)) \big]\, G_{+-}(x,y) = 0 ,\\
\big[ \square_x + m^2 + U''(\Phi(x)) \big]\, G_{-+}(x,y) &= \big[ \square_y + m^2 + U''(\Phi(y)) \big]\, G_{-+}(x,y) = 0 .
\end{aligned}
\tag{15.45}
$$

In addition to these equations of motion, these propagators must become equal to
their free counterparts $G^0_{+-}$ and $G^0_{-+}$ when $x^0, y^0 \to -\infty$. From the definition of
the various components of the Schwinger-Keldysh propagators, $G_{++}$ and $G_{--}$ are
given in terms of $G_{+-}$ and $G_{-+}$ by the following expressions:
$$
\begin{aligned}
G_{++}(x,y) &= \theta(x^0 - y^0)\, G_{-+}(x,y) + \theta(y^0 - x^0)\, G_{+-}(x,y) ,\\
G_{--}(x,y) &= \theta(x^0 - y^0)\, G_{+-}(x,y) + \theta(y^0 - x^0)\, G_{-+}(x,y) .
\end{aligned}
\tag{15.46}
$$

The above conditions determine $G_{+-}$ and $G_{-+}$ uniquely. In order to find these
propagators, let us recall the following representation of their bare counterparts :
$$
\begin{aligned}
G^0_{+-}(x,y) &= \int \frac{d^3p}{(2\pi)^3\, 2E_p}\; a_{-p}(x)\, a_{+p}(y) ,\\
G^0_{-+}(x,y) &= \int \frac{d^3p}{(2\pi)^3\, 2E_p}\; a_{+p}(x)\, a_{-p}(y) ,
\end{aligned}
\tag{15.47}
$$
where
$$ (\square_x + m^2)\, a_{\pm p}(x) = 0 , \qquad \lim_{x^0\to-\infty} a_{\pm p}(x) = e^{\mp i p\cdot x} . \tag{15.48} $$

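It is trivial to generalize this representation of the off-diagonal propagators to the case
of a non zero background field, by writing
$$
\begin{aligned}
G_{+-}(x,y) &= \int \frac{d^3p}{(2\pi)^3\, 2E_p}\; a_{-p}(x)\, a_{+p}(y) ,\\
G_{-+}(x,y) &= \int \frac{d^3p}{(2\pi)^3\, 2E_p}\; a_{+p}(x)\, a_{-p}(y) ,
\end{aligned}
\tag{15.49}
$$
with
$$ \big[ \square_x + m^2 + U''(\Phi(x)) \big]\, a_{\pm p}(x) = 0 , \qquad \lim_{x^0\to-\infty} a_{\pm p}(x) = e^{\mp i p\cdot x} . \tag{15.50} $$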

By construction, these expressions of G+− and G−+ obey the appropriate equations
of motion, and go to the correct limit in the remote past. The functions a±p (x) are
sometimes called mode functions. They provide a complete basis for the linear space
of solutions of the equation (15.50), i.e. the space of linearized perturbations to the
classical solution of the field equation of motion.

Relationship between LO and NLO : At this point, we have all the building
blocks in order to obtain the single inclusive spectrum at NLO. One can go further
and obtain a formal relationship between the LO and NLO inclusive spectra. A key
observation for this is that the functions $a_{\pm k}$ that appear in the dressed propagators
$G_{\pm\mp}$ can be obtained from the classical field Φ as follows:
$$ a_{\pm k}(x) = \mathbb{T}_{\pm k}\, \Phi(x) , \tag{15.51} $$
where the operator $\mathbb{T}_{\pm k}$ is defined by
$$ \mathbb{T}_{\pm k} \cdots \equiv \int_{u^0=-\infty} d^3u\; e^{\mp i k\cdot u} \left[ \frac{\delta}{\delta\Phi_{\rm ini}(u)} \mp i E_k\, \frac{\delta}{\delta(\partial_0\Phi_{\rm ini}(u))} \right] \cdots \Bigg|_{\Phi_{\rm ini}\equiv\Phi_\chi} . \tag{15.52} $$
In words, the operator $\mathbb{T}_{\pm k}$ in eq. (15.51) differentiates the classical field Φ with
respect to its initial condition $\Phi_{\rm ini}$, and replaces it by the initial condition of $a_{\pm k}$.
Since $a_{\pm k}$ is a linear perturbation to Φ, this indeed gives the correct result.
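The content of eq. (15.51) can be tested on a toy model. In the sketch below (a single homogeneous mode with $U = \lambda\Phi^4/4!$; the model, the integrator and the names are our assumptions, and real initial data are used for simplicity, the complex data of eq. (15.50) following by linearity), the perturbation $a(t)$ is obtained in two ways: by solving the linearized equation on top of $\Phi$, and by differentiating $\Phi(t)$ numerically with respect to its initial data in the direction $(a_0, \dot a_0)$.

```python
def rk4(f, y, t_max, n_steps):
    """Fixed-step fourth order Runge-Kutta integration of y' = f(y) from 0 to t_max."""
    h = t_max / n_steps
    for _ in range(n_steps):
        k1 = f(y)
        k2 = f([yi + 0.5 * h * ki for yi, ki in zip(y, k1)])
        k3 = f([yi + 0.5 * h * ki for yi, ki in zip(y, k2)])
        k4 = f([yi + h * ki for yi, ki in zip(y, k3)])
        y = [yi + h / 6.0 * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
    return y


def rhs(y, m=1.0, lam=1.0):
    """Background phi and perturbation a for U = lam phi^4 / 4!:
    phi'' + m^2 phi + U'(phi) = 0   and   a'' + (m^2 + U''(phi)) a = 0."""
    phi, pi, a, a_dot = y
    return [pi,
            -(m * m * phi + lam * phi ** 3 / 6.0),
            a_dot,
            -(m * m + lam * phi ** 2 / 2.0) * a]


def perturbation_from_linearized_equation(phi0, pi0, a0, a_dot0,
                                          t_max=5.0, n_steps=5000):
    """a(t_max) from the linearized equation of motion."""
    return rk4(rhs, [phi0, pi0, a0, a_dot0], t_max, n_steps)[2]


def perturbation_from_initial_data_derivative(phi0, pi0, a0, a_dot0, eps=1e-5,
                                              t_max=5.0, n_steps=5000):
    """Toy analogue of eq. (15.51): central-difference derivative of phi(t_max)
    with respect to its initial data, in the direction (a0, a_dot0)."""
    plus = rk4(rhs, [phi0 + eps * a0, pi0 + eps * a_dot0, 0.0, 0.0], t_max, n_steps)[0]
    minus = rk4(rhs, [phi0 - eps * a0, pi0 - eps * a_dot0, 0.0, 0.0], t_max, n_steps)[0]
    return (plus - minus) / (2.0 * eps)
```

The two functions should agree up to the discretization error of the finite difference, which is the statement that a linearized perturbation is the derivative of the classical solution with respect to its initial condition.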
Thus, the propagator $G_{+-}(x,y)$ that enters at NLO can be written as
$$ G_{+-}(x,y) = \int \frac{d^3k}{(2\pi)^3\, 2E_k}\; \big[ \mathbb{T}_{-k}\, \Phi(x) \big] \big[ \mathbb{T}_{+k}\, \Phi(y) \big] . \tag{15.53} $$

In the rest of our NLO calculation, we only need this propagator for a space-like
separation between x and y, which implies that G+− (x, y) = G−+ (x, y). In this case,
we can symmetrize the expression of the propagator as follows:
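$$ G_{+-}(x,y) = \frac{1}{2} \int \frac{d^3k}{(2\pi)^3\, 2E_k}\; \Big\{ \big[ \mathbb{T}_{-k}\, \Phi(x) \big] \big[ \mathbb{T}_{+k}\, \Phi(y) \big] + \big[ \mathbb{T}_{+k}\, \Phi(x) \big] \big[ \mathbb{T}_{-k}\, \Phi(y) \big] \Big\} . \tag{15.54} $$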

As we shall see now, a similar expression can be obtained for $\delta\Phi_\pm$. Let us
start from eq. (15.43). Since the propagators $G_{++}$ and $G_{--}$ are equal when the two
endpoints are evaluated at equal times, we have
$$ \delta\Phi_\epsilon(x) = -\frac{i}{2} \int \frac{d^3k}{(2\pi)^3\, 2E_k} \int d^4z\; \underbrace{\Big[ G_{\epsilon+}(x,z) - G_{\epsilon-}(x,z) \Big]}_{G_R(x,z)}\; U'''(\Phi(z))\; a_{-k}(z)\, a_{+k}(z) , \tag{15.55} $$

where $G_R$ is the retarded propagator in the presence of the background field Φ. By
writing more explicitly the interactions with the background field,
$$ \delta\Phi_\epsilon(x) = -i \int d^4y\; G^0_R(x,y) \left[ U''(\Phi(y))\, \delta\Phi_\epsilon(y) + \frac{1}{2}\, U'''(\Phi(y)) \int \frac{d^3k}{(2\pi)^3\, 2E_k}\; a^*_k(y)\, a_k(y) \right] , \tag{15.56} $$

(with $G^0_R$ the bare retarded propagator), one may prove that
$$ \delta\Phi_\epsilon(x) = \frac{1}{2} \int \frac{d^3k}{(2\pi)^3\, 2E_k}\; \mathbb{T}_{+k}\, \mathbb{T}_{-k}\; \Phi(x) . \tag{15.57} $$

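By inserting this expression, as well as eq. (15.54), in eq. (15.41), we can write the
NLO expectation value as follows,
$$ \langle O \rangle_{\rm NLO} = \left[ \frac{1}{2} \int \frac{d^3k}{(2\pi)^3\, 2E_k}\; \mathbb{T}_{+k}\, \mathbb{T}_{-k} \right] \langle O \rangle_{\rm LO} . \tag{15.58} $$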

Figure 15.4: Illustration of eq. (15.58) (labels in the figure: classical, quantum). The
open squares represent the operator $\mathbb{T}_{k}(u)\, \mathbb{T}_{-k}(v)$. Their action is to remove two
instances of the initial classical field (the open circles), and to connect them with the
light colored link to form a loop.

This central result is illustrated in figure 15.4. Some remarks should be made
about this formula:

i. In this formula, the LO observable that appears in the right hand side must be
considered as a functional of the initial classical field.

ii. The LO and NLO observables cannot be obtained in closed analytical form,
because they contain the classical field Φ, the retarded solution of a non-linear partial
differential equation that cannot be solved analytically in general. Nevertheless,
eq. (15.58) is an exact relationship between the two.

Why is the NLO “nearly classical”? : In a sense, eq. (15.58) indicates that
observables at NLO in the strong field regime are almost classical, since they can be
obtained from the LO result (that depends only on the classical field Φ) by acting with
the operators $\mathbb{T}_{\pm k}$ (i.e. derivatives with respect to the initial value of the classical
field). If one had kept track of the powers of $\hbar$, the $\hbar$ that comes at NLO would just
be an overall prefactor (the prefactor 1/2 in eq. (15.58) would become $\hbar/2$), but all
the rest of the formula would not contain any $\hbar$.
This is in fact not specific to the strong field regime nor to quantum field theory, but
is a general property of quantum mechanics. To see this, consider a generic quantum
system of Hamiltonian H and density operator ρt . The latter evolves according to the
Liouville-von Neumann equation:

$$ i\hbar\, \frac{\partial\rho_t}{\partial t} = \big[ H, \rho_t \big] . \tag{15.59} $$

The next step is to introduce the Wigner transform of the density operator:
$$ W_t(x,p) \equiv \int ds\; e^{i p\cdot s}\; \Big\langle x + \frac{s}{2} \Big|\, \rho_t\, \Big| x - \frac{s}{2} \Big\rangle . \tag{15.60} $$

The Wigner transform of an operator is a Fourier transform of the matrix elements of
the operator in the position basis with respect to the difference of coordinates. The
function $W_t(x,p)$ may be viewed in a loose sense (see footnote 4) as a probability
distribution in the classical phase-space of the system (x and p are classical variables,
not operators). Note that the Wigner transform of the Hamiltonian operator $H$ is the
classical Hamiltonian $\mathcal{H}$. One may show that the Liouville-von Neumann equation is
equivalent to
$$ \frac{\partial W_t}{\partial t} = \mathcal{H}(x,p)\; \frac{2}{i\hbar}\, \sin\!\Big( \frac{i\hbar}{2} \big( \overleftarrow{\partial}_p \overrightarrow{\partial}_x - \overleftarrow{\partial}_x \overrightarrow{\partial}_p \big) \Big)\; W_t(x,p) \tag{15.61} $$
$$ \phantom{\frac{\partial W_t}{\partial t}} = \underbrace{\big\{ \mathcal{H}, W_t \big\}}_{\text{Poisson bracket}} + O(\hbar^2) \tag{15.62} $$

The first line is an exact equation, known as the Moyal-Groenewold equation. In the
second line, we have performed an expansion in powers of $\hbar$, and one can readily
see that the order zero in $\hbar$ is nothing but the classical Liouville equation (it thus
describes a system whose time evolution is classical). The first quantum correction
to the time evolution arises only at the order $\hbar^2$. Therefore, at the order $\hbar$ (i.e. NLO
in the language of quantum field theory), the time evolution of the system remains
purely classical. This does not mean that there are no quantum corrections of order
$\hbar$, but that these corrections can only come from the initial state of the system (in
particular, from the fact that a quantum system cannot have well defined x and p at
the same time, and the Wigner distribution $W_t(x,p)$ must have a width of order $\hbar$
at least). The effect of the operator $\mathbb{T}_{+k}\, \mathbb{T}_{-k}$ that acts on the LO observable in
eq. (15.58) is precisely to restore this quantum width of the initial state.
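A one-dimensional toy example makes this statement concrete. For a harmonic oscillator the Hamiltonian is quadratic, so the $O(\hbar^2)$ terms of eq. (15.61) vanish and the classical Liouville evolution of $W_t$ is exact; the ground state has a Gaussian Wigner function with variances $\hbar/(2m\omega)$ in x and $\hbar m\omega/2$ in p. The sketch below (our own illustration, with hypothetical names and parameter values) samples initial conditions from this Gaussian, evolves each of them classically, and averages $x^2(t)$.

```python
import math
import random


def zero_point_spread(n_samples=40000, t=1.7, hbar=1.0, mass=1.0, omega=1.0, seed=0):
    """Average of x(t)^2 over classical trajectories whose initial (x, p) are drawn
    from the Gaussian Wigner function of the harmonic-oscillator ground state."""
    rng = random.Random(seed)
    sigma_x = math.sqrt(hbar / (2.0 * mass * omega))
    sigma_p = math.sqrt(hbar * mass * omega / 2.0)
    total = 0.0
    for _ in range(n_samples):
        x0 = rng.gauss(0.0, sigma_x)
        p0 = rng.gauss(0.0, sigma_p)
        # exact classical evolution of the harmonic oscillator
        x_t = x0 * math.cos(omega * t) + p0 / (mass * omega) * math.sin(omega * t)
        total += x_t * x_t
    return total / n_samples
```

The single classical trajectory starting from x = p = 0 gives $x^2(t) = 0$, while the ensemble average reproduces, within statistical errors, the quantum zero-point value $\hbar/(2m\omega)$ at any time: the whole order-$\hbar$ effect comes from the width of the initial distribution, not from the evolution.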

15.5 Green’s formulas

Eq. (15.51), that formally relates a small field perturbation to the background field on
top of which it propagates, plays a crucial role in discussing many questions related
to strong fields. A standard proof of this formula relies on Green’s formulas, that we
shall discuss in this section.
4 $W_t$ is not a bona fide probability distribution, because it is not positive definite in general. But the
regions of phase-space where it is negative are small, typically of order $\hbar$. After being integrated either over
x or over p, it becomes a genuine probability distribution for the expectation values of p or x, respectively.

15.5.1 Green’s formula for a retarded classical scalar field


Consider the following partial differential equation (see footnote 5),
$$ (\square_x + m^2)\, \Phi(x) + U'(\Phi(x)) = j(x) . \tag{15.63} $$
Since this equation contains time derivatives up to second order, it is necessary to
specify the initial value of Φ itself as well as that of its first time derivative. Let us
assume that we know these values on the surface t = 0. We wish to obtain a formula
for Φ(x) at a time x0 > 0 in terms of this initial data. In order to do this, we must
introduce the retarded Green’s function of the operator $\square_x + m^2$, defined by
$$ (\square_x + m^2)\, G^0_R(x,y) = -i\, \delta(x-y) , \qquad G^0_R(x,y) = 0 \ \text{ if } x^0 < y^0 . \tag{15.64} $$

(The superscript 0 is a reminder of the fact that this is a free Green’s function, that
does not depend on the interaction potential U(Φ).) Note that $G^0_R(x,y)$ obeys the
same equation if acted upon with $\square_y + m^2$ instead. From the equations obeyed by Φ
and by $G^0_R$, we obtain
$$
\begin{aligned}
G^0_R(x,y)\, \big( \overrightarrow{\square}_y + m^2 \big)\, \Phi(y) &= G^0_R(x,y)\, \Big[ j(y) - U'(\Phi(y)) \Big] ,\\
G^0_R(x,y)\, \big( \overleftarrow{\square}_y + m^2 \big)\, \Phi(y) &= -i\, \delta(x-y)\, \Phi(y) ,
\end{aligned}
\tag{15.65}
$$

where the arrows on the d’Alembertian operators indicate on which side they act. By
integrating these equations over y above the initial surface t = 0, and by subtracting
them, we get the following relation
$$ \Phi(x) = i \int_{y^0>0} d^4y\; G^0_R(x,y)\, \Big[ \big( \overleftarrow{\square}_y - \overrightarrow{\square}_y \big)\, \Phi(y) + j(y) - U'(\Phi(y)) \Big] . \tag{15.66} $$

The last step is to show that the term that involves the difference between the two
d’Alembertian operators is in fact a boundary term that depends only on the initial
conditions. Note first the following identity,
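$$ A\, \big( \overleftarrow{\square} - \overrightarrow{\square} \big)\, B = \partial_\mu\, A\, \big( \overleftarrow{\partial}^\mu - \overrightarrow{\partial}^\mu \big)\, B , \tag{15.67} $$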
where the leftmost ∂µ acts on everything on its right. In other words, the left hand side
is a total derivative, and its integral over d4 y can be rewritten as a surface integral
thanks to Stokes’ theorem. The integration domain defined by y0 > 0 has three
boundaries:
5 This equation is the classical equation of motion in the scalar field theory of Lagrangian
$\mathcal{L} \equiv \frac{1}{2} (\partial_\mu\phi)(\partial^\mu\phi) - \frac{1}{2} m^2\phi^2 - U(\phi) + j\phi$.

Figure 15.5: Typical contribution to Φ(x) in the diagrammatic representation of
eq. (15.68), in the case of cubic interactions. The solid dots represent the sources j,
the open circles represent the initial value of the field or field derivatives on the
surface $y^0 = 0$. The lines are retarded propagators $G^0_R$.

i. $y^0 = +\infty$ : this boundary at infinite time does not contribute, since the retarded
propagator obeys $G^0_R(x,y) = 0$ if $y^0 > x^0$.

ii. $y^0 = 0$ : this boundary gives a non zero contribution, that depends only on the
initial conditions for the field Φ.

iii. Boundary at spatial infinity : this boundary does not contribute if we assume
that the field vanishes when |x| → ∞, or for a finite volume with periodic
boundary conditions in the spatial directions.

Therefore, we obtain
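$$ \Phi(x) = i \int_{y^0>0} d^4y\; G^0_R(x,y)\, \Big[ j(y) - U'(\Phi(y)) \Big] + i \int_{y^0=0} d^3y\; G^0_R(x,y)\, \big( \overrightarrow{\partial}_{y^0} - \overleftarrow{\partial}_{y^0} \big)\, \Phi(y) . \tag{15.68} $$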

In this Green’s formula, the first term in the right hand side provides the dependence
on the source j, and on the interactions, while the second term tells us how Φ(x)
depends on the initial values of Φ(x) and of its first time derivative.
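Green's formula is easy to test numerically in a 0+1 dimensional analogue, where the field depends on time only and $\square + m^2$ becomes $\partial_t^2 + \omega^2$. In that case the retarded propagator obeying the analogue of eq. (15.64) is $G^0_R(t,t') = -i\,\theta(t-t')\, \sin(\omega(t-t'))/\omega$, and the boundary term of eq. (15.68) reduces to $\Phi(0)\cos\omega t + \dot\Phi(0)\sin(\omega t)/\omega$. The sketch below (toy model, source and names are assumptions made for the illustration) solves the equation of motion with a Runge-Kutta integrator and compares the result with the right hand side of the formula, evaluated with that same solution inside the integral.

```python
import math


def trajectory(phi0, pi0, j, omega=1.0, lam=1.0, t_max=3.0, n=6000):
    """RK4 solution of phi'' + omega^2 phi + (lam/6) phi^3 = j(t) on a uniform grid.
    Returns (list of phi values at t_k = k h, step h)."""
    h = t_max / n

    def f(t, phi, pi):
        return pi, j(t) - omega ** 2 * phi - lam * phi ** 3 / 6.0

    phi, pi, out = phi0, pi0, [phi0]
    for k in range(n):
        t = k * h
        a1, b1 = f(t, phi, pi)
        a2, b2 = f(t + h / 2, phi + h / 2 * a1, pi + h / 2 * b1)
        a3, b3 = f(t + h / 2, phi + h / 2 * a2, pi + h / 2 * b2)
        a4, b4 = f(t + h, phi + h * a3, pi + h * b3)
        phi += h / 6 * (a1 + 2 * a2 + 2 * a3 + a4)
        pi += h / 6 * (b1 + 2 * b2 + 2 * b3 + b4)
        out.append(phi)
    return out, h


def green_formula(phis, h, phi0, pi0, j, omega=1.0, lam=1.0):
    """Right hand side of the 0+1 dimensional analogue of eq. (15.68) at the last
    grid time. The factors of i cancel between the prefactor and G_R."""
    n = len(phis) - 1
    t = n * h
    integrand = [math.sin(omega * (t - k * h)) / omega
                 * (j(k * h) - lam * phis[k] ** 3 / 6.0) for k in range(n + 1)]
    bulk = h * (sum(integrand) - 0.5 * (integrand[0] + integrand[-1]))  # trapezoid rule
    boundary = phi0 * math.cos(omega * t) + pi0 * math.sin(omega * t) / omega
    return bulk + boundary
```

For instance, with `j = lambda t: 0.3 * math.sin(2.0 * t)` and initial data `(1.0, 0.5)`, the last element of the trajectory and the value returned by `green_formula` should coincide up to the discretization error. As in the text, the formula is not an explicit solution, since it needs $\Phi$ at all intermediate times.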

Except in the trivial case where the potential U(Φ) is zero, eq. (15.68) does not
provide an explicit result for Φ(x), since the right hand side depends on Φ(y) at
points above the initial surface. Despite this limitation, this is a very useful tool in
order to perform formal manipulations involving retarded solutions of eq. (15.63). To
end this section, let us mention a diagrammatic interpretation of eq. (15.68), illustrated
in figure 15.5. One can expand the right hand side of eq. (15.68) in powers of the
interactions. The starting point is the zeroth order approximation, obtained by setting
the potential to U = 0, and then by proceeding recursively in order to keep higher
orders in U. The outcome of this expansion is an infinite series of terms that have a
tree structure. The root of this tree is the point x where the field is evaluated, and its
leaves are either sources j (if there are any above the surface y0 = 0) or the initial
data on the surface y0 = 0. In particular, if the source j(x) vanishes at y0 > 0, then
all the j dependence of the classical field is implicitly hidden in the Φ(y) that appears
in the boundary term.

Extension to a generic initial surface : In eq. (15.68), the initial conditions for
the field Φ have been set on the surface of constant time y0 = 0. However, there are
many situations in which this initial data is known on a different initial surface. Let
us consider a generic surface Σ, on which the field Φ and its derivatives are known.
As before, we wish to obtain a formula that expresses Φ(x) at some point x above Σ
in terms of these initial conditions on Σ.
Most of the derivation is identical to the case of a constant time initial surface,
with all the integrals over the domain y0 > 0 replaced by integrals over the domain Ω
located above Σ. The only significant change occurs when we apply Stokes’ theorem
in order to transform the 4-dimensional integral of a total derivative into an integral
over the boundary of Ω. Like in the previous case, the boundaries at infinite time, and
at infinity in the spatial directions do not contribute, and we have only a contribution
from the surface Σ. Stokes’ theorem can then be written as
$$ \int_\Omega d^4y\; \partial_\mu F^\mu(y) = - \int_\Sigma d^3S_y\; n_\mu F^\mu(y) , \tag{15.69} $$

where $d^3S_y$ is the measure on the surface Σ, and $n^\mu$ is a 4-vector normal to the surface
Σ at the point y, pointing above the surface Σ. In the important case where the initial
surface is invariant by translation in the transverse directions, the proper normalization
for $n^\mu$ and $d^3S_y$ can be obtained as follows. Parameterize an arbitrary displacement
$dy^\mu$ on the surface Σ about the point y as $dy^\mu = (\beta\, dy^3, dy^1, dy^2, dy^3)$, where β
is the local slope of the surface Σ in the $(y^3, y^0)$ plane. Then, we have:
$$ n_\mu\, dy^\mu = 0 , \qquad n_\mu n^\mu = 1 , \quad n^0 > 0 , \qquad d^3S_y = \sqrt{1 - \beta^2}\; dy^1 dy^2 dy^3 . \tag{15.70} $$

The second and third conditions require $|\beta| < 1$ in order to make sense. This
implies that the surface Σ must be locally space-like. Physically, this means that a
signal emitted from a point of the surface Σ cannot reach the surface again in the
future. The relations (15.70) are illustrated in figure 15.6. Note that the orthogonality
defined by nµ dyµ = 0 does not correspond to the Euclidean concept of orthogonality.
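The conditions (15.70) are solved by $n^\mu = \gamma\, (1, 0, 0, \beta)$ with $\gamma = 1/\sqrt{1-\beta^2}$. The short sketch below checks this, assuming the metric signature (+,−,−,−); the helper names are ours.

```python
import math


def minkowski_dot(a, b):
    """Scalar product with signature (+,-,-,-), assumed here."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def unit_normal(beta):
    """Components n^mu of the future-pointing unit normal to a surface whose
    local slope in the (y^3, y^0) plane is beta."""
    if abs(beta) >= 1.0:
        raise ValueError("the surface must be locally space-like: |beta| < 1")
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return (gamma, 0.0, 0.0, gamma * beta)


def measure_factor(beta):
    """Factor multiplying dy^1 dy^2 dy^3 in d^3S_y, eq. (15.70)."""
    return math.sqrt(1.0 - beta * beta)
```

For $\beta = 0.6$, the normal is approximately (1.25, 0, 0, 0.75): it is Minkowski-orthogonal to the tangent vector (0.6, 0, 0, 1) although the two vectors are clearly not orthogonal in the Euclidean sense.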
Figure 15.6: Illustration of eqs. (15.70).

Thanks to eq. (15.69), it is possible to write the Green’s formula for an arbitrary
initial surface Σ as
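$$ \Phi(x) = i \int_\Omega d^4y\; G^0_R(x,y)\, \Big[ j(y) - U'(\Phi(y)) \Big] + i \int_\Sigma d^3S_y\; G^0_R(x,y)\, \big( n\cdot\overrightarrow{\partial}_y - n\cdot\overleftarrow{\partial}_y \big)\, \Phi(y) . \tag{15.71} $$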

For an arbitrary surface Σ, the second term in the right hand side of this formula tells
us explicitly what information about Φ we must provide on the initial surface in order
to determine it uniquely above the surface: at every point y ∈ Σ, one must specify
the values of the field Φ(y) and of its normal derivative n · ∂y Φ(y).

15.5.2 Green’s formula for small field perturbations


Consider now a small perturbation a(x) to the classical field, and assume that a(x) ≪
Φ(x). Therefore, one can linearize the equation of motion of a(x), and we get
$$ \Big[ \square_x + m^2 + U''(\Phi(x)) \Big]\, a(x) = 0 . \tag{15.72} $$

Treating the term U′′ (Φ(x))a(x) as an interaction, we can easily derive a Green’s
formula that expresses the field fluctuation a(x) in terms of its initial conditions on a
surface Σ,
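$$ a(x) = i \int_\Omega d^4y\; G^0_R(x,y)\, \Big[ - U''(\Phi(y))\, a(y) \Big] + i \int_\Sigma d^3S_y\; G^0_R(x,y)\, \big( n\cdot\overrightarrow{\partial}_y - n\cdot\overleftarrow{\partial}_y \big)\, a(y) . \tag{15.73} $$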

Figure 15.7: Typical contribution to a(x) in the diagrammatic representation of
eq. (15.73), in the case of cubic interactions. The solid dots represent the sources j,
the open circles represent the initial data for Φ(y) on the surface $y^0 = 0$, and the open
square the initial data for a(y). The lines are retarded propagators $G^0_R$. The dashed
line is the retarded propagator of the fluctuation in the background Φ, i.e. an inverse
of the operator $\square + U''(\Phi)$.

Eq. (15.73) is illustrated in figure 15.7. Every diagram contributing to a(x) has
exactly one instance of the initial value of a(y) (represented by an open square in
the figure) on the initial surface. Indeed, it is easy to see from eq. (15.73) that a(x)
depends linearly on its value a(y) on the initial surface. This is a consequence of the
fact that the equation of motion for a small fluctuation is a linear equation.

By comparing the figures 15.5 and 15.7, one sees that they differ only by the fact
that one instance of the field Φ(y) has been replaced by the small fluctuation a(y)
on the initial surface. Therefore, we expect a linear relationship between a(x) and
Φ(x), of the form
$$ a(x) = \mathbb{T}_a\, \Phi(x) , \tag{15.74} $$
where $\mathbb{T}_a$ is a linear operator that substitutes one power of Φ(y) by a(y) on Σ (i.e.
an operator that involves first derivatives with respect to the initial conditions on Σ).
It is easy to prove this relation by using eqs. (15.71) and (15.73). In order to do so
and at the same time determine the form of the operator $\mathbb{T}_a$, let us apply $\mathbb{T}_a$ to the
Green’s formula that gives Φ(x). We get (see footnote 6)
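$$ \mathbb{T}_a\, \Phi(x) = i \int_\Omega d^4y\; G^0_R(x,y)\, \Big[ - U''(\Phi(y))\; \mathbb{T}_a\, \Phi(y) \Big] + i\, \mathbb{T}_a \int_\Sigma d^3S_y\; G^0_R(x,y)\, \big( n\cdot\overrightarrow{\partial}_y - n\cdot\overleftarrow{\partial}_y \big)\, \Phi(y) . \tag{15.75} $$
6 Since $\mathbb{T}_a$ acts only on the initial fields on Σ, we have $\mathbb{T}_a\, j = 0$.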

If the boundary term in this formula can be made identical to the boundary term in the
Green’s formula for a(x), then this equation will be identical to the Green’s formula
for a(x) and we will have proven the announced relationship between a(x) and Φ(x).

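This is the case if the operator $\mathbb{T}_a$ is chosen as
$$ \mathbb{T}_a \equiv \int_\Sigma d^3S_y \left[ a(y)\, \frac{\delta}{\delta\Phi(y)} + \big( n\cdot\partial a(y) \big)\, \frac{\delta}{\delta(n\cdot\partial\Phi(y))} \right] , \tag{15.76} $$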

which is nothing but the operator that substitutes a(y) for Φ(y) on the initial surface
Σ, as announced (this definition generalizes eq. (15.52) to a generic initial surface and
to generic initial conditions for the perturbation).

Note that $\mathbb{T}_a$ is an operator that performs an infinitesimal translation (by an
amount a(y)) to the initial condition of the classical field. By exponentiation, it may
be promoted into an operator that performs a finite shift of the initial condition. In
particular, if we denote by $\Phi[\Phi_0]$ the classical field whose initial value on Σ is $\Phi_0$,
then we have
$$ e^{\mathbb{T}_a}\; \Phi[\Phi_0] = \Phi[\Phi_0 + a] . \tag{15.77} $$
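Eq. (15.77) is Taylor's theorem in the space of initial data. As a sketch (the auxiliary real parameter s is introduced here only for the illustration, $\Phi_0$ stands for the field and its normal derivative on Σ, and we assume that the expansion in powers of a converges), $\mathbb{T}_a$ is the derivative along the direction a, so that its powers generate the successive terms of the expansion:

```latex
\begin{aligned}
\mathbb{T}_a\, \Phi[\Phi_0]
  &= \frac{d}{ds}\, \Phi[\Phi_0 + s\, a] \Big|_{s=0} ,\\
e^{\mathbb{T}_a}\, \Phi[\Phi_0]
  &= \sum_{n \ge 0} \frac{1}{n!}\, \mathbb{T}_a^{\,n}\, \Phi[\Phi_0]
   = \sum_{n \ge 0} \frac{1}{n!}\, \frac{d^n}{ds^n}\, \Phi[\Phi_0 + s\, a] \Big|_{s=0}
   = \Phi[\Phi_0 + a] .
\end{aligned}
```

For a free field, $\Phi$ is linear in its initial data, so the series stops at n = 1 and one simply gets $\Phi[\Phi_0] + a$.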

15.5.3 Schwinger-Keldysh formalism

It is often useful to obtain Green’s formulas for fields in the Schwinger-Keldysh
formalism. In this case, one has a pair of fields $\Phi_\pm(x)$, that are both solutions of the
classical equation of motion
$$ (\square_x + m^2)\, \Phi_\pm(x) + U'(\Phi_\pm(x)) = j(x) . \tag{15.78} $$

(For simplicity we take the same source j(x) for the two fields, but this limitation
is easily circumvented if necessary.) Since the Schwinger-Keldysh propagator $G^0_{++}$
is also a Green’s function of the operator $\square_x + m^2$, we can reproduce the previous
derivation of Green’s formula, which leads to (see footnote 7)
$$ \Phi_+(x) = i \int d^4y\; G^0_{++}(x,y)\, \Big[ j(y) - U'(\Phi_+(y)) \Big] + i \int d^3y\; \Big[ G^0_{++}(x,y)\, \big( \overleftarrow{\partial}_{y^0} - \overrightarrow{\partial}_{y^0} \big)\, \Phi_+(y) \Big]_{y^0=-\infty}^{y^0=+\infty} , \tag{15.79} $$
where we used the notation $[f(y^0)]_{y^0=a}^{y^0=b} \equiv f(b) - f(a)$. The only difference with
the Green’s formula derived with retarded propagators is the boundary term: since
$G^0_{++}(x,y)$ does not vanish when $y^0 > x^0$, there is also a non-zero contribution from
the boundary at $y^0 = +\infty$. Then, by using the fact that $(\square_y + m^2)\, G^0_{+-}(x,y) = 0$,
we obtain in a similar way :
$$ 0 = i \int d^4y\; G^0_{+-}(x,y)\, \Big[ j(y) - U'(\Phi_-(y)) \Big] + i \int d^3y\; \Big[ G^0_{+-}(x,y)\, \big( \overleftarrow{\partial}_{y^0} - \overrightarrow{\partial}_{y^0} \big)\, \Phi_-(y) \Big]_{y^0=-\infty}^{y^0=+\infty} . \tag{15.80} $$

7 Here also, the prefactors i follow from our convention for the propagators of the Schwinger-Keldysh
formalism (see eqs. (1.367)).
Subtracting this equation from eq. (15.79), we obtain

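$$
\begin{aligned}
\Phi_+(x) ={}& i \int d^4y\; \Big\{ G^0_{++}(x,y)\, \Big[ j(y) - U'(\Phi_+(y)) \Big] - G^0_{+-}(x,y)\, \Big[ j(y) - U'(\Phi_-(y)) \Big] \Big\}\\
& - i \int d^3y\; \Big[ G^0_{++}(x,y)\, \overleftrightarrow{\partial}_{y^0}\, \Phi_+(y) - G^0_{+-}(x,y)\, \overleftrightarrow{\partial}_{y^0}\, \Phi_-(y) \Big]_{y^0=-\infty}^{y^0=+\infty} ,
\end{aligned}
\tag{15.81}
$$
where $A\, \overleftrightarrow{\partial}_{y^0}\, B \equiv A\, \big( \overrightarrow{\partial}_{y^0} - \overleftarrow{\partial}_{y^0} \big)\, B$. Similarly, we obtain for $\Phi_-(x)$ :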

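$$
\begin{aligned}
\Phi_-(x) ={}& i \int d^4y\; \Big\{ G^0_{-+}(x,y)\, \Big[ j(y) - U'(\Phi_+(y)) \Big] - G^0_{--}(x,y)\, \Big[ j(y) - U'(\Phi_-(y)) \Big] \Big\}\\
& - i \int d^3y\; \Big[ G^0_{-+}(x,y)\, \overleftrightarrow{\partial}_{y^0}\, \Phi_+(y) - G^0_{--}(x,y)\, \overleftrightarrow{\partial}_{y^0}\, \Phi_-(y) \Big]_{y^0=-\infty}^{y^0=+\infty} .
\end{aligned}
\tag{15.82}
$$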

At this point, these formulas are rather formal, and it is not clear why we have
gone through the trouble of subtracting the quantity given by eq. (15.80), since it is
identically zero. This will become transparent in the next section, where we show
that these formulas enable one to sum series of tree diagrams encountered in the
Schwinger-Keldysh formalism.
Note also that the only property of the propagators $G^0_{-+}$ and $G^0_{+-}$ that we have
used in this derivation is the fact that they are annihilated by the operator $\square_y$.
Therefore, the equations (15.81) and (15.82) remain valid if we replace these propagators
by any other pair of propagators sharing the same property. For instance, one can
replace the propagators $G^0_{+-}$ and $G^0_{-+}$ of eqs. (1.367) by the following objects
$$
\begin{aligned}
\mathcal{G}^0_{+-}(x,y) &= \int \frac{d^4p}{(2\pi)^4}\; e^{-i p\cdot(x-y)}\; u(p)\; 2\pi\, \theta(-p^0)\, \delta(p^2) ,\\
\mathcal{G}^0_{-+}(x,y) &= \int \frac{d^4p}{(2\pi)^4}\; e^{-i p\cdot(x-y)}\; v(p)\; 2\pi\, \theta(+p^0)\, \delta(p^2) ,
\end{aligned}
\tag{15.83}
$$

where u(p) and v(p) are some arbitrary functions of the momentum p, without
altering any of the formulas in this section. We will make use of this freedom in the
next section.

15.5.4 Summing tree diagrams using Green’s formulas

Many problems involving strong fields require that one sums infinite series of tree
diagrams. These sums of diagrams can in general be expressed in terms of solutions
of the classical equations of motion. However, in order to determine them uniquely,
one must know the boundary conditions obeyed by these classical solutions. The
strategy in order to obtain them is to write the sum of tree diagrams as a recursive
integral equation. Then, by comparing this integral equation with a Green’s formula
such as eq. (15.68), one can read off the boundary conditions easily.

Sum of retarded trees : Let us illustrate this first in the simplest case, where one
must sum all the tree diagrams built with retarded propagators, and whose leaves are
a source $j(x)$. Let us call $\Phi(x)$ the sum of all such tree diagrams. Given the recursive
structure of such trees, one can immediately write:
\[
\Phi(x) = i \int d^4y\; G^0_R(x,y)\, \Big[ j(y) - U'(\Phi(y)) \Big] , \tag{15.84}
\]
where the integration over $d^4y$ is extended to the entire$^8$ space-time. Therefore, we
see that this formula is identical to the Green’s formula (15.68), with the initial surface
at $y^0 = -\infty$ instead of $y^0 = 0$, and where the boundary term is identically zero.
This means that the sum of these tree diagrams is a retarded solution of the classical
equation of motion with a null boundary condition in the remote past
\[
\begin{aligned}
& (\square_x + m^2)\,\Phi(x) + U'(\Phi(x)) = j(x) , \\
& \lim_{y^0\to-\infty} \Phi(y^0,\mathbf{y}) = 0 , \qquad \lim_{y^0\to-\infty} \partial_0 \Phi(y^0,\mathbf{y}) = 0 .
\end{aligned}
\tag{15.85}
\]
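The equivalence between the recursive sum of retarded trees and the retarded classical solution can be illustrated numerically. The sketch below is only a toy check under stated assumptions, not part of the original derivation: it works in 0+1 dimensions (no spatial dependence), takes $U(\Phi)=\lambda\Phi^4/4!$ with a Gaussian source, and uses the fact that $i\,G^0_R(t,s)$ then reduces to $\theta(t-s)\sin(m(t-s))/m$. All names and parameter values are hypothetical.

```python
import math

# Toy parameters (hypothetical, for illustration only).
M, LAM = 1.0, 1.0              # mass and coupling, with U(phi) = LAM * phi**4 / 4!
T0, T1, N = -5.0, 3.0, 321     # finite time grid standing in for the full time axis
DT = (T1 - T0) / (N - 1)
TS = [T0 + n * DT for n in range(N)]

def source(t):
    """Localized source j(t), negligible at the initial time T0."""
    return 0.5 * math.exp(-t * t)

def u_prime(phi):
    """U'(phi) for U = LAM * phi^4 / 4!."""
    return LAM * phi ** 3 / 6.0

def picard_sum_of_trees(n_iter=25):
    """Iterate phi(t) = int_{T0}^{t} ds sin(M (t - s)) / M * [j(s) - U'(phi(s))].

    This is the 0+1 dimensional analogue of eq. (15.84); each iteration adds
    one more level of vertices to the retarded trees."""
    phi = [0.0] * N
    for _ in range(n_iter):
        rhs = [source(t) - u_prime(p) for t, p in zip(TS, phi)]
        new = [0.0] * N
        for i in range(1, N):
            # Trapezoidal rule; the kernel vanishes at s = t, so that endpoint drops out.
            acc = 0.5 * math.sin(M * (TS[i] - TS[0])) * rhs[0]
            for k in range(1, i):
                acc += math.sin(M * (TS[i] - TS[k])) * rhs[k]
            new[i] = acc * DT / M
        phi = new
    return phi

def rk4_retarded_solution():
    """Solve phi'' + M^2 phi + U'(phi) = j with phi = phi' = 0 at T0 (RK4 steps)."""
    def deriv(t, phi, pi):
        return pi, source(t) - M * M * phi - u_prime(phi)
    phi, pi, out = 0.0, 0.0, [0.0]
    for n in range(N - 1):
        t = TS[n]
        k1 = deriv(t, phi, pi)
        k2 = deriv(t + DT / 2, phi + DT / 2 * k1[0], pi + DT / 2 * k1[1])
        k3 = deriv(t + DT / 2, phi + DT / 2 * k2[0], pi + DT / 2 * k2[1])
        k4 = deriv(t + DT, phi + DT * k3[0], pi + DT * k3[1])
        phi += DT / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        pi += DT / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        out.append(phi)
    return out

trees = picard_sum_of_trees()
classical = rk4_retarded_solution()
max_diff = max(abs(a - b) for a, b in zip(trees, classical))
```

The quantity `max_diff` compares the iterated integral equation with the solution of the differential equation started from vanishing initial data; it should be small, limited by the discretization of the time integral.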

$^8$ In the Feynman rules, the integration at each vertex is extended to the full space-time $\mathbb{R}^4$.

Sum of trees in the Schwinger-Keldysh formalism : Consider now a more complicated
example, in which one must sum tree diagrams in the Schwinger-Keldysh
formalism. Now, each vertex carries an index $\epsilon = \pm$. For simplicity, we assume
that the $+$ and $-$ sources are identical, so that we still have a single source $j(x)$.
Because this is necessary in certain applications, we are going to use the modified
propagators $\mathcal{G}^0_{+-}$ and $\mathcal{G}^0_{-+}$ defined in eqs. (15.83), instead of the propagators $G^0_{+-}$
and $G^0_{-+}$ defined in eqs. (1.367) (the propagators $G^0_{++}$ and $G^0_{--}$ are kept unchanged).
In addition to summing over all the possible trees, we sum over all the combinations of
$\pm$ indices at every internal vertex. Firstly, the sum of these trees can be written in the
form of two coupled integral equations (there are now two fields $\Phi_\pm(x)$ depending
on the index carried by the root of the tree) :
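\[
\begin{aligned}
\Phi_+(x) &= i\int d^4y\; G^0_{++}(x,y)\,\Big[ j(y) - U'(\Phi_+(y)) \Big]
 - i\int d^4y\; \mathcal{G}^0_{+-}(x,y)\,\Big[ j(y) - U'(\Phi_-(y)) \Big] , \\
\Phi_-(x) &= i\int d^4y\; \mathcal{G}^0_{-+}(x,y)\,\Big[ j(y) - U'(\Phi_+(y)) \Big]
 - i\int d^4y\; G^0_{--}(x,y)\,\Big[ j(y) - U'(\Phi_-(y)) \Big] .
\end{aligned}
\tag{15.86}
\]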

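At this point, we recognize that the right hand side of these equations is identical to
the first term in the right hand side of eqs. (15.81) and (15.82). From this observation,
we conclude that $\Phi_+(x)$ and $\Phi_-(x)$ are solutions of the classical equation of motion,
\[
(\square_x + m^2)\,\Phi_\pm(x) + U'(\Phi_\pm(x)) = j(x) , \tag{15.87}
\]
and that they obey the following boundary conditions
\[
\begin{aligned}
\int d^3\mathbf{y}\; \Big[ G^0_{++}(x,y)\, \overset{\leftrightarrow}{\partial}_{y^0}\, \Phi_+(y)
 - \mathcal{G}^0_{+-}(x,y)\, \overset{\leftrightarrow}{\partial}_{y^0}\, \Phi_-(y) \Big]_{y^0=-\infty}^{y^0=+\infty} &= 0 , \\
\int d^3\mathbf{y}\; \Big[ \mathcal{G}^0_{-+}(x,y)\, \overset{\leftrightarrow}{\partial}_{y^0}\, \Phi_+(y)
 - G^0_{--}(x,y)\, \overset{\leftrightarrow}{\partial}_{y^0}\, \Phi_-(y) \Big]_{y^0=-\infty}^{y^0=+\infty} &= 0 .
\end{aligned}
\tag{15.88}
\]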

We have now coupled boundary conditions for the fields Φ+ and Φ− , that involve the
value of the fields both at y0 = −∞ and at y0 = +∞. In addition, these boundary
conditions are non-local in coordinate space, since they involve integrals over d3 y on
the surfaces y0 = ±∞. However, they can be simplified considerably if one uses the
following Fourier representations for the propagators
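\[
\begin{aligned}
G^0_{++}(x,y) &= \int \frac{d^3\mathbf{p}}{(2\pi)^3 2E_{\mathbf{p}}}\; \Big[ \theta(x^0-y^0)\, e^{-ip\cdot(x-y)} + \theta(y^0-x^0)\, e^{+ip\cdot(x-y)} \Big] , \\
G^0_{--}(x,y) &= \int \frac{d^3\mathbf{p}}{(2\pi)^3 2E_{\mathbf{p}}}\; \Big[ \theta(x^0-y^0)\, e^{+ip\cdot(x-y)} + \theta(y^0-x^0)\, e^{-ip\cdot(x-y)} \Big] , \\
\mathcal{G}^0_{+-}(x,y) &= \int \frac{d^3\mathbf{p}}{(2\pi)^3 2E_{\mathbf{p}}}\; u(\mathbf{p})\; e^{+ip\cdot(x-y)} , \\
\mathcal{G}^0_{-+}(x,y) &= \int \frac{d^3\mathbf{p}}{(2\pi)^3 2E_{\mathbf{p}}}\; v(\mathbf{p})\; e^{-ip\cdot(x-y)} .
\end{aligned}
\tag{15.89}
\]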

Compared to the expressions of these propagators in eqs. (1.367) and (15.83), we
have performed explicitly the integration over $p^0$ in order to obtain these formulas.
Thus, whenever $p^0$ appears in these expressions, it should be replaced by the positive
on-shell value $p^0 = |\mathbf{p}|$. The other ingredient we need in order to simplify the
boundary conditions is a Fourier representation for the fields $\Phi_\pm(y)$,
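\[
\Phi_\epsilon(y) \equiv \int \frac{d^3\mathbf{p}}{(2\pi)^3 2E_{\mathbf{p}}}\; \Big[ f_\epsilon^{(+)}(y^0,\mathbf{p})\, e^{-ip\cdot y} + f_\epsilon^{(-)}(y^0,\mathbf{p})\, e^{+ip\cdot y} \Big] . \tag{15.90}
\]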

The superscripts $(\pm)$ on the Fourier coefficients serve to distinguish the positive and
negative frequency modes. Note that because the fields $\Phi_\pm(y)$ are not free fields,
these Fourier coefficients are time dependent. In practice, one may assume that the
interactions are switched off at $y^0 = \pm\infty$, so that the coefficients $f_\epsilon^{(\pm)}(y^0,\mathbf{p})$ tend to
constants when $y^0 \to \pm\infty$. However, these limiting values are different at $y^0 = +\infty$
and at $y^0 = -\infty$, and we must keep the $y^0$ argument to distinguish them. Using the
identity
\[
\int d^3\mathbf{y}\; e^{i\epsilon p\cdot(x-y)}\; \overset{\leftrightarrow}{\partial}_{y^0}\; e^{i\epsilon' p'\cdot y}
 = i\,\delta_{\epsilon\epsilon'}\; e^{i\epsilon p\cdot x}\; (2\pi)^3\, 2E_{\mathbf{p}}\; \delta(\mathbf{p}-\mathbf{p}') , \tag{15.91}
\]

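(valid for $\epsilon, \epsilon' = \pm$) it is easy to rewrite the boundary conditions (15.88) as a set of
separate conditions for each Fourier mode $\mathbf{p}$:
\[
\begin{aligned}
f_+^{(+)}(-\infty,\mathbf{p}) &= f_-^{(-)}(-\infty,\mathbf{p}) = 0 , \\
f_+^{(-)}(+\infty,\mathbf{p}) &= u(\mathbf{p})\; f_-^{(-)}(+\infty,\mathbf{p}) , \\
f_-^{(+)}(+\infty,\mathbf{p}) &= v(\mathbf{p})\; f_+^{(+)}(+\infty,\mathbf{p}) .
\end{aligned}
\tag{15.92}
\]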

The boundary conditions have a very compact expression in terms of the Fourier
coefficients of the fields Φ± . At y0 = −∞, Φ+ (y) has no positive energy modes
and Φ− (y) has no negative energy modes. At y0 = +∞, the negative energy modes
of Φ+ (y) and Φ− (y) are proportional (with a proportionality relation that involves
the function u(p)). A similar relation, that involves the function v(p), holds between
their positive energy modes at y0 = +∞. Eqs. (15.92), together with the equations
of motion (15.87), determine uniquely the fields Φ± (x) and therefore provide the
solution to our original problem of summing tree diagrams in the Schwinger-Keldysh
formalism. One should however keep in mind that this solution is somewhat formal,
because it is in general extremely difficult to solve a non-linear field equation of
motion with boundary conditions specified both at y0 = −∞ and y0 = +∞.
Let us also mention that these boundary conditions become considerably simpler
in the case where $u(\mathbf{p}) = v(\mathbf{p}) \equiv 1$. Indeed, from the second and third of eqs. (15.92),
we see that the fields $\Phi_\pm(y)$ have identical Fourier coefficients at $y^0 = +\infty$. Therefore,
the two fields must be equal in the limit $y^0 \to +\infty$. Then, by solving their
equation of motion backwards in time, one sees trivially that they are equal at all
times (since they obey identical equations of motion),
\[
\text{if } u(\mathbf{p}) = v(\mathbf{p}) \equiv 1 , \qquad \Phi_+(x) = \Phi_-(x) , \quad \text{for all } x \in \mathbb{R}^4 . \tag{15.93}
\]
Finally, the first of eqs. (15.92) tells us that
\[
\text{if } u(\mathbf{p}) = v(\mathbf{p}) \equiv 1 , \qquad \lim_{x^0\to-\infty} \Phi_\pm(x) = 0 . \tag{15.94}
\]

To summarize, when u(p) = v(p) ≡ 1, the two fields Φ± (x) are equal to the
retarded field that vanishes when x0 → −∞. This result could in fact have been
obtained by a much more elementary argument. Indeed, when u(p) = v(p) ≡ 1,
the summation over the ± indices at the vertices of tree diagrams always leads to the
following combinations of propagators,

\[
G^0_{++} - G^0_{+-} = G^0_{-+} - G^0_{--} = G^0_R . \tag{15.95}
\]

In other words, summing over these indices amounts to replacing all the propagators
in a given tree by retarded propagators, and one is thus led to the problem discussed
in section 15.5.4.
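The identity (15.95) can be checked in one line from the time-ordered decompositions of $G^0_{++}$ and $G^0_{--}$ in terms of $G^0_{-+}$ and $G^0_{+-}$ (as written in eqs. (15.128)). The short computation below is only a sketch of that step, with $u = v \equiv 1$ assumed throughout so that the modified and original propagators coincide:

```latex
\begin{align*}
G^0_{++} - G^0_{+-}
 &= \theta(x^0-y^0)\,G^0_{-+} + \theta(y^0-x^0)\,G^0_{+-} - G^0_{+-}
  = \theta(x^0-y^0)\,\big[G^0_{-+} - G^0_{+-}\big] , \\
G^0_{-+} - G^0_{--}
 &= G^0_{-+} - \theta(x^0-y^0)\,G^0_{+-} - \theta(y^0-x^0)\,G^0_{-+}
  = \theta(x^0-y^0)\,\big[G^0_{-+} - G^0_{+-}\big] .
\end{align*}
```

Both combinations vanish for $x^0 < y^0$, which is the support property expected of a retarded propagator.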

15.6 Mode functions


15.6.1 Propagators in a background field
We have introduced in eqs. (15.50) a set of small perturbations on top of a background
field Φ, as a way to express propagators dressed by this background. We shall discuss
further properties of these functions in this section. However, before we do so, let
us propose an alternative derivation of the dressed propagators. In eqs. (15.49),
this problem was solved by writing the equations of motion obeyed by the various
propagators, as well as their boundary conditions, and by exhibiting an expression
that fulfills both. The method proposed here simply amounts to performing explicitly
the resummation of the background field insertions. As one will see, this approach is
arguably more tedious, but is in a sense much more elementary.
The starting point is eq. (15.44), that performs the resummation of the background
field. In this form, the equation is fairly complicated to solve because the four
components of the Schwinger-Keldysh propagator get mixed already after the first
insertion of the background field. However, these equations can be simplified
considerably. The simplification is based on the observation that the four propagators
are not independent, but satisfy a linear relation,
\[
\begin{aligned}
G^0_{++} + G^0_{--} &= G^0_{+-} + G^0_{-+} , \\
G_{++} + G_{--} &= G_{+-} + G_{-+} ,
\end{aligned}
\tag{15.96}
\]
which follows immediately from their definition as path-ordered products of two
fields, and from the identity $\theta(x) + \theta(-x) = 1$. It is possible to exploit this relation
as follows: perform a rotation on the matrix made of the four propagators so that
one component of the rotated matrix becomes zero. Having a zero in the matrix of
propagators makes the resummation of the background field considerably simpler.
Therefore, let us define
\[
\mathbb{G}_{\alpha\beta} \equiv \sum_{\epsilon,\epsilon'=\pm} \Omega_{\alpha\epsilon}\, \Omega_{\beta\epsilon'}\, G_{\epsilon\epsilon'} . \tag{15.97}
\]
(The same rotation is applied to the free propagators.) There is not a unique choice
of the matrix $\Omega_{\alpha\epsilon}$ that gives a zero component in $\mathbb{G}_{\alpha\beta}$, but the following choice is
convenient:
\[
\Omega_{\alpha\epsilon} \equiv \begin{pmatrix} 1 & -1 \\ 1/2 & 1/2 \end{pmatrix} . \tag{15.98}
\]

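The rotated propagators read
\[
\mathbb{G}^0_{\alpha\beta} = \begin{pmatrix} 0 & G^0_A \\ G^0_R & G^0_S \end{pmatrix} ,
\qquad
\mathbb{G}_{\alpha\beta} = \begin{pmatrix} 0 & G_A \\ G_R & G_S \end{pmatrix} ,
\tag{15.99}
\]
where we have introduced
\[
\begin{aligned}
G^0_R &= G^0_{++} - G^0_{+-} , & G_R &= G_{++} - G_{+-} , \\
G^0_A &= G^0_{++} - G^0_{-+} , & G_A &= G_{++} - G_{-+} , \\
G^0_S &= \tfrac12\big( G^0_{++} + G^0_{--} \big) , & G_S &= \tfrac12\big( G_{++} + G_{--} \big) .
\end{aligned}
\tag{15.100}
\]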

(The subscripts R, A and S stand respectively for retarded, advanced and symmetric.)
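The effect of the rotation (15.97) with the matrix (15.98) can be verified with exact rational arithmetic. The following sketch is only an illustration: random rational numbers stand in for the values of the propagators at a fixed pair of points, and the helper names are hypothetical.

```python
from fractions import Fraction as F
import random

# Rotation matrix of eq. (15.98): rows alpha = 1, 2; columns epsilon = +, -.
OMEGA = [[F(1), F(-1)],
         [F(1, 2), F(1, 2)]]

def rotate(g):
    """Return GG[a][b] = sum_{e,f} OMEGA[a][e] * OMEGA[b][f] * g[e][f], as in eq. (15.97)."""
    return [[sum(OMEGA[a][e] * OMEGA[b][f] * g[e][f]
                 for e in range(2) for f in range(2))
             for b in range(2)]
            for a in range(2)]

def random_sk_propagator(rng):
    """Random stand-ins for G_{++}, G_{+-}, G_{-+} at fixed (x, y);
    G_{--} is then fixed by the linear relation (15.96)."""
    gpp, gpm, gmp = (F(rng.randint(-9, 9), rng.randint(1, 9)) for _ in range(3))
    gmm = gpm + gmp - gpp
    return [[gpp, gpm], [gmp, gmm]]   # index 0 <-> '+', index 1 <-> '-'

rng = random.Random(0)
g = random_sk_propagator(rng)
gg = rotate(g)

# Expected structure of the rotated matrix.
retarded = g[0][0] - g[0][1]          # G_{++} - G_{+-}
advanced = g[0][0] - g[1][0]          # G_{++} - G_{-+}
symmetric = (g[0][0] + g[1][1]) / 2   # (G_{++} + G_{--}) / 2
```

With these conventions, `gg[0][0]` vanishes, `gg[0][1]` and `gg[1][0]` reproduce the advanced and retarded combinations, and `gg[1][1]` is the symmetric one.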
After having performed this rotation, eq. (15.44) is transformed into
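\[
\mathbb{G}_{\alpha\beta}(x,y) = \mathbb{G}^0_{\alpha\beta}(x,y) - i \sum_{\delta,\gamma} \int d^4z\; \mathbb{G}^0_{\alpha\delta}(x,z)\; U''(\Phi(z))\; \sigma_{\delta\gamma}\; \mathbb{G}_{\gamma\beta}(z,y) ,
\tag{15.101}
\]
where we denote
\[
\sigma \equiv \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} . \tag{15.102}
\]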

In order to make the notations more compact, let us introduce the following shorthand,

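\[
\big[ \mathbb{A} \circ \mathbb{B} \big]_{\alpha\beta}(x,y) \equiv -i \sum_{\delta,\gamma} \int d^4z\; \mathbb{A}_{\alpha\delta}(x,z)\; U''(\Phi(z))\; \sigma_{\delta\gamma}\; \mathbb{B}_{\gamma\beta}(z,y) . \tag{15.103}
\]
With this notation, eq. (15.101) takes a very compact form,
\[
\mathbb{G} = \mathbb{G}^0 + \mathbb{G}^0 \circ \mathbb{G} , \tag{15.104}
\]
and its solution is
\[
\mathbb{G} = \sum_{n=0}^{\infty} \big[ \mathbb{G}^0 \big]^{\circ (n+1)} ,
\qquad \text{with} \qquad
\mathbb{A}^{\circ n} \equiv \underbrace{\mathbb{A} \circ \cdots \circ \mathbb{A}}_{n\ \text{times}} . \tag{15.105}
\]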
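What makes the calculation of this infinite sum easy after the rotation we have
performed is the fact that the elementary object $\mathbb{G}^0 \sigma$ is the sum of a diagonal and a
nilpotent matrix:
\[
\mathbb{G}^0 \sigma = \mathbb{D} + \mathbb{N} ,
\qquad
\mathbb{D} \equiv \begin{pmatrix} G^0_A & 0 \\ 0 & G^0_R \end{pmatrix} ,
\qquad
\mathbb{N} \equiv \begin{pmatrix} 0 & 0 \\ G^0_S & 0 \end{pmatrix} . \tag{15.106}
\]
One has $\mathbb{N}^2 = 0$, which greatly simplifies the calculation of the n-th power of $\mathbb{G}^0 \sigma$.
From this observation, it is easy to obtain
\[
\big[ \mathbb{G}^0 \big]^{\circ (n+1)} =
\begin{pmatrix}
0 & \big[ G^0_A \big]^{\star (n+1)} \\
\big[ G^0_R \big]^{\star (n+1)} & \sum_{i=0}^{n} \big[ G^0_R \big]^{\star i} \star G^0_S \star \big[ G^0_A \big]^{\star (n-i)}
\end{pmatrix} , \tag{15.107}
\]
with the notation
\[
\big[ A \star B \big](x,y) \equiv -i \int d^4z\; A(x,z)\; U''(\Phi(z))\; B(z,y) , \tag{15.108}
\]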
(and an obvious definition for the ⋆-exponentiation.) The summation of the off-
diagonal components of eq. (15.107) is trivial since these terms do not mix. Moreover,
the resummed GS propagator has a simple expression in terms of the resummed
retarded and advanced propagators. These results can be summarized by
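\[
\begin{aligned}
G_R &= \sum_{n=0}^{\infty} \big[ G^0_R \big]^{\star (n+1)} , \qquad
G_A = \sum_{n=0}^{\infty} \big[ G^0_A \big]^{\star (n+1)} , \\
G_S &= G_R\, (G^0_R)^{-1}\, G^0_S\, (G^0_A)^{-1}\, G_A .
\end{aligned}
\tag{15.109}
\]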
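For readers who want to see where the matrix structure of eq. (15.107) comes from, here is a short sketch of the intermediate algebra. It assumes only that $\mathbb{N} \star \mathbb{D}^{\star i} \star \mathbb{N} = 0$ for all $i \ge 0$ (which follows from the explicit forms in eq. (15.106)), and it uses the convention that a zeroth $\star$-power acts as the identity; all products are understood in the $\star$ sense of eq. (15.108).

```latex
\begin{align*}
\big[\mathbb{D}+\mathbb{N}\big]^{\star n}
 &= \mathbb{D}^{\star n}
  + \sum_{i=0}^{n-1} \mathbb{D}^{\star i}\star\mathbb{N}\star\mathbb{D}^{\star(n-1-i)}
  = \begin{pmatrix}
      [G^0_A]^{\star n} & 0 \\
      \sum_{i=0}^{n-1} [G^0_R]^{\star i}\star G^0_S\star[G^0_A]^{\star(n-1-i)} & [G^0_R]^{\star n}
    \end{pmatrix} , \\
\big[\mathbb{G}^0\big]^{\circ(n+1)}
 &= \big[\mathbb{D}+\mathbb{N}\big]^{\star n}\star\mathbb{G}^0 ,
 \qquad
 \mathbb{G}^0 = \begin{pmatrix} 0 & G^0_A \\ G^0_R & G^0_S \end{pmatrix} .
\end{align*}
```

Multiplying out the last product reproduces the entries of eq. (15.107): the lower-right entry collects the sum over $i \le n-1$ (with one more factor of $G^0_A$ on the right) together with $[G^0_R]^{\star n} \star G^0_S$, which is the $i = n$ term.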

At this stage, we know all the components of the resummed propagator in the rotated
basis. In order to obtain them in the original basis, we just have to invert the rotation
of eq. (15.97), which gives

\[
\begin{aligned}
G_{-+} &= G_R\, (G^0_R)^{-1}\, G^0_{-+}\, (G^0_A)^{-1}\, G_A , \\
G_{+-} &= G_R\, (G^0_R)^{-1}\, G^0_{+-}\, (G^0_A)^{-1}\, G_A .
\end{aligned}
\tag{15.110}
\]
It is easy to check that these equations are equivalent to eqs. (15.49).

15.6.2 Basis of retarded small fluctuations


Since small perturbations obey a linear equation of motion, it is always possible to
write them as a linear superposition of small fluctuations that obey retarded boundary
conditions. Let us introduce $a_{\pm\mathbf{k}}(x)$, defined by the equation of motion
\[
\Big[ \square_x + m^2 + U''(\Phi(x)) \Big]\, a_{\pm\mathbf{k}}(x) = 0 , \tag{15.111}
\]
and the retarded boundary condition
\[
\lim_{x^0\to-\infty} a_{\pm\mathbf{k}}(x) = e^{\mp ik\cdot x} . \tag{15.112}
\]
Note that for a real potential $U(\Phi)$, $a_{+\mathbf{k}}(x)$ and $a_{-\mathbf{k}}(x)$ are mutual complex conjugates. Any
solution of the equation of motion for small fluctuations can be written as
\[
a(x) = \int \frac{d^3\mathbf{k}}{(2\pi)^3 2E_{\mathbf{k}}}\; \Big[ \alpha^{\mathbf{k}}_+\, a_{+\mathbf{k}}(x) + \alpha^{\mathbf{k}}_-\, a_{-\mathbf{k}}(x) \Big] , \tag{15.113}
\]
where the $\alpha^{\mathbf{k}}_\pm$ are constant coefficients that depend on the boundary conditions (the
boundary conditions in general lead to a set of linear equations for the coefficients).

15.6.3 Completeness relations


The set of small fluctuations $\{a_{+\mathbf{k}}(x), a_{-\mathbf{k}}(x)\}$ obeys some useful relations, that are a
consequence of unitarity. Consider first two generic solutions $a_1(x)$ and $a_2(x)$ of the
equation of small fluctuations. In order to make the notations more compact in the
rest of this section, it is useful to introduce the following notations
\[
\big| a \big) \equiv \begin{pmatrix} a(x) \\ \dot a(x) \end{pmatrix} ,
\qquad
\big( a \big| \equiv \begin{pmatrix} a^*(x) & \dot a^*(x) \end{pmatrix} \sigma_2 , \tag{15.114}
\]
where the dot denotes a time derivative and $\sigma_2$ is the second Pauli matrix. Thanks
to the fact that the background potential $U''(\Phi(x))$ is real, one can construct from
$a_1$ and $a_2$ an inner product which is an invariant of the time evolution of the two
perturbations. This quantity is reminiscent of the Wronskian for two solutions of a
second order ordinary differential equation, and it is defined as follows
\[
\big( a_1 \big| a_2 \big) \equiv i \int d^3\mathbf{x}\; \Big[ \dot a_1^*(x)\, a_2(x) - a_1^*(x)\, \dot a_2(x) \Big] . \tag{15.115}
\]
Although $\big( a_1 \big| a_2 \big)$ could in principle depend on time (since one integrates only over
space in its definition), it is immediate to verify that
\[
\frac{\partial}{\partial x^0} \big( a_1 \big| a_2 \big) = 0 . \tag{15.116}
\]
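The verification of eq. (15.116) can be spelled out as follows. This is a sketch that assumes the fluctuations fall off fast enough at spatial infinity for the boundary terms of the spatial integrations by parts to vanish, and it uses the equation of motion (15.111) in the form $\ddot a = \nabla^2 a - (m^2 + U''(\Phi))\, a$, together with its complex conjugate (which has the same form because $U''(\Phi)$ is real):

```latex
\begin{align*}
\frac{\partial}{\partial x^0}\big(a_1\big|a_2\big)
 &= i\int d^3\mathbf{x}\;\Big[\ddot a_1^*\,a_2 - a_1^*\,\ddot a_2\Big] \\
 &= i\int d^3\mathbf{x}\;\Big[\big(\nabla^2 a_1^*\big)\,a_2 - a_1^*\,\big(\nabla^2 a_2\big)\Big]
  - i\int d^3\mathbf{x}\;\big(m^2+U''(\Phi)\big)\,\Big[a_1^*\,a_2 - a_1^*\,a_2\Big] \\
 &= 0 .
\end{align*}
```

The terms with first time derivatives cancel in the first line, the mass and background terms cancel identically, and the gradient terms cancel after integrating by parts twice.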
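Since it is a constant in time, one can compute this inner product from the value of the
field fluctuations in the remote past. This is particularly handy when the fluctuations
under consideration are specified by retarded boundary conditions, as is the case for
$a_{\pm\mathbf{k}}(x)$. One finds
\[
\begin{aligned}
\big( a_{+\mathbf{k}} \big| a_{+\mathbf{l}} \big) &= (2\pi)^3\, 2E_{\mathbf{k}}\; \delta(\mathbf{k}-\mathbf{l}) , \\
\big( a_{-\mathbf{k}} \big| a_{-\mathbf{l}} \big) &= -(2\pi)^3\, 2E_{\mathbf{k}}\; \delta(\mathbf{k}-\mathbf{l}) , \\
\big( a_{+\mathbf{k}} \big| a_{-\mathbf{l}} \big) &= \big( a_{-\mathbf{k}} \big| a_{+\mathbf{l}} \big) = 0 .
\end{aligned}
\tag{15.117}
\]
Consider now a generic solution $a(x)$ of eq. (15.111). Since the $\big| a_{\pm\mathbf{k}} \big)$ form a basis of
the linear space of solutions, one can write $a(x)$ as a linear superposition
\[
\big| a \big) = \int \frac{d^3\mathbf{k}}{(2\pi)^3 2E_{\mathbf{k}}}\; \Big[ \alpha^{\mathbf{k}}_+\, \big| a_{+\mathbf{k}} \big) + \alpha^{\mathbf{k}}_-\, \big| a_{-\mathbf{k}} \big) \Big] , \tag{15.118}
\]
where the coefficients $\alpha^{\mathbf{k}}_\pm$ do not depend on time or space. By using the orthogonality
relations obeyed by the vectors $\big| a_{\pm\mathbf{k}} \big)$, one easily gets
\[
\alpha^{\mathbf{k}}_+ = \big( a_{+\mathbf{k}} \big| a \big) , \qquad \alpha^{\mathbf{k}}_- = - \big( a_{-\mathbf{k}} \big| a \big) . \tag{15.119}
\]
By inserting these relations back into eq. (15.118), and by using the fact that it is
valid for any small fluctuation $a(x)$ solution of eq. (15.111), we obtain the following
identity
\[
\int \frac{d^3\mathbf{k}}{(2\pi)^3 2E_{\mathbf{k}}}\; \Big[ \big| a_{+\mathbf{k}} \big) \big( a_{+\mathbf{k}} \big| - \big| a_{-\mathbf{k}} \big) \big( a_{-\mathbf{k}} \big| \Big] = 1 . \tag{15.120}
\]
This identity is valid at all times over the space of solutions of eq. (15.111). It is a
manifestation of the fact that, when the background field is real, the time evolution
preserves the completeness of the set of states $\big| a_{\pm\mathbf{k}} \big)$.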

15.7 Multi-point correlation functions at tree level

15.7.1 Generating functional for local measurements

Definition : In the previous section, we have studied a generic observable at leading
and next-to-leading orders in $\lambda$, and we have established a general functional
relationship that relates them. In a sense, this relationship reflects the fact that the first $\hbar$
correction in a quantum theory is not fully quantum: at this order only the initial state
contains quantum effects, but the time evolution of the system is still classical.

Let us consider now an observable involving multiple points $x_1, \cdots, x_n$, corresponding
to $n$ measurements. For simplicity, we assume that the points $x_i$ where the
measurements are performed lie on the same surface of constant time $x^0 = t_f$, but
the final results are valid for any locally space-like surface (this ensures that there
is no causal relation between the points $x_i$, and also that the ordering between the
operators in the correlator does not matter). In this case, the leading order is a completely
disconnected contribution made of $n$ separate factors, that does not contain
any correlation between the $n$ measurements. However, the physically interesting
information lies in the correlation between these measurements,
\[
C_{\{1\cdots n\}} \equiv \big\langle O(x_1) \cdots O(x_n) \big\rangle_c , \tag{15.121}
\]

where the subscript $c$ indicates that we retain only the connected part of the correlator.
From the generic power counting arguments developed in the previous sections,
these connected correlators are all of order $\lambda^{-1}$ in the strong field regime. It is
also important to realize that the connected part of these correlators is subleading
compared to their fully disconnected part, since
\[
\begin{aligned}
\big\langle O(x_1) \cdots O(x_n) \big\rangle
 &= \underbrace{\big\langle O(x_1) \big\rangle \cdots \big\langle O(x_n) \big\rangle}_{\lambda^{-n}}
 + \underbrace{\sum_{i<j} \big\langle O(x_i) O(x_j) \big\rangle_c \prod_{k \neq i,j} \big\langle O(x_k) \big\rangle}_{\lambda^{1-n}} \\
 &\quad + \cdots + \underbrace{\big\langle O(x_1) \cdots O(x_n) \big\rangle_c}_{\lambda^{-1}} .
\end{aligned}
\tag{15.122}
\]
We see in this formula that, in the strong field regime, the fully connected part of an
$n$-point correlator is suppressed by $\lambda^{n-1}$ compared to the trivial disconnected term.
Thus, even at tree level, the correlated part of an $n$-point function is not a leading order
quantity, but arises only at order $n-1$ in the expansion in powers of $\lambda$.
One can encapsulate all the correlation functions (15.121) into a generating
functional defined as follows$^9$:
\[
F[z(\mathbf{x})] \equiv \Big\langle 0_{\rm in} \Big|\, \exp \int_{t_f} d^3\mathbf{x}\; z(\mathbf{x})\, O(\phi(x)) \,\Big| 0_{\rm in} \Big\rangle , \tag{15.123}
\]
where the argument of the field in $O$ is $x \equiv (t_f, \mathbf{x})$. From this generating functional,
the correlation functions are obtained by differentiating with respect to the $z(\mathbf{x}_i)$ and
by setting $z \equiv 0$ afterwards. In order to remove the uncorrelated part of the $n$-point
function, we should differentiate the logarithm of $F$, i.e.
\[
C_{\{1\cdots n\}} = \left. \frac{\delta^n \ln F}{\delta z(\mathbf{x}_1) \cdots \delta z(\mathbf{x}_n)} \right|_{z \equiv 0} . \tag{15.124}
\]

$^9$ This is easily generalized to the case where the initial state is a coherent state instead of the vacuum.

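The observable $O(\phi(x))$ is made of the field in the Heisenberg picture, $\phi(x)$, that can
be related to the field $\phi_{\rm in}(x)$ of the interaction picture as follows:
\[
\phi(x) = U(-\infty, x^0)\; \phi_{\rm in}(x)\; U(x^0, -\infty) , \tag{15.125}
\]
where $U(t_2, t_1)$ is an evolution operator given in terms of the interactions by the
following formula
\[
U(t_2, t_1) = T \exp\; i \int_{t_1}^{t_2} dx^0\, d^3\mathbf{x}\; \mathcal{L}_{\rm int}(\phi_{\rm in}(x)) . \tag{15.126}
\]
We can therefore rewrite the generating functional solely in terms of the interaction
picture field $\phi_{\rm in}$,
\[
F[z(\mathbf{x})] = \Big\langle 0_{\rm in} \Big|\, P \exp \int d^3\mathbf{x}\; \Big[ i \int dx^0\, \Big( \mathcal{L}_{\rm int}(\phi^+_{\rm in}(x)) - \mathcal{L}_{\rm int}(\phi^-_{\rm in}(x)) \Big)
 + z(\mathbf{x})\, O(\phi_{\rm in}(t_f, \mathbf{x})) \Big] \,\Big| 0_{\rm in} \Big\rangle , \tag{15.127}
\]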

where $P$ denotes the path ordering on the Schwinger-Keldysh time contour $\mathcal{C}$. We
denote by $\phi^+_{\rm in}$ the (interaction picture) field that lives on the upper branch and by
$\phi^-_{\rm in}$ the field on the lower branch (the minus sign in front of the term $\mathcal{L}_{\rm int}(\phi^-_{\rm in}(x))$
comes from the fact that the lower branch is oriented from $+\infty$ to $-\infty$). The operator
$O(\phi_{\rm in}(x))$ lives at the final time of this contour, and could either be viewed as made
of fields of type $+$ or of type $-$ (the two choices lead to the same results).

Expression in the Schwinger-Keldysh formalism : Since the initial state is the
vacuum, the generating functional defined in eq. (15.127) can be represented
diagrammatically as the sum of all the vacuum-to-vacuum graphs (i.e. graphs without
external legs) in the Schwinger-Keldysh formalism (see section 1.16.5), extended
by an extra vertex that corresponds to the insertions of the observable $O$. Let us
recall here that the Schwinger-Keldysh diagrammatic rules consist in having two
types of interaction vertices ($+$ and $-$ depending on which branch of the contour the
vertex lies on, the $-$ vertex being the opposite of the $+$ one) and four types of bare
propagators ($G^0_{++}$, $G^0_{--}$, $G^0_{+-}$ and $G^0_{-+}$) depending on the location of the endpoints
on the contour. The additional vertex exists only on the final surface, at the time $t_f$.
It is accompanied by a factor $z(\mathbf{x})$, and has as many legs as there are fields in $O(\phi)$.
There is only one kind of this vertex (we can decide to call it $+$ or $-$ without affecting
anything). We recapitulate these Feynman rules in figure 15.8.

[Figure 15.8: Diagrammatic rules for the extended Schwinger-Keldysh formalism that
gives the generating functional. The figure lists the four bare propagators $G^0_{++}$,
$G^0_{--}$, $G^0_{+-}$ and $G^0_{-+}$, the self-interaction vertices $-i\lambda$ (type $+$) and $+i\lambda$ (type $-$),
the source vertices $+iJ(x)$ (type $+$) and $-iJ(x)$ (type $-$), and the observable vertex
proportional to $z(\mathbf{x})$. The Feynman rules shown for the self-interactions correspond
to a $\lambda\phi^4/4!$ interaction term. In this illustration, the observable is assumed to be
quartic in the field when drawing the corresponding vertex.]

In the case of
the vacuum initial state, we recall that the propagators have the following explicit
expressions:
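\[
\begin{aligned}
G^0_{-+}(x,y) &= \int \frac{d^3\mathbf{k}}{(2\pi)^3 2E_{\mathbf{k}}}\; e^{-ik\cdot(x-y)} , \\
G^0_{+-}(x,y) &= \int \frac{d^3\mathbf{k}}{(2\pi)^3 2E_{\mathbf{k}}}\; e^{+ik\cdot(x-y)} , \\
G^0_{++}(x,y) &= \theta(x^0-y^0)\, G^0_{-+}(x,y) + \theta(y^0-x^0)\, G^0_{+-}(x,y) , \\
G^0_{--}(x,y) &= \theta(x^0-y^0)\, G^0_{+-}(x,y) + \theta(y^0-x^0)\, G^0_{-+}(x,y) .
\end{aligned}
\tag{15.128}
\]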

Note that when we set $z \equiv 0$, these diagrammatic rules fall back to the pure
Schwinger-Keldysh formalism, for which all the connected vacuum-to-vacuum graphs are
zero. This implies that
\[
F[z \equiv 0] = 1 , \tag{15.129}
\]
in accordance with the fact that this should be $\big\langle 0_{\rm in} \big| 0_{\rm in} \big\rangle = 1$.

Retarded-advanced representation : In order to clarify which approximations are
legitimate in the strong field regime, it is useful to use a different basis of fields by
introducing
\[
\phi_2 \equiv \tfrac12 \big( \phi_+ + \phi_- \big) , \qquad \phi_1 \equiv \phi_+ - \phi_- . \tag{15.130}
\]
The half-sum $\phi_2$ in a sense captures the classical content (plus some quantum corrections),
while the difference $\phi_1$ is purely quantum (because it represents the different
histories of the fields in the amplitude and in the complex conjugated amplitude). To
see how the Feynman rules are modified in terms of these new fields, let us start from
\[
\phi_\alpha = \sum_{\epsilon=\pm} \Omega_{\alpha\epsilon}\, \phi_\epsilon \qquad (\alpha = 1, 2) , \tag{15.131}
\]
with $\Omega_{\alpha\epsilon}$ the matrix defined in eq. (15.98). The new propagators after this rotation
have been calculated earlier,
\[
\begin{aligned}
G^0_{21} &= G^0_{++} - G^0_{+-} , \\
G^0_{12} &= G^0_{++} - G^0_{-+} , \\
G^0_{22} &= \tfrac12 \big( G^0_{+-} + G^0_{-+} \big) , \\
G^0_{11} &= 0 .
\end{aligned}
\tag{15.132}
\]

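Note that $G^0_{21}$ is the bare retarded propagator, while $G^0_{12}$ is the bare advanced
propagator. The vertices in the new formalism (here written for a quartic interaction) are
given by
\[
\Lambda_{\alpha\beta\gamma\delta} \equiv -i\lambda\, \Big[ \Omega^{-1}_{+\alpha}\, \Omega^{-1}_{+\beta}\, \Omega^{-1}_{+\gamma}\, \Omega^{-1}_{+\delta}
 - \Omega^{-1}_{-\alpha}\, \Omega^{-1}_{-\beta}\, \Omega^{-1}_{-\gamma}\, \Omega^{-1}_{-\delta} \Big] , \tag{15.133}
\]
where
\[
\Omega^{-1}_{\epsilon\alpha} = \begin{pmatrix} 1/2 & 1 \\ -1/2 & 1 \end{pmatrix}
\qquad \big[ \Omega_{\alpha\epsilon}\, \Omega^{-1}_{\epsilon\beta} = \delta_{\alpha\beta} \big] . \tag{15.134}
\]
More explicitly, we have :
\[
\begin{aligned}
& \Lambda_{1111} = \Lambda_{1122} = \Lambda_{2222} = 0 , \\
& \Lambda_{1222} = -i\lambda , \qquad \Lambda_{1112} = -i\lambda/4 .
\end{aligned}
\tag{15.135}
\]
(The vertices not listed explicitly here are obtained by permutations.) The
rules for an external source in the retarded-advanced basis are :
\[
J_1 = J , \qquad J_2 = 0 . \tag{15.136}
\]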
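The vertex values of eq. (15.135) follow mechanically from eqs. (15.133) and (15.134), and can be reproduced with a few lines of exact arithmetic. The sketch below is an illustration only: it strips off the overall factor $-i$ and sets the coupling to one, and the function and variable names are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

LAMBDA = 1  # stand-in for the coupling; the overall factor -i is stripped off

# Inverse rotation of eq. (15.134): phi_+ = phi_2 + phi_1/2, phi_- = phi_2 - phi_1/2.
# Keys are (epsilon, alpha) with epsilon in {+1, -1} and alpha in {1, 2}.
OMEGA_INV = {(+1, 1): F(1, 2), (+1, 2): F(1),
             (-1, 1): F(-1, 2), (-1, 2): F(1)}

def vertex(*indices):
    """Return i * Lambda_{alpha beta gamma delta} of eq. (15.133), in units of LAMBDA."""
    plus, minus = F(1), F(1)
    for alpha in indices:
        plus *= OMEGA_INV[(+1, alpha)]
        minus *= OMEGA_INV[(-1, alpha)]
    return LAMBDA * (plus - minus)

# All sixteen index assignments; the result is symmetric under permutations.
vertices = {idx: vertex(*idx) for idx in product((1, 2), repeat=4)}
```

Only the index assignments with an odd number of indices equal to 1 give a non-zero entry in `vertices`, which is why $\Lambda_{1111}$, $\Lambda_{1122}$ and $\Lambda_{2222}$ vanish.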

Finally, note that the observable depends only on the field φ2 , i.e. O = O(φ2 ).
Indeed, the fields φ+ and φ− represent the field in the amplitude and in the conjugated
amplitude. Their difference should vanish when a measurement is performed.

15.7.2 First derivative at tree level


First derivative of ln F : Differentiating the generating functional with respect to
$z(\mathbf{x})$ amounts to exhibiting a vertex $O$ at the point $\mathbf{x}$ at the final time (as opposed to
weighting this vertex by $z(\mathbf{x})$ and integrating over $\mathbf{x}$). Furthermore, by considering
the logarithm of the generating functional rather than $F$ itself, we have only diagrams
that are connected to the point $\mathbf{x}$, as shown in this representation:
\[
\frac{\delta \ln F}{\delta z(\mathbf{x})} = \Big[\text{diagram: vertex } O \text{ at the point } \mathbf{x}, \text{ with its legs attached to a gray blob}\Big] , \tag{15.137}
\]
where the gray blob is a sum of graphs constructed with the Feynman rules of
figure 15.8, or their analogue in the retarded-advanced formulation. Therefore, these
graphs still depend implicitly on $z$. Note that this blob does not have to be connected.

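Tree level expression : Without further specifying the content of the blob, the
equation (15.137) is valid to all orders, both in $z$ and in $g$. At lowest order in $g$ (tree
level), a considerable simplification happens because the blob must be a product of
disconnected subgraphs, one for each line attached to the vertex $O(\phi(x))$:
\[
\left. \frac{\delta \ln F}{\delta z(\mathbf{x})} \right|_{\rm tree} = \Big[\text{diagram: vertex } O \text{ at the point } \mathbf{x}, \text{ each leg attached to a separate light-coloured blob}\Big] , \tag{15.138}
\]
where now each of the light-coloured blobs is a connected tree 1-point diagram. In the
retarded-advanced formalism, there are two of these 1-point functions, that we will
denote $\phi_1$ and $\phi_2$. At tree level, they can be defined recursively by the following pair
of coupled integral equations:
\[
\begin{aligned}
\phi_1(x) &= i \int_\Omega d^4y\; G^0_{12}(x,y)\, \frac{\partial \mathcal{L}_{\rm int}(\phi_1,\phi_2)}{\partial \phi_2(y)}
 + \int_{t_f} d^3\mathbf{y}\; G^0_{12}(x,y)\, z(\mathbf{y})\, O'(\phi_2(y)) , \\
\phi_2(x) &= i \int_\Omega d^4y\; \left[ G^0_{21}(x,y)\, \frac{\partial \mathcal{L}_{\rm int}(\phi_1,\phi_2)}{\partial \phi_1(y)}
 + G^0_{22}(x,y)\, \frac{\partial \mathcal{L}_{\rm int}(\phi_1,\phi_2)}{\partial \phi_2(y)} \right]
 + \int_{t_f} d^3\mathbf{y}\; G^0_{22}(x,y)\, z(\mathbf{y})\, O'(\phi_2(y)) .
\end{aligned}
\tag{15.139}
\]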

In these equations, $O'$ is the derivative of the observable with respect to the field, $\Omega$ is
the space-time domain comprised between the initial and final times, and we denote
\[
\mathcal{L}_{\rm int}(\phi_1,\phi_2) \equiv \mathcal{L}_{\rm int}(\phi_2 + \tfrac12 \phi_1) - \mathcal{L}_{\rm int}(\phi_2 - \tfrac12 \phi_1) . \tag{15.140}
\]
For an interaction Lagrangian $-\frac{\lambda}{4!}\phi^4 + J\phi$, this difference reads
\[
\mathcal{L}_{\rm int}(\phi_1,\phi_2) = -\frac{\lambda}{6}\, \phi_2^3\, \phi_1 - \frac{\lambda}{4!}\, \phi_1^3\, \phi_2 + J \phi_1 . \tag{15.141}
\]
In terms of these fields, we have
\[
\left. \frac{\delta \ln F}{\delta z(\mathbf{x})} \right|_{\rm tree} = O(\phi_2(x)) , \tag{15.142}
\]
i.e. simply the observable $O$ evaluated on the field $\phi_2(x)$ (but this field depends on $z$
to all orders, via the boundary terms in eqs. (15.139)).

Classical equations of motion : Using the fact that $G^0_{12}$ and $G^0_{21}$ are Green’s
functions of $\square + m^2$, obeying the following identities
\[
(\square_x + m^2)\, G^0_{12}(x,y) = -i\delta(x-y) , \qquad (\square_x + m^2)\, G^0_{21}(x,y) = -i\delta(x-y) , \tag{15.143}
\]
while $G^0_{22}$ vanishes when acted upon by this operator,
\[
(\square_x + m^2)\, G^0_{22}(x,y) = 0 , \tag{15.144}
\]
we see that $\phi_1$ and $\phi_2$ obey the following classical field equations of motion:
\[
\begin{aligned}
(\square_x + m^2)\, \phi_1(x) &= \frac{\partial \mathcal{L}_{\rm int}(\phi_1,\phi_2)}{\partial \phi_2(x)} , \\
(\square_x + m^2)\, \phi_2(x) &= \frac{\partial \mathcal{L}_{\rm int}(\phi_1,\phi_2)}{\partial \phi_1(x)} .
\end{aligned}
\tag{15.145}
\]
Note that here the point $x$ is located in the “bulk” $\Omega$; this is why the observable
does not enter in these equations of motion. In fact, the observable enters only in
the boundary conditions satisfied by these fields on the hypersurface at $t_f$. For later
reference, let us also rewrite these equations of motion in the specific case of a scalar
field theory with a $\lambda\phi^4/4!$ interaction term and an external source $J$:
\[
\begin{aligned}
\Big[ \square_x + m^2 + \frac{\lambda}{2}\, \phi_2^2 \Big]\, \phi_1 + \frac{\lambda}{4!}\, \phi_1^3 &= 0 , \\
(\square_x + m^2)\, \phi_2 + \frac{\lambda}{6}\, \phi_2^3 + \frac{\lambda}{8}\, \phi_1^2\, \phi_2 &= J .
\end{aligned}
\tag{15.146}
\]
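As a cross-check of the algebra leading from eq. (15.140) to eqs. (15.141) and (15.146), the intermediate steps for the quartic interaction can be written out explicitly. This is a worked sketch, using only the binomial identity $(a+b)^4 - (a-b)^4 = 8a^3 b + 8ab^3$ with $a = \phi_2$ and $b = \phi_1/2$:

```latex
\begin{align*}
\Big(\phi_2+\tfrac12\phi_1\Big)^4 - \Big(\phi_2-\tfrac12\phi_1\Big)^4
 &= 8\,\phi_2^3\,\Big(\tfrac12\phi_1\Big) + 8\,\phi_2\,\Big(\tfrac12\phi_1\Big)^3
  = 4\,\phi_2^3\,\phi_1 + \phi_2\,\phi_1^3 , \\
\mathcal{L}_{\rm int}(\phi_1,\phi_2)
 &= -\frac{\lambda}{4!}\Big[4\,\phi_2^3\,\phi_1 + \phi_2\,\phi_1^3\Big] + J\,\phi_1 , \\
\frac{\partial \mathcal{L}_{\rm int}}{\partial\phi_2}
 &= -\frac{\lambda}{2}\,\phi_2^2\,\phi_1 - \frac{\lambda}{4!}\,\phi_1^3 ,
 \qquad
\frac{\partial \mathcal{L}_{\rm int}}{\partial\phi_1}
 = -\frac{\lambda}{6}\,\phi_2^3 - \frac{\lambda}{8}\,\phi_1^2\,\phi_2 + J .
\end{align*}
```

Inserting the two derivatives into eqs. (15.145) and moving the interaction terms to the left hand side gives eqs. (15.146).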

Boundary conditions : The equations of motion (15.145) are easier to handle
than the integral equations (15.139), but they must be supplemented with boundary
conditions in order to define uniquely the solutions. The standard procedure for
deriving the boundary conditions is to consider the combination $G^0_{12}(x,y)\, (\square_y + m^2)\, \phi_1(y)$,
and let the operator $\square_y + m^2$ act alternately on the right and on the
left,
\[
\begin{aligned}
G^0_{12}(x,y)\; \overset{\rightarrow}{(\square_y + m^2)}\; \phi_1(y) &= G^0_{12}(x,y)\, \frac{\partial \mathcal{L}_{\rm int}(\phi_1,\phi_2)}{\partial \phi_2(y)} , \\
G^0_{12}(x,y)\; \overset{\leftarrow}{(\square_y + m^2)}\; \phi_1(y) &= -i\delta(x-y)\, \phi_1(y) .
\end{aligned}
\tag{15.147}
\]

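By subtracting these equations and integrating over $y \in \Omega$, we obtain
\[
\phi_1(x) = i \int_\Omega d^4y\; G^0_{12}(x,y)\, \frac{\partial \mathcal{L}_{\rm int}(\phi_1,\phi_2)}{\partial \phi_2(y)}
 - i \int_\Omega d^4y\; G^0_{12}(x,y)\, \overset{\leftrightarrow}{\square}_y\, \phi_1(y) . \tag{15.148}
\]
The second term of the right hand side is a total derivative thanks to
\[
A\, \overset{\leftrightarrow}{\square}\, B = \partial_\mu \Big[ A\, \overset{\leftrightarrow}{\partial}{}^\mu\, B \Big] . \tag{15.149}
\]
Therefore, this term can be rewritten as a surface integral extended to the boundary of
the domain $\Omega$. With reasonable assumptions on the spatial localization of the source
$J(x)$ that drives the field, we may disregard the contribution from the boundary at
spatial infinity. The remaining boundaries are at the initial time $t_i$ and final time $t_f$,
\[
\phi_1(x) = i \int_\Omega d^4y\; G^0_{12}(x,y)\, \frac{\partial \mathcal{L}_{\rm int}(\phi_1,\phi_2)}{\partial \phi_2(y)}
 - i \int d^3\mathbf{y}\; \Big[ G^0_{12}(x,y)\, \overset{\leftrightarrow}{\partial}_{y^0}\, \phi_1(y) \Big]_{t_f} . \tag{15.150}
\]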

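Note that the boundary term vanishes at the initial time $t_i$, because $G^0_{12}(x,y)$ is the advanced
propagator, which vanishes when $y^0 < x^0$. Likewise, we obtain the following equation for $\phi_2$:
\[
\begin{aligned}
\phi_2(x) &= i \int_\Omega d^4y\; \left[ G^0_{21}(x,y)\, \frac{\partial \mathcal{L}_{\rm int}(\phi_1,\phi_2)}{\partial \phi_1(y)}
 + G^0_{22}(x,y)\, \frac{\partial \mathcal{L}_{\rm int}(\phi_1,\phi_2)}{\partial \phi_2(y)} \right] \\
 &\quad - i \int d^3\mathbf{y}\; \Big[ G^0_{21}(x,y)\, \overset{\leftrightarrow}{\partial}_{y^0}\, \phi_2(y)
 + G^0_{22}(x,y)\, \overset{\leftrightarrow}{\partial}_{y^0}\, \phi_1(y) \Big]_{t_i}^{t_f} .
\end{aligned}
\tag{15.151}
\]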

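The boundary conditions at $t_i$ and $t_f$ are obtained by comparing eqs. (15.139) and
(15.150-15.151). At the final time $t_f$, the boundary condition is
\[
\phi_1(t_f, \mathbf{x}) = 0 , \qquad \partial_0 \phi_1(t_f, \mathbf{x}) = i\, z(\mathbf{x})\, O'(\phi_2(t_f, \mathbf{x})) . \tag{15.152}
\]
At the initial time $t_i$, we must have
\[
\int d^3\mathbf{y}\; \Big[ G^0_{21}(x,y)\, \overset{\leftrightarrow}{\partial}_{y^0}\, \phi_2(y)
 + G^0_{22}(x,y)\, \overset{\leftrightarrow}{\partial}_{y^0}\, \phi_1(y) \Big]_{y^0 = t_i} = 0 . \tag{15.153}
\]
Some simple manipulations lead to the following equivalent form
\[
\int_{y^0 = t_i} d^3\mathbf{y}\; G^0_{-+}(x,y)\, \overset{\leftrightarrow}{\partial}_{y^0}\, \Big( \phi_2(y) + \tfrac12 \phi_1(y) \Big)
 = \int_{y^0 = t_i} d^3\mathbf{y}\; G^0_{+-}(x,y)\, \overset{\leftrightarrow}{\partial}_{y^0}\, \Big( \phi_2(y) - \tfrac12 \phi_1(y) \Big) = 0 . \tag{15.154}
\]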
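The manipulations leading from eq. (15.153) to eq. (15.154) can be sketched as follows. The only inputs are the expressions (15.132) of $G^0_{21}$ and $G^0_{22}$, and the assumption that $x^0 > t_i$, so that $G^0_{++}(x,y) = G^0_{-+}(x,y)$ on the initial surface according to eqs. (15.128):

```latex
\begin{align*}
0 &= \int_{y^0=t_i} d^3\mathbf{y}\;\Big[ G^0_{21}\,\overset{\leftrightarrow}{\partial}_{y^0}\,\phi_2
      + G^0_{22}\,\overset{\leftrightarrow}{\partial}_{y^0}\,\phi_1 \Big] \\
  &= \int_{y^0=t_i} d^3\mathbf{y}\;\Big[ \big(G^0_{-+}-G^0_{+-}\big)\,\overset{\leftrightarrow}{\partial}_{y^0}\,\phi_2
      + \tfrac12\big(G^0_{+-}+G^0_{-+}\big)\,\overset{\leftrightarrow}{\partial}_{y^0}\,\phi_1 \Big] \\
  &= \int_{y^0=t_i} d^3\mathbf{y}\; G^0_{-+}\,\overset{\leftrightarrow}{\partial}_{y^0}\,\Big(\phi_2+\tfrac12\phi_1\Big)
   - \int_{y^0=t_i} d^3\mathbf{y}\; G^0_{+-}\,\overset{\leftrightarrow}{\partial}_{y^0}\,\Big(\phi_2-\tfrac12\phi_1\Big) .
\end{align*}
```

The first integral depends on $x$ through factors $e^{-ik\cdot x}$ and the second through factors $e^{+ik\cdot x}$. Since these are independent functions of $x$, the two integrals must vanish separately, which is the content of eq. (15.154).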
From the explicit form of the propagators $G^0_{+-}$ and $G^0_{-+}$ (see eqs. (15.128)), we see
that, at the initial time, the combination $\phi_2 + \tfrac12 \phi_1$ has no positive frequency components,
and the combination $\phi_2 - \tfrac12 \phi_1$ has no negative frequency components. An
equivalent way to state this boundary condition is in terms of the Fourier coefficients
of the fields $\phi_{1,2}$. Let us decompose them at the time $t_i$ as follows,
\[
\phi_{1,2}(t_i, \mathbf{x}) \equiv \int \frac{d^3\mathbf{k}}{(2\pi)^3 2E_{\mathbf{k}}}\; \Big[ \tilde\phi^{(+)}_{1,2}(\mathbf{k})\, e^{-ik\cdot x} + \tilde\phi^{(-)}_{1,2}(\mathbf{k})\, e^{+ik\cdot x} \Big] . \tag{15.155}
\]
In terms of the coefficients introduced in this decomposition, the boundary conditions
at the initial time read:
\[
\tilde\phi^{(+)}_2(\mathbf{k}) = -\tfrac12\, \tilde\phi^{(+)}_1(\mathbf{k}) , \qquad
\tilde\phi^{(-)}_2(\mathbf{k}) = \tfrac12\, \tilde\phi^{(-)}_1(\mathbf{k}) . \tag{15.156}
\]

15.7.3 Correlations in the quasi-classical regime

Quasi-classical approximation : It is in principle possible to solve order by order
in the function $z(\mathbf{x})$ the equations of motion (15.145) (or (15.146) for a quartic
interaction term) with the boundary conditions (15.152) and (15.156). At order 0 in $z$,
one easily recovers the result of eq. (15.34) for the 1-point function, which states that
the expectation value of an observable at leading order is given by the solution $\Phi$ of
the classical field equation of motion,
\[
(\square_x + m^2)\, \Phi(x) + U'(\Phi(x)) = j(x) , \tag{15.157}
\]

with a boundary condition at the initial time that depends on the coherent state
in which the system is initialized (Φini ≡ 0 when the initial state is the vacuum).
However, this expansion becomes increasingly cumbersome beyond this simple result.
Instead of pursuing this very complicated expansion in powers of z, we present an
approximation that allows for an all-orders solution of eqs. (15.145), (15.152) and
(15.156). Here, we give only a very sketchy motivation for this approximation, and a
lengthier discussion of its validity will be provided later in this section (after we have
derived expressions for the fields φ1 and φ2 ).
Let us first recall that the fields $\phi_+$ and $\phi_-$ represent, respectively, the space-time
evolution of the field in amplitudes and in conjugate amplitudes. The fact that they are
distinct leads to interferences when squaring amplitudes, a quantum effect controlled
by $\hbar$. Consequently, we may expect the difference $\phi_1 \equiv \phi_+ - \phi_-$ to be small
compared to $\phi_\pm$ themselves, i.e.
\[
\phi_1 \ll \phi_2 . \tag{15.158}
\]
In this situation, that we will call the quasi-classical approximation, we can approximate
the equations of motion (15.145) by keeping only the lowest order in $\phi_1$. This
amounts to keeping only the terms linear in $\phi_1$ in eq. (15.140) (in the case of a $\phi^4$
theory, it means dropping the $\phi_1^3 \phi_2$ term in eq. (15.141)). In this approximation, they
read
\[
\begin{aligned}
\Big[ \square + m^2 - \mathcal{L}''_{\rm int}(\phi_2) \Big]\, \phi_1 &= 0 , \\
(\square + m^2)\, \phi_2 - \mathcal{L}'_{\rm int}(\phi_2) &= 0 ,
\end{aligned}
\tag{15.159}
\]
while the boundary conditions are still given by (15.152) and (15.154). The problem
one must now solve is illustrated in figure 15.9.

[Figure 15.9: Relationship between the fields $\phi_1$ and $\phi_2$ in the quasi-classical
approximation. The figure shows the final-time conditions on $\phi_1$ at $t_f$ (see eq. (15.152))
and the initial-time relations $\tilde\phi^{(+)}_2 = -\tilde\phi^{(+)}_1/2$ and $\tilde\phi^{(-)}_2 = +\tilde\phi^{(-)}_1/2$ at $t_i$.]

The field $\phi_1$ obeys a linear
equation of motion (dressed by the field φ2 , although this aspect is not visible in
the figure), with an advanced boundary condition that depends on φ2 . In parallel,
the field φ2 obeys the classical field equation of motion, with a retarded boundary
condition that depends on φ1 . As we shall show, this tightly constrained problem
admits a formal solution, valid to all orders in the function z, in the form of an implicit
functional equation for the first derivative of ln F[z].

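Formal solution : In order to solve the equation of motion for $\phi_1$, let us introduce
mode functions $a_{\pm\mathbf{k}}(x)$, defined as follows
\[
\begin{aligned}
& \Big[ \square_x + m^2 - \mathcal{L}''_{\rm int}(\phi_2(x)) \Big]\, a_{\pm\mathbf{k}}(x) = 0 , \\
& \lim_{x^0 \to t_i} a_{\pm\mathbf{k}}(x) = e^{\mp ik\cdot x} .
\end{aligned}
\tag{15.160}
\]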
15. S TRONG FIELDS AND SEMI - CLASSICAL METHODS 541

In other words, they form a basis of the linear space of solutions of the equation
obeyed by φ1 , and therefore we may express φ1 as a linear superposition of the mode
functions. At this point, we use a slightly more explicit form of eq. (15.120),
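
$$\int \frac{d^3k}{(2\pi)^3\, 2E_k}
\left[
\begin{pmatrix}
a_{+k}(x)\,\dot a_{-k}(y) & a_{-k}(x)\,a_{+k}(y) \\
\dot a_{+k}(x)\,\dot a_{-k}(y) & \dot a_{-k}(x)\,a_{+k}(y)
\end{pmatrix}
- (+k \leftrightarrow -k)
\right]_{x^0=y^0}
= i\,\delta(x-y)
\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} . \tag{15.161}$$
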
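
As a quick sanity check of eq. (15.161), one can work out the free case. The sketch below assumes that the interaction term is switched off, so that the mode functions are plane waves at all times, a±k(x) = exp(∓ik·x) with k·x = E_k x⁰ − **k**·**x**; this sign convention and the bold notation for spatial vectors are assumptions made here for the illustration, and the dot denotes a time derivative.

```latex
\begin{align*}
\Big[a_{+k}(x)\,\dot a_{-k}(y) - a_{-k}(x)\,\dot a_{+k}(y)\Big]_{x^0=y^0}
  &= iE_k\Big[e^{i\mathbf{k}\cdot(\mathbf{x}-\mathbf{y})}
            + e^{-i\mathbf{k}\cdot(\mathbf{x}-\mathbf{y})}\Big] , \\
\int\frac{d^3k}{(2\pi)^3\,2E_k}\; iE_k
  \Big[e^{i\mathbf{k}\cdot(\mathbf{x}-\mathbf{y})}
     + e^{-i\mathbf{k}\cdot(\mathbf{x}-\mathbf{y})}\Big]
  &= i\,\delta(\mathbf{x}-\mathbf{y}) , \\
\Big[a_{-k}(x)\,a_{+k}(y) - a_{+k}(x)\,a_{-k}(y)\Big]_{x^0=y^0}
  &= e^{-i\mathbf{k}\cdot(\mathbf{x}-\mathbf{y})}
   - e^{+i\mathbf{k}\cdot(\mathbf{x}-\mathbf{y})} .
\end{align*}
```

The last line is odd in **k** and therefore integrates to zero against the even measure d³k/((2π)³ 2E_k), which reproduces the vanishing off-diagonal entry. The two remaining entries follow in the same way.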
Thanks to these identities, it is easy to check that the field φ1 that obeys the required
equation of motion and boundary conditions is given by

$$\phi_1(x) = \int \frac{d^3k}{(2\pi)^3\, 2E_k} \int d^3u\;
\Big[ a_{-k}(x)\,a_{+k}(t_f,u) - a_{+k}(x)\,a_{-k}(t_f,u) \Big]\;
z(u)\, \mathcal{O}'(\phi_2(t_f,u)) . \tag{15.162}$$

The above equation formally defines φ1 (x) in the bulk, x ∈ Ω, in terms of the field
φ2 at the final time. Besides the explicit factor z(u), the right hand side contains also
an implicit z dependence (to all orders in z) in the field φ2 (tf , u) and in the mode
functions a±k (since they evolve on top of the background φ2 ).
Then, using the boundary condition at the initial time, we obtain the following
expression for the field φ2 at ti ,

$$\phi_2(t_i,y) = \frac{1}{2} \int \frac{d^3k}{(2\pi)^3\, 2E_k} \int d^3u\;
\Big[ e^{+ik\cdot y}\, a_{+k}(t_f,u) + e^{-ik\cdot y}\, a_{-k}(t_f,u) \Big]\;
z(u)\, \mathcal{O}'(\phi_2(t_f,u)) . \tag{15.163}$$

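This can be expressed in a more convenient way with eq. (15.51). In terms of the
operators 𝕋±k, we may rewrite φ2 at the initial time as follows:

$$\phi_2(t_i,y) = \frac{1}{2} \int \frac{d^3k}{(2\pi)^3\, 2E_k} \int d^3u\;
z(u)\, \mathcal{O}(\phi_2(t_f,u))\,
\Big[ \overleftarrow{\mathbb{T}}_{+k}\, e^{+ik\cdot y}
    + \overleftarrow{\mathbb{T}}_{-k}\, e^{-ik\cdot y} \Big] , \tag{15.164}$$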

where the arrows indicate on which side the 𝕋±k operators act. This expression gives
the initial condition for the equation of motion obeyed by φ2 in eqs. (15.159), in the
form of a linear superposition of plane waves exp(±ik · y). The next step is to note
that the field φ2(x) that satisfies this equation of motion, and has the initial
condition φ2(ti, y), is formally given by

$$\phi_2(x) = \exp\left[ \int d^3y\, \left\{ \phi_2(t_i,y)\,
\frac{\delta}{\delta\Phi_{\rm ini}(t_i,y)}
+ \big(\partial_0\phi_2(t_i,y)\big)\,
\frac{\delta}{\delta(\partial_0\Phi_{\rm ini}(t_i,y))} \right\} \right] \Phi(x) . \tag{15.165}$$


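This formula follows from the fact that the derivative with respect to the initial field is
the generator of shifts of the initial condition of Φ; its exponential is therefore the
corresponding translation operator. The same formula also applies to any function of
the field, e.g. O(φ2). Replacing φ2(ti, y) by eq. (15.164) inside the exponential
leads to

$$\begin{aligned}
\mathcal{O}(\phi_2(x)) &= \exp\left\{ \frac{1}{2} \int \frac{d^3k}{(2\pi)^3\, 2E_k}
\int d^3u\; z(u)\, \mathcal{O}(\phi_2(t_f,u))\,
\Big[ \overleftarrow{\mathbb{T}}_{+k}\, \overrightarrow{\mathbb{T}}_{-k}
    + \overleftarrow{\mathbb{T}}_{-k}\, \overrightarrow{\mathbb{T}}_{+k} \Big] \right\}
\mathcal{O}(\Phi(x)) \Big|_{\Phi_{\rm ini}\equiv 0} \\
&= \exp\left\{ \int d^3u\; z(u)\, \mathcal{O}(\phi_2(t_f,u))\, \otimes \right\}
\mathcal{O}(\Phi(x)) \Big|_{\Phi_{\rm ini}\equiv 0} ,
\end{aligned} \tag{15.166}$$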

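where the ⊗ operation is defined by

$$A \otimes B \equiv \frac{1}{2} \int \frac{d^3k}{(2\pi)^3\, 2E_k}\;
A\, \Big[ \overleftarrow{\mathbb{T}}_{+k}\, \overrightarrow{\mathbb{T}}_{-k}
        + \overleftarrow{\mathbb{T}}_{-k}\, \overrightarrow{\mathbb{T}}_{+k} \Big]\, B . \tag{15.167}$$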
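
The translation-operator property behind eq. (15.165) has a simple one-variable analogue, exp(a d/dx) f(x) = f(x + a), which can be checked exactly on polynomials. The sketch below is only an illustration under that assumption: the ordinary derivative d/dx stands in for the functional derivative with respect to the initial field, evaluating at x = 0 plays the role of setting Φini ≡ 0, and all function names are hypothetical.

```python
from math import factorial


def derivative(coeffs):
    """Derivative of a polynomial given by its coefficients [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Value of the polynomial at the point x."""
    return sum(c * x**k for k, c in enumerate(coeffs))


def shift_by_exponential(coeffs, a, x=0.0):
    """Apply exp(a d/dx) to the polynomial and evaluate the result at x.

    The series sum_n a**n / n! * f^(n)(x) terminates because a polynomial
    has finitely many non-zero derivatives.
    """
    total, current, n = 0.0, list(coeffs), 0
    while current:
        total += a**n / factorial(n) * evaluate(current, x)
        current = derivative(current)
        n += 1
    return total


if __name__ == "__main__":
    f = [1.0, -2.0, 0.0, 3.0]          # f(x) = 1 - 2x + 3x^3
    a = 0.7
    print(shift_by_exponential(f, a))  # exp(a d/dx) f evaluated at x = 0
    print(evaluate(f, a))              # f(0 + a), for comparison
```

The two printed numbers agree up to rounding: the exponential of the derivative, acting at the "trivial" point x = 0, reproduces the function at the shifted point, in the same way that eq. (15.165) rebuilds φ2(x) from a field Φ(x) evaluated at Φini ≡ 0.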

Setting x0 = tf and denoting

D[x1 ; z] ≡ O(φ2 (tf , x1 )) (15.168)

the first derivative of ln F, we see that it obeys the following recursive formula

$$D[x_1;z] = \exp\left\{ \int d^3u\; z(u)\, D[u;z]\, \otimes \right\}
\mathcal{O}(\Phi(t_f,x_1)) \Big|_{\Phi_{\rm ini}\equiv 0} . \tag{15.169}$$

Realization of the quasi-classical approximation: Let us now return to the condition
φ1 ≪ φ2, which was used in the derivation of eq. (15.169), in order to see a
posteriori when it is satisfied. To that effect, we can use eq. (15.162) for φ1. For φ2,
the initial condition at ti is given by eq. (15.164). For the sake of this discussion, it is
sufficient to use a linearized solution for φ2 in the bulk, which reads

$$\phi_2(x) \underset{\rm lin}{=} \frac{1}{2} \int \frac{d^3k}{(2\pi)^3\, 2E_k} \int d^3u\;
\Big[ a_{-k}(x)\,a_{+k}(t_f,u) + a_{+k}(x)\,a_{-k}(t_f,u) \Big]\;
z(u)\, \mathcal{O}'(\phi_2(t_f,u)) . \tag{15.170}$$

First of all, a comparison between eqs. (15.162) and (15.170) indicates that φ1 and
φ2 have the same order in the coupling constant g, since they are made of the same
building blocks (the only differences are the sign between the two terms of the integrand
and an irrelevant overall factor 1/2).
However, a hierarchy between φ1 and φ2 arises dynamically when the classical
solutions of the field equation of motion (15.157) are unstable. Such instabilities are
fairly generic in several quantum field theories; in particular the scalar field theory
with a φ⁴ coupling that we are using as an example is known to have a parametric
resonance. Since the mode functions a±k are linearized perturbations on top of the
classical field φ2 , an instability of the classical solution φ2 is equivalent to the fact
that some of the mode functions grow exponentially with time, as exp(µ(x0 − ti ))
(where µ is the Lyapunov exponent). Thus, since eq. (15.170) is bilinear in the mode
functions, we expect that

$$\phi_2(x) \underset{\rm lin}{\sim} e^{\mu(x^0 + t_f - 2t_i)} . \tag{15.171}$$

Estimating the magnitude of φ1 requires more care. Indeed, from eqs. (15.161),
antisymmetric combinations of the mode functions at equal times remain of order 1
even if individual mode functions grow exponentially with time. Thus, at the final
time, we have

$$\phi_1(t_f,x) \sim 1 \quad\text{and}\quad
\frac{\phi_2(t_f,x)}{\phi_1(t_f,x)} \sim e^{2\mu(t_f-t_i)} \gg 1 , \tag{15.172}$$

for sufficiently large tf − ti.
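
The statement that antisymmetric combinations stay of order 1 while individual mode functions grow exponentially can be illustrated on a toy model. The sketch below assumes a single unstable mode obeying a″ = µ² a, which stands in for the full mode equation (15.160), with the two independent solutions cosh(µt) and sinh(µt)/µ. The function names are hypothetical.

```python
import math


def toy_modes(mu, t):
    """Two independent solutions of a'' = mu**2 * a, with their time derivatives."""
    a1, da1 = math.cosh(mu * t), mu * math.sinh(mu * t)
    a2, da2 = math.sinh(mu * t) / mu, math.cosh(mu * t)
    return a1, da1, a2, da2


def antisymmetric_combination(mu, t):
    """Wronskian a1*da2 - a2*da1, conserved by the equation of motion (equal to 1 here)."""
    a1, da1, a2, da2 = toy_modes(mu, t)
    return a1 * da2 - a2 * da1


def symmetric_combination(mu, t):
    """a1*da2 + a2*da1 = cosh(2*mu*t), which grows like exp(2*mu*t)."""
    a1, da1, a2, da2 = toy_modes(mu, t)
    return a1 * da2 + a2 * da1


if __name__ == "__main__":
    mu = 1.0
    for t in (0.0, 1.0, 2.0, 3.0, 4.0, 5.0):
        print(t, antisymmetric_combination(mu, t), symmetric_combination(mu, t))
```

The first column of results stays at 1 while the second grows exponentially, which is the toy counterpart of the hierarchy between the antisymmetric combination in eq. (15.162) and the symmetric one in eq. (15.170). The time range is kept short on purpose, because the cancellation between two exponentially large terms loses floating-point accuracy at larger µt.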

In order to estimate the ratio φ2 /φ1 at intermediate times, one may use the
following reasoning. The antisymmetric combination of mode functions that enters in
eq. (15.162) is the advanced propagator GA in the background φ2 . This advanced
propagator may also be expressed in terms of a different set of mode functions b±k
defined to be plane waves at the final time tf ,

$$\Big[\square_x + m^2 - \mathcal{L}_{\rm int}''(\phi_2(x))\Big]\, b_{\pm k}(x) = 0 , \qquad
\lim_{x^0\to t_f} b_{\pm k}(x) = e^{\mp i k\cdot x} . \tag{15.173}$$

In terms of these alternate mode functions, we also have


$$\phi_1(x) = \int \frac{d^3k}{(2\pi)^3\, 2E_k} \int d^3u\;
\Big[ b_{-k}(x)\,b_{+k}(t_f,u) - b_{+k}(x)\,b_{-k}(t_f,u) \Big]\;
z(u)\, \mathcal{O}'(\phi_2(t_f,u)) . \tag{15.174}$$

In the presence of instabilities, these backward evolving mode functions grow when x⁰
decreases away from tf, as exp(µ(tf − x⁰)) (in this sketchy argument, the Lyapunov
exponent µ is assumed to be the same for the forward and backward mode
functions). This implies

$$\phi_1(x) \sim e^{\mu(t_f - x^0)} , \tag{15.175}$$

and the following magnitude for the ratio φ2/φ1 at intermediate times:

$$\frac{\phi_2(x)}{\phi_1(x)} \sim \frac{e^{\mu(x^0 + t_f - 2t_i)}}{e^{\mu(t_f - x^0)}}
\sim e^{2\mu(x^0 - t_i)} . \tag{15.176}$$

Thus, with instabilities and non-zero Lyapunov exponents, the quasi-classical approx-
imation is generically satisfied thanks to the exponential growth of perturbations over
the background.
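
To get a feeling for the numbers in eqs. (15.171), (15.175) and (15.176), the following sketch evaluates the three estimates for sample values. The values of µ, ti, tf and x⁰ are arbitrary illustrative choices, prefactors of order 1 are dropped as in the text, and the function names are hypothetical.

```python
import math


def phi2_estimate(mu, x0, ti, tf):
    """Growth of the linearized phi_2, eq. (15.171): exp(mu * (x0 + tf - 2*ti))."""
    return math.exp(mu * (x0 + tf - 2.0 * ti))


def phi1_estimate(mu, x0, tf):
    """Growth of phi_1 when evolving backwards from tf, eq. (15.175)."""
    return math.exp(mu * (tf - x0))


def ratio_estimate(mu, x0, ti):
    """Ratio phi_2 / phi_1 at intermediate times, eq. (15.176)."""
    return math.exp(2.0 * mu * (x0 - ti))


if __name__ == "__main__":
    mu, ti, tf = 1.0, 0.0, 10.0   # illustrative values only
    for x0 in (0.0, 2.5, 5.0, 7.5, 10.0):
        ratio = phi2_estimate(mu, x0, ti, tf) / phi1_estimate(mu, x0, tf)
        print(x0, ratio, ratio_estimate(mu, x0, ti))
```

With these inputs the ratio starts at order 1 at x⁰ = ti and grows monotonically up to exp(2µ(tf − ti)) at the final time, so the condition φ1 ≪ φ2 is best satisfied late in the evolution.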

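Expansion of eq. (15.169) in powers of z: Although eq. (15.169) cannot be solved
explicitly, it is fairly easy to obtain a diagrammatic representation of its solution. For
this, let us introduce the following graphical notations:

$$\begin{aligned}
\big[\text{node labeled } i\big] &\equiv \mathcal{O}(\Phi(t_f, x_i)) , \\
\big[\text{unlabeled node}\big] &\equiv \int d^3u\; z(u)\, \mathcal{O}(\Phi(t_f, u)) , \\
\big[\text{link between } A \text{ and } B\big] &\equiv A \otimes B ,
\end{aligned}$$

in terms of which the functional equation obeyed by D[x1; z] reads:

$$D[x_1;z] = \exp\left\{ \int d^3u\; z(u)\, D[u;z]\; \big[\text{link}\big] \right\}
\big[\text{node } 1\big] \Big|_{\Phi_{\rm ini}\equiv 0} .$$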

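At order 0 in z, we just need to set z ≡ 0 inside the exponential, to obtain

$$D^{(0)}[x_1;z] = \big[\text{node 1 alone}\big] . \tag{15.177}$$

Then, we proceed recursively. We insert the 0-th order result in the exponential, and
expand to order 1 in z, leading to the following result at order 1:

$$D^{(1)}[x_1;z] = \big[\text{node 1 linked to one unlabeled node}\big] . \tag{15.178}$$

The next two iterations give:

$$D^{(2)}[x_1;z] = \big[\text{chain: node 1, then two unlabeled nodes}\big]
+ \frac{1}{2!}\, \big[\text{node 1 linked to two unlabeled nodes}\big] , \tag{15.179}$$

and

$$\begin{aligned}
D^{(3)}[x_1;z] ={}& \big[\text{chain: node 1, then three unlabeled nodes}\big]
+ \big[\text{node 1 linked to two unlabeled nodes, one of which carries a third}\big] \\
&+ \frac{1}{2!}\, \big[\text{node 1 linked to an unlabeled node that carries two more}\big]
+ \frac{1}{3!}\, \big[\text{node 1 linked to three unlabeled nodes}\big] .
\end{aligned} \tag{15.180}$$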
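
The order-by-order iteration above can be mimicked by a toy version of eq. (15.169) that only tracks the combinatorial weights. The sketch below assumes that the ⊗ product is replaced by ordinary multiplication and that every node contributes a factor 1, so that the recursion becomes D(z) = exp(z D(z)) for an ordinary power series; this simplification and the function names are assumptions made for the illustration.

```python
from fractions import Fraction
from math import factorial


def multiply(a, b, order):
    """Product of two power series (coefficient lists), truncated at z**order."""
    c = [Fraction(0)] * (order + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= order:
                c[i + j] += ai * bj
    return c


def exponential(a, order):
    """exp(a) for a series with a[0] == 0, truncated at z**order."""
    result = [Fraction(1)] + [Fraction(0)] * order
    power = list(result)                       # a**0
    for n in range(1, order + 1):
        power = multiply(power, a, order)      # a**n starts at order z**n
        for k in range(order + 1):
            result[k] += power[k] / factorial(n)
    return result


def solve_toy_recursion(order):
    """Iterate D -> exp(z * D) from D = 1; each pass fixes one more order in z."""
    d = [Fraction(1)] + [Fraction(0)] * order
    for _ in range(order):
        z_times_d = [Fraction(0)] + d[:order]
        d = exponential(z_times_d, order)
    return d


if __name__ == "__main__":
    for n, coefficient in enumerate(solve_toy_recursion(5)):
        print(n, coefficient, Fraction((n + 1) ** (n - 1) if n else 1, factorial(n)))
```

In this toy model the coefficient of zⁿ is the sum of the weights of all the trees of order n: 1 at orders 0 and 1, 1 + 1/2! at order 2, and 1 + 1 + 1/2! + 1/3! at order 3. The second printed column compares these values with the closed form (n+1)^(n−1)/n!.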

These examples generalize to all orders in z: the functional D[x1 ; z] can be represented
as the sum of all the rooted trees (the root being the node carrying the fixed point x1 )
weighted by the corresponding symmetry factor 1/S(T ):

$$\frac{\delta \ln F[z]}{\delta z(x_1)} = D[x_1;z]
= \sum_{\text{rooted trees } T} \frac{1}{S(T)}\;
\big[\text{tree } T \text{ with root node } 1\big] . \tag{15.181}$$
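
The symmetry factors in eq. (15.181) can be generated mechanically. The sketch below enumerates unlabeled rooted trees and computes S(T) as the number of permutations of identical subtrees that leave the tree unchanged. The encoding of a tree as the sorted tuple of its child subtrees, and the function names, are choices made for this illustration. The final check uses the fact that a rooted tree with n nodes and a fixed root label admits (n−1)!/S(T) distinct labelings of its other nodes.

```python
from collections import Counter
from fractions import Fraction
from functools import lru_cache
from math import factorial


@lru_cache(maxsize=None)
def forests(m):
    """All multisets of unlabeled rooted trees with m nodes in total (sorted tuples)."""
    if m == 0:
        return frozenset({()})
    found = set()
    for k in range(1, m + 1):
        for tree in rooted_trees(k):
            for rest in forests(m - k):
                found.add(tuple(sorted(rest + (tree,))))
    return frozenset(found)


def rooted_trees(n):
    """Unlabeled rooted trees with n nodes; a tree is the tuple of its child subtrees."""
    return forests(n - 1)


def symmetry_factor(tree):
    """Number of permutations of identical subtrees that leave the tree invariant."""
    s = 1
    for child, multiplicity in Counter(tree).items():
        s *= factorial(multiplicity) * symmetry_factor(child) ** multiplicity
    return s


def weights(n):
    """Sorted list of the weights 1/S(T) for all rooted trees with n nodes."""
    return sorted(Fraction(1, symmetry_factor(t)) for t in rooted_trees(n))


if __name__ == "__main__":
    for n in range(1, 7):
        labelings = sum(factorial(n - 1) // symmetry_factor(t) for t in rooted_trees(n))
        print(n, len(rooted_trees(n)), [str(w) for w in weights(n)], labelings)
```

For n = 4 nodes (order 3 in z) the weights are 1, 1, 1/2 and 1/6, as in eq. (15.180), and the number of labelings adds up to 16 = 4², in agreement with the counting of labeled trees quoted after eq. (15.182).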

Correlation functions: The n-point correlation function is obtained by differentiating
this expression n − 1 times, with respect to z(x2), · · · , z(xn), and by setting
z ≡ 0 afterwards. This selects all the trees with n distinct labeled nodes¹⁰ (including
the node at x1). Moreover, since derivatives commute, these successive
differentiations eliminate the symmetry factors, leading to

$$\left. \frac{\delta^n \ln F[z]}{\delta z(x_1) \cdots \delta z(x_n)} \right|_{z\equiv 0}
= C_{\{1\cdots n\}}
= \sum_{\text{trees with } n \text{ labeled nodes}}
\big[\text{tree with nodes } 1, 2, \ldots, n\big] . \tag{15.182}$$

The number of trees contributing to this sum is equal to n^(n−2) (Cayley's formula).
The equation (15.182) tells us that, at tree level in the quasi-classical regime, all the
n-point correlation functions are entirely determined by the functional dependence of
the solution of the classical field equation of motion with respect to its initial condition.
Moreover, this formula provides a way to construct explicitly the correlation functions
in terms of functional derivatives with respect to the initial field.
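
The counting quoted above (Cayley's formula) can be checked by brute force for small n. The sketch below builds every labeled tree from its Prüfer sequence, a standard bijection between sequences of n − 2 labels and labeled trees on n nodes; it is offered as an illustration of the counting only, and the function names are hypothetical.

```python
import heapq
from itertools import product


def prufer_to_tree(sequence, n):
    """Decode a Pruefer sequence (labels 1..n, length n-2) into a set of edges."""
    degree = [1] * (n + 1)
    for v in sequence:
        degree[v] += 1
    leaves = [v for v in range(1, n + 1) if degree[v] == 1]
    heapq.heapify(leaves)
    edges = []
    for v in sequence:
        leaf = heapq.heappop(leaves)          # smallest remaining leaf
        edges.append((min(leaf, v), max(leaf, v)))
        degree[v] -= 1
        if degree[v] == 1:                    # v has become a leaf
            heapq.heappush(leaves, v)
    u, w = heapq.heappop(leaves), heapq.heappop(leaves)
    edges.append((min(u, w), max(u, w)))      # the last two leaves are joined
    return frozenset(edges)


def labeled_trees(n):
    """Set of all trees on the labeled nodes 1..n (requires n >= 2)."""
    if n < 2:
        raise ValueError("need at least two nodes")
    return {prufer_to_tree(seq, n) for seq in product(range(1, n + 1), repeat=n - 2)}


if __name__ == "__main__":
    for n in range(2, 7):
        print(n, len(labeled_trees(n)), n ** (n - 2))
    print(sorted(sorted(tree) for tree in labeled_trees(3)))
```

For n = 3 this lists the three chains in which node 1, 2 or 3 sits in the middle, which are the three terms of the 3-point function in eq. (15.182).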

In the quasi-classical approximation, the final state correlations are entirely due to
quantum fluctuations in the initial state, which are encoded in the function
$G^0_{22}(x,y)$. If the initial state is the vacuum, it reads

$$G^0_{22}(x,y) = \int \frac{d^3k}{(2\pi)^3\, 2E_k}\; e^{ik\cdot(x-y)} . \tag{15.183}$$

This function is dominated by distances |x − y| smaller than the
Compton wavelength m⁻¹. Thus, in the tree representation of eq. (15.182), a link
between the points xi and xj is nonzero provided that the past light-cones with apexes
xi and xj overlap at the initial time (or at least approach each other within distances
≲ m⁻¹), as illustrated in figure 15.10. A more thorough analysis would indicate
that eq. (15.182) is exact at tree level for the 1-point and 2-point correlations, but is
incomplete (even at tree level) beyond 2 points. The corrections to this formula are
nevertheless suppressed if the condition φ1 ≪ φ2 holds.
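
The short range of the vacuum correlator can be made quantitative. The sketch below relies on the standard closed form of this integral at equal times, m K₁(m r)/(4π² r) with r = |x − y| and K₁ a modified Bessel function; this closed form is not stated in the text and is quoted here as an assumption. K₁ is evaluated from its integral representation with a plain trapezoid rule, and the function names are hypothetical.

```python
import math


def bessel_k1(z, t_max=12.0, steps=12000):
    """K_1(z) from the integral of exp(-z*cosh(t)) * cosh(t) over t >= 0 (trapezoid rule)."""
    h = t_max / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(-z * math.cosh(t)) * math.cosh(t)
    return total * h


def vacuum_correlator(r, m=1.0):
    """Equal-time vacuum correlator at spatial distance r > 0 (assumed closed form)."""
    return m * bessel_k1(m * r) / (4.0 * math.pi**2 * r)


if __name__ == "__main__":
    m = 1.0
    reference = vacuum_correlator(1.0 / m, m)
    for r_in_units_of_compton in (0.5, 1.0, 2.0, 5.0, 10.0):
        r = r_in_units_of_compton / m
        print(r_in_units_of_compton, vacuum_correlator(r, m) / reference)
```

The printed ratios fall off roughly like exp(−m r) beyond one Compton wavelength, which is why a link between two points is negligible unless their past light-cones come within a distance of order m⁻¹ of each other at the initial time.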

¹⁰ Thus, permuting nodes in general yields a different tree.


[Figure 15.10: Causal structure of the 3-point correlation function in the
quasi-classical regime. The points 1, 2 and 3 sit at the final time tf; the initial
time is ti.]
