
Kinetic Theory of Self-Gravitating Systems: James Binney

This document provides an introduction to the kinetic theory of self-gravitating systems. Key points include: 1) Self-gravitating systems like galaxies differ from electrostatic plasmas in that gravity endows all particles with the same sign of charge, resulting in globally inhomogeneous systems rather than localized inhomogeneities. 2) Applying the virial theorem to a self-gravitating system of N particles relates the system's mass to its spatial extent and velocity dispersion, allowing estimation of typical particle speeds in galaxies. 3) A self-gravitating system cannot truly achieve thermal equilibrium due to its negative heat capacity and the finite escape speed from its gravitational potential, which would cause particles to evaporate over time


Kinetic theory of self-gravitating systems

James Binney

Contents

1 Introduction
1.1 What differentiates stellar and electrostatic plasmas?
1.2 Virial theorem
1.3 Thermal equilibrium?
1.4 Escape
1.5 Fluctuations

2 Mean-field model
2.1 Angle-action variables
2.1.1 Adiabatic invariance
2.1.2 Hamilton-Jacobi equation
2.1.3 Choice of actions
2.2 Self-consistent, mean-field model
2.3 Biorthogonal potential-density pairs

3 Perturbing the DF

4 Evolution of the mean-field model
4.1 Dynamics of fluctuations

5 Diffusion in a galactic disc

Appendices
A Rewriting D1
1
Introduction

The aim of these 8 lectures is to show how the ideas introduced earlier in the second section
of the course in connection with electrostatic plasmas can be extended to stellar systems, with
sometimes surprising results. This is rather a small, even niche, corner of stellar dynamics, but
an intriguing one that fits neatly with the remainder of this course. A general introduction to
stellar dynamics can be found in Galactic Dynamics, Binney & Tremaine, PUP (2008) – hereafter
BT08 – while rather a different perspective is given in a review, arXiv:1309.2794 (NewAR, 57, 29).
Any fan of recorded lectures can try the lectures at http://iactalks.iac.es/talks/view/329.

1.1 What differentiates stellar and electrostatic plasmas?


The key differences between a gravitational plasma and an electrostatic one are:
• Gravity, being an even-spin theory (e.g. A. Zee, Quantum Field Theory in a Nutshell, Princeton University Press), endows all particles with the same sign of charge.
• Consequently, self-gravitating systems are globally inhomogeneous rather than inhomogeneous only on scales smaller than the Debye length $\lambda_D$.
In a solid or liquid the forces on a particle are dominated by the contributions of near neighbours. In an electrostatic plasma the force is dominated by contributions from particles less distant than $\lambda_D$. A simple argument shows that in a self-gravitating system the forces are dominated by remote particles.
Consider the gravitational force on one particle in a system of $N \gg 1$ particles. On scales much bigger than the mean inter-particle distance $\lambda$ we can characterise the system by a mean mass-density $\rho(\mathbf{x})$. We place the origin of the coordinates at our particle’s location and consider the force due to mass near $\mathbf{x}$. The mass in the cell a distance $|\mathbf{x}|$ away and subtending solid angle $d^2\Omega$ is $\delta M = \rho(\mathbf{x})|\mathbf{x}|^2\,d|\mathbf{x}|\,d^2\Omega$, so by the inverse-square law the force per unit mass on our particle from this cell is $\delta F = G\rho(\mathbf{x})\,d|\mathbf{x}|\,d^2\Omega$. Crucially the factor $|\mathbf{x}|^2$ has disappeared, so as we increase $|\mathbf{x}|$, the contributions to $F$ simply track $\rho(\mathbf{x})$ (Figure 1.1).
For every cell at $\mathbf{x}$ there is a corresponding one at $-\mathbf{x}$, which pulls in the opposite direction. Hence the contributions of opposite cells nearly cancel until $|\mathbf{x}|$ becomes comparable to the scale on which $\rho(\mathbf{x})$ varies. That is, the force on our particle is dominated by distant particles, not by near neighbours.
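
The disappearance of the factor $|\mathbf{x}|^2$ can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming a uniform density and arbitrary illustrative values for the cell thickness and solid angle; the function name `cell_force_per_unit_mass` is hypothetical and not part of the lectures.

```python
G = 6.674e-11  # gravitational constant in SI units

def cell_force_per_unit_mass(rho, r, dr, d_omega):
    """Attraction per unit mass from a thin cone cell at distance r."""
    mass = rho * r**2 * dr * d_omega   # delta M = rho r^2 dr dOmega
    return G * mass / r**2             # inverse-square law

# Assumed, illustrative values: uniform density, fixed thickness and solid angle.
rho, dr, d_omega = 1.0e-20, 1.0e16, 0.01
forces = [cell_force_per_unit_mass(rho, r, dr, d_omega) for r in (1e17, 1e18, 1e19)]
print(forces)  # the three values agree to rounding error: the distance drops out
```

With a non-uniform density the same function simply returns values proportional to the local density, which is the sense in which the contributions track $\rho(\mathbf{x})$.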

1.2 Virial theorem


We first obtain a result that links the mass of a self-gravitating system to its extents in real and
velocity space. We have N particles of mass m moving in the mutually generated gravitational
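field. We dot the equation of motion of one particle ($\alpha$)
$$m\ddot{\mathbf{x}}_\alpha = Gm^2 \sum_{\beta\neq\alpha} \frac{\mathbf{x}_\beta - \mathbf{x}_\alpha}{|\mathbf{x}_\alpha - \mathbf{x}_\beta|^3} \tag{1.1}$$
with $\mathbf{x}_\alpha$ and sum over $\alpha$:
$$m\sum_\alpha \ddot{\mathbf{x}}_\alpha\cdot\mathbf{x}_\alpha = Gm^2 \sum_{\alpha\neq\beta} \frac{(\mathbf{x}_\beta - \mathbf{x}_\alpha)\cdot\mathbf{x}_\alpha}{|\mathbf{x}_\alpha - \mathbf{x}_\beta|^3}. \tag{1.2}$$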
Figure 1.1 Each shaded portion of the cone contributes equally to the force on the star at its apex.

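Adding to this the equation obtained by $\alpha \leftrightarrow \beta$ we get
$$m\sum_\alpha \ddot{\mathbf{x}}_\alpha\cdot\mathbf{x}_\alpha + m\sum_\beta \ddot{\mathbf{x}}_\beta\cdot\mathbf{x}_\beta = Gm^2 \sum_{\alpha\neq\beta} \frac{(\mathbf{x}_\beta - \mathbf{x}_\alpha)\cdot(\mathbf{x}_\alpha - \mathbf{x}_\beta)}{|\mathbf{x}_\alpha - \mathbf{x}_\beta|^3}. \tag{1.3}$$
Simplifying each side
$$m\sum_\alpha \left(\frac{d^2|\mathbf{x}_\alpha|^2}{dt^2} - 2|\dot{\mathbf{x}}_\alpha|^2\right) = -Gm^2 \sum_{\alpha\neq\beta} \frac{1}{|\mathbf{x}_\alpha - \mathbf{x}_\beta|}. \tag{1.4}$$
If the derivative were significantly non-zero, the cluster would be secularly expanding or contracting. So we argue this term is negligible and conclude that $2K + W = 0$, where
$$K = \tfrac12 m \sum_{\alpha=1}^{N} |\dot{\mathbf{x}}_\alpha|^2 ; \qquad W = -\tfrac12 Gm^2 \sum_{\substack{\alpha,\beta=1\\ \alpha\neq\beta}}^{N} \frac{1}{|\mathbf{x}_\alpha - \mathbf{x}_\beta|}. \tag{1.5}$$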

$K$ is the cluster’s kinetic energy while $W$ is its potential energy. So we have an $N$-particle version of the 1-particle virial theorem, which should be familiar from quantum mechanics: that if the P.E. $V(\mathbf{x})$ scales as $V(s\mathbf{x}) = s^\alpha V(\mathbf{x})$, then $2\langle K\rangle = \alpha\langle V\rangle$, where $K = p^2/2m$ is the kinetic-energy operator.
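Our Galaxy is thought to have a (largely dark) mass $M \sim 10^{12}\,M_\odot$ distributed through a volume of characteristic radius $R \sim 100$ kpc. Taking $|\mathbf{x}_\alpha - \mathbf{x}_\beta| \sim R$ and summing $N^2$ terms $1/R$ we estimate
$$W \sim -\frac{GM^2}{2R}. \tag{1.6}$$
If the typical speed of a dark particle is $\sigma$, $2K \sim M\sigma^2$, so the virial theorem yields
$$\sigma^2 \sim \frac{GM}{2R}. \tag{1.7}$$
Putting in numbers (with $M_\odot \simeq 2.0\times10^{30}$ kg and 1 kpc $\simeq 3.1\times10^{19}$ m),
$$\sigma = \sqrt{\frac{6.7\times10^{-11} \times 10^{12} \times 2.0\times10^{30}}{2 \times 100 \times 3.1\times10^{19}}} \simeq 1.5\times10^{5}\ \mathrm{m\,s^{-1}} \sim 150\ \mathrm{km\,s^{-1}}$$
is a typical random velocity of a dark particle.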
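
The estimate (1.7) is easy to reproduce. The sketch below assumes standard SI values for $G$, the solar mass and the kiloparsec (these constants are supplied here, not taken from the lectures), and the helper name `virial_sigma` is hypothetical.

```python
import math

G = 6.674e-11        # m^3 kg^-1 s^-2
M_SUN = 1.989e30     # kg
KPC = 3.086e19       # m

def virial_sigma(mass_kg, radius_m):
    """Typical speed from sigma^2 ~ G M / (2 R), eq. (1.7)."""
    return math.sqrt(G * mass_kg / (2.0 * radius_m))

sigma = virial_sigma(1e12 * M_SUN, 100 * KPC)
print(f"sigma ~ {sigma / 1e3:.0f} km/s")
```

Since (1.6) and (1.7) are order-of-magnitude statements, only the scale of the result (of order $10^2$ km/s) is meaningful.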

1.3 Thermal equilibrium?


It’s natural to imagine that the Galaxy comprises a gravitationally confined system of dark particles in thermal equilibrium. It’s a monatomic gas, so its temperature is given by $\tfrac32 N k_B T = K$ and its internal energy $U = K + W = -K$ is negative. The heat capacity of the system
$$C = \frac{\partial U}{\partial T} = -\frac{\partial K}{\partial T} = -\tfrac32 N k_B \tag{1.8}$$
is also negative. A negative specific heat is highly problematic because it makes it impossible for
the system to come into thermal equilibrium with a conventional heat bath: suppose the system
and heat bath were in thermal equilibrium. Then a fluctuation could shift δU of energy from
the system to the heat bath. The system would heat up by δT = |δU/C| and the heat bath, if
it was large, would heat by a smaller amount. Since the system would now be hotter than the
bath, more heat would flow from the system to the bath and the system would get hotter and
hotter, apparently without limit.
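
The runaway can be illustrated with a toy model. This is only a sketch under stated assumptions: the system is given $C = -\tfrac32$ in units where $N k_B = 1$, the bath a large positive heat capacity, and heat is assumed to flow from hot to cold at a rate proportional to the temperature difference. The conductance, the size of the initial fluctuation and the function name `simulate_runaway` are all hypothetical.

```python
def simulate_runaway(t_system=1.0, t_bath=1.0, c_system=-1.5, c_bath=100.0,
                     kick=1e-3, conductance=0.1, dt=0.01, steps=2000):
    """Evolve two temperatures coupled by heat flow from hot to cold.

    c_system < 0 mimics C = -(3/2) N k_B (units with N k_B = 1);
    conductance, kick and c_bath are assumed, illustrative values.
    """
    # A fluctuation moves `kick` of energy from the system to the bath.
    t_system += -kick / c_system      # the system loses energy, yet heats up
    t_bath += kick / c_bath
    history = [t_system - t_bath]
    for _ in range(steps):
        # Heat leaves the system whenever it is hotter than the bath.
        heat = conductance * (t_system - t_bath) * dt
        t_system += -heat / c_system
        t_bath += heat / c_bath
        history.append(t_system - t_bath)
    return history

gap = simulate_runaway()
print(gap[0], gap[-1])  # the temperature gap grows instead of decaying
```

Giving `c_system` a positive value in the same function makes the gap decay instead, which is the conventional approach to equilibrium.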

1.4 Escape
There’s another conceptual problem associated with the system attaining thermal equilibrium: gravity confines particles only up to a finite escape speed $v_{\rm esc}(\mathbf{x}) = \sqrt{-2\Phi(\mathbf{x})}$, where $\Phi$ is the gravitational potential. Hence in thermal equilibrium the df $f(\mathbf{x}, \mathbf{v})$ would have to vanish for $v > v_{\rm esc}$, so could not be a Maxwellian since the latter is non-zero all the way to $\infty$. Yet surely the processes that maintain thermal equilibrium will scatter stars to $v > v_{\rm esc}$ and such stars will then escape (‘evaporate’) from the system. We can assess the scale of this issue by computing the mean-square value of $v_{\rm esc}$:
$$V_{\rm esc}^2 = \frac{1}{M}\int d^3\mathbf{x}\,\rho\,v_{\rm esc}^2 = -\frac{2}{M}\int d^3\mathbf{x}\,\rho\Phi = -4\frac{W}{M} = 8\frac{K}{M} = 4\sigma^2. \tag{1.9}$$
So $V_{\rm esc} = 2\sigma$, i.e. twice the rms speed, which isn’t far into the high-velocity tail. The fraction of a Maxwellian distribution with one-dimensional dispersion $\sigma/\sqrt{3}$ that lies above $2\sigma$ is
$$f_{\rm esc} = \frac{\int_{2\sigma}^{\infty} dv\, v^2 \exp(-3v^2/2\sigma^2)}{\int_{0}^{\infty} dv\, v^2 \exp(-3v^2/2\sigma^2)} = \frac{\int_{\sqrt{6}}^{\infty} dx\, x^2 e^{-x^2}}{\int_{0}^{\infty} dx\, x^2 e^{-x^2}} \sim \frac{1}{140}. \tag{1.10}$$
Since the velocities of stars will be reshuffled into a Maxwellian once every relaxation time, each such time we expect $\sim M/140$ of the mass to evaporate.
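
The ratio of integrals in (1.10) has a closed form in terms of the complementary error function, obtained by integrating $x^2 e^{-x^2}$ by parts. The sketch below evaluates it with the standard library only; the function name `maxwellian_fraction_above` is hypothetical.

```python
import math

def maxwellian_fraction_above(speed_over_sigma):
    """Fraction of a Maxwellian with rms speed sigma lying above the given speed.

    With x = v * sqrt(3/2) / sigma, integration by parts gives
    int_{x0}^inf x^2 exp(-x^2) dx = x0 exp(-x0^2)/2 + sqrt(pi)/4 * erfc(x0),
    and the full integral from 0 is sqrt(pi)/4.
    """
    x0 = speed_over_sigma * math.sqrt(1.5)
    return math.erfc(x0) + 2.0 * x0 * math.exp(-x0 * x0) / math.sqrt(math.pi)

f_esc = maxwellian_fraction_above(2.0)
print(f"f_esc = {f_esc:.4f}, i.e. about 1/{1.0 / f_esc:.0f}")
```

The result is slightly below one percent, consistent with the rounded figure quoted in the text.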

1.5 Fluctuations
So how long is the relaxation time? Consider a system of mass $M$ and characteristic scale $R$, in which the characteristic internal speed is $\sigma = \sqrt{GM/R}$. Consider now a subregion of size $r = xR$, which contains mass $M_r \simeq x^3 M$. If there are $N$ stars in the entire system, then $n \simeq x^3 N$ is the typical number of stars in the subregion, and on account of Poisson noise $M_r$ fluctuates by $\delta M_r = M_r/\sqrt{n} = x^3 M/\sqrt{x^3 N}$ during times $\delta t = r/\sigma$. Consider a point that is distance $yR$ from our subregion. At this point a single fluctuation in the subregion’s gravitational attraction will change the velocity of a test star by
$$\delta v = \frac{G\,\delta M_r}{(yR)^2}\,\delta t = \frac{GM x^{3/2}}{(yR)^2\sqrt{N}}\,\frac{xR}{\sigma} = \frac{\sigma x^{5/2}}{y^2\sqrt{N}}. \tag{1.11}$$

This formula states that for given $y$, large volumes $x \simeq 1$ perturb $v$ very much more strongly than small volumes $x \ll 1$. Against this trend we must bear in mind that (a) $y \ge x$, (b) the number of subregions perturbing increases as $x^{-3}$ as $x$ decreases, and (c) the time within which the contribution (1.11) comes about decreases with $x$, so in a given time each small subregion makes many more contributions to $v$ than does a large subregion.
We assume that the contributions to $v$ from different subregions are statistically independent, so it’s appropriate to add the $\delta v$ in quadrature. There are $\sim 4\pi(y/x)^2$ subregions of scale $x$ that are distance $yR$ from our point, and in a crossing time $t_{\rm cross} = R/\sigma$ each such subregion contributes $x^{-1}$ times. So in a crossing time all these subregions change $v^2$ by
$$(\Delta v)^2 = 4\pi\frac{y^2}{x^3}(\delta v)^2 = 4\pi\frac{\sigma^2 x^2}{y^2 N}. \tag{1.12}$$
Now we have to sum over $y = x, 2x, 3x, \ldots, 1$. We convert the sum to an integral using $dy = x$ and have
$$\sum_y \frac{1}{y^2} \simeq \frac{1}{x}\int_x^1 \frac{dy}{y^2} = \frac{1}{x}\left(\frac{1}{x} - 1\right) \simeq \frac{1}{x^2}. \tag{1.13}$$
Hence in a crossing time the subregions of scale $x$ change $v^2$ by
$$(\Delta v)^2 \simeq 4\pi\sigma^2/N. \tag{1.14}$$

Remarkably, this is independent of $x$, so regions of each scale $xR$ contribute equally to changing $v^2$. We obtain the total change by summing these contributions over all relevant values of $x$. We do this by multiplying equation (1.14) by $-\ln x_{\min}$, where $x_{\min}R$ is the smallest subregion it’s sensible to consider. This clearly shouldn’t be smaller than a decent multiple of the inter-particle distance $\lambda \sim R/N^{1/3}$, so
$$(\Delta v)^2_{t_{\rm cross}} \simeq \frac{4\pi\sigma^2 \ln N}{3N}. \tag{1.15}$$
The relaxation time is the time required for fluctuations to change any velocity by order of itself, thus for $(\Delta v)^2$ to accumulate to $\sigma^2$. From (1.15) it follows that
$$t_{\rm relax} \simeq \frac{N}{4\ln N}\,t_{\rm cross}. \tag{1.16}$$
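
To get a feel for the numbers, (1.16) can be evaluated for two representative particle counts. The values $N = 10^5$ and $N = 10^{11}$ below are assumed, illustrative choices (roughly a star cluster and a galaxy), and `relaxation_over_crossing` is a hypothetical helper name.

```python
import math

def relaxation_over_crossing(n_stars):
    """t_relax / t_cross ~ N / (4 ln N), eq. (1.16)."""
    return n_stars / (4.0 * math.log(n_stars))

for label, n in (("star cluster", 1e5), ("galaxy", 1e11)):
    ratio = relaxation_over_crossing(n)
    print(f"{label}: N = {n:.0e}, t_relax / t_cross ~ {ratio:.2e}")
```

The ratio grows almost linearly with $N$, so by this estimate relaxation matters within a modest number of crossing times only for systems with comparatively few stars.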
In an ideal gas the number of molecules in a given volume experiences Poisson fluctuations
as was assumed above, and these fluctuations can be considered to arise from thermally excited
sound waves. The self-gravity of a stellar system makes the system more compressible on large
scales than on small scales, where self gravity is unimportant and an ideal gas provides a valid
model. Hence, large-scale fluctuations have a larger amplitude than simple Poisson fluctuations,
with the consequence that contrary to our finding above of equal contributions from all scales,
fluctuations on the size of the system are dominant. In Chapter 4 we will develop the apparatus
required to include the amplifying effect of self gravity, and in Chapter 5 we will see that self-
gravity accelerates the relaxation of stellar discs by orders of magnitude. Its effect is much smaller
in star clusters.
The conclusions we’ve reached in §§1.3 and 1.4 make it clear that the statistical mechanics
of self-gravitating systems must be very different from anything we have previously encountered.

2
Mean-field model

In this section we assemble the tools needed to figure out the long-term evolution of self-gravitating systems. The key step is to recognise that the evolution can be described as a sequence of steady states of a ‘mean-field’ model. Since the dominant forces come from remote particles (Figure 1.1), an excellent approximation to $F$ can be obtained by smearing the mass of each particle over distances somewhat larger than the inter-particle distance. The gravitational potential $\Phi_0$ of this mean-field model is the time-average of the system’s real fluctuating $\Phi$. The mean-field potential may be computed by smearing the masses of particles through volumes that extend just a bit further than the local inter-particle distance. The smeared system has a pretty smooth density distribution $\rho(\mathbf{x})$ and consequently a very smooth gravitational potential
$$\Phi(\mathbf{x}) = -G\int d^3\mathbf{x}'\,\frac{\rho(\mathbf{x}')}{|\mathbf{x} - \mathbf{x}'|}. \tag{2.1}$$

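Conservation of particles as they flow through phase space requires that the one-particle df $f(\mathbf{x}, \mathbf{v})$ of the mean-field model satisfies
$$\frac{\partial f}{\partial t} + \frac{\partial}{\partial\mathbf{x}}\cdot(f\dot{\mathbf{x}}) + \frac{\partial}{\partial\mathbf{v}}\cdot(f\dot{\mathbf{v}}) = 0, \tag{2.2}$$
and since by Hamilton’s equations
$$\frac{\partial}{\partial\mathbf{x}}\cdot\dot{\mathbf{x}} = \frac{\partial^2 H}{\partial\mathbf{x}\cdot\partial\mathbf{v}} = -\frac{\partial}{\partial\mathbf{v}}\cdot\dot{\mathbf{v}}, \tag{2.3}$$
where $H = \tfrac12 v^2 + \Phi$ is the Hamiltonian, we have that $f$ satisfies the collisionless Boltzmann (Vlasov) equation
$$0 = \frac{\partial f}{\partial t} + \dot{\mathbf{x}}\cdot\frac{\partial f}{\partial\mathbf{x}} + \dot{\mathbf{v}}\cdot\frac{\partial f}{\partial\mathbf{v}} = \frac{\partial f}{\partial t} + [f, H], \tag{2.4}$$
where $[\cdot,\cdot]$ is the Poisson bracket.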
Consider now a steady-state mean-field model. In this case [f, H] = 0, so the df is a constant
of the equations of particle motion in the mean-field potential. Jeans’ theorem states (trivially
because f is a constant of motion!) that the df of a stationary mean-field model can be assumed
to depend on (x, v) only through constants of motion.
Inhomogeneity is a major setback: when a system is translationally invariant, group theory
guarantees that the linear equations governing small disturbances have a complete set of solutions
of the form ei(k·x−ωt) and to understand the dynamics of disturbances we only have to determine
the dispersion relation. When a system is inhomogeneous, we don’t know the structure of the
eigenfunctions up front and have to work hard to find them.
Next, we study orbits in the mean-field potential.

2.1 Angle-action variables


If you numerically integrate orbits in potentials similar to those of star clusters and galaxies and Fourier decompose the resulting time series $x(t)$, $y(t)$, etc., you generally find that these series are quasiperiodic.¹ That is, their Fourier decompositions are of the form
$$\mathbf{x}(t) = \sum_{\mathbf{n}} \mathbf{X}_{\mathbf{n}}\, e^{i\mathbf{n}\cdot\boldsymbol{\Omega}t}, \tag{2.5}$$
where the 2d or 3d vectors $\mathbf{n}$ have integer components and the vector $\boldsymbol{\Omega}$ is made up of 2 or 3 frequencies that are characteristic of the orbit.² From the quasiperiodic nature of $\mathbf{x}(t)$ it can be shown (see V.I. Arnold, Mathematical Methods of Classical Mechanics, Springer) that the orbit admits at least as many independent integrals of motion $I(\mathbf{x}, \mathbf{v})$ as it has degrees of freedom. That is, there are at least 2 or 3 (depending on whether or not the orbit is confined to a plane) independent functions on phase space such that
$$\frac{d}{dt} I[\mathbf{x}(t), \mathbf{v}(t)] = 0. \tag{2.6}$$
In a time-independent potential, $H$ is always an integral of motion, and in an axisymmetric potential the appropriate component of angular momentum is always another integral. The non-trivial numerical result is that there is almost always a third integral of motion of unknown functional form.
Given a set of integrals of motion $I_i$, any function $J_i(I_1, I_2, I_3)$ of three variables provides another integral. Given this choice, it’s natural to ask whether a set of integrals can be found that can be complemented by canonically-conjugate variables, $\theta_i$. For if we had a system of canonical coordinates $(\boldsymbol{\theta}, \mathbf{J})$ such that the momenta were constant, half of Hamilton’s equations would read
$$0 = \dot{J}_i = -\frac{\partial H}{\partial\theta_i}. \tag{2.7}$$
That is, these equations of motion establish that the Hamiltonian, and its derivatives, are functions of the $J_i$ only and are therefore constant on each orbit. The other equations of motion are now trivially solved:
$$\dot{\theta}_i = \frac{\partial H}{\partial J_i} = \Omega_i(\mathbf{J}),\ \text{a constant} \quad\Rightarrow\quad \theta_i(t) = \theta_i(0) + \Omega_i t. \tag{2.8}$$
So in the $(\boldsymbol{\theta}, \mathbf{J})$ coordinate system dynamics becomes trivial. The magic integrals $J_i$ are called actions and their conjugate variables $\theta_i$ are called angles because one usually scales the actions so ordinary phase-space coordinates such as $\mathbf{x}$ are $2\pi$-periodic in the angles. That is, the function on phase space $\mathbf{x}$ can be expanded as
$$\mathbf{x}(\boldsymbol{\theta}, \mathbf{J}) = \sum_{\mathbf{n}} \mathbf{X}_{\mathbf{n}}(\mathbf{J})\, e^{i\mathbf{n}\cdot\boldsymbol{\theta}}. \tag{2.9}$$
The Fourier expansion (2.5) from which we started arises by eliminating $\boldsymbol{\theta}$ between equations (2.8) and (2.9).

¹ Binney & Spergel, ApJ, 252, 308 (1982)
² Whereas the Fourier decomposition of a periodic function contains only integer multiples of a single fundamental frequency, a quasiperiodic function contains only integer linear combinations of 2 or more fundamental frequencies.
Whenever the frequencies Ωi are incommensurable (that is, no relation of the form n · Ω = 0
exists) the actions constitute a complete set of integrals of motion in the sense that any integral
of motion can be obtained as a function of them. Since almost all real numbers are irrational,
the frequencies of most orbits are incommensurable and the actions are generically a complete
set of integrals.
We have seen that by Jeans’ theorem the df of an equilibrium mean-field model is an
integral of motion, so it is a function f (J) of the actions. In a plasma we assume f (v) because
in a homogeneous system, v = constant. Many formulae derived for a plasma will go over to a
stellar system with the substitutions x → θ, v → J.

2.1.1 Adiabatic invariance


Action integrals are adiabatic invariants: if $H$ evolves on a timescale that is longer than the dynamical time, an orbit of $H$ evolves in such a way that $\mathbf{J}$ = constant. Consequently, when $H$ evolves slowly, the number of particles with $\mathbf{J}$ in each element $d^3\mathbf{J}$ of action space is unchanged, so the function $f(\mathbf{J})$ is constant.

2.1.2 Hamilton-Jacobi equation


Let $S(\mathbf{x}, \mathbf{J})$ be the generating function of the canonical transformation $(\mathbf{x}, \mathbf{p}) \leftrightarrow (\boldsymbol{\theta}, \mathbf{J})$. Then $\mathbf{p} = \partial S/\partial\mathbf{x}$ and we use this relation to eliminate $\mathbf{p}$ from the statement that the Hamiltonian is constant along the orbit:
$$H\!\left(\mathbf{x}, \frac{\partial S}{\partial\mathbf{x}}\right) = E. \tag{2.10}$$
This Hamilton-Jacobi equation, which holds at all points that can be reached by the orbit, is a p.d.e. for $S$. In practice it can only be solved when the substitution $S(\mathbf{x}, \mathbf{J}) = \sum_i S_i(x_i, \mathbf{J})$ leads to a clean separation of variables. For example, for planar motion in an axisymmetric $\Phi(r)$ we have
$$H(r, \phi, p_r, p_\phi) = \tfrac12 p_r^2 + \frac{p_\phi^2}{2r^2} + \Phi(r), \tag{2.11}$$
so $2r^2$ times the H-J eqn yields
$$r^2\left(\frac{\partial S_r}{\partial r}\right)^2 + 2r^2(\Phi - E) = -\left(\frac{\partial S_\phi}{\partial\phi}\right)^2 = -L^2, \tag{2.12}$$
where $-L^2$ is a constant of separation. $S_\phi = L\phi$ follows trivially, and almost as easily we get
$$S_r = \int^r dr\,\sqrt{2(E - \Phi) - \frac{L^2}{r^2}}. \tag{2.13}$$
These operations yield a function $S(\mathbf{x}, E, L)$, which is not of the required form: the integrals of motion $E$, $L$ are (unknown) functions of the required action integrals; the pair $(E, L)$ cannot be complemented by variables to form a set of canonical coordinates. The actions $J_i$ are defined by
$$J_i = \frac{1}{2\pi}\oint_{\gamma_i} d\mathbf{x}\cdot\mathbf{p}, \tag{2.14}$$
where each path $\gamma_i$ around the part of phase space accessible to the orbit cannot be deformed into another of the $\gamma_i$ without leaving the accessible region.³

³ The integrals (2.14) are unchanged by sliding $\gamma_i$ over the accessible region because the latter has vanishing Poincaré invariant $\sum_i dx_i\,dp_i$.

Once we’ve separated the H-J eqn, we can evaluate the integral (2.14) that defines an action associated with each spatial coordinate because a separated equation such as (2.12) makes $p_i$ a function of only its coordinate $x_i$. In the case of 2d motion, we hold $r$ constant along one path, and $\phi$ constant along the other path. Then the first path trivially yields $J_\phi = L$ and the second path yields
$$J_r(E, L) = \frac{1}{2\pi}\oint dr\, p_r = \frac{1}{\pi}\int_{r_{\min}}^{r_{\max}} dr\,\sqrt{2(E - \Phi) - \frac{L^2}{r^2}}. \tag{2.15}$$
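
The radial action (2.15) is a one-dimensional quadrature once the turning points are known. The sketch below is one possible way to evaluate it, not the method used in the lectures: it assumes the caller supplies $r_{\min}$ and $r_{\max}$, uses a sine substitution to tame the square-root behaviour at the turning points, and is checked against the Kepler potential in assumed units $GM = 1$, for which the closed form $J_r = 1/\sqrt{-2E} - L$ is known. The function name `radial_action` is hypothetical.

```python
import math

def radial_action(phi, energy, ang_mom, r_min, r_max, n=2000):
    """Evaluate J_r = (1/pi) * int_{r_min}^{r_max} sqrt(2(E - Phi) - L^2/r^2) dr.

    The substitution r = c + h sin(t) removes the square-root endpoint
    behaviour, so a plain midpoint rule in t converges quickly.
    """
    c, h = 0.5 * (r_max + r_min), 0.5 * (r_max - r_min)
    dt = math.pi / n
    total = 0.0
    for k in range(n):
        t = -0.5 * math.pi + (k + 0.5) * dt
        r = c + h * math.sin(t)
        p_r_sq = 2.0 * (energy - phi(r)) - (ang_mom / r) ** 2
        # Guard against tiny negative values from rounding near turning points.
        total += math.sqrt(max(p_r_sq, 0.0)) * h * math.cos(t) * dt
    return total / math.pi

# Kepler test case (assumed units G M = 1): the turning points follow from
# 2 E r^2 + 2 r - L^2 = 0, and the known result is J_r = 1/sqrt(-2E) - L.
E, L = -0.5, 0.6
a, ecc = -1.0 / (2.0 * E), math.sqrt(1.0 + 2.0 * E * L * L)
j_r = radial_action(lambda r: -1.0 / r, E, L, a * (1 - ecc), a * (1 + ecc))
print(j_r, 1.0 / math.sqrt(-2.0 * E) - L)
```

For a general $\Phi(r)$ the turning points would have to be found numerically (for example by bisection on $2(E - \Phi) - L^2/r^2$) before calling the function.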
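To find the angle variables we have to use the chain rule
$$\theta_i = \frac{\partial S_r}{\partial J_i} + \frac{\partial S_\phi}{\partial J_i} = \frac{\partial S_r}{\partial E}\frac{\partial E}{\partial J_i} + \frac{\partial S_r}{\partial L}\frac{\partial L}{\partial J_i} + \frac{\partial S_\phi}{\partial L}\frac{\partial L}{\partial J_i}. \tag{2.16}$$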

2.1.3 Choice of actions


The action integrals $J_i$ are defined up to a set of discrete canonical transformations (generating function $S(\boldsymbol{\theta}, \mathbf{J}') = \boldsymbol{\theta}\cdot\mathbf{M}\cdot\mathbf{J}'$, where the matrix $\mathbf{M}$ has integer elements). For an axisymmetric system the actions are uniquely defined by requiring that

$J_r$ quantifies radial excursions,
$J_\phi = L_z$ is the angular momentum about the symmetry axis, (2.17)
$J_z$ quantifies oscillations perpendicular to the equatorial plane.

In the spherical limit $J_z = L - |L_z|$ is the angular momentum in the $(x, y)$ plane.

2.2 Self-consistent, mean-field model


A stellar system’s df $f(\mathbf{x}, \mathbf{v})$ specifies the mass $dm = f\,d^3\mathbf{x}\,d^3\mathbf{v}$ in each infinitesimal volume of phase space. If the system is in a statistically steady state, Jeans’ theorem tells us that $f$ can depend on $(\mathbf{x}, \mathbf{v})$ only through $\mathbf{J}(\mathbf{x}, \mathbf{v})$, so it can be expressed as a function $f(\mathbf{J})$. In fact any non-negative function of three variables $0 \le J_r < \infty$, $-\infty < J_\phi < \infty$ and $0 \le J_z < \infty$ specifies an axisymmetric stellar system, and a really powerful way of generating models that can be fitted to observational data is simply to write down a likely function.⁴
Given some function $f(\mathbf{J})$ how do we discover what the system looks like in real space?
1) Make a guess $\rho_0(\mathbf{x})$ at its density distribution. The guess doesn’t have to be a good one, but you should ensure that its mass satisfies
$$M \equiv (2\pi)^3\int d^3\mathbf{J}\, f(\mathbf{J}) = \int d^3\mathbf{x}\,\rho_0. \tag{2.18}$$
2) Solve Poisson’s equation for the potential $\Phi_0(\mathbf{x})$ generated by $\rho_0$.
3) Obtain the angle-action coordinates for the potential $\Phi_0(\mathbf{x})$ and use them to determine a new density distribution
$$\rho_1(\mathbf{x}) = \int d^3\mathbf{v}\, f[\mathbf{J}(\mathbf{x}, \mathbf{v})]. \tag{2.19}$$
4) Return to step (2) with $\rho_0$ replaced by $\rho_1$ and iterate until $\rho_n(\mathbf{x})$ differs negligibly from $\rho_{n-1}$. This typically requires $\sim 5$ iterations.⁵
The only tricky part of this procedure is obtaining the angle-action coordinates of $\Phi_n$. In practice approximations to the true $(\boldsymbol{\theta}, \mathbf{J})$ coordinates are used.⁶

⁴ In the 20th c. composers appeared who argued that any series of notes constitutes music. We disagree: writing music involves observing rules regarding scales, chords, etc. Similarly, creating plausible stellar systems requires adherence to rules regarding how $f(\mathbf{J})$ behaves in certain parts of action space. But these rules are a matter of good taste.
⁵ Binney, MNRAS, 440, 787 (2014)
⁶ Sanders & Binney, MNRAS, 457, 2107 (2016)

2.3 Biorthogonal potential-density pairs
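Unfortunately, while $\Phi$ is a function of only $\mathbf{x}$, it becomes a function of both $\boldsymbol{\theta}$ and $\mathbf{J}$. So while angle-action variables make dynamics trivial (advance $\boldsymbol{\theta}$ linearly in $t$), they seriously complicate the solution of Poisson’s eqn.
We finesse this difficulty by introducing a basis of biorthogonal potential-density pairs. That is, a set of pairs $(\rho^{(\alpha)}, \Phi^{(\alpha)})$ such that
$$4\pi G\rho^{(\alpha)} = \nabla^2\Phi^{(\alpha)} \quad\text{and}\quad \int d^3\mathbf{x}\,\Phi^{(\alpha)*}\rho^{(\alpha')} = -\mathcal{E}\,\delta_{\alpha\alpha'}, \tag{2.20}$$
where $\mathcal{E}$ is an arbitrary constant with the dimensions of energy. Given a density distribution $\rho(\mathbf{x})$, we expand it in the basis
$$\rho(\mathbf{x}) = \sum_\alpha A_\alpha\,\rho^{(\alpha)}(\mathbf{x}) \quad\Rightarrow\quad \begin{cases} \Phi(\mathbf{x}) = \displaystyle\sum_\alpha A_\alpha\,\Phi^{(\alpha)}(\mathbf{x}), \\[1ex] A_\alpha = -\dfrac{1}{\mathcal{E}}\displaystyle\int d^3\mathbf{x}\,\Phi^{(\alpha)*}(\mathbf{x})\,\rho(\mathbf{x}). \end{cases} \tag{2.21}$$
If $\rho$ and $\Phi$ are time-dependent, the $A_\alpha$ become time-dependent.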
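
The formula for $A_\alpha$ in (2.21) follows in one line from the orthogonality relation in (2.20). Written out, with $\mathcal{E}$ denoting the arbitrary energy constant of (2.20), the intermediate step is:

```latex
\begin{align}
\int d^3\mathbf{x}\,\Phi^{(\alpha)*}(\mathbf{x})\,\rho(\mathbf{x})
  &= \sum_{\alpha'} A_{\alpha'} \int d^3\mathbf{x}\,
     \Phi^{(\alpha)*}(\mathbf{x})\,\rho^{(\alpha')}(\mathbf{x}) \\
  &= \sum_{\alpha'} A_{\alpha'}\,\bigl(-\mathcal{E}\,\delta_{\alpha\alpha'}\bigr)
   = -\mathcal{E}\,A_\alpha .
\end{align}
```

The expansion of $\Phi$ with the same coefficients then follows from the linearity of Poisson’s equation, since each $\Phi^{(\alpha)}$ is by construction the potential generated by $\rho^{(\alpha)}$.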


In practice potential-density pairs are complex because they are based on the spherical harmonics $Y_l^m$ – see §2.8 of BT08 for more information. However, we could regard the real and imaginary parts of $Y_l^m(\theta, \phi) \propto P_l^m(\cos\theta)(\cos m\phi + i\sin m\phi)$ as (real) basis functions in their own right. Below, we will find it useful to assume that we are in fact working with real basis functions.

3
Perturbing the DF

The full df $f(\mathbf{x}, \mathbf{v})$ satisfies
$$0 = \frac{\partial f}{\partial t} + [f, H]. \tag{3.1}$$
Breaking $f$ and $H$ into their mean-field and fluctuating parts, we obtain
$$0 = \frac{\partial f_0}{\partial t} + \frac{\partial f_1}{\partial t} + [f_0, H_0] + [f_0, H_1] + [f_1, H_0] + [f_1, H_1]. \tag{3.2}$$
By Jeans’ theorem, $[f_0, H_0] = 0$. When we ensemble-average the equation, the parts linear in $f_1$ or $H_1$ vanish, so we are left with
$$0 = \frac{\partial f_0}{\partial t} + \langle[f_1, H_1]\rangle. \tag{3.3a}$$
The second term in this equation is clearly $O(f_1^2)$ or smaller, so the time derivative of $f_0$ is small, as expected.
Since we are not formally expanding in some small parameter (e.g., $1/N$) we haven’t yet defined $f_1$ exactly: our only requirement is that its ensemble average vanishes, so $\langle f\rangle = f_0$. Hence we are free to define $f_1$ such that the part of eqn (3.2) that is $O(f_1)$ is identically zero, which is a stronger statement than that $f_1$ has vanishing ensemble average. That is, we now require
$$0 = \frac{\partial f_1}{\partial t} + [f_0, H_1] + [f_1, H_0], \tag{3.3b}$$
where
$$H_1 = \Phi_1(\mathbf{x}) = -G\int d^3\mathbf{x}'\,d^3\mathbf{v}\,\frac{f_1(\mathbf{x}', \mathbf{v})}{|\mathbf{x} - \mathbf{x}'|}$$
because only the potential term fluctuates and it is related to the perturbation to the density, $\rho_1(\mathbf{x}) = \int d^3\mathbf{v}\, f_1(\mathbf{x}, \mathbf{v})$, by the Poisson integral.
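Since the $(\boldsymbol{\theta}, \mathbf{J})$ system, like the $(\mathbf{x}, \mathbf{v})$ one, is canonical and Poisson brackets are invariant under changes of canonical coordinates, we can substitute $\mathbf{x} \to \boldsymbol{\theta}$, $\mathbf{v} \to \mathbf{J}$ in all these formulae if we wish. Then we have
$$f(\boldsymbol{\theta}, \mathbf{J}, t) = f_0(\mathbf{J}) + f_1(\boldsymbol{\theta}, \mathbf{J}, t) \tag{3.4}$$
and
$$H = H_0(\mathbf{J}) + \Phi_1(\boldsymbol{\theta}, \mathbf{J}, t) \tag{3.5}$$
and eqn (3.3b) becomes
$$0 = \frac{\partial f_1}{\partial t} + \frac{\partial f_1}{\partial\boldsymbol{\theta}}\cdot\frac{\partial H_0}{\partial\mathbf{J}} - \frac{\partial f_0}{\partial\mathbf{J}}\cdot\frac{\partial\Phi_1}{\partial\boldsymbol{\theta}} + O(f_1^2). \tag{3.6}$$
From (2.8) we identify $\partial H_0/\partial\mathbf{J} = \boldsymbol{\Omega}(\mathbf{J})$ as the frequency vector of the unperturbed orbit $\mathbf{J}$. Moreover, incrementing any angle coordinate by $2\pi$ brings us back to the same point in phase space (eq. 2.9), so all functions of $\boldsymbol{\theta}$ can be expressed as Fourier series:
$$\begin{aligned} f_1(\boldsymbol{\theta}, \mathbf{J}, t) &= \sum_{\mathbf{n}} \hat{f}_1(\mathbf{n}, \mathbf{J}, t)\,e^{i\mathbf{n}\cdot\boldsymbol{\theta}} &\leftrightarrow&& \hat{f}_1(\mathbf{n}, \mathbf{J}, t) &= \int\frac{d^3\boldsymbol{\theta}}{(2\pi)^3}\, f_1(\boldsymbol{\theta}, \mathbf{J}, t)\,e^{-i\mathbf{n}\cdot\boldsymbol{\theta}}, \\ \Phi_1(\boldsymbol{\theta}, \mathbf{J}, t) &= \sum_{\mathbf{n}} \hat{\Phi}_1(\mathbf{n}, \mathbf{J}, t)\,e^{i\mathbf{n}\cdot\boldsymbol{\theta}} &\leftrightarrow&& \hat{\Phi}_1(\mathbf{n}, \mathbf{J}, t) &= \int\frac{d^3\boldsymbol{\theta}}{(2\pi)^3}\, \Phi_1(\boldsymbol{\theta}, \mathbf{J}, t)\,e^{-i\mathbf{n}\cdot\boldsymbol{\theta}}. \end{aligned} \tag{3.7}$$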

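Using these results, we can rewrite the linearised Vlasov equation (3.6) as
$$0 = \sum_{\mathbf{n}} e^{i\mathbf{n}\cdot\boldsymbol{\theta}}\left(\frac{\partial\hat{f}_1}{\partial t} + i\mathbf{n}\cdot\boldsymbol{\Omega}\,\hat{f}_1 - i\mathbf{n}\cdot\frac{\partial f_0}{\partial\mathbf{J}}\,\hat{\Phi}_1\right). \tag{3.8}$$
Since $\boldsymbol{\theta}$ is arbitrary, for this equation to hold, every coefficient of $e^{i\mathbf{n}\cdot\boldsymbol{\theta}}$ must separately vanish, so we obtain an infinite set of equations
$$\frac{\partial\hat{f}_1}{\partial t} = i\mathbf{n}\cdot\frac{\partial f_0}{\partial\mathbf{J}}\,\hat{\Phi}_1 - i\mathbf{n}\cdot\boldsymbol{\Omega}\,\hat{f}_1 \quad\text{for } \mathbf{n} \text{ with integer components.} \tag{3.9}$$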

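We use Laplace transforms to solve (3.9): multiplying by $e^{-pt}$ (with $\Re(p) > 0$) and integrating over $t$, we get¹
$$p\tilde{f}_1(\mathbf{n}, \mathbf{J}, p) - \hat{f}_1(\mathbf{n}, \mathbf{J}, 0) + i\mathbf{n}\cdot\boldsymbol{\Omega}\,\tilde{f}_1(\mathbf{n}, \mathbf{J}, p) - i\mathbf{n}\cdot\frac{\partial f_0}{\partial\mathbf{J}}\,\tilde{\Phi}_1(\mathbf{n}, \mathbf{J}, p) = 0, \tag{3.10}$$
where the tildes denote Laplace transforms:
$$\tilde{f}_1(\mathbf{n}, \mathbf{J}, p) \equiv \int_0^\infty dt\, e^{-pt}\,\hat{f}_1(\mathbf{n}, \mathbf{J}, t). \tag{3.11}$$
Solving for $\tilde{f}_1$ we have (cf. Schekochihin eqn. 3.8)
$$\tilde{f}_1(\mathbf{n}, \mathbf{J}, p) = \frac{i\mathbf{n}\cdot\dfrac{\partial f_0}{\partial\mathbf{J}}\,\tilde{\Phi}_1(\mathbf{n}, \mathbf{J}, p) + \hat{f}_1(\mathbf{n}, \mathbf{J}, 0)}{p + i\mathbf{n}\cdot\boldsymbol{\Omega}}. \tag{3.12}$$
This equation provides one connection between a perturbation to the potential $\Phi_1$ and the response $f_1$ it induces dynamically.

¹ While the dimensions of a quantity are unchanged by a hat, a tilde raises the dimensions by a factor $T$.

We now need to put into maths the principle that $\Phi_1$ is the potential generated by the perturbation to the density that’s associated with $f_1$. To obtain the coefficients $A_\alpha$ of the potential-density expansion (2.21) of the perturbed density (or at any rate their temporal Laplace transforms), we multiply the left side of (3.12) by $\Phi^{(\alpha)*}e^{i\mathbf{n}\cdot\boldsymbol{\theta}}$, sum over $\mathbf{n}$, and integrate over phase space:
$$\begin{aligned} \int d^3\boldsymbol{\theta}\,d^3\mathbf{J}\,\Phi^{(\alpha)*}(\mathbf{x})\sum_{\mathbf{n}} e^{i\mathbf{n}\cdot\boldsymbol{\theta}}\,\tilde{f}_1(\mathbf{n}, \mathbf{J}, p) &= \int d^3\mathbf{x}\,d^3\mathbf{v}\,\Phi^{(\alpha)*}(\mathbf{x})\,\tilde{f}_1(\mathbf{x}, \mathbf{v}, p) \\ &= \int d^3\mathbf{x}\,\Phi^{(\alpha)*}(\mathbf{x})\,\tilde{\rho}_1(\mathbf{x}, p) = -\mathcal{E}\tilde{A}_\alpha(p). \end{aligned} \tag{3.13}$$
Here we have exploited the fact that the Jacobian between any two sets of canonical coordinates is unity, so $d^3\boldsymbol{\theta}\,d^3\mathbf{J} = d^3\mathbf{x}\,d^3\mathbf{v}$. Now operating in the same way on the rhs of eqn (3.12) we have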
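$$\begin{aligned} &\int d^3\boldsymbol{\theta}\,d^3\mathbf{J}\sum_{\mathbf{n}} e^{i\mathbf{n}\cdot\boldsymbol{\theta}}\,\Phi^{(\alpha)*}(\mathbf{x})\,\frac{i\mathbf{n}\cdot\dfrac{\partial f_0}{\partial\mathbf{J}}\,\tilde{\Phi}_1(\mathbf{n}, \mathbf{J}, p) + \hat{f}_1(\mathbf{n}, \mathbf{J}, 0)}{p + i\mathbf{n}\cdot\boldsymbol{\Omega}} \\ &\qquad = (2\pi)^3\sum_{\mathbf{n}}\int d^3\mathbf{J}\,[\hat{\Phi}^{(\alpha)}(\mathbf{n}, \mathbf{J})]^*\,\frac{i\mathbf{n}\cdot\dfrac{\partial f_0}{\partial\mathbf{J}}\sum_{\alpha'}\tilde{A}_{\alpha'}(p)\,\hat{\Phi}^{(\alpha')}(\mathbf{n}, \mathbf{J}) + \hat{f}_1(\mathbf{n}, \mathbf{J}, 0)}{p + i\mathbf{n}\cdot\boldsymbol{\Omega}}. \end{aligned} \tag{3.14}$$
Uniting the two sides (3.13) and (3.14) of equation (3.12) we obtain an equation for $\tilde{A}_\alpha$:
$$\tilde{A}_\alpha(p) = -\frac{(2\pi)^3}{\mathcal{E}}\sum_{\mathbf{n}}\int d^3\mathbf{J}\,\frac{i\mathbf{n}\cdot\dfrac{\partial f_0}{\partial\mathbf{J}}\sum_{\alpha'}\tilde{A}_{\alpha'}(p)\,[\hat{\Phi}^{(\alpha)}(\mathbf{n}, \mathbf{J})]^*\,\hat{\Phi}^{(\alpha')}(\mathbf{n}, \mathbf{J}) + \hat{f}_1(\mathbf{n}, \mathbf{J}, 0)\,[\hat{\Phi}^{(\alpha)}(\mathbf{n}, \mathbf{J})]^*}{p + i\mathbf{n}\cdot\boldsymbol{\Omega}}. \tag{3.15}$$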
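We move the term on the right containing $\tilde{A}_{\alpha'}$ to the left side so we can write
$$\sum_{\alpha'}\epsilon_{\alpha\alpha'}(p)\,\tilde{A}_{\alpha'}(p) = -\frac{(2\pi)^3}{\mathcal{E}}\sum_{\mathbf{n}}\int d^3\mathbf{J}\,\frac{\hat{f}_1(\mathbf{n}, \mathbf{J}, 0)\,[\hat{\Phi}^{(\alpha)}(\mathbf{n}, \mathbf{J})]^*}{p + i\mathbf{n}\cdot\boldsymbol{\Omega}}, \tag{3.16a}$$
where
$$\epsilon_{\alpha\alpha'}(p) \equiv \delta_{\alpha\alpha'} + i\frac{(2\pi)^3}{\mathcal{E}}\sum_{\mathbf{n}}\int d^3\mathbf{J}\,\frac{\mathbf{n}\cdot\dfrac{\partial f_0}{\partial\mathbf{J}}}{p + i\mathbf{n}\cdot\boldsymbol{\Omega}}\,[\hat{\Phi}^{(\alpha)}(\mathbf{n}, \mathbf{J})]^*\,\hat{\Phi}^{(\alpha')}(\mathbf{n}, \mathbf{J}) \tag{3.16b}$$
is the analogue of the dielectric function (cf. Schekochihin eqn. 3.11). In both integrals over $\mathbf{J}$ in equations (3.16) we must use the Landau prescription. That is, we must ensure that $i\mathbf{n}\cdot\boldsymbol{\Omega}$ passes to the left of $p$ in the complex plane (Box 3.2).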
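After computing the inverse of the dimensionless matrix $\epsilon$, we have an explicit expression for $\tilde{A}_\alpha(p)$. Multiplying this by $\hat{\Phi}^{(\alpha)}(\mathbf{n}, \mathbf{J})$ and summing over $\alpha$ we obtain the Laplace transform of the potential perturbation arising from the initial condition $\hat{f}_1(\mathbf{n}, \mathbf{J}, 0)$:
$$\begin{aligned} \tilde{\Phi}_1(\mathbf{n}', \mathbf{J}', p) &= \sum_{\alpha'}\tilde{A}_{\alpha'}(p)\,\hat{\Phi}^{(\alpha')}(\mathbf{n}', \mathbf{J}') \\ &= -\frac{(2\pi)^3}{\mathcal{E}}\int d^3\mathbf{J}\sum_{\mathbf{n}}\frac{\hat{f}_1(\mathbf{n}, \mathbf{J}, 0)}{p + i\mathbf{n}\cdot\boldsymbol{\Omega}}\sum_{\alpha\alpha'}\hat{\Phi}^{(\alpha')}(\mathbf{n}', \mathbf{J}')\,\epsilon^{-1}_{\alpha'\alpha}(p)\,[\hat{\Phi}^{(\alpha)}(\mathbf{n}, \mathbf{J})]^* \\ &= -(2\pi)^3\int d^3\mathbf{J}\sum_{\mathbf{n}} E_{\mathbf{n}'\mathbf{n}}(\mathbf{J}', \mathbf{J}, p)\,\frac{\hat{f}_1(\mathbf{n}, \mathbf{J}, 0)}{p + i\mathbf{n}\cdot\boldsymbol{\Omega}}, \end{aligned} \tag{3.17a}$$
where
$$E_{\mathbf{n}'\mathbf{n}}(\mathbf{J}', \mathbf{J}, p) \equiv \frac{1}{\mathcal{E}}\sum_{\alpha\alpha'}\hat{\Phi}^{(\alpha')}(\mathbf{n}', \mathbf{J}')\,\epsilon^{-1}_{\alpha'\alpha}(p)\,[\hat{\Phi}^{(\alpha)}(\mathbf{n}, \mathbf{J})]^* \tag{3.17b}$$
has dimensions $M^{-1}L^2T^{-2}$ and is (to within a factor $\mathcal{E}$) $\epsilon^{-1}$ written in the $(\mathbf{n}, \mathbf{J})$ basis rather than the $(\alpha, \mathbf{J})$ basis. Equation (3.17a) is analogous to Schekochihin eqn. (3.13) in giving the Laplace transform of the response potential set up by a specified initial condition. It’s more complicated than Schekochihin eqn. (3.13) because: (a) in the latter Poisson’s equation is solved by simply dividing by $k^2$ while here we do acrobatics with the potential basis functions; (b) we have $E$ where Schekochihin eqn. 3.13 has $1/\epsilon$ and the case $\epsilon = 0$ becomes the case in which our matrix $\epsilon$ has no inverse, so $E$, which is basically this inverse, diverges; (c) Schekochihin eqn. (3.13) involves an integral over $\mathbf{v}$ with the denominator of the integrand linear in $\mathbf{v}$, while here we integrate over $\mathbf{J}$ and the denominator involves the non-linear function $\mathbf{n}\cdot\boldsymbol{\Omega}(\mathbf{J})$. The generalisation of the Landau prescription to this more complex context is given in Box 3.2.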

Box 3.1: Stability of a collisionless system

If we recover the temporal dependence of $\Phi_1$ by taking the inverse Laplace transform of equation (3.17a), we obtain a sum of terms with exponential time dependence $e^{p_i t}$ (Schekochihin eqn. 3.16), where $p_i$ is the value of the Laplace transform variable at which the matrix $E$ has a pole in the sense that it is the inverse of a singular matrix $\epsilon$. Consequently, the stability of a system at the level of collisionless dynamics is determined by whether the dielectric matrix $\epsilon$ is singular at any value $p_0$ of the Laplace transform variable with $\Re p_0 > 0$. We say that each $p_i$ is associated with a normal mode of the system. In a stable system the normal modes are all neutral ($\Re p_i = 0$) or damped ($\Re p_i < 0$).

Box 3.2: The Landau prescription with actions


We often encounter, as in eqns (3.16), an integral over action space with a denominator that
vanishes if p = −in·Ω(J). In a plasma, analogous integrals occur with denominator p+ ik·v
and we evaluate them using the Landau contour. To solve our more complex problem we
make a coordinate change from J → (x, y, z), where z ≡ n · Ω and (x, y) is a coordinate
system for the 2-surfaces z = const. Then
Z Z ∞
k(J) K(z)
d3 J = dz , (1)
p + in · Ω −∞ p + iz
where Z
∂(J)
K(z) ≡ dxdy k(J).
∂(x, y, z)
The integral on the right of (1) is now in just the form considered by Landau. We write
p = γ − iω with γ > 0, and have
Z ∞ Z Z
K(z) K(z) K(z)
dz = −i dz = −i dz .
−∞ p + iz z − ip z − (ω + iγ)
The z contour (real axis) passes under the pole, so in the limit γ → 0 this becomes
Z ∞  Z  Z
K(z) K(z) K(z)
dz = −i P dz + iπK(ω) = −iP dz + πK(ω). (2)
−∞ p + iz z − ω z −ω
R
Now let’s transform d3 J k(J)δ(n · Ω − ω) into the (x, y, z) system:
Z Z
3
d J k(J)δ(n · Ω − ω) = dz K(z)δ(z − ω) = K(ω).

When we use this equation in (2), we obtain the needed analogue of the Plemelj formula.
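$$
\int d^3\mathbf{J}\,\frac{k(\mathbf{J})}{p + i\mathbf{n}\cdot\boldsymbol{\Omega}} = -iP\!\int dz\,\frac{K(z)}{z - \omega} + \pi\int d^3\mathbf{J}\,k(\mathbf{J})\,\delta(\mathbf{n}\cdot\boldsymbol{\Omega} - \omega) \qquad (p = -i\omega + 0). \tag{3}
$$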
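
The limit taken in (2) can be checked numerically. The sketch below is only an illustration: it assumes a Lorentzian test function K(z) = 1/[π(1 + z²)], chosen because its principal-value integral is known in closed form, and the function names and grid parameters are hypothetical choices rather than anything taken from the text.

```python
import math


def K(z):
    """Lorentzian test function standing in for K(z) (an assumption of this sketch)."""
    return 1.0 / (math.pi * (1.0 + z * z))


def landau_integral(omega, gamma, half_width=100.0, n=200_000):
    """Midpoint-rule estimate of the integral of K(z) / (p + i z) along the real z axis,
    with p = gamma - i*omega and gamma > 0."""
    p = complex(gamma, -omega)
    h = 2.0 * half_width / n
    total = 0j
    for k in range(n):
        z = -half_width + (k + 0.5) * h
        total += K(z) / (p + 1j * z)
    return total * h


def plemelj_limit(omega):
    """Right side of (2): -i P∫ K/(z - omega) dz + pi K(omega)."""
    # Closed-form principal value for the Lorentzian test function.
    principal_value = -omega / (1.0 + omega * omega)
    return -1j * principal_value + math.pi * K(omega)
```

Evaluating `landau_integral(0.7, gamma)` for a decreasing sequence of `gamma` should approach `plemelj_limit(0.7)`, with a difference that shrinks roughly in proportion to `gamma`.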
4
Evolution of the mean-field model

We have been studying the properties of mean-field equilibrium systems. Such systems are fully
characterised by a non-negative df of the form f (J). We have shown how to compute the
evolution of the df when at t = 0 it differs very slightly from f (J). In all the above we have
been imagining that the system comprises an extremely large number of particles with extremely
low masses, so statistical fluctuations of the density around its mean value, $\rho(\mathbf{x}) = \int d^3\mathbf{v}\,f(\mathbf{x},\mathbf{v})$,
vanish. In this section we explore how to compute the evolution of f that occurs because its
constituent particles have non-zero masses, so ρ and Φ fluctuate around their mean values.
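Recall from Paul Dellar’s discussion of the BBGKY hierarchy that the 1-particle df f (x, v)
satisfies a Boltzmann equation in which the 2-particle correlation function $g^{(2)}(\mathbf{x},\mathbf{v},\mathbf{x}',\mathbf{v}')$
appears (Problem 7):
$$
\left.\frac{df}{dt}\right|_{\mathbf{w}} = (N-1)\int d^3\mathbf{x}'\,d^3\mathbf{v}'\,\frac{\partial u(\mathbf{x}-\mathbf{x}')}{\partial\mathbf{x}'}\cdot\frac{\partial g^{(2)}(\mathbf{w},\mathbf{w}')}{\partial\mathbf{v}}, \tag{4.1}
$$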

where w ≡ (x, v) denotes position in phase space and u(x − x′ ) is the interaction potential
between two particles. The physical content of this equation is that evolution of the mean-field
model, f (x, v), is driven by the tendency, encoded in $g^{(2)}$, for particles to cluster together, so you
are more likely to find a second particle near you if you stand on a particle than if you stand in
a random location. Heyvaerts1 obtains from equation (4.1) the equation for the evolution of f ,
which is what we seek in this section, but we’ll proceed along a different path, similar to that
laid out by Chavanis.2

4.1 Dynamics of fluctuations


We argue that the small-scale structure is unimportant so we should be able to compute every-
thing in terms of a smooth potential Φ(x, t) providing we properly account for the fluctuations
in Φ.
Equation (3.3a) shows that evolution of f0 is driven by the ‘collision integral’ $-\langle[f_1, H_1]\rangle$,
and the evolution of f1 is given by equation (3.3b). Our strategy is to use the solutions to
(3.3b) that we obtained in Chapter 3 to compute $\langle[f_1, H_1]\rangle$. We replace f1 and Φ1 in $\langle[f_1, H_1]\rangle$
by their Fourier expansions in θ (eq. 3.7). We also take advantage of the expectation operator
$\langle\cdot\rangle$ to integrate over all angles. Then we have
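$$
\begin{aligned}
\langle[f_1,\Phi_1]\rangle &= \bigg\langle\int\frac{d^3\boldsymbol{\theta}}{(2\pi)^3}\bigg\{\sum_{\mathbf{n}}\hat f_1(\mathbf{n},\mathbf{J},t)\,e^{i\mathbf{n}\cdot\boldsymbol{\theta}}\,i\mathbf{n}\cdot\sum_{\mathbf{n}'}\frac{\partial\hat\Phi_1(\mathbf{n}',\mathbf{J},t)}{\partial\mathbf{J}}\,e^{i\mathbf{n}'\cdot\boldsymbol{\theta}} \\
&\qquad\qquad - \sum_{\mathbf{n}}\frac{\partial\hat f_1(\mathbf{n},\mathbf{J},t)}{\partial\mathbf{J}}\,e^{i\mathbf{n}\cdot\boldsymbol{\theta}}\cdot\sum_{\mathbf{n}'}i\mathbf{n}'\,\hat\Phi_1(\mathbf{n}',\mathbf{J},t)\,e^{i\mathbf{n}'\cdot\boldsymbol{\theta}}\bigg\}\bigg\rangle \\
&= i\frac{\partial}{\partial\mathbf{J}}\cdot\bigg\langle\sum_{\mathbf{n}}\mathbf{n}\,\hat f_1(\mathbf{n},\mathbf{J},t)\,\hat\Phi_1(-\mathbf{n},\mathbf{J},t)\bigg\rangle.
\end{aligned} \tag{4.2}
$$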

1 J. Heyvaerts, MNRAS, 407, 355 (2010)


2 P.-H. Chavanis, Physica A, 391, 3680 (2012).

Hence the equation for the evolution of the mean-field model is

$$
\frac{\partial f_0}{\partial t} = -\frac{\partial}{\partial\mathbf{J}}\cdot\mathbf{F}, \tag{4.3a}
$$

where the flux of stars in action space3 is


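$$
\mathbf{F} = i\bigg\langle\sum_{\mathbf{n}}\mathbf{n}\,\hat f_1(\mathbf{n},\mathbf{J},t)\,\hat\Phi_1(-\mathbf{n},\mathbf{J},t)\bigg\rangle. \tag{4.3b}
$$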

The divergence on the right of (4.3a) guarantees conservation of stars.
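
The conservation statement rests on a telescoping sum, which a one-dimensional finite-volume sketch makes concrete. The grid, the function names and the sample flux values below are illustrative assumptions, not part of the formalism above.

```python
def conservative_step(f, flux_at_edges, dJ, dt):
    """Advance cell averages f by one explicit step of df/dt = -dF/dJ.

    flux_at_edges has len(f) + 1 entries: the flux through each cell boundary.
    """
    return [
        f_i - dt * (flux_at_edges[i + 1] - flux_at_edges[i]) / dJ
        for i, f_i in enumerate(f)
    ]


def total_stars(f, dJ):
    """Number of stars represented by the cell averages f."""
    return sum(f) * dJ
```

Because each interior boundary flux is subtracted from one cell and added to its neighbour, `total_stars` changes only through the fluxes at the two outer boundaries, whatever values the interior fluxes take.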


Rewriting (4.3b) in terms of Laplace transforms, it becomes
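$$
\mathbf{F}(\mathbf{J}) = i\bigg\langle\sum_{\mathbf{n}}\mathbf{n}\int\frac{dp}{2\pi i}\,e^{pt}\,\tilde f_1(\mathbf{n},\mathbf{J},p)\int\frac{dp'}{2\pi i}\,e^{p't}\,\tilde\Phi_1(-\mathbf{n},\mathbf{J},p')\bigg\rangle. \tag{4.4}
$$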

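Now we use equation (3.12) to eliminate $\tilde f_1$:
$$
\mathbf{F}(\mathbf{J}) = i\bigg\langle\sum_{\mathbf{n}}\mathbf{n}\int\frac{dp}{2\pi i}\,e^{pt}\,\frac{i\mathbf{n}\cdot\frac{\partial f_0}{\partial\mathbf{J}}\,\tilde\Phi_1(\mathbf{n},\mathbf{J},p) + \hat f_1(\mathbf{n},\mathbf{J},0)}{p + i\mathbf{n}\cdot\boldsymbol{\Omega}}\int\frac{dp'}{2\pi i}\,e^{p't}\,\tilde\Phi_1(-\mathbf{n},\mathbf{J},p')\bigg\rangle. \tag{4.5}
$$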

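This expression for the diffusive flux is made up of a part that’s proportional to $\langle\tilde\Phi_1(\mathbf{n})\tilde\Phi_1(-\mathbf{n})\rangle$
that will be non-vanishing regardless of the physical cause of fluctuations in the potential, and
a part $\langle\hat f_1(\mathbf{n})\tilde\Phi_1(-\mathbf{n})\rangle$ that will be non-vanishing only to the extent that the fluctuations in Φ
are generated by the fluctuations in f . Moreover, the first term is proportional to the gradient
of f0 (J) while the second is not. These distinctions will prove important (e.g., Problem 10), so
we explicitly break F = F1 + F2 into two parts,
$$
\begin{aligned}
\mathbf{F}_1 &= i\sum_{\mathbf{n}}\mathbf{n}\int\frac{dp}{2\pi i}\,e^{pt}\int\frac{dp'}{2\pi i}\,e^{p't}\,\frac{\big\langle\hat f_1(\mathbf{n},\mathbf{J},0)\,\tilde\Phi_1(-\mathbf{n},\mathbf{J},p')\big\rangle}{p + i\mathbf{n}\cdot\boldsymbol{\Omega}} \\
\mathbf{F}_2 &= -\sum_{\mathbf{n}}\mathbf{n}\,\mathbf{n}\cdot\frac{\partial f_0}{\partial\mathbf{J}}\int\frac{dp}{2\pi i}\,e^{pt}\int\frac{dp'}{2\pi i}\,e^{p't}\,\frac{\big\langle\tilde\Phi_1(\mathbf{n},\mathbf{J},p)\,\tilde\Phi_1(-\mathbf{n},\mathbf{J},p')\big\rangle}{p + i\mathbf{n}\cdot\boldsymbol{\Omega}}.
\end{aligned} \tag{4.6}
$$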

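Using (3.17a) to eliminate $\tilde\Phi_1$, these fluxes become
$$
\begin{aligned}
\mathbf{F}_1(\mathbf{J}) &\equiv -(2\pi)^3 i\sum_{\mathbf{n}}\mathbf{n}\,\bigg\langle\int\frac{dp}{2\pi i}\,e^{pt}\,\frac{\hat f_1(\mathbf{n},\mathbf{J},0)}{p + i\mathbf{n}\cdot\boldsymbol{\Omega}}\int\frac{dp'}{2\pi i}\,e^{p't}\int d^3\mathbf{J}'\sum_{\mathbf{n}'}E_{-\mathbf{n}\mathbf{n}'}(\mathbf{J},\mathbf{J}',p')\,\frac{\hat f_1(\mathbf{n}',\mathbf{J}',0)}{p' + i\mathbf{n}'\cdot\boldsymbol{\Omega}'}\bigg\rangle \\
\mathbf{F}_2(\mathbf{J}) &\equiv (2\pi)^6 i\sum_{\mathbf{n}}\mathbf{n}\,\bigg\langle\int\frac{dp}{2\pi i}\,e^{pt}\,\frac{i\mathbf{n}\cdot\frac{\partial f_0}{\partial\mathbf{J}}}{p + i\mathbf{n}\cdot\boldsymbol{\Omega}}\int d^3\mathbf{J}'\sum_{\mathbf{n}'}E_{\mathbf{n}\mathbf{n}'}(\mathbf{J},\mathbf{J}',p)\,\frac{\hat f_1(\mathbf{n}',\mathbf{J}',0)}{p + i\mathbf{n}'\cdot\boldsymbol{\Omega}'} \\
&\qquad\times\int\frac{dp'}{2\pi i}\,e^{p't}\int d^3\mathbf{J}''\sum_{\mathbf{n}''}E_{-\mathbf{n}\mathbf{n}''}(\mathbf{J},\mathbf{J}'',p')\,\frac{\hat f_1(\mathbf{n}'',\mathbf{J}'',0)}{p' + i\mathbf{n}''\cdot\boldsymbol{\Omega}''}\bigg\rangle.
\end{aligned} \tag{4.7}
$$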
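The expectation-value brackets $\langle\cdot\rangle$ imply that we require the expectation $\langle\hat f_1(\mathbf{n},\mathbf{J},0)\hat f_1(\mathbf{n}',\mathbf{J}',0)\rangle$
of the initial conditions. In Box 4.1 we show that
$$
\big\langle\hat f_1(\mathbf{n},\mathbf{J},0)\,\hat f_1(\mathbf{n}',\mathbf{J}',0)\big\rangle = \frac{1}{(2\pi)^3}\,\delta_{\mathbf{n},-\mathbf{n}'}\,\delta(\mathbf{J}-\mathbf{J}')\,m f_0(\mathbf{J}). \tag{4.8}
$$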

3 Strictly, the density of stars in action space is $(2\pi)^3 f_0(\mathbf{J})$ and the action-space flux is $(2\pi)^3\mathbf{F}(\mathbf{J})$ rather than
F(J), but in heuristic discussions it’s convenient to ignore the factor $(2\pi)^3$.

Box 4.1: Expectation value of the initial conditions


We require $\langle\hat f_1(\mathbf{n},\mathbf{J},0)\hat f_1(\mathbf{n}',\mathbf{J}',0)\rangle$. We drop the time slot for brevity and recall that f1 is
the difference between the actual df and the mean-field model, which has df f0 (J). The
actual df is a sum of one delta-function for each particle:
$$
f(\boldsymbol{\theta},\mathbf{J}) = m\sum_i\delta(\boldsymbol{\theta}-\boldsymbol{\theta}_i)\,\delta(\mathbf{J}-\mathbf{J}_i).
$$

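Thus bearing in mind that $\langle f(\boldsymbol{\theta},\mathbf{J})\rangle = f_0(\mathbf{J})$,
$$
\begin{aligned}
\big\langle f_1(\boldsymbol{\theta},\mathbf{J})f_1(\boldsymbol{\theta}',\mathbf{J}')\big\rangle &= \big\langle[f(\boldsymbol{\theta},\mathbf{J}) - f_0(\mathbf{J})][f(\boldsymbol{\theta}',\mathbf{J}') - f_0(\mathbf{J}')]\big\rangle \\
&= \big\langle f(\boldsymbol{\theta},\mathbf{J})f(\boldsymbol{\theta}',\mathbf{J}')\big\rangle - f_0(\mathbf{J})f_0(\mathbf{J}') \\
&= m^2\bigg\langle\sum_{ij}\delta(\boldsymbol{\theta}-\boldsymbol{\theta}_i)\delta(\mathbf{J}-\mathbf{J}_i)\delta(\boldsymbol{\theta}'-\boldsymbol{\theta}_j)\delta(\mathbf{J}'-\mathbf{J}_j)\bigg\rangle - f_0(\mathbf{J})f_0(\mathbf{J}').
\end{aligned}
$$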

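Now
$$
\begin{aligned}
\bigg\langle\sum_{ij}\delta(\boldsymbol{\theta}-\boldsymbol{\theta}_i)\delta(\mathbf{J}-\mathbf{J}_i)\delta(\boldsymbol{\theta}'-\boldsymbol{\theta}_j)\delta(\mathbf{J}'-\mathbf{J}_j)\bigg\rangle &= \bigg\langle\sum_{i\neq j}\delta(\boldsymbol{\theta}-\boldsymbol{\theta}_i)\delta(\mathbf{J}-\mathbf{J}_i)\delta(\boldsymbol{\theta}'-\boldsymbol{\theta}_j)\delta(\mathbf{J}'-\mathbf{J}_j)\bigg\rangle \\
&\quad + \bigg\langle\sum_i\delta(\boldsymbol{\theta}-\boldsymbol{\theta}_i)\delta(\mathbf{J}-\mathbf{J}_i)\delta(\boldsymbol{\theta}'-\boldsymbol{\theta})\delta(\mathbf{J}'-\mathbf{J})\bigg\rangle \\
&= m^{-2}f_0(\mathbf{J})f_0(\mathbf{J}') + m^{-1}f_0(\mathbf{J})\,\delta(\boldsymbol{\theta}-\boldsymbol{\theta}')\,\delta(\mathbf{J}-\mathbf{J}'),
\end{aligned}
$$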
where we have assumed that the particles are uniformly distributed in θ and uncorrelated
(so the expectation value of products of delta-functions associated with different particles is
the product of the expectation values of the individual terms). When the last equation is
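used in the previous equation, we obtain
$$
\big\langle f_1(\boldsymbol{\theta},\mathbf{J})f_1(\boldsymbol{\theta}',\mathbf{J}')\big\rangle = m f_0(\mathbf{J})\,\delta(\boldsymbol{\theta}-\boldsymbol{\theta}')\,\delta(\mathbf{J}-\mathbf{J}'),
$$
which simply states that particles are only correlated with themselves. Finally Fourier
transforming
$$
\begin{aligned}
\big\langle\hat f_1(\mathbf{n},\mathbf{J})\hat f_1(\mathbf{n}',\mathbf{J}')\big\rangle &= m f_0(\mathbf{J})\,\delta(\mathbf{J}-\mathbf{J}')\int\frac{d^3\boldsymbol{\theta}}{(2\pi)^3}\int\frac{d^3\boldsymbol{\theta}'}{(2\pi)^3}\,e^{-i(\mathbf{n}\cdot\boldsymbol{\theta}+\mathbf{n}'\cdot\boldsymbol{\theta}')}\,\delta(\boldsymbol{\theta}-\boldsymbol{\theta}') \\
&= (2\pi)^{-3}\,m f_0(\mathbf{J})\,\delta(\mathbf{J}-\mathbf{J}')\,\delta_{\mathbf{n},-\mathbf{n}'}.
\end{aligned}
$$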
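
The Kronecker delta in this result can be illustrated with a small Monte Carlo experiment in a single angle. The sketch below is a hedged illustration under the same assumptions as the box (uniform, uncorrelated angles); the function names, particle number, trial count and seed are hypothetical choices. It averages the product of Fourier sums $S_n = \sum_i e^{-in\theta_i}$, which equal $\hat f(n)$ up to the factor m/2π.

```python
import cmath
import math
import random


def fourier_sum(angles, n):
    """S_n = sum_i exp(-i n theta_i) for a set of particle angles."""
    return sum(cmath.exp(-1j * n * theta) for theta in angles)


def mean_product(n, n_prime, n_particles=200, n_trials=1000, seed=1):
    """Ensemble average of S_n * S_n' / N over independent realisations
    with uniformly random, uncorrelated angles."""
    rng = random.Random(seed)
    total = 0j
    for _ in range(n_trials):
        angles = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_particles)]
        total += fourier_sum(angles, n) * fourier_sum(angles, n_prime)
    return total / (n_trials * n_particles)
```

For `n_prime = -n` the average should sit near 1 (each particle correlates only with itself), while for any other nonzero pair it should scatter around 0 at the level set by the number of trials.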

Inserting this and using the δ-function to carry out the integral over J′ in the equation for F1
and over J′′ in the equation for F2 , we get
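$$
\begin{aligned}
\mathbf{F}_1(\mathbf{J}) &= -im\sum_{\mathbf{n}}\mathbf{n}\int\frac{dp}{2\pi i}\,e^{pt}\,\frac{1}{p + i\mathbf{n}\cdot\boldsymbol{\Omega}}\int\frac{dp'}{2\pi i}\,e^{p't}\,E_{-\mathbf{n}-\mathbf{n}}(\mathbf{J},\mathbf{J},p')\,\frac{f_0(\mathbf{J})}{p' - i\mathbf{n}\cdot\boldsymbol{\Omega}} \\
\mathbf{F}_2(\mathbf{J}) &= -(2\pi)^3 m\sum_{\mathbf{n}}\mathbf{n}\int\frac{dp}{2\pi i}\,e^{pt}\,\frac{\mathbf{n}\cdot\frac{\partial f_0}{\partial\mathbf{J}}}{p + i\mathbf{n}\cdot\boldsymbol{\Omega}}\int d^3\mathbf{J}'\sum_{\mathbf{n}'}E_{\mathbf{n}\mathbf{n}'}(\mathbf{J},\mathbf{J}',p)\,\frac{1}{p + i\mathbf{n}'\cdot\boldsymbol{\Omega}'} \\
&\qquad\times\int\frac{dp'}{2\pi i}\,e^{p't}\,E_{-\mathbf{n}-\mathbf{n}'}(\mathbf{J},\mathbf{J}',p')\,\frac{f_0(\mathbf{J}')}{p' - i\mathbf{n}'\cdot\boldsymbol{\Omega}'}.
\end{aligned} \tag{4.9}
$$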
The expression for F1 is easy to simplify further because E−n−n won’t contribute a pole at
ℜ(p′ ) ≥ 0: if it had such a pole, the underlying model would be unstable (Box 3.1), and we are
interested in the case when it’s stable. So the only singularity we need consider is the obvious
one when p′ = in · Ω. Similarly, the integration over p follows immediately from the pole at
p = −in · Ω. So we have
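$$
\mathbf{F}_1(\mathbf{J}) = -im\sum_{\mathbf{n}}\mathbf{n}\,E_{-\mathbf{n}-\mathbf{n}}(\mathbf{J},\mathbf{J},i\mathbf{n}\cdot\boldsymbol{\Omega})\,f_0(\mathbf{J}). \tag{4.10}
$$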

Notice that the time dependencies introduced by the two inverse Laplace transforms have can-
celled, so the flux F1 is constant.
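
The cancellation can be made explicit. As a sketch, assume t > 0 so that each inversion contour can be closed in the left half-plane, and keep only the two simple poles named above (any damped poles of E would add terms that decay in time). The residue theorem then gives

$$
\int\frac{dp}{2\pi i}\,\frac{e^{pt}}{p + i\mathbf{n}\cdot\boldsymbol{\Omega}} = e^{-i\mathbf{n}\cdot\boldsymbol{\Omega}t},
\qquad
\int\frac{dp'}{2\pi i}\,e^{p't}\,\frac{E_{-\mathbf{n}-\mathbf{n}}(\mathbf{J},\mathbf{J},p')}{p' - i\mathbf{n}\cdot\boldsymbol{\Omega}} = e^{i\mathbf{n}\cdot\boldsymbol{\Omega}t}\,E_{-\mathbf{n}-\mathbf{n}}(\mathbf{J},\mathbf{J},i\mathbf{n}\cdot\boldsymbol{\Omega}),
$$

and the product of the two phase factors is unity, which leaves the time-independent summand of (4.10).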

Now we turn to F2 . The integral over p′ is straightforward because the integrand has only
the obvious pole at p′ = in′ · Ω′ . After doing the p′ integral we have

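$$
\begin{aligned}
\mathbf{F}_2(\mathbf{J}) &= -(2\pi)^3 m\sum_{\mathbf{n}}\mathbf{n}\int\frac{dp}{2\pi i}\,e^{pt}\,\frac{\mathbf{n}\cdot\frac{\partial f_0}{\partial\mathbf{J}}}{p + i\mathbf{n}\cdot\boldsymbol{\Omega}} \\
&\qquad\times\int d^3\mathbf{J}'\sum_{\mathbf{n}'}e^{i\mathbf{n}'\cdot\boldsymbol{\Omega}'t}\,E_{\mathbf{n}\mathbf{n}'}(\mathbf{J},\mathbf{J}',p)\,E_{-\mathbf{n}-\mathbf{n}'}(\mathbf{J},\mathbf{J}',i\mathbf{n}'\cdot\boldsymbol{\Omega}')\,\frac{f_0(\mathbf{J}')}{p + i\mathbf{n}'\cdot\boldsymbol{\Omega}'}.
\end{aligned} \tag{4.11}
$$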

Now we perform the integral over J′ using the Landau prescription (Box 3.2) to handle the pole
at in′ · Ω′ = −p:

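$$
\begin{aligned}
\mathbf{F}_2(\mathbf{J}) &= -(2\pi)^3 m\sum_{\mathbf{n}}\mathbf{n}\int\frac{dp}{2\pi i}\,e^{pt}\,\frac{\mathbf{n}\cdot\frac{\partial f_0}{\partial\mathbf{J}}}{p + i\mathbf{n}\cdot\boldsymbol{\Omega}}\bigg\{-iP + \pi\int d^3\mathbf{J}'\sum_{\mathbf{n}'}e^{-pt}\,\delta(i\mathbf{n}'\cdot\boldsymbol{\Omega}' - ip) \\
&\qquad\times E_{\mathbf{n}\mathbf{n}'}(\mathbf{J},\mathbf{J}',p)\,E_{-\mathbf{n}-\mathbf{n}'}(\mathbf{J},\mathbf{J}',-p)\,f_0(\mathbf{J}')\bigg\},
\end{aligned} \tag{4.12}
$$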

where P is the (real) principal part of the integral. It is now straightforward to execute the
integral over p because the integrand has just the simple pole at p = −in · Ω. After integration
over p we have

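$$
\begin{aligned}
\mathbf{F}_2(\mathbf{J}) &= -(2\pi)^3 m\sum_{\mathbf{n}}\mathbf{n}\,e^{-i\mathbf{n}\cdot\boldsymbol{\Omega}t}\,\mathbf{n}\cdot\frac{\partial f_0}{\partial\mathbf{J}}\bigg\{-iP + \pi\int d^3\mathbf{J}'\sum_{\mathbf{n}'}e^{i\mathbf{n}\cdot\boldsymbol{\Omega}t}\,\delta(\mathbf{n}'\cdot\boldsymbol{\Omega}' - \mathbf{n}\cdot\boldsymbol{\Omega}) \\
&\qquad\times E_{\mathbf{n}\mathbf{n}'}(\mathbf{J},\mathbf{J}',-i\mathbf{n}\cdot\boldsymbol{\Omega})\,E_{-\mathbf{n}-\mathbf{n}'}(\mathbf{J},\mathbf{J}',i\mathbf{n}\cdot\boldsymbol{\Omega})\,f_0(\mathbf{J}')\bigg\}.
\end{aligned} \tag{4.13}
$$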

We now argue that since F2 is real, the contribution from the principal part, P , must vanish,
and we have finally
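$$
\begin{aligned}
\mathbf{F}_2(\mathbf{J}) &= -\tfrac12(2\pi)^4 m\sum_{\mathbf{n}}\mathbf{n}\left(\mathbf{n}\cdot\frac{\partial f_0}{\partial\mathbf{J}}\right)\int d^3\mathbf{J}'\sum_{\mathbf{n}'}\delta(\mathbf{n}'\cdot\boldsymbol{\Omega}' - \mathbf{n}\cdot\boldsymbol{\Omega}) \\
&\qquad\times E_{\mathbf{n}\mathbf{n}'}(\mathbf{J},\mathbf{J}',-i\mathbf{n}\cdot\boldsymbol{\Omega})\,E_{-\mathbf{n}-\mathbf{n}'}(\mathbf{J},\mathbf{J}',i\mathbf{n}\cdot\boldsymbol{\Omega})\,f_0(\mathbf{J}').
\end{aligned} \tag{4.14}
$$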

Notice that the time dependence has disappeared from F2 as it did from F1 .
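At this point we assume that we are working with real basis functions $\Phi^{(\alpha)}$, for then by the
bottom-right equation of (3.7), $[\hat\Phi^{(\alpha)}(\mathbf{n},\mathbf{J})]^* = \hat\Phi^{(\alpha)}(-\mathbf{n},\mathbf{J})$. Also $[\epsilon(p)]^* = \epsilon(p^*)$ (Problem 8).
Consequently, from (3.17b)
$$
\begin{aligned}
[E_{\mathbf{n}\mathbf{n}'}(\mathbf{J},\mathbf{J}',-i\mathbf{n}\cdot\boldsymbol{\Omega})]^* &= \frac{1}{\mathcal{E}}\sum_{\alpha\alpha'}[\hat\Phi^{(\alpha)}(\mathbf{n},\mathbf{J})]^*\,[\epsilon^{-1}_{\alpha\alpha'}(-i\mathbf{n}\cdot\boldsymbol{\Omega})]^*\,\hat\Phi^{(\alpha')}(\mathbf{n}',\mathbf{J}') \\
&= \frac{1}{\mathcal{E}}\sum_{\alpha\alpha'}\hat\Phi^{(\alpha)}(-\mathbf{n},\mathbf{J})\,\epsilon^{-1}_{\alpha\alpha'}(i\mathbf{n}\cdot\boldsymbol{\Omega})\,[\hat\Phi^{(\alpha')}(-\mathbf{n}',\mathbf{J}')]^* \\
&= E_{-\mathbf{n}-\mathbf{n}'}(\mathbf{J},\mathbf{J}',i\mathbf{n}\cdot\boldsymbol{\Omega}).
\end{aligned} \tag{4.15}
$$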

Consequently, our expression (4.14) can be simplified to


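$$
\mathbf{F}_2(\mathbf{J}) = -\tfrac12(2\pi)^4 m\sum_{\mathbf{n}\mathbf{n}'}\mathbf{n}\left(\mathbf{n}\cdot\frac{\partial f_0}{\partial\mathbf{J}}\right)\int d^3\mathbf{J}'\,\big|E_{\mathbf{n}\mathbf{n}'}(\mathbf{J},\mathbf{J}',-i\mathbf{n}\cdot\boldsymbol{\Omega})\big|^2 f_0(\mathbf{J}')\,\delta(\mathbf{n}'\cdot\boldsymbol{\Omega}' - \mathbf{n}\cdot\boldsymbol{\Omega}). \tag{4.16}
$$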
This completes our computation of the diffusive flux in action space that’s engendered by
Poisson fluctuations in the density:

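$$
\begin{aligned}
\mathbf{F}(\mathbf{J}) &= \mathbf{F}_1(\mathbf{J}) + \mathbf{F}_2(\mathbf{J}) \\
&= -\mathbf{D}_1(\mathbf{J})\,f_0 - \mathbf{D}_2(\mathbf{J})\cdot\frac{\partial f_0}{\partial\mathbf{J}},
\end{aligned} \tag{4.17}
$$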

where D1 is the (vector) drag coefficient and D2 is the (tensor) diffusion coefficient:
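$$
\begin{aligned}
\mathbf{D}_1(\mathbf{J}) &= im\sum_{\mathbf{n}}E_{-\mathbf{n}-\mathbf{n}}(\mathbf{J},\mathbf{J},i\mathbf{n}\cdot\boldsymbol{\Omega})\,\mathbf{n} \\
\mathbf{D}_2(\mathbf{J}) &= \tfrac12(2\pi)^4 m\sum_{\mathbf{n}\mathbf{n}'}\int d^3\mathbf{J}'\,\big|E_{\mathbf{n}\mathbf{n}'}(\mathbf{J},\mathbf{J}',-i\mathbf{n}\cdot\boldsymbol{\Omega})\big|^2 f_0(\mathbf{J}')\,\delta(\mathbf{n}'\cdot\boldsymbol{\Omega}' - \mathbf{n}\cdot\boldsymbol{\Omega})\,\mathbf{n}\otimes\mathbf{n}.
\end{aligned} \tag{4.18}
$$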

Notice that the sign of D2 is positive, so the flux that it generates is in the opposite direction
to the gradient of f0 : stars diffuse away from regions of high phase-space density. Whereas the
flux of heat in a metal bar, q = −κ∇T , is simply proportional to the gradient of the
heat-density T , our diffusive flux has, in addition to a term that’s proportional to the gradient of
the star density, a term that’s proportional to the density itself. To understand the necessity
of this additional term, consider how the system would evolve if it were absent. Then stars
would diffuse from modest initial actions to ever higher actions, so eventually the density of
stars would become uniform throughout phase space, just as heat diffusion will eventually make
the temperature uniform throughout a bar. However, energy conservation, which is encoded in
the dynamics we have been using, excludes a uniform distribution of stars in action space, since
larger actions are associated with more energy. Consequently, the tendency of the term in F
proportional to ∂f0 /∂J to drive the system to uniformity in action space has to be counteracted
by the term proportional to f0 , which generates a net drift towards the origin of action space.
In thermal equilibrium, F must vanish by detailed balance. Then the df f0 = exp(−βH),
where H is the Hamiltonian and $\beta = (k_B T)^{-1}$ is the inverse temperature. Since ∂H/∂J = Ω,
for F to vanish the diffusion coefficients (which depend on f0 ) must satisfy

D1 (J) − βD2 (J) · Ω(J) = 0 (4.19)

everywhere in action space. This relation provides a useful check on any formulae for the diffusion
coefficients (Problem 9). It also suggests that whatever the origin of the fluctuations that drive
diffusion (here Poisson fluctuations), D1 and D2 will be closely related to one another. In fact,
from our expression for D1 one can derive (Appendix A)
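$$
\mathbf{D}_1(\mathbf{J}) = -\tfrac12(2\pi)^4 m\sum_{\mathbf{n}\mathbf{n}'}\int d^3\mathbf{J}'\,\big|E_{\mathbf{n}\mathbf{n}'}(\mathbf{J},\mathbf{J}',-i\mathbf{n}\cdot\boldsymbol{\Omega})\big|^2\,\mathbf{n}'\cdot\frac{\partial f_0}{\partial\mathbf{J}'}\,\delta(\mathbf{n}'\cdot\boldsymbol{\Omega}' - \mathbf{n}\cdot\boldsymbol{\Omega})\,\mathbf{n}, \tag{4.20}
$$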

which is extremely similar to our expression for D2 .
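
The role of the detailed-balance condition (4.19) can be seen in a one-dimensional toy model. The sketch below assumes a harmonic-oscillator Hamiltonian H = ΩJ with constant Ω and a constant scalar D2; these simplifications and all function names are illustrative assumptions, not part of the derivation. It evaluates the flux of (4.17) for f0 = exp(−βH) when the drag is fixed by D1 = βD2Ω.

```python
import math


def flux(f, dfdJ, D1, D2):
    """One-dimensional analogue of (4.17): F = -D1 f0 - D2 df0/dJ."""
    return -D1 * f - D2 * dfdJ


def boltzmann_flux(J, beta, Omega, D2, h=1e-5):
    """Flux for f0 = exp(-beta * Omega * J) when D1 = beta * D2 * Omega."""
    def f0(x):
        return math.exp(-beta * Omega * x)

    # Centred finite difference for df0/dJ.
    dfdJ = (f0(J + h) - f0(J - h)) / (2.0 * h)
    D1 = beta * D2 * Omega
    return flux(f0(J), dfdJ, D1, D2)
```

With the drag term switched off (D1 = 0) the same df would give a flux directed towards larger actions, which is the runaway diffusion described above.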


Equation (4.20) for D1 and our equation (4.18) for D2 give the Di (J) as sums of contributions
from stars at any point J′ at which stars “resonate” with stars at J – two stars resonate in the
sense that the n′ harmonic of one star coincides with the n harmonic of the other. D1 and D2
are proportional to the values taken by ∂f0 /∂J′ and f0 (J′ ), respectively, because the strength of
the oscillating field that’s created by the stars at J′ is proportional to the number of stars at J′ .
On account of the vector n that occurs in D1 and the dyadic n ⊗ n in D2 , the diffusion tensor is
highly anisotropic in the sense that stars diffuse anomalously fast in the direction n that yields
the largest number of resonant stars.
5
Diffusion in a galactic disc

The formalism developed in the last section gives fascinating insight into the dynamics of galactic
discs similar to that in which we reside. These systems were among the first to be studied by
N-body simulation when electronic computers became widely available, but it is only recently
that we have achieved a reasonable understanding of their dynamics.
Fouvry et al. (arXiv:1507.06887) have applied the formalism of Chapter 4 to razor-thin discs:
restricting motion to the xy plane significantly simplifies the computations. First, angle-action
coordinates are readily constructed for an axisymmetric disc (Problem 3). Second, Kalnajs (1976)
has defined a convenient set of orthonormal potential-density pairs

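$$
\Phi^{(\alpha)}(r,\phi) = e^{il\phi}\,\Phi_{ln}(r), \qquad \rho^{(\alpha)}(r,\phi) = e^{il\phi}\,\rho_{ln}(r), \tag{5.1}
$$
where α = (l, n). $\Phi_{ln}$ is a specified polynomial and $\rho_{ln}$ is a polynomial in r times a half power of
$1 - r^2/r_0^2$, where $r_0$ is the edge of the disc.
Next they compute the AA representation of their basis potentials:
$$
\hat\Phi^{(\alpha)}(\mathbf{n},\mathbf{J}) = \delta_{\alpha_2,n_2}\,\frac{1}{\pi}\int_{r_p}^{r_a}dr\,\Phi_{ln}(r)\cos[n_1\theta_1 + n_2(\theta_2 - \phi)].
$$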

They considered a disc that is confined by a potential that generates a circular speed
$v_c = (R\,\partial\Phi/\partial R)^{1/2}$ that is everywhere constant. If the disc generated this potential on its own, its
surface density Σ(R) would be proportional to $R^{-1}$. It is more realistic (and numerically more
convenient) to assume that Φ is generated by three components: (i) a bulge that dominates the
mass density near the origin, (ii) a dark halo that dominates the mass density far from the centre,
and (iii) the disc, which contributes ∼ 0.5 of the radial force at intermediate radii. One says that
a “Mestel” disc with $\Sigma(R) \propto R^{-1}$ has been “tapered” at small and large radii to accommodate
the bulge and the dark halo. The unperturbed df is
$$
f_0(E, J_\phi) = \xi\,C\,J_\phi^{\,q}\,e^{-E/\sigma_r^2}\,T_{\rm in}(J_\phi)\,T_{\rm out}(J_\phi), \tag{5.2a}
$$
where $E = \tfrac12(v_R^2 + v_\phi^2) + \Phi$, C normalises the df such that with $\xi = T_{\rm in} = T_{\rm out} = 1$ the disc
generates the entire potential, $\sigma_r$ is a parameter that controls the magnitude of stars’ random
motions, and
$$
q = (v_c/\sigma_r)^2 - 1 \tag{5.2b}
$$
was taken to have the value 11.4. Finally the taper functions are
$$
T_{\rm in}(J_\phi) = \frac{J_\phi^4}{(R_{\rm in}v_c)^4 + J_\phi^4}, \qquad T_{\rm out}(J_\phi) = \frac{(R_{\rm out}v_c)^5}{(R_{\rm out}v_c)^5 + |J_\phi|^5}, \tag{5.2c}
$$

where Rout = 11.5Rin. By increasing ξ between zero and unity, the dynamical importance of
the disc’s self-gravity can be increased from unimportant to dominant. For ξ ≃ 0.5 this disc is
known to be stable in the sense (Box 3.1) that all its normal modes are damped (Toomre 1981).
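
The ingredients of (5.2b) and (5.2c) are simple enough to write down directly. The following is a minimal sketch; the function names and the choice of units (for instance R_in = v_c = 1) are assumptions made here for illustration.

```python
def taper_in(J_phi, R_in, v_c):
    """Inner taper of (5.2c): suppresses stars with small angular momentum."""
    return J_phi**4 / ((R_in * v_c) ** 4 + J_phi**4)


def taper_out(J_phi, R_out, v_c):
    """Outer taper of (5.2c): suppresses stars with large angular momentum."""
    scale = (R_out * v_c) ** 5
    return scale / (scale + abs(J_phi) ** 5)


def sigma_over_vc(q):
    """Invert (5.2b), q = (v_c / sigma_r)^2 - 1, for the ratio sigma_r / v_c."""
    return (q + 1.0) ** -0.5
```

Each taper equals one half at its own scale (J_phi = R_in v_c or J_phi = R_out v_c), and the adopted q = 11.4 corresponds to a radial velocity dispersion of roughly 0.28 v_c, i.e. a fairly cool disc.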
In Figure 5.1 arrows show the diffusive flux F computed from equation (A.7). We see that
F is small except along a ridge that slopes leftwards up from the Jφ axis (which is where the
stars of a cool disc are strongly concentrated). The narrowness of this ridge is emphasised by
the left panel of Figure 5.2, which shows div F. We see that div F is negligible except along a
narrow ridge, where it is positive lower down and negative higher up, indicating that stars are
diffusing from near circular orbits to more eccentric orbits with slightly less angular momentum.

Figure 5.1 The diffusive flux of stars in action space for a tapered Mestel disc with active mass fraction ξ = 0.5
(From Fouvry et al 2015).

Figure 5.2 Left panel: a plot of div F, computed from equation (A.7), with blue indicating negative and red
positive values. Right panel: a corresponding plot of the increment (blue) or decrement (red) in the df during
an N-body simulation of the same disc by Sellwood (2012).

The ridge of non-negligible F nearly coincides with a line
2Ωφ − Ωr = constant ≡ 2ωp . (5.3)
Stars on this line are said to be at the inner Lindblad resonance of a perturbation. This
perturbation is a bi-symmetric structure that rotates in the same sense as the stars in the disc
with angular velocity ωp . So Figure 5.1 indicates that the dynamics of the disc are dominated
by a coherent structure. Given that this is a stable disc, how can this be?
The answer is that although all the disc’s normal modes are damped, some are only weakly
damped. Consequently, for some values of p the matrix E that occurs squared in equation (A.7)
becomes large. Consequently F becomes large at points in action space at which in · Ω coincides
(for some n) with such a value of p.
Actually, to get a large value of F it is not sufficient to have a large value of $|E|^2$; the product
$(\partial f_0/\partial\mathbf{J})\,f_0(\mathbf{J}')$ should also be large. In short, the requirements for obtaining a large flux are
quite specific and it is perhaps not surprising that they are satisfied only along a ridge in action
space.
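
To get a feel for where the resonance line (5.3) lies, one can evaluate it for nearly circular orbits. The sketch below assumes the epicycle approximation in a flat rotation curve, for which Ωφ = v_c/R and Ωr = √2 v_c/R; this restriction to near-circular orbits, and the function names, are assumptions of the example rather than part of the calculation described above.

```python
import math


def frequencies_flat_curve(R, v_c):
    """Azimuthal and radial (epicycle) frequencies of a near-circular orbit
    at radius R where the circular speed v_c is constant."""
    omega_phi = v_c / R
    omega_r = math.sqrt(2.0) * omega_phi
    return omega_phi, omega_r


def ilr_radius(pattern_speed, v_c):
    """Radius at which 2*Omega_phi - Omega_r = 2*omega_p, eq. (5.3),
    for near-circular orbits."""
    return (1.0 - 1.0 / math.sqrt(2.0)) * v_c / pattern_speed
```

In these units the inner Lindblad resonance of a pattern with speed ωp sits at about 0.29 v_c/ωp, well inside the corotation radius v_c/ωp.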

Equation (4.3a) states that f0 will increase where div F < 0 and decrease where div F > 0.
The right panel of Figure 5.2 shows the change over some time interval in the df of an N-body
simulation of the disc. We see that the blue region of increase and the red region of decrease
broadly coincide with the regions of negative and positive div F.
In a series of carefully controlled N-body simulations, Fouvry et al (2015) checked the
predictions of Chapter 4 regarding how the magnitude of F scales with particle number N and
with active mass fraction ξ (which determines the magnitude of $|E|^2$ and thus F). The flux is
predicted to be proportional to m, the mass of a single particle, so it should scale as 1/N . The
experiments indicate that the changes in f0 induced by the flux follow this scaling within the
errors.
From equation (A.7) one can deduce that increasing ξ from 0.5 to 0.6 magnifies F by a factor
42 while in the N-body experiments this change in ξ increases the change in f0 in a given period
by a factor 29. Given the significant uncertainties in the numerical work, this comparison again
amounts to agreement within the errors.
The work of Fouvry et al leaves no doubt that equation (A.7) correctly predicts the action-
space flux that Poisson noise drives in a stable but responsive disc, and that until this flux has
substantially modified f0 , Poisson noise is the only driver of evolution. The evolution occurs 1000
to 10 000 times faster than naive estimates of two-particle relaxation predict because the noise
excites collective motions, which are then amplified by a process called swing amplification.
Specifically, noise excites leading spiral waves of density that propagate towards the corotation
radius (where Ωφ = ωp ). These waves are gradually unwound by shear in the disc: the angular
velocity of star streaming increases inwards, so any structure in the disc is constantly being
sheared towards a tightly-wound trailing structure. As a leading-arm spiral is sheared into a
trailing-arm structure, self-gravity abruptly amplifies it. The resulting larger-amplitude trailing
wave then propagates back inwards. It is Landau damped when it reaches stars that resonate with
it. The strength of the swing amplifier determines the magnitude of E, and thus the magnitude
of the diffusive flux. Increasing the active mass fraction ξ strengthens the swing amplifier and
thus increases F.
Fouvry et al do not compute the evolution of f0 by integrating equation (4.3a) because
computing F at a single time is already a major task. But the N-body models give insight into
what we would find if we did integrate (4.3a), and it’s extraordinary. The ridge of enhanced div F
in Figure 5.2 would create a ridge of enhanced f0 , and this ridge would make the disc unstable
at a collisionless level. That is, Poisson noise drives a disc that is stable towards one that is
unstable. Indeed all simulations of initially stable discs that have been integrated for sufficiently
long have developed O(1) non-axisymmetries and degenerated into a strong bar. The larger the
number N of particles in your disc, the longer you have to wait for the bar to form, but it always
does form. Moreover, the final stages in which O(1) spiral structure develops into a bar occur on
the same timescale regardless of the value of N – this fact implies that the final-stage dynamics
are collisionless. By increasing N you simply increase the delay before the final stage is reached
by decreasing the value of f1 (n, J, 0) in our equations and thus the initial rate at which f0 is
modified into an unstable df.

Appendix A: Rewriting D1
We can bring F1 to a form that closely parallels eqn (4.16) for F2 . We first note that
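$$
\sum_{\mathbf{n}}\mathbf{n}\,E_{-\mathbf{n}-\mathbf{n}}(\mathbf{J},\mathbf{J},i\mathbf{n}\cdot\boldsymbol{\Omega}) = \sum_{\mathbf{n}}(-\mathbf{n})\,E_{\mathbf{n}\mathbf{n}}(\mathbf{J},\mathbf{J},-i\mathbf{n}\cdot\boldsymbol{\Omega}),
$$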

so from (3.17b)
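$$
\begin{aligned}
\mathbf{F}_1(\mathbf{J}) &= -im\tfrac12 f_0(\mathbf{J})\sum_{\mathbf{n}}\mathbf{n}\,\big[E_{-\mathbf{n}-\mathbf{n}}(\mathbf{J},\mathbf{J},i\mathbf{n}\cdot\boldsymbol{\Omega}) - E_{\mathbf{n}\mathbf{n}}(\mathbf{J},\mathbf{J},-i\mathbf{n}\cdot\boldsymbol{\Omega})\big] \\
&= im\tfrac12 f_0(\mathbf{J})\sum_{\mathbf{n}}\mathbf{n}\,\big\{E_{\mathbf{n}\mathbf{n}}(\mathbf{J},\mathbf{J},-i\mathbf{n}\cdot\boldsymbol{\Omega}) - [E_{\mathbf{n}\mathbf{n}}(\mathbf{J},\mathbf{J},-i\mathbf{n}\cdot\boldsymbol{\Omega})]^*\big\} \\
&= i\frac{m}{2\mathcal{E}}f_0(\mathbf{J})\sum_{\mathbf{n}}\mathbf{n}\sum_{\alpha\alpha'}\hat\Phi^{(\alpha')}(\mathbf{n},\mathbf{J})\,[\hat\Phi^{(\alpha)}(\mathbf{n},\mathbf{J})]^*\big\{\epsilon^{-1}_{\alpha'\alpha}(-i\mathbf{n}\cdot\boldsymbol{\Omega}) - [\epsilon^{-1}_{\alpha\alpha'}(-i\mathbf{n}\cdot\boldsymbol{\Omega})]^*\big\},
\end{aligned} \tag{A.1}
$$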

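where the second equality uses (4.15). In the curly bracket of the last line we have the difference between
$\epsilon^{-1}$ and $\epsilon^{-1\dagger}$. We use
$$
\epsilon^{-1} - \epsilon^{-1\dagger} = \epsilon^{-1}(\epsilon^\dagger - \epsilon)\,\epsilon^{-1\dagger}. \tag{A.2}
$$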
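
This identity follows in one line once we note that $\epsilon^{-1\dagger} = (\epsilon^\dagger)^{-1}$. Expanding the right side,

$$
\epsilon^{-1}(\epsilon^\dagger - \epsilon)\,\epsilon^{-1\dagger} = \epsilon^{-1}\,\epsilon^\dagger(\epsilon^\dagger)^{-1} - \epsilon^{-1}\epsilon\,\epsilon^{-1\dagger} = \epsilon^{-1} - \epsilon^{-1\dagger}.
$$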
From equation (3.16b)
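$$
\begin{aligned}
\big\{[\epsilon(p)]^\dagger - \epsilon(p)\big\}_{\beta\beta'} &= -\frac{(2\pi)^3}{\mathcal{E}}\,i\int d^3\mathbf{J}'\sum_{\mathbf{n}'}\mathbf{n}'\cdot\frac{\partial f_0}{\partial\mathbf{J}'}\,[\hat\Phi^{(\beta)}(\mathbf{n}',\mathbf{J}')]^*\,\hat\Phi^{(\beta')}(\mathbf{n}',\mathbf{J}') \\
&\qquad\times\left(\frac{1}{p^* - i\mathbf{n}'\cdot\boldsymbol{\Omega}'} + \frac{1}{p + i\mathbf{n}'\cdot\boldsymbol{\Omega}'}\right).
\end{aligned} \tag{A.3}
$$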
We need to put p = γ − in · Ω with γ > 0 and extract the limit γ → 0 as per Box 3.2.
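$$
\begin{aligned}
\big\{[\epsilon(p)]^\dagger - \epsilon(p)\big\}_{\beta\beta'} &= -\frac{(2\pi)^3}{\mathcal{E}}\,i\int d^3\mathbf{J}'\sum_{\mathbf{n}'}\mathbf{n}'\cdot\frac{\partial f_0}{\partial\mathbf{J}'}\,[\hat\Phi^{(\beta)}(\mathbf{n}',\mathbf{J}')]^*\,\hat\Phi^{(\beta')}(\mathbf{n}',\mathbf{J}') \\
&\qquad\times\left(\frac{1}{i(\mathbf{n}\cdot\boldsymbol{\Omega} - \mathbf{n}'\cdot\boldsymbol{\Omega}') + \gamma} - \frac{1}{i(\mathbf{n}\cdot\boldsymbol{\Omega} - \mathbf{n}'\cdot\boldsymbol{\Omega}') - \gamma}\right).
\end{aligned} \tag{A.4}
$$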
The principal parts of the two integrals cancel but the contributions from skirting the pole add because
in the left integral the pole is at z = n · Ω − iγ and in the right integral it’s at z = n · Ω + iγ. Glancing
back at (A.3) we see that the right integral has exactly the form considered in Box 3.2, so it yields +πK.
The left integral will yield minus this, so
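$$
\big\{[\epsilon(-i\mathbf{n}\cdot\boldsymbol{\Omega})]^\dagger - \epsilon(-i\mathbf{n}\cdot\boldsymbol{\Omega})\big\}_{\beta\beta'} = -i\frac{(2\pi)^4}{\mathcal{E}}\int d^3\mathbf{J}'\sum_{\mathbf{n}'}\mathbf{n}'\cdot\frac{\partial f_0}{\partial\mathbf{J}'}\,\delta(\mathbf{n}'\cdot\boldsymbol{\Omega}' - \mathbf{n}\cdot\boldsymbol{\Omega})\,[\hat\Phi^{(\beta)}(\mathbf{n}',\mathbf{J}')]^*\,\hat\Phi^{(\beta')}(\mathbf{n}',\mathbf{J}'). \tag{A.5}
$$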
Inserting eqn (A.5) in (A.2) and then in (A.1), we arrive at
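$$
\begin{aligned}
\mathbf{F}_1(\mathbf{J}) &= m\frac{(2\pi)^4}{\mathcal{E}^2}\tfrac12 f_0(\mathbf{J})\sum_{\mathbf{n}}\mathbf{n}\sum_{\alpha\alpha'}\sum_{\beta\beta'}\hat\Phi^{(\alpha')}(\mathbf{n},\mathbf{J})\,\epsilon^{-1}_{\alpha'\beta}(-i\mathbf{n}\cdot\boldsymbol{\Omega})\int d^3\mathbf{J}'\sum_{\mathbf{n}'}\mathbf{n}'\cdot\frac{\partial f_0}{\partial\mathbf{J}'} \\
&\qquad\times\delta(\mathbf{n}'\cdot\boldsymbol{\Omega}' - \mathbf{n}\cdot\boldsymbol{\Omega})\,[\hat\Phi^{(\beta)}(\mathbf{n}',\mathbf{J}')]^*\,\hat\Phi^{(\beta')}(\mathbf{n}',\mathbf{J}')\,[\epsilon^{-1}_{\alpha\beta'}(-i\mathbf{n}\cdot\boldsymbol{\Omega})]^*\,[\hat\Phi^{(\alpha)}(\mathbf{n},\mathbf{J})]^* \\
&= (2\pi)^4 m\tfrac12 f_0(\mathbf{J})\sum_{\mathbf{n}}\mathbf{n}\int d^3\mathbf{J}'\sum_{\mathbf{n}'}\mathbf{n}'\cdot\frac{\partial f_0}{\partial\mathbf{J}'}\,\delta(\mathbf{n}'\cdot\boldsymbol{\Omega}' - \mathbf{n}\cdot\boldsymbol{\Omega})\,\big|E_{\mathbf{n}\mathbf{n}'}(\mathbf{J},\mathbf{J}',-i\mathbf{n}\cdot\boldsymbol{\Omega})\big|^2.
\end{aligned} \tag{A.6}
$$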

Comparing equations (4.16) and (A.6) we see that they have extremely similar structures, so when
we combine them to form the total flux F = F2 + F1 we obtain quite a simple bottom line:
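$$
\begin{aligned}
\mathbf{F}(\mathbf{J}) &= (2\pi)^4 m\tfrac12\sum_{\mathbf{n}\mathbf{n}'}\mathbf{n}\int d^3\mathbf{J}'\,\big|E_{\mathbf{n}\mathbf{n}'}(\mathbf{J},\mathbf{J}',-i\mathbf{n}\cdot\boldsymbol{\Omega})\big|^2\,\delta(\mathbf{n}'\cdot\boldsymbol{\Omega}' - \mathbf{n}\cdot\boldsymbol{\Omega})\left(\mathbf{n}'\cdot\frac{\partial f_0}{\partial\mathbf{J}'}\,f_0(\mathbf{J}) - \mathbf{n}\cdot\frac{\partial f_0}{\partial\mathbf{J}}\,f_0(\mathbf{J}')\right) \\
&= (2\pi)^4 m\tfrac12\sum_{\mathbf{n}\mathbf{n}'}\mathbf{n}\int d^3\mathbf{J}'\,\big|E_{\mathbf{n}\mathbf{n}'}(\mathbf{J},\mathbf{J}',-i\mathbf{n}\cdot\boldsymbol{\Omega})\big|^2\,\delta(\mathbf{n}'\cdot\boldsymbol{\Omega}' - \mathbf{n}\cdot\boldsymbol{\Omega})\left(\mathbf{n}'\cdot\frac{\partial}{\partial\mathbf{J}'} - \mathbf{n}\cdot\frac{\partial}{\partial\mathbf{J}}\right)f_0(\mathbf{J})\,f_0(\mathbf{J}').
\end{aligned} \tag{A.7}
$$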

Problems
1 Write down the generating function S(θ, J′ ) of the canonical transformation (θ, J) ↔ (θ′ , J′ ) that
makes ordinary phase-space coordinates periodic in the θ′i with period unity rather than 2π.
2 Let S(x, J) be the generating function of the canonical transformation between (x, p) and the angle-
action coordinates of the harmonic oscillator $H(x,p) = \tfrac12(p^2 + \omega^2x^2)$. Explain what the Hamilton-Jacobi
equation is, and show that for this system it yields
$$
S = \frac{E}{\omega}\left(\psi + \tfrac12\sin 2\psi\right), \tag{P.8}
$$
where $\sin\psi \equiv \omega x/\sqrt{2E}$. Define the action and show that for this system it is J = E/ω. Hence show
that
$$
S = J\left(\psi + \tfrac12\sin 2\psi\right). \tag{P.9}
$$
Hence show that θ = ψ.
3 Particles move in the (r, φ) plane in the potential Φ(r). Write down the Hamilton-Jacobi equation
for the generating function $S(r,\phi,J_r,J_\phi)$. By writing $S = S_r(r,J_r,J_\phi) + S_\phi(\phi,J_r,J_\phi)$ show that $J_\phi = p_\phi$
and obtain an integral for $J_r$. Show that
$$
\theta_r(r,\mathbf{J}) = \Omega_r\int\frac{dr}{p_r}, \tag{P.10}
$$
where $\Omega_r$ is the radial frequency. Give a physical interpretation of this result.
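4 N particles form a system with Hamiltonian
$$
H = \tfrac12\sum_i\Big(p_i^2 + \sum_j u(\mathbf{q}_i,\mathbf{q}_j)\Big), \tag{P.11}
$$
where u is a symmetric function of its arguments. Show from first principles that the N -particle df
satisfies
$$
\frac{\partial f^{(N)}}{\partial t} + [f^{(N)}, H] = 0, \tag{P.12}
$$
where the Poisson bracket [f, g] is defined by
$$
[f,g] = \sum_{i=1}^{N}\left(\frac{\partial f}{\partial\mathbf{q}_i}\cdot\frac{\partial g}{\partial\mathbf{p}_i} - \frac{\partial f}{\partial\mathbf{p}_i}\cdot\frac{\partial g}{\partial\mathbf{q}_i}\right). \tag{P.13}
$$
Explain why equation (P.12) can be written $df^{(N)}/dt = 0$.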

5 An interesting orthogonal potential-density basis $(\Phi^{(n)}, \rho^{(n)})$ starts with the Hernquist sphere
$$
\rho^{(0)}(r) = \frac{\rho_0}{(r/a)(1 + r/a)^3} \quad\leftrightarrow\quad \Phi^{(0)}(r) = -\frac{2\pi G\rho_0 a^2}{1 + r/a}, \tag{P.14}
$$
where $\rho_0$ and a are constants. Show that in this case the constant $\mathcal{E} = GM^2/3a$, where $M = 2\pi\rho_0 a^3$ is
the system’s mass. Explain why it’s reasonable to adopt as other members of the family
$$
\Phi^{(0,l,m)} = C\,\frac{(r/a)^l}{(1 + r/a)^{2l+1}}\,Y_l^m(\theta,\phi), \tag{P.15}
$$
where C is a constant to be determined. Show that the corresponding density distribution is
$$
\rho^{(0,l,m)} = -\frac{(2l+1)(l+1)\,C}{2\pi G a^2}\,\frac{(r/a)^{l-1}}{(1 + r/a)^{2l+3}}\,Y_l^m(\theta,\phi). \tag{P.16}
$$
Explain how the constant C is determined. (Much more detail in Hernquist & Ostriker, ApJ, 386, 375
(1992))
6 Show that the action-space flux F defined by equation (4.4) is necessarily real.

7 Derive from Liouville’s equation for the full N -particle df $f^{(N)}(\mathbf{x}_1,\ldots,\mathbf{v}_N)$ the Boltzmann equation that
connects the 1-particle df $f^{(1)}$ to the 2-particle correlation function
$$
g(\mathbf{w},\mathbf{w}') \equiv f^{(2)}(\mathbf{w},\mathbf{w}') - f^{(1)}(\mathbf{w})\,f^{(1)}(\mathbf{w}').
$$

8 Show from the definition (3.16b) of $\epsilon$ that
$$
[\epsilon^{-1}(p)]^* = \epsilon^{-1}(p^*). \tag{P.17}
$$

9 Using the result of Appendix A, show that the diffusive flux F vanishes when the df is
$f_0(\mathbf{J}) \propto e^{-\beta H_0(\mathbf{J})}$, where β is a constant. What physical principle does this result vindicate/illustrate?

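10 Let f0 (J) be the distribution function (df) of an equilibrium stellar system that has gravitational
potential Φ0 (x) and angle-action coordinates (θ, J). Show that if we write the df of the perturbed model
f (x, v, t) = f0 + f1 (x, v, t), then to first order in the perturbations f1 satisfies
$$
\frac{\partial f_1}{\partial t} + \boldsymbol{\Omega}_0\cdot\frac{\partial f_1}{\partial\boldsymbol{\theta}} - \frac{\partial f_0}{\partial\mathbf{J}}\cdot\frac{\partial\Phi_1}{\partial\boldsymbol{\theta}} = 0, \tag{P.18}
$$
where Ω0 = ∂H0 /∂J and the perturbed potential is Φ(x, t) = Φ0 (x) + Φ1 (x, t). Hence or otherwise show
that
$$
\tilde f_1(\mathbf{n},\mathbf{J},p) = \frac{i\mathbf{n}\cdot\frac{\partial f_0}{\partial\mathbf{J}}\,\tilde\Phi_1(\mathbf{n},\mathbf{J},p) + \hat f_1(\mathbf{n},\mathbf{J},0)}{p + i\mathbf{n}\cdot\boldsymbol{\Omega}_0}, \tag{P.19}
$$
where the meanings of a tilde and a hat should be explained.
What physical principle is used to obtain from the last equation the expression
$$
\tilde\Phi_1(\mathbf{n}',\mathbf{J}',p) = -(2\pi)^3\int d^3\mathbf{J}\sum_{\mathbf{n}}E_{\mathbf{n}'\mathbf{n}}(\mathbf{J}',\mathbf{J},p)\,\frac{\hat f_1(\mathbf{n},\mathbf{J},0)}{p + i\mathbf{n}\cdot\boldsymbol{\Omega}_0}, \tag{P.20}
$$
where E is the inverse of the “dielectric tensor”? Explain (without calculation) how from this equation
we can obtain
$$
\tilde f_1(\mathbf{n},\mathbf{J},p) = -(2\pi)^3 i\,\frac{\mathbf{n}\cdot\frac{\partial f_0}{\partial\mathbf{J}}}{p + i\mathbf{n}\cdot\boldsymbol{\Omega}_0}\int d^3\mathbf{J}'\sum_{\mathbf{n}'}E_{\mathbf{n}\mathbf{n}'}(\mathbf{J},\mathbf{J}',p)\,\frac{\hat f_1(\mathbf{n}',\mathbf{J}',0)}{p + i\mathbf{n}'\cdot\boldsymbol{\Omega}_0'} + \frac{\hat f_1(\mathbf{n},\mathbf{J},0)}{p + i\mathbf{n}\cdot\boldsymbol{\Omega}_0}. \tag{P.21}
$$
Fluctuations in Φ drive a diffusive flux F of the mass-bearing stars through phase space. F is given
by
$$
\mathbf{F}(\mathbf{J}) = i\bigg\langle\sum_{\mathbf{n}}\mathbf{n}\int\frac{dp}{2\pi i}\,e^{pt}\,\tilde f_1(\mathbf{n},\mathbf{J},p)\int\frac{dp'}{2\pi i}\,e^{p't}\,\tilde\Phi_1(-\mathbf{n},\mathbf{J},p')\bigg\rangle, \tag{P.22}
$$
where $\langle\cdot\rangle$ indicates an ensemble average. A population of massless tracer particles orbits within the
stellar system. Let g0 (J) and g1 (x, v, t) be the unperturbed and perturbed dfs of this population. Show
that the phase-space flux G of the tracer population is given by an expression of the form (an expression
for D2 is not required)
$$
\mathbf{G} = -\mathbf{D}_2(\mathbf{J})\cdot\frac{\partial g_0}{\partial\mathbf{J}}. \tag{P.23}
$$
Explain the physical significance of the form taken by G.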

11 An equilibrium stellar system is described by the distribution function (df) f0 (x, v) and the mean-
field potential Φ0 (x). Write down two equations that must be satisfied by f0 and Φ0 .
Let f1 (x, v, t) be the small change in the system’s df when it is out of equilibrium. Obtain the
equation that governs the evolution of f1 to first order in small quantities.
State three properties of angle-action coordinates (θ, J). Use these coordinates to simplify your
equation for f1 .
Fluctuations in the df cause the equilibrium state f0 to evolve slowly. Show that this evolution is
governed by the equation
$$
\frac{\partial f_0}{\partial t} = -i\frac{\partial}{\partial\mathbf{J}}\cdot\bigg\langle\sum_{\mathbf{n}}\mathbf{n}\,\hat f_1(\mathbf{n},\mathbf{J},t)\,\hat\Phi_1(-\mathbf{n},\mathbf{J},t)\bigg\rangle, \tag{P.24}
$$
where the hat operator is such that
$$
\hat g(\mathbf{n}) \equiv \int\frac{d^3\boldsymbol{\theta}}{(2\pi)^3}\,g(\boldsymbol{\theta})\,e^{-i\mathbf{n}\cdot\boldsymbol{\theta}}. \tag{P.25}
$$
Show that the right side of the equation for ∂f0 /∂t is real and explain the significance of its taking
the form of a divergence.
The evolution equation can be brought to the form
$$
\frac{\partial f_0}{\partial t} = -\frac{\partial}{\partial\mathbf{J}}\cdot\left(\mathbf{D}_1(\mathbf{J})\,f_0 + \mathbf{D}_2(\mathbf{J})\cdot\frac{\partial f_0}{\partial\mathbf{J}}\right). \tag{P.26}
$$
Given that f0 describes particles in thermal equilibrium at inverse temperature $\beta = (k_B T)^{-1}$, show that
$$
\mathbf{D}_1 = \mathbf{D}_2\cdot\mathbf{K}, \tag{P.27}
$$
where K is a vector function of J that should be identified.
