The Riemann Zeta Function and Probability Theory
Philippe Biane
February 13, 2015
Abstract
Translation of "La fonction zêta de Riemann et les probabilités" by Philippe Biane. La fonction zêta, pp. 165–193.
1 Introduction
In recent years, particularly under the influence of the physicist and number
theorist F.J. Dyson, probabilistic ideas have penetrated the field of number
theory, particularly the study of the Riemann zeta function. Without at-
tempting to provide a comprehensive overview of the relationship between
probability and number theory, I will try to explain two examples of how
these seemingly distant areas are closely related.
The first example we consider is the theory of random matrices and its
applications to the study of zeros of the Riemann zeta function. The origin
of number theorists’ interest in random matrix theory can be traced to
the work of H.L. Montgomery on the distribution of the spacings between
zeros of the zeta function. Let ρ = 1/2 + iγ denote the nontrivial zeros
of the zeta function (we ignore the trivial zeros −2, −4,. . .). We assume
the Riemann hypothesis is true, i.e., that the γ are real. The zeros are distributed
symmetrically with respect to 1/2 (this follows from the functional equation
for ζ), and the number of zeros whose imaginary part lies in the interval [0, T]
is asymptotic to \frac{T}{2\pi} \log T as T → ∞. Therefore, the average density of
zeros in the interval [0, T] is \frac{1}{2\pi} \log T, and the average spacing between two
consecutive zeros is \frac{2\pi}{\log T}. Montgomery was interested in the asymptotic
distribution of the differences γ − γ′, where (γ, γ′) ranges over the pairs of
zeros in the interval [0, T]. These differences are rescaled by log T to make
the average spacing 1, and we focus on N_{a,b}(T), the number of pairs (γ, γ′)
such that γ − γ′ ∈ [\frac{2\pi a}{\log T}, \frac{2\pi b}{\log T}]. Based on calculations that will be discussed
in Section 2.1, Montgomery conjectured the following asymptotic behavior:

N_{a,b}(T) \sim \frac{T}{2\pi} \log T \left( \int_a^b \left( 1 - \left( \frac{\sin \pi u}{\pi u} \right)^2 \right) du + o(1) \right), \qquad 0 < a < b, \qquad (1)

as T → ∞.
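For intuition, the conjectured pair-correlation density 1 − (sin πu/πu)² from (1) is easy to evaluate numerically. The following Python sketch is my own illustration (the function names are not from the text):

```python
import numpy as np

# Montgomery's conjectured pair-correlation density: 1 - (sin(pi*u)/(pi*u))^2.
# np.sinc(u) computes sin(pi*u)/(pi*u) with the removable singularity at u = 0 handled.
def pair_correlation_density(u):
    return 1.0 - np.sinc(u) ** 2

# Integral of the density over [a, b] by the trapezoidal rule; this is the
# conjectured proportion of rescaled differences falling in [a, b].
def density_integral(a, b, num=200001):
    u = np.linspace(a, b, num)
    y = pair_correlation_density(u)
    return float(np.sum((y[1:] + y[:-1]) / 2 * (u[1:] - u[:-1])))
```

Near u = 0 the density vanishes like (πu)²/3, reflecting the repulsion between nearby zeros, while for large u it tends to 1, so distant zeros become uncorrelated.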
Montgomery was at Princeton at the same time as Dyson when he made
this conjecture. In a conversation with Dyson, he was astonished to learn
that the above asymptotic behavior is the same as that of the differences be-
tween the eigenvalues of a Gaussian Hermitian matrix, a result well-known
to theoretical physicists. Their motivation for this problem and some subtle
calculations will be explained in Section 2.2. The coincidence between Mont-
gomery’s conjecture and the physicists results on random matrices cast new
on Polya and Hilbert’s suggestion that the numbers γ should be the eigen-
values of a self-adjoint operator on a Hilbert space. The existence of such
an operator, which would imply in particular the validity of the Riemann
hypothesis, is still speculative. Nevertheless, this possibility motivated A.
Odlyzko to experimentally test Montgomery’s conjecture. In numerical cal-
culations to be discussed in Section 2.3, he verified that the zeros of the zeta
function conform to the predictions of the random matrix model with high
precision. We will also see how the heuristic of random matrix theory led J.
Keating and N. Snaith to offer a remarkable conjecture for the asymptotic
behavior of the moments of the zeta function on the critical line, a problem
which dates back to the work of Hardy and Littlewood.
Despite the convincing experimental confirmation, the relationship be-
tween probability and the zeta function mentioned above remains largely conjec-
tural. This is why I have also chosen to discuss other connections. These
are probably more anecdotal in terms of the Riemann zeta function, but
they involve Brownian motion, probably the most important object, and
certainly the most studied, in modern probability theory. We will see that
the Riemann ξ function, which expresses the functional equation of the zeta
function in a symmetric manner, is the Mellin transform of a probability
measure that appears in the study of Brownian motion, or more specifically
in the theory of Brownian excursions. This discussion gives us the opportu-
nity to present the basics of excursion theory. We also give a probabilistic
interpretation of the functional equation and explain how probabilistic reasoning
leads to a natural renormalization of the series \sum_{n=1}^{\infty} (-1)^n n^{-s}, which
converges in the whole complex plane.
2 Correlations in the zeros of the Riemann zeta
function and random matrices
2.1 Montgomery’s conjecture on the pair correlation
The starting point of Montgomery's work [9] is the study of the asymp-
totic behavior of the Fourier transform of the distribution of the differences
between the γ. Montgomery considers the quantity

F(\alpha) = \left( \frac{T}{2\pi} \log T \right)^{-1} \sum_{0 \le \gamma, \gamma' \le T} T^{-i\alpha(\gamma - \gamma')} \frac{4}{4 + (\gamma - \gamma')^2}. \qquad (2)
Montgomery proved, assuming the Riemann hypothesis, that

F(\alpha) = (1 + o(1)) \, T^{-2\alpha} \log T + \alpha + o(1), \qquad T \to \infty, \qquad (3)

the error term being uniform for α ∈ [0, 1 − ε), for any ε > 0. The proof of
this result is too technical to be presented in detail here. It is based on an
"explicit formula" that links the zeros of the zeta function and the prime
numbers under the assumption that the Riemann hypothesis is true. For
t ∈ R and x ≥ 1, we have
\sum_{0 \le \gamma \le T} \frac{2 x^{i\gamma}}{1 + (t - \gamma)^2} = \qquad (4)

= -x^{-1/2} \left( \sum_{n \le x} \Lambda(n) \left( \frac{x}{n} \right)^{-1/2 + it} + \sum_{n > x} \Lambda(n) \left( \frac{x}{n} \right)^{3/2 + it} \right)
  + x^{-1+it} \left( \log(|t| + 2) + O(1) \right) + O\!\left( \frac{x^{1/2}}{|t| + 2} \right) \qquad (5)
where Λ(n) is the arithmetic function that takes the value log p if n is a
power of the prime p, and 0 otherwise. Let G(t, x) denote the left side of equation
(4). It is easily verified that

\int_0^T |G(t, T^\alpha)|^2 \, dt = F(\alpha) \, T \log T + O(\log^3 T),
and (3) then becomes a delicate estimate of \int_0^T |D(t, T^\alpha)|^2 \, dt, where D(t, x)
denotes the right-hand side of (4). This estimate is only possible if α ∈ [0, 1),
but heuristic arguments of Montgomery (still under the Riemann hypoth-
esis) suggest that F(α) = 1 + o(1) for α ≥ 1, uniformly on any compact
set. This determines the asymptotic behavior of F on all of R. The Fourier
inversion formula and (2) imply

\sum_{\gamma, \gamma'} r\left( (\gamma - \gamma') \frac{\log T}{2\pi} \right) \frac{4}{4 + (\gamma - \gamma')^2} = \frac{T}{2\pi} \log T \int_{-\infty}^{\infty} F(\alpha) \hat{r}(\alpha) \, d\alpha. \qquad (6)

For the indicator function

r(u) = 1 if u ∈ [a, b], and r(u) = 0 otherwise,

we have

\int_{-\infty}^{\infty} \hat{r}(\alpha) \max(1 - |\alpha|, 0) \, d\alpha = \int_{-\infty}^{\infty} r(x) \left( \frac{\sin \pi x}{\pi x} \right)^2 dx = \int_a^b \left( \frac{\sin \pi x}{\pi x} \right)^2 dx.
We combine these calculations to obtain the estimate (1).
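The arithmetic function Λ appearing in the explicit formula (4) is the von Mangoldt function; a minimal Python sketch (helper names are mine, not from the text) computes it and checks that the Chebyshev sum ψ(x) = Σ_{n≤x} Λ(n) grows like x, an equivalent form of the prime number theorem.

```python
import math

# Von Mangoldt function: Lambda(n) = log p if n = p^k is a power of a prime p,
# and 0 otherwise.
def von_mangoldt(n):
    if n < 2:
        return 0.0
    p = 2                      # find the smallest prime factor of n
    while p * p <= n and n % p != 0:
        p += 1
    if n % p != 0:
        p = n                  # no factor up to sqrt(n): n is prime
    m = n
    while m % p == 0:          # n is a power of p iff dividing out p leaves 1
        m //= p
    return math.log(p) if m == 1 else 0.0

# Chebyshev's function psi(x); the prime number theorem says psi(x) ~ x.
def chebyshev_psi(x):
    return sum(von_mangoldt(n) for n in range(2, int(x) + 1))
```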
2.2 GUE
Quantum theory implies that the energy levels of an atomic system are the
eigenvalues of a Hermitian operator in Hilbert space, known as the Hamil-
tonian of the system. When the atomic system contains many elementary
particles, there is a profusion of energy levels and the Hamiltonian is too
complex to be diagonalized numerically. In this context, the physicist E.
Wigner suggested that the energy levels of such a Hamiltonian can be mod-
eled by the eigenvalues of a random Hermitian matrix. Wigner's hope was
that the statistical properties of energy levels, such as the distribution of their
spacings, would coincide with those of random matrices. Wigner's intuition proved
well-founded and there is a good match between experiment and the pre-
dictions of the random matrix model in many quantum systems. See, for
example, the introduction to the book of Mehta [7].
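Wigner's random-matrix model is easy to experiment with numerically. The following Python sketch is my own illustration; it uses the GUE normalization exp(−tr(M²)/2) adopted below:

```python
import numpy as np

# Sample an N x N matrix from the GUE: real N(0,1) entries on the diagonal,
# complex off-diagonal entries whose real and imaginary parts are N(0, 1/2),
# so that the density is proportional to exp(-tr(M^2)/2).
def sample_gue(N, rng):
    A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
    return (A + A.conj().T) / 2

rng = np.random.default_rng(0)
N = 200
M = sample_gue(N, rng)
eigs = np.linalg.eigvalsh(M)   # real eigenvalues of the Hermitian matrix

# With this normalization the spectrum concentrates on [-2*sqrt(N), 2*sqrt(N)]
# (Wigner's semicircle law), so the bulk spacing near 0 is of order pi/sqrt(N).
```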
I will now explain how to describe the statistical structure of eigenvalues
of a large random matrix. The term GUE - an acronym for Gaussian Unitary
Ensemble - designates the space of N × N Hermitian matrices, H_N, equipped
with the standard Gaussian measure with density (2\pi)^{-N^2/2} \exp(-\mathrm{tr}(M^2)/2)
with respect to the Lebesgue measure on H_N. We now describe the law of the
eigenvalues of a random matrix sampled from GUE. Any Hermitian matrix
M_0 can be written in the form U_0 X_0 U_0^*, where U_0 is a unitary matrix and
X_0 is a diagonal matrix of eigenvalues of M_0. Consider the map (X, S) ↦
U_0 e^{iS} X e^{-iS} U_0^*, where X ranges over real diagonal matrices and S ranges over
the Hermitian matrices. The differential at (X_0, 0) of this map is (X, S) ↦
U_0 (X + i[S, X_0]) U_0^*, where [X, Y] = XY − YX denotes the commutator of two
matrices. If M_0 is generic (all its eigenvalues are distinct), the kernel of the
differential is the subspace of pairs (0, S) where S is diagonal. It follows from
the implicit function theorem that (X, S) ↦ U_0 e^{iS} X e^{-iS} U_0^*, restricted to
the subspace of S whose diagonal coefficients are zero, is a diffeomorphism
between a neighborhood of (X_0, 0) and a neighborhood of M_0. By identifying
a pair (X, S) with the matrix X + S, we can calculate the Jacobian of this
change of variables, and find that the Jacobian at (X_0, 0) is \prod_{i<j}(x_i - x_j)^2,
where the x_i are the eigenvalues of X_0 (and M_0). Let x_1(M), \ldots, x_N(M) denote
the eigenvalues associated with a matrix M. For any symmetric function of
N variables f(x_1, \ldots, x_N), we use the change of variables formula to
find
\int_{H_N} f(x_1(M), \ldots, x_N(M)) \, \frac{e^{-\mathrm{tr}(M^2)/2}}{(2\pi)^{N^2/2}} \, dM
= \frac{1}{Z_N} \int_{\mathbb{R}^N} f(x_1, \ldots, x_N) \prod_{1 \le i < j \le N} (x_i - x_j)^2 \, e^{-\sum_{n=1}^N x_n^2/2} \, dx_1 \ldots dx_N,

where Z_N is a normalization constant. It follows that the density
of the law of the eigenvalues of M is

P^{(N)}(x_1, \ldots, x_N) = \frac{1}{Z_N} \prod_{1 \le i < j \le N} (x_i - x_j)^2 \, e^{-\sum_{n=1}^N x_n^2/2}. \qquad (7)

The n-point correlation functions of the eigenvalues are defined by

R_n^{(N)}(x_1, \ldots, x_n) = \frac{N!}{(N-n)!} \int_{\mathbb{R}^{N-n}} P^{(N)}(x_1, \ldots, x_n, x_{n+1}, \ldots, x_N) \, dx_{n+1} \ldots dx_N.
In order to calculate R_n^{(N)} we rewrite P^{(N)} using the Vandermonde deter-
minant

\prod_{i > j} (x_i - x_j) = \det\left[ x_j^{i-1} \right]_{1 \le i,j \le N}.

Taking linear combinations of the rows of the matrix [x_j^{i-1}], we find

\prod_{i > j} (x_i - x_j) = \det\left[ P_{i-1}(x_j) \right]_{1 \le i,j \le N}

for any family of monic polynomials P_n of degree n; we take the Hermite
polynomials, orthogonal with respect to the Gaussian measure e^{-x^2/2} dx / \sqrt{2\pi}.
The normalized Hermite wave-functions defined below form an orthonormal
basis of L^2(\mathbb{R}):

\varphi_n(x) = \frac{1}{\sqrt{n!}} \, P_n(x) \, \frac{e^{-x^2/4}}{(2\pi)^{1/4}}.
The density P^{(N)} is proportional to \det[\varphi_{i-1}(x_j)]_{1 \le i,j \le N}^2. To determine
the constant of proportionality we evaluate the integral

\int_{\mathbb{R}^N} \det[\varphi_{i-1}(x_j)]_{1 \le i,j \le N}^2 \, dx_1 \ldots dx_N
= \int_{\mathbb{R}^N} \sum_{\sigma, \tau \in \Sigma_N} \varepsilon(\sigma) \varepsilon(\tau) \prod_{i=1}^N \varphi_{i-1}(x_{\sigma_i}) \prod_{j=1}^N \varphi_{j-1}(x_{\tau_j}) \, dx_1 \ldots dx_N.

Since the functions \varphi_i are orthogonal, the only terms of the sum that give
a non-zero contribution are those for which σ = τ. Each such term gives a
unit contribution. Thus, the above integral is N! and

P^{(N)}(x_1, \ldots, x_N) = \frac{1}{N!} \det[\varphi_{i-1}(x_j)]_{1 \le i,j \le N}^2
= \frac{1}{N!} \det\left[ K^{(N)}(x_i, x_j) \right]_{1 \le i,j \le N}
where

K^{(N)}(x, y) = \sum_{k=1}^N \varphi_{k-1}(x) \, \varphi_{k-1}(y).
In light of the orthonormality of the \varphi_k we have

\int_{\mathbb{R}} K^{(N)}(x, x) \, dx = N; \qquad \int_{\mathbb{R}} K^{(N)}(x, z) \, K^{(N)}(z, y) \, dz = K^{(N)}(x, y). \qquad (9)
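The wave-functions φ_n and the identities (9) can be checked numerically. In this sketch (my own; the helper names are not from the text) P_n is the monic, probabilists' Hermite polynomial, evaluated with numpy.polynomial.hermite_e:

```python
import numpy as np
from numpy.polynomial.hermite_e import hermeval
from math import factorial, pi

# phi_n(x) = P_n(x) e^{-x^2/4} / (sqrt(n!) (2 pi)^{1/4}), with P_n = He_n.
def phi(n, x):
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0
    return hermeval(x, coeffs) * np.exp(-x ** 2 / 4) / (np.sqrt(factorial(n)) * (2 * pi) ** 0.25)

# Check orthonormality and  int K^(N)(x, x) dx = N  by a Riemann sum.
x = np.linspace(-20.0, 20.0, 40001)
dx = x[1] - x[0]
N = 5
G = np.array([phi(k, x) for k in range(N)])   # row k holds phi_k on the grid
gram = G @ G.T * dx                           # approximate inner products <phi_i, phi_j>
K_diag = (G ** 2).sum(axis=0)                 # K^(N)(x, x) on the grid
```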
We also deduce that

R_N^{(N)}(x_1, \ldots, x_N) = \det\left[ K^{(N)}(x_i, x_j) \right]_{1 \le i,j \le N}. \qquad (10)
In fact, we will show by induction that R_n^{(N)} may be expressed as such a
determinant for all n. Assuming this holds for n + 1, we have

R_n^{(N)}(x_1, \ldots, x_n) = \frac{1}{N-n} \int_{\mathbb{R}} R_{n+1}^{(N)}(x_1, \ldots, x_n, x_{n+1}) \, dx_{n+1}
= \frac{1}{N-n} \int_{\mathbb{R}} \det\left[ K^{(N)}(x_i, x_j) \right]_{1 \le i,j \le n+1} dx_{n+1}
= \frac{1}{N-n} \sum_{\sigma \in \Sigma_{n+1}} \varepsilon(\sigma) \int_{\mathbb{R}} K^{(N)}(x_1, x_{\sigma_1}) \cdots K^{(N)}(x_{n+1}, x_{\sigma_{n+1}}) \, dx_{n+1}.
If σ_{n+1} = n + 1 in this sum, then the first equality in (9) implies

\int_{\mathbb{R}} K^{(N)}(x_1, x_{\sigma_1}) \cdots K^{(N)}(x_{n+1}, x_{\sigma_{n+1}}) \, dx_{n+1} = N \, K^{(N)}(x_1, x_{\sigma_1}) \cdots K^{(N)}(x_n, x_{\sigma_n}). \qquad (11)
The limiting distribution of the eigenvalues is called the Wigner semicircle
law. If we choose a small interval (−ε_N, ε_N) around 0 such that ε_N → 0 but
ε_N \sqrt{N} → ∞, the average number of eigenvalues in this interval will be of the
order of 2 ε_N \sqrt{N} / \pi, and the average spacing between two of these eigenvalues
will be \pi / \sqrt{N}. Again using the asymptotics of Hermite polynomials, we
arrive at the formula

\frac{\pi}{\sqrt{N}} \, K^{(N)}\!\left( \frac{\pi \xi}{\sqrt{N}}, \frac{\pi \eta}{\sqrt{N}} \right) \xrightarrow[N \to \infty]{} \frac{\sin \pi(\xi - \eta)}{\pi(\xi - \eta)}.
Odlyzko's computation concerns a million zeros around the 10^{22}-nd zero. For example, the (10^{22} + 1)-st
zero is

1/2 + i \, 1370919909931995308226.68016095\ldots

and the lowest significant digits of the following three zeros are

8226.77659152
8226.94593324
8227.16707942.
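As a quick illustration (my own, not from the text), the gaps between these four consecutive zeros can be compared with the average spacing 2π/log T from the introduction; the value of T below is the approximate height of these zeros.

```python
import math

# Final digits of four consecutive zeros near the 10^22-nd zero (from the text).
offsets = [8226.68016095, 8226.77659152, 8226.94593324, 8227.16707942]
gaps = [b - a for a, b in zip(offsets, offsets[1:])]

# At height T the average spacing between zeros is about 2*pi / log(T).
T = 1.3709199099319953e21          # approximate height of these zeros
mean_spacing = 2 * math.pi / math.log(T)
normalized_gaps = [g / mean_spacing for g in gaps]
```

The three normalized gaps come out to roughly 0.75, 1.31 and 1.71, the kind of order-one fluctuation around the mean spacing that the random matrix model predicts.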
Thus, Montgomery’s guess and Odlyzko’s numerical verification add weight
to Hilbert and Polya’s conjecture that the γ are the eigenvalues of a Her-
mitian operator. While not providing a clear notion of the origin of this
operator, they do help to identify its form a little better. In fact we can
consider Gaussian random matrices presenting the different symmetries of
Hermitian matrices, such as real symmetric matrices (GOE for Gaussian Or-
thogonal Ensemble), or symplectic matrices (GSE = Gaussian Symplectic
Ensemble). Calculations similar to those of the preceding paragraph allow
us to determine the pair correlation function (the function 1 - \left( \frac{\sin \pi x}{\pi x} \right)^2 for
GUE) and other eigenvalue statistics for these ensembles. These statistics differ from those of
GUE. Thus, the agreement of the numerics with GUE suggests that the operator of Polya and
Hilbert, if it exists, should be Hermitian, not orthogonal or symplectic.
Montgomery’s results have been extended to other L-functions and re-
search in this area is currently very active. However, time and skill do not
allow me to address this subject, for which I refer to a recent Bourbaki seminar
by P. Michel [8].
To conclude this first part we’ll look at another problem concerning the
zeta function, that of the asymptotic behavior of its moments on the critical
line. The problem is to estimate
\int_0^T \left| \zeta\!\left( \tfrac{1}{2} + it \right) \right|^{2k} dt,
Nothing else has been shown for higher moments. There are conjectures
for the asymptotics of the moments \int_0^T |\zeta(\tfrac12 + it)|^6 \, dt (due to Conrey and
Ghosh) and \int_0^T |\zeta(\tfrac12 + it)|^8 \, dt (due to Conrey and Gonek) of the form

\int_0^T \left| \zeta\!\left( \tfrac{1}{2} + it \right) \right|^{2k} dt \sim a_k b_k \, T (\log T)^{k^2}, \qquad (15)

with

a_k = \prod_{p \in \mathcal{P}} \left( 1 - \frac{1}{p} \right)^{k^2} \sum_{m=0}^{\infty} \left( \frac{\Gamma(m+k)}{m! \, \Gamma(k)} \right)^2 p^{-m}.
which allows us to conjecture that

b_k = \prod_{j=0}^{k-1} \frac{j!}{(j+k)!}.
It is easy to check that these values agree with those above for b3 and b4 .
Keating and Snaith conjecture that these are the general forms for all k.
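The conjectured constants b_k are easy to tabulate in exact arithmetic; the following sketch is my own check of the values mentioned above:

```python
from fractions import Fraction
from math import factorial

# Keating-Snaith conjecture: b_k = prod_{j=0}^{k-1} j! / (j+k)!.
def b(k):
    result = Fraction(1)
    for j in range(k):
        result *= Fraction(factorial(j), factorial(j + k))
    return result
```

One finds b_1 = 1 and b_2 = 1/12, which recover the classical second and fourth moments (a_2 b_2 = 1/(2π²), Ingham's constant), while b_3 = 1/8640 = 42/9! and b_4 = 1/870912000 = 24024/16! match the Conrey-Ghosh and Conrey-Gonek conjectures.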
The Riemann ξ function, \xi(s) = \frac{s(s-1)}{2} \pi^{-s/2} \Gamma(s/2) \zeta(s), admits the integral representation

\xi(s) = \frac{1}{2} \int_0^\infty t^s \, \Psi(t) \, dt, \qquad s \in \mathbb{C}, \qquad (17)

where

\Psi(y) = 4y \sum_{n=1}^{\infty} \left( 2\pi^2 n^4 y^2 - 3\pi n^2 \right) e^{-\pi n^2 y^2}. \qquad (18)
The density Ψ may be rewritten using the formula

\Psi(y) = 6y \, \theta'(y^2) + 4y^3 \, \theta''(y^2),

where \theta(t) = \sum_{n \in \mathbb{Z}} e^{-\pi n^2 t} is the Jacobi theta function. The functional
equation for the Jacobi theta function,

\theta(t) = \frac{1}{\sqrt{t}} \, \theta\!\left( \frac{1}{t} \right), \qquad t > 0,

implies that Ψ satisfies

\Psi(y) = \frac{1}{y^3} \, \Psi\!\left( \frac{1}{y} \right), \qquad y > 0. \qquad (19)

This allows us to analytically continue the zeta function and deduce the functional equation
(16).
The starting point of the developments that follow is that Ψ is positive
on the half-line R+ and has integral 1, thus it is the density of a probability
measure on the half-line. Indeed, the formula (18) shows that Ψ(y) > 0 for
y > 1, because it is a sum of positive terms, and the functional equation
(19) implies positivity for y < 1. The graph of this density is indicated in
Figure 1, and the distribution function is

F_\Psi(y) = \int_0^y \Psi(x) \, dx = 1 + 2 \sum_{n=1}^{\infty} \left( 1 - 2\pi n^2 y^2 \right) e^{-\pi n^2 y^2} \qquad (20)
= \frac{4\pi}{y^3} \sum_{n=1}^{\infty} n^2 e^{-\pi n^2 / y^2}. \qquad (21)
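The series (18), (20) and (21) converge extremely fast, so these identities and the functional equation (19) can be verified numerically. A sketch (my own helper names):

```python
import math

# Psi from equation (18); the series converges very fast for y bounded away from 0.
def psi(y, terms=60):
    return 4 * y * sum((2 * math.pi ** 2 * n ** 4 * y ** 2 - 3 * math.pi * n ** 2)
                       * math.exp(-math.pi * n ** 2 * y ** 2) for n in range(1, terms + 1))

# The two series (20) and (21) for the distribution function F_Psi.
def F(y, terms=60):
    return 1 + 2 * sum((1 - 2 * math.pi * n ** 2 * y ** 2) * math.exp(-math.pi * n ** 2 * y ** 2)
                       for n in range(1, terms + 1))

def F_alt(y, terms=60):
    return 4 * math.pi / y ** 3 * sum(n ** 2 * math.exp(-math.pi * n ** 2 / y ** 2)
                                      for n in range(1, terms + 1))
```

In particular one can check that F and F_alt agree, that F' = Ψ, and that Ψ(y) = y^{-3} Ψ(1/y) as in (19).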
Figure 1: The distribution function of equation (20)
The origin of the relationship between the zeta function and Brownian mo-
tion may be found in the fact that the theta functions of Jacobi, which are
closely related to the Riemann zeta function, arise in the solutions of the
heat equation. On the other hand, we know that heat flow and Brownian
motion are two physical phenomena whose underlying mathematical struc-
ture is the same. Thus, we can imagine in this way that the Riemann zeta
function must appear in the theory of Brownian motion. These general con-
siderations, however, tell us nothing precise about the exact nature of these
relationships. In particular, the fact that the zeta function (or more accu-
rately the probability density Ψ) appears in natural problems is remarkable:
despite the complex appearance of formula (18), this probability measure
arises in a large number of questions coming from the theory of random
walks and Brownian motion. A fairly complete review of its probabilistic
interpretations in terms of Brownian motion can be found in [1] and [14].
It will not be possible to discuss all of these results here, but I will describe
the interpretation that seems to me the most accessible. It involves the
excursions of Brownian motion away from zero, and we will give an elemen-
tary approach to it by means of the game of heads and tails in Section 3.2.
This will allow us to show, in Section 3.4, that the functional equation of
the zeta function is equivalent to an equality in distribution between two
random variables defined in terms of Brownian motion.

3.2 The game of heads and tails

Two players compete in a game of heads and tails. It is assumed that
the payoff for each win is one unit, and we are interested in the winnings
of one of the players. We can represent this gain after n steps by a sum
S_n = X_1 + \cdots + X_n, where X_i represents the result of the i-th game. The
X_i are independent random variables that satisfy P(X_i = ±1) = 1/2. We
assume that the fortunes of the two players are endless, and that the game
never stops. A classical theorem of Polya asserts that, with probability 1,
the gain of the players will be 0 for infinitely many values of n, i.e. both
players will return to their initial (equal) fortunes infinitely often. We will
establish this result below by elementary considerations.
Let T_1, T_2, \ldots, T_n, \ldots denote the times of successive returns to zero, i.e.,
T_0 = 0 and T_j = \inf\{n > T_{j-1} : S_n = 0\} for j > 0. After each return to
zero, the gain behaves like a simple random walk, that is to say, the family of
random variables (S_{T_j + n}, n ≥ 0) has the same law as the family (S_n, n ≥ 0).
Moreover, it is independent of the random variables (S_n 1_{n \le T_j}, n ≥ 0). This
is a consequence of the strong Markov property of the random walk, which
can be verified by conditioning with respect to the value of Tj . The times
T0 , T1 , T2 , . . . , Tn , therefore form an increasing sequence, whose increments
(Ti − Ti−1 ; i ≥ 1) are independent and have the same law as T1 . We can
calculate the probability that the first return to 0 occurs at time 2n (it is
clear that the return time cannot be an odd number). As we shall see below,
P(T_1 = 2n) = \frac{(2n-2)!}{2^{2n-1} \, n! \, (n-1)!}. \qquad (22)
Similarly, the maximum difference in fortunes has a simple law. If we put
M_j = \max\{|S_n|, \ T_{j-1} \le n \le T_j\}, then the M_j are i.i.d. random variables with the
law

P(M_j = r) = \frac{1}{r} - \frac{1}{r+1}, \qquad r = 1, 2, \ldots
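This law can be understood through the gambler's ruin problem: M_1 ≥ r exactly when the walk, after its first step to ±1, reaches absolute value r before returning to 0, which happens with probability 1/r. A small sketch (my own, not from the text) checks this by solving the harmonic first-step equations:

```python
import numpy as np

# h(x) = P(reach r before 0 | current position x) satisfies
# h(x) = (h(x-1) + h(x+1)) / 2 with h(0) = 0 and h(r) = 1.
def hitting_probability(r):
    # unknowns h(1), ..., h(r-1); row i corresponds to position x = i + 1
    A = np.zeros((r - 1, r - 1))
    b = np.zeros(r - 1)
    for i in range(r - 1):
        A[i, i] = 1.0
        if i - 1 >= 0:
            A[i, i - 1] = -0.5
        if i + 1 <= r - 2:
            A[i, i + 1] = -0.5
        else:
            b[i] = 0.5              # neighbour x + 1 = r contributes h(r)/2 = 1/2
    return np.linalg.solve(A, b)[0]  # h(1)

# Then P(M_1 = r) = P(M_1 >= r) - P(M_1 >= r + 1) = 1/r - 1/(r+1).
```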
Most of the time, the interval between two successive returns
to 0 is small, and then the maximum difference in gain between the two players
is also small. Nevertheless, sometimes this interval is very long, and the
maximum difference in gain is correspondingly large. To quantify this, we will calcu-
late, for given n and m, the probability that the return time is equal to 2n and the maximum
difference in earnings is equal to m (see Diaconis and
Smith [12], and also [3, 11] for similar calculations). It is convenient to
represent the sequence (Sk , k ≥ 0) by the graph obtained by linearly inter-
polating between integer times as shown in Figure 2. We consider such a
graph restricted to the time interval k ∈ [0, n]. Each graph corresponds to
the realization of a unique sequence (X1 , . . . , Xn ) ∈ {+1, −1}n , therefore
the probability of the event it represents is 2−n .
Consider now the event that the first return time equals 2n, denoted
{T1 = 2n}. A sequence (X1 , . . . , X2n ) realizes this event if and only if the
sequence of partial sums (Sk ; 1 ≤ k ≤ 2n) satisfies S2n = 0 and Sk 6= 0 for
1 ≤ k ≤ 2n − 1. The calculation of the number of such sequences is a classic
exercise in the use of the reflection principle of Désiré André. It suffices to
count the number of sequences that are strictly positive for k ∈ [1, 2n − 1]
and to multiply this number by 2. Each such positive sequence has S1 = 1
and S2n−1 = 1. We first count the sequences with S0 = S2n = 0 and
Figure 2: The random walk
à l’intervalle de temps k ∈ [0, n]. Chaque graphe correspond à la réalisation
d’une unique suite (X1 , . . . , Xn ) ∈ {+1, −1}n , par conséquent la probabilité
S1 =que
S2n−1l’événement
= 1. These qu’il sequences
représente correspond
soit réalisé est égale à 2−n
to variables X.1 , Considérons
. . . , X2n with
maintenant l’événement correspondant
X1 = 1 and X2n = −1 such that there are an equal number ofau premier temps de retour égal à 2n,and
+1’s
noté
−1’s amongst{T 1 = 2n}. Une suite (X
the X2 , . . . , X2n−1 , . . . , X ) réalise cet événement si
1 . The2nnumber of such sequences is given by et seulement
si la suite des sommes partielles (Sk ; 1 ! k ! 2n), satisfait S2n = 0 et Sk ̸= 0
the binomial coefficient
pour 1 ! k ! 2n −1. Le calcul du nombre de ces suites est un exercice
2n − 2
classique dont on rappelle la solution, qui − 2)!le principe de réflexion dû à
(2nutilise
= .
n−1
Désiré André. Quitte à multiplier leur − 1)!(npar
(nnombre − 2,
1)!il suffit de compter celles
qui restent strictement positives pour k ∈ [1, 2n − 1] ; en particulier pour une
From this set, we must remove the sequences that vanish for at least one
k ∈ [2, 2n − 2]. Let (S_1, \ldots, S_{2n}) be such a sequence. Then there exists a
smallest integer k_0 ∈ [2, 2n − 2] such that S_{k_0} = 0. We define a new sequence
S' with S'_k = S_k for k ≤ k_0, and S'_k = −S_k for k_0 < k ≤ 2n. The graph
of the sequence S' is obtained by reflecting the graph of the sequence S in
the axis y = 0 after the first passage time at 0; see Figure 3. Conversely, if
a sequence S' satisfies S'_1 = 1, S'_{2n-1} = -1 and S'_{2n} = 0, then it necessarily
vanishes for some k ∈ [2, 2n − 2], and it can be reflected after the first moment
it hits zero to obtain a sequence S such that S_1 = 1, S_{2n-1} = 1, S_{2n} = 0 and
S_k vanishes at some k between 2 and 2n − 2. The sequence S' corresponds
to a sequence (X'_i; \ 2 \le i \le 2n-1) for which the number of +1's is n − 2
and the number of −1's is n. Thus, the total number of such sequences is
\binom{2n-2}{n}. As a consequence, the number of sequences S that do not vanish at
any point between 1 and 2n − 1 is

2 \left( \frac{(2n-2)!}{(n-1)! \, (n-1)!} - \frac{(2n-2)!}{n! \, (n-2)!} \right) = 2 \, \frac{(2n-2)!}{n! \, (n-1)!}
and we recover formula (22) after multiplying by the probability 2^{-2n} of each path. In particular,
we see that \sum_{n=1}^{\infty} P(T_1 = 2n) = 1, thus T_1 < ∞ almost surely. Applying
the Markov property, we see that for all j we have T_j < ∞ almost surely,
and therefore S_n returns to zero infinitely often with probability 1.
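The reflection-principle count can be verified by brute force for small n; the sketch below (my own) enumerates all walks of length 2n and counts those that stay strictly positive between the endpoints.

```python
from math import comb

# Number of walks with S_0 = 0, S_{2n} = 0 and S_k > 0 for 1 <= k <= 2n-1,
# obtained by enumerating all 2^(2n) sign sequences.
def positive_first_passage_count(n):
    count = 0
    for mask in range(2 ** (2 * n)):
        s, ok = 0, True
        for i in range(2 * n):
            s += 1 if (mask >> i) & 1 else -1
            if i < 2 * n - 1 and s <= 0:
                ok = False
                break
        if ok and s == 0:
            count += 1
    return count

# The closed form from the reflection principle: (2n-2)! / (n! (n-1)!),
# which is the Catalan number C_{n-1}.
def closed_form(n):
    return comb(2 * n - 2, n - 1) // n
```

Doubling this count and dividing by 2^{2n} recovers the probability (22).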
Figure 3: The random walk reflected at its first passage to zero
application of the inclusion-exclusion principle, we see that the number of
sequences we seek is given by

\sum_{g \in G} \det(g) \, s_{2n}(g(1)),

where the sum runs over all 2^{2n-2} possible combinations of the symbols.
other elements of this basis, in order that a term in the sum be nonzero, it is
necessary and sufficient that \Gamma_{\varepsilon_{2n-2}} \Gamma_{\varepsilon_{2n-3}} \cdots \Gamma_{\varepsilon_1}(e_1) = e_1. In this case, we
have \Gamma_{\varepsilon_k} \Gamma_{\varepsilon_{k-1}} \cdots \Gamma_{\varepsilon_1}(e_1) = e_{S_{k+1}}, where (S_k : 1 \le k \le 2n-1) is a sequence
that satisfies the conditions (23). Thus, \langle \Gamma^{2n-2}(e_1), e_1 \rangle is equal to the num-
ber of such sequences. This term may be calculated by diagonalizing the
matrix Γ. The characteristic polynomial may be computed by a recurrence
in m. We expand \det(\lambda I_m - \Gamma) = P_m(\lambda) with respect to the last column, to
obtain the relation P_m(\lambda) = \lambda P_{m-1}(\lambda) - P_{m-2}(\lambda). This recurrence relation,
with initial conditions P_1(\lambda) = \lambda and P_2(\lambda) = \lambda^2 - 1, yields P_m in terms of
Chebyshev polynomials of the second kind, and we find

P_m(2\cos\theta) = \frac{\sin((m+1)\theta)}{\sin\theta}, \qquad 0 < \theta < \pi.
In particular, the roots of P_m are the numbers 2\cos\frac{k\pi}{m+1}, 1 \le k \le m.
The eigenvectors are computed as follows. The eigenvector (x_1, \ldots, x_m)
corresponding to the eigenvalue λ satisfies x_{l-1} + x_{l+1} = \lambda x_l. We set \lambda =
2\cos(k\pi/(m+1)) to find

x_l = P_{l-1}(\lambda) \, x_1 = \frac{\sin\frac{kl\pi}{m+1}}{\sin\frac{k\pi}{m+1}} \, x_1.
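These spectral claims are easy to confirm numerically. In this sketch (mine), Γ is the m × m matrix with ones on the sub- and superdiagonals:

```python
import numpy as np

m = 6
Gamma = np.diag(np.ones(m - 1), 1) + np.diag(np.ones(m - 1), -1)

# The eigenvalues should be 2*cos(k*pi/(m+1)) for k = 1..m.
eigs = np.sort(np.linalg.eigvalsh(Gamma))
expected = np.sort(2 * np.cos(np.arange(1, m + 1) * np.pi / (m + 1)))

# Eigenvector for eigenvalue 2*cos(k*pi/(m+1)): components sin(k*l*pi/(m+1)).
k = 2
lam = 2 * np.cos(k * np.pi / (m + 1))
v = np.sin(k * np.arange(1, m + 1) * np.pi / (m + 1))
```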
The conditional distribution of the maximum deviation, knowing that the
return time is equal to 2n, is given by the distribution function

F_n(m) = P(M_1 \le m \mid T_1 = 2n) = \frac{P(M_1 \le m; \ T_1 = 2n)}{P(T_1 = 2n)}.

The zeta function, or more precisely the density Ψ, appears when we take
the limit n → ∞. Specifically, we will calculate the conditional distribution
of M_1 / \sqrt{\pi n} given that T_1 = 2n. We use the first expression in (24) and
Stirling's formula to obtain

\lim_{n \to \infty} P\!\left( M_1 \le y \sqrt{\pi n} \mid T_1 = 2n \right) = 1 + 2 \sum_{k=1}^{\infty} \left( 1 - 2\pi k^2 y^2 \right) e^{-\pi k^2 y^2} = F_\Psi(y).
\lim_{n \to \infty} P\!\left( \frac{S_n}{\sqrt{n}} \in [a, b] \right) = \frac{1}{\sqrt{2\pi}} \int_a^b e^{-x^2/2} \, dx.
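This convergence can be checked against the exact binomial distribution of the walk; a small sketch (helper names mine):

```python
from math import comb, erf, sqrt

# Exact P(S_n / sqrt(n) in [a, b]) for the simple random walk: S_n = 2k - n
# when k of the n steps are +1, and each k has probability C(n, k) / 2^n.
def walk_probability(n, a, b):
    total = 0
    for k in range(n + 1):
        s = (2 * k - n) / sqrt(n)
        if a <= s <= b:
            total += comb(n, k)
    return total / 2 ** n

# The limiting normal probability (1/sqrt(2*pi)) * integral of e^{-x^2/2}.
def normal_probability(a, b):
    Phi = lambda x: 0.5 * (1 + erf(x / sqrt(2)))
    return Phi(b) - Phi(a)
```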
Similarly, it is easily seen that for any real number t > 0, the law of S_{[nt]} / \sqrt{n},
where [·] denotes the integer part, converges to the normal distribution with
variance t, whose density is e^{-x^2/2t} / \sqrt{2\pi t}. Finally, thanks to the independence
of the increments of (S_n; n ≥ 0), we see that for any sequence of times t_1 < t_2 <
\ldots < t_k, the family of random variables

\left( \frac{S_{[nt_1]}}{\sqrt{n}}, \ \frac{S_{[nt_2]} - S_{[nt_1]}}{\sqrt{n}}, \ \ldots, \ \frac{S_{[nt_k]} - S_{[nt_{k-1}]}}{\sqrt{n}} \right)

converges in distribution to a family of independent normal variables with
variances t1 , t2 − t1 , . . ., tk − tk−1 . Brownian motion is a stochastic process
i.e. a family of random variables (Xt , t ∈ R+ ) indexed by time t such that
for all k and any k-tuple of times t1 < t2 < . . . < tk , the random variables
Xt1 − X0 , Xt2 − Xt1 , . . . , Xtk − Xtk−1 are independent normal random vari-
ables with mean zero and variance t1 , t2 − t1 , . . . , tk − tk−1 . In other words,
the finite-dimensional marginal distributions of the family (Xt , t ∈ R+ ) are
determined by the formula

E[f(X_{t_1}, \ldots, X_{t_n})] = \int_{\mathbb{R}^n} f(x_1, \ldots, x_n) \, p_{t_1}(0, x_1) \, p_{t_2 - t_1}(x_1, x_2) \cdots p_{t_n - t_{n-1}}(x_{n-1}, x_n) \, dx_1 \ldots dx_n

for each t_1 < t_2 < \ldots < t_n and each Borel function f on \mathbb{R}^n, where the
transition density p_t is given by

p_t(x, y) = \frac{1}{\sqrt{2\pi t}} \, e^{-(x-y)^2/2t}. \qquad (27)
A fundamental property of Brownian motion, first obtained by Wiener,
is that its paths are almost surely continuous. If one considers the space of
continuous functions from [0, ∞) to R with the topology of uniform conver-
gence on compact sets and the associated Borel structure, then the Wiener measure on this
space is a probability measure such that, under this measure, the coordinate
maps X_t : C([0, \infty), \mathbb{R}) \to \mathbb{R}, \ \omega \mapsto \omega(t), satisfy the above conditions. It can
be shown that the law of the continuous stochastic process (S_t^{(n)}; t ≥ 0), obtained by
linearly interpolating the graph of the random walk (S_n; n ≥ 0) and renor-
malizing, S_t^{(n)} = S_{[nt]} / \sqrt{n}, converges in the space C([0, \infty), \mathbb{R}) to the Wiener
measure. This means that for any bounded continuous function Φ on C([0, \infty), \mathbb{R}),
we have E[\Phi(S^{(n)})] \to E_W[\Phi(\omega)], where E_W denotes expectation with
respect to the Wiener measure. This allows one to cal-
culate the laws of certain functionals of Brownian motion by means of the
approximation by random walks. Note that the fact that the variables X_i
are Bernoulli variables (i.e., take only two values) is not of great importance;
the approximation result remains true under the assumption that the
X_i are independent, identically distributed, with mean zero and variance equal to 1.
This result is known as Donsker's invariance principle. A good introduction
to Brownian motion may be found in the book of Karatzas and Shreve [5].
One can visualize the continuity of the paths, and the approximation by
random walks, through a computer simulation. Here is a simple program in
Scilab which traces the trajectory of S^{(n)} for t ∈ [0, 1], which was used to plot
the graph in Figure 4.
xbasc();
plotframe([0 -2 1 2],[1,4,1,0]);
A=[0,1]; B=[0,0]; plot2d(A,B,1,"000"); // Definition of the axes
N=10000; // This is the number of steps.
rand("normal"); X=0; Y=0; SX=0; SY=0;
for i=2:N+1
  U=X; X=X+1/N; V=Y;
  if rand(1)>0 then
    Y=Y+1/sqrt(N); // Calculate the increments.
  else
    Y=Y-1/sqrt(N);
  end
  SX=[U,X]; SY=[V,Y]; plot2d(SX,SY,1,"000");
end
Figure 4: A simulated trajectory of the rescaled random walk S^{(n)}
for t_1 < t_2 < \ldots < t_n < T. This process can be obtained as the limit of
the random walk (S^{(n)}; n ≥ 0) conditioned to return to 0 at time [Tn]. More
precisely, we have convergence of the finite-dimensional marginals:

\lim_{n \to \infty} E\!\left[ f\!\left( \frac{|S_{[nt_1]}|}{\sqrt{n}}, \ldots, \frac{|S_{[nt_n]}|}{\sqrt{n}} \right) \Big| \ T_1 = [Tn] \right] = n_T\!\left[ f(e_{t_1}, e_{t_2}, \ldots, e_{t_n}) \right]. \qquad (32)
of the Bessel bridge in dimension three multiplied by \sqrt{2/\pi}. That is, we
have

n_1\!\left( \sqrt{2/\pi} \, \max_{t \in [0,1]} e(t) \in dx \right) = \Psi(x) \, dx, \qquad x > 0. \qquad (33)

Brownian motion, the Bessel processes and the Bessel bridges possess an important property
of scale invariance. For λ > 0, the transformation ω ↦ ω_λ, where

\omega_\lambda(t) = \frac{1}{\sqrt{\lambda}} \, \omega(\lambda t), \qquad (34)

leaves the laws of Brownian motion and the Bessel process invariant, and
transforms the law n_T into the law n_{T/\lambda}.
3.4 The functional equation for the zeta function and Ito’s
measure
In this section we interpret the functional equation of the Riemann zeta
function as the equality in distribution of two random variables. In order to
see this, we introduce the Ito measure of Brownian excursions. We consider
the space of continuous excursions ω : R → R+ , such that ω(0) = 0, and
there exists T (ω) > 0 such that ω(t) > 0 for 0 < t < T (ω) and ω(t) =
0 for t ≥ T (ω). The law of the process obtained by linear interpolation
from |Sn |, 0 ≤ n ≤ T1 for time upto T1 , and extended by zero after the
time T1 , is a probability measure on this space. The Ito measure is the
scaling limit, as λ → ∞ of measures λP λ where P λ denotes the law of
rescaled processes λ−1/2|Sλt |1t≤T1 /λ . This measure, denoted n+ has infinite
total mass. However, it can be expressed in terms of its finite-dimensional
marginals by
where p0t is defined in (29). We can also describe this measure with the help
of the laws n_T defined in (31) by the formula

n_+ = \int_0^\infty n_T \, \frac{2 \, dT}{\sqrt{2\pi T^3}}, \qquad (36)

following from the limit theorem (32) and from (22), by applying the Stirling formula:

P(T_1 = 2n) = \frac{(2n-2)!}{2^{2n-1} \, n! \, (n-1)!} \sim \frac{1}{2\sqrt{\pi n^3}}.
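One can check this Stirling asymptotic directly (sketch mine):

```python
from math import comb, pi, sqrt

# P(T_1 = 2n) = (2n-2)! / (2^(2n-1) n! (n-1)!) = Catalan(n-1) / 2^(2n-1).
def p_return(n):
    return comb(2 * n - 2, n - 1) // n / 2 ** (2 * n - 1)

# The Stirling approximation used in the text.
def stirling_approx(n):
    return 1 / (2 * sqrt(pi * n ** 3))
```

The ratio of the two expressions tends to 1 as n grows, with a relative error of order 1/n.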
Formula (36) means that under n_+ the law of the time to return to 0 is
2 \, dT / \sqrt{2\pi T^3}, and that conditionally on the return time, the excursion is a
Bessel bridge of dimension 3.
The Ito measure is fundamental to an understanding of the behavior
of Brownian motion outside the times at which it vanishes. The set Z_\omega = \{t \in
[0, \infty) \mid \omega(t) = 0\} of zeros of Brownian motion is closed, by the continuity of
trajectories. With probability one, it is a perfect set (with no isolated points)
of zero Lebesgue measure. In particular, it is uncountable, and we cannot
define an increasing sequence of times that denumerate the returns to zero,
unlike the discrete set T1 , . . . , Tn , . . . for the random walk. Nevertheless,
since the complement of Zω is open, it has a countable infinity of connected
components, and excursions of Brownian motion are by definition the pieces
of the path corresponding to these connected components. Next, following
P. Lévy, we can introduce the local time of the Brownian motion B by

L_t = \lim_{\varepsilon \to 0^+} \frac{1}{2\varepsilon} \int_0^t 1_{|B_s| \le \varepsilon} \, ds,
In other words, under the Ito measure, the law of the maximum of the
excursion is 2 \, dx / x^2 and, conditionally on the maximum M = x, the law
of the excursion e is n_x. In particular, conditionally on the value of the
maximum, the law of the return time to 0 is that of the sum of the hitting
times of x by two independent Bessel(3) processes. Consider now the law of
the pair (M, V) under the measure n_+, where M is the maximum of the excursion
ω and V is its return time to 0, i.e. V = \inf\{t > 0 \mid \omega(t) = 0\}. In accordance
with Ito's description (36) and using the scale invariance (34), we can write

(M, V) \overset{\text{law}}{=} (\sqrt{V} \, m, \ V),

where m denotes the maximum of a Bessel bridge under the law n_1, inde-
pendent of V. Similarly, Williams' decomposition gives

(M, V) \overset{\text{law}}{=} \left( M, \ M^2 (T^1 + T^2) \right),

where T^1 and T^2 are the hitting times of 1 by two independent Bessel(3)
processes, independent of M.
These two descriptions can be compared using the properties of the scale change (34):
$$n_+\big[f(M^2/V)\, g(V)\big] = \int_0^\infty n_v\big[f(M^2/v)\big]\, g(v)\, \frac{2\,dv}{\sqrt{2\pi v^3}}$$
$$= n_1\big(f(m^2)\big) \int_0^\infty g(v)\, \frac{2\,dv}{\sqrt{2\pi v^3}}$$
$$= \int_0^\infty n^x\Big[f\big(x^2/(T^1_x + T^2_x)\big)\, g\big(T^1_x + T^2_x\big)\Big]\, \frac{2\,dx}{x^2}$$
$$= E\left[\int_0^\infty f\!\left(\frac{1}{T^1 + T^2}\right) g\big(x^2(T^1 + T^2)\big)\, \frac{2\,dx}{x^2}\right] \qquad (39)$$
$$= E\left[\sqrt{T^1 + T^2}\; f\!\left(\frac{1}{T^1 + T^2}\right)\right] \int_0^\infty g(v)\, \frac{dv}{v^{3/2}}. \qquad (40)$$
The equality of lines 2 and 5 gives the result.
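The step from line 4 to line 5 is the change of variable $v = x^2(T^1+T^2)$, carried out at fixed $T^1 + T^2$; explicitly (a computation spelled out here, not in the original), writing $T = T^1 + T^2$:

```latex
v = x^2 T, \qquad x = \sqrt{v/T}, \qquad dx = \frac{dv}{2\sqrt{vT}},
\qquad\text{so}\qquad
\frac{2\,dx}{x^2} = \frac{2T}{v}\cdot\frac{dv}{2\sqrt{vT}} = \sqrt{T}\,\frac{dv}{v^{3/2}},
```

which produces the factor $\sqrt{T^1+T^2}$ inside the expectation and lets $\int_0^\infty g(v)\,v^{-3/2}\,dv$ factor out.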
One can also write the relation (38) in terms of the densities of the laws
of $m$ and $\sqrt{T^1 + T^2}$. If we call these densities $\Psi_1$ and $\Psi_2$ respectively, then
we have the relation
$$\Psi_1(x) = \sqrt{\frac{\pi}{2}}\; x^{-3}\, \Psi_2(x^{-1}). \qquad (41)$$
In the above discussion we have not used explicit knowledge of the laws
of $m$ and $T^1 + T^2$. Thus, we can consider the relation (41) as a consequence
of the scaling invariance of Brownian motion and of the Ito excursion measure.
If we now recall that the law of $m$ has been calculated previously in (33),
and that its density $\Psi$ satisfies (19), we observe the curious identity
$$m^2 \stackrel{\text{law}}{=} \frac{\pi}{2}\,(T^1 + T^2). \qquad (42)$$
The two identities (41) and (42) are equivalent, in view of (33), to the functional
equation (19). Finally, since (41) is an immediate consequence of the
scale invariance of Brownian motion, we see that it is the identity (42) that
should be considered the probabilistic basis for the functional equation of
the Riemann zeta function. I know of no direct proof of this identity
that does not involve an explicit calculation of the laws in question.
It would be very interesting to have a purely combinatorial proof
of this identity, through manipulation of the paths of Brownian motion.
These considerations can be used to obtain a renormalization of the series $\sum_n (-1)^n/n^s$ that
converges in the whole complex plane to the entire function $(2^{1-s} - 1)\zeta(s)$. More
precisely, we will determine the coefficients $(a_{n,N};\ 0 \le n \le N)$ such that for
every $n$ we have $\lim_{N\to\infty} a_{n,N} = (-1)^n$, and such that the partial sums $\sum_{n=1}^N a_{n,N}/n^s$
converge uniformly for $s$ in compact sets to the entire function $(2^{1-s} - 1)\zeta(s)$.
Recall that this entire function is the sum of the series $\sum_{n=1}^\infty (-1)^n/n^s$, convergent
for $\Re(s) > 1$. We can choose the coefficients $a_{n,N}$ so as to fix
the value of the sum $\sum_{n=1}^N a_{n,N}/n^s$ at $N$ values of $s$. It is natural to choose
for these $N$ values $s = 0, -2, -4, \ldots, -2(N-1)$, where the zeta function
vanishes. It is not difficult to see that this implies
$$a_{n,N} = (-1)^n\, \frac{(N!)^2}{(N-n)!\,(N+n)!},$$
and we have as well
$$\lim_{N\to\infty} a_{n,N} = (-1)^n.$$
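These properties are easy to check numerically. The following sketch (added here; the function names are ours) verifies in exact rational arithmetic that the renormalized sum takes the values of $(2^{1-s}-1)\zeta(s)$ at the chosen points: $-1/2$ at $s = 0$ and $0$ at $s = -2, \dots, -2(N-1)$:

```python
from fractions import Fraction
from math import factorial

def a(n: int, N: int) -> Fraction:
    """Coefficient a_{n,N} = (-1)^n (N!)^2 / ((N-n)! (N+n)!)."""
    return Fraction((-1) ** n * factorial(N) ** 2,
                    factorial(N - n) * factorial(N + n))

for N in range(2, 8):
    # at s = 0 the sum equals (2^1 - 1) * zeta(0) = -1/2
    assert sum(a(n, N) for n in range(1, N + 1)) == Fraction(-1, 2)
    # at s = -2k, k = 1, ..., N-1, the sum vanishes, like (2^{1-s} - 1) * zeta(s)
    for k in range(1, N):
        assert sum(a(n, N) * n ** (2 * k) for n in range(1, N + 1)) == 0

print(float(a(1, 50)))  # -50/51, already close to the limit (-1)^1 = -1
```

Note that $a_{1,N} = -N/(N+1)$, so the convergence $a_{n,N} \to (-1)^n$ is visible already for moderate $N$.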
We now relate this renormalization of the series $\sum_n (-1)^n/n^s$ to the preceding
considerations. First, the non-convergence of the series $\sum_{n=1}^\infty n^{-s}$
for $\Re(s) < 1$ reflects, in the representation based on $\Psi$, the fact that the series (18) does not
converge uniformly on $\mathbb{R}_+$. In fact it is easy to see that
$$\min_{y \in [0,\varepsilon]}\ \sum_{n=1}^N 4y\,\big(2\pi^2 n^4 y^2 - 3\pi n^2\big)\, e^{-\pi n^2 y^2} \;\xrightarrow{N \to \infty}\; -\infty$$
for every $\varepsilon > 0$ (the divergence is moreover uniform). In particular, the partial
sums are not positive. We will therefore seek to approximate a random variable
with density $\Psi$ by simpler random variables. For this, recall the Laplace transform, which can be deduced
from (30) and (42), or calculated directly by integrating (18) term by term:
$$\int_0^\infty e^{-\lambda y^2}\, \Psi(y)\, dy = \left(\frac{\sqrt{\pi\lambda}}{\sinh\sqrt{\pi\lambda}}\right)^2.$$
Euler’s formula
∞ −1
πx Y x2
= 1+ 2 ,
sinh πx n
n=1
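Numerically, the truncated product approaches the left-hand side with an error of order $x^2/N$ (an illustrative check added here, not in the original):

```python
from math import pi, sinh

def euler_product(x: float, N: int = 100_000) -> float:
    """Truncated Euler product prod_{n=1}^{N} (1 + x^2/n^2)^(-1)."""
    p = 1.0
    for n in range(1, N + 1):
        p /= 1.0 + x * x / (n * n)
    return p

x = 1.3
print(euler_product(x), pi * x / sinh(pi * x))  # the two values agree to about 5 decimals
```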
and the elementary formula
$$E\big[e^{-\kappa E}\big] = \int_0^\infty e^{-t}\, e^{-\kappa t}\, dt = (1 + \kappa)^{-1}$$
for the Laplace transform of a standard exponential random variable with
law $P(E > t) = e^{-t}$, show that we have the equality in law
$$X^2 \stackrel{\text{law}}{=} \pi \sum_{n=1}^\infty \frac{E_n + E'_n}{n^2}.$$
Here $X$ is a random variable with density $\Psi$, and $E_n$ and $E'_n$ are independent
standard exponential random variables. It is thus natural to try to
approximate the variable $X^2$ by the partial sums of $\pi \sum_{n=1}^\infty (E_n + E'_n)/n^2$. This
leads to an approximation of the function $(1 - s)\zeta(s)$ convergent in the whole
complex plane. However, the calculations are more complicated than in the
simpler case that we will consider, that of the random variable
$$Y = \sum_{n=1}^\infty \frac{E_n}{n^2},$$
which satisfies
$$E\big[Y^{s/2}\big] = s\,(1 - 2^{1-s})\,\Gamma(s/2)\,\zeta(s),$$
and which can be approximated by the partial sums $\sum_{n=1}^N E_n/n^2$. By decomposing
the product $\prod_{n=1}^N \left(1 + \frac{x^2}{n^2}\right)^{-1}$ into partial fractions we obtain
the formula
$$E\left[\left(\sum_{n=1}^N \frac{E_n}{n^2}\right)^{s/2}\right] = -s\,\Gamma(s/2) \sum_{n=1}^N \frac{a_{n,N}}{n^s}.$$
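At even integers this formula can be verified exactly. For $s = 2$, $-s\Gamma(s/2) = -2$ and the left side is $E[\sum_{n \le N} E_n/n^2] = \sum_{n \le N} 1/n^2$; for $s = 4$, $-s\Gamma(s/2) = -4$ and, since $E[E_n] = 1$ and $E[E_n^2] = 2$, the left side is $(\sum_{n \le N} 1/n^2)^2 + \sum_{n \le N} 1/n^4$. A check in exact rational arithmetic (added here, with our own function names):

```python
from fractions import Fraction
from math import factorial

def a(n: int, N: int) -> Fraction:
    """Coefficient a_{n,N} = (-1)^n (N!)^2 / ((N-n)! (N+n)!)."""
    return Fraction((-1) ** n * factorial(N) ** 2,
                    factorial(N - n) * factorial(N + n))

for N in range(1, 9):
    rng = range(1, N + 1)
    mean = sum(Fraction(1, n * n) for n in rng)                  # E[sum E_n/n^2], case s = 2
    second = mean ** 2 + sum(Fraction(1, n ** 4) for n in rng)   # E[(sum E_n/n^2)^2], case s = 4
    assert mean == -2 * sum(a(n, N) / n ** 2 for n in rng)
    assert second == -4 * sum(a(n, N) / n ** 4 for n in rng)

print("moment identities hold for N = 1, ..., 8")
```

The identities hold exactly for every finite $N$, not only in the limit, which reflects the partial-fraction origin of the formula.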