Modern Random Matrix Theory Overview
Boaz Nadler
May 2013
Starting point:
[Eugene Wigner, 1955, 1958], [Mehta and Gaudin, 1960]:
random matrix models to study quantum phenomena.
Example: the energy levels of an atom are the eigenvalues of a Hermitian operator,
H ψ_j = E_j ψ_j
Low-energy states can be tackled analytically; for higher energy levels the analysis becomes intractable.
Assume E[W_{ij}^k] < ∞ for all k (weakening possible).
F_m(t) = (1/m) #{i | ℓ_i(W) ≤ t}
Numerical Illustration (MATLAB):
m = 1000;
X = randn(m,m);
W = (X + X') / sqrt(2*m);
L = eig(W);
x = linspace(-2.5, 2.5, 50);
histL = hist(L, x);
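The same experiment can be mirrored in Python/NumPy (an illustrative translation of the MATLAB snippet above, with the size m and the histogram range chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(0)
m = 1000  # matrix size (illustrative choice)

# Symmetrize an iid Gaussian matrix; the 1/sqrt(2m) normalization
# puts the limiting spectrum on the interval [-2, 2].
X = rng.standard_normal((m, m))
W = (X + X.T) / np.sqrt(2 * m)

L = np.linalg.eigvalsh(W)  # real eigenvalues of the symmetric matrix

# Empirical spectral histogram, analogous to hist(L, x) above.
density, edges = np.histogram(L, bins=50, range=(-2.5, 2.5), density=True)
```

For large m the histogram traces out the semicircle shape on [−2, 2] discussed below.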
[Figure: histogram of the eigenvalues of W, supported on approximately [−2, 2]]
Simple Observation:
β_{m,k} = ∫ x^k dF_m(x) = (1/m) Σ_i ℓ_i^k = (1/m) Tr[W^k].
Example:
(1/m) Tr[W] = (1/m) Σ_{i=1}^m W_ii
Hence
E[(1/m) Tr[W]] = 0.
Next, (Tr[W])² = Σ_{i,j} W_ii W_jj.
Hence, by independence of the diagonal entries,
E[((1/m) Tr[W])²] = (1/m²) Σ_i E[W_ii²] = (1/m²) · m · (2/m) = 2/m².
Conclusion: the first moment β_{m,1} has zero mean and variance tending to zero as m → ∞. Thus
β_{m,1} → 0 as m → ∞.
Second Moment:
Tr[W²] = Σ_i (W²)_ii = Σ_i Σ_j W_ij²
Hence,
E[(1/m) Tr[W²]] = (1/m) · (1/m) · m² = 1.
General Moments:
(W^k)_{11} = Σ_{i_1, ..., i_{k−1}} W_{1,i_1} W_{i_1 i_2} ⋯ W_{i_{k−1},1}
By symmetry of the entries, E[Tr(W^{2r+1})] = 0.
For even k = 2r, each factor must appear an even number of times for a non-zero mean.
If each factor appears exactly twice, the mean of the term is (1/m)^{k/2}.
The exact number of such terms is a combinatorial graph-enumeration problem.
Summary:
(1/m) E Tr[W^{2k+1}] = 0
(1/m) Tr[W^{2k}] → C_k, the k-th Catalan number
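A quick numeric check of this summary (my own illustration; m and the number of moments are arbitrary choices): the even trace moments should be close to the Catalan numbers 1, 2, 5, 14.

```python
import numpy as np

rng = np.random.default_rng(1)
m = 1500
X = rng.standard_normal((m, m))
W = (X + X.T) / np.sqrt(2 * m)
l = np.linalg.eigvalsh(W)

# (1/m) Tr[W^{2k}] equals the mean of the 2k-th powers of the
# eigenvalues; it should approach the k-th Catalan number C_k.
catalan = [1, 1, 2, 5, 14]  # C_0 .. C_4
moments = [np.mean(l ** (2 * k)) for k in range(5)]
```

The agreement is up to O(1/m) finite-size corrections, consistent with the moment computation above.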
Mathematical Foundations:
Lemma 1: Moments can uniquely determine a distribution.
Precisely, let F be a distribution with finite moments β_k = ∫ x^k dF(x).
If the Carleman condition holds,
Σ_k β_{2k}^{−1/(2k)} = ∞,
then F is uniquely determined by its moments.
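As a concrete instance (my addition, using standard facts), the semicircle law itself satisfies Carleman's condition: its even moments are the Catalan numbers, which grow at most like 4^k,

```latex
\beta_{2k} = C_k = \frac{1}{k+1}\binom{2k}{k} \le 4^k
\quad\Longrightarrow\quad
\sum_k \beta_{2k}^{-1/2k} \;\ge\; \sum_k 4^{-1/2} \;=\; \infty ,
```

so the semicircle distribution is uniquely determined by its moments, and moment convergence identifies the limit.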
where C_+ = {z = x + iy | y > 0}.
Inversion Formula:
F(a, b) = lim_{v→0+} (1/π) ∫_a^b Im(S(u + iv)) du
(1/m) Tr(A^{−1}) = (1/m) Σ_{j=1}^m 1/(ℓ_j − z) = s_m(z).
Overall:
s_m(z) ≈ −1 / (z + s_m(z))
Now invert: this yields precisely the semicircle law.
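Spelling out the inversion step (my own elaboration of the line above): treating the approximate relation as an equality gives a quadratic in s_m,

```latex
s^2 + z s + 1 = 0
\;\Longrightarrow\;
s(z) = \frac{-z + \sqrt{z^2 - 4}}{2}
\quad \big(\text{the branch with } s(z) \sim -1/z \text{ as } z \to \infty\big),
```

and the inversion formula then recovers the density: for |x| ≤ 2, f(x) = (1/π) lim_{v→0+} Im s(x + iv) = √(4 − x²) / (2π), the semicircle.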
For fixed m, classically: S_n → Σ = I_m, hence each ℓ_i → 1.
Example (MATLAB):
m = 100; n = 10000;   % illustrative sizes
X = randn(m,n);
S = (1/n) * X * X';
L = eig(S);
[Figure: histogram of the eigenvalues of S, spread over approximately [0.6, 1.6] rather than concentrating at 1]
Let {ℓ_i}_{i=1}^m be the eigenvalues of a random symmetric matrix H.
f_MP(t) = (1/(2πct)) √((b − t)(t − a)),  t ∈ [a, b]
where c = m/n, a = (1 − √c)², b = (1 + √c)²
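A numeric illustration of the MP support (my own sketch; the sizes are arbitrary, giving c = m/n = 1/4 and [a, b] = [0.25, 2.25]):

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 200, 800
c = m / n
a, b = (1 - np.sqrt(c)) ** 2, (1 + np.sqrt(c)) ** 2  # MP support edges

# Sample covariance of white Gaussian data (true covariance I_m).
X = rng.standard_normal((m, n))
S = X @ X.T / n
l = np.linalg.eigvalsh(S)
```

Even though the population covariance is the identity, the sample eigenvalues spread over essentially all of [a, b].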
[Wachter, 1978]
ℓ_1 → (1 + √(m/n))² with probability 1
[Gordon, 1985]
Non-asymptotic concentration-of-measure bounds for the largest eigenvalue of random matrices.
Many follow-up works on this topic...
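Wachter's limit is easy to see numerically (my own sketch; the matrix sizes are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(3)

# l1 -> (1 + sqrt(m/n))^2 with probability 1 as m, n grow proportionally.
for m, n in [(100, 400), (400, 1600)]:
    X = rng.standard_normal((m, n))
    l1 = np.linalg.eigvalsh(X @ X.T / n).max()
    limit = (1 + np.sqrt(m / n)) ** 2  # = 2.25 for m/n = 1/4
```

At the larger size the gap between ℓ_1 and the limit is already of the order of the O(n^{−2/3}) fluctuations described next.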
Universality:
Under mild moment conditions, the largest eigenvalue converges to the TW distribution.
[Soshnikov; Tao & Vu; etc.]
q'' = sq + 2q³,  q(s) ∼ Ai(s) as s → ∞
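For reference (standard Tracy–Widom formulas, not shown on the slide), this Hastings–McLeod solution q determines the limiting distributions:

```latex
F_2(s) = \exp\!\Big(-\int_s^{\infty} (x - s)\, q(x)^2 \, dx\Big),
\qquad
F_1(s) = \big(F_2(s)\big)^{1/2} \exp\!\Big(-\tfrac{1}{2}\int_s^{\infty} q(x)\, dx\Big).
```

These are the β = 2 and β = 1 cases plotted below.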
[Figure: Tracy–Widom densities for β = 1 and β = 2 on s ∈ [−5, 5]]
Large x asymptotics:
F_2(x) = 1 − (e^{−(4/3) x^{3/2}} / (16π x^{3/2})) (1 + O(x^{−3/2}))
F_1(x) = 1 − (e^{−(2/3) x^{3/2}} / (4√π x^{3/2})) (1 + O(x^{−3/2}))
σ_{nm} = √(μ_{nm}) (1/√(n − 1/2) + 1/√(m − 1/2))^{1/3}
[Figure: β = 2, p = 20, n = 500; empirical density of ℓ_1 compared with the TW density (legend: TW, ℓ_1, Ratio)]
Theorem: As p, n → ∞,
Pr[(U − μ_{n,p})/σ_{n,p} < s] ≈ F_β(s) − (1/(βnp)) (μ_{n,p}/σ_{n,p})² F_β''(s)
Σ = diag(λ_1, λ_2, ..., λ_k, 0, ..., 0) + σ² I_m
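A small simulation of this spiked model (my own sketch; k = 1, and λ_1, σ², m, n are arbitrary choices with the spike far above the detection threshold discussed later):

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 200, 800
sigma2, lam = 1.0, 5.0  # one spike; lam/sigma2 far above sqrt(m/n) = 0.5

# Data with population covariance diag(lam, 0, ..., 0) + sigma2 * I_m.
scale = np.full(m, np.sqrt(sigma2))
scale[0] = np.sqrt(lam + sigma2)
X = scale[:, None] * rng.standard_normal((m, n))
S = X @ X.T / n

l = np.sort(np.linalg.eigvalsh(S))[::-1]   # descending eigenvalues
bulk_edge = sigma2 * (1 + np.sqrt(m / n)) ** 2  # MP upper edge, 2.25 here
```

The top sample eigenvalue separates cleanly from the MP bulk (the standard spiked-model prediction places it near (λ + σ²)(1 + cσ²/λ)), while the second eigenvalue stays at the bulk edge.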
In statistical physics:
[Hoyle and Rattray; Reimann et al.; Biehl; Watson]
Asymptotic √n-Gaussian fluctuations [Paul, 07]:
√n (ℓ_1 − E[ℓ_1]) ∼ N(0, σ²(λ_1))
[Figure: phase transition of the largest eigenvalue and eigenvector as a function of the noise level σ (curves: ∆λ/10, R, R_th)]
Moment Methods
Stieltjes Transform
Concentration of Measure
Many others...
H0 = null hypothesis
H1 = alternative hypothesis
f(x1, . . . , xn) ≤ t → accept H0
Detection Power:
p(x1, . . . , xn | H1) / p(x1, . . . , xn | H0)
H0 is simple.
H1 is still not simple,
but if we consider test statistics based only on the sample eigenvalues, then it is!
p(ℓ1, . . . , ℓm | H1) / p(ℓ1, . . . , ℓm | H0) ≷ C(α)
[Figure: probability of misestimating the number of signals, Pr(K_est ≠ K), versus sample size N, comparing MDL, AIC, SCHOTT, MODIFIED-AIC, and an RMT-based estimator]
Via the largest-root test, asymptotically for m, n large, only λ/σ² > √(m/n) can be detected (with probability one)!
Via other tests one can detect weaker signals, but not with probability one.
[Onatsky et al.]
Final word of caution: Roy's largest-root test is asymptotically optimal as n → ∞. Not so for very weak signals.
A lot !
more material at
[Link]/~nadler