Modern Random Matrix Theory Overview

The document summarizes Boaz Nadler's introduction to modern random matrix theory. It begins by discussing Wigner's original 1955/1958 work using random matrix models to study quantum phenomena like energy levels in atoms. It then covers Wigner's seminal result that for large m×m Wigner matrices, the empirical spectral distribution converges almost surely to the semi-circle law. It also provides an example numerical illustration and discusses Wigner's original proof via the method of moments.


Introduction to Random Matrix Theory:

Theory and Applications


Part II

Boaz Nadler

Department of Computer Science and Applied Mathematics


The Weizmann Institute of Science

May 2013


Boaz Nadler RMT


Part II: Modern Random Matrix Theory

Starting point:
[Eugene Wigner, 1955, 1958]
[Mehta and Gaudin, 1960]
Random matrix models to study quantum phenomena.
Example: the energy levels of an atom are the eigenvalues of a Hermitian operator,
$$ H\psi_j = E_j\psi_j $$
Low-energy states can be tackled analytically. For higher energy levels, the analysis becomes intractable.

Wigner replaced the operator $H$ by a large but finite random m × m Hermitian matrix $H_m$.


Wigner Matrix

Wigner Matrix [Hermite/Gaussian ensemble]:

- m × m Hermitian matrix $W$,
- $W_{i,j}$ independent (except for the symmetry constraint), with
$$ E[W_{i,j}] = 0, \qquad E[W_{i,j}^2] = \begin{cases} \sigma^2 & i \neq j \\ 2\sigma^2 & i = j \quad (\text{but } \sigma^2 \text{ is also ok}) \end{cases} $$

Assume $E[W_{i,j}^k] < \infty$ (weakening possible).

If $\sigma^2 = 1/m$, then this is a standard Wigner matrix.


Empirical Spectral Measure

Empirical Spectral Distribution (ESD):
$$ F_m(t) = \frac{1}{m}\,\#\{\, i \mid \ell_i(W) \le t \,\} $$

Question: Does $F_m$ converge to a limiting $F$ as m → ∞?


Wigner’s Semi-Circle Law

Theorem: As m → ∞, for a standard Wigner matrix,
$$ dF_m(x) \to \frac{1}{2\pi}\sqrt{4 - x^2}\,dx, \qquad x \in [-2, 2] $$

Numerical Illustration:

    X = randn(m,m);            % i.i.d. N(0,1) entries
    W = (X + X') / sqrt(2*m);  % symmetrize; off-diagonal variance 1/m, diagonal 2/m
    L = eig(W);                % eigenvalues of W
    histL = hist(L,x);         % x: vector of bin centers, e.g. x = linspace(-2,2,64)


Wigner’s Semi-Circle Law

[Figure: histogram of the eigenvalues; simulations: 1000, m = 400, Nbins = 64; horizontal axis from −2 to 2, vertical axis from 0 to 0.35.]


Wigner’s Semi-Circle Law

Wigner’s original proof = method of moments.

Instead of studying the empirical spectral distribution $F_m$, study its moments
$$ \int x^k \, dF_m(x) $$

Prove that the moments converge to some limits.

Find the distribution corresponding to these limiting moments.


Method of Moments / Wigner Matrix

Simple Observation:
$$ \beta_{m,k} = \int x^k \, dF_m(x) = \frac{1}{m}\sum_i \ell_i^k = \frac{1}{m}\mathrm{Tr}[W^k]. $$

Since the $W_{ij}$ are independent with zero mean, many entries in $W^k$ have zero mean as well!


Method of Moments / Wigner Matrix

Example:
$$ \frac{1}{m}\mathrm{Tr}[W] = \frac{1}{m}\sum_{i=1}^{m} W_{ii} $$
Hence
$$ E\,\mathrm{Tr}[W] = 0. $$
Next, $(\mathrm{Tr}[W])^2 = \sum_{i,j} W_{ii}W_{jj}$, and the cross terms with $i \neq j$ have zero mean by independence.
Hence,
$$ E\Big[\big(\tfrac{1}{m}\mathrm{Tr}[W]\big)^2\Big] = \frac{1}{m^2}\sum_i E[W_{ii}^2] = \frac{1}{m^2}\cdot m\cdot\frac{2}{m} = \frac{2}{m^2} $$

Conclusion: The first moment $\beta_{m,1}$ has zero mean, and its variance tends to zero as m → ∞. Thus
$$ \beta_{m,1} \to 0 \quad \text{as } m \to \infty $$


Method of Moments / Wigner Matrix

Second Moment:
$$ \mathrm{Tr}[W^2] = \sum_i (W^2)_{ii} = \sum_i \sum_j W_{ij}^2 $$
Hence, to leading order in m,
$$ E\Big[\frac{1}{m}\mathrm{Tr}[W^2]\Big] = \frac{1}{m}\cdot\frac{1}{m}\cdot m^2 = 1. $$

More complicated calculation:
$$ \mathrm{Var}\Big[\frac{1}{m}\mathrm{Tr}[W^2]\Big] \to 0 \quad \text{as } m \to \infty $$
Conclusion:
$$ \beta_{m,2} \to 1 $$
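
The convergence of the low-order moments can be checked numerically without an eigenvalue solver, using only $\beta_{m,k} = \frac{1}{m}\mathrm{Tr}[W^k]$. The following is a minimal sketch in plain Python (standard library only). The helper names `wigner_matrix`, `matmul` and `trace_moment`, the seed and the size m = 80 are illustrative assumptions, not part of the slides.

```python
import random

def wigner_matrix(m, rng):
    """Real symmetric standard Wigner matrix: off-diagonal variance 1/m, diagonal variance 2/m."""
    w = [[0.0] * m for _ in range(m)]
    for i in range(m):
        w[i][i] = rng.gauss(0.0, (2.0 / m) ** 0.5)
        for j in range(i + 1, m):
            w[i][j] = w[j][i] = rng.gauss(0.0, (1.0 / m) ** 0.5)
    return w

def matmul(a, b):
    """Plain O(m^3) product of two square matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def trace_moment(w, k):
    """beta_{m,k} = (1/m) Tr[W^k], computed by repeated multiplication (k >= 1)."""
    p = w
    for _ in range(k - 1):
        p = matmul(p, w)
    return sum(p[i][i] for i in range(len(w))) / len(w)

rng = random.Random(0)        # fixed seed, assumed here only for reproducibility
W = wigner_matrix(80, rng)
beta2 = trace_moment(W, 2)    # theory: tends to 1 as m grows
beta4 = trace_moment(W, 4)    # theory: tends to 2 as m grows
print(beta2, beta4)
```

For finite m both values deviate from their limits by terms of order 1/m, so only rough agreement should be expected at this size.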


Method of Moments / Wigner Matrix

General Moments:
$$ (W^k)_{11} = \sum_{i_1,\dots,i_{k-1}} W_{1,i_1} W_{i_1 i_2} \cdots W_{i_{k-1} 1} $$

For odd k = 2r + 1, in every product at least one entry appears an odd number of times, so (e.g. for Gaussian entries)
$$ E[\mathrm{Tr}(W^{2r+1})] = 0. $$

For even k = 2r, each entry must appear an even number of times for a non-zero mean.
If each entry appears exactly twice, the mean is $(1/m)^{k/2}$.
The exact number of such terms is a combinatorial graph enumeration problem.


Method of Moments / Wigner Matrix

Summary:
$$ \frac{1}{m} E\,\mathrm{tr}[W^{2k+1}] = 0, \qquad \frac{1}{m}\mathrm{tr}[W^{2k}] \to \text{Catalan numbers} $$

The distribution with these moments is the semi-circle law.
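
The link between the Catalan numbers and the semi-circle law can be checked directly: the even moments of the density $\sqrt{4 - x^2}/(2\pi)$ on [−2, 2] should equal $C_k = \binom{2k}{k}/(k+1)$. A small sketch in plain Python, using a midpoint rule for the integral; the function names and the number of integration steps are assumptions made for this illustration.

```python
import math

def catalan(k):
    """k-th Catalan number C_k = binom(2k, k) / (k + 1)."""
    return math.comb(2 * k, k) // (k + 1)

def semicircle_moment(p, steps=20000):
    """Midpoint-rule approximation of the p-th moment of sqrt(4 - x^2) / (2 pi) on [-2, 2]."""
    h = 4.0 / steps
    total = 0.0
    for i in range(steps):
        x = -2.0 + (i + 0.5) * h
        total += x ** p * math.sqrt(4.0 - x * x) / (2.0 * math.pi)
    return total * h

# Compare the 2k-th moment with the k-th Catalan number for the first few k.
for k in range(5):
    print(k, catalan(k), round(semicircle_moment(2 * k), 4))
```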


Universality

The limiting distribution depends only on the underlying moments.

Let W be a standardized Wigner matrix whose entries come from any distribution with finite moments.
Then, as m → ∞, the empirical distribution of the eigenvalues of W converges to the semi-circle law.

Example: Matrix with entries ±1

    X = randn(m,m);
    W = sign(X + X');     % symmetric matrix with entries +1 or -1
    W = W / sqrt(m);      % entries now have variance 1/m
    L = eig(W);


Universality / Semi-Circle Law

[Figure: histogram of the eigenvalues for binomial (±1) entries; iter: 100, m = 400, Nbins = 64; horizontal axis from −2 to 2, vertical axis from 0 to 0.35.]


Method of Moments

Applicable to other, more complicated random matrix models.

Mathematical Foundations:
Lemma 1: Moments uniquely determine a distribution.
Precisely, let F be a distribution with finite moments $\beta_k = \int x^k \, dF(x)$.
If the Carleman condition holds,
$$ \sum_k \beta_{2k}^{-1/(2k)} = \infty, $$
then F is uniquely determined by $\{\beta_k\}$.
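
As a worked check (added here as an illustration, using the standard bound $\binom{2k}{k} \le 4^k$), the Carleman condition holds for the semi-circle law, whose even moments are the Catalan numbers $C_k$:

```latex
\beta_{2k} = C_k = \frac{1}{k+1}\binom{2k}{k} \le 4^k
\quad\Longrightarrow\quad
\beta_{2k}^{-1/(2k)} \ge 4^{-1/2} = \tfrac{1}{2}
\quad\Longrightarrow\quad
\sum_{k \ge 1} \beta_{2k}^{-1/(2k)} = \infty .
```

Every term of the series is at least 1/2, so the sum diverges and the semi-circle law is uniquely determined by its moments.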


Method of Moments

Lemma 2: Under some technical conditions,
convergence of moments → convergence of distributions.


The Stieltjes Transform

Very useful tool in RMT.

Definition: for a probability density function $f(x)$,
$$ S_f(z) = \int_{-\infty}^{\infty} \frac{1}{x - z}\, f(x)\,dx, \qquad z \in \mathbb{C}^+ $$
where $\mathbb{C}^+ = \{ z = x + iy \mid y > 0 \}$.

Inversion Formula:
$$ F(a, b) = \frac{1}{\pi}\lim_{v \to 0^+} \int_a^b \mathrm{Im}\big(S(u + iv)\big)\,du $$


The Stieltjes Transform

Lemma: convergence of Stieltjes transforms → convergence of distributions.
Precisely:
- $F_n$ a sequence of probability distributions,
- $s_n(z)$ their Stieltjes transforms.
If $s_n(z) \to s(z)$ for all $z \in \mathbb{C}^+$ and $\lim_{v \to \infty} iv\,s(iv) = -1$,
then $F_n \to F$, the distribution whose Stieltjes transform is $s$.


The Stieltjes Transform

Matrix Inversion Lemma:
$$ \mathrm{Tr}(A^{-1}) = \sum_{k=1}^{n} \frac{1}{a_{kk} - \alpha_k^T A_k^{-1} \alpha_k} $$
where $\alpha_k$ is the k-th row of A with the k-th element removed, and
$A_k$ is the k-th minor of A (remove the k-th row and column).
Take $A = W_m - zI_m$. Then
$$ \frac{1}{m}\mathrm{Tr}(A^{-1}) = \frac{1}{m}\sum_{j=1}^{m} \frac{1}{\ell_j - z} = s_m(z). $$
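
A worked 2 × 2 instance makes the matrix inversion lemma concrete. This example is an illustration added here, assuming a symmetric matrix with $a, d \neq 0$ and $ad \neq b^2$:

```latex
A = \begin{pmatrix} a & b \\ b & d \end{pmatrix}, \qquad
A^{-1} = \frac{1}{ad - b^2}\begin{pmatrix} d & -b \\ -b & a \end{pmatrix}, \qquad
\mathrm{Tr}(A^{-1}) = \frac{a + d}{ad - b^2}.
% Lemma with n = 2: \alpha_1 = b, A_1 = d and \alpha_2 = b, A_2 = a
\sum_{k=1}^{2} \frac{1}{a_{kk} - \alpha_k^T A_k^{-1} \alpha_k}
= \frac{1}{a - b^2/d} + \frac{1}{d - b^2/a}
= \frac{d}{ad - b^2} + \frac{a}{ad - b^2}
= \frac{a + d}{ad - b^2}.
```

Both routes give the same trace, as the lemma states.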


The Stieltjes Transform

$\alpha_k$ = vector of size (m − 1) with i.i.d. zero-mean entries of variance 1/m (unit variance before the $1/\sqrt{m}$ scaling of a standard Wigner matrix).
$\alpha_k$ is independent of $A_k$ and hence of $A_k^{-1}$.
The quadratic form $\alpha_k^T A_k^{-1} \alpha_k$ concentrates around its mean value, which is
$$ \frac{1}{m}\mathrm{Tr}\big[(W_m^{(k)} - zI_m^{(k)})^{-1}\big] \approx s_m(z). $$
$a_{kk} \approx -z$.

Overall:
$$ s_m(z) \approx \frac{-1}{z + s_m(z)} $$
Now invert: this gives precisely the semi-circle law.
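
The last step can be made explicit: $s = -1/(z + s)$ is the quadratic $s^2 + zs + 1 = 0$, and the root with positive imaginary part is the Stieltjes transform of the limit. The sketch below, in plain Python, solves the quadratic and applies the inversion formula pointwise with a small imaginary part. The function names and the value v = 1e-8 are assumptions chosen for this illustration.

```python
import cmath
import math

def semicircle_stieltjes(z):
    """Root of s^2 + z*s + 1 = 0 (equivalently s = -1/(z + s)) with positive imaginary part."""
    root = cmath.sqrt(z * z - 4.0)
    s_plus, s_minus = (-z + root) / 2.0, (-z - root) / 2.0
    return s_plus if s_plus.imag > 0 else s_minus

def density_from_stieltjes(x, v=1e-8):
    """Pointwise inversion: f(x) is approximately Im s(x + i v) / pi for small v > 0."""
    return semicircle_stieltjes(complex(x, v)).imag / math.pi

# Compare with the semi-circle density sqrt(4 - x^2) / (2 pi) at a few points.
for x in (-1.5, 0.0, 1.0):
    exact = math.sqrt(4.0 - x * x) / (2.0 * math.pi)
    print(x, density_from_stieltjes(x), exact)
```

The two roots of the quadratic multiply to 1, so for z in the upper half plane exactly one of them has positive imaginary part, which is why the selection rule in `semicircle_stieltjes` is unambiguous.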


The Quarter Circle Law

[Marchenko and Pastur, 1967]

- Let X be an m × n random matrix.
- $X_{ij}$ zero mean, unit variance.
- Look at the sample covariance matrix
$$ S_n = \frac{1}{n} X X^T $$
- its eigenvalues $\ell_1, \dots, \ell_m$.

For fixed dimension m, as n → ∞,
$$ S_n \to \Sigma = I_m \quad\Longrightarrow\quad \ell_i \to 1 $$

Question: Eigenvalue spread as m, n → ∞?


Simulation: Eigenvalue Spread

Example:

    X = randn(m,n);      % m x n data matrix with i.i.d. N(0,1) entries
    S = (X * X') / n;    % sample covariance matrix
    L = eig(S);          % its m eigenvalues


Simulation: Eigenvalue Spread

[Figures: three histograms of the sample eigenvalues, each with iter: 5000, m = 25, Nbins = 64, for n = 5000, n = 1000 and n = 500; horizontal axis from 0.6 to 1.6.]


Spread of Sample Eigenvalues

Let $\{\ell_i\}_{i=1}^{m}$ be the eigenvalues of a random symmetric matrix H.

Empirical Spectral Distribution Function:
$$ F_m(t) = \frac{1}{m}\,\#\{ \ell_i \le t \} $$


The Quarter-Circle Law

[Marchenko & Pastur, 1967]

Let $H \sim W_m(n, \Sigma)$.
Theorem: For Σ = I, as m, n → ∞ with m/n → c (c < 1), let $\ell_i$ be the sample eigenvalues of H/n. Then their empirical density converges to
$$ f_{MP}(t) = \frac{1}{2\pi c\,t}\sqrt{(b - t)(t - a)}, \qquad t \in [a, b] $$
where $a = (1 - \sqrt{c})^2$, $b = (1 + \sqrt{c})^2$.

If c > 1, the spectrum extends down to zero: there are m − n sample eigenvalues exactly at zero.
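
As a sanity check on the Marchenko-Pastur density, the sketch below computes the support edges for a given ratio c and verifies numerically that the density has total mass 1 and mean 1 when Σ = I and c < 1. It is plain Python with a midpoint rule; the helper names and the example value c = 0.25 are assumptions for illustration only.

```python
import math

def mp_edges(c):
    """Support edges a = (1 - sqrt(c))^2 and b = (1 + sqrt(c))^2 for the ratio c = m/n."""
    return (1.0 - math.sqrt(c)) ** 2, (1.0 + math.sqrt(c)) ** 2

def mp_density(t, c):
    """Marchenko-Pastur density for Sigma = I; zero outside [a, b]."""
    a, b = mp_edges(c)
    if t <= a or t >= b:
        return 0.0
    return math.sqrt((b - t) * (t - a)) / (2.0 * math.pi * c * t)

def integrate(func, lo, hi, steps=20000):
    """Midpoint rule on [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (i + 0.5) * h) for i in range(steps))

c = 0.25                       # assumed example ratio m/n
a, b = mp_edges(c)
mass = integrate(lambda t: mp_density(t, c), a, b)       # total probability
mean = integrate(lambda t: t * mp_density(t, c), a, b)   # mean sample eigenvalue
print(a, b, mass, mean)
```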


Extensions

Non-identity Covariance Matrices: $X \sim W_m(n, \Sigma_m)$ where the eigenvalue density of $\Sigma_m$ converges to a limiting distribution.
Then, as n, m → ∞, the empirical density of $\frac{1}{n}XX'$ converges to a limit. Its Stieltjes transform can be described in terms of that of the limiting Σ.

Double Wishart Matrices:
Let $A \sim W_m(n_1, \Sigma)$ and let $B \sim W_m(n_2, \Sigma)$.
As $m, n_1, n_2 \to \infty$, the eigenvalue distribution of $A^{-1}B$ converges to a limiting Marchenko-Pastur type distribution.

Remark: The eigenvalues of $A^{-1}B$ are the same as those of $(\Sigma^{-1}A)^{-1}(\Sigma^{-1}B)$, so the result is independent of Σ.
Can choose Σ = I.


Largest Eigenvalue of Random Matrices

Let $X_{m \times n}$ be random, $X_{ij} \sim N(0, 1)$.

Marchenko-Pastur law: limiting spectral density of the eigenvalues of $\frac{1}{n}XX'$.

However, how large can the largest eigenvalue be?

Question: Bound on $\ell_1 = \| \frac{1}{n} X X^T \|_2$ ?
Is it true that $\ell_1 \to (1 + \sqrt{m/n})^2$ ?


Largest Eigenvalue of Random Matrices

[Wachter, 1978]
$$ \ell_1 \to (1 + \sqrt{m/n})^2 \quad \text{with probability 1} $$

[Gordon 1985]
Non-asymptotic concentration of measure bounds for the largest eigenvalue of random matrices.
Many follow-up works on this topic ...


Distribution of Largest Eigenvalue

[complex case, Johansson, '00]
[real case, Johnstone, '01]
Single Wishart Case:
$$ \Pr[\ell_1 < \mu_{nm} + \sigma_{nm} s] \to F_\beta(s) $$
where $F_\beta$ is the Tracy-Widom distribution of order β.

Universality:
under mild moment conditions, the largest eigenvalue converges to the TW distribution
[Soshnikov, Tao & Vu, etc.]


Tracy-Widom Distribution

[Tracy & Widom, 1994, 1996]

$$ F_2(s) = \exp\left( -\int_s^\infty (x - s)\, q^2(x)\,dx \right) $$
$$ F_1(s)^2 = F_2(s)\,\exp\left( -\int_s^\infty q(x)\,dx \right) $$

where q is the solution to the Painlevé-II non-linear 2nd order differential equation
$$ q'' = s q + 2q^3, \qquad q(s) \sim \mathrm{Ai}(s) \ \text{as } s \to \infty $$

q and $F_\beta$ are somewhat tricky to compute. Several packages.
Folkmar Bornemann (by request) - matlab code.


Tracy-Widom Distributions

[Figure: Tracy-Widom densities for β = 1 and β = 2; horizontal axis s from −5 to 5, vertical axis density from 0 to 0.5.]


Tracy-Widom Distributions

Large x asymptotics:
$$ F_2(x) = 1 - \frac{e^{-\frac{4}{3}x^{3/2}}}{16\pi x^{3/2}}\left(1 + O(x^{-3/2})\right) $$
$$ F_1(x) = 1 - \frac{e^{-\frac{2}{3}x^{3/2}}}{4\sqrt{\pi}\, x^{3/2}}\left(1 + O(x^{-3/2})\right) $$


Second Order Accuracy
[complex case, El-Karoui, 07]
[real case, Ma, Johnstone and Ma]
For Wishart matrices,
with a careful choice of centering and scaling parameters, as
n, m → ∞ with n/m → γ,
$$ \left| \Pr[\ell_1 < \mu_{nm} + \sigma_{nm} s] - F_\beta(s) \right| \le C e^{-cs}\, m^{-2/3} $$

For real valued data
$$ \mu_{nm} = \left( \sqrt{n - \tfrac{1}{2}} + \sqrt{m - \tfrac{1}{2}} \right)^2 $$
$$ \sigma_{nm} = \sqrt{\mu_{nm}} \left( \frac{1}{\sqrt{n - \tfrac{1}{2}}} + \frac{1}{\sqrt{m - \tfrac{1}{2}}} \right)^{1/3} $$
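
The centering and scaling parameters for real valued data are simple to evaluate. A minimal sketch in plain Python follows; the function name `tw_center_scale` and the example sizes n = 500, m = 25 are assumptions chosen for illustration.

```python
import math

def tw_center_scale(n, m):
    """Centering mu_nm and scaling sigma_nm for real valued data, as defined on this slide."""
    rn, rm = math.sqrt(n - 0.5), math.sqrt(m - 0.5)
    mu = (rn + rm) ** 2
    sigma = math.sqrt(mu) * (1.0 / rn + 1.0 / rm) ** (1.0 / 3.0)
    return mu, sigma

mu, sigma = tw_center_scale(n=500, m=25)   # assumed example sizes
print(mu, sigma, sigma / mu)               # sigma / mu: relative size of the fluctuations
```

Both parameters are symmetric in n and m, so swapping the sample size and the dimension leaves them unchanged.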


Largest Eigenvalue Fluctuations

For $\ell_1/m$, fluctuations are $O(1/m^{2/3})$, much smaller than the standard $1/\sqrt{m}$ fluctuations for averages of many random variables.

TW is a new "universal" distribution: an attractor for the largest eigenvalue, due to universality.


Distribution of the Largest Eigenvalue Divided By Trace

In various settings, interest is in the following ratio:
$$ U = \frac{\ell_1}{\frac{1}{p}\sum_{j=1}^{p} \ell_j} > \text{threshold} $$

This random variable plays a role in several different applications:

- Signal Detection [Besson & Scharf '06, Kritchman & N. '08, Bianchi et al. '09]
- Two-way models of interaction [Johnson & Graybill, '72]
- Models for Quantum Information Channels.


Ratio of Largest Eigenvalue to Trace

In principle, as p, n → ∞, [Nechita '08, Bianchi et al. '09]
$$ \Pr\left[ \frac{U - \mu_{n,p}}{\sigma_{n,p}} < s \right] \to TW_\beta(s) $$


Ratio Distribution

[Figure: densities for β = 2, p = 20, n = 500, with curves labeled TW, ℓ1 and Ratio; horizontal axis s from −5 to 2, vertical axis density from 0 to 0.5.]


Ratio of Largest Eigenvalue to Trace

Theorem: As p, n → ∞,
$$ \Pr\left[ \frac{U - \mu_{n,p}}{\sigma_{n,p}} < s \right] \approx F_\beta(s) - \frac{1}{\beta n p}\left( \frac{\mu_{n,p}}{\sigma_{n,p}} \right)^2 F_\beta''(s) $$
[N., J. Mult. Anal., 2011]

This distribution is also relevant to the study of the performance of various detection methods
[N. IEEE Tran. Sig. Proc. '10]
[N. Penna and Garello Int. Conf. Comm. '11]


Part III:
Signal Bearing Matrices


Spiked Covariance Models

Consider the model whereby
$$ \Sigma = \mathrm{diag}(\lambda_1, \lambda_2, \dots, \lambda_k, 0, \dots, 0) + \sigma^2 I_m $$
Spiked covariance with k spikes.

Observe n vectors $x_i \in \{\mathbb{R}, \mathbb{C}\}^m$ from this model.

Question: What happens to the largest sample eigenvalues and eigenvectors as n, m → ∞, with k and $\lambda_j$ fixed?


Phase Transition

[complex case, Ben-Arous, Baik, Peche]
[real case, Baik and Silverstein]
Theorem: For the spike model with k spikes, as n, m → ∞ with m/n → c, for j = 1, . . . , k,
$$ \ell_j \to \begin{cases} (\lambda_j + \sigma^2)\left( 1 + \dfrac{m - k}{n}\,\dfrac{\sigma^2}{\lambda_j} \right) & \lambda_j > \sigma^2\sqrt{m/n} \\[2ex] \sigma^2 \left( 1 + \sqrt{m/n} \right)^2 & \lambda_j < \sigma^2\sqrt{m/n} \end{cases} $$

This phenomenon is known as retarded learning in statistical physics.
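
The two regimes of the phase transition are easy to tabulate. The sketch below, in plain Python, evaluates the limiting sample eigenvalue as a function of the spike strength; it assumes that (m − k)/n is replaced by its limit c = m/n, and the function name and example values are illustrative only.

```python
import math

def spiked_eigenvalue_limit(lam, sigma2, c):
    """Limit of the sample eigenvalue for spike lam, noise variance sigma2 and ratio c = m/n."""
    if lam > sigma2 * math.sqrt(c):
        return (lam + sigma2) * (1.0 + c * sigma2 / lam)   # above the phase transition
    return sigma2 * (1.0 + math.sqrt(c)) ** 2              # stuck at the edge of the noise bulk

# With sigma2 = 1 and c = 0.25 the threshold is lam = 0.5.
for lam in (0.2, 0.5, 1.0, 2.0):
    print(lam, spiked_eigenvalue_limit(lam, sigma2=1.0, c=0.25))
```

The two branches agree at the threshold $\lambda = \sigma^2\sqrt{c}$, so the limit is continuous in the spike strength.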


Phase Transition / Eigenvectors

[D. Paul '07, Nadler '08]

Theorem: As m, n → ∞,
$$ R^2(m/n) = |\langle v_{PCA}, v \rangle|^2 = \begin{cases} 0 & \text{if } \lambda < \sigma^2\sqrt{m/n} \\[1ex] \dfrac{\dfrac{n\lambda^2}{m\sigma^4} - 1}{\dfrac{n\lambda^2}{m\sigma^4} + \dfrac{\lambda}{\sigma^2}} & \text{if } \lambda > \sigma^2\sqrt{m/n} \end{cases} $$

In statistical physics:
[Hoyle and Rattray, Reimann et al., Biehl, Watson]

Asymptotic $\sqrt{n}$-Gaussian fluctuations [Paul, 07]
$$ \sqrt{n}\,(\ell_1 - E[\ell_1]) \sim N(0, \sigma^2(\lambda_1)) $$
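
The eigenvector overlap can be evaluated in the same way as the eigenvalue limit. A minimal sketch in plain Python, writing the ratio n/m as 1/c; the function name and the example values are assumptions for illustration.

```python
import math

def pca_overlap_squared(lam, sigma2, c):
    """Limiting R^2 = |<v_PCA, v>|^2 for spike lam, noise variance sigma2 and ratio c = m/n."""
    if lam <= sigma2 * math.sqrt(c):
        return 0.0                                # below the transition: no correlation
    ratio = lam ** 2 / (c * sigma2 ** 2)          # n lam^2 / (m sigma^4)
    return (ratio - 1.0) / (ratio + lam / sigma2)

# With sigma2 = 1 and c = 0.25 the overlap is zero up to lam = 0.5 and then grows towards 1.
for lam in (0.2, 0.5, 1.0, 4.0):
    print(lam, pca_overlap_squared(lam, sigma2=1.0, c=0.25))
```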


Phase Transition for finite p as function of σ

First, a "thought experiment": take a training set $\{x_\nu\}$ with finite m, n and start increasing σ. What should be the expected behavior of $R = \langle v_{PCA}, v \rangle$ and of $\ell_1$?

[Figure, two panels, both with σ from 0 to 3 on the horizontal axis. Left: λ1, λ2 as a function of σ (vertical axis from 0 to 80), annotated $\lambda \sim \kappa^2 + \sigma^2(1 + m/n)$. Right: $\langle v_k, e_1 \rangle$ (vertical axis from 0 to 1), annotated $R \sim 1 - (\sigma^2/\kappa^2)\, m/n$. Parameters: n = 50, m = 200, κ² = 7.87.]


Phase Transition as function of σ

[Figure: curves labeled ∆λ / 10, R and Rth as functions of σ; horizontal axis σ from 0 to 3, vertical axis from 0 to 1.]


Tools in Random Matrix Theory

Moment Methods
Stieltjes Transform
Concentration of Measure
Many others...



Part IV:
Signal Detection / Cognitive Radio Application



Signal Detection Model

$\{x_i\}_{i=1}^{n}$ are n i.i.d. observations from
$$ x = \lambda s(t) h + \sigma \xi(t) $$


Signal Detection as Hypothesis Testing

Given $\{x_i\}$, decide between

H0 : no signal, λ = 0   vs.   H1 : signal present, λ > 0

H0 = null hypothesis.
H1 = alternative.


Test Statistics

Test statistic: a function $f(x_1, \dots, x_n) \to \mathbb{R}$ such that

$f(x_1, \dots, x_n) \le t$ → accept H0

$f(x_1, \dots, x_n) > t$ → reject H0


False Alarm and Power

Two common quantities of interest:

False Alarm Rate:
$$ P_{FA} = \Pr[f(x_1, \dots, x_n) > t \mid H_0 \text{ true}] $$

Detection Power:
$$ P_D = \Pr[f(x_1, \dots, x_n) > t \mid H_1 \text{ true}] $$

Typically, choose the threshold t = t(α) such that $P_{FA} \approx \alpha$ for some α ≪ 1.
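
When the null distribution of a test statistic is not available in closed form, the threshold t(α) can be calibrated by simulation. The sketch below, in plain Python, does this for an energy-type statistic and then estimates the false alarm rate and the detection power. Everything specific in it is an assumption made for illustration: the toy H1 model (a Gaussian rank-one signal along the first coordinate), the sizes n = 20 and m = 4, the number of trials and the helper names.

```python
import random

def energy_statistic(samples):
    """Average energy of the observations, i.e. Tr(S_n)/m for n vectors of dimension m."""
    n, m = len(samples), len(samples[0])
    return sum(v * v for x in samples for v in x) / (n * m)

def draw(n, m, signal_var, rng):
    """n i.i.d. vectors: unit-variance noise plus an assumed rank-one signal on the first axis."""
    out = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(m)]
        if signal_var > 0.0:
            x[0] += rng.gauss(0.0, signal_var ** 0.5)
        out.append(x)
    return out

def calibrate_threshold(alpha, trials, n, m, rng):
    """Empirical (1 - alpha) quantile of the statistic under H0."""
    stats = sorted(energy_statistic(draw(n, m, 0.0, rng)) for _ in range(trials))
    return stats[int(round((1.0 - alpha) * trials)) - 1]

def exceed_rate(t, trials, n, m, signal_var, rng):
    """Fraction of trials in which the statistic exceeds the threshold t."""
    hits = sum(energy_statistic(draw(n, m, signal_var, rng)) > t for _ in range(trials))
    return hits / trials

rng = random.Random(1)                          # fixed seed, assumed for reproducibility
n, m, alpha = 20, 4, 0.05
t = calibrate_threshold(alpha, 1500, n, m, rng)
p_fa = exceed_rate(t, 1500, n, m, 0.0, rng)     # estimated false alarm rate on fresh H0 data
p_d = exceed_rate(t, 1500, n, m, 2.0, rng)      # estimated detection power, signal variance 2
print(t, p_fa, p_d)
```

The threshold is calibrated and evaluated on independent draws, so the estimated false alarm rate is an honest check of the calibration rather than a restatement of it.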


Optimal Tests

A test is optimal at level α if
$$ P_D[f] \ge P_D[\text{any other test}] $$
with the same false alarm rate α.

Neyman-Pearson Lemma: If both H0 and H1 are simple hypotheses, then the optimal test statistic is the likelihood ratio
$$ \frac{p(x_1, \dots, x_n \mid H_1)}{p(x_1, \dots, x_n \mid H_0)} $$


Test Statistics for Signal Detection / Cognitive Radio

If σ is known, then H0 is simple.
However, H1 is not simple.
The Neyman-Pearson Lemma is not directly applicable.

In practice: many test statistics were proposed, all based on the eigenvalues of the sample covariance matrix $S_n$.

    Energy Detection:     Tr(S_n)         known σ
    Largest Eigenvalue:   ℓ1              known σ
    Max/Min Ratio:        ℓ1/ℓp           unknown σ
    GLRT:                 ℓ1/Tr(S_n)      unknown σ
    etc.


Optimal Test for Signal Detection ?

Given so many tests, which one to choose?

Mathematical Principle: Look at the simplest possible setting.

σ - noise level, known.
λ - signal strength, known.

H0 is simple.
H1 is still not simple, but if we consider test statistics based only on the sample eigenvalues, then it is!


Which Test Statistic to use ?

Consider the case of two (nearly) simple hypotheses

H0 : Σ = I   vs.   H1 : W′ΣW = I + diag(λ, 0, . . . , 0)

with λ known. What is unknown is the basis which makes Σ diagonal in H1.
Suppose we use only the eigenvalues $\{\ell_j\}$ of H as a test statistic.

Neyman-Pearson: the optimal method is the likelihood ratio test
$$ \frac{p(\ell_1, \dots, \ell_m \mid H_1)}{p(\ell_1, \dots, \ell_m \mid H_0)} \gtrless C(\alpha) $$


Which Test Statistic ?

From multivariate analysis (Muirhead '78)
$$ p(\ell_1, \dots, \ell_m \mid \Sigma) = C_{n,m} \prod_i \ell_i^{(n-m-1)/2} \prod_{i<j} (\ell_i - \ell_j)\; {}_0F_0\!\left( -\tfrac{1}{2} n L,\ \Sigma^{-1} \right) $$
${}_0F_0$ - hypergeometric function with matrix argument.

Key point: asymptotically in sample size n, for dimension m fixed,
$$ \log\left( \frac{p(\ell_1, \dots, \ell_m \mid H_1)}{p(\ell_1, \dots, \ell_m \mid H_0)} \right) \approx n(\ell_1 - h(\lambda)) + O\Big( \sum c_{1j}/(\ell_1 - \ell_j) \Big) $$

Asymptotically, as n → ∞, should only look at the largest eigenvalue!

Roy’s Largest Root Test
[Kritchman & N., IEEE-TSP, '09]


Using Roy’s Largest Root Test

[Figure: Pr(K_est ≠ K) as a function of N (from 0 to 5000) for p = 25, K = 2, λ = [1 0.4], with curves labeled MDL, AIC, SCHOTT, MODIFIED−AIC and RMT; vertical axis from 0 to 1.]


Properties of Roy’s Test

Detection Power of Roy’s test

via Random Matrix Theory

Via the largest root test, asymptotically for m, n large, only signals with $\lambda/\sigma^2 > \sqrt{m/n}$ can be detected (with probability one)!
Via other tests, one can detect (but not with probability one) weaker signals
[Onatski et al.]
Final word of caution: Roy’s largest root test is asymptotically optimal when n → ∞. Not so for very weak signals.


What we did not cover

A lot!

- Free Probability (Voiculescu)
- Determinantal Processes
- Kernel random matrices
- Random matrices with heavy tailed distributions
- Random matrices with dependent (correlated) entries
- Random Graphs
- Other transforms
- Concentration of Measure (non-asymptotic finite sample bounds)
- Relation to statistical physics
- Rate of convergence
- Linear spectral statistics, Central Limit Theorems
- etc.


Some (Recent) References

much more material can be found in:


Multivariate Statistics:
- A. T. James, Distributions of matrix variates and latent roots derived
from normal samples, Ann. Math. Statist., vol. 35, 475-501, 1964.
- T.W. Anderson, An introduction to multivariate statistical analysis,
Wiley, 2003.
- R.J. Muirhead, Aspects of Multivariate Statistical Theory, 2005.
Random Matrices / Recent Books:
- G. Anderson, A. Guionnet, O. Zeitouni, An Introduction to Random
Matrices, Cambridge, 2009.
- A. Tulino, S. Verdu, Random Matrix Methods and Wireless
Communications, 2011.
- Z. D. Bai, J. W. Silverstein, Spectral Analysis of Large Dimensional
Random Matrices, Springer, 2009.



Some (Recent) References

Largest Eigenvalue Distribution:


- I.M. Johnstone, On the distribution of the largest eigenvalue in principal
component analysis, Ann. Stat., 29:295–327, 2001.
- I.M. Johnstone, Approximate Null Distribution of the Largest Root in
Multivariate Analysis, Ann. Applied Stat, 2009
- N. El Karoui. A rate of convergence result for the largest eigenvalue of
complex white Wishart matrices. Annals of Probability, 2006.
- B. Nadler, On the distribution of the ratio of the largest eigenvalue to
the trace of a Wishart matrix, J. Mult. Anal., 2010.



Some (Recent) References

Largest Eigenvalue Signal + Noise:


- J. Baik, G. Ben-Arous, S. Peche, Phase transition of the largest
eigenvalue for non-null complex sample covariance matrices, Ann. Prob.,
2005.
- J. Baik, J. W. Silverstein, Eigenvalues of large sample covariance
matrices of spiked population models, J. Mult. Anal., 2006.
- D. Paul, Asymptotics of sample eigenstructure for a large dimensional
spiked covariance model, Stat. Sinica, 2007.
- B. Nadler, Finite sample approximation results for PCA, Ann. Stat.,
2008.
- R.R. Nadakuditi and J. W. Silverstein. Fundamental Limit of Sample
Generalized Eigenvalue Based Detection of Signals in Noise Using
Relatively Few Signal-Bearing and Noise-Only Samples. IEEE Journal of
Selected Topics in Signal Processing, 2010.
- I. M. Johnstone, B. Nadler, Tech. Report, Stanford University, 2011.



The End

Thank you for your attention!

more material at
[Link]/∼nadler
