
Applied Mathematics 205

Unit V: Eigenvalue Problems

Lecturer: Dr. David Knezevic


Unit V: Eigenvalue Problems

Chapter V.3: Algorithms for Eigenvalue Problems

2 / 44
Power Method

3 / 44
Power Method

The power method is perhaps the simplest eigenvalue algorithm

It finds the eigenvalue of A ∈ Cn×n with largest modulus

1: choose x0 ∈ Cn arbitrarily
2: for k = 1, 2, . . . do
3: xk = Axk−1
4: end for

Question: How does this algorithm work?

4 / 44
Power Method
Assuming A is nondefective, the eigenvectors v_1, v_2, . . . , v_n provide a basis for C^n

Therefore there exist coefficients α_j such that x_0 = Σ_{j=1}^n α_j v_j

Then, we have

x_k = A x_{k−1} = A² x_{k−2} = · · · = A^k x_0
    = A^k ( Σ_{j=1}^n α_j v_j ) = Σ_{j=1}^n α_j A^k v_j
    = Σ_{j=1}^n α_j λ_j^k v_j
    = λ_n^k ( α_n v_n + Σ_{j=1}^{n−1} α_j (λ_j / λ_n)^k v_j )

5 / 44
Power Method

Then if |λ_n| > |λ_j|, 1 ≤ j < n, we see that x_k → λ_n^k α_n v_n as k → ∞

This algorithm converges linearly: the “error terms” are scaled by a factor at most |λ_{n−1}|/|λ_n| at each iteration

Also, we see that the method converges faster if λ_n is “well separated” from the rest of the spectrum

6 / 44
Power Method

However, in practice the exponential factor λ_n^k could cause overflow or underflow after relatively few iterations

Therefore the standard form of the power method is actually the normalized power method

1: choose x0 ∈ Cn arbitrarily
2: for k = 1, 2, . . . do
3: yk = Axk−1
4: xk = yk / ‖yk‖
5: end for
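The normalized power method above can be sketched in a few lines (a minimal sketch, assuming numpy; the test matrix, starting vector, and iteration count are illustrative choices, and the eigenvalue is recovered from an entry ratio y_k[j]/x_{k−1}[j] ≈ λ_n, as discussed on a later slide):

```python
import numpy as np

def power_method(A, x0, iters=200):
    """Normalized power method: returns (eigenvalue est., unit eigenvector est.)."""
    x = x0 / np.linalg.norm(x0)
    lam = None
    for _ in range(iters):
        y = A @ x                    # y_k = A x_{k-1}
        j = np.argmax(np.abs(x))     # pick a well-scaled entry; since
        lam = y[j] / x[j]            # y_k ~ lambda_n x_{k-1}, the ratio estimates lambda_n
        x = y / np.linalg.norm(y)    # normalize to avoid overflow/underflow
    return lam, x

# Dominant eigenpair of a small symmetric test matrix (eigenvalues 3 and 1)
A = np.array([[2.0, 1.0], [1.0, 2.0]])
lam, v = power_method(A, np.array([1.0, 0.0]))
```

With eigenvalue ratio 1/3, the iterate converges linearly and `lam` approaches the dominant eigenvalue 3.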

7 / 44
Power Method

Convergence analysis of the normalized power method is essentially the same as the un-normalized case

Only difference is we now get an extra scaling factor, c_k ∈ R, due to the normalization at each step

x_k = c_k λ_n^k ( α_n v_n + Σ_{j=1}^{n−1} α_j (λ_j / λ_n)^k v_j )

8 / 44
Power Method

This algorithm directly produces the eigenvector vn

One way to recover λn is to note that

yk = Axk−1 ≈ λn xk−1

Hence we can compare an entry of yk and xk−1 to approximate λn

We also note two potential issues:


1. We require x0 to have a nonzero component of vn
2. There may be more than one eigenvalue with maximum
modulus

9 / 44
Power Method

Issue 1:
• In practice, very unlikely that x_0 will be orthogonal to v_n
• Even if x_0^* v_n = 0, rounding error will introduce a component of v_n during the power iterations

Issue 2:
• We cannot ignore the possibility that there is more than one “max. eigenvalue”
• In this case x_k would converge to a member of the corresponding eigenspace

10 / 44
Power Method

An important idea in eigenvalue computations is to consider the


“shifted” matrix A − σI, for σ ∈ R

We see that
(A − σI)vi = (λi − σ)vi
and hence the spectrum of A − σI is shifted by −σ, and the
eigenvectors are the same

For example, if all the eigenvalues are real, a shift can be used with
the power method to converge to λ1 instead of λn

11 / 44
Power Method

Matlab example: Consider power method and shifted power method for

A = [ 4   1
      1  −2 ],

which has eigenvalues λ1 = −2.1623, λ2 = 4.1623
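A Python sketch of this example (assuming numpy; the Matlab demo itself is not reproduced). The shift σ = 2 is an illustrative choice that makes λ1 − σ the dominant eigenvalue of the shifted matrix, so the shifted iteration converges to λ1:

```python
import numpy as np

def power_iterate(A, x0, iters=300):
    """Normalized power method; eigenvalue recovered from an entry ratio."""
    x = x0 / np.linalg.norm(x0)
    lam = None
    for _ in range(iters):
        y = A @ x
        j = np.argmax(np.abs(x))
        lam = y[j] / x[j]
        x = y / np.linalg.norm(y)
    return lam, x

A = np.array([[4.0, 1.0], [1.0, -2.0]])

lam2, _ = power_iterate(A, np.array([1.0, 1.0]))   # converges to lambda_2 = 4.1623

sigma = 2.0                                        # illustrative shift
mu, _ = power_iterate(A - sigma * np.eye(2), np.array([1.0, 1.0]))
lam1 = mu + sigma                                  # recovers lambda_1 = -2.1623
```

The exact eigenvalues are 1 ± √10, which is where the quoted values −2.1623 and 4.1623 come from.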

12 / 44
Inverse Iteration

13 / 44
Inverse Iteration

The eigenvalues of A⁻¹ are the reciprocals of the eigenvalues of A, since

Av = λv ⇐⇒ A⁻¹v = (1/λ)v

Question: What happens if we apply the power method to A⁻¹?

14 / 44
Inverse Iteration

Answer: We converge to the largest (in modulus) eigenvalue of A⁻¹, which is 1/λ1 (recall that λ1 is the smallest eigenvalue of A)

This is called inverse iteration

1: choose x0 ∈ Cn arbitrarily
2: for k = 1, 2, . . . do
3: solve Ayk = xk−1 for yk
4: xk = yk / ‖yk‖
5: end for

15 / 44
Inverse Iteration

Hence inverse iteration gives λ1 without requiring a shift

This is helpful since it may be difficult to determine what shift is required to get λ1 in the power method

Question: What happens if we apply inverse iteration to the shifted matrix A − σI?

16 / 44
Inverse Iteration

The smallest (in modulus) eigenvalue of A − σI is λ_{i*} − σ, where

i* = arg min_{i=1,2,...,n} |λ_i − σ|,

and hence...

Answer: We converge to λ̃ = 1/(λ_{i*} − σ), then recover λ_{i*} via

λ_{i*} = 1/λ̃ + σ

Inverse iteration with shift allows us to find the eigenvalue closest to σ
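Shifted inverse iteration can be sketched as follows (a minimal sketch, assuming numpy; the diagonal test matrix, shift, and iteration count are illustrative choices):

```python
import numpy as np

def inverse_iteration(A, sigma, x0, iters=50):
    """Converges to the eigenpair of A whose eigenvalue is closest to sigma."""
    M = A - sigma * np.eye(A.shape[0])
    x = x0 / np.linalg.norm(x0)
    lam_tilde = None
    for _ in range(iters):
        y = np.linalg.solve(M, x)    # solve (A - sigma*I) y_k = x_{k-1}
        lam_tilde = x @ y            # estimate of 1/(lambda_{i*} - sigma)
        x = y / np.linalg.norm(y)
    return sigma + 1.0 / lam_tilde, x   # recover lambda_{i*} = 1/lam_tilde + sigma

A = np.diag([1.0, 3.0, 10.0])        # eigenvalues 1, 3, 10
lam, v = inverse_iteration(A, 2.9, np.array([1.0, 1.0, 1.0]))
```

With shift σ = 2.9, the eigenvalue closest to σ is 3, so the iteration converges to the eigenpair (3, e₂).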

17 / 44
Inverse Iteration

Matlab example: Eigenvalues of the Laplacian via inverse iteration

18 / 44
Rayleigh Quotient Iteration

19 / 44
Rayleigh Quotient

For the remainder of this chapter (Rayleigh Quotient Iteration, QR Algorithm) we will assume that A ∈ R^{n×n} is real and symmetric¹

The Rayleigh quotient is defined as

r(x) ≡ (x^T A x) / (x^T x)

If (λ, v) ∈ R × R^n is an eigenpair, then

r(v) = (v^T A v) / (v^T v) = (λ v^T v) / (v^T v) = λ

¹ Much of the material generalizes to complex non-hermitian matrices, but symmetric case is simpler
20 / 44
Rayleigh Quotient

Theorem: Suppose A ∈ R^{n×n} is a symmetric matrix; then for any x ∈ R^n we have

λ1 ≤ r(x) ≤ λ_n

Proof: We write x as a linear combination of (orthogonal) eigenvectors x = Σ_{j=1}^n α_j v_j, and the lower bound follows from

r(x) = (x^T A x) / (x^T x) = ( Σ_{j=1}^n λ_j α_j² ) / ( Σ_{j=1}^n α_j² ) ≥ λ1 ( Σ_{j=1}^n α_j² ) / ( Σ_{j=1}^n α_j² ) = λ1

The proof of the upper bound r(x) ≤ λ_n is analogous □
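These bounds are easy to spot-check numerically (a sketch assuming numpy; the symmetric test matrix and random sampling are illustrative choices):

```python
import numpy as np

def rayleigh_quotient(A, x):
    return (x @ (A @ x)) / (x @ x)

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
lam = np.linalg.eigvalsh(A)   # ascending: lam[0] = lambda_1, lam[-1] = lambda_n

# r(x) stays inside [lambda_1, lambda_n] for every sampled x
rng = np.random.default_rng(0)
for _ in range(1000):
    x = rng.standard_normal(3)
    r = rayleigh_quotient(A, x)
    assert lam[0] - 1e-12 <= r <= lam[-1] + 1e-12
```

At an eigenvector the quotient attains the corresponding eigenvalue exactly, so the bounds are sharp.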

21 / 44
Rayleigh Quotient

Corollary: A symmetric matrix A ∈ R^{n×n} is positive definite if and only if all of its eigenvalues are positive

Proof: (⇒) Suppose A is symmetric positive definite (SPD); then for any nonzero x ∈ R^n we have x^T A x > 0, and hence

λ1 = r(v1) = (v1^T A v1) / (v1^T v1) > 0

(⇐) Suppose A has positive eigenvalues; then for any nonzero x ∈ R^n

x^T A x = r(x)(x^T x) ≥ λ1 ‖x‖₂² > 0 □
22 / 44
Rayleigh Quotient

But also, if x is an approximate eigenvector, then r(x) gives us a good approximation to the eigenvalue

This is because estimation of an eigenvalue from an approximate eigenvector is an n × 1 linear least squares problem: xλ ≈ Ax

x ∈ R^n is our “tall thin matrix” and Ax ∈ R^n is our right-hand side

Hence the normal equation for xλ ≈ Ax yields the Rayleigh quotient, i.e.

x^T x λ = x^T A x

23 / 44
Rayleigh Quotient

Question: How accurate is the Rayleigh quotient approximation to an eigenvalue?

Let’s consider r as a function of x, so r : R^n → R

∂r(x)/∂x_j = [∂(x^T A x)/∂x_j] / (x^T x) − (x^T A x) [∂(x^T x)/∂x_j] / (x^T x)²
           = 2(Ax)_j / (x^T x) − (x^T A x) 2x_j / (x^T x)²
           = (2 / (x^T x)) (Ax − r(x)x)_j

(Note that the second equality relies on the symmetry of A)

24 / 44
Rayleigh Quotient

Therefore

∇r(x) = (2 / (x^T x)) (Ax − r(x)x)

For an eigenpair (λ, v) we have r(v) = λ and hence

∇r(v) = (2 / (v^T v)) (Av − λv) = 0

This shows that eigenvectors of A are stationary points of r

25 / 44
Rayleigh Quotient

Suppose (λ, v) is an eigenpair of A, and let us consider a Taylor expansion of r(x) about v:

r(x) = r(v) + ∇r(v)^T (x − v) + (1/2)(x − v)^T Hr(v)(x − v) + H.O.T.
     = r(v) + (1/2)(x − v)^T Hr(v)(x − v) + H.O.T.

Hence as x → v the error in a Rayleigh quotient approximation is

|r(x) − λ| = O(‖x − v‖₂²)

That is, the Rayleigh quotient approx. to an eigenvalue squares the error in a corresponding eigenvector approx.
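This squaring of the error is easy to observe numerically (a sketch assuming numpy; the 2×2 matrix and perturbation sizes are illustrative choices):

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])      # eigenpair (3, [1,1]/sqrt(2))
v = np.array([1.0, 1.0]) / np.sqrt(2.0)
w = np.array([1.0, -1.0]) / np.sqrt(2.0)    # perturbation along the other eigenvector

errors = []
for eps in (1e-2, 1e-3, 1e-4):
    x = v + eps * w                          # eigenvector error of size eps
    r = (x @ (A @ x)) / (x @ x)
    errors.append(abs(r - 3.0))              # eigenvalue error should be O(eps^2)
```

Shrinking the eigenvector error by 10x shrinks the eigenvalue error by roughly 100x, confirming the quadratic behavior.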

26 / 44
Rayleigh Quotient Iteration
The Rayleigh quotient gives us an eigenvalue estimate from an eigenvector estimate

Inverse iteration gives us an eigenvector estimate from an eigenvalue estimate

It is natural to combine the two; this yields the Rayleigh quotient iteration

1: choose x0 ∈ R^n arbitrarily
2: for k = 1, 2, . . . do
3: σ_k = x_{k−1}^T A x_{k−1} / x_{k−1}^T x_{k−1}
4: solve (A − σ_k I) y_k = x_{k−1} for y_k
5: x_k = y_k / ‖y_k‖
6: end for
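A sketch of Rayleigh quotient iteration (assuming numpy; the test matrix, iteration count, and the guard for an exactly singular shift are illustrative choices):

```python
import numpy as np

def rayleigh_quotient_iteration(A, x0, iters=20):
    n = A.shape[0]
    x = x0 / np.linalg.norm(x0)
    for _ in range(iters):
        sigma = x @ (A @ x)                            # line 3: Rayleigh quotient shift
        try:
            y = np.linalg.solve(A - sigma * np.eye(n), x)   # line 4: shifted solve
        except np.linalg.LinAlgError:
            break                                      # sigma hit an eigenvalue exactly
        x = y / np.linalg.norm(y)                      # line 5: normalize
    return x @ (A @ x), x

A = np.array([[5.0, 1.0, 1.0], [1.0, 6.0, 1.0], [1.0, 1.0, 7.0]])
lam, v = rayleigh_quotient_iteration(A, np.array([1.0, 0.0, 0.0]))
```

Which eigenpair it finds depends on the starting vector, but convergence is extremely fast once the shift locks on.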

27 / 44
Rayleigh Quotient Iteration

Suppose, at step k, we have ‖x_{k−1} − v‖ ≤ ε

Then, from the Rayleigh quotient in line 3 of the algorithm, we have |σ_k − λ| = O(ε²)

In lines 4 and 5 of the algorithm, we then perform an inverse iteration with shift σ_k to get x_k

Recall the eigenvector error in one inverse iteration step is scaled by the ratio of “second largest to largest eigenvalues” of (A − σ_k I)⁻¹

28 / 44
Rayleigh Quotient Iteration

Let λ be the closest eigenvalue of A to σ_k; then the magnitude of the largest eigenvalue of (A − σ_k I)⁻¹ is 1/|σ_k − λ|

The second largest eigenvalue magnitude is 1/|σ_k − λ̂|, where λ̂ is the eigenvalue of A “second closest” to σ_k

Hence at each inverse iteration step, the error is reduced by a factor

|σ_k − λ| / |σ_k − λ̂| = |σ_k − λ| / |(σ_k − λ) + (λ − λ̂)| → const. |σ_k − λ| as σ_k → λ

Therefore, we obtain cubic convergence as k → ∞:

‖x_k − v‖ → (const. |σ_k − λ|) ‖x_{k−1} − v‖ = O(ε³)

29 / 44
Rayleigh Quotient Iteration

A drawback of Rayleigh quotient iteration: we can’t just LU factorize A − σ_k I once, since the shift changes each step

Also, it’s harder to pick out specific parts of the spectrum with Rayleigh quotient iteration, since σ_k can change unpredictably

Matlab demo: Rayleigh quotient iteration to compute an eigenpair of

A = [ 5 1 1
      1 6 1
      1 1 7 ]

Matlab demo: Rayleigh quotient iteration to compute an eigenpair of the Laplacian

30 / 44
QR Algorithm

31 / 44
The QR Algorithm

The QR algorithm for computing eigenvalues is one of the best known algorithms in Numerical Analysis²

It was developed independently in the late 1950s by John G.F. Francis (England) and Vera N. Kublanovskaya (USSR)

The QR algorithm efficiently provides approximations for all eigenvalues/eigenvectors of a matrix

We will consider what happens when we apply the power method to a set of vectors; this will then motivate the QR algorithm

² Recall that here we focus on the case in which A ∈ R^{n×n} is symmetric
32 / 44
The QR Algorithm

Let x_1^(0), . . . , x_p^(0) denote p linearly independent starting vectors, and suppose we store these vectors in the columns of X_0

We can apply the power method to these vectors to obtain the following algorithm:

1: choose an n × p matrix X_0 arbitrarily
2: for k = 1, 2, . . . do
3: X_k = A X_{k−1}
4: end for

33 / 44
The QR Algorithm

From our analysis of the power method, we see that for each i = 1, 2, . . . , p:

x_i^(k) = λ_n^k α_{i,n} v_n + λ_{n−1}^k α_{i,n−1} v_{n−1} + · · · + λ_1^k α_{i,1} v_1
        = λ_{n−p}^k [ Σ_{j=n−p+1}^n α_{i,j} (λ_j / λ_{n−p})^k v_j + Σ_{j=1}^{n−p} α_{i,j} (λ_j / λ_{n−p})^k v_j ]

Then, if |λ_{n−p+1}| > |λ_{n−p}|, the second sum (over j = 1, . . . , n − p) will decay compared to the first (over j = n − p + 1, . . . , n) as k → ∞

Hence the columns of X_k will converge to a basis for span{v_{n−p+1}, . . . , v_n}

34 / 44
The QR Algorithm

However, this method doesn’t provide a good basis: each column of X_k will be very close to v_n

Therefore the columns of X_k become very close to being linearly dependent

We can resolve this issue by enforcing linear independence at each step

35 / 44
The QR Algorithm

We orthonormalize the vectors after each iteration via a (reduced) QR factorization, to obtain the simultaneous iteration:

1: choose an n × p matrix Q̂_0 with orthonormal columns
2: for k = 1, 2, . . . do
3: X_k = A Q̂_{k−1}
4: Q̂_k R̂_k = X_k
5: end for

The column spaces of Q̂_k and X_k in line 4 are the same

Hence columns of Q̂_k converge to an orthonormal basis for span{v_{n−p+1}, . . . , v_n}
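Simultaneous iteration can be sketched as follows (assuming numpy; the test matrix, p = 2, and the iteration count are illustrative choices). The Rayleigh quotient matrix Q̂^T A Q̂ then exposes the dominant eigenvalues:

```python
import numpy as np

def simultaneous_iteration(A, p, iters=300, seed=0):
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((A.shape[0], p)))  # orthonormal start
    for _ in range(iters):
        X = A @ Q                  # power step applied to all p columns at once
        Q, _ = np.linalg.qr(X)     # re-orthonormalize via reduced QR
    return Q

A = np.array([[5.0, 1.0, 1.0], [1.0, 6.0, 1.0], [1.0, 1.0, 7.0]])
Q = simultaneous_iteration(A, p=2)
H = Q.T @ A @ Q                    # nearly diagonal: the 2 dominant eigenvalues
```

The off-diagonal entries of `H` decay to zero and its diagonal approaches the two largest eigenvalues of `A`, in decreasing order.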

36 / 44
The QR Algorithm

In fact, we don’t just get a basis for span{v_{n−p+1}, . . . , v_n}, we get the eigenvectors themselves!

Theorem: The columns of Q̂_k converge to the p dominant eigenvectors of A

We will not discuss the full proof, but we note that this result is not surprising since:
• the eigenvectors of a symmetric matrix are orthogonal
• columns of Q̂_k converge to an orthogonal basis for span{v_{n−p+1}, . . . , v_n}

Simultaneous iteration approximates eigenvectors; we obtain eigenvalues from the Rayleigh quotient Q̂^T A Q̂ ≈ diag(λ1, . . . , λn)

37 / 44
The QR Algorithm

With p = n, the simultaneous iteration will approximate all eigenpairs of A

We now show a more convenient reorganization of the simultaneous iteration algorithm

We shall require some extra notation: the Q and R matrices arising in the simultaneous iteration will be underlined, Q̲_k, R̲_k

(As we will see shortly, this is to distinguish between the matrices arising in the two different formulations...)

38 / 44
The QR Algorithm

Define³ the k-th Rayleigh quotient matrix A_k ≡ Q̲_k^T A Q̲_k, and the QR factors Q_k, R_k by Q_k R_k = A_{k−1}

Our goal is to show that A_k = R_k Q_k, k = 1, 2, . . .

Initialize Q̲_0 = I ∈ R^{n×n}; then in the first simultaneous iteration we obtain X_1 = A and Q̲_1 R̲_1 = A

It follows that A_1 = Q̲_1^T A Q̲_1 = Q̲_1^T (Q̲_1 R̲_1) Q̲_1 = R̲_1 Q̲_1

Also Q_1 R_1 = A_0 = Q̲_0^T A Q̲_0 = A, so that Q_1 = Q̲_1, R_1 = R̲_1, and A_1 = R_1 Q_1

³ We now use the full, rather than the reduced, QR factorization, hence we omit the ˆ notation
39 / 44
The QR Algorithm

In the second simultaneous iteration, we have X_2 = A Q̲_1, and we compute the QR factorization Q̲_2 R̲_2 = X_2

Also, using our QR factorization of A_1 gives

X_2 = A Q̲_1 = (Q̲_1 Q̲_1^T) A Q̲_1 = Q̲_1 A_1 = Q̲_1 (Q_2 R_2),

which implies that Q̲_2 = Q̲_1 Q_2 = Q_1 Q_2 and R̲_2 = R_2

Hence

A_2 = Q̲_2^T A Q̲_2 = Q_2^T Q̲_1^T A Q̲_1 Q_2 = Q_2^T A_1 Q_2 = Q_2^T Q_2 R_2 Q_2 = R_2 Q_2

40 / 44
The QR Algorithm

The same pattern continues for k = 3, 4, . . .: we QR factorize A_{k−1} to get Q_k and R_k, then we compute A_k = R_k Q_k

The columns of the matrix Q̲_k = Q_1 Q_2 · · · Q_k approximate the eigenvectors of A

The diagonal entries of the Rayleigh quotient matrix A_k = Q̲_k^T A Q̲_k approximate the eigenvalues of A

(Also, due to eigenvector orthogonality for symmetric A, A_k converges to a diagonal matrix as k → ∞)

41 / 44
The QR Algorithm

This discussion motivates the famous QR algorithm:

1: A_0 = A
2: for k = 1, 2, . . . do
3: Q_k R_k = A_{k−1}
4: A_k = R_k Q_k
5: end for
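The unshifted QR algorithm above can be sketched directly (assuming numpy; the symmetric test matrix and iteration count are illustrative choices):

```python
import numpy as np

def qr_algorithm(A, iters=200):
    Ak = A.copy()
    Qbar = np.eye(A.shape[0])
    for _ in range(iters):
        Q, R = np.linalg.qr(Ak)   # line 3: QR factorize A_{k-1}
        Ak = R @ Q                # line 4: recombine the factors in reverse order
        Qbar = Qbar @ Q           # accumulated product approximates the eigenvectors
    return Ak, Qbar

A = np.array([[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]])
Ak, Qbar = qr_algorithm(A)        # Ak converges to a diagonal matrix of eigenvalues
```

For symmetric `A`, `Ak` converges to a diagonal matrix whose entries are the eigenvalues, and the columns of `Qbar` approximate the eigenvectors.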

42 / 44
The QR Algorithm

Matlab demo: Compute eigenvalues and eigenvectors of⁴

A = [ 2.9766   0.3945   0.4198   1.1159
      0.3945   2.7328  −0.3097   0.1129
      0.4198  −0.3097   2.5675   0.6079
      1.1159   0.1129   0.6079   1.7231 ]

(This matrix has eigenvalues 1, 2, 3 and 4)

⁴ Heath example 4.15
43 / 44
The QR Algorithm

We have presented the simplest version of the QR algorithm: the “unshifted” QR algorithm

In order to obtain an “industrial strength” algorithm, there are a number of other issues that need to be considered:
• convergence can be accelerated significantly by introducing shifts, as we did in inverse iteration and Rayleigh quotient iteration
• it is more efficient to reduce A to tridiagonal form (via Householder reflectors) before applying the QR algorithm
• reliable convergence criteria for the eigenvalues/eigenvectors are required

High-quality implementations, e.g. LAPACK or Matlab’s eig, handle all of these subtleties for us

44 / 44
