
Unit 2

February 10, 2025

Eigenvalues and Eigenvectors


An n × n matrix A can be thought of as a linear mapping that takes an arbitrary vector
x ∈ R^n and outputs a new vector Ax. In some cases, the output vector Ax is simply a
scalar multiple of the input vector x, that is, there exists a scalar λ such that Ax = λx. This
case is so important that we make the following definition.

Definition 1. Let A be an n × n matrix and let v be a non-zero vector. If Av = λv for some
scalar λ, then we call the vector v an eigenvector of A, and we call the scalar λ an eigenvalue
of A corresponding to v.

Hence, an eigenvector v of A is simply scaled by the scalar λ under multiplication by A.
Eigenvectors are by definition non-zero: since A0 = 0 = λ0 for every scalar λ, the zero vector
would have no well-defined corresponding eigenvalue.
Example: Determine if the given vectors v and u are eigenvectors of A. If yes, find the
eigenvalue of A associated with the eigenvector.
A = \begin{pmatrix} 4 & -1 & 6 \\ 2 & 1 & 6 \\ 2 & -1 & 8 \end{pmatrix}, \qquad
v = \begin{pmatrix} -3 \\ 0 \\ 1 \end{pmatrix}, \qquad
u = \begin{pmatrix} -1 \\ 2 \\ 1 \end{pmatrix}
Solution: Compute

Av = \begin{pmatrix} 4 & -1 & 6 \\ 2 & 1 & 6 \\ 2 & -1 & 8 \end{pmatrix}
\begin{pmatrix} -3 \\ 0 \\ 1 \end{pmatrix}
= \begin{pmatrix} -6 \\ 0 \\ 2 \end{pmatrix}
= 2\begin{pmatrix} -3 \\ 0 \\ 1 \end{pmatrix} = 2v

Thus, Av = 2v, and so v is an eigenvector of A with corresponding eigenvalue λ = 2. On the
other hand,

Au = \begin{pmatrix} 4 & -1 & 6 \\ 2 & 1 & 6 \\ 2 & -1 & 8 \end{pmatrix}
\begin{pmatrix} -1 \\ 2 \\ 1 \end{pmatrix}
= \begin{pmatrix} 0 \\ 6 \\ 4 \end{pmatrix}.

There is no scalar λ such that

\begin{pmatrix} 0 \\ 6 \\ 4 \end{pmatrix} = λ \begin{pmatrix} -1 \\ 2 \\ 1 \end{pmatrix}.
Therefore, u is not an eigenvector of A.
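For readers who want to check examples like this numerically, here is a minimal sketch using NumPy (assumed available); the helper function below is illustrative, not a standard library routine:

```python
import numpy as np

# Test whether a vector x is (numerically) an eigenvector of A by comparing
# Ax with the appropriately scaled copy of x.
A = np.array([[4, -1, 6],
              [2,  1, 6],
              [2, -1, 8]], dtype=float)
v = np.array([-3, 0, 1], dtype=float)
u = np.array([-1, 2, 1], dtype=float)

def eigenvalue_if_eigenvector(A, x, tol=1e-10):
    """Return lambda if Ax = lambda*x for some scalar lambda, else None."""
    Ax = A @ x
    i = np.argmax(np.abs(x))          # candidate lambda from a non-zero component
    lam = Ax[i] / x[i]
    return lam if np.allclose(Ax, lam * x, atol=tol) else None

print(eigenvalue_if_eigenvector(A, v))   # 2.0  -> v is an eigenvector
print(eigenvalue_if_eigenvector(A, u))   # None -> u is not an eigenvector
```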

The Characteristic Polynomial of a Matrix
Recall that a number λ is an eigenvalue of A ∈ R^{n×n} if there exists a non-zero vector v such
that

Av = λv,

or equivalently, if v ∈ Null(A − λI). In other words, λ is an eigenvalue of A if and only if the
subspace Null(A − λI) contains a vector other than the zero vector. We know that a matrix
M has a non-trivial null space if and only if M is non-invertible, which holds if and only if
det(M) = 0. Hence, λ is an eigenvalue of A if and only if λ satisfies det(A − λI) = 0.
Let’s compute the expression det(A − λI) for a generic 2 × 2 matrix:

det(A − λI) = \begin{vmatrix} a_{11} - λ & a_{12} \\ a_{21} & a_{22} - λ \end{vmatrix}
= (a_{11} - λ)(a_{22} - λ) - a_{12} a_{21}
= λ^2 - (a_{11} + a_{22})λ + a_{11} a_{22} - a_{12} a_{21}.

Thus, if A is 2 × 2 then

det(A − λI) = λ^2 - (a_{11} + a_{22})λ + a_{11} a_{22} - a_{12} a_{21}

is a polynomial in the variable λ of degree n = 2. This motivates the following definition.
Definition Let A be an n × n matrix. The polynomial
p(λ) = det(A − λI)
is called the characteristic polynomial of A.
In summary, to find the eigenvalues of A we must find the roots of the characteristic poly-
nomial:
p(λ) = det(A − λI).
The following theorem asserts that what we observed for the case n = 2 is indeed true for
all n.
Theorem 2. The characteristic polynomial p(λ) = det(A − λI) of an n × n matrix A is an nth
degree polynomial.
Proof: Recall that for the case n = 2 we computed that

det(A − λI) = λ^2 - (a_{11} + a_{22})λ + a_{11} a_{22} - a_{12} a_{21}.

Therefore, the claim holds for n = 2. By induction, suppose that the claim holds for some n ≥ 2.
If A is an (n + 1) × (n + 1) matrix, then expanding det(A − λI) along the first row gives

det(A − λI) = (a_{11} - λ) \det(A_{11} - λI) + \sum_{k=2}^{n+1} (-1)^{1+k} a_{1k} \det(A_{1k} - λI).

By the induction hypothesis, det(A_{11} − λI) is an nth degree polynomial, so the term
(a_{11} − λ) det(A_{11} − λI) has degree n + 1. Each remaining cofactor term has degree at most n
in λ, so it cannot cancel the leading term, and hence det(A − λI) is an (n + 1)th degree
polynomial. This ends the proof.
Example: Find the eigenvalues of

A = \begin{pmatrix} -4 & -6 & -7 \\ 3 & 5 & 3 \\ 0 & 0 & 3 \end{pmatrix}.

Solution. Compute

A − λI = \begin{pmatrix} -4 & -6 & -7 \\ 3 & 5 & 3 \\ 0 & 0 & 3 \end{pmatrix}
- \begin{pmatrix} λ & 0 & 0 \\ 0 & λ & 0 \\ 0 & 0 & λ \end{pmatrix}
= \begin{pmatrix} -4-λ & -6 & -7 \\ 3 & 5-λ & 3 \\ 0 & 0 & 3-λ \end{pmatrix}.

Then, expanding along the first column,

det(A − λI) = (-4-λ) \begin{vmatrix} 5-λ & 3 \\ 0 & 3-λ \end{vmatrix}
- 3 \begin{vmatrix} -6 & -7 \\ 0 & 3-λ \end{vmatrix}
= (-4-λ)(5-λ)(3-λ) + 18(3-λ) = -(λ^3 - 4λ^2 + λ + 6).

Setting det(A − λI) = 0 and factoring the characteristic polynomial:

λ^3 - 4λ^2 + λ + 6 = (λ - 2)(λ - 3)(λ + 1) = 0.

Therefore, the eigenvalues of A are λ1 = 2, λ2 = 3, λ3 = −1.
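This computation can be cross-checked numerically. A minimal NumPy sketch (NumPy assumed; np.poly of a square matrix returns the coefficients of det(λI − A)):

```python
import numpy as np

A = np.array([[-4, -6, -7],
              [ 3,  5,  3],
              [ 0,  0,  3]], dtype=float)

print(np.poly(A))             # [ 1. -4.  1.  6.]  i.e. lambda^3 - 4*lambda^2 + lambda + 6
print(np.roots(np.poly(A)))   # 3, 2, -1 (order may vary)
print(np.linalg.eigvals(A))   # same eigenvalues, computed directly
```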

Example: Find the eigenvalues of A and a basis for each eigenspace.

A = \begin{pmatrix} 2 & 0 & 0 \\ 4 & 2 & 2 \\ -2 & 0 & 1 \end{pmatrix}

Does R^3 have a basis of eigenvectors of A?
Solution. The characteristic polynomial of A is

p(λ) = det(A − λI) = -(λ^3 - 5λ^2 + 8λ - 4) = -(λ - 1)(λ - 2)^2

and therefore the eigenvalues are λ1 = 1 and λ2 = 2. Notice that although p(λ) is a polynomial
of degree n = 3, it has only two distinct roots, and hence A has only two distinct eigenvalues.
The eigenvalue λ2 = 2 is said to be repeated, and λ1 = 1 is said to be a simple eigenvalue.
For λ1 = 1, one finds that the eigenspace Null(A − λ1 I) is spanned by

v_1 = \begin{pmatrix} 0 \\ -2 \\ 1 \end{pmatrix}

and thus v1 is an eigenvector of A with eigenvalue λ1 = 1.
Now consider λ2 = 2:

A − 2I = \begin{pmatrix} 0 & 0 & 0 \\ 4 & 0 & 2 \\ -2 & 0 & -1 \end{pmatrix}

Row reducing A − 2I, one obtains

A − 2I \sim \begin{pmatrix} -2 & 0 & -1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}
Therefore, rank(A − 2I) = 1, and thus by the Rank Theorem it follows that Null(A − 2I) is
a 2-dimensional eigenspace. Performing back substitution, one finds the following basis for the
λ2-eigenspace:

\{v_2, v_3\} = \left\{ \begin{pmatrix} -1 \\ 0 \\ 2 \end{pmatrix},
\begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \right\}

Therefore, the eigenvectors

\{v_1, v_2, v_3\} = \left\{ \begin{pmatrix} 0 \\ -2 \\ 1 \end{pmatrix},
\begin{pmatrix} -1 \\ 0 \\ 2 \end{pmatrix},
\begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \right\}

form a basis for R^3. Hence, for the repeated eigenvalue λ2 = 2, we were able to find two linearly
independent eigenvectors.
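A quick numerical way to reproduce these eigenspace bases is to compute null spaces of A − λI via the SVD. A sketch, assuming NumPy; the `null_space` helper below is illustrative (SciPy users could call `scipy.linalg.null_space` instead):

```python
import numpy as np

A = np.array([[ 2, 0, 0],
              [ 4, 2, 2],
              [-2, 0, 1]], dtype=float)

def null_space(M, tol=1e-10):
    """Basis of Null(M): right singular vectors for (numerically) zero singular values."""
    _, s, Vt = np.linalg.svd(M)
    rank = int(np.sum(s > tol))
    return Vt[rank:].T            # columns span the null space

for lam in (1.0, 2.0):
    basis = null_space(A - lam * np.eye(3))
    print(f"lambda = {lam}: eigenspace dimension = {basis.shape[1]}")
# lambda = 1 gives a 1-dimensional eigenspace and lambda = 2 a 2-dimensional one,
# so the three basis vectors together form a basis of R^3, as found above.
```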

Cayley-Hamilton Theorem
Statement: Every square matrix satisfies its own characteristic equation.

Uses of the Cayley-Hamilton theorem: to calculate (i) positive integral powers of A, and
(ii) the inverse of a non-singular square matrix A.

Example: Verify the Cayley–Hamilton theorem and find A^4 and A^{-1} when

A = \begin{pmatrix} 2 & -1 & 2 \\ -1 & 2 & -1 \\ 1 & -1 & 2 \end{pmatrix}

Solution: The characteristic equation of A is |A − λI| = 0, i.e., λ^3 - S_1 λ^2 + S_2 λ - S_3 = 0, where

S_1 = sum of the leading diagonal elements = 2 + 2 + 2 = 6

S_2 = sum of the minors of the leading diagonal elements
= \begin{vmatrix} 2 & -1 \\ -1 & 2 \end{vmatrix}
+ \begin{vmatrix} 2 & 2 \\ 1 & 2 \end{vmatrix}
+ \begin{vmatrix} 2 & -1 \\ -1 & 2 \end{vmatrix}
= (4 - 1) + (4 - 2) + (4 - 1) = 3 + 2 + 3 = 8

S_3 = |A| = 2(4 − 1) + 1(−2 + 1) + 2(1 − 2) = 2(3) + 1(−1) + 2(−1) = 6 − 1 − 2 = 3

So, the characteristic equation of A is λ^3 - 6λ^2 + 8λ - 3 = 0.
By the Cayley-Hamilton theorem,

A^3 - 6A^2 + 8A - 3I = O    (1)

Verification:

A^2 = A × A = \begin{pmatrix} 2 & -1 & 2 \\ -1 & 2 & -1 \\ 1 & -1 & 2 \end{pmatrix}
\begin{pmatrix} 2 & -1 & 2 \\ -1 & 2 & -1 \\ 1 & -1 & 2 \end{pmatrix}
= \begin{pmatrix} 7 & -6 & 9 \\ -5 & 6 & -6 \\ 5 & -5 & 7 \end{pmatrix}

A^3 = A × A^2 = \begin{pmatrix} 2 & -1 & 2 \\ -1 & 2 & -1 \\ 1 & -1 & 2 \end{pmatrix}
\begin{pmatrix} 7 & -6 & 9 \\ -5 & 6 & -6 \\ 5 & -5 & 7 \end{pmatrix}
= \begin{pmatrix} 29 & -28 & 38 \\ -22 & 23 & -28 \\ 22 & -22 & 29 \end{pmatrix}
So,

A^3 - 6A^2 + 8A - 3I
= \begin{pmatrix} 29 & -28 & 38 \\ -22 & 23 & -28 \\ 22 & -22 & 29 \end{pmatrix}
- 6\begin{pmatrix} 7 & -6 & 9 \\ -5 & 6 & -6 \\ 5 & -5 & 7 \end{pmatrix}
+ 8\begin{pmatrix} 2 & -1 & 2 \\ -1 & 2 & -1 \\ 1 & -1 & 2 \end{pmatrix}
- 3\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
= \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} = O
To find A^4: From (1),

A^3 = 6A^2 - 8A + 3I    (2)

Multiplying both sides by A, we get

A^4 = 6A^3 - 8A^2 + 3A = 6[6A^2 - 8A + 3I] - 8A^2 + 3A
= 36A^2 - 48A + 18I - 8A^2 + 3A = 28A^2 - 45A + 18I    (3)

Using (3),

A^4 = 28\begin{pmatrix} 7 & -6 & 9 \\ -5 & 6 & -6 \\ 5 & -5 & 7 \end{pmatrix}
- 45\begin{pmatrix} 2 & -1 & 2 \\ -1 & 2 & -1 \\ 1 & -1 & 2 \end{pmatrix}
+ 18\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
= \begin{pmatrix} 124 & -123 & 162 \\ -95 & 96 & -123 \\ 95 & -95 & 124 \end{pmatrix}
To find A^{-1}: Multiplying (1) by A^{-1},

A^2 - 6A + 8I - 3A^{-1} = O

3A^{-1} = A^2 - 6A + 8I

3A^{-1} = \begin{pmatrix} 7 & -6 & 9 \\ -5 & 6 & -6 \\ 5 & -5 & 7 \end{pmatrix}
- 6\begin{pmatrix} 2 & -1 & 2 \\ -1 & 2 & -1 \\ 1 & -1 & 2 \end{pmatrix}
+ 8\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
= \begin{pmatrix} 3 & 0 & -3 \\ 1 & 2 & 0 \\ -1 & 1 & 3 \end{pmatrix}

So,

A^{-1} = \frac{1}{3}\begin{pmatrix} 3 & 0 & -3 \\ 1 & 2 & 0 \\ -1 & 1 & 3 \end{pmatrix}
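The whole example can be verified in a few lines. A minimal NumPy sketch (NumPy assumed):

```python
import numpy as np

A = np.array([[ 2, -1,  2],
              [-1,  2, -1],
              [ 1, -1,  2]], dtype=float)
I = np.eye(3)
A2 = A @ A
A3 = A2 @ A

# Cayley-Hamilton: A satisfies its own characteristic equation.
print(np.allclose(A3 - 6*A2 + 8*A - 3*I, 0))                            # True

# A^4 from equation (3), checked against direct computation.
print(np.allclose(28*A2 - 45*A + 18*I, np.linalg.matrix_power(A, 4)))   # True

# A^{-1} from 3A^{-1} = A^2 - 6A + 8I.
A_inv = (A2 - 6*A + 8*I) / 3
print(np.allclose(A @ A_inv, I))                                        # True
```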

Algebraic multiplicity and Geometric multiplicity


Definition: Suppose that A ∈ R^{n×n} has characteristic polynomial p(λ) that can be factored
as

p(λ) = ±(λ - λ_1)^{k_1} (λ - λ_2)^{k_2} \cdots (λ - λ_p)^{k_p}.

The exponent k_i is called the algebraic multiplicity of the eigenvalue λ_i. The dimension of the
eigenspace Null(A − λ_i I) associated to λ_i is called the geometric multiplicity of λ_i.
For simplicity and whenever it is convenient, we will denote the geometric multiplicity of the
eigenvalue λ_i as

g_i = dim(Null(A − λ_i I)).
Example: A 6 × 6 matrix A has characteristic polynomial

p(λ) = λ^6 - 4λ^5 - 12λ^4.

Find the eigenvalues of A and their algebraic multiplicities.


Solution. Factoring p(λ) we obtain

p(λ) = λ^4(λ^2 - 4λ - 12) = λ^4(λ - 6)(λ + 2).

Therefore, the eigenvalues of A are λ1 = 0, λ2 = 6, and λ3 = −2. Their algebraic multiplicities


are k1 = 4, k2 = 1, and k3 = 1, respectively. The eigenvalue λ1 = 0 is repeated, while λ2 = 6
and λ3 = −2 are simple eigenvalues.
In an earlier example, we had p(λ) = −(λ − 1)(λ − 2)^2, and thus λ1 = 1 has algebraic multiplicity
k1 = 1 and λ2 = 2 has algebraic multiplicity k2 = 2. For λ1 = 1, we found one linearly
independent eigenvector, and therefore λ1 has geometric multiplicity g1 = 1. For λ2 = 2, we
found two linearly independent eigenvectors, and therefore λ2 has geometric multiplicity g2 = 2.

However, as we will see in the next example, the geometric multiplicity g_i can be strictly less
than the algebraic multiplicity k_i; in general,

g_i ≤ k_i.
Example: Find the eigenvalues of A and a basis for each eigenspace:

A = \begin{pmatrix} 2 & 4 & 3 \\ -4 & -6 & -3 \\ 3 & 3 & 1 \end{pmatrix}

For each eigenvalue of A, find its algebraic and geometric multiplicity. Does R^3 have a basis
of eigenvectors of A?
Solution. One computes

p(λ) = -λ^3 - 3λ^2 + 4 = -(λ - 1)(λ + 2)^2

and therefore the eigenvalues of A are λ1 = 1 and λ2 = −2. The algebraic multiplicity of λ1
is k1 = 1 and that of λ2 is k2 = 2.
For λ1 = 1, we compute

A − I = \begin{pmatrix} 1 & 4 & 3 \\ -4 & -7 & -3 \\ 3 & 3 & 0 \end{pmatrix}

and then one finds that

v_1 = \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix}

is a basis for the λ1-eigenspace. Therefore, the geometric multiplicity of λ1 is g1 = 1.
For λ2 = −2, we compute

A − λ2 I = \begin{pmatrix} 4 & 4 & 3 \\ -4 & -4 & -3 \\ 3 & 3 & 3 \end{pmatrix}
\sim \begin{pmatrix} 4 & 4 & 3 \\ 1 & 1 & 1 \\ 0 & 0 & 0 \end{pmatrix}
\sim \begin{pmatrix} 1 & 1 & 1 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}

Therefore, since rank(A − λ2 I) = 2, the geometric multiplicity of λ2 = −2 is g2 = 1, which
is less than the algebraic multiplicity k2 = 2.
An eigenvector corresponding to λ2 = −2 is

v_2 = \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix}
Thus, for the repeated eigenvalue λ2 = −2, we are able to find only one linearly independent
eigenvector, and it is therefore not possible to construct a basis for R^3 consisting of
eigenvectors of A.
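Geometric multiplicities can be computed directly from ranks, exactly as in the Rank Theorem argument above. A short NumPy sketch:

```python
import numpy as np

A = np.array([[ 2,  4,  3],
              [-4, -6, -3],
              [ 3,  3,  1]], dtype=float)
n = A.shape[0]

# Geometric multiplicity of lambda = dim Null(A - lambda*I) = n - rank(A - lambda*I).
for lam in (1.0, -2.0):
    g = n - np.linalg.matrix_rank(A - lam * np.eye(n))
    print(f"lambda = {lam}: geometric multiplicity = {g}")
# lambda = 1  -> g = 1 (algebraic multiplicity 1)
# lambda = -2 -> g = 1 (algebraic multiplicity 2, so A is not diagonalizable)
```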

Eigenvalues and Similarity Transformations


Definition: Let A and B be n × n matrices. We say that A is similar to B if there exists
an invertible matrix P such that

A = PBP^{-1}.

If A is similar to B, then B is similar to A, because from the equation A = PBP^{-1} we can
multiply on the left by P^{-1} and on the right by P to obtain

P^{-1}AP = B.

Hence, with Q = P^{-1}, we have B = QAQ^{-1}, and thus B is similar to A. We therefore simply
say that A and B are similar. Similar matrices are not necessarily equal, but they do share
several important properties, which is why the word similar is used.
Theorem 3. If A and B are similar matrices then the following are true:
1. rank(A) = rank(B)

2. det(A) = det(B)

3. A and B have the same eigenvalues


Proof: We will prove part (3). If A and B are similar, then A = PBP^{-1} for some invertible
matrix P. Then

det(A − λI) = det(PBP^{-1} - λPP^{-1}) = det(P(B - λI)P^{-1}).

Using properties of determinants:

det(P(B - λI)P^{-1}) = det(P) det(B - λI) det(P^{-1}) = det(B - λI).

Thus, A and B have the same characteristic polynomial, and hence the same eigenvalues.
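Theorem 3 is easy to test numerically. In the sketch below, B and P are arbitrary illustrative choices (any invertible P works):

```python
import numpy as np

B = np.array([[2.0, 1.0,  0.0],
              [0.0, 3.0,  1.0],
              [0.0, 0.0, -1.0]])        # triangular: eigenvalues 2, 3, -1
P = np.array([[2.0, 1.0, 1.0],
              [0.0, 2.0, 1.0],
              [0.0, 0.0, 2.0]])         # invertible (det = 8)
A = P @ B @ np.linalg.inv(P)            # A is similar to B

print(np.linalg.matrix_rank(A) == np.linalg.matrix_rank(B))   # True
print(np.isclose(np.linalg.det(A), np.linalg.det(B)))         # True
print(np.allclose(np.sort(np.linalg.eigvals(A).real),
                  np.sort(np.linalg.eigvals(B).real)))        # True
```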

Diagonalization
Eigenvalues of Triangular Matrices
Before discussing diagonalization, we first consider the eigenvalues of triangular matrices.
Theorem 4. Let A be a triangular matrix (either upper or lower). Then the eigenvalues of A
are its diagonal entries.
Proof: We will prove the theorem for the case n = 3 with A upper triangular; the general
case is similar. Suppose then that A is a 3 × 3 upper triangular matrix:

A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ 0 & a_{22} & a_{23} \\ 0 & 0 & a_{33} \end{pmatrix}

Then

A − λI = \begin{pmatrix} a_{11} - λ & a_{12} & a_{13} \\ 0 & a_{22} - λ & a_{23} \\ 0 & 0 & a_{33} - λ \end{pmatrix}
and thus the characteristic polynomial of A is

p(λ) = det(A − λI) = (a_{11} - λ)(a_{22} - λ)(a_{33} - λ)

and the roots of p(λ) are

λ_1 = a_{11}, \quad λ_2 = a_{22}, \quad λ_3 = a_{33}.

In other words, the eigenvalues of A are simply the diagonal entries of A.

Example: Consider the following matrix:

A = \begin{pmatrix}
6 & 0 & 0 & 0 & 0 \\
-1 & 0 & 0 & 0 & 0 \\
0 & 0 & 7 & 0 & 0 \\
-1 & 0 & 0 & -4 & 0 \\
8 & -2 & 3 & 0 & 7
\end{pmatrix}

(a) Find the characteristic polynomial and the eigenvalues of A.


(b) Find the geometric and algebraic multiplicity of each eigenvalue of A.

We now introduce a very special type of triangular matrix, namely, the diagonal matrix.
Definition: A matrix D whose off-diagonal entries are all zero is called a diagonal matrix. For
example, here is a 3 × 3 diagonal matrix:

D = \begin{pmatrix} 3 & 0 & 0 \\ 0 & -5 & 0 \\ 0 & 0 & -8 \end{pmatrix}

and here is a 5 × 5 diagonal matrix:

D = \begin{pmatrix}
6 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 \\
0 & 0 & -7 & 0 & 0 \\
0 & 0 & 0 & 2 & 0 \\
0 & 0 & 0 & 0 & -1
\end{pmatrix}

A diagonal matrix is clearly also a triangular matrix, and therefore the eigenvalues of a diagonal
matrix D are simply the diagonal entries of D. Moreover, the powers of a diagonal matrix are
easy to compute. For example, if

D = \begin{pmatrix} λ_1 & 0 \\ 0 & λ_2 \end{pmatrix}

then

D^2 = \begin{pmatrix} λ_1 & 0 \\ 0 & λ_2 \end{pmatrix}
\begin{pmatrix} λ_1 & 0 \\ 0 & λ_2 \end{pmatrix}
= \begin{pmatrix} λ_1^2 & 0 \\ 0 & λ_2^2 \end{pmatrix}

and similarly for any integer k = 1, 2, 3, . . ., we have that

D^k = \begin{pmatrix} λ_1^k & 0 \\ 0 & λ_2^k \end{pmatrix}

Diagonalization
Definition: A matrix A is called diagonalizable if it is similar to a diagonal matrix D; in other
words, if there exists an invertible matrix P such that

A = PDP^{-1}.

How do we determine when a given matrix A is diagonalizable? Let us first see what conditions
are forced on a diagonalizable matrix. Suppose then that A is diagonalizable. Then, by the
definition, there exists an invertible matrix

P = \begin{pmatrix} v_1 & v_2 & \cdots & v_n \end{pmatrix}
and a diagonal matrix

D = \begin{pmatrix}
λ_1 & 0 & \cdots & 0 \\
0 & λ_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & λ_n
\end{pmatrix}

such that A = PDP^{-1}. Multiplying both sides of the equation A = PDP^{-1} on the right by
the matrix P, we obtain

AP = PD.
Now

AP = \begin{pmatrix} Av_1 & Av_2 & \cdots & Av_n \end{pmatrix}

while on the other hand

PD = \begin{pmatrix} λ_1 v_1 & λ_2 v_2 & \cdots & λ_n v_n \end{pmatrix}.

Therefore, since AP = PD, we have

\begin{pmatrix} Av_1 & Av_2 & \cdots & Av_n \end{pmatrix}
= \begin{pmatrix} λ_1 v_1 & λ_2 v_2 & \cdots & λ_n v_n \end{pmatrix}.

Comparing columns, we must have that

Av_i = λ_i v_i.

Thus, the columns v1, v2, . . . , vn of P are eigenvectors of A, and they form a basis for R^n
because P is invertible. In conclusion, if A is diagonalizable, then R^n has a basis consisting of
eigenvectors of A.
Suppose instead that {v1, v2, . . . , vn} is a basis of R^n consisting of eigenvectors of A. Let
λ1, λ2, . . . , λn be the eigenvalues of A associated with v1, v2, . . . , vn, respectively, and set

P = \begin{pmatrix} v_1 & v_2 & \cdots & v_n \end{pmatrix}.

Then P is invertible because v1, v2, . . . , vn are linearly independent. Let

D = \begin{pmatrix}
λ_1 & 0 & \cdots & 0 \\
0 & λ_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & λ_n
\end{pmatrix}.

Now, since Av_i = λ_i v_i, we have

AP = A\begin{pmatrix} v_1 & v_2 & \cdots & v_n \end{pmatrix}
= \begin{pmatrix} Av_1 & Av_2 & \cdots & Av_n \end{pmatrix}
= \begin{pmatrix} λ_1 v_1 & λ_2 v_2 & \cdots & λ_n v_n \end{pmatrix}.

On the other hand,

PD = \begin{pmatrix} v_1 & v_2 & \cdots & v_n \end{pmatrix}
\begin{pmatrix}
λ_1 & 0 & \cdots & 0 \\
0 & λ_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & λ_n
\end{pmatrix}
= \begin{pmatrix} λ_1 v_1 & λ_2 v_2 & \cdots & λ_n v_n \end{pmatrix}.

Therefore, AP = PD, and since P is invertible we have that

A = PDP^{-1}.

Thus, if R^n has a basis consisting of eigenvectors of A, then A is diagonalizable. We have
therefore proved the following theorem.

Theorem 5. A matrix A is diagonalizable if and only if there is a basis {v1, v2, . . . , vn} of R^n
consisting of eigenvectors of A.

Here, the punchline is that diagonalizing a matrix A is equivalent to finding a basis of R^n
consisting of eigenvectors of A. We will see in some of the examples below that it is not always
possible to diagonalize a matrix.

Conditions for Diagonalization


We first consider the simplest case in which we can conclude that a given matrix is
diagonalizable, namely, the case when all eigenvalues are distinct.

Theorem 6. Suppose that A ∈ R^{n×n} has n distinct eigenvalues λ1, λ2, . . . , λn. Then A is
diagonalizable.

Proof. Each eigenvalue λi produces an eigenvector vi. The eigenvectors v1, v2, . . . , vn are
linearly independent because they correspond to distinct eigenvalues. Therefore, {v1, v2, . . . , vn}
is a basis of R^n consisting of eigenvectors of A, and then by Theorem 5 we conclude that A is
diagonalizable.

What if A does not have distinct eigenvalues? Can A still be diagonalizable? The following
theorem completely answers this question.

Theorem 7. A matrix A is diagonalizable if and only if the algebraic and geometric multiplicities
of each eigenvalue are equal.

Proof. Let A be an n × n matrix and let λ1, λ2, . . . , λp denote the distinct eigenvalues of A.
Suppose that k1, k2, . . . , kp are the algebraic multiplicities and g1, g2, . . . , gp are the geometric
multiplicities of the eigenvalues, respectively. Suppose first that the algebraic and geometric
multiplicities of each eigenvalue are equal, that is, suppose that gi = ki for each i = 1, 2, . . . , p.
Since k1 + k2 + · · · + kp = n, then because gi = ki we must also have g1 + g2 + · · · + gp = n.
Therefore, there exist n linearly independent eigenvectors of A, and consequently A is
diagonalizable.
On the other hand, suppose that A is diagonalizable. Then A has n linearly independent
eigenvectors, so g1 + g2 + · · · + gp = n. Since each geometric multiplicity is at most the
corresponding algebraic multiplicity and k1 + k2 + · · · + kp = n, the only way that
g1 + g2 + · · · + gp = n is if gi = ki for each i, i.e., the geometric and algebraic multiplicities
are equal.
Example: Determine if A is diagonalizable. If yes, find a matrix P that diagonalizes A.

A = \begin{pmatrix} -4 & -6 & -7 \\ 3 & 5 & 3 \\ 0 & 0 & 3 \end{pmatrix}

Solution. The characteristic polynomial of A is

p(λ) = det(A − λI) = -(λ - 2)(λ - 3)(λ + 1)

and therefore λ1 = 2, λ2 = 3, and λ3 = −1 are the eigenvalues of A. Since A has n = 3 distinct
eigenvalues, A is diagonalizable by Theorem 6. Eigenvectors v1, v2, v3 corresponding to λ1, λ2, λ3
are found to be

v_1 = \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix}, \qquad
v_2 = \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}, \qquad
v_3 = \begin{pmatrix} -2 \\ 1 \\ 0 \end{pmatrix}

Therefore, a matrix that diagonalizes A is

P = \begin{pmatrix} -1 & -1 & -2 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}

You can verify that

P \begin{pmatrix} λ_1 & 0 & 0 \\ 0 & λ_2 & 0 \\ 0 & 0 & λ_3 \end{pmatrix} P^{-1}
= P \begin{pmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & -1 \end{pmatrix} P^{-1} = A
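The suggested verification takes one line in NumPy; this sketch assembles P and D from the eigenpairs just found:

```python
import numpy as np

A = np.array([[-4, -6, -7],
              [ 3,  5,  3],
              [ 0,  0,  3]], dtype=float)
P = np.array([[-1, -1, -2],
              [ 1,  0,  1],
              [ 0,  1,  0]], dtype=float)   # columns v1, v2, v3
D = np.diag([2.0, 3.0, -1.0])               # matching eigenvalues

print(np.allclose(P @ D @ np.linalg.inv(P), A))   # True: A = P D P^{-1}
```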

Example: Determine if A is diagonalizable. If yes, find a matrix P that diagonalizes A.

A = \begin{pmatrix} 2 & 0 & 0 \\ 4 & 2 & 2 \\ -2 & 0 & 1 \end{pmatrix}

Solution. The characteristic polynomial of A is

p(λ) = det(A − λI) = -(λ - 1)(λ - 2)^2

and therefore λ1 = 1 and λ2 = 2. An eigenvector corresponding to λ1 = 1 is

v_1 = \begin{pmatrix} 0 \\ -2 \\ 1 \end{pmatrix}

One finds that g2 = dim(Null(A − λ2 I)) = 2, and two linearly independent eigenvectors for λ2
are

\{v_2, v_3\} = \left\{ \begin{pmatrix} -1 \\ 0 \\ 2 \end{pmatrix},
\begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \right\}
 

Therefore, A is diagonalizable, and a matrix that diagonalizes A is

P = \begin{pmatrix} v_1 & v_2 & v_3 \end{pmatrix}
= \begin{pmatrix} 0 & -1 & 0 \\ -2 & 0 & 1 \\ 1 & 2 & 0 \end{pmatrix}

You can verify that

P \begin{pmatrix} λ_1 & 0 & 0 \\ 0 & λ_2 & 0 \\ 0 & 0 & λ_2 \end{pmatrix} P^{-1}
= P \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{pmatrix} P^{-1} = A

Orthogonal Diagonalization
Recall that an n × n matrix A is diagonalizable if and only if it has n linearly independent
eigenvectors. Moreover, the matrix P with these eigenvectors as columns is a diagonalizing
matrix for A, that is,

P^{-1}AP is diagonal.

As we have seen, the really nice bases of R^n are the orthogonal ones, so a natural question is:
which n × n matrices have an orthogonal basis of eigenvectors? These turn out to be precisely
the symmetric matrices, and this is the main result of this section.

Before proceeding, recall that an orthogonal set of vectors is called orthonormal if ∥v∥ = 1
for each vector v in the set, and that any orthogonal set {v1, v2, . . . , vk} can be “normalized,”
that is, converted into an orthonormal set:

\left\{ \frac{v_1}{\|v_1\|}, \frac{v_2}{\|v_2\|}, \ldots, \frac{v_k}{\|v_k\|} \right\}.

In particular, if a matrix A has n orthogonal eigenvectors, they can (by normalizing) be taken
to be orthonormal. The corresponding diagonalizing matrix P has orthonormal columns, and
such matrices are very easy to invert.
Theorem 8. The following conditions are equivalent for an n × n matrix P.
1. P is invertible and P^{-1} = P^T.
2. The rows of P are orthonormal.
3. The columns of P are orthonormal.
Proof. First, recall that condition (1) is equivalent to PP^T = I. Let x1, x2, . . . , xn denote the
rows of P. Then x_j^T is the jth column of P^T, so the (i, j)-entry of PP^T is given by x_i · x_j.
Thus, PP^T = I means that x_i · x_j = 0 if i ≠ j and x_i · x_j = 1 if i = j. Hence, condition (1) is
equivalent to (2). The proof of the equivalence of (1) and (3) is similar.

Definition: An n × n matrix P is called an orthogonal matrix if it satisfies one (and hence


all) of the conditions in Theorem 8.

Example: The rotation matrix

R(θ) = \begin{pmatrix} \cos θ & -\sin θ \\ \sin θ & \cos θ \end{pmatrix}

is orthogonal for any angle θ.

Example: The matrix

A = \begin{pmatrix} 2 & 1 & 1 \\ -1 & 1 & 1 \\ 0 & -1 & 1 \end{pmatrix}

has orthogonal rows, but its columns are not orthogonal. However, if the rows are normalized,
the resulting matrix

A' = \begin{pmatrix}
2/\sqrt{6} & 1/\sqrt{6} & 1/\sqrt{6} \\
-1/\sqrt{3} & 1/\sqrt{3} & 1/\sqrt{3} \\
0 & -1/\sqrt{2} & 1/\sqrt{2}
\end{pmatrix}

is orthogonal (so the columns are now orthonormal).

Example: If P and Q are orthogonal matrices, then PQ is also orthogonal, as is P^{-1} = P^T.

Solution: Since P and Q are invertible, their product PQ is also invertible, and

(PQ)^{-1} = Q^{-1}P^{-1} = Q^T P^T = (PQ)^T.

Hence, PQ is orthogonal. Similarly, (P^{-1})^{-1} = P = (P^T)^T = (P^{-1})^T shows that P^{-1} is
orthogonal.
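These facts are easy to check numerically. A small sketch with a rotation matrix and a permutation matrix (both orthogonal; the angle is arbitrary):

```python
import numpy as np

theta = 0.7                                       # arbitrary angle
P = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # rotation matrix
Q = np.array([[0.0, 1.0],
              [1.0, 0.0]])                        # permutation matrix

print(np.allclose(P @ P.T, np.eye(2)))            # True: P^{-1} = P^T
PQ = P @ Q
print(np.allclose(PQ @ PQ.T, np.eye(2)))          # True: a product of orthogonal
                                                  # matrices is orthogonal
print(np.allclose(P.T @ P, np.eye(2)))            # True: P^{-1} = P^T is orthogonal
```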

Definition: An n × n matrix A is said to be orthogonally diagonalizable when an orthogonal
matrix P can be found such that P^{-1}AP = P^T AP is diagonal.

Theorem 9. The following conditions are equivalent for an n × n matrix A.

1. A has an orthonormal set of n eigenvectors.

2. A is orthogonally diagonalizable.

3. A is symmetric.

Quadratic Forms
Definition: A quadratic form in n variables is a function f : R^n → R of the form

f(x) = f(x_1, \ldots, x_n) = \sum_{1 ≤ i ≤ j ≤ n} c_{ij} x_i x_j    (∗)

where x ∈ R^n and c_{ij} ∈ R for 1 ≤ i ≤ j ≤ n. Alternatively, a quadratic form is a homogeneous
polynomial of degree 2 in the n variables x1, . . . , xn.
Examples: The following are quadratic forms:

1. f(x_1) = x_1^2

2. f(x_1, x_2) = 2x_1^2 + 3x_2^2 - x_1 x_2

3. f(x_1, x_2, x_3) = x_1^2 + x_2^2 + x_3^2 - 2x_1 x_2 - 2x_1 x_3 - 2x_2 x_3

4. f(x_1, \ldots, x_n) = x_1^2 + x_2^2 + \cdots + x_n^2 = ⟨v, v⟩, where v = (x_1, \ldots, x_n).

Thus, the Euclidean inner product gives rise to a quadratic form.


If we set a_{ii} = c_{ii} for i = 1, . . . , n and a_{ij} = a_{ji} = \frac{1}{2} c_{ij} for 1 ≤ i < j ≤ n, then

f(x) = \sum_{i=1}^{n} a_{ii} x_i^2 + \sum_{1 ≤ i < j ≤ n} 2a_{ij} x_i x_j.

This can be written in matrix form as

f(x) = x^T A x,

where A is the symmetric n × n matrix with (i, j)-th entry equal to a_{ij}. This matrix A is called
the matrix of the quadratic form f.
Example: Let

f(x_1, x_2) = 2x_1^2 - 3x_2^2 - x_1 x_2.

Then,

f(x_1, x_2) = \begin{pmatrix} x_1 & x_2 \end{pmatrix}
\begin{pmatrix} 2 & -\frac{1}{2} \\ -\frac{1}{2} & -3 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}.
Since a symmetric matrix is involved, we can use Theorem 9, which states that there exists an
orthogonal matrix Q such that

Q^T A Q = D = \begin{pmatrix}
λ_1 & \cdots & 0 \\
\vdots & \ddots & \vdots \\
0 & \cdots & λ_n
\end{pmatrix},

where D is a diagonal matrix, and λ1, . . . , λn are the eigenvalues of A.

Setting y = Q^{-1}x = Q^T x, we obtain

x = Qy,

so that

f(x) = (Qy)^T A (Qy) = y^T Q^T A Q y = y^T D y.

If y = (y_1, \ldots, y_n)^T, then

y^T D y = λ_1 y_1^2 + \cdots + λ_n y_n^2,

which is a quadratic form in the variables y1, . . . , yn with no cross terms. This process is called
the diagonalization of the quadratic form f.
Theorem 10. (Principal Axes Theorem) Every quadratic form f can be diagonalized.
Specifically, if

f(x) = x^T A x

is a quadratic form in x = (x_1, \ldots, x_n)^T, then there exists an orthogonal matrix Q such that

f(x) = x^T A x = λ_1 y_1^2 + \cdots + λ_n y_n^2,

where y = (y_1, \ldots, y_n)^T = Q^T x and λ1, . . . , λn are the eigenvalues of A.
From linear algebra, we know that Q is the matrix whose columns are the unit eigenvectors
of A.
Example: Let

f(x_1, x_2, x_3) = 2x_1 x_2 + 2x_1 x_3 + 2x_2 x_3.

The matrix of f is

A = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}.

The eigenvalues of A are 2, −1, −1, with corresponding unit eigenvectors

\begin{pmatrix} 1/\sqrt{3} \\ 1/\sqrt{3} \\ 1/\sqrt{3} \end{pmatrix}, \quad
\begin{pmatrix} -2/\sqrt{6} \\ 1/\sqrt{6} \\ 1/\sqrt{6} \end{pmatrix}, \quad
\begin{pmatrix} 0 \\ 1/\sqrt{2} \\ -1/\sqrt{2} \end{pmatrix}.

Thus,

Q = \begin{pmatrix}
1/\sqrt{3} & -2/\sqrt{6} & 0 \\
1/\sqrt{3} & 1/\sqrt{6} & 1/\sqrt{2} \\
1/\sqrt{3} & 1/\sqrt{6} & -1/\sqrt{2}
\end{pmatrix}.

Setting y = Q^T x, we get

y_1 = \frac{1}{\sqrt{3}}(x_1 + x_2 + x_3), \quad
y_2 = \frac{1}{\sqrt{6}}(-2x_1 + x_2 + x_3), \quad
y_3 = \frac{1}{\sqrt{2}}(x_2 - x_3).

Expressing f in terms of y1, y2, y3, we obtain

f(y) = 2y_1^2 - y_2^2 - y_3^2.
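Here is a minimal NumPy sketch of the same diagonalization. Note that `np.linalg.eigh` (for symmetric matrices) returns the eigenvalues in ascending order together with an orthogonal matrix of eigenvectors, so its columns may be ordered and signed differently than the Q above; the value of the form is unaffected:

```python
import numpy as np

A = np.array([[0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0],
              [1.0, 1.0, 0.0]])            # matrix of f = 2x1x2 + 2x1x3 + 2x2x3

eigvals, Q = np.linalg.eigh(A)             # symmetric eigenproblem
print(eigvals)                             # [-1. -1.  2.]

x = np.array([1.0, 2.0, -1.0])             # arbitrary test point
y = Q.T @ x                                # change of variables
f_direct = x @ A @ x                       # f(x) = x^T A x
f_diag = np.sum(eigvals * y**2)            # sum of lambda_i * y_i^2
print(np.isclose(f_direct, f_diag))        # True: no cross terms in y-coordinates
```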

Definition: A quadratic form f : R^n → R is called positive definite if

f(x) > 0 for all x ≠ 0.

An immediate consequence of the Principal Axes Theorem is the following:

Theorem 11. Let f(x) = x^T A x be a quadratic form with matrix A. Then f is positive definite
if and only if all the eigenvalues of A are positive.

Proof: By the Principal Axes Theorem, there exists an orthogonal matrix Q such that

f(x) = λ_1 y_1^2 + \cdots + λ_n y_n^2,

where y = (y_1, \ldots, y_n)^T = Q^T x and λ1, . . . , λn are the eigenvalues of A. If all the λi are
positive, then f(x) > 0 except when y = 0, and y = 0 happens if and only if x = 0 because Q^T
is invertible. Therefore, f is positive definite.
On the other hand, if one of the eigenvalues satisfies λi ≤ 0, then letting y = e_i and x = Qy,
we get f(x) = λi ≤ 0, and so f is not positive definite.

We say that a symmetric matrix A is positive definite if the associated quadratic form

f(x) = x^T A x

is positive definite. The Principal Axes Theorem has important applications in geometry.
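Theorem 11 gives an immediate numerical test for positive definiteness; a sketch in NumPy (the helper name is illustrative):

```python
import numpy as np

def is_positive_definite(A, tol=1e-12):
    """For symmetric A: f(x) = x^T A x is positive definite iff all eigenvalues > 0."""
    return bool(np.all(np.linalg.eigvalsh(A) > tol))

A1 = np.array([[ 2.0, -1.0],
               [-1.0,  2.0]])              # eigenvalues 1 and 3: positive definite
A2 = np.array([[0.0, 1.0, 1.0],
               [1.0, 0.0, 1.0],
               [1.0, 1.0, 0.0]])           # eigenvalues 2, -1, -1: not positive definite

print(is_positive_definite(A1))            # True
print(is_positive_definite(A2))            # False
```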

