
SMA 2216 APPLIED LINEAR ALGEBRA II

LINEAR MAPPINGS
Definition: Let U and V be vector spaces and let f be a function that associates a unique
vector v ∈ V with each vector u ∈ U. Then f is said to map U into V, denoted f : U −→ V.
In this case, v = f(u) is called the image of u under f. The vector space U is called the
domain of f, denoted dom f, while V is called the codomain (image space) of f, denoted
cod f.
Definition: Let f : U −→ V be a function from a vector space U into a vector space V.
Then f is said to be linear if:
(i) f(u1 + u2) = f(u1) + f(u2) for all u1, u2 ∈ U (i.e. f preserves vector addition).
(ii) f(λu) = λf(u) for all u ∈ U and all scalars λ (i.e. f preserves scalar multiplication).

A linear mapping that maps a vector space U into itself is called a linear operator on U.
Examples:
1. Define a mapping f : R2 −→ R3 by f (x, y) = (x − y, x + y, 0). Determine whether f
is linear or not.
Solution: Let u = (x1, y1), v = (x2, y2) ∈ R2 and λ ∈ R. Then:
(i)
f(u + v) = f(x1 + x2, y1 + y2)
= (x1 + x2 − y1 − y2, x1 + x2 + y1 + y2, 0)
= (x1 − y1, x1 + y1, 0) + (x2 − y2, x2 + y2, 0)
= f(x1, y1) + f(x2, y2)
= f(u) + f(v)
(ii)
f(λu) = f(λx1, λy1)
= (λx1 − λy1, λx1 + λy1, 0)
= λ(x1 − y1, x1 + y1, 0)
= λf(x1, y1)
= λf(u)
Therefore, f is linear.
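The two linearity conditions can also be spot-checked numerically. Below is a minimal sketch of such a check for the map f of Example 1; the sample vectors and scalar are illustrative choices, not part of the notes:

```python
import numpy as np

def f(u):
    # f(x, y) = (x - y, x + y, 0), the map from Example 1
    x, y = u
    return np.array([x - y, x + y, 0.0])

u = np.array([1.0, 2.0])
v = np.array([-3.0, 5.0])
lam = 7.0

additive = np.allclose(f(u + v), f(u) + f(v))      # condition (i)
homogeneous = np.allclose(f(lam * u), lam * f(u))  # condition (ii)
print(additive, homogeneous)   # -> True True
```

A passing check is of course not a proof of linearity; the algebraic argument above is.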
2. Show that the function g : R3 −→ R2 given by g (x, y, z) = (x + y + z, x − y − z) is
linear.
Solution: Let u = (a, b, c), v = (x, y, z) ∈ R3 and λ ∈ R. Then:
(i)
g(u + v) = g(a + x, b + y, c + z)
= (a + x + b + y + c + z, a + x − b − y − c − z)
= (a + b + c, a − b − c) + (x + y + z, x − y − z)
= g(a, b, c) + g(x, y, z)
= g(u) + g(v)
(ii)
g(λu) = g(λa, λb, λc)
= (λa + λb + λc, λa − λb − λc)
= λ(a + b + c, a − b − c)
= λg(a, b, c)
= λg(u)
Therefore, g is linear.
3. Consider the function h : R3 −→ R3 defined by h(x, y, z) = (2x + y + 1, −x, z). Let
u = (−1, 0, 0) and v = (1, 1, 1). Then
h(u) + h(v) = (−1, 1, 0) + (4, −1, 1) = (3, 0, 1)
while
h(u + v) = h(0, 1, 1) = (2, 0, 1)
Since h(u + v) ≠ h(u) + h(v), h is not linear.
Alternatively, let u = (−1, 0, 0) and λ = 2. Then
h(2u) = h(−2, 0, 0) = (−3, 2, 0)
while
2h(u) = 2(−1, 1, 0) = (−2, 2, 0)
Since 2h(u) ≠ h(2u), h is not linear.
4. The function f : R −→ R where f (x) = sin x is not linear.
Proof :
f(π/2 + π/2) = f(π) = sin π = 0
but
f(π/2) + f(π/2) = sin(π/2) + sin(π/2) = 1 + 1 = 2.
Hence f is not linear, since f(π/2 + π/2) ≠ f(π/2) + f(π/2).
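A single failing instance is enough to rule out linearity, as Examples 3 and 4 show. A quick numerical confirmation of both counterexamples (a sketch using the same test vectors as above):

```python
import math
import numpy as np

def h(u):
    # h(x, y, z) = (2x + y + 1, -x, z), from Example 3
    x, y, z = u
    return np.array([2 * x + y + 1, -x, z])

u = np.array([-1.0, 0.0, 0.0])
v = np.array([1.0, 1.0, 1.0])

# Additivity fails for h ...
h_additive = np.allclose(h(u + v), h(u) + h(v))
# ... and for sin, from Example 4
sin_additive = math.isclose(math.sin(math.pi / 2 + math.pi / 2),
                            math.sin(math.pi / 2) + math.sin(math.pi / 2))
print(h_additive, sin_additive)   # -> False False
```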
Note: A function f(x, y, z, w, · · · ) = (· · · ) is linear if each component of the output vector
f(x, y, z, w, · · · ) is a linear combination of the components x, y, z, w, · · · .
Examples:
1. In Example 1 above,
f(x, y) = (x − y, x + y, 0)
= x(1, 1, 0) + y(−1, 1, 0).
Since f(x, y) is a linear combination of x and y, f is linear.
2. In Example 2 above,
g(x, y, z) = (x + y + z, x − y − z)
= x(1, 1) + y(1, −1) + z(1, −1)
Since g(x, y, z) is a linear combination of x, y, and z, g is linear.
3. The function f : R3 −→ R3, f(x, y, z) = ((sin x)², z³ + 3, xy − 1) is not linear since
((sin x)², z³ + 3, xy − 1) cannot be expressed as a linear combination of x, y, and z.
Further Examples:
1. Let U and V be vector spaces and define T : U −→ V by T(u) = 0 for every u ∈ U.
Then
T(u1 + u2) = 0 = 0 + 0 = T(u1) + T(u2)
and
T(αu) = 0 = α0 = αT(u).
Thus T is a linear transformation, called the zero transformation.
2. Let V be a vector space. Define I : V −→ V by I(v) = v for all v ∈ V. In this case, I is
a linear transformation, called the identity transformation/operator.
3. Let f : R2 −→ R2 be defined by f(x, y) = (−x, y). Geometrically, f is a reflection
about the y-axis and can be shown to be linear.
Non-Linear Transformations
Not every transformation that looks linear is actually linear.
Example:
Let T : R −→ R where T(x) = 2x + 3. Then the graph of {(x, T(x)) : x ∈ R} is a straight
line in the xy-plane. However, T is not linear since T(x + y) = 2(x + y) + 3 = 2x + 2y + 3
while T(x) + T(y) = (2x + 3) + (2y + 3) = 2x + 2y + 6, so that T(x + y) ≠ T(x) + T(y).
Note: The only linear transformations from R to R are the functions of the form f(x) = mx,
m ∈ R.
Exercise:
Determine whether the following functions are linear or not:
1. f : R2 −→ R2 , f (x, y) = (2x + y, x)
2. f : R2 −→ R2 , f (x, y) = (x + 2y, 3x − 2y)
3. f : R2 −→ R2 , f (x, y) = (−2x + 1, x)

4. f : R2 −→ R3 , f (x, y) = (x + y, x − y, x)
5. f : R3 −→ R2 , f (x, y, z) = (x + y + 2z, x − y)
6. T : R3 −→ R3 , T (x1 , x2 , x3 ) = (x1 + 2x2 − x3 , −x2 , x1 + 7x3 )
7. T : R4 −→ R3 , T (x1 , x2 , x3 , x4 ) = (x1 + x2 , x2 + x3 , x4 + x4 )
Theorem: Let f : U −→ V be linear. Then for all vectors u, v, u1, u2, · · · , un ∈ U and all
scalars α1, α2, · · · , αn:
(i) f(0) = 0.
(ii) f(u − v) = f(u) − f(v).
(iii) f(α1 u1 + α2 u2 + · · · + αn un) = α1 f(u1) + α2 f(u2) + · · · + αn f(un).
Example:
Let S = {v1 = (1, 1, 1), v2 = (1, 1, 0), v3 = (1, 0, 0)} be a basis for R3 and let T : R3 −→ R2
be the linear mapping such that T(v1) = (1, 0), T(v2) = (2, −1), and T(v3) = (4, 3).
Find a formula for T(x, y, z), then use the formula to find T(2, −3, 5).
Solution: Expressing (x, y, z) as a linear combination of v1, v2, v3,
(x, y, z) = λ1(1, 1, 1) + λ2(1, 1, 0) + λ3(1, 0, 0)
Equating corresponding components,
λ1 + λ2 + λ3 = x
λ1 + λ2 = y
λ1 = z
so that λ1 = z, λ2 = y − z, and λ3 = x − y. Hence
(x, y, z) = z(1, 1, 1) + (y − z)(1, 1, 0) + (x − y)(1, 0, 0)
= z v1 + (y − z) v2 + (x − y) v3
Thus
T(x, y, z) = z T(v1) + (y − z) T(v2) + (x − y) T(v3)
= z(1, 0) + (y − z)(2, −1) + (x − y)(4, 3)
= (4x − 2y − z, 3x − 4y + z)
Now, T(2, −3, 5) = (9, 23).
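The computation above can be cross-checked numerically: the coordinates λ1, λ2, λ3 of (x, y, z) in the basis S come from a linear solve, and the images of the basis vectors are then combined. A sketch (the variable names are illustrative):

```python
import numpy as np

# Basis vectors v1, v2, v3 of S as columns
S = np.array([[1, 1, 1],
              [1, 1, 0],
              [1, 0, 0]], dtype=float).T
# Images T(v1), T(v2), T(v3) as rows
T_images = np.array([[1, 0], [2, -1], [4, 3]], dtype=float)

def T(p):
    lam = np.linalg.solve(S, np.asarray(p, dtype=float))  # coordinates in S
    return lam @ T_images   # lam1*T(v1) + lam2*T(v2) + lam3*T(v3)

print(T([2, -3, 5]))   # -> [ 9. 23.]
```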


Exercise:
Let T : R3 −→ R2 be the linear mapping such that T (1, 0, 0) = (2, 3), T (0, 1, 0) = (−1, 4),
and T (0, 0, 1) = (5, −3). Find T (3, −4, 5).

Kernel and Range of a Linear Transformation
Definition: Let U and V be vector spaces and let f : U −→ V be a linear transformation.
Then
(i) the kernel (null space) of f, denoted ker f, is given by ker f = {u ∈ U | f(u) = 0}.
(ii) the range (image) of f, denoted range f or im f, is given by im f = {f(u) ∈ V | u ∈ U}.
Note: In the above definition:
(a) ker f ⊆ U while im f ⊆ V.
(b) (i) ker f ≠ ∅ as it contains at least the zero vector in U (by the preceding theorem,
f(0) = 0).
(ii) im f ≠ ∅ since it contains at least the zero vector in V (by the preceding theorem,
0 = f(0) ∈ im f).
Theorem: Let f : U −→ V be a linear mapping. Then:
(a) ker f is a subspace of U.
(b) im f is a subspace of V.
Proof:
(a) Let u1, u2 ∈ ker f and let λ be a scalar. Then f(u1) = f(u2) = 0. Now,
(i) f(u1 + u2) = f(u1) + f(u2) = 0 + 0 = 0, implying that u1 + u2 ∈ ker f.
(ii) f(λu1) = λf(u1) = λ0 = 0, implying that λu1 ∈ ker f.
(b) Let v1, v2 ∈ im f and let λ be a scalar. Then there exist u1, u2 ∈ U such that
v1 = f(u1) and v2 = f(u2). Now,
(i) v1 + v2 = f(u1) + f(u2) = f(u1 + u2) ∈ im f.
(ii) λv1 = λf(u1) = f(λu1) ∈ im f.
Definition: Let f : U −→ V be a linear mapping. Then:
(i ) the nullity of f , denoted N (f ), is the dimension of ker f , i.e. N (f ) = dim ker f .
(ii ) the rank of f , denoted R (f ), is the dimension of imf , i.e. R (f ) = dim imf .
Theorem (Rank-Nullity Theorem): Let f : U −→ V be a linear transformation where
dim U = n and dim V = m. Then n = R (f ) + N (f ).
Examples:
1. Let f : R2 −→ R2 be defined by f(x, y) = (x + y, x). Then
ker f = {(x, y) ∈ R2 | f(x, y) = (0, 0)}
= {(x, y) | (x + y, x) = (0, 0)}
so that
x + y = 0
x = 0
which gives x = y = 0. Hence
ker f = {(x, y) | x = y = 0}
= {(0, 0)}
Recall that for the trivial vector space {0}, there is no non-empty linearly independent
spanning set. Consequently, the empty set is considered to be a basis for {0}. Thus a
basis for ker f in this case is ∅, so that N(f) = 0.
Now,
im f = {f(x, y) | x, y ∈ R}
= {(x + y, x) | x, y ∈ R}
= {x(1, 1) + y(1, 0) | x, y ∈ R}
In this case, {(1, 1), (1, 0)} is a linearly independent spanning set for im f and is hence
a basis for im f. Accordingly, R(f) = 2.
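The dimensions found above can be confirmed with a rank computation: the matrix of f(x, y) = (x + y, x) relative to the standard bases has rank 2, so the nullity is 2 − 2 = 0 by the rank-nullity theorem. A sketch using NumPy's `matrix_rank`:

```python
import numpy as np

A = np.array([[1, 1],    # matrix of f(x, y) = (x + y, x):
              [1, 0]])   # columns are f(1, 0) and f(0, 1)
rank = np.linalg.matrix_rank(A)   # R(f)
nullity = A.shape[1] - rank       # N(f), by rank-nullity
print(rank, nullity)   # -> 2 0
```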
2. Let f : R3 −→ R2 be defined by f(x, y, z) = (x − y, x + y + z). Determine:
(i) a basis for ker f, and N(f).
(ii) a basis for im f, and R(f).
Solution:
In this case
ker f = {(x, y, z) ∈ R3 | f(x, y, z) = (0, 0)}
= {(x, y, z) | (x − y, x + y + z) = (0, 0)}
so that
x − y = 0
x + y + z = 0
yielding y = x and z = −2x. Hence
ker f = {(x, y, z) | y = x, z = −2x}
= {(x, x, −2x) | x ∈ R}
= {x(1, 1, −2) | x ∈ R}
Thus a basis for ker f is {(1, 1, −2)}, implying that N(f) = 1.
On the other hand,
im f = {f(x, y, z) | (x, y, z) ∈ R3}
= {(x − y, x + y + z) | x, y, z ∈ R}
= {x(1, 1) + y(−1, 1) + z(0, 1) | x, y, z ∈ R}
So the vectors (1, 1), (−1, 1), and (0, 1) span im f. Clearly, the three vectors are linearly
dependent. However, any two are linearly independent and will therefore form a basis
for im f. For instance, {(1, 1), (−1, 1)} is a basis for im f. Accordingly, R(f) = 2.
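The kernel basis found in (i) can be recovered numerically from the singular value decomposition of the matrix of f: rows of Vᵀ with (near-)zero singular values span the null space. A sketch (SciPy users could call `scipy.linalg.null_space` instead):

```python
import numpy as np

A = np.array([[1, -1, 0],     # matrix of f(x, y, z) = (x - y, x + y + z)
              [1,  1, 1]], dtype=float)

_, s, vh = np.linalg.svd(A)
# Pad s with zeros: trailing rows of vh have implicit singular value 0.
s_full = np.append(s, np.zeros(vh.shape[0] - len(s)))
kernel = vh[s_full < 1e-10]   # rows spanning ker f
v = kernel[0]
print(v / v[0])   # rescaled: proportional to (1, 1, -2)
```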
3. Define f : R3 −→ R3 by f(x, y, z) = (x + y − 3z, −x − 2y + 2z, x − 2y + z). Find the
nullity and rank of f.
Solution:
Method I:
In this case
ker f = {(x, y, z) | (x + y − 3z, −x − 2y + 2z, x − 2y + z) = (0, 0, 0)}
yielding the homogeneous system
x + y − 3z = 0
−x − 2y + 2z = 0
x − 2y + z = 0
We reduce the matrix A of coefficients to a row echelon form E:
A = [  1  1 −3 ]  R2 → R1 + R2   [ 1  1 −3 ]  R3 → −4R2 + R3   [ 1  1 −3 ]
    [ −1 −2  2 ]  R3 → R2 + R3   [ 0 −1 −1 ]       −→          [ 0 −1 −1 ] = E
    [  1 −2  1 ]       −→        [ 0 −4  3 ]                   [ 0  0  7 ]
We now have an equivalent homogeneous system
x + y − 3z = 0
−y − z = 0
7z = 0
and using back substitution gives z = y = x = 0. Hence ker f = {(0, 0, 0)}, and
N(f) = 0.
By the rank-nullity theorem, 3 = N(f) + R(f) = 0 + R(f). Thus R(f) = 3.
Method II:
im f = {f(x, y, z) | x, y, z ∈ R}
= {(x + y − 3z, −x − 2y + 2z, x − 2y + z) | x, y, z ∈ R}
= {x(1, −1, 1) + y(1, −2, −2) + z(−3, 2, 1) | x, y, z ∈ R}
The vectors (1, −1, 1), (1, −2, −2), and (−3, 2, 1) span im f. These are the columns
of the matrix A above and, from the row echelon form E, the three vectors are linearly
independent (recall that the number of linearly independent columns of a matrix equals
the number of linearly independent rows). Thus they form a basis for im f. Therefore,
R(f) = 3.
By the rank-nullity theorem, 3 = N(f) + R(f) = N(f) + 3. Thus N(f) = 0.
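Both methods reduce to a rank computation on the coefficient matrix A; `numpy.linalg.matrix_rank` reaches the same conclusion (a sketch):

```python
import numpy as np

A = np.array([[ 1,  1, -3],
              [-1, -2,  2],
              [ 1, -2,  1]], dtype=float)
R = np.linalg.matrix_rank(A)   # rank of f
N = A.shape[1] - R             # nullity, by the rank-nullity theorem
print(R, N)   # -> 3 0
```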

Exercise:
For each of the linear mappings
1. f : R2 −→ R2 where f (x, y) = (x + y, x − y)
2. f : R2 −→ R2 where f (x, y) = (2x + y, 4x + 2y)
3. f : R3 −→ R3 where f (x, y, z) = (z − y, x − z, y − x)
4. f : R3 −→ R3 where f (x, y, z) = (x − y + z, 2x + y − z, −x + 2y + z)
find:
(a) a basis for ker f and hence N (f ).
(b) a basis for imf and hence R (f ).

Matrices of Linear Functions


If U and V are finite-dimensional vector spaces, then any linear transformation f : U −→ V
can be seen as a matrix transformation. This makes it possible to work with coordinate
matrices of the vectors instead of the vectors themselves.
Suppose dim U = n and dim V = m. Let B1 = {u1, u2, · · · , un} and B2 = {v1, v2, · · · , vm}
be bases for U and V respectively. Then f(u1), f(u2), · · · , f(un) are vectors in V, so
each of them is a linear combination of the elements of the basis B2, i.e.
f(u1) = a11 v1 + a12 v2 + · · · + a1m vm
f(u2) = a21 v1 + a22 v2 + · · · + a2m vm
· · ·
f(un) = an1 v1 + an2 v2 + · · · + anm vm
The matrix
[ a11 a12 · · · a1m ]ᵀ   [ a11 a21 · · · an1 ]
[ a21 a22 · · · a2m ]  = [ a12 a22 · · · an2 ]
[  ·   ·  · · ·  ·  ]    [  ·   ·  · · ·  ·  ]
[ an1 an2 · · · anm ]    [ a1m a2m · · · anm ]
i.e. the transpose of the matrix of coefficients of the above system, is called the matrix representation of f with respect
to the bases B1 and B2. In the matrix of f, the coordinate vectors of the images f(u1), · · · , f(un)
are written vertically (as columns). If the bases B1 and B2 are different, the matrix of the
transformation is called the transition matrix.
Note: Let A be the matrix of a linear mapping f : U −→ V.
1. If u ∈ U, then f(u) = Au. The columns of A span im f, i.e. im f is the column space
of A (the vector space spanned by the columns of A). Therefore, the number of linearly
independent columns (rows) of A is the rank of f.
2. If dim U = n and dim V = m, then A is of order m × n irrespective of the bases for U
and V.
Examples:
1. Let f : R2 −→ R3 with f(x, y) = (x + y, x − y, x). Find the matrix A of f with respect
to the standard bases for R2 and R3.
Solution: {(1, 0), (0, 1)} is the standard basis for R2 while {(1, 0, 0), (0, 1, 0), (0, 0, 1)}
is the standard basis for R3. Now,
f(1, 0) = (1, 1, 1) = 1(1, 0, 0) + 1(0, 1, 0) + 1(0, 0, 1)
f(0, 1) = (1, −1, 0) = 1(1, 0, 0) − 1(0, 1, 0) + 0(0, 0, 1)
So the matrix of f is
A = [ 1  1  1 ]ᵀ   [ 1  1 ]
    [ 1 −1  0 ]  = [ 1 −1 ]
                   [ 1  0 ]
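With standard bases, the j-th column of A is simply f applied to the j-th standard basis vector, so the construction can be automated. A sketch, with `f` written for the map of Example 1:

```python
import numpy as np

def f(u):
    x, y = u
    return np.array([x + y, x - y, x], dtype=float)

# Columns of A are the images of the standard basis vectors of R^2.
A = np.column_stack([f(e) for e in np.eye(2)])
print(A)
# [[ 1.  1.]
#  [ 1. -1.]
#  [ 1.  0.]]

# Sanity check: A acts like f on an arbitrary vector.
u = np.array([2.0, -3.0])
print(np.allclose(A @ u, f(u)))   # -> True
```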
2. Let g : R3 −→ R3 be given by g(x, y, z) = (x − y, z, x + y + z). Find the matrix A of
g with respect to the standard basis for R3.
Solution: The standard basis for R3 is {(1, 0, 0), (0, 1, 0), (0, 0, 1)}. Now,
g(1, 0, 0) = (1, 0, 1) = 1(1, 0, 0) + 0(0, 1, 0) + 1(0, 0, 1)
g(0, 1, 0) = (−1, 0, 1) = −1(1, 0, 0) + 0(0, 1, 0) + 1(0, 0, 1)
g(0, 0, 1) = (0, 1, 1) = 0(1, 0, 0) + 1(0, 1, 0) + 1(0, 0, 1)
So
A = [  1 0 1 ]ᵀ   [ 1 −1 0 ]
    [ −1 0 1 ]  = [ 0  0 1 ]
    [  0 1 1 ]    [ 1  1 1 ]
is the matrix of g with respect to the standard basis for R3.
Clearly,
A (x, y, z)ᵀ = (x − y, z, x + y + z)ᵀ = g(x, y, z).
Now,
im g = {(x − y, z, x + y + z) | x, y, z ∈ R}
= {x(1, 0, 1) + y(−1, 0, 1) + z(0, 1, 1) | x, y, z ∈ R}
So the vectors (1, 0, 1), (−1, 0, 1), and (0, 1, 1) span im g. These are, respectively, the
first, second and third columns of A.
3. Define h : R3 −→ R3 by h(x, y, z) = (−2x + y − z, x − 2y − z, −x − y − 2z). Find the
matrix of h using the basis B = {(1, 0, 0), (1, 1, 0), (1, 1, 1)} for R3.
Solution: In this case,
h(1, 0, 0) = (−2, 1, −1) = a(1, 0, 0) + b(1, 1, 0) + c(1, 1, 1)
h(1, 1, 0) = (−1, −1, −2) = d(1, 0, 0) + e(1, 1, 0) + f(1, 1, 1)
h(1, 1, 1) = (−2, −2, −4) = g(1, 0, 0) + h(1, 1, 0) + i(1, 1, 1)
Hence we have
a + b + c = −2
b + c = 1
c = −1
so that c = −1, b = 2, a = −3,
d + e + f = −1
e + f = −1
f = −2
so that f = −2, e = 1, d = 0,
g + h + i = −2
h + i = −2
i = −4
so that i = −4, h = 2, g = 0.
Therefore, the required matrix is
[ a d g ]   [ −3  0  0 ]
[ b e h ] = [  2  1  2 ]
[ c f i ]   [ −1 −2 −4 ]
4. Define the linear transformation T : R2 −→ R3 by T(x, y) = (y, −5x + 13y, −7x + 16y).
Determine the matrix of T with respect to the bases B = {(3, 1), (5, 2)} for R2 and
B′ = {(1, 0, −1), (−1, 2, 2), (0, 1, 2)} for R3.
Solution: In this case,
T(3, 1) = (1, −2, −5) = a(1, 0, −1) + b(−1, 2, 2) + c(0, 1, 2)
T(5, 2) = (2, 1, −3) = d(1, 0, −1) + e(−1, 2, 2) + f(0, 1, 2)
and we have
a − b = 1
2b + c = −2
−a + 2b + 2c = −5
so that a = 1, b = 0, c = −2,
d − e = 2
2e + f = 1
−d + 2e + 2f = −3
so that d = 3, e = 1, f = −1.
Therefore, the matrix of T with respect to B and B′ is
[  1  3 ]
[  0  1 ]
[ −2 −1 ]
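Each column of the matrix of T is the coordinate vector of T(b) relative to B′, which is a linear solve against the B′ vectors. A sketch of the computation above:

```python
import numpy as np

def T(u):
    x, y = u
    return np.array([y, -5 * x + 13 * y, -7 * x + 16 * y], dtype=float)

B = [np.array([3.0, 1.0]), np.array([5.0, 2.0])]                # basis for R^2
Bp = np.array([[1, 0, -1], [-1, 2, 2], [0, 1, 2]], float).T     # B' vectors as columns

# Column j = coordinates of T(B[j]) in the basis B'.
M = np.column_stack([np.linalg.solve(Bp, T(b)) for b in B])
print(M)
# [[ 1.  3.]
#  [ 0.  1.]
#  [-2. -1.]]
```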

Exercise:
1. Let f : R3 −→ R2 with f (x, y, z) = (x + z, 2x + y + z). Find the matrix of f using the
standard bases for R3 and R2 .
2. Suppose U = {all polynomials with deg ≤ 3} and V = {all polynomials with deg ≤ 2},
and suppose f : U −→ V is given by f = d/dx. Find the matrix of f using {1, x, x2, x3} and
{1, x, x2} as the bases for U and V respectively.
3. Let T : R2 −→ R3 be defined by T (x, y) = (x + 2y, −x, 0). Find the matrix of T with
respect to the bases B1 = {(1, 3) , (−2, 4)} and B2 = {(1, 1, 1) , (2, 2, 0) , (3, 0, 0)} for R2
and R3 respectively.
 
4. Suppose
A = [ 2  5 −3 ]
    [ 1 −4  7 ]
represents the linear transformation f : R3 −→ R2 defined by f(v) = Av. Find the matrix
that represents the mapping relative to the bases {(1, 1, 1), (1, 1, 0), (1, 0, 0)} and {(1, 3), (2, 5)}
for R3 and R2 respectively.

One-to-one and Onto Transformations


Definition: Let f : U −→ V be a linear transformation. Then f is one-to-one (injective),
written 1-1, if and only if f(u) = f(v) implies that u = v, i.e. f is 1-1 if and only if every
vector f(u) in the range of f is the image of exactly one vector in U.
Theorem: Let f : U −→ V be a linear transformation. Then f is one-to-one if and only if
ker f = {0} (i.e. if and only if N(f) = 0).
Proof: Suppose f is one-to-one. Let u ∈ ker f. Then f(u) = 0. But f(0) = 0 also.
Since f is one-to-one, u = 0, so that ker f = {0}.
Conversely, suppose ker f = {0} and that f(u1) = f(u2). Then
0 = f(u1) − f(u2) = f(u1 − u2)
⇒ u1 − u2 ∈ ker f = {0} ⇒ u1 − u2 = 0 ⇒ u1 = u2. Hence f is one-to-one.
Examples:
1. Let f : R2 −→ R3 be defined by f(x, y) = (x + y, x − y, x). Determine whether f is
one-to-one or not.
Solution:
In this case
ker f = {(x, y) ∈ R2 | f(x, y) = (0, 0, 0)}
= {(x, y) | (x + y, x − y, x) = (0, 0, 0)}
so that
x + y = 0
x − y = 0
x = 0
yielding x = y = 0. Hence
ker f = {(x, y) | x = y = 0}
= {(0, 0)}
implying that N(f) = 0. Therefore, f is one-to-one.


2. Let f : R2 −→ R2 be defined by f(x, y) = (x − y, 2x − 2y). Find ker f and hence
determine whether f is one-to-one or not.
Solution:
In this case
ker f = {(x, y) ∈ R2 | f(x, y) = (0, 0)}
= {(x, y) | (x − y, 2x − 2y) = (0, 0)}
so that
x − y = 0
2x − 2y = 0
yielding y = x. Hence
ker f = {(x, y) | y = x}
= {(x, x) | x ∈ R}
= {x(1, 1) | x ∈ R}
Therefore, {(1, 1)} is a basis for ker f, implying that N(f) = 1. Now, since ker f ≠ {0}
(or N(f) ≠ 0), f is not one-to-one.
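Numerically, injectivity of a linear map is read off the rank of its matrix: N(f) = (number of columns) − rank, and f is one-to-one exactly when this is 0. A sketch for the two examples above:

```python
import numpy as np

def nullity(A):
    A = np.asarray(A, dtype=float)
    return A.shape[1] - np.linalg.matrix_rank(A)

A1 = [[1, 1], [1, -1], [1, 0]]   # f(x, y) = (x + y, x - y, x)
A2 = [[1, -1], [2, -2]]          # f(x, y) = (x - y, 2x - 2y)
print(nullity(A1), nullity(A2))  # -> 0 1  (first map is 1-1, second is not)
```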

Definition: Let f : U −→ V be a linear transformation. Then f is said to be onto
(surjective) if for each v ∈ V there is at least one u ∈ U such that v = f(u). In other
words, f is onto if and only if im f = V.
Theorem: Let f : U −→ V be a linear transformation with dim U = n and dim V = m.
Then f is onto if and only if R(f) = m.
Proof: f is onto ⇔ for each v ∈ V there exists u ∈ U such that f(u) = v ⇔ im f = V
⇔ R(f) = m.
Examples:
1. In Example 1 above, N(f) = 0. By the rank-nullity theorem, R(f) = 2 < dim cod f.
Hence im f ≠ R3. Therefore, f is not onto.
2. It was shown in a previous example that the linear mapping f : R3 −→ R2 defined by
f(x, y, z) = (x − y, x + y + z) has R(f) = 2 = dim cod f. Hence im f = R2. Thus f
is onto.
Theorem: Let f : U −→ V be a linear transformation with dim U = dim V = n.
(i) If f is one-to-one, then f is onto.
(ii) If f is onto, then f is one-to-one.
Theorem: Let f : U −→ V be a linear transformation. Suppose that dim U = n and
dim V = m. Then:
(i) If n > m, f is not one-to-one.
(ii) If m > n, f is not onto.

Inverse of a Linear Transformation


Let f : U −→ V be a linear transformation. Then f is onto if for every v ∈ V there is at
least one u ∈ U such that f(u) = v, and f is one-to-one if each u ∈ U has a unique image
v = f(u) in V. This existence and uniqueness allows us to define a new transformation
called the inverse of f, denoted f⁻¹, which maps V back to U, i.e. f⁻¹ : V −→ U.

Conditions for a Linear Transformation to have an Inverse


Let f : U −→ V be a linear transformation where dim U = n and dim V = m. Then f⁻¹
exists if:
(i) f is one-to-one ⇔ ker f = {0} ⇔ N(f) = 0.
(ii) f is onto ⇔ im f = V ⇔ R(f) = m.
(iii) n = m.
Note:
(a) Condition (iii ) above is only necessary but not sufficient. It is a consequence of the
first two. So (i ) and (ii ) are sufficient to tell whether f is invertible or not. For instance,
for any finite n and m, f : Rn −→ Rm is not invertible if n 6= m.
(b) f is invertible if its matrix representation with respect to any bases is invertible.
Theorem: Suppose that f : V −→ V is multiplication by an invertible n × n matrix A.
Then f has an inverse and f −1 : V −→ V is multiplication by A−1 .
Theorem: Suppose f : V −→ V is multiplication by an n × n matrix A. Then the following
are equivalent:
(i ) f is invertible.
(ii ) A is invertible.
Examples:
1. Let f : R3 −→ R3 where f(x, y, z) = (x + y + z, x − 3z, y + 4z). Show that f is not
invertible.
Solution:
Method I:
In this case, ker f = {(x, y, z) | (x + y + z, x − 3z, y + 4z) = (0, 0, 0)}.
So we have a homogeneous system
x + y + z = 0
x − 3z = 0
y + 4z = 0
We now reduce the matrix of coefficients of the system to a row echelon form:
[ 1 1  1 ]  R2 → R1 − R2   [ 1 1 1 ]  R3 → R2 − R3   [ 1 1 1 ]
[ 1 0 −3 ]      −→         [ 0 1 4 ]      −→         [ 0 1 4 ]
[ 0 1  4 ]                 [ 0 1 4 ]                 [ 0 0 0 ]
We now have an equivalent homogeneous system
x + y + z = 0
y + 4z = 0
Solving for the basic variables x and y in terms of the free variable z: the equation
y + 4z = 0 gives y = −4z, and back substitution into the equation x + y + z = 0 gives
x = −y − z = 4z − z = 3z.
Hence
ker f = {(x, y, z) | x = 3z, y = −4z}
= {(3z, −4z, z) | z ∈ R}
= {z(3, −4, 1) | z ∈ R}
implying that N(f) = 1 ≠ 0. So f is not one-to-one and therefore not invertible.


Method II:
In this case,
im f = {(x + y + z, x − 3z, y + 4z) | x, y, z ∈ R}
= {x(1, 1, 0) + y(1, 0, 1) + z(1, −3, 4) | x, y, z ∈ R}
so that im f = span{(1, 1, 0), (1, 0, 1), (1, −3, 4)} and
[ 1  1 0 ]  R2 → R1 − R2   [ 1 1  0 ]  R3 → 4R2 − R3   [ 1 1  0 ]
[ 1  0 1 ]  R3 → R1 − R3   [ 0 1 −1 ]       −→         [ 0 1 −1 ]
[ 1 −3 4 ]      −→         [ 0 4 −4 ]                  [ 0 0  0 ]
From the row echelon form above, R(f) = 2 ≠ 3 = dim cod f. Thus f is not onto and
therefore not invertible.
Method III:
The matrix of f relative to the standard basis for R3 is
A = [ 1 1  1 ]
    [ 1 0 −3 ]
    [ 0 1  4 ]
Calculations show that det A = 0. Hence A, and consequently f, is not invertible.
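Method III is a one-line check numerically (a sketch):

```python
import numpy as np

A = np.array([[1, 1,  1],
              [1, 0, -3],
              [0, 1,  4]], dtype=float)
print(np.linalg.det(A))          # -> 0.0 (up to rounding), so f is not invertible
print(np.linalg.matrix_rank(A))  # -> 2, agreeing with Method II
```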
2. Let f : R3 −→ R3 where f(x, y, z) = (3x + 2y, x − 3z, y + 4z).
(i) Show that f is invertible.
(ii) Given that f(u) = (1, 0, 0), f(v) = (0, 1, 0) and f(w) = (0, 0, 1), find the vectors u,
v, and w.
(iii) Find a formula for f⁻¹.

Solution:
(i)
Method I:
In this case, im f = {x(3, 1, 0) + y(2, 0, 1) + z(0, −3, 4) | x, y, z ∈ R}. So the vectors
(3, 1, 0), (2, 0, 1), and (0, −3, 4) span im f. Now,
[ 3  1 0 ]  R2 → 2R1 − 3R2   [ 3  1  0 ]  R3 → 3R2 + 2R3   [ 3 1  0 ]
[ 2  0 1 ]       −→          [ 0  2 −3 ]       −→          [ 0 2 −3 ]
[ 0 −3 4 ]                   [ 0 −3  4 ]                   [ 0 0 −1 ]
From the row echelon form above, R(f) = 3 = dim cod f. Thus f is onto. From the
rank-nullity theorem, N(f) = 0. Thus f is one-to-one. Since f is both one-to-one and
onto, it is invertible.
Method II:
In this case, ker f = {(x, y, z) | (3x + 2y, x − 3z, y + 4z) = (0, 0, 0)}.
So we have a homogeneous system
3x + 2y = 0
x − 3z = 0
y + 4z = 0
We now reduce the matrix of coefficients of the system to a row echelon form:
[ 3 2  0 ]  R2 → R1 − 3R2   [ 3 2 0 ]  R3 → R2 − 2R3   [ 3 2 0 ]
[ 1 0 −3 ]      −→          [ 0 2 9 ]      −→          [ 0 2 9 ]
[ 0 1  4 ]                  [ 0 1 4 ]                  [ 0 0 1 ]
We now have an equivalent homogeneous system
3x + 2y = 0
2y + 9z = 0
z = 0
yielding z = y = x = 0. Hence ker f = {(0, 0, 0)}, implying that N(f) = 0. Thus f is
one-to-one. From the rank-nullity theorem, R(f) = 3. Thus f is onto. Since f is both
one-to-one and onto, it is invertible.
Method III:
The matrix of f relative to the standard basis for R3 is
A = [ 3 2  0 ]
    [ 1 0 −3 ]
    [ 0 1  4 ]
Calculations show that det A = 1 ≠ 0. Hence A, and consequently f, is invertible.
(ii) Let u = (a, b, c). Then f(a, b, c) = (3a + 2b, a − 3c, b + 4c) = (1, 0, 0), so that
3a + 2b = 1
a − 3c = 0
b + 4c = 0
which yields a = 3, b = −4, and c = 1. Hence u = (3, −4, 1).
Next, let v = (p, q, r). Then f(p, q, r) = (3p + 2q, p − 3r, q + 4r) = (0, 1, 0). Thus
3p + 2q = 0
p − 3r = 1
q + 4r = 0
which yields p = −8, q = 12, and r = −3. Hence v = (−8, 12, −3).
If w = (k, m, n), then f(k, m, n) = (3k + 2m, k − 3n, m + 4n) = (0, 0, 1). Thus
3k + 2m = 0
k − 3n = 0
m + 4n = 1
which yields k = −6, m = 9, and n = −2. Hence w = (−6, 9, −2).
(iii) Since f⁻¹ exists, then:
f(u) = (1, 0, 0) ⇒ f⁻¹(1, 0, 0) = u = (3, −4, 1)
f(v) = (0, 1, 0) ⇒ f⁻¹(0, 1, 0) = v = (−8, 12, −3)
f(w) = (0, 0, 1) ⇒ f⁻¹(0, 0, 1) = w = (−6, 9, −2)
Now, (x, y, z) = x(1, 0, 0) + y(0, 1, 0) + z(0, 0, 1), and from a previous theorem,
f⁻¹(x, y, z) = f⁻¹{x(1, 0, 0) + y(0, 1, 0) + z(0, 0, 1)}
= x f⁻¹(1, 0, 0) + y f⁻¹(0, 1, 0) + z f⁻¹(0, 0, 1)
= x(3, −4, 1) + y(−8, 12, −3) + z(−6, 9, −2)
= (3x − 8y − 6z, −4x + 12y + 9z, x − 3y − 2z)
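Since f is multiplication by A, the formula for f⁻¹ is exactly multiplication by A⁻¹, so it can be verified with `numpy.linalg.inv` (a sketch):

```python
import numpy as np

A = np.array([[3, 2,  0],
              [1, 0, -3],
              [0, 1,  4]], dtype=float)
A_inv = np.linalg.inv(A)
print(A_inv)
# [[ 3. -8. -6.]
#  [-4. 12.  9.]
#  [ 1. -3. -2.]]
# Rows match f^{-1}(x, y, z) = (3x - 8y - 6z, -4x + 12y + 9z, x - 3y - 2z).
```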

Exercise:
 
Let
A = [  1 −1  2 ]
    [  3  0  1 ]
    [  1  2 −3 ]
    [ −2 −1  1 ]
be the matrix representation of a linear mapping f : R3 −→ R4.
(i ) Determine the rank and nullity of f .
(ii ) Show that f −1 does not exist.

EIGENVALUES AND EIGENVECTORS
Definition: Let A be an n × n matrix. The scalar λ is called an eigenvalue of A if there exists
a non-zero vector v such that Av = λv. In this case, the vector v is called an eigenvector of
A corresponding to the eigenvalue λ.

Example:
Let
A = [ 3  0 ]
    [ 8 −1 ]
Then
A(0, 1)ᵀ = (0, −1)ᵀ = −1 · (0, 1)ᵀ
Thus λ1 = −1 is an eigenvalue of A with a corresponding eigenvector v1 = (0, 1)ᵀ.
Similarly,
A(1, 2)ᵀ = (3, 6)ᵀ = 3 · (1, 2)ᵀ
so that λ2 = 3 is an eigenvalue of A with a corresponding eigenvector v2 = (1, 2)ᵀ.
In this case, λ1 = −1 and λ2 = 3 are the only eigenvalues of the 2 × 2 matrix A.

Computing Eigenvalues and Eigenvectors


 
Suppose that λ is an eigenvalue of A. Then there exists a non-zero vector
v = (x1, x2, · · · , xn)ᵀ such that Av = λv = λIv. Rewriting this, we have
(λI − A)v = 0 . . . . . . . . . . (1)
If A is an n × n matrix, then Equation (1) corresponds to a homogeneous system of n
equations in the unknowns x1 , x2 , · · · , xn . This system has non-trivial solutions if and only if
det (λI − A) = 0. This in turn means λ is an eigenvalue of A if and only if det (λI − A) = 0.
When expanded det (λI − A) is a polynomial of degree n in λ, called the characteristic
polynomial of A. The equation det (λI − A) = 0 is called the characteristic equation of A,
and the scalars λ satisfying this equation are the eigenvalues of A. Counting multiplicities,
an n × n matrix has exactly n eigenvalues.
Theorem: Let A be an n × n matrix. Then the characteristic polynomial of A has degree
n. Moreover, the coefficient of λn in the characteristic polynomial is 1, i.e. the characteristic
polynomial is monic.

Examples:
1. Find the eigenvalues of the matrix
A = [ 3  0 ]
    [ 8 −1 ]
Solution:
In this case,
λI − A = [ λ − 3    0   ]
         [  −8    λ + 1 ]
Hence the characteristic polynomial of A is
det(λI − A) = (λ − 3)(λ + 1) = λ² − 2λ − 3
and the characteristic equation, as a product of irreducible factors, is
(λ − 3)(λ + 1) = 0
The solutions of this equation are λ = −1 and λ = 3. These are the eigenvalues of A.
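The characteristic-polynomial computation agrees with a direct numerical eigenvalue solve (a sketch):

```python
import numpy as np

A = np.array([[3, 0],
              [8, -1]], dtype=float)
print(np.sort(np.linalg.eigvals(A)))   # -> [-1.  3.]
```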
 
2. Find the eigenvalues of
A = [ 1 0 1 ]
    [ 0 1 1 ]
    [ 1 1 0 ]
Solution:
In this case,
det(λI − A) = det [ λ − 1    0     −1 ]
                  [   0    λ − 1   −1 ]
                  [  −1     −1      λ ]
= (λ − 1)(λ² − λ − 1) − 1(λ − 1)
= (λ − 1)(λ² − λ − 2)
= (λ − 1)(λ − 2)(λ + 1)
is the characteristic polynomial of A, in factored form. Accordingly, the characteristic
equation of A is (λ − 1)(λ − 2)(λ + 1) = 0. Hence λ = −1, λ = 1, and λ = 2 are the
eigenvalues of A.
Exercise:
Compute the eigenvalues of
A = [  5 0 1 ]
    [  1 1 0 ]
    [ −7 1 0 ]
Ans: λ = 2

Finding Bases for Eigenspaces


Theorem: Let λ be an eigenvalue of an n × n matrix A and let Eλ = {v | Av = λv}. Then
Eλ is a subspace of Rn (or Cn).
Proof: If Av = λv, then (λI − A)v = 0. Thus Eλ is the null space of the matrix λI − A.
Therefore, Eλ is a subspace of Rn (or Cn).
Note: 0 ∈ Eλ since Eλ is a subspace. However, 0 is not an eigenvector.
Definition: Let λ be an eigenvalue of A. The subspace Eλ is called the eigenspace of A
corresponding to the eigenvalue λ.

Examples:
1. Find the eigenspaces of
A = [ 3  0 ]
    [ 8 −1 ]
Solution:
From Example 1 above, λ = −1 and λ = 3 are the eigenvalues of A.
For λ = −1,
(λI − A)v = (−I − A)v = [ −4 0 ] (x1, x2)ᵀ = (0, 0)ᵀ
                        [ −8 0 ]
⇒ −4x1 = 0 and −8x1 = 0 ⇒ x1 = 0, and x2 is a free variable. So v = (0, x2)ᵀ = x2(0, 1)ᵀ.
Hence E−1 = span{(0, 1)ᵀ}.
For λ = 3,
(λI − A)v = (3I − A)v = [  0 0 ] (x1, x2)ᵀ = (0, 0)ᵀ
                        [ −8 4 ]
⇒ −8x1 + 4x2 = 0 ⇒ x2 = 2x1. So v = (x1, 2x1)ᵀ = x1(1, 2)ᵀ, and therefore
E3 = span{(1, 2)ᵀ}.
 
2. Find a basis or bases for the eigenspace(s) of
A = [ −12 7 ]
    [  −7 2 ]
Solution:
In this case,
det(λI − A) = det [ λ + 12  −7   ]
                  [   7    λ − 2 ]
= λ² + 10λ + 25
= (λ + 5)²
Now (λ + 5)² = 0 implies that λ = −5 is an eigenvalue of A (of multiplicity 2). So A
has only one eigenspace.
For λ = −5,
(λI − A)v = (−5I − A)v = [ 7 −7 ] (x1, x2)ᵀ = (0, 0)ᵀ
                         [ 7 −7 ]
⇒ 7x1 − 7x2 = 0 ⇒ x1 = x2, so that v = (x2, x2)ᵀ = x2(1, 1)ᵀ. Hence the eigenvectors
of A are the non-zero scalar multiples of (1, 1)ᵀ. Thus {(1, 1)ᵀ} is a basis for E−5.

 
3. Find the eigenspaces of
A = [ 0 0 −2 ]
    [ 1 2  1 ]
    [ 1 0  3 ]
Solution:
In this case,
det(λI − A) = det [  λ     0      2   ]
                  [ −1   λ − 2   −1   ]
                  [ −1     0    λ − 3 ]
= (λ − 2)(λ² − 3λ + 2)
= (λ − 1)(λ − 2)²
So (λ − 1)(λ − 2)² = 0 is the characteristic equation of A and the distinct eigenvalues
of A are λ = 1 and λ = 2. Hence there are only two eigenspaces of A.
For λ = 1,
(λI − A)v = (I − A)v = [  1  0   2 ] (x1, x2, x3)ᵀ = (0, 0, 0)ᵀ
                       [ −1 −1  −1 ]
                       [ −1  0  −2 ]
Reducing the matrix I − A of coefficients of the system:
[  1  0  2 ]  R2 → R1 + R2   [ 1  0 2 ]
[ −1 −1 −1 ]  R3 → R1 + R3   [ 0 −1 1 ]
[ −1  0 −2 ]      −→         [ 0  0 0 ]
⇒ x1 + 2x3 = 0 and −x2 + x3 = 0 ⇒ x1 = −2x3 and x2 = x3. So
v = (−2x3, x3, x3)ᵀ = x3(−2, 1, 1)ᵀ,
i.e. the eigenvectors corresponding to λ = 1 are the non-zero scalar multiples of
(−2, 1, 1)ᵀ. Hence {(−2, 1, 1)ᵀ} is a basis for E1, i.e. E1 = span{(−2, 1, 1)ᵀ}.
For λ = 2,
(λI − A)v = (2I − A)v = [  2 0  2 ] (x1, x2, x3)ᵀ = (0, 0, 0)ᵀ
                        [ −1 0 −1 ]
                        [ −1 0 −1 ]
Reducing the matrix 2I − A of coefficients of the system to row echelon form:
[  2 0  2 ]  R2 → R1 + 2R2   [ 2 0 2 ]
[ −1 0 −1 ]  R3 → R2 − R3    [ 0 0 0 ]
[ −1 0 −1 ]      −→          [ 0 0 0 ]
So x2 and x3 are free variables. Now, 2x1 + 2x3 = 0 ⇒ x1 = −x3. Thus
v = (−x3, x2, x3)ᵀ = x2(0, 1, 0)ᵀ + x3(−1, 0, 1)ᵀ
i.e. the eigenvectors of A corresponding to λ = 2 are the non-zero linear combinations
of (0, 1, 0)ᵀ and (−1, 0, 1)ᵀ, which are linearly independent. Hence {(0, 1, 0)ᵀ, (−1, 0, 1)ᵀ}
is a basis for E2, i.e. E2 = span{(0, 1, 0)ᵀ, (−1, 0, 1)ᵀ}.
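The eigenspace bases above can be confirmed by checking Av = λv directly for the hand-computed basis vectors (a sketch; `numpy.linalg.eig` would return unit-length versions of the same eigenvectors):

```python
import numpy as np

A = np.array([[0, 0, -2],
              [1, 2,  1],
              [1, 0,  3]], dtype=float)

# Hand-computed eigenspace bases:
E1 = [np.array([-2.0, 1.0, 1.0])]                             # lambda = 1
E2 = [np.array([0.0, 1.0, 0.0]), np.array([-1.0, 0.0, 1.0])]  # lambda = 2

checks = all(np.allclose(A @ v, 1 * v) for v in E1) and \
         all(np.allclose(A @ v, 2 * v) for v in E2)
print(checks)   # -> True
```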
 

Theorem: Eigenvectors corresponding to distinct eigenvalues are linearly independent.


Exercise:
Find the eigenspace(s) of each of the following matrices:
(a) [ 4 2 ]   Ans: E1 = span{(2, −3)ᵀ}, E6 = span{(1, 1)ᵀ}
    [ 3 3 ]
(b) [  2 −1 ]   Ans: E0 = span{(1, 2)ᵀ}, E4 = span{(1, −2)ᵀ}
    [ −4  2 ]
(c) [ −5 −5 −9 ]   Ans: E−1 = span{(3, −6, 2)ᵀ}
    [  8  9 18 ]
    [ −2 −3 −7 ]
(d) [ −1 −3 −9 ]   Ans: E−1 = span{(0, −3, 1)ᵀ, (1, −3, 1)ᵀ}
    [  0  5 18 ]
    [  0 −2 −7 ]
(e) [ 3 2 4 ]   Ans: E−1 = span{(1, −2, 0)ᵀ, (0, −2, 1)ᵀ}, E8 = span{(2, 1, 2)ᵀ}
    [ 2 0 2 ]
    [ 4 2 3 ]
(f) [ 1 −1  4 ]   Ans: E−2 = span{(1, −1, −1)ᵀ}, E1 = span{(−1, 4, 1)ᵀ},
    [ 3  2 −1 ]        E3 = span{(1, 2, 1)ᵀ}
    [ 2  1 −1 ]

Eigenvalues of Triangular Matrices
Theorem: If A is a triangular matrix, then the eigenvalues of A are the entries on the main
diagonal of A.
 
Proof: If
A = [ a11 a12 a13 · · · a1n ]
    [  0  a22 a23 · · · a2n ]
    [  0   0  a33 · · · a3n ]
    [  ·   ·   ·  · · ·  ·  ]
    [  0   0   0  · · · ann ]
is an upper triangular matrix, then
λI − A = [ λ − a11   −a12     −a13    · · ·   −a1n   ]
         [    0    λ − a22    −a23    · · ·   −a2n   ]
         [    0       0     λ − a33   · · ·   −a3n   ]
         [    ·       ·        ·      · · ·    ·     ]
         [    0       0        0      · · ·  λ − ann ]
Since the determinant of a triangular matrix is equal to the product of its diagonal entries,
det(λI − A) = (λ − a11)(λ − a22) · · · (λ − ann), with zeros λ = a11, λ = a22, · · · , λ = ann.
The proof for a lower triangular matrix is similar.
 
Example: By inspection, the eigenvalues of the lower triangular matrix

    A = [ 1/2    0     0  ]
        [ -1    2/3    0  ]
        [  5    -8   -1/4 ]

are λ = 1/2, λ = 2/3, and λ = −1/4.
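This can be checked numerically for the example above. A small Python sketch (using exact fractions; the helper names are ours) evaluates det(λI − A) at each diagonal entry and gets 0:

```python
from fractions import Fraction as F

# Lower triangular matrix from the example; its eigenvalues should be the
# diagonal entries 1/2, 2/3 and -1/4.
A = [[F(1, 2), F(0),    F(0)],
     [F(-1),   F(2, 3), F(0)],
     [F(5),    F(-8),   F(-1, 4)]]

def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

def char_poly_at(lam):
    """Evaluate det(lam*I - A) exactly."""
    return det3([[lam * (i == j) - A[i][j] for j in range(3)] for i in range(3)])

for lam in [F(1, 2), F(2, 3), F(-1, 4)]:
    print(lam, char_poly_at(lam))  # each evaluates to 0
```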

Eigenvalues of Powers of a Matrix


Once the eigenvalues and eigenvectors of a matrix A are found, it becomes easy to determine
the eigenvalues and eigenvectors of any positive integer power of A. For instance, if λ is an
eigenvalue of A and v is a corresponding eigenvector, then Av = λv and it follows that

    A²v = A(Av) = A(λv) = λ(Av) = λ(λv) = λ²v.

This shows that λ² is an eigenvalue of A² and v is a corresponding eigenvector.
Theorem: Suppose λ is an eigenvalue of a matrix A and v is a corresponding eigenvector. Then for any positive integer k, λ^k is an eigenvalue of A^k and v is a corresponding eigenvector.
Example:
From a previous example, the matrix

    A = [ 0  0  -2 ]
        [ 1  2   1 ]
        [ 1  0   3 ]

has eigenvalues λ = 1 and λ = 2. By the above theorem, both λ = 1^7 = 1 and λ = 2^7 = 128 are eigenvalues of A^7. From the same example, (−2, 1, 1) is an eigenvector of A corresponding to the eigenvalue λ = 1.

By the above theorem, it is also an eigenvector of A^7 corresponding to the eigenvalue λ = 1^7 = 1. Similarly, the eigenvectors (−1, 0, 1) and (0, 1, 0) corresponding to the eigenvalue λ = 2 of A are also eigenvectors of A^7 corresponding to the eigenvalue λ = 2^7 = 128.
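The claim A^7 v = 128v for v = (−1, 0, 1) can be verified directly, e.g. with the following plain-Python sketch (not part of the original notes):

```python
# A has eigenvalue 2 with eigenvector (-1, 0, 1); A^7 should then have
# eigenvalue 2^7 = 128 with the same eigenvector.
A = [[0, 0, -2],
     [1, 2, 1],
     [1, 0, 3]]

def mat_vec(M, v):
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

v = [-1, 0, 1]
w = v
for _ in range(7):        # apply A seven times, i.e. compute A^7 v
    w = mat_vec(A, w)
print(w)                  # [-128, 0, 128] = 128 * v
```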

Similar Matrices and Diagonalization


Definition: Two n × n matrices A and B are said to be similar if there exists an invertible
n × n matrix P such that B = P −1 AP . Equivalently, A and B are similar if and only if there
exists an invertible n × n matrix P such that P B = AP .
Examples:

1. Let

       A = [ 2   1 ]     B = [ 4  -2 ]     P = [  2  -1 ]
           [ 0  -1 ],        [ 5  -3 ],        [ -1   1 ].

   Then

       PB = [  2  -1 ] [ 4  -2 ] = [ 3  -1 ] = [ 2   1 ] [  2  -1 ] = AP.
            [ -1   1 ] [ 5  -3 ]   [ 1  -1 ]   [ 0  -1 ] [ -1   1 ]

   Since det P = 1 ≠ 0, P is invertible. Therefore, by definition, A and B are similar.
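The check PB = AP above can be reproduced in a few lines of Python (a sketch, with an ad-hoc matrix-multiplication helper):

```python
# Similarity check for Example 1: P*B should equal A*P.
A = [[2, 1], [0, -1]]
B = [[4, -2], [5, -3]]
P = [[2, -1], [-1, 1]]

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

print(mat_mul(P, B))  # [[3, -1], [1, -1]]
print(mat_mul(A, P))  # [[3, -1], [1, -1]]
```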
     
2. Let

       A = [ -6  -3  -25 ]     B = [ 1   0  0 ]     P = [ 2  4   3 ]
           [  2   1    8 ],        [ 0  -1  0 ],        [ 0  1  -1 ].
           [  2   2    7 ]         [ 0   0  2 ]         [ 3  5   7 ]

   Then det P = 3 ≠ 0, so P is invertible. It can be verified that

       PA = BP = [ 2   4   3 ]
                 [ 0  -1   1 ]
                 [ 6  10  14 ]

   and A = P^{-1}BP. Hence A and B are similar.
Theorem: Similar matrices have the same determinant.
Proof: Suppose A and B are similar. Then there exists an invertible matrix P such that A = P^{-1}BP. Accordingly,

    det A = det(P^{-1}BP)
          = det(P^{-1}) det B det P
          = (1/det P) det B det P
          = det B.

Examples:
1. In Example 1 above, matrices A and B are similar. In this case, det A = det B = −2.
2. In Example 2 above, it was shown that A and B are similar. Calculations show that
det A = det B = −2.
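Both examples can be confirmed with a short Python sketch (the determinant helpers are written out by hand and are ours, not part of the notes):

```python
# Both pairs of similar matrices above should share the same determinant.
def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

A1, B1 = [[2, 1], [0, -1]], [[4, -2], [5, -3]]
A2 = [[-6, -3, -25], [2, 1, 8], [2, 2, 7]]
B2 = [[1, 0, 0], [0, -1, 0], [0, 0, 2]]

print(det2(A1), det2(B1))  # -2 -2
print(det3(A2), det3(B2))  # -2 -2
```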
Theorem: Similar matrices have the same characteristic polynomial and hence the same eigenvalues.

Proof: Suppose A and B are similar. Then there exists an invertible matrix P such that A = P^{-1}BP. Accordingly,

    λI − A = λI − P^{-1}BP
           = λI(P^{-1}P) − P^{-1}BP
           = P^{-1}(λI)P − P^{-1}BP
           = P^{-1}(λI − B)P

and

    det(λI − A) = det(P^{-1}(λI − B)P)
                = det(P^{-1}) det(λI − B) det P
                = (1/det P) det(λI − B) det P
                = det(λI − B).

So A and B have the same characteristic polynomial and hence the same characteristic equation. Since eigenvalues are the roots of the characteristic equation, A and B have the same eigenvalues.
Examples:

1. A previous example showed that the matrices

       A = [ 2   1 ]    and    B = [ 4  -2 ]
           [ 0  -1 ]               [ 5  -3 ]

   are similar. Clearly, the eigenvalues of A are −1 and 2 (since A is a triangular matrix). These are the eigenvalues of B also.
 
2. In a previous example, it was shown that the matrices

       A = [ -6  -3  -25 ]    and    B = [ 1   0  0 ]
           [  2   1    8 ]               [ 0  -1  0 ]
           [  2   2    7 ]               [ 0   0  2 ]

   are similar. Clearly, the eigenvalues of B are 1, −1 and 2 (since B is a diagonal matrix). These are the eigenvalues of A also.
Exercise:
Determine whether the matrices

    A = [ 1  2  3 ]    and    B = [ 15  0  0 ]
        [ 4  5  6 ]               [  0  0  0 ]
        [ 7  8  9 ]               [  0  0  0 ]

are similar or not.
Definition: An n × n matrix A is diagonalizable if there exists an invertible n × n matrix
P such that P −1 AP is a diagonal matrix. In other words, A is diagonalizable if there is a
diagonal matrix D such that A is similar to D. The matrix P is said to diagonalize A in this
case.
Remark: If A is diagonalizable, then A is similar to a diagonal matrix whose diagonal
components are the eigenvalues of A.
Theorem: An n × n matrix A is diagonalizable if and only if it has n linearly independent
eigenvectors.

Corollary: If an n × n matrix A has n distinct eigenvalues, then it is diagonalizable.

Procedure for Diagonalizing an n × n Matrix


Step 1: Find n linearly independent eigenvectors of A, say p1, p2, ..., pn.
Step 2: Form the matrix P having p1, p2, ..., pn as its column vectors.
Step 3: The matrix P^{-1}AP will then be diagonal with λ1, λ2, ..., λn as its successive diagonal entries, where λi is the eigenvalue corresponding to pi (i = 1, 2, ..., n).
Examples:

1. The characteristic equation of

       A = [ -3  2 ]
           [ -2  1 ]

   is

       det(λI − A) = det [ λ+3   -2  ] = (λ + 3)(λ − 1) + 4 = λ² + 2λ + 1 = (λ + 1)² = 0.
                         [  2   λ-1 ]

   Hence λ = −1 is the only eigenvalue of A. The eigenvectors corresponding to λ = −1 are the non-trivial solutions of (−I − A)v = 0, i.e.

       [ 2  -2 ] [ x1 ] = [ 0 ]
       [ 2  -2 ] [ x2 ]   [ 0 ]

   ⇒ 2x1 − 2x2 = 0 ⇒ x1 = x2, so that v = (x2, x2) = x2(1, 1). Hence {(1, 1)} is a basis for E−1. Since this unique eigenspace is 1-dimensional, A does not have two linearly independent eigenvectors and is therefore not diagonalizable.
 
2. Calculations show that the matrix

       A = [ 4  2 ]
           [ 3  3 ]

   has eigenvalues λ = 1 and λ = 6 with corresponding eigenvectors (2, −3) and (1, 1) respectively. Clearly, the two eigenvectors are linearly independent. Now, the matrix

       P = [  2  1 ]
           [ -3  1 ]

   diagonalizes A. In this case,

       P^{-1}AP = (1/5) [ 1  -1 ] [ 4  2 ] [  2  1 ]
                        [ 3   2 ] [ 3  3 ] [ -3  1 ]

                = (1/5) [ 1  -1 ] [  2  6 ]
                        [ 3   2 ] [ -3  6 ]

                = (1/5) [ 5   0 ]
                        [ 0  30 ]

                = [ 1  0 ]
                  [ 0  6 ]

which is a diagonal matrix whose diagonal components are the eigenvalues of A.
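The computation P^{-1}AP = diag(1, 6) can be replayed exactly with Python's fractions module (a sketch; `inv2` is our ad-hoc 2×2 inverse, not from the notes):

```python
from fractions import Fraction as F

# Example 2: P = [[2, 1], [-3, 1]] should diagonalize A = [[4, 2], [3, 3]].
A = [[F(4), F(2)], [F(3), F(3)]]
P = [[F(2), F(1)], [F(-3), F(1)]]

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(M):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    d = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / d, -M[0][1] / d], [-M[1][0] / d, M[0][0] / d]]

D = mat_mul(mat_mul(inv2(P), A), P)
# D equals [[1, 0], [0, 6]] -- the eigenvalues on the diagonal.
```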
 
3. Let

       A = [ 0  0  -2 ]
           [ 1  2   1 ].
           [ 1  0   3 ]

   Determine an invertible matrix P such that D = P^{-1}AP is a diagonal matrix.

   Solution: From a previous example, the eigenvalues of A are λ = 1 and λ = 2, where p1 = (−2, 1, 1) is an eigenvector of A corresponding to λ = 1, while p2 = (−1, 0, 1) and p3 = (0, 1, 0) are eigenvectors of A corresponding to λ = 2. It can be verified that p1, p2 and p3 are linearly independent, so that

       P = [ -2  -1  0 ]
           [  1   0  1 ]
           [  1   1  0 ]

   diagonalizes A. In this case,

       P^{-1} = [ -1  0  -1 ]
                [  1  0   2 ]
                [  1  1   1 ]

   and

       P^{-1}AP = [ -1  0  -1 ] [ 0  0  -2 ] [ -2  -1  0 ]
                  [  1  0   2 ] [ 1  2   1 ] [  1   0  1 ]
                  [  1  1   1 ] [ 1  0   3 ] [  1   1  0 ]

                = [ -1  0  -1 ] [ -2  -1  0 ]
                  [  2  0   4 ] [  1   0  1 ]
                  [  2  2   2 ] [  1   1  0 ]

                = [ 1  0  0 ]
                  [ 0  2  0 ].
                  [ 0  0  2 ]

Note: There is no preferred order for the columns of the diagonalizing matrix P. Since the i-th diagonal entry of P^{-1}AP is the eigenvalue corresponding to the i-th column vector of P, changing the order of the columns of P just changes the order of the eigenvalues on the diagonal of P^{-1}AP. Thus if we write

    P = [ -1  0  -2 ]
        [  0  1   1 ]
        [  1  0   1 ]

in Example 3 above, we would have obtained

    P^{-1}AP = [ 2  0  0 ]
               [ 0  2  0 ].
               [ 0  0  1 ]
 
4. From a previous example, the matrix

       A = [ 1  0  1 ]
           [ 0  1  1 ]
           [ 1  1  0 ]

   has three distinct eigenvalues, namely λ = −1, λ = 1, and λ = 2. Thus by the corollary above, A is diagonalizable. Now, calculations show that (1, 1, −2), (−1, 1, 0) and (1, 1, 1) are eigenvectors of A corresponding to λ = −1, λ = 1, and λ = 2 respectively. So

       P = [  1  -1  1 ]
           [  1   1  1 ]
           [ -2   0  1 ]

   is a diagonalizing matrix and

       P^{-1}AP = (1/6) [  1  1  -2 ] [ 1  0  1 ] [  1  -1  1 ]
                        [ -3  3   0 ] [ 0  1  1 ] [  1   1  1 ]
                        [  2  2   2 ] [ 1  1  0 ] [ -2   0  1 ]

                = [ -1  0  0 ]
                  [  0  1  0 ].
                  [  0  0  2 ]

Computing Powers of a Matrix


Let A be an n × n matrix and let P be an invertible n × n matrix. Then

    (P^{-1}AP)² = P^{-1}AP P^{-1}AP = P^{-1}AIAP = P^{-1}A²P.

In general, for any positive integer k,

    (P^{-1}AP)^k = P^{-1}A^k P .......... (∗)

From Equation (∗), if A is diagonalizable and P^{-1}AP = D is a diagonal matrix, then P^{-1}A^k P = (P^{-1}AP)^k = D^k. Solving this equation for A^k:

    P P^{-1}A^k P P^{-1} = P D^k P^{-1}
    ⇒ A^k = P D^k P^{-1} .......... (∗∗)
Example:
In Example 4 above, it was shown that the matrix A is diagonalized by

    P = [  1  -1  1 ]
        [  1   1  1 ]
        [ -2   0  1 ]

and that

    D = P^{-1}AP = [ -1  0  0 ]
                   [  0  1  0 ].
                   [  0  0  2 ]

Thus from Equation (∗∗) above,

    A^5 = P D^5 P^{-1}

        = [  1  -1  1 ] [ -1  0   0 ] [  1/6   1/6  -1/3 ]
          [  1   1  1 ] [  0  1   0 ] [ -1/2   1/2    0  ]
          [ -2   0  1 ] [  0  0  32 ] [  1/3   1/3   1/3 ]

        = [ -1  -1  32 ] [  1/6   1/6  -1/3 ]
          [ -1   1  32 ] [ -1/2   1/2    0  ]
          [  2   0  32 ] [  1/3   1/3   1/3 ]

        = [ 11  10  11 ]
          [ 10  11  11 ].
          [ 11  11  10 ]
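The two ways of computing A^5 can be compared directly in Python (a sketch using exact fractions for P^{-1}; helper names are ours):

```python
from fractions import Fraction as F

# Compare A^5 computed via P D^5 P^{-1} with A^5 computed by repeated
# multiplication.
A = [[1, 0, 1], [0, 1, 1], [1, 1, 0]]
P = [[1, -1, 1], [1, 1, 1], [-2, 0, 1]]
P_inv = [[F(1, 6), F(1, 6), F(-1, 3)],
         [F(-1, 2), F(1, 2), F(0)],
         [F(1, 3), F(1, 3), F(1, 3)]]
D5 = [[(-1) ** 5, 0, 0], [0, 1 ** 5, 0], [0, 0, 2 ** 5]]

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

via_diag = mat_mul(mat_mul(P, D5), P_inv)

direct = A
for _ in range(4):          # A * A * A * A * A
    direct = mat_mul(direct, A)

print(via_diag == direct)   # True; both equal [[11,10,11],[10,11,11],[11,11,10]]
```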

Exercise:

1. Consider the matrix

       A = [ 0  1  1 ]
           [ 1  1  0 ].
           [ 1  0  1 ]

   (a) Find an invertible matrix P and a diagonal matrix D such that P^{-1}AP = D.

   (b) Use the diagonal matrix D in (a) above to compute A^7.

       Ans: A^7 = [ 42  43  43 ]
                  [ 43  43  42 ]
                  [ 43  42  43 ]

2. Let

       B = [ -4   0  -1 ]
           [  2  -1   0 ].
           [  2   0  -1 ]

   (a) Find the eigenvalues and the corresponding eigenspaces of B.

       Ans: λ = −1, E−1 = span{(0, 1, 0)}; λ = −2, E−2 = span{(1, −2, −2)}; λ = −3, E−3 = span{(−1, 1, 1)}

   (b) Find the eigenvalues and bases for the corresponding eigenspaces of B^8.

   (c) Show that B is diagonalizable and hence find an invertible matrix P such that P^{-1}BP = D is a diagonal matrix.

   (d) Use the diagonalization in part (c) above to evaluate B^10.

       Ans: B^10 = [  117074  0   58025 ]
                   [ -116050  1  -57002 ]
                   [ -116050  0  -57001 ]

Polynomials of Matrices
Consider a polynomial f (t) over a field F given by

f(t) = a_n t^n + a_{n−1} t^{n−1} + a_{n−2} t^{n−2} + ··· + a_2 t² + a_1 t + a_0.

If A is an n × n matrix over F , we define the polynomial of A as

f(A) = a_n A^n + a_{n−1} A^{n−1} + a_{n−2} A^{n−2} + ··· + a_2 A² + a_1 A + a_0 I

where I is the n × n identity matrix.


We say that A is a root/zero of a polynomial f (t) if f (A) = 0 (the zero n × n matrix).

Example:
Let f(x) = x² + 3x + 2 and

    A = [ -4  -3 ]
        [  2   1 ].

Then

    f(A) = A² + 3A + 2I

         = [ -4  -3 ] [ -4  -3 ] + 3 [ -4  -3 ] + 2 [ 1  0 ]
           [  2   1 ] [  2   1 ]     [  2   1 ]     [ 0  1 ]

         = [ 10   9 ] + [ -12  -9 ] + [ 2  0 ]
           [ -6  -5 ]   [   6   3 ]   [ 0  2 ]

         = [ 0  0 ]
           [ 0  0 ].

Since f(A) = 0, A is a zero of f(x).
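That f(A) = 0 is easy to confirm programmatically, e.g. with this small sketch (helper name ours):

```python
# Verify that A = [[-4, -3], [2, 1]] is a zero of f(x) = x^2 + 3x + 2.
A = [[-4, -3], [2, 1]]

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A2 = mat_mul(A, A)
I = [[1, 0], [0, 1]]
fA = [[A2[i][j] + 3 * A[i][j] + 2 * I[i][j] for j in range(2)] for i in range(2)]
print(fA)  # [[0, 0], [0, 0]]
```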


Note: If f(x) and g(x) are polynomials and g(x) is a multiple of f(x), i.e. g(x) = h(x)f(x) for some polynomial h(x), then g(A) = h(A)f(A).
Theorem (Cayley-Hamilton Theorem): Every square matrix is a zero of its characteristic polynomial.
Example:
Let

    A = [ -4  -3 ]
        [  2   1 ].

Then f(λ) = λ² + 3λ + 2 is the characteristic polynomial of A. From the preceding example, f(A) = 0.

Applications of the Cayley-Hamilton Theorem


Let A be an invertible n × n matrix. Then the Cayley-Hamilton theorem can be used to find
the inverse of A as follows:
Let f(λ) = λ^n + a_{n−1}λ^{n−1} + ··· + a_1λ + a_0 be the characteristic polynomial of A. Then by the Cayley-Hamilton Theorem,

    f(A) = A^n + a_{n−1}A^{n−1} + ··· + a_1 A + a_0 I = 0
    ⇒ A(A^{n−1} + a_{n−1}A^{n−2} + ··· + a_1 I + a_0 A^{-1}) = 0
    ⇒ A^{n−1} + a_{n−1}A^{n−2} + ··· + a_1 I + a_0 A^{-1} = 0
    ⇒ −a_0 A^{-1} = A^{n−1} + a_{n−1}A^{n−2} + ··· + a_1 I
    ⇒ A^{-1} = (−1/a_0)(A^{n−1} + a_{n−1}A^{n−2} + ··· + a_1 I)
a0
 
Example: Use the Cayley-Hamilton theorem to find the inverse of

    A = [ 0  1  1 ]
        [ 1  1  0 ].
        [ 1  0  1 ]

Solution:
In this case, det A = −2 ≠ 0. Hence A^{-1} exists. Now,

    det(λI − A) = det [  λ   -1    -1  ]
                      [ -1   λ-1    0  ]
                      [ -1    0    λ-1 ]
                = λ(λ² − 2λ + 1) + 1[−(λ − 1)] − 1(λ − 1)
                = λ³ − 2λ² − λ + 2.

So f(λ) = λ³ − 2λ² − λ + 2 is the characteristic polynomial of A. By the Cayley-Hamilton theorem,

    f(A) = A³ − 2A² − A + 2I = 0
    ⇒ A(A² − 2A − I + 2A^{-1}) = 0
    ⇒ A² − 2A − I + 2A^{-1} = 0
    ⇒ A^{-1} = −(1/2)(A² − 2A − I).

Here

    A² = [ 2  1  1 ]        2A = [ 0  2  2 ]
         [ 1  2  1 ],            [ 2  2  0 ],
         [ 1  1  2 ]             [ 2  0  2 ]

so

    A^{-1} = −(1/2) ( [ 2  1  1 ]   [ 0  2  2 ]   [ 1  0  0 ] )
                    ( [ 1  2  1 ] − [ 2  2  0 ] − [ 0  1  0 ] )
                    ( [ 1  1  2 ]   [ 2  0  2 ]   [ 0  0  1 ] )

           = −(1/2) [  1  -1  -1 ]
                    [ -1  -1   1 ]
                    [ -1   1  -1 ]

           = [ -1/2   1/2   1/2 ]
             [  1/2   1/2  -1/2 ].
             [  1/2  -1/2   1/2 ]

As a check,

    AA^{-1} = [ 0  1  1 ] [ -1/2   1/2   1/2 ]   [ 1  0  0 ]
              [ 1  1  0 ] [  1/2   1/2  -1/2 ] = [ 0  1  0 ].
              [ 1  0  1 ] [  1/2  -1/2   1/2 ]   [ 0  0  1 ]
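The inverse obtained from the Cayley-Hamilton theorem can be double-checked in Python (a sketch with exact fractions; the helper is ours):

```python
from fractions import Fraction as F

# Recompute A^{-1} = -(1/2)(A^2 - 2A - I) and confirm A * A^{-1} = I.
A = [[0, 1, 1], [1, 1, 0], [1, 0, 1]]

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

I = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
A2 = mat_mul(A, A)
A_inv = [[F(-1, 2) * (A2[i][j] - 2 * A[i][j] - I[i][j]) for j in range(3)]
         for i in range(3)]

print(mat_mul(A, A_inv) == I)  # True
```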
Exercise:
Show that each of the following matrices is invertible and hence use the Cayley-Hamilton theorem to find its inverse.

(a) [ 1  2 ]
    [ 3  4 ]

(b) [ 2  0  2 ]
    [ 0  2  2 ]
    [ 2  2  0 ]

Definition: The minimal polynomial of a square matrix A is the monic polynomial of the
smallest degree having A as a root, i.e. h(λ) is the minimal polynomial of A if h(λ) is the
monic polynomial of the smallest degree such that h(A) = 0.
Theorem: The minimal polynomial and the characteristic polynomial of an n × n matrix
have the same irreducible factors.
Theorem: Let A be an n×n matrix. The minimal polynomial of A divides every polynomial
which has A as its zero. In particular, the minimal polynomial divides the characteristic
polynomial.
Example:

1. Find the minimal polynomial of

       A = [ 0  0  -2 ]
           [ 1  2   1 ].
           [ 1  0   3 ]

   Solution:
   From a previous example, f(λ) = (λ − 1)(λ − 2)² is the characteristic polynomial of A. So the irreducible factors of f(λ), and hence of the minimal polynomial of A, are λ − 1 and λ − 2. Therefore, the minimal polynomial of A is either h(λ) = (λ − 1)(λ − 2) or f(λ) = (λ − 1)(λ − 2)².

   Consider h(λ) = (λ − 1)(λ − 2). Then

       h(A) = (A − I)(A − 2I)

            = [ -1  0  -2 ] [ -2  0  -2 ]
              [  1  1   1 ] [  1  0   1 ]
              [  1  0   2 ] [  1  0   1 ]

            = [ 0  0  0 ]
              [ 0  0  0 ].
              [ 0  0  0 ]

   So h(λ) = λ² − 3λ + 2 is the minimal polynomial of A.
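The computation h(A) = 0 can be confirmed in a few lines (a plain-Python sketch; helper names are ours):

```python
# Confirm h(A) = (A - I)(A - 2I) = 0 for the matrix of Example 1.
A = [[0, 0, -2], [1, 2, 1], [1, 0, 3]]
I = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def mat_sub(X, Y, c=1):
    """X - c*Y, entrywise."""
    return [[X[i][j] - c * Y[i][j] for j in range(3)] for i in range(3)]

hA = mat_mul(mat_sub(A, I), mat_sub(A, I, 2))
print(hA)  # [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
```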


 
2. Calculate the minimal polynomial of

       A = [ 1  0  1 ]
           [ 0  1  1 ].
           [ 1  1  0 ]

   Solution:
   From a previous example, f(λ) = (λ − 1)(λ + 1)(λ − 2) is the characteristic polynomial of A. So the irreducible factors of f(λ), and hence of the minimal polynomial, are λ − 1, λ + 1 and λ − 2, each with multiplicity 1. Since the minimal polynomial must contain each of these factors and f(A) = 0 (by the Cayley-Hamilton theorem), f(λ) = (λ − 1)(λ + 1)(λ − 2) is the minimal polynomial of A.
Exercise:

1. Find the minimal polynomial of

       A = [  5  0  1 ]
           [  1  1  0 ].
           [ -7  1  0 ]

   Ans: f(λ) = (λ − 2)³ = λ³ − 6λ² + 12λ − 8
 
2. Find the minimal polynomial of

       B = [ -2  3  0   1 ]
           [  0  0  2   4 ].
           [  0  0  1   3 ]
           [  0  0  0  -2 ]

