SMA 2216 Applied Linear Algebra II Notes-1
LINEAR MAPPINGS
Definition: Let U and V be vector spaces and let f be a function that associates a unique
vector v ∈ V with each vector u ∈ U. Then f is said to map U into V, denoted f : U −→ V.
In this case, v = f(u) is called the image of u under f. The vector space U is called the
domain of f, denoted dom f, while V is called the codomain (image space) of f, denoted
cod f.
Definition: Let f : U −→ V be a function from a vector space U into a vector space V.
Then f is said to be linear if:
(i) f(u1 + u2) = f(u1) + f(u2) for all u1, u2 ∈ U (i.e. f preserves addition, so im f is
closed under addition).
(ii) f(λu) = λf(u) for all u ∈ U and all scalars λ (i.e. f preserves scalar multiplication,
so im f is closed under scalar multiplication).
A linear mapping that maps a vector space U into itself is called a linear operator on U.
Examples:
1. Define a mapping f : R^2 −→ R^3 by f(x, y) = (x − y, x + y, 0). Determine whether f
is linear or not.
Solution: Let u = (x1, y1), v = (x2, y2) ∈ R^2 and λ ∈ R. Then:
(i)
f(u + v) = f(x1 + x2, y1 + y2)
= (x1 + x2 − y1 − y2, x1 + x2 + y1 + y2, 0)
= (x1 − y1, x1 + y1, 0) + (x2 − y2, x2 + y2, 0)
= f(x1, y1) + f(x2, y2)
= f(u) + f(v)
(ii)
f(λu) = f(λx1, λy1)
= (λx1 − λy1, λx1 + λy1, 0)
= λ(x1 − y1, x1 + y1, 0)
= λf(x1, y1)
= λf(u)
Therefore, f is linear.
2. Show that the function g : R^3 −→ R^2 given by g(x, y, z) = (x + y + z, x − y − z) is
linear.
Solution: Let u = (a, b, c), v = (x, y, z) ∈ R^3 and λ ∈ R. Then:
(i)
g(u + v) = g(a + x, b + y, c + z)
= (a + x + b + y + c + z, a + x − b − y − c − z)
= (a + b + c, a − b − c) + (x + y + z, x − y − z)
= g(a, b, c) + g(x, y, z)
= g(u) + g(v)
(ii)
g(λu) = g(λa, λb, λc)
= (λa + λb + λc, λa − λb − λc)
= λ(a + b + c, a − b − c)
= λg(a, b, c)
= λg(u)
Therefore, g is linear.
3. Consider the function h : R^3 −→ R^3 defined by h(x, y, z) = (2x + y + 1, −x, z). Let
u = (−1, 0, 0) and v = (1, 1, 1). Then
h(u) + h(v) = (−1, 1, 0) + (4, −1, 1) = (3, 0, 1)
while
h(u + v) = h(0, 1, 1) = (2, 0, 1)
Since h(u + v) ≠ h(u) + h(v), h is not linear.
Alternatively, let u = (−1, 0, 0) and λ = 2. Then
h(2u) = h(−2, 0, 0) = (−3, 2, 0)
while
2h(u) = 2(−1, 1, 0) = (−2, 2, 0)
Since 2h(u) ≠ h(2u), h is not linear.
4. The function f : R −→ R where f (x) = sin x is not linear.
Proof :
f(π/2 + π/2) = f(π) = sin π = 0
but
f(π/2) + f(π/2) = sin(π/2) + sin(π/2) = 1 + 1 = 2.
Now, f is not linear since f(π/2) + f(π/2) ≠ f(π/2 + π/2).
Note: A function f(x, y, z, w, ···) = (······) is linear if the vector (······) = f(x, y, z, w, ···)
is a linear combination of fixed vectors whose coefficients are the components x, y, z, w, ···.
Examples:
1. In Example 1 above,
f(x, y) = (x − y, x + y, 0)
= x(1, 1, 0) + y(−1, 1, 0).
2. In Example 2 above,
g(x, y, z) = (x + y + z, x − y − z)
= x(1, 1) + y(1, −1) + z(1, −1).
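The two linearity conditions can also be spot-checked numerically. The following is a minimal Python sketch (Python is chosen here only for illustration; the helper names add, scale and looks_linear are hypothetical) that tests both conditions on random integer vectors for the mappings f, g and h of Examples 1 to 3. A passing check is not a proof of linearity, but a single failing sample is enough to show that a mapping is not linear.

```python
import random

def f(x, y):
    return (x - y, x + y, 0)

def g(x, y, z):
    return (x + y + z, x - y - z)

def h(x, y, z):
    return (2 * x + y + 1, -x, z)

def add(u, v):
    """Componentwise sum of two vectors given as tuples."""
    return tuple(a + b for a, b in zip(u, v))

def scale(c, u):
    """Scalar multiple c * u of a vector given as a tuple."""
    return tuple(c * a for a in u)

def looks_linear(func, dim, trials=200, seed=0):
    """Return False as soon as additivity or homogeneity fails on a sample."""
    rng = random.Random(seed)
    for _ in range(trials):
        u = tuple(rng.randint(-9, 9) for _ in range(dim))
        v = tuple(rng.randint(-9, 9) for _ in range(dim))
        c = rng.randint(-9, 9)
        if func(*add(u, v)) != add(func(*u), func(*v)):
            return False
        if func(*scale(c, u)) != scale(c, func(*u)):
            return False
    return True

print(looks_linear(f, 2))  # True
print(looks_linear(g, 3))  # True
print(looks_linear(h, 3))  # False: the constant term +1 breaks additivity
```

Integer samples are used so that every comparison is exact.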
Further Examples:
1. Let U and V be vector spaces and define T : U −→ V by T(u) = 0 for every u ∈ U.
Then
T(u1 + u2) = 0 = 0 + 0 = T(u1) + T(u2)
and
T(αu) = 0 = α0 = αT(u).
Thus T is a linear transformation, called the zero transformation.
2. Let V be a vector space. Define I : V −→ V by I(v) = v for all v ∈ V. In this case, I is
a linear transformation, called the identity transformation/operator.
Non-Linear Transformations
Not every transformation that looks linear is actually linear.
Example:
Let T : R −→ R where T(x) = 2x + 3. Then the graph {(x, T(x)) : x ∈ R} is a straight
line in the xy-plane. However, T is not linear since T(x + y) = 2(x + y) + 3 = 2x + 2y + 3
and T(x) + T(y) = (2x + 3) + (2y + 3) = 2x + 2y + 6 so that T(x + y) ≠ T(x) + T(y).
Note: The only linear transformations from R to R are the functions of the form f(x) = mx,
m ∈ R.
Exercise:
Determine whether the following functions are linear or not:
1. f : R^2 −→ R^2, f(x, y) = (2x + y, x)
2. f : R^2 −→ R^2, f(x, y) = (x + 2y, 3x − 2y)
3. f : R^2 −→ R^2, f(x, y) = (−2x + 1, x)
4. f : R^2 −→ R^3, f(x, y) = (x + y, x − y, x)
5. f : R^3 −→ R^2, f(x, y, z) = (x + y + 2z, x − y)
6. T : R^3 −→ R^3, T(x1, x2, x3) = (x1 + 2x2 − x3, −x2, x1 + 7x3)
7. T : R^4 −→ R^3, T(x1, x2, x3, x4) = (x1 + x2, x2 + x3, x4 + x4)
Theorem: Let f : U −→ V be linear. Then for all vectors u, v, u1, u2, ···, un ∈ U and all
scalars α1, α2, ···, αn:
(i) f(0) = 0.
(ii) f(u − v) = f(u) − f(v).
(iii) f(α1 u1 + α2 u2 + ··· + αn un) = α1 f(u1) + α2 f(u2) + ··· + αn f(un).
Example:
Let S = {v1 = (1, 1, 1), v2 = (1, 1, 0), v3 = (1, 0, 0)} be a basis for R^3 and let T : R^3 −→ R^2
be the linear mapping such that T(v1) = (1, 0), T(v2) = (2, −1), and T(v3) = (4, 3).
Find a formula for T(x, y, z), then use the formula to find T(2, −3, 5).
Solution: Expressing (x, y, z) as a linear combination of v1, v2, v3,
(x, y, z) = λ1(1, 1, 1) + λ2(1, 1, 0) + λ3(1, 0, 0)
gives
λ1 + λ2 + λ3 = x
λ1 + λ2 = y
λ1 = z
so that λ1 = z, λ2 = y − z, and λ3 = x − y. By linearity,
T(x, y, z) = z T(v1) + (y − z) T(v2) + (x − y) T(v3)
= z(1, 0) + (y − z)(2, −1) + (x − y)(4, 3)
= (4x − 2y − z, 3x − 4y + z)
Hence T(2, −3, 5) = (8 + 6 − 5, 6 + 12 + 5) = (9, 23).
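The example above can be checked with a short Python sketch. It assumes only the data given in the example (the images of v1, v2, v3) and reads the coordinates λ1, λ2, λ3 off the triangular system by back substitution; the function name T and the variable names are chosen here for illustration.

```python
# Images of the basis vectors v1, v2, v3, as given in the example.
T_v1, T_v2, T_v3 = (1, 0), (2, -1), (4, 3)

def T(x, y, z):
    """Evaluate T by linearity from the coordinates of (x, y, z) relative to S."""
    l1 = z            # from the third equation
    l2 = y - l1       # from the second equation
    l3 = x - l1 - l2  # from the first equation
    return tuple(l1 * a + l2 * b + l3 * c for a, b, c in zip(T_v1, T_v2, T_v3))

# T must reproduce the prescribed images of the basis vectors.
print(T(1, 1, 1), T(1, 1, 0), T(1, 0, 0))  # (1, 0) (2, -1) (4, 3)
print(T(2, -3, 5))                         # (9, 23)
```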
Kernel and Range of a Linear Transformation
Definition: Let U and V be vector spaces and let f : U −→ V be a linear transformation.
Then
(i) the kernel (null space) of f, denoted ker f, is given by ker f = {u ∈ U | f(u) = 0}.
(ii) the range (image) of f, denoted range f or im f, is given by im f = {f(u) ∈ V | u ∈ U}.
Note: In the above definition:
(a) ker f ⊆ U while im f ⊆ V.
(b) (i) ker f ≠ ∅ as it contains at least the zero vector in U (by the preceding theorem,
f(0) = 0).
(ii) im f ≠ ∅ since it contains at least the zero vector in V (by the preceding theorem,
0 = f(0) ∈ im f).
Theorem: Let f : U −→ V be a linear mapping. Then:
(a) ker f is a subspace of U.
(b) imf is a subspace of V.
Proof :
(a) Let u1, u2 ∈ ker f and let λ be a scalar. Then f(u1) = f(u2) = 0. Now,
(i) f(u1 + u2) = f(u1) + f(u2) = 0 + 0 = 0, implying that u1 + u2 ∈ ker f.
(ii) f(λu1) = λf(u1) = λ0 = 0, implying that λu1 ∈ ker f.
(b) Let v1, v2 ∈ im f and let λ be a scalar. Then there exist u1, u2 ∈ U such that
v1 = f(u1) and v2 = f(u2). Now,
(i) v1 + v2 = f(u1) + f(u2) = f(u1 + u2) ∈ im f.
(ii) λv1 = λf(u1) = f(λu1) ∈ im f.
Definition: Let f : U −→ V be a linear mapping. Then:
(i ) the nullity of f , denoted N (f ), is the dimension of ker f , i.e. N (f ) = dim ker f .
(ii ) the rank of f , denoted R (f ), is the dimension of imf , i.e. R (f ) = dim imf .
Theorem (Rank-Nullity Theorem): Let f : U −→ V be a linear transformation where
dim U = n and dim V = m. Then n = R (f ) + N (f ).
Examples:
1. Let f : R^2 −→ R^2 be defined by f(x, y) = (x + y, x). Then
ker f = {(x, y) | (x + y, x) = (0, 0)}
so that
x + y = 0
x = 0
which gives x = y = 0. Hence
ker f = {(x, y) | x = y = 0}
= {(0, 0)}
Recall that for the trivial vector space {0}, there is no non-empty linearly independent
spanning set. Consequently, the empty set is considered to be a basis for {0}. Thus a
basis for ker f in this case is ∅, so that N(f) = 0.
Now,
imf = {f (x, y) | x, y ∈ R}
= {(x + y, x) | x, y ∈ R}
= {x (1, 1) + y (1, 0) | x, y ∈ R}
In this case, {(1, 1) , (1, 0)} is a linearly independent spanning set for imf and is hence
a basis for imf . Accordingly, R (f ) = 2.
2. Let f : R^3 −→ R^2 be defined by f(x, y, z) = (x − y, x + y + z). Determine:
(i) a basis for ker f, and N(f).
(ii) a basis for im f, and R(f).
Solution:
In this case
ker f = {(x, y, z) | (x − y, x + y + z) = (0, 0)}
so that
x − y = 0
x + y + z = 0
which gives y = x and z = −2x. Hence
ker f = {(x, x, −2x) | x ∈ R}
= {x(1, 1, −2) | x ∈ R}
Therefore, {(1, 1, −2)} is a basis for ker f, and N(f) = 1.
Now,
im f = {f(x, y, z) | x, y, z ∈ R}
= {(x − y, x + y + z) | x, y, z ∈ R}
= {x(1, 1) + y(−1, 1) + z(0, 1) | x, y, z ∈ R}
So the vectors (1, 1), (−1, 1), and (0, 1) span im f. Clearly, the three vectors are linearly
dependent. However, any two are linearly independent and will therefore form a basis
for im f. For instance, {(1, 1), (−1, 1)} is a basis for im f. Accordingly, R(f) = 2.
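3. Define f : R^3 −→ R^3 by f(x, y, z) = (x + y − 3z, −x − 2y + 2z, x − 2y + z). Find the
nullity and rank of f.
Solution:
Method I:
In this case, ker f = {(x, y, z) | (x + y − 3z, −x − 2y + 2z, x − 2y + z) = (0, 0, 0)}, so that
x + y − 3z = 0
−x − 2y + 2z = 0
x − 2y + z = 0
Reducing the matrix of coefficients A of this system to a row echelon form E,
\[ A = \begin{pmatrix} 1 & 1 & -3 \\ -1 & -2 & 2 \\ 1 & -2 & 1 \end{pmatrix} \longrightarrow E = \begin{pmatrix} 1 & 1 & -3 \\ 0 & -1 & -1 \\ 0 & 0 & 7 \end{pmatrix} \]
gives the equivalent system
x + y − 3z = 0
−y − z = 0
7z = 0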
and using back substitution gives z = y = x = 0. Hence ker f = {(0, 0, 0)}, and
N (f ) = 0.
By the rank-nullity theorem, 3 = N (f ) + R (f ) = 0 + R (f ). Thus R (f ) = 3.
Method II :
imf = {f (x, y, z) | x, y, z ∈ R}
= {(x + y − 3z, −x − 2y + 2z, x − 2y + z) | x, y, z ∈ R}
= {x (1, −1, 1) + y (1, −2, −2) + z (−3, 2, 1) | x, y, z ∈ R}
The vectors (1, −1, 1), (1, −2, −2), and (−3, 2, 1) span imf . These are the columns
of the matrix A above and from the row echelon form E, the three vectors are linearly
independent (recall that the number of linearly independent columns of a matrix equals
the number of linearly independent rows). Thus they form a basis for imf . Therefore,
R (f ) = 3.
By the rank-nullity theorem, 3 = N (f ) + R (f ) = N (f ) + 3. Thus N (f ) = 0.
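The ranks found in the three examples above can be reproduced mechanically. The sketch below is one possible Python check, not part of the original notes: the function rank is a hypothetical helper that performs Gaussian elimination over the rationals on the standard matrix of each mapping. The nullity is then obtained from the rank-nullity theorem, so it is not an independent verification of the kernels.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0  # index of the next pivot row
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

# Standard matrices of the three mappings (row i holds the coefficients of component i).
examples = {
    "f(x, y) = (x + y, x)": [[1, 1], [1, 0]],
    "f(x, y, z) = (x - y, x + y + z)": [[1, -1, 0], [1, 1, 1]],
    "f(x, y, z) = (x + y - 3z, -x - 2y + 2z, x - 2y + z)":
        [[1, 1, -3], [-1, -2, 2], [1, -2, 1]],
}
for name, A in examples.items():
    n = len(A[0])  # dimension of the domain
    R = rank(A)
    print(name, "| rank:", R, "| nullity:", n - R)
```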
Exercise:
For each of the linear mappings
1. f : R^2 −→ R^2 where f(x, y) = (x + y, x − y)
2. f : R^2 −→ R^2 where f(x, y) = (2x + y, 4x + 2y)
3. f : R^3 −→ R^3 where f(x, y, z) = (z − y, x − z, y − x)
4. f : R^3 −→ R^3 where f(x, y, z) = (x − y + z, 2x + y − z, −x + 2y + z)
find:
(a) a basis for ker f and hence N(f).
(b) a basis for im f and hence R(f).
and V.
Examples:
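1. Let f : R^2 −→ R^3 with f(x, y) = (x + y, x − y, x). Find the matrix A of f with respect
to the standard bases for R^2 and R^3.
Solution: {(1, 0), (0, 1)} is the standard basis for R^2 while {(1, 0, 0), (0, 1, 0), (0, 0, 1)}
is the standard basis for R^3. Now,
f(1, 0) = (1, 1, 1) and f(0, 1) = (1, −1, 0)
These are, respectively, the first and second columns of A, so that
\[ A = \begin{pmatrix} 1 & 1 \\ 1 & -1 \\ 1 & 0 \end{pmatrix} \]
2. Let g : R^3 −→ R^3 with g(x, y, z) = (x − y, z, x + y + z). Then
im g = {(x − y, z, x + y + z) | x, y, z ∈ R}
= {x(1, 0, 1) + y(−1, 0, 1) + z(0, 1, 1) | x, y, z ∈ R}
So the vectors (1, 0, 1), (−1, 0, 1), and (0, 1, 1) span im g. These are, respectively, the
first, second and third columns of the matrix A of g with respect to the standard basis for R^3.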
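3. Define h : R^3 −→ R^3 by h(x, y, z) = (−2x + y − z, x − 2y − z, −x − y − 2z). Find the
matrix of h using the basis B = {(1, 0, 0), (1, 1, 0), (1, 1, 1)} for R^3.
Solution: In this case,
h(1, 0, 0) = (−2, 1, −1) = a(1, 0, 0) + b(1, 1, 0) + c(1, 1, 1)
h(1, 1, 0) = (−1, −1, −2) = d(1, 0, 0) + e(1, 1, 0) + f(1, 1, 1)
h(1, 1, 1) = (−2, −2, −4) = g(1, 0, 0) + h(1, 1, 0) + i(1, 1, 1)
where a, b, ···, i are scalars. Hence we have
a + b + c = −2
b + c = 1
c = −1
so that c = −1, b = 2, a = −3,
d + e + f = −1
e + f = −1
f = −2
so that f = −2, e = 1, d = 0,
g + h + i = −2
h + i = −2
i = −4
so that i = −4, h = 2, g = 0.
Therefore, the required matrix is
\[ \begin{pmatrix} a & d & g \\ b & e & h \\ c & f & i \end{pmatrix} = \begin{pmatrix} -3 & 0 & 0 \\ 2 & 1 & 2 \\ -1 & -2 & -4 \end{pmatrix}. \]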
4. Define the linear transformation T : R^2 −→ R^3 by T(x, y) = (y, −5x + 13y, −7x + 16y).
Determine the matrix of T with respect to the bases B = {(3, 1), (5, 2)} for R^2 and
B′ = {(1, 0, −1), (−1, 2, 2), (0, 1, 2)} for R^3.
Solution: In this case,
T(3, 1) = (1, −2, −5) = a(1, 0, −1) + b(−1, 2, 2) + c(0, 1, 2)
T(5, 2) = (2, 1, −3) = d(1, 0, −1) + e(−1, 2, 2) + f(0, 1, 2)
and we have
a − b = 1
2b + c = −2
−a + 2b + 2c = −5
so that a = 1, b = 0, c = −2,
d − e = 2
2e + f = 1
−d + 2e + 2f = −3
so that d = 3, e = 1, f = −1.
Therefore, the matrix of T with respect to B and B′ is
\[ \begin{pmatrix} 1 & 3 \\ 0 & 1 \\ -2 & -1 \end{pmatrix}. \]
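The procedure used in the last example (apply T to each vector of B, then express the result in terms of B′) can be sketched in Python as follows. This is an illustrative check only: solve is a hypothetical helper that performs Gauss-Jordan elimination with exact fractions and assumes the coefficient matrix is invertible, which holds here because B′ is a basis.

```python
from fractions import Fraction

def solve(A, b):
    """Solve the square system A x = b by Gauss-Jordan elimination (A assumed invertible)."""
    n = len(A)
    m = [[Fraction(x) for x in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    for c in range(n):
        p = next(i for i in range(c, n) if m[i][c] != 0)  # first usable pivot row
        m[c], m[p] = m[p], m[c]
        pivot = m[c][c]
        m[c] = [x / pivot for x in m[c]]
        for i in range(n):
            if i != c:
                factor = m[i][c]
                m[i] = [a - factor * pc for a, pc in zip(m[i], m[c])]
    return [row[n] for row in m]

def T(x, y):
    return (y, -5 * x + 13 * y, -7 * x + 16 * y)

B = [(3, 1), (5, 2)]
B_prime = [(1, 0, -1), (-1, 2, 2), (0, 1, 2)]

# Matrix whose columns are the vectors of B'.
M = [[v[i] for v in B_prime] for i in range(3)]
# Each column of the answer holds the B'-coordinates of T(b) for b in B.
columns = [solve(M, T(*b)) for b in B]
matrix = [[col[i] for col in columns] for i in range(3)]
for row in matrix:
    print([str(x) for x in row])
```

The printed rows should agree with the matrix obtained by hand above.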
Exercise:
1. Let f : R^3 −→ R^2 with f(x, y, z) = (x + z, 2x + y + z). Find the matrix of f using the
standard bases for R^3 and R^2.
2. Suppose U = {all polynomials with deg ≤ 3} and V = {all polynomials with deg ≤ 2},
and suppose f : U −→ V such that f = d/dx. Find the matrix of f using {1, x, x^2, x^3} and
{1, x, x^2} as the bases for U and V respectively.
3. Let T : R^2 −→ R^3 be defined by T(x, y) = (x + 2y, −x, 0). Find the matrix of T with
respect to the bases B1 = {(1, 3), (−2, 4)} and B2 = {(1, 1, 1), (2, 2, 0), (3, 0, 0)} for R^2
and R^3 respectively.
4. Suppose $A = \begin{pmatrix} 2 & 5 & -3 \\ 1 & -4 & 7 \end{pmatrix}$ represents the linear transformation f : R^3 −→ R^2
defined by f(v) = Av. Find the matrix that represents the mapping relative to the
bases {(1, 1, 1), (1, 1, 0), (1, 0, 0)} and {(1, 3), (2, 5)} for R^3 and R^2 respectively.
yielding x = y = 0. Hence
ker f = {(x, y) | x = y = 0}
= {(0, 0)}
so that
x−y =0
2x − 2y = 0
yielding y = x. Hence
ker f = {(x, y) | y = x}
= {(x, x) | x ∈ R}
= {x (1, 1) | x ∈ R}
Therefore, {(1, 1)} is a basis for ker f, implying that N(f) = 1. Now, since ker f ≠ {0}
(or N(f) ≠ 0), f is not one-to-one.
Examples:
1. In Example 1 above, N(f) = 0. By the rank-nullity theorem, R(f) = 2 < dim cod f.
Hence im f ≠ R^3. Therefore, f is not onto.
2. It was shown in a previous example that the linear mapping f : R^3 −→ R^2 defined by
f(x, y, z) = (x − y, x + y + z) has R(f) = 2 = dim cod f. Hence im f = R^2. Thus f
is onto.
Theorem: Let f : U −→ V be a linear transformation with dim U = dim V = n.
(i ) If f is one-to-one, then f is onto.
(ii ) If f is onto, then f is one-to-one.
Theorem: Let f : U −→ V be a linear transformation. Suppose that dim U = n and
dim V = m. Then
(i) If n > m, f is not one-to-one.
(ii) If m > n, f is not onto.
(iii) If f is invertible, then n = m.
Note:
(a) Condition (iii) above is only necessary but not sufficient for invertibility. It is a
consequence of the first two, since an invertible f is both one-to-one and onto. So (i) and (ii)
are sufficient to tell that f is not invertible whenever n ≠ m. For instance,
for any finite n and m, f : R^n −→ R^m is not invertible if n ≠ m.
(b) f is invertible if its matrix representation with respect to any bases is invertible.
Theorem: Suppose that f : V −→ V is multiplication by an invertible n × n matrix A.
Then f has an inverse and f^−1 : V −→ V is multiplication by A^−1.
Theorem: Suppose f : V −→ V is multiplication by an n × n matrix A. Then the following
are equivalent:
(i ) f is invertible.
(ii ) A is invertible.
Examples:
1. Let f : R^3 −→ R^3 where f(x, y, z) = (x + y + z, x − 3z, y + 4z). Show that f is not
invertible.
Solution:
Method I:
In this case, ker f = {(x, y, z) | (x + y + z, x − 3z, y + 4z) = (0, 0, 0)}
So we have a homogeneous system
x + y + z = 0
x − 3z = 0
y + 4z = 0
We now reduce the matrix of coefficients of the system to a row echelon form:
\[ \begin{pmatrix} 1 & 1 & 1 \\ 1 & 0 & -3 \\ 0 & 1 & 4 \end{pmatrix} \xrightarrow{R_1 - R_2} \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 4 \\ 0 & 1 & 4 \end{pmatrix} \xrightarrow{R_2 - R_3} \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 4 \\ 0 & 0 & 0 \end{pmatrix} \]
We now have an equivalent homogeneous system
x + y + z = 0
y + 4z = 0
Solving for the basic variables x and y in terms of the free variable z:
The equation y + 4z = 0 gives y = −4z and back substitution into the equation
x + y + z = 0 gives x = −y − z = 4z − z = 3z.
Hence ker f = {(3z, −4z, z) | z ∈ R} = {z(3, −4, 1) | z ∈ R}. Since ker f ≠ {0}, f is not
one-to-one and is therefore not invertible.
Solution:
(i )
Method I :
In this case, im f = {x(3, 1, 0) + y(2, 0, 1) + z(0, −3, 4) | x, y, z ∈ R}. So the vectors
(3, 1, 0), (2, 0, 1), and (0, −3, 4) span im f. Now,
\[ \begin{pmatrix} 3 & 1 & 0 \\ 2 & 0 & 1 \\ 0 & -3 & 4 \end{pmatrix} \xrightarrow{2R_1 - 3R_2} \begin{pmatrix} 3 & 1 & 0 \\ 0 & 2 & -3 \\ 0 & -3 & 4 \end{pmatrix} \xrightarrow{3R_2 + 2R_3} \begin{pmatrix} 3 & 1 & 0 \\ 0 & 2 & -3 \\ 0 & 0 & -1 \end{pmatrix} \]
From the row echelon form above, R(f) = 3 = dim cod f. Thus f is onto. From the
rank-nullity theorem, N(f) = 0. Thus f is one-to-one. Since f is both one-to-one and
onto, it is invertible.
Method II :
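In this case, ker f = {(x, y, z) | (3x + 2y, x − 3z, y + 4z) = (0, 0, 0)}
So we have a homogeneous system
3x + 2y = 0
x − 3z = 0
y + 4z = 0
We now reduce the matrix of coefficients of the system to a row echelon form:
\[ \begin{pmatrix} 3 & 2 & 0 \\ 1 & 0 & -3 \\ 0 & 1 & 4 \end{pmatrix} \xrightarrow{R_1 - 3R_2} \begin{pmatrix} 3 & 2 & 0 \\ 0 & 2 & 9 \\ 0 & 1 & 4 \end{pmatrix} \xrightarrow{R_2 - 2R_3} \begin{pmatrix} 3 & 2 & 0 \\ 0 & 2 & 9 \\ 0 & 0 & 1 \end{pmatrix} \]
We now have an equivalent homogeneous system
3x + 2y = 0
2y + 9z = 0
z = 0
and back substitution gives x = y = z = 0. Hence ker f = {(0, 0, 0)} and N(f) = 0.
Let u = (a, b, c) be such that f(u) = (1, 0, 0). Then f(a, b, c) = (3a + 2b, a − 3c, b + 4c) = (1, 0, 0). Thus
3a + 2b = 1
a − 3c = 0
b + 4c = 0
which yields a = 3, b = −4, and c = 1. Hence u = (3, −4, 1).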
Next, let v = (p, q, r). Then f(p, q, r) = (3p + 2q, p − 3r, q + 4r) = (0, 1, 0). Thus
3p + 2q = 0
p − 3r = 1
q + 4r = 0
which yields p = −8, q = 12, and r = −3. Hence v = (−8, 12, −3).
If w = (k, m, n), then f(k, m, n) = (3k + 2m, k − 3n, m + 4n) = (0, 0, 1). Thus
3k + 2m = 0
k − 3n = 0
m + 4n = 1
which yields k = −6, m = 9, and n = −2. Hence w = (−6, 9, −2).
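The three systems above ask for the vectors that f(x, y, z) = (3x + 2y, x − 3z, y + 4z) sends to (1, 0, 0), (0, 1, 0) and (0, 0, 1). Assuming the goal is the inverse of f, these vectors are the columns of the inverse of the standard matrix of f. A minimal Python sketch that solves all three systems exactly is given below; the helper solve is hypothetical and assumes an invertible coefficient matrix.

```python
from fractions import Fraction

def solve(A, b):
    """Solve the square system A x = b by Gauss-Jordan elimination (A assumed invertible)."""
    n = len(A)
    m = [[Fraction(x) for x in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    for c in range(n):
        p = next(i for i in range(c, n) if m[i][c] != 0)  # first usable pivot row
        m[c], m[p] = m[p], m[c]
        pivot = m[c][c]
        m[c] = [x / pivot for x in m[c]]
        for i in range(n):
            if i != c:
                factor = m[i][c]
                m[i] = [a - factor * pc for a, pc in zip(m[i], m[c])]
    return [row[n] for row in m]

# Standard matrix of f(x, y, z) = (3x + 2y, x - 3z, y + 4z).
M = [[3, 2, 0], [1, 0, -3], [0, 1, 4]]
u = solve(M, [1, 0, 0])
v = solve(M, [0, 1, 0])
w = solve(M, [0, 0, 1])

# The solutions happen to be integers here, so int() loses nothing.
print([int(x) for x in u])  # [3, -4, 1]
print([int(x) for x in v])  # [-8, 12, -3]
print([int(x) for x in w])  # [-6, 9, -2]
```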
Exercise:
Let $A = \begin{pmatrix} 1 & -1 & 2 \\ 3 & 0 & 1 \\ 1 & 2 & -3 \\ -2 & -1 & 1 \end{pmatrix}$ be the matrix representation of a linear mapping f : R^3 −→ R^4.
(i) Determine the rank and nullity of f.
(ii) Show that f^−1 does not exist.
EIGENVALUES AND EIGENVECTORS
Definition: Let A be an n × n matrix. The scalar λ is called an eigenvalue of A if there exists
a non-zero vector v such that Av = λv. In this case, the vector v is called an eigenvector of
A corresponding to the eigenvalue λ.
Example:
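Let $A = \begin{pmatrix} 3 & 0 \\ 8 & -1 \end{pmatrix}$. Then
\[ A \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 3 & 0 \\ 8 & -1 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ -1 \end{pmatrix} = -1 \begin{pmatrix} 0 \\ 1 \end{pmatrix} \]
Thus λ1 = −1 is an eigenvalue of A with a corresponding eigenvector $v_1 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$.
Similarly,
\[ A \begin{pmatrix} 1 \\ 2 \end{pmatrix} = \begin{pmatrix} 3 & 0 \\ 8 & -1 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \end{pmatrix} = \begin{pmatrix} 3 \\ 6 \end{pmatrix} = 3 \begin{pmatrix} 1 \\ 2 \end{pmatrix} \]
so that λ2 = 3 is an eigenvalue of A with a corresponding eigenvector $v_2 = \begin{pmatrix} 1 \\ 2 \end{pmatrix}$.
In this case, λ1 = −1 and λ2 = 3 are the only eigenvalues of the 2 × 2 matrix A.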
Examples:
1. Find the eigenvalues of the matrix $A = \begin{pmatrix} 3 & 0 \\ 8 & -1 \end{pmatrix}$.
Solution:
In this case, $\lambda I - A = \begin{pmatrix} \lambda - 3 & 0 \\ -8 & \lambda + 1 \end{pmatrix}$. Hence the characteristic polynomial of A is
\[ \det(\lambda I - A) = \det \begin{pmatrix} \lambda - 3 & 0 \\ -8 & \lambda + 1 \end{pmatrix} = (\lambda - 3)(\lambda + 1) = \lambda^2 - 2\lambda - 3 \]
and the characteristic equation, as a product of irreducible factors, is
(λ − 3)(λ + 1) = 0
The solutions of this equation are λ = −1 and λ = 3. These are the eigenvalues of A.
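2. Find the eigenvalues of $A = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & 0 \end{pmatrix}$.
Solution:
In this case,
\begin{align*}
\det(\lambda I - A) &= \det \begin{pmatrix} \lambda - 1 & 0 & -1 \\ 0 & \lambda - 1 & -1 \\ -1 & -1 & \lambda \end{pmatrix} \\
&= (\lambda - 1)(\lambda^2 - \lambda - 1) - 1(\lambda - 1) \\
&= (\lambda - 1)(\lambda^2 - \lambda - 2) \\
&= (\lambda - 1)(\lambda - 2)(\lambda + 1)
\end{align*}
is the characteristic polynomial of A, in factor form. Accordingly, the characteristic
equation of A is (λ − 1)(λ − 2)(λ + 1) = 0. Hence λ = −1, λ = 1, and λ = 2 are the
eigenvalues of A.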
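A claimed set of eigenvalues can be checked by substituting each value into det(λI − A) and confirming that the result is zero. The Python sketch below does this for the 3 × 3 matrix of Example 2; the helper names det3 and char_poly_at are hypothetical and the cofactor formula used is specific to 3 × 3 matrices.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def char_poly_at(A, lam):
    """Value of det(lam * I - A) for a 3 x 3 matrix A."""
    shifted = [[(lam if r == c else 0) - A[r][c] for c in range(3)] for r in range(3)]
    return det3(shifted)

A = [[1, 0, 1], [0, 1, 1], [1, 1, 0]]

# Each claimed eigenvalue must be a root of the characteristic polynomial.
print([char_poly_at(A, lam) for lam in (-1, 1, 2)])  # [0, 0, 0]

# A non-root for comparison: (0 - 1)(0 - 2)(0 + 1) = 2.
print(char_poly_at(A, 0))  # 2
```

Since the characteristic polynomial of a 3 × 3 matrix has degree 3, three distinct roots account for all of its eigenvalues.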
Exercise:
Compute the eigenvalues of $A = \begin{pmatrix} 5 & 0 & 1 \\ 1 & 1 & 0 \\ -7 & 1 & 0 \end{pmatrix}$. Ans: λ = 2
Examples:
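1. Find the eigenspaces of $A = \begin{pmatrix} 3 & 0 \\ 8 & -1 \end{pmatrix}$.
Solution:
From Example 1 above, λ = −1 and λ = 3 are the eigenvalues of A.
For λ = −1,
\[ (\lambda I - A)v = (-I - A)v = \begin{pmatrix} -4 & 0 \\ -8 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \]
⇒ −4x1 = 0 and −8x1 = 0 ⇒ x1 = 0, and x2 is a free variable. So
\[ v = \begin{pmatrix} 0 \\ x_2 \end{pmatrix} = x_2 \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \]
Hence
\[ E_{-1} = \operatorname{span}\left\{ \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right\}. \]
For λ = 3,
\[ (\lambda I - A)v = (3I - A)v = \begin{pmatrix} 0 & 0 \\ -8 & 4 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \]
⇒ −8x1 + 4x2 = 0 ⇒ x2 = 2x1. So
\[ v = \begin{pmatrix} x_1 \\ 2x_1 \end{pmatrix} = x_1 \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \]
and therefore,
\[ E_{3} = \operatorname{span}\left\{ \begin{pmatrix} 1 \\ 2 \end{pmatrix} \right\}. \]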
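2. Find basis or bases for the eigenspace(s) of $A = \begin{pmatrix} -12 & 7 \\ -7 & 2 \end{pmatrix}$.
Solution:
In this case,
\begin{align*}
\det(\lambda I - A) &= \det \begin{pmatrix} \lambda + 12 & -7 \\ 7 & \lambda - 2 \end{pmatrix} \\
&= \lambda^2 + 10\lambda + 25 \\
&= (\lambda + 5)^2
\end{align*}
so that λ = −5 is the only eigenvalue of A.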
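3. Find the eigenspaces of $A = \begin{pmatrix} 0 & 0 & -2 \\ 1 & 2 & 1 \\ 1 & 0 & 3 \end{pmatrix}$.
Solution:
In this case,
\begin{align*}
\det(\lambda I - A) &= \begin{vmatrix} \lambda & 0 & 2 \\ -1 & \lambda - 2 & -1 \\ -1 & 0 & \lambda - 3 \end{vmatrix} \\
&= (\lambda - 2)(\lambda^2 - 3\lambda + 2) \\
&= (\lambda - 1)(\lambda - 2)^2
\end{align*}
For λ = 2,
\[ (\lambda I - A)v = (2I - A)v = \begin{pmatrix} 2 & 0 & 2 \\ -1 & 0 & -1 \\ -1 & 0 & -1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \]
So x2 and x3 are free variables. Now, 2x1 + 2x3 = 0 ⇒ x1 = −x3. Thus
\[ v = \begin{pmatrix} -x_3 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ x_2 \\ 0 \end{pmatrix} + \begin{pmatrix} -x_3 \\ 0 \\ x_3 \end{pmatrix} = x_2 \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} + x_3 \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix} \]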
Eigenvalues of Triangular Matrices
Theorem: If A is a triangular matrix, then the eigenvalues of A are the entries on the main
diagonal of A.
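Proof: If
\[ A = \begin{pmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\ 0 & a_{22} & a_{23} & \cdots & a_{2n} \\ 0 & 0 & a_{33} & \cdots & a_{3n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & a_{nn} \end{pmatrix} \]
is an upper triangular matrix, then
\[ \lambda I - A = \begin{pmatrix} \lambda - a_{11} & -a_{12} & -a_{13} & \cdots & -a_{1n} \\ 0 & \lambda - a_{22} & -a_{23} & \cdots & -a_{2n} \\ 0 & 0 & \lambda - a_{33} & \cdots & -a_{3n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & \lambda - a_{nn} \end{pmatrix}. \]
Since the determinant of a triangular matrix is equal to the product of its diagonal entries,
det(λI − A) = (λ − a11)(λ − a22) ··· (λ − ann), with zeros λ = a11, λ = a22, ···, λ = ann.
The proof for a lower triangular matrix is similar.
Example: By inspection, the eigenvalues of the lower triangular matrix $A = \begin{pmatrix} 1/2 & 0 & 0 \\ -1 & 2/3 & 0 \\ 5 & -8 & -1/4 \end{pmatrix}$
are λ = 1/2, λ = 2/3, and λ = −1/4.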
From the above theorem, it is also an eigenvector of A^7 corresponding to the eigenvalue
λ = 1^7 = 1. Similarly, the eigenvectors $\begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}$ corresponding to the eigenvalue
λ = 2 of A are also eigenvectors of A^7 corresponding to the eigenvalue λ = 2^7 = 128.
det A = det(P^−1 BP)
Examples:
1. In Example 1 above, matrices A and B are similar. In this case, det A = det B = −2.
2. In Example 2 above, it was shown that A and B are similar. Calculations show that
det A = det B = −2.
Theorem: Similar matrices have the same characteristic polynomial and hence the same
eigenvalues.
Proof: Suppose A and B are similar. Then there exists an invertible matrix P such that
A = P^−1 BP. Accordingly,
λI − A = λI − P^−1 BP
= λI P^−1 P − P^−1 BP
= P^−1 (λI) P − P^−1 BP
= P^−1 (λI − B) P
and
det(λI − A) = det(P^−1 (λI − B) P) = det(P^−1) det(λI − B) det(P) = det(λI − B)
So A and B have the same characteristic polynomial and hence the same characteristic equation.
Since eigenvalues are roots of the characteristic equation, they have the same eigenvalues.
Examples:
1. A previous example showed that the matrices $A = \begin{pmatrix} 2 & 1 \\ 0 & -1 \end{pmatrix}$ and $B = \begin{pmatrix} 4 & -2 \\ 5 & -3 \end{pmatrix}$ are
similar. Clearly, the eigenvalues of A are −1 and 2 (since A is a triangular matrix).
These are the eigenvalues of B also.
2. In a previous example, it was shown that the matrices $A = \begin{pmatrix} -6 & -3 & -25 \\ 2 & 1 & 8 \\ 2 & 2 & 7 \end{pmatrix}$ and
$B = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 2 \end{pmatrix}$ are similar. Clearly, the eigenvalues of B are 1, −1 and 2 (since B
is a diagonal matrix). These are the eigenvalues of A also.
Exercise:
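Determine whether matrices $A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}$ and $B = \begin{pmatrix} 15 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$ are similar or not.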
Definition: An n × n matrix A is diagonalizable if there exists an invertible n × n matrix
P such that P −1 AP is a diagonal matrix. In other words, A is diagonalizable if there is a
diagonal matrix D such that A is similar to D. The matrix P is said to diagonalize A in this
case.
Remark: If A is diagonalizable, then A is similar to a diagonal matrix whose diagonal
components are the eigenvalues of A.
Theorem: An n × n matrix A is diagonalizable if and only if it has n linearly independent
eigenvectors.
Corollary: If an n × n matrix A has n distinct eigenvalues, then it is diagonalizable.
which is a diagonal matrix whose diagonal components are the eigenvalues of A.
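3. Let $A = \begin{pmatrix} 0 & 0 & -2 \\ 1 & 2 & 1 \\ 1 & 0 & 3 \end{pmatrix}$. Determine an invertible matrix P such that D = P^−1 AP is a
diagonal matrix.
Solution: From a previous example, the eigenvalues of A are λ = 1 and λ = 2, where
p1 = (−2, 1, 1) is an eigenvector of A corresponding to λ = 1 while p2 = (−1, 0, 1) and
p3 = (0, 1, 0) are eigenvectors of A corresponding to λ = 2. It can be verified that
p1, p2, and p3 are linearly independent so that $P = \begin{pmatrix} -2 & -1 & 0 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}$ diagonalizes A. In
this case, $P^{-1} = \begin{pmatrix} -1 & 0 & -1 \\ 1 & 0 & 2 \\ 1 & 1 & 1 \end{pmatrix}$ and
\begin{align*}
P^{-1}AP &= \begin{pmatrix} -1 & 0 & -1 \\ 1 & 0 & 2 \\ 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} 0 & 0 & -2 \\ 1 & 2 & 1 \\ 1 & 0 & 3 \end{pmatrix} \begin{pmatrix} -2 & -1 & 0 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix} \\
&= \begin{pmatrix} -1 & 0 & -1 \\ 2 & 0 & 4 \\ 2 & 2 & 2 \end{pmatrix} \begin{pmatrix} -2 & -1 & 0 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix} \\
&= \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{pmatrix}
\end{align*}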
Note: There is no preferred order for the columns of the diagonalizing matrix P. Since
the ith diagonal entry of P^−1 AP is the eigenvalue corresponding to the ith column vector of
P, changing the order of the columns of P just changes the order of the eigenvalues on the
diagonal of P^−1 AP. Thus if we write $P = \begin{pmatrix} -1 & 0 & -2 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{pmatrix}$ in Example 3 above, we would
have obtained $P^{-1}AP = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix}$.
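4. From a previous example, the matrix $A = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & 0 \end{pmatrix}$ has three distinct eigenvalues,
namely λ = −1, λ = 1, and λ = 2. Thus by the corollary above, A is diagonalizable.
Now, calculations show that (1, 1, −2), (−1, 1, 0) and (1, 1, 1) are eigenvectors of A
corresponding to λ = −1, λ = 1, and λ = 2 respectively. So $P = \begin{pmatrix} 1 & -1 & 1 \\ 1 & 1 & 1 \\ -2 & 0 & 1 \end{pmatrix}$ is a
diagonalizing matrix and
\begin{align*}
P^{-1}AP &= \frac{1}{6} \begin{pmatrix} 1 & 1 & -2 \\ -3 & 3 & 0 \\ 2 & 2 & 2 \end{pmatrix} \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & -1 & 1 \\ 1 & 1 & 1 \\ -2 & 0 & 1 \end{pmatrix} \\
&= \begin{pmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix}.
\end{align*}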
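A diagonalization can be verified without computing an inverse: P^−1 AP = D is equivalent to AP = PD together with P being invertible. The Python sketch below checks both conditions for the matrices A, P and D of Example 4, using integer arithmetic only. The helper matmul and the name Q (taken here to be 6P^−1, as in the factor 1/6 above) are chosen for illustration.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[1, 0, 1], [0, 1, 1], [1, 1, 0]]
P = [[1, -1, 1], [1, 1, 1], [-2, 0, 1]]   # columns are the eigenvectors
D = [[-1, 0, 0], [0, 1, 0], [0, 0, 2]]    # matching eigenvalues on the diagonal
Q = [[1, 1, -2], [-3, 3, 0], [2, 2, 2]]   # assumed to equal 6 * P^(-1)

# Column by column, A P = P D says A p_i = lambda_i p_i.
print(matmul(A, P) == matmul(P, D))  # True

# Q P = 6 I confirms that P is invertible with P^(-1) = Q / 6.
print(matmul(Q, P) == [[6, 0, 0], [0, 6, 0], [0, 0, 6]])  # True
```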
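(P^−1 AP)^2 = P^−1 AP P^−1 AP = P^−1 A I A P = P^−1 A^2 P.
In general, for any positive integer k,
(P^−1 AP)^k = P^−1 A^k P . . . . . . . . . . (∗)
If P^−1 AP = D is a diagonal matrix, then (∗) gives P^−1 A^k P = D^k, so that
P (P^−1 A^k P) P^−1 = P D^k P^−1
⇒ A^k = P D^k P^−1 . . . . . . . . . . (∗∗)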
Example:
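In Example 4 above, it was shown that the matrix A is diagonalized by $P = \begin{pmatrix} 1 & -1 & 1 \\ 1 & 1 & 1 \\ -2 & 0 & 1 \end{pmatrix}$
and that $D = P^{-1}AP = \begin{pmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix}$. Thus from Equation (∗∗) above,
\begin{align*}
A^5 &= P D^5 P^{-1} \\
&= \begin{pmatrix} 1 & -1 & 1 \\ 1 & 1 & 1 \\ -2 & 0 & 1 \end{pmatrix} \begin{pmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 32 \end{pmatrix} \begin{pmatrix} 1/6 & 1/6 & -1/3 \\ -1/2 & 1/2 & 0 \\ 1/3 & 1/3 & 1/3 \end{pmatrix} \\
&= \begin{pmatrix} -1 & -1 & 32 \\ -1 & 1 & 32 \\ 2 & 0 & 32 \end{pmatrix} \begin{pmatrix} 1/6 & 1/6 & -1/3 \\ -1/2 & 1/2 & 0 \\ 1/3 & 1/3 & 1/3 \end{pmatrix} \\
&= \begin{pmatrix} 11 & 10 & 11 \\ 10 & 11 & 11 \\ 11 & 11 & 10 \end{pmatrix}.
\end{align*}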
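The value of A^5 can be cross-checked by comparing the diagonalization route with direct repeated multiplication. The following Python sketch does so for the matrix A of Example 4; matmul, identity and power are hypothetical helpers, and Q is taken to be 6P^−1 so that all arithmetic stays in the integers.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def power(A, k):
    """A^k by repeated multiplication (k a non-negative integer)."""
    result = identity(len(A))
    for _ in range(k):
        result = matmul(result, A)
    return result

A = [[1, 0, 1], [0, 1, 1], [1, 1, 0]]
P = [[1, -1, 1], [1, 1, 1], [-2, 0, 1]]
Q = [[1, 1, -2], [-3, 3, 0], [2, 2, 2]]  # 6 * P^(-1)
D5 = [[(-1) ** 5, 0, 0], [0, 1 ** 5, 0], [0, 0, 2 ** 5]]

# P D^5 P^(-1) = (P D^5 Q) / 6; every entry of P D^5 Q is divisible by 6 here.
via_diag = [[x // 6 for x in row] for row in matmul(matmul(P, D5), Q)]
direct = power(A, 5)

print(direct)              # [[11, 10, 11], [10, 11, 11], [11, 11, 10]]
print(via_diag == direct)  # True
```

Since A is symmetric, every power of A is symmetric as well, which gives a quick sanity check on the result.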
Exercise:
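1. Consider the matrix $A = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}$.
(a) Find an invertible matrix P and a diagonal matrix D such that P^−1 AP = D.
(b) Use the diagonal matrix D in (a) above to compute A^7. Ans: $A^7 = \begin{pmatrix} 42 & 43 & 43 \\ 43 & 43 & 42 \\ 43 & 42 & 43 \end{pmatrix}$.
2. Let $B = \begin{pmatrix} -4 & 0 & -1 \\ 2 & -1 & 0 \\ 2 & 0 & -1 \end{pmatrix}$.
(a) Find the eigenvalues and the corresponding eigenspaces of B.
Ans: λ = −1, $E_{-1} = \operatorname{span}\left\{ \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \right\}$; λ = −2, $E_{-2} = \operatorname{span}\left\{ \begin{pmatrix} 1 \\ -2 \\ -2 \end{pmatrix} \right\}$;
λ = −3, $E_{-3} = \operatorname{span}\left\{ \begin{pmatrix} -1 \\ 1 \\ 1 \end{pmatrix} \right\}$
(b) Find the eigenvalues and bases for the corresponding eigenspaces of B^8.
(c) Show that B is diagonalizable and hence find a diagonalizing matrix P such that
P^−1 BP = D is a diagonal matrix.
(d) Use the diagonalization in part (c) above to evaluate B^10.
Ans: $B^{10} = \begin{pmatrix} 117074 & 0 & 58025 \\ -116050 & 1 & -57002 \\ -116050 & 0 & -57001 \end{pmatrix}$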
Polynomials of Matrices
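Consider a polynomial f(t) over a field F given by f(t) = a_n t^n + ··· + a_1 t + a_0.
If A is a square matrix over F, then f(A) is defined by f(A) = a_n A^n + ··· + a_1 A + a_0 I,
where I is the identity matrix.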
Example:
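Let f(x) = x^2 + 3x + 2 and $A = \begin{pmatrix} -4 & -3 \\ 2 & 1 \end{pmatrix}$. Then
\begin{align*}
f(A) &= A^2 + 3A + 2I \\
&= \begin{pmatrix} -4 & -3 \\ 2 & 1 \end{pmatrix} \begin{pmatrix} -4 & -3 \\ 2 & 1 \end{pmatrix} + 3 \begin{pmatrix} -4 & -3 \\ 2 & 1 \end{pmatrix} + 2 \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \\
&= \begin{pmatrix} 10 & 9 \\ -6 & -5 \end{pmatrix} + \begin{pmatrix} -12 & -9 \\ 6 & 3 \end{pmatrix} + \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix} \\
&= \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}.
\end{align*}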
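Evaluating a polynomial at a matrix replaces each power of x by the corresponding power of A and the constant term c by cI. A minimal Python sketch for the example f(x) = x^2 + 3x + 2 is shown below; the helper matmul is hypothetical and the code handles only this particular quadratic.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[-4, -3], [2, 1]]
I = [[1, 0], [0, 1]]
A2 = matmul(A, A)

# f(A) = A^2 + 3A + 2I, computed entry by entry.
f_of_A = [[A2[i][j] + 3 * A[i][j] + 2 * I[i][j] for j in range(2)] for i in range(2)]

print(A2)      # [[10, 9], [-6, -5]]
print(f_of_A)  # [[0, 0], [0, 0]]
```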
Solution:
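In this case, det A = −2 ≠ 0. Hence A^−1 exists. Now,
\begin{align*}
\det(\lambda I - A) &= \begin{vmatrix} \lambda & -1 & -1 \\ -1 & \lambda - 1 & 0 \\ -1 & 0 & \lambda - 1 \end{vmatrix} \\
&= \lambda(\lambda^2 - 2\lambda + 1) + 1[-(\lambda - 1)] - 1(\lambda - 1) \\
&= \lambda^3 - 2\lambda^2 - \lambda + 2
\end{align*}
By the Cayley-Hamilton theorem,
f(A) = A^3 − 2A^2 − A + 2I = 0
⇒ A(A^2 − 2A − I + 2A^−1) = 0
⇒ A^2 − 2A − I + 2A^−1 = 0
⇒ A^−1 = −(1/2)(A^2 − 2A − I)
\begin{align*}
\Rightarrow A^{-1} &= -\frac{1}{2} \left[ \begin{pmatrix} 0 & 1 & 1 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix} \begin{pmatrix} 0 & 1 & 1 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix} - \begin{pmatrix} 0 & 2 & 2 \\ 2 & 2 & 0 \\ 2 & 0 & 2 \end{pmatrix} - \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \right] \\
&= -\frac{1}{2} \left[ \begin{pmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{pmatrix} - \begin{pmatrix} 0 & 2 & 2 \\ 2 & 2 & 0 \\ 2 & 0 & 2 \end{pmatrix} - \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \right] \\
&= -\frac{1}{2} \begin{pmatrix} 1 & -1 & -1 \\ -1 & -1 & 1 \\ -1 & 1 & -1 \end{pmatrix} \\
&= \begin{pmatrix} -1/2 & 1/2 & 1/2 \\ 1/2 & 1/2 & -1/2 \\ 1/2 & -1/2 & 1/2 \end{pmatrix}
\end{align*}
As a check,
\[ AA^{-1} = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix} \begin{pmatrix} -1/2 & 1/2 & 1/2 \\ 1/2 & 1/2 & -1/2 \\ 1/2 & -1/2 & 1/2 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \]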
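The inverse formula obtained from the characteristic polynomial, A^−1 = −(1/2)(A^2 − 2A − I), can be confirmed with exact rational arithmetic. The Python sketch below is an illustrative check for this particular 3 × 3 matrix only; matmul is a hypothetical helper.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[0, 1, 1], [1, 1, 0], [1, 0, 1]]
I = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
A2 = matmul(A, A)

# Rearranged Cayley-Hamilton identity: A^(-1) = -(1/2) (A^2 - 2A - I).
A_inv = [[Fraction(-1, 2) * (A2[i][j] - 2 * A[i][j] - I[i][j]) for j in range(3)]
         for i in range(3)]

for row in A_inv:
    print([str(x) for x in row])
print(matmul(A, A_inv) == I)  # True
```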
Exercise:
Show that each of the following matrices is invertible and hence use the Cayley-Hamilton
theorem to find its inverse.
(a) $\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$
(b) $\begin{pmatrix} 2 & 0 & 2 \\ 0 & 2 & 2 \\ 2 & 2 & 0 \end{pmatrix}$
Definition: The minimal polynomial of a square matrix A is the monic polynomial of the
smallest degree having A as a root, i.e. h(λ) is the minimal polynomial of A if h(λ) is the
monic polynomial of the smallest degree such that h(A) = 0.
Theorem: The minimal polynomial and the characteristic polynomial of an n × n matrix
have the same irreducible factors.
Theorem: Let A be an n×n matrix. The minimal polynomial of A divides every polynomial
which has A as its zero. In particular, the minimal polynomial divides the characteristic
polynomial.
Example:
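1. Find the minimal polynomial of $A = \begin{pmatrix} 0 & 0 & -2 \\ 1 & 2 & 1 \\ 1 & 0 & 3 \end{pmatrix}$.
Solution:
From a previous example f(λ) = (λ − 1)(λ − 2)^2 is the characteristic polynomial of A.
So the irreducible factors of f(λ), and hence of the minimal polynomial of A, are λ − 1
and λ − 2. Therefore, the minimal polynomial of A is either h(λ) = (λ − 1)(λ − 2) or
f(λ) = (λ − 1)(λ − 2)^2.
Consider h(λ) = (λ − 1)(λ − 2). Then
\[ h(A) = (A - I)(A - 2I) = \begin{pmatrix} -1 & 0 & -2 \\ 1 & 1 & 1 \\ 1 & 0 & 2 \end{pmatrix} \begin{pmatrix} -2 & 0 & -2 \\ 1 & 0 & 1 \\ 1 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \]
Since h(A) = 0, h(λ) = (λ − 1)(λ − 2) is the minimal polynomial of A.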
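Whether the candidate h(λ) = (λ − 1)(λ − 2) annihilates A can be tested directly by forming the product (A − I)(A − 2I). The Python sketch below performs this check for the matrix A of the example; matmul and shifted are hypothetical helper names.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def shifted(A, c):
    """Return A - c I."""
    return [[A[i][j] - (c if i == j else 0) for j in range(len(A))]
            for i in range(len(A))]

A = [[0, 0, -2], [1, 2, 1], [1, 0, 3]]
zero = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]

h_of_A = matmul(shifted(A, 1), shifted(A, 2))
print(h_of_A == zero)  # True: (A - I)(A - 2I) is the zero matrix

# Neither linear factor alone annihilates A, so no polynomial of degree 1 does.
print(shifted(A, 1) == zero, shifted(A, 2) == zero)  # False False
```

Together with the theorem that the minimal polynomial has the same irreducible factors as the characteristic polynomial, this identifies (λ − 1)(λ − 2) as the monic annihilating polynomial of smallest degree for this A.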
Ans: f(λ) = (λ − 2)^3 = λ^3 − 6λ^2 + 12λ − 8
2. Find the minimal polynomial of $B = \begin{pmatrix} -2 & 3 & 0 & 1 \\ 0 & 0 & 2 & 4 \\ 0 & 0 & 1 & 3 \\ 0 & 0 & 0 & -2 \end{pmatrix}$.