MATH 336 Linear Algebra II
Definition. By a field F , we mean a non-empty set of elements with two laws of combination,
which we call an addition + and a multiplication · satisfying:
(F1) To every pair of elements a, b ∈ F there is associated a unique element, called their sum,
which we denote by a + b.
(F2) Addition is associative: (a + b) + c = a + (b + c).
(F3) Addition is commutative: a + b = b + a.
(F4) There exists an element, which we denote by 0, such that a + 0 = a for all a ∈ F .
(F5) For each a ∈ F there exists an element, which we denote by −a such that a + (−a) = 0.
(F6) To every pair of elements a, b ∈ F there is associated a unique element, called their
product, which we denote by ab, or a · b.
(F7) Multiplication is associative: (ab)c = a(bc).
(F8) Multiplication is commutative: ab = ba.
(F9) There exists an element different from 0, which we denote by 1, such that a · 1 = a for all
a ∈ F.
(F10) For each a ∈ F, a ≠ 0, there exists an element, which we denote by a^{-1}, such that
a · a^{-1} = 1.
(F11) Multiplication is distributive with respect to addition: (a + b)c = ac + bc.
We write Q for the set of rational numbers, R for the set of real numbers and C for the set
of complex numbers. These sets are fields. A rigorous definition and treatment of fields can
be found in any abstract algebra course, including 2301337 Abstract Algebra I. The definition of
field was presented once in Linear Algebra I. In this course, F always denotes any of Q, R, C
or other fields. Its members are called scalars. However, almost nothing essential is lost if we
assume that F is the real field R or the complex field C.
Example 1.1.2. Let p be a prime and Fp = {0, 1, . . . , p − 1}. For a and b in Fp, we define a + b and a · b to be the remainders of the ordinary sum and product upon division by p. With these operations, Fp is a field.
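The field axioms for Fp can be checked by ordinary integer arithmetic followed by reduction modulo p. A minimal Python sketch (the helper names add, mul, inv and the choice p = 7 are illustrative, not from the notes):

```python
# Arithmetic in F_p for a prime p, using reduction modulo p.
p = 7

def add(a, b):
    return (a + b) % p          # sum in F_p

def mul(a, b):
    return (a * b) % p          # product in F_p

def inv(a):
    # multiplicative inverse of a nonzero element, via Fermat's little theorem
    assert a % p != 0
    return pow(a, p - 2, p)

# (F10): every nonzero element has a multiplicative inverse.
assert all(mul(a, inv(a)) == 1 for a in range(1, p))
```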
Definition. An m × n matrix A over F is a rectangular array of elements of F arranged in m rows and n columns, written A = [aij],
where aij ∈ F for all i ∈ {1, 2, . . . , m} and j ∈ {1, 2, . . . , n}. We write Mm,n(F) for the set of
m × n matrices with entries in F and we write Mn(F) for Mn,n(F), the set of square matrices of
order n.
Remark. As a shortcut, we often use the notation A = [aij] to denote the matrix A with entries
aij. Notice that when we refer to the matrix we put brackets, as in “[aij]”, and when we refer
to a specific entry we do not use the surrounding brackets, as in “aij”.
Definition. Two m × n matrices A = [aij ] and B = [bij ] are equal if aij = bij for all i ∈
{1, 2, . . . , m} and j ∈ {1, 2, . . . , n}.
Definition. The m × n zero matrix 0_{m×n} ∈ Mm,n(F) is the matrix with 0_F's everywhere,

0_{m \times n} = \begin{bmatrix} 0 & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{bmatrix}.
Definition. Let A = [aij] and B = [bij] be m × n matrices and let r ∈ F be a scalar. The matrix A + rB
is the matrix C = [cij] ∈ Mm,n(F) with entries cij = aij + r·bij for all i and j.
Theorem 1.1.1. Let A, B and C be matrices of the same size, and let r and s be scalars in F . Then
(a) A + B = B + A (e) r0 = 0 and 0A = 0
(b) (A + B) + C = A + (B + C) (f) 1A = A
(c) A + 0 = A (g) (r + s)A = rA + sA
(d) r(A + B) = rA + rB (h) r(sA) = (rs)A = (sr)A = s(rA)
Definition. Let A be an m × n matrix with columns ~a1, ~a2, . . . , ~an and let ~x be a column vector
in F^n. The product of A and ~x, denoted by A~x, is the linear combination of the columns of A
using the corresponding entries in ~x as weights. That is,

A~x = \begin{bmatrix} ~a1 & ~a2 & \cdots & ~an \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} := x_1~a1 + x_2~a2 + · · · + x_n~an.
If B is an n × p matrix with columns ~b1 , ~b2 , . . . , ~bp , then the product of A and B, denoted by
AB, is the m × p matrix with columns A~b1 , A~b2 , . . . , A~bp . In other words,
AB = A[~b1 ~b2 · · · ~bp] := [A~b1 A~b2 · · · A~bp].
The above definition of AB is good for theoretical work. When A and B have small sizes,
the following method is more efficient when working by hand. Let A = [aij] ∈ Mm,n(F) and
B = [bij ] ∈ Mn,p (F ). Then the matrix product AB is defined as the matrix C = [cij ] ∈ Mm,p (F )
with entries
c_{ij} = \sum_{l=1}^{n} a_{il} b_{lj},
that is,

\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
\vdots & \vdots & & \vdots \\
a_{i1} & a_{i2} & \cdots & a_{in} \\
\vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
\begin{bmatrix}
b_{11} & \cdots & b_{1j} & \cdots & b_{1p} \\
b_{21} & \cdots & b_{2j} & \cdots & b_{2p} \\
\vdots & & \vdots & & \vdots \\
b_{n1} & \cdots & b_{nj} & \cdots & b_{np}
\end{bmatrix}
=
\begin{bmatrix}
c_{11} & \cdots & c_{1p} \\
\vdots & c_{ij} & \vdots \\
c_{m1} & \cdots & c_{mp}
\end{bmatrix}.
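The entry formula and the column description of AB give the same matrix; a short numpy sketch comparing the two (the matrices below are arbitrary illustrations):

```python
import numpy as np

A = np.array([[1, 2, 0],
              [3, -1, 4]])        # 2 x 3
B = np.array([[2, 1],
              [0, 5],
              [-1, 3]])           # 3 x 2

# Entry formula: c_ij = sum_l a_il * b_lj
C = np.zeros((2, 2))
for i in range(2):
    for j in range(2):
        C[i, j] = sum(A[i, l] * B[l, j] for l in range(3))

# Column description: the j-th column of AB is A times the j-th column of B
C_cols = np.column_stack([A @ B[:, j] for j in range(2)])

assert np.array_equal(C, C_cols) and np.array_equal(C, A @ B)
```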
Theorem 1.1.2. Let A be m × n and let B and C have sizes for which the indicated sums and
products are defined.
(a) A(B + C) = AB + AC and (B + C)A = BA + CA
(b) r(AB) = (rA)B = A(rB) for any scalar r
(c) A0n×k = 0m×k and 0k×m A = 0k×n
(d) Im A = A = AIn
(e) A(BC) = (AB)C
Remarks. The properties above are analogous to properties of real numbers. But NOT ALL real
number properties carry over to matrices.
1. It is not the case that AB always equals BA.
2. Even if A ≠ 0 and AB = AC, B may not equal C. (For cancellation, A must have an inverse!)
3. It is possible for AB = 0 even if A ≠ 0 and B ≠ 0. E.g.,
\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}.
Theorem 1.1.3. Let A and B denote matrices whose sizes are appropriate for the following sums
and products.
(a) (AT )T = A (c) (rA)T = rAT for any scalar r
(b) (A + B)T = AT + B T (d) (AB)T = B T AT
Definition. A vector space V over a field F is a nonempty set of elements called vectors, with
two laws of combination, called vector addition (or addition) and scalar multiplication, sat-
isfying the following conditions.
(A1) ∀~u, ~v ∈ V, ~u + ~v ∈ V.
(A2) ∀~u, ~v ∈ V, ~u + ~v = ~v + ~u.
(A3) ∀~u, ~v, ~w ∈ V, ~u + (~v + ~w) = (~u + ~v) + ~w.
(A4) ∃~0 ∈ V, ∀~u ∈ V, ~u + ~0 = ~u = ~0 + ~u.
(A5) ∀~u ∈ V, ∃~u′ ∈ V, ~u + ~u′ = ~0 = ~u′ + ~u.
(SM1) ∀a ∈ F, ∀~u ∈ V, a~u ∈ V.
(SM2) ∀a ∈ F, ∀~u, ~v ∈ V, a(~u + ~v) = a~u + a~v.
(SM3) ∀a, b ∈ F, ∀~u ∈ V, (a + b)~u = a~u + b~u.
(SM4) ∀a, b ∈ F, ∀~u ∈ V, (ab)~u = a(b~u).
(SM5) ∀~u ∈ V, 1~u = ~u (1 ∈ F).
We call ~0 the zero vector and ~u′ the negative of ~u.
Examples 1.2.1. 1. For any field F and n ≥ 1, we have F n is a vector space over F where
(x1 , . . . , xn ) + (y1 , . . . , yn ) = (x1 + y1 , . . . , xn + yn )
and
a(x1 , . . . , xn ) = (ax1 , . . . , axn )
for all (x1, . . . , xn), (y1, . . . , yn) ∈ F^n and a ∈ F.
2. Let m, n ∈ N, let F be a field and Mm,n(F) the set of all m × n matrices over F. Then Mm,n(F)
is a vector space over F under the usual addition and scalar multiplication of matrices.
3. [The space of functions from a set to a field] Let S be a nonempty set and F a field. Let
F^S = {f | f : S → F}. Then F^S is a vector space over F by defining f + g and cf for
functions f, g ∈ F^S and a scalar c ∈ F as follows:
(f + g)(t) = f(t) + g(t) and (cf)(t) = c f(t)
for all t ∈ S. The zero function from S into F is the zero vector of F^S and the negative of
f ∈ F^S is −f defined by (−f)(t) = −f(t) for all t ∈ S.
4. [The sequence space] Let F^N = {(xn) : (xn) is a sequence in F}. Then F^N is a vector
space over F under the usual addition and scalar multiplication of sequences. That is, for
sequences (an) and (bn) in F^N and a scalar c ∈ F,
(an) + (bn) = (an + bn) and c(an) = (c an).
Its zero is the zero sequence (zn) where zn = 0 for all n and the negative of (an) is the
sequence (bn) given by bn = −an for all n.
5. Let n be a non-negative integer and Fn[x] be the set of polynomials over F of degree at most
n. That is, Fn[x] = {a0 + a1x + a2x² + · · · + anx^n : a0, a1, . . . , an ∈ F}. Addition and scalar multiplication are defined coefficientwise,
p(x) + q(x) = (a0 + b0) + (a1 + b1)x + · · · + (an + bn)x^n and c p(x) = (ca0) + (ca1)x + · · · + (can)x^n,
for all polynomials p(x) = a0 + a1x + a2x² + · · · + anx^n and q(x) = b0 + b1x + b2x² + · · · + bnx^n in
Fn[x] and c ∈ F. Then Fn[x] is a vector space over F. Observe that for each positive integer
n, we have Fn−1[x] ⊂ Fn[x].
6. [The space of polynomials over a field] Let F[x] be the set of all polynomials over F. That
is, F[x] = {a0 + a1x + · · · + anx^n : n ≥ 0 and a0, a1, . . . , an ∈ F}.
Theorem 1.2.2. Let (V1, +1, ·1), (V2, +2, ·2), . . . , (Vn, +n, ·n) be vector spaces over a field F and let V = V1 × V2 × · · · × Vn.
For (~v1, ~v2, . . . , ~vn), (~w1, ~w2, . . . , ~wn) ∈ V and c ∈ F, we define the addition and scalar multiplica-
tion on V by
(~v1, . . . , ~vn) + (~w1, . . . , ~wn) = (~v1 +1 ~w1, . . . , ~vn +n ~wn) and c(~v1, . . . , ~vn) = (c ·1 ~v1, . . . , c ·n ~vn).
Then V is a vector space over F with the zero vector ~0 = (~01, ~02, . . . , ~0n) and the negative of
(~v1, ~v2, . . . , ~vn) is (−~v1, −~v2, . . . , −~vn). V is called the direct product of V1, V2, . . . , Vn.
1.3 Subspaces
Theorem 1.3.1. Let W be a nonempty subset of V . Then the following statements are equivalent.
(i) W is a subspace of V .
(ii) ∀~u, ~v ∈ W, ∀c ∈ F, ~u + ~v ∈ W and c~u ∈ W .
(iii) ∀~u, ~v ∈ W, ∀c, d ∈ F, c~u + d~v ∈ W .
(iv) ∀~u, ~v ∈ W, ∀c ∈ F, c~u + ~v ∈ W .
Examples 1.3.1. 1. For any vector space V over a field F , we have {~0V } and V are subspaces
of V , called trivial subspaces.
2. For a non-negative integer n, we have Fn [x] is a subspace of F [x].
3. Let α ∈ F and Vα = {(x1 , x2 ) : x1 = αx2 }. Then Vα is a subspace of F 2 .
4. Let Bd(R) = {(an ) ∈ RN : (an ) is a bounded sequence},
C(R) = {(an ) ∈ RN : (an ) is a convergent sequence} and
C0 (R) = {(an ) ∈ RN : an → 0 as n → ∞}.
Then Bd(R), C(R) and C0 (R) are subspaces of RN .
5. Let C 0 (−∞, ∞) = {f ∈ RR : f is continuous on (−∞, ∞)}.
Then C 0 (−∞, ∞) is a subspace of RR .
6. Let W = {f : R → R | f ′′ = f }. Then W is a subspace of RR .
7. Let W1 = {p(x) ∈ F [x] : p(1) = 0} and W2 = {p(x) ∈ F [x] : p(0) = 1}.
Then W1 is a subspace of F [x] but W2 is not.
8. Let A ∈ Mm,n (F ). Then Nul A = {~x ∈ F n : A~x = ~0m } is a subspace of F n , called the null
space of A.
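Null spaces are easy to compute symbolically; a sketch with sympy (the matrix is an arbitrary illustration, not one from the notes):

```python
from sympy import Matrix

A = Matrix([[1, 2, -1],
            [2, 4, -2]])             # rank 1, so Nul A has dimension 3 - 1 = 2

# A basis for Nul A = { x in F^3 : A x = 0 }
basis = A.nullspace()
for v in basis:
    assert A * v == Matrix([0, 0])   # every basis vector solves A x = 0
```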
Theorem 1.3.2. Let V be a vector space over a field F . The intersection of any collection of
subspaces of V is a subspace of V .
Remark. W1 + W2 is the smallest subspace of V containing W1 and W2 , i.e., any subspace con-
taining W1 and W2 must contain W1 + W2 .
Since ∅ ⊂ {~0V}, which is the smallest of all subspaces of V, we have Span ∅ = {~0V}. Moreover,
if W is a subspace of V, then Span W = W. In particular, Span(Span S) = Span S.
Remark. Let S be a non-empty subset of V and let W be a subspace of V containing S. Note that
for c1, . . . , cm ∈ F and ~v1, . . . , ~vm ∈ S, we have ~v1, . . . , ~vm ∈ W and so
c1~v1 + · · · + cm~vm ∈ W.
Thus, Y := {c1~v1 + · · · + cm~vm : c1 , . . . , cm ∈ F and ~v1 , . . . , ~vm ∈ S for some m ∈ N} ⊆ W for all
subspaces W of V containing S. Hence, Y ⊆ Span S.
Theorem 1.3.4. Span S is the smallest subspace of V containing S. That is, any subspace of V
containing S must also contain Span S. Moreover, Span ∅ = {~0} and, for S ≠ ∅,
Span S = {c1~v1 + · · · + cm~vm : m ∈ N, c1, . . . , cm ∈ F and ~v1, . . . , ~vm ∈ S}.
In particular,
Span{~v1, . . . , ~vp} = {c1~v1 + · · · + cp~vp : c1, . . . , cp ∈ F}.
Definition. Let A = [~a1 ~a2 . . . ~an] be an m × n matrix over a field F. Then ~ai ∈ F^m for all
i = 1, 2, . . . , n and Span{~a1 , ~a2 , . . . , ~an } is a subspace of F m , called the column space of A. We
denote this space by Col A.
Theorem 1.3.5. Let V and W be vector spaces over a field F and T : V → W a linear transfor-
mation. Then T (~0V ) = ~0W and ∀~v ∈ V, T (−~v ) = −T (~v ).
Definition. Let V and W be vector spaces over a field F and T : V → W a linear transforma-
tion. Recall that the image or range of T is given by
im T = range T = {~w ∈ W : ∃~v ∈ V, T(~v) = ~w} = {T(~v) : ~v ∈ V}.
Example 1.3.2. Let A = [~a1 ~a2 . . . ~an] be an m × n matrix over a field F.
Then the matrix transformation T : F^n → F^m given by
T(~x) = A~x
for all ~x ∈ F^n is a linear transformation whose image is Col A. Hence,
T is onto ⇔ im T = F^m ⇔ Col A = F^m.
Example 1.3.3. Define T : R[x] → R by
T(p(x)) = p(1)
for all p(x) ∈ R[x]. Show that T is an onto linear transformation and find its kernel.
Example 1.3.4. Let V be the space of differentiable functions on (−∞, ∞) with continuous
derivative. Define a function T : V → C 0 (−∞, ∞) by
T (f (x)) = f ′ (x)
for all f ∈ V . Show that T is an onto linear transformation and find its kernel.
Definition. Let V be a vector space over a field F. Vectors ~u1, ~u2, . . . , ~un in V are linearly
independent if the only scalars c1, c2, . . . , cn ∈ F with c1~u1 + c2~u2 + · · · + cn~un = ~0 are c1 = c2 = · · · = cn = 0.
If there is a linear combination c1~u1 + c2~u2 + · · · + cn~un = ~0 with the scalars c1, c2, . . . , cn not all
zero, we say that ~u1, ~u2, . . . , ~un are linearly dependent.
is dependent or independent in R3 .
Example 1.3.6. Determine whether the set of vectors
~u1 = \begin{bmatrix} 2 & 2 & 1 \\ 0 & 0 & 1 \end{bmatrix}, \quad ~u2 = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 1 \end{bmatrix}, \quad ~u3 = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}
is linearly independent in M2,3(R).
Remark. Observe that the question of dependence and independence of sets of functions is re-
lated to the interval over which the space is defined. Consider the interval [−1, 1] with the
functions f, g and h defined as follows:
f(x) = 1 for −1 ≤ x ≤ 1,
g(x) = \begin{cases} 0 & \text{if } -1 \le x \le 0, \\ x & \text{if } 0 \le x \le 1, \end{cases} \qquad h(x) = \begin{cases} 0 & \text{if } -1 \le x \le 0, \\ x^2 & \text{if } 0 \le x \le 1. \end{cases}
These functions are linearly independent. However, if we restrict these same functions to the
interval [−1, 0], then they are dependent because
0 · f(x) + 1 · g(x) + (−1) · h(x) = 0
for −1 ≤ x ≤ 0.
Theorem 1.4.1. Let V be a vector space over a field F and B = {~v1, . . . , ~vn} ⊆ V linearly
independent.
1. If ~v ∈ Span B, then there exist unique c1, . . . , cn ∈ F such that
~v = c1~v1 + · · · + cn~vn.
2. If B is a basis for V, then every vector in V can be expressed uniquely as a linear combination
of ~v1, . . . , ~vn.
3. Let W be a vector space over a field F and ~w1, . . . , ~wn ∈ W (not necessarily distinct). If B is a
basis for V, then there is a unique linear transformation T from V to W such that T(~vi) = ~wi
for all i ∈ {1, . . . , n}.
Examples 1.4.1. 1. Find a linear transformation T that satisfies the following conditions
(i) T : C → R2 [x] with T (1 − i) = 2x2 and T (1 + i) = 1 − x,
(ii) T : R2 [x] → R2 with T (1) = (2, 1), T (1 − x) = (0, 1) and T (x + x2 ) = (1, 1).
2. Let T : R1 [x] → R3 be a linear transformation with
Lemma 1.4.2. 1. If ~u, ~v1, . . . , ~vn ∈ S and ~u = c1~v1 + · · · + cn~vn, then Span S = Span(S \ {~u}).
2. If S is a linearly independent subset of V and ~u ∉ Span S, then S ∪ {~u} is linearly indepen-
dent.
Theorem 1.4.4. [Replacement Theorem] Let V be a vector space that is spanned by a set G
containing exactly n vectors. Let L be a linearly independent subset of V with m vectors. Then
1. m ≤ n,
2. there exists a subset H of G with n − m vectors such that L ∪ H spans V .
Corollary 1.4.5. If a vector space V has a finite spanning set {~v1 , . . . , ~vn }, then
1. {~v1 , . . . , ~vn } has a subset which is a basis,
2. any linearly independent set in V can be extended to a basis,
3. V has a basis,
4. any two bases have the same finite number of elements, necessarily ≤ n.
Definition. If a vector space V has a finite spanning set, then we say that V is finite-
dimensional, and the number of elements in a basis is called the dimension of V , written
dim V . If V has no finite spanning set, we say that V is infinite-dimensional.
Examples 1.4.3. 1. The vector space {~0} has dimension zero with basis ∅.
2. The vector space F n , n ≥ 1, is of dimension n with standard basis {(1, 0, . . . , 0), (0, 1, . . . , 0),
. . . , (0, 0, . . . , 1)}. Similarly, Mm,n (F ) is of dimension mn where m, n ∈ N.
3. The vector space Fn [x] is of dimension n + 1 with standard basis {1, x, x2 , . . . , xn }.
4. The vector spaces F N and F [x] are infinite-dimensional. A basis for F [x] is {1, x, x2 , . . . }.
5. If we consider C as a vector space over C, it has dimension one with basis {1}. But if we
consider C as a vector space over R it has dimension two with basis {1, i}.
Remark. The above corollary is valid for a “finite” dimensional vector space. For a general (fi-
nite/infinite dimensional) vector space V , consider L = {L ⊆ V : L is linearly independent}.
Then ∅ ∈ L . Partially ordering L by ⊆.S We now show that every S chain in L has an upper
bound. Let C be a chain in L . Consider C . Let ~v1 , . . . , ~vn ∈ C and c1 , . . . , cn ∈ F be such
that c1~v1 + · · · + cn~vn = ~0V . Suppose ~vi ∈ Li for some Li ∈ C for all i ∈ {1, . . . , n}. Since C
is a chain, we may suppose that L1 ⊆ . . . ⊆ Ln . Thus,S~v1 , . . . , ~vn are in Ln which is a linearly
S
independent set. This implies c1 = · · · = cn = 0. Hence, C is a linearly independent set, so C
is in L . By Zorn’s lemma—“If a partially ordered set P has the property that every chain (i.e.,
totally ordered subset) has an upper bound in P , then the set P contains at least one maximal
element.”, L contains a maximal element, say B. This is a maximal linearly independent subset
of V . By Theorem 1.4.3 (1), B is a basis for V . Hence, every vector space has a basis. Note that a
basis for F N exists in this way and is not constructible explicitly.
Corollary 1.4.6. If V is a finite-dimensional vector space with dim V = n, then any spanning
set of n elements is a basis of V , and any linearly independent set of n elements is a basis of V .
Consequently, if W is an n-dimensional subspace of V , then W = V .
Theorem 1.4.8. If W1 and W2 are finite dimensional subspaces of a vector space V over a field F,
then W1 + W2 is finite dimensional and
dim(W1 + W2) = dim W1 + dim W2 − dim(W1 ∩ W2).
Definition. Let V and W be vector spaces over a field and T : V → W a linear transformation.
If V is finite dimensional, the rank of T , denoted by rank T , is dim(im T ) and the nullity of T ,
denoted by nullity T , is dim(ker T ).
Theorem 1.4.9. Let V and W be vector spaces over a field F and T : V → W a linear transfor-
mation. If V is finite dimensional, then
rank T + nullity T = dim V.
Corollary 1.4.11. If V is finite dimensional, S and T are linear transformations from V to V , and
T ◦ S is the identity map, then T = S −1 .
From Theorem 1.4.1, we know that the representation of a given vector ~v ∈ V in terms
of a given basis is unique.
Definition. Let V be an n-dimensional vector space over a field F with ordered basis B =
{~v1, . . . , ~vn} and ~v ∈ V. Then ∀~v ∈ V, ∃!(c1, . . . , cn) ∈ F^n, ~v = c1~v1 + · · · + cn~vn. The vector
[~v]_B = \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix} ∈ F^n
is called the coordinate vector of ~v relative to B.
Note that ≅ is an equivalence relation.
Therefore, the theory of finite-dimensional vector spaces can be studied from column vectors
and matrices which we shall pursue in the next chapter.
Exercises for Chapter 1. 1. Let V = R+ be the set of all positive real numbers. Define a vector addition and
a scalar multiplication on V as
v ⊕ w = vw and α ⊙ v = v^α
for all positive real numbers v and w, and α ∈ R. Show that (V, ⊕, ⊙) is a vector space over R.
2. Let V be a vector space over a field F . For c ∈ F and ~v ∈ V , if c~v = ~v , prove that c = 1 or ~v = ~0V .
3. Which of the following are subspaces of M2 (R)?
(a) {A ∈ M2(R) : det A = 0} (b) {A ∈ M2(R) : A = A^T}
(c) {A ∈ M2(R) : A = −A^T} (d) {A ∈ M2(R) : A² = A}
4. Which of the following are subspaces of R^N?
(a) All sequences like (1, 0, 1, 0, . . . ) that include infinitely many zeros.
(b) {(an ) ∈ RN : ∃n0 ∈ N, ∀j ≥ n0 , aj = 0}. (c) All decreasing sequences: aj+1 ≤ aj for all j ∈ N.
(d) All arithmetic sequences: {(an ) ∈ RN : ∃a, d ∈ R, ∀n ∈ N, an = a + (n − 1)d}.
(e) All geometric sequences: {(an) ∈ R^N : ∃a, r ∈ R, ∀n ∈ N, r ≠ 0 ∧ an = ar^{n−1}}.
5. Which of the following are subspaces of V = C 0 [0, 1]?
(a) {f ∈ V : f (0) = 0} (b) {f ∈ V : ∀x ∈ [0, 1], f (x) ≥ 0}
(c) All increasing functions: ∀x, y ∈ [0, 1], x < y ⇒ f (x) ≤ f (y).
6. Let V and W be vector spaces over a field F and T : V → W a linear transformation.
(a) If V1 is a subspace of V , then T (V1 ) = {T (~x) : ~x ∈ V1 } is a subspace of W .
(b) If W1 is a subspace of W , then T −1 (W1 ) = {~x ∈ V : T (~x) ∈ W1 } is a subspace of V .
7. If L, M and N are three subspaces of a vector space V such that M ⊆ L, then show that
L ∩ (M + N ) = (L ∩ M ) + (L ∩ N ) = M + (L ∩ N ).
Also give an example in which the result fails to hold when M ⊈ L. (Hint. Consider Vα of F².)
8. Let S1 and S2 be subsets of a vector space V . Prove that Span(S1 ∪ S2 ) = Span S1 + Span S2 .
9. If ~v1 , ~v2 , ~v3 ∈ V such that ~v1 + ~v2 + ~v3 = ~0, prove that Span{~v1 , ~v2 } = Span{~v2 , ~v3 }.
10. Let S = {~v1 , . . . , ~vn } and c1 , . . . , cn ∈ F r {0}. Prove that:
(a) Span S = Span{c1~v1 , . . . , cn~vn }
(b) S is linearly independent ⇔ {c1~v1 , . . . , cn~vn } is linearly independent.
11. If {~y , ~v1 , . . . , ~vn } is linearly independent, show that {~y + ~v1 , . . . , ~y + ~vn } is also linearly independent.
12. Determine (with reason or counter example) whether the following statements are TRUE or FALSE.
(a) If W1 and W2 are subspaces of V , then W1 ∪ W2 is a subspace of V .
(b) If {~v1 , ~v2 , ~v3 } is a basis of R3 , then {~v1 , ~v1 + ~v2 , ~v1 + ~v2 + ~v3 } is a basis of R3 .
13. Determine whether the following subsets are linearly independent.
(a) {(1, i, −1), (1 + i, 0, 1 − i), (i, −1, −i)} in C3 (b) {x, sin x, cos x} in C 0 (R)
14. Let V be a vector space over a field F . Let ~v1 , ~v2 , . . . , ~vn be vectors in V .
If ~w ∈ Span{~v1, ~v2, . . . , ~vn} \ Span{~v2, . . . , ~vn}, then ~v1 ∈ Span{~w, ~v2, . . . , ~vn} \ Span{~v2, . . . , ~vn}.
15. Prove that if U and V are finite dimensional vector spaces, then dim(U × V ) = dim U + dim V .
16. Find a basis and the dimension of the following subspaces of M2 (R).
(a) {A ∈ M2 (R) : A = AT } (b) {A ∈ M2 (R) : A = −AT }
(c) {A ∈ M2 (R) : ∀B ∈ M2 (R), AB = BA}
17. Let B ∈ M2 (R) and W = {A ∈ M2 (R) : AB = BA}.
Prove that W is a subspace of M2 (R) and dim W ≥ 2.
18. Find a basis for the subspace W = {p(x) ∈ R3 [x] : p(2) = 0} and extend to a basis for R3 [x].
19. Let W1 = Span{(1, 0, 2), (1, −2, 2)} and W2 = Span{(1, 1, 0), (0, 1, −1)} in R3 .
Find dim(W1 ∩ W2 ) and dim(W1 + W2 ).
20. If T : V → W is a linear transformation and B is a basis for V , prove that Span T (B) = im T .
21. Let T : R2 [x] → R3 [x] be given by T (p(x)) = xp(x).
(a) Prove that T is a linear transformation and determine its rank and nullity.
(b) Does T −1 exist? Explain.
22. Suppose that U and V are subspaces of R13 , with dim U = 7 and dim V = 8.
(a) What are the smallest and largest possible dimensions of U ∩ V? Explain.
(b) What are the smallest and largest possible dimensions of U + V? Explain.
23. If V and W are finite-dimensional vector spaces such that dim V > dim W , then there is no one-to-
one linear transformation T : V → W .
24. Let U and W be subspaces of a vector space V. If dim V = 3, dim U = dim W = 2 and U ≠ W, prove
that dim(U ∩ W) = 1.
25. Let U and W be subspaces of a vector space V such that U ∩ W = {~0}.
Assume that ~u1, ~u2 are linearly independent in U and ~w1, ~w2, ~w3 are linearly independent in W.
(a) Prove that {~u1, ~u2, ~w1, ~w2, ~w3} is a linearly independent set in V.
(b) If dim V = 5, show that dim U = 2 and dim W = 3.
2 | Inner Product Spaces
Definition. Let F = R or C and let V be a vector space over F . Let ~u and ~v be vectors in V .
An inner product or scalar product on V is a function from V × V to F , denoted by h·, ·i, with
following properties:
(IN1) ∀~u, ~v, ~w ∈ V, ⟨~u + ~v, ~w⟩ = ⟨~u, ~w⟩ + ⟨~v, ~w⟩.
(IN2) ∀~u, ~v ∈ V, ∀c ∈ F, ⟨c~u, ~v⟩ = c⟨~u, ~v⟩.
(IN3) ∀~u, ~v ∈ V, ⟨~u, ~v⟩ = \overline{⟨~v, ~u⟩}. Here, the bar denotes complex conjugation.
(IN4) ∀~u ∈ V, ⟨~u, ~u⟩ ≥ 0 and [⟨~u, ~u⟩ = 0 ⇒ ~u = ~0].
A vector space over F , in which an inner product is defined, is called an inner product space.
Remarks. 1. For all ~u, ~v ∈ V, ⟨~0, ~u⟩ = 0 = ⟨~u, ~0⟩ and ⟨~u, ~v⟩ = 0 ⇔ ⟨~v, ~u⟩ = 0.
2. If F = R, then (IN3) reads ∀~u, ~v ∈ V, ⟨~u, ~v⟩ = ⟨~v, ~u⟩.
Example 2.1.1. Consider the complex vector space C^n of n-tuples of complex numbers. Let
~u = (u1, u2, . . . , un) and ~v = (v1, v2, . . . , vn). We define
⟨~u, ~v⟩ = u1\overline{v_1} + u2\overline{v_2} + · · · + un\overline{v_n}.
Remark. If we consider, on the other hand, R^n, the space of n-tuples of real numbers, we have a
real-valued scalar product ⟨~u, ~v⟩ = u1v1 + u2v2 + · · · + unvn and the verification of the properties
is exactly like Example 2.1.1, where all conjugation symbols are removed.
Example 2.1.2. Consider V = C^0[a, b], the vector space of real-valued continuous functions de-
fined on the interval [a, b]. Let
⟨f, g⟩ = \int_a^b f(x)g(x)\,dx.
We can add to the list of properties of the scalar product by proving some theorems, assuming
of course that we are dealing with a complex vector space with a scalar product.
⟨c1~u + c2~v, c1~u + c2~v⟩ = c1\bar{c}_1⟨~u, ~u⟩ + c1\bar{c}_2⟨~u, ~v⟩ + \bar{c}_1c2⟨~v, ~u⟩ + c2\bar{c}_2⟨~v, ~v⟩.
In particular, if ⟨~u, ~v⟩ = 0, then
⟨c1~u + c2~v, c1~u + c2~v⟩ = c1\bar{c}_1⟨~u, ~u⟩ + c2\bar{c}_2⟨~v, ~v⟩ = |c1|²⟨~u, ~u⟩ + |c2|²⟨~v, ~v⟩.
The quantity ⟨~u, ~u⟩ is non-negative and is zero if and only if ~u = ~0. Therefore, we associate
with it the square of the length of the vector.
Definition. For ~v ∈ V, we define the length or norm of ~v to be ‖~v‖ = \sqrt{⟨~v, ~v⟩}.
Some of the properties of the norm are given by the next theorem.
Theorem 2.1.3. If V is an inner product space over F , then the norm k · k has the following
properties:
1. ∀~u ∈ V, ‖~u‖ ≥ 0 and ‖~u‖ = 0 ⇔ ~u = ~0
2. ∀~u ∈ V, ∀a ∈ F, ‖a~u‖ = |a|‖~u‖
3. ∀~u, ~v ∈ V, |⟨~u, ~v⟩| ≤ ‖~u‖‖~v‖ (the Cauchy-Schwarz inequality)
4. ∀~u, ~v ∈ V, ‖~u + ~v‖ ≤ ‖~u‖ + ‖~v‖ (the triangle inequality).
Example 2.1.3. Let f be a real-valued continuous function defined on the interval [a, b]. Prove
that
\int_a^b f(x)\,dx ≤ (b − a)M, \quad \text{where } M = \max_{x ∈ [a,b]} |f(x)|.
Definition. Let V be an inner product space over F. Two nonzero vectors ~u and ~v are orthog-
onal if ⟨~u, ~v⟩ = 0. A vector ~u is a unit vector if ‖~u‖ = 1. A set of vectors in which any two
distinct vectors are orthogonal and every vector is a unit vector is an orthonormal set.
Theorem 2.2.3. [Gram-Schmidt Process] Let ~v1, ~v2, . . . , ~vn ∈ V be linearly independent. Then
∀m ∈ {1, . . . , n}, ∃~w1, . . . , ~wm ∈ V such that {~w1, . . . , ~wm} is an orthogonal set and it is a basis
for Span{~v1, . . . , ~vm}.
(2) Span{~w1, . . . , ~wk, ~wk+1} = Span{~v1, . . . , ~vk, ~vk+1}. Again, by the induction hypothesis,
~wk+1 ∈ Span{~w1, . . . , ~wk, ~vk+1} = Span{~v1, . . . , ~vk, ~vk+1}.
Then Span{~w1, . . . , ~wk, ~wk+1} ⊆ Span{~v1, . . . , ~vk, ~vk+1}. For the reverse, we note that
~vk+1 = ~wk+1 + \sum_{i=1}^{k} \frac{⟨~vk+1, ~wi⟩}{‖~wi‖^2}\, ~wi ∈ Span{~w1, . . . , ~wk, ~wk+1}.
Corollary 2.2.4. If V is a finite dimensional inner product space, then V has an orthonormal
basis.
Proof. Let B = {~v1, . . . , ~vm} be a basis for V. Then B is linearly independent. By the Gram-
Schmidt Process, we can construct an orthogonal subset {~w1, . . . , ~wm} of V which is a basis for
Span{~v1, . . . , ~vm} = V. Hence, {~w1, . . . , ~wm} is an orthogonal basis for V, in which we can nor-
malize each vector to obtain an orthonormal basis as desired.
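A direct numpy transcription of the Gram-Schmidt recursion for the real case (the function name and the sample vectors are illustrative, not from the notes):

```python
import numpy as np

def gram_schmidt(vectors):
    """Return an orthogonal list w_1, ..., w_m with the same span as the input."""
    ws = []
    for v in vectors:
        w = v.astype(float)
        for u in ws:
            w = w - (np.dot(v, u) / np.dot(u, u)) * u   # subtract the projection onto u
        ws.append(w)
    return ws

vs = [np.array([1.0, 1.0, 0.0]), np.array([1.0, 0.0, 1.0])]
w1, w2 = gram_schmidt(vs)
assert abs(np.dot(w1, w2)) < 1e-12                       # orthogonality
orthonormal = [w / np.linalg.norm(w) for w in (w1, w2)]  # normalize to get an orthonormal basis
```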
Example 2.2.2. Let H = Span\left\{ \begin{bmatrix} 1 \\ 2i \\ 0 \end{bmatrix}, \begin{bmatrix} 2i \\ 6 \\ -3 \end{bmatrix} \right\} ⊂ C³. Find an orthonormal basis for H.
Example 2.2.3. Let V be the space of continuous functions on [0, 1] and H = Span{1, 3\sqrt{x}, 10x},
a 3-dimensional subspace of V. Use the Gram-Schmidt process to find an orthogonal basis for H.
Definition. Let V be an inner product space over F. For S ⊆ V, the orthogonal complement
of S is the set S^⊥, read “S perp”, defined by
S^⊥ = {~v ∈ V : ⟨~v, ~s⟩ = 0 for all ~s ∈ S}.
Theorem 2.3.3. [Bessel's inequality] Let S = {~v1, . . . , ~vn} be a set of distinct nonzero vectors.
If S is an orthogonal set, then for all ~v ∈ V,
\sum_{i=1}^{n} \frac{|⟨~v, ~vi⟩|^2}{‖~vi‖^2} ≤ ‖~v‖^2.
Theorem 2.3.4. V = W1 ⊕ W2
⇔ every vector ~v ∈ V can be expressed uniquely as ~v = ~w1 + ~w2 with ~w1 ∈ W1 and ~w2 ∈ W2.
10. Prove that the finite sequence a0 , a1 , . . . , an of positive real numbers is a geometric progression if
and only if
(a0 a1 + a1 a2 + · · · + an−1 an )2 = (a20 + a21 + · · · + a2n−1 )(a21 + a22 + · · · + a2n ).
11. Let P(x) be a polynomial with positive real coefficients. Prove that
\sqrt{P(a)P(b)} ≥ P(\sqrt{ab})
for all a, b ≥ 0.
12. Let V be an n-dimensional inner product space and m < n. If {~v1 , . . . , ~vm } is an orthonormal set,
then there exist ~vm+1, . . . , ~vn ∈ V such that {~v1, . . . , ~vn} is an orthonormal basis for V.
13. Prove the following statements.
(a) ∀S1 , S2 ⊆ V, S1 ⊆ S2 ⇒ S1⊥ ⊇ S2⊥ . (b) ∀S ⊆ V, (Span S)⊥ = S ⊥ .
(c) For S ⊆ V, if ~u ∈ S and ~v ∈ S^⊥, then ‖~u + ~v‖² = ‖~u‖² + ‖~v‖².
16. Consider the inner product space C^0[−1, 1]. Suppose that f and g are continuous on [−1, 1] and
‖f − g‖ ≤ 5. Let
u_1(x) = \frac{1}{\sqrt{2}} \quad and \quad u_2(x) = \sqrt{\frac{3}{2}}\, x \quad for x ∈ [−1, 1].
Write
a_j = \int_{-1}^{1} u_j(x)f(x)\,dx \quad and \quad b_j = \int_{-1}^{1} u_j(x)g(x)\,dx
for j = 1, 2. Show that |a1 − b1|² + |a2 − b2|² ≤ 25. (Hint. Use Bessel's inequality.)
17. If V is a finite dimensional inner product space and W is a subspace of V , prove that (W ⊥ )⊥ = W .
18. If {~v1 , ~v2 } is a basis for V , show that V = Span{~v1 } ⊕ Span{~v2 }.
19. Consider the subspace Vα, α ∈ R, of R². Prove that if α ≠ β, then R² = Vα ⊕ Vβ.
20. Let V = R^R be the space of all functions from R to R. Let
Ve = {f ∈ V : ∀x ∈ R, f(−x) = f(x)} and Vo = {f ∈ V : ∀x ∈ R, f(−x) = −f(x)},
the sets of all even and odd functions, respectively. Prove the following statements.
(a) Ve and Vo are subspaces of V . (b) V = Ve ⊕ Vo .
21. Let S be a set of vectors in a finite dimensional inner product space V. Suppose that “⟨~u, ~v⟩ = 0 for
all ~u ∈ S implies ~v = ~0”. Show that V = Span S.
22. Let R^N be the sequence space of real numbers. Let V = {(an) ∈ R^N : only finitely many ai ≠ 0}.
(a) Prove that V is a subspace of RN .
(b) Given (an), (bn) ∈ V, define
⟨(an), (bn)⟩ = \sum_{n=1}^{\infty} a_n b_n.
(Note that this makes sense since only finitely many ai and bi are nonzero.) Show that this defines
an inner product on V.
(c) Let U = \left\{ (an) ∈ V : \sum_{n=1}^{\infty} a_n = 0 \right\}.
Show that U is a subspace of V such that U^⊥ = {~0}, U + U^⊥ ≠ V and U ≠ U^{⊥⊥}.
3 | Matrices
Definition. Consider a system of m linear equations in n unknowns x1, . . . , xn with coefficients over a field F. If all the constant terms are zero, the system is called homogeneous. A homogeneous system
always has a trivial solution, namely the solution obtained by letting all xj = 0. Other nonzero
solutions (if any) are called nontrivial solutions.
Remark. If A is an m × n matrix, then rank A ≤ n and rank A is the maximum number of linearly
independent columns of A by Corollary 1.4.5.
Examples 3.1.1. Consider the following augmented matrices. Write down their general solutions
(if any).
1. \begin{bmatrix} 1 & -3 & 4 & 7 \\ 0 & 1 & 2 & 2 \\ 0 & 0 & 1 & 5 \end{bmatrix}
2. \begin{bmatrix} 1 & -3 & 7 & 0 \\ 0 & 1 & 4 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \quad and \quad \begin{bmatrix} 1 & -3 & 7 & 1 \\ 0 & 1 & 4 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix}
3. \begin{bmatrix} 1 & 0 & 3 & 0 \\ 0 & 1 & -2 & 0 \end{bmatrix} \quad and \quad \begin{bmatrix} 1 & 0 & 3 & 5 \\ 0 & 1 & -2 & 1 \end{bmatrix}
4. \begin{bmatrix} 1 & -4 & -2 & 0 & 3 & -5 \\ 0 & 1 & 0 & 0 & -1 & -1 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}
Every solution ~z of a consistent system A~x = ~b can be written as
~z = ~y + ~yp,
where ~y is a solution of the homogeneous system A~x = ~0m and ~yp is a particular solution, i.e., A~yp = ~b.
Definition. The main part of the algorithms used for solving simultaneous linear systems with
coefficients in F consists of the elementary row operations. They make repeated use of three
operations on the linear system or on its augmented matrix, each of which preserves the set of
solutions because its inverse is an operation of the same kind:
1. (Interchange, Rij ) Interchange the ith row and the jth row.
2. (Scaling, cRi ) Multiply the ith row by a nonzero scalar c.
3. (Replacement, Ri + cRj ) Replace the ith row by the sum of it and a scalar c multiple of
the jth row.
The elementary column operations are defined in a similar way.
Operation            Reverse
R_{ij}               R_{ij}
cR_i, c ≠ 0          (1/c)R_i
R_i + cR_j           R_i − cR_j
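Each row operation can be realized as left-multiplication by a small matrix, and the reverse operation in the table corresponds to the inverse matrix; a short sympy sketch (the 2 × 3 matrix M is an arbitrary illustration):

```python
from sympy import Matrix, Rational

M = Matrix([[1, 2, 3],
            [4, 5, 6]])

R12 = Matrix([[0, 1], [1, 0]])    # interchange rows 1 and 2
S3  = Matrix([[3, 0], [0, 1]])    # scale row 1 by 3
T2  = Matrix([[1, 2], [0, 1]])    # replace row 1 by row 1 + 2*row 2

# Each operation is undone by the reverse operation of the same kind from the table.
assert R12 * (R12 * M) == M                                     # R_12 reversed by R_12
assert Matrix([[Rational(1, 3), 0], [0, 1]]) * (S3 * M) == M    # 3R_1 reversed by (1/3)R_1
assert Matrix([[1, -2], [0, 1]]) * (T2 * M) == M                # R_1 + 2R_2 reversed by R_1 - 2R_2
```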
Definition. Two linear systems are said to be equivalent if they have the same set of solutions.
Theorem 3.2.1. Suppose that a sequence of elementary operations is performed on a linear system.
Then the resulting system has the same set of solutions as the original, so the two linear systems
are equivalent.
Proof. It is clear from the way we do the row reductions that if c1, c2, . . . , cn satisfy the original
system, then they also satisfy the reduced system. Since the elementary row operations are re-
versible, if we start with the reduced system, the original system can be recovered. Hence, it is clear
that any solution of the reduced system is also a solution of the original system.
Definition. A rectangular matrix is in echelon form (or row-echelon form) if it has the fol-
lowing three properties:
1. All nonzero rows are above any rows of all zeros.
2. Each leading entry of a row is in a column to the right of the leading entry of the row
above it.
3. All entries in a column below a leading entry are zero.
If a matrix in echelon form satisfies the following additional conditions, then it is in reduced
echelon form (or reduced row-echelon form):
4. The leading entry in each nonzero row is 1, called the leading 1.
5. Each leading 1 is the only nonzero entry in its column.
An echelon matrix (respectively, reduced echelon matrix) is one that is in echelon form (re-
spectively, reduced echelon form).
Theorem 3.2.2. Every matrix can be brought to a reduced echelon matrix by a finite sequence of
elementary row operations.
Definition. Let A be an n × n matrix. We say that A is invertible or nonsingular and has the
n × n matrix B as inverse if AB = BA = In .
Theorem 3.2.3. Suppose A and B are invertible matrices of the same size. Then the following
results hold:
(a) A−1 is invertible and (A−1 )−1 = A, i.e., A is the inverse of A−1 .
(b) AB is invertible and (AB)−1 = B −1 A−1 .
(c) AT is invertible and (AT )−1 = (A−1 )T .
Corollary 3.2.5. If A and C are square matrices such that AC = I, then also CA = I. In
particular, A and C are invertible, C = A−1 and A = C −1 .
Remark. Elementary matrices are invertible because row operations are reversible. To find the in-
verse of an elementary matrix E, determine the elementary row operation needed to transform E
back into I and apply this operation to I to obtain the inverse.
Example 3.2.2. Find the inverses of the elementary matrices given in Example 3.2.1
Theorem 3.2.10. A square matrix is invertible if and only if it is a product of elementary matrices.
Remark. From the above theorem, we obtain an algorithm to find A−1 if A is invertible. Namely,
we start with the block matrix [A : I] and row reduce it until we reach the final reduced echelon
form [I : U ] (because A is row equivalent to I by Theorem 3.2.4). Then we have U = A−1 .
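The [A : I] → [I : A^{-1}] procedure can be traced with sympy's rref applied to the augmented block matrix; a sketch (the matrix A below is an arbitrary invertible illustration):

```python
from sympy import Matrix, eye

A = Matrix([[2, 1],
            [5, 3]])                      # det = 1, so A is invertible

augmented = A.row_join(eye(2))            # the block matrix [A : I]
R, _ = augmented.rref()                   # reduced echelon form [I : U]
U = R[:, 2:]                              # right-hand block

assert R[:, :2] == eye(2)                 # left block reduces to the identity
assert A * U == eye(2) and U == A.inv()   # hence U = A^{-1}
```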
Example 3.2.4. Express A = \begin{bmatrix} -2 & 3 \\ 1 & 0 \end{bmatrix} as a product of elementary matrices.
Lemma 3.3.1. Let V be a vector space over a field F . Let ~v1 , . . . , ~vn be in V .
1. Span{~v1, . . . , ~vn} = Span{~v1, . . . , c~vi, . . . , ~vn} for all i ∈ {1, . . . , n} and c ∈ F nonzero.
2. Span{~v1 , . . . , ~vn } = Span{~v1 , . . . , ~vi + c~vj , . . . , ~vj , . . . , ~vn } for all i 6= j and c ∈ F .
Corollary 3.3.6. Let A, B, U and V be matrices of sizes for which the indicated products are
defined.
1. Col(AV ) ⊆ Col A, with equality if V is (square and) invertible.
2. Row(U A) ⊆ Row A, with equality if U is (square and) invertible.
3. rank AB ≤ rank A and rank AB ≤ rank B.
Let A be an m × n matrix of rank r, and let R be the reduced row-echelon form of A. Theorem
3.2.9 shows that R = UA where U is invertible, and that U can be found by [A : Im] → [R : U].
The matrix R has r leading ones (since rank A = r) so, as R is reduced, the n × m matrix R^T
contains each row of Ir in the first r columns. Thus, row operations will carry
R^T → \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}_{n \times m}.
Hence, Theorem 3.2.9 (again) shows that \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}_{n \times m} = U_1 R^T where U_1 is an n × n invertible
matrix. Writing V = U_1^T, we obtain
UAV = RV = R U_1^T = (U_1 R^T)^T = \left( \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}_{n \times m} \right)^T = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}_{m \times n}.
Moreover, the matrix U_1^T = V can be computed by [R^T : I_n] → \left[ \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}_{n \times m} : V^T \right]. This proves
Theorem 3.3.7. Let A be an m × n matrix of rank r. There exist invertible matrices U and V of
size m × m and n × n, respectively, such that
UAV = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}_{m \times n},
Proof. Observe first that UR = S for some invertible matrix U (by Theorem 3.2.9 there exist
invertible matrices P and Q such that R = PA and S = QA; take U = QP^{-1}). We show that
R = S by induction on the number m of rows of A. The case m = 1 is trivial because we can
perform only scaling. If ~rj and ~sj denote the jth column of R and of S, respectively, the fact that
UR = S gives
U~rj = ~sj for each j. (3.3.1)
Since U is invertible, this shows that R and S have the same zero columns. Hence, by passing to
the matrices obtained by deleting the zero columns from R and S, we may assume that R and S
have no zero columns.
But then the first column of R and S is the first column of Im because they are reduced row-
echelon, so (3.3.1) forces the first column of U to be the first column of Im. Now, write U, R and
S in block form as follows:
U = \begin{bmatrix} 1 & X \\ 0 & V \end{bmatrix}, \quad R = \begin{bmatrix} 1 & Y \\ 0 & R' \end{bmatrix} \quad and \quad S = \begin{bmatrix} 1 & Z \\ 0 & S' \end{bmatrix},
where R' and S' are reduced row-echelon matrices with m − 1 rows. Then block multiplication in
UR = S gives VR' = S', so the induction hypothesis yields R' = S'; since R and S are both reduced, the first rows must then agree as well. That is, S = R. This
completes the proof.
The set of all such permutations is denoted by Sn , and the number of such permutations is n!.
Example 3.4.1. S2 = {12, 21} and S3 = {123, 132, 213, 231, 312, 321}.
We say that σ is an even permutation ⇔ |Iσ | is even, and an odd permutation ⇔ |Iσ | is odd.
We then define the sign or parity of σ, written sgn σ, by
sgn σ = \begin{cases} 1 & \text{if } σ \text{ is even,} \\ -1 & \text{if } σ \text{ is odd.} \end{cases}
Then
σ(g) = \begin{cases} g & \text{if } σ \text{ is even,} \\ -g & \text{if } σ \text{ is odd.} \end{cases}
That is, σ(g) = (sgn σ)g.
Thus, the product of two even or two odd permutations is even, and the product of an odd and an
even permutation is odd.
Definition. The determinant of A = [aij], denoted by det A or |A|, is the sum of all the above
n! products where each such product is multiplied by sgn σ. That is,
|A| = \sum_{σ ∈ S_n} (\operatorname{sgn} σ)\, a_{1j_1} a_{2j_2} \cdots a_{nj_n} = \sum_{σ ∈ S_n} (\operatorname{sgn} σ)\, a_{1σ(1)} a_{2σ(2)} \cdots a_{nσ(n)}.
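For small n the permutation formula can be evaluated directly; a Python sketch using itertools, with the sign computed by counting inversions (the 3 × 3 matrix is an arbitrary illustration):

```python
from itertools import permutations
from math import prod

def sgn(sigma):
    # sign of a permutation via the number of inversions |I_sigma|
    inversions = sum(1 for i in range(len(sigma)) for j in range(i + 1, len(sigma))
                     if sigma[i] > sigma[j])
    return 1 if inversions % 2 == 0 else -1

def det(A):
    n = len(A)
    return sum(sgn(s) * prod(A[i][s[i]] for i in range(n))
               for s in permutations(range(n)))

A = [[1, 2, 3],
     [0, 4, 5],
     [1, 0, 6]]
assert det(A) == 22   # agrees with cofactor expansion along the first row
```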
Theorem 3.4.4. The determinant of a matrix A and its transpose are equal. That is, |A| = |AT |.
Remark. By this theorem, any theorem about the determinant of a matrix A that concerns the
rows of A will have an analogous theorem concerning the columns of A.
Lemma 3.4.5. For k < l, τ = \begin{pmatrix} 1 & 2 & \cdots & k & \cdots & l & \cdots & n \\ 1 & 2 & \cdots & l & \cdots & k & \cdots & n \end{pmatrix} is an odd permutation in S_n.
Proof. Assume that the kth and lth rows of A are identical, with k < l.
That is, a_{kj} = a_{lj} for all j ∈ {1, . . . , n}.
In particular, for any σ ∈ S_n, a_{kσ(l)} = a_{lσ(l)} and a_{kσ(k)} = a_{lσ(k)}.
Let τ = \begin{pmatrix} 1 & 2 & \cdots & k & \cdots & l & \cdots & n \\ 1 & 2 & \cdots & l & \cdots & k & \cdots & n \end{pmatrix}.
Then sgn τ = −1 and σ(τ(j)) = σ(j) for all j ∈ {1, . . . , n} \ {k, l}. Also,
sgn(στ ) = (sgn σ)(sgn τ ) = − sgn σ.
As σ runs through all even permutations, στ runs through all odd permutations, and vice versa.
Thus
|A| = \sum_{σ ∈ S_n} (\operatorname{sgn} σ)\, a_{1σ(1)} a_{2σ(2)} \cdots a_{kσ(k)} \cdots a_{lσ(l)} \cdots a_{nσ(n)}
= \sum_{σ \text{ even}} \big[ (\operatorname{sgn} σ)\, a_{1σ(1)} \cdots a_{kσ(k)} \cdots a_{lσ(l)} \cdots a_{nσ(n)} + (\operatorname{sgn}(στ))\, a_{1στ(1)} \cdots a_{kστ(k)} \cdots a_{lστ(l)} \cdots a_{nστ(n)} \big]
= \sum_{σ \text{ even}} \big[ (\operatorname{sgn} σ)\, a_{1σ(1)} \cdots a_{kσ(k)} \cdots a_{lσ(l)} \cdots a_{nσ(n)} - (\operatorname{sgn} σ)\, a_{1σ(1)} \cdots a_{kσ(l)} \cdots a_{lσ(k)} \cdots a_{nσ(n)} \big]
= \sum_{σ \text{ even}} \big[ (\operatorname{sgn} σ)\, a_{1σ(1)} \cdots a_{kσ(k)} \cdots a_{lσ(l)} \cdots a_{nσ(n)} - (\operatorname{sgn} σ)\, a_{1σ(1)} \cdots a_{lσ(l)} \cdots a_{kσ(k)} \cdots a_{nσ(n)} \big]
= 0.
Lemma 3.4.12. Let E be an elementary matrix. Then |EA| = |E||A| for any matrix A. In
particular, if E1, E2, . . . , Es are elementary matrices, then
|E1 E2 \cdots E_s A| = |E1||E2| \cdots |E_s||A|.
Theorem 3.4.14. The determinant of a product of two matrices A and B is the product of their
determinants; that is |AB| = |A||B|.
Definition. Consider an n-square matrix A = [aij ]. Let Mij (A) denote the (n − 1)-square
submatrix of A obtained by deleting its ith row and jth column. The determinant |Mij (A)| is
called the minor of the element aij of A, and we define the cofactor of aij , denoted by Cij (A),
to be the “signed” minor:
Cij (A) = (−1)i+j |Mij (A)|.
Recall that
|A| = \sum_{σ ∈ S_n} (\operatorname{sgn} σ)\, a_{1σ(1)} a_{2σ(2)} \cdots a_{nσ(n)}
= aij Cij (A) + (terms which do not contain aij as a factor).
Lemma 3.4.15. Cij (A) = Cij (A) for all i, j ∈ {1, . . . , n}.
Theorem 3.4.16. [Laplace] The determinant of a square matrix A = [aij ] is equal to the sum of
the products obtained by multiplying the elements of any row (column) by their respective cofactors:
|A| = a_{i1}C_{i1}(A) + a_{i2}C_{i2}(A) + \cdots + a_{in}C_{in}(A) = \sum_{j=1}^{n} a_{ij}C_{ij}(A)
and
|A| = a_{1j}C_{1j}(A) + a_{2j}C_{2j}(A) + \cdots + a_{nj}C_{nj}(A) = \sum_{i=1}^{n} a_{ij}C_{ij}(A)
for all i, j ∈ {1, 2, . . . , n}.
Remark. The above formulas for |A| are called the Laplace expansions of the determinant of A
by the ith row and the jth column, respectively. Together with the elementary row operations, they offer a
method of simplifying the computation of |A|.
Next we proceed to prove the lemma.
Thus,
C_{nn}(B) = \sum_{\substack{σ ∈ S_n \\ σ(n) = n}} (\operatorname{sgn} σ)\, b_{1σ(1)} b_{2σ(2)} \cdots b_{n-1,σ(n-1)}
= \sum_{τ ∈ S_{n-1}} (\operatorname{sgn} τ)\, b_{1τ(1)} b_{2τ(2)} \cdots b_{n-1,τ(n-1)}.
Write
A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1j} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2j} & \cdots & a_{2n} \\
\vdots & & & \vdots & & \vdots \\
a_{i1} & a_{i2} & \cdots & a_{ij} & \cdots & a_{in} \\
\vdots & & & \vdots & & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nj} & \cdots & a_{nn}
\end{bmatrix}.
To compute C_{ij}(A), we transform A into A' by interchanging rows n − i times and columns n − j
times as shown:
A' = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1,j-1} & a_{1,j+1} & \cdots & a_{1n} & a_{1j} \\
a_{21} & a_{22} & \cdots & a_{2,j-1} & a_{2,j+1} & \cdots & a_{2n} & a_{2j} \\
\vdots & & & & & & & \vdots \\
a_{i-1,1} & a_{i-1,2} & \cdots & a_{i-1,j-1} & a_{i-1,j+1} & \cdots & a_{i-1,n} & a_{i-1,j} \\
a_{i+1,1} & a_{i+1,2} & \cdots & a_{i+1,j-1} & a_{i+1,j+1} & \cdots & a_{i+1,n} & a_{i+1,j} \\
\vdots & & & & & & & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{n,j-1} & a_{n,j+1} & \cdots & a_{nn} & a_{nj} \\
a_{i1} & a_{i2} & \cdots & a_{i,j-1} & a_{i,j+1} & \cdots & a_{in} & a_{ij}
\end{bmatrix}.
Hence,
|A'| = (-1)^{(n-i)+(n-j)}|A| = (-1)^{-i-j}|A|.
That is, |A| = (-1)^{i+j}|A'|.
Therefore, the coefficient of a_{ij} in |A| is (-1)^{i+j}C_{nn}(A') = (-1)^{i+j}|M_{ij}(A)| = C_{ij}(A).
Definition. Let A = [aij ] be an n × n matrix and let Cij (A) denote the cofactor of aij . The
classical adjoint of A, denoted by adj A, is the transpose of the matrix of the cofactors of A,
namely,
adj A = [Cij (A)]T .
We say “classical adjoint” here instead of simply “adjoint” because the term “adjoint” will be
used for an entirely different concept.
For any n × n matrix A and any ~b ∈ F^n, let A_j(~b) be the matrix obtained from A by replacing
the jth column by the vector ~b, that is,
A_j(~b) = \begin{bmatrix} ~a_1 & \cdots & ~b & \cdots & ~a_n \end{bmatrix} \quad (~b \text{ in the } j\text{th column})
for all j = 1, 2, . . . , n.
Theorem 3.4.18. [Cramer's rule] Let A be an invertible n × n matrix. For any ~b ∈ F^n, the unique
solution ~x of A~x = ~b has entries given by
x_j = \frac{|A_j(~b)|}{|A|}, \quad j = 1, 2, . . . , n.
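Cramer's rule is easy to transcribe symbolically; a sympy sketch for a 2 × 2 system (the matrix A, the vector b and the helper name A_j are arbitrary illustrations):

```python
from sympy import Matrix

A = Matrix([[2, 1],
            [5, 3]])
b = Matrix([3, 7])

def A_j(A, b, j):
    """A with its j-th column replaced by b (0-indexed)."""
    M = A.copy()
    M[:, j] = b
    return M

# x_j = |A_j(b)| / |A|
x = Matrix([A_j(A, b, j).det() / A.det() for j in range(A.cols)])
assert A * x == b
```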
Exercises for Chapter 3. 1. The following matrices are echelon forms of coefficient matrices of linear
systems. Which has a unique solution? Why?
(a) \begin{bmatrix} 1 & 2 & 3 & 4 \\ 0 & 1 & 2 & 3 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 1 \end{bmatrix}
(b) \begin{bmatrix} 1 & 2 & 3 & 4 \\ 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}
2. Find the general solution to the linear system
x1 + 2x2 + x3 − 2x4 = 5
2x1 + 4x2 + x3 + x4 = 9
3x1 + 6x2 + 2x3 − x4 = 14
9. Find the number c so that (if possible) the rank of A is (a) 1 (b) 2 (c) 3, where
A = \begin{bmatrix} 6 & 4 & 2 \\ -3 & -2 & -1 \\ 9 & 6 & c \end{bmatrix}.
10. Suppose A = \begin{bmatrix} 1 & 2 & 1 & b \\ 2 & a & 1 & 8 \\ * & * & * & * \end{bmatrix} has the reduced echelon form R = \begin{bmatrix} 1 & 2 & 0 & 3 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 0 \end{bmatrix}.
(a) Find a and b. (b) Solve A~x = ~0.
11. Let A be an m × n matrix for which
A~x = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} has no solutions and A~x = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} has a unique solution.
(a) Give all possible information about m and n and the rank of A.
(b) Find all solutions of A~x = ~0 and explain your answer.
12. Let A be a 3 × 4 matrix for which
A~x = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} has no solutions and A~x = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} has more than one solution.
−1 −1 . . . n 1 1 ... c
Compute A−1 .
18. (a) For which values of the parameter c is A = \begin{bmatrix} -2 & 1 & c \\ 0 & -1 & 1 \\ 1 & 2 & 0 \end{bmatrix} invertible?
(b) For which values of e is the matrix A = \begin{bmatrix} 5 & e & e \\ e & e & e \\ 1 & 2 & e \end{bmatrix} not invertible?
19. Let A = \begin{bmatrix} a & b & b \\ a & a & b \\ a & a & a \end{bmatrix}. If a ≠ 0 and a ≠ b, prove that A is invertible and find A^{-1} in terms of a and b.
20. Show that if A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ a & b & c \end{bmatrix} is an elementary matrix, then at least one entry in the third row must
be zero.
21. In each case find an elementary matrix E such that B = EA.
(a) A = \begin{bmatrix} 2 & 1 \\ 3 & -1 \end{bmatrix}, B = \begin{bmatrix} 3 & -1 \\ 2 & 1 \end{bmatrix}
(b) A = \begin{bmatrix} 2 & 1 \\ 3 & -1 \end{bmatrix}, B = \begin{bmatrix} -1 & -3 \\ 3 & -1 \end{bmatrix}
22. In each case find an invertible matrix U such that UA = B, and express U as a product of elementary
matrices.
(a) A = \begin{bmatrix} 2 & 1 & 3 \\ -1 & 1 & 2 \end{bmatrix}, B = \begin{bmatrix} 1 & -1 & -2 \\ 3 & 0 & 1 \end{bmatrix}
(b) A = \begin{bmatrix} 2 & -1 & 0 \\ 1 & 1 & 1 \end{bmatrix}, B = \begin{bmatrix} 3 & 0 & 1 \\ 2 & -1 & 0 \end{bmatrix}
23. In each case find invertible matrices U and V such that UAV is in the Smith normal form.
(a) A = \begin{bmatrix} 1 & 1 \\ -2 & -2 \end{bmatrix}
(b) A = \begin{bmatrix} -1 & 3 \\ 4 & 2 \end{bmatrix}
(c) A = \begin{bmatrix} 2 & 1 & -2 \\ 1 & -2 & 4 \end{bmatrix}
(d) A = \begin{bmatrix} 1 & -1 & 2 & 1 \\ 2 & -1 & 0 & 3 \\ 0 & 1 & -4 & 1 \end{bmatrix}
24. Let F be a field and A = [aij] ∈ Mn(F). Define the trace of A to be the sum of the diagonal elements,
that is,
tr A = \sum_{i=1}^{n} a_{ii}.
0 0 0 4
27. If A is an n × n matrix such that A2 = A and rank A = n, prove that A = In .
A \xrightarrow{R_1 + 3R_2} A_1 \xrightarrow{R_{23}} A_2 \xrightarrow{3R_2 - R_1} A_3 \xrightarrow{R_1 - 3R_2} A_4 \xrightarrow{2R_1} A_5.
Determine the values of |A1|, |A2|, |A3|, |A4| and |A5|, respectively.
38. If A is an invertible square matrix of order n > 1, show that det(adj A) = (det A)n−1 .
What is det(adj A) if A is not invertible? Prove your answer.
39. Let A, B, C be 3 × 3 matrices with det A = 3, det(B³) = −8, det C = 2. Compute
(a) det(ABC) (b) det(5AC^T) (c) det(A³B^{-3}C^{-1}) (d) det[B^{-1}(adj C)].
40. Show that adj(A^T) = (adj A)^T.
41. Show that if A is invertible and n > 2, then adj (adj A) = (det A)n−2 A.
42. If A and B are invertible, show that adj(AB) = (adj B)(adj A).
43. Prove that if A is an invertible upper triangular matrix (all entries lying below the diagonal are zero),
then adj A and A−1 are upper triangular.
44. Suppose that the real-valued functions f1(x), f2(x), . . . , fk(x) are all defined and are differentiable
k − 1 times on the interval [a, b]. The Wronskian of the set of functions is defined on this interval to
be the determinant
W(x) = \begin{vmatrix}
f_1(x) & f_2(x) & \cdots & f_k(x) \\
f_1'(x) & f_2'(x) & \cdots & f_k'(x) \\
f_1''(x) & f_2''(x) & \cdots & f_k''(x) \\
\vdots & \vdots & & \vdots \\
f_1^{(k-1)}(x) & f_2^{(k-1)}(x) & \cdots & f_k^{(k-1)}(x)
\end{vmatrix}.
Prove that a set of real-valued functions {f1(x), f2(x), . . . , fk(x)}, differentiable k − 1 times on the
interval [a, b], is linearly independent if W(x0) ≠ 0 at some point x0 in the interval.
45. Consider the interval [−1, 1] and the two functions defined by
f(x) = \begin{cases} 0 & \text{if } -1 \le x \le 0, \\ x^2 & \text{if } 0 \le x \le 1, \end{cases} \qquad g(x) = \begin{cases} x^2 & \text{if } -1 \le x \le 0, \\ 0 & \text{if } 0 \le x \le 1. \end{cases}
These functions are both differentiable. Show that f and g are linearly independent but W (x) = 0
for all x ∈ [−1, 1]. This provides an example to prove that the converse of the previous problem does
not hold.
46. (a) Show that the functions 1, x, x2 , . . . , xk are linearly independent in the function space C 0 [0, 1].
(b) Show that the functions sin x, sin 2x, sin 3x, . . . , sin kx are linearly independent in the function
space C 0 [0, 2π]. (Hint. Use the Wronskian.)
V_2 = \begin{vmatrix} 1 & x_1 \\ 1 & x_2 \end{vmatrix} = x_2 - x_1 \quad and \quad V_3 = \begin{vmatrix} 1 & x_1 & x_1^2 \\ 1 & x_2 & x_2^2 \\ 1 & x_3 & x_3^2 \end{vmatrix} = (x_2 - x_1)(x_3 - x_1)(x_3 - x_2).
In general,
V_n = \begin{vmatrix} 1 & x_1 & \cdots & x_1^{n-1} \\ 1 & x_2 & \cdots & x_2^{n-1} \\ \vdots & \vdots & & \vdots \\ 1 & x_n & \cdots & x_n^{n-1} \end{vmatrix} = \prod_{i<j} (x_j - x_i).
This determinant is called the Vandermonde determinant. (Hint. To do the induction easily, multi-
ply each column by x1 and subtract it from the next column on the right, starting from the right-hand
side. We shall find that V_n = (x_n - x_1) \cdots (x_2 - x_1)V_{n-1}.)
4 | Linear Transformations
Definition. Let V and W be two vector spaces over F. We write L(V, W) for the set of all linear
transformations from V to W, that is,
L(V, W) = {T : V → W | T is a linear transformation}.
Then L(V, W) is a vector space over F with the operations defined, for S, T ∈ L(V, W), by
(S + T)(~v) = S(~v) + T(~v) and (cT)(~v) = c T(~v)
for all ~v ∈ V and c ∈ F. Note that the zero function is its zero vector and (−T)(~v) = −T(~v) for
all ~v ∈ V.
Remark. By Theorem 1.4.1, for a given basis B = {~v1, ~v2, . . . , ~vn} for an n-dimensional vector
space V, there exists a unique linear transformation T : V → W such that T(~vi) = ~wi ∈ W for
all i ∈ {1, 2, . . . , n}. Then for S, T ∈ L(V, W), (S(~vi) = T(~vi) for all i ∈ {1, 2, . . . , n}) ⇒ S = T.
Hence, to show that two linear transformations are identical, it suffices to check the equality on some
basis of V.
Theorem 4.1.1. Let B = {~v1, . . . , ~vn} be a basis for V and let C = {~w1, . . . , ~wm} be a basis for W.
For each i ∈ {1, . . . , n} and j ∈ {1, . . . , m}, we define
T_{ij}(~v_k) = \begin{cases} ~w_j & \text{if } i = k, \\ ~0_W & \text{if } i ≠ k, \end{cases}
for all k ∈ {1, . . . , n}. By Theorem 1.4.1, T_{ij} ∈ L(V, W) for all i, j. Then
{T_{ij} : i ∈ {1, . . . , n}, j ∈ {1, . . . , m}}
is a basis for L(V, W). Hence, if dim V = n and dim W = m, then dim L(V, W) = mn.
Definition. The space V^* = L(V, F) is called the dual space of V and V^{**} = (V^*)^* is called the double dual of V.
Remarks. 1. For f ∈ V^*,
(a) f ≠ 0 ⇒ im f = F
(b) if V is finite dimensional and f ≠ 0, then nullity f = (dim V) − 1.
2. For ~v ∈ V , if f (~v ) = 0 for all f ∈ V ∗ , then ~v = ~0.
Theorem 4.1.3. 1. The map θ : ~v 7→ L~v is a 1-1 linear transformation from V into V ∗∗ .
2. If V is finite-dimensional, then
(a) the map θ : ~v 7→ L~v is an isomorphism of V onto V ∗∗
(b) ∀L ∈ V ∗∗ , ∃!~v ∈ V, L = L~v .
Corollary 4.1.4. If V is finite dimensional, then each basis of V ∗ is the dual of some basis of V .
Example 4.1.2. Consider V = R2[x], the vector space of all polynomials of degree at most 2
over R. Let t1, t2, t3 be three distinct real numbers and let fi(p(x)) = p(ti) for all p(x) ∈ R2[x] and
i = 1, 2, 3.
Show that {f1 , f2 , f3 } is a basis of V ∗ and find a basis of V such that {f1 , f2 , f3 } is its dual
basis.
1. The map π : V → V/W given by π : ~v 7→ ~v + W for all ~v ∈ V is an onto linear transformation.
Its kernel is equal to W. This map π is called the canonical projection from V onto V/W.
2. If V is a finite dimensional vector space and W is a subspace of V , then V /W is finite
dimensional and dim(V /W ) = dim V − dim W .
Theorem 4.2.2. [Isomorphism Theorem] Let V and W be two vector spaces over a field F and
T : V → W a linear transformation. Then
V/(ker T) ≅ im T.
Definition. Let V be an n-dimensional vector space over a field F with an ordered basis B =
{~v1, ~v2, . . . , ~vn} and ~v ∈ V. Then ∀~v ∈ V, ∃!(c1, . . . , cn) ∈ F^n,
~v = c1~v1 + c2~v2 + · · · + cn~vn \quad and \quad [~v]_B = \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix} ∈ F^n,
called the coordinate vector of ~v relative to B.
Example 4.3.1. Let B = {(1, 1, 0, 0), (1, 0, 1, 0), (1, 1, 1, 0), (0, 0, 0, 2i)} be an ordered basis for C4 .
Find [(2, −16, 3, −i)]B .
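Coordinates relative to B are found by solving a linear system whose columns are the basis vectors; a sympy sketch for Example 4.3.1 (the point is the computation, which can be checked by hand):

```python
from sympy import Matrix, I, Rational

# Columns of P are the vectors of the ordered basis B from Example 4.3.1.
P = Matrix([[1, 1, 1, 0],
            [1, 0, 1, 0],
            [0, 1, 1, 0],
            [0, 0, 0, 2*I]])
v = Matrix([2, -16, 3, -I])

coords = P.solve(v)       # [v]_B solves P [v]_B = v
assert coords == Matrix([-1, 18, -15, Rational(-1, 2)])
```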
We recall Theorems 1.4.12 and 1.4.13 as follows.
Theorem 4.3.1. Let V be an n-dimensional vector space over F and B a basis for V .
~ ∈ V and c ∈ F , we have [~v + w]
1. For ~v , w ~ B = [~v ]B + [w]
~ B and [c~v ]B = c[~v ]B .
2. The map ~v 7→ [~v ]B is an isomorphism from V onto F n .
This also implies ∀~u, ~v ∈ V, [~u]B = [~v ]B ⇔ ~u = ~v .
Definition. The matrix [T]^C_B is called the matrix for T relative to the ordered bases B and
C. If V = W and B = C, then we write [T]_B for [T]^B_B. In addition, if T : F^n → F^n is a linear
transformation and B is the standard basis for F^n, we call [T]_B the standard matrix for T.
The situation can be pictured as follows: T sends ~v to T(~v), while in coordinates
[~v]_B \longmapsto [T(~v)]_C = [T]^C_B\,[~v]_B.
Theorem 4.3.3. Let V , W and Z be finite-dimensional vector spaces over a field F and let B, C
and D be ordered bases of V , W and Z, respectively.
If S : V → W and T : W → Z are linear transformations, then
[T ◦ S]^D_B = [T]^D_C\,[S]^C_B.
Example 4.3.3. Define T : R2 [x] → R3 by T (a + bx + cx2 ) = (a − 2b, 3c − 2a, 3c − 4b) for all
a, b, c ∈ R. Compute rank T .
Definition. Let V be an n-dimensional vector space over a field F with an ordered basis B =
{~v1, . . . , ~vn}. If B′ = {~v1′, . . . , ~vn′} is another ordered basis for V, we define the transition or
change of coordinate matrix from B′ to B by P_{B→B′} = [I]^B_{B′}.
Theorem 4.4.2. Let B and B′ be two bases for a finite dimensional vector space V. If T : V → V
is a linear operator, then
[T]_{B′} = [I]^{B′}_B\,[T]_B\,[I]^B_{B′} = (P_{B→B′})^{-1}\,[T]_B\,(P_{B→B′}).
Exercises for Chapter 4. 1. If T : V → W is an isomorphism and B is a basis for V , prove that T (B)
is a basis for W .
2. Let T : V → V be a linear transformation. Suppose that there exists a ~v ∈ V such that T(T(~v)) ≠ ~0
and T (T (T (~v ))) = ~0. Prove that {~v , T (~v ), T (T (~v ))} is linearly independent.
3. Let S, T ∈ L(V, W ) and c ∈ F . Prove that:
(a) ker S ∩ ker T ⊆ ker(S + cT ) (b) im(S + T ) ⊆ im S + im T .
4. Let E be a linear transformation on a vector space V such that E ◦ E = E.
Prove that the following statements hold.
(a) ∀~v ∈ V, ~v ∈ im E ⇔ E(~v ) = ~v (b) ∀~v ∈ V, ~v − E(~v ) ∈ ker E (c) V = ker E ⊕ im E.
5. Let f, g ∈ V ∗ . If ker f ⊆ ker g, prove that g = cf for some c ∈ F .
6. Let V be an n-dimensional vector space over F .
If f, g ∈ V ∗ are linearly independent, find dim(ker f ∩ ker g).
7. If V and W are finite dimensional vector spaces which are isomorphic, prove that V ∗ ∼ = W ∗.
8. Let B = {(1, 0, −1), (1, 1, 1), (2, 2, 0)} be a basis for R³. Find the dual basis of B.
Clearly, f1 , f2 ∈ V ∗ . Prove that {f1 , f2 } is a basis for V ∗ and find a basis of V such that {f1 , f2 } is its
dual basis.
10. (a) Let W be a subspace of a finite dimensional vector space V .
If B = {x1 , . . . , xm } is a basis for W and {x1 , . . . , xm , xm+1 , . . . , xn } is a basis of V ,
show that {xm+1 + W, . . . , xn + W } is a basis for V /W .
(b) Let H = Span{(1, 1, −1)}. Determine a basis for R3 /H.
11. Let W1 and W2 be two subspaces of a vector space V .
Define T : W1 + W2 → W2 /(W1 ∩ W2 ) by T (w ~1 +w ~ 2) = w~ 2 + (W1 ∩ W2 ) for all w ~ 1 ∈ W1 and w
~ 2 ∈ W2 .
(a) Prove that T is well defined and is an onto linear transformation.
(b) Prove that ker T = W1 .
(c) Conclude by Theorem 4.2.2 that (W1 + W2 )/W1 ∼ = W2 /(W1 ∩ W2 ).
This is a generalization of Theorem 1.4.8.
12. If W1 and W2 are subspaces of V with W1 ⊆ W2 .
Define T : V /W1 → V /W2 by T (~v + W1 ) = ~v + W2 for all ~v ∈ V .
(a) Prove that T is well defined and is an onto linear transformation.
(b) Prove that ker T = W2 /W1 .
(c) Conclude by Theorem 4.2.2 that (V /W1 )/(W2 /W1 ) ∼ = V /W2 .
13. Let U , V and W be finite dimensional vector spaces over a field F . Let S : U → V and T : V → W
be linear transformations such that T ◦ S is the zero map. Show that rank S + rank T ≤ dim V.
14. Let V and W be finite dimensional vector spaces over a field F . Let U be a subspace of V and
T : V → W a linear transformation.
(a) Prove that dim(V /U ) ≥ dim(T (V )/T (U )).
(b) If T is 1-1, prove also that the inequality in (a) becomes an equality.
15. For S ⊆ V , let A(S) = {f ∈ V ∗ : f (~v ) = 0 for all ~v ∈ S}. It is called the annihilator of S.
Prove that
(a) A(S) is a subspace of V ∗ (b) If S1 ⊆ S2 , then A(S1 ) ⊇ A(S2 )
(c) If V is finite dimensional and W is a subspace of V , then V ∗ /A(W ) ∼ = W ∗.
16. Prove that ∀S, T ∈ L(V, V ), S ◦ T ∈ L(V, V ).
17. Let T : V → W be a linear transformation where dim V = dim W = n.
Prove that the following statements are equivalent.
(i) T is an isomorphism.
(ii) [T]^C_B is invertible for all ordered bases B and C of V and W, respectively.
(iii) [T]^C_B is invertible for some pair of ordered bases B and C of V and W, respectively.
18. Suppose the linear transformation T : R2 → R2 is given by
where p′ (x) is the derivative of p(x). Show that T is an isomorphism by finding [T ]B where B =
{1, x, x2 , . . . , xn }.
TA (~x) = A~x
Eλ (A) = {~x ∈ F n : A~x = λ~x} = {~x ∈ F n : (A − λIn )~x = ~0n } = Nul(A − λIn ).
Then
λ is an eigenvalue of A ⇔ ker(T_A − λI) ≠ {~0_n}
⇔ Nul(A − λI_n) ≠ {~0_n}
⇔ A − λI_n is not invertible
⇔ det(A − λI_n) = 0.
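The last equivalence is exactly how eigenvalues are computed in practice: solve det(A − λI) = 0. A sympy sketch (the 2 × 2 matrix is an arbitrary illustration):

```python
from sympy import Matrix, symbols, solve, eye

lam = symbols('lambda')
A = Matrix([[4, 1],
            [2, 3]])

char_poly = (A - lam * eye(2)).det()     # det(A - lambda*I)
eigenvalues = solve(char_poly, lam)      # roots of the characteristic polynomial
assert sorted(eigenvalues) == [2, 5]

# Each eigenvalue gives a nontrivial null space Nul(A - lambda*I).
for ev in eigenvalues:
    assert (A - ev * eye(2)).nullspace() != []
```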
Theorem 5.1.3. If A and B are similar n × n matrices, then A and B have the same characteristic
polynomial and eigenvalues (with same multiplicities).
have the same determinant, trace, characteristic polynomial and eigenvalues, but they are not
similar because PIP^{-1} = I for any invertible matrix P.
Definition. A diagonal matrix D is a square matrix such that all the entries off the main
diagonal are zero, that is, if D is of the form
D = \begin{bmatrix} λ_1 & 0 & \cdots & 0 \\ 0 & λ_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & λ_n \end{bmatrix} = \operatorname{diag}(λ_1, λ_2, . . . , λ_n),
Definition. Let V be a finite dimensional vector space and T ∈ L(V, V ) a linear operator. We
say that T is diagonalizable if there exists a basis B for V such that [T ]B is a diagonal matrix.
Proof. Let P = [~v_1 ~v_2 \cdots ~v_n] and D = diag(λ_1, λ_2, . . . , λ_n).
Then AP = PD becomes
A[~v_1 ~v_2 \cdots ~v_n] = [~v_1 ~v_2 \cdots ~v_n] \begin{bmatrix} λ_1 & 0 & \cdots & 0 \\ 0 & λ_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & λ_n \end{bmatrix},
that is,
[A~v_1 \; A~v_2 \; \cdots \; A~v_n] = [λ_1~v_1 \; λ_2~v_2 \; \cdots \; λ_n~v_n].
Since ~v2 , . . . , ~vk+1 are k eigenvectors, they are linearly independent by induction hypothesis, so
c2 = · · · = ck+1 = 0.
Lemma 5.1.7. Let {~v_1, . . . , ~v_k} be a linearly independent set of eigenvectors of an n × n matrix A,
extend it to a basis {~v_1, . . . , ~v_k, ~v_{k+1}, . . . , ~v_n} of F^n, and let
P = [~v_1 \; \cdots \; ~v_k \; ~v_{k+1} \; \cdots \; ~v_n].
In other words,
c_A(x) = (x − λ)^m g(x)
Proof. Assume that dim E_λ(A) = d with basis {~v_1, . . . , ~v_d}. By Lemma 5.1.7, there exists an
invertible n × n matrix P such that
P^{-1}AP = \begin{bmatrix} λI_d & B \\ 0 & C \end{bmatrix} = M.
Then
c_A(x) = c_M(x) = \det(xI_n − M) = \begin{vmatrix} (x − λ)I_d & -B \\ 0 & xI_{n-d} − C \end{vmatrix}
= (\det (x − λ)I_d)(\det(xI_{n-d} − C))
= (x − λ)^d c_C(x).
Theorem 5.2.1. The characteristic polynomial and minimal polynomial for A have the same roots.
Remark. Although the minimal polynomial and the characteristic polynomial have the same
roots, they may not be the same.
Example 5.2.1. The characteristic polynomial for A = \begin{bmatrix} 5 & -6 & -6 \\ -1 & 4 & 2 \\ 3 & -6 & -4 \end{bmatrix} is (x − 1)(x − 2)², while
(A − I)(A − 2I) = 0,
so the minimal polynomial of A is (x − 1)(x − 2).
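This can be checked symbolically; a sympy sketch for the matrix of Example 5.2.1, which simply multiplies out the candidate factors rather than relying on any minimal-polynomial helper:

```python
from sympy import Matrix, eye, zeros

A = Matrix([[5, -6, -6],
            [-1, 4, 2],
            [3, -6, -4]])
I3 = eye(3)

# The product of the distinct linear factors already annihilates A ...
assert (A - I3) * (A - 2 * I3) == zeros(3, 3)
# ... while neither factor alone does, so the minimal polynomial is (x - 1)(x - 2).
assert (A - I3) != zeros(3, 3) and (A - 2 * I3) != zeros(3, 3)
```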
Then
This gives
I = Bn−1
an−1 I = Bn−2 − ABn−1
an−2 I = Bn−3 − ABn−2
..
.
a1 I = B0 − AB1
a0 I = −AB0 .
Therefore,
A^n + a_{n-1}A^{n-1} + \cdots + a_1A + a_0I
= A^n B_{n-1} + A^{n-1}(B_{n-2} − AB_{n-1}) + A^{n-2}(B_{n-3} − AB_{n-2}) + \cdots + A(B_0 − AB_1) − AB_0
= 0
as desired.
Example 5.2.2. Determine the minimal polynomial of A = \begin{bmatrix} 3 & 1 & -1 \\ 2 & 2 & -1 \\ 2 & 2 & 0 \end{bmatrix}.
Recall that a square matrix A is symmetric if A^T = A and Hermitian if A^H = A, where A^H = \bar{A}^T is the conjugate transpose.
Notice that symmetric and Hermitian matrices are square matrices and the two notions coincide if F = R.
Example 5.3.1. Let A = \begin{bmatrix} 3 & 1 \\ 1 & -2 \end{bmatrix} and B = \begin{bmatrix} -1 & 2 + 3i \\ 2 - 3i & 2 \end{bmatrix}.
Then A is symmetric and both of them are Hermitian.
Corollary 5.3.4. If U = [~u_1 ~u_2 \cdots ~u_n] ∈ M_n(C) is a unitary matrix, then for all j, k ∈
{1, 2, . . . , n} we have
(~u_j, ~u_k) = \begin{cases} 1 & \text{if } j = k, \\ 0 & \text{if } j ≠ k. \end{cases}
Remark. The converse of Corollary 5.3.4 is also true and its proof is left as an exercise.
Example 5.3.3. U_1 = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & i \\ i & 1 \end{bmatrix} and U_2 = \begin{bmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{bmatrix}
are unitary matrices.
Theorem 5.3.5. Every eigenvalue of a unitary matrix U has absolute value one, i.e., |λ| = 1.
Moreover, eigenvectors corresponding to different eigenvalues are orthogonal to each other.
We are going to explore some very remarkable facts about Hermitian and real symmetric
matrices. These matrices are diagonalizable, and moreover the diagonalization can be accomplished
by a unitary matrix P. This means that P^{-1}AP = P^HAP is diagonal. In this situation, we say
that the matrix A is unitarily or orthogonally diagonalizable. Orthogonal and unitary diagonalizations are
particularly attractive since computing the inverse is essentially free, and error-free as well: P^H = P^{-1}.
Remark. The converse of Theorem 5.3.6 is also true. In addition, we prove a stronger result.
Theorem 5.3.7. [Principal Axes Theorem] Every Hermitian matrix is unitarily diagonalizable.
In addition, every real symmetric matrix is orthogonally diagonalizable.
Proof. It is a consequence of the Schur Triangularization Theorem, which is beyond the scope of this
course.
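In practice the orthogonal diagonalization of a real symmetric matrix is computed numerically; the sketch below (our own illustration, using numpy.linalg.eigh on the matrix A of Example 5.3.1) produces an orthogonal P with P^T AP diagonal.

import numpy as np

A = np.array([[3., 1.],
              [1., -2.]])                 # the real symmetric matrix of Example 5.3.1

evals, P = np.linalg.eigh(A)              # columns of P are orthonormal eigenvectors
print(np.allclose(P.T @ P, np.eye(2)))            # True: P^T P = I, so P^{-1} = P^T
print(np.allclose(P.T @ A @ P, np.diag(evals)))   # True: P^T A P is diagonal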
Real (R^n): ~x · ~y = ~x^T ~y = x1 y1 + · · · + xn yn ;  orthogonality: ~x^T ~y = 0;  orthonormal: P^T P = In = P P^T .
Complex (C^n): ~x · ~y = ~x^H ~y = x̄1 y1 + · · · + x̄n yn ;  orthogonality: ~x^H ~y = 0;  unitary: U^H U = In = U U^H .
(A − λI)~vr = ~vr−1 ,
(A − λI)~vr−1 = ~vr−2 ,
..
.
(A − λI)~v2 = ~v1 .
(A − λI)^r ~vr = ~0.
Each block has one eigenvector, one eigenvalue, and 1's just above the diagonal:
Jordan block Ji = J(λi , ri ) = \begin{pmatrix} λi & 1 & & \\ & λi & \ddots & \\ & & \ddots & 1 \\ & & & λi \end{pmatrix}_{ri × ri} .
The same λi will appear in several blocks, if it has several independent eigenvectors. Moreover, M
consists of n generalized eigenvectors which are linearly independent.
5.4. Jordan Forms 55
Remark. Theorem 5.4.1 says that every n × n matrix A has n linearly independent generalized
eigenvectors. These n generalized eigenvectors may be arranged in chains, with the sum of the
lengths of the chains associated with a given eigenvalue λ equal to the multiplicity of λ. But the
structure of these chains depends on the defect of λ, and can be quite complicated. For instance,
a multiplicity-four eigenvalue can correspond to
• Four length 1 chains (defect 0);
• Two length 1 chains and a length 2 chain (defect 1);
• Two length 2 chains (defect 2);
• A length 1 chain and a length 3 chain (defect 2);
• A length 4 chain (defect 3).
Observe that, in each of these cases, the length of the longest chain is at most d + 1 where d
is the defect of the eigenvalue. Consequently, once we have found all the ordinary eigenvectors
corresponding to a multiple eigenvalue λ, and therefore know the defect d of λ, we can begin
with the equation
(A − λI)^{d+1} ~u = ~0 (5.4.1)
Algorithm: Begin with a nonzero solution ~u1 of Eq. (5.4.1) and successively multiply by the
matrix A − λI until the zero vector is obtained. If
(A − λI)~u1 = ~u2 ≠ ~0
(A − λI)~u2 = ~u3 ≠ ~0
...
(A − λI)~uk−1 = ~uk ≠ ~0,
but (A − λI)~uk = ~0, then the vectors ~uk , ~uk−1 , . . . , ~u1 form a chain of generalized eigenvectors of length k based on the (ordinary) eigenvector ~uk .
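A rough computational sketch of this algorithm is given below; the helper name chain_from, the 2 × 2 matrix and the starting vector are our own illustrative choices, not part of the notes.

import sympy as sp

def chain_from(A, lam, u1):
    # Multiply u1 repeatedly by (A - lam*I) until the zero vector appears;
    # the nonzero vectors produced form a chain of generalized eigenvectors,
    # and the last one is an ordinary eigenvector for lam.
    n = A.rows
    B = A - lam * sp.eye(n)
    chain, u = [], u1
    while u != sp.zeros(n, 1):
        chain.append(u)
        u = B * u
    return chain

A = sp.Matrix([[2, 1],
               [0, 2]])       # single eigenvalue lambda = 2 with defect 1
u1 = sp.Matrix([0, 1])        # solves (A - 2I)^2 u = 0 but is not an eigenvector
for v in chain_from(A, 2, u1):
    print(v.T)                # Matrix([[0, 1]]) then Matrix([[1, 0]])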
Example 5.4.3. Let A = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ −2 & 2 & −3 & 1 \\ 2 & −2 & 1 & −3 \end{pmatrix} with the characteristic polynomial x(x + 2)^3 .
Find the chains of generalized eigenvectors corresponding to each eigenvalue and the Jordan form of A.
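For comparison, sympy can produce the Jordan form of this matrix directly; the sketch below (an illustrative library call, not a replacement for the hand computation) also confirms the characteristic polynomial.

import sympy as sp

A = sp.Matrix([[ 0,  0,  1,  0],
               [ 0,  0,  0,  1],
               [-2,  2, -3,  1],
               [ 2, -2,  1, -3]])

print(sp.factor(A.charpoly(sp.symbols('x')).as_expr()))   # x*(x + 2)**3
M, J = A.jordan_form()       # A = M J M^{-1}; columns of M are chains of generalized eigenvectors
print(J)                     # expect one 1x1 block for 0 and blocks of sizes 2 and 1 for -2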
Example 5.4.4. Let A = \begin{pmatrix} 8 & 0 & 0 & 0 \\ 0 & 8 & 0 & 3 \\ 4 & 0 & 8 & 0 \\ 0 & 0 & 0 & 8 \end{pmatrix}. Find the minimal polynomial of A, the chain(s) of generalized eigenvectors, and the Jordan form of A.
Example 5.4.5. Write down the Jordan form of the following matrices.
(1) \begin{pmatrix} 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}
(2) \begin{pmatrix} 3 & 5 & 0 & 0 \\ 0 & 3 & 6 & 0 \\ 0 & 0 & 4 & 7 \\ 0 & 0 & 0 & 4 \end{pmatrix}
(3) \begin{pmatrix} 3 & 0 & 0 & 0 \\ 0 & 3 & 5 & 0 \\ 0 & 0 & 4 & 6 \\ 0 & 0 & 0 & 4 \end{pmatrix}
Let N (r) = J(0; r) denote an r × r matrix that has 1's immediately above the diagonal and zero elsewhere. For example,
N (2) = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, N (3) = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, N (4) = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}, etc.
Suppose that f (x) is a polynomial of degree s. Then the Taylor expansion around a point c from calculus gives us
f (x) = f (c) + f ′ (c)(x − c) + \frac{f ′′ (c)}{2!} (x − c)^2 + · · · + \frac{f^{(s)} (c)}{s!} (x − c)^s .
Substituting x = J(λ; r) = λIr + N (r) and c = λ, so that x − c becomes N (r), yields
f (J(λ; r)) = f (λ)Ir + f ′ (λ)N (r) + \frac{f ′′ (λ)}{2!} N (r)^2 + · · · + \frac{f^{(r−1)} (λ)}{(r − 1)!} N (r)^{r−1} ,
a matrix whose entry k steps above the diagonal is f^{(k)} (λ)/k!, because the entries of N (r)^k that are k steps above the diagonal are 1's and all the other entries are zeros (and N (r)^k = 0 for k ≥ r).
Example 5.4.6. Compute J(λ; 4)^2 , J(λ; 3)^{10} and J(λ; 2)^s .
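As a check on Example 5.4.6, the sympy sketch below (the helper jordan_block is our own, written directly from the definition) computes such powers symbolically and exhibits the Taylor pattern in the superdiagonals.

import sympy as sp

lam = sp.symbols('lambda')

def jordan_block(lam, r):
    # r x r Jordan block J(lam; r): lam on the diagonal, 1's just above it
    return sp.Matrix(r, r, lambda i, j: lam if i == j else (1 if j == i + 1 else 0))

print(jordan_block(lam, 4)**2)    # lam**2 on the diagonal, 2*lam and 1 on the superdiagonals
print(jordan_block(lam, 3)**10)   # lam**10, 10*lam**9, 45*lam**8 = C(10,2)*lam**8
print(jordan_block(lam, 2)**5)    # compare with the general pattern for J(lam; 2)**s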
Remark. If J = \begin{pmatrix} J1 & & \\ & \ddots & \\ & & Jt \end{pmatrix} is in a Jordan form, then J^s = \begin{pmatrix} J1^s & & \\ & \ddots & \\ & & Jt^s \end{pmatrix}.
Example 5.4.7. Compute J^s for J = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix}.
Example 5.4.8. Given a square matrix A, use the Jordan form of A, to determine its minimal
polynomial.
Solution. Let J be the Jordan form of A, say A = M JM^{−1} . Since f (A) = M f (J)M^{−1} , we have f (A) = 0 if and only if f (J) = 0. Also, f (J) is block diagonal, and if J(λ; r) is a Jordan block of J, then the corresponding block of f (J) is f (J(λ; r)). We must thus find a polynomial f such that, for every Jordan block J(λ; r) of J, f (J(λ; r)) = 0 holds.
But we derived a formula for f (J(λ; r)), and it equals the zero matrix if and only if f (λ), f ′ (λ),
. . . , f (r−1) (λ) are all zero. Thus, f (x) and its first r − 1 derivatives must vanish at x = λ; in other
words, (x − λ)r must be a factor of f (x).
Let λ1 , . . . , λk be the distinct eigenvalues of A and mi the “maximum size” of the Jordan blocks
corresponding to the eigenvalue λi . Hence, we obtain
mA (x) = (x − λ1 )^{m1} (x − λ2 )^{m2} · · · (x − λk )^{mk} .
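A small sympy illustration of reading the minimal polynomial off the Jordan form is sketched below, using the matrix of Example 5.4.7 (the helper code and variable names are our own).

import sympy as sp

x = sp.symbols('x')
A = sp.Matrix([[2, 1, 0],
               [0, 2, 0],
               [0, 0, 3]])            # Jordan form: J(2; 2) and J(3; 1)

M, J = A.jordan_form()

# for each eigenvalue, record the size of its largest Jordan block
max_size = {}
for block in J.get_diag_blocks():
    lam = block[0, 0]
    max_size[lam] = max(max_size.get(lam, 0), block.rows)

m_A = sp.S.One
for lam, m in max_size.items():
    m_A *= (x - lam)**m

print(sp.factor(m_A))                                           # (x - 2)**2*(x - 3)
print((A - 2*sp.eye(3))**2 * (A - 3*sp.eye(3)) == sp.zeros(3))  # True: m_A(A) = 0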
Exercises for Chapter 5.
1. Let A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ a & b & c \end{pmatrix}. Find a, b, c so that det(A − λI3 ) = 9λ − λ^3 .
2. Let T : V → V be a linear operator.
A subspace U of V is T -invariant if T (U ) ⊆ U , i.e., ∀~u ∈ U, T (~u) ∈ U .
(a) Show that ker T and im T are T -invariant.
(b) If U and W are T -invariant, prove that U ∩ W and U + W are also T -invariant.
(c) Show that the eigenspace Eλ (T ) is T -invariant.
3. Show that A and A^T have the same eigenvalues.
4. Show that if λ1 , . . . , λk are eigenvalues of A, then λ1^m , . . . , λk^m are eigenvalues of A^m for all m ≥ 1.
Moreover, each eigenvector of A is an eigenvector of A^m .
5. Let A and B be n × n matrices over a field F . If I − AB is invertible, prove that I − BA is invertible
and (I − BA)−1 = I + B(I − AB)−1 A.
6. Show that if A and B are square matrices of the same size, then AB and BA have the same eigenvalues.
7. Determine all 2 × 2 diagonalizable matrices A with nonzero repeated eigenvalue a, a.
8. Let V be the space of all real-valued continuous functions. Define T : V → V by
(T f )(x) = \int_0^x f (t) dt.
tr(AB) − (tr A)(tr B) + tr(AB^{−1} ) = 0.
15. Find the 2 × 2 matrices with real entries that satisfy the equation
X^3 − 3X^2 = \begin{pmatrix} −2 & −2 \\ −2 & −2 \end{pmatrix}.
(Hint. Apply the Cayley-Hamilton Theorem.)
16. Let A = \begin{pmatrix} 0 & 0 & c \\ 1 & 0 & b \\ 0 & 1 & a \end{pmatrix}.
Prove that the minimal polynomial of A and the characteristic polynomial of A are the same.
17. A 3 × 3 matrix A has the characteristic polynomial x(x − 1)(x + 2).
What is the characteristic polynomial of A^2 ?
18. Let V = Mn (F ) be the vector space of n × n matrices over a field F . Let A be an n × n matrix.
Let TA be the linear operator on V defined by TA (B) = AB.
Show that the minimal polynomial for TA is the minimal polynomial for A.
19. Let U be an n × n real orthonormal matrix. Prove that
(a) |tr (U )| ≤ n, and (b) det(U^2 − In ) = 0 if n is odd.
20. If U = [~u1 ~u2 . . . ~un ] with (~uj , ~uk ) = \begin{cases} 1 & \text{if } j = k, \\ 0 & \text{if } j ≠ k, \end{cases} prove that U is unitary.
21. Let A be an n × n symmetric matrix with distinct eigenvalues λ1 , . . . , λk . Prove that
(A − λ1 In ) . . . (A − λk In ) = 0.
For any matrix M , compare JM with M K. If they are equal, show that M is not invertible. Conclude that J and K are not similar.
33. Suppose that a square matrix A has two eigenvalues λ = 2, 5, and np (λ) = nullity((A − λI)^p ), p ∈ N, are
as follows:
n1 (2) = 2, n2 (2) = 4, np (2) = 5 for p ≥ 3, and n1 (5) = 1, np (5) = 2 for p ≥ 2.
Write down the Jordan form of A.
34. Let J = J(0; 5) be the 5 × 5 Jordan block with λ = 0. Find J^2 , count its eigenvectors and write its
Jordan form.
35. How many possible Jordan forms are there for a 6 × 6 matrix with characteristic
polynomial (x − 1)^2 (x + 2)^4 ?
36. Let A = \begin{pmatrix} 2 & a & b \\ 0 & 2 & c \\ 0 & 0 & 1 \end{pmatrix} ∈ M3 (R).
(a) Prove that A is diagonalizable if and only if a = 0.
(b) Find the minimal polynomial of A when (i) a = 0, (ii) a ≠ 0.
37. Let V = {h(x, y) = ax^2 + bxy + cy^2 + dx + ey + f : a, b, c, d, e, f ∈ R} be a subspace of the space
of polynomials in two variables x and y over R. Then B = {x^2 , xy, y^2 , x, y, 1} is a basis for V . Define
T : V → V by
(T (h))(x, y) = \int \frac{∂}{∂y} h(x, y) dx .
(a) Prove that T is a linear transformation and find A = [T ]B .
(b) Compute the characteristic polynomial and the minimal polynomial of A.
(c) Find the Jordan form of A.
38. True or False:
(a) \begin{pmatrix} 3 & 0 \\ 0 & 4 \end{pmatrix} and \begin{pmatrix} 3 & 1 \\ 0 & 4 \end{pmatrix} are similar.  (b) \begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix} and \begin{pmatrix} 3 & 1 \\ 0 & 3 \end{pmatrix} are similar.
39. Show that \begin{pmatrix} a & 1 & 0 \\ 0 & a & 0 \\ 0 & 0 & b \end{pmatrix} and \begin{pmatrix} b & 0 & 0 \\ 0 & a & 1 \\ 0 & 0 & a \end{pmatrix} are similar.
40. Write down the Jordan form for the following matrices and find the minimal polynomial of each.
(a) \begin{pmatrix} −2 & 1 \\ −1 & −4 \end{pmatrix}
(b) \begin{pmatrix} −1 & 0 & 1 \\ 0 & −1 & 1 \\ 1 & −1 & −1 \end{pmatrix}
(c) \begin{pmatrix} 1 & 0 & 0 \\ −2 & −2 & −3 \\ 2 & 3 & 4 \end{pmatrix}
(d) \begin{pmatrix} 2 & 0 & 0 \\ −7 & 9 & 7 \\ 0 & 0 & 2 \end{pmatrix}
(e) \begin{pmatrix} 3 & 1 & −1 \\ 2 & 2 & −1 \\ 2 & 2 & 0 \end{pmatrix}
(f) \begin{pmatrix} −2 & 17 & 4 \\ −1 & 6 & 1 \\ 0 & 1 & 2 \end{pmatrix}
(g) \begin{pmatrix} −3 & 5 & −5 \\ 3 & −1 & 3 \\ 8 & −8 & 10 \end{pmatrix}
(h) \begin{pmatrix} 5 & −1 & 1 \\ 1 & 3 & 0 \\ −3 & 2 & 1 \end{pmatrix}
(i) \begin{pmatrix} 1 & −4 & 0 & −2 \\ 0 & 1 & 0 & 0 \\ 6 & −12 & −1 & −6 \\ 0 & −4 & 0 & −1 \end{pmatrix}
(j) \begin{pmatrix} 2 & 1 & 0 & 1 \\ 0 & 2 & 1 & 0 \\ 0 & 0 & 2 & 1 \\ 0 & 0 & 0 & 2 \end{pmatrix}
(k) \begin{pmatrix} −1 & −4 & 0 & 0 \\ 1 & 3 & 0 & 0 \\ 1 & 2 & 1 & 0 \\ 0 & 1 & 0 & 1 \end{pmatrix}
(l) \begin{pmatrix} 1 & 3 & 7 & 0 \\ 0 & −1 & −4 & 0 \\ 0 & 1 & 3 & 0 \\ 0 & −6 & −14 & 1 \end{pmatrix}
Eigenvalues: (b) −1, −1, −1 (c) 1, 1, 1 (d) 2, 2, 9 (e) 1, 2, 2 (f) 2, 2, 2 (g) 2, 2, 2 (h) 3, 3, 3
(i) −1, −1, 1, 1 (k) 1, 1, 1, 1 (l) 1, 1, 1, 1.