MATH 336 Linear Algebra II
Definition. By a field F , we mean a non-empty set of elements with two laws of combination,
which we call an addition + and a multiplication · satisfying:
(F1) To every pair of elements a, b ∈ F there is associated a unique element, called their sum,
which we denote by a + b.
(F2) Addition is associative: (a + b) + c = a + (b + c).
(F3) Addition is commutative: a + b = b + a.
(F4) There exists an element, which we denote by 0, such that a + 0 = a for all a ∈ F .
(F5) For each a ∈ F there exists an element, which we denote by −a such that a + (−a) = 0.
(F6) To every pair of elements a, b ∈ F there is associated a unique element, called their
product, which we denote by ab, or a · b.
(F7) Multiplication is associative: (ab)c = a(bc).
(F8) Multiplication is commutative: ab = ba.
(F9) There exists an element different from 0, which we denote by 1, such that a · 1 = a for all
a ∈ F.
(F10) For each a ∈ F, a ≠ 0, there exists an element, which we denote by a^{-1}, such that
a · a^{-1} = 1.
(F11) Multiplication is distributive with respect to addition: (a + b)c = ac + bc.
We write Q for the set of rational numbers, R for the set of real numbers and C for the set
of complex numbers. These sets are fields. A rigorous definition and treatment of fields can
be found in any abstract algebra course, including 2301337 Abstract Algebra I. The definition of
field was presented once in Linear Algebra I. In this course, F always denotes any of Q, R, C
or other fields. Its members are called scalars. However, almost nothing essential is lost if we
assume that F is the real field R or the complex field C.
Example 1.1.2. Let p be a prime and Fp = {0, 1, . . . , p − 1}. For a and b in Fp, we define a + b and a · b to be the remainders of the ordinary sum and product upon division by p. With these operations, Fp is a field.
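The field axioms for Fp can be checked by ordinary integer arithmetic followed by reduction modulo p. A minimal Python sketch (the helper names add, mul, inv and the choice p = 7 are illustrative, not from the notes):

```python
# Arithmetic in F_p for a prime p, using reduction modulo p.
p = 7

def add(a, b):
    return (a + b) % p          # sum in F_p

def mul(a, b):
    return (a * b) % p          # product in F_p

def inv(a):
    # multiplicative inverse of a nonzero element, via Fermat's little theorem
    assert a % p != 0
    return pow(a, p - 2, p)

# (F10): every nonzero element has a multiplicative inverse.
assert all(mul(a, inv(a)) == 1 for a in range(1, p))
```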
Definition. An m × n matrix A over F is a rectangular array of elements of F arranged in m rows and n columns, written A = [aij],
where aij ∈ F for all i ∈ {1, 2, . . . , m} and j ∈ {1, 2, . . . , n}. We write Mm,n(F) for the set of
m × n matrices with entries in F and we write Mn(F) for Mn,n(F), the set of square matrices of
order n.
Remark. As a shortcut, we often use the notation A = [aij] to denote the matrix A with entries
aij. Notice that when we refer to the matrix we put brackets, as in “[aij]”, and when we refer
to a specific entry we do not use the surrounding brackets, as in “aij”.
Definition. Two m × n matrices A = [aij ] and B = [bij ] are equal if aij = bij for all i ∈
{1, 2, . . . , m} and j ∈ {1, 2, . . . , n}.
Definition. The m × n zero matrix 0_{m×n} ∈ Mm,n(F) is the matrix with 0_F's everywhere,

0_{m \times n} = \begin{bmatrix} 0 & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{bmatrix}.
Definition. Let A = [aij] and B = [bij] be m × n matrices and let r ∈ F be a scalar. The matrix A + rB
is the matrix C = [cij] ∈ Mm,n(F) with entries cij = aij + r·bij for all i and j.
Theorem 1.1.1. Let A, B and C be matrices of the same size, and let r and s be scalars in F . Then
(a) A + B = B + A (e) r0 = 0 and 0A = 0
(b) (A + B) + C = A + (B + C) (f) 1A = A
(c) A + 0 = A (g) (r + s)A = rA + sA
(d) r(A + B) = rA + rB (h) r(sA) = (rs)A = (sr)A = s(rA)
Definition. Let A be an m × n matrix with columns ~a1, ~a2, . . . , ~an and let ~x be a column vector
in F^n. The product of A and ~x, denoted by A~x, is the linear combination of the columns of A
using the corresponding entries in ~x as weights. That is,

A~x = \begin{bmatrix} ~a1 & ~a2 & \cdots & ~an \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} := x_1~a1 + x_2~a2 + · · · + x_n~an.
If B is an n × p matrix with columns ~b1 , ~b2 , . . . , ~bp , then the product of A and B, denoted by
AB, is the m × p matrix with columns A~b1 , A~b2 , . . . , A~bp . In other words,
AB = A[~b1 ~b2 · · · ~bp] := [A~b1 A~b2 · · · A~bp].
The above definition of AB is good for theoretical work. When A and B have small sizes,
the following method is more efficient when working by hand. Let A = [aij] ∈ Mm,n(F) and
B = [bij ] ∈ Mn,p (F ). Then the matrix product AB is defined as the matrix C = [cij ] ∈ Mm,p (F )
with entries
c_{ij} = \sum_{l=1}^{n} a_{il} b_{lj},
that is,

\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
\vdots & \vdots & & \vdots \\
a_{i1} & a_{i2} & \cdots & a_{in} \\
\vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
\begin{bmatrix}
b_{11} & \cdots & b_{1j} & \cdots & b_{1p} \\
b_{21} & \cdots & b_{2j} & \cdots & b_{2p} \\
\vdots & & \vdots & & \vdots \\
b_{n1} & \cdots & b_{nj} & \cdots & b_{np}
\end{bmatrix}
=
\begin{bmatrix}
c_{11} & \cdots & c_{1p} \\
\vdots & c_{ij} & \vdots \\
c_{m1} & \cdots & c_{mp}
\end{bmatrix}.
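The entry formula and the column description of AB give the same matrix; a short numpy sketch comparing the two (the matrices below are arbitrary illustrations):

```python
import numpy as np

A = np.array([[1, 2, 0],
              [3, -1, 4]])        # 2 x 3
B = np.array([[2, 1],
              [0, 5],
              [-1, 3]])           # 3 x 2

# Entry formula: c_ij = sum_l a_il * b_lj
C = np.zeros((2, 2))
for i in range(2):
    for j in range(2):
        C[i, j] = sum(A[i, l] * B[l, j] for l in range(3))

# Column description: the j-th column of AB is A times the j-th column of B
C_cols = np.column_stack([A @ B[:, j] for j in range(2)])

assert np.array_equal(C, C_cols) and np.array_equal(C, A @ B)
```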
Theorem 1.1.2. Let A be m × n and let B and C have sizes for which the indicated sums and
products are defined.
(a) A(B + C) = AB + AC and (B + C)A = BA + CA
(b) r(AB) = (rA)B = A(rB) for any scalar r
(c) A0n×k = 0m×k and 0k×m A = 0k×n
(d) Im A = A = AIn
(e) A(BC) = (AB)C
Remarks. The properties above are analogous to properties of real numbers. But NOT ALL real
number properties carry over to matrices.
1. It is not the case that AB always equals BA.
2. Even if A ≠ 0 and AB = AC, B may not equal C. (For cancellation, A must have an inverse!)
3. It is possible for AB = 0 even if A ≠ 0 and B ≠ 0. E.g.,
\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}.
Theorem 1.1.3. Let A and B denote matrices whose sizes are appropriate for the following sums
and products.
(a) (AT )T = A (c) (rA)T = rAT for any scalar r
(b) (A + B)T = AT + B T (d) (AB)T = B T AT
Definition. A vector space V over a field F is a nonempty set of elements called vectors, with
two laws of combination, called vector addition (or addition) and scalar multiplication, sat-
isfying the following conditions.
(A1) ∀~u, ~v ∈ V, ~u + ~v ∈ V.
(A2) ∀~u, ~v ∈ V, ~u + ~v = ~v + ~u.
(A3) ∀~u, ~v, ~w ∈ V, ~u + (~v + ~w) = (~u + ~v) + ~w.
(A4) ∃~0 ∈ V, ∀~u ∈ V, ~u + ~0 = ~u = ~0 + ~u.
(A5) ∀~u ∈ V, ∃~u′ ∈ V, ~u + ~u′ = ~0 = ~u′ + ~u.
(SM1) ∀a ∈ F, ∀~u ∈ V, a~u ∈ V.
(SM2) ∀a ∈ F, ∀~u, ~v ∈ V, a(~u + ~v) = a~u + a~v.
(SM3) ∀a, b ∈ F, ∀~u ∈ V, (a + b)~u = a~u + b~u.
(SM4) ∀a, b ∈ F, ∀~u ∈ V, (ab)~u = a(b~u).
(SM5) ∀~u ∈ V, 1~u = ~u (1 ∈ F).
We call ~0 the zero vector and ~u′ the negative of ~u.
Examples 1.2.1. 1. For any field F and n ≥ 1, we have F n is a vector space over F where
(x1 , . . . , xn ) + (y1 , . . . , yn ) = (x1 + y1 , . . . , xn + yn )
and
a(x1 , . . . , xn ) = (ax1 , . . . , axn )
for all (x1, . . . , xn), (y1, . . . , yn) ∈ F^n and a ∈ F.
2. Let m, n ∈ N, let F be a field and Mm,n(F) the set of all m × n matrices over F. Then Mm,n(F)
is a vector space over F under the usual addition and scalar multiplication of matrices.
3. [The space of functions from a set to a field] Let S be a nonempty set and F a field. Let
F^S = {f | f : S → F}. Then F^S is a vector space over F by defining f + g and cf for
functions f, g ∈ F^S and a scalar c ∈ F as follows:
(f + g)(t) = f(t) + g(t) and (cf)(t) = c f(t)
for all t ∈ S. The zero function from S into F is the zero vector of F^S and the negative of
f ∈ F^S is −f defined by (−f)(t) = −f(t) for all t ∈ S.
4. [The sequence space] Let F^N = {(xn) : (xn) is a sequence in F}. Then F^N is a vector
space over F under the usual addition and scalar multiplication of sequences. That is, for
sequences (an) and (bn) in F^N and a scalar c ∈ F,
(an) + (bn) = (an + bn) and c(an) = (c an).
Its zero is the zero sequence (zn) where zn = 0 for all n and the negative of (an) is the
sequence (bn) given by bn = −an for all n.
5. Let n be a non-negative integer and Fn[x] be the set of polynomials over F of degree at most
n. That is, Fn[x] = {a0 + a1x + a2x² + · · · + anx^n : a0, a1, . . . , an ∈ F}. Addition and scalar multiplication are defined coefficientwise,
p(x) + q(x) = (a0 + b0) + (a1 + b1)x + · · · + (an + bn)x^n and c p(x) = (ca0) + (ca1)x + · · · + (can)x^n,
for all polynomials p(x) = a0 + a1x + a2x² + · · · + anx^n and q(x) = b0 + b1x + b2x² + · · · + bnx^n in
Fn[x] and c ∈ F. Then Fn[x] is a vector space over F. Observe that for each positive integer
n, we have Fn−1[x] ⊂ Fn[x].
6. [The space of polynomials over a field] Let F[x] be the set of all polynomials over F. That
is, F[x] = {a0 + a1x + · · · + anx^n : n ≥ 0 and a0, a1, . . . , an ∈ F}.
Theorem 1.2.2. Let (V1, +1, ·1), (V2, +2, ·2), . . . , (Vn, +n, ·n) be vector spaces over a field F and let V = V1 × V2 × · · · × Vn.
For (~v1, ~v2, . . . , ~vn), (~w1, ~w2, . . . , ~wn) ∈ V and c ∈ F, we define the addition and scalar multiplica-
tion on V by
(~v1, . . . , ~vn) + (~w1, . . . , ~wn) = (~v1 +1 ~w1, . . . , ~vn +n ~wn) and c(~v1, . . . , ~vn) = (c ·1 ~v1, . . . , c ·n ~vn).
Then V is a vector space over F with the zero vector ~0 = (~01, ~02, . . . , ~0n) and the negative of
(~v1, ~v2, . . . , ~vn) is (−~v1, −~v2, . . . , −~vn). V is called the direct product of V1, V2, . . . , Vn.
1.3 Subspaces
Theorem 1.3.1. Let W be a nonempty subset of V . Then the following statements are equivalent.
(i) W is a subspace of V .
(ii) ∀~u, ~v ∈ W, ∀c ∈ F, ~u + ~v ∈ W and c~u ∈ W .
(iii) ∀~u, ~v ∈ W, ∀c, d ∈ F, c~u + d~v ∈ W .
(iv) ∀~u, ~v ∈ W, ∀c ∈ F, c~u + ~v ∈ W .
Examples 1.3.1. 1. For any vector space V over a field F , we have {~0V } and V are subspaces
of V , called trivial subspaces.
2. For a non-negative integer n, we have Fn [x] is a subspace of F [x].
3. Let α ∈ F and Vα = {(x1 , x2 ) : x1 = αx2 }. Then Vα is a subspace of F 2 .
4. Let Bd(R) = {(an ) ∈ RN : (an ) is a bounded sequence},
C(R) = {(an ) ∈ RN : (an ) is a convergent sequence} and
C0 (R) = {(an ) ∈ RN : an → 0 as n → ∞}.
Then Bd(R), C(R) and C0 (R) are subspaces of RN .
5. Let C 0 (−∞, ∞) = {f ∈ RR : f is continuous on (−∞, ∞)}.
Then C 0 (−∞, ∞) is a subspace of RR .
6. Let W = {f : R → R | f ′′ = f }. Then W is a subspace of RR .
7. Let W1 = {p(x) ∈ F [x] : p(1) = 0} and W2 = {p(x) ∈ F [x] : p(0) = 1}.
Then W1 is a subspace of F [x] but W2 is not.
8. Let A ∈ Mm,n (F ). Then Nul A = {~x ∈ F n : A~x = ~0m } is a subspace of F n , called the null
space of A.
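Null spaces are easy to compute symbolically; a sketch with sympy (the matrix is an arbitrary illustration, not one from the notes):

```python
from sympy import Matrix

A = Matrix([[1, 2, -1],
            [2, 4, -2]])             # rank 1, so Nul A has dimension 3 - 1 = 2

# A basis for Nul A = { x in F^3 : A x = 0 }
basis = A.nullspace()
for v in basis:
    assert A * v == Matrix([0, 0])   # every basis vector solves A x = 0
```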
Theorem 1.3.2. Let V be a vector space over a field F . The intersection of any collection of
subspaces of V is a subspace of V .
Remark. W1 + W2 is the smallest subspace of V containing W1 and W2 , i.e., any subspace con-
taining W1 and W2 must contain W1 + W2 .
Since ∅ ⊂ {~0V}, which is the smallest of all subspaces of V, we have Span ∅ = {~0V}. Moreover,
if W is a subspace of V, then Span W = W. In particular, Span(Span S) = Span S.
Remark. Let S be a non-empty subset of V and let W be a subspace of V containing S. Note that
for c1, . . . , cm ∈ F and ~v1, . . . , ~vm ∈ S, we have ~v1, . . . , ~vm ∈ W and so
c1~v1 + · · · + cm~vm ∈ W.
Thus, Y := {c1~v1 + · · · + cm~vm : c1 , . . . , cm ∈ F and ~v1 , . . . , ~vm ∈ S for some m ∈ N} ⊆ W for all
subspaces W of V containing S. Hence, Y ⊆ Span S.
Theorem 1.3.4. Span S is the smallest subspace of V containing S. That is, any subspace of V
containing S must also contain Span S. Moreover, Span ∅ = {~0} and, for S ≠ ∅,
Span S = {c1~v1 + · · · + cm~vm : m ∈ N, c1, . . . , cm ∈ F and ~v1, . . . , ~vm ∈ S}.
In particular,
Span{~v1, . . . , ~vp} = {c1~v1 + · · · + cp~vp : c1, . . . , cp ∈ F}.
Definition. Let A = [~a1 ~a2 . . . ~an] be an m × n matrix over a field F. Then ~ai ∈ F^m for all
i = 1, 2, . . . , n and Span{~a1 , ~a2 , . . . , ~an } is a subspace of F m , called the column space of A. We
denote this space by Col A.
Theorem 1.3.5. Let V and W be vector spaces over a field F and T : V → W a linear transfor-
mation. Then T (~0V ) = ~0W and ∀~v ∈ V, T (−~v ) = −T (~v ).
Definition. Let V and W be vector spaces over a field F and T : V → W a linear transforma-
tion. Recall that the image or range of T is given by
im T = range T = {~w ∈ W : ∃~v ∈ V, T(~v) = ~w} = {T(~v) : ~v ∈ V}.
Example 1.3.2. Let A = [~a1 ~a2 . . . ~an] be an m × n matrix over a field F.
Then the matrix transformation T : F^n → F^m given by
T(~x) = A~x
for all ~x ∈ F^n is a linear transformation whose image is Col A. Hence,
T is onto ⇔ im T = F^m ⇔ Col A = F^m.
Example 1.3.3. Define T : R[x] → R by
T(p(x)) = p(1)
for all p(x) ∈ R[x]. Show that T is an onto linear transformation and find its kernel.
Example 1.3.4. Let V be the space of differentiable functions on (−∞, ∞) with continuous
derivative. Define a function T : V → C 0 (−∞, ∞) by
T (f (x)) = f ′ (x)
for all f ∈ V . Show that T is an onto linear transformation and find its kernel.
Definition. Let V be a vector space over a field F. Vectors ~u1, ~u2, . . . , ~un in V are linearly
independent if the only scalars c1, c2, . . . , cn ∈ F with c1~u1 + c2~u2 + · · · + cn~un = ~0 are c1 = c2 = · · · = cn = 0.
If there is a linear combination c1~u1 + c2~u2 + · · · + cn~un = ~0 with the scalars c1, c2, . . . , cn not all
zero, we say that ~u1, ~u2, . . . , ~un are linearly dependent.
is dependent or independent in R3 .
Example 1.3.6. Determine whether the set of vectors
~u1 = \begin{bmatrix} 2 & 2 & 1 \\ 0 & 0 & 1 \end{bmatrix}, \quad ~u2 = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 1 \end{bmatrix}, \quad ~u3 = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}
is linearly independent in M2,3(R).
Remark. Observe that the question of dependence and independence of sets of functions is re-
lated to the interval over which the space is defined. Consider the interval [−1, 1] with the
functions f, g and h defined as follows:
f(x) = 1 for −1 ≤ x ≤ 1,
g(x) = \begin{cases} 0 & \text{if } -1 \le x \le 0, \\ x & \text{if } 0 \le x \le 1, \end{cases} \qquad h(x) = \begin{cases} 0 & \text{if } -1 \le x \le 0, \\ x^2 & \text{if } 0 \le x \le 1. \end{cases}
These functions are linearly independent. However, if we restrict these same functions to the
interval [−1, 0], then they are dependent because
0 · f(x) + 1 · g(x) + (−1) · h(x) = 0
for −1 ≤ x ≤ 0.
Theorem 1.4.1. Let V be a vector space over a field F and B = {~v1, . . . , ~vn} ⊆ V linearly
independent.
1. If ~v ∈ Span B, then there exist unique c1, . . . , cn ∈ F such that
~v = c1~v1 + · · · + cn~vn.
2. If B is a basis for V, then every vector in V can be expressed uniquely as a linear combination
of ~v1, . . . , ~vn.
3. Let W be a vector space over a field F and ~w1, . . . , ~wn ∈ W (not necessarily distinct). If B is a
basis for V, then there is a unique linear transformation T from V to W such that T(~vi) = ~wi
for all i ∈ {1, . . . , n}.
Examples 1.4.1. 1. Find a linear transformation T that satisfies the following conditions
(i) T : C → R2 [x] with T (1 − i) = 2x2 and T (1 + i) = 1 − x,
(ii) T : R2 [x] → R2 with T (1) = (2, 1), T (1 − x) = (0, 1) and T (x + x2 ) = (1, 1).
2. Let T : R1 [x] → R3 be a linear transformation with
Lemma 1.4.2. 1. If ~u, ~v1, . . . , ~vn ∈ S and ~u = c1~v1 + · · · + cn~vn, then Span S = Span(S \ {~u}).
2. If S is a linearly independent subset of V and ~u ∉ Span S, then S ∪ {~u} is linearly indepen-
dent.
Theorem 1.4.4. [Replacement Theorem] Let V be a vector space that is spanned by a set G
containing exactly n vectors. Let L be a linearly independent subset of V with m vectors. Then
1. m ≤ n,
2. there exists a subset H of G with n − m vectors such that L ∪ H spans V .
Corollary 1.4.5. If a vector space V has a finite spanning set {~v1 , . . . , ~vn }, then
1. {~v1 , . . . , ~vn } has a subset which is a basis,
2. any linearly independent set in V can be extended to a basis,
3. V has a basis,
4. any two bases have the same finite number of elements, necessarily ≤ n.
Definition. If a vector space V has a finite spanning set, then we say that V is finite-
dimensional, and the number of elements in a basis is called the dimension of V , written
dim V . If V has no finite spanning set, we say that V is infinite-dimensional.
Examples 1.4.3. 1. The vector space {~0} has dimension zero with basis ∅.
2. The vector space F n , n ≥ 1, is of dimension n with standard basis {(1, 0, . . . , 0), (0, 1, . . . , 0),
. . . , (0, 0, . . . , 1)}. Similarly, Mm,n (F ) is of dimension mn where m, n ∈ N.
3. The vector space Fn [x] is of dimension n + 1 with standard basis {1, x, x2 , . . . , xn }.
4. The vector spaces F N and F [x] are infinite-dimensional. A basis for F [x] is {1, x, x2 , . . . }.
5. If we consider C as a vector space over C, it has dimension one with basis {1}. But if we
consider C as a vector space over R it has dimension two with basis {1, i}.
Remark. The above corollary is valid for a “finite” dimensional vector space. For a general (fi-
nite/infinite dimensional) vector space V , consider L = {L ⊆ V : L is linearly independent}.
Then ∅ ∈ L . Partially ordering L by ⊆.S We now show that every S chain in L has an upper
bound. Let C be a chain in L . Consider C . Let ~v1 , . . . , ~vn ∈ C and c1 , . . . , cn ∈ F be such
that c1~v1 + · · · + cn~vn = ~0V . Suppose ~vi ∈ Li for some Li ∈ C for all i ∈ {1, . . . , n}. Since C
is a chain, we may suppose that L1 ⊆ . . . ⊆ Ln . Thus,S~v1 , . . . , ~vn are in Ln which is a linearly
S
independent set. This implies c1 = · · · = cn = 0. Hence, C is a linearly independent set, so C
is in L . By Zorn’s lemma—“If a partially ordered set P has the property that every chain (i.e.,
totally ordered subset) has an upper bound in P , then the set P contains at least one maximal
element.”, L contains a maximal element, say B. This is a maximal linearly independent subset
of V . By Theorem 1.4.3 (1), B is a basis for V . Hence, every vector space has a basis. Note that a
basis for F N exists in this way and is not constructible explicitly.
Corollary 1.4.6. If V is a finite-dimensional vector space with dim V = n, then any spanning
set of n elements is a basis of V , and any linearly independent set of n elements is a basis of V .
Consequently, if W is an n-dimensional subspace of V , then W = V .
Theorem 1.4.8. If W1 and W2 are finite dimensional subspaces of a vector space V over a field F,
then W1 + W2 is finite dimensional and
dim(W1 + W2) = dim W1 + dim W2 − dim(W1 ∩ W2).
Definition. Let V and W be vector spaces over a field and T : V → W a linear transformation.
If V is finite dimensional, the rank of T , denoted by rank T , is dim(im T ) and the nullity of T ,
denoted by nullity T , is dim(ker T ).
Theorem 1.4.9. Let V and W be vector spaces over a field F and T : V → W a linear transfor-
mation. If V is finite dimensional, then
rank T + nullity T = dim V.
Corollary 1.4.11. If V is finite dimensional, S and T are linear transformations from V to V , and
T ◦ S is the identity map, then T = S −1 .
From Theorem 1.4.1, we know that the representation of a given vector ~v ∈ V in terms
of a given basis is unique.
Definition. Let V be an n-dimensional vector space over a field F with ordered basis B =
{~v1, . . . , ~vn} and ~v ∈ V. Then ∀~v ∈ V, ∃!(c1, . . . , cn) ∈ F^n, ~v = c1~v1 + · · · + cn~vn. The vector
[~v]_B = \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix} ∈ F^n
is called the coordinate vector of ~v relative to B.
Note that ≅ is an equivalence relation.
Therefore, the theory of finite-dimensional vector spaces can be studied from column vectors
and matrices which we shall pursue in the next chapter.
Exercises for Chapter 1. 1. Let V = R+ be the set of all positive real numbers. Define a vector addition and
a scalar multiplication on V as
v ⊕ w = vw and α ⊙ v = v^α
for all positive real numbers v and w, and α ∈ R. Show that (V, ⊕, ⊙) is a vector space over R.
2. Let V be a vector space over a field F . For c ∈ F and ~v ∈ V , if c~v = ~v , prove that c = 1 or ~v = ~0V .
3. Which of the following are subspaces of M2 (R)?
(a) {A ∈ M2(R) : det A = 0} (b) {A ∈ M2(R) : A = A^T}
(c) {A ∈ M2(R) : A = −A^T} (d) {A ∈ M2(R) : A² = A}
4. Which of the following are subspaces of R^N?
(a) All sequences like (1, 0, 1, 0, . . . ) that include infinitely many zeros.
(b) {(an ) ∈ RN : ∃n0 ∈ N, ∀j ≥ n0 , aj = 0}. (c) All decreasing sequences: aj+1 ≤ aj for all j ∈ N.
(d) All arithmetic sequences: {(an ) ∈ RN : ∃a, d ∈ R, ∀n ∈ N, an = a + (n − 1)d}.
(e) All geometric sequences: {(an) ∈ R^N : ∃a, r ∈ R, ∀n ∈ N, r ≠ 0 ∧ an = ar^{n−1}}.
5. Which of the following are subspaces of V = C 0 [0, 1]?
(a) {f ∈ V : f (0) = 0} (b) {f ∈ V : ∀x ∈ [0, 1], f (x) ≥ 0}
(c) All increasing functions: ∀x, y ∈ [0, 1], x < y ⇒ f (x) ≤ f (y).
6. Let V and W be vector spaces over a field F and T : V → W a linear transformation.
(a) If V1 is a subspace of V , then T (V1 ) = {T (~x) : ~x ∈ V1 } is a subspace of W .
(b) If W1 is a subspace of W , then T −1 (W1 ) = {~x ∈ V : T (~x) ∈ W1 } is a subspace of V .
7. If L, M and N are three subspaces of a vector space V such that M ⊆ L, then show that
L ∩ (M + N ) = (L ∩ M ) + (L ∩ N ) = M + (L ∩ N ).
Also give an example in which the result fails to hold when M ⊈ L. (Hint. Consider Vα of F².)
8. Let S1 and S2 be subsets of a vector space V . Prove that Span(S1 ∪ S2 ) = Span S1 + Span S2 .
9. If ~v1 , ~v2 , ~v3 ∈ V such that ~v1 + ~v2 + ~v3 = ~0, prove that Span{~v1 , ~v2 } = Span{~v2 , ~v3 }.
10. Let S = {~v1 , . . . , ~vn } and c1 , . . . , cn ∈ F r {0}. Prove that:
(a) Span S = Span{c1~v1 , . . . , cn~vn }
(b) S is linearly independent ⇔ {c1~v1 , . . . , cn~vn } is linearly independent.
11. If {~y , ~v1 , . . . , ~vn } is linearly independent, show that {~y + ~v1 , . . . , ~y + ~vn } is also linearly independent.
12. Determine (with reason or counter example) whether the following statements are TRUE or FALSE.
(a) If W1 and W2 are subspaces of V , then W1 ∪ W2 is a subspace of V .
(b) If {~v1 , ~v2 , ~v3 } is a basis of R3 , then {~v1 , ~v1 + ~v2 , ~v1 + ~v2 + ~v3 } is a basis of R3 .
13. Determine whether the following subsets are linearly independent.
(a) {(1, i, −1), (1 + i, 0, 1 − i), (i, −1, −i)} in C3 (b) {x, sin x, cos x} in C 0 (R)
14. Let V be a vector space over a field F . Let ~v1 , ~v2 , . . . , ~vn be vectors in V .
If ~w ∈ Span{~v1, ~v2, . . . , ~vn} \ Span{~v2, . . . , ~vn}, then ~v1 ∈ Span{~w, ~v2, . . . , ~vn} \ Span{~v2, . . . , ~vn}.
15. Prove that if U and V are finite dimensional vector spaces, then dim(U × V ) = dim U + dim V .
16. Find a basis and the dimension of the following subspaces of M2 (R).
(a) {A ∈ M2 (R) : A = AT } (b) {A ∈ M2 (R) : A = −AT }
(c) {A ∈ M2 (R) : ∀B ∈ M2 (R), AB = BA}
17. Let B ∈ M2 (R) and W = {A ∈ M2 (R) : AB = BA}.
Prove that W is a subspace of M2 (R) and dim W ≥ 2.
18. Find a basis for the subspace W = {p(x) ∈ R3 [x] : p(2) = 0} and extend to a basis for R3 [x].
19. Let W1 = Span{(1, 0, 2), (1, −2, 2)} and W2 = Span{(1, 1, 0), (0, 1, −1)} in R3 .
Find dim(W1 ∩ W2 ) and dim(W1 + W2 ).
20. If T : V → W is a linear transformation and B is a basis for V , prove that Span T (B) = im T .
21. Let T : R2 [x] → R3 [x] be given by T (p(x)) = xp(x).
(a) Prove that T is a linear transformation and determine its rank and nullity.
(b) Does T −1 exist? Explain.
22. Suppose that U and V are subspaces of R13 , with dim U = 7 and dim V = 8.
(a) What are the smallest and largest possible dimensions of U ∩ V? Explain.
(b) What are the smallest and largest possible dimensions of U + V? Explain.
23. If V and W are finite-dimensional vector spaces such that dim V > dim W , then there is no one-to-
one linear transformation T : V → W .
24. Let U and W be subspaces of a vector space V. If dim V = 3, dim U = dim W = 2 and U ≠ W, prove
that dim(U ∩ W) = 1.
25. Let U and W be subspaces of a vector space V such that U ∩ W = {~0}.
Assume that ~u1, ~u2 are linearly independent in U and ~w1, ~w2, ~w3 are linearly independent in W.
(a) Prove that {~u1, ~u2, ~w1, ~w2, ~w3} is a linearly independent set in V.
(b) If dim V = 5, show that dim U = 2 and dim W = 3.
2 | Inner Product Spaces
Definition. Let F = R or C and let V be a vector space over F . Let ~u and ~v be vectors in V .
An inner product or scalar product on V is a function from V × V to F , denoted by h·, ·i, with
following properties:
(IN1) ∀~u, ~v, ~w ∈ V, ⟨~u + ~v, ~w⟩ = ⟨~u, ~w⟩ + ⟨~v, ~w⟩.
(IN2) ∀~u, ~v ∈ V, ∀c ∈ F, ⟨c~u, ~v⟩ = c⟨~u, ~v⟩.
(IN3) ∀~u, ~v ∈ V, ⟨~u, ~v⟩ = \overline{⟨~v, ~u⟩}. Here, the bar denotes complex conjugation.
(IN4) ∀~u ∈ V, ⟨~u, ~u⟩ ≥ 0 and [⟨~u, ~u⟩ = 0 ⇒ ~u = ~0].
A vector space over F , in which an inner product is defined, is called an inner product space.
Remarks. 1. For all ~u, ~v ∈ V, ⟨~0, ~u⟩ = 0 = ⟨~u, ~0⟩ and ⟨~u, ~v⟩ = 0 ⇔ ⟨~v, ~u⟩ = 0.
2. If F = R, then (IN3) reads ∀~u, ~v ∈ V, ⟨~u, ~v⟩ = ⟨~v, ~u⟩.
Example 2.1.1. Consider the complex vector space C^n of n-tuples of complex numbers. Let
~u = (u1, u2, . . . , un) and ~v = (v1, v2, . . . , vn). We define
⟨~u, ~v⟩ = u1\overline{v_1} + u2\overline{v_2} + · · · + un\overline{v_n}.
Remark. If we consider, on the other hand, R^n, the space of n-tuples of real numbers, we have a
real-valued scalar product ⟨~u, ~v⟩ = u1v1 + u2v2 + · · · + unvn and the verification of the properties
is exactly like Example 2.1.1, where all conjugation symbols are removed.
Example 2.1.2. Consider V = C^0[a, b], the vector space of real-valued continuous functions de-
fined on the interval [a, b]. Let
⟨f, g⟩ = \int_a^b f(x)g(x)\,dx.
We can add to the list of properties of the scalar product by proving some theorems, assuming
of course that we are dealing with a complex vector space with a scalar product.
⟨c1~u + c2~v, c1~u + c2~v⟩ = c1\bar{c}_1⟨~u, ~u⟩ + c1\bar{c}_2⟨~u, ~v⟩ + \bar{c}_1c2⟨~v, ~u⟩ + c2\bar{c}_2⟨~v, ~v⟩.
In particular, if ⟨~u, ~v⟩ = 0, then
⟨c1~u + c2~v, c1~u + c2~v⟩ = c1\bar{c}_1⟨~u, ~u⟩ + c2\bar{c}_2⟨~v, ~v⟩ = |c1|²⟨~u, ~u⟩ + |c2|²⟨~v, ~v⟩.
The quantity ⟨~u, ~u⟩ is non-negative and is zero if and only if ~u = ~0. Therefore, we associate
with it the square of the length of the vector.
Definition. For ~v ∈ V, we define the length or norm of ~v to be ‖~v‖ = \sqrt{⟨~v, ~v⟩}.
Some of the properties of the norm are given by the next theorem.
Theorem 2.1.3. If V is an inner product space over F , then the norm k · k has the following
properties:
1. ∀~u ∈ V, ‖~u‖ ≥ 0 and ‖~u‖ = 0 ⇔ ~u = ~0
2. ∀~u ∈ V, ∀a ∈ F, ‖a~u‖ = |a|‖~u‖
3. ∀~u, ~v ∈ V, |⟨~u, ~v⟩| ≤ ‖~u‖‖~v‖ (the Cauchy-Schwarz inequality)
4. ∀~u, ~v ∈ V, ‖~u + ~v‖ ≤ ‖~u‖ + ‖~v‖ (the triangle inequality).
Example 2.1.3. Let f be a real-valued continuous function defined on the interval [a, b]. Prove
that
\int_a^b f(x)\,dx ≤ (b − a)M, \quad \text{where } M = \max_{x ∈ [a,b]} |f(x)|.
Definition. Let V be an inner product space over F. Two nonzero vectors ~u and ~v are orthog-
onal if ⟨~u, ~v⟩ = 0. A vector ~u is a unit vector if ‖~u‖ = 1. A set of vectors in which any two
distinct vectors are orthogonal and every vector is a unit vector is an orthonormal set.
Theorem 2.2.3. [Gram-Schmidt Process] Let ~v1, ~v2, . . . , ~vn ∈ V be linearly independent. Then
∀m ∈ {1, . . . , n}, ∃~w1, . . . , ~wm ∈ V such that {~w1, . . . , ~wm} is an orthogonal set and it is a basis
for Span{~v1, . . . , ~vm}.
(2) Span{~w1, . . . , ~wk, ~wk+1} = Span{~v1, . . . , ~vk, ~vk+1}. Again, by the induction hypothesis,
~wk+1 ∈ Span{~w1, . . . , ~wk, ~vk+1} = Span{~v1, . . . , ~vk, ~vk+1}.
Then Span{~w1, . . . , ~wk, ~wk+1} ⊆ Span{~v1, . . . , ~vk, ~vk+1}. For the reverse, we note that
~vk+1 = ~wk+1 + \sum_{i=1}^{k} \frac{⟨~vk+1, ~wi⟩}{‖~wi‖^2}\, ~wi ∈ Span{~w1, . . . , ~wk, ~wk+1}.
Corollary 2.2.4. If V is a finite dimensional inner product space, then V has an orthonormal
basis.
Proof. Let B = {~v1, . . . , ~vm} be a basis for V. Then B is linearly independent. By the Gram-
Schmidt Process, we can construct an orthogonal subset {~w1, . . . , ~wm} of V which is a basis for
Span{~v1, . . . , ~vm} = V. Hence, {~w1, . . . , ~wm} is an orthogonal basis for V, in which we can nor-
malize each vector to obtain an orthonormal basis as desired.
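A direct numpy transcription of the Gram-Schmidt recursion for the real case (the function name and the sample vectors are illustrative, not from the notes):

```python
import numpy as np

def gram_schmidt(vectors):
    """Return an orthogonal list w_1, ..., w_m with the same span as the input."""
    ws = []
    for v in vectors:
        w = v.astype(float)
        for u in ws:
            w = w - (np.dot(v, u) / np.dot(u, u)) * u   # subtract the projection onto u
        ws.append(w)
    return ws

vs = [np.array([1.0, 1.0, 0.0]), np.array([1.0, 0.0, 1.0])]
w1, w2 = gram_schmidt(vs)
assert abs(np.dot(w1, w2)) < 1e-12                       # orthogonality
orthonormal = [w / np.linalg.norm(w) for w in (w1, w2)]  # normalize to get an orthonormal basis
```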
Example 2.2.2. Let H = Span\left\{ \begin{bmatrix} 1 \\ 2i \\ 0 \end{bmatrix}, \begin{bmatrix} 2i \\ 6 \\ -3 \end{bmatrix} \right\} ⊂ C³. Find an orthonormal basis for H.
Example 2.2.3. Let V be the space of continuous functions on [0, 1] and H = Span{1, 3\sqrt{x}, 10x},
a 3-dimensional subspace of V. Use the Gram-Schmidt process to find an orthogonal basis for H.
Definition. Let V be an inner product space over F. For S ⊆ V, the orthogonal complement
of S is the set S^⊥, read “S perp”, defined by
S^⊥ = {~v ∈ V : ⟨~v, ~s⟩ = 0 for all ~s ∈ S}.
Theorem 2.3.3. [Bessel's inequality] Let S = {~v1, . . . , ~vn} be a set of distinct nonzero vectors.
If S is an orthogonal set, then for all ~v ∈ V,
\sum_{i=1}^{n} \frac{|⟨~v, ~vi⟩|^2}{‖~vi‖^2} ≤ ‖~v‖^2.
Theorem 2.3.4. V = W1 ⊕ W2
⇔ every vector ~v ∈ V can be expressed uniquely as ~v = ~w1 + ~w2 with ~w1 ∈ W1 and ~w2 ∈ W2.
10. Prove that the finite sequence a0 , a1 , . . . , an of positive real numbers is a geometric progression if
and only if
(a0 a1 + a1 a2 + · · · + an−1 an )2 = (a20 + a21 + · · · + a2n−1 )(a21 + a22 + · · · + a2n ).
11. Let P(x) be a polynomial with positive real coefficients. Prove that
\sqrt{P(a)P(b)} ≥ P(\sqrt{ab})
for all a, b ≥ 0.
12. Let V be an n-dimensional inner product space and m < n. If {~v1 , . . . , ~vm } is an orthonormal set,
then there exist ~vm+1, . . . , ~vn ∈ V such that {~v1, . . . , ~vn} is an orthonormal basis for V.
13. Prove the following statements.
(a) ∀S1 , S2 ⊆ V, S1 ⊆ S2 ⇒ S1⊥ ⊇ S2⊥ . (b) ∀S ⊆ V, (Span S)⊥ = S ⊥ .
(c) For S ⊆ V, if ~u ∈ S and ~v ∈ S^⊥, then ‖~u + ~v‖² = ‖~u‖² + ‖~v‖².
16. Consider the inner product space C^0[−1, 1]. Suppose that f and g are continuous on [−1, 1] and
‖f − g‖ ≤ 5. Let
u_1(x) = \frac{1}{\sqrt{2}} \quad and \quad u_2(x) = \sqrt{\frac{3}{2}}\, x \quad for x ∈ [−1, 1].
Write
a_j = \int_{-1}^{1} u_j(x)f(x)\,dx \quad and \quad b_j = \int_{-1}^{1} u_j(x)g(x)\,dx
for j = 1, 2. Show that |a1 − b1|² + |a2 − b2|² ≤ 25. (Hint. Use Bessel's inequality.)
17. If V is a finite dimensional inner product space and W is a subspace of V , prove that (W ⊥ )⊥ = W .
18. If {~v1 , ~v2 } is a basis for V , show that V = Span{~v1 } ⊕ Span{~v2 }.
19. Consider the subspace Vα, α ∈ R, of R². Prove that if α ≠ β, then R² = Vα ⊕ Vβ.
20. Let V = R^R be the space of all functions from R to R. Let
Ve = {f ∈ V : ∀x ∈ R, f(−x) = f(x)} and Vo = {f ∈ V : ∀x ∈ R, f(−x) = −f(x)},
the sets of all even and odd functions, respectively. Prove the following statements.
(a) Ve and Vo are subspaces of V . (b) V = Ve ⊕ Vo .
21. Let S be a set of vectors in a finite dimensional inner product space V. Suppose that “⟨~u, ~v⟩ = 0 for
all ~u ∈ S implies ~v = ~0”. Show that V = Span S.
22. Let R^N be the sequence space of real numbers. Let V = {(an) ∈ R^N : only finitely many ai ≠ 0}.
(a) Prove that V is a subspace of RN .
(b) Given (an), (bn) ∈ V, define
⟨(an), (bn)⟩ = \sum_{n=1}^{\infty} a_n b_n.
(Note that this makes sense since only finitely many ai and bi are nonzero.) Show that this defines
an inner product on V.
(c) Let U = \left\{ (an) ∈ V : \sum_{n=1}^{\infty} a_n = 0 \right\}.
Show that U is a subspace of V such that U^⊥ = {~0}, U + U^⊥ ≠ V and U ≠ U^{⊥⊥}.
3 | Matrices
Definition. Consider a system of m linear equations in n unknowns x1, . . . , xn with coefficients over a field F. If all the constant terms are zero, the system is called homogeneous. A homogeneous system
always has a trivial solution, namely the solution obtained by letting all xj = 0. Other nonzero
solutions (if any) are called nontrivial solutions.
Remark. If A is an m × n matrix, then rank A ≤ n and rank A is the maximum number of linearly
independent columns of A by Corollary 1.4.5.
Examples 3.1.1. Consider the following augmented matrices. Write down their general solutions
(if any).
1. \begin{bmatrix} 1 & -3 & 4 & 7 \\ 0 & 1 & 2 & 2 \\ 0 & 0 & 1 & 5 \end{bmatrix}
2. \begin{bmatrix} 1 & -3 & 7 & 0 \\ 0 & 1 & 4 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \quad and \quad \begin{bmatrix} 1 & -3 & 7 & 1 \\ 0 & 1 & 4 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix}
3. \begin{bmatrix} 1 & 0 & 3 & 0 \\ 0 & 1 & -2 & 0 \end{bmatrix} \quad and \quad \begin{bmatrix} 1 & 0 & 3 & 5 \\ 0 & 1 & -2 & 1 \end{bmatrix}
4. \begin{bmatrix} 1 & -4 & -2 & 0 & 3 & -5 \\ 0 & 1 & 0 & 0 & -1 & -1 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}
Every solution ~z of a consistent system A~x = ~b can be written as
~z = ~y + ~yp,
where ~y is a solution of the homogeneous system A~x = ~0m and ~yp is a particular solution, i.e., A~yp = ~b.
Definition. The main part of the algorithms used for solving simultaneous linear systems with
coefficients in F consists of the elementary row operations. They make repeated use of three
operations on the linear system or on its augmented matrix, each of which preserves the set of
solutions because its inverse is an operation of the same kind:
1. (Interchange, Rij ) Interchange the ith row and the jth row.
2. (Scaling, cRi ) Multiply the ith row by a nonzero scalar c.
3. (Replacement, Ri + cRj ) Replace the ith row by the sum of it and a scalar c multiple of
the jth row.
The elementary column operations are defined in a similar way.
Operation            Reverse
R_{ij}               R_{ij}
cR_i, c ≠ 0          (1/c)R_i
R_i + cR_j           R_i − cR_j
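Each row operation can be realized as left-multiplication by a small matrix, and the reverse operation in the table corresponds to the inverse matrix; a short sympy sketch (the 2 × 3 matrix M is an arbitrary illustration):

```python
from sympy import Matrix, Rational

M = Matrix([[1, 2, 3],
            [4, 5, 6]])

R12 = Matrix([[0, 1], [1, 0]])    # interchange rows 1 and 2
S3  = Matrix([[3, 0], [0, 1]])    # scale row 1 by 3
T2  = Matrix([[1, 2], [0, 1]])    # replace row 1 by row 1 + 2*row 2

# Each operation is undone by the reverse operation of the same kind from the table.
assert R12 * (R12 * M) == M                                     # R_12 reversed by R_12
assert Matrix([[Rational(1, 3), 0], [0, 1]]) * (S3 * M) == M    # 3R_1 reversed by (1/3)R_1
assert Matrix([[1, -2], [0, 1]]) * (T2 * M) == M                # R_1 + 2R_2 reversed by R_1 - 2R_2
```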
Definition. Two linear systems are said to be equivalent if they have the same set of solutions.
Theorem 3.2.1. Suppose that a sequence of elementary operations is performed on a linear system.
Then the resulting system has the same set of solutions as the original, so the two linear systems
are equivalent.
Proof. It is clear from the way we do the row reductions that if c1, c2, . . . , cn satisfy the original
system, then they also satisfy the reduced system. Since the elementary row operations are re-
versible, if we start with the reduced system, the original system can be recovered. Hence, it is clear
that any solution of the reduced system is also a solution of the original system.
Definition. A rectangular matrix is in echelon form (or row-echelon form) if it has the fol-
lowing three properties:
1. All nonzero rows are above any rows of all zeros.
2. Each leading entry of a row is in a column to the right of the leading entry of the row
above it.
3. All entries in a column below a leading entry are zero.
If a matrix in echelon form satisfies the following additional conditions, then it is in reduced
echelon form (or reduced row-echelon form):
4. The leading entry in each nonzero row is 1, called the leading 1.
5. Each leading 1 is the only nonzero entry in its column.
An echelon matrix (respectively, reduced echelon matrix) is one that is in echelon form (re-
spectively, reduced echelon form).
Theorem 3.2.2. Every matrix can be brought to a reduced echelon matrix by a finite sequence of
elementary row operations.
Definition. Let A be an n × n matrix. We say that A is invertible or nonsingular and has the
n × n matrix B as inverse if AB = BA = In .
Theorem 3.2.3. Suppose A and B are invertible matrices of the same size. Then the following
results hold:
(a) A−1 is invertible and (A−1 )−1 = A, i.e., A is the inverse of A−1 .
(b) AB is invertible and (AB)−1 = B −1 A−1 .
(c) AT is invertible and (AT )−1 = (A−1 )T .
Corollary 3.2.5. If A and C are square matrices such that AC = I, then also CA = I. In
particular, A and C are invertible, C = A−1 and A = C −1 .
Remark. Elementary matrices are invertible because row operations are reversible. To find the in-
verse of an elementary matrix E, determine the elementary row operation needed to transform E
back into I and apply this operation to I to obtain the inverse.
Example 3.2.2. Find the inverses of the elementary matrices given in Example 3.2.1
Theorem 3.2.10. A square matrix is invertible if and only if it is a product of elementary matrices.
Remark. From the above theorem, we obtain an algorithm to find A−1 if A is invertible. Namely,
we start with the block matrix [A : I] and row reduce it until we reach the final reduced echelon
form [I : U ] (because A is row equivalent to I by Theorem 3.2.4). Then we have U = A−1 .
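The [A : I] → [I : A^{-1}] procedure can be traced with sympy's rref applied to the augmented block matrix; a sketch (the matrix A below is an arbitrary invertible illustration):

```python
from sympy import Matrix, eye

A = Matrix([[2, 1],
            [5, 3]])                      # det = 1, so A is invertible

augmented = A.row_join(eye(2))            # the block matrix [A : I]
R, _ = augmented.rref()                   # reduced echelon form [I : U]
U = R[:, 2:]                              # right-hand block

assert R[:, :2] == eye(2)                 # left block reduces to the identity
assert A * U == eye(2) and U == A.inv()   # hence U = A^{-1}
```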
Example 3.2.4. Express A = \begin{bmatrix} -2 & 3 \\ 1 & 0 \end{bmatrix} as a product of elementary matrices.
Lemma 3.3.1. Let V be a vector space over a field F . Let ~v1 , . . . , ~vn be in V .
1. Span{~v1, . . . , ~vn} = Span{~v1, . . . , c~vi, . . . , ~vn} for all i ∈ {1, . . . , n} and c ∈ F nonzero.
2. Span{~v1 , . . . , ~vn } = Span{~v1 , . . . , ~vi + c~vj , . . . , ~vj , . . . , ~vn } for all i 6= j and c ∈ F .
Corollary 3.3.6. Let A, B, U and V be matrices of sizes for which the indicated products are
defined.
1. Col(AV ) ⊆ Col A, with equality if V is (square and) invertible.
2. Row(U A) ⊆ Row A, with equality if U is (square and) invertible.
3. rank AB ≤ rank A and rank AB ≤ rank B.
Let A be an m × n matrix of rank r, and let R be the reduced row-echelon form of A. Theorem
3.2.9 shows that R = UA where U is invertible, and that U can be found by [A : Im] → [R : U].
The matrix R has r leading ones (since rank A = r) so, as R is reduced, the n × m matrix R^T
contains each row of Ir in the first r columns. Thus, row operations will carry
R^T → \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}_{n \times m}.
Hence, Theorem 3.2.9 (again) shows that \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}_{n \times m} = U_1 R^T where U_1 is an n × n invertible
matrix. Writing V = U_1^T, we obtain
UAV = RV = R U_1^T = (U_1 R^T)^T = \left( \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}_{n \times m} \right)^T = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}_{m \times n}.
Moreover, the matrix U_1^T = V can be computed by [R^T : I_n] → \left[ \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}_{n \times m} : V^T \right]. This proves
Theorem 3.3.7. Let A be an m × n matrix of rank r. There exist invertible matrices U and V of
size m × m and n × n, respectively, such that
UAV = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}_{m \times n},
Proof. Observe first that UR = S for some invertible matrix U (by Theorem 3.2.9 there exist
invertible matrices P and Q such that R = PA and S = QA; take U = QP^{-1}). We show that
R = S by induction on the number m of rows of A. The case m = 1 is trivial because we can
perform only scaling. If ~rj and ~sj denote the jth column of R and of S, respectively, the fact that
UR = S gives
U~rj = ~sj for each j. (3.3.1)
Since U is invertible, this shows that R and S have the same zero columns. Hence, by passing to
the matrices obtained by deleting the zero columns from R and S, we may assume that R and S
have no zero columns.
But then the first column of R and S is the first column of Im because they are reduced row-
echelon, so (3.3.1) forces the first column of U to be the first column of Im. Now, write U, R and
S in block form as follows:
U = \begin{bmatrix} 1 & X \\ 0 & V \end{bmatrix}, \quad R = \begin{bmatrix} 1 & Y \\ 0 & R' \end{bmatrix} \quad and \quad S = \begin{bmatrix} 1 & Z \\ 0 & S' \end{bmatrix},
where R' and S' are reduced row-echelon matrices with m − 1 rows. Then block multiplication in
UR = S gives VR' = S', so the induction hypothesis yields R' = S'; since R and S are both reduced, the first rows must then agree as well. That is, S = R. This
completes the proof.
The set of all such permutations is denoted by Sn , and the number of such permutations is n!.
Example 3.4.1. S2 = {12, 21} and S3 = {123, 132, 213, 231, 312, 321}.
We say that σ is an even permutation ⇔ |Iσ | is even, and an odd permutation ⇔ |Iσ | is odd.
We then define the sign or parity of σ, written sgn σ, by
sgn σ = \begin{cases} 1 & \text{if } σ \text{ is even,} \\ -1 & \text{if } σ \text{ is odd.} \end{cases}
Then
σ(g) = \begin{cases} g & \text{if } σ \text{ is even,} \\ -g & \text{if } σ \text{ is odd.} \end{cases}
That is, σ(g) = (sgn σ)g.
Thus, the product of two even or two odd permutations is even, and the product of an odd and an
even permutation is odd.
Definition. The determinant of A = [aij], denoted by det A or |A|, is the sum of all the above
n! products where each such product is multiplied by sgn σ. That is,
|A| = \sum_{σ ∈ S_n} (\operatorname{sgn} σ)\, a_{1j_1} a_{2j_2} \cdots a_{nj_n} = \sum_{σ ∈ S_n} (\operatorname{sgn} σ)\, a_{1σ(1)} a_{2σ(2)} \cdots a_{nσ(n)}.
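For small n the permutation formula can be evaluated directly; a Python sketch using itertools, with the sign computed by counting inversions (the 3 × 3 matrix is an arbitrary illustration):

```python
from itertools import permutations
from math import prod

def sgn(sigma):
    # sign of a permutation via the number of inversions |I_sigma|
    inversions = sum(1 for i in range(len(sigma)) for j in range(i + 1, len(sigma))
                     if sigma[i] > sigma[j])
    return 1 if inversions % 2 == 0 else -1

def det(A):
    n = len(A)
    return sum(sgn(s) * prod(A[i][s[i]] for i in range(n))
               for s in permutations(range(n)))

A = [[1, 2, 3],
     [0, 4, 5],
     [1, 0, 6]]
assert det(A) == 22   # agrees with cofactor expansion along the first row
```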
Theorem 3.4.4. The determinant of a matrix A and its transpose are equal. That is, |A| = |AT |.
Remark. By this theorem, any theorem about the determinant of a matrix A that concerns the
rows of A will have an analogous theorem concerning the columns of A.
Lemma 3.4.5. For k < l, τ = \begin{pmatrix} 1 & 2 & \cdots & k & \cdots & l & \cdots & n \\ 1 & 2 & \cdots & l & \cdots & k & \cdots & n \end{pmatrix} is an odd permutation in S_n.
Proof. Assume that the kth and lth rows of A are identical, with k < l.
That is, a_{kj} = a_{lj} for all j ∈ {1, . . . , n}.
In particular, for any σ ∈ S_n, a_{kσ(l)} = a_{lσ(l)} and a_{kσ(k)} = a_{lσ(k)}.
Let τ = \begin{pmatrix} 1 & 2 & \cdots & k & \cdots & l & \cdots & n \\ 1 & 2 & \cdots & l & \cdots & k & \cdots & n \end{pmatrix}.
Then sgn τ = −1 and σ(τ(j)) = σ(j) for all j ∈ {1, . . . , n} \ {k, l}. Also,
sgn(στ ) = (sgn σ)(sgn τ ) = − sgn σ.
As σ runs through all even permutations, στ runs through all odd permutations, and vice versa.
Thus
|A| = \sum_{σ ∈ S_n} (\operatorname{sgn} σ)\, a_{1σ(1)} a_{2σ(2)} \cdots a_{kσ(k)} \cdots a_{lσ(l)} \cdots a_{nσ(n)}
= \sum_{σ \text{ even}} \big[ (\operatorname{sgn} σ)\, a_{1σ(1)} \cdots a_{kσ(k)} \cdots a_{lσ(l)} \cdots a_{nσ(n)} + (\operatorname{sgn}(στ))\, a_{1στ(1)} \cdots a_{kστ(k)} \cdots a_{lστ(l)} \cdots a_{nστ(n)} \big]
= \sum_{σ \text{ even}} \big[ (\operatorname{sgn} σ)\, a_{1σ(1)} \cdots a_{kσ(k)} \cdots a_{lσ(l)} \cdots a_{nσ(n)} - (\operatorname{sgn} σ)\, a_{1σ(1)} \cdots a_{kσ(l)} \cdots a_{lσ(k)} \cdots a_{nσ(n)} \big]
= \sum_{σ \text{ even}} \big[ (\operatorname{sgn} σ)\, a_{1σ(1)} \cdots a_{kσ(k)} \cdots a_{lσ(l)} \cdots a_{nσ(n)} - (\operatorname{sgn} σ)\, a_{1σ(1)} \cdots a_{lσ(l)} \cdots a_{kσ(k)} \cdots a_{nσ(n)} \big]
= 0.
Lemma 3.4.12. Let E be an elementary matrix. Then |EA| = |E||A| for any matrix A. In
particular, if E1, E2, . . . , Es are elementary matrices, then
|E1 E2 \cdots E_s A| = |E1||E2| \cdots |E_s||A|.
Theorem 3.4.14. The determinant of a product of two matrices A and B is the product of their
determinants; that is |AB| = |A||B|.
Definition. Consider an n-square matrix A = [aij ]. Let Mij (A) denote the (n − 1)-square
submatrix of A obtained by deleting its ith row and jth column. The determinant |Mij (A)| is
called the minor of the element aij of A, and we define the cofactor of aij , denoted by Cij (A),
to be the “signed” minor:
Cij (A) = (−1)i+j |Mij (A)|.
Recall that
|A| = \sum_{σ ∈ S_n} (\operatorname{sgn} σ)\, a_{1σ(1)} a_{2σ(2)} \cdots a_{nσ(n)}
= aij Cij (A) + (terms which do not contain aij as a factor).
Lemma 3.4.15. Cij (A) = Cij (A) for all i, j ∈ {1, . . . , n}.
Theorem 3.4.16. [Laplace] The determinant of a square matrix A = [aij ] is equal to the sum of
the products obtained by multiplying the elements of any row (column) by their respective cofactors:
|A| = a_{i1}C_{i1}(A) + a_{i2}C_{i2}(A) + \cdots + a_{in}C_{in}(A) = \sum_{j=1}^{n} a_{ij}C_{ij}(A)
and
|A| = a_{1j}C_{1j}(A) + a_{2j}C_{2j}(A) + \cdots + a_{nj}C_{nj}(A) = \sum_{i=1}^{n} a_{ij}C_{ij}(A)
for all i, j ∈ {1, 2, . . . , n}.
Remark. The above formulas for |A| are called the Laplace expansions of the determinant of A
by the ith row and the jth column, respectively. Together with the elementary row operations, they offer a
method of simplifying the computation of |A|.
Next we proceed to prove the lemma.
Thus,
C_{nn}(B) = \sum_{\substack{σ ∈ S_n \\ σ(n) = n}} (\operatorname{sgn} σ)\, b_{1σ(1)} b_{2σ(2)} \cdots b_{n-1,σ(n-1)}
= \sum_{τ ∈ S_{n-1}} (\operatorname{sgn} τ)\, b_{1τ(1)} b_{2τ(2)} \cdots b_{n-1,τ(n-1)}.
Write
A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1j} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2j} & \cdots & a_{2n} \\
\vdots & & & \vdots & & \vdots \\
a_{i1} & a_{i2} & \cdots & a_{ij} & \cdots & a_{in} \\
\vdots & & & \vdots & & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nj} & \cdots & a_{nn}
\end{bmatrix}.
To compute C_{ij}(A), we transform A into A' by interchanging rows n − i times and columns n − j
times as shown:
A' = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1,j-1} & a_{1,j+1} & \cdots & a_{1n} & a_{1j} \\
a_{21} & a_{22} & \cdots & a_{2,j-1} & a_{2,j+1} & \cdots & a_{2n} & a_{2j} \\
\vdots & & & & & & & \vdots \\
a_{i-1,1} & a_{i-1,2} & \cdots & a_{i-1,j-1} & a_{i-1,j+1} & \cdots & a_{i-1,n} & a_{i-1,j} \\
a_{i+1,1} & a_{i+1,2} & \cdots & a_{i+1,j-1} & a_{i+1,j+1} & \cdots & a_{i+1,n} & a_{i+1,j} \\
\vdots & & & & & & & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{n,j-1} & a_{n,j+1} & \cdots & a_{nn} & a_{nj} \\
a_{i1} & a_{i2} & \cdots & a_{i,j-1} & a_{i,j+1} & \cdots & a_{in} & a_{ij}
\end{bmatrix}.
Hence,
|A'| = (-1)^{(n-i)+(n-j)}|A| = (-1)^{-i-j}|A|.
That is, |A| = (-1)^{i+j}|A'|.
Therefore, the coefficient of a_{ij} in |A| is (-1)^{i+j}C_{nn}(A') = (-1)^{i+j}|M_{ij}(A)| = C_{ij}(A).
Definition. Let A = [aij ] be an n × n matrix and let Cij (A) denote the cofactor of aij . The
classical adjoint of A, denoted by adj A, is the transpose of the matrix of the cofactors of A,
namely,
adj A = [Cij (A)]T .
We say “classical adjoint” here instead of simply “adjoint” because the term “adjoint” will be
used for an entirely different concept.
For any n × n matrix A and any ~b ∈ F^n, let A_j(~b) be the matrix obtained from A by replacing
the jth column by the vector ~b, that is,
A_j(~b) = \begin{bmatrix} ~a_1 & \cdots & ~b & \cdots & ~a_n \end{bmatrix} \quad (~b \text{ in the } j\text{th column})
for all j = 1, 2, . . . , n.
Theorem 3.4.18. [Cramer's rule] Let A be an invertible n × n matrix. For any ~b ∈ F^n, the unique
solution ~x of A~x = ~b has entries given by
x_j = \frac{|A_j(~b)|}{|A|}, \quad j = 1, 2, . . . , n.
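Cramer's rule is easy to transcribe symbolically; a sympy sketch for a 2 × 2 system (the matrix A, the vector b and the helper name A_j are arbitrary illustrations):

```python
from sympy import Matrix

A = Matrix([[2, 1],
            [5, 3]])
b = Matrix([3, 7])

def A_j(A, b, j):
    """A with its j-th column replaced by b (0-indexed)."""
    M = A.copy()
    M[:, j] = b
    return M

# x_j = |A_j(b)| / |A|
x = Matrix([A_j(A, b, j).det() / A.det() for j in range(A.cols)])
assert A * x == b
```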
Exercises for Chapter 3. 1. The following matrices are echelon forms of coefficient matrices of linear
systems. Which has a unique solution? Why?
(a) \begin{bmatrix} 1 & 2 & 3 & 4 \\ 0 & 1 & 2 & 3 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 1 \end{bmatrix}
(b) \begin{bmatrix} 1 & 2 & 3 & 4 \\ 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}
2. Find the general solution to the linear system
x1 + 2x2 + x3 − 2x4 = 5
2x1 + 4x2 + x3 + x4 = 9
3x1 + 6x2 + 2x3 − x4 = 14
9. Find the number c so that (if possible) the rank of A is (a) 1 (b) 2 (c) 3, where
A = \begin{bmatrix} 6 & 4 & 2 \\ -3 & -2 & -1 \\ 9 & 6 & c \end{bmatrix}.
10. Suppose A = \begin{bmatrix} 1 & 2 & 1 & b \\ 2 & a & 1 & 8 \\ * & * & * & * \end{bmatrix} has the reduced echelon form R = \begin{bmatrix} 1 & 2 & 0 & 3 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 0 \end{bmatrix}.
(a) Find a and b. (b) Solve A~x = ~0.
11. Let A be an m × n matrix for which
A~x = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} has no solutions and A~x = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} has a unique solution.
(a) Give all possible information about m and n and the rank of A.
(b) Find all solutions of A~x = ~0 and explain your answer.
12. Let A be a 3 × 4 matrix for which
A~x = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} has no solutions and A~x = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} has more than one solution.
−1 −1 . . . n 1 1 ... c
Compute A−1 .
18. (a) For which values of the parameter c is A = \begin{bmatrix} -2 & 1 & c \\ 0 & -1 & 1 \\ 1 & 2 & 0 \end{bmatrix} invertible?
(b) For which values of e is the matrix A = \begin{bmatrix} 5 & e & e \\ e & e & e \\ 1 & 2 & e \end{bmatrix} not invertible?
19. Let A = \begin{bmatrix} a & b & b \\ a & a & b \\ a & a & a \end{bmatrix}. If a ≠ 0 and a ≠ b, prove that A is invertible and find A^{-1} in terms of a and b.
20. Show that if A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ a & b & c \end{bmatrix} is an elementary matrix, then at least one entry in the third row must
be zero.
21. In each case find an elementary matrix E such that B = EA.
(a) A = \begin{bmatrix} 2 & 1 \\ 3 & -1 \end{bmatrix}, B = \begin{bmatrix} 3 & -1 \\ 2 & 1 \end{bmatrix}
(b) A = \begin{bmatrix} 2 & 1 \\ 3 & -1 \end{bmatrix}, B = \begin{bmatrix} -1 & -3 \\ 3 & -1 \end{bmatrix}
22. In each case find an invertible matrix U such that UA = B, and express U as a product of elementary
matrices.
(a) A = \begin{bmatrix} 2 & 1 & 3 \\ -1 & 1 & 2 \end{bmatrix}, B = \begin{bmatrix} 1 & -1 & -2 \\ 3 & 0 & 1 \end{bmatrix}
(b) A = \begin{bmatrix} 2 & -1 & 0 \\ 1 & 1 & 1 \end{bmatrix}, B = \begin{bmatrix} 3 & 0 & 1 \\ 2 & -1 & 0 \end{bmatrix}
23. In each case find invertible matrices U and V such that UAV is in the Smith normal form.
(a) A = \begin{bmatrix} 1 & 1 \\ -2 & -2 \end{bmatrix}
(b) A = \begin{bmatrix} -1 & 3 \\ 4 & 2 \end{bmatrix}
(c) A = \begin{bmatrix} 2 & 1 & -2 \\ 1 & -2 & 4 \end{bmatrix}
(d) A = \begin{bmatrix} 1 & -1 & 2 & 1 \\ 2 & -1 & 0 & 3 \\ 0 & 1 & -4 & 1 \end{bmatrix}
24. Let F be a field and A = [aij] ∈ Mn(F). Define the trace of A to be the sum of the diagonal elements,
that is,
tr A = \sum_{i=1}^{n} a_{ii}.
0 0 0 4
27. If A is an n × n matrix such that A2 = A and rank A = n, prove that A = In .
A \xrightarrow{R_1 + 3R_2} A_1 \xrightarrow{R_{23}} A_2 \xrightarrow{3R_2 - R_1} A_3 \xrightarrow{R_1 - 3R_2} A_4 \xrightarrow{2R_1} A_5.
Determine the values of |A1|, |A2|, |A3|, |A4| and |A5|, respectively.
38. If A is an invertible square matrix of order n > 1, show that det(adj A) = (det A)n−1 .
What is det(adj A) if A is not invertible? Prove your answer.
39. Let A, B, C be 3 × 3 matrices with det A = 3, det(B³) = −8, det C = 2. Compute
(a) det(ABC) (b) det(5AC^T) (c) det(A³B^{-3}C^{-1}) (d) det[B^{-1}(adj C)].
40. Show that adj(A^T) = (adj A)^T.
41. Show that if A is invertible and n > 2, then adj (adj A) = (det A)n−2 A.
42. If A and B are invertible, show that adj(AB) = (adj B)(adj A).
43. Prove that if A is an invertible upper triangular matrix (all entries lying below the diagonal are zero),
then adj A and A−1 are upper triangular.
44. Suppose that the real-valued functions f1(x), f2(x), . . . , fk(x) are all defined and are differentiable
k − 1 times on the interval [a, b]. The Wronskian of the set of functions is defined on this interval to
be the determinant
W(x) = \begin{vmatrix}
f_1(x) & f_2(x) & \cdots & f_k(x) \\
f_1'(x) & f_2'(x) & \cdots & f_k'(x) \\
f_1''(x) & f_2''(x) & \cdots & f_k''(x) \\
\vdots & \vdots & & \vdots \\
f_1^{(k-1)}(x) & f_2^{(k-1)}(x) & \cdots & f_k^{(k-1)}(x)
\end{vmatrix}.
Prove that a set of real-valued functions {f1(x), f2(x), . . . , fk(x)}, differentiable k − 1 times on the
interval [a, b], is linearly independent if W(x0) ≠ 0 at some point x0 in the interval.
45. Consider the interval [−1, 1] and the two functions defined by
f(x) = \begin{cases} 0 & \text{if } -1 \le x \le 0, \\ x^2 & \text{if } 0 \le x \le 1, \end{cases} \qquad g(x) = \begin{cases} x^2 & \text{if } -1 \le x \le 0, \\ 0 & \text{if } 0 \le x \le 1. \end{cases}
These functions are both differentiable. Show that f and g are linearly independent but W (x) = 0
for all x ∈ [−1, 1]. This provides an example to prove that the converse of the previous problem does
not hold.
46. (a) Show that the functions 1, x, x2 , . . . , xk are linearly independent in the function space C 0 [0, 1].
(b) Show that the functions sin x, sin 2x, sin 3x, . . . , sin kx are linearly independent in the function
space C 0 [0, 2π]. (Hint. Use the Wronskian.)
V_2 = \begin{vmatrix} 1 & x_1 \\ 1 & x_2 \end{vmatrix} = x_2 - x_1 \quad and \quad V_3 = \begin{vmatrix} 1 & x_1 & x_1^2 \\ 1 & x_2 & x_2^2 \\ 1 & x_3 & x_3^2 \end{vmatrix} = (x_2 - x_1)(x_3 - x_1)(x_3 - x_2).
In general,
V_n = \begin{vmatrix} 1 & x_1 & \cdots & x_1^{n-1} \\ 1 & x_2 & \cdots & x_2^{n-1} \\ \vdots & \vdots & & \vdots \\ 1 & x_n & \cdots & x_n^{n-1} \end{vmatrix} = \prod_{i<j} (x_j - x_i).
This determinant is called the Vandermonde determinant. (Hint. To do the induction easily, multi-
ply each column by x1 and subtract it from the next column on the right, starting from the right-hand
side. We shall find that V_n = (x_n - x_1) \cdots (x_2 - x_1)V_{n-1}.)
4 | Linear Transformations
Definition. Let V and W be two vector spaces over F. We write L(V, W) for the set of all linear
transformations from V to W, that is,
L(V, W) = {T : V → W | T is a linear transformation}.
Then L(V, W) is a vector space over F with the operations defined, for S, T ∈ L(V, W), by
(S + T)(~v) = S(~v) + T(~v) and (cT)(~v) = c T(~v)
for all ~v ∈ V and c ∈ F. Note that the zero function is its zero vector and (−T)(~v) = −T(~v) for
all ~v ∈ V.
Remark. By Theorem 1.4.1, for a given basis B = {~v1, ~v2, . . . , ~vn} for an n-dimensional vector
space V, there exists a unique linear transformation T : V → W such that T(~vi) = ~wi ∈ W for
all i ∈ {1, 2, . . . , n}. Then for S, T ∈ L(V, W), (S(~vi) = T(~vi) for all i ∈ {1, 2, . . . , n}) ⇒ S = T.
Hence, to show that two linear transformations are identical, it suffices to check the equality on some
basis of V.
Theorem 4.1.1. Let B = {~v1, . . . , ~vn} be a basis for V and let C = {~w1, . . . , ~wm} be a basis for W.
For each i ∈ {1, . . . , n} and j ∈ {1, . . . , m}, we define
T_{ij}(~v_k) = \begin{cases} ~w_j & \text{if } i = k, \\ ~0_W & \text{if } i ≠ k, \end{cases}
for all k ∈ {1, . . . , n}. By Theorem 1.4.1, T_{ij} ∈ L(V, W) for all i, j. Then
{T_{ij} : i ∈ {1, . . . , n}, j ∈ {1, . . . , m}}
is a basis for L(V, W). Hence, if dim V = n and dim W = m, then dim L(V, W) = mn.
Definition. The space V^* = L(V, F) is called the dual space of V and V^{**} = (V^*)^* is called the double dual of V.
Remarks. 1. For f ∈ V^*,
(a) f ≠ 0 ⇒ im f = F
(b) if V is finite dimensional and f ≠ 0, then nullity f = (dim V) − 1.
2. For ~v ∈ V , if f (~v ) = 0 for all f ∈ V ∗ , then ~v = ~0.
Theorem 4.1.3. 1. The map θ : ~v 7→ L~v is a 1-1 linear transformation from V into V ∗∗ .
2. If V is finite-dimensional, then
(a) the map θ : ~v 7→ L~v is an isomorphism of V onto V ∗∗
(b) ∀L ∈ V ∗∗ , ∃!~v ∈ V, L = L~v .
Corollary 4.1.4. If V is finite dimensional, then each basis of V ∗ is the dual of some basis of V .
Example 4.1.2. Consider V = R2[x], the vector space of all polynomials of degree at most 2
over R. Let t1, t2, t3 be three distinct real numbers and let fi(p(x)) = p(ti) for all p(x) ∈ R2[x] and
i = 1, 2, 3.
Show that {f1 , f2 , f3 } is a basis of V ∗ and find a basis of V such that {f1 , f2 , f3 } is its dual
basis.
1. The map π : V → V/W given by π : ~v 7→ ~v + W for all ~v ∈ V is an onto linear transformation.
Its kernel is equal to W. This map π is called the canonical projection from V onto V/W.
2. If V is a finite dimensional vector space and W is a subspace of V , then V /W is finite
dimensional and dim(V /W ) = dim V − dim W .
Theorem 4.2.2. [Isomorphism Theorem] Let V and W be two vector spaces over a field F and
T : V → W a linear transformation. Then
V/(ker T) ≅ im T.
Definition. Let V be an n-dimensional vector space over a field F with an ordered basis B =
{~v1, ~v2, . . . , ~vn} and ~v ∈ V. Then ∀~v ∈ V, ∃!(c1, . . . , cn) ∈ F^n,
~v = c1~v1 + c2~v2 + · · · + cn~vn \quad and \quad [~v]_B = \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix} ∈ F^n,
called the coordinate vector of ~v relative to B.
Example 4.3.1. Let B = {(1, 1, 0, 0), (1, 0, 1, 0), (1, 1, 1, 0), (0, 0, 0, 2i)} be an ordered basis for C4 .
Find [(2, −16, 3, −i)]B .
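Coordinates relative to B are found by solving a linear system whose columns are the basis vectors; a sympy sketch for Example 4.3.1 (the point is the computation, which can be checked by hand):

```python
from sympy import Matrix, I, Rational

# Columns of P are the vectors of the ordered basis B from Example 4.3.1.
P = Matrix([[1, 1, 1, 0],
            [1, 0, 1, 0],
            [0, 1, 1, 0],
            [0, 0, 0, 2*I]])
v = Matrix([2, -16, 3, -I])

coords = P.solve(v)       # [v]_B solves P [v]_B = v
assert coords == Matrix([-1, 18, -15, Rational(-1, 2)])
```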
We recall Theorems 1.4.12 and 1.4.13 as follows.
Theorem 4.3.1. Let V be an n-dimensional vector space over F and B a basis for V .
~ ∈ V and c ∈ F , we have [~v + w]
1. For ~v , w ~ B = [~v ]B + [w]
~ B and [c~v ]B = c[~v ]B .
2. The map ~v 7→ [~v ]B is an isomorphism from V onto F n .
This also implies ∀~u, ~v ∈ V, [~u]B = [~v ]B ⇔ ~u = ~v .
Definition. The matrix [T]^C_B is called the matrix for T relative to the ordered bases B and
C. If V = W and B = C, then we write [T]_B for [T]^B_B. In addition, if T : F^n → F^n is a linear
transformation and B is the standard basis for F^n, we call [T]_B the standard matrix for T.
The situation can be pictured as follows: T sends ~v to T(~v), while in coordinates
[~v]_B \longmapsto [T(~v)]_C = [T]^C_B\,[~v]_B.
Theorem 4.3.3. Let V , W and Z be finite-dimensional vector spaces over a field F and let B, C
and D be ordered bases of V , W and Z, respectively.
If S : V → W and T : W → Z are linear transformations, then
[T ◦ S]^D_B = [T]^D_C\,[S]^C_B.
Example 4.3.3. Define T : R2 [x] → R3 by T (a + bx + cx2 ) = (a − 2b, 3c − 2a, 3c − 4b) for all
a, b, c ∈ R. Compute rank T .
Definition. Let V be an n-dimensional vector space over a field F with an ordered basis B =
{~v1, . . . , ~vn}. If B′ = {~v1′, . . . , ~vn′} is another ordered basis for V, we define the transition or
change of coordinate matrix from B′ to B by P_{B→B′} = [I]^B_{B′}.
Theorem 4.4.2. Let B and B′ be two bases for a finite dimensional vector space V. If T : V → V
is a linear operator, then
[T]_{B′} = [I]^{B′}_B\,[T]_B\,[I]^B_{B′} = (P_{B→B′})^{-1}\,[T]_B\,(P_{B→B′}).
Exercises for Chapter 4. 1. If T : V → W is an isomorphism and B is a basis for V , prove that T (B)
is a basis for W .
2. Let T : V → V be a linear transformation. Suppose that there exists a ~v ∈ V such that T(T(~v)) ≠ ~0
and T (T (T (~v ))) = ~0. Prove that {~v , T (~v ), T (T (~v ))} is linearly independent.
3. Let S, T ∈ L(V, W ) and c ∈ F . Prove that:
(a) ker S ∩ ker T ⊆ ker(S + cT ) (b) im(S + T ) ⊆ im S + im T .
4. Let E be a linear transformation on a vector space V such that E ◦ E = E.
Prove that the following statements hold.
(a) ∀~v ∈ V, ~v ∈ im E ⇔ E(~v ) = ~v (b) ∀~v ∈ V, ~v − E(~v ) ∈ ker E (c) V = ker E ⊕ im E.
5. Let f, g ∈ V ∗ . If ker f ⊆ ker g, prove that g = cf for some c ∈ F .
6. Let V be an n-dimensional vector space over F .
If f, g ∈ V ∗ are linearly independent, find dim(ker f ∩ ker g).
7. If V and W are finite dimensional vector spaces which are isomorphic, prove that V ∗ ∼ = W ∗.
8. Let B = {(1, 0, −1), (1, 1, 1), (2, 2, 0)} be a basis for R³. Find the dual basis of B.
Clearly, f1 , f2 ∈ V ∗ . Prove that {f1 , f2 } is a basis for V ∗ and find a basis of V such that {f1 , f2 } is its
dual basis.
10. (a) Let W be a subspace of a finite dimensional vector space V .
If B = {x1 , . . . , xm } is a basis for W and {x1 , . . . , xm , xm+1 , . . . , xn } is a basis of V ,
show that {xm+1 + W, . . . , xn + W } is a basis for V /W .
(b) Let H = Span{(1, 1, −1)}. Determine a basis for R3 /H.
11. Let W1 and W2 be two subspaces of a vector space V .
Define T : W1 + W2 → W2 /(W1 ∩ W2 ) by T (w ~1 +w ~ 2) = w~ 2 + (W1 ∩ W2 ) for all w ~ 1 ∈ W1 and w
~ 2 ∈ W2 .
(a) Prove that T is well defined and is an onto linear transformation.
(b) Prove that ker T = W1 .
(c) Conclude by Theorem 4.2.2 that (W1 + W2 )/W1 ∼ = W2 /(W1 ∩ W2 ).
This is a generalization of Theorem 1.4.8.
12. If W1 and W2 are subspaces of V with W1 ⊆ W2 .
Define T : V /W1 → V /W2 by T (~v + W1 ) = ~v + W2 for all ~v ∈ V .
(a) Prove that T is well defined and is an onto linear transformation.
(b) Prove that ker T = W2 /W1 .
(c) Conclude by Theorem 4.2.2 that (V /W1 )/(W2 /W1 ) ∼ = V /W2 .
13. Let U , V and W be finite dimensional vector spaces over a field F . Let S : U → V and T : V → W
be linear transformations such that T ◦ S is the zero map. Show that rank S + rank T ≤ dim V.
14. Let V and W be finite dimensional vector spaces over a field F . Let U be a subspace of V and
T : V → W a linear transformation.
(a) Prove that dim(V /U ) ≥ dim(T (V )/T (U )).
(b) If T is 1-1, prove also that the inequality in (a) becomes an equality.
15. For S ⊆ V , let A(S) = {f ∈ V ∗ : f (~v ) = 0 for all ~v ∈ S}. It is called the annihilator of S.
Prove that
(a) A(S) is a subspace of V ∗ (b) If S1 ⊆ S2 , then A(S1 ) ⊇ A(S2 )
(c) If V is finite dimensional and W is a subspace of V , then V ∗ /A(W ) ∼ = W ∗.
16. Prove that ∀S, T ∈ L(V, V ), S ◦ T ∈ L(V, V ).
17. Let T : V → W be a linear transformation where dim V = dim W = n.
Prove that the following statements are equivalent.
(i) T is an isomorphism.
(ii) [T]^C_B is invertible for all ordered bases B and C of V and W, respectively.
(iii) [T]^C_B is invertible for some pair of ordered bases B and C of V and W, respectively.
18. Suppose the linear transformation T : R2 → R2 is given by
where p′ (x) is the derivative of p(x). Show that T is an isomorphism by finding [T ]B where B =
{1, x, x2 , . . . , xn }.
TA (~x) = A~x
Eλ (A) = {~x ∈ F n : A~x = λ~x} = {~x ∈ F n : (A − λIn )~x = ~0n } = Nul(A − λIn ).
Then
λ is an eigenvalue of A ⇔ ker(T_A − λI) ≠ {~0_n}
⇔ Nul(A − λI_n) ≠ {~0_n}
⇔ A − λI_n is not invertible
⇔ det(A − λI_n) = 0.
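The last equivalence is exactly how eigenvalues are computed in practice: solve det(A − λI) = 0. A sympy sketch (the 2 × 2 matrix is an arbitrary illustration):

```python
from sympy import Matrix, symbols, solve, eye

lam = symbols('lambda')
A = Matrix([[4, 1],
            [2, 3]])

char_poly = (A - lam * eye(2)).det()     # det(A - lambda*I)
eigenvalues = solve(char_poly, lam)      # roots of the characteristic polynomial
assert sorted(eigenvalues) == [2, 5]

# Each eigenvalue gives a nontrivial null space Nul(A - lambda*I).
for ev in eigenvalues:
    assert (A - ev * eye(2)).nullspace() != []
```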
Theorem 5.1.3. If A and B are similar n × n matrices, then A and B have the same characteristic
polynomial and eigenvalues (with same multiplicities).
have the same determinant, trace, characteristic polynomial and eigenvalues, but they are not
similar because PIP^{-1} = I for any invertible matrix P.
Definition. A diagonal matrix D is a square matrix such that all the entries off the main
diagonal are zero, that is, if D is of the form
D = \begin{bmatrix} λ_1 & 0 & \cdots & 0 \\ 0 & λ_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & λ_n \end{bmatrix} = \operatorname{diag}(λ_1, λ_2, . . . , λ_n),
Definition. Let V be a finite dimensional vector space and T ∈ L(V, V ) a linear operator. We
say that T is diagonalizable if there exists a basis B for V such that [T ]B is a diagonal matrix.
Proof. Let P = [~v_1 ~v_2 \cdots ~v_n] and D = diag(λ_1, λ_2, . . . , λ_n).
Then AP = PD becomes
A[~v_1 ~v_2 \cdots ~v_n] = [~v_1 ~v_2 \cdots ~v_n] \begin{bmatrix} λ_1 & 0 & \cdots & 0 \\ 0 & λ_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & λ_n \end{bmatrix},
that is,
[A~v_1 \; A~v_2 \; \cdots \; A~v_n] = [λ_1~v_1 \; λ_2~v_2 \; \cdots \; λ_n~v_n].
Since ~v2 , . . . , ~vk+1 are k eigenvectors, they are linearly independent by induction hypothesis, so
c2 = · · · = ck+1 = 0.
Lemma 5.1.7. Let {~v_1, . . . , ~v_k} be a linearly independent set of eigenvectors of an n × n matrix A,
extend it to a basis {~v_1, . . . , ~v_k, ~v_{k+1}, . . . , ~v_n} of F^n, and let
P = [~v_1 \; \cdots \; ~v_k \; ~v_{k+1} \; \cdots \; ~v_n].
In other words,
c_A(x) = (x − λ)^m g(x)
Proof. Assume that dim E_λ(A) = d with basis {~v_1, . . . , ~v_d}. By Lemma 5.1.7, there exists an
invertible n × n matrix P such that
P^{-1}AP = \begin{bmatrix} λI_d & B \\ 0 & C \end{bmatrix} = M.
Then
c_A(x) = c_M(x) = \det(xI_n − M) = \begin{vmatrix} (x − λ)I_d & -B \\ 0 & xI_{n-d} − C \end{vmatrix}
= (\det (x − λ)I_d)(\det(xI_{n-d} − C))
= (x − λ)^d c_C(x).
Theorem 5.2.1. The characteristic polynomial and minimal polynomial for A have the same roots.
Remark. Although the minimal polynomial and the characteristic polynomial have the same
roots, they may not be the same.
Example 5.2.1. The characteristic polynomial for A = \begin{bmatrix} 5 & -6 & -6 \\ -1 & 4 & 2 \\ 3 & -6 & -4 \end{bmatrix} is (x − 1)(x − 2)², while
(A − I)(A − 2I) = 0,
so the minimal polynomial of A is (x − 1)(x − 2).
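This can be checked symbolically; a sympy sketch for the matrix of Example 5.2.1, which simply multiplies out the candidate factors rather than relying on any minimal-polynomial helper:

```python
from sympy import Matrix, eye, zeros

A = Matrix([[5, -6, -6],
            [-1, 4, 2],
            [3, -6, -4]])
I3 = eye(3)

# The product of the distinct linear factors already annihilates A ...
assert (A - I3) * (A - 2 * I3) == zeros(3, 3)
# ... while neither factor alone does, so the minimal polynomial is (x - 1)(x - 2).
assert (A - I3) != zeros(3, 3) and (A - 2 * I3) != zeros(3, 3)
```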
Then
This gives
I = Bn−1
an−1 I = Bn−2 − ABn−1
an−2 I = Bn−3 − ABn−2
..
.
a1 I = B0 − AB1
a0 I = −AB0 .
Therefore,
A^n + a_{n-1}A^{n-1} + \cdots + a_1A + a_0I
= A^n B_{n-1} + A^{n-1}(B_{n-2} − AB_{n-1}) + A^{n-2}(B_{n-3} − AB_{n-2}) + \cdots + A(B_0 − AB_1) − AB_0
= 0
as desired.
Example 5.2.2. Determine the minimal polynomial of A = \begin{bmatrix} 3 & 1 & -1 \\ 2 & 2 & -1 \\ 2 & 2 & 0 \end{bmatrix}.
Recall that a square matrix A is symmetric if A^T = A and Hermitian if A^H = A, where A^H = \bar{A}^T is the conjugate transpose.
Notice that symmetric and Hermitian matrices are square matrices and the two notions coincide if F = R.
Example 5.3.1. Let A = \begin{bmatrix} 3 & 1 \\ 1 & -2 \end{bmatrix} and B = \begin{bmatrix} -1 & 2 + 3i \\ 2 - 3i & 2 \end{bmatrix}.
Then A is symmetric and both of them are Hermitian.
Corollary 5.3.4. If U = [~u_1 ~u_2 \cdots ~u_n] ∈ M_n(C) is a unitary matrix, then for all j, k ∈
{1, 2, . . . , n} we have
(~u_j, ~u_k) = \begin{cases} 1 & \text{if } j = k, \\ 0 & \text{if } j ≠ k. \end{cases}
Remark. The converse of Corollary 5.3.4 is also true and its proof is left as an exercise.
Example 5.3.3. U_1 = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & i \\ i & 1 \end{bmatrix} and U_2 = \begin{bmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{bmatrix}
are unitary matrices.
Theorem 5.3.5. Every eigenvalue of a unitary matrix U has absolute value one, i.e., |λ| = 1.
Moreover, eigenvectors corresponding to different eigenvalues are orthogonal to each other.
We are going to explore some very remarkable facts about Hermitian and real symmetric
matrices. These matrices are diagonalizable, and moreover the diagonalization can be accomplished
by a unitary matrix P. This means that P^{-1}AP = P^HAP is diagonal. In this situation, we say
that the matrix A is unitarily or orthogonally diagonalizable. Orthogonal and unitary diagonalizations are
particularly attractive since computing the inverse is essentially free, and error-free as well: P^H = P^{-1}.
Remark. The converse of Theorem 5.3.6 is also true. In addition, we prove a stronger result.
Theorem 5.3.7. [Principal Axes Theorem] Every Hermitian matrix is unitarily diagonalizable.
In addition, every real symmetric matrix is orthogonally diagonalizable.
Proof. It is a consequence of the Schur Triangularization Theorem, which is beyond the scope of this
course.
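In practice the orthogonal diagonalization of a real symmetric matrix is computed numerically; the sketch below (our own illustration, using numpy.linalg.eigh on the matrix A of Example 5.3.1) produces an orthogonal P with P^T AP diagonal.

import numpy as np

A = np.array([[3., 1.],
              [1., -2.]])                 # the real symmetric matrix of Example 5.3.1

evals, P = np.linalg.eigh(A)              # columns of P are orthonormal eigenvectors
print(np.allclose(P.T @ P, np.eye(2)))            # True: P^T P = I, so P^{-1} = P^T
print(np.allclose(P.T @ A @ P, np.diag(evals)))   # True: P^T A P is diagonal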
Real (R^n): ~x · ~y = ~x^T ~y = x1 y1 + · · · + xn yn ;  orthogonality: ~x^T ~y = 0;  orthonormal: P^T P = In = P P^T .
Complex (C^n): ~x · ~y = ~x^H ~y = x̄1 y1 + · · · + x̄n yn ;  orthogonality: ~x^H ~y = 0;  unitary: U^H U = In = U U^H .
(A − λI)~vr = ~vr−1 ,
(A − λI)~vr−1 = ~vr−2 ,
..
.
(A − λI)~v2 = ~v1 .
(A − λI)^r ~vr = ~0.
Each block has one eigenvector, one eigenvalue, and 1's just above the diagonal:
Jordan block Ji = J(λi , ri ) = \begin{pmatrix} λi & 1 & & \\ & λi & \ddots & \\ & & \ddots & 1 \\ & & & λi \end{pmatrix}_{ri × ri} .
The same λi will appear in several blocks, if it has several independent eigenvectors. Moreover, M
consists of n generalized eigenvectors which are linearly independent.
5.4. Jordan Forms 55
Remark. Theorem 5.4.1 says that every n × n matrix A has n linearly independent generalized
eigenvectors. These n generalized eigenvectors may be arranged in chains, with the sum of the
lengths of the chains associated with a given eigenvalue λ equal to the multiplicity of λ. But the
structure of these chains depends on the defect of λ, and can be quite complicated. For instance,
a multiplicity-four eigenvalue can correspond to
• Four length 1 chains (defect 0);
• Two length 1 chains and a length 2 chain (defect 1);
• Two length 2 chains (defect 2);
• A length 1 chain and a length 3 chain (defect 2);
• A length 4 chain (defect 3).
Observe that, in each of these cases, the length of the longest chain is at most d + 1 where d
is the defect of the eigenvalue. Consequently, once we have found all the ordinary eigenvectors
corresponding to a multiple eigenvalue λ, and therefore know the defect d of λ, we can begin
with the equation
(A − λI)^{d+1} ~u = ~0 (5.4.1)
Algorithm: Begin with a nonzero solution ~u1 of Eq. (5.4.1) and successively multiply by the
matrix A − λI until the zero vector is obtained. If
(A − λI)~u1 = ~u2 ≠ ~0
(A − λI)~u2 = ~u3 ≠ ~0
...
(A − λI)~uk−1 = ~uk ≠ ~0,
but (A − λI)~uk = ~0, then the vectors ~uk , ~uk−1 , . . . , ~u1 form a chain of generalized eigenvectors of length k based on the (ordinary) eigenvector ~uk .
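A rough computational sketch of this algorithm is given below; the helper name chain_from, the 2 × 2 matrix and the starting vector are our own illustrative choices, not part of the notes.

import sympy as sp

def chain_from(A, lam, u1):
    # Multiply u1 repeatedly by (A - lam*I) until the zero vector appears;
    # the nonzero vectors produced form a chain of generalized eigenvectors,
    # and the last one is an ordinary eigenvector for lam.
    n = A.rows
    B = A - lam * sp.eye(n)
    chain, u = [], u1
    while u != sp.zeros(n, 1):
        chain.append(u)
        u = B * u
    return chain

A = sp.Matrix([[2, 1],
               [0, 2]])       # single eigenvalue lambda = 2 with defect 1
u1 = sp.Matrix([0, 1])        # solves (A - 2I)^2 u = 0 but is not an eigenvector
for v in chain_from(A, 2, u1):
    print(v.T)                # Matrix([[0, 1]]) then Matrix([[1, 0]])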
Example 5.4.3. Let A = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ −2 & 2 & −3 & 1 \\ 2 & −2 & 1 & −3 \end{pmatrix} with the characteristic polynomial x(x + 2)^3 .
Find the chains of generalized eigenvectors corresponding to each eigenvalue and the Jordan form of A.
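For comparison, sympy can produce the Jordan form of this matrix directly; the sketch below (an illustrative library call, not a replacement for the hand computation) also confirms the characteristic polynomial.

import sympy as sp

A = sp.Matrix([[ 0,  0,  1,  0],
               [ 0,  0,  0,  1],
               [-2,  2, -3,  1],
               [ 2, -2,  1, -3]])

print(sp.factor(A.charpoly(sp.symbols('x')).as_expr()))   # x*(x + 2)**3
M, J = A.jordan_form()       # A = M J M^{-1}; columns of M are chains of generalized eigenvectors
print(J)                     # expect one 1x1 block for 0 and blocks of sizes 2 and 1 for -2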
Example 5.4.4. Let A = \begin{pmatrix} 8 & 0 & 0 & 0 \\ 0 & 8 & 0 & 3 \\ 4 & 0 & 8 & 0 \\ 0 & 0 & 0 & 8 \end{pmatrix}. Find the minimal polynomial of A, the chain(s) of generalized eigenvectors, and the Jordan form of A.
Example 5.4.5. Write down the Jordan form of the following matrices.
(1) \begin{pmatrix} 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}
(2) \begin{pmatrix} 3 & 5 & 0 & 0 \\ 0 & 3 & 6 & 0 \\ 0 & 0 & 4 & 7 \\ 0 & 0 & 0 & 4 \end{pmatrix}
(3) \begin{pmatrix} 3 & 0 & 0 & 0 \\ 0 & 3 & 5 & 0 \\ 0 & 0 & 4 & 6 \\ 0 & 0 & 0 & 4 \end{pmatrix}
Let N (r) = J(0; r) denote an r × r matrix that has 1's immediately above the diagonal and zero elsewhere. For example,
N (2) = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, N (3) = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, N (4) = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}, etc.
Suppose that f (x) is a polynomial of degree s. Then the Taylor expansion around a point c from calculus gives us
f (x) = f (c) + f ′ (c)(x − c) + \frac{f ′′ (c)}{2!} (x − c)^2 + · · · + \frac{f^{(s)} (c)}{s!} (x − c)^s .
Substituting x = J(λ; r) = λIr + N (r) and c = λ, so that x − c becomes N (r), yields
f (J(λ; r)) = f (λ)Ir + f ′ (λ)N (r) + \frac{f ′′ (λ)}{2!} N (r)^2 + · · · + \frac{f^{(r−1)} (λ)}{(r − 1)!} N (r)^{r−1} ,
a matrix whose entry k steps above the diagonal is f^{(k)} (λ)/k!, because the entries of N (r)^k that are k steps above the diagonal are 1's and all the other entries are zeros (and N (r)^k = 0 for k ≥ r).
Example 5.4.6. Compute J(λ; 4)^2 , J(λ; 3)^{10} and J(λ; 2)^s .
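As a check on Example 5.4.6, the sympy sketch below (the helper jordan_block is our own, written directly from the definition) computes such powers symbolically and exhibits the Taylor pattern in the superdiagonals.

import sympy as sp

lam = sp.symbols('lambda')

def jordan_block(lam, r):
    # r x r Jordan block J(lam; r): lam on the diagonal, 1's just above it
    return sp.Matrix(r, r, lambda i, j: lam if i == j else (1 if j == i + 1 else 0))

print(jordan_block(lam, 4)**2)    # lam**2 on the diagonal, 2*lam and 1 on the superdiagonals
print(jordan_block(lam, 3)**10)   # lam**10, 10*lam**9, 45*lam**8 = C(10,2)*lam**8
print(jordan_block(lam, 2)**5)    # compare with the general pattern for J(lam; 2)**s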
Remark. If J = \begin{pmatrix} J1 & & \\ & \ddots & \\ & & Jt \end{pmatrix} is in a Jordan form, then J^s = \begin{pmatrix} J1^s & & \\ & \ddots & \\ & & Jt^s \end{pmatrix}.
Example 5.4.7. Compute J^s for J = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix}.
Example 5.4.8. Given a square matrix A, use the Jordan form of A, to determine its minimal
polynomial.
Solution. Let J be the Jordan form of A, say A = M JM^{−1} . Since f (A) = M f (J)M^{−1} , we have f (A) = 0 if and only if f (J) = 0. Also, f (J) is block diagonal, and if J(λ; r) is a Jordan block of J, then the corresponding block of f (J) is f (J(λ; r)). We must thus find a polynomial f such that, for every Jordan block J(λ; r) of J, f (J(λ; r)) = 0 holds.
But we derived a formula for f (J(λ; r)), and it equals the zero matrix if and only if f (λ), f ′ (λ),
. . . , f (r−1) (λ) are all zero. Thus, f (x) and its first r − 1 derivatives must vanish at x = λ; in other
words, (x − λ)r must be a factor of f (x).
Let λ1 , . . . , λk be the distinct eigenvalues of A and mi the “maximum size” of the Jordan blocks
corresponding to the eigenvalue λi . Hence, we obtain
mA (x) = (x − λ1 )^{m1} (x − λ2 )^{m2} · · · (x − λk )^{mk} .
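A small sympy illustration of reading the minimal polynomial off the Jordan form is sketched below, using the matrix of Example 5.4.7 (the helper code and variable names are our own).

import sympy as sp

x = sp.symbols('x')
A = sp.Matrix([[2, 1, 0],
               [0, 2, 0],
               [0, 0, 3]])            # Jordan form: J(2; 2) and J(3; 1)

M, J = A.jordan_form()

# for each eigenvalue, record the size of its largest Jordan block
max_size = {}
for block in J.get_diag_blocks():
    lam = block[0, 0]
    max_size[lam] = max(max_size.get(lam, 0), block.rows)

m_A = sp.S.One
for lam, m in max_size.items():
    m_A *= (x - lam)**m

print(sp.factor(m_A))                                           # (x - 2)**2*(x - 3)
print((A - 2*sp.eye(3))**2 * (A - 3*sp.eye(3)) == sp.zeros(3))  # True: m_A(A) = 0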
Exercises for Chapter 5.
1. Let A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ a & b & c \end{pmatrix}. Find a, b, c so that det(A − λI3 ) = 9λ − λ^3 .
2. Let T : V → V be a linear operator.
A subspace U of V is T -invariant if T (U ) ⊆ U , i.e., ∀~u ∈ U, T (~u) ∈ U .
(a) Show that ker T and im T are T -invariant.
(b) If U and W are T -invariant, prove that U ∩ W and U + W are also T -invariant.
(c) Show that the eigenspace Eλ (T ) is T -invariant.
3. Show that A and A^T have the same eigenvalues.
4. Show that if λ1 , . . . , λk are eigenvalues of A, then λ1^m , . . . , λk^m are eigenvalues of A^m for all m ≥ 1.
Moreover, each eigenvector of A is an eigenvector of A^m .
5. Let A and B be n × n matrices over a field F . If I − AB is invertible, prove that I − BA is invertible
and (I − BA)−1 = I + B(I − AB)−1 A.
6. Show that if A and B are square matrices of the same size, then AB and BA have the same eigenvalues.
7. Determine all 2 × 2 diagonalizable matrices A with nonzero repeated eigenvalue a, a.
8. Let V be the space of all real-valued continuous functions. Define T : V → V by
(T f )(x) = \int_0^x f (t) dt.
tr(AB) − (tr A)(tr B) + tr(AB^{−1} ) = 0.
15. Find the 2 × 2 matrices with real entries that satisfy the equation
X^3 − 3X^2 = \begin{pmatrix} −2 & −2 \\ −2 & −2 \end{pmatrix}.
(Hint. Apply the Cayley-Hamilton Theorem.)
16. Let A = \begin{pmatrix} 0 & 0 & c \\ 1 & 0 & b \\ 0 & 1 & a \end{pmatrix}.
Prove that the minimal polynomial of A and the characteristic polynomial of A are the same.
17. A 3 × 3 matrix A has the characteristic polynomial x(x − 1)(x + 2).
What is the characteristic polynomial of A^2 ?
18. Let V = Mn (F ) be the vector space of n × n matrices over a field F . Let A be an n × n matrix.
Let TA be the linear operator on V defined by TA (B) = AB.
Show that the minimal polynomial for TA is the minimal polynomial for A.
19. Let U be an n × n real orthonormal matrix. Prove that
(a) |tr (U )| ≤ n, and (b) det(U^2 − In ) = 0 if n is odd.
20. If U = [~u1 ~u2 . . . ~un ] with (~uj , ~uk ) = \begin{cases} 1 & \text{if } j = k, \\ 0 & \text{if } j ≠ k, \end{cases} prove that U is unitary.
21. Let A be an n × n symmetric matrix with distinct eigenvalues λ1 , . . . , λk . Prove that
(A − λ1 In ) . . . (A − λk In ) = 0.
For any matrix M , compare JM with M K. If they are equal, show that M is not invertible. Conclude that J and K are not similar.
33. Suppose that a square matrix A has two eigenvalues λ = 2, 5, and np (λ) = nullity((A − λI)^p ), p ∈ N, are
as follows:
n1 (2) = 2, n2 (2) = 4, np (2) = 5 for p ≥ 3, and n1 (5) = 1, np (5) = 2 for p ≥ 2.
Write down the Jordan form of A.
34. Let J = J(0; 5) be the 5 × 5 Jordan block with λ = 0. Find J^2 , count its eigenvectors and write its
Jordan form.
35. How many possible Jordan forms are there for a 6 × 6 matrix with characteristic
polynomial (x − 1)^2 (x + 2)^4 ?
36. Let A = \begin{pmatrix} 2 & a & b \\ 0 & 2 & c \\ 0 & 0 & 1 \end{pmatrix} ∈ M3 (R).
(a) Prove that A is diagonalizable if and only if a = 0.
(b) Find the minimal polynomial of A when (i) a = 0, (ii) a ≠ 0.
37. Let V = {h(x, y) = ax^2 + bxy + cy^2 + dx + ey + f : a, b, c, d, e, f ∈ R} be a subspace of the space
of polynomials in two variables x and y over R. Then B = {x^2 , xy, y^2 , x, y, 1} is a basis for V . Define
T : V → V by
(T (h))(x, y) = \int \frac{∂}{∂y} h(x, y) dx .
(a) Prove that T is a linear transformation and find A = [T ]B .
(b) Compute the characteristic polynomial and the minimal polynomial of A.
(c) Find the Jordan form of A.
38. True or False:
(a) \begin{pmatrix} 3 & 0 \\ 0 & 4 \end{pmatrix} and \begin{pmatrix} 3 & 1 \\ 0 & 4 \end{pmatrix} are similar.  (b) \begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix} and \begin{pmatrix} 3 & 1 \\ 0 & 3 \end{pmatrix} are similar.
39. Show that \begin{pmatrix} a & 1 & 0 \\ 0 & a & 0 \\ 0 & 0 & b \end{pmatrix} and \begin{pmatrix} b & 0 & 0 \\ 0 & a & 1 \\ 0 & 0 & a \end{pmatrix} are similar.
40. Write down the Jordan form for the following matrices and find the minimal polynomial of each.
(a) \begin{pmatrix} −2 & 1 \\ −1 & −4 \end{pmatrix}
(b) \begin{pmatrix} −1 & 0 & 1 \\ 0 & −1 & 1 \\ 1 & −1 & −1 \end{pmatrix}
(c) \begin{pmatrix} 1 & 0 & 0 \\ −2 & −2 & −3 \\ 2 & 3 & 4 \end{pmatrix}
(d) \begin{pmatrix} 2 & 0 & 0 \\ −7 & 9 & 7 \\ 0 & 0 & 2 \end{pmatrix}
(e) \begin{pmatrix} 3 & 1 & −1 \\ 2 & 2 & −1 \\ 2 & 2 & 0 \end{pmatrix}
(f) \begin{pmatrix} −2 & 17 & 4 \\ −1 & 6 & 1 \\ 0 & 1 & 2 \end{pmatrix}
(g) \begin{pmatrix} −3 & 5 & −5 \\ 3 & −1 & 3 \\ 8 & −8 & 10 \end{pmatrix}
(h) \begin{pmatrix} 5 & −1 & 1 \\ 1 & 3 & 0 \\ −3 & 2 & 1 \end{pmatrix}
(i) \begin{pmatrix} 1 & −4 & 0 & −2 \\ 0 & 1 & 0 & 0 \\ 6 & −12 & −1 & −6 \\ 0 & −4 & 0 & −1 \end{pmatrix}
(j) \begin{pmatrix} 2 & 1 & 0 & 1 \\ 0 & 2 & 1 & 0 \\ 0 & 0 & 2 & 1 \\ 0 & 0 & 0 & 2 \end{pmatrix}
(k) \begin{pmatrix} −1 & −4 & 0 & 0 \\ 1 & 3 & 0 & 0 \\ 1 & 2 & 1 & 0 \\ 0 & 1 & 0 & 1 \end{pmatrix}
(l) \begin{pmatrix} 1 & 3 & 7 & 0 \\ 0 & −1 & −4 & 0 \\ 0 & 1 & 3 & 0 \\ 0 & −6 & −14 & 1 \end{pmatrix}
Eigenvalues: (b) −1, −1, −1 (c) 1, 1, 1 (d) 2, 2, 9 (e) 1, 2, 2 (f) 2, 2, 2 (g) 2, 2, 2 (h) 3, 3, 3
(i) −1, −1, 1, 1 (k) 1, 1, 1, 1 (l) 1, 1, 1, 1.