
Chapter 1

Linear Algebra: Review, Eigenvalues and Vector Spaces

SEHH2363 Mathematical Methods for Data Science

Contents
1.1 Systems of linear equations
1.2 Gauss-Jordan method
1.3 Systems of homogeneous equations
1.4 Nonsingular and Elementary matrices
1.5 Definition of the determinant
1.6 Further properties of the determinant
1.7 Linear dependence
1.8 Eigenvectors and eigenvalues of a square matrix
1.9 Diagonalization
1.10 Vector Space
1.11 Subspaces
1.12 Bases
1.13 Column space, row space and null space

1.1 Systems of linear equations

A system of m linear equations in n unknowns x1 , x2 , . . . , xn is given by


$$\begin{cases} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = b_1 \\ \qquad\vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n = b_m \end{cases} \tag{1}$$

where the aij ’s and the bk ’s are given scalars. Eqn (1) can be conveniently written in matrix form
as Ax = b by putting A = [aij ], x = [x1 . . . xn ]T and b = [b1 . . . bm ]T . The matrix A is commonly
known as the coefficient matrix of the linear system.

A column vector v in Rn satisfying Av = b is said to be a solution of the system of linear


equations Ax = b. We say that the system of linear equations Ax = b is consistent if it has
a solution. The collection of all the solutions is called the solution set of the system. On the
other hand, if Ax = b has no solution, then the system is said to be inconsistent. It is evident
that a system of linear equations is either inconsistent (no solution) or consistent (has at least one
solution).
Our main problem is to determine whether a given system of linear equations Ax = b is con-
sistent, and find all its solutions in case the linear system is consistent.

Definition 1 Two systems of linear equations are said to be equivalent if they have identical
solutions, i.e., their solution sets are equal.

A given system of linear equations may be solved by the method of elimination. The idea is to replace a given system Ax = b by an equivalent system A′x = b′ in such a way that the latter is easy to solve.

The elimination process consists of the following types of operations:

(i) interchange any 2 equations of a system of linear equations;

(ii) multiply both sides of any equation in a system by a non-zero scalar;

(iii) add a multiple of one equation to another equation within the system.

2
It is obvious that if a given system Ax = b is reduced to A′x = b′ by operations of types (i), (ii) or (iii), then these two systems are equivalent, i.e., they have identical solutions.

Given a system of linear equations Ax = b, we define the augmented matrix for this system to be the m × (n + 1) matrix obtained by adjoining to the right of A the column vector b.

Notation
$$[A\,|\,b] = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\ \vdots & \vdots & & \vdots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} & b_m \end{bmatrix}.$$

It is clear that instead of applying any one of the above three operations to the system Ax = b,
we might as well apply one of the following operations to the augmented matrix [A |b] :

(i) interchange any two rows of [A |b] ;

(ii) multiply any row of [A |b] by a non-zero scalar;

(iii) add a scalar multiple of one row of [A |b] to another row.

These operations are called elementary row operations on matrices.
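The three operations are easy to express on a NumPy array. The helper functions below are a minimal sketch of our own (the notes themselves contain no code; all names are ours):

```python
import numpy as np

def swap_rows(M, i, j):
    """Type (i): interchange rows i and j of M (in place)."""
    M[[i, j]] = M[[j, i]]

def scale_row(M, i, k):
    """Type (ii): multiply row i by a non-zero scalar k (in place)."""
    M[i] = k * M[i]

def add_multiple(M, src, dst, k):
    """Type (iii): add k times row src to row dst (in place)."""
    M[dst] = M[dst] + k * M[src]
```

Each operation is invertible, which is why row-equivalent augmented matrices represent equivalent systems.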

Definition 2 A matrix is said to be in reduced row-echelon form if it has the following prop-
erties:

1. If a row does not consist entirely of zeros, then the 1st non-zero entry of this row is equal to
1 (known as the leading 1 of the row);

2. All the rows that consist entirely of zeros are grouped together at the bottom of the matrix;

3. If the leading 1 of i-th row occurs at the p-th column and if the leading 1 of the (i + 1 )-th row
occurs at the q-th column, then p < q;

4. Each column that contains a leading 1 has zeros elsewhere.

Example 1 The matrices
$$\begin{bmatrix} 1 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} 1 & 0 & 0 & 4 \\ 0 & 1 & 0 & 2 \\ 0 & 0 & 1 & 9 \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} 0 & 1 & 2 & 0 & 15 \\ 0 & 0 & 0 & 1 & 3 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$
are in reduced row-echelon form, while
$$\begin{bmatrix} 1 & 5 & 3 \\ 0 & 2 & 2 \\ 0 & 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 1 & 4 & 3 \\ 0 & 0 & 0 \\ 1 & 1 & 5 \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} 0 & 1 & 2 & 6 & 0 \\ 0 & 0 & 1 & 1 & 18 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$
are not in reduced row-echelon form.

Theorem 1 Every matrix A can be reduced to a matrix in reduced row-echelon form by applying
to A a sequence of elementary row operations.

Example 2 Reduce the following matrix into reduced row-echelon form.
$$A = \begin{bmatrix} 4 & 2 & 3 \\ 2 & 1 & 1 \\ -3 & -1 & 2 \end{bmatrix}.$$
We start with the first column, and try to change it from $\begin{bmatrix} 4 \\ 2 \\ -3 \end{bmatrix}$ to $\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$.

Step 1. (Choosing the pivot) Since the (1,1)-th entry is nonzero, it can be a pivot. Therefore, no permutation of rows is needed.

Step 2. (Normalize the pivot) Reduce the pivot to one by scalar multiplication, i.e.
$$\tfrac{1}{4}r_1 \to r_1: \quad \begin{bmatrix} 1 & \tfrac{1}{2} & \tfrac{3}{4} \\ 2 & 1 & 1 \\ -3 & -1 & 2 \end{bmatrix}.$$

Step 3. (Eliminating the remaining entries in the same column) By using addition or subtraction, we reduce the other entries in the same column to zero.
$$\begin{matrix} r_2 - 2r_1 \to r_2 \\ r_3 + 3r_1 \to r_3 \end{matrix}: \quad \begin{bmatrix} 1 & \tfrac{1}{2} & \tfrac{3}{4} \\ 0 & 0 & -\tfrac{1}{2} \\ 0 & \tfrac{1}{2} & \tfrac{17}{4} \end{bmatrix}.$$

This finishes the reduction process for the first column. Let us do it again for the second column.

Step 1. (Choosing the pivot) Since the (2,2)-th entry is zero, it cannot be used as a pivot. We look for any nonzero entries underneath the (2,2)-th entry. If there is none, then the matrix is singular and has no inverse. If there is a non-zero entry below the (2,2)-th entry, then we move it to the (2,2)-th position by a row permutation. Since the (3,2)-th entry is nonzero, we permute the second row with the third row.
$$r_2 \leftrightarrow r_3: \quad \begin{bmatrix} 1 & \tfrac{1}{2} & \tfrac{3}{4} \\ 0 & \tfrac{1}{2} & \tfrac{17}{4} \\ 0 & 0 & -\tfrac{1}{2} \end{bmatrix}.$$

Step 2. (Normalize the pivot) We multiply the whole second row by 2.
$$2r_2 \to r_2: \quad \begin{bmatrix} 1 & \tfrac{1}{2} & \tfrac{3}{4} \\ 0 & 1 & \tfrac{17}{2} \\ 0 & 0 & -\tfrac{1}{2} \end{bmatrix}.$$

Step 3. (Eliminating the remaining entries in the same column) By using addition or subtraction, we reduce the other entries in the same column to zero.
$$r_1 - \tfrac{1}{2}r_2 \to r_1: \quad \begin{bmatrix} 1 & 0 & -\tfrac{7}{2} \\ 0 & 1 & \tfrac{17}{2} \\ 0 & 0 & -\tfrac{1}{2} \end{bmatrix}.$$

This finishes the reduction process for the second column. For the third column, we have

Step 1. The (3,3)-th entry is nonzero and can be used as a pivot.

Step 2. Normalize the (3,3)-th entry to one.
$$-2r_3 \to r_3: \quad \begin{bmatrix} 1 & 0 & -\tfrac{7}{2} \\ 0 & 1 & \tfrac{17}{2} \\ 0 & 0 & 1 \end{bmatrix}.$$

Step 3. (Eliminating the remaining entries in the same column)
$$\begin{matrix} r_1 + \tfrac{7}{2}r_3 \to r_1 \\ r_2 - \tfrac{17}{2}r_3 \to r_2 \end{matrix}: \quad \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$

Notice that in choosing the pivot for the j-th column, we should look for non-zero entries at or
below the (j, j)-th entry. Do not use the non-zero entries above the (j, j)-th entry as pivot.

Remark 1 For any matrix A, there is one and only one reduced row echelon form R and R is said
to be row equivalent to A.

1.2 Gauss-Jordan method

Consider a system of linear equations Ax = b and its augmented matrix [A |b] . It follows from
Theorem 1 that we can use a sequence of elementary row operations to reduce the matrix A to a
matrix R in reduced row-echelon form. In this manner, [A |b] is reduced, by the same sequence of
elementary row operations, to [R |c] for some column vector c. The solution of the new equivalent
system Rx = c can be obtained simply by inspection. This process is called the Gauss-Jordan
Method. It remains to be one of the most popular methods in handling systems of linear equations.

Gauss–Jordan Elimination
The Gauss–Jordan elimination method systemically transforms an augmented matrix into a
reduced row echelon form. The steps are given as follows.

Step 1 Choose the leftmost nonzero column and use appropriate row operations to get a leading 1 at
the top.

Step 2 Use multiples of the row containing the 1 from step 1 to get zeros in all remaining places in
the column containing this leading 1.

Step 3 Repeat step 1 with the submatrix formed by deleting the row used in step 2 and all rows
above this row.

Step 4 Repeat step 2 with the entire matrix. Continue this process until it is impossible to go further.

If at any point in this process we obtain a row with all zeros to the left of the vertical line and a nonzero number to the right, we can stop, since we will have a contradiction: 0 = n with n ≠ 0. We can then conclude that the system has no solution.
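The procedure can be sketched in Python as follows. This is our own illustrative implementation, not taken from the notes; it uses partial pivoting (taking the largest available pivot) — a standard numerical refinement of "choose the leftmost nonzero entry":

```python
import numpy as np

def rref(M, tol=1e-12):
    """Return the reduced row-echelon form of M by Gauss-Jordan elimination."""
    A = M.astype(float).copy()
    m, n = A.shape
    r = 0                                    # next pivot row
    for c in range(n):                       # sweep the columns left to right
        p = r + np.argmax(np.abs(A[r:, c]))  # pivot: largest entry at or below row r
        if abs(A[p, c]) < tol:
            continue                         # no pivot in this column
        A[[r, p]] = A[[p, r]]                # step 1: bring the pivot row up
        A[r] /= A[r, c]                      # normalize so the leading entry is 1
        for i in range(m):                   # step 2: zero out the rest of the column
            if i != r:
                A[i] -= A[i, c] * A[r]
        r += 1
        if r == m:
            break
    return A
```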

Example 3 Solve the given system by Gauss–Jordan elimination.
$$\begin{cases} 2x_1 - 2x_2 + x_3 = 3 \\ 3x_1 + x_2 - x_3 = 7 \\ x_1 - 3x_2 + 2x_3 = 0 \end{cases}$$

$$\begin{bmatrix} 2 & -2 & 1 & 3 \\ 3 & 1 & -1 & 7 \\ 1 & -3 & 2 & 0 \end{bmatrix}$$ Step 1: choose the leftmost nonzero column and get a 1 at the top.

$$r_1 \leftrightarrow r_3: \quad \begin{bmatrix} 1 & -3 & 2 & 0 \\ 3 & 1 & -1 & 7 \\ 2 & -2 & 1 & 3 \end{bmatrix}$$ Step 2: use multiples of row 1 to get zeros in all remaining places in column 1.

$$\begin{matrix} r_2 - 3r_1 \to r_2 \\ r_3 - 2r_1 \to r_3 \end{matrix}: \quad \begin{bmatrix} 1 & -3 & 2 & 0 \\ 0 & 10 & -7 & 7 \\ 0 & 4 & -3 & 3 \end{bmatrix}$$ Step 3: repeat step 1 with the submatrix formed by deleting the top row.

$$(0.1)r_2 \to r_2: \quad \begin{bmatrix} 1 & -3 & 2 & 0 \\ 0 & 1 & -0.7 & 0.7 \\ 0 & 4 & -3 & 3 \end{bmatrix}$$ Repeat step 2 with the entire matrix.

$$\begin{matrix} r_1 + 3r_2 \to r_1 \\ r_3 - 4r_2 \to r_3 \end{matrix}: \quad \begin{bmatrix} 1 & 0 & -0.1 & 2.1 \\ 0 & 1 & -0.7 & 0.7 \\ 0 & 0 & -0.2 & 0.2 \end{bmatrix}$$ Repeat step 1 with the submatrix formed by deleting the top two rows.

$$(-5)r_3 \to r_3: \quad \begin{bmatrix} 1 & 0 & -0.1 & 2.1 \\ 0 & 1 & -0.7 & 0.7 \\ 0 & 0 & 1 & -1 \end{bmatrix}$$ Repeat step 2 with the entire matrix.

$$\begin{matrix} r_1 + 0.1r_3 \to r_1 \\ r_2 + 0.7r_3 \to r_2 \end{matrix}: \quad \begin{bmatrix} 1 & 0 & 0 & 2 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & -1 \end{bmatrix}$$ The matrix is now in reduced form, and we can read off the solution of the corresponding reduced system.

The solution of the system is
$$x_1 = 2, \quad x_2 = 0, \quad x_3 = -1.$$
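As a quick cross-check of Example 3 (our own, not part of the original notes), NumPy solves the same system directly:

```python
import numpy as np

A = np.array([[2.0, -2.0, 1.0],
              [3.0,  1.0, -1.0],
              [1.0, -3.0, 2.0]])
b = np.array([3.0, 7.0, 0.0])

print(np.linalg.solve(A, b))   # [ 2.  0. -1.]
```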

Example 4 Determine whether the following system is consistent.
$$\begin{cases} x_2 - 4x_3 = 8 \\ 2x_1 - 3x_2 + 2x_3 = 1 \\ 5x_1 - 8x_2 + 7x_3 = 1 \end{cases}$$

The augmented matrix of the system can be reduced as follows:
$$\begin{bmatrix} 0 & 1 & -4 & 8 \\ 2 & -3 & 2 & 1 \\ 5 & -8 & 7 & 1 \end{bmatrix} \xrightarrow{r_1 \leftrightarrow r_2} \begin{bmatrix} 2 & -3 & 2 & 1 \\ 0 & 1 & -4 & 8 \\ 5 & -8 & 7 & 1 \end{bmatrix} \xrightarrow{\frac{1}{2}r_1 \to r_1} \begin{bmatrix} 1 & -\frac{3}{2} & 1 & \frac{1}{2} \\ 0 & 1 & -4 & 8 \\ 5 & -8 & 7 & 1 \end{bmatrix}$$
$$\xrightarrow{r_3 - 5r_1 \to r_3} \begin{bmatrix} 1 & -\frac{3}{2} & 1 & \frac{1}{2} \\ 0 & 1 & -4 & 8 \\ 0 & -\frac{1}{2} & 2 & -\frac{3}{2} \end{bmatrix} \xrightarrow{r_3 + \frac{1}{2}r_2 \to r_3} \begin{bmatrix} 1 & -\frac{3}{2} & 1 & \frac{1}{2} \\ 0 & 1 & -4 & 8 \\ 0 & 0 & 0 & \frac{5}{2} \end{bmatrix}.$$

The last matrix is the augmented matrix of
$$\begin{cases} x_1 - \frac{3}{2}x_2 + x_3 = \frac{1}{2} \\ 0x_1 + x_2 - 4x_3 = 8 \\ 0x_1 + 0x_2 + 0x_3 = \frac{5}{2} \end{cases}$$
It is inconsistent as the third equation $0x_1 + 0x_2 + 0x_3 = \frac{5}{2}$ has no solution. Therefore, the system of linear equations is inconsistent.

Example 5 The augmented matrix of
$$\begin{cases} x_1 + x_2 - 4x_3 = 5 \\ 2x_1 + 3x_2 - 7x_3 = 14 \\ x_2 + x_3 = 4 \end{cases}$$
is given by
$$\begin{bmatrix} 1 & 1 & -4 & 5 \\ 2 & 3 & -7 & 14 \\ 0 & 1 & 1 & 4 \end{bmatrix}.$$
We may use elementary row operations to reduce the coefficient matrix to reduced row-echelon form to obtain
$$\begin{bmatrix} 1 & 0 & -5 & 1 \\ 0 & 1 & 1 & 4 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$
The corresponding equivalent system is
$$\begin{cases} x_1 - 5x_3 = 1 \\ x_2 + x_3 = 4 \end{cases},$$
giving $x_1 = 1 + 5x_3$ and $x_2 = 4 - x_3$. We call $x_1$ and $x_2$ the leading variables (because they correspond to the column positions of the two leading 1's in the reduced row-echelon matrix $\begin{bmatrix} 1 & 0 & -5 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}$), and $x_3$ is a free variable which may assume any value.

As such, solutions of the system are given by $x = \begin{bmatrix} 1+5t & 4-t & t \end{bmatrix}^T$.

1.3 Systems of homogeneous equations

A linear equation with zero "right hand side" is called a homogeneous linear equation. A system of m homogeneous linear equations in n unknowns may be written as Ax = 0, where A is a given m × n matrix and 0 is the m × 1 zero column vector. A homogeneous system has the n × 1 zero vector 0 as an obvious solution (called the trivial solution). Any other solutions are known as non-trivial solutions.

It is also clear that if v and u are solutions of Ax = 0, then tv + su is also a solution for any
scalars t, s. As such, the solution set of Ax = 0 either has only the trivial solution, or has infinitely
many solutions.

Remark 2 To solve Ax = 0, we may reduce its augmented matrix [A|0] to [R|0], where R is a matrix in reduced row-echelon form. Suppose that R has r non-zero rows, and that for $1 \le j \le r$, the leading 1 of the j-th row occurs at the $k_j$-th column. It then follows from the structure of the reduced row-echelon matrix R that $x_{k_1}, x_{k_2}, \ldots, x_{k_r}$ may be taken as basic variables of the system Rx = 0, which consists of only r linear equations, the j-th equation being of the form $x_{k_j} = \sum(\cdots)$. Here, $\sum(\cdots)$ denotes sums that involve only the remaining (n − r) variables (free variables). It is now clear that the system Rx = 0 has non-trivial solutions whenever n > r. Schematically, R has the staircase shape
$$\begin{bmatrix} 1 & \times & \cdots & 0 & \times & \cdots & 0 & \times & \cdots \\ & & & 1 & \times & \cdots & 0 & \times & \cdots \\ & & & & & \ddots & \vdots & \vdots & \\ & & & & & & 1 & \times & \cdots \\ & & & & & & & 0 & \\ & & & & & & & & \ddots \end{bmatrix}$$
where the ×'s denote arbitrary entries.

A simple consequence of Remark 2 is the following useful theorem.

Theorem 2 If A is an m × n matrix where m < n, then the homogeneous system Ax = 0 always has non-trivial solutions. In other words, if the number of equations is less than the number of unknowns, then the system has non-trivial solutions.

For instance, a system of three linear equations in four unknowns admits non-trivial solutions.

1.4 Nonsingular and Elementary matrices

For any non-zero real number a, there is a real number b such that ab = ba = 1. The number b is
known as the multiplicative inverse of a. The matrix analogue of this will now be discussed. We
shall begin with a few definitions and examples.

Definition 3 A square matrix A is said to be nonsingular (or invertible) if there is a square matrix B such that AB = I and BA = I. The matrix B is called an inverse of A.

Example 6 Since
$$\begin{bmatrix} 3 & 2 \\ 7 & 5 \end{bmatrix}\begin{bmatrix} 5 & -2 \\ -7 & 3 \end{bmatrix} = I \quad\text{and}\quad \begin{bmatrix} 5 & -2 \\ -7 & 3 \end{bmatrix}\begin{bmatrix} 3 & 2 \\ 7 & 5 \end{bmatrix} = I,$$
we conclude that the matrix $\begin{bmatrix} 3 & 2 \\ 7 & 5 \end{bmatrix}$ is nonsingular, with $\begin{bmatrix} 5 & -2 \\ -7 & 3 \end{bmatrix}$ as an inverse.

Example 7 Consider the 3 × 3 matrix
$$A = \begin{bmatrix} 0 & a_1 & a_2 \\ 0 & a_3 & a_4 \\ 0 & a_5 & a_6 \end{bmatrix}.$$
If B is any 3 × 3 matrix, then the 1st column of the product BA consists entirely of zeros. As such, BA ≠ I and A thus has no inverse.

Some important facts about nonsingular matrices are listed in the following propositions.

Proposition 1 If A, B and C are n × n matrices such that AC = I and BA = I, then B = C.

Proof B = BI = B(AC) = (BA)C = IC = C.

Proposition 2 If A and B are nonsingular matrices of the same order, then AB is nonsingular and $(AB)^{-1} = B^{-1}A^{-1}$.

Proof By the associativity of matrix multiplication, we have
$$AB(B^{-1}A^{-1}) = A(B(B^{-1}A^{-1})) = A((BB^{-1})A^{-1}) = A(IA^{-1}) = AA^{-1} = I \quad\text{and}$$
$$(B^{-1}A^{-1})AB = B^{-1}(A^{-1}(AB)) = B^{-1}\left[(A^{-1}A)B\right] = B^{-1}(IB) = B^{-1}B = I.$$
Therefore, AB is nonsingular and $(AB)^{-1} = B^{-1}A^{-1}$.

Note If $A_1, A_2, \ldots, A_k$ are nonsingular matrices, then repeated applications of Proposition 2 show that their product $A_1A_2\cdots A_k$ is nonsingular and that $(A_1A_2\cdots A_k)^{-1} = A_k^{-1}\cdots A_2^{-1}A_1^{-1}$. In particular, if A is nonsingular, then $A^k$ is nonsingular for every positive integer k and $(A^k)^{-1} = (A^{-1})^k$.
The simplest nonsingular matrices are the so-called elementary matrices.

Definition 4 An n × n matrix is called an elementary matrix if it can be obtained from the n × n identity matrix $I_n$ by performing a single elementary row operation.

Example 8
$$E = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad F = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -2 & 0 & 1 \end{bmatrix} \quad\text{and}\quad G = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
are elementary because E is obtained from I by multiplying its second row by 3, F is obtained from I by adding (−2) times its first row to the third row and G is obtained from I by interchanging its first and second rows.

We now state some useful and interesting facts about elementary matrices in the following
Proposition.

Proposition 3

(i) If A is an m × n matrix and E is an m × m elementary matrix that results from performing a certain elementary row operation on $I_m$, then the product EA is the matrix that results when this same elementary row operation is performed on A.

(ii) If an elementary row operation is applied to an identity matrix I to produce an elementary ma-
trix E, then there exists another elementary row operation which, when applied to E, produces
I.

(iii) Every elementary matrix is nonsingular, and the inverse of an elementary matrix is also an
elementary matrix.
Example 9 Let
$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$$
and E, F and G be as given in Example 8. Straightforward calculations indicate that

(a) $EA = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ 3a_{21} & 3a_{22} & 3a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$ is the matrix obtained from A by multiplying the second row of A by 3;

(b) $FA = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31}-2a_{11} & a_{32}-2a_{12} & a_{33}-2a_{13} \end{bmatrix}$ is the matrix obtained from A by adding (−2) times the first row of A to the third row;

(c) $GA = \begin{bmatrix} a_{21} & a_{22} & a_{23} \\ a_{11} & a_{12} & a_{13} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$ is obtained from A by interchanging its first and second rows.
Every elementary matrix E has an inverse that is also an elementary matrix. $E^{-1}$ is obtained from I by performing the inverse of the elementary row operation that produced E from I.

Elementary Row Operation            Inverse Elementary Row Operation
$r_i \leftrightarrow r_j$           $r_i \leftrightarrow r_j$
$kr_i \to r_i$, $k \neq 0$          $\frac{1}{k}r_i \to r_i$
$r_j + kr_i \to r_j$                $r_j - kr_i \to r_j$

Example 10 With E, F and G as given in Example 8, we observe that

(a) $E^{-1}$ is the matrix $\begin{bmatrix} 1 & 0 & 0 \\ 0 & \frac{1}{3} & 0 \\ 0 & 0 & 1 \end{bmatrix}$, which is the elementary matrix obtained from I by multiplying its second row by $\frac{1}{3}$;

(b) $F^{-1}$ is the matrix $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & 0 & 1 \end{bmatrix}$, which is the elementary matrix obtained from I by adding 2 times the first row to the third row;

(c) $G^{-1}$ is the matrix $\begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$, which is the elementary matrix obtained from I by interchanging the first and second rows.

Example 11 We can use a sequence of elementary row operations to reduce a matrix into a row echelon form:
$$\begin{bmatrix} 0 & 0 & 0 \\ 2 & 5 & 1 \\ \frac{1}{2} & 1 & 2 \end{bmatrix} \xrightarrow{r_1 \leftrightarrow r_3} \begin{bmatrix} \frac{1}{2} & 1 & 2 \\ 2 & 5 & 1 \\ 0 & 0 & 0 \end{bmatrix} \xrightarrow{2r_1 \to r_1} \begin{bmatrix} 1 & 2 & 4 \\ 2 & 5 & 1 \\ 0 & 0 & 0 \end{bmatrix} \xrightarrow{r_2 - 2r_1 \to r_2} \begin{bmatrix} 1 & 2 & 4 \\ 0 & 1 & -7 \\ 0 & 0 & 0 \end{bmatrix}$$
Thus,
$$\begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}\begin{bmatrix} 0 & 0 & 0 \\ 2 & 5 & 1 \\ \frac{1}{2} & 1 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 2 & 4 \\ 0 & 1 & -7 \\ 0 & 0 & 0 \end{bmatrix}$$

The following theorem gives necessary and sufficient conditions for a square matrix to be non-
singular.

Theorem 3 If A is an n × n matrix, then the following statements are equivalent:

(i) A is nonsingular;

(ii) the homogeneous system Ax = 0 has only the trivial solution;

(iii) A can be reduced to I by a sequence of elementary row operations.

Proof "(i) ⇒ (ii)": A is nonsingular ⇒ $A^{-1}$ exists. For Ax = 0, we have $x = A^{-1}Ax = A^{-1}0 = 0$.

"(ii) ⇒ (iii)": Suppose R is the reduced row-echelon form of A. If R ≠ I, then the number of non-zero rows of R is less than n, and Remark 2 shows that Ax = 0 has non-trivial solutions, contradicting (ii). Hence R = I, i.e., A can be reduced to I by elementary row operations.

"(iii) ⇒ (i)": Suppose that there are elementary row operations $\rho_1, \rho_2, \ldots, \rho_k$ such that
$$A \xrightarrow{\rho_1} A_1 \xrightarrow{\rho_2} A_2 \xrightarrow{\rho_3} \cdots \xrightarrow{\rho_{k-1}} A_{k-1} \xrightarrow{\rho_k} A_k = I.$$
If $E_j$ is the elementary matrix obtained by applying $\rho_j$ to I, i.e., $I \xrightarrow{\rho_j} E_j$, then by Proposition 3, one has
$$A \xrightarrow{\rho_1} E_1A \xrightarrow{\rho_2} E_2E_1A \xrightarrow{\rho_3} \cdots \xrightarrow{\rho_k} E_k \cdots E_2E_1A = I.$$
We conclude that $(E_k \cdots E_2E_1)A = I$.

Since $E_1, E_2, \ldots, E_k$ are elementary matrices, they are also nonsingular. By Proposition 2, the product $E_k \cdots E_2E_1$ is nonsingular; hence $A = (E_k \cdots E_2E_1)^{-1}$ is nonsingular, and $A^{-1} = E_k \cdots E_2E_1$.

This completes the proof of the equivalence of (i), (ii) and (iii).

Remark 3 We shall now describe a practical method to find $A^{-1}$. In fact, it is evident from the proof of "(iii) ⇒ (i)" in Theorem 3 that if a sequence of elementary row operations reduces A to I, then by performing this same sequence of elementary row operations on I, we obtain $A^{-1}$. Symbolically, we have
$$[A\,|\,I] \xrightarrow{\rho_1} [A_1\,|\,E_1I] \xrightarrow{\rho_2} [A_2\,|\,E_2E_1I] \xrightarrow{\rho_3} \cdots \xrightarrow{\rho_k} [I\,|\,E_k\cdots E_2E_1I] = [I\,|\,A^{-1}].$$

Example 12 Find the inverse of the matrix $A = \begin{bmatrix} 1 & 2 & 2 \\ 2 & 3 & 6 \\ 1 & -1 & 7 \end{bmatrix}$.

$$\begin{bmatrix} 1 & 2 & 2 & 1 & 0 & 0 \\ 2 & 3 & 6 & 0 & 1 & 0 \\ 1 & -1 & 7 & 0 & 0 & 1 \end{bmatrix} \xrightarrow[r_3 - r_1 \to r_3]{r_2 - 2r_1 \to r_2} \begin{bmatrix} 1 & 2 & 2 & 1 & 0 & 0 \\ 0 & -1 & 2 & -2 & 1 & 0 \\ 0 & -3 & 5 & -1 & 0 & 1 \end{bmatrix}$$
$$\xrightarrow[r_3 - 3r_2 \to r_3]{r_1 + 2r_2 \to r_1} \begin{bmatrix} 1 & 0 & 6 & -3 & 2 & 0 \\ 0 & -1 & 2 & -2 & 1 & 0 \\ 0 & 0 & -1 & 5 & -3 & 1 \end{bmatrix} \xrightarrow[-r_3 \to r_3]{-r_2 \to r_2} \begin{bmatrix} 1 & 0 & 6 & -3 & 2 & 0 \\ 0 & 1 & -2 & 2 & -1 & 0 \\ 0 & 0 & 1 & -5 & 3 & -1 \end{bmatrix}$$
$$\xrightarrow[r_2 + 2r_3 \to r_2]{r_1 - 6r_3 \to r_1} \begin{bmatrix} 1 & 0 & 0 & 27 & -16 & 6 \\ 0 & 1 & 0 & -8 & 5 & -2 \\ 0 & 0 & 1 & -5 & 3 & -1 \end{bmatrix}.$$

We therefore conclude that A is nonsingular, and that $A^{-1} = \begin{bmatrix} 27 & -16 & 6 \\ -8 & 5 & -2 \\ -5 & 3 & -1 \end{bmatrix}$.
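The same inverse can be cross-checked numerically (our own check, not part of the notes):

```python
import numpy as np

A = np.array([[1.0, 2.0, 2.0],
              [2.0, 3.0, 6.0],
              [1.0, -1.0, 7.0]])

A_inv = np.linalg.inv(A)
print(np.round(A_inv))                    # [[ 27. -16.   6.] [ -8.  5. -2.] [ -5.  3. -1.]]
print(np.allclose(A @ A_inv, np.eye(3)))  # True: A A^{-1} = I
```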

1.5 Definition of the determinant

For any square matrix A, we define det(A), called the determinant of A, as follows:

If A is a 1 × 1 matrix, i.e., $A = [a_{11}]$, we define $\det(A) = a_{11}$. Thus the determinant of a 1 × 1 matrix is the (only) entry in the matrix.

Assume that n > 1 and that the determinant is defined for all square matrices of order < n. (This is the induction hypothesis.) Let A be an n × n matrix, i.e., $A = [a_{ik}]_{1 \le i,k \le n}$.

For any entry $a_{ik}$ of A, where $1 \le i, k \le n$, we define the following terms:

(1) $M_{ik}$ is the determinant of the (n − 1) × (n − 1) matrix obtained from A by deleting its i-th row and k-th column.

(2) $C_{ik} = (-1)^{i+k} \cdot M_{ik}$.

$M_{ik}$ and $C_{ik}$ are respectively called the minor and the cofactor of the entry $a_{ik}$ of A. They are well defined because of the induction hypothesis.

Example 13 Let the matrix A be given by
$$A = \begin{bmatrix} 1 & 4 & 7 \\ 2 & 5 & 8 \\ 3 & 6 & 9 \end{bmatrix}.$$

The minor of $a_{11}$ is $\begin{vmatrix} 5 & 8 \\ 6 & 9 \end{vmatrix}$, the minor of $a_{12}$ is $\begin{vmatrix} 2 & 8 \\ 3 & 9 \end{vmatrix}$, and the minor of $a_{21}$ is $\begin{vmatrix} 4 & 7 \\ 6 & 9 \end{vmatrix}$.

The cofactor of $a_{11}$ is $(-1)^{1+1}\begin{vmatrix} 5 & 8 \\ 6 & 9 \end{vmatrix} = \begin{vmatrix} 5 & 8 \\ 6 & 9 \end{vmatrix}$.

The cofactor of $a_{12}$ is $(-1)^{1+2}\begin{vmatrix} 2 & 8 \\ 3 & 9 \end{vmatrix} = -\begin{vmatrix} 2 & 8 \\ 3 & 9 \end{vmatrix}$.

The cofactor of $a_{21}$ is $(-1)^{2+1}\begin{vmatrix} 4 & 7 \\ 6 & 9 \end{vmatrix} = -\begin{vmatrix} 4 & 7 \\ 6 & 9 \end{vmatrix}$.

Definition 5 We now define $\det(A) = \sum_{k=1}^{n} a_{1k}C_{1k}$.

In other words, det(A) is obtained by taking "cofactor expansion" along the first row of the matrix A.

Notation We also use the symbol |A| or det A to denote the determinant of A.

Example 14 If $A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$, then $C_{11} = a_{22}$, $C_{12} = -a_{21}$ and thus
$$\det(A) = a_{11}C_{11} + a_{12}C_{12} = a_{11}a_{22} - a_{12}a_{21}.$$

If $A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$, then $C_{11} = \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} = a_{22}a_{33} - a_{23}a_{32}$,
$$C_{12} = -\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} = -(a_{21}a_{33} - a_{23}a_{31}) \quad\text{and}\quad C_{13} = \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix} = a_{21}a_{32} - a_{22}a_{31}.$$
Hence $\det(A) = a_{11}(a_{22}a_{33} - a_{23}a_{32}) - a_{12}(a_{21}a_{33} - a_{23}a_{31}) + a_{13}(a_{21}a_{32} - a_{22}a_{31})$.

Example 15 If $A = \begin{bmatrix} 2 & 1 & 3 \\ 1 & -1 & 1 \\ 1 & 4 & -2 \end{bmatrix}$, then
$$C_{11} = \begin{vmatrix} -1 & 1 \\ 4 & -2 \end{vmatrix} = -2, \quad C_{12} = -\begin{vmatrix} 1 & 1 \\ 1 & -2 \end{vmatrix} = 3, \quad C_{13} = \begin{vmatrix} 1 & -1 \\ 1 & 4 \end{vmatrix} = 5.$$
$$\therefore |A| = a_{11}C_{11} + a_{12}C_{12} + a_{13}C_{13} = 2 \times (-2) + 1 \times 3 + 3 \times 5 = 14.$$
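Definition 5 translates directly into a recursive function. The sketch below is our own illustration of the definition (not an efficient algorithm — row reduction is far faster for large n) and reproduces Example 15:

```python
import numpy as np

def det_cofactor(A):
    """Determinant by cofactor expansion along the first row (Definition 5)."""
    n = A.shape[0]
    if n == 1:
        return A[0, 0]               # base case: 1x1 matrix
    total = 0
    for k in range(n):
        # minor M_{1,k+1}: delete row 1 and column k+1
        minor = np.delete(np.delete(A, 0, axis=0), k, axis=1)
        total += (-1) ** k * A[0, k] * det_cofactor(minor)   # a_{1k} * C_{1k}
    return total

A = np.array([[2, 1, 3],
              [1, -1, 1],
              [1, 4, -2]])
print(det_cofactor(A))   # 14
```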

1.6 Further properties of the determinant

Some interesting properties of the determinant will now be discussed. We shall first state without
proof the following fundamental result.

Theorem 4 Let A be any n × n matrix where n ≥ 2. If $1 \le i \ne j \le n$, we have
$$\sum_{k=1}^{n} a_{ik}C_{ik} = \sum_{p=1}^{n} a_{jp}C_{jp}. \tag{2}$$
In other words, the determinant of a matrix can be evaluated by taking cofactor expansion along any row.

As a simple illustration of Theorem 4, we calculate the determinant of the matrix in Example 15 by expanding along the third row and obtain
$$|A| = a_{31}C_{31} + a_{32}C_{32} + a_{33}C_{33} = 1 \cdot \begin{vmatrix} 1 & 3 \\ -1 & 1 \end{vmatrix} + 4 \cdot (-1)\begin{vmatrix} 2 & 3 \\ 1 & 1 \end{vmatrix} + (-2) \cdot \begin{vmatrix} 2 & 1 \\ 1 & -1 \end{vmatrix} = 14.$$

The following Proposition tells us that the statement in Theorem 4 remains valid when “row”
is replaced by “column”.

Proposition 4 det(A) can be obtained by taking cofactor expansion along any column. In other words, we have $\det(A) = \sum_{i=1}^{n} a_{ik}C_{ik}$ for any $1 \le k \le n$.

Since the determinant of a square matrix is defined by induction, results concerning determinant
are normally proved by induction. The following corollary is a typical example.

Proposition 5 det(A) = det(Aᵀ) for any square matrix A.
$$\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix} = \begin{vmatrix} a & d & g \\ b & e & h \\ c & f & i \end{vmatrix}$$

As an exercise, the student may prove the following statement by induction:

"Let $D = \begin{bmatrix} d_{11} & 0 & \cdots & 0 \\ 0 & d_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_{nn} \end{bmatrix}$. Then $\det(D) = d_{11} \times d_{22} \times \cdots \times d_{nn}$."

The following proposition describes, among other things, how det(A) varies when elementary
row operations are applied to the matrix A.

Proposition 6 Let A be a square matrix of order n.

(i) If A′ is the matrix obtained from A by interchanging any two rows of A, then det(A′) = (−1)·det(A).
$$\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix} = -\begin{vmatrix} d & e & f \\ a & b & c \\ g & h & i \end{vmatrix}$$

(ii) If two rows of A are identical, then det(A) = 0.
$$\begin{vmatrix} a & b & c \\ d & e & f \\ a & b & c \end{vmatrix} = 0$$

(iii) If B is the matrix obtained by multiplying the i-th row of A by a scalar t while other rows remain unchanged, then det(B) = t·det(A).
$$\begin{vmatrix} a & b & c \\ td & te & tf \\ g & h & i \end{vmatrix} = t\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix}$$

(iv) Let $b_1, b_2, \ldots, b_n$ be scalars. Then
$$\det\begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{i1}+b_1 & \cdots & a_{in}+b_n \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{bmatrix} = \det\begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{i1} & \cdots & a_{in} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{bmatrix} + \det\begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ b_1 & \cdots & b_n \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{bmatrix}.$$

(v) If C is obtained from A by an elementary row operation of adding a scalar multiple of the i-th row to its j-th row, where i ≠ j, i.e.,
$$C = \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{i1} & \cdots & a_{in} \\ \vdots & & \vdots \\ a_{j1}+ta_{i1} & \cdots & a_{jn}+ta_{in} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{bmatrix},$$
then det(C) = det(A).

Example 16 Find the determinant of the matrix
$$A = \begin{bmatrix} 2 & 1 & 3 & 5 \\ 2 & 0 & 2 & 0 \\ 6 & -1 & 3 & 4 \\ -7 & -3 & -2 & 8 \end{bmatrix}$$

$$\begin{vmatrix} 2 & 1 & 3 & 5 \\ 2 & 0 & 2 & 0 \\ 6 & -1 & 3 & 4 \\ -7 & -3 & -2 & 8 \end{vmatrix} \overset{r_1 \leftrightarrow r_2}{=} -\begin{vmatrix} 2 & 0 & 2 & 0 \\ 2 & 1 & 3 & 5 \\ 6 & -1 & 3 & 4 \\ -7 & -3 & -2 & 8 \end{vmatrix} = -2\begin{vmatrix} 1 & 0 & 1 & 0 \\ 2 & 1 & 3 & 5 \\ 6 & -1 & 3 & 4 \\ -7 & -3 & -2 & 8 \end{vmatrix}$$
$$\overset{\substack{r_2-2r_1\to r_2 \\ r_3-6r_1\to r_3 \\ r_4+7r_1\to r_4}}{=} -2\begin{vmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 5 \\ 0 & -1 & -3 & 4 \\ 0 & -3 & 5 & 8 \end{vmatrix} \overset{\substack{r_3+r_2\to r_3 \\ r_4+3r_2\to r_4}}{=} -2\begin{vmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 5 \\ 0 & 0 & -2 & 9 \\ 0 & 0 & 8 & 23 \end{vmatrix}$$
$$\overset{r_4+4r_3\to r_4}{=} -2\begin{vmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 5 \\ 0 & 0 & -2 & 9 \\ 0 & 0 & 0 & 59 \end{vmatrix} = -2 \times (1)(1)(-2)(59) = 236$$

Remark 4 Statements in Proposition 6 are true when "row" is replaced by "column". For instance, a matrix with two of its columns identical has determinant equal to zero.
$$\begin{vmatrix} a & b & a \\ d & e & d \\ g & h & g \end{vmatrix} = 0$$

We now state an important theorem about determinants.

Theorem 5 If A and B are n × n matrices, then det(AB) = det(A) × det(B).

The following proposition gives a useful necessary and sufficient condition for a square matrix
to be nonsingular.

Proposition 7 A square matrix A is nonsingular if and only if det(A) ≠ 0.

1.7 Linear dependence

Definition 6 Let $v_1, v_2, \ldots, v_k$ be k vectors in $\mathbb{R}^n$. A vector v is called a linear combination of $v_1, v_2, \ldots, v_k$ if $v = t_1v_1 + t_2v_2 + \cdots + t_kv_k$ for some scalars $t_1, t_2, \ldots, t_k$.

Example 17 In $\mathbb{R}^3$, $[8\ 7\ 6]$ is a linear combination of the vectors $[1\ 2\ 3]$ and $[4\ 5\ 6]$, because $[8\ 7\ 6] = -4 \cdot [1\ 2\ 3] + 3 \cdot [4\ 5\ 6]$.

Example 18 Let $a_1 = \begin{bmatrix} 1 \\ 0 \\ -4 \end{bmatrix}$, $a_2 = \begin{bmatrix} -2 \\ 2 \\ 5 \end{bmatrix}$, $a_3 = \begin{bmatrix} 1 \\ -8 \\ 9 \end{bmatrix}$, $b = \begin{bmatrix} 0 \\ 8 \\ -9 \end{bmatrix}$. Express b as a linear combination of $a_1, a_2, a_3$.

Solution Let $c_1a_1 + c_2a_2 + c_3a_3 = b$, i.e.
$$c_1\begin{bmatrix} 1 \\ 0 \\ -4 \end{bmatrix} + c_2\begin{bmatrix} -2 \\ 2 \\ 5 \end{bmatrix} + c_3\begin{bmatrix} 1 \\ -8 \\ 9 \end{bmatrix} = \begin{bmatrix} 0 \\ 8 \\ -9 \end{bmatrix}.$$

This means that we have to solve the following system of equations:
$$\begin{cases} c_1 - 2c_2 + c_3 = 0 \\ 2c_2 - 8c_3 = 8 \\ -4c_1 + 5c_2 + 9c_3 = -9 \end{cases}$$

$$\begin{bmatrix} 1 & -2 & 1 & 0 \\ 0 & 2 & -8 & 8 \\ -4 & 5 & 9 & -9 \end{bmatrix} \xrightarrow[\frac{1}{2}r_2 \to r_2]{r_3+4r_1\to r_3} \begin{bmatrix} 1 & -2 & 1 & 0 \\ 0 & 1 & -4 & 4 \\ 0 & -3 & 13 & -9 \end{bmatrix} \xrightarrow{r_3+3r_2\to r_3} \begin{bmatrix} 1 & -2 & 1 & 0 \\ 0 & 1 & -4 & 4 \\ 0 & 0 & 1 & 3 \end{bmatrix}$$
$$\xrightarrow[r_2+4r_3\to r_2]{r_1-r_3\to r_1} \begin{bmatrix} 1 & -2 & 0 & -3 \\ 0 & 1 & 0 & 16 \\ 0 & 0 & 1 & 3 \end{bmatrix} \xrightarrow{r_1+2r_2\to r_1} \begin{bmatrix} 1 & 0 & 0 & 29 \\ 0 & 1 & 0 & 16 \\ 0 & 0 & 1 & 3 \end{bmatrix}.$$

Converting back to a system of equations, we have $c_1 = 29$, $c_2 = 16$, $c_3 = 3$. Hence,
$$b = 29a_1 + 16a_2 + 3a_3.$$

In this case, we say that b is in the span of $\{a_1, a_2, a_3\}$, where the span of $\{a_1, a_2, a_3\}$ is the set of all linear combinations of the vectors $a_1$, $a_2$ and $a_3$ and is denoted by Span$\{a_1, a_2, a_3\}$.
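Finding the coefficients in Example 18 amounts to solving a square linear system, so it can be checked numerically (our own cross-check, not part of the notes):

```python
import numpy as np

# columns of A are a1, a2, a3 from Example 18
A = np.array([[1.0, -2.0, 1.0],
              [0.0, 2.0, -8.0],
              [-4.0, 5.0, 9.0]])
b = np.array([0.0, 8.0, -9.0])

print(np.linalg.solve(A, b))   # [29. 16.  3.], i.e. b = 29 a1 + 16 a2 + 3 a3
```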

Definition 7 Let $v_1, v_2, \ldots, v_k$ be vectors in $\mathbb{R}^n$ and W be the set of all linear combinations of $v_1, v_2, \ldots, v_k$. Then W is called the span of $\{v_1, v_2, \ldots, v_k\}$, i.e. $W = \text{Span}\{v_1, v_2, \ldots, v_k\} = \{w : w = c_1v_1 + c_2v_2 + \cdots + c_kv_k \text{ for some scalars } c_1, \ldots, c_k\}$.

Example 19 Solve the homogeneous system
$$\begin{bmatrix} 2 & 2 & -1 & 0 & 1 \\ -1 & -1 & 2 & 3 & 1 \\ 1 & 1 & -2 & 0 & -1 \\ 0 & 0 & 1 & 1 & 1 \end{bmatrix}x = 0.$$

Solution The augmented matrix is:
$$\begin{bmatrix} 2 & 2 & -1 & 0 & 1 & 0 \\ -1 & -1 & 2 & 3 & 1 & 0 \\ 1 & 1 & -2 & 0 & -1 & 0 \\ 0 & 0 & 1 & 1 & 1 & 0 \end{bmatrix}$$
$$\xrightarrow{r_1\leftrightarrow r_3} \begin{bmatrix} 1 & 1 & -2 & 0 & -1 & 0 \\ -1 & -1 & 2 & 3 & 1 & 0 \\ 2 & 2 & -1 & 0 & 1 & 0 \\ 0 & 0 & 1 & 1 & 1 & 0 \end{bmatrix} \xrightarrow[r_3-2r_1\to r_3]{r_2+r_1\to r_2} \begin{bmatrix} 1 & 1 & -2 & 0 & -1 & 0 \\ 0 & 0 & 0 & 3 & 0 & 0 \\ 0 & 0 & 3 & 0 & 3 & 0 \\ 0 & 0 & 1 & 1 & 1 & 0 \end{bmatrix}$$
$$\xrightarrow{r_2\leftrightarrow r_4} \begin{bmatrix} 1 & 1 & -2 & 0 & -1 & 0 \\ 0 & 0 & 1 & 1 & 1 & 0 \\ 0 & 0 & 3 & 0 & 3 & 0 \\ 0 & 0 & 0 & 3 & 0 & 0 \end{bmatrix} \xrightarrow[r_3-3r_2\to r_3]{r_1+2r_2\to r_1} \begin{bmatrix} 1 & 1 & 0 & 2 & 1 & 0 \\ 0 & 0 & 1 & 1 & 1 & 0 \\ 0 & 0 & 0 & -3 & 0 & 0 \\ 0 & 0 & 0 & 3 & 0 & 0 \end{bmatrix}$$
$$\xrightarrow[r_4-3r_3\to r_4]{-\frac{1}{3}r_3\to r_3} \begin{bmatrix} 1 & 1 & 0 & 2 & 1 & 0 \\ 0 & 0 & 1 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \xrightarrow[r_2-r_3\to r_2]{r_1-2r_3\to r_1} \begin{bmatrix} 1 & 1 & 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$

The corresponding system of equations is:
$$\begin{cases} x_1 + x_2 + x_5 = 0 \\ x_3 + x_5 = 0 \\ x_4 = 0 \end{cases}$$

Let $x_2 = \alpha$, $x_5 = \beta$, where α, β are arbitrary values. Then $x_1 = -\alpha - \beta$, $x_3 = -\beta$, $x_4 = 0$. Hence,
$$x = \begin{bmatrix} -\alpha-\beta & \alpha & -\beta & 0 & \beta \end{bmatrix}^T = \alpha\begin{bmatrix} -1 & 1 & 0 & 0 & 0 \end{bmatrix}^T + \beta\begin{bmatrix} -1 & 0 & -1 & 0 & 1 \end{bmatrix}^T.$$
Each solution of the system can then be written as a linear combination of $v_1 = [-1\ 1\ 0\ 0\ 0]^T$ and $v_2 = [-1\ 0\ -1\ 0\ 1]^T$, i.e., $x = \alpha v_1 + \beta v_2$. Therefore, the solution set of the linear system Ax = 0 is Span$\{v_1, v_2\}$.

Example 20 Let $a_1 = \begin{bmatrix} 1 \\ -3 \\ -1 \end{bmatrix}$, $a_2 = \begin{bmatrix} -2 \\ 1 \\ 7 \end{bmatrix}$, $a_3 = \begin{bmatrix} -1 \\ 9 \\ -5 \end{bmatrix}$, $b = \begin{bmatrix} 0 \\ b \\ d \end{bmatrix}$. Find a condition on b, d such that b is in Span$\{a_1, a_2, a_3\}$.

Solution b is in Span$\{a_1, a_2, a_3\}$ if we can find real numbers $c_1, c_2, c_3$ such that
$$c_1a_1 + c_2a_2 + c_3a_3 = b,$$
i.e.,
$$\begin{cases} c_1 - 2c_2 - c_3 = 0 \\ -3c_1 + c_2 + 9c_3 = b \\ -c_1 + 7c_2 - 5c_3 = d \end{cases}$$

Reducing the augmented matrix to echelon form, we have:
$$\begin{bmatrix} 1 & -2 & -1 & 0 \\ -3 & 1 & 9 & b \\ -1 & 7 & -5 & d \end{bmatrix} \xrightarrow[r_3+r_1\to r_3]{r_2+3r_1\to r_2} \begin{bmatrix} 1 & -2 & -1 & 0 \\ 0 & -5 & 6 & b \\ 0 & 5 & -6 & d \end{bmatrix} \xrightarrow{r_3+r_2\to r_3} \begin{bmatrix} 1 & -2 & -1 & 0 \\ 0 & -5 & 6 & b \\ 0 & 0 & 0 & d+b \end{bmatrix}$$

The system is consistent if and only if
$$d + b = 0. \tag{3}$$
If Eqn (3) is satisfied, then b is in Span$\{a_1, a_2, a_3\}$. Otherwise, b is not in Span$\{a_1, a_2, a_3\}$.

Definition 8 A set of vectors $\{v_1, v_2, \ldots, v_p\}$ is said to be linearly dependent if there are some scalars $c_1, c_2, \ldots, c_p$, not all zero, such that
$$c_1v_1 + c_2v_2 + \cdots + c_pv_p = 0 \tag{4}$$
In such a case, Eqn (4) is called a linear dependence relation among $v_1, v_2, \ldots, v_p$.

Example 21 The vectors $v_1 = [1\ 2\ 3]^T$, $v_2 = [4\ 5\ 6]^T$ and $v_3 = [7\ 8\ 9]^T$ in $\mathbb{R}^3$ are linearly dependent, because $1 \cdot v_1 - 2 \cdot v_2 + 1 \cdot v_3 = 0$. It is also clear that any one of these 3 vectors may be expressed as a linear combination of the other two. For instance, we have $v_1 = 2v_2 - v_3$, $v_2 = \frac{1}{2}v_1 + \frac{1}{2}v_3$ and $v_3 = -v_1 + 2v_2$.

Definition 9 Vectors which are not linearly dependent are called linearly independent. In other words, $v_1, v_2, \ldots, v_k$ are linearly independent if and only if
$$\sum_{j=1}^{k} t_jv_j = 0 \implies t_j = 0 \text{ for } j = 1, 2, \ldots, k.$$

Remark 5 It turns out that linear dependence of vectors is equivalent to the existence of nontrivial solutions for a certain system of homogeneous equations, in which the vectors form the columns of the coefficient matrix of the system. In fact, if $v_1, v_2, \ldots, v_k$ are vectors in $\mathbb{R}^n$, we denote by A the n × k matrix whose j-th column equals the vector $v_j$ for every j. Clearly, $\sum_{j=1}^{k} t_jv_j = 0$ if and only if At = 0, where $t = [t_1\ t_2\ \ldots\ t_k]^T$. Therefore, $v_1, v_2, \ldots, v_k$ are linearly dependent if and only if the system of homogeneous equations At = 0 has nontrivial solutions.

Example 22 Consider $[1\ 2\ 0]^T$, $[1\ 1\ 1]^T$ and $[0\ 0\ 1]^T$ in $\mathbb{R}^3$. Since the matrix
$$A = \begin{bmatrix} 1 & 1 & 0 \\ 2 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}$$
is nonsingular, At = 0 has no nontrivial solution. Therefore, the 3 vectors are linearly independent.
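In practice, the test of Remark 5 can be run numerically: k column vectors are linearly independent exactly when the matrix they form has rank k. A minimal check for Example 22 (our own, not in the notes):

```python
import numpy as np

# columns are the three vectors of Example 22
A = np.array([[1, 1, 0],
              [2, 1, 0],
              [0, 1, 1]])

# rank 3 = number of columns, so the vectors are linearly independent
print(np.linalg.matrix_rank(A))   # 3
```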
By virtue of Theorem 2, we have the following useful result about linear dependence of vectors.

Theorem 6 If v1 , v2 , . . . , vk are vectors in Rn , where k > n, then v1 , v2 , . . . , vk are linearly de-


pendent.
Proof Let $A = [v_1\ v_2\ \cdots\ v_k]$. To show $v_1, v_2, \ldots, v_k$ are linearly dependent, we consider At = 0. By Theorem 2, At = 0 has infinitely many solutions. Hence $v_1, v_2, \ldots, v_k$ are linearly dependent.
For example, any four vectors in R3 are linearly dependent.
Example 23 Let $v_1 = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$, $v_2 = \begin{bmatrix} 4 \\ 5 \\ 6 \end{bmatrix}$, $v_3 = \begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix}$.

(a) Determine if {v1 , v2 , v3 } is linearly dependent or independent;

(b) If possible, find a linear dependence relation among v1 , v2 , v3 .

Solution (a) $\{v_1, v_2, v_3\}$ is linearly dependent if there exist scalars $c_1, c_2, c_3$, not all zero, such that
$$c_1v_1 + c_2v_2 + c_3v_3 = 0 \tag{5}$$
We find all possible solutions of Eqn (5) first. Re-writing, we have:
$$c_1\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} + c_2\begin{bmatrix} 4 \\ 5 \\ 6 \end{bmatrix} + c_3\begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}, \quad\text{i.e.}\quad \begin{cases} c_1 + 4c_2 + 2c_3 = 0 \\ 2c_1 + 5c_2 + c_3 = 0 \\ 3c_1 + 6c_2 = 0 \end{cases}$$
Applying row operations to the augmented matrix, we obtain:
$$\begin{bmatrix} 1 & 4 & 2 & 0 \\ 2 & 5 & 1 & 0 \\ 3 & 6 & 0 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 4 & 2 & 0 \\ 0 & -3 & -3 & 0 \\ 0 & -6 & -6 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 4 & 2 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & -6 & -6 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & -2 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$
Converting back to a system of equations, we obtain $c_1 - 2c_3 = 0$ and $c_2 + c_3 = 0$, so $c_1 = 2c_3$, $c_2 = -c_3$, where $c_3$ is a free variable. Hence, $\{v_1, v_2, v_3\}$ is linearly dependent.

(b) In the answer of (a), let $c_3 = 1$. Then $c_1 = 2$, $c_2 = -1$, and we have the required dependence relation:
$$2v_1 - v_2 + v_3 = 0, \quad\text{or}\quad v_3 = -2v_1 + v_2.$$

Example 24 Let
$$v_1 = \begin{bmatrix} 1 \\ 2 \\ 3 \\ 1 \end{bmatrix}, \quad v_2 = \begin{bmatrix} 1 \\ 4 \\ 6 \\ 1 \end{bmatrix}, \quad v_3 = \begin{bmatrix} 2 \\ -3 \\ -5 \\ 2 \end{bmatrix}.$$
Determine if $\{v_1, v_2, v_3\}$ is linearly dependent or independent.
Solution $\{v_1, v_2, v_3\}$ is linearly independent if the following vector equation has only the trivial solution:
$$c_1v_1 + c_2v_2 + c_3v_3 = 0 \tag{6}$$
We find all possible solutions of Eqn (6). Re-writing Eqn (6), we have:
$$\begin{cases} c_1 + c_2 + 2c_3 = 0 \\ 2c_1 + 4c_2 - 3c_3 = 0 \\ 3c_1 + 6c_2 - 5c_3 = 0 \\ c_1 + c_2 + 2c_3 = 0 \end{cases}$$
Applying row operations to the augmented matrix, we obtain:
$$\begin{bmatrix} 1 & 1 & 2 & 0 \\ 2 & 4 & -3 & 0 \\ 3 & 6 & -5 & 0 \\ 1 & 1 & 2 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 1 & 2 & 0 \\ 0 & 2 & -7 & 0 \\ 0 & 3 & -11 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 1 & 2 & 0 \\ 0 & 1 & -\frac{7}{2} & 0 \\ 0 & 3 & -11 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$
$$\to \begin{bmatrix} 1 & 0 & \frac{11}{2} & 0 \\ 0 & 1 & -\frac{7}{2} & 0 \\ 0 & 0 & -\frac{1}{2} & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & \frac{11}{2} & 0 \\ 0 & 1 & -\frac{7}{2} & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$
Converting back to a system of equations, we have
$$c_1 = 0, \quad c_2 = 0, \quad c_3 = 0.$$
Hence, Eqn (6) has only the trivial solution, and $\{v_1, v_2, v_3\}$ is linearly independent.

1.8 Eigenvectors and eigenvalues of a square matrix

Definition 10 Let A be an n × n matrix. A non-zero vector v in $\mathbb{R}^n$ is called an eigenvector of A if Av is a scalar multiple of v, i.e., if there is a scalar λ such that $Av = \lambda v$. The scalar λ is called an eigenvalue of A and v is said to be an eigenvector of A corresponding to the eigenvalue λ.
Example 25 Let $A = \begin{bmatrix} 3 & 0 \\ 8 & -1 \end{bmatrix}$ and $v = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$. Since
$$Av = \begin{bmatrix} 3 & 0 \\ 8 & -1 \end{bmatrix}\begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 3 \\ 6 \end{bmatrix} = 3\begin{bmatrix} 1 \\ 2 \end{bmatrix} = 3v,$$
v is an eigenvector of A corresponding to the eigenvalue λ = 3. $w = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$ is not an eigenvector of A because
$$Aw = \begin{bmatrix} 3 & 0 \\ 8 & -1 \end{bmatrix}\begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 3 \\ 8 \end{bmatrix},$$
which is not a scalar multiple of w.

Remark 6 It is clear that if v is an eigenvector of A corresponding to the eigenvalue λ, then tv is also an eigenvector of A corresponding to the same eigenvalue λ, provided that t ≠ 0.

By re-writing $Av = \lambda v$ as $Av - \lambda v = 0$, or $(A - \lambda I)v = 0$, we observe that finding an eigenvector of A is equivalent to finding non-trivial solutions of the homogeneous system $(A - \lambda I)v = 0$. Recall that $(A - \lambda I)v = 0$ has non-trivial solutions if and only if $\det(A - \lambda I) = 0$.

Theorem 7 Let A be an n × n matrix. A number λ is an eigenvalue of A if and only if $\det(A - \lambda I) = 0$.

wrudownde.tl/t-xIn)=OQSdveforXSumofeigenvalueotA=tr(
Once we obtain an eigenvalue (say ) of A, we shall be able to use Gauss-Jordan method to
find non-trivial solutions of (A I)v = 0 and thus obtain the corresponding eigenvectors of A.
As the homogeneous system (A I)v = 0 has infinitely many non-trivial solutions, there are
infinitely many eigenvectors of A corresponding to . Therefore, we only need to find eigenvectors
that are linearly independent. All other eigenvectors corresponding to may be expressed as linear
combinations of these linearly independent eigenvectors. A)
2 prodactoteigenvalue.at/t=detlJ
3
a11 a12 ··· a1n
6 7
6 7
6 a21 a22 ··· a2n 7
Remark 7 Let f ( ) = det(A I) = det 6
6 .. .. ..
7.
7
6 . . ··· . 7
4 5
an1 an2 ··· ann

It follows by induction on n that f ( ) is a polynomial of degree n with leading coefficient ( 1)n .


f ( ) is called the characteristic polynomial of A.

Therefore, eigenvalues of the matrix A are the roots of the equation f(λ) = 0. As a result, an n × n matrix has exactly n eigenvalues, counting multiplicities.

Example 26 For $A = \begin{bmatrix} 5 & 4 \\ 1 & 2 \end{bmatrix}$, we have
$$f(\lambda) = \det\begin{bmatrix} 5-\lambda & 4 \\ 1 & 2-\lambda \end{bmatrix} = \lambda^2 - 7\lambda + 6.$$
Eigenvalues of A are therefore roots of the quadratic equation $\lambda^2 - 7\lambda + 6 = 0$. Therefore $\lambda_1 = 6$, $\lambda_2 = 1$.

Case (1) For $\lambda_1 = 6$, $(A - \lambda_1 I)v = 0 \iff \begin{cases} -v_1 + 4v_2 = 0 \\ v_1 - 4v_2 = 0 \end{cases}$
$$\begin{bmatrix} -1 & 4 & 0 \\ 1 & -4 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & -4 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$
We thus obtain $v = [4\ 1]^T$ as an eigenvector corresponding to $\lambda_1 = 6$.

Case (2) For $\lambda_2 = 1$, $(A - \lambda_2 I)v = 0 \iff \begin{cases} 4v_1 + 4v_2 = 0 \\ v_1 + v_2 = 0 \end{cases}$
$$\begin{bmatrix} 4 & 4 & 0 \\ 1 & 1 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$
$\Rightarrow v = [-1\ 1]^T$ as an eigenvector corresponding to $\lambda_2 = 1$.
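numpy.linalg.eig performs the whole computation of Example 26 at once; note that it returns unit-length eigenvectors, so they agree with ours only up to scaling (cf. Remark 6). Our own cross-check:

```python
import numpy as np

A = np.array([[5.0, 4.0],
              [1.0, 2.0]])

eigvals, eigvecs = np.linalg.eig(A)
print(eigvals)    # [6. 1.] (possibly in a different order)
print(eigvecs)    # columns are eigenvectors, proportional to [4 1]^T and [-1 1]^T
```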

Example 27 If $A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 6 & -11 & 6 \end{bmatrix}$, then
$$f(\lambda) = \det(A - \lambda I) = \begin{vmatrix} -\lambda & 1 & 0 \\ 0 & -\lambda & 1 \\ 6 & -11 & 6-\lambda \end{vmatrix} = -(\lambda-1)(\lambda-2)(\lambda-3).$$
Therefore, the eigenvalues are given by $\lambda_1 = 1$, $\lambda_2 = 2$, $\lambda_3 = 3$.

Case (1) For $\lambda_1 = 1$, $(A - \lambda_1 I)v = 0 \iff \begin{bmatrix} -1 & 1 & 0 \\ 0 & -1 & 1 \\ 6 & -11 & 5 \end{bmatrix}\begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} = 0$.
$$\begin{bmatrix} -1 & 1 & 0 & 0 \\ 0 & -1 & 1 & 0 \\ 6 & -11 & 5 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & -1 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$
We thus obtain $v = [1\ 1\ 1]^T$ as an eigenvector corresponding to $\lambda_1 = 1$.

Case (2) For $\lambda_2 = 2$, $(A - \lambda_2 I)v = 0 \iff \begin{bmatrix} -2 & 1 & 0 \\ 0 & -2 & 1 \\ 6 & -11 & 4 \end{bmatrix}\begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} = 0$.
$$\begin{bmatrix} -2 & 1 & 0 & 0 \\ 0 & -2 & 1 & 0 \\ 6 & -11 & 4 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & -\frac{1}{4} & 0 \\ 0 & 1 & -\frac{1}{2} & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$
The corresponding eigenvector is given by $v = [1\ 2\ 4]^T$.

Case (3) For $\lambda_3 = 3$, $(A - \lambda_3 I)v = 0 \iff \begin{bmatrix} -3 & 1 & 0 \\ 0 & -3 & 1 \\ 6 & -11 & 3 \end{bmatrix}\begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} = 0$.
$$\begin{bmatrix} -3 & 1 & 0 & 0 \\ 0 & -3 & 1 & 0 \\ 6 & -11 & 3 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & -\frac{1}{9} & 0 \\ 0 & 1 & -\frac{1}{3} & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$
We thus obtain $v = [1\ 3\ 9]^T$ as an eigenvector corresponding to $\lambda_3 = 3$.

Example 28 For $A = \begin{bmatrix} 3 & -2 & 0 \\ -2 & 3 & 0 \\ 0 & 0 & 5 \end{bmatrix}$, we obtain $f(\lambda) = (5-\lambda)^2(1-\lambda)$.

Hence the eigenvalues are $\lambda_1 = 1$, $\lambda_2 = \lambda_3 = 5$.

For $\lambda_1 = 1$, we solve $(A - I)v = 0$:
$$\begin{bmatrix} 2 & -2 & 0 \\ -2 & 2 & 0 \\ 0 & 0 & 4 \end{bmatrix} \to \begin{bmatrix} 1 & -1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}$$
We obtain $v_1 = [1\ 1\ 0]^T$ as a corresponding eigenvector. For $\lambda_2 = \lambda_3 = 5$ (double root), we solve $(A - 5I)v = 0$:
$$\begin{bmatrix} -2 & -2 & 0 \\ -2 & -2 & 0 \\ 0 & 0 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$
We obtain $v_2 = [1\ -1\ 0]^T$ and $v_3 = [0\ 0\ 1]^T$ as two linearly independent eigenvectors.

We thus conclude that there are altogether three linearly independent eigenvectors for the given matrix A.
Example 29 If $A = \begin{bmatrix} 1 & 1 & 1 \\ -1 & 3 & 1 \\ 1 & -2 & 0 \end{bmatrix}$, then $f(\lambda) = (2-\lambda)(1-\lambda)^2$. Therefore the eigenvalues are $\lambda_1 = 2$, $\lambda_2 = \lambda_3 = 1$.

Solving the linear systems $(A - \lambda I)v = 0$ for λ = 2 and λ = 1 (multiplicity 2),
$$\lambda = 2: \begin{bmatrix} -1 & 1 & 1 \\ -1 & 1 & 1 \\ 1 & -2 & -2 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}$$
$$\lambda = 1: \begin{bmatrix} 0 & 1 & 1 \\ -1 & 2 & 1 \\ 1 & -2 & -1 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}$$
we obtain respectively two linearly independent eigenvectors $v_1 = [0\ 1\ -1]^T$ and $v_2 = [1\ 1\ -1]^T$.

1.9 Diagonalization

Definition 11 A square matrix A is said to be diagonalizable if there is a nonsingular matrix P such that $P^{-1}AP$ is a diagonal matrix. We also say that the matrix P diagonalizes A.
Example 30 Let $A = \begin{bmatrix} 5 & 4 \\ 1 & 2 \end{bmatrix}$. If we take $P = \begin{bmatrix} -1 & 4 \\ 1 & 1 \end{bmatrix}$, a simple calculation shows that
$$P^{-1}AP = \begin{bmatrix} -\frac{1}{5} & \frac{4}{5} \\ \frac{1}{5} & \frac{1}{5} \end{bmatrix}\begin{bmatrix} 5 & 4 \\ 1 & 2 \end{bmatrix}\begin{bmatrix} -1 & 4 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 6 \end{bmatrix}.$$
Therefore, A is diagonalizable and $P = \begin{bmatrix} -1 & 4 \\ 1 & 1 \end{bmatrix}$ diagonalizes A.

Remark 9 It should also be noted that the diagonalizing matrix, if it exists, is not unique. For instance, in Example 30, we may also take $P = \begin{bmatrix} -2 & 1 \\ 2 & \frac{1}{4} \end{bmatrix}$.

Theorem 8 If $\lambda_1, \lambda_2, \ldots, \lambda_k$ are distinct eigenvalues of an n × n matrix A with corresponding eigenvectors $v_1, v_2, \ldots, v_k$, then $v_1, v_2, \ldots, v_k$ are linearly independent.

Proof: Omitted.

Theorem 9 Let A be an n × n matrix. Then A is diagonalizable if and only if A has n linearly independent eigenvectors.

If $v_1, v_2, \ldots, v_n$ are linearly independent eigenvectors of A corresponding to eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ (not necessarily distinct), then by taking P to be the n × n matrix having $v_1, v_2, \ldots, v_n$ as its columns and D to be the diagonal matrix with $d_{jj} = \lambda_j$ for j = 1, 2, ..., n, we obtain AP = PD. This follows because
$$AP = A[v_1\ v_2\ \cdots\ v_n] = [Av_1\ Av_2\ \cdots\ Av_n] = [\lambda_1v_1\ \lambda_2v_2\ \cdots\ \lambda_nv_n] = [v_1\ v_2\ \cdots\ v_n]\begin{bmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_n \end{bmatrix} = PD.$$

Proposition 8 An n × n matrix with n distinct eigenvalues is diagonalizable.

Diagonalization Algorithm of an n × n matrix A

1. Find the characteristic polynomial f(λ) of A.

2. Find the roots of f(λ) to obtain the eigenvalues of A.

3. Repeat (a) and (b) for each eigenvalue λ of A.

(a) Form A − λI by subtracting λ down the diagonal of A.

(b) Find a basis for the solution space of the homogeneous system (A − λI)v = 0. [These basis vectors are linearly independent eigenvectors of A belonging to λ.]

4. Consider the collection $S = \{v_1, v_2, \ldots, v_m\}$ of all eigenvectors obtained in Step 3:

(a) If m ≠ n, then A is not diagonalizable.

(b) If m = n, let P be the matrix whose columns are the eigenvectors $v_1, v_2, \ldots, v_n$. Then
$$P^{-1}AP = \begin{bmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_n \end{bmatrix}.$$

A sympy sketch of this algorithm is given below.
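The algorithm is implemented in sympy's Matrix.diagonalize, which returns a pair (P, D) or raises an error when fewer than n independent eigenvectors exist. A minimal sketch of our own, using the matrix of Example 28:

```python
from sympy import Matrix

A = Matrix([[3, -2, 0],
            [-2, 3, 0],
            [0, 0, 5]])

P, D = A.diagonalize()   # raises MatrixError if A is not diagonalizable
print(P)                 # columns are linearly independent eigenvectors
print(D)                 # diagonal matrix of eigenvalues, here diag(1, 5, 5) up to ordering
```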

Example 31 For $A = \begin{bmatrix} -1 & 2 & 2 \\ 2 & 2 & 2 \\ -3 & -6 & -6 \end{bmatrix}$, we have
$$f(\lambda) = \begin{vmatrix} -1-\lambda & 2 & 2 \\ 2 & 2-\lambda & 2 \\ -3 & -6 & -6-\lambda \end{vmatrix} = -\lambda(\lambda+2)(\lambda+3).$$
Therefore, eigenvalues of A are $\lambda_1 = -2$, $\lambda_2 = -3$, $\lambda_3 = 0$. Since
$$\lambda = -2: \begin{bmatrix} 1 & 2 & 2 \\ 2 & 4 & 2 \\ -3 & -6 & -4 \end{bmatrix} \to \begin{bmatrix} 1 & 2 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}$$
$$\lambda = -3: \begin{bmatrix} 2 & 2 & 2 \\ 2 & 5 & 2 \\ -3 & -6 & -3 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$
$$\lambda = 0: \begin{bmatrix} -1 & 2 & 2 \\ 2 & 2 & 2 \\ -3 & -6 & -6 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}$$
the eigenvectors corresponding to $\lambda_1$, $\lambda_2$ and $\lambda_3$ are respectively given by
$$v_1 = [-2\ 1\ 0]^T, \quad v_2 = [-1\ 0\ 1]^T, \quad v_3 = [0\ -1\ 1]^T.$$
Observe that these eigenvectors are linearly independent in $\mathbb{R}^3$. It thus follows from Theorem 9 that A is diagonalizable, and that by taking
$$P = \begin{bmatrix} -2 & -1 & 0 \\ 1 & 0 & -1 \\ 0 & 1 & 1 \end{bmatrix}, \quad\text{we have}\quad P^{-1}AP = \begin{bmatrix} -2 & 0 & 0 \\ 0 & -3 & 0 \\ 0 & 0 & 0 \end{bmatrix}.$$

Example 32 The matrix $A = \begin{bmatrix} 3 & -2 & 0 \\ -2 & 3 & 0 \\ 0 & 0 & 5 \end{bmatrix}$ in Example 28 has $v_1 = [1\ 1\ 0]^T$, $v_2 = [1\ -1\ 0]^T$ and $v_3 = [0\ 0\ 1]^T$ as eigenvectors corresponding to eigenvalues $\lambda_1 = 1$, $\lambda_2 = \lambda_3 = 5$. Therefore, A is diagonalizable. In fact, by taking
$$P = \begin{bmatrix} 1 & 1 & 0 \\ 1 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad\text{we have}\quad P^{-1}AP = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 5 \end{bmatrix}.$$

Example 33 The 3 × 3 matrix in Example 29 has only two linearly independent eigenvectors and is therefore not diagonalizable.

Remark 10 If $P^{-1}AP = D = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix}$, then $A = PDP^{-1}$.

This implies $A^m = PD^mP^{-1}$ for any positive integer m. As
$$D^m = \begin{bmatrix} \lambda_1^m & 0 & \cdots & 0 \\ 0 & \lambda_2^m & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n^m \end{bmatrix},$$
we may use this result to calculate the m-th power of A easily.
Example 34 Compute $A^m$, where $A = \begin{bmatrix} 4 & -3 \\ 2 & -1 \end{bmatrix}$ and m is a positive integer.

Solution The eigenvalues of A are $\lambda_1 = 2$, $\lambda_2 = 1$ with corresponding eigenvectors given as $v_1 = [3\ 2]^T$ and $v_2 = [1\ 1]^T$. Now
$$P = \begin{bmatrix} 3 & 1 \\ 2 & 1 \end{bmatrix}, \quad P^{-1} = \begin{bmatrix} 1 & -1 \\ -2 & 3 \end{bmatrix}, \quad D = \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}, \quad A = PDP^{-1},$$
and
$$A^m = PD^mP^{-1} = \begin{bmatrix} 3 & 1 \\ 2 & 1 \end{bmatrix}\begin{bmatrix} 2^m & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & -1 \\ -2 & 3 \end{bmatrix} = \begin{bmatrix} 3 \times 2^m - 2 & -3 \times 2^m + 3 \\ 2^{m+1} - 2 & -2^{m+1} + 3 \end{bmatrix}.$$
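The computation of Example 34 can be replayed numerically; as a sanity check we compare against NumPy's built-in matrix power (our own verification, not part of the notes):

```python
import numpy as np

A = np.array([[4.0, -3.0],
              [2.0, -1.0]])
P = np.array([[3.0, 1.0],
              [2.0, 1.0]])

m = 5
Dm = np.diag([2.0 ** m, 1.0 ** m])           # D^m = diag(2^m, 1^m)
Am = P @ Dm @ np.linalg.inv(P)               # A^m = P D^m P^{-1}
print(Am)                                    # [[ 94. -93.] [ 62. -61.]]
print(np.allclose(Am, np.linalg.matrix_power(A, m)))   # True
```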

1.10 Vector Space

A vector space is a system which consists of a set and rules for combining elements in the set.

Definition 12 A set of vectors V is called a vector space over $\mathbb{R}$ if

(a) ADDITION

i. There is an operation + on V which associates with each pair of elements u and v in V an element u + v in V.

ii. u + v = v + u for all u, v ∈ V.

iii. (u + v) + w = u + (v + w) for all u, v, w ∈ V.

iv. There exists an element 0 ∈ V such that v + 0 = 0 + v = v for all v ∈ V.

v. For all v ∈ V there exists an element −v ∈ V such that v + (−v) = (−v) + v = 0.

(b) MULTIPLICATION BY SCALAR

i. For every λ ∈ R and v ∈ V, there is defined an element λv ∈ V.

ii. λ(u + v) = λu + λv for all λ ∈ R and all u, v ∈ V.

iii. (λ + µ)u = λu + µu for all λ, µ ∈ R and all u ∈ V.

iv. (λµ)u = λ(µu) for all λ, µ ∈ R and all u ∈ V.

v. 1 · u = u for all u ∈ V.

Remark 11 Note that V is non-empty (we assume in (a)iv. that there exists an element 0 ∈ V).

Example 35 Consider the set V of the vectors v = (x, y) of real numbers ($\mathbb{R}$). If λ ∈ R, we define λ(x, y) = (λx, λy). If $(x_1, y_1)$ and $(x_2, y_2)$ are two elements of V, then we define
$$(x_1, y_1) + (x_2, y_2) = (x_1 + x_2, y_1 + y_2).$$

Show that V is a vector space.

Show that V is a vector space.

Solution (a)i. $u = (x_1, y_1), v = (x_2, y_2) \in V \Rightarrow x_1, x_2, y_1, y_2 \in \mathbb{R} \Rightarrow x_1 + x_2, y_1 + y_2 \in \mathbb{R}$, thus
$$u + v = (x_1, y_1) + (x_2, y_2) = (x_1 + x_2, y_1 + y_2) \in V.$$

(a)ii.

u + v = (x1 , y1 ) + (x2 , y2 ) = (x1 + x2 , y1 + y2 )

= (x2 + x1 , y2 + y1 ) = (x2 , y2 ) + (x1 , y1 ) = v + u

(a)iii.

(u + v) + w = [(x1 , y1 ) + (x2 , y2 )] + (x3 , y3 )

= (x1 + x2 , y1 + y2 ) + (x3 , y3 )

= ((x1 + x2 ) + x3 , (y1 + y2 ) + y3 )

= (x1 + (x2 + x3 ) , y1 + (y2 + y3 ))

= (x1 , y1 ) + (x2 + x3 , y2 + y3 )

= (x1 , y1 ) + [(x2 , y2 ) + (x3 , y3 )]

= u + (v + w)

(a)iv.
$$u + 0 = (x, y) + (0, 0) = (x + 0, y + 0) = (x, y) = u$$

(a)v. For u = (x, y), take −u = (−x, −y):
$$u + (-u) = (x, y) + (-x, -y) = (x + (-x), y + (-y)) = (0, 0) = 0$$

(b)i. λ ∈ R, v = (x, y) ∈ V ⇒ x, y, λ ∈ R ⇒ λx, λy ∈ R, thus
$$\lambda v = \lambda(x, y) = (\lambda x, \lambda y) \in V.$$

(b)ii.
$$\lambda(u + v) = \lambda[(x_1, y_1) + (x_2, y_2)] = \lambda(x_1 + x_2, y_1 + y_2) = (\lambda x_1 + \lambda x_2, \lambda y_1 + \lambda y_2) = (\lambda x_1, \lambda y_1) + (\lambda x_2, \lambda y_2) = \lambda u + \lambda v$$

(b)iii.
$$(\lambda + \mu)u = ((\lambda + \mu)x, (\lambda + \mu)y) = (\lambda x + \mu x, \lambda y + \mu y) = (\lambda x, \lambda y) + (\mu x, \mu y) = \lambda(x, y) + \mu(x, y) = \lambda u + \mu u$$

(b)iv.
$$(\lambda\mu)u = ((\lambda\mu)x, (\lambda\mu)y) = (\lambda(\mu x), \lambda(\mu y)) = \lambda(\mu x, \mu y) = \lambda(\mu u)$$

(b)v.
$$1 \cdot u = 1 \cdot (x, y) = (1 \cdot x, 1 \cdot y) = (x, y) = u$$

1.11 Subspaces

Definition 13 Suppose that V is a vector space over R. A subset W of V is called a subspace of


V if W itself is a vector space, with the same definition of addition and scalar multiplication as in
V.

Theorem 10 A subset W of V is a subspace if and only if the following are all true:

(i) 0 ∈ W;

(ii) u + v ∈ W for all u, v ∈ W;

(iii) λu ∈ W for all u ∈ W, λ ∈ R.

Example 36 For every vector space V, the subsets V and {0} are subspaces. Other subspaces are called proper subspaces.

Example 37 Let V = R3 and W = {(x, y, z) : x + 2y + 3z = 0} . Then W is a subspace of V.

Solution (i) 0 + 2(0) + 3(0) = 0 ⇒ (0, 0, 0) ∈ W.

(ii) Assume $(x_1, y_1, z_1) \in W$ and $(x_2, y_2, z_2) \in W$. Then $x_1 + 2y_1 + 3z_1 = 0$ and $x_2 + 2y_2 + 3z_2 = 0$, so
$$(x_1 + x_2) + 2(y_1 + y_2) + 3(z_1 + z_2) = (x_1 + 2y_1 + 3z_1) + (x_2 + 2y_2 + 3z_2) = 0.$$
Hence $(x_1 + x_2, y_1 + y_2, z_1 + z_2) \in W$. Thus, $(x_1, y_1, z_1) + (x_2, y_2, z_2) \in W$.

(iii) Assume $(x, y, z) \in W$ and $\lambda \in \mathbb{R}$. Then $x + 2y + 3z = 0$, so
$$\lambda x + 2(\lambda y) + 3(\lambda z) = \lambda(x + 2y + 3z) = 0.$$
Hence $(\lambda x, \lambda y, \lambda z) \in W$. Thus, $\lambda(x, y, z) \in W$.

Example 38 Any plane (ax + by + cz = 0) in R3 which passes through the origin is also a subspace
of R3 .
Example 39 Let $H = \{[u\ v]^T \in \mathbb{R}^2 : u = 2s \text{ and } v = 3 + 8s\}$, i.e. H is the set of all points in $\mathbb{R}^2$ of the form $[u\ v]^T = [2s\ 3+8s]^T$. Determine if H is a subspace of $\mathbb{R}^2$.

Solution If $[2s\ 3+8s]^T$ were zero for some s, then 2s = 0 and 3 + 8s = 0, i.e. s = 0 and s = −3/8. This is a contradiction. Hence, the zero vector of $\mathbb{R}^2$ is not in H (condition (i) is not satisfied), and H is not a subspace.

Example 40 The set $W = \left\{\begin{pmatrix} x \\ y \end{pmatrix} : xy \ge 0 \text{ and } x, y \in \mathbb{R}\right\}$. Determine if W is a subspace of $\mathbb{R}^2$.

Solution Let $u_1 = \begin{pmatrix} 2 \\ 1 \end{pmatrix}$ and $u_2 = \begin{pmatrix} -1 \\ -2 \end{pmatrix}$. Since $2 \times 1 \ge 0$ and $(-1) \times (-2) \ge 0$, both $u_1$ and $u_2$ are in W. But their sum $u_1 + u_2 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}$ is not in W because the product of its two components is $(1) \times (-1) < 0$. Therefore, W is not a subspace.
Example 41 The set $W = \left\{\begin{pmatrix} x \\ y \end{pmatrix} \in \mathbb{R}^2 : y = x^2\right\}$. Determine if W is a subspace of $\mathbb{R}^2$.

Solution Let λ = 2 and $u = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$. Since $1 = 1^2$, u is in W. But the vector $\lambda u = \begin{pmatrix} 2 \\ 2 \end{pmatrix}$ is not in W because $2 \ne 2^2$. Therefore, W is not a subspace.

1.12 Bases

In this section we identify and study the subsets that span a vector space V or a subspace H as
“efficiently” as possible. The following definition gives the most simplified representation of a vector
space or subspace.

Definition 14 Let H be a subspace of a vector space V. A set of vectors B = {b1 , b2 , . . . , bp } in


V is a basis for H, if

(i) B is a linearly independent set, and

(ii) the space spanned by B coincides with H; i.e. H = Span{b1 , b2 , . . . , bp } .

Example 42 Let
$$e_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad e_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}.$$
For every $v = \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} \in \mathbb{R}^2$, we can write
$$v = \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = v_1e_1 + v_2e_2.$$
Hence, R2 is spanned by {e1 , e2 } . Obviously, e1 and e2 are linearly independent. Thus, the set
{e1 , e2 } is a basis for R2 , it is called the standard basis for R2 .

Example 43 Let
$$e_1 = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad e_2 = \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix}, \quad \ldots, \quad e_n = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix}.$$
The set {e1 , e2 , . . . , en } is called the standard basis for Rn .

Example 44 Let
$$v_1 = \begin{bmatrix} 3 \\ 4 \\ 2 \end{bmatrix}, \quad v_2 = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}, \quad v_3 = \begin{bmatrix} 6 \\ 7 \\ 5 \end{bmatrix}.$$
Determine if $\{v_1, v_2, v_3\}$ is a basis for $\mathbb{R}^3$.

Solution Let $A = [v_1\ v_2\ v_3]$. Then det A = 6 ≠ 0. Therefore, A is invertible and $v_1, v_2, v_3$ are linearly independent. Since A is invertible, for every $b \in \mathbb{R}^3$ there exists a vector x such that
$$b = Ax = [v_1\ v_2\ v_3]\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = x_1v_1 + x_2v_2 + x_3v_3.$$
Therefore, $\mathbb{R}^3 = \text{Span}\{v_1, v_2, v_3\}$. Thus, $\{v_1, v_2, v_3\}$ forms a basis for $\mathbb{R}^3$.
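Example 44's determinant test is a one-liner numerically (our own cross-check, not in the notes):

```python
import numpy as np

# columns are v1, v2, v3 from Example 44
A = np.array([[3.0, 0.0, 6.0],
              [4.0, 1.0, 7.0],
              [2.0, 1.0, 5.0]])

# nonzero (up to rounding), so {v1, v2, v3} is a basis of R^3
print(np.linalg.det(A))   # 6.0
```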

Example 45 Let
$$v_1 = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}, \quad v_2 = \begin{bmatrix} 3 \\ 5 \\ 7 \end{bmatrix}, \quad v_3 = \begin{bmatrix} 4 \\ 5 \\ 6 \end{bmatrix}$$
and $H = \text{Span}\{v_1, v_2, v_3\}$. Show that Span$\{v_1, v_2, v_3\}$ = Span$\{v_1, v_2\}$, and find a basis for H.

Solution Note that every vector in Span$\{v_1, v_2\}$ belongs to H. That is, Span$\{v_1, v_2\} \subseteq H$. Let $x = c_1v_1 + c_2v_2 + c_3v_3$ be any vector in H. Since $v_3 = -5v_1 + 3v_2$, we have
$$x = c_1v_1 + c_2v_2 + c_3(-5v_1 + 3v_2) = (c_1 - 5c_3)v_1 + (c_2 + 3c_3)v_2.$$
Hence, x is in Span$\{v_1, v_2\}$ and $H \subseteq \text{Span}\{v_1, v_2\}$. Thus, H = Span$\{v_1, v_2\}$.


Since {v1 , v2 } is linearly independent, {v1 , v2 } is a basis of H.

Theorem 11 (The Spanning Set Theorem) Let S = {v1 , v2 , . . . , vp } be a set in V and let H =
Span{v1 , v2 , . . . , vp } .

(a) If one of the vectors in S, say vk , is a linear combination of the remaining vectors in S, then
the set formed from S by removing vk still spans H.

(b) If H 6= {0} , some subset of S is a basis for H.

Proof (a) By rearranging the list of vectors in S, if necessary, we may suppose that $v_p$ is a linear combination of $v_1, v_2, \ldots, v_{p-1}$, say, $v_p = a_1v_1 + a_2v_2 + \cdots + a_{p-1}v_{p-1}$. Given any x in H, we have scalars $c_1, c_2, \ldots, c_p$ such that
$$x = c_1v_1 + c_2v_2 + \cdots + c_{p-1}v_{p-1} + c_pv_p.$$
Then
$$x = c_1v_1 + \cdots + c_{p-1}v_{p-1} + c_p(a_1v_1 + \cdots + a_{p-1}v_{p-1}) = (c_1 + c_pa_1)v_1 + (c_2 + c_pa_2)v_2 + \cdots + (c_{p-1} + c_pa_{p-1})v_{p-1}.$$
As x is a linear combination of $v_1, v_2, \ldots, v_{p-1}$, the set $\{v_1, v_2, \ldots, v_{p-1}\}$ spans H.


(b) If the original spanning set S is linearly independent, it is already a basis for H. Otherwise,
one of the vectors in S depends on the others and may be deleted by part (a). We may repeat this
process until the spanning set is linearly independent and hence is a basis for H. If the spanning
set is eventually reduced to one vector, that vector will be nonzero because H 6= {0}.

Theorem 12 Let B = {b1 , b2 , . . . , bn } be a basis for a vector space V . Then for each x in V,
there exists a unique set of scalars c1 , c2 , . . . , cn , such that

x = c1 b1 + c2 b2 + · · · + cn bn (7)

Proof Since B spans V, there exist scalars such that Eqn (7) holds. Suppose x also has the representation $x = d_1b_1 + d_2b_2 + \cdots + d_nb_n$ for scalars $d_1, d_2, \ldots, d_n$. Then, subtracting, we have
$$0 = x - x = (c_1 - d_1)b_1 + (c_2 - d_2)b_2 + \cdots + (c_n - d_n)b_n \tag{8}$$
Since B is linearly independent, the weights in Eqn (8) must all be zero, i.e. $c_j = d_j$ for $1 \le j \le n$.

Theorem 13 If a vector space V has a basis B = {b1 , b2 , . . . , bn } , then any set in V containing
more than n vectors must be linearly dependent.

Theorem 14 If a vector space V has a basis of n vectors, and if {u1 , u2 , . . . , uk } is any k linearly
independent vectors in V, then we must have k  n.

Theorem 15 If a vector space V has a basis of n vectors, then every basis of V must consist of
exactly n vectors.

Definition 15 If V is spanned by a finite set, then V is said to be finite-dimensional and the


dimension of V , written as dim V is the number of vectors in a basis for V .

Note: The dimension of the zero vector space {0} is defined to be zero. If V is not spanned by
a finite set, then V is said to be infinite-dimensional.

Example 46 The set of vectors {e1 , e2 , . . . , en } is the standard basis for Rn . Thus, dim Rn = n.
Example 47 Let H = Span$\{v_1, v_2\}$ where $v_1 = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$ and $v_2 = \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix}$. A basis for H is $\{v_1, v_2\}$, since $v_1, v_2$ are linearly independent. Hence dim H = 2.

1.13 Column space, row space and null space

Definition 16 Given an m ⇥ n matrix A, the null space of A is the set Nul(A) of all solutions to
the homogeneous equation Ax = 0. In set notation,

Nul (A) = {x : x is in Rn , and Ax = 0}.


Example 48 Let $A = \begin{bmatrix} 1 & -3 & 8 \\ -3 & 9 & 2 \end{bmatrix}$ and $u = \begin{bmatrix} 9 \\ 3 \\ 0 \end{bmatrix}$. Determine if u belongs to the null space of A.

Solution
$$Au = \begin{bmatrix} 1 & -3 & 8 \\ -3 & 9 & 2 \end{bmatrix}\begin{bmatrix} 9 \\ 3 \\ 0 \end{bmatrix} = \begin{bmatrix} 9 - 9 \\ -27 + 27 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$
Thus, u is in Nul(A).

Theorem 16 The null space of an m × n matrix A is a subspace of $\mathbb{R}^n$. Equivalently, the set of all solutions to a system Ax = 0 of m homogeneous linear equations in n unknowns is a subspace of $\mathbb{R}^n$.

Proof (i) 0 is in Nul(A). (trivial)


(ii) Let u and v be any two vectors in Nul(A). Then Au = 0 and Av = 0.
Hence, A (u + v) = Au + Av = 0 + 0 = 0. Therefore, u + v is also in Nul(A).
(iii) If c is any scalar, then A (cu) = c (Au) = 0.
Thus, cu is also in Nul(A).
By (i), (ii), (iii), Nul(A) is a subspace of Rn .

Example 49 Find a spanning set for the null space of the matrix:
$$A = \begin{bmatrix} 1 & -2 & 0 & -1 & 3 \\ 1 & -2 & 1 & 1 & 1 \\ 2 & -4 & 0 & -2 & 6 \end{bmatrix}$$

Solution First solve Ax = 0. Reduce the augmented matrix to reduced echelon form:
$$\begin{bmatrix} 1 & -2 & 0 & -1 & 3 & 0 \\ 0 & 0 & 1 & 2 & -2 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}, \qquad \begin{matrix} x_1 - 2x_2 - x_4 + 3x_5 = 0 \\ x_3 + 2x_4 - 2x_5 = 0 \end{matrix}$$
The general solution is $x_1 = 2x_2 + x_4 - 3x_5$, $x_3 = -2x_4 + 2x_5$, with $x_2$, $x_4$ and $x_5$ free.
$$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix} = \begin{bmatrix} 2x_2 + x_4 - 3x_5 \\ x_2 \\ -2x_4 + 2x_5 \\ x_4 \\ x_5 \end{bmatrix} = x_2\underbrace{\begin{bmatrix} 2 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}}_{u} + x_4\underbrace{\begin{bmatrix} 1 \\ 0 \\ -2 \\ 1 \\ 0 \end{bmatrix}}_{v} + x_5\underbrace{\begin{bmatrix} -3 \\ 0 \\ 2 \\ 0 \\ 1 \end{bmatrix}}_{w}$$
Hence, $\{u, v, w\}$ is a spanning set for Nul(A).

Remark 12 1. The set {u, v, w} produced by the method in the last example is automatically linearly independent (see the sketch below).

2. The number of vectors in the spanning set for Nul(A) equals the number of free variables in
the equation Ax = 0.
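For comparison, a symbolic computation reproduces this spanning set directly. A sketch, assuming SymPy is installed; `nullspace()` returns one basis vector per free variable, in agreement with Remark 12:

```python
# One basis vector per free variable (x2, x4, x5 here), matching Example 49.
from sympy import Matrix

A = Matrix([[1, -2, 0, -1, 3],
            [1, -2, 1, 1, 1],
            [2, -4, 0, -2, 6]])

for vec in A.nullspace():
    print(vec.T)   # the vectors u, v, w found above
```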

Definition 17 Given an $m \times n$ matrix A, the column space of A is the set Col(A) of all linear
combinations of the column vectors of A. In set notation,

$$\operatorname{Col}(A) = \{b : b \text{ is in } \mathbb{R}^m, \text{ and } b = Ax \text{ for some } x \text{ in } \mathbb{R}^n\}.$$

Definition 18 If A is an $m \times n$ matrix, the subspace of $\mathbb{R}^n$ spanned by the row vectors of A is
called the row space of A and is denoted by Row(A).

Example 50 Let
$$A = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.$$
The row space of A is the set of all 4-vectors spanned by
$$\left\{ \begin{bmatrix} 1 & 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 & 1 \end{bmatrix} \right\},$$
i.e.
$$\operatorname{Row}(A) = \operatorname{Span}\left\{ \begin{bmatrix} 1 & 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 & 1 \end{bmatrix} \right\}.$$
The elements of Row(A) have the form
$$a \begin{bmatrix} 1 & 0 & 0 & 0 \end{bmatrix} + b \begin{bmatrix} 0 & 1 & 0 & 0 \end{bmatrix} + c \begin{bmatrix} 0 & 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} a & b & 0 & c \end{bmatrix}.$$
Thus,
$$\operatorname{Row}(A) = \left\{ \begin{bmatrix} a & b & 0 & c \end{bmatrix} : a, b, c \in \mathbb{R} \right\}.$$
The column space of A is the set of all vectors spanned by
$$\left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \right\},$$
i.e.
$$\operatorname{Col}(A) = \operatorname{Span}\left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \right\}.$$
The elements of Col(A) have the form
$$a \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + b \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} + c \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} + d \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} a \\ b \\ d \end{bmatrix}.$$

Thus
$$\operatorname{Col}(A) = \left\{ \begin{bmatrix} a \\ b \\ d \end{bmatrix} : a, b, d \in \mathbb{R} \right\}.$$
The row space of A is a three-dimensional subspace of $\mathbb{R}^4$ and the column space of A is $\mathbb{R}^3$.

Theorem 17 The row space of an $m \times n$ matrix A is a subspace of $\mathbb{R}^n$.

Proof Trivial.

Theorem 18 Two row equivalent matrices have the same row space.

Proof If B is row equivalent to A, then B can be formed from A by a finite sequence of
row operations. Thus the row vectors of B must be linear combinations of the row vectors of
A. Consequently, the row space of B must be a subspace of the row space of A. Since A is row
equivalent to B, by the same reasoning, the row space of A is a subspace of the row space of B.
Hence the two row spaces are equal.

Example 51 The matrix
$$A = \begin{bmatrix} 1 & -2 & 0 & -1 & 3 \\ 1 & -2 & 1 & 1 & 1 \\ 2 & -4 & 0 & -2 & 6 \end{bmatrix}$$
can be reduced to
$$B = \begin{bmatrix} 1 & -2 & 0 & -1 & 3 \\ 0 & 0 & 1 & 2 & -2 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}.$$
By Theorem 18, matrices A and B have the same row space, i.e.
$$\begin{aligned} \operatorname{Row}(A) &= \operatorname{Span}\left\{ \begin{bmatrix} 1 & -2 & 0 & -1 & 3 \end{bmatrix}, \begin{bmatrix} 1 & -2 & 1 & 1 & 1 \end{bmatrix}, \begin{bmatrix} 2 & -4 & 0 & -2 & 6 \end{bmatrix} \right\} \\ &= \operatorname{Span}\left\{ \begin{bmatrix} 1 & -2 & 0 & -1 & 3 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 1 & 2 & -2 \end{bmatrix} \right\} \\ &= \operatorname{Row}(B). \end{aligned}$$
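Theorem 18 is easy to see computationally: row reducing A does not change its row space. A sketch assuming SymPy; `rref()` returns the reduced matrix together with its pivot column indices:

```python
# Row reduction preserves the row space: the nonzero rows of B span Row(A).
from sympy import Matrix

A = Matrix([[1, -2, 0, -1, 3],
            [1, -2, 1, 1, 1],
            [2, -4, 0, -2, 6]])

B, pivots = A.rref()
print(B)        # nonzero rows: (1,-2,0,-1,3) and (0,0,1,2,-2)
print(pivots)   # (0, 2): the pivot columns
```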

Theorem 19 The column space of an $m \times n$ matrix A is a subspace of $\mathbb{R}^m$.

Proof Trivial.

Lemma 1 Suppose that A is an $m \times n$ matrix. Then Ax = b is consistent if and only if $b \in \operatorname{Col}(A)$.

Proof ($\Rightarrow$) Let $A = \begin{bmatrix} a_1 & a_2 & \ldots & a_n \end{bmatrix}$. If Ax = b is consistent, then there exists a vector x
such that
$$b = Ax = x_1 a_1 + x_2 a_2 + \cdots + x_n a_n \in \operatorname{Col}(A).$$
($\Leftarrow$) If $b \in \operatorname{Col}(A)$, then there exist weights $x_1, x_2, \ldots, x_n$ such that
$$b = x_1 a_1 + x_2 a_2 + \cdots + x_n a_n = Ax.$$
Hence Ax = b is consistent.

Example 52 Consider the linear system
$$\begin{pmatrix} 1 & 2 & 1 \\ 0 & 3 & 0 \\ 4 & 5 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 9 \\ 6 \\ 9 \end{pmatrix}.$$
It is easy to check that $\begin{pmatrix} -2 \\ 2 \\ 7 \end{pmatrix}$ is a solution of the system. Then we may write
$$\begin{pmatrix} 9 \\ 6 \\ 9 \end{pmatrix} = (-2) \begin{pmatrix} 1 \\ 0 \\ 4 \end{pmatrix} + 2 \begin{pmatrix} 2 \\ 3 \\ 5 \end{pmatrix} + 7 \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}.$$
Thus, $b \in \operatorname{Col}(A)$.
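Numerically, exhibiting b as an element of Col(A) amounts to solving Ax = b. A sketch of Example 52, assuming NumPy:

```python
# Solving Ax = b writes b as a combination of A's columns, so b is in Col(A).
import numpy as np

A = np.array([[1.0, 2.0, 1.0],
              [0.0, 3.0, 0.0],
              [4.0, 5.0, 1.0]])
b = np.array([9.0, 6.0, 9.0])

x = np.linalg.solve(A, b)
print(x)                      # [-2.  2.  7.]
assert np.allclose(A @ x, b)  # b = -2*a1 + 2*a2 + 7*a3, hence b is in Col(A)
```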

Theorem 20 Let A be an $m \times n$ matrix. The linear system Ax = b is consistent for every $b \in \mathbb{R}^m$
if and only if the column vectors of A span $\mathbb{R}^m$. The system Ax = b has at most one solution for
every $b \in \mathbb{R}^m$ if and only if the column vectors of A are linearly independent.

Proof We have seen that the system Ax = b is consistent if and only if b is in the column
space of A. It follows that Ax = b will be consistent for every $b \in \mathbb{R}^m$ if and only if the column
vectors of A span $\mathbb{R}^m$. To prove the second statement, note that if Ax = b has at most one solution
for every b, then in particular the system Ax = 0 can have only the trivial solution, and hence
the column vectors of A must be linearly independent. Conversely, if the column vectors of A are
linearly independent, Ax = 0 has only the trivial solution. Now if $x_1$ and $x_2$ were both solutions
to Ax = b, then $x_1 - x_2$ would be a solution to Ax = 0, since
$$A(x_1 - x_2) = Ax_1 - Ax_2 = b - b = 0.$$
It follows that $x_1 - x_2 = 0$, and hence $x_1$ must equal $x_2$.

Remark 13 Let A be an invertible $n \times n$ matrix, say $A = \begin{bmatrix} a_1 & a_2 & \ldots & a_n \end{bmatrix}$. Then the columns
of A form a basis for $\mathbb{R}^n$, because they are linearly independent and they span $\mathbb{R}^n$.
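A small numerical illustration of Remark 13, assuming NumPy and reusing the coefficient matrix of Example 52 (its determinant is nonzero, so it is invertible):

```python
# A nonzero determinant means A is invertible, so its columns are a basis of R^3.
import numpy as np

A = np.array([[1.0, 2.0, 1.0],
              [0.0, 3.0, 0.0],
              [4.0, 5.0, 1.0]])   # the matrix from Example 52

print(np.linalg.det(A))           # -9.0 (up to rounding): nonzero
print(np.linalg.matrix_rank(A))   # 3: the columns are independent and span R^3
```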

Example 53 Let
$$A = \begin{bmatrix} a_1 & a_2 & \ldots & a_5 \end{bmatrix} = \begin{bmatrix} 1 & 3 & 0 & 2 & -1 \\ 3 & 9 & 1 & 5 & 5 \\ 2 & 6 & 1 & 3 & 2 \\ 5 & 15 & 2 & 8 & 8 \end{bmatrix}.$$
Find a basis for Col(A).

Solution
$$\begin{bmatrix} 1 & 3 & 0 & 2 & -1 \\ 3 & 9 & 1 & 5 & 5 \\ 2 & 6 & 1 & 3 & 2 \\ 5 & 15 & 2 & 8 & 8 \end{bmatrix} \to \begin{bmatrix} 1 & 3 & 0 & 2 & -1 \\ 0 & 0 & 1 & -1 & 8 \\ 0 & 0 & 1 & -1 & 4 \\ 0 & 0 & 2 & -2 & 13 \end{bmatrix} \to \begin{bmatrix} 1 & 3 & 0 & 2 & -1 \\ 0 & 0 & 1 & -1 & 8 \\ 0 & 0 & 0 & 0 & -4 \\ 0 & 0 & 0 & 0 & -3 \end{bmatrix}$$
$$\to \begin{bmatrix} 1 & 3 & 0 & 2 & -1 \\ 0 & 0 & 1 & -1 & 8 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & -3 \end{bmatrix} \to \begin{bmatrix} 1 & 3 & 0 & 2 & 0 \\ 0 & 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} = B = \begin{bmatrix} b_1 & b_2 & \ldots & b_5 \end{bmatrix}$$

Hence
$$b_2 = 3b_1 \text{ and } b_4 = 2b_1 - b_3 \;\Rightarrow\; a_2 = 3a_1 \text{ and } a_4 = 2a_1 - a_3.$$

We may discard $a_2$ and $a_4$; the remaining set of vectors, i.e. $\{a_1, a_3, a_5\}$, can still span Col(A).
Moreover, $\{b_1, b_3, b_5\}$ is linearly independent, so $\{a_1, a_3, a_5\}$ is linearly independent. Therefore
$$\left\{ a_1 = \begin{bmatrix} 1 \\ 3 \\ 2 \\ 5 \end{bmatrix},\; a_3 = \begin{bmatrix} 0 \\ 1 \\ 1 \\ 2 \end{bmatrix},\; a_5 = \begin{bmatrix} -1 \\ 5 \\ 2 \\ 8 \end{bmatrix} \right\}$$
is a basis for Col(A).
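The same pivot-column recipe can be automated. A sketch assuming SymPy, with the signs of A as reconstructed above; note that the basis vectors are taken from A itself, not from its echelon form:

```python
# rref() reports the pivot columns; the SAME columns of the ORIGINAL A
# form a basis for Col(A) (the columns of the reduced form generally do not).
from sympy import Matrix

A = Matrix([[1, 3, 0, 2, -1],
            [3, 9, 1, 5, 5],
            [2, 6, 1, 3, 2],
            [5, 15, 2, 8, 8]])

_, pivots = A.rref()
print(pivots)             # (0, 2, 4): columns a1, a3, a5
for j in pivots:
    print(A.col(j).T)     # the basis vectors, taken from A itself
```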

Remark 14 When A is row reduced to B, Ax = 0 and Bx = 0 have exactly the same solution set.
Therefore, the columns of A have exactly the same linear dependence relationships as the columns of
B, i.e., elementary row operations on a matrix do not affect the linear dependence relations among
the columns of the matrix.

Definition 19 The dimension of the null space of A is called the nullity of A. The dimension of
the column space of A is called the rank of A.

Remark 15 Let A be an $m \times n$ matrix, U be its echelon form and r be the number of nonzero
rows of U.

1. If r = m, then there are no zero rows in U and there is always a solution to the system
Ax = b.

2. The system has $n - r$ free variables in the solution.

3. If r = n, then there are no free variables in the solution and Nul(A) contains only x = 0.

4. $r \leq \min(m, n)$.

5. Rank(A) is the number of pivot columns in A.

6. The nullity of A is the number of free variables in the equation Ax = 0. It is also the number of
nonpivot columns in A.

Theorem 21 (Fundamental Theorem of Linear Algebra) If A is an $m \times n$ matrix, then

$$\operatorname{Rank}(A) + \text{nullity of } A = n.$$

Proof Let U be the reduced row echelon form of A. The system Ax = 0 is equivalent to the
system Ux = 0. If A has rank r, then U will have r nonzero rows and consequently the system
Ux = 0 will involve r pivot variables and $n - r$ free variables. The dimension of Nul(A) equals the
number of free variables.
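Theorem 21 is straightforward to verify on any concrete matrix. A sketch assuming SymPy, reusing the matrix from Examples 49 and 51 (rank 2, nullity 3, n = 5):

```python
# rank(A) + nullity(A) = n, checked on the matrix of Examples 49 and 51.
from sympy import Matrix

A = Matrix([[1, -2, 0, -1, 3],
            [1, -2, 1, 1, 1],
            [2, -4, 0, -2, 6]])

rank = A.rank()                   # number of pivot columns: 2
nullity = len(A.nullspace())      # number of free variables: 3
assert rank + nullity == A.cols   # 2 + 3 = 5 = n
```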

Theorem 22 If A is an $m \times n$ matrix, the dimension of the row space of A equals the dimension
of the column space of A.

Note: The row echelon form U tells us only which columns of A to use to form a basis. We
cannot use the column vectors from U since, in general, U and A have different column spaces.
But one can use the row echelon form U of A to find a basis for the column space of A. One
need only determine the columns of U that correspond to the leading 1's. Those same columns of
A will be linearly independent and form a basis for the column space of A.

Example 54 Find bases and the dimensions for the row space, the column space and the null space
of the matrix:
$$A = \begin{bmatrix} -2 & -5 & 8 & 0 & -17 \\ 1 & 3 & -5 & 1 & 5 \\ 3 & 11 & -19 & 7 & 1 \\ 1 & 7 & -13 & 5 & -3 \end{bmatrix}$$

Solution Reducing A to echelon form, we have:
$$A = \begin{bmatrix} -2 & -5 & 8 & 0 & -17 \\ 1 & 3 & -5 & 1 & 5 \\ 3 & 11 & -19 & 7 & 1 \\ 1 & 7 & -13 & 5 & -3 \end{bmatrix} \to \begin{bmatrix} 1 & 3 & -5 & 1 & 5 \\ -2 & -5 & 8 & 0 & -17 \\ 3 & 11 & -19 & 7 & 1 \\ 1 & 7 & -13 & 5 & -3 \end{bmatrix}$$
$$\to \begin{bmatrix} 1 & 3 & -5 & 1 & 5 \\ 0 & 1 & -2 & 2 & -7 \\ 0 & 2 & -4 & 4 & -14 \\ 0 & 4 & -8 & 4 & -8 \end{bmatrix} \to \begin{bmatrix} 1 & 3 & -5 & 1 & 5 \\ 0 & 1 & -2 & 2 & -7 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & -4 & 20 \end{bmatrix}$$
$$\to \begin{bmatrix} 1 & 3 & -5 & 1 & 5 \\ 0 & 1 & -2 & 2 & -7 \\ 0 & 0 & 0 & -4 & 20 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 3 & -5 & 1 & 5 \\ 0 & 1 & -2 & 2 & -7 \\ 0 & 0 & 0 & 1 & -5 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$
$$\to \begin{bmatrix} 1 & 3 & -5 & 0 & 10 \\ 0 & 1 & -2 & 0 & 3 \\ 0 & 0 & 0 & 1 & -5 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & 1 & 0 & 1 \\ 0 & 1 & -2 & 0 & 3 \\ 0 & 0 & 0 & 1 & -5 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$
A basis for Row(A) is: $\left\{ \begin{bmatrix} 1 & 0 & 1 & 0 & 1 \end{bmatrix}, \begin{bmatrix} 0 & 1 & -2 & 0 & 3 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 & 1 & -5 \end{bmatrix} \right\}$ and
dim Row(A) = 3.
A basis for Col(A) is:
$$\left\{ \begin{bmatrix} -2 \\ 1 \\ 3 \\ 1 \end{bmatrix}, \begin{bmatrix} -5 \\ 3 \\ 11 \\ 7 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 7 \\ 5 \end{bmatrix} \right\}$$

Thus, dim Col(A) = 3.

From the reduced echelon form, $x_3$ and $x_5$ are free variables:
$$\begin{array}{ll} x_1 + x_3 + x_5 = 0 & \Rightarrow\; x_1 = -x_3 - x_5 \\ x_2 - 2x_3 + 3x_5 = 0 & \Rightarrow\; x_2 = 2x_3 - 3x_5 \\ x_4 - 5x_5 = 0 & \Rightarrow\; x_4 = 5x_5 \end{array}$$
$$x = \begin{bmatrix} -x_3 - x_5 \\ 2x_3 - 3x_5 \\ x_3 \\ 5x_5 \\ x_5 \end{bmatrix} = x_3 \begin{bmatrix} -1 \\ 2 \\ 1 \\ 0 \\ 0 \end{bmatrix} + x_5 \begin{bmatrix} -1 \\ -3 \\ 0 \\ 5 \\ 1 \end{bmatrix}$$
A basis for Nul(A) is $\left\{ \begin{bmatrix} -1 \\ 2 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} -1 \\ -3 \\ 0 \\ 5 \\ 1 \end{bmatrix} \right\}$ and dim Nul(A) = 2.
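All three bases in Example 54 can be reproduced in a few lines. A sketch assuming SymPy, with the signs of A as reconstructed above:

```python
# SymPy computes bases for all three fundamental subspaces directly.
from sympy import Matrix

A = Matrix([[-2, -5, 8, 0, -17],
            [1, 3, -5, 1, 5],
            [3, 11, -19, 7, 1],
            [1, 7, -13, 5, -3]])

print(A.rowspace())     # 3 vectors -> dim Row(A) = 3
print(A.columnspace())  # 3 vectors -> dim Col(A) = 3 = rank(A)
print(A.nullspace())    # 2 vectors -> dim Nul(A) = 2 = 5 - rank(A)
```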

Remark 16 Let A be an $m \times n$ matrix. If the column vectors of A span $\mathbb{R}^m$, then n must be
greater than or equal to m, since no set of fewer than m vectors can span $\mathbb{R}^m$. If the columns of A
are linearly independent, then n must be less than or equal to m, since every set of more than m
vectors in $\mathbb{R}^m$ is linearly dependent. Thus, if the column vectors of A form a basis for $\mathbb{R}^m$, then n
must equal m.

Corollary 1 An $n \times n$ matrix A is nonsingular if and only if the column vectors of A form a basis
for $\mathbb{R}^n$.

