ch01 Introduction To Linear Algebra 5th Edition PDF
1  Matrices and Systems of Linear Equations
Overview
In this chapter we discuss systems of linear equations and methods (such as Gauss-Jordan
elimination) for solving these systems. We introduce matrices as a convenient language
for describing systems and the Gauss-Jordan solution method.
We next introduce the operations of addition and multiplication for matrices and
show how these operations enable us to express a linear system in matrix-vector terms
as
Ax = b.
Representing the matrix A in column form as A = [A1 , A2 , . . . , An ], we then show that
the equation Ax = b is equivalent to
x1 A1 + x2 A2 + · · · + xn An = b.
The equation above leads naturally to the concepts of linear combination and linear inde-
pendence. In turn, those ideas allow us to address questions of existence and uniqueness
for solutions of Ax = b and to introduce the idea of an inverse matrix.
August 2, 2001 13:48 i56-ch01 Sheet number 2 Page number 2 cyan black
Linear Systems
Our objective is to obtain simultaneous solutions to a system (that is, a set) of one or
more linear equations. Here are three examples of systems of linear equations.
(a) x1 + x2 = 3
x1 − x2 = 1
(b) x1 − 2x2 − 3x3 = −11
−x1 + 3x2 + 5x3 = 15
(c) 3x1 − 2x2 = 1
6x1 − 4x2 = 6
In terms of solutions, it is easy to check that x1 = 2, x2 = 1 is one solution to system
(a). Indeed, it can be shown that this is the only solution to the system.
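Checks like this are easy to mechanize. The short sketch below (Python, an illustration rather than part of the text) verifies that x1 = 2, x2 = 1 satisfies system (a), and retraces the elimination argument that forces this to be the only solution.

```python
# Verify that x1 = 2, x2 = 1 solves system (a):
#   x1 + x2 = 3
#   x1 - x2 = 1
x1, x2 = 2, 1
assert x1 + x2 == 3
assert x1 - x2 == 1

# Uniqueness: adding the two equations gives 2*x1 = 4, so x1 is forced,
# and the first equation then forces x2 = 3 - x1.
x1_unique = (3 + 1) / 2
x2_unique = 3 - x1_unique
print(x1_unique, x2_unique)  # -> 2.0 1.0
```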
Example 2
(a) Display the system of equations with coefficients a11 = 2, a12 = −1, a13 =
−3, a21 = −2, a22 = 2, and a23 = 5, and with constants b1 = −1 and b2 = 3.
(b) Verify that x1 = 1, x2 = 0, x3 = 1 is a solution for the system.
Solution
(a) The system is
2x1 − x2 − 3x3 = −1
−2x1 + 2x2 + 5x3 = 3.
(b) Substituting x1 = 1, x2 = 0, and x3 = 1 yields
2(1) − (0) − 3(1) = −1
−2(1) + 2(0) + 5(1) = 3.
∗ For clarity of presentation, we assume throughout the chapter that the constants aij and bi are real numbers,
although all statements are equally valid for complex constants. When we consider eigenvalue problems,
we will occasionally encounter linear systems having complex coefficients, but the solution technique is no
different. In Chapter 4 we will discuss the technical details of solving systems that have complex coefficients.
Figure 1.1  Geometric representations in the x1 x2 -plane (see Example 3).
Example 3 Give a geometric representation for each of the following systems of equations.
(a) x1 + x2 = 2
2x1 + 2x2 = 4
(b) x1 + x2 = 2
x1 + x2 = 1
(c) x1 + x2 = 3
x1 − x2 = 1
Solution The representations are displayed in Fig. 1.1.
The graph of a linear equation in three variables, ax1 + bx2 + cx3 = d, is a plane in
three-dimensional space (as long as at least one of a, b, or c is nonzero). So, as another example,
let us consider the general (2 × 3) system:
a11 x1 + a12 x2 + a13 x3 = b1
a21 x1 + a22 x2 + a23 x3 = b2 .
Because the solution set for each equation can be represented by a plane, there are two
possibilities:
1. The two planes might be coincident, or they might intersect in a line. In either
case, the system has infinitely many solutions.
2. The two planes might be parallel. In this case, the system has no solution.
Note, for the case of the general (2 × 3) system, that the possibility of a unique solution
has been ruled out.
As a final example, consider a general (3 × 3) system:
a11 x1 + a12 x2 + a13 x3 = b1
a21 x1 + a22 x2 + a23 x3 = b2
a31 x1 + a32 x2 + a33 x3 = b3 .
If we view this (3 × 3) system as representing three planes, it is easy to see from the
geometric perspective that there are three possible outcomes: infinitely many solutions,
no solution, or a unique solution (see Fig. 1.2). Note that Fig. 1.2(b) does not illustrate
every possible case of a (3 × 3) system that has no solution. For example, if just two
of three planes are parallel, then the system has no solution even though the third plane
might intersect each of the two parallel planes.
We conclude this subsection with the following remark, which we will state formally
in Section 1.3 (see Corollary to Theorem 3). This remark says that the possible outcomes
suggested by the geometric interpretations shown in Figs. 1.1 and 1.2 are typical for any
system of linear equations.
Remark An (m × n) system of linear equations has either infinitely many solutions,
no solution, or a unique solution.
Figure 1.2 The general (3 × 3) system may have (a) infinitely many
solutions, (b) no solution, or (c) a unique solution.
In general, a system of equations is called consistent if it has at least one solution, and the
system is called inconsistent if it has no solution. By the preceding remark, a consistent
system has either one solution or an infinite number of solutions; it is not possible for a
linear system to have, for example, exactly five solutions.
Matrices
We begin our introduction to matrix theory by relating matrices to the problem of solving
systems of linear equations. Initially we show that matrix theory provides a convenient
and natural symbolic language to describe linear systems. Later we show that matrix
theory is also an appropriate and powerful framework within which to analyze and solve
more general linear problems, such as least-squares approximations, representations of
linear operations, and eigenvalue problems.
The rectangular array
1 3 −1 2
4 2 1 −3
0 2 0 3
is an example of a matrix. More generally, an (m × n) matrix is a rectangular array of
numbers of the form
A =  a11  a12  · · ·  a1n
     a21  a22  · · ·  a2n
      ..   ..          ..
     am1  am2  · · ·  amn
Thus an (m × n) matrix has m rows and n columns. The subscripts for the entry aij
indicate that the number appears in the ith row and j th column of A. For example, a32 is
the entry in the third row and second column of A. We will frequently use the notation
A = (aij ) to denote a matrix A with entries aij .
Example 4 Display the (2 × 3) matrix A = (aij ), where a11 = 6, a12 = 3, a13 = 7, a21 = 2,
a22 = 1, and a23 = 4.
Solution
A =  6  3  7
     2  1  4
Consider, for example, the system
x1 + 2x2 + x3 = 4
2x1 − x2 − x3 = 1
x1 + x2 + 3x3 = 0.
If we display the coefficients and constants for this system in matrix form,
B =  1  2  1  4
     2 −1 −1  1
     1  1  3  0 ,
then we have expressed compactly and naturally all the essential information. The matrix
B is called the augmented matrix for the system.
In general, with the (m × n) system of linear equations
a11 x1 + a12 x2 + · · · + a1n xn = b1
a21 x1 + a22 x2 + · · · + a2n xn = b2
   ..
am1 x1 + am2 x2 + · · · + amn xn = bm     (3)
we associate two matrices. The coefficient matrix for system (3) is the (m × n) matrix
A where
A where
A =  a11  a12  · · ·  a1n
     a21  a22  · · ·  a2n
      ..   ..          ..
     am1  am2  · · ·  amn
The augmented matrix for system (3) is the [m × (n + 1)] matrix B where
B =  a11  a12  · · ·  a1n  b1
     a21  a22  · · ·  a2n  b2
      ..   ..          ..  ..
     am1  am2  · · ·  amn  bm
Note that B is nothing more than the coefficient matrix A augmented with an extra
column; the extra column is the right-hand side of system (3).
The augmented matrix B is usually denoted as [A | b], where A is the coefficient
matrix and
b =  b1
     b2
     ..
     bm
Example 5 Display the coefficient matrix A and the augmented matrix B for the system
x1 − 2x2 + x3 = 2
2x1 + x2 − x3 = 1
−3x1 + x2 − 2x3 = −5.
Solution The coefficient matrix A and the augmented matrix [A | b] are given by
A =  1 −2  1          and          [A | b] =  1 −2  1  2
     2  1 −1                                  2  1 −1  1
    −3  1 −2                                 −3  1 −2 −5
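The same bookkeeping is easy to do in code. The sketch below (Python with NumPy, used here as an illustrative substitute for the MATLAB that appears later in the chapter) builds A and b for Example 5 and forms [A | b] by appending b as an extra column.

```python
import numpy as np

# Coefficient matrix A and right-hand side b for the system of Example 5.
A = np.array([[ 1, -2,  1],
              [ 2,  1, -1],
              [-3,  1, -2]])
b = np.array([[2], [1], [-5]])

# The augmented matrix [A | b] is A with b appended as an extra column.
Ab = np.hstack([A, b])
print(Ab)
```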
Elementary Operations
As we shall see, there are two steps involved in solving an (m × n) system of equations.
The steps are:
1. Reduction of the system (that is, the elimination of variables).
2. Description of the set of solutions.
The details of both steps will be left to the next section. For the remainder of this section,
we will concentrate on giving an overview of the reduction step.
The goal of the reduction process is to simplify the given system by eliminating
unknowns. It is, of course, essential that the reduced system of equations have the same
set of solutions as the original system.
Definition 1 Two systems of linear equations in n unknowns are equivalent provided that they
have the same set of solutions.
Thus the reduction procedure must yield an equivalent system of equations. The follow-
ing theorem provides three operations, called elementary operations, that can be used
in reduction.
Theorem 1 If one of the following elementary operations is applied to a system of linear equations,
then the resulting system is equivalent to the original system.
1. Interchange two equations.
2. Multiply an equation by a nonzero scalar.
3. Add a constant multiple of one equation to another.
(In part 2 of Theorem 1, the term scalar means a constant; that is, a number.) The proof
of Theorem 1 is included in Exercise 41 of Section 1.1.
To facilitate the use of the elementary operations listed above, we adopt the following
notation:
Notation Elementary Operation Performed
Ei ↔ Ej The ith and j th equations are interchanged.
kEi The ith equation is multiplied by the nonzero scalar k.
Ei + kEj k times the j th equation is added to the ith equation.
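The three elementary operations translate directly into array manipulations. In the sketch below (Python/NumPy; the helper names swap, scale, and add_multiple are our own, and indices are 0-based rather than the 1-based E_i notation of the text), each operation acts on the rows of a matrix in place; the demonstration applies (1/2)R1 and then R2 − 3R1 to a (3 × 4) matrix.

```python
import numpy as np

# The three elementary operations of Theorem 1, acting on rows of a matrix M.

def swap(M, i, j):            # E_i <-> E_j
    M[[i, j]] = M[[j, i]]

def scale(M, i, k):           # k * E_i, where k must be nonzero
    if k == 0:
        raise ValueError("scalar must be nonzero")
    M[i] *= k

def add_multiple(M, i, j, k): # E_i + k * E_j
    M[i] += k * M[j]

M = np.array([[2., 4., -2., 2.],
              [3., 5., -5., 1.],
              [0., 2., 1., -2.]])
scale(M, 0, 0.5)              # (1/2)R1
add_multiple(M, 1, 0, -3.0)   # R2 - 3R1
print(M)
```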
The following simple example illustrates the use of elementary operations to solve a
(2×2) system. (The complete solution process for a general (m×n) system is described
in detail in the next section.)
Row Operations
As noted earlier, we want to use an augmented matrix as a shorthand notation for a
system of equations. Because equations correspond to rows in the augmented
matrix, we want to perform elementary operations on the rows of a matrix. Toward that
end, we introduce the following terminology.
Definition 2 The following operations, performed on the rows of a matrix, are called elementary
row operations:
1. Interchange two rows.
2. Multiply a row by a nonzero scalar.
3. Add a constant multiple of one row to another.
We say that two (m × n) matrices, B and C, are row equivalent if one can be
obtained from the other by a sequence of elementary row operations. Now if B is the
augmented matrix for a system of linear equations and if C is row equivalent to B, then
C is the augmented matrix for an equivalent system. This observation follows because
the elementary row operations for matrices exactly duplicate the elementary operations
for equations.
Thus, we can solve a linear system with the following steps: form the augmented matrix
for the system, use elementary row operations to reduce it to the augmented matrix of a
simpler but equivalent system, and then solve the simpler system.
We will specify what we mean by a simpler system in the next section. For now, we
illustrate in Example 7 how using elementary row operations to reduce an augmented
matrix is exactly parallel to using elementary operations to reduce the corresponding
system of equations.
System:                                  Augmented Matrix:
E1 ↔ E3 :                                R1 ↔ R3 :
2x1 + 4x2 − 2x3 = 2 2 4 −2 2
3x1 + 5x2 − 5x3 = 1 3 5 −5 1
2x2 + x3 = −2 0 2 1 −2
(1/2)E1 : (1/2)R1 :
x1 + 2x2 − x3 = 1 1 2 −1 1
3x1 + 5x2 − 5x3 = 1 3 5 −5 1
2x2 + x3 = −2 0 2 1 −2
E2 − 3E1 : R2 − 3R1 :
x1 + 2x2 − x3 = 1 1 2 −1 1
− x2 − 2x3 = −2 0 −1 −2 −2
2x2 + x3 = −2 0 2 1 −2
The variable x1 has now been eliminated from the second and third equations. Next,
we eliminate x2 from the first and third equations and leave x2 , with coefficient 1, in the
second equation. We continue the reduction process with the following operations:
(−1)E2 : (−1)R2 :
x1 + 2x2 − x3 = 1 1 2 −1 1
x2 + 2x3 = 2 0 1 2 2
2x2 + x3 = −2 0 2 1 −2
E1 − 2E2 : R1 − 2R2 :
x1 − 5x3 = −3 1 0 −5 −3
x2 + 2x3 = 2 0 1 2 2
2x2 + x3 = −2 0 2 1 −2
E3 − 2E2 : R3 − 2R2 :
x1 − 5x3 = −3 1 0 −5 −3
x2 + 2x3 = 2 0 1 2 2
−3x3 = −6 0 0 −3 −6
The variable x2 has now been eliminated from the first and third equations. Next, we
eliminate x3 from the first and second equations and leave x3 , with coefficient 1, in the
third equation:
System: Augmented Matrix:
(−1/3)E3 : (−1/3)R3 :
x1 − 5x3 = −3 1 0 −5 −3
x2 + 2x3 = 2 0 1 2 2
x3 = 2 0 0 1 2
E1 + 5E3 : R1 + 5R3 :
x1 =7 1 0 0 7
x2 + 2x3 = 2 0 1 2 2
x3 = 2 0 0 1 2
E2 − 2E3 : R2 − 2R3 :
x1 = 7 1 0 0 7
x2 = −2 0 1 0 −2
x3 = 2 0 0 1 2
The last system above clearly has a unique solution given by x1 = 7, x2 = −2, and
x3 = 2. Because the final system is equivalent to the original given system, both
systems have the same solution.
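The whole reduction just carried out can be automated. The following sketch (Python, using exact rational arithmetic via fractions.Fraction; the code is illustrative and not part of the text) takes the augmented matrix of this example all the way to reduced echelon form, using the same pattern of operations, and reads off the solution.

```python
from fractions import Fraction

# Augmented matrix of the example, as shown after the operation E1 <-> E3.
M = [[Fraction(v) for v in row] for row in
     [[2, 4, -2,  2],
      [3, 5, -5,  1],
      [0, 2,  1, -2]]]

n = 3
for col in range(n):
    # find a row with a nonzero entry in this column and move it into place
    piv = next(r for r in range(col, n) if M[r][col] != 0)
    M[col], M[piv] = M[piv], M[col]
    # scale the pivot row so the leading entry is 1
    M[col] = [x / M[col][col] for x in M[col]]
    # introduce zeros above and below the leading 1
    for r in range(n):
        if r != col:
            M[r] = [a - M[r][col] * b for a, b in zip(M[r], M[col])]

solution = [row[n] for row in M]
print(solution)  # -> [Fraction(7, 1), Fraction(-2, 1), Fraction(2, 1)]
```

The printed column agrees with the unique solution x1 = 7, x2 = −2, x3 = 2 found above.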
Corollary Suppose [A | b] and [C | d] are augmented matrices, each representing a different (m×n)
system of linear equations. If [A | b] and [C | d] are row equivalent matrices, then the
two systems are also equivalent.
1.1 EXERCISES
Which of the equations in Exercises 1–6 are linear?
1. x1 + 2x3 = 3
2. x1 x2 + x2 = 1
3. x1 − x2 = sin² x1 + cos² x1
4. x1 − x2 = sin² x1 + cos² x2
5. |x1 | − |x2 | = 0
6. πx1 + √7 x2 = √3

In Exercises 7–10, coefficients are given for a system of the form (2). Display the system and verify that the given values constitute a solution.
7. a11 = 1, a12 = 3, a21 = 4, a22 = −1, b1 = 7, b2 = 2; x1 = 1, x2 = 2
8. a11 = 6, a12 = −1, a13 = 1, a21 = 1, a22 = 2, a23 = 4, b1 = 14, b2 = 4; x1 = 2, x2 = −1, x3 = 1
9. a11 = 1, a12 = 1, a21 = 3, a22 = 4, a31 = −1, a32 = 2, b1 = 0, b2 = −1, b3 = −3; x1 = 1, x2 = −1
10. a11 = 0, a12 = 3, a21 = 4, a22 = 0, b1 = 9, b2 = 8; x1 = 2, x2 = 3

In Exercises 11–14, sketch a graph for each equation to determine whether the system has a unique solution, no solution, or infinitely many solutions.
11. 2x + y = 5
    x − y = 1
12. 2x − y = −1
    2x − y = 2
13. 3x + 2y = 6
    −6x − 4y = −12
14. 2x + y = 5
    x − y = 1
    x + 3y = 9
15. The (2 × 3) system of linear equations
    a1 x + b1 y + c1 z = d1
    a2 x + b2 y + c2 z = d2
    is represented geometrically by two planes. How are the planes related when:
    a) The system has no solution?
    b) The system has infinitely many solutions?
    Is it possible for the system to have a unique solution? Explain.

In Exercises 16–18, determine whether the given (2 × 3) system of linear equations represents coincident planes (that is, the same plane), two parallel planes, or two planes whose intersection is a line. In the latter case, give the parametric equations for the line; that is, give equations of the form x = at + b, y = ct + d, z = et + f.
16. 2x1 + x2 + x3 = 3
    −2x1 + x2 − x3 = 1
17. x1 + 2x2 − x3 = 2
    x1 + x2 + x3 = 3
18. x1 + 3x2 − 2x3 = −1
    2x1 + 6x2 − 4x3 = −2
19. Display the (2×3) matrix A = (aij ), where a11 = 2, a12 = 1, a13 = 6, a21 = 4, a22 = 3, and a23 = 8.
20. Display the (2×4) matrix C = (cij ), where c23 = 4, c12 = 2, c21 = 2, c14 = 1, c22 = 2, c24 = 3, c11 = 1, and c13 = 7.
21. Display the (3×3) matrix Q = (qij ), where q23 = 1, q32 = 2, q11 = 1, q13 = −3, q22 = 1, q33 = 1, q21 = 2, q12 = 4, and q31 = 3.
22. Suppose the matrix C in Exercise 20 is the augmented matrix for a system of linear equations. Display the system.
23. Repeat Exercise 22 for the matrices in Exercises 19 and 21.

In Exercises 24–29, display the coefficient matrix A and the augmented matrix B for the given system.
24. x1 − x2 = −1
    x1 + x2 = 3
25. x1 + x2 − x3 = 2
    2x1 − x3 = 1
26. x1 + 3x2 − x3 = 1
    2x1 + 5x2 + x3 = 5
    x1 + x2 + x3 = 3
27. x1 + x2 + 2x3 = 6
    3x1 + 4x2 − x3 = 5
    −x1 + x2 + x3 = 2
28. x1 + x2 − 3x3 = −1
    x1 + 2x2 − 5x3 = −2
    −x1 − 3x2 + 7x3 = 3
29. x1 + x2 + x3 = 1
    2x1 + 3x2 + x3 = 2
    x1 − x2 + 3x3 = 2

In Exercises 30–36, display the augmented matrix for the given system. Use elementary operations on equations to obtain an equivalent system of equations in which x1 appears in the first equation with coefficient one and has been eliminated from the remaining equations. Simultaneously, perform the corresponding elementary row operations on the augmented matrix.
30. 2x1 + 3x2 = 6
    4x1 − x2 = 7
31. x1 + 2x2 − x3 = 1
    x1 + x2 + 2x3 = 2
    −2x1 + x2 = 4
32. x2 + x3 = 4
    x1 − x2 + 2x3 = 1
    2x1 + x2 − x3 = 6
33. x1 + x2 = 9
    x1 − x2 = 7
    3x1 + x2 = 6
34. x1 + x2 + x3 − x4 = 1
    −x1 + x2 − x3 + x4 = 3
    −2x1 + x2 + x3 − x4 = 2
35. x2 + x3 − x4 = 3
    x1 + 2x2 − x3 + x4 = 1
    −x1 + x2 + 7x3 − x4 = 0
36. x1 + x2 = 0
    x1 − x2 = 0
    3x1 + x2 = 0
37. Consider the equation 2x1 − 3x2 + x3 − x4 = 3.
    a) In the six different possible combinations, set any two of the variables equal to 1 and graph the equation in terms of the other two.
    b) What type of graph do you always get when you set two of the variables equal to two fixed constants?
    c) What is one possible reason the equation in formula (1) is called linear?
38. Consider the (2 × 2) system
    a11 x1 + a12 x2 = b1
    a21 x1 + a22 x2 = b2 .
    Show that if a11 a22 − a12 a21 ≠ 0, then this system is equivalent to a system of the form
    c11 x1 + c12 x2 = d1
           c22 x2 = d2 ,
    where c11 ≠ 0 and c22 ≠ 0. Note that the second system always has a solution. [Hint: First suppose that a11 ≠ 0, and then consider the special case in which a11 = 0.]
39. In the following (2 × 2) linear systems (A) and (B), c is a nonzero scalar. Prove that any solution, x1 = s1 , x2 = s2 , for (A) is also a solution for (B). Conversely, show that any solution, x1 = t1 , x2 = t2 , for (B) is also a solution for (A). Where is the assumption that c is nonzero required?
    (A) a11 x1 + a12 x2 = b1
        a21 x1 + a22 x2 = b2
    (B) a11 x1 + a12 x2 = b1
        ca21 x1 + ca22 x2 = cb2
40. In the (2 × 2) linear systems that follow, the system (B) is obtained from (A) by performing the elementary operation E2 + cE1 . Prove that any solution, x1 = s1 , x2 = s2 , for (A) is a solution for (B). Similarly, prove that any solution, x1 = t1 , x2 = t2 , for (B) is a solution for (A).
    (A) a11 x1 + a12 x2 = b1
        a21 x1 + a22 x2 = b2
    (B) a11 x1 + a12 x2 = b1
        (a21 + ca11 )x1 + (a22 + ca12 )x2 = b2 + cb1
41. Prove that any of the elementary operations in Theorem 1 applied to system (2) produces an equivalent system. [Hint: To simplify this proof, represent the ith equation in system (2) as fi (x1 , x2 , . . . , xn ) = bi ; so
    fi (x1 , x2 , . . . , xn ) = ai1 x1 + ai2 x2 + · · · + ain xn
    for i = 1, 2, . . . , m. With this notation, system (2) has the form of (A), which follows. Next, for example, if a multiple of c times the j th equation is added to the kth equation, a new system of the form (B) is produced:
    (A)
    f1 (x1 , x2 , . . . , xn ) = b1
       ..
    fj (x1 , x2 , . . . , xn ) = bj
       ..
    fk (x1 , x2 , . . . , xn ) = bk
       ..
    fm (x1 , x2 , . . . , xn ) = bm
    (B)
    f1 (x1 , x2 , . . . , xn ) = b1
       ..
    fj (x1 , x2 , . . . , xn ) = bj
       ..
    g(x1 , x2 , . . . , xn ) = r
       ..
    fm (x1 , x2 , . . . , xn ) = bm
    where g(x1 , x2 , . . . , xn ) = fk (x1 , x2 , . . . , xn ) + cfj (x1 , x2 , . . . , xn ), and r = bk + cbj . To show that the operation gives an equivalent system, show that any solution for (A) is a solution for (B), and vice versa.]
42. Solve the system of two nonlinear equations in two unknowns
    x1² − 2x1 + x2² = 3
    x1² − x2² = 1.
can immediately describe the solution. See, for example, Examples 6 and 7 in Section
1.1. We turn now to the question of how to describe this objective in mathematical
terms—that is, how do we know when the system has been simplified as much as it can
be? The answer is: The system has been simplified as much as possible when it is in
reduced echelon form.
Echelon Form
When an augmented matrix is reduced to the form known as echelon form, it is easy to
solve the linear system represented by the reduced matrix. The formal description of
echelon form is given in Definition 3. Then, in Definition 4, we describe an even simpler
form known as reduced echelon form.
Figure 1.4  Four matrices in echelon form (∗ denotes an arbitrary entry):

A =  1 ∗ ∗      A =  1 ∗ ∗      A =  1 ∗ ∗      A =  0 1 ∗
     0 1 ∗           0 1 ∗           0 0 1           0 0 1
     0 0 1           0 0 0           0 0 0           0 0 0
Definition 4 A matrix that is in echelon form is in reduced echelon form provided that the first
nonzero entry in any row is the only nonzero entry in its column.
Figure 1.5 gives four examples (corresponding to the examples in Fig. 1.4) of matrices
in reduced echelon form.
A =  1 0 0      A =  1 0 ∗      A =  1 ∗ 0      A =  0 1 0
     0 1 0           0 1 ∗           0 0 1           0 0 1
     0 0 1           0 0 0           0 0 0           0 0 0
Example 1 For each matrix shown, choose one of the following phrases to describe the matrix.
(a) The matrix is not in echelon form.
(b) The matrix is in echelon form, but not in reduced echelon form.
(c) The matrix is in reduced echelon form.
A =  1  0  0          B =  1  3  2
     2  1  0               0 −1  1
     3 −4  1               0  0  1

C =  0 1 −1 0         D =  1 2 3 4 5         E =  1
     0 0  0 1              0 0 1 2 3              0
     0 0  0 0              0 0 0 1 0              0

F =  0 ,        G = [1 0 0],        H = [0 0 1].
     0
     1
Solution A, B, and F are not in echelon form; D is in echelon form but not in reduced echelon
form; C, E, G, and H are in reduced echelon form.
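Classifications like those in Example 1 can also be tested programmatically. The sketch below (Python; the predicates is_echelon and is_reduced_echelon are our own names, based on the standard conditions: any zero rows at the bottom, each leading entry a 1 lying strictly to the right of the leading 1 above it, and, for reduced echelon form, each leading 1 the only nonzero entry in its column) reproduces two of the classifications.

```python
# A small checker for the echelon-form conditions (a sketch based on the
# standard definitions, stated formally in Definitions 3 and 4 of the text).

def leading_positions(M):
    """Column index of the first nonzero entry in each row (None for a zero row)."""
    return [next((j for j, v in enumerate(row) if v != 0), None) for row in M]

def is_echelon(M):
    leads = leading_positions(M)
    seen_zero_row = False
    prev = -1
    for row, lead in zip(M, leads):
        if lead is None:
            seen_zero_row = True
            continue
        if seen_zero_row:        # a nonzero row may not lie below a zero row
            return False
        if row[lead] != 1:       # each leading entry must be 1
            return False
        if lead <= prev:         # leading 1s must move strictly to the right
            return False
        prev = lead
    return True

def is_reduced_echelon(M):
    if not is_echelon(M):
        return False
    for i, lead in enumerate(leading_positions(M)):
        if lead is not None:
            # the leading 1 must be the only nonzero entry in its column
            if any(M[r][lead] != 0 for r in range(len(M)) if r != i):
                return False
    return True

D = [[1, 2, 3, 4, 5], [0, 0, 1, 2, 3], [0, 0, 0, 1, 0]]
C = [[0, 1, -1, 0], [0, 0, 0, 1], [0, 0, 0, 0]]
print(is_echelon(D), is_reduced_echelon(D))  # -> True False
print(is_reduced_echelon(C))                 # -> True
```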
Example 2 Each of the following matrices is in reduced echelon form and is the augmented matrix
for a system of linear equations. In each case, give the system of equations and describe
the solution.
B =  1 0 0  3         C =  1 0 −1 0
     0 1 0 −2              0 1  3 0
     0 0 1  7              0 0  0 1
     0 0 0  0

D =  1 −3 0  4 2      E =  1 2 0 5
     0  0 1 −5 1           0 0 1 0
     0  0 0  0 0           0 0 0 0
Solution
Matrix B: Matrix B is the augmented matrix for the following system:
x1 = 3
x2 = −2
x3 = 7.
Therefore, the system has the unique solution x1 = 3, x2 = −2, and x3 = 7.
x1 − 2x4 = 3
x3 − 4x4 = 2
x5 = 2.
The solution of this system is x1 = 3 + 2x4 , x3 = 2 + 4x4 , x5 = 2, and x4 is arbitrary.
Note that the equations place no constraint whatsoever on the variable x2 . That does not
mean that x2 must be zero; instead, it means that x2 is also arbitrary.
By Theorem 2, a given matrix B can be transformed by a
series of elementary row operations into a matrix C which is in reduced echelon form.
Then, because C is in reduced echelon form, it is easy to solve the equivalent linear
system represented by C (recall Example 2).
The following steps show how to transform a given matrix B to reduced echelon
form. As such, this list of steps constitutes an informal proof of the existence portion
of Theorem 2. We do not prove the uniqueness portion of Theorem 2. The steps listed
assume that B has at least one nonzero entry (because if B has only zero entries, then B
is already in reduced echelon form).
The next example illustrates an application of the six-step process just described.
When doing a small problem by hand, however, it is customary to alter the steps slightly—
instead of going all the way to echelon form (sweeping from left to right) and then going
from echelon to reduced echelon form (sweeping from bottom to top), it is customary
to make a single pass (moving from left to right) introducing 0’s above and below the
leading 1. Example 3 demonstrates this single-pass variation.
Example 3 Use elementary row operations to transform the following matrix to reduced echelon
form
A =  0  0   0   0   2    8    4
     0  0   0   1   3   11    9
     0  3 −12  −3  −9  −24  −33
     0 −2   8   1   6   17   21
Solution The following row operations will transform A to reduced echelon form.
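These operations can also be carried out programmatically. The sketch below (Python with exact Fraction arithmetic; the name rref echoes MATLAB's command but the routine itself is our own illustrative code, implementing the single-pass variation described above) reduces A and prints the resulting reduced echelon form.

```python
from fractions import Fraction

def rref(rows):
    """Transform a matrix to reduced echelon form (single-pass variation:
    zeros are introduced above and below each leading 1 as it is created)."""
    M = [[Fraction(v) for v in row] for row in rows]
    m, n = len(M), len(M[0])
    pivot_row = 0
    for col in range(n):
        piv = next((r for r in range(pivot_row, m) if M[r][col] != 0), None)
        if piv is None:
            continue                      # no pivot in this column
        M[pivot_row], M[piv] = M[piv], M[pivot_row]
        M[pivot_row] = [x / M[pivot_row][col] for x in M[pivot_row]]
        for r in range(m):
            if r != pivot_row and M[r][col] != 0:
                M[r] = [a - M[r][col] * b for a, b in zip(M[r], M[pivot_row])]
        pivot_row += 1
        if pivot_row == m:
            break
    return M

A = [[0,  0,   0,  0,  2,   8,   4],
     [0,  0,   0,  1,  3,  11,   9],
     [0,  3, -12, -3, -9, -24, -33],
     [0, -2,   8,  1,  6,  17,  21]]
for row in rref(A):
    print([int(x) for x in row])
# -> [0, 1, -4, 0, 0, 3, 0]
#    [0, 0, 0, 1, 0, -1, 0]
#    [0, 0, 0, 0, 1, 4, 0]
#    [0, 0, 0, 0, 0, 0, 1]
```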
Having provided this example of how to transform a matrix to reduced echelon form,
we can be more specific about the procedure for solving a system of equations that is
diagrammed in Fig. 1.3.
Solution We first create the augmented matrix and then transform it to reduced echelon form. The
augmented matrix is
 2 −4   3  −4  −11   28
−1  2  −1   2    5  −13
 0  0  −3   1    6  −10
 3 −6  10  −8  −28   61
The first step is to introduce a leading 1 into row 1. We can introduce the leading 1
if we multiply row 1 by 1/2, but that would create fractions that are undesirable for hand
work. As an alternative, we can add row 2 to row 1 and avoid fractions.
R1 + R2 :     1 −2   2  −2   −6   15
             −1  2  −1   2    5  −13
              0  0  −3   1    6  −10
              3 −6  10  −8  −28   61
x1 − 2x2 + 2x5 = 3
x3 − x5 = 2
x4 + 3x5 = −4.
Solving the preceding system, we find:
x1 = 3 + 2x2 − 2x5
x3 = 2 + x5 (1)
x4 = −4 − 3x5
In Eq. (1) we have a nice description of all of the infinitely many solutions to the
original system—it is called the general solution for the system. For this example,
x2 and x5 are viewed as independent (or unconstrained) variables and can be assigned
values arbitrarily. The variables x1 , x3 , and x4 are dependent (or constrained) variables,
and their values are determined by the values assigned to x2 and x5 . For example, in
Eq. (1), setting x2 = 1 and x5 = −1 yields a particular solution given by x1 = 7,
x2 = 1, x3 = 1, x4 = −1, and x5 = −1.
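A particular solution can be checked directly against the original augmented matrix. The sketch below (Python/NumPy, an illustration only) verifies the particular solution x1 = 7, x2 = 1, x3 = 1, x4 = −1, x5 = −1.

```python
import numpy as np

# Original augmented matrix of this example; the last column is b.
Ab = np.array([[ 2, -4,  3, -4, -11,  28],
               [-1,  2, -1,  2,   5, -13],
               [ 0,  0, -3,  1,   6, -10],
               [ 3, -6, 10, -8, -28,  61]])
A, b = Ab[:, :5], Ab[:, 5]

# Particular solution from Eq. (1) with x2 = 1 and x5 = -1.
x = np.array([7, 1, 1, -1, -1])
print((A @ x).tolist())  # -> [28, -13, -10, 61]
print(np.array_equal(A @ x, b))  # -> True
```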
Many calculators can solve systems of linear equations and perform simple matrix
operations. For computers there are general-purpose computer algebra systems such as Derive,
Mathematica, and Maple that have extensive computational capabilities. Special-purpose
linear algebra software such as MATLAB is very easy to use and can perform virtually
any type of matrix calculation.
In the following example, we illustrate the use of MATLAB. From time to time, as
appropriate, we will include other examples that illustrate the use of electronic aids.
Use Eq. (2) to find the formula for 1³ + 2³ + 3³ + · · · + n³. (Note: Eq. (2) can be derived
from the theory of linear difference equations.)
Solution From Eq. (2) there are constants a1 , a2 , a3 , and a4 such that
1³ + 2³ + 3³ + · · · + n³ = a1 n + a2 n² + a3 n³ + a4 n⁴ .
>>A=[1,1,1,1,1;2,4,8,16,9;3,9,27,81,36;4,16,64,256,100]
A=
1 1 1 1 1
2 4 8 16 9
3 9 27 81 36
4 16 64 256 100
>>C=rref(A)
C=
1.0000 0 0 0 0
0 1.0000 0 0 0.2500
0 0 1.0000 0 0.5000
0 0 0 1.0000 0.2500
>>C
C=
1 0 0 0 0
0 1 0 0 1/4
0 0 1 0 1/2
0 0 0 1 1/4
and then MATLAB displayed A. At the second prompt, we entered the MATLAB row-
reduction command, C = rref(A). The new matrix C, as displayed by MATLAB, is
the result of transforming A to reduced echelon form.
MATLAB normally displays results in decimal form. To obtain a rational form for
the reduced matrix C, from the submenu numerical form we selected rat and entered C,
finding
C =  1 0 0 0   0
     0 1 0 0 1/4
     0 0 1 0 1/2
     0 0 0 1 1/4 .
From this, we have a1 = 0, a2 = 1/4, a3 = 1/2, and a4 = 1/4. Therefore, the formula
for the sum of the first n cubes is
1³ + 2³ + 3³ + · · · + n³ = (1/4)n² + (1/2)n³ + (1/4)n⁴
or, after simplification,
1³ + 2³ + 3³ + · · · + n³ = n²(n + 1)²/4.
ADDING INTEGERS Mathematical folklore has it that Gauss discovered the formula
1 + 2 + 3 + · · · + n = n(n + 1)/2 when he was only ten years old. To occupy time, his teacher asked the
students to add the integers from 1 to 100. Gauss immediately wrote an answer and turned his slate over.
To his teacher’s amazement, Gauss had the only correct answer in the class. Young Gauss had recognized
that the numbers could be put in 50 sets of pairs such that the sum of each pair was 101:
(50 + 51) + (49 + 52) + (48 + 53) + · · · + (1 + 100) = 50(101) = 5050.
Soon his brilliance was brought to the attention of the Duke of Brunswick, who thereafter sponsored the
education of Gauss.
1.2 EXERCISES
Consider the matrices in Exercises 1–10.
a) Either state that the matrix is in echelon form or use elementary row operations to transform it to echelon form.
b) If the matrix is in echelon form, transform it to reduced echelon form.
1. 1 2
   0 1
2. 1 2 −1
   0 1  3
3. 2 3 1
   4 1 0
4. 0 1 1
   1 2 3
5. 0 0 2 3
   2 0 1 4
6. 2 0 3 1
   0 0 1 2
7. 1 3 2 1
   0 1 4 2
   0 0 1 1
8. 2 −1  3
   0  1  1
   0  0 −3
9. 1 2 −1 −2
   0 2 −2 −3
   0 0  0  1
10. −1 4 −3  4  6
     0 2  1 −3 −3
     0 0  0  1  2

In Exercises 11–21, each of the given matrices represents the augmented matrix for a system of linear equations. In each exercise, display the solution set or state that the system is inconsistent.
11. 1 1 0
    0 1 0
12. 1 1 0
    0 0 2
13. 1 2 1 0
    0 1 3 1
14. 1 2 2 1
    0 1 0 0
15. 1 1 1 0
    0 1 0 0
    0 0 0 1
16. 1 2 0 1
    0 1 1 0
    0 0 2 0
17. 1 0 1 0 0
    0 0 1 1 0
    0 0 0 1 0
18. 1 2 1 3
    0 0 0 2
    0 0 0 0
19. 1 0 0 1
    0 1 0 1
    0 0 0 1
20. 1 1 2 0 2 0
    0 1 1 1 0 0
    0 0 1 2 1 2
21. 2 1 3 2 0 1
    0 0 1 1 2 1
    0 0 0 0 3 0

In Exercises 22–35, solve the system by transforming the augmented matrix to reduced echelon form.
22. 2x1 − 3x2 = 5
    −4x1 + 6x2 = −10
23. x1 − 2x2 = 3
    2x1 − 4x2 = 1
In Exercises 36–40, find all values a for which the system has no solution.
36. x1 + 2x2 = −3
    ax1 − 2x2 = 5
37. x1 + 3x2 = 4
    2x1 + 6x2 = a
38. 2x1 + 4x2 = a
    3x1 + 6x2 = 5
39. 3x1 + ax2 = 3
    ax1 + 3x2 = 5
40. x1 + ax2 = 6
    ax1 + 2ax2 = 4

In Exercises 41 and 42, find all values α and β where 0 ≤ α ≤ 2π and 0 ≤ β ≤ 2π.
41. 2 cos α + 4 sin β = 3
    3 cos α − 5 sin β = −1
42. 2 cos² α − sin² β = 1
    12 cos² α + 8 sin² β = 13
43. Describe the solution set of the following system in terms of x3 :
    x1 + x2 + x3 = 3
    x1 + 2x2 = 5.
    For x1 , x2 , x3 in the solution set:
By Exercise 44, B and C are both row equivalent to matrix I in Exercise 44. Determine elementary row operations that demonstrate that B is row equivalent to C.
48. Repeat Exercise 47 for the matrices
    B =  1 4 ,       C =  1 2 .
         3 7              2 1
49. A certain three-digit number N equals fifteen times the sum of its digits. If its digits are reversed, the resulting number exceeds N by 396. The one's digit is one larger than the sum of the other two. Give a linear system of three equations whose three unknowns are the digits of N. Solve the system and find N.
50. Find the equation of the parabola, y = ax² + bx + c, that passes through the points (−1, 6), (1, 4), and (2, 9). [Hint: For each point, give a linear equation in a, b, and c.]
51. Three people play a game in which there are always two winners and one loser. They have the understanding that the loser gives each winner an amount equal to what the winner already has. After three games, each has lost just once and each has $24. With how much money did each begin?
52. Find three numbers whose sum is 34 when the sum of the first and second is 7, and the sum of the second and third is 22.
53. A zoo charges $6 for adults, $3 for students, and $.50 for children. One morning 79 people enter and pay a total of $207. Determine the possible numbers of adults, students, and children.
54. Find a cubic polynomial, p(x) = a + bx + cx² + dx³, such that p(1) = 5, p′(1) = 5, p(2) = 17, and p′(2) = 21.

In Exercises 55–58, use Eq. (2) to find the formula for the sum. If available, use linear algebra software for Exercises 57 and 58.
55. 1 + 2 + 3 + · · · + n
56. 1² + 2² + 3² + · · · + n²
57. 1⁴ + 2⁴ + 3⁴ + · · · + n⁴
58. 1⁵ + 2⁵ + 3⁵ + · · · + n⁵
To deduce the various possibilities for the solution set of (1), we will focus on the simpler
problem of analyzing the solution possibilities for the equivalent system represented by
the matrix [C | d].
We begin by making four remarks about an [m × (n + 1)] matrix [C | d] that is in
reduced echelon form. Our first remark recalls an observation made in Section 1.2.
Remark 1: The system represented by the matrix [C | d] is inconsistent if and only if
[C | d] has a row of the form [0, 0, 0, . . . , 0, 1].
Our second remark also follows because [C | d] is in reduced echelon form. In
particular, we know every nonzero row of [C | d] has a leading 1. We also know there
are no other nonzero entries in a column of [C | d] that contains a leading 1. Thus, if xk
is the variable corresponding to a leading 1, then xk can be expressed in terms of other
variables that do not correspond to any leading ones in [C | d]. Therefore, we obtain
Remark 2: Every variable corresponding to a leading 1 in [C | d] is a dependent vari-
able. (That is, each “leading-one variable” can be expressed in terms of the independent
or “nonleading-one variables.”)
We illustrate Remark 2 with the following example.
Our fourth remark is a consequence of Remark 1 and Remark 3. Let r denote the number
of nonzero rows in [C | d]. If r = n+1, then [C | d] has a row of the form [0, 0, . . . , 0, 1]
and hence the system represented by [C | d] must be inconsistent. Therefore, if the system
is consistent, we need to have r < n + 1. This observation leads to:
Remark 4: Let r denote the number of nonzero rows in [C | d]. If the system repre-
sented by [C | d] is consistent, then r ≤ n.
In general, let [C | d] be an [m × (n + 1)] matrix in reduced echelon form where
[C | d] represents a consistent system. According to Remark 2, if [C | d] has r nonzero
rows, then there are r dependent (constrained) variables in the solution of the system
corresponding to [C | d]. In addition, by Remark 4, we know r ≤ n. Since there are
n variables altogether in this (m×n) system, the remaining n−r variables are independent
(or unconstrained) variables. See Theorem 3.
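The bookkeeping in these remarks is easy to mechanize. The following sketch is ours (not part of the text): given an augmented matrix [C | d] that is already in reduced echelon form, it counts the unknowns n, the nonzero rows r, and the independent variables n − r, and applies the test of Remark 1 for consistency.

```python
def analyze_echelon(aug):
    """aug: an augmented matrix [C | d] in reduced echelon form,
    given as a list of rows. Returns (n, r, n - r, consistent)."""
    n = len(aug[0]) - 1                        # number of unknowns
    r = sum(1 for row in aug if any(row))      # nonzero rows
    # Inconsistent exactly when some row is [0, 0, ..., 0, 1] (Remark 1).
    consistent = not any(not any(row[:-1]) and row[-1] != 0 for row in aug)
    return n, r, n - r, consistent

# Two nonzero rows and four unknowns: two dependent variables,
# two independent variables, and the system is consistent.
B = [[1, 0, -1, -3, 0],
     [0, 1,  2,  1, 0]]
print(analyze_echelon(B))   # (4, 2, 2, True)
```

The same function flags an inconsistent matrix such as [[1, 0, 0], [0, 0, 1]], whose second row has the forbidden form of Remark 1.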
Corollary Consider an (m × n) system of linear equations. If m < n, then either the system is
inconsistent or it has infinitely many solutions.
Proof Consider an (m × n) system of linear equations where m < n. If the system is incon-
sistent, there is nothing to prove. If the system is consistent, then Theorem 3 applies.
For a consistent system, suppose that the augmented matrix [A | b] is row equivalent to a
matrix [C | d] that is in echelon form and has r nonzero rows. Because the given system
has m equations, the augmented matrix [A | b] has m rows. Therefore the matrix [C | d]
also has m rows. Because r is the number of nonzero rows for [C | d], it is clear that
r ≤ m. But m < n, so it follows that r < n. By Theorem 3, there are n − r independent
variables. Because n − r > 0, the system has infinitely many solutions.
Example 3 What are the possibilities for the solution set of a (3 × 4) system of linear equations?
If the system is consistent, what are the possibilities for the number of independent
variables?
Solution By the corollary to Theorem 3, the system either has no solution or has infinitely many
solutions. If the system reduces to a system with r equations, then r ≤ 3. Thus r must
be 1, 2, or 3. (The case r = 0 can occur only when the original system is the trivial
system in which all coefficients and all constants are zero.) If the system is consistent,
the number of free parameters is 4 − r, so the possibilities are 3, 2, and 1.
Example 4 What are the possibilities for the solution set of the following (3 × 4) system?
2x1 − x2 + x3 − 3x4 = 0
x1 + 3x2 − 2x3 + x4 = 0
−x1 − 2x2 + 4x3 − x4 = 0
Solution First note that x1 = x2 = x3 = x4 = 0 is a solution, so the system is consistent. By the
corollary to Theorem 3, the system must have infinitely many solutions. That is, m = 3
and n = 4, so m < n.
Homogeneous Systems
The system in Example 4 is an example of a homogeneous system of equations. More
generally, the (m × n) system of linear equations given in (2) is called a homogeneous
system of linear equations:
a11 x1 + a12 x2 + · · · + a1n xn = 0
a21 x1 + a22 x2 + · · · + a2n xn = 0
   ⋮          ⋮              ⋮        (2)
am1 x1 + am2 x2 + · · · + amn xn = 0.
Thus system (2) is the special case of the general (m × n) system (1) given earlier
in which b1 = b2 = · · · = bm = 0. Note that a homogeneous system is always
consistent, because x1 = x2 = · · · = xn = 0 is a solution to system (2). This solution is
called the trivial solution or zero solution, and any other solution is called a nontrivial
solution. A homogeneous system of equations, therefore, either has the trivial solution
as the unique solution or also has nontrivial (and hence infinitely many) solutions. With
these observations, the following important theorem is an immediate consequence of the
corollary to Theorem 3.
Theorem 4 A homogeneous (m×n) system of linear equations always has infinitely many nontrivial
solutions when m < n.
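Theorem 4 can be checked numerically on the system of Example 4. The sketch below is ours (it uses NumPy, which the text does not): it produces a nontrivial solution by fixing the free variable x4 = 1 and solving the remaining square system.

```python
import numpy as np

# Coefficient matrix of the homogeneous (3 x 4) system in Example 4.
A = np.array([[ 2, -1,  1, -3],
              [ 1,  3, -2,  1],
              [-1, -2,  4, -1]], dtype=float)

# Set x4 = 1 and solve the remaining (3 x 3) system for x1, x2, x3.
M, rhs = A[:, :3], -A[:, 3]
x = np.append(np.linalg.solve(M, rhs), 1.0)

assert np.allclose(A @ x, 0)   # a solution of the homogeneous system
assert np.linalg.norm(x) > 0   # and a nontrivial one, as Theorem 4 promises
```

Fixing the free variable at 1 is only valid here because the leading (3 × 3) block happens to be invertible; in general one would row reduce first.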
If Eq. (3) has real solutions, then the graph is a curve in the xy-plane. If at least one of
a, b, or c is nonzero, the resulting graph is known as a conic section. Conic sections
include such familiar plane figures as parabolas, ellipses, hyperbolas, and (as well)
certain degenerate forms such as points and lines. Objects as diverse as planets, comets,
man-made satellites, and electrons follow trajectories in space that correspond to conic
sections. The earth, for instance, travels in an elliptical path about the sun, with the sun
at one focus of the ellipse.
In this subsection we consider an important data-fitting problem associated with
Eq. (3), namely:
Suppose we are given several points in the xy-plane, (x1 , y1 ), (x2 , y2 ),
. . . , (xn , yn ). Can we find coefficients a, b, . . . , f so that the graph
of Eq. (3) passes through the given points?
For example, if we know an object is moving along an ellipse, can we make a few
observations of the object’s position and then determine its complete orbit? As we will
see, the answer is yes. In fact, if an object follows a trajectory that corresponds to the
graph of Eq. (3), then five or fewer observations are sufficient to determine the complete
trajectory.
The following example introduces the data-fitting technique. As you will see, Ex-
ample 8 describes a method for finding the equation of the line passing through two
points in the plane. This is a simple and familiar problem, but its very simplicity is a
virtue because it suggests methods we can use for solving more complicated problems.
Example 8 The general equation of a line is dx + ey + f = 0. Find the equation of the line through
the points (1, 2) and (3, 7).
Solution In an analytic geometry course, we would probably find the equation of the line by
first calculating the slope of the line. In this example, however, we are interested in
developing methods that can be used to find equations for more complicated curves; and
we do not want to use special purpose techniques, such as slopes, that apply only to lines.
Since the points (1, 2) and (3, 7) lie on the line defined by dx + ey + f = 0, we
insert these values into the equation and find the following conditions on the coefficients
d, e, and f :
d + 2e + f = 0
3d + 7e + f = 0.
We are guaranteed from Theorem 4 that the preceding homogeneous linear system has
nontrivial solutions; that is, we can find a line passing through the two given points. To
find the equation of the line, we need to solve the system. We begin by forming the
associated augmented matrix
[ 1  2  1  0 ]
[ 3  7  1  0 ].
It follows that the solution is d = −5f , e = 2f , and hence the equation of the line
is
−5f x + 2fy + f = 0.
Canceling the parameter f , we obtain an equation for the line:
−5x + 2y + 1 = 0.
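The result is easy to check directly; here is a two-line verification (ours, not part of the text):

```python
# Coefficients from the general solution with f = 1: d = -5, e = 2.
d, e, f = -5, 2, 1
for (x, y) in [(1, 2), (3, 7)]:
    assert d*x + e*y + f == 0   # both given points lie on the line
```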
Example 8 suggests how we might determine the equation of a conic that passes
through a given set of points in the xy-plane. In particular, see Eq. (3); the general conic
has six coefficients, a, b, . . . , f . So, given any five points (xi , yi ) we can insert these
five points into Eq. (3) and the result will be a homogeneous system of five equations for
the six unknown coefficients that define the conic section. By Theorem 4, the resulting
system is guaranteed to have a nontrivial solution—that is, we can guarantee that any five
points in the plane lie on the graph of an equation of the form (3). Example 9 illustrates
this point.
Example 9 Find the equation of the conic section passing through the five points (−1, 0), (0, 1),
(2, 2), (2, −1), (0, −3). Display the graph of the conic.
Solution The augmented matrix for the corresponding homogeneous system of five equations in
six unknowns is listed below. In creating the augmented matrix, we formed the rows
in the same order the points were listed and formed columns using the same order the
unknowns were listed in Eq. (3). For example, the third row of the augmented matrix
arises from inserting (2, 2) into Eq. (3):
4a + 4b + 4c + 2d + 2e + f = 0.
We used MATLAB to transform the augmented matrix to reduced echelon form, finding
[ 1  0  0  0  0    7/18  0 ]
[ 0  1  0  0  0   −1/2   0 ]
[ 0  0  1  0  0    1/3   0 ]
[ 0  0  0  1  0  −11/18  0 ]
[ 0  0  0  0  1    2/3   0 ].
Thus, the coefficients of the conic through these five points are given by a = −(7/18)f, b = (1/2)f, c = −(1/3)f, d = (11/18)f, and e = −(2/3)f, with f arbitrary; taking f = 18 gives the conic −7x² + 9xy − 6y² + 11x − 12y + 18 = 0.
The graph of this equation is an ellipse and is shown in Fig. 1.7. The graph was drawn us-
ing the contour command from MATLAB. Contour plots and other features of MATLAB
graphics are described in the Appendix.
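Reading the reduced echelon form above with f as the free variable and scaling by f = 18 gives the integer coefficients a = −7, b = 9, c = −6, d = 11, e = −12, f = 18. A short NumPy check (ours, not the text's) confirms that this conic passes through all five points; the coefficient order x², xy, y², x, y, 1 is our assumption for Eq. (3), consistent with the row displayed for the point (2, 2).

```python
import numpy as np

points = [(-1, 0), (0, 1), (2, 2), (2, -1), (0, -3)]
# One row [x^2, xy, y^2, x, y, 1] per data point.
M = np.array([[x*x, x*y, y*y, x, y, 1] for (x, y) in points], dtype=float)

# Coefficients read off the reduced echelon form, scaled by f = 18.
coeffs = np.array([-7, 9, -6, 11, -12, 18], dtype=float)

assert np.allclose(M @ coeffs, 0)   # every point satisfies the conic
```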
Finally, it should be noted that the ideas discussed above are not limited to the
xy-plane. For example, consider the quadratic equation in three variables:
The graph of Eq. (4) is a surface in three-space; the surface is known as a quadric
surface. Counting the coefficients in Eq. (4), we find ten. Thus, given any nine points in
three-space, we can find a quadric surface passing through the nine points (see Exercises
30–31).
Figure 1.7 The ellipse determined by five data points (see Example 9).
1.3 EXERCISES
In Exercises 1–4, transform the augmented matrix for the given system to reduced echelon form and, in the notation of Theorem 3, determine n, r, and the number, n − r, of independent variables. If n − r > 0, then identify n − r independent variables.
1. 2x1 + 2x2 − x3 = 1
   −2x1 − 2x2 + 4x3 = 1
   2x1 + 2x2 + 5x3 = 5
   −2x1 − 2x2 − 2x3 = −3
2. 2x1 + 2x2 = 1
   4x1 + 5x2 = 4
   4x1 + 2x2 = −2
3. − x2 + x3 + x4 = 2
   x1 + 2x2 + 2x3 − x4 = 3
   x1 + 3x2 + x3 = 2
4. x1 + 2x2 + 3x3 + 2x4 = 1
   x1 + 2x2 + 3x3 + 5x4 = 2
   2x1 + 4x2 + 6x3 + x4 = 1
   −x1 − 2x2 − 3x3 + 7x4 = 2
In Exercises 5 and 6, assume that the given system is consistent. For each system determine, in the notation of Theorem 3, all possibilities for the number, r, of nonzero rows and the number, n − r, of unconstrained variables. Can the system have a unique solution?
In Exercises 32–33, note that the equation of a circle has the form
ax² + ay² + bx + cy + d = 0.
Hence a circle is determined by three points. Find the equation of the circle through the given points.
32. (1, 1), (2, 1), and (3, 2)
33. (4, 3), (1, 2), and (2, 0)
1.4 APPLICATIONS (OPTIONAL)
In this brief section we discuss networks and methods for determining flows in networks.
An example of a network is the system of one-way streets shown in Fig. 1.8. A typical
problem associated with networks is estimating the flow of traffic through this network of
streets. Another example is the electrical network shown in Fig. 1.9. A typical problem
consists of determining the currents flowing through the loops of the circuit.
(Note: The network problems we discuss in this section are kept very simple so that
the computational details do not obscure the ideas.)
Flows in Networks
Networks consist of branches and nodes. For the street network shown in Fig. 1.8, the
branches are the streets and the nodes are the intersections. We assume for a network
that the total flow into a node is equal to the total flow out of the node. For example,
Fig. 1.10 shows a flow of 40 into a node and a total flow of x1 + x2 + 5 out of the node.
Since we assume that the flow into a node is equal to the flow out, it follows that the
flows x1 and x2 must satisfy the linear equation 40 = x1 + x2 + 5, or equivalently,
x1 + x2 = 35.
As an example of network flow calculations, consider the system of one-way streets
in Fig. 1.11, where the flow is given in vehicles per hour. For instance, x1 + x4 vehicles
per hour enter node B, while x2 + 400 vehicles per hour leave.
Figure 1.10 Since we assume that the flow into a node is equal to the flow out, in this case, x1 + x2 = 35.
[Figure 1.11: a network of one-way streets with nodes A, B, C, D, E, F; branch flows x1 , . . . , x7 ; and external flows of 800, 400, 600, 1600, 600, 400, and 400 vehicles per hour.]
Example 1
(a) Set up a system of equations that represents traffic flow for the network shown
in Fig. 1.11. (The numbers give the average flows into and out of the network
at peak traffic hours.)
(b) Solve the system of equations. What is the traffic flow if x6 = 300 and
x7 = 1300 vehicles per hour?
Solution
(a) Since the flow into a node is equal to the flow out, we obtain the following
system of equations:
800 = x1 + x5 (Node A)
x1 + x4 = 400 + x2 (Node B)
x2 = 600 + x3 (Node C)
1600 + x3 = 400 + x7 (Node D)
x7 = x4 + x6 (Node E)
x5 + x6 = 1000. (Node F )
(b) The augmented matrix for the system above is
[ 1   0   0  0  1  0   0 |   800 ]
[ 1  −1   0  1  0  0   0 |   400 ]
[ 0   1  −1  0  0  0   0 |   600 ]
[ 0   0   1  0  0  0  −1 | −1200 ]
[ 0   0   0  1  0  1  −1 |     0 ]
[ 0   0   0  0  1  1   0 |  1000 ].
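Row reduction of this matrix leaves x6 and x7 free; the resulting flows are listed at the start of Example 2. A quick check (ours, not the text's) that the values for part (b) balance every node:

```python
# Free variables from part (b): x6 = 300, x7 = 1300 vehicles per hour.
x6, x7 = 300, 1300
x1, x2, x3 = x6 - 200, x7 - 600, x7 - 1200
x4, x5 = x7 - x6, 1000 - x6

# Flow in equals flow out at every node, A through F.
assert x1 + x5 == 800          # node A
assert x1 + x4 == 400 + x2     # node B
assert x2 == 600 + x3          # node C
assert 1600 + x3 == 400 + x7   # node D
assert x7 == x4 + x6           # node E
assert x5 + x6 == 1000         # node F
```

With these free-variable values the flows come out to x1 = 100, x2 = 700, x3 = 100, x4 = 1000, x5 = 700.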
Example 2 Consider the street network in Example 1 (see Fig. 1.11). Suppose that the streets from
A to B and from B to C must be closed (that is, x1 = 0 and x2 = 0). How might the
traffic be rerouted?
Solution By Example 1, the flows are
x1 = x6 − 200
x2 = x7 − 600
x3 = x7 − 1200
x4 = x7 − x6
x5 = 1000 − x6 .
Therefore, if x1 = 0 and x2 = 0, it follows that x6 = 200 and x7 = 600. Us-
ing these values, we then obtain x3 = −600, x4 = 400, and x5 = 800. In order
to have nonnegative flows, we must reverse directions on the street connecting C and
D; this change makes x3 = 600 instead of −600. The network flows are shown in
Fig. 1.12.
[Figure 1.12: the rerouted network flows, with x1 = 0 and x2 = 0, flows of 200 and 600 on the remaining internal branches, and the street between C and D reversed.]
Electrical Networks
We now consider current flow in simple electrical networks such as the one illustrated
in Fig. 1.13. For such networks, current flow is governed by Ohm’s law and Kirchhoff’s
laws, as follows.
Ohm’s Law: The voltage drop across a resistor is the product of the current and
the resistance.
Kirchhoff’s First Law: The sum of the currents flowing into a node is equal to the
sum of the currents flowing out.
Kirchhoff’s Second Law: The algebraic sum of the voltage drops around a closed
loop is equal to the total voltage in the loop.
(Note: With respect to Kirchhoff’s second law, two basic closed loops in Fig. 1.13 are
the counterclockwise paths BDCB and BCAB. Also, in each branch, we make a tentative
Figure 1.13 The electrical network analyzed in Example 3: the branch through A carries current I1 through a 20-ohm resistor and a 5-volt source; the branch from B to C carries I2 through a 10-ohm resistor; and the branch through D carries I3 through a 10-ohm resistor and a 10-volt source.
assignment for the direction of current flow. If a current turns out to be negative, we
then reverse our assignment for that branch.)
Example 3 Determine the currents I1 , I2 , and I3 for the electrical network shown in Fig. 1.13.
Solution Applying Kirchhoff’s second law to the loops BDCB and BCAB, we obtain equations
−10I2 + 10I3 = 10 (BDCB)
20I1 + 10I2 = 5 (BCAB).
Applying Kirchhoff’s first law to either of the nodes B or C, we find I1 = I2 + I3 .
Therefore,
I1 − I2 − I3 = 0.
The augmented matrix for this system of three equations is
[  1   −1  −1 |  0 ]
[  0  −10  10 | 10 ]
[ 20   10   0 |  5 ].
This matrix can be row reduced to
[ 1  0  0 |  0.4 ]
[ 0  1  0 | −0.3 ]
[ 0  0  1 |  0.7 ].
Therefore, the currents are
I1 = 0.4, I2 = −0.3, I3 = 0.7.
Since I2 is negative, the current flow is from C to B rather than from B to C, as tentatively
assigned in Fig. 1.13.
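The row reduction can be reproduced with any linear solver; a sketch using NumPy (ours, not the text's):

```python
import numpy as np

# Kirchhoff equations from Example 3: node equation, loop BDCB, loop BCAB.
A = np.array([[ 1,  -1, -1],
              [ 0, -10, 10],
              [20,  10,  0]], dtype=float)
b = np.array([0, 10, 5], dtype=float)

I = np.linalg.solve(A, b)
print(I)   # [ 0.4 -0.3  0.7]
```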
1.4 EXERCISES
In Exercises 1 and 2, (a) set up the system of equations that describes traffic flow; (b) determine the flows x1 , x2 , and x3 if x4 = 100; and (c) determine the maximum and minimum values for x4 if all the flows are constrained to be nonnegative.
[Figures for Exercises 1 and 2: street networks with branch flows x1 , x2 , x3 , x4 and the external flows shown in the accompanying figures.]
In Exercises 3 and 4, find the flow of traffic in the rotary if x1 = 600.
[Figures for Exercises 3 and 4: traffic rotaries with branch flows x1 , . . . , x4 (Exercise 3) and x1 , . . . , x6 (Exercise 4) and the external flows shown in the accompanying figures.]
5. [Figure: an electrical network with currents I1 , I2 , I3 ; resistors of 3 ohms, 1 ohm, and 4 ohms; and sources of 2 volts and 4 volts.]
6. [Figure: an electrical network with currents I1 , I2 , I3 ; two 1-ohm resistors and a 2-ohm resistor; and sources of 4 volts and 3 volts.]
7. [Figure: an electrical network with a 10-volt source and resistors of 2 ohms, 3 ohms, and 4 ohms.]
9. a) Set up the system of equations that describes the traffic flow in the accompanying figure.
   b) Show that the system is consistent if and only if
      a1 + b1 + c1 + d1 = a2 + b2 + c2 + d2 .
[Figure for Exercise 9: a rotary with internal flows x1 , x2 , x3 , x4 and external flows a1 , a2 , b1 , b2 , c1 , c2 , d1 , d2 .]
1.5 MATRIX OPERATIONS
In the previous sections, matrices were used as a convenient way of representing systems
of equations. But matrices are of considerable interest and importance in their own
right, and this section introduces the arithmetic operations that make them a useful
computational and theoretical tool.
In this discussion of matrices and matrix operations (and later in the discussion of
vectors), it is customary to refer to numerical quantities as scalars. For convenience
we assume throughout this chapter that all matrix (and vector) entries are real numbers;
hence the term scalar will mean a real number. In later chapters the term scalar will
also be applied to complex numbers. We begin with a definition of the equality of two
matrices.
Thus two matrices are equal if they have the same size and, moreover, if all their
corresponding entries are equal. For example, no two of the matrices
A = [ 1  2 ] ,  B = [ 2  1 ] ,  and  C = [ 1  2  0 ]
    [ 3  4 ]        [ 4  3 ]             [ 3  4  0 ]
are equal.
Definition 6 Let A = (aij ) and B = (bij ) both be (m × n) matrices. The sum, A + B, is the
(m × n) matrix defined by
(A + B)ij = aij + bij .
Note that this definition requires that A and B have the same size before their sum is
defined. Thus if
A = [ 1  2  −1 ] ,  B = [ −3   1  2 ] ,  and  C = [ 1  2 ] ,
    [ 2  3   0 ]        [  0  −4  1 ]             [ 3  1 ]
then
A + B = [ −2   3  1 ] ,
        [  2  −1  1 ]
while A + C is undefined.
Definition 7 Let A = (aij ) be an (m × n) matrix, and let r be a scalar. The product, rA, is the
(m × n) matrix defined by
(rA)ij = raij .
For example,
2 [ 1   3 ]   [ 2   6 ]
  [ 2  −1 ] = [ 4  −2 ] .
  [ 0   3 ]   [ 0   6 ]
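Definitions 6 and 7 are exactly the entrywise operations that array libraries implement; a quick NumPy illustration of the two examples above (ours, not part of the text):

```python
import numpy as np

A = np.array([[1, 2, -1], [2, 3, 0]])
B = np.array([[-3, 1, 2], [0, -4, 1]])
assert (A + B == np.array([[-2, 3, 1], [2, -1, 1]])).all()   # Definition 6

M = np.array([[1, 3], [2, -1], [0, 3]])
assert (2 * M == np.array([[2, 6], [4, -2], [0, 6]])).all()  # Definition 7
```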
Vectors in Rn
Before proceeding with the definition of matrix multiplication, recall that a point in
n-dimensional space is represented by an ordered n-tuple of real numbers x = (x1 ,
x2 , . . . , xn ). Such an n-tuple will be called an n-dimensional vector and will be written
in the form of a matrix,
    [ x1 ]
x = [ x2 ] .
    [ ⋮  ]
    [ xn ]
For example, an arbitrary three-dimensional vector has the form
    [ x1 ]
x = [ x2 ] ,
    [ x3 ]
and the vectors
x = [1, 2, 3]T , y = [3, 2, 1]T , and z = [2, 3, 1]T
are distinct three-dimensional vectors. The set of all n-dimensional vectors with real
components is called Euclidean n-space and will be denoted by R n . Vectors in R n will
be denoted by boldface type. Thus R n is the set defined by
R n = {x : x = [x1 , x2 , . . . , xn ]T , where x1 , x2 , . . . , xn are real numbers}.
As the notation suggests, an element of R n can be viewed as an (n × 1) real matrix, and
conversely an (n × 1) real matrix can be considered an element of R n . Thus addition and
scalar multiplication of vectors is just a special case of these operations for matrices.
The idea of the vector form for the general solution is straightforward and is best
explained by a few examples.
Example 2 The matrix B is the augmented matrix for a homogeneous system of linear equations.
Find the general solution for the linear system and express the general solution in terms
of vectors
B = [ 1  0  −1  −3  0 ] .
    [ 0  1   2   1  0 ]
Solution Since B is in reduced echelon form, it is easy to write the general solution:
x1 = x3 + 3x4 , x2 = −2x3 − x4 , with x3 and x4 free. Collecting terms, every solution can be written as
x = x3 [1, −2, 1, 0]T + x4 [3, −1, 0, 1]T .
This last expression is called the vector form for the general solution.
In general, the vector form for the general solution of a homogeneous system consists
of a sum of well-determined vectors multiplied by the free variables. Such expressions
are called “linear combinations” and we will use this concept of a linear combination
extensively, beginning in Section 1.7. The next example illustrates the vector form for
the general solution of a nonhomogeneous system.
Example 3 Let B denote the augmented matrix for a system of linear equations
B = [ 1  −2  0  0   2   3 ]
    [ 0   0  1  0  −1   2 ] .
    [ 0   0  0  1   3  −4 ]
Find the vector form for the general solution of the linear system.
Solution Since B is in reduced echelon form, we readily find the general solution: x1 = 3 + 2x2 − 2x5 , x3 = 2 + x5 , x4 = −4 − 3x5 , with x2 and x5 free. In vector form, x = [3, 0, 2, −4, 0]T + x2 [2, 1, 0, 0, 0]T + x5 [−2, 0, 1, −3, 1]T .
Scalar Product
In vector calculus, the scalar product (or dot product) of two vectors u = [u1 , u2 , . . . , un ]T and v = [v1 , v2 , . . . , vn ]T in R n is defined to be the number u1 v1 + u2 v2 + · · · + un vn = Σ(i=1 to n) ui vi . For example, if
u = [2, 3, −1]T and v = [−4, 2, 3]T ,
then the scalar product of u and v is 2(−4) + 3(2) + (−1)3 = −5. The scalar product
of two vectors will be considered further in the following section, and in Chapter 3 the
properties of R n will be more fully developed.
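The scalar product is a one-line computation; here is the example above in plain Python (our sketch, not the text's):

```python
u = [2, 3, -1]
v = [-4, 2, 3]
dot = sum(ui * vi for ui, vi in zip(u, v))   # u1*v1 + u2*v2 + u3*v3
print(dot)   # -5
```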
Matrix Multiplication
Matrix multiplication is defined in such a way as to provide a convenient mechanism
for describing a linear correspondence between vectors. To illustrate, let the variables
x1 , x2 , . . . , xn and the variables y1 , y2 , . . . , ym be related by the linear equations
a11 x1 + a12 x2 + · · · + a1n xn = y1
a21 x1 + a22 x2 + · · · + a2n xn = y2
   ⋮          ⋮              ⋮        (1)
am1 x1 + am2 x2 + · · · + amn xn = ym .
If we set
x = [x1 , x2 , . . . , xn ]T and y = [y1 , y2 , . . . , ym ]T ,
then system (1) states that
Σ(j=1 to n) aij xj = yi , i = 1, 2, . . . , m. (2)
The left-hand side of Eq. (2) is precisely the scalar product of the ith row of A with the vector x. Thus if we define the product of A and x to be the (m × 1) vector Ax whose ith component is the scalar product of the ith row of A with x, then Ax is given by
     [ Σ(j=1 to n) a1j xj ]
Ax = [ Σ(j=1 to n) a2j xj ] .
     [          ⋮         ]
     [ Σ(j=1 to n) amj xj ]
Using the definition of equality (Definition 5), we see that the simple matrix equation
Ax = y (3)
(AB)ij = Σ(k=1 to n) aik bkj .
Figure 1.14 The ij th entry of AB is the scalar product of the ith row of
A and the j th column of B.
Thus the product AB is defined only when the inside dimensions of A and B are
equal. In this case the outside dimensions, m and s, give the size of AB. Furthermore,
the ij th entry of AB is the scalar product of the ith row of A with the j th column of B.
For example, if
A = [  2  1  −3 ]  and  B = [ −1   2 ] ,
    [ −2  2   4 ]           [  0  −3 ]
                            [  2   1 ]
then
AB = [ 2(−1) + 1(0) + (−3)2     2(2) + 1(−3) + (−3)1 ]   [ −8  −2 ]
     [ (−2)(−1) + 2(0) + 4(2)   (−2)2 + 2(−3) + 4(1) ] = [ 10  −6 ] ,
while a product whose inside dimensions do not agree is undefined.
Example 4 illustrates that matrix multiplication is not commutative; that is, normally
AB and BA are different matrices. Indeed, the product AB may be defined while the
product BA is undefined, or both may be defined but have different dimensions. Even
when AB and BA have the same size, they usually are not equal.
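The size bookkeeping alone already rules out commutativity in general; for the (2 × 3) and (3 × 2) matrices multiplied above, AB and BA do not even have the same size. A NumPy check (ours, not part of the text):

```python
import numpy as np

A = np.array([[2, 1, -3], [-2, 2, 4]])      # (2 x 3)
B = np.array([[-1, 2], [0, -3], [2, 1]])    # (3 x 2)

assert (A @ B == np.array([[-8, -2], [10, -6]])).all()
assert (A @ B).shape == (2, 2)
assert (B @ A).shape == (3, 3)   # a different size, so AB cannot equal BA
```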
Therefore,
x1 = −11z1 + 3z2
x2 = 18z1 − 4z2
x3 = 5z1 + 5z2 .
The use of the matrix equation (3) to represent the linear system (1) provides a
convenient notational device for representing the (m × n) system
a11 x1 + a12 x2 + · · · + a1n xn = b1
a21 x1 + a22 x2 + · · · + a2n xn = b2
   ⋮          ⋮              ⋮        (4)
am1 x1 + am2 x2 + · · · + amn xn = bm
of linear equations with unknowns x1 , . . . , xn . Specifically, if A = (aij ) is the coefficient
matrix of (4), and if the unknown (n × 1) matrix x and the constant (m × 1) matrix b
are defined by
x = [x1 , x2 , . . . , xn ]T and b = [b1 , b2 , . . . , bm ]T ,
then the system (4) is equivalent to the matrix equation
Ax = b. (5)
Then
Ax = [ 1  3  6 ] [ x1 ]
     [ 2  4  0 ] [ x2 ]
                 [ x3 ]
   = [  x1 + 3x2 + 6x3 ]
     [ 2x1 + 4x2 + 0x3 ]
   = x1 [ 1 ] + x2 [ 3 ] + x3 [ 6 ] ;
        [ 2 ]      [ 4 ]      [ 0 ]
so that Ax = x1 A1 + x2 A2 + x3 A3 . In particular, if we set
x = [2, 2, −3]T ,
then Ax = 2A1 + 2A2 − 3A3 .
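The column interpretation of Ax is easy to verify numerically (our sketch, using NumPy):

```python
import numpy as np

A = np.array([[1, 3, 6], [2, 4, 0]])
x = np.array([2, 2, -3])

combo = 2*A[:, 0] + 2*A[:, 1] - 3*A[:, 2]   # x1*A1 + x2*A2 + x3*A3
assert (A @ x == combo).all()               # same vector either way
print(A @ x)   # [-10  12]
```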
From Theorem 5, we see that the matrix equation Ax = b corresponding to the
(m × n) system (4) can be expressed as
x1 A1 + x2 A2 + · · · + xn An = b. (8)
Thus, Eq. (8) says that solving Ax = b amounts to showing that b can be written in
terms of the columns of A.
Example 7 Solve
x1 [ 1 ] + x2 [ 3 ] + x3 [ −1 ]   [ 2 ]
   [ 2 ]      [ 5 ]      [ −1 ] = [ 6 ] .
   [ 2 ]      [ 8 ]      [ −2 ]   [ 6 ]
Solution By Theorem 5, the given equation is equivalent to the matrix equation Ax = b, where
A = [ 1  3  −1 ]        [ x1 ]             [ 2 ]
    [ 2  5  −1 ] ,  x = [ x2 ] ,  and  b = [ 6 ] .
    [ 2  8  −2 ]        [ x3 ]             [ 6 ]
This equation was solved in Example 6 giving x1 = 2, x2 = 1, x3 = 3, so we have
2 [ 1 ] + 1 [ 3 ] + 3 [ −1 ]   [ 2 ]
  [ 2 ]     [ 5 ]     [ −1 ] = [ 6 ] .
  [ 2 ]     [ 8 ]     [ −2 ]   [ 6 ]
Although Eq. (8) is not particularly efficient as a computational tool, it is useful for
understanding how the internal structure of the coefficient matrix affects the possible
solutions of the linear system Ax = b.
Proof If A = (aij ) and B = (bij ), then the j th column of AB contains the entries
Σ(k=1 to n) a1k bkj , Σ(k=1 to n) a2k bkj , . . . , Σ(k=1 to n) amk bkj ;
and these are precisely the components of the column vector ABj , where
Bj = [b1j , b2j , . . . , bnj ]T .
It follows that we can write AB in the form AB = [AB1 , AB2 , . . . , ABs ].
To illustrate Theorem 6, let A and B be given by
A = [ 2  6 ]  and  B = [ 1  3  0  1 ] .
    [ 0  4 ]           [ 4  5  2  3 ]
    [ 1  2 ]
Thus the column vectors for B are
B1 = [ 1 ] ,  B2 = [ 3 ] ,  B3 = [ 0 ] ,  and  B4 = [ 1 ]
     [ 4 ]         [ 5 ]         [ 2 ]              [ 3 ]
and
AB1 = [ 26 ] ,  AB2 = [ 36 ] ,  AB3 = [ 12 ] ,  and  AB4 = [ 20 ] .
      [ 16 ]          [ 20 ]          [  8 ]               [ 12 ]
      [  9 ]          [ 13 ]          [  4 ]               [  7 ]
Calculating AB, we see immediately that AB is a (3 × 4) matrix with columns AB1 , AB2 , AB3 , and AB4 ; that is,
AB = [ 26  36  12  20 ]
     [ 16  20   8  12 ] .
     [  9  13   4   7 ]
1.5 EXERCISES
The (2 × 2) matrices listed in Eq. (9) are used in several of the exercises that follow.
A = [ 2  1 ] ,  B = [ 0  −1 ] ,  C = [ −2  3 ] ,  Z = [ 0  0 ]   (9)
    [ 1  3 ]        [ 1   3 ]        [  1  1 ]        [ 0  0 ]
Exercises 1–6 refer to the matrices in Eq. (9).
1. Find (a) A + B; (b) A + C; (c) 6B; and (d) B + 3C.
2. Find (a) B + C; (b) 3A; (c) A + 2C; and (d) C + 8Z.
3. Find a matrix D such that A + D = B.
4. Find a matrix D such that A + 2D = C.
5. Find a matrix D such that A + 2B + 2D = 3B.
6. Find a matrix D such that 2A + 5B + D = 2B + 3A.
The vectors listed in Eq. (10) are used in several of the exercises that follow.
r = [1, 0]T ,  s = [2, −3]T ,  t = [1, 4]T ,  u = [−4, 6]T   (10)
In Exercises 7–12, perform the indicated computation, using the vectors in Eq. (10) and the matrices in Eq. (9).
7. a) r + s  b) 2r + t  c) 2s + u
8. a) t + s  b) r + 3u  c) 2u + 3t
9. a) Ar  b) Br  c) C(s + 3t)
10. a) Bt  b) C(r + s)  c) B(r + s)
11. a) (A + 2B)r  b) (B + C)u
12. a) (A + C)r  b) (2B + 3C)s
Exercises 13–20 refer to the vectors in Eq. (10). In each exercise, find scalars a1 and a2 that satisfy the given equation, or state that the equation has no solution.
13. a1 r + a2 s = t
14. a1 r + a2 s = u
15. a1 s + a2 t = u
16. a1 s + a2 t = r + t
17. a1 s + a2 u = 2r + t
18. a1 s + a2 u = t
19. a1 t + a2 u = 3s + 4t
20. a1 t + a2 u = 3r + 2s
Exercises 21–24 refer to the matrices in Eq. (9) and the vectors in Eq. (10).
21. Find w2 , where w1 = Br and w2 = Aw1 . Calculate Q = AB. Calculate Qr and verify that w2 is equal to Qr.
22. Find w2 , where w1 = Cs and w2 = Aw1 . Calculate Q = AC. Calculate Qs and verify that w2 is equal to Qs.
23. Find w3 , where w1 = Cr, w2 = Bw1 , and w3 = Aw2 . Calculate Q = A(BC) and verify that w3 is equal to Qr.
24. Find w3 , where w1 = Ar, w2 = Cw1 , and w3 = Bw2 . Calculate Q = B(CA) and verify that w3 is equal to Qr.
Exercises 25–30 refer to the matrices in Eq. (9). Find each of the following.
25. (A + B)C
26. (A + 2B)A
27. (A + C)B
28. (B + C)Z
29. A(BZ)
30. Z(AB)
The matrices and vectors listed in Eq. (11) are used in several of the exercises that follow.
A = [ 2  3 ] ,  B = [ 1  2 ] ,  u = [1, 3]T ,  v = [2, 4] (a (1 × 2) row vector),
    [ 1  4 ]        [ 1  4 ]
C = [ 2   1 ]        D = [ 2   1  3   6 ]        w = [2, 3, 1, 1]T .   (11)
    [ 4   0 ]            [ 2   0  0   4 ]
    [ 8  −1 ]            [ 1  −1  1  −1 ]
    [ 3   2 ] ,          [ 1   3  1   2 ] ,
Exercises 31–41 refer to the matrices and vectors in Eq. (11). Find each of the following.
31. AB and BA
32. DC
33. Au and vA
34. uv and vu
35. v(Bu)
36. Bu
37. CA
38. CB
39. C(Bu)
40. (AB)u and A(Bu)
41. (BA)u and B(Au)
In Exercises 68–70, find the vector form for the general solution.
68. x1 + 3x2 − 3x3 + 2x4 − 3x5 = −4
    3x1 + 9x2 − 10x3 + 10x4 − 14x5 = 2
    2x1 + 6x2 − 10x3 + 21x4 − 25x5 = 53
69. 14x1 − 8x2 + 3x3 − 49x4 + 29x5 = 44
    −8x1 + 5x2 − 2x3 + 29x4 − 16x5 = −24
    3x1 − 2x2 + x3 − 11x4 + 6x5 = 9
70. 18x1 + 18x2 − 10x3 + 7x4 + 2x5 + 50x6 = 26
    −10x1 − 10x2 + 6x3 − 4x4 − x5 − 27x6 = −13
    7x1 + 7x2 − 4x3 + 5x4 + 2x5 + 30x6 = 18
    2x1 + 2x2 − x3 + 2x4 + x5 + 12x6 = 8
71. In Exercise 57 we saw that the state vector giving the number of newspaper subscribers in year n could be found by forming P n x where x is the initial state. Later, in Section 3.8, we will see that as n grows larger and larger, the vector P n x tends toward a limit. Use MATLAB to calculate P n x for n = 1, 2, . . . , 30. For ease of reading, display the results using bank format in the MATLAB numeric options menu. What do you think the steady state distribution of newspapers will be?
These properties are easily established, and the proofs of 2–4 are left as exercises.
Regarding properties 3 and 4, we note that the zero matrix, O, is the (m × n) matrix, all
of whose entries are zero. Also the matrix P of property 4 is usually called the additive
inverse for A, and the reader can show that P = (−1)A. The matrix (−1)A is also
denoted as −A, and the notation A − B means A + (−B). Thus property 4 states that
A − A = O.
Proof of Property 1 If A = (aij ) and B = (bij ) are (m × n) matrices, then, by Definition 6,
(A + B)ij = aij + bij .
Similarly, by Definition 6,
(B + A)ij = bij + aij .
Since addition of real numbers is commutative, aij +bij and bij +aij are equal. Therefore,
A + B = B + A.
Three associative properties involving scalar and matrix multiplication are given in
Theorem 8.
Theorem 8
1. If A, B, and C are (m × n), (n × p), and (p × q) matrices, respectively, then
(AB)C = A(BC).
2. If r and s are scalars, then r(sA) = (rs)A.
3. r(AB) = (rA)B = A(rB).
The proof is again left to the reader, but we will give one example to illustrate the
theorem.
Finally, the distributive properties connecting addition and multiplication are given
in Theorem 9.
Theorem 9
1. If A and B are (m × n) matrices and C is an (n × p) matrix, then (A + B)C =
AC + BC.
2. If A is an (m × n) matrix and B and C are (n × p) matrices, then A(B + C) =
AB + AC.
3. If r and s are scalars and A is an (m × n) matrix, then (r + s)A = rA + sA.
4. If r is a scalar and A and B are (m × n) matrices, then r(A + B) = rA + rB.
Proof We will prove property 1 and leave the others to the reader. First observe that (A + B)C
and AC + BC are both (m × p) matrices. To show that the components of these two
matrices are equal, let Q = A + B, where Q = (qij ). Then (A + B)C = QC, and the
rsth component of QC is given by
Σ(k=1 to n) qrk cks = Σ(k=1 to n) (ark + brk )cks = Σ(k=1 to n) ark cks + Σ(k=1 to n) brk cks .
Because
Σ(k=1 to n) ark cks + Σ(k=1 to n) brk cks
is precisely the rsth component of AC + BC, it follows that (A + B)C = AC + BC.
In the preceding example, note that the first row of A becomes the first column of AT ,
and the second row of A becomes the second column of AT . Similarly, the columns of
A become the rows of AT . Thus AT is obtained by interchanging the rows and columns
of A.
Three important properties of the transpose are given in Theorem 10.
The ij th entry of (AC)T is the j ith entry of AC; that is,
Σ(k=1 to n) aj k cki .
Next the ij th entry of C TAT is the scalar product of the ith row of C T with the j th
column of AT . In particular, the ith row of C T is [c1i , c2i , . . . , cni ] (the ith column of
C), whereas the j th column of AT is
[aj 1 , aj 2 , . . . , aj n ]T
(the j th row of A). Therefore, the ij th entry of C TAT is given by
c1i aj 1 + c2i aj 2 + · · · + cni aj n = Σ(k=1 to n) cki aj k .
Finally, since
Σ(k=1 to n) cki aj k = Σ(k=1 to n) aj k cki ,
the ij th entries of (AC)T and C TAT agree, and the matrices are equal.
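The identity (AC)T = C TAT is easy to spot-check numerically; a sketch of ours with arbitrary integer matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-5, 5, size=(3, 4))
C = rng.integers(-5, 5, size=(4, 2))

# Transposing a product reverses the order of the factors.
assert ((A @ C).T == C.T @ A.T).all()
```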
The transpose operation is used to define certain important types of matrices, such as
positive-definite matrices, normal matrices, and symmetric matrices. We will consider
these in detail later and give only the definition of a symmetric matrix in this section.
In Exercise 49, the reader is asked to show that QTQ is always a symmetric matrix
whether or not Q is symmetric.
In the (n × n) matrix A = (aij ), the entries a11 , a22 , . . . , ann are called the main diagonal of A. For example, the main diagonal of a (3 × 3) matrix is illustrated in Fig. 1.15. Since the entries aij and aj i are symmetric partners relative to the main diagonal, symmetric matrices are easily recognizable as those in which the entries form a symmetric array relative to the main diagonal. For example, if
A = [  2  3  −1 ]  and  B = [  1  2  2 ] ,
    [  3  4   2 ]           [ −1  3  0 ]
    [ −1  2   0 ]           [  5  2  6 ]
Figure 1.15 The main diagonal of a (3 × 3) matrix.
xT y = Σ(i=1 to n) xi yi ;
that is, xT y is the scalar product or dot product of x and y. Also note that yT x = Σ(i=1 to n) yi xi = Σ(i=1 to n) xi yi = xT y.
One of the basic concepts in computational work is that of the length or norm of a vector. If
x = [a, b]T
is in R 2 , then x can be represented geometrically in the plane as the directed line segment OP from the origin O to the point P , which has coordinates (a, b), as illustrated in Fig. 1.16. By the Pythagorean theorem, the length of the line segment OP is √(a² + b²).
Figure 1.16 Geometric vector in two-space
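In code, the length of a vector in R 2 is exactly this Pythagorean quantity, and its square is the scalar product xT x (our sketch, not part of the text):

```python
import math

a, b = 3.0, 4.0
length = math.sqrt(a*a + b*b)   # Pythagorean length of x = (a, b)
print(length)   # 5.0

# Equivalently, the squared length is the scalar product x^T x.
assert math.isclose(length**2, a*a + b*b)
```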
1.6 EXERCISES
The matrices and vectors listed in Eq. (3) are used in several of the exercises that follow.

        [ 3  1 ]        [ 1  2  1 ]        [ 2  1  4  0 ]
    A = [ 4  7 ],   B = [ 7  4  3 ],   C = [ 6  1  3  5 ],
        [ 2  6 ]        [ 6  0  1 ]        [ 2  4  2  0 ]

    D = [ 2  1 ],   E = [ 3  6 ],   F = [ 1  1 ],
        [ 1  4 ]        [ 2  3 ]        [ 1  1 ]

    u = [  1 ],   v = [ 1 ]                                              (3)
        [ −1 ]        [ 3 ]

Exercises 1–25 refer to the matrices and vectors in Eq. (3). In Exercises 1–6, perform the multiplications to verify the given equality or nonequality.
1. (DE)F = D(EF)      2. (FE)D = F(ED)
3. DE ≠ ED            4. EF ≠ FE
5. Fu ≠ Fv            6. 3Fu ≠ 7Fv
In Exercises 7–12, find the matrices.
7. A^T      8. D^T       9. E^T F
10. A^T C   11. (Fv)^T   12. (EF)v
In Exercises 13–25, calculate the scalars.
13. u^T v    14. v^T Fu    15. v^T Dv
16. v^T Fv   17. u^T u     18. v^T v
19. ‖u‖      20. ‖Dv‖      21. ‖Au‖
22. ‖u − v‖  23. ‖Fu‖      24. ‖Fv‖
25. ‖(D − E)u‖
26. Let A and B be (2 × 2) matrices. Prove or find a counterexample for this statement: (A − B)(A + B) = A² − B².
27. Let A and B be (2 × 2) matrices such that A² = AB and A ≠ O. Can we assert that, by cancellation, A = B? Explain.
28. Let A and B be as in Exercise 27. Find the flaw in the following proof that A = B. Since A² = AB, A² − AB = O. Factoring yields A(A − B) = O. Since A ≠ O, it follows that A − B = O. Therefore, A = B.
29. Two of the six matrices listed in Eq. (3) are symmetric. Identify these matrices.
30. Find (2 × 2) matrices A and B such that A and B are symmetric, but AB is not symmetric. [Hint: (AB)^T = B^T A^T = BA.]
31. Let A and B be (n × n) symmetric matrices. Give a necessary and sufficient condition for AB to be symmetric. [Hint: Recall Exercise 30.]
32. Let G be the (2 × 2) matrix that follows, and consider any vector x in R² where both entries are not simultaneously zero:

    G = [ 2  1 ],   x = [ x1 ];   |x1| + |x2| > 0.
        [ 1  3 ]        [ x2 ]

Show that x^T Gx > 0. [Hint: Write x^T Gx as a sum of squares.]
33. Repeat Exercise 32 using the matrix D in Eq. (3) in place of G.
34. For F in Eq. (3), show that x^T Fx ≥ 0 for all x in R². Classify those vectors x such that x^T Fx = 0.
If x and y are vectors in R^n, then the product x^T y is often called an inner product. Similarly, the product xy^T is often called an outer product. Exercises 35–40 concern outer products; the matrices and vectors are given in Eq. (3). In Exercises 35–40, form the outer products.
35. uv^T        36. u(Fu)^T      37. v(Ev)^T
38. u(Ev)^T     39. (Au)(Av)^T   40. (Av)(Au)^T
41. Let a and b be given by

    a = [ 1 ]   and   b = [ 3 ].
        [ 2 ]             [ 4 ]

a) Find x in R² that satisfies both x^T a = 6 and x^T b = 2.
b) Find x in R² that satisfies both x^T(a + b) = 12 and x^T a = 2.
42. Let A be a (2 × 2) matrix, and let B and C be given by

    B = [ 1  3 ]   and   C = [ 2  3 ].
        [ 1  4 ]             [ 4  5 ]

a) If A^T + B = C, what is A?
Linear Independence
If A = [A1 , A2 , . . . , An ], then, by Theorem 5 of Section 1.5, the equation Ax = b can
be written in terms of the columns of A as
x1 A1 + x2 A2 + · · · + xn An = b. (2)
From Eq. (2), it follows that system (1) is consistent if, and only if, b can be written as
a sum of scalar multiples of the column vectors of A. We call a sum such as x1 A1 +
x2 A2 + · · · + xn An a linear combination of the vectors A1 , A2 , . . . , An . Thus Ax = b
is consistent if, and only if, b is a linear combination of the columns of A.
Any time you need to know whether a set of vectors is linearly independent or
linearly dependent, you should start with the dependence equation:
a 1 v1 + a 2 v2 + · · · + a p vp = θ (5)
You would then solve Eq. (5). If there are nontrivial solutions, then the set of vectors
is linearly dependent. If Eq. (5) has only the trivial solution, then the set of vectors is
linearly independent.
We can phrase Eq. (5) in matrix terms. In particular, let V denote the (m×p) matrix
made up from the vectors v1 , v2 , . . . , vp :
V = [v1 , v2 , . . . , vp ].
Then Eq. (5) is equivalent to the matrix equation
V x = θ. (6)
Example 2 Determine whether the set {v1 , v2 , v3 } is linearly independent or linearly dependent,
where
         [ 1 ]        [  2 ]            [ 0 ]
    v1 = [ 2 ],  v2 = [ −1 ],  and v3 = [ 5 ].
         [ 3 ]        [  4 ]            [ 2 ]
Solution To determine whether the set is linearly dependent, we must determine whether the
vector equation
x1 v1 + x2 v2 + x3 v3 = θ (7)
has a nontrivial solution. But Eq. (7) is equivalent to the (3 × 3) homogeneous system
of equations V x = θ , where V = [v1 , v2 , v3 ]. The augmented matrix, [V | θ ], for this
system is
1 2 0 0
2 −1 5 0 .
3 4 2 0
This matrix reduces to
1 0 2 0
0 1 −1 0 .
0 0 0 0
Therefore, we find the solution x1 = −2x3 , x2 = x3 , where x3 is arbitrary. In particular,
Eq. (7) has nontrivial solutions, so {v1 , v2 , v3 } is a linearly dependent set. Setting x3 = 1,
for example, gives x1 = −2, x2 = 1. Therefore,
−2v1 + v2 + v3 = θ.
Note that from this equation we can express v3 as a linear combination of v1 and v2 :
v3 = 2v1 − v2 .
Similarly, of course, v1 can be expressed as a linear combination of v2 and v3 , and v2
can be expressed as a linear combination of v1 and v3 .
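The row reduction in Example 2 is easy to replicate mechanically. The sketch below is ours, not the text's: plain Python with exact Fraction arithmetic, and the helper name rref is hypothetical. It reduces V = [v1, v2, v3] and confirms the dependence relation −2v1 + v2 + v3 = θ.

```python
# A sketch, not from the text: reduced echelon form over exact Fractions.
from fractions import Fraction

def rref(rows):
    """Return the reduced echelon form of a matrix given as a list of rows."""
    m = [[Fraction(x) for x in row] for row in rows]
    nrows, ncols = len(m), len(m[0])
    r = 0
    for c in range(ncols):
        piv = next((i for i in range(r, nrows) if m[i][c] != 0), None)
        if piv is None:
            continue                        # no pivot in this column
        m[r], m[piv] = m[piv], m[r]         # swap the pivot row into place
        m[r] = [x / m[r][c] for x in m[r]]  # scale the pivot to 1
        for i in range(nrows):
            if i != r and m[i][c] != 0:     # clear the rest of the column
                m[i] = [a - m[i][c] * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == nrows:
            break
    return m

# Rows of V list the components of the columns v1, v2, v3 from Example 2.
V = [[1, 2, 0],
     [2, -1, 5],
     [3, 4, 2]]
R = rref(V)
# The zero row in R means Vx = theta has nontrivial solutions, so
# {v1, v2, v3} is linearly dependent (x1 = -2*x3, x2 = x3, x3 free).
```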
Example 3 Determine whether or not the set {v1 , v2 , v3 } is linearly dependent, where
         [  1 ]        [ −2 ]            [  1 ]
    v1 = [  2 ],  v2 = [  1 ],  and v3 = [ −1 ].
         [ −3 ]        [  1 ]            [ −2 ]
THE VECTOR SPACE R n, n > 3 The extension of vectors and their corresponding
algebra into more than three dimensions was an extremely important step in the development of
mathematics. This advancement is attributed largely to Hermann Grassmann (1809–1877) in his
Ausdehnungslehre. In this work Grassmann discussed linear independence and dependence and many
concepts dealing with the algebraic structure of R n (such as dimension and subspaces), which we will
study in Chapter 3. Unfortunately, Grassmann’s work was so difficult to read that it went almost
unnoticed for a long period of time, and he did not receive as much credit as he deserved.
that any set of vectors that contains the zero vector is linearly dependent (again, see the
exercises).
The unit vectors e1 , e2 , . . . , en in R n are defined by
         [ 1 ]        [ 0 ]        [ 0 ]               [ 0 ]
         [ 0 ]        [ 1 ]        [ 0 ]               [ 0 ]
    e1 = [ 0 ],  e2 = [ 0 ],  e3 = [ 1 ],  . . . ,  en = [ 0 ]           (8)
         [ . ]        [ . ]        [ . ]               [ . ]
         [ 0 ]        [ 0 ]        [ 0 ]               [ 1 ]
It is easy to see that {e1 , e2 , . . . , en } is linearly independent. To illustrate, consider the
unit vectors
         [ 1 ]        [ 0 ]            [ 0 ]
    e1 = [ 0 ],  e2 = [ 1 ],  and e3 = [ 0 ]
         [ 0 ]        [ 0 ]            [ 1 ]
in R 3 . If V = [e1 , e2 , e3 ], then
              [ 1  0  0 | 0 ]
    [V | θ] = [ 0  1  0 | 0 ],
              [ 0  0  1 | 0 ]
so clearly the only solution of V x = θ (or equivalently, of x1 e1 + x2 e2 + x3 e3 = θ) is
the trivial solution x1 = 0, x2 = 0, x3 = 0.
The next example illustrates that, in some cases, the linear dependence of a set of
vectors can be determined by inspection. The example is a special case of Theorem 11,
which follows.
Theorem 11 Let {v1 , v2 , . . . , vp } be a set of vectors in R m . If p > m, then this set is linearly
dependent.
Proof The set {v1 , v2 , . . . , vp } is linearly dependent if the equation V x = θ has a nontrivial
solution, where V = [v1 , v2 , . . . , vp ]. But V x = θ represents a homogeneous (m × p)
system of linear equations with m < p. By Theorem 4 of Section 1.3, V x = θ has
nontrivial solutions.
Note that Theorem 11 does not say that if p ≤ m, then the set {v1 , v2 , . . . , vp } is
linearly independent. Indeed Examples 2 and 3 illustrate that if p ≤ m, then the set may
be either linearly independent or linearly dependent.
Nonsingular Matrices
The concept of linear independence allows us to state precisely which (n × n) systems of
linear equations always have a unique solution. We begin with the following definition.
Theorem 13 Let A be an (n × n) matrix. The equation Ax = b has a unique solution for every (n × 1)
column vector b if and only if A is nonsingular.
Proof Suppose first that Ax = b has a unique solution no matter what choice we make for b.
Choosing b = θ implies, by Definition 12, that A is nonsingular.
Conversely, suppose that A = [A1 , A2 , . . . , An ] is nonsingular, and let b be any
(n × 1) column vector. We first show that Ax = b has a solution. To see this, observe
first that
{A1 , A2 , . . . , An , b}
is a set of (n × 1) vectors in R n ; so by Theorem 11 this set is linearly dependent. Thus
there are scalars a1 , a2 , . . . , an , an+1 such that
a1 A1 + a2 A2 + · · · + an An + an+1 b = θ ; (9)
and moreover not all these scalars are zero. In fact, if an+1 = 0 in Eq. (9), then
a1 A1 + a2 A2 + · · · + an An = θ,
and it follows that {A1 , A2 , . . . , An } is a linearly dependent set. Since this contradicts
the assumption that A is nonsingular, we know that an+1 is nonzero. It follows from
Eq. (9) that
s1 A1 + s2 A2 + · · · + sn An = b,
where
    s1 = −a1/an+1,  s2 = −a2/an+1,  . . . ,  sn = −an/an+1.
1.7 EXERCISES
The vectors listed in Eq. (10) are used in several of the exercises that follow.

    v1 = [ 1 ],  v2 = [ 2 ],  v3 = [ 2 ],  v4 = [ 1 ],  v5 = [ 3 ],
         [ 2 ]        [ 3 ]        [ 4 ]        [ 1 ]        [ 6 ]

         [ 1 ]        [  1 ]        [  2 ]
    u0 = [ 0 ],  u1 = [  2 ],  u2 = [  1 ],
         [ 0 ]        [ −1 ]        [ −3 ]

         [ −1 ]        [ 4 ]        [ 1 ]
    u3 = [  4 ],  u4 = [ 4 ],  u5 = [ 1 ]                                (10)
         [  3 ]        [ 0 ]        [ 0 ]

The matrices listed in Eq. (11) are used in some of the exercises that follow.

    A = [ 1  2 ],  B = [ 1  2 ],  C = [ 1  3 ],
        [ 3  4 ]       [ 2  4 ]       [ 2  4 ]

        [ 1  0  0 ]       [ 0  1  0 ]       [ 1  2  1 ]
    D = [ 0  1  0 ],  E = [ 0  0  2 ],  F = [ 0  3  2 ]                  (11)
        [ 0  1  0 ]       [ 0  1  3 ]       [ 0  0  1 ]

In Exercises 1–14, use Eq. (6) to determine whether the given set of vectors is linearly independent or linearly dependent. If the set is linearly dependent, express one vector in the set as a linear combination of the others.
1. {v1, v2}            2. {v1, v3}
3. {v1, v5}            4. {v2, v3}
5. {v1, v2, v3}        6. {v2, v3, v4}
7. {u4, u5}            8. {u3, u4}
9. {u1, u2, u5}        10. {u1, u4, u5}
11. {u2, u4, u5}       12. {u1, u2, u4}
13. {u0, u1, u2, u4}   14. {u0, u2, u3, u4}
15. Consider the sets of vectors in Exercises 1–14. Using Theorem 11, determine by inspection which of these sets are known to be linearly dependent.
In Exercises 16–27, use Definition 12 to determine whether the given matrix is singular or nonsingular. If a matrix M is singular, give all solutions of Mx = θ.
16. A    17. B      18. C
19. AB   20. BA     21. D
22. F    23. D + F  24. E
25. EF   26. DE     27. F^T
In Exercises 28–33, determine conditions on the scalars so that the set of vectors is linearly dependent.

    28. v1 = [ 1 ],  v2 = [ 2 ]        29. v1 = [ 1 ],  v2 = [ 3 ]
             [ a ]        [ 3 ]                 [ 2 ]        [ a ]

    30. v1 = [ 1 ],  v2 = [ 1 ],  v3 = [ 0 ]
             [ 2 ]        [ 3 ]        [ 1 ]
             [ 1 ]        [ 2 ]        [ a ]

    31. v1 = [ 1 ],  v2 = [ 1 ],  v3 = [ 0 ]
             [ 2 ]        [ a ]        [ 2 ]
             [ 1 ]        [ 3 ]        [ b ]

    32. v1 = [ a ],  v2 = [ b ]        33. v1 = [ 1 ],  v2 = [ b ]
             [ 1 ]        [ 3 ]                 [ a ]        [ c ]

In Exercises 34–39, the vectors and matrices are from Eq. (10) and Eq. (11). The equations listed in Exercises 34–39 all have the form Mx = b, and all the equations are consistent. In each exercise, solve the equation and express b as a linear combination of the columns of M.
34. Ax = v1   35. Ax = v3
36. Cx = v4   37. Cx = v2
38. Fx = u1   39. Fx = u3
In Exercises 40–45, express the given vector b as a linear combination of v1 and v2, where v1 and v2 are in Eq. (10).

    40. b = [ 2 ]   41. b = [  3 ]   42. b = [ 0 ]
            [ 7 ]           [ −1 ]           [ 4 ]

    43. b = [ 0 ]   44. b = [ 1 ]   45. b = [ 1 ]
            [ 0 ]           [ 2 ]           [ 0 ]

In Exercises 46–47, let S = {v1, v2, v3}.
a) For what value(s) a is the set S linearly dependent?
b) For what value(s) a can v3 be expressed as a linear combination of v1 and v2?

    46. v1 = [  1 ],  v2 = [ −2 ],  v3 = [ 3 ]
             [ −1 ]        [  2 ]        [ a ]

    47. v1 = [ 1 ],  v2 = [ 1 ],  v3 = [ 3 ]
             [ 0 ]        [ 1 ]        [ a ]

48. Let S = {v1, v2, v3} be a set of vectors in R³, where v1 = θ. Show that S is a linearly dependent set of vectors. [Hint: Exhibit a nontrivial solution for either Eq. (5) or Eq. (6).]
49. Let {v1, v2, v3} be a set of nonzero vectors in R^m such that vi^T vj = 0 when i ≠ j. Show that the set is linearly independent. [Hint: Set a1 v1 + a2 v2 + a3 v3 = θ and consider θ^T θ.]
50. If the set {v1, v2, v3} of vectors in R^m is linearly dependent, then argue that the set {v1, v2, v3, v4} is also linearly dependent for every choice of v4 in R^m.
51. Suppose that {v1, v2, v3} is a linearly independent subset of R^m. Show that the set {v1, v1 + v2, v1 + v2 + v3} is also linearly independent.
52. If A and B are (n × n) matrices such that A is nonsingular and AB = O, then prove that B = O. [Hint: Write B = [B1, . . . , Bn] and consider AB = [AB1, . . . , ABn].]
53. If A, B, and C are (n × n) matrices such that A is nonsingular and AB = AC, then prove that B = C. [Hint: Consider A(B − C) and use the preceding exercise.]
54. Let A = [A1, . . . , An−1] be an (n × (n − 1)) matrix. Show that B = [A1, . . . , An−1, Ab] is singular for every choice of b in R^{n−1}.
55. Suppose that C and B are (2 × 2) matrices and that B is singular. Show that CB is singular. [Hint: By Definition 12, there is a vector x1 in R², x1 ≠ θ, such that Bx1 = θ.]
56. Let {w1, w2} be a linearly independent set of vectors in R². Show that if b is any vector in R², then b is a linear combination of w1 and w2. [Hint: Consider the (2 × 2) matrix A = [w1, w2].]
57. Let A be an (n × n) nonsingular matrix. Show that A^T is nonsingular as follows:
a) Suppose that v is a vector in R^n such that A^T v = θ. Cite a theorem from this section that guarantees there is a vector w in R^n such that Aw = v.
b) By part (a), A^T Aw = θ, and therefore w^T A^T Aw = w^T θ = 0. Cite results from Section 1.6 that allow you to conclude that Aw = θ. [Hint: What is (Aw)^T ?]
c) Use parts (a) and (b) to conclude that if A^T v = θ, then v = θ; this shows that A^T is nonsingular.
58. Let T be an (n × n) upper-triangular matrix

        [ t11  t12  t13  · · ·  t1n ]
        [  0   t22  t23  · · ·  t2n ]
    T = [  0    0   t33  · · ·  t3n ].
        [  .              .      .  ]
        [  0    0    0   · · ·  tnn ]

Prove that if tii = 0 for some i, 1 ≤ i ≤ n, then T is singular. [Hint: If t11 = 0, find a nonzero vector v such that Tv = θ. If trr = 0, but tii ≠ 0 for i = 1, 2, . . . , r − 1, use Theorem 4 of Section 1.3 to show that columns T1, T2, . . . , Tr of T are linearly dependent. Then select a nonzero vector v such that Tv = θ.]
59. Let T be an (n × n) upper-triangular matrix as in Exercise 58. Prove that if tii ≠ 0 for i = 1, 2, . . . , n, then T is nonsingular. [Hint: Let T = [T1, T2, . . . , Tn], and suppose that a1 T1 + a2 T2 + · · · + an Tn = θ for some scalars a1, a2, . . . , an. First deduce that an = 0. Next show an−1 = 0, and so on.] Note that Exercises 58 and 59 establish that an upper-triangular matrix is singular if and only if one of the entries t11, t22, . . . , tnn is zero. By Exercise 57 the same result is true for lower-triangular matrices.
60. Suppose that the (n × n) matrices A and B are row equivalent. Prove that A is nonsingular if and only if B is nonsingular. [Hint: The homogeneous systems Ax = θ and Bx = θ are equivalent by Theorem 1 of Section 1.1.]
Polynomial Interpolation
We begin by applying matrix theory to the problem of interpolating data with polynomials. In particular, Theorem 13 of Section 1.7 is used to establish a general existence and uniqueness result for polynomial interpolation. The following example is a simple illustration of polynomial interpolation.

Example 1  Find a quadratic polynomial, q(t), such that the graph of q(t) goes through the points (1, 2), (2, 3), and (3, 6) in the ty-plane (see Fig. 1.17, Points in the ty-plane).

Solution  A quadratic polynomial q(t) has the form

    q(t) = a + bt + ct².                                                 (1a)

The graph of q(t) passes through the given points if

    q(1) = 2,  q(2) = 3,  q(3) = 6.                                      (1b)

The constraints in (1b) are, by (1a), equivalent to

    a +  b +  c = 2
    a + 2b + 4c = 3                                                      (1c)
    a + 3b + 9c = 6.

Clearly (1c) is a system of three linear equations in the three unknowns a, b, and c; so solving (1c) will determine the polynomial q(t). Solving (1c), we find the unique solution a = 3, b = −2, c = 1; therefore, q(t) = 3 − 2t + t² is the unique quadratic polynomial satisfying the conditions (1b). A portion of the graph of q(t) is shown in Fig. 1.18 (Graph of q(t)).
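System (1c) is small enough to solve by elimination in a few lines. The following is our sketch, not the text's; it uses exact Fraction arithmetic, and the helper name solve is hypothetical.

```python
# A sketch, not from the text: solve system (1c) by Gauss-Jordan elimination.
from fractions import Fraction

def solve(aug):
    """Solve a square system given as an augmented matrix [A | b]."""
    m = [[Fraction(x) for x in row] for row in aug]
    n = len(m)
    for c in range(n):
        piv = next(i for i in range(c, n) if m[i][c] != 0)
        m[c], m[piv] = m[piv], m[c]          # pivot row into place
        m[c] = [x / m[c][c] for x in m[c]]   # scale pivot to 1
        for i in range(n):
            if i != c:                        # eliminate the column elsewhere
                m[i] = [u - m[i][c] * v for u, v in zip(m[i], m[c])]
    return [row[n] for row in m]

# a + b + c = 2,  a + 2b + 4c = 3,  a + 3b + 9c = 6   -- system (1c)
a, b, c = solve([[1, 1, 1, 2],
                 [1, 2, 4, 3],
                 [1, 3, 9, 6]])
# q(t) = a + b*t + c*t**2 = 3 - 2t + t**2
```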
Frequently polynomial interpolation is used when values of a function f(t) are given in tabular form. For example, given a table of n + 1 values of f(t) (see Table 1.1), an interpolating polynomial for f(t) is a polynomial, p(t), of the form

    p(t) = a0 + a1 t + a2 t² + · · · + an t^n

such that p(ti) = yi = f(ti) for 0 ≤ i ≤ n. Problems of interpolating data in tables are quite common in scientific and engineering work; for example, y = f(t) might describe a temperature distribution as a function of time with yi = f(ti) being observed (measured) temperatures. For a time t̂ not listed in the table, p(t̂) provides an approximation for f(t̂).

    Table 1.1
     t    f(t)
     t0    y0
     t1    y1
     t2    y2
     .      .
     tn    yn
Example 2  Find an interpolating polynomial for the four observations given in Table 1.2. Give an approximation for f(1.5).

    Table 1.2
     t    f(t)
     0     3
     1     0
     2    −1
     3     6

Solution  In this case, the interpolating polynomial is a polynomial of degree 3 or less,

    p(t) = a0 + a1 t + a2 t² + a3 t³,

where p(t) satisfies the four constraints p(0) = 3, p(1) = 0, p(2) = −1, and p(3) = 6. As in the previous example, these constraints are equivalent to the (4 × 4) system of equations

    a0                    =  3
    a0 +  a1 +  a2 +  a3  =  0
    a0 + 2a1 + 4a2 + 8a3  = −1
    a0 + 3a1 + 9a2 + 27a3 =  6.

Solving this system, we find that a0 = 3, a1 = −2, a2 = −2, a3 = 1 is the unique solution. Hence the unique polynomial that interpolates the tabular data for f(t) is

    p(t) = 3 − 2t − 2t² + t³.

The desired approximation for f(1.5) is p(1.5) = −1.125.
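The same computation can be organized around the Vandermonde matrix of the nodes, as the next subsection does in general. The following sketch is ours, not the text's, again using exact Fractions.

```python
# A sketch of Example 2, not from the text: build the Vandermonde system for
# nodes 0, 1, 2, 3 and values 3, 0, -1, 6, solve it, and evaluate p(1.5).
from fractions import Fraction

ts = [0, 1, 2, 3]
ys = [3, 0, -1, 6]
n = len(ts)

# Row i of the augmented matrix: [1, t_i, t_i**2, t_i**3 | y_i].
aug = [[Fraction(t) ** j for j in range(n)] + [Fraction(y)]
       for t, y in zip(ts, ys)]

for c in range(n):                      # Gauss-Jordan elimination
    piv = next(i for i in range(c, n) if aug[i][c] != 0)
    aug[c], aug[piv] = aug[piv], aug[c]
    aug[c] = [x / aug[c][c] for x in aug[c]]
    for i in range(n):
        if i != c:
            aug[i] = [u - aug[i][c] * v for u, v in zip(aug[i], aug[c])]

coef = [row[n] for row in aug]          # [a0, a1, a2, a3]

def p(t):
    """Evaluate the interpolating polynomial at t."""
    return sum(a * Fraction(t) ** k for k, a in enumerate(coef))
```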
Note that in each of the two preceding examples, the interpolating polynomial was
unique. Theorem 14, on page 83, states that this is always the case. The next example
considers the general problem of fitting a quadratic polynomial to three data points and
illustrates the proof of Theorem 14.
Example 3 Given three distinct numbers t0 , t1 , t2 and any set of three values y0 , y1 , y2 , show that
there exists a unique polynomial,
    p(t) = a0 + a1 t + a2 t²,                                            (2a)

such that p(t0) = y0, p(t1) = y1, and p(t2) = y2. These interpolation conditions are equivalent to the system

    a0 + a1 t0 + a2 t0² = y0
    a0 + a1 t1 + a2 t1² = y1                                             (2b)
    a0 + a1 t2 + a2 t2² = y2,

where a0, a1, and a2 are the unknowns. The problem is to show that system (2b) has a unique solution. We can write system (2b) in matrix form as Ta = y, where

        [ 1  t0  t0² ]        [ a0 ]            [ y0 ]
    T = [ 1  t1  t1² ],   a = [ a1 ],   and y = [ y1 ].                  (2c)
        [ 1  t2  t2² ]        [ a2 ]            [ y2 ]
Let q(t) = c0 + c1 t + c2 t 2 . Then q(t) has degree at most 2 and, by system (2d),
q(t0 ) = q(t1 ) = q(t2 ) = 0. Thus q(t) has three distinct real zeros. By Exercise 25, if a
quadratic polynomial has three distinct real zeros, then it must be identically zero. That
is, c0 = c1 = c2 = 0, or c = θ . Hence T is nonsingular, and so system (2b) has a unique
solution.
The matrix T given in (2c) is the (3 × 3) Vandermonde matrix. More gener-
ally, for real numbers t0 , t1 , . . . , tn , the [(n + 1) × (n + 1)] Vandermonde matrix T
is defined by

        [ 1  t0  t0²  · · ·  t0^n ]
        [ 1  t1  t1²  · · ·  t1^n ]
    T = [ .   .   .           .   ].                                     (3)
        [ 1  tn  tn²  · · ·  tn^n ]
Following the argument given in Example 3 and making use of Exercise 26, we can show
that if t0 , t1 , . . . , tn are distinct, then T is nonsingular. Thus, by Theorem 13, the linear
system T x = y has a unique solution for each choice of y in R n+1 . As a consequence,
we have the following theorem.
Theorem 14 Given n+1 distinct numbers t0 , t1 , . . . , tn and any set of n+1 values y0 , y1 , . . . , yn , there
is one and only one polynomial p(t) of degree n or less, p(t) = a0 + a1 t + · · · + an t n ,
such that p(ti ) = yi , i = 0, 1, . . . , n.
    y     = a0 e^{t0 x}      + a1 e^{t1 x}      + · · · + an e^{tn x}
    y′    = a0 t0 e^{t0 x}   + a1 t1 e^{t1 x}   + · · · + an tn e^{tn x}
    y″    = a0 t0² e^{t0 x}  + a1 t1² e^{t1 x}  + · · · + an tn² e^{tn x}   (4b)
     .           .                  .                          .
    y^(n) = a0 t0^n e^{t0 x} + a1 t1^n e^{t1 x} + · · · + an tn^n e^{tn x}.

Substituting x = 0 in each equation of system (4b) and setting y^(k)(0) = yk yields the system

    y0 = a0       + a1       + · · · + an
    y1 = a0 t0    + a1 t1    + · · · + an tn
    y2 = a0 t0²   + a1 t1²   + · · · + an tn²                            (4c)
     .        .          .                 .
    yn = a0 t0^n  + a1 t1^n  + · · · + an tn^n
with unknowns a0 , a1 , . . . , an . Note that the coefficient matrix for the linear system (4c)
is

          [ 1     1     · · ·  1    ]
          [ t0    t1    · · ·  tn   ]
    T^T = [ t0²   t1²   · · ·  tn²  ],                                   (4d)
          [ .     .            .    ]
          [ t0^n  t1^n  · · ·  tn^n ]
where T is the [(n + 1) × (n + 1)] Vandermonde matrix given in Eq. (3). It is left as
an exercise (see Exercise 57 of Section 1.7) to show that because T is nonsingular, the
transpose T T is also nonsingular. Thus by Theorem 13, the linear system (4c) has a
unique solution.
The next example is a specific case of Example 4.
Example 5  Find the unique function y = c1 e^x + c2 e^{2x} + c3 e^{3x} that satisfies the constraints y(0) = 1, y′(0) = 2, and y″(0) = 0.

Solution  The given function and its first two derivatives are

    y  = c1 e^x +  c2 e^{2x} +  c3 e^{3x}
    y′ = c1 e^x + 2c2 e^{2x} + 3c3 e^{3x}                                (5a)
    y″ = c1 e^x + 4c2 e^{2x} + 9c3 e^{3x}.

Setting x = 0 in (5a) and imposing the constraints yields the system

    1 = c1 +  c2 +  c3
    2 = c1 + 2c2 + 3c3                                                   (5b)
    0 = c1 + 4c2 + 9c3,
and solving in the usual manner yields the unique solution c1 = −2, c2 = 5, c3 = −2.
Therefore, the function y = −2ex + 5e2x − 2e3x is the unique function that satisfies the
given constraints.
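Example 5 reduces to a 3 × 3 linear solve, which is easy to check directly. The following sketch is ours, not the text's.

```python
# A sketch, not from the text: verify that c1, c2, c3 solve system (5b).
from fractions import Fraction

aug = [[1, 1, 1, 1],      # c1 +  c2 +  c3 = 1   (y(0)  = 1)
       [1, 2, 3, 2],      # c1 + 2c2 + 3c3 = 2   (y'(0) = 2)
       [1, 4, 9, 0]]      # c1 + 4c2 + 9c3 = 0   (y''(0) = 0)
m = [[Fraction(x) for x in row] for row in aug]
n = 3
for c in range(n):                      # Gauss-Jordan elimination
    piv = next(i for i in range(c, n) if m[i][c] != 0)
    m[c], m[piv] = m[piv], m[c]
    m[c] = [x / m[c][c] for x in m[c]]
    for i in range(n):
        if i != c:
            m[i] = [u - m[i][c] * v for u, v in zip(m[i], m[c])]
c1, c2, c3 = (row[n] for row in m)
# y = -2e^x + 5e^{2x} - 2e^{3x}
```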
Numerical Integration
The Vandermonde matrix also arises in problems where it is necessary to estimate nu-
merically an integral or a derivative. For example, let I (f ) denote the definite integral
    I(f) = ∫_a^b f(t) dt.
If the integrand is fairly complicated or if the integrand is not a standard form that can
be found in a table of integrals, then it will be necessary to approximate the value I (f )
numerically.
One effective way to approximate I (f ) is first to find a polynomial p that approxi-
mates f on [a, b],
p(t) ≈ f (t), a ≤ t ≤ b.
Next, given that p is a good approximation to f , we would expect that the approximation
that follows is also a good one:
    ∫_a^b p(t) dt ≈ ∫_a^b f(t) dt.                                       (6)
Of course, since p is a polynomial, the integral on the left-hand side of Eq. (6) can be
easily evaluated and provides a computable estimate to the unknown integral, I (f ).
One way to generate a polynomial approximation to f is through interpolation. If
we select n + 1 points t0 , t1 , . . . , tn in [a, b], then the nth-degree polynomial p that
satisfies p(ti ) = f (ti ), 0 ≤ i ≤ n, is an approximation to f that can be used in Eq. (6)
to estimate I (f ).
In summary, the numerical integration process proceeds as follows:
1. Given f , construct the interpolating polynomial, p.
2. Given p, calculate the integral, ∫_a^b p(t) dt.
3. Use ∫_a^b p(t) dt as the approximation to ∫_a^b f(t) dt.
It turns out that this approximation scheme can be simplified considerably, and step 1 can be skipped entirely. That is, it is not necessary to construct the actual interpolating polynomial p in order to know the integral of p, ∫_a^b p(t) dt.
We will illustrate the idea with a quadratic interpolating polynomial. Suppose p is
the quadratic polynomial that interpolates f at t0 , t1 , and t2 . Next, suppose we can find
scalars A0 , A1 , A2 such that
    A0 + A1 + A2             = ∫_a^b 1 dt
    A0 t0 + A1 t1 + A2 t2    = ∫_a^b t dt                                (7)
    A0 t0² + A1 t1² + A2 t2² = ∫_a^b t² dt.

Then, writing p(t) = a0 + a1 t + a2 t², we have

    ∫_a^b p(t) dt = ∫_a^b [a0 + a1 t + a2 t²] dt
                  = a0 ∫_a^b 1 dt + a1 ∫_a^b t dt + a2 ∫_a^b t² dt
                  = a0 Σ_{i=0}^{2} Ai + a1 Σ_{i=0}^{2} Ai ti + a2 Σ_{i=0}^{2} Ai ti²
                  = Σ_{i=0}^{2} Ai [a0 + a1 ti + a2 ti²]
                  = Σ_{i=0}^{2} Ai p(ti).
The previous calculations demonstrate the following: If we know the values of a quadratic
polynomial p at three points t0 , t1 , t2 and if we can find scalars A0 , A1 , A2 that satisfy
system (7), then we can evaluate the integral of p with the formula
    ∫_a^b p(t) dt = Σ_{i=0}^{2} Ai p(ti).                                (8)
Next, since p is the quadratic interpolating polynomial for f , we see that the values
of p(ti ) are known to us; that is, p(t0 ) = f (t0 ), p(t1 ) = f (t1 ), and p(t2 ) = f (t2 ). Thus,
combining Eq. (8) and Eq. (6), we obtain
    ∫_a^b p(t) dt = Σ_{i=0}^{2} Ai p(ti) = Σ_{i=0}^{2} Ai f(ti) ≈ ∫_a^b f(t) dt,

or equivalently,

    ∫_a^b f(t) dt ≈ Σ_{i=0}^{n} Ai f(ti).                                (10)
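The moment equations (7) extend verbatim to n + 1 nodes, and solving them is itself just a linear system. The following sketch is ours, not the text's; the helper name quad_weights is hypothetical.

```python
# A sketch, not from the text: compute quadrature weights A_i by solving the
# moment equations (7), generalized to n+1 nodes, exactly with Fractions.
from fractions import Fraction

def quad_weights(nodes, a, b):
    """Solve sum_i A_i * t_i**k = integral_a^b t**k dt for k = 0..n."""
    n = len(nodes)
    a, b = Fraction(a), Fraction(b)
    # The k-th moment of [a, b] is (b**(k+1) - a**(k+1)) / (k + 1).
    aug = [[Fraction(t) ** k for t in nodes] +
           [(b ** (k + 1) - a ** (k + 1)) / (k + 1)] for k in range(n)]
    for c in range(n):                  # Gauss-Jordan elimination
        piv = next(i for i in range(c, n) if aug[i][c] != 0)
        aug[c], aug[piv] = aug[piv], aug[c]
        aug[c] = [x / aug[c][c] for x in aug[c]]
        for i in range(n):
            if i != c:
                aug[i] = [u - aug[i][c] * v for u, v in zip(aug[i], aug[c])]
    return [row[n] for row in aug]

# Nodes a, (a+b)/2, b on [0, 1] reproduce the Simpson weights of Example 6:
# (b - a)/6, 4(b - a)/6, (b - a)/6.
w = quad_weights([0, Fraction(1, 2), 1], 0, 1)
```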
Example 6 For an interval [a, b] let t0 = a, t1 = (a + b)/2, and t2 = b. Construct the corresponding
numerical integration formula.
Solution For t0 = a, t1 = (a + b)/2, and t2 = b, the system to be solved is given by (11) with
n = 2. We write system (11) as Cx = d, where
        [ 1   1    1  ]           [ b − a       ]
    C = [ a   t1   b  ]   and d = [ (b² − a²)/2 ].
        [ a²  t1²  b² ]           [ (b³ − a³)/3 ]
It can be shown (see Exercise 23) that the solution of Cx = d is A0 = (b − a)/6,
A1 = 4(b − a)/6, A2 = (b − a)/6. The corresponding numerical integration formula is
    ∫_a^b f(t) dt ≈ [(b − a)/6]{ f(a) + 4f[(a + b)/2] + f(b) }.          (12)
The reader may be familiar with the preceding approximation, which is known as Simpson's rule.
The function C(x) is important in applied mathematics, and extensive tables of the
function C(x) are available. The integrand is not a standard form, and C(x) must be
evaluated numerically. From a table, C(0.5) = 0.49223442 . . . .
Numerical Differentiation
Numerical differentiation formulas can also be derived in the same fashion as numerical
integration formulas. In particular, suppose that f is a differentiable function and we
wish to estimate the value f′(a), where f is differentiable at t = a.
Let p be the polynomial of degree n that interpolates f at t0, t1, . . . , tn, where the interpolation nodes ti are clustered near t = a. Then p provides us with an approximation for f, and we can estimate the value f′(a) by evaluating the derivative of p at t = a:

    f′(a) ≈ p′(a).

As with a numerical integration formula, it can be shown that the value p′(a) can be expressed as

    p′(a) = A0 p(t0) + A1 p(t1) + · · · + An p(tn).                      (13)
Solution The weights A0 , A1 , and A2 are determined by forcing Eq. (13) to hold for p(t) =
1, p(t) = t, and p(t) = t 2 . Thus the weights are found by solving the system
    [p(t) = 1]    0 = A0 + A1 + A2
    [p(t) = t]    1 = A0(a − h) + A1(a) + A2(a + h)
    [p(t) = t²]  2a = A0(a − h)² + A1(a)² + A2(a + h)².
By (4d), the matrix C is nonsingular and (see Exercise 24) the solution is A0 = −1/(2h), A1 = 0, A2 = 1/(2h). The numerical differentiation formula therefore has the form

    f′(a) ≈ [f(a + h) − f(a − h)]/(2h).
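With these weights the estimate is the familiar central difference. A small sketch (ours, not the text's):

```python
# A sketch, not from the text, of the central-difference estimate
# f'(a) ~ (-1/(2h)) f(a-h) + 0*f(a) + (1/(2h)) f(a+h).
import math

def central_diff(f, a, h):
    """Estimate f'(a) with weights A0 = -1/(2h), A1 = 0, A2 = 1/(2h)."""
    return (f(a + h) - f(a - h)) / (2.0 * h)

# Check against a function with a known derivative: (d/dx) sin(x) at 0 is 1.
est = central_diff(math.sin, 0.0, 1e-5)
```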
Solution The weights A0 , A1 , A2 , and A3 are determined by forcing the preceding approximation
to be an equality for p(t) = 1, p(t) = t, p(t) = t 2 , and p(t) = t 3 . These constraints
lead to the equations
    [p(t) = 1]    0 = A0 + A1 + A2 + A3
    [p(t) = t]    0 = A0(a) + A1(a + h) + A2(a + 2h) + A3(a + 3h)
    [p(t) = t²]   2 = A0(a)² + A1(a + h)² + A2(a + 2h)² + A3(a + 3h)²
    [p(t) = t³]  6a = A0(a)³ + A1(a + h)³ + A2(a + 2h)³ + A3(a + 3h)³.
Since this system is a bit cumbersome to solve by hand, we decided to use the computer
algebra system Derive. (Because the coefficient matrix has symbolic rather than numer-
ical entries, we had to use a computer algebra system rather than numerical software
such as MATLAB. In particular, Derive is a popular computer algebra system that is
menu-driven and very easy to use.)
Figure 1.19 shows the results from Derive. Line 2 gives the command to row reduce
the augmented matrix for the system. Line 3 gives the results. Therefore, the numerical
differentiation formula is
    f″(a) ≈ (1/h²)[2f(a) − 5f(a + h) + 4f(a + 2h) − f(a + 3h)].
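The derived formula can be checked on a cubic, for which the weights were forced to make it exact. A sketch (ours, not the text's):

```python
# A sketch, not from the text, of the one-sided second-derivative formula.
def second_deriv(f, a, h):
    """Estimate f''(a) from f(a), f(a+h), f(a+2h), f(a+3h)."""
    return (2 * f(a) - 5 * f(a + h) + 4 * f(a + 2 * h) - f(a + 3 * h)) / h ** 2

# The weights were forced to be exact for 1, t, t**2, t**3; for f(t) = t**3
# the true second derivative is 6a, so at a = 1 the formula should give 6.
val = second_deriv(lambda t: t ** 3, 1.0, 0.5)
```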
    2:  ROW_REDUCE
        [ 1    1          1          1         |  0 ]
        [ a    a + h      a + 2h     a + 3h    |  0 ]
        [ a²   (a + h)²   (a + 2h)²  (a + 3h)² |  2 ]
        [ a³   (a + h)³   (a + 2h)³  (a + 3h)³ | 6a ]

    3:  [ 1  0  0  0 |  2/h² ]
        [ 0  1  0  0 | −5/h² ]
        [ 0  0  1  0 |  4/h² ]
        [ 0  0  0  1 | −1/h² ]
1.8 EXERCISES
In Exercises 1–6, find the interpolating polynomial for the given table of data. [Hint: If the data table has k entries, the interpolating polynomial will be of degree k − 1 or less.]

    1.  t |  0  1  2        2.  t | −1  0   2
        y | −1  3  6            y |  6  1  −3

    3.  t | −1  1  2        4.  t |  1   3   4
        y |  1  5  7            y |  5  11  14

    5.  t | −1  0  1   2
        y | −6  1  4  15

    6.  t | −2  −1  1   2
        y | −3   1  3  13

In Exercises 7–10, find the constants so that the given function satisfies the given conditions.
7. y = c1 e^{2x} + c2 e^{3x};  y(0) = 3, y′(0) = 7
8. y = c1 e^{x−1} + c2 e^{3(x−1)};  y(1) = 1, y′(1) = 5
9. y = c1 e^{−x} + c2 e^{x} + c3 e^{2x};  y(0) = 8, y′(0) = 3, y″(0) = 11
10. y = c1 e^{x} + c2 e^{2x} + c3 e^{3x};  y(0) = −1, y′(0) = −3, y″(0) = −5
As in Example 6, find the weights Ai for the numerical integration formulas listed in Exercises 11–16. [Note: It can be shown that the special formulas developed in Exercises 11–16 can be translated to any interval of the general form [a, b]. Similarly, the numerical differentiation formulas in Exercises 17–22 can also be translated.]
Existence of Inverses
As we saw earlier in Example 1, some matrices do not have an inverse. We now turn our
attention to determining exactly which matrices are invertible. In the process, we will
also develop a simple algorithm for calculating A−1 .
Let A be an (n × n) matrix. If A does have an inverse, then that inverse is an (n × n)
matrix B such that
AB = I. (7a)
(Of course, to be an inverse, the matrix B must also satisfy the condition BA = I . We
will put this additional requirement aside for the moment and concentrate solely on the
condition AB = I .)
Expressing B and I in column form, the equation AB = I can be rewritten as
A[b1 , b2 , . . . , bn ] = [e1 , e2 , . . . , en ]
or
[Ab1 , Ab2 , . . . , Abn ] = [e1 , e2 , . . . , en ]. (7b)
If A has an inverse, therefore, it follows that we must be able to solve each of the
following n equations:
Ax = e1
Ax = e2
.. (7c)
.
Ax = en .
In particular, if A is invertible, then the kth column of A−1 can be found by solving
Ax = ek , k = 1, 2, . . . , n.
We know (recall Theorem 13) that all the equations listed in (7c) can be solved if A
is nonsingular. We suspect, therefore, that a nonsingular matrix always has an inverse.
In fact, as is shown in Theorem 15, A has an inverse if and only if A is nonsingular.
Before stating Theorem 15, we give a lemma. (Although we do not need it here,
the converse of the lemma is also valid; see Exercise 70.)
x2 such that Qx2 = x1 . (In addition, note that x2 must be nonzero because x1 is nonzero.)
Therefore,
Rx2 = (PQ)x2
= P (Qx2 )
= P x1
= θ.
Thus, if either P or Q is singular, then the product PQ is also singular.
We are now ready to characterize invertible matrices.
R1 − 2R2 , R3 + 3R2 :
1 0 7 5 −2 0
0 1 −2 −2 1 0
0 0 1 −7 3 1
R1 − 7R3 , R2 + 2R3 :
1 0 0 54 −23 −7
0 1 0 −16 7 2 .
0 0 1 −7 3 1
Having the reduced echelon form above, we easily find the solutions of the three systems
Ax = e1 , Ax = e2 , Ax = e3 . In particular,
Computation of A−1
To calculate the inverse of a nonsingular (n × n) matrix, we can proceed as
follows:
Step 1. Form the (n × 2n) matrix [A | I ].
Step 2. Use elementary row operations to transform [A | I ] to the form [I | B].
Step 3. Reading from this final form, A−1 = B.
(Note: Step 2 of the algorithm above assumes that [A | I ] can always be row reduced
to the form [I | B] when A is nonsingular. This is indeed the case, and we ask you to
prove it in Exercise 76 by showing that the reduced echelon form for any nonsingular
matrix A is I . In fact, Exercise 76 actually establishes the stronger result listed next in
Theorem 16.)
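The three-step algorithm above is straightforward to implement. The following sketch is ours, not the text's; it uses exact Fractions and assumes A is nonsingular, so every pivot search succeeds. It is applied to the (2 × 2) matrix row reduced in the text.

```python
# A sketch, not from the text: form [A | I], row reduce to [I | B], read off
# B = A^{-1}. Exact Fraction arithmetic; assumes A is nonsingular.
from fractions import Fraction

def inverse(A):
    n = len(A)
    # Step 1: form the (n x 2n) matrix [A | I].
    m = [[Fraction(A[i][j]) for j in range(n)] +
         [Fraction(1 if j == i else 0) for j in range(n)] for i in range(n)]
    # Step 2: use elementary row operations to transform [A | I] to [I | B].
    for c in range(n):
        piv = next(i for i in range(c, n) if m[i][c] != 0)
        m[c], m[piv] = m[piv], m[c]
        m[c] = [x / m[c][c] for x in m[c]]
        for i in range(n):
            if i != c:
                m[i] = [u - m[i][c] * v for u, v in zip(m[i], m[c])]
    # Step 3: A^{-1} = B is the right half of the reduced matrix.
    return [row[n:] for row in m]

B = inverse([[1, 2], [2, 5]])   # the (2 x 2) example reduced in the text
```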
R2 − 2R1 :
1 2 1 0
0 1 −2 1
R1 − 2R2 :
1 0 5 −2
.
0 1 −2 1
Part (a) of the remark is Exercise 69. To verify the formula given in (b), suppose that Δ = ad − bc ≠ 0, and define B to be the matrix

    B = (1/Δ) [  d  −b ].
              [ −c   a ]

Then

    BA = (1/Δ) [  d  −b ] [ a  b ] = (1/Δ) [ ad − bc     0    ] = [ 1  0 ].
               [ −c   a ] [ c  d ]         [    0     ad − bc ]   [ 0  1 ]

Similarly, AB = I, so B = A^{−1}.
The reader familiar with determinants will recognize the number Δ = ad − bc in the remark as the determinant of the matrix A. We make use of the remark in the following example.
Note that the familiar formula (ab)^{−1} = a^{−1} b^{−1} for real numbers is valid only because multiplication of real numbers is commutative. We have already noted that matrix multiplication is not commutative, so, as the following example demonstrates, (AB)^{−1} ≠ A^{−1} B^{−1} in general.
Ill-Conditioned Matrices
In applications the equation Ax = b often serves as a mathematical model for a physical
problem. In these cases it is important to know whether solutions to Ax = b are sensitive
to small changes in the right-hand side b. If small changes in b can lead to relatively
large changes in the solution x, then the matrix A is called ill-conditioned.
The concept of an ill-conditioned matrix is related to the size of A−1 . This connection
is explained after the next example.
Example 7  The (n × n) Hilbert matrix is the matrix whose ijth entry is 1/(i + j − 1). For example,
the (3 × 3) Hilbert matrix is

   1   1/2  1/3
  1/2  1/3  1/4
  1/3  1/4  1/5 .

Let A denote the (6 × 6) Hilbert matrix, and consider the vectors b and b + Δb:

  b = [1, 2, 1, 1.414, 1, 2]T,  b + Δb = [1, 2, 1, 1.4142, 1, 2]T.

Note that b and b + Δb differ slightly in their fourth components. Compare the solutions
of Ax = b and Ax = b + Δb.
Solution  We used MATLAB to solve these two equations. If x1 denotes the solution of Ax = b,
and x2 denotes the solution of Ax = b + Δb, the results are (rounded to the nearest
integer):

  x1 = [−6538, 185747, −1256237, 3271363, −3616326, 1427163]T and
  x2 = [−6539, 185706, −1256519, 3272089, −3617120, 1427447]T.

(Note: Despite the fact that b and b + Δb are nearly equal, x1 and x2 differ by almost
800 in their fifth components.)
Example 7 illustrates that the solutions of Ax = b and Ax = b + Δb may be quite
different even though Δb is a small vector. In order to explain these differences, let x1
denote the solution of Ax = b and x2 the solution of Ax = b + Δb. Therefore, Ax1 = b
and Ax2 = b + Δb. To assess the difference, x2 − x1, we proceed as follows:

  Ax2 − Ax1 = (b + Δb) − b = Δb.

Therefore, A(x2 − x1) = Δb, or

  x2 − x1 = A−1Δb.

If A−1 contains large entries, then we see from the equation above that x2 − x1 can be
large even though Δb is small.
The Hilbert matrices described in Example 7 are well-known examples of ill-
conditioned matrices and have large inverses. For example, the inverse of the (6 × 6)
Hilbert matrix is

           36     −630      3360      −7560      7560     −2772
         −630    14700    −88200     211680   −220500     83160
  A−1 =  3360   −88200    564480   −1411200   1512000   −582120
        −7560   211680  −1411200    3628800  −3969000   1552320
         7560  −220500   1512000   −3969000   4410000  −1746360
        −2772    83160   −582120    1552320  −1746360    698544 .

Because of the large entries in A−1, we should not be surprised at the large difference
between x1 and x2, the two solutions in Example 7.
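Example 7 can be reproduced without MATLAB; a NumPy sketch of the same experiment (the perturbation 0.0002 is just 1.4142 − 1.414 from the example):

```python
import numpy as np

# The (6 x 6) Hilbert matrix: the ijth entry is 1/(i + j - 1).
n = 6
A = np.array([[1.0 / (i + j - 1) for j in range(1, n + 1)]
              for i in range(1, n + 1)])

b = np.array([1, 2, 1, 1.414, 1, 2], dtype=float)
db = np.zeros(n)
db[3] = 0.0002                  # Delta-b: perturb only the fourth component

x1 = np.linalg.solve(A, b)      # solution of Ax = b
x2 = np.linalg.solve(A, b + db) # solution of Ax = b + Delta-b
print(x1 - x2)                  # the fifth entries differ by about 794
```

The differences x2 − x1 are exactly 0.0002 times the fourth column of the large inverse listed above, which is why the fifth entry shifts by roughly 0.0002 × 3969000 ≈ 794.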
1.9 EXERCISES

In Exercises 1–4, verify that B is the inverse of A by showing that AB = BA = I.

1. A = [7 4; 5 3], B = [3 −4; −5 7]
2. A = [3 10; 2 10], B = [1 −1; −.2 .3]
3. A = [−1 −2 11; 1 3 −15; 0 −1 5], B = [0 1 3; 5 5 4; 1 1 1]
4. A = [1 0 0; 2 1 0; 3 4 1], B = [1 0 0; −2 1 0; 5 −4 1]

In Exercises 5–8, use the appropriate inverse matrix from Exercises 1–4 to solve the
given system of linear equations.

5. 3x1 + 10x2 = 6
   2x1 + 10x2 = 9
6. 7x1 + 4x2 = 5
   5x1 + 3x2 = 2
7.        x2 + 3x3 = 4
   5x1 + 5x2 + 4x3 = 2
    x1 +  x2 +  x3 = 2
8.    x1            = 2
   −2x1 +  x2       = 3
    5x1 − 4x2 + x3  = 2

In Exercises 9–12, verify that the given matrix A does not have an inverse. [Hint: One
of AB = I or BA = I leads to an easy contradiction.]

9. A = [0 0 0; 1 2 1; 3 2 1]
10. A = [0 4 2; 0 1 7; 0 3 9]
11. A = [2 2 4; 1 1 7; 3 3 9]
12. A = [1 1 1; 1 1 1; 2 3 2]

In Exercises 13–21, reduce [A | I ] to find A−1. In each case, check your calculations
by multiplying the given matrix by the derived inverse.

13. A = [1 1; 2 3]
14. A = [2 3; 6 7]
15. A = [1 2; 2 1]
16. A = [−1 −2 11; 1 3 −15; 0 −1 5]
17. A = [1 0 0; 2 1 0; 3 4 1]
18. A = [1 3 5; 0 1 4; 0 2 7]
19. A = [1 4 2; 0 2 1; 3 5 3]
20. A = [1 −2 2 1; 1 −1 5 0; 2 −2 11 2; 0 2 8 1]
21. A = [1 2 3 1; −1 0 2 1; 2 1 −3 0; 1 1 2 1]

As in Example 5, determine whether the (2 × 2) matrices in Exercises 22–26 have an
inverse. If A has an inverse, find A−1 and verify that A−1A = I.

22. A = [−3 2; 1 1]
23. A = [2 −2; 2 3]
24. A = [−1 3; 2 1]
25. A = [2 1; 4 2]
26. A = [6 −2; 9 −3]

In Exercises 27–28, determine the value(s) of λ for which A has an inverse.

27. A = [λ 4; 1 λ]
28. A = [1 −2 3; 4 −1 4; 2 −3 λ]

In Exercises 29–34, solve the given system by forming x = A−1b, where A is the
coefficient matrix for the system.

29. 2x1 + x2 = 4
    3x1 + 2x2 = 2
30. x1 + x2 = 0
    2x1 + 3x2 = 4
31. x1 − x2 = 5
    3x1 − 4x2 = 2
32. 2x1 + 3x2 = 1
    3x1 + 4x2 = 7
33. 3x1 + x2 = 10
    −x1 + 3x2 = 5
34. x1 − x2 = 10
    2x1 + 3x2 = 4

The following matrices are used in Exercises 35–45.

  A = [3 1; 0 2], B−1 = [1 2; 2 1], C = [−1 1; 1 2]. (9)

In Exercises 35–45, use Theorem 17 and the matrices in (9) to form Q−1, where Q is
the given matrix.

35. Q = AC
36. Q = CA
37. Q = AT
38. Q = ATC
39. Q = CTAT
40. Q = B−1A
41. Q = CB−1
42. Q = B−1
43. Q = 2A
44. Q = 10C
45. Q = (AC)B−1

46. Let A be the matrix given in Exercise 13. Use the inverse found in Exercise 13 to
    obtain matrices B and C such that AB = D and CA = E, where
    D = [−1 2 3; 1 0 2] and E = [2 −1; 1 1; 0 3].
47. Repeat Exercise 46 with A being the matrix given in Exercise 16 and where
    D = [2 −1; 1 1; 0 3] and E = [−1 2 3; 1 0 2].
48. For what values of a is
    A = [1 1 −1; 0 1 2; 1 1 a]
    nonsingular?
49. Find (AB)−1, (3A)−1, and (AT)−1 given that
    A−1 = [1 2 5; 3 1 6; 2 8 1] and B−1 = [3 −3 4; 5 1 3; 7 6 −1].
50. Find the (3 × 3) nonsingular matrix A if A^2 = AB + 2A, where
    B = [2 1 −1; 0 3 2; −1 4 1].
51. Simplify (A−1B)−1(C−1A)−1(B−1C)−1 for (n × n) invertible matrices A, B, and C.
52. The equation x^2 = 1 can be solved by setting x^2 − 1 = 0 and factoring the
    expression to obtain (x − 1)(x + 1) = 0. This yields solutions x = 1 and x = −1.
    a) Using the factorization technique given above, what (2 × 2) matrix solutions do
       you obtain for the matrix equation X^2 = I?
    b) Show that
       A = [a 1 − a^2; 1 −a]
       is a solution to X^2 = I for every real number a.
    c) Let b = ±1. Show that
       B = [b 0; c −b]
       is a solution to X^2 = I for every real number c.
    d) Explain why the factorization technique used in part (a) did not yield all the
       solutions to the matrix equation X^2 = I.
53. Suppose that A is a (2 × 2) matrix with columns u and v, so that A = [u, v], u and
    v in R2. Suppose also that uTu = 1, uTv = 0, and vTv = 1. Prove that ATA = I.
    [Hint: Express the matrix A as
    A = [u1 v1; u2 v2], u = [u1; u2], v = [v1; v2],
    and form the product ATA.]
54. Let u be a vector in Rn such that uTu = 1. Let A = I − uuT, where I is the (n × n)
    identity. Verify that AA = A. [Hint: Write the product uuTuuT as
    uuTuuT = u(uTu)uT, and note that uTu is a scalar.]
55. Suppose that A is an (n × n) matrix such that AA = A, as in Exercise 54. Show
    that if A has an inverse, then A = I.
56. Let A = I − avvT, where v is a nonzero vector in Rn, I is the (n × n) identity, and
    a is the scalar given by a = 2/(vTv). Show that A is symmetric and that AA = I;
    that is, A−1 = A.
57. Consider the (n × n) matrix A defined in Exercise 56. For x in Rn, show that the
    product Ax has the form Ax = x − λv, where λ is a scalar. What is the value of λ
    for a given x?
58. Suppose that A is an (n × n) matrix such that ATA = I (the matrix defined in
    Exercise 56 is such a matrix). Let x be any vector in Rn. Show that ‖Ax‖ = ‖x‖;
    that is, multiplication of x by A produces a vector Ax having the same length as x.
59. Let u and v be vectors in Rn, and let I denote the (n × n) identity. Let A = I + uvT,
    and suppose vTu ≠ −1. Establish the Sherman–Woodbury formula:
    A−1 = I − auvT, a = 1/(1 + vTu). (10)
    [Hint: Form AA−1, where A−1 is given by formula (10).]
60. If A is a square matrix, we define the powers A^2, A^3, and so on, as follows:
    A^2 = AA, A^3 = A(A^2), and so on. Suppose A is an (n × n) matrix such that
    A^3 − 2A^2 + 3A − I = O.
    Show that AB = I, where B = A^2 − 2A + 3I.
61. Suppose that A is (n × n) and
    A^2 + b1A + b0I = O, (11)
    where b0 ≠ 0. Show that AB = I, where B = (−1/b0)[A + b1I].

It can be shown that when A is a (2 × 2) matrix such that A−1 exists, then there are
constants b1 and b0 such that Eq. (11) holds. Moreover, b0 ≠ 0 in Eq. (11) unless A
is a multiple of I. In Exercises 62–65, find the constants b1 and b0 in Eq. (11) for the
given (2 × 2) matrix. Also, verify that A−1 = (−1/b0)[A + b1I].

62. A in Exercise 13.
63. A in Exercise 15.
64. A in Exercise 14.
65. A in Exercise 22.

66. a) If linear algebra software is available, solve the systems Ax = b1 and Ax = b2,
       where
       A = [0.932 0.443 0.417; 0.712 0.915 0.887; 0.632 0.514 0.493],
       b1 = [1; 1; −1], b2 = [1.01; 1.01; −1.01].
       Note the large difference between the two solutions.
    b) Calculate A−1 and use it to explain the results of part (a).

67. a) Give examples of nonsingular (2 × 2) matrices A and B such that A + B is
       singular.
    b) Give examples of singular (2 × 2) matrices A and B such that A + B is
       nonsingular.
68. Let A be an (n × n) nonsingular symmetric matrix. Show that A−1 is also
    symmetric.
69. a) Suppose that AB = O, where A is nonsingular. Prove that B = O.
    b) Find a (2 × 2) matrix B such that AB = O, where B has nonzero entries and
       where A is the matrix
       A = [1 1; 1 1].
       Why does this example not contradict part (a)?
70. Let A, B, and C be matrices such that A is nonsingular and AB = AC. Prove that
    B = C.
71. Let A be the (2 × 2) matrix
    A = [a b; c d],
    and set Δ = ad − bc. Prove that if Δ = 0, then A is singular. Conclude that A has
    no inverse. [Hint: Consider the vector
    v = [d; −c];
    also treat the special case when d = c = 0.]
72. Let A and B be (n × n) nonsingular matrices. Show that AB is also nonsingular.
73. What is wrong with the following argument that if AB is nonsingular, then each
    of A and B is also nonsingular?
    Since AB is nonsingular, (AB)−1 exists. But by Theorem 17, property 2,
    (AB)−1 = B−1A−1. Therefore, A−1 and B−1 exist, so A and B are nonsingular.
74. Let A and B be (n × n) matrices such that AB is nonsingular.
    a) Prove that B is nonsingular. [Hint: Suppose v is any vector such that Bv = θ,
       and write (AB)v as A(Bv).]
    b) Prove that A is nonsingular. [Hint: By part (a), B−1 exists. Apply Exercise 72
       to the matrices AB and B−1.]
75. Let A be a singular (n × n) matrix. Argue that at least one of the systems Ax = ek,
    k = 1, 2, . . . , n, must be inconsistent, where e1, e2, . . . , en are the n-dimensional
    unit vectors.
76. Show that the (n × n) identity matrix, I, is nonsingular.
77. Let A and B be matrices such that AB = BA. Show that A and B must be square
    and of the same order. [Hint: Let A be (p × q) and let B be (r × s). Now show
    that p = r and q = s.]
78. Use Theorem 3 to prove Theorem 16.
79. Let A be (n × n) and invertible. Show that A−1 is unique.
SUPPLEMENTARY EXERCISES

1. Consider the system of equations
   x1 = 1
   2x1 + (a^2 + a − 2)x2 = a^2 − a − 4.
   For what values of a does the system have infinitely many solutions? No solutions?
   A unique solution in which x2 = 0?
2. Let
   A = [1 −1 −1; 2 −1 1; −3 1 −3], x = [x1; x2; x3], and b = [b1; b2; b3].
9. Find A−1 for each of the following matrices A.
   a) A = [1 2 1; 2 5 4; 1 1 0]
   b) A = [cos θ −sin θ; sin θ cos θ]
10. For what values of λ is the matrix
    A = [λ − 4 −1; 2 λ − 1]
    singular? Find A−1 if A is nonsingular.
11. Find A if A is (2 × 2) and (4A)−1 = [3 1; 5 2].
12. Find A and B if they are (2 × 2) and
    A + B = [4 6; 8 10] and A − B = [2 2; 4 6].
13. Let
    A = [1 0 0; 0 −1 0; 0 0 −1].

In Exercises 14–18, A and B are (3 × 3) matrices such that

    A−1 = [2 3 5; 7 2 1; 4 −4 3] and B−1 = [−6 4 3; 7 −1 5; 2 3 1].

14. Without calculating A, solve the system of equations Ax = b, where
    x = [x1; x2; x3] and b = [−1; 0; 1].
15. Without calculating A or B, find (AB)−1.
16. Without calculating A, find (3A)−1.
17. Without calculating A or B, find (ATB)−1.
18. Without calculating A or B, find [(A−1B−1)−1A−1B]−1.
CONCEPTUAL EXERCISES

In Exercises 1–8, answer true or false. Justify your answer by providing a counterexample
if the statement is false or an outline of a proof if the statement is true.

1. If A and B are symmetric (n × n) matrices, then AB is also symmetric.
2. If A is an (n × n) matrix, then A + AT is symmetric.
3. If A and B are nonsingular (n × n) matrices such that A^2 = I and B^2 = I, then
   (AB)−1 = BA.
4. If A and B are nonsingular (n × n) matrices, then A + B is also nonsingular.
5. A consistent (3 × 2) linear system of equations can never have a unique solution.
6. If A is an (m × n) matrix such that Ax = θ for every x in Rn, then A is the (m × n)
   zero matrix.
7. If A is a (2 × 2) nonsingular matrix and u1 and u2 are nonzero vectors in R2, then
   {Au1, Au2} is linearly independent.
8. Let A be (m × n) and B be (p × q). If AB is defined and square, then BA is also
   defined and square.

In Exercises 9–16, give a brief answer.

9. Let P, Q, and R be nonsingular (n × n) matrices such that PQR = I. Express Q−1
   in terms of P and R.
10. Suppose that each of A, B, and AB are symmetric (n × n) matrices. Show that
    AB = BA.
11. Let u1, u2, and u3 be nonzero vectors in Rn such that u1Tu2 = 0, u1Tu3 = 0, and
    u2Tu3 = 0. Show that {u1, u2, u3} is a linearly independent set.
12. Let u1 and u2 be linearly dependent vectors in R2, and let A be a (2 × 2) matrix.
    Show that the vectors Au1 and Au2 are linearly dependent.
13. An (n × n) matrix A is orthogonal provided that AT = A−1, that is, if
    AAT = ATA = I. If A is an (n × n) orthogonal matrix, then prove that
    ‖x‖ = ‖Ax‖ for every vector x in Rn.
14. An (n × n) matrix A is idempotent if A^2 = A. What can you say about A if it is
    both idempotent and nonsingular?
15. Let A and B be (n × n) idempotent matrices such that AB = BA. Show that AB
    is also idempotent.
16. An (n × n) matrix A is nilpotent of index k if A^k = O but A^i ≠ O for
    1 ≤ i ≤ k − 1.
    a) Show: If A is nilpotent of index 2 or 3, then A is singular.
    b) (Optional) Show: If A is nilpotent of index k, k ≥ 2, then A is singular.
       [Hint: Consider a proof by contradiction.]
MATLAB EXERCISES
Exercise 1 illustrates some ideas associated with population dynamics. We will look at this
topic again in Chapter 4, after we have developed the necessary analytical tools—eigenvalues
and eigenvectors.
1. Population dynamics An island is divided into three regions, A, B, and C. The yearly
migration of a certain animal among these regions is described by the following table.
From A From B From C
a) Use Eq. (1) to find the population distribution one year after the census.
b) Give a formula for xn in terms of powers of A and x0 .
c) Calculate the state vectors x1 , x2 , . . . , x10 . Observe that the population distribution
seems to be reaching a steady state. Estimate the steady-state population for
each region.
d) Calculate x20 and compare it with your estimate from part c).
e) Let x−1 denote the state vector one year prior to the census. Calculate x−1 .
f ) Demonstrate that Eq. (1) has not always been an accurate model for population
distribution by calculating the state vector four years prior to the census.
g) How should we rearrange the population just after the census so that the distribution
three years later is x3 = [250, 400, 200]T ? That is, what should x0 be in order to hit
the target x3 ?
We have already seen one example of a partitioned matrix (also called a block matrix) when
we wrote A in column form as A = [A1 , A2 , . . . , An ]; recall Section 1.6. Exercise 2 expands
on this idea and illustrates how partitioned matrices can be multiplied in a natural way.
2. Suppose that A is a (2 × 2) block matrix,

     A = [A1 A2; A3 A4],

   where each of the Ai is a matrix. Note that the matrix A need not be a square matrix; for
   instance, A might be (7 × 12) with A1 being (3 × 5), A2 being (3 × 7), A3 being (4 × 5),
   and A4 being (4 × 7). We can imagine creating such a (2 × 2) block matrix by dividing the
   array into four pieces using a horizontal line and a vertical line.
Now suppose B is also a (2 × 2) block matrix given by

     B = [B1 B2; B3 B4].
Finally, let us suppose that the product AB can be formed and that B has been partitioned
in a way such that the following matrix is defined:

     A1B1 + A2B3   A1B2 + A2B4
     A3B1 + A4B3   A3B2 + A4B4 .
It turns out that the product AB is given by this block matrix. That is, if all the submatrix
products are defined, then we can treat the blocks in a partitioned matrix as though they
were scalars when forming products. It is tedious to prove this result in general, so we ask
you to illustrate its validity with some randomly chosen matrices.
a) Using the MATLAB command round(10*rand(6, 6)), generate two randomly
selected (6 × 6) matrices A and B. Compute the product AB. Then write each of A
and B as a block matrix of the form
     A = [A1 A2; A3 A4],  B = [B1 B2; B3 B4].
Above, each Ai and Bi should be a (3 × 3) block. Using matrix surgery (see Section 4
of Appendix A), extract the Ai and Bi matrices and form the new block matrix:

     A1B1 + A2B3   A1B2 + A2B4
     A3B1 + A4B3   A3B2 + A4B4 .
Compare the preceding block matrix with AB and confirm that they are equal.
b) Repeat this calculation on three other matrices (not necessarily (6 × 6) matrices).
Break some of these matrices into blocks of unequal sizes. You need to make sure
that corresponding blocks are the correct size so that matrix multiplication is defined.
c) Repeat the calculation in (a) with the product of a (2 × 3) block matrix times a
(3 × 3) block matrix.
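If NumPy is at hand instead of MATLAB, the block-multiplication claim can be checked along the same lines (np.block assembles a matrix from blocks; the random seed is our own choice):

```python
import numpy as np

rng = np.random.default_rng(1)
A = np.round(10 * rng.random((6, 6)))    # analogue of round(10*rand(6, 6))
B = np.round(10 * rng.random((6, 6)))

# Partition each matrix into four (3 x 3) blocks.
A1, A2, A3, A4 = A[:3, :3], A[:3, 3:], A[3:, :3], A[3:, 3:]
B1, B2, B3, B4 = B[:3, :3], B[:3, 3:], B[3:, :3], B[3:, 3:]

# Treat the blocks as though they were scalars when forming the product.
blockwise = np.block([[A1 @ B1 + A2 @ B3, A1 @ B2 + A2 @ B4],
                      [A3 @ B1 + A4 @ B3, A3 @ B2 + A4 @ B4]])
print(np.array_equal(blockwise, A @ B))   # True
```

Because the entries are small integers, the two products agree exactly, not just to rounding error.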
In Exercise 3, determine how many places are lost to round-off error when Ax = b is
solved on the computer.
3. This exercise expands on the topic of ill-conditioned matrices, introduced at the end of
Section 1.9. In general, a mathematician speaks of a problem as being ill conditioned if
small changes in the parameters of the problem lead to large changes in the solution to the
problem.
Part d) of this exercise also discusses a very practical question:
How much reliance can I place in the solution to Ax = b that my com-
puter gives me?
A reasonably precise assessment of this question can be made using the concept of a
condition number for A.
An easily understood example of an ill-conditioned problem is the equation Ax = b
where A is the (n × n) Hilbert matrix (see Example 7, Section 1.9 for the definition of the
Hilbert matrix). When A is the Hilbert matrix, then a small change in any entry of A or a
small change in any entry of b will lead to a large change in the solution of Ax = b.
Let A denote the (n × n) Hilbert matrix; in MATLAB, A can be created by the
command A = hilb(n).
a) Let B denote the inverse of A, as calculated by MATLAB. For n = 8, 9, 10, 11, and
12, form the product AB and note how the product looks less and less like the identity.
In order to have the results clearly displayed, you might want to use the MATLAB
Bank format for your output. For each value n, list the difference of the (1, 1) entries,
(AB)11 − I11 . [Note that it is not MATLAB’s fault that the inverse cannot be
calculated with any accuracy. MATLAB’s calculations are all done with 17-place
arithmetic, but the Hilbert matrix is so sensitive that seventeen places are not enough.]
b) This exercise illustrates how small changes in b can sometimes dramatically shift the
solution of Ax = b when A is an ill-conditioned matrix. Let A denote the (9 × 9)
Hilbert matrix and let b denote the (9 × 1) column vector consisting entirely of 1’s.
Use MATLAB to calculate the solution u = inv(A)*b. Next change the fourth
component of b to 1.001 and let v = inv(A)*b. Compare the difference between the
two solution vectors u and v; what is the largest component (in absolute value) of the
difference vector u − v? For ease of comparison, you might form the matrix [u, v]
and display it using Bank format.
c) This exercise illustrates that different methods of solving Ax = b may lead to wildly
different numerical answers in the computer when A is ill-conditioned. For A and b
as in part b), compare the solution vector u found using the MATLAB command
u = inv(A)*b with the solution w found from the MATLAB command
ww = rref([A, b]) (w is the last column of ww). For comparison, display the matrix [u, w] using Bank
format. What is the largest component (in absolute value) of the difference vector
u − w?
d) To give a numerical measure for how ill conditioned a matrix is, mathematicians use
the concept of a condition number. You can find the definition of the condition
number in a numerical analysis text. The condition number has many uses, one of
which is to estimate the error between a machine-calculated solution to Ax = b and
the true solution. To explain, let xc denote the machine-calculated solution and let xt
denote the true solution. For a machine that uses d-place arithmetic, we can bound
the relative error between the true solution and the machine solution as follows:
  ‖xc − xt‖ / ‖xt‖ ≤ 10^−d Cond(A). (2)
In inequality (2), Cond(A) denotes the condition number. The left-hand side of the
inequality is the relative error (sometimes also called the percentage error). The relative
error has the following interpretation: If the relative error is about 10^−k, then the two
vectors xc and xt agree to about k places. Thus, using inequality (2), suppose Cond(A) is
about 10^c and suppose we are using MATLAB so that d = 17. Then the right-hand side
of inequality (2) is roughly (10^−17)(10^c) = 10^−(17−c). In other words, we might have as
few as 17 − c correct places in the computer-calculated solution (we might have more than
17 − c correct places, but inequality (2) is sharp and so there will be problems for which
the inequality is nearly an equality). If c = 14, for instance, then we might have as few as
3 correct places in our answer.
Test inequality (2) using the (n × n) Hilbert matrix for n = 3, 4, . . . , 9. As the
vector b, use the n-dimensional vector consisting entirely of 1’s. For a calculated solution,
use MATLAB to calculate xc = inv(A)*b, where A is the (n × n) Hilbert matrix. For
this illustration we also need to determine the true solution xt . Now, it is known that the
Hilbert matrix has an inverse with only integer entries; see Example 7 in Section 1.9 for a
listing of the inverse of the (6 × 6) Hilbert matrix. (In fact, there is a known formula giving
the entries of Hilbert matrix inverses.) Therefore, the true solution to our problem is a
vector xt that has only integer entries. The calculated solution found by MATLAB can
be rounded in order to generate the true solution. Do so, using the MATLAB rounding
command: xt = round(xc ). Finally, the MATLAB command cond(A) will calcu-
late the condition number for A. Prepare a table listing n, the left-hand side of inequality
(2), and the right-hand side of inequality (2) with d = 17. Next, using the long format,
display several of the pairs xc and xt and comment on how well the order of magnitude of
the relative error compares with the number of correct places in xc .
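The experiment just described can also be sketched in NumPy rather than MATLAB (np.linalg.cond plays the role of cond(A), np.round the role of round); one assumption here is d = 16, which is appropriate for double precision, instead of the d = 17 quoted above:

```python
import numpy as np

def hilbert(n):
    """The (n x n) Hilbert matrix: the ijth entry is 1/(i + j - 1)."""
    return np.array([[1.0 / (i + j - 1) for j in range(1, n + 1)]
                     for i in range(1, n + 1)])

# Tabulate both sides of inequality (2) for n = 3, ..., 9.
for n in range(3, 10):
    A = hilbert(n)
    b = np.ones(n)
    xc = np.linalg.inv(A) @ b    # machine-calculated solution
    xt = np.round(xc)            # the true solution has integer entries
    rel_err = np.linalg.norm(xc - xt) / np.linalg.norm(xt)
    print(n, rel_err, 10.0 ** (-16) * np.linalg.cond(A))
```

Note that rounding xc recovers the true integer solution only while the absolute errors stay below 1/2 in every entry, so the table is most trustworthy for the smaller values of n.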