427J Linear Algebra Notes
Theorem 1. Suppose we have a system of linear equations, two of which (not necessarily
the first two) are designated as (A) and (B). The set of solutions to the system is not
changed if we do any of the following three things.
(1) Replace (A) by (A′) where (A′) is the equation obtained by adding d · (B) to (A)
where d is any real number.
(2) Replace (A) by (A′) where (A′) is the equation obtained by multiplying (A) by d
where d is any nonzero real number.
(3) Interchange equations (A) and (B).
Proof. What does it mean for the solution set to remain the same? It means that each
solution of the old system is a solution to the new one and that no new solutions are
created. Because each of these operations is reversible, it is actually enough to prove that
no solutions are lost. This is true because if a new solution were created, it would be lost
when we went backwards.
As (3) is obvious - they are the same equations in a different order - we need only
prove (1) and (2). Let (c1, c2, . . . , cn) be a solution to our original system. It obviously
satisfies every equation in the new system except (A′) since they are the same equations
we had before. So we need only check that (c1, c2, . . . , cn) satisfies (A′) to complete the
proof. Consider conclusion (1). Suppose (A) is the equation a1x1 + a2x2 + · · · + anxn = f
and (B) is the equation b1x1 + b2x2 + · · · + bnxn = g. Then (A′) is the equation (a1 +
db1)x1 + (a2 + db2)x2 + · · · + (an + dbn)xn = f + dg. We simply plug our solution into
the left side of the equation and verify that it works. Since a1c1 + a2c2 + · · · + ancn = f
and b1c1 + b2c2 + · · · + bncn = g, then (a1 + db1)c1 + (a2 + db2)c2 + · · · + (an + dbn)cn =
(a1c1 + a2c2 + · · · + ancn) + d(b1c1 + b2c2 + · · · + bncn) = f + dg.
The proof of conclusion (2) follows the same pattern, but is a bit less complicated.
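If you want to see Theorem 1 in action on a computer, here is a minimal sketch using SymPy (the particular 2 × 2 system and the names are mine, purely for illustration): replacing one equation by itself plus a multiple of another leaves the solution set unchanged.

    from sympy import symbols, linsolve

    x, y = symbols("x y")
    # Original system: x + 2y = 5, 3x + 4y = 6 (expressions are implicitly = 0)
    eqs = [x + 2*y - 5, 3*x + 4*y - 6]
    # Operation (1): replace the first equation by itself plus 7 times the second
    eqs_new = [eqs[0] + 7*eqs[1], eqs[1]]
    print(linsolve(eqs, x, y))      # {(-4, 9/2)}
    print(linsolve(eqs_new, x, y))  # {(-4, 9/2)} -- same solution set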
Now we return to Example ??. First we multiply the first equation by 1/2. By Theo-
rem 1(2), this does not change the solution. We now have
Next we use Theorem 1(1) to add (−4) times the first equation to the second and (−1)
times the first to the third. This gives
For a quick description of the above, in the first step we divide the first row by 2 to create
a 1 in the upper left corner. This 1 is referred to as a pivot. In the second step, that pivot
was used to clear out the first column using Theorem 1(1). In the third step, a pivot was
created in the (2,2) position. In the fourth step, this pivot was used to clear out the column
beneath it. In the fifth step, a pivot was created for the third row. In the sixth step, the
column above the third pivot was cleared out and in the last step, the final zero was added
in the (1,2) position.
If we translate this matrix back into the system of equations it represents, we get a
system that is already solved.
x1 = 1
x2 = 3
x3 = 5
I have just solved the same system twice and really used the same calculations. However,
the second solution involved much less writing and so we prefer it.
HINT: I recommend doing this problem exactly as I did it. Notice that in the first step, I
changed Row 1, while in the second step I performed two operations at once - using Row
1 to change Rows 2 and 3. I do not like combining the first two operations, as changing
the first row and using it in the same step risks computational error. On the other hand,
changing Rows 2 and 3 in the same step saves writing and is not risky.
Now the matrix $\begin{pmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 3 \\ 0 & 0 & 1 & 5 \end{pmatrix}$ is in a very useful form and deserves a special name.
Definition. A matrix is said to be in reduced row echelon form (RREF) if
• The rows consisting entirely of zeroes (if any) are beneath all nonzero rows.
• The leftmost nonzero element of each nonzero row is a 1. Leading 1’s are called
pivots.
• Each pivot is to the right of any pivot in a row above it.
• If a column contains a pivot, all other entries in the column are zero.
Here is a slightly more interesting example of a matrix in reduced row echelon form.
Example 4. $\begin{pmatrix} 1 & 0 & 3 & 0 & 4 \\ 0 & 1 & -4 & 0 & 1 \\ 0 & 0 & 0 & 1 & -8 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}$ Notice that the "1" in the (2,5) position is not a pivot
and that the fifth column, like the third, can be messy.
For completeness, I should add one additional definition.
Definition. A matrix is in row echelon form if it satisfies the first three properties in the
definition of RREF.
Note that in the chain of row operations we performed to find a matrix in RREF, the
last three matrices are in row echelon form. There are two ways to solve systems and they
are very similar. Either put the augmented matrix in reduced row echelon form or put it in
row echelon form and finish the problem by back substitution. Either is fine; in this course
we will use RREF.
So far, we have only seen two actual systems of equations and both have had a unique
solution. This is not a general rule. A system of equations can have a unique solution or
infinitely many solutions or no solutions at all. However, we can solve any system with the
same technique - put it in RREF and read off the answer.
Example 5. Solve the system of equations
$\begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 4 & 6 & 9 \\ 3 & 6 & 8 & 11 \\ 4 & 8 & 11 & 15 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} = \begin{pmatrix} 10 \\ 22 \\ 29 \\ 39 \end{pmatrix}$.
First we form the augmented matrix $\begin{pmatrix} 1 & 2 & 3 & 4 & 10 \\ 2 & 4 & 6 & 9 & 22 \\ 3 & 6 & 8 & 11 & 29 \\ 4 & 8 & 11 & 15 & 39 \end{pmatrix}$.
Then we use Type I operations to make the entries below the pivot in the upper left corner
equal to zero, obtaining
$\begin{pmatrix} 1 & 2 & 3 & 4 & 10 \\ 0 & 0 & 0 & 1 & 2 \\ 0 & 0 & -1 & -1 & -1 \\ 0 & 0 & -1 & -1 & -1 \end{pmatrix}$.
At this point, we are supposed to ignore the first row and create a pivot in the second row.
Since the second column is all zeroes (remember we are ignoring the first row), we must
move on to the third column. As the (2,3) entry is zero, we employ Row operation 3 and
switch the second and third rows. This gives us
$\begin{pmatrix} 1 & 2 & 3 & 4 & 10 \\ 0 & 0 & -1 & -1 & -1 \\ 0 & 0 & 0 & 1 & 2 \\ 0 & 0 & -1 & -1 & -1 \end{pmatrix}$.
We multiply the second row by −1, giving
$\begin{pmatrix} 1 & 2 & 3 & 4 & 10 \\ 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 1 & 2 \\ 0 & 0 & -1 & -1 & -1 \end{pmatrix}$,
and then add row 2 to row 4 to obtain
$\begin{pmatrix} 1 & 2 & 3 & 4 & 10 \\ 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}$.
Finally, we use Row operation 1 three times to create the necessary zeros in columns 4
and 3 and obtain a matrix in reduced row echelon form
$\begin{pmatrix} 1 & 2 & 0 & 0 & 5 \\ 0 & 0 & 1 & 0 & -1 \\ 0 & 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}$.
The solution to this system can be read off immediately, IF you know
how to do it. In equation form, this system is
x1 + 2x2 = 5
x3 = −1
x4 = 2
0=0
Because there was no pivot in the second column, we declare x2 to be a free variable and
move it to the right hand side of the equation(s). The system becomes
x1 = 5 − 2x2
x3 = −1
x4 = 2
0=0
The number of solutions is infinite. We can choose any value for x2 that we like and then
there is a unique choice for x1 , x3 , x4 . The solution set is {(5 − 2t, t, −1, 2) | t ∈ R}.
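If you want to check a computation like this by machine, SymPy will put the augmented matrix in RREF for you; here is a minimal sketch for Example 5 (rref() and its pivot-column list are standard SymPy, the variable names are mine):

    from sympy import Matrix

    # Augmented matrix for Example 5
    M = Matrix([[1, 2, 3, 4, 10],
                [2, 4, 6, 9, 22],
                [3, 6, 8, 11, 29],
                [4, 8, 11, 15, 39]])
    R, pivots = M.rref()
    print(R)       # rows: [1, 2, 0, 0, 5], [0, 0, 1, 0, -1], [0, 0, 0, 1, 2], [0, 0, 0, 0, 0]
    print(pivots)  # (0, 2, 3): no pivot in the second column, so x2 is free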
There is no limit to the number of free variables. We get one for each column without
a pivot. It is also possible that we may get no solutions at all. The next example illustrates
this.
Example 6. Solve the system of equations
$\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 2 \\ 3 \\ 6 \end{pmatrix}$.
First we form the augmented matrix $\begin{pmatrix} 1 & 2 & 3 & 2 \\ 4 & 5 & 6 & 3 \\ 7 & 8 & 9 & 6 \end{pmatrix}$.
We then solve as usual by row operations. I won't show the work here, but the first four
steps lead us to the matrix $\begin{pmatrix} 1 & 2 & 3 & 2 \\ 0 & 1 & 2 & 5/3 \\ 0 & 0 & 0 & 2 \end{pmatrix}$ and four more steps lead us to the reduced row
echelon form $\begin{pmatrix} 1 & 0 & -1 & 0 \\ 0 & 1 & 2 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$. In equation form, this system is
x1 − x3 = 0
x2 + 2x3 = 0
0 = 1
No matter how we choose x1, x2, x3, the equation 0 = 1 will never be satisfied. Hence
this system is inconsistent and has no solutions. What tips us off immediately that this is
the case is the pivot in the last column of the augmented matrix. Moreover, when we get to
the matrix $\begin{pmatrix} 1 & 2 & 3 & 2 \\ 0 & 1 & 2 & 5/3 \\ 0 & 0 & 0 & 2 \end{pmatrix}$, we know where all the pivots will be located. We know there
will be a pivot in the final column and we can skip the last few steps. In general, for any
problem, halfway through the solution process, we know where all the pivots are located.
So we know if there is no solution, a unique solution, or infinitely many solutions. In the
case of infinitely many solutions, we also know which variables are the free variables.
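SymPy confirms this as well; a small sketch (same caveats as before) showing the pivot landing in the last column of the augmented matrix:

    from sympy import Matrix

    M = Matrix([[1, 2, 3, 2],
                [4, 5, 6, 3],
                [7, 8, 9, 6]])
    R, pivots = M.rref()
    print(R)       # [1, 0, -1, 0], [0, 1, 2, 0], [0, 0, 0, 1]
    print(pivots)  # (0, 1, 3): a pivot in the last column, so the system is inconsistent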
I’ll add one example where there is more than one free variable.
Example 7. Solve the system of equations
$\begin{pmatrix} 1 & 2 & 0 & -4 & 5 & 0 & 0 \\ 0 & 0 & 1 & 7 & 8 & 0 & -5 \\ 0 & 0 & 0 & 0 & 0 & 1 & -9 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \\ x_7 \end{pmatrix} = \begin{pmatrix} 2 \\ 3 \\ 5 \end{pmatrix}$.
Since this matrix is already in reduced row echelon form, we can just read off the solution.
The free variables are x2 , x4 , x5 , x7 . Informally, the solutions are
x1 = 2 − 2r + 4s − 5t
x2 = r
x3 = 3 − 7s − 8t + 5u
x4 = s
x5 = t
x6 = 5 + 9u
x7 = u
where r, s, t, u can take on any value. In set notation, the solution set is
{(2 − 2r + 4s − 5t, r, 3 − 7s − 8t + 5u, s, t, 5 + 9u, u) | r, s, t, u ∈ R}.
Now we know how to solve systems and how to express the solutions. Sometimes though,
we are just interested in a simpler question. How many solutions does a system have? Is
it exactly one, infinitely many, or none at all? Let's sum up what is totally obvious from
RREF.
Theorem 3. Suppose A x = b is a linear system. Suppose the matrix (A | b) is equivalent
to the RREF (C | d). Then
(1) If there is a pivot in the d column, the system has no solution. Otherwise it does.
(2) Now assume there is a solution. If every column of C contains a pivot, then the
solution is unique. If some column does not have a pivot, there are infinitely many
solutions.
Corollary 4. If a linear system has more unknowns than equations, then a unique solution
is impossible. If the system is consistent, i.e., has a solution, it has infinitely many.
Proof. The number of pivots cannot exceed the number of rows in the matrix, which is of
course the number of equations. Since the number of columns is greater, there must be at
least one column without a pivot.
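Theorem 3 translates directly into a few lines of code; here is a sketch (the function name and the return strings are my own) that classifies a system from the pivots of the RREF of its augmented matrix:

    from sympy import Matrix

    def classify(A, b):
        """Classify A x = b as 'none', 'unique', or 'infinite' via Theorem 3."""
        M = A.row_join(b)          # the augmented matrix (A | b)
        R, pivots = M.rref()
        if A.cols in pivots:       # a pivot in the d column: no solution
            return "none"
        return "unique" if len(pivots) == A.cols else "infinite"

    A = Matrix([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
    print(classify(A, Matrix([2, 3, 6])))  # 'none' (Example 6)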
2. Homogeneous Systems
When I first learned about simultaneous equations (and I suspect it was the same for
most of you), the most important problems were systems which had a unique solution and
typically consisted of 2 equations in 2 unknowns or 3 equations in 3 unknowns. There could
be more equations and unknowns, but such problems took too long to do by hand. Now we
learned that some systems had infinitely many solutions or none, but those systems seemed
like oddities.
In reality, we were not wrong in thinking that systems with unique solutions are impor-
tant. We note that this is how we find the solution to a second order linear differential
equation satisfying specific initial conditions. However, systems with infinitely many solu-
tions or no solution at all are not just curiosities, but mathematical objects of real signifi-
cance. Even systems where the number of equations does not equal the number of variables
naturally arise.
One type of problem is extremely important in linear algebra and will play a major role
in this course. These are called homogeneous linear systems and are characterized by the
property that the constants on the right hand side of all equations are zero. We will soon
see that homogeneous systems are vital in determining linear independence. Later on, we
will see that they are crucial for finding the general solution to systems of linear differential
equations.
Example 8. $\begin{pmatrix} 1 & 4 & 2 & 1 \\ 2 & 7 & 3 & 0 \\ 3 & 1 & -5 & 0 \\ 6 & 5 & -7 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}$.
This is an example of a homogeneous system. If you compare it to a non-homogeneous
system, e.g., Example 5 or Example 6, you might notice something right away. It has a
solution. We can recall that Example 5 had infinitely many solutions while Example 6 did
not have any, but we really didn't know that until we had solved the system. In Example 8,
we can notice that x1 = 0, x2 = 0, x3 = 0, x4 = 0 is a solution without doing any work at
all. This solution is called the trivial solution and works for every homogeneous system.
Hence a homogeneous system has either infinitely many solutions or the trivial solution as
a unique solution.
Now we solve homogeneous equations in much the same way as other systems with a few
adjustments. First of all, there is no point in writing out the augmented matrix. You could
do this if you like, but the last column would be all zeroes and at every step, this column
won’t change. So we just work with the original matrix and imagine the zeroes are there
when we want to read off the answer. Now let’s solve Example 8 by putting the matrix into
RREF and then reading off the solution.
$\begin{pmatrix} 1 & 4 & 2 & 1 \\ 2 & 7 & 3 & 0 \\ 3 & 1 & -5 & 0 \\ 6 & 5 & -7 & 1 \end{pmatrix} \sim \begin{pmatrix} 1 & 4 & 2 & 1 \\ 0 & -1 & -1 & -2 \\ 0 & -11 & -11 & -3 \\ 0 & -19 & -19 & -5 \end{pmatrix} \sim \begin{pmatrix} 1 & 4 & 2 & 1 \\ 0 & 1 & 1 & 2 \\ 0 & -11 & -11 & -3 \\ 0 & -19 & -19 & -5 \end{pmatrix} \sim \begin{pmatrix} 1 & 4 & 2 & 1 \\ 0 & 1 & 1 & 2 \\ 0 & 0 & 0 & 19 \\ 0 & 0 & 0 & 33 \end{pmatrix} \sim$
$\begin{pmatrix} 1 & 4 & 2 & 1 \\ 0 & 1 & 1 & 2 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 33 \end{pmatrix} \sim \begin{pmatrix} 1 & 4 & 2 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix} \sim \begin{pmatrix} 1 & 0 & -2 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}$
This gives us a single free variable x3 because the first, second, and fourth columns have
pivots but the third does not. As we did before - and remembering our right hand side was
just zeroes - we can replace our original system by x1 = 2x3 , x2 = −x3 , x4 = 0 and see that
the solution set is {(2t, −t, t, 0) | t ∈ R}.
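SymPy's nullspace() carries out exactly this computation; a small sketch for Example 8 (the comments record what the calls return):

    from sympy import Matrix

    A = Matrix([[1, 4, 2, 1],
                [2, 7, 3, 0],
                [3, 1, -5, 0],
                [6, 5, -7, 1]])
    print(A.rref())       # RREF as above, with pivot columns (0, 1, 3)
    print(A.nullspace())  # one basis vector: (2, -1, 1, 0)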
Now there is another way in which homogeneous systems differ from non-homogeneous
ones. Notice that if we choose t = 1, we get a solution (2, −1, 1, 0) and that every other
solution is just a multiple of this solution. If you can imagine 4-space, the solution set is
just a straight line through the origin determined by this one other point. [The solution
set for a non-homogeneous system does not contain the origin.] We refer to the solution
(2, −1, 1, 0) as a basic solution.
The general situation is a little more complicated than this, but not much. What happens
when we put the matrix from a homogeneous problem into RREF? There can never be a
pivot in the constants column of course, because the constants column is zeroes and we don’t
even use it. If every other column contains a pivot, then the trivial solution is the unique
solution. If there are columns without pivots, then we have one or more free variables.
Example 8 is an illustration of what happens when there is one free variable. To see
what happens when there are more free variables, I will look at a homogeneous version of
Example 7.
Example 9. Solve the system of equations
$\begin{pmatrix} 1 & 2 & 0 & -4 & 5 & 0 & 0 \\ 0 & 0 & 1 & 7 & 8 & 0 & -5 \\ 0 & 0 & 0 & 0 & 0 & 1 & -9 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \\ x_7 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$.
As in Example 7, we can read off the general solution
x1 = −2r + 4s − 5t
x2 = r
x3 = −7s − 8t + 5u
x4 = s
x5 = t
x6 = 9u
x7 = u
However, there is another way to look at this. Suppose we set one of our free variables,
say x2 equal to 1 and set each of the other free variables equal to zero. This means we are
letting r = 1 and s = t = u = 0. We get the basic solution.
x1 = −2
x2 = 1
x3 = 0
x4 = 0
x5 = 0
x6 = 0
x7 = 0
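The basic solutions are exactly what SymPy's nullspace() returns; a quick sketch for Example 9 (only rref/nullspace are library calls, the rest is mine):

    from sympy import Matrix

    A = Matrix([[1, 2, 0, -4, 5, 0, 0],
                [0, 0, 1, 7, 8, 0, -5],
                [0, 0, 0, 0, 0, 1, -9]])
    # One basic solution per free variable (x2, x4, x5, x7):
    # (-2,1,0,0,0,0,0), (4,0,-7,1,0,0,0), (-5,0,-8,0,1,0,0), (0,0,5,0,0,9,1)
    for v in A.nullspace():
        print(v.T)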
matrix, we can multiply. However, notice that what we get is
$3\begin{pmatrix} 3 \\ -1 \\ 0 \end{pmatrix} + 6\begin{pmatrix} -2 \\ 4 \\ 11 \end{pmatrix} - 5\begin{pmatrix} 6 \\ 2 \\ -5 \end{pmatrix} + 2\begin{pmatrix} 1 \\ 1 \\ 6 \end{pmatrix}$.
So Aw = 3v1 + 6v2 − 5v3 + 2v4 is actually a linear combination of the column vectors which
comprise A and the weights are just the entries of the vector w.
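In NumPy this is a one-line check; a sketch using the vectors from the display above (reconstructed from the notes, so treat the specific numbers as illustrative):

    import numpy as np

    A = np.array([[3, -2, 6, 1],
                  [-1, 4, 2, 1],
                  [0, 11, -5, 6]])
    w = np.array([3, 6, -5, 2])
    combo = sum(w[i] * A[:, i] for i in range(4))  # 3*v1 + 6*v2 - 5*v3 + 2*v4
    print(np.array_equal(A @ w, combo))            # True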
This perspective even makes more general matrix multiplication easier to understand.
Suppose A is an m × n matrix and B is an n × k matrix. Then we can view both A and B
as juxtapositions of column vectors. Suppose A = (v1 · · · vn ) and B = (w1 · · · wk ). Then
AB is also a juxtaposition of column vectors, in fact AB = (Aw1 · · · Awk ) and of course
the description of A helps us compute each Awi .
Now consider the following example.
Example 10. A company manufactures five products. Product A requires 3 units of steel, 1
unit of aluminum, 5 units of plastic, and 1 unit of glass. Product B requires 0 units of steel,
4 units of aluminum, 2 units of plastic, and 2 units of glass. Product C requires 5 units
of steel, 0 units of aluminum, 6 units of plastic, and 0 units of glass. Product D requires
8 units of steel, 0 units of aluminum, 0 units of plastic, and 3 units of glass. Product E
requires 3 units of steel, 9 units of aluminum, 8 units of plastic, and 4 units of glass. If they
wish to manufacture 1000 of Product A, 26 of Product B, 970 of Product C, 100 of Product
D, and 65 of Product E, how many units of each raw material is required?
What we need very simply is a function of five variables. We must input the number
of each product that we wish to manufacture and the function should then spit out four
numbers, one for each of the raw materials. We are looking for a function of the sort
f (a, b, c, d, e) = (r, s, t, u). The way we do this is to create a vector for each product.
Let $v_A = \begin{pmatrix} 3 \\ 1 \\ 5 \\ 1 \end{pmatrix}$, $v_B = \begin{pmatrix} 0 \\ 4 \\ 2 \\ 2 \end{pmatrix}$, $v_C = \begin{pmatrix} 5 \\ 0 \\ 6 \\ 0 \end{pmatrix}$, $v_D = \begin{pmatrix} 8 \\ 0 \\ 0 \\ 3 \end{pmatrix}$, $v_E = \begin{pmatrix} 3 \\ 9 \\ 8 \\ 4 \end{pmatrix}$. Now notice that
$1000v_A + 26v_B + 970v_C + 100v_D + 65v_E$ is a 4-vector which gives the answer to our
question.
If we let $S = (v_A\; v_B\; v_C\; v_D\; v_E)$, then the answer is just the product $S \begin{pmatrix} 1000 \\ 26 \\ 970 \\ 100 \\ 65 \end{pmatrix}$.
So the function we are seeking is simply multiplication by the matrix S. The function
takes a 5-vector w and produces a 4-vector Sw, which is of course just a suitable linear
combination of the five vectors which detail how much of each raw material was needed for
each of the products.
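Carrying out the computation is a one-liner (a sketch; the totals in the comment are what this multiplication yields):

    import numpy as np

    S = np.array([[3, 0, 5, 8, 3],    # steel per unit of products A..E
                  [1, 4, 0, 0, 9],    # aluminum
                  [5, 2, 6, 0, 8],    # plastic
                  [1, 2, 0, 3, 4]])   # glass
    w = np.array([1000, 26, 970, 100, 65])
    print(S @ w)  # [ 8845  1689 11392  1612] units of steel, aluminum, plastic, glass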
This is a special kind of function called a linear transformation, something we will see
more of in Section 3.7 of the text and in Section ?? below. Linear transformations are great
to work with because there are surefire techniques for handling them. It is not like solving
differential equations by integration, where you can always be stumped by an impossible
integral. If we change Example 10 into a company which manufactures 17 products, each
requiring some subset of 46 ingredients, we can easily get a computer to solve the problem.
Example 14. The set of m × n matrices is a vector space.
Proof. A matrix is just a function from {(1, 1), . . . , (1, n), (2, 1), . . . , (2, n), . . . , (m, 1), . . . , (m, n)}
to the real numbers.
And finally one last example which will be very important for us. This differs from the
last three examples in that we will employ Theorem ?? with V equal to the vector space
given by Example ??.
Example 15. The set of n-vectors whose entries are functions from the real numbers (or
some subset of the real numbers such as an interval) to the real numbers is a vector space.
Proof. As in Example ??, we are letting T = {1, 2, . . . , n} in an application of Theorem ??.
Here we are using Example ?? for V .
Vector spaces are abstract mathematical objects. However, every vector space we en-
counter in this course will either be one of these or a subspace of one of these or a complex
number analogue. For example, the solutions to second order linear differential equations
are subspaces of Example ??. The solutions to the equation (d/dt)x = A x will be subspaces
of Example ??.
k unknowns and we solve it by row reduction. If there is a pivot in every column, then
the solution is unique and the set is linearly independent. If some column lacks a pivot,
then there is a free variable, infinitely many solutions, and the set is linearly dependent. Of
course, this is what has to happen if n < k because the number of pivots can’t exceed the
number of rows and so there are not enough pivots to put one in every column. In any case
though, checking vectors in Rn for linear independence is completely straightforward. Set
up the matrix, put it in row echelon form, check the pivots, and you have your answer.
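As a concrete sketch of that recipe (the three test vectors are my own):

    from sympy import Matrix

    # Columns are the vectors being tested for independence
    V = Matrix([[1, 2, 1],
                [0, 1, 1],
                [1, 0, 3]])
    R, pivots = V.rref()
    print(len(pivots) == V.cols)  # True iff every column has a pivot, i.e. independent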
So what happens if the vectors are not in Rn? Well, if we have another basis, we can
simply coordinatize and use the same process. Otherwise the process is trickier. This is
why the theorem (Theorem 6) we will encounter on Page 293 is so wonderful. It tells us how
to check for linear independence when our vectors are n-vectors of functions which solve a
linear system by checking vectors in Rn .
Example 19. Let P be the vector space of all polynomials (a subspace of the vector space
of all functions). Define T : P → P by T(f(t)) = (d/dt)f(t). Notice that Rules (1) and (2) for
linear transformations are just our old friends - the rules that the derivative of a sum is the
sum of the derivatives and that we can factor out constants when we are taking derivatives.
Now we come to one of the really beautiful ideas in linear algebra. If V is a finite
dimensional vector space, it has a basis x1, . . . , xn. Any vector in the vector space can be
uniquely written in the form c1x1 + · · · + cnxn and so you can visualize it as a vector in
Rn - a column vector $(c_1, c_2, \ldots, c_n)^T$ which, if you know what the basis is, describes the original
vector completely. With this coordinatization, a linear transformation T : V → V can be
visualized as a function $T(c_1, \ldots, c_n)^T = (d_1, \ldots, d_n)^T$. Because T is a linear transformation, each
di will be a linear function of the ci's and so $(d_1, \ldots, d_n)^T = A\,(c_1, \ldots, c_n)^T$ for some n × n matrix A.
What we see then is that in the world of finite dimensional vector spaces, Example ?? isn’t
just one example; it is the only example.
The next observation is both wonderful and a bit baffling. If you use a different basis for
V , the matrix of the transformation will be different. The confusing part is that knowing
the matrix of a transformation does not tell you the transformation unless you know what
the basis is. Many matrices can refer to the same transformation, while one matrix can
refer to many transformations. The wonderful part is that since we have many matrices
to pick from, we can choose one that is easy to work with. The next example is our first
encounter with diagonalizing matrices.
Example 21. Consider the transformation T : R2 → R2 given by $T(x) = \begin{pmatrix} 42 & 30 \\ -60 & -43 \end{pmatrix} x$.
This is a simple transformation of the plane, but it certainly isn't easy to picture what it
does. However, the plane has another basis, $\begin{pmatrix} 3 \\ -4 \end{pmatrix}, \begin{pmatrix} -2 \\ 3 \end{pmatrix}$. With respect to this basis, the
transformation is given by the matrix $\begin{pmatrix} 2 & 0 \\ 0 & -3 \end{pmatrix}$. Now, given any x ∈ R2, you can actually
construct T(x) geometrically without doing any calculations at all. TRY IT. You will be
working with arrows and not algebra, but I'll describe what is happening algebraically. What
is $T\begin{pmatrix} 4 \\ -5 \end{pmatrix}$? You decompose the vector into two vectors, one in the $\begin{pmatrix} 3 \\ -4 \end{pmatrix}$ direction and
one in the $\begin{pmatrix} -2 \\ 3 \end{pmatrix}$ direction. As it happens, $\begin{pmatrix} 4 \\ -5 \end{pmatrix} = 2\begin{pmatrix} 3 \\ -4 \end{pmatrix} + \begin{pmatrix} -2 \\ 3 \end{pmatrix}$. Now you stretch the
first vector to twice its length and you reverse the second vector and triple its length. The
exact answer is then $4\begin{pmatrix} 3 \\ -4 \end{pmatrix} - 3\begin{pmatrix} -2 \\ 3 \end{pmatrix} = \begin{pmatrix} 18 \\ -25 \end{pmatrix}$, which is hopefully close to what you got
graphically.
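NumPy confirms the special basis; a minimal check (eigenvectors come back scaled, so it is easiest to verify the directions directly):

    import numpy as np

    A = np.array([[42, 30], [-60, -43]])
    vals, vecs = np.linalg.eig(A)
    print(vals)                      # 2.0 and -3.0
    print(A @ [3, -4], A @ [-2, 3])  # [6, -8] = 2*(3,-4) and [6, -9] = -3*(-2,3)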
However, the value of this goes beyond just understanding the transformation better.
What we are working toward is solving linear systems of differential equations. For example,
we would like to find the general solution to the differential equation
$\frac{d}{dt}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 42 & 30 \\ -60 & -43 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$
In fact, the general solution to this system is $C_1 e^{2t}\begin{pmatrix} 3 \\ -4 \end{pmatrix} + C_2 e^{-3t}\begin{pmatrix} -2 \\ 3 \end{pmatrix}$. So you can see
that Example 21 holds the key to solving linear systems. Now finding that special basis
was not guesswork. In Section 3.8, we will learn how to find that alternate basis.
There are three important vector spaces which are connected to an m×n matrix A. First
let T denote the linear transformation from the previous example. Recalling how matrix
multiplication works from Section 2, we can easily see exactly which vectors in Rm can be
written as Ax. If b = Ax, then b is necessarily a linear combination of the columns of A
and it is just as clear that every linear combination of the columns of A is equal to A x
for some choice of x. The image or range of the function T is thus the set of all linear
combinations of the columns of A. This motivates the following definition.
Definition. The column space of an m×n matrix A is the subspace of Rm which is spanned
by the columns of A, or equivalently, the set of all linear combinations of the columns of A.
Of course, this means that A x = b has a solution if and only if b is in the column space of
A. Using function terminology, the column space is the range of the transformation defined
by A.
There is also a row space, which is also the column space of the transpose of A. Finally,
there is the vector space of vectors which solve T (x) = 0 or A x = 0.
Definition. The nullspace of A is the set of all solutions to the homogeneous equation
A x = 0.
In understanding transformations, which we really won’t pursue in this course, the
nullspace is very important. For our purposes however, it is important that we be able
to find the nullspace and to find a basis for the nullspace. In the next section, we will be
finding eigenvectors. For each eigenvalue λ, we will have to find its eigenspace, which is just
the nullspace for the matrix A − λI. Then we will need to find a basis for that eigenspace.
The technique for solving is exactly the one we encountered in Section 2. We find a
basis by identifying the free variables - which correspond to the columns without pivots.
Then, we choose one vector for each free variable. To find v, the vector corresponding to
the free variable xi , we let xi = 1 and all of the other free variables equal to zero. We then
solve for the dependent variables.
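Here is a sketch of that recipe in SymPy, using the Example 21 matrix and its eigenvalue λ = 2 (eye and nullspace are standard SymPy; the setup is illustrative):

    from sympy import Matrix, eye

    A = Matrix([[42, 30], [-60, -43]])
    lam = 2
    E = (A - lam * eye(2)).nullspace()  # basis for the eigenspace of lambda = 2
    print(E)                            # one vector, a multiple of (3, -4)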
The way to compute determinants however is not by using the definition. Before
discussing this, I do want to discuss the two actual uses of determinants. The first
application is to vector calculus. It does not pertain to this course, but is very
important. The transformation given by a 3 × 3 matrix A maps the unit cube to
the parallelepiped determined by the columns. So it maps something of volume 1
to something of volume | det A|. In fact, the transformation multiplies all volumes
by the same factor and so the determinant is a key correcting factor when changing
variables in triple integrals. Moreover, there is nothing special about dimension
three.
The second important application - which is critical in this course - is related to
the fact that the RREF of A will have a column without a pivot precisely when
det A = 0. When we are simply working with A, this is clumsy. However, the
question that will concern us is for which values of λ will A − λI have a column
without a pivot. These values are found by solving the equation det(A − λI) = 0.
Now we move on to computing the value of the determinant. For small values
of n, the determinant is easy to calculate. If n = 1, det A = a11 . If n = 2,
det A = a11 a22 − a12 a21 . If n = 3, det A = a11 a22 a33 + a12 a23 a31 + a13 a21 a32 −
a13 a22 a31 − a11 a23 a32 − a12 a21 a33 . However, the determinant of a 4 × 4 matrix has
24 terms and should be done some other way.
The standard way to compute a 3 × 3 determinant is by a technique referred to as
basket-weaving. Set up a five column matrix by repeating the first two columns:
$\begin{pmatrix} a_{11} & a_{12} & a_{13} & a_{11} & a_{12} \\ a_{21} & a_{22} & a_{23} & a_{21} & a_{22} \\ a_{31} & a_{32} & a_{33} & a_{31} & a_{32} \end{pmatrix}$
Then multiply along the diagonals, adding the terms that go down to the right and
subtracting the terms that go down to the left.
Example 22. Find the determinant of the matrix $\begin{pmatrix} 4 & 2 & 8 \\ 0 & 6 & 3 \\ 1 & -7 & 5 \end{pmatrix}$.
We set up the 3 × 5 matrix $\begin{pmatrix} 4 & 2 & 8 & 4 & 2 \\ 0 & 6 & 3 & 0 & 6 \\ 1 & -7 & 5 & 1 & -7 \end{pmatrix}$ and see that the determinant
equals (4)(6)(5) + (2)(3)(1) + (8)(0)(−7) − (8)(6)(1) − (4)(3)(−7) − (2)(0)(5) = 162.
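A one-line check of Example 22 (SymPy keeps the arithmetic exact):

    from sympy import Matrix

    print(Matrix([[4, 2, 8], [0, 6, 3], [1, -7, 5]]).det())  # 162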
The determinant of a 2×2 matrix is found by an even easier form of basket-weaving,
but the technique fails for 4 × 4. Notice that you only get eight terms and there are
24 of them.
Larger determinants are computed in one of two ways. There is a very cumber-
some technique of expanding about a row or column. Unless the matrix has lots
of 0’s, this requires many calculations. In general, a better technique more closely
resembles putting the matrix in RREF. As computing large determinants will not
be stressed in this course, these methods will be relegated to the Appendix at the
end of these notes.
Let's go back to the first question. How do we make sense of $e^{At}$? Remember that
$e^x = 1 + x + x^2/2! + x^3/3! + x^4/4! + \cdots$. So we define $e^{At} = I + At + (At)^2/2! + (At)^3/3! + \cdots$.
This converges to an n × n matrix and by taking derivatives term by term, we see that
$\frac{d}{dt}e^{At} = Ae^{At}$, which is exactly the property that we want $e^{At}$ to have.
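A quick numerical sketch of the definition, assuming SciPy is available: summing the series (truncated; 20 terms is plenty for this small example) agrees with SciPy's built-in expm.

    import numpy as np
    from scipy.linalg import expm
    from math import factorial

    A, t = np.array([[0.0, 1.0], [-2.0, -3.0]]), 0.5
    series = sum(np.linalg.matrix_power(A * t, k) / factorial(k) for k in range(20))
    print(np.allclose(series, expm(A * t)))  # True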
Example 25. It is easy to find $e^D$ if D is a diagonal matrix. Let us compute $e^D$ for $D = \begin{pmatrix} 3 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 2 \end{pmatrix}$.
Since $D^2 = \begin{pmatrix} 9 & 0 & 0 \\ 0 & 16 & 0 \\ 0 & 0 & 4 \end{pmatrix}$, $D^3 = \begin{pmatrix} 27 & 0 & 0 \\ 0 & 64 & 0 \\ 0 & 0 & 8 \end{pmatrix}$, $D^4 = \begin{pmatrix} 81 & 0 & 0 \\ 0 & 256 & 0 \\ 0 & 0 & 16 \end{pmatrix}$, etc., we see that
there is no interaction of terms and $e^D = \begin{pmatrix} e^3 & 0 & 0 \\ 0 & e^4 & 0 \\ 0 & 0 & e^2 \end{pmatrix}$.
This is another illustration of how diagonalizing matrices - the process of finding a basis
so that the matrix of the linear transformation with respect to that basis is diagonal - is so
useful.
Example 26. Another easy example occurs when $A^k = 0$ for some k because in that case
the infinite series is just a finite sum. For instance, suppose $A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}$. Then
$A^2 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$ and $A^3$ is the zero matrix. So
$e^A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} + \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix} + (1/2)\begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 1 & 1/2 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}$
Likewise
$e^{At} = \begin{pmatrix} 1 & t & t^2/2 \\ 0 & 1 & t \\ 0 & 0 & 1 \end{pmatrix}$
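SymPy can do this symbolically; a two-line check of Example 26 (Matrix.exp is a genuine SymPy method):

    from sympy import Matrix, symbols

    t = symbols("t")
    A = Matrix([[0, 1, 0], [0, 0, 1], [0, 0, 0]])
    print((A * t).exp())  # [[1, t, t**2/2], [0, 1, t], [0, 0, 1]]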
Example 27. Find $e^{\begin{pmatrix} 3 & 1 & 0 \\ 0 & 3 & 1 \\ 0 & 0 & 3 \end{pmatrix} t}$.
Because the two matrices below commute, the exponential of their sum is the product of
their exponentials:
$e^{\begin{pmatrix} 3 & 1 & 0 \\ 0 & 3 & 1 \\ 0 & 0 & 3 \end{pmatrix} t} = e^{\begin{pmatrix} 3 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 3 \end{pmatrix} t} e^{\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix} t} = e^{3t}\begin{pmatrix} 1 & t & t^2/2 \\ 0 & 1 & t \\ 0 & 0 & 1 \end{pmatrix}$
There is a sophisticated way to find eA which you will not be responsible for, but which is
interesting and should help your understanding of the subject. It also shows just how neat
math can be. A matrix is only diagonalizable if it has a basis of eigenvectors. However, we
can put non-diagonalizable matrices in a nice form as well. There are matrices called Jordan
blocks. A Jordan block is simply a scalar matrix with 1’s just above the main diagonal.
Some examples should give you a clear idea of what they are.
$\begin{pmatrix} 7 \end{pmatrix}, \quad \begin{pmatrix} 3 & 1 \\ 0 & 3 \end{pmatrix}, \quad \begin{pmatrix} -4 & 1 & 0 \\ 0 & -4 & 1 \\ 0 & 0 & -4 \end{pmatrix}, \quad \begin{pmatrix} 9 & 1 & 0 & 0 \\ 0 & 9 & 1 & 0 \\ 0 & 0 & 9 & 1 \\ 0 & 0 & 0 & 9 \end{pmatrix}, \quad \begin{pmatrix} 5 & 1 & 0 & 0 & 0 \\ 0 & 5 & 1 & 0 & 0 \\ 0 & 0 & 5 & 1 & 0 \\ 0 & 0 & 0 & 5 & 1 \\ 0 & 0 & 0 & 0 & 5 \end{pmatrix}$
Next, a Jordan matrix is a matrix which is built by arranging Jordan blocks down the
diagonal and filling out the rest of the matrix with zeroes. Here is an example of a Jordan
matrix made up of four Jordan blocks.
Example 30. $\begin{pmatrix} 3 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 3 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 3 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 3 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 2 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix}$
The blocks have sizes 1, 3, 2, 1 respectively.
Of course, the easiest Jordan matrices are diagonal matrices. For our purposes, there
are several very nice properties that Jordan matrices have. For instance, if J is the matrix
from Example 30, I can write down $e^{Jt}$ without doing any calculations at all.
$e^{Jt} = \begin{pmatrix} e^{3t} & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & e^{3t} & te^{3t} & t^2e^{3t}/2 & 0 & 0 & 0 \\ 0 & 0 & e^{3t} & te^{3t} & 0 & 0 & 0 \\ 0 & 0 & 0 & e^{3t} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & e^{2t} & te^{2t} & 0 \\ 0 & 0 & 0 & 0 & 0 & e^{2t} & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}$
One just handles the Jordan blocks separately and patches them together. Example 27
shows us how to handle Jordan blocks.
The second neat thing is the following theorem.
Theorem 12. Suppose A is a square matrix.
(1) There exists a matrix B and a Jordan matrix J such that A = BJB −1 .
(2) eAt = BeJt B −1 .
I won’t prove this theorem, but the second statement is easy if you write out the terms
of the power series. I might remark that J is referred to as the Jordan canonical form.
Alas, this does not really make the problem at hand any easier. To use this theorem
to solve systems of differential equations, you must find the matrices J and B. Well, the
columns of B are actually a basis of generalized eigenvectors and so one must actually go
through the process above to find B.
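SymPy will produce B and J for you; a sketch using the Example 21 matrix (jordan_form is a real SymPy method returning P and J with A = P J P⁻¹; the block order it chooses may differ from yours):

    from sympy import Matrix

    A = Matrix([[42, 30], [-60, -43]])
    B, J = A.jordan_form()
    print(J)                   # diagonal with entries -3 and 2: this A is diagonalizable
    print(B * J * B**-1 == A)  # True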
The other point about Jordan canonical forms is that they say that linear transformations
are actually quite easy to understand if you pick the right basis. They fall into a small
number of different groups.
So finding the determinant is a way of predicting how row reduction will turn out and
row reduction is a way to compute a determinant.
Finally, I should say a little about computing large determinants. In general, the way to
go is by row reduction.
Example 31. Find the determinant of the matrix $\begin{pmatrix} 1 & 2 & 3 & 4 & 10 \\ 2 & 4 & 6 & 9 & 16 \\ 3 & 6 & 8 & 11 & 29 \\ 4 & 7 & 3 & 2 & 18 \\ 4 & 8 & 11 & 15 & 21 \end{pmatrix}$. First we use the
"1" in the upper left corner to zero out the rest of the first column. This is completely
harmless by Property ??. We get
$\begin{pmatrix} 1 & 2 & 3 & 4 & 10 \\ 0 & 0 & 0 & 1 & -4 \\ 0 & 0 & -1 & -1 & -1 \\ 0 & -1 & -9 & -14 & -22 \\ 0 & 0 & -1 & -1 & -19 \end{pmatrix}$
Then we interchange the second and fourth rows, making a note that we have reversed the
sign of the determinant:
$\begin{pmatrix} 1 & 2 & 3 & 4 & 10 \\ 0 & -1 & -9 & -14 & -22 \\ 0 & 0 & -1 & -1 & -1 \\ 0 & 0 & 0 & 1 & -4 \\ 0 & 0 & -1 & -1 & -19 \end{pmatrix}$
Next we subtract the third row from the fifth row to create a zero in the (5, 3) place:
$\begin{pmatrix} 1 & 2 & 3 & 4 & 10 \\ 0 & -1 & -9 & -14 & -22 \\ 0 & 0 & -1 & -1 & -1 \\ 0 & 0 & 0 & 1 & -4 \\ 0 & 0 & 0 & 0 & -18 \end{pmatrix}$
Now we have an upper triangular matrix
whose determinant is (1)(−1)(−1)(1)(−18) = −18. Recalling that we reversed the sign at
the second step, we see that the determinant of the original matrix is +18.
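A quick check of Example 31 (exact arithmetic via SymPy):

    from sympy import Matrix

    M = Matrix([[1, 2, 3, 4, 10],
                [2, 4, 6, 9, 16],
                [3, 6, 8, 11, 29],
                [4, 7, 3, 2, 18],
                [4, 8, 11, 15, 21]])
    print(M.det())  # 18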
Life is probably easier if you avoid Type 2 row operations. It is not necessary to make
the pivots all equal to “1” and utilizing property 2 above gives you more to keep track of.
Here is an easy example to illustrate this.
Example 32. Find the determinant of the matrix $\begin{pmatrix} 4 & 7 & 9 \\ 2 & -1 & 14 \\ 3 & 29 & -4 \end{pmatrix}$. First we put zeros in
the first column below the 4 by subtracting $\frac{2}{4}$ times the first row from the second and $\frac{3}{4}$ times
the first row from the third. If it is easier to envision using integer multiples of $\begin{pmatrix} 1 & \frac{7}{4} & \frac{9}{4} \end{pmatrix}$,
by all means do so. However, it serves no purpose to actually change the first row and
it forces you to remember that you divided the determinant by 4 and so must multiply by
4 later. Just leave the 4 in the upper left corner. So we get $\begin{pmatrix} 4 & 7 & 9 \\ 0 & -\frac{9}{2} & \frac{19}{2} \\ 0 & \frac{95}{4} & -\frac{43}{4} \end{pmatrix}$. Then add
$\frac{95}{4} \div \frac{9}{2}$ times the second row to the third to get $\begin{pmatrix} 4 & 7 & 9 \\ 0 & -\frac{9}{2} & \frac{19}{2} \\ 0 & 0 & \frac{709}{18} \end{pmatrix}$. The determinant then is
just $(4)(-\frac{9}{2})(\frac{709}{18}) = -709$.
Yes, it is true that this problem is probably easier to do by basket-weaving, but one still
wants to know the method because row reduction is the superior technique for determinants
of size 4 × 4 or larger.
The alternative to row reduction is direct calculation. It would be very difficult to keep
track of all of the terms without being systematic. There is an approach which is systematic,
the cofactor method.
Definition. Let A be an n × n matrix. Let Aij denote the (n − 1) × (n − 1) matrix you
get by deleting row i and column j. Also let |A| denote the determinant of A. So |Aij| is
the determinant of Aij.
Theorem 14. Let A be an n × n matrix. Then, for any fixed choice of column j, $|A| =
\sum_{i=1}^{n} (-1)^{i+j} a_{ij} |A_{ij}|$. Also, for any fixed choice of row i, $|A| = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} |A_{ij}|$.
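Theorem 14 translates directly into a recursive function; a sketch (naive and exponentially slow, which is exactly the point made below):

    from sympy import Matrix

    def det_cofactor(A):
        """Expand |A| along the first row, per Theorem 14."""
        n = A.rows
        if n == 1:
            return A[0, 0]
        return sum((-1) ** j * A[0, j] * det_cofactor(A.minor_submatrix(0, j))
                   for j in range(n))

    print(det_cofactor(Matrix([[4, 2, 8], [0, 6, 3], [1, -7, 5]])))  # 162, as in Example 22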
I will go back to Example 31 to show what this theorem means. However, you should
quickly be convinced that this is a very bad way to do Example 31. Then I will offer another
example where this technique is actually helpful. The second matrix will have lots of zeroes
in it.
The determinant of $\begin{pmatrix} 1 & 2 & 3 & 4 & 10 \\ 2 & 4 & 6 & 9 & 16 \\ 3 & 6 & 8 & 11 & 29 \\ 4 & 7 & 3 & 2 & 18 \\ 4 & 8 & 11 & 15 & 21 \end{pmatrix}$ equals
$1 \cdot \det\begin{pmatrix} 4 & 6 & 9 & 16 \\ 6 & 8 & 11 & 29 \\ 7 & 3 & 2 & 18 \\ 8 & 11 & 15 & 21 \end{pmatrix} - 2 \cdot \det\begin{pmatrix} 2 & 3 & 4 & 10 \\ 6 & 8 & 11 & 29 \\ 7 & 3 & 2 & 18 \\ 8 & 11 & 15 & 21 \end{pmatrix} + 3 \cdot \det\begin{pmatrix} 2 & 3 & 4 & 10 \\ 4 & 6 & 9 & 16 \\ 7 & 3 & 2 & 18 \\ 8 & 11 & 15 & 21 \end{pmatrix} - 4 \cdot \det\begin{pmatrix} 2 & 3 & 4 & 10 \\ 4 & 6 & 9 & 16 \\ 6 & 8 & 11 & 29 \\ 8 & 11 & 15 & 21 \end{pmatrix} + 4 \cdot \det\begin{pmatrix} 2 & 3 & 4 & 10 \\ 4 & 6 & 9 & 16 \\ 6 & 8 & 11 & 29 \\ 7 & 3 & 2 & 18 \end{pmatrix}$
and there is still a lot of work ahead of us. What we did above is much much easier.
Example 33. Find the determinant of the matrix $\begin{pmatrix} 3 & 2 & 0 & 1 & 0 \\ 2 & 0 & 0 & 0 & 0 \\ 2 & 1 & 5 & 6 & 3 \\ 9 & 7 & 0 & 4 & 2 \\ 2 & 7 & 0 & 5 & 0 \end{pmatrix}$. Taking advantage
of the zeroes, I will first expand about the second row, then expand about the new second
column, then expand about the last column. In each case, there will only be one nonzero
term. Finally, I will compute the easy two by two determinant.
$\det\begin{pmatrix} 3 & 2 & 0 & 1 & 0 \\ 2 & 0 & 0 & 0 & 0 \\ 2 & 1 & 5 & 6 & 3 \\ 9 & 7 & 0 & 4 & 2 \\ 2 & 7 & 0 & 5 & 0 \end{pmatrix} = -2 \cdot \det\begin{pmatrix} 2 & 0 & 1 & 0 \\ 1 & 5 & 6 & 3 \\ 7 & 0 & 4 & 2 \\ 7 & 0 & 5 & 0 \end{pmatrix} = (-2)(5)\det\begin{pmatrix} 2 & 1 & 0 \\ 7 & 4 & 2 \\ 7 & 5 & 0 \end{pmatrix} =$
$(-2)(5)(-2)\det\begin{pmatrix} 2 & 1 \\ 7 & 5 \end{pmatrix} = (-2)(5)(-2)(2 \cdot 5 - 7) = 60.$