Linear Algebra Primer
Daniel S. Stutts, Ph.D.
Original Edition: 12/1991 Current Edition: 4/1/04

1 Introduction
This primer was written to provide a brief overview of the main concepts and methods in elementary
linear algebra. It was not intended to take the place of any of the many elementary linear algebra texts on
the market. It contains relatively few examples and no exercises. The interested reader will find more
in-depth coverage of these topics in introductory textbooks. Much of the material, including the order
in which it is presented, comes from Howard Anton's "Elementary Linear Algebra," 2nd Ed., John Wiley,
1977. Another excellent basic text is "Linear Algebra and Its Applications," by Charles G. Cullen. A
more advanced text is "Linear Algebra and Its Applications" by Gilbert Strang.

The author hopes that this primer will answer some of your questions as they arise, and provide
some motivation (prime the pump, so to speak) for you to explore the subject in more depth. At the very
least, you now have a list (albeit a short one) of references from which to obtain more in-depth explanations.

It should be noted that the examples given here have been motivated by the solution of consistent
systems of equations that have an equal number of unknowns and equations. Therefore, only the analysis
of square (n by n) matrices has been presented. Furthermore, only the properties of real matrices (those
with real elements) have been included.

1.1 Explanation of Notation Used


For clarity of notation, bold symbols are used to denote vectors and matrices. For matrices, upper case
bold letters are used, and for vectors, which are n × 1 matrices, bold lower case letters are used. Non-bold
symbols are used to denote scalar quantities.

Subscripts are used to denote elements of matrices or vectors. Superscripts (when not referring to
exponentiation) are used to identify eigenvectors and their respective components.

1.1.1 Indicial Notation


A matrix A may be described by indicial notation. The term located at the ith row and jth column is
denoted by the scalar a_{ij}.

Thus, the ij-th component of the sum of two matrices A and B may be written: [A + B]_{ij} = a_{ij} + b_{ij}

Example 1

C = A_{(2\times 2)} + B_{(2\times 2)} = \begin{bmatrix} a_{11} + b_{11} & a_{12} + b_{12} \\ a_{21} + b_{21} & a_{22} + b_{22} \end{bmatrix}

Hence, for example, c_{12} = a_{12} + b_{12}.
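As a quick numerical check, here is a minimal sketch in Python/NumPy (the element values are arbitrary and chosen only for illustration):

import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[10.0, 20.0],
              [30.0, 40.0]])

C = A + B                              # component-wise: c_ij = a_ij + b_ij
print(C)
print(C[0, 1] == A[0, 1] + B[0, 1])    # c12 = a12 + b12 -> True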

2 Linear Systems of Equations
The following system of equations

a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n = b_1
a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n = b_2
                      \vdots                                  (1)
a_{n1} x_1 + a_{n2} x_2 + \cdots + a_{nn} x_n = b_n

may be written in matrix form as

\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}    (2)
or
Ax = b (3)
where A is an n by n matrix and x and b are n by 1 matrices or vectors.
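As a practical aside, such a system is usually solved numerically; the following minimal NumPy sketch (using the coefficients of the 3 by 3 system that appears again in Example 7 below) is one way to do so:

import numpy as np

A = np.array([[2.0, 4.0, -2.0],
              [0.0, 2.0,  3.0],
              [1.0, 0.0,  5.0]])
b = np.array([18.0, -2.0, -7.0])

x = np.linalg.solve(A, b)        # solve the square system Ax = b
print(x)                         # -> [ 3.  2. -2.]
print(np.allclose(A @ x, b))     # verify the solution: True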

While the solution of systems of linear equations provides one significant motivation to study matrices
and their properties, there are numerous other applications for matrices. All applications of matrices
require a reasonable degree of understanding of matrix and vector properties.

3 Matrix Properties and Definitions


For any matrices A, B, and C, the following hold:

1. A + B = B + A

2. A + (B + C) = (A + B) + C

3. A(B + C) = AB + AC, provided that AB and AC are defined.

Definition 1 AB is defined if B has the same number of rows as A has columns.

A_{(m\times r)} B_{(r\times n)} = (AB)_{(m\times n)}

4. Identity Matrix: AI = IA = A where

 
I = \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix}

5. Zero Matrix: 0A = A0 = 0
 
0 = \begin{bmatrix} 0 & 0 & 0 & \cdots & 0 \\ 0 & 0 & 0 & \cdots & 0 \\ 0 & 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 0 \end{bmatrix}

6. A + 0 = A

3.1 The transpose operation and its properties


The transpose of a matrix A, written A^T, is the matrix A with its off-diagonal components reflected
across the main diagonal. Hence, defining the components of the matrix A as a_{ij} in indicial notation, the
components of A^T are given by a_{ji}.

3.1.1 Properties of the transpose operation


1. (A^T)^T = A
2. (A + B)^T = A^T + B^T
3. (kA)^T = kA^T
4. (AB)^T = B^T A^T

A matrix A is termed symmetric if A^T = A and skew-symmetric if A^T = -A. Clearly, a skew-symmetric
matrix can only have zero diagonal terms.
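These identities are easy to verify numerically. A minimal NumPy sketch (with arbitrary matrices) checks property 4 and the symmetric/skew-symmetric definitions:

import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[0.0, 1.0],
              [5.0, 2.0]])

print(np.allclose((A @ B).T, B.T @ A.T))   # (AB)^T = B^T A^T -> True

S = A + A.T                                # always symmetric
K = A - A.T                                # always skew-symmetric
print(np.allclose(S, S.T))                 # True
print(np.allclose(K, -K.T), np.diag(K))    # True, and the diagonal terms are zero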

3.2 Multiplication of a Matrix


There are a few rules for matrix and vector multiplication which must be considered in addition to the
rules for the more familiar scalar algebra. These are enumerated with examples below.
1. Multiplication by a scalar: The product of a scalar and a matrix is the matrix with each of its elements
multiplied by the scalar. For example:

\alpha A = [\alpha a_{ij}] = \begin{bmatrix} \alpha a_{11} & \alpha a_{12} \\ \alpha a_{21} & \alpha a_{22} \end{bmatrix}
2. Multiplication by a matrix: The product of two matrices A and B, when it is defined, is another
matrix C,
C = AB    (1)
where the components of C may be computed as follows:

c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}    (2)

Example 2

AB = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} = \begin{bmatrix} a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} \\ a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} \end{bmatrix}

Note that in general, matrices are not commutative over multiplication:

AB \neq BA    (3)

This fact leads to the definition of pre-multiplication and post-multiplication:

Definition 2 A matrix A is said to be pre-multiplied by a matrix B when B multiplies A from the
left, and post-multiplied by B when B multiplies A from the right.

Other terminology for the direction of multiplication in common use is left multiplication and right
multiplication. In Equation (3), on the r.h.s., A is being pre-multiplied by B, and on the l.h.s. A
is being post-multiplied by B.

3. Multiplication by a vector:
     
Ax = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} a_{11}x_1 + a_{12}x_2 \\ a_{21}x_1 + a_{22}x_2 \end{bmatrix}    (4)

Pre-multiplication of a matrix by a vector requires taking the transpose of the vector first in order
to comply with the rules of matrix multiplication.
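These multiplication rules are easy to check numerically. The following minimal NumPy sketch (with arbitrary element values) forms a product directly from the component formula, confirms that AB and BA differ in general, and shows the two matrix-vector products described above:

import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[0.0, 1.0],
              [5.0, 2.0]])

# Component formula: c_ij = sum over k of a_ik * b_kj
C = np.zeros((2, 2))
for i in range(2):
    for j in range(2):
        for k in range(2):
            C[i, j] += A[i, k] * B[k, j]
print(np.allclose(C, A @ B))        # True: matches the built-in product

print(np.allclose(A @ B, B @ A))    # False: multiplication is not commutative

x = np.array([[5.0],
              [6.0]])               # a 2-by-1 column vector
print(A @ x)                        # Ax, as in Equation (4)
print(x.T @ A)                      # pre-multiplication by the vector uses its transpose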

3.3 Matrix Inverse

The inverse of a square matrix A, written A^{-1}, satisfies A^{-1}A = AA^{-1} = I, provided A^{-1} exists.

3.4 The Determinant Operation


Recall from vector mechanics that in three-dimensional space the vector product of any two vectors
A = a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k} and B = b_1\mathbf{i} + b_2\mathbf{j} + b_3\mathbf{k} is defined as

 
A \times B = \det \begin{bmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{bmatrix} = (a_2 b_3 - a_3 b_2)\mathbf{i} - (a_1 b_3 - a_3 b_1)\mathbf{j} + (a_1 b_2 - a_2 b_1)\mathbf{k}    (5)

This method of computing the determinant is called cofactor expansion. A cofactor is the signed minor
of a given element in a matrix. A minor M_{ij} is the determinant of the submatrix which remains after
the ith row and the jth column of the matrix are deleted. In this case, we have

M_{11} = a_2 b_3 - a_3 b_2    (6)

M_{12} = a_1 b_3 - a_3 b_1    (7)

M_{13} = a_1 b_2 - a_2 b_1    (8)

The cofactors are given by

C_{ij} = (-1)^{i+j} M_{ij}    (9)

Hence,

C_{11} = (-1)^{1+1} M_{11} = M_{11}    (10)

C_{12} = (-1)^{1+2} M_{12} = -M_{12}    (11)

etc.

In the above example,

\det \begin{bmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{bmatrix} = C_{11}\mathbf{i} + C_{12}\mathbf{j} + C_{13}\mathbf{k}    (12)

 
Example 3 Let A = \begin{bmatrix} 2 & 1 \\ 3 & -2 \end{bmatrix}, then

\det A = 2(-2) - 1(3) = -7

 
Example 4 Let A = \begin{bmatrix} 2 & 1 & 0 \\ 3 & -2 & 1 \\ 1 & -1 & 2 \end{bmatrix}, then

\det A = 2(-2(2) - (-1)(1)) - 1(3(2) - 1(1)) + 0(3(-1) - (-2)(1)) = 2(-3) - 5 = -11
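Cofactor expansion translates directly into a short recursive routine. The following plain-Python sketch (expanding along the first row) reproduces the result of Example 4:

def det(A):
    # Determinant by cofactor expansion along the first row; A is a list of lists.
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in A[1:]]   # delete row 0 and column j
        total += (-1) ** j * A[0][j] * det(minor)          # signed cofactor times entry
    return total

A = [[2, 1, 0],
     [3, -2, 1],
     [1, -1, 2]]
print(det(A))    # -> -11.0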

Expansion by cofactors is mostly useful for small matrices (less than 4×4). For larger matrices, the
number of operations becomes prohibitively large. For example:

A 2 by 2 matrix requires 2 multiplications.

A 3 by 3 matrix requires 9 multiplications.

However, a 4 by 4 matrix requires the computation of 4 + 4! = 28 signed elementary products.

A 10 by 10 matrix would require 10 + 10! = 3,628,810 signed elementary products!

This trend suggests that soon even the largest and fastest computers would choke on such a
computation.

For large matrices, the determinant is best computed using row reduction.

Row reduction consists of using elementary row and column operations to reduce a matrix down to a
simpler form, usually upper or lower triangular form.

This is accomplished by multiplying one row by a constant and adding it to another row to produce a
zero at the desired position.

 
Example 5 Let A = \begin{bmatrix} 2 & 1 & 0 \\ 3 & -2 & 1 \\ 1 & -1 & 2 \end{bmatrix}

Reduce A to upper triangular form, i.e., all zeros under the main diagonal (2, -2, 2).

Multiplying row 1 by -1/2 and adding it to row 3 yields

\begin{bmatrix} 2 & 1 & 0 \\ 3 & -2 & 1 \\ 0 & -1.5 & 2 \end{bmatrix}

Similarly, multiplying row 1 by -3/2 and adding it to row 2 yields

\begin{bmatrix} 2 & 1 & 0 \\ 0 & -3.5 & 1 \\ 0 & -1.5 & 2 \end{bmatrix}

Multiplying row 2 by -\frac{1.5}{3.5} = -\frac{3}{7} and adding it to row 3 yields

\begin{bmatrix} 2 & 1 & 0 \\ 0 & -\frac{7}{2} & 1 \\ 0 & 0 & \frac{11}{7} \end{bmatrix}

The determinant is now easily computed by multiplying the elements of the main diagonal.

\det A = 2\left(-\frac{7}{2}\right)\left(\frac{11}{7}\right) = -11

This type of row reduction is called Gaussian elimination and is much more efficient than the cofactor
expansion technique for large matrices.
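A minimal NumPy sketch of this procedure is given below (no row interchanges are performed, so it assumes that every pivot encountered is nonzero, as is the case in Example 5):

import numpy as np

def det_by_elimination(A):
    # Reduce A to upper triangular form by adding multiples of one row to another;
    # such row additions do not change the determinant, which is then the product
    # of the diagonal entries.
    U = np.array(A, dtype=float)
    n = U.shape[0]
    for j in range(n - 1):
        for i in range(j + 1, n):
            U[i, :] -= (U[i, j] / U[j, j]) * U[j, :]   # zero the entry below the pivot
    return np.prod(np.diag(U))

A = [[2, 1, 0],
     [3, -2, 1],
     [1, -1, 2]]
print(det_by_elimination(A))    # -> -11.0 (to within roundoff)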

3.5 Properties of Determinant Operations


1. If A is a square matrix then det(A^T) = det(A)

2. det(kA) = k n det(A) where A is an n × n matrix and k is a scalar

3. det(AB) = det(A)det(B) where A and B are square matrices of the same size.

4. A square matrix A is invertible if and only if det(A) ≠ 0.

Proof 1 If A is invertible, then AA^{-1} = I ⇒ det(AA^{-1}) = det(A) det(A^{-1}) = det(I) = 1. Thus,
det(A^{-1}) = 1/det(A) ⇒ det(A) ≠ 0. □

An important implication of this result is the following.

5. For a homogeneous system of linear equations Ax = 0

There exists a nontrivial x (x ≠ 0) if and only if det(A) = 0.

Proof 2 Assume det(A) ≠ 0. Then A^{-1} exists, and A^{-1}Ax = Ix = x = 0, which contradicts the
assumption that x is nontrivial. Hence, x can be other than zero only if det(A) = 0, and hence, only if
A^{-1} does not exist. □

This result is used very often in applied mathematics, physics and engineering.

6. If A is invertible then A^{-1} = \frac{1}{\det(A)} \, \mathrm{adj}(A), where adj(A) is the adjoint of A.

Definition 3 The adjoint of a matrix A is defined as the transpose of the cofactor matrix of A.

Another way to calculate the inverse of a matrix is by Gaussian elimination. This method is easier to
apply on larger matrices.

Since A−1 A = I, we start with the matrix A which we want to invert on the left and the identity
matrix on the right. We then do elementary row operations (Gaussian Elimination) on the matrix while
simultaneously doing the same operations on I. This can be accomplished by adjoining the two matrices
to form a matrix of the form [A I].

Example 6

A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 5 & 3 \\ 1 & 0 & 8 \end{bmatrix}    (13)

Adjoining A with I yields

\left[ \begin{array}{ccc|ccc} 1 & 2 & 3 & 1 & 0 & 0 \\ 2 & 5 & 3 & 0 & 1 & 0 \\ 1 & 0 & 8 & 0 & 0 & 1 \end{array} \right]    (14)

Adding -2 times the first row to the second row and -1 times the first row to the third yields

\left[ \begin{array}{ccc|ccc} 1 & 2 & 3 & 1 & 0 & 0 \\ 0 & 1 & -3 & -2 & 1 & 0 \\ 0 & -2 & 5 & -1 & 0 & 1 \end{array} \right]    (15)

Adding 2 times the 2nd row to the third yields

\left[ \begin{array}{ccc|ccc} 1 & 2 & 3 & 1 & 0 & 0 \\ 0 & 1 & -3 & -2 & 1 & 0 \\ 0 & 0 & -1 & -5 & 2 & 1 \end{array} \right]

Multiplying the third row by -1 yields

\left[ \begin{array}{ccc|ccc} 1 & 2 & 3 & 1 & 0 & 0 \\ 0 & 1 & -3 & -2 & 1 & 0 \\ 0 & 0 & 1 & 5 & -2 & -1 \end{array} \right]

Adding 3 times the third row to the second and -3 times the third row to the first yields

\left[ \begin{array}{ccc|ccc} 1 & 2 & 0 & -14 & 6 & 3 \\ 0 & 1 & 0 & 13 & -5 & -3 \\ 0 & 0 & 1 & 5 & -2 & -1 \end{array} \right]

Finally, adding -2 times the second row to the first yields

\left[ \begin{array}{ccc|ccc} 1 & 0 & 0 & -40 & 16 & 9 \\ 0 & 1 & 0 & 13 & -5 & -3 \\ 0 & 0 & 1 & 5 & -2 & -1 \end{array} \right]

Thus,

A^{-1} = \begin{bmatrix} -40 & 16 & 9 \\ 13 & -5 & -3 \\ 5 & -2 & -1 \end{bmatrix}
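The adjoin-and-reduce procedure is straightforward to automate. A minimal NumPy sketch (again without pivoting, so nonzero pivots are assumed) applied to the matrix of Example 6:

import numpy as np

def inverse_gauss_jordan(A):
    # Row-reduce the adjoined matrix [A | I] until the left half is I;
    # the right half is then A^-1.
    A = np.array(A, dtype=float)
    n = A.shape[0]
    M = np.hstack([A, np.eye(n)])              # form [A | I]
    for j in range(n):
        M[j, :] /= M[j, j]                     # scale the pivot row
        for i in range(n):
            if i != j:
                M[i, :] -= M[i, j] * M[j, :]   # clear the rest of column j
    return M[:, n:]

A = [[1, 2, 3],
     [2, 5, 3],
     [1, 0, 8]]
print(inverse_gauss_jordan(A))   # -> [[-40. 16. 9.], [13. -5. -3.], [5. -2. -1.]]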

3.5.1 Cramer’s Rule


If Ax = b is a system of n linear equations in n unknowns such that det(A) ≠ 0, then the system has a
unique solution, which is given by

x_1 = \frac{\det(A_1)}{\det(A)}, \quad x_2 = \frac{\det(A_2)}{\det(A)}, \quad x_3 = \frac{\det(A_3)}{\det(A)}    (16)

where A_j is the matrix obtained by replacing the entries of the jth column of A by the entries in the vector
b, where

b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}    (17)

Example 7 Consider the following system:

2x1 + 4x2 − 2x3 = 18


2x2 + 3x3 = −2
x1 + 5x3 = −7

solve for x1 , x2 and x3 .

Solution: Recasting the system in matrix-vector form, we have

\begin{bmatrix} 2 & 4 & -2 \\ 0 & 2 & 3 \\ 1 & 0 & 5 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 18 \\ -2 \\ -7 \end{bmatrix}

Next, we define the three matrices formed by replacing in turn each of the columns of A with b:

   
A_1 = \begin{bmatrix} 18 & 4 & -2 \\ -2 & 2 & 3 \\ -7 & 0 & 5 \end{bmatrix}, \quad A_2 = \begin{bmatrix} 2 & 18 & -2 \\ 0 & -2 & 3 \\ 1 & -7 & 5 \end{bmatrix}, \quad A_3 = \begin{bmatrix} 2 & 4 & 18 \\ 0 & 2 & -2 \\ 1 & 0 & -7 \end{bmatrix}
Next, we compute the individual determinants:

det(A) = 2[2(5) − (−2)(1)] − 3[2(0) − 4(1)] = 36


det(A1 ) = (−7)[4(3) − (−2)(2)] + 5[18(2) − 4(−2)] = 108
det(A2 ) = 2[−2(5) − (3)(−7)] + 1[18(3) − (−2)(−2)] = 72
det(A3 ) = 2[2(−7) − (−2)(0)] + 1[4(−2) − 18(2)] = −72

Thus,

x_1 = \frac{\det(A_1)}{\det(A)} = \frac{108}{36} = 3, \quad x_2 = \frac{\det(A_2)}{\det(A)} = \frac{72}{36} = 2, \quad x_3 = \frac{\det(A_3)}{\det(A)} = \frac{-72}{36} = -2    (18)
Note that we took advantage of any zero elements by expanding the cofactors along the rows or columns
that contained them.

Cramer's rule is particularly efficient to use on 2 × 2 systems.

Consider a general 2nd-order example

Example 8

\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}    (19)

\det(A) = a_{11}a_{22} - a_{12}a_{21}, \quad \det(A_1) = b_1 a_{22} - a_{12}b_2, \quad \det(A_2) = a_{11}b_2 - b_1 a_{21}    (20)

Thus,

x_1 = \frac{\det(A_1)}{\det(A)} = \frac{a_{22}b_1 - a_{12}b_2}{a_{11}a_{22} - a_{12}a_{21}}, \quad x_2 = \frac{\det(A_2)}{\det(A)} = \frac{a_{11}b_2 - b_1 a_{21}}{a_{11}a_{22} - a_{12}a_{21}}    (21)
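Cramer's rule is also simple to code. The sketch below (using NumPy's determinant routine) reproduces the solution of Example 7:

import numpy as np

def cramer(A, b):
    # Solve Ax = b by Cramer's rule; assumes det(A) is nonzero.
    A = np.array(A, dtype=float)
    b = np.array(b, dtype=float)
    d = np.linalg.det(A)
    x = np.empty(len(b))
    for j in range(len(b)):
        Aj = A.copy()
        Aj[:, j] = b                     # replace the j-th column of A with b
        x[j] = np.linalg.det(Aj) / d
    return x

A = [[2, 4, -2],
     [0, 2, 3],
     [1, 0, 5]]
b = [18, -2, -7]
print(cramer(A, b))    # -> [ 3.  2. -2.]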

3.6 The Characteristic Polynomial and Eigenvalues and Vectors


Consider the system
Ax = λx    (22)
where λ is a scalar value. Equivalently, we may write

(A − λI)x = 0 (23)
Hence, from property 5 of Section 3.5, there exists a nontrivial x if and only if

det(A − λI) = 0 (24)

Evaluation of the above results in a polynomial in λ. This is the so-called characteristic polynomial,
and its roots λ_i are the characteristic values or eigenvalues. Evaluation of (24) yields

\lambda^n + c_1 \lambda^{n-1} + c_2 \lambda^{n-2} + \cdots + c_n = 0    (25)

Furthermore, the solution of (23), x^i, corresponding to the ith eigenvalue λ_i, is the ith eigenvector of the
matrix A. It can be shown that the matrix A itself satisfies the characteristic polynomial:

A^n + c_1 A^{n-1} + c_2 A^{n-2} + \cdots + c_n I = 0

This result is known as the Cayley-Hamilton theorem. It may be shown that the matrix A is also annihilated
by a minimum polynomial of degree less than or equal to that of the characteristic polynomial.
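As a numerical illustration (a minimal NumPy sketch using the matrix of Example 9 below), the characteristic polynomial coefficients can be computed and the Cayley-Hamilton theorem checked directly:

import numpy as np

A = np.array([[4.0, -5.0],
              [1.0, -2.0]])

c = np.poly(A)              # coefficients of the characteristic polynomial: [ 1. -2. -3.]
print(c)

# Cayley-Hamilton: A^2 + c1*A + c2*I = 0
print(np.allclose(c[0] * (A @ A) + c[1] * A + c[2] * np.eye(2), 0))   # True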

Example 9 Find the eigenvalues and eigenvectors of


 
A = \begin{bmatrix} 4 & -5 \\ 1 & -2 \end{bmatrix}
Solution:
     
\det\left(\begin{bmatrix} 4 & -5 \\ 1 & -2 \end{bmatrix} - \lambda \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\right) = \det\begin{bmatrix} 4-\lambda & -5 \\ 1 & -2-\lambda \end{bmatrix} = (4-\lambda)(-2-\lambda) + 5 = 0

\lambda^2 - 2\lambda - 3 = 0

or

(\lambda - 3)(\lambda + 1) = 0
Thus, λ_1 = −1 and λ_2 = 3. The eigenvector of A corresponding to λ = −1 may be found as follows:

(A - \lambda I)x^1 = \begin{bmatrix} 4-(-1) & -5 \\ 1 & -2-(-1) \end{bmatrix} \begin{bmatrix} x_1^1 \\ x_2^1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}

or

\begin{bmatrix} 5 & -5 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} x_1^1 \\ x_2^1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}

which has an obvious solution of

x^1 = c_1 \begin{bmatrix} 1 \\ 1 \end{bmatrix}

where c1 is any scalar.

Similarly, substituting λ_2 = 3 yields

\begin{bmatrix} 1 & -5 \\ 1 & -5 \end{bmatrix} \begin{bmatrix} x_1^2 \\ x_2^2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}

Thus,

x^2 = c_2 \begin{bmatrix} 5 \\ 1 \end{bmatrix}
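The same results can be obtained with NumPy's eigenvalue routine (a minimal sketch; numerical eigenvectors come back normalized to unit length, so they differ from x^1 and x^2 only by the arbitrary constants c_1 and c_2):

import numpy as np

A = np.array([[4.0, -5.0],
              [1.0, -2.0]])

vals, vecs = np.linalg.eig(A)     # columns of vecs are the eigenvectors
print(vals)                       # -> [ 3. -1.] (ordering may vary)

for lam, v in zip(vals, vecs.T):
    print(np.allclose(A @ v, lam * v))   # each column satisfies Ax = lambda x: True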

The calculation of eigenvalues and eigenvectors has many applications. One important application is the
similarity transformation. Under certain conditions, a general system of equations may be transformed
into a diagonal system. The most important case is that of symmetric matrices, which will be discussed
later. In other words, the system Ax = b may be transformed into an equivalent system Dy = c where
D is a diagonal matrix, making the solution of Dy = c especially easy. Two square matrices A and B are
said to be similar if there is an invertible matrix P such that A = PBP^{-1}; similar matrices have the same
determinant and the same eigenvalues. It turns out that an n × n matrix A is diagonalizable if and only if
A has n linearly independent eigenvectors.

Example 10 Since the previous example had two independent eigenvectors (i.e., x^1 ≠ Kx^2 for any scalar
K),

A = \begin{bmatrix} 4 & -5 \\ 1 & -2 \end{bmatrix}    (26)

should be diagonalizable. A matrix P composed of x^1 and x^2 as its two columns will diagonalize A. We
show that this is so by trying it!

P^{-1}AP = \begin{bmatrix} -\frac{1}{4} & \frac{5}{4} \\ \frac{1}{4} & -\frac{1}{4} \end{bmatrix} \begin{bmatrix} 4 & -5 \\ 1 & -2 \end{bmatrix} \begin{bmatrix} 1 & 5 \\ 1 & 1 \end{bmatrix} = \frac{1}{4}\begin{bmatrix} -1 & 5 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} -1 & 15 \\ -1 & 3 \end{bmatrix} = \begin{bmatrix} -1 & 0 \\ 0 & 3 \end{bmatrix} = D

We see that D is composed of λ_1 and λ_2 on its main diagonal and zeros elsewhere. Hence, the matrix
A was indeed similar to a diagonal matrix.
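The diagonalization can also be checked numerically; this short sketch builds P from the two eigenvectors found in Example 9 and verifies that P^{-1}AP is diagonal:

import numpy as np

A = np.array([[4.0, -5.0],
              [1.0, -2.0]])
P = np.array([[1.0, 5.0],          # columns are the eigenvectors x^1 and x^2
              [1.0, 1.0]])

D = np.linalg.inv(P) @ A @ P
print(np.round(D, 10))             # -> [[-1.  0.], [ 0.  3.]]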

3.7 Special Properties of Symmetric Matrices
Symmetric matrices have several special properties. The principal ones for an n by n symmetric matrix
are enumerated below:

1. Symmetric real matrices (those with real elements) have n real eigenvalues.

2. Eigenvectors corresponding to distinct real roots are orthogonal.

Proof 3 If A = A^T, and
Ax^1 = \lambda_1 x^1    (27)
and
Ax^2 = \lambda_2 x^2    (28)
then, pre-multiplying (27) by x^{2T} and (28) by x^{1T},
x^{2T} A x^1 = \lambda_1 x^{2T} x^1    (29)
and
x^{1T} A x^2 = \lambda_2 x^{1T} x^2    (30)
Since x^{2T} x^1 = x^{1T} x^2 and, by the symmetry of A,
x^{2T} A x^1 = x^{1T} A x^2    (31)
subtraction of (29) from (30) yields

(\lambda_2 - \lambda_1)\, x^{1T} x^2 = 0    (32)

Hence, if λ_1 ≠ λ_2, x^1 and x^2 are orthogonal as claimed. □

3. If λr is a root of the characteristic polynomial of algebraic multiplicity m, there exist m independent


eigenvectors corresponding to λr .

One of the most important consequences of the above for symmetric matrices is that all symmetric
matrices are similar to a diagonal matrix. This fact has powerful consequences in the solution of systems of
linear ordinary differential equations with constant coefficients which result from the application of
Newton's 2nd law, or Hamilton's principle. Essentially, such systems, which usually result from symmetric
operators, may be uncoupled by a similarity transformation, and hence, each ordinary differential
equation solved individually. Exceptions to this rule include systems modeled with general viscous damping,
and those with gyroscopic inertial terms.

The following examples were contributed by Dr. Geroid P. MacSithigh.

Example 11 (symmetric)

A = \begin{bmatrix} 4 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 5 \end{bmatrix}

Characteristic polynomial: (\lambda - 4)^2 (\lambda - 5) = 0

\lambda_1 = 5, \quad \lambda_2 = \lambda_3 = 4

\lambda_1 = 5: \quad x^1 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}

\lambda_2 = \lambda_3 = 4: \quad x^2 = \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix}, \quad x^3 = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}    (33)

Thus, the repeated root λ = 4 still has two independent (and, here, orthogonal) eigenvectors.

Example 12 (non-symmetric)

A = \begin{bmatrix} 5 & 0 & 0 \\ 4 & 5 & 0 \\ 0 & 0 & 3 \end{bmatrix}

Characteristic polynomial: (\lambda - 5)^2 (\lambda - 3) = 0

\lambda_1 = 3, \quad \lambda_2 = \lambda_3 = 5

\lambda_1 = 3: \quad x^1 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}

\lambda_2 = 5: \quad x^2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}

Thus, there is only one independent eigenvector corresponding to the repeated root λ = 5.
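Both examples can be checked with NumPy (a minimal sketch; eigh is the routine intended for symmetric matrices and returns orthonormal eigenvectors):

import numpy as np

A_sym = np.diag([4.0, 4.0, 5.0])                # Example 11
vals, vecs = np.linalg.eigh(A_sym)
print(vals)                                     # -> [4. 4. 5.]: all real
print(np.allclose(vecs.T @ vecs, np.eye(3)))    # eigenvectors are orthonormal: True

A_non = np.array([[5.0, 0.0, 0.0],              # Example 12
                  [4.0, 5.0, 0.0],
                  [0.0, 0.0, 3.0]])
# A - 5I has rank 2, so its null space is one-dimensional:
print(np.linalg.matrix_rank(A_non - 5.0 * np.eye(3)))   # -> 2: only one independent
                                                         #    eigenvector for lambda = 5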

4 Bibliography
1. Anton, Howard, "Elementary Linear Algebra," 2nd Ed., 1977, John Wiley & Sons.

2. Cullen, Charles G., "Matrices and Linear Transformations," 2nd Ed., 1972, Addison-Wesley. Reprinted
by Dover, 1990.

3. Strang, Gilbert, "Linear Algebra and Its Applications," 3rd Ed., 1988, International Thomson
Publishing.
