Linear Algebra
MTH 706
Lecture 1
History
Algebra is named in honor of Mohammed Ibn Musa al-Khowarizmi. Around 825, he
wrote a book entitled Hisab al-jabr w'al-muqabalah ("the science of reduction and
cancellation"). His book, Al-jabr, presented rules for solving equations.
Algebra is a branch of Mathematics that uses mathematical statements to describe
relationships between things that vary over time, for example the relationship between
the supply of an object and its price. When we use a mathematical statement to describe
a relationship, we often use letters to represent the quantity that varies, since it is not a
fixed amount. These letters and symbols are referred to as variables.
Algebra is a part of mathematics in which unknown quantities are found with the help of
relations between the unknown and known.
In algebra, letters are sometimes used in place of numbers.
The mathematical statements that describe relationships are expressed using algebraic
terms, expressions, or equations (mathematical statements containing letters or symbols
to represent numbers). Before we use algebra to find information about these kinds of
relationships, it is important to first introduce some basic terminology.
Algebraic Term
The basic unit of an algebraic expression is a term. In general, a term is either a single
number or the product of a number and one or more variables.
Study of Algebra
Today, algebra is the study of the properties of operations on numbers. Algebra
generalizes arithmetic by using symbols, usually letters, to represent numbers or
unknown quantities.
Algebraic Expressions
One of the most important problems in mathematics is that of solving systems of linear
equations. It turns out that such problems arise frequently in applications of mathematics
in the physical sciences, social sciences, and engineering. Stated in its simplest terms, the
world is not linear, but many of the problems that we know how to solve exactly are the
linear ones. In practice, this often means that non-linear problems can be solved only by
recasting them as linear systems. A comprehensive study of linear systems also gives a
rich, formal structure to the analytic geometry and the solutions of 2x2 and 3x3 systems
of linear equations learned in previous classes.
Linear algebra is exactly what the name suggests. Simply put, it is the algebra of systems
of linear equations. While you could solve a system of, say, five linear equations in five
unknowns by hand, it would take a considerable amount of time. With linear algebra we
develop techniques to solve m linear equations in n unknowns, or to show that no solution
exists. We can even describe situations where infinitely many solutions exist, and describe
them geometrically.
Linear algebra is the study of linear sets of equations and their transformation properties.
Linear algebra, sometimes disguised as matrix theory, considers sets and functions that
preserve linear structure. In practice this includes a very wide portion of mathematics!
Thus linear algebra includes axiomatic treatments, computational matters, algebraic
structures, and even parts of geometry; moreover, it provides tools used for analyzing
differential equations, statistical processes, and even physical phenomena.
Linear Algebra consists of studying matrix calculus. It formalizes and gives a geometrical
interpretation of the resolution of systems of equations. It creates a formal link between
matrix calculus and the use of linear and quadratic transformations. It develops the idea
of trying to solve and analyze systems of linear equations.
Applications of Linear algebra
Linear algebra makes it possible to work with large arrays of data. It has many
applications in many diverse fields, such as
• Computer Graphics,
• Electronics,
• Chemistry,
• Biology,
• Differential Equations,
• Economics,
• Business,
• Psychology,
• Engineering,
• Analytic Geometry,
• Chaos Theory,
• Cryptography,
• Fractal Geometry,
• Game Theory,
• Graph Theory,
• Linear Programming,
• Operations Research
It is very important that the theory of linear algebra is understood first, so that the
concepts are clear before computational work is started. Some of you might want to just
use the computer and skip the theory and proofs, but if you don't understand the theory,
it can be very hard to appreciate and interpret computer results.
Why use Linear Algebra?
Linear Algebra allows for formalizing and solving many typical problems in different
engineering topics. It is generally the case that (input or output) data from an experiment
is given in a discrete form (discrete measurements). Linear Algebra is then useful for
solving problems in applications such as Physics, Fluid Dynamics, Signal Processing and,
more generally, Numerical Analysis.
Linear algebra is not like elementary algebra. It is the mathematics of linear spaces and
linear functions, so we will encounter the term "linear" a lot. Since the concept of linearity is fundamental
to any type of mathematical analysis, this subject lays the foundation for many branches
of mathematics.
Objects of study in linear algebra
Linear algebra merits study if only because of its ubiquity in mathematics and its
applications. The broadest range of applications is through the concept of vector spaces
and their transformations. These are the central objects of study in linear algebra.
8. Linear algebra is part of and motivates much abstract algebra. Vector spaces
form the basis from which the important algebraic notion of module has been
abstracted.
9. Vector spaces appear in the study of differential geometry through the tangent
bundle of a manifold.
10. Many mathematical models, especially discrete ones, use matrices to represent
critical relationships and processes. This is especially true in engineering as
well as in economics and other social sciences.
There are two principal aspects of linear algebra: theoretical and computational. A major
part of mastering the subject consists in learning how these two aspects are related and
how to move from one to the other.
Many computations are similar to each other and can therefore be confusing without a
reasonable grasp of their theoretical context and significance; it is then very tempting to
draw false conclusions.
On the other hand, while many statements are easier to express elegantly and to
understand from a purely theoretical point of view, to apply them to concrete problems
you will need to “get your hands dirty”. Once you have understood the theory sufficiently
and appreciate the methods of computation, you will be well placed to use software
effectively, where possible, to handle large or complex calculations.
Course Segments
The course is covered in 45 lectures spanning six major segments, which are given
below:
1. Linear Equations
2. Matrix Algebra
3. Determinants
4. Vector spaces
5. Eigenvalues and Eigenvectors, and
6. Orthogonal sets
Course Objectives
The main purpose of the course is to introduce the concepts of linear algebra, to explain
the underlying theory and the computational techniques, and then to try to apply them to
real-life problems. The major course objectives are as under:
I am indebted to several authors whose books I have freely used to prepare the lectures
that follow; the lectures are based on material taken from those books.
I have taken the structure of the course as proposed in the book of David C. Lay, and I
will be following this book. I suggest that students purchase this book, which is easily
available in the market and does not cost much. For further study and supplementation,
students can consult any of the other books mentioned.
I strongly suggest that students also browse the Internet; there is plenty of supporting
material available. In particular, I would suggest the website of David C. Lay,
www.laylinalgebra.com, where the study guide and transparencies are readily available.
Another very useful website is www.wiley.com/college/anton, which contains a variety of
useful material including the data sets. A number of other books are also available in the
market and on the internet with free access.
I will try to keep the treatment simple and straightforward. The lectures will be presented
in simple Urdu and easy English, and they are supported by handouts in the form of
lecture notes. The theory will be explained with the help of examples, and there will be
enough exercises to practice with. Students are advised to go through the course on a
daily basis and do the exercises regularly.
The course will be spread over 45 lectures. Lectures 1 and 2 are introductory, and
Lecture 45 will be the summary. The first two lectures lay the foundations and provide an
overview of the course; they are important from the conceptual point of view, and I
suggest that these two lectures be viewed again and again.
The course will be interesting and enjoyable if the student follows it regularly and
completes the exercises as they come along. Following the tradition of a semester or term
system, there will be a series of assignments (at most eight) and a midterm exam. Finally,
there will be a terminal examination.
The assignments carry weight and therefore have to be taken seriously.
Lecture 2
Background
Introduction to Matrices
Matrix: A matrix is a collection of numbers or functions arranged into rows and columns.
Matrices are denoted by capital letters A, B, ..., Y, Z. The numbers or functions are called
the elements of the matrix and are denoted by small letters a, b, ..., y, z.
Rows and Columns The horizontal and vertical lines in a matrix are, respectively, called the
rows and columns of the matrix.
Order of a Matrix: The size (or dimension) of a matrix is called the order of the matrix.
The order is based on the number of rows and the number of columns, and is written as
r × c, where r is the number of rows and c the number of columns.
If a matrix has m rows and n columns then we say that the size or order of the matrix
is m × n. If A is a matrix having m rows and n columns, it can be written as
$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$
The element, or entry, in the ith row and jth column of an m × n matrix A is written as $a_{ij}$.
For example, the matrix $A = \begin{bmatrix} 2 & -1 & 3 \\ 0 & 4 & 6 \end{bmatrix}$ has two rows and three columns, so the order of A is 2 × 3.
Square Matrix: A matrix with an equal number of rows and columns is called a square matrix.
For example, the matrix $A = \begin{bmatrix} 4 & 7 & -8 \\ 9 & 3 & 5 \\ 1 & -1 & 2 \end{bmatrix}$ has three rows and three columns, so it is a square matrix of order 3.
Equality of Matrices
Example: The matrices $A = \begin{bmatrix} 4 & 7 & -8 \\ 9 & 3 & 5 \\ 1 & -1 & 2 \end{bmatrix}$ and $B = \begin{bmatrix} 4 & 7 & -8 \\ 9 & 3 & 5 \\ 1 & -1 & 2 \end{bmatrix}$ are equal matrices
(i.e. A = B) because they have the same order and the same corresponding elements.
Column Matrix: A column matrix X is any matrix having n rows and only one column.
Thus the column matrix X can be written as
$X = \begin{bmatrix} b_{11} \\ b_{21} \\ b_{31} \\ \vdots \\ b_{n1} \end{bmatrix} = [b_{i1}]_{n \times 1}$
A column matrix is also called a column vector or simply a vector.
Multiple of a Matrix: If we multiply a matrix by a constant k, then each element of the
matrix is multiplied by k. Notice that the product kA is the same as the product Ak;
therefore, we can write kA = Ak.
Example 1
(a) $5 \begin{bmatrix} 2 & -3 \\ 4 & -1 \\ 1/5 & 6 \end{bmatrix} = \begin{bmatrix} 10 & -15 \\ 20 & -5 \\ 1 & 30 \end{bmatrix}$
(b) $e^{t} \begin{bmatrix} 1 \\ -2 \\ 4 \end{bmatrix} = \begin{bmatrix} e^{t} \\ -2e^{t} \\ 4e^{t} \end{bmatrix}$
(c) $\begin{bmatrix} 2e^{-3t} \\ 5e^{-3t} \end{bmatrix} = e^{-3t} \begin{bmatrix} 2 \\ 5 \end{bmatrix}$
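Scalar multiplication is easy to check numerically. The following is a minimal sketch using NumPy (the choice of NumPy is an assumption; the notes themselves do not prescribe any software):

```python
import numpy as np

# Multiplying a matrix by a constant k multiplies every element by k.
A = np.array([[2.0, -3.0],
              [4.0, -1.0],
              [0.2,  6.0]])   # 0.2 = 1/5, the matrix of Example 1(a)
k = 5
print(k * A)                      # [[10, -15], [20, -5], [1, 30]]
print(np.allclose(k * A, A * k))  # True: kA = Ak
```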
Addition of Matrices: Only matrices of the same order may be added, by adding
corresponding elements. If $A = [a_{ij}]$ and $B = [b_{ij}]$ are two m × n matrices then
$A + B = [a_{ij} + b_{ij}]$. Obviously, the order of the matrix A + B is m × n.
Example 3: Write the following single column matrix as the sum of three column vectors:
$\begin{bmatrix} 3t^2 - 2e^t \\ t^2 + 7t \\ 5t \end{bmatrix}$
Solution:
$\begin{bmatrix} 3t^2 - 2e^t \\ t^2 + 7t \\ 5t \end{bmatrix} = \begin{bmatrix} 3t^2 \\ t^2 \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ 7t \\ 5t \end{bmatrix} + \begin{bmatrix} -2e^t \\ 0 \\ 0 \end{bmatrix} = t^2 \begin{bmatrix} 3 \\ 1 \\ 0 \end{bmatrix} + t \begin{bmatrix} 0 \\ 7 \\ 5 \end{bmatrix} + e^t \begin{bmatrix} -2 \\ 0 \\ 0 \end{bmatrix}$
Multiplication of Matrices: We can multiply two matrices if and only if the number of
columns in the first matrix equals the number of rows in the second matrix; otherwise the
product of the two matrices is not possible. In other words, if the order of the matrix A is
m × n, then for the product AB to be defined the order of the matrix B must be n × p;
the order of the product matrix AB is then m × p. Thus
$A_{m \times n} \cdot B_{n \times p} = C_{m \times p}$, where the entries of C = AB are given by $c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}$.
Example 4: If possible, find the products AB and BA, when
(a) $A = \begin{bmatrix} 4 & 7 \\ 3 & 5 \end{bmatrix}$, $B = \begin{bmatrix} 9 & -2 \\ 6 & 8 \end{bmatrix}$
(b) $A = \begin{bmatrix} 5 & 8 \\ 1 & 0 \\ 2 & 7 \end{bmatrix}$, $B = \begin{bmatrix} -4 & -3 \\ 2 & 0 \end{bmatrix}$
Solution (a): The matrices A and B are square matrices of order 2. Therefore, both of the
products AB and BA are possible.
$AB = \begin{bmatrix} 4 & 7 \\ 3 & 5 \end{bmatrix} \begin{bmatrix} 9 & -2 \\ 6 & 8 \end{bmatrix} = \begin{bmatrix} 4 \cdot 9 + 7 \cdot 6 & 4 \cdot (-2) + 7 \cdot 8 \\ 3 \cdot 9 + 5 \cdot 6 & 3 \cdot (-2) + 5 \cdot 8 \end{bmatrix} = \begin{bmatrix} 78 & 48 \\ 57 & 34 \end{bmatrix}$
Similarly, $BA = \begin{bmatrix} 9 & -2 \\ 6 & 8 \end{bmatrix} \begin{bmatrix} 4 & 7 \\ 3 & 5 \end{bmatrix} = \begin{bmatrix} 9 \cdot 4 + (-2) \cdot 3 & 9 \cdot 7 + (-2) \cdot 5 \\ 6 \cdot 4 + 8 \cdot 3 & 6 \cdot 7 + 8 \cdot 5 \end{bmatrix} = \begin{bmatrix} 30 & 53 \\ 48 & 82 \end{bmatrix}$
Note: From the above example it is clear that matrix multiplication is generally not
commutative, i.e. AB ≠ BA.
(b) The product AB is possible, as the number of columns in the matrix A and the number
of rows in B are both 2. However, the product BA is not possible, because the number of
columns in the matrix B (two) and the number of rows in A (three) are not the same.
$AB = \begin{bmatrix} 5 & 8 \\ 1 & 0 \\ 2 & 7 \end{bmatrix} \begin{bmatrix} -4 & -3 \\ 2 & 0 \end{bmatrix} = \begin{bmatrix} 5 \cdot (-4) + 8 \cdot 2 & 5 \cdot (-3) + 8 \cdot 0 \\ 1 \cdot (-4) + 0 \cdot 2 & 1 \cdot (-3) + 0 \cdot 0 \\ 2 \cdot (-4) + 7 \cdot 2 & 2 \cdot (-3) + 7 \cdot 0 \end{bmatrix} = \begin{bmatrix} -4 & -15 \\ -4 & -3 \\ 6 & -6 \end{bmatrix}$
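The products in Example 4(a) can be reproduced with NumPy's @ operator (again an assumed tool, used here only for illustration):

```python
import numpy as np

A = np.array([[4, 7],
              [3, 5]])
B = np.array([[9, -2],
              [6, 8]])

print(A @ B)                         # [[78 48] [57 34]]
print(B @ A)                         # [[30 53] [48 82]]
print(np.array_equal(A @ B, B @ A))  # False: AB != BA in general
```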
Example 5
(a) $\begin{bmatrix} 2 & -1 & 3 \\ 0 & 4 & 5 \\ 1 & -7 & 9 \end{bmatrix} \begin{bmatrix} -3 \\ 6 \\ 4 \end{bmatrix} = \begin{bmatrix} 2 \cdot (-3) + (-1) \cdot 6 + 3 \cdot 4 \\ 0 \cdot (-3) + 4 \cdot 6 + 5 \cdot 4 \\ 1 \cdot (-3) + (-7) \cdot 6 + 9 \cdot 4 \end{bmatrix} = \begin{bmatrix} 0 \\ 44 \\ -9 \end{bmatrix}$
(b) $\begin{bmatrix} -4 & 2 \\ 3 & 8 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} -4x + 2y \\ 3x + 8y \end{bmatrix}$
Identity Matrix: A square matrix whose main-diagonal entries are all 1 and whose
remaining entries are all 0 is called an identity matrix and is denoted by I. For example,
the 4 × 4 identity matrix is
$I = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$
Zero Matrix or Null Matrix: A matrix whose entries are all zero is called a zero matrix or
null matrix, and it is denoted by O. For example
$O = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$; $O = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$; $O = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$
and so on. If A and O are matrices of the same order, then A + O = O + A = A.
Associative Law: Matrix multiplication is associative. This means that if A, B and
C are m × p, p × r and r × n matrices, then A(BC) = (AB)C.
The result is an m × n matrix. This can be verified by taking any three matrices that are
conformable for multiplication.
Determinant of a Matrix: With every square matrix A we associate a number called the
determinant of A, written det(A) or |A|.
Example 6: Find the determinant of the matrix $A = \begin{bmatrix} 3 & 6 & 2 \\ 2 & 5 & 1 \\ -1 & 2 & 4 \end{bmatrix}$
Solution: The determinant of the matrix A is given by
$\det(A) = \begin{vmatrix} 3 & 6 & 2 \\ 2 & 5 & 1 \\ -1 & 2 & 4 \end{vmatrix}$
Expanding det(A) by the first row, we obtain
$\det(A) = 3 \begin{vmatrix} 5 & 1 \\ 2 & 4 \end{vmatrix} - 6 \begin{vmatrix} 2 & 1 \\ -1 & 4 \end{vmatrix} + 2 \begin{vmatrix} 2 & 5 \\ -1 & 2 \end{vmatrix}$
or $\det(A) = 3(20 - 2) - 6(8 + 1) + 2(4 + 5) = 54 - 54 + 18 = 18$
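A quick cross-check of Example 6 with NumPy (an assumed tool; np.linalg.det works in floating point, so the result is approximate):

```python
import numpy as np

A = np.array([[3, 6, 2],
              [2, 5, 1],
              [-1, 2, 4]])
print(np.linalg.det(A))   # approximately 18.0
```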
Transpose of a Matrix: The transpose of a matrix A, written $A^T$, is the matrix obtained
by interchanging the rows and columns of A. Since the order of the matrix A is m × n,
the order of the transpose matrix $A^T$ is n × m.
Example 7: (a) The transpose of the matrix $A = \begin{bmatrix} 3 & 6 & 2 \\ 2 & 5 & 1 \\ -1 & 2 & 4 \end{bmatrix}$ is $A^T = \begin{bmatrix} 3 & 2 & -1 \\ 6 & 5 & 2 \\ 2 & 1 & 4 \end{bmatrix}$
(b) If $X = \begin{bmatrix} 5 \\ 0 \\ 3 \end{bmatrix}$, then $X^T = \begin{bmatrix} 5 & 0 & 3 \end{bmatrix}$
Minor of an element of a matrix
Let A be a square matrix of order n × n. Then the minor $M_{ij}$ of the element $a_{ij} \in A$ is the
determinant of the (n−1) × (n−1) matrix obtained by deleting the ith row and jth column
from A.
Example: If $A = \begin{bmatrix} 2 & 3 & -1 \\ 1 & 1 & 0 \\ 2 & -3 & 5 \end{bmatrix}$, the minor of the element 3 ∈ A is denoted by
$M_{12}$ and is given by $M_{12} = \begin{vmatrix} 1 & 0 \\ 2 & 5 \end{vmatrix} = 5 - 0 = 5$
Cofactor of an element of a matrix
Let A be a non-singular matrix of order n × n and let $C_{ij}$ denote the cofactor (signed minor)
of the corresponding entry $a_{ij} \in A$; it is defined to be $C_{ij} = (-1)^{i+j} M_{ij}$.
Example: If $A = \begin{bmatrix} 2 & 3 & -1 \\ 1 & 1 & 0 \\ 2 & -3 & 5 \end{bmatrix}$, the cofactor of the element 3 ∈ A is denoted by
$C_{12}$ and is given by $C_{12} = (-1)^{1+2} \begin{vmatrix} 1 & 0 \\ 2 & 5 \end{vmatrix} = -(5 - 0) = -5$
Theorem: If A is a square matrix of order n × n, then the matrix has a multiplicative inverse
$A^{-1}$ if and only if the matrix A is non-singular.
Theorem: The inverse of the matrix A is given by $A^{-1} = \frac{1}{\det(A)} (C_{ij})^T$, where $(C_{ij})$ is the matrix of cofactors of A.
For a 2 × 2 matrix $A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$ the cofactors are $C_{11} = a_{22}$, $C_{12} = -a_{21}$, $C_{21} = -a_{12}$ and $C_{22} = a_{11}$, so that
$A^{-1} = \frac{1}{\det(A)} \begin{bmatrix} a_{22} & -a_{21} \\ -a_{12} & a_{11} \end{bmatrix}^T = \frac{1}{\det(A)} \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{bmatrix}$
For a 3 × 3 matrix the cofactors are
$C_{11} = \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}, \quad C_{12} = -\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}, \quad C_{13} = \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}$, and so on.
Example 8: Find, if possible, the multiplicative inverse of the matrix $A = \begin{bmatrix} 1 & 4 \\ 2 & 10 \end{bmatrix}$.
Solution: The matrix A is non-singular because $\det(A) = \begin{vmatrix} 1 & 4 \\ 2 & 10 \end{vmatrix} = 10 - 8 = 2$.
Therefore, $A^{-1}$ exists and is given by
$A^{-1} = \frac{1}{2} \begin{bmatrix} 10 & -4 \\ -2 & 1 \end{bmatrix} = \begin{bmatrix} 5 & -2 \\ -1 & 1/2 \end{bmatrix}$
Check: $AA^{-1} = \begin{bmatrix} 1 & 4 \\ 2 & 10 \end{bmatrix} \begin{bmatrix} 5 & -2 \\ -1 & 1/2 \end{bmatrix} = \begin{bmatrix} 5 - 4 & -2 + 2 \\ 10 - 10 & -4 + 5 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I$
$A^{-1}A = \begin{bmatrix} 5 & -2 \\ -1 & 1/2 \end{bmatrix} \begin{bmatrix} 1 & 4 \\ 2 & 10 \end{bmatrix} = \begin{bmatrix} 5 - 4 & 20 - 20 \\ -1 + 1 & -4 + 5 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I$
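The inverse in Example 8 can also be checked numerically; this sketch assumes NumPy:

```python
import numpy as np

A = np.array([[1.0, 4.0],
              [2.0, 10.0]])
A_inv = np.linalg.inv(A)    # raises LinAlgError if A were singular
print(A_inv)                # [[ 5.  -2. ] [-1.   0.5]]
print(np.allclose(A @ A_inv, np.eye(2)))  # True: A A^(-1) = I
print(np.allclose(A_inv @ A, np.eye(2)))  # True: A^(-1) A = I
```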
Example: Find, if possible, the multiplicative inverse of the matrix $A = \begin{bmatrix} 2 & 2 & 0 \\ -2 & 1 & 1 \\ 3 & 0 & 1 \end{bmatrix}$.
Solution: Since $\det(A) = \begin{vmatrix} 2 & 2 & 0 \\ -2 & 1 & 1 \\ 3 & 0 & 1 \end{vmatrix} = 2(1 - 0) - 2(-2 - 3) + 0(0 - 3) = 12 \neq 0$,
the given matrix is non-singular. So, the multiplicative inverse $A^{-1}$ of the matrix A exists.
The cofactors corresponding to the entries in each row are
$C_{11} = \begin{vmatrix} 1 & 1 \\ 0 & 1 \end{vmatrix} = 1, \quad C_{12} = -\begin{vmatrix} -2 & 1 \\ 3 & 1 \end{vmatrix} = 5, \quad C_{13} = \begin{vmatrix} -2 & 1 \\ 3 & 0 \end{vmatrix} = -3$
$C_{21} = -\begin{vmatrix} 2 & 0 \\ 0 & 1 \end{vmatrix} = -2, \quad C_{22} = \begin{vmatrix} 2 & 0 \\ 3 & 1 \end{vmatrix} = 2, \quad C_{23} = -\begin{vmatrix} 2 & 2 \\ 3 & 0 \end{vmatrix} = 6$
$C_{31} = \begin{vmatrix} 2 & 0 \\ 1 & 1 \end{vmatrix} = 2, \quad C_{32} = -\begin{vmatrix} 2 & 0 \\ -2 & 1 \end{vmatrix} = -2, \quad C_{33} = \begin{vmatrix} 2 & 2 \\ -2 & 1 \end{vmatrix} = 6$
Hence
$A^{-1} = \frac{1}{12} \begin{bmatrix} 1 & -2 & 2 \\ 5 & 2 & -2 \\ -3 & 6 & 6 \end{bmatrix} = \begin{bmatrix} 1/12 & -1/6 & 1/6 \\ 5/12 & 1/6 & -1/6 \\ -1/4 & 1/2 & 1/2 \end{bmatrix}$
We can also verify that $A \cdot A^{-1} = A^{-1} \cdot A = I$.
Derivative and Integral of a Matrix of Functions
Suppose that $A(t) = (a_{ij}(t))_{m \times n}$ is a matrix whose entries are functions that are
differentiable on a common interval; then the derivative of A(t) is the matrix whose entries
are the derivatives of the corresponding entries of A(t). Similarly, if the entries are
continuous on a common interval containing $t_0$ and t, then the integral of the matrix A(t)
is the matrix whose entries are the integrals of the corresponding entries of A(t). Thus
$\int_{t_0}^{t} A(s)\,ds = \left( \int_{t_0}^{t} a_{ij}(s)\,ds \right)_{m \times n}$
Example 11: Find the derivative and the integral of the matrix $X(t) = \begin{bmatrix} \sin 2t \\ e^{3t} \\ 8t - 1 \end{bmatrix}$
Solution: The derivative and integral of the given matrix are, respectively, given by
$X'(t) = \begin{bmatrix} \frac{d}{dt}(\sin 2t) \\ \frac{d}{dt}(e^{3t}) \\ \frac{d}{dt}(8t - 1) \end{bmatrix} = \begin{bmatrix} 2\cos 2t \\ 3e^{3t} \\ 8 \end{bmatrix}$
and
$\int_0^t X(s)\,ds = \begin{bmatrix} \int_0^t \sin 2s\,ds \\ \int_0^t e^{3s}\,ds \\ \int_0^t (8s - 1)\,ds \end{bmatrix} = \begin{bmatrix} -\frac{1}{2}\cos 2t + \frac{1}{2} \\ \frac{1}{3}e^{3t} - \frac{1}{3} \\ 4t^2 - t \end{bmatrix}$
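Entrywise differentiation and integration can be verified symbolically. The sketch below assumes SymPy:

```python
import sympy as sp

t, s = sp.symbols('t s')
X = sp.Matrix([sp.sin(2*t), sp.exp(3*t), 8*t - 1])

# Derivative: differentiate each entry, as in Example 11.
print(X.diff(t))   # Matrix([[2*cos(2*t)], [3*exp(3*t)], [8]])

# Integral from 0 to t: integrate each entry in a dummy variable s.
print(X.subs(t, s).applyfunc(lambda f: sp.integrate(f, (s, 0, t))))
# Matrix([[1/2 - cos(2*t)/2], [exp(3*t)/3 - 1/3], [4*t**2 - t]])
```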
Exercise
Write the given sum as a single column matrix:
1. $3t \begin{bmatrix} 2 \\ t \\ -1 \end{bmatrix} + (t - 1) \begin{bmatrix} -1 \\ -t \\ 3 \end{bmatrix} - 2 \begin{bmatrix} 3t \\ 4 \\ -5t \end{bmatrix}$
2. $\begin{bmatrix} 1 & -3 & 4 \\ 2 & 5 & -1 \\ 0 & -4 & -2 \end{bmatrix} \begin{bmatrix} t \\ 2t - 1 \\ -t \end{bmatrix} + \begin{bmatrix} -t \\ 1 \\ 4 \end{bmatrix} + \begin{bmatrix} 2 \\ -8 \\ -6 \end{bmatrix}$
Determine whether the given matrix is singular or non-singular. If non-singular, find $A^{-1}$:
3. $A = \begin{bmatrix} 3 & 2 & 1 \\ 4 & 1 & 0 \\ -2 & 5 & -1 \end{bmatrix}$
4. $A = \begin{bmatrix} 4 & 1 & -1 \\ 6 & 2 & -3 \\ -2 & -1 & 2 \end{bmatrix}$
Find $\frac{dX}{dt}$:
5. $X = \begin{bmatrix} \frac{1}{2}\sin 2t - 4\cos 2t \\ -3\sin 2t + 5\cos 2t \end{bmatrix}$
6. If $A(t) = \begin{bmatrix} e^{4t} & \cos \pi t \\ 2t & 3t^2 - 1 \end{bmatrix}$, then find (a) $\int_0^2 A(t)\,dt$, (b) $\int_0^t A(s)\,ds$.
7. Find the integral $\int_1^2 B(t)\,dt$ if $B(t) = \begin{bmatrix} 6t & 2 \\ 1/t & 4t \end{bmatrix}$
Lecture 3
Linear Equations
We know that the equation of a straight line is written as y = mx + c, where m is the
slope of the line (the tangent of the angle the line makes with the x-axis) and c is the
y-intercept (the distance from the origin at which the line meets the y-axis).
Thus a line in R2 (2 dimensions) can be represented by an equation of the form
$a_1 x + a_2 y = b$ (where a1, a2 are not both zero). Similarly, a plane in R3 (3-dimensional
space) can be represented by an equation of the form $a_1 x + a_2 y + a_3 z = b$ (where
a1, a2, a3 are not all zero). In general, a linear equation in the n variables $x_1, \ldots, x_n$
is one that can be written in the form
$a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = b$
where $a_1, a_2, \ldots, a_n$ and b are constants and the "a's" are not all zero.
Note: A linear equation does not involve any products or square roots of variables. All
variables occur only to the first power and do not appear as arguments of trigonometric,
logarithmic, or exponential functions.
A finite set of linear equations is called a system of linear equations or a linear system.
The variables in a linear system are called the unknowns. For example,
$4x_1 - x_2 + 3x_3 = -1$
$3x_1 + x_2 + 9x_3 = -4$
is a linear system of two equations in three unknowns x1, x2, and x3.
In R2, a system of two linear equations in two unknowns has the form
$a_1 x + b_1 y = c_1$
$a_2 x + b_2 y = c_2$
The graphs of these equations are straight lines in the xy-plane, so a solution (x, y) of this
system is in fact a point of intersection of these lines.
Note that there are three possibilities for a pair of straight lines in xy-plane:
1. The lines may be parallel and distinct, in which case there is no intersection and
consequently no solution.
2. The lines may intersect at only one point, in which case the system has exactly
one solution.
3. The lines may coincide, in which case there are infinitely many points of
intersection (the points on the common line) and consequently infinitely many
solutions.
A linear system is said to be consistent if it has at least one solution and it is called
inconsistent if it has no solutions.
Thus, a consistent linear system of two equations in two unknowns has either one
solution or infinitely many solutions – there is no other possibility.
[Figure: (a) the lines $l_1: x_1 - 2x_2 = -1$ and $l_2: -x_1 + 2x_2 = 3$ are parallel and
distinct, so the system has no solution; (b) the lines $x_1 - 2x_2 = -1$ and
$-x_1 + 2x_2 = 1$ coincide, so the system has infinitely many solutions.]
For a system of three linear equations in three unknowns, the graph of each equation is a
plane, so the solutions of the system, if any, correspond to points where all three planes
intersect; and again we see that there are only three possibilities: no solution, one
solution, or infinitely many solutions, as shown in the figure.
Theorem 1 Every system of linear equations has zero, one or infinitely many solutions;
there are no other possibilities.
Example 1: Solve the linear system
$x - y = 1$
$2x + y = 6$
Solution: Adding both equations, we get $3x = 7$, so $x = \frac{7}{3}$. Putting this value of x in
the first equation, we get $y = \frac{4}{3}$. Thus, the system has the unique solution
$x = \frac{7}{3}$, $y = \frac{4}{3}$.
Geometrically, this means that the lines represented by the equations in the system
intersect at the single point $\left( \frac{7}{3}, \frac{4}{3} \right)$, and thus the system has a unique solution.
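For a small system like this, the solution can also be obtained numerically; a sketch assuming NumPy:

```python
import numpy as np

# x - y = 1 and 2x + y = 6 in matrix form Ax = b.
A = np.array([[1.0, -1.0],
              [2.0,  1.0]])
b = np.array([1.0, 6.0])
print(np.linalg.solve(A, b))   # [2.3333... 1.3333...], i.e. (7/3, 4/3)
```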
Example 2: Solve the linear system
$x + y = 4$
$3x + 3y = 6$
Solution: Multiply the first equation by 3 and then subtract the second equation from it.
We obtain
$0 = 6$
This equation is contradictory.
Geometrically, this means that the lines corresponding to the equations in the original
system are parallel and distinct. So the given system has no solution.
Example 3: Solve the linear system
$4x - 2y = 1$
$16x - 8y = 4$
Solution: Multiplying the first equation by −4 and adding it to the second gives
$-16x + 8y = -4$
$16x - 8y = 4$
$0 = 0$
Thus, the solutions of the system are those values of x and y that satisfy the single
equation $4x - 2y = 1$.
Geometrically, this means the lines corresponding to the two equations in the original
system coincide, and thus the system has infinitely many solutions.
Parametric Representation
Taking y = t in the equation $4x - 2y = 1$ yields the parametric equations
$4x - 2t = 1, \quad y = t$
that is,
$x = \frac{1}{4} + \frac{1}{2}t, \quad y = t$
We can now obtain particular solutions of the system by substituting numerical values
for the parameter t.
Example: For t = 0 the solution is $\left( \frac{1}{4}, 0 \right)$; for t = 1 the solution is $\left( \frac{3}{4}, 1 \right)$; and for t = −1
the solution is $\left( -\frac{1}{4}, -1 \right)$, etc.
Example 4: Solve the linear system
$x - y + 2z = 5$
$2x - 2y + 4z = 10$
$3x - 3y + 6z = 15$
Solution: The second and third equations are multiples of the first, so the system reduces
to the single equation $x - y + 2z = 5$.
Geometrically, this means that the three planes coincide, and those values of x, y and z
that satisfy the equation $x - y + 2z = 5$ automatically satisfy all three equations.
Taking $y = t_1$ and $z = t_2$ as parameters gives
$x = 5 + t_1 - 2t_2, \quad y = t_1, \quad z = t_2$
Some solutions can be obtained by choosing numerical values for the parameters. For
example, $t_1 = 2$, $t_2 = 3$ gives the solution (1, 2, 3); as a check,
$x - y + 2z = 1 - 2 + 2(3) = 1 - 2 + 6 = 5$
Matrix Notation
Given the system
$x_1 - 2x_2 + x_3 = 0$
$2x_2 - 8x_3 = 8$
$-4x_1 + 5x_2 + 9x_3 = -9$
with the coefficients of each variable aligned in columns, the matrix
$\begin{bmatrix} 1 & -2 & 1 \\ 0 & 2 & -8 \\ -4 & 5 & 9 \end{bmatrix}$
is called the coefficient matrix (or matrix of coefficients) of the system.
The augmented matrix of a system consists of the coefficient matrix with an added column
containing the constants from the right sides of the equations. It is denoted by [A b]:
$[A\ b] = \begin{bmatrix} 1 & -2 & 1 & 0 \\ 0 & 2 & -8 & 8 \\ -4 & 5 & 9 & -9 \end{bmatrix}$
In order to solve a linear system, we use a number of methods; the first of them is given
below.
Successive Elimination Method: In this method the x1 term in the first equation of a
system is used to eliminate the x1 terms in the other equations. Then we use the x2 term
in the second equation to eliminate the x2 terms in the other equations, and so on, until
we finally obtain a very simple equivalent system of equations.
Example 5: Solve
$x_1 - 2x_2 + x_3 = 0$
$2x_2 - 8x_3 = 8$
$-4x_1 + 5x_2 + 9x_3 = -9$
Solution: We perform the elimination procedure with and without matrix notation, and
place the results side by side for comparison:
$\begin{aligned} x_1 - 2x_2 + x_3 &= 0 \\ 2x_2 - 8x_3 &= 8 \\ -4x_1 + 5x_2 + 9x_3 &= -9 \end{aligned} \qquad \begin{bmatrix} 1 & -2 & 1 & 0 \\ 0 & 2 & -8 & 8 \\ -4 & 5 & 9 & -9 \end{bmatrix}$
To eliminate the x1 term from the third equation, add 4 times equation 1 to equation 3:
$\begin{aligned} 4x_1 - 8x_2 + 4x_3 &= 0 \\ -4x_1 + 5x_2 + 9x_3 &= -9 \\ \hline -3x_2 + 13x_3 &= -9 \end{aligned}$
The result of the calculation is written in place of the original third equation:
$\begin{aligned} x_1 - 2x_2 + x_3 &= 0 \\ 2x_2 - 8x_3 &= 8 \\ -3x_2 + 13x_3 &= -9 \end{aligned} \qquad \begin{bmatrix} 1 & -2 & 1 & 0 \\ 0 & 2 & -8 & 8 \\ 0 & -3 & 13 & -9 \end{bmatrix}$
Next, multiply equation 2 by 1/2 to obtain 1 as the coefficient of x2:
$\begin{aligned} x_1 - 2x_2 + x_3 &= 0 \\ x_2 - 4x_3 &= 4 \\ -3x_2 + 13x_3 &= -9 \end{aligned} \qquad \begin{bmatrix} 1 & -2 & 1 & 0 \\ 0 & 1 & -4 & 4 \\ 0 & -3 & 13 & -9 \end{bmatrix}$
To eliminate the x2 term from the third equation, add 3 times equation 2 to equation 3;
this gives $x_3 = 3$. Now use the third equation to eliminate the x3 term from the first and
second equations: add 4 times equation 3 to equation 2, then subtract equation 3 from
equation 1. We get
$\begin{aligned} x_1 - 2x_2 &= -3 \\ x_2 &= 16 \\ x_3 &= 3 \end{aligned} \qquad \begin{bmatrix} 1 & -2 & 0 & -3 \\ 0 & 1 & 0 & 16 \\ 0 & 0 & 1 & 3 \end{bmatrix}$
Finally, adding 2 times equation 2 to equation 1 gives
$\begin{aligned} x_1 &= 29 \\ x_2 &= 16 \\ x_3 &= 3 \end{aligned} \qquad \begin{bmatrix} 1 & 0 & 0 & 29 \\ 0 & 1 & 0 & 16 \\ 0 & 0 & 1 & 3 \end{bmatrix}$
To verify that (29, 16, 3) is a solution, substitute these values into the left side of the
original system for x1, x2 and x3; after computing, the results agree with the right side of
the original system, so (29, 16, 3) is a solution of the system.
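The whole elimination of Example 5 can be reproduced in one step with SymPy's rref (an assumed tool; rref returns the reduced echelon form together with the pivot column indices):

```python
import sympy as sp

# Augmented matrix of the system in Example 5.
M = sp.Matrix([[ 1, -2,  1,  0],
               [ 0,  2, -8,  8],
               [-4,  5,  9, -9]])

R, pivots = M.rref()
print(R)        # Matrix([[1, 0, 0, 29], [0, 1, 0, 16], [0, 0, 1, 3]])
print(pivots)   # (0, 1, 2): every column of the coefficient part is a pivot
```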
Elementary Row Operations
1. (Replacement) Replace one row by the sum of itself and a nonzero multiple of
another row.
2. (Interchange) Interchange two rows.
3. (Scaling) Multiply all entries in a row by a nonzero constant.
Note: If the augmented matrices of two linear systems are row equivalent, then the two
systems have the same solution set.
Row operations are extremely easy to perform, but they have to be learned and practiced.
Two fundamental questions about a linear system are:
1. Is the system consistent; that is, does at least one solution exist?
2. If a solution exists, is it the only one; that is, is the solution unique?
We try to answer these questions via row operations on the augmented matrix.
Consider again the system of Example 5, and ask whether it is consistent.
Solution: First obtain a triangular system by removing the x1 and x2 terms from the third
equation, after scaling the second equation by 1/2:
$\begin{aligned} x_1 - 2x_2 + x_3 &= 0 \\ x_2 - 4x_3 &= 4 \\ -4x_1 + 5x_2 + 9x_3 &= -9 \end{aligned} \qquad \begin{bmatrix} 1 & -2 & 1 & 0 \\ 0 & 1 & -4 & 4 \\ -4 & 5 & 9 & -9 \end{bmatrix}$
Adding 4 times row 1 to row 3:
$\begin{aligned} x_1 - 2x_2 + x_3 &= 0 \\ x_2 - 4x_3 &= 4 \\ -3x_2 + 13x_3 &= -9 \end{aligned} \qquad \begin{bmatrix} 1 & -2 & 1 & 0 \\ 0 & 1 & -4 & 4 \\ 0 & -3 & 13 & -9 \end{bmatrix}$
Adding 3 times row 2 to row 3:
$\begin{aligned} x_1 - 2x_2 + x_3 &= 0 \\ x_2 - 4x_3 &= 4 \\ x_3 &= 3 \end{aligned} \qquad \begin{bmatrix} 1 & -2 & 1 & 0 \\ 0 & 1 & -4 & 4 \\ 0 & 0 & 1 & 3 \end{bmatrix}$
Back-substitution now gives $x_2 = 4 + 4(3) = 16$ and
$x_1 - 2(16) + 3 = 0 \Rightarrow x_1 = 29$
So a solution exists; the system is consistent and has a unique solution.
Next consider a system whose augmented matrix is
$\begin{bmatrix} 2 & -3 & 2 & 1 \\ 0 & 1 & -4 & 8 \\ 5 & -8 & 7 & 1 \end{bmatrix}$
To eliminate the 5x1 term in the third equation, add −5/2 times row 1 to row 3:
$\begin{bmatrix} 2 & -3 & 2 & 1 \\ 0 & 1 & -4 & 8 \\ 0 & -1/2 & 2 & -3/2 \end{bmatrix}$
Next, use the x2 term in the second equation to eliminate the −(1/2)x2 term from the
third equation: add 1/2 times row 2 to row 3:
$\begin{bmatrix} 2 & -3 & 2 & 1 \\ 0 & 1 & -4 & 8 \\ 0 & 0 & 0 & 5/2 \end{bmatrix}$
The equivalent system is
$2x_1 - 3x_2 + 2x_3 = 1$
$x_2 - 4x_3 = 8$
$0 = 2.5$
There are no values of x1, x2, x3 that will satisfy this, because the equation 0 = 2.5 is
never true. Hence the original system is inconsistent (i.e., has no solution).
Exercises
1. State in words the next elementary row operation that should be performed on the
system in order to solve it. (More than one answer is possible in (a).)
a. $x_1 + 4x_2 - 2x_3 + 8x_4 = 12$
   $x_2 - 7x_3 + 2x_4 = -4$
   $5x_3 - x_4 = 7$
   $x_3 + 3x_4 = -5$
b. $x_1 - 3x_2 + 5x_3 - 2x_4 = 0$
   $x_2 + 8x_3 = -4$
   $2x_3 = 7$
   $x_4 = 1$
2. The augmented matrix of a linear system has been transformed by row operations into
the form below. Determine if the system is consistent.
$\begin{bmatrix} 1 & 5 & 2 & -6 \\ 0 & 4 & -7 & 2 \\ 0 & 0 & 5 & 0 \end{bmatrix}$
3. $5x_1 - x_2 + 2x_3 = 7$
   $-2x_1 + 6x_2 + 9x_3 = 0$
   $-7x_1 + 5x_2 - 3x_3 = -7$
4. $2x_1 - x_2 = h$
   $-6x_1 + 3x_2 = k$
5. $x_2 + 5x_3 = -4$
   $x_1 + 4x_2 + 3x_3 = -2$
   $2x_1 + 7x_2 + x_3 = -1$
6. $x_1 - 5x_2 + 4x_3 = -3$
   $2x_1 - 7x_2 + 3x_3 = -2$
   $2x_1 - x_2 - 7x_3 = 1$
7. $x_1 + 2x_2 = 4$
   $x_1 - 3x_2 - 3x_3 = 2$
   $x_2 + x_3 = 0$
8. $2x_1 - 4x_3 = -10$
   $x_2 + 3x_3 = 2$
   $3x_1 + 5x_2 + 8x_3 = -6$
Determine the value(s) of h such that the matrix is the augmented matrix of a consistent
linear system.
9. $\begin{bmatrix} 1 & -3 & h \\ -2 & 6 & -5 \end{bmatrix}$   10. $\begin{bmatrix} 1 & h & -2 \\ -4 & 2 & 10 \end{bmatrix}$
Find an equation involving g, h, and k that makes the augmented matrix correspond to a
consistent system.
11. $\begin{bmatrix} 1 & -4 & 7 & g \\ 0 & 3 & -5 & h \\ -2 & 5 & -9 & k \end{bmatrix}$   12. $\begin{bmatrix} 2 & 5 & -3 & g \\ 4 & 7 & -4 & h \\ -6 & -3 & 1 & k \end{bmatrix}$
Find the elementary row operation that transforms the first matrix into the second, and
then find the reverse row operation that transforms the second matrix into the first.
15. $\begin{bmatrix} 1 & 3 & -1 & 5 \\ 0 & 1 & -4 & 2 \\ 0 & 2 & -5 & -1 \end{bmatrix}$, $\begin{bmatrix} 1 & 3 & -1 & 5 \\ 0 & 1 & -4 & 2 \\ 0 & 0 & 3 & -5 \end{bmatrix}$
Lecture 4
A rectangular matrix is in echelon form (or row echelon form) if it has the following three
properties:
1. All nonzero rows are above any rows of all zeros.
2. Each leading entry of a row is in a column to the right of the leading entry of the row
above it.
3. All entries in a column below a leading entry are zeros.
The following matrices are in echelon form. The leading entries ($\blacksquare$) may have any
nonzero value; the starred entries (*) may have any values (including zero).
1. $\begin{bmatrix} 2 & -3 & 2 & 1 \\ 0 & 1 & -4 & 8 \\ 0 & 0 & 0 & 5/2 \end{bmatrix}$
2. $\begin{bmatrix} \blacksquare & * & * & * \\ 0 & \blacksquare & * & * \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$
3. $\begin{bmatrix} 0 & \blacksquare & * & * & * & * & * & * & * & * \\ 0 & 0 & 0 & \blacksquare & * & * & * & * & * & * \\ 0 & 0 & 0 & 0 & \blacksquare & * & * & * & * & * \\ 0 & 0 & 0 & 0 & 0 & \blacksquare & * & * & * & * \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \blacksquare & * \end{bmatrix}$
4. $\begin{bmatrix} 1 & 4 & -3 & 7 \\ 0 & 1 & 6 & 2 \\ 0 & 0 & 1 & 5 \end{bmatrix}$   5. $\begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}$   6. $\begin{bmatrix} 0 & 1 & 2 & 6 & 0 \\ 0 & 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}$
The following matrices are in reduced echelon form: the leading entries are 1's, and there
are 0's below and above each leading 1.
1. $\begin{bmatrix} 1 & 0 & 0 & 29 \\ 0 & 1 & 0 & 16 \\ 0 & 0 & 1 & 3 \end{bmatrix}$
2. $\begin{bmatrix} 1 & * & 0 & * \\ 0 & 0 & 1 & * \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$
3. $\begin{bmatrix} 0 & 1 & * & 0 & 0 & 0 & * & * & 0 & * \\ 0 & 0 & 0 & 1 & 0 & 0 & * & * & 0 & * \\ 0 & 0 & 0 & 0 & 1 & 0 & * & * & 0 & * \\ 0 & 0 & 0 & 0 & 0 & 1 & * & * & 0 & * \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & * \end{bmatrix}$
4. $\begin{bmatrix} 1 & 0 & 0 & 4 \\ 0 & 1 & 0 & 7 \\ 0 & 0 & 1 & -1 \end{bmatrix}$   5. $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$   6. $\begin{bmatrix} 0 & 1 & -2 & 0 & 1 \\ 0 & 0 & 0 & 1 & 3 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$
Note: A matrix may be row reduced into more than one matrix in echelon form, using
different sequences of row operations. However, the reduced echelon form obtained from
a matrix is unique.
Theorem 1 (Uniqueness of the Reduced Echelon Form) Each matrix is row equivalent
to one and only one reduced echelon matrix.
Pivot Positions
A pivot position in a matrix A is a location in A that corresponds to a leading entry in an
echelon form of A.
Note When row operations on a matrix produce an echelon form, further row operations
to obtain the reduced echelon form do not change the positions of the leading entries.
Pivot column: A pivot column is a column of A that contains a pivot position.
Example 2: Reduce the matrix A below to echelon form, and locate the pivot columns:
$A = \begin{bmatrix} 0 & -3 & -6 & 4 & 9 \\ -1 & -2 & -1 & 3 & 1 \\ -2 & -3 & 0 & 3 & -1 \\ 1 & 4 & 5 & -9 & -7 \end{bmatrix}$
Solution: The top of the first column is the first pivot position. A nonzero entry, or pivot,
must be placed in this position, but the leading entry in the first column of the matrix
above is zero; so interchange the first and last rows:
$\begin{bmatrix} 1 & 4 & 5 & -9 & -7 \\ -1 & -2 & -1 & 3 & 1 \\ -2 & -3 & 0 & 3 & -1 \\ 0 & -3 & -6 & 4 & 9 \end{bmatrix}$
(The first column is the first pivot column, with pivot 1.) All entries in a column below a
leading entry should be zero, so add row 1 to row 2, and add 2 times row 1 to row 3:
$\begin{bmatrix} 1 & 4 & 5 & -9 & -7 \\ 0 & 2 & 4 & -6 & -6 \\ 0 & 5 & 10 & -15 & -15 \\ 0 & -3 & -6 & 4 & 9 \end{bmatrix} \quad (R_1 + R_2,\ 2R_1 + R_3)$
Column 2 is the next pivot column, with pivot 2. Add −5/2 times row 2 to row 3, and add
3/2 times row 2 to row 4:
$\begin{bmatrix} 1 & 4 & 5 & -9 & -7 \\ 0 & 2 & 4 & -6 & -6 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & -5 & 0 \end{bmatrix} \quad (-\tfrac{5}{2}R_2 + R_3,\ \tfrac{3}{2}R_2 + R_4)$
Interchanging rows 3 and 4 gives the echelon form, whose general form is shown alongside:
$\begin{bmatrix} 1 & 4 & 5 & -9 & -7 \\ 0 & 2 & 4 & -6 & -6 \\ 0 & 0 & 0 & -5 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \qquad \begin{bmatrix} \blacksquare & * & * & * & * \\ 0 & \blacksquare & * & * & * \\ 0 & 0 & 0 & \blacksquare & * \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$
This is in echelon form, and thus columns 1, 2, and 4 of A are the pivot columns; the
pivot positions of A are the (1, 1), (2, 2) and (3, 4) positions, which hold the leading
entries 1, 2 and −5 of the echelon form.
Pivot element: A pivot is a nonzero number in a pivot position that is used as needed to
create zeros via row operations.
The Row Reduction Algorithm consists of four steps, and it produces a matrix in
echelon form. A fifth step produces a matrix in reduced echelon form.
Example 3: Apply elementary row operations to transform the following matrix first
into echelon form and then into reduced echelon form:
$\begin{bmatrix} 0 & 3 & -6 & 6 & 4 & -5 \\ 3 & -7 & 8 & -5 & 8 & 9 \\ 3 & -9 & 12 & -9 & 6 & 15 \end{bmatrix}$
Solution:
STEP 1: Begin with the leftmost nonzero column. This is a pivot column; the pivot
position is at the top. Here the first column is the pivot column.
STEP 2: Select a nonzero entry in the pivot column as a pivot. If necessary, interchange
rows to move this entry into the pivot position. Interchange rows 1 and 3 (we could have
interchanged rows 1 and 2 instead):
$\begin{bmatrix} 3 & -9 & 12 & -9 & 6 & 15 \\ 3 & -7 & 8 & -5 & 8 & 9 \\ 0 & 3 & -6 & 6 & 4 & -5 \end{bmatrix}$
STEP 3: Use row replacement operations to create zeros in all positions below the pivot.
Subtracting row 1 from row 2 gives
$\begin{bmatrix} 3 & -9 & 12 & -9 & 6 & 15 \\ 0 & 2 & -4 & 4 & 2 & -6 \\ 0 & 3 & -6 & 6 & 4 & -5 \end{bmatrix} \quad (R_2 - R_1)$
STEP 4: Cover (or ignore) the row containing the pivot position and cover all rows, if
any, above it. Apply Steps 1-3 to the sub-matrix that remains. Repeat the process until
there are no more nonzero rows to modify.
With row 1 covered, Step 1 shows that column 2 is the next pivot column; for Step 2,
we select as pivot the "top" entry 2 in that column. According to Step 3, all entries in a
column below a leading entry must be zero, so subtract 3/2 times row 2 from row 3:
$\begin{bmatrix} 3 & -9 & 12 & -9 & 6 & 15 \\ 0 & 2 & -4 & 4 & 2 & -6 \\ 0 & 0 & 0 & 0 & 1 & 4 \end{bmatrix} \quad (R_3 - \tfrac{3}{2}R_2)$
When we cover the row containing the second pivot position for Step 4, we are left with a
new sub-matrix having only one row, whose pivot is the 1 in column 5.
This is the echelon form of the matrix. To change it into reduced echelon form we need
one more step:
STEP 5: Make the leading entry in each nonzero row 1, and make all other entries of its
column 0. Scaling rows 1 and 2 gives
$\begin{bmatrix} 1 & -3 & 4 & -3 & 2 & 5 \\ 0 & 1 & -2 & 2 & 1 & -3 \\ 0 & 0 & 0 & 0 & 1 & 4 \end{bmatrix} \quad (\tfrac{1}{3}R_1,\ \tfrac{1}{2}R_2)$
Adding 3 times row 2 to row 1:
$\begin{bmatrix} 1 & 0 & -2 & 3 & 5 & -4 \\ 0 & 1 & -2 & 2 & 1 & -3 \\ 0 & 0 & 0 & 0 & 1 & 4 \end{bmatrix} \quad (3R_2 + R_1)$
Finally, subtract row 3 from row 2, and subtract 5 times row 3 from row 1:
$\begin{bmatrix} 1 & 0 & -2 & 3 & 0 & -24 \\ 0 & 1 & -2 & 2 & 0 & -7 \\ 0 & 0 & 0 & 0 & 1 & 4 \end{bmatrix} \quad (R_2 - R_3,\ R_1 - 5R_3)$
This matrix is in reduced echelon form.
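As a check on the hand computation in Example 3, SymPy (assumed here) reaches the same reduced echelon form:

```python
import sympy as sp

M = sp.Matrix([[0, 3, -6, 6, 4, -5],
               [3, -7, 8, -5, 8, 9],
               [3, -9, 12, -9, 6, 15]])

R, pivots = M.rref()
print(R)        # rows (1, 0, -2, 3, 0, -24), (0, 1, -2, 2, 0, -7), (0, 0, 0, 0, 1, 4)
print(pivots)   # (0, 1, 4): pivot columns 1, 2 and 5
```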
When this algorithm is applied to the augmented matrix of a system, it gives the solution
set of the linear system.
Suppose, for example, that the augmented matrix of a linear system has been changed
into the equivalent reduced echelon form
$\begin{bmatrix} 1 & 0 & -5 & 1 \\ 0 & 1 & 1 & 4 \\ 0 & 0 & 0 & 0 \end{bmatrix}$
There are three variables because the augmented matrix has four columns. The associated
system of equations is
$x_1 - 5x_3 = 1$
$x_2 + x_3 = 4$     (1)
$0 = 0$
The variables x1 and x2, corresponding to pivot columns in the above matrix, are called
basic variables. The other variable, x3, is called a free variable.
Whenever a system is consistent, the solution set can be described explicitly by solving
the reduced system of equations for the basic variables in terms of the free variables. This
operation is possible because the reduced echelon form places each basic variable in one
and only one equation.
In (1), we can solve the first equation for x1 and the second for x2. (The third equation is
ignored; it offers no restriction on the variables.)
$x_1 = 1 + 5x_3$
$x_2 = 4 - x_3$     (2)
$x_3$ is free
By saying that x3 is "free", we mean that we are free to choose any value for x3. When
x3 = 0, the solution is (1, 4, 0); when x3 = 1, the solution is (6, 3, 1); etc.
Note: The solution in (2) is called a general solution of the system because it gives an
explicit description of all solutions.
Example 4: Find the general solution of the linear system whose augmented matrix has
been reduced to
$\begin{bmatrix} 1 & 6 & 2 & -5 & -2 & -4 \\ 0 & 0 & 2 & -8 & -1 & 3 \\ 0 & 0 & 0 & 0 & 1 & 7 \end{bmatrix}$
Solution: The matrix is in echelon form, but we want the reduced echelon form before
solving for the basic variables. The symbol "~" before a matrix indicates that the matrix
is row equivalent to the preceding matrix.
By $R_1 + 2R_3$ and $R_2 + R_3$ we get
$\sim \begin{bmatrix} 1 & 6 & 2 & -5 & 0 & 10 \\ 0 & 0 & 2 & -8 & 0 & 10 \\ 0 & 0 & 0 & 0 & 1 & 7 \end{bmatrix}$
By $\frac{1}{2}R_2$ we get
$\sim \begin{bmatrix} 1 & 6 & 2 & -5 & 0 & 10 \\ 0 & 0 & 1 & -4 & 0 & 5 \\ 0 & 0 & 0 & 0 & 1 & 7 \end{bmatrix}$
By $R_1 - 2R_2$ we get
$\sim \begin{bmatrix} 1 & 6 & 0 & 3 & 0 & 0 \\ 0 & 0 & 1 & -4 & 0 & 5 \\ 0 & 0 & 0 & 0 & 1 & 7 \end{bmatrix}$
The pivot columns of the matrix are 1, 3 and 5, so the basic variables are x1, x3, and x5.
The remaining variables, x2 and x4, must be free. The associated system is
$x_1 + 6x_2 + 3x_4 = 0, \quad x_3 - 4x_4 = 5, \quad x_5 = 7$
so the general solution is
$x_1 = -6x_2 - 3x_4$
$x_2$ is free
$x_3 = 5 + 4x_4$     (3)
$x_4$ is free
$x_5 = 7$
Note that the value of x5 is already fixed, by the third equation in the reduced system.
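SymPy's linsolve (assumed here) produces the same parametric general solution directly from the augmented matrix:

```python
import sympy as sp

x1, x2, x3, x4, x5 = sp.symbols('x1:6')
M = sp.Matrix([[1, 6, 2, -5, -2, -4],
               [0, 0, 2, -8, -1, 3],
               [0, 0, 0, 0, 1, 7]])

print(sp.linsolve(M, (x1, x2, x3, x4, x5)))
# {(-6*x2 - 3*x4, x2, 4*x4 + 5, x4, 7)}: x2 and x4 remain free
```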
Exercise
1. Find the general solution of the linear system whose augmented matrix is
$\begin{bmatrix} 1 & -3 & -5 & 0 \\ 0 & 1 & 1 & 3 \end{bmatrix}$
2. Find the general solution of the system
$x_1 - 2x_2 - x_3 + 3x_4 = 0$
$-2x_1 + 4x_2 + 5x_3 - 5x_4 = 3$
$3x_1 - 6x_2 - 6x_3 + 8x_4 = 2$
Find the general solutions of the systems whose augmented matrices are given in
Exercises 3-12.
3. $\begin{bmatrix} 1 & 0 & 2 & 5 \\ 2 & 0 & 3 & 6 \end{bmatrix}$   4. $\begin{bmatrix} 1 & -3 & 0 & -5 \\ -3 & 7 & 0 & 9 \end{bmatrix}$
5. $\begin{bmatrix} 0 & 3 & 6 & 9 \\ -1 & 1 & -2 & -1 \end{bmatrix}$   6. $\begin{bmatrix} 1 & 3 & -3 & 7 \\ 3 & 9 & -4 & 1 \end{bmatrix}$
7. $\begin{bmatrix} 1 & 2 & -7 \\ -1 & -1 & 1 \\ 2 & 1 & 5 \end{bmatrix}$   8. $\begin{bmatrix} 1 & 2 & 4 \\ -2 & -3 & -5 \\ 2 & 1 & -1 \end{bmatrix}$
9. $\begin{bmatrix} 2 & -4 & 3 \\ -6 & 12 & -9 \\ 4 & -8 & 6 \end{bmatrix}$   10. $\begin{bmatrix} 1 & 0 & -9 & 0 & 4 \\ 0 & 1 & 3 & 0 & -1 \\ 0 & 0 & 0 & 1 & -7 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}$
11. $\begin{bmatrix} 1 & -2 & 0 & 0 & 7 & -3 \\ 0 & 1 & 0 & 0 & -3 & 1 \\ 0 & 0 & 0 & 1 & 5 & -4 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}$   12. $\begin{bmatrix} 1 & 0 & -5 & 0 & -8 & 3 \\ 0 & 1 & 4 & -1 & 0 & 6 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}$
Determine the value(s) of h such that the matrix is the augmented matrix of a consistent
linear system.
13. $\begin{bmatrix} 1 & 4 & 2 \\ -3 & h & -1 \end{bmatrix}$   14. $\begin{bmatrix} 1 & h & 3 \\ 2 & 8 & 1 \end{bmatrix}$
Choose h and k such that the system has (a) no solution, (b) a unique solution, and (c)
many solutions. Give a separate answer for each part.
15. $x_1 + hx_2 = 1$
    $2x_1 + 3x_2 = k$
16. $x_1 - 3x_2 = 1$
    $2x_1 + hx_2 = k$
Lecture 05
NULL SPACE
Definition: The null space of an m × n matrix A, written as Nul A, is the set of all
solutions to the homogeneous equation Ax = 0. In set notation,
Nul A = {x : x is in Rn and Ax = 0}
or, equivalently, Nul(A) = {x ∈ Rn : Ax = 0}.
A more dynamic description of Nul A is the set of all x in Rn that are mapped into the
zero vector of Rm via the linear transformation x → Ax, where A is the matrix of the
transformation. See Figure 1.
[Figure 1: Nul A sits inside the domain Rn; every vector in it is mapped to 0 in Rm.]
Example 1: Let $A = \begin{bmatrix} 1 & -3 & -2 \\ -5 & 9 & 1 \end{bmatrix}$ and let $u = \begin{bmatrix} 5 \\ 3 \\ -2 \end{bmatrix}$. Determine if $u \in$ Nul A.
Solution: To test if u satisfies Au = 0, simply compute
$Au = \begin{bmatrix} 1 & -3 & -2 \\ -5 & 9 & 1 \end{bmatrix} \begin{bmatrix} 5 \\ 3 \\ -2 \end{bmatrix} = \begin{bmatrix} 5 - 9 + 4 \\ -25 + 27 - 2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$
Thus u is in Nul A.
Example 2: Find the null space of the matrix $A = \begin{bmatrix} 4 & 0 \\ -8 & 20 \end{bmatrix}$
Solution: To find the null space of A we need to solve the following system of equations:
$\begin{bmatrix} 4 & 0 \\ -8 & 20 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \quad \Rightarrow \quad \begin{aligned} 4x_1 + 0x_2 &= 0 \\ -8x_1 + 20x_2 &= 0 \end{aligned}$
The first equation gives $4x_1 = 0 \Rightarrow x_1 = 0$, and then $-8x_1 + 20x_2 = 0 \Rightarrow x_2 = 0$.
We can find the null space of a matrix in two ways, i.e. with matrices or with a system of
linear equations. We have given this in both matrix form and (converting the matrix into
a system of equations) equation form. In equation form it is easy to see that the only
solution of these equations is $x_1 = x_2 = 0$. In terms of vectors from R2, the solution
consists of the single vector {0}, and hence the null space of A is {0}.
Theorem 1: Elementary row operations do not change the null space of a matrix. In other
words, the null space N(A) of a matrix A remains the same when the matrix is changed
by elementary row operations.
Example: Determine the null space of the following matrix using elementary row
operations (taking the matrix from the example above):
$A = \begin{bmatrix} 4 & 0 \\ -8 & 20 \end{bmatrix}$
Solution: First we transform the matrix to the reduced row echelon form:
$\begin{bmatrix} 4 & 0 \\ -8 & 20 \end{bmatrix} \sim \begin{bmatrix} 1 & 0 \\ -8 & 20 \end{bmatrix} \quad (\tfrac{1}{4}R_1)$
$\sim \begin{bmatrix} 1 & 0 \\ 0 & 20 \end{bmatrix} \quad (R_2 + 8R_1)$
$\sim \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \quad (\tfrac{1}{20}R_2)$
The corresponding system is $x_1 = 0$, $x_2 = 0$, so again Nul A = {0}. We can observe
and compare both of the above examples, which show the same result.
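Null space computations can be automated; this sketch assumes SymPy, whose nullspace method returns a list of spanning vectors (empty when Nul A = {0}):

```python
import sympy as sp

A = sp.Matrix([[4, 0],
               [-8, 20]])
print(A.nullspace())   # []: the null space is just the zero vector
```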
Put simply, the null space of a matrix A is the set of all vectors that A maps onto the zero
vector (i.e. Ax = 0).
Theorem 2: The null space of an m × n matrix A is a subspace of Rn.
Proof: The null space of A consists of all the solutions to the system Ax = 0. First, we
point out that the zero vector 0 in Rn is a solution to this system, so we know that the
null space is not empty. This is a good thing, since a vector space (subspace or not) must
contain at least one element.
Now that we know the null space is not empty, consider any two vectors u, v from the
null space and let c be any scalar. We just need to show that the sum u + v and the
scalar multiple cu are also in the null space.
Certainly Nul A is a subset of Rn, because A has n columns. To show that Nul A is a
subspace, we have to check whether three conditions are satisfied; if Nul A satisfies all
three conditions, it is a subspace, otherwise not.
First, the zero vector 0 must be in the set (if the zero vector is not in a set, that set cannot
be a vector space; generally, we use "space" for "vector space"). We know that A0 = 0,
so 0 is in Nul A. Now choose any vectors u, v from the null space; by the definition of
the null space (i.e. Ax = 0),
Au = 0 and Av = 0
The other two conditions are closure under vector addition and scalar multiplication. We
proceed as follows.
Vector addition: to show that u + v is in Nul A, we must show that A(u + v) = 0. Using
the properties of matrix multiplication, we find that
A(u + v) = Au + Av = 0 + 0 = 0
Thus u + v is in Nul A, and Nul A is closed under vector addition.
Scalar multiplication: for any scalar c,
A(cu) = c(Au) = c(0) = 0
which shows that cu is in Nul A. Thus Nul A is a subspace of Rn.
Example: Let H be the set of all vectors (a, b, c, d) in R4 satisfying
a - 2b + 5c - d = 0
-a - b + c = 0
We see that H is the set of all solutions of the above system of homogeneous linear
equations. Therefore, by Theorem 2, H is a subspace of R4.
It is important that the linear equations defining the set H are homogeneous. Otherwise,
the set of solutions will definitely not be a subspace, because the zero vector (the origin)
is not a solution of a non-homogeneous system; geometrically, a line that does not pass
through the origin cannot be a subspace, because a subspace must contain the zero vector
(the origin). Also, in some cases the set of solutions could be empty: the system then has
no solution at all, which geometrically means that the lines are parallel and do not
intersect.
If the null space contains more than one vector, then geometrically the lines intersect in
more than one point and must pass through the origin (the zero vector).
Example 3: Find a spanning set for the null space of the matrix
$A = \begin{bmatrix} -3 & 6 & -1 & 1 & -7 \\ 1 & -2 & 2 & 3 & -1 \\ 2 & -4 & 5 & 8 & -4 \end{bmatrix}$
Solution: The first step is to find the general solution of Ax = 0 in terms of free
variables. Transforming the augmented matrix [A 0] to reduced row echelon form, we get
$\begin{bmatrix} 1 & -2 & 0 & -1 & 3 & 0 \\ 0 & 0 & 1 & 2 & -2 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}$
which corresponds to the system
$x_1 - 2x_2 - x_4 + 3x_5 = 0$
$x_3 + 2x_4 - 2x_5 = 0$
$0 = 0$
The general solution is
$x_1 = 2x_2 + x_4 - 3x_5, \quad x_3 = -2x_4 + 2x_5$, with x2, x4 and x5 free.
Next, decompose the vector giving the general solution into a linear combination of
vectors where the weights are the free variables. That is,
$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix} = x_2 \begin{bmatrix} 2 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} + x_4 \begin{bmatrix} 1 \\ 0 \\ -2 \\ 1 \\ 0 \end{bmatrix} + x_5 \begin{bmatrix} -3 \\ 0 \\ 2 \\ 0 \\ 1 \end{bmatrix} = x_2 u + x_4 v + x_5 w$     (3)
Every solution of Ax = 0 is a linear combination of u, v and w, so {u, v, w} is a spanning
set for Nul A.
Two points should be made about the solution in Example 3 that apply to all problems of
this type. We will use these facts later.
1. The spanning set produced by the method in Example 3 is automatically linearly
independent, because the free variables are the weights on the spanning vectors.
For instance, look at the 2nd, 4th and 5th entries in the solution vector in (3) and
note that $x_2 u + x_4 v + x_5 w$ can be 0 only if the weights x2, x4 and x5 are all zero.
2. When Nul A contains nonzero vectors, the number of vectors in the spanning set
for Nul A equals the number of free variables in the equation Ax = 0.
Example 4: Find a spanning set for the null space of $A = \begin{bmatrix} 1 & -3 & 2 & 2 & 1 \\ 0 & 3 & 6 & 0 & -3 \\ 2 & -3 & -2 & 4 & 4 \\ 3 & -6 & 0 & 6 & 5 \\ -2 & 9 & 2 & -4 & -5 \end{bmatrix}$.
Solution: The null space of A is the solution space of the homogeneous system
$x_1 - 3x_2 + 2x_3 + 2x_4 + x_5 = 0$
$3x_2 + 6x_3 - 3x_5 = 0$
$2x_1 - 3x_2 - 2x_3 + 4x_4 + 4x_5 = 0$
$3x_1 - 6x_2 + 6x_4 + 5x_5 = 0$
$-2x_1 + 9x_2 + 2x_3 - 4x_4 - 5x_5 = 0$
We row reduce the augmented matrix:
$\begin{bmatrix} 1 & -3 & 2 & 2 & 1 & 0 \\ 0 & 3 & 6 & 0 & -3 & 0 \\ 2 & -3 & -2 & 4 & 4 & 0 \\ 3 & -6 & 0 & 6 & 5 & 0 \\ -2 & 9 & 2 & -4 & -5 & 0 \end{bmatrix} \sim \begin{bmatrix} 1 & -3 & 2 & 2 & 1 & 0 \\ 0 & 3 & 6 & 0 & -3 & 0 \\ 0 & 3 & -6 & 0 & 2 & 0 \\ 0 & 3 & -6 & 0 & 2 & 0 \\ 0 & 3 & 6 & 0 & -3 & 0 \end{bmatrix} \quad (-2R_1 + R_3,\ -3R_1 + R_4,\ 2R_1 + R_5)$
$\sim \begin{bmatrix} 1 & -3 & 2 & 2 & 1 & 0 \\ 0 & 1 & 2 & 0 & -1 & 0 \\ 0 & 3 & -6 & 0 & 2 & 0 \\ 0 & 3 & -6 & 0 & 2 & 0 \\ 0 & 3 & 6 & 0 & -3 & 0 \end{bmatrix} \quad (\tfrac{1}{3}R_2)$
$\sim \begin{bmatrix} 1 & -3 & 2 & 2 & 1 & 0 \\ 0 & 1 & 2 & 0 & -1 & 0 \\ 0 & 0 & -12 & 0 & 5 & 0 \\ 0 & 0 & -12 & 0 & 5 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \quad (-3R_2 + R_3,\ -3R_2 + R_4,\ -3R_2 + R_5)$
$\sim \begin{bmatrix} 1 & -3 & 2 & 2 & 1 & 0 \\ 0 & 1 & 2 & 0 & -1 & 0 \\ 0 & 0 & 1 & 0 & -5/12 & 0 \\ 0 & 0 & -12 & 0 & 5 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \quad (-\tfrac{1}{12}R_3)$
$\sim \begin{bmatrix} 1 & -3 & 2 & 2 & 1 & 0 \\ 0 & 1 & 2 & 0 & -1 & 0 \\ 0 & 0 & 1 & 0 & -5/12 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \quad (12R_3 + R_4)$
$\sim \begin{bmatrix} 1 & -3 & 0 & 2 & 11/6 & 0 \\ 0 & 1 & 0 & 0 & -1/6 & 0 \\ 0 & 0 & 1 & 0 & -5/12 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \quad (-2R_3 + R_2,\ -2R_3 + R_1)$
$\sim \begin{bmatrix} 1 & 0 & 0 & 2 & 4/3 & 0 \\ 0 & 1 & 0 & 0 & -1/6 & 0 \\ 0 & 0 & 1 & 0 & -5/12 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \quad (3R_2 + R_1)$
The reduced row echelon form of the augmented matrix corresponds to the system
$x_1 + 2x_4 + \tfrac{4}{3}x_5 = 0$
$x_2 - \tfrac{1}{6}x_5 = 0$
$x_3 - \tfrac{5}{12}x_5 = 0$
plus two equations 0 = 0. No equation of this system has the form zero = nonzero;
therefore, the system is consistent, and since x4 and x5 are free it has infinitely many
solutions:
$x_1 = -2x_4 - \tfrac{4}{3}x_5, \quad x_2 = \tfrac{1}{6}x_5, \quad x_3 = \tfrac{5}{12}x_5$
so that $x = x_4 c_4 + x_5 c_5$ with
$c_4 = (-2, 0, 0, 1, 0), \quad c_5 = (-4/3, 1/6, 5/12, 0, 1)$
and {c4, c5} is a spanning set for Nul A.
The Column Space of a Matrix: Another important subspace associated with a matrix
is its column space. Unlike the null space, the column space is defined explicitly via
linear combinations: if A = [a1 ... an], then Col A = Span{a1, ..., an}.
Theorem: The column space of an m × n matrix A is a subspace of Rm. (This follows
from the definition of Col A and the fact that the columns of A are in Rm.)
Note that a typical vector in Col A can be written as Ax for some x, because the notation
Ax stands for a linear combination of the columns of A. That is,
Col A = {b : b = Ax for some x in Rn}
The notation Ax for vectors in Col A also shows that Col A is the range of the linear
transformation x → Ax.
Example 6: Find a matrix A such that W = Col A, where
$W = \left\{ \begin{bmatrix} 6a - b \\ a + b \\ -7a \end{bmatrix} : a, b \text{ in } \mathbb{R} \right\}$
Solution: First, write W as a set of linear combinations:
$W = \left\{ a \begin{bmatrix} 6 \\ 1 \\ -7 \end{bmatrix} + b \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix} : a, b \text{ in } \mathbb{R} \right\} = \text{Span} \left\{ \begin{bmatrix} 6 \\ 1 \\ -7 \end{bmatrix}, \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix} \right\}$
Second, use the vectors in the spanning set as the columns of A: let $A = \begin{bmatrix} 6 & -1 \\ 1 & 1 \\ -7 & 0 \end{bmatrix}$.
Then W = Col A, as desired.
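Col A can be inspected, and membership of a vector b tested, with SymPy (assumed here); b is in Col A exactly when Ax = b is consistent:

```python
import sympy as sp

A = sp.Matrix([[6, -1],
               [1, 1],
               [-7, 0]])
print(A.columnspace())     # a basis for Col A: here the two columns of A

# Hypothetical test vector: a = 1, b = 1 in the definition of W gives (5, 2, -7).
b = sp.Matrix([5, 2, -7])
print(sp.linsolve((A, b))) # {(1, 1)}: consistent, so b is in Col A
```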
[Figure: W is a plane through the origin 0 in R3.]
We know that the columns of A span Rm if and only if the equation Ax = b has a
solution for each b. We can restate this fact as follows:
The column space of an m × n matrix A is all of Rm if and only if the equation Ax = b has
a solution for each b in Rm.
For instance, suppose the augmented matrix [A b] of a system Ax = b row reduces to
$\begin{bmatrix} 1 & 0 & 0 & 2 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & 3 \end{bmatrix}$
$\Rightarrow x_1 = 2, x_2 = -1, x_3 = 3$. Since the system is consistent, b is in the column space of A.
Example: Determine whether $b = \begin{bmatrix} -1 \\ 0 \\ 2 \end{bmatrix}$ is in the column space of $A = \begin{bmatrix} 1 & 1 & 2 \\ 1 & 0 & 1 \\ 2 & 1 & 3 \end{bmatrix}$.
Solution: The matrix equation Ax = b is
$\begin{bmatrix} 1 & 1 & 2 \\ 1 & 0 & 1 \\ 2 & 1 & 3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} -1 \\ 0 \\ 2 \end{bmatrix}$
The augmented matrix for the linear system that corresponds to this matrix equation is
$\begin{bmatrix} 1 & 1 & 2 & -1 \\ 1 & 0 & 1 & 0 \\ 2 & 1 & 3 & 2 \end{bmatrix}$
We reduce this matrix to the reduced row echelon form:
$\begin{bmatrix} 1 & 1 & 2 & -1 \\ 1 & 0 & 1 & 0 \\ 2 & 1 & 3 & 2 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 2 & -1 \\ 0 & -1 & -1 & 1 \\ 2 & 1 & 3 & 2 \end{bmatrix} \quad (R_2 + (-1)R_1)$
$\sim \begin{bmatrix} 1 & 1 & 2 & -1 \\ 0 & -1 & -1 & 1 \\ 0 & -1 & -1 & 4 \end{bmatrix} \quad (R_3 + (-2)R_1)$
$\sim \begin{bmatrix} 1 & 1 & 2 & -1 \\ 0 & 1 & 1 & -1 \\ 0 & -1 & -1 & 4 \end{bmatrix} \quad ((-1)R_2)$
$\sim \begin{bmatrix} 1 & 1 & 2 & -1 \\ 0 & 1 & 1 & -1 \\ 0 & 0 & 0 & 3 \end{bmatrix} \quad (R_3 + R_2)$
$\sim \begin{bmatrix} 1 & 1 & 2 & -1 \\ 0 & 1 & 1 & -1 \\ 0 & 0 & 0 & 1 \end{bmatrix} \quad (\tfrac{1}{3}R_3)$
$\sim \begin{bmatrix} 1 & 1 & 2 & -1 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \quad (R_2 + R_3)$
$\sim \begin{bmatrix} 1 & 1 & 2 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \quad (R_1 + R_3)$
$\sim \begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \quad (R_1 + (-1)R_2)$
The last row corresponds to the equation 0 = 1, so the system Ax = b is inconsistent,
and therefore b is not in the column space of A.
Determine whether b is in the column space of A:
1. $A = \begin{bmatrix} 1 & -1 & 2 \\ 9 & 3 & 1 \\ 1 & 1 & 1 \end{bmatrix}$, $b = \begin{bmatrix} 5 \\ 1 \\ 0 \end{bmatrix}$
2. $A = \begin{bmatrix} 1 & -1 & 1 \\ 1 & 1 & -1 \\ -1 & -1 & 3 \end{bmatrix}$, $b = \begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix}$
3. $A = \begin{bmatrix} 1 & 1 & -2 & 1 \\ 0 & 2 & 0 & 1 \\ 1 & 1 & 1 & -3 \\ 0 & 2 & 2 & 1 \end{bmatrix}$, $b = \begin{bmatrix} 1 \\ 2 \\ 3 \\ 4 \end{bmatrix}$
Theorem 5: If x0 denotes any single solution of a consistent linear system Ax = b, and if
$v_1, v_2, v_3, \ldots, v_k$ span the solution space of the homogeneous system Ax = 0, then every
solution of Ax = b can be expressed in the form
$x = x_0 + c_1 v_1 + c_2 v_2 + \cdots + c_k v_k$
and, conversely, for all choices of scalars $c_1, c_2, c_3, \ldots, c_k$, this vector x is a solution of Ax = b.
General and Particular Solutions: The vector x0 is called a particular solution of Ax = b.
The expression $x_0 + c_1 v_1 + c_2 v_2 + \cdots + c_k v_k$ is called the general solution of Ax = b, and
the expression $c_1 v_1 + c_2 v_2 + \cdots + c_k v_k$ is called the general solution of Ax = 0.
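A tiny numerical illustration of Theorem 5 (the system below is a made-up example, not one from the notes): adding any multiple of a null-space vector to a particular solution again solves Ax = b.

```python
import numpy as np

# Hypothetical consistent system: the single equation x1 + x2 = 2.
A = np.array([[1.0, 1.0]])
b = np.array([2.0])

x0 = np.array([2.0, 0.0])    # a particular solution of Ax = b
v = np.array([-1.0, 1.0])    # spans Nul A, since A v = 0

for c in (0.0, 1.0, -3.5):
    print(A @ (x0 + c * v))  # always [2.]: x0 + c*v solves Ax = b
```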
Example 7: Find the vector form of the general solution of the given linear system
Ax = b; then use that result to find the vector form of the general solution of Ax = 0.
$x_1 + 3x_2 - 2x_3 + 2x_5 = 0$
$2x_1 + 6x_2 - 5x_3 - 2x_4 + 4x_5 - 3x_6 = -1$
$5x_3 + 10x_4 + 15x_6 = 5$
$2x_1 + 6x_2 + 8x_4 + 4x_5 + 18x_6 = 6$
Solution: We solve the non-homogeneous linear system. The augmented matrix of this
system is
$\begin{bmatrix} 1 & 3 & -2 & 0 & 2 & 0 & 0 \\ 2 & 6 & -5 & -2 & 4 & -3 & -1 \\ 0 & 0 & 5 & 10 & 0 & 15 & 5 \\ 2 & 6 & 0 & 8 & 4 & 18 & 6 \end{bmatrix}$
$\sim \begin{bmatrix} 1 & 3 & -2 & 0 & 2 & 0 & 0 \\ 0 & 0 & -1 & -2 & 0 & -3 & -1 \\ 0 & 0 & 5 & 10 & 0 & 15 & 5 \\ 0 & 0 & 4 & 8 & 0 & 18 & 6 \end{bmatrix} \quad (-2R_1 + R_2,\ -2R_1 + R_4)$
$\sim \begin{bmatrix} 1 & 3 & -2 & 0 & 2 & 0 & 0 \\ 0 & 0 & 1 & 2 & 0 & 3 & 1 \\ 0 & 0 & 5 & 10 & 0 & 15 & 5 \\ 0 & 0 & 4 & 8 & 0 & 18 & 6 \end{bmatrix} \quad (-1 \cdot R_2)$
$\sim \begin{bmatrix} 1 & 3 & -2 & 0 & 2 & 0 & 0 \\ 0 & 0 & 1 & 2 & 0 & 3 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 6 & 2 \end{bmatrix} \quad (-5R_2 + R_3,\ -4R_2 + R_4)$
$\sim \begin{bmatrix} 1 & 3 & -2 & 0 & 2 & 0 & 0 \\ 0 & 0 & 1 & 2 & 0 & 3 & 1 \\ 0 & 0 & 0 & 0 & 0 & 6 & 2 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \quad (R_3 \leftrightarrow R_4)$
$\sim \begin{bmatrix} 1 & 3 & -2 & 0 & 2 & 0 & 0 \\ 0 & 0 & 1 & 2 & 0 & 3 & 1 \\ 0 & 0 & 0 & 0 & 0 & 1 & 1/3 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \quad (\tfrac{1}{6}R_3)$
$\sim \begin{bmatrix} 1 & 3 & -2 & 0 & 2 & 0 & 0 \\ 0 & 0 & 1 & 2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 1/3 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \quad (-3R_3 + R_2)$
$\sim \begin{bmatrix} 1 & 3 & 0 & 4 & 2 & 0 & 0 \\ 0 & 0 & 1 & 2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 1/3 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \quad (2R_2 + R_1)$
The reduced row echelon form of the augmented matrix corresponds to the system
$x_1 + 3x_2 + 4x_4 + 2x_5 = 0$
$x_3 + 2x_4 = 0$
$x_6 = 1/3$
plus 0 = 0. No equation of this system has the form zero = nonzero; therefore, the
system is consistent and has infinitely many solutions:
$x_1 = -3x_2 - 4x_4 - 2x_5, \quad x_3 = -2x_4, \quad x_6 = \tfrac{1}{3}$, with x2, x4, x5 free.
Writing x2 = r, x4 = s, x5 = t:
$x_1 = -3r - 4s - 2t, \quad x_2 = r, \quad x_3 = -2s, \quad x_4 = s, \quad x_5 = t, \quad x_6 = \tfrac{1}{3}$
In vector form, the general solution of Ax = b is
$x = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 1/3 \end{bmatrix} + r \begin{bmatrix} -3 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} + s \begin{bmatrix} -4 \\ 0 \\ -2 \\ 1 \\ 0 \\ 0 \end{bmatrix} + t \begin{bmatrix} -2 \\ 0 \\ 0 \\ 0 \\ 1 \\ 0 \end{bmatrix}$
and dropping the particular solution gives the general solution of Ax = 0:
$x = r \begin{bmatrix} -3 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} + s \begin{bmatrix} -4 \\ 0 \\ -2 \\ 1 \\ 0 \\ 0 \end{bmatrix} + t \begin{bmatrix} -2 \\ 0 \\ 0 \\ 0 \\ 1 \\ 0 \end{bmatrix}$
Activity:
1. Suppose that $x_1 = -1, x_2 = 2, x_3 = 4, x_4 = -3$ is a solution of a non-homogeneous
linear system Ax = b and that the solution set of the homogeneous system Ax = 0
is given by the formulas
$x_1 = -3r + 4s, \quad x_2 = r - s, \quad x_3 = r, \quad x_4 = s$
(a) Find the vector form of the general solution of Ax = 0.
(b) Find the vector form of the general solution of Ax = b.
Find the vector form of the general solution of each of the following linear systems
Ax = b; then use that result to find the vector form of the general solution of Ax = 0:
2. $x_1 - 2x_2 = 1$
   $3x_1 - 9x_2 = 2$
3. $x_1 + 2x_2 - 3x_3 + x_4 = 3$
   $-3x_1 - x_2 + 3x_3 + x_4 = -1$
   $-x_1 + 3x_2 - x_3 + 2x_4 = 2$
   $4x_1 - 5x_2 - 3x_4 = -5$
Example 8: Let $A = \begin{bmatrix} 2 & 4 & -2 & 1 \\ -2 & -5 & 7 & 3 \\ 3 & 7 & -8 & 6 \end{bmatrix}$
(a) If the column space of A is a subspace of Rk, what is k?
(b) If the null space of A is a subspace of Rk, what is k?
Solution:
(a) The columns of A each have three entries, so Col A is a subspace of Rk, where k = 3.
(b) A vector x such that Ax is defined must have four entries, so Nul A is a subspace of
Rk, where k = 4.
When a matrix is not square, as in Example 8, the vectors in Nul A and Col A live in
entirely different "universes". For example, we have discussed no algebraic operations
that connect vectors in R3 with vectors in R4. Thus we are not likely to find any relation
between individual vectors in Nul A and Col A.
Example 9: If $A = \begin{bmatrix} 2 & 4 & -2 & 1 \\ -2 & -5 & 7 & 3 \\ 3 & 7 & -8 & 6 \end{bmatrix}$, find a nonzero vector in Col A and a nonzero
vector in Nul A.
Solution: It is easy to find a vector in Col A: any column of A will do, say $(2, -2, 3)$. To
find a nonzero vector in Nul A, we have to do some work. We row reduce the augmented
matrix [A 0] to obtain
$[A\ 0] \sim \begin{bmatrix} 1 & 0 & 9 & 0 & 0 \\ 0 & 1 & -5 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \end{bmatrix}$
Thus if x satisfies Ax = 0, then $x_1 = -9x_3$, $x_2 = 5x_3$, $x_4 = 0$, and x3 is free. Assigning a
nonzero value to x3, say x3 = 1, we obtain a vector in Nul A, namely x = (−9, 5, 1, 0).
Example 10: With $A = \begin{bmatrix} 2 & 4 & -2 & 1 \\ -2 & -5 & 7 & 3 \\ 3 & 7 & -8 & 6 \end{bmatrix}$, let $u = \begin{bmatrix} 3 \\ -2 \\ -1 \\ 0 \end{bmatrix}$ and $v = \begin{bmatrix} 3 \\ -1 \\ 3 \end{bmatrix}$.
(a) Determine if u is in Nul A. Could u be in Col A?
The following table summarizes what we have learned about Nul A and Col A.
Let T : V → W be a linear transformation from a vector space V into a vector space W.
The kernel (or null space) of such a T is the set of all u in V such that T(u) = 0 (the zero
vector in W). The range of T is the set of all vectors in W of the form T(x) for some x in
V. If T happens to arise as a matrix transformation, say T(x) = Ax for some matrix A,
then the kernel and the range of T are just the null space and the column space of A, as
defined earlier. So if T(x) = Ax, then Col A = range of T.
[Figure: T maps its domain V into W; the kernel of T is a subspace of V, and the range
of T is a subspace of W.]
To explain this in any detail would take us too far afield at this point, so we present only
two examples. The first explains why the operation of differentiation is a linear
transformation.
Example 11: Let V be the vector space of all real-valued functions f defined on an
interval [a, b] with the property that they are differentiable and their derivatives are
continuous functions on [a, b]. Let W be the vector space of all continuous functions on
[a, b], and let D : V → W be the transformation that changes f in V into its
derivative f′. In calculus, two simple differentiation rules are
D(f + g) = D(f) + D(g)   and   D(cf) = cD(f)
That is, D is a linear transformation. It can be shown that the kernel of D is the set of
constant functions on [a, b] and the range of D is the set W of all continuous functions on
[a, b].
Example 13: Let $W = \left\{ \begin{bmatrix} a \\ b \\ c \end{bmatrix} : a - 3b - c = 0 \right\}$. Show that W is a subspace of R3 in
different ways.
Solution: First method: W is a subspace of R3 by Theorem 2, because W is the set of all
solutions to a system of homogeneous linear equations (where the system has only one
equation). Equivalently, W is the null space of the 1 × 3 matrix A = [1  −3  −1].
Second method: Solve the equation a − 3b − c = 0 for the leading variable a in terms of
the free variables b and c. Any solution has the form
$\begin{bmatrix} 3b + c \\ b \\ c \end{bmatrix} = b \begin{bmatrix} 3 \\ 1 \\ 0 \end{bmatrix} + c \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}$
where b and c are arbitrary; calling the two vectors on the right v1 and v2, we have
W = Span{v1, v2}, so W is a subspace of R3.
Example 14: Let $A = \begin{bmatrix} 7 & -3 & 5 \\ -4 & 1 & -5 \\ -5 & 2 & -4 \end{bmatrix}$, $v = \begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix}$, and $w = \begin{bmatrix} 7 \\ 6 \\ -3 \end{bmatrix}$.
Suppose you know that the equations Ax = v and Ax = w are both consistent. What can
you say about the equation Ax = v + w?
Solution: Both v and w are in Col A. Since Col A is a vector space, v + w must be in Col
A. That is, the equation Ax = v + w is consistent.
Activity:
1. Let V and W be any two vector spaces. The mapping T : V → W such that T(v) = 0
for every v in V is a linear transformation, called the zero transformation. Find the
kernel and range of the zero transformation.
Exercises:
1. Determine if $w = \begin{bmatrix} 5 \\ -3 \\ 2 \end{bmatrix}$ is in Nul A, where $A = \begin{bmatrix} 5 & 21 & 19 \\ 13 & 23 & 2 \\ 8 & 14 & 1 \end{bmatrix}$.
In exercises 2 and 3, find an explicit description of Nul A by listing vectors that span the
null space.
2. $\begin{bmatrix} 1 & 3 & 5 & 0 \\ 0 & 1 & 4 & -2 \end{bmatrix}$   3. $\begin{bmatrix} 1 & -2 & 0 & 4 & 0 \\ 0 & 0 & 1 & -9 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}$
In exercises 4-7, either use an appropriate theorem to show that the given set W is a
vector space, or find a specific example to the contrary.
4. $\left\{ \begin{bmatrix} a \\ b \\ c \end{bmatrix} : a + b + c = 2 \right\}$   5. $\left\{ \begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix} : \begin{aligned} a - 2b &= 4c \\ 2a &= c + 3d \end{aligned} \right\}$
6. $\left\{ \begin{bmatrix} b - 2d \\ 5 + d \\ b + 3d \\ d \end{bmatrix} : b, d \text{ real} \right\}$   7. $\left\{ \begin{bmatrix} -a + 2b \\ a - 2b \\ 3a - 6b \end{bmatrix} : a, b \text{ real} \right\}$
8. $\left\{ \begin{bmatrix} 2s + 3t \\ r + s - 2t \\ 4r + s \\ 3r - s - t \end{bmatrix} : r, s, t \text{ real} \right\}$   9. $\left\{ \begin{bmatrix} b - c \\ 2b + c + d \\ 5c - 4d \\ d \end{bmatrix} : b, c, d \text{ real} \right\}$
For the matrices in exercises 10-13, (a) find k such that Nul A is a subspace of Rk, and
(b) find k such that Col A is a subspace of Rk.
10. $A = \begin{bmatrix} 2 & -6 \\ -1 & 3 \\ -4 & 12 \\ 3 & -9 \end{bmatrix}$   11. $A = \begin{bmatrix} 7 & -2 & 0 \\ -2 & 0 & -5 \\ 0 & -5 & 7 \\ -5 & 7 & -2 \end{bmatrix}$
12. $A = \begin{bmatrix} 4 & 5 & -2 & 6 & 0 \\ 1 & 1 & 0 & 1 & 0 \end{bmatrix}$   13. A = [1  −3  9  0  −5]
14. Let $A = \begin{bmatrix} -6 & 12 \\ -3 & 6 \end{bmatrix}$ and $w = \begin{bmatrix} 2 \\ 1 \end{bmatrix}$. Determine if w is in Col A. Is w in Nul A?
15. Let $A = \begin{bmatrix} -8 & -2 & -9 \\ 6 & 4 & 8 \\ 4 & 0 & 4 \end{bmatrix}$ and $w = \begin{bmatrix} 2 \\ 1 \\ -2 \end{bmatrix}$. Determine if w is in Col A. Is w in Nul A?
16. Define T: P2 → R2 by $T(p) = \begin{bmatrix} p(0) \\ p(1) \end{bmatrix}$. For instance, if p(t) = 3 + 5t + 7t², then
$T(p) = \begin{bmatrix} 3 \\ 15 \end{bmatrix}$.
a. Show that T is a linear transformation.
b. Find a polynomial p in P2 that spans the kernel of T, and describe the range of T.
17. Define a linear transformation T: P2 → R2 by $T(p) = \begin{bmatrix} p(0) \\ p(0) \end{bmatrix}$. Find polynomials p1
and p2 in P2 that span the kernel of T, and describe the range of T.
18. Let M2x2 be the vector space of all 2×2 matrices, and define T: M2x2 → M2x2 by
T(A) = A + A^T, where $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$.
(a) Show that T is a linear transformation.
(b) Let B be any element of M2x2 such that B^T = B. Find an A in M2x2 such that T(A) = B.
(c) Show that the range of T is the set of B in M2x2 with the property that B^T = B.
(d) Describe the kernel of T.
19. Determine whether w is in the column space of A, the null space of A, or both, where
(a) $w = \begin{bmatrix} 1 \\ 1 \\ -1 \\ -3 \end{bmatrix}$, $A = \begin{bmatrix} 7 & 6 & -4 & 1 \\ -5 & -1 & 0 & -2 \\ 9 & -11 & 7 & -3 \\ 19 & -9 & 7 & 1 \end{bmatrix}$   (b) $w = \begin{bmatrix} 1 \\ 2 \\ 1 \\ 0 \end{bmatrix}$, $A = \begin{bmatrix} -8 & 5 & -2 & 0 \\ -5 & 2 & 1 & -2 \\ 10 & -8 & 6 & -3 \\ 3 & -2 & 1 & 0 \end{bmatrix}$
Lecture 06
Definition: Let V be an arbitrary nonempty set of objects on which two operations are
defined: addition, and multiplication by scalars. If the vector-space axioms are satisfied
by all objects u, v, w in V and all scalars l and m, then we call V a vector space.
Theorem: If W is a set of one or more vectors from a vector space V, then W is a subspace
of V if and only if the following conditions hold:
(a) if u and v are vectors in W, then u + v is in W;
(b) if c is any scalar and u is any vector in W, then cu is in W.
Definition: The null space of an m × n matrix A (Nul A) is the set of all solutions of the
homogeneous equation Ax = 0:
Nul A = {x : x is in Rn and Ax = 0}
Definition: The column space of an m × n matrix A (Col A) is the set of all linear
combinations of the columns of A. If A = [a1 ... an], then
Col A = Span{a1, ..., an}
We know that a set of vectors $S = \{v_1, v_2, v_3, \ldots, v_p\}$ spans a given vector space V if
every vector in V is expressible as a linear combination of the vectors in S. In general
there may be more than one way to express a vector in V as a linear combination of
vectors in a spanning set. We shall study conditions under which each vector in V is
expressible as a linear combination of the spanning vectors in exactly one way. Spanning
sets with this property play a fundamental role in the study of vector spaces.
In this Lecture, we shall identify and study the subspace H as “efficiently” as possible.
The key idea is that of linear independence, defined as in Rn.
Linear Independence: Given a set of vectors $\{v_1, v_2, \ldots, v_p\}$ in a vector space V,
consider the vector equation
$c_1 v_1 + c_2 v_2 + \cdots + c_p v_p = 0$     (1)
If the trivial solution ($c_1 = c_2 = \cdots = c_p = 0$) is the only solution to this equation, then
the vectors in the set are called linearly independent and the set is called a linearly
independent set. If there is another solution, then the vectors in the set are called linearly
dependent and the set is called a linearly dependent set.
Just as in Rn, a set containing a single vector v is linearly independent if and only if v ≠ 0 .
Also, a set of two vectors is linearly dependent if and only if one of the vectors is a
multiple of the other. And any set containing the zero-vector is linearly dependent.
Another situation in which it is easy to determine linear independence is when there are
more vectors in the set than entries in the vectors. If n > m, then the n vectors
$a_1, a_2, a_3, \ldots, a_n$ in Rm are the columns of an m × n matrix A. The vector equation
$x_1 a_1 + x_2 a_2 + x_3 a_3 + \cdots + x_n a_n = 0$ is equivalent to the matrix equation Ax = 0, whose
corresponding linear system has more variables than equations. Thus there must be at
least one free variable in the solution, meaning that there are nontrivial solutions and the
set is linearly dependent.
Example 1: Determine whether the vectors
v1 = [-2; 1; 1], v2 = [2; 1; -2], v3 = [0; 0; 1]
are linearly independent or linearly dependent.
Solution: Let there exist scalars c1, c2, c3 in R such that
c1v1 + c2v2 + c3v3 = 0
Therefore,
⇒ c1[-2; 1; 1] + c2[2; 1; -2] + c3[0; 0; 1] = 0
⇒ [-2c1; c1; c1] + [2c2; c2; -2c2] + [0; 0; c3] = [0; 0; 0]
⇒ [-2c1 + 2c2; c1 + c2; c1 - 2c2 + c3] = [0; 0; 0]
The above can be written as:
-2c1 + 2c2 = 0 ........(1)
c1 + c2 = 0 ........(2)
c1 - 2c2 + c3 = 0 ........(3)
Dividing both sides of (1) by 2 gives
-c1 + c2 = 0 ........(4)
Adding (2) and (4) gives 2c2 = 0, so c2 = 0; then (2) gives c1 = 0 and (3) gives c3 = 0. Since the trivial solution is the only solution, the vectors v1, v2, v3 are linearly independent.
Note: Linear independence or dependence can also be determined using the Echelon Form or the Reduced Row Echelon Form methods.
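As a quick cross-check of Example 1, the test can be automated: place v1, v2, v3 as the columns of a matrix and compare its rank with the number of vectors. A minimal sketch in Python, assuming the NumPy library is available (the code is an illustration, not part of the original lecture):

    import numpy as np

    # Columns are v1, v2, v3 from Example 1.
    A = np.array([[-2.0, 2.0, 0.0],
                  [ 1.0, 1.0, 0.0],
                  [ 1.0, -2.0, 1.0]])

    # The columns are linearly independent exactly when Ax = 0 has only the
    # trivial solution, i.e. when rank(A) equals the number of columns.
    print(np.linalg.matrix_rank(A) == A.shape[1])   # True: linearly independent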
The main difference between linear dependence in Rn and in a general vector space is that
when the vectors are not n – tuples, the homogeneous equation (1) usually cannot be
written as a system of n linear equations. That is, the vectors cannot be made into the
columns of a matrix A in order to study the equation Ax = 0. We must rely instead on the
definition of linear dependence and on Theorem 1.
Example 2: The set {Sin t, Cos t} is linearly independent in C[0, 1] because Sin t and Cos t are not multiples of one another as vectors in C[0, 1]. That is, there is no scalar c such that Cos t = c Sin t for all t in [0, 1]. (Look at the graphs of Sin t and Cos t.) However, {Sin t Cos t, Sin 2t} is linearly dependent because of the identity:
Sin 2t = 2 Sin t Cos t, for all t.
Useful results:
• A set containing the zero vector is linearly dependent.
• A set of two vectors is linearly dependent if and only if one is a multiple of the
other.
• A set containing one nonzero vector is linearly independent, i.e. for the set {v1} containing a single vector, {v1} is linearly independent when v1 ≠ 0.
• A set of two vectors is linearly independent if and only if neither of the vectors is
a multiple of the other.
Activity: Determine whether the following sets of vectors are linearly independent or
linearly dependent:
1. i = (1, 0, 0, 0), j = (0, 1, 0, 0), k = (0, 0, 0, 1) in R4.
2. v1 = (2, 0, -1), v2 = (-3, -2, -5), v3 = (-6, 1, -1), v4 = (-7, 0, 2) in R3.
3. i = (1, 0, 0, ..., 0), j = (0, 1, 0, ..., 0), k = (0, 0, 0, ..., 1) in Rm.
4. 3x2 + 3x + 1, 4x2 + x, 3x2 + 6x + 5, -x2 + 2x + 7 in P2.
Example 4: Let e1, ..., en be the columns of the n x n identity matrix In. That is,
e1 = [1; 0; ...; 0], e2 = [0; 1; ...; 0], ..., en = [0; 0; ...; 1]
The set {e1, ..., en} is linearly independent and spans Rn; it is called the standard basis for Rn.
[Figure: the standard basis vectors e1, e2, e3 of R3 drawn along the x1, x2, x3 axes]
Example 6: Let S = {1, t, t2, …, tn}. Verify that S is a basis for P n . This basis is called
the standard basis for P n .
[Figure: graphs of y = 1, y = t and y = t2, the first three vectors of the standard basis for Pn]
Example 7: Check whether the set of vectors {(2, -3, 1), (4, 1, 1), (0, -7, 1)} is a basis for R3.
Solution: The set S = {v 1 , v 2 , v 3 } of vectors in R3 spans V = R3 if
c 1 v 1 + c 2 v 2 + c 3 v 3 = d 1w 1 + d 2w 2 + d3 w3 (*)
with w 1 = (1,0,0), w 2 = (0,1,0) , w 3 = (0,0,1) has at least one solution for every set of
values of the coefficients d 1 , d 2 , d 3 . Otherwise (i.e., if no solution exists for at least some
values of d 1 , d 2 , d 3 ), S does not span V. With our vectors v 1 , v 2 , v 3 , (*) becomes
c 1 (2,-3,1) + c 2 (4,1,1) + c 3 (0,-7,1) = d 1 (1,0,0) + d 2 (0,1,0) + d 3 (0,0,1)
Rearranging the left hand side yields
2 c1 + 4 c2 +0 c3 = 1 d1 +0 d 2 +0 d 3
-3 c1 +1 c2 -7 c3 = 0 d1 +1 d 2 +0 d 3 (A)
1 c1 +1 c2 +1 c3 = 0 d1 +0 d 2 +1 d 3
⇒ [2 4 0; -3 1 -7; 1 1 1][c1; c2; c3] = [d1; d2; d3]
We now find the determinant of the coefficient matrix [2 4 0; -3 1 -7; 1 1 1] to determine whether the system is consistent (so that S spans V), or inconsistent (S does not span V).
Now det [2 4 0; -3 1 -7; 1 1 1] = 2(8) - 4(4) + 0 = 0
Therefore, the system (A) is inconsistent, and, consequently, the set S does not span the
space V.
Example 8: Check whether the set S = {p1(t), p2(t), p3(t)}, where p1(t) = -4 + t + 3t2, p2(t) = 6 + 5t + 2t2 and p3(t) = 8 + 4t + t2, is a basis for P2.
Solution: The set S spans V = P2 if every polynomial in P2 can be written as a linear combination of p1, p2 and p3; as before, this leads to a linear system (A) whose coefficient matrix is [-4 6 8; 1 5 4; 3 2 1].
We now find the determinant of the coefficient matrix [-4 6 8; 1 5 4; 3 2 1] to determine whether the system is consistent (so that S spans V), or inconsistent (S does not span V).
Now det [-4 6 8; 1 5 4; 3 2 1] = -26 ≠ 0. Therefore, the system (A) is consistent, and, consequently, the set S spans the space V.
The set S = {p 1 (t), p 2 (t), p 3 (t)} of vectors in P 2 is linearly independent if the only
solution of
c 1 p 1 (t) + c 2 p 2 (t) + c 3 p 3 (t) = 0 (**)
is c 1 , c 2 , c 3 = 0. In this case, the set S forms a basis for span S. Otherwise (i.e., if a
solution with at least some nonzero values exists), S is linearly dependent. With our
vectors p1(t), p2(t), p3(t), (**) becomes
c1(-4 + 1t + 3t2) + c2(6 + 5t + 2t2) + c3(8 + 4t + 1t2) = 0
Rearranging the left hand side yields
(-4c1 + 6c2 + 8c3)1 + (1c1 + 5c2 + 4c3)t + (3c1 + 2c2 + 1c3)t2 = 0
This yields the following homogeneous system of equations:
-4c1 + 6c2 + 8c3 = 0
1c1 + 5c2 + 4c3 = 0      ⇒      [-4 6 8; 1 5 4; 3 2 1][c1; c2; c3] = [0; 0; 0]
3c1 + 2c2 + 1c3 = 0
Since det [-4 6 8; 1 5 4; 3 2 1] = -26 ≠ 0, the set S = {p1(t), p2(t), p3(t)} is linearly independent. Consequently, the set S forms a basis for span S; since S also spans P2, it is a basis for P2.
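The determinant test used in Examples 7 and 8 is easy to automate: write the coefficients of each polynomial (with respect to the standard basis {1, t, t2}) as a column and test whether the determinant is nonzero. A minimal sketch in Python/NumPy (assumed available; illustrative only):

    import numpy as np

    # Coefficient columns of p1 = -4 + t + 3t^2, p2 = 6 + 5t + 2t^2, p3 = 8 + 4t + t^2.
    P = np.array([[-4.0, 6.0, 8.0],
                  [ 1.0, 5.0, 4.0],
                  [ 3.0, 2.0, 1.0]])

    d = np.linalg.det(P)
    print(round(d, 6))       # -26.0
    print(abs(d) > 1e-9)     # True: {p1, p2, p3} is a basis for P2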
Example 9: The set S = {[1 0; 0 0], [0 1; 0 0], [0 0; 1 0], [0 0; 0 1]} is a basis for the vector space V of all 2 x 2 matrices.
Solution: To verify that S is linearly independent, we form a linear combination of the
vectors in S and set it equal to zero:
c1[1 0; 0 0] + c2[0 1; 0 0] + c3[0 0; 1 0] + c4[0 0; 0 1] = [0 0; 0 0]
This gives [c1 c2; c3 c4] = [0 0; 0 0], which implies that c1 = c2 = c3 = c4 = 0. Hence S is
linearly independent.
To verify that S spans V, we take any vector [a b; c d] in V and we must find scalars c1, c2, c3, and c4 such that
c1[1 0; 0 0] + c2[0 1; 0 0] + c3[0 0; 1 0] + c4[0 0; 0 1] = [a b; c d]   ⇒   [c1 c2; c3 c4] = [a b; c d]
We find that c1 = a, c2 = b, c3 = c, and c4 = d so that S spans V.
The basis S in this example is called the standard basis for M22. More generally, the standard basis for Mmn consists of the mn different matrices with a single 1 and zeros for the remaining entries.
Example 10: Determine whether the set S = {v1, v2, v3, v4}, where
v1 = [3 6; 3 -6], v2 = [0 -1; -1 0], v3 = [0 -8; -12 -4], v4 = [1 0; -1 2],
is a basis for the vector space M22.
Solution: The set S spans M22 if, for every matrix [d1 d2; d3 d4] in M22, scalars c1, c2, c3, c4 can be found with
c1v1 + c2v2 + c3v3 + c4v4 = d1[1 0; 0 0] + d2[0 1; 0 0] + d3[0 0; 1 0] + d4[0 0; 0 1]
Computing the left-hand side and equating corresponding entries yields the system
3c1 + 0c2 + 0c3 + 1c4 = 1d1 + 0d2 + 0d3 + 0d4
6c1 - 1c2 - 8c3 + 0c4 = 0d1 + 1d2 + 0d3 + 0d4
3c1 - 1c2 - 12c3 - 1c4 = 0d1 + 0d2 + 1d3 + 0d4
-6c1 + 0c2 - 4c3 + 2c4 = 0d1 + 0d2 + 0d3 + 1d4
⇒ [3 0 0 1; 6 -1 -8 0; 3 -1 -12 -1; -6 0 -4 2][c1; c2; c3; c4] = [d1; d2; d3; d4]
We now find the determinant of the coefficient matrix A = [3 0 0 1; 6 -1 -8 0; 3 -1 -12 -1; -6 0 -4 2] to determine
whether the system is consistent (so that S spans V), or inconsistent (S does not span V).
Now det (A) = 48 ≠ 0. Therefore, the system (A) is consistent, and, consequently, the set
S spans the space V.
Now, the set S = {v 1 , v 2 , v 3 , v 4 } of vectors in M 22 is linearly independent if the only
solution of c 1 v 1 + c 2 v 2 + c 3 v 3 + c 4 v 4 = 0 is c 1 , c 2 , c 3 , c 4 = 0. In this case the set S
forms a basis for span S. Otherwise (i.e., if a solution with at least some nonzero values
exists), S is linearly dependent. With our vectors v 1 , v 2 , v 3 , v 4 , we have
c1[3 6; 3 -6] + c2[0 -1; -1 0] + c3[0 -8; -12 -4] + c4[1 0; -1 2] = [0 0; 0 0]
Rearranging the left hand side yields
[3c1 + 0c2 + 0c3 + 1c4   6c1 - 1c2 - 8c3 + 0c4; 3c1 - 1c2 - 12c3 - 1c4   -6c1 + 0c2 - 4c3 + 2c4] = [0 0; 0 0]
⇒ [3 0 0 1; 6 -1 -8 0; 3 -1 -12 -1; -6 0 -4 2][c1; c2; c3; c4] = [0; 0; 0; 0]
As det(A) = 48 ≠ 0, this homogeneous system has only the trivial solution c1 = c2 = c3 = c4 = 0, so S is linearly independent. Since S also spans M22, the set S is a basis for M22.
Example 11: Let v1 = [1; -2; -3], v2 = [-3; 5; 7], v3 = [-4; 5; 6], and H = Span{v1, v2, v3}.
Note that v 3 = 5v 1 + 3v 2 and show that Span {v 1 , v 2 , v 3 } = Span {v 1 , v 2 }. Then find a
basis for the subspace H.
Solution:
[Figure: the vectors v1, v2 and v3 = 5v1 + 3v2 in R3; H = Span{v1, v2, v3} is the plane through the origin containing v1 and v2]
Now let x be any vector in H – say, x = c1v1 + c2v2 + c3v3. Since v3 = 5v1 + 3v2, we may
substitute
x = c 1 v 1 + c 2 v 2 + c 3 (5v 1 + 3v 2 )
= (c 1 + 5c 3 ) v 1 + (c 2 + 3c 3 ) v 2
Thus x is in Span {v 1 , v 2 }, so every vector in H already belongs to Span {v 1 , v 2 }. We
conclude that H and Span {v 1 , v 2 } are actually the same set of vectors. It follows that
{v 1 , v 2 } is a basis of H since {v 1 , v 2 } is obviously linearly independent.
Activity: Find a basis for the subspace spanned by the following vectors:
1. v1 = (1, 0, 0), v2 = (0, 2, 1), v3 = (3, 0, 1)
2. v1 = (1, 2, 3), v2 = (0, 1, 1), v3 = (0, 1, 3)
Recall that the span of a set of vectors is the set of all linear combinations of those vectors, and a basis is a set of linearly independent vectors whose span is the entire vector space. A spanning set is a set of vectors whose span is the entire vector space. The Spanning Set Theorem says that a spanning set of vectors always contains a subset that is a basis.
Procedure:
The procedure for finding a subset of S that is a basis for W = span S is as follows:
Step 1 Write the Equation,
c 1 v 1 + c 2 v 2 + …+ c n v n =0 (3)
Step 2 Construct the augmented matrix associated with the homogeneous system of Equation (3) and transform it to reduced row echelon form.
Step 3 The vectors corresponding to the columns containing the leading 1’s form a basis
for W = span S.
Thus if S = {v 1 , v 2 ,…, v 6 } and the leading 1’s occur in columns 1, 3, and 4, then { v 1 , v 3 , v 4 } is
a basis for span S.
Note In step 2 of the procedure above, it is sufficient to transform the augmented matrix to row
echelon form.
Two Views of a Basis When the Spanning Set Theorem is used, the deletion of
vectors from a spanning set must stop when the set becomes linearly independent. If
an additional vector is deleted, it will not be a linear combination of the remaining
vectors and hence the smaller set will no longer span V. Thus a basis is a spanning set
that is as small as possible.
A basis is also a linearly independent set that is as large as possible. If S is a basis for V,
and if S is enlarged by one vector – say, w – from V, then the new set cannot be linearly
independent, because S spans V, and w is therefore a linear combination of the elements
in S.
Example 13: The following three sets in R3 show how a linearly independent set can be
enlarged to a basis and how further enlargement destroys the linear independence of the
set. Also, a spanning set can be shrunk to a basis, but further shrinking destroys the
spanning property.
{[1; 0; 0], [2; 3; 0]}        {[1; 0; 0], [2; 3; 0], [4; 5; 6]}        {[1; 0; 0], [2; 3; 0], [4; 5; 6], [7; 8; 9]}
Example 14: Let v1 = [1; 0; 0], v2 = [0; 1; 0], and H = {[s; s; 0]: s in R}. Then every vector in H is a linear combination of v1 and v2, because [s; s; 0] = s[1; 0; 0] + s[0; 1; 0]. Is {v1, v2} a basis for H?
Activity: Find a basis for the subspace spanned by the following vectors:
1. v1 = (1, 0, 2), v2 = (3, 2, 1), v3 = (1, 0, 6), v4 = (3, 2, 1)
2. v1 = (1, 2, 2), v2 = (3, 2, 1), v3 = (1, 1, 7), v4 = (7, 6, 4)
Exercises:
Determine which sets in exercises 1-4 are bases for R2 or R3. Of the sets that are not bases, determine which ones are linearly independent and which ones span R2 or R3. Justify your answers.
1. [1; 0; -2], [3; 2; -4], [-3; -5; 1]
2. [1; -3; 0], [-2; 9; 0], [0; 0; 0], [0; -3; 5]
8. Explain why the following sets of vectors are not bases for the indicated vector spaces.
(Solve this problem by inspection).
(a) u 1 = (1, 2), u 2 = (0, 3), u 3 = (2, 7) for R2
(b) u 1 = (-1, 3, 2), u 2 = (6, 1, 1) for R3
(c) p 1 = 1 + x + x2, p 2 = x – 1 for P 2
(d) A = [1 1; 2 3], B = [6 0; -1 4], C = [3 0; 1 7], D = [5 1; 4 2], E = [7 1; 2 9] for M22
In exercises 11-13, determine a basis for the solution space of the system.
11. x1 + x2 - x3 = 0          12. 2x1 + x2 + 3x3 = 0
    -2x1 - x2 + 2x3 = 0           x1 + 5x3 = 0
    -x1 + x3 = 0                  x2 + x3 = 0
13. x + y + z = 0
    3x + 2y - 2z = 0
    4x + 3y - z = 0
    6x + 5y + z = 0
15. Find a standard basis vector that can be added to the set {v 1 , v 2 } to produce a basis
for R3.
(a) v 1 = (-1, 2, 3), v 2 = (1, -2, -2) (b) v 1 = (1, -1, 0), v 2 = (3, 1, -2)
16. Find a standard basis vector that can be added to the set {v 1 , v 2 } to produce a basis
for R4.
v 1 = (1, -4, 2, -3), v 2 = (-3, 8, -4, 6)
Lecture 07
Dimension of a Vector Space
Theorem 2: If a vector space V has a basis of n vectors, then every basis of V must consist of exactly n vectors.
Definition: If V is spanned by a finite set, then V is called finite-dimensional, and the dimension of V, written dim V, is the number of vectors in a basis for V.
Note:
(1) The dimension of the zero vector space {0} is defined to be zero.
(2) Every finite dimensional vector space contains a basis.
Example 1: The set Rn of real n-tuples, the set Pn of polynomials of degree at most n, and the set Mmn of m x n matrices are all finite-dimensional vector spaces. However, the vector spaces F(-∞, ∞), C(-∞, ∞), and Cm(-∞, ∞) are infinite-dimensional.
Example 2:
(a) Any pair of non-parallel vectors a, b in the xy-plane, which are necessarily linearly
independent, can be regarded as a basis of the subspace R2. In particular the set of unit
vectors {i, j} forms a basis for R2. Therefore, dim (R2) = 2.
Any set of three non-coplanar vectors {a, b, c} in ordinary (physical) space, which will be necessarily linearly independent, spans the space R3. Therefore any set of such vectors forms a basis for R3. In particular the set of unit vectors {i, j, k} forms a basis of R3. This basis is called the standard basis for R3. Therefore dim(R3) = 3.
(b) The set B = {1, x, x2, ..., xn} forms a basis for the vector space Pn of polynomials of degree ≤ n. It is called the standard basis, with dim(Pn) = n + 1.
Example 3: Find a basis for the subspace W of M22 consisting of all matrices A = [a b; c d] whose entries satisfy d = -2a + b - 3c.
Solution: Now A = [a b; c d]. Substituting the value of d, it becomes
A = [a b; c -2a + b - 3c]
This can be written as
A = [a 0; 0 -2a] + [0 b; 0 b] + [0 0; c -3c] = a[1 0; 0 -2] + b[0 1; 0 1] + c[0 0; 1 -3] = aA1 + bA2 + cA3
where A1 = [1 0; 0 -2], A2 = [0 1; 0 1], and A3 = [0 0; 1 -3].
The matrix A is in W if and only if A = aA 1 + bA 2 + cA 3 , so {A 1 , A 2 , A 3 } is a spanning set for
W. Now, check if this set is a basis for W or not. We will see whether {A 1 , A 2 , A 3 } is linearly
independent or not. {A 1 , A 2 , A 3 } is said to be linearly independent if
aA1 + bA2 + cA3 = 0 ⇒ a = b = c = 0, i.e.,
a[1 0; 0 -2] + b[0 1; 0 1] + c[0 0; 1 -3] = [0 0; 0 0]
⇒ [a 0; 0 -2a] + [0 b; 0 b] + [0 0; c -3c] = [0 0; 0 0]
⇒ [a b; c -2a + b - 3c] = [0 0; 0 0]
Equating entries gives a = b = c = 0. Hence {A1, A2, A3} is linearly independent and is therefore a basis for W, so dim W = 3.
Example 4: Let H = Span{v1, v2}, where v1 = [3; 6; 2] and v2 = [-1; 0; 1]. Then H is the plane
studied in Example 10 of lecture 23. A basis for H is {v 1 , v 2 }, since v 1 and v 2 are not
multiples and hence are linearly independent. Thus, dim H = 2.
[Figure: the plane H = Span{v1, v2}, showing the grid points 0, v1, 2v1, v2, 2v2, 3v2 and the point x = 2v1 + 3v2]
0-dimensional subspaces:
The only 0-dimensional subspace of R3 is the zero subspace {0}.
1-dimensional subspaces:
1-dimensional subspaces include any subspace spanned by a single non-zero vector. Such subspaces are lines through the origin.
2-dimensional subspaces:
Any subspace spanned by two linearly independent vectors. Such subspaces are
planes through the origin.
3-dimensional subspaces:
The only 3-dimensional subspace is R3 itself. Any three linearly independent
vectors in R3 span all of R3, by the Invertible Matrix Theorem.
[Figure: subspaces of R3 of dimensions 0, 1, 2 and 3, drawn in the x1x2x3-coordinate system]
Example 7: Find a basis for the null space of A = [2 2 -1 0 1; -1 -1 2 -3 1; 1 1 -2 0 -1; 0 0 1 1 1].
Solution: The null space of A is the solution space of homogeneous system
2x1 + 2x2 - x3 + x5 = 0
- x1 - x2 + 2x3 - 3x4 + x5 = 0
x1 + x2 - 2x3 - x5 = 0
x3 + x4 + x5 = 0
The most appropriate way to solve this system is to reduce its augmented matrix into
reduced echelon form.
The augmented matrix is
[2 2 -1 0 1 | 0; -1 -1 2 -3 1 | 0; 1 1 -2 0 -1 | 0; 0 0 1 1 1 | 0]
R1 ↔ R3:
[1 1 -2 0 -1 | 0; -1 -1 2 -3 1 | 0; 2 2 -1 0 1 | 0; 0 0 1 1 1 | 0]
R2 + R1, R3 - 2R1:
[1 1 -2 0 -1 | 0; 0 0 0 -3 0 | 0; 0 0 3 0 3 | 0; 0 0 1 1 1 | 0]
R2 ↔ R4, (1/3)R3:
[1 1 -2 0 -1 | 0; 0 0 1 1 1 | 0; 0 0 1 0 1 | 0; 0 0 0 -3 0 | 0]
R3 - R2, (-1/3)R4:
[1 1 -2 0 -1 | 0; 0 0 1 1 1 | 0; 0 0 0 -1 0 | 0; 0 0 0 1 0 | 0]
(-1)R3, R4 - R3:
[1 1 -2 0 -1 | 0; 0 0 1 1 1 | 0; 0 0 0 1 0 | 0; 0 0 0 0 0 | 0]
R2 - R3, then R1 + 2R2:
[1 1 0 0 1 | 0; 0 0 1 0 1 | 0; 0 0 0 1 0 | 0; 0 0 0 0 0 | 0]
Thus, the reduced row echelon form of the augmented matrix is
[1 1 0 0 1 | 0; 0 0 1 0 1 | 0; 0 0 0 1 0 | 0; 0 0 0 0 0 | 0]
which corresponds to the system
x1 + x2 + x5 = 0
x3 + x5 = 0
x4 = 0
0 = 0
No equation of this system has the form zero = nonzero. Therefore, the system is consistent. Since the number of unknowns is more than the number of equations, we assign arbitrary values to some variables. This leads to infinitely many solutions of the system.
x1 = -x2 - x5
x2 = s
x3 = -x5
x4 = 0
x5 = t
The general solution of the given system is
x1 = - s - t , x2 = s , x3 = - t , x4 = 0 , x5 = t
Therefore, the solution vector can be written as
x1 -s - t -s -t -1 -1
x s s 0 1 0
2
x3 = -t = 0 + -t = s 0 +t -1
x4 0 0 0 0 0
x5 t 0 t 0 1
-1 -1
1 0
which shows that the vectors v1 = 0 and v 2 = -1 span the solution space .Since they
0 0
0 1
are also linearly independent,{v1,v2} is a basis for Nul A.
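The same basis can be produced mechanically. A minimal sketch in Python with the SymPy library (assumed available); exact rational arithmetic avoids any rounding in the row reduction:

    from sympy import Matrix

    A = Matrix([[ 2,  2, -1,  0,  1],
                [-1, -1,  2, -3,  1],
                [ 1,  1, -2,  0, -1],
                [ 0,  0,  1,  1,  1]])

    # nullspace() returns one basis vector per free variable of Ax = 0.
    for v in A.nullspace():
        print(v.T)   # [-1, 1, 0, 0, 0] and [-1, 0, -1, 0, 1]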
The next two examples describe a simple algorithm for finding a basis for the column
space.
Example 8: Find a basis for Col B, where B = [b1 b2 ... b5] = [1 4 0 2 0; 0 0 1 -1 0; 0 0 0 0 1; 0 0 0 0 0].
Solution Each non-pivot column of B is a linear combination of the pivot columns. In
fact, b 2 = 4b 1 and b 4 = 2b 1 – b 3 . By the Spanning Set Theorem, we may discard b 2 and
b 4 and {b 1 , b 3 , b 5 } will still span Col B. Let
S = {b1, b3, b5} = {[1; 0; 0; 0], [0; 1; 0; 0], [0; 0; 1; 0]}
Since b 1 ≠ 0 and no vector in S is a linear combination of the vectors that precede it, S is
linearly independent. Thus S is a basis for Col B.
What about a matrix A that is not in reduced echelon form? Recall that any
linear dependence relationship among the columns of A can be expressed in the form Ax
= 0, where x is a column of weights. (If some columns are not involved in a particular
dependence relation, then their weights are zero.) When A is row reduced to a matrix B,
the columns of B are often totally different from the columns of A. However, the
equations Ax = 0 and Bx = 0 have exactly the same set of solutions. That is, the columns
of A have exactly the same linear dependence relationships as the columns of B.
Elementary row operations on a matrix do not affect the linear dependence relations among the columns of the matrix.
Theorem: The pivot columns of a matrix A form a basis for Col A.
Proof: The general proof uses the arguments discussed above. Let B be the reduced
echelon form of A. The set of pivot columns of B is linearly independent, for no vector in
the set is a linear combination of the vectors that precede it. Since A is row equivalent to
B, the pivot columns of A are linearly independent too, because any linear dependence
relation among the columns of A corresponds to a linear dependence relation among the
columns of B. For this same reason, every non-pivot column of A is a linear combination
of the pivot columns of A. Thus the non-pivot columns of A may be discarded from the
spanning set for Col A, by the Spanning Set Theorem. This leaves the pivot columns of A
as a basis for Col A.
Note: Be careful to use pivot columns of A itself for the basis of Col A. The columns of
an echelon form B are often not in the column space of A. For instance, the columns of
the B in Example 8 all have zeros in their last entries, so they cannot span the column
space of the A in Example 9.
Example 10: Let v1 = [1; -2; 3] and v2 = [-2; 7; -9]. Determine if {v1, v2} is a basis for R3. Is {v1, v2} a basis for R2?
Solution: Let A = [v1 v2]. Row operations show that A = [1 -2; -2 7; 3 -9] ~ [1 -2; 0 3; 0 0]. Not every
row of A contains a pivot position. So the columns of A do not span R3, by Theorem 4 in
Lecture 6. Hence {v 1 , v 2 } is not a basis for R3. Since v 1 and v 2 are not in R2, they cannot
possibly be a basis for R2. However, since v 1 and v 2 are obviously linearly independent,
they are a basis for a subspace of R3, namely, Span {v 1 , v 2 }.
Procedure:
Basis and Linear Combinations
Given a set of vectors S = {v 1 , v 2 , …,v k } in Rn, the following procedure produces a subset
of these vectors that form a basis for span (S) and expresses those vectors of S that are
not in the basis as linear combinations of the basis vector.
Step1: Form the matrix A having v 1 , v 2 ,..., v k as its column vectors.
Step2: Reduce the matrix A to its reduced row echelon form R, and let
w 1 , w 2 ,…, w k be the column vectors of R.
Step3: Identify the columns that contain the leading entries i.e., 1’s in R. The
corresponding column vectors of A are the basis vectors for span (S).
Step4: Express each column vector of R that does not contain a leading entry as a linear combination of the preceding column vectors that do contain leading entries (we will be able to do this by inspection). This yields a set of dependency equations involving the column vectors of R. The corresponding equations for the column vectors of A express the vectors which are not in the basis as linear combinations of the basis vectors.
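A sketch of this procedure in Python with SymPy (assumed available). Matrix.rref() returns the reduced row echelon form R together with the indices of the pivot columns, which is exactly what Steps 2-4 need; the matrix below is the one used in the worked example that follows:

    from sympy import Matrix

    # Columns are v1, ..., v5.
    A = Matrix([[ 1, -2,  4,  0, -7],
                [-1,  3, -5,  4, 18],
                [ 5,  1,  9,  2,  2],
                [ 2,  0,  4, -3, -8]])

    R, pivots = A.rref()
    print(pivots)   # (0, 1, 3): v1, v2, v4 form a basis for span S

    # Each non-pivot column of R carries a dependency equation, read off by inspection.
    for j in range(A.cols):
        if j not in pivots:
            terms = ["(%s)v%d" % (R[i, j], p + 1)
                     for i, p in enumerate(pivots) if R[i, j] != 0]
            print("v%d = %s" % (j + 1, " + ".join(terms)))
    # v3 = (2)v1 + (-1)v2,  v5 = (-1)v1 + (3)v2 + (2)v4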
[1 2 -1 0; 0 0 1 1; 0 0 2 2; 0 0 3 3]    ((-1)R2)
[1 2 -1 0; 0 0 1 1; 0 0 0 0; 0 0 0 0]    (-2R2 + R3, -3R2 + R4)
[1 2 0 1; 0 0 1 1; 0 0 0 0; 0 0 0 0]    (R2 + R1)
Labeling the column vectors of the resulting matrix as w1, w2, w3 and w4 yields
(B) = [1 2 0 1; 0 0 1 1; 0 0 0 0; 0 0 0 0], with columns w1, w2, w3, w4.
The leading entries occur in columns 1 and 3, so {w1, w3} is a basis for the column space of (B) and consequently {v1, v3} is a basis for the column space of (A).
(b) We shall start by expressing w2 and w4 as linear combinations of the basis vectors w1 and w3. The simplest way of doing this is to express w2 and w4 in terms of basis vectors with smaller subscripts. Thus we shall express w2 as a linear combination of w1, and we shall express w4 as a linear combination of w1 and w3. By inspection of (B), these linear combinations are w2 = 2w1 and w4 = w1 + w3. We call them the dependency equations. The corresponding relationships in (A) are v2 = 2v1 and v4 = v1 + v3.
Example: Find a subset of the vectors v1 = (1, -1, 5, 2), v2 = (-2, 3, 1, 0), v3 = (4, -5, 9, 4), v4 = (0, 4, 2, -3), v5 = (-7, 18, 2, -8) that forms a basis for the space they span, and express the remaining vectors as linear combinations of the basis vectors.
Solution: We form the matrix
(A) = [1 -2 4 0 -7; -1 3 -5 4 18; 5 1 9 2 2; 2 0 4 -3 -8]
whose columns are v1, v2, v3, v4, v5. Finding a basis for the column space of this matrix solves the first part of our problem.
Transforming the Matrix to Reduced Row Echelon Form:
[1 -2 4 0 -7; -1 3 -5 4 18; 5 1 9 2 2; 2 0 4 -3 -8]
[1 -2 4 0 -7; 0 1 -1 4 11; 0 11 -11 2 37; 0 4 -4 -3 6]    (R1 + R2, -5R1 + R3, -2R1 + R4)
[1 -2 4 0 -7; 0 1 -1 4 11; 0 0 0 -42 -84; 0 0 0 -19 -38]    (-11R2 + R3, -4R2 + R4)
[1 -2 4 0 -7; 0 1 -1 4 11; 0 0 0 1 2; 0 0 0 -19 -38]    ((-1/42)R3)
[1 -2 4 0 -7; 0 1 -1 4 11; 0 0 0 1 2; 0 0 0 0 0]    (19R3 + R4)
[1 -2 4 0 -7; 0 1 -1 0 3; 0 0 0 1 2; 0 0 0 0 0]    ((-4)R3 + R2)
[1 0 2 0 -1; 0 1 -1 0 3; 0 0 0 1 2; 0 0 0 0 0]    (2R2 + R1)
The reduced row echelon form is
(B) = [1 0 2 0 -1; 0 1 -1 0 3; 0 0 0 1 2; 0 0 0 0 0]
The leading 1's occur in columns 1, 2 and 4, so {w1, w2, w4} is a basis for the column space of (B), and consequently {v1, v2, v4} is a basis for the space spanned by v1, ..., v5. By inspection of (B), the dependency equations are w3 = 2w1 - w2 and w5 = -w1 + 3w2 + 2w4; the corresponding relationships are v3 = 2v1 - v2 and v5 = -v1 + 3v2 + 2v4.
When the dimension of a vector space or subspace is known, the search for a basis is
simplified by the next theorem. It says that if a set has the right number of elements, then
one has only to show either that the set is linearly independent or that it spans the space.
The theorem is of critical importance in numerous applied problems (involving
differential equations or difference equations, for example) where linear independence is
much easier to verify than spanning.
Theorem 5 (The Basis Theorem): Let V be a p-dimensional vector space, p ≥ 1. Any linearly independent set of exactly p elements in V is automatically a basis for V. Any set of exactly p elements that spans V is automatically a basis for V.
The Dimensions of Nul A and Col A: Since the pivot columns of a matrix A form a
basis for Col A, we know the dimension of Col A as soon as we know the pivot columns.
The dimension of Nul A might seem to require more work, since finding a basis for Nul
A usually takes more time than a basis for Col A. Yet, there is a shortcut.
Let A be an m × n matrix, and suppose that the equation Ax = 0 has k free variables.
From lecture 21, we know that the standard method of finding a spanning set for Nul A
will produce exactly k linearly independent vectors say, u 1 , … , u k , one for each free
variable. So {u 1 , … , u k } is a basis for Nul A, and the number of free variables determines
the size of the basis. Let us summarize these facts for future reference.
The dimension of Nul A is the number of free variables in the equation Ax = 0, and the
dimension of Col A is the number of pivot columns in A.
Example 15: Find the dimensions of the null space and column space of
A = [-3 6 -1 1 -7; 1 -2 2 3 -1; 2 -4 5 8 -4]
Solution: Row reduce the augmented matrix [A 0] to echelon form and obtain
[1 -2 2 3 -1 0; 0 0 1 2 -2 0; 0 0 0 0 0 0]
There are three free variables: x2, x4 and x5. Hence dim Nul A = 3. Also, dim Col A = 2, because A has two pivot columns.
Example 16: Decide whether each statement is true or false, and give a reason for each
answer. Here V is a non-zero finite-dimensional vector space.
1. If dim V = p and if S is a linearly dependent subset of V, then S contains more than
p vectors.
2. If S spans V and if T is a subset of V that contains more vectors than S, then T is
linearly dependent.
Solution:
1. False. Consider the set {0}.
2. True. By the Spanning Set Theorem, S contains a basis for V; call that basis S ′ .
Then T will contain more vectors than S ′ . By Theorem 1, T is linearly dependent.
Exercises:
For each subspace in exercises 1-6, (a) find a basis and (b) state the dimension.
1. {[s - 2t; s + t; 3t]: s, t in R}        2. {[2c; a - b; b - 3c; a + 2b]: a, b, c in R}
3. {[a - 4b - 2c; 2a + 5b - 4c; -a + 2c; -3a + 7b + 6c]: a, b, c in R}        4. {[3a + 6b - c; 6a - 2b - 2c; -9a + 5b + 3c; -3a + b + c]: a, b, c in R}
5. {(a, b, c): a – 3b + c = 0, b – 2c = 0, 2b – c = 0}
6. {(a, b, c, d): a - 3b + c = 0}
Determine the dimensions of Nul A and Col A for the matrices shown in exercises 9 to
12.
9. A = [1 -6 9 0 -2; 0 1 2 -4 5; 0 0 0 5 1; 0 0 0 0 0]        10. A = [1 3 -4 2 -1 6; 0 0 1 -3 7 0; 0 0 0 1 4 -3; 0 0 0 0 0 0]
11. A = [1 0 9 5; 0 0 1 -4]        12. A = [1 -1 0; 0 4 7; 0 0 5]
13. The first four Hermite polynomials are 1, 2t, -2 + 4t2, and -12t + 8t3. These
polynomials arise naturally in the study of certain important differential equations in
mathematical physics. Show that the first four Hermite polynomials form a basis of P 3 .
14. Let B be the basis of P 3 consisting of the Hermite polynomials in exercise 13, and let
p (t) = 7 – 12 t – 8 t2 + 12 t3. Find the coordinate vector of p relative to B.
Lecture 08
Rank
With the help of vector space concepts, several interesting and useful relationships among the rows and columns of a matrix can be established.
For instance, imagine placing 2000 random numbers into a 40 x 50 matrix A and then determining both the maximum number of linearly independent columns in A and the maximum number of linearly independent columns in AT (rows in A). Remarkably, the two numbers are the same. Their common value is called the rank of the matrix. To explain why, we need to examine the subspace spanned by the rows of A.
The Row Space: If A is an m × n matrix, each row of A has n entries and thus can be
identified with a vector in Rn. The set of all linear combinations of the row vectors is
called the row space of A and is denoted by Row A. Each row has n entries, so Row A is
a subspace of Rn. Since the rows of A are identified with the columns of AT, we could
also write Col AT in place of Row A.
Example 1: Let A = [-2 -5 8 0 -17; 1 3 -5 1 5; 3 11 -19 7 1; 1 7 -13 5 -3], with rows
r1 = (-2, -5, 8, 0, -17), r2 = (1, 3, -5, 1, 5), r3 = (3, 11, -19, 7, 1), r4 = (1, 7, -13, 5, -3).
The row space of A is the subspace of R5 spanned by {r1, r2, r3, r4}. That is, Row A = Span{r1, r2, r3, r4}. Naturally, we write row vectors horizontally; however, they could also be written as column vectors.
Example: Let A = [2 1 0; 3 -1 4], with rows r1 = (2, 1, 0) and r2 = (3, -1, 4). The row space of A is Span{r1, r2}, and we could use the Spanning Set Theorem to shrink this spanning set to a basis.
Sometimes row operations on a matrix will not directly give us the required information, but row reducing is certainly worthwhile, as the next theorem shows.
Theorem 1: If two matrices A and B are row equivalent, then their row spaces are the
same. If B is in echelon form, the nonzero rows of B form a basis for the row space of A
as well as B.
Moreover, if A and B are row equivalent matrices, then:
(a) A given set of column vectors of A is linearly independent if and only if the corresponding column vectors of B are linearly independent.
(b) A given set of column vectors of A forms a basis for the column space of A if and only if the corresponding column vectors of B form a basis for the column space of B.
Thus, if we can find a set of column vectors of R that forms a basis for the column space of R, then the corresponding column vectors of A will form a basis for the column space of A.
The first, third, and fifth columns of R contain the leading 1's of the row vectors, so
c1' = [1; 0; 0; 0], c3' = [4; 1; 0; 0], c5' = [5; -2; 1; 0]
form a basis for the column space of R; thus the corresponding column vectors of A, namely
c1 = [1; 2; 2; -1], c3 = [4; 9; 9; -4], c5 = [5; 8; 9; -5],
form a basis for the column space of A.
Example: The matrix
R = [1 -2 5 0 3; 0 1 3 0 0; 0 0 0 1 0; 0 0 0 0 0]
is in row-echelon form. The vectors
r1 = [1 -2 5 0 3], r2 = [0 1 3 0 0], r3 = [0 0 0 1 0]
form a basis for the row space of R, and the vectors
c1 = [1; 0; 0; 0], c2 = [-2; 1; 0; 0], c3 = [0; 0; 1; 0]
form a basis for the column space of R.
Example: Find a basis for the subspace of R5 spanned by v1 = (1, -2, 0, 0, 3), v2 = (2, -5, -3, -2, 6), v3 = (0, 5, 15, 10, 0), v4 = (2, 6, 18, 8, 6).
Solution: The matrix whose rows are the given vectors is
[1 -2 0 0 3; 2 -5 -3 -2 6; 0 5 15 10 0; 2 6 18 8 6]
Reducing it to row-echelon form:
[1 -2 0 0 3; 0 1 3 2 0; 0 5 15 10 0; 0 10 18 8 0]    ((-2)R1 + R2, (-2)R1 + R4, (-1)R2)
[1 -2 0 0 3; 0 1 3 2 0; 0 0 0 0 0; 0 0 -12 -12 0]    ((-5)R2 + R3, (-10)R2 + R4)
[1 -2 0 0 3; 0 1 3 2 0; 0 0 -12 -12 0; 0 0 0 0 0]    (R3 ↔ R4)
[1 -2 0 0 3; 0 1 3 2 0; 0 0 1 1 0; 0 0 0 0 0]    ((-1/12)R3)
Therefore, R = [1 -2 0 0 3; 0 1 3 2 0; 0 0 1 1 0; 0 0 0 0 0]
The non-zero row vectors in this matrix are
w1 = (1, -2, 0, 0, 3), w2 = (0, 1, 3, 2, 0), w3 = (0, 0, 1, 1, 0)
These vectors form a basis for the row space and consequently form a basis for the subspace of R5 spanned by v1, v2, v3, v4.
Example: Find a basis for the row space of the matrix A above consisting entirely of row vectors of A.
Solution: We transpose A, so that its rows become the columns of AT, and reduce AT to row-echelon form:
AT = [1 2 0 2; -2 -5 5 6; 0 -3 15 18; 0 -2 10 8; 3 6 0 6]
[1 2 0 2; 0 1 -5 -10; 0 -3 15 18; 0 -2 10 8; 0 0 0 0]    (2R1 + R2, -3R1 + R5, (-1)R2)
[1 2 0 2; 0 1 -5 -10; 0 0 0 -12; 0 0 0 -12; 0 0 0 0]    ((3)R2 + R3, (2)R2 + R4)
[1 2 0 2; 0 1 -5 -10; 0 0 0 1; 0 0 0 -12; 0 0 0 0]    ((-1/12)R3)
[1 2 0 2; 0 1 -5 -10; 0 0 0 1; 0 0 0 0; 0 0 0 0]    (12R3 + R4)
Now R = [1 2 0 2; 0 1 -5 -10; 0 0 0 1; 0 0 0 0; 0 0 0 0]
The first, second and fourth columns contain the leading 1's, so the corresponding column vectors in AT form a basis for the column space of AT; these are
c1 = [1; -2; 0; 0; 3], c2 = [2; -5; -3; -2; 6] and c4 = [2; 6; 18; 8; 6]
Transposing again and adjusting the notation appropriately yields the basis vectors
r1 = [1 -2 0 0 3], r2 = [2 -5 -3 -2 6] and r4 = [2 6 18 8 6]
for the row space of A.
The following example shows how one sequence of row operations on A leads to bases
for the three spaces: Row A, Col A, and Nul A.
Example 5: Find bases for the row space, the column space and the null space of the matrix
A = [-2 -5 8 0 -17; 1 3 -5 1 5; 3 11 -19 7 1; 1 7 -13 5 -3]
Solution: To find bases for the row space and the column space, row reduce A to an echelon form:
A ~ B = [1 3 -5 1 5; 0 1 -2 2 -7; 0 0 0 -4 20; 0 0 0 0 0]
By Theorem (1), the first three rows of B form a basis for the row space of A (as well as
the row space of B). Thus Basis for Row A:
{(1, 3, -5, 1, 5), (0, 1, -2, 2, -7), (0, 0, 0, -4, 20)}
For the column space, observe from B that the pivots are in columns 1, 2 and 4. Hence
columns 1, 2 and 4 of A (not B) form a basis for Col A:
Basis for Col A: {[-2; 1; 3; 1], [-5; 3; 11; 7], [0; 1; 7; 5]}
Any echelon form of A provides (in its nonzero rows) a basis for Row A and also
identifies the pivot columns of A for Col A. However, for Nul A, we need the reduced
echelon form. Further row operations on B yield
A ~ B ~ C = [1 0 1 0 1; 0 1 -2 0 3; 0 0 0 1 -5; 0 0 0 0 0]
The equation Ax = 0 is equivalent to Cx = 0, that is,
x1 + x3 + x5 = 0
x2 - 2x3 + 3x5 = 0
x4 - 5x5 = 0
So x1 = -x3 - x5, x2 = 2x3 - 3x5 and x4 = 5x5, with x3 and x5 free. The usual calculation gives
Basis for Nul A: {[-1; 2; 1; 0; 0], [-1; -3; 0; 5; 1]}
Observe that, unlike the bases for Col A, the bases for Row A and Nul A have no simple
connection with the entries in A itself.
Note:
1. Although the first three rows of B in Example (5) are linearly independent, it is wrong
to conclude that the first three rows of A are linearly independent. (In fact, the third
row of A is 2 times the first row plus 7 times the second row).
2. Row operations do not preserve the linear dependence relations among the rows of a
matrix.
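All three bases of Example 5 can be read off in a few lines of Python with SymPy (assumed available); rowspace(), columnspace() and nullspace() implement exactly the echelon-form reasoning above:

    from sympy import Matrix

    A = Matrix([[-2, -5,   8, 0, -17],
                [ 1,  3,  -5, 1,   5],
                [ 3, 11, -19, 7,   1],
                [ 1,  7, -13, 5,  -3]])

    print(A.rowspace())     # nonzero rows of an echelon form of A (basis for Row A)
    print(A.columnspace())  # the pivot columns of A itself (basis for Col A)
    print(A.nullspace())    # one vector per free variable (basis for Nul A)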
Theorem 3: (The Rank Theorem) The dimensions of the column space and the row
space of an m × n matrix A are equal. This common dimension, the rank of A, also equals
the number of pivot positions in A and satisfies the equation
rank A + dim Nul A = n
Example 6:
(a) If A is a 7 × 9 matrix with a two – dimensional null space, what is the rank of A?
(b). Could a 6 × 9 matrix have a two – dimensional null space?
Solution:
(a) Since A has 9 columns, (rank A) + 2 = 9 and hence rank A = 7.
(b) No, If a 6 × 9 matrix, call it B, had a two – dimensional null space, it would have to
have rank 7, by the Rank Theorem. But the columns of B are vectors in R6 and so the
dimension of Col B cannot exceed 6; that is, rank B cannot exceed 6.
The next example provides a nice way to visualize the subspaces we have been studying.
Later on, we will learn that Row A and Nul A have only the zero vector in common and
are actually “perpendicular” to each other. The same fact will apply to Row AT (= Col A)
and Nul AT. So the figure in Example (7) creates a good mental image for the general
case.
Example 7: Let A = [3 0 -1; 3 0 -1; 4 0 5]. It is readily checked that Nul A is the x2-axis, Row A
is the x 1 x 3 – plane, Col A is the plane whose equation is x 1 – x 2 = 0 and Nul AT is the set
of all multiples of (1, -1, 0). Figure 1 shows Nul A and Row A in the domain of the linear
transformation x → Ax; the range of this mapping, Col A, is shown in a separate copy of
R3, along with Nul AT.
[Figure 1 – Subspaces associated with a matrix A: Nul A and Row A shown in the domain R3 of x → Ax; Col A and Nul AT shown in a separate copy of R3]
Example 9: Find the rank and nullity of the matrix A = [-1 2 0 4 5 -3; 3 -7 2 0 1 4; 2 -5 2 4 6 1; 4 -9 2 -4 -4 7], and verify that the values obtained satisfy the dimension theorem.
Solution:
[-1 2 0 4 5 -3; 3 -7 2 0 1 4; 2 -5 2 4 6 1; 4 -9 2 -4 -4 7]
[1 -2 0 -4 -5 3; 3 -7 2 0 1 4; 2 -5 2 4 6 1; 4 -9 2 -4 -4 7]    ((-1)R1)
[1 -2 0 -4 -5 3; 0 -1 2 12 16 -5; 0 -1 2 12 16 -5; 0 -1 2 12 16 -5]    ((-3)R1 + R2, (-2)R1 + R3, (-4)R1 + R4)
[1 -2 0 -4 -5 3; 0 1 -2 -12 -16 5; 0 -1 2 12 16 -5; 0 -1 2 12 16 -5]    ((-1)R2)
[1 -2 0 -4 -5 3; 0 1 -2 -12 -16 5; 0 0 0 0 0 0; 0 0 0 0 0 0]    (R2 + R3, R2 + R4)
[1 0 -4 -28 -37 13; 0 1 -2 -12 -16 5; 0 0 0 0 0 0; 0 0 0 0 0 0]    (2R2 + R1)
The reduced row-echelon form of A is
[1 0 -4 -28 -37 13; 0 1 -2 -12 -16 5; 0 0 0 0 0 0; 0 0 0 0 0 0]    (1)
Since there are two nonzero rows, rank(A) = 2. The system Ax = 0 has four free variables, so nullity(A) = 4, and rank(A) + nullity(A) = 2 + 4 = 6 = n, which verifies the dimension theorem.
Example 10: Find the rank and nullity of the matrix
A = [1 -3 2 2 1; 0 3 6 0 -3; 2 -3 -2 4 4; 3 -6 0 6 5; -2 9 2 -4 -5]
then verify that the values obtained satisfy the dimension theorem.
Solution: Transforming the Matrix to the Reduced Row Echelon Form:
[1 -3 2 2 1; 0 3 6 0 -3; 2 -3 -2 4 4; 3 -6 0 6 5; -2 9 2 -4 -5]
[1 -3 2 2 1; 0 3 6 0 -3; 0 3 -6 0 2; 0 3 -6 0 2; 0 3 6 0 -3]    ((-2)R1 + R3, (-3)R1 + R4, 2R1 + R5)
[1 -3 2 2 1; 0 1 2 0 -1; 0 3 -6 0 2; 0 3 -6 0 2; 0 3 6 0 -3]    ((1/3)R2)
[1 -3 2 2 1; 0 1 2 0 -1; 0 0 -12 0 5; 0 0 -12 0 5; 0 0 0 0 0]    ((-3)R2 + R3, (-3)R2 + R4, (-3)R2 + R5)
[1 -3 2 2 1; 0 1 2 0 -1; 0 0 1 0 -5/12; 0 0 -12 0 5; 0 0 0 0 0]    ((-1/12)R3)
[1 -3 2 2 1; 0 1 2 0 -1; 0 0 1 0 -5/12; 0 0 0 0 0; 0 0 0 0 0]    (12R3 + R4)
[1 -3 0 2 11/6; 0 1 0 0 -1/6; 0 0 1 0 -5/12; 0 0 0 0 0; 0 0 0 0 0]    ((-2)R3 + R2, (-2)R3 + R1)
[1 0 0 2 4/3; 0 1 0 0 -1/6; 0 0 1 0 -5/12; 0 0 0 0 0; 0 0 0 0 0]    ((3)R2 + R1)    (1)
Since there are three nonzero rows (or equivalently, three leading 1’s) the row space and
column space are both three dimensional so rank (A) = 3.
To find the nullity of A, we find the dimension of the solution space of the linear system
Ax = 0. The system can be solved by reducing the augmented matrix to reduced row
echelon form. The resulting matrix will be identical to (1), except with an additional last
column of zeros, and the corresponding system of equations will be
x1 + 0x2 + 0x3 + 2x4 + (4/3)x5 = 0
0x1 + x2 + 0x3 + 0x4 - (1/6)x5 = 0
0x1 + 0x2 + x3 + 0x4 - (5/12)x5 = 0
Solving for the leading variables in terms of the free variables x4 = s and x5 = t gives
x1 = -2x4 - (4/3)x5, x2 = (1/6)x5, x3 = (5/12)x5
so the solution space is spanned by two vectors and nullity(A) = 2. Hence rank(A) + nullity(A) = 3 + 2 = 5 = n, as the dimension theorem requires.
Suppose now that A is an m x n matrix of rank r. It follows from Theorem (5) that AT is an n x m matrix of rank r. Applying Theorem (3) to A and AT yields
nullity(A) = n - r,    nullity(AT) = m - r
from which we deduce the following table relating the dimensions of the four fundamental spaces of an m x n matrix A of rank r:
Fundamental space    Dimension
Row A                r
Col A                r
Nul A                n - r
Nul AT               m - r
Rank and the Invertible Matrix Theorem: The various vector space concepts associated with a matrix provide several more statements for the Invertible Matrix Theorem. We list only the new statements here, but we reference them so that they follow the statements in the original Invertible Matrix Theorem in lecture 13.
Theorem: Let A be an n x n matrix. Then each of the following statements is equivalent to the statement that A is an invertible matrix:
(m) The columns of A form a basis of Rn
(n) Col A = Rn
(o) dim Col A = n
(p) rank A = n
(q) Nul A = {0}
(r) dim Nul A = 0
Proof: Statement (m) is logically equivalent to statements (e) and (h) regarding linear
independence and spanning. The other statements above are linked into the theorem by
the following chain of almost trivial implications:
(g) ⇒ (n) ⇒ (o) ⇒ (p) ⇒ (r) ⇒ (q) ⇒ (d)
Only the implication (p) ⇒ (r) bears comment. It follows from the Rank Theorem
because A is n × n . Statements (d) and (g) are already known to be equivalent, so the
chain is a circle of implications.
We have refrained from adding to the Invertible Matrix Theorem obvious statements
about the row space of A, because the row space is the column space of AT. Recall from
(1) of the Invertible Matrix Theorem that A is invertible if and only if AT is invertible.
Hence every statement in the Invertible Matrix Theorem can also be stated for AT.
Numerical Note:
Many algorithms discussed in these lectures are useful for understanding
concepts and making simple computations by hand. However, the algorithms are often
unsuitable for large-scale problems in real life.
Rank determination is a good example. It would seem easy to reduce a matrix to echelon
form and count the pivots. But unless exact arithmetic is performed on a matrix whose
entries are specified exactly, row operations can change the apparent rank of a matrix.
For instance, if the value of x in the matrix [5 7; 5 x] is not stored exactly as 7 in a computer, then the rank may be 1 or 2, depending on whether the computer treats x - 7 as zero.
In practical applications, the effective rank of a matrix A is often determined from the
singular value decomposition of A.
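The effect is easy to reproduce. A minimal sketch in Python/NumPy (assumed available); numpy.linalg.matrix_rank computes the rank from the singular value decomposition and a tolerance, which is the "effective rank" just mentioned:

    import numpy as np

    x = 7.0 + 1e-15              # x is meant to be 7 but is stored inexactly
    A = np.array([[5.0, 7.0],
                  [5.0,  x ]])

    print(np.linalg.matrix_rank(A))             # 1: the tiny x - 7 is treated as zero
    print(np.linalg.matrix_rank(A, tol=1e-16))  # 2: a stricter tolerance detects it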
Exercises:
In exercises 1-4, the matrix A is row equivalent to B. Find bases for Row A, Col A and Nul A.
1. A = [1 -4 9 -7; -1 2 -4 1; 5 -6 10 7], B = [1 0 -1 5; 0 -2 5 -6; 0 0 0 0]
2. A = [1 -3 4 -1 9; -2 6 -6 -1 -10; -3 9 -6 -6 -3; 3 -9 4 9 0], B = [1 -3 0 5 -7; 0 0 2 -3 8; 0 0 0 0 5; 0 0 0 0 0]
3. A = [2 -3 6 2 5; -2 3 -3 -3 -4; 4 -6 9 5 9; -2 3 3 -4 1], B = [2 -3 6 2 5; 0 0 3 -1 1; 0 0 0 1 3; 0 0 0 0 0]
4. A = [1 1 -3 7 9 -9; 1 2 -4 10 13 -12; 1 -1 -1 1 1 -3; 1 -3 1 -5 -7 3; 1 -2 0 0 -5 -4], B = [1 1 -3 7 9 -9; 0 1 -1 3 4 -3; 0 0 0 1 -1 -2; 0 0 0 0 0 0; 0 0 0 0 0 0]
5. If a 3 x 8 matrix A has rank 3, find dim Nul A, dim Row A, and rank AT.
6. If a 6 x 3 matrix A has rank 3, find dim Nul A, dim Row A, and rank AT.
7. Suppose that a 4 x 7 matrix A has four pivot columns. Is Col A = R4? Is Nul A = R3?
Explain your answers.
8. Suppose that a 5 x 6 matrix A has four pivot columns. What is dim Nul A? Is Col A =
R4? Why or why not?
10. If the null space of a 7 x 6 matrix A is 5-dimensional, what is the dimension of the
column space of A?
11. If the null space of an 8 x 5 matrix A is 2-dimensional, what is the dimension of the
row space of A?
12. If the null space of a 5 x 6 matrix A is 4-dimensional, what is the dimension of the
row space of A?
14. If A is a 4 x 3 matrix, what is the largest possible dimension of the row space of A? If
A is a 3 x 4 matrix, what is the largest possible dimension of the row space of A? Explain.
Lecture 09
Solution of Linear System of Equations and Matrix Inversion
Jacobi’s Method
This is an iterative method, where initial approximate solution to a given system of
equations is assumed and is improved towards the exact solution in an iterative way.
In general, when the coefficient matrix of the system of equations is a sparse matrix (many elements are zero), iterative methods have a definite advantage over direct methods in respect of economy of computer memory. Such sparse matrices arise in computing the numerical solution of partial differential equations.
Let us consider the system
a11x1 + a12x2 + ... + a1nxn = b1
a21x1 + a22x2 + ... + a2nxn = b2
...
an1x1 + an2x2 + ... + annxn = bn
In this method, we assume that the coefficient matrix [A] is strictly diagonally dominant, that is, in each row of [A] the modulus of the diagonal element exceeds the sum of the moduli of the off-diagonal elements. We also assume that the diagonal elements do not vanish. If any diagonal element vanishes, the equations can always be rearranged to satisfy this condition.
Now the above system of equations can be written as
x1 = b1/a11 - (a12/a11)x2 - ... - (a1n/a11)xn
x2 = b2/a22 - (a21/a22)x1 - ... - (a2n/a22)xn
...
xn = bn/ann - (an1/ann)x1 - ... - (an(n-1)/ann)x(n-1)
We shall take this solution vector (x1, x2, ..., xn)T as a first approximation to the exact solution of the system. For convenience, let us denote the first approximation vector by (x1(1), x2(1), ..., xn(1)), obtained after substituting an initial starting vector (x1(0), x2(0), ..., xn(0)) into the right-hand side.
Substituting this first approximation in the right-hand side of system, we obtain the
second approximation to the given system in the form
x1(2) = b1/a11 - (a12/a11)x2(1) - ... - (a1n/a11)xn(1)
x2(2) = b2/a22 - (a21/a22)x1(1) - ... - (a2n/a22)xn(1)
...
xn(2) = bn/ann - (an1/ann)x1(1) - ... - (an(n-1)/ann)x(n-1)(1)
This second approximation is substituted into the right-hand side of the equations to obtain the third approximation, and so on. This process is repeated and the (r+1)th approximation is calculated from
x1(r+1) = b1/a11 - (a12/a11)x2(r) - ... - (a1n/a11)xn(r)
x2(r+1) = b2/a22 - (a21/a22)x1(r) - ... - (a2n/a22)xn(r)
...
xn(r+1) = bn/ann - (an1/ann)x1(r) - ... - (an(n-1)/ann)x(n-1)(r)
Briefly, we can rewrite these equations as
xi(r+1) = bi/aii - Σ (aij/aii)xj(r)   (sum over j = 1, ..., n with j ≠ i),   r = 1, 2, ...;  i = 1, 2, ..., n
It is also known as the method of simultaneous displacements, since no element of xi(r+1) is used in this iteration until every element is computed.
A sufficient condition for convergence of the iterative solution to the exact solution is
|aii| > Σ |aij|   (sum over j = 1, ..., n with j ≠ i),   i = 1, 2, ..., n
When this condition (diagonal dominance) is true, Jacobi's method converges.
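A minimal sketch of Jacobi's method in Python/NumPy (assumed available), written directly from the compact formula above; applied to the first example below it reproduces the tabulated iterates:

    import numpy as np

    def jacobi(A, b, iterations, x0=None):
        """One Jacobi sweep computes every component of the new vector from
        the previous vector only (simultaneous displacements)."""
        A, b = np.asarray(A, float), np.asarray(b, float)
        x = np.zeros_like(b) if x0 is None else np.asarray(x0, float)
        D = np.diag(A)                 # diagonal entries a_ii
        R = A - np.diagflat(D)         # off-diagonal part of A
        for _ in range(iterations):
            x = (b - R @ x) / D        # x_i <- (b_i - sum_{j != i} a_ij x_j) / a_ii
        return x

    A = [[83, 11, -4], [7, 52, 13], [3, 8, 29]]
    b = [95, 104, 71]
    print(jacobi(A, b, 5))             # fifth Jacobi approximation to (x, y, z)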
Example: Find the solution to the following system of equations using Jacobi's iterative method for the first five iterations:
83x + 11y - 4z = 95
7x + 52y + 13z = 104
3x + 8y + 29z = 71
Solution:
x = 95/83 - (11/83)y + (4/83)z
y = 104/52 - (7/52)x - (13/52)z
z = 71/29 - (3/29)x - (8/29)y
Taking the initial starting solution vector as (0, 0, 0)T, we have the first approximation
x(1) = 1.1446, y(1) = 2.0000, z(1) = 2.4483
Now, substituting these values into the right-hand sides, the second approximation is
x(2) = 0.9976, y(2) = 1.2339, z(2) = 1.7424
A similar procedure yields the third, fourth and fifth approximations to the required solution, tabulated below by iteration number r for the variables x, y, z.
Example: Solve the system by Jacobi's iterative method:
8x - 3y + 2z = 20
4x + 11y - z = 33
6x + 3y + 12z = 35
(Perform only four iterations.)
Solution: Consider the given system
8x - 3y + 2z = 20
4x + 11y - z = 33
6x + 3y + 12z = 35
The system is diagonally dominant, so
x = (1/8)[20 + 3y - 2z]
y = (1/11)[33 - 4x + z]
z = (1/12)[35 - 6x - 3y]
We start with the initial approximation x0 = y0 = z0 = 0.
Substituting these values:
First iteration:
x1 = (1/8)[20 + 3(0) - 2(0)] = 2.5
y1 = (1/11)[33 - 4(0) + 0] = 3
z1 = (1/12)[35 - 6(0) - 3(0)] = 2.916667
Second iteration:
x2 = (1/8)[20 + 3(3) - 2(2.9166667)] = 2.895833
y2 = (1/11)[33 - 4(2.5) + 2.9166667] = 2.3560606
z2 = (1/12)[35 - 6(2.5) - 3(3)] = 0.9166666
Third iteration:
x3 = (1/8)[20 + 3(2.3560606) - 2(0.9166666)] = 3.1543561
y3 = (1/11)[33 - 4(2.8958333) + 0.9166666] = 2.030303
z3 = (1/12)[35 - 6(2.8958333) - 3(2.3560606)] = 0.8797348
Fourth iteration:
x4 = (1/8)[20 + 3(2.030303) - 2(0.8797348)] = 3.0414299
y4 = (1/11)[33 - 4(3.1543561) + 0.8797348] = 1.9329373
z4 = (1/12)[35 - 6(3.1543561) - 3(2.030303)] = 0.8319128
Example: Solve the system by Jacobi's iterative method:
3x + 4y + 15z = 54.8
x + 12y + 3z = 39.66
10x + y - 2z = 7.74
(Perform only four iterations.)
Solution: With the equations rearranged to make the system diagonally dominant, the iteration formulas are
x = (1/10)[7.74 - y + 2z], y = (1/12)[39.66 - x - 3z], z = (1/15)[54.8 - 3x - 4y]
Third iteration:
x3 = (1/10)[7.74 - 2.3271667 + 2(3.1949778)] = 1.1802789
y3 = (1/12)[39.66 - 1.3908333 - 3(3.1949778)] = 2.3903528
z3 = (1/15)[54.8 - 3(1.3908333) - 4(2.3271667)] = 2.7545889
Fourth iteration:
x4 = (1/10)[7.74 - 2.5179962 + 2(2.7798501)] = 1.0781704
y4 = (1/12)[39.66 - 1.1802789 - 3(2.7545889)] = 2.5179962
z4 = (1/15)[54.8 - 3(1.1802789) - 4(2.3903528)] = 2.7798501
Lecture 10
Solution of Linear System of Equations and Matrix Inversion
Gauss–Seidel Iteration Method
It is another well-known iterative method for solving a system of linear equations of the
form
a11x1 + a12x2 + ... + a1nxn = b1
a21x1 + a22x2 + ... + a2nxn = b2
...
an1x1 + an2x2 + ... + annxn = bn
In Jacobi’s method, the (r + 1)th approximation to the above system is given by
Equations
x1(r+1) = b1/a11 - (a12/a11)x2(r) - ... - (a1n/a11)xn(r)
x2(r+1) = b2/a22 - (a21/a22)x1(r) - ... - (a2n/a22)xn(r)
...
xn(r+1) = bn/ann - (an1/ann)x1(r) - ... - (an(n-1)/ann)x(n-1)(r)
Here we can observe that no element of xi( r +1) replaces xi( r ) entirely for the next cycle of
computation.
In Gauss-Seidel method, the corresponding elements of xi( r +1) replaces those of
xi( r ) as soon as they become available.
Hence, it is called the method of successive displacements. For illustration consider
a11x1 + a12x2 + ... + a1nxn = b1
a21x1 + a22x2 + ... + a2nxn = b2
...
an1x1 + an2x2 + ... + annxn = bn
In Gauss-Seidel iteration, the (r + 1)th approximation or iteration is computed from:
x1(r+1) = b1/a11 - (a12/a11)x2(r) - ... - (a1n/a11)xn(r)
x2(r+1) = b2/a22 - (a21/a22)x1(r+1) - ... - (a2n/a22)xn(r)
...
xn(r+1) = bn/ann - (an1/ann)x1(r+1) - ... - (an(n-1)/ann)x(n-1)(r+1)
Thus, the general procedure can be written in the following compact form
xi(r+1) = bi/aii - Σ (aij/aii)xj(r+1) (sum over j = 1, ..., i-1) - Σ (aij/aii)xj(r) (sum over j = i+1, ..., n),   for all i = 1, 2, ..., n and r = 1, 2, ...
To start the (r+1)th iteration, in the first equation we substitute the r-th approximation into the right-hand side and denote the result by x1(r+1). In the second equation, we substitute (x1(r+1), x3(r), ..., xn(r)) and denote the result by x2(r+1). In the third equation, we substitute (x1(r+1), x2(r+1), x4(r), ..., xn(r)) and denote the result by x3(r+1), and so on. This process is continued till we arrive at the desired result. For illustration, we consider the following example:
Note: The difference between Jacobi's method and the Gauss-Seidel method is that in Jacobi's method the approximations calculated in one iteration are used only in the next iteration, whereas in the Gauss-Seidel method each newly calculated approximation immediately replaces the previous value and is used in the remaining computations of the same iteration.
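A matching sketch of the Gauss-Seidel method in Python/NumPy (assumed available). Compared with the Jacobi sketch earlier, the only change is that each component is overwritten in place, so the newest values are used within the same sweep; four sweeps reproduce the fourth iteration of the 8x - 3y + 2z = 20 example below:

    import numpy as np

    def gauss_seidel(A, b, iterations, x0=None):
        """Successive displacements: x[i] is updated in place, so later
        components of the same sweep already see the new values."""
        A, b = np.asarray(A, float), np.asarray(b, float)
        n = len(b)
        x = np.zeros(n) if x0 is None else np.asarray(x0, float)
        for _ in range(iterations):
            for i in range(n):
                s = A[i, :i] @ x[:i] + A[i, i+1:] @ x[i+1:]
                x[i] = (b[i] - s) / A[i, i]
        return x

    A = [[8, -3, 2], [4, 11, -1], [6, 3, 12]]
    b = [20, 33, 35]
    print(gauss_seidel(A, b, 4))   # ~ (3.0165121, 1.9856071, 0.9120088)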
Example
Find the solution of the following system of equations using Gauss-Seidel method and
perform the first five iterations:
4x1 - x2 - x3 = 2
-x1 + 4x2 - x4 = 2
-x1 + 4x3 - x4 = 1
-x2 - x3 + 4x4 = 1
Solution
The given system of equations can be rewritten as
x1 = 0.5 + 0.25x2 + 0.25x3
x2 = 0.5 + 0.25x1 + 0.25x4
x3 = 0.25 + 0.25x1 + 0.25x4
x4 = 0.25 + 0.25x2 + 0.25x3
Starting from (0, 0, 0, 0), the first equation gives x1(1) = 0.5, and the second then gives x2(1) = 0.5 + (0.25)(0.5) = 0.625. Further, we take x4 = 0 and the current value of x1, and the third equation of the system gives
x3(1) = 0.25 + (0.25)(0.5) + 0 = 0.375
Now, using the current values of x2 and x3, the fourth equation of the system gives
x4(1) = 0.25 + (0.25)(0.625) + (0.25)(0.375) = 0.5
The Gauss-Seidel iterations for the given set of equations can be written as
x1(r+1) = 0.5 + 0.25x2(r) + 0.25x3(r)
x2(r+1) = 0.5 + 0.25x1(r+1) + 0.25x4(r)
x3(r+1) = 0.25 + 0.25x1(r+1) + 0.25x4(r)
x4(r+1) = 0.25 + 0.25x2(r+1) + 0.25x3(r+1)
Now, by the Gauss-Seidel procedure, the 2nd and subsequent approximations can be obtained; the sequence of the first five approximations is tabulated below by iteration number r for the variables x1, x2, x3, x4.
Example: Solve the system by the Gauss-Seidel iterative method:
8x - 3y + 2z = 20
4x + 11y - z = 33
6x + 3y + 12z = 35
(Perform only four iterations.)
Solution: Consider the given system
8x - 3y + 2z = 20
4x + 11y - z = 33
6x + 3y + 12z = 35
The system is diagonally dominant, so
x = (1/8)[20 + 3y - 2z]
y = (1/11)[33 - 4x + z]
z = (1/12)[35 - 6x - 3y]
We start with the initial approximation x0 = y0 = z0 = 0.
Substituting these values:
First iteration:
x1 = (1/8)[20 + 3(0) - 2(0)] = 2.5
y1 = (1/11)[33 - 4(2.5) + 0] = 2.0909091
z1 = (1/12)[35 - 6(2.5) - 3(2.0909091)] = 1.1439394
Second iteration:
x2 = (1/8)[20 + 3y1 - 2z1] = (1/8)[20 + 3(2.0909091) - 2(1.1439394)] = 2.9981061
y2 = (1/11)[33 - 4x2 + z1] = (1/11)[33 - 4(2.9981061) + 1.1439394] = 2.0137741
z2 = (1/12)[35 - 6x2 - 3y2] = (1/12)[35 - 6(2.9981061) - 3(2.0137741)] = 0.9141701
Third iteration:
x3 = (1/8)[20 + 3(2.0137741) - 2(0.9141701)] = 3.0266228
y3 = (1/11)[33 - 4(3.0266228) + 0.9141701] = 1.9825163
z3 = (1/12)[35 - 6(3.0266228) - 3(1.9825163)] = 0.9077262
Fourth iteration:
x4 = (1/8)[20 + 3(1.9825163) - 2(0.9077262)] = 3.0165121
y4 = (1/11)[33 - 4(3.0165121) + 0.9077262] = 1.9856071
z4 = (1/12)[35 - 6(3.0165121) - 3(1.9856071)] = 0.9120088
Example: Solve the following system by the Gauss-Seidel iterative method:
28x + 4y - z = 32
x + 3y + 10z = 24
2x + 17y + 4z = 35
Solution: The given system is not diagonally dominant, so we make it diagonally dominant by interchanging the equations:
28x + 4y - z = 32
2x + 17y + 4z = 35
x + 3y + 10z = 24
Then
x = (1/28)[32 - 4y + z]
y = (1/17)[35 - 2x - 4z]
z = (1/10)[24 - x - 3y]
First approximation: putting y = z = 0,
x1 = (1/28)[32] = 1.1428571
Putting x = 1.1428571, z = 0,
y1 = (1/17)[35 - 2(1.1428571) - 4(0)] = 1.9243697
Putting x = 1.1428571, y = 1.9243697,
z1 = (1/10)[24 - 1.1428571 - 3(1.9243697)] = 1.7084034
Second iteration:
x2 = (1/28)[32 - 4(1.9243697) + 1.7084034] = 0.9289615
y2 = (1/17)[35 - 2(0.9289615) - 4(1.7084034)] = 1.5475567
z2 = (1/10)[24 - 0.9289615 - 3(1.5475567)] = 1.8428368
Third iteration:
x3 = (1/28)[32 - 4(1.5475567) + 1.8428368] = 0.9875932
y3 = (1/17)[35 - 2(0.9875932) - 4(1.8428368)] = 1.5090274
z3 = (1/10)[24 - 0.9875932 - 3(1.5090274)] = 1.8485325
Fourth iteration:
x4 = (1/28)[32 - 4(1.5090274) + 1.8485325] = 0.9933008
y4 = (1/17)[35 - 2(0.9933008) - 4(1.8485325)] = 1.5070158
z4 = (1/10)[24 - 0.9933008 - 3(1.5070158)] = 1.8485652
Example: Using the Gauss-Seidel iteration method, solve the system of equations
10x - 2y - z - w = 3
-2x + 10y - z - w = 15
-x - y + 10z - 2w = 27
-x - y - 2z + 10w = -9
Third iteration:
y3 = (1/10)[15 + 2(0.9836405) + 2.9565624 - 0.0247651] = 1.9899087
z3 = (1/10)[27 + 0.9836405 + 1.9899087 + 2(-0.0247651)] = 2.9924019
w3 = (1/10)[-9 + 0.9836405 + 1.9899087 + 2(2.9924019)] = -0.0041647
Fourth iteration:
x4 = (1/10)[3 + 2(1.9899087) + 2.9924019 - 0.0041647] = 0.9968054
y4 = (1/10)[15 + 2(0.9968054) + 2.9924019 - 0.0041647] = 1.9981848
z4 = (1/10)[27 + 0.9968054 + 1.9981848 + 2(-0.0041647)] = 2.9986661
w4 = (1/10)[-9 + 0.9968054 + 1.9981848 + 2(2.9986661)] = -0.0007677
Note: When do we stop an iterative process? We stop when the required accuracy has been achieved: if we are asked for the root accurate to four decimal places, we simply perform iterations until successive iterates agree to four decimal places. For example, if the consecutive computed values of a root are
1.895326125, 1.916366125, 1.919356325, 1.919326355, 1.919327145, 1.919327128
then accuracy up to seven decimal places has been achieved, so if we were asked for six decimal places we could stop here. In the solved examples above, however, only a fixed number of iterations is carried out and accuracy is not considered.
Lecture 11
Solution of Linear System of Equations and Matrix Inversion
Relaxation Method
This is also an iterative method and is due to Southwell. To explain the details, consider again the system of equations
a11x1 + a12x2 + ... + a1nxn = b1
a21x1 + a22x2 + ... + a2nxn = b2
...
an1x1 + an2x2 + ... + annxn = bn
Let X(p) = (x1(p), x2(p), ..., xn(p))T be the solution vector obtained iteratively after the p-th iteration. If Ri(p) denotes the residual of the i-th equation of the system given above, that is, of ai1x1 + ai2x2 + ... + ainxn = bi, defined by
Ri(p) = bi - ai1x1(p) - ai2x2(p) - ... - ainxn(p)
then we can improve the solution vector successively by reducing the largest residual to zero at each iteration. This is the basic idea of the relaxation method.
To achieve fast convergence of the procedure, we take all terms to one side and then reorder the equations so that the largest negative coefficients in the equations appear on the diagonal.
To reduce a particular residual Ri to zero, we change xi by the increment
dxi = Ri/aii
In other words, we change xi to (xi + dxi) to relax Ri, that is, to reduce Ri to zero.
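A small sketch of this idea in Python/NumPy (assumed available): at every step, find the residual of largest magnitude and change the corresponding unknown by dxi = Ri/aii, which drives that residual to zero. The system used is the one from the first example below, already reordered to be diagonally dominant:

    import numpy as np

    def relaxation(A, b, steps, x0=None):
        """Relaxation: repeatedly zero out the residual of largest magnitude."""
        A, b = np.asarray(A, float), np.asarray(b, float)
        x = np.zeros_like(b) if x0 is None else np.asarray(x0, float)
        for _ in range(steps):
            r = b - A @ x                 # current residuals
            i = np.argmax(np.abs(r))      # equation with the largest residual
            x[i] += r[i] / A[i, i]        # relax it: this residual becomes zero
        return x

    A = [[6, -3, 1], [1, -7, 1], [2, 1, -8]]   # equations reordered as in the text
    b = [11, 10, -15]
    print(relaxation(A, b, 30))   # tends to the exact solution (1, -1, 2)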
Example: Solve the system of equations
6x1 - 3x2 + x3 = 11
2x1 + x2 - 8x3 = -15
x1 - 7x2 + x3 = 10
by the relaxation method, starting with the vector (0, 0, 0).
Solution: At first, we transfer all the terms to the right-hand side and reorder the equations, so that the largest coefficients in the equations appear on the diagonal. Thus, we get
0 = 11 - 6x1 + 3x2 - x3
0 = 10 - x1 + 7x2 - x3
0 = -15 - 2x1 - x2 + 8x3
Starting with the initial solution vector (0, 0, 0), that is, taking x1 = x2 = x3 = 0, the residuals are R1 = 11, R2 = 10, R3 = -15, of which the largest in magnitude is R3, i.e. the 3rd equation has the most error and needs immediate attention for improvement.
[Table: for each step, the residuals R1, R2, R3, the largest residual Ri, the increment dxi, and the current values x1, x2, x3]
At this stage, we observe that all the residuals R1, R2 and R3 are small enough, and therefore we may take the corresponding values of xi at this iteration as the solution. Hence, the numerical solution is given by
x1 = 1.0017, x2 = -0.9901, x3 = 2.0017
The exact solution is
x1 = 1.0, x2 = -1.0, x3 = 2.0
Example: Solve by the relaxation method
10x - 2y - 2z = 6
-x + 10y - 2z = 7
-x - y + 10z = 8
Solution: The operation table (the change in each residual produced by a unit increment in x, y or z) and the relaxation table are:
x  y  z |  r1   r2   r3
1  0  0 | -10    1    1    L1
0  1  0 |   2  -10    1    L2
0  0  1 |   2    2  -10    L3

x  y  z |  r1   r2   r3
0  0  0 |   6    7    8    L4
0  0  1 |   8    9   -2    L5 = L4 + L3
0  1  0 |  10   -1   -1    L6 = L5 + L2
1  0  0 |   0    0    0    L7 = L6 + L1
Explanation:
(1) In L4, the largest residual is 8. To reduce it, we give z an increment of dz = 8/c3 = 8/10 = 0.8 ≅ 1; the resulting residuals are obtained by L4 + (1)L3, i.e. line L5.
Example: Solve the system by the relaxation method:
9x - y + 2z = 9
x + 10y - 2z = 15
2x - 2y - 13z = -17
Solution: The residuals r1, r2, r3 of the system
9x - y + 2z = 9
x + 10y - 2z = 15
2x - 2y - 13z = -17
are given by
r1 = 9 - 9x + y - 2z
r2 = 15 - x - 10y + 2z
r3 = -17 - 2x + 2y + 13z
Operation table:
x  y  z |  r1   r2   r3
1  0  0 |  -9   -1   -2
0  1  0 |   1  -10    2
0  0  1 |  -2    2   13
Relaxation table:
x      y      z        |  r1        r2        r3
0      0      0        |  9         15       -17
0      0      1        |  7         17        -4
0      1      0        |  8          7        -2
0.89   0      0        | -0.01       6.11     -3.78
0      0.61   0        |  0.6        0.01     -2.56
0      0      0.19     |  0.22       0.39     -0.09
0      0.039  0        |  0.259      0        -0.012
0.028  0      0        |  0.007     -0.028    -0.068
0      0      0.00523  | -0.00346   -0.01754  -0.00001
Then x = 0.89 + 0.028 = 0.918, y = 1 + 0.61 + 0.039 = 1.649, and z = 1 + 0.19 + 0.00523 = 1.19523.
Now substituting the values of x, y, z into the residual formulas, we get
r1 = 9 - 9(0.918) + 1.649 - 2(1.19523) = -0.00346
r2 = 15 - 0.918 - 10(1.649) + 2(1.19523) = -0.01754
r3 = -17 - 2(0.918) + 2(1.649) + 13(1.19523) = -0.00001
which is in agreement with the final residuals.
Lecture 12
Lecture 13
Lecture 14
Fixed Points:
A fixed point of an n × n matrix A is a vector x in Rn such that Ax = x. Every square
matrix A has at least one fixed point, namely x = 0. We call this the trivial fixed point of
A.
The general procedure for finding the fixed points of a matrix A is to rewrite the equation
Ax = x as Ax = Ix or, alternatively, as
(I – A)x = 0 (1)
Since this can be viewed as a homogeneous linear system of n equations in n unknowns
with coefficient matrix I – A, we see that the set of fixed points of an n × n matrix is a
subspace of Rn that can be obtained by solving (1).
The following theorem will be useful for ascertaining the nontrivial fixed points of a
matrix.
Theorem 1:
If A is an n x n matrix, then the following statements are equivalent.
(a) A has nontrivial fixed points.
(b) I – A is singular.
(c) det(I – A) = 0.
Example 1:
In each part, determine whether the matrix has nontrivial fixed points; and, if so, graph
the subspace of fixed points in an xy-coordinate system.
(a) A = [3 6; 1 2]    (b) A = [0 2; 0 1]
Solution:
(a) The matrix has only the trivial fixed point, since
I − A = [1 0; 0 1] − [3 6; 1 2] = [−2 −6; −1 −1]
det(I − A) = det [−2 −6; −1 −1] = (−1)(−2) − (−1)(−6) = 2 − 6 = −4 ≠ 0
(b) The matrix has nontrivial fixed points, since
I − A = [1 0; 0 1] − [0 2; 0 1] = [1 −2; 0 0]
det(I − A) = det [1 −2; 0 0] = 0
The fixed points x = (x, y) are the solutions of the linear system (I − A)x = 0, which we can express in component form as
[1 −2; 0 0][x; y] = [0; 0]
A general solution of this system is
x = 2t, y = t    (2)
which are parametric equations of the line y = (1/2)x. It follows from the corresponding vector form of this line that the fixed points are
x = [x; y] = [2t; t] = t[2; 1]    (3)
As a check,
Ax = [0 2; 0 1][2t; t] = [2t; t] = x
so every vector of form (3) is a fixed point of A.
(Figure 1: the line y = (1/2)x of fixed points, passing through the point (2, 1).)
Example 2:
Let A = [1 6; 5 2], u = [6; −5] and v = [3; −2]. Are u and v eigenvectors of A?
Solution:
Au = [1 6; 5 2][6; −5] = [−24; 20] = −4[6; −5] = −4u
Av = [1 6; 5 2][3; −2] = [−9; 11] ≠ λ[3; −2]
Thus u is an eigenvector corresponding to the eigenvalue −4, but v is not an eigenvector of A, because Av is not a multiple of v.
Example 3:
Show that 7 is an eigenvalue of A = [1 6; 5 2], and find the corresponding eigenvectors.
Solution:
The scalar 7 is an eigenvalue of A if and only if the equation
Ax = 7x    (A)
has a nontrivial solution. But (A) is equivalent to Ax − 7x = 0, or
(A − 7I)x = 0    (B)
To solve this homogeneous equation, form the matrix
A − 7I = [1 6; 5 2] − [7 0; 0 7] = [−6 6; 5 −5]
The columns of A − 7I are obviously linearly dependent, so (B) has nontrivial solutions. Thus 7 is an eigenvalue of A. To find the corresponding eigenvectors, use row operations on the augmented matrix:
[−6 6 0; 5 −5 0]
~ [1 −1 0; 5 −5 0]    (−R1 − R2)
~ [1 −1 0; 0 0 0]    (R2 − 5R1)
The general solution has the form x = x2[1; 1]. Each vector of this form with x2 ≠ 0 is an eigenvector corresponding to λ = 7.
The equivalence of equations (A) and (B) obviously holds for any λ in place of λ = 7.
Thus λ is an eigenvalue of A if and only if the equation
(A - λ I)x = 0 (C)
has a nontrivial solution.
Eigen space:
The set of all solutions of (A - λ I)x = 0 is just the null space of the matrix A - λ I. So
this set is a subspace of Rn and is called the eigenspace of A corresponding to λ . The
eigenspace consists of the zero vector and all the eigenvectors corresponding to λ .
Example 3 shows that for matrix A in Example 2, the eigenspace corresponding to λ = 7
consists of all multiples of (1, 1), which is the line through (1, 1) and the origin. From
Example 2, one can check that the eigenspace corresponding to λ = -4 is the line through
(6, -5). These eigenspaces are shown in Fig. 1, along with eigenvectors (1, 1) and (3/2, -
5/4) and the geometric action of the transformation x → Ax on each eigenspace.
Example 4: Let A = [4 −1 6; 2 1 6; 2 −1 8]. Find a basis for the eigenspace corresponding to the eigenvalue 2.
Solution: Form
A − 2I = [4 −1 6; 2 1 6; 2 −1 8] − [2 0 0; 0 2 0; 0 0 2] = [2 −1 6; 2 −1 6; 2 −1 6]
and row reduce the augmented matrix for (A − 2I)x = 0:
[2 −1 6 0; 2 −1 6 0; 2 −1 6 0]
~ [2 −1 6 0; 0 0 0 0; 2 −1 6 0]    (R2 − R1)
~ [2 −1 6 0; 0 0 0 0; 0 0 0 0]    (R3 − R1)
At this point we are confident that 2 is indeed an eigenvalue of A because the equation (A − 2I)x = 0 has free variables. The general solution satisfies
2x1 − x2 + 6x3 = 0    (a)
Let x2 = t and x3 = s; then 2x1 = t − 6s, i.e. x1 = (1/2)t − 3s, and
x = [x1; x2; x3] = [t/2 − 3s; t; s] = t[1/2; 1; 0] + s[−3; 0; 1]
so {(1/2, 1, 0), (−3, 0, 1)} is a basis for the eigenspace.
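As an informal numerical check of Example 4 (a NumPy sketch we add here; it is not part of the original notes), each basis vector of the eigenspace should satisfy Av = 2v:

    import numpy as np

    A = np.array([[4.0, -1.0, 6.0],
                  [2.0, 1.0, 6.0],
                  [2.0, -1.0, 8.0]])

    # Basis vectors of the eigenspace for lambda = 2 found above.
    for v in (np.array([0.5, 1.0, 0.0]), np.array([-3.0, 0.0, 1.0])):
        print(A @ v, 2 * v)   # each pair of printed vectors agrees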
The most direct way of finding the eigenvalues of an n × n matrix A is to rewrite the equation Ax = λx as Ax = λIx, or equivalently, as
(λI − A)x = 0    (4)
and then try to determine those values of λ, if any, for which this system has nontrivial solutions. Since (4) has nontrivial solutions if and only if the coefficient matrix λI − A is singular, we see that the eigenvalues of A are the solutions of the equation
det(λI − A) = 0    (5)
Equation (5) is known as the characteristic equation. Also, if λ is an eigenvalue of A, then equation (4) has a nonzero solution space, which we call the eigenspace of A corresponding to λ. The nonzero vectors in the eigenspace of A corresponding to λ are the eigenvectors of A corresponding to λ.
In summary: λ is an eigenvalue of A if and only if det(λI − A) = 0.
Example 6:
(1) Is 5 an eigenvalue of A = [6 −3 1; 3 0 5; 2 2 6]?
(2) If x is an eigenvector of A corresponding to λ, what is A³x?
Solution:
(1) The number 5 is an eigenvalue of A if and only if the equation (A − 5I)x = 0 has a nontrivial solution. Form
A − 5I = [6 −3 1; 3 0 5; 2 2 6] − [5 0 0; 0 5 0; 0 0 5] = [1 −3 1; 3 −5 5; 2 2 1]
and row reduce the augmented matrix:
[1 −3 1 0; 3 −5 5 0; 2 2 1 0]
~ [1 −3 1 0; 0 4 2 0; 2 2 1 0]    (R2 − 3R1)
~ [1 −3 1 0; 0 4 2 0; 0 8 −1 0]    (R3 − 2R1)
~ [1 −3 1 0; 0 4 2 0; 0 0 −5 0]    (R3 − 2R2)
At this point it is clear that the homogeneous system has no free variables. Thus A − 5I is an invertible matrix, which means that 5 is not an eigenvalue of A.
(2) Since Ax = λx, we have A²x = A(λx) = λAx = λ²x, and hence A³x = λ³x.
Exercises:
1. Is λ = 2 an eigenvalue of [3 2; 3 8]?
2. Is [−1 + √2; 1] an eigenvector of [2 1; 1 4]? If so, find the eigenvalue.
3. Is [4; −3; 1] an eigenvector of [3 7 9; −4 −5 1; 2 4 4]? If so, find the eigenvalue.
4. Is [1; −2; 1] an eigenvector of [3 6 7; 3 3 7; 5 6 5]? If so, find the eigenvalue.
5. Is λ = 4 an eigenvalue of [3 0 −1; 2 3 1; −3 4 5]? If so, find one corresponding eigenvector.
6. Is λ = 3 an eigenvalue of [1 2 2; 3 −2 1; 0 1 1]? If so, find one corresponding eigenvector.
In exercises 7 to 12, find a basis for the eigenspace corresponding to each listed eigenvalue.
7. A = [4 −2; −3 9], λ = 10
8. A = [7 4; −3 −1], λ = 1, 5
9. A = [4 0 1; −2 1 0; −2 0 1], λ = 1, 2, 3
10. A = [1 0 −1; 1 −3 0; 4 −13 1], λ = −2
11. A = [4 2 3; −1 1 −3; 2 4 9], λ = 3
12. A = [3 0 2 0; 1 3 1 0; 0 1 1 0; 0 0 0 4], λ = 4
In exercises 13 and 14, find the eigenvalues of the given matrix.
13. [0 0 0; 0 2 5; 0 0 −1]
14. [4 0 0; 0 0 0; 1 0 −3]
15. For A = [1 2 3; 1 2 3; 1 2 3], find one eigenvalue, with no calculation. Justify your answer.
16. Without calculation, find one eigenvalue and two linearly independent eigenvectors of A = [5 5 5; 5 5 5; 5 5 5]. Justify your answer.
Lecture 15
The characteristic equation contains useful information about the eigenvalues of a square matrix A. It is defined as
det(A − λI) = 0
where λ is an eigenvalue and I is the identity matrix. We will solve the characteristic equation (whose left-hand side is called the characteristic polynomial) to work out the eigenvalues of the given square matrix A.
Example 1: Find the eigenvalues of A = [2 3; 3 −6].
Solution: In order to find the eigenvalues of the given matrix, we must solve the matrix equation
(A − λI)x = 0
for the scalars λ for which it has a nontrivial solution. By the Invertible Matrix Theorem, this problem is equivalent to finding all λ such that the matrix A − λI is not invertible, where
A − λI = [2 3; 3 −6] − [λ 0; 0 λ] = [2−λ 3; 3 −6−λ]
Setting det(A − λI) = (2 − λ)(−6 − λ) − 9 = λ² + 4λ − 21 = (λ − 3)(λ + 7) = 0 gives the eigenvalues λ = 3 and λ = −7.
Example 2: Compute det A for A = [1 5 0; 2 4 −1; 0 −2 0].
Solution:
First, we reduce the given matrix to echelon form by applying elementary row operations.
By R2 − 2R1:
[1 5 0; 0 −6 −1; 0 −2 0]
By R2 ↔ R3:
[1 5 0; 0 −2 0; 0 −6 −1]
By R3 − 3R2:
[1 5 0; 0 −2 0; 0 0 −1]
which is an upper triangular matrix whose determinant is (1)(−2)(−1) = 2. Since one row interchange was used, which reverses the sign of the determinant,
det A = −(1)(−2)(−1) = −2
Note: These Properties will be helpful in using the characteristic equation to find
eigenvalues of a matrix A.
Example 3: (a) Find the eigenvalues and corresponding eigenvectors of the matrix
A = [1 3; 4 2]
(b) Graph the eigenspaces of A in an xy-coordinate system.
Solution: (a) The eigenvalues will be worked out by solving the characteristic equation
of A. Since
λI − A = λ[1 0; 0 1] − [1 3; 4 2] = [λ−1 −3; −4 λ−2]
The characteristic equation det(λI − A) = 0 becomes
det [λ−1 −3; −4 λ−2] = 0
Expanding and simplifying the determinant yields
λ² − 3λ − 10 = 0
or
(λ + 2)(λ − 5) = 0    (1)
Thus, the eigenvalues of A are λ = −2 and λ = 5.
Now, to work out the eigenspaces corresponding to these eigenvalues, we will solve the system
[λ−1 −3; −4 λ−2][x; y] = [0; 0]    (2)
for λ = −2 and λ = 5. Here are the computations for the two cases.
(i) Case λ = −2
In this case Eq. (2) becomes
[−3 −3; −4 −4][x; y] = [0; 0]
which can be written as
−3x − 3y = 0
−4x − 4y = 0  ⇒  x = −y
In parametric form,
x = −t, y = t    (3)
Thus, the eigenvectors corresponding to λ = −2 are the nonzero vectors of the form
x = [x; y] = [−t; t] = t[−1; 1]    (4)
It can be verified that
[1 3; 4 2][−t; t] = [2t; −2t] = −2[−t; t] = −2x
Thus, Ax = λx.
(ii) Case λ = 5
In this case Eq. (2) becomes
[4 −3; −4 3][x; y] = [0; 0]
which can be written as
4x − 3y = 0
−4x + 3y = 0  ⇒  x = (3/4)y
In parametric form,
x = (3/4)t, y = t    (5)
Thus, the eigenvectors corresponding to λ = 5 are the nonzero vectors of the form
x = [x; y] = [(3/4)t; t] = t[3/4; 1]    (6)
It can be verified that
Ax = [1 3; 4 2][(3/4)t; t] = [(15/4)t; 5t] = 5[(3/4)t; t] = 5x
(Figure 1(a): the eigenspaces are the lines y = −x, for λ = −2, and y = (4/3)x, for λ = 5.)
It can also be drawn using the vector equations (4) and (6) as shown in Figure 1(b). When
an eigenvector x in the eigenspace for λ = 5 is multiplied by A, the resulting vector has
the same direction as x but the length is increased by a factor of 5 and when an
eigenvector x in the eigenspace for λ = −2 is multiplied by A, the resulting vector is
oppositely directed to x and the length is increased by a factor of 2. In both cases,
multiplying an eigenvector by A produces a vector in the same eigenspace.
(Figure 1(b): multiplication by A maps each eigenvector x to 5x, for λ = 5, or to −2x, for λ = −2, staying in the same eigenspace.)
Example 4: Find the eigenvalues of the matrix A = [0 −1 0; 0 0 1; −4 −17 8].
Solution:
det(λI − A) = det([λ 0 0; 0 λ 0; 0 0 λ] − [0 −1 0; 0 0 1; −4 −17 8])
= det [λ 1 0; 0 λ −1; 4 17 λ−8]    (7)
= λ³ − 8λ² + 17λ − 4
which yields the characteristic equation
λ³ − 8λ² + 17λ − 4 = 0    (8)
To solve this equation, firstly, we will look for integer solutions. This can be done by
using the fact that if a polynomial equation has integer coefficients, then its integer
solutions, if any, must be divisors of the constant term of the given polynomial. Thus, the
only possible integer solutions of Eq.(8) are the divisors of –4, namely ± 1, ±2, and ± 4 .
Substituting these values successively into Eq. (8) shows that λ = 4 is an integer solution. This implies that λ − 4 is a factor of the polynomial in (7). Dividing the polynomial by λ − 4, we can rewrite Eq. (8) as
(λ − 4)(λ² − 4λ + 1) = 0
Now, the remaining solutions of the characteristic equation satisfy the quadratic equation
λ² − 4λ + 1 = 0
Solving the above equation by the quadratic formula, we get the eigenvalues of A as
λ = 4, λ = 2 + √3, λ = 2 − √3
Example 5: Find the characteristic equation of A = [5 −2 6 −1; 0 3 −8 0; 0 0 5 4; 0 0 0 1].
Solution: Clearly, the given matrix is an upper triangular matrix. Forming A − λI, we get
det(A − λI) = det [5−λ −2 6 −1; 0 3−λ −8 0; 0 0 5−λ 4; 0 0 0 1−λ]
Now using the fact that the determinant of a triangular matrix equals the product of its diagonal elements, the characteristic equation becomes
(5 − λ)²(3 − λ)(1 − λ) = 0
Expanding the product, we can also write it as
λ⁴ − 14λ³ + 68λ² − 130λ + 75 = 0
Here, the eigenvalue 5 is said to have multiplicity 2 because ( λ - 5) occurs two times as a
factor of the characteristic polynomial. In general, the (algebraic) multiplicity of an
eigenvalue λ is its multiplicity as a root of the characteristic equation.
Note:
From the above mentioned examples, it can be easily observed that if A is an n × n matrix,
then det (A – λ I) is a polynomial of degree n called the characteristic polynomial of A.
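As a quick cross-check of this fact (an illustrative NumPy sketch we add, not from the notes), numpy.poly returns the coefficients of the characteristic polynomial of a square matrix:

    import numpy as np

    # The triangular matrix of Example 5.
    A = np.array([[5.0, -2.0, 6.0, -1.0],
                  [0.0, 3.0, -8.0, 0.0],
                  [0.0, 0.0, 5.0, 4.0],
                  [0.0, 0.0, 0.0, 1.0]])

    print(np.poly(A))   # approximately [1, -14, 68, -130, 75]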
Activity:
Work out the eigenvalues and eigenvectors of the square matrix
A = [5 8 16; 4 1 8; −4 −4 −11]
Similarity:
Let A and B be two n × n matrices. A is said to be similar to B if there exists an invertible matrix P such that
P⁻¹AP = B
or equivalently,
A = PBP⁻¹
Writing Q = P⁻¹, we have
Q⁻¹BQ = A
so B is also similar to A. Thus, we can simply say that A and B are similar.
Similarity transformation:
The act of changing A into P -1AP is called a similarity
transformation.
The following theorem illustrates use of the characteristic polynomial and it provides the
foundation for several iterative methods that approximate eigenvalues.
Theorem 2:
If n x n matrices A and B are similar, then they have the same
characteristic polynomial and hence the same eigenvalues (with the same multiplicities).
Note: It must be clear that Similarity and row equivalence are two different concepts. ( If
A is row equivalent to B, then B = EA for some invertible matrix E.) Row operations on
a matrix usually change its eigenvalues.
Example 7: Let A = [.95 .03; .05 .97]. Analyze the long-term behavior of the dynamical system defined by x_{k+1} = Ax_k (k = 0, 1, 2, ...), with x0 = [0.6; 0.4].
Solution: The first step is to find the eigenvalues of A and a basis for each eigenspace. The characteristic equation for A is
0 = det(A − λI) = det [.95−λ .03; .05 .97−λ] = (.95 − λ)(.97 − λ) − (.03)(.05)
= λ² − 1.92λ + .92
By the quadratic formula,
λ = (1.92 ± √((1.92)² − 4(.92)))/2 = (1.92 ± √.0064)/2 = (1.92 ± .08)/2 = 1 or .92
Next, the eigenvectors are found from (A − λI)x = 0.
For λ = 1:
[−0.05 0.03; 0.05 −0.03][x1; x2] = [0; 0]
which can be written as
−0.05x1 + 0.03x2 = 0
0.05x1 − 0.03x2 = 0  ⇒  x1 = (0.03/0.05)x2 = (3/5)x2
In parametric form, x1 = (3/5)t and x2 = t, so we may take v1 = [3; 5].
For λ = 0.92, a similar computation gives v2 = [1; −1]. Writing x0 in terms of the eigenvectors, x0 = [0.6; 0.4] = 0.125v1 + 0.225v2, and therefore
x_k = A^k x0 = 0.125v1 + 0.225(0.92)^k v2
As k → ∞, (0.92)^k tends to zero and x_k tends to [0.375; 0.625] = 0.125v1.
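The long-term behavior can also be seen by simply iterating the system; the following small NumPy sketch (ours, not from the notes) converges to the steady state found above:

    import numpy as np

    A = np.array([[0.95, 0.03],
                  [0.05, 0.97]])
    x = np.array([0.6, 0.4])

    for k in range(200):           # iterate x_{k+1} = A x_k
        x = A @ x
    print(x)                       # approximately [0.375, 0.625]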
Example 8: Find the characteristic equation and eigenvalues of A = [1 −4; 4 2].
Solution: The characteristic equation is
0 = det(A − λI) = det [1−λ −4; 4 2−λ]
= (1 − λ)(2 − λ) − (−4)(4)
= λ² − 3λ + 18
which is a quadratic equation whose roots are given by
λ = (3 ± √((−3)² − 4(18)))/2 = (3 ± √−63)/2
Thus, we see that the characteristic equation has no real roots, so A has no real eigenvalues. A acts on the real vector space R², and there is no nonzero vector v in R² such that Av = λv for some scalar λ.
Exercises:
Find the characteristic polynomial and the eigenvalues of the matrices in exercises 1 to 12.
1. [3 −2; 1 −1]
2. [5 −3; −4 3]
3. [2 1; −1 4]
4. [3 −4; 4 8]
5. [5 3; −4 4]
6. [7 −2; 2 3]
7. [1 0 −1; 2 3 −1; 0 6 0]
8. [0 3 1; 3 0 2; 1 2 0]
9. [4 0 0; 5 3 2; −2 0 2]
10. [−1 0 1; −3 4 1; 0 0 2]
11. [6 −2 0; −2 9 0; 5 8 3]
12. [5 −2 3; 0 1 0; 6 7 −2]
For the matrices in exercises 13 to 15, list the eigenvalues, repeated according to their multiplicities.
13. [4 −7 0 2; 0 3 −4 6; 0 0 3 −8; 0 0 0 1]
14. [5 0 0 0; 8 −4 0 0; 0 7 1 0; 1 −5 2 1]
15. [3 0 0 0 0; −5 1 0 0 0; 3 8 0 0 0; 0 −7 2 1 0; −4 1 9 −2 3]
16. It can be shown that the algebraic multiplicity of an eigenvalue λ is always greater than or equal to the dimension of the eigenspace corresponding to λ. Find h in the matrix A below such that the eigenspace for λ = 5 is two-dimensional:
A = [5 −2 6 −1; 0 3 h 0; 0 0 5 4; 0 0 0 1]
Lecture 16
Diagonalization
Diagonalization is the process of factoring a square matrix A as A = PDP⁻¹ for some invertible matrix P and diagonal matrix D. This factorization enables us to compute A^k quickly for large values of k, a fundamental idea in several applications of linear algebra. Later, the factorization will be used to analyze (and decouple) dynamical systems.
The “D” in the factorization stands for diagonal. Powers of such a D are trivial to
compute.
Example 1: If D = [5 0; 0 3], then
D² = [5 0; 0 3][5 0; 0 3] = [5² 0; 0 3²]
and
D³ = [5 0; 0 3][5² 0; 0 3²] = [5³ 0; 0 3³]
In general,
D^k = [5^k 0; 0 3^k] for k ≥ 1
The next example shows that if A = PDP-1 for some invertible P and diagonal D, then it
is quite easy to compute Ak.
Example 2: Let A = [7 2; −4 1]. Find a formula for A^k, given that A = PDP⁻¹, where
P = [1 1; −1 −2] and D = [5 0; 0 3]
Solution: The standard formula for the inverse of a 2 × 2 matrix yields
P⁻¹ = [2 1; −1 −1]
By the associative property of matrix multiplication,
A² = (PDP⁻¹)(PDP⁻¹) = PD(P⁻¹P)DP⁻¹ = PDIDP⁻¹ = PD²P⁻¹ = [1 1; −1 −2][5² 0; 0 3²][2 1; −1 −1]
where I is the identity matrix. Again,
A³ = (PDP⁻¹)A² = (PDP⁻¹)PD²P⁻¹ = PD(P⁻¹P)D²P⁻¹ = PD³P⁻¹
Thus, in general, for k ≥ 1,
A^k = PD^kP⁻¹ = [1 1; −1 −2][5^k 0; 0 3^k][2 1; −1 −1]
= [5^k 3^k; −5^k −2·3^k][2 1; −1 −1]
= [2·5^k − 3^k   5^k − 3^k; 2·3^k − 2·5^k   2·3^k − 5^k]
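A hedged numerical sketch of this idea (our own NumPy code, not from the notes): the k-th power computed directly agrees with the power computed through the factorization.

    import numpy as np
    from numpy.linalg import inv, matrix_power

    P = np.array([[1.0, 1.0],
                  [-1.0, -2.0]])
    D = np.array([[5.0, 0.0],
                  [0.0, 3.0]])
    A = P @ D @ inv(P)             # the matrix of Example 2

    k = 6
    print(matrix_power(A, k))                       # direct power
    print(P @ np.diag(np.diag(D) ** k) @ inv(P))    # via A^k = P D^k P^(-1)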
Activity:
Work out C⁴, given that C = PDP⁻¹, where
P = [1 0; 3 1], D = [2 0; 0 1]
Remarks:
A square matrix A is said to be diagonalizable if A is similar to a diagonal matrix, that is,
if A = PDP-1 for some invertible matrix P and some diagonal matrix D. The next theorem
gives a characterization of diagonalizable matrices and tells how to construct a suitable
factorization.
In fact, A = PDP-1, with D a diagonal matrix, if and only if the columns of P are n
linearly independent eigenvectors of A. In this case, the diagonal entries of D are
eigenvalues of A that correspond, respectively, to the eigenvectors in P.
In other words, A is diagonalizable if and only if there are enough eigenvectors to form a
basis of Rn. We call such a basis an eigenvector basis.
Proof: First, observe that if P is any n × n matrix with columns v1, ..., vn, and if D is any diagonal matrix with diagonal entries λ1, ..., λn, then
AP = A[v1 v2 ... vn] = [Av1 Av2 ... Avn]    (1)
while
PD = P diag(λ1, λ2, ..., λn) = [λ1v1 λ2v2 ... λnvn]    (2)
Suppose now that A is diagonalizable and A = PDP⁻¹. Then right-multiplying this relation by P, we have AP = PD. In this case, (1) and (2) imply that
[Av1 Av2 ... Avn] = [λ1v1 λ2v2 ... λnvn]    (3)
Equating columns, we find that
Av1 = λ1v1, Av2 = λ2v2, ..., Avn = λnvn    (4)
Since P is invertible, its columns v 1 ,…, v n must be linearly independent. Also, since these
columns are nonzero, Eq.(4) shows that λ1 ,....., λn are eigenvalues and v 1 , …, v n are
corresponding eigenvectors. This argument proves the “only if” parts of the first and
second statements along with the third statement, of the theorem.
Finally, given any n eigenvectors v1, ..., vn, use them to construct the columns of P and use the corresponding eigenvalues λ1, ..., λn to construct D. By Eqs. (1)-(3), AP = PD. This
is true without any condition on the eigenvectors. If, in fact, the eigenvectors are linearly
independent, then P is invertible (by the Invertible Matrix Theorem), and AP = PD
implies that A = PDP-1.
Diagonalizing Matrices
Example 3: Diagonalize the following matrix, if possible:
A = [1 3 3; −3 −5 −3; 3 3 1]
Solution: To diagonalize the given matrix, we need to find an invertible matrix P and a
diagonal matrix D such that A = PDP-1 which can be done in following four steps.
Thus, the basis vector for λ = 1 is v1 = [1; −1; 1].
Step 4: Form D from the corresponding eigenvalues. For this purpose, the order of the eigenvalues must match the order chosen for the columns of P. Use the eigenvalue λ = −2 twice, once for each of the eigenvectors corresponding to λ = −2:
D = [1 0 0; 0 −2 0; 0 0 −2]
Now we check that P and D really work. To avoid computing P⁻¹, simply verify that AP = PD; this is equivalent to A = PDP⁻¹ when P is invertible. We compute
AP = [1 3 3; −3 −5 −3; 3 3 1][1 −1 −1; −1 1 0; 1 0 1] = [1 2 2; −1 −2 0; 1 0 −2]
PD = [1 −1 −1; −1 1 0; 1 0 1][1 0 0; 0 −2 0; 0 0 −2] = [1 2 2; −1 −2 0; 1 0 −2]
There are no other eigenvalues, and every eigenvector of A is a multiple of either v1 or v2. Hence it is impossible to form a basis of R³ using eigenvectors of A, and by the above theorem A is not diagonalizable.
The condition in Theorem 2 is sufficient but not necessary; that is, an n × n matrix need not have n distinct eigenvalues in order to be diagonalizable. Example 3 serves as a counterexample: the 3 × 3 matrix there is diagonalizable even though it has only two distinct eigenvalues.
Example: Diagonalize the matrix
A = [5 0 0 0; 0 5 0 0; 1 4 −3 0; −1 −2 0 −3]
Solution: Since A is a triangular matrix, the eigenvalues are 5 and −3, each with multiplicity 2. Using the method of the earlier examples, we find a basis for each eigenspace:
Basis for λ = 5: v1 = [−8; 4; 1; 0] and v2 = [−16; 4; 0; 1]
Basis for λ = −3: v3 = [0; 0; 1; 0] and v4 = [0; 0; 0; 1]
The set {v1, ..., v4} is linearly independent, by Theorem 3. So the matrix P = [v1 v2 v3 v4] is invertible, and A = PDP⁻¹, where
P = [−8 −16 0 0; 4 4 0 0; 1 0 1 0; 0 1 0 1] and D = [5 0 0 0; 0 5 0 0; 0 0 −3 0; 0 0 0 −3]
Example 7:
(1) Compute A⁸, where A = [4 −3; 2 −1].
(2) Let A = [−3 12; −2 7], v1 = [3; 1], and v2 = [2; 1]. Suppose you are told that v1 and v2 are eigenvectors of A. Use this information to diagonalize A.
(3) Let A be a 4 × 4 matrix with eigenvalues 5, 3 and −2, and suppose that you know the eigenspace for λ = 3 is two-dimensional. Do you have enough information to determine whether A is diagonalizable?
Solution:
(1) Here, det(A − λI) = λ² − 3λ + 2 = (λ − 2)(λ − 1).
Exercise:
In exercises 1 and 2, let A = PDP⁻¹ and compute A⁴.
1. P = [5 7; 2 3], D = [2 0; 0 1]
2. P = [2 −3; −3 5], D = [1 0; 0 1/2]
In exercises 3 and 4, use the factorization A = PDP⁻¹ to compute A^k, where k represents an arbitrary positive integer.
3. [a 0; 3(a−b) b] = [1 0; 3 1][a 0; 0 b][1 0; −3 1]
4. [−2 12; −1 5] = [3 4; 1 1][2 0; 0 1][−1 4; 1 −3]
In exercises 5 and 6, the matrix A is factored in the form PDP⁻¹. Use the Diagonalization Theorem to find the eigenvalues of A and a basis for each eigenspace.
5. [2 2 1; 1 3 1; 1 2 2] = [1 1 2; 1 0 −1; 1 −1 0][5 0 0; 0 1 0; 0 0 1][1/4 1/2 1/4; 1/4 1/2 −3/4; 1/4 −1/2 1/4]
6. [4 0 −2; 2 5 4; 0 0 5] = [−2 0 −1; 0 1 2; 1 0 0][5 0 0; 0 5 0; 0 0 4][0 0 1; 2 1 4; −1 0 −2]
Diagonalize the matrices in exercises 7 to 18, if possible.
7. [3 −1; 1 5]
8. [2 3; 4 1]
9. [−1 4 −2; −3 4 0; −3 1 3]
10. [4 2 2; 2 4 2; 2 2 4]
11. [2 2 −1; 1 3 −1; −1 −2 2]
12. [4 0 −2; 2 5 4; 0 0 5]
13. [7 4 16; 2 5 8; −2 −2 −5]
14. [0 −4 −6; −1 0 −3; 1 2 5]
15. [4 0 0; 1 4 0; 0 0 5]
16. [−7 −16 4; 6 13 −2; 12 16 1]
17. [5 −3 0 9; 0 3 1 −2; 0 0 2 0; 0 0 0 2]
18. [4 0 0 0; 0 4 0 0; 0 0 2 0; 1 0 0 2]
Lecture 17
Inner Product
For u = (u1, u2, ..., un) and v = (v1, v2, ..., vn) in Rⁿ, the inner (dot) product of u and v is the number
u·v = uᵀv = [u1 u2 ... un][v1; v2; ...; vn] = u1v1 + u2v2 + ... + unvn
Example 1
Compute u·v and v·u when u = [2; −5; −1] and v = [3; 2; −3].
Solution
uᵀ = [2 −5 −1]
u·v = uᵀv = [2 −5 −1][3; 2; −3] = 2(3) + (−5)(2) + (−1)(−3) = 6 − 10 + 3 = −1
vᵀ = [3 2 −3]
v·u = vᵀu = [3 2 −3][2; −5; −1] = 3(2) + (2)(−5) + (−3)(−1) = 6 − 10 + 3 = −1
Theorem
Let u , v and w be vectors in R n , and let c be a scalar. Then
a. u·v = v·u
b. (u + v)·w = u·w + v·w
c. (cu)·v = c(u·v) = u·(cv)
d. u·u ≥ 0, and u·u = 0 if and only if u = 0
Observation
Length or Norm
The length or norm of v is the nonnegative scalar ‖v‖ defined by
‖v‖ = √(v·v) = √(v1² + v2² + ... + vn²), so that ‖v‖² = v·v
Unit vector
A vector whose length is 1 is called a unit vector. If we divide a non-zero vector v by its length ‖v‖, we obtain a unit vector
u = v/‖v‖
The length of u is ‖u‖ = (1/‖v‖)‖v‖ = 1.
Definition
The process of creating the unit vector u from v is sometimes called normalizing v ,
and we say that u is in the same direction as v . In this case “ u ” is called the
normalized vector.
Example 2
Let v = (1, 2, 2, 0) in R⁴. Find a unit vector u in the same direction as v.
Solution
The length of v is given by
‖v‖ = √(v·v) = √(1² + 2² + 2² + 0²) = √9 = 3
so
u = (1/‖v‖)v = (1/3)[1; 2; 2; 0] = [1/3; 2/3; 2/3; 0]
To check that ‖u‖ = 1:
‖u‖² = u·u = (1/3)² + (2/3)² + (2/3)² + 0² = 1/9 + 4/9 + 4/9 = 1
Example 3
Let W be the subspace of R² spanned by x = (2/3, 1). Find a unit vector z that is a basis for W.
Solution
W consists of all multiples of x, so any nonzero vector in W spans it; we normalize x. Since ‖x‖ = √(4/9 + 1) = √13/3,
z = x/‖x‖ = (3/√13)(2/3, 1) = (2/√13, 3/√13) = (2√13/13, 3√13/13)
Definition
For u and v vectors in R n , the distance between u and v , written as dist ( u , v ), is the
length of the vector u − v . That is
dist(u, v) = ‖u − v‖
Example 4
Compute the distance between the vectors u = (7, 1) and v = (3, 2).
Solution
u − v = [7; 1] − [3; 2] = [4; −1]
dist(u, v) = ‖u − v‖ = √(4² + (−1)²) = √17
The vectors u, v and u − v are shown in the figure below. When the vector u − v is added to v, the result is u. The parallelogram in the figure shows that the distance from u to v is the same as the distance from u − v to 0.
Example 5
If u = (u1, u2, u3) and v = (v1, v2, v3), then
dist(u, v) = ‖u − v‖ = √((u − v)·(u − v)) = √((u1 − v1)² + (u2 − v2)² + (u3 − v3)²)
Definition
Two vectors u and v in R n are orthogonal (to each other) if u·v = 0
Note
The zero vector is orthogonal to every vector in R n because 0 .v = 0 for all v in R n .
Two vectors u and v are orthogonal if and only if
‖u + v‖² = ‖u‖² + ‖v‖²
Orthogonal Complements
If a vector z is orthogonal to every vector in a subspace W of Rⁿ, we say z is orthogonal to W. The set of all vectors z that are orthogonal to W is called the orthogonal complement of W and is denoted by W⊥.
Example 6
Let W be a plane through the origin in R3, and let L be the line through the origin
and perpendicular to W. If z and w are nonzero, z is on L, and w is in W, then the line
segment from 0 to z is perpendicular to the line segment from 0 to w; that is, z . w = 0.
So each vector on L is orthogonal to every w in W. In fact, L consists of all vectors
that are orthogonal to the w’s in W, and W consists of all vectors orthogonal to the z’s
in L. That is,
L = W ⊥ and W = L⊥
Remarks
The following two facts about W ⊥ , with W a subspace of Rn, are needed later in the
segment.
(1) A vector x is in W ⊥ if and only if x is orthogonal to every vector in a set that
spans W.
(2) W ⊥ is a subspace of Rn.
Theorem 3
Let A be m x n matrix. Then the orthogonal complement of the row space of A is the
null space of A, and the orthogonal complement of the column space of A is the null
space of AT: (Row A) ⊥ = Nul A, (Col A) ⊥ = Nul AT
Proof
The row-column rule for computing Ax shows that if x is in Nul A, then x is
orthogonal to each row of A (with the rows treated as vectors in Rn). Since the rows of
A span the row space, x is orthogonal to Row A. Conversely, if x is orthogonal to
Row A, then x is certainly orthogonal to each row of A, and hence Ax = 0. This
proves the first statement. The second statement follows from the first by replacing A
with AT and using the fact that Col A = Row AT.
Angles in R2 and R3
If u and v are nonzero vectors in either R2 or R3, then there is a nice connection
between their inner product and the angle ϑ between the two line segments from the
origin to the points identified with u and v. The formula is
u·v = ‖u‖ ‖v‖ cos ϑ    (2)
To verify this formula for vectors in R², consider the triangle shown in Fig. 7, with sides of length ‖u‖, ‖v‖ and ‖u − v‖, where u = (u1, u2), v = (v1, v2) and ϑ is the angle between them. By the law of cosines,
‖u − v‖² = ‖u‖² + ‖v‖² − 2‖u‖ ‖v‖ cos ϑ
so that
‖u‖ ‖v‖ cos ϑ = (1/2)[‖u‖² + ‖v‖² − ‖u − v‖²]
= (1/2)[u1² + u2² + v1² + v2² − (u1 − v1)² − (u2 − v2)²]
= u1v1 + u2v2 = u·v
Example 7
Find the angle between the vectors u = (1, −1, 2) and v = (2, 1, 0).
Solution
u·v = (1)(2) + (−1)(1) + (2)(0) = 2 − 1 + 0 = 1
and
‖u‖ = √(1² + (−1)² + 2²) = √6, ‖v‖ = √(2² + 1² + 0²) = √5
cos θ = 1/(√6 √5) = 1/√30
θ = cos⁻¹(1/√30) ≈ 79.48°
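The same computation in a short NumPy sketch (added here for illustration; not part of the notes):

    import numpy as np

    u = np.array([1.0, -1.0, 2.0])
    v = np.array([2.0, 1.0, 0.0])

    cos_theta = (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))
    print(np.degrees(np.arccos(cos_theta)))   # approximately 79.48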
Exercises
Q.1 Compute u·v and v·u when u = [1; 5; 3] and v = [−3; 1; 5].
Q.2 Let v = (2, 1, 0, 3) in R⁴. Find a unit vector u in the direction opposite to that of v.
Q.3 Let W be the subspace of R³ spanned by x = (1/2, 3/2, 5/2). Find a unit vector z that is a basis for W.
Q.4 Compute the distance between the vectors u = (1, 5, 7) and v = (2, 3, 5).
Q.5 Find the angle between the vectors u = (2, 1, 3) and v = (1, 0, 2).
Lecture 18
Orthogonal and Orthonormal sets
Objectives
Orthogonal Set
A set S = {u1, u2, ..., up} of non-zero vectors in Rⁿ is said to be an orthogonal set if all vectors in S are mutually orthogonal, that is,
0 ∉ S and ui·uj = 0 for all i ≠ j, i, j = 1, 2, ..., p
Example
Show that S = {u1, u2, u3} is an orthogonal set, where u1 = [3; 1; 1], u2 = [−1; 2; 1] and u3 = [−1/2; −2; 7/2].
Solution
To show that S is orthogonal, we show that each vector in S is orthogonal to the others, that is,
ui·uj = 0 for all i ≠ j, i, j = 1, 2, 3
For i = 1, j = 2:
u1·u2 = [3; 1; 1]·[−1; 2; 1] = −3 + 2 + 1 = 0
which implies u1 is orthogonal to u2.
For i = 1, j = 3:
u1·u3 = [3; 1; 1]·[−1/2; −2; 7/2] = −3/2 − 2 + 7/2 = 0
Similarly, for i = 2, j = 3:
u2·u3 = [−1; 2; 1]·[−1/2; −2; 7/2] = 1/2 − 4 + 7/2 = 0
Hence S is an orthogonal set.
Theorem
Suppose S = {u1, u2, ..., up} is an orthogonal set of non-zero vectors in Rⁿ and W = Span{u1, u2, ..., up}. Then S is a linearly independent set and hence a basis for W.
Proof
Suppose
0 = c1u1 + c2u2 + ... + cpup
where c1, c2, ..., cp are scalars. Taking the dot product with u1,
0 = u1·0 = u1·(c1u1 + c2u2 + ... + cpup)
= c1(u1·u1) + c2(u1·u2) + ... + cp(u1·up)
= c1(u1·u1)
Since u1 ≠ 0, u1·u1 > 0, and therefore c1 = 0. The same argument with u2, ..., up gives c2 = ... = cp = 0, so S is linearly independent.
Example
Show that the orthogonal set S = {u1, u2}, where u1 = [3; 1] and u2 = [−1; 3], is linearly independent.
Solution
To show that S = {u1, u2} is a linearly independent set, we show that the vector equation
c1u1 + c2u2 = 0
has only the trivial solution c1 = c2 = 0. Now
c1[3; 1] + c2[−1; 3] = [0; 0]
gives
3c1 − c2 = 0
c1 + 3c2 = 0
Solving these simultaneously gives c1 = c2 = 0.
Therefore if S is an orthogonal set then it is linearly independent.
Orthogonal basis
Theorem
If S = {u1, u2, ..., up} is an orthogonal basis for a subspace W of Rⁿ, then each y in W can be uniquely expressed as a linear combination of u1, u2, ..., up. That is,
y = c1u1 + c2u2 + ... + cpup, where the weights are
cj = (y·uj)/(uj·uj)
Proof
Example
Express y = [6; 1; −8] as a linear combination of the orthogonal basis {u1, u2, u3} of the earlier example, where u1 = [3; 1; 1], u2 = [−1; 2; 1] and u3 = [−1/2; −2; 7/2].
Solution
We want to write
y = c1u1 + c2u2 + c3u3
where c1, c2 and c3 are to be determined. By the above theorem,
c1 = (y·u1)/(u1·u1) = ([6; 1; −8]·[3; 1; 1])/([3; 1; 1]·[3; 1; 1]) = 11/11 = 1
c2 = (y·u2)/(u2·u2) = ([6; 1; −8]·[−1; 2; 1])/([−1; 2; 1]·[−1; 2; 1]) = −12/6 = −2
c3 = (y·u3)/(u3·u3) = ([6; 1; −8]·[−1/2; −2; 7/2])/([−1/2; −2; 7/2]·[−1/2; −2; 7/2]) = −33/(33/2) = −2
Hence
y = u1 − 2u2 − 2u3
Example
Solution
We want to write y = c1u1 + c2u2 + c3u3, and proceeding as before, with y = [3; 7; 4] and u3 = [0; 0; 1],
c3 = (y·u3)/(u3·u3) = ([3; 7; 4]·[0; 0; 1])/([0; 0; 1]·[0; 0; 1]) = 4/1 = 4
Hence
y = −2u1 + 5u2 + 4u3
Exercise
We now decompose a non-zero vector y ∈ Rⁿ into the sum of two vectors, one a multiple of a given u ∈ Rⁿ and the other orthogonal to u. That is,
y = ŷ + z
where ŷ = αu for some scalar α and z is orthogonal to u.
(Figure: y, its projection ŷ = αu along u, and the component z = y − ŷ perpendicular to u.)
Since z = y − αu must be orthogonal to u, we need 0 = (y − αu)·u = y·u − α(u·u), that is, α = (y·u)/(u·u). Hence
ŷ = ((y·u)/(u·u))u, which is the orthogonal projection of y onto u
and
z = y − ŷ = y − ((y·u)/(u·u))u, which is the component of y orthogonal to u.
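A tiny NumPy sketch of this projection formula (our own illustration; the helper name project is assumed). It reproduces the numbers of the next example:

    import numpy as np

    def project(y, u):
        # Orthogonal projection of y onto the line spanned by u.
        return (y @ u) / (u @ u) * u

    y = np.array([7.0, 6.0])
    u = np.array([4.0, 2.0])
    y_hat = project(y, u)
    print(y_hat, y - y_hat, (y - y_hat) @ u)   # [8. 4.] [-1. 2.] 0.0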
Example
Let y = [7; 6] and u = [4; 2]. Find the orthogonal projection of y onto u. Then write y as a sum of two orthogonal vectors, one in Span{u} and one orthogonal to u.
Solution
Compute
y·u = [7; 6]·[4; 2] = 40
u·u = [4; 2]·[4; 2] = 20
The orthogonal projection of y onto u is
ŷ = ((y·u)/(u·u))u = (40/20)u = 2[4; 2] = [8; 4]
and the component of y orthogonal to u is
y − ŷ = [7; 6] − [8; 4] = [−1; 2]
The sum of these two vectors is y. That is,
[7; 6] = [8; 4] + [−1; 2]
(y)     (ŷ)     (y − ŷ)
(Figure: in the x1x2-plane, y, its orthogonal projection ŷ on the line L = Span{u}, and the component y − ŷ.)
Example
Find the distance from y = [7; 6] to the line L = Span{u}, where u = [4; 2].
Solution
The distance from y to L is the length of the perpendicular line segment from y to the orthogonal projection ŷ, that is, the length of y − ŷ. This distance is
‖y − ŷ‖ = √((−1)² + 2²) = √5
Example
Decompose y = (−3, −4) into two vectors ŷ and z, where ŷ is a multiple of u = (−3, 1) and z is orthogonal to u. Also show that ŷ·z = 0.
Solution
Here ŷ is the orthogonal projection of y onto u, calculated from the formula
ŷ = ((y·u)/(u·u))u = (([−3; −4]·[−3; 1])/([−3; 1]·[−3; 1]))[−3; 1] = ((9 − 4)/(9 + 1))[−3; 1] = (1/2)[−3; 1] = [−3/2; 1/2]
and
z = y − ŷ = [−3; −4] − [−3/2; 1/2] = [−3/2; −9/2]
So
ŷ = [−3/2; 1/2] and z = [−3/2; −9/2]
Now
ŷ·z = [−3/2; 1/2]·[−3/2; −9/2] = 9/4 − 9/4 = 0
Therefore ŷ is orthogonal to z.
Exercise
Find the orthogonal projection of the vector y = (−3, 2) onto u = (2, 1). Also show that y = ŷ + z, where ŷ is a multiple of u and z is orthogonal to u.
Orthonormal Set
A set S = {u1, u2, ..., up} of non-zero vectors in Rⁿ is said to be an orthonormal set if S is an orthogonal set of unit vectors.
Example
Show that S = {u1, u2, u3} is an orthonormal set, where
u1 = [2/√5; 0; −1/√5], u2 = [0; −1; 0], u3 = [1/√5; 0; 2/√5]
Solution
The vectors are mutually orthogonal: u1·u2 = 0, u2·u3 = 0, and u1·u3 = 2/5 + 0 − 2/5 = 0. Each is also a unit vector:
u1·u1 = 4/5 + 0 + 1/5 = 1
u2·u2 = 0 + 1 + 0 = 1
u3·u3 = 1/5 + 0 + 4/5 = 1
Hence S = {u1, u2, u3} is an orthonormal set.
Orthonormal basis
A basis S = {u1, u2, ..., up} for a subspace W of Rⁿ is an orthonormal basis if S is also an orthonormal set.
Example
Solution
Theorem
An m × n matrix U has orthonormal columns if and only if UᵀU = I.
Proof
Keep in mind that in an if and only if statement, one part depends on the other,
so, each part is proved separately. That is, we consider one part and then prove the other
part with the help of that assumed part.
Before proving both sides of the statements, we have to do some extra work which is
necessary for the better understanding.
Let u1, u2, ..., un be the columns of U, so that
U = [u1 u2 ... un]
Taking the transpose, the rows of Uᵀ are u1ᵀ, u2ᵀ, ..., unᵀ, and therefore
UᵀU = [u1ᵀu1 u1ᵀu2 ... u1ᵀun; u2ᵀu1 u2ᵀu2 ... u2ᵀun; ... ; unᵀu1 unᵀu2 ... unᵀun]
Since uᵀv = u·v, this is the matrix of all pairwise dot products:
UᵀU = [u1·u1 u1·u2 ... u1·un; u2·u1 u2·u2 ... u2·un; ... ; un·u1 un·u2 ... un·un]
The columns of U are orthonormal exactly when ui·uj = 0 for i ≠ j and ui·ui = 1 for each i, that is, exactly when every off-diagonal entry of UᵀU is 0 and every diagonal entry is 1:
UᵀU = I
Which is our required result.
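A one-line numerical check of this theorem (an added NumPy sketch, not from the notes), using the first matrix of the exercise below:

    import numpy as np

    U = (1 / np.sqrt(2)) * np.array([[1.0, 1.0],
                                     [1.0, -1.0]])
    print(np.allclose(U.T @ U, np.eye(2)))   # True: columns are orthonormal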
Exercise
Using the above theorem, prove that the following matrices have orthonormal columns.
(1) (1/√2)[1 1; 1 −1]
(2) (1/3)[2 −2 1; 1 2 2; 2 1 −2]
(3) [cos θ sin θ; −sin θ cos θ]
Solution (1)
Let
U = (1/√2)[1 1; 1 −1], so that Uᵀ = (1/√2)[1 1; 1 −1]
Then
UᵀU = (1/2)[1 1; 1 −1][1 1; 1 −1] = (1/2)[2 0; 0 2] = [1 0; 0 1] = I
Therefore, by the above theorem, U has orthonormal columns.
Theorem
Let U be an m × n matrix with orthonormal columns, and let x and y be in Rⁿ. Then
a) ‖Ux‖ = ‖x‖
b) (Ux)·(Uy) = x·y
c) (Ux)·(Uy) = 0 if and only if x·y = 0
Example
Let U = [1/√2 2/3; 1/√2 −2/3; 0 1/3] and x = [√2; 3]. Verify that ‖Ux‖ = ‖x‖.
Solution
Notice that U has orthonormal columns:
UᵀU = [1/√2 1/√2 0; 2/3 −2/3 1/3][1/√2 2/3; 1/√2 −2/3; 0 1/3] = [1 0; 0 1]
Now
Ux = [1/√2 2/3; 1/√2 −2/3; 0 1/3][√2; 3] = [3; −1; 1]
‖Ux‖ = √(9 + 1 + 1) = √11
‖x‖ = √(2 + 9) = √11
Lecture 19
Orthogonal Decomposition
Objectives
The objectives of the lecture are to learn about:
• Orthogonal Decomposition Theorem.
• Best Approximation Theorem.
Orthogonal Projection
The orthogonal projection of a point in R² onto a line through the origin has an important analogue in Rⁿ: given a vector y and a subspace W in Rⁿ, there is a vector ŷ in W such that ŷ is the unique vector in W for which y − ŷ is orthogonal to W, and ŷ is the unique vector in W closest to y.
Example 1
Let {u1 , u2 ,..., u5 } be an orthogonal basis for R 5 and let y= c1u1 + c2 u2 + ... + c5 u5 .
Consider the subspace W= Span{u 1 , u 2 } and write y as the sum of a vector z1 in W and a
vector z2 in W ⊥ .
Solution
Write
y = (c1u1 + c2u2) + (c3u3 + c4u4 + c5u5) = z1 + z2
with z1 = c1u1 + c2u2 in W = Span{u1, u2}, and z2 = c3u3 + c4u4 + c5u5 in W⊥, since each of u3, u4, u5 is orthogonal to both u1 and u2.
(Figure: y, its projection ŷ = proj_W y in W, and z = y − ŷ orthogonal to W.)
Since {u1, u2, ..., up} is a basis for W and ŷ = c1u1 + ... + cpup is a linear combination of these basis vectors, ŷ ∈ W by definition of basis.
Now we show that z = y − ŷ ∈ W⊥. For this it is sufficient to show that z ⊥ uj for each j = 1, 2, ..., p. Let u1 be one of the basis vectors; then
z·u1 = (y − ŷ)·u1 = y·u1 − ŷ·u1
= y·u1 − (c1u1 + c2u2 + ... + cpup)·u1
= y·u1 − c1(u1·u1) − c2(u2·u1) − ... − cp(up·u1)
= y·u1 − c1(u1·u1)        (since uj·u1 = 0 for j = 2, 3, ..., p)
= y·u1 − ((y·u1)/(u1·u1))(u1·u1)
= y·u1 − y·u1 = 0
Therefore z ⊥ u1, and since the same argument applies to each uj, z ⊥ uj for j = 1, 2, ..., p. Hence, by definition of W⊥, z ∈ W⊥.
Now we show that the representation y = ŷ + z is unique, by contradiction. Let y = ŷ + z and y = ŷ1 + z1, where ŷ, ŷ1 ∈ W and z, z1 ∈ W⊥, and suppose ŷ ≠ ŷ1 (so z ≠ z1). Since the two representations of y are equal,
ŷ + z = ŷ1 + z1  ⇒  ŷ − ŷ1 = z1 − z
Let s = ŷ − ŷ1; then also s = z1 − z. Since W is a subspace, s = ŷ − ŷ1 ∈ W by closure; furthermore, W⊥ is also a subspace, so s = z1 − z ∈ W⊥ by closure. Since s ∈ W and s ∈ W⊥, s ⊥ s, that is, s·s = 0. Therefore
s = ŷ − ŷ1 = 0  ⇒  ŷ = ŷ1
and likewise z1 = z, contradicting our supposition. This shows that the representation is unique.
Example
Let u1 = [2; 5; −1], u2 = [−2; 1; 1], and y = [1; 2; 3]. Observe that {u1, u2} is an orthogonal basis for W = Span{u1, u2}. Write y as the sum of a vector in W and a vector orthogonal to W.
Solution
The orthogonal projection of y onto W is
ŷ = ((y·u1)/(u1·u1))u1 + ((y·u2)/(u2·u2))u2 = (9/30)u1 + (3/6)u2 = [−2/5; 2; 1/5]
so
y = [1; 2; 3] = [−2/5; 2; 1/5] + [7/5; 0; 14/5]
Example
Let W = Span{u1, u2}, where u1 = [1; −3; 2] and u2 = [4; 2; 1]. Decompose y = [2; −2; 5] into two vectors, one in W and one in W⊥. Also verify that these two vectors are orthogonal.
Solution
Let ŷ ∈ W and z = y − ŷ ∈ W⊥. Since ŷ ∈ W, it can be written
ŷ = c1u1 + c2u2 = ((y·u1)/(u1·u1))u1 + ((y·u2)/(u2·u2))u2
= (18/14)[1; −3; 2] + (9/21)[4; 2; 1] = (9/7)[1; −3; 2] + (3/7)[4; 2; 1] = [3; −3; 3]
Now
z = y − ŷ = [2; −2; 5] − [3; −3; 3] = [−1; 1; 2]
We show that z ⊥ ŷ, i.e. z·ŷ = 0:
z·ŷ = [−1; 1; 2]·[3; −3; 3] = −3 − 3 + 6 = 0
Therefore z ⊥ ŷ.
Exercise
Let W = Span{u1, u2}, where u1 = [1; 0; −3] and u2 = [3; 1; 1]. Write y = [6; −8; 12] as a sum of two vectors, one in W and one in W⊥. Also verify that these two vectors are orthogonal.
Example
Let W = Span{u1, u2}, where u1 = [1; −3; 2] and u2 = [4; 2; 1], and let y = [2; −2; 5]. Then, using the above theorem, find the distance from y to W.
Solution
Using the above theorem, the distance from y to W is
‖y − proj_W y‖ = ‖y − ŷ‖
Since we have already calculated
y − ŷ = [2; −2; 5] − [3; −3; 3] = [−1; 1; 2]
we get ‖y − ŷ‖ = √6.
Example
The distance from a point y in R n to a subspace W is defined as the distance from y to the
nearest point in W.
Find the distance from y to W =span {u1 , u2 } , where
Theorem
Example
Let u1 = [−7; 1; 4], u2 = [−1; 1; −2], y = [−9; 1; 6], and W = Span{u1, u2}. Use the fact that u1 and u2 are orthogonal to compute proj_W y.
Solution
proj_W y = ((y·u1)/(u1·u1))u1 + ((y·u2)/(u2·u2))u2 = (88/66)u1 + (−2/6)u2
= (4/3)[−7; 1; 4] − (1/3)[−1; 1; −2] = [−9; 1; 6] = y
In this case, y happens to lie in W = Span{u1, u2}.
Lecture 20
The Gram-Schmidt Process
Example 1
Let W = Span{x1, x2}, where x1 = [3; 6; 0] and x2 = [1; 2; 2]. Construct an orthogonal basis {v1, v2} for W.
Solution
Set v1 = x1. Then
v2 = x2 − proj_{v1} x2 = x2 − ((x2·v1)/(v1·v1))v1 = [1; 2; 2] − (15/45)[3; 6; 0] = [0; 0; 2]
Example 2
Let W = Span{x1, x2}, where x1 = [0; 4; 2] and x2 = [5; 6; −7]. Construct an orthogonal basis {v1, v2} for W.
Solution
Set v1 = x1. Then
v2 = x2 − ((x2·v1)/(v1·v1))v1 = [5; 6; −7] − (10/20)[0; 4; 2] = [5; 6; −7] − (1/2)[0; 4; 2] = [5; 4; −8]
Thus, an orthogonal basis for W is {[0; 4; 2], [5; 4; −8]}.
Theorem (The Gram-Schmidt Process)
Given a basis {x1, ..., xp} for a subspace W of Rⁿ, define
v1 = x1
v2 = x2 − ((x2·v1)/(v1·v1))v1
v3 = x3 − ((x3·v1)/(v1·v1))v1 − ((x3·v2)/(v2·v2))v2
...
vp = xp − ((xp·v1)/(v1·v1))v1 − ((xp·v2)/(v2·v2))v2 − ... − ((xp·v_{p−1})/(v_{p−1}·v_{p−1}))v_{p−1}
Then {v1, ..., vp} is an orthogonal basis for W. In addition,
Span{v1, ..., vk} = Span{x1, ..., xk} for 1 ≤ k ≤ p
Example 3
Construct an orthogonal basis for W = Span{x1, x2, x3}, where
x1 = [1; 1; 1; 1], x2 = [0; 1; 1; 1], x3 = [0; 0; 1; 1]
Solution
Step 1: Let v1 = x1.
Step 2: Let
v2 = x2 − ((x2·v1)/(v1·v1))v1 = [0; 1; 1; 1] − (3/4)[1; 1; 1; 1] = [−3/4; 1/4; 1/4; 1/4]
To simplify later computations, scale v2 by 4: v2 = [−3; 1; 1; 1].
Step 3:
v3 = x3 − ((x3·v1)/(v1·v1))v1 − ((x3·v2)/(v2·v2))v2
= [0; 0; 1; 1] − (2/4)[1; 1; 1; 1] − (2/12)[−3; 1; 1; 1]
= [0; −2/3; 1/3; 1/3]
Thus {v1, v2, v3} is an orthogonal basis for W.
Example 4
Find an orthogonal basis for the column space of the following matrix by the Gram-Schmidt process:
[−1 6 6; 3 −8 3; 1 −2 6; 1 −4 −3]
Solution
Name the columns of the matrix x1, x2, x3 and perform the Gram-Schmidt process on these vectors:
x1 = [−1; 3; 1; 1], x2 = [6; −8; −2; −4], x3 = [6; 3; 6; −3]
Set v1 = x1. Then
v2 = x2 − ((x2·v1)/(v1·v1))v1 = x2 − (−3)v1 = [6; −8; −2; −4] + 3[−1; 3; 1; 1] = [3; 1; 1; −1]
v3 = x3 − ((x3·v1)/(v1·v1))v1 − ((x3·v2)/(v2·v2))v2 = x3 − (1/2)v1 − (5/2)v2 = [−1; −1; 3; −1]
Thus, an orthogonal basis is {[−1; 3; 1; 1], [3; 1; 1; −1], [−1; −1; 3; −1]}.
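The process is easy to code. A compact sketch (our own NumPy helper, named gram_schmidt purely for illustration) that reproduces Example 4:

    import numpy as np

    def gram_schmidt(X):
        # Turn the columns of X into orthogonal columns spanning the same space.
        V = []
        for x in X.T:
            v = x.copy()
            for u in V:
                v = v - (x @ u) / (u @ u) * u   # subtract the projection onto u
            V.append(v)
        return np.column_stack(V)

    X = np.array([[-1.0, 6.0, 6.0],
                  [3.0, -8.0, 3.0],
                  [1.0, -2.0, 6.0],
                  [1.0, -4.0, -3.0]])
    print(gram_schmidt(X))
    # columns: (-1,3,1,1), (3,1,1,-1), (-1,-1,3,-1), as in Example 4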
Example 5
Find an orthonormal basis of the subspace spanned by the following vectors:
x1 = [3; 6; 0], x2 = [0; 0; 2]
Solution
Since x1·x2 = 0, the vectors are already orthogonal, so v1 = [3; 6; 0] and v2 = [0; 0; 2]. Normalizing gives the orthonormal basis
u1 = (1/‖v1‖)v1 = (1/√45)[3; 6; 0] = [1/√5; 2/√5; 0]
u2 = (1/‖v2‖)v2 = (1/2)[0; 0; 2] = [0; 0; 1]
Example 6
Find an orthonormal basis of the subspace spanned by the following vectors:
x1 = [2; −5; 1], x2 = [4; −1; 2]
Solution
Set v1 = x1. Then
v2 = x2 − ((x2·v1)/(v1·v1))v1 = [4; −1; 2] − (15/30)[2; −5; 1] = [4; −1; 2] − [1; −5/2; 1/2] = [3; 3/2; 3/2]
or, scaling by 2, v2 = [6; 3; 3]. Now ‖v1‖ = √30 and ‖v2‖ = √54 = 3√6, so an orthonormal basis is
u1 = (1/√30)[2; −5; 1], u2 = (1/(3√6))[6; 3; 3] = (1/√6)[2; 1; 1]
Theorem (QR factorization)
If A is an m × n matrix with linearly independent columns, then A can be factored as A = QR, where Q is an m × n matrix whose columns form an orthonormal basis for Col A and R is an n × n upper triangular invertible matrix with positive entries on its diagonal.
Example 7
Find a QR factorization of the matrix A = [1 2 5; −1 1 −4; −1 4 −3; 1 −4 7; 1 2 1].
Solution
Firstly find an orthonormal basis by applying the Gram-Schmidt process to the columns of A. We get the matrix
Q = [1/√5 1/2 1/2; −1/√5 0 0; −1/√5 1/2 1/2; 1/√5 −1/2 1/2; 1/√5 1/2 −1/2]
Now
R = QᵀA = [√5 −√5 4√5; 0 6 −2; 0 0 4]
Verify that A = QR.
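NumPy computes the same factorization (a sketch we add here; note that np.linalg.qr may flip the signs of some columns of Q and the corresponding rows of R):

    import numpy as np

    A = np.array([[1.0, 2.0, 5.0],
                  [-1.0, 1.0, -4.0],
                  [-1.0, 4.0, -3.0],
                  [1.0, -4.0, 7.0],
                  [1.0, 2.0, 1.0]])

    Q, R = np.linalg.qr(A)           # reduced QR: Q is 5x3, R is 3x3
    print(np.allclose(A, Q @ R))     # True
    print(R)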
Theorem
Let W be a subspace of Rⁿ with orthogonal basis {u1, ..., up}, and let y be in Rⁿ. Then
ŷ = ((y·u1)/(u1·u1))u1 + ... + ((y·up)/(up·up))up
and z = y − ŷ is orthogonal to W. The vector ŷ is called the orthogonal projection of y onto W and is often written as proj_W y; it is the closest point of W to y, in the sense that
‖y − ŷ‖ < ‖y − v‖ for every v in W with v ≠ ŷ
Exercise 1
Apply the Gram-Schmidt process to the vectors
x1 = [1; 1; 1], x2 = [1/3; 1/3; −2/3]
Exercise 2
Find an orthogonal basis for the column space of the following matrix by the Gram-Schmidt process:
[3 −5 1; 1 1 1; −1 −5 −2; 3 −7 8]
Exercise 3
Find a QR factorization of
A = [1 3 5; −1 −3 1; 0 2 3; 1 5 2; 1 5 8]
Lecture 21
Least Square Solution
Best Approximation Theorem
Least-squares solution
The most important aspect of the least-squares problem is that no matter what “x” we
select, the vector Ax will necessarily be in the column space Col A. So we seek an x that
makes Ax the closest point in Col A to b. Of course, if b happens to be in Col A, then b is
Ax for some x and such an x is a “least-squares solution.”
Given A and b as above, apply the Best Approximation Theorem stated above to the subspace Col A, and let b̂ = proj_{Col A} b.
Since b̂ is in the column space of A, the equation Ax = bˆ is consistent, and there is an x̂
in Rn such that
Axˆ = bˆ (1)
Since b̂ is the closest point in Col A to b, a vector x̂ is a least-squares solution of Ax = b
if and only if x̂ satisfies Axˆ = bˆ . Such an x̂ in Rn is a list of weights that will build b̂ out
of the columns of A.
The set of least-squares solutions of Ax = b is nonempty, and any such x̂ satisfies the normal equations AᵀAx̂ = Aᵀb. Conversely, suppose that x̂ satisfies AᵀAx̂ = Aᵀb. Then b − Ax̂ is orthogonal to the rows of Aᵀ, and hence is orthogonal to the columns of A. Since the columns of A span Col A, the vector b − Ax̂ is orthogonal to all of Col A. Hence the equation b = Ax̂ + (b − Ax̂) is a decomposition of b into the sum of a vector in Col A and a vector orthogonal to Col A. By the uniqueness of the orthogonal decomposition, Ax̂ must be the orthogonal projection of b onto Col A. That is, Ax̂ = b̂, and x̂ is a least-squares solution.
Definition
A least-squares solution of Ax = b is an x̂ in Rⁿ such that
‖b − Ax̂‖ ≤ ‖b − Ax‖ for all x ∈ Rⁿ
Theorem
The set of least-squares solutions of Ax = b coincides with the nonempty set of solutions
of the normal equations
AT Axˆ = AT b
Example 1
Find the least-squares solution of Ax = b, and its error, for
A = [4 0; 0 2; 1 1], b = [2; 0; 11]
Solution
Firstly we find
AᵀA = [4 0 1; 0 2 1][4 0; 0 2; 1 1] = [17 1; 1 5]
and
Aᵀb = [4 0 1; 0 2 1][2; 0; 11] = [19; 11]
Then the equation AᵀAx̂ = Aᵀb becomes
[17 1; 1 5][x1; x2] = [19; 11]
Row operations can be used to solve this system, but since AᵀA is invertible and 2 × 2, it is probably faster to compute
(AᵀA)⁻¹ = (1/84)[5 −1; −1 17]
Therefore,
x̂ = (AᵀA)⁻¹Aᵀb = (1/84)[5 −1; −1 17][19; 11] = (1/84)[84; 168] = [1; 2]
Now, with A and b as above,
Ax̂ = [4 0; 0 2; 1 1][1; 2] = [4; 4; 3]
Hence
b − Ax̂ = [2; 0; 11] − [4; 4; 3] = [−2; −4; 8]
and
‖b − Ax̂‖ = √((−2)² + (−4)² + 8²) = √84
The least-squares error is √84. For any x in R², the distance between b and the vector Ax is at least √84.
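In code, the normal equations take one line (an added NumPy sketch, not from the notes); in practice np.linalg.lstsq is the more numerically robust route:

    import numpy as np

    A = np.array([[4.0, 0.0],
                  [0.0, 2.0],
                  [1.0, 1.0]])
    b = np.array([2.0, 0.0, 11.0])

    x_hat = np.linalg.solve(A.T @ A, A.T @ b)   # normal equations
    print(x_hat)                                # [1. 2.]
    print(np.linalg.norm(b - A @ x_hat)**2)     # 84.0, i.e. error sqrt(84)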
Example 2
Find the general least-squares solution of Ax = b, in terms of a free variable, with
A = [1 1 0 0; 1 1 0 0; 1 0 1 0; 1 0 1 0; 1 0 0 1; 1 0 0 1], b = [−3; −1; 0; 2; 5; 1]
Solution
Firstly we find
AᵀA = [6 2 2 2; 2 2 0 0; 2 0 2 0; 2 0 0 2]
and
Aᵀb = [4; −4; 2; 6]
The augmented matrix for AᵀAx̂ = Aᵀb row reduces as
[6 2 2 2 4; 2 2 0 0 −4; 2 0 2 0 2; 2 0 0 2 6] ~ [1 0 0 1 3; 0 1 0 −1 −5; 0 0 1 −1 −2; 0 0 0 0 0]
Hence x1 = 3 − x4, x2 = −5 + x4, x3 = −2 + x4, with x4 free, and the general least-squares solution is
x̂ = [3; −5; −2; 0] + x4[−1; 1; 1; 1]
Theorem
The matrix AT A is invertible iff the columns of A are linearly independent. In this case,
the equation Ax = b has only one least-squares solution x̂ , and it is given by
xˆ = ( AT A) −1 AT b
Example 3
Find the least-squares solution of Ax = b for
A = [2 4 6; 1 −3 0; 7 1 4; 1 0 5], b = [0; 1; −2; 4]
Solution
As
Aᵀ = [2 1 7 1; 4 −3 1 0; 6 0 4 5]
we compute
AᵀA = [55 12 45; 12 26 28; 45 28 77]
and
Aᵀb = [−9; −5; 12]
so the normal equations AᵀAx̂ = Aᵀb are
[55 12 45; 12 26 28; 45 28 77][x1; x2; x3] = [−9; −5; 12]
whose solution is
[x1; x2; x3] ≈ [−0.676; −0.776; 0.834]
Example 4
Compute the least-squares error for the solution of Example 3.
Solution
Ax̂ = [2 4 6; 1 −3 0; 7 1 4; 1 0 5][−0.676; −0.776; 0.834] = [0.548; 1.652; −2.172; 3.494]
The error ε = b − Ax̂ is as small as possible; in other words, it is smaller than for any other choice of x:
ε = b − Ax̂ = [0; 1; −2; 4] − [0.548; 1.652; −2.172; 3.494] = [−0.548; −0.652; 0.172; 0.506]
‖ε‖ = √(ε1² + ε2² + ε3² + ε4²) = √((−0.548)² + (−0.652)² + (0.172)² + (0.506)²)
= √(0.3003 + 0.4251 + 0.0296 + 0.2560) = √1.011 ≈ 1.005
Theorem
If A = QR is a QR factorization of an m × n matrix A with linearly independent columns, then for each b in Rᵐ the equation Ax = b has a unique least-squares solution, given by Rx̂ = Qᵀb, that is, x̂ = R⁻¹Qᵀb.
Example 1
Find the least-squares solution for A = [1 3 5; 1 1 0; 1 1 2; 1 3 3], b = [3; 5; 7; −3].
Solution
First we find the QR factorization of A. Applying the Gram-Schmidt process to the columns of A gives the orthonormal matrix
Q = [1/2 1/2 1/2; 1/2 −1/2 −1/2; 1/2 −1/2 1/2; 1/2 1/2 −1/2]
and
R = QᵀA = [1/2 1/2 1/2 1/2; 1/2 −1/2 −1/2 1/2; 1/2 −1/2 1/2 −1/2][1 3 5; 1 1 0; 1 1 2; 1 3 3] = [2 4 5; 0 2 3; 0 0 2]
Then
Qᵀb = [1/2 1/2 1/2 1/2; 1/2 −1/2 −1/2 1/2; 1/2 −1/2 1/2 −1/2][3; 5; 7; −3] = [6; −6; 4]
The least-squares solution x̂ satisfies Rx̂ = Qᵀb, that is,
[2 4 5; 0 2 3; 0 0 2][x1; x2; x3] = [6; −6; 4]
Back-substitution easily yields x̂ = [10; −6; 2].
Example 2
Find the least-squares solution of Ax = b for
A = [3 1; 6 2; 0 2], b = [1; 2; 3]
Solution
First we find the QR factorization of A. Applying the Gram-Schmidt process to the columns of A, with v1 = x1 = [3; 6; 0],
v2 = x2 − ((x2·v1)/(v1·v1))v1 = [1; 2; 2] − (15/45)[3; 6; 0] = [0; 0; 2]
Normalizing gives
Q = [1/√5 0; 2/√5 0; 0 1], Qᵀ = [1/√5 2/√5 0; 0 0 1]
Now
R = QᵀA = [15/√5 5/√5; 0 2] = [3√5 √5; 0 2]
and
Qᵀb = [5/√5; 3] = [√5; 3]
Solving Rx̂ = Qᵀb,
[3√5 √5; 0 2][x1; x2] = [√5; 3]
gives x2 = 3/2 and 3x1 + x2 = 1, so x̂ = [−1/6; 3/2].
Exercise 1
Find a least-squares solution of Ax = b for
A = [1 −6; 1 −2; 1 1; 1 7], b = [−1; 2; 1; 6]
Exercise 2
Find a least-squares solution of Ax = b for
A = [1 3 5; 1 1 0; 1 1 2; 1 3 3], b = [3; 5; 7; −3]
Exercise 3
Find a least-squares solution of Ax = b for
A = [2 1; −2 0; 2 3], b = [−5; 8; 1]
Lecture 22
Eigen Value Problems
Let [A] be an n × n square matrix. Suppose there exist a scalar λ and a vector X = (x1, x2, ..., xn)ᵀ such that
[A](X) = λ(X)
(Compare the analogous operator equations (d/dx)(e^{ax}) = a(e^{ax}) and (d²/dx²)(sin ax) = −a²(sin ax).)
Then λ is an eigenvalue and X is the corresponding eigenvector of the matrix [A]. We can also write this as
[A − λI](X) = (O)
This represents a set of n homogeneous equations possessing non-trivial solution,
provided
A − λI = 0
This determinant, on expansion, gives an n-th degree polynomial which is called
characteristic polynomial of [A], which has n roots. Corresponding to each root, we can
solve these equations in principle, and determine a vector called eigenvector.
Finding the roots of the characteristic equation is laborious. Hence, we look for better
methods suitable from the point of view of computation. Depending upon the type of
matrix [A] and on what one is looking for, various numerical methods are available.
Note!
We shall consider only real and real-symmetric matrices and discuss power and Jacobi’s
methods
Power Method
To compute the largest eigen value and the corresponding eigenvector of the system
[ A]( X ) = λ ( X )
where [A] is a real, symmetric or un-symmetric matrix, the power method is widely used
in practice.
Procedure
Step 1: Choose an initial vector such that its largest element is unity.
Step 2: Pre-multiply this vector by the matrix [A].
Step 3: Normalize the resulting vector by dividing it by its numerically largest element.
Step 4: This process of iteration is continued, with the new normalized vector repeatedly pre-multiplied by the matrix [A], until the required accuracy is obtained. At this point, the result looks like
u^(k) = [A]v^(k−1) = q_k v^(k)
Here, q_k is the desired largest eigenvalue and v^(k) is the corresponding eigenvector.
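A direct NumPy sketch of this procedure (the helper name power_method is our own); it reproduces the example that follows:

    import numpy as np

    def power_method(A, v0, iterations=50):
        # Largest-modulus eigenvalue by repeated multiplication and normalization.
        v = np.array(v0, dtype=float)
        q = 0.0
        for _ in range(iterations):
            u = A @ v
            q = u[np.argmax(np.abs(u))]   # numerically largest element
            v = u / q                     # normalize so that element becomes 1
        return q, v

    A = np.array([[2.0, 3.0, 2.0],
                  [4.0, 3.0, 5.0],
                  [3.0, 2.0, 9.0]])
    q, v = power_method(A, [1.0, 1.0, 1.0])
    print(q, v)   # approximately 11.84 and (0.44, 0.76, 1.00)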
Example
Find the eigenvalue of largest modulus, and the associated eigenvector, of the matrix
[A] = [2 3 2; 4 3 5; 3 2 9]
by the power method.
Solution
We choose an initial vector v^(0) = (1, 1, 1)ᵀ. Then we compute the first iteration:
u^(1) = [A]v^(0) = [2 3 2; 4 3 5; 3 2 9][1; 1; 1] = [7; 12; 14]
Now we normalize the resultant vector (dividing by its largest element, 14) to get
u^(1) = 14[1/2; 6/7; 1] = q1 v^(1)
The second iteration gives
u^(2) = [A]v^(1) = [2 3 2; 4 3 5; 3 2 9][1/2; 6/7; 1] = [39/7; 67/7; 171/14]
= 12.2143[0.456140; 0.783626; 1.0] = q2 v^(2)
Continuing this procedure, the third and subsequent iterations are given below.
u^(3) = [A]v^(2) = [2 3 2; 4 3 5; 3 2 9][0.456140; 0.783626; 1.0] = [5.263158; 9.175438; 11.935672]
= 11.935672[0.440962; 0.768748; 1.0] = q3 v^(3)
u^(4) = [A]v^(3) = [5.18814; 9.07006; 11.86036] = 11.86036[0.437435; 0.764737; 1.0] = q4 v^(4)
u^(5) = [A]v^(4) = [5.16908; 9.04395; 11.84178] = 11.84178[0.436512; 0.763732; 1.0] = q5 v^(5)
After rounding off, the largest eigenvalue and the corresponding eigenvector, accurate to two decimals, are
λ = 11.84, X = (0.44, 0.76, 1.00)ᵀ
Example
Find the first three iterations of the power method for the matrix
[7 6 −3; −12 −20 24; −6 −12 16]
Solution
We choose the initial vector v^(0) = (1, 1, 1)ᵀ.
First iteration:
u^(1) = [A]v^(0) = [7 + 6 − 3; −12 − 20 + 24; −6 − 12 + 16] = [10; −8; −2]
Normalizing by the largest element,
u^(1) = 10[1; −0.8; −0.2] = q1 v^(1)
Second iteration:
u^(2) = [A]v^(1) = [7 − 4.8 + 0.6; −12 + 16 − 4.8; −6 + 9.6 − 3.2] = [2.8; −0.8; 0.4]
u^(2) = 2.8[1; −0.2857; 0.1428] = q2 v^(2)
Third iteration:
u^(3) = [A]v^(2) = [7 − 1.7142 − 0.4284; −12 + 5.714 + 3.4272; −6 + 3.4284 + 2.2848] = [4.8574; −2.8588; −0.2868]
Normalizing,
u^(3) = 4.8574[1; −0.5885; −0.0590] = q3 v^(3)
Example
Find the first three iterations of the power method applied to the matrix
[1 −1 0; −2 4 −2; 0 −1 2], using x^(0) = (−1, 2, 1)ᵀ
Solution
First iteration:
u^(1) = [A]x^(0) = [1 −1 0; −2 4 −2; 0 −1 2][−1; 2; 1] = [−1 − 2 + 0; 2 + 8 − 2; 0 − 2 + 2] = [−3; 8; 0]
Normalizing the resultant vector, we get
u^(1) = 8[−3/8; 1; 0] = q1 x^(1)
Second iteration:
u^(2) = [A]x^(1) = [−3/8 − 1 + 0; 6/8 + 4 + 0; 0 − 1 + 0] = [−1.375; 4.75; −1]
u^(2) = 4.75[−0.28947; 1; −0.21052] = q2 x^(2)
Third iteration:
u^(3) = [A]x^(2) = [−1.28947; 4.99998; −1.42104] = 4.99998[−0.25789; 1; −0.28420] = q3 x^(3)
Exercise
Find the largest eigenvalue and the corresponding eigenvector by the power method after the fourth iteration, starting with the initial vector v^(0) = (0, 0, 1)ᵀ:
[A] = [1 −3 2; 4 4 −1; 6 3 5]
Let λ1, λ2, ..., λn be the distinct eigenvalues of an n × n matrix [A], such that |λ1| > |λ2| > ... > |λn|, and suppose v1, v2, ..., vn are the corresponding eigenvectors. The power method is applicable if the eigenvalues are real and distinct, and hence the corresponding eigenvectors are linearly independent. Then any vector v in the space spanned by the eigenvectors can be written as the linear combination
v = c1v1 + c2v2 + ... + cnvn
Since Av1 = λ1v1, Av2 = λ2v2, ..., Avn = λnvn, we get
Av = λ1[c1v1 + c2(λ2/λ1)v2 + ... + cn(λn/λ1)vn]
Again pre-multiplying by A and simplifying, we obtain
A²v = λ1²[c1v1 + c2(λ2/λ1)²v2 + ... + cn(λn/λ1)²vn]
Similarly, we have
A^r v = λ1^r [c1v1 + c2(λ2/λ1)^r v2 + ... + cn(λn/λ1)^r vn]
and
A^(r+1) v = λ1^(r+1) [c1v1 + c2(λ2/λ1)^(r+1) v2 + ... + cn(λn/λ1)^(r+1) vn]
Now, since the ratios (λk/λ1)^r tend to zero as r grows, the eigenvalue λ1 is obtained as the limit of the ratio of corresponding components:
λ1 = lim_{r→∞} (A^(r+1) v)_p / (A^r v)_p,  p = 1, 2, ..., n
Here, the index p stands for the p-th component of the corresponding vector.
Sometimes, we may be interested in finding the least eigen value and the corresponding
eigenvector.
In that case, we proceed as follows.
We note that [A](X) = λ(X). Pre-multiplying by [A⁻¹], we get
[A⁻¹][A](X) = [A⁻¹]λ(X) = λ[A⁻¹](X)
which can be rewritten as
[A⁻¹](X) = (1/λ)(X)
which shows that the inverse matrix has a set of eigenvalues which are the reciprocals of the eigenvalues of [A]. Thus, for finding the eigenvalue of least magnitude of the matrix [A], we apply the power method to the inverse of [A].
Lecture 23
Jacobi’s Method
Definition
An n × n matrix [A] is said to be orthogonal if
[A]ᵀ[A] = [I], i.e. [A]ᵀ = [A]⁻¹
In order to compute all the eigenvalues and the corresponding eigenvectors of a real symmetric matrix, Jacobi's method is highly recommended. It is based on an important property from matrix theory: if [A] is an n × n real symmetric matrix, its eigenvalues are real, and there exists an orthogonal matrix [S] such that
D = [S⁻¹][A][S]
is a diagonal matrix. Jacobi's method constructs [S] as a product of plane-rotation matrices S1, S2, ..., each chosen to annihilate one pair of off-diagonal elements.
In S1, the entries at the pivotal positions are
sii = cos θ, sjj = cos θ, sij = −sin θ, sji = sin θ
while each of the remaining off-diagonal elements is zero and the remaining diagonal elements are unity. Thus, we construct S1 as a matrix identical with the unit matrix except that cos θ, −sin θ, sin θ and cos θ are inserted in the (i, i), (i, j), (j, i) and (j, j) positions respectively (the i-th and j-th rows and columns).
Now, we compute
D1 = S1⁻¹AS1 = S1ᵀAS1
since S1 is an orthogonal matrix. After the transformation, the elements at the positions (i, j) and (j, i) are annihilated, that is, dij and dji reduce to zero, which is seen as follows:
[dii dij; dji djj] = [cos θ sin θ; −sin θ cos θ][aii aij; aij ajj][cos θ −sin θ; sin θ cos θ]
= [aii cos²θ + 2aij sin θ cos θ + ajj sin²θ    (ajj − aii) sin θ cos θ + aij cos 2θ;
   (ajj − aii) sin θ cos θ + aij cos 2θ    aii sin²θ − 2aij sin θ cos θ + ajj cos²θ]
Therefore, dij = 0 only if
aij cos 2θ + ((ajj − aii)/2) sin 2θ = 0
that is, if
tan 2θ = 2aij/(aii − ajj)
Thus, we choose θ so that the above equation is satisfied; thereby the pair of off-diagonal elements dij and dji reduces to zero. However, though each rotation creates a new pair of zeros, it also introduces non-zero contributions at formerly zero positions. Also, the above equation gives four values of θ; to get the least possible rotation, we choose θ with |θ| ≤ π/4.
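One rotation of this kind is easy to code. The sketch below (our own illustration; jacobi_step is an assumed helper, and ties for the largest element are broken arbitrarily) applies repeated rotations to a symmetric matrix:

    import numpy as np

    def jacobi_step(A):
        # One Jacobi rotation: annihilate the largest off-diagonal pair of symmetric A.
        n = A.shape[0]
        i, j = max(((p, q) for p in range(n) for q in range(p + 1, n)),
                   key=lambda pq: abs(A[pq[0], pq[1]]))
        if A[i, i] == A[j, j]:
            theta = np.pi / 4                 # tan(2*theta) = infinity
        else:
            theta = 0.5 * np.arctan(2 * A[i, j] / (A[i, i] - A[j, j]))
        S = np.eye(n)
        S[i, i] = S[j, j] = np.cos(theta)
        S[i, j], S[j, i] = -np.sin(theta), np.sin(theta)
        return S.T @ A @ S

    A = np.array([[2.0, -1.0, 0.0],
                  [-1.0, 2.0, -1.0],
                  [0.0, -1.0, 2.0]])
    D = A
    for _ in range(10):
        D = jacobi_step(D)
    print(np.round(np.diag(D), 3))   # near 2 - sqrt(2), 2, 2 + sqrt(2), in some order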
Lecture 24
Example
Find all the eigenvalues and the corresponding eigenvectors of the matrix
    [  1    √2    2  ]
A = [ √2     3   √2  ]
    [  2    √2    1  ]
by Jacobi's method.
Solution
The given matrix is real and symmetric. The largest off-diagonal element is found to be a13 = a31 = 2.
Now, we compute
tan 2θ = 2a13 / (a11 − a33) = 4/0 = ∞
This gives θ = π/4. Thus, we construct an orthogonal matrix S1 as
     [ cos π/4   0   −sin π/4 ]   [ 1/√2   0   −1/√2 ]
S1 = [    0      1       0    ] = [  0     1     0   ]
     [ sin π/4   0    cos π/4 ]   [ 1/√2   0    1/√2 ]
The first rotation gives
                 [ 3   2    0 ]
D1 = S1⁻¹AS1 =   [ 2   3    0 ]
                 [ 0   0   −1 ]
The largest off-diagonal element of D1 is now d12 = d21 = 2, and tan 2θ = 2d12/(d11 − d22) = 4/0 = ∞ again gives θ = π/4, so the second rotation matrix is
     [ 1/√2  −1/√2   0 ]
S2 = [ 1/√2   1/√2   0 ]
     [  0      0     1 ]
At the end of the second rotation, we get
D2 = S2⁻¹ D1 S2
   = [  1/√2   1/√2   0 ] [ 3   2    0 ] [ 1/√2  −1/√2   0 ]
     [ −1/√2   1/√2   0 ] [ 2   3    0 ] [ 1/√2   1/√2   0 ]
     [   0      0     1 ] [ 0   0   −1 ] [  0      0     1 ]
   = [ 5   0    0 ]
     [ 0   1    0 ]
     [ 0   0   −1 ]
This turned out to be a diagonal matrix, so we stop the computation. From here, we notice that the eigenvalues of the given matrix are 5, 1 and −1, and the corresponding eigenvectors are the column vectors of S = S1S2.
Therefore
    [ 1/√2   0   −1/√2 ] [ 1/√2  −1/√2   0 ]
S = [  0     1     0   ] [ 1/√2   1/√2   0 ]
    [ 1/√2   0    1/√2 ] [  0      0     1 ]
    [ 1/2   −1/2   −1/√2 ]
  = [ 1/√2   1/√2    0   ]
    [ 1/2   −1/2    1/√2 ]
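As a quick numerical cross-check of this example, one can compare with NumPy's symmetric eigensolver:

import numpy as np

# Numerical cross-check of the worked example.
r2 = np.sqrt(2.0)
A = np.array([[1.0, r2, 2.0], [r2, 3.0, r2], [2.0, r2, 1.0]])
eigenvalues, eigenvectors = np.linalg.eigh(A)
print(eigenvalues)   # approximately [-1, 1, 5], agreeing with 5, 1, -1 above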
Example
Find all the eigenvalues of the matrix
    [  2   −1    0 ]
A = [ −1    2   −1 ]
    [  0   −1    2 ]
by Jacobi's method.
Solution
Here all the off-diagonal elements are of the same order of magnitude. Therefore, we can choose any one of them. Suppose we choose a12 as the largest element and compute
tan 2θ = 2a12 / (a11 − a22) = −2/0 = ∞
which gives θ = π/4. Hence
     [ 1/√2  −1/√2   0 ]
S1 = [ 1/√2   1/√2   0 ]
     [  0      0     1 ]
The first rotation gives
D1 = S1⁻¹ A S1
   = [  1/√2   1/√2   0 ] [  2   −1    0 ] [ 1/√2  −1/√2   0 ]
     [ −1/√2   1/√2   0 ] [ −1    2   −1 ] [ 1/√2   1/√2   0 ]
     [   0      0     1 ] [  0   −1    2 ] [  0      0     1 ]
   = [   1       0     −1/√2 ]
     [   0       3     −1/√2 ]
     [ −1/√2   −1/√2     2   ]
Now, we choose d13 = −1/√2 as the largest element of D1 and compute
tan 2θ = 2d13 / (d11 − d33) = −√2 / (1 − 2) = √2
θ = 27°22′41″
Now we construct another orthogonal matrix S2, such that
     [ 0.888   0   −0.459 ]
S2 = [   0     1      0   ]
     [ 0.459   0    0.888 ]
At the end of the second rotation, we obtain
                  [  0.634   −0.325     0    ]
D2 = S2⁻¹D1S2 =   [ −0.325     3      −0.628 ]
                  [    0     −0.628    2.365 ]
Now, the numerically largest off-diagonal element of D2 is found to be d23 = −0.628, and we compute
tan 2θ = 2d23 / (d22 − d33) = (−2 × 0.628) / (3 − 2.365)
θ = −31°35′24″
At the end of the third rotation, we obtain
                  [  0.634   −0.277     0    ]
D3 = S3⁻¹D2S3 =   [ −0.277    3.386     0    ]
                  [    0       0      1.979  ]
To reduce D3 to a diagonal form, some more rotations are required. However, we may take 0.634, 3.386 and 1.979 as approximate eigenvalues of the given matrix.
Example
Using Jacobi’s method, find the eigenvalues and eigenvectors of the following matrix,
    [  1    1/2   1/3 ]
A = [ 1/2   1/3   1/4 ]
    [ 1/3   1/4   1/5 ]
Solution:
The given matrix is real and symmetric. The largest off-diagonal element is found to be a12 = a21 = 1/2.
Now we compute
tan 2θ = 2a_ij / (a_ii − a_jj) = 2a12 / (a11 − a22) = 1 / (1 − 1/3) = 3/2
θ = (1/2) tan⁻¹(3/2) = 28.155°
Lecture 25
Inner Product Space
In mathematics, an inner product space is a vector space with the additional structure
called an inner product. This additional structure associates each pair of vectors in the
space with a scalar quantity known as the inner product of the vectors. Inner products
allow the rigorous introduction of intuitive geometrical notions such as the length of a
vector or the angle between two vectors. They also provide the means of defining
orthogonality between vectors (zero inner product). Inner product spaces generalize
Euclidean spaces (in which the inner product is the dot product, also known as the scalar
product) to vector spaces of any (possibly infinite) dimension, and are studied in
functional analysis.
Definition
An inner product on a vector space V is a function that to each pair of vectors
u and v associates a real number 〈u , v〉 and satisfies the following axioms,
For all u , v , w in V and all scalars C:
1) 〈 u , v〉 = 〈 v, u 〉
2) 〈u + v, w〉 = 〈u , w〉 + 〈 v, w〉
3) 〈 cu , v〉 = c 〈u , v〉
4) 〈u, u〉 ≥ 0, and 〈u, u〉 = 0 iff u = 0
Example 1
Fix any two positive numbers, say 4 and 5, and for vectors u = (u1, u2) and v = (v1, v2) in R², set
〈u, v〉 = 4u1v1 + 5u2v2
Show that it defines an inner product.
Solution
Certainly Axiom 1 is satisfied, because
〈u, v〉 = 4u1v1 + 5u2v2 = 4v1u1 + 5v2u2 = 〈v, u〉.
If w = (w1, w2), then
〈u + v, w〉 = 4(u1 + v1)w1 + 5(u2 + v2)w2
           = 4u1w1 + 5u2w2 + 4v1w1 + 5v2w2 = 〈u, w〉 + 〈v, w〉
This verifies Axiom 2.
For Axiom 3, we have
〈cu, v〉 = 4(cu1)v1 + 5(cu2)v2 = c(4u1v1 + 5u2v2) = c〈u, v〉
For Axiom 4, note that 〈u, u〉 = 4u1² + 5u2² ≥ 0, and 4u1² + 5u2² = 0 only if u1 = u2 = 0, that is, if u = 0. Also, 〈0, 0〉 = 0.
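The four axioms can also be spot-checked numerically. A small sketch in Python; the test vectors are arbitrary illustrative choices:

import numpy as np

def ip(u, v):
    # The weighted inner product <u, v> = 4*u1*v1 + 5*u2*v2 of Example 1.
    return 4 * u[0] * v[0] + 5 * u[1] * v[1]

u, v, w, c = np.array([1.0, 2.0]), np.array([3.0, -1.0]), np.array([0.5, 4.0]), 7.0
assert np.isclose(ip(u, v), ip(v, u))                    # Axiom 1: symmetry
assert np.isclose(ip(u + v, w), ip(u, w) + ip(v, w))     # Axiom 2: additivity
assert np.isclose(ip(c * u, v), c * ip(u, v))            # Axiom 3: homogeneity
assert ip(u, u) > 0                                      # Axiom 4: positivity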
Example 2
Let A be a symmetric, positive definite n × n matrix and let u and v be vectors in ℜⁿ. Show that 〈u, v〉 = uᵀAv defines an inner product.
Solution
We check that
〈u, v〉 = uᵀAv = u·Av = Av·u = (Av)ᵀu = vᵀAᵀu = vᵀAu = 〈v, u〉
Also,
〈u, v + w〉 = uᵀA(v + w) = uᵀAv + uᵀAw = 〈u, v〉 + 〈u, w〉
and
〈cu, v〉 = (cu)ᵀAv = c(uᵀAv) = c〈u, v〉
Finally, since A is positive definite,
〈u, u〉 = uᵀAu > 0 for all u ≠ 0
so 〈u, u〉 = uᵀAu = 0 iff u = 0.
Thus 〈u, v〉 = uᵀAv defines an inner product.
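A brief numerical illustration of this inner product; the positive definite matrix below is an illustrative choice, not from the text:

import numpy as np

# <u, v> = u^T A v for a symmetric positive definite A.
A = np.array([[2.0, 1.0], [1.0, 3.0]])   # both eigenvalues are positive
u, v = np.array([1.0, -1.0]), np.array([2.0, 5.0])
print(u @ A @ v, v @ A @ u)   # equal values, by the symmetry of A
print(u @ A @ u)              # strictly positive for u != 0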
Example 3
Let t0, …, tn be distinct real numbers. For p and q in Pn, define
〈p, q〉 = p(t0)q(t0) + p(t1)q(t1) + … + p(tn)q(tn)
Show that it defines an inner product.
Solution
Certainly Axiom 1 is satisfied, because
〈p, q〉 = p(t0)q(t0) + p(t1)q(t1) + … + p(tn)q(tn)
       = q(t0)p(t0) + q(t1)p(t1) + … + q(tn)p(tn) = 〈q, p〉
If r is any polynomial in Pn, then
〈p + q, r〉 = [p(t0) + q(t0)]r(t0) + [p(t1) + q(t1)]r(t1) + … + [p(tn) + q(tn)]r(tn)
           = [p(t0)r(t0) + … + p(tn)r(tn)] + [q(t0)r(t0) + … + q(tn)r(tn)]
           = 〈p, r〉 + 〈q, r〉
This verifies Axiom 2.
For Axiom 3, we have
〈cp, q〉 = [cp(t0)]q(t0) + [cp(t1)]q(t1) + … + [cp(tn)]q(tn)
        = c[p(t0)q(t0) + p(t1)q(t1) + … + p(tn)q(tn)] = c〈p, q〉
For Axiom 4, note that
〈p, p〉 = [p(t0)]² + [p(t1)]² + … + [p(tn)]² ≥ 0
Also, 〈0, 0〉 = 0. (We still use a boldface zero for the zero polynomial, the zero vector in Pn.) If 〈p, p〉 = 0, then p must vanish at n + 1 points: t0, …, tn. This is possible only if p is the zero polynomial, because the degree of p is less than n + 1. Thus
〈p, q〉 = p(t0)q(t0) + p(t1)q(t1) + … + p(tn)q(tn) defines an inner product on Pn.
Example 4
Compute 〈p, q〉, where p(t) = 4 + t and q(t) = 5 − 4t², for P2 with the inner product given by evaluation at −1, 0 and 1 as in Example 3.
Solution
p(−1) = 3, p(0) = 4, p(1) = 5
q(−1) = 1, q(0) = 5, q(1) = 1
〈p, q〉 = p(−1)q(−1) + p(0)q(0) + p(1)q(1) = (3)(1) + (4)(5) + (5)(1) = 28
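This evaluation inner product is easy to mirror in code. A minimal sketch; the function names are illustrative:

def eval_ip(p, q, points):
    # <p, q> = sum of p(t_i) * q(t_i) over the evaluation points.
    return sum(p(t) * q(t) for t in points)

# Example 4's data: p(t) = 4 + t, q(t) = 5 - 4 t^2, evaluated at -1, 0, 1.
p = lambda t: 4 + t
q = lambda t: 5 - 4 * t**2
print(eval_ip(p, q, [-1, 0, 1]))   # 3*1 + 4*5 + 5*1 = 28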
Example 5
Compute the orthogonal projection of q onto the subspace spanned by p, for p and q in
the above example.
Solution
The values of p and q at −1, 0 and 1 are
p(−1) = 3, p(0) = 4, p(1) = 5
q(−1) = 1, q(0) = 5, q(1) = 1
so that
〈q, p〉 = 28 and 〈p, p〉 = 3² + 4² + 5² = 50
The orthogonal projection of q onto the subspace spanned by p is therefore
q̂ = (〈q, p〉/〈p, p〉) p = (28/50)(4 + t)
  = 56/25 + (14/25)t
Example 6
Let V be P2, with the inner product of Example 3, where t0 = 0, t1 = 1/2 and t2 = 1. Let p(t) = 12t² and q(t) = 2t − 1. Compute 〈p, q〉 and 〈q, q〉.
Solution
〈p, q〉 = p(0)q(0) + p(1/2)q(1/2) + p(1)q(1)
       = (0)(−1) + (3)(0) + (12)(1) = 12
〈q, q〉 = [q(0)]² + [q(1/2)]² + [q(1)]²
       = (−1)² + (0)² + (1)² = 2
Norm of a Vector
Let V be an inner product space, with the inner product denoted by 〈u, v〉. Just as in Rⁿ, we define the length (or norm) of a vector v to be the scalar
‖v‖ = √〈v, v〉, or equivalently ‖v‖² = 〈v, v〉
Example 7
Compute the length of the vector p in Example 6.
Solution
‖p‖² = 〈p, p〉 = [p(0)]² + [p(1/2)]² + [p(1)]²
     = (0)² + (3)² + (12)² = 153
‖p‖ = √153
Example 8
Let ℜ² have the inner product of Example 1, and let x = (1, 1) and y = (5, −1).
a) Find ‖x‖, ‖y‖ and |〈x, y〉|².
b) Describe all vectors (z1, z2) that are orthogonal to y.
Solution
a)
‖x‖ = √〈x, x〉 = √(4(1)(1) + 5(1)(1)) = √(4 + 5) = √9 = 3
‖y‖ = √〈y, y〉 = √(4(5)(5) + 5(−1)(−1)) = √(100 + 5) = √105
|〈x, y〉|² = [4(1)(5) + 5(1)(−1)]² = [20 − 5]² = [15]² = 225
b) (z1, z2) is orthogonal to y = (5, −1) iff
〈z, y〉 = 4(z1)(5) + 5(z2)(−1) = 20z1 − 5z2 = 0
that is, iff z2 = 4z1; the vectors orthogonal to y are the multiples of (1, 4).
Example 9
Let V be P4 with the inner product of Example 3, involving evaluation of polynomials at −2, −1, 0, 1, 2, and view P2 as a subspace of V. Produce an orthogonal basis for P2 by applying the Gram-Schmidt process to the polynomials 1, t and t².
Solution
Polynomial:          1      t      t²
                   [ 1 ]  [ −2 ]  [ 4 ]
                   [ 1 ]  [ −1 ]  [ 1 ]
Vector of values:  [ 1 ], [  0 ], [ 0 ]
                   [ 1 ]  [  1 ]  [ 1 ]
                   [ 1 ]  [  2 ]  [ 4 ]
The inner product of two polynomials in V equals the (standard) inner product of their corresponding vectors in R⁵. Observe that t is orthogonal to the constant function 1, so take p0(t) = 1 and p1(t) = t, and project t² onto Span{p0, p1}:
〈t², p0〉 = 〈t², 1〉 = 4 + 1 + 0 + 1 + 4 = 10
〈p0, p0〉 = 5
〈t², p1〉 = 〈t², t〉 = −8 + (−1) + 0 + 1 + 8 = 0
The orthogonal projection of t² onto Span{p0, p1} is (10/5)p0 + 0·p1 = 2p0. Thus
p2(t) = t² − 2p0(t) = t² − 2
An orthogonal basis for the subspace P2 of V is:
Polynomial:          p0     p1     p2
                   [ 1 ]  [ −2 ]  [  2 ]
                   [ 1 ]  [ −1 ]  [ −1 ]
Vector of values:  [ 1 ], [  0 ], [ −2 ]
                   [ 1 ]  [  1 ]  [ −1 ]
                   [ 1 ]  [  2 ]  [  2 ]
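Because the inner product here is just the dot product of value vectors, the Gram-Schmidt step can be checked with a few lines of NumPy (a sketch of this example's computation):

import numpy as np

# Gram-Schmidt in P2 via vectors of values at t = -2, -1, 0, 1, 2.
t = np.array([-2, -1, 0, 1, 2], dtype=float)
p0 = np.ones_like(t)          # values of the constant polynomial 1
p1 = t.copy()                 # values of t
t2 = t**2                     # values of t^2

# Project t^2 onto span{p0, p1} and subtract the projection.
proj = (t2 @ p0) / (p0 @ p0) * p0 + (t2 @ p1) / (p1 @ p1) * p1
p2 = t2 - proj
print(p2)   # [2, -1, -2, -1, 2], i.e. the values of t^2 - 2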
Example 10
Let V be P4 with the inner product of Example 9, and let p0, p1 and p2 be the orthogonal basis found there for the subspace P2. Find the best approximation to p(t) = 5 − (1/2)t⁴ by polynomials in P2.
Solution:
The values of p0, p1, and p2 at the numbers −2, −1, 0, 1, and 2 are listed as vectors in R⁵:
Polynomial:          p0     p1     p2
                   [ 1 ]  [ −2 ]  [  2 ]
                   [ 1 ]  [ −1 ]  [ −1 ]
Vector of values:  [ 1 ], [  0 ], [ −2 ]
                   [ 1 ]  [  1 ]  [ −1 ]
                   [ 1 ]  [  2 ]  [  2 ]
The corresponding values for p are −3, 9/2, 5, 9/2 and −3. We compute
〈p, p0〉 = 8,  〈p, p1〉 = 0,  〈p, p2〉 = −31
〈p0, p0〉 = 5,  〈p2, p2〉 = 14
Then the best approximation in V to p by polynomials in P2 is
p̂ = proj_P2 p = (〈p, p0〉/〈p0, p0〉)p0 + (〈p, p1〉/〈p1, p1〉)p1 + (〈p, p2〉/〈p2, p2〉)p2
  = (8/5)p0 − (31/14)p2 = 8/5 − (31/14)(t² − 2)
This polynomial is the closest to p of all polynomials in P2, when the distance between polynomials is measured only at −2, −1, 0, 1, and 2.
Cauchy-Schwarz Inequality
For all u, v in V,
|〈u, v〉| ≤ ‖u‖ ‖v‖
Triangle Inequality
For all u, v in V,
‖u + v‖ ≤ ‖u‖ + ‖v‖
Proof
‖u + v‖² = 〈u + v, u + v〉
         = 〈u, u〉 + 2〈u, v〉 + 〈v, v〉
         ≤ ‖u‖² + 2|〈u, v〉| + ‖v‖²
         ≤ ‖u‖² + 2‖u‖‖v‖ + ‖v‖²      (by the Cauchy-Schwarz inequality)
         = (‖u‖ + ‖v‖)²
Taking square roots, ‖u + v‖ ≤ ‖u‖ + ‖v‖.
Inner Product for C[a, b]
Example 11
For f, g in C[a, b], set
〈f, g〉 = ∫ₐᵇ f(t)g(t) dt
Show that this defines an inner product on C[a, b].
Solution
Axioms 1, 2 and 3 follow from elementary properties of the definite integral:
1. 〈f, g〉 = 〈g, f〉
2. 〈f + h, g〉 = 〈f, g〉 + 〈h, g〉
3. 〈cf, g〉 = c〈f, g〉
For Axiom 4, observe that
〈f, f〉 = ∫ₐᵇ [f(t)]² dt ≥ 0
The function [f(t)]² is continuous and nonnegative on [a, b]. If the definite integral of [f(t)]² is zero, then [f(t)]² must be identically zero on [a, b], by a theorem in advanced calculus, in which case f is the zero function. Thus 〈f, f〉 = 0 implies that f is the zero function on [a, b].
So 〈f, g〉 = ∫ₐᵇ f(t)g(t) dt defines an inner product on C[a, b].
Example 12
Compute 〈f, g〉, where f(t) = 1 − 3t² and g(t) = t − t³, on V = C[0, 1] with the inner product
〈f, g〉 = ∫₀¹ f(t)g(t) dt
Solution
〈f, g〉 = ∫₀¹ (1 − 3t²)(t − t³) dt = ∫₀¹ (t − 4t³ + 3t⁵) dt
       = [t²/2 − t⁴ + t⁶/2]₀¹ = 1/2 − 1 + 1/2
       = 0
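Numerically, such integral inner products can be approximated by quadrature. A sketch assuming SciPy is available:

from scipy.integrate import quad

def ip(f, g, a, b):
    # <f, g> = integral of f(t) g(t) over [a, b], evaluated numerically.
    value, _ = quad(lambda t: f(t) * g(t), a, b)
    return value

# Example 12's functions: the inner product comes out to 0 on [0, 1].
f = lambda t: 1 - 3 * t**2
g = lambda t: t - t**3
print(ip(f, g, 0, 1))   # approximately 0 (up to round-off)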
Example 13
Let W be the subspace of C[0, 1] (with the inner product of Example 12) spanned by the polynomials
p1(t) = 1, p2(t) = 2t − 1 and p3(t) = 12t²
Use the Gram-Schmidt process to find an orthogonal basis for W.
Solution
Take q1 = p1 = 1. Since 〈p2, q1〉 = ∫₀¹ (2t − 1) dt = 0, the polynomial p2 is already orthogonal to q1, so take q2 = p2 = 2t − 1. For p3, we compute
〈p3, q1〉 = ∫₀¹ 12t² · 1 dt = [4t³]₀¹ = 4
〈q1, q1〉 = ∫₀¹ 1 · 1 dt = [t]₀¹ = 1
〈p3, q2〉 = ∫₀¹ 12t²(2t − 1) dt = ∫₀¹ (24t³ − 12t²) dt = 2
〈q2, q2〉 = ∫₀¹ (2t − 1)² dt = [(2t − 1)³/6]₀¹ = 1/3
Then
proj_W2 p3 = (〈p3, q1〉/〈q1, q1〉)q1 + (〈p3, q2〉/〈q2, q2〉)q2 = (4/1)q1 + (2/(1/3))q2 = 4q1 + 6q2
and q3 = p3 − proj_W2 p3 = p3 − 4q1 − 6q2
As a function, q3(t) = 12t² − 4 − 6(2t − 1) = 12t² − 12t + 2. The orthogonal basis for the subspace W is {q1, q2, q3}.
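The same Gram-Schmidt computation can be reproduced exactly with symbolic integration. A sketch assuming SymPy:

import sympy as sp

t = sp.symbols('t')

def ip(f, g):
    # <f, g> = integral of f*g over [0, 1], computed exactly.
    return sp.integrate(f * g, (t, 0, 1))

p1, p2, p3 = sp.Integer(1), 2*t - 1, 12*t**2
q1 = p1
q2 = p2 - ip(p2, q1) / ip(q1, q1) * q1          # stays 2t - 1
q3 = p3 - ip(p3, q1) / ip(q1, q1) * q1 - ip(p3, q2) / ip(q2, q2) * q2
print(sp.expand(q3))   # 12*t**2 - 12*t + 2, as in Example 13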
Exercises
1) Let ℜ² have the inner product of Example 1, and let x = (1, 1) and y = (5, −1).
a) Find ‖x‖, ‖y‖ and |〈x, y〉|².
b) Describe all vectors (z1, z2) that are orthogonal to y.
2) Let ℜ² have the inner product of Example 1. Show that the Cauchy-Schwarz inequality holds for x = (3, −2) and y = (−2, 1).
Exercises 3-8 refer to P2 with the inner product given by evaluation at −1, 0 and 1, as in Example 3.
3) Compute 〈p, q〉, where p(t) = 4 + t and q(t) = 5 − 4t².
4) Compute 〈p, q〉, where p(t) = 3t − t² and q(t) = 3 + t².
5) Compute ‖p‖ and ‖q‖ for p and q in Exercise 3.
6) Compute ‖p‖ and ‖q‖ for p and q in Exercise 4.
7) Compute the orthogonal projection of q onto the subspace spanned by p, for p and q in Exercise 3.
9) Let P3 have the inner product given by evaluation at −3, −1, 1, and 3. Let p0(t) = 1, p1(t) = t, and p2(t) = t².
a) Compute the orthogonal projection of p2 onto the subspace spanned by p0 and p1.
b) Find a polynomial q that is orthogonal to p0 and p1, such that {p0, p1, q} is an orthogonal basis for Span{p0, p1, p2}. Scale the polynomial q so that its vector of values at (−3, −1, 1, 3) is (1, −1, −1, 1).
10) Let P3 have the inner product given by evaluation at −3, −1, 1, and 3, let p0(t) = 1 and p1(t) = t, and let q be as in Exercise 9. Find the best approximation to p(t) = t³ by polynomials in Span{p0, p1, q}.
11) Let p0, p1, p2 be the orthogonal polynomials described in Example 9, where the inner product on P4 is given by evaluation at −2, −1, 0, 1, and 2. Find the orthogonal projection of t³ onto Span{p0, p1, p2}.
12) Compute 〈f, g〉, where f(t) = 1 − 3t² and g(t) = t − t³, on V = C[0, 1] with the integral inner product of Example 11.
16) Let V be the space C[−2, 2] with the integral inner product of Example 11. Find an orthogonal basis for the subspace spanned by the polynomials 1, t and t².
17) Let u = (u1, u2) and v = (v1, v2) be two vectors in R². Show that 〈u, v〉 = 2u1v1 + 3u2v2 defines an inner product.
Lecture 26
Application of inner product spaces
Definition
An inner product on a vector space V is a function that associates to each pair of vectors u
and v in V, a real number u , v and satisfies the following axioms, for all u, v, w in V
and all scalars c:
1. u , v = v, u
2. u + v, w = u , w + v, w
3. cu , v = c u , v
4. 〈u, u〉 ≥ 0, and 〈u, u〉 = 0 iff u = 0.
A common application is fitting a least-squares line y = β0 + β1x to data points (x1, y1), …, (xn, yn). Writing one equation β0 + β1xj = yj for each data point gives the (generally inconsistent) system
Xβ = y
where
    [ 1  x1 ]                 [ y1 ]
    [ 1  x2 ]    [ β0 ]       [ y2 ]
X = [ .   . ], β=[ β1 ],  y = [ .  ]
    [ .   . ]                 [ .  ]
    [ 1  xn ]                 [ yn ]
Example 1
Find the equation y = β0 + β1x of the least-squares line that best fits the data points (2, 1), (5, 2), (7, 3), (8, 3).
Solution
Here
    [ 1  2 ]                 [ 1 ]
    [ 1  5 ]    [ β0 ]       [ 2 ]
X = [ 1  7 ], β=[ β1 ],  y = [ 3 ]
    [ 1  8 ]                 [ 3 ]
For the least-squares solution of Xβ = y, obtain the normal equations (with the new notation):
XᵀXβ̂ = Xᵀy
i.e., compute
XᵀX = [1 1 1 1; 2 5 7 8][1 2; 1 5; 1 7; 1 8] = [4 22; 22 142]
Xᵀy = [1 1 1 1; 2 5 7 8][1; 2; 3; 3] = [9; 57]
so that
[ 4   22 ] [ β0 ]   [  9 ]
[ 22 142 ] [ β1 ] = [ 57 ]
Hence
[ β0 ]   [ 4   22 ]⁻¹ [  9 ]   (1/84) [ 142  −22 ] [  9 ]   (1/84) [ 24 ]   [ 2/7  ]
[ β1 ] = [ 22 142 ]   [ 57 ] =        [ −22    4 ] [ 57 ] =        [ 30 ] = [ 5/14 ]
Thus, the least-squares line has the equation
y = 2/7 + (5/14)x
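The normal-equation solution maps directly to NumPy. A minimal sketch of this example:

import numpy as np

# Least-squares line for the data (2,1), (5,2), (7,3), (8,3):
# solve the normal equations X^T X beta = X^T y.
x = np.array([2.0, 5.0, 7.0, 8.0])
y = np.array([1.0, 2.0, 3.0, 3.0])
X = np.column_stack([np.ones_like(x), x])       # design matrix
beta = np.linalg.solve(X.T @ X, X.T @ y)
print(beta)   # approximately [2/7, 5/14] = [0.2857, 0.3571]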
Weighted Least-Squares
Suppose the n equations of the system Ax = y are not all equally reliable, so that weights w1, …, wn are attached to them. Let W be the diagonal matrix with the (positive) weights w1, …, wn on its diagonal, that is,
    [ w1   0   ...   0  ]
    [ 0    w2        .  ]
W = [ .         .    .  ]
    [ 0    ...       wn ]
Then the weighted least-squares solution x̂ is the ordinary least-squares solution of the equation
WAx = Wy
Example 2
Find the least-squares line y = β0 + β1x that best fits the data (−2, 3), (−1, 5), (0, 5), (1, 4), (2, 3). Suppose that the errors in measuring the y-values of the last two data points are greater than for the other points. Weight this data half as much as the rest of the data.
Solution
Write X, β and y:
    [ 1  −2 ]                 [ 3 ]
    [ 1  −1 ]    [ β0 ]       [ 5 ]
X = [ 1   0 ], β=[ β1 ],  y = [ 5 ]
    [ 1   1 ]                 [ 4 ]
    [ 1   2 ]                 [ 3 ]
With W = diag(2, 2, 2, 1, 1),
     [ 2 0 0 0 0 ] [ 1  −2 ]   [ 2  −4 ]        [ 6  ]
     [ 0 2 0 0 0 ] [ 1  −1 ]   [ 2  −2 ]        [ 10 ]
WX = [ 0 0 2 0 0 ] [ 1   0 ] = [ 2   0 ],  Wy = [ 10 ]
     [ 0 0 0 1 0 ] [ 1   1 ]   [ 1   1 ]        [ 4  ]
     [ 0 0 0 0 1 ] [ 1   2 ]   [ 1   2 ]        [ 3  ]
For the normal equations, compute
(WX)ᵀWX = [ 14  −9 ]    and    (WX)ᵀWy = [  59 ]
          [ −9  25 ]                     [ −34 ]
and solve
[ 14  −9 ] [ β0 ]   [  59 ]
[ −9  25 ] [ β1 ] = [ −34 ]
[ β0 ]   [ 14  −9 ]⁻¹ [  59 ]   (1/269) [ 25   9 ] [  59 ]   (1/269) [ 1169 ]   [ 4.3 ]
[ β1 ] = [ −9  25 ]   [ −34 ] =         [  9  14 ] [ −34 ] =         [  55  ] ≈ [ 0.2 ]
Therefore, the solution to two significant digits is β0 = 4.3 and β1 = 0.20. Hence the required line is
y = 4.3 + 0.2x
In contrast, the ordinary least-squares line for this data can be found as follows:
XᵀX = [1 1 1 1 1; −2 −1 0 1 2][1 −2; 1 −1; 1 0; 1 1; 1 2] = [5 0; 0 10]
Xᵀy = [1 1 1 1 1; −2 −1 0 1 2][3; 5; 5; 4; 3] = [20; −1]
so that
[ 5   0 ] [ β0 ]   [ 20 ]
[ 0  10 ] [ β1 ] = [ −1 ]
[ β0 ]   (1/50) [ 10  0 ] [ 20 ]   (1/50) [ 200 ]   [  4.0 ]
[ β1 ] =        [  0  5 ] [ −1 ] =        [ −5  ] = [ −0.1 ]
Hence the ordinary least-squares line is y = 4.0 − 0.1x.
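Both fits in this example can be reproduced in a few lines. A sketch, again assuming NumPy:

import numpy as np

# Weighted least squares for Example 2: weights 2, 2, 2, 1, 1 down-weight
# the last two points; solve the normal equations for WX beta = Wy.
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = np.array([3.0, 5.0, 5.0, 4.0, 3.0])
w = np.array([2.0, 2.0, 2.0, 1.0, 1.0])
X = np.column_stack([np.ones_like(x), x])
WX, Wy = w[:, None] * X, w * y
beta = np.linalg.solve(WX.T @ WX, WX.T @ Wy)
print(beta)   # approximately [4.35, 0.20]

# Ordinary (unweighted) least squares for comparison:
print(np.linalg.solve(X.T @ X, X.T @ y))   # [4.0, -0.1]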
Trend Analysis
Trend analysis is an aspect of technical analysis that tries to predict the future movement of a stock based on past data. It is based on the idea that what has happened in the past gives traders an idea of what will happen in the future.
Linear Trend
A first step in analyzing a time series is to determine whether a linear relationship provides a good approximation to the long-term movement of the series, computed by the method of semi-averages or by the method of least squares.
Note
The simplest and most common use of trend analysis occurs when the points t0 , t1,..., tn
can be adjusted so that they are evenly spaced and sum to zero.
Example
Fit a quadratic trend function to the data (−2, 3), (−1, 5), (0, 5), (1, 4) and (2, 3).
Solution
The t-coordinates are suitably scaled to use the orthogonal polynomials found in Example 9 of the last lecture. We have
Polynomial:          p0     p1     p2      data: g
                   [ 1 ]  [ −2 ]  [  2 ]   [ 3 ]
                   [ 1 ]  [ −1 ]  [ −1 ]   [ 5 ]
Vector of values:  [ 1 ], [  0 ], [ −2 ],  [ 5 ]
                   [ 1 ]  [  1 ]  [ −1 ]   [ 4 ]
                   [ 1 ]  [  2 ]  [  2 ]   [ 3 ]
The best-fitting quadratic trend function is the orthogonal projection of g:
p̂ = (〈g, p0〉/〈p0, p0〉)p0 + (〈g, p1〉/〈p1, p1〉)p1 + (〈g, p2〉/〈p2, p2〉)p2
  = (20/5)p0 − (1/10)p1 − (7/14)p2
and p̂(t) = 4 − 0.1t − 0.5(t² − 2)
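A short numerical check of the trend coefficients (a sketch using the value vectors from the table above):

import numpy as np

# Quadratic trend fit via projection onto the orthogonal polynomials
# p0, p1, p2 evaluated at t = -2, -1, 0, 1, 2.
p0 = np.array([1, 1, 1, 1, 1], dtype=float)
p1 = np.array([-2, -1, 0, 1, 2], dtype=float)
p2 = np.array([2, -1, -2, -1, 2], dtype=float)
g = np.array([3, 5, 5, 4, 3], dtype=float)
coeffs = [(g @ p) / (p @ p) for p in (p0, p1, p2)]
print(coeffs)   # [4.0, -0.1, -0.5], matching the projection above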
Fourier Series
For f in C[0, 2π], the Fourier coefficients of f are
am = (1/π) ∫₀^{2π} f(t) cos mt dt  and
bm = (1/π) ∫₀^{2π} f(t) sin mt dt
Example
Let C[0, 2π] have the inner product
〈f, g〉 = ∫₀^{2π} f(t)g(t) dt
and let m and n be unequal positive integers. Show that cos mt and cos nt are orthogonal.
Solution
When m ≠ n,
〈cos mt, cos nt〉 = ∫₀^{2π} cos mt cos nt dt
                = (1/2) ∫₀^{2π} [cos(mt + nt) + cos(mt − nt)] dt
                = (1/2) [sin(mt + nt)/(m + n) + sin(mt − nt)/(m − n)]₀^{2π}
                = 0
Example
Find the nth-order Fourier approximation to the function f(t) = t on the interval [0, 2π].
Solution
We compute
a0/2 = (1/2)·(1/π) ∫₀^{2π} t dt = (1/2π) [t²/2]₀^{2π} = π
and, for k > 0,
ak = (1/π) ∫₀^{2π} t cos kt dt = (1/π) [(1/k²) cos kt + (t/k) sin kt]₀^{2π} = 0
bk = (1/π) ∫₀^{2π} t sin kt dt = (1/π) [(1/k²) sin kt − (t/k) cos kt]₀^{2π} = −2/k
Thus, the nth-order Fourier approximation of f(t) = t is
π − 2 sin t − sin 2t − (2/3) sin 3t − … − (2/n) sin nt
The norm of the difference between f and a Fourier approximation is called the mean
square error in the approximation.
It is common to write
f(t) = a0/2 + Σ (m = 1 to ∞) (am cos mt + bm sin mt)
This expression for f(t) is called the Fourier series for f on [0, 2π].
This expression for f (t) is called the Fourier series for f on [0, 2π ] . The term am cos mt ,
for example, is the projection of f onto the one-dimensional subspace spanned by cos mt .
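The coefficient formulas can be evaluated numerically for any f. A sketch assuming SciPy's quad; it reproduces the coefficients of the worked example f(t) = t:

import numpy as np
from scipy.integrate import quad

def fourier_coeffs(f, n):
    # a_m = (1/pi) * integral of f(t) cos(mt) over [0, 2*pi],
    # b_m = (1/pi) * integral of f(t) sin(mt) over [0, 2*pi].
    a = [quad(lambda t: f(t) * np.cos(m * t), 0, 2 * np.pi)[0] / np.pi
         for m in range(n + 1)]
    b = [quad(lambda t: f(t) * np.sin(m * t), 0, 2 * np.pi)[0] / np.pi
         for m in range(n + 1)]
    return a, b

a, b = fourier_coeffs(lambda t: t, 3)
print(a[0] / 2, b[1:])   # pi, then b_k = -2/k: [-2.0, -1.0, -0.667]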
Example
Let q1(t) = 1, q2(t) = t, and q3(t) = 3t² − 4. Verify that {q1, q2, q3} is an orthogonal set in C[−2, 2] with the inner product
〈f, g〉 = ∫₋₂² f(t)g(t) dt
Solution:
〈q1, q2〉 = ∫₋₂² 1·t dt = [t²/2]₋₂² = 0
〈q1, q3〉 = ∫₋₂² 1·(3t² − 4) dt = [t³ − 4t]₋₂² = 0
〈q2, q3〉 = ∫₋₂² t·(3t² − 4) dt = [(3/4)t⁴ − 2t²]₋₂² = 0
Exercise
1. Find the equation y = β0 + β1x of the least-squares line that best fits the data points (0, 1), (1, 1), (2, 2), (3, 2).
2. Find the equation y = β0 + β1x of the least-squares line that best fits the data points (−1, 0), (0, 1), (1, 2), (2, 4).
3. Find the least-squares line y = β0 + β1x that best fits the data (−2, 0), (−1, 0), (0, 2), (1, 4), (2, 4), assuming that the first and last data points are less reliable. Weight them half as much as the three interior points.
4. To make a trend analysis of six evenly spaced data points, one can use orthogonal polynomials with respect to evaluation at the points t = −5, −3, −1, 1, 3 and 5.
(a) Show that the first three orthogonal polynomials are
p0(t) = 1, p1(t) = t, and p2(t) = (3/8)t² − 35/8
(b) Fit a quadratic trend function to the data (−5, 1), (−3, 1), (−1, 4), (1, 4), (3, 6), (5, 8).
5. For the space C[0, 2π] with the inner product defined by
〈f, g〉 = ∫₀^{2π} f(t)g(t) dt
find the nth-order Fourier approximation to the following function, without performing any integration calculations:
f(t) = 3 − 2 sin t + 5 sin 2t − 6 cos 2t
Lecture 27
Lecture 28
Lecture 29
Lecture 30
Newton’s Method
PPT slides are available in VULMS/downloads
Lecture 31
Quasi-Newton Method
PPT slides are available in VULMS/downloads