Projections Onto Subspaces
Projections
If we have a vector b and a line determined by a vector a, how do we find the
point on the line that is closest to b?
[Figure 1: The vector b, the line through a, and the closest point p on that line.]
We can see from Figure 1 that the closest point p lies where the line through
b perpendicular to a meets the line determined by a. If we think of p as an
approximation of b, then the length of e = b − p is the error in that
approximation.
We could try to find p using trigonometry or calculus, but it’s easier to use
linear algebra. Since p lies on the line through a, we know p = xa for some
number x. We also know that a is perpendicular to e = b − xa:
$$a^T (b - xa) = 0$$
$$x \, a^T a = a^T b$$
$$x = \frac{a^T b}{a^T a},$$
and
$$p = ax = a \, \frac{a^T b}{a^T a}.$$
Doubling b doubles p. Doubling a does not affect p.
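The formulas above can be checked numerically. The following is a minimal sketch in plain Python using exact fractions; the helper names `dot` and `project_onto_line` and the sample vectors are illustrative assumptions, not part of the original notes.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product u^T v of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def project_onto_line(a, b):
    """Return (x, p, e) with x = a^T b / a^T a, p = x a, and e = b - p."""
    x = Fraction(dot(a, b), dot(a, a))
    p = [x * ai for ai in a]
    e = [bi - pi for bi, pi in zip(b, p)]
    return x, p, e

# Illustrative vectors (assumed for this example only).
a = [1, 2, 2]
b = [1, 1, 1]
x, p, e = project_onto_line(a, b)

print(x)          # 5/9
print(dot(a, e))  # 0, so the error e is perpendicular to a

# Doubling b doubles p; doubling a leaves p unchanged.
_, p_double_b, _ = project_onto_line(a, [2 * bi for bi in b])
_, p_double_a, _ = project_onto_line([2 * ai for ai in a], b)
print(p_double_b == [2 * pi for pi in p])  # True
print(p_double_a == p)                     # True
```

Exact fractions are used here so that the orthogonality check gives exactly zero rather than a small rounding error.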
Projection matrix
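We would like to write this projection as a matrix P acting on b, so that
p = Pb. Since x is a number, xa = ax, and
$$p = xa = a \, \frac{a^T b}{a^T a} = \frac{a a^T}{a^T a} \, b,$$
so the matrix is:
$$P = \frac{a a^T}{a^T a}.$$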
Note that $a a^T$ is a three by three matrix (for a vector a with three
components), not a number, whereas $a^T a$ is a number; matrix multiplication
is not commutative.
The column space of P is spanned by a because for any b, Pb lies on the
line determined by a. The rank of P is 1. P is symmetric. $P^2 b = Pb$ because
the projection of a vector already on the line through a is just that vector. In
general, projection matrices have the properties:
$$P^T = P \quad \text{and} \quad P^2 = P.$$
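As a quick check of these two properties, the sketch below builds the rank one projection matrix for one sample vector and tests symmetry and idempotence exactly. The function names and the vector a = (1, 2, 2) are assumptions made for illustration.

```python
from fractions import Fraction

def projection_matrix(a):
    """P = a a^T / (a^T a): the outer product a a^T divided by the number a^T a."""
    ata = sum(ai * ai for ai in a)
    return [[Fraction(ai * aj, ata) for aj in a] for ai in a]

def transpose(X):
    """Swap rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    """Matrix product of X and Y, both stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

a = [1, 2, 2]  # illustrative vector, assumed for this example
P = projection_matrix(a)

is_symmetric = transpose(P) == P    # P^T = P
is_idempotent = matmul(P, P) == P   # P^2 = P
print(is_symmetric, is_idempotent)  # True True

# Every row of P is a multiple of a^T, which is why the rank is 1.
print(P[1] == [2 * entry for entry in P[0]])  # True
```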
Why project?
As we know, the equation Ax = b may have no solution. The vector Ax is
always in the column space of A, and b is unlikely to be in the column space.
So, we project b onto a vector p in the column space of A and solve Ax̂ = p.
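The error e = b − Ax̂ must be orthogonal to the column space of A, so
$A^T (b - A\hat{x}) = 0$, which gives:
$$A^T A \hat{x} = A^T b.$$
When projecting onto a line, $A^T A$ was just a number; now it is a square matrix.
So instead of dividing by $a^T a$ we now have to multiply by $(A^T A)^{-1}$,
which exists when the columns of A are independent.
In n dimensions,
$$\hat{x} = (A^T A)^{-1} A^T b$$
$$p = A\hat{x} = A (A^T A)^{-1} A^T b$$
$$P = A (A^T A)^{-1} A^T.$$
As in the one-dimensional case, $P^T = P$ and $P^2 = P$.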
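The general formulas can be tried on a small case. The sketch below assumes a matrix A with exactly two independent columns, so that $A^T A$ is 2 by 2 and can be inverted with the closed-form 2 by 2 inverse; the matrix A, the vector b, and all helper names are illustrative choices rather than anything taken from the notes.

```python
from fractions import Fraction

def transpose(X):
    """Swap rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    """Matrix product of X and Y, both stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def inverse_2x2(M):
    """Closed-form inverse of a 2 by 2 matrix; fails if the determinant is zero."""
    (p, q), (r, s) = M
    det = Fraction(p * s - q * r)
    if det == 0:
        raise ValueError("A^T A is singular: the columns of A are dependent")
    return [[s / det, -q / det], [-r / det, p / det]]

# Illustrative data (assumed): b is stored as a 3 by 1 column.
A = [[1, 0], [1, 1], [1, 2]]
b = [[6], [0], [0]]

At = transpose(A)
AtA_inv = inverse_2x2(matmul(At, A))
x_hat = matmul(AtA_inv, matmul(At, b))   # x_hat = (A^T A)^{-1} A^T b
P = matmul(matmul(A, AtA_inv), At)       # P = A (A^T A)^{-1} A^T
p = matmul(P, b)                         # projection of b onto the column space
e = [[bi[0] - pi[0]] for bi, pi in zip(b, p)]

print(x_hat == [[5], [-3]])           # True
print(p == [[5], [2], [-1]])          # True
print(matmul(At, e) == [[0], [0]])    # True: e is orthogonal to both columns of A
print(transpose(P) == P, matmul(P, P) == P)  # True True
```

For larger problems one would normally solve the normal equations with a linear solver instead of forming the inverse explicitly; the explicit inverse is used here only to mirror the formulas.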
Figure 2: Three points and a line close to them.
Least Squares
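Suppose we’re given a collection of data points (t, b):
(1, 1), (2, 2), (3, 2)
and we want to find the closest line b = C + Dt to that collection. If the line
went through all three points, we’d have:
$$C + D = 1$$
$$C + 2D = 2$$
$$C + 3D = 2,$$
or, in matrix form Ax = b,
$$\begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{bmatrix} \begin{bmatrix} C \\ D \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}.$$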
In our example the line does not go through all three points, so this equation
is not solvable. Instead we’ll solve:
$$A^T A \hat{x} = A^T b.$$
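To see what solving this equation produces, the sketch below forms the 2 by 2 normal equations for the three points (1, 1), (2, 2), (3, 2) read off from the equations above and solves them with Cramer's rule. The function name `fit_line` and the use of Cramer's rule are choices made for this illustration; the notes do not prescribe a particular solution method.

```python
from fractions import Fraction

def fit_line(points):
    """Least-squares C, D for the line b = C + D t, via the 2x2 normal equations.

    With rows [1, t] in A, the normal equations A^T A x = A^T b are
        [ n     sum t   ] [C]   [ sum b   ]
        [ sum t sum t^2 ] [D] = [ sum t*b ].
    """
    n = len(points)
    st = sum(t for t, _ in points)
    stt = sum(t * t for t, _ in points)
    sb = sum(b for _, b in points)
    stb = sum(t * b for t, b in points)
    det = n * stt - st * st
    if det == 0:
        raise ValueError("all t values are equal, so A^T A is not invertible")
    # Cramer's rule on the 2x2 system.
    C = Fraction(sb * stt - st * stb, det)
    D = Fraction(n * stb - st * sb, det)
    return C, D

points = [(1, 1), (2, 2), (3, 2)]
C, D = fit_line(points)
print(C, D)  # 2/3 1/2

# The errors e = b - (C + D t) are orthogonal to both columns of A.
errors = [b - (C + D * t) for t, b in points]
print(sum(errors) == 0)                                      # True
print(sum(t * e for (t, _), e in zip(points, errors)) == 0)  # True
```

Under these assumptions the closest line is b = 2/3 + t/2, and the two orthogonality checks confirm that the error vector is perpendicular to the column space of A.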
MIT OpenCourseWare
http://ocw.mit.edu
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.