Introduction
Recall from the lecture on vector spaces that geometric vectors (i.e. vectors in two- and three-dimensional
Cartesian space) have the properties of addition, subtraction, a zero vector, additive inverses, scalability,
magnitude (i.e. length), and angle. Moreover, the formal definition of a vector space includes
all these properties except for the last two: magnitude and angle. In this lecture we introduce a
vector operation that allows us to define length and angle for vectors in an arbitrary vector space.
This operation is called the “inner product between two vectors”, and is a generalization of the dot
product that was introduced in the Matrices lecture.
Recall that if u and v are vectors in Rn then the dot product u · v is defined as
u · v = u1 v1 + · · · + un vn .
Recall also how this operation is used when computing the entries of the product C = AB of two
matrices. In particular, cij = ai,∗ · b∗,j , where ai,∗ denotes the i-th row of A and b∗,j the j-th column of B.
We now describe how to use the dot product to obtain the length of a vector, and the angle between
two vectors in R2 . Note that the same can also be done for vectors in R1 and R3 , but it seems more
illustrative and straightforward to show it for R2 .
Example 1. Show that the length of the vector v = (x, y) in R2, denoted |v|, is equal to √(x² + y²).
In conclusion, the length of vector v = (x, y) is given as
|v| = √(x² + y²) = √(v · v),
and so the length of a vector in R2 can be solely expressed in terms of the dot product.
The next step is to show how the angle between two vectors u and v can be expressed solely in terms
of the dot product.
Example 2. Show that the angle between the vectors u = (x1 , y1 ) and v = (x2 , y2 ) can be given as
cos⁻¹( u · v / (|u||v|) ).
Conclude that the angle between two vectors can be expressed solely by the dot product.
Example 3. Given u = (3, 5) and v = (−2, 1), determine the lengths of these vectors and the angle
between them.
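A sketch of the computation (assuming NumPy; not part of the original notes) follows:

    import numpy as np

    u = np.array([3.0, 5.0])
    v = np.array([-2.0, 1.0])

    len_u = np.sqrt(np.dot(u, u))   # |u| = sqrt(34), about 5.831
    len_v = np.sqrt(np.dot(v, v))   # |v| = sqrt(5), about 2.236
    theta = np.degrees(np.arccos(np.dot(u, v) / (len_u * len_v)))

    print(len_u, len_v, theta)      # the angle is about 94.4 degrees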
Our goal in this section is to generalize the concept of dot product so that the generalization may
be applied to any vector space. Notice how the dot product relies on a vector having components.
But not all vector spaces have vectors that have obvious components. For example, the function
space F(R, R) has vectors that are functions that do not necessarily have components. Therefore,
we cannot simply define the generalized dot product in terms of a formula that involves vector
components, as was done for the dot product. The next best idea is to list all of the important
algebraic properties of the dot product, and require that the generalized dot product have these
properties.
As a first step, notice that the dot product is a function that accepts two vector inputs u and v, and
returns a real-valued output u · v. Therefore, the generalized dot product should also be a function
that takes in two vectors u and v, and returns a real-valued output. But so as not to confuse the
generalized dot product with the original dot product, we use the notation <u, v> to denote the
real-valued output. Moreover, instead of using the term “generalized dot product”, we prefer the
term “inner product”. Thus, given a vector space V, we say that <, > is an inner product on V iff
<, > is a function that takes two vectors u, v ∈ V and returns the real number <u, v>. Moreover,
<, > must satisfy the following properties.
Symmetry <u, v> = <v, u>.
Additivity <u, v + w> = <u, v> + <u, w>.
Scalar Associativity k<u, v> = <ku, v>, for every scalar k.
Positivity <u, u> ≥ 0, and <u, u> = 0 iff u = 0.
It is left as an exercise to show that for V = R2 , if <u, v> = u · v, then the above four properties are
satisfied. In other words, the dot product is itself an inner product. If an inner product is defined
on a vector space V, then V is called an inner-product space.
Proposition 1. An inner product on a vector space V satisfies the following additional properties.
1. <0, u> = 0.
2. <u + v, w> = <u, w> + <v, w>.
3. <u, kv> = k<u, v>.
Proof of Proposition 1. Property 2 is proved by invoking the symmetry and additivity properties,
while Property 3 is proved by invoking the symmetry and the scalar associativity properties. As for
Property 1,
<0, u> = <0u, u> = 0<u, u> = 0.
Examples
Example 4. Let
A = [ a1  b1 ]   and   B = [ a2  b2 ]
    [ c1  d1 ]             [ c2  d2 ]
be real-valued matrices, and hence vectors of the vector space M22 . We show that M22 is an inner
product space by defining
<A, B> = a1 a2 + b1 b2 + c1 c2 + d1 d2 .
It remains to show that <, > has all the requisite properties.
Symmetry
<A, B> = a1 a2 + b1 b2 + c1 c2 + d1 d2 =
a2 a1 + b2 b1 + c2 c1 + d2 d1 = <B, A>.
Additivity Let
C = [ a3  b3 ]
    [ c3  d3 ].
Then,
<A + B, C> = (a1 + a2)a3 + (b1 + b2)b3 + (c1 + c2)c3 + (d1 + d2)d3 =
a1a3 + a2a3 + b1b3 + b2b3 + c1c3 + c2c3 + d1d3 + d2d3 =
(a1a3 + b1b3 + c1c3 + d1d3) + (a2a3 + b2b3 + c2c3 + d2d3) = <A, C> + <B, C>.
Scalar Associativity
k<A, B> = k(a1 a2 + b1 b2 + c1 c2 + d1 d2 ) =
k(a1 a2 ) + k(b1 b2 ) + k(c1 c2 ) + k(d1 d2 ) =
(ka1 )a2 + (kb1 )b2 + (kc1 )c2 + (kd1 )d2 = <kA, B>.
Positivity
<A, A> = a1² + b1² + c1² + d1² ≥ 0,
and <A, A> = 0 implies A = 0, since this is only possible when all entries of A are zero.
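Numerically, this inner product is just the sum of the entrywise products of the two matrices. Here is a minimal sketch (assuming NumPy; the helper name inner_m22 is ours) that spot-checks the four properties on particular matrices:

    import numpy as np

    def inner_m22(X, Y):
        # <X, Y> = a1*a2 + b1*b2 + c1*c2 + d1*d2: entrywise products, summed.
        return float(np.sum(X * Y))

    A = np.array([[1.0, 2.0], [3.0, 4.0]])
    B = np.array([[5.0, 6.0], [7.0, 8.0]])
    C = np.array([[1.0, 1.0], [0.0, 2.0]])

    assert inner_m22(A, B) == inner_m22(B, A)                                  # symmetry
    assert np.isclose(inner_m22(A + B, C), inner_m22(A, C) + inner_m22(B, C))  # additivity
    assert np.isclose(3 * inner_m22(A, B), inner_m22(3 * A, B))                # scalar assoc.
    assert inner_m22(A, A) >= 0                                                # positivity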
Example 5. Let C[a, b] denote the set of all real-valued functions that are continuous on the closed
interval [a, b]. Since the sum of any two continuous functions is also continuous and f continuous
implies (kf ) is continuous, for any scalar k, it follows that C[a, b] is a vector space (the proof of this
is exactly the same as the proof that F(R, R) is a vector space). Moreover, recall from calculus
that any continuous function over a closed interval can be integrated over that interval. Thus, given
continuous functions f, g ∈ C[a, b], define
<f, g> = ∫_a^b f(x)g(x) dx.
Prove that <, > has all the requisite properties of an inner product.
Example 5 Solution. Symmetry holds since f(x)g(x) = g(x)f(x). Additivity holds by the additivity of the integral:
<f, g + h> = ∫_a^b f(x)(g(x) + h(x)) dx = ∫_a^b f(x)g(x) dx + ∫_a^b f(x)h(x) dx = <f, g> + <f, h>.
Scalar associativity holds since k<f, g> = k ∫_a^b f(x)g(x) dx = ∫_a^b (kf(x))g(x) dx = <kf, g>. Finally, for
positivity, <f, f> = ∫_a^b f²(x) dx ≥ 0, since f²(x) ≥ 0 on [a, b]. Moreover, if f is not the zero function, say
f(x0) ≠ 0, then by continuity f²(x) > 0 on some subinterval about x0, and hence ∫_a^b f²(x) dx > 0.
Therefore <f, f> = 0 iff f = 0.
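This integral inner product can also be evaluated numerically. The following sketch (assuming SciPy is available; the helper name inner_c is ours) computes <f, g> for two continuous functions:

    import numpy as np
    from scipy.integrate import quad

    def inner_c(f, g, a, b):
        # <f, g> = integral from a to b of f(x)g(x) dx; quad returns (value, error).
        value, _ = quad(lambda x: f(x) * g(x), a, b)
        return value

    # sin and cos satisfy <f, g> = 0 on [0, pi].
    print(inner_c(np.sin, np.cos, 0.0, np.pi))  # 0, up to floating-point error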
Measuring Length and Angle in Inner Product Spaces
Recall that for vectors in R2 the length of a vector and angle between two vectors can be expressed
entirely in terms of the dot product. To review,
|v| = √(v · v),
while
θ(u, v) = cos⁻¹( u · v / (√(u · u) √(v · v)) ).
So it makes sense to use these definitions for length and angle in an inner product space. In particular,
if vector space V has defined inner product <, >, then the length of vector v ∈ V is defined as
|v| = √<v, v>,
while the angle θ(u, v) between vectors u, v ∈ V is defined as
θ(u, v) = cos⁻¹( <u, v> / (√<u, u> √<v, v>) ).
Do the above definitions make sense? The following are three fundamental properties of length
relating to geometrical vectors that ought to also be valid for vectors in an inner product space:
1. |v| ≥ 0, and |v| = 0 iff v = 0;
2. |kv| = |k| |v|, for every scalar k;
3. |u + v| ≤ |u| + |v| (the triangle inequality).
It is left as an exercise to prove that the length definition for vectors in any inner product space does
indeed satisfy the above properties.
As for the angle definition θ(u, v) we must verify that the expression
<u, v> / (√<u, u> √<v, v>)
is in fact a number between −1 and 1, since the cos⁻¹ function is only defined on this interval. To prove
this we need the following important theorem.
Theorem 1 (Cauchy-Schwarz Inequality). For any vectors u and v in an inner product space,
<u, v>² ≤ <u, u><v, v>.
Proof of Theorem 1. For any scalar t, by several applications of the four properties of inner
products, we get
0 ≤ <tu + v, tu + v> = t²<u, u> + 2t<u, v> + <v, v>,
which may be written as at² + bt + c ≥ 0, where a = <u, u>, b = 2<u, v>, and c = <v, v>. But
at² + bt + c ≥ 0 for all t implies that the equation at² + bt + c = 0 either has no real roots, or exactly one real root.
In other words, we must have
b² − 4ac ≤ 0,
which implies
4<u, v>² ≤ 4<u, u><v, v>,
or
<u, v>² ≤ <u, u><v, v>.
Example 6. Let
A = [ −1   3 ]   and   B = [ 5  −4 ]
    [  2  −2 ]             [ 0   2 ]
be vectors of the inner product space from Example 4. Determine the lengths of A and B, and
θ(A, B).
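One way to check the answer is numerically, reusing the Example 4 inner product (a sketch assuming NumPy; inner_m22 is as before):

    import numpy as np

    def inner_m22(X, Y):
        return float(np.sum(X * Y))

    A = np.array([[-1.0, 3.0], [2.0, -2.0]])
    B = np.array([[5.0, -4.0], [0.0, 2.0]])

    len_A = np.sqrt(inner_m22(A, A))   # sqrt(18)
    len_B = np.sqrt(inner_m22(B, B))   # sqrt(45)
    theta = np.degrees(np.arccos(inner_m22(A, B) / (len_A * len_B)))
    print(len_A, len_B, theta)         # about 4.243, 6.708, and 137.5 degrees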
Example 7. Let f (x) = x and g(x) = sin x be two vectors of the inner product space from Example
5, where we assume [a, b] = [0, 2π]. Determine the lengths of f and g, and θ(f, g).
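Again the integrals can be checked numerically (a sketch assuming SciPy; inner_c is as before):

    import numpy as np
    from scipy.integrate import quad

    def inner_c(f, g, a, b):
        value, _ = quad(lambda x: f(x) * g(x), a, b)
        return value

    a, b = 0.0, 2.0 * np.pi
    f = lambda x: x
    g = np.sin

    len_f = np.sqrt(inner_c(f, f, a, b))   # sqrt(8*pi^3/3), about 9.09
    len_g = np.sqrt(inner_c(g, g, a, b))   # sqrt(pi), about 1.77
    theta = np.degrees(np.arccos(inner_c(f, g, a, b) / (len_f * len_g)))
    print(len_f, len_g, theta)             # the angle is about 112.9 degrees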
Vector v of an inner product space V is said to be a unit vector iff |v| = 1. For example, in R2,
(0, 1) and (1/√2, 1/√2) are examples of unit vectors, since
|(0, 1)| = √(0² + 1²) = 1,
and
|(1/√2, 1/√2)| = √((1/√2)² + (1/√2)²) = √(1/2 + 1/2) = 1.
Note that, for any nonzero vector v, we can always find a unit vector that has the same direction as
v. This is done by scaling v with 1/|v|. Indeed
|(1/|v|) v| = (1/|v|) |v| = 1.
When a nonzero vector v is replaced by (1/|v|) v, we say that v has been normalized, or that (1/|v|) v is
the normalization of v.
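In code, normalization is a one-liner (a sketch assuming NumPy; the name normalize is ours):

    import numpy as np

    def normalize(v):
        # Scale a nonzero vector by 1/|v| so that the result has length 1.
        return v / np.linalg.norm(v)

    v = np.array([3.0, 4.0])
    u = normalize(v)            # (0.6, 0.8), same direction as v
    print(np.linalg.norm(u))    # 1.0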
As an exercise, normalize the vector f(x) = x², considered as a vector of C[0, 1] (with the inner product from Example 5).
In addition to unit vectors, another important concept is that of perpendicularity. Recall that two
geometric vectors u and v are perpendicular provided they make a 90◦ angle. In terms of the dot
product, this would imply that
cos 90◦ = 0 = u · v / (|u||v|),
which implies that u · v = 0. This gives us an idea of how to generalize the concept of perpendic-
ularity to an arbitrary inner product space. However, the term “orthogonal” is used instead of the
term “perpendicular”. To generalize, vectors u and v in some inner product space are said to be
orthogonal iff
<u, v> = 0.
Example 8. Show that the following pairs of vectors are orthogonal.
b. Matrices
A = [ −1   3 ]   and   B = [ 5  −1 ]
    [  2  −2 ]             [ 0  −4 ]
c. Functions f (x) = cos(2x) and g(x) = cos(3x) that belong to C[0, 2π].
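Parts b and c can be verified numerically (a sketch assuming NumPy and SciPy, reusing the inner products from Examples 4 and 5):

    import numpy as np
    from scipy.integrate import quad

    # Part b: the matrix inner product from Example 4.
    A = np.array([[-1.0, 3.0], [2.0, -2.0]])
    B = np.array([[5.0, -1.0], [0.0, -4.0]])
    print(np.sum(A * B))   # 0.0, so A and B are orthogonal

    # Part c: the integral inner product on C[0, 2*pi].
    value, _ = quad(lambda x: np.cos(2 * x) * np.cos(3 * x), 0.0, 2.0 * np.pi)
    print(value)           # 0, up to floating-point error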
Orthonormal Bases
Recall that a basis for a vector space V is a linearly independent set of vectors B that spans V.
Moreover, basis B is said to be orthonormal iff
1. every basis vector e ∈ B is a unit vector; i.e. |e| = 1, and
2. for any two distinct basis vectors e1, e2 ∈ B, <e1, e2> = 0; i.e. e1 and e2 are orthogonal.
As an example, e1 = (1, 0, 0), e2 = (0, 1, 0), and e3 = (0, 0, 1) is an orthonormal basis for R3, since
all three vectors are clearly unit vectors (verify!), and, if i ≠ j,
<ei, ej> = ei · ej = 0,
since ei's only nonzero component is the i-th, while ej's only nonzero component is the j-th, and i ≠ j. For
example,
<e1, e3> = e1 · e3 = (1)(0) + (0)(0) + (0)(1) = 0.
Example 9. Consider the subspace W of C[0, 2π] that is spanned by the vectors
Verify that these vectors are pairwise orthogonal, and then normalize them to form an orthonormal
basis for W.
The following theorem provides one reason for the importance of orthonormal bases. It states that,
given an orthonormal basis, one can readily write any vector v as a linear combination of the basis
vectors, since the ei coefficient is none other than <v, ei >.
Theorem 2. Suppose e1 , . . . , en is an orthonormal basis for an inner product space V. Then for any
vector v ∈ V,
v = <v, e1 >e1 + · · · + <v, en >en .
Proof of Theorem 2. Since e1, . . . , en is a basis for V, there are scalars c1, . . . , cn for which
v = c1e1 + · · · + cnen.
Taking the inner product of both sides with ei, and using additivity, scalar associativity, and
orthonormality (<ej, ei> = 0 for j ≠ i and <ei, ei> = 1), gives
<v, ei> = c1<e1, ei> + · · · + cn<en, ei> = ci.
Therefore, it must be the case that ci = <v, ei>, and the proof is complete.
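The theorem is easy to test numerically. In the sketch below (assuming NumPy; the basis is a made-up rotation of the standard basis of R2), the coefficients of v are recovered as inner products:

    import numpy as np

    # An orthonormal basis of R2: the standard basis rotated by 0.3 radians.
    e1 = np.array([np.cos(0.3), np.sin(0.3)])
    e2 = np.array([-np.sin(0.3), np.cos(0.3)])

    v = np.array([2.0, -1.0])
    c1, c2 = np.dot(v, e1), np.dot(v, e2)     # c_i = <v, e_i>

    assert np.allclose(v, c1 * e1 + c2 * e2)  # v = <v,e1>e1 + <v,e2>e2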
Theorem 3. Suppose v1, . . . , vn are pairwise orthogonal nonzero vectors; i.e., <vi, vj> = 0 for all
i ≠ j. Then v1, . . . , vn are linearly independent.
Proof of Theorem 3. Suppose
c1v1 + · · · + cnvn = 0.
Taking the inner product of both sides with vi yields ci<vi, vi> = 0, since <vj, vi> = 0 for every
j ≠ i. Moreover, <vi, vi> > 0 because vi ≠ 0, and so ci = 0. Since i was arbitrary, every coefficient
must vanish, and hence v1, . . . , vn are linearly independent.
Figure 1: Projection of vector a onto the subspace spanned by b
Gram-Schmidt Algorithm
Suppose v1 , . . . , vn is a basis for inner product space V, but not necessarily an orthonormal basis.
We now provide an algorithm, called the Gram-Schmidt Orthonormalization Algorithm, for
converting a basis v1 , . . . , vn into an orthonormal basis e1 , . . . , en . To describe the algorithm, it is
helpful to have the notion of the orthogonal projection of a vector onto a subspace: if W is a
finite-dimensional subspace of V with orthonormal basis e1, . . . , en, and v ∈ V, then define
proj(v, W) = <v, e1>e1 + · · · + <v, en>en.
Later we show that this definition is independent of the given orthonormal basis (i.e. the same
projection vector will be computed, regardless of chosen orthonormal basis). We may think of
proj(v, W) as the component of v that lies in W, while v − proj(v, W) is the component of v that
lies outside of W. Figure 1 shows an example of this. Here the subspace W is the space spanned by
vector b. Moreover, a1 is the component of a that lies in W, while a2 is the component of a that lies
outside of W. Notice how a2 looks perpendicular to b. This is no coincidence. It turns out that
the “outside” component is always orthogonal to W. This is proven in the following theorem.
Theorem 4. Let W be a finite-dimensional subspace of an inner product space V, and let v ∈ V.
Then v − proj(v, W) is orthogonal to W; i.e.,
<v − proj(v, W), w> = 0
for all w ∈ W.
Proof of Theorem 4. Let e1 , . . . , en be an orthonormal basis for W (later we will prove that every
finite-dimensional inner product space has such a basis). Since every w ∈ W is a linear combination
of e1, . . . , en, it suffices to prove that <v − proj(v, W), ei> = 0 for each i. Indeed,
<v − proj(v, W), ei> = <v − (<v, e1>e1 + · · · + <v, en>en), ei> =
<v, ei> − <v, ei><ei, ei> = <v, ei> − <v, ei> = 0,
where the second equality uses the fact that <ej, ei> = 0 whenever j ≠ i.
We are now in a position to describe the Gram-Schmidt algorithm for converting a finite basis
v1 , . . . , vn to an orthonormal basis e1 , . . . , en . The algorithm works in stages.
Stage 1 Let e1 = v1 /|v1 |. Thus span(v1 ) = span(e1 ), since the vectors are multiples of each other.
Stage k: k ≥ 2 Assume that e1 , . . . , ek−1 have been constructed, where each ei is a unit vector, and
the vectors are pairwise orthogonal. Moreover, assume that
span(e1, . . . , ek−1) = span(v1, . . . , vk−1).
Then set
ek = (vk − proj(vk, {e1, . . . , ek−1})) / |vk − proj(vk, {e1, . . . , ek−1})|.
In other words, ek is the normalization of the difference between vk and its projection onto the
subspace generated by e1, . . . , ek−1. Notice that vk − proj(vk, {e1, . . . , ek−1}) ≠ 0, since otherwise
vk would lie in span(e1, . . . , ek−1) = span(v1, . . . , vk−1), contradicting the linear independence of
v1, . . . , vn.
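The algorithm translates directly into code. Below is a minimal NumPy sketch for vectors in Rn with the dot product as the inner product (the function name gram_schmidt is ours); each stage subtracts the projection onto the span of the e's constructed so far and then normalizes:

    import numpy as np

    def gram_schmidt(vectors):
        # Convert linearly independent vectors into an orthonormal list
        # spanning the same subspace, one stage at a time.
        basis = []
        for v in vectors:
            w = v.astype(float)
            for e in basis:
                w = w - np.dot(w, e) * e   # subtract the projection onto span of e's
            basis.append(w / np.linalg.norm(w))
        return basis

    # Example 10's basis:
    vs = [np.array([1, 0, 0]), np.array([1, 1, 0]), np.array([1, 1, 1])]
    for e in gram_schmidt(vs):
        print(e)   # (1,0,0), (0,1,0), (0,0,1)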
Example 10. Use the Gram-Schmidt algorithm to convert the basis v1 = (1, 0, 0), v2 = (1, 1, 0),
v3 = (1, 1, 1) into an orthonormal basis for R3.
In Stage 1, since |v1| = 1, we have e1 = v1 = (1, 0, 0). Next,
v2 − proj(v2 , {e1 }) = v2 − <v2 , e1 >e1 = (1, 1, 0) − (1, 0, 0) = (0, 1, 0).
Since this vector is already normalized, we have e2 = (0, 1, 0).
Finally,
v3 − proj(v3 , {e1 , e2 }) = v3 − <v3 , e1 >e1 − <v3 , e2 >e2 = (1, 1, 1) − (1, 0, 0) − (0, 1, 0) = (0, 0, 1).
Again, since this vector is already normalized, we have e3 = (0, 0, 1). Therefore, the orthonormal
basis is e1 = (1, 0, 0), e2 = (0, 1, 0), e3 = (0, 0, 1).
Example 11. Repeat the previous example, but now assume the basis ordering is v1 = (1, 1, 1),
v2 = (1, 1, 0), v3 = (1, 0, 0).
More on Projections
Recall that proj(v, W) is defined in terms of an orthonormal basis for finite-dimensional subspace
W. In this section we first prove that proj(v, W) is independent of the chosen basis. This is proved
in the following theorem.
Theorem 5. Let W be a finite-dimensional subspace of an inner product space V. Then every v ∈ V
can be written uniquely as
v = w + w⊥,
where w ∈ W and w⊥ is orthogonal to W. Moreover, w = proj(v, W), regardless of which orthonormal
basis of W is used to compute the projection.
Proof of Theorem 5. Let v ∈ V be arbitrary. Using the Gram-Schmidt algorithm, one can compute
an orthonormal basis e1 , . . . , en for W. This basis can then be used to compute w = proj(v, W) and
w⊥ = v−proj(v, W). Then by Theorem 4, w ∈ W and w⊥ is orthogonal to W. Moreover, v = w+w⊥ .
It remains to show that w and w⊥ are unique.
Suppose then that v = w2 + w2⊥ is another such decomposition, where w2 ∈ W and w2⊥ is orthogonal
to W. Then
w + w⊥ = w2 + w2⊥,
which implies
w − w2 = w2⊥ − w⊥.
Now, since w⊥ and w2⊥ are both orthogonal to W, then so is w2⊥ − w⊥, since, for any vector u ∈ W,
<w2⊥ − w⊥, u> = <w2⊥, u> − <w⊥, u> = 0 − 0 = 0.
But w − w2 belongs to W and equals w2⊥ − w⊥, which is orthogonal to W; hence <w − w2, w − w2> = 0,
and positivity forces w − w2 = 0. Therefore w = w2 and w⊥ = w2⊥, which proves uniqueness.
The next fact to establish about proj(v, W) is that it is the closest vector in W to v. In other words,
if we want to approximate v with some vector in W, then proj(v, W) is the best choice, since it
minimizes the distance |v − w| amongst all possible vectors w ∈ W. To prove this we first need to
establish the “Pythagorean Theorem” for inner product spaces.
Theorem 6 (Pythagorean Theorem). If V is an inner product space with inner product <, >
and u, v ∈ V satisfy <u, v> = 0, then
|u + v|² = |u|² + |v|².
Proof of Theorem 6. Suppose u, v ∈ V satisfy <u, v> = 0. Then
|u + v|² = <u + v, u + v> = <u, u> + <v, v> + 2<u, v> = |u|² + |v|².
Theorem 7. Let W be a finite-dimensional subspace of an inner product space V, and let v ∈ V.
Then for every w ∈ W,
|v − w|² = |v − proj(v, W)|² + |proj(v, W) − w|²,
meaning that w = proj(v, W) minimizes the quantity |v − w| amongst all possible w ∈ W. The
equation follows from Theorem 6, since v − w = (v − proj(v, W)) + (proj(v, W) − w), where the first
summand is orthogonal to W (by Theorem 4) and the second lies in W.
In the above equation, notice how the first term on the right depends only on v. Thus, if |v − w|2
(and hence |v − w|) is to be minimized, then the second term on the right must be made as small
as possible. But this term can be made to equal zero by assigning w = proj(v, W). Therefore,
w = proj(v, W) is the vector in W that minimizes |v − w|.
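The following sketch (assuming NumPy; proj and the chosen subspace are our own illustrative choices) computes a projection and checks that it beats another candidate from W:

    import numpy as np

    def proj(v, basis):
        # proj(v, W) = <v,e1>e1 + ... + <v,en>en for an orthonormal basis of W.
        return sum(np.dot(v, e) * e for e in basis)

    # W = the xy-plane in R3, with orthonormal basis e1, e2.
    basis = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]
    v = np.array([2.0, 3.0, 4.0])

    p = proj(v, basis)                 # (2, 3, 0)
    w = np.array([2.5, 2.0, 0.0])      # some other vector in W
    assert np.linalg.norm(v - p) <= np.linalg.norm(v - w)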
Exercises
1. Find the angle that is made between each of the following pairs of vectors.
2. Show that for V = R2 , if <u, v> is defined by <u, v> = u · v, then all four inner-product
properties are satisfied.
3. Show that for V = R2, if <u, v> is defined by <u, v> = 2u1v1 + 6u2v2, then all four inner-product
properties are satisfied.
4. For vectors p, q of the polynomial space P2, define <p, q> = p(0)q(0) + p(1/2)q(1/2) + p(1)q(1).
Show that all four inner-product properties are satisfied.
5. For vectors p, q ∈ P2, define <p, q> = p(0)q(0) + p(1)q(1). Show that <, > is not an inner
product for P2.
6. Let
A = [ 1   1 ]   and   B = [ 1  −2 ]
    [ 1  −1 ]             [ 3   0 ]
be vectors of the inner product space from Example 4. Determine the lengths of A and B, and
θ(A, B).
7. Let
A = [ 2   6 ]   and   B = [ 3  2 ]
    [ 1  −3 ]             [ 1  0 ]
be vectors of the inner product space from Example 4. Determine the lengths of A and B, and
θ(A, B).
8. Let p(x) = x² and q(x) = x. Use the inner product from Exercise 4 to determine |p|, |q|, and
θ(p, q).
9. Let p(x) = x² − x + 2 and q(x) = 5x² + 1. Use the inner product from Exercise 4 to determine
|p|, |q|, and θ(p, q).
10. Use the inner product from Example 5 on the vector space C[−1, 1] to find |f|, |g|, and θ(f, g)
for f(x) = x and g(x) = x².
11. Use the inner product from Example 5 on the vector space C[0, 2π] to find |f |, |g|, and θ(f, g)
for f (x) = x and g(x) = cos x.
12. Let
A = [ 2  −3 ]   and   B = [ −1   3 ]
    [ 1   5 ]             [ −3   0 ]
be vectors of the inner product space from Example 4. Normalize A and B.
13. Let p(x) = 4x² + 6x + 1 and q(x) = 10x − 4. Use the inner product from Exercise 4 to normalize
p and q.
14. Let e1 , . . . , en be an orthonormal basis for a subspace W of vector space V . Prove that if v is
orthogonal to each ei , then v is orthogonal to every vector w ∈ W.
15. Is v1 = (2/3, −2/3, 1/3), v2 = (2/3, 1/3, −2/3), and v3 = (1/3, 2/3, 2/3) an orthonormal set of
vectors?
16. Is v1 = (1, 0, 0), v2 = (0, 1/√2, 1/√2), and v3 = (0, 0, 1) an orthonormal set of vectors?
17. Use the Gram-Schmidt algorithm to transform v1 = (1, −3), v2 = (2, 2) into an orthonormal
basis.
18. Use the Gram-Schmidt algorithm to transform v1 = (1, 0), v2 = (3, −5) into an orthonormal
basis.
19. Use the Gram-Schmidt algorithm to transform v1 = (1, 1, 1), v2 = (−1, 1, 0), v3 = (1, 2, 1) into
an orthonormal basis.
20. Use the Gram-Schmidt algorithm to transform v1 = (1, 0, 0), v2 = (3, 7, −2), v3 = (0, 4, 1) into
an orthonormal basis.
Exercise Solutions
1. a) 10.3◦ , b) 113.2◦ , c) 0◦ , d) 142.8◦
2. Symmetry:
u · v = u1v1 + u2v2 = v1u1 + v2u2 = v · u.
Additivity:
<u, v + w> = u1(v1 + w1) + u2(v2 + w2) = (u1v1 + u2v2) + (u1w1 + u2w2) = u · v + u · w.
Scalar Associativity:
k<u, v> = k(u1v1 + u2v2) = (ku1)v1 + (ku2)v2 = <ku, v>.
Positivity:
u · u = u1² + u2² ≥ 0,
and can only equal zero when u1 = u2 = 0; i.e. u = 0.
3. Symmetry:
<u, v> = 2u1v1 + 6u2v2 = 2v1u1 + 6v2u2 = <v, u>.
Additivity:
<u, v + w> = 2u1(v1 + w1) + 6u2(v2 + w2) = (2u1v1 + 2u1w1) + (6u2v2 + 6u2w2) = <u, v> + <u, w>.
Scalar Associativity:
k<u, v> = k(2u1 v1 + 6u2 v2 ) = k(2u1 v1 ) + k(6u2 v2 ) = 2(ku1 )v1 + 6(ku2 )v2 = <ku, v>.
Positivity:
<u, u> = 2u1² + 6u2² ≥ 0,
and can only equal zero when u1 = u2 = 0; i.e. u = 0.
4. Let p, q, r be polynomials in P2 .
Symmetry:
<p, q> = p(0)q(0) + p(1/2)q(1/2) + p(1)q(1) = q(0)p(0) + q(1/2)p(1/2) + q(1)p(1) = <q, p>.
Additivity:
<p, q + r> = p(0)(q(0) + r(0)) + p(1/2)(q(1/2) + r(1/2)) + p(1)(q(1) + r(1)) =
(p(0)q(0) + p(0)r(0)) + (p(1/2)q(1/2) + p(1/2)r(1/2)) + (p(1)q(1) + p(1)r(1)) =
(p(0)q(0) + p(1/2)q(1/2) + p(1)q(1)) + (p(0)r(0) + p(1/2)r(1/2) + p(1)r(1)) = <p, q> + <p, r>.
Scalar Associativity:
k<p, q> = k(p(0)q(0) + p(1/2)q(1/2) + p(1)q(1)) = (kp(0))q(0) + (kp(1/2))q(1/2) + (kp(1))q(1) = <kp, q>.
Positivity:
<p, p> = (p(0))² + (p(1/2))² + (p(1))² ≥ 0,
and can only equal zero when p(0) = p(1/2) = p(1) = 0. But the only way an (at most)
second-degree polynomial can have three roots is if it is the zero polynomial; i.e. p(x) = 0
for all x.
5. The positivity property fails: if <p, p> = (p(0))² + (p(1))² = 0, then it only implies that p has
roots at x = 0 and x = 1. For example, if p(x) = x² − x, then <p, p> = 0, but p ≠ 0.
6. |A| = √4 = 2, |B| = √14, and <A, B> = (1)(1) + (1)(−2) + (1)(3) + (−1)(0) = 2.
Therefore,
θ(A, B) = cos⁻¹(2/(2√14)) ≈ 74.5◦.
7. |A| = √50 = 5√2, |B| = √14, and <A, B> = (2)(3) + (6)(2) + (1)(1) + (−3)(0) = 19.
Therefore,
θ(A, B) = cos⁻¹(19/(5√2 √14)) ≈ 44.1◦.
8. |p| = √17/4, |q| = √5/2, and <p, q> = (0)(0) + (1/4)(1/2) + (1)(1) = 9/8.
Therefore,
θ(p, q) = cos⁻¹((9/8)/((√17/4)(√5/2))) = cos⁻¹(9/(√17 √5)) ≈ 12.5◦.
9. Here p(0) = 2, p(1/2) = 7/4, p(1) = 2, while q(0) = 1, q(1/2) = 9/4, q(1) = 6. Thus
<p, p> = 4 + 49/16 + 4 = 177/16 and <q, q> = 1 + 81/16 + 36 = 673/16, so that |p| = √177/4 and
|q| = √673/4, while <p, q> = 2 + 63/16 + 12 = 287/16.
Therefore,
θ(p, q) = cos⁻¹(287/(√177 √673)) ≈ 33.7◦.
10.
|f|² = ∫_{−1}^{1} f²(x) dx = ∫_{−1}^{1} x² dx = [x³/3]_{−1}^{1} = 2/3.
Therefore, |f| = √(2/3). Similarly,
|g|² = ∫_{−1}^{1} g²(x) dx = ∫_{−1}^{1} x⁴ dx = [x⁵/5]_{−1}^{1} = 2/5.
Therefore, |g| = √(2/5). Finally,
<f, g> = ∫_{−1}^{1} f(x)g(x) dx = ∫_{−1}^{1} x³ dx = [x⁴/4]_{−1}^{1} = 0.
Therefore, f and g are orthogonal, and the angle between them is 90◦ .
11.
|f|² = ∫_0^{2π} f²(x) dx = ∫_0^{2π} x² dx = [x³/3]_0^{2π} = 8π³/3.
Therefore, |f| = √(8π³/3). Similarly,
|g|² = ∫_0^{2π} g²(x) dx = ∫_0^{2π} cos² x dx = ∫_0^{2π} (1 + cos(2x))/2 dx = π.
Therefore, |g| = √π. Finally,
<f, g> = ∫_0^{2π} f(x)g(x) dx = ∫_0^{2π} x cos x dx = [x sin x]_0^{2π} − ∫_0^{2π} sin x dx = 0 + [cos x]_0^{2π} = 0.
Therefore, f and g are orthogonal, and the angle between them is 90◦ .
12. |A| = √39, while |B| = √19. Therefore,
A/|A| = [ 2/√39  −3/√39 ]   and   B/|B| = [ −1/√19   3/√19 ]
        [ 1/√39   5/√39 ]                 [ −3/√19   0     ]
13.
|p| = √(1² + 5² + 11²) = √147,
while
|q| = √((−4)² + 1² + 6²) = √53.
Therefore, p/|p| = (4x² + 6x + 1)/√147, while q/|q| = (10x − 4)/√53.
14. Let w ∈ W be arbitrary. Then w = c1e1 + · · · + cnen for some coefficients c1, . . . , cn. Then
<v, w> = c1<v, e1> + · · · + cn<v, en> = 0,
since <v, ei> = 0 for each i. Therefore v is orthogonal to every vector w ∈ W.
15. Yes, all vectors have unit length and are pairwise orthogonal.
16. No: <v2, v3> = 1/√2 ≠ 0.
17. The orthonormal basis is e1 = (1/√10, −3/√10), e2 = (3/√10, 1/√10).
24. c1 = <v, e1> = 0.5, c2 = <v, e2> = −20/√53, c3 = <v, e3> = 255/√11925.