Calculus III Lecture Notes

Technical University of Mombasa

Uploaded by

fchuchu789
Copyright
© © All Rights Reserved
MTH 2321 Notes CONTENTS

Calculus III Lecture Notes, Baylor


Jonathan Stanfill

Contents
1 Vectors: A New Way to View Space
2 Parametrizations and Vector-Valued Functions
3 Calculus of Vector-Valued Functions
4 Arc Length and Speed
5 The Dot Product
6 The Cross Product
7 Planes in 3-Space
8 Curves and Surfaces
9 Polar, Spherical, and Cylindrical Coordinates
10 Curvature
11 Motion in 3-Space
12 Multi-variable Functions and Parametrized Surfaces
13 Partial Derivatives
14 Differentiability and Tangent Planes
15 Parametrized Surfaces, Tangent Planes, and Curvature
16 The Gradient and Directional Derivatives
17 The Chain Rule
18 Optimization
19 Constrained Optimization
20 Integration in Two Variables
21 Double Integrals
22 Triple Integrals
23 Change of Variables
24 Vector Fields
25 Line Integrals
26 Conservative Vector Fields
27 Surface Integrals
28 Surface Integrals of Vector Fields
29 Green’s Theorem
30 Stokes’ Theorem
31 Divergence Theorem

MTH 2321 Notes 1 VECTORS: A NEW WAY TO VIEW SPACE

1 Vectors: A New Way to View Space (12.1, 12.2)


In this section we will view the plane $\mathbb{R}^2$ and three-space $\mathbb{R}^3$ in a new way: not only as sets of ordered pairs or triples, but as vector spaces, i.e. spaces containing vectors. Vectors are used in practically all areas of mathematics and its applications. In physical settings they describe quantities that have both a magnitude and a direction (as anyone who has seen Despicable Me knows), e.g. forces and velocities. In particular, they are fundamental objects in Newtonian mechanics, quantum physics, special and general relativity, and electricity and magnetism. They are also used in practical applications such as computer graphics, economics, and statistics. Vectors will play a fundamental role in this course as well: we will learn to manipulate them algebraically and geometrically, and they allow for simple statements generalizing the Fundamental Theorem of Calculus. Along this theme, this course provides a basis to extend what you have learned in Cal 1 and Cal 2 into the three-dimensional (and sometimes higher-dimensional) setting. This will allow us to explore the physical settings and applications mentioned above, and we will encounter a number of deeper mathematical concepts.

We will start by viewing vectors in $\mathbb{R}^2$ and $\mathbb{R}^3$ geometrically. First, we need to get oriented.

$\mathbb{R}^2$ is oriented as the standard $xy$-plane, with the positive $x$-axis pointing right and the positive $y$-axis pointing up.

$\mathbb{R}^3$ is oriented as 3-space via the so-called right-hand rule: point your right hand so your fingers curl from the positive $x$-axis to the positive $y$-axis. Then your thumb is pointing towards the positive $z$-axis.

There are three coordinate planes in $\mathbb{R}^3$, each defined by setting one coordinate equal to zero. The $xy$-plane is defined by setting $z = 0$, the $xz$-plane is defined by setting $y = 0$, and the $yz$-plane is defined by setting $x = 0$.

We define vectors as geometric objects in the plane or in 3-space as follows.

Definition 1.1.
• A vector is an arrow with a direction and a length, or “magnitude,” but no fixed position. We will denote vectors by bold letters, $\mathbf{v}$, or with arrows, $\vec{v}$.
• A vector is determined by two points: a base point (or tail) and a terminal point (or head).
• For two points $P$ and $Q$, the arrow from $P$ to $Q$ is the vector $\mathbf{v} = \overrightarrow{PQ}$ with base point $P$ and terminal point $Q$. If $O$ is the origin, we call $\mathbf{v} = \overrightarrow{OP}$ the position vector of the point $P$.
• The magnitude of $\mathbf{v} = \overrightarrow{PQ}$ is $\|\mathbf{v}\|$, which is the length of the arrow and the distance from $P$ to $Q$.


Example 1.2. Let’s consider vectors in the plane $\mathbb{R}^2$.

[Figure: the vectors u, v, and w drawn in the plane, with the points O, P1, Q1, P2, Q2, and Q3 marked.]

Here we have the following:
$$\mathbf{u} = \overrightarrow{P_1Q_1}, \quad \mathbf{v} = \overrightarrow{P_2Q_2}, \quad \mathbf{w} = \overrightarrow{OQ_3}.$$
Here $\mathbf{u}$ and $\mathbf{v}$ are the same arrow, just between different points, i.e. they have the same length and the same direction.
Also, $\mathbf{w}$ is the position vector of the point $Q_3 = (-2, 1)$.

Definition 1.3. Two vectors $\mathbf{v}$ and $\mathbf{u}$ (either in $\mathbb{R}^2$ or $\mathbb{R}^3$) of non-zero length are parallel if the lines through them are parallel. Note that parallel vectors point either in the same direction or in opposite directions.
We say that $\mathbf{u}$ is a translate of $\mathbf{v}$ if $\mathbf{u}$ and $\mathbf{v}$ have the same length and direction (but might have different base points). We say that translates are equivalent.

Definition 1.4. For any two points $P = (x_1, y_1, z_1)$ and $Q = (x_2, y_2, z_2)$, the components of $\mathbf{v} = \overrightarrow{PQ}$ are
$$x = x_2 - x_1, \quad y = y_2 - y_1, \quad z = z_2 - z_1$$
and we write
$$\mathbf{v} = \langle x, y, z \rangle.$$
Here $x$ is called the x-component, $y$ is called the y-component, and $z$ is called the z-component. There is a similar definition for vectors in $\mathbb{R}^2$.

By geometry, we have the following:

The length of $\mathbf{v} = \langle x, y \rangle$ is given by
$$\|\mathbf{v}\| = \sqrt{x^2 + y^2},$$
and the length of $\mathbf{v} = \langle x, y, z \rangle$ is given by
$$\|\mathbf{v}\| = \sqrt{x^2 + y^2 + z^2}.$$
Definition 1.5. The zero vector in $\mathbb{R}^2$ is $\mathbf{0} = \langle 0, 0 \rangle$ and has length 0. Similarly, the zero vector in $\mathbb{R}^3$ is $\mathbf{0} = \langle 0, 0, 0 \rangle$.

The components of a vector determine the length and direction, but not the base point - i.e. two vectors are
equivalent if and only if they have the same components. It is easy to see that every vector is equivalent to
a unique vector based at the origin - just translate the arrow so its base point is the origin. We adopt the
standard convention that all vectors are based at the origin unless otherwise stated.
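Example 1.6. Determine if $\mathbf{v} = \overrightarrow{PQ}$ and $\mathbf{w} = \overrightarrow{RS}$ are equivalent if
$$P = (0, 0, 0), \quad Q = (0, 3, 7), \quad R = (-4, 12, 2), \quad S = (-4, 9, 3).$$
As $\mathbf{v} = \overrightarrow{PQ} = \langle 0 - 0, 3 - 0, 7 - 0 \rangle = \langle 0, 3, 7 \rangle$ and $\mathbf{w} = \overrightarrow{RS} = \langle -4 - (-4), 9 - 12, 3 - 2 \rangle = \langle 0, -3, 1 \rangle$, the vectors $\mathbf{v}$ and $\mathbf{w}$ are not equivalent, as they have different components.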
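
The component comparison in the example above can be sketched in a few lines of Python. This is only an illustration: the helper names `components` and `equivalent` are made up for this sketch and do not come from the notes.

```python
from math import isclose

def components(p, q):
    """Components of the vector from point p to point q (q minus p, coordinate-wise)."""
    return tuple(qi - pi for pi, qi in zip(p, q))

def equivalent(v, w, tol=1e-12):
    """Two vectors are equivalent exactly when all of their components agree."""
    return len(v) == len(w) and all(isclose(a, b, abs_tol=tol) for a, b in zip(v, w))

# The points from the example above.
P, Q = (0, 0, 0), (0, 3, 7)
R, S = (-4, 12, 2), (-4, 9, 3)

v = components(P, Q)
w = components(R, S)
print(v, w, equivalent(v, w))
```

Comparing components is enough because the base point plays no role in equivalence.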


We now consider ways to combine vectors algebraically.


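Definition 1.7. Scaling a vector means changing its length by a scale factor without changing the line along which it points (a negative scale factor reverses the direction). For example:

[Figure: the vectors v, 2v, (1/2)v, and −v.]

Because we use numbers to scale a vector, we will often refer to real numbers as scalars.
Scalar Multiplication: If $\mathbf{v} = \langle v_1, v_2, v_3 \rangle$ then, for any scalar $\lambda$, $\lambda\mathbf{v} = \lambda \langle v_1, v_2, v_3 \rangle = \langle \lambda v_1, \lambda v_2, \lambda v_3 \rangle$.
Scalar multiplication is similar for vectors in $\mathbb{R}^2$.

Notice that $0 \cdot \mathbf{v} = \mathbf{0}$ for any vector $\mathbf{v}$, and that for any scalar $\lambda$ and vector $\mathbf{v}$, the scalar multiple $\lambda\mathbf{v}$:
• has length $\|\lambda\mathbf{v}\| = |\lambda| \cdot \|\mathbf{v}\|$,
• points in the same direction as $\mathbf{v}$ if $\lambda > 0$, and
• points in the opposite direction to $\mathbf{v}$ if $\lambda < 0$.
Thus, a vector $\mathbf{v}$ is parallel to a vector $\mathbf{w}$ if and only if $\mathbf{w} = \lambda\mathbf{v}$ for some non-zero scalar $\lambda$.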
Definition 1.8. Vector addition between two vectors with the same base point can be pictured by placing the arrows head to tail. Formally, we have the Parallelogram Law: If $\mathbf{v}$ and $\mathbf{w}$ have the same base point, then $\mathbf{v} + \mathbf{w}$ is the vector from the base point to the opposite vertex of the parallelogram formed by $\mathbf{v}$ and $\mathbf{w}$.

[Figure: the parallelogram formed by v and w and their translates v' and w', with diagonal v + w = w + v.]

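We can similarly define vector subtraction as $\mathbf{v} - \mathbf{w} = \mathbf{v} + (-\mathbf{w})$. The following properties can be verified using the Parallelogram Law and geometry.
Proposition 1.9 (Properties of Vector Algebra). For all vectors $\mathbf{u} = \langle u_1, u_2, u_3 \rangle$, $\mathbf{v} = \langle v_1, v_2, v_3 \rangle$, and $\mathbf{w} = \langle w_1, w_2, w_3 \rangle$, and any scalar $\lambda$,
• $\mathbf{u} + \mathbf{v} = \langle u_1, u_2, u_3 \rangle + \langle v_1, v_2, v_3 \rangle = \langle u_1 + v_1, u_2 + v_2, u_3 + v_3 \rangle$, and hence $\mathbf{u} + \mathbf{0} = \mathbf{u}$
• $\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}$, i.e. vector addition is commutative
• $\mathbf{u} + (\mathbf{v} + \mathbf{w}) = (\mathbf{u} + \mathbf{v}) + \mathbf{w}$, i.e. vector addition is associative
• $\lambda(\mathbf{u} + \mathbf{v}) = \lambda\mathbf{u} + \lambda\mathbf{v}$, i.e. scalar multiplication is distributive
• $\lambda\mathbf{u} = \lambda \langle u_1, u_2, u_3 \rangle = \langle \lambda u_1, \lambda u_2, \lambda u_3 \rangle$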


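Definition 1.10. A linear combination of vectors $\mathbf{v}, \mathbf{w}$ is any vector of the form $\alpha \cdot \mathbf{v} + \beta \cdot \mathbf{w}$ where $\alpha, \beta$ are scalars.

Why is this useful?

In general, to write a vector $\mathbf{u} = \langle u_1, u_2, u_3 \rangle$ as a linear combination of two vectors $\mathbf{v} = \langle v_1, v_2, v_3 \rangle$ and $\mathbf{w} = \langle w_1, w_2, w_3 \rangle$, one has to solve a system of linear equations:
$$\begin{aligned}
& \mathbf{u} = \alpha \cdot \mathbf{v} + \beta \cdot \mathbf{w} \\
\iff{} & \langle u_1, u_2, u_3 \rangle = \alpha \langle v_1, v_2, v_3 \rangle + \beta \langle w_1, w_2, w_3 \rangle \\
\iff{} & u_1 = \alpha v_1 + \beta w_1 \text{ and } u_2 = \alpha v_2 + \beta w_2 \text{ and } u_3 = \alpha v_3 + \beta w_3.
\end{aligned}$$

So, vectors give a way of visualizing this system geometrically, with a solution given via a parallelogram!
This relationship between vectors and systems of equations is valid for any number of variables (e.g. vectors in $\mathbb{R}^{20}$), and is the beginning of linear algebra.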

Proposition 1.11. If $\mathbf{v}, \mathbf{w} \in \mathbb{R}^2$ are not parallel (i.e. not co-linear), then any vector in $\mathbb{R}^2$ can be written as a linear combination of $\mathbf{v}$ and $\mathbf{w}$. Such a pair is called a basis of $\mathbb{R}^2$.
Similarly, if $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w} \in \mathbb{R}^3$ are not co-planar (meaning they do not lie within the same plane), then any vector in $\mathbb{R}^3$ can be written as a linear combination of $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$. Such a triple is called a basis of $\mathbb{R}^3$.
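
To make the planar case concrete, here is a minimal Python sketch that finds the scalars for a given pair of non-parallel vectors. It solves the two component equations with Cramer's rule, which is a choice made for this sketch, not a method taken from the notes. The function name `linear_combination_2d` and the sample vectors are hypothetical.

```python
def linear_combination_2d(u, v, w):
    """Return (alpha, beta) with u = alpha*v + beta*w for non-parallel v, w in the plane.

    Solves u1 = alpha*v1 + beta*w1 and u2 = alpha*v2 + beta*w2 by Cramer's rule.
    """
    det = v[0] * w[1] - v[1] * w[0]
    if det == 0:
        # A zero determinant means v and w are parallel (or one of them is zero).
        raise ValueError("v and w are parallel, so they do not form a basis")
    alpha = (u[0] * w[1] - u[1] * w[0]) / det
    beta = (v[0] * u[1] - v[1] * u[0]) / det
    return alpha, beta

# Illustrative data: write <5, 1> in terms of <1, 1> and <1, -1>.
alpha, beta = linear_combination_2d((5, 1), (1, 1), (1, -1))
print(alpha, beta)
```

The determinant vanishes exactly when the two vectors are parallel, which is the case the proposition excludes.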
Definition 1.12. A vector of length one is called a unit vector. The unit vector pointing in the direction of $\mathbf{v}$, for $\mathbf{v} \neq \mathbf{0}$, is $\mathbf{e}_{\mathbf{v}} = \dfrac{\mathbf{v}}{\|\mathbf{v}\|}$. Unit vectors are often used to indicate a direction when the length is not needed.

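Example 1.13. Unit vectors (based at the origin) in the plane always end on the unit circle, and can be written as $\mathbf{e} = \langle \cos\theta, \sin\theta \rangle$, where $\theta$ is the angle the vector makes with the positive x-axis.

[Figure: a unit vector in the plane ending on the unit circle, with the angle θ marked.]

We can check that such a vector $\mathbf{e}$ is a unit vector:
$$\|\mathbf{e}\| = \sqrt{\cos^2\theta + \sin^2\theta} = 1$$
for any $\theta$.

Unit vectors (based at the origin) in $\mathbb{R}^3$ always end on the unit sphere, and can be written as $\mathbf{e} = \langle \sin\phi\cos\theta, \sin\phi\sin\theta, \cos\phi \rangle$, where $\theta$ is the angle the projection of the vector into the xy-plane makes with the positive x-axis and $\phi$ is the angle the vector makes with the positive z-axis.

[Figure: a unit vector in 3-space ending on the unit sphere, with the angles θ and φ marked.]

We can check that such a vector $\mathbf{e}$ is a unit vector:
$$\|\mathbf{e}\| = \sqrt{\sin^2\phi\cos^2\theta + \sin^2\phi\sin^2\theta + \cos^2\phi} = 1$$
for any $\theta, \phi$.

Remark: Along with Definition 1.12, one can now write $\mathbf{v} = \|\mathbf{v}\|\,\mathbf{e}_{\mathbf{v}}$ in terms of angles that depend on the vector.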
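
As a quick numerical check of the unit-vector formulas, the following Python sketch normalizes a vector and confirms that the angle forms have length one. The helper names are made up for this sketch, and the spherical formula uses one common convention (θ measured in the xy-plane from the positive x-axis, φ from the positive z-axis), which is an assumption here.

```python
from math import sqrt, cos, sin, pi

def norm(v):
    """Length of a vector given as a tuple of components."""
    return sqrt(sum(c * c for c in v))

def unit(v):
    """Unit vector in the direction of a non-zero vector v, i.e. v divided by its length."""
    n = norm(v)
    if n == 0:
        raise ValueError("the zero vector has no direction")
    return tuple(c / n for c in v)

def planar_unit(theta):
    """Unit vector in the plane at angle theta from the positive x-axis."""
    return (cos(theta), sin(theta))

def spherical_unit(theta, phi):
    """Unit vector in 3-space (assumed convention: theta in the xy-plane, phi from the z-axis)."""
    return (sin(phi) * cos(theta), sin(phi) * sin(theta), cos(phi))

e = unit((3, 4))
print(e, norm(e))
print(norm(planar_unit(pi / 7)), norm(spherical_unit(pi / 5, pi / 3)))
```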


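Definition 1.14. The unit vectors that point in the direction of the positive axes in $\mathbb{R}^2$ and $\mathbb{R}^3$ are called the standard basis vectors. They are:
$$\mathbf{i} = \langle 1, 0 \rangle \text{ and } \mathbf{j} = \langle 0, 1 \rangle \text{ in } \mathbb{R}^2$$
and
$$\mathbf{i} = \langle 1, 0, 0 \rangle, \quad \mathbf{j} = \langle 0, 1, 0 \rangle, \text{ and } \mathbf{k} = \langle 0, 0, 1 \rangle \text{ in } \mathbb{R}^3.$$
As $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$ are not coplanar (they are in fact mutually perpendicular), any vector in $\mathbb{R}^3$ can be written as a linear combination of them! Namely,
$$\langle a, b, c \rangle = a \cdot \mathbf{i} + b \cdot \mathbf{j} + c \cdot \mathbf{k}.$$
Similarly for $\mathbf{i}, \mathbf{j}$ in $\mathbb{R}^2$.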

The following theorem is rather useful.


Theorem 1.15 (The Triangle Inequality). For any two vectors $\mathbf{v}$ and $\mathbf{w}$,
$$\|\mathbf{v} + \mathbf{w}\| \le \|\mathbf{v}\| + \|\mathbf{w}\|.$$
Further, equality holds if and only if $\mathbf{v} = \mathbf{0}$, $\mathbf{w} = \mathbf{0}$, or $\mathbf{v} = \lambda\mathbf{w}$ for some $\lambda \ge 0$.

Sketch of Proof. The idea of the proof is simply that the shortest distance between any two points is a straight line. The picture below illustrates this and indicates why this is called the triangle inequality.

[Figure: a triangle with sides v, w, and v + w.]

Example 1.16. Find the magnitude of the force on rope A and rope B if the weight W = 10 lbf (pounds of force) and the weight is not moving.

[Figure: the weight W hanging from Rope A and Rope B, with the angles 60° and 30° marked.]

The forces acting on the weight are the force of the weight, $\mathbf{W}$, with a magnitude of W = 10 lbf, acting vertically downwards, and the forces $\mathbf{F}_A$ and $\mathbf{F}_B$ acting through ropes A and B, respectively. Let’s redraw this diagram with these forces represented by vectors originating from the weight at the origin.

[Figure: the force vectors F_A and F_B drawn from the origin, with the angles 120° and 30° marked.]


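Since the weight is not moving, these forces must cancel each other out, i.e.
$$\mathbf{W} + \mathbf{F}_A + \mathbf{F}_B = \mathbf{0}.$$
Let $\|\mathbf{F}_A\| = f_A$ and $\|\mathbf{F}_B\| = f_B$. Then
$$\mathbf{F}_A = f_A \langle \cos 120^\circ, \sin 120^\circ \rangle = \left\langle \frac{-f_A}{2}, \frac{f_A\sqrt{3}}{2} \right\rangle,$$
$$\mathbf{F}_B = f_B \langle \cos 30^\circ, \sin 30^\circ \rangle = \left\langle \frac{f_B\sqrt{3}}{2}, \frac{f_B}{2} \right\rangle,$$
$$\mathbf{W} = 10 \langle \cos(-90^\circ), \sin(-90^\circ) \rangle = \langle 0, -10 \rangle.$$

Substituting these in the above equation we get
$$\mathbf{0} = \langle 0, -10 \rangle + \left\langle \frac{-f_A}{2}, \frac{f_A\sqrt{3}}{2} \right\rangle + \left\langle \frac{f_B\sqrt{3}}{2}, \frac{f_B}{2} \right\rangle = \left\langle \frac{-f_A}{2} + \frac{f_B\sqrt{3}}{2},\; -10 + \frac{f_A\sqrt{3}}{2} + \frac{f_B}{2} \right\rangle.$$

Equating the components we get
$$0 = \frac{-f_A}{2} + \frac{f_B\sqrt{3}}{2}$$
and
$$0 = -10 + \frac{f_A\sqrt{3}}{2} + \frac{f_B}{2}.$$

The first equation gives $f_A = f_B\sqrt{3}$, and substituting this into the second equation gives
$$0 = -10 + \frac{3f_B}{2} + \frac{f_B}{2} \implies f_B = 5$$
and hence
$$f_A = 5\sqrt{3}.$$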
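
The result can be checked numerically: with these two magnitudes, the three force vectors should sum to (approximately) the zero vector. The short Python sketch below does this check, with variable names chosen only for illustration.

```python
from math import cos, sin, radians, sqrt

# Magnitudes found in the example above.
f_A = 5.0 * sqrt(3)
f_B = 5.0

# Force vectors, using the angles measured from the positive x-axis.
F_A = (f_A * cos(radians(120)), f_A * sin(radians(120)))
F_B = (f_B * cos(radians(30)), f_B * sin(radians(30)))
W = (0.0, -10.0)

# Net force: should vanish (up to floating-point rounding) if the weight is at rest.
net = tuple(w + a + b for w, a, b in zip(W, F_A, F_B))
print(net)
```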

MTH 2321 Notes 2 PARAMETRIZATIONS AND VECTOR-VALUED FUNCTIONS

2 Parametrizations and Vector-Valued Functions (11.1, 12.2, 13.1)


Remember from last section that our ultimate goal is to extend the concepts learned in Cal 1 and Cal 2 into
the three dimensional (and sometimes higher dimensional) settings. This means we want to explore a way
to do calculus in 2 or 3 (or higher) dimensions with vectors!

Recall that in Calculus 1 we investigate the behavior of functions (and corresponding graphs) using derivatives
and integrals. So, to do calculus in the vector setting, we need a vector version of functions and graphs! In this
section we will define such a version of functions and consider the motion of a particle along a curve in the
plane or 3-space.

Definition 2.1. A curve in $\mathbb{R}^2$ is called a plane curve.

Not every plane curve is the graph of a function $y = f(x)$!

Definition 2.2. A curve in $\mathbb{R}^3$ is called a space curve.

As not every plane curve is the graph of a function $y = f(x)$, and we don’t yet have a way to describe space curves algebraically at all, we need a new way to describe curves. Consider a particle moving along the plane curve C graphed below.

[Figure: the plane curve C, with the particle's positions at t = 0, t = 1, and t = 2 marked.]

We can describe the motion of this particle by specifying the coordinates at which the particle is located at a given time:
$$x = x(t), \quad y = y(t),$$
i.e., the function
$$c(t) = (x(t), y(t))$$
gives the coordinates of the point at which the particle is located at time $t$.

Notice that this curve is not the graph of a function $y = f(x)$.


Definition 2.3. If $c(t) = (x(t), y(t))$ traces out the curve C, we call $c(t)$ a parametrization of the parametric curve (or path) C with parameter $t$. The equations $x = x(t)$ and $y = y(t)$ are called parametric equations.

We can then view such a function as a vector-valued function, that is, a function whose output is a vector. In the context of the above example, we can view the particle’s path as being represented by the vector-valued function $\mathbf{c}(t) = \langle x(t), y(t) \rangle = x(t)\mathbf{i} + y(t)\mathbf{j}$. Such a vector-valued function is called a vector parametrization of the path C with parameter $t$. We also have vector parametrizations for space curves.
Definition 2.4. If a particle’s coordinates in $\mathbb{R}^3$ are given by $c(t) = (x(t), y(t), z(t))$, then the vector-valued function
$$\mathbf{c}(t) = \langle x(t), y(t), z(t) \rangle = x(t)\mathbf{i} + y(t)\mathbf{j} + z(t)\mathbf{k}$$
is a vector parametrization of the curve traced out by the particle with parameter $t$ and components $x(t)$, $y(t)$, and $z(t)$.
Example 2.5. For ease and familiarity, let’s start with a curve that is the graph of a function. Consider a particle moving along the graph of $y = x^2$, starting at the origin and moving to the point (2, 4).

[Figure: the graph of y = x^2 from (0, 0) to (2, 4), with the particle's positions at t = 0, t = 1, and t = 2 marked.]

The points on this curve are
$$(x, x^2), \quad x \in [0, 2],$$
so a parametrization of this curve giving the particle’s position at time $t$ is given by
$$\mathbf{r}(t) = \langle x(t), y(t) \rangle = t\,\mathbf{i} + t^2\,\mathbf{j}, \quad t \in [0, 2].$$
This is not the only parametrization! Another is
$$\mathbf{s}(t) = \langle t^2, t^4 \rangle, \quad t \in \left[0, \sqrt{2}\right].$$
Notice that both paths start at (0, 0) and, following the graph of $y = x^2$, end at (2, 4).

The previous example shows us that parametrizations are not unique. In fact, every curve can be
parametrized infinitely many ways.
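
A short Python sketch can confirm that the two parametrizations from the example trace the same curve: sampled points of each lie on the parabola, and both share the same start and end points. The helper `on_parabola` and the sample counts are illustrative choices.

```python
from math import sqrt, isclose

def r(t):
    """First parametrization of the parabola: <t, t^2>, for t in [0, 2]."""
    return (t, t**2)

def s(t):
    """Second parametrization of the same curve: <t^2, t^4>, for t in [0, sqrt(2)]."""
    return (t**2, t**4)

def on_parabola(point, tol=1e-9):
    """True if the point satisfies y = x^2 up to a small tolerance."""
    x, y = point
    return isclose(y, x**2, abs_tol=tol)

# Sample each path on its own parameter interval.
r_samples = [r(2 * k / 10) for k in range(11)]
s_samples = [s(sqrt(2) * k / 10) for k in range(11)]

print(all(on_parabola(p) for p in r_samples + s_samples))
print(r(0), s(0), r(2), s(sqrt(2)))
```

The two paths visit the same points at different times, which is why a curve and a parametrization of it are different objects.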

One major difference between R2 and R3 is that a line in R2 is given by a single linear equation (y = mx + b),
but a similar equation in R3 defines a plane. We can, however, give a uniform way to describe lines in
parametric form.
Example 2.6. Find parametric equations for the line through the point $P = (2, 3, 4)$ with direction $\mathbf{v} = \langle 6, 5, 1 \rangle$.

To find this, we will find the line with direction $\mathbf{v} = \langle 6, 5, 1 \rangle$ through the origin and then translate it to go through the point $P = (2, 3, 4)$. The line through the origin with direction $\mathbf{v}$ consists of all points that are terminal points of vectors pointing in the same or the opposite direction as $\mathbf{v}$. Hence, the line with direction $\mathbf{v} = \langle 6, 5, 1 \rangle$ through the origin is given by
$$\mathbf{l}(t) = t\mathbf{v} = t \langle 6, 5, 1 \rangle.$$
Now we translate this line to go through the point $P$ by adding the position vector of the point $P$:
$$\mathbf{r}(t) = \langle 2, 3, 4 \rangle + t \langle 6, 5, 1 \rangle.$$


The previous example can be generalized as follows.


Definition 2.7. The parametric form of the line L parallel to $\mathbf{v} = \langle v_1, v_2, v_3 \rangle$ through the point $P = (p_1, p_2, p_3)$ is
$$\mathbf{r}(t) = \mathbf{p}_0 + t\mathbf{v} = \langle p_1, p_2, p_3 \rangle + t \langle v_1, v_2, v_3 \rangle$$
where the vector $\mathbf{p}_0 = \overrightarrow{OP} = \langle p_1, p_2, p_3 \rangle$ is the position vector of the point $P$. The vector $\mathbf{v}$ is called the direction vector for L, and the coordinates of the points on L are given by the parametric equations
$$x = p_1 + v_1 t, \quad y = p_2 + v_2 t, \quad z = p_3 + v_3 t \quad \text{for } -\infty < t < \infty.$$

Note that the above parametrization of L is not unique. However, two lines coincide if they are parallel and pass through a common point, so we can always check whether two given parametrizations describe the same line. We can also determine whether two lines intersect by equating their parametrizations and determining whether there are parameters $t_1$ and $t_2$ at which the parametrizations give the same vector, in which case the lines intersect at the terminal point of said vector.

To find a parametrization of the line between two points $P$ and $Q$, use $P$ as the base point and $\mathbf{v} = \overrightarrow{PQ}$ as the direction vector. Note that the line segment $PQ$ then corresponds to $0 \le t \le 1$.
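
The two-point construction can be sketched in Python as follows. The function name `line_through` is hypothetical, and the second point is chosen here so that the direction vector matches the one in Example 2.6.

```python
def line_through(p, q):
    """Return the parametrization r(t) = p + t*(q - p) of the line through points p and q."""
    v = tuple(qi - pi for pi, qi in zip(p, q))  # direction vector from p to q

    def r(t):
        return tuple(pi + t * vi for pi, vi in zip(p, v))

    return r

# Illustrative points: P = (2, 3, 4) and Q = (8, 8, 5), so the direction is <6, 5, 1>.
r = line_through((2, 3, 4), (8, 8, 5))
print(r(0), r(0.5), r(1))
```

Here t = 0 gives P, t = 1 gives Q, and values of t in between give points on the segment joining them.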

Example 2.8. Find a vector parametrization of the path that traces out the circle of radius $R$ centered at $(a, b)$ exactly twice.

As the circle of radius $R$ centered at $(a, b)$ consists of all points $(x, y)$ at distance $R$ from the point $(a, b)$, the equation for this circle is given by $\sqrt{(x - a)^2 + (y - b)^2} = R$, or equivalently $(x - a)^2 + (y - b)^2 = R^2$.

This looks like the Pythagorean theorem, and we know the Pythagorean identity from trigonometry, so if we take
$$x - a = R\cos\theta \quad \text{and} \quad y - b = R\sin\theta$$
then we will have for all $\theta$
$$(x - a)^2 + (y - b)^2 = (R\cos\theta)^2 + (R\sin\theta)^2 = R^2(\sin^2\theta + \cos^2\theta) = R^2.$$

Thus the parametric equations
$$x(\theta) = R\cos\theta + a \quad \text{and} \quad y(\theta) = R\sin\theta + b$$
parametrize the circle of radius $R$ centered at $(a, b)$. To have the path trace the circle exactly twice, we can take $\theta$ over any half-open interval of length $4\pi$. So one parametrization of the path that traces out the circle of radius $R$ centered at $(a, b)$ exactly twice is given by
$$\mathbf{r}(\theta) = \langle R\cos\theta + a, R\sin\theta + b \rangle, \quad \theta \in [0, 4\pi).$$

Notice that in the last example the path is different from the curve it traces: the curve is just the circle, while the path moves around the circle twice. To have the path trace the curve exactly (i.e. go around the circle exactly once), take a half-open interval of length $2\pi$.

Question: Does the path in the last example travel around the circle clockwise or counterclockwise?

Drawing curves or surfaces in $\mathbb{R}^3$ without the aid of computer-generated images is difficult. One way to help us visualize these curves or surfaces in $\mathbb{R}^3$ is via projection into the coordinate planes.
Definition 2.9. The projection of a path $\mathbf{r}(t) = \langle x(t), y(t), z(t) \rangle$ into the xy-plane is given by $\mathbf{r}_{xy}(t) = \langle x(t), y(t), 0 \rangle$. Similarly, the projections into the xz- and yz-planes are given by $\mathbf{r}_{xz}(t) = \langle x(t), 0, z(t) \rangle$ and $\mathbf{r}_{yz}(t) = \langle 0, y(t), z(t) \rangle$, respectively.


Recall the following equations for surfaces in R3 .


Definition 2.10. A sphere in $\mathbb{R}^3$ of radius $r$ centered at the point $P = (p_1, p_2, p_3)$ consists of all points $(x, y, z)$ that are a distance $r$ from the point $P$. So by the distance formula in $\mathbb{R}^3$ an equation for this sphere is
$$\sqrt{(x - p_1)^2 + (y - p_2)^2 + (z - p_3)^2} = r,$$
or, equivalently,
$$(x - p_1)^2 + (y - p_2)^2 + (z - p_3)^2 = r^2.$$
A right circular cylinder in $\mathbb{R}^3$ of radius $r$ whose central axis is the vertical line through the point $P = (p_1, p_2, 0)$ has cross sections that are circles of radius $r$ parallel to the xy-plane and centered at the points $(p_1, p_2, z)$ for $z \in \mathbb{R}$, i.e., its projection into the xy-plane is the circle of radius $r$ centered at $P$. Such a right circular cylinder consists of all points $(x, y, z)$ such that $(x, y, 0)$ is a distance $r$ from the point $P = (p_1, p_2, 0)$, so by the distance formula in $\mathbb{R}^3$ an equation for this cylinder is
$$\sqrt{(x - p_1)^2 + (y - p_2)^2 + (0 - 0)^2} = r,$$
or, equivalently,
$$(x - p_1)^2 + (y - p_2)^2 = r^2.$$
Notice that this equation doesn’t involve $z$, and is just the equation for the circle in the xy-plane centered at $(p_1, p_2, 0)$!

Similarly, an equation for a right circular cylinder in $\mathbb{R}^3$ of radius $r$ whose central axis is the line parallel to the y-axis (i.e. perpendicular to the xz-plane) through the point $P = (p_1, 0, p_3)$ is given by
$$(x - p_1)^2 + (z - p_3)^2 = r^2,$$
and an equation for a right circular cylinder in $\mathbb{R}^3$ of radius $r$ whose central axis is the line parallel to the x-axis (i.e. perpendicular to the yz-plane) through the point $P = (0, p_2, p_3)$ is given by
$$(y - p_2)^2 + (z - p_3)^2 = r^2.$$

Example 2.11. Find a parametrization of the circle of radius 5 centered at the point $P = (-2, 7, 1)$ located in a plane parallel to the yz-plane.

To do this, we will find a parametrization for the circle of radius 5 centered at the origin located in the yz-plane, then translate this circle to be centered at $P$. Notice that the circle of radius 5 centered at the origin in the yz-plane is exactly the projection of the sphere of radius 5 centered at the origin onto the yz-plane (see the figures here). So, its equation is similar to the one we have seen before, but with $y$ and $z$ instead of $x$ and $y$:
$$y^2 + z^2 = 25.$$

Thus it has a parametrization given by
$$\mathbf{r}(t) = \langle 0, 5\cos t, 5\sin t \rangle.$$

To translate this circle so it is centered at $P$, we add the position vector of the point $P$ to the parametrization above:
$$\mathbf{s}(t) = \langle -2, 7, 1 \rangle + \langle 0, 5\cos t, 5\sin t \rangle = \langle -2, 7 + 5\cos t, 1 + 5\sin t \rangle.$$
Example 2.12. Parametrize the curve C that traces out the intersection of the right circular cylinder of radius 1 whose central axis is the z-axis and the sphere of radius 2 centered at the point (1, 0, 0). See the graphs of these surfaces here.

First we need the equations for these surfaces: the right circular cylinder of radius 1 whose central axis is the z-axis is given by $x^2 + y^2 = 1$, and the sphere of radius 2 centered at the point (1, 0, 0) is given by $(x - 1)^2 + y^2 + z^2 = 4$. We find the parametrization of the intersection in two ways, both of which rely on using both equations.

First way: Use the method of substitution to solve for $y$ and $z$ in terms of $x$:
$$x^2 + y^2 = 1 \implies y^2 = 1 - x^2, \text{ or } y = \pm\sqrt{1 - x^2}.$$

Then plugging this into the second equation we get
$$(x - 1)^2 + y^2 + z^2 = 4 \implies (x - 1)^2 + (1 - x^2) + z^2 = 4 \implies z^2 = 3 - (x - 1)^2 + x^2, \text{ or } z = \pm\sqrt{3 - (x - 1)^2 + x^2}.$$
Now that we have $y$ and $z$ in terms of $x$, we can let $x = t$ to get $y = \pm\sqrt{1 - t^2}$ and $z = \pm\sqrt{3 - (t - 1)^2 + t^2}$ (the expression under the second root simplifies to $2 + 2t$). As the values for $x$ range from $-1$ to $1$, we will take the parameter $t$ from $-1$ to $1$ as well. Since we can combine the $\pm$s in four ways, we need four parametrizations to trace out the entirety of this curve, all with $-1 \le t \le 1$:
$$\mathbf{r}_1(t) = \left\langle t, \sqrt{1 - t^2}, \sqrt{3 - (t - 1)^2 + t^2} \right\rangle, \quad \mathbf{r}_2(t) = \left\langle t, \sqrt{1 - t^2}, -\sqrt{3 - (t - 1)^2 + t^2} \right\rangle,$$
$$\mathbf{r}_3(t) = \left\langle t, -\sqrt{1 - t^2}, \sqrt{3 - (t - 1)^2 + t^2} \right\rangle, \quad \mathbf{r}_4(t) = \left\langle t, -\sqrt{1 - t^2}, -\sqrt{3 - (t - 1)^2 + t^2} \right\rangle.$$

Second way: Use the parametrization of the circle $x^2 + y^2 = 1$: $x(t) = \cos t$ and $y(t) = \sin t$ for $0 \le t < 2\pi$. Plugging this into the equation of the sphere we get
$$(x - 1)^2 + y^2 + z^2 = 4 \implies (\cos t - 1)^2 + \sin^2 t + z^2 = 4 \implies z = \pm\sqrt{4 - (\cos t - 1)^2 - \sin^2 t}.$$

So this time we only need two parametrizations to trace out the entirety of the curve, this time with $0 \le t < 2\pi$:
$$\mathbf{r}_1(t) = \left\langle \cos t, \sin t, \sqrt{4 - (\cos t - 1)^2 - \sin^2 t} \right\rangle \quad \text{and} \quad \mathbf{r}_2(t) = \left\langle \cos t, \sin t, -\sqrt{4 - (\cos t - 1)^2 - \sin^2 t} \right\rangle.$$
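
As a sanity check on the second approach, the Python sketch below samples the upper and lower trigonometric parametrizations and verifies that each sampled point satisfies both the cylinder equation and the sphere equation. The function names and the tolerance are illustrative assumptions.

```python
from math import cos, sin, sqrt, pi, isclose

def upper(t):
    """Upper branch: <cos t, sin t, +sqrt(4 - (cos t - 1)^2 - sin^2 t)>."""
    z_squared = 4 - (cos(t) - 1) ** 2 - sin(t) ** 2
    # Guard against tiny negative values caused by rounding near t = pi.
    return (cos(t), sin(t), sqrt(max(z_squared, 0.0)))

def lower(t):
    """Lower branch: same x and y, with the opposite sign on z."""
    x, y, z = upper(t)
    return (x, y, -z)

def on_both_surfaces(point, tol=1e-9):
    """True if the point lies on x^2 + y^2 = 1 and on (x - 1)^2 + y^2 + z^2 = 4."""
    x, y, z = point
    on_cylinder = isclose(x**2 + y**2, 1.0, abs_tol=tol)
    on_sphere = isclose((x - 1) ** 2 + y**2 + z**2, 4.0, abs_tol=tol)
    return on_cylinder and on_sphere

samples = [2 * pi * k / 50 for k in range(50)]
print(all(on_both_surfaces(upper(t)) and on_both_surfaces(lower(t)) for t in samples))
```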

MTH 2321 Notes 3 CALCULUS OF VECTOR-VALUED FUNCTIONS

3 Calculus of Vector-Valued Functions (13.2)


As this is a calculus course, let’s do some calculus. You may remember the main themes from calculus 1 are
differentiation and integration, and these will be what we focus on as well, in the context of vector-valued
functions.

We now define the limit, continuity, and derivative of a vector-valued function in a way similar to the corresponding definitions for a scalar-valued function.

Definition 3.1. We say that a vector-valued function $\mathbf{r}(t)$ approaches the limit $\mathbf{u}$ as $t \to t_0$, and write $\lim_{t \to t_0} \mathbf{r}(t) = \mathbf{u}$, if
$$\lim_{t \to t_0} \|\mathbf{r}(t) - \mathbf{u}\| = 0.$$

Note that the above limit is a “calculus 1” limit, i.e. a limit of scalars.

As vectors starting at the origin are determined uniquely by their components, we have the following theorem.
Theorem 3.2. Let $\mathbf{r}(t) = \langle x(t), y(t), z(t) \rangle$. Then
$$\lim_{t \to t_0} \mathbf{r}(t) \text{ exists} \iff \text{each of } \lim_{t \to t_0} x(t),\ \lim_{t \to t_0} y(t),\ \lim_{t \to t_0} z(t) \text{ exists},$$
and if the above limits exist, we have
$$\lim_{t \to t_0} \mathbf{r}(t) = \left\langle \lim_{t \to t_0} x(t),\ \lim_{t \to t_0} y(t),\ \lim_{t \to t_0} z(t) \right\rangle.$$

Definition 3.3. We say that $\mathbf{r}(t)$ is continuous at $t_0$ if
$$\lim_{t \to t_0} \mathbf{r}(t) = \mathbf{r}(t_0).$$

By the previous theorem, we immediately see that
$$\mathbf{r}(t) = \langle x(t), y(t), z(t) \rangle \text{ is continuous} \iff \text{each of } x(t),\ y(t),\ z(t) \text{ is continuous}.$$

Definition 3.4. The derivative of $\mathbf{r}(t)$ is defined to be
$$\mathbf{r}'(t) = \frac{d\mathbf{r}}{dt} = \lim_{h \to 0} \frac{\mathbf{r}(t + h) - \mathbf{r}(t)}{h}.$$
If this limit exists we say that $\mathbf{r}(t)$ is differentiable.

By the previous theorem, we immediately get the following corollary.

Corollary 3.5. Let $\mathbf{r}(t) = \langle x(t), y(t), z(t) \rangle$. Then
$$\mathbf{r}'(t) \text{ exists} \iff \text{each of } x'(t),\ y'(t),\ z'(t) \text{ exists},$$
and if the above derivatives exist, we have
$$\mathbf{r}'(t) = \langle x'(t), y'(t), z'(t) \rangle.$$

We say that vector-valued derivatives are computed componentwise. Higher derivatives of $\mathbf{r}(t)$ are defined as in the scalar-valued case.


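Example 3.6. Let $\mathbf{r}(t) = \langle t^2 - e^{2t}, \sin(\pi t), \ln t \rangle$.

Find $\lim_{t \to 1} \mathbf{r}(t)$.
$$\lim_{t \to 1} \mathbf{r}(t) = \lim_{t \to 1} \langle t^2 - e^{2t}, \sin(\pi t), \ln t \rangle = \left\langle \lim_{t \to 1} (t^2 - e^{2t}),\ \lim_{t \to 1} \sin(\pi t),\ \lim_{t \to 1} \ln t \right\rangle = \langle 1 - e^2, 0, 0 \rangle.$$
Find $\mathbf{r}''(t)$.
$$\mathbf{r}'(t) = \left\langle 2t - 2e^{2t},\ \pi\cos(\pi t),\ \frac{1}{t} \right\rangle, \qquad \mathbf{r}''(t) = \left\langle 2 - 4e^{2t},\ -\pi^2\sin(\pi t),\ \frac{-1}{t^2} \right\rangle$$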
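
Componentwise derivatives are easy to check numerically. The Python sketch below compares the closed-form first derivative from the example against a central-difference approximation applied to each component. The step size `h` and the helper name `central_difference` are assumptions of this sketch.

```python
from math import exp, sin, cos, log, pi

def r(t):
    """The vector-valued function from the example, as a tuple of components."""
    return (t**2 - exp(2 * t), sin(pi * t), log(t))

def r_prime(t):
    """Closed-form derivative, computed component by component."""
    return (2 * t - 2 * exp(2 * t), pi * cos(pi * t), 1 / t)

def central_difference(f, t, h=1e-6):
    """Approximate f'(t) componentwise by (f(t + h) - f(t - h)) / (2h)."""
    forward, backward = f(t + h), f(t - h)
    return tuple((a - b) / (2 * h) for a, b in zip(forward, backward))

t0 = 1.5
print(r_prime(t0))
print(central_difference(r, t0))
```

The two printed tuples should agree to several decimal places, up to the error of the finite-difference approximation.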
As vector-valued derivatives are computed componentwise, the following rules from scalar-valued derivatives are immediate.
Theorem 3.7. Assume that $\mathbf{r}(t)$ and $\mathbf{s}(t)$ are differentiable vector-valued functions, and $f(t)$ is a differentiable scalar-valued function.
• Sum Rule: $\dfrac{d}{dt}(\mathbf{r}(t) + \mathbf{s}(t)) = \mathbf{r}'(t) + \mathbf{s}'(t)$
• Scalar Multiple Rule: For any scalar $\lambda$, $\dfrac{d}{dt}(\lambda\mathbf{r}(t)) = \lambda\mathbf{r}'(t)$
• Product Rule for Scalar-Valued Functions: $\dfrac{d}{dt}(f(t)\mathbf{r}(t)) = f(t)\mathbf{r}'(t) + f'(t)\mathbf{r}(t)$
• Chain Rule for Scalar-Valued Functions: $\dfrac{d}{dt}(\mathbf{r}(f(t))) = \mathbf{r}'(f(t))f'(t)$
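Example 3.8. Let $\mathbf{r}(t) = \langle t^7, 4, \cos t \rangle$ and $f(t) = 2^t$.

Find $\mathbf{r}'(t)$ and $f'(t)$.
$$\mathbf{r}'(t) = \langle 7t^6, 0, -\sin t \rangle \quad \text{and} \quad f'(t) = 2^t \ln 2$$
Find $\dfrac{d}{dt}(f(t)\mathbf{r}(t))$.
$$\begin{aligned}
\frac{d}{dt}(f(t)\mathbf{r}(t)) &= f(t)\mathbf{r}'(t) + f'(t)\mathbf{r}(t) \\
&= 2^t \frac{d}{dt}\langle t^7, 4, \cos t \rangle + \frac{d}{dt}\left(2^t\right) \langle t^7, 4, \cos t \rangle \\
&= 2^t \langle 7t^6, 0, -\sin t \rangle + 2^t \ln 2\, \langle t^7, 4, \cos t \rangle \\
&= \left\langle t^7 2^t \ln(2) + 7t^6 2^t,\ 4(2^t)\ln(2),\ \cos(t)\,2^t \ln(2) - \sin(t)\,2^t \right\rangle
\end{aligned}$$
Find $\dfrac{d}{dt}(\mathbf{r}(f(t)))$.
$$\begin{aligned}
\frac{d}{dt}(\mathbf{r}(f(t))) &= \mathbf{r}'(f(t))f'(t) \\
&= \mathbf{r}'(2^t)(2^t \ln 2) \\
&= \left\langle 7(2^t)^6, 0, -\sin(2^t) \right\rangle (2^t \ln 2) \\
&= \left\langle 7(\ln 2)\,2^{7t},\ 0,\ -2^t \ln(2)\sin(2^t) \right\rangle
\end{aligned}$$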
As in the scalar-valued function case, the derivative of a vector-valued function can be interpreted geometrically. Recall that for $y = f(x)$, $f'(a)$ is the slope of the line tangent to the graph of $f(x)$ at $x = a$. The derivative of a vector-valued function can be interpreted in the same way! That is, the derivative vector $\mathbf{r}'(t_0)$ (if it exists and is non-zero) points in the direction of motion tangent to the path of $\mathbf{r}(t)$ at the time $t = t_0$. This is illustrated below.

[Figure: the path r(t) with the position vectors r(t0) and r(t0 + h) drawn from the origin O and the difference r(t0 + h) − r(t0); a second panel shows the limit as h → 0.]

Definition 3.9. $\mathbf{r}'(t_0)$ is the tangent vector or velocity vector at $\mathbf{r}(t_0)$.

We can then construct the tangent line using the parametrization of a line given a direction and a point.
Definition 3.10. The tangent vector $\mathbf{r}'(t_0)$, if it exists and is non-zero, is a direction vector for the line tangent to the path $\mathbf{r}(t)$, and so the tangent line at $\mathbf{r}(t_0)$ has a vector parametrization
$$\mathbf{L}(t) = \mathbf{r}(t_0) + t\,\mathbf{r}'(t_0).$$

As with scalar-valued functions, we are interested in when vector-valued functions have a derivative equal
to zero (i.e. the zero vector), as this is where the path “stops.” We will explore this with an example.
Example 3.11. If a wheel of radius 1 rolls along the x-axis, a point on the rim of the wheel will trace out a curve called a cycloid. Assume a given point $P$ starts at the origin as the wheel rolls. Find a parametrization $\mathbf{r}(t)$ for the curve traced out by this point and find the points where $\mathbf{r}'(t) = \mathbf{0}$ and the points where $\mathbf{r}'(t)$ is horizontal and non-zero. See animation of this here.

The point $P$ starts at the origin. At time $t$, the wheel has rolled through an angle of $t$ radians, so it has traveled a distance $t$ along the x-axis and the center C of the circle is $C = (t, 1)$. Then to translate from the center of the circle to the position of $P$, we shift by the differences in coordinates: down by $\cos t$ and left by $\sin t$ (see the picture below).

[Figure: the wheel with center C = (t, 1), radius 1, and angle t, with the horizontal offset sin t and the vertical offset cos t from C to P marked.]

So, the path traced out by $P$ is
$$\mathbf{r}(t) = \langle t - \sin t, 1 - \cos t \rangle, \quad t \ge 0.$$

From the animation, we can see that the tangent vector will be zero at the cusps occurring at even multiples of $\pi$, and the tangent vector will be horizontal and non-zero at the odd multiples of $\pi$. Let’s show this algebraically.

The tangent vector is given by $\mathbf{r}'(t) = \langle 1 - \cos t, \sin t \rangle$. Then $\mathbf{r}'(t) = \mathbf{0}$ if
$$1 - \cos t = 0 \text{ and } \sin t = 0 \implies t = 2k\pi,\ k \in \mathbb{N}.$$

$\mathbf{r}'(t)$ is horizontal and non-zero if it has a y-component of zero and a non-zero x-component, i.e.
$$1 - \cos t \neq 0 \text{ and } \sin t = 0 \implies t = (2k - 1)\pi,\ k \in \mathbb{N}.$$
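
The two cases can also be checked numerically. The Python sketch below evaluates the cycloid's tangent vector at a few even and odd multiples of π; the range of multiples and the tolerance are arbitrary choices for illustration.

```python
from math import cos, sin, pi, isclose

def r_prime(t):
    """Tangent vector of the cycloid r(t) = <t - sin t, 1 - cos t>."""
    return (1 - cos(t), sin(t))

def is_zero(v, tol=1e-9):
    """True if every component is zero up to floating-point rounding."""
    return all(isclose(c, 0.0, abs_tol=tol) for c in v)

# Even multiples of pi (the cusps) and odd multiples of pi (the tops of the arches).
even_times = [2 * k * pi for k in range(4)]
odd_times = [(2 * k - 1) * pi for k in range(1, 5)]

print([is_zero(r_prime(t)) for t in even_times])
print([r_prime(t) for t in odd_times])
```

At the odd multiples the first component is 2 and the second is zero up to rounding, so the tangent vector there is horizontal and non-zero.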

We’ve now dealt with derivatives of vector-valued functions, but how about antiderivatives? We can formally
define an integral of a vector-valued function using Riemann Sums, as in calculus 1. For ease, we won’t do
that. Instead we define integration componentwise.


Definition 3.12. The definite integral over $[a, b]$ of $\mathbf{r}(t) = \langle x(t), y(t), z(t) \rangle$ is defined by
$$\int_a^b \mathbf{r}(t)\,dt = \left\langle \int_a^b x(t)\,dt,\ \int_a^b y(t)\,dt,\ \int_a^b z(t)\,dt \right\rangle.$$

The integral exists if and only if each of the component integrals exists. As they are defined this way, vector-valued integrals obey the same linearity rules as scalar-valued integrals.

Definition 3.13. An antiderivative of $\mathbf{r}(t)$ is a vector-valued function $\mathbf{R}(t)$ such that $\mathbf{R}'(t) = \mathbf{r}(t)$.

Recall that for a scalar-valued function $f(x)$, two antiderivatives of $f(x)$ differ by only a constant. By applying this to each component, we get the following.
Theorem 3.14. If $\mathbf{R}_1'(t) = \mathbf{R}_2'(t)$, then
$$\mathbf{R}_1(t) = \mathbf{R}_2(t) + \mathbf{C}$$
for some constant vector $\mathbf{C}$.

This leads to the following definition, which is similar to the scalar-valued case.
Definition 3.15. The indefinite integral or general antiderivative of $\mathbf{r}(t)$ is
$$\int \mathbf{r}(t)\,dt = \mathbf{R}(t) + \mathbf{C}$$
where $\mathbf{R}(t)$ is an antiderivative of $\mathbf{r}(t)$ and $\mathbf{C}$ is a constant vector.

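Example 3.16. Let $\mathbf{r}(t) = \langle e^{2t}, t^3, \ln(t) \rangle$.

Find $\displaystyle\int \mathbf{r}(t)\,dt$.
$$\int \mathbf{r}(t)\,dt = \left\langle \frac{e^{2t}}{2},\ \frac{t^4}{4},\ t(\ln(t) - 1) \right\rangle + \mathbf{C}$$
Find $\displaystyle\int_1^e \mathbf{r}(t)\,dt$.
$$\int_1^e \mathbf{r}(t)\,dt = \left\langle \int_1^e e^{2t}\,dt,\ \int_1^e t^3\,dt,\ \int_1^e \ln(t)\,dt \right\rangle$$
Then by the First Fundamental Theorem of Calculus,
$$\begin{aligned}
&= \left\langle \left.\frac{e^{2t}}{2}\right|_1^e,\ \left.\frac{t^4}{4}\right|_1^e,\ \Big. t(\ln(t) - 1)\Big|_1^e \right\rangle \\
&= \left\langle \frac{e^{2e}}{2} - \frac{e^2}{2},\ \frac{e^4}{4} - \frac{1}{4},\ e(1 - 1) - 1(0 - 1) \right\rangle \\
&= \left\langle \frac{e^{2e} - e^2}{2},\ \frac{e^4 - 1}{4},\ 1 \right\rangle
\end{aligned}$$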
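
A numerical integration gives an independent check of each component of the definite integral. The Python sketch below uses composite Simpson's rule, a standard quadrature method chosen here for illustration rather than taken from the notes. It compares the approximations against the closed-form values (e^{2e} − e^2)/2, (e^4 − 1)/4, and 1, the last being the integral of ln t from 1 to e.

```python
from math import exp, log, e

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule approximation of the integral of f over [a, b]."""
    if n % 2:
        n += 1  # Simpson's rule needs an even number of subintervals
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Integrate each component of r(t) = <e^(2t), t^3, ln t> over [1, e].
numeric = (
    simpson(lambda t: exp(2 * t), 1.0, e),
    simpson(lambda t: t**3, 1.0, e),
    simpson(log, 1.0, e),
)

# Closed-form values obtained from the antiderivatives.
closed_form = ((exp(2 * e) - exp(2)) / 2, (e**4 - 1) / 4, 1.0)

print(numeric)
print(closed_form)
```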
The above example illustrates how the First Fundamental Theorem of Calculus generalizes to vector valued
functions. Note that this is only the first of a few generalizations aof the First Fundamental Theorem of
Calculus that we will see this semester!
Theorem 3.17 (The Fundamental Theorem of Calculus for Vector-Valued Functions). If $\mathbf{r}(t)$ is continuous
on $[a, b]$ and $\mathbf{R}(t)$ is an antiderivative of $\mathbf{r}(t)$, then
$$\int_a^b \mathbf{r}(t)\,dt = \mathbf{R}(b) - \mathbf{R}(a).$$
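As a sanity check of Theorem 3.17, the sketch below integrates each component of the function from Example 3.16 numerically and compares the result with R(b) − R(a). The helper `simpson` and the variable names are illustrative choices, not part of the notes, and Simpson's rule is only one of many ways to approximate the component integrals.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Components of r(t) = <e^{2t}, t^3, ln t> and of an antiderivative R(t).
r = (lambda t: math.exp(2 * t), lambda t: t ** 3, math.log)
R = (lambda t: math.exp(2 * t) / 2,
     lambda t: t ** 4 / 4,
     lambda t: t * (math.log(t) - 1))

a, b = 1.0, math.e
numeric = [simpson(f, a, b) for f in r]   # integrate component by component
exact = [F(b) - F(a) for F in R]          # R(b) - R(a), one component at a time
```

The two lists should agree up to the quadrature error, which illustrates that the vector-valued integral is nothing more than three scalar integrals computed side by side.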


4 Arc Length and Speed (13.3)


In this section we use integration to find the arc length of a path. Given a path (in the plane or in 3-space)
we can approximate its arc length by taking polygonal approximations. See the figure below.

We can find the exact length of the curve by letting the number of points we use to approximate go to
infinity. It’s not hard to imagine that this acts like a Riemann sum, and in fact it (almost) does! We can
get over the “almost” by assuming the curve has a continuously differentiable parametrization (i.e. $\mathbf{r}'(t)$ is
continuous).

The following theorems and definitions are valid for both plane and space curves, using the appropriate norm
$\|\cdot\|$. Recall that for a plane curve given by $\mathbf{r}(t) = \langle x(t), y(t) \rangle$,
$$\|\mathbf{r}(t)\| = \sqrt{x(t)^2 + y(t)^2} \quad\text{and}\quad \mathbf{r}'(t) = \langle x'(t), y'(t) \rangle,$$
and for a space curve given by $\mathbf{r}(t) = \langle x(t), y(t), z(t) \rangle$,
$$\|\mathbf{r}(t)\| = \sqrt{x(t)^2 + y(t)^2 + z(t)^2} \quad\text{and}\quad \mathbf{r}'(t) = \langle x'(t), y'(t), z'(t) \rangle.$$

Theorem 4.1. If $\mathbf{r}(t)$ is differentiable and $\mathbf{r}'(t)$ is continuous on $[a, b]$, then the length $s$ of the path $\mathbf{r}(t)$ for
$a \le t \le b$ is
$$s = \int_a^b \|\mathbf{r}'(t)\|\,dt.$$

Definition 4.2. The arc length function is
$$s(t) = \int_a^t \|\mathbf{r}'(u)\|\,du.$$

This function gives the arc length of the path r(u) over [a, t].

As speed is, by definition, the rate of change of the distance traveled with respect to t, by the Second
Fundamental Theorem of Calculus we have the following.
Definition 4.3. The speed at time $t$ for a parametrized curve $\mathbf{r}(t)$ with arc length function $s(t)$ is
$$\frac{ds}{dt} = \|\mathbf{r}'(t)\|.$$
This is why $\mathbf{r}'(t)$ is called the velocity vector and the tangent vector - it points in the direction of motion
and its length is the instantaneous speed. We will write
$$\mathbf{v}(t) = \mathbf{r}'(t) \quad\text{and}\quad v(t) = \|\mathbf{v}(t)\|.$$


Example 4.4. An ant starts at the origin and walks along a curve in space. The ant’s position after $t$ seconds
is given by $\mathbf{r}(t) = \langle t, t^2, t^3 \rangle$. Find a function that gives the distance walked by the ant after $t$ seconds.

We are looking for the arc length function for $\mathbf{r}(t)$ over $[0, t]$. Since $\mathbf{r}'(t) = \langle 1, 2t, 3t^2 \rangle$, the distance walked
by the ant after $t$ seconds is given by
$$s(t) = \int_0^t \sqrt{1 + 4u^2 + 9u^4}\,du.$$

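This integral cannot be reduced to an elementary closed form with Cal 2 techniques (the integrand is the square root of a quartic with no repeated roots, which leads to an elliptic integral), so in practice $s(t)$ is approximated numerically.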
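Since the arc length function of Example 4.4 is defined by an integral, it can be approximated numerically for any given t. The following is a minimal sketch using Simpson's rule; the function names `speed`, `arc_length`, and `chord` are illustrative and not from the notes.

```python
import math

def speed(u):
    """Speed ||r'(u)|| for r(u) = <u, u^2, u^3>."""
    return math.sqrt(1 + 4 * u ** 2 + 9 * u ** 4)

def arc_length(t, n=1000):
    """Approximate s(t), the integral of the speed over [0, t], by Simpson's rule (n even)."""
    if t == 0:
        return 0.0
    h = t / n
    total = speed(0.0) + speed(t)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * speed(i * h)
    return total * h / 3

def chord(t):
    """Straight-line distance from r(0) to r(t); the path can never be shorter than this."""
    return math.sqrt(t ** 2 + t ** 4 + t ** 6)
```

Comparing `arc_length(t)` with `chord(t)` gives a quick plausibility check, since the distance walked along the curve is at least the straight-line distance between the endpoints.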

We have already seen that a curve can have different parametrizations. In fact, in general we get new
parametrizations by changing the speed.
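Example 4.5. Both $\mathbf{r}_1(t) = \langle t, e^t \rangle$ and $\mathbf{r}_2(u) = \langle 2u, e^{2u} \rangle$ trace out the graph of $y = e^x$. Notice that we
obtain $\mathbf{r}_2(u)$ from $\mathbf{r}_1(t)$ by substituting $t = 2u$ into $\mathbf{r}_1(t)$. The speed of $\mathbf{r}_1(t)$ is given by
$$\|\mathbf{r}_1'(t)\| = \|\langle 1, e^t \rangle\| = \sqrt{1 + e^{2t}},$$
while the speed of $\mathbf{r}_2(u)$ is given by
$$\|\mathbf{r}_2'(u)\| = \|\langle 2, 2e^{2u} \rangle\| = \sqrt{4 + 4e^{4u}} = 2\sqrt{1 + e^{4u}}.$$
So the speed of $\mathbf{r}_2(u)$ is twice the speed of $\mathbf{r}_1(t)$ at the corresponding point $t = 2u$!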

In general, given a parametrization $\mathbf{r}_1(t)$, we can obtain a new parametrization of the same curve by
substituting $t = g(u)$ to get the parametrization $\mathbf{r}_2(u) = \mathbf{r}_1(g(u))$.

While this new parametrization will trace out the same underlying curve, depending on how g(u) acts it
might do so slower, or faster, or slower in some parts and faster in others, or it might stop and go backwards
for a while, etc.

We now identify a special parametrization of a given curve.


Definition 4.6. An arc length parametrization $\mathbf{r}(t)$ of a curve is a parametrization that has a constant
speed of 1, i.e.
$$\|\mathbf{r}'(t)\| = 1 \quad\text{for all } t.$$
Notice that for any arc length parametrization $\mathbf{r}(t)$, the distance traveled over any time interval $[a, b]$ is
exactly $b - a$:
$$\int_a^b \|\mathbf{r}'(t)\|\,dt = \int_a^b 1\,dt = b - a.$$
We demonstrate a general method for finding arc length parametrizations with an example.
Example 4.7. Find an arc length parametrization of the circle in the plane $x = 4$ of radius 2 centered at
$(4, 1, 7)$.

Start with any parametrization of the curve so that $\mathbf{r}'(t) \neq \mathbf{0}$ for all $t$:
$$\mathbf{r}(t) = \langle 4, 1, 7 \rangle + 2\langle 0, \cos t, \sin t \rangle = \langle 4, 1 + 2\cos t, 7 + 2\sin t \rangle.$$
Check that $\mathbf{r}'(t) \neq \mathbf{0}$: $\mathbf{r}'(t) = \langle 0, -2\sin t, 2\cos t \rangle$, which is never $\mathbf{0}$.

Now find the arc length function:
$$s = g(t) = \int_0^t \|\mathbf{r}'(u)\|\,du = \int_0^t \sqrt{4\sin^2 u + 4\cos^2 u}\,du = \int_0^t 2\,du = 2t.$$


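Find the inverse of $s = g(t)$:
$$s = 2t \implies t = \frac{s}{2}.$$
Now substitute this into the original parametrization wherever there is a $t$:
$$\mathbf{r}_1(s) = \mathbf{r}(g^{-1}(s)) = \mathbf{r}\!\left(\frac{s}{2}\right) = \left\langle 4, 1 + 2\cos\frac{s}{2}, 7 + 2\sin\frac{s}{2} \right\rangle.$$
This should be an arc length parametrization! Let’s check:
$$\|\mathbf{r}_1'(s)\| = \left\| \left\langle 0, -\sin\frac{s}{2}, \cos\frac{s}{2} \right\rangle \right\| = \sqrt{\sin^2\frac{s}{2} + \cos^2\frac{s}{2}} = \sqrt{1} = 1.$$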

Note: Most of the time we can’t actually calculate s = g(t) nor find a formula for g −1 (s), so we can only
find arc length parametrizations in some special cases.
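When no formula for the inverse of g is available, the same three steps (compute s = g(t), invert, substitute) can still be carried out numerically. The sketch below does this for the circle of Example 4.7, where the exact answer t = s/2 is known and can be used as a check. The midpoint rule and bisection are choices made here for illustration, and the search interval `hi=10.0` is an assumption that limits s to at most 20.

```python
import math

def r(t):
    """Circle of radius 2 centered at (4, 1, 7) in the plane x = 4."""
    return (4.0, 1 + 2 * math.cos(t), 7 + 2 * math.sin(t))

def speed(t):
    """||r'(t)|| for r'(t) = <0, -2 sin t, 2 cos t>."""
    return math.sqrt((2 * math.sin(t)) ** 2 + (2 * math.cos(t)) ** 2)

def g(t, n=200):
    """Arc length function s = g(t): integral of the speed over [0, t] (midpoint rule)."""
    h = t / n
    return sum(speed((i + 0.5) * h) for i in range(n)) * h

def g_inverse(s, hi=10.0):
    """Solve g(t) = s for t in [0, hi] by bisection (g is increasing)."""
    lo = 0.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if g(mid) < s:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def r1(s):
    """Arc length parametrization r1(s) = r(g^{-1}(s))."""
    return r(g_inverse(s))
```

For a curve whose speed is not constant, only `r` and `speed` would change; the inversion step stays the same as long as the speed is never zero.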


5 The Dot Product (12.3, 13.2)


So far we have introduced vector addition and subtraction and multiplying a vector by a scalar, but we do
not have a way to multiply two vectors. In this section we introduce the first of two very important ways to
multiply vectors.
Definition 5.1. The dot product of two vectors
$$\mathbf{v} = \langle v_1, v_2, v_3 \rangle \quad\text{and}\quad \mathbf{w} = \langle w_1, w_2, w_3 \rangle$$
is the scalar defined by
$$\mathbf{v} \cdot \mathbf{w} = v_1 w_1 + v_2 w_2 + v_3 w_3.$$
That is, the dot product is given by multiplying the corresponding coordinates and adding the results.
The dot product $\mathbf{v} \cdot \mathbf{w}$ is also sometimes called the scalar or inner product, and is often denoted $(\mathbf{v}, \mathbf{w})$
or $\langle \mathbf{v}, \mathbf{w} \rangle$. For vectors in $\mathbb{R}^2$, the dot product is defined similarly,
$$\langle v_1, v_2 \rangle \cdot \langle w_1, w_2 \rangle = v_1 w_1 + v_2 w_2.$$

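Theorem 5.2 (Properties of the Dot Product). For any vectors $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$:
1. Dotting with the zero vector gives the scalar zero, i.e.
$$\mathbf{0} \cdot \mathbf{v} = \mathbf{v} \cdot \mathbf{0} = 0.$$
2. The dot product is commutative, i.e.
$$\mathbf{v} \cdot \mathbf{w} = \mathbf{w} \cdot \mathbf{v}.$$
3. Scalars can be “pulled out” of the dot product, i.e.
$$(\lambda \mathbf{v}) \cdot \mathbf{w} = \mathbf{v} \cdot (\lambda \mathbf{w}) = \lambda (\mathbf{v} \cdot \mathbf{w}).$$
4. The dot product is distributive over vector addition, i.e.
$$\mathbf{u} \cdot (\mathbf{v} + \mathbf{w}) = \mathbf{u} \cdot \mathbf{v} + \mathbf{u} \cdot \mathbf{w}$$
and
$$(\mathbf{v} + \mathbf{w}) \cdot \mathbf{u} = \mathbf{v} \cdot \mathbf{u} + \mathbf{w} \cdot \mathbf{u}.$$
5. The dot product of a vector with itself is the vector’s squared length, i.e.
$$\mathbf{v} \cdot \mathbf{v} = \|\mathbf{v}\|^2.$$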

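Proof. For ease, we will prove the above for vectors in $\mathbb{R}^3$. So let $\mathbf{u} = \langle u_1, u_2, u_3 \rangle$, $\mathbf{v} = \langle v_1, v_2, v_3 \rangle$, and
$\mathbf{w} = \langle w_1, w_2, w_3 \rangle$.

Proof of (1):
$$\mathbf{v} \cdot \mathbf{0} = v_1 0 + v_2 0 + v_3 0 = 0, \qquad \mathbf{0} \cdot \mathbf{v} = 0 v_1 + 0 v_2 + 0 v_3 = 0.$$
Proof of (2):
$$\mathbf{v} \cdot \mathbf{w} = v_1 w_1 + v_2 w_2 + v_3 w_3 = w_1 v_1 + w_2 v_2 + w_3 v_3 = \mathbf{w} \cdot \mathbf{v}.$$
Proof of (3):
$$\begin{aligned}
(\lambda \mathbf{v}) \cdot \mathbf{w} &= (\lambda v_1) w_1 + (\lambda v_2) w_2 + (\lambda v_3) w_3 \\
\mathbf{v} \cdot (\lambda \mathbf{w}) &= v_1 (\lambda w_1) + v_2 (\lambda w_2) + v_3 (\lambda w_3) \\
\lambda (\mathbf{v} \cdot \mathbf{w}) &= \lambda (v_1 w_1) + \lambda (v_2 w_2) + \lambda (v_3 w_3).
\end{aligned}$$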


Notice that the right hand sides in the above are all equal.

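Proof of (4):
$$\begin{aligned}
\mathbf{u} \cdot (\mathbf{v} + \mathbf{w}) &= u_1 (v_1 + w_1) + u_2 (v_2 + w_2) + u_3 (v_3 + w_3) \\
&= u_1 v_1 + u_1 w_1 + u_2 v_2 + u_2 w_2 + u_3 v_3 + u_3 w_3 \\
&= u_1 v_1 + u_2 v_2 + u_3 v_3 + u_1 w_1 + u_2 w_2 + u_3 w_3 \\
&= \mathbf{u} \cdot \mathbf{v} + \mathbf{u} \cdot \mathbf{w}.
\end{aligned}$$
The other case (multiplying on the right) is similar.

Proof of (5):
$$\mathbf{v} \cdot \mathbf{v} = v_1^2 + v_2^2 + v_3^2 = \left( \sqrt{v_1^2 + v_2^2 + v_3^2} \right)^2 = \|\mathbf{v}\|^2.$$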

We can view the dot product as a way to measure how “close” the direction of $\mathbf{v}$ is to that of $\mathbf{w}$.
The angle between two vectors is not uniquely defined in general: if $\theta$ is an angle between two vectors, so is
$2\pi - \theta$.

By convention, the angle between two vectors is chosen so that $0 \le \theta \le \pi$.


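Theorem 5.3 (The Dot Product and The Angle). Let $\theta$ be the angle between two non-zero vectors $\mathbf{v}$ and $\mathbf{w}$.
Then
$$\mathbf{v} \cdot \mathbf{w} = \|\mathbf{v}\| \|\mathbf{w}\| \cos\theta, \quad\text{or}\quad \cos\theta = \frac{\mathbf{v} \cdot \mathbf{w}}{\|\mathbf{v}\| \|\mathbf{w}\|}.$$
Before we prove this, let’s recall the Law of Cosines: the three sides of a triangle satisfy
$$c^2 = a^2 + b^2 - 2ab\cos\theta$$
if the side $c$ is opposite the angle $\theta$.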

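Proof. Let $\mathbf{v}$ and $\mathbf{w}$ be non-zero vectors with angle $\theta$ between them. If we view $\mathbf{v}$ and $\mathbf{w}$ as sides of a triangle,
the third side is given by $\mathbf{v} - \mathbf{w}$ (see the figure below).

[Figure: triangle with sides v, w, and v − w, with the angle θ between v and w.]

So by the Law of Cosines,
$$\|\mathbf{v} - \mathbf{w}\|^2 = \|\mathbf{v}\|^2 + \|\mathbf{w}\|^2 - 2\|\mathbf{v}\| \|\mathbf{w}\| \cos\theta.$$
On the other hand, by properties of the dot product we have
$$\begin{aligned}
\|\mathbf{v} - \mathbf{w}\|^2 &= (\mathbf{v} - \mathbf{w}) \cdot (\mathbf{v} - \mathbf{w}) \\
&= \mathbf{v} \cdot \mathbf{v} - 2\mathbf{v} \cdot \mathbf{w} + \mathbf{w} \cdot \mathbf{w} \\
&= \|\mathbf{v}\|^2 + \|\mathbf{w}\|^2 - 2\mathbf{v} \cdot \mathbf{w}.
\end{aligned}$$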


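Comparing the two right hand sides we get
$$\begin{aligned}
\|\mathbf{v}\|^2 + \|\mathbf{w}\|^2 - 2\|\mathbf{v}\| \|\mathbf{w}\| \cos\theta &= \|\mathbf{v}\|^2 + \|\mathbf{w}\|^2 - 2\mathbf{v} \cdot \mathbf{w} \\
-2\|\mathbf{v}\| \|\mathbf{w}\| \cos\theta &= -2\mathbf{v} \cdot \mathbf{w} \\
\|\mathbf{v}\| \|\mathbf{w}\| \cos\theta &= \mathbf{v} \cdot \mathbf{w}.
\end{aligned}$$

Recall that the angle $\theta = \arccos x$ is the angle in $[0, \pi]$, so we have that
$$\cos\theta = \frac{\mathbf{v} \cdot \mathbf{w}}{\|\mathbf{v}\| \|\mathbf{w}\|} \quad\text{or}\quad \theta = \arccos\left( \frac{\mathbf{v} \cdot \mathbf{w}}{\|\mathbf{v}\| \|\mathbf{w}\|} \right).$$

Also, notice two other equations involving lengths and the dot product that showed up in the above proof:
$$\|\mathbf{v} - \mathbf{w}\|^2 = \|\mathbf{v}\|^2 + \|\mathbf{w}\|^2 - 2\|\mathbf{v}\| \|\mathbf{w}\| \cos\theta$$
and
$$\|\mathbf{v} - \mathbf{w}\|^2 = \|\mathbf{v}\|^2 + \|\mathbf{w}\|^2 - 2\mathbf{v} \cdot \mathbf{w}.$$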
Definition 5.4. Two nonzero vectors $\mathbf{v}$ and $\mathbf{w}$ are perpendicular or orthogonal if the angle between
them is $\frac{\pi}{2}$. In this case we write $\mathbf{v} \perp \mathbf{w}$.

As $\cos\frac{\pi}{2} = 0$, we can use the dot product to determine if two vectors are perpendicular. If we define the zero
vector to be orthogonal to all vectors, we have
$$\mathbf{v} \perp \mathbf{w} \iff \mathbf{v} \cdot \mathbf{w} = 0.$$

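Example 5.5. The standard basis vectors $\mathbf{i} = \langle 1, 0, 0 \rangle$, $\mathbf{j} = \langle 0, 1, 0 \rangle$, and $\mathbf{k} = \langle 0, 0, 1 \rangle$ are mutually orthogonal
and all have length 1. In terms of dot products we have:
$$\mathbf{i} \cdot \mathbf{j} = \mathbf{i} \cdot \mathbf{k} = \mathbf{j} \cdot \mathbf{k} = 0 \quad\text{and}\quad \mathbf{i} \cdot \mathbf{i} = \mathbf{j} \cdot \mathbf{j} = \mathbf{k} \cdot \mathbf{k} = 1.$$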

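Example 5.6. Determine whether $\mathbf{u} = \langle 2, -1, 7 \rangle$ is orthogonal to $\mathbf{v} = \langle 3, 2, 1 \rangle$ or $\mathbf{w} = \langle 1, 2, 0 \rangle$.

Compute the dot products:
$$\mathbf{u} \cdot \mathbf{v} = \langle 2, -1, 7 \rangle \cdot \langle 3, 2, 1 \rangle = 6 - 2 + 7 = 11$$
and
$$\mathbf{u} \cdot \mathbf{w} = \langle 2, -1, 7 \rangle \cdot \langle 1, 2, 0 \rangle = 2 - 2 + 0 = 0.$$
So $\mathbf{u}$ is not orthogonal to $\mathbf{v}$ but is orthogonal to $\mathbf{w}$.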

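Example 5.7. Determine whether the angle between $\mathbf{v} = \langle 0, -2, 4 \rangle$ and $\mathbf{w} = \langle 1, 3, 1 \rangle$ is obtuse or acute, or
state that $\mathbf{v} \perp \mathbf{w}$.

For any nonzero vectors $\mathbf{v}$ and $\mathbf{w}$, the angle $\theta$ between $\mathbf{v}$ and $\mathbf{w}$ is obtuse if $\frac{\pi}{2} < \theta \le \pi$, which occurs for $0 \le \theta \le \pi$ exactly when
$\cos\theta < 0$. Then as $\mathbf{v} \cdot \mathbf{w} = \|\mathbf{v}\| \|\mathbf{w}\| \cos\theta$ and $\|\mathbf{v}\|, \|\mathbf{w}\| > 0$, we have that

the angle $\theta$ between $\mathbf{v}$ and $\mathbf{w}$ is obtuse $\iff \mathbf{v} \cdot \mathbf{w} < 0$.

For this example, as
$$\mathbf{v} \cdot \mathbf{w} = \langle 0, -2, 4 \rangle \cdot \langle 1, 3, 1 \rangle = 0 - 6 + 4 = -2,$$
the angle between $\mathbf{v}$ and $\mathbf{w}$ is obtuse.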
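The sign test used in Examples 5.6 and 5.7, together with the angle formula of Theorem 5.3, can be sketched in a few lines. The function names `dot`, `norm`, `angle`, and `classify` are illustrative; the clamp inside `angle` is an added safeguard against floating-point rounding and is not part of the theorem.

```python
import math

def dot(v, w):
    """Dot product: multiply corresponding coordinates and add the results."""
    return sum(a * b for a, b in zip(v, w))

def norm(v):
    """Length of v, using v . v = ||v||^2."""
    return math.sqrt(dot(v, v))

def angle(v, w):
    """Angle in [0, pi] between nonzero vectors v and w."""
    c = dot(v, w) / (norm(v) * norm(w))
    return math.acos(max(-1.0, min(1.0, c)))  # clamp guards against rounding error

def classify(v, w):
    """Classify the angle between nonzero v and w using only the sign of v . w."""
    d = dot(v, w)
    if d == 0:
        return "orthogonal"
    return "acute" if d > 0 else "obtuse"
```

Note that `classify` never computes a square root or an arccosine: the sign of the dot product alone decides the answer, which is why the examples above only need one line of arithmetic each.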


Example 5.8. Find all vectors orthogonal to $\mathbf{v} = \langle 2, 3, 7 \rangle$ and give a specific nonzero vector orthogonal to $\mathbf{v}$.

Let $\mathbf{w} = \langle w_1, w_2, w_3 \rangle$. Then $\mathbf{v} \perp \mathbf{w} \iff \mathbf{v} \cdot \mathbf{w} = 0$, i.e.
$$2w_1 + 3w_2 + 7w_3 = 0.$$
So any vector $\mathbf{w}$ orthogonal to $\mathbf{v}$ must satisfy the above equation. One such solution is $\mathbf{w} = \langle 1, 4, -2 \rangle$. The
collection of all such vectors forms a plane through the origin with equation $2x + 3y + 7z = 0$. It turns out
that this is the unique plane through the origin orthogonal to $\mathbf{v}$. We will discuss this further later on.

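Another important application of the dot product is to find the projection of a vector along another nonzero
vector. Given the vectors $\mathbf{u}$ and $\mathbf{v}$, the projection of $\mathbf{u}$ along $\mathbf{v}$, denoted $\mathbf{u}_{\parallel \mathbf{v}}$, is the vector parallel to $\mathbf{v}$ obtained
by drawing a line from the endpoint of $\mathbf{u}$ perpendicular to the line through $\mathbf{v}$.

[Figure: the vectors u and v, the angle θ between them, and the projection of u along v.]

We can see that when $\theta$ is acute, the length of $\mathbf{u}_{\parallel \mathbf{v}}$ is $\|\mathbf{u}\| \cos\theta$, and $\mathbf{u}_{\parallel \mathbf{v}}$ and $\mathbf{v}$ point in the same direction.
If $\theta$ is obtuse, $\mathbf{u}_{\parallel \mathbf{v}}$ and $\mathbf{v}$ point in opposite directions.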

Formally, we have the following definition.


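Definition 5.9 (Projection). Assume $\mathbf{v} \neq \mathbf{0}$. The projection of $\mathbf{u}$ along $\mathbf{v}$ is the vector
$$\mathbf{u}_{\parallel \mathbf{v}} = \left( \frac{\mathbf{u} \cdot \mathbf{v}}{\mathbf{v} \cdot \mathbf{v}} \right) \mathbf{v} = \left( \frac{\mathbf{u} \cdot \mathbf{v}}{\|\mathbf{v}\|^2} \right) \mathbf{v} = \left( \frac{\mathbf{u} \cdot \mathbf{v}}{\|\mathbf{v}\|} \right) \mathbf{e}_{\mathbf{v}}.$$
We sometimes denote $\mathbf{u}_{\parallel \mathbf{v}}$ by $\operatorname{proj}_{\mathbf{v}} \mathbf{u}$.
The scalar $\frac{\mathbf{u} \cdot \mathbf{v}}{\|\mathbf{v}\|} = \|\mathbf{u}\| \cos\theta$, whose absolute value is the length of $\mathbf{u}_{\parallel \mathbf{v}}$, is called the component of $\mathbf{u}$ along $\mathbf{v}$, and can be denoted by $\operatorname{comp}_{\mathbf{v}} \mathbf{u}$.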
kvk

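The previous figures show us that we can decompose any vector in the following way: If $\mathbf{v} \neq \mathbf{0}$,
then every vector $\mathbf{u}$ can be written as the vector sum of $\mathbf{u}_{\parallel \mathbf{v}}$ and a vector $\mathbf{u}_{\perp \mathbf{v}}$ that is orthogonal to $\mathbf{v}$. That
is,
$$\mathbf{u} = \mathbf{u}_{\parallel \mathbf{v}} + \mathbf{u}_{\perp \mathbf{v}}.$$
From the figure, we see the vector $\mathbf{u}_{\perp \mathbf{v}} = \mathbf{u} - \mathbf{u}_{\parallel \mathbf{v}}$ is represented by the dashed line!

[Figure: u drawn as the sum of its projection along v and the dashed vector orthogonal to v, with the angle θ between u and v.]

Writing $\mathbf{u}$ as
$$\mathbf{u} = \mathbf{u}_{\parallel \mathbf{v}} + \mathbf{u}_{\perp \mathbf{v}}$$
is called the decomposition of $\mathbf{u}$ with respect to $\mathbf{v}$.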


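Example 5.10. Find the decomposition of $\mathbf{u} = \langle 2, 4, -1 \rangle$ with respect to $\mathbf{v} = \langle 1, 1, 3 \rangle$.

As
$$\mathbf{u} \cdot \mathbf{v} = 2 + 4 - 3 = 3$$
and
$$\mathbf{v} \cdot \mathbf{v} = 1 + 1 + 9 = 11,$$
we have that
$$\mathbf{u}_{\parallel \mathbf{v}} = \left( \frac{\mathbf{u} \cdot \mathbf{v}}{\mathbf{v} \cdot \mathbf{v}} \right) \mathbf{v} = \frac{3}{11} \mathbf{v} = \left\langle \frac{3}{11}, \frac{3}{11}, \frac{9}{11} \right\rangle.$$
Then we have
$$\mathbf{u}_{\perp \mathbf{v}} = \mathbf{u} - \mathbf{u}_{\parallel \mathbf{v}} = \left\langle \frac{19}{11}, \frac{41}{11}, \frac{-20}{11} \right\rangle.$$
So the decomposition of $\mathbf{u}$ with respect to $\mathbf{v}$ is
$$\mathbf{u} = \langle 2, 4, -1 \rangle = \mathbf{u}_{\parallel \mathbf{v}} + \mathbf{u}_{\perp \mathbf{v}} = \left\langle \frac{3}{11}, \frac{3}{11}, \frac{9}{11} \right\rangle + \left\langle \frac{19}{11}, \frac{41}{11}, \frac{-20}{11} \right\rangle.$$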
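The decomposition in Example 5.10 can be reproduced exactly with rational arithmetic. This is a small sketch; the function name `decompose` is an illustrative choice, and the use of `Fraction` assumes the input vectors have integer (or rational) components.

```python
from fractions import Fraction

def dot(v, w):
    """Dot product of two vectors given as tuples."""
    return sum(a * b for a, b in zip(v, w))

def decompose(u, v):
    """Return (u_par, u_perp) with u = u_par + u_perp, u_par parallel to v, u_perp orthogonal to v."""
    if dot(v, v) == 0:
        raise ValueError("v must be nonzero")
    scale = Fraction(dot(u, v), dot(v, v))       # (u . v) / (v . v), kept exact
    u_par = tuple(scale * c for c in v)          # projection of u along v
    u_perp = tuple(a - b for a, b in zip(u, u_par))
    return u_par, u_perp

u_par, u_perp = decompose((2, 4, -1), (1, 1, 3))
```

A useful habit is to confirm that the second piece really is orthogonal to v by checking that `dot(u_perp, v)` is zero.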

Now that we have a new way to multiply vectors, we naturally have a new way to multiply vector-valued
functions, which leads us to a new product rule for derivatives!
Theorem 5.11 (Product Rule for Dot Products). Assume that $\mathbf{r}_1$ and $\mathbf{r}_2$ are differentiable vector-valued
functions. Then
$$\frac{d}{dt} \left( \mathbf{r}_1(t) \cdot \mathbf{r}_2(t) \right) = \mathbf{r}_1'(t) \cdot \mathbf{r}_2(t) + \mathbf{r}_1(t) \cdot \mathbf{r}_2'(t).$$
You should notice that this looks just like the scalar-valued product rule. In fact, the product rule is what
it is, you just need to make sure to use the appropriate product (i.e. the dot product or the ordinary product of scalar functions).
Proof. We will prove this for vector-valued functions in $\mathbb{R}^3$; the proof for vector-valued functions in the plane
is similar.

If $\mathbf{r}_1 = \langle x_1(t), y_1(t), z_1(t) \rangle$ and $\mathbf{r}_2 = \langle x_2(t), y_2(t), z_2(t) \rangle$, then by the definition of the dot product and the scalar-valued product
rule we have
$$\begin{aligned}
\frac{d}{dt} \left( \mathbf{r}_1(t) \cdot \mathbf{r}_2(t) \right) &= \frac{d}{dt} \left( x_1(t) x_2(t) + y_1(t) y_2(t) + z_1(t) z_2(t) \right) \\
&= x_1'(t) x_2(t) + x_1(t) x_2'(t) + y_1'(t) y_2(t) + y_1(t) y_2'(t) + z_1'(t) z_2(t) + z_1(t) z_2'(t) \\
&= [x_1'(t) x_2(t) + y_1'(t) y_2(t) + z_1'(t) z_2(t)] + [x_1(t) x_2'(t) + y_1(t) y_2'(t) + z_1(t) z_2'(t)] \\
&= \mathbf{r}_1'(t) \cdot \mathbf{r}_2(t) + \mathbf{r}_1(t) \cdot \mathbf{r}_2'(t).
\end{aligned}$$

An interesting example of applying the dot product, projections, and decompositions is to consider a bridge
oriented 30 degrees east of north with a wind blowing where the wind vector is $\mathbf{u} = \langle 30, 0 \rangle$ km/h. How
would we find the speed of the part of the wind blowing directly at the bridge?
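One way to approach the bridge question is to decompose the wind vector with respect to the direction of the bridge and take the length of the orthogonal part. The sketch below assumes that the x-axis points east and the y-axis points north, and it interprets "blowing directly at the bridge" as the component of the wind perpendicular to the bridge; both are assumptions made here, and all variable names are illustrative.

```python
import math

def dot(v, w):
    """Dot product of two vectors given as tuples."""
    return sum(a * b for a, b in zip(v, w))

# Assumed coordinates: x points east, y points north.
theta = math.radians(30)                       # bridge is 30 degrees east of north
bridge = (math.sin(theta), math.cos(theta))    # unit vector along the bridge
wind = (30.0, 0.0)                             # wind vector in km/h

along = dot(wind, bridge)                      # component of the wind along the bridge
wind_par = tuple(along * c for c in bridge)    # projection of the wind along the bridge
wind_perp = tuple(a - b for a, b in zip(wind, wind_par))
broadside_speed = math.hypot(*wind_perp)       # speed of the wind hitting the bridge side-on
```

Because `bridge` is a unit vector, the projection simplifies to (u · v) v, with no division by v · v.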


6 The Cross Product (12.4, 13.2)


In the previous section we saw that the dot product is a kind of multiplication of vectors that yields a
scalar. In this section we introduce a kind of multiplication of vectors that yields another vector, called the
cross product. The cross product is often used in physics to describe rotations, such as torque, angular
momentum, and magnetic forces. It can also be used to understand the Coriolis Force in meteorology!

Before we can define the cross product, we will cover some basic facts and definitions in the subject of linear
algebra.
Definition 6.1. A matrix is an array of numbers consisting of rows (horizontal) and columns (vertical),
e.g.,
$$\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 5 & 6 & 7 \\ 8 & 9 & 10 \\ 1.1 & \pi & 0 \end{pmatrix}.$$
The example on the left is called a $2 \times 2$ matrix, as there are two rows and two columns. Similarly, the
example on the right is called a $3 \times 3$ matrix.

We now define an operation on a 2 × 2 or 3 × 3 matrix called the determinant, which is a scalar associated
to the matrix.
Definition 6.2. The determinant of a $2 \times 2$ matrix is the scalar defined by the formula
$$\det \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc.$$

Notice that the determinant is the difference of the diagonal products.

Before we can define the determinant of a 3 × 3 matrix, we need to define a minor.


Definition 6.3. Given a $3 \times 3$ matrix, a minor is the determinant of a $2 \times 2$ submatrix obtained by crossing
out a specified row and column. For example, to find the $(1, 2)$-minor of the matrix
$$A = \begin{pmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{pmatrix},$$
first cross out the first row and the second column, which leaves
$$\begin{pmatrix} a_2 & c_2 \\ a_3 & c_3 \end{pmatrix},$$
then take the determinant of this $2 \times 2$ matrix:
$$\begin{vmatrix} a_2 & c_2 \\ a_3 & c_3 \end{vmatrix} = a_2 c_3 - c_2 a_3.$$

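Definition 6.4. The determinant of a $3 \times 3$ matrix is defined by the following formula
$$\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} = a_1 \begin{vmatrix} b_2 & c_2 \\ b_3 & c_3 \end{vmatrix} - b_1 \begin{vmatrix} a_2 & c_2 \\ a_3 & c_3 \end{vmatrix} + c_1 \begin{vmatrix} a_2 & b_2 \\ a_3 & b_3 \end{vmatrix}.$$

That is, $a_1$ times the $(1, 1)$-minor minus $b_1$ times the $(1, 2)$-minor plus $c_1$ times the $(1, 3)$-minor.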


We now define the cross product of two vectors, which is a “symbolic” determinant of a matrix whose first
row has the entries i, j, and k.
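Definition 6.5 (The Cross Product). The cross product of the vectors $\mathbf{v} = \langle v_1, v_2, v_3 \rangle$ and $\mathbf{w} = \langle w_1, w_2, w_3 \rangle$
is the vector
$$\mathbf{v} \times \mathbf{w} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix} = \begin{vmatrix} v_2 & v_3 \\ w_2 & w_3 \end{vmatrix} \mathbf{i} - \begin{vmatrix} v_1 & v_3 \\ w_1 & w_3 \end{vmatrix} \mathbf{j} + \begin{vmatrix} v_1 & v_2 \\ w_1 & w_2 \end{vmatrix} \mathbf{k}.$$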

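Example 6.6. Let $\mathbf{v} = \langle -2, 7, 10 \rangle$ and $\mathbf{w} = \langle 0, 3, 5 \rangle$. Find $\mathbf{v} \times \mathbf{w}$, $\mathbf{w} \times \mathbf{v}$, and $\mathbf{v} \times \mathbf{v}$.
$$\begin{aligned}
\mathbf{v} \times \mathbf{w} &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ -2 & 7 & 10 \\ 0 & 3 & 5 \end{vmatrix} = \begin{vmatrix} 7 & 10 \\ 3 & 5 \end{vmatrix} \mathbf{i} - \begin{vmatrix} -2 & 10 \\ 0 & 5 \end{vmatrix} \mathbf{j} + \begin{vmatrix} -2 & 7 \\ 0 & 3 \end{vmatrix} \mathbf{k} \\
&= (35 - 30)\mathbf{i} - (-10 - 0)\mathbf{j} + (-6 - 0)\mathbf{k} = \langle 5, 10, -6 \rangle
\end{aligned}$$
$$\begin{aligned}
\mathbf{w} \times \mathbf{v} &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 0 & 3 & 5 \\ -2 & 7 & 10 \end{vmatrix} = \begin{vmatrix} 3 & 5 \\ 7 & 10 \end{vmatrix} \mathbf{i} - \begin{vmatrix} 0 & 5 \\ -2 & 10 \end{vmatrix} \mathbf{j} + \begin{vmatrix} 0 & 3 \\ -2 & 7 \end{vmatrix} \mathbf{k} \\
&= (30 - 35)\mathbf{i} - (0 + 10)\mathbf{j} + (0 + 6)\mathbf{k} = \langle -5, -10, 6 \rangle
\end{aligned}$$

What do you notice about how $\mathbf{v} \times \mathbf{w}$ and $\mathbf{w} \times \mathbf{v}$ are related? Importantly, notice that you don’t get the
same thing! So order matters with the cross product.
$$\begin{aligned}
\mathbf{v} \times \mathbf{v} &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ -2 & 7 & 10 \\ -2 & 7 & 10 \end{vmatrix} = \begin{vmatrix} 7 & 10 \\ 7 & 10 \end{vmatrix} \mathbf{i} - \begin{vmatrix} -2 & 10 \\ -2 & 10 \end{vmatrix} \mathbf{j} + \begin{vmatrix} -2 & 7 \\ -2 & 7 \end{vmatrix} \mathbf{k} \\
&= (70 - 70)\mathbf{i} - (-20 + 20)\mathbf{j} + (-14 + 14)\mathbf{k} = \mathbf{0}
\end{aligned}$$

Notice that $\mathbf{v} \times \mathbf{v}$ was the zero vector. Do you think this is a fluke?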
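The cofactor expansion in Definition 6.5 translates directly into code. The following sketch recomputes the three products of Example 6.6; the function name `cross` and the variable names are illustrative.

```python
def cross(v, w):
    """Cross product of 3-vectors via the expansion along the symbolic row i, j, k."""
    v1, v2, v3 = v
    w1, w2, w3 = w
    return (v2 * w3 - v3 * w2,        # i component: (1, 1)-minor
            -(v1 * w3 - v3 * w1),     # j component: minus the (1, 2)-minor
            v1 * w2 - v2 * w1)        # k component: (1, 3)-minor

v = (-2, 7, 10)
w = (0, 3, 5)
vw = cross(v, w)   # v x w
wv = cross(w, v)   # w x v, the negative of v x w
vv = cross(v, v)   # v x v, the zero vector
```

Swapping the two arguments swaps the rows of every 2 × 2 minor, which flips the sign of each component; that is the computational reason order matters.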

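Theorem 6.7 (Properties of the Cross Product). Let $\mathbf{v}$, $\mathbf{w}$, and $\mathbf{u}$ be vectors in $\mathbb{R}^3$. Then
1. $\mathbf{v} \times \mathbf{w} = -\mathbf{w} \times \mathbf{v}$
2. $\mathbf{v} \times \mathbf{v} = \mathbf{0}$
3. $\mathbf{v} \times \mathbf{w} = \mathbf{0} \iff$ either $\mathbf{v} = \mathbf{0}$, $\mathbf{w} = \mathbf{0}$, or $\mathbf{w} = \lambda \mathbf{v}$ for some scalar $\lambda$
4. $(\lambda \mathbf{v}) \times \mathbf{w} = \mathbf{v} \times (\lambda \mathbf{w}) = \lambda (\mathbf{v} \times \mathbf{w})$
5. $(\mathbf{u} + \mathbf{v}) \times \mathbf{w} = \mathbf{u} \times \mathbf{w} + \mathbf{v} \times \mathbf{w}$ and $\mathbf{u} \times (\mathbf{v} + \mathbf{w}) = \mathbf{u} \times \mathbf{v} + \mathbf{u} \times \mathbf{w}$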
The proof of this theorem can be found in the textbook. We will now explore the geometric meaning of the
cross product. Recall that we oriented 3-space using the right-hand rule - we now describe a right-handed
system for any three vectors.

Definition 6.8 (Right-Handed System). Let $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ be vectors in $\mathbb{R}^3$ that are not co-planar (i.e. do not
all lie in a plane). We say that $\{\mathbf{v}, \mathbf{w}, \mathbf{u}\}$ forms a right-handed system if the direction of $\mathbf{u}$ is determined
by the right-hand rule: if you point your right hand in the direction of $\mathbf{v}$ and curl your fingers towards $\mathbf{w}$,
your thumb points to the same side of the plane spanned by $\mathbf{v}$ and $\mathbf{w}$ as $\mathbf{u}$.


Example 6.9. $\{\mathbf{i}, \mathbf{j}, \mathbf{k}\}$ forms a right-handed system.

[Figure: the standard basis vectors i, j, and k.]
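Theorem 6.10 (Geometric Properties of the Cross Product). For any two nonzero nonparallel vectors $\mathbf{v}$
and $\mathbf{w}$, the cross product $\mathbf{v} \times \mathbf{w}$ is the unique vector satisfying the following three properties.
1. $\mathbf{v} \times \mathbf{w}$ is orthogonal to $\mathbf{v}$ and $\mathbf{w}$
2. $\mathbf{v} \times \mathbf{w}$ has length $\|\mathbf{v}\| \|\mathbf{w}\| \sin\theta$ if $\theta$ is the angle between $\mathbf{v}$ and $\mathbf{w}$
3. $\{\mathbf{v}, \mathbf{w}, \mathbf{v} \times \mathbf{w}\}$ forms a right-handed system
There are always two vectors orthogonal to $\mathbf{v}$ and $\mathbf{w}$ with length $\|\mathbf{v}\| \|\mathbf{w}\| \sin\theta$. The right hand rule determines
which is $\mathbf{v} \times \mathbf{w}$.

[Figure: the vectors v and w with the angle θ between them, together with v × w and −v × w.]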

The cross product of any two of the standard basis vectors yields the third (up to a minus sign). You can
check this on your own, but you should get:
i × j = k, j × k = i, k × i = j.
Reversing the order of the above products yields the negative. One way to remember this is the following.
Start with writing the basis vectors alphabetically twice:
i j k i j k.
Then you can read off the cross products from left to right and always get a positive result. Reading off
from right to left will give a negative result.

Recall that in 2 dimensions, multiplication is related to area (think: rectangles). Similarly, the cross product
has an interpretation as area.
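Theorem 6.11 (The Cross Product and Areas). If $\mathcal{P}$ is the parallelogram spanned by $\mathbf{v}$ and $\mathbf{w}$ and $\mathcal{T}$ is
the triangle spanned by $\mathbf{v}$ and $\mathbf{w}$, then
$$\operatorname{Area}(\mathcal{P}) = \|\mathbf{v} \times \mathbf{w}\|$$
and
$$\operatorname{Area}(\mathcal{T}) = \frac{\|\mathbf{v} \times \mathbf{w}\|}{2}.$$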
Sketch of proof. Let P be the parallelogram (shaded in) spanned by two nonzero vectors v and w as in the
figure below.


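The base of $\mathcal{P}$ is $b = \|\mathbf{v}\|$, and the height of $\mathcal{P}$ is $h = \|\mathbf{w}\| \sin\theta$. So, the area of $\mathcal{P}$ is
$$\operatorname{Area}(\mathcal{P}) = bh = \|\mathbf{v}\| \|\mathbf{w}\| \sin\theta = \|\mathbf{v} \times \mathbf{w}\|.$$

Now let $\mathcal{T}$ be the triangle spanned by $\mathbf{v}$ and $\mathbf{w}$ as in the figure below.

The area of $\mathcal{T}$ is half the area of $\mathcal{P}$, so
$$\operatorname{Area}(\mathcal{T}) = \frac{\operatorname{Area}(\mathcal{P})}{2} = \frac{\|\mathbf{v} \times \mathbf{w}\|}{2}.$$

We can also use the cross product to find the volume of the parallelepiped $\mathcal{P}$ spanned by three nonzero
vectors $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ in $\mathbb{R}^3$. See the figure below.

The base of $\mathcal{P}$ is the parallelogram spanned by $\mathbf{v}$ and $\mathbf{w}$, and so has area $b = \|\mathbf{v} \times \mathbf{w}\|$. The height of $\mathcal{P}$ is
$h = \|\mathbf{u}\| |\cos\theta|$ where $\theta$ is the angle between $\mathbf{u}$ and $\mathbf{v} \times \mathbf{w}$. So, we have
$$\operatorname{Volume}(\mathcal{P}) = bh = \|\mathbf{v} \times \mathbf{w}\| \|\mathbf{u}\| |\cos\theta|.$$
Now, recall that by Theorem 5.3,
$$\|\mathbf{v} \times \mathbf{w}\| \|\mathbf{u}\| |\cos\theta| = |\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w})|.$$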


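The quantity $\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w})$, called the vector triple product (more commonly known as the scalar triple product, since its value is a scalar), can be expressed as a determinant! If $\mathbf{u} =
\langle u_1, u_2, u_3 \rangle$, $\mathbf{v} = \langle v_1, v_2, v_3 \rangle$, and $\mathbf{w} = \langle w_1, w_2, w_3 \rangle$, then
$$\begin{aligned}
\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w}) &= \mathbf{u} \cdot \left( \begin{vmatrix} v_2 & v_3 \\ w_2 & w_3 \end{vmatrix} \mathbf{i} - \begin{vmatrix} v_1 & v_3 \\ w_1 & w_3 \end{vmatrix} \mathbf{j} + \begin{vmatrix} v_1 & v_2 \\ w_1 & w_2 \end{vmatrix} \mathbf{k} \right) \\
&= u_1 \begin{vmatrix} v_2 & v_3 \\ w_2 & w_3 \end{vmatrix} - u_2 \begin{vmatrix} v_1 & v_3 \\ w_1 & w_3 \end{vmatrix} + u_3 \begin{vmatrix} v_1 & v_2 \\ w_1 & w_2 \end{vmatrix} \\
&= \begin{vmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix} = \det \begin{pmatrix} \mathbf{u} \\ \mathbf{v} \\ \mathbf{w} \end{pmatrix}.
\end{aligned}$$

This gives us the following theorem.

Theorem 6.12 (The Vector Triple Product and Volume). Let $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ be nonzero vectors in $\mathbb{R}^3$. Then
the parallelepiped $\mathcal{P}$ spanned by $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ has volume
$$\operatorname{Volume}(\mathcal{P}) = |\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w})| = \left| \det \begin{pmatrix} \mathbf{u} \\ \mathbf{v} \\ \mathbf{w} \end{pmatrix} \right|.$$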

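Example 6.13. Let $\mathbf{u} = \langle 1, 0, -2 \rangle$, $\mathbf{v} = \langle -2, 3, 2 \rangle$, and $\mathbf{w} = \langle 0, 0, 3 \rangle$. Find the area $A$ of the parallelogram
spanned by $\mathbf{u}$ and $\mathbf{v}$, and find the volume $V$ of the parallelepiped spanned by $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$.

First we find $A$. To do this use the formula $A = \|\mathbf{u} \times \mathbf{v}\|$.
$$\begin{aligned}
\mathbf{u} \times \mathbf{v} &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 1 & 0 & -2 \\ -2 & 3 & 2 \end{vmatrix} = \begin{vmatrix} 0 & -2 \\ 3 & 2 \end{vmatrix} \mathbf{i} - \begin{vmatrix} 1 & -2 \\ -2 & 2 \end{vmatrix} \mathbf{j} + \begin{vmatrix} 1 & 0 \\ -2 & 3 \end{vmatrix} \mathbf{k} \\
&= (0 + 6)\mathbf{i} - (2 - 4)\mathbf{j} + (3 - 0)\mathbf{k} = \langle 6, 2, 3 \rangle
\end{aligned}$$
Then
$$A = \|\mathbf{u} \times \mathbf{v}\| = \sqrt{36 + 4 + 9} = 7.$$
To find $V$ we use the formula $V = |\mathbf{w} \cdot (\mathbf{u} \times \mathbf{v})|$.
$$|\mathbf{w} \cdot (\mathbf{u} \times \mathbf{v})| = |\langle 0, 0, 3 \rangle \cdot \langle 6, 2, 3 \rangle| = 9.$$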
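The area and volume formulas of Theorems 6.11 and 6.12 can be checked against Example 6.13 with a short sketch. The helper `det3` expands a 3 × 3 determinant along its first row as in Definition 6.4; all function names here are illustrative.

```python
import math

def dot(v, w):
    """Dot product of two vectors given as tuples."""
    return sum(a * b for a, b in zip(v, w))

def cross(v, w):
    """Cross product of two 3-vectors."""
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])

def det3(u, v, w):
    """Determinant of the 3x3 matrix with rows u, v, w (expansion along the first row)."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))

u, v, w = (1, 0, -2), (-2, 3, 2), (0, 0, 3)
area = math.sqrt(dot(cross(u, v), cross(u, v)))   # ||u x v||
volume = abs(dot(w, cross(u, v)))                 # |w . (u x v)|
```

The triple product computed as `dot(u, cross(v, w))` and the determinant `det3(u, v, w)` give the same number, so either can be used for the volume after taking an absolute value.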

As with the dot product, now that we have a new way to multiply vectors, we naturally have a new way to
multiply vector-valued functions, which leads us to a new product rule for derivatives!
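Theorem 6.14 (Product Rule for Cross Products). Assume that $\mathbf{r}_1$ and $\mathbf{r}_2$ are differentiable vector-valued
functions. Then
$$\frac{d}{dt} \left( \mathbf{r}_1(t) \times \mathbf{r}_2(t) \right) = [\mathbf{r}_1'(t) \times \mathbf{r}_2(t)] + [\mathbf{r}_1(t) \times \mathbf{r}_2'(t)].$$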
Recall from when we saw the product rule for dot products: the product rule is what it is, you just need to
make sure to use the appropriate product, i.e. cross product, dot product, or scalar product.

The proof of the product rule for cross products is left as an exercise. Hint: write out the cross product
componentwise, then take the derivative componentwise using the scalar product rule.
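Example 6.15. Prove that $\frac{d}{dt} \left( \mathbf{r}(t) \times \mathbf{r}'(t) \right) = \mathbf{r}(t) \times \mathbf{r}''(t)$.

Use the product rule for cross products:
$$\frac{d}{dt} \left( \mathbf{r}(t) \times \mathbf{r}'(t) \right) = \underbrace{\mathbf{r}'(t) \times \mathbf{r}'(t)}_{= \mathbf{0}} + \mathbf{r}(t) \times \mathbf{r}''(t) = \mathbf{r}(t) \times \mathbf{r}''(t).$$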


7 Planes in 3-Space (12.5)


In this section we will explore planes in 3-space. We will start by introducing linear equations.

A linear equation in two variables is an equation of the form
$$ax + by = c$$
where $a$, $b$, and $c$ are constants with $a$ and $b$ not both zero. Note that when $b \neq 0$ the above general form can be rewritten in slope-intercept form
$y = mx + b$ (with a different constant $b$), which we all know defines a line in $\mathbb{R}^2$; when $b = 0$ it defines a vertical line. Note that a line is 1-dimensional, while $\mathbb{R}^2$ is 2-dimensional.

Similarly, a linear equation in three variables is an equation of the form
$$ax + by + cz = d,$$
where $a$, $b$, $c$, and $d$ are constants with $a$, $b$, and $c$ not all zero. As in the case of 2-variables, a linear equation in three variables
defines a space that has dimension $3 - 1 = 2$, i.e. a linear equation in three variables defines a plane in $\mathbb{R}^3$.

In general, a linear equation in $n$-variables is of the form
$$a_1 x_1 + \cdots + a_n x_n = b$$
where $a_1, \ldots, a_n$, and $b$ are arbitrary constants. A linear equation in $n$-variables defines an $n - 1$ dimensional
hyperplane in $\mathbb{R}^n$.

What does it take to specify a plane through a given point? Well, let’s start by looking at the 2-variable
version (i.e. lines). What does it take to specify a line through a given point? Another point!

Souping this up a dimension, we can determine a plane $\mathcal{P}$ passing through a specified point $P_0$ completely
by specifying two other points such that the three points are not collinear. That is, a plane is uniquely determined by three
non-collinear points.

Let’s consider another way to specify a line through a given point. Given a point $P$ in $\mathbb{R}^2$, a line through $P$ is
uniquely determined by the direction of a vector orthogonal to it! Similarly, we can determine a plane $\mathcal{P}$
passing through a specified point $P_0$ completely by specifying a nonzero vector $\mathbf{n}$ that is orthogonal to $\mathcal{P}$.
Such a vector is called a normal vector (here, normal is another word for orthogonal or perpendicular).

The image to the right shows that for a plane through
$P_0 = (x_0, y_0, z_0)$, the point $P = (x, y, z)$ lies on the
plane if and only if the vector $\overrightarrow{P_0 P}$ is orthogonal to
$\mathbf{n} = \langle a, b, c \rangle$. That is,
$$\mathbf{n} \cdot \overrightarrow{P_0 P} = 0.$$
As $\overrightarrow{P_0 P} = \langle x - x_0, y - y_0, z - z_0 \rangle$, the above is
$$\langle a, b, c \rangle \cdot \langle x - x_0, y - y_0, z - z_0 \rangle = 0,$$
and taking the dot product we get
$$a(x - x_0) + b(y - y_0) + c(z - z_0) = 0.$$

Rewriting the above equation, we get the following.


Theorem 7.1 (Equation of a Plane). The equation of the plane through the point $P_0 = (x_0, y_0, z_0)$ with
normal vector $\mathbf{n} = \langle a, b, c \rangle$ is given by any of the following.
Vector form:
$$\mathbf{n} \cdot \langle x, y, z \rangle = d.$$
Scalar form:
$$a(x - x_0) + b(y - y_0) + c(z - z_0) = 0$$
or
$$ax + by + cz = d$$
where
$$d = a x_0 + b y_0 + c z_0.$$
Note that if n is a normal vector to the plane P, so is every nonzero scalar multiple λn as all of these
multiples point in the same (or opposite) direction.
Example 7.2. Let P be the plane given by

x − 2y + 7z = 3.

Multiplying the right- and left-hand sides of this equation by any nonzero number gives the same solution
set (i.e. the same plane), and so P is also given by

3x − 6y + 21z = 9.

The first equation has the normal vector $\langle 1, -2, 7 \rangle$ while the second equation has the normal vector $\langle 3, -6, 21 \rangle$.
Just as two lines are parallel if they have the same slope, two planes are parallel if their normal vectors are
parallel (i.e. nonzero scalar multiples of each other).
Example 7.3. Let P be the plane given by

x − 2y + 7z = 3

as in the previous example. A plane parallel to P is given by

x − 2y + 7z = −300.

We already saw that P is also given by


3x − 6y + 21z = 9,
and so another plane parallel to P is
3x − 6y + 21z = 1.
In general, a family of parallel planes is given by choosing a normal vector $\mathbf{n} = \langle a, b, c \rangle$ and letting the
constant $d$ vary in the scalar form of the equation for a plane
$$ax + by + cz = d.$$
Every family of parallel planes with normal vector $\mathbf{n} = \langle a, b, c \rangle$ has a unique member passing through any
given point, with the unique member through the origin given by
$$ax + by + cz = 0.$$

Example 7.4. Recall that in $\mathbb{R}^2$, $x = 0$ gives the $y$ axis and $y = 0$ gives the $x$ axis. In $\mathbb{R}^3$,
$x = 0$ gives the $yz$ plane,
$y = 0$ gives the $xz$ plane,
and
$z = 0$ gives the $xy$ plane.
Similarly, $x = a$, $y = b$, and $z = c$ give planes parallel to the respective coordinate planes.


Example 7.5. Let P be the plane given by 3x − 2y + 9z = 4. Find an equation of the plane parallel to P
passing through the origin and the plane parallel to P passing through the point Q = (−3, 7, 1).

The family of planes parallel to P are all of the form

3x − 2y + 9z = d.

The one through the origin has equation

3x − 2y + 9z = 0.

To find the plane through Q parallel to P, we use the above equation with d = 3(−3) − 2(7) + 9(1) = −14,

3x − 2y + 9z = −14.

As we have already said, just as two distinct points determine a line, three non-collinear points
determine a plane. (Collinear points are points that all lie on the same line.) In the figure below, the
three points $P$, $Q$, and $R$ uniquely determine the plane $\mathcal{P}$.

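Example 7.6. Find an equation of the plane $\mathcal{P}$ through the points $P = (0, 1, 2)$, $Q = (-2, 3, 7)$, and
$R = (4, 8, 3)$.

To do this we first need to find a normal vector, then we can use any of the points $P$, $Q$, or $R$ to find an
equation for the plane. To find the normal vector, as illustrated in the figure above, we will find two vectors
in the plane, $\overrightarrow{PQ}$ and $\overrightarrow{PR}$, and take their cross product.
$$\overrightarrow{PQ} = \langle -2, 3, 7 \rangle - \langle 0, 1, 2 \rangle = \langle -2, 2, 5 \rangle$$
and
$$\overrightarrow{PR} = \langle 4, 8, 3 \rangle - \langle 0, 1, 2 \rangle = \langle 4, 7, 1 \rangle,$$
so
$$\mathbf{n} = \overrightarrow{PQ} \times \overrightarrow{PR} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ -2 & 2 & 5 \\ 4 & 7 & 1 \end{vmatrix} = -33\mathbf{i} + 22\mathbf{j} - 22\mathbf{k} = \langle -33, 22, -22 \rangle.$$
Then, using $\mathbf{n}$ and the point $P$, an equation for $\mathcal{P}$ is
$$-33x + 22(y - 1) - 22(z - 2) = 0.$$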
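The procedure of Example 7.6 (two vectors in the plane, their cross product as a normal, then the constant d from one of the points) can be sketched as a small function. The name `plane_through` and its return convention, a pair (n, d) representing n · ⟨x, y, z⟩ = d, are choices made here for illustration.

```python
def sub(p, q):
    """Vector from q to p, i.e. p - q componentwise."""
    return tuple(a - b for a, b in zip(p, q))

def dot(v, w):
    """Dot product of two vectors given as tuples."""
    return sum(a * b for a, b in zip(v, w))

def cross(v, w):
    """Cross product of two 3-vectors."""
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])

def plane_through(p, q, r):
    """Return (n, d) so that n . <x, y, z> = d is the plane through non-collinear points p, q, r."""
    n = cross(sub(q, p), sub(r, p))   # normal vector PQ x PR
    if n == (0, 0, 0):
        raise ValueError("points are collinear")
    return n, dot(n, p)

P, Q, R = (0, 1, 2), (-2, 3, 7), (4, 8, 3)
n, d = plane_through(P, Q, R)
```

A quick way to catch sign slips in a hand computation is to substitute all three points into the resulting equation: each of `dot(n, P)`, `dot(n, Q)`, and `dot(n, R)` should equal `d`.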

Example 7.7. Describe the intersection of the plane $1.5x + 2y - 3z = 4$ and the line $\mathbf{r}(t) = \langle 0, 3, -1 \rangle +
t \langle 4, 6, 1 \rangle$.

The intersection of a plane with a line is either a single point (if the line is not parallel to the plane), the
entire line (if the line lies on the plane), or empty (if the line is parallel to the plane but does not lie on it). To determine the intersection in any of these cases, we can use the
following procedure.


Identify the parametric equations of the line:
$$x = 4t, \quad y = 3 + 6t, \quad z = -1 + t.$$
Substitute these into the equation of the plane:
$$1.5x + 2y - 3z = 6t + 2(3 + 6t) - 3(-1 + t) = 15t + 9 = 4.$$
Now solve for $t$ to get $t = -\frac{1}{3}$. As we were able to solve for a specific time value, the line does not lie on the
plane, and the intersection is a point! To find this point, plug in the $t$ value you solved for into the equation
of the line:
$$x = 4t = -\frac{4}{3}, \quad y = 3 + 6t = 3 - 2 = 1, \quad z = -1 + t = -\frac{4}{3},$$
so the intersection is the point $\left( -\frac{4}{3}, 1, -\frac{4}{3} \right)$.
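The substitution procedure of Example 7.7 amounts to solving one linear equation in t. A minimal sketch follows; the function name `line_plane_intersection` and its return convention are illustrative, and exact results rely on passing `Fraction` or integer inputs as done below.

```python
from fractions import Fraction

def dot(v, w):
    """Dot product of two vectors given as tuples."""
    return sum(a * b for a, b in zip(v, w))

def line_plane_intersection(n, d, p0, direction):
    """Intersect the plane n . <x, y, z> = d with the line r(t) = p0 + t * direction.

    Returns the intersection point, the string "line" if the line lies in the
    plane, or None if the line is parallel to the plane and misses it.
    """
    denom = dot(n, direction)
    if denom == 0:                      # line is parallel to the plane
        return "line" if dot(n, p0) == d else None
    t = (d - dot(n, p0)) / denom        # solve n . (p0 + t * direction) = d for t
    return tuple(a + t * b for a, b in zip(p0, direction))

n = (Fraction(3, 2), 2, -3)             # plane 1.5x + 2y - 3z = 4
point = line_plane_intersection(n, 4, (0, 3, -1), (4, 6, 1))
```

The quantity `denom` is the dot product of the normal vector with the direction of the line; it vanishes exactly when the line is parallel to the plane, which is when no single value of t can be solved for.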
Definition 7.8. The intersection of a plane P with a coordinate plane or a plane parallel to a coordinate
plane is called a trace. The trace of a plane is a line, unless P is parallel to the coordinate plane in question,
in which case the trace is either empty (if P is not itself the coordinate plane) or all of P (if P is itself the
coordinate plane).
Example 7.9. Find the traces in the coordinate planes of the plane P given by 10x − 2y − z = 1.

First, notice that P is not any of the coordinate planes, nor is it parallel to any of the coordinate planes.
So, the traces of P in the coordinate planes will be lines. We find the traces by intersecting P with each of
the coordinate planes.

To find the trace in the yz-plane, we substitute x = 0 into the equation for P to get the line
−2y − z = 1.
To find the trace in the xz-plane, we substitute y = 0 into the equation for P to get the line
10x − z = 1.
To find the trace in the xy-plane, we substitute z = 0 into the equation for P to get the line
10x − 2y = 1.
Example 7.10. Describe the intersection of the planes $x + y + z = 0$ and $2x - 3y - z = 0$.

First, as the normal vectors of these two planes are not parallel, the planes are not parallel. So,
their intersection is a line. To describe this we treat the two equations of the planes as a system of equations
and solve via substitution or elimination.
Let’s use elimination to get rid of the “z”s:
$$\begin{aligned}
x + y + z &= 0 \\
+\quad 2x - 3y - z &= 0 \\
\hline
3x - 2y &= 0
\end{aligned}$$
So $x = \frac{2}{3} y$. Substituting this back into the first equation we get that $z = -\frac{5}{3} y$. Since we have both $x$ and
$z$ in terms of $y$, we can parametrize the intersection by letting $y = t$, so that $x = \frac{2}{3} t$ and $z = -\frac{5}{3} t$. As the
origin is in the intersection, the line must go through the origin, and so is given by
$$\mathbf{r}(t) = t \left\langle \frac{2}{3}, 1, -\frac{5}{3} \right\rangle.$$
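An alternative to elimination, not used in the example above but consistent with it, is to note that the line of intersection is orthogonal to both normal vectors, so its direction is their cross product. The sketch below checks that this gives a multiple of the direction found in Example 7.10; the variable names are illustrative.

```python
from fractions import Fraction

def dot(v, w):
    """Dot product of two vectors given as tuples."""
    return sum(a * b for a, b in zip(v, w))

def cross(v, w):
    """Cross product of two 3-vectors."""
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])

n1 = (1, 1, 1)       # normal vector of x + y + z = 0
n2 = (2, -3, -1)     # normal vector of 2x - 3y - z = 0
direction = cross(n1, n2)
# Rescale so the y component is 1, matching the parametrization y = t.
scaled = tuple(Fraction(c, direction[1]) for c in direction)
```

Since both planes pass through the origin, the direction vector alone determines the line; for planes that miss the origin one would also need a single point lying on both planes.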


8 Curves and Surfaces (12.6)


In the following sections we will explore general quadratic equations and the graphs of their solutions. A
general quadratic equation in two variables is of the form
$$ax^2 + bxy + cy^2 + dx + ey + f = 0.$$

Notice that the above equation only has terms of degree two or less (when adding the degrees of x and y)
and that these are all such terms! Three (or four) families of conic sections should be vaguely familiar
from precalculus. We will give the general formulas for these curves but not go into much detail (as this
should be review).
Theorem 8.1 (Conic Sections). A circle in the plane centered at $(h, k)$ with radius $r$ is given by the
equation
$$(x - h)^2 + (y - k)^2 = r^2.$$
An ellipse in the plane centered at $(h, k)$ with vertical & horizontal axes and vertices at $(h \pm a, k)$ and $(h, k \pm b)$
is given by the equation
$$\frac{(x - h)^2}{a^2} + \frac{(y - k)^2}{b^2} = 1.$$
A hyperbola in the plane centered at $(h, k)$ with vertices at $(h \pm a, k)$ is given by the equation
$$\frac{(x - h)^2}{a^2} - \frac{(y - k)^2}{b^2} = 1.$$
A parabola in the plane with vertex $(h, k)$ and vertical axis is given by
$$(y - k) = c(x - h)^2$$
where $c \neq 0$.

See here and here for illustrations.

Quadratic surfaces are the three dimensional analogue of conic sections and are the graphs of solutions
to general quadratic equations in three variables:

Ax2 + By 2 + Cz 2 + Dxy + Eyz + F zx + ax + by + cz + d = 0.

As with conic sections, quadratic surfaces are grouped into a few types, and if we orient the coordinate axes
in a nice way with the quadratic surface, we get nice general forms of the corresponding quadratic equations.
Definition 8.2. When the axes of a quadratic surface coincide with the coordinate axes we say the quadratic
is in standard position.
When a quadratic surface is in standard position, some of the coefficients (often D, E, F, a, b, and c) are
0! For ease, we will only consider quadratic surfaces in standard position. The quadratics are ellipsoids
(and spheres), hyperboloids, paraboloids, and cones. While we will give the general equations for
these quadratics, we really want to be able to identify them by their cross sections or traces: curves on the
surface obtained by intersecting the surface with planes parallel to the coordinate planes.

See here for illustrations. In the following general formulas, exchanging the position of the variables is equiv-
alent to rotating the surface to align with different axes.


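Definition 8.3. The surface analogue of ellipses are ellipsoids. The equation for an ellipsoid in standard
position is
\[ \left(\frac{x}{a}\right)^2 + \left(\frac{y}{b}\right)^2 + \left(\frac{z}{c}\right)^2 = 1. \]
Note that when a = b = c, we get a sphere of radius a. The traces of ellipsoids are ellipses:
\begin{align*}
\text{xy-trace:} &\quad \left(\frac{x}{a}\right)^2 + \left(\frac{y}{b}\right)^2 + \left(\frac{z_0}{c}\right)^2 = 1 \\
\text{yz-trace:} &\quad \left(\frac{x_0}{a}\right)^2 + \left(\frac{y}{b}\right)^2 + \left(\frac{z}{c}\right)^2 = 1 \\
\text{xz-trace:} &\quad \left(\frac{x}{a}\right)^2 + \left(\frac{y_0}{b}\right)^2 + \left(\frac{z}{c}\right)^2 = 1.
\end{align*}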
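Definition 8.4. The surface analogue of hyperbolas are hyperboloids, which come in two types: hyperboloids
of one sheet and hyperboloids of two sheets. The equation for a hyperboloid of one sheet in
standard position is given by
\[ \left(\frac{x}{a}\right)^2 + \left(\frac{y}{b}\right)^2 = \left(\frac{z}{c}\right)^2 + 1, \]
and the equation for a hyperboloid of two sheets in standard position is given by
\[ \left(\frac{x}{a}\right)^2 + \left(\frac{y}{b}\right)^2 = \left(\frac{z}{c}\right)^2 - 1. \]
The horizontal traces of hyperboloids are ellipses and the vertical traces are hyperbolas/crossed lines. For the
hyperboloid of one sheet:
\begin{align*}
\text{xy-trace:} &\quad \left(\frac{x}{a}\right)^2 + \left(\frac{y}{b}\right)^2 = \left(\frac{z_0}{c}\right)^2 + 1 \\
\text{yz-trace:} &\quad \left(\frac{x_0}{a}\right)^2 + \left(\frac{y}{b}\right)^2 = \left(\frac{z}{c}\right)^2 + 1 \\
\text{xz-trace:} &\quad \left(\frac{x}{a}\right)^2 + \left(\frac{y_0}{b}\right)^2 = \left(\frac{z}{c}\right)^2 + 1
\end{align*}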

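Definition 8.5. An elliptic cone is like a hyperboloid of two sheets where the middle is pinched down to
a point. Elliptic cones are defined by equations of the form
\[ \left(\frac{x}{a}\right)^2 + \left(\frac{y}{b}\right)^2 = \left(\frac{z}{c}\right)^2. \]
The horizontal traces of elliptic cones are ellipses and the vertical traces are hyperbolas/crossed lines:
\begin{align*}
\text{xy-trace:} &\quad \left(\frac{x}{a}\right)^2 + \left(\frac{y}{b}\right)^2 = \left(\frac{z_0}{c}\right)^2 \\
\text{yz-trace:} &\quad \left(\frac{x_0}{a}\right)^2 + \left(\frac{y}{b}\right)^2 = \left(\frac{z}{c}\right)^2 \\
\text{xz-trace:} &\quad \left(\frac{x}{a}\right)^2 + \left(\frac{y_0}{b}\right)^2 = \left(\frac{z}{c}\right)^2
\end{align*}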

So far the general equations we have seen have been of the form

\[ Ax^2 + By^2 + Cz^2 + d = 0. \]

We will now consider quadratic surfaces that are not of this form.
Definition 8.6. The surface analogue of parabolas are paraboloids, which come in two types: elliptic
paraboloids and hyperbolic paraboloids. Elliptic paraboloids are given by equations of the form
\[ z = \left(\frac{x}{a}\right)^2 + \left(\frac{y}{b}\right)^2. \]


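The vertical traces of elliptic paraboloids are parabolas that open up and the horizontal traces of elliptic
paraboloids are ellipses:
\begin{align*}
\text{xy-trace:} &\quad z_0 = \left(\frac{x}{a}\right)^2 + \left(\frac{y}{b}\right)^2 \\
\text{yz-trace:} &\quad z = \left(\frac{x_0}{a}\right)^2 + \left(\frac{y}{b}\right)^2 \\
\text{xz-trace:} &\quad z = \left(\frac{x}{a}\right)^2 + \left(\frac{y_0}{b}\right)^2
\end{align*}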

Hyperbolic paraboloids are given by equations of the form
\[ z = \left(\frac{x}{a}\right)^2 - \left(\frac{y}{b}\right)^2. \]
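The vertical traces of hyperbolic paraboloids are parabolas (opening up and down) and the horizontal traces
of hyperbolic paraboloids are hyperbolas/crossed lines:
\begin{align*}
\text{xy-trace:} &\quad z_0 = \left(\frac{x}{a}\right)^2 - \left(\frac{y}{b}\right)^2 \\
\text{yz-trace:} &\quad z = \left(\frac{x_0}{a}\right)^2 - \left(\frac{y}{b}\right)^2 \\
\text{xz-trace:} &\quad z = \left(\frac{x}{a}\right)^2 - \left(\frac{y_0}{b}\right)^2
\end{align*}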

Aside: Notice that an elliptic paraboloid has a minimum point, while a hyperbolic paraboloid does not. We
will explore this idea further in later sections.

The final examples of quadratic surfaces we will consider are cylinders.


Definition 8.7. Given a curve C in the xy-plane, the cylinder with base C consists of all vertical lines
through C. Such a cylinder is defined by the equation for C, considered as a three variable equation (i.e.
there is no z variable). Similarly one can have a cylinder with a base in the xz- or yz-planes. The horizontal
traces of a cylinder are always copies of the base curve C and the vertical traces of a cylinder are always sets
of vertical lines.

Definition 8.8. A right-circular cylinder with base curve a circle in the xy-plane centered at (h, k) with
radius r is given by the equation
\[ (x - h)^2 + (y - k)^2 = r^2. \]
Definition 8.9. An elliptic cylinder with base curve an ellipse in the xy-plane centered at (h, k) is given
by the equation
\[ \frac{(x - h)^2}{a^2} + \frac{(y - k)^2}{b^2} = 1. \]
Definition 8.10. A hyperbolic cylinder with base curve a hyperbola in the xy-plane centered at (h, k)
is given by the equation
\[ \frac{(x - h)^2}{a^2} - \frac{(y - k)^2}{b^2} = 1. \]
Definition 8.11. A parabolic cylinder with base curve a parabola in the xy-plane with vertex (h, k)
and vertical axis is given by the equation
\[ (y - k) = a(x - h)^2. \]
See illustrations here.

MTH 2321 Notes 9 POLAR, SPHERICAL, AND CYLINDRICAL COORDINATES

9 Polar, Spherical, and Cylindrical Coordinates (11.3, 12.7)


Up until now we have thought of the plane and 3-space in terms of Euclidean coordinates x, y, and z. This
way to describe space can be cumbersome when describing things where the distance from the origin or
angles play an important role, such as circular/elliptical motion (think: planets orbiting the sun). A much
more natural coordinate system to describe these situations is polar coordinates.

In polar coordinates, just as in Euclidean coordinates, we label a point P in the plane by its position relative
to the origin, but instead of length and height (x and y), we use distance and angle. That is, a point P is
labeled by
P = (r, θ),
where the radial coordinate r is the distance from the origin to P and the angular coordinate θ is the
angle between the positive x-axis and the line OP . As usual, we take positive angles to be counter-clockwise.

In Euclidean coordinates, horizontal lines, y = y0, and vertical lines, x = x0, are examples of level curves.
In polar coordinates, the level curves are given by

r = R, the circle of radius R centered at the origin

and
θ = θ0 , the line with slope m = tan θ0 through the origin.

Fun Fact
Polar coordinates were introduced in the mid-17th century by two mathematicians at almost the same
time while studying Archimedean spirals and parabolic arcs.

How can we relate polar and Euclidean coordinates on the plane?

Let x, y be Euclidean coordinates and r, θ be polar coordinates. Then one can convert from polar to
Euclidean and vice versa by
\[ x = r\cos\theta, \qquad y = r\sin\theta \]
and
\[ r = \sqrt{x^2 + y^2}, \qquad \tan\theta = \frac{y}{x} \quad (x \neq 0). \]

Note that polar coordinates, unlike Euclidean coordinates, are not unique. Notably, the origin O has no
clearly defined angle, and so we assign to the origin all coordinates of the form (0, θ).
Example 9.1. Plot the points (2, π), (2, 3π), and (2, −π). What do you notice?


Now plot the points (1, π/2) and (−1, 3π/2), where a negative radial coordinate is the reflection through the
origin of the corresponding positive radial coordinate. What do you notice?

When converting a point P = (x, y) to polar coordinates, remember that there are two angles between 0
and 2π that satisfy tan θ = y/x. You must choose the combination of θ and r (positive or negative) so that
(r cos θ, r sin θ) = (x, y).

Example 9.2. Describe the point P = (2, π/6) (given in polar coordinates) in rectangular coordinates.

P = (2, π/6) is in the first quadrant, so we know x and y should be positive. As r = 2 and θ = π/6,
\[ x = r\cos\theta = 2\cos\frac{\pi}{6} = 2 \cdot \frac{\sqrt{3}}{2} = \sqrt{3} \]
and
\[ y = r\sin\theta = 2\sin\frac{\pi}{6} = 1. \]
So, in rectangular coordinates P = (√3, 1).

[Figure: the point P plotted in the xy-plane with r = 2 and θ = π/6.]

Example 9.3. Describe the point P = (1, −1) (given in rectangular coordinates) in polar coordinates.

P = (1, −1) is in the fourth quadrant, so if we take r > 0, we need 3π/2 ≤ θ ≤ 2π or −π/2 ≤ θ ≤ 0. As
x = 1 and y = −1,
\[ r = \sqrt{x^2 + y^2} = \sqrt{2} \]
and
\[ \tan\theta = \frac{y}{x} = -1, \qquad \theta = -\frac{\pi}{4}. \]
So, in polar coordinates P = (√2, −π/4).

[Figure: the point P = (1, −1) plotted in the xy-plane with r = √2 and θ = −π/4.]
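
The conversions between polar and rectangular coordinates can also be tried numerically. The following is a minimal Python sketch; the function names `polar_to_rect` and `rect_to_polar` are illustrative choices, not notation from these notes, and `math.atan2` is used here as one convenient way to pick the angle in the correct quadrant with r ≥ 0.

```python
import math

def polar_to_rect(r, theta):
    """Convert polar (r, theta) to rectangular (x, y)."""
    return (r * math.cos(theta), r * math.sin(theta))

def rect_to_polar(x, y):
    """Convert rectangular (x, y) to polar (r, theta) with r >= 0.

    atan2 uses the signs of both x and y, so it returns the angle in the
    correct quadrant (between -pi and pi) instead of just solving tan(theta) = y/x.
    """
    r = math.hypot(x, y)
    theta = math.atan2(y, x)
    return (r, theta)

# The polar point (2, pi/6) converted to rectangular coordinates.
x, y = polar_to_rect(2, math.pi / 6)

# The rectangular point (1, -1) converted to polar coordinates.
r, theta = rect_to_polar(1, -1)
print(x, y, r, theta)
```

For the point (1, −1) this returns r = √2 and θ = −π/4 (up to rounding), in agreement with Example 9.3.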


Example 9.4. Describe and sketch the region given in polar coordinates.

\[ r \le 2, \qquad 0 \le \theta \le \frac{\pi}{2}. \]

This region is all points at most distance 2 from the origin from angle 0 to angle π/2, i.e. the north east
quarter of the disc of radius two centered at the origin, including the boundary.

\[ -\frac{\pi}{2} \le \theta \le \frac{\pi}{2}. \]

This region is all points at any distance from the origin from angle −π/2 to angle π/2, i.e. the right half of the
plane, including the y-axis.

Recall that a circular sector of angle ∆θ and radius r has area $\frac{1}{2} r^2 \Delta\theta$. So, to find the area of a region
bounded by a curve r = f(θ) and two rays θ = α and θ = β, we can use a Riemann sum technique: split the
region into thin sectors and add up their areas. This gives us the following.


Theorem 9.5 (Area in Polar Coordinates). If f is a continuous function, then the area bounded by a curve
r = f(θ) in polar form and the rays θ = α and θ = β, where α < β, is given by
\[ \frac{1}{2}\int_\alpha^\beta r^2 \, d\theta = \frac{1}{2}\int_\alpha^\beta f(\theta)^2 \, d\theta. \]


Example 9.6. The graph of the curve r = 2 − 2 sin θ is pictured below. Find the area of the shaded region.

[Figure: polar plot of the cardioid r = 2 − 2 sin θ with the shaded region.]

The shaded region is half of the area of the entire cardioid, so we will find the entire area and then halve it.
The area of the cardioid is
\[ \frac{1}{2}\int_0^{2\pi} (2 - 2\sin\theta)^2 \, d\theta = 6\pi, \]
and so the area of the shaded region is 3π.
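
The value 6π can be checked numerically. The sketch below approximates the integral from Theorem 9.5 with a midpoint Riemann sum; the helper name `polar_area` and the number of subintervals are assumptions made only for illustration.

```python
import math

def polar_area(f, alpha, beta, n=100_000):
    """Approximate (1/2) * integral of f(theta)^2 over [alpha, beta] (midpoint rule)."""
    h = (beta - alpha) / n
    total = 0.0
    for i in range(n):
        theta = alpha + (i + 0.5) * h  # midpoint of the i-th subinterval
        total += f(theta) ** 2
    return 0.5 * total * h

def cardioid(theta):
    """The curve r = 2 - 2 sin(theta)."""
    return 2 - 2 * math.sin(theta)

area = polar_area(cardioid, 0.0, 2 * math.pi)
print(area, 6 * math.pi)  # the two numbers should agree closely
```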

Recall that if a path r(t) = ⟨x(t), y(t)⟩ is differentiable and r′(t) is continuous on [a, b], then the length s of
the path r(t) for a ≤ t ≤ b is given by
\[ s = \int_a^b \|\mathbf{r}'(t)\| \, dt = \int_a^b \sqrt{x'(t)^2 + y'(t)^2} \, dt. \]

We now derive an arc length formula for polar coordinates. For paths where the distance from the origin or
angles play an important role, a polar coordinate parametrization can be handy. A curve given by r = f (θ)
can be parametrized by
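\[ \mathbf{r}(\theta) = \langle r\cos\theta, r\sin\theta \rangle = \langle f(\theta)\cos\theta, f(\theta)\sin\theta \rangle. \]
Then
\[ x'(\theta) = \frac{dx}{d\theta} = -f(\theta)\sin\theta + f'(\theta)\cos\theta \]
and
\[ y'(\theta) = \frac{dy}{d\theta} = f(\theta)\cos\theta + f'(\theta)\sin\theta. \]
So, after expanding the squares (the cross terms cancel),
\[ x'(\theta)^2 + y'(\theta)^2 = f(\theta)^2 + f'(\theta)^2 \]
and we get the arc length of the path r = f(θ) from θ = α to θ = β is
\[ s = \int_\alpha^\beta \sqrt{f(\theta)^2 + f'(\theta)^2} \, d\theta. \]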


Example 9.7. Consider the cardioid in the previous example, given by r = 2 − 2 sin θ. Find the perimeter
of this cardioid.

The perimeter is the arc length from θ = 0 to θ = 2π, which is
\[ s = \int_0^{2\pi} \sqrt{(2 - 2\sin\theta)^2 + (-2\cos\theta)^2} \, d\theta = 16. \]
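
The perimeter can be checked the same way as the area. This is a rough numerical sketch of the polar arc length formula using a midpoint Riemann sum; the helper name `polar_arc_length` and the step count are illustrative assumptions.

```python
import math

def polar_arc_length(f, df, alpha, beta, n=200_000):
    """Approximate the integral of sqrt(f^2 + f'^2) over [alpha, beta] (midpoint rule)."""
    h = (beta - alpha) / n
    total = 0.0
    for i in range(n):
        theta = alpha + (i + 0.5) * h
        total += math.sqrt(f(theta) ** 2 + df(theta) ** 2)
    return total * h

def f(theta):
    """The cardioid r = 2 - 2 sin(theta)."""
    return 2 - 2 * math.sin(theta)

def df(theta):
    """Derivative of the cardioid: f'(theta) = -2 cos(theta)."""
    return -2 * math.cos(theta)

perimeter = polar_arc_length(f, df, 0.0, 2 * math.pi)
print(perimeter)  # should be close to 16
```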

We now extend polar coordinates to 3-space. There are two natural ways to do this.

The first extension of polar coordinates is cylindrical coordinates. In cylindrical coordinates, we replace
the x and y components of a point P = (x, y, z) with polar coordinates, to get

P = (r, θ, z)

where r ≥ 0 and θ are the polar coordinates of the projection of P onto the xy-plane.

Let x, y, z be rectangular coordinates and r, θ, z be cylindrical coordinates. Then
\[ x = r\cos\theta, \qquad y = r\sin\theta, \qquad z = z \]
and
\[ r = \sqrt{x^2 + y^2}, \qquad \tan\theta = \frac{y}{x}, \qquad z = z. \]

The level surfaces in cylindrical coordinates are
• r = R, a right-circular cylinder with axis the z-axis,
• θ = θ0, a vertical half-plane through the z-axis making an angle θ0 with the xz-plane, and
• z = c, a horizontal plane at height c.


Example 9.8. Describe and sketch the region given in cylindrical coordinates.

\[ x^2 + y^2 \le 4, \qquad x \ge 0, \qquad y \ge 0, \qquad 0 \le z \le 2. \]

$x^2 + y^2 \le 4$ means r ≤ 2, x ≥ 0 and y ≥ 0 means 0 ≤ θ ≤ π/2. So, this region is all points of distance at
most 2 from the z-axis, with angle from 0 to π/2 and z = 0 to z = 2.

\[ \sqrt{x^2 + y^2} \le z \le 2, \qquad y \ge 0, \qquad x \le 0. \]

$\sqrt{x^2 + y^2} \le z \le 2$ means r ≤ z ≤ 2, y ≥ 0 and x ≤ 0 means π/2 ≤ θ ≤ π. The surface z = r is the right
circular cone with vertex at the origin, so this region is all the points above this cone up to z = 2, with
angle from π/2 to π.
2

The second extension of polar coordinates is spherical coordinates.

Consider a point P = (x, y, z) in R3. Let ρ be the distance from P to the origin.
Then we can think of P as a point on the sphere of radius ρ centered at the origin, and P is then determined
by two angular coordinates:
• θ, with 0 ≤ θ < 2π, is the polar angle of the
projection Q of P onto the xy-plane, and
• φ, with 0 ≤ φ ≤ π, is the angle of dec-
lination, which measures how much the ray
through P declines from the vertical.

Thus, P is determined by the triple P = (ρ, θ, φ), which are called spherical coordinates.


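Let x, y, z be rectangular coordinates and ρ, θ, φ be spherical coordinates. Then
\[ x = \rho\sin\phi\cos\theta, \qquad y = \rho\sin\phi\sin\theta, \qquad z = \rho\cos\phi \]
and
\[ \rho = \sqrt{x^2 + y^2 + z^2}, \qquad \tan\theta = \frac{y}{x}, \qquad \cos\phi = \frac{z}{\rho}. \]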
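
The spherical conversion formulas can be exercised with a short Python sketch. The function names are illustrative assumptions, and `math.atan2` together with a reduction modulo 2π is just one way of choosing θ in [0, 2π); the notes themselves only state tan θ = y/x.

```python
import math

def spherical_to_rect(rho, theta, phi):
    """x = rho sin(phi) cos(theta), y = rho sin(phi) sin(theta), z = rho cos(phi)."""
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))

def rect_to_spherical(x, y, z):
    """Return (rho, theta, phi) with 0 <= theta < 2*pi and 0 <= phi <= pi."""
    rho = math.sqrt(x * x + y * y + z * z)
    theta = math.atan2(y, x) % (2 * math.pi)
    # At the origin phi is not determined; 0.0 is an arbitrary choice here.
    # The clamp guards acos against tiny rounding errors outside [-1, 1].
    phi = math.acos(max(-1.0, min(1.0, z / rho))) if rho > 0 else 0.0
    return (rho, theta, phi)

# Round trip for a sample point in the first octant.
rho, theta, phi = rect_to_spherical(1.0, 1.0, math.sqrt(2))
back = spherical_to_rect(rho, theta, phi)
print(rho, theta, phi, back)
```

For the sample point (1, 1, √2) the sketch gives ρ = 2 and θ = φ = π/4, up to rounding.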

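The level surfaces in spherical coordinates are given by
• ρ = R, the sphere of radius R centered at the origin,
• θ = θ0, a vertical half-plane through the z-axis, and
• φ = φ0, a right-circular cone with vertex at the origin and axis the z-axis (for φ0 ≠ 0, π/2, π).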

Example 9.9. Describe and sketch the region given in spherical coordinates.

\[ x^2 + y^2 + z^2 \le 4, \qquad x \ge 0, \qquad y \ge 0, \qquad z \ge 0. \]

$x^2 + y^2 + z^2 \le 4$ means ρ ≤ 2, x ≥ 0 and y ≥ 0 means 0 ≤ θ ≤ π/2, and z ≥ 0 means 0 ≤ φ ≤ π/2. So, this
is the region inside the sphere of radius 2 in the first octant.

Can you think of an easy, practical application of using spherical coordinates?

MTH 2321 Notes 10 CURVATURE

10 Curvature (13.4)
In this section we will explore curvature, which is a measurement of how much a curve or surface bends.
Recall from Calculus 1 that one way to measure how the graph of a function f bends is by the second deriva-
tive. This is not a nice enough way to measure bending of curves, especially since not all curves are graphs
of functions! Curvature should be an intrinsic property of a curve, regardless of position or orientation in
the plane/space, so we need a definition of curvature that depends only on the curve itself.

Definition 10.1. Consider a curve with parametrization r(t). If r′(t) ≠ 0 for all t in the domain of r(t), we
say the parametrization is regular.
Definition 10.2. At every point P along a curve, there is a unit tangent vector T = T_P of length one
that points in the direction of motion of the parametrization. The unit tangent vector at time t is given by
\[ \mathbf{T}(t) = \frac{\mathbf{r}'(t)}{\|\mathbf{r}'(t)\|}. \]

Example 10.3. The unit tangent vector is drawn at a point on the curve below. At this point, is the
parametrization of the curve clockwise or counterclockwise?

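Example 10.4. Find the unit tangent vector at t = 1 for $\mathbf{r}(t) = \langle t, t^4, 3t + 4 \rangle$.

\[ \mathbf{T}(1) = \frac{\mathbf{r}'(1)}{\|\mathbf{r}'(1)\|} = \frac{\langle 1, 4(1)^3, 3 \rangle}{\|\langle 1, 4(1)^3, 3 \rangle\|} = \frac{\langle 1, 4, 3 \rangle}{\sqrt{1^2 + 4^2 + 3^2}} = \frac{1}{\sqrt{26}} \langle 1, 4, 3 \rangle \]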
Imagine a fly buzzing around a room along some path and observing how the unit tangent vector T changes
direction. If T is changing, that means the path is bending, and the more rapidly T changes, the more the
path bends. So, it seems like defining curvature from the change in T, dT/dt, is a good idea. However, if the
fly follows the same path at two different speeds (i.e. for two different parametrizations of the path), dT/dt
will have different values for a given time. So, let’s assume the fly follows the path at unit speed, i.e. an
arc length parametrization!


Definition 10.5 (Curvature). Let r(s) be an arc length parametrization and T the unit tangent vector.
The curvature at r(s) is
\[ \kappa(s) = \left\| \frac{d\mathbf{T}}{ds} \right\| = \|\mathbf{r}''(s)\|. \]
A curve with constant curvature zero is called flat.
Example 10.6. Compute the curvature of a circle of radius R.

As a circle is symmetric in the sense that it bends the same amount at every point, the curvature at every
point on a given circle should be the same, i.e. circles have constant curvature. It is also clear that
circles with smaller radii bend “more” than circles of larger radii. So we have a vague idea of what the
curvature of a circle should be - now let’s find it exactly. We might as well consider the circle in R2 of radius
R centered at the origin, for ease. This circle is given by the parametrization

r(t) = hR cos t, R sin ti .

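Recall that an arc length parametrization is one with ‖r′(t)‖ = 1 for all t. To find an arc length parametrization
of the circle of radius R we first compute the arc length function
\[ s(\theta) = \int_0^\theta \|\mathbf{r}'(u)\| \, du = \int_0^\theta R \, du = R\theta. \]
So s = Rθ, and the inverse of the arc length function is θ = s/R. So an arc length parametrization of the
circle of radius R is
\[ \mathbf{r}_1(s) = \mathbf{r}\left(\frac{s}{R}\right) = \left\langle R\cos\frac{s}{R}, R\sin\frac{s}{R} \right\rangle. \]
Then
\[ \mathbf{T}(s) = \frac{d\mathbf{r}_1}{ds} = \frac{d}{ds}\left\langle R\cos\frac{s}{R}, R\sin\frac{s}{R} \right\rangle = \left\langle -\sin\frac{s}{R}, \cos\frac{s}{R} \right\rangle, \]
and
\[ \frac{d\mathbf{T}}{ds} = \frac{d}{ds}\left\langle -\sin\frac{s}{R}, \cos\frac{s}{R} \right\rangle = -\frac{1}{R}\left\langle \cos\frac{s}{R}, \sin\frac{s}{R} \right\rangle. \]
Thus
\[ \kappa(s) = \left\| \frac{d\mathbf{T}}{ds} \right\| = \frac{1}{R}. \]
Thus a circle of radius R has constant curvature of 1/R.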

Remember from our section on arc length that arc length parametrizations are often impossible to find.
Fortunately, we can compute curvature using any regular parametrization r(t)! Since the arc length s is a
function of time t, the chain rule gives
\[ \frac{d\mathbf{T}}{dt} = \frac{d\mathbf{T}}{ds}\frac{ds}{dt} = v(t)\frac{d\mathbf{T}}{ds} \]
where v(t) = ds/dt = ‖r′(t)‖ is the speed of r(t). Thus,
\[ \left\| \frac{d\mathbf{T}}{dt} \right\| = v(t)\left\| \frac{d\mathbf{T}}{ds} \right\| = \kappa(t)v(t) \]
and solving for curvature we get
\[ \kappa(t) = \frac{1}{v(t)}\left\| \frac{d\mathbf{T}}{dt} \right\|. \]
This can be used to prove the following theorem (the proof is in your book).


Theorem 10.7 (Formula for Curvature). If r(t) is a regular parametrization of a space curve, then
\[ \kappa(t) = \frac{\|\mathbf{r}'(t) \times \mathbf{r}''(t)\|}{\|\mathbf{r}'(t)\|^3}. \]

The above equation can be applied to plane curves by letting z(t) = 0.
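
To see the curvature formula in action, here is a small Python sketch that evaluates ‖r′ × r″‖ / ‖r′‖³ from hand-computed derivatives. The helper names `cross`, `norm`, and `curvature`, the radius R = 3, and the sample time t = 0.7 are all assumptions chosen only for illustration.

```python
import math

def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    """Euclidean length of a vector."""
    return math.sqrt(sum(c * c for c in u))

def curvature(r_prime, r_double_prime):
    """kappa = |r' x r''| / |r'|^3 for a regular parametrization."""
    return norm(cross(r_prime, r_double_prime)) / norm(r_prime) ** 3

t = 0.7  # arbitrary sample time
R = 3.0  # arbitrary radius

# Circle r(t) = <R cos t, R sin t, 0>, treated as a space curve with z(t) = 0.
circle_kappa = curvature((-R * math.sin(t), R * math.cos(t), 0.0),
                         (-R * math.cos(t), -R * math.sin(t), 0.0))

# Helix r(t) = <cos t, sin t, t>.
helix_kappa = curvature((-math.sin(t), math.cos(t), 1.0),
                        (-math.cos(t), -math.sin(t), 0.0))

print(circle_kappa, helix_kappa)
```

The circle value agrees with the curvature 1/R found in Example 10.6, and the helix has constant curvature 1/2.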


We now consider the curvature of the graph of y = f(x). We already said that just using f″ was not enough.
Corollary 10.8 (Curvature of a Graph in the Plane). The curvature at the point (x, f(x)) on the graph of
y = f(x) is
\[ \kappa(x) = \frac{|f''(x)|}{\left(1 + f'(x)^2\right)^{3/2}}. \]
Proof. The curve y = f(x) can be parametrized by r(x) = ⟨x, f(x)⟩. Then r′(x) = ⟨1, f′(x)⟩ and r″(x) =
⟨0, f″(x)⟩. To apply the previous theorem we define the z components to be zero:
\[ \mathbf{r}'(x) = \langle 1, f'(x), 0 \rangle \quad \text{and} \quad \mathbf{r}''(x) = \langle 0, f''(x), 0 \rangle. \]
Then
\[ \mathbf{r}'(x) \times \mathbf{r}''(x) = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 1 & f'(x) & 0 \\ 0 & f''(x) & 0 \end{vmatrix} = f''(x)\mathbf{k}, \]
so that
\[ \|\mathbf{r}'(x) \times \mathbf{r}''(x)\| = |f''(x)|, \]
and
\[ \|\mathbf{r}'(x)\| = \sqrt{1 + f'(x)^2}. \]
Thus by the previous theorem,
\[ \kappa(x) = \frac{|f''(x)|}{\left(1 + f'(x)^2\right)^{3/2}}. \]

It can be shown that T and T′ are orthogonal (you should check this yourself! hint: dot product). The unit
vector in the direction of T′, assuming it is nonzero, is the normal vector:
\[ \mathbf{N}(t) = \frac{\mathbf{T}'(t)}{\|\mathbf{T}'(t)\|}. \]
As T and T′ are orthogonal, so are T and N. The tangent and normal vectors play an important role in
understanding a space curve. As $\kappa(t) = \frac{1}{v(t)}\|\mathbf{T}'(t)\|$ we have
\[ \mathbf{T}'(t) = v(t)\kappa(t)\mathbf{N}(t). \]

N points in the direction in which the curve is turning, or to the “inside” of the curve.


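Example 10.9. Find the unit normal vector at t = π/4 to the helix r(t) = ⟨cos t, sin t, t⟩.

The tangent vector r′(t) = ⟨−sin t, cos t, 1⟩ has length ‖r′(t)‖ = √2, so
\[ \mathbf{T}(t) = \frac{1}{\sqrt{2}} \langle -\sin t, \cos t, 1 \rangle, \]
\[ \mathbf{T}'(t) = \frac{1}{\sqrt{2}} \langle -\cos t, -\sin t, 0 \rangle, \]
\[ \|\mathbf{T}'(t)\| = \frac{1}{\sqrt{2}}, \]
and
\[ \mathbf{N}(t) = \langle -\cos t, -\sin t, 0 \rangle. \]
Thus,
\[ \mathbf{N}\left(\frac{\pi}{4}\right) = \left\langle -\frac{\sqrt{2}}{2}, -\frac{\sqrt{2}}{2}, 0 \right\rangle. \]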

Note that at every point on the helix the normal vector is horizontal and points at the z-axis, which reflects
that the curve is always turning back towards the z-axis!

Recap so far: given a point on a curve at time t, we have the unit tangent vector T(t) pointing in the di-
rection of motion and the unit normal vector N(t) pointing in the direction of turning. We now introduce a
third vector related to T and N which, when combined with T and N, gives a frame of reference for the point.

Definition 10.10. The binormal vector B(t) is given by

B(t) = T(t) × N(t).

Note that as T and N are unit vectors which are orthogonal, B is also a unit vector:
\[ \|\mathbf{B}\| = \|\mathbf{T}(t) \times \mathbf{N}(t)\| = \|\mathbf{T}\|\|\mathbf{N}\| \sin\frac{\pi}{2} = 1. \]
Also, as T and N are orthogonal and B is the cross product of T and N, we have that T, N, and B are
mutually perpendicular.
Definition 10.11. The set of mutually perpendicular vectors T, N, and B is called the Frenet Frame.

As the Frenet Frame depends only on the intrinsic properties of the curve, it is very useful in analyzing
the path of a space curve with no reference points, e.g. spacecraft, satellites, asteroids, etc., even DNA!


Example 10.12. Find a formula for the binormal vector for a point on the helix r(t) = hcos t, sin t, ti.

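We have already found that
\[ \mathbf{T}(t) = \frac{1}{\sqrt{2}} \langle -\sin t, \cos t, 1 \rangle \]
and
\[ \mathbf{N}(t) = \langle -\cos t, -\sin t, 0 \rangle. \]
Thus
\[ \mathbf{B}(t) = \mathbf{T}(t) \times \mathbf{N}(t) = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ -\frac{\sin t}{\sqrt{2}} & \frac{\cos t}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\cos t & -\sin t & 0 \end{vmatrix} = \frac{\sin t}{\sqrt{2}}\mathbf{i} - \frac{\cos t}{\sqrt{2}}\mathbf{j} + \frac{1}{\sqrt{2}}\mathbf{k}. \]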
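
The Frenet frame of the helix can be checked numerically. The sketch below plugs the closed-form T and N from the examples into a cross product and verifies that the three vectors are mutually perpendicular unit vectors; the function names `frenet_helix`, `cross`, and `dot` are illustrative assumptions.

```python
import math

def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Dot product of two vectors."""
    return sum(a * b for a, b in zip(u, v))

def frenet_helix(t):
    """T, N, B for the helix r(t) = <cos t, sin t, t>, using the formulas above."""
    s = 1 / math.sqrt(2)
    T = (-s * math.sin(t), s * math.cos(t), s)
    N = (-math.cos(t), -math.sin(t), 0.0)
    B = cross(T, N)  # binormal B = T x N
    return T, N, B

T, N, B = frenet_helix(math.pi / 4)
print(B)                                 # compare with <sin t, -cos t, 1> / sqrt(2)
print(dot(T, N), dot(T, B), dot(N, B))   # all three should be (numerically) zero
print(dot(B, B))                         # should be (numerically) one
```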

MTH 2321 Notes 11 MOTION IN 3-SPACE

11 Motion in 3-Space (13.5)


In this section we study the motion of an object traveling along a path r(t) in 3-space. Recall that the
velocity vector is the derivative
\[ \mathbf{v}(t) = \mathbf{r}'(t) = \lim_{h \to 0} \frac{\mathbf{r}(t + h) - \mathbf{r}(t)}{h}. \]
As we have already seen, the velocity vector (if it is nonzero) points in the direction of motion, and its
magnitude $v(t) = \|\mathbf{v}(t)\|$ is the object’s speed. The acceleration vector is the second derivative
$\mathbf{a}(t) = \mathbf{r}''(t)$.
Example 11.1. Tucker throws her cat toy and it travels along the path $\mathbf{r}(t) = \langle t^3, 1 - t, 4t^2 \rangle$. Calculate
the velocity and acceleration vectors and the speed of the toy at t = 1.

Taking derivatives we get the velocity and acceleration
\[ \mathbf{v}(t) = \mathbf{r}'(t) = \langle 3t^2, -1, 8t \rangle, \qquad \mathbf{v}(1) = \langle 3, -1, 8 \rangle, \]
\[ \mathbf{a}(t) = \mathbf{r}''(t) = \langle 6t, 0, 8 \rangle, \qquad \mathbf{a}(1) = \langle 6, 0, 8 \rangle, \]
and the speed at t = 1 is
\[ v(1) = \|\mathbf{v}(1)\| = \sqrt{3^2 + (-1)^2 + 8^2} = \sqrt{74}. \]

Near the surface of the Earth, objects in free fall with no air resistance have an acceleration of −9.8 m/s² ≈
−32 ft/s². We will refer to g = 9.8 m/s² as the gravitational constant. An object moving along a space curve
near the surface of the Earth with no additional means of acceleration will then have an acceleration vector
(in m/s²) given by
\[ \mathbf{a}(t) = -g\mathbf{k}. \]
Note that if the object is traveling within a vertical plane, we can treat it as traveling along a plane curve,
and then its acceleration (in m/s²) is
\[ \mathbf{a}(t) = -g\mathbf{j}. \]
Let us now consider an object that is projected from the ground, ignoring wind resistance, and travels along
a curve r(t) that lies in a vertical plane. The object has an initial velocity $\mathbf{v}(0) = \mathbf{v}_0$ and is projected at
an angle θ. If $v(t)$ is the object’s speed, and $v(0) = v_0$ is the initial speed, we can write the initial velocity as
$\mathbf{v}_0 = \langle v_0\cos\theta, v_0\sin\theta \rangle$. Then, using $\mathbf{a}(t) = -g\mathbf{j}$, we can find v(t) and r(t) by integrating!
\[ \mathbf{v}(t) = \int \mathbf{a}(t) \, dt = -gt\mathbf{j} + \mathbf{C}_0. \]
Then $\mathbf{v}_0 = \mathbf{C}_0$, so that
\[ \mathbf{v}(t) = \langle v_0\cos\theta, -gt + v_0\sin\theta \rangle. \]
Integrating again we get
\[ \mathbf{r}(t) = \int \mathbf{v}(t) \, dt = \left\langle (v_0\cos\theta)t, -\frac{1}{2}gt^2 + (v_0\sin\theta)t \right\rangle + \mathbf{C}_1. \]
Placing the object’s starting point at the origin we get $\mathbf{r}(0) = \mathbf{r}_0 = \mathbf{0}$, so that $\mathbf{C}_1 = \mathbf{0}$, and hence
\[ \mathbf{r}(t) = \left\langle (v_0\cos\theta)t, -\frac{1}{2}gt^2 + (v_0\sin\theta)t \right\rangle. \]


Example 11.2. John and Sully are playing fetch. John throws Sully’s ball from an initial height of 2 meters
at an angle of 60 degrees with initial speed of 15 meters/second. The ball hits a tree 15 meters away. How
high up the tree did the ball hit?

We will start by placing the origin at the point where John releases the ball (then, at the end we know to
add 2 meters to the height we get). We then know that
\[ \mathbf{r}(t) = \left\langle (v_0\cos\theta)t, -\frac{1}{2}gt^2 + (v_0\sin\theta)t \right\rangle = \left\langle 7.5t, -4.9t^2 + 7.5\sqrt{3}\,t \right\rangle. \]
The position vector for the point at which the ball hits the tree is ⟨15, H⟩, hence at the time when the ball
hits the tree we have
\[ \left\langle 7.5t, -4.9t^2 + 7.5\sqrt{3}\,t \right\rangle = \langle 15, H \rangle, \]
and thus equating components we get
\[ 7.5t = 15 \quad \Longrightarrow \quad t = 2 \]
and
\[ -4.9t^2 + 7.5\sqrt{3}\,t = H \quad \Longrightarrow \quad H \approx 6.38. \]
Thus the ball hits the tree at a height of 8.38 meters.
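
The arithmetic in this example is easy to reproduce. Here is a minimal Python sketch of the projectile formula for r(t), assuming g = 9.8 m/s² and no air resistance as in the notes; the function name `projectile_position` and the variable names are illustrative.

```python
import math

def projectile_position(v0, angle_deg, t, g=9.8):
    """Position <x, y> at time t of a projectile launched from the origin."""
    theta = math.radians(angle_deg)
    x = v0 * math.cos(theta) * t
    y = -0.5 * g * t**2 + v0 * math.sin(theta) * t
    return x, y

v0 = 15.0              # initial speed in m/s
angle = 60.0           # launch angle in degrees
release_height = 2.0   # height of John's hand in meters
tree_distance = 15.0   # horizontal distance to the tree in meters

# Solve (v0 cos(theta)) t = tree_distance for the time of impact.
t_hit = tree_distance / (v0 * math.cos(math.radians(angle)))
x_hit, H = projectile_position(v0, angle, t_hit)
height_on_tree = H + release_height
print(t_hit, H, height_on_tree)
```

Rounded to two decimals, the final height matches the 8.38 meters found above.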
In linear motion, acceleration is zero if the speed is constant. However, in two or three dimensions acceleration
can be nonzero even if the speed is constant! This happens when $v(t) = \|\mathbf{v}(t)\|$ is constant but the direction of
v(t) is changing. An example of this is uniform circular motion, where an object travels in a circular
path at a constant speed. In this case, the acceleration vector a(t) is called the centripetal acceleration.
Example 11.3. Find a(t) and ‖a(t)‖ for motion of an object around a circle of radius R at a constant speed v.

Assume that the object follows the path
\[ \mathbf{r}(t) = R \langle \cos\omega t, \sin\omega t \rangle \]
for some constant ω. Then the velocity and speed of the object are
\[ \mathbf{v}(t) = R\omega \langle -\sin\omega t, \cos\omega t \rangle \quad \text{and} \quad v = \|\mathbf{v}(t)\| = R|\omega|. \]
Thus |ω| = v/R, and so
\[ \mathbf{a}(t) = -R\omega^2 \langle \cos\omega t, \sin\omega t \rangle \]
and
\[ \|\mathbf{a}(t)\| = R\omega^2 = R\left(\frac{v}{R}\right)^2 = \frac{v^2}{R}. \]
So the centripetal acceleration a(t) has length v²/R and points towards the origin, as a(t) is a negative
multiple of r(t).

In 1666, Isaac Newton discovered his laws of motion, the second of which is: The vector sum of the forces F
on an object is equal to the mass m of that object times the acceleration a of the object:

F = ma = mr00 .

Newton, in 1687, was also the first to derive Johannes Kepler’s laws of planetary motion, stated by Kepler
in the 1610’s.


Kepler’s first law states that the orbit of a planet is an ellipse with the sun at one focus. The proof of this
is given in the textbook.

Kepler’s second law states that the position vector pointing from the sun to the planet sweeps out equal
areas in equal times. The proof of this is given in the textbook.

The two shaded regions have the same area, and so the planet sweeps them out in equal times. To do
this, the planet must travel faster going from A to B than from C to D.

Kepler’s Third Law requires a little set-up. Let

• R be the semimajor axis of the ellipse in meters,
• G be the universal gravitational constant: $6.673 \times 10^{-11}\ \mathrm{m^3/(kg\,s^2)}$,
• M be the mass of the sun, about $1.989 \times 10^{30}$ kg,
• T be the period of the orbit in seconds.

Kepler’s Third Law then states
\[ T^2 = \left(\frac{4\pi^2}{GM}\right) R^3. \]
In fact, Kepler only knew that T² and R³ were proportional, while Newton later found the proportionality
constant. Note that this is how astronomers measure the masses of the planets and moons in our solar system
(and how they try to measure masses of galaxies).
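
As a sanity check of Kepler’s Third Law, the following Python sketch computes the orbital period of the Earth from the constants above. The semimajor axis of about 1.496 × 10¹¹ m for Earth’s orbit is an assumed input that does not appear in these notes, and the names `orbital_period` and `R_EARTH_ORBIT` are illustrative.

```python
import math

G = 6.673e-11             # universal gravitational constant, m^3 / (kg s^2)
M_SUN = 1.989e30          # mass of the sun, kg
R_EARTH_ORBIT = 1.496e11  # assumed semimajor axis of Earth's orbit, m (not from the notes)

def orbital_period(R, M=M_SUN):
    """Period T in seconds, from T^2 = (4 pi^2 / (G M)) R^3."""
    return math.sqrt(4 * math.pi**2 * R**3 / (G * M))

T_seconds = orbital_period(R_EARTH_ORBIT)
T_days = T_seconds / 86400  # 86400 seconds in a day
print(T_days)
```

With these inputs the period comes out close to one year, as expected.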

MTH 2321 Notes 12 MULTI-VARIABLE FUNCTIONS AND PARAMETRIZED SURFACES

12 Multivariable Functions, Parametrized Surfaces (14.1,14.2,16.4)


In this section we will extend concepts we have learned so far to functions of more than one variable.

Definition 12.1. A function of n variables is a function with input an n-tuple, (x1, . . . , xn). The domain
of a function of n-variables is the set of all n-tuples for which the function is defined. The range of a function of
n-variables is the image under the function of the domain.

Definition 12.2. The graph of a function of two variables f (x, y) is the set of points in R3 of the form (a, b, c=
f (a, b)) for all (a, b) in the domain of f .

If a function f of two variables is continuous (defined later), then the graph of f is a surface in R3 whose height
(above or below) the xy-plane at (a, b) is the value of f (a, b). We will often write z = f (x, y) to emphasize that
the z-coordinate of a point on the graph is a function of x and y.

We will analyze the graphs of functions of two variables by fixing the x, y, or z coordinate and analyzing the
resulting space curve.

Definition 12.3. For a function of two variables f (x, y),


• the vertical trace in the plane x = a is the intersection of the graph of f with the vertical plane
x = a,
• the vertical trace in the plane y = b is the intersection of the graph of f with the vertical plane
y = b,
• the horizontal trace at height c is the intersection of the graph with the horizontal plane z = c.
Create your own example and view the vertical traces here.

Definition 12.4. Associated to the horizontal trace at height c is a level curve corresponding to c:
the curve in the xy-plane given by f (x, y) = c. Each level curve corresponding to c is the projection of the
horizontal trace at height c onto the xy-plane.

Definition 12.5. A contour map is a plot in the xy-plane that shows the level curves f (x, y) = c for
equally spaced values of c. The interval m between the values is called the contour interval.

Moving from one curve to the next, the value of the function (and hence height of the graph) changes by
±m. So, level curves are spaced far apart where the graph is flatter and close together where the graph is
steeper. See examples of contour maps and level curves here.

Graphs of functions of 3 or more variables cannot be directly visualized (as we would need to draw in at
least 4 dimensions). However, we can extend the idea of level curves to graphs of functions of 3 variables.

Definition 12.6. The level surfaces of a function of three variables f (x, y, z) are the surfaces with equations
f (x, y, z) = c.
Up to this point, our description of surfaces has been that of solution sets to equations, and now as the
graph of a function f (x, y). Just as we moved from a similar description for plane curves and space curves
to parametrized curves, we will use functions of two variables to parametrize a surface.


Definition 12.7. A parametrized surface is a surface S whose points are described in the form

G(u, v) = (x(u, v), y(u, v), z(u, v)).

The two parameters u and v vary in a region D called the parameter domain.

Note that as a surface is two dimensional, it requires two parameters (whereas a curve is one dimensional
and requires only one parameter).

Example 12.8. Parametrize the surface given by $x^2 + y^2 = 4$.

This surface is the right-circular cylinder centered around the z-axis of radius 2, and can be nicely parametrized
using cylindrical coordinates. The points on this cylinder in cylindrical coordinates are of the form (2, θ, z),
so we will use θ and z as our parameters:

G(θ, z) = (2 cos θ, 2 sin θ, z), 0 ≤ θ < 2π, −∞ < z < ∞.

The parametrization of a cylinder amounts to wrapping the rectangular parameter domain around the z
axis, as shown below.

Definition 12.9 (Parametrization of a Right-Circular Cylinder). A right-circular cylinder with center the
z-axis and radius R is parametrized by

G(θ, z) = (R cos θ, R sin θ, z), 0 ≤ θ < 2π, −∞ < z < ∞.


Example 12.10. Parametrize the sphere centered at the origin of radius 4.

The sphere centered at the origin of radius 4 is nicely parametrized using spherical coordinates. The points
on this sphere in spherical coordinates are of the form (4, θ, φ), so we will use θ and φ as our parameters:

G(θ, φ) = (4 cos θ sin φ, 4 sin θ sin φ, 4 cos φ), 0 ≤ θ < 2π, 0 ≤ φ ≤ π.

The parametrization of a sphere amounts to wrapping the rectangular parameter domain around the origin, and
collapsing the top and bottom edges to the north and south poles, respectively. The north and south poles
correspond to φ = 0 and φ = π, respectively, with any value of θ. The function G is not 1-1 at the poles.

Definition 12.11 (Parametrization of a Sphere). A sphere centered at the origin with radius R is parametrized
by
G(θ, φ) = (R cos θ sin φ, R sin θ sin φ, R cos φ), 0 ≤ θ < 2π, 0 ≤ φ ≤ π.
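
A quick numerical check of this parametrization: the Python sketch below samples G(θ, φ) on a grid and confirms that every sampled point lies on the sphere, and that different values of θ give the same point at the north pole. The function name `sphere_point` and the grid sizes are assumptions made for illustration.

```python
import math

def sphere_point(R, theta, phi):
    """G(theta, phi) = (R cos(theta) sin(phi), R sin(theta) sin(phi), R cos(phi))."""
    return (R * math.cos(theta) * math.sin(phi),
            R * math.sin(theta) * math.sin(phi),
            R * math.cos(phi))

R = 4.0
max_error = 0.0
for i in range(12):           # 12 sample values of theta in [0, 2*pi)
    for j in range(7):        # 7 sample values of phi in [0, pi]
        theta = 2 * math.pi * i / 12
        phi = math.pi * j / 6
        x, y, z = sphere_point(R, theta, phi)
        # Each point should satisfy x^2 + y^2 + z^2 = R^2.
        max_error = max(max_error, abs(x * x + y * y + z * z - R * R))

# phi = 0 gives the north pole for every theta, so G is not one-to-one there.
north_a = sphere_point(R, 0.0, 0.0)
north_b = sphere_point(R, 1.0, 0.0)
print(max_error, north_a, north_b)
```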
The simplest situation for parametrizing a surface is when the surface is itself the graph of a function
z = f (x, y). In this case we have the following.

Definition 12.12 (Parametrization of Graph). If S is the graph of a function z = f (x, y), S is parametrized
by
G(x, y) = (x, y, f (x, y)).

Of course, other surfaces we have encountered can be parametrized, but one will need to find a suitable
parametrization with appropriate coordinates.

Now that we have functions of more than one variable, let’s do calculus with them! We will focus on functions
of two variables, but the results extend naturally to functions of n variables.


Definition 12.13. Assume that f (x, y) is defined near P = (a, b). Then
\[ \lim_{(x,y) \to P} f(x, y) = L \]
if, for any ε > 0 there exists a δ > 0 such that if (x, y) ≠ P and the distance from (x, y) to P is less than δ, then
\[ |f(x, y) - L| < \varepsilon. \]
This definition is similar to the one variable case. Recall that in the one variable case, for
\[ \lim_{x \to a} f(x) = L \]
to hold, f (x) must approach the value L as x approaches a from the left or the right. In the two variable case,
f (x, y) must approach L as (x, y) approaches P , no matter how it approaches!

Definition 12.14. A function f (x, y) is continuous at P = (a, b) if
\[ \lim_{(x,y) \to P} f(x, y) = f(a, b). \]
A function f (x, y) is continuous if it is continuous at every point in its domain.


All of the standard limit laws from calculus 1 still apply: sums, constant multiples, products, quotients,
substitution for continuous functions, composition of continuous functions are continuous, etc.
Example 12.15. Evaluate the limit or determine that it does not exist.
\[ \lim_{(x,y) \to (0,0)} \frac{y^2}{3x^2 + y^2} \]

As $f(x, y) = \frac{y^2}{3x^2 + y^2}$ is undefined at (0, 0), it is not continuous at (0, 0) and so we cannot use substitution.
If we let (x, y) → (0, 0) along the y-axis, where x = 0, we get
\[ \lim_{(x,y) \to (0,0)} f(0, y) = \lim_{y \to 0} \frac{y^2}{y^2} = 1. \]
This does not mean the limit is 1! If we let (x, y) → (0, 0) along the x-axis, where y = 0, we get
\[ \lim_{(x,y) \to (0,0)} f(x, 0) = \lim_{x \to 0} \frac{0}{3x^2} = 0. \]
Since approaching (0, 0) from two different ways gave two different limits, the limit does not exist.

Let’s show this another way. If we let (x, y) approach (0, 0) along the line y = mx, then
\[ \lim_{(x,y) \to (0,0)} f(x, mx) = \lim_{x \to 0} \frac{m^2x^2}{3x^2 + m^2x^2} = \lim_{x \to 0} \frac{m^2}{3 + m^2} = \frac{m^2}{3 + m^2}. \]
As we get different values by approaching (0, 0) along different lines, the limit does not exist!
Example 12.16. Evaluate the limit or determine that it does not exist.

$$\lim_{(x,y)\to(0,0)} \frac{xy^2}{x^2+y^4}$$

If we try the same techniques as above, we end up getting:

$$\lim_{(x,y)\to(0,0)} f(0,y) = \lim_{y\to 0} \frac{0}{y^4} = 0,$$

$$\lim_{(x,y)\to(0,0)} f(x,0) = \lim_{x\to 0} \frac{0}{x^2} = 0,$$

and

$$\lim_{(x,y)\to(0,0)} f(x,mx) = \lim_{x\to 0} \frac{m^2x^3}{x^2+m^4x^4} = \lim_{x\to 0} \frac{m^2x}{1+m^4x^2} = \frac{0}{1} = 0.$$

So it would seem the original limit is zero; however, this is not the case. If we let (x, y) approach (0, 0) along the curve $y = \sqrt{x}$ (with x > 0), we get

$$\lim_{(x,y)\to(0,0)} f(x,\sqrt{x}) = \lim_{x\to 0^+} \frac{x^2}{x^2+x^2} = \lim_{x\to 0^+} \frac{1}{2} = \frac{1}{2}.$$

So the original limit does not exist.
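
The path dependence in Example 12.16 can also be seen numerically. The following is a minimal sketch, not part of the original notes; the helper names `along_line` and `along_root_curve` are ours. It evaluates f along two straight lines and along the curve y = √x as the point moves toward the origin.

```python
import math

def f(x, y):
    """The function from Example 12.16: x*y^2 / (x^2 + y^4)."""
    return x * y**2 / (x**2 + y**4)

def along_line(m, t):
    """Value of f at the point (t, m*t) on the line y = m*x."""
    return f(t, m * t)

def along_root_curve(t):
    """Value of f at the point (t, sqrt(t)) on the curve y = sqrt(x), t > 0."""
    return f(t, math.sqrt(t))

# Move toward the origin along each path and compare the values of f.
for t in (1e-2, 1e-4, 1e-6):
    print(t, along_line(1.0, t), along_line(5.0, t), along_root_curve(t))
```

Along the lines the printed values shrink toward 0, while along y = √x they stay at about 1/2. A numerical experiment like this can suggest that a limit fails to exist, but it cannot prove that a limit does exist.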


Example 12.17. Evaluate the limit or determine that it does not exist.

$$\lim_{(x,y)\to(0,0)} \frac{x^2-y^2}{\sqrt{x^2+y^2}}$$

For this limit, all of the above techniques yield a limit of zero. In fact, the limit is zero! We will show this by converting to polar coordinates.

$$\frac{x^2-y^2}{\sqrt{x^2+y^2}} = \frac{(r\cos\theta)^2-(r\sin\theta)^2}{r} = r\left((\cos\theta)^2-(\sin\theta)^2\right) = r\cos 2\theta$$

Here we used the identity $(\cos\theta)^2-(\sin\theta)^2 = \cos 2\theta$. Further,

$$0 \le \left|\frac{x^2-y^2}{\sqrt{x^2+y^2}}\right| = |r\cos 2\theta| \le r.$$

Now, as (x, y) → (0, 0) in rectangular coordinates is equivalent to r → 0 in polar coordinates, we have

$$0 \le \lim_{(x,y)\to(0,0)} \left|\frac{x^2-y^2}{\sqrt{x^2+y^2}}\right| \le \lim_{r\to 0} r = 0.$$

Thus by the Squeeze Theorem

$$\lim_{(x,y)\to(0,0)} \left|\frac{x^2-y^2}{\sqrt{x^2+y^2}}\right| = 0,$$

and hence

$$\lim_{(x,y)\to(0,0)} \frac{x^2-y^2}{\sqrt{x^2+y^2}} = 0.$$


13 Partial Derivatives (14.3)


In the last section we introduced multi-variable functions. As with functions of one variable and vector-valued functions, we are interested in studying the behavior of multi-variable functions. Unlike a function of one variable, a multi-variable function f does not have a unique rate of change - each variable may affect f in different ways!

For example, let L be the luminosity of a light source and D be the distance from the source to your eye. Then the brightness B you perceive is a function of L and D, given by

$$B = \frac{L}{4\pi D^2}.$$

By inspection of the formula we see that increasing the luminosity (while fixing distance) will increase the brightness. We also see that increasing the distance (while fixing luminosity) decreases the brightness. So L and D affect the brightness differently. We will use this idea of only letting one variable change at a time to express rates of change of a multi-variable function. The following definition is stated for a function of two variables for ease; there is a similar definition for functions of n variables.
Definition 13.1. The partial derivatives of a function f (x, y) are the rates of change with respect to each variable separately. f has 2 first-order partial derivatives: the partial derivative with respect to x, denoted $f_x$ or $\dfrac{\partial f}{\partial x}$, and the partial derivative with respect to y, denoted $f_y$ or $\dfrac{\partial f}{\partial y}$.

The partial derivative with respect to x is defined as the derivative of f (x, y) with y treated as a constant:

$$f_x = \frac{\partial f}{\partial x} = \lim_{h\to 0} \frac{f(x+h,y)-f(x,y)}{h}.$$

Similarly, the partial derivative with respect to y is defined as the derivative of f (x, y) with x treated as a constant:

$$f_y = \frac{\partial f}{\partial y} = \lim_{h\to 0} \frac{f(x,y+h)-f(x,y)}{h}.$$

As partial derivatives are computed using derivatives of functions of one variable, all the differentiation rules you know (e.g. product, quotient, chain) are still valid!
Example 13.2. Find the first-order partial derivatives of $f(x,y) = x^4y - xe^{xy}$ and evaluate them both at the point (2, 1).

To find $f_x$, take the derivative of f (x, y), treating y as a constant:

$$f_x = 4x^3y - xye^{xy} - e^{xy}, \quad\text{and so}\quad f_x(2,1) = 32 - 3e^2.$$

To find $f_y$, take the derivative of f (x, y), treating x as a constant:

$$f_y = x^4 - x^2e^{xy}, \quad\text{and so}\quad f_y(2,1) = 16 - 4e^2.$$
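
One way to sanity-check the partial derivatives in Example 13.2 is to approximate them numerically. The sketch below is ours, not part of the notes. It uses a symmetric (central) difference quotient rather than the one-sided quotient in Definition 13.1, and the helper names `partial_x`, `partial_y` and the step size `h` are illustrative choices.

```python
import math

def f(x, y):
    """The function from Example 13.2: x^4 * y - x * e^(x*y)."""
    return x**4 * y - x * math.exp(x * y)

def partial_x(g, x, y, h=1e-5):
    """Central-difference estimate of dg/dx, holding y fixed."""
    return (g(x + h, y) - g(x - h, y)) / (2 * h)

def partial_y(g, x, y, h=1e-5):
    """Central-difference estimate of dg/dy, holding x fixed."""
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

# Closed-form values computed in the example.
fx_exact = 32 - 3 * math.exp(2)
fy_exact = 16 - 4 * math.exp(2)

print(partial_x(f, 2.0, 1.0), fx_exact)
print(partial_y(f, 2.0, 1.0), fy_exact)
```

Each printed pair should agree to several decimal places, which is what "treating the other variable as a constant" means in practice: only one argument is perturbed at a time.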
As with derivatives of one-variable functions, the first-order partials at a point can be interpreted graphically
- they are the slopes of the tangent lines to the vertical traces:


As the partial derivative fx (a, b) is the derivative of f (x, b) viewed as a function of x only, we can estimate
the change in the function, ∆f , when x changes by ∆x as in the single-variable case, and we can do the
same thing for fy (a, b). So,

f (a + ∆x, b) − f (a, b) ≈ fx (a, b)∆x


f (a, b + ∆y) − f (a, b) ≈ fy (a, b)∆y.

Example 13.3. Use the contour map below to answer the following questions.

(a) Is fx (P ) greater than, less than, or equal to fx (Q)?

(b) Is fy (P ) greater than, less than, or equal to fy (Q)?

(c) Is fx (x, y) increasing, decreasing, or neither as a function of y?

Just as functions of one variable have higher-order derivatives, multi-variable functions have higher-order
partial derivatives!
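Definition 13.4. The four second-order partial derivatives of a function f (x, y) are

$$f_{xx} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right), \qquad f_{yy} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right),$$

$$f_{xy} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right), \qquad f_{yx} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right).$$

This notation can also be used for higher-order derivatives, e.g.

$$f_{xyzzy} = \frac{\partial}{\partial y}\left(\frac{\partial}{\partial z}\left(\frac{\partial}{\partial z}\left(\frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right)\right)\right)\right).$$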

Example 13.5. Find the four second-order partial derivatives of $f(x,y) = 3xy^2 - \sin(x)\cos(y)$.

$$f_x = 3y^2 - \cos(x)\cos(y) \quad\text{and}\quad f_y = 6xy + \sin(x)\sin(y)$$

and so

$$f_{xx} = \sin(x)\cos(y), \qquad f_{yy} = 6x + \sin(x)\cos(y),$$

$$f_{xy} = 6y + \cos(x)\sin(y), \qquad f_{yx} = 6y + \cos(x)\sin(y).$$

Notice that in the above example, $f_{xy} = f_{yx}$! This is not a coincidence...


Theorem 13.6 (Clairaut’s Theorem). If fxy and fyx are both continuous functions then fxy = fyx .
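
Clairaut's Theorem can be illustrated numerically for the function of Example 13.5. The sketch below is ours: it builds the two mixed partials by nesting central-difference operators in the two possible orders and compares them with the closed form 6y + cos(x) sin(y). The sample point (0.7, 1.3), the step size, and the helper names `d_dx` and `d_dy` are arbitrary illustrative choices.

```python
import math

def f(x, y):
    """The function from Example 13.5: 3*x*y^2 - sin(x)*cos(y)."""
    return 3 * x * y**2 - math.sin(x) * math.cos(y)

def d_dx(g, h=1e-4):
    """Return a function approximating dg/dx by central differences."""
    return lambda x, y: (g(x + h, y) - g(x - h, y)) / (2 * h)

def d_dy(g, h=1e-4):
    """Return a function approximating dg/dy by central differences."""
    return lambda x, y: (g(x, y + h) - g(x, y - h)) / (2 * h)

x0, y0 = 0.7, 1.3
f_xy = d_dy(d_dx(f))(x0, y0)   # differentiate in x first, then in y
f_yx = d_dx(d_dy(f))(x0, y0)   # differentiate in y first, then in x
exact_mixed = 6 * y0 + math.cos(x0) * math.sin(y0)

print(f_xy, f_yx, exact_mixed)
```

The three printed numbers should agree up to the small error of the finite-difference approximation.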

The following theorem is a neat application of partial derivatives, but requires a little set-up. We can integrate a function f (x, y) with respect to x by treating y as a constant:

$$F(x,y) = \int f(x,y)\,dx \quad\Rightarrow\quad F_x = f(x,y).$$

Similarly, we can integrate f (x, y) with respect to y or do the same for a function of three or more variables. We can extend this to definite integrals as well.

Example 13.7.

$$\int x^2y\,dx = \frac{1}{3}x^3y + C(y)$$

where C(y) is a function of y alone; it plays the role of the constant of integration because y is being treated as a constant!

$$\int_1^3 x^2y\,dx = \left.\frac{1}{3}x^3y\,\right|_{x=1}^{x=3} = 9y - \frac{1}{3}y.$$

Notice the definite integral with respect to x of f (x, y) is a single-variable function (of y).
Theorem 13.8 (Leibniz’s Rule). Let f (x, y) be a function of two variables so that f and $f_y$ are continuous for a ≤ x ≤ b and c ≤ y ≤ d. Then

$$\frac{d}{dy}\int_a^b f(x,y)\,dx = \int_a^b f_y(x,y)\,dx.$$
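
As a quick sanity check of Leibniz’s Rule, here is a worked instance using the integrand x²y from Example 13.7 on the interval 1 ≤ x ≤ 3 (this choice of integrand and interval is ours, made only for illustration):

```latex
\begin{align*}
\frac{d}{dy}\int_1^3 x^2 y \, dx
  &= \frac{d}{dy}\left(9y - \tfrac{1}{3}y\right)
   = \frac{d}{dy}\left(\tfrac{26}{3}\,y\right)
   = \frac{26}{3}, \\
\int_1^3 \frac{\partial}{\partial y}\left(x^2 y\right) dx
  &= \int_1^3 x^2 \, dx
   = \left.\tfrac{1}{3}x^3\right|_{x=1}^{x=3}
   = 9 - \tfrac{1}{3}
   = \frac{26}{3}.
\end{align*}
```

Both sides agree, as the theorem predicts.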


14 Differentiability and Tangent Planes (14.4)


Recall from calculus 1 that a function f (x) is differentiable at x = a if the derivative

$$f'(a) = \lim_{h\to 0} \frac{f(a+h)-f(a)}{h}$$

exists.

It then seems reasonable to say that a function f (x, y) is differentiable at (x, y) = (a, b) if the partial derivatives

$$f_x(a,b) = \lim_{h\to 0} \frac{f(a+h,b)-f(a,b)}{h} \quad\text{and}\quad f_y(a,b) = \lim_{h\to 0} \frac{f(a,b+h)-f(a,b)}{h}$$

both exist. However, it turns out that this is too naive. The “correct” definition will involve linearization.

Back to calculus 1: recall that the derivative $f'(a)$ (if it exists) gives the slope of the tangent line at (a, f (a)) to the graph of f (x), and the equation of the tangent line is given by

$$L(x) = f(a) + f'(a)(x-a).$$

This is called the linearization of f (x) at x = a (because it is a linear equation!), and it is a good approximation of f for x close to a:

$$f(x) \approx f(a) + f'(a)(x-a) = L(x).$$

If we then denote the error in this approximation by E(x), so that

$$f(x) = f(a) + f'(a)(x-a) + E(x) = L(x) + E(x),$$

we have

$$\lim_{x\to a} \frac{E(x)}{|x-a|} = 0.$$

We now generalize this to functions of two variables. For a function f (x, y), we can use the partial derivatives $f_x(x,y)$ and $f_y(x,y)$ to describe tangent planes to the graph of f (x, y).

Recall that to determine a plane we need a point and a normal vector. We have our point, P = (a, b, f (a, b)), so we need to find a normal vector. Further, recall that we know the slopes of two tangent lines through P , one with slope $f_x(a,b)$ and one with slope $f_y(a,b)$. So, to find a normal vector to the plane we will find vectors on these two lines and take their cross product!


By the definition of slope of a line, the vectors (with basepoint P )

$$\mathbf{v} = \langle 1, 0, f_x(a,b)\rangle \quad\text{and}\quad \mathbf{u} = \langle 0, 1, f_y(a,b)\rangle$$

lie on the lines determined by $f_x(a,b)$ and $f_y(a,b)$, respectively.

As u lies in the plane given by x = a and v lies in the plane given by y = b, and neither vector is vertical, u and v are not parallel, and so their cross product is a normal vector to the tangent plane at P :

$$\mathbf{n} = \mathbf{u}\times\mathbf{v} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 0 & 1 & f_y(a,b) \\ 1 & 0 & f_x(a,b) \end{vmatrix} = \langle f_x(a,b), f_y(a,b), -1\rangle.$$

So, if a function f (x, y) has a tangent plane at (a, b, f (a, b)), it must be given by

$$f_x(a,b)(x-a) + f_y(a,b)(y-b) - (z-f(a,b)) = 0.$$

However, this plane might not always be an “honest” tangent plane! (i.e. the tangent plane may not approximate the function.) Just as we used the tangent line to give the linearization of a function of one variable, we can use the tangent plane to give the linearization of f (x, y).
Definition 14.1 (Linearization of f (x, y) at (a, b)). The linearization of f (x, y) at (a, b) is the linear
function which gives the tangent plane:

L(x, y) = f (a, b) + fx (a, b)(x − a) + fy (a, b)(y − b).

We would like the linearization of f (x, y) at (a, b) to be a good approximation of f . If we denote the error by E(x, y), so that

$$f(x,y) = L(x,y) + E(x,y) = f(a,b) + f_x(a,b)(x-a) + f_y(a,b)(y-b) + E(x,y),$$

then we would like

$$\lim_{(x,y)\to(a,b)} \frac{E(x,y)}{\sqrt{(x-a)^2+(y-b)^2}} = 0.$$

Unfortunately, the existence of $f_x(a,b)$ and $f_y(a,b)$ does not guarantee the above.
Example 14.2. Consider the function

$$f(x,y) = \begin{cases} \dfrac{xy(x+y)}{x^2+y^2} & \text{if } (x,y) \neq (0,0) \\[2mm] 0 & \text{if } (x,y) = (0,0) \end{cases}$$

This function is seen to be continuous everywhere, and it can be checked that

$$f_x(0,0) = f_y(0,0) = 0.$$

So, the tangent plane at the origin should be z = 0. However, the graph does not look close to the z = 0 plane near the origin! E.g., along x = y, f (x, y) = x.
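
The claims in this example can be verified directly from the limit definitions. The following short computation is a sketch of that check; E denotes the error f − L, which here equals f because the candidate linearization is L(x, y) = 0.

```latex
\begin{align*}
f_x(0,0) &= \lim_{h\to 0}\frac{f(h,0)-f(0,0)}{h} = \lim_{h\to 0}\frac{0-0}{h} = 0,
\qquad
f_y(0,0) = \lim_{h\to 0}\frac{f(0,h)-f(0,0)}{h} = \lim_{h\to 0}\frac{0-0}{h} = 0, \\
\frac{E(x,x)}{\sqrt{x^2+x^2}} &= \frac{f(x,x)}{\sqrt{2}\,|x|} = \frac{x}{\sqrt{2}\,|x|} = \pm\frac{1}{\sqrt{2}}
\qquad (x \neq 0).
\end{align*}
```

The last ratio does not tend to 0 as x → 0, so the error is not small compared with the distance to the origin, even though both partial derivatives exist there.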


With the previous example in mind, a continuous function f (x, y) should be differentiable, and its linearization a good approximation, if it is locally linear, i.e. the graph of f (x, y) looks like the tangent plane when zoomed in. (Fun fact - this is really saying that the graph of f is a 2-dimensional real manifold.)

Definition 14.3 (Differentiable and Linearizable). Let f (x, y) be defined near (a, b). We say that f (x, y) is differentiable at (a, b) if
1. $f_x(a,b)$ and $f_y(a,b)$ exist and
2. f (x, y) is linearizable near (a, b), i.e. if f (x, y) = L(x, y) + E(x, y), then

$$\lim_{(x,y)\to(a,b)} \frac{E(x,y)}{\sqrt{(x-a)^2+(y-b)^2}} = 0.$$

If f (x, y) is differentiable at (a, b), then L(x, y) is a good approximation of f (x, y) for (x, y) near (a, b), i.e. f (x, y) ≈ L(x, y). In this way, if f (x, y) is differentiable then f has an “honest” tangent plane:

Theorem 14.4 (Equation of the Tangent Plane). If f (x, y) is differentiable at (a, b), then the graph of
f (x, y) has a tangent plane at (a, b, f (a, b)) given by

fx (a, b)(x − a) + fy (a, b)(y − b) − (z − f (a, b)) = 0,

or, equivalently,
z = f (a, b) + fx (a, b)(x − a) + fy (a, b)(y − b).

Theorem 14.5 (Criterion for Differentiability). If fx and fy exist and are continuous on an open disk D,
then f (x, y) is differentiable on D.


Example 14.6. Let f (x, y) = sin x cos y. Find the tangent plane to f at (0, 0) and use this to approximate
f (.05, .05).

fx = cos x cos y and fy = − sin x sin y,


so fx and fy are continuous, and hence f is differentiable. Then as

fx (0, 0) = 1 and fy (0, 0) = 0,

the tangent plane to f at (0, 0) is z = x. So f (.05, .05) ≈ .05.

If f (x, y) is a differentiable function, then it is useful to define the differential of f (x, y) as the formal expression

$$df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy.$$

We can interpret this as follows. Let ∆x and ∆y be small changes in x and y, respectively. Then,

$$\Delta f \approx \frac{\partial f}{\partial x}\Delta x + \frac{\partial f}{\partial y}\Delta y,$$

i.e.

$$f(a+\Delta x, b+\Delta y) \approx f(a,b) + f_x(a,b)\Delta x + f_y(a,b)\Delta y,$$

or

$$\Delta f = f(a+\Delta x, b+\Delta y) - f(a,b) \approx f_x(a,b)\Delta x + f_y(a,b)\Delta y.$$

Note: the above linear approximation formula can be extended to functions of three variables by

f (a + ∆x, b + ∆y, c + ∆z) ≈ f (a, b, c) + fx (a, b, c)∆x + fy (a, b, c)∆y + fz (a, b, c)∆z.

Example 14.7. Show that $f(x,y) = x^2 + y^{-2}$ is differentiable on its domain and find an equation of the tangent plane at P = (4, 1).

The partial derivatives of f are

$$f_x = 2x \quad\text{and}\quad f_y = -2y^{-3},$$

which both exist and are continuous on the domain of f , y ≠ 0. So f is differentiable on its domain. To find the tangent plane we need:

$$f(4,1) = 17, \qquad f_x(4,1) = 8, \qquad f_y(4,1) = -2.$$

Then the tangent plane is given by

$$z = 17 + 8(x-4) - 2(y-1).$$


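Example 14.8. Approximate $\sqrt{\dfrac{9.1}{3.9}}$.

Consider the linear approximation of $f(x,y) = \sqrt{\dfrac{x}{y}}$ at (9, 4):

$$f(9+\Delta x, 4+\Delta y) \approx f(9,4) + f_x(9,4)\Delta x + f_y(9,4)\Delta y.$$

Then, let ∆x = 0.1 and ∆y = −0.1, and

$$\begin{aligned}
f(x,y) &= x^{1/2}y^{-1/2} & &\Rightarrow & f(9,4) &= \frac{3}{2} \\
f_x(x,y) &= \frac{1}{2}x^{-1/2}y^{-1/2} & &\Rightarrow & f_x(9,4) &= \frac{1}{12} \\
f_y(x,y) &= -\frac{1}{2}x^{1/2}y^{-3/2} & &\Rightarrow & f_y(9,4) &= -\frac{3}{16}.
\end{aligned}$$

So,

$$\sqrt{\frac{9.1}{3.9}} \approx \frac{3}{2} + \frac{1}{12}(0.1) - \frac{3}{16}(-0.1) \approx 1.5271.$$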
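
To see how good this linear approximation is, one can compare it with a direct floating-point evaluation. The following is a small sketch of ours; the function name `linearization` is hypothetical, and the hard-coded values 3/2, 1/12 and −3/16 are the ones computed in the example above.

```python
import math

def f(x, y):
    """The function being approximated: sqrt(x / y)."""
    return math.sqrt(x / y)

def linearization(x, y, a=9.0, b=4.0):
    """L(x, y) = f(a,b) + f_x(a,b)(x - a) + f_y(a,b)(y - b) at (a, b) = (9, 4)."""
    f_ab = 3 / 2        # f(9, 4)
    fx_ab = 1 / 12      # f_x(9, 4)
    fy_ab = -3 / 16     # f_y(9, 4)
    return f_ab + fx_ab * (x - a) + fy_ab * (y - b)

approx = linearization(9.1, 3.9)
exact = f(9.1, 3.9)
print(approx, exact, abs(exact - approx))
```

The two values differ only in the fourth decimal place, which is the kind of accuracy one expects when ∆x and ∆y are both of size 0.1.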


15 Parametrized Surfaces, Tangent Planes, and Curvature (16.4)


Let’s now consider a parametrized surface S given by

G(u, v) = (x(u, v), y(u, v), z(u, v)).

We will assume G is one-to-one on its domain (i.e. the surface doesn’t intersect itself) and that G is continuously differentiable, i.e. the parametric equations x(u, v), y(u, v), and z(u, v) have continuous partial derivatives.

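In the above image, we see that the grid lines (vertical and horizontal) through the point $(u_0, v_0)$ in the domain of G are mapped to the grid curves $G(u, v_0)$ and $G(u_0, v)$ through the point $P = G(u_0, v_0)$ on the surface S. The tangent vectors to these grid curves are then given by:

$$\text{For } G(u,v_0): \quad \mathbf{T}_u(P) = \frac{\partial G}{\partial u}(u_0,v_0) = \left\langle \frac{\partial x}{\partial u}(u_0,v_0), \frac{\partial y}{\partial u}(u_0,v_0), \frac{\partial z}{\partial u}(u_0,v_0) \right\rangle$$

$$\text{For } G(u_0,v): \quad \mathbf{T}_v(P) = \frac{\partial G}{\partial v}(u_0,v_0) = \left\langle \frac{\partial x}{\partial v}(u_0,v_0), \frac{\partial y}{\partial v}(u_0,v_0), \frac{\partial z}{\partial v}(u_0,v_0) \right\rangle$$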

As in the figure above, we can then form a normal vector to the surface at the point P by

N(P ) = Tu (P ) × Tv (P ).

Note that this notation is slightly different from what we have seen before in that Tu , Tv , and N need not
be unit vectors. In fact, if the parametrization is changed (still giving the same surface), then lengths may
change and the directions may be reversed.
Definition 15.1. The parametrization G is called regular if the normal vector N(P ) is non-zero for all
points P = G(u, v).


We can then use the above normal vector to find the tangent plane at a point on a parametrized surface! We state this below using a vector surface parametrization.

Theorem 15.2. Let r(u, v) be a vector parametrization of a surface. The normal vector to the surface at $P = (x_0, y_0, z_0)$ corresponding to $\mathbf{r}(u_0, v_0)$ is

$$\mathbf{n} = \mathbf{r}_u \times \mathbf{r}_v = \langle a, b, c\rangle,$$

with the partial derivatives evaluated at $(u_0, v_0)$, and the equation of the plane tangent to the surface at P is

$$a(x-x_0) + b(y-y_0) + c(z-z_0) = 0.$$
Example 15.3. Find an equation of the tangent plane to $z = x^2 + 2y^2$ at (1, 1, 3).

The surface given by $z = x^2 + 2y^2$ is parametrized by $\mathbf{r}(u,v) = \langle u, v, u^2 + 2v^2\rangle$, and the point (1, 1, 3) corresponds to (u, v) = (1, 1). Then

$$\mathbf{r}_u = \langle 1, 0, 2u\rangle \quad\text{and}\quad \mathbf{r}_v = \langle 0, 1, 4v\rangle,$$

and so, at (u, v) = (1, 1),

$$\mathbf{n} = \mathbf{r}_u \times \mathbf{r}_v = \langle -2, -4, 1\rangle.$$

Thus an equation for the tangent plane is

$$-2(x-1) - 4(y-1) + (z-3) = 0.$$
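
The computation in Example 15.3 can be reproduced with a few lines of code. This is a minimal sketch of ours (the helper names `cross`, `tangent_plane_normal` and `plane` are hypothetical): it forms the tangent vectors ⟨1, 0, 2u⟩ and ⟨0, 1, 4v⟩ from the example, takes their cross product, and builds the left-hand side of the tangent plane equation.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def tangent_plane_normal(u, v):
    """Normal r_u x r_v for r(u, v) = <u, v, u^2 + 2 v^2>."""
    r_u = (1.0, 0.0, 2.0 * u)
    r_v = (0.0, 1.0, 4.0 * v)
    return cross(r_u, r_v)

n = tangent_plane_normal(1.0, 1.0)

def plane(x, y, z, p=(1.0, 1.0, 3.0)):
    """Left-hand side n . <x - x0, y - y0, z - z0>; zero exactly on the tangent plane."""
    return n[0] * (x - p[0]) + n[1] * (y - p[1]) + n[2] * (z - p[2])

print(n, plane(1.0, 1.0, 3.0))
```

The normal comes out as (−2, −4, 1), matching the example, and the point (1, 1, 3) satisfies the plane equation.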
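Example 15.4. Compute the outward pointing normal vector of the sphere of radius R centered at the origin given by the standard vector parametrization

$$\mathbf{r}(\theta,\phi) = \langle R\cos\theta\sin\phi, R\sin\theta\sin\phi, R\cos\phi\rangle.$$

Use this normal vector to find an equation for the tangent plane to the sphere of radius R = 2 at the point given by $\mathbf{r}\left(\dfrac{\pi}{4}, \dfrac{\pi}{3}\right)$.

First, note that the outward pointing normal vector will be a positive multiple of the unit radial vector $\mathbf{e}_r$, which is given by

$$\mathbf{e}_r = \langle \cos\theta\sin\phi, \sin\theta\sin\phi, \cos\phi\rangle.$$

Then

$$\mathbf{r}_\theta = \langle -R\sin\theta\sin\phi, R\cos\theta\sin\phi, 0\rangle$$

and

$$\mathbf{r}_\phi = \langle R\cos\theta\cos\phi, R\sin\theta\cos\phi, -R\sin\phi\rangle,$$

so, a normal vector to the tangent plane is

$$\begin{aligned}
\mathbf{n} = \mathbf{r}_\phi \times \mathbf{r}_\theta &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ R\cos\theta\cos\phi & R\sin\theta\cos\phi & -R\sin\phi \\ -R\sin\theta\sin\phi & R\cos\theta\sin\phi & 0 \end{vmatrix} \\
&= R^2\cos\theta\sin^2\phi\,\mathbf{i} + R^2\sin\theta\sin^2\phi\,\mathbf{j} + R^2\cos\phi\sin\phi\,\mathbf{k} \\
&= R^2\sin\phi\,\langle \cos\theta\sin\phi, \sin\theta\sin\phi, \cos\phi\rangle \\
&= R^2\sin\phi\,\mathbf{e}_r.
\end{aligned}$$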


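This normal vector is outward pointing as it is a positive multiple of $\mathbf{e}_r$!

The point on the sphere given by $\mathbf{r}\left(\dfrac{\pi}{4}, \dfrac{\pi}{3}\right)$ is $\left(\dfrac{\sqrt{6}}{2}, \dfrac{\sqrt{6}}{2}, 1\right)$. The outward pointing normal at this point is

$$\mathbf{n} = \left\langle 4\cos\frac{\pi}{4}\sin^2\frac{\pi}{3},\; 4\sin\frac{\pi}{4}\sin^2\frac{\pi}{3},\; 4\cos\frac{\pi}{3}\sin\frac{\pi}{3} \right\rangle = \left\langle \frac{3\sqrt{2}}{2}, \frac{3\sqrt{2}}{2}, \sqrt{3} \right\rangle.$$

So, an equation for the tangent plane to the sphere at $\left(\dfrac{\sqrt{6}}{2}, \dfrac{\sqrt{6}}{2}, 1\right)$ is given by

$$\frac{3\sqrt{2}}{2}\left(x - \frac{\sqrt{6}}{2}\right) + \frac{3\sqrt{2}}{2}\left(y - \frac{\sqrt{6}}{2}\right) + \sqrt{3}\,(z-1) = 0.$$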

We will now look at two new kinds of curvature for surfaces:

$$\text{Gauss Curvature: } K = \kappa_1\kappa_2 \qquad\text{and}\qquad \text{Mean Curvature: } H = \frac{1}{2}(\kappa_1+\kappa_2)$$

where $\kappa_1$ and $\kappa_2$ are the principal curvatures.

Recall that for a point on a curve r(t), the curvature κ measures how much the curve bends through that point. The principal curvatures above are the maximum and minimum curvatures of the trace curves in any direction through a given point, i.e. the curves obtained by intersecting the surface with a plane containing the normal vector at that point.
Example 15.5. Consider the following surfaces.

Given a point on a sphere, normal curves are circles, which have positive curvature, so a sphere has positive
Gauss and mean curvature.

Given a point on a plane, normal curves are straight lines, which have curvature zero, so a plane has Gauss
and mean curvature zero.

Given a point on a hyperbolic paraboloid, normal curves in any direction can have positive or negative
curvature, so hyperbolic paraboloids have negative Gauss curvature and mean curvature approximately zero.

Given a point on a cylinder, normal curves in any direction can have positive or zero curvature, so cylinders
have Gauss curvature zero and positive mean curvature.

Given a point on a cone, normal curves in any direction can have positive or zero curvature, so cones have
Gauss curvature zero and positive mean curvature.

Definition 15.6. Surfaces with Gauss curvature K = 0 are intrinsically flat.

Surfaces for which mean curvature H = 0 are called minimal surfaces.

Now, we want to actually compute H and K for a surface given by r(u, v). To do this we follow the method of Gauss. Define

$$E = \mathbf{r}_u\cdot\mathbf{r}_u, \qquad F = \mathbf{r}_u\cdot\mathbf{r}_v, \qquad\text{and}\qquad G = \mathbf{r}_v\cdot\mathbf{r}_v.$$

Then the first fundamental form of the surface is given by

$$I = E\,du^2 + 2F\,du\,dv + G\,dv^2.$$

We also write I as a matrix,

$$I = \begin{pmatrix} E & F \\ F & G \end{pmatrix}.$$

Recall the unit normal to r is given by

$$\mathbf{n} = \frac{\mathbf{r}_u\times\mathbf{r}_v}{\|\mathbf{r}_u\times\mathbf{r}_v\|},$$

and define

$$L = \mathbf{r}_{uu}\cdot\mathbf{n}, \qquad M = \mathbf{r}_{uv}\cdot\mathbf{n}, \qquad\text{and}\qquad N = \mathbf{r}_{vv}\cdot\mathbf{n}.$$

Then the second fundamental form of the surface is given by

$$II = L\,du^2 + 2M\,du\,dv + N\,dv^2.$$

We also write II as a matrix,

$$II = \begin{pmatrix} L & M \\ M & N \end{pmatrix}.$$

Definition 15.7 (Gauss Curvature K). The Gauss curvature K of a surface r(u, v) is

$$K = \frac{\det(II)}{\det(I)} = \frac{\begin{vmatrix} L & M \\ M & N \end{vmatrix}}{\begin{vmatrix} E & F \\ F & G \end{vmatrix}} = \frac{LN - M^2}{EG - F^2}.$$

Definition 15.8 (Mean Curvature H). The mean curvature H of a surface r(u, v) is

$$H = \frac{LG - 2MF + NE}{2\left(EG - F^2\right)}.$$
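Example 15.9. Find K and H for the surface $z = \dfrac{1}{2}\left(x^2 - y^2\right)$ at the origin.

This surface can be parametrized as

$$\mathbf{r}(u,v) = \left\langle u, v, \frac{1}{2}\left(u^2 - v^2\right) \right\rangle.$$

Then the needed partial derivatives are

$$\mathbf{r}_u = \langle 1, 0, u\rangle, \qquad \mathbf{r}_v = \langle 0, 1, -v\rangle,$$

$$\mathbf{r}_{uu} = \langle 0, 0, 1\rangle, \qquad \mathbf{r}_{uv} = \langle 0, 0, 0\rangle, \qquad\text{and}\qquad \mathbf{r}_{vv} = \langle 0, 0, -1\rangle.$$

Then,

$$\mathbf{r}_u\times\mathbf{r}_v = \langle -u, v, 1\rangle$$

and so

$$\mathbf{n} = \frac{1}{\sqrt{1+u^2+v^2}}\,\langle -u, v, 1\rangle.$$

So,

$$E = \mathbf{r}_u\cdot\mathbf{r}_u = 1+u^2, \qquad F = \mathbf{r}_u\cdot\mathbf{r}_v = -uv, \qquad\text{and}\qquad G = \mathbf{r}_v\cdot\mathbf{r}_v = 1+v^2$$

and

$$L = \mathbf{r}_{uu}\cdot\mathbf{n} = \frac{1}{\sqrt{1+u^2+v^2}}, \qquad M = \mathbf{r}_{uv}\cdot\mathbf{n} = 0, \qquad\text{and}\qquad N = \mathbf{r}_{vv}\cdot\mathbf{n} = \frac{-1}{\sqrt{1+u^2+v^2}}.$$

The origin corresponds to (u, v) = (0, 0), so

$$K = \frac{LN - M^2}{EG - F^2} = -1$$

and

$$H = \frac{LG - 2MF + NE}{2\left(EG - F^2\right)} = 0.$$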
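
The recipe of Definitions 15.7 and 15.8 is mechanical enough to code directly. The following sketch is ours (the function name `curvatures_saddle` and the helpers `dot` and `cross` are hypothetical): it hard-codes the partial derivatives of r(u, v) = ⟨u, v, (u² − v²)/2⟩ from Example 15.9 and then evaluates the fundamental-form coefficients and the two curvature formulas.

```python
import math

def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(p * q for p, q in zip(a, b))

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def curvatures_saddle(u, v):
    """Return (K, H) for r(u, v) = <u, v, (u^2 - v^2)/2> at the parameter point (u, v)."""
    r_u, r_v = (1.0, 0.0, u), (0.0, 1.0, -v)
    r_uu, r_uv, r_vv = (0.0, 0.0, 1.0), (0.0, 0.0, 0.0), (0.0, 0.0, -1.0)

    # Unit normal n = (r_u x r_v) / |r_u x r_v|.
    n_raw = cross(r_u, r_v)
    length = math.sqrt(dot(n_raw, n_raw))
    n = tuple(c / length for c in n_raw)

    # Coefficients of the first and second fundamental forms.
    E, F, G = dot(r_u, r_u), dot(r_u, r_v), dot(r_v, r_v)
    L, M, N = dot(r_uu, n), dot(r_uv, n), dot(r_vv, n)

    K = (L * N - M**2) / (E * G - F**2)
    H = (L * G - 2 * M * F + N * E) / (2 * (E * G - F**2))
    return K, H

print(curvatures_saddle(0.0, 0.0))
print(curvatures_saddle(1.0, 0.0))
```

At the origin this reproduces K = −1 and H = 0. Away from the origin the Gauss curvature stays negative but shrinks in size, and H is no longer exactly zero.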


Example 15.10. Find H and K for the following surfaces.

Sphere of radius R: $x^2 + y^2 + z^2 = R^2$

The sphere of radius R is parametrized by $\mathbf{r}(u,v) = \langle R\cos u\sin v, R\sin u\sin v, R\cos v\rangle$. Using the above formulas you get $K = \dfrac{1}{R^2}$ and $H = \dfrac{1}{R}$.

Cylinder of radius R: $x^2 + y^2 = R^2$

The cylinder of radius R is parametrized by $\mathbf{r}(u,v) = \langle R\cos u, R\sin u, v\rangle$. Using the above formulas you get $K = 0$ and $H = \dfrac{1}{2R}$.

Cone: $z = \sqrt{x^2 + y^2}$

The cone is parametrized by $\mathbf{r}(u,v) = \left\langle u, v, \sqrt{u^2+v^2} \right\rangle$. Using the above formulas you get $K = 0$ and $H = \dfrac{1}{2\sqrt{2u^2+2v^2}}$.

Paraboloid: $z = \frac{1}{2}x^2 + \frac{1}{2}y^2$

The paraboloid is parametrized by $\mathbf{r}(u,v) = \left\langle u, v, \dfrac{1}{2}u^2 + \dfrac{1}{2}v^2 \right\rangle$. Using the above formulas you get $K = \dfrac{1}{\left(1+u^2+v^2\right)^2}$ and $H = \dfrac{2+u^2+v^2}{2\left(1+u^2+v^2\right)^{3/2}}$.

Example 15.11. The Enneper surface, given by

$$\mathbf{r}(u,v) = \left\langle \frac{1}{3}u\left(1 - \frac{u^2}{3} + v^2\right),\; -\frac{1}{3}v\left(1 + u^2 - \frac{v^2}{3}\right),\; \frac{1}{3}\left(u^2 - v^2\right) \right\rangle$$

is a minimal surface, i.e. H = 0 (you should check this on your own!).


16 The Gradient and Directional Derivatives (14.5)


We have seen in the last couple of sections that the rate of change of a multi-variable function depends on a choice of direction, and so it is natural to use vectors to describe the derivative of f in a specified direction.
Definition 16.1 (Gradient). The gradient of a function f (x, y) at a point P = (a, b) is the vector

$$\nabla f_P = \langle f_x(a,b), f_y(a,b)\rangle.$$

We also write the gradient itself as a vector-valued function

$$\nabla f = \langle f_x, f_y\rangle = \left\langle \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right\rangle.$$

Similarly, the gradient of f (x, y, z) is

$$\nabla f = \langle f_x, f_y, f_z\rangle = \left\langle \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right\rangle.$$

The gradient assigns to each point in the domain of f a vector which encodes the rate of change of f in each component. A similar definition to the above can be stated for functions of n variables.

Example 16.2. If $f(x,y,z) = e^zx^2 - 4y$, find ∇f (1, 2, 3).

$$\nabla f = \left\langle 2xe^z, -4, e^zx^2 \right\rangle \quad\Rightarrow\quad \nabla f(1,2,3) = \left\langle 2e^3, -4, e^3 \right\rangle.$$


The following theorem follows directly from the definition of the gradient, and shows that the gradient acts
like a derivative.

Theorem 16.3 (Properties of the Gradient). If f and g are differentiable multi-variable functions and c is
a constant, then
• ∇ (f + g) = ∇f + ∇g
• ∇ (cf ) = c∇f
• Product Rule for Gradients: ∇ (f g) = f ∇g + g∇f
• Chain Rule for Gradients: If F (t) is a differentiable function, then $\nabla\left(F(f(x,y,z))\right) = F'(f(x,y,z))\,\nabla f$

The first application of the gradient we will see is the chain rule for paths. Recall that a path (in the plane
or in space) is represented by a vector-valued function r(t).

Definition 16.4. Given f (x, y) and $\mathbf{r}(t) = \langle x(t), y(t)\rangle$, define

$$f(\mathbf{r}(t)) = f(x(t), y(t)).$$

Similarly, given f (x, y, z) and $\mathbf{r}(t) = \langle x(t), y(t), z(t)\rangle$, define

$$f(\mathbf{r}(t)) = f(x(t), y(t), z(t)).$$

These compositions are scalar-valued functions of one variable.

Example 16.5. If $f(x,y) = x^2 + y^2$ and $\mathbf{r}(t) = \langle t, t^2\rangle$, then $f(\mathbf{r}(t)) = (t)^2 + (t^2)^2 = t^2 + t^4$.


Theorem 16.6 (Chain Rule for Paths). If f and r(t) are differentiable, then

$$\frac{d}{dt} f(\mathbf{r}(t)) = \nabla f(\mathbf{r}(t)) \cdot \mathbf{r}'(t).$$

The above chain rule looks similar to chain rules of the past - and that’s because the chain rule just is what it is! It’s always “the derivative of the outside plug in the inside, times the derivative of the inside.” What changes is what we mean by “derivative” and “times.”

In this case, we want “the derivative” of a multi-variable function, which means the gradient! We then want to multiply two vector-valued functions to produce a scalar, which means the dot product!

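Example 16.7. Let $f(x,y) = x^2 + y^2$ and $\mathbf{r}(t) = \langle t, t^2\rangle$. Find $\dfrac{d}{dt} f(\mathbf{r}(t))$.

The first way: We already saw that $f(\mathbf{r}(t)) = t^2 + t^4$, so we know $\dfrac{d}{dt} f(\mathbf{r}(t)) = 2t + 4t^3$. Let’s do this again by using the chain rule for paths:

$$\nabla f = \langle 2x, 2y\rangle \quad\text{and}\quad \mathbf{r}'(t) = \langle 1, 2t\rangle,$$

so

$$\frac{d}{dt} f(\mathbf{r}(t)) = \nabla f(\mathbf{r}(t)) \cdot \mathbf{r}'(t) = \left\langle 2t, 2t^2 \right\rangle \cdot \langle 1, 2t\rangle = 2t + 4t^3.$$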
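
The two computations in Example 16.7 can be compared in code. This is a short sketch of ours (the function names `grad_f`, `r`, `r_prime`, `chain_rule` and `direct` are hypothetical): it evaluates the gradient at r(t), dots it with r′(t), and compares the result with the directly differentiated formula 2t + 4t³.

```python
def grad_f(x, y):
    """Gradient of f(x, y) = x^2 + y^2."""
    return (2 * x, 2 * y)

def r(t):
    """The path r(t) = <t, t^2>."""
    return (t, t**2)

def r_prime(t):
    """Velocity r'(t) = <1, 2t>."""
    return (1.0, 2 * t)

def chain_rule(t):
    """d/dt f(r(t)) computed as grad f(r(t)) . r'(t)."""
    gx, gy = grad_f(*r(t))
    dx, dy = r_prime(t)
    return gx * dx + gy * dy

def direct(t):
    """d/dt of t^2 + t^4, differentiated by hand."""
    return 2 * t + 4 * t**3

for t in (0.0, 0.5, 1.5):
    print(t, chain_rule(t), direct(t))
```

The design mirrors the slogan in the text: `grad_f(*r(t))` is "the derivative of the outside with the inside plugged in", and the dot product with `r_prime(t)` plays the role of "times the derivative of the inside".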
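We now sketch a proof of the chain rule for paths. Let

$$\Delta f = f(\mathbf{r}(t+h)) - f(\mathbf{r}(t)), \qquad \Delta x = x(t+h) - x(t),$$

$$\Delta y = y(t+h) - y(t), \qquad\text{and}\qquad \Delta t = h.$$

If f (x, y) is differentiable, then

$$\Delta f \approx \frac{\partial f}{\partial x}\Delta x + \frac{\partial f}{\partial y}\Delta y$$

and so

$$\frac{\Delta f}{\Delta t} \approx \frac{\partial f}{\partial x}\frac{\Delta x}{\Delta t} + \frac{\partial f}{\partial y}\frac{\Delta y}{\Delta t}.$$

Then taking the limit as h → 0, we get derivatives, and the above becomes the equality

$$\begin{aligned}
\frac{df}{dt} &= \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} \\
&= \left\langle \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right\rangle \cdot \left\langle \frac{dx}{dt}, \frac{dy}{dt} \right\rangle \\
&= \nabla f \cdot \mathbf{r}'(t).
\end{aligned}$$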

As we have already said, the rate of change of a multi-variable function depends on a choice of direction. We
now consider rates of change of a function f (x, y) with respect to a specified direction, called directional
derivatives.


Consider the parametrized line

$$\mathbf{r}(t) = \mathbf{r}_0 + t\mathbf{v},$$

where $\mathbf{r}_0$ is the position vector of P = (a, b).

Note that $\mathbf{r}'(0) = \mathbf{v}$.

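Definition 16.8 (Directional Derivative). The derivative of f (x, y) (similarly, f (x, y, z)) with respect to v at P is the derivative

$$D_{\mathbf{v}} f\big|_P = \left.\frac{d}{dt} f(\mathbf{r}(t))\right|_{t=0} = \nabla f(P)\cdot\mathbf{v}.$$

The directional derivative of f (x, y) in the direction of v is

$$D_{\mathbf{e}_{\mathbf{v}}} f\big|_P = \frac{1}{\|\mathbf{v}\|} D_{\mathbf{v}} f\big|_P = \frac{1}{\|\mathbf{v}\|}\nabla f(P)\cdot\mathbf{v},$$

where $\mathbf{e}_{\mathbf{v}} = \mathbf{v}/\|\mathbf{v}\|$ is the unit vector in the direction of v.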

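Example 16.9. Find the directional derivative of $f(x,y,z) = 1000 + yz^2 + x^2z - xyz$ in the direction of $\mathbf{v} = \langle 0, 1, 1\rangle$ at P = (1, 2, 1).

$$\nabla f = \left\langle 2xz - yz,\; z^2 - xz,\; 2yz + x^2 - xy \right\rangle \quad\Rightarrow\quad \nabla f\big|_P = \langle 0, 0, 3\rangle,$$

then

$$D_{\mathbf{v}} f\big|_P = \langle 0, 0, 3\rangle\cdot\langle 0, 1, 1\rangle = 3$$

and so

$$\frac{1}{\|\mathbf{v}\|} D_{\mathbf{v}} f\big|_P = \frac{3}{\sqrt{2}}.$$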
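
The steps of Example 16.9 translate directly into code. The following is a minimal sketch of ours (the function name `directional_derivative` is hypothetical): it evaluates the gradient formula from the example at P, normalizes the direction vector, and takes the dot product.

```python
import math

def grad_f(x, y, z):
    """Gradient of f(x, y, z) = 1000 + y z^2 + x^2 z - x y z."""
    return (2 * x * z - y * z,
            z**2 - x * z,
            2 * y * z + x**2 - x * y)

def directional_derivative(grad, v):
    """(grad . v) / |v|: rate of change in the direction of v, which need not be a unit vector."""
    norm = math.sqrt(sum(c * c for c in v))
    return sum(g * c for g, c in zip(grad, v)) / norm

g = grad_f(1.0, 2.0, 1.0)
d = directional_derivative(g, (0.0, 1.0, 1.0))
print(g, d, 3 / math.sqrt(2))
```

Dividing by the length of v inside the helper means the result depends only on the direction of v, not on its magnitude.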
If u is itself a unit vector, then the directional derivative in the direction of u is just $D_{\mathbf{u}} f = \nabla f\cdot\mathbf{u}$. In particular,

$$D_{\mathbf{i}} f = f_x \quad\text{and}\quad D_{\mathbf{j}} f = f_y.$$

We now consider how to interpret the directional derivative geometrically.

$D_{\mathbf{u}} f(a,b)$ is the slope of the tangent line to the trace curve through Q in the vertical plane through P in the direction u.


Question: For which unit vector u is the directional derivative Du f |P maximal? i.e., in which direction
does f (x, y) increase the most at P ?

If u is a unit vector and θ is the angle between ∇f (P ) and u, then

$$\begin{aligned}
D_{\mathbf{u}} f\big|_P &= \nabla f(P)\cdot\mathbf{u} \\
&= \|\nabla f(P)\|\,\|\mathbf{u}\|\cos\theta \\
&= \|\nabla f(P)\|\cos\theta.
\end{aligned}$$

As cos θ ≤ 1, $D_{\mathbf{u}} f\big|_P$ will be maximal when cos θ = 1, or θ = 0, i.e. when u points in the direction of the gradient! (Assuming ∇f (P ) ≠ 0.)

Now, suppose a path r(t) lies on a level curve f (x, y) = c. Then the function f (r(t)) is constant, and so

$$\frac{d}{dt} f(\mathbf{r}(t)) = 0.$$

On the other hand, by the chain rule for paths

$$\frac{d}{dt} f(\mathbf{r}(t)) = \nabla f\big|_{\mathbf{r}(t)}\cdot\mathbf{r}'(t).$$

Thus, $\nabla f\cdot\mathbf{r}'(t) = 0$, and so the gradient is perpendicular to level curves (and surfaces)!

On the left we see the contour map of a function f (x, y). We see that the gradient at P is orthogonal to the level curve through P and points in the direction of maximum increase of f .

We summarize this with the following theorem.

Theorem 16.10 (Geometric Interpretation of the Gradient). Assume that ∇f (P ) ≠ 0. Let u be a unit vector making an angle θ with ∇f (P ). Then

$$D_{\mathbf{u}} f(P) = \|\nabla f_P\|\cos\theta,$$

and
• ∇f (P ) points in the direction of maximum rate of increase of f at P
• −∇f (P ) points in the direction of maximum rate of decrease of f at P
• ∇f (P ) is normal to the level curve (or surface) at P
• ‖∇f (P )‖ yields the maximum slope of a tangent line to the surface given by z = f (x, y) at (P, f (P ))


Now suppose P is a point on the level surface f (x, y, z) = c. Then

∇f (P ) is a normal vector for the tangent plane to the surface at P.

Example 16.11. Find an equation of the tangent plane to the surface $x^2 + 3y^2 + 4z^2 = 20$ at the point P = (2, 2, 1).

The surface is the level surface f (x, y, z) = 20 of the function $f(x,y,z) = x^2 + 3y^2 + 4z^2$. Since ∇f (P ) is a normal vector for the tangent plane to the surface at P , the equation for the tangent plane is given by

$$\nabla f(P)\cdot\langle x-2, y-2, z-1\rangle = 0.$$

Then

$$\nabla f = \langle 2x, 6y, 8z\rangle \quad\Rightarrow\quad \nabla f(P) = \langle 4, 12, 8\rangle.$$

Thus an equation for the tangent plane is

$$\langle 4, 12, 8\rangle\cdot\langle x-2, y-2, z-1\rangle = 0,$$

or, equivalently,

$$4(x-2) + 12(y-2) + 8(z-1) = 0.$$


17 The Chain Rule (14.6)


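In the previous section we introduced the chain rule for paths, where for differentiable functions $f(x_1, \ldots, x_n)$ and r(t)

$$\begin{aligned}
\frac{df}{dt} &= \frac{\partial f}{\partial x_1}\frac{dx_1}{dt} + \cdots + \frac{\partial f}{\partial x_n}\frac{dx_n}{dt} \\
&= \left\langle \frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_n} \right\rangle \cdot \left\langle \frac{dx_1}{dt}, \ldots, \frac{dx_n}{dt} \right\rangle \\
&= \nabla f\cdot\mathbf{r}'(t).
\end{aligned}$$

Note again that the chain rule just is what it is! It’s always “the derivative of the outside plug in the inside, times the derivative of the inside.” What changes is what we mean by “derivative” and “times.”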

Viewing this a little differently, consider the composition $f(x_1(t), \ldots, x_n(t))$ where f is a differentiable function of n variables and each of $x_1, \ldots, x_n$ is a differentiable function of one variable. The derivative should be given by the chain rule, and should look similar to the above. That is, we should have

$$\frac{df}{dt} = \nabla f\cdot\text{???}$$

where “???” represents the derivative of the inside in vector form. We represent this by considering the vector-valued function

$$\mathbf{r}(t) = \langle x_1(t), \ldots, x_n(t)\rangle.$$

Then the derivative of the inside in vector form is given by “???” = $\mathbf{r}'(t)$! That is, if f is a differentiable function of n variables and each of $x_1, \ldots, x_n$ is a differentiable function of one variable then

$$\frac{df}{dt} = \nabla f\cdot\left\langle x_1'(t), \ldots, x_n'(t) \right\rangle.$$

Notice that in this setting the functions $x_1, \ldots, x_n$ were given as individual differentiable functions of t, and not as components of a vector-valued function!

We now want to extend the chain rule to general composite functions. For a function f (x, y, z) and functions x(u, v), y(u, v), and z(u, v), the composition

$$f(x(u,v), y(u,v), z(u,v))$$

is then a function of two variables, with independent variables u and v. This extends to an arbitrary composition of multi-variable functions, when the composition is defined.

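Example 17.1. If $f(x,y,z) = xy - e^z$, $x(u,v) = uv$, $y(u,v) = \sin u$, and $z(u,v) = 3v^2 - u$, then

$$f(x(u,v), y(u,v), z(u,v)) = uv\sin u - e^{3v^2-u}.$$

Now, find $\dfrac{\partial f}{\partial u}$:

$$\frac{\partial f}{\partial u} = v\sin u + uv\cos u + e^{3v^2-u}.$$

Now, let’s find $\dfrac{\partial f}{\partial u}$ using the above rule, but with “the derivative of the inside” being a partial derivative! From the above,

$$\frac{df}{dt} = \nabla f\cdot\left\langle x', y', z' \right\rangle.$$

Thinking of the derivative of the inside as partial derivatives, this becomes

$$\frac{\partial f}{\partial u} = \nabla f\cdot\left\langle \frac{\partial x}{\partial u}, \frac{\partial y}{\partial u}, \frac{\partial z}{\partial u} \right\rangle.$$

Then as $\nabla f = \langle y, x, -e^z\rangle$, $\dfrac{\partial x}{\partial u} = v$, $\dfrac{\partial y}{\partial u} = \cos u$, and $\dfrac{\partial z}{\partial u} = -1$, we get

$$\frac{\partial f}{\partial u} = \nabla f\cdot\left\langle \frac{\partial x}{\partial u}, \frac{\partial y}{\partial u}, \frac{\partial z}{\partial u} \right\rangle = \langle y, x, -e^z\rangle\cdot\langle v, \cos u, -1\rangle = vy + x\cos u + e^z.$$

This gives $\dfrac{\partial f}{\partial u}$ with both sets of variables, x, y, z and u, v. We want the answer only in terms of the independent variables u and v. Substituting in x, y, and z we then get

$$\frac{\partial f}{\partial u} = v\sin u + uv\cos u + e^{3v^2-u}.$$

Notice that this is the same as what we found above. On your own, you should try finding $\dfrac{\partial f}{\partial v}$ in the two ways we just did for $\dfrac{\partial f}{\partial u}$.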
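
The agreement between the two methods in Example 17.1 can be checked numerically. The sketch below is ours (all function names are hypothetical, and the sample point (0.5, 0.3) is arbitrary): it evaluates the chain-rule expression vy + x cos u + e^z with x, y, z substituted, the closed form in u and v, and a central-difference estimate of the derivative of the composite function.

```python
import math

def df_du_chain(u, v):
    """Chain rule: grad f(x, y, z) . <dx/du, dy/du, dz/du> with the inside functions plugged in."""
    x, y, z = u * v, math.sin(u), 3 * v**2 - u
    grad = (y, x, -math.exp(z))          # <f_x, f_y, f_z> for f = x*y - e^z
    inner = (v, math.cos(u), -1.0)       # partial derivatives of x, y, z in u
    return sum(g * d for g, d in zip(grad, inner))

def df_du_direct(u, v):
    """Closed form from the example: v sin u + u v cos u + e^(3 v^2 - u)."""
    return v * math.sin(u) + u * v * math.cos(u) + math.exp(3 * v**2 - u)

def composite(u, v):
    """The composite function u v sin u - e^(3 v^2 - u)."""
    return u * v * math.sin(u) - math.exp(3 * v**2 - u)

def df_du_numeric(u, v, h=1e-6):
    """Central-difference estimate of the u-partial of the composite."""
    return (composite(u + h, v) - composite(u - h, v)) / (2 * h)

u0, v0 = 0.5, 0.3
print(df_du_chain(u0, v0), df_du_direct(u0, v0), df_du_numeric(u0, v0))
```

All three numbers should agree to many decimal places. Replacing `inner` with the partial derivatives in v gives a check for the exercise suggested at the end of the example.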
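We want to generalize the example from above to a chain rule as before, but this time where the inside functions $x_1, \ldots, x_n$ are themselves multi-variable functions! This version of the chain rule should, of course, still be “the derivative of the outside plug in the inside, times the derivative of the inside,” but where the derivative of the inside is a partial derivative.

Theorem 17.2 (The General Chain Rule). Let $f(x_1, \ldots, x_n)$ be a differentiable function of n variables and each of $x_1(t_1, \ldots, t_m), \ldots, x_n(t_1, \ldots, t_m)$ be differentiable functions of m variables. Then for each independent variable $t_k$, k = 1, . . . , m,

$$\begin{aligned}
\frac{\partial f}{\partial t_k} &= \nabla f\cdot\left\langle \frac{\partial x_1}{\partial t_k}, \ldots, \frac{\partial x_n}{\partial t_k} \right\rangle \\
&= \left\langle \frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_n} \right\rangle \cdot \left\langle \frac{\partial x_1}{\partial t_k}, \ldots, \frac{\partial x_n}{\partial t_k} \right\rangle \\
&= \frac{\partial f}{\partial x_1}\frac{\partial x_1}{\partial t_k} + \cdots + \frac{\partial f}{\partial x_n}\frac{\partial x_n}{\partial t_k}.
\end{aligned}$$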
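Example 17.3. Let $f(x, y, z)$ be a differentiable function of three variables and $(\rho, \theta, \varphi)$ be spherical coordinates. Express $\partial f/\partial \rho$ in terms of $\partial f/\partial x$, $\partial f/\partial y$, and $\partial f/\partial z$.
Recall that spherical coordinates are given by
\[ x = \rho \sin\varphi \cos\theta, \qquad y = \rho \sin\varphi \sin\theta, \qquad z = \rho \cos\varphi, \]
so that
\[ \frac{\partial x}{\partial \rho} = \sin\varphi \cos\theta, \qquad \frac{\partial y}{\partial \rho} = \sin\varphi \sin\theta, \qquad \frac{\partial z}{\partial \rho} = \cos\varphi. \]
Then
\[
\begin{aligned}
\frac{\partial f}{\partial \rho} &= \nabla f \cdot \left\langle \frac{\partial x}{\partial \rho}, \frac{\partial y}{\partial \rho}, \frac{\partial z}{\partial \rho} \right\rangle \\
&= \left\langle \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right\rangle \cdot \langle \sin\varphi \cos\theta, \sin\varphi \sin\theta, \cos\varphi \rangle \\
&= \frac{\partial f}{\partial x} \sin\varphi \cos\theta + \frac{\partial f}{\partial y} \sin\varphi \sin\theta + \frac{\partial f}{\partial z} \cos\varphi.
\end{aligned}
\]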


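Recall that in calculus 1 we used implicit differentiation to compute $dy/dx$ when a function $y = y(x)$ was given implicitly, i.e. in the form $f(x, y) = 0$. We can similarly use implicit differentiation when $z = z(x, y)$ is defined implicitly by an equation
\[ f(x, y, z) = 0. \]
Though we may not be able to solve for $z = z(x, y)$, we can treat $f(x, y, z)$ as a composite function $f(x, y, z(x, y))$ with $x$ and $y$ independent variables, and use the chain rule to find the partial derivatives $\partial z/\partial x$ and $\partial z/\partial y$. To see this, consider $f(x, y, z) = 0$ as above, and take the partial derivative of both sides with respect to $x$:
\[
\begin{aligned}
0 &= \frac{\partial}{\partial x} f(x, y, z(x, y)) \\
&= \nabla f \cdot \left\langle \frac{\partial x}{\partial x}, \frac{\partial y}{\partial x}, \frac{\partial z}{\partial x} \right\rangle \\
&= \left\langle \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right\rangle \cdot \left\langle \frac{\partial x}{\partial x}, \frac{\partial y}{\partial x}, \frac{\partial z}{\partial x} \right\rangle \\
&= \frac{\partial f}{\partial x} \frac{\partial x}{\partial x} + \frac{\partial f}{\partial y} \frac{\partial y}{\partial x} + \frac{\partial f}{\partial z} \frac{\partial z}{\partial x}.
\end{aligned}
\]
Then as $\partial x/\partial x = 1$ and $\partial y/\partial x = 0$, we get
\[ 0 = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial z} \frac{\partial z}{\partial x}, \]
and so, provided $f_z \neq 0$,
\[ \frac{\partial z}{\partial x} = -\frac{\partial f/\partial x}{\partial f/\partial z} = -\frac{f_x}{f_z}. \]
Similarly, we could take the partial derivative of both sides of $f(x, y, z) = 0$ with respect to $y$ and get
\[ \frac{\partial z}{\partial y} = -\frac{\partial f/\partial y}{\partial f/\partial z} = -\frac{f_y}{f_z}. \]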
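Example 17.4. Find $\partial r/\partial t$ and $\partial t/\partial r$ if $r^2 = t e^{s/r}$.
Here we have
\[ f(r, s, t) = r^2 - t e^{s/r} = 0. \]
To find $\partial r/\partial t$, we treat $s$ and $t$ as independent variables, and to find $\partial t/\partial r$, we treat $s$ and $r$ as independent variables. This gives
\[ \frac{\partial r}{\partial t} = -\frac{f_t}{f_r} \quad\text{and}\quad \frac{\partial t}{\partial r} = -\frac{f_r}{f_t}. \]
The partial derivatives of $f$ with respect to $r$ and $t$ are
\[ f_r = 2r + \frac{st}{r^2} e^{s/r} \quad\text{and}\quad f_t = -e^{s/r}. \]
So,
\[ \frac{\partial r}{\partial t} = -\frac{f_t}{f_r} = \frac{e^{s/r}}{2r + \frac{st}{r^2} e^{s/r}} \]
and
\[ \frac{\partial t}{\partial r} = -\frac{f_r}{f_t} = \frac{2r + \frac{st}{r^2} e^{s/r}}{e^{s/r}}. \]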
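In this particular example the equation can be solved for $t$ explicitly, $t = r^2 e^{-s/r}$, which gives an independent way to test the implicit formula. The sketch below is only a sanity check under that observation; the sample point $(r, s) = (1.5, 0.8)$ and the function names are assumptions made for illustration.

```python
import math

def t_of(r, s):
    # Solve r^2 = t * e^(s/r) explicitly for t
    return r**2 * math.exp(-s / r)

def dt_dr_implicit(r, s):
    # Implicit differentiation: dt/dr = -f_r / f_t with f = r^2 - t e^(s/r)
    t = t_of(r, s)
    f_r = 2 * r + (s * t / r**2) * math.exp(s / r)
    f_t = -math.exp(s / r)
    return -f_r / f_t

def dt_dr_numeric(r, s, h=1e-6):
    # Central difference of the explicit solution t(r, s) in r
    return (t_of(r + h, s) - t_of(r - h, s)) / (2 * h)

r0, s0 = 1.5, 0.8  # arbitrary sample point (an assumption)
print(dt_dr_implicit(r0, s0), dt_dr_numeric(r0, s0))
```

Since the two formulas in the example are reciprocals of each other, the same check also covers $\partial r/\partial t$.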


18 Optimization (14.7)
In this section we optimize functions of two variables. Recall from calculus 1 that “optimization” is the process of finding extreme values - things like “greatest,” “least,” “best,” “worst,” etc. Optimizing a function of one variable amounts to finding the highest and lowest points (in terms of y-value) on the graph. Similarly, optimizing a function of two variables amounts to finding the highest and lowest points on the graph, in terms of the z-value. As with functions of one variable, a function f(x, y) can have both local and global extrema.

Definition 18.1 (Local Extrema). A function f(x, y) has a local maximum at P = (a, b) if there exists an open disk around P of radius r, D(P, r), such that
f(x, y) ≤ f(a, b) for all (x, y) ∈ D(P, r).
A function f(x, y) has a local minimum at P = (a, b) if there exists an open disk around P of radius r, D(P, r), such that
f(x, y) ≥ f(a, b) for all (x, y) ∈ D(P, r).
Local minima and local maxima are called local extrema.

A function f(x, y) has a local extremum at (a, b) if f(a, b) is the greatest or least value of the function “close” to (a, b).

A function f(x, y) has a global extremum at (a, b) if f(a, b) is the greatest or least value of the function on its entire domain.

Recall from calculus 1 Fermat’s Theorem for functions of one variable: If f(a) is a local extreme value then a is a critical point (i.e. f′(a) = 0 or f′(a) does not exist), and so the tangent line (if it exists) is horizontal at (a, f(a)).

A similar result holds for functions of two variables and tangent planes!

Recall the tangent plane at (a, b, f (a, b)) is given by


z = f (a, b) + fx (a, b)(x − a) + fy (a, b)(y − b).
This tangent plane will be horizontal when it is given in the form z = c, with c a constant. That is, the
tangent plane will be horizontal when z = f (a, b) and hence fx (a, b) = fy (a, b) = 0! This is the motivation
behind the following definition.


Definition 18.2 (Critical Point). A point (a, b) in the domain of f (x, y) is called a critical point if
• fx (a, b) = 0 or fx (a, b) does not exist, and
• fy (a, b) = 0 or fy (a, b) does not exist.
More generally, the above definition holds for functions f(x1, . . . , xn), where each partial derivative at the point must be zero or not exist. As we have seen, defining critical points this way leads to the following theorem.
Theorem 18.3 (Fermat’s Theorem). If f(x, y) has a local extremum at (a, b) then (a, b) is a critical point of f(x, y).

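Example 18.4. Find the critical points of $f(x, y) = x - y^2 - \ln(x + y)$.

The partial derivatives are
\[ f_x = 1 - \frac{1}{x + y} \quad\text{and}\quad f_y = -2y - \frac{1}{x + y}. \]
Note that both partial derivatives are undefined when $x + y = 0$, but points $(x, y)$ satisfying this are not in the domain of $f$. Now, set both partial derivatives equal to zero to get
\[ f_x = 0: \quad 0 = 1 - \frac{1}{x + y} \qquad\text{and}\qquad f_y = 0: \quad 0 = -2y - \frac{1}{x + y}. \]
Then from $f_x = 0$ we get $1 = \frac{1}{x + y}$, and plugging this into $f_y = 0$ we get $0 = -2y - 1$ and $y = -\frac{1}{2}$. Then
\[ 1 = \frac{1}{x + y} \quad\Rightarrow\quad 1 = \frac{1}{x - \frac{1}{2}} \quad\Rightarrow\quad x = \frac{3}{2}. \]
So the point $\left( \frac{3}{2}, -\frac{1}{2} \right)$ is a critical point.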
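A critical point found by hand is easy to double-check by plugging it back into the partial derivatives. The short Python sketch below does this for Example 18.4; the function name `grad_f` is a hypothetical helper, not something from the notes.

```python
def grad_f(x, y):
    # Partial derivatives of f(x, y) = x - y^2 - ln(x + y)
    f_x = 1 - 1 / (x + y)
    f_y = -2 * y - 1 / (x + y)
    return f_x, f_y

# Candidate critical point from the example
fx, fy = grad_f(1.5, -0.5)
print(fx, fy)  # both partials should vanish at a critical point
```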
Recall that for functions f(x), not all critical points are relative extrema - they can also be inflection points, e.g. the point (0, 0) on the graph of f(x) = x³. Something similar occurs with graphs of functions of two variables! See some examples here and here. The three options for critical points are shown below.

Notice that for local maxima,

moving away from the point along any direction takes you downhill

and for local minima,

moving away from the point along any direction takes you uphill,

while for saddle points,

moving away from the point can take you uphill or downhill, depending on the direction.


Formally, we have the following definition of a saddle point. As with functions of one variable, we can use second derivatives to determine the type of critical point for a function of two variables. To do this, we first define the Hessian of $f(x, y)$ as the matrix
\[ H(f) = \begin{pmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{pmatrix}. \]
We then define the discriminant, $D = D(a, b)$, as the determinant of $H(f)$ evaluated at $(a, b)$:
\[ D = D(a, b) = \begin{vmatrix} f_{xx}(a, b) & f_{xy}(a, b) \\ f_{yx}(a, b) & f_{yy}(a, b) \end{vmatrix} = f_{xx}(a, b) f_{yy}(a, b) - f_{xy}(a, b) f_{yx}(a, b). \]
Assuming $f_{xy}$ and $f_{yx}$ are continuous, by Clairaut’s Theorem we have
\[ D = D(a, b) = f_{xx}(a, b) f_{yy}(a, b) - f_{xy}^2(a, b). \]
Theorem 18.5 (The Second Derivative Test). If P = (a, b) is a critical point of f (x, y) and the four second
order partial derivatives of f are continuous at P = (a, b), then
1. if D(a, b) > 0 and fxx (a, b) > 0, then f (a, b) is a local minimum,
2. if D(a, b) > 0 and fxx (a, b) < 0, then f (a, b) is a local maximum,
3. if D(a, b) < 0 then (a, b) is a saddle point.
Note that the second derivative test does not apply if D(a, b) = 0! A proof of this theorem is in the textbook.

Example 18.6. Find any relative extrema and saddle points of $f(x, y) = 4x - 3x^3 - 2xy^2$.

First, note that f (x, y) is a polynomial in x and y, and so f and all of its partial derivatives are continuous.
Now find the critical points of f :
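\[ f_x(x, y) = 4 - 9x^2 - 2y^2 = 0, \qquad f_y(x, y) = -4xy = 0. \]
From $f_y(x, y) = 0$ we get $x = 0$ or $y = 0$. If $x = 0$, from $f_x(x, y) = 0$ we get
\[ 4 - 2y^2 = 0 \quad\Rightarrow\quad y = \pm\sqrt{2}. \]
If $y = 0$, from $f_x(x, y) = 0$ we get
\[ 4 - 9x^2 = 0 \quad\Rightarrow\quad x = \pm\frac{2}{3}. \]
So the critical points of $f$ are
\[ \left( 0, \sqrt{2} \right), \quad \left( 0, -\sqrt{2} \right), \quad \left( \tfrac{2}{3}, 0 \right), \quad \left( -\tfrac{2}{3}, 0 \right). \]
Find the discriminant $D(x, y)$ by first finding the second order partial derivatives:
\[ f_{xx}(x, y) = -18x, \qquad f_{yy}(x, y) = -4x, \qquad f_{xy}(x, y) = -4y, \]
so the discriminant is
\[ D(x, y) = f_{xx}(x, y) f_{yy}(x, y) - f_{xy}^2(x, y) = 72x^2 - 16y^2. \]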
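Now apply the second derivative test to each critical point:
\[
\begin{aligned}
& D\left( 0, \sqrt{2} \right) = -32 < 0 \quad\Rightarrow\quad \left( 0, \sqrt{2} \right) \text{ is a saddle point,} \\
& D\left( 0, -\sqrt{2} \right) = -32 < 0 \quad\Rightarrow\quad \left( 0, -\sqrt{2} \right) \text{ is a saddle point,} \\
& D\left( \tfrac{2}{3}, 0 \right) = 32 > 0 \text{ and } f_{xx}\left( \tfrac{2}{3}, 0 \right) = -12 < 0 \quad\Rightarrow\quad f\left( \tfrac{2}{3}, 0 \right) = \tfrac{16}{9} \text{ is a local maximum,} \\
& D\left( -\tfrac{2}{3}, 0 \right) = 32 > 0 \text{ and } f_{xx}\left( -\tfrac{2}{3}, 0 \right) = 12 > 0 \quad\Rightarrow\quad f\left( -\tfrac{2}{3}, 0 \right) = -\tfrac{16}{9} \text{ is a local minimum.}
\end{aligned}
\]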
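The second derivative test is mechanical enough to script. The following Python sketch applies it to the four critical points of Example 18.6, using the second partials computed above; the function name `classify` and the returned labels are assumptions chosen for illustration.

```python
import math

def classify(x, y):
    # Second partials of f(x, y) = 4x - 3x^3 - 2xy^2
    f_xx, f_yy, f_xy = -18 * x, -4 * x, -4 * y
    D = f_xx * f_yy - f_xy**2  # discriminant at (x, y)
    if D > 0:
        return "local minimum" if f_xx > 0 else "local maximum"
    if D < 0:
        return "saddle point"
    return "inconclusive"      # the test does not apply when D = 0

critical_points = [(0, math.sqrt(2)), (0, -math.sqrt(2)), (2 / 3, 0), (-2 / 3, 0)]
for p in critical_points:
    print(p, classify(*p))
```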


As in the one-variable case, we are often interested in finding the extrema of a function on a given domain D ⊂ R², called global or absolute extrema. Of course, global extrema need not exist. For example, the function f(x, y) = x has no maximum value over R². This was also the case for functions of one variable, where we had the following theorem to guarantee existence of global extrema.
Theorem 18.7 (From Cal 1). A continuous function f(x) over a closed and bounded interval [a, b] attains both a global maximum and a global minimum, which occur either at the endpoints of the interval or at the critical points of f in the interior of the interval, (a, b).
Fun fact: Closed and bounded intervals are called compact.

We now want to state a similar theorem for functions of two variables - i.e. we want to know what it takes
for a function f (x, y) to have global extrema on a domain D. Continuity will be required, but so will a
generalization of “closed intervals” for domains D ⊂ R2 . The concept of a set being “closed” lies in the
realm of topology, so to generalize this we will need to do some introductory topology.
Definition 18.8. A domain D ⊂ R² is called bounded if there is a number M > 0 such that D is contained in the disk of radius M centered at the origin, i.e. all points in D are less than the distance M away from the origin.

A point P is called an interior point of D if D contains a disk D(P, r) centered at P of some radius r > 0,
i.e. you can draw a circle around P of some small enough radius so the interior of the circle lies in the
domain D. The interior of D is the set of all interior points of D.

A point P is called a boundary point of D if every disk centered at P contains points in D and points not
in D. The boundary of D is the set of all the boundary points of D.

A domain D is called closed if it contains its boundary. A domain D is open if every point in D is an
interior point. Note: A set (domain) is not a door! Sets can be neither open nor closed, or can be both (i.e.,
clopen)!


Theorem 18.9 (Existence and Location of Global Extrema). Let f (x, y) be a continuous function on a
closed and bounded domain D in R2 . Then
• f takes on both a global maximum and a global minimum on D, and
• these extreme values occur either at critical points in the interior of D or at points on the boundary of D.
Example 18.10. Find the global extreme values of $f(x, y) = x^3 + y^3 - 3xy$ on the domain $D = \{(x, y) : 0 \le x \le 2,\ 0 \le y \le 2\}$.

First, find the critical points in the interior of the domain:
\[ f_x(x, y) = 3x^2 - 3y = 0 \quad\text{and}\quad f_y(x, y) = 3y^2 - 3x = 0. \]
$f_x = 0$ gives $y = x^2$; substituting this into $f_y = 0$ we get
\[ 3\left( x^2 \right)^2 - 3x = 3x\left( x^3 - 1 \right) = 0, \]
and so $x = 0$ or $x = 1$. Then using $y = x^2$ we get the critical points $(0, 0)$ and $(1, 1)$, of which only $(1, 1)$ is in the interior of $D$, so we will consider $f(1, 1) = -1$.

[Figure: the square domain D in the xy-plane, with interior shaded gray, the critical point (CP) at (1, 1) marked, and the four edges labeled: top edge y = 2, 0 ≤ x ≤ 2; left edge x = 0, 0 ≤ y ≤ 2; right edge x = 2, 0 ≤ y ≤ 2; bottom edge y = 0, 0 ≤ x ≤ 2.]

In the figure we see the domain D is a square, with the interior shaded gray and with boundary the edges of the square.

Next we want to find the extreme values on the boundary of the domain. The boundary comes in four pieces, which we will treat separately.

Along the top edge, $f(x, y) = f(x, 2) = x^3 - 6x + 8$. Using Cal 1 techniques we see that on the top edge $f(x, y)$ has a maximum of $f(0, 2) = 8$ and a minimum of $f(\sqrt{2}, 2) = 8 - 4\sqrt{2}$.

Along the right edge, $f(x, y) = f(2, y) = y^3 - 6y + 8$. Using Cal 1 techniques we see that on the right edge $f(x, y)$ has a maximum of $f(2, 0) = 8$ and a minimum of $f(2, \sqrt{2}) = 8 - 4\sqrt{2}$.

Along the bottom edge, $f(x, y) = f(x, 0) = x^3$. Using Cal 1 techniques we see that on the bottom edge $f(x, y)$ has a maximum of $f(2, 0) = 8$ and a minimum of $f(0, 0) = 0$.

Along the left edge, $f(x, y) = f(0, y) = y^3$. Using Cal 1 techniques we see that on the left edge $f(x, y)$ has a maximum of $f(0, 2) = 8$ and a minimum of $f(0, 0) = 0$.

By the previous theorem, the global extrema of f (x, y) occur at critical points or on the boundary, so we
compare the value of f (x, y) at any critical points on the interior of D with the extreme values of f (x, y) on
the boundary to get
f (x, y) has a global minimum on D of − 1 at (1, 1)
and
f (x, y) has a global maximum on D of 8 at (0, 2) and (2, 0).
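A brute-force grid search is a useful sanity check on this kind of answer, though it is not a proof. The sketch below samples $f$ on a uniform grid over the square; the grid size `n = 200` is an arbitrary assumption, chosen so that the points $(1, 1)$, $(0, 2)$ and $(2, 0)$ land exactly on the grid.

```python
def f(x, y):
    return x**3 + y**3 - 3 * x * y

n = 200  # number of subintervals per side (an assumption; step size 0.01)
grid = [(2 * i / n, 2 * j / n) for i in range(n + 1) for j in range(n + 1)]

# Smallest and largest sampled values, together with where they occur
f_min, p_min = min((f(x, y), (x, y)) for x, y in grid)
f_max = max(f(x, y) for x, y in grid)
p_max = [(x, y) for x, y in grid if f(x, y) == f_max]

print(f_min, p_min)
print(f_max, p_max)
```

The sampled minimum and maximum should match the values found above by comparing the interior critical point with the four edges.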


19 Constrained Optimization (14.8)


In real world problems, optimization problems often come with many constraints. For example, fielding the
best sports team subject to a salary cap, arriving at a destination subject to speed limit laws, making your
best grades subject to amount of time available, etc. In this section we tackle optimization problems that
come with some known constraint using the method of Lagrange multipliers.

The following images give a geometric idea as to how we will do this. Consider a function f (x, y) that we
want to optimize subject to a constraint given by g(x, y) = 0. The solution must satisfy the constraint, and
so must lie on the level curve given by g(x, y) = 0.

Imagine we start at the point $Q$ on the curve $g(x, y) = 0$. The gradient $\nabla f_Q$ points in the direction of maximum increase of $f$, and so to find an optimizer we want to move in this direction. However, we must remain on the constraint curve! Since $\nabla f_Q$ points toward the right, we can move in that direction along the curve and increase $f(x, y)$. Let’s keep doing this until we arrive at the point $P$. At $P$, $\nabla f_P$ is orthogonal to the constraint curve, and so we cannot increase $f$ by moving along the curve in either direction - which means we have arrived at a local maximum for $f$ on the constraint curve!

Recall that gradients are orthogonal to level curves; in particular, $\nabla g_P$ is orthogonal to $g(x, y) = 0$, and so $\nabla f_P$ and $\nabla g_P$ are parallel, and hence are scalar multiples of each other!

Also note in the above that the extremum of $f$ occurs where a level curve of $f$ is tangent to the constraint curve $g(x, y) = 0$.
Theorem 19.1 (Lagrange Multipliers). Let $f(x, y)$ and $g(x, y)$ be differentiable. If $f(x, y)$ has a local extremum on the constraint curve $g(x, y) = 0$ at $P = (a, b)$ with $\nabla g_P \neq 0$, then there is a scalar $\lambda$, called a Lagrange multiplier, such that
\[ \nabla f_P = \lambda \nabla g_P. \]
(In the above theorem, the reason for requiring $\nabla g_P \neq 0$ is beyond the scope of this course, but it guarantees that the curve $g(x, y) = 0$ can be nicely parametrized near $P$.)

Proof. Assume $f(x, y)$ and $g(x, y)$ are differentiable, $f(x, y)$ has a local extremum on the constraint curve $g(x, y) = 0$ at $P = (a, b)$, and $\nabla g_P \neq 0$. Since $\nabla g_P \neq 0$, we can choose a parametrization $\mathbf{r}(t)$ of the constraint curve $g(x, y) = 0$ near $P$ so that $\mathbf{r}(0) = P$ and $\mathbf{r}'(0) \neq 0$. Then $f(\mathbf{r}(0)) = f(P)$ and, by assumption, $f(\mathbf{r}(t))$ has a local extremum at $t = 0$. Thus $t = 0$ is a critical point of $f(\mathbf{r}(t))$ and so, using the chain rule, we get
\[ \left. \frac{d}{dt} f(\mathbf{r}(t)) \right|_{t=0} = \nabla f_P \cdot \mathbf{r}'(0) = 0. \]
So $\nabla f_P$ is orthogonal to $\mathbf{r}'(0)$, and hence $\nabla f_P$ is orthogonal to $g(x, y) = 0$. As $\nabla g_P$ is also orthogonal to $g(x, y) = 0$, $\nabla f_P$ and $\nabla g_P$ must be parallel.


Definition 19.2. The equation
\[ \nabla f_P = \lambda \nabla g_P \]
is called the Lagrange condition. When written in terms of components we get the Lagrange equations:
\[ f_x(a, b) = \lambda g_x(a, b), \qquad f_y(a, b) = \lambda g_y(a, b). \]
A point $P$ satisfying the Lagrange equations is called a critical point for $f$ with constraint $g = 0$, and $f(a, b)$ is called a critical value.

Example 19.3. Find the extrema of $f(x, y) = x^2 + y^2$ subject to the constraint $x^4 + y^4 = 1$.

First, identify the constraint: $g(x, y) = x^4 + y^4 - 1 = 0$, then find the gradients of $f$ and $g$:
\[ \nabla f = \langle 2x, 2y \rangle \quad\text{and}\quad \nabla g = \langle 4x^3, 4y^3 \rangle. \]
To use Lagrange multipliers, we need $\nabla g \neq 0$. We have $\nabla g = \langle 4x^3, 4y^3 \rangle = 0$ only at $(0, 0)$, which does not lie on the curve $g(x, y) = 0$. Thus $\nabla g \neq 0$ on $g(x, y) = 0$.

Now we get the Lagrange condition
\[ \langle 2x, 2y \rangle = \lambda \langle 4x^3, 4y^3 \rangle \]
and the Lagrange equations
\[ 2x = \lambda (4x^3), \qquad 2y = \lambda (4y^3). \]
To solve for the points $(x, y)$ that satisfy the Lagrange equations, solve for $\lambda$ in both equations:
\[ \lambda = \frac{2x}{4x^3} = \frac{1}{2x^2}, \qquad \lambda = \frac{2y}{4y^3} = \frac{1}{2y^2}. \]
As we divided by both $x^3$ and $y^3$, we must consider the cases $x = 0$ and $y = 0$ separately. First let’s consider $x \neq 0$ and $y \neq 0$, and solve for $x$ and $y$ using the constraint. To do this, first equate the two expressions for $\lambda$ and solve for either $x$ or $y$:
\[ \frac{1}{2x^2} = \frac{1}{2y^2} \quad\Rightarrow\quad x^2 = y^2 \quad\Rightarrow\quad y = \pm x. \]
Now substitute this into the constraint equation to solve for $x$ and $y$:
\[ x^4 + (\pm x)^4 = 1 \quad\Rightarrow\quad x^4 = \frac{1}{2} \quad\Rightarrow\quad x = \pm 2^{-1/4} \]
and hence, using $y = \pm x$, we get the critical points
\[ \left( 2^{-1/4}, 2^{-1/4} \right), \quad \left( 2^{-1/4}, -2^{-1/4} \right), \quad \left( -2^{-1/4}, 2^{-1/4} \right), \quad \left( -2^{-1/4}, -2^{-1/4} \right). \]

Let’s now consider the cases x = 0 and y = 0. If x = 0, by the constraint, y = ±1, which gives the critical
points
(0, −1), (0, 1).
If y = 0, by the constraint, x = ±1, which gives the critical points
(−1, 0), (1, 0).
Now, evaluate $f$ at the critical points:
\[ f\left( 2^{-1/4}, 2^{-1/4} \right) = f\left( 2^{-1/4}, -2^{-1/4} \right) = f\left( -2^{-1/4}, 2^{-1/4} \right) = f\left( -2^{-1/4}, -2^{-1/4} \right) = \sqrt{2} \]
and
\[ f(0, -1) = f(0, 1) = f(-1, 0) = f(1, 0) = 1. \]
As the constraint curve $x^4 + y^4 - 1 = 0$ is a closed and bounded set in $\mathbb{R}^2$ and $f$ is continuous, we know $f$ achieves both a maximum and a minimum on this set. Thus we can conclude that the maximum of $f$ subject to $g = 0$ is $\sqrt{2}$ and the minimum of $f$ subject to $g = 0$ is $1$.
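Each of the eight critical points can be checked directly: it should lie on the constraint curve, and $\nabla f$ and $\nabla g$ should be parallel there. The Python sketch below tests parallelism through the 2D cross product $f_x g_y - f_y g_x$, which vanishes exactly when the two vectors are scalar multiples; the helper name `check` and the tolerance `1e-12` are assumptions for illustration.

```python
def check(x, y, tol=1e-12):
    # Constraint g(x, y) = x^4 + y^4 - 1 and the two gradients
    on_curve = abs(x**4 + y**4 - 1) < tol
    f_x, f_y = 2 * x, 2 * y
    g_x, g_y = 4 * x**3, 4 * y**3
    # grad f is parallel to grad g exactly when this cross product is zero
    parallel = abs(f_x * g_y - f_y * g_x) < tol
    return on_curve, parallel, x**2 + y**2

a = 2 ** -0.25
points = [(a, a), (a, -a), (-a, a), (-a, -a), (0, 1), (0, -1), (1, 0), (-1, 0)]
results = [check(x, y) for x, y in points]
for p, r in zip(points, results):
    print(p, r)
```

The largest and smallest values in the last column are the constrained maximum and minimum reported above.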


The image to the right gives an idea of constrained optimization. Notice that the level curve g(x, y) = 0 will be closed in R² and so if it is bounded, by a theorem from last section, extreme values will exist on the curve!

In the next example, we extend the method of Lagrange multipliers to a function f(x, y, z). In fact, the method of Lagrange multipliers can be extended to functions of any number of variables!

Example 19.4. Show that the box of maximum volume with dimensions summing to 3 inches is the 1×1×1
cube.

Let the dimensions of the box be x ≥ 0, y ≥ 0, and z ≥ 0. Then the volume of the box is given by
f (x, y, z) = xyz. We want to show that f has a maximum at (1, 1, 1) subject to the constraint g(x, y, z) =
x + y + z − 3 = 0. Note that the domain of f (x, y, z) is x, y, z ≥ 0. The gradients of f and g are

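\[ \nabla f = \langle yz, xz, xy \rangle \quad\text{and}\quad \nabla g = \langle 1, 1, 1 \rangle, \]
so $\nabla g \neq 0$ for all $(x, y, z)$. We then get the Lagrange condition
\[ \langle yz, xz, xy \rangle = \lambda \langle 1, 1, 1 \rangle, \]
and the Lagrange equations
\[ yz = \lambda, \qquad xz = \lambda, \qquad xy = \lambda. \]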
Thus, we get yz = xz = xy, which gives

yz = xz ⇒ z(y − x) = 0

and
xz = xy ⇒ x(z − y) = 0.
If any of $x$, $y$, or $z$ is $0$, then the volume will be zero, which is clearly the minimum volume. So we can assume $x > 0$, $y > 0$, and $z > 0$. Then the first equation gives $x = y$ and the second equation gives $z = y$. Substituting these into the constraint we get
\[ x + y + z - 3 = 3y - 3 = 0 \quad\Rightarrow\quad y = 1, \]
and hence $x = z = 1$. So there is a critical point at $(1, 1, 1)$.

The constraint $x + y + z = 3$ with domain $x \ge 0$, $y \ge 0$, $z \ge 0$ is the part of the plane $x + y + z = 3$ which lies in the first octant. This is a bounded and closed set in $\mathbb{R}^3$, and $f$ is continuous on this set, so $f$ has global extreme values on this set (via a generalization of our earlier theorems). The minimum is zero (when any of $x$, $y$, $z$ is $0$), and so the maximum is $1$, attained when $x = y = z = 1$.
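The conclusion can be sanity-checked by sampling the constraint surface directly. In the Python sketch below, $z$ is eliminated using $z = 3 - x - y$ and the volume is evaluated on a grid over the triangle $x \ge 0$, $y \ge 0$, $x + y \le 3$; the grid resolution `n = 300` is an arbitrary assumption, chosen so that $(1, 1)$ is a grid point.

```python
def volume(x, y):
    # Eliminate z with the constraint x + y + z = 3
    z = 3 - x - y
    return x * y * z

n = 300  # grid resolution (an assumption; step size 0.01)
# Only sample points with x + y <= 3, so that z >= 0
best = max(
    (volume(3 * i / n, 3 * j / n), 3 * i / n, 3 * j / n)
    for i in range(n + 1)
    for j in range(n + 1 - i)
)
print(best)  # (largest sampled volume, x, y)
```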


Example 19.5. Given $n$ nonzero numbers $\sigma_1, \dots, \sigma_n$, show that the minimum value of
\[ f(x_1, \dots, x_n) = x_1^2 \sigma_1^2 + \cdots + x_n^2 \sigma_n^2 \]
subject to $x_1 + \cdots + x_n = 1$ is $c = \left( \sum_{j=1}^{n} \sigma_j^{-2} \right)^{-1}$.

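The constraint is $g(x_1, \dots, x_n) = x_1 + \cdots + x_n - 1 = 0$. The gradients are $\nabla f = \langle 2\sigma_1^2 x_1, \dots, 2\sigma_n^2 x_n \rangle$ and $\nabla g = \langle 1, \dots, 1 \rangle \neq 0$. So we can form the Lagrange condition
\[ \langle 2\sigma_1^2 x_1, \dots, 2\sigma_n^2 x_n \rangle = \lambda \langle 1, \dots, 1 \rangle, \]
and the Lagrange equations
\[ 2\sigma_1^2 x_1 = \lambda, \quad \dots, \quad 2\sigma_n^2 x_n = \lambda. \]
Thus, equating all of the left sides above, we get that $\sigma_1^2 x_1 = \cdots = \sigma_n^2 x_n$. Let’s call this common value $c$, so that for $i = 1, \dots, n$, we have $\sigma_i^2 x_i = c$, and hence $x_i = \frac{c}{\sigma_i^2}$. Then from the constraint we get
\[ 1 = x_1 + \cdots + x_n = \frac{c}{\sigma_1^2} + \cdots + \frac{c}{\sigma_n^2} = c \sum_{j=1}^{n} \sigma_j^{-2}. \]
Thus, solving for $c$ we get $c = \left( \sum_{j=1}^{n} \sigma_j^{-2} \right)^{-1}$, and we get the critical point
\[ (x_1, \dots, x_n) = \left( \frac{c}{\sigma_1^2}, \dots, \frac{c}{\sigma_n^2} \right). \]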
We need to show this is a minimum value. As $x_i \to \pm\infty$ for one or more $i$’s, $f \to \infty$. So since $f$ is continuous (it is a polynomial), it must have a minimum value on the constraint. So, this minimum occurs at the above critical point and it is
\[ f\left( \frac{c}{\sigma_1^2}, \dots, \frac{c}{\sigma_n^2} \right) = \sum_{j=1}^{n} \left( \frac{c}{\sigma_j^2} \right)^2 \sigma_j^2 = \sum_{j=1}^{n} \frac{c^2}{\sigma_j^2} = c^2 \sum_{j=1}^{n} \frac{1}{\sigma_j^2} = c^2 c^{-1} = c. \]
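For a concrete feel for this result, here is a small numerical experiment in Python. The values $\sigma = (1, 2, 0.5)$ are hypothetical, chosen only for illustration; the sketch computes $c$ and the critical point, then compares $f$ there against $f$ at randomly chosen points that also satisfy the constraint.

```python
import random

def f(xs, sigmas):
    return sum((x * s) ** 2 for x, s in zip(xs, sigmas))

sigmas = [1.0, 2.0, 0.5]                  # hypothetical sample values
c = 1 / sum(s ** -2 for s in sigmas)      # c = (sum of sigma_j^-2)^-1
x_star = [c / s ** 2 for s in sigmas]     # critical point x_i = c / sigma_i^2

random.seed(0)  # fixed seed so the experiment is repeatable

def random_feasible_point():
    # Perturb x_star by a vector whose entries sum to zero,
    # so the new point still satisfies x_1 + ... + x_n = 1
    d = [random.uniform(-1, 1) for _ in sigmas]
    mean = sum(d) / len(d)
    return [x + di - mean for x, di in zip(x_star, d)]

trial_values = [f(random_feasible_point(), sigmas) for _ in range(1000)]
print(c, f(x_star, sigmas), min(trial_values))
```

None of the random feasible points should give a value below $c$, consistent with the critical value being the constrained minimum.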

The method of Lagrange multipliers can actually be extended to any number of constraints! We just need to give each constraint its own Lagrange multiplier. For example, the Lagrange condition for optimizing $f$ subject to $g = 0$ and $h = 0$ is
\[ \nabla f = \lambda \nabla g + \mu \nabla h. \]
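Example 19.6. Find the point lying on the intersection of the plane $x + \frac{1}{2} y + \frac{1}{4} z = 0$ and the sphere $x^2 + y^2 + z^2 = 9$ with the largest $z$ coordinate.

We want to maximize $f(x, y, z) = z$ subject to the constraints $g(x, y, z) = x + \frac{1}{2} y + \frac{1}{4} z = 0$ and $h(x, y, z) = x^2 + y^2 + z^2 - 9 = 0$.

The gradients are $\nabla f = \langle 0, 0, 1 \rangle$, $\nabla g = \left\langle 1, \frac{1}{2}, \frac{1}{4} \right\rangle$, and $\nabla h = \langle 2x, 2y, 2z \rangle$. Note that $\nabla g \neq 0$ and $\nabla h = 0$ only at the origin, which is not on the sphere of radius 3. So we can form the Lagrange condition
\[ \langle 0, 0, 1 \rangle = \lambda \left\langle 1, \tfrac{1}{2}, \tfrac{1}{4} \right\rangle + \mu \langle 2x, 2y, 2z \rangle, \]
and the Lagrange equations
\[ 0 = \lambda + 2\mu x, \qquad 0 = \tfrac{1}{2} \lambda + 2\mu y, \qquad 1 = \tfrac{1}{4} \lambda + 2\mu z. \]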


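From the first two equations we get
\[ \lambda = -2\mu x, \qquad \lambda = -4\mu y, \]
and setting these equal we get
\[ -2\mu x = -4\mu y \quad\Rightarrow\quad x = 2y. \]
(Here $\mu \neq 0$: if $\mu = 0$, the first equation forces $\lambda = 0$, and the third equation would read $1 = 0$.) Using the constraint $g = 0$ we get
\[ 2y + \frac{1}{2} y + \frac{1}{4} z = 0 \quad\Rightarrow\quad y = -\frac{1}{10} z. \]
Now, using $x = 2y = -\frac{1}{5} z$, we can substitute this into the constraint $h = 0$ to get
\[ \left( -\frac{1}{5} z \right)^2 + \left( -\frac{1}{10} z \right)^2 + z^2 - 9 = 0 \quad\Rightarrow\quad z = \pm\sqrt{\frac{60}{7}}. \]
This gives two critical points,
\[ \left( -\frac{1}{5} \sqrt{\frac{60}{7}},\ -\frac{1}{10} \sqrt{\frac{60}{7}},\ \sqrt{\frac{60}{7}} \right) \quad\text{and}\quad \left( \frac{1}{5} \sqrt{\frac{60}{7}},\ \frac{1}{10} \sqrt{\frac{60}{7}},\ -\sqrt{\frac{60}{7}} \right). \]
The one with the largest $z$ component is then $\left( -\frac{1}{5} \sqrt{\frac{60}{7}},\ -\frac{1}{10} \sqrt{\frac{60}{7}},\ \sqrt{\frac{60}{7}} \right)$.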
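With two constraints there is more to get wrong, so it is worth plugging the answer back in. The Python sketch below checks that the chosen point satisfies both constraints and then recovers the multipliers; the formula used for $\mu$ comes from substituting $\lambda = -2\mu x$ into the third Lagrange equation, a step not written out in the notes, and the variable names are assumptions.

```python
import math

z = math.sqrt(60 / 7)
x, y = -z / 5, -z / 10          # candidate point with the largest z

# Both constraints should be satisfied
g = x + y / 2 + z / 4
h = x**2 + y**2 + z**2 - 9

# Third equation with lambda = -2*mu*x:  1 = mu * (2z - x/2)
mu = 1 / (2 * z - x / 2)
lam = -2 * mu * x

# Residuals of the three Lagrange equations (all should be ~0)
residuals = (
    lam + 2 * mu * x,
    lam / 2 + 2 * mu * y,
    lam / 4 + 2 * mu * z - 1,
)
print(g, h, residuals)
```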


20 Integration in Two Variables (15.1)


Over the next few sections we will introduce multiple integrals, i.e. integrals of multi-variable functions.
As with functions of one variable, multiple integration is the accumulation of a function over a domain, and
so can be used to compute volume, surface area, probabilities, average values, and more.

In this section we introduce integration in two variables - i.e. a double integral, denoted
\[ \iint_D f(x, y)\,dA. \]
Here, the domain $D \subset \mathbb{R}^2$ is measured in area, so we use the differential $dA$. As with single integrals, we will formally define double integrals as the limit of a sum and then evaluate them using the Fundamental Theorem of Calculus.

One major difference between double and single integrals is the domain of integration. In the single variable
case, the domain is an interval of the form [a, b]. In the two variable case, the domain can be significantly
more complicated.

In this section, for ease, we focus only on the case where the domain $D$ is a rectangle. In the following sections we will extend any results for double integrals over rectangles to double integrals over arbitrary domains.

A rectangle $R = [a, b] \times [c, d]$ is the set of all points $(x, y)$ where
\[ a \le x \le b \quad\text{and}\quad c \le y \le d. \]
As with single integrals, double integrals over rectangles can be computed via Riemann sums.

Subdivide $R$ into $R_{ij} = [x_{i-1}, x_i] \times [y_{j-1}, y_j]$, so each sub-rectangle $R_{ij}$ has area $\Delta A_{ij} = \Delta x_i \Delta y_j$. Choose sample points $P_{ij}$ in each $R_{ij}$. Then we get a Riemann sum
\[ S_{N,M} = \sum_{i=1}^{N} \sum_{j=1}^{M} f(P_{ij}) \Delta A_{ij} = \sum_{i=1}^{N} \sum_{j=1}^{M} f(P_{ij}) \Delta x_i \Delta y_j. \]


Each term in the Riemann sum gives the signed volume of the box of height $f(P_{ij})$ above $R_{ij}$:
\[ \underbrace{f(P_{ij}) \Delta A_{ij} = f(P_{ij}) \Delta x_i \Delta y_j = \text{height} \times \text{area}}_{\text{signed volume of the box}} \]

Letting ∆xi and ∆yj → 0 (and the number of boxes go to infinity), we get the formal definition of the double
integral!

Theorem 20.1 (Double Integral over a Rectangle). The double integral of $f(x, y)$ over a rectangle $R$ is defined as the limit
\[ \iint_R f(x, y)\,dA = \lim_{\substack{\Delta x_i \to 0 \\ \Delta y_j \to 0}} \sum_{i=1}^{N} \sum_{j=1}^{M} f(P_{ij}) \Delta A_{ij}. \]

If the limit exists we say f is integrable over R.


The double integral gives the signed volume of the solid region between the graph of f (x, y) and the
rectangle R, where regions above the xy-plane count as positive volume and regions below the xy-plane
count as negative volume. The following theorems are similar to those for single integrals.
Theorem 20.2. If f (x, y) is continuous on a rectangle R, then f is integrable over R.
Theorem 20.3 (Linearity of the Double Integral). If $f(x, y)$ and $g(x, y)$ are integrable over a rectangle $R$, then
\[ \iint_R \big( f(x, y) + g(x, y) \big)\,dA = \iint_R f(x, y)\,dA + \iint_R g(x, y)\,dA \]
and for any constant $C$,
\[ \iint_R C f(x, y)\,dA = C \iint_R f(x, y)\,dA. \]

Example 20.4. By geometry,
\[ \iint_R dA = \text{Area}(R), \]
and hence
\[ \iint_R C\,dA = C \iint_R dA = C\,\text{Area}(R). \]


Let $f(x, y)$ be continuous over $R = [a, b] \times [c, d]$, and fix some $a \le x_0 \le b$. Then
\[ S(x_0) = \int_c^d f(x_0, y)\,dy \]
is the area of the cross section in the vertical plane $x = x_0$ of the region between the graph of $f(x, y)$ and the $xy$-plane.

Similarly, for a fixed $c \le y_0 \le d$,
\[ s(y_0) = \int_a^b f(x, y_0)\,dx \]
is the area of the cross section in the vertical plane $y = y_0$ of the region between the graph of $f(x, y)$ and the $xy$-plane.

We can then integrate to accumulate these cross sections using iterated integrals, i.e. integrals of the form
\[ \int_a^b \left( \int_c^d f(x, y)\,dy \right) dx \quad\text{and}\quad \int_c^d \left( \int_a^b f(x, y)\,dx \right) dy. \]

Theorem 20.5 (Fubini’s Theorem). The double integral of a continuous function $f(x, y)$ over a rectangle $R = [a, b] \times [c, d]$ is equal to the iterated integral (in either order),
\[ \iint_R f(x, y)\,dA = \int_a^b \left( \int_c^d f(x, y)\,dy \right) dx = \int_c^d \left( \int_a^b f(x, y)\,dx \right) dy. \]

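Example 20.6. Evaluate the integral using Fubini’s theorem. Check that reversing the order of integration gives the same answer.
\[ \iint_{[1,3] \times [0,2]} x^3 y\,dA. \]
Integrating first with respect to $y$:
\[
\begin{aligned}
\iint_{[1,3] \times [0,2]} x^3 y\,dA &= \int_{x=1}^{3} \int_{y=0}^{2} x^3 y\,dy\,dx \\
&= \int_{x=1}^{3} \left( \left. \frac{x^3 y^2}{2} \right|_{y=0}^{2} \right) dx \\
&= \int_{x=1}^{3} 2x^3\,dx \\
&= \left. \frac{x^4}{2} \right|_{x=1}^{3} = \frac{81}{2} - \frac{1}{2} = 40.
\end{aligned}
\]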


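Reversing the order of integration gives
\[
\begin{aligned}
\iint_{[1,3] \times [0,2]} x^3 y\,dA &= \int_{y=0}^{2} \int_{x=1}^{3} x^3 y\,dx\,dy \\
&= \int_{y=0}^{2} \left( \left. \frac{x^4 y}{4} \right|_{x=1}^{3} \right) dy \\
&= \int_{y=0}^{2} \left( \frac{81}{4} y - \frac{y}{4} \right) dy \\
&= \int_{y=0}^{2} 20y\,dy \\
&= \left. 10y^2 \right|_{y=0}^{2} \\
&= 40.
\end{aligned}
\]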
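The value 40 can also be approached from the Riemann sum definition. The following Python sketch is a minimal midpoint-rule Riemann sum over a rectangle, with the center of each sub-rectangle used as the sample point; the function name `riemann_sum` and the choice $N = M = 200$ are assumptions made for illustration.

```python
def riemann_sum(f, a, b, c, d, N, M):
    # Uniform subdivision of [a, b] x [c, d] into N * M sub-rectangles
    dx = (b - a) / N
    dy = (d - c) / M
    total = 0.0
    for i in range(N):
        x = a + (i + 0.5) * dx        # midpoint in x
        for j in range(M):
            y = c + (j + 0.5) * dy    # midpoint in y
            total += f(x, y) * dx * dy
    return total

approx = riemann_sum(lambda x, y: x**3 * y, 1, 3, 0, 2, 200, 200)
print(approx)  # should be close to the exact value 40
```

Increasing `N` and `M` should bring the sum closer to the exact value, mirroring the limit in the definition.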

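Example 20.7. Let $R = [1, 3] \times [0, 1]$ and evaluate the following integral in both orders of integration. Which way is easier?
\[ I = \iint_R y e^{xy}\,dA \]
First we evaluate
\[ \int_{x=1}^{3} \int_{y=0}^{1} y e^{xy}\,dy\,dx. \]
Using integration by parts with $dv = e^{xy}\,dy$ and $u = y$, we get the inside integral
\[
\begin{aligned}
\int_{y=0}^{1} y e^{xy}\,dy &= \left. \frac{y}{x} e^{xy} \right|_{y=0}^{1} - \int_{y=0}^{1} \frac{1}{x} e^{xy}\,dy \\
&= \left. \frac{y}{x} e^{xy} \right|_{y=0}^{1} - \left. \frac{1}{x^2} e^{xy} \right|_{y=0}^{1} \\
&= e^x \left( x^{-1} - x^{-2} \right) + x^{-2}.
\end{aligned}
\]
So,
\[ I = \int_1^3 \left[ e^x \left( x^{-1} - x^{-2} \right) + x^{-2} \right] dx. \]
Using a table of integrals you get
\[ \int e^x \left( x^{-1} - x^{-2} \right) dx = x^{-1} e^x + C, \]
so
\[ I = \left. \left( x^{-1} e^x - x^{-1} \right) \right|_1^3 = \frac{e^3}{3} - e + \frac{2}{3}. \]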
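Now changing the order of integration we get
\[
\begin{aligned}
\int_{y=0}^{1} \int_{x=1}^{3} y e^{xy}\,dx\,dy &= \int_{y=0}^{1} \left( \left. \frac{y}{y} e^{xy} \right|_{x=1}^{3} \right) dy \\
&= \int_0^1 \left( e^{3y} - e^y \right) dy \\
&= \left. \left( \frac{e^{3y}}{3} - e^y \right) \right|_0^1 \\
&= \frac{e^3}{3} - e + \frac{2}{3}.
\end{aligned}
\]
This was much easier! If an iterated integral looks hard/impossible, try changing the order of integration!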
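Both orders reduce the double integral to a single integral, so each can be checked with a one-variable numerical rule. The Python sketch below uses composite Simpson's rule (a standard technique, not one used in the notes) on the two single integrals obtained above and compares them with the closed form; the helper name `simpson` and the choice of 200 subintervals are assumptions.

```python
import math

def simpson(g, a, b, n=200):
    # Composite Simpson's rule on [a, b]; n must be even
    h = (b - a) / n
    s = g(a) + g(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * g(a + k * h)
    return s * h / 3

# dy dx order: outer integrand is e^x (1/x - 1/x^2) + 1/x^2 on [1, 3]
hard_order = simpson(lambda x: math.exp(x) * (1 / x - 1 / x**2) + 1 / x**2, 1, 3)

# dx dy order: outer integrand is e^(3y) - e^y on [0, 1]
easy_order = simpson(lambda y: math.exp(3 * y) - math.exp(y), 0, 1)

exact = math.exp(3) / 3 - math.e + 2 / 3
print(hard_order, easy_order, exact)
```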


21 Double Integrals (15.2, 15.4)


In this section we extend double integrals to be over more general domains than rectangles. First we consider
properties of curves.
Definition 21.1. A curve is called simple if it does not self intersect.

A curve is called closed if it has no endpoints and completely encloses an area.

A curve is called smooth if it is simple and has no cusps. Such curves are given by parametrizations with
continuous derivatives up to any order.

A curve is called piecewise smooth if it consists of finitely many smooth curves.


Example 21.2. Label the following curves as simple, closed, smooth, and/or piecewise smooth.

We will define double integrals over closed domains D ⊂ R2 (i.e. domains that contain their boundary) with
boundary curves that are piecewise smooth closed curves. From now on D will always be such a domain.

We do not need to define a double integral over $D$ from scratch, but can use our definition of the double integral over a rectangle. Given a function $f(x, y)$ and domain $D$, choose any rectangle $R$ containing $D$ and define a new function
\[ \tilde{f}(x, y) = \begin{cases} f(x, y) & \text{if } (x, y) \in D \\ 0 & \text{if } (x, y) \notin D \end{cases} \]
The double integral of $f$ over $D$ is then defined as the double integral of $\tilde{f}$ over $R$!
\[ \iint_D f(x, y)\,dA = \iint_R \tilde{f}(x, y)\,dA. \]

We say that $f$ is integrable over $D$ if the integral of $\tilde{f}$ over $R$ exists. Note that the value of the integral does not depend on the choice of $R$ (as $\tilde{f}$ is zero outside of $D$).

Can you spot a possible flaw with this definition?


The function $\tilde{f}$ might not be continuous! Do not fear, the following theorem saves the day.

Theorem 21.3. If $f(x, y)$ is continuous on a closed domain $D$ whose boundary is a closed, simple, piecewise smooth curve, then $\iint_D f(x, y)\,dA$ exists.

As before, the double integral defines the signed volume between the graph of f (x, y) and the xy-plane,
where regions below the xy-plane count as negative volume. The linearity rules and Fubini’s Theorem follow
from integrals over rectangles.

One can approximate double integrals over D via a limiting process of taking integrals over rectangles that cover D. We omit this here; you can see more about it in the textbook.

Although double integrals give a signed volume, we can express the area of $D$ as a double integral:
\[ \text{Area}(D) = \iint_D 1\,dA. \]
By the linearity of the integral we have, for any constant $C$,
\[ \iint_D C\,dA = C\,\text{Area}(D). \]

We now want to evaluate double integrals over D using iterated integrals, as we did for rectangles. We are
able to do this when D is a simple region, i.e. a region between two graphs in the xy-plane.

Definition 21.4. The region $D$ is vertically simple if it is bounded by the graphs of two continuous functions $y = g_1(x)$ and $y = g_2(x)$:
\[ D = \{ (x, y) \mid a \le x \le b,\ g_1(x) \le y \le g_2(x) \}. \]
The region $D$ is horizontally simple if it is bounded by the graphs of two continuous functions $x = g_1(y)$ and $x = g_2(y)$:
\[ D = \{ (x, y) \mid c \le y \le d,\ g_1(y) \le x \le g_2(y) \}. \]


Theorem 21.5. If $D$ is vertically simple,
\[ D = \{ (x, y) \mid a \le x \le b,\ g_1(x) \le y \le g_2(x) \}, \]
then
\[ \iint_D f(x, y)\,dA = \int_a^b \int_{g_1(x)}^{g_2(x)} f(x, y)\,dy\,dx. \]
If $D$ is horizontally simple,
\[ D = \{ (x, y) \mid c \le y \le d,\ g_1(y) \le x \le g_2(y) \}, \]
then
\[ \iint_D f(x, y)\,dA = \int_c^d \int_{g_1(y)}^{g_2(y)} f(x, y)\,dx\,dy. \]

When writing a double integral over a vertically simple region as an iterated integral, the inner integral, $\int_{g_1(x)}^{g_2(x)} f(x, y)\,dy$, is an integral over the dashed segment in the previous figure. Similarly for the inner integral $\int_{g_1(y)}^{g_2(y)} f(x, y)\,dx$ for horizontally simple regions.
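Example 21.6. Find the integral $\iint_D x e^{y^3}\,dA$ for $D = \{ (x, y) \mid 0 \le x \le 1,\ x \le y \le 1 \}$.

$D$ is vertically simple, so
\[ \iint_D x e^{y^3}\,dA = \int_0^1 \left( \int_x^1 x e^{y^3}\,dy \right) dx. \]
However, we cannot compute this integral, since $e^{y^3}$ has no elementary antiderivative in $y$! As $D$ is also horizontally simple, we can write the integral as an iterated integral in the other order.

To do this we need to identify the constant bounds on $y$ and the bounds on $x$ in terms of $y$:
\[ D = \{ (x, y) \mid 0 \le y \le 1,\ 0 \le x \le y \}. \]
Then, using the substitution $u = y^3$, $du = 3y^2\,dy$,
\[
\begin{aligned}
\iint_D x e^{y^3}\,dA &= \int_0^1 \left( \int_0^y x e^{y^3}\,dx \right) dy \\
&= \int_0^1 \left. \frac{x^2}{2} e^{y^3} \right|_{x=0}^{y} dy \\
&= \int_0^1 \frac{y^2}{2} e^{y^3}\,dy \\
&= \int_{y=0}^{1} \frac{1}{6} e^u\,du = \left. \frac{1}{6} e^u \right|_{y=0}^{1} \\
&= \left. \frac{1}{6} e^{y^3} \right|_{y=0}^{1} = \frac{e - 1}{6}.
\end{aligned}
\]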
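Even though the $dy\,dx$ order has no closed-form inner integral, it can still be evaluated numerically, which gives a check on the answer $\frac{e - 1}{6}$. The Python sketch below applies a nested midpoint rule in the original (vertically simple) order; the function name and the resolution `n = 300` are assumptions for illustration.

```python
import math

def integral_dy_dx(n=300):
    # Outer variable: x in [0, 1]; inner variable: y in [x, 1]
    total = 0.0
    dx = 1 / n
    for i in range(n):
        x = (i + 0.5) * dx              # midpoint in x
        dy = (1 - x) / n                # inner step depends on x
        for j in range(n):
            y = x + (j + 0.5) * dy      # midpoint in y
            total += x * math.exp(y**3) * dy * dx
    return total

approx = integral_dy_dx()
exact = (math.e - 1) / 6
print(approx, exact)
```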

Recall from calculus 1 that we can use the definite integral to find the area between two graphs over a given
interval. Similarly, we can use double integrals to find the volume of a region bounded by two surfaces over
some domain.


Theorem 21.7. Suppose z1 (x, y) and z2 (x, y) are integrable functions on D and z1 (x, y) ≤ z2 (x, y) for all
points in D. Then the volume V of the region between the graphs of z1 (x, y) and z2 (x, y) is
\[ V = \iint_D \big( z_2(x, y) - z_1(x, y) \big)\,dA. \]

The proof of the above theorem is in your textbook.

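Example 21.8. Find the volume of the region between the graphs of $f_1(x, y) = x$ and $f_2(x, y) = x^2 + y^2 - 1$ over $D = \{ (x, y) \mid y - 1 \le x \le 1 - y,\ 0 \le y \le 1 \}$.

From the graph, we see that over $D$, $f_1(x, y) \ge f_2(x, y)$. So the volume of the region is given by
\[
\begin{aligned}
\iint_D \big( x - (x^2 + y^2 - 1) \big)\,dA &= \int_0^1 \int_{y-1}^{1-y} \left( x - x^2 - y^2 + 1 \right) dx\,dy \\
&= \int_0^1 \left. \left( \frac{x^2}{2} - \frac{x^3}{3} - xy^2 + x \right) \right|_{x=y-1}^{1-y} dy \\
&= \int_0^1 \left( \frac{8y^3}{3} - 4y^2 + \frac{4}{3} \right) dy \\
&= \left. \left( \frac{8y^4}{12} - \frac{4y^3}{3} + \frac{4}{3} y \right) \right|_0^1 = \frac{2}{3}.
\end{aligned}
\]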

As we saw with single variable functions, greater functions have greater integrals over the same domain.
Theorem 21.9 (Comparison Theorem). Let f(x, y) and g(x, y) be integrable functions on D. If f(x, y) ≥ g(x, y) for all (x, y) ∈ D, then
\[ \iint_D f(x, y)\, dA \ge \iint_D g(x, y)\, dA. \]

Corollary 21.10. If f (x, y) is an integrable function on D and m ≤ f (x, y) ≤ M for all (x, y) ∈ D, then
\[ m \cdot \mathrm{Area}(D) \le \iint_D f(x, y)\, dA \le M \cdot \mathrm{Area}(D). \]

We also have a mean value theorem for double integrals over “nice” domains. We say D is connected if any two points in D can be joined by a curve that lies in D.

Theorem 21.11 (Mean Value Theorem for Double Integrals). If f (x, y) is continuous and D is closed,
bounded, and connected, then there exists a point P ∈ D such that
\[ \iint_D f(x, y)\, dA = f(P) \cdot \mathrm{Area}(D). \]


In calculus 1, we used u-substitution to simplify an integral. We can similarly use a change of variables in
double integrals to simplify not just the integrand, but the bounds of integration.

For example, when the domain of integration is an angular sector or polar rectangle of the form

D = {(r, θ) | θ1 ≤ θ ≤ θ2, r1 ≤ r ≤ r2},

or, more generally, a radially simple region of the form

D = {(r, θ) | θ1 ≤ θ ≤ θ2, r1(θ) ≤ r ≤ r2(θ)},

changing from rectangular to polar coordinates often makes things easier.

Riemann sums can also be set up in polar coordinates; see the textbook for details. The point is the following.
Theorem 21.12 (Double Integrals in Polar). For a continuous function f(x, y) on the domain

D = {(r, θ) | θ1 ≤ θ ≤ θ2, r1(θ) ≤ r ≤ r2(θ)},

\[ \iint_D f(x, y)\, dA = \int_{\theta_1}^{\theta_2} \int_{r_1(\theta)}^{r_2(\theta)} f(r\cos\theta, r\sin\theta) \cdot r\, dr\, d\theta. \]

Note in the above that dA = r dr dθ!


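Example 21.13. Evaluate $\iint_D \sqrt{x^2 + y^2}\, dA$ for $D = \{ (x, y) \mid 0 \le y \le 3,\ 0 \le x \le \sqrt{9 - y^2} \}$.

D is the quarter in the first quadrant of the disk of radius 3 centered at the origin. In polar coordinates, we have
\[ D = \left\{ (r, \theta) \mid 0 \le \theta \le \tfrac{\pi}{2},\ 0 \le r \le 3 \right\}. \]
In polar coordinates, $f = \sqrt{x^2 + y^2} = r$, and using change of variables we get
\[
\begin{aligned}
\iint_D \sqrt{x^2 + y^2}\, dA &= \int_0^{\pi/2} \int_0^3 r \cdot r\, dr\, d\theta \\
&= \int_0^{\pi/2} \int_0^3 r^2\, dr\, d\theta \\
&= \int_0^{\pi/2} \left[ \frac{r^3}{3} \right]_0^3 d\theta \\
&= \int_0^{\pi/2} 9\, d\theta = 9 \cdot \frac{\pi}{2} = \frac{9\pi}{2}.
\end{aligned}
\]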
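To see that the polar and rectangular descriptions of D in Example 21.13 really give the same number, the following sketch estimates both iterated integrals with a midpoint rule. The helper `midpoint_iterated` and the grid size are assumptions made for illustration; note the extra factor r in the polar integrand, coming from dA = r dr dθ.

```python
import math

def midpoint_iterated(f, s_lo, s_hi, t_lo, t_hi, n=300):
    """Midpoint estimate of int_{s_lo}^{s_hi} int_{t_lo(s)}^{t_hi(s)} f(s, t) dt ds."""
    ds = (s_hi - s_lo) / n
    total = 0.0
    for i in range(n):
        s = s_lo + (i + 0.5) * ds
        a, b = t_lo(s), t_hi(s)
        dt = (b - a) / n
        inner = sum(f(s, a + (j + 0.5) * dt) for j in range(n)) * dt
        total += inner * ds
    return total

# Polar: outer variable theta in [0, pi/2], inner variable r in [0, 3].
# The integrand is f(r cos(theta), r sin(theta)) * r = r * r.
polar = midpoint_iterated(lambda theta, r: r * r,
                          0.0, math.pi / 2, lambda th: 0.0, lambda th: 3.0)

# Rectangular: outer variable y in [0, 3], inner variable x in [0, sqrt(9 - y^2)].
rect = midpoint_iterated(lambda y, x: math.hypot(x, y),
                         0.0, 3.0, lambda y: 0.0, lambda y: math.sqrt(9.0 - y * y))

exact = 9 * math.pi / 2
print(polar, rect, exact)
```

The rectangular estimate converges more slowly because the upper bound $\sqrt{9 - y^2}$ has a vertical tangent at y = 3, which is one practical reason to prefer polar coordinates here.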


22 Triple Integrals (15.3, 15.4)


In this section we generalize the idea of double integrals of functions of two variables to triple integrals of functions of three variables. Instead of a rectangle [a, b] × [c, d] in the plane, consider the box

B = [x1, x2] × [y1, y2] × [z1, z2] = {(x, y, z) | x1 ≤ x ≤ x2, y1 ≤ y ≤ y2, z1 ≤ z ≤ z2}.

We can then decompose B into smaller boxes and use Riemann sums to define the triple integral over B. We will not do this, but instead evaluate triple integrals via Fubini’s theorem.
Theorem 22.1 (Fubini’s Theorem for Triple Integrals). If f(x, y, z) is continuous over the box B = [x1, x2] × [y1, y2] × [z1, z2], then the triple integral over B of f is equal to the iterated integral of f:
\[ \iiint_B f(x, y, z)\, dV = \int_{z_1}^{z_2} \int_{y_1}^{y_2} \int_{x_1}^{x_2} f(x, y, z)\, dx\, dy\, dz. \]

Furthermore, the iterated integral may be evaluated in any order.


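Example 22.2. Evaluate the integral of $f(x, y, z) = \dfrac{x}{y + z}$ over the box B = [0, 2] × [1, 3] × [0, 1].
We can choose any order to evaluate the iterated integral, so why not dx dy dz. Then
\[
\begin{aligned}
\int_0^1 \int_1^3 \int_0^2 \frac{x}{y + z}\, dx\, dy\, dz &= \int_0^1 \int_1^3 \left[ \frac{x^2}{2(y + z)} \right]_{x=0}^{2} dy\, dz \\
&= \int_0^1 \int_1^3 \frac{2}{y + z}\, dy\, dz \\
&= \int_0^1 \Big[ 2 \ln(y + z) \Big]_{y=1}^{3} dz && \text{(u-sub)} \\
&= \int_0^1 \left( 2 \ln(3 + z) - 2 \ln(1 + z) \right) dz \\
&= \Big[ 2(3 + z)\left( \ln(3 + z) - 1 \right) - 2(1 + z)\left( \ln(1 + z) - 1 \right) \Big]_{z=0}^{1} && \text{(IBP)} \\
&= 8 \ln(4) - 4 \ln(2) - 6 \ln(3).
\end{aligned}
\]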
Theorem 22.3. If f(x, y, z) is continuous over B = [x1, x2] × [y1, y2] × [z1, z2] and f(x, y, z) = g(x)h(y)j(z), then
\[ \iiint_B f(x, y, z)\, dV = \left( \int_{x_1}^{x_2} g(x)\, dx \right) \left( \int_{y_1}^{y_2} h(y)\, dy \right) \left( \int_{z_1}^{z_2} j(z)\, dz \right). \]

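Example 22.4. Evaluate the integral of $f(x, y, z) = \dfrac{x e^{y}}{z}$ over the box B = [1, 3] × [−1, 2] × [1, e] as an iterated integral and as a product of integrals.

\[
\begin{aligned}
\int_1^e \int_{-1}^{2} \int_1^3 \frac{x e^{y}}{z}\, dx\, dy\, dz &= \int_1^e \int_{-1}^{2} \left[ \frac{x^2 e^{y}}{2z} \right]_{x=1}^{3} dy\, dz \\
&= \int_1^e \int_{-1}^{2} \frac{4 e^{y}}{z}\, dy\, dz \\
&= \int_1^e \left[ \frac{4 e^{y}}{z} \right]_{y=-1}^{2} dz \\
&= \int_1^e \frac{4\left( e^{2} - e^{-1} \right)}{z}\, dz \\
&= \Big[ 4\left( e^{2} - e^{-1} \right) \ln(z) \Big]_{1}^{e} = 4\left( e^{2} - e^{-1} \right)
\end{aligned}
\]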


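and
\[
\begin{aligned}
\int_1^e \int_{-1}^{2} \int_1^3 \frac{x e^{y}}{z}\, dx\, dy\, dz &= \left( \int_1^3 x\, dx \right) \left( \int_{-1}^{2} e^{y}\, dy \right) \left( \int_1^e \frac{1}{z}\, dz \right) \\
&= \left( \left[ \frac{x^2}{2} \right]_1^3 \right) \left( \Big[ e^{y} \Big]_{-1}^{2} \right) \left( \Big[ \ln z \Big]_1^e \right) \\
&= \left( \frac{9}{2} - \frac{1}{2} \right) \left( e^{2} - e^{-1} \right) (1 - 0) \\
&= 4\left( e^{2} - e^{-1} \right).
\end{aligned}
\]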


Just as we generalized double integrals over rectangles to double integrals over vertically or horizontally
simple regions, we now want to generalize triple integrals over boxes to triple integrals over regions W that
are z-simple.

Definition 22.5 (z-simple).

A region $W \subset \mathbb{R}^3$ is z-simple if it is the region between two surfaces z = z1(x, y) and z = z2(x, y) over a domain D in the xy-plane, i.e.

W = {(x, y, z) | (x, y) ∈ D, z1(x, y) ≤ z ≤ z2(x, y)}.

In this case the domain D is the projection of W onto the xy-plane.

As we did with double integrals, we define a triple integral over a z-simple region W by
\[ \iiint_W f(x, y, z)\, dV = \iiint_B \tilde{f}(x, y, z)\, dV, \]
where B is a box containing W and $\tilde{f}$ is the function equal to f on W and zero on the rest of B. As before, the triple integral exists if z1, z2, and f are continuous.

Just as a double integral of f represents a signed volume between the xy-plane and the graph of f over the region D, a triple integral represents a signed four-dimensional volume. This, of course, cannot be drawn in our lowly three-dimensional world. However, as we saw with double integrals, triple integrals can also represent the size of the domain of integration in the following way: if W is a z-simple region, then
\[ \iiint_W dV = \mathrm{Volume}(W). \]

In particular, if W is the region between the graphs z = z1(x, y) and z = z2(x, y) over a domain D, W = {(x, y, z) | (x, y) ∈ D, z1(x, y) ≤ z ≤ z2(x, y)}, then
\[ \mathrm{Volume}(W) = \iiint_W dV = \iint_D \left( \int_{z_1(x, y)}^{z_2(x, y)} dz \right) dA = \iint_D \left( z_2(x, y) - z_1(x, y) \right) dA, \]
which is exactly the rule for double integrals we saw last section! As we also saw with double integrals, by the linearity of the integral, for any constant C,
\[ \iiint_W C\, dV = C \iiint_W dV = C \cdot \mathrm{Volume}(W). \]

In practice we will compute triple integrals via the following theorem.


Theorem 22.6. If f is continuous on the z-simple region W = {(x, y, z) | (x, y) ∈ D, z1(x, y) ≤ z ≤ z2(x, y)}, then
\[ \iiint_W f(x, y, z)\, dV = \iint_D \left( \int_{z = z_1(x, y)}^{z = z_2(x, y)} f(x, y, z)\, dz \right) dA. \]

Example 22.7. Evaluate the integral $\iiint_W e^{z}\, dV$ where W is the tetrahedron with vertices the origin, (4, 0, 0), (0, 4, 0), and (0, 0, 6).

Notice that W is a z-simple region - W is bounded below by the xy-plane, z = 0, and bounded above by the plane containing the points A = (4, 0, 0), B = (0, 4, 0), and C = (0, 0, 6).

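To find this plane we compute a vector normal to the plane:
\[ \mathbf{n} = \overrightarrow{AB} \times \overrightarrow{AC} = \langle -4, 4, 0 \rangle \times \langle -4, 0, 6 \rangle = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ -4 & 4 & 0 \\ -4 & 0 & 6 \end{vmatrix} = \langle 24, 24, 16 \rangle. \]
So an equation for this plane is
\[ 24(x - 4) + 24y + 16z = 0 \implies z = -\frac{3}{2}(x - 4) - \frac{3}{2} y = -\frac{3}{2} x + 6 - \frac{3}{2} y. \]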
Thus
\[ W = \left\{ (x, y, z) \mid (x, y) \in D,\ 0 \le z \le -\frac{3}{2}(x - 4) - \frac{3}{2} y \right\}, \]
where D is the projection of W onto the xy-plane, which is the region in the first quadrant bounded by x = 0, y = 0, and the line between (4, 0) and (0, 4):
\[ y - 0 = \frac{4 - 0}{0 - 4}(x - 4) \implies y = -x + 4, \]
i.e.

D = {(x, y) | 0 ≤ x ≤ 4, 0 ≤ y ≤ −x + 4}.


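So, as D is vertically simple, we have
\[
\begin{aligned}
\iiint_W e^{z}\, dV &= \iint_D \left( \int_{z=0}^{z=-\frac{3}{2}x + 6 - \frac{3}{2}y} e^{z}\, dz \right) dA \\
&= \int_{x=0}^{x=4} \int_{y=0}^{y=-x+4} \left( \int_{z=0}^{z=-\frac{3}{2}x + 6 - \frac{3}{2}y} e^{z}\, dz \right) dy\, dx \\
&= \int_{x=0}^{x=4} \int_{y=0}^{y=-x+4} \Big[ e^{z} \Big]_{z=0}^{z=-\frac{3}{2}x + 6 - \frac{3}{2}y} dy\, dx \\
&= \int_{x=0}^{x=4} \int_{y=0}^{y=-x+4} \left( e^{-\frac{3}{2}x + 6 - \frac{3}{2}y} - 1 \right) dy\, dx \\
&= \int_{x=0}^{x=4} \left[ -\frac{2}{3} e^{-\frac{3}{2}x + 6 - \frac{3}{2}y} - y \right]_{y=0}^{y=-x+4} dx \\
&= \int_{x=0}^{x=4} \left( -\frac{2}{3} e^{-\frac{3}{2}x + 6 - \frac{3}{2}(-x+4)} - (-x + 4) \right) - \left( -\frac{2}{3} e^{-\frac{3}{2}x + 6} \right) dx \\
&= \int_{x=0}^{x=4} \left( -\frac{2}{3} e^{0} + x - 4 + \frac{2}{3} e^{-\frac{3}{2}x + 6} \right) dx \\
&= \int_{x=0}^{x=4} \left( -\frac{14}{3} + x + \frac{2}{3} e^{-\frac{3}{2}x + 6} \right) dx \\
&= \left[ -\frac{14}{3} x + \frac{x^2}{2} - \frac{4}{9} e^{-\frac{3}{2}x + 6} \right]_{x=0}^{x=4} \\
&= \frac{4 e^{6}}{9} - \frac{100}{9}.
\end{aligned}
\]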
There is nothing special about the z-coordinate - we can in fact generalize the idea of a z-simple region to
the x- and y- coordinates:

Definition 22.8 (x- and y-simple). A region W is x-simple if it is the region between two surfaces x = x1(y, z) and x = x2(y, z) over a domain D in the yz-plane, i.e.

W = {(x, y, z) | (y, z) ∈ D, x1(y, z) ≤ x ≤ x2(y, z)}.

Here the domain D is the projection of W onto the yz-plane.

In this case we can evaluate the triple integral over W as
\[ \iiint_W f(x, y, z)\, dV = \iint_D \left( \int_{x = x_1(y, z)}^{x = x_2(y, z)} f(x, y, z)\, dx \right) dA. \]
Similarly, a region W is y-simple if it is the region between two surfaces y = y1(x, z) and y = y2(x, z) over a domain D in the xz-plane, i.e.

W = {(x, y, z) | (x, z) ∈ D, y1(x, z) ≤ y ≤ y2(x, z)}.

In this case the domain D is the projection of W onto the xz-plane, and we can evaluate the triple integral over W as
\[ \iiint_W f(x, y, z)\, dV = \iint_D \left( \int_{y = y_1(x, z)}^{y = y_2(x, z)} f(x, y, z)\, dy \right) dA. \]


Example 22.9. The region W (to the right) is the region bounded by

$z = 4 - y^2$, y = 2x, z = 0, and x = 0.

Notice that W is x-, y-, and z-simple.

Express $\iiint_W xyz\, dV$ as an iterated integral in three ways - with dV = dz dx dy, with dV = dx dy dz, and with dV = dy dx dz.

dV = dz dx dy:

To integrate over W with dV = dz dx dy, we view W as a z-simple region:
\[ W = \left\{ (x, y, z) \mid (x, y) \in D,\ 0 \le z \le 4 - y^2 \right\}, \]
where D is the projection of W into the xy-plane, viewed as a horizontally simple region (since dA = dx dy).

D is bounded in the xy-plane by y = 2x and x = 0. To find the remaining boundary curve for D, we intersect the surface $z = 4 - y^2$ with the xy-plane:
\[ z = 4 - y^2 \implies 0 = 4 - y^2 \implies y = \pm 2. \]
We can see the bound on D is the positive option, y = 2.

Viewing D as a horizontally simple region, we solve for the bounds in terms of y:
\[ D = \left\{ (x, y) \mid 0 \le y \le 2,\ 0 \le x \le \frac{y}{2} \right\}. \]
So,
\[
\begin{aligned}
\iiint_W xyz\, dV &= \iint_D \left( \int_{z=0}^{z=4-y^2} xyz\, dz \right) dA \\
&= \int_{y=0}^{y=2} \int_{x=0}^{x=\frac{y}{2}} \int_{z=0}^{z=4-y^2} xyz\, dz\, dx\, dy.
\end{aligned}
\]

dV = dx dy dz:

To integrate over W with dV = dx dy dz, we view W as an x-simple region by rewriting the plane y = 2x as $x = \frac{y}{2}$:
\[ W = \left\{ (x, y, z) \mid (y, z) \in T,\ 0 \le x \le \frac{y}{2} \right\}, \]
where T is the projection of W into the yz-plane.


Projecting W into the yz-plane, we get the region T in the yz-plane bounded by y = 0, z = 0, and $z = 4 - y^2$.

Since dA = dy dz, we want constant bounds for z and bounds for y in terms of z.

To do this we solve for the bounds in terms of z:
\[ T = \left\{ (y, z) \mid 0 \le z \le 4,\ 0 \le y \le \sqrt{4 - z} \right\}. \]
So,
\[
\begin{aligned}
\iiint_W xyz\, dV &= \iint_T \left( \int_{x=0}^{x=\frac{y}{2}} xyz\, dx \right) dA \\
&= \int_{z=0}^{z=4} \int_{y=0}^{y=\sqrt{4-z}} \int_{x=0}^{x=\frac{y}{2}} xyz\, dx\, dy\, dz.
\end{aligned}
\]

dV = dy dx dz:

To integrate over W with dV = dy dx dz, we view W as a y-simple region by solving the equation of the surface $z = 4 - y^2$ for y:
\[ W = \left\{ (x, y, z) \mid (x, z) \in S,\ 2x \le y \le \sqrt{4 - z} \right\}, \]
where S is the projection of W into the xz-plane.

S is bounded in the xz-plane by z = 0, x = 0, and one more boundary curve. This boundary curve is the projection of the boundary of the left face of W onto the xz-plane. These points lie both in the plane y = 2x and on the surface $z = 4 - y^2$, i.e. these points are of the form $Q = (x, 2x, 4 - 4x^2)$. Projecting this into the xz-plane gives points of the form $(x, 0, 4 - 4x^2)$, i.e. the curve is $z = 4 - 4x^2$.

Since dA = dx dz, we want constant bounds for z and bounds for x in terms of z:
\[ S = \left\{ (x, z) \mid 0 \le z \le 4,\ 0 \le x \le \sqrt{1 - \frac{z}{4}} \right\}. \]
So,
\[
\begin{aligned}
\iiint_W xyz\, dV &= \iint_S \left( \int_{y=2x}^{y=\sqrt{4-z}} xyz\, dy \right) dA \\
&= \int_{z=0}^{z=4} \int_{x=0}^{x=\sqrt{1-\frac{z}{4}}} \int_{y=2x}^{y=\sqrt{4-z}} xyz\, dy\, dx\, dz.
\end{aligned}
\]

If you evaluate any of these integrals you should get a value of $\frac{2}{3}$. To see how to write the same integral in the forms dV = dz dy dx, dV = dx dz dy, and dV = dy dz dx see example 5 on page 911 in the textbook.
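The claim that all three orders of integration in Example 22.9 give the same value can be checked numerically. The sketch below is only an illustration: the helper `triple_midpoint` and the grid size are assumptions, and the midpoint rule gives approximations, not exact values.

```python
import math

def triple_midpoint(f, a, b, mid_lo, mid_hi, in_lo, in_hi, n=60):
    """Midpoint estimate of
    int_a^b int_{mid_lo(s)}^{mid_hi(s)} int_{in_lo(s,t)}^{in_hi(s,t)} f(s, t, w) dw dt ds,
    where s is the outer, t the middle, and w the inner variable."""
    ds = (b - a) / n
    total = 0.0
    for i in range(n):
        s = a + (i + 0.5) * ds
        t0, t1 = mid_lo(s), mid_hi(s)
        dt = (t1 - t0) / n
        for j in range(n):
            t = t0 + (j + 0.5) * dt
            w0, w1 = in_lo(s, t), in_hi(s, t)
            dw = (w1 - w0) / n
            inner = sum(f(s, t, w0 + (k + 0.5) * dw) for k in range(n))
            total += inner * dw * dt * ds
    return total

def xyz(s, t, w):
    # The integrand xyz is symmetric, so the order of the arguments does not matter.
    return s * t * w

# dV = dz dx dy: outer y in [0, 2], middle x in [0, y/2], inner z in [0, 4 - y^2].
dzdxdy = triple_midpoint(xyz, 0.0, 2.0,
                         lambda y: 0.0, lambda y: y / 2,
                         lambda y, x: 0.0, lambda y, x: 4 - y * y)

# dV = dx dy dz: outer z in [0, 4], middle y in [0, sqrt(4 - z)], inner x in [0, y/2].
dxdydz = triple_midpoint(xyz, 0.0, 4.0,
                         lambda z: 0.0, lambda z: math.sqrt(4 - z),
                         lambda z, y: 0.0, lambda z, y: y / 2)

# dV = dy dx dz: outer z in [0, 4], middle x in [0, sqrt(1 - z/4)], inner y in [2x, sqrt(4 - z)].
dydxdz = triple_midpoint(xyz, 0.0, 4.0,
                         lambda z: 0.0, lambda z: math.sqrt(1 - z / 4),
                         lambda z, x: 2 * x, lambda z, x: math.sqrt(4 - z))

print(dzdxdy, dxdydz, dydxdz)
```

All three printed values should be close to 2/3, which is a useful way to catch mistakes in the bounds before attempting the integral by hand.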


As with single and double integrals, we have a mean value theorem for triple integrals.
Theorem 22.10 (Mean Value Theorem for Triple Integrals). If f (x, y, z) is continuous on a closed, bounded,
connected region W then there exists a point P ∈ W such that
\[ \iiint_W f(x, y, z)\, dV = f(P) \cdot \mathrm{Volume}(W). \]

Recall that in the previous section we converted to polar coordinates when integrating over a domain that
was radially simple (i.e. a domain nicely described by polar coordinates). We will similarly convert to
cylindrical or spherical coordinates to integrate over regions that are nicely described by either cylindrical
or spherical coordinates.

Definition 22.11 (Axial Symmetry).

Cylindrical coordinates nicely describe regions that are axially symmetric.

A region is axially symmetric if it is the region between two surfaces z1(r, θ) and z2(r, θ), where r, θ, and z are cylindrical coordinates, over a domain D which is radially simple.

Theorem 22.12 (Triple Integration in Cylindrical Coordinates). If f(x, y, z) is continuous on the region

W = {(r, θ, z) | θ1 ≤ θ ≤ θ2, r1(θ) ≤ r ≤ r2(θ), z1(r, θ) ≤ z ≤ z2(r, θ)},

then
\[ \iiint_W f(x, y, z)\, dV = \int_{\theta=\theta_1}^{\theta=\theta_2} \int_{r=r_1(\theta)}^{r=r_2(\theta)} \int_{z=z_1(r, \theta)}^{z=z_2(r, \theta)} f(r\cos\theta, r\sin\theta, z) \cdot r\, dz\, dr\, d\theta. \]

The previous theorem boils down to: changing to cylindrical coordinates gives dV = r · dzdrdθ.
Example 22.13. Find the volume of a cone of height H and radius R using a triple integral.

You should recall that the volume of a cone is $V = \frac{\pi}{3} R^2 H$. To derive this we will let W be the solid cone in the figure to the right, and evaluate
\[ \iiint_W dV. \]
This cone is axially symmetric with

W = {(r, θ, z) | 0 ≤ θ ≤ 2π, 0 ≤ r ≤ R, z1(r, θ) ≤ z ≤ H},

where z1(r, θ) is the surface that gives the cone. Once we find z1 we can compute the integral by changing to cylindrical coordinates.


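To find z1 we use similar triangles.
\[ \frac{z}{H} = \frac{r}{R} \implies z = \frac{H}{R} r. \]
Thus
\[
\begin{aligned}
\iiint_W dV &= \int_{\theta=0}^{\theta=2\pi} \int_{r=0}^{r=R} \int_{z=\frac{H}{R} r}^{z=H} r\, dz\, dr\, d\theta \\
&= \int_{\theta=0}^{\theta=2\pi} \int_{r=0}^{r=R} r \left( H - \frac{H}{R} r \right) dr\, d\theta \\
&= \int_{\theta=0}^{\theta=2\pi} \int_{r=0}^{r=R} \left( rH - \frac{H}{R} r^2 \right) dr\, d\theta \\
&= \int_{\theta=0}^{\theta=2\pi} \left[ \frac{H r^2}{2} - \frac{H r^3}{3R} \right]_{r=0}^{r=R} d\theta \\
&= \int_{\theta=0}^{\theta=2\pi} \left( \frac{H R^2}{2} - \frac{H R^3}{3R} \right) d\theta \\
&= \int_{\theta=0}^{\theta=2\pi} \frac{H R^2}{6}\, d\theta \\
&= \frac{H R^2}{6} (2\pi - 0) = \frac{\pi}{3} H R^2.
\end{aligned}
\]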
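The cylindrical set-up for the cone can be checked numerically for specific dimensions. In the sketch below the function name `cone_volume_cylindrical` and the sample values R = 2, H = 5 are hypothetical choices made only for illustration.

```python
import math

def cone_volume_cylindrical(R, H, n=400):
    """Midpoint estimate of int_0^{2 pi} int_0^R int_{(H/R) r}^{H} r dz dr dtheta."""
    dr = R / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        z_lo, z_hi = (H / R) * r, H
        # The integrand r does not depend on z, so the z-integral is r * (z_hi - z_lo).
        total += r * (z_hi - z_lo) * dr
    # Nothing depends on theta, so the theta-integral contributes a factor of 2 pi.
    return 2 * math.pi * total

R, H = 2.0, 5.0  # sample dimensions (hypothetical)
approx = cone_volume_cylindrical(R, H)
exact = math.pi * R ** 2 * H / 3
print(approx, exact)
```

Forgetting the factor r from dV = r dz dr dθ is the most common mistake here; dropping it from the code changes the answer noticeably.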
Definition 22.14. Spherical coordinates nicely describe regions that are centrally simple.

A region is centrally simple if it is the region between two surfaces ρ1(θ, φ) and ρ2(θ, φ), where θ, φ, and ρ are spherical coordinates, over a domain D which is a rectangle in θ and φ.

Centrally simple regions have the property that every ray from the origin that intersects W does so in a
point or a line segment between ρ1 (θ, φ) and ρ2 (θ, φ).
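Theorem 22.15 (Triple Integration in Spherical Coordinates). If f is continuous on the region

W = {(ρ, θ, φ) | θ1 ≤ θ ≤ θ2, φ1 ≤ φ ≤ φ2, ρ1(θ, φ) ≤ ρ ≤ ρ2(θ, φ)},

then
\[ \iiint_W f(x, y, z)\, dV = \int_{\theta=\theta_1}^{\theta=\theta_2} \int_{\varphi=\varphi_1}^{\varphi=\varphi_2} \int_{\rho=\rho_1(\theta, \varphi)}^{\rho=\rho_2(\theta, \varphi)} f(\rho\sin\varphi\cos\theta, \rho\sin\varphi\sin\theta, \rho\cos\varphi) \cdot \rho^2 \sin\varphi\, d\rho\, d\varphi\, d\theta. \]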

The previous theorem boils down to: changing to spherical coordinates gives dV = ρ2 sin φ · dρdφdθ.
Example 22.16. Use integration in spherical coordinates to find the volume of the region W between the surfaces $x^2 + y^2 + z^2 = 4$ and $x^2 + y^2 = 1$, i.e. inside the sphere and outside the cylinder. See this region here.

There are many ways to find this volume. We could integrate over the whole region to find the volume.

Alternatively, we can see from the graph that W is symmetric about the xy-plane, and so we could find the
volume for z ≥ 0 and double it to find the entire volume.


As another alternative, we can also see from the graph that W is in fact symmetric about the z-axis, and so
we could find the volume of the part of W in the first octant (x, y, z ≥ 0) and multiply that by 8. For fun,
we will do it this way.

In spherical coordinates, the first octant is given by 0 ≤ θ ≤ π/2 and 0 ≤ φ ≤ π/2. W takes all values for θ in the first octant, so 0 ≤ θ ≤ π/2, but W does not take all φ values in the first octant.

Consider the cross section of this region to the right.

The minimum angle for φ in the region satisfies sin φ = 1/2, so that φ = π/6. Thus our bounds for the part of W in the first octant are π/6 ≤ φ ≤ π/2.

Now we have that the part of W in the first octant is
\[ \left\{ (\rho, \theta, \varphi) \mid 0 \le \theta \le \frac{\pi}{2},\ \frac{\pi}{6} \le \varphi \le \frac{\pi}{2},\ \rho_1(\theta, \varphi) \le \rho \le \rho_2(\theta, \varphi) \right\}, \]
where ρ1(θ, φ) gives the cylinder of radius 1 and ρ2(θ, φ) gives the sphere of radius 2. The sphere of radius 2 is exactly ρ = 2, so ρ2(θ, φ) = 2.

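The cylinder of radius 1 is
\[
\begin{aligned}
x^2 + y^2 = 1 &\implies \rho^2 \sin^2\varphi \cos^2\theta + \rho^2 \sin^2\varphi \sin^2\theta = 1 \\
&\implies \rho^2 \sin^2\varphi = 1 \\
&\implies \rho^2 = \csc^2\varphi \\
&\implies \rho = \csc\varphi,
\end{aligned}
\]
so ρ1(θ, φ) = csc φ. Then,
\[
\begin{aligned}
\mathrm{Volume}(W) &= 8 \iiint dV \\
&= 8 \int_{\theta=0}^{\theta=\frac{\pi}{2}} \int_{\varphi=\frac{\pi}{6}}^{\varphi=\frac{\pi}{2}} \int_{\rho=\csc\varphi}^{\rho=2} \rho^2 \sin\varphi\, d\rho\, d\varphi\, d\theta \\
&= 8 \int_{\theta=0}^{\theta=\frac{\pi}{2}} \int_{\varphi=\frac{\pi}{6}}^{\varphi=\frac{\pi}{2}} \left[ \frac{\rho^3}{3} \right]_{\rho=\csc\varphi}^{\rho=2} \sin\varphi\, d\varphi\, d\theta \\
&= 8 \int_{\theta=0}^{\theta=\frac{\pi}{2}} \int_{\varphi=\frac{\pi}{6}}^{\varphi=\frac{\pi}{2}} \left( \frac{8}{3} \sin\varphi - \frac{\csc^2\varphi}{3} \right) d\varphi\, d\theta \\
&= 8 \int_{\theta=0}^{\theta=\frac{\pi}{2}} \left[ -\frac{8}{3} \cos\varphi + \frac{\cot\varphi}{3} \right]_{\varphi=\frac{\pi}{6}}^{\varphi=\frac{\pi}{2}} d\theta \\
&= 8 \int_{\theta=0}^{\theta=\frac{\pi}{2}} \left( \frac{4\sqrt{3}}{3} - \frac{\sqrt{3}}{3} \right) d\theta \\
&= 8 \int_{\theta=0}^{\theta=\frac{\pi}{2}} \sqrt{3}\, d\theta \\
&= 8\sqrt{3} \cdot \frac{\pi}{2} = 4\pi\sqrt{3}.
\end{aligned}
\]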
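As a rough numerical check of Example 22.16, the sketch below evaluates the spherical iterated integral with a midpoint rule in ρ and φ and compares it with $4\pi\sqrt{3}$. The function name `volume_outside_cylinder` and the grid size are assumptions for illustration only.

```python
import math

def volume_outside_cylinder(n=400):
    """Midpoint estimate of
    8 * int_0^{pi/2} int_{pi/6}^{pi/2} int_{csc(phi)}^{2} rho^2 sin(phi) drho dphi dtheta."""
    phi_lo, phi_hi = math.pi / 6, math.pi / 2
    dphi = (phi_hi - phi_lo) / n
    total = 0.0
    for i in range(n):
        phi = phi_lo + (i + 0.5) * dphi
        rho_lo, rho_hi = 1.0 / math.sin(phi), 2.0   # cylinder rho = csc(phi), sphere rho = 2
        drho = (rho_hi - rho_lo) / n
        inner = sum((rho_lo + (k + 0.5) * drho) ** 2 for k in range(n)) * drho
        total += inner * math.sin(phi) * dphi
    # The integrand does not depend on theta, so the theta-integral is a factor of pi/2.
    return 8 * (math.pi / 2) * total

approx = volume_outside_cylinder()
exact = 4 * math.pi * math.sqrt(3)
print(approx, exact)
```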


23 Change of Variables (15.6)


So far we are able to, as long as f is nice enough, compute double integrals $\iint_D f(x, y)\, dA$, where D is horizontally simple or vertically simple. We also learned how to integrate over domains that are radially simple by changing the variables to polar coordinates.

We are also able to, as long as f is nice enough, compute triple integrals $\iiint_W f(x, y, z)\, dV$ where W is x-, y-, or z-simple, and we learned how to integrate over regions that are axially symmetric by changing the variables to cylindrical coordinates and how to integrate over regions that are centrally simple by changing the variables to spherical coordinates.

Changing to polar, cylindrical, or spherical coordinates are special cases of a general change of variables
formula for multiple integrals, which is the focus of this section. We will first consider the double integral
case, and start by introducing some lingo.
Definition 23.1. A map G from a set X to a set Y is a function G : X → Y with domain X. The image
of x ∈ X is G(x) ∈ Y . The range of G is the image of the domain, G(X) = {G(x) | x ∈ X} ⊂ Y .

To use change of variables for double integrals we will consider maps G : D → R2 defined on a domain
D ⊂ R2 . To ease notation, we will often use u and v as the variables for the domain D and x and y as the
variables for the domain G(D), and so write G(u, v) = (x(u, v), y(u, v)) where the components x and y are
functions of the input u and v.

The idea of change of variables is that, to evaluate an integral of the form $\iint_D f(x, y)\, dA$ over a not-so-nice domain $D \subset \mathbb{R}^2$, one can change the variables using a map $G : D_0 \to \mathbb{R}^2$ so that $G(D_0) = D$ and the region $D_0$ is nice in the new coordinate system! The key steps in this process will be to identify the appropriate domain $D_0$ that maps to D under a given map G, and to find out how the map G changes the area of $D_0$.

Before we get into actual change of variables, let’s get more comfortable with maps G : R2 → R2 . We’ll start
with a specific type of map called a linear map.

Definition 23.2. A map G(u, v) is a linear map if it has the form

G(u, v) = (Au + Cv, Bu + Dv) ,

where A, B, C, and D are constants.

We can picture linear maps as mapping vectors in the uv-plane to vectors in the xy-plane, and hence mapping
lines in the uv-plane to lines in the xy-plane. A linear map G has the following properties:

G(u1 + u2 , v1 + v2 ) = G(u1 , v1 ) + G(u2 , v2 )

and
G(cu, cv) = c · G(u, v) for any constant c.


A consequence of this is that a linear map G maps the parallelogram spanned by any two vectors a and b in the uv-plane to the parallelogram spanned by G(a) and G(b) in the xy-plane. E.g. the parallelogram spanned by i and j is mapped to the parallelogram spanned by G(i) = ⟨A, B⟩ and G(j) = ⟨C, D⟩.

More generally, G maps the line segment PQ to the line segment G(P)G(Q), and the grid generated by i and j is mapped to the grid generated by G(i) = ⟨A, B⟩ and G(j) = ⟨C, D⟩.

Example 23.3. Find the image of the triangle T with vertices (1, 2), (3, 0), and (−1, −1) under the map
G(u, v) = (2u, u − 3v).

Because G is linear, it maps the line segment joining two vertices of T to the line segment joining the images
of the two vertices. Hence, the image of T is a triangle whose vertices are the images:

G(1, 2) = (2, −5), G(3, 0) = (6, 3), and G(−1, −1) = (−2, 2).

[Figure: the triangle T (left) and its image under G in the xy-plane (right), with the image vertices G(1, 2), G(3, 0), and G(−1, −1) marked.]

Now that we are more comfortable with maps G : R2 → R2 , we want to consider how areas change under
these maps. It turns out that the following plays an important role.


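Definition 23.4 (The Jacobian). The Jacobian of a map G(u, v) = (x(u, v), y(u, v)) is the determinant
\[ \mathrm{Jac}(G) = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2mm] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{vmatrix} = \frac{\partial x}{\partial u} \frac{\partial y}{\partial v} - \frac{\partial x}{\partial v} \frac{\partial y}{\partial u}. \]
We also denote Jac(G) by $\dfrac{\partial(x, y)}{\partial(u, v)}$. Note that as G is a function of u and v, so is Jac(G).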
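Theorem 23.5 (Jacobian of a Linear Map). The Jacobian of a linear map

G(u, v) = (Au + Cv, Bu + Dv)

is constant with value
\[ \mathrm{Jac}(G) = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2mm] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{vmatrix} = \begin{vmatrix} A & C \\ B & D \end{vmatrix} = AD - BC. \]
Under the map G, the area of a region D is multiplied by the factor |Jac(G)|, i.e.

Area(G(D)) = |Jac(G)| Area(D).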
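The area-scaling statement can be checked on the triangle T and the map G(u, v) = (2u, u − 3v) from Example 23.3. The sketch below computes both areas with the shoelace formula; the helper name `polygon_area` is an illustrative choice, and the constants A, B, C, D are read off from the form G(u, v) = (Au + Cv, Bu + Dv).

```python
def G(u, v):
    """The linear map from Example 23.3: G(u, v) = (2u, u - 3v)."""
    return (2 * u, u - 3 * v)

def polygon_area(points):
    """Unsigned area of a polygon given its vertices in order (shoelace formula)."""
    twice_area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2

T = [(1, 2), (3, 0), (-1, -1)]          # vertices of the triangle T
image = [G(u, v) for (u, v) in T]       # vertices of G(T)

A, B, C, D = 2, 1, 0, -3                # G(u, v) = (Au + Cv, Bu + Dv)
jac = A * D - B * C

print(image, polygon_area(T), polygon_area(image), jac)
```

The ratio of the two areas equals |Jac(G)|, even though Jac(G) itself is negative for this map; the sign only records that G reverses orientation.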

The area relationship in the above theorem does not hold in general for non-linear maps G. However, using
linear approximation one can show that for a non-linear map G, small domains D, and a sample point P ∈ D,

Area(G(D)) ≈ | Jac(G)(P )|Area(D).

(See page 882 in the textbook.) We will see that this estimate is good enough!
Example 23.6. Consider the polar coordinates map $G : \mathbb{R}^2 \to \mathbb{R}^2$ given by (using r and θ instead of u and v),

G(r, θ) = (r cos θ, r sin θ).

Under G, the image of a polar rectangle R = [r1, r2] × [θ1, θ2] in the rθ-plane is an angular sector in the xy-plane.

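Then from above,

Area(G(R)) ≈ |Jac(G)(P)| Area(R)

where P ∈ R. Well, let’s compute Jac(G):
\[ \mathrm{Jac}(G) = \begin{vmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial \theta} \\[2mm] \dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial \theta} \end{vmatrix} = \begin{vmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{vmatrix} = r\cos^2\theta + r\sin^2\theta = r. \]
Thus

Area(G(R)) ≈ |Jac(G)(P)| Area(R) = r · Area(R),

which works nicely with what we already know, namely

dA = Jac(G) · dr dθ = r · dr dθ.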


As we saw in Theorem 21.12, the change of variables to polar coordinates gives
\[ \iint_{G(R)} f(x, y)\, dA = \iint_R f(r\cos\theta, r\sin\theta) \cdot r\, dr\, d\theta. \]

The general change of variables formula has a similar form for a map G : D0 → D, where D0 is in the
uv-plane and D is in the xy-plane.

Theorem 23.7 (Change of Variables Formula). Let $G : D_0 \to D$ be a map such that x(u, v) and y(u, v) have continuous partial derivatives and such that G is one-to-one on the interior of $D_0$. Then, if f(x, y) is continuous,
\[ \iint_D f(x, y)\, dA = \iint_{D_0} f(x(u, v), y(u, v)) \cdot |\mathrm{Jac}(G)|\, du\, dv. \]

The proof of this theorem is in your textbook.


Example 23.8. Use the Change of Variables Formula to calculate $\iint_P e^{4x - y}\, dx\, dy$ where P is the parallelogram spanned by the vectors ⟨4, 1⟩ and ⟨3, 3⟩.

First we want to define a map G as in the change of variables formula. Since P is a parallelogram, we can do this nicely with a linear map and can (for ease) take $D_0 = R$ to be the unit square, so that G : R → P.

To find a linear map of the form G(u, v) = (Au + Cv, Bu + Dv) that does this, recall that the parallelogram spanned by i and j is mapped to the parallelogram spanned by G(i) = ⟨A, B⟩ and G(j) = ⟨C, D⟩. So, we want G(1, 0) = (A, B) = (4, 1) and G(0, 1) = (C, D) = (3, 3), i.e.

G(u, v) = (4u + 3v, u + 3v).

Now let’s find the Jacobian of G using the formula for a linear map:

Jac(G) = AD − BC = 4 · 3 − 1 · 3 = 9.


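Finally, use the change of variables formula, substituting x(u, v) = 4u + 3v and y(u, v) = u + 3v:
\[
\begin{aligned}
\iint_P e^{4x - y}\, dx\, dy &= \iint_R e^{4(4u + 3v) - (u + 3v)}\, |\mathrm{Jac}(G)|\, du\, dv \\
&= \int_0^1 \int_0^1 9 e^{15u + 9v}\, du\, dv \\
&= \int_0^1 \int_0^1 9 e^{15u} e^{9v}\, du\, dv \\
&= 9 \left( \int_0^1 e^{15u}\, du \right) \left( \int_0^1 e^{9v}\, dv \right) \\
&= 9 \cdot \frac{1}{15} \left( e^{15} - 1 \right) \cdot \frac{1}{9} \left( e^{9} - 1 \right) = \frac{1}{15} \left( e^{15} - 1 \right) \left( e^{9} - 1 \right).
\end{aligned}
\]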
Example 23.9. Use the Change of Variables Formula and the map $G(u, v) = (uv^{-1}, uv)$ to calculate $\iint_D \left( x^2 + y^2 \right) dx\, dy$ where
\[ D = \left\{ (x, y) \mid 1 \le xy \le 4,\ 1 \le \frac{y}{x} \le 4 \right\}. \]
For this map G, we have $x = uv^{-1}$ and $y = uv$. Substituting these into the inequalities defining D (and taking u and v positive) we get:
\[ 1 \le xy \le 4 \implies 1 \le (uv^{-1})(uv) = u^2 \le 4 \implies 1 \le u \le 2 \]
and
\[ 1 \le \frac{y}{x} \le 4 \implies 1 \le \frac{uv}{uv^{-1}} = v^2 \le 4 \implies 1 \le v \le 2. \]
D is graphed below, where we have solved for y in the bounds 1 ≤ xy ≤ 4, along with the rectangle R = [1, 2] × [1, 2] in the uv-plane.

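So we have that G maps the rectangle R = [1, 2] × [1, 2] to the domain D. Next, find the Jacobian of G:
\[ \mathrm{Jac}(G) = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2mm] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{vmatrix} = \begin{vmatrix} v^{-1} & -uv^{-2} \\ v & u \end{vmatrix} = \frac{2u}{v}. \]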


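Finally, use the change of variables formula, substituting $x(u, v) = uv^{-1}$ and $y(u, v) = uv$:
\[
\begin{aligned}
\iint_D \left( x^2 + y^2 \right) dx\, dy &= \iint_R \left( \left( uv^{-1} \right)^2 + (uv)^2 \right) |\mathrm{Jac}(G)|\, du\, dv \\
&= \int_1^2 \int_1^2 \left( u^2 v^{-2} + u^2 v^2 \right) \frac{2u}{v}\, du\, dv \\
&= \int_1^2 \int_1^2 2u^3 \left( v^{-3} + v \right) du\, dv \\
&= 2 \left( \int_1^2 u^3\, du \right) \left( \int_1^2 \left( v^{-3} + v \right) dv \right) \\
&= 2 \left( \left[ \frac{u^4}{4} \right]_1^2 \right) \left( \left[ -\frac{1}{2} v^{-2} + \frac{v^2}{2} \right]_1^2 \right) \\
&= 2 \left( \frac{16}{4} - \frac{1}{4} \right) \left( -\frac{1}{8} + \frac{4}{2} + \frac{1}{2} - \frac{1}{2} \right) = \frac{225}{16}.
\end{aligned}
\]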
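One way to gain confidence in a change of variables is to compare the transformed integral with a brute-force estimate over the original domain. The sketch below does both for Example 23.9: a midpoint rule over R = [1, 2] × [1, 2] in the uv-plane, and a coarse grid sum over a bounding box in the xy-plane that keeps only the points of D with x > 0. The function names, the bounding box [0.5, 2] × [1, 4], and the grid sizes are illustrative assumptions.

```python
def uv_integral(n=200):
    """Midpoint estimate of the transformed integral over R = [1, 2] x [1, 2]."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = 1.0 + (i + 0.5) * h
        for j in range(n):
            v = 1.0 + (j + 0.5) * h
            x, y = u / v, u * v                       # the map G(u, v)
            total += (x * x + y * y) * abs(2 * u / v) * h * h
    return total

def xy_integral(n=600):
    """Grid estimate over the box [0.5, 2] x [1, 4], summing only cells whose
    midpoint satisfies 1 <= xy <= 4 and 1 <= y/x <= 4."""
    hx, hy = 1.5 / n, 3.0 / n
    total = 0.0
    for i in range(n):
        x = 0.5 + (i + 0.5) * hx
        for j in range(n):
            y = 1.0 + (j + 0.5) * hy
            if 1.0 <= x * y <= 4.0 and 1.0 <= y / x <= 4.0:
                total += (x * x + y * y) * hx * hy
    return total

from_uv = uv_integral()
from_xy = xy_integral()
print(from_uv, from_xy, 225 / 16)
```

The xy estimate is much cruder because the grid does not follow the curved boundary of D, which illustrates why mapping D to a rectangle is worth the effort.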

The change of variables formula has the same form for three (or more) variables as in two variables. Let
G : W0 → W be a map from a region W0 in (u, v, w)-space to a region W in (x, y, z)-space with components

x = x(u, v, w), y = y(u, v, w), z = z(u, v, w).

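The Jacobian of G is the 3 × 3 determinant
\[ \mathrm{Jac}(G) = \frac{\partial(x, y, z)}{\partial(u, v, w)} = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} & \dfrac{\partial x}{\partial w} \\[2mm] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} & \dfrac{\partial y}{\partial w} \\[2mm] \dfrac{\partial z}{\partial u} & \dfrac{\partial z}{\partial v} & \dfrac{\partial z}{\partial w} \end{vmatrix}. \]
The change of variables formula says that if x, y, and z have continuous partial derivatives and if f is continuous, then
\[ \iiint_W f(x, y, z)\, dV = \iiint_{W_0} f(x(u, v, w), y(u, v, w), z(u, v, w))\, |\mathrm{Jac}(G)|\, du\, dv\, dw. \]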

Example 23.10. Let’s use the above to verify the change of variables formulas for changing to cylindrical
coordinates. Recall the formula for changing to cylindrical coordinates was dV = r · dzdrdθ, so we want to
show that the cylindrical coordinates map has | Jac(G)| = r. The components of the cylindrical coordinates
map are
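x(r, θ, z) = r cos θ, y(r, θ, z) = r sin θ, and z(r, θ, z) = z.

So, expanding along the first row,
\[
\begin{aligned}
\mathrm{Jac}(G) &= \begin{vmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial \theta} & \dfrac{\partial x}{\partial z} \\[2mm] \dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial \theta} & \dfrac{\partial y}{\partial z} \\[2mm] \dfrac{\partial z}{\partial r} & \dfrac{\partial z}{\partial \theta} & \dfrac{\partial z}{\partial z} \end{vmatrix} = \begin{vmatrix} \cos\theta & -r\sin\theta & 0 \\ \sin\theta & r\cos\theta & 0 \\ 0 & 0 & 1 \end{vmatrix} \\
&= \cos\theta \begin{vmatrix} r\cos\theta & 0 \\ 0 & 1 \end{vmatrix} + r\sin\theta \begin{vmatrix} \sin\theta & 0 \\ 0 & 1 \end{vmatrix} + 0 \\
&= \cos\theta\, (r\cos\theta - 0) + r\sin\theta\, (\sin\theta) = r\cos^2\theta + r\sin^2\theta = r.
\end{aligned}
\]
Thus |Jac(G)| = r!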
You should verify the change of variables formula for changing to spherical coordinates as an exercise.
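A numerical spot check can complement (but not replace) the hand computation in this exercise. The sketch below approximates the 3 × 3 Jacobian determinant with central differences at one sample point, for both the cylindrical and the spherical coordinate maps. The function names, the step size, and the sample points are hypothetical choices; the spherical variables are ordered (ρ, φ, θ).

```python
import math

def cylindrical(r, theta, z):
    return (r * math.cos(theta), r * math.sin(theta), z)

def spherical(rho, phi, theta):
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))

def numerical_jacobian(G, p, h=1e-6):
    """Determinant of the 3x3 matrix of partial derivatives of G at p,
    with each partial derivative estimated by a central difference."""
    m = []
    for k in range(3):
        plus, minus = list(p), list(p)
        plus[k] += h
        minus[k] -= h
        gp, gm = G(*plus), G(*minus)
        # m[k][i] approximates d(G_i)/d(p_k); transposing does not change the determinant.
        m.append([(a - b) / (2 * h) for a, b in zip(gp, gm)])
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

r, theta, z = 1.5, 0.4, -2.0        # sample cylindrical point (hypothetical)
rho, phi, th = 2.0, 0.7, 1.1        # sample spherical point (hypothetical)

jac_cyl = numerical_jacobian(cylindrical, (r, theta, z))
jac_sph = numerical_jacobian(spherical, (rho, phi, th))
print(abs(jac_cyl), r)
print(abs(jac_sph), rho ** 2 * math.sin(phi))
```

Each printed pair should agree closely, matching dV = r dz dr dθ and dV = ρ² sin φ dρ dφ dθ at the chosen points.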


24 Vector Fields (16.1)


Recall from section 2 that a vector-valued function is a function whose output is a vector. E.g.,

r(t) = ⟨t, 2 sin(t)⟩, c(θ) = ⟨4 sin θ, 4 cos θ, θ + 1⟩, etc.

In this section we define a new kind of vector-valued function called a vector field.
Definition 24.1 (Vector Field). A vector field in the plane is a function F(x, y) = ⟨F1(x, y), F2(x, y)⟩ which assigns to each point (x, y) in the plane the vector F(x, y).

A vector field in $\mathbb{R}^3$ is a function F(x, y, z) = ⟨F1(x, y, z), F2(x, y, z), F3(x, y, z)⟩ which assigns to each point (x, y, z) in $\mathbb{R}^3$ the vector F(x, y, z).

In general, a vector field in $\mathbb{R}^n$ is a function F(x1, . . . , xn) = ⟨F1(x1, . . . , xn), . . . , Fn(x1, . . . , xn)⟩ which assigns to each point (x1, . . . , xn) ∈ $\mathbb{R}^n$ the vector F(x1, . . . , xn).

We will restrict our attention to vector fields on R2 and R3 , which nicely model the real world, where a large
number of objects/molecules are moving in space.

When drawing a vector field, we draw the vector F(P) as a vector based at the point P, rather than the origin. Create your own examples of vector fields on $\mathbb{R}^2$ here and on $\mathbb{R}^3$ here.

Example 24.2. Find the vector corresponding to the point P = (2, 1, 0) for the vector field $\mathbf{F}(x, y, z) = \langle xy, -z\sqrt{x}, 2 \rangle$.

The vector is $\mathbf{F}(P) = \langle (2)(1), -(0)\sqrt{2}, 2 \rangle = \langle 2, 0, 2 \rangle$. Graph this vector field using the links above and identify this vector.

Definition 24.3 (Constant, Unit, and Radial Vector Fields). A constant vector field is a vector field that assigns the same vector to every point, i.e. F(P) = c for some constant vector c and all points P.

A unit vector field is a vector field where ‖F(P)‖ = 1 for all points P, i.e. F(P) is a unit vector for all points P.

A radial vector field is a vector field where F(P) depends only on the distance r from P to the origin and is parallel to $\overrightarrow{OP}$.

Create an example of each of the above types of vector fields and graph them using the links above.

The following definition is of two important examples of vector fields.

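Definition 24.4 (The Unit Radial Vector Fields). The unit radial vector fields in $\mathbb{R}^2$ and $\mathbb{R}^3$ are, respectively,
\[ \mathbf{e}_r = \left\langle \frac{x}{r}, \frac{y}{r} \right\rangle = \left\langle \frac{x}{\sqrt{x^2 + y^2}}, \frac{y}{\sqrt{x^2 + y^2}} \right\rangle \]
and
\[ \mathbf{e}_r = \left\langle \frac{x}{r}, \frac{y}{r}, \frac{z}{r} \right\rangle = \left\langle \frac{x}{\sqrt{x^2 + y^2 + z^2}}, \frac{y}{\sqrt{x^2 + y^2 + z^2}}, \frac{z}{\sqrt{x^2 + y^2 + z^2}} \right\rangle. \]
Note that in both cases $\mathbf{e}_r(P)$ is the unit vector based at the point P pointing away from the origin, except at P = the origin, where $\mathbf{e}_r$ is undefined.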


We often write the derivative of a function f(x) as f′(x), but we also write this using Leibniz notation:
\[ f'(x) = \frac{df}{dx} = \frac{d}{dx} f. \]
Similarly, we write the partial derivative of f with respect to a variable t as
\[ f_t = \frac{\partial f}{\partial t} = \frac{\partial}{\partial t} f. \]
In writing these derivatives this way, it’s easier to view $\dfrac{d}{dx}$ and $\dfrac{\partial}{\partial t}$ as operators that say “take the derivative with respect to x” and “take the partial derivative with respect to t.”

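Similarly, recall the gradient of a function f(x, y, z), which is derivative-like, is
\[ \nabla f = \left\langle \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right\rangle, \]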

and so we can view ∇ as an operator that says “take the gradient” when applied to a function f (x, y, z). We
will use this operator, called del or nabla, to define two derivative-like operators on vector fields - one that
is scalar-valued and one that is vector-valued.
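Definition 24.5 (The del operator). The del operators in $\mathbb{R}^2$ and $\mathbb{R}^3$, respectively, are
\[ \nabla = \left\langle \frac{\partial}{\partial x}, \frac{\partial}{\partial y} \right\rangle \]
and
\[ \nabla = \left\langle \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right\rangle. \]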
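Definition 24.6 (Divergence of a Vector Field). The divergence of a vector field F = ⟨F1, F2⟩ is the scalar-valued function defined by
\[
\begin{aligned}
\operatorname{div}(\mathbf{F}) &= \nabla \cdot \mathbf{F} \\
&= \left\langle \frac{\partial}{\partial x}, \frac{\partial}{\partial y} \right\rangle \cdot \langle F_1, F_2 \rangle \\
&= \frac{\partial F_1}{\partial x} + \frac{\partial F_2}{\partial y}.
\end{aligned}
\]
Similarly, the divergence of a vector field F = ⟨F1, F2, F3⟩ is the scalar-valued function defined by
\[
\begin{aligned}
\operatorname{div}(\mathbf{F}) &= \nabla \cdot \mathbf{F} \\
&= \left\langle \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right\rangle \cdot \langle F_1, F_2, F_3 \rangle \\
&= \frac{\partial F_1}{\partial x} + \frac{\partial F_2}{\partial y} + \frac{\partial F_3}{\partial z}.
\end{aligned}
\]
The divergence operator obeys the same linearity rules as derivatives, i.e.

div(F + G) = div(F) + div(G)

div(cF) = c div(F) for any constant c.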

The divergence of a vector field will play an important role later on in our generalizations of the fundamental theorem of calculus. For now, let’s get our hands dirty with some computations.


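Example 24.7. Evaluate the divergence of $\mathbf{F} = \langle \sin x, e^{4xz}, xy^2 z^3 \rangle$ at P = (π, 3, −1).
\[
\begin{aligned}
\operatorname{div}(\mathbf{F}) &= \nabla \cdot \mathbf{F} \\
&= \left\langle \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right\rangle \cdot \left\langle \sin x, e^{4xz}, xy^2 z^3 \right\rangle \\
&= \frac{\partial}{\partial x} (\sin x) + \frac{\partial}{\partial y} \left( e^{4xz} \right) + \frac{\partial}{\partial z} \left( xy^2 z^3 \right) \\
&= \cos x + 0 + 3xy^2 z^2 = \cos x + 3xy^2 z^2.
\end{aligned}
\]
Hence,
\[ \operatorname{div}(\mathbf{F})(P) = \cos \pi + 3(\pi)(3^2)((-1)^2) = -1 + 27\pi. \]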
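The value in Example 24.7 can be sanity-checked by approximating each partial derivative with a central difference. The helper name `divergence` and the step size h below are illustrative assumptions, not part of the notes.

```python
import math

def F(x, y, z):
    """The vector field from Example 24.7."""
    return (math.sin(x), math.exp(4 * x * z), x * y ** 2 * z ** 3)

def divergence(F, p, h=1e-5):
    """Central-difference estimate of dF1/dx + dF2/dy + dF3/dz at the point p."""
    total = 0.0
    for k in range(3):
        plus, minus = list(p), list(p)
        plus[k] += h
        minus[k] -= h
        # Only the k-th component of F is differentiated with respect to the k-th variable.
        total += (F(*plus)[k] - F(*minus)[k]) / (2 * h)
    return total

P = (math.pi, 3.0, -1.0)
approx = divergence(F, P)
exact = -1 + 27 * math.pi
print(approx, exact)
```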
The divergence of a vector field has physical significance. Consider fluid motion (think: gas moving around a
room) with velocity modeled by a vector field F. We can interpret the divergence physically in the following
way:
• if div(F)(P) > 0, the fluid is flowing away from the point P (think: heating a gas); such a point is called a source

• if div(F)(P) < 0, the fluid is flowing towards the point P (think: cooling a gas); such a point is called a sink

• if div(F)(P) = 0, the fluid is neither expanding nor compressing near P.

If a vector field F has no sources or sinks, i.e. div(F) = 0, we say F is incompressible.

Example 24.8. Determine if the following vector fields are incompressible. If not, find a source or a sink.
Use the link above to view the vector fields.

(a) F = ⟨y, cos(x)⟩.

As div(F) = 0 + 0 = 0, F is an incompressible vector field.

(b) G = ⟨−x, −y, −z⟩

As div(G) = −1 − 1 − 1 = −3, G is not incompressible and every point in $\mathbb{R}^3$ is a sink.

(c) H = ⟨2x, y, 1⟩

As div(H) = 2 + 1 + 0 = 3, H is not incompressible and every point in $\mathbb{R}^3$ is a source.

(d) J = ⟨−cos(x), −0.5y⟩

As div(J) = sin(x) − 0.5, J is not incompressible. There are infinitely many sinks, sources, and points that are neither sinks nor sources. For example, div(J)(−π/2, 0) = −1 − 0.5 = −1.5, so (−π/2, 0) is a sink, while div(J)(π/2, 0) = 1 − 0.5 = 0.5, so (π/2, 0) is a source, and div(J)(π/6, 0) = 0.5 − 0.5 = 0, so (π/6, 0) is neither a source nor a sink.


While the divergence measures whether a vector field is flowing into or out of a point, it doesn’t measure the
direction in which this is happening. This is evident in that divergence is a scalar and not a vector! The
next operation we will consider on vector fields is vector-valued.
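Definition 24.9 (Curl of a Vector Field). The curl of a vector field $\mathbf{F} = \langle F_1, F_2, F_3 \rangle$ is
$$\operatorname{curl}(\mathbf{F}) = \nabla \times \mathbf{F} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ F_1 & F_2 & F_3 \end{vmatrix}$$
$$= \left( \frac{\partial F_3}{\partial y} - \frac{\partial F_2}{\partial z} \right) \mathbf{i} - \left( \frac{\partial F_3}{\partial x} - \frac{\partial F_1}{\partial z} \right) \mathbf{j} + \left( \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right) \mathbf{k}$$
$$= \left\langle \frac{\partial F_3}{\partial y} - \frac{\partial F_2}{\partial z}, \; \frac{\partial F_1}{\partial z} - \frac{\partial F_3}{\partial x}, \; \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right\rangle.$$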
The curl of F measures how F is rotating through a point.
As with the divergence, the curl obeys the same linearity rules as derivatives, i.e.
curl(F + G) = curl(F) + curl(G)
curl(cF) = c curl(F) for any constant c.
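Example 24.10. Calculate the curl of $\mathbf{F} = \left\langle xyz, e^{x^2 y}, y\cos(z) \right\rangle$.

$$\operatorname{curl}(\mathbf{F}) = \nabla \times \mathbf{F} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ xyz & e^{x^2 y} & y\cos(z) \end{vmatrix}$$
$$= \left\langle \frac{\partial}{\partial y}(y\cos(z)) - \frac{\partial}{\partial z}\left(e^{x^2 y}\right), \; \frac{\partial}{\partial z}(xyz) - \frac{\partial}{\partial x}(y\cos(z)), \; \frac{\partial}{\partial x}\left(e^{x^2 y}\right) - \frac{\partial}{\partial y}(xyz) \right\rangle$$
$$= \left\langle \cos(z) - 0, \; xy - 0, \; 2xye^{x^2 y} - xz \right\rangle = \left\langle \cos(z), \; xy, \; 2xye^{x^2 y} - xz \right\rangle.$$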
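
The curl formula from Definition 24.9 can be checked against the answer of Example 24.10 at a sample point. This is a minimal numerical sketch rather than anything from the original notes: the helpers `partial` and `curl`, the sample point, and the step size are all assumptions made for illustration.

```python
import math

def F(x, y, z):
    # The field from Example 24.10: <xyz, e^(x^2 y), y cos z>
    return (x * y * z, math.exp(x * x * y), y * math.cos(z))

def partial(field, comp, var, p, h=1e-5):
    """Central-difference estimate of d(field[comp]) / d(variable number var) at p."""
    fwd, bwd = list(p), list(p)
    fwd[var] += h
    bwd[var] -= h
    return (field(*fwd)[comp] - field(*bwd)[comp]) / (2 * h)

def curl(field, p):
    # <dF3/dy - dF2/dz, dF1/dz - dF3/dx, dF2/dx - dF1/dy>
    return (
        partial(field, 2, 1, p) - partial(field, 1, 2, p),
        partial(field, 0, 2, p) - partial(field, 2, 0, p),
        partial(field, 1, 0, p) - partial(field, 0, 1, p),
    )

def curl_by_hand(x, y, z):
    # The result derived in Example 24.10
    return (math.cos(z), x * y, 2 * x * y * math.exp(x * x * y) - x * z)

p = (0.5, 1.2, -0.7)  # arbitrary sample point
numeric = curl(F, p)
by_hand = curl_by_hand(*p)
print(numeric)
print(by_hand)
```

The two printed triples should match up to the finite-difference error.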
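We have now seen the “del” operator ∇ used in three ways:
• ∇ applied to a scalar function $f(x, y)$ or $f(x, y, z)$ gives the gradient,
$$\nabla f(x, y) = \left\langle \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right\rangle \quad \text{and} \quad \nabla f(x, y, z) = \left\langle \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right\rangle,$$
• ∇ applied to a vector field $\mathbf{F} = \langle F_1, F_2 \rangle$ or $\mathbf{F} = \langle F_1, F_2, F_3 \rangle$ using the dot product gives the divergence,
$$\operatorname{div}(\mathbf{F}) = \nabla \cdot \mathbf{F} = \frac{\partial F_1}{\partial x} + \frac{\partial F_2}{\partial y} \quad \text{and} \quad \operatorname{div}(\mathbf{F}) = \nabla \cdot \mathbf{F} = \frac{\partial F_1}{\partial x} + \frac{\partial F_2}{\partial y} + \frac{\partial F_3}{\partial z},$$
• and ∇ applied to a vector field $\mathbf{F} = \langle F_1, F_2, F_3 \rangle$ using the cross product gives the curl,
$$\operatorname{curl}(\mathbf{F}) = \nabla \times \mathbf{F} = \left\langle \frac{\partial F_3}{\partial y} - \frac{\partial F_2}{\partial z}, \; \frac{\partial F_1}{\partial z} - \frac{\partial F_3}{\partial x}, \; \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right\rangle.$$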
The gradient, divergence, and curl are all intimately related, as we will see later on. As a highlight of this,
you will compute in the homework the following identities:
curl (∇f ) = 0
and
div (curl(F)) = 0.


25 Line Integrals (16.2)


In this section we introduce integrals over curves, called line integrals, of both scalar-valued functions and vector fields.

We will start by defining the scalar line integral over a space curve $C$: $\int_C f \, ds$.

As with all integrals, line integrals are defined using Riemann sums. Consider a curve $C$. Divide $C$ into arcs $C_1, \dots, C_N$, each of which has length $\Delta s_i$, and choose a sample point $P_i$ in each arc $C_i$. Then form the Riemann sum:
$$\sum_{i=1}^{N} f(P_i)\,\operatorname{length}(C_i) = \sum_{i=1}^{N} f(P_i)\,\Delta s_i.$$

Definition 25.1 (Line Integral). The line integral of $f(x, y, z)$ over a space curve $C$ is
$$\int_C f \, ds = \lim_{\{\Delta s_i\} \to 0} \sum_{i=1}^{N} f(P_i)\,\Delta s_i,$$
if this limit exists. This definition is valid for functions of both two and three variables, $f(x, y)$ and $f(x, y, z)$.
Considering the function $f = 1$, we can see that the line integral of $f = 1$ over a curve $C$ is the length of the curve:
$$\int_C 1 \, ds = \lim_{\{\Delta s_i\} \to 0} \sum_{i=1}^{N} \Delta s_i = \operatorname{length}(C).$$
The way we will compute scalar line integrals in practice is by parametrizing the curve $C$.

Suppose that $C$ is parametrized by a continuously differentiable parametrization $\mathbf{r}(t)$ for $a \le t \le b$ (recall that continuously differentiable means $\mathbf{r}'(t)$ exists and is continuous). We then partition $[a, b]$ into $N$ sub-intervals via
$$a = t_0 < t_1 < \cdots < t_{N-1} < t_N = b,$$
which corresponds to a partition $C_1, \dots, C_N$ of $C$ where $C_i$ is parametrized by $\mathbf{r}(t)$ for $t_{i-1} \le t \le t_i$, and so we can pick sample points $P_i = \mathbf{r}(t_i^*)$ in each $C_i$.

Now, we want to construct a Riemann sum as before, using $f(P_i)$ and the length of $C_i$. To do so, recall the arc length formula:
$$\Delta s_i = \operatorname{length}(C_i) = \int_{t_{i-1}}^{t_i} \|\mathbf{r}'(t)\| \, dt.$$
Then as $\mathbf{r}'(t)$ is continuous, $\|\mathbf{r}'(t)\|$ is approximately constant on the small interval $[t_{i-1}, t_i]$, and so we can approximate $\Delta s_i$ using the sample points:
$$\int_{t_{i-1}}^{t_i} \|\mathbf{r}'(t)\| \, dt \approx \|\mathbf{r}'(t_i^*)\| \, \Delta t_i.$$


This gives us the following approximation:

$$\sum_{i=1}^{N} f(P_i)\,\Delta s_i \approx \sum_{i=1}^{N} f(\mathbf{r}(t_i^*))\,\|\mathbf{r}'(t_i^*)\|\,\Delta t_i.$$

Taking the limit as $\Delta s_i$ and $\Delta t_i \to 0$ gives us the following theorem.

Theorem 25.2 (Computing a Scalar Line Integral). Let $\mathbf{r}(t)$ be a continuously differentiable parametrization of a curve $C$ for $a \le t \le b$. If $f(x, y, z)$ is continuous, then
$$\int_C f(x, y, z)\, ds = \int_a^b f(\mathbf{r}(t))\,\|\mathbf{r}'(t)\|\, dt.$$

The theorem also applies to continuous functions $f(x, y)$.

Notice that letting $f(x, y, z) = 1$, we recover the arc length formula:
$$\operatorname{length}(C) = \int_C 1 \, ds = \int_a^b \|\mathbf{r}'(t)\|\, dt.$$

The previous theorem boils down to: when replacing $f(x, y, z)$ over $C$ with $f(\mathbf{r}(t))$ over $[a, b]$, we get the line element or arc length differential
$$ds = \|\mathbf{r}'(t)\|\,dt.$$
To justify this notation, recall the arc length function: $s(t) = \int_a^t \|\mathbf{r}'(u)\|\,du$.
By the fundamental theorem of calculus, we have that $\frac{ds}{dt} = \|\mathbf{r}'(t)\|$, and we get the above by solving for $ds$.

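Example 25.3. Compute $\int_C \frac{y^2}{x^3}\, ds$ for the curve $C$ given by $y = x^2$ for $1 \le x \le 2$.

The curve $C$ is parametrized by $\mathbf{r}(t) = \langle t, t^2 \rangle$ for $1 \le t \le 2$. To find the line integral, we need to find $ds = \|\mathbf{r}'(t)\|\,dt$:
$$\mathbf{r}'(t) = \langle 1, 2t \rangle, \qquad \|\mathbf{r}'(t)\| = \sqrt{1 + 4t^2}, \qquad ds = \sqrt{1 + 4t^2}\, dt.$$
Then,
$$\int_C \frac{y^2}{x^3}\, ds = \int_1^2 \frac{(t^2)^2}{t^3} \sqrt{1 + 4t^2}\, dt = \int_1^2 t \sqrt{1 + 4t^2}\, dt.$$

To compute this integral use u-substitution with $u = 1 + 4t^2$, so that $du = 8t\,dt$:
$$\int_{t=1}^{t=2} \sqrt{u}\, \frac{du}{8} = \left. \frac{2}{24} u^{3/2} \right|_{t=1}^{t=2} = \left. \frac{1}{12} (1 + 4t^2)^{3/2} \right|_{t=1}^{t=2} = \frac{1}{12}\left( 17^{3/2} - 5^{3/2} \right).$$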


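Example 25.4. Compute $\int_C (x + y + z)\, ds$ where $C$ is the helix $\mathbf{r}(t) = \langle \cos t, \sin t, t \rangle$ for $0 \le t \le \pi$.

The helix given by $\mathbf{r}(t)$ is graphed to the right; $C$ is the part from $t = 0$ to $t = \pi$.

To compute this line integral we need to find $ds = \|\mathbf{r}'(t)\|\,dt$:
$$\mathbf{r}'(t) = \langle -\sin t, \cos t, 1 \rangle, \qquad \|\mathbf{r}'(t)\| = \sqrt{(-\sin t)^2 + (\cos t)^2 + (1)^2} = \sqrt{2}, \qquad ds = \sqrt{2}\,dt.$$
Then,
$$\int_C (x + y + z)\, ds = \int_0^{\pi} f(\mathbf{r}(t))\, \|\mathbf{r}'(t)\|\, dt = \int_0^{\pi} (\cos t + \sin t + t) \sqrt{2}\, dt = \sqrt{2} \left[ \sin t - \cos t + \frac{1}{2} t^2 \right]_0^{\pi} = \sqrt{2} \left( 2 + \frac{\pi^2}{2} \right).$$

Length of the helix? Note that we can find the length of $C$ by integrating the function 1:
$$\operatorname{length}(C) = \int_C 1 \, ds = \int_0^{\pi} \sqrt{2}\, dt = \sqrt{2}\,\pi.$$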
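
Theorem 25.2 turns a scalar line integral into an ordinary integral in $t$, which is easy to approximate numerically. The sketch below is illustrative only and not part of the original notes: the function name `scalar_line_integral`, the midpoint rule, and the number of subintervals are assumptions. It re-computes Examples 25.3 and 25.4.

```python
import math

def scalar_line_integral(f, r, r_prime, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f(r(t)) * ||r'(t)|| over [a, b]."""
    dt = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * dt
        speed = math.sqrt(sum(c * c for c in r_prime(t)))  # ||r'(t)||
        total += f(*r(t)) * speed * dt
    return total

# Example 25.3: f = y^2 / x^3 along r(t) = <t, t^2>, 1 <= t <= 2
parabola = scalar_line_integral(
    lambda x, y: y**2 / x**3,
    lambda t: (t, t**2),
    lambda t: (1.0, 2 * t),
    1.0, 2.0,
)
parabola_by_hand = (17**1.5 - 5**1.5) / 12

# Example 25.4: f = x + y + z along the helix, 0 <= t <= pi
helix = scalar_line_integral(
    lambda x, y, z: x + y + z,
    lambda t: (math.cos(t), math.sin(t), t),
    lambda t: (-math.sin(t), math.cos(t), 1.0),
    0.0, math.pi,
)
helix_by_hand = math.sqrt(2) * (2 + math.pi**2 / 2)

print(parabola, parabola_by_hand)
print(helix, helix_by_hand)
```

Passing `lambda x, y, z: 1.0` as `f` for the helix approximates its length, $\sqrt{2}\,\pi$.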

Recall from last section that a vector field is a function that assigns a vector to each point in $\mathbb{R}^2$ or $\mathbb{R}^3$.
So, given a plane or space curve we can consider the accumulation of a vector field over the curve, i.e. the
line integral of a vector field.

For example: imagine (if you can) that you are on the first floor of Sid Rich with your book bag and have a
class on the 2nd or 3rd floor. As you carry your bag up the stairs and to your classroom, you are working
against gravity. The work you output to do this can be computed as a line integral of the vector field given
by gravity over the path you take.

One important difference between scalar and vector line integrals is that
vector line integrals depend on the direction along the curve.
With the previous example in mind, it’s not hard to imagine that it takes more work to carry your books up
the stairs than it does to carry your books along the same path down the stairs.
Definition 25.5 (Oriented Curve). An oriented curve C is the curve C with a specified direction, called
an orientation. We refer to this specified direction as the positive direction, with the opposite direction
being the negative direction.


To define the line integral of a vector field $\mathbf{F}$ along an oriented curve $C$, we first define the tangential component of $\mathbf{F}$, which encodes how much $\mathbf{F}$ is pointing in the direction of $C$.

Let $C$ be a piecewise smooth oriented curve given by a regular parametrization $\mathbf{r}(t)$ for $a \le t \le b$, so that $\mathbf{r}(t)$ traces out $C$ in the positive direction. We call this a positively oriented parametrization $\mathbf{r}(t)$. Recall the unit tangent vector at a point $P = \mathbf{r}(t)$ is given by
$$\mathbf{T}(P) = \frac{\mathbf{r}'(t)}{\|\mathbf{r}'(t)\|}.$$
Definition 25.6 (Tangential Component). The tangential component of $\mathbf{F}$ at $P$ is the scalar given by
$$\mathbf{F}(P) \cdot \mathbf{T}(P) = \|\mathbf{F}(P)\|\,\|\mathbf{T}(P)\| \cos\theta = \|\mathbf{F}(P)\| \cos\theta,$$
where $\theta$ is the angle between $\mathbf{F}(P)$ and $\mathbf{T}(P)$.

We can then form the scalar-valued function $\mathbf{F} \cdot \mathbf{T}$ for every point $P$ on $C$.

The vector line integral of $\mathbf{F}$ is then defined as the scalar line integral of $\mathbf{F} \cdot \mathbf{T}$.

Before defining a vector line integral, let’s introduce the notation. The line integral of a vector field $\mathbf{F} = \langle F_1, F_2, F_3 \rangle$ along an oriented curve $C$ is denoted
$$\int_C \mathbf{F} \cdot d\mathbf{r},$$
where $d\mathbf{r} = \langle dx, dy, dz \rangle$ is the vector line element or vector differential. Another way to denote this is by writing out the dot product:
$$\int_C \mathbf{F} \cdot d\mathbf{r} = \int_C F_1 \, dx + F_2 \, dy + F_3 \, dz.$$

Definition 25.7 (Vector Line Integral). The line integral of a vector field $\mathbf{F}$ along an oriented curve $C$ is the integral of the tangential component of $\mathbf{F}$:
$$\int_C \mathbf{F} \cdot d\mathbf{r} = \int_C (\mathbf{F} \cdot \mathbf{T}) \, ds.$$

To compute this in practice, we make the following calculation. Let $\mathbf{r}(t)$ be a positively oriented regular parametrization of $C$ for $a \le t \le b$. Recall that $ds = \|\mathbf{r}'(t)\|\,dt$. Then
$$(\mathbf{F} \cdot \mathbf{T}) \, ds = \left( \mathbf{F}(\mathbf{r}(t)) \cdot \frac{\mathbf{r}'(t)}{\|\mathbf{r}'(t)\|} \right) \|\mathbf{r}'(t)\| \, dt = \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t) \, dt.$$
This gives the following theorem.
Theorem 25.8 (Computing a Vector Line Integral). If $\mathbf{r}(t)$ is a positively oriented regular parametrization of $C$ for $a \le t \le b$, then
$$\int_C \mathbf{F} \cdot d\mathbf{r} = \int_a^b \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t) \, dt.$$


The previous theorem boils down to: when replacing $\mathbf{F}$ over $C$ with $\mathbf{F}(\mathbf{r}(t))$ over $[a, b]$, we get the vector line element or vector differential
$$d\mathbf{r} = \mathbf{r}'(t)\,dt.$$
Note that when doing this the dot product is taken, so that we get a scalar-valued function of $t$ as the integrand.

In alternate notation, if $\mathbf{r}(t) = \langle x(t), y(t), z(t) \rangle$, then the previous theorem can be written as
$$\int_C F_1 \, dx + F_2 \, dy + F_3 \, dz = \int_a^b \left( F_1(\mathbf{r}(t)) \frac{dx}{dt} + F_2(\mathbf{r}(t)) \frac{dy}{dt} + F_3(\mathbf{r}(t)) \frac{dz}{dt} \right) dt.$$
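Example 25.9. Evaluate $\int_C \mathbf{F} \cdot d\mathbf{r}$ where $\mathbf{F} = \langle z, y^2, x \rangle$ and $C$ is parametrized in the positive direction by $\mathbf{r}(t) = \langle 1 + t, e^t, t^2 \rangle$ for $0 \le t \le 2$.

First, let’s find $\mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t)$:
$$\mathbf{F}(\mathbf{r}(t)) = \langle t^2, e^{2t}, 1 + t \rangle \quad \text{and} \quad \mathbf{r}'(t) = \langle 1, e^t, 2t \rangle,$$
$$\mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t) = \langle t^2, e^{2t}, 1 + t \rangle \cdot \langle 1, e^t, 2t \rangle = 3t^2 + e^{3t} + 2t.$$

Then the line integral is
$$\int_C \mathbf{F} \cdot d\mathbf{r} = \int_0^2 \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t) \, dt = \int_0^2 \left( 3t^2 + e^{3t} + 2t \right) dt = \left[ t^3 + \frac{e^{3t}}{3} + t^2 \right]_0^2$$
$$= \left( 8 + \frac{e^6}{3} + 4 \right) - \left( 0 + \frac{1}{3} + 0 \right) = \frac{e^6 + 35}{3}.$$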
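
Theorem 25.8 can be mirrored almost line for line in code. The following is a minimal sketch under stated assumptions (the name `vector_line_integral`, the midpoint rule, and the subdivision count are illustrative and not taken from the notes); it re-computes Example 25.9 numerically.

```python
import math

def vector_line_integral(F, r, r_prime, a, b, n=20000):
    """Midpoint-rule approximation of the integral of F(r(t)) . r'(t) over [a, b]."""
    dt = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * dt
        field = F(*r(t))       # F evaluated on the curve
        velocity = r_prime(t)  # r'(t)
        total += sum(fc * vc for fc, vc in zip(field, velocity)) * dt
    return total

# Example 25.9: F = <z, y^2, x>, r(t) = <1 + t, e^t, t^2>, 0 <= t <= 2
approx = vector_line_integral(
    lambda x, y, z: (z, y**2, x),
    lambda t: (1 + t, math.exp(t), t**2),
    lambda t: (1.0, math.exp(t), 2 * t),
    0.0, 2.0,
)
by_hand = (math.exp(6) + 35) / 3
print(approx, by_hand)
```

Note that the function takes the parametrization and its derivative separately, so the orientation of the curve is encoded in the direction in which `r` traces it.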

As a vector line integral is a scalar line integral of the tangential component $\mathbf{F} \cdot \mathbf{T} = \|\mathbf{F}(P)\| \cos\theta$, where $\theta$ is the angle between $\mathbf{F}$ and $\mathbf{T}$, we can read off something about the sign and magnitude of a vector line integral from the graph.

Example 25.10. Consider the line integral of $\mathbf{F} = \langle 2y, -3 \rangle$ around the ellipse below.

Along the top part of the ellipse, the angles $\theta$ between $\mathbf{F}$ and $\mathbf{T}$ along the path are mostly obtuse, so $\mathbf{F} \cdot \mathbf{T} \le 0$ and the line integral is negative. Here traversing the ellipse is working against the vector field.

Along the bottom part of the ellipse, the angles $\theta$ between $\mathbf{F}$ and $\mathbf{T}$ along the path are mostly acute, so $\mathbf{F} \cdot \mathbf{T} \ge 0$ and the line integral is positive. Here traversing the ellipse is working with the vector field.

We can then guess that the line integral around the entire ellipse is negative since $\|\mathbf{F}\|$ (the length of the arrows) is larger in the top half than in the bottom half. Let’s check this by integrating.


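Let $C$ be the ellipse from before, graphed to the right.

Then $C$ with counterclockwise orientation is given by $\mathbf{r}(\theta) = \langle 5 + 4\cos\theta, 3 + 2\sin\theta \rangle$ for $0 \le \theta \le 2\pi$. Recall that $\mathbf{F} = \langle 2y, -3 \rangle$. We want to show that
$$\int_C F_1 \, dx + F_2 \, dy = \int_C 2y \, dx - 3 \, dy < 0.$$

As $x(\theta) = 5 + 4\cos\theta$ and $y(\theta) = 3 + 2\sin\theta$, we have
$$\frac{dx}{d\theta} = -4\sin\theta \quad \text{and} \quad \frac{dy}{d\theta} = 2\cos\theta.$$

Then
$$\int_C 2y \, dx - 3 \, dy = \int_0^{2\pi} \left( 2y \frac{dx}{d\theta} - 3 \frac{dy}{d\theta} \right) d\theta = \int_0^{2\pi} \left( 2(3 + 2\sin\theta)(-4\sin\theta) - 3(2\cos\theta) \right) d\theta$$
$$= \int_0^{2\pi} \left( -24\sin\theta - 16\sin^2\theta - 6\cos\theta \right) d\theta.$$
Now recall that $\int_0^{2\pi} \sin\theta \, d\theta = \int_0^{2\pi} \cos\theta \, d\theta = 0$ and $\int_0^{2\pi} \sin^2\theta \, d\theta = \pi$, so that
$$\int_C 2y \, dx - 3 \, dy = -16\pi.$$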
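
The sign reasoning in Example 25.10 can also be tested numerically by integrating over the top half and the bottom half of the ellipse separately. This is only a sketch, not part of the original notes; the helper `line_integral_2d` and the split at $\theta = \pi$ are assumptions for illustration.

```python
import math

def line_integral_2d(F, r, r_prime, a, b, n=20000):
    """Midpoint-rule approximation of the integral of F(r(t)) . r'(t) over [a, b]."""
    dt = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * dt
        fx, fy = F(*r(t))
        dx, dy = r_prime(t)
        total += (fx * dx + fy * dy) * dt
    return total

F = lambda x, y: (2 * y, -3.0)
r = lambda th: (5 + 4 * math.cos(th), 3 + 2 * math.sin(th))
r_prime = lambda th: (-4 * math.sin(th), 2 * math.cos(th))

top = line_integral_2d(F, r, r_prime, 0.0, math.pi)             # upper half, y >= 3
bottom = line_integral_2d(F, r, r_prime, math.pi, 2 * math.pi)  # lower half, y <= 3
total = top + bottom
print(top, bottom, total, -16 * math.pi)
```

The top half contributes a negative value, the bottom half a smaller positive one, and their sum should agree with $-16\pi$.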

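Vector line integrals have several nice properties. In particular, they have the same linearity properties as scalar integrals:
$$\int_C (\mathbf{F} + \mathbf{G}) \cdot d\mathbf{r} = \int_C \mathbf{F} \cdot d\mathbf{r} + \int_C \mathbf{G} \cdot d\mathbf{r}$$
and
$$\int_C c\mathbf{F} \cdot d\mathbf{r} = c \int_C \mathbf{F} \cdot d\mathbf{r}$$
for any smooth oriented curve $C$, vector fields $\mathbf{F}$ and $\mathbf{G}$, and constant $c$.

We also have that if $-C$ is the oriented curve with the opposite orientation as $C$, then
$$\int_{-C} \mathbf{F} \cdot d\mathbf{r} = -\int_C \mathbf{F} \cdot d\mathbf{r}.$$
This is because the unit tangent vector changes from $\mathbf{T}$ to $-\mathbf{T}$ when the orientation of $C$ changes. See the figure to the right.

Another property of line integrals is additivity. That is, if $C$ is a piecewise smooth curve, i.e. $C$ is a union of $n$ smooth curves $C_1 \cup \cdots \cup C_n$, then
$$\int_C \mathbf{F} \cdot d\mathbf{r} = \int_{C_1} \mathbf{F} \cdot d\mathbf{r} + \cdots + \int_{C_n} \mathbf{F} \cdot d\mathbf{r}.$$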


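Example 25.11. Compute $\int_C \mathbf{F} \cdot d\mathbf{r}$ where $\mathbf{F} = \langle e^z, e^y, x + y \rangle$ and $C$ is the triangle with vertices $(1, 0, 0)$, $(0, 1, 0)$, and $(0, 0, 1)$, oriented with direction from $(1, 0, 0)$ to $(0, 1, 0)$.

The triangle with vertices $A = (1, 0, 0)$, $B = (0, 1, 0)$, and $C = (0, 0, 1)$ is pictured to the right. It is piecewise smooth as it is the union of its three smooth edges. So using additivity,
$$\int_C \mathbf{F} \cdot d\mathbf{r} = \int_{AB} \mathbf{F} \cdot d\mathbf{r} + \int_{BC} \mathbf{F} \cdot d\mathbf{r} + \int_{CA} \mathbf{F} \cdot d\mathbf{r}.$$

The segment $AB$ is parametrized by $\mathbf{r}(t) = \langle 1 - t, t, 0 \rangle$ for $0 \le t \le 1$, so
$$\mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t) = \mathbf{F}(1 - t, t, 0) \cdot \langle -1, 1, 0 \rangle = \langle e^0, e^t, 1 \rangle \cdot \langle -1, 1, 0 \rangle = -1 + e^t,$$
$$\int_{AB} \mathbf{F} \cdot d\mathbf{r} = \int_0^1 \left( e^t - 1 \right) dt = \Big[ e^t - t \Big]_0^1 = e - 2.$$

The segment $BC$ is parametrized by $\mathbf{r}(t) = \langle 0, 1 - t, t \rangle$ for $0 \le t \le 1$, so
$$\mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t) = \mathbf{F}(0, 1 - t, t) \cdot \langle 0, -1, 1 \rangle = \langle e^t, e^{1-t}, 1 - t \rangle \cdot \langle 0, -1, 1 \rangle = -e^{1-t} + 1 - t,$$
$$\int_{BC} \mathbf{F} \cdot d\mathbf{r} = \int_0^1 \left( -e^{1-t} + 1 - t \right) dt = \left[ e^{1-t} + t - \frac{t^2}{2} \right]_0^1 = \frac{3}{2} - e.$$

The segment $CA$ is parametrized by $\mathbf{r}(t) = \langle t, 0, 1 - t \rangle$ for $0 \le t \le 1$, so
$$\mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t) = \mathbf{F}(t, 0, 1 - t) \cdot \langle 1, 0, -1 \rangle = \langle e^{1-t}, 1, t \rangle \cdot \langle 1, 0, -1 \rangle = e^{1-t} - t,$$
$$\int_{CA} \mathbf{F} \cdot d\mathbf{r} = \int_0^1 \left( e^{1-t} - t \right) dt = \left[ -e^{1-t} - \frac{t^2}{2} \right]_0^1 = -\frac{3}{2} + e.$$
Thus,
$$\int_C \mathbf{F} \cdot d\mathbf{r} = e - 2 + \frac{3}{2} - e - \frac{3}{2} + e = e - 2.$$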
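
Additivity makes piecewise curves such as this triangle convenient to handle in code: integrate over each edge and add. The sketch below is illustrative and not from the notes; the helper `segment_integral` (which uses the straight-line parametrization from `start` to `end`) and the names `V1`, `V2`, `V3` for the vertices are assumptions.

```python
import math

def segment_integral(F, start, end, n=20000):
    """Integrate F . dr along r(t) = start + t (end - start), 0 <= t <= 1 (midpoint rule)."""
    direction = tuple(e - s for s, e in zip(start, end))  # r'(t), constant on a segment
    dt = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        point = tuple(s + t * d for s, d in zip(start, direction))
        total += sum(fc * dc for fc, dc in zip(F(*point), direction)) * dt
    return total

F = lambda x, y, z: (math.exp(z), math.exp(y), x + y)
V1, V2, V3 = (1, 0, 0), (0, 1, 0), (0, 0, 1)  # the vertices A, B, C of Example 25.11

edges = [
    segment_integral(F, V1, V2),  # edge AB
    segment_integral(F, V2, V3),  # edge BC
    segment_integral(F, V3, V1),  # edge CA
]
total = sum(edges)
print(edges, total, math.e - 2)
```

The three edge values should be close to $e - 2$, $\frac{3}{2} - e$, and $e - \frac{3}{2}$, with sum $e - 2$.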
Finally for this section, we have two applications of vector line integrals. Recall that in physics, work is the energy expended when a force is applied to an object as it moves along a path.

For example, the work $W$ performed along the line segment from $P$ to $Q$ by applying a constant force $\mathbf{F}$ at an angle $\theta$, pictured to the right, is
$$W = (\text{tangential component of } \mathbf{F}) \times \text{distance} = (\|\mathbf{F}\| \cos\theta)\, \operatorname{dist}(P, Q).$$

With this in mind, we define the work done by a force field $\mathbf{F}$ on an object moving along a curve $C$ as a line integral:
$$W = \int_C \mathbf{F} \cdot d\mathbf{r}.$$
This is called the “work performed by the field $\mathbf{F}$.” We will often consider the work required to move an object in the presence of a force field $\mathbf{F}$, in which case $\mathbf{F}$ acts on the object and we must work against the force to move the object. In this case,
$$\text{Work performed against } \mathbf{F} = -\int_C \mathbf{F} \cdot d\mathbf{r}.$$


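Example 25.12. Calculate the work performed against $\mathbf{F}$ in moving a particle from $(1, 1, 1)$ to $(4, 8, 2)$ along the path
$$\mathbf{r}(t) = \langle t^2, t^3, t \rangle \quad \text{for } 1 \le t \le 2$$
in the presence of the force field $\mathbf{F} = \left\langle x^2, -z, -\frac{y}{z} \right\rangle$ (in newtons).

The work performed against $\mathbf{F}$ is $-\int_C \mathbf{F} \cdot d\mathbf{r}$. First, let’s find $\mathbf{F} \cdot d\mathbf{r} = \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t) \, dt$:
$$\mathbf{F}(\mathbf{r}(t)) = \mathbf{F}(t^2, t^3, t) = \langle t^4, -t, -t^2 \rangle, \qquad \mathbf{r}'(t) = \langle 2t, 3t^2, 1 \rangle \implies \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t) = 2t^5 - 3t^3 - t^2.$$

Then the work performed against $\mathbf{F}$ in joules is
$$-\int_C \mathbf{F} \cdot d\mathbf{r} = -\int_1^2 \left( 2t^5 - 3t^3 - t^2 \right) dt = -\left[ \frac{1}{3} t^6 - \frac{3}{4} t^4 - \frac{1}{3} t^3 \right]_1^2 = -\frac{89}{12}.$$
The negative sign means that the field itself does positive work, $\frac{89}{12}$ joules, on the particle along this path.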

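Line integrals are also used to compute the flux across a plane curve, which is defined to be the integral of the normal component of a vector field (instead of the tangential component).

Suppose a plane curve $C$ is parametrized by $\mathbf{r}(t) = \langle x(t), y(t) \rangle$ for $a \le t \le b$, and let
$$\mathbf{N} = \mathbf{N}(t) = \langle y'(t), -x'(t) \rangle$$
and
$$\mathbf{n}(t) = \frac{\mathbf{N}(t)}{\|\mathbf{N}(t)\|} = \frac{\mathbf{N}(t)}{\|\mathbf{r}'(t)\|}.$$
Notice here that $\mathbf{N}(t) \cdot \mathbf{r}'(t) = 0$, so $\mathbf{N}(t)$ and $\mathbf{n}(t)$ are normal to the curve $C$ at $\mathbf{r}(t)$.
We also have that both $\mathbf{N}(t)$ and $\mathbf{n}(t)$ point to the right as you follow the curve $C$ along $\mathbf{r}(t)$.

Definition 25.13 (Flux Across a Curve). In the setting above, the flux across a curve $C$ is the integral of the normal component of $\mathbf{F}$, $\mathbf{F} \cdot \mathbf{n}$:
$$\text{Flux across } C = \int_C (\mathbf{F} \cdot \mathbf{n}) \, ds = \int_a^b \mathbf{F}(\mathbf{r}(t)) \cdot \frac{\mathbf{N}(t)}{\|\mathbf{r}'(t)\|} \, \|\mathbf{r}'(t)\| \, dt = \int_a^b \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{N}(t) \, dt.$$

Physically, if $\mathbf{F}$ is the velocity field of a fluid modeled in $\mathbb{R}^2$, then the flux is the quantity of fluid flowing across the curve per unit time.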
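Example 25.14. Calculate the flux of the velocity vector field $\mathbf{v} = \left\langle 3 + 2y - \frac{y^2}{3}, 0 \right\rangle$, in centimeters per second, across the quarter ellipse $\mathbf{r}(t) = \langle 3\cos t, 6\sin t \rangle$ for $0 \le t \le \frac{\pi}{2}$.

The vector field along the path is
$$\mathbf{v}(\mathbf{r}(t)) = \left\langle 3 + 2(6\sin t) - \frac{(6\sin t)^2}{3}, 0 \right\rangle = \left\langle 3 + 12\sin t - 12\sin^2 t, 0 \right\rangle.$$
The tangent vector is $\mathbf{r}'(t) = \langle -3\sin t, 6\cos t \rangle$, and so $\mathbf{N}(t) = \langle 6\cos t, 3\sin t \rangle$. Then
$$\mathbf{v}(\mathbf{r}(t)) \cdot \mathbf{N}(t) = 18\cos t + 72\sin t\cos t - 72\sin^2 t\cos t.$$
Then, the flux is
$$\int_0^{\pi/2} \left( 18\cos t + 72\sin t\cos t - 72\sin^2 t\cos t \right) dt = 30 \text{ cm}^2/\text{s}.$$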
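
The flux formula of Definition 25.13 differs from the vector line integral only in that $\mathbf{r}'(t)$ is replaced by the normal vector $\mathbf{N}(t) = \langle y'(t), -x'(t) \rangle$. Here is a minimal numerical sketch of that idea applied to Example 25.14; the function name `flux_across_curve` and the midpoint rule are illustrative assumptions, not part of the notes.

```python
import math

def flux_across_curve(F, r, r_prime, a, b, n=20000):
    """Midpoint-rule approximation of the integral of F(r(t)) . N(t), N = <y', -x'>."""
    dt = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * dt
        dx, dy = r_prime(t)
        normal = (dy, -dx)  # points to the right of the direction of travel
        fx, fy = F(*r(t))
        total += (fx * normal[0] + fy * normal[1]) * dt
    return total

v = lambda x, y: (3 + 2 * y - y**2 / 3, 0.0)
r = lambda t: (3 * math.cos(t), 6 * math.sin(t))
r_prime = lambda t: (-3 * math.sin(t), 6 * math.cos(t))

flux = flux_across_curve(v, r, r_prime, 0.0, math.pi / 2)
print(flux)
```

The printed value should be close to the 30 cm²/s found by hand.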


26 Conservative Vector Fields (16.3)


Recall that vector fields nicely model the real world (think: flow of a fluid through space). In fact, the forces
of physical systems which conserve energy give rise to conservative vector fields.

Definition 26.1 (Conservative Vector Fields and Potential Functions). A vector field $\mathbf{F} = \langle F_1, F_2, F_3 \rangle$ is called conservative if it arises as the gradient of a differentiable function $f(x, y, z)$, i.e.
$$\mathbf{F} = \nabla f = \left\langle \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right\rangle.$$
The function $f(x, y, z)$ is called a potential function for $\mathbf{F}$. A similar definition holds for $n$ variables.

Recall that gradient vectors $\nabla f$ are orthogonal to level sets $f(x, y, z) = c$ (level curves in two variables, level surfaces in three), and so for a conservative vector field $\mathbf{F} = \nabla f$, the vector $\mathbf{F}(P) = \nabla f(P)$ is orthogonal to the level set $f(x, y, z) = f(P)$ through $P$ for all points $P$.

In the previous section we saw that the work done against a vector field for a particle to travel from point $A$ to point $B$ can be computed using a line integral. For arbitrary vector fields, different paths from $A$ to $B$ can give different amounts of work. For a conservative vector field, the work is independent of the path taken.

The following theorem is a consequence of homework problem 76(d), where you should have shown that $\operatorname{curl}(\nabla f) = \mathbf{0}$.

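Theorem 26.2 (Curl of a Conservative Vector Field). If $\mathbf{F} = \langle F_1, F_2 \rangle$ is conservative, then
$$\frac{\partial F_1}{\partial y} = \frac{\partial F_2}{\partial x}.$$
If $\mathbf{F} = \langle F_1, F_2, F_3 \rangle$ is conservative, then $\operatorname{curl}(\mathbf{F}) = \mathbf{0}$ or, equivalently,
$$\frac{\partial F_1}{\partial y} = \frac{\partial F_2}{\partial x}, \qquad \frac{\partial F_2}{\partial z} = \frac{\partial F_3}{\partial y}, \qquad \text{and} \qquad \frac{\partial F_3}{\partial x} = \frac{\partial F_1}{\partial z}.$$
Like antiderivatives in one variable, potential functions are unique up to an additive constant.
Definition 26.3 (Uniqueness of Potential Functions). If $\mathbf{F}$ is conservative on an open connected domain, then any two potential functions of $\mathbf{F}$ differ by a constant.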

Example 26.4. Determine if $\mathbf{F} = \langle y, x + z^2, 2yz \rangle$ is a conservative vector field.

First, let’s check if $\mathbf{F}$ satisfies the conditions of the previous theorem. If it doesn’t, we know $\mathbf{F}$ is not conservative.
$$\frac{\partial F_1}{\partial y} = 1 = \frac{\partial F_2}{\partial x}, \qquad \frac{\partial F_2}{\partial z} = 2z = \frac{\partial F_3}{\partial y}, \qquad \text{and} \qquad \frac{\partial F_3}{\partial x} = 0 = \frac{\partial F_1}{\partial z}.$$
So we can’t immediately say that $\mathbf{F}$ is not conservative. To determine if it is, we will try to find a potential function $f$ of $\mathbf{F}$ or show that there is no such potential function. Such a potential function must satisfy
$$\frac{\partial f}{\partial x} = y, \qquad \frac{\partial f}{\partial y} = x + z^2, \qquad \text{and} \qquad \frac{\partial f}{\partial z} = 2yz.$$

Then, we must have that
$$f(x, y, z) = \int y \, dx = xy + C_1(y, z),$$
i.e., the only way an “x” can appear in $f(x, y, z)$ is in the term $xy$. Similarly,
$$f(x, y, z) = \int \left( x + z^2 \right) dy = xy + yz^2 + C_2(x, z),$$
so the only way a “y” can appear in $f(x, y, z)$ is in the terms $xy + yz^2$. Notice that this does not contradict the above for how an “x” can appear. At this point, we know that $f(x, y, z) = xy + yz^2 + C(z)$. Finally, we have
$$f(x, y, z) = \int 2yz \, dz = yz^2 + C_3(x, y).$$

So there are no extra $z$ terms, and we have a potential function $f(x, y, z) = xy + yz^2$ for $\mathbf{F}$. To check, verify that $\nabla f = \mathbf{F}$.
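
The final check $\nabla f = \mathbf{F}$ can be done by hand, or spot-checked numerically as in the following sketch. It is not part of the original notes: the helper `gradient`, the sample points, and the step size are assumptions, and agreement at a few points is evidence rather than a proof.

```python
def f(x, y, z):
    # Candidate potential function from Example 26.4
    return x * y + y * z**2

def F(x, y, z):
    # The vector field from Example 26.4
    return (y, x + z**2, 2 * y * z)

def gradient(g, p, h=1e-5):
    """Central-difference approximation of the gradient of g at the point p."""
    grad = []
    for i in range(3):
        fwd, bwd = list(p), list(p)
        fwd[i] += h
        bwd[i] -= h
        grad.append((g(*fwd) - g(*bwd)) / (2 * h))
    return tuple(grad)

sample_points = [(1.0, 2.0, 3.0), (-0.5, 0.3, 1.7), (2.0, -1.0, -2.5)]
max_error = max(
    abs(gc - fc)
    for p in sample_points
    for gc, fc in zip(gradient(f, p), F(*p))
)
print(max_error)
```

The printed maximum error should be tiny, consistent with $\nabla f = \mathbf{F}$.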

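Example 26.5. Determine if $\mathbf{F} = \left\langle xy, \frac{x^2}{2}, zy \right\rangle$ is a conservative vector field.

First, let’s check if $\mathbf{F}$ satisfies the conditions of the previous theorem. If it doesn’t, we know $\mathbf{F}$ is not conservative.
$$\frac{\partial F_1}{\partial y} = x = \frac{\partial F_2}{\partial x}, \qquad \frac{\partial F_2}{\partial z} = 0 \ne z = \frac{\partial F_3}{\partial y}, \qquad \text{and} \qquad \frac{\partial F_3}{\partial x} = 0 = \frac{\partial F_1}{\partial z}.$$
As the second equality does not hold, $\mathbf{F}$ is not conservative.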

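Example 26.6. Show that $f(x, y, z) = r = \sqrt{x^2 + y^2 + z^2}$ is a potential function for the unit radial vector field
$$\mathbf{e}_r = \left\langle \frac{x}{\sqrt{x^2 + y^2 + z^2}}, \frac{y}{\sqrt{x^2 + y^2 + z^2}}, \frac{z}{\sqrt{x^2 + y^2 + z^2}} \right\rangle.$$

$$\frac{\partial f}{\partial x} = \frac{x}{\sqrt{x^2 + y^2 + z^2}}, \qquad \frac{\partial f}{\partial y} = \frac{y}{\sqrt{x^2 + y^2 + z^2}}, \qquad \text{and} \qquad \frac{\partial f}{\partial z} = \frac{z}{\sqrt{x^2 + y^2 + z^2}}.$$

Thus $\nabla f = \mathbf{e}_r$, and $\mathbf{e}_r$ is conservative.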

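As was said earlier, conservative vector fields are path-independent. That is, the line integral of a conservative vector field $\mathbf{F}$ along a path from $P$ to $Q$ depends only on $P$ and $Q$, and not on the particular path taken: for any two paths $\mathbf{r}_1$ and $\mathbf{r}_2$ from $P$ to $Q$,
$$\int_{\mathbf{r}_1} \mathbf{F} \cdot d\mathbf{r} = \int_{\mathbf{r}_2} \mathbf{F} \cdot d\mathbf{r}.$$

This yields a nice result for closed curves. We call the line integral of $\mathbf{F}$ over a closed curve $C$ the circulation of $\mathbf{F}$ around $C$ and write this in a way that emphasizes that $C$ is closed:
$$\oint_C \mathbf{F} \cdot d\mathbf{r}.$$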


Theorem 26.7 (Fundamental Theorem for Conservative Vector Fields). Assume $\mathbf{F}$ is conservative with $\mathbf{F} = \nabla f$ on a domain $D$.
• If $C_1$ is an oriented curve in $D$ from $P$ to $Q$, then
$$\int_{C_1} \mathbf{F} \cdot d\mathbf{r} = f(Q) - f(P).$$
In particular, $\mathbf{F}$ is path-independent on $D$.
• If $C_2$ is a closed curve in $D$, the circulation of $\mathbf{F}$ around $C_2$ is zero:
$$\oint_{C_2} \mathbf{F} \cdot d\mathbf{r} = 0.$$

Example 26.8. Let $\mathbf{F} = \langle 2xy + z, x^2, x \rangle$. Evaluate $\int_C \mathbf{F} \cdot d\mathbf{r}$ where $C$ is any curve from $P = (1, -1, 2)$ to $Q = (2, 2, 3)$.

Hint: since a specific curve wasn’t given, it must be that $\mathbf{F}$ is path-independent! This is a clue to use the fundamental theorem for conservative vector fields.

First, let’s find a potential function $f(x, y, z)$ for $\mathbf{F}$. We must have that
$$f(x, y, z) = \int (2xy + z) \, dx = x^2 y + xz + C_1(y, z),$$
$$f(x, y, z) = \int x^2 \, dy = x^2 y + C_2(x, z),$$
$$f(x, y, z) = \int x \, dz = xz + C_3(x, y).$$

So $f(x, y, z) = x^2 y + zx$ is a potential function for $\mathbf{F}$, and hence $\mathbf{F}$ is conservative. Then for any curve $C$ from $P = (1, -1, 2)$ to $Q = (2, 2, 3)$, we have
$$\int_C \mathbf{F} \cdot d\mathbf{r} = f(Q) - f(P) = f(2, 2, 3) - f(1, -1, 2) = 14 - 1 = 13.$$
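
Path independence can be illustrated numerically by integrating $\mathbf{F}$ along two quite different paths from $P$ to $Q$ and comparing both results with $f(Q) - f(P)$. This is a sketch under assumptions that are not in the notes: the two paths (a straight segment and a detour built by adding a `sin(pi t)` bump, which vanishes at both endpoints) and the helper `line_integral` are chosen purely for illustration.

```python
import math

def line_integral(F, r, r_prime, a=0.0, b=1.0, n=20000):
    """Midpoint-rule approximation of the integral of F(r(t)) . r'(t) over [a, b]."""
    dt = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * dt
        total += sum(fc * vc for fc, vc in zip(F(*r(t)), r_prime(t))) * dt
    return total

F = lambda x, y, z: (2 * x * y + z, x**2, x)
f = lambda x, y, z: x**2 * y + x * z  # potential function from Example 26.8

P, Q = (1.0, -1.0, 2.0), (2.0, 2.0, 3.0)
d = tuple(q - p for p, q in zip(P, Q))  # displacement Q - P
bump = (1.0, -2.0, 0.5)                 # arbitrary detour direction

# Path 1: the straight segment from P to Q
straight = lambda t: tuple(p + t * dc for p, dc in zip(P, d))
straight_prime = lambda t: d

# Path 2: the segment plus a bump that is zero at t = 0 and t = 1
detour = lambda t: tuple(
    p + t * dc + math.sin(math.pi * t) * bc for p, dc, bc in zip(P, d, bump)
)
detour_prime = lambda t: tuple(
    dc + math.pi * math.cos(math.pi * t) * bc for dc, bc in zip(d, bump)
)

along_straight = line_integral(F, straight, straight_prime)
along_detour = line_integral(F, detour, detour_prime)
potential_difference = f(*Q) - f(*P)
print(along_straight, along_detour, potential_difference)
```

All three printed values should be close to 13.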

Example 26.9. Let $f = e^{\sin(xyz)} + 14x^4 y z^{10}$.

Evaluate
$$\oint_C \nabla f \cdot d\mathbf{r}$$
where $C$ is the oriented curve to the right.

Since $C$ is a closed curve and $\nabla f$ is conservative, by the fundamental theorem for conservative vector fields, the above line integral is zero.

Theorem 26.10. A vector field F on an open connected domain D is path-independent if and only if it is
conservative.
This means that the only path-independent vector fields are conservative vector fields. The proof of this
theorem is in the textbook.

Let’s now consider an application of conservative vector fields to physics. The conservation of energy principle says that the sum of kinetic and potential energy remains constant in an isolated system. We’ll now show that conservation of energy is valid for the motion of a particle of mass $m$ under a force field $\mathbf{F}$ if $\mathbf{F}$ has a potential function, i.e. $\mathbf{F}$ is conservative. (Note that this is exactly why we call conservative vector fields “conservative.”)

As usual, physicists and mathematicians have mismatched notation, so we will follow the physicists’ convention (for a brief time!) of writing the potential function $V$ of $\mathbf{F}$ so that
$$\mathbf{F} = -\nabla V.$$
When a particle of mass $m$ is at the point $P = (x, y, z)$, it is said to have potential energy $PE = V(P)$. Suppose the particle moves along a path $\mathbf{r}(t)$, so that its velocity is $\mathbf{v} = \mathbf{r}'(t)$, and its kinetic energy is $KE = \frac{1}{2} m \|\mathbf{v}\|^2 = \frac{1}{2} m \mathbf{v} \cdot \mathbf{v}$. Then the total energy at time $t$ is
$$E = KE + PE = \frac{1}{2} m \mathbf{v} \cdot \mathbf{v} + V(\mathbf{r}(t)).$$
Theorem 26.11 (Conservation of Energy). The total energy $E$ of a particle moving under the influence of a conservative force field $\mathbf{F} = -\nabla V$ is constant in time, i.e. $\frac{dE}{dt} = 0$.
You will prove this theorem in the homework (problem 83).

We know from Theorem 26.2 that if $\mathbf{F}$ is conservative it satisfies $\operatorname{curl}(\mathbf{F}) = \mathbf{0}$ and the corresponding cross-partial conditions. A natural question is: if $\operatorname{curl}(\mathbf{F}) = \mathbf{0}$, and so the corresponding cross-partial conditions hold, must $\mathbf{F}$ be conservative? The answer is: sometimes.

A domain D is called simply connected if it doesn’t have any holes. More precisely, D is simply connected
if every loop in D can be shrunk to a point in D.

Theorem 26.12 (Existence of Potential Functions). Let F be a vector field on a simply connected domain.
If curl(F) = 0, and so the corresponding cross-partial conditions hold, then F is conservative.
Consider the following example.
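Example 26.13. Show that the vortex field
$$\mathbf{F} = \left\langle \frac{-y}{x^2 + y^2}, \frac{x}{x^2 + y^2} \right\rangle$$
satisfies the cross-partial conditions but is not conservative.

By the quotient rule,
$$\frac{\partial F_2}{\partial x} = \frac{\partial}{\partial x}\left( \frac{x}{x^2 + y^2} \right) = \frac{x^2 + y^2 - x(2x)}{(x^2 + y^2)^2} = \frac{y^2 - x^2}{(x^2 + y^2)^2},$$
$$\frac{\partial F_1}{\partial y} = \frac{\partial}{\partial y}\left( \frac{-y}{x^2 + y^2} \right) = \frac{-(x^2 + y^2) + y(2y)}{(x^2 + y^2)^2} = \frac{y^2 - x^2}{(x^2 + y^2)^2}.$$

Thus $\mathbf{F}$ satisfies the cross-partial conditions.

Now, consider the circulation of $\mathbf{F}$ around the unit circle $C$ parametrized by $\mathbf{r}(t) = \langle \cos t, \sin t \rangle$ for $0 \le t \le 2\pi$:
$$\mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t) = \langle -\sin t, \cos t \rangle \cdot \langle -\sin t, \cos t \rangle = \sin^2(t) + \cos^2(t) = 1,$$
$$\oint_C \mathbf{F} \cdot d\mathbf{r} = \int_0^{2\pi} \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t) \, dt = \int_0^{2\pi} dt = 2\pi \ne 0!$$
So the circulation of $\mathbf{F}$ around the closed curve $C$ is non-zero, and hence $\mathbf{F}$ cannot be conservative!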
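To see why the vortex field does not contradict Theorem 26.12, let’s look at the graph of the vortex field and its domain.

We can see that the domain of the vortex field is the plane $\mathbb{R}^2$ without the origin, $\mathbb{R}^2 \setminus \{(0, 0)\}$, which is not simply connected!

In fact, we can show that the vortex field is conservative when considered over any simply connected subset of $\mathbb{R}^2 \setminus \{(0, 0)\}$: Let $f(x, y) = \theta = \tan^{-1}\left( \frac{y}{x} \right)$ for $x \ne 0$. Then
$$\frac{\partial f}{\partial x} = \frac{1}{1 + \left( \frac{y}{x} \right)^2} \left( \frac{-y}{x^2} \right) = \frac{-y}{x^2 + y^2} \quad \text{and} \quad \frac{\partial f}{\partial y} = \frac{1}{1 + \left( \frac{y}{x} \right)^2} \left( \frac{1}{x} \right) = \frac{x}{x^2 + y^2}.$$
So, if $x \ne 0$, $\mathbf{F} = \nabla f$, and the vortex field is conservative. Then the line integral of $\mathbf{F}$ along a curve $C$ in a simply connected subset of $\mathbb{R}^2 \setminus \{(0, 0)\}$ is the change in the potential function $f(x, y) = \theta$ along the path:
$$\int_C \mathbf{F} \cdot d\mathbf{r} = \theta_2 - \theta_1 = \text{the change in the angle } \theta \text{ along } C.$$
This gives us insight into why the vortex field $\mathbf{F}$ is not conservative on all of its domain: the angle $\theta$ is defined only up to integer multiples of $2\pi$, so it cannot be chosen as a single continuous function on the whole punctured plane. Going once counterclockwise around the origin changes $\theta$ by $2\pi$, which is why the integral over the unit circle was $2\pi$. In general, if a closed path $C$ winds around the origin $n$ times (counting negative if the winding is clockwise), then
$$\oint_C \mathbf{F} \cdot d\mathbf{r} = 2\pi n.$$
The integer $n$ is called the winding number of the path.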
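
The relation between the circulation of the vortex field and the winding number can be explored numerically. The following sketch is illustrative and not from the notes: the function `circulation`, its `turns` parameter, and the particular circles are assumptions. It integrates around circles that wind around the origin once, twice, once clockwise, and not at all.

```python
import math

def circulation(center, radius, turns=1, n=20000):
    """Integrate the vortex field around a circle traversed `turns` times.

    Positive `turns` means counterclockwise, negative means clockwise.
    """
    cx, cy = center
    dt = 2 * math.pi * turns / n  # negative when turns < 0
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        x = cx + radius * math.cos(t)
        y = cy + radius * math.sin(t)
        dx = -radius * math.sin(t)
        dy = radius * math.cos(t)
        denom = x * x + y * y
        total += (-y / denom * dx + x / denom * dy) * dt
    return total

once = circulation((0.0, 0.0), 1.0)                 # winds once around the origin
twice = circulation((0.0, 0.0), 1.0, turns=2)       # winds twice
clockwise = circulation((0.0, 0.0), 1.0, turns=-1)  # winds once, clockwise
away = circulation((3.0, 0.0), 1.0)                 # does not enclose the origin

for value in (once, twice, clockwise, away):
    print(value / (2 * math.pi))  # approximate winding number
```

Dividing each circulation by $2\pi$ should give values close to 1, 2, −1, and 0.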


27 Surface Integrals (16.4)


In section 25 we generalized integrals over an interval [a, b] to line integrals over a curve. In this section we
will generalize double integrals over a domain D to integrals over a surface S, called surface integrals.

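Recall from section 15 that if
$$G(u, v) = (x(u, v), y(u, v), z(u, v))$$
is the parametrization of a surface, the tangent vectors to the grid curves through a point $P = G(u_0, v_0)$ are
$$\mathbf{T}_u(P) = \frac{\partial G}{\partial u}(u_0, v_0) = \left\langle \frac{\partial x}{\partial u}(u_0, v_0), \frac{\partial y}{\partial u}(u_0, v_0), \frac{\partial z}{\partial u}(u_0, v_0) \right\rangle$$
and
$$\mathbf{T}_v(P) = \frac{\partial G}{\partial v}(u_0, v_0) = \left\langle \frac{\partial x}{\partial v}(u_0, v_0), \frac{\partial y}{\partial v}(u_0, v_0), \frac{\partial z}{\partial v}(u_0, v_0) \right\rangle,$$
and a vector normal to the surface at the point $P$ is given by
$$\mathbf{N}(P) = \mathbf{T}_u(P) \times \mathbf{T}_v(P).$$

The length $\|\mathbf{N}\|$ of the normal vector has an important geometric interpretation. Consider a rectangle $D$, divided into small rectangles $R_{ij}$ of size $\Delta u \times \Delta v$. How does the area of $R_{ij}$ compare to the area of its image under $G$, $G(R_{ij}) = S_{ij}$?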


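Note that in the figure above, for small $\Delta u$ and $\Delta v$, the area of $S_{ij}$ is approximately the same as the area of the parallelogram spanned by $\overrightarrow{PQ}$ and $\overrightarrow{PS}$, i.e.
$$\operatorname{Area}(S_{ij}) \approx \|\overrightarrow{PQ} \times \overrightarrow{PS}\|.$$
Now, we can approximate $\overrightarrow{PQ}$ and $\overrightarrow{PS}$ via linear approximation:
$$\overrightarrow{PQ} = G(u_{ij} + \Delta u, v_{ij}) - G(u_{ij}, v_{ij}) \approx \frac{\partial G}{\partial u}(u_{ij}, v_{ij}) \Delta u = \mathbf{T}_u \Delta u$$
and
$$\overrightarrow{PS} = G(u_{ij}, v_{ij} + \Delta v) - G(u_{ij}, v_{ij}) \approx \frac{\partial G}{\partial v}(u_{ij}, v_{ij}) \Delta v = \mathbf{T}_v \Delta v.$$
Thus,
$$\operatorname{Area}(S_{ij}) \approx \|\overrightarrow{PQ} \times \overrightarrow{PS}\| \approx \|\mathbf{T}_u \Delta u \times \mathbf{T}_v \Delta v\| = \|\mathbf{T}_u \times \mathbf{T}_v\| \Delta u \Delta v = \|\mathbf{N}(u_{ij}, v_{ij})\| \operatorname{Area}(R_{ij}).$$
So, $\|\mathbf{N}\|$ is the area distortion factor that measures how the area of a small rectangle is changed under the map $G$.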

Now, the entire surface $S$ parametrized by $G(u, v)$ can be broken into small pieces of the form $S_{ij}$, and so by approximating each patch we can approximate the surface area of $S$:
$$\operatorname{Area}(S) = \sum_{i,j} \operatorname{Area}(S_{ij}) \approx \sum_{i,j} \|\mathbf{N}(u_{ij}, v_{ij})\| \Delta u \Delta v.$$

Now observe that the sum on the right is a Riemann sum for the double integral of $\|\mathbf{N}(u, v)\|$ over the parameter domain $D$! Letting $\Delta u$ and $\Delta v$ go to zero, we get the integral and equality, and hence
$$\operatorname{Area}(S) = \iint_D \|\mathbf{N}(u, v)\| \, du \, dv.$$

We can now use this to define the integral of a function $f(x, y, z)$ over a surface, called a surface integral and denoted
$$\iint_S f(x, y, z) \, dS.$$
To do this, partition $S$ into small pieces $S_{ij}$ and choose a sample point $P_{ij} = G(u_{ij}, v_{ij})$ in each of the $S_{ij}$. We can then approximate the accumulation of $f$ over $S$ via the sum
$$\sum_{i,j} f(P_{ij}) \operatorname{Area}(S_{ij}).$$

Letting the size of the $S_{ij}$ go to zero, as above, we get the surface integral
$$\iint_S f(x, y, z) \, dS = \lim_{\operatorname{Area}(S_{ij}) \to 0} \sum_{i,j} f(P_{ij}) \operatorname{Area}(S_{ij}),$$
if the limit on the right exists. To evaluate this integral, we use the previous approximation of $\operatorname{Area}(S_{ij})$:
$$\sum_{i,j} f(P_{ij}) \operatorname{Area}(S_{ij}) \approx \sum_{i,j} f(G(u_{ij}, v_{ij})) \|\mathbf{N}(u_{ij}, v_{ij})\| \Delta u \Delta v.$$

The right side is now a Riemann sum for the double integral of the function
$$f(G(u, v)) \|\mathbf{N}(u, v)\|$$
over the parameter domain $D$. We can then use a limiting process to get equality for “nice enough” maps $G$.


Theorem 27.1 (Surface Integrals and Surface Area). Let $G(u, v)$ be a parametrization of a surface $S$ with parameter domain $D$ so that $G$ is continuously differentiable, one-to-one, and regular (except possibly on the boundary of $D$). Then
$$\iint_S f(x, y, z) \, dS = \iint_D f(G(u, v)) \|\mathbf{N}(u, v)\| \, du \, dv.$$
If $f(x, y, z) = 1$ then we get
$$\operatorname{Area}(S) = \iint_D \|\mathbf{N}(u, v)\| \, du \, dv.$$

This theorem can be remembered by how to replace the surface element, $dS$:
$$dS = \|\mathbf{N}(u, v)\| \, du \, dv.$$

Aside: The change of variables formula for double integrals is just a special case of the above theorem. If the surface $S$ is in fact a domain in the xy-plane, i.e. $z(u, v) = 0$, then we may view $G$ as a map from the uv-plane to the xy-plane, in which case it turns out that $\|\mathbf{N}(u, v)\|$ is exactly the absolute value of the Jacobian of $G$.

In the following two examples, let $S$ be the surface that is the portion of the cone $x^2 + y^2 = z^2$ lying above the disk $x^2 + y^2 \le 4$, which is parametrized by
$$G(\theta, t) = (t\cos\theta, t\sin\theta, t)$$
for
$$0 \le t \le 2, \quad 0 \le \theta \le 2\pi.$$

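Example 27.2. Calculate the surface area of $S$.

We first must compute the tangent and normal vectors:
$$\mathbf{T}_\theta = \frac{\partial G}{\partial \theta} = \langle -t\sin\theta, t\cos\theta, 0 \rangle, \qquad \mathbf{T}_t = \frac{\partial G}{\partial t} = \langle \cos\theta, \sin\theta, 1 \rangle,$$
$$\mathbf{N} = \mathbf{T}_\theta \times \mathbf{T}_t = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ -t\sin\theta & t\cos\theta & 0 \\ \cos\theta & \sin\theta & 1 \end{vmatrix} = \langle t\cos\theta, t\sin\theta, -t \rangle.$$
Then, keeping in mind that $0 \le t \le 2$, $\|\mathbf{N}\| = \sqrt{t^2\cos^2\theta + t^2\sin^2\theta + (-t)^2} = \sqrt{2t^2} = \sqrt{2}\,t$.
Then,
$$\operatorname{Area}(S) = \iint_D \|\mathbf{N}\| \, d\theta \, dt = \int_0^2 \int_0^{2\pi} \sqrt{2}\,t \, d\theta \, dt = 4\sqrt{2}\,\pi.$$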
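Example 27.3. Calculate $\iint_S x^2 z \, dS$.

First let’s find $f(G(\theta, t))$: $f(G(\theta, t)) = f(t\cos\theta, t\sin\theta, t) = (t\cos\theta)^2 t = t^3\cos^2\theta$.
Then,
$$\iint_S x^2 z \, dS = \int_{t=0}^{t=2} \int_{\theta=0}^{\theta=2\pi} f(G(\theta, t)) \|\mathbf{N}(\theta, t)\| \, d\theta \, dt = \int_{t=0}^{t=2} \int_{\theta=0}^{\theta=2\pi} \left( t^3\cos^2\theta \right) \left( \sqrt{2}\,t \right) d\theta \, dt$$
$$= \sqrt{2} \left( \int_0^2 t^4 \, dt \right) \left( \int_0^{2\pi} \cos^2\theta \, d\theta \right) = \frac{32\sqrt{2}\,\pi}{5}.$$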
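
Theorem 27.1 translates directly into a double Riemann sum over the parameter domain. The sketch below is a rough numerical illustration and not part of the notes: the function `surface_integral`, the finite-difference tangent vectors, and the grid size are all assumptions. It re-computes the cone examples 27.2 and 27.3.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

def surface_integral(f, G, u_range, v_range, n=200, h=1e-6):
    """Midpoint Riemann sum of f(G(u, v)) * ||T_u x T_v|| over a rectangular domain."""
    (u0, u1), (v0, v1) = u_range, v_range
    du, dv = (u1 - u0) / n, (v1 - v0) / n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * du
        for j in range(n):
            v = v0 + (j + 0.5) * dv
            # Tangent vectors approximated by central differences
            Tu = tuple((p - q) / (2 * h) for p, q in zip(G(u + h, v), G(u - h, v)))
            Tv = tuple((p - q) / (2 * h) for p, q in zip(G(u, v + h), G(u, v - h)))
            N = cross(Tu, Tv)
            total += f(*G(u, v)) * math.sqrt(sum(c * c for c in N)) * du * dv
    return total

# The cone G(theta, t) = (t cos theta, t sin theta, t), 0 <= theta <= 2 pi, 0 <= t <= 2
G = lambda theta, t: (t * math.cos(theta), t * math.sin(theta), t)
domain = ((0.0, 2 * math.pi), (0.0, 2.0))

area = surface_integral(lambda x, y, z: 1.0, G, *domain)
integral = surface_integral(lambda x, y, z: x * x * z, G, *domain)
print(area, 4 * math.sqrt(2) * math.pi)
print(integral, 32 * math.sqrt(2) * math.pi / 5)
```

With this grid size the numerical values should match the hand computations to about two or three decimal places.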

132
MTH 2321 Notes 27 SURFACE INTEGRALS

In the case where the surface $S$ is the graph of a function $z = g(x, y)$, recall that $S$ is parametrized by
$$G(x, y) = (x, y, g(x, y)),$$
and that then the tangent and normal vectors are
$$\mathbf{T}_x = \langle 1, 0, g_x(x, y) \rangle, \qquad \mathbf{T}_y = \langle 0, 1, g_y(x, y) \rangle,$$
$$\mathbf{N} = \mathbf{T}_x \times \mathbf{T}_y = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 1 & 0 & g_x(x, y) \\ 0 & 1 & g_y(x, y) \end{vmatrix} = \langle -g_x(x, y), -g_y(x, y), 1 \rangle,$$
and hence
$$\|\mathbf{N}\| = \sqrt{1 + g_x(x, y)^2 + g_y(x, y)^2}.$$
Then the surface integral of $f(x, y, z)$ over the portion of the graph of $g(x, y)$ lying over a domain $D$ in the xy-plane is
$$\iint_S f(x, y, z) \, dS = \iint_D f(x, y, g(x, y)) \sqrt{1 + g_x(x, y)^2 + g_y(x, y)^2} \, dx \, dy.$$
Example 27.4. Calculate $\iint_S (z - x)\,dS$ where S is the portion of the graph of $z = x + y^2$ with $0 \le x \le y$ and $0 \le y \le 1$.

S is graphed to the right.

Let $z = g(x, y) = x + y^2$. Then $g_x = 1$, $g_y = 2y$, and

\[ dS = \sqrt{1 + g_x(x, y)^2 + g_y(x, y)^2}\;dx\,dy = \sqrt{1 + 1 + 4y^2}\;dx\,dy = \sqrt{2 + 4y^2}\;dx\,dy. \]

Then for $f(x, y, z) = z - x$ we get

\[ f(x, y, g(x, y)) = f(x, y, x + y^2) = x + y^2 - x = y^2. \]

Then,

\[ \begin{aligned}
\iint_S (z - x)\,dS &= \int_{y=0}^{1}\int_{x=0}^{y} y^2\sqrt{2 + 4y^2}\;dx\,dy \\
&= \int_0^1 \left[ x\,y^2\sqrt{2 + 4y^2} \right]_{x=0}^{x=y} dy \\
&= \int_0^1 y^3\sqrt{2 + 4y^2}\;dy.
\end{aligned} \]

Then let $u = 2 + 4y^2$ to get $du = 8y\,dy$ and $y^2 = \tfrac14(u - 2)$. Then

\[ \begin{aligned}
\int_0^1 y^3\sqrt{2 + 4y^2}\;dy &= \frac18\int_{u=2}^{6} \frac14 (u - 2)\sqrt{u}\;du \\
&= \frac{1}{32}\int_2^6 \left(u^{3/2} - 2u^{1/2}\right) du \\
&= \frac{1}{32}\left[\frac25 u^{5/2} - \frac43 u^{3/2}\right]_2^6 = \frac{6\sqrt{6} + \sqrt{2}}{30}.
\end{aligned} \]
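As a rough numerical cross-check of Example 27.4, the graph formula can be evaluated directly over the triangular domain 0 ≤ x ≤ y, 0 ≤ y ≤ 1. This is only a sketch under stated assumptions: the function name `graph_surface_integral`, its argument names, and the grid size are illustrative choices, and the midpoint rule is used in place of the substitution carried out above.

```python
import math

def graph_surface_integral(f, g, g_x, g_y, y_lo, y_hi, x_lo, x_hi, n=200):
    """Estimate the integral of f over the graph z = g(x, y),
    for y in [y_lo, y_hi] and x in [x_lo(y), x_hi(y)], by a midpoint rule."""
    dy = (y_hi - y_lo) / n
    total = 0.0
    for j in range(n):
        y = y_lo + (j + 0.5) * dy
        a, b = x_lo(y), x_hi(y)
        dx = (b - a) / n
        for i in range(n):
            x = a + (i + 0.5) * dx
            # area distortion factor ||N|| = sqrt(1 + g_x^2 + g_y^2)
            stretch = math.sqrt(1.0 + g_x(x, y) ** 2 + g_y(x, y) ** 2)
            total += f(x, y, g(x, y)) * stretch * dx * dy
    return total

value = graph_surface_integral(
    f=lambda x, y, z: z - x,
    g=lambda x, y: x + y * y,
    g_x=lambda x, y: 1.0,
    g_y=lambda x, y: 2.0 * y,
    y_lo=0.0, y_hi=1.0,
    x_lo=lambda y: 0.0, x_hi=lambda y: y,
)
exact = (6 * math.sqrt(6) + math.sqrt(2)) / 30
print(value, exact)
```

The two printed numbers should agree to roughly four decimal places.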


28 Surface Integrals of Vector Fields (16.5)


The last type of integral we will look at is the surface integral of a vector field. Just as we computed flux across a plane curve using line integrals of vector fields, we will compute flux through a surface, e.g. the flow of molecules across a cell membrane, via surface integrals of vector fields.

As flow through a surface distinguishes two sides of the surface (“inside” and “outside”), we need to specify a positive direction. As with plane curves, we do this via an orientation, which in this case depends on the choice of a unit normal vector n(P).

This orientation of a surface specifies two sides of the surface in a consistent way, where flow in the direction of n(P) is counted as positive and flow in the direction of −n(P) is counted as negative.

Recall that we computed flux across a plane curve via the normal component F · n. We will compute flux through a surface similarly.

Definition 28.1 (Normal Component). The normal component of a vector field F at a point P on an oriented surface S is

\[ \mathbf{F}(P)\cdot\mathbf{n}(P) = \|\mathbf{F}(P)\|\cos\theta, \]

where θ is the angle between F(P) and n(P).

We now define the flux of F through S, also called the vector surface integral, as the surface integral of the normal component.

Definition 28.2 (Vector Surface Integral/Flux). The vector surface integral is defined as

\[ \iint_S \mathbf{F}\cdot d\mathbf{S} = \iint_S (\mathbf{F}\cdot\mathbf{n})\,dS. \]

This is also called the flux of F across S.

Recall from last section that we computed surface integrals over S as double integrals via a parametrization G(u, v) of S, so that

\[ \iint_S f(x, y, z)\,dS = \iint_D f(G(u, v))\,\|\mathbf{N}(u, v)\|\,du\,dv. \]

In the case of a surface integral of the normal component F · n, let G(u, v) be a regular parametrization oriented with $\mathbf{N}(u, v) = \mathbf{T}_u \times \mathbf{T}_v$ giving the orientation. Then the unit normal vector n is

\[ \mathbf{n} = \frac{\mathbf{N}(u, v)}{\|\mathbf{N}(u, v)\|}, \]

and hence we get

\[ \begin{aligned}
\iint_S \mathbf{F}\cdot d\mathbf{S} &= \iint_S (\mathbf{F}\cdot\mathbf{n})\,dS \\
&= \iint_D \mathbf{F}(G(u, v))\cdot\left(\frac{\mathbf{N}(u, v)}{\|\mathbf{N}(u, v)\|}\right)\|\mathbf{N}(u, v)\|\,du\,dv \\
&= \iint_D \mathbf{F}(G(u, v))\cdot\mathbf{N}(u, v)\,du\,dv.
\end{aligned} \]

In fact this formula holds even if N(u, v) is zero at points on the boundary of D.

From the above, we can remember how to compute vector surface integrals via the vector surface element

\[ d\mathbf{S} = \mathbf{N}(u, v)\,du\,dv. \]

Notice also that if we choose the opposite orientation for the surface S, we replace N(u, v) with −N(u, v), and so the integral changes sign.
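Example 28.3. Calculate $\iint_S \mathbf{F}\cdot d\mathbf{S}$ where $\mathbf{F} = \langle 0, 0, x\rangle$ and S is the surface parametrized by $G(u, v) = (u^2,\ v,\ u^3 - v^2)$ for $0 \le u \le 1$, $0 \le v \le 1$, and oriented by upward pointing normal vectors.

First, compute N:

\[ \mathbf{T}_u = \langle 2u,\ 0,\ 3u^2\rangle, \qquad \mathbf{T}_v = \langle 0,\ 1,\ -2v\rangle, \]

\[ \mathbf{N}(u, v) = \mathbf{T}_u \times \mathbf{T}_v = \langle -3u^2,\ 4uv,\ 2u\rangle. \]

As $0 \le u \le 1$, the z-component of N is nonnegative, and so it is the upward pointing normal.

Now let’s find $\mathbf{F}(G(u, v))\cdot\mathbf{N}(u, v)$:

\[ \mathbf{F}(G(u, v)) = \langle 0, 0, u^2\rangle \implies \mathbf{F}(G(u, v))\cdot\mathbf{N}(u, v) = 2u^3. \]

Finally, compute the vector surface integral as a double integral over the rectangle $[0, 1]\times[0, 1]$:

\[ \begin{aligned}
\iint_S \mathbf{F}\cdot d\mathbf{S} &= \int_{u=0}^{1}\int_{v=0}^{1} \mathbf{F}(G(u, v))\cdot\mathbf{N}(u, v)\,dv\,du \\
&= \int_{u=0}^{1}\int_{v=0}^{1} 2u^3\,dv\,du \\
&= \int_{u=0}^{1} 2u^3\,du = \frac12.
\end{aligned} \]

Notice that a positive answer is expected: over S the vector field F points vertically upward (since $x = u^2 \ge 0$), which matches the direction of the upward pointing normal N(u, v).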
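The recipe used in this example (tangent vectors, cross product, dot product, double integral) is mechanical enough to sketch in code. The block below is an illustrative approximation rather than the method of the notes: it estimates the tangent vectors by central finite differences instead of differentiating by hand, and integrates with a midpoint rule. The names `flux`, `cross`, `dot` and the step size `h` are assumptions made for this sketch.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two vectors given as tuples."""
    return sum(p * q for p, q in zip(a, b))

def flux(F, G, u_range, v_range, n=200, h=1e-6):
    """Estimate the integral of F(G(u, v)) . (T_u x T_v) du dv by a midpoint rule.
    T_u and T_v are approximated by central differences with step h."""
    (u0, u1), (v0, v1) = u_range, v_range
    du, dv = (u1 - u0) / n, (v1 - v0) / n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * du
        for j in range(n):
            v = v0 + (j + 0.5) * dv
            Tu = tuple((p - q) / (2 * h) for p, q in zip(G(u + h, v), G(u - h, v)))
            Tv = tuple((p - q) / (2 * h) for p, q in zip(G(u, v + h), G(u, v - h)))
            total += dot(F(*G(u, v)), cross(Tu, Tv))
    return total * du * dv

# Example 28.3: F = <0, 0, x>, G(u, v) = (u^2, v, u^3 - v^2) on the unit square.
flux_283 = flux(
    F=lambda x, y, z: (0.0, 0.0, x),
    G=lambda u, v: (u * u, v, u ** 3 - v * v),
    u_range=(0.0, 1.0),
    v_range=(0.0, 1.0),
)
print(flux_283)  # should be close to 1/2
```

Note that the sign of the result depends on the order of the parameters: swapping u and v replaces N by −N, which is exactly the orientation reversal discussed above.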


Example 28.4. Calculate the flux of $\mathbf{F} = \langle 0, x^2, 0\rangle$ through the surface S defined by $y = 1 + x^2 + z^2$ for $1 \le y \le 5$ and oriented with normal pointing in the negative y-direction.

As S is given explicitly as y in terms of x and z, let’s view S as the graph of $f(x, z) = y = 1 + x^2 + z^2$. We can then parametrize S as

\[ G(x, z) = (x,\ 1 + x^2 + z^2,\ z). \]

To find the parameter domain D, use $1 \le y \le 5$:

\[ 1 \le y \le 5 \implies 1 \le 1 + x^2 + z^2 \le 5 \implies 0 \le x^2 + z^2 \le 4, \]

so the parameter domain is the disk centered at (0, 0) of radius 2 in the xz-plane.

So, let’s switch to polar coordinates in the xz-plane! Let $x = r\cos\theta$ and $z = r\sin\theta$, so that

\[ y = 1 + x^2 + z^2 = 1 + r^2, \]

and so we can parametrize S in terms of r and θ as

\[ G(r, \theta) = (r\cos\theta,\ 1 + r^2,\ r\sin\theta), \qquad 0 \le \theta \le 2\pi,\ 0 \le r \le 2. \]

Now let’s compute N:

\[ \mathbf{T}_r = \langle \cos\theta,\ 2r,\ \sin\theta\rangle, \qquad \mathbf{T}_\theta = \langle -r\sin\theta,\ 0,\ r\cos\theta\rangle, \]

and

\[ \mathbf{N} = \mathbf{T}_r \times \mathbf{T}_\theta = \langle 2r^2\cos\theta,\ -r,\ 2r^2\sin\theta\rangle. \]

As the y-component of N is negative (for r > 0), this is the orientation we want. Now let’s find $\mathbf{F}(G(r, \theta))\cdot\mathbf{N}(r, \theta)$:

\[ \mathbf{F}(G(r, \theta)) = \langle 0,\ r^2\cos^2\theta,\ 0\rangle \implies \mathbf{F}(G(r, \theta))\cdot\mathbf{N}(r, \theta) = -r^3\cos^2\theta. \]

Now integrate to find the flux:

\[ \begin{aligned}
\iint_S \mathbf{F}\cdot d\mathbf{S} &= \iint_D \mathbf{F}(G(r, \theta))\cdot\mathbf{N}(r, \theta)\,dr\,d\theta \\
&= \int_{\theta=0}^{2\pi}\int_{r=0}^{2} -r^3\cos^2\theta\,dr\,d\theta \\
&= -\left(\int_{\theta=0}^{2\pi}\cos^2\theta\,d\theta\right)\left(\int_{r=0}^{2} r^3\,dr\right) \\
&= -(\pi)(4) = -4\pi.
\end{aligned} \]

Note that we do not add in an extra r in this case, which in the past came from a Jacobian factor, even though we converted to polar! When doing surface integrals, the area distortion factor $\|\mathbf{N}\|$ encodes this information.


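Example 28.5. Calculate the flux of $\mathbf{F} = \langle z, x, 1\rangle$ across the upper hemisphere S of the unit sphere in $\mathbb{R}^3$, oriented with outward pointing normal.

The upper half of the unit sphere is parametrized in spherical coordinates by

\[ G(\theta, \phi) = (\cos\theta\sin\phi,\ \sin\theta\sin\phi,\ \cos\phi), \]

for $0 \le \phi \le \frac{\pi}{2}$ and $0 \le \theta \le 2\pi$.

Recall all the way back in section 15 we computed the outward pointing normal vector for a sphere! You can recompute it now if you want, or go back and check that it is

\[ \mathbf{N} = \sin\phi\,\langle \cos\theta\sin\phi,\ \sin\theta\sin\phi,\ \cos\phi\rangle. \]

Now let’s find $\mathbf{F}(G(\theta, \phi))\cdot\mathbf{N}(\theta, \phi)$:

\[ \mathbf{F}(G(\theta, \phi)) = \langle z, x, 1\rangle = \langle \cos\phi,\ \cos\theta\sin\phi,\ 1\rangle, \]

and so

\[ \mathbf{F}(G(\theta, \phi))\cdot\mathbf{N}(\theta, \phi) = \cos\theta\sin^2\phi\cos\phi + \cos\theta\sin\theta\sin^3\phi + \cos\phi\sin\phi. \]

Then integrate to calculate the flux:

\[ \begin{aligned}
\iint_S \mathbf{F}\cdot d\mathbf{S} &= \int_{\phi=0}^{\pi/2}\int_{\theta=0}^{2\pi} \mathbf{F}(G(\theta, \phi))\cdot\mathbf{N}(\theta, \phi)\,d\theta\,d\phi \\
&= \int_{\phi=0}^{\pi/2}\int_{\theta=0}^{2\pi} \Big(\underbrace{\cos\theta\sin^2\phi\cos\phi + \cos\theta\sin\theta\sin^3\phi}_{\text{integral over } 0 \le \theta \le 2\pi \text{ is zero}} + \cos\phi\sin\phi\Big)\,d\theta\,d\phi \\
&= \int_{\phi=0}^{\pi/2}\int_{\theta=0}^{2\pi} \cos\phi\sin\phi\,d\theta\,d\phi \\
&= 2\pi\int_{\phi=0}^{\pi/2}\cos\phi\sin\phi\,d\phi = -\pi\cos^2\phi\,\Big|_{0}^{\pi/2} = \pi.
\end{aligned} \]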

As a final example we will consider flow rate through a surface.

In the figure to the right, let the fisherman’s net be


the surface S, and consider the vector field v that
gives the velocity of the water flow. The flow rate
is the volume of water that flows through the net per
unit time. We then have the following.

For a fluid with velocity vector field v, the flow rate across the surface S, in volume per unit time, is

\[ \iint_S \mathbf{v}\cdot d\mathbf{S}. \]


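Example 28.6. Let $\mathbf{v} = \langle x^2 + y^2,\ 0,\ z^2\rangle$ be the velocity vector field (in centimeters per second) of a fluid in $\mathbb{R}^3$. Compute the flow rate through the upper half of the unit sphere centered at the origin.

Using spherical coordinates, the upper half of the unit sphere, S, is given by

\[ x = \cos\theta\sin\phi, \qquad y = \sin\theta\sin\phi, \qquad z = \cos\phi, \]

for $0 \le \phi \le \frac{\pi}{2}$ and $0 \le \theta \le 2\pi$. As before, the upward pointing normal vector is

\[ \mathbf{N} = \sin\phi\,\langle \cos\theta\sin\phi,\ \sin\theta\sin\phi,\ \cos\phi\rangle. \]

Then,

\[ \mathbf{v} = \langle x^2 + y^2,\ 0,\ z^2\rangle = \langle \sin^2\phi,\ 0,\ \cos^2\phi\rangle, \]

\[ \mathbf{v}\cdot\mathbf{N} = \sin^4\phi\cos\theta + \sin\phi\cos^3\phi. \]

Then,

\[ \iint_S \mathbf{v}\cdot d\mathbf{S} = \int_{\phi=0}^{\pi/2}\int_{\theta=0}^{2\pi} \left(\sin^4\phi\cos\theta + \sin\phi\cos^3\phi\right) d\theta\,d\phi. \]

As before, the integral over $0 \le \theta \le 2\pi$ of $\sin^4\phi\cos\theta$ is zero, which leaves

\[ \begin{aligned}
\int_{\phi=0}^{\pi/2}\int_{\theta=0}^{2\pi} \sin\phi\cos^3\phi\,d\theta\,d\phi &= 2\pi\int_{\phi=0}^{\pi/2}\sin\phi\cos^3\phi\,d\phi \\
&= 2\pi\left[-\frac{\cos^4\phi}{4}\right]_{\phi=0}^{\pi/2} = \frac{\pi}{2}\ \text{cm}^3/\text{s}.
\end{aligned} \]

Note that since we used the upward pointing normal N, this is the flow rate through the hemisphere from below to above.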
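Both hemisphere computations (Examples 28.5 and 28.6) use the same normal vector, so a single routine can check them. The sketch below is an illustration under assumptions, not part of the notes: it takes the formula N = sin φ ⟨x, y, z⟩ quoted above as given, and the function name `hemisphere_flux` and the grid size are arbitrary choices.

```python
import math

def hemisphere_flux(field, n=200):
    """Midpoint-rule estimate of the flux of `field` through the upper unit
    hemisphere, using the outward normal N = sin(phi) * (x, y, z)."""
    d_theta = 2.0 * math.pi / n
    d_phi = (math.pi / 2.0) / n
    total = 0.0
    for i in range(n):
        phi = (i + 0.5) * d_phi
        s = math.sin(phi)
        for j in range(n):
            theta = (j + 0.5) * d_theta
            x = math.cos(theta) * s
            y = math.sin(theta) * s
            z = math.cos(phi)
            Fx, Fy, Fz = field(x, y, z)
            # F(G) . N with N = sin(phi) * (x, y, z)
            total += s * (Fx * x + Fy * y + Fz * z)
    return total * d_theta * d_phi

flux_285 = hemisphere_flux(lambda x, y, z: (z, x, 1.0))               # Example 28.5
flow_286 = hemisphere_flux(lambda x, y, z: (x * x + y * y, 0.0, z * z))  # Example 28.6
print(flux_285, math.pi)
print(flow_286, math.pi / 2)
```

The terms that the examples discard by symmetry in θ are kept here; they sum to (numerically) zero, which is a useful check on the symmetry argument.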


29 Green’s Theorem (17.1)


Our last three sections cover three generalizations of the fundamental theorem of calculus. Recall this theorem: If $F'$ is continuous on $[a, b]$ then

\[ \int_a^b F'(x)\,dx = F(b) - F(a). \]

If we view the boundary of the interval [a, b] as the set of two points {a, b}, then this theorem says we can find the integral of the derivative of a function over an interval by evaluating the function on the boundary of the interval.

Our first generalization of the FTOC says that we can find the double integral of a certain type of derivative
over a region D in the xy-plane by finding the line integral around the boundary of the region. Our second
generalization beefs this up to surface integrals of a type of derivative being equal to line integrals over the
boundary of the surface, and the final generalization allows us to find triple integrals of a type of derivative
over a region in space as the surface integral over the boundary of the solid.

In this section we will look at our first generalization of the FTOC, called Green’s Theorem. Recall from section 26 that the circulation of a conservative vector field F around any closed path is zero. Green’s theorem tells us what happens if F is not conservative. We first need to introduce some notation.

Definition 29.1. Consider a domain D whose boundary, denoted ∂D, is a simple closed curve (from section 21). The boundary orientation of ∂D is the direction to traverse the boundary such that the region is always on your left, as in the figure to the right. If there is a single boundary curve, this is equivalent to going counterclockwise.

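Recall the following notation for the line integral of $\mathbf{F} = \langle F_1, F_2\rangle$:

\[ \int_C \mathbf{F}\cdot d\mathbf{r} \quad\text{and}\quad \int_C F_1\,dx + F_2\,dy. \]

If C is parametrized by $\mathbf{r}(t) = \langle x(t), y(t)\rangle$ for $a \le t \le b$, then

\[ dx = x'(t)\,dt \quad\text{and}\quad dy = y'(t)\,dt, \]

and

\[ \int_C F_1\,dx + F_2\,dy = \int_a^b \big(F_1(x(t), y(t))\,x'(t) + F_2(x(t), y(t))\,y'(t)\big)\,dt. \]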

Theorem 29.2 (Green’s Theorem). Let D be a domain whose boundary ∂D is a simple closed curve oriented counterclockwise and let $\mathbf{F} = \langle F_1, F_2\rangle$ where $F_1$ and $F_2$ have continuous second order partial derivatives. Then

\[ \oint_{\partial D} \mathbf{F}\cdot d\mathbf{r} = \oint_{\partial D} F_1\,dx + F_2\,dy = \iint_D \left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right) dA. \]

Notice that if F is conservative, and hence $\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} = 0$, then Green’s theorem confirms what we already know: The line integral of a conservative vector field around a closed curve is zero.


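Example 29.3. Verify Green’s Theorem for the line integral along the unit circle C oriented counterclockwise:

\[ \oint_C xy^2\,dx + x\,dy. \]

To verify Green’s theorem for this integral we compute the left and right hand sides of Green’s theorem and verify they are equal. First, evaluate the integral directly.

The unit circle is parametrized by $x = \cos\theta$ and $y = \sin\theta$ for $0 \le \theta \le 2\pi$, and hence

\[ dx = -\sin\theta\,d\theta \quad\text{and}\quad dy = \cos\theta\,d\theta. \]

Then

\[ \begin{aligned}
\oint_C xy^2\,dx + x\,dy &= \int_0^{2\pi} \big(\cos\theta\sin^2\theta(-\sin\theta) + \cos\theta(\cos\theta)\big)\,d\theta \\
&= \int_0^{2\pi} \big(-\cos\theta\sin^3\theta + \cos^2\theta\big)\,d\theta.
\end{aligned} \]

Then using the substitution $u = \sin\theta$ and the identity $\cos^2\theta = \frac12(1 + \cos 2\theta)$ we get

\[ \begin{aligned}
\int_0^{2\pi} \big(-\cos\theta\sin^3\theta + \cos^2\theta\big)\,d\theta &= -\frac{\sin^4\theta}{4}\,\Big|_0^{2\pi} + \frac12\left(\theta + \frac12\sin 2\theta\right)\Big|_0^{2\pi} \\
&= 0 + \frac12(2\pi + 0) = \pi.
\end{aligned} \]

Now let’s evaluate the integral using Green’s Theorem:

\[ \begin{aligned}
\oint_C xy^2\,dx + x\,dy &= \iint_D \left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right) dA = \iint_D (1 - 2xy)\,dA \\
&= \iint_D 1\,dA - \iint_D 2xy\,dA = \text{Area}(D) - \iint_D 2xy\,dA,
\end{aligned} \]

where D is the unit disk $x^2 + y^2 \le 1$. Area(D) = π, and the function 2xy is odd in y (and in x) while the unit disk is symmetric about both coordinate axes, and so this term integrates to zero, leaving

\[ \iint_D \left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right) dA = \pi. \]

Let’s verify directly that the 2xy term integrates to zero:

\[ \begin{aligned}
\iint_D 2xy\,dA &= \int_{x=-1}^{1}\int_{y=-\sqrt{1-x^2}}^{\sqrt{1-x^2}} 2xy\,dy\,dx \\
&= \int_{x=-1}^{1} \Big[xy^2\Big]_{y=-\sqrt{1-x^2}}^{\sqrt{1-x^2}}\,dx \\
&= \int_{x=-1}^{1} \big(x(1 - x^2) - x(1 - x^2)\big)\,dx = 0.
\end{aligned} \]

Thus Green’s theorem is verified for this example.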
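The same verification can be carried out numerically. The following sketch (an illustration with assumed helper names, not the notes’ own method) evaluates the circulation around the unit circle and the double integral of ∂F₂/∂x − ∂F₁/∂y over the unit disk; the partial derivatives are approximated by central differences and the disk is integrated in polar coordinates with the usual Jacobian factor r.

```python
import math

def F1(x, y):
    return x * y ** 2

def F2(x, y):
    return x

def circulation_unit_circle(n=2000):
    """Midpoint-rule estimate of the closed line integral F1 dx + F2 dy
    around the unit circle, traversed counterclockwise."""
    d_theta = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        th = (k + 0.5) * d_theta
        x, y = math.cos(th), math.sin(th)
        # dx = -sin(th) d_theta, dy = cos(th) d_theta
        total += (F1(x, y) * (-math.sin(th)) + F2(x, y) * math.cos(th)) * d_theta
    return total

def green_double_integral(n=300, h=1e-5):
    """Midpoint-rule estimate of the double integral of dF2/dx - dF1/dy over
    the unit disk, in polar coordinates (note the Jacobian factor r)."""
    d_r, d_theta = 1.0 / n, 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * d_r
        for j in range(n):
            th = (j + 0.5) * d_theta
            x, y = r * math.cos(th), r * math.sin(th)
            dF2_dx = (F2(x + h, y) - F2(x - h, y)) / (2 * h)
            dF1_dy = (F1(x, y + h) - F1(x, y - h)) / (2 * h)
            total += (dF2_dx - dF1_dy) * r * d_r * d_theta
    return total

lhs = circulation_unit_circle()
rhs = green_double_integral()
print(lhs, rhs, math.pi)
```

All three printed values should agree closely. Changing `F1` and `F2` gives a quick way to test other fields on the unit disk.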


Example 29.4. Compute the circulation of $\mathbf{F}(x, y) = \langle \sin x,\ x^2y^3\rangle$ around the triangle C with vertices (0, 0), (2, 0), and (2, 2), oriented counterclockwise.

C is graphed to the right.

To compute the line integral directly, we would need to parametrize the line segments making up the three sides of the triangle. Instead, we use Green’s Theorem to integrate over the vertically simple region D enclosed by C, which is given by:

\[ 0 \le x \le 2 \quad\text{and}\quad 0 \le y \le x. \]

By Green’s Theorem,

\[ \begin{aligned}
\oint_C \sin x\,dx + x^2y^3\,dy &= \iint_D \left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right) dA \\
&= \iint_D 2xy^3\,dA \\
&= \int_{x=0}^{2}\int_{y=0}^{x} 2xy^3\,dy\,dx \\
&= \int_{x=0}^{2} \frac{xy^4}{2}\,\Big|_{y=0}^{x}\,dx \\
&= \int_{x=0}^{2} \frac{x^5}{2}\,dx = \frac{x^6}{12}\,\Big|_0^2 = \frac{16}{3}.
\end{aligned} \]
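Recall that $\iint_D dA = \text{area}(D)$. So, if we can find a vector field $\mathbf{F} = \langle F_1, F_2\rangle$ such that $\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} = 1$ then, by Green’s Theorem,

\[ \begin{aligned}
\oint_C \mathbf{F}\cdot d\mathbf{r} = \oint_C F_1\,dx + F_2\,dy &= \iint_D \left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right) dA \\
&= \iint_D dA = \text{area}(D).
\end{aligned} \]

So how do we find such a vector field $\mathbf{F} = \langle F_1, F_2\rangle$? Here are three ways:

• If we choose $F_1 = 0$, then $\frac{\partial F_1}{\partial y} = 0$, and so we need $\frac{\partial F_2}{\partial x} = 1$. Thus if $\mathbf{F} = \langle 0, x\rangle$, then $\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} = 1 - 0 = 1$.
• Similarly, if we choose $F_2 = 0$, then $\frac{\partial F_2}{\partial x} = 0$, and so we need $\frac{\partial F_1}{\partial y} = -1$. Thus if $\mathbf{F} = \langle -y, 0\rangle$, then $\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} = 0 - (-1) = 1$.
• Combining the above ideas, if we choose $\mathbf{F} = \left\langle -\frac{y}{2}, \frac{x}{2}\right\rangle$, then $\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} = \frac12 + \frac12 = 1$.

Thus we get three ways to find the area enclosed by a simple closed curve C (oriented counterclockwise):

\[ \text{Area enclosed by } C = \oint_C x\,dy = \oint_C -y\,dx = \frac12\oint_C x\,dy - y\,dx. \]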


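Example 29.5. Find a general formula for the area bounded by an ellipse, given by $\left(\frac{x}{a}\right)^2 + \left(\frac{y}{b}\right)^2 = 1$, using a line integral.

The ellipse C can be parametrized by

\[ x = a\cos\theta, \qquad y = b\sin\theta, \qquad \text{for } 0 \le \theta < 2\pi, \]

so

\[ dx = -a\sin\theta\,d\theta \quad\text{and}\quad dy = b\cos\theta\,d\theta. \]

Let’s find the area in all three ways above, just for fun.

\[ \text{Area of ellipse} = \oint_C x\,dy = \int_0^{2\pi} (a\cos\theta)(b\cos\theta)\,d\theta = ab\int_0^{2\pi}\cos^2\theta\,d\theta = ab\pi \]

\[ \text{Area of ellipse} = \oint_C -y\,dx = \int_0^{2\pi} (-b\sin\theta)(-a\sin\theta)\,d\theta = ab\int_0^{2\pi}\sin^2\theta\,d\theta = ab\pi \]

\[ \begin{aligned}
\text{Area of ellipse} &= \frac12\oint_C x\,dy - y\,dx \\
&= \frac12\int_0^{2\pi} \big((a\cos\theta)(b\cos\theta) - (b\sin\theta)(-a\sin\theta)\big)\,d\theta \\
&= \frac{ab}{2}\int_0^{2\pi} \big(\cos^2\theta + \sin^2\theta\big)\,d\theta \\
&= \frac{ab}{2}\int_0^{2\pi} d\theta = ab\pi
\end{aligned} \]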
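The third area formula has a well-known discrete counterpart: for a polygon, (1/2)∮ x dy − y dx evaluates exactly to the "shoelace" sum over consecutive vertices. The sketch below is an illustrative aside, with assumed function names, that approximates the ellipse by a polygon with many vertices and compares the result with abπ.

```python
import math

def polygon_area(vertices):
    """Shoelace formula: (1/2) * sum of (x_k * y_{k+1} - x_{k+1} * y_k).
    This is exactly (1/2) * closed integral of x dy - y dx around a polygon
    whose vertices are listed counterclockwise."""
    total = 0.0
    m = len(vertices)
    for k in range(m):
        x0, y0 = vertices[k]
        x1, y1 = vertices[(k + 1) % m]
        total += x0 * y1 - x1 * y0
    return 0.5 * total

def ellipse_polygon(a, b, m=10000):
    """Vertices of an m-gon inscribed in the ellipse x = a cos(t), y = b sin(t)."""
    return [(a * math.cos(2 * math.pi * k / m), b * math.sin(2 * math.pi * k / m))
            for k in range(m)]

approx = polygon_area(ellipse_polygon(3.0, 2.0))
square = polygon_area([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)])
print(approx, 3.0 * 2.0 * math.pi)
print(square)
```

Listing the vertices clockwise instead flips the sign of the result, mirroring the role of the boundary orientation in Green’s Theorem.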

As with all other types of integrals we have seen, circulation around a closed curve has an important additivity property.

If we decompose a domain D into two (or more) non-overlapping domains $D_1$ and $D_2$ that intersect only on part of their boundaries, as above, then

\[ \oint_{\partial D} \mathbf{F}\cdot d\mathbf{r} = \oint_{\partial D_1} \mathbf{F}\cdot d\mathbf{r} + \oint_{\partial D_2} \mathbf{F}\cdot d\mathbf{r}. \]

Notice if you “add up” all of the boundaries of $D_1$ and $D_2$, you get $C_{\text{top}}$, $C_{\text{bot}}$, and two copies of $C_{\text{middle}}$ oriented in opposite directions, which justifies the equality above. We will use this to show that we can extend Green’s theorem to more general domains.

Consider a domain D whose boundary consists of more than one simple closed curve, as below. As before, ∂D denotes the boundary of D with the boundary orientation (so that the region lies on the left as the curve is traversed).


In the above, notice that $C_5$ is oriented so that $D_2$ lies on the right as $C_5$ is traversed. This is why $C_5$ shows up with a minus sign in the boundary orientation of $D_2$.

As hinted at above, Green’s Theorem remains valid for these more general domains!
Theorem 29.6 (General form of Green’s Theorem). Let D be a domain whose boundary ∂D consists of finitely many simple closed curves oriented with the boundary orientation, and let $\mathbf{F} = \langle F_1, F_2\rangle$ where $F_1$ and $F_2$ have continuous second order partial derivatives. Then

\[ \oint_{\partial D} \mathbf{F}\cdot d\mathbf{r} = \oint_{\partial D} F_1\,dx + F_2\,dy = \iint_D \left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right) dA. \]

This can be shown by the additivity property from above. To illustrate this, consider the figure below.

By decomposing the domain D into D1 and D2 , we get

∂D = ∂D1 + ∂D2

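as the shared boundaries have opposite orientation, and thus cancel. Then use the previous version of Green’s Theorem and the additivity of double integrals:

\[ \begin{aligned}
\oint_{\partial D} \mathbf{F}\cdot d\mathbf{r} &= \oint_{\partial D_1} \mathbf{F}\cdot d\mathbf{r} + \oint_{\partial D_2} \mathbf{F}\cdot d\mathbf{r} \\
&= \iint_{D_1} \left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right) dA + \iint_{D_2} \left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right) dA \\
&= \iint_D \left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right) dA.
\end{aligned} \]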


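Example 29.7. Calculate $\oint_{C_1} \mathbf{F}\cdot d\mathbf{r}$ where

\[ \mathbf{F} = \langle x - y,\ x + y^3\rangle \]

and $C_1$ is the outer boundary curve, oriented counterclockwise, of the domain D to the right. Assume the area of D is 8.

Notice that the curve $C_1$ is not specified, and so cannot be parametrized!

This means we cannot compute the line integral directly. However, $\partial D = C_1 - C_2$, where $C_2$ is the unit circle parametrized in the counterclockwise direction, and so by Green’s Theorem

\[ \oint_{C_1} \mathbf{F}\cdot d\mathbf{r} - \oint_{C_2} \mathbf{F}\cdot d\mathbf{r} = \oint_{\partial D} \mathbf{F}\cdot d\mathbf{r} = \iint_D \left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right) dA. \]

We can then compute both $\oint_{C_2} \mathbf{F}\cdot d\mathbf{r}$ and $\iint_D \left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right) dA$, and solve for $\oint_{C_1} \mathbf{F}\cdot d\mathbf{r}$!

As

\[ \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} = \frac{\partial}{\partial x}\left(x + y^3\right) - \frac{\partial}{\partial y}(x - y) = 1 - (-1) = 2, \]

we have

\[ \iint_D \left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right) dA = \iint_D 2\,dA = 2\,\text{Area}(D) = 16. \]

To compute $\oint_{C_2} \mathbf{F}\cdot d\mathbf{r}$, parametrize the unit circle in the usual way: $\mathbf{r}(\theta) = \langle \cos\theta, \sin\theta\rangle$ for $0 \le \theta \le 2\pi$. Then

\[ \begin{aligned}
\mathbf{F}(\mathbf{r}(\theta))\cdot\mathbf{r}'(\theta) &= \langle \cos\theta - \sin\theta,\ \cos\theta + \sin^3\theta\rangle\cdot\langle -\sin\theta,\ \cos\theta\rangle \\
&= -\sin\theta\cos\theta + \sin^2\theta + \cos^2\theta + \sin^3\theta\cos\theta \\
&= 1 - \sin\theta\cos\theta + \sin^3\theta\cos\theta.
\end{aligned} \]

Using your outstanding calculus 2 skills you should be able to then evaluate the following integral:

\[ \oint_{C_2} \mathbf{F}\cdot d\mathbf{r} = \int_0^{2\pi} \big(1 - \sin\theta\cos\theta + \sin^3\theta\cos\theta\big)\,d\theta = 2\pi. \]

Finally, we solve for $\oint_{C_1} \mathbf{F}\cdot d\mathbf{r}$:

\[ \oint_{C_1} \mathbf{F}\cdot d\mathbf{r} = \iint_D \left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right) dA + \oint_{C_2} \mathbf{F}\cdot d\mathbf{r} = 16 + 2\pi. \]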

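Finally for this section, recall that the flux across a plane curve C parametrized by $\mathbf{r}(t) = \langle x(t), y(t)\rangle$ for $a \le t \le b$ is given by

\[ \begin{aligned}
\int_C \mathbf{F}\cdot\mathbf{n}\,ds &= \int_a^b \mathbf{F}\cdot\mathbf{n}\,\|\mathbf{r}'(t)\|\,dt = \int_a^b \mathbf{F}\cdot\frac{\langle y'(t), -x'(t)\rangle}{\|\mathbf{r}'(t)\|}\,\|\mathbf{r}'(t)\|\,dt \\
&= \int_a^b \mathbf{F}\cdot\langle y'(t), -x'(t)\rangle\,dt = \int_a^b \big(F_1\,y'(t) - F_2\,x'(t)\big)\,dt \\
&= \int_C F_1\,dy - F_2\,dx.
\end{aligned} \]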


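If C = ∂D is in fact a simple closed curve which is the boundary of a domain D, then we can use Green’s theorem to calculate the flux out of C!

That is, we can use Green’s Theorem on

\[ \oint_C \mathbf{F}\cdot\mathbf{n}\,ds = \oint_C F_1\,dy - F_2\,dx, \]

noting that $F_1$ and $F_2$ have changed roles and that $F_2$ has a negative sign. In other words, we apply Green’s Theorem to the vector field $\langle -F_2, F_1\rangle$.

We then get

\[ \oint_{\partial D} \mathbf{F}\cdot\mathbf{n}\,ds = \oint_{\partial D} F_1\,dy - F_2\,dx = \iint_D \left(\frac{\partial F_1}{\partial x} + \frac{\partial F_2}{\partial y}\right) dA. \]

Now notice that the integrand in the double integral above is exactly div(F)! This gives us the vector form of Green’s Theorem:

\[ \oint_{\partial D} \mathbf{F}\cdot\mathbf{n}\,ds = \iint_D \text{div}(\mathbf{F})\,dA. \]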

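Example 29.8. Calculate the flux of $\mathbf{F} = \langle x^3,\ y^3 + y\rangle$ out of the unit circle.

Using the vector form of Green’s theorem, with $\text{div}(\mathbf{F}) = 3x^2 + 3y^2 + 1$, we get

\[ \text{Flux} = \oint_{\partial D} \mathbf{F}\cdot\mathbf{n}\,ds = \iint_D \left(3x^2 + 3y^2 + 1\right) dA, \]

where D is the unit disk. Converting to polar coordinates we get

\[ \begin{aligned}
\text{Flux} = \iint_D \left(3x^2 + 3y^2 + 1\right) dA &= \int_{\theta=0}^{2\pi}\int_{r=0}^{1} \left(3r^2 + 1\right) r\,dr\,d\theta \\
&= \left(\int_{\theta=0}^{2\pi} d\theta\right)\left(\int_{r=0}^{1} \left(3r^3 + r\right) dr\right) \\
&= 2\pi\left(\frac{3r^4}{4} + \frac{r^2}{2}\right)\Big|_0^1 = \frac{5\pi}{2}.
\end{aligned} \]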
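The answer can be cross-checked from the other side of the vector form of Green’s theorem, by evaluating the boundary integral ∮ F₁ dy − F₂ dx directly on the unit circle. This is a small illustrative sketch; the function name `outward_flux_unit_circle` is an assumption of this example.

```python
import math

def outward_flux_unit_circle(F, n=2000):
    """Midpoint-rule estimate of the closed integral F1 dy - F2 dx around the
    unit circle (counterclockwise), i.e. the flux of F out of the unit disk."""
    d_theta = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        th = (k + 0.5) * d_theta
        x, y = math.cos(th), math.sin(th)
        F1, F2 = F(x, y)
        # on the unit circle: dy = cos(th) d_theta, dx = -sin(th) d_theta
        total += (F1 * math.cos(th) - F2 * (-math.sin(th))) * d_theta
    return total

flux_298 = outward_flux_unit_circle(lambda x, y: (x ** 3, y ** 3 + y))
print(flux_298, 5 * math.pi / 2)
```

The two printed values should match, confirming that the line integral and the double integral of the divergence agree for this field.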


30 Stokes’ Theorem (17.2)


In this section we look at our second big generalization of the fundamental theorem of calculus, called Stokes’ theorem. Stokes’ theorem is in fact an extension of Green’s theorem (in its circulation form, Theorem 29.2) to three dimensions! Recall Green’s theorem:

If C = ∂D is a simple closed curve oriented counterclockwise which is the boundary of a domain D and $\mathbf{F} = \langle F_1, F_2\rangle$ where $F_1$ and $F_2$ have continuous second order partial derivatives, then

\[ \oint_{\partial D} \mathbf{F}\cdot d\mathbf{r} = \iint_D \left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right) dA. \]

In extending this to three dimensions we will replace the circulation around a plane curve with the circulation around certain space curves, and we will replace the double integral with a surface integral! To state Stokes’ theorem we need to give some terminology to the specific settings in which the theorem applies.

Consider the three surfaces below. Each surface, S, has a different type of boundary, ∂S: the surface on
the left has a boundary consisting of a single simple closed curve, the surface in the middle has a boundary
consisting of three simple closed curves, and the surface on the right has an empty boundary, denoted ∂S = ∅.
We call such a surface a closed surface. (Think: closed surfaces won’t leak if filled with liquid.)

As with Green’s theorem in the previous section, the statement of Stokes’ theorem will depend on an appropriately oriented boundary curve. Recall from section 28 that an orientation of a surface is a continuously varying choice of a unit normal vector n(P), where flow in the direction of n(P) is counted as positive and flow in the direction of −n(P) is counted as negative.

As in the previous section, we want a boundary orientation for ∂S that is specified by the orientation of S. This orientation is given, similar to before, by imagining you are a positive normal vector standing on the boundary of the oriented surface. The orientation of the boundary is chosen so that as you walk along the boundary, the surface is on your left. In the figures below, notice that changing which normal vector is the orientation of the surface changes the boundary orientation of the boundary curves.


In the statement of Stokes’ theorem (below), we will assume that S is an oriented surface parametrized by G : D → S where D is a domain in the plane bounded by finitely many smooth, simple, closed curves, and that G is one-to-one and regular, except possibly on the boundary of D. More generally, Stokes’ theorem actually applies to finite unions of surfaces of this type. The surfaces we will consider will always have the above properties.

Theorem 30.1 (Stokes’ Theorem). Let S be a surface as described above and let $\mathbf{F} = \langle F_1, F_2, F_3\rangle$ be such that $F_1$, $F_2$, and $F_3$ have continuous partial derivatives on an open region containing S. Then if ∂S has the boundary orientation,

\[ \oint_{\partial S} \mathbf{F}\cdot d\mathbf{r} = \iint_S \text{curl}(\mathbf{F})\cdot d\mathbf{S}. \]

Note that if S is closed, i.e. ∂S = ∅, then the above integrals are zero.

Recall that we can write the curl of F as curl(F) = ∇ × F, in which case Stokes’ theorem says

\[ \oint_{\partial S} \mathbf{F}\cdot d\mathbf{r} = \iint_S (\nabla\times\mathbf{F})\cdot d\mathbf{S}. \]

This emphasizes that Stokes’ theorem is a generalization of the FTOC: a double integral over a surface of a derivative (the curl) is equal to a single integral over the boundary of the surface.

Recall that the curl measures the extent to which F fails to be conservative, and that if F is conservative then curl(F) = 0. In this case Stokes’ theorem recovers what we already know: the circulation of a conservative vector field around a closed path is zero.

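Example 30.2. Verify Stokes’ Theorem for

\[ \mathbf{F} = \langle -y,\ 2x,\ x + z\rangle \]

and the upper hemisphere of the unit sphere with outward pointing normal vector.

First, the upper hemisphere of the unit sphere is given by

\[ S = \left\{(x, y, z) \mid x^2 + y^2 + z^2 = 1,\ z \ge 0\right\}. \]

Compute the left side of Stokes’ Theorem:

The boundary of S is the unit circle in the xy-plane oriented counterclockwise: $\mathbf{r}(t) = \langle \cos t, \sin t, 0\rangle$ for $0 \le t \le 2\pi$. Then

\[ \mathbf{r}'(t) = \langle -\sin t,\ \cos t,\ 0\rangle \quad\text{and}\quad \mathbf{F}(\mathbf{r}(t)) = \langle -\sin t,\ 2\cos t,\ \cos t\rangle, \]

and hence

\[ \mathbf{F}(\mathbf{r}(t))\cdot\mathbf{r}'(t) = \sin^2 t + 2\cos^2 t = 1 + \cos^2 t. \]

Thus (as we have done before using $\cos^2 t = \frac12(1 + \cos 2t)$),

\[ \oint_{\partial S} \mathbf{F}\cdot d\mathbf{r} = \int_0^{2\pi} \left(1 + \cos^2 t\right) dt = 2\pi + \pi = 3\pi. \]

Compute the right side of Stokes’ Theorem:

The curl of F is

\[ \begin{aligned}
\nabla\times\mathbf{F} &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ -y & 2x & x + z \end{vmatrix} \\
&= \left\langle \frac{\partial}{\partial y}(x + z) - \frac{\partial}{\partial z}(2x),\ -\frac{\partial}{\partial x}(x + z) + \frac{\partial}{\partial z}(-y),\ \frac{\partial}{\partial x}(2x) - \frac{\partial}{\partial y}(-y)\right\rangle = \langle 0, -1, 3\rangle.
\end{aligned} \]

S can be parametrized using spherical coordinates:

\[ G(\theta, \phi) = (\cos\theta\sin\phi,\ \sin\theta\sin\phi,\ \cos\phi) \quad\text{for } 0 \le \theta \le 2\pi,\ 0 \le \phi \le \frac{\pi}{2}. \]

As we computed back in section 15, the outward pointing normal for the unit sphere is

\[ \mathbf{N} = \sin\phi\,\langle \cos\theta\sin\phi,\ \sin\theta\sin\phi,\ \cos\phi\rangle, \]

and hence

\[ \text{curl}(\mathbf{F})\cdot\mathbf{N} = -\sin\theta\sin^2\phi + 3\cos\phi\sin\phi. \]

Then, again using the fact that $\sin\theta$ integrates to zero over $[0, 2\pi]$,

\[ \begin{aligned}
\iint_S \text{curl}(\mathbf{F})\cdot d\mathbf{S} &= \int_{\phi=0}^{\pi/2}\int_{\theta=0}^{2\pi} \left(-\sin\theta\sin^2\phi + 3\cos\phi\sin\phi\right) d\theta\,d\phi \\
&= 0 + 2\pi\int_{\phi=0}^{\pi/2} 3\cos\phi\sin\phi\,d\phi \\
&= 6\pi\cdot\frac12\sin^2\phi\,\Big|_{\phi=0}^{\pi/2} = 3\pi.
\end{aligned} \]

Thus the left and right sides of Stokes’ Theorem agree.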
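Both sides of this verification can also be estimated numerically. The sketch below is illustrative only: the curl is approximated by central differences (the helper `curl` and its step size `h` are assumptions of this example), the normal N = sin φ ⟨x, y, z⟩ is taken from the text above, and both integrals use a midpoint rule.

```python
import math

def F(x, y, z):
    return (-y, 2.0 * x, x + z)

def curl(F, p, h=1e-5):
    """Central-difference approximation of curl(F) at the point p = (x, y, z)."""
    def d(i, j):
        # partial derivative of component i of F with respect to variable j
        hi, lo = list(p), list(p)
        hi[j] += h
        lo[j] -= h
        return (F(*hi)[i] - F(*lo)[i]) / (2 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

def boundary_circulation(n=2000):
    """Line integral of F around the unit circle in the xy-plane (counterclockwise)."""
    dt = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        Fx, Fy, _ = F(math.cos(t), math.sin(t), 0.0)
        total += (Fx * (-math.sin(t)) + Fy * math.cos(t)) * dt
    return total

def curl_flux_hemisphere(n=200):
    """Flux of curl(F) through the upper unit hemisphere, outward normal."""
    d_theta, d_phi = 2.0 * math.pi / n, (math.pi / 2.0) / n
    total = 0.0
    for i in range(n):
        phi = (i + 0.5) * d_phi
        s = math.sin(phi)
        for j in range(n):
            theta = (j + 0.5) * d_theta
            p = (math.cos(theta) * s, math.sin(theta) * s, math.cos(phi))
            c = curl(F, p)
            # curl(F) . N with N = sin(phi) * p
            total += s * (c[0] * p[0] + c[1] * p[1] + c[2] * p[2])
    return total * d_theta * d_phi

left, right = boundary_circulation(), curl_flux_hemisphere()
print(left, right, 3 * math.pi)
```

All three printed values should be close to 3π ≈ 9.4248.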

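Example 30.3. Use Stokes’ Theorem to show that $\oint_C \mathbf{F}\cdot d\mathbf{r} = 0$ where

\[ \mathbf{F} = \left\langle \sin x^2,\ e^{y^2} + x^2,\ z^4 + 2x^2\right\rangle \]

and C is the triangle to the right.

First, note that if we wanted to compute the line integral directly, we could parametrize the edges of the triangle and use the additivity of line integrals.

Instead, we can use Stokes’ theorem to reduce this to a single surface integral. First, compute the curl of F:

\[ \nabla\times\mathbf{F} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ \sin x^2 & e^{y^2} + x^2 & z^4 + 2x^2 \end{vmatrix} = \langle 0,\ -4x,\ 2x\rangle. \]

The surface to integrate over is S, the interior of the triangle. To find N for this triangle, we need to find the equation for the plane in which the triangle lies. Two vectors in this plane are

\[ \langle 0 - 3,\ 0 - 0,\ 1 - 0\rangle = \langle -3, 0, 1\rangle \quad\text{and}\quad \langle 0 - 0,\ 0 - 2,\ 1 - 0\rangle = \langle 0, -2, 1\rangle, \]

so a normal vector to this plane is

\[ \mathbf{N} = \langle -3, 0, 1\rangle\times\langle 0, -2, 1\rangle = \langle 2, 3, 6\rangle. \]

Then

\[ \text{curl}(\mathbf{F})\cdot\mathbf{N} = \langle 0, -4x, 2x\rangle\cdot\langle 2, 3, 6\rangle = -12x + 12x = 0. \]

Thus,

\[ \iint_S \text{curl}(\mathbf{F})\cdot d\mathbf{S} = \iint_D \text{curl}(\mathbf{F})\cdot\mathbf{N}\,dA = 0, \]

and hence $\oint_C \mathbf{F}\cdot d\mathbf{r} = 0$ by Stokes’ theorem.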


Recall that the line integral of a conservative vector field $\mathbf{F} = \nabla f$ is path independent:

\[ \int_{C_1} \mathbf{F}\cdot d\mathbf{r} = f(Q) - f(P) = \int_{C_2} \mathbf{F}\cdot d\mathbf{r}. \]

Moreover, recall that $\int_C \mathbf{F}\cdot d\mathbf{r} = 0$ if C is closed.

Analogous results hold for surface integrals of a vector field $\mathbf{F} = \text{curl}(\mathbf{A}) = \nabla\times\mathbf{A}$. In this case, the vector field A is called a vector potential for F.

If $S_1$ and $S_2$ are two surfaces that satisfy the conditions of Stokes’ Theorem and have the same oriented boundary curve C, then by Stokes’ Theorem we have

\[ \iint_{S_1} (\nabla\times\mathbf{A})\cdot d\mathbf{S} = \oint_C \mathbf{A}\cdot d\mathbf{r} = \iint_{S_2} (\nabla\times\mathbf{A})\cdot d\mathbf{S}. \]

In other words, the surface integral of a vector field with a vector potential A is surface independent!

Moreover, $\iint_S \mathbf{F}\cdot d\mathbf{S} = 0$ if $\mathbf{F} = \nabla\times\mathbf{A}$ and S is closed (i.e. ∂S = ∅).

This is summarized in the following theorem.

Theorem 30.4 (Surface Independence for Curl of Vector Fields). If F = curl(A), then the flux of F through a surface S depends only on the oriented boundary ∂S and not on the surface itself:

\[ \iint_S \mathbf{F}\cdot d\mathbf{S} = \oint_{\partial S} \mathbf{A}\cdot d\mathbf{r}. \]

In particular, if S is closed (i.e. ∂S = ∅), then $\iint_S \mathbf{F}\cdot d\mathbf{S} = 0$.

Note that another way to write the above equation is

\[ \iint_S (\nabla\times\mathbf{A})\cdot d\mathbf{S} = \oint_{\partial S} \mathbf{A}\cdot d\mathbf{r}. \]


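Example 30.5. Let $\mathbf{F} = \nabla\times\mathbf{A}$ where

\[ \mathbf{A} = \langle y + z,\ \sin(xy),\ e^{xyz}\rangle. \]

Find the flux of F outward through the surfaces $S_1$ and $S_2$ (to the right) whose common boundary C is the unit circle in the xz-plane.

First, notice that with C oriented as in the figure, $S_1$ lies to the left and $S_2$ lies to the right. So, by the above theorem we have

\[ \iint_{S_1} \mathbf{F}\cdot d\mathbf{S} = \oint_C \mathbf{A}\cdot d\mathbf{r} \quad\text{and}\quad \iint_{S_2} \mathbf{F}\cdot d\mathbf{S} = -\oint_C \mathbf{A}\cdot d\mathbf{r}. \]

To compute $\oint_C \mathbf{A}\cdot d\mathbf{r}$, we parametrize C by $\mathbf{r}(t) = \langle \cos t, 0, \sin t\rangle$ for $0 \le t \le 2\pi$. Notice that this traces the circle in the direction from the figure as $\mathbf{r}(0) = \langle 1, 0, 0\rangle$ and $\mathbf{r}(\frac{\pi}{2}) = \langle 0, 0, 1\rangle$.

Then

\[ \mathbf{A}(\mathbf{r}(t)) = \langle 0 + \sin t,\ \sin(0),\ e^0\rangle = \langle \sin t, 0, 1\rangle \quad\text{and}\quad \mathbf{r}'(t) = \langle -\sin t, 0, \cos t\rangle, \]

and so

\[ \mathbf{A}(\mathbf{r}(t))\cdot\mathbf{r}'(t) = -\sin^2 t + \cos t. \]

Thus,

\[ \oint_C \mathbf{A}\cdot d\mathbf{r} = \int_0^{2\pi} \left(-\sin^2 t + \cos t\right) dt = -\pi. \]

So,

\[ \iint_{S_1} \mathbf{F}\cdot d\mathbf{S} = -\pi \quad\text{and}\quad \iint_{S_2} \mathbf{F}\cdot d\mathbf{S} = \pi. \]

Notice that if we consider the closed surface $S = S_1 \cup S_2$ with outward pointing normal, then (as guaranteed by the previous theorem) the flux of F = curl(A) out of S is −π + π = 0.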

Notice also that if S2 (or S1 ) had been oriented oppositely, then the boundary orientation of C would be the
same for both S1 and S2 , and so (as guaranteed by the above theorem) the two surfaces would generate the
same flux (either π or −π).


31 Divergence Theorem (17.3)


So far we have the following “fundamental theorems” of the type:

Integral of a derivative on an oriented domain = Integral over the oriented boundary of the domain.

• From calculus 1, the Fundamental Theorem of Calculus relates the integral of $f'(x)$ over an interval [a, b] to the “integral” of f(x) over the boundary of [a, b] consisting of the points a and b:

\[ \underbrace{\int_a^b f'(x)\,dx}_{\text{integral of derivative over an interval}} = \underbrace{f(b) - f(a)}_{\text{“integral” over the boundary points}} \]

The boundary of [a, b] is oriented by assigning a plus sign to b (where [a, b] is on the left) and a minus sign to a (where [a, b] is on the right).

• The Fundamental Theorem of Calculus for Vector-Valued Functions generalizes the FTOC to the integral of a vector valued function $\mathbf{r}'(t)$:

\[ \underbrace{\int_a^b \mathbf{r}'(t)\,dt}_{\text{integral of derivative over an interval}} = \underbrace{\mathbf{r}(b) - \mathbf{r}(a)}_{\text{“integral” over the boundary points}} \]

• The Fundamental Theorem for Line Integrals generalizes the FTOC: Instead of taking an integral over an interval [a, b] (which can be thought of as an oriented path along the x-axis), we take an integral along any path from points P to Q, and instead of $f'(x)$ we use the gradient:

\[ \underbrace{\int_C \nabla f\cdot d\mathbf{r}}_{\text{integral of derivative over a curve}} = \underbrace{f(Q) - f(P)}_{\text{“integral” over the boundary points}} \]

The boundary of C is oriented by assigning a plus sign to Q (where C is on the left) and a minus sign to P (where C is on the right).

• Green’s Theorem is a two-dimensional version of the FTOC that relates the integral of a certain derivative over a domain D in the plane to an integral over its boundary curve C = ∂D:

\[ \underbrace{\iint_D \left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right) dA}_{\text{integral of derivative over a domain}} = \underbrace{\oint_{\partial D} \mathbf{F}\cdot d\mathbf{r}}_{\text{integral over the boundary curve}} \]

• Stokes’ Theorem extends Green’s Theorem: Instead of a domain in the plane (i.e. a surface contained in the xy-plane), we allow any surface in $\mathbb{R}^3$. The appropriate derivative is the curl:

\[ \underbrace{\iint_S \text{curl}(\mathbf{F})\cdot d\mathbf{S}}_{\text{integral of derivative over a surface}} = \underbrace{\oint_{\partial S} \mathbf{F}\cdot d\mathbf{r}}_{\text{integral over the boundary curve}} \]


History fun facts:

• Green’s Theorem was not actually first stated by the British mathematician George Green. Instead,
the French mathematician Augustin Cauchy first stated it in 1846. It is called Green’s Theorem
because Green published a result that implies Green’s Theorem before this, in 1828.
• Stokes’ Theorem is so called because the Irish mathematician George Stokes liked to include it on
competitive examinations at Cambridge University. However, this theorem was first stated in a
letter from the Scots-Irish mathematician William Thomson (Lord Kelvin) to Stokes before this,
in 1850.
• A special case of the Divergence Theorem (coming soon to notes near you) was first discovered
by the French mathematician Joseph-Louis Lagrange in 1762. The German mathematician Carl
Friedrich Gauss then independently discovered and published special cases of the Divergence
Theorem in 1813 and later in 1833 and 1839. The general theorem was first proved in 1826 by
the Russian mathematician Michael Ostrogradsky.

Our final theorem, called the Divergence Theorem, also follows this pattern, relating a surface integral over
a closed surface to an appropriate triple integral.

Recall that a closed surface is one with an empty boundary (think: closed surfaces won’t leak if filled with liquid), and hence a closed surface encloses a three-dimensional region W. That is, S is the boundary of W:

\[ \partial W = S. \]

We will consider piecewise smooth closed surfaces S, which means S consists of finitely many smooth surfaces that have been glued together along their boundaries. Two examples of this are below.

Theorem 31.1 (Divergence Theorem). Let S be a closed surface that encloses a region W in $\mathbb{R}^3$. Assume that S is piecewise smooth and is oriented by normal vectors pointing to the outside of W. Let F be a vector field with continuous first order partial derivatives whose domain contains W. Then

\[ \iint_S \mathbf{F}\cdot d\mathbf{S} = \iiint_W \text{div}(\mathbf{F})\,dV. \]

The above equation can also be written in the form

\[ \iint_S \mathbf{F}\cdot d\mathbf{S} = \iiint_W \nabla\cdot\mathbf{F}\,dV. \]


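Example 31.2. Verify the Divergence Theorem for $\mathbf{F} = \langle y,\ yz,\ z^2\rangle$ and the cylinder to the right with outward pointing normal vectors.

To verify the Divergence Theorem, we must verify that the flux $\iint_S \mathbf{F}\cdot d\mathbf{S}$, where S is the surface of the cylinder, is equal to the integral of div(F) over the cylinder.

First let’s compute the flux through S, which will be the sum of the three surface integrals over the top, bottom, and side of the cylinder.

First, let’s compute the flux through the side of the cylinder. We can parametrize this cylinder using cylindrical coordinates with a fixed radius r = 2 as:

\[ G(\theta, z) = (2\cos\theta,\ 2\sin\theta,\ z) \quad\text{for } 0 \le \theta < 2\pi,\ 0 \le z \le 5. \]

Then $\mathbf{F}(G(\theta, z)) = \langle 2\sin\theta,\ 2z\sin\theta,\ z^2\rangle$, and the normal vector is

\[ \mathbf{N} = \mathbf{T}_\theta\times\mathbf{T}_z = \langle -2\sin\theta,\ 2\cos\theta,\ 0\rangle\times\langle 0, 0, 1\rangle = \langle 2\cos\theta,\ 2\sin\theta,\ 0\rangle. \]

Note that this normal vector does point outside the cylinder. Thus

\[ \begin{aligned}
\iint_{\text{side}} \mathbf{F}\cdot d\mathbf{S} &= \iint_D \mathbf{F}(G(\theta, z))\cdot\mathbf{N}\,dA = \int_0^5\int_0^{2\pi} \langle 2\sin\theta,\ 2z\sin\theta,\ z^2\rangle\cdot\langle 2\cos\theta,\ 2\sin\theta,\ 0\rangle\,d\theta\,dz \\
&= \int_0^5\int_0^{2\pi} \left(4\cos\theta\sin\theta + 4z\sin^2\theta\right) d\theta\,dz = \int_0^5 \Big[2\sin^2\theta + 2z(\theta - \sin\theta\cos\theta)\Big]_0^{2\pi}\,dz \\
&= \int_0^5 4\pi z\,dz = 2\pi z^2\,\Big|_0^5 = 50\pi.
\end{aligned} \]

Now let’s compute the flux through the top of the cylinder. We can parametrize the top disk as

\[ G(x, y) = (x, y, 5) \quad\text{for } (x, y) \in D, \]

where D is the disk of radius 2: $D = \left\{(x, y) \mid x^2 + y^2 \le 4\right\}$. Then $\mathbf{F}(G(x, y)) = \langle y, 5y, 25\rangle$ and

\[ \mathbf{N} = \mathbf{T}_x\times\mathbf{T}_y = \langle 1, 0, 0\rangle\times\langle 0, 1, 0\rangle = \langle 0, 0, 1\rangle. \]

Notice that the normal vector does point outside the cylinder. And hence

\[ \iint_{\text{top}} \mathbf{F}\cdot d\mathbf{S} = \iint_D \mathbf{F}(G(x, y))\cdot\mathbf{N}\,dA = \iint_D 25\,dA = 25\,\text{Area}(D) = 100\pi. \]

Now let’s compute the flux through the bottom of the cylinder. We can parametrize the bottom disk as

\[ G(x, y) = (x, y, 0) \quad\text{for } (x, y) \in D, \]

where D is the disk of radius 2. Then $\mathbf{F}(G(x, y)) = \langle y, 0, 0\rangle$ and

\[ \mathbf{N} = \mathbf{T}_x\times\mathbf{T}_y = \langle 1, 0, 0\rangle\times\langle 0, 1, 0\rangle = \langle 0, 0, 1\rangle. \]

Notice that this normal vector points towards the interior of the cylinder, so we want to use the vector $-\mathbf{N} = \langle 0, 0, -1\rangle$ instead. Then

\[ \iint_{\text{bottom}} \mathbf{F}\cdot d\mathbf{S} = \iint_D \mathbf{F}(G(x, y))\cdot(-\mathbf{N})\,dA = \iint_D 0\,dA = 0. \]

Thus the total flux of F out of the cylinder is $50\pi + 100\pi + 0 = 150\pi$.

Now let’s compare this with the triple integral of the divergence of F: $\iiint_W \text{div}(\mathbf{F})\,dV$.

\[ \text{div}(\mathbf{F}) = \text{div}\left(\langle y,\ yz,\ z^2\rangle\right) = \frac{\partial}{\partial x}\,y + \frac{\partial}{\partial y}\,yz + \frac{\partial}{\partial z}\,z^2 = 0 + z + 2z = 3z. \]

We can then view W as a z-simple region:

\[ W = \{(x, y, z) \mid 0 \le z \le 5,\ (x, y) \in D\}, \]

where D is the disk of radius 2. Thus,

\[ \begin{aligned}
\iiint_W \text{div}(\mathbf{F})\,dV &= \iint_D \int_{z=0}^{5} 3z\,dz\,dA = \iint_D \frac{75}{2}\,dA \\
&= \left(\frac{75}{2}\right)\text{Area}(D) = \left(\frac{75}{2}\right)(4\pi) = 150\pi.
\end{aligned} \]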
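Example 31.3. Use the Divergence Theorem to evaluate

\[ \iint_S \left\langle x^2,\ z^4,\ e^z\right\rangle\cdot d\mathbf{S}, \]

where S is the boundary of the box W to the right.

First, compute the divergence of $\mathbf{F} = \langle x^2, z^4, e^z\rangle$:

\[ \text{div}\left(\left\langle x^2, z^4, e^z\right\rangle\right) = \frac{\partial}{\partial x}x^2 + \frac{\partial}{\partial y}z^4 + \frac{\partial}{\partial z}e^z = 2x + e^z. \]

Then,

\[ \iint_S \left\langle x^2, z^4, e^z\right\rangle\cdot d\mathbf{S} = \iiint_W \left(2x + e^z\right) dV = \int_0^2\int_0^3\int_0^1 \left(2x + e^z\right) dz\,dy\,dx = 6e + 6. \]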
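For a box, both sides of the Divergence Theorem are easy to approximate, which gives an independent check of this example. The sketch below is illustrative and rests on an assumption: the box is taken to be [0, 2] × [0, 3] × [0, 1], as read off from the limits of the iterated integral above (the figure itself is not reproduced here). The helper names and grid sizes are likewise choices made for this sketch.

```python
import math

def F(x, y, z):
    return (x * x, z ** 4, math.exp(z))

# Assumed box, inferred from the integration limits in Example 31.3.
LO, HI = (0.0, 0.0, 0.0), (2.0, 3.0, 1.0)

def divergence(F, p, h=1e-5):
    """Central-difference approximation of div(F) at the point p."""
    total = 0.0
    for i in range(3):
        hi, lo = list(p), list(p)
        hi[i] += h
        lo[i] -= h
        total += (F(*hi)[i] - F(*lo)[i]) / (2 * h)
    return total

def volume_integral_of_div(n=40):
    """Midpoint-rule estimate of the triple integral of div(F) over the box."""
    d = [(HI[k] - LO[k]) / n for k in range(3)]
    total = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                p = (LO[0] + (i + 0.5) * d[0],
                     LO[1] + (j + 0.5) * d[1],
                     LO[2] + (k + 0.5) * d[2])
                total += divergence(F, p)
    return total * d[0] * d[1] * d[2]

def outward_flux(n=100):
    """Sum of the fluxes of F through the six faces, with outward normals."""
    total = 0.0
    for axis in range(3):
        p_ax, q_ax = [k for k in range(3) if k != axis]
        dp = (HI[p_ax] - LO[p_ax]) / n
        dq = (HI[q_ax] - LO[q_ax]) / n
        for value, sign in ((LO[axis], -1.0), (HI[axis], 1.0)):
            for i in range(n):
                for j in range(n):
                    pt = [0.0, 0.0, 0.0]
                    pt[axis] = value
                    pt[p_ax] = LO[p_ax] + (i + 0.5) * dp
                    pt[q_ax] = LO[q_ax] + (j + 0.5) * dq
                    # outward unit normal is +/- the coordinate direction `axis`
                    total += sign * F(*pt)[axis] * dp * dq
    return total

triple = volume_integral_of_div()
surface = outward_flux()
print(triple, surface, 6 * math.e + 6)
```

The surface side makes the cancellation visible: the two faces perpendicular to the y-axis carry equal and opposite flux of the z⁴ component, which is why that component does not appear in the divergence.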
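Example 31.4. Compute the flux of

\[ \mathbf{F} = \left\langle z^2 + xy^2,\ \cos(x + z),\ e^{-y} - zy^2\right\rangle \]

through the surface S to the right.

Rather than trying to compute the surface integral, let’s use the Divergence Theorem. First let’s compute the divergence of F:

\[ \text{div}(\mathbf{F}) = \frac{\partial}{\partial x}\left(z^2 + xy^2\right) + \frac{\partial}{\partial y}\left(\cos(x + z)\right) + \frac{\partial}{\partial z}\left(e^{-y} - zy^2\right) = y^2 + 0 - y^2 = 0. \]

Then by the divergence theorem the flux must be zero:

\[ \iint_S \mathbf{F}\cdot d\mathbf{S} = \iiint_W \text{div}(\mathbf{F})\,dV = \iiint_W 0\,dV = 0. \]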
