
Calculus

Uploaded by

Moe Aung Kyaw
Copyright
© © All Rights Reserved
CLP-3 MULTIVARIABLE CALCULUS

Joel Feldman, Andrew Rechnitzer, & Elyse Yeager
University of British Columbia
This text is disseminated via the Open Education Resource (OER) LibreTexts Project (https://LibreTexts.org) and like the hundreds
of other texts available within this powerful platform, it is freely available for reading, printing and "consuming." Most, but not all,
pages in the library have licenses that may allow individuals to make changes, save, and print this book. Carefully
consult the applicable license(s) before pursuing such effects.
Instructors can adopt existing LibreTexts texts or Remix them to quickly build course-specific resources to meet the needs of their
students. Unlike traditional textbooks, LibreTexts’ web based origins allow powerful integration of advanced features and new
technologies to support learning.

The LibreTexts mission is to unite students, faculty and scholars in a cooperative effort to develop an easy-to-use online platform
for the construction, customization, and dissemination of OER content to reduce the burdens of unreasonable textbook costs to our
students and society. The LibreTexts project is a multi-institutional collaborative venture to develop the next generation of open-
access texts to improve postsecondary education at all levels of higher learning by developing an Open Access Resource
environment. The project currently consists of 14 independently operating and interconnected libraries that are constantly being
optimized by students, faculty, and outside experts to supplant conventional paper-based books. These free textbook alternatives are
organized within a central environment that is both vertically (from advanced to basic level) and horizontally (across different fields)
integrated.
The LibreTexts libraries are Powered by MindTouch® and are supported by the Department of Education Open Textbook Pilot
Project, the UC Davis Office of the Provost, the UC Davis Library, the California State University Affordable Learning Solutions
Program, and Merlot. This material is based upon work supported by the National Science Foundation under Grant No. 1246120,
1525057, and 1413739. Unless otherwise noted, LibreTexts content is licensed by CC BY-NC-SA 3.0.
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not
necessarily reflect the views of the National Science Foundation nor the US Department of Education.
Have questions or comments? For information about adoptions or adaptions contact [email protected]. More information on our
activities can be found via Facebook (https://facebook.com/Libretexts), Twitter (https://twitter.com/libretexts), or our blog
(http://Blog.Libretexts.org).

This text was compiled on 11/16/2022


TABLE OF CONTENTS
Licensing
Colophon
Feedback about the text
Preface

1: Vectors and Geometry in Two and Three Dimensions


1.1: Points
1.2: Vectors
1.3: Equations of Lines in 2d
1.4: Equations of Planes in 3d
1.5: Equations of Lines in 3d
1.6: Curves and their Tangent Vectors
1.7: Sketching Surfaces in 3d
1.8: Cylinders
1.9: Quadric Surfaces

2: Partial Derivatives
2.1: Limits
2.2: Partial Derivatives
2.3: Higher Order Derivatives
2.4: The Chain Rule
2.5: Tangent Planes and Normal Lines
2.6: Linear Approximations and Error
2.7: Directional Derivatives and the Gradient
2.8: Optional — Solving the Wave Equation
2.9: Maximum and Minimum Values
2.10: Lagrange Multipliers

3: Multiple Integrals
3.1: Double Integrals
3.2: Double Integrals in Polar Coordinates
3.3: Applications of Double Integrals
3.4: Surface Area
3.5: Triple Integrals
3.6: Triple Integrals in Cylindrical Coordinates
3.7: Triple Integrals in Spherical Coordinates
3.8: Optional— Integrals in General Coordinates

4: Appendices
A: Appendices
A.1: Trigonometry
A.2: Powers and Logarithms
A.3: Table of Derivatives

A.4: Table of Integrals
A.5: Table of Taylor Expansions
A.6: 3-D Coordinate Systems
A.7: ISO Coordinate System Notation
A.8: Conic Sections and Quadric Surfaces
B: Hints for Exercises
C: Answers to Exercises
D: Solutions to Exercises

Index
Detailed Licensing

Licensing
A detailed breakdown of this resource's licensing can be found in Back Matter/Detailed Licensing.

Colophon
Cover Design Nick Loewen — licensed under the CC-BY-NC-SA 4.0 License.
Source files A link to the source files for this document can be found at the CLP textbook website. The sources are licensed under
the CC-BY-NC-SA 4.0 License.
Edition CLP3 Multivariable Calculus: May 2021
Website CLP-3
©2016 – 2021 Joel Feldman, Andrew Rechnitzer, Elyse Yeager
This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. You can
view a copy of the license here.

Feedback about the text
The CLP-3 Multivariable Calculus text is still undergoing testing and changes. Because of this we request that if you find a
problem or error in the text then:
1. Please check the errata list that can be found at the text webpage.
2. Is the problem in the online version or the PDF version or both?
3. Note the URL of the online version and the page number in the PDF
4. Send an email to [email protected] . Please be sure to include
a description of the error
the URL of the page, if found in the online edition
and if the problem also exists in the PDF, then the page number in the PDF and the compile date on the front page of PDF.

Preface
This text is a merger of the CLP Multivariable Calculus textbook and problembook. It is, at the time that we write this, still a work
in progress; some bits and pieces around the edges still need polish. Consequently we recommend that students
consult the text webpage for links to the errata, especially if they think there might be a typo or error. We also request that you send
us an email at [email protected]
Additionally, if you are not a student at UBC and are using these texts, please send us an email (again using the feedback button);
we'd love to hear from you.
Joel Feldman, Andrew Rechnitzer and Elyse Yeager

CHAPTER OVERVIEW
1: Vectors and Geometry in Two and Three Dimensions
Before we get started doing calculus in two and three dimensions we need to brush up on some basic geometry that we will use a
lot. We are already familiar with the Cartesian plane (see footnote 1), but we'll start from the beginning.
1. René Descartes (1596–1650) was a French scientist and philosopher, who lived in the Dutch Republic for roughly twenty years
after serving in the (mercenary) Dutch States Army. He is viewed as the father of analytic geometry, which uses numbers to
study geometry.
1.1: Points
1.2: Vectors
1.3: Equations of Lines in 2d
1.4: Equations of Planes in 3d
1.5: Equations of Lines in 3d
1.6: Curves and their Tangent Vectors
1.7: Sketching Surfaces in 3d
1.8: Cylinders
1.9: Quadric Surfaces

This page titled 1: Vectors and Geometry in Two and Three Dimensions is shared under a CC BY-NC-SA 4.0 license and was authored, remixed,
and/or curated by Joel Feldman, Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the
LibreTexts platform; a detailed edit history is available upon request.

1.1: Points
Each point in two dimensions may be labeled by two coordinates 1 (x, y) which specify the position of the point in some units with
respect to some axes as in the figure below.

This is why the xy-plane is called “two dimensional” — the name of each point consists of two real numbers.
The set of all points in two dimensions is denoted $\mathbb{R}^2$ (see footnote 2). Observe that

the distance from the point (x, y) to the x-axis is |y|


if y > 0, then (x, y) is above the x-axis and if y < 0, then (x, y) is below the x-axis
the distance from the point (x, y) to the y-axis is |x|
if x > 0, then (x, y) is to the right of the y-axis and if x < 0, then (x, y) is to the left of the y-axis
the distance from the point (x, y) to the origin (0, 0) is $\sqrt{x^2+y^2}$

Similarly, each point in three dimensions may be labeled by three coordinates (x, y, z), as in the two figures below.

The set of all points in three dimensions is denoted $\mathbb{R}^3$. The plane that contains, for example, the x- and y-axes is called the xy-plane.
The xy-plane is the set of all points (x, y, z) that satisfy z = 0.
The xz-plane is the set of all points (x, y, z) that satisfy y = 0.
The yz-plane is the set of all points (x, y, z) that satisfy x = 0.
More generally,
The set of all points (x, y, z) that obey z = c is a plane that is parallel to the xy-plane and is a distance |c| from it. If c > 0, the
plane z = c is above the xy-plane. If c < 0, the plane z = c is below the xy-plane. We say that the plane z = c is a signed
distance c from the xy-plane.
The set of all points (x, y, z) that obey y = b is a plane that is parallel to the xz-plane and is a signed distance b from it.
The set of all points (x, y, z) that obey x = a is a plane that is parallel to the yz-plane and is a signed distance a from it.

Observe that our 2d distances extend quite easily to 3d.
the distance from the point (x, y, z) to the xy-plane is |z|
the distance from the point (x, y, z) to the xz-plane is |y|
the distance from the point (x, y, z) to the yz-plane is |x|
the distance from the point (x, y, z) to the origin (0, 0, 0) is $\sqrt{x^2+y^2+z^2}$

To see that the distance from the point (x, y, z) to the origin (0, 0, 0) is indeed $\sqrt{x^2+y^2+z^2}$,

apply Pythagoras to the right-angled triangle with vertices (0, 0, 0), (x, 0, 0) and (x, y, 0) to see that the distance from (0, 0, 0)
to (x, y, 0) is $\sqrt{x^2+y^2}$ and then
apply Pythagoras to the right-angled triangle with vertices (0, 0, 0), (x, y, 0) and (x, y, z) to see that the distance from (0, 0, 0)
to (x, y, z) is $\sqrt{\left(\sqrt{x^2+y^2}\right)^2+z^2} = \sqrt{x^2+y^2+z^2}$.

More generally, the distance from the point (x, y, z) to the point $(x', y', z')$ is

$$\sqrt{(x-x')^2+(y-y')^2+(z-z')^2}$$

Notice that this gives us the equation for a sphere quite directly. All the points on a sphere are equidistant from the centre of the
sphere. So, for example, the equation of the sphere centered on (1, 2, 3) with radius 4, that is, the set of all points (x, y, z) whose
distance from (1, 2, 3) is 4, is

$$(x-1)^2+(y-2)^2+(z-3)^2=16$$
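
The distance formula and the sphere equation above can be checked numerically. The following is a minimal Python sketch, not part of the original text; the helper names `distance` and `on_sphere` and the tolerance `tol` are illustrative choices.

```python
import math

def distance(p, q):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

def on_sphere(p, centre, radius, tol=1e-9):
    """True if p lies on the sphere with the given centre and radius (within tol)."""
    return abs(distance(p, centre) - radius) <= tol

centre = (1, 2, 3)
print(distance((0, 0, 0), (3, 4, 0)))   # 5.0
print(on_sphere((5, 2, 3), centre, 4))  # True: (5 - 1)^2 + 0 + 0 = 16
print(on_sphere((1, 2, 3), centre, 4))  # False: the centre is not on the sphere
```

Comparing the distance with the radius up to a small tolerance, rather than testing exact equality, is a common way to allow for floating point rounding.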

Here is an example in which we sketch a region in the xy-plane that is specified using inequalities.

Example 1.1.1

In this example, we sketch the region

$$\{(x, y) \mid -12 \le x^2 - 6x + y^2 - 4y \le -9,\ y \ge 1\}$$

in the xy-plane.
We do so in two steps. In the first step, we sketch the curves $x^2 - 6x + y^2 - 4y = -12$, $x^2 - 6x + y^2 - 4y = -9$, and $y = 1$.

By completing squares, we see that the equation $x^2 - 6x + y^2 - 4y = -12$ is equivalent to $(x-3)^2 + (y-2)^2 = 1$,
which is the circle of radius 1 centred on (3, 2). It is sketched in the figure below.
By completing squares, we see that the equation $x^2 - 6x + y^2 - 4y = -9$ is equivalent to $(x-3)^2 + (y-2)^2 = 4$,
which is the circle of radius 2 centred on (3, 2). It is sketched in the figure below.
The point (x, y) obeys y = 1 if and only if it is a distance 1 vertically above the x-axis. So y = 1 is the line that is parallel
to the x-axis and is one unit above it. This line is also sketched in the figure below.

In the second step we determine the impact that the inequalities have.
The inequality $x^2 - 6x + y^2 - 4y \ge -12$ is equivalent to $(x-3)^2 + (y-2)^2 \ge 1$ and hence is equivalent to
$\sqrt{(x-3)^2 + (y-2)^2} \ge 1$. So the point (x, y) satisfies $x^2 - 6x + y^2 - 4y \ge -12$ if and only if the distance from (x, y)
to (3, 2) is at least 1, i.e. if and only if (x, y) is outside (or on) the circle $(x-3)^2 + (y-2)^2 = 1$.
The inequality $x^2 - 6x + y^2 - 4y \le -9$ is equivalent to $(x-3)^2 + (y-2)^2 \le 4$ and hence is equivalent to
$\sqrt{(x-3)^2 + (y-2)^2} \le 2$. So the point (x, y) satisfies the inequality $x^2 - 6x + y^2 - 4y \le -9$ if and only if the
distance from (x, y) to (3, 2) is at most 2, i.e. if and only if (x, y) is inside (or on) the circle $(x-3)^2 + (y-2)^2 = 4$.

The point (x, y) obeys y ≥ 1 if and only if (x, y) is a vertical distance at least 1 above the x-axis, i.e. is above (or on) the
line y = 1.
So the region

$$\{(x, y) \mid -12 \le x^2 - 6x + y^2 - 4y \le -9,\ y \ge 1\}$$

consists of all points (x, y) that
are inside or on the circle $(x-3)^2 + (y-2)^2 = 4$ and
are also outside or on the circle $(x-3)^2 + (y-2)^2 = 1$ and
are also above or on the line y = 1.


It is the shaded region in the figure below.
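
As a sanity check on the reasoning in this example, the short Python sketch below (not from the original text) tests membership in the region in two ways: directly from the defining inequalities, and from the geometric description as an annulus cut off by the line y = 1. The function names are hypothetical.

```python
def in_region(x, y):
    """Membership test using the original inequalities."""
    q = x**2 - 6*x + y**2 - 4*y
    return -12 <= q <= -9 and y >= 1

def in_region_geometric(x, y):
    """Membership test using the completed-square description:
    between the circles of radius 1 and 2 centred on (3, 2), and on or above y = 1."""
    d2 = (x - 3)**2 + (y - 2)**2   # squared distance from (3, 2)
    return 1 <= d2 <= 4 and y >= 1

for point in [(3, 3.5), (3, 2), (3, 0.5), (3, 1)]:
    print(point, in_region(*point), in_region_geometric(*point))
```

The two tests should agree at every point, because completing the squares only adds 13 to each part of the double inequality.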

Here are a couple of examples that involve spheres.

Example 1.1.2

In this example, we are going to find the curve formed by the intersection of the xy-plane and the sphere of radius 5 centred on
(0, 0, 4).

The point (x, y, z) lies on the xy-plane if and only if z = 0, and lies on the sphere of radius 5 centred on (0, 0, 4) if and only if
$x^2 + y^2 + (z-4)^2 = 25$. So the point (x, y, z) lies on the curve of intersection if and only if both z = 0 and
$x^2 + y^2 + (z-4)^2 = 25$, or equivalently

$$z = 0,\ x^2 + y^2 + (0-4)^2 = 25 \iff z = 0,\ x^2 + y^2 = 9$$

This is the circle in the xy-plane that is centred on the origin and has radius 3. Here is a sketch that shows the parts of the sphere
and the circle of intersection that are in the first octant, that is, the parts that have x ≥ 0, y ≥ 0 and z ≥ 0.

Example 1.1.3

In this example, we are going to find all points (x, y, z) for which the distance from (x, y, z) to (9, −12, 15) is twice the
distance from (x, y, z) to the origin (0, 0, 0).
The distance from (x, y, z) to (9, −12, 15) is $\sqrt{(x-9)^2 + (y+12)^2 + (z-15)^2}$. The distance from (x, y, z) to (0, 0, 0) is
$\sqrt{x^2 + y^2 + z^2}$. So we want to find all points (x, y, z) for which

$$\sqrt{(x-9)^2 + (y+12)^2 + (z-15)^2} = 2\sqrt{x^2 + y^2 + z^2}$$

Squaring both sides of this equation gives

$$x^2 - 18x + 81 + y^2 + 24y + 144 + z^2 - 30z + 225 = 4(x^2 + y^2 + z^2)$$

Collecting up terms gives

$$3x^2 + 18x + 3y^2 - 24y + 3z^2 + 30z = 450$$ and, dividing by 3,
$$x^2 + 6x + y^2 - 8y + z^2 + 10z = 150$$ and, completing squares,
$$x^2 + 6x + 9 + y^2 - 8y + 16 + z^2 + 10z + 25 = 200$$ or
$$(x+3)^2 + (y-4)^2 + (z+5)^2 = 200$$

This is the sphere of radius $10\sqrt{2}$ centred on (−3, 4, −5).
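
The conclusion of this example can be spot-checked numerically. The Python sketch below is an illustration rather than part of the original text: it picks a few points on the sphere of radius $10\sqrt{2}$ centred on (−3, 4, −5), using a spherical-angle parametrisation (an assumption made here for convenience), and confirms that each is twice as far from (9, −12, 15) as from the origin.

```python
import math

def dist(p, q):
    """Euclidean distance between two points in three dimensions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

A = (9, -12, 15)
ORIGIN = (0, 0, 0)
CENTRE = (-3, 4, -5)
RADIUS = 10 * math.sqrt(2)

def sphere_point(theta, phi):
    """A point on the sphere, described by two angles measured from its centre."""
    return (CENTRE[0] + RADIUS * math.sin(phi) * math.cos(theta),
            CENTRE[1] + RADIUS * math.sin(phi) * math.sin(theta),
            CENTRE[2] + RADIUS * math.cos(phi))

for theta, phi in [(0.0, 0.5), (1.0, 2.0), (4.0, 1.3)]:
    p = sphere_point(theta, phi)
    # Each line should print True, up to floating point rounding.
    print(math.isclose(dist(p, A), 2 * dist(p, ORIGIN), rel_tol=1e-9))
```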

Exercises
Stage 1

1

Describe the set of all points (x, y, z) in $\mathbb{R}^3$ that satisfy

1. $x^2 + y^2 + z^2 = 2x - 4y + 4$
2. $x^2 + y^2 + z^2 < 2x - 4y + 4$

2

Describe and sketch the set of all points (x, y) in $\mathbb{R}^2$ that satisfy

1. x = y
2. x + y = 1
3. $x^2 + y^2 = 4$
4. $x^2 + y^2 = 2y$
5. $x^2 + y^2 < 2y$

3

Describe the set of all points (x, y, z) in $\mathbb{R}^3$ that satisfy the following conditions. Sketch the part of the set that is in the first
octant.
1. z = x
2. x + y + z = 1
3. $x^2 + y^2 + z^2 = 4$
4. $x^2 + y^2 + z^2 = 4,\ z = 1$
5. $x^2 + y^2 = 4$
6. $z = x^2 + y^2$

4

Let A be the point (2, 1, 3).


1. Find the distance from A to the xy-plane.
2. Find the distance from A to the xz-plane.
3. Find the distance from A to the point (x, 0, 0) on the x-axis.
4. Find the point on the x-axis that is closest to A.
5. What is the distance from A to the x-axis?

Stage 2

5
Consider any triangle. Pick a coordinate system so that one vertex is at the origin and a second vertex is on the positive x-axis.
Call the coordinates of the second vertex (a, 0) and those of the third vertex (b, c). Find the circumscribing circle (the circle
that goes through all three vertices).

6. ✳

A certain surface consists of all points P = (x, y, z) such that the distance from P to the point (0, 0, 1) is equal to the distance
from P to the plane z + 1 = 0. Find an equation for the surface, sketch and describe it verbally.

7

Show that the set of all points P that are twice as far from (3, −2, 3) as from (3/2, 1, 0) is a sphere. Find its centre and radius.

Stage 3

8
The pressure p(x, y) at the point (x, y) is at least zero and is determined by the equation $x^2 - 2px + y^2 = 3p^2$. Sketch several
isobars. An isobar is a curve with equation p(x, y) = c for some constant c ≥ 0.

1. This is why the xy-plane is called “two dimensional” — the name of each point consists of two real numbers.
2. Not surprisingly, the 2 in $\mathbb{R}^2$ signifies that each point is labelled by two numbers and the $\mathbb{R}$ in $\mathbb{R}^2$ signifies that the numbers in
question are real numbers. There are more advanced applications (for example in signal analysis and in quantum mechanics)
where complex numbers are used. The space of all pairs $(z_1, z_2)$, with $z_1$ and $z_2$ complex numbers, is denoted $\mathbb{C}^2$.

This page titled 1.1: Points is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Joel Feldman, Andrew
Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is
available upon request.

1.2: Vectors
In many of our applications in 2d and 3d, we will encounter quantities that have both a magnitude (like a distance) and also a
direction. Such quantities are called vectors. That is, a vector is a quantity which has both a direction and a magnitude, like a
velocity. If you are moving, the magnitude (length) of your velocity vector is your speed (distance travelled per unit time) and the
direction of your velocity vector is your direction of motion. To specify a vector in three dimensions you have to give three
components, just as for a point. To draw the vector with components a, b, c you can draw an arrow from the point (0, 0, 0) to the
point (a, b, c).

Similarly, to specify a vector in two dimensions you have to give two components and to draw the vector with components a, b you
can draw an arrow from the point (0, 0) to the point (a, b).
There are many situations in which it is preferable to draw a vector with its tail at some point other than the origin. For example, it
is natural to draw the velocity vector of a moving particle with the tail of the velocity vector at the position of the particle, whether
or not the particle is at the origin. The sketch below shows a moving particle and its velocity vector at two different times.

As a second example, suppose that you are analyzing the motion of a pendulum. There are three forces acting on the pendulum
bob: gravity g, which is pulling the bob straight down, tension t in the rod, which is pulling the bob in the direction of the rod, and
air resistance r, which is pulling the bob in a direction opposite to its direction of motion. All three forces are acting on the bob. So
it is natural to draw all three arrows representing the forces with their tails at the bob.

In this text, we will use bold faced letters, like $\mathbf{v}$, $\mathbf{t}$, $\mathbf{g}$, to designate vectors. In handwriting, it is clearer to use a small overhead
arrow (see footnote 1), as in $\vec{v}$, $\vec{t}$, $\vec{g}$, instead. Also, when we want to emphasise that some quantity is a number, rather than a vector, we will call
the number a scalar.
Both points and vectors in 2d are specified by two numbers. Until you get used to this, it might confuse you sometimes: does a
given pair of numbers represent a point or a vector? To distinguish (see footnote 2) between the components of a vector and the coordinates of the
point at its head, when its tail is at some point other than the origin, we shall use angle brackets rather than round brackets around
the components of a vector. For example, the figure below shows the two-dimensional vector ⟨2, 1⟩ drawn in three different
positions. In each case, when the tail is at the point (u, v) the head is at (2 + u, 1 + v). We warn you that, out in the real world (see footnote 3),
no one uses notation that distinguishes between components of a vector and the coordinates of its head; usually round brackets
are used for both. It is up to you to keep straight which is being referred to.

By way of summary,

Definition 1.2.1

we use
bold faced letters, like v, t, g, to designate vectors, and
angle brackets, like ⟨2, 1⟩ , around the components of a vector, but use
round brackets, like (2, 1), around the coordinates of a point, and use
“scalar” to emphasise that some quantity is a number, rather than a vector.

Addition of Vectors and Multiplication of a Vector by a Scalar


Just as we have done many times in the CLP texts, when we define a new type of object, we want to understand how it interacts
with the basic operations of addition and multiplication. Vectors are no different, and we shall shortly see a natural way to define
addition of vectors. Multiplication will be more subtle, and we shall start with multiplication of a vector by a number (rather than
with multiplication of a vector by another vector).
By way of motivation for the definitions of addition and multiplication by a number, imagine that we are out for a walk on the xy-
plane.
Suppose that we take a step and, in doing so, we move $a_1$ units parallel to the x-axis and $a_2$ units parallel to the y-axis. Then
we say that $\langle a_1, a_2 \rangle$ is the displacement vector for the step. Suppose now that we take a second step which moves us an
additional $b_1$ units parallel to the x-axis and an additional $b_2$ units parallel to the y-axis, as in the figure on the left below. So the
displacement vector for the second step is $\langle b_1, b_2 \rangle$. All together, we have moved $a_1 + b_1$ units parallel to the x-axis and
$a_2 + b_2$ units parallel to the y-axis. The displacement vector for the two steps combined is $\langle a_1 + b_1, a_2 + b_2 \rangle$. We shall define
the sum of $\langle a_1, a_2 \rangle$ and $\langle b_1, b_2 \rangle$, denoted by $\langle a_1, a_2 \rangle + \langle b_1, b_2 \rangle$, to be $\langle a_1 + b_1, a_2 + b_2 \rangle$.

Suppose now that, instead, we decide to step in the same direction as the first step above, but to move twice as far, as in the
figure on the right below. That is, our step will move us $2a_1$ units in the direction of the x-axis and $2a_2$ units in the direction of
the y-axis and the corresponding displacement vector will be $\langle 2a_1, 2a_2 \rangle$. We shall define the product of the number 2 and the
vector $\langle a_1, a_2 \rangle$, denoted by $2\langle a_1, a_2 \rangle$, to be $\langle 2a_1, 2a_2 \rangle$.

Here are the formal definitions.

Definition 1.2.2. Adding Vectors and Multiplying a Vector by a Number.

These two operations have the obvious definitions


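$$\mathbf{a} = \langle a_1, a_2 \rangle,\ \mathbf{b} = \langle b_1, b_2 \rangle \implies \mathbf{a} + \mathbf{b} = \langle a_1 + b_1, a_2 + b_2 \rangle$$

$$\mathbf{a} = \langle a_1, a_2 \rangle,\ s \text{ a number} \implies s\mathbf{a} = \langle sa_1, sa_2 \rangle$$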

and similarly in three dimensions.

Pictorially, you add the vector b to the vector a by drawing b with its tail at the head of a and then drawing a vector from the tail
of a to the head of b, as in the figure on the left below. For a number s, we can draw the vector sa, by just
changing the vector a's length by the factor |s|, and,
if s < 0, reversing the arrow's direction,
as in the other two figures below.

The special case of multiplication by s = −1 appears so frequently that (−1)a is given the shorter notation −a. That is,

$$-\langle a_1, a_2 \rangle = \langle -a_1, -a_2 \rangle$$

Of course a + (−a) is 0, the vector all of whose components are zero.


To subtract b from a pictorially, you may add −b (which is drawn by reversing the direction of b) to a. Alternatively, if you draw
a and b with their tails at a common point, then a − b is the vector from the head of b to the head of a. That is, a − b is the
vector you must add to b in order to get a.

The operations of addition and multiplication by a scalar that we have just defined are quite natural and rarely cause any problems,
because they inherit from the real numbers the properties of addition and multiplication that you are used to.

Theorem 1.2.3. Properties of Addition and Scalar Multiplication.

Let a, b and c be vectors and s and t be scalars. Then


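(1) $\mathbf{a} + \mathbf{b} = \mathbf{b} + \mathbf{a}$
(2) $\mathbf{a} + (\mathbf{b} + \mathbf{c}) = (\mathbf{a} + \mathbf{b}) + \mathbf{c}$
(3) $\mathbf{a} + \mathbf{0} = \mathbf{a}$
(4) $\mathbf{a} + (-\mathbf{a}) = \mathbf{0}$
(5) $s(\mathbf{a} + \mathbf{b}) = s\mathbf{a} + s\mathbf{b}$
(6) $(s + t)\mathbf{a} = s\mathbf{a} + t\mathbf{a}$
(7) $(st)\mathbf{a} = s(t\mathbf{a})$
(8) $1\mathbf{a} = \mathbf{a}$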

We have just been introduced to many definitions. Let's see some of them in action.

Example 1.2.4

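For example, if

$$\mathbf{a} = \langle 1, 2, 3 \rangle \qquad \mathbf{b} = \langle 3, 2, 1 \rangle \qquad \mathbf{c} = \langle 1, 0, 1 \rangle$$

then

$$2\mathbf{a} = 2\langle 1, 2, 3 \rangle = \langle 2, 4, 6 \rangle$$
$$-\mathbf{b} = -\langle 3, 2, 1 \rangle = \langle -3, -2, -1 \rangle$$
$$3\mathbf{c} = 3\langle 1, 0, 1 \rangle = \langle 3, 0, 3 \rangle$$

and

$$2\mathbf{a} - \mathbf{b} + 3\mathbf{c} = \langle 2, 4, 6 \rangle + \langle -3, -2, -1 \rangle + \langle 3, 0, 3 \rangle = \langle 2 - 3 + 3,\ 4 - 2 + 0,\ 6 - 1 + 3 \rangle = \langle 2, 2, 8 \rangle$$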
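
The componentwise definitions of vector addition and multiplication by a scalar translate directly into code. The Python sketch below is an illustration, not part of the original text; it represents vectors as tuples, and the helper names `add` and `scale` are hypothetical. It reproduces the computation of 2a − b + 3c from this example.

```python
def add(a, b):
    """Componentwise sum of two vectors of the same dimension."""
    return tuple(x + y for x, y in zip(a, b))

def scale(s, a):
    """Product of the scalar s and the vector a."""
    return tuple(s * x for x in a)

a = (1, 2, 3)
b = (3, 2, 1)
c = (1, 0, 1)

# 2a - b + 3c, written as 2a + (-1)b + 3c
result = add(add(scale(2, a), scale(-1, b)), scale(3, c))
print(result)  # (2, 2, 8)
```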

Definition 1.2.5

Two vectors a and b


are said to be parallel if  a = s b  for some nonzero real number s and
are said to have the same direction if  a = s b  for some number s > 0.

There are some vectors that occur sufficiently commonly that they are given special names. One is the vector 0. Some others are
the “standard basis vectors”.

Definition 1.2.6

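The standard basis vectors in two dimensions are

$$\hat{\imath} = \langle 1, 0 \rangle \qquad \hat{\jmath} = \langle 0, 1 \rangle$$

The standard basis vectors in three dimensions are

$$\hat{\imath} = \langle 1, 0, 0 \rangle \qquad \hat{\jmath} = \langle 0, 1, 0 \rangle \qquad \hat{k} = \langle 0, 0, 1 \rangle$$

We'll explain the little hats in the notation $\hat{\imath}$, $\hat{\jmath}$, $\hat{k}$ shortly. Some people rename $\hat{\imath}$, $\hat{\jmath}$ and $\hat{k}$ to $\mathbf{e}_1$, $\mathbf{e}_2$ and $\mathbf{e}_3$ respectively. Using the
above properties we have, for all vectors,

$$\langle a_1, a_2 \rangle = a_1 \hat{\imath} + a_2 \hat{\jmath} \qquad \langle a_1, a_2, a_3 \rangle = a_1 \hat{\imath} + a_2 \hat{\jmath} + a_3 \hat{k}$$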

A sum of numbers times vectors, like $a_1 \hat{\imath} + a_2 \hat{\jmath}$, is called a linear combination of the vectors. Thus all vectors can be expressed as
linear combinations of the standard basis vectors. This makes basis vectors very helpful in computations. The standard basis
vectors are unit vectors, meaning that they are of length one, where the length of a vector a is denoted (see footnote 4) |a| and is defined by

Definition 1.2.7. Length of a Vector.

$$\mathbf{a} = \langle a_1, a_2 \rangle \implies |\mathbf{a}| = \sqrt{a_1^2 + a_2^2}$$

$$\mathbf{a} = \langle a_1, a_2, a_3 \rangle \implies |\mathbf{a}| = \sqrt{a_1^2 + a_2^2 + a_3^2}$$

A unit vector is a vector of length one. We'll sometimes use the accent $\hat{\ }$ to emphasise that the vector $\hat{\mathbf{a}}$ is a unit vector. That is,
$|\hat{\mathbf{a}}| = 1$.

Example 1.2.8

Recall that multiplying a vector a by a positive number s changes the length of the vector by a factor s without changing the
direction of the vector. So (assuming that |a| ≠ 0) $\frac{\mathbf{a}}{|\mathbf{a}|}$ is a unit vector that has the same direction as a. For example, $\frac{\langle 1, 1, 1 \rangle}{\sqrt{3}}$ is a
unit vector that points in the same direction as ⟨1, 1, 1⟩.
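
Dividing a nonzero vector by its length, as described in this example, is often called normalising. Here is a minimal Python sketch, not part of the original text, with hypothetical helper names `length` and `normalise`.

```python
import math

def length(a):
    """Length |a| of a vector given as a tuple of components."""
    return math.sqrt(sum(x * x for x in a))

def normalise(a):
    """Unit vector with the same direction as the nonzero vector a."""
    n = length(a)
    if n == 0:
        raise ValueError("the zero vector has no direction")
    return tuple(x / n for x in a)

u = normalise((1, 1, 1))
print(u)                              # each component is 1/sqrt(3)
print(math.isclose(length(u), 1.0))   # True
```

The explicit check for the zero vector mirrors the assumption |a| ≠ 0 made in the text.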

Example 1.2.9

We go for a walk on a flat Earth. We use a coordinate system with the positive x-axis pointing due east and the positive y-axis
pointing due north. We
start at the origin and
walk due east for 4 units and then

walk northeast for 5√2 units and then
head towards the point (0, 11), but we only go
one third of the way.

We will now use vectors to figure out our final location.


On the first leg of our walk, we go 4 units in the positive x-direction. So our displacement vector — the vector whose tail is
at our starting point and whose head is at the end point of the first leg — is ⟨4, 0⟩ . As we started at (0, 0) we finish the first
leg of the walk at (4, 0).
On the second leg of our walk, our direction of motion is northeast, i.e. is 45° above the direction of the positive x-axis.
Looking at the figure on the right above, we see that our displacement vector, for the second leg of the walk, has to be in
the same direction as the vector ⟨1, 1⟩. So our displacement vector is the vector of length $5\sqrt{2}$ with the same direction as
⟨1, 1⟩. The vector ⟨1, 1⟩ has length $\sqrt{1^2 + 1^2} = \sqrt{2}$ and so $\frac{\langle 1, 1 \rangle}{\sqrt{2}}$ has length one and our displacement vector is

$$5\sqrt{2}\ \frac{\langle 1, 1 \rangle}{\sqrt{2}} = 5\langle 1, 1 \rangle = \langle 5, 5 \rangle$$

If we draw this displacement vector, ⟨5, 5⟩ with its tail at (4, 0), the starting point of the second leg of the walk, then its
head will be at (4 + 5, 0 + 5) = (9, 5) and that is the end point of the second leg of the walk.
On the final leg of our walk, we start at (9, 5) and walk towards (0, 11). The vector from (9, 5) to (0, 11) is
⟨0 − 9, 11 − 5⟩ = ⟨−9, 6⟩. As we go only one third of the way, our final displacement vector is

$$\frac{1}{3}\langle -9, 6 \rangle = \langle -3, 2 \rangle$$

If we draw this displacement vector with its tail at (9, 5), the starting point of the final leg, then its head will be at
(9 − 3, 5 + 2) = (6, 7) and that is the end point of the final leg of the walk, and our final location.
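
The three legs of the walk can be retraced in code by adding displacement vectors to the current position. This Python sketch is illustrative only and not part of the original text; the helper names are hypothetical, and the result is compared with (6, 7) only up to floating point rounding.

```python
import math

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def scale(s, a):
    return tuple(s * x for x in a)

def length(a):
    return math.sqrt(sum(x * x for x in a))

def walk():
    position = (0.0, 0.0)
    # Leg 1: 4 units due east.
    position = add(position, (4.0, 0.0))
    # Leg 2: 5*sqrt(2) units northeast, i.e. along the direction of <1, 1>.
    northeast = scale(1 / length((1.0, 1.0)), (1.0, 1.0))
    position = add(position, scale(5 * math.sqrt(2), northeast))
    # Leg 3: one third of the way from the current position towards (0, 11).
    to_target = sub((0.0, 11.0), position)
    position = add(position, scale(1 / 3, to_target))
    return position

final = walk()
print(final)  # approximately (6, 7)
```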

The Dot Product


Let's get back to the arithmetic operations of addition and multiplication. We will be using both scalars and vectors. So, for each
operation there are three possibilities that we need to explore:

“scalar plus scalar”, “scalar plus vector” and “vector plus vector”
“scalar times scalar”, “scalar times vector” and “vector times vector”
We have been using “scalar plus scalar” and “scalar times scalar” since childhood. “vector plus vector” and “scalar times vector”
were just defined above. There is no sensible way to define “scalar plus vector”, so we won't. This leaves “vector times vector”.
There are actually two widely used such products. The first is the dot product, which is the topic of this section, and which is used
to easily determine the angle θ (or more precisely, cos θ) between two vectors. We'll get to the second, the cross product, later.
Here is a preview of what we will do in this dot product subsection §1.2.2. We are going to give two formulae for the dot product,
a ⋅ b, of the pair of vectors $\mathbf{a} = \langle a_1, a_2, a_3 \rangle$ and $\mathbf{b} = \langle b_1, b_2, b_3 \rangle$.

The first formula is $\mathbf{a} \cdot \mathbf{b} = a_1 b_1 + a_2 b_2 + a_3 b_3$. We will take it as our official definition of a ⋅ b. This formula provides us
with an easy way to compute dot products.
The second formula is a ⋅ b = |a| |b| cos θ, where θ is the angle between a and b.

We will show, in Theorem 1.2.11 below, that this second formula always gives the same answer as the first formula. The second
formula provides us with an easy way to determine the angle between two vectors. In particular, it provides us with an easy way
to test whether or not two vectors are perpendicular to each other. For example, the vectors ⟨1, 2, 3⟩ and ⟨−1, −1, 1⟩ have dot
product

⟨1, 2, 3⟩ ⋅ ⟨−1, −1, 1⟩ = 1 × (−1) + 2 × (−1) + 3 × 1 = 0

This tells us that the angle θ between the two vectors obeys cos θ = 0, so that $\theta = \frac{\pi}{2}$. That is, the two vectors are perpendicular to
each other.
After we give our official definition of the dot product in Definition 1.2.10, and give the important properties of the dot product,
including the formula a ⋅ b = |a| |b| cos θ, in Theorem 1.2.11, we'll give some examples. Finally, to see the dot product in action,
we'll define what it means to project one vector on another vector and give an example.

 Definition 1.2.10. Dot Product.


The dot product of the vectors a and b is denoted a ⋅ b and is defined by
a = ⟨a1 , a2 ⟩ , b = ⟨b1 , b2 ⟩ ⟹ a ⋅ b = a1 b1 + a2 b2

a = ⟨a1 , a2 , a3 ⟩ , b = ⟨b1 , b2 , b3 ⟩ ⟹ a ⋅ b = a1 b1 + a2 b2 + a3 b3

in two and three dimensions respectively.

The properties of the dot product are as follows:

 Theorem 1.2.11. Properties of the Dot Product.

Let a, b and c be vectors and let s be a scalar. Then


(0) a, b are vectors and a ⋅ b is a scalar

(1) a ⋅ a = |a|²

(2) a⋅ b = b ⋅ a

(3) a ⋅ (b + c) = a ⋅ b + a ⋅ c, (a + b) ⋅ c = a ⋅ c + b ⋅ c

(4) (sa) ⋅ b = s(a ⋅ b)

(5) 0⋅a=0

(6) a ⋅ b = |a| |b| cos θ where θ is the angle between a and b

(7) a⋅ b = 0 ⟺ a = 0 or b = 0 or a ⊥ b

Proof.

Properties 0 through 5 are almost immediate consequences of the definition. For example, for property 3 (which is called the
distributive law) in dimension 2,

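\[
\begin{aligned}
\mathbf{a}\cdot(\mathbf{b}+\mathbf{c}) &= \langle a_1, a_2\rangle\cdot\langle b_1+c_1,\ b_2+c_2\rangle \\
&= a_1(b_1+c_1) + a_2(b_2+c_2) = a_1b_1 + a_1c_1 + a_2b_2 + a_2c_2 \\
\mathbf{a}\cdot\mathbf{b} + \mathbf{a}\cdot\mathbf{c} &= \langle a_1, a_2\rangle\cdot\langle b_1, b_2\rangle + \langle a_1, a_2\rangle\cdot\langle c_1, c_2\rangle \\
&= a_1b_1 + a_2b_2 + a_1c_1 + a_2c_2
\end{aligned}
\]

The two right hand sides are the same.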

Property 6 is sufficiently important that it is often used as the definition of dot product. It is not at all an obvious consequence of
the definition. To verify it, we just write |a − b|² in two different ways. The first expresses |a − b|² in terms of a ⋅ b. It is

\[
\begin{aligned}
|\mathbf{a}-\mathbf{b}|^2 &\overset{1}{=} (\mathbf{a}-\mathbf{b})\cdot(\mathbf{a}-\mathbf{b}) \\
&\overset{3}{=} \mathbf{a}\cdot\mathbf{a} - \mathbf{a}\cdot\mathbf{b} - \mathbf{b}\cdot\mathbf{a} + \mathbf{b}\cdot\mathbf{b} \\
&\overset{1,2}{=} |\mathbf{a}|^2 + |\mathbf{b}|^2 - 2\,\mathbf{a}\cdot\mathbf{b}
\end{aligned}
\]

Here, $\overset{1}{=}$, for example, means that the equality is a consequence of property 1. The second way we write |a − b|² involves cos θ
and follows from the cosine law for triangles. Just in case you don't remember the cosine law, we'll derive it right now! Start by
applying Pythagoras to the shaded triangle in the right hand figure.

That triangle is a right triangle whose hypotenuse has length |a − b| and whose other two sides have lengths (|b| − |a| cos θ)
and |a| sin θ. So Pythagoras gives
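\[
\begin{aligned}
|\mathbf{a}-\mathbf{b}|^2 &= \big(|\mathbf{b}| - |\mathbf{a}|\cos\theta\big)^2 + \big(|\mathbf{a}|\sin\theta\big)^2 \\
&= |\mathbf{b}|^2 - 2|\mathbf{a}|\,|\mathbf{b}|\cos\theta + |\mathbf{a}|^2\cos^2\theta + |\mathbf{a}|^2\sin^2\theta \\
&= |\mathbf{b}|^2 - 2|\mathbf{a}|\,|\mathbf{b}|\cos\theta + |\mathbf{a}|^2
\end{aligned}
\]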

This is precisely the cosine law⁵. Observe that, when θ = π/2, this reduces to (surprise!) Pythagoras' theorem.
Setting our two expressions for |a − b|² equal to each other,

|a − b|² = |a|² + |b|² − 2 a ⋅ b = |b|² − 2|a| |b| cos θ + |a|²

cancelling the |a|² and |b|² common to both sides

−2 a ⋅ b = −2|a| |b| cos θ

and dividing by −2 gives

a ⋅ b = |a| |b| cos θ

which is exactly property 6.


Property 7 follows directly from property 6. First note that the dot product a ⋅ b = |a| |b| cos θ is zero if and only if at least
one of the three factors |a|, |b|, cos θ is zero. The first factor is zero if and only if a = 0. The second factor is zero if and only
if b = 0. The third factor is zero if and only if θ = ±π/2 + 2kπ, for some integer k, which in turn is true if and only if a and b
are mutually perpendicular.

Because of Property 7 of Theorem 1.2.11, the dot product can be used to test whether or not two vectors are perpendicular to each
other. That is, whether or not the angle between the two vectors is 90°. Another name⁶ for “perpendicular” is “orthogonal”.

Testing for orthogonality is one of the main uses of the dot product.

 Example 1.2.12
Consider the three vectors

a = ⟨1, 1, 0⟩ b = ⟨1, 0, 1⟩ c = ⟨−1, 1, 1⟩

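Their dot products

\[
\begin{aligned}
\mathbf{a}\cdot\mathbf{b} &= \langle 1,1,0\rangle\cdot\langle 1,0,1\rangle = 1\times 1 + 1\times 0 + 0\times 1 = 1 \\
\mathbf{a}\cdot\mathbf{c} &= \langle 1,1,0\rangle\cdot\langle -1,1,1\rangle = 1\times(-1) + 1\times 1 + 0\times 1 = 0 \\
\mathbf{b}\cdot\mathbf{c} &= \langle 1,0,1\rangle\cdot\langle -1,1,1\rangle = 1\times(-1) + 0\times 1 + 1\times 1 = 0
\end{aligned}
\]

tell us that c is perpendicular to both a and b. Since |a| = |b| = √(1² + 1² + 0²) = √2, the first dot product tells us that
the angle, θ, between a and b obeys

\[
\cos\theta = \frac{\mathbf{a}\cdot\mathbf{b}}{|\mathbf{a}|\,|\mathbf{b}|} = \frac{1}{2} \implies \theta = \frac{\pi}{3}
\]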
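The computation in Example 1.2.12 can be sketched in a few lines of Python. This is only an illustration: the helper names `dot`, `norm` and `angle_between` are our own choices rather than notation from the text, and vectors are assumed to be plain tuples of numbers.

```python
import math

def dot(a, b):
    """Dot product a1*b1 + a2*b2 + ... of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Length |a|, using property 1: a . a = |a|^2."""
    return math.sqrt(dot(a, a))

def angle_between(a, b):
    """Angle in radians between nonzero vectors, from a . b = |a||b|cos(theta)."""
    c = dot(a, b) / (norm(a) * norm(b))
    # Clamp to [-1, 1] to guard against tiny rounding errors before acos.
    return math.acos(max(-1.0, min(1.0, c)))

a, b, c = (1, 1, 0), (1, 0, 1), (-1, 1, 1)
print(dot(a, b), dot(a, c), dot(b, c))   # prints: 1 0 0
print(angle_between(a, b), math.pi / 3)  # the two values agree up to rounding
```

A zero dot product is the orthogonality test of property 7, so the zeros in the first printed line say that c is perpendicular to both a and b.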

Dot products are also used to compute projections. First, here's the definition.

 Definition 1.2.13. Projection.

Draw two vectors, a and b, with their tails at a common point and drop a perpendicular from the head of a to the line that
passes through both the head and tail of b. By definition, the projection of the vector a on the vector b is the vector from the
tail of b to the point on the line where the perpendicular hits.

Think of the projection of a on b as the part of a that is in the direction of b.


Now let's develop a formula for the projection of a on b. Denote by θ the angle between a and b. If |θ| is no more than 90°, as in
the figure on the left above, the length of the projection of a on b is |a| cos θ. By Property 6 of Theorem 1.2.11,
|a| cos θ = a ⋅ b/|b|, so the projection is a vector whose length is a ⋅ b/|b| and whose direction is given by the unit vector b/|b|.
Hence

\[
\text{projection of } \mathbf{a} \text{ on } \mathbf{b} = \mathrm{proj}_{\mathbf{b}}\,\mathbf{a}
= \frac{\mathbf{a}\cdot\mathbf{b}}{|\mathbf{b}|}\,\frac{\mathbf{b}}{|\mathbf{b}|}
= \frac{\mathbf{a}\cdot\mathbf{b}}{|\mathbf{b}|^2}\,\mathbf{b}
\]

If |θ| is larger than 90°, as in the figure on the right above, the projection has length

|a| cos(π − θ) = −|a| cos θ = −a ⋅ b/|b|

and direction −b/|b|. In this case

\[
\mathrm{proj}_{\mathbf{b}}\,\mathbf{a}
= -\frac{\mathbf{a}\cdot\mathbf{b}}{|\mathbf{b}|}\,\frac{-\mathbf{b}}{|\mathbf{b}|}
= \frac{\mathbf{a}\cdot\mathbf{b}}{|\mathbf{b}|^2}\,\mathbf{b}
\]

too. So the formula

 Equation 1.2.14

\[
\mathrm{proj}_{\mathbf{b}}\,\mathbf{a} = \frac{\mathbf{a}\cdot\mathbf{b}}{|\mathbf{b}|^2}\,\mathbf{b}
\]

is applicable whenever b ≠ 0. We may rewrite $\mathrm{proj}_{\mathbf{b}}\,\mathbf{a} = \frac{\mathbf{a}\cdot\mathbf{b}}{|\mathbf{b}|}\,\frac{\mathbf{b}}{|\mathbf{b}|}$. The coefficient, a ⋅ b/|b|, of the unit vector b/|b|, is called the
component of a in the direction b. As a special case, if b happens to be a unit vector, which, for emphasis, we'll now write as b̂,
the projection formula simplifies to

 Equation 1.2.15

\[
\mathrm{proj}_{\hat{\mathbf{b}}}\,\mathbf{a} = (\mathbf{a}\cdot\hat{\mathbf{b}})\,\hat{\mathbf{b}}
\]

 Example 1.2.16

In this example, we will find the projection of the vector ⟨0, 3⟩ on the vector ⟨1, 1⟩ , as in the figure

By Equation 1.2.14 with a = ⟨0, 3⟩ and b = ⟨1, 1⟩ , that projection is


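\[
\begin{aligned}
\mathrm{proj}_{\langle 1,1\rangle}\,\langle 0,3\rangle
&= \frac{\langle 0,3\rangle\cdot\langle 1,1\rangle}{|\langle 1,1\rangle|^2}\,\langle 1,1\rangle \\
&= \frac{0\times 1 + 3\times 1}{1^2+1^2}\,\langle 1,1\rangle = \Big\langle \frac{3}{2}, \frac{3}{2}\Big\rangle
\end{aligned}
\]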
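As a rough sketch of how Equation 1.2.14 might be coded, the Python below projects one vector on another and reproduces the numbers of Example 1.2.16. The function names `proj` and `component` are hypothetical, chosen here for illustration, and vectors are assumed to be tuples of numbers.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def proj(a, b):
    """Projection of a on b, (a . b / |b|^2) b; b must be nonzero."""
    bb = dot(b, b)
    if bb == 0:
        raise ValueError("projection on the zero vector is undefined")
    s = dot(a, b) / bb
    return tuple(s * x for x in b)

def component(a, b):
    """Component of a in the direction b, a . b / |b|."""
    return dot(a, b) / math.sqrt(dot(b, b))

p = proj((0, 3), (1, 1))
print(p)  # prints: (1.5, 1.5)

# The leftover part a - proj_b(a) should be perpendicular to b.
rest = (0 - p[0], 3 - p[1])
print(dot(rest, (1, 1)))  # prints: 0.0
```

The last check is the geometric content of Definition 1.2.13: dropping a perpendicular from the head of a leaves a remainder orthogonal to b.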

One use of projections is to “resolve forces”. There is an example in the next (optional) section.

(Optional) Using Dot Products to Resolve Forces — The Pendulum


Model a pendulum by a mass m that is connected to a hinge by an idealized rod that is massless and of fixed length ℓ. Denote by θ
the angle between the rod and vertical. The forces acting on the mass are
gravity, which has magnitude mg and direction ⟨0, −1⟩ ,
tension in the rod, whose magnitude τ (t) automatically adjusts itself so that the distance between the mass and the hinge is
fixed at ℓ (so that the rod does not stretch or contract) and whose direction is always parallel to the rod,
and possibly some frictional forces, like friction in the hinge and air resistance. Assume that the total frictional force has
magnitude proportional⁷ to the speed of the mass and has direction opposite to the direction of motion of the mass. We'll call
the constant of proportionality β.

If we use a coordinate system centered on the hinge, the (x, y) coordinates of the mass at time t are

x(t) = ℓ sin θ(t)

y(t) = −ℓ cos θ(t)

where θ(t) is the angle between the rod and vertical at time t. We are now going to use Newton's law of motion

mass × acceleration = total applied force

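to determine how θ evolves in time. By definition, the velocity and acceleration vectors⁸ for the position vector ⟨x(t), y(t)⟩ are

\[
\begin{aligned}
\frac{d}{dt}\langle x(t), y(t)\rangle &= \Big\langle \frac{dx}{dt}(t), \frac{dy}{dt}(t)\Big\rangle \\
\frac{d^2}{dt^2}\langle x(t), y(t)\rangle &= \Big\langle \frac{d^2x}{dt^2}(t), \frac{d^2y}{dt^2}(t)\Big\rangle
\end{aligned}
\]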

So, the velocity and acceleration vectors of our mass are


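\[
\begin{aligned}
\mathbf{v}(t) &= \frac{d}{dt}\langle x(t), y(t)\rangle \\
&= \Big\langle \ell\,\frac{d}{dt}\sin\theta(t),\ -\ell\,\frac{d}{dt}\cos\theta(t)\Big\rangle \\
&= \Big\langle \ell\cos\theta(t)\,\frac{d\theta}{dt}(t),\ \ell\sin\theta(t)\,\frac{d\theta}{dt}(t)\Big\rangle \\
&= \ell\,\frac{d\theta}{dt}(t)\,\langle\cos\theta(t), \sin\theta(t)\rangle \\
\mathbf{a}(t) &= \frac{d^2}{dt^2}\langle x(t), y(t)\rangle \\
&= \frac{d}{dt}\Big\{\ell\,\frac{d\theta}{dt}(t)\,\langle\cos\theta(t), \sin\theta(t)\rangle\Big\} \\
&= \ell\,\frac{d^2\theta}{dt^2}(t)\,\langle\cos\theta(t), \sin\theta(t)\rangle
 + \ell\,\frac{d\theta}{dt}(t)\,\Big\langle \frac{d}{dt}\cos\theta(t),\ \frac{d}{dt}\sin\theta(t)\Big\rangle \\
&= \ell\,\frac{d^2\theta}{dt^2}(t)\,\langle\cos\theta(t), \sin\theta(t)\rangle
 + \ell\,\Big(\frac{d\theta}{dt}(t)\Big)^2\langle -\sin\theta(t), \cos\theta(t)\rangle
\end{aligned}
\]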

The negative of the velocity vector is $-\ell\,\frac{d\theta}{dt}\,\langle\cos\theta, \sin\theta\rangle$, so the total frictional force is

\[
-\beta\ell\,\frac{d\theta}{dt}\,\langle\cos\theta, \sin\theta\rangle
\]

with β our constant of proportionality.


The vector

τ (t) ⟨− sin θ(t), cos θ(t)⟩

has magnitude τ (t) and direction parallel to the rod pointing from the mass towards the hinge and so is the force due to tension in
the rod.
Hence, for this physical system, Newton's law of motion is
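\[
\overbrace{m\ell\,\frac{d^2\theta}{dt^2}\,\langle\cos\theta, \sin\theta\rangle
 + m\ell\,\Big(\frac{d\theta}{dt}\Big)^2\langle -\sin\theta, \cos\theta\rangle}^{\text{mass}\times\text{acceleration}}
= \overbrace{mg\,\langle 0, -1\rangle}^{\text{gravity}}
 + \overbrace{\tau\,\langle -\sin\theta, \cos\theta\rangle}^{\text{tension}}
 \overbrace{-\,\beta\ell\,\frac{d\theta}{dt}\,\langle\cos\theta, \sin\theta\rangle}^{\text{friction}}
\tag{$*$}
\]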

This is a rather complicated looking equation. Writing out its x- and y-components doesn't help. They also look complicated.
Instead, the equation can be considerably simplified (and consequently better understood) by “taking its components parallel to and
perpendicular to the direction of motion”. From the velocity vector v(t), we see that ⟨cos θ(t), sin θ(t)⟩ is a unit vector parallel to
the direction of motion at time t. Recall, from Equation 1.2.15, that the projection of any vector b on any unit vector d̂ (with the “hat” on d̂
reminding ourselves that the vector is a unit vector) is

(b ⋅ d̂) d̂

The coefficient b ⋅ d̂ is, by definition, the component of b in the direction d̂. So, by dotting both sides of the equation of motion
(∗) with d̂ = ⟨cos θ(t), sin θ(t)⟩, we extract the component parallel to the direction of motion. Since

⟨cos θ, sin θ⟩ ⋅ ⟨cos θ, sin θ⟩ = 1

⟨cos θ, sin θ⟩ ⋅ ⟨− sin θ, cos θ⟩ = 0

⟨cos θ, sin θ⟩ ⋅ ⟨0, −1⟩ = − sin θ

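this gives

\[
m\ell\,\frac{d^2\theta}{dt^2} = -mg\sin\theta - \beta\ell\,\frac{d\theta}{dt}
\]

which is much cleaner than (∗)! When θ is small, we can approximate sin θ ≈ θ and get the equation

\[
\frac{d^2\theta}{dt^2} + \frac{\beta}{m}\,\frac{d\theta}{dt} + \frac{g}{\ell}\,\theta = 0
\]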

which is easily solved. There are systematic procedures for finding the solution, but we'll just guess.
When there is no friction (so that β = 0 ), we would expect the pendulum to just oscillate. So it is natural to guess

θ(t) = A sin(ωt − δ)

which is an oscillation with (unknown) amplitude A, frequency ω (radians per unit time) and phase δ. Substituting this guess into
the left hand side, $\theta'' + \frac{g}{\ell}\theta$, yields

\[
-A\omega^2\sin(\omega t - \delta) + A\,\frac{g}{\ell}\,\sin(\omega t - \delta)
\]

which is zero if ω = √(g/ℓ). So θ(t) = A sin(ωt − δ) is a solution for any amplitude A and phase δ, provided the frequency
ω = √(g/ℓ).

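When there is some, but not too much, friction, so that β > 0 is relatively small, we would expect “oscillation with decaying
amplitude”. So we guess

\[
\theta(t) = A e^{-\gamma t}\sin(\omega t - \delta)
\]

for some constant decay rate γ, to be determined. With this guess,

\[
\begin{aligned}
\theta(t) &= A e^{-\gamma t}\sin(\omega t - \delta) \\
\theta'(t) &= -\gamma A e^{-\gamma t}\sin(\omega t - \delta) + \omega A e^{-\gamma t}\cos(\omega t - \delta) \\
\theta''(t) &= (\gamma^2 - \omega^2) A e^{-\gamma t}\sin(\omega t - \delta) - 2\gamma\omega A e^{-\gamma t}\cos(\omega t - \delta)
\end{aligned}
\]

and the left hand side

\[
\begin{aligned}
\frac{d^2\theta}{dt^2} + \frac{\beta}{m}\,\frac{d\theta}{dt} + \frac{g}{\ell}\,\theta
&= \Big[\gamma^2 - \omega^2 - \frac{\beta}{m}\gamma + \frac{g}{\ell}\Big] A e^{-\gamma t}\sin(\omega t - \delta) \\
&\quad + \Big[-2\gamma\omega + \frac{\beta}{m}\omega\Big] A e^{-\gamma t}\cos(\omega t - \delta)
\end{aligned}
\]

vanishes if $\gamma^2 - \omega^2 - \frac{\beta}{m}\gamma + \frac{g}{\ell} = 0$ and $-2\gamma\omega + \frac{\beta}{m}\omega = 0$. The second equation tells us the decay rate $\gamma = \frac{\beta}{2m}$ and then the first
tells us the frequency

\[
\omega = \sqrt{\gamma^2 - \frac{\beta}{m}\gamma + \frac{g}{\ell}} = \sqrt{\frac{g}{\ell} - \frac{\beta^2}{4m^2}}
\]

When there is a lot of friction (namely when $\frac{\beta^2}{4m^2} > \frac{g}{\ell}$, so that the frequency ω is not a real number), we would expect damping
without oscillation and so would guess $\theta(t) = A e^{-\gamma t}$. You can determine the allowed values of γ by substituting this guess in.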
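The guess-and-check argument for the lightly damped pendulum can be sanity-checked numerically. The Python sketch below plugs γ = β/(2m) and ω = √(g/ℓ − β²/(4m²)) into the closed-form derivatives given above and evaluates the left hand side of the small-angle equation. The function names and the sample parameter values are assumptions made for illustration only.

```python
import math

def damped_parameters(beta, m, g, ell):
    """Decay rate gamma and frequency omega of the guess A e^{-gamma t} sin(omega t - delta)."""
    gamma = beta / (2 * m)
    omega_sq = g / ell - beta**2 / (4 * m**2)
    if omega_sq < 0:
        raise ValueError("too much friction: omega would not be a real number")
    return gamma, math.sqrt(omega_sq)

def residual(t, beta, m, g, ell, A=1.0, delta=0.0):
    """Left hand side theta'' + (beta/m) theta' + (g/ell) theta for the guess."""
    gamma, omega = damped_parameters(beta, m, g, ell)
    e = A * math.exp(-gamma * t)
    s = math.sin(omega * t - delta)
    c = math.cos(omega * t - delta)
    theta = e * s
    dtheta = -gamma * e * s + omega * e * c
    ddtheta = (gamma**2 - omega**2) * e * s - 2 * gamma * omega * e * c
    return ddtheta + (beta / m) * dtheta + (g / ell) * theta

# Hypothetical sample values: beta = 0.2, m = 1, g = 9.8, ell = 1.
for t in (0.0, 0.5, 2.0):
    print(t, residual(t, 0.2, 1.0, 9.8, 1.0))  # each residual is zero up to rounding
```

This only confirms the algebra at a few sample times; it is not a substitute for the derivation.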

To extract the components perpendicular to the direction of motion, we dot with ⟨− sin θ, cos θ⟩ rather than ⟨cos θ, sin θ⟩ . Note
that, because

⟨− sin θ, cos θ⟩ ⋅ ⟨cos θ, sin θ⟩ = 0,

the vector ⟨− sin θ, cos θ⟩ really is perpendicular to the direction of motion. Since
⟨− sin θ, cos θ⟩ ⋅ ⟨cos θ, sin θ⟩ = 0

⟨− sin θ, cos θ⟩ ⋅ ⟨− sin θ, cosθ⟩ = 1

⟨− sin θ, cos θ⟩ ⋅ ⟨0, −1⟩ = − cos θ

dotting both sides of the equation of motion (∗) with ⟨− sin θ, cos θ⟩ gives
\[
m\ell\,\Big(\frac{d\theta}{dt}\Big)^2 = -mg\cos\theta + \tau
\]

This equation just determines the tension

\[
\tau = m\ell\,\Big(\frac{d\theta}{dt}\Big)^2 + mg\cos\theta
\]

in the rod, once you know θ(t).

(Optional) Areas of Parallelograms


A parallelogram is naturally determined by the two vectors that define its sides. We'll now develop a formula for the area of a
parallelogram in terms of these two vectors.
Construct a parallelogram as follows. Pick two vectors ⟨a, b⟩ and ⟨c, d⟩ . Draw them with their tails at a common point. Then draw
⟨a, b⟩ a second time with its tail at the head of ⟨c, d⟩ and draw ⟨c, d⟩ a second time with its tail at the head of ⟨a, b⟩ . If the

common point is the origin, you get a picture like the figure below.

Any parallelogram can be constructed like this if you pick the common point and two vectors appropriately. Let's compute the area
of the parallelogram. The area of the large rectangle with vertices (0, 0),  (0, b + d),  (a + c, 0) and (a + c, b + d) is
(a + c)(b + d). The parallelogram we want can be extracted from the large rectangle by deleting the two small rectangles (each of
area bc), and the two lightly shaded triangles (each of area ½cd), and the two darkly shaded triangles (each of area ½ab). So the
desired

\[
\text{area} = (a+c)(b+d) - (2\times bc) - \big(2\times\tfrac{1}{2}cd\big) - \big(2\times\tfrac{1}{2}ab\big) = ad - bc
\]

In the above figure, we have implicitly assumed that a,  b,  c,  d ≥ 0 and d/c ≥ b/a. In words, we have assumed that both vectors
⟨a, b⟩ ,   ⟨c, d⟩ lie in the first quadrant and that ⟨c, d⟩ lies above ⟨a, b⟩ . By simply interchanging a ↔ c and b ↔ d in the picture

and throughout the argument, we see that when a,  b,  c,  d ≥ 0 and b/a ≥ d/c, so that the vector ⟨c, d⟩ lies below ⟨a, b⟩ , the area
of the parallelogram is bc − ad. In fact, all cases are covered by the formula

 Equation 1.2.17

area of parallelogram with sides  ⟨a, b⟩  and  ⟨c, d⟩ = |ad − bc|

Given two vectors ⟨a, b⟩ and ⟨c, d⟩, the expression ad − bc is generally written

\[
\det\begin{bmatrix} a & b \\ c & d \end{bmatrix} = ad - bc
\]

and is called the determinant of the matrix⁹

\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix}
\]

with rows ⟨a, b⟩ and ⟨c, d⟩ . The determinant of a 2 × 2 matrix is the product of the diagonal entries minus the product of the off-
diagonal entries.
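Equation 1.2.17 translates directly into code. The following Python is a minimal sketch, with the function names `det2` and `parallelogram_area` invented here for illustration and the sample vectors chosen arbitrarily.

```python
def det2(a, b, c, d):
    """Determinant of the 2x2 matrix with rows <a, b> and <c, d>."""
    return a * d - b * c

def parallelogram_area(u, v):
    """Area |ad - bc| of the parallelogram with sides u = <a, b> and v = <c, d>."""
    return abs(det2(u[0], u[1], v[0], v[1]))

print(parallelogram_area((2, 0), (1, 3)))  # prints: 6
print(parallelogram_area((1, 3), (2, 0)))  # prints: 6 (swapping the sides flips only the sign of ad - bc)
print(parallelogram_area((1, 2), (2, 4)))  # prints: 0 (parallel sides give a degenerate parallelogram)
```

The absolute value is what lets one formula cover both orderings of the two sides.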
There is a similar formula in three dimensions. Any three vectors a = ⟨a₁, a₂, a₃⟩, b = ⟨b₁, b₂, b₃⟩ and c = ⟨c₁, c₂, c₃⟩ in three
dimensions determine a parallelepiped (three dimensional parallelogram). Its volume is given by the formula

 Equation 1.2.18

\[
\text{volume of parallelepiped with edges } \mathbf{a}, \mathbf{b}, \mathbf{c}
= \left|\,\det\begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix}\right|
\]

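The determinant of a 3 × 3 matrix can be defined in terms of some 2 × 2 determinants by

\[
\det\begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix}
= a_1\det\begin{bmatrix} b_2 & b_3 \\ c_2 & c_3 \end{bmatrix}
- a_2\det\begin{bmatrix} b_1 & b_3 \\ c_1 & c_3 \end{bmatrix}
+ a_3\det\begin{bmatrix} b_1 & b_2 \\ c_1 & c_2 \end{bmatrix}
\]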

This formula is called “expansion along the top row”. There is one term in the formula for each entry in the top row of the 3 × 3
matrix. The term is a sign times the entry itself times the determinant of the 2 × 2 matrix gotten by deleting the row and column
that contains the entry. The sign alternates, starting with a “+”.
We shall not prove this formula completely here¹⁰. It gets a little tedious. But, there is one case in which we can easily verify that
the volume of the parallelepiped is really given by the absolute value of the claimed determinant. If the vectors b and c happen to
lie in the xy-plane, so that b₃ = c₃ = 0, then

\[
\begin{aligned}
\det\begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & 0 \\ c_1 & c_2 & 0 \end{bmatrix}
&= a_1(b_2\,0 - 0\,c_2) - a_2(b_1\,0 - 0\,c_1) + a_3(b_1c_2 - b_2c_1) \\
&= a_3(b_1c_2 - b_2c_1)
\end{aligned}
\]

The first factor, a₃, is the z-coordinate of the one vector not contained in the xy-plane. It is (up to a sign) the height of the
parallelepiped. The second factor is, up to a sign, the area of the parallelogram determined by b and c. This parallelogram forms
the base of the parallelepiped. The product is indeed, up to a sign, the volume of the parallelepiped. That the formula is true in
general is a consequence of the fact (that we will not prove) that the value of a determinant does not change when one rotates the
coordinate system and that one can always rotate our coordinate axes around so that b and c both lie in the xy-plane.
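Expansion along the top row is mechanical enough to sketch in Python. The function names `det2` and `det3` below are our own, matrices are assumed to be lists of rows, and the test matrix is an arbitrary example with b₃ = c₃ = 0 so that the result can be compared with a₃(b₁c₂ − b₂c₁).

```python
def det2(m):
    """Determinant of a 2x2 matrix given as a list of two rows."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def det3(m):
    """Determinant of a 3x3 matrix by expansion along the top row."""
    total = 0
    for j in range(3):
        # Delete the top row and column j to get the 2x2 minor.
        minor = [[row[k] for k in range(3) if k != j] for row in m[1:]]
        sign = 1 if j % 2 == 0 else -1  # signs alternate +, -, +
        total += sign * m[0][j] * det2(minor)
    return total

# Rows a = <5, 7, 4>, b = <2, 1, 0>, c = <1, 3, 0>; b and c lie in the xy-plane.
special = [[5, 7, 4], [2, 1, 0], [1, 3, 0]]
print(det3(special), 4 * (2 * 3 - 1 * 1))  # prints: 20 20
```

Only the a₃ term survives in this special case, in line with the computation above.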

The Cross Product
We have already seen two different products involving vectors — the multiplication of a vector by a scalar and the dot product of
two vectors. The dot product of two vectors yields a scalar. We now introduce another product of two vectors, called the cross
product. The cross product of two vectors will give a vector. There are applications which have two vectors as inputs and produce
one vector as an output, and which are related to the cross product. Here is a very brief mention of two such applications. We will
look at them in much more detail later.
Consider a parallelogram in three dimensions. A parallelogram is naturally determined by the two vectors that define its sides.
One measure of the size of a parallelogram is its area. One way to specify the orientation of the parallelogram is to give a vector
that is perpendicular to it. A very compact way to encode both the area and the orientation of the parallelogram is to give a
vector whose direction is perpendicular to the plane in which it lies and whose magnitude is its area. We shall see that such a
vector can be easily constructed by taking the cross product (definition coming shortly) of the two vectors that give the sides of
the parallelogram.

Imagine a rigid body which is rotating at a rate Ω radians per second about an axis whose direction is given by the unit vector
â. Let P be any point on the body. We shall see, in the (optional) §1.2.7, that the velocity, v, of the point P is the cross product
(again, definition coming shortly) of the vector Ωâ with the vector r from any point on the axis of rotation to P.

Finally, here is the definition of the cross product. Note that it applies only to vectors in three dimensions.

 Definition 1.2.19. Cross Product.


The cross product of the vectors a = ⟨a₁, a₂, a₃⟩ and b = ⟨b₁, b₂, b₃⟩ is denoted a × b and is defined by

a × b = ⟨a₂b₃ − a₃b₂, a₃b₁ − a₁b₃, a₁b₂ − a₂b₁⟩

Note that each component has the form $a_i b_j - a_j b_i$. The index i of the first a in component number k of a × b is just after k in
the list 1, 2, 3, 1, 2, 3, 1, 2, 3, ⋯. The index j of the first b is just before k in the list.

\[
(\mathbf{a}\times\mathbf{b})_k = a_{\text{just after } k}\, b_{\text{just before } k} - a_{\text{just before } k}\, b_{\text{just after } k}
\]

For example, for component number k = 3,

\[
\left.\begin{array}{l} \text{“just after 3” is 1} \\ \text{“just before 3” is 2} \end{array}\right\}
\implies (\mathbf{a}\times\mathbf{b})_3 = a_1b_2 - a_2b_1
\]
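The “just after k, just before k” rule is a cyclic index rule, and it can be sketched in Python using arithmetic modulo 3. The function name `cross` is our own choice, and note that Python indexes components 0, 1, 2 rather than 1, 2, 3.

```python
def cross(a, b):
    """Cross product of 3-vectors via the cyclic 'just after / just before' rule."""
    result = []
    for k in range(3):
        after = (k + 1) % 3   # index just after k in the list 0, 1, 2, 0, 1, 2, ...
        before = (k + 2) % 3  # index just before k in the same list
        result.append(a[after] * b[before] - a[before] * b[after])
    return tuple(result)

print(cross((1, 0, 0), (0, 1, 0)))   # prints: (0, 0, 1)
print(cross((1, 2, 3), (1, -1, 2)))  # prints: (7, 1, -3)
```

For k = 2 (the third component) the rule picks out index 0 and index 1, which is the a₁b₂ − a₂b₁ of the example above.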

There is a much better way to remember this definition. Recall that a 2 × 2 matrix is an array of numbers having two rows and two
columns and that the determinant of a 2 × 2 matrix is defined by

a b
det [ ] = ad − bc
c d

It is the product of the entries on the diagonal minus the product of the entries not on the diagonal.
A 3 × 3 matrix is an array of numbers having three rows and three columns.

\[
\begin{bmatrix} i & j & k \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{bmatrix}
\]

You will shortly see why the entries in the top row have been given the rather peculiar names i, j and k. The determinant of a
3 × 3 matrix can be defined in terms of some 2 × 2 determinants by

\[
\det\begin{bmatrix} i & j & k \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{bmatrix}
= i\,\det\begin{bmatrix} a_2 & a_3 \\ b_2 & b_3 \end{bmatrix}
- j\,\det\begin{bmatrix} a_1 & a_3 \\ b_1 & b_3 \end{bmatrix}
+ k\,\det\begin{bmatrix} a_1 & a_2 \\ b_1 & b_2 \end{bmatrix}
\]

This formula is called “expansion of the determinant along the top row”. There is one term in the formula for each entry in the top
row. The term is a sign times the entry itself times the determinant of the 2 × 2 matrix gotten by deleting the row and column that
contains the entry. The sign alternates, starting with a +. If we now replace i by î, j by ĵ and k by k̂, we get exactly the formula
for a × b of Definition 1.2.19. That is the reason for the peculiar choice of names for the matrix entries. So

\[
\begin{aligned}
\mathbf{a}\times\mathbf{b}
&= \det\begin{bmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{bmatrix} \\
&= \hat{\imath}\,(a_2b_3 - a_3b_2) - \hat{\jmath}\,(a_1b_3 - a_3b_1) + \hat{k}\,(a_1b_2 - a_2b_1)
\end{aligned}
\]

is a mnemonic device for remembering the definition of a × b. It is also good from the point of view of evaluating a × b. Here are
several examples in which we use the determinant mnemonic device to evaluate cross products.

 Example 1.2.20

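In this example, we'll use the mnemonic device to compute two very simple cross products. First

\[
\hat{\imath}\times\hat{\jmath}
= \det\begin{bmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}
= \hat{\imath}\,(0\times 0 - 0\times 1) - \hat{\jmath}\,(1\times 0 - 0\times 0) + \hat{k}\,(1\times 1 - 0\times 0)
= \hat{k}
\]

Second

\[
\hat{\jmath}\times\hat{\imath}
= \det\begin{bmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}
= \hat{\imath}\,(1\times 0 - 0\times 0) - \hat{\jmath}\,(0\times 0 - 0\times 1) + \hat{k}\,(0\times 0 - 1\times 1)
= -\hat{k}
\]

Note that, unlike most (or maybe even all) products that you have seen before, î × ĵ is not the same as ĵ × î!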

 Example 1.2.21

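In this example, we'll use the mnemonic device to compute two more complicated cross products. Let a = ⟨1, 2, 3⟩ and
b = ⟨1, −1, 2⟩. First

\[
\begin{aligned}
\mathbf{a}\times\mathbf{b}
&= \det\begin{bmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ 1 & 2 & 3 \\ 1 & -1 & 2 \end{bmatrix} \\
&= \hat{\imath}\,\big(2\times 2 - 3\times(-1)\big) - \hat{\jmath}\,(1\times 2 - 3\times 1) + \hat{k}\,\big(1\times(-1) - 2\times 1\big) \\
&= 7\,\hat{\imath} + \hat{\jmath} - 3\,\hat{k}
\end{aligned}
\]

Second

\[
\begin{aligned}
\mathbf{b}\times\mathbf{a}
&= \det\begin{bmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ 1 & -1 & 2 \\ 1 & 2 & 3 \end{bmatrix} \\
&= \hat{\imath}\,\big((-1)\times 3 - 2\times 2\big) - \hat{\jmath}\,(1\times 3 - 2\times 1) + \hat{k}\,\big(1\times 2 - (-1)\times 1\big) \\
&= -7\,\hat{\imath} - \hat{\jmath} + 3\,\hat{k}
\end{aligned}
\]

Here are some important observations.
The vectors a × b and b × a are not the same! In fact b × a = −a × b. We shall see in Theorem 1.2.23 below that this
was not a fluke.
The vector a × b has dot product zero with both a and b. So the vector a × b is perpendicular to both a and b. We shall
see in Theorem 1.2.23 below that this was also not a fluke.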

 Example 1.2.22
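Yet again we use the mnemonic device to compute a more complicated cross product. This time let a = ⟨3, 2, 1⟩ and
b = ⟨6, 4, 2⟩. Then

\[
\begin{aligned}
\mathbf{a}\times\mathbf{b}
&= \det\begin{bmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ 3 & 2 & 1 \\ 6 & 4 & 2 \end{bmatrix} \\
&= \hat{\imath}\,(2\times 2 - 1\times 4) - \hat{\jmath}\,(3\times 2 - 1\times 6) + \hat{k}\,(3\times 4 - 2\times 6)
= \mathbf{0}
\end{aligned}
\]

We shall see in Theorem 1.2.23 below that it is not a fluke that the cross product is 0. It is a consequence of the fact that a and
b = 2a are parallel.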
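The observations drawn from Examples 1.2.21 and 1.2.22 can be checked with a short Python sketch. The helpers `dot` and `cross` are our own names, written straight from the component definitions, and the vectors are the ones used in those examples.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product <a2 b3 - a3 b2, a3 b1 - a1 b3, a1 b2 - a2 b1>."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

a, b = (1, 2, 3), (1, -1, 2)
axb = cross(a, b)
print(axb, cross(b, a))          # prints: (7, 1, -3) (-7, -1, 3)
print(dot(axb, a), dot(axb, b))  # prints: 0 0

# Parallel vectors, b = 2a, have cross product zero.
print(cross((3, 2, 1), (6, 4, 2)))  # prints: (0, 0, 0)
```

Checking a few vectors is of course no proof; the general statements are the content of Theorem 1.2.23.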

We now move on to learning about the properties of the cross product. Our first properties lead up to a more intuitive geometric
definition of a × b, which is better for interpreting a × b. These properties of the cross product, which state that a × b is a vector
and then determine its direction and length, are as follows. We will collect these properties, and a few others, into a theorem
shortly.
(0)
a, b are vectors in three dimensions and a × b is a vector in three dimensions.
(1)
a×b is perpendicular to both a and b.

Proof.
To check that a and a × b are perpendicular, one just has to check that the dot product a ⋅ (a × b) = 0. The six terms in

a ⋅ (a × b) = a1 (a2 b3 − a3 b2 ) + a2 (a3 b1 − a1 b3 ) + a3 (a1 b2 − a2 b1 )

cancel pairwise. The computation showing that b ⋅ (a × b) = 0 is similar.

(2)
|a × b| = |a| |b| sin θ where 0 ≤ θ ≤ π is the angle between a, b

= the area of the parallelogram with sides a, b

Proof.
The formula |a × b| = |a| |b| sin θ is gotten by verifying that

\[
\begin{aligned}
|\mathbf{a}\times\mathbf{b}|^2 &= (\mathbf{a}\times\mathbf{b})\cdot(\mathbf{a}\times\mathbf{b}) \\
&= (a_2b_3 - a_3b_2)^2 + (a_3b_1 - a_1b_3)^2 + (a_1b_2 - a_2b_1)^2 \\
&= a_2^2b_3^2 - 2a_2b_3a_3b_2 + a_3^2b_2^2 + a_3^2b_1^2 - 2a_3b_1a_1b_3 + a_1^2b_3^2 \\
&\quad + a_1^2b_2^2 - 2a_1b_2a_2b_1 + a_2^2b_1^2
\end{aligned}
\]

is equal to

\[
\begin{aligned}
|\mathbf{a}|^2|\mathbf{b}|^2\sin^2\theta &= |\mathbf{a}|^2|\mathbf{b}|^2(1 - \cos^2\theta) \\
&= |\mathbf{a}|^2|\mathbf{b}|^2 - (\mathbf{a}\cdot\mathbf{b})^2 \\
&= (a_1^2 + a_2^2 + a_3^2)(b_1^2 + b_2^2 + b_3^2) - (a_1b_1 + a_2b_2 + a_3b_3)^2 \\
&= a_1^2b_2^2 + a_1^2b_3^2 + a_2^2b_1^2 + a_2^2b_3^2 + a_3^2b_1^2 + a_3^2b_2^2 \\
&\quad - (2a_1b_1a_2b_2 + 2a_1b_1a_3b_3 + 2a_2b_2a_3b_3)
\end{aligned}
\]

To see that |a| |b| sin θ is the area of the parallelogram with sides a and b, just recall that the area of any parallelogram is given
by the length of its base times its height. Think of a as the base of the parallelogram. Then |a| is the length of the base and
|b| sin θ is the height.
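The identity behind property 2, |a × b|² = |a|²|b|² − (a ⋅ b)², is easy to spot-check with integers, where there is no rounding. The Python below is a sketch with helper names of our own choosing and arbitrarily chosen sample vectors.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two 3-vectors, from the component definition."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

a, b = (1, 2, 3), (1, -1, 2)
lhs = dot(cross(a, b), cross(a, b))         # |a x b|^2
rhs = dot(a, a) * dot(b, b) - dot(a, b)**2  # |a|^2 |b|^2 - (a . b)^2
print(lhs, rhs)  # prints: 59 59

# Area of the parallelogram with sides a and b.
area = math.sqrt(lhs)
print(area)
```

Here |a|² = 14, |b|² = 6 and a ⋅ b = 5, so both sides equal 84 − 25 = 59.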

These properties almost determine a × b. Property 1 forces the vector a × b to lie on the line perpendicular to the plane
containing a and b. There are precisely two vectors on this line that have the length given by property 2. In the left hand figure,
the two vectors are labeled c and d. Which of these two candidates is correct is determined by the right hand rule¹¹, which says
that if you form your right hand into a fist with your fingers curling from a to b, then when you stick your thumb straight out from
the fist, it points in the direction of a × b. This is illustrated in the figure on the right¹². The important special cases
(3)

î × ĵ = k̂,    ĵ × k̂ = î,    k̂ × î = ĵ

ĵ × î = −k̂,    k̂ × ĵ = −î,    î × k̂ = −ĵ

all follow directly from the definition of the cross product (see, for example, Example 1.2.20) and all obey the right hand
rule. Combining properties 1, 2 and the right hand rule gives the geometric definition of a × b. To remember these three
special cases, just remember this figure.

The product of any two standard basis vectors, taken in the order of the arrows in the figure, is the third standard basis vector.
Going against the arrows introduces a minus sign.
(4)

a × b = |a| |b| sin θ n̂

where θ is the angle between a, b, |n̂| = 1, n̂ ⊥ a, b, and (a, b, n̂) obey the right hand rule.

Outline of Proof.
We have already seen that the right hand side has the correct length and, except possibly for a sign, direction. To check that the
right hand rule holds in general, rotate your coordinate system around¹³ so that a points along the positive x axis and b lies in
the xy-plane with positive y component. That is a = α î and b = β î + γ ĵ with α, γ ≥ 0. Then

a × b = α î × (β î + γ ĵ) = αβ î × î + αγ î × ĵ.

The first term vanishes by property 2, because the angle θ between î and î is zero. So, by property 3, a × b = αγ k̂
points along the positive z axis, which is consistent with the right hand rule.

The analog of property 7 of the dot product (which says that a ⋅ b is zero if and only if a = 0 or b = 0 or a ⊥ b) follows
immediately from property 2.
(5)

a × b = 0 ⟺ a = 0 or b = 0 or a ∥ b.

Here is a theorem which summarizes the properties of the cross product. We have already seen the first five. The other properties
are all tools for helping do computations with cross products.

 Theorem 1.2.23. Properties of the Cross Product.

(0)
a, b are vectors in three dimensions and a × b is a vector in three dimensions.
(1)
a×b is perpendicular to both a and b.
(2)
|a × b| = |a| |b| sin θ where 0 ≤ θ ≤ π is the angle between a, b

= the area of the parallelogram with sides a, b

(3)

î × ĵ = k̂,    ĵ × k̂ = î,    k̂ × î = ĵ

(4)

a × b = |a| |b| sin θ n̂

where θ is the angle between a, b, |n̂| = 1, n̂ ⊥ a, b, and (a, b, n̂) obey the right hand rule.

(5)

a×b = 0 ⟺ a = 0 or b = 0 or a ∥ b.

(6)

a × b = −b × a

(7)

(sa) × b = a × (sb) = s(a × b)

for any scalar (i.e. number) s.


(8)

a × (b + c) = a × b + a × c

(9)

a ⋅ (b × c) = (a × b) ⋅ c

(10)

a × (b × c) = (c ⋅ a)b − (b ⋅ a)c

Proof.
We have already seen the proofs up to number 5. Numbers 6, 7 and 8 follow immediately from the definition, using a little
algebra. To prove numbers 9 and 10 we just write out the definitions of the left hand sides and the right hand sides and observe
that they are equal.
(9) The left hand side is

a ⋅ (b × c) = ⟨a1 , a2 , a3 ⟩ ⋅ ⟨b2 c3 −b3 c2 , b3 c1 −b1 c3 , b1 c2 −b2 c1 ⟩

= a1 b2 c3 −a1 b3 c2 +a2 b3 c1 −a2 b1 c3 +a3 b1 c2 −a3 b2 c1

The right hand side is

(a × b) ⋅ c = ⟨a2 b3 −a3 b2 , a3 b1 −a1 b3 , a1 b2 −a2 b1 ⟩ ⋅ ⟨c1 , c2 , c3 ⟩

= a2 b3 c1 −a3 b2 c1 +a3 b1 c2 −a1 b3 c2 +a1 b2 c3 −a2 b1 c3

The left and right hand sides are the same.


(10) We will give the straightforward, but slightly tedious, computations in (the optional) §1.2.6.

 Warning 1.2.24
Take particular care with properties 6 and 10. They are counterintuitive and are a frequent source of errors. In particular, for
general vectors a, b, c, the cross product is neither commutative nor associative, meaning that
a×b ≠ b ×a

a × (b × c) ≠ (a × b) × c

For example

î × (î × ĵ) = î × k̂ = −k̂ × î = −ĵ

(î × î) × ĵ = 0 × ĵ = 0
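The counterexample in Warning 1.2.24 can be reproduced with a few lines of Python. This is only a sketch: `cross` is our own helper built from the component definition, and the standard basis vectors are represented as tuples.

```python
def cross(a, b):
    """Cross product of two 3-vectors, from the component definition."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

i_hat, j_hat = (1, 0, 0), (0, 1, 0)

left = cross(i_hat, cross(i_hat, j_hat))   # i x (i x j)
right = cross(cross(i_hat, i_hat), j_hat)  # (i x i) x j
print(left)   # prints: (0, -1, 0)
print(right)  # prints: (0, 0, 0)
```

The two groupings give different vectors, so the parentheses in a triple cross product cannot be dropped.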

 Example 1.2.25
As an illustration of the properties of the dot and cross product, we now derive the formula for the volume of the parallelepiped
with edges a = ⟨a₁, a₂, a₃⟩, b = ⟨b₁, b₂, b₃⟩, c = ⟨c₁, c₂, c₃⟩ that was mentioned in §1.2.4.

The volume of the parallelepiped is the area of its base times its height¹⁴. The base is the parallelogram with sides b and c. Its
area is the length of its base, which is |b|, times its height, which is |c| sin θ. (Drop a perpendicular from the head of c to the
line containing b.) Here θ is the angle between b and c. So the area of the base is |b| |c| sin θ = |b × c|, by property 2 of the
cross product.
To get the height of the parallelepiped, we drop a perpendicular from the head of a to the line that passes through the tail of a
and is perpendicular to the base of the parallelepiped. In other words, from the head of a to the line that contains both the head
and the tail of b × c. So the height of the parallelepiped is |a| |cos φ|, where φ is the angle between a and b × c. (The absolute
values have been included because if the angle between b × c and a happens to be greater than 90°, the cos φ produced by taking
the dot product of a and (b × c) will be negative.)

All together

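\[
\begin{aligned}
\text{volume of parallelepiped} &= (\text{area of base})\,(\text{height}) \\
&= |\mathbf{b}\times\mathbf{c}|\,|\mathbf{a}|\,|\cos\varphi| \\
&= \big|\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c})\big| \\
&= \big|a_1(\mathbf{b}\times\mathbf{c})_1 + a_2(\mathbf{b}\times\mathbf{c})_2 + a_3(\mathbf{b}\times\mathbf{c})_3\big| \\
&= \left|\,a_1\det\begin{bmatrix} b_2 & b_3 \\ c_2 & c_3 \end{bmatrix}
 - a_2\det\begin{bmatrix} b_1 & b_3 \\ c_1 & c_3 \end{bmatrix}
 + a_3\det\begin{bmatrix} b_1 & b_2 \\ c_1 & c_2 \end{bmatrix}\right| \\
&= \left|\,\det\begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix}\right|
\end{aligned}
\]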

 Example 1.2.26

As a concrete example of the computation of the volume of a parallelepiped, we consider the parallelepiped with edges
a = ⟨0, 1, 2⟩

b = ⟨1, 1, 0⟩

c = ⟨0, 1, 0⟩

Here is a sketch.

The base of the parallelepiped is the parallelogram with sides b and c. It is the shaded parallelogram in the sketch above. As

\[
\begin{aligned}
\mathbf{b}\times\mathbf{c}
&= \det\begin{bmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ 1 & 1 & 0 \\ 0 & 1 & 0 \end{bmatrix} \\
&= \hat{\imath}\,\det\begin{bmatrix} 1 & 0 \\ 1 & 0 \end{bmatrix}
 - \hat{\jmath}\,\det\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}
 + \hat{k}\,\det\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \\
&= \hat{\imath}\,(1\times 0 - 0\times 1) - \hat{\jmath}\,(1\times 0 - 0\times 0) + \hat{k}\,(1\times 1 - 1\times 0) \\
&= \hat{k}
\end{aligned}
\]

We should not be surprised that b × c has direction k̂.

b × c has to be perpendicular to both b and c and
both b and c lie in the xy-plane,
so that b × c has to be perpendicular to the xy-plane,
so that b × c has to be parallel to the z-axis.
The area of the base, i.e. of the shaded parallelogram in the figure above, is

|b × c| = |k̂| = 1

and the volume of the parallelepiped is

|a ⋅ (b × c)| = |⟨0, 1, 2⟩ ⋅ ⟨0, 0, 1⟩| = 2
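The volume computation of Example 1.2.26 can be sketched in Python as the absolute value of a triple product. The function name `parallelepiped_volume` and the helpers are hypothetical, chosen here for illustration; the edge vectors are those of the example.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two 3-vectors, from the component definition."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def parallelepiped_volume(a, b, c):
    """Volume |a . (b x c)| of the parallelepiped with edges a, b, c."""
    return abs(dot(a, cross(b, c)))

a, b, c = (0, 1, 2), (1, 1, 0), (0, 1, 0)
print(cross(b, c))                     # prints: (0, 0, 1)
print(parallelepiped_volume(a, b, c))  # prints: 2
```

Because of the absolute value, the answer does not depend on the order in which the three edges are listed.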

(Optional) Some Vector Identities


Here are a few identities involving dot and cross products.

 Lemma 1.2.27
(a) a ⋅ (b × c) = (a × b) ⋅ c
(b) a × (b × c) = (c ⋅ a)b − (b ⋅ a)c
(c) a × (b × c) + b × (c × a) + c × (a × b) = 0

Proof of (a).
We proved this in Theorem 1.2.23, by evaluating the left and right hand sides, and observing that they are the same. Here is a
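second proof, in which we again write out both sides, but this time we express them in terms of determinants.

\[
\begin{aligned}
\mathbf{a}\cdot\mathbf{b}\times\mathbf{c}
&= \langle a_1, a_2, a_3\rangle\cdot\det\begin{bmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix} \\
&= a_1\det\begin{bmatrix} b_2 & b_3 \\ c_2 & c_3 \end{bmatrix}
 - a_2\det\begin{bmatrix} b_1 & b_3 \\ c_1 & c_3 \end{bmatrix}
 + a_3\det\begin{bmatrix} b_1 & b_2 \\ c_1 & c_2 \end{bmatrix} \\
&= \det\begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix} \\
\mathbf{a}\times\mathbf{b}\cdot\mathbf{c}
&= \det\begin{bmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{bmatrix}\cdot\langle c_1, c_2, c_3\rangle \\
&= c_1\det\begin{bmatrix} a_2 & a_3 \\ b_2 & b_3 \end{bmatrix}
 - c_2\det\begin{bmatrix} a_1 & a_3 \\ b_1 & b_3 \end{bmatrix}
 + c_3\det\begin{bmatrix} a_1 & a_2 \\ b_1 & b_2 \end{bmatrix} \\
&= \det\begin{bmatrix} c_1 & c_2 & c_3 \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{bmatrix}
\end{aligned}
\]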

Exchanging two rows in a determinant changes the sign of the determinant. Moving the top row of a 3 ×3 determinant to the
bottom row requires two exchanges of rows. So the two 3 × 3 determinants are equal.

Proof of (b).
The proof is not exceptionally difficult — just write out both sides and grind. Substituting in
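\[ \mathbf{b}\times\mathbf{c} = (b_2c_3 - b_3c_2)\,\hat{\imath} - (b_1c_3 - b_3c_1)\,\hat{\jmath} + (b_1c_2 - b_2c_1)\,\hat{k} \]

gives, for the left hand side,

\[
\begin{aligned}
\mathbf{a}\times(\mathbf{b}\times\mathbf{c})
&= \det\begin{bmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ a_1 & a_2 & a_3 \\ b_2c_3 - b_3c_2 & -b_1c_3 + b_3c_1 & b_1c_2 - b_2c_1 \end{bmatrix} \\
&= \hat{\imath}\,\big[a_2(b_1c_2 - b_2c_1) - a_3(-b_1c_3 + b_3c_1)\big] \\
&\quad - \hat{\jmath}\,\big[a_1(b_1c_2 - b_2c_1) - a_3(b_2c_3 - b_3c_2)\big] \\
&\quad + \hat{k}\,\big[a_1(-b_1c_3 + b_3c_1) - a_2(b_2c_3 - b_3c_2)\big]
\end{aligned}
\]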

On the other hand, the right hand side

\[
\begin{aligned}
(\mathbf{a}\cdot\mathbf{c})\mathbf{b} - (\mathbf{a}\cdot\mathbf{b})\mathbf{c}
&= (a_1c_1 + a_2c_2 + a_3c_3)(b_1\hat{\imath} + b_2\hat{\jmath} + b_3\hat{k}) \\
&\quad - (a_1b_1 + a_2b_2 + a_3b_3)(c_1\hat{\imath} + c_2\hat{\jmath} + c_3\hat{k}) \\
&= \hat{\imath}\,\big[a_1b_1c_1 + a_2b_1c_2 + a_3b_1c_3 - a_1b_1c_1 - a_2b_2c_1 - a_3b_3c_1\big] \\
&\quad + \hat{\jmath}\,\big[a_1b_2c_1 + a_2b_2c_2 + a_3b_2c_3 - a_1b_1c_2 - a_2b_2c_2 - a_3b_3c_2\big] \\
&\quad + \hat{k}\,\big[a_1b_3c_1 + a_2b_3c_2 + a_3b_3c_3 - a_1b_1c_3 - a_2b_2c_3 - a_3b_3c_3\big] \\
&= \hat{\imath}\,\big[a_2b_1c_2 + a_3b_1c_3 - a_2b_2c_1 - a_3b_3c_1\big] \\
&\quad + \hat{\jmath}\,\big[a_1b_2c_1 + a_3b_2c_3 - a_1b_1c_2 - a_3b_3c_2\big] \\
&\quad + \hat{k}\,\big[a_1b_3c_1 + a_2b_3c_2 - a_1b_1c_3 - a_2b_2c_3\big]
\end{aligned}
\]

The last formula that we had for the left hand side is the same as the last formula we had for the right hand side. Oof! This is a
little tedious to do by hand. But any computer algebra system will do it for you in a flash.
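Even without a computer algebra system, the three identities of the lemma are easy to spot-check numerically. The sketch below is our own illustration, not part of the text's proof: it tests the identities on random integer vectors, which gives evidence rather than a proof. The helper names (`dot`, `cross`, `add`, `scale`, `check_identities`) are hypothetical.

```python
import random

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def add(*vectors):
    return tuple(sum(components) for components in zip(*vectors))

def scale(s, u):
    return tuple(s * x for x in u)

def random_vector(rng):
    return tuple(rng.randint(-9, 9) for _ in range(3))

def check_identities(trials, seed=0):
    """Return True if the three identities hold on `trials` random integer triples."""
    rng = random.Random(seed)
    for _ in range(trials):
        a, b, c = random_vector(rng), random_vector(rng), random_vector(rng)
        # a . (b x c) = (a x b) . c
        if dot(a, cross(b, c)) != dot(cross(a, b), c):
            return False
        # a x (b x c) = (c . a) b - (b . a) c
        if cross(a, cross(b, c)) != add(scale(dot(c, a), b), scale(-dot(b, a), c)):
            return False
        # a x (b x c) + b x (c x a) + c x (a x b) = 0
        total = add(cross(a, cross(b, c)), cross(b, cross(c, a)), cross(c, cross(a, b)))
        if total != (0, 0, 0):
            return False
    return True

print(check_identities(1000))
```

Integer components are used so that every comparison is exact; with floating point inputs the comparisons would need a tolerance.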

Proof of (c).
We just apply part (b) three times
a × (b × c) + b × (c × a) + c × (a × b)

= (c ⋅ a)b − (b ⋅ a)c + (a ⋅ b)c − (c ⋅ b)a + (b ⋅ c)a − (a ⋅ c)b

=0

(Optional) Application of Cross Products to Rotational Motion


In most computations involving rotational motion, the cross product shows up in one form or another. This is one of the main applications of the cross product. Consider, for example, a rigid body which is rotating at a constant rate of Ω radians per second about an axis whose direction is given by the unit vector \(\hat{a}\). Let P be any point on the body. Let's figure out its velocity. Pick any point on the axis of rotation and designate it as the origin of our coordinate system. Denote by r the vector from the origin to the point P. Let θ denote the angle between \(\hat{a}\) and r. As time progresses the point P sweeps out a circle of radius R = |r| sin θ.

In one second P travels along an arc that subtends an angle of Ω radians, which is the fraction \(\frac{\Omega}{2\pi}\) of a full circle. The length of this arc is \(\frac{\Omega}{2\pi}\times 2\pi R = \Omega R = \Omega|\mathbf{r}|\sin\theta\), so P travels the distance Ω|r| sin θ in one second and its speed, which is also the length of its velocity vector, is Ω|r| sin θ.


Now we just need to figure out the direction of the velocity vector. That is, the direction of motion of the point P. Imagine that both \(\hat{a}\) and r lie in the plane of a piece of paper, as in the figure above. Then v points either straight into or straight out of the page and consequently is perpendicular to both \(\hat{a}\) and r. To distinguish between the “into the page” and “out of the page” cases, let's impose the conventions that Ω > 0 and the axis of rotation \(\hat{a}\) is chosen to obey the right hand rule, meaning that if the thumb of your right hand is pointing in the direction \(\hat{a}\), then your fingers are pointing in the direction of motion of the rigid body. Under these conventions, the velocity vector v obeys
\(|\mathbf{v}| = \Omega\,|\mathbf{r}|\,|\hat{a}|\sin\theta\)
\(\mathbf{v}\perp\hat{a},\ \mathbf{r}\)
\((\hat{a}, \mathbf{r}, \mathbf{v})\) obey the right hand rule
That is, v is exactly \(\Omega\,\hat{a}\times\mathbf{r}\). It is conventional to define the “angular velocity” of a rigid body to be the vector \(\boldsymbol{\Omega} = \Omega\,\hat{a}\). That is, the vector with length given by the rate of rotation and direction given by the axis of rotation of the rigid body. In particular, the bigger the rate of rotation, the longer the angular velocity vector. In terms of this angular velocity vector, the velocity of the point P is

\[ \mathbf{v} = \boldsymbol{\Omega}\times\mathbf{r} \]
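As a small sanity check of this formula, the sketch below computes the velocity as the cross product of the angular velocity with r for one made-up configuration and compares its length with the speed Ω|r| sin θ derived earlier. The rotation rate, axis and position are hypothetical values chosen only for illustration.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    return math.sqrt(dot(u, u))

rate = 2.0                 # rotation rate in radians per second (hypothetical)
axis = (0.0, 0.0, 1.0)     # unit vector along the axis of rotation (hypothetical)
r = (3.0, 0.0, 4.0)        # position of P relative to a point on the axis (hypothetical)

omega = tuple(rate * x for x in axis)   # angular velocity vector
v = cross(omega, r)                     # velocity of P

# Speed from the geometric argument: rate * |r| * sin(theta)
cos_theta = dot(axis, r) / norm(r)
sin_theta = math.sqrt(1.0 - cos_theta ** 2)
speed_from_formula = rate * norm(r) * sin_theta

print(v, norm(v), speed_from_formula)
```

Here v is perpendicular to both the axis and r, as the argument in the text requires, and the two speeds agree up to rounding.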

(Optional) Application of Cross Products to Rotating Reference Frames


Imagine a moving particle that is being tracked by two observers.
1. One observer is fixed (out in space) and measures the position of the particle to be (X(t), Y (t), Z(t)).
2. The other observer is tied to a merry-go-round (the Earth) and measures the position of the particle to be (x(t), y(t), z(t)).
The merry-go-round is sketched in the figure on the left below. It is rotating about the Z-axis at a (constant) rate of Ω radians per second. The vector \(\boldsymbol{\Omega} = \Omega\,\hat{k}\), whose length is the rate of rotation and whose direction is the axis of rotation, is called the angular velocity.

The x- and y -axes of the moving observer are painted in red on the merry-go-round. The figure on the right above shows a top
view of the merry-go-round. The x- and y -axes of the moving observer are again red. The X- and Y -axes of the fixed observer are
blue. We are assuming that at time 0, the x-axis of the moving observer and the X-axis of the fixed observer coincide. As the
merry-go-round is rotating at Ω radians per second, the angle between the X-axis and x-axis after t seconds is Ωt.
As an example, suppose that the moving particle is tied to the tip of the moving observer's unit x vector. Then
x(t) = 1 y(t) = 0 z(t) = 0

X(t) = cos(Ωt) Y (t) = sin(Ωt) Z(t) = 0

or, if we write r(t) = (x(t), y(t), z(t)) and R(t) = (X(t), Y (t), Z(t)), then

r(t) = (1 , 0 , 0) R(t) = ( cos(Ωt) , sin(Ωt) , 0)

In general, denote by \(\hat{\imath}(t)\) the coordinates of the unit x-vector of the moving observer at time t, as measured by the fixed observer. Similarly \(\hat{\jmath}(t)\) for the unit y-vector, and \(\hat{k}(t)\) for the unit z-vector. As the merry-go-round is rotating about the Z-axis at a rate of Ω radians per second, the angle between the X-axis and x-axis after t seconds is Ωt, and

\[
\begin{aligned}
\hat{\imath}(t) &= \big(\cos(\Omega t)\,,\ \sin(\Omega t)\,,\ 0\big) \\
\hat{\jmath}(t) &= \big(-\sin(\Omega t)\,,\ \cos(\Omega t)\,,\ 0\big) \\
\hat{k}(t) &= (0\,,\ 0\,,\ 1)
\end{aligned}
\]

The position of the moving particle, as seen by the fixed observer is


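\[ \mathbf{R}(t) = x(t)\,\hat{\imath}(t) + y(t)\,\hat{\jmath}(t) + z(t)\,\hat{k}(t) \]

Differentiating, the velocity of the moving particle, as measured by the fixed observer is

\[
\begin{aligned}
\mathbf{V}(t) = \frac{d\mathbf{R}}{dt}
&= \frac{dx}{dt}(t)\,\hat{\imath}(t) + \frac{dy}{dt}(t)\,\hat{\jmath}(t) + \frac{dz}{dt}(t)\,\hat{k}(t) \\
&\quad + x(t)\,\frac{d}{dt}\hat{\imath}(t) + y(t)\,\frac{d}{dt}\hat{\jmath}(t) + z(t)\,\frac{d}{dt}\hat{k}(t)
\end{aligned}
\]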

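We saw, in the last (optional) §1.2.7, that

\[
\frac{d}{dt}\hat{\imath}(t) = \boldsymbol{\Omega}\times\hat{\imath}(t)
\qquad
\frac{d}{dt}\hat{\jmath}(t) = \boldsymbol{\Omega}\times\hat{\jmath}(t)
\qquad
\frac{d}{dt}\hat{k}(t) = \boldsymbol{\Omega}\times\hat{k}(t)
\]

(You could also verify that these are correct by putting in \(\boldsymbol{\Omega} = (0, 0, \Omega)\) and explicitly computing the cross products.) So

\[
\begin{aligned}
\mathbf{V}(t)
&= \Big(\frac{dx}{dt}(t)\,\hat{\imath}(t) + \frac{dy}{dt}(t)\,\hat{\jmath}(t) + \frac{dz}{dt}(t)\,\hat{k}(t)\Big) \\
&\quad + \boldsymbol{\Omega}\times\Big(x(t)\,\hat{\imath}(t) + y(t)\,\hat{\jmath}(t) + z(t)\,\hat{k}(t)\Big)
\end{aligned}
\]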

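Differentiating a second time, the acceleration of the moving particle (which is also \(\frac{\mathbf{F}}{m}\), where \(\mathbf{F}\) is the net force being applied to the particle and m is the mass of the particle) as measured by the fixed observer is

\[
\begin{aligned}
\frac{\mathbf{F}}{m} = \mathbf{A}(t)
&= \Big(\frac{d^2x}{dt^2}(t)\,\hat{\imath}(t) + \frac{d^2y}{dt^2}(t)\,\hat{\jmath}(t) + \frac{d^2z}{dt^2}(t)\,\hat{k}(t)\Big) \\
&\quad + 2\boldsymbol{\Omega}\times\Big(\frac{dx}{dt}(t)\,\hat{\imath}(t) + \frac{dy}{dt}(t)\,\hat{\jmath}(t) + \frac{dz}{dt}(t)\,\hat{k}(t)\Big) \\
&\quad + \boldsymbol{\Omega}\times\Big(\boldsymbol{\Omega}\times\big[x(t)\,\hat{\imath}(t) + y(t)\,\hat{\jmath}(t) + z(t)\,\hat{k}(t)\big]\Big)
\end{aligned}
\]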

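Recall that the angular velocity \(\boldsymbol{\Omega} = (0, 0, \Omega)\) does not depend on time. The rotating observer sees \(\hat{\imath}(t)\) as \(\hat{\imath} = (1, 0, 0)\), sees \(\hat{\jmath}(t)\) as \(\hat{\jmath} = (0, 1, 0)\), and sees \(\hat{k}(t)\) as \(\hat{k} = (0, 0, 1)\) and so sees

\[ \frac{\mathbf{F}}{m} = \mathbf{a}(t) + 2\boldsymbol{\Omega}\times\mathbf{v}(t) + \boldsymbol{\Omega}\times[\boldsymbol{\Omega}\times\mathbf{r}(t)] \]

where, as usual,

\[
\begin{aligned}
\mathbf{v}(t) &= \frac{d}{dt}\mathbf{r}(t) = \Big(\frac{dx}{dt}(t)\,,\ \frac{dy}{dt}(t)\,,\ \frac{dz}{dt}(t)\Big) \\
\mathbf{a}(t) &= \frac{d^2}{dt^2}\mathbf{r}(t) = \Big(\frac{d^2x}{dt^2}(t)\,,\ \frac{d^2y}{dt^2}(t)\,,\ \frac{d^2z}{dt^2}(t)\Big)
\end{aligned}
\]

So the acceleration of the particle seen by the moving observer is

\[ \mathbf{a}(t) = \frac{\mathbf{F}}{m} - 2\boldsymbol{\Omega}\times\mathbf{v}(t) - \boldsymbol{\Omega}\times[\boldsymbol{\Omega}\times\mathbf{r}(t)] \]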
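The fixed-observer acceleration formula can be checked numerically. The sketch below is an illustration under assumptions of our own: the rotation rate `OMEGA` and the motion `r(t)` seen by the rotating observer are made up, and all helper names are hypothetical. It compares the formula with a central second difference of R(t).

```python
import math

OMEGA = 0.7  # rotation rate in radians per second (hypothetical value)

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def add(*vectors):
    return tuple(sum(components) for components in zip(*vectors))

def scale(s, u):
    return tuple(s * x for x in u)

def to_fixed(w, t):
    """Return w[0]*i(t) + w[1]*j(t) + w[2]*k(t) in the fixed observer's coordinates."""
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    i_hat, j_hat, k_hat = (c, s, 0.0), (-s, c, 0.0), (0.0, 0.0, 1.0)
    return add(scale(w[0], i_hat), scale(w[1], j_hat), scale(w[2], k_hat))

# A made-up motion as seen by the rotating observer, with its exact derivatives.
def r(t): return (t * t, math.sin(t), 3.0 * t)
def v(t): return (2.0 * t, math.cos(t), 3.0)
def a(t): return (2.0, -math.sin(t), 0.0)

def R(t):
    """Position as seen by the fixed observer."""
    return to_fixed(r(t), t)

def acceleration_from_formula(t):
    omega = (0.0, 0.0, OMEGA)
    return add(to_fixed(a(t), t),
               scale(2.0, cross(omega, to_fixed(v(t), t))),
               cross(omega, cross(omega, R(t))))

def acceleration_numeric(t, h=1e-3):
    """Central second difference of R(t), an approximation to R''(t)."""
    return tuple((p - 2.0 * q + m) / (h * h)
                 for p, q, m in zip(R(t + h), R(t), R(t - h)))

t0 = 1.3
gap = max(abs(x - y)
          for x, y in zip(acceleration_from_formula(t0), acceleration_numeric(t0)))
print(gap)   # small, limited by the finite-difference error
```

The three terms added in `acceleration_from_formula` correspond, in order, to the three lines of the formula for A(t).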

Here
F is the sum of all external forces acting on the moving particle,
Fcor = −2Ω × v(t) is called the Coriolis force and
−Ω × [Ω × r(t)] is called the centrifugal force.

As an example, suppose that you are the moving particle and that you are at the edge of the merry-go-round. Let's say t = 0 and you are at \(\hat{\imath}\). Then F is the friction that the surface of the merry-go-round applies to the soles of your shoes. If you are just standing there, v(t) = 0, so that \(\mathbf{F}_{\rm cor} = 0\), and the friction F exactly cancels the centrifugal force \(-\boldsymbol{\Omega}\times[\boldsymbol{\Omega}\times\mathbf{r}(t)]\) so that you remain at \(\hat{\imath}(t)\). Assume that Ω > 0. Now suppose that you start walking around the edge of the merry-go-round. Then, at t = 0, \(\mathbf{r} = \hat{\imath}\) and

if you walk in the direction of rotation (with speed one), as in the figure on the left below, \(\mathbf{v} = \hat{\jmath}\) and the Coriolis force \(\mathbf{F}_{\rm cor} = -2\Omega\,\hat{k}\times\hat{\jmath} = 2\Omega\,\hat{\imath}\) tries to push you off of the merry-go-round, while
if you walk opposite to the direction of rotation (with speed one), as in the figure on the right below, \(\mathbf{v} = -\hat{\jmath}\) so that the Coriolis force \(\mathbf{F}_{\rm cor} = -2\Omega\,\hat{k}\times(-\hat{\jmath}) = -2\Omega\,\hat{\imath}\) tries to pull you into the centre of the merry-go-round.
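The two walking cases can be reproduced directly from the definitions of the Coriolis and centrifugal terms. This is a minimal sketch assuming a rotation rate of 1 (a hypothetical value; any positive rate gives the same directions), with helper names of our own.

```python
def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def coriolis(omega, v):
    """The Coriolis term -2 * (omega x v)."""
    return tuple(-2 * x for x in cross(omega, v))

def centrifugal(omega, r):
    """The centrifugal term -(omega x (omega x r))."""
    return tuple(-x for x in cross(omega, cross(omega, r)))

rate = 1                    # rotation rate, assumed positive (hypothetical value)
omega = (0, 0, rate)        # angular velocity along the z-axis
r = (1, 0, 0)               # standing at the unit x-vector at t = 0

push_out = coriolis(omega, (0, 1, 0))    # walking with the rotation
pull_in = coriolis(omega, (0, -1, 0))    # walking against the rotation
outward = centrifugal(omega, r)

print(push_out, pull_in, outward)   # (2, 0, 0) (-2, 0, 0) (1, 0, 0)
```

Both the centrifugal term and the first Coriolis term point along the positive x-direction, that is, away from the centre, while the second Coriolis term points toward it.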

On a rotating ball, such as the Earth, the Coriolis force deflects wind to the right (counterclockwise) in the northern hemisphere and
to the left (clockwise) in the southern hemisphere. In particular, hurricanes/cyclones/typhoons rotate counterclockwise in the
northern hemisphere and clockwise in the southern hemisphere. On the other hand, when it comes to water draining out of, for
example, a toilet, Coriolis force effects are dominated by other factors like asymmetry of the toilet.

Exercises
Stage 1

 1.

Let a = ⟨2, 0⟩ and b = ⟨1, 1⟩ . Evaluate and sketch a + b,  a + 2b and 2a − b.

 2.
Determine whether or not the given points are collinear (that is, lie on a common straight line)
1. (1, 2, 3),  (0, 3, 7),  (3, 5, 11)
2. (0, 3, −5),  (1, 2, −2),  (3, 0, 4)

 3.

Determine whether the given pair of vectors is perpendicular


1. ⟨1, 3, 2⟩ ,   ⟨2, −2, 2⟩
2. ⟨−3, 1, 7⟩ ,   ⟨2, −1, 1⟩
3. ⟨2, 1, 1⟩ ,   ⟨−1, 4, 2⟩

 4.

Consider the vector a = ⟨3, 4⟩ .


1. Find a unit vector in the same direction as a.
2. Find all unit vectors that are parallel to a.
3. Find all vectors that are parallel to a and have length 10.
4. Find all unit vectors that are perpendicular to a.

 5.

Consider the vector b = ⟨3, 4, 0⟩ .


1. Find a unit vector in the same direction as b.
2. Find all unit vectors that are parallel to b.
3. Find four different unit vectors that are perpendicular to b.

 6.

Let a = ⟨a1, a2⟩. Compute the projection of a on \(\hat{\imath}\) and \(\hat{\jmath}\).

 7.

Does the triangle with vertices (1, 2, 3), (4, 0, 5) and (3, 6, 4) have a right angle?

 8.
Show that the area of the parallelogram determined by the vectors a and b is |a × b|.

 9.

Show that the volume of the parallelepiped determined by the vectors a,  b and c is

|a ⋅ (b × c)|

 10.
Verify by direct computation that

1. \(\hat{\imath}\times\hat{\jmath} = \hat{k}\), \(\hat{\jmath}\times\hat{k} = \hat{\imath}\), \(\hat{k}\times\hat{\imath} = \hat{\jmath}\)

2. a ⋅ (a × b) = b ⋅ (a × b) = 0

 11.

Consider the following statement: “If a ≠ 0 and if a ⋅ b = a ⋅ c then b = c.” If the statement is true, prove it. If the statement
is false, give a counterexample.

 12.

Consider the following statement: “The vector a × (b × c) is of the form αb + βc for some real numbers α and β.” If the statement is true, prove it. If the statement is false, give a counterexample.

 13.

What geometric conclusions can you draw from a ⋅ (b × c) = ⟨1, 2, 3⟩ ?

 14.

What geometric conclusions can you draw from a ⋅ (b × c) = 0?

 15.

Consider the three points O = (0, 0), A = (a, 0) and B = (b, c).
1. Sketch, in a single figure,
the triangle with vertices O, A and B, and
the circumscribing circle for the triangle (i.e. the circle that goes through all three vertices), and
the vectors
\(\overrightarrow{OA}\), from O to A,
\(\overrightarrow{OB}\), from O to B,
\(\overrightarrow{OC}\), from O to C, where C is the centre of the circumscribing circle.
Then add to the sketch and evaluate, from the sketch,
the projection of the vector \(\overrightarrow{OC}\) on the vector \(\overrightarrow{OA}\), and
the projection of the vector \(\overrightarrow{OC}\) on the vector \(\overrightarrow{OB}\).
2. Determine C.
3. Evaluate, using the formula 1.2.14,
the projection of the vector \(\overrightarrow{OC}\) on the vector \(\overrightarrow{OA}\), and
the projection of the vector \(\overrightarrow{OC}\) on the vector \(\overrightarrow{OB}\).

Stage 2

 16.

Find the equation of a sphere if one of its diameters has end points (2, 1, 4) and (4, 3, 10).

 17.

Use vectors to prove that the line joining the midpoints of two sides of a triangle is parallel to the third side and half its length.

 18.

Compute the areas of the parallelograms determined by the following vectors.


1. ⟨−3, 1⟩ ,   ⟨4, 3⟩
2. ⟨4, 2⟩ ,   ⟨6, 8⟩

 19 ✳.

Consider the plane W , defined by:


W   :   − x + 3y + 3z = 6,

Find the area of the parallelogram on W defined by 0 ≤ x ≤ 3, 0 ≤ y ≤ 2.

 20.

Compute the volumes of the parallelepipeds determined by the following vectors.


1. ⟨4, 1, −1⟩ ,   ⟨−1, 5, 2⟩ ,   ⟨1, 1, 6⟩
2. ⟨−2, 1, 2⟩ ,   ⟨3, 1, 2⟩ ,   ⟨0, 2, 5⟩

 21.

Compute the dot product of the vectors a and b. Find the angle between them.
1. a = ⟨1, 2⟩ ,  b = ⟨−2, 3⟩
2. a = ⟨−1, 1⟩ ,  b = ⟨1, 1⟩
3. a = ⟨1, 1⟩ ,  b = ⟨2, 2⟩
4. a = ⟨1, 2, 1⟩ ,  b = ⟨−1, 1, 1⟩
5. a = ⟨−1, 2, 3⟩ ,  b = ⟨3, 0, 1⟩

 22.

Determine the angle between the vectors a and b if


1. a = ⟨1, 2⟩ ,  b = ⟨3, 4⟩
2. a = ⟨2, 1, 4⟩ ,  b = ⟨4, −2, 1⟩
3. a = ⟨1, −2, 1⟩ ,  b = ⟨3, 1, 0⟩

 23.

Determine all values of y for which the given vectors are perpendicular.
1. ⟨2, 4⟩ ,   ⟨2, y⟩
2. ⟨4, −1⟩ , ⟨y, y²⟩
3. ⟨3, 1, 1⟩ , ⟨2, 5y, y²⟩

 24.

Let u = −2 ^ı + 5^ȷ and v = α ^ı − 2^ȷ . Find α so that


1. u ⊥ v
2. u∥v
3. The angle between u and v is 60°.

 25.

Define a = ⟨1, 2, 3⟩ and b = ⟨4, 10, 6⟩ .


1. Find the component of b in the direction a.
2. Find the projection of b on a.
3. Find the projection of b perpendicular to a.

 26.

Compute ⟨1, 2, 3⟩ × ⟨4, 5, 6⟩ .

 27.

Calculate the following cross products.


1. ⟨1, −5, 2⟩ × ⟨−2, 1, 5⟩
2. ⟨2, −3, −5⟩ × ⟨4, −2, 7⟩
3. ⟨−1, 0, 1⟩ × ⟨0, 4, 5⟩

 28.

Let p = ⟨−1, 4, 2⟩ ,  q = ⟨3, 1, −1⟩ ,  r = ⟨2, −3, −1⟩ . Check, by direct computation, that
1. p × p = 0
2. p × q = −q × p
3. p × (3r) = 3(p × r)
4. p × (q + r) = p × q + p × r
5. p × (q × r) ≠ (p × q) × r

 29.

Calculate the area of the triangle with vertices (0, 0, 0), (1, 2, 3) and (3, 2, 1).

 30 ✳.

A particle P of unit mass whose position in space at time t is r(t) has angular momentum L(t) = r(t) × r′(t). If r″(t) = ρ(t)r(t) for a scalar function ρ, show that L is constant, i.e. does not change with time. Here ′ denotes d/dt.

Stage 3

 31.

Show that the diagonals of a parallelogram bisect each other.

 32.

Consider a cube such that each side has length s. Name, in order, the four vertices on the bottom of the cube A, B, C, D and the corresponding four vertices on the top of the cube A′, B′, C′, D′.

1. Show that all edges of the tetrahedron A′C′BD have the same length.

2. Let E be the center of the cube. Find the angle between EA and EC.

 33.

Find the angle between the diagonal of a cube and the diagonal of one of its faces.

 34.

Consider a skier who is sliding without friction on the hill y = h(x) in a two dimensional world. The skier is subject to two
forces. One is gravity. The other acts perpendicularly to the hill. The second force automatically adjusts its magnitude so as to
prevent the skier from burrowing into the hill. Suppose that the skier became airborne at some (x0, y0) with y0 = h(x0). How fast was the skier going?

 35.

A marble is placed on the plane ax + by + cz = d. The coordinate system has been chosen so that the positive z -axis points
straight up. The coefficient c is nonzero and the coefficients a and b are not both zero. In which direction does the marble roll?
Why were the conditions “c ≠ 0 ” and “a, b not both zero” imposed?

 36.
Show that a ⋅ (b × c) = (a × b) ⋅ c.

 37.

Show that a × (b × c) = (a ⋅ c)b − (a ⋅ b)c.

 38.

Derive a formula for (a × b) ⋅ (c × d) that involves dot but not cross products.

 39.

A prism has the six vertices



A = (1, 0, 0)    A′ = (5, 0, 1)

B = (0, 3, 0)    B′ = (4, 3, 1)

C = (0, 0, 4)    C′ = (4, 0, 5)

1. Verify that three of the faces are parallelograms. Are they rectangular?
2. Find the length of AA′.

3. Find the area of the triangle ABC .


4. Find the volume of the prism.

 40.

(Three dimensional Pythagorean Theorem) A solid body in space with exactly four vertices is called a tetrahedron. Let A, B,
C and D be the areas of the four faces of a tetrahedron. Suppose that the three edges meeting at the vertex opposite the face of

area D are perpendicular to each other. Show that D² = A² + B² + C².

 41.

(Three dimensional law of cosines) Let A, B, C and D be the areas of the four faces of a tetrahedron. Let α be the angle
between the faces with areas B and C , β be the angle between the faces with areas A and C and γ be the angle between the
faces with areas A and B. (By definition, the angle between two faces is the angle between the normal vectors to the faces.)
Show that
D² = A² + B² + C² − 2BC cos α − 2AC cos β − 2AB cos γ

1. Some people use an underline, as in \(\underline{v}\), rather than an arrow.

2. Or, in the Wikipedia jargon, disambiguate.
3. OK. OK. Out in that (admittedly very small) part of the real world that actually knows what a vector is.
4. The notation ∥a∥ is also used for the length of a.
5. You may be used to seeing it written as c² = a² + b² − 2ab cos C, where a, b and c are the lengths of the three sides of the triangle and C is the angle opposite the side of length c.

6. The concepts of the dot product and perpendicularity have been generalized a lot in mathematics (for example, from 2d and 3d
vectors to functions). The generalization of the dot product is called the “inner product” and the generalization of
perpendicularity is called “orthogonality”.
7. The behaviour of air resistance (sometimes called drag) is pretty complicated. We're using a reasonable low speed
approximation. At high speeds drag is typically proportional to the square of the speed.
8. For a more comprehensive treatment of derivatives of vector valued functions r(t), and in particular of velocity and
acceleration, see Section 1.6 in this text and Section 1.1 in the CLP-4 text.
9. The topics of matrices and determinants appear prominently in linear algebra courses. We are only going to use them as
notation, and we will explicitly explain that notation. A linear algebra course is not a prerequisite for this text.
10. For a full derivation, see Example 1.2.25
11. That the cross product uses the right hand rule, rather than the left hand rule, is an example of the tyranny of the masses — only
roughly 10% of humans are left-handed.
12. This figure is a variant of
https://commons.wikimedia.org/wiki/File:Right_hand_rule_simple.png
13. Note that as you translate or rotate the coordinate system, the right hand rule is preserved. If \((\mathbf{a}, \mathbf{b}, \hat{n})\) obey the right hand rule so do their rotated and translated versions.

14. This is a simple integral calculus exercise.

This page titled 1.2: Vectors is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Joel Feldman, Andrew
Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is
available upon request.

1.3: Equations of Lines in 2d
A line in two dimensions can be specified by giving one point (x0, y0) on the line and one vector d = ⟨dx, dy⟩ whose direction is parallel to the line.

If (x, y) is any point on the line then the vector ⟨x − x0, y − y0⟩, whose tail is at (x0, y0) and whose head is at (x, y), must be parallel to d and hence must be a scalar multiple of d. So

 Equation 1.3.1. Parametric Equations

⟨x − x0 , y − y0 ⟩ = td

or, writing out in components,


x − x0 = tdx

y − y0 = tdy

These are called the parametric equations of the line, because they contain a free parameter, namely t. As t varies from −∞ to ∞, the point (x0 + t dx, y0 + t dy) traverses the entire line.

It is easy to eliminate the parameter t from the equations. Just multiply x − x0 = tdx by dy , multiply y − y0 = tdy by dx and
subtract to give

(x − x0 )dy − (y − y0 )dx = 0

In the event that dx and dy are both nonzero, we can rewrite this as

 Equation 1.3.2. Symmetric Equation


\[ \frac{x - x_0}{d_x} = \frac{y - y_0}{d_y} \]

This is called the symmetric equation for the line.
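To see that the parametric equations, the parameter-free equation and the symmetric equation all describe the same points, one can generate points from the parametric form and substitute them into the other two. The sketch below does this for a hypothetical line through (1, 2) with direction ⟨3, 2⟩; the function names are our own.

```python
from fractions import Fraction

def point_on_line(p0, d, t):
    """Point (x0 + t*dx, y0 + t*dy) of the parametric form."""
    return (p0[0] + t * d[0], p0[1] + t * d[1])

def eliminated_form(p0, d, p):
    """Left hand side of (x - x0)*dy - (y - y0)*dx = 0."""
    return (p[0] - p0[0]) * d[1] - (p[1] - p0[1]) * d[0]

def symmetric_sides(p0, d, p):
    """Both sides of the symmetric equation; requires dx and dy nonzero."""
    return Fraction(p[0] - p0[0], d[0]), Fraction(p[1] - p0[1], d[1])

p0, d = (1, 2), (3, 2)   # hypothetical point and direction vector

for t in (-2, 0, 1, 5):
    p = point_on_line(p0, d, t)
    left, right = symmetric_sides(p0, d, p)
    # Each side of the symmetric equation recovers the parameter t.
    print(t, p, eliminated_form(p0, d, p), left == right == t)
```

Exact rational arithmetic (`Fraction`) is used so that the equality tests are not affected by rounding.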


A second way to specify a line in two dimensions is to give one point (x0 , y0 ) on the line and one vector n = ⟨nx , ny ⟩ whose
direction is perpendicular to that of the line.

If (x, y) is any point on the line then the vector ⟨x − x0 , y − y0 ⟩ , whose tail is at (x0 , y0 ) and whose head is at (x, y), must be
perpendicular to n so that

 Equation 1.3.3

n ⋅ ⟨x − x0 , y − y0 ⟩ = 0

Writing out in components

nx (x − x0 ) + ny (y − y0 ) = 0 or nx x + ny y = nx x0 + ny y0

1.3.1 https://math.libretexts.org/@go/page/89205
Observe that the coefficients nx, ny of x and y in the equation of the line are the components of a vector ⟨nx, ny⟩ perpendicular to the line. This enables us to read off a vector perpendicular to any given line directly from the equation of the line. Such a vector is called a normal vector for the line.

 Example 1.3.4

Consider, for example, the line y = 3x + 7. To rewrite this equation in the form

nx x + ny y = nx x0 + ny y0

we have to move terms around so that x and y are on one side of the equation and 7 is on the other side: 3x − y = −7. Then nx is the coefficient of x, namely 3, and ny is the coefficient of y, namely −1. One normal vector for y = 3x + 7 is ⟨3, −1⟩.

Of course, if ⟨3, −1⟩ is perpendicular to y = 3x + 7, so is −5 ⟨3, −1⟩ = ⟨−15, 5⟩. In fact, if we first multiply the equation 3x − y = −7 by −5 to get −15x + 5y = 35 and then set nx and ny to the coefficients of x and y respectively, we get n = ⟨−15, 5⟩.

 Example 1.3.5

In this example, we find the point on the line y = 6 − 3x (call the line L) that is closest to the point (7, 5).
We'll start by sketching the line. To do so, we guess two points on L and then draw the line that passes through the two points.
If (x, y) is on L and x = 0, then y = 6. So (0, 6) is on L.
If (x, y) is on L and y = 0, then x = 2. So (2, 0) is on L.

Denote by P the point on L that is closest to (7, 5). It is characterized by the property that the line from (7, 5) to P is
perpendicular to L. This is the case just because if Q is any other point on L, then, by Pythagoras, the distance from (7, 5) to
Q is larger than the distance from (7, 5) to P . See the figure on the right above.

Let's use N to denote the line which passes through (7, 5) and which is perpendicular to L.

Since L has the equation 3x + y = 6, one vector perpendicular to L, and hence parallel to N , is ⟨3, 1⟩ . So if (x, y) is any
point on N , the vector ⟨x − 7, y − 5⟩ must be of the form t ⟨3, 1⟩ . So the parametric equations of N are

⟨x − 7, y − 5⟩ = t ⟨3, 1⟩ or x = 7 + 3t,  y = 5 + t

Now let (x, y) be the coordinates of P . Since P is on N , we have x = 7 + 3t, y = 5 +t for some t. Since P is also on L, we
also have 3x + y = 6. So
3(7 + 3t) + (5 + t) = 6

⟺ 10t + 26 = 6

⟺ t = −2

⟹ x = 7 + 3 × (−2) = 1,  y = 5 + (−2) = 3

and P is (1, 3).
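The steps of this example generalize to any line written as nx x + ny y = c: walk from the given point along the normal and solve for the parameter at which the line is hit. The following is a minimal sketch of that recipe; the function name `closest_point_on_line` is hypothetical, and integer inputs are assumed so that exact fractions can be used.

```python
from fractions import Fraction

def closest_point_on_line(n, c, q):
    """Closest point to q on the line n[0]*x + n[1]*y = c, and the parameter t.

    Points on the perpendicular through q have the form q + t*n; substituting
    into the equation of the line and solving gives t.
    """
    nx, ny = n
    t = Fraction(c - (nx * q[0] + ny * q[1]), nx * nx + ny * ny)
    return (q[0] + t * nx, q[1] + t * ny), t

# The line 3x + y = 6 and the point (7, 5) from the example.
P, t = closest_point_on_line((3, 1), 6, (7, 5))
print(P == (1, 3), t == -2)   # True True
```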

Exercises
Stage 1

 1

A line in ℝ² has direction d and passes through point c.

Which of the following gives its parametric equation: ⟨x, y⟩ = c + td, or ⟨x, y⟩ = c − td?

 2

A line in ℝ² has direction d and passes through point c.

Which of the following gives its parametric equation: ⟨x, y⟩ = c + td, or ⟨x, y⟩ = −c + td?

 3

Two points determine a line. Verify that the equations

⟨x − 1, y − 9⟩ = t ⟨8, 4⟩

and
⟨x − 9, y − 13⟩ = t ⟨1, 1/2⟩

describe the same line by finding two different points that lie on both lines.

 4

A line in ℝ² has parametric equations

x −3 = 9t

y −5 = 7t

There are many different ways to write the parametric equations of this line. If we rewrite the equations as
x − x0 = dx t

y − y0 = dy t

what are all possible values of ⟨x0, y0⟩ and ⟨dx, dy⟩?

Stage 2

 5

Find the vector parametric, scalar parametric and symmetric equations for the line containing the given point and with the
given direction.
1. point (1, 2), direction ⟨3, 2⟩
2. point (5, 4), direction ⟨2, −1⟩
3. point (−1, 3), direction ⟨−1, 2⟩

 6

Find the vector parametric, scalar parametric and symmetric equations for the line containing the given point and with the
given normal.
1. point (1, 2), normal ⟨3, 2⟩
2. point (5, 4), normal ⟨2, −1⟩

3. point (−1, 3), normal ⟨−1, 2⟩

 7

Use a projection to find the distance from the point (−2, 3) to the line 3x − 4y = −4.

 8
Let a, b and c be the vertices of a triangle. By definition, a median of a triangle is a straight line that passes through a vertex
of the triangle and through the midpoint of the opposite side.
1. Find the parametric equations of the three medians.
2. Do the three medians meet at a common point? If so, which point?

 9
Let C be the circle of radius 1 centred at (2, 1). Find an equation for the line tangent to C at the point (5/2, 1 + √3/2).

This page titled 1.3: Equations of Lines in 2d is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Joel
Feldman, Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a detailed
edit history is available upon request.

1.4: Equations of Planes in 3d
Specifying one point (x0, y0, z0) on a plane and a vector d parallel to the plane does not uniquely determine the plane, because it is free to rotate about d. On the other hand, giving one point on the plane and one vector n = ⟨nx, ny, nz⟩ with direction perpendicular to that of the plane does uniquely determine the plane. If (x, y, z) is any point on the plane then the vector ⟨x − x0, y − y0, z − z0⟩, whose tail is at (x0, y0, z0) and whose head is at (x, y, z), lies entirely inside the plane and so must be perpendicular to n. That is,

 Equation 1.4.1. The Equation of a Plane

n ⋅ ⟨x − x0 , y − y0 , z − z0 ⟩ = 0

Writing out in components

nx (x − x0 ) + ny (y − y0 ) + nz (z − z0 ) = 0 or nx x + ny y + nz z = d

where d = nx x0 + ny y0 + nz z0.

Again, the coefficients nx, ny, nz of x, y and z in the equation of the plane are the components of a vector ⟨nx, ny, nz⟩ perpendicular to the plane. The vector n is often called a normal vector for the plane. Any nonzero multiple of n will also be perpendicular to the plane and is also called a normal vector.

 Example 1.4.2
We have just seen that if we write the equation of a plane in the standard form

ax + by + cz = d

then it is easy to read off a normal vector for the plane. It is just ⟨a, b, c⟩ . So for example the planes

P : x + 2y + 3z = 4    P′ : 3x + 6y + 9z = 7

have normal vectors n = ⟨1, 2, 3⟩ and n′ = ⟨3, 6, 9⟩, respectively. Since n′ = 3n, the two normal vectors n and n′ are parallel to each other. This tells us that the planes P and P′ are parallel to each other.

When the normal vectors of two planes are perpendicular to each other, we say that the planes are perpendicular to each other. For example the planes

P : x + 2y + 3z = 4    P″ : 2x − y = 7

have normal vectors n = ⟨1, 2, 3⟩ and n″ = ⟨2, −1, 0⟩, respectively. Since

n ⋅ n″ = 1 × 2 + 2 × (−1) + 3 × 0 = 0

the normal vectors n and n″ are mutually perpendicular, so the corresponding planes P and P″ are perpendicular to each other.
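Both tests in this example reduce to arithmetic on the normal vectors. The sketch below is one way to automate them, with hypothetical helper names. It detects parallel normals with the cross product (which vanishes exactly when two vectors are parallel) rather than by finding the scale factor 3, so it is a variation on the text's argument, not a transcription of it.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def planes_parallel(n1, n2):
    """Planes are parallel when their (nonzero) normals are parallel."""
    return cross(n1, n2) == (0, 0, 0)

def planes_perpendicular(n1, n2):
    """Planes are perpendicular when their normals are perpendicular."""
    return dot(n1, n2) == 0

n = (1, 2, 3)          # normal of  x + 2y + 3z = 4
n_prime = (3, 6, 9)    # normal of 3x + 6y + 9z = 7
n_dprime = (2, -1, 0)  # normal of 2x - y = 7

print(planes_parallel(n, n_prime), planes_perpendicular(n, n_dprime))   # True True
```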

Here is an example that illustrates how one can sketch a plane, given the equation of the plane.

1.4.1 https://math.libretexts.org/@go/page/89206
 Example 1.4.3

In this example, we'll sketch the plane

P :  4x + 3y + 2z = 12

A good way to prepare for sketching a plane is to find the intersection points of the plane with the x-, y - and z -axes, just as
you are used to doing when sketching lines in the xy-plane. For example, any point on the x axis must be of the form (x, 0, 0).
For (x, 0, 0) to also be on P we need x = 12/4 = 3. So P intersects the x-axis at (3, 0, 0). Similarly, P intersects the y-axis at

(0, 4, 0) and the z -axis at (0, 0, 6). Now plot the points (3, 0, 0), (0, 4, 0) and (0, 0, 6). P is the plane containing these three

points. Often a visually effective way to sketch a surface in three dimensions is to


only sketch the part of the surface in the first octant. That is, the part with x ≥ 0, y ≥ 0 and z ≥ 0.
To do so, sketch the curve of intersection of the surface with the part of the xy-plane in the first octant and,
similarly, sketch the curve of intersection of the surface with the part of the xz-plane in the first octant and the curve of
intersection of the surface with the part of the yz-plane in the first octant.
That's what we'll do. The intersection of the plane P with the xy-plane is the straight line through the two points (3, 0, 0) and
(0, 4, 0). So the part of that intersection in the first octant is the line segment from (3, 0, 0) to (0, 4, 0). Similarly the part of the

intersection of P with the xz-plane that is in the first octant is the line segment from (3, 0, 0) to (0, 0, 6) and the part of the
intersection of P with the yz-plane that is in the first octant is the line segment from (0, 4, 0) to (0, 0, 6). So we just have to
sketch the three line segments joining the three axis intercepts (3, 0, 0), (0, 4, 0) and (0, 0, 6). That's it.
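The axis intercepts used in this sketching recipe come from setting two of the three coordinates to zero. A minimal sketch, assuming a plane ax + by + cz = d with integer coefficients a, b, c all nonzero (otherwise the plane is parallel to one of the axes and the corresponding intercept does not exist); the function name is hypothetical.

```python
from fractions import Fraction

def axis_intercepts(a, b, c, d):
    """Intercepts of a*x + b*y + c*z = d with the x-, y- and z-axes.

    Assumes a, b and c are all nonzero integers.
    """
    return ((Fraction(d, a), 0, 0),
            (0, Fraction(d, b), 0),
            (0, 0, Fraction(d, c)))

x_int, y_int, z_int = axis_intercepts(4, 3, 2, 12)
print(x_int == (3, 0, 0), y_int == (0, 4, 0), z_int == (0, 0, 6))   # True True True
```

The part of the plane in the first octant is then the triangle with these three points as vertices, provided all three intercepts are positive.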

Here are two examples that illustrate how one can find the distance between a point and a plane.

 Example 1.4.4

In this example, we'll compute the distance between the point

x = (1, −1, −3) and the plane P :  x + 2y + 3z = 18

By the “distance between x and the plane P ” we mean the shortest distance between x and any point y on P . In fact, we'll
evaluate the distance in two different ways. In the next Example 1.4.5, we'll use projection. In this example, our strategy for
finding the distance will be to
first observe that the vector n = ⟨1, 2, 3⟩ is normal to P and then
start walking away from x in the direction of the normal vector n and
keep walking until we hit P . Call the point on P where we hit, y. Then the desired distance is the distance between x and
y. From the figure below it does indeed look like distance between x and y is the shortest distance between x and any

point on P . This is in fact true, though we won't prove it.

So imagine that we start walking, and that we start at time t = 0 at x and walk in the direction n. Then at time t we might be
at

x + tn = (1, −1, −3) + t ⟨1, 2, 3⟩ = (1 + t, −1 + 2t, −3 + 3t)

We hit the plane P at exactly the time t for which (1 + t, −1 + 2t, −3 + 3t) satisfies the equation for P, which is
x + 2y + 3z = 18. So we are on P at the unique time t obeying

(1 + t) + 2(−1 + 2t) + 3(−3 + 3t) = 18 ⟺ 14t = 28 ⟺ t =2

So the point on P which is closest to x is

\[ \mathbf{y} = \big[\mathbf{x} + t\mathbf{n}\big]_{t=2} = (1 + t,\ -1 + 2t,\ -3 + 3t)\Big|_{t=2} = (3, 3, 3) \]

and the distance from x to P is the distance from x to y, which is

\[ |\mathbf{y} - \mathbf{x}| = 2|\mathbf{n}| = 2\sqrt{1^2 + 2^2 + 3^2} = 2\sqrt{14} \]
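The "walk along the normal until you hit the plane" strategy translates directly into code. The following sketch assumes the plane is written as n ⋅ (x, y, z) = d with integer data, so that the hitting time t can be computed exactly; the function name `foot_of_perpendicular` is our own.

```python
import math
from fractions import Fraction

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def foot_of_perpendicular(n, d, x):
    """Walk from x along n until n . p = d; return the time t and the point hit."""
    # Solve n . (x + t n) = d for t.
    t = Fraction(d - dot(n, x), dot(n, n))
    y = tuple(xi + t * ni for xi, ni in zip(x, n))
    return t, y

n, d, x = (1, 2, 3), 18, (1, -1, -3)
t, y = foot_of_perpendicular(n, d, x)
distance = abs(t) * math.sqrt(dot(n, n))   # |y - x| = |t| |n|

print(t == 2, y == (3, 3, 3), distance)
```

The absolute value of t is used because, for a point on the other side of the plane, the walk would have to go in the direction opposite to n.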

 Example 1.4.5. Example 1.4.4, revisited

We are again going to find the distance from the point

x = (1, −1, −3) to the plane P :  x + 2y + 3z = 18

But this time we will use the following strategy.


We'll first find any point z on P and then
we'll denote by y the point on P nearest x, and we'll denote by v the vector from x to z (see the figure below) and then
we'll realize, by looking at the figure, that the vector from x to y is exactly the projection² of the vector v on n so that the distance from x to P, i.e. the length of the vector from x to y, is exactly |proj_n v|.

Now let's find a point on P . The plane P is given by a single equation, namely

x + 2y + 3z = 18

in the three unknowns, x, y, z. The easiest way to find one solution to this equation is to assign two of the unknowns the value
zero and then solve for the third unknown. For example, if we set x = y = 0, then the equation reduces to 3z = 18. So we
may take z = (0, 0, 6).
Then v, the vector from x = (1, −1, −3) to z = (0, 0, 6) is ⟨0 − 1 , 0 − (−1) , 6 − (−3)⟩ = ⟨−1, 1, 9⟩ so that, by Equation
1.2.14,
proj_n v = (v ⋅ n / |n|²) n

= (⟨−1, 1, 9⟩ ⋅ ⟨1, 2, 3⟩ / |⟨1, 2, 3⟩|²) ⟨1, 2, 3⟩

= (28/14) ⟨1, 2, 3⟩

= 2 ⟨1, 2, 3⟩

and the distance from x to P is


|proj_n v| = |2 ⟨1, 2, 3⟩| = 2√14

just as we found in Example 1.4.4.
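
The projection strategy can be checked numerically. The sketch below is illustrative only: the function name `point_plane_distance` is an assumption, and it uses the fact that the length of proj_n v equals |v ⋅ n|/|n|. It also tries a second point on P to suggest that the answer does not depend on which point z we pick.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def point_plane_distance(x, n, z):
    """Length of the projection of v = z - x on n, where z is any point on the plane."""
    v = tuple(zi - xi for zi, xi in zip(z, x))
    return abs(dot(v, n)) / math.sqrt(dot(n, n))

x, n = (1, -1, -3), (1, 2, 3)
d1 = point_plane_distance(x, n, (0, 0, 6))    # the point z used in the example
d2 = point_plane_distance(x, n, (18, 0, 0))   # another point on x + 2y + 3z = 18
print(d1, d2, 2 * math.sqrt(14))
```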

In the next example, we find the distance between two planes.

 Example 1.4.6

Now we'll increase the degree of difficulty a tiny bit, and compute the distance between the planes

P : x + 2y + 2z = 1 and P′ : 2x + 4y + 4z = 11

By the “distance between the planes P and P′” we mean the shortest distance between any pair of points x and x′ with x in P and x′ in P′. First observe that the normal vectors

n = ⟨1, 2, 2⟩ and n′ = ⟨2, 4, 4⟩ = 2n

to P and P′ are parallel to each other. So the planes P and P′ are parallel to each other. If they had not been parallel, they would have crossed and the distance between them would have been zero.
Our strategy for finding the distance will be to
first find a point x on P and then, like we did in Example 1.4.4,
start walking away from P in the direction of the normal vector n and
keep walking until we hit P′. Call the point on P′ that we hit x′. Then the desired distance is the distance between x and x′. From the figure below it does indeed look like the distance between x and x′ is the shortest distance between any pair of points with one point on P and one point on P′. Again, this is in fact true, though we won't prove it.

We can find a point on P just as we did in Example 1.4.5. The plane P is given by the single equation

x + 2y + 2z = 1

in the three unknowns, x, y, z. We can find one solution to this equation by assigning two of the unknowns the value zero and
then solving for the third unknown. For example, if we set y = z = 0, then the equation reduces to x = 1. So we may take
x = (1, 0, 0).

Now imagine that we start walking, and that we start at time t = 0 at x and walk in the direction n. Then at time t we might
be at

x + tn = (1, 0, 0) + t ⟨1, 2, 2⟩ = (1 + t, 2t, 2t)

We hit the second plane P′ at exactly the time t for which (1 + t, 2t, 2t) satisfies the equation for P′, which is 2x + 4y + 4z = 11. So we are on P′ at the unique time t obeying

2(1 + t) + 4(2t) + 4(2t) = 11 ⟺ 18t = 9 ⟺ t = 1/2

So the point on P′ which is closest to x is

x′ = [x + tn]_{t=1/2} = (1 + t, 2t, 2t)|_{t=1/2} = (3/2, 1, 1)

and the distance from P to P′ is the distance from x to x′ which is

√((1 − 3/2)² + (0 − 1)² + (0 − 1)²) = √(9/4) = 3/2
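
As a cross-check, the same distance can be computed without picking a point. The following is a hedged sketch, not the method of the text: the helper `parallel_plane_distance` is hypothetical, and it assumes the two planes are n ⋅ x = c and n′ ⋅ x = c′ with n′ = kn, so the second plane can be rewritten as n ⋅ x = c′/k and the distance is |c′/k − c|/|n|.

```python
import math

def parallel_plane_distance(n, c, n2, c2):
    """Distance between the planes n . x = c and n2 . x = c2, assuming n2 = k*n."""
    # Read off k from the first nonzero component of n.
    k = next(b / a for a, b in zip(n, n2) if a != 0)
    if any(not math.isclose(b, k * a) for a, b in zip(n, n2)):
        raise ValueError("normal vectors are not parallel")
    norm_n = math.sqrt(sum(a * a for a in n))
    return abs(c2 / k - c) / norm_n

dist = parallel_plane_distance((1, 2, 2), 1, (2, 4, 4), 11)
print(dist)
```

For the two planes of this example the sketch agrees with the value 3/2 found by walking along the normal.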

Now we'll find the angle between two intersecting planes.

 Example 1.4.7

The orientation (i.e. direction) of a plane is determined by its normal vector. So, by definition, the angle between two planes is
the angle between their normal vectors. For example, the normal vectors of the two planes
P1 : 2x + y − z = 3

P2 : x +y +z = 4

are
n1 = ⟨2, 1, −1⟩

n2 = ⟨1, 1, 1⟩

If we use θ to denote the angle between n1 and n2, then

cos θ = n1 ⋅ n2 / (|n1| |n2|)

= ⟨2, 1, −1⟩ ⋅ ⟨1, 1, 1⟩ / (|⟨2, 1, −1⟩| |⟨1, 1, 1⟩|)

= 2 / (√6 √3)

so that

θ = arccos(2/√18) = 1.0799

to four decimal places. That's in radians. In degrees, it is 1.0799 × 180/π = 61.87° to two decimal places.
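
The angle computation is easy to reproduce. This is a small illustrative sketch (the function name `angle_between` is an assumption, and `math.hypot` with three arguments needs Python 3.8 or later).

```python
import math

def angle_between(n1, n2):
    """Angle, in radians, between two nonzero vectors."""
    dot = sum(a * b for a, b in zip(n1, n2))
    return math.acos(dot / (math.hypot(*n1) * math.hypot(*n2)))

theta = angle_between((2, 1, -1), (1, 1, 1))
print(theta, math.degrees(theta))
```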

Exercises
Stage 1

 1

The vector k̂ is a normal vector (i.e. is perpendicular) to the plane z = 0. Find another nonzero vector that is normal to z = 0.

 2

Consider the plane P with equation 3x + (1/2)y + z = 4.

1. Find the intersection of P with the y -axis.


2. Find the intersection of P with the z -axis.
3. Sketch the part of the intersection of P with the yz-plane that is in the first octant. (That is, with x, y, z ≥ 0.)

 3
1. Find the equation of the plane that passes through the origin and has normal vector ⟨1, 2, 3⟩ .
2. Find the equation of the plane that passes through the point (0, 0, 1) and has normal vector ⟨1, 1, 3⟩ .
3. Find, if possible, the equation of a plane that passes through both (1, 2, 3) and (1, 0, 0) and has normal vector ⟨4, 5, 6⟩ .
4. Find, if possible, the equation of a plane that passes through both (1, 2, 3) and (0, 3, 4) and has normal vector ⟨2, 1, 1⟩ .

 4. ✳
Find the equation of the plane that contains (1, 0, 0), (0, 1, 0) and (0, 0, 1).

 5
1. Find the equation of the plane containing the points (1, 0, 1), (1, 1, 0) and (0, 1, 1).
2. Is the point (1, 1, 1) on the plane?
3. Is the origin on the plane?
4. Is the point (4, −1, −1) on the plane?

 6

What's wrong with the following exercise? “Find the equation of the plane containing (1, 2, 3), (2, 3, 4) and (3, 4, 5).”

Stage 2

 7

Find the plane containing the given three points.


1. (1, 0, 1),  (2, 4, 6),  (1, 2, −1)
2. (1, −2, −3),  (4, −4, 4),  (3, 2, −3)
3. (1, −2, −3),  (5, 2, 1),  (−1, −4, −5)

 8

Find the distance from the given point to the given plane.
1. point (−1, 2, 3), plane x + y + z = 7
2. point (1, −4, 3), plane x − 2y + z = 5

 9. ✳

A plane Π passes through the points A = (1, 1, 3), B = (2, 0, 2) and C = (2, 1, 0) in ℝ³.

1. Find an equation for the plane Π.


2. Find the point E in the plane Π such that the line L through D = (6, 1, 2) and E is perpendicular to Π.

 10. ✳
Let A = (2, 3, 4) and let L be the line given by the equations x + y = 1 and x + 2y + z = 3.
1. Write an equation for the plane containing A and perpendicular to L.
2. Write an equation for the plane containing A and L.

 11. ✳
Consider the plane 4x + 2y − 4z = 3. Find all parallel planes that are distance 2 from the above plane. Your answers should
be in the following form: 4x + 2y − 4z = C .

 12. ✳
Find the distance from the point (1, 2, 3) to the plane that passes through the points (0, 1, 1), (1, −1, 3) and (2, 0, −1).

Stage 3

 13. ✳
Consider two planes W1, W2, and a line M defined by:

W1 : −2x + y + z = 7

W2 : −x + 3y + 3z = 6

M : x/2 = (2y − 4)/4 = z + 5

1. Find a parametric equation of the line of intersection L of W1 and W2.

2. Find the distance from L to M .

 14

Find the equation of the sphere which has the two planes x + y + z = 3,  x + y + z = 9 as tangent planes if the center of the
sphere is on the planes 2x − y = 0,  3x − z = 0.

 15

Find the equation of the plane that passes through the point (−2, 0, 1) and through the line of intersection of
2x + 3y − z = 0,  x − 4y + 2z = −5.

 16

Find the distance from the point p to the plane n ⋅ x = c.

 17

Describe the set of points equidistant from (1, 2, 3) and (5, 2, 7).

 18

Describe the set of points equidistant from a and b.

 19. ✳

Consider a point P (5, −10, 2) and the triangle with vertices A(0, 1, 1), B(1, 0, 1) and C (1, 3, 0).
1. Compute the area of the triangle ABC .
2. Find the distance from the point P to the plane containing the triangle.

 20. ✳

Consider the sphere S given by

(x − 1)² + (y − 2)² + (z + 1)² = 2

Suppose that you are at the point (2, 2, 0) on S, and you plan to follow the shortest path on S to (2, 1, −1). Express your
initial direction as a cross product.

1. To see why heading in the normal direction gives the shortest walk, revisit Example 1.3.5
2. Now might be a good time to review the Definition 1.2.13 of projection.

This page titled 1.4: Equations of Planes in 3d is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Joel
Feldman, Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a detailed
edit history is available upon request.

1.5: Equations of Lines in 3d
Just as in two dimensions, a line in three dimensions can be specified by giving one point (x0, y0, z0) on the line and one vector d = ⟨dx, dy, dz⟩ whose direction is parallel to that of the line. If (x, y, z) is any point on the line then the vector ⟨x − x0, y − y0, z − z0⟩, whose tail is at (x0, y0, z0) and whose arrow is at (x, y, z), must be parallel to d and hence a scalar multiple of d. By translating this statement into a vector equation we get

 Equation 1.5.1. Parametric Equations of a Line

⟨x − x0 , y − y0 , z − z0 ⟩ = td

or the three corresponding scalar equations

x − x0 = tdx y − y0 = tdy z − z0 = tdz

These are called the parametric equations of the line. Solving all three equations for the parameter t (assuming that dx, dy and dz are all nonzero)

t = (x − x0)/dx = (y − y0)/dy = (z − z0)/dz

and erasing the “t =” gives the (so called) symmetric equations for the line.
Here is an example in which we find the parametric equations of a line that is given by the intersection of two planes.

 Example 1.5.2
The set of points (x, y, z) that obey x + y + z = 2 form a plane. The set of points (x, y, z) that obey x − y = 0 form a second
plane. The set of points (x, y, z) that obey both x + y + z = 2 and x − y = 0 lie on the intersection of these two planes and
hence form a line. We shall find the parametric equations for that line.
To sketch x + y + z = 2 we observe that if any two of x, y, z are zero, then the third is 2. So all of (0, 0, 2), (0, 2, 0) and
(2, 0, 0) are on x + y + z = 2. The plane x − y = 0 contains all of the z -axis, since (0, 0, z) obeys x − y = 0 for all z. Here

are separate sketches of (parts of) the two planes.

And here is a sketch of their intersection

Method 1. Each point on the line has a different value of z. We'll use z as the parameter. (We could just as well use x or y.)
There is no law that requires us to use the parameter name t, but that's what we have done so far, so set t = z. If (x, y, z) is on
the line then z = t and

1.5.1 https://math.libretexts.org/@go/page/89207
x +y +t = 2

x −y =0

The second equation forces y = x. Substituting this into the first equation gives
2x + t = 2 ⟹ x = y = 1 − t/2

So the parametric equations are


x = 1 − t/2, y = 1 − t/2, z = t or ⟨x − 1, y − 1, z⟩ = t ⟨−1/2, −1/2, 1⟩

Method 2. We first find one point on the line. There are lots of them. We'll find the point with z = 0. (We could just as well use
z=123.4, but arguably z = 0 is a little easier.) If (x, y, z) is on the line and z = 0, then

x +y = 2

x −y = 0

The second equation again forces y = x. Substituting this into the first equation gives

2x = 2 ⟹ x =y =1

So (1, 1, 0) is on the line. Now we'll find a direction vector, d, for the line.
Since the line is contained in the plane x + y + z = 2, any vector lying on the line, like d, is also completely contained in
that plane. So d must be perpendicular to the normal vector of x + y + z = 2, which is ⟨1, 1, 1⟩ .
Similarly, since the line is contained in the plane x − y = 0, any vector lying on the line, like d, is also completely
contained in that plane. So d must be perpendicular to the normal vector of x − y = 0, which is ⟨1, −1, 0⟩ .
So we may choose for d any vector which is perpendicular to both ⟨1, 1, 1⟩ and ⟨1, −1, 0⟩ , like, for example,

d = ⟨1, −1, 0⟩ × ⟨1, 1, 1⟩

      ⎡ î   ĵ   k̂ ⎤
= det ⎢ 1  −1   0 ⎥
      ⎣ 1   1   1 ⎦

= î det[−1 0; 1 1] − ĵ det[1 0; 1 1] + k̂ det[1 −1; 1 1]

= −î − ĵ + 2k̂

We now have both a point on the line (namely (1, 1, 0)) and a direction vector for the line (namely ⟨−1, −1, 2⟩), so, as usual,
the parametric equations for the line are

⟨x − 1, y − 1, z⟩ = t ⟨−1, −1, 2⟩ or x = 1 − t,  y = 1 − t,  z = 2t

This looks a little different than the solution from method 1, but we'll see in a moment that they are really the same. Before
that, let's do one more method.
Method 3. We'll find two points on the line. We have already found that (1, 1, 0) is on the line. From the picture above, it looks
like (0, 0, 2) is also on the line. This is indeed the case since (0, 0, 2) obeys both x + y + z = 2 and x − y = 0. Notice that
we could also have guessed (0, 0, 2) by setting x = 0 and then solving y + z = x + y + z = 2, −y = x − y = 0 for y and z.
As both (1, 1, 0) and (0, 0, 2) are on the line, the vector with head at (1, 1, 0) and tail at (0, 0, 2), which is
⟨1 − 0, 1 − 0, 0 − 2⟩ = ⟨1, 1, −2⟩ , is a direction vector for the line. As (0, 0, 2) is a point on the line and ⟨1, 1, −2⟩ is a

direction vector for the line, the parametric equations for the line are

⟨x − 0, y − 0, z − 2⟩ = t ⟨1, 1, −2⟩ or x = t,  y = t,  z = 2 − 2t

This also looks similar, but not quite identical, to our previous answers. Time for a comparison.
Comparing the answers. The parametric equations given by the three methods are different. That's just because we have really
used different parameters in the three methods, even though we have called the parameter t in each case. To clarify the relation
between the three answers, rename the parameter of method 1 to t1, the parameter of method 2 to t2 and the parameter of method 3 to t3. The parametric equations then become

Method 1: x = 1 − t1/2   y = 1 − t1/2   z = t1

Method 2: x = 1 − t2   y = 1 − t2   z = 2t2

Method 3: x = t3   y = t3   z = 2 − 2t3

Substituting t1 = 2t2 into the Method 1 equations gives the Method 2 equations, and substituting t3 = 1 − t2 into the Method 3 equations gives the Method 2 equations. So all three really give the same line, just parametrized a little differently.
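
The comparison of the three methods can also be checked by machine. The sketch below is only an illustration: the function names `method1`, `method2`, `method3` and `cross` are made up for this check, and exact fractions are used so that equality tests are reliable.

```python
from fractions import Fraction

def cross(a, b):
    """Cross product of two 3-component vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

# Direction vector from Method 2: perpendicular to both plane normals.
d = cross((1, -1, 0), (1, 1, 1))

def method1(t1):
    half = Fraction(t1) / 2
    return (1 - half, 1 - half, Fraction(t1))

def method2(t2):
    return (1 - t2, 1 - t2, 2 * t2)

def method3(t3):
    return (t3, t3, 2 - 2 * t3)

# Check the substitutions t1 = 2*t2 and t3 = 1 - t2 at several sample values.
samples = [Fraction(k, 3) for k in range(-6, 7)]
same_line = all(
    method1(2 * t2) == method2(t2) == method3(1 - t2) for t2 in samples
)
print(d, same_line)
```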

 Warning 1.5.3. A line in three dimensions has infinitely many normal vectors

For example, the line

⟨x − 1, y − 1, z⟩ = t ⟨1, 2, −2⟩

has direction vector ⟨1, 2, −2⟩. Any vector perpendicular to ⟨1, 2, −2⟩ is perpendicular to the line. The vector ⟨n1, n2, n3⟩ is
perpendicular to ⟨1, 2, −2⟩ if and only if
0 = ⟨1, 2, −2⟩ ⋅ ⟨n1 , n2 , n3 ⟩ = n1 + 2 n2 − 2 n3

There is a whole plane of ⟨n1, n2, n3⟩'s obeying this condition, of which ⟨2, −1, 0⟩, ⟨0, 1, 1⟩ and ⟨2, 0, 1⟩ are only three examples.
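
A quick way to see the "whole plane" of normal vectors is to choose n1 and n2 freely and let the condition n1 + 2n2 − 2n3 = 0 determine n3. The snippet below is an illustrative sketch; the helper name `normal_from` is hypothetical.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

d = (1, 2, -2)                                  # direction vector of the line
examples = [(2, -1, 0), (0, 1, 1), (2, 0, 1)]   # the three vectors from the text
dots = [dot(d, n) for n in examples]

def normal_from(n1, n2):
    """Pick n1 and n2 freely; n3 is then forced by n1 + 2*n2 - 2*n3 = 0."""
    return (n1, n2, (n1 + 2 * n2) / 2)

more = [normal_from(a, b) for a in range(-2, 3) for b in range(-2, 3)]
print(dots, all(dot(d, n) == 0 for n in more))
```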

The next two examples illustrate two different methods for finding the distance between a point and a line.

 Example 1.5.4

In this example, we find the distance between the point (2, 3, −1) and the line
L :   ⟨x − 1, y − 2, z − 3⟩ = t ⟨1, 1, 2⟩

or, equivalently, x = 1 + t,  y = 2 + t,  z = 3 + 2t

The vector from (2, 3, −1) to the point (1 + t , 2 + t , 3 + 2t) on L is ⟨t − 1 , t − 1 , 2t + 4⟩ . The square of the distance
between (2, 3, −1) and the point (1 + t , 2 + t , 3 + 2t) on L is the square of the length of that vector, namely
d(t)² = (t − 1)² + (t − 1)² + (2t + 4)²

The point on L that is closest to (2, 3, −1) is that whose value of t obeys
0 = (d/dt) d(t)² = 2(t − 1) + 2(t − 1) + 2(2)(2t + 4)   (∗)

Before we solve this equation for t and finish off our computation, observe that this equation (divided by 2) says that

⟨1 , 1 , 2⟩ ⋅ ⟨t − 1 , t − 1 , 2t + 4⟩ = 0

That is, the vector from (2, 3, −1) to the point on L nearest (2, 3, −1) is perpendicular to L's direction vector.
Now back to our computation. The equation (∗) simplifies to 12t + 12 = 0. So the optimal t = −1 and the distance is
d(−1) = √((−1 − 1)² + (−1 − 1)² + (−2 + 4)²) = √12
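
The minimization above has a closed form for any point and line. In the sketch below (the name `closest_parameter` is an assumption), setting the derivative of |p0 + td − q|² to zero gives t = d ⋅ (q − p0)/|d|², which is the general version of equation (∗).

```python
from fractions import Fraction

def closest_parameter(q, p0, d):
    """Value of t minimizing the squared distance from q to p0 + t*d."""
    num = sum(di * (qi - pi) for di, qi, pi in zip(d, q, p0))
    den = sum(di * di for di in d)
    return Fraction(num, den)

q, p0, d = (2, 3, -1), (1, 2, 3), (1, 1, 2)
t = closest_parameter(q, p0, d)
# Squared distance from q to the closest point p0 + t*d on the line.
dist_sq = sum((pi + t * di - qi) ** 2 for pi, di, qi in zip(p0, d, q))
print(t, dist_sq)
```

The squared distance returned is 12, matching d(−1) = √12.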

 Example 1.5.5. Example 1.5.4 revisited

In this example, we again find the distance between the point (2, 3, −1) and the line

L :   ⟨x − 1, y − 2, z − 3⟩ = t ⟨1, 1, 2⟩

but we use a different method. In the figure below, Q is the point (2, 3, −1).

If we drop a perpendicular from Q to the line L, it hits the line L at the point N , which is the point on L that is nearest Q. So
the distance from Q to L is exactly the distance from Q to N , which is exactly the length of the vector from Q to N . In the
figure above, w⃗ is the vector from Q to N. Now the vector w⃗ has to be perpendicular to the direction vector for L. That is, w⃗ has to be perpendicular to d⃗ = ⟨1, 1, 2⟩. However, as we saw in Warning 1.5.3, there are a huge number of vectors in different directions that are perpendicular to d⃗. So you might think that it is very hard to even determine the direction of w⃗.
Fortunately, it isn't. Here is the strategy.
Pick any point on L and call it P .
It is very easy to find the vector from P to N: it is just the projection of the vector from P to Q (called v⃗ in the figure above) on d⃗.
Once we know proj_d⃗ v⃗, we will be able to compute

w⃗ = proj_d⃗ v⃗ − v⃗

and then the distance from Q to the line L is just |w⃗|.


Here is the computation. We'll choose P to be the point on L that has t = 0, which is (1, 2, 3). So the vector from
P = (1, 2, 3) to Q = (2, 3, −1) is

v ⃗ = ⟨2 − 1, 3 − 2, −1 − 3⟩ = ⟨1, 1, −4⟩

The projection of v⃗ = ⟨1, 1, −4⟩ on d ⃗ = ⟨1, 1, 2⟩ is


proj_d⃗ v⃗ = (⟨1, 1, −4⟩ ⋅ ⟨1, 1, 2⟩ / |⟨1, 1, 2⟩|²) ⟨1, 1, 2⟩ = (−6/6) ⟨1, 1, 2⟩ = ⟨−1, −1, −2⟩

and then

w⃗ = proj_d⃗ v⃗ − v⃗ = ⟨−1, −1, −2⟩ − ⟨1, 1, −4⟩ = ⟨−2, −2, 2⟩

and finally the distance from Q to the line L is



| w⃗ | = | ⟨−2, −2, 2⟩ | = |2 ⟨−1, −1, 1⟩ | = 2 √3
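
The three steps of the strategy translate directly into code. This is a minimal sketch under the same setup as the example (the function name `point_line_distance` is hypothetical): form v⃗ from P to Q, project it on d⃗, subtract, and take the length.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def point_line_distance(q, p, d):
    """Distance from q to the line through p with direction d."""
    v = tuple(qi - pi for qi, pi in zip(q, p))       # vector from P to Q
    scale = dot(v, d) / dot(d, d)                    # proj_d v = scale * d
    w = tuple(scale * di - vi for di, vi in zip(d, v))
    return math.sqrt(dot(w, w))

dist = point_line_distance((2, 3, -1), (1, 2, 3), (1, 1, 2))
print(dist, 2 * math.sqrt(3))
```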

The next two (optional) examples illustrate two different methods for finding the distance between two lines.

 Example 1.5.6. (Optional) Distance between lines

In this example, we find the distance between the lines


L : ⟨x − 1, y − 2, z − 3⟩ = t ⟨1, 0, −1⟩

L′ : ⟨x − 1, y − 2, z − 1⟩ = t ⟨1, −2, 1⟩

We can rewrite the equations of the lines as

L : x = 1 + t, y = 2, z = 3 − t

L′ : x = 1 + t, y = 2 − 2t, z = 1 + t

Of course the value of t in the parametric equation for L need not be the same as the value of t in the parametric equation for L′. So let us denote by x⃗(s) = (1 + s, 2, 3 − s) and y⃗(t) = (1 + t, 2 − 2t, 1 + t) the points on L and L′, respectively, that are closest together. Note that the vector from x⃗(s) to y⃗(t) is ⟨t − s, −2t, −2 + s + t⟩. Then, in particular,

x⃗(s) is the point on L that is closest to the point y⃗(t), and
y⃗(t) is the point on L′ that is closest to the point x⃗(s).

So, as we saw in Example 1.5.4, the vector, ⟨t − s, −2t, −2 + s + t⟩, that joins x⃗(s) and y⃗(t), must be perpendicular to both the direction vector of L and the direction vector of L′. Consequently

0 = ⟨1, 0, −1⟩ ⋅ ⟨t − s , −2t , −2 + s + t⟩ = 2 − 2s

0 = ⟨1, −2, 1⟩ ⋅ ⟨t − s , −2t , −2 + s + t⟩ = −2 + 6t

So s = 1 and t = 1/3 and the distance between L and L′ is

|⟨t − s, −2t, −2 + s + t⟩|_{s=1, t=1/3} = |⟨−2/3, −2/3, −2/3⟩| = 2/√3
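
The two perpendicularity conditions form a 2×2 linear system in s and t, which can be solved for any pair of non-parallel lines. The following sketch is illustrative: the name `closest_parameters` and the use of Cramer's rule are choices made here, not taken from the text.

```python
from fractions import Fraction

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def closest_parameters(p, d, p2, d2):
    """Solve d . u = 0 and d2 . u = 0 for (s, t), where
    u = (p2 + t*d2) - (p + s*d) joins the two lines."""
    w0 = tuple(b - a for a, b in zip(p, p2))
    a, b, c = dot(d, d), dot(d, d2), dot(d2, d2)
    e, f = dot(d, w0), dot(d2, w0)
    det = b * b - a * c          # zero exactly when the lines are parallel
    if det == 0:
        raise ValueError("lines are parallel")
    s = Fraction(b * f - c * e, det)
    t = Fraction(a * f - b * e, det)
    return s, t

p, d = (1, 2, 3), (1, 0, -1)      # line L
p2, d2 = (1, 2, 1), (1, -2, 1)    # line L'
s, t = closest_parameters(p, d, p2, d2)
u = tuple(b + t * y - a - s * x for a, x, b, y in zip(p, d, p2, d2))
dist_sq = dot(u, u)               # squared distance between the lines
print(s, t, u, dist_sq)
```

The squared distance comes out as 4/3, the square of 2/√3.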

 Example 1.5.7. Example 1.5.6 revisited, again optional

In this example, we again find the distance between the lines


L : ⟨x − 1, y − 2, z − 3⟩ = t ⟨1, 0, −1⟩

L′ : ⟨x − 1, y − 2, z − 1⟩ = t ⟨1, −2, 1⟩

this time using a projection, much as in Example 1.4.5. The procedure, which will be justified below, is
first form a vector n⃗  that is perpendicular to the direction vectors of both lines by taking the cross product of the two
direction vectors. In this example,
                              ⎡ î   ĵ   k̂ ⎤
⟨1, 0, −1⟩ × ⟨1, −2, 1⟩ = det ⎢ 1   0  −1 ⎥ = −2î − 2ĵ − 2k̂
                              ⎣ 1  −2   1 ⎦

Since we just want n⃗ to be perpendicular to both direction vectors, we may simplify our computations by dividing this vector by −2, and take n⃗ = ⟨1, 1, 1⟩.


Next find one point on L and one point on L′ and subtract to form a vector v⃗ whose tail is at one point and whose head is at the other point. This vector goes from one line to the other line. In this example, the point (1, 2, 3) is on L (just set t = 0 in the equation for L) and the point (1, 2, 1) is on L′ (just set t = 0 in the equation for L′), so that we may take

v⃗ = ⟨1 − 1, 2 − 2, 3 − 1⟩ = ⟨0, 0, 2⟩

The distance between the two lines is the length of the projection of v⃗ on n⃗. In this example, by Equation 1.2.14, the distance is

|proj_n⃗ v⃗| = |(v⃗ ⋅ n⃗ / |n⃗|²) n⃗| = |v⃗ ⋅ n⃗| / |n⃗|

= |⟨0, 0, 2⟩ ⋅ ⟨1, 1, 1⟩| / |⟨1, 1, 1⟩|

= 2/√3

just as we found in Example 1.5.6
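
The procedure of this example can be sketched as a short function. The names `cross`, `dot` and `line_line_distance` are hypothetical helpers written for this illustration; the sketch assumes the two lines are not parallel, so that the cross product of the direction vectors is nonzero.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def line_line_distance(p, d, p2, d2):
    """Distance between the lines p + t*d and p2 + t*d2 (assumed non-parallel)."""
    n = cross(d, d2)                  # perpendicular to both direction vectors
    if n == (0, 0, 0):
        raise ValueError("lines are parallel")
    v = tuple(a - b for a, b in zip(p, p2))   # a vector from one line to the other
    return abs(dot(v, n)) / math.sqrt(dot(n, n))

n = cross((1, 0, -1), (1, -2, 1))
dist = line_line_distance((1, 2, 3), (1, 0, -1), (1, 2, 1), (1, -2, 1))
print(n, dist)
```

Note that the sketch does not rescale n⃗ to ⟨1, 1, 1⟩; the ratio |v⃗ ⋅ n⃗|/|n⃗| is unchanged when n⃗ is multiplied by a nonzero constant.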


Now, here is the justification for the procedure.
As we did in Example 1.5.6, denote by x⃗(s) and y⃗(t) the points on L and L′, respectively, that are closest together. Note that, as we observed in Example 1.5.6, the vector from x⃗(s) to y⃗(t) is perpendicular to the direction vectors of both lines, and so is parallel to n⃗.
Denote by P the plane through x⃗(s) that is perpendicular to n⃗. As x⃗(s) is on L and the direction vector of L is perpendicular to n⃗, the line L is contained in P.
Denote by P′ the plane through y⃗(t) that is perpendicular to n⃗. As y⃗(t) is on L′ and the direction vector of L′ is perpendicular to n⃗, the line L′ is contained in P′.

The planes P and P′ are parallel to each other. As x⃗(s) is on P and y⃗(t) is on P′, and the vector from x⃗(s) to y⃗(t) is perpendicular to both P and P′, the distance from P to P′ is exactly the length of the vector from x⃗(s) to y⃗(t). That is also the distance from L to L′.
The vector v⃗ constructed in the procedure above is a vector between L and L′ and so is also a vector between P and P′. Looking at the figure below¹, we see that the vector from x⃗(s) to y⃗(t) is (up to a sign) the projection of v⃗ on n⃗.

So the distance from P to P′, and hence the distance from L to L′, is exactly the length of proj_n⃗ v⃗.

Exercises
Stage 1

 1

What is wrong with the following exercise?


“Give an equation for the line passing through the point (3, 1, 3) that is normal to the vectors ⟨4, −6, 2⟩ and ⟨1/3, −1/2, 1/6⟩.”

 2

Find, if possible, four lines in 3d with


no two of the lines parallel to each other and
no two of the lines intersecting.

Stage 2

 3

Find a vector parametric equation for the line of intersection of the given planes.
1. x − 2z = 3 and y + z = 5 1

2. 2x − y − 2z = −3 and 4x − 3y − 3z = −5

 4

Determine a vector equation for the line of intersection of the planes


1. x + y + z = 3 and x + 2y + 3z = 7
2. x + y + z = 3 and 2x + 2y + 2z = 7

 5
In each case, determine whether or not the given pair of lines intersect. Also find all planes containing the pair of lines.
1. ⟨x, y, z⟩ = ⟨−3, 2, 4⟩ + t ⟨−4, 2, 1⟩ and ⟨x, y, z⟩ = ⟨2, 1, 2⟩ + t ⟨1, 1, −1⟩
2. ⟨x, y, z⟩ = ⟨−3, 2, 4⟩ + t ⟨−4, 2, 1⟩ and ⟨x, y, z⟩ = ⟨2, 1, −1⟩ + t ⟨1, 1, −1⟩
3. ⟨x, y, z⟩ = ⟨−3, 2, 4⟩ + t ⟨−2, −2, 2⟩ and ⟨x, y, z⟩ = ⟨2, 1, −1⟩ + t ⟨1, 1, −1⟩
4. ⟨x, y, z⟩ = ⟨3, 2, −2⟩ + t ⟨−2, −2, 2⟩ and ⟨x, y, z⟩ = ⟨2, 1, −1⟩ + t ⟨1, 1, −1⟩

 6
Find the equation of the line through (2, −1, −1) and parallel to each of the two planes x + y = 0 and x − y + 2z = 0.

Express the equations of the line in vector and scalar parametric forms and in symmetric form.

 7. ✳

Let L be the line given by the equations x + y = 1 and x + 2y + z = 3. Write a vector parametric equation for L.

 8
1. Find a vector parametric equation for the line x + 2y + 3z = 11,  x − 2y + z = −1.
2. Find the distance from (1, 0, 1) to the line x + 2y + 3z = 11,  x − 2y + z = −1.

 9

Let L1 be the line passing through (1, −2, −5) in the direction of d⃗1 = ⟨2, 3, 2⟩. Let L2 be the line passing through (−3, 4, −1) in the direction d⃗2 = ⟨5, 2, 4⟩.

1. Find the equation of the plane P that contains L1 and is parallel to L2.
2. Find the distance from L2 to P.

 10. ✳

Let L be a line which is parallel to the plane 2x + y − z = 5 and perpendicular to the line x = 3 − t, y = 1 − 2t and z = 3t.
1. Find a vector parallel to the line L.
2. Find parametric equations for the line L if L passes through a point Q(a, b, c) where a < 0, b > 0, c > 0, and the
distances from Q to the xy-plane, the xz-plane and the yz-plane are 2, 3 and 4 respectively.

 11. ✳

Let L be the line of intersection of the planes x + y + z = 6 and x − y + 2z = 0.


1. Find the points in which the line L intersects the coordinate planes.
2. Find parametric equations for the line through the point (10, 11, 13) that is perpendicular to the line L and parallel to the
plane y = z.

 12. ✳

The line L has vector parametric equation r⃗(t) = (2 + 3t) î + 4t ĵ − k̂.

1. Write the symmetric equations for L.


2. Let α be the angle between the line L and the plane given by the equation x − y + 2z = 0. Find α.

 13. ✳

Find the parametric equation for the line of intersection of the planes

x + y + z = 11 and x − y − z = 13.

 14. ✳
1. Find a point on the y-axis equidistant from (2, 5, −3) and (−3, 6, 1).
2. Find the equation of the plane containing the point (1, 3, 1) and the line r⃗(t) = t î + t ĵ + (t + 2) k̂.

Stage 3

 15. ✳

Let A = (0, 2, 2), B = (2, 2, 2), C = (5, 2, 1).

1. Find the parametric equations for the line which contains A and is perpendicular to the triangle ABC .

2. Find the equation of the set of all points P such that the vector PA is perpendicular to the vector PB. This set forms a Plane/Line/Sphere/Cone/Paraboloid/Hyperboloid (circle one) in space.
3. A light source at the origin shines on the triangle ABC making a shadow on the plane x + 7y + z = 32. (See the diagram.) Find Ã.

 16

Let P, Q, R and S be the vertices of a tetrahedron. Denote by p⃗, q⃗, r⃗ and s⃗ the vectors from the origin to P, Q, R and S respectively. A line is drawn from each vertex to the centroid of the opposite face, where the centroid of a triangle with vertices a⃗, b⃗ and c⃗ is (1/3)(a⃗ + b⃗ + c⃗). Show that these four lines meet at (1/4)(p⃗ + q⃗ + r⃗ + s⃗).

 17
Calculate the distance between the lines (x + 2)/3 = (y − 7)/(−4) = (z − 2)/4 and (x − 1)/(−3) = (y + 2)/4 = (z + 1)/1.

1. and possibly reviewing the Definition 1.2.13 of projection

This page titled 1.5: Equations of Lines in 3d is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Joel
Feldman, Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a detailed
edit history is available upon request.

1.6: Curves and their Tangent Vectors
The right hand side of the parametric equation (x, y, z) = (1, 1, 0) + t ⟨1, 2, −2⟩ that we just saw in Warning 1.5.3 is a vector-
valued function of the one real variable t. We are now going to study more general vector-valued functions of one real variable.
That is, we are going to study functions that assign to each real number t (typically in some interval) a vector r⃗(t). For example

r⃗(t) = (x(t), y(t), z(t))

might be the position¹ of a particle at time t. As t varies r⃗(t) sweeps out a curve.

While in some applications t will indeed be “time”, it does not have to be. It can be simply a parameter that is used to label the different points on the curve that r⃗(t) sweeps out. We then say that r⃗(t) provides a parametrization of the curve.

 Example 1.6.1. Parametrization of x² + y² = a²

While we will often use t as the parameter in a parametrized curve r⃗(t), there is no need to call it t. Sometimes it is natural to use a different name for the parameter. For example, consider the circle² x² + y² = a². It is natural to use the angle θ in the sketch below to label the point (a cos θ, a sin θ) on the circle.

That is,

r⃗(θ) = (a cos θ, a sin θ)   0 ≤ θ < 2π

is a parametrization of the circle x² + y² = a². Just looking at the figure above, it is clear that, as θ runs from 0 to 2π, r⃗(θ) traces out the full circle.


However beware that just knowing that r⃗(t) lies on a specified curve does not guarantee that, as t varies, r⃗(t) covers the entire curve. For example, as t runs over the whole real line, (2/π) arctan(t) runs over the interval (−1, 1). For all t,

r⃗(t) = (x(t), y(t)) = a ( (2/π) arctan(t), √(1 − (4/π²) arctan²(t)) )

is well-defined and obeys x(t)² + y(t)² = a². But this r⃗(t) does not cover the entire circle because y(t) is always positive.
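
A numerical spot check makes the warning concrete. The sketch below is illustrative (the function name `r` and the sample values of t are choices made here, with a = 1): every sampled point lies on the circle, yet every sampled y is positive, so only the upper half is reached.

```python
import math

def r(t, a=1.0):
    """The arctan-based parametrization from the text."""
    x = a * (2 / math.pi) * math.atan(t)
    y = a * math.sqrt(1 - (4 / math.pi**2) * math.atan(t) ** 2)
    return x, y

samples = [r(t) for t in (-1e6, -10.0, -1.0, 0.0, 1.0, 10.0, 1e6)]
on_circle = all(math.isclose(x * x + y * y, 1.0) for x, y in samples)
upper_half_only = all(y > 0 for _, y in samples)
print(on_circle, upper_half_only)
```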

 Example 1.6.2. Parametrization of (x − h)² + (y − k)² = a²

We can tweak the parametrization of Example 1.6.1 to get a parametrization of the circle of radius a that is centred on (h, k).
One way to do so is to redraw the sketch of Example 1.6.1 with the circle translated so that its centre is at (h, k).
We see from the sketch that

1.6.1 https://math.libretexts.org/@go/page/92234
r⃗(θ) = (h + a cos θ, k + a sin θ)   0 ≤ θ < 2π

is a parametrization of the circle (x − h)² + (y − k)² = a².

A second way to come up with this parametrization is to observe that we can turn the trig identity cos² t + sin² t = 1 into the equation (x − h)² + (y − k)² = a² of the circle by
multiplying the trig identity by a² to get (a cos t)² + (a sin t)² = a² and then
setting a cos t = x − h and a sin t = y − k, which turns (a cos t)² + (a sin t)² = a² into (x − h)² + (y − k)² = a².

 Example 1.6.3. Parametrization of x²/a² + y²/b² = 1 and of x^(2/3) + y^(2/3) = a^(2/3)

We can build parametrizations of the curves x²/a² + y²/b² = 1 and x^(2/3) + y^(2/3) = a^(2/3) from the trig identity cos² t + sin² t = 1, like we did in the second part of the last example.


Setting cos t = x/a and sin t = y/b turns cos² t + sin² t = 1 into x²/a² + y²/b² = 1.
Setting cos t = (x/a)^(1/3) and sin t = (y/a)^(1/3) turns cos² t + sin² t = 1 into x^(2/3)/a^(2/3) + y^(2/3)/a^(2/3) = 1.

So

r⃗(t) = (a cos t, b sin t)   0 ≤ t < 2π

r⃗(t) = (a cos³ t, a sin³ t)   0 ≤ t < 2π

give parametrizations of x²/a² + y²/b² = 1 and x^(2/3) + y^(2/3) = a^(2/3), respectively. To see that running t from 0 to 2π runs r⃗(t) once around the curve, look at the figures below.

The curve x^(2/3) + y^(2/3) = a^(2/3) is called an astroid. From its equation, we would expect its sketch to look like a deformed circle. But it is probably not so obvious that it would have the pointy bits of the right hand figure. We will not explain here why
they arise. The astroid is studied in some detail in Example 1.1.7 of the CLP-4 text. In particular, the above sketch is carefully
developed there.
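
Both parametrizations of this example can be verified numerically. The sketch below is illustrative: the function names and the values a = 3, b = 2 are assumptions made for the check, and u^(2/3) is computed as (u²)^(1/3) so that negative coordinates cause no trouble.

```python
import math

def ellipse(t, a, b):
    return (a * math.cos(t), b * math.sin(t))

def astroid(t, a):
    return (a * math.cos(t) ** 3, a * math.sin(t) ** 3)

def pow_two_thirds(u):
    """Real-valued u^(2/3), computed as (u^2)^(1/3) so negative u is allowed."""
    return (u * u) ** (1.0 / 3.0)

a, b = 3.0, 2.0
ts = [2 * math.pi * k / 16 for k in range(16)]   # sample parameter values
ellipse_ok = all(
    math.isclose(x * x / a**2 + y * y / b**2, 1.0)
    for x, y in (ellipse(t, a, b) for t in ts)
)
astroid_ok = all(
    math.isclose(pow_two_thirds(x) + pow_two_thirds(y), pow_two_thirds(a))
    for x, y in (astroid(t, a) for t in ts)
)
print(ellipse_ok, astroid_ok)
```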

 Example 1.6.4. Parametrization of e^y = 1 + x²

A very easy method that can often create parametrizations for a curve is to use x or y as a parameter. Because we can solve
e^y = 1 + x² for y as a function of x, namely y = ln(1 + x²), we can use x as the parameter simply by setting t = x. This gives the parametrization

r⃗(t) = (t, ln(1 + t²))   −∞ < t < ∞

 Example 1.6.5. Parametrization of x² + y² = a², again

It is also quite common that one can use either x or y to parametrize part of, but not all of, a curve. A simple example is the circle x² + y² = a². For each −a < x < a, there are two points on the circle with that value of x. So one cannot use x to parametrize the whole circle. Similarly, for each −a < y < a, there are two points on the circle with that value of y. So one cannot use y to parametrize the whole circle. On the other hand
− −−−−−
2 2
⃗ 
r (t) = (t , √ a − t ) −a < t < a
− −−−−−
2 2
⃗ 
r (t) = (t , −√ a − t ) −a < t < a

provide parametrizations of the top half and bottom half, respectively, of the circle using x as the parameter, and
− −−−−−
2 2
⃗ 
r (t) = (√ a − t , t) −a < t < a
− −−−−−
2 2
⃗ 
r (t) = ( − √ a − t , t) −a < t < a

provide parametrizations of the right half and left half, respectively, of the circle using y as the parameter.

 Example 1.6.6. Unparametrization of r (t)


⃗  = (cos t, 7 − t)

In this example, we will undo the parametrization \(\vec r(t) = (\cos t, 7-t)\) and find the Cartesian equation of the curve in question. We may rewrite the parametrization as
\[ x = \cos t \qquad y = 7-t \]
Note that we can eliminate the parameter \(t\) simply by using the second equation to solve for \(t\) as a function of \(y\). Namely \(t = 7-y\). Substituting this into the first equation gives us the Cartesian equation
\[ x = \cos(7-y) \]

Curves often arise as the intersection of two surfaces. For example, the intersection of the sphere \(x^2+y^2+z^2=1\) with the plane \(y=x\) is a circle. The part of that circle that is in the first octant is the red curve in the figure below.

One way to parametrize such curves is to choose one of the three coordinates x, y, z as the parameter, and solve the two given
equations for the remaining two coordinates, as functions of the parameter. Here are two examples.

 Example 1.6.7

The set of all \((x,y,z)\) obeying
\[ x-y=0 \qquad x^2+y^2+z^2=1 \]
is the circle sketched above. We can choose to use \(y\) as the parameter and think of
\[ x=y \qquad x^2+z^2=1-y^2 \]
as a system of two equations for the two unknowns \(x\) and \(z\), with \(y\) being treated as a given constant, rather than as an unknown. We can now (trivially) solve the first equation for \(x\), substitute the result into the second equation, and finally solve for \(z\).
\[ x=y,\quad x^2+z^2=1-y^2 \implies z^2 = 1-2y^2 \]
If, for example, we are interested in points \((x,y,z)\) on the curve with \(z\ge 0\), we have \(z=\sqrt{1-2y^2}\) and
\[ \vec r(y) = \big(y\,,\,y\,,\,\sqrt{1-2y^2}\big), \qquad -\frac{1}{\sqrt2}\le y\le\frac{1}{\sqrt2} \]
is a parametrization for the part of the circle above the \(xy\)-plane. If, on the other hand, we are interested in points \((x,y,z)\) on the curve with \(z\le 0\), we have \(z=-\sqrt{1-2y^2}\) and
\[ \vec r(y) = \big(y\,,\,y\,,\,-\sqrt{1-2y^2}\big), \qquad -\frac{1}{\sqrt2}\le y\le\frac{1}{\sqrt2} \]
is a parametrization for the part of the circle below the \(xy\)-plane.
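
The parametrization of this example can be spot-checked numerically. The sketch below (function names and sample values of \(y\) are assumptions chosen for illustration) evaluates the upper arc \(\big(y, y, \sqrt{1-2y^2}\big)\) at a few values of \(y\) and confirms that each point satisfies both \(x-y=0\) and \(x^2+y^2+z^2=1\).

```python
import math

def upper_arc(y):
    """r(y) = (y, y, sqrt(1 - 2 y^2)), valid for |y| <= 1/sqrt(2)."""
    return (y, y, math.sqrt(1.0 - 2.0 * y * y))

def on_both_surfaces(p, tol=1e-12):
    """True if p lies (numerically) on the plane x = y and the unit sphere."""
    x, y, z = p
    return abs(x - y) < tol and abs(x * x + y * y + z * z - 1.0) < tol

# sample parameter values inside the allowed interval
samples = [-0.7, -0.3, 0.0, 0.25, 0.7]
checks = [on_both_surfaces(upper_arc(y)) for y in samples]
print(checks)
```

Replacing the square root by its negative gives the same check for the lower arc.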

 Example 1.6.8

The previous example was rigged so that it was easy to solve for \(x\) and \(z\) as functions of \(y\). In practice it is not always easy, or even possible, to do so. A more realistic example is the set of all \((x,y,z)\) obeying
\[ x^2+\frac{y^2}{2}+\frac{z^2}{3}=1 \qquad x^2+2y^2=z \]
which is the blue curve in the figure

(Don't worry about how we make sketches like this. We'll develop some surface sketching technique in §1.7 below.) Substituting \(x^2 = z-2y^2\) (from the second equation) into the first equation gives
\[ -\frac{3}{2}y^2+z+\frac{z^2}{3}=1 \]
or, completing the square,
\[ -\frac{3}{2}y^2+\frac{1}{3}\Big(z+\frac{3}{2}\Big)^2=\frac{7}{4} \]
If, for example, we are interested in points \((x,y,z)\) on the curve with \(y\ge 0\), this can be solved to give \(y\) as a function of \(z\).
\[ y=\sqrt{\frac{2}{9}\Big(z+\frac{3}{2}\Big)^2-\frac{14}{12}} \]
Then \(x^2 = z-2y^2\) also gives \(x\) as a function of \(z\). If \(x\ge 0\),
\[ x=\sqrt{z-\frac{4}{9}\Big(z+\frac{3}{2}\Big)^2+\frac{14}{6}} =\sqrt{\frac{4}{3}-\frac{4}{9}z^2-\frac{1}{3}z} \]
The other signs of \(x\) and \(y\) can be gotten by using the appropriate square roots. In this example, \((x,y,z)\) is on the curve, i.e. satisfies the two original equations, if and only if all of \((\pm x,\pm y,z)\) are also on the curve.
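
Because the algebra in this example is easy to get wrong, here is a small Python sketch that rebuilds a point from a value of \(z\) using the two square-root formulas and then substitutes it back into the original pair of equations. The function names are hypothetical, and returning `None` when a square root would be of a negative number is a design choice made for this illustration.

```python
import math

def point_on_curve(z):
    """Return (x, y, z) with x, y >= 0 from the formulas of the example,
    or None when z is outside the range where both square roots are real."""
    y_sq = (2.0 / 9.0) * (z + 1.5) ** 2 - 14.0 / 12.0
    x_sq = 4.0 / 3.0 - (4.0 / 9.0) * z ** 2 - z / 3.0
    if y_sq < 0 or x_sq < 0:
        return None
    return (math.sqrt(x_sq), math.sqrt(y_sq), z)

def residuals(p):
    """Left side minus right side for each of the two defining equations."""
    x, y, z = p
    return (x * x + y * y / 2 + z * z / 3 - 1.0, x * x + 2 * y * y - z)

p = point_on_curve(1.0)
print(p, residuals(p))  # residuals should both be ~0
```

Only a limited range of \(z\) gives real values for both \(x\) and \(y\); for instance \(z=0\) makes the expression under the square root for \(y\) negative, so no point of the curve has \(z=0\).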

Derivatives and Tangent Vectors


This being a Calculus text, one of our main operations is differentiation. We are now interested in parametrizations \(\vec r(t)\). It is very easy and natural to extend our definition of derivative to \(\vec r(t)\) as follows.

 Definition 1.6.9
The derivative of the vector valued function \(\vec r(t)\) is defined to be
\[ \vec r'(t) = \frac{d\vec r}{dt}(t) = \lim_{h\to 0}\frac{\vec r(t+h)-\vec r(t)}{h} \]
when the limit exists. In particular, if \(\vec r(t) = x(t)\,\hat\imath + y(t)\,\hat\jmath + z(t)\,\hat k\), then
\[ \vec r'(t) = x'(t)\,\hat\imath + y'(t)\,\hat\jmath + z'(t)\,\hat k \]
That is, to differentiate a vector valued function of \(t\), just differentiate each of its components.
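
To see the definition in action, the following sketch compares the difference quotient \(\frac{\vec r(t+h)-\vec r(t)}{h}\) with the componentwise derivative for a sample curve. The curve \(\vec r(t) = (\cos t, \sin t, t^2)\), the point \(t_0=0.7\) and the step sizes are assumptions chosen only for illustration.

```python
import math

def r(t):
    """Sample curve r(t) = (cos t, sin t, t^2)."""
    return (math.cos(t), math.sin(t), t * t)

def r_prime(t):
    """Componentwise derivative of the sample curve."""
    return (-math.sin(t), math.cos(t), 2.0 * t)

def difference_quotient(f, t, h):
    """(f(t+h) - f(t)) / h, computed component by component."""
    ahead, here = f(t + h), f(t)
    return tuple((p - q) / h for p, q in zip(ahead, here))

t0 = 0.7
errors = []
for h in (1e-2, 1e-4, 1e-6):
    dq = difference_quotient(r, t0, h)
    # largest componentwise gap between the quotient and the exact derivative
    errors.append(max(abs(d - e) for d, e in zip(dq, r_prime(t0))))
print(errors)
```

The recorded errors shrink as \(h\) shrinks, which is what the limit in the definition predicts.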

And of course differentiation interacts with arithmetic operations, like addition, in the obvious way. Only a little more thought is
required to see that differentiation interacts quite nicely with dot and cross products too. Here are some examples.

 Example 1.6.10

Let
\[ \vec a(t) = t^2\,\hat\imath + t^4\,\hat\jmath + t^6\,\hat k \]
\[ \vec b(t) = e^{-t}\,\hat\imath + e^{-3t}\,\hat\jmath + e^{-5t}\,\hat k \]
\[ \gamma(t) = t^2 \]
\[ s(t) = \sin t \]

We are about to compute some derivatives. Each computation below applies the product rule
\[ \frac{d}{dt}\big[f(t)\,g(t)\big] = f'(t)\,g(t) + f(t)\,g'(t) \]
component by component, so it helps to keep track of which factors are the derivatives \(f'(t)\) and \(g'(t)\). Here we go.
\[ \gamma(t)\,\vec b(t) = t^2e^{-t}\,\hat\imath + t^2e^{-3t}\,\hat\jmath + t^2e^{-5t}\,\hat k \]
gives
\[ \begin{aligned}
\frac{d}{dt}\big[\gamma(t)\vec b(t)\big]
&= \big[2te^{-t}-t^2e^{-t}\big]\hat\imath + \big[2te^{-3t}-3t^2e^{-3t}\big]\hat\jmath + \big[2te^{-5t}-5t^2e^{-5t}\big]\hat k \\
&= 2t\big\{e^{-t}\,\hat\imath+e^{-3t}\,\hat\jmath+e^{-5t}\,\hat k\big\} + t^2\big\{-e^{-t}\,\hat\imath-3e^{-3t}\,\hat\jmath-5e^{-5t}\,\hat k\big\} \\
&= \gamma'(t)\,\vec b(t) + \gamma(t)\,\vec b'(t)
\end{aligned} \]
and
\[ \vec a(t)\cdot\vec b(t) = t^2e^{-t}+t^4e^{-3t}+t^6e^{-5t} \]
gives
\[ \begin{aligned}
\frac{d}{dt}\big[\vec a(t)\cdot\vec b(t)\big]
&= \big[2te^{-t}-t^2e^{-t}\big] + \big[4t^3e^{-3t}-3t^4e^{-3t}\big] + \big[6t^5e^{-5t}-5t^6e^{-5t}\big] \\
&= \big[2te^{-t}+4t^3e^{-3t}+6t^5e^{-5t}\big] + \big[-t^2e^{-t}-3t^4e^{-3t}-5t^6e^{-5t}\big] \\
&= \big\{2t\,\hat\imath+4t^3\,\hat\jmath+6t^5\,\hat k\big\}\cdot\big\{e^{-t}\,\hat\imath+e^{-3t}\,\hat\jmath+e^{-5t}\,\hat k\big\} \\
&\qquad + \big\{t^2\,\hat\imath+t^4\,\hat\jmath+t^6\,\hat k\big\}\cdot\big\{-e^{-t}\,\hat\imath-3e^{-3t}\,\hat\jmath-5e^{-5t}\,\hat k\big\} \\
&= \vec a'(t)\cdot\vec b(t) + \vec a(t)\cdot\vec b'(t)
\end{aligned} \]

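and
\[ \begin{aligned}
\vec a(t)\times\vec b(t)
&= \det\begin{bmatrix} \hat\imath & \hat\jmath & \hat k \\ t^2 & t^4 & t^6 \\ e^{-t} & e^{-3t} & e^{-5t} \end{bmatrix} \\
&= \hat\imath\big(t^4e^{-5t}-t^6e^{-3t}\big) - \hat\jmath\big(t^2e^{-5t}-t^6e^{-t}\big) + \hat k\big(t^2e^{-3t}-t^4e^{-t}\big)
\end{aligned} \]
gives
\[ \begin{aligned}
\frac{d}{dt}\big[\vec a(t)\times\vec b(t)\big]
&= \hat\imath\big(4t^3e^{-5t}-6t^5e^{-3t}\big) - \hat\jmath\big(2te^{-5t}-6t^5e^{-t}\big) + \hat k\big(2te^{-3t}-4t^3e^{-t}\big) \\
&\qquad + \hat\imath\big(-5t^4e^{-5t}+3t^6e^{-3t}\big) - \hat\jmath\big(-5t^2e^{-5t}+t^6e^{-t}\big) + \hat k\big(-3t^2e^{-3t}+t^4e^{-t}\big) \\
&= \big\{2t\,\hat\imath+4t^3\,\hat\jmath+6t^5\,\hat k\big\}\times\big\{e^{-t}\,\hat\imath+e^{-3t}\,\hat\jmath+e^{-5t}\,\hat k\big\} \\
&\qquad + \big\{t^2\,\hat\imath+t^4\,\hat\jmath+t^6\,\hat k\big\}\times\big\{-e^{-t}\,\hat\imath-3e^{-3t}\,\hat\jmath-5e^{-5t}\,\hat k\big\} \\
&= \vec a'(t)\times\vec b(t) + \vec a(t)\times\vec b'(t)
\end{aligned} \]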

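and
\[ \begin{aligned}
\vec a\big(s(t)\big) &= (\sin t)^2\,\hat\imath + (\sin t)^4\,\hat\jmath + (\sin t)^6\,\hat k \\
\implies \frac{d}{dt}\big[\vec a\big(s(t)\big)\big]
&= 2(\sin t)\cos t\,\hat\imath + 4(\sin t)^3\cos t\,\hat\jmath + 6(\sin t)^5\cos t\,\hat k \\
&= \big\{2(\sin t)\,\hat\imath + 4(\sin t)^3\,\hat\jmath + 6(\sin t)^5\,\hat k\big\}\cos t \\
&= \vec a'\big(s(t)\big)\,s'(t)
\end{aligned} \]
Of course these examples extend to general (differentiable) \(\vec a(t)\), \(\vec b(t)\), \(\gamma(t)\) and \(s(t)\) and give us (most of) the following theorem.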

 Theorem 1.6.11. Arithmetic of differentiation

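Let
\(\vec a(t)\), \(\vec b(t)\) be vector valued differentiable functions of \(t\in\mathbb{R}\) that take values in \(\mathbb{R}^n\) and
\(\alpha,\beta\in\mathbb{R}\) be constants and
\(\gamma(t)\) and \(s(t)\) be real valued differentiable functions of \(t\in\mathbb{R}\)

Then
(a) \(\frac{d}{dt}\big[\alpha\,\vec a(t)+\beta\,\vec b(t)\big] = \alpha\,\vec a'(t)+\beta\,\vec b'(t)\) (linear combination)
(b) \(\frac{d}{dt}\big[\gamma(t)\vec b(t)\big] = \gamma'(t)\vec b(t)+\gamma(t)\vec b'(t)\) (multiplication by scalar function)
(c) \(\frac{d}{dt}\big[\vec a(t)\cdot\vec b(t)\big] = \vec a'(t)\cdot\vec b(t)+\vec a(t)\cdot\vec b'(t)\) (dot product)
(d) \(\frac{d}{dt}\big[\vec a(t)\times\vec b(t)\big] = \vec a'(t)\times\vec b(t)+\vec a(t)\times\vec b'(t)\) (cross product)
(e) \(\frac{d}{dt}\big[\vec a\big(s(t)\big)\big] = \vec a'\big(s(t)\big)\,s'(t)\) (composition)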
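
Parts (c) and (d) of the theorem can be checked numerically for the functions \(\vec a(t)\) and \(\vec b(t)\) of Example 1.6.10. The sketch below compares a symmetric difference quotient of the product with the right hand side of the rule. The helper names, the step size and the evaluation point \(t_0=0.8\) are assumptions made for this illustration; a numerical comparison like this is evidence, not a proof.

```python
import math

def a(t):  return (t**2, t**4, t**6)
def da(t): return (2*t, 4*t**3, 6*t**5)
def b(t):  return (math.exp(-t), math.exp(-3*t), math.exp(-5*t))
def db(t): return (-math.exp(-t), -3*math.exp(-3*t), -5*math.exp(-5*t))

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(u, v):
    return (u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0])

def add(u, v):
    return tuple(ui + vi for ui, vi in zip(u, v))

def central_diff(f, t, h=1e-5):
    """Symmetric difference quotient; f may return a float or a tuple."""
    hi, lo = f(t + h), f(t - h)
    if isinstance(hi, tuple):
        return tuple((p - q) / (2 * h) for p, q in zip(hi, lo))
    return (hi - lo) / (2 * h)

t0 = 0.8
# rule (c): derivative of the dot product
dot_numeric = central_diff(lambda t: dot(a(t), b(t)), t0)
dot_rule = dot(da(t0), b(t0)) + dot(a(t0), db(t0))
# rule (d): derivative of the cross product (order of factors matters)
cross_numeric = central_diff(lambda t: cross(a(t), b(t)), t0)
cross_rule = add(cross(da(t0), b(t0)), cross(a(t0), db(t0)))
print(dot_numeric - dot_rule)
print([p - q for p, q in zip(cross_numeric, cross_rule)])
```

Note that in rule (d) the order of the factors is preserved in each term, since the cross product is not commutative.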

Let's think about the geometric significance of \(\vec r'(t)\). In particular, let's think about the relationship between \(\vec r'(t)\) and distances along the curve. The derivative \(\vec r'(t)\) is the limit of \(\frac{\vec r(t+h)-\vec r(t)}{h}\) as \(h\to 0\). The numerator, \(\vec r(t+h)-\vec r(t)\), is the vector with head at \(\vec r(t+h)\) and tail at \(\vec r(t)\).

When \(h\) is very small this vector
has essentially the same direction as the tangent vector to the curve at \(\vec r(t)\) and
has length essentially equal to the length of the part of the curve between \(\vec r(t)\) and \(\vec r(t+h)\).

Taking the limit as \(h\to 0\) yields that
\(\vec r'(t)\) is a tangent vector to the curve at \(\vec r(t)\) that points in the direction of increasing \(t\) and
if \(s(t)\) is the length of the part of the curve between \(\vec r(0)\) and \(\vec r(t)\), then \(\frac{ds}{dt}(t) = \big|\frac{d\vec r}{dt}(t)\big|\).

This is worth stating formally.

 Lemma 1.6.12

Let \(\vec r(t)\) be a parametrized curve.

1. Denote by \(\hat T(t)\) the unit tangent vector to the curve at \(\vec r(t)\) pointing in the direction of increasing \(t\). If \(\vec r'(t)\ne\vec 0\) then
\[ \hat T(t) = \frac{\vec r'(t)}{|\vec r'(t)|} \]
2. Denote by \(s(t)\) the length of the part of the curve between \(\vec r(0)\) and \(\vec r(t)\). Then
\[ \frac{ds}{dt}(t) = \Big|\frac{d\vec r}{dt}(t)\Big| \]
\[ s(T)-s(T_0) = \int_{T_0}^{T}\Big|\frac{d\vec r}{dt}(t)\Big|\,dt \]
3. In particular, if the parameter happens to be arc length, i.e. if \(t=s\), so that \(\frac{ds}{ds}=1\), then
\[ \Big|\frac{d\vec r}{ds}(s)\Big| = 1 \qquad \hat T(s) = \vec r'(s) \]

As an application, we have the

 Lemma 1.6.13

If \(\vec r(t) = \big(x(t)\,,\,y(t)\,,\,z(t)\big)\) is the position of a particle at time \(t\), then
\[ \begin{aligned}
\text{velocity at time } t &= \vec v(t) = \vec r'(t) = x'(t)\,\hat\imath + y'(t)\,\hat\jmath + z'(t)\,\hat k = \frac{ds}{dt}(t)\,\hat T(t) \\
\text{speed at time } t &= \frac{ds}{dt}(t) = |\vec v(t)| = |\vec r'(t)| = \sqrt{x'(t)^2+y'(t)^2+z'(t)^2} \\
\text{acceleration at time } t &= \vec a(t) = \vec r''(t) = \vec v'(t) = x''(t)\,\hat\imath + y''(t)\,\hat\jmath + z''(t)\,\hat k
\end{aligned} \]
and the distance travelled between times \(T_0\) and \(T\) is
\[ s(T)-s(T_0) = \int_{T_0}^{T}\Big|\frac{d\vec r}{dt}(t)\Big|\,dt = \int_{T_0}^{T}\sqrt{x'(t)^2+y'(t)^2+z'(t)^2}\,dt \]
Note that the velocity \(\vec v(t) = \vec r'(t)\) is a vector quantity while the speed \(\frac{ds}{dt}(t) = |\vec r'(t)|\) is a scalar quantity.

 Example 1.6.14. Circumference of a circle

In general it can be quite difficult to compute arc lengths. So, as an easy warmup example, we will compute the circumference of the circle [3] \(x^2+y^2=a^2\). We'll also find a unit tangent to the circle at any point on the circle. We'll use the parametrization
\[ \vec r(\theta) = (a\cos\theta\,,\,a\sin\theta) \qquad 0\le\theta\le 2\pi \]
of Example 1.6.1. Using Lemma 1.6.12, but with the parameter \(t\) renamed to \(\theta\)
\[ \begin{aligned}
\vec r'(\theta) &= -a\sin\theta\,\hat\imath + a\cos\theta\,\hat\jmath \\
\hat T(\theta) &= \frac{\vec r'(\theta)}{|\vec r'(\theta)|} = -\sin\theta\,\hat\imath + \cos\theta\,\hat\jmath \\
\frac{ds}{d\theta}(\theta) &= \big|\vec r'(\theta)\big| = a \\
s(\Theta)-s(0) &= \int_0^\Theta \big|\vec r'(\theta)\big|\,d\theta = a\Theta
\end{aligned} \]
As [4] \(s(\Theta)\) is the arc length of the part of the circle with \(0\le\theta\le\Theta\), the circumference of the whole circle is
\[ s(2\pi) = 2\pi a \]
which is reassuring, since this formula has been known [5] for thousands of years.

The formula \(s(\Theta)-s(0) = a\Theta\) also makes sense: the part of the circle with \(0\le\theta\le\Theta\) is the fraction \(\frac{\Theta}{2\pi}\) of the whole circle, and so should have length \(\frac{\Theta}{2\pi}\times 2\pi a\). Also note that
\[ \vec r(\theta)\cdot\hat T(\theta) = (a\cos\theta\,,\,a\sin\theta)\cdot(-\sin\theta\,,\,\cos\theta) = 0 \]
so that the tangent to the circle at any point is perpendicular to the radius vector of the circle at that point. This is another geometric fact that has been known [6] for thousands of years.

 Example 1.6.15. Arc length of a helix


Consider the curve
\[ \vec r(t) = 6\sin(2t)\,\hat\imath + 6\cos(2t)\,\hat\jmath + 5t\,\hat k \]
where the standard basis vectors \(\hat\imath = (1,0,0)\), \(\hat\jmath = (0,1,0)\) and \(\hat k = (0,0,1)\). We'll first sketch it, by observing that \(x(t) = 6\sin(2t)\) and \(y(t) = 6\cos(2t)\) obey
\[ x(t)^2+y(t)^2 = 36\sin^2(2t)+36\cos^2(2t) = 36 \]
So all points of the curve lie on the cylinder \(x^2+y^2=36\) and as \(t\) increases, \(\big(x(t),y(t)\big)\) runs clockwise around the circle \(x^2+y^2=36\) and at the same time \(z(t) = 5t\) just increases linearly.
Our curve is the helix

We have marked three points of the curve on the above sketch. The first has \(t=0\) and is \(0\,\hat\imath+6\,\hat\jmath+0\,\hat k\). The second has \(t=\frac{\pi}{2}\) and is \(0\,\hat\imath-6\,\hat\jmath+\frac{5\pi}{2}\,\hat k\), and the third has \(t=\pi\) and is \(0\,\hat\imath+6\,\hat\jmath+5\pi\,\hat k\). We'll now use Lemma 1.6.12 to find a unit tangent \(\hat T(t)\) to the curve at \(\vec r(t)\) and also the arclength of the part of the curve between \(t=0\) and \(t=\pi\).
\[ \begin{aligned}
\vec r(t) &= 6\sin(2t)\,\hat\imath + 6\cos(2t)\,\hat\jmath + 5t\,\hat k \\
\vec r'(t) &= 12\cos(2t)\,\hat\imath - 12\sin(2t)\,\hat\jmath + 5\,\hat k \\
\frac{ds}{dt}(t) &= \big|\vec r'(t)\big| = \sqrt{12^2\cos^2(2t)+12^2\sin^2(2t)+5^2} = \sqrt{12^2+5^2} = 13 \\
\hat T(t) &= \frac{\vec r'(t)}{|\vec r'(t)|} = \frac{12}{13}\cos(2t)\,\hat\imath - \frac{12}{13}\sin(2t)\,\hat\jmath + \frac{5}{13}\,\hat k \\
s(\pi)-s(0) &= \int_0^\pi \big|\vec r'(t)\big|\,dt = 13\pi
\end{aligned} \]
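
The arc length \(13\pi\) can be cross-checked without any calculus by approximating the helix with many short straight segments and adding up their lengths, in the spirit of the discussion before Lemma 1.6.12. The function names and the number of segments below are assumptions made for this sketch.

```python
import math

def r(t):
    """The helix r(t) = (6 sin 2t, 6 cos 2t, 5t) of this example."""
    return (6 * math.sin(2 * t), 6 * math.cos(2 * t), 5 * t)

def polygonal_length(curve, t_start, t_end, n):
    """Sum of the lengths of n chords joining equally spaced points on the curve."""
    total = 0.0
    prev = curve(t_start)
    for k in range(1, n + 1):
        t = t_start + (t_end - t_start) * k / n
        cur = curve(t)
        total += math.dist(prev, cur)  # straight-line distance between points
        prev = cur
    return total

approx = polygonal_length(r, 0.0, math.pi, 10000)
exact = 13 * math.pi
print(approx, exact)
```

Each chord is no longer than the piece of curve it spans, so the polygonal sum approaches \(13\pi\) from below as the number of segments grows.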

 Example 1.6.16. Velocity and acceleration

Imagine that, at time \(t\), a particle is at
\[ \vec r(t) = \Big[h+a\cos\Big(2\pi\frac{t}{T}\Big)\Big]\hat\imath + \Big[k+a\sin\Big(2\pi\frac{t}{T}\Big)\Big]\hat\jmath \]
As \(|\vec r(t)-h\,\hat\imath-k\,\hat\jmath| = a\), the particle is running around the circle of radius \(a\) centred on \((h,k)\). When \(t\) increases by \(T\), the argument, \(2\pi\frac{t}{T}\), of \(\cos\big(2\pi\frac{t}{T}\big)\) and \(\sin\big(2\pi\frac{t}{T}\big)\) increases by exactly \(2\pi\) and the particle runs exactly once around the circle. In particular, it travels a distance \(2\pi a\). So it is moving at speed \(\frac{2\pi a}{T}\). According to Lemma 1.6.13, it has
\[ \begin{aligned}
\text{velocity} &= \vec r'(t) = -\frac{2\pi a}{T}\sin\Big(2\pi\frac{t}{T}\Big)\hat\imath + \frac{2\pi a}{T}\cos\Big(2\pi\frac{t}{T}\Big)\hat\jmath \\
\text{speed} &= \frac{ds}{dt}(t) = |\vec r'(t)| = \frac{2\pi a}{T} \\
\text{acceleration} &= \vec r''(t) = -\frac{4\pi^2a}{T^2}\cos\Big(2\pi\frac{t}{T}\Big)\hat\imath - \frac{4\pi^2a}{T^2}\sin\Big(2\pi\frac{t}{T}\Big)\hat\jmath \\
&= -\frac{4\pi^2}{T^2}\big[\vec r(t)-h\,\hat\imath-k\,\hat\jmath\big]
\end{aligned} \]
Here are some observations.

The velocity \(\vec r'(t)\) has dot product zero with \(\vec r(t)-h\,\hat\imath-k\,\hat\jmath\), which is the radius vector from the centre of the circle to the particle. So the velocity is perpendicular to the radius vector, and hence parallel to the tangent vector of the circle at \(\vec r(t)\).

The speed given by Lemma 1.6.13 is exactly the speed we found above, just before we started applying Lemma 1.6.13.

The acceleration \(\vec r''(t)\) points in the direction opposite to the radius vector.
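
The three observations can be confirmed numerically for one concrete choice of constants. In the sketch below the values \(h=1\), \(k=2\), \(a=3\), \(T=4\) and the time \(t=0.9\) are arbitrary assumptions, and the function name `motion` is hypothetical.

```python
import math

def motion(t, h=1.0, k=2.0, a=3.0, T=4.0):
    """Position, velocity and acceleration of the particle at time t."""
    w = 2 * math.pi / T  # angular rate, so the argument of cos/sin is w*t
    pos = (h + a * math.cos(w * t), k + a * math.sin(w * t))
    vel = (-a * w * math.sin(w * t), a * w * math.cos(w * t))
    acc = (-a * w * w * math.cos(w * t), -a * w * w * math.sin(w * t))
    return pos, vel, acc

pos, vel, acc = motion(0.9)
radius = (pos[0] - 1.0, pos[1] - 2.0)        # r(t) - h i - k j
speed = math.hypot(*vel)                     # should equal 2*pi*a/T
v_dot_radius = vel[0] * radius[0] + vel[1] * radius[1]   # should be ~0
w = 2 * math.pi / 4.0
# acceleration + (4 pi^2 / T^2) * radius vector should vanish
acc_plus_w2_radius = (acc[0] + w * w * radius[0], acc[1] + w * w * radius[1])
print(speed, v_dot_radius, acc_plus_w2_radius)
```

The last quantity being zero is exactly the statement that the acceleration is a negative multiple of the radius vector.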

Exercises
Stage 1
Questions 1.6.2.1 through 1.6.2.5 provide practice with curve parametrization. Being comfortable with the algebra and interpretation of these descriptions is an essential ingredient in working effectively with parametrizations.

 1

Consider the following time-parametrized curve:
\[ \vec r(t) = \Big(\cos\Big(\frac{\pi}{4}t\Big)\,,\,(t-5)^2\Big) \]
List the three points \((-1/\sqrt2, 0)\), \((1, 25)\), and \((0, 25)\) in chronological order.

 2

At what points in the \(xy\)-plane does the curve \((\sin t, t^2)\) cross itself? What is the difference in \(t\) between the first time the curve crosses through a point, and the last?

 3

Find the specified parametrization of the first quadrant part of the circle \(x^2+y^2=a^2\).

1. In terms of the y coordinate.


2. In terms of the angle between the tangent line and the positive x-axis.
3. In terms of the arc length from (0, a).

 4

A circle of radius a rolls along the x-axis in the positive direction, starting with its centre at (a, a). In that position, we mark
the topmost point on the circle P . As the circle moves, P moves with it. Let θ be the angle the circle has rolled - see the
diagram below.
1. Give the position of the centre of the circle as a function of θ.
2. Give the position of P as a function of θ.

 5

The curve \(C\) is defined to be the intersection of the ellipsoid
\[ x^2-\frac{1}{4}y^2+3z^2=1 \]
and the plane
\[ x+y+z=0. \]

When y is very close to 0, and z is negative, find an expression giving z in terms of y.

 6

A particle traces out a curve in space, so that its position at time \(t\) is
\[ \vec r(t) = e^{-t}\,\hat\imath + \frac{1}{t}\,\hat\jmath + (t-1)^2(t-3)^2\,\hat k \]
for \(t>0\).
Let the positive z axis point vertically upwards, as usual. When is the particle moving upwards, and when is it moving
downwards? Is it moving faster at time t = 1 or at time t = 3?

 7
Below is the graph of the parametrized function \(\vec r(t)\). Let \(s(t)\) be the arclength along the curve from \(\vec r(0)\) to \(\vec r(t)\).

Indicate on the graph \(s(t+h)-s(t)\) and \(\vec r(t+h)-\vec r(t)\). Are the quantities scalars or vectors?

 8
What is the relationship between velocity and speed in a vector-valued function of time?

9 ✳

Let \(\vec r(t)\) be a vector valued function. Let \(\vec r'\), \(\vec r''\), and \(\vec r'''\) denote \(\frac{d\vec r}{dt}\), \(\frac{d^2\vec r}{dt^2}\), and \(\frac{d^3\vec r}{dt^3}\), respectively. Express
\[ \frac{d}{dt}\big[(\vec r\times\vec r')\cdot\vec r''\big] \]
in terms of \(\vec r\), \(\vec r'\), \(\vec r''\), and \(\vec r'''\). Select the correct answer.
1. \((\vec r'\times\vec r'')\cdot\vec r'''\)
2. \((\vec r'\times\vec r'')\cdot\vec r + (\vec r\times\vec r')\cdot\vec r'''\)
3. \((\vec r\times\vec r')\cdot\vec r'''\)
4. \(0\)
5. None of the above.

Stage 2

 10 ✳

Find the speed of a particle with the given position function
\[ \vec r(t) = 5\sqrt2\,t\,\hat\imath + e^{5t}\,\hat\jmath - e^{-5t}\,\hat k \]
Select the correct answer:
1. \(|\vec v(t)| = \big(e^{5t}+e^{-5t}\big)\)
2. \(|\vec v(t)| = \sqrt{10+5e^{t}+5e^{-t}}\)
3. \(|\vec v(t)| = \sqrt{10+e^{10t}+e^{-10t}}\)
4. \(|\vec v(t)| = 5\big(e^{5t}+e^{-5t}\big)\)
5. \(|\vec v(t)| = 5\big(e^{t}+e^{-t}\big)\)

 11

Find the velocity, speed and acceleration at time \(t\) of the particle whose position is \(\vec r(t)\). Describe the path of the particle.
1. \(\vec r(t) = a\cos t\,\hat\imath + a\sin t\,\hat\jmath + ct\,\hat k\)
2. \(\vec r(t) = a\cos t\sin t\,\hat\imath + a\sin^2 t\,\hat\jmath + a\cos t\,\hat k\)

 12 ✳
1. Let
\[ \vec r(t) = \Big(t^2\,,\,3\,,\,\frac{1}{3}t^3\Big) \]
Find the unit tangent vector to this parametrized curve at \(t=1\), pointing in the direction of increasing \(t\).
2. Find the arc length of the curve from (a) between the points \((0,3,0)\) and \((1,3,-\frac{1}{3})\).

 13


Using Lemma 1.6.12, find the arclength of \(\vec r(t) = \Big(t\,,\,\sqrt{\frac{3}{2}}\,t^2\,,\,t^3\Big)\) from \(t=0\) to \(t=1\).

 14
A particle's position at time \(t\) is given by \(\vec r(t) = (t+\sin t, \cos t)\). What is the magnitude of the acceleration of the particle at time \(t\)? (The particle traces out a cycloid; see Question 1.6.2.4.)

 15 ✳
A curve in \(\mathbb{R}^3\) is given by the vector equation \(\vec r(t) = \Big(2t\cos t\,,\,2t\sin t\,,\,\frac{t^3}{3}\Big)\)

1. Find the length of the curve between \(t=0\) and \(t=2\).
2. Find the parametric equations of the tangent line to the curve at \(t=\pi\).

 16 ✳
Let \(\vec r(t) = (3\cos t, 3\sin t, 4t)\) be the position vector of a particle as a function of time \(t\ge 0\).

1. Find the velocity of the particle as a function of time t.


2. Find the arclength of its path between t = 1 and t = 2.

 17 ✳
Consider the curve
\[ \vec r(t) = \frac{1}{3}\cos^3 t\,\hat\imath + \frac{1}{3}\sin^3 t\,\hat\jmath + \sin^3 t\,\hat k \]
1. Compute the arc length of the curve from \(t=0\) to \(t=\frac{\pi}{2}\).
2. Compute the arc length of the curve from \(t=0\) to \(t=\pi\).

 18 ✳

Let \(\vec r(t) = \Big(\frac{1}{3}t^3\,,\,\frac{1}{2}t^2\,,\,\frac{1}{2}t\Big)\), \(t\ge 0\). Compute \(s(t)\), the arclength of the curve at time \(t\).

 19 ✳

Find the arc length of the curve \(\vec r(t) = \big(t^m\,,\,t^m\,,\,t^{3m/2}\big)\) for \(0\le a\le t\le b\), and where \(m>0\). Express your result in terms of \(m\), \(a\), and \(b\).

 20

If a particle has constant mass \(m\), position \(\vec r\), and is moving with velocity \(\vec v\), then its angular momentum is \(\mathbf{L} = m(\vec r\times\vec v)\).

For a particle with mass \(m=1\) and position function \(\vec r = (\sin t, \cos t, t)\), find \(\left|\frac{\mathrm{d}\mathbf{L}}{\mathrm{d}t}\right|\).

 21 ✳

Consider the space curve \(\Gamma\) whose vector equation is
\[ \vec r(t) = t\sin(\pi t)\,\hat\imath + t\cos(\pi t)\,\hat\jmath + t^2\,\hat k \qquad 0\le t<\infty \]
This curve starts from the origin and eventually reaches the ellipsoid \(E\) whose equation is \(2x^2+2y^2+z^2=24\).

1. Determine the coordinates of the point P where Γ intersects E.


2. Find the tangent vector of Γ at the point P .
3. Does Γ intersect E at right angles? Why or why not?

 22 ✳

Suppose a particle in 3-dimensional space travels with position vector \(\vec r(t)\), which satisfies \(\vec r''(t) = -\vec r(t)\). Show that the “energy” \(|\vec r(t)|^2+|\vec r'(t)|^2\) is constant (that is, independent of \(t\)).

Stage 3

 23 ✳

A particle moves along the curve \(\mathcal{C}\) of intersection of the surfaces \(z^2=12y\) and \(18x=yz\) in the upward direction. When the particle is at \((1,3,6)\) its velocity \(\vec v\) and acceleration \(\vec a\) are given by
\[ \vec v = 6\,\hat\imath+12\,\hat\jmath+12\,\hat k \qquad \vec a = 27\,\hat\imath+30\,\hat\jmath+6\,\hat k \]
1. Write a vector parametric equation for \(\mathcal{C}\) using \(u=\frac{z}{6}\) as a parameter.
2. Find the length of \(\mathcal{C}\) from \((0,0,0)\) to \((1,3,6)\).
3. If \(u=u(t)\) is the parameter value for the particle's position at time \(t\), find \(\frac{du}{dt}\) when the particle is at \((1,3,6)\).
4. Find \(\frac{d^2u}{dt^2}\) when the particle is at \((1,3,6)\).

 24 ✳
A particle of mass \(m=1\) has position \(\vec r_0 = \frac{1}{2}\,\hat k\) and velocity \(\vec v_0 = \frac{\pi^2}{2}\,\hat\imath\) at time \(0\). It moves under a force
\[ \mathbf{F}(t) = -3t\,\hat\imath + \sin t\,\hat\jmath + 2e^{2t}\,\hat k. \]
1. Determine the position \(\vec r(t)\) of the particle depending on \(t\).
2. At what time after time \(t=0\) does the particle cross the plane \(x=0\) for the first time?
3. What is the velocity of the particle when it crosses the plane \(x=0\) in part (b)?

 25 ✳

Let \(C\) be the curve of intersection of the surfaces \(y=x^2\) and \(z=\frac{2}{3}x^3\). A particle moves along \(C\) with constant speed such that \(\frac{dx}{dt}>0\). The particle is at \((0,0,0)\) at time \(t=0\) and is at \((3,9,18)\) at time \(t=\frac{7}{2}\).

1. Find the length of the part of \(C\) between \((0,0,0)\) and \((3,9,18)\).
2. Find the constant speed of the particle.
3. Find the velocity of the particle when it is at \(\big(1,1,\frac{2}{3}\big)\).
4. Find the acceleration of the particle when it is at \(\big(1,1,\frac{2}{3}\big)\).

 26

A camera mounted to a pole can swivel around in a full circle. It is tracking an object whose position at time t seconds is x(t)
metres east of the pole, and y(t) metres north of the pole.
In order to always be pointing directly at the object, how fast should the camera be programmed to rotate at time t? (Give your
answer in terms of x(t) and y(t) and their derivatives, in the units rad/sec.)

 27
A projectile falling under the influence of gravity and slowed by air resistance proportional to its speed has position satisfying
\[ \frac{d^2\vec r}{dt^2} = -g\,\hat k - \alpha\frac{d\vec r}{dt} \]
where \(\alpha\) is a positive constant. If \(\vec r=\vec r_0\) and \(\frac{d\vec r}{dt}=\vec v_0\) at time \(t=0\), find \(\vec r(t)\). (Hint: Define \(\vec u(t) = e^{\alpha t}\frac{d\vec r}{dt}(t)\) and substitute \(\frac{d\vec r}{dt}(t) = e^{-\alpha t}\vec u(t)\) into the given differential equation to find a differential equation for \(\vec u\).)

 28 ✳

At time \(t=0\) a particle has position and velocity vectors \(\vec r(0) = \langle -1,0,0\rangle\) and \(\vec v(0) = \langle 0,-1,1\rangle\). At time \(t\), the particle has acceleration vector
\[ \vec a(t) = \langle\cos t,\sin t,0\rangle \]

1. Find the position of the particle after t seconds.


2. Show that the velocity and acceleration of the particle are always perpendicular for every t.
3. Find the equation of the tangent line to the particle's path at t = −π/2.
4. True or False: None of the lines tangent to the path of the particle pass through (0, 0, 0). Justify your answer.

 29 ✳

The position of a particle at time \(t\) (measured in seconds s) is given by
\[ \vec r(t) = t\cos\Big(\frac{\pi t}{2}\Big)\hat\imath + t\sin\Big(\frac{\pi t}{2}\Big)\hat\jmath + t\,\hat k \]
1. Show that the path of the particle lies on the cone \(z^2=x^2+y^2\).
2. Find the velocity vector and the speed at time \(t\).


3. Suppose that at time t = 1 s the particle flies off the path on a line L in the direction tangent to the path. Find the equation
of the line L.
4. How long does it take for the particle to hit the plane x = −1 after it started moving along the straight line L?

 30 ✳
1. The curves \(\vec r_1(t) = \langle 1+t, t^2, t^3\rangle\) and \(\vec r_2(t) = \langle\cos t,\sin t,t\rangle\) intersect at the point \(P(1,0,0)\). Find the angle of intersection between the curves at the point \(P\).
2. Find the distance between the line of intersection of the planes \(x+y-z=4\) and \(2x-z=4\) and the line \(\vec r(t) = \langle t,-1+2t,1+3t\rangle\).

1. When we say \(\vec r(t) = \big(x(t),y(t),z(t)\big)\), we mean that \(\big(x(t),y(t),z(t)\big)\) is the point at the head of the vector \(\vec r(t)\) when its tail is at the origin.
2. We of course assume that the constant a > 0.
3. We of course assume that the constant a > 0.
4. You might guess that Θ is a capital Greek theta. You'd be right.
5. The earliest known written approximations of π in Egypt and Babylon, date from 1900–1600BC. The first recorded algorithm
for rigorously evaluating π was developed by Archimedes around 250 BC. The first use of the symbol π for the ratio between
the circumference of a circle and its diameter, in print was in 1706 by William Jones.
6. It is Proposition 18 in Book 3 of Euclid's Elements. It was published around 300BC.

This page titled 1.6: Curves and their Tangent Vectors is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by
Joel Feldman, Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a
detailed edit history is available upon request.

1.7: Sketching Surfaces in 3d
In practice students taking multivariable calculus regularly have great difficulty visualising surfaces in three dimensions, despite
the fact that we all live in three dimensions. We'll now develop some technique to help us sketch surfaces in three dimensions [1].
We all have a fair bit of experience drawing curves in two dimensions. Typically the intersection of a surface (in three dimensions)
with a plane is a curve lying in the (two dimensional) plane. Such an intersection is usually called a cross-section. In the special
case that the plane is one of the coordinate planes, the intersection is sometimes called a trace. One can often get a pretty good idea
of what a surface looks like by sketching a bunch of cross-sections. Here are some examples.

Example 1.7.1. \(4x^2+y^2-z^2=1\)

Sketch the surface that satisfies \(4x^2+y^2-z^2=1\).
Solution
We'll start by fixing any number \(z_0\) and sketching the part of the surface that lies in the horizontal plane \(z=z_0\).

The intersection of our surface with that horizontal plane is a horizontal cross-section. Any point \((x,y,z)\) lying on that horizontal cross-section satisfies both
\[ \begin{aligned}
& z=z_0 \text{ and } 4x^2+y^2-z^2=1 \\
\iff\ & z=z_0 \text{ and } 4x^2+y^2=1+z_0^2
\end{aligned} \]
Think of \(z_0\) as a constant. Then \(4x^2+y^2=1+z_0^2\) is a curve in the \(xy\)-plane. As \(1+z_0^2\) is a positive constant, the curve is an ellipse. To determine its semi-axes [2], we observe that when \(y=0\), we have \(x=\pm\frac{1}{2}\sqrt{1+z_0^2}\) and when \(x=0\), we have \(y=\pm\sqrt{1+z_0^2}\). So the curve is just an ellipse with \(x\) semi-axis \(\frac{1}{2}\sqrt{1+z_0^2}\) and \(y\) semi-axis \(\sqrt{1+z_0^2}\). It's easy to sketch.

Remember that this ellipse is the part of our surface that lies in the plane \(z=z_0\). Imagine that the sketch of the ellipse is on a single sheet of paper. Lift the sheet of paper up, move it around so that the \(x\)- and \(y\)-axes point in the directions of the three dimensional \(x\)- and \(y\)-axes and place the sheet of paper into the three dimensional sketch at height \(z_0\). This gives a single horizontal ellipse in 3d, as in the figure below.

We can build up the full surface by stacking many of these horizontal ellipses, one for each possible height \(z_0\). So we now draw a few of them as in the figure below. To reduce the amount of clutter in the sketch, we have only drawn the first octant (i.e. the part of three dimensions that has \(x\ge 0\), \(y\ge 0\) and \(z\ge 0\)).

Here is why it is OK, in this case, to just sketch the first octant. Replacing \(x\) by \(-x\) in the equation \(4x^2+y^2-z^2=1\) does not change the equation. That means that a point \((x,y,z)\) is on the surface if and only if the point \((-x,y,z)\) is on the surface. So the surface is invariant under reflection in the \(yz\)-plane. Similarly, the equation \(4x^2+y^2-z^2=1\) does not change when \(y\) is replaced by \(-y\) or \(z\) is replaced by \(-z\). So our surface is also invariant under reflection in the \(xz\)- and \(xy\)-planes. Once we have the part in the first octant, the remaining octants can be gotten simply by reflecting about the coordinate planes.
We can get a more visually meaningful sketch by adding in some vertical cross-sections. The \(x=0\) and \(y=0\) cross-sections (also called traces: they are the parts of our surface that are in the \(yz\)- and \(xz\)-planes, respectively) are
\[ x=0,\ y^2-z^2=1 \qquad\text{and}\qquad y=0,\ 4x^2-z^2=1 \]
These equations describe hyperbolae [3]. If you don't remember how to sketch them, don't worry. We'll do it now. We'll first sketch them in 2d. Since
\[ \begin{aligned}
y^2=1+z^2 &\implies |y|\ge 1 \\
&\text{ and } y=\pm 1 \text{ when } z=0 \\
&\text{ and for large } z,\ y\approx\pm z \\
4x^2=1+z^2 &\implies |x|\ge\frac{1}{2} \\
&\text{ and } x=\pm\frac{1}{2} \text{ when } z=0 \\
&\text{ and for large } z,\ x\approx\pm\frac{1}{2}z
\end{aligned} \]
the sketches are

Now we'll incorporate them into the 3d sketch. Once again imagine that each is a single sheet of paper. Pick each up and move
it into the 3d sketch, carefully matching up the axes. The red (blue) parts of the hyperbolas above become the red (blue) parts
of the 3d sketch below (assuming of course that you are looking at this on a colour screen).

Now that we have a pretty good idea of what the surface looks like we can clean up and simplify the sketch. Here are a couple
of possibilities.

Here are two figures created by graphing software.

This type of surface is called a hyperboloid of one sheet.


There are also hyperboloids of two sheets. For example, replacing the \(+1\) on the right hand side of \(x^2+y^2-z^2=1\) with \(-1\) gives \(x^2+y^2-z^2=-1\), which is a hyperboloid of two sheets. We'll sketch it quickly in the next example.

Example 1.7.2. \(4x^2+y^2-z^2=-1\)

Sketch the surface that satisfies \(4x^2+y^2-z^2=-1\).

Solution
As in the last example, we'll start by fixing any number \(z_0\) and sketching the part of the surface that lies in the horizontal plane \(z=z_0\). The intersection of our surface with that horizontal plane is
\[ z=z_0 \text{ and } 4x^2+y^2=z_0^2-1 \]
Think of \(z_0\) as a constant.

If \(|z_0|<1\), then \(z_0^2-1<0\) and there are no solutions to \(4x^2+y^2=z_0^2-1\).

If \(|z_0|=1\) there is exactly one solution, namely \(x=y=0\).

If \(|z_0|>1\) then \(4x^2+y^2=z_0^2-1\) is an ellipse with \(x\) semi-axis \(\frac{1}{2}\sqrt{z_0^2-1}\) and \(y\) semi-axis \(\sqrt{z_0^2-1}\). These semi-axes are small when \(|z_0|\) is close to \(1\) and grow as \(|z_0|\) increases.

The first octant parts of a few of these horizontal cross-sections are drawn in the figure below.

Next we add in the \(x=0\) and \(y=0\) cross-sections (i.e. the parts of our surface that are in the \(yz\)- and \(xz\)-planes, respectively)
\[ x=0,\ z^2=1+y^2 \qquad\text{and}\qquad y=0,\ z^2=1+4x^2 \]

Now that we have a pretty good idea of what the surface looks like we clean up and simplify the sketch.
Here are two figures created by graphing software.

This type of surface is called a hyperboloid of two sheets.
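
The horizontal cross-section analysis of Examples 1.7.1 and 1.7.2 follows one pattern: the slice \(z=z_0\) of \(4x^2+y^2-z^2=c\) is the curve \(4x^2+y^2=c+z_0^2\). The Python sketch below classifies that slice and reports the semi-axes when it is an ellipse. The function name and the tuple it returns are assumptions made for illustration.

```python
import math

def horizontal_cross_section(c, z0):
    """Describe the slice z = z0 of the surface 4x^2 + y^2 - z^2 = c.

    Returns (kind, x_semi_axis, y_semi_axis), where kind is one of
    "empty", "point" or "ellipse".
    """
    rhs = c + z0 * z0  # the slice is the curve 4x^2 + y^2 = rhs
    if rhs < 0:
        return ("empty", None, None)
    if rhs == 0:
        return ("point", 0.0, 0.0)
    return ("ellipse", 0.5 * math.sqrt(rhs), math.sqrt(rhs))

# c = 1: hyperboloid of one sheet, every height gives an ellipse
print(horizontal_cross_section(1, 0))
# c = -1: hyperboloid of two sheets, nothing for |z0| < 1
for z0 in (0, 1, 2):
    print(z0, horizontal_cross_section(-1, z0))
```

With \(c=1\) every height gives an ellipse, which is why that surface is connected (one sheet). With \(c=-1\) the slices with \(|z_0|<1\) are empty, which separates the surface into an upper and a lower sheet.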

 Example 1.7.3. yz = 1
Sketch the surface yz = 1.
Solution

This surface has a special property that makes it relatively easy to sketch. There are no \(x\)'s in the equation \(yz=1\). That means that if some \(y_0\) and \(z_0\) obey \(y_0z_0=1\), then the point \((x,y_0,z_0)\) lies on the surface \(yz=1\) for all values of \(x\). As \(x\) runs from \(-\infty\) to \(\infty\), the point \((x,y_0,z_0)\) sweeps out a straight line parallel to the \(x\)-axis. So the surface \(yz=1\) is a union of lines parallel to the \(x\)-axis. It is invariant under translations parallel to the \(x\)-axis. To sketch \(yz=1\), we just need to sketch its intersection with the \(yz\)-plane and then translate the resulting curve parallel to the \(x\)-axis to sweep out the surface.
We'll start with a sketch of the hyperbola yz = 1 in two dimensions.

Next we'll move this 2d sketch into the yz-plane, i.e. the plane x = 0, in 3d, except that we'll only draw in the part in the first
octant.

Then we'll draw in $x = x_0$ cross-sections for a couple more values of $x_0$

and clean up the sketch a bit


Here are two figures created by graphing software.

 Example 1.7.4. xyz = 4

Sketch the surface xyz = 4.


Solution
We'll sketch this surface using much the same procedure as we used in Examples 1.7.1 and 1.7.2. We'll only sketch the part of
the surface in the first octant. The remaining parts (in the octants with x, y < 0, z ≥ 0, with x, z < 0, y ≥ 0 and with
y, z < 0, x ≥ 0 ) are just reflections of the first octant part.

As usual, we start by fixing any number $z_0$ and sketching the part of the surface that lies in the horizontal plane $z = z_0$. The
intersection of our surface with that horizontal plane is the hyperbola

$$z = z_0 \quad\text{and}\quad xy = \frac{4}{z_0}$$

Note that $x \to \infty$ as $y \to 0$ and that $y \to \infty$ as $x \to 0$. So the hyperbola has both the $x$-axis and the $y$-axis as asymptotes,
when drawn in the $xy$-plane. The first octant parts of a few of these horizontal cross-sections (namely, $z_0 = 4$, $z_0 = 2$ and
$z_0 = \frac{1}{2}$) are drawn in the figure below.

Next we add some vertical cross-sections. We can't use $x = 0$ or $y = 0$ because any point on $xyz = 4$ must have all of $x, y, z$
nonzero. So we use

$$x = 4,\ yz = 1 \quad\text{and}\quad y = 4,\ xz = 1$$

instead. They are again hyperbolae.

Finally, we clean up and simplify the sketch.

Here are two figures created by graphing software.

Level Curves and Surfaces


Often the reason you are interested in a surface in 3d is that it is the graph z = f (x, y) of a function of two variables f (x, y).
Another good way to visualize the behaviour of a function f (x, y) is to sketch what are called its level curves. By definition, a level
curve of f (x, y) is a curve whose equation is f (x, y) = C , for some constant C . It is the set of points in the xy-plane where f
takes the value C . Because it is a curve in 2d, it is usually easier to sketch than the graph of f . Here are a couple of examples.

Example 1.7.5. $f(x, y) = x^2 + 4y^2 - 2x + 2$

Sketch the level curves of $f(x, y) = x^2 + 4y^2 - 2x + 2$.

Solution
Fix any real number C . Then, for the specified function f , the level curve f (x, y) = C is the set of points (x, y) that obey

$$\begin{aligned}
x^2 + 4y^2 - 2x + 2 = C &\iff x^2 - 2x + 1 + 4y^2 + 1 = C \\
&\iff (x - 1)^2 + 4y^2 = C - 1
\end{aligned}$$

Now $(x - 1)^2 + 4y^2$ is the sum of two squares, and so is always at least zero. So if $C - 1 < 0$, i.e. if $C < 1$, there is no curve
$f(x, y) = C$. If $C - 1 = 0$, i.e. if $C = 1$, then $f(x, y) = C$ if and only if both $(x - 1)^2 = 0$ and $4y^2 = 0$ and so the
level curve consists of the single point $(1, 0)$. If $C > 1$, then $f(x, y) = C$ becomes $(x - 1)^2 + 4y^2 = C - 1 > 0$ which
describes an ellipse centred on $(1, 0)$. It intersects the $x$-axis when $y = 0$ and

$$(x - 1)^2 = C - 1 \iff x - 1 = \pm\sqrt{C - 1} \iff x = 1 \pm \sqrt{C - 1}$$

and it intersects the line $x = 1$ (i.e. the vertical line through the centre) when

$$4y^2 = C - 1 \iff 2y = \pm\sqrt{C - 1} \iff y = \pm\frac{1}{2}\sqrt{C - 1}$$

So, when $C > 1$, $f(x, y) = C$ is the ellipse centred on $(1, 0)$ with $x$ semi-axis $\sqrt{C - 1}$ and $y$ semi-axis $\frac{1}{2}\sqrt{C - 1}$. Here is a
sketch of some representative level curves of $f(x, y) = x^2 + 4y^2 - 2x + 2$.

It is often easier to develop an understanding of the behaviour of a function f (x, y) by looking at a sketch of its level curves,
than it is by looking at a sketch of its graph. On the other hand, you can also use a sketch of the level curves of f (x, y) as the
first step in building a sketch of the graph z = f (x, y). The next step would be to redraw, for each C , the level curve
f (x, y) = C , in the plane z = C , as we did in Example 1.7.1.
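As a sanity check on the description of the level curves in Example 1.7.5, one can sample points on the claimed ellipse and confirm that $f$ really takes the value $C$ there. This is a sketch under the assumption that the ellipse is parametrized as $x = 1 + \sqrt{C-1}\cos t$, $y = \frac{1}{2}\sqrt{C-1}\sin t$; the helper name `level_curve_points` is hypothetical.

```python
import math


def f(x, y):
    """The function of Example 1.7.5."""
    return x**2 + 4 * y**2 - 2 * x + 2


def level_curve_points(C, n=8):
    """Sample n points on the level curve f(x, y) = C.

    Returns [] when C < 1 (no curve), [(1.0, 0.0)] when C == 1 (a single
    point), and n points on the ellipse centred on (1, 0) when C > 1.
    """
    if C < 1:
        return []
    if C == 1:
        return [(1.0, 0.0)]
    a = math.sqrt(C - 1)   # x semi-axis
    b = 0.5 * a            # y semi-axis
    angles = (2 * math.pi * k / n for k in range(n))
    return [(1 + a * math.cos(t), b * math.sin(t)) for t in angles]


if __name__ == "__main__":
    for C in (0, 1, 2, 5):
        pts = level_curve_points(C)
        print(C, [round(f(x, y), 12) for (x, y) in pts])
```

Redrawing each sampled curve at height $z = C$ is the "next step" described above for building the graph $z = f(x, y)$.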

Example 1.7.6. $e^{x+y+z} = 1$

The function $f(x, y)$ is given implicitly by the equation $e^{x+y+z} = 1$. Sketch the level curves of $f$.
Solution
This one is not as nasty as it appears. That “$f(x, y)$ is given implicitly by the equation $e^{x+y+z} = 1$” means that, for each $x, y$,
the solution $z$ of $e^{x+y+z} = 1$ is $f(x, y)$. So, for the specified function $f$ and any fixed real number $C$, the level curve
$f(x, y) = C$ is the set of points $(x, y)$ that obey

$$\begin{aligned}
e^{x+y+C} = 1 &\iff x + y + C = 0 \qquad \text{(by taking the logarithm of both sides)} \\
&\iff x + y = -C
\end{aligned}$$

This is of course a straight line. It intersects the x-axis when y = 0 and x = −C and it intersects the y -axis when x = 0 and
y = −C . Here is a sketch of some level curves.

We have just seen that sketching the level curves of a function f (x, y) can help us understand the behaviour of f . We can
generalise this to functions F (x, y, z) of three variables. A level surface of F (x, y, z) is a surface whose equation is of the form
F (x, y, z) = C for some constant C . It is the set of points (x, y, z) at which F takes the value C .

Example 1.7.7. $F(x, y, z) = x^2 + y^2 + z^2$

Let $F(x, y, z) = x^2 + y^2 + z^2$. If $C > 0$, then the level surface $F(x, y, z) = C$ is the sphere of radius $\sqrt{C}$ centred on the
origin. Here is a sketch of the parts of the level surfaces $F = 1$ (radius 1), $F = 4$ (radius 2) and $F = 9$ (radius 3) that are in
the first octant.

Example 1.7.8. $F(x, y, z) = x^2 + z^2$

Let $F(x, y, z) = x^2 + z^2$ and $C > 0$. Consider the level surface $x^2 + z^2 = C$. The variable $y$ does not appear in this
equation. So for any fixed $y_0$, the intersection of our surface $x^2 + z^2 = C$ with the plane $y = y_0$ is the circle of radius $\sqrt{C}$
centred on $x = z = 0$. Here is a sketch of the first quadrant part of one such circle.

The full surface is the horizontal stack of all of those circles with $y_0$ running over $\mathbb{R}$. It is the cylinder of radius $\sqrt{C}$ centred on
the $y$-axis. Here is a sketch of the parts of the level surfaces $F = 1$ (radius 1), $F = 4$ (radius 2) and $F = 9$ (radius 3) that are
in the first octant.

Example 1.7.9. $F(x, y, z) = e^{x+y+z}$

Let $F(x, y, z) = e^{x+y+z}$ and $C > 0$. Consider the level surface $e^{x+y+z} = C$, or equivalently, $x + y + z = \ln C$. It is the plane
that contains the intercepts $(\ln C, 0, 0)$, $(0, \ln C, 0)$ and $(0, 0, \ln C)$. Here is a sketch of the parts of the level surfaces
$F = e$ (intercepts $(1, 0, 0)$, $(0, 1, 0)$, $(0, 0, 1)$),
$F = e^2$ (intercepts $(2, 0, 0)$, $(0, 2, 0)$, $(0, 0, 2)$) and
$F = e^3$ (intercepts $(3, 0, 0)$, $(0, 3, 0)$, $(0, 0, 3)$)
that are in the first octant.

Exercises
Stage 1

 1. ✳

Match the following equations and expressions with the corresponding pictures. Cartesian coordinates are (x, y, z), cylindrical
coordinates are $(r, \theta, z)$, and spherical coordinates are $(\rho, \theta, \varphi)$.

(A) (B) (C)

(D) (E) (F)

(a) $\varphi = \pi/3$    (b) $r = 2\cos\theta$    (c) $x^2 + y^2 = z^2 + 1$

(d) $y = x^2 + z^2$    (e) $\rho = 2\cos\varphi$    (f) $z = x^4 + y^4 - 4xy$

 2

In each of (a) and (b) below, you are provided with a sketch of the first quadrant parts of a few level curves of some function
f (x, y). Sketch the first octant part of the corresponding graph z = f (x, y).

(a) (b)

 3

Sketch a few level curves for the function f (x, y) whose graph z = f (x, y) is sketched below.

Stage 2

 4

Sketch some of the level curves of


1. $f(x, y) = x^2 + 2y^2$
2. $f(x, y) = xy$
3. $f(x, y) = xe^{-y}$

 5. ✳
Sketch the level curves of $f(x, y) = \frac{2y}{x^2 + y^2}$.

 6. ✳

Draw a “contour map” of $f(x, y) = e^{-x^2 + 4y^2}$, showing all types of level curves that occur.

 7. ✳

A surface is given implicitly by


$$x^2 + y^2 - z^2 + 2z = 0$$

1. Sketch several level curves z = constant.


2. Draw a rough sketch of the surface.

 8. ✳

Sketch the hyperboloid $z^2 = 4x^2 + y^2 - 1$.

 9
Describe the level surfaces of
1. $f(x, y, z) = x^2 + y^2 + z^2$
2. $f(x, y, z) = x + 2y + 3z$
3. $f(x, y, z) = x^2 + y^2$

 10

Sketch the graphs of


1. $f(x, y) = \sin x$, $0 \le x \le 2\pi$, $0 \le y \le 1$
2. $f(x, y) = \sqrt{x^2 + y^2}$
3. $f(x, y) = |x| + |y|$

 11

Sketch and describe the following surfaces.


1. $4x^2 + y^2 = 16$
2. $x + y + 2z = 4$
3. $\frac{y^2}{9} + \frac{z^2}{4} = 1 + \frac{x^2}{16}$
4. $y^2 = x^2 + z^2$
5. $\frac{x^2}{9} + \frac{y^2}{12} + \frac{z^2}{9} = 1$
6. $x^2 + y^2 + z^2 + 4x - by + 9z - b = 0$ where $b$ is a constant.
7. $\frac{x}{4} = \frac{y^2}{4} + \frac{z^2}{9}$
8. $z = x^2$

Stage 3

 12

The surface below has circular level curves, centred along the z -axis. The lines given are the intersection of the surface with
the right half of the yz-plane. Give an equation for the surface.

1. Of course you could instead use some fancy graphing software, but part of the point is to build intuition. Not to mention that
you can't use fancy graphing software on your exam.
2. The semi-axes of an ellipse are the line segments from the centre of the ellipse to the farthest points on the ellipse and to the
nearest points on the ellipse. For a circle the lengths of all of these line segments are just the radius.
3. It's not just a figure of speech!

This page titled 1.7: Sketching Surfaces in 3d is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Joel
Feldman, Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a detailed
edit history is available upon request.

1.8: Cylinders
There are some classes of relatively simple, but commonly occurring, surfaces that are given their own names. One such class is
cylindrical surfaces. You are probably used to thinking of a cylinder as being something that looks like $x^2 + y^2 = 1$.

In Mathematics, the word “cylinder” is given a more general meaning.

 Definition 1.8.1. Cylinder


A cylinder is a surface that consists of all points that are on all lines that are
parallel to a given line and
pass through a given fixed curve, that lies in a fixed plane that is not parallel to the given line.

 Example 1.8.2

Here are sketches of three cylinders. The familiar cylinder on the left below

is called a right circular cylinder, because the given fixed curve ($x^2 + y^2 = 1$, $z = 0$) is a circle and the given line (the $z$-axis)
is perpendicular (i.e. at right angles) to the fixed curve.
The cylinder on the left above can be thought of as a vertical stack of circles. The cylinder on the right above can also be
thought of as a stack of circles, but the centre of the circle at height z has been shifted rightward to (0, z, z). For that cylinder,
the given fixed curve is once again the circle $x^2 + y^2 = 1$, $z = 0$, but the given line is $y = z$, $x = 0$.

We have already seen the third cylinder

in Example 1.7.3. It is called a hyperbolic cylinder. In this example, the given fixed curve is the hyperbola yz = 1, x =0 and
the given line is the x-axis.
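Definition 1.8.1 can be tested on the slanted cylinder of this example. The sketch below rests on an assumption that is not stated in the text: since the circle at height $z$ is centred on $(0, z, z)$ and has radius 1, the surface should satisfy $x^2 + (y - z)^2 = 1$. The code starts from a point of the base circle $x^2 + y^2 = 1$, $z = 0$, moves along the direction $(0, 1, 1)$ of the given line $y = z$, $x = 0$, and checks that the resulting point still satisfies that candidate equation. Both function names are hypothetical.

```python
import math


def point_on_line_through_circle(t, s):
    """Start at (cos t, sin t, 0) on the base circle and move a signed
    amount s along the direction (0, 1, 1) of the line y = z, x = 0."""
    return (math.cos(t), math.sin(t) + s, s)


def on_slanted_cylinder(point, tol=1e-9):
    """Check the candidate equation x^2 + (y - z)^2 = 1 (an assumption
    derived from 'circle at height z centred on (0, z, z)')."""
    x, y, z = point
    return abs(x * x + (y - z) ** 2 - 1.0) < tol


if __name__ == "__main__":
    for t in (0.0, 1.0, 2.5):
        for s in (-3.0, 0.0, 4.0):
            p = point_on_line_through_circle(t, s)
            print(p, on_slanted_cylinder(p))
```

Every line of the family stays on the surface, which is exactly the "union of parallel lines through a fixed curve" picture in the definition.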

This page titled 1.8: Cylinders is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Joel Feldman, Andrew
Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is
available upon request.

1.9: Quadric Surfaces
Another named class of relatively simple, but commonly occurring, surfaces is the quadric surfaces.

 Definition 1.9.1. Quadrics


A quadric surface is a surface that consists of all points that obey $Q(x, y, z) = 0$, with $Q$ being a polynomial of degree two$^{1}$.

For Q(x, y, z) to be a polynomial of degree two, it must be of the form


$$Q(x, y, z) = Ax^2 + By^2 + Cz^2 + Dxy + Eyz + Fxz + Gx + Hy + Iz + J$$

for some constants $A, B, \cdots, J$. Each constant $z$ cross section of a quadric surface has an equation of the form

$$Ax^2 + Dxy + By^2 + gx + hy + j = 0, \qquad z = z_0$$

If $A = B = D = 0$ but $g$ and $h$ are not both zero, this is a straight line. If $A$, $B$, and $D$ are not all zero, then by rotating and
translating our coordinate system the equation of the cross section can be brought into one of the forms$^{2}$
$\alpha x^2 + \beta y^2 = \gamma$ with $\alpha, \beta > 0$, which, if $\gamma > 0$, is an ellipse (or a circle),
$\alpha x^2 - \beta y^2 = \gamma$ with $\alpha, \beta > 0$, which, if $\gamma \ne 0$, is a hyperbola, and if $\gamma = 0$ is two lines,
$x^2 = \delta y$, which, if $\delta \ne 0$ is a parabola, and if $\delta = 0$ is a straight line.
There are similar statements for the constant $x$ cross sections and the constant $y$ cross sections. Hence quadric surfaces are built
by stacking these three types of curves.
We have already seen a number of quadric surfaces in the last couple of sections.
We saw the quadric surface $4x^2 + y^2 - z^2 = 1$ in Example 1.7.1.

Its constant z cross sections are ellipses and its x = 0 and y = 0 cross sections are hyperbolae. It is called a hyperboloid of one
sheet.
We saw the quadric surface $x^2 + y^2 = 1$ in Example 1.8.2.

Its constant z cross sections are circles and its x =0 and y =0 cross sections are straight lines. It is called a right circular
cylinder.
Appendix A.8 contains other quadric surfaces.

1. Technically, we should also require that the polynomial can't be factored into the product of two polynomials of degree one.
2. This statement can be justified using a linear algebra eigenvalue/eigenvector analysis. It is beyond what we can cover here, but
is not too difficult for a standard linear algebra course.

This page titled 1.9: Quadric Surfaces is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Joel Feldman,
Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit
history is available upon request.

CHAPTER OVERVIEW
2: Partial Derivatives
In this chapter we are going to generalize the definition of “derivative” to functions of more than one variable and then we are
going to use those derivatives. We will parallel the development in Chapters 1 and 2 of the CLP-1 text. We shall
define limits and continuity of functions of more than one variable (Definitions 2.1.2 and 2.1.3) and then
study the properties of limits in more than one dimension (Theorem 2.1.5) and then
define derivatives of functions of more than one variable (Definition 2.2.1).
We are going to be able to speed things up considerably by recycling what we have already learned in the CLP-1 text.
We start by generalizing the definition of “limit” to functions of more than one variable.
2.1: Limits
2.2: Partial Derivatives
2.3: Higher Order Derivatives
2.4: The Chain Rule
2.5: Tangent Planes and Normal Lines
2.6: Linear Approximations and Error
2.7: Directional Derivatives and the Gradient
2.8: Optional — Solving the Wave Equation
2.9: Maximum and Minimum Values
2.10: Lagrange Multipliers

This page titled 2: Partial Derivatives is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Joel Feldman,
Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit
history is available upon request.

2.1: Limits
Before we really start, let's recall some useful notation.

 Definition 2.1.1
N is the set {1, 2, 3, ⋯} of all natural numbers.
R is the set of all real numbers.
∈ is read “is an element of”.

∉ is read “is not an element of”.

{A|B} is read “the set of all A such that B ”

If S is a set and T is a subset of S, then S ∖ T is {x ∈ S|x ∉ T } , the set S with the elements of T removed. In particular,
if S is a set and a is an element of S, then S ∖ {a} = {x ∈ S|x ≠  a} is the set S with the element a removed.
If $n$ is a natural number, $\mathbb{R}^n$ is used for both the set of $n$-component vectors $\langle x_1, x_2, \cdots, x_n \rangle$ and the set of points
$(x_1, x_2, \cdots, x_n)$ with $n$ coordinates.

If S and T are sets, then f : S → T means that f is a function which assigns to each element of S an element of T . The set
S is called the domain of f .

[a, b] = {x ∈ R|a ≤ x ≤ b} (a, b] = {x ∈ R|a < x ≤ b}

[a, b) = {x ∈ R|a ≤ x < b} (a, b) = {x ∈ R|a < x < b}

The definition of the limit of a function of more than one variable looks just like the definition$^{1}$ of the limit of a function of one
variable. Very roughly speaking

$$\lim_{\vec{x} \to \vec{a}} f(\vec{x}) = L$$

if f (x⃗ ) approaches L whenever x⃗ approaches a⃗ . Here is a more careful definition of limit.

 Definition 2.1.2. Limit


Let
$m$ and $n$ be natural numbers$^{2}$
$\vec{a} \in \mathbb{R}^m$
the function $f(\vec{x})$ be defined for all $\vec{x}$ near$^{3}$ $\vec{a}$ and take values in $\mathbb{R}^n$
$L \in \mathbb{R}^n$

We write

$$\lim_{\vec{x} \to \vec{a}} f(\vec{x}) = L$$

if$^{4}$ the value of the function $f(\vec{x})$ is sure to be arbitrarily close to $L$ whenever the value of $\vec{x}$ is close enough to $\vec{a}$, without$^{5}$
being exactly $\vec{a}$.

Now that we have extended the definition of limit, we can extend the definition of continuity.

 Definition 2.1.3. Continuity

Let
$m$ and $n$ be natural numbers
$\vec{a} \in \mathbb{R}^m$
the function $f(\vec{x})$ be defined for all $\vec{x}$ near $\vec{a}$ and take values in $\mathbb{R}^n$

1. The function $f$ is continuous at a point $\vec{a}$ if

$$\lim_{\vec{x} \to \vec{a}} f(\vec{x}) = f(\vec{a})$$

2. The function f is continuous on a set D if it is continuous at every point of D.

Here are a few very simple examples. There will be some more substantial examples later — after, as we did in the CLP-1 text, we
build some tools that can be used to build complicated limits from simpler ones.

 Example 2.1.4
1. If $f(x, y)$ is the constant function which always takes the value $L$, then

$$\lim_{(x,y) \to (a,b)} f(x, y) = L$$

2. If $f : \mathbb{R}^2 \to \mathbb{R}^2$ is defined by $f(x, y) = (x, y)$, then

$$\lim_{(x,y) \to (a,b)} f(x, y) = (a, b)$$

3. By definition, as $(x, y)$ approaches $(a, b)$, $x$ approaches $a$ and $y$ approaches $b$, so that if $f : \mathbb{R}^2 \to \mathbb{R}$ is defined by
$f(x, y) = x$, then

$$\lim_{(x,y) \to (a,b)} f(x, y) = a$$

Similarly, if $g : \mathbb{R}^2 \to \mathbb{R}$ is defined by $g(x, y) = y$, then

$$\lim_{(x,y) \to (a,b)} g(x, y) = b$$

Limits of multivariable functions have much the same computational properties as limits of functions of one variable. The
following theorem summarizes a bunch of them. For simplicity, it concerns primarily real valued functions. That is, functions that
output real numbers as opposed to vectors. However it does contain one vector valued function. The function X in the theorem
takes as input an n -component vector and returns an m-component vector. We will not deal with many vector valued functions
here in CLP-3, but we will see a lot in CLP-4.

 Theorem 2.1.5. Arithmetic, and Other, Properties of Limits

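Let
$m$ and $n$ be natural numbers
$\vec{a} \in \mathbb{R}^m$ and $\vec{b} \in \mathbb{R}^n$
$D$ be a subset of $\mathbb{R}^m$ that contains all $\vec{x} \in \mathbb{R}^m$ that are near $\vec{a}$
$c, F, G \in \mathbb{R}$

and

$$f, g : D \setminus \{\vec{a}\} \to \mathbb{R} \qquad X : \mathbb{R}^n \setminus \{\vec{b}\} \to D \setminus \{\vec{a}\} \qquad \gamma : \mathbb{R} \to \mathbb{R}$$

Assume that

$$\lim_{\vec{x} \to \vec{a}} f(\vec{x}) = F \qquad \lim_{\vec{x} \to \vec{a}} g(\vec{x}) = G \qquad \lim_{\vec{y} \to \vec{b}} X(\vec{y}) = \vec{a} \qquad \lim_{t \to F} \gamma(t) = \gamma(F)$$

Then
(a) $\lim_{\vec{x} \to \vec{a}} \big[f(\vec{x}) + g(\vec{x})\big] = F + G$ and $\lim_{\vec{x} \to \vec{a}} \big[f(\vec{x}) - g(\vec{x})\big] = F - G$

(b) $\lim_{\vec{x} \to \vec{a}} f(\vec{x})\, g(\vec{x}) = FG$ and $\lim_{\vec{x} \to \vec{a}} c f(\vec{x}) = cF$

(c) $\lim_{\vec{x} \to \vec{a}} \frac{f(\vec{x})}{g(\vec{x})} = \frac{F}{G}$ if $G \ne 0$

(d) $\lim_{\vec{y} \to \vec{b}} f(X(\vec{y})) = F$

(e) $\lim_{\vec{x} \to \vec{a}} \gamma(f(\vec{x})) = \gamma(F)$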
This shows that multivariable limits interact very nicely with arithmetic, just as single variable limits did. Also recall, from
Theorem 1.6.8 in the CLP-1 text,

 Theorem 2.1.6

The following functions are continuous everywhere in their domains


polynomials, rational functions
roots and powers
trig functions and their inverses
exponential and the logarithm

 Example 2.1.7

In this example we evaluate

$$\lim_{(x,y) \to (2,3)} \frac{x + \sin y}{x^2 y^2 + 1}$$

as a typical application of Theorem 2.1.5. Here “$\overset{a}{=}$” means that part (a) of Theorem 2.1.5 justifies that equality. Start by
computing separately the limits of the numerator and denominator.

$$\begin{aligned}
\lim_{(x,y) \to (2,3)} (x + \sin y) &\overset{a}{=} \lim_{(x,y) \to (2,3)} x + \lim_{(x,y) \to (2,3)} \sin y \\
&\overset{e}{=} \lim_{(x,y) \to (2,3)} x + \sin\Big(\lim_{(x,y) \to (2,3)} y\Big) \\
&= 2 + \sin 3
\end{aligned}$$

$$\begin{aligned}
\lim_{(x,y) \to (2,3)} (x^2 y^2 + 1) &\overset{a}{=} \lim_{(x,y) \to (2,3)} x^2 y^2 + \lim_{(x,y) \to (2,3)} 1 \\
&\overset{b}{=} \Big(\lim_{(x,y) \to (2,3)} x\Big)\Big(\lim_{(x,y) \to (2,3)} x\Big)\Big(\lim_{(x,y) \to (2,3)} y\Big)\Big(\lim_{(x,y) \to (2,3)} y\Big) + 1 \\
&= 2^2\, 3^2 + 1 = 37
\end{aligned}$$

Since the limit of the denominator is nonzero, we can simply divide.

$$\lim_{(x,y) \to (2,3)} \frac{x + \sin y}{x^2 y^2 + 1} \overset{c}{=} \frac{\lim_{(x,y) \to (2,3)} (x + \sin y)}{\lim_{(x,y) \to (2,3)} (x^2 y^2 + 1)} = \frac{2 + \sin 3}{37}$$

Here we have used that $\sin x$ is a continuous function.
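The value found in Example 2.1.7 can be compared against direct evaluation of the function at points close to $(2, 3)$. This is only a numerical illustration, not a proof; the approach direction $(h, -h)$ used here is an arbitrary choice made for the sketch.

```python
import math


def f(x, y):
    """The function whose limit at (2, 3) is computed in Example 2.1.7."""
    return (x + math.sin(y)) / (x**2 * y**2 + 1)


# Value predicted by the arithmetic of limits.
expected = (2 + math.sin(3)) / 37

if __name__ == "__main__":
    for h in (1e-1, 1e-3, 1e-5):
        # Approach (2, 3) along one (arbitrarily chosen) direction.
        print(h, abs(f(2 + h, 3 - h) - expected))
```

The printed differences should shrink as $h$ does, consistent with the function being continuous at $(2, 3)$.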

While the CLP-1 text's Definition 1.3.3 of the limit of a function of one variable, and our Definition 2.1.2 of the limit of a
multivariable function look virtually identical, there is a substantial practical difference between the two. In dimension one, you
can approach a point from the left or from the right and that's it. There are only two possible directions of approach. In two or more
dimensions there is “much more room” and there are infinitely many possible types of approach. One can even spiral in to a point.
See the middle and right hand figures below.

The next few examples illustrate the impact that the “extra room” in dimensions greater than one has on limits.

 Example 2.1.8
As a second example, we consider $\lim_{(x,y) \to (0,0)} \frac{x^2 y}{x^2 + y^2}$. In this example, both the numerator, $x^2 y$, and the denominator, $x^2 + y^2$,
tend to zero as $(x, y)$ approaches $(0, 0)$, so we have to be more careful.

A good way to see the behaviour of a function $f(x, y)$ when $(x, y)$ is close to $(0, 0)$ is to switch to the polar coordinates, $r, \theta$,
that are defined by

$$x = r\cos\theta \qquad y = r\sin\theta$$

The points $(x, y)$ that are close to $(0, 0)$ are those with small $r$, regardless of what $\theta$ is. Recall that $\lim_{(x,y) \to (0,0)} f(x, y) = L$ when
$f(x, y)$ approaches $L$ as $(x, y)$ approaches $(0, 0)$. Substituting $x = r\cos\theta$, $y = r\sin\theta$ into that statement turns it into the
statement that $\lim_{(x,y) \to (0,0)} f(x, y) = L$ when $f(r\cos\theta, r\sin\theta)$ approaches $L$ as $r$ approaches 0. For our current example

$$\frac{x^2 y}{x^2 + y^2} = \frac{(r\cos\theta)^2 (r\sin\theta)}{r^2} = r\cos^2\theta\,\sin\theta$$

As $\big|r\cos^2\theta\,\sin\theta\big| \le r$ tends to 0 as $r$ tends to 0 (regardless of what $\theta$ does as $r$ tends to 0) we have

$$\lim_{(x,y) \to (0,0)} \frac{x^2 y}{x^2 + y^2} = 0$$
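The key step in Example 2.1.8 is the bound $|f| \le r$ on the whole circle of radius $r$. The sketch below samples the circle and reports the largest value of $|f|$ found; the helper name `max_abs_on_circle` and the number of samples are assumptions made for illustration only.

```python
import math


def f(x, y):
    """The function of Example 2.1.8, defined for (x, y) != (0, 0)."""
    return x * x * y / (x * x + y * y)


def max_abs_on_circle(r, samples=720):
    """Largest |f| over equally spaced sample points of the circle of radius r."""
    angles = (2 * math.pi * k / samples for k in range(samples))
    return max(abs(f(r * math.cos(t), r * math.sin(t))) for t in angles)


if __name__ == "__main__":
    for r in (1.0, 1e-1, 1e-3, 1e-6):
        print(r, max_abs_on_circle(r))
```

Because the maximum over the whole circle shrinks with $r$, no direction of approach (straight or spiralling) can keep $f$ away from 0.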

 Example 2.1.9
As a third example, we consider $\lim_{(x,y) \to (0,0)} \frac{x^2 - y^2}{x^2 + y^2}$. Once again, the best way to see the behaviour of $f(x, y) = \frac{x^2 - y^2}{x^2 + y^2}$ for $(x, y)$
close to $(0, 0)$ is to switch to polar coordinates.

$$f(x, y) = \frac{x^2 - y^2}{x^2 + y^2} = \frac{(r\cos\theta)^2 - (r\sin\theta)^2}{r^2} = \cos^2\theta - \sin^2\theta = \cos(2\theta)$$

Note that, this time, $f$ is independent of $r$ but does depend on $\theta$. Here is a greatly magnified sketch of a number of level curves
for $f(x, y)$.

Observe that
as $(x, y)$ approaches $(0, 0)$ along the ray with $2\theta = 30^\circ$, $f(x, y)$ approaches the value $\frac{\sqrt{3}}{2}$ (and in fact $f(x, y)$ takes the
value $\cos(30^\circ) = \frac{\sqrt{3}}{2}$ at every point of that ray)
as $(x, y)$ approaches $(0, 0)$ along the ray with $2\theta = 60^\circ$, $f(x, y)$ approaches the value $\frac{1}{2}$ (and in fact $f(x, y)$ takes the value
$\cos(60^\circ) = \frac{1}{2}$ at every point of that ray)
as $(x, y)$ approaches $(0, 0)$ along the ray with $2\theta = 90^\circ$, $f(x, y)$ approaches the value 0 (and in fact $f(x, y)$ takes the value
$\cos(90^\circ) = 0$ at every point of that ray)
and so on
So there is no single number $L$ such that $f(x, y)$ approaches $L$ as $r = |(x, y)| \to 0$, no matter what the direction of approach
is. The limit $\lim_{(x,y) \to (0,0)} \frac{x^2 - y^2}{x^2 + y^2}$ does not exist.

Here is another way to come to the same conclusion.

Pick any really small positive number. We'll use $10^{-137}$ as an example.
Pick any real number $F$ between $-1$ and $1$. We'll use $F = \frac{\sqrt{3}}{2}$ as an example.
Looking at the sketch above, we see that $f(x, y)$ takes the value $F$ along an entire ray $\theta = \text{const}$, $r > 0$. In the case
$F = \frac{\sqrt{3}}{2}$, the ray is $2\theta = 30^\circ$, $r > 0$. In particular, because the ray extends all the way to $(0, 0)$, $f$ takes the value $F$ for
some $(x, y)$ obeying $|(x, y)| < 10^{-137}$.

That is true regardless of which really small number you picked. So $f(x, y) = \frac{x^2 - y^2}{x^2 + y^2}$ does not approach any single value as
$r = |(x, y)|$ approaches 0 and we conclude that $\lim_{(x,y) \to (0,0)} \frac{x^2 - y^2}{x^2 + y^2}$ does not exist.
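The direction dependence in Example 2.1.9 is easy to observe numerically. In this sketch the helper `value_on_ray` is a hypothetical name; it evaluates $f$ at distance $r$ from the origin along the ray making angle $\theta$ (in degrees) with the positive $x$-axis, and the radius $10^{-137}$ echoes the "really small number" used above.

```python
import math


def f(x, y):
    """The function of Example 2.1.9, defined for (x, y) != (0, 0)."""
    return (x * x - y * y) / (x * x + y * y)


def value_on_ray(theta_degrees, r):
    """Evaluate f at the point at distance r along the ray with angle theta."""
    theta = math.radians(theta_degrees)
    return f(r * math.cos(theta), r * math.sin(theta))


if __name__ == "__main__":
    for theta_degrees in (15, 30, 45):      # i.e. 2*theta = 30, 60, 90 degrees
        for r in (1.0, 1e-3, 1e-137):
            print(theta_degrees, r, value_on_ray(theta_degrees, r))
```

Along each ray the value does not change with $r$, but it differs from ray to ray, which is why no single limit can exist.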

Optional — A Nasty Limit That Doesn't Exist

 Example 2.1.10

In this example we study the behaviour of the function

$$f(x, y) = \begin{cases} \dfrac{(2x - y)^2}{x - y} & \text{if } x \ne y \\[1ex] 0 & \text{if } x = y \end{cases}$$

as $(x, y) \to (0, 0)$. Here is a graph of the level curve, $f(x, y) = -3$, for this function.

Here is a larger graph of level curves, f (x, y) = c, for various values of the constant c.

As before, it helps to convert to polar coordinates; it is a good approach$^{6}$. In polar coordinates

$$f(r\cos\theta, r\sin\theta) = \begin{cases} r\,\dfrac{(2\cos\theta - \sin\theta)^2}{\cos\theta - \sin\theta} & \text{if } \cos\theta \ne \sin\theta \\[1ex] 0 & \text{if } \cos\theta = \sin\theta \end{cases}$$

If we approach the origin along any fixed ray $\theta = \text{const}$, then $f(r\cos\theta, r\sin\theta)$ is the constant $\frac{(2\cos\theta - \sin\theta)^2}{\cos\theta - \sin\theta}$ (or 0 if
$\cos\theta = \sin\theta$) times $r$ and so approaches zero as $r$ approaches zero. You can see this in the figure below, which shows the
level curves again, with the rays $\theta = \frac{1}{8}\pi$ and $\theta = \frac{3}{16}\pi$ superimposed.

If you move towards the origin on either of those rays, you first cross the $f = 3$ level curve, then the $f = 2$ level curve, then
the $f = 1$ level curve, then the $f = \frac{1}{2}$ level curve, and so on.

That f (x, y) → 0 as (x, y) → (0, 0) along any fixed ray is suggestive, but does not imply that the limit exists and is zero.
Recall that to have lim f (x, y) = 0, we need f (x, y) → 0 no matter how (x, y) → (0, 0). It is not sufficient to check
(x,y)→(0,0)

only straight line approaches.


In fact, the limit of $f(x, y)$ as $(x, y) \to (0, 0)$ does not exist. A good way to see this is to observe that if you fix any $r > 0$, no
matter how small, $f(x, y)$ takes all values from $-\infty$ to $+\infty$ on the circle $x^2 + y^2 = r^2$. You can see this in the figure below,
which shows the level curves yet again, with a circle $x^2 + y^2 = r^2$ superimposed. For every single $-\infty < c < \infty$, the level
curve $f(x, y) = c$ crosses the circle.

Consequently there is no one number $L$ such that $f(x, y)$ is close to $L$ whenever $(x, y)$ is sufficiently close to $(0, 0)$. The limit
$\lim_{(x,y) \to (0,0)} f(x, y)$ does not exist.

Another way to see that $f(x, y)$ does not have any limit as $(x, y) \to (0, 0)$ is to show that $f(x, y)$ does not have a limit as
$(x, y)$ approaches $(0, 0)$ along some specific curve. This can be done by picking a curve that makes the denominator, $x - y$,
tend to zero very quickly. One such curve is $x - y = x^3$ or, equivalently, $y = x - x^3$. Along this curve, for $x \ne 0$,

$$\begin{aligned}
f(x, x - x^3) &= \frac{(2x - x + x^3)^2}{x - x + x^3} = \frac{(x + x^3)^2}{x^3} \\
&= \frac{(1 + x^2)^2}{x} \longrightarrow \begin{cases} +\infty & \text{as } x \to 0 \text{ with } x > 0 \\ -\infty & \text{as } x \to 0 \text{ with } x < 0 \end{cases}
\end{aligned}$$

The choice of the specific power $x^3$ is not important. Any power $x^p$ with $p > 2$ will have the same effect.

If we send $(x, y)$ to $(0, 0)$ along the curve $x - y = ax^2$ or, equivalently, $y = x - ax^2$, where $a$ is a nonzero constant,

$$\begin{aligned}
\lim_{x \to 0} f(x, x - ax^2) &= \lim_{x \to 0} \frac{(2x - x + ax^2)^2}{x - x + ax^2} = \lim_{x \to 0} \frac{(x + ax^2)^2}{ax^2} \\
&= \lim_{x \to 0} \frac{(1 + ax)^2}{a} = \frac{1}{a}
\end{aligned}$$

This limit depends on the choice of the constant $a$. Once again, this proves that $f(x, y)$ does not have a limit as
$(x, y) \to (0, 0)$.
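The three kinds of approach discussed in Example 2.1.10 (fixed rays, the curves $y = x - ax^2$, and the curve $y = x - x^3$) can be compared side by side. This is a numerical sketch only; the helper names `along_ray`, `along_parabola` and `along_cubic` are hypothetical, and the sample values of $a$ and $x$ are arbitrary choices.

```python
import math


def f(x, y):
    """The function of Example 2.1.10."""
    if x == y:
        return 0.0
    return (2 * x - y) ** 2 / (x - y)


def along_ray(theta, r):
    """f at distance r from the origin on the ray with polar angle theta."""
    return f(r * math.cos(theta), r * math.sin(theta))


def along_parabola(a, x):
    """f on the curve y = x - a*x^2 (a nonzero); tends to 1/a as x -> 0."""
    return f(x, x - a * x * x)


def along_cubic(x):
    """f on the curve y = x - x^3; blows up like 1/x as x -> 0."""
    return f(x, x - x**3)


if __name__ == "__main__":
    print("ray:", [along_ray(math.pi / 8, r) for r in (1e-2, 1e-4, 1e-6)])
    print("a = 2:", [along_parabola(2.0, x) for x in (1e-2, 1e-3, 1e-4)])
    print("a = -4:", [along_parabola(-4.0, x) for x in (1e-2, 1e-3, 1e-4)])
    print("cubic:", [along_cubic(x) for x in (1e-1, 1e-2, -1e-2)])
```

The ray values shrink to 0, the parabola values settle near $1/a$, and the cubic values grow without bound, so the three families disagree about what the "limit" would have to be.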

Exercises
Stage 1

 1
Suppose $f(x, y)$ is a function such that $\lim_{(x,y) \to (0,0)} f(x, y) = 10$.

True or false: |f (0.1, 0.1) − 10| < |f (0.2, 0.2) − 10|

 2

A millstone pounds wheat into flour. The wheat sits in a basin, and the millstone pounds up and down.
Samples of wheat are taken from various places along the basin. Their diameters are measured and their position on the basin is
recorded.
Consider this claim: “As the particles get very close to the millstone, the diameters of the particles approach 50 μm.” In this
context, describe the variables below from Definition 2.1.2.
1. x
2. a
3. L

 3
Let $f(x, y) = \frac{x^2}{x^2 + y^2}$.

1. Find a ray approaching the origin along which f (x, y) = 1.


2. Find a ray approaching the origin along which f (x, y) = 0.
3. What does the above work show about a limit of f (x, y)?

 4
Let $f(x, y) = x^2 - y^2$.

1. Express the function in terms of the polar coordinates r and θ, and simplify.
2. Suppose (x, y) is a distance of 1 from the origin. What are the largest and smallest values of f (x, y)?
3. Let r > 0. Suppose (x, y) is a distance of r from the origin. What are the largest and smallest values of f (x, y)?
4. Let ϵ > 0. Find a positive value of r that guarantees |f (x, y)| < ϵ whenever (x, y) is at most r units from the origin.
5. What did you just show?

 5

Suppose $f(x, y)$ is a polynomial. Evaluate $\lim_{(x,y) \to (a,b)} f(x, y)$, where $(a, b) \in \mathbb{R}^2$.

Stage 2

 6

Evaluate, if possible,
1. $\lim_{(x,y) \to (2,-1)} (xy + x^2)$
2. $\lim_{(x,y) \to (0,0)} \frac{x}{x^2 + y^2}$
3. $\lim_{(x,y) \to (0,0)} \frac{x^2}{x^2 + y^2}$
4. $\lim_{(x,y) \to (0,0)} \frac{x^3}{x^2 + y^2}$
5. $\lim_{(x,y) \to (0,0)} \frac{x^2 y^2}{x^2 + y^4}$
6. $\lim_{(x,y) \to (0,0)} \frac{(\sin x)(e^y - 1)}{xy}$

 7. ✳
1. Find the limit: $\lim_{(x,y) \to (0,0)} \frac{x^8 + y^8}{x^4 + y^4}$.
2. Prove that the following limit does not exist: $\lim_{(x,y) \to (0,0)} \frac{x y^5}{x^8 + y^{10}}$.

 8. ✳

Evaluate each of the following limits or show that it does not exist.
1. $\lim_{(x,y) \to (0,0)} \frac{x^3 - y^3}{x^2 + y^2}$
2. $\lim_{(x,y) \to (0,0)} \frac{x^2 - y^4}{x^2 + y^4}$

Stage 3

 9. ✳

Evaluate each of the following limits or show that it does not exist.
1. $\lim_{(x,y) \to (0,0)} \frac{2x^2 + x^2 y - y^2 x + 2y^2}{x^2 + y^2}$
2. $\lim_{(x,y) \to (0,1)} \frac{x^2 y^2 - 2x^2 y + x^2}{(x^2 + y^2 - 2y + 1)^2}$

 10
Define, for all $(x, y) \ne (0, 0)$, $f(x, y) = \frac{x^2 y}{x^4 + y^2}$.

1. Let $0 \le \theta < 2\pi$. Compute $\lim_{r \to 0^+} f(r\cos\theta, r\sin\theta)$.
2. Compute $\lim_{x \to 0} f(x, x^2)$.
3. Does $\lim_{(x,y) \to (0,0)} f(x, y)$ exist?

 11. ✳
Compute the following limits or explain why they do not exist.
1. $\lim_{(x,y) \to (0,0)} \frac{xy}{x^2 + y^2}$
2. $\lim_{(x,y) \to (0,0)} \frac{\sin(xy)}{x^2 + y^2}$
3. $\lim_{(x,y) \to (-1,1)} \frac{x^2 + 2xy^2 + y^4}{1 + y^4}$
4. $\lim_{(x,y) \to (0,0)} |y|^x$

1. Definition 1.3.3 in the CLP-1 text.


2. In this text, we will be interested in $m, n \in \{1, 2, 3\}$, but the definition works for all natural numbers $m, n$.

3. To be precise, there is a number r > 0 such that f (x⃗ ) is defined for all x⃗ obeying |x⃗ − a⃗ | < r.
4. There is a precise, formal version of this definition that looks just like Definition 1.7.1 of the CLP-1 text.
5. You may find the condition “without being exactly $\vec{a}$” a little strange, but there is a good reason for it, which we have already
seen in Calculus I. In the definition $f'(a) = \lim_{x \to a} \frac{f(x) - f(a)}{x - a}$, the function whose limit is being taken, namely $\frac{f(x) - f(a)}{x - a}$, is not
defined at all at $x = a$. This will again happen when we define derivatives of functions of more than one variable.
6. Not just a pun.

This page titled 2.1: Limits is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Joel Feldman, Andrew
Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is
available upon request.

2.2: Partial Derivatives
We are now ready to define derivatives of functions of more than one variable. First, recall how we defined the derivative, \(f'(a)\),
of a function of one variable, f (x). We imagined that we were walking along the x-axis, in the positive direction, measuring, for
example, the temperature along the way. We denoted by f (x) the temperature at x. The instantaneous rate of change of temperature
that we observed as we passed through x = a was
\[ \frac{df}{dx}(a) = \lim_{h\to 0}\frac{f(a+h)-f(a)}{h} = \lim_{x\to a}\frac{f(x)-f(a)}{x-a} \]

Next suppose that we are walking in the xy-plane and that the temperature at (x, y) is f (x, y). We can pass through the point
(x, y) = (a, b) moving in many different directions, and we cannot expect the measured rate of change of temperature if we walk

parallel to the x-axis, in the direction of increasing x, to be the same as the measured rate of change of temperature if we walk
parallel to the y -axis in the direction of increasing y. We'll start by considering just those two directions. We'll consider other
directions (like walking parallel to the line y = x ) later.
Suppose that we are passing through the point (x, y) = (a, b) and that we are walking parallel to the x-axis (in the positive
direction). Then our y -coordinate will be constant, always taking the value y = b. So we can think of the measured temperature as
the function of one variable B(x) = f (x, b) and we will observe the rate of change of temperature

\[ \frac{dB}{dx}(a) = \lim_{h\to 0}\frac{B(a+h)-B(a)}{h} = \lim_{h\to 0}\frac{f(a+h,b)-f(a,b)}{h} \]

This is called the “partial derivative of \(f\) with respect to \(x\) at \((a,b)\)” and is denoted \(\frac{\partial f}{\partial x}(a,b)\). Here
the symbol \(\partial\), which is read “partial”, indicates that we are dealing with a function of more than one variable, and
the \(x\) in \(\frac{\partial f}{\partial x}\) indicates that we are differentiating with respect to \(x\), while \(y\) is being held fixed, i.e. being treated as a constant.
\(\frac{\partial f}{\partial x}\) is read “partial dee \(f\) dee \(x\)”.

Do not write \(\frac{d}{dx}\) when \(\frac{\partial}{\partial x}\) is appropriate. We shall later encounter situations when \(\frac{d}{dx}f\) and \(\frac{\partial}{\partial x}f\) are both defined and have different meanings.
If, instead, we are passing through the point (x, y) = (a, b) and are walking parallel to the y -axis (in the positive direction), then
our x-coordinate will be constant, always taking the value x = a. So we can think of the measured temperature as the function of
one variable A(y) = f (a, y) and we will observe the rate of change of temperature
\[ \frac{dA}{dy}(b) = \lim_{h\to 0}\frac{A(b+h)-A(b)}{h} = \lim_{h\to 0}\frac{f(a,b+h)-f(a,b)}{h} \]

This is called the “partial derivative of \(f\) with respect to \(y\) at \((a,b)\)” and is denoted \(\frac{\partial f}{\partial y}(a,b)\).

Just as was the case for the ordinary derivative \(\frac{df}{dx}(x)\) (see Definition 2.2.6 in the CLP-1 text), it is common to treat the partial derivatives of \(f(x,y)\) as functions of \((x,y)\) simply by evaluating the partial derivatives at \((x,y)\) rather than at \((a,b)\).

 Definition 2.2.1. Partial Derivatives


The x- and y -partial derivatives of the function f (x, y) are

\[ \frac{\partial f}{\partial x}(x,y) = \lim_{h\to 0}\frac{f(x+h,y)-f(x,y)}{h} \]
\[ \frac{\partial f}{\partial y}(x,y) = \lim_{h\to 0}\frac{f(x,y+h)-f(x,y)}{h} \]

respectively. The partial derivatives of functions of more than two variables are defined analogously.
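
The limits in Definition 2.2.1 can be approximated numerically by evaluating the difference quotient for one small value of h. The following is only a minimal sketch: the helper names `partial_x` and `partial_y`, the step size, and the test function g(x, y) = x²y are illustrative assumptions, not part of the text.

```python
def partial_x(f, x, y, h=1e-6):
    """Difference quotient (f(x+h, y) - f(x, y)) / h with y held fixed."""
    return (f(x + h, y) - f(x, y)) / h

def partial_y(f, x, y, h=1e-6):
    """Difference quotient (f(x, y+h) - f(x, y)) / h with x held fixed."""
    return (f(x, y + h) - f(x, y)) / h

def g(x, y):
    # Hypothetical test function; its exact partials are 2xy and x**2.
    return x**2 * y

print(partial_x(g, 1.0, 3.0))  # close to 2*1*3 = 6
print(partial_y(g, 1.0, 3.0))  # close to 1**2 = 1
```

A finite h only approximates the limit, so the printed values agree with the exact partial derivatives up to a small error.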

Partial derivatives are used a lot, and there are many notations for them.

 Definition 2.2.2
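The partial derivative \(\frac{\partial f}{\partial x}(x,y)\) of a function \(f(x,y)\) is also denoted

\[ \frac{\partial f}{\partial x} \qquad f_x(x,y) \qquad f_x \qquad D_xf(x,y) \qquad D_xf \qquad D_1f(x,y) \qquad D_1f \]

The subscript 1 on \(D_1f\) indicates that \(f\) is being differentiated with respect to its first variable. The partial derivative \(\frac{\partial f}{\partial x}(a,b)\) is also denoted

\[ \frac{\partial f}{\partial x}\bigg|_{(a,b)} \]

with the subscript \((a,b)\) indicating that \(\frac{\partial f}{\partial x}\) is being evaluated at \((x,y)=(a,b)\).

The notation \(\left(\frac{\partial f}{\partial x}\right)_y\) is used to make explicit that the variable \(y\) is being held fixed (see footnote 1).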

 Remark 2.2.3. The Geometric Interpretation of Partial Derivatives

We'll now develop a geometric interpretation of the partial derivative


\[ \frac{\partial f}{\partial x}(a,b) = \lim_{h\to 0}\frac{f(a+h,b)-f(a,b)}{h} \]

in terms of the shape of the graph z = f (x, y) of the function f (x, y). That graph appears in the figure below. It looks like the
part of a deformed sphere that is in the first octant.
The definition of \(\frac{\partial f}{\partial x}(a,b)\) concerns only points on the graph that have \(y=b\). In other words, it concerns only the curve of intersection of the surface \(z=f(x,y)\) with the plane \(y=b\). That is the red curve in the figure. The two blue vertical line segments in the figure have heights \(f(a,b)\) and \(f(a+h,b)\), which are the two numbers in the numerator of \(\frac{f(a+h,b)-f(a,b)}{h}\).

A side view of the curve (looking from the left side of the y -axis) is sketched in the figure below.

Again, the two blue vertical line segments in the figure have heights \(f(a,b)\) and \(f(a+h,b)\), which are the two numbers in the numerator of \(\frac{f(a+h,b)-f(a,b)}{h}\). So the numerator \(f(a+h,b)-f(a,b)\) and denominator \(h\) are the rise and run, respectively, of the curve \(z=f(x,b)\) from \(x=a\) to \(x=a+h\). Thus \(\frac{\partial f}{\partial x}(a,b)\) is exactly the slope of (the tangent to) the curve of intersection of the surface \(z=f(x,y)\) and the plane \(y=b\) at the point \((a,b,f(a,b))\). In the same way \(\frac{\partial f}{\partial y}(a,b)\) is exactly the slope of (the tangent to) the curve of intersection of the surface \(z=f(x,y)\) and the plane \(x=a\) at the point \((a,b,f(a,b))\).

Evaluation of Partial Derivatives


From the above discussion, we see that we can readily compute partial derivatives \(\frac{\partial}{\partial x}\) by using what we already know about ordinary derivatives \(\frac{d}{dx}\). More precisely,

to evaluate \(\frac{\partial f}{\partial x}(x,y)\), treat the \(y\) in \(f(x,y)\) as a constant and differentiate the resulting function of \(x\) with respect to \(x\).
To evaluate \(\frac{\partial f}{\partial y}(x,y)\), treat the \(x\) in \(f(x,y)\) as a constant and differentiate the resulting function of \(y\) with respect to \(y\).
To evaluate \(\frac{\partial f}{\partial x}(a,b)\), treat the \(y\) in \(f(x,y)\) as a constant and differentiate the resulting function of \(x\) with respect to \(x\). Then evaluate the result at \(x=a\), \(y=b\).
To evaluate \(\frac{\partial f}{\partial y}(a,b)\), treat the \(x\) in \(f(x,y)\) as a constant and differentiate the resulting function of \(y\) with respect to \(y\). Then evaluate the result at \(x=a\), \(y=b\).

Now for some examples.

 Example 2.2.4

Let
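\[ f(x,y) = x^3 + y^2 + 4xy^2 \]

Then, since \(\frac{\partial}{\partial x}\) treats \(y\) as a constant,

\[
\begin{aligned}
\frac{\partial f}{\partial x}
&= \frac{\partial}{\partial x}(x^3) + \frac{\partial}{\partial x}(y^2) + \frac{\partial}{\partial x}(4xy^2) \\
&= 3x^2 + 0 + 4y^2\frac{\partial}{\partial x}(x) \\
&= 3x^2 + 4y^2
\end{aligned}
\]

and, since \(\frac{\partial}{\partial y}\) treats \(x\) as a constant,

\[
\begin{aligned}
\frac{\partial f}{\partial y}
&= \frac{\partial}{\partial y}(x^3) + \frac{\partial}{\partial y}(y^2) + \frac{\partial}{\partial y}(4xy^2) \\
&= 0 + 2y + 4x\frac{\partial}{\partial y}(y^2) \\
&= 2y + 8xy
\end{aligned}
\]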

In particular, at (x, y) = (1, 0) these partial derivatives take the values

\[
\begin{aligned}
\frac{\partial f}{\partial x}(1,0) &= 3(1)^2 + 4(0)^2 = 3 \\
\frac{\partial f}{\partial y}(1,0) &= 2(0) + 8(1)(0) = 0
\end{aligned}
\]
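
As a sanity check on Example 2.2.4, the values at (1, 0) can be compared with symmetric difference quotients. This is a rough numerical sketch; the step size `h` and the variable names are arbitrary choices made here for illustration.

```python
def f(x, y):
    # The function of Example 2.2.4.
    return x**3 + y**2 + 4 * x * y**2

h = 1e-5  # small step; an arbitrary illustrative choice

# Vary x with y = 0 held fixed, then vary y with x = 1 held fixed.
fx_num = (f(1 + h, 0.0) - f(1 - h, 0.0)) / (2 * h)
fy_num = (f(1.0, h) - f(1.0, -h)) / (2 * h)

print(fx_num)  # close to 3
print(fy_num)  # close to 0
```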

 Example 2.2.5

Let
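\[ f(x,y) = y\cos x + xe^{xy} \]

Then, since \(\frac{\partial}{\partial x}\) treats \(y\) as a constant, \(\frac{\partial}{\partial x}e^{yx} = ye^{yx}\) and

\[
\begin{aligned}
\frac{\partial f}{\partial x}(x,y)
&= y\frac{\partial}{\partial x}(\cos x) + e^{xy}\frac{\partial}{\partial x}(x) + x\frac{\partial}{\partial x}\big(e^{xy}\big) \qquad \text{(by the product rule)} \\
&= -y\sin x + e^{xy} + xye^{xy} \\
\frac{\partial f}{\partial y}(x,y)
&= \cos x\,\frac{\partial}{\partial y}(y) + x\frac{\partial}{\partial y}\big(e^{xy}\big) \\
&= \cos x + x^2e^{xy}
\end{aligned}
\]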

Let's move up to a function of four variables. Things generalize in a quite straightforward way.

 Example 2.2.6

Let
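\[ f(x,y,z,t) = x\sin(y+2z) + t^2e^{3y}\ln z \]

Then

\[
\begin{aligned}
\frac{\partial f}{\partial x}(x,y,z,t) &= \sin(y+2z) \\
\frac{\partial f}{\partial y}(x,y,z,t) &= x\cos(y+2z) + 3t^2e^{3y}\ln z \\
\frac{\partial f}{\partial z}(x,y,z,t) &= 2x\cos(y+2z) + t^2e^{3y}/z \\
\frac{\partial f}{\partial t}(x,y,z,t) &= 2te^{3y}\ln z
\end{aligned}
\]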

Now here is a more complicated example — our function takes a special value at (0, 0). To compute derivatives there we revert to
the definition.

 Example 2.2.7

Set
\[ f(x,y) = \begin{cases} \dfrac{\cos x-\cos y}{x-y} & \text{if } x\ne y \\ 0 & \text{if } x=y \end{cases} \]

If \(b\ne a\), then for all \((x,y)\) sufficiently close to \((a,b)\), \(f(x,y)=\frac{\cos x-\cos y}{x-y}\) and we can compute the partial derivatives of \(f\) at \((a,b)\) using the familiar rules of differentiation. However that is not the case for \((a,b)=(0,0)\). To evaluate \(f_x(0,0)\), we need to set \(y=0\) and find the derivative of

\[ f(x,0) = \begin{cases} \dfrac{\cos x-1}{x} & \text{if } x\ne 0 \\ 0 & \text{if } x=0 \end{cases} \]

with respect to \(x\) at \(x=0\). As we cannot use the usual differentiation rules, we evaluate the derivative (see footnote 2) by applying the definition

\[
\begin{aligned}
f_x(0,0) &= \lim_{h\to 0}\frac{f(h,0)-f(0,0)}{h} \\
&= \lim_{h\to 0}\frac{\frac{\cos h-1}{h}-0}{h} && \text{(Recall that } h\ne 0 \text{ in the limit.)} \\
&= \lim_{h\to 0}\frac{\cos h-1}{h^2} \\
&= \lim_{h\to 0}\frac{-\sin h}{2h} && \text{(By l'Hôpital's rule.)} \\
&= \lim_{h\to 0}\frac{-\cos h}{2} && \text{(By l'Hôpital again.)} \\
&= -\frac{1}{2}
\end{aligned}
\]

We could also evaluate the limit of \(\frac{\cos h-1}{h^2}\) by substituting in the Taylor expansion

\[ \cos h = 1 - \frac{h^2}{2} + \frac{h^4}{4!} - \cdots \]

We can also use Taylor expansions to understand the behaviour of f (x, y) for (x, y) near (0, 0). For x ≠ y,
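\[
\begin{aligned}
\frac{\cos x-\cos y}{x-y}
&= \frac{\left[1-\frac{x^2}{2!}+\frac{x^4}{4!}-\cdots\right]-\left[1-\frac{y^2}{2!}+\frac{y^4}{4!}-\cdots\right]}{x-y} \\
&= \frac{-\frac{x^2-y^2}{2!}+\frac{x^4-y^4}{4!}-\cdots}{x-y} \\
&= -\frac{1}{2!}\,\frac{x^2-y^2}{x-y} + \frac{1}{4!}\,\frac{x^4-y^4}{x-y} - \cdots \\
&= -\frac{x+y}{2!} + \frac{x^3+x^2y+xy^2+y^3}{4!} - \cdots
\end{aligned}
\]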

So for (x, y) near (0, 0),


\[ f(x,y) \approx \begin{cases} -\dfrac{x+y}{2} & \text{if } x\ne y \\ 0 & \text{if } x=y \end{cases} \]

So it sure looks like (and in fact it is true that)

\(f(x,y)\) is continuous at \((0,0)\) and
\(f(x,y)\) is not continuous at \((a,a)\) for small \(a\ne 0\) and
\(f_x(0,0)=f_y(0,0)=-\frac{1}{2}\)
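
The value \(f_x(0,0)=-\frac12\) of Example 2.2.7 can be observed numerically by evaluating the difference quotient from the definition for shrinking h. This is a sketch only; the helper name `fx_quotient` and the chosen step sizes are assumptions made for illustration.

```python
import math

def f(x, y):
    # Piecewise function of Example 2.2.7.
    if x == y:
        return 0.0
    return (math.cos(x) - math.cos(y)) / (x - y)

def fx_quotient(h):
    """Difference quotient (f(h, 0) - f(0, 0)) / h from the definition of f_x(0, 0)."""
    return (f(h, 0.0) - f(0.0, 0.0)) / h

for h in (1e-1, 1e-2, 1e-3):
    print(h, fx_quotient(h))  # values approach -1/2
```

The quotients differ from \(-\frac12\) by roughly \(h^2/24\), consistent with the Taylor expansion of \(\cos h\) used in the example.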

 Example 2.2.8

Again set
\[ f(x,y) = \begin{cases} \dfrac{\cos x-\cos y}{x-y} & \text{if } x\ne y \\ 0 & \text{if } x=y \end{cases} \]

We'll now compute \(f_y(x,y)\) for all \((x,y)\).


The case y ≠ x: When y ≠ x,

\[
\begin{aligned}
f_y(x,y) &= \frac{\partial}{\partial y}\,\frac{\cos x-\cos y}{x-y} \\
&= \frac{(x-y)\frac{\partial}{\partial y}(\cos x-\cos y) - (\cos x-\cos y)\frac{\partial}{\partial y}(x-y)}{(x-y)^2} && \text{(by the quotient rule)} \\
&= \frac{(x-y)\sin y + \cos x-\cos y}{(x-y)^2}
\end{aligned}
\]

The case y = x: When y = x,


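\[
\begin{aligned}
f_y(x,y) &= \lim_{h\to 0}\frac{f(x,y+h)-f(x,y)}{h} \\
&= \lim_{h\to 0}\frac{f(x,x+h)-f(x,x)}{h} \\
&= \lim_{h\to 0}\frac{\frac{\cos x-\cos(x+h)}{x-(x+h)}-0}{h} && \text{(Recall that } h\ne 0 \text{ in the limit.)} \\
&= \lim_{h\to 0}\frac{\cos(x+h)-\cos x}{h^2}
\end{aligned}
\]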

Now we apply L'Hôpital's rule, remembering that, in this limit, x is a constant and h is the variable — so we differentiate with
respect to h.
\[ f_y(x,y) = \lim_{h\to 0}\frac{-\sin(x+h)}{2h} \]

Note that if \(x\) is not an integer multiple of \(\pi\), then the numerator \(-\sin(x+h)\) does not tend to zero as \(h\) tends to zero, and the limit giving \(f_y(x,y)\) does not exist. On the other hand, if \(x\) is an integer multiple of \(\pi\), both the numerator and denominator tend to zero as \(h\) tends to zero, and we can apply L'Hôpital's rule a second time. Then

\[ f_y(x,y) = \lim_{h\to 0}\frac{-\cos(x+h)}{2} = -\frac{\cos x}{2} \]

The conclusion:
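\[
f_y(x,y) = \begin{cases}
\dfrac{(x-y)\sin y+\cos x-\cos y}{(x-y)^2} & \text{if } x\ne y \\[2ex]
-\dfrac{\cos x}{2} & \text{if } x=y \text{ with } x \text{ an integer multiple of } \pi \\[2ex]
DNE & \text{if } x=y \text{ with } x \text{ not an integer multiple of } \pi
\end{cases}
\]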
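
The two diagonal cases of Example 2.2.8 can be seen numerically: on the line y = x the difference quotient settles down when x is a multiple of π and blows up otherwise. A hedged sketch follows; `fy_quotient` is a hypothetical helper name and the sample points x = π and x = 1 are arbitrary.

```python
import math

def f(x, y):
    # Piecewise function of Examples 2.2.7 and 2.2.8.
    if x == y:
        return 0.0
    return (math.cos(x) - math.cos(y)) / (x - y)

def fy_quotient(x, h):
    """Difference quotient (f(x, x+h) - f(x, x)) / h on the diagonal y = x."""
    return (f(x, x + h) - f(x, x)) / h

for h in (1e-1, 1e-2, 1e-3):
    # At x = pi the quotient approaches -cos(pi)/2 = 1/2;
    # at x = 1 its size grows like sin(1)/h.
    print(h, fy_quotient(math.pi, h), fy_quotient(1.0, h))
```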

 Example 2.2.9. Optional — A Little Weirdness


In this example, we will see that the function
\[ f(x,y) = \begin{cases} \dfrac{x^2}{x-y} & \text{if } x\ne y \\ 0 & \text{if } x=y \end{cases} \]

is not continuous at \((0,0)\) and yet has both partial derivatives \(f_x(0,0)\) and \(f_y(0,0)\) perfectly well defined. We'll also see how that is possible. First let's compute the partial derivatives. By definition,

\[
\begin{aligned}
f_x(0,0) &= \lim_{h\to 0}\frac{f(0+h,0)-f(0,0)}{h} = \lim_{h\to 0}\frac{\frac{h^2}{h-0}-0}{h} = \lim_{h\to 0} 1 = 1 \\
f_y(0,0) &= \lim_{h\to 0}\frac{f(0,0+h)-f(0,0)}{h} = \lim_{h\to 0}\frac{\frac{0^2}{0-h}-0}{h} = \lim_{h\to 0} 0 = 0
\end{aligned}
\]

So the first order partial derivatives \(f_x(0,0)\) and \(f_y(0,0)\) are perfectly well defined.
To see that, nonetheless, \(f(x,y)\) is not continuous at \((0,0)\), we take the limit of \(f(x,y)\) as \((x,y)\) approaches \((0,0)\) along the curve \(y=x-x^3\). The limit is

\[ \lim_{x\to 0} f(x,x-x^3) = \lim_{x\to 0}\frac{x^2}{x-(x-x^3)} = \lim_{x\to 0}\frac{1}{x} \]

which does not exist. Indeed as \(x\) approaches 0 through positive numbers, \(\frac1x\) approaches \(+\infty\), and as \(x\) approaches 0 through negative numbers, \(\frac1x\) approaches \(-\infty\).

So how is this possible? The answer is that \(f_x(0,0)\) only involves values of \(f(x,y)\) with \(y=0\). As \(f(x,0)=x\), for all values of \(x\), we have that \(f(x,0)\) is a continuous, and indeed a differentiable, function. Similarly, \(f_y(0,0)\) only involves values of \(f(x,y)\) with \(x=0\). As \(f(0,y)=0\), for all values of \(y\), we have that \(f(0,y)\) is a continuous, and indeed a differentiable, function. On the other hand, the bad behaviour of \(f(x,y)\) for \((x,y)\) near \((0,0)\) only happens for \(x\) and \(y\) both nonzero.
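
The behaviour in Example 2.2.9 can be checked numerically: the two difference quotients at the origin are harmless, while the values of f along the curve y = x − x³ grow like 1/x. This is an illustrative sketch; the sample values of x and h are arbitrary.

```python
def f(x, y):
    # Piecewise function of Example 2.2.9.
    if x == y:
        return 0.0
    return x**2 / (x - y)

h = 1e-3
print((f(h, 0.0) - f(0.0, 0.0)) / h)  # difference quotient for f_x(0,0): 1
print((f(0.0, h) - f(0.0, 0.0)) / h)  # difference quotient for f_y(0,0): 0

# Along y = x - x**3 the function equals 1/x, which is unbounded near x = 0.
for x in (0.1, 0.01, -0.01, -0.1):
    print(x, f(x, x - x**3))
```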

Our next example uses implicit differentiation.

 Example 2.2.10

The equation
\[ z^5 + y^2e^z + e^{2x} = 0 \]

implicitly determines \(z\) as a function of \(x\) and \(y\). That is, the function \(z(x,y)\) obeys

\[ z(x,y)^5 + y^2e^{z(x,y)} + e^{2x} = 0 \]

For example, when \(x=y=0\), the equation reduces to

\[ z(0,0)^5 = -1 \]

which forces (see footnote 3) \(z(0,0)=-1\). Let's find the partial derivative \(\frac{\partial z}{\partial x}(0,0)\).

We are not going to be able to explicitly solve the equation for \(z(x,y)\). All we know is that

\[ z(x,y)^5 + y^2e^{z(x,y)} + e^{2x} = 0 \]

for all \(x\) and \(y\). We can turn this into an equation for \(\frac{\partial z}{\partial x}(0,0)\) by differentiating (see footnote 4) the whole equation with respect to \(x\), giving

\[ 5z(x,y)^4\,\frac{\partial z}{\partial x}(x,y) + y^2e^{z(x,y)}\,\frac{\partial z}{\partial x}(x,y) + 2e^{2x} = 0 \]

and then setting \(x=y=0\), giving

\[ 5z(0,0)^4\,\frac{\partial z}{\partial x}(0,0) + 2 = 0 \]

As we already know that \(z(0,0)=-1\),

\[ \frac{\partial z}{\partial x}(0,0) = -\frac{2}{5z(0,0)^4} = -\frac{2}{5} \]
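
The result of Example 2.2.10 can be cross-checked without implicit differentiation by solving the equation for z numerically at nearby values of x and forming a difference quotient. The sketch below uses bisection, which is a technique chosen here for illustration and not one used in the text; the bracket [−3, 0], the iteration count and the name `solve_z` are assumptions that are adequate only for (x, y) near (0, 0).

```python
import math

def solve_z(x, y):
    """Solve z**5 + y**2 * exp(z) + exp(2x) = 0 for z by bisection on [-3, 0]."""
    def g(z):
        return z**5 + y**2 * math.exp(z) + math.exp(2 * x)
    lo, hi = -3.0, 0.0          # g(lo) < 0 < g(hi) for (x, y) near (0, 0)
    for _ in range(100):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

h = 1e-5
z00 = solve_z(0.0, 0.0)
dz_dx = (solve_z(h, 0.0) - solve_z(-h, 0.0)) / (2 * h)
print(z00)    # close to -1
print(dz_dx)  # close to -2/5
```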

Next we have a partial derivative disguised as a limit.

 Example 2.2.11

In this example we are going to evaluate the limit


\[ \lim_{z\to 0}\frac{(x+y+z)^3-(x+y)^3}{(x+y)z} \]

The critical observation is that, in taking the limit z → 0, x and y are fixed. They do not change as z is getting smaller and
smaller. Furthermore this limit is exactly of the form of the limits in the Definition 2.2.1 of partial derivative, disguised by
some obfuscating changes of notation.
Set
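\[ f(x,y,z) = \frac{(x+y+z)^3}{x+y} \]

Then

\[
\begin{aligned}
\lim_{z\to 0}\frac{(x+y+z)^3-(x+y)^3}{(x+y)z}
&= \lim_{z\to 0}\frac{f(x,y,z)-f(x,y,0)}{z} \\
&= \lim_{h\to 0}\frac{f(x,y,0+h)-f(x,y,0)}{h} \\
&= \frac{\partial f}{\partial z}(x,y,0) \\
&= \left[\frac{\partial}{\partial z}\,\frac{(x+y+z)^3}{x+y}\right]_{z=0}
\end{aligned}
\]

Recalling that \(\frac{\partial}{\partial z}\) treats \(x\) and \(y\) as constants, we are evaluating the derivative of a function of the form \(\frac{(\text{const}+z)^3}{\text{const}}\). So

\[
\begin{aligned}
\lim_{z\to 0}\frac{(x+y+z)^3-(x+y)^3}{(x+y)z}
&= 3\,\frac{(x+y+z)^2}{x+y}\bigg|_{z=0} \\
&= 3(x+y)
\end{aligned}
\]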
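
The limit of Example 2.2.11 can be spot-checked by plugging in small values of z at one fixed point. A minimal sketch; the point (x, y) = (1, 2) and the name `quotient` are arbitrary illustrative choices.

```python
def quotient(x, y, z):
    """The expression whose limit as z -> 0 is taken in Example 2.2.11."""
    return ((x + y + z)**3 - (x + y)**3) / ((x + y) * z)

x, y = 1.0, 2.0
for z in (1e-2, 1e-4, 1e-6):
    print(z, quotient(x, y, z))  # approaches 3*(x + y) = 9
```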

The next example highlights a potentially dangerous difference between ordinary and partial derivatives.

 Example 2.2.12
In this example we are going to see that, in contrast to the ordinary derivative case, \(\frac{\partial r}{\partial x}\) is not, in general, the same as \(\left(\frac{\partial x}{\partial r}\right)^{-1}\).

Recall that Cartesian and polar coordinates (see footnote 5), for \((x,y)\ne(0,0)\) and \(r>0\), are related by

\[
\begin{aligned}
x &= r\cos\theta \\
y &= r\sin\theta \\
r &= \sqrt{x^2+y^2} \\
\tan\theta &= \frac{y}{x}
\end{aligned}
\]

We will use the functions

\[ x(r,\theta) = r\cos\theta \qquad\text{and}\qquad r(x,y) = \sqrt{x^2+y^2} \]

Fix any point \((x_0,y_0)\ne(0,0)\) and let \((r_0,\theta_0)\), \(0\le\theta_0<2\pi\), be the corresponding polar coordinates. Then

\[ \frac{\partial x}{\partial r}(r,\theta) = \cos\theta \qquad \frac{\partial r}{\partial x}(x,y) = \frac{x}{\sqrt{x^2+y^2}} \]

so that

\[
\begin{aligned}
\frac{\partial x}{\partial r}(r_0,\theta_0) = \left(\frac{\partial r}{\partial x}(x_0,y_0)\right)^{-1}
&\iff \cos\theta_0 = \left(\frac{x_0}{\sqrt{x_0^2+y_0^2}}\right)^{-1} = (\cos\theta_0)^{-1} \\
&\iff \cos^2\theta_0 = 1 \\
&\iff \theta_0 = 0,\ \pi
\end{aligned}
\]

We can also see pictorially why this happens. By definition, the partial derivatives
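\[
\begin{aligned}
\frac{\partial x}{\partial r}(r_0,\theta_0) &= \lim_{dr\to 0}\frac{x(r_0+dr,\theta_0)-x(r_0,\theta_0)}{dr} \\
\frac{\partial r}{\partial x}(x_0,y_0) &= \lim_{dx\to 0}\frac{r(x_0+dx,y_0)-r(x_0,y_0)}{dx}
\end{aligned}
\]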

Here we have just renamed the h of Definition 2.2.1 to dr and to dx in the two definitions.
In computing \(\frac{\partial x}{\partial r}(r_0,\theta_0)\), \(\theta_0\) is held fixed, \(r\) is changed by a small amount \(dr\) and the resulting \(dx = x(r_0+dr,\theta_0)-x(r_0,\theta_0)\) is computed. In the figure on the left below, \(dr\) is the length of the orange line segment and \(dx\) is the length of the blue line segment.

On the other hand, in computing \(\frac{\partial r}{\partial x}\), \(y\) is held fixed, \(x\) is changed by a small amount \(dx\) and the resulting \(dr = r(x_0+dx,y_0)-r(x_0,y_0)\) is computed. In the figure on the right above, \(dx\) is the length of the pink line segment and \(dr\) is the length of the orange line segment.

Here are the two figures combined together. We have arranged that the same \(dr\) is used in both computations. In order for the \(dr\)'s to be the same in both computations, the two \(dx\)'s have to be different (unless \(\theta_0=0,\pi\)). So, in general,

\[ \frac{\partial x}{\partial r}(r_0,\theta_0) \ne \left(\frac{\partial r}{\partial x}(x_0,y_0)\right)^{-1}. \]
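
The conclusion of Example 2.2.12 can be illustrated numerically at one sample point. This is a sketch under assumptions: the point (r₀, θ₀) = (2, π/3), the step size, and the function names `x_of` and `r_of` are all chosen here for illustration.

```python
import math

def x_of(r, theta):
    return r * math.cos(theta)

def r_of(x, y):
    return math.hypot(x, y)

r0, theta0 = 2.0, math.pi / 3
x0, y0 = r0 * math.cos(theta0), r0 * math.sin(theta0)
h = 1e-6

# Partial of x with respect to r (theta held fixed) and of r with respect to x (y held fixed).
dx_dr = (x_of(r0 + h, theta0) - x_of(r0 - h, theta0)) / (2 * h)
dr_dx = (r_of(x0 + h, y0) - r_of(x0 - h, y0)) / (2 * h)

print(dx_dr)      # close to cos(theta0) = 0.5
print(dr_dx)      # also close to cos(theta0) = 0.5
print(1 / dr_dx)  # close to 2, which differs from dx_dr
```

Both partial derivatives equal \(\cos\theta_0\) here, so one is the reciprocal of the other only when \(\cos^2\theta_0=1\).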

 Example 2.2.13. Optional — Example 2.2.12, continued

The inverse function theorem, for functions of one variable, says that, if \(y(x)\) and \(x(y)\) are inverse functions, meaning that \(y(x(y))=y\) and \(x(y(x))=x\), and are differentiable with \(\frac{dy}{dx}\ne 0\), then

\[ \frac{dx}{dy}(y) = \frac{1}{\frac{dy}{dx}(x(y))} \]

To see this, just apply \(\frac{d}{dy}\) to both sides of \(y(x(y))=y\) to get \(\frac{dy}{dx}(x(y))\,\frac{dx}{dy}(y)=1\), by the chain rule (see Theorem 2.9.3 in the CLP-1 text). In the CLP-1 text, we used this to compute the derivatives of the logarithm (see Theorem 2.10.1 in the CLP-1 text) and of the inverse trig functions (see Theorem 2.12.7 in the CLP-1 text).
We have just seen, in Example 2.2.12, that we can't be too naive in extending the single variable inverse function theorem to functions of two (or more) variables. On the other hand, there is such an extension, which we will now illustrate, using Cartesian and polar coordinates. For simplicity, we'll restrict our attention to \(x>0\), \(y>0\), or equivalently, \(r>0\), \(0<\theta<\frac{\pi}{2}\).

The functions which convert between Cartesian and polar coordinates are
\[
\begin{aligned}
x(r,\theta) &= r\cos\theta & r(x,y) &= \sqrt{x^2+y^2} \\
y(r,\theta) &= r\sin\theta & \theta(x,y) &= \arctan\Big(\frac{y}{x}\Big)
\end{aligned}
\]

The two functions on the left convert from polar to Cartesian coordinates and the two functions on the right convert from
Cartesian to polar coordinates. The inverse function theorem (for functions of two variables) says that,
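if you form the first order partial derivatives of the left hand functions into the matrix

\[
\begin{bmatrix}
\frac{\partial x}{\partial r}(r,\theta) & \frac{\partial x}{\partial\theta}(r,\theta) \\
\frac{\partial y}{\partial r}(r,\theta) & \frac{\partial y}{\partial\theta}(r,\theta)
\end{bmatrix}
=
\begin{bmatrix}
\cos\theta & -r\sin\theta \\
\sin\theta & r\cos\theta
\end{bmatrix}
\]

and you form the first order partial derivatives of the right hand functions into the matrix

\[
\begin{bmatrix}
\frac{\partial r}{\partial x}(x,y) & \frac{\partial r}{\partial y}(x,y) \\
\frac{\partial\theta}{\partial x}(x,y) & \frac{\partial\theta}{\partial y}(x,y)
\end{bmatrix}
=
\begin{bmatrix}
\frac{x}{\sqrt{x^2+y^2}} & \frac{y}{\sqrt{x^2+y^2}} \\
\frac{-\frac{y}{x^2}}{1+(\frac{y}{x})^2} & \frac{\frac{1}{x}}{1+(\frac{y}{x})^2}
\end{bmatrix}
=
\begin{bmatrix}
\frac{x}{\sqrt{x^2+y^2}} & \frac{y}{\sqrt{x^2+y^2}} \\
\frac{-y}{x^2+y^2} & \frac{x}{x^2+y^2}
\end{bmatrix}
\]

and if you evaluate the second matrix at \(x=x(r,\theta)\), \(y=y(r,\theta)\),

\[
\begin{bmatrix}
\frac{\partial r}{\partial x}\big(x(r,\theta),y(r,\theta)\big) & \frac{\partial r}{\partial y}\big(x(r,\theta),y(r,\theta)\big) \\
\frac{\partial\theta}{\partial x}\big(x(r,\theta),y(r,\theta)\big) & \frac{\partial\theta}{\partial y}\big(x(r,\theta),y(r,\theta)\big)
\end{bmatrix}
=
\begin{bmatrix}
\cos\theta & \sin\theta \\
-\frac{\sin\theta}{r} & \frac{\cos\theta}{r}
\end{bmatrix}
\]

and if you multiply (see footnote 6) the two matrices together

\[
\begin{aligned}
&\begin{bmatrix}
\frac{\partial r}{\partial x}\big(x(r,\theta),y(r,\theta)\big) & \frac{\partial r}{\partial y}\big(x(r,\theta),y(r,\theta)\big) \\
\frac{\partial\theta}{\partial x}\big(x(r,\theta),y(r,\theta)\big) & \frac{\partial\theta}{\partial y}\big(x(r,\theta),y(r,\theta)\big)
\end{bmatrix}
\begin{bmatrix}
\frac{\partial x}{\partial r}(r,\theta) & \frac{\partial x}{\partial\theta}(r,\theta) \\
\frac{\partial y}{\partial r}(r,\theta) & \frac{\partial y}{\partial\theta}(r,\theta)
\end{bmatrix} \\
&\qquad=
\begin{bmatrix}
\cos\theta & \sin\theta \\
-\frac{\sin\theta}{r} & \frac{\cos\theta}{r}
\end{bmatrix}
\begin{bmatrix}
\cos\theta & -r\sin\theta \\
\sin\theta & r\cos\theta
\end{bmatrix} \\
&\qquad=
\begin{bmatrix}
(\cos\theta)(\cos\theta)+(\sin\theta)(\sin\theta) & (\cos\theta)(-r\sin\theta)+(\sin\theta)(r\cos\theta) \\
(-\frac{\sin\theta}{r})(\cos\theta)+(\frac{\cos\theta}{r})(\sin\theta) & (-\frac{\sin\theta}{r})(-r\sin\theta)+(\frac{\cos\theta}{r})(r\cos\theta)
\end{bmatrix}
\end{aligned}
\]

then the result is the identity matrix

\[ \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]

and indeed it is!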
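
The matrix identity of Example 2.2.13 can be verified numerically at a sample point. The following sketch uses plain nested lists; the helper `matmul2`, the variable names and the point (r, θ) = (2, π/5) are hypothetical choices made for this illustration.

```python
import math

def matmul2(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

r, theta = 2.0, math.pi / 5          # a point with r > 0 and 0 < theta < pi/2
x, y = r * math.cos(theta), r * math.sin(theta)

# Partials of x(r, theta), y(r, theta) with respect to r and theta.
polar_to_cart = [[math.cos(theta), -r * math.sin(theta)],
                 [math.sin(theta),  r * math.cos(theta)]]

# Partials of r(x, y), theta(x, y) with respect to x and y, evaluated at (x, y).
cart_to_polar = [[x / math.hypot(x, y), y / math.hypot(x, y)],
                 [-y / (x**2 + y**2),   x / (x**2 + y**2)]]

product = matmul2(cart_to_polar, polar_to_cart)
print(product)  # approximately [[1, 0], [0, 1]]
```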

This two variable version of the inverse function theorem can be derived by applying the derivatives \(\frac{\partial}{\partial r}\) and \(\frac{\partial}{\partial\theta}\) to the equations

\[
\begin{aligned}
r\big(x(r,\theta),y(r,\theta)\big) &= r \\
\theta\big(x(r,\theta),y(r,\theta)\big) &= \theta
\end{aligned}
\]

and using the two variable version of the chain rule, which we will see in §2.4.

Exercises
Stage 1

 1

Let \(f(x,y)=e^x\cos y\). The following table gives some values of \(f(x,y)\).

             x = 0      x = 0.01   x = 0.1
y = −0.1     0.99500    1.00500    1.09965
y = −0.01    0.99995    1.01000    1.10512
y = 0        1.0        1.01005    1.10517

1. Find two different approximate values for \(\frac{\partial f}{\partial x}(0,0)\) using the data in the above table.
2. Find two different approximate values for \(\frac{\partial f}{\partial y}(0,0)\) using the data in the above table.
3. Evaluate \(\frac{\partial f}{\partial x}(0,0)\) and \(\frac{\partial f}{\partial y}(0,0)\) exactly.

 2

You are traversing an undulating landscape. Take the z -axis to be straight up towards the sky, the positive x-axis to be due
south, and the positive y -axis to be due east. Then the landscape near you is described by the equation z = f (x, y), with you at
the point (0, 0, f (0, 0)). The function f (x, y) is differentiable.
Suppose \(f_y(0,0)<0\). Is it possible that you are at a summit? Explain.

 3✳
Let
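\[ f(x,y) = \begin{cases} \dfrac{x^2y}{x^2+y^2} & \text{if } (x,y)\ne(0,0) \\ 0 & \text{if } (x,y)=(0,0) \end{cases} \]

Compute, directly from the definitions,

1. \(\frac{\partial f}{\partial x}(0,0)\)
2. \(\frac{\partial f}{\partial y}(0,0)\)
3. \(\frac{d}{dt}f(t,t)\Big|_{t=0}\)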
Stage 2

 4

Find all first partial derivatives of the following functions and evaluate them at the given point.
1. \(f(x,y,z)=x^3y^4z^5\) at \((0,-1,-1)\)
2. \(w(x,y,z)=\ln\big(1+e^{xyz}\big)\) at \((2,0,-1)\)
3. \(f(x,y)=\dfrac{1}{\sqrt{x^2+y^2}}\) at \((-3,4)\)

 5
Show that the function \(z(x,y)=\frac{x+y}{x-y}\) obeys

\[ x\,\frac{\partial z}{\partial x}(x,y) + y\,\frac{\partial z}{\partial y}(x,y) = 0 \]

 6✳

A surface z(x, y) is defined by zy − y + x = ln(xyz).


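1. Compute \(\frac{\partial z}{\partial x}\), \(\frac{\partial z}{\partial y}\) in terms of \(x\), \(y\), \(z\).
2. Evaluate \(\frac{\partial z}{\partial x}\) and \(\frac{\partial z}{\partial y}\) at \((x,y,z)=(-1,-2,1/2)\).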

 7✳

Find \(\frac{\partial U}{\partial T}\) and \(\frac{\partial T}{\partial V}\) at \((1,1,2,4)\) if \((T,U,V,W)\) are related by

\[ (TU-V)^2\ln(W-UV) = \ln 2 \]

 8✳

Suppose that \(u=x^2+yz\), \(x=\rho r\cos(\theta)\), \(y=\rho r\sin(\theta)\) and \(z=\rho r\). Find \(\frac{\partial u}{\partial r}\) at the point \((\rho_0,r_0,\theta_0)=(2,3,\pi/2)\).

 9
Use the definition of the derivative to evaluate \(f_x(0,0)\) and \(f_y(0,0)\) for

\[ f(x,y) = \begin{cases} \dfrac{x^2-2y^2}{x-y} & \text{if } x\ne y \\ 0 & \text{if } x=y \end{cases} \]

Stage 3

 10
Let \(f\) be any differentiable function of one variable. Define \(z(x,y)=f(x^2+y^2)\). Is the equation

\[ y\,\frac{\partial z}{\partial x}(x,y) - x\,\frac{\partial z}{\partial y}(x,y) = 0 \]

necessarily satisfied?

 11
Define the function
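\[ f(x,y) = \begin{cases} \dfrac{(x+2y)^2}{x+y} & \text{if } x+y\ne 0 \\ 0 & \text{if } x+y=0 \end{cases} \]

1. Evaluate, if possible, \(\frac{\partial f}{\partial x}(0,0)\) and \(\frac{\partial f}{\partial y}(0,0)\).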

2. Is f (x, y) continuous at (0, 0)?

 12

Consider the cylinder whose base is the radius-1 circle in the xy-plane centred at (0, 0), and which slopes parallel to the line in
the yz-plane given by z = y.

When you stand at the point (0, −1, 0), what is the slope of the surface if you look in the positive y direction? The positive x
direction?

1. There are applications in which there are several variables that cannot be varied independently. For example, the pressure,
volume and temperature of an ideal gas are related by the equation of state P V = (constant)T . In those applications, it may
not be clear from the context which variables are being held fixed.
2. It is also possible to evaluate the derivative by using the technique of the optional Section 2.15 in the CLP-1 text.
3. The only real number \(z\) which obeys \(z^5=-1\) is \(z=-1\). However there are four other complex numbers which also obey \(z^5=-1\).

4. You should have already seen this technique, called implicit differentiation, in your first Calculus course. It is covered in
Section 2.11 in the CLP-1 text.
5. If you are not familiar with polar coordinates, don't worry about it. There will be an introduction to them in §3.2.1.
6. Matrix multiplication is usually covered in courses on linear algebra, which you may or may not have taken. That's why this
example is optional.

This page titled 2.2: Partial Derivatives is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Joel Feldman,
Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit
history is available upon request.

2.3: Higher Order Derivatives
You have already observed, in your first Calculus course, that if \(f(x)\) is a function of \(x\), then its derivative, \(\frac{df}{dx}(x)\), is also a function of \(x\), and can be differentiated to give the second order derivative \(\frac{d^2f}{dx^2}(x)\), which can in turn be differentiated yet again to give the third order derivative, \(f^{(3)}(x)\), and so on.
We can do the same for functions of more than one variable. If f (x, y) is a function of x and y, then both of its partial derivatives,
\(\frac{\partial f}{\partial x}(x,y)\) and \(\frac{\partial f}{\partial y}(x,y)\) are also functions of \(x\) and \(y\). They can both be differentiated with respect to \(x\) and they can both be
differentiated with respect to y. So there are four possible second order derivatives. Here they are, together with various alternate
notations.
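\[
\begin{aligned}
\frac{\partial}{\partial x}\Big(\frac{\partial f}{\partial x}\Big)(x,y) &= \frac{\partial^2 f}{\partial x^2}(x,y) = f_{xx}(x,y) \\
\frac{\partial}{\partial y}\Big(\frac{\partial f}{\partial x}\Big)(x,y) &= \frac{\partial^2 f}{\partial y\,\partial x}(x,y) = f_{xy}(x,y) \\
\frac{\partial}{\partial x}\Big(\frac{\partial f}{\partial y}\Big)(x,y) &= \frac{\partial^2 f}{\partial x\,\partial y}(x,y) = f_{yx}(x,y) \\
\frac{\partial}{\partial y}\Big(\frac{\partial f}{\partial y}\Big)(x,y) &= \frac{\partial^2 f}{\partial y^2}(x,y) = f_{yy}(x,y)
\end{aligned}
\]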

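In \(\frac{\partial^2 f}{\partial y\,\partial x}=\frac{\partial^2}{\partial y\,\partial x}f\), the derivative closest to \(f\), in this case \(\frac{\partial}{\partial x}\), is applied first.

In \(f_{xy}\), the derivative with respect to the variable closest to \(f\), in this case \(x\), is applied first.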

 Example 2.3.1

Let \(f(x,y)=e^{my}\cos(nx)\). Then

\[
\begin{aligned}
f_x &= -ne^{my}\sin(nx) & f_y &= me^{my}\cos(nx) \\
f_{xx} &= -n^2e^{my}\cos(nx) & f_{yx} &= -mne^{my}\sin(nx) \\
f_{xy} &= -mne^{my}\sin(nx) & f_{yy} &= m^2e^{my}\cos(nx)
\end{aligned}
\]
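
The closed form for the mixed partial derivative in Example 2.3.1 can be compared against a standard four-point difference quotient. This is only a numerical sketch: the values m = 2, n = 3, the evaluation point and the helper name `mixed_partial` are assumptions made for illustration.

```python
import math

m, n = 2.0, 3.0  # sample constants; any real values would do

def f(x, y):
    return math.exp(m * y) * math.cos(n * x)

def mixed_partial(f, x, y, h=1e-4):
    """Four-point central difference approximation to the mixed second partial."""
    return (f(x + h, y + h) - f(x + h, y - h)
            - f(x - h, y + h) + f(x - h, y - h)) / (4 * h * h)

x0, y0 = 0.5, 0.3
exact = -m * n * math.exp(m * y0) * math.sin(n * x0)  # formula from the example
approx = mixed_partial(f, x0, y0)
print(exact, approx)  # the two values agree to several digits
```

The four-point formula is symmetric in x and y, so it cannot by itself distinguish \(f_{xy}\) from \(f_{yx}\); it only confirms the common value computed in the example.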

 Example 2.3.2

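Let \(f(x,y)=e^{\alpha x+\beta y}\). Then

\[
\begin{aligned}
f_x &= \alpha e^{\alpha x+\beta y} & f_y &= \beta e^{\alpha x+\beta y} \\
f_{xx} &= \alpha^2e^{\alpha x+\beta y} & f_{yx} &= \beta\alpha e^{\alpha x+\beta y} \\
f_{xy} &= \alpha\beta e^{\alpha x+\beta y} & f_{yy} &= \beta^2e^{\alpha x+\beta y}
\end{aligned}
\]

More generally, for any integers \(m,n\ge 0\),

\[ \frac{\partial^{m+n}f}{\partial x^m\,\partial y^n} = \alpha^m\beta^ne^{\alpha x+\beta y} \]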

 Example 2.3.3

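If \(f(x_1,x_2,x_3,x_4)=x_1^4x_2^3x_3^2x_4\), then

\[
\begin{aligned}
\frac{\partial^4 f}{\partial x_1\,\partial x_2\,\partial x_3\,\partial x_4}
&= \frac{\partial^3}{\partial x_1\,\partial x_2\,\partial x_3}\big(x_1^4x_2^3x_3^2\big) \\
&= \frac{\partial^2}{\partial x_1\,\partial x_2}\big(2x_1^4x_2^3x_3\big) \\
&= \frac{\partial}{\partial x_1}\big(6x_1^4x_2^2x_3\big) \\
&= 24x_1^3x_2^2x_3
\end{aligned}
\]

and

\[
\begin{aligned}
\frac{\partial^4 f}{\partial x_4\,\partial x_3\,\partial x_2\,\partial x_1}
&= \frac{\partial^3}{\partial x_4\,\partial x_3\,\partial x_2}\big(4x_1^3x_2^3x_3^2x_4\big) \\
&= \frac{\partial^2}{\partial x_4\,\partial x_3}\big(12x_1^3x_2^2x_3^2x_4\big) \\
&= \frac{\partial}{\partial x_4}\big(24x_1^3x_2^2x_3x_4\big) \\
&= 24x_1^3x_2^2x_3
\end{aligned}
\]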

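Notice that in Example 2.3.1,

\[ f_{xy} = f_{yx} = -mne^{my}\sin(nx) \]

and in Example 2.3.2

\[ f_{xy} = f_{yx} = \alpha\beta e^{\alpha x+\beta y} \]

and in Example 2.3.3

\[ \frac{\partial^4 f}{\partial x_1\,\partial x_2\,\partial x_3\,\partial x_4} = \frac{\partial^4 f}{\partial x_4\,\partial x_3\,\partial x_2\,\partial x_1} = 24x_1^3x_2^2x_3 \]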

In all of these examples, it didn't matter what order we took the derivatives in. The following theorem 1 shows that this was no
accident.

 Theorem 2.3.4. Clairaut's Theorem or Schwarz's Theorem


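If the partial derivatives \(\frac{\partial^2 f}{\partial x\partial y}\) and \(\frac{\partial^2 f}{\partial y\partial x}\) exist and are continuous at \((x_0,y_0)\), then

\[ \frac{\partial^2 f}{\partial x\partial y}(x_0,y_0) = \frac{\partial^2 f}{\partial y\partial x}(x_0,y_0) \]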

Optional — The Proof of Theorem 2.3.4


Outline
Here is an outline of the proof of Theorem 2.3.4. The (numbered) details are in the subsection below. Fix real numbers \(x_0\) and \(y_0\) and define

\[ F(h,k) = \frac{1}{hk}\big[f(x_0+h,y_0+k)-f(x_0,y_0+k)-f(x_0+h,y_0)+f(x_0,y_0)\big] \]

We define \(F(h,k)\) in this way because both partial derivatives \(\frac{\partial^2 f}{\partial x\partial y}(x_0,y_0)\) and \(\frac{\partial^2 f}{\partial y\partial x}(x_0,y_0)\) are limits of \(F(h,k)\) as \(h,k\to 0\).

Precisely, we show in item (1) in the details below that


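\[
\begin{aligned}
\frac{\partial}{\partial y}\frac{\partial f}{\partial x}(x_0,y_0) &= \lim_{k\to 0}\lim_{h\to 0}F(h,k) \\
\frac{\partial}{\partial x}\frac{\partial f}{\partial y}(x_0,y_0) &= \lim_{h\to 0}\lim_{k\to 0}F(h,k)
\end{aligned}
\]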

Note that the two right hand sides here are identical except for the order in which the limits are taken.
Now, by the mean value theorem (four times),

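\[
\begin{aligned}
F(h,k) &\overset{(2)}{=} \frac{1}{h}\Big[\frac{\partial f}{\partial y}(x_0+h,y_0+\theta_1k)-\frac{\partial f}{\partial y}(x_0,y_0+\theta_1k)\Big] \\
&\overset{(3)}{=} \frac{\partial}{\partial x}\frac{\partial f}{\partial y}(x_0+\theta_2h,y_0+\theta_1k) \\
F(h,k) &\overset{(4)}{=} \frac{1}{k}\Big[\frac{\partial f}{\partial x}(x_0+\theta_3h,y_0+k)-\frac{\partial f}{\partial x}(x_0+\theta_3h,y_0)\Big] \\
&\overset{(5)}{=} \frac{\partial}{\partial y}\frac{\partial f}{\partial x}(x_0+\theta_3h,y_0+\theta_4k)
\end{aligned}
\]

for some numbers \(0<\theta_1,\theta_2,\theta_3,\theta_4<1\). All of the numbers \(\theta_1,\theta_2,\theta_3,\theta_4\) depend on \(x_0,y_0,h,k\). Hence

\[ \frac{\partial}{\partial x}\frac{\partial f}{\partial y}(x_0+\theta_2h,y_0+\theta_1k) = \frac{\partial}{\partial y}\frac{\partial f}{\partial x}(x_0+\theta_3h,y_0+\theta_4k) \]

for all \(h\) and \(k\). Taking the limit \((h,k)\to(0,0)\) and using the assumed continuity of both partial derivatives at \((x_0,y_0)\) gives

\[ \lim_{(h,k)\to(0,0)}F(h,k) = \frac{\partial}{\partial x}\frac{\partial f}{\partial y}(x_0,y_0) = \frac{\partial}{\partial y}\frac{\partial f}{\partial x}(x_0,y_0) \]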

as desired. To complete the proof we just have to justify the details (1), (2), (3), (4) and (5).

The Details
1. By definition,
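\[
\begin{aligned}
\frac{\partial}{\partial y}\frac{\partial f}{\partial x}(x_0,y_0)
&= \lim_{k\to 0}\frac{1}{k}\Big[\frac{\partial f}{\partial x}(x_0,y_0+k)-\frac{\partial f}{\partial x}(x_0,y_0)\Big] \\
&= \lim_{k\to 0}\frac{1}{k}\Big[\lim_{h\to 0}\frac{f(x_0+h,y_0+k)-f(x_0,y_0+k)}{h}-\lim_{h\to 0}\frac{f(x_0+h,y_0)-f(x_0,y_0)}{h}\Big] \\
&= \lim_{k\to 0}\lim_{h\to 0}\frac{f(x_0+h,y_0+k)-f(x_0,y_0+k)-f(x_0+h,y_0)+f(x_0,y_0)}{hk} \\
&= \lim_{k\to 0}\lim_{h\to 0}F(h,k)
\end{aligned}
\]

Similarly,

\[
\begin{aligned}
\frac{\partial}{\partial x}\frac{\partial f}{\partial y}(x_0,y_0)
&= \lim_{h\to 0}\frac{1}{h}\Big[\frac{\partial f}{\partial y}(x_0+h,y_0)-\frac{\partial f}{\partial y}(x_0,y_0)\Big] \\
&= \lim_{h\to 0}\frac{1}{h}\Big[\lim_{k\to 0}\frac{f(x_0+h,y_0+k)-f(x_0+h,y_0)}{k}-\lim_{k\to 0}\frac{f(x_0,y_0+k)-f(x_0,y_0)}{k}\Big] \\
&= \lim_{h\to 0}\lim_{k\to 0}\frac{f(x_0+h,y_0+k)-f(x_0+h,y_0)-f(x_0,y_0+k)+f(x_0,y_0)}{hk} \\
&= \lim_{h\to 0}\lim_{k\to 0}F(h,k)
\end{aligned}
\]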

2. The mean value theorem (Theorem 2.13.4 in the CLP-1 text) says that, for any differentiable function \(\varphi(x)\),
the slope of the line joining the points \((x_0,\varphi(x_0))\) and \((x_0+k,\varphi(x_0+k))\) on the graph of \(\varphi\)
is the same as
the slope of the tangent to the graph at some point between \(x_0\) and \(x_0+k\).

That is, there is some \(0<\theta_1<1\) such that

\[ \frac{\varphi(x_0+k)-\varphi(x_0)}{k} = \frac{d\varphi}{dx}(x_0+\theta_1k) \]

Applying this with \(x\) replaced by \(y\) and \(\varphi\) replaced by \(G(y)=f(x_0+h,y)-f(x_0,y)\) gives

\[
\begin{aligned}
\frac{G(y_0+k)-G(y_0)}{k} &= \frac{dG}{dy}(y_0+\theta_1k) \qquad \text{for some } 0<\theta_1<1 \\
&= \frac{\partial f}{\partial y}(x_0+h,y_0+\theta_1k)-\frac{\partial f}{\partial y}(x_0,y_0+\theta_1k)
\end{aligned}
\]

Hence, for some \(0<\theta_1<1\),

\[
\begin{aligned}
F(h,k) &= \frac{1}{h}\Big[\frac{G(y_0+k)-G(y_0)}{k}\Big] \\
&= \frac{1}{h}\Big[\frac{\partial f}{\partial y}(x_0+h,y_0+\theta_1k)-\frac{\partial f}{\partial y}(x_0,y_0+\theta_1k)\Big]
\end{aligned}
\]

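3. Define \(H(x)=\frac{\partial f}{\partial y}(x,y_0+\theta_1k)\). By the mean value theorem,

\[
\begin{aligned}
F(h,k) &= \frac{1}{h}\big[H(x_0+h)-H(x_0)\big] \\
&= \frac{dH}{dx}(x_0+\theta_2h) \qquad \text{for some } 0<\theta_2<1 \\
&= \frac{\partial}{\partial x}\frac{\partial f}{\partial y}(x_0+\theta_2h,y_0+\theta_1k)
\end{aligned}
\]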

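4. Define \(A(x)=f(x,y_0+k)-f(x,y_0)\). By the mean value theorem,

\[
\begin{aligned}
F(h,k) &= \frac{1}{k}\Big[\frac{A(x_0+h)-A(x_0)}{h}\Big] \\
&= \frac{1}{k}\,\frac{dA}{dx}(x_0+\theta_3h) \qquad \text{for some } 0<\theta_3<1 \\
&= \frac{1}{k}\Big[\frac{\partial f}{\partial x}(x_0+\theta_3h,y_0+k)-\frac{\partial f}{\partial x}(x_0+\theta_3h,y_0)\Big]
\end{aligned}
\]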

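5. Define \(B(y)=\frac{\partial f}{\partial x}(x_0+\theta_3h,y)\). By the mean value theorem

\[
\begin{aligned}
F(h,k) &= \frac{1}{k}\big[B(y_0+k)-B(y_0)\big] \\
&= \frac{dB}{dy}(y_0+\theta_4k) \qquad \text{for some } 0<\theta_4<1 \\
&= \frac{\partial}{\partial y}\frac{\partial f}{\partial x}(x_0+\theta_3h,y_0+\theta_4k)
\end{aligned}
\]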

This completes the proof of Theorem 2.3.4.


Optional — An Example of \(\frac{\partial^2 f}{\partial x\partial y}(x_0,y_0)\ne\frac{\partial^2 f}{\partial y\partial x}(x_0,y_0)\)

In Theorem 2.3.4, we showed that \(\frac{\partial^2 f}{\partial x\partial y}(x_0,y_0)=\frac{\partial^2 f}{\partial y\partial x}(x_0,y_0)\) if the partial derivatives \(\frac{\partial^2 f}{\partial x\partial y}\) and \(\frac{\partial^2 f}{\partial y\partial x}\) exist and are continuous at \((x_0,y_0)\). Here is an example which shows that if the partial derivatives \(\frac{\partial^2 f}{\partial x\partial y}\) and \(\frac{\partial^2 f}{\partial y\partial x}\) are not continuous at \((x_0,y_0)\), then it is possible that

\[ \frac{\partial^2 f}{\partial x\partial y}(x_0,y_0) \ne \frac{\partial^2 f}{\partial y\partial x}(x_0,y_0). \]

Define
\[ f(x,y) = \begin{cases} xy\,\dfrac{x^2-y^2}{x^2+y^2} & \text{if } (x,y)\ne(0,0) \\ 0 & \text{if } (x,y)=(0,0) \end{cases} \]

This function is continuous everywhere. Note that f (x, 0) = 0 for all x and f (0, y) = 0 for all y. We now compute the first order
partial derivatives. For (x, y) ≠ (0, 0),
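\[
\begin{aligned}
\frac{\partial f}{\partial x}(x,y)
&= y\,\frac{x^2-y^2}{x^2+y^2} + xy\,\frac{2x}{x^2+y^2} - xy\,\frac{2x(x^2-y^2)}{(x^2+y^2)^2} \\
&= y\,\frac{x^2-y^2}{x^2+y^2} + xy\,\frac{4xy^2}{(x^2+y^2)^2} \\
\frac{\partial f}{\partial y}(x,y)
&= x\,\frac{x^2-y^2}{x^2+y^2} - xy\,\frac{2y}{x^2+y^2} - xy\,\frac{2y(x^2-y^2)}{(x^2+y^2)^2} \\
&= x\,\frac{x^2-y^2}{x^2+y^2} - xy\,\frac{4yx^2}{(x^2+y^2)^2}
\end{aligned}
\]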

For (x, y) = (0, 0),


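\[
\begin{aligned}
\frac{\partial f}{\partial x}(0,0) &= \Big[\frac{d}{dx}f(x,0)\Big]_{x=0} = \Big[\frac{d}{dx}0\Big]_{x=0} = 0 \\
\frac{\partial f}{\partial y}(0,0) &= \Big[\frac{d}{dy}f(0,y)\Big]_{y=0} = \Big[\frac{d}{dy}0\Big]_{y=0} = 0
\end{aligned}
\]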

By way of summary, the two first order partial derivatives are


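\[
\begin{aligned}
f_x(x,y) &= \begin{cases} y\,\dfrac{x^2-y^2}{x^2+y^2} + \dfrac{4x^2y^3}{(x^2+y^2)^2} & \text{if } (x,y)\ne(0,0) \\ 0 & \text{if } (x,y)=(0,0) \end{cases} \\
f_y(x,y) &= \begin{cases} x\,\dfrac{x^2-y^2}{x^2+y^2} - \dfrac{4x^3y^2}{(x^2+y^2)^2} & \text{if } (x,y)\ne(0,0) \\ 0 & \text{if } (x,y)=(0,0) \end{cases}
\end{aligned}
\]

Both \(\frac{\partial f}{\partial x}(x,y)\) and \(\frac{\partial f}{\partial y}(x,y)\) are continuous. Finally, we compute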
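\[
\begin{aligned}
\frac{\partial^2 f}{\partial x\partial y}(0,0)
&= \Big[\frac{d}{dx}f_y(x,0)\Big]_{x=0} = \lim_{h\to 0}\frac{1}{h}\big[f_y(h,0)-f_y(0,0)\big] \\
&= \lim_{h\to 0}\frac{1}{h}\Big[h\,\frac{h^2-0^2}{h^2+0^2}-0\Big] = 1 \\
\frac{\partial^2 f}{\partial y\partial x}(0,0)
&= \Big[\frac{d}{dy}f_x(0,y)\Big]_{y=0} = \lim_{k\to 0}\frac{1}{k}\big[f_x(0,k)-f_x(0,0)\big] \\
&= \lim_{k\to 0}\frac{1}{k}\Big[k\,\frac{0^2-k^2}{0^2+k^2}-0\Big] = -1
\end{aligned}
\]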
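
The two unequal mixed partial derivatives at the origin can be reproduced numerically from the summary formulas for \(f_x\) and \(f_y\) above. This is a sketch under assumptions: the function names `fx` and `fy`, the step sizes, and the variable names are choices made here for illustration.

```python
def fx(x, y):
    # First order partial derivative f_x from the summary above.
    if x == 0.0 and y == 0.0:
        return 0.0
    s = x**2 + y**2
    return y * (x**2 - y**2) / s + 4 * x**2 * y**3 / s**2

def fy(x, y):
    # First order partial derivative f_y from the summary above.
    if x == 0.0 and y == 0.0:
        return 0.0
    s = x**2 + y**2
    return x * (x**2 - y**2) / s - 4 * x**3 * y**2 / s**2

for step in (1e-1, 1e-3, 1e-5):
    d_dx_of_fy = (fy(step, 0.0) - fy(0.0, 0.0)) / step  # x-derivative of f_y at (0,0)
    d_dy_of_fx = (fx(0.0, step) - fx(0.0, 0.0)) / step  # y-derivative of f_x at (0,0)
    print(step, d_dx_of_fy, d_dy_of_fx)  # close to 1 and -1 for every step
```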

Exercises
Stage 1

 1

Let all of the third order partial derivatives of the function f (x, y, z) exist and be continuous. Show that
fxyz (x, y, z) = fxzy (x, y, z) = fyxz (x, y, z) = fyzx (x, y, z)

= fzxy (x, y, z) = fzyx (x, y, z)

 2

Find, if possible, a function \(f(x,y)\) for which \(f_x(x,y) = e^y\) and \(f_y(x,y) = e^x\).

Stage 2

 3

Find the specified partial derivatives.


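1. \(f(x,y) = x^2y^3\); \(f_{xx}(x,y)\), \(f_{xyy}(x,y)\), \(f_{yxy}(x,y)\)

2. \(f(x,y) = e^{xy^2}\); \(f_{xx}(x,y)\), \(f_{xy}(x,y)\), \(f_{xxy}(x,y)\), \(f_{xyy}(x,y)\)

3. \(f(u,v,w) = \dfrac{1}{u+2v+3w}\); \(\dfrac{\partial^3 f}{\partial u\partial v\partial w}(u,v,w)\), \(\dfrac{\partial^3 f}{\partial u\partial v\partial w}(3,2,1)\)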

 4
Find all second partial derivatives of \(f(x,y) = \sqrt{x^2+5y^2}\).

 5

Find the specified partial derivatives.


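1. \(f(x,y,z) = \arctan\left(e^{\sqrt{xy}}\right)\); \(f_{xyz}(x,y,z)\)

2. \(f(x,y,z) = \arctan\left(e^{\sqrt{xy}}\right) + \arctan\left(e^{\sqrt{xz}}\right) + \arctan\left(e^{\sqrt{yz}}\right)\); \(f_{xyz}(x,y,z)\)

3. \(f(x,y,z) = \arctan\left(e^{\sqrt{xyz}}\right)\); \(f_{xx}(1,0,0)\)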

 6✳

Let \(f(r,\theta) = r^m\cos m\theta\) be a function of \(r\) and \(\theta\), where \(m\) is a positive integer.
1. Find the second order partial derivatives \(f_{rr}\), \(f_{r\theta}\), \(f_{\theta\theta}\) and evaluate their respective values at \((r,\theta) = (1,0)\).

2. Determine the value of the real number \(\lambda\) so that \(f(r,\theta)\) satisfies the differential equation
\[ f_{rr} + \frac{\lambda}{r} f_r + \frac{1}{r^2} f_{\theta\theta} = 0 \]

Stage 3

 7
Let \(\alpha > 0\) be a constant. Show that \(u(x,y,z,t) = \dfrac{1}{t^{3/2}}\, e^{-(x^2+y^2+z^2)/(4\alpha t)}\) satisfies the heat equation

\[ u_t = \alpha\left(u_{xx} + u_{yy} + u_{zz}\right) \]

for all \(t > 0\).

1. The history of this important theorem is pretty convoluted. See “A note on the history of mixed partial derivatives” by Thomas
James Higgins which was published in Scripta Mathematica 7 (1940), 59-62. The Theorem is named for Alexis Clairaut (1713-1765), a French mathematician, astronomer, and geophysicist, and Hermann Schwarz (1843-1921), a German mathematician.

This page titled 2.3: Higher Order Derivatives is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Joel
Feldman, Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a detailed
edit history is available upon request.

2.4: The Chain Rule
You already routinely use the one dimensional chain rule
\[ \frac{d}{dt} f(x(t)) = \frac{df}{dx}(x(t))\,\frac{dx}{dt}(t) \]

in doing computations like

\[ \frac{d}{dt}\sin(t^2) = \cos(t^2)\,2t \]

In this example, \(f(x) = \sin(x)\) and \(x(t) = t^2\).

We now generalize the chain rule to functions of more than one variable. For concreteness, we concentrate on the case in which all functions are functions of two variables. That is, we find the partial derivatives \(\frac{\partial F}{\partial s}\) and \(\frac{\partial F}{\partial t}\) of a function \(F(s,t)\) that is defined as a composition

\[ F(s,t) = f(x(s,t)\,,\,y(s,t)) \]

We are using the name \(F\) for the new function \(F(s,t)\) as a reminder that it is closely related to, though not the same as, the function \(f(x,y)\). The partial derivative \(\frac{\partial F}{\partial s}\) is the rate of change of \(F\) when \(s\) is varied with \(t\) held constant. When \(s\) is varied, both the \(x\)-argument, \(x(s,t)\), and the \(y\)-argument, \(y(s,t)\), in \(f(x(s,t)\,,\,y(s,t))\) vary. Consequently, the chain rule for \(f(x(s,t)\,,\,y(s,t))\) is a sum of two terms: one resulting from the variation of the \(x\)-argument and the other resulting from the variation of the \(y\)-argument.

 Theorem 2.4.1. The Chain Rule


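Assume that all first order partial derivatives of \(f(x,y)\), \(x(s,t)\) and \(y(s,t)\) exist and are continuous. Then the same is true for \(F(s,t) = f(x(s,t)\,,\,y(s,t))\) and

\begin{align*}
\frac{\partial F}{\partial s}(s,t) &= \frac{\partial f}{\partial x}(x(s,t)\,,\,y(s,t))\,\frac{\partial x}{\partial s}(s,t) + \frac{\partial f}{\partial y}(x(s,t)\,,\,y(s,t))\,\frac{\partial y}{\partial s}(s,t) \\
\frac{\partial F}{\partial t}(s,t) &= \frac{\partial f}{\partial x}(x(s,t)\,,\,y(s,t))\,\frac{\partial x}{\partial t}(s,t) + \frac{\partial f}{\partial y}(x(s,t)\,,\,y(s,t))\,\frac{\partial y}{\partial t}(s,t)
\end{align*}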

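We will give the proof of this theorem in §2.4.4, below. It is common to state this chain rule as

\begin{align*}
\frac{\partial F}{\partial s} &= \frac{\partial f}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial s} \\
\frac{\partial F}{\partial t} &= \frac{\partial f}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial t}
\end{align*}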

That is, it is common to suppress the function arguments. But you should make sure that you understand what the arguments are
before doing so.
Theorem 2.4.1 is given for the case that F is the composition of a function of two variables, f (x, y), with two functions, x(s, t)
and y(s, t), of two variables each. There is nothing magical about the number two. There are obvious variants for any numbers of
variables. For example,

 Equation 2.4.2
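If \(F(t) = f(x(t), y(t), z(t))\), then

\begin{align*}
\frac{dF}{dt}(t) &= \frac{\partial f}{\partial x}(x(t)\,,\,y(t)\,,\,z(t))\,\frac{dx}{dt}(t) + \frac{\partial f}{\partial y}(x(t)\,,\,y(t)\,,\,z(t))\,\frac{dy}{dt}(t) \\
&\qquad + \frac{\partial f}{\partial z}(x(t)\,,\,y(t)\,,\,z(t))\,\frac{dz}{dt}(t)
\end{align*}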

2.4.1 https://math.libretexts.org/@go/page/89212
and

 Equation 2.4.3

If \(F(s,t) = f(x(s,t))\), then

\[ \frac{\partial F}{\partial t}(s,t) = \frac{df}{dx}(x(s,t))\,\frac{\partial x}{\partial t}(s,t) \]

There will be a large number of examples shortly. First, here is a memory aid.

Memory Aids for the Chain Rule


We recommend strongly that you use the following procedure, without leaving out any steps, the first couple of dozen times that
you use the chain rule.
Step 1: List explicitly all the functions involved and specify the arguments of each function. Ensure that all different functions
have different names. Invent new names for some of the functions if necessary. In the case of the chain rule in Theorem 2.4.1,
the list would be

\[ f(x,y) \qquad x(s,t) \qquad y(s,t) \qquad F(s,t) = f(x(s,t), y(s,t)) \]

While the functions f and F are closely related, they are not the same. One is a function of x and y while the other is a function
of s and t.
Step 2: Write down the template
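\[ \frac{\partial F}{\partial s} = \frac{\partial f}{\partial\,\Box}\ \frac{\partial\,\Box}{\partial s} \]

in which each \(\Box\) is a blank still to be filled in.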

Note that
The function F appears once in the numerator on the left. The function f , from which F is constructed by a change of
variables, appears once in the numerator on the right.
The variable in the denominator on the left appears once in the denominator on the right.
Step 3: Fill in the blanks with every variable that makes sense. In particular, since f is a function of x and y, it may only be
differentiated with respect to x and y. So we add together two copies of our template — one for x and one for y:
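\[ \frac{\partial F}{\partial s} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial s} \]

Note that \(x\) and \(y\) are functions of \(s\) so that the derivatives \(\frac{\partial x}{\partial s}\) and \(\frac{\partial y}{\partial s}\) make sense. The first term, \(\frac{\partial f}{\partial x}\frac{\partial x}{\partial s}\), arises from the variation of \(x\) with respect to \(s\) and the second term, \(\frac{\partial f}{\partial y}\frac{\partial y}{\partial s}\), arises from the variation of \(y\) with respect to \(s\).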
Step 4: Put in the functional dependence explicitly. Fortunately, there is only one functional dependence that makes sense. The
left hand side is a function of s and t. Hence the right hand side must also be a function of s and t. As f is a function of x and
y, this is achieved by evaluating f at x = x(s, t) and y = y(s, t).

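\[ \frac{\partial F}{\partial s}(s,t) = \frac{\partial f}{\partial x}(x(s,t), y(s,t))\,\frac{\partial x}{\partial s}(s,t) + \frac{\partial f}{\partial y}(x(s,t), y(s,t))\,\frac{\partial y}{\partial s}(s,t) \]

If you fail to put in the arguments, or at least if you fail to remember what the arguments are, you may forget that \(\frac{\partial f}{\partial x}\) and \(\frac{\partial f}{\partial y}\) depend on \(s\) and \(t\). Then, if you have to compute a second derivative of \(F\), you will probably fail to differentiate the factors \(\frac{\partial f}{\partial x}(x(s,t), y(s,t))\) and \(\frac{\partial f}{\partial y}(x(s,t), y(s,t))\).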
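To illustrate why the arguments matter, here is a sketch of a second derivative computation. It assumes, beyond the hypotheses of Theorem 2.4.1, that \(f\), \(x\) and \(y\) have continuous second order partial derivatives, so that the mixed partials of \(f\) agree. Every partial derivative of \(f\) below is evaluated at \((x(s,t), y(s,t))\) and every derivative of \(x\) or \(y\) at \((s,t)\).

```latex
\begin{align*}
\frac{\partial}{\partial s}\left[\frac{\partial f}{\partial x}\big(x(s,t),y(s,t)\big)\right]
  &= \frac{\partial^2 f}{\partial x^2}\,\frac{\partial x}{\partial s}
   + \frac{\partial^2 f}{\partial y\,\partial x}\,\frac{\partial y}{\partial s} \\
\frac{\partial^2 F}{\partial s^2}
  &= \frac{\partial^2 f}{\partial x^2}\left(\frac{\partial x}{\partial s}\right)^2
   + 2\,\frac{\partial^2 f}{\partial x\,\partial y}\,\frac{\partial x}{\partial s}\,\frac{\partial y}{\partial s}
   + \frac{\partial^2 f}{\partial y^2}\left(\frac{\partial y}{\partial s}\right)^2
   + \frac{\partial f}{\partial x}\,\frac{\partial^2 x}{\partial s^2}
   + \frac{\partial f}{\partial y}\,\frac{\partial^2 y}{\partial s^2}
\end{align*}
```

The first three terms of \(\frac{\partial^2 F}{\partial s^2}\) come from differentiating the factors \(\frac{\partial f}{\partial x}\) and \(\frac{\partial f}{\partial y}\). They are exactly the terms that get lost if the arguments are forgotten.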

To help remember the formulae of Theorem 2.4.1, it is sometimes also useful to pretend that our variables are physical quantities
with f , F having units of grams, x, y having units of meters and s, t having units of seconds. Note that
the left hand side, \(\frac{\partial F}{\partial s}\), has units grams per second.
Each term on the right hand side contains the partial derivative of \(f\) with respect to a different independent variable. That independent variable appears once in the denominator and once in the numerator, so that its units (in this case meters) cancel out. Thus both of the terms \(\frac{\partial f}{\partial x}\frac{\partial x}{\partial s}\) and \(\frac{\partial f}{\partial y}\frac{\partial y}{\partial s}\) on the right hand side also have the units grams per second. Hence both sides of the equation have the same units.
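Here is a pictorial procedure that uses a tree diagram to help remember the chain rule \(\frac{\partial}{\partial s} f(x(s,t), y(s,t)) = \frac{\partial f}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial s}\). As in the figure on the left below,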
write, on the top row, “f ”.
Write, on the middle row, each of the variables that the function f (x, y) depends on, namely “x” and “y”.
Write, on the bottom row,
below x, each of the variables that the function x(s, t) depends on, namely “s” and “t”, and
below y, each of the variables that the function y(s, t) depends on, namely “s” and “t”.
Draw a line joining each function with each of the variables that it depends on.
Then, as in the figure on the right below, write beside each line, the partial derivative of the function at the top of the line with
respect to the variable at the bottom of the line.

Finally
observe, from the figure below, that there are two paths from f , on the top, to s, on the bottom. One path goes from f at the
top, through x in the middle to s at the bottom. The other path goes from f at the top, through y in the middle to s at the
bottom.
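For each such path, multiply together the partial derivatives beside the lines of the path. In this example, the two products are \(\frac{\partial f}{\partial x}\frac{\partial x}{\partial s}\), for the first path, and \(\frac{\partial f}{\partial y}\frac{\partial y}{\partial s}\), for the second path.
Then add together those products, giving, in this example, \(\frac{\partial f}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial s}\).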

Put in the arguments, as in Step 4, above.


That's it. We have
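\begin{align*}
\frac{\partial}{\partial s} f(x(s,t), y(s,t)) &= \frac{\partial f}{\partial x}(x(s,t), y(s,t))\,\frac{\partial x}{\partial s}(s,t) \\
&\qquad + \frac{\partial f}{\partial y}(x(s,t), y(s,t))\,\frac{\partial y}{\partial s}(s,t)
\end{align*}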

 Example 2.4.4

The right hand side of the chain rule


\begin{align*}
\frac{d}{dt} f(x(t)\,,\,y(t)\,,\,z(t)) &= \frac{\partial f}{\partial x}(x(t)\,,\,y(t)\,,\,z(t))\,\frac{dx}{dt}(t) + \frac{\partial f}{\partial y}(x(t)\,,\,y(t)\,,\,z(t))\,\frac{dy}{dt}(t) \\
&\qquad + \frac{\partial f}{\partial z}(x(t)\,,\,y(t)\,,\,z(t))\,\frac{dz}{dt}(t)
\end{align*}

of Equation 2.4.2, without arguments, is \(\frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} + \frac{\partial f}{\partial z}\frac{dz}{dt}\). The corresponding tree diagram is

Because x(t), y(t) and z(t) are each functions of just one variable, the derivatives beside the lower lines in the tree are
ordinary, rather than partial, derivatives.

Chain Rule Examples


Let's do some routine examples first and work our way to some trickier ones.

 Example 2.4.5. \(\frac{\partial}{\partial s} f(x(s,t), y(s,t))\)

In this example we find \(\frac{\partial}{\partial s} f(x(s,t), y(s,t))\) for

\[ f(x,y) = e^{xy} \qquad x(s,t) = s \qquad y(s,t) = \cos t \]

Define F (s, t) = f (x(s, t) , y(s, t)). The appropriate chain rule for this example is the upper equation of Theorem 2.4.1.
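\[ \frac{\partial F}{\partial s}(s,t) = \frac{\partial f}{\partial x}(x(s,t)\,,\,y(s,t))\,\frac{\partial x}{\partial s}(s,t) + \frac{\partial f}{\partial y}(x(s,t)\,,\,y(s,t))\,\frac{\partial y}{\partial s}(s,t) \]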

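For the given functions

\begin{align*}
f(x,y) &= e^{xy} \\
\frac{\partial f}{\partial x}(x,y) &= y e^{xy} & \frac{\partial f}{\partial x}(x(s,t), y(s,t)) &= y(s,t)\,e^{x(s,t)\,y(s,t)} = \cos t\; e^{s\cos t} \\
\frac{\partial f}{\partial y}(x,y) &= x e^{xy} & \frac{\partial f}{\partial y}(x(s,t), y(s,t)) &= x(s,t)\,e^{x(s,t)\,y(s,t)} = s\, e^{s\cos t} \\
\frac{\partial x}{\partial s} &= 1 & \frac{\partial y}{\partial s} &= 0
\end{align*}

so that

\[ \frac{\partial F}{\partial s}(s,t) = \underbrace{\left\{\cos t\; e^{s\cos t}\right\}}_{\frac{\partial f}{\partial x}}\ \underbrace{(1)}_{\frac{\partial x}{\partial s}} + \underbrace{\left\{s\, e^{s\cos t}\right\}}_{\frac{\partial f}{\partial y}}\ \underbrace{(0)}_{\frac{\partial y}{\partial s}} = \cos t\; e^{s\cos t} \]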

 Example 2.4.6. \(\frac{d}{dt} f(x(t), y(t))\)

In this example we find \(\frac{d}{dt} f(x(t), y(t))\) for

\[ f(x,y) = x^2 - y^2 \qquad x(t) = \cos t \qquad y(t) = \sin t \]

Define \(F(t) = f(x(t), y(t))\). Since \(F(t)\) is a function of one variable its derivative is denoted \(\frac{dF}{dt}\) rather than \(\frac{\partial F}{\partial t}\). The appropriate chain rule for this example (see 2.4.2) is

\[ \frac{dF}{dt}(t) = \frac{\partial f}{\partial x}(x(t), y(t))\,\frac{dx}{dt}(t) + \frac{\partial f}{\partial y}(x(t), y(t))\,\frac{dy}{dt}(t) \]

For the given functions

\begin{align*}
f(x,y) &= x^2 - y^2 \\
\frac{\partial f}{\partial x}(x,y) &= 2x & \frac{\partial f}{\partial x}(x(t), y(t)) &= 2x(t) = 2\cos t \\
\frac{\partial f}{\partial y}(x,y) &= -2y & \frac{\partial f}{\partial y}(x(t), y(t)) &= -2y(t) = -2\sin t \\
\frac{dx}{dt} &= -\sin t & \frac{dy}{dt} &= \cos t
\end{align*}

so that

\[ \frac{dF}{dt}(t) = (2\cos t)(-\sin t) + (-2\sin t)(\cos t) = -4\sin t\cos t \]

Of course, in this example we can compute \(F(t)\) explicitly

\[ F(t) = f(x(t), y(t)) = x(t)^2 - y(t)^2 = \cos^2 t - \sin^2 t \]

and then differentiate

\[ F'(t) = 2(\cos t)(-\sin t) - 2(\sin t)(\cos t) = -4\sin t\cos t \]
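The agreement between the chain rule and direct differentiation can also be checked numerically. The sketch below is illustrative only (the names and the step size `h` are assumptions): it compares the chain rule expression, the closed form \(-4\sin t\cos t\), and a centred difference quotient of \(F(t) = \cos^2 t - \sin^2 t\).

```python
import math


def F(t):
    """F(t) = f(x(t), y(t)) with f(x, y) = x^2 - y^2, x = cos t, y = sin t."""
    return math.cos(t) ** 2 - math.sin(t) ** 2


def dF_dt_chain_rule(t):
    """(df/dx)(dx/dt) + (df/dy)(dy/dt), evaluated along the curve."""
    x, y = math.cos(t), math.sin(t)
    return (2 * x) * (-math.sin(t)) + (-2 * y) * (math.cos(t))


def dF_dt_numeric(t, h=1e-6):
    """Centred difference quotient with an assumed small step h."""
    return (F(t + h) - F(t - h)) / (2 * h)


t0 = 0.9  # arbitrary sample value of t
print(dF_dt_chain_rule(t0), -4 * math.sin(t0) * math.cos(t0), dF_dt_numeric(t0))
```

All three printed values should agree up to the accuracy of the difference quotient.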

 Example 2.4.7. \(\frac{\partial}{\partial t} f(x + ct)\)

Define \(u(x,t) = x + ct\) and \(w(x,t) = f(x+ct) = f(u(x,t))\). Then

\[ \frac{\partial}{\partial t} f(x+ct) = \frac{\partial w}{\partial t}(x,t) = \frac{df}{du}(u(x,t))\,\frac{\partial u}{\partial t}(x,t) = c\, f'(x+ct) \]

 Example 2.4.8. \(\frac{\partial^2}{\partial t^2} f(x + ct)\)

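Define \(w(x,t) = f(x+ct)\) and \(W(x,t) = \frac{\partial w}{\partial t}(x,t) = c\, f'(x+ct) = F(u(x,t))\) where \(F(u) = c\, f'(u)\) and \(u(x,t) = x + ct\). Then

\begin{align*}
\frac{\partial^2}{\partial t^2} f(x+ct) = \frac{\partial W}{\partial t}(x,t) = \frac{dF}{du}(u(x,t))\,\frac{\partial u}{\partial t}(x,t) &= c\, f''(x+ct)\; c \\
&= c^2 f''(x+ct)
\end{align*}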
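For a concrete check of the last two examples, one can pick a specific \(f\) and compare with finite differences. The sketch below assumes \(f = \sin\) and \(c = 3\), which are hypothetical choices made only for illustration, as are the function names and step sizes.

```python
import math

C = 3.0  # hypothetical value of the constant c


def w(x, t):
    """w(x, t) = f(x + ct) with the illustrative choice f = sin."""
    return math.sin(x + C * t)


def w_t_formula(x, t):
    """c f'(x + ct), with f' = cos."""
    return C * math.cos(x + C * t)


def w_tt_formula(x, t):
    """c^2 f''(x + ct), with f'' = -sin."""
    return -C ** 2 * math.sin(x + C * t)


def w_t_numeric(x, t, h=1e-6):
    """Centred first difference in t with x held fixed."""
    return (w(x, t + h) - w(x, t - h)) / (2 * h)


def w_tt_numeric(x, t, h=1e-4):
    """Centred second difference in t with x held fixed."""
    return (w(x, t + h) - 2 * w(x, t) + w(x, t - h)) / h ** 2


x0, t0 = 0.3, 0.5  # arbitrary sample point
print(w_t_formula(x0, t0), w_t_numeric(x0, t0))
print(w_tt_formula(x0, t0), w_tt_numeric(x0, t0))
```

Each printed pair should agree to several decimal places, consistent with the factor \(c\) appearing once per \(t\)-derivative.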

 Example 2.4.9. Equation of state

Suppose that we are told that \(F(P,V,T) = 0\) and that we are to find \(\frac{\partial P}{\partial T}\).

Before we can find \(\frac{\partial P}{\partial T}\), we first have to decide what it means. This happens regularly in applications. In fact, this particular problem comes from thermodynamics. The variables \(P\), \(V\), \(T\) are the pressure, volume and temperature, respectively, of some gas. These three variables are not independent. They are related by an equation of state, here denoted \(F(P,V,T) = 0\). Given values for any two of \(P\), \(V\), \(T\), the third can be found by solving \(F(P,V,T) = 0\). We are being asked to find \(\frac{\partial P}{\partial T}\). This implicitly instructs us to treat \(P\), in this problem, as the dependent variable. So a careful wording of this problem (which you will never encounter in the “real world”) would be the following. The function \(P(V,T)\) is defined by \(F(P(V,T), V, T) = 0\). Find \(\left(\frac{\partial P}{\partial T}\right)_V\). That is, find the rate of change of pressure as the temperature is varied, while holding the volume fixed.

Since we are not told explicitly what F is, we cannot solve explicitly for P (V , T ). So, instead we differentiate both sides of

F (P (V , T ), V , T ) = 0

with respect to T, while holding V fixed. Think of the left hand side, F (P (V , T ), V , T ), as being
F (P (V , T ), Q(V , T ), R(V , T )) with Q(V , T ) = V and R(V , T ) = T . By the chain rule,
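\[ \frac{\partial}{\partial T} F(P(V,T), Q(V,T), R(V,T)) = F_1\frac{\partial P}{\partial T} + F_2\frac{\partial Q}{\partial T} + F_3\frac{\partial R}{\partial T} = 0 \]

with \(F_j\) referring to the partial derivative of \(F\) with respect to its \(j^{\rm th}\) argument. Experienced chain rule users never introduce \(Q\) and \(R\). Instead, they just write

\[ \frac{\partial F}{\partial P}\frac{\partial P}{\partial T} + \frac{\partial F}{\partial V}\frac{\partial V}{\partial T} + \frac{\partial F}{\partial T}\frac{\partial T}{\partial T} = 0 \]

Recalling that \(V\) and \(T\) are the independent variables and that, in computing \(\frac{\partial}{\partial T}\), \(V\) is to be treated as a constant,

\[ \frac{\partial V}{\partial T} = 0 \qquad \frac{\partial T}{\partial T} = 1 \]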

Now putting in the functional dependence


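\[ \frac{\partial F}{\partial P}(P(V,T), V, T)\,\frac{\partial P}{\partial T}(V,T) + \frac{\partial F}{\partial T}(P(V,T), V, T) = 0 \]

and solving

\[ \frac{\partial P}{\partial T}(V,T) = -\frac{\frac{\partial F}{\partial T}(P(V,T), V, T)}{\frac{\partial F}{\partial P}(P(V,T), V, T)} \]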
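As an illustration that is not part of the example itself, suppose the equation of state is the ideal gas law, written as \(F(P,V,T) = PV - nRT = 0\) with \(n\) and \(R\) constants. This particular choice of \(F\) is an assumption made only to show the formula in action.

```latex
\begin{align*}
\frac{\partial F}{\partial P}(P,V,T) &= V
  \qquad
\frac{\partial F}{\partial T}(P,V,T) = -nR \\
\frac{\partial P}{\partial T}(V,T)
  &= -\frac{\frac{\partial F}{\partial T}}{\frac{\partial F}{\partial P}}
   = -\frac{-nR}{V}
   = \frac{nR}{V}
\end{align*}
```

In this special case we can also solve explicitly, \(P(V,T) = \frac{nRT}{V}\), and differentiating with \(V\) held fixed gives the same answer, \(\frac{nR}{V}\).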

 Example 2.4.10
Suppose that \(f(x,y) = 0\) and that we are to find \(\frac{d^2y}{dx^2}\).

Once again, \(x\) and \(y\) are not independent variables. Given a value for either \(x\) or \(y\), the other is determined by solving \(f(x,y) = 0\). Since we are asked to find \(\frac{d^2y}{dx^2}\), it is \(y\) that is to be viewed as a function of \(x\), rather than the other way around. So \(f(x,y) = 0\) really means that, in this problem, \(f(x, y(x)) = 0\) for all \(x\). Differentiating both sides of this equation with respect to \(x\),

\begin{align*}
& f(x, y(x)) = 0 \quad\text{for all } x \\
\implies\ & \frac{d}{dx} f(x, y(x)) = 0
\end{align*}

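Note that \(\frac{d}{dx} f(x, y(x))\) is not the same as \(f_x(x, y(x))\). The former is, by definition, the rate of change with respect to \(x\) of \(g(x) = f(x, y(x))\). Precisely,

\begin{align*}
\frac{dg}{dx} &= \lim_{\Delta x\to 0}\frac{g(x+\Delta x) - g(x)}{\Delta x} \\
&= \lim_{\Delta x\to 0}\frac{f(x+\Delta x\,,\,y(x+\Delta x)) - f(x\,,\,y(x))}{\Delta x} \qquad (*)
\end{align*}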

On the other hand, by definition,


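\begin{align*}
& f_x(x,y) = \lim_{\Delta x\to 0}\frac{f(x+\Delta x, y) - f(x,y)}{\Delta x} \\
\implies\ & f_x(x, y(x)) = \lim_{\Delta x\to 0}\frac{f(x+\Delta x\,,\,y(x)) - f(x\,,\,y(x))}{\Delta x} \qquad (**)
\end{align*}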

The right hand sides of \((*)\) and \((**)\) are not the same. In \(\frac{dg}{dx}\), as \(\Delta x\) varies the value of \(y\) that is substituted into the first \(f(\cdots)\) on the right hand side, namely \(y(x+\Delta x)\), changes as \(\Delta x\) changes. That is, we are computing the rate of change of \(f\) along the (curved) path \(y = y(x)\). In \((**)\), the corresponding value of \(y\) is \(y(x)\) and is independent of \(\Delta x\). That is, we are computing the rate of change of \(f\) along a horizontal straight line. As a concrete example, suppose that \(f(x,y) = x + y\). Then, \(0 = f(x\,,\,y(x)) = x + y(x)\) gives \(y(x) = -x\) so that

\[ \frac{d}{dx} f(x, y(x)) = \frac{d}{dx} f(x, -x) = \frac{d}{dx}\left[x + (-x)\right] = \frac{d}{dx}\, 0 = 0 \]

But \(f(x,y) = x + y\) implies that \(f_x(x,y) = 1\) for all \(x\) and \(y\) so that

\[ f_x(x, y(x)) = f_x(x,y)\Big|_{y=-x} = 1\Big|_{y=-x} = 1 \]
Now back to

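\begin{align*}
& f(x, y(x)) = 0 \quad\text{for all } x \\
\implies\ & \frac{d}{dx} f(x, y(x)) = 0 \\
\implies\ & f_x(x, y(x))\,\frac{dx}{dx} + f_y(x, y(x))\,\frac{dy}{dx}(x) = 0 \quad\text{by the chain rule} \\
\implies\ & \frac{dy}{dx}(x) = -\frac{f_x(x, y(x))}{f_y(x, y(x))} \\
\implies\ & \frac{d^2y}{dx^2}(x) = -\frac{d}{dx}\left[\frac{f_x(x, y(x))}{f_y(x, y(x))}\right] \\
& \phantom{\frac{d^2y}{dx^2}(x)} = -\frac{f_y(x, y(x))\,\frac{d}{dx}\left[f_x(x, y(x))\right] - f_x(x, y(x))\,\frac{d}{dx}\left[f_y(x, y(x))\right]}{f_y(x, y(x))^2} \qquad (\dagger)
\end{align*}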

by the quotient rule. Now it suffices to substitute in \(\frac{d}{dx}\left[f_x(x, y(x))\right]\) and \(\frac{d}{dx}\left[f_y(x, y(x))\right]\). For the former apply the chain rule to \(h(x) = u(x, y(x))\) with \(u(x,y) = f_x(x,y)\).

\begin{align*}
\frac{d}{dx}\left[f_x(x, y(x))\right] &= \frac{dh}{dx}(x) \\
&= u_x(x, y(x))\,\frac{dx}{dx} + u_y(x, y(x))\,\frac{dy}{dx}(x) \\
&= f_{xx}(x, y(x))\,\frac{dx}{dx} + f_{xy}(x, y(x))\,\frac{dy}{dx}(x) \\
&= f_{xx}(x, y(x)) - f_{xy}(x, y(x))\left[\frac{f_x(x, y(x))}{f_y(x, y(x))}\right]
\end{align*}

Substituting this and


\begin{align*}
\frac{d}{dx}\left[f_y(x, y(x))\right] &= f_{yx}(x, y(x))\,\frac{dx}{dx} + f_{yy}(x, y(x))\,\frac{dy}{dx}(x) \\
&= f_{yx}(x, y(x)) - f_{yy}(x, y(x))\left[\frac{f_x(x, y(x))}{f_y(x, y(x))}\right]
\end{align*}

into the right hand side of (†) gives the final answer.
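\begin{align*}
\frac{d^2y}{dx^2}(x) &= -\frac{f_y f_{xx} - f_y f_{xy}\frac{f_x}{f_y} - f_x f_{yx} + f_x f_{yy}\frac{f_x}{f_y}}{f_y^2} \\
&= -\frac{f_y^2 f_{xx} - 2 f_x f_y f_{xy} + f_x^2 f_{yy}}{f_y^3}
\end{align*}

with all of \(f_x\), \(f_y\), \(f_{xx}\), \(f_{xy}\), \(f_{yy}\) having arguments \((x\,,\,y(x))\).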
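As a check on the final formula, here is a worked instance with an illustrative choice of \(f\) that is not taken from the text: the unit circle \(f(x,y) = x^2 + y^2 - 1 = 0\), on its upper half where \(y(x) = \sqrt{1-x^2} > 0\).

```latex
\begin{align*}
f_x &= 2x \qquad f_y = 2y \qquad f_{xx} = 2 \qquad f_{xy} = 0 \qquad f_{yy} = 2 \\
\frac{d^2y}{dx^2}
  &= -\frac{f_y^2 f_{xx} - 2 f_x f_y f_{xy} + f_x^2 f_{yy}}{f_y^3}
   = -\frac{(4y^2)(2) - 0 + (4x^2)(2)}{8y^3}
   = -\frac{x^2+y^2}{y^3}
   = -\frac{1}{y^3}
\end{align*}
```

Differentiating \(y(x) = \sqrt{1-x^2}\) directly gives \(y' = -\frac{x}{y}\) and then \(y'' = -\frac{y - x y'}{y^2} = -\frac{y^2 + x^2}{y^3} = -\frac{1}{y^3}\), in agreement with the formula.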

We now move on to the proof of Theorem 2.4.1. To give you an idea of how the proof will go, we first review the proof of the
familiar one dimensional chain rule.
Review of the Proof of \(\frac{d}{dt} f(x(t)) = \frac{df}{dx}(x(t))\,\frac{dx}{dt}(t)\)

As a warm up, let's review the proof of the one dimensional chain rule
\[ \frac{d}{dt} f(x(t)) = \frac{df}{dx}(x(t))\,\frac{dx}{dt}(t) \]

assuming that \(\frac{dx}{dt}\) exists and that \(\frac{df}{dx}\) is continuous. We wish to find the derivative of \(F(t) = f(x(t))\). By definition

\begin{align*}
F'(t) &= \lim_{h\to 0}\frac{F(t+h) - F(t)}{h} \\
&= \lim_{h\to 0}\frac{f(x(t+h)) - f(x(t))}{h}
\end{align*}

Notice that the numerator is the difference of \(f(x)\) evaluated at two nearby values of \(x\), namely \(x_1 = x(t+h)\) and \(x_0 = x(t)\). The mean value theorem is a good tool for studying the difference in the values of \(f(x)\) at two nearby points. Recall that the mean value theorem says that, for any given \(x_0\) and \(x_1\), there exists an (in general unknown) \(c\) between them so that

\[ f(x_1) - f(x_0) = f'(c)\,(x_1 - x_0) \]

For this proof, we choose \(x_0 = x(t)\) and \(x_1 = x(t+h)\). Then the mean value theorem tells us that there exists a \(c_h\) so that

\[ f(x(t+h)) - f(x(t)) = f(x_1) - f(x_0) = f'(c_h)\left[x(t+h) - x(t)\right] \]

We have put the subscript \(h\) on \(c_h\) to emphasise that \(c_h\), which is between \(x_0 = x(t)\) and \(x_1 = x(t+h)\), may depend on \(h\). Now since \(c_h\) is trapped between \(x(t)\) and \(x(t+h)\) and since \(x(t+h) \to x(t)\) as \(h \to 0\), we have that \(c_h\) must also tend to \(x(t)\) as \(h \to 0\). Plugging this into the definition of \(F'(t)\), and using the continuity of \(f'\),


\begin{align*}
F'(t) &= \lim_{h\to 0}\frac{f(x(t+h)) - f(x(t))}{h} \\
&= \lim_{h\to 0}\frac{f'(c_h)\left[x(t+h) - x(t)\right]}{h} \\
&= \lim_{h\to 0} f'(c_h)\ \lim_{h\to 0}\frac{x(t+h) - x(t)}{h} \\
&= f'(x(t))\; x'(t)
\end{align*}

as desired.

Proof of Theorem 2.4.1


We'll now prove the formula for \(\frac{\partial}{\partial s} f(x(s,t)\,,\,y(s,t))\) that is given in Theorem 2.4.1. The proof uses the same ideas as the proof of the one variable chain rule, that we have just reviewed.


We wish to find the partial derivative with respect to s of F (s, t) = f (x(s, t) , y(s, t)). By definition

\begin{align*}
\frac{\partial F}{\partial s}(s,t) &= \lim_{h\to 0}\frac{F(s+h,t) - F(s,t)}{h} \\
&= \lim_{h\to 0}\frac{f(x(s+h,t)\,,\,y(s+h,t)) - f(x(s,t)\,,\,y(s,t))}{h}
\end{align*}

The numerator is the difference of \(f(x,y)\) evaluated at two nearby values of \((x,y)\), namely \((x_1, y_1) = (x(s+h,t)\,,\,y(s+h,t))\) and \((x_0, y_0) = (x(s,t)\,,\,y(s,t))\). In going from \((x_0, y_0)\) to \((x_1, y_1)\), both the \(x\) and \(y\)-coordinates change. By adding and subtracting we can separate the change in the \(x\)-coordinate from the change in the \(y\)-coordinate.

\[ f(x_1, y_1) - f(x_0, y_0) = \left\{f(x_1, y_1) - f(x_0, y_1)\right\} + \left\{f(x_0, y_1) - f(x_0, y_0)\right\} \]

The first half, \(\left\{f(x_1, y_1) - f(x_0, y_1)\right\}\), has the same \(y\) argument in both terms and so is the difference of the function of one variable \(g(x) = f(x, y_1)\) (viewing \(y_1\) just as a constant) evaluated at the two nearby values, \(x_0\), \(x_1\), of \(x\). Consequently, we can make use of the mean value theorem as we did in §2.4.3 above. There is a \(c_{x,h}\) between \(x_0 = x(s,t)\) and \(x_1 = x(s+h,t)\) such that

\begin{align*}
f(x_1, y_1) - f(x_0, y_1) = g(x_1) - g(x_0) = g'(c_{x,h})\left[x_1 - x_0\right] &= \frac{\partial f}{\partial x}(c_{x,h}\,,\,y_1)\left[x_1 - x_0\right] \\
&= \frac{\partial f}{\partial x}(c_{x,h}\,,\,y(s+h,t))\left[x(s+h,t) - x(s,t)\right]
\end{align*}

We have introduced the two subscripts in \(c_{x,h}\) to remind ourselves that it may depend on \(h\) and that it lies between the two \(x\)-values \(x_0\) and \(x_1\).

Similarly, the second half, \(\left\{f(x_0, y_1) - f(x_0, y_0)\right\}\), is the difference of the function of one variable \(h(y) = f(x_0, y)\) (viewing \(x_0\) just as a constant) evaluated at the two nearby values, \(y_0\), \(y_1\), of \(y\). So, by the mean value theorem,

\begin{align*}
f(x_0, y_1) - f(x_0, y_0) = h(y_1) - h(y_0) = h'(c_{y,h})\left[y_1 - y_0\right] &= \frac{\partial f}{\partial y}(x_0\,,\,c_{y,h})\left[y_1 - y_0\right] \\
&= \frac{\partial f}{\partial y}(x(s,t)\,,\,c_{y,h})\left[y(s+h,t) - y(s,t)\right]
\end{align*}

for some (unknown) \(c_{y,h}\) between \(y_0 = y(s,t)\) and \(y_1 = y(s+h,t)\). Again, the two subscripts in \(c_{y,h}\) remind us that it may depend on \(h\) and that it lies between the two \(y\)-values \(y_0\) and \(y_1\). So, noting that, as \(h\) tends to zero, \(c_{x,h}\), which is trapped between \(x(s,t)\) and \(x(s+h,t)\), must tend to \(x(s,t)\), and \(c_{y,h}\), which is trapped between \(y(s,t)\) and \(y(s+h,t)\), must tend to \(y(s,t)\),

\begin{align*}
\frac{\partial F}{\partial s}(s,t) &= \lim_{h\to 0}\frac{f(x(s+h,t)\,,\,y(s+h,t)) - f(x(s,t)\,,\,y(s,t))}{h} \\
&= \lim_{h\to 0}\frac{\frac{\partial f}{\partial x}(c_{x,h}\,,\,y(s+h,t))\left[x(s+h,t) - x(s,t)\right]}{h} \\
&\qquad + \lim_{h\to 0}\frac{\frac{\partial f}{\partial y}(x(s,t)\,,\,c_{y,h})\left[y(s+h,t) - y(s,t)\right]}{h} \\
&= \lim_{h\to 0}\frac{\partial f}{\partial x}(c_{x,h}\,,\,y(s+h,t))\ \lim_{h\to 0}\frac{x(s+h,t) - x(s,t)}{h} \\
&\qquad + \lim_{h\to 0}\frac{\partial f}{\partial y}(x(s,t)\,,\,c_{y,h})\ \lim_{h\to 0}\frac{y(s+h,t) - y(s,t)}{h} \\
&= \frac{\partial f}{\partial x}(x(s,t)\,,\,y(s,t))\,\frac{\partial x}{\partial s}(s,t) + \frac{\partial f}{\partial y}(x(s,t)\,,\,y(s,t))\,\frac{\partial y}{\partial s}(s,t)
\end{align*}

We can of course follow the same procedure to evaluate the partial derivative with respect to t. This concludes the proof of
Theorem 2.4.1.

Exercises
Stage 1

 1

Write out the chain rule for each of the following functions.
1. \(\frac{\partial h}{\partial x}\) for \(h(x,y) = f(x, u(x,y))\)
2. \(\frac{dh}{dx}\) for \(h(x) = f(x, u(x), v(x))\)
3. \(\frac{\partial h}{\partial x}\) for \(h(x,y,z) = f(u(x,y,z), v(x,y), w(x))\)

 2

A piece of the surface \(z = f(x,y)\) is shown below for some continuously differentiable function \(f(x,y)\). The level curve \(f(x,y) = z_1\) is marked with a blue line. The three points \(P_0\), \(P_1\), and \(P_2\) lie on the surface.

On the level curve \(z = z_1\), we can think of \(y\) as a function of \(x\). Let \(w(x) = f(x, y(x)) = z_1\). We approximate, at \(P_0\), \(f_x(x,y) \approx \frac{\Delta f}{\Delta x}\) and \(\frac{dw}{dx}(x) \approx \frac{\Delta w}{\Delta x}\). Identify the quantities \(\Delta f\), \(\Delta w\), and \(\Delta x\) from the diagram.

 3✳

Let \(w = f(x,y,t)\) with \(x\) and \(y\) depending on \(t\). Suppose that at some point \((x,y)\) and at some time \(t\), the partial derivatives \(f_x\), \(f_y\) and \(f_t\) are equal to \(2\), \(-3\) and \(5\) respectively, while \(\frac{dx}{dt} = 1\) and \(\frac{dy}{dt} = 2\). Find and explain the difference between \(\frac{dw}{dt}\) and \(f_t\).

 4

Thermodynamics texts use the relationship


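\[ \left(\frac{\partial y}{\partial x}\right)\left(\frac{\partial z}{\partial y}\right)\left(\frac{\partial x}{\partial z}\right) = -1 \]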

Explain the meaning of this equation and prove that it is true.

 5

What is wrong with the following argument? Suppose that w = f (x, y, z) and z = g(x, y). By the chain rule,
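\[ \frac{\partial w}{\partial x} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial x} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial x} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial x} = \frac{\partial w}{\partial x} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial x} \]

Hence \(0 = \frac{\partial w}{\partial z}\frac{\partial z}{\partial x}\) and so \(\frac{\partial w}{\partial z} = 0\) or \(\frac{\partial z}{\partial x} = 0\).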

Stage 2

 6

Use two methods (one using the chain rule) to evaluate \(\frac{\partial w}{\partial s}\) and \(\frac{\partial w}{\partial t}\) given that the function \(w = x^2 + y^2 + z^2\), with \(x = st\), \(y = s\cos t\) and \(z = s\sin t\).

 7
Evaluate \(\frac{\partial^3}{\partial x\partial y^2} f(2x+3y, xy)\) in terms of partial derivatives of \(f\). You may assume that \(f\) is a smooth function so that the Chain Rule and Clairaut's Theorem on the equality of the mixed partial derivatives apply.

 8

Find all second order derivatives of g(s, t) = f (2s + 3t, 3s − 2t). You may assume that f (x, y) is a smooth function so that
the Chain Rule and Clairaut's Theorem on the equality of the mixed partial derivatives apply.

 9✳
Assume that \(f(x,y)\) satisfies Laplace's equation \(\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = 0\). Show that this is also the case for the composite function \(g(s,t) = f(s-t, s+t)\). That is, show that \(\frac{\partial^2 g}{\partial s^2} + \frac{\partial^2 g}{\partial t^2} = 0\). You may assume that \(f(x,y)\) is a smooth function so that the Chain Rule and Clairaut's Theorem on the equality of the mixed partial derivatives apply.

 10 ✳

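Let \(z = f(x,y)\) where \(x = 2s + t\) and \(y = s - t\). Find the values of the constants \(a\), \(b\) and \(c\) such that

\[ a\,\frac{\partial^2 z}{\partial x^2} + b\,\frac{\partial^2 z}{\partial x\,\partial y} + c\,\frac{\partial^2 z}{\partial y^2} = \frac{\partial^2 z}{\partial s^2} + \frac{\partial^2 z}{\partial t^2} \]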

You may assume that z = f (x, y) is a smooth function so that the Chain Rule and Clairaut's Theorem on the equality of the
mixed partial derivatives apply.

 11 ✳

Let \(F\) be a function on \(\mathbb{R}^2\). Denote points in \(\mathbb{R}^2\) by \((u,v)\) and the corresponding partial derivatives of \(F\) by \(F_u(u,v)\), \(F_v(u,v)\), \(F_{uu}(u,v)\), \(F_{uv}(u,v)\), etc. Assume those derivatives are all continuous. Express

\[ \frac{\partial^2}{\partial x\,\partial y} F(x^2 - y^2, 2xy) \]

in terms of partial derivatives of the function \(F\).

 12 ✳
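\(u(x,y)\) is defined as

\[ u(x,y) = e^y\, F\left(x e^{-y^2}\right) \]

for an arbitrary function \(F(z)\).

1. If \(F(z) = \ln(z)\), find \(\frac{\partial u}{\partial x}\) and \(\frac{\partial u}{\partial y}\).

2. For an arbitrary \(F(z)\) show that \(u(x,y)\) satisfies

\[ 2xy\,\frac{\partial u}{\partial x} + \frac{\partial u}{\partial y} = u \]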

 13 ✳
Let \(f(x)\) and \(g(x)\) be two functions of \(x\) satisfying \(f''(7) = -2\) and \(g''(-4) = -1\). If \(z = h(s,t) = f(2s+3t) + g(s-6t)\) is a function of \(s\) and \(t\), find the value of \(\frac{\partial^2 z}{\partial t^2}\) when \(s = 2\) and \(t = 1\).

 14 ✳
Suppose that w = f (xz, yz), where f is a differentiable function. Show that
\[ x\,\frac{\partial w}{\partial x} + y\,\frac{\partial w}{\partial y} = z\,\frac{\partial w}{\partial z} \]

 15 ✳

Suppose \(z = f(x,y)\) has continuous second order partial derivatives, and \(x = r\cos t\), \(y = r\sin t\). Express the following partial derivatives in terms of \(r\), \(t\), and partial derivatives of \(f\).
1. \(\frac{\partial z}{\partial t}\)
2. \(\frac{\partial^2 z}{\partial t^2}\)

 16 ✳

Let \(z = f(x,y)\), where \(f(x,y)\) has continuous second-order partial derivatives, and

\begin{align*}
f_x(2,1) &= 5, & f_y(2,1) &= -2, \\
f_{xx}(2,1) &= 2, & f_{xy}(2,1) &= 1, & f_{yy}(2,1) &= -4
\end{align*}

Find \(\frac{d^2}{dt^2} z(x(t), y(t))\) when \(x(t) = 2t^2\), \(y(t) = t^3\) and \(t = 1\).

 17 ✳
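Assume that the function \(F(x,y,z)\) satisfies the equation \(\frac{\partial F}{\partial z} = \frac{\partial^2 F}{\partial x^2} + \frac{\partial^2 F}{\partial y^2}\) and the mixed partial derivatives \(\frac{\partial^2 F}{\partial x\partial y}\) and \(\frac{\partial^2 F}{\partial y\partial x}\) are equal. Let \(A\) be some constant and let \(G(\gamma, s, t) = F(\gamma + s, \gamma - s, At)\). Find the value of \(A\) such that

\[ \frac{\partial G}{\partial t} = \frac{\partial^2 G}{\partial \gamma^2} + \frac{\partial^2 G}{\partial s^2}. \]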

 18 ✳

Let \(f(x)\) be a differentiable function, and suppose it is given that \(f'(0) = 10\). Let \(g(s,t) = f(as - bt)\), where \(a\) and \(b\) are constants. Evaluate \(\frac{\partial g}{\partial s}\) at the point \((s,t) = (b,a)\), that is, find \(\frac{\partial g}{\partial s}\Big|_{(b,a)}\).

 19 ✳
Let f (u, v) be a differentiable function of two variables, and let z be a differentiable function of x and y defined implicitly by
f (xz, yz) = 0. Show that

\[ x\,\frac{\partial z}{\partial x} + y\,\frac{\partial z}{\partial y} = -z \]

 20 ✳

Let w(s, t) = u(2s + 3t, 3s − 2t) for some twice differentiable function u = u(x, y).
1. Find \(w_{ss}\) in terms of \(u_{xx}\), \(u_{xy}\), and \(u_{yy}\). You can assume that \(u_{xy} = u_{yx}\).

2. Suppose \(u_{xx} + u_{yy} = 0\). For what constant \(A\) will \(w_{ss} = A\,w_{tt}\)?

 21 ✳
Suppose that \(f(x,y)\) is twice differentiable (with \(f_{xy} = f_{yx}\)), and \(x = r\cos\theta\) and \(y = r\sin\theta\).
1. Evaluate \(f_\theta\), \(f_r\) and \(f_{r\theta}\) in terms of \(r\), \(\theta\) and partial derivatives of \(f\) with respect to \(x\) and \(y\).

2. Let \(g(x,y)\) be another function satisfying \(g_x = f_y\) and \(g_y = -f_x\). Express \(f_r\) and \(f_\theta\) in terms of \(r\), \(\theta\) and \(g_r\), \(g_\theta\).

 22 ✳

By definition, the gradient of the differentiable function \(f(x,y)\) at the point \((x_0, y_0)\) is

\[ \nabla f(x_0, y_0) = \left\langle \frac{\partial f}{\partial x}(x_0, y_0)\,,\,\frac{\partial f}{\partial y}(x_0, y_0) \right\rangle \]

Suppose that we know

\[ \nabla f(3,6) = \langle 7, 8\rangle \]

Suppose also that

\[ \nabla g(1,2) = \langle -1, 4\rangle, \]

and

\[ \nabla h(1,2) = \langle -5, 10\rangle. \]

Assuming \(g(1,2) = 3\), \(h(1,2) = 6\), and \(z(s,t) = f(g(s,t), h(s,t))\), find

\[ \nabla z(1,2) \]

 23 ✳
1. Let \(f\) be an arbitrary differentiable function defined on the entire real line. Show that the function \(w\) defined on the entire plane as

\[ w(x,y) = e^{-y} f(x-y) \]

satisfies the partial differential equation:

\[ w + \frac{\partial w}{\partial x} + \frac{\partial w}{\partial y} = 0 \]

2. The equations \(x = u^3 - 3uv^2\), \(y = 3u^2v - v^3\) and \(z = u^2 - v^2\) define \(z\) as a function of \(x\) and \(y\). Determine \(\frac{\partial z}{\partial x}\) at the point \((u,v) = (2,1)\) which corresponds to the point \((x,y) = (2,11)\).

 24 ✳

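The equations

\begin{align*}
x^2 - y\cos(uv) &= v \\
x^2 + y^2 - \sin(uv) &= \frac{4}{\pi}\,u
\end{align*}

define \(x\) and \(y\) implicitly as functions of \(u\) and \(v\) (i.e. \(x = x(u,v)\), and \(y = y(u,v)\)) near the point \((x,y) = (1,1)\) at which \((u,v) = \left(\frac{\pi}{2}, 0\right)\).

1. Find \(\frac{\partial x}{\partial u}\) and \(\frac{\partial y}{\partial u}\) at \((u,v) = \left(\frac{\pi}{2}, 0\right)\).

2. If \(z = x^4 + y^4\), determine \(\frac{\partial z}{\partial u}\) at the point \((u,v) = \left(\frac{\pi}{2}, 0\right)\).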

 25 ✳

Let f (u, v) be a differentiable function, and let u = x + y and v = x − y. Find a constant, α, such that
\[ (f_x)^2 + (f_y)^2 = \alpha\left((f_u)^2 + (f_v)^2\right) \]

Stage 3

 26

The wave equation


\[ \frac{\partial^2 u}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2 u}{\partial t^2} = 0 \]

arises in many models involving wave-like phenomena. Let u(x, t) and v(ξ, η) be related by the change of variables

u(x, t) = v(ξ(x, t), η(x, t))

ξ(x, t) = x − ct

η(x, t) = x + ct

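1. Show that \(\frac{\partial^2 u}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2 u}{\partial t^2} = 0\) if and only if \(\frac{\partial^2 v}{\partial \xi\,\partial \eta} = 0\).

2. Show that \(\frac{\partial^2 u}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2 u}{\partial t^2} = 0\) if and only if \(u(x,t) = F(x-ct) + G(x+ct)\) for some functions \(F\) and \(G\).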

3. Interpret  F (x − ct) + G(x + ct)  in terms of travelling waves. Think of u(x, t) as the height, at position x and time t, of
a wave that is travelling along the x-axis.
Remark: Don't be thrown by the strange symbols ξ and η. They are just two harmless letters from the Greek alphabet, called
“xi” and “eta” respectively.

 27
Evaluate
1. \(\frac{\partial y}{\partial z}\) if \(e^{yz} - x^2 z\ln y = \pi\)
2. \(\frac{dy}{dx}\) if \(F(x, y, x^2 - y^2) = 0\)
3. \(\left(\frac{\partial y}{\partial x}\right)_u\) if \(xyuv = 1\) and \(x + y + u + v = 0\)

This page titled 2.4: The Chain Rule is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Joel Feldman,
Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit
history is available upon request.

2.5: Tangent Planes and Normal Lines
The tangent line to the curve \(y = f(x)\) at the point \((x_0, f(x_0))\) is the straight line that fits the curve best¹ at that point. Finding tangent lines was probably one of the first applications of derivatives that you saw. See, for example, Theorem 2.3.2 in the CLP-1 text. The analog of the tangent line one dimension up is the tangent plane. The tangent plane to a surface \(S\) at a point \((x_0, y_0, z_0)\) is the plane that fits \(S\) best at \((x_0, y_0, z_0)\). For example, the tangent plane to the hemisphere

\[ S = \left\{\, (x,y,z) \;\middle|\; x^2 + y^2 + (z-1)^2 = 1,\ 0 \le z \le 1 \,\right\} \]

at the origin is the \(xy\)-plane, \(z = 0\).

We are now going to determine, as our first application of partial derivatives, the tangent plane to a general surface \(S\) at a general point \((x_0, y_0, z_0)\) lying on the surface. We will also determine the line which passes through \((x_0, y_0, z_0)\) and whose direction is perpendicular to \(S\) at \((x_0, y_0, z_0)\). It is called the normal line to \(S\) at \((x_0, y_0, z_0)\).

For example, the following figure shows the side view of the tangent plane (in black) and normal line (in blue) to the surface \(z = x^2 + y^2\) (in red) at the point \((0, 1, 1)\).

Recall, from 1.4.1, that to specify any plane, we need


one point on the plane and
a vector perpendicular to the plane, i.e. a normal vector,
and recall, from 1.5.1, that to specify any line, we need
one point on the line and
a direction vector for the line.
We already have one point that is on both the tangent plane of interest and the normal line of interest, namely \((x_0, y_0, z_0)\). Furthermore we can use any (nonzero) vector that is perpendicular to \(S\) at \((x_0, y_0, z_0)\) as both the normal vector to the tangent plane and the direction vector of the normal line.


So our main task is to determine a normal vector to the surface S at (x , y , z ). That's what we do now, first for surfaces of the
0 0 0

form z = f (x, y) and then, more generally, for surfaces of the form G(x, y, z) = 0.

Surfaces of the Form z = f (x, y)


We construct a vector perpendicular to the surface $z=f(x,y)$ at $(x_0,y_0,f(x_0,y_0))$ by, first, constructing two tangent vectors to the specified surface at the specified point, and, second, taking the cross product of those two tangent vectors. Consider the red curve in the figure below. It is the intersection of our surface $z=f(x,y)$

2.5.1 https://math.libretexts.org/@go/page/89213
with the plane $y=y_0$. Here is a side view of the red curve.

The vector from the point $(x_0,y_0,f(x_0,y_0))$, on the red curve, to the point $(x_0+h,\,y_0,\,f(x_0+h,y_0))$, also on the red curve, is almost tangent to the red curve, if $h$ is very small. As $h$ tends to $0$, that vector, which is

$$\langle h\,,\,0\,,\,f(x_0+h,y_0)-f(x_0,y_0)\rangle$$

becomes exactly tangent to the curve. However its length also tends to $0$. If we divide by $h$, and then take the limit $h\to 0$, we get

$$\lim_{h\to 0}\frac{1}{h}\langle h\,,\,0\,,\,f(x_0+h,y_0)-f(x_0,y_0)\rangle = \lim_{h\to 0}\left\langle 1\,,\,0\,,\,\frac{f(x_0+h,y_0)-f(x_0,y_0)}{h}\right\rangle$$

Since the limit $\lim_{h\to 0}\frac{f(x_0+h,y_0)-f(x_0,y_0)}{h}$ is the definition of the partial derivative $f_x(x_0,y_0)$, we get that

$$\lim_{h\to 0}\frac{1}{h}\langle h\,,\,0\,,\,f(x_0+h,y_0)-f(x_0,y_0)\rangle = \langle 1\,,\,0\,,\,f_x(x_0,y_0)\rangle$$

is a nonzero vector that is exactly tangent to the red curve and hence is also tangent to our surface $z=f(x,y)$ at the point $(x_0,y_0,f(x_0,y_0))$.
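
The limiting process above can be watched numerically. The following is a minimal sketch, assuming an illustrative surface $z=x^2+y^2$ and the base point $(1,0)$; the function names `rescaled_secant_x` and `paraboloid` are hypothetical and chosen only for this illustration. It evaluates the rescaled secant vector $\frac1h\langle h, 0, f(x_0+h,y_0)-f(x_0,y_0)\rangle$ for shrinking $h$.

```python
def rescaled_secant_x(f, x0, y0, h):
    """Return (1/h) * <h, 0, f(x0 + h, y0) - f(x0, y0)> as a tuple."""
    return (1.0, 0.0, (f(x0 + h, y0) - f(x0, y0)) / h)


def paraboloid(x, y):
    # Illustrative surface z = x^2 + y^2 (an assumption for this sketch).
    return x**2 + y**2


# As h shrinks, the third component approaches f_x(1, 0) = 2,
# so the vectors approach <1, 0, f_x(1, 0)>.
for h in (0.1, 0.01, 0.001):
    print(h, rescaled_secant_x(paraboloid, 1.0, 0.0, h))
```

For this particular surface the third component works out to $2+h$, so the printed vectors visibly settle towards $\langle 1, 0, 2\rangle$.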

For the second tangent vector, we repeat the process with the blue curve in the figure at the beginning of this subsection. That blue curve is the intersection of our surface $z=f(x,y)$ with the plane $x=x_0$. Here is a front view of the blue curve.

When $h$ is very small, the vector

$$\frac{1}{h}\langle 0\,,\,h\,,\,f(x_0,y_0+h)-f(x_0,y_0)\rangle$$

from the point $(x_0,y_0,f(x_0,y_0))$, on the blue curve, to $(x_0,\,y_0+h,\,f(x_0,y_0+h))$, also on the blue curve, (and lengthened by a factor $\frac{1}{h}$) is almost tangent to the blue curve. Taking the limit $h\to 0$ gives the tangent vector

\begin{align*}
\lim_{h\to 0}\frac{1}{h}\langle 0\,,\,h\,,\,f(x_0,y_0+h)-f(x_0,y_0)\rangle
&= \lim_{h\to 0}\left\langle 0\,,\,1\,,\,\frac{f(x_0,y_0+h)-f(x_0,y_0)}{h}\right\rangle\\
&= \langle 0\,,\,1\,,\,f_y(x_0,y_0)\rangle
\end{align*}

to the blue curve at the point $(x_0,y_0,f(x_0,y_0))$.

Now that we have two vectors in the tangent plane to the surface $z=f(x,y)$ at $(x_0,y_0,f(x_0,y_0))$, we can find a normal vector to the tangent plane by taking their cross product. Their cross product is

\begin{align*}
\langle 1\,,\,0\,,\,f_x(x_0,y_0)\rangle\times\langle 0\,,\,1\,,\,f_y(x_0,y_0)\rangle
&= \det\begin{bmatrix}\hat{\imath} & \hat{\jmath} & \hat{k}\\ 1 & 0 & f_x(x_0,y_0)\\ 0 & 1 & f_y(x_0,y_0)\end{bmatrix}\\
&= -f_x(x_0,y_0)\,\hat{\imath} - f_y(x_0,y_0)\,\hat{\jmath} + \hat{k}
\end{align*}

and we have that the vector

$$-f_x(x_0,y_0)\,\hat{\imath} - f_y(x_0,y_0)\,\hat{\jmath} + \hat{k}$$

is perpendicular to the surface $z=f(x,y)$ at $(x_0,y_0,f(x_0,y_0))$.

The tangent plane to the surface $z=f(x,y)$ at $(x_0,y_0,f(x_0,y_0))$ is the plane through $(x_0,y_0,f(x_0,y_0))$ with normal vector $-f_x(x_0,y_0)\,\hat{\imath} - f_y(x_0,y_0)\,\hat{\jmath} + \hat{k}$. This plane has equation

$$-f_x(x_0,y_0)\,(x-x_0) - f_y(x_0,y_0)\,(y-y_0) + \big(z-f(x_0,y_0)\big) = 0$$

or, after a little rearrangement,

$$z = f(x_0,y_0) + f_x(x_0,y_0)\,(x-x_0) + f_y(x_0,y_0)\,(y-y_0)$$

Now that we have the normal vector, finding the equation of the normal line to the surface $z=f(x,y)$ at the point $(x_0,y_0,f(x_0,y_0))$ is straightforward. Writing it in parametric form,

$$\langle x,y,z\rangle = \langle x_0\,,\,y_0\,,\,f(x_0,y_0)\rangle + t\,\langle -f_x(x_0,y_0)\,,\,-f_y(x_0,y_0)\,,\,1\rangle$$

By way of summary

 Theorem 2.5.1. Tangent Plane and Normal Line


1. The vector

$$-f_x(x_0,y_0)\,\hat{\imath} - f_y(x_0,y_0)\,\hat{\jmath} + \hat{k}$$

is normal to the surface $z=f(x,y)$ at $(x_0,y_0,f(x_0,y_0))$.

2. The equation of the tangent plane to the surface $z=f(x,y)$ at the point $(x_0,y_0,f(x_0,y_0))$ may be written as

$$z = f(x_0,y_0) + f_x(x_0,y_0)\,(x-x_0) + f_y(x_0,y_0)\,(y-y_0)$$

3. The parametric equation of the normal line to the surface $z=f(x,y)$ at the point $(x_0,y_0,f(x_0,y_0))$ is

$$\langle x,y,z\rangle = \langle x_0\,,\,y_0\,,\,f(x_0,y_0)\rangle + t\,\langle -f_x(x_0,y_0)\,,\,-f_y(x_0,y_0)\,,\,1\rangle$$

or, writing it component by component,

$$x = x_0 - t\,f_x(x_0,y_0) \qquad y = y_0 - t\,f_y(x_0,y_0) \qquad z = f(x_0,y_0) + t$$
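
The three parts of the theorem translate directly into a short computation. The following is a minimal sketch, not part of the original text: it assumes the partial derivatives may be approximated by central differences, and the names `partials`, `tangent_plane`, `normal_line` and the surface `sample_surface` are hypothetical.

```python
def partials(f, x0, y0, h=1e-6):
    """Approximate (f_x, f_y) at (x0, y0) by central differences."""
    fx = (f(x0 + h, y0) - f(x0 - h, y0)) / (2 * h)
    fy = (f(x0, y0 + h) - f(x0, y0 - h)) / (2 * h)
    return fx, fy


def tangent_plane(f, x0, y0):
    """Return T(x, y) = f(x0,y0) + f_x (x - x0) + f_y (y - y0)."""
    z0 = f(x0, y0)
    fx, fy = partials(f, x0, y0)
    return lambda x, y: z0 + fx * (x - x0) + fy * (y - y0)


def normal_line(f, x0, y0):
    """Return r(t) = <x0, y0, f(x0,y0)> + t <-f_x, -f_y, 1>."""
    z0 = f(x0, y0)
    fx, fy = partials(f, x0, y0)
    return lambda t: (x0 - t * fx, y0 - t * fy, z0 + t)


def sample_surface(x, y):
    # Hypothetical surface z = x*y + y^2, used only for illustration.
    return x * y + y**2


plane = tangent_plane(sample_surface, 1.0, 2.0)
line = normal_line(sample_surface, 1.0, 2.0)
print(plane(2.0, 3.0))  # height of the tangent plane above (2, 3)
print(line(1.0))        # point on the normal line at t = 1
```

For this sample surface $f_x(1,2)=2$ and $f_y(1,2)=5$, so the tangent plane is $z = 6 + 2(x-1) + 5(y-2)$ up to the small finite-difference error.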

 Example 2.5.2
As a warm-up example, we'll find the tangent plane and normal line to the surface $z=x^2+y^2$ at the point $(1,0,1)$. To do so, we just apply Theorem 2.5.1 with $x_0=1$, $y_0=0$ and

\begin{align*}
f(x,y) &= x^2+y^2 & f(1,0) &= 1\\
f_x(x,y) &= 2x & f_x(1,0) &= 2\\
f_y(x,y) &= 2y & f_y(1,0) &= 0
\end{align*}

So the tangent plane is

\begin{align*}
z &= f(x_0,y_0) + f_x(x_0,y_0)\,(x-x_0) + f_y(x_0,y_0)\,(y-y_0)\\
&= 1 + 2(x-1) + 0(y-0)\\
&= -1 + 2x
\end{align*}

and the normal line is

\begin{align*}
\langle x,y,z\rangle &= \langle x_0\,,\,y_0\,,\,f(x_0,y_0)\rangle + t\,\langle -f_x(x_0,y_0)\,,\,-f_y(x_0,y_0)\,,\,1\rangle\\
&= \langle 1,0,1\rangle + t\,\langle -2\,,\,0\,,\,1\rangle\\
&= \langle 1-2t\,,\,0\,,\,1+t\rangle
\end{align*}

That was pretty simple — find the partial derivatives and substitute in the coordinates. Let's do something a bit more challenging.

 Example 2.5.3. Optional

Find the distance from $(0,3,0)$ to the surface $z=x^2+y^2$.

Solution
Write $f(x,y)=x^2+y^2$. Let's denote by $(a,b,f(a,b))$ the point on $z=f(x,y)$ that is nearest $(0,3,0)$. Before we really get into the problem, let's make a simple sketch and think about what the lines from $(0,3,0)$ to the surface look like and, in particular, the angles between these lines and the surface.

The line from (0, 3, 0) to (a, b, f (a, b)), the point on z = f (x, y) nearest (0, 3, 0), is distinguished from the other lines from
(0, 3, 0) to the surface, by being perpendicular to the surface. We will provide a detailed justification for this claim below.

Let's first exploit the fact that the vector from $(0,3,0)$ to $(a,b,f(a,b))$ must be perpendicular to the surface to determine $(a,b,f(a,b))$, and consequently the distance from $(0,3,0)$ to the surface. By Theorem 2.5.1.a, with $x_0=a$ and $y_0=b$, the vector

$$-f_x(a,b)\,\hat{\imath} - f_y(a,b)\,\hat{\jmath} + \hat{k} = -2a\,\hat{\imath} - 2b\,\hat{\jmath} + \hat{k} \tag{$*$}$$

is normal to the surface $z=f(x,y)$ at $(a,b,f(a,b))$. So the vector from $(0,3,0)$ to $(a,b,f(a,b))$, namely

$$a\,\hat{\imath} + (b-3)\,\hat{\jmath} + f(a,b)\,\hat{k} = a\,\hat{\imath} + (b-3)\,\hat{\jmath} + (a^2+b^2)\,\hat{k} \tag{$**$}$$

must be parallel to $(*)$. This does not force the vector $(*)$ to equal $(**)$, but it does force the existence of some number $t$ obeying

$$a\,\hat{\imath} + (b-3)\,\hat{\jmath} + (a^2+b^2)\,\hat{k} = t\big(-2a\,\hat{\imath} - 2b\,\hat{\jmath} + \hat{k}\big)$$

or equivalently

$$\begin{cases} a = -2a\,t\\ b-3 = -2b\,t\\ a^2+b^2 = t \end{cases}$$

We now have a system of three equations in the three unknowns a, b and t. If we can solve them, we will have found the point
on the surface that we want.
The first equation is $a(1+2t)=0$ so that either $a=0$ or $t=-\frac{1}{2}$.
The third equation forces $t\ge 0$, so $a=0$, and the last equation reduces to $t=b^2$.
Substituting this into the middle equation gives

$$b-3 = -2b^3 \qquad\text{or equivalently}\qquad 2b^3+b-3=0$$

In general, cubic equations are very hard to solve². But, in this case, we can guess one solution³, namely $b=1$. So $(b-1)$ must be a factor of $2b^3+b-3$ and a little division then gives us

$$0 = 2b^3+b-3 = (b-1)(2b^2+2b+3)$$

We can now find the roots of the quadratic factor by using the high school formula

$$\frac{-2\pm\sqrt{2^2-4(2)(3)}}{2(2)}$$

Since $2^2-4(2)(3)<0$, the factor $2b^2+2b+3$ has no real roots. So the only real solution to the cubic equation $2b^3+b-3=0$ is $b=1$.
In summary,
$a=0$, $b=1$ and
the point on $z=x^2+y^2$ nearest $(0,3,0)$ is $(0,1,1)$ and
the distance from $(0,3,0)$ to $z=x^2+y^2$ is the distance from $(0,3,0)$ to $(0,1,1)$, which is $\sqrt{(-2)^2+1^2}=\sqrt{5}$.
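
The arithmetic in this solution is easy to double check by machine. The following is a minimal sketch, assuming a plain bisection search is an acceptable stand-in for guessing the root; the helper names `cubic` and `bisect_root` are hypothetical and not part of the original text.

```python
import math


def cubic(b):
    """Left hand side of 2 b^3 + b - 3 = 0."""
    return 2 * b**3 + b - 3


def bisect_root(func, lo, hi, tol=1e-12):
    """Find a root of func in [lo, hi], assuming a sign change on the interval."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_mid == 0:
            return mid
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return 0.5 * (lo + hi)


a = 0.0                              # forced by the first and third equations
b = bisect_root(cubic, 0.0, 3.0)     # cubic(0) < 0 < cubic(3)
nearest = (a, b, a**2 + b**2)        # point on z = x^2 + y^2
distance = math.dist((0.0, 3.0, 0.0), nearest)
print(nearest, distance)
```

The search converges to $b=1$, and the computed distance agrees with $\sqrt 5$ to within rounding.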

Finally back to the claim that, because $(a,b,f(a,b))$ is the point on $z=f(x,y)$ that is nearest⁴ $(0,3,0)$, the vector from $(0,3,0)$ to $(a,b,f(a,b))$ must be perpendicular to the surface $z=f(x,y)$ at $(a,b,f(a,b))$. Note that the square of the distance from $(0,3,0)$ to a general point $(x,y,f(x,y))$ on $z=f(x,y)$ is

$$D(x,y) = x^2 + (y-3)^2 + f(x,y)^2$$

If $x=a$, $y=b$ minimizes $D(x,y)$ then, in particular,

restricting our attention to the slice $y=b$ of the surface, $x=a$ minimizes $g(x)=D(x,b)=x^2+(b-3)^2+f(x,b)^2$ so that

\begin{align*}
0 = g'(a) &= \frac{\partial}{\partial x}\Big[x^2+(b-3)^2+f(x,b)^2\Big]\Big|_{x=a}\\
&= 2a + 2f(a,b)\,f_x(a,b)\\
&= 2\,\langle a\,,\,b-3\,,\,f(a,b)\rangle\cdot\langle 1\,,\,0\,,\,f_x(a,b)\rangle
\end{align*}

and
restricting our attention to the slice $x=a$ of the surface, $y=b$ minimizes $h(y)=D(a,y)=a^2+(y-3)^2+f(a,y)^2$ so that

\begin{align*}
0 = h'(b) &= \frac{\partial}{\partial y}\Big[a^2+(y-3)^2+f(a,y)^2\Big]\Big|_{y=b}\\
&= 2(b-3) + 2f(a,b)\,f_y(a,b)\\
&= 2\,\langle a\,,\,b-3\,,\,f(a,b)\rangle\cdot\langle 0\,,\,1\,,\,f_y(a,b)\rangle
\end{align*}

We have expressed the final right hand sides of both of the above bullets as the dot product of the vector $\langle a\,,\,b-3\,,\,f(a,b)\rangle$ with something because
$\langle a\,,\,b-3\,,\,f(a,b)\rangle$ is the vector from $(0,3,0)$ to the point $(a,b,f(a,b))$ on the surface and
the vanishing of the dot product of two vectors implies that the two vectors are perpendicular.
Thus, that

$$\langle a\,,\,b-3\,,\,f(a,b)\rangle\cdot\langle 1\,,\,0\,,\,f_x(a,b)\rangle = \langle a\,,\,b-3\,,\,f(a,b)\rangle\cdot\langle 0\,,\,1\,,\,f_y(a,b)\rangle = 0$$

tells us that the vector $\langle a\,,\,b-3\,,\,f(a,b)\rangle$ from $(0,3,0)$ to $(a,b,f(a,b))$ is perpendicular to both $\langle 1\,,\,0\,,\,f_x(a,b)\rangle$ and $\langle 0\,,\,1\,,\,f_y(a,b)\rangle$ and hence is parallel to their cross product $\langle 1\,,\,0\,,\,f_x(a,b)\rangle\times\langle 0\,,\,1\,,\,f_y(a,b)\rangle$, which we already know is a normal vector to the surface $z=f(x,y)$ at $(a,b,f(a,b))$.


This shows that the point on the surface that minimises the distance to (0, 3, 0) is joined to (0, 3, 0) by a line that is parallel to
the normal vector — just as we required.

Surfaces of the Form G(x, y, z) = 0


We now use a little trickery to construct a vector perpendicular to the surface $G(x,y,z)=0$ at the point $(x_0,y_0,z_0)$. Imagine that you are walking on the surface and that at time $0$ you are at the point $(x_0,y_0,z_0)$. Let $\vec{r}(t) = \big(x(t)\,,\,y(t)\,,\,z(t)\big)$ denote your position at time $t$.

Because you are walking along the surface, we know that $\vec{r}(t)$ always lies on the surface and so

$$G\big(x(t)\,,\,y(t)\,,\,z(t)\big) = 0$$

for all $t$. Differentiating this equation with respect to $t$ gives, by the chain rule,

$$\frac{\partial G}{\partial x}\big(x(t),y(t),z(t)\big)\,x'(t) + \frac{\partial G}{\partial y}\big(x(t),y(t),z(t)\big)\,y'(t) + \frac{\partial G}{\partial z}\big(x(t),y(t),z(t)\big)\,z'(t) = 0$$

Then setting $t=0$ gives

$$\frac{\partial G}{\partial x}(x_0,y_0,z_0)\,x'(0) + \frac{\partial G}{\partial y}(x_0,y_0,z_0)\,y'(0) + \frac{\partial G}{\partial z}(x_0,y_0,z_0)\,z'(0) = 0$$

Expressing this as a dot product allows us to turn this into a statement about vectors.

$$\left\langle \frac{\partial G}{\partial x}(x_0,y_0,z_0)\,,\,\frac{\partial G}{\partial y}(x_0,y_0,z_0)\,,\,\frac{\partial G}{\partial z}(x_0,y_0,z_0)\right\rangle\cdot\vec{r}\,'(0) = 0 \tag{$*$}$$

The first vector in this dot product is sufficiently important that it is given its own name.

 Definition 2.5.4. Gradient
The gradient⁵ of the function $G(x,y,z)$ at the point $(x_0,y_0,z_0)$ is

$$\left\langle \frac{\partial G}{\partial x}(x_0,y_0,z_0)\,,\,\frac{\partial G}{\partial y}(x_0,y_0,z_0)\,,\,\frac{\partial G}{\partial z}(x_0,y_0,z_0)\right\rangle$$

It is denoted $\nabla G(x_0,y_0,z_0)$.

So $(*)$ tells us that the gradient $\nabla G(x_0,y_0,z_0)$ is perpendicular to the vector $\vec{r}\,'(0)$.

Now if $t$ is very close to zero, the vector $\vec{r}(t)-\vec{r}(0)$, from $\vec{r}(0)$ to $\vec{r}(t)$, is almost tangent to the path that we are walking on. The limit

$$\vec{r}\,'(0) = \lim_{t\to 0}\frac{\vec{r}(t)-\vec{r}(0)}{t}$$

is thus exactly tangent to our path, and consequently to the surface $G(x,y,z)=0$ at $(x_0,y_0,z_0)$. This is true for all paths on the surface that pass through $(x_0,y_0,z_0)$ at time $t=0$, which tells us that $\nabla G(x_0,y_0,z_0)$ is perpendicular to the surface at $(x_0,y_0,z_0)$. We have just found a normal vector!

The above argument goes through unchanged for surfaces of the form⁶ $G(x,y,z)=K$, for any constant $K$. So we have

 Theorem 2.5.5. Tangent Plane and Normal Line

Let $K$ be a constant and $(x_0,y_0,z_0)$ be a point on the surface $G(x,y,z)=K$. Assume that the gradient

$$\nabla G(x_0,y_0,z_0) = \left\langle \frac{\partial G}{\partial x}(x_0,y_0,z_0)\,,\,\frac{\partial G}{\partial y}(x_0,y_0,z_0)\,,\,\frac{\partial G}{\partial z}(x_0,y_0,z_0)\right\rangle$$

of $G$ at $(x_0,y_0,z_0)$ is nonzero.
1. The vector $\nabla G(x_0,y_0,z_0)$ is normal to the surface $G(x,y,z)=K$ at $(x_0,y_0,z_0)$.

2. The equation of the tangent plane to the surface $G(x,y,z)=K$ at $(x_0,y_0,z_0)$ is

$$\nabla G(x_0,y_0,z_0)\cdot\langle x-x_0\,,\,y-y_0\,,\,z-z_0\rangle = 0$$

3. The parametric equation of the normal line to the surface $G(x,y,z)=K$ at $(x_0,y_0,z_0)$ is

$$\langle x,y,z\rangle = \langle x_0,y_0,z_0\rangle + t\,\nabla G(x_0,y_0,z_0)$$
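
The theorem can be turned into a few lines of code. The following is a minimal sketch, not part of the original text: it assumes the gradient is approximated by central differences, and the names `gradient`, `tangent_plane_implicit`, `normal_line_implicit` and the surface `sphere` are hypothetical choices for illustration.

```python
def gradient(G, p, h=1e-6):
    """Approximate the gradient of G at the point p = (x, y, z)."""
    x, y, z = p
    return (
        (G(x + h, y, z) - G(x - h, y, z)) / (2 * h),
        (G(x, y + h, z) - G(x, y - h, z)) / (2 * h),
        (G(x, y, z + h) - G(x, y, z - h)) / (2 * h),
    )


def tangent_plane_implicit(G, p):
    """Return (n, d) describing the plane n . <x, y, z> = d through p."""
    n = gradient(G, p)
    d = sum(ni * pi for ni, pi in zip(n, p))
    return n, d


def normal_line_implicit(G, p):
    """Return r(t) = p + t * grad G(p)."""
    n = gradient(G, p)
    return lambda t: tuple(pi + t * ni for pi, ni in zip(p, n))


def sphere(x, y, z):
    # Hypothetical example: the level set sphere(x, y, z) = 9 contains (1, 2, 2).
    return x**2 + y**2 + z**2


n, d = tangent_plane_implicit(sphere, (1.0, 2.0, 2.0))
print(n, d)  # roughly (2, 4, 4) and 18, i.e. the plane 2x + 4y + 4z = 18
```

Note that the gradient is evaluated at the specific point $p$ before the plane is built, which is exactly what parts 1 to 3 of the theorem require.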

 Remark 2.5.6
Theorem 2.5.1 about the tangent planes and normal lines to the surface $z=f(x,y)$ is actually a very simple consequence of Theorem 2.5.5 about the tangent planes and normal lines to the surface $G(x,y,z)=0$. This is just because we can always rewrite the equation $z=f(x,y)$ as $z-f(x,y)=0$ and apply Theorem 2.5.5 with $G(x,y,z)=z-f(x,y)$. Since

$$\nabla G(x_0,y_0,z_0) = -f_x(x_0,y_0)\,\hat{\imath} - f_y(x_0,y_0)\,\hat{\jmath} + \hat{k}$$

Theorem 2.5.5 then gives⁷ Theorem 2.5.1.

Here are a couple of routine examples.

 Example 2.5.7
Find the tangent plane and the normal line to the surface

$$z = x^2 + 5xy - 2y^2$$

at the point $(1,2,3)$.


Solution

As a preliminary check, note that

$$1^2 + 5\times 1\times 2 - 2(2)^2 = 3$$

which verifies that the point $(1,2,3)$ is indeed on the surface. This is a good reality check and also increases our confidence that the question is asking what we think that it is asking. Rewrite the equation of the surface as $G(x,y,z) = x^2+5xy-2y^2-z = 0$. Then the gradient

$$\nabla G(x,y,z) = (2x+5y)\,\hat{\imath} + (5x-4y)\,\hat{\jmath} - \hat{k}$$

so that, by Theorem 2.5.5,

$$\vec{n} = \nabla G(1,2,3) = 12\,\hat{\imath} - 3\,\hat{\jmath} - \hat{k}$$

is a normal vector to the surface at $(1,2,3)$. Equipped⁸ with the normal, it is easy to work out an equation for the tangent plane.

$$\vec{n}\cdot\langle x-1\,,\,y-2\,,\,z-3\rangle = \langle 12\,,\,-3\,,\,-1\rangle\cdot\langle x-1\,,\,y-2\,,\,z-3\rangle = 0$$

or

$$12x - 3y - z = 3$$

We can quickly check that the point $(1,2,3)$ does indeed lie on the plane:

$$12\times 1 - 3\times 2 - 3 = 3$$

The normal line is

$$\langle x-1\,,\,y-2\,,\,z-3\rangle = t\,\vec{n} = t\,\langle 12\,,\,-3\,,\,-1\rangle$$

or

$$\frac{x-1}{12} = \frac{y-2}{-3} = \frac{z-3}{-1} \quad (=t)$$

Another warm-up example. This time the surface is a hyperboloid of one sheet.

 Example 2.5.8

Find the tangent plane and the normal line to the surface

$$x^2 + y^2 - z^2 = 4$$

at the point $(2,-3,3)$.


Solution
As a preliminary check, note that the point $(2,-3,3)$ is indeed on the surface:

$$2^2 + (-3)^2 - (3)^2 = 4$$

The equation of the surface is $G(x,y,z) = x^2+y^2-z^2 = 4$. Then the gradient of $G$ is

$$\nabla G(x,y,z) = 2x\,\hat{\imath} + 2y\,\hat{\jmath} - 2z\,\hat{k}$$

so that, at $(2,-3,3)$,

$$\nabla G(2,-3,3) = 4\,\hat{\imath} - 6\,\hat{\jmath} - 6\,\hat{k}$$

and so, by Theorem 2.5.5,

$$\vec{n} = \frac{1}{2}\big(4\,\hat{\imath} - 6\,\hat{\jmath} - 6\,\hat{k}\big) = 2\,\hat{\imath} - 3\,\hat{\jmath} - 3\,\hat{k}$$

is a normal vector to the surface at $(2,-3,3)$. The tangent plane is

$$\vec{n}\cdot\langle x-2\,,\,y+3\,,\,z-3\rangle = \langle 2\,,\,-3\,,\,-3\rangle\cdot\langle x-2\,,\,y+3\,,\,z-3\rangle = 0$$

or

$$2x - 3y - 3z = 4$$

Again, as a check, we can verify that our point $(2,-3,3)$ is indeed on the plane:

$$2\times 2 - 3\times(-3) - 3\times 3 = 4$$

The normal line is

$$\langle x-2\,,\,y+3\,,\,z-3\rangle = t\,\vec{n} = t\,\langle 2\,,\,-3\,,\,-3\rangle$$

or

$$\frac{x-2}{2} = \frac{y+3}{-3} = \frac{z-3}{-3} \quad (=t)$$

 Warning 2.5.9

The vector $\nabla G(x,y,z)$ is not a normal vector to the surface $G(x,y,z)=K$ at $(x_0,y_0,z_0)$. The vector $\nabla G(x_0,y_0,z_0)$ is a normal vector to $G(x,y,z)=K$ at $(x_0,y_0,z_0)$ (provided $G(x_0,y_0,z_0)=K$).

As an example of the consequences of failing to evaluate $\nabla G(x,y,z)$ at the point $(x_0,y_0,z_0)$, consider the problem

Find the tangent plane to the surface $x^2+y^2+z^2=1$ at the point $(0,0,1)$.

In this case, the surface is $G(x,y,z)=x^2+y^2+z^2=1$. The gradient of $G$ is $\nabla G(x,y,z) = 2x\,\hat{\imath} + 2y\,\hat{\jmath} + 2z\,\hat{k}$. To correctly apply part (b) of Theorem 2.5.5, we evaluate $\nabla G(0,0,1) = 2\,\hat{k}$ and find that the tangent plane at $(0,0,1)$ is

$$\nabla G(0,0,1)\cdot\langle x-0\,,\,y-0\,,\,z-1\rangle = 0 \quad\text{or}\quad 2(z-1)=0 \quad\text{or}\quad z=1$$

This is of course correct: the tangent plane to the unit sphere at the north pole is indeed horizontal.
But if we were to incorrectly apply part (b) of Theorem 2.5.5 by failing to evaluate $\nabla G(x,y,z)$ at $(0,0,1)$, we would find that the “tangent plane” is

\begin{align*}
&\nabla G(x,y,z)\cdot\langle x-0\,,\,y-0\,,\,z-1\rangle = 0\\
\text{or}\quad & 2x(x-0) + 2y(y-0) + 2z(z-1) = 0\\
\text{or}\quad & x^2+y^2+z^2-z = 0
\end{align*}

This is horribly wrong. It is not even a plane, as any plane has an equation of the form $ax+by+cz=d$, with $a$, $b$, $c$ and $d$ constants.

Now we'll move on to some more involved examples.

 Example 2.5.10

Suppose that we wish to find the highest and lowest points on the surface $G(x,y,z) = x^2-2x+y^2-4y+z^2-6z = 2$. That is, we wish to find the points on the surface with the maximum value of $z$ and with the minimum⁹ value of $z$. Completing three squares,

\begin{align*}
G(x,y,z) &= x^2-2x+y^2-4y+z^2-6z\\
&= (x-1)^2 + (y-2)^2 + (z-3)^2 - 14.
\end{align*}

So the surface $G(x,y,z)=2$ is a sphere, whose highest point is the north pole and whose lowest point is the south pole. But let's pretend that $G(x,y,z)=2$ is some complicated surface that we can't easily picture.
We'll find its highest and lowest points by exploiting the fact that the tangent plane to $G=2$ is horizontal at the highest and lowest points. Equivalently, the normal vector to $G=2$ is vertical at the highest and lowest points. To see that this is the case, look at the figure below. If the tangent plane at $(x_0,y_0,z_0)$ is not horizontal, then the tangent plane contains points near $(x_0,y_0,z_0)$ with $z$ bigger than $z_0$ and points near $(x_0,y_0,z_0)$ with $z$ smaller than $z_0$. Near $(x_0,y_0,z_0)$, the tangent plane is a good approximation to the surface. So the surface also contains¹⁰ such points.

The gradient is

$$\nabla G(x,y,z) = (2x-2)\,\hat{\imath} + (2y-4)\,\hat{\jmath} + (2z-6)\,\hat{k}$$

It is vertical when the $\hat{\imath}$ and $\hat{\jmath}$ components are both zero. This happens when $2x-2=0$ and $2y-4=0$, i.e. when $x=1$ and $y=2$. So the normal vector to the surface $G=2$ at the point $(x,y,z)$ is vertical when $x=1$, $y=2$ and (don't forget that $(x,y,z)$ has to be on $G=2$)

\begin{align*}
& G(1,2,z) = 1^2 - 2\times 1 + 2^2 - 4\times 2 + z^2 - 6z = 2\\
\iff\ & z^2 - 6z - 7 = 0\\
\iff\ & (z-7)(z+1) = 0\\
\iff\ & z = 7,\ -1
\end{align*}

The highest point is $(1,2,7)$ and the lowest point is $(1,2,-1)$, as expected.
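
The last step of this example is just a quadratic equation, which makes it easy to confirm numerically. The following is a minimal sketch, assuming the values $x=1$, $y=2$ found above; the variable names are hypothetical and the quadratic formula is applied directly.

```python
import math


def G(x, y, z):
    """The function from the example: x^2 - 2x + y^2 - 4y + z^2 - 6z."""
    return x**2 - 2 * x + y**2 - 4 * y + z**2 - 6 * z


# The gradient is vertical exactly when x = 1 and y = 2.
x, y = 1.0, 2.0

# G(1, 2, z) = 2 is the quadratic z^2 - 6z + (G(1, 2, 0) - 2) = 0.
qa, qb, qc = 1.0, -6.0, G(x, y, 0.0) - 2.0
disc = qb * qb - 4 * qa * qc
z_low = (-qb - math.sqrt(disc)) / (2 * qa)
z_high = (-qb + math.sqrt(disc)) / (2 * qa)

print((x, y, z_high), (x, y, z_low))  # highest and lowest points
```

Both candidate points can be substituted back into $G$ to confirm that they lie on the surface $G=2$.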

We could have short-cut the last example by using that the surface was a sphere. Here is an example in the same spirit for which we
don't have an easy short-cut.

 Example 2.5.11

In the last example, we found the points on a specified surface having the largest and smallest values of $z$. We'll now ramp up the level of difficulty a bit and find the points on the surface $x^2+2y^2+3z^2=72$ that have the largest and smallest values of $x+y+3z$.

To develop a strategy for tackling this problem, consider the following sketch.

The red ellipse in the sketch is intended to represent (schematically) our surface

$$x^2+2y^2+3z^2=72$$

which is an ellipsoid. The middle diagonal (black) line is intended to represent (schematically) the plane $x+y+3z=C$ for some more or less randomly chosen value of the constant $C$. At each point on that plane, the function, $x+y+3z$, (that we are trying to maximize and minimize) takes the value $C$. In particular, for the $C$ chosen in the figure, $x+y+3z=C$ does intersect our surface, indicating that $x+y+3z$ does indeed take the value $C$ somewhere on our surface.

To maximize $x+y+3z$, imagine slowly increasing the value of $C$. As we do so, the plane $x+y+3z=C$ moves to the right. We want to stop increasing $C$ at the biggest value of $C$ for which the plane $x+y+3z=C$ intersects our surface $x^2+2y^2+3z^2=72$. For that value of $C$ the plane $x+y+3z=C$, which is represented by the right hand blue line in the sketch, is tangent to our surface.

Similarly, to minimize $x+y+3z$, imagine slowly decreasing the value of $C$. As we do so, the plane $x+y+3z=C$ moves to the left. We want to stop decreasing $C$ at the smallest value of $C$ for which the plane $x+y+3z=C$ intersects our surface $x^2+2y^2+3z^2=72$. For that value of $C$ the plane $x+y+3z=C$, which is represented by the left hand blue line in the sketch, is again tangent to our surface. The previous Example 2.5.10 was similar, except that the plane was $z=C$.
We are now ready to compute. We need to find the points $(a,b,c)$ (in the sketch, they are the black dot points of tangency) for which
$(a,b,c)$ is on the surface and
the normal vector to the surface $x^2+2y^2+3z^2=72$ at $(a,b,c)$ is parallel to $\langle 1,1,3\rangle$, which is a normal vector to the plane $x+y+3z=C$
Since the gradient of $x^2+2y^2+3z^2$ is $\langle 2x\,,\,4y\,,\,6z\rangle = 2\,\langle x\,,\,2y\,,\,3z\rangle$, these two conditions are, in equations,

\begin{align*}
a^2+2b^2+3c^2 &= 72\\
\langle a\,,\,2b\,,\,3c\rangle &= t\,\langle 1,1,3\rangle \quad\text{for some number } t
\end{align*}

The second equation says that $a=t$, $b=\frac{t}{2}$ and $c=t$. Substituting this into the first equation gives

$$t^2 + \frac{1}{2}t^2 + 3t^2 = 72 \iff \frac{9}{2}t^2 = 72 \iff t^2 = 16 \iff t = \pm 4$$

So
the point on the surface $x^2+2y^2+3z^2=72$ at which $x+y+3z$ takes its maximum value is $(a,b,c) = \big(t,\frac{t}{2},t\big)\Big|_{t=4} = (4,2,4)$ and $x+y+3z$ takes the value $4+2+3\times 4 = 18$ there.

The point on the surface $x^2+2y^2+3z^2=72$ at which $x+y+3z$ takes its minimum value is $(a,b,c) = \big(t,\frac{t}{2},t\big)\Big|_{t=-4} = (-4,-2,-4)$ and $x+y+3z$ takes the value $-4-2+3\times(-4) = -18$ there.
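
As a sanity check on this example, one can compare the two tangency points with a brute-force sample of the ellipsoid. The following is a minimal sketch, not part of the original text: the parametrization $x=\sqrt{72}\sin\varphi\cos\theta$, $y=6\sin\varphi\sin\theta$, $z=\sqrt{24}\cos\varphi$ and the grid size are assumptions made for illustration, and the function names are hypothetical.

```python
import math


def constraint(a, b, c):
    """Left hand side of the surface equation x^2 + 2y^2 + 3z^2 = 72."""
    return a**2 + 2 * b**2 + 3 * c**2


def objective(a, b, c):
    """The function x + y + 3z being maximized and minimized."""
    return a + b + 3 * c


# Tangency points (t, t/2, t) with (9/2) t^2 = 72.
t = math.sqrt(72 / 4.5)
candidates = [(s * t, s * t / 2, s * t) for s in (1, -1)]

# Brute-force comparison: sample the ellipsoid on a (phi, theta) grid.
sampled_max, sampled_min = -math.inf, math.inf
steps = 200
for i in range(steps + 1):
    phi = math.pi * i / steps
    for j in range(steps):
        theta = 2 * math.pi * j / steps
        x = math.sqrt(72) * math.sin(phi) * math.cos(theta)
        y = 6 * math.sin(phi) * math.sin(theta)
        z = math.sqrt(24) * math.cos(phi)
        value = objective(x, y, z)
        sampled_max = max(sampled_max, value)
        sampled_min = min(sampled_min, value)

for p in candidates:
    print(p, constraint(*p), objective(*p))
print(sampled_max, sampled_min)
```

The sampled extremes should come out slightly inside the interval $[-18, 18]$, since a finite grid generally misses the exact points of tangency.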

 Example 2.5.12
Find the distance from the point (1, 1, 1) to the plane x + 2y + 3z = 20.
Solution 1
First note that the point (1, 1, 1) is not itself on the plane x + 2y + 3z = 20 because

1 + 2 × 1 + 3 × 1 = 6 ≠ 20

Denote by $(a,b,c)$ the point on the plane $x+2y+3z=20$ that is nearest $(1,1,1)$. Then the vector from $(1,1,1)$ to $(a,b,c)$, namely $\langle a-1\,,\,b-1\,,\,c-1\rangle$, must be perpendicular¹¹ to the plane. As the gradient of $x+2y+3z$, namely $\langle 1\,,\,2\,,\,3\rangle$, is a normal vector to the plane, $\langle a-1\,,\,b-1\,,\,c-1\rangle$ must be parallel to $\langle 1\,,\,2\,,\,3\rangle$. So there must be some number $t$ so that

$$\langle a-1\,,\,b-1\,,\,c-1\rangle = t\,\langle 1\,,\,2\,,\,3\rangle$$

or

$$a = t+1,\quad b = 2t+1,\quad c = 3t+1$$

As $(a,b,c)$ must be on the plane, we know that $a+2b+3c=20$ and so

$$(t+1) + 2(2t+1) + 3(3t+1) = 20 \implies 14t = 14 \implies t = 1$$

The distance from $(1,1,1)$ to the plane $x+2y+3z=20$ is the length of the vector $\langle a-1\,,\,b-1\,,\,c-1\rangle = t\,\langle 1\,,\,2\,,\,3\rangle = \langle 1\,,\,2\,,\,3\rangle$ which is $\sqrt{14}$.

Solution 2
Denote by P = (a, b, c) the point on the plane x + 2y + 3z = 20 that is nearest the point Q = (1, 1, 1). Pick any other point
on the plane and call it R. For example (x, y, z) = (20, 0, 0) obeys x + 2y + 3z = 20 and so R = (20, 0, 0) is a point on the
plane.
The triangle $PQR$ is right angled. Denote by $\theta$ the angle between the hypotenuse $QR$ and the side $QP$. The distance from $Q=(1,1,1)$ to the plane is the length of the line segment $QP$, which is

$$\text{distance} = |QP| = |QR|\cos\theta$$

Now, the dot product between the vector from $Q$ to $R$, which is $\langle 19,-1,-1\rangle$, with the vector $\langle 1,2,3\rangle$, which is normal to the plane and hence parallel to the side $QP$ is

\begin{align*}
\langle 19,-1,-1\rangle\cdot\langle 1,2,3\rangle &= 14\\
&= |\langle 19,-1,-1\rangle|\ |\langle 1,2,3\rangle|\,\cos\theta\\
&= |QR|\,\sqrt{14}\,\cos\theta
\end{align*}

so that, finally,

$$\text{distance} = |QR|\cos\theta = \frac{14}{\sqrt{14}} = \sqrt{14}$$
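
Both solutions can be packaged as a small routine. The following is a minimal sketch, assuming the plane is written as $\vec n\cdot\langle x,y,z\rangle = d$; the function names `nearest_point_on_plane` and `distance_point_to_plane` are hypothetical. The first mirrors Solution 1 (move from the point along the normal) and the second mirrors Solution 2 (project onto the normal).

```python
import math


def dot(u, v):
    """Dot product of two vectors given as tuples."""
    return sum(a * b for a, b in zip(u, v))


def nearest_point_on_plane(q, n, d):
    """Solution 1: move from q along n until n . p = d is satisfied."""
    t = (d - dot(n, q)) / dot(n, n)
    return tuple(qi + t * ni for qi, ni in zip(q, n))


def distance_point_to_plane(q, n, d):
    """Solution 2: length of the projection of (R - q) onto n, for any R on the plane."""
    return abs(d - dot(n, q)) / math.sqrt(dot(n, n))


q = (1.0, 1.0, 1.0)
n = (1.0, 2.0, 3.0)   # gradient of x + 2y + 3z
d = 20.0

p = nearest_point_on_plane(q, n, d)
print(p, math.dist(q, p), distance_point_to_plane(q, n, d))
```

With the data of this example, $t=1$, the nearest point is $(2,3,4)$, and both routes give $\sqrt{14}$.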

 Example 2.5.13

Let $F(x,y,z)=0$ and $G(x,y,z)=0$ be two surfaces. These two surfaces intersect along a curve. Find a tangent vector to this curve at the point $(x_0,y_0,z_0)$.

Solution
Call the tangent vector $\mathbf{T}$. Then $\mathbf{T}$ has to be
tangent to the surface $F(x,y,z)=0$ at $(x_0,y_0,z_0)$ and
tangent to the surface $G(x,y,z)=0$ at $(x_0,y_0,z_0)$.

Consequently $\mathbf{T}$ has to be
perpendicular to the vector $\nabla F(x_0,y_0,z_0)$, which is normal to $F(x,y,z)=0$ at $(x_0,y_0,z_0)$, and at the same time has to be
perpendicular to the vector $\nabla G(x_0,y_0,z_0)$, which is normal to $G(x,y,z)=0$ at $(x_0,y_0,z_0)$.

Recall that an easy way to construct a vector that is perpendicular to two other vectors is to take their cross product. So we take

\begin{align*}
\mathbf{T} = \nabla F(x_0,y_0,z_0)\times\nabla G(x_0,y_0,z_0)
&= \det\begin{bmatrix}\hat{\imath} & \hat{\jmath} & \hat{k}\\ F_x & F_y & F_z\\ G_x & G_y & G_z\end{bmatrix}\\
&= (F_yG_z - F_zG_y)\,\hat{\imath} + (F_zG_x - F_xG_z)\,\hat{\jmath} + (F_xG_y - F_yG_x)\,\hat{k}
\end{align*}

where all partial derivatives are evaluated at $(x,y,z)=(x_0,y_0,z_0)$.
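
The cross product recipe is short enough to code directly. The following is a minimal sketch, not part of the original text: the surfaces $F = x^2+y^2+z^2-14$ and $G = x+y+z-6$, the point $(1,2,3)$ and all function names are hypothetical choices made for illustration, with the gradients written out by hand.

```python
def cross(u, v):
    """Cross product u x v of two 3-vectors."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))


def grad_F(p):
    # Gradient of the illustrative sphere F = x^2 + y^2 + z^2 - 14.
    x, y, z = p
    return (2 * x, 2 * y, 2 * z)


def grad_G(p):
    # Gradient of the illustrative plane G = x + y + z - 6.
    return (1, 1, 1)


point = (1, 2, 3)   # lies on both F = 0 and G = 0
tangent = cross(grad_F(point), grad_G(point))
print(tangent, dot(tangent, grad_F(point)), dot(tangent, grad_G(point)))
```

The two dot products printed at the end should both vanish, confirming that the computed vector is perpendicular to both normals and hence tangent to the curve of intersection.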

Let's put Example 2.5.13 into action.

 Example 2.5.14

Consider the curve that is the intersection of the surfaces

$$x^2+y^2+z^2 = 5 \qquad\text{and}\qquad x^2+y^2 = 4z$$

Find a tangent vector to this curve at the point $(\sqrt{3}\,,\,1\,,\,1)$.

Solution
As a preliminary check, we verify that the point $(\sqrt{3}\,,\,1\,,\,1)$ really is on the curve. To do so, we check that $(\sqrt{3}\,,\,1\,,\,1)$ satisfies both equations:

$$(\sqrt{3})^2 + 1^2 + 1^2 = 5 \qquad (\sqrt{3})^2 + 1^2 = 4\times 1$$

We'll find the specified tangent vector by using the strategy of Example 2.5.13.
Write $F(x,y,z) = x^2+y^2+z^2$ and $G(x,y,z) = x^2+y^2-4z$. Then
the vector

$$\nabla F(\sqrt{3},1,1) = \langle 2x\,,\,2y\,,\,2z\rangle\Big|_{(x,y,z)=(\sqrt{3},1,1)} = 2\,\langle\sqrt{3}\,,\,1\,,\,1\rangle$$

is normal to the surface $F(x,y,z)=5$ at $(\sqrt{3}\,,\,1\,,\,1)$, and
the vector

$$\nabla G(\sqrt{3},1,1) = \langle 2x\,,\,2y\,,\,-4\rangle\Big|_{(x,y,z)=(\sqrt{3},1,1)} = 2\,\langle\sqrt{3}\,,\,1\,,\,-2\rangle$$

is normal to the surface $G(x,y,z)=0$ at $(\sqrt{3}\,,\,1\,,\,1)$.

So a tangent vector is

\begin{align*}
\langle\sqrt{3}\,,\,1\,,\,1\rangle\times\langle\sqrt{3}\,,\,1\,,\,-2\rangle
&= \det\begin{bmatrix}\hat{\imath} & \hat{\jmath} & \hat{k}\\ \sqrt{3} & 1 & 1\\ \sqrt{3} & 1 & -2\end{bmatrix}\\
&= (-2-1)\,\hat{\imath} + (\sqrt{3}+2\sqrt{3})\,\hat{\jmath} + (\sqrt{3}-\sqrt{3})\,\hat{k}\\
&= -3\,\hat{\imath} + 3\sqrt{3}\,\hat{\jmath}
\end{align*}

There is an easy common factor of $3$ in both components. So we can create a slightly neater tangent vector by dividing the length of $-3\,\hat{\imath} + 3\sqrt{3}\,\hat{\jmath}$ by $3$, giving $\langle -1\,,\,\sqrt{3}\,,\,0\rangle$.

 Example 2.5.15. (Optional) computer graphics hidden-surface elimination


When you look at a solid three dimensional object, you do not see all of the surface of the object — parts of the surface are
hidden from your view by other parts of the object. For example, the following sketch shows, schematically, a ray of light
leaving your eye and hitting the surface of the object at the light dot. The object is solid, so the light cannot penetrate any
further. But, if it could, it would follow the dotted line, hitting the surface of the object three more times. Your eye can see the
light dot, but cannot see the other three dark dots.

Recreating this effect in computer generated graphics is called “hidden-surface elimination”. In general, implementing hidden-
surface elimination can be quite complicated. Often a technique called “ray tracing” is used 12. However, it is easy if you know
about vectors and gradients, and you are only looking at a single convex body. By definition, a solid is convex if, whenever
two points are in the solid, then the line segment joining the two points is also contained in the solid.

So suppose that we are looking at a convex solid, that the equation of the surface of the solid is $G(x,y,z)=0$, and that our eye is at $(x_e,y_e,z_e)$.

First consider a light ray that leaves our eye and then just barely nicks the solid at the point $(x,y,z)$, as in the figure on the left below. The light ray is a tangent line to the surface at $(x,y,z)$. So the direction vector of the light ray, $\langle x-x_e\,,\,y-y_e\,,\,z-z_e\rangle$, is tangent to the surface at $(x,y,z)$ and consequently is perpendicular to the normal vector, $\vec{n} = \nabla G(x,y,z)$, of the surface at $(x,y,z)$. Thus

$$\langle x-x_e\,,\,y-y_e\,,\,z-z_e\rangle\cdot\nabla G(x,y,z) = 0$$

Now consider a light ray that leaves our eye and then passes through the solid, as in the figure on the right above. Call the point at which the light ray first enters the solid $(x,y,z)$ and the point at which the light ray leaves the solid $(x',y',z')$.

Let $\vec{v}$ be a vector that has the same direction as, i.e. is a positive multiple of, the vector $\langle x-x_e\,,\,y-y_e\,,\,z-z_e\rangle$.

Let $\vec{n}$ be an outward pointing normal to the solid at $(x,y,z)$. It will be either $\nabla G(x,y,z)$ or $-\nabla G(x,y,z)$.

Let $\vec{n}'$ be an outward pointing normal to the solid at $(x',y',z')$. It will be either $\nabla G(x',y',z')$ or $-\nabla G(x',y',z')$.

Then
at the point $(x,y,z)$ where the ray enters the solid, which is a visible point, the direction vector $\vec{v}$ points into the solid. The angle $\theta$ between $\vec{v}$ and the outward pointing normal $\vec{n}$ is greater than $90^\circ$, so that the dot product $\vec{v}\cdot\vec{n} = |\vec{v}|\,|\vec{n}|\cos\theta < 0$. But
at the point $(x',y',z')$ where the ray leaves the solid, which is a hidden point, the direction vector $\vec{v}$ points out of the solid. The angle $\theta'$ between $\vec{v}$ and the outward pointing normal $\vec{n}'$ is less than $90^\circ$, so that the dot product $\vec{v}\cdot\vec{n}' = |\vec{v}|\,|\vec{n}'|\cos\theta' > 0$.

Our conclusion is that, if we are looking in the direction $\vec{v}$, and if the outward pointing normal¹³ to the surface of the solid at $(x,y,z)$ is $\nabla G(x,y,z)$ then the point $(x,y,z)$ is hidden if and only if $\vec{v}\cdot\nabla G(x,y,z) > 0$.
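
The hidden-point test above is a one-line dot product. The following is a minimal sketch, not the code used to draw the figures in the text: it assumes the convex solid is the unit ball with surface $G = x^2+y^2+z^2-1 = 0$ (so that $\nabla G$ points outward) and an eye placed at $(0,0,5)$; the names `is_hidden` and `sphere_gradient` are hypothetical.

```python
def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))


def sphere_gradient(p):
    # Gradient of G = x^2 + y^2 + z^2 - 1; it points outward on the unit sphere.
    x, y, z = p
    return (2 * x, 2 * y, 2 * z)


def is_hidden(point, eye, outward_normal):
    """Return True when v . n > 0, where v points from the eye to the surface point."""
    view = tuple(pc - ec for pc, ec in zip(point, eye))
    return dot(view, outward_normal(point)) > 0


eye = (0.0, 0.0, 5.0)                                       # assumed eye position
print(is_hidden((0.0, 0.0, 1.0), eye, sphere_gradient))     # pole facing the eye
print(is_hidden((0.0, 0.0, -1.0), eye, sphere_gradient))    # pole facing away
```

The pole facing the eye is reported as visible and the opposite pole as hidden. The test relies on the solid being convex and on the supplied normal pointing outward; if $\nabla G$ pointed inward the inequality would have to be reversed.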

This method was used by the computer graphics program that created the shaded figures 14 in Examples 1.7.1 and 1.7.2, which
are reproduced here.

Tangent planes, in addition to being geometric objects, provide a simple but powerful tool for approximating functions of two
variables near a specified point. We saw something very similar in the CLP-1 text where we approximated functions of one variable
by their tangent lines. This brings us to our next topic — approximating functions.

Exercises
Stage 1

 1

Is it reasonable to say that the surfaces $x^2+y^2+(z-1)^2=1$ and $x^2+y^2+(z+1)^2=1$ are tangent to each other at $(0,0,0)$?

 2

Let the point $\vec{r}_0 = (x_0,y_0,z_0)$ lie on the surface $G(x,y,z)=0$. Assume that $\nabla G(x_0,y_0,z_0)\ne\vec{0}$. Suppose that the parametrized curve $\vec{r}(t) = \big(x(t),y(t),z(t)\big)$ is contained in the surface and that $\vec{r}(t_0)=\vec{r}_0$. Show that the tangent line to the curve at $\vec{r}_0$ lies in the tangent plane to $G=0$ at $\vec{r}_0$.

 3

Let $F(x_0,y_0,z_0) = G(x_0,y_0,z_0) = 0$ and let the vectors $\nabla F(x_0,y_0,z_0)$ and $\nabla G(x_0,y_0,z_0)$ be nonzero and not be parallel to each other. Find the equation of the normal plane to the curve of intersection of the surfaces $F(x,y,z)=0$ and $G(x,y,z)=0$ at $(x_0,y_0,z_0)$. By definition, that normal plane is the plane through $(x_0,y_0,z_0)$ whose normal vector is the tangent vector to the curve of intersection at $(x_0,y_0,z_0)$.

 4

Let $f(x_0,y_0) = g(x_0,y_0)$ and let $\langle f_x(x_0,y_0)\,,\,f_y(x_0,y_0)\rangle \ne \langle g_x(x_0,y_0)\,,\,g_y(x_0,y_0)\rangle$. Find the equation of the tangent line to the curve of intersection of the surfaces $z=f(x,y)$ and $z=g(x,y)$ at $(x_0\,,\,y_0\,,\,z_0=f(x_0,y_0))$.

Stage 2

 5✳
Let $f(x,y) = \dfrac{x^2y}{x^4+2y^2}$. Find the tangent plane to the surface $z=f(x,y)$ at the point $\big(-1\,,\,1\,,\,\frac{1}{3}\big)$.

 6✳

Find the tangent plane to

$$\frac{27}{\sqrt{x^2+y^2+z^2+3}} = 9$$

at the point $(2,1,1)$.

 7

Find the equations of the tangent plane and the normal line to the graph of the specified function at the specified point.
1. \(f(x,y)=x^2-y^2\) at \((-2,1)\)
2. \(f(x,y)=e^{xy}\) at \((2,0)\)

 8✳
Consider the surface \(z=f(x,y)\) defined implicitly by the equation \(xyz^2+y^2z^3=3+x^2\). Use a 3-dimensional gradient
vector to find the equation of the tangent plane to this surface at the point \((-1,1,2)\). Write your answer in the form
\(z=ax+by+c\), where \(a\), \(b\) and \(c\) are constants.

 9✳
A surface is given by
\[
z=x^2-2xy+y^2.
\]

1. Find the equation of the tangent plane to the surface at x = a, y = 2a.


2. For what value of a is the tangent plane parallel to the plane x − y + z = 1?

 10 ✳
Find the tangent plane and normal line to the surface \(z=f(x,y)=\dfrac{2y}{x^2+y^2}\) at \((x,y)=(-1,2)\).

 11 ✳
Find all the points on the surface \(x^2+9y^2+4z^2=17\) where the tangent plane is parallel to the plane \(x-8z=0\).

 12 ✳

Let \(S\) be the surface \(z=x^2+2y^2+2y-1\). Find all points \(P(x_0,y_0,z_0)\) on \(S\) with \(x_0\ne 0\) such that the normal line at \(P\)
contains the origin \((0,0,0)\).

 13 ✳

Find all points on the hyperboloid \(z^2=4x^2+y^2-1\) where the tangent plane is parallel to the plane \(2x-y+z=0\).

 14

Find a vector of length \(\sqrt{3}\) which is tangent to the curve of intersection of the surfaces \(z^2=4x^2+9y^2\) and
\(6x+3y+2z=5\) at \((2,1,-5)\).

Stage 3

 15

Find all horizontal planes that are tangent to the surface with equation
\[
z=xy\,e^{-(x^2+y^2)/2}
\]
What are the largest and smallest values of \(z\) on this surface?

 16 ✳
Let \(S\) be the surface
\[
xy-2x+yz+x^2+y^2+z^3=7
\]
1. Find the tangent plane and normal line to the surface \(S\) at the point \((0,2,1)\).
2. The equation defining \(S\) implicitly defines \(z\) as a function of \(x\) and \(y\) for \((x,y,z)\) near \((0,2,1)\). Find expressions for \(\frac{\partial z}{\partial x}\) and \(\frac{\partial z}{\partial y}\). Evaluate \(\frac{\partial z}{\partial y}\) at \((x,y,z)=(0,2,1)\).
3. Find an expression for \(\frac{\partial^2 z}{\partial x\,\partial y}\).

 17 ✳
1. Find a vector perpendicular at the point \((1,1,3)\) to the surface with equation \(x^2+z^2=10\).
2. Find a vector tangent at the same point to the curve of intersection of the surface in part 1 with the surface \(y^2+z^2=10\).
3. Find parametric equations for the line tangent to that curve at that point.

 18 ✳

Let \(P\) be the point where the curve
\[
\vec r(t)=t^3\,\hat{\imath}+t\,\hat{\jmath}+t^2\,\hat{k},\qquad (0\le t<\infty)
\]
intersects the surface
\[
z^3+xyz-2=0
\]
Find the (acute) angle between the curve and the surface at \(P\).

 19

Find the distance from the point \((1,1,0)\) to the circular paraboloid with equation \(z=x^2+y^2\).

1. It is possible, but beyond the scope of this text, to give a precise meaning to “fits best”.
2. The method for solving cubics was developed in the 16th century by del Ferro, Cardano and Ferrari (Cardano's student). Ferrari
then went on to discover a formula for the roots of a quartic. Both the cubic and quartic formulae are extremely cumbersome,
and no such formula exists for polynomials of degree 5 and higher. This is the famous Abel-Ruffini theorem.
3. See Appendix A.16 in the CLP-2 text. There it is shown that any integer root of a polynomial with integer coefficients must
divide the constant term exactly. So in this case only ±1 and ±3 could be integer roots. So it is good to check to see if any of
these are solutions before moving on to more sophisticated techniques.
4. Note that we are assuming that (a, b, f (a, b)) is the point on the surface that is nearest (0, 3, 0). That there exists such a point is
intuitively obvious from a sketch of the surface. The mathematical proof that there exists such a point is beyond the scope of
this text.
5. The gradient will also play a big role in Section 2.7.
6. Alternatively, one could rewrite G = K as G − K = 0 and replace G by G − K in the above argument.
7. Indeed we could write Theorem 2.5.1 as a corollary of Theorem 2.5.5. But in a textbook one tries to start with the concrete and
move to the more general.
8. The spelling “equipt” is a bit archaic. There must be a joke here about quips.
9. Recall that “minimum” means the most negative, not the closest to zero.
10. While this is intuitively obvious, proving it is beyond the scope of this text.
11. We saw why this vector must be perpendicular to the plane in Example 2.5.3.
12. You can find out more about it by plugging “ray tracing” into the search engine of your choice.
13. If \(\nabla G(x,y,z)\) is the inward pointing normal, just replace \(G\) by \(-G\).
14. Those figures are not convex. But it was still possible to use the method discussed above because any light ray from our eye
that passes through the figure intersects the figure at most twice. It first enters the figure at a visible point and then exits the
figure at a hidden point.

This page titled 2.5: Tangent Planes and Normal Lines is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by
Joel Feldman, Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a
detailed edit history is available upon request.

2.6: Linear Approximations and Error
A frequently used, and effective, strategy for building an understanding of the behaviour of a complicated function near a point is
to approximate it by a simple function. The following suite of such approximations is standard fare in Calculus I courses. See, for
example, §3.4 in the CLP-1 text.
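\[
\begin{aligned}
g(t_0+\Delta t) &\approx g(t_0) && \text{constant approximation}\\
g(t_0+\Delta t) &\approx g(t_0)+g'(t_0)\,\Delta t && \text{linear, or tangent line, approximation}\\
g(t_0+\Delta t) &\approx g(t_0)+g'(t_0)\,\Delta t+\tfrac{1}{2}g''(t_0)\,\Delta t^2 && \text{quadratic approximation}
\end{aligned}
\]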

More generally, for any natural number n, the approximation


\[
g(t_0+\Delta t)\approx g(t_0)+g'(t_0)\,\Delta t+\tfrac{1}{2}g''(t_0)\,\Delta t^2+\cdots+\tfrac{1}{n!}g^{(n)}(t_0)\,\Delta t^n
\]
is known as the Taylor polynomial of degree \(n\). You may have also found a formula for the error introduced in making this
approximation. The error \(E_n(\Delta t)\) is defined by
\[
g(t_0+\Delta t)=g(t_0)+g'(t_0)\,\Delta t+\tfrac{1}{2!}g''(t_0)\,\Delta t^2+\cdots+\tfrac{1}{n!}g^{(n)}(t_0)\,\Delta t^n+E_n(\Delta t)
\]
and obeys (see footnote 1)
\[
E_n(\Delta t)=\tfrac{1}{(n+1)!}\,g^{(n+1)}(t_0+c\,\Delta t)\,\Delta t^{n+1}
\]
for some (unknown) \(0\le c\le 1\).


It is a simple matter to use these one dimensional approximations to generate the analogous multidimensional approximations. To
introduce the ideas, we'll generate the linear approximation to a function, \(f(x,y)\), of two variables, near the point \((x_0,y_0)\). Define

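\[
g(t)=f(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)
\]
We have defined \(g(t)\) so that
\[
g(0)=f(x_0,y_0)\qquad\text{and}\qquad g(1)=f(x_0+\Delta x\,,\,y_0+\Delta y)
\]
Consequently, setting \(t_0=0\) and \(\Delta t=1\),
\[
\begin{aligned}
f(x_0+\Delta x\,,\,y_0+\Delta y)=g(1)&=g(t_0+\Delta t)\\
&\approx g(t_0)+g'(t_0)\,\Delta t\\
&=g(0)+g'(0)
\end{aligned}
\]
We can now compute \(g'(0)\) using the multivariable chain rule of 2.4.2:
\[
g'(t)=\frac{\partial f}{\partial x}(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)\,\Delta x+\frac{\partial f}{\partial y}(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)\,\Delta y
\]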

so that,

 Equation 2.6.1

\[
f(x_0+\Delta x\,,\,y_0+\Delta y)\approx f(x_0,y_0)+\frac{\partial f}{\partial x}(x_0,y_0)\,\Delta x+\frac{\partial f}{\partial y}(x_0,y_0)\,\Delta y
\]
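
The construction behind Equation 2.6.1 can be checked numerically. The following is a minimal Python sketch, not part of the original text: the function `f`, its partial derivatives `fx` and `fy`, and the base point and increments are all illustrative assumptions. It builds \(g(t)=f(x_0+t\Delta x, y_0+t\Delta y)\), estimates \(g'(0)\) by a central difference, and compares it with the chain rule expression.

```python
import math

# Illustrative function and its partial derivatives (assumptions, not from the text)
def f(x, y):
    return x**2 * y + math.sin(y)

def fx(x, y):
    return 2 * x * y

def fy(x, y):
    return x**2 + math.cos(y)

x0, y0, dx, dy = 1.0, 2.0, 0.01, -0.02

def g(t):
    # The one-variable function used in the derivation of Equation 2.6.1
    return f(x0 + t * dx, y0 + t * dy)

# Central-difference estimate of g'(0)
h = 1e-5
g_prime_0 = (g(h) - g(-h)) / (2 * h)

# Chain rule value: f_x(x0, y0) * dx + f_y(x0, y0) * dy
chain_rule = fx(x0, y0) * dx + fy(x0, y0) * dy

linear = g(0) + chain_rule   # right-hand side of Equation 2.6.1
exact = g(1)                 # f(x0 + dx, y0 + dy)

print(g_prime_0, chain_rule)
print(linear, exact)
```

The two printed derivative values should agree to many digits, and the linear approximation should differ from the exact value only by a term of second order in the increments.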

Of course exactly the same procedure works for functions of three or more variables. In particular

 Equation 2.6.2

2.6.1 https://math.libretexts.org/@go/page/92239
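\[
\begin{aligned}
f(x_0+\Delta x\,,\,y_0+\Delta y\,,\,z_0+\Delta z)
\approx f(x_0,y_0,z_0)&+\frac{\partial f}{\partial x}(x_0,y_0,z_0)\,\Delta x+\frac{\partial f}{\partial y}(x_0,y_0,z_0)\,\Delta y\\
&+\frac{\partial f}{\partial z}(x_0,y_0,z_0)\,\Delta z
\end{aligned}
\]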

While these linear approximations are quite simple, they tend to be pretty decent provided Δx and Δy are small. See the optional
§2.6.1 for a more precise statement.

 Remark 2.6.3

Applying 2.6.1, with \(\Delta x=x-x_0\) and \(\Delta y=y-y_0\), gives
\[
f(x,y)\approx f(x_0,y_0)+\frac{\partial f}{\partial x}(x_0,y_0)\,(x-x_0)+\frac{\partial f}{\partial y}(x_0,y_0)\,(y-y_0)
\]
Looking at part (b) of Theorem 2.5.1, we see that this just says that the tangent plane to the surface \(z=f(x,y)\) at the point
\((x_0,y_0,f(x_0,y_0))\) remains close to the surface when \((x,y)\) is close to \((x_0,y_0)\).

 Example 2.6.4

Let
\[
f(x,y)=\sqrt{x^2+y^2}
\]
Then
\[
\begin{aligned}
\frac{\partial f}{\partial x}(x,y)&=\frac{1}{2}\,\frac{2x}{\sqrt{x^2+y^2}} & f_x(x_0,y_0)&=\frac{x_0}{\sqrt{x_0^2+y_0^2}}\\
\frac{\partial f}{\partial y}(x,y)&=\frac{1}{2}\,\frac{2y}{\sqrt{x^2+y^2}} & f_y(x_0,y_0)&=\frac{y_0}{\sqrt{x_0^2+y_0^2}}
\end{aligned}
\]
so that the linear approximation to \(f(x,y)\) at \((x_0,y_0)\) is
\[
\begin{aligned}
f(x_0+\Delta x\,,\,y_0+\Delta y)&\approx f(x_0,y_0)+f_x(x_0,y_0)\,\Delta x+f_y(x_0,y_0)\,\Delta y\\
&=\sqrt{x_0^2+y_0^2}+\frac{x_0}{\sqrt{x_0^2+y_0^2}}\,\Delta x+\frac{y_0}{\sqrt{x_0^2+y_0^2}}\,\Delta y
\end{aligned}
\]

 Definition 2.6.5

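People often write \(\Delta f\) for the change \(f(x_0+\Delta x\,,\,y_0+\Delta y)-f(x_0,y_0)\) in the value of \(f\). Then the linear approximation
2.6.1 becomes
\[
\Delta f\approx\frac{\partial f}{\partial x}(x_0,y_0)\,\Delta x+\frac{\partial f}{\partial y}(x_0,y_0)\,\Delta y
\]
If they want to emphasize that \(\Delta x\), \(\Delta y\) and \(\Delta f\) are really small (they may even say “infinitesimal”), they'll write (see footnote 2) \(dx\), \(dy\)
and \(df\) instead. In this notation
\[
df\approx\frac{\partial f}{\partial x}(x_0,y_0)\,dx+\frac{\partial f}{\partial y}(x_0,y_0)\,dy
\]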

 Definition 2.6.6

Suppose that we wish to approximate a quantity \(Q\) and that the approximation turns out to be \(Q+\Delta Q\). Then
the absolute error in the approximation is \(|\Delta Q|\) and
the relative error in the approximation is \(\left|\frac{\Delta Q}{Q}\right|\) and
the percentage error in the approximation is \(100\left|\frac{\Delta Q}{Q}\right|\)

In Example 3.4.5 of the CLP-1 text we found an approximate value for the number \(\sqrt{4.1}\) by using a linear approximation to the
single variable function \(f(x)=\sqrt{x}\). We can make similar use of linear approximations to multivariable functions.

 Example 2.6.7
Find an approximate value for \(\dfrac{(0.998)^3}{1.003}\).

Solution
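Set \(f(x,y)=\dfrac{x^3}{y}\). We are to find (approximately) \(f(0.998\,,\,1.003)\). We can easily find
\[
f(1,1)=\frac{1^3}{1}=1
\]
and since
\[
\frac{\partial f}{\partial x}=\frac{3x^2}{y}\qquad\text{and}\qquad\frac{\partial f}{\partial y}=-\frac{x^3}{y^2}
\]
we can also easily find
\[
\begin{aligned}
\frac{\partial f}{\partial x}(1,1)&=3\,\frac{1^2}{1}=3\\
\frac{\partial f}{\partial y}(1,1)&=-\frac{1^3}{1^2}=-1
\end{aligned}
\]
So, setting \(\Delta x=-0.002\) and \(\Delta y=0.003\), we have
\[
\begin{aligned}
\frac{0.998^3}{1.003}=f(0.998\,,\,1.003)&=f(1+\Delta x\,,\,1+\Delta y)\\
&\approx f(1,1)+\frac{\partial f}{\partial x}(1,1)\,\Delta x+\frac{\partial f}{\partial y}(1,1)\,\Delta y\\
&\approx 1+3(-0.002)-1(0.003)=0.991
\end{aligned}
\]
By way of comparison, the exact answer is 0.9910389 to seven decimal places.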

 Example 2.6.8

Find an approximate value for \((4.2)^{1/2}+(26.7)^{1/3}+(256.4)^{1/4}\).

Solution
Set \(f(x,y,z)=x^{1/2}+y^{1/3}+z^{1/4}\). We are to find (approximately) \(f(4.2\,,\,26.7\,,\,256.4)\). We can easily find
\[
f(4,27,256)=(4)^{1/2}+(27)^{1/3}+(256)^{1/4}=2+3+4=9
\]
and since
\[
\frac{\partial f}{\partial x}=\frac{1}{2x^{1/2}}\qquad\frac{\partial f}{\partial y}=\frac{1}{3y^{2/3}}\qquad\frac{\partial f}{\partial z}=\frac{1}{4z^{3/4}}
\]

we can also easily find

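\[
\begin{aligned}
\frac{\partial f}{\partial x}(4,27,256)&=\frac{1}{2(4)^{1/2}}=\frac{1}{2}\times\frac{1}{2}\\
\frac{\partial f}{\partial y}(4,27,256)&=\frac{1}{3(27)^{2/3}}=\frac{1}{3}\times\frac{1}{9}\\
\frac{\partial f}{\partial z}(4,27,256)&=\frac{1}{4(256)^{3/4}}=\frac{1}{4}\times\frac{1}{64}
\end{aligned}
\]
So, setting \(\Delta x=0.2\), \(\Delta y=-0.3\), and \(\Delta z=0.4\), we have
\[
\begin{aligned}
(4.2)^{1/2}+(26.7)^{1/3}+(256.4)^{1/4}&=f(4.2\,,\,26.7\,,\,256.4)\\
&=f(4+\Delta x\,,\,27+\Delta y\,,\,256+\Delta z)\\
&\approx f(4,27,256)+\frac{\partial f}{\partial x}(4,27,256)\,\Delta x+\frac{\partial f}{\partial y}(4,27,256)\,\Delta y\\
&\qquad+\frac{\partial f}{\partial z}(4,27,256)\,\Delta z\\
&\approx 9+\frac{0.2}{2\times 2}-\frac{0.3}{3\times 9}+\frac{0.4}{4\times 64}=9+\frac{1}{20}-\frac{1}{90}+\frac{1}{640}\\
&=9.0405
\end{aligned}
\]
to four decimal places. The exact answer is 9.03980 to five decimal places.
That's a difference of about
\[
100\,\frac{9.0405-9.0398}{9}\,\%=0.008\%
\]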

Note that we could have used the single variable approximation techniques in the CLP-1 text to separately approximate
\((4.2)^{1/2}\), \((26.7)^{1/3}\) and \((256.4)^{1/4}\) and then added the results together. Indeed what we have done here is equivalent.
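
The arithmetic of Example 2.6.8 can be reproduced with a few lines of Python. This is only an illustrative sketch; the variable names are our own and are not part of the original text.

```python
def f(x, y, z):
    return x ** 0.5 + y ** (1 / 3) + z ** 0.25

# Base point and increments from Example 2.6.8
x0, y0, z0 = 4.0, 27.0, 256.0
dx, dy, dz = 0.2, -0.3, 0.4

# Partial derivatives of f evaluated at the base point
fx = 1 / (2 * x0 ** 0.5)         # 1/4
fy = 1 / (3 * y0 ** (2 / 3))     # 1/27
fz = 1 / (4 * z0 ** 0.75)        # 1/256

# Linear approximation (Equation 2.6.2) and the directly computed value
approx = f(x0, y0, z0) + fx * dx + fy * dy + fz * dz
exact = f(x0 + dx, y0 + dy, z0 + dz)

print(round(approx, 4), round(exact, 5))
print(100 * abs(approx - exact) / exact, "percent")
```

Because \(f\) is a sum of three single variable functions, each term of `approx` is exactly the tangent line approximation of the corresponding summand, which is the equivalence noted above.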

 Example 2.6.9
A triangle has sides \(a=10.1\) cm and \(b=19.8\) cm which include an angle \(35^\circ\). Approximate the area of the triangle.

Solution
The triangle has height \(h=a\sin\theta\) and hence has area
\[
A(a,b,\theta)=\frac{1}{2}bh=\frac{1}{2}ab\sin\theta
\]
The \(\sin\theta\) in this formula hides a booby trap built into this problem. In preparing the linear approximation we will need to use
the derivative of \(\sin\theta\). But the standard derivative \(\frac{d}{d\theta}\sin\theta=\cos\theta\) only applies when \(\theta\) is expressed in radians, not in
degrees. See Warning 3.4.23 in the CLP-1 text.

So we are obliged to convert \(35^\circ\) into
\[
35^\circ=(30+5)\frac{\pi}{180}\text{ radians}=\Big(\frac{\pi}{6}+\frac{\pi}{36}\Big)\text{ radians}
\]
We need to compute (approximately) \(A\big(10.1\,,\,19.8\,,\,\frac{\pi}{6}+\frac{\pi}{36}\big)\). We will, of course (see footnote 3), choose
\[
\begin{aligned}
a_0&=10 & b_0&=20 & \theta_0&=\frac{\pi}{6}\\
\Delta a&=0.1 & \Delta b&=-0.2 & \Delta\theta&=\frac{\pi}{36}
\end{aligned}
\]

By way of preparation, we evaluate

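\[
\begin{aligned}
A(a_0,b_0,\theta_0)&=\frac{1}{2}a_0b_0\sin\theta_0=\frac{1}{2}(10)(20)\frac{1}{2}=50\\
\frac{\partial A}{\partial a}(a_0,b_0,\theta_0)&=\frac{1}{2}b_0\sin\theta_0=\frac{1}{2}(20)\frac{1}{2}=5\\
\frac{\partial A}{\partial b}(a_0,b_0,\theta_0)&=\frac{1}{2}a_0\sin\theta_0=\frac{1}{2}(10)\frac{1}{2}=\frac{5}{2}\\
\frac{\partial A}{\partial\theta}(a_0,b_0,\theta_0)&=\frac{1}{2}a_0b_0\cos\theta_0=\frac{1}{2}(10)(20)\frac{\sqrt{3}}{2}=50\sqrt{3}
\end{aligned}
\]
So the linear approximation gives
\[
\begin{aligned}
\text{Area}&=A\Big(10.1\,,\,19.8\,,\,\frac{\pi}{6}+\frac{\pi}{36}\Big)=A(a_0+\Delta a\,,\,b_0+\Delta b\,,\,\theta_0+\Delta\theta)\\
&\approx A(a_0,b_0,\theta_0)+\frac{\partial A}{\partial a}(a_0,b_0,\theta_0)\,\Delta a+\frac{\partial A}{\partial b}(a_0,b_0,\theta_0)\,\Delta b\\
&\qquad+\frac{\partial A}{\partial\theta}(a_0,b_0,\theta_0)\,\Delta\theta\\
&=50+5\times 0.1+\frac{5}{2}\times(-0.2)+50\sqrt{3}\,\frac{\pi}{36}\\
&=50+\frac{5}{10}-\frac{5}{10}+50\sqrt{3}\,\frac{\pi}{36}\\
&=50\Big(1+\sqrt{3}\,\frac{\pi}{36}\Big)\\
&\approx 57.56
\end{aligned}
\]
to two decimal places. The exact answer is 57.35 to two decimal places. Our approximation has an error of about
\[
100\,\frac{57.56-57.35}{57.35}\,\%=0.37\%
\]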
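
The radians versus degrees trap in Example 2.6.9 is easy to see numerically. The sketch below is illustrative only (the names `area`, `A_a`, `A_b`, `A_th` and `wrong_units` are our own); it evaluates the linear approximation with \(\Delta\theta\) in radians, and then, deliberately incorrectly, with \(\Delta\theta=5\) taken straight from the degree measure.

```python
import math

def area(a, b, theta):
    # Triangle area from two sides and the included angle (theta in radians)
    return 0.5 * a * b * math.sin(theta)

a0, b0, th0 = 10.0, 20.0, math.pi / 6
da, db, dth = 0.1, -0.2, math.radians(5)   # 5 degrees converted to radians

# Partial derivatives of A at (a0, b0, th0)
A_a = 0.5 * b0 * math.sin(th0)
A_b = 0.5 * a0 * math.sin(th0)
A_th = 0.5 * a0 * b0 * math.cos(th0)

approx = area(a0, b0, th0) + A_a * da + A_b * db + A_th * dth
exact = area(10.1, 19.8, math.radians(35))

# Deliberate mistake: using the increment 5 (degrees) with the radian derivative
wrong_units = area(a0, b0, th0) + A_a * da + A_b * db + A_th * 5.0

print(round(approx, 2), round(exact, 2), round(wrong_units, 2))
```

The first two printed values match the 57.56 and 57.35 found above, while the third is several hundred, far from any plausible area for this triangle.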

Another practical use of these linear approximations is to quantify how errors made in measured quantities propagate in
computations using those measured quantities. Let's explore this idea a little by recycling the last example.

 Example 2.6.10. Example 2.6.9, continued

Suppose that, as in Example 2.6.9, we are attempting to determine the area of a triangle by measuring the lengths of two of its
sides together with the angle between them and then using the formula
\[
A(a,b,\theta)=\frac{1}{2}ab\sin\theta
\]
Of course, in the real world (see footnote 4), we cannot measure lengths and angles exactly. So if we need to know the area to within 1%, the
question becomes: “How accurately do we have to measure the side lengths and included angle if we want the area that we
compute to have an error of no more than about 1%?”
Let's call the exact side lengths and included angle \(a_0\), \(b_0\) and \(\theta_0\), respectively, and the measured side lengths and included
angle \(a_0+\Delta a\), \(b_0+\Delta b\) and \(\theta_0+\Delta\theta\). So \(\Delta a\), \(\Delta b\) and \(\Delta\theta\) represent the errors in our measurements. Then, by 2.6.2, the error
in our computed area will be approximately
\[
\begin{aligned}
\Delta A&\approx\frac{\partial A}{\partial a}(a_0,b_0,\theta_0)\,\Delta a+\frac{\partial A}{\partial b}(a_0,b_0,\theta_0)\,\Delta b+\frac{\partial A}{\partial\theta}(a_0,b_0,\theta_0)\,\Delta\theta\\
&=\frac{\Delta a}{2}\,b_0\sin\theta_0+\frac{\Delta b}{2}\,a_0\sin\theta_0+\frac{\Delta\theta}{2}\,a_0b_0\cos\theta_0
\end{aligned}
\]
and the percentage error in our computed area will be
\[
100\,\frac{|\Delta A|}{A(a_0,b_0,\theta_0)}\approx\left|100\frac{\Delta a}{a_0}+100\frac{\Delta b}{b_0}+100\,\Delta\theta\,\frac{\cos\theta_0}{\sin\theta_0}\right|
\]

By the triangle inequality, \(|u+v|\le|u|+|v|\), and the fact that \(|uv|=|u|\,|v|\),
\[
\left|100\frac{\Delta a}{a_0}+100\frac{\Delta b}{b_0}+100\,\Delta\theta\,\frac{\cos\theta_0}{\sin\theta_0}\right|
\le 100\left|\frac{\Delta a}{a_0}\right|+100\left|\frac{\Delta b}{b_0}\right|+100\,|\Delta\theta|\left|\frac{\cos\theta_0}{\sin\theta_0}\right|
\]
We want this to be less than 1.


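Of course we do not know exactly what \(a_0\), \(b_0\) and \(\theta_0\) are. But suppose that we are confident that \(a_0\ge 10\), \(b_0\ge 10\) and
\(\frac{\pi}{6}\le\theta_0\le\frac{\pi}{2}\) so that \(\cot\theta_0\le\cot\frac{\pi}{6}=\sqrt{3}\le 2\). Then
\[
\begin{aligned}
100\left|\frac{\Delta a}{a_0}\right|&\le 100\left|\frac{\Delta a}{10}\right|=10\,|\Delta a|\\
100\left|\frac{\Delta b}{b_0}\right|&\le 100\left|\frac{\Delta b}{10}\right|=10\,|\Delta b|\\
100\,|\Delta\theta|\left|\frac{\cos\theta_0}{\sin\theta_0}\right|&\le 100\,|\Delta\theta|\,2=200\,|\Delta\theta|
\end{aligned}
\]
and
\[
100\,\frac{|\Delta A|}{A(a_0,b_0,\theta_0)}\lesssim 10\,|\Delta a|+10\,|\Delta b|+200\,|\Delta\theta|
\]
So it will suffice to have measurement errors \(|\Delta a|\), \(|\Delta b|\) and \(|\Delta\theta|\) obey
\[
10\,|\Delta a|+10\,|\Delta b|+200\,|\Delta\theta|<1
\]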
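
To see how the sufficient condition of Example 2.6.10 behaves in practice, here is a small Python sketch. The tolerances `da`, `db`, `dtheta` are hypothetical values chosen for illustration, and the test triangle sits at the corner \(a_0=b_0=10\), \(\theta_0=\pi/6\) of the assumed ranges, where the three bounds used above are tightest.

```python
import math

def area(a, b, theta):
    return 0.5 * a * b * math.sin(theta)

def percent_error_bound(da, db, dtheta):
    # Bound from Example 2.6.10; assumes a0 >= 10, b0 >= 10 and pi/6 <= theta0 <= pi/2
    return 10 * abs(da) + 10 * abs(db) + 200 * abs(dtheta)

# Hypothetical measurement tolerances (angle error in radians)
da, db, dtheta = 0.02, 0.02, 0.002
bound = percent_error_bound(da, db, dtheta)

# Smallest allowed sides and angle, with all errors of the same sign
a0, b0, theta0 = 10.0, 10.0, math.pi / 6
true_area = area(a0, b0, theta0)
measured_area = area(a0 + da, b0 + db, theta0 + dtheta)
actual = 100 * abs(measured_area - true_area) / true_area

print(bound, actual)
```

With these tolerances the bound is 0.8, below the target of 1, and the actual percentage error at this test triangle comes out a little smaller than the bound.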

 Example 2.6.11

A Question
Suppose that three variables are measured with percentage error \(\varepsilon_1\), \(\varepsilon_2\) and \(\varepsilon_3\) respectively. In other words, if the exact value
of variable number \(i\) is \(x_i\) and measured value of variable number \(i\) is \(x_i+\Delta x_i\) then
\[
100\left|\frac{\Delta x_i}{x_i}\right|=\varepsilon_i
\]
Suppose further that a quantity \(P\) is then computed by taking the product of the three variables. So the exact value of \(P\) is
\[
P(x_1,x_2,x_3)=x_1x_2x_3
\]
and the measured value is \(P(x_1+\Delta x_1\,,\,x_2+\Delta x_2\,,\,x_3+\Delta x_3)\). What is the percentage error in this measured value of \(P\)?
Solution
The percentage error in the measured value \(P(x_1+\Delta x_1\,,\,x_2+\Delta x_2\,,\,x_3+\Delta x_3)\) is
\[
100\left|\frac{P(x_1+\Delta x_1\,,\,x_2+\Delta x_2\,,\,x_3+\Delta x_3)-P(x_1,x_2,x_3)}{P(x_1,x_2,x_3)}\right|
\]

We can get a much simpler approximate expression for this percentage error, which is good enough for virtually all
applications, by applying

\[
\begin{aligned}
P(x_1+\Delta x_1\,,\,x_2+\Delta x_2\,,\,x_3+\Delta x_3)
\approx P(x_1,x_2,x_3)&+P_{x_1}(x_1,x_2,x_3)\,\Delta x_1+P_{x_2}(x_1,x_2,x_3)\,\Delta x_2\\
&+P_{x_3}(x_1,x_2,x_3)\,\Delta x_3
\end{aligned}
\]

The three partial derivatives are

\[
\begin{aligned}
P_{x_1}(x_1,x_2,x_3)&=\frac{\partial}{\partial x_1}\big[x_1x_2x_3\big]=x_2x_3\\
P_{x_2}(x_1,x_2,x_3)&=\frac{\partial}{\partial x_2}\big[x_1x_2x_3\big]=x_1x_3\\
P_{x_3}(x_1,x_2,x_3)&=\frac{\partial}{\partial x_3}\big[x_1x_2x_3\big]=x_1x_2
\end{aligned}
\]

So
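\[
P(x_1+\Delta x_1\,,\,x_2+\Delta x_2\,,\,x_3+\Delta x_3)\approx P(x_1,x_2,x_3)+x_2x_3\,\Delta x_1+x_1x_3\,\Delta x_2+x_1x_2\,\Delta x_3
\]
and the (approximate) percentage error in \(P\) is
\[
\begin{aligned}
100\left|\frac{P(x_1+\Delta x_1\,,\,x_2+\Delta x_2\,,\,x_3+\Delta x_3)-P(x_1,x_2,x_3)}{P(x_1,x_2,x_3)}\right|
&\approx 100\left|\frac{x_2x_3\,\Delta x_1+x_1x_3\,\Delta x_2+x_1x_2\,\Delta x_3}{P(x_1,x_2,x_3)}\right|\\
&=100\left|\frac{x_2x_3\,\Delta x_1+x_1x_3\,\Delta x_2+x_1x_2\,\Delta x_3}{x_1x_2x_3}\right|\\
&=\left|100\frac{\Delta x_1}{x_1}+100\frac{\Delta x_2}{x_2}+100\frac{\Delta x_3}{x_3}\right|\\
&\le\varepsilon_1+\varepsilon_2+\varepsilon_3
\end{aligned}
\]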

More generally, if we take a product of \(n\), rather than three, variables the percentage error in the product becomes at most
(approximately) \(\sum_{i=1}^{n}\varepsilon_i\). This is the basis of the experimentalist's rule of thumb that when you take products, percentage errors
add.
Still more generally, if we take a “product” \(\prod_{i=1}^{n}x_i^{m_i}\), the percentage error in the “product” becomes at most (approximately)
\(\sum_{i=1}^{n}|m_i|\,\varepsilon_i\).
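
The rule of thumb from Example 2.6.11 can be tried out on concrete numbers. In the sketch below the exact values `x` and the percentage errors `eps` are hypothetical, chosen only for illustration. All three errors are taken with the same sign, which is the case in which the individual percentage errors reinforce one another.

```python
import math

x = [2.0, 5.0, 10.0]     # hypothetical exact values x1, x2, x3
eps = [1.0, 2.0, 0.5]    # hypothetical percentage errors in each measurement

# Measured values, all erring on the high side
measured = [xi * (1 + e / 100) for xi, e in zip(x, eps)]

exact_product = math.prod(x)
measured_product = math.prod(measured)

actual = 100 * abs(measured_product - exact_product) / exact_product
estimate = sum(eps)      # "percentage errors add"

print(actual, estimate)
```

The actual percentage error comes out marginally larger than the sum of the three percentage errors. That is consistent with the text: the bound \(\varepsilon_1+\varepsilon_2+\varepsilon_3\) is derived from the linear approximation, so it holds only approximately, up to terms of second order in the errors.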

Quadratic Approximation and Error Bounds


Recall that, in the CLP-1 text, we started with the constant approximation, then improved it to the linear approximation by adding
in degree one terms, then improved that to the quadratic approximation by adding in degree two terms, and so on. We can do the
same thing here. Once again, set

\[
g(t)=f(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)
\]
and recall that
\[
g(0)=f(x_0,y_0)\qquad\text{and}\qquad g(1)=f(x_0+\Delta x\,,\,y_0+\Delta y)
\]
We'll now see what the quadratic approximation
\[
g(t_0+\Delta t)\approx g(t_0)+g'(t_0)\,\Delta t+\tfrac{1}{2}g''(t_0)\,\Delta t^2
\]
and the corresponding exact formula (see (3.4.32) in the CLP-1 text)
\[
g(t_0+\Delta t)=g(t_0)+g'(t_0)\,\Delta t+\tfrac{1}{2}g''(t_0+c\,\Delta t)\,\Delta t^2\qquad\text{for some }0\le c\le 1
\]
tells us about \(f\). We have already found, using the chain rule, that
\[
g'(t)=\frac{\partial f}{\partial x}(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)\,\Delta x+\frac{\partial f}{\partial y}(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)\,\Delta y
\]
We now need to evaluate \(g''(t)\). Temporarily write \(f_1=\frac{\partial f}{\partial x}\) and \(f_2=\frac{\partial f}{\partial y}\) so that
\[
g'(t)=f_1(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)\,\Delta x+f_2(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)\,\Delta y
\]

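Then we have, again using the chain rule,
\[
\begin{aligned}
\frac{d}{dt}\big[f_1(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)\big]
&=\frac{\partial f_1}{\partial x}(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)\,\Delta x+\frac{\partial f_1}{\partial y}(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)\,\Delta y\\
&=\frac{\partial^2 f}{\partial x^2}(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)\,\Delta x+\frac{\partial^2 f}{\partial y\,\partial x}(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)\,\Delta y\qquad(*)
\end{aligned}
\]
and
\[
\begin{aligned}
\frac{d}{dt}\big[f_2(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)\big]
&=\frac{\partial f_2}{\partial x}(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)\,\Delta x+\frac{\partial f_2}{\partial y}(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)\,\Delta y\\
&=\frac{\partial^2 f}{\partial x\,\partial y}(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)\,\Delta x+\frac{\partial^2 f}{\partial y^2}(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)\,\Delta y\qquad(**)
\end{aligned}
\]
Adding \(\Delta x\) times \((*)\) to \(\Delta y\) times \((**)\) and recalling that \(\frac{\partial^2 f}{\partial y\,\partial x}=\frac{\partial^2 f}{\partial x\,\partial y}\), gives
\[
\begin{aligned}
g''(t)&=\frac{\partial^2 f}{\partial x^2}(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)\,\Delta x^2\\
&\qquad+2\,\frac{\partial^2 f}{\partial x\,\partial y}(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)\,\Delta x\,\Delta y\\
&\qquad+\frac{\partial^2 f}{\partial y^2}(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)\,\Delta y^2
\end{aligned}
\]
Now setting \(t_0=0\) and \(\Delta t=1\), the quadratic approximation
\[
f(x_0+\Delta x\,,\,y_0+\Delta y)=g(1)\approx g(0)+g'(0)+\tfrac{1}{2}g''(0)
\]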

is

 Equation 2.6.12

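\[
\begin{aligned}
f(x_0+\Delta x\,,\,y_0+\Delta y)
&\approx f(x_0,y_0)+\frac{\partial f}{\partial x}(x_0,y_0)\,\Delta x+\frac{\partial f}{\partial y}(x_0,y_0)\,\Delta y\\
&\qquad+\frac{1}{2}\left\{\frac{\partial^2 f}{\partial x^2}(x_0,y_0)\,\Delta x^2+2\,\frac{\partial^2 f}{\partial x\,\partial y}(x_0,y_0)\,\Delta x\,\Delta y+\frac{\partial^2 f}{\partial y^2}(x_0,y_0)\,\Delta y^2\right\}
\end{aligned}
\]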

and the corresponding exact formula


\[
f(x_0+\Delta x\,,\,y_0+\Delta y)=g(1)=g(0)+g'(0)+\tfrac{1}{2}g''(c)
\]

is

 Equation 2.6.13

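\[
\begin{aligned}
f(x_0+\Delta x\,,\,y_0+\Delta y)
&=f(x_0,y_0)+\frac{\partial f}{\partial x}(x_0,y_0)\,\Delta x+\frac{\partial f}{\partial y}(x_0,y_0)\,\Delta y\\
&\qquad+\frac{1}{2}\left\{\frac{\partial^2 f}{\partial x^2}(\vec r(c))\,\Delta x^2+2\,\frac{\partial^2 f}{\partial x\,\partial y}(\vec r(c))\,\Delta x\,\Delta y+\frac{\partial^2 f}{\partial y^2}(\vec r(c))\,\Delta y^2\right\}
\end{aligned}
\]
where \(\vec r(c)=(x_0+c\,\Delta x\,,\,y_0+c\,\Delta y)\) and \(c\) is some (unknown) number satisfying \(0\le c\le 1\).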

 Equation 2.6.14
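If we can bound the second derivatives
\[
\left|\frac{\partial^2 f}{\partial x^2}(\vec r(c))\right|\ ,\ \left|\frac{\partial^2 f}{\partial x\,\partial y}(\vec r(c))\right|\ ,\ \left|\frac{\partial^2 f}{\partial y^2}(\vec r(c))\right|\le M
\]
we can massage 2.6.13 into the form
\[
\left|f(x_0+\Delta x\,,\,y_0+\Delta y)-\left\{f(x_0,y_0)+\frac{\partial f}{\partial x}(x_0,y_0)\,\Delta x+\frac{\partial f}{\partial y}(x_0,y_0)\,\Delta y\right\}\right|
\le\frac{M}{2}\big(|\Delta x|^2+2|\Delta x|\,|\Delta y|+|\Delta y|^2\big)
\]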

Why might we want to do this? The left hand side of 2.6.14 is exactly the error in the linear approximation 2.6.1. So the right hand
side is a rigorous bound on the error in the linear approximation.

 Example 2.6.15. Example 2.6.7, continued


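Suppose that we approximate \(\dfrac{(0.998)^3}{1.003}\) as in Example 2.6.7 and we want a rigorous bound on the error in the approximation. We can get
such a rigorous bound by applying 2.6.13. Set
\[
f(x,y)=\frac{x^3}{y}
\]
and
\[
x_0=1\qquad\Delta x=-0.002\qquad y_0=1\qquad\Delta y=0.003
\]
Then the exact answer is \(f(x_0+\Delta x\,,\,y_0+\Delta y)\) and the approximate answer is
\(f(x_0,y_0)+\frac{\partial f}{\partial x}(x_0,y_0)\,\Delta x+\frac{\partial f}{\partial y}(x_0,y_0)\,\Delta y\), so that, by 2.6.13, the error in the approximation is exactly
\[
\frac{1}{2}\left|\frac{\partial^2 f}{\partial x^2}(\vec r(c))\,\Delta x^2+2\,\frac{\partial^2 f}{\partial x\,\partial y}(\vec r(c))\,\Delta x\,\Delta y+\frac{\partial^2 f}{\partial y^2}(\vec r(c))\,\Delta y^2\right|
\]
with \(\vec r(c)=(1-0.002c\,,\,1+0.003c)\) for some, unknown, \(0\le c\le 1\). For our function \(f\)
\[
\begin{aligned}
f(x,y)&=\frac{x^3}{y} & \frac{\partial f}{\partial x}(x,y)&=\frac{3x^2}{y} & \frac{\partial f}{\partial y}(x,y)&=-\frac{x^3}{y^2}\\
\frac{\partial^2 f}{\partial x^2}(x,y)&=\frac{6x}{y} & \frac{\partial^2 f}{\partial x\,\partial y}(x,y)&=-\frac{3x^2}{y^2} & \frac{\partial^2 f}{\partial y^2}(x,y)&=\frac{2x^3}{y^3}
\end{aligned}
\]
We don't know what \(\vec r(c)=(1-0.002c\,,\,1+0.003c)\) is. But we know that \(0\le c\le 1\), so we definitely know that the \(x\)
component of \(\vec r(c)\) is smaller than 1 and the \(y\) component of \(\vec r(c)\) is bigger than 1. So
\[
\left|\frac{\partial^2 f}{\partial x^2}(\vec r(c))\right|\le 6\qquad\left|\frac{\partial^2 f}{\partial x\,\partial y}(\vec r(c))\right|\le 3\qquad\left|\frac{\partial^2 f}{\partial y^2}(\vec r(c))\right|\le 2
\]
and
\[
\begin{aligned}
\text{error}&\le\frac{1}{2}\big[6\,\Delta x^2+2\times 3\,|\Delta x\,\Delta y|+2\,\Delta y^2\big]\\
&\le 3(0.002)^2+3(0.002)(0.003)+(0.003)^2\\
&=0.000039
\end{aligned}
\]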

By way of comparison, the exact error is 0.0000389, to seven decimal places.
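
The comparison between the rigorous bound and the actual error in Example 2.6.15 can be reproduced as follows. This is an illustrative Python sketch; the variable names are our own.

```python
def f(x, y):
    return x**3 / y

dx, dy = -0.002, 0.003

# Linear approximation about (1, 1), using f = 1, f_x = 3, f_y = -1 there
linear = 1 + 3 * dx - 1 * dy

# Actual error of the linear approximation
actual_error = abs(f(1 + dx, 1 + dy) - linear)

# Bound from Example 2.6.15, using |f_xx| <= 6, |f_xy| <= 3, |f_yy| <= 2
bound = 0.5 * (6 * dx**2 + 2 * 3 * abs(dx * dy) + 2 * dy**2)

print(actual_error, bound)
```

The actual error sits just below the bound, so in this example the bound is not only rigorous but also quite sharp.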

 Example 2.6.16
In this example, we find the quadratic approximation of \(f(x,y)=\sqrt{1+4x^2+y^2}\) at \((x_0,y_0)=(1,2)\) and use it to compute
approximately \(f(1.1\,,\,2.05)\). We know that we will need all partial derivatives up to order 2, so we first compute them and
evaluate them at \((x_0,y_0)=(1,2)\).
\[
\begin{aligned}
f(x,y)&=\sqrt{1+4x^2+y^2} & f(x_0,y_0)&=3\\
f_x(x,y)&=\frac{4x}{\sqrt{1+4x^2+y^2}} & f_x(x_0,y_0)&=\frac{4}{3}\\
f_y(x,y)&=\frac{y}{\sqrt{1+4x^2+y^2}} & f_y(x_0,y_0)&=\frac{2}{3}\\
f_{xx}(x,y)&=\frac{4}{\sqrt{1+4x^2+y^2}}-\frac{16x^2}{[1+4x^2+y^2]^{3/2}} & f_{xx}(x_0,y_0)&=\frac{4}{3}-\frac{16}{27}=\frac{20}{27}\\
f_{xy}(x,y)&=-\frac{4xy}{[1+4x^2+y^2]^{3/2}} & f_{xy}(x_0,y_0)&=-\frac{8}{27}\\
f_{yy}(x,y)&=\frac{1}{\sqrt{1+4x^2+y^2}}-\frac{y^2}{[1+4x^2+y^2]^{3/2}} & f_{yy}(x_0,y_0)&=\frac{1}{3}-\frac{4}{27}=\frac{5}{27}
\end{aligned}
\]
We now just substitute them into 2.6.12 to get that the quadratic approximation to \(f\) about \((x_0,y_0)\) is
\[
\begin{aligned}
f(x_0+\Delta x\,,\,y_0+\Delta y)
&\approx f(x_0,y_0)+f_x(x_0,y_0)\,\Delta x+f_y(x_0,y_0)\,\Delta y\\
&\qquad+\frac{1}{2}\big[f_{xx}(x_0,y_0)\,\Delta x^2+2f_{xy}(x_0,y_0)\,\Delta x\,\Delta y+f_{yy}(x_0,y_0)\,\Delta y^2\big]\\
&=3+\frac{4}{3}\Delta x+\frac{2}{3}\Delta y+\frac{10}{27}\Delta x^2-\frac{8}{27}\Delta x\,\Delta y+\frac{5}{54}\Delta y^2
\end{aligned}
\]
In particular, with \(\Delta x=0.1\) and \(\Delta y=0.05\),
\[
\begin{aligned}
f(1.1\,,\,2.05)&\approx 3+\frac{4}{3}(0.1)+\frac{2}{3}(0.05)+\frac{10}{27}(0.01)-\frac{8}{27}(0.005)+\frac{5}{54}(0.0025)\\
&=3.1691
\end{aligned}
\]
The actual value, to four decimal places, is 3.1690. The percentage error is about 0.004%.
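
As a check on Example 2.6.16, the following Python sketch (with names of our own choosing) evaluates the linear and quadratic approximations at \((1.1, 2.05)\) and compares both with the directly computed value.

```python
import math

def f(x, y):
    return math.sqrt(1 + 4 * x**2 + y**2)

def linear(dx, dy):
    # Linear approximation about (1, 2): f = 3, f_x = 4/3, f_y = 2/3
    return 3 + (4 / 3) * dx + (2 / 3) * dy

def quadratic(dx, dy):
    # Adds the second order terms with f_xx = 20/27, f_xy = -8/27, f_yy = 5/27
    return (linear(dx, dy)
            + (10 / 27) * dx**2 - (8 / 27) * dx * dy + (5 / 54) * dy**2)

dx, dy = 0.1, 0.05
exact = f(1 + dx, 2 + dy)
lin = linear(dx, dy)
quad = quadratic(dx, dy)

print(round(lin, 4), round(quad, 4), round(exact, 4))
print(100 * abs(quad - exact) / exact, "percent")
```

The quadratic approximation is noticeably closer to the true value than the linear one, at the cost of computing three second order partial derivatives.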

 Example 2.6.17

In this example, we find the quadratic approximation of \(f(x,y)=e^{2x}\sin(3y)\) about \((x_0,y_0)=(0,0)\) in two different ways.
The first way uses the canned formula 2.6.12. We compute all partial derivatives up to order 2 at \((x_0,y_0)\).
\[
\begin{aligned}
f(x,y)&=e^{2x}\sin(3y) & f(x_0,y_0)&=0\\
f_x(x,y)&=2e^{2x}\sin(3y) & f_x(x_0,y_0)&=0\\
f_y(x,y)&=3e^{2x}\cos(3y) & f_y(x_0,y_0)&=3\\
f_{xx}(x,y)&=4e^{2x}\sin(3y) & f_{xx}(x_0,y_0)&=0\\
f_{xy}(x,y)&=6e^{2x}\cos(3y) & f_{xy}(x_0,y_0)&=6\\
f_{yy}(x,y)&=-9e^{2x}\sin(3y) & f_{yy}(x_0,y_0)&=0
\end{aligned}
\]
So the quadratic approximation to \(f\) about \((0,0)\) is
\[
\begin{aligned}
f(x,y)&\approx f(0,0)+f_x(0,0)\,x+f_y(0,0)\,y
+\frac{1}{2}\big[f_{xx}(0,0)\,x^2+2f_{xy}(0,0)\,xy+f_{yy}(0,0)\,y^2\big]\\
&=3y+6xy
\end{aligned}
\]

That's pretty simple — just compute a bunch of partial derivatives and substitute into the formula 2.6.12.
But there is also a sneakier, and often computationally more efficient, method to get the same result. It exploits the single
variable Taylor expansions

\[
\begin{aligned}
e^x&=1+x+\frac{1}{2!}x^2+\cdots\\
\sin y&=y-\frac{1}{3!}y^3+\cdots
\end{aligned}
\]
Replacing \(x\) by \(2x\) in the first and \(y\) by \(3y\) in the second and multiplying the two together, keeping track only of terms of
degree at most two, gives
\[
\begin{aligned}
f(x,y)&=e^{2x}\sin(3y)\\
&=\Big[1+(2x)+\frac{1}{2!}(2x)^2+\cdots\Big]\Big[(3y)-\frac{1}{3!}(3y)^3+\cdots\Big]\\
&=\Big[1+2x+2x^2+\cdots\Big]\Big[3y-\frac{9}{2}y^3+\cdots\Big]\\
&=3y+6xy+6x^2y+\cdots-\frac{9}{2}y^3-9xy^3-9x^2y^3+\cdots\\
&=3y+6xy+\cdots
\end{aligned}
\]

just as in the first computation.

Optional — Taylor Polynomials


We have just found linear and quadratic approximations to the function \(f(x,y)\), for \((x,y)\) near the point \((x_0,y_0)\). In CLP-1, we
found not only linear and quadratic approximations, but in fact a whole hierarchy of approximations. For each integer \(n\ge 0\), the
\(n^{\rm th}\) degree Taylor polynomial for \(f(x)\) about \(x=a\) was defined, in Definition 3.4.11 of the CLP-1 text, to be
\[
\sum_{k=0}^{n}\frac{1}{k!}f^{(k)}(a)\cdot(x-a)^k
\]

We'll now define, and find, the Taylor polynomial of degree \(n\) for the function \(f(x,y)\) about \((x,y)=(x_0,y_0)\). It is going to be a
polynomial of degree \(n\) in \(\Delta x\) and \(\Delta y\). The most general such polynomial is
\[
T_n(\Delta x,\Delta y)=\sum_{\substack{\ell,m\ge 0\\ \ell+m\le n}}a_{\ell,m}\,(\Delta x)^\ell(\Delta y)^m
\]
with all of the coefficients \(a_{\ell,m}\) being constants. The specific coefficients for the Taylor polynomial are determined by the
requirement that all partial derivatives of \(T_n(\Delta x,\Delta y)\) at \(\Delta x=\Delta y=0\) are the same as the corresponding partial derivatives of
\(f(x_0+\Delta x\,,\,y_0+\Delta y)\) at \(\Delta x=\Delta y=0\).
0 0

By way of preparation for our computation of the derivatives of \(T_n(\Delta x,\Delta y)\), consider
\[
\begin{aligned}
\frac{d}{dt}t^4&=4t^3 & \frac{d^2}{dt^2}t^4&=(4)(3)t^2 & \frac{d^3}{dt^3}t^4&=(4)(3)(2)t\\
\frac{d^4}{dt^4}t^4&=(4)(3)(2)(1)=4! & \frac{d^5}{dt^5}t^4&=0 & \frac{d^6}{dt^6}t^4&=0
\end{aligned}
\]
and
\[
\begin{aligned}
\frac{d}{dt}t^4\Big|_{t=0}&=0 & \frac{d^2}{dt^2}t^4\Big|_{t=0}&=0 & \frac{d^3}{dt^3}t^4\Big|_{t=0}&=0\\
\frac{d^4}{dt^4}t^4\Big|_{t=0}&=4! & \frac{d^5}{dt^5}t^4\Big|_{t=0}&=0 & \frac{d^6}{dt^6}t^4\Big|_{t=0}&=0
\end{aligned}
\]
More generally, for any natural numbers \(p\), \(m\),
\[
\frac{d^p}{dt^p}t^m=\begin{cases}m(m-1)\cdots(m-p+1)\,t^{m-p} & \text{if }p\le m\\ 0 & \text{if }p>m\end{cases}
\]
so that
\[
\frac{d^p}{dt^p}t^m\Big|_{t=0}=\begin{cases}m! & \text{if }p=m\\ 0 & \text{if }p\ne m\end{cases}
\]

Consequently
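\[
\frac{\partial^p}{\partial(\Delta x)^p}\frac{\partial^q}{\partial(\Delta y)^q}(\Delta x)^\ell(\Delta y)^m\Big|_{\Delta x=\Delta y=0}=\begin{cases}\ell!\,m! & \text{if }p=\ell\text{ and }q=m\\ 0 & \text{if }p\ne\ell\text{ or }q\ne m\end{cases}
\]
and
\[
\begin{aligned}
\frac{\partial^{p+q}\,T_n}{\partial(\Delta x)^p\,\partial(\Delta y)^q}(0,0)
&=\sum_{\substack{\ell,m\ge 0\\ \ell+m\le n}}a_{\ell,m}\,\frac{\partial^p}{\partial(\Delta x)^p}\frac{\partial^q}{\partial(\Delta y)^q}(\Delta x)^\ell(\Delta y)^m\Big|_{\Delta x=\Delta y=0}\\
&=\begin{cases}p!\,q!\,a_{p,q} & \text{if }p+q\le n\\ 0 & \text{if }p+q>n\end{cases}
\end{aligned}
\]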

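Our requirement that the derivatives of \(f\) and \(T_n\) match is the requirement that, for all \(p+q\le n\),
\[
\begin{aligned}
\frac{\partial^{p+q}\,T_n}{\partial(\Delta x)^p\,\partial(\Delta y)^q}(0,0)
&=\frac{\partial^{p+q}}{\partial(\Delta x)^p\,\partial(\Delta y)^q}f(x_0+\Delta x\,,\,y_0+\Delta y)\Big|_{\Delta x=\Delta y=0}\\
&=\frac{\partial^{p+q}f}{\partial x^p\,\partial y^q}(x_0,y_0)
\end{aligned}
\]
This requirement gives
\[
p!\,q!\,a_{p,q}=\frac{\partial^{p+q}f}{\partial x^p\,\partial y^q}(x_0,y_0)
\]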

So the Taylor polynomial of degree \(n\) for the function \(f(x,y)\) about \((x,y)=(x_0,y_0)\) is the right hand side of

 Equation 2.6.18
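\[
f(x_0+\Delta x\,,\,y_0+\Delta y)\approx\sum_{\substack{\ell,m\ge 0\\ \ell+m\le n}}\frac{1}{\ell!\,m!}\,\frac{\partial^{\ell+m}f}{\partial x^\ell\,\partial y^m}(x_0,y_0)\,(\Delta x)^\ell(\Delta y)^m
\]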
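
Equation 2.6.18 translates directly into a double loop. The Python sketch below is one possible way to evaluate it, under the assumption that the mixed partial derivatives at \((x_0,y_0)\) are supplied by a user-written function; the names `taylor_poly` and `deriv_example` are hypothetical. It is tried out on \(f(x,y)=e^{2x}\sin(3y)\) about \((0,0)\), the function of Example 2.6.17, whose partial derivatives at the origin have a simple closed form.

```python
import math

def taylor_poly(deriv, n, dx, dy):
    """Evaluate the right-hand side of Equation 2.6.18.

    deriv(l, m) must return the partial derivative of f of order l in x
    and order m in y, evaluated at the expansion point (x0, y0).
    """
    total = 0.0
    for l in range(n + 1):
        for m in range(n + 1 - l):          # enforces l + m <= n
            coeff = deriv(l, m) / (math.factorial(l) * math.factorial(m))
            total += coeff * dx**l * dy**m
    return total

def deriv_example(l, m):
    # Partials of exp(2x) sin(3y) at (0, 0): each x-derivative gives a factor 2,
    # each y-derivative a factor 3, and the m-th derivative of sin at 0
    # cycles through 0, 1, 0, -1.
    return 2**l * 3**m * (0, 1, 0, -1)[m % 4]

dx, dy = 0.1, 0.2
t2 = taylor_poly(deriv_example, 2, dx, dy)   # 3y + 6xy
t3 = taylor_poly(deriv_example, 3, dx, dy)   # adds 6x^2 y - (9/2) y^3
exact = math.exp(2 * dx) * math.sin(3 * dy)

print(t2, t3, exact)
```

The degree two value reproduces \(3y+6xy\) from Example 2.6.17, and the degree three value lies closer to the exact function value, as one would hope.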

This is for functions, \(f(x,y)\), of two variables. There are natural extensions of this for functions of any (finite) number of
variables. For example, the Taylor polynomial of degree \(n\) for a function, \(f(x,y,z)\), of three variables is the right hand side of
\[
f(x_0+\Delta x\,,\,y_0+\Delta y\,,\,z_0+\Delta z)
\approx\sum_{\substack{k,\ell,m\ge 0\\ k+\ell+m\le n}}\frac{1}{k!\,\ell!\,m!}\,\frac{\partial^{k+\ell+m}f}{\partial x^k\,\partial y^\ell\,\partial z^m}(x_0,y_0,z_0)\,(\Delta x)^k(\Delta y)^\ell(\Delta z)^m
\]

Exercises
Stage 1

 1

Let \(x_0\) and \(y_0\) be constants and let \(m\) and \(n\) be integers. If \(m<0\) assume that \(x_0\ne 0\), and if \(n<0\) assume that \(y_0\ne 0\).
Define \(P(x,y)=x^my^n\).
1. Find the linear approximation to \(P(x_0+\Delta x,y_0+\Delta y)\).
2. Denote by
\[
P_\%=100\left|\frac{P(x_0+\Delta x,y_0+\Delta y)-P(x_0,y_0)}{P(x_0,y_0)}\right|\qquad
x_\%=100\left|\frac{\Delta x}{x_0}\right|\qquad
y_\%=100\left|\frac{\Delta y}{y_0}\right|
\]
the percentage errors in \(P\), \(x\) and \(y\) respectively. Use the linear approximation to find an (approximate) upper bound on \(P_\%\)
in terms of \(m\), \(n\), \(x_\%\) and \(y_\%\).

 2

Consider the following work.


We compute, approximately, the \(y\)-coordinate of the point whose polar coordinates are \(r=0.9\) and \(\theta=2^\circ\). In general, the \(y\)-
coordinate of the point whose polar coordinates are \(r\) and \(\theta\) is \(Y(r,\theta)=r\sin\theta\). The partial derivatives are
\[
Y_r(r,\theta)=\sin\theta\qquad Y_\theta(r,\theta)=r\cos\theta
\]
So the linear approximation to \(Y(r_0+\Delta r,\theta_0+\Delta\theta)\) with \(r_0=1\) and \(\theta_0=0\) is
\[
\begin{aligned}
Y(1+\Delta r,0+\Delta\theta)&\approx Y(1,0)+Y_r(1,0)\,\Delta r+Y_\theta(1,0)\,\Delta\theta\\
&=0+(0)\,\Delta r+(1)\,\Delta\theta
\end{aligned}
\]
Applying this with \(\Delta r=-0.1\) and \(\Delta\theta=2\) gives the (approximate) \(y\)-coordinate
\[
Y(0.9,2)=Y(1-0.1\,,\,0+2)\approx 0+(0)(-0.1)+(1)(2)=2
\]

This conclusion is ridiculous. We're saying that the y -coordinate is more than twice the distance from the point to the origin.
What was the mistake?

Stage 2

 3

Find an approximate value for \(f(x,y)=\sin(\pi xy+\ln y)\) at \((0.01, 1.05)\) without using a calculator or computer.

 4✳
Let \(f(x,y)=\dfrac{x^2y}{x^4+2y^2}\). Find an approximate value for \(f(-0.9\,,\,1.1)\) without using a calculator or computer.

 5

Four numbers, each at least zero and each at most 50, are rounded to the first decimal place and then multiplied together.
Estimate the maximum possible error in the computed product.

 6✳
One side of a right triangle is measured to be 3 with a maximum possible error of ±0.1, and the other side is measured to be 4
with a maximum possible error of ±0.2. Use the linear approximation to estimate the maximum possible error in calculating
the length of the hypotenuse of the right triangle.

 7✳
If two resistors of resistance \(R_1\) and \(R_2\) are wired in parallel, then the resulting resistance \(R\) satisfies the equation
\(\frac{1}{R}=\frac{1}{R_1}+\frac{1}{R_2}\). Use the linear approximation to estimate the change in \(R\) if \(R_1\) decreases from 2 to 1.9 ohms and \(R_2\)
increases from 8 to 8.1 ohms.

 8

The total resistance \(R\) of three resistors, \(R_1\), \(R_2\), \(R_3\), connected in parallel is determined by
\[
\frac{1}{R}=\frac{1}{R_1}+\frac{1}{R_2}+\frac{1}{R_3}
\]
If the resistances, measured in Ohms, are \(R_1=25\,\Omega\), \(R_2=40\,\Omega\) and \(R_3=50\,\Omega\), with a possible error of 0.5% in each case,
estimate the maximum error in the calculated value of \(R\).

 9

The specific gravity \(S\) of an object is given by \(S=\dfrac{A}{A-W}\) where \(A\) is the weight of the object in air and \(W\) is the weight of
the object in water. If \(A=20\pm 0.01\) and \(W=12\pm 0.02\) find the approximate percentage error in calculating \(S\) from the
given measurements.

 10 ✳

The pressure in a solid is given by
\[
P(s,r)=sr(4s^2-r^2-2)
\]

where s is the specific heat and r is the density. We expect to measure (s, r) to be approximately (2, 2) and would like to have
the most accurate value for P . There are two different ways to measure s and r. Method 1 has an error in s of ±0.01 and an
error in r of ±0.1, while method 2 has an error of ±0.02 for both s and r.
Should we use method 1 or method 2? Explain your reasoning carefully.

 11

A rectangular beam that is supported at its two ends and is subjected to a uniform load sags by an amount
\[
S=C\,\frac{p\,\ell^4}{w\,h^3}
\]
where \(p\) = load, \(\ell\) = length, \(h\) = height, \(w\) = width and \(C\) is a constant. Suppose \(p\approx 100\), \(\ell\approx 4\), \(w\approx 0.1\) and \(h\approx 0.2\). Will
the sag of the beam be more sensitive to changes in the height of the beam or to changes in the width of the beam?

 12 ✳
Let \(z=f(x,y)=\dfrac{2y}{x^2+y^2}\). Find an approximate value for \(f(-0.8, 2.1)\).

 13 ✳

Suppose that a function \(z=f(x,y)\) is implicitly defined by an equation:
\[
xyz+x+y^2+z^3=0
\]
1. Find \(\frac{\partial z}{\partial x}\).
2. If \(f(-1,1)<0\), find the linear approximation of the function \(z=f(x,y)\) at \((-1,1)\).
3. If \(f(-1,1)<0\), use the linear approximation in part 2 to approximate \(f(-1.02, 0.97)\).

 14 ✳

Let $z = f(x, y)$ be given implicitly by

\[ e^z + yz = x + y. \]

1. Find the differential dz.


2. Use linear approximation at the point (1, 0) to approximate f (0.99, 0.01).

 15 ✳

Two sides and the enclosed angle of a triangle are measured to be $3 \pm .1$ m, $4 \pm .1$ m and $90 \pm 1^\circ$ respectively. The length of
the third side is then computed using the cosine law $C^2 = A^2 + B^2 - 2AB\cos\theta$. What is the approximate maximum error in
the computed value of $C$?

 16 ✳
Use differentials to find a reasonable approximation to the value of $f(x, y) = xy\sqrt{x^2 + y^2}$ at $x = 3.02$, $y = 3.96$. Note that
$3.02 \approx 3$ and $3.96 \approx 4$.

 17 ✳

Use differentials to estimate the volume of metal in a closed metal can with diameter 8cm and height 12cm if the metal is
0.04cm thick.

 18 ✳

Let $z$ be a function of $x, y$ such that

\[ z^3 - z + 2xy - y^2 = 0, \qquad z(2, 4) = 1. \]

1. Find the linear approximation to z at the point (2, 4).


2. Use your answer in (a) to estimate the value of z at (2.02, 3.96).

Stage 3

 19 ✳

Consider the surface given by:

\[ z^3 - xyz^2 - 4x = 0. \]

1. Find expressions for $\frac{\partial z}{\partial x}$, $\frac{\partial z}{\partial y}$ as functions of $x, y, z$.
2. Evaluate $\frac{\partial z}{\partial x}$, $\frac{\partial z}{\partial y}$ at $(1, 1, 2)$.
3. Measurements are made with errors, so that $x = 1 \pm 0.03$ and $y = 1 \pm 0.02$. Find the corresponding maximum error in
measuring $z$.
4. A particle moves over the surface along the path whose projection in the $xy$-plane is given in terms of the angle $\theta$ as

\[ x(\theta) = 1 + \cos\theta, \qquad y(\theta) = \sin\theta \]

from the point $A: x = 2,\ y = 0$ to the point $B: x = 1,\ y = 1$. Find $\frac{dz}{d\theta}$ at points $A$ and $B$.

 20 ✳

Consider the function $f$ that maps each point $(x, y)$ in $\mathbb{R}^2$ to $y e^{-x}$.

1. Suppose that $x = 1$ and $y = e$, but errors of size $0.1$ are made in measuring each of $x$ and $y$. Estimate the maximum error
that this could cause in $f(x, y)$.
2. The graph of the function $f$ sits in $\mathbb{R}^3$, and the point $(1, e, 1)$ lies on that graph. Find a nonzero vector that is perpendicular
to that graph at that point.

 21 ✳

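A surface is defined implicitly by $z^4 - xy^2z^2 + y = 0$.

1. Compute $\frac{\partial z}{\partial x}$, $\frac{\partial z}{\partial y}$ in terms of $x, y, z$.
2. Evaluate $\frac{\partial z}{\partial x}$ and $\frac{\partial z}{\partial y}$ at $(x, y, z) = (2, -1/2, 1)$.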
3. If x decreases from 2 to 1.94, and y increases from −0.5 to −0.4, find the approximate change in z from 1.
4. Find the equation of the tangent plane to the surface at the point (2, −1/2, 1).

 22 ✳
A surface $z = f(x, y)$ has derivatives $\frac{\partial f}{\partial x} = 3$ and $\frac{\partial f}{\partial y} = -2$ at $(x, y, z) = (1, 3, 1)$.

1. If x increases from 1 to 1.2, and y decreases from 3 to 2.6, find the change in z using a linear approximation.
2. Find the equation of the tangent plane to the surface at the point (1, 3, 1).

 23 ✳
According to van der Waals' equation, a gas satisfies the equation

\[ (pV^2 + 16)(V - 1) = TV^2, \]

where p, V and T denote pressure, volume and temperature respectively. Suppose the gas is now at pressure 1, volume 2 and
temperature 5. Find the approximate change in its volume if p is increased by 0.2 and T is increased by 0.3.

 24 ✳

Consider the function $f(x, y) = e^{-x^2 + 4y^2}$.

1. Find the equation of the tangent plane to the graph $z = f(x, y)$ at the point where $(x, y) = (2, 1)$.
2. Find the tangent plane approximation to the value of $f(1.99, 1.01)$ using the tangent plane from part (a).

 25 ✳

Let $z = f(x, y) = \ln(4x^2 + y^2)$.

1. Use a linear approximation of the function z = f (x, y) at (0, 1) to estimate f (0.1, 1.2).
2. Find a point P (a, b, c) on the graph of z = f (x, y) such that the tangent plane to the graph of z = f (x, y) at the point P is
parallel to the plane 2x + 2y − z = 3.

 26 ✳
1. Find the equation of the tangent plane to the surface $x^2z^3 + y\sin(\pi x) = -y^2$ at the point $P = (1, 1, -1)$.
2. Let $z$ be defined implicitly by $x^2z^3 + y\sin(\pi x) = -y^2$. Find $\frac{\partial z}{\partial x}$ at the point $P = (1, 1, -1)$.
3. Let $z$ be the same implicit function as in part (ii), defined by the equation $x^2z^3 + y\sin(\pi x) = -y^2$. Let $x = 0.97$, and
$y = 1$. Find the approximate value of $z$.

 27 ✳

The surface $x^4 + y^4 + z^4 + xyz = 17$ passes through $(0, 1, 2)$, and near this point the surface determines $x$ as a function,
$x = F(y, z)$, of $y$ and $z$.
1. Find $F_y$ and $F_z$ at $(x, y, z) = (0, 1, 2)$.
2. Use the tangent plane approximation (also known as linear, first order or differential approximation) to find the
approximate value of $x$ (near $0$) such that $(x, 1.01, 1.98)$ lies on the surface.

1. You may have seen it written as $E_n(x) = \frac{1}{(n+1)!}\, g^{(n+1)}(c)\,(x - a)^{n+1}$.
2. Don't take the notation $dx$ or the terminology “infinitesimal” too seriously. It is just intended to signal “very small”.
3. There are other choices possible. For example, we could write $35^\circ = 45^\circ - 10^\circ$. To get a good approximation we try to make
$\Delta\theta$ as small as possible, while keeping the arithmetic reasonably simple.

4. Of course in our “real world” everyone uses calculus.

This page titled 2.6: Linear Approximations and Error is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by
Joel Feldman, Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a
detailed edit history is available upon request.

2.7: Directional Derivatives and the Gradient
The principal interpretation of $\frac{df}{dx}(a)$ is the rate of change of $f(x)$, per unit change of $x$, at $x = a$. The natural analog of this
interpretation for multivariable functions is the directional derivative, which we now introduce through a question.

A Question
Suppose that you are standing at $(a, b)$ near a campfire. The temperature you feel at $(x, y)$ is $f(x, y)$. You start to move with
velocity $\vec v = \langle v_1, v_2 \rangle$. What rate of change of temperature do you feel?

The Answer
Let's set the beginning of time, $t = 0$, to the time at which you leave $(a, b)$. Then
at time $0$ you are at $(a, b)$ and feel the temperature $f(a, b)$ and
at time $t$ you are at $(a + v_1 t,\ b + v_2 t)$ and feel the temperature $f(a + v_1 t,\ b + v_2 t)$. So
the change in temperature between time $0$ and time $t$ is $f(a + v_1 t,\ b + v_2 t) - f(a, b)$,
the average rate of change of temperature, per unit time, between time $0$ and time $t$ is $\frac{f(a + v_1 t,\ b + v_2 t) - f(a, b)}{t}$, and the
instantaneous rate of change of temperature per unit time as you leave $(a, b)$ is $\lim\limits_{t \to 0} \frac{f(a + v_1 t,\ b + v_2 t) - f(a, b)}{t}$.

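Concentrate on the $t$ dependence in this limit by writing $f(a + v_1 t,\ b + v_2 t) = g(t)$. Then

\[
\begin{aligned}
\lim_{t \to 0} \frac{f(a + v_1 t,\ b + v_2 t) - f(a, b)}{t}
  &= \lim_{t \to 0} \frac{g(t) - g(0)}{t} \\
  &= \frac{dg}{dt}(0) \\
  &= \frac{d}{dt}\Big[ f(a + v_1 t,\ b + v_2 t) \Big]_{t=0}
\end{aligned}
\]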

By the chain rule, we can write the right hand side in terms of partial derivatives of $f$.

\[ \frac{d}{dt}\Big[ f(a + v_1 t,\ b + v_2 t) \Big] = f_x(a + v_1 t,\ b + v_2 t)\, v_1 + f_y(a + v_1 t,\ b + v_2 t)\, v_2 \]

So, the instantaneous rate of change per unit time as you leave $(a, b)$ is

\[
\begin{aligned}
\lim_{t \to 0} \frac{f(a + v_1 t,\ b + v_2 t) - f(a, b)}{t}
  &= \Big[ f_x(a + v_1 t,\ b + v_2 t)\, v_1 + f_y(a + v_1 t,\ b + v_2 t)\, v_2 \Big]_{t=0} \\
  &= f_x(a, b)\, v_1 + f_y(a, b)\, v_2 \\
  &= \langle f_x(a, b),\ f_y(a, b) \rangle \cdot \langle v_1, v_2 \rangle
\end{aligned}
\]

Notice that we have expressed the rate of change as the dot product of the velocity vector with a vector of partial derivatives of $f$.
We have seen such a vector of partial derivatives of $f$ before; in Definition 2.5.4, we defined the gradient of the three variable
function $G(x, y, z)$ at the point $(x_0, y_0, z_0)$ to be $\langle G_x(x_0, y_0, z_0),\ G_y(x_0, y_0, z_0),\ G_z(x_0, y_0, z_0) \rangle$. Here we see the
natural two dimensional analog.

 Definition 2.7.1
The vector $\langle f_x(a, b),\ f_y(a, b) \rangle$ is denoted $\nabla f(a, b)$ and is called “the gradient of the function $f$ at the point $(a, b)$”.

In general, the gradient of $f$ is a vector with one component for each variable of $f$. The $j^{\rm th}$ component is the partial derivative of $f$
with respect to the $j^{\rm th}$ variable.

Now because the dot product $\nabla f(a, b) \cdot \vec v$ appears frequently, we introduce some handy notation.

2.7.1 https://math.libretexts.org/@go/page/92240
 Definition 2.7.2

Given any vector $\vec v = \langle v_1, v_2 \rangle$, the expression

\[ \langle f_x(a, b),\ f_y(a, b) \rangle \cdot \langle v_1, v_2 \rangle = \nabla f(a, b) \cdot \vec v \]

is denoted $D_{\vec v} f(a, b)$.

Armed with this useful notation we can answer our question very succinctly.

 Equation 2.7.3

The rate of change of $f$ per unit time as you leave $(a, b)$ moving with velocity $\vec v$ is

\[ D_{\vec v} f(a, b) = \nabla f(a, b) \cdot \vec v \]
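
As a rough numerical illustration of this formula, the sketch below compares the average rate of change $\frac{f(a + v_1 t,\ b + v_2 t) - f(a, b)}{t}$ with $\nabla f(a, b) \cdot \vec v$ for shrinking $t$. The temperature field `f`, the point and the velocity are all made up for the illustration; they do not come from the text.

```python
import math

def f(x, y):
    # Hypothetical temperature field, chosen only for illustration.
    return math.exp(-(x**2 + y**2))

def grad_f(x, y):
    # Exact partial derivatives (f_x, f_y) of the sample f above.
    common = math.exp(-(x**2 + y**2))
    return (-2 * x * common, -2 * y * common)

def rate_per_unit_time(a, b, v):
    # D_v f(a, b) = grad f(a, b) . v
    fx, fy = grad_f(a, b)
    return fx * v[0] + fy * v[1]

def difference_quotient(a, b, v, t):
    # Average rate of change of f between time 0 and time t.
    return (f(a + v[0] * t, b + v[1] * t) - f(a, b)) / t

a, b = 1.0, 0.5          # assumed starting point
v = (2.0, -1.0)          # assumed velocity
exact = rate_per_unit_time(a, b, v)
for t in (1e-1, 1e-3, 1e-5):
    # The middle column should approach the last column as t shrinks.
    print(t, difference_quotient(a, b, v, t), exact)
```

The difference quotients settle towards the dot product as $t \to 0$, which is what the limit computation above predicts.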

We can compute the rate of change of temperature per unit distance (as opposed to per unit time) in a similar way. The change in
temperature between time $0$ and time $t$ is $f(a + v_1 t,\ b + v_2 t) - f(a, b)$. Between time $0$ and time $t$, you have travelled a distance
$|\vec v|\,t$. So the instantaneous rate of change of temperature per unit distance as you leave $(a, b)$ is

\[ \lim_{t \to 0} \frac{f(a + v_1 t,\ b + v_2 t) - f(a, b)}{t\,|\vec v|} \]

This is exactly $\frac{1}{|\vec v|}$ times $\lim\limits_{t \to 0} \frac{f(a + v_1 t,\ b + v_2 t) - f(a, b)}{t}$, which we computed above to be $D_{\vec v} f(a, b)$. So

 Equation 2.7.4

Given any nonzero vector $\vec v$, the rate of change of $f$ per unit distance as you leave $(a, b)$ moving in direction $\vec v$ is

\[ \nabla f(a, b) \cdot \frac{\vec v}{|\vec v|} = D_{\frac{\vec v}{|\vec v|}} f(a, b) \]

 Definition 2.7.5

$D_{\frac{\vec v}{|\vec v|}} f(a, b)$ is called the directional derivative of the function $f(x, y)$ at the point $(a, b)$ in the direction¹ $\vec v$.
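
The distinction between the rate per unit time, $D_{\vec v} f$, and the directional derivative, $D_{\vec v/|\vec v|} f$, can be sketched in a few lines. The gradient value and the two velocities below are hypothetical; the helper names are ours, not the text's.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(v):
    return math.sqrt(dot(v, v))

def rate_per_unit_time(grad, v):
    # D_v f = grad f . v  (depends on the speed |v|)
    return dot(grad, v)

def directional_derivative(grad, v):
    # D_{v/|v|} f = grad f . v / |v|  (depends only on the direction of v)
    length = norm(v)
    if length == 0:
        raise ValueError("the direction vector must be nonzero")
    return dot(grad, v) / length

grad = (3.0, -4.0)   # assumed value of grad f(a, b)
v = (1.0, 1.0)       # a velocity
w = (2.0, 2.0)       # same direction, twice the speed

print(rate_per_unit_time(grad, v), rate_per_unit_time(grad, w))
print(directional_derivative(grad, v), directional_derivative(grad, w))
```

Doubling the speed doubles the rate per unit time but leaves the rate per unit distance unchanged, which is why the directional derivative is defined using the unit vector $\vec v/|\vec v|$.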

The Implications
We have just seen that the instantaneous rate of change of $f$ per unit distance as we leave $(a, b)$ moving in direction $\vec v$ is a dot
product, which we can write as

\[ \nabla f(a, b) \cdot \frac{\vec v}{|\vec v|} = |\nabla f(a, b)| \cos\theta \]

where $\theta$ is the angle between the gradient vector $\nabla f(a, b)$ and the direction vector $\vec v$. Writing it in this way allows us to make some
useful observations. Since $\cos\theta$ is always between $-1$ and $+1$
the direction of maximum rate of increase is that having $\theta = 0$. So to get maximum rate of increase per unit distance, as you
leave $(a, b)$, you should move in the same direction as the gradient $\nabla f(a, b)$. Then the rate of increase per unit distance is
$|\nabla f(a, b)|$.

The direction of minimum (i.e. most negative) rate of increase is that having $\theta = 180^\circ$. To get minimum rate of increase per
unit distance you should move in the direction opposite $\nabla f(a, b)$. Then the rate of increase per unit distance is $-|\nabla f(a, b)|$.
The directions giving zero rate of increase are those perpendicular to ∇f (a, b). If you move in a direction perpendicular to
∇f (a, b), then f (x, y) remains constant as you leave (a, b). At that instant, you are moving so that f (x, y) remains constant

and consequently you are moving along the level curve f (x, y) = f (a, b). So ∇f (a, b) is perpendicular to the level curve
f (x, y) = f (a, b) at (a, b). The corresponding statement in three dimensions is that ∇F (a, b, c) is perpendicular to the level

surface F (x, y, z) = F (a, b, c) at (a, b, c). Hence a good way to find a vector normal to the surface F (x, y, z) = F (a, b, c) at
the point (a, b, c) is to compute the gradient ∇F (a, b, c). This is precisely what we saw back in Theorem 2.5.5.
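
These three observations can be checked by brute force. The sketch below assumes a made-up gradient value, scans unit directions $\langle \cos\varphi, \sin\varphi \rangle$ on a fine grid of angles, and compares the best and worst directions with the gradient direction.

```python
import math

grad = (1.0, 3.0)                 # hypothetical value of grad f(a, b)
magnitude = math.hypot(*grad)     # |grad f(a, b)|

def rate(phi):
    # Rate of change per unit distance in the unit direction (cos phi, sin phi).
    return grad[0] * math.cos(phi) + grad[1] * math.sin(phi)

n = 3600                                          # grid of 0.1 degree steps
angles = [2 * math.pi * k / n for k in range(n)]
best = max(angles, key=rate)                      # direction of fastest increase
worst = min(angles, key=rate)                     # direction of fastest decrease

grad_angle = math.atan2(grad[1], grad[0])         # angle of the gradient itself
perp = grad_angle + math.pi / 2                   # a perpendicular direction

print(best, grad_angle, rate(best), magnitude)
print(worst, rate(worst))
print(rate(perp))
```

Up to the grid spacing, the best angle agrees with the angle of the gradient, the extreme rates are $\pm|\nabla f(a, b)|$, and the rate in the perpendicular direction is zero up to rounding.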
Now that we have defined the directional derivative, here are some examples.

 Example 2.7.6

Find the directional derivative of the function $f(x, y) = e^{x + y^2}$ at the point $(0, 1)$ in the direction $-\hat\imath + \hat\jmath$.

Solution
To compute the directional derivative, we need the gradient. To compute the gradient, we need some partial derivatives. So we
start with the partial derivatives of $f$ at $(0, 1)$:

\[
\begin{aligned}
f_x(0, 1) &= e^{x + y^2}\Big|_{x=0,\ y=1} = e \\
f_y(0, 1) &= 2y\,e^{x + y^2}\Big|_{x=0,\ y=1} = 2e
\end{aligned}
\]

So the gradient of $f$ at $(0, 1)$ is

\[ \nabla f(0, 1) = f_x(0, 1)\,\hat\imath + f_y(0, 1)\,\hat\jmath = e\,\hat\imath + 2e\,\hat\jmath \]

and the directional derivative in the direction $-\hat\imath + \hat\jmath$ is

\[
D_{\frac{-\hat\imath + \hat\jmath}{|-\hat\imath + \hat\jmath|}} f(0, 1)
  = \nabla f(0, 1) \cdot \frac{-\hat\imath + \hat\jmath}{|-\hat\imath + \hat\jmath|}
  = (e\,\hat\imath + 2e\,\hat\jmath) \cdot \frac{-\hat\imath + \hat\jmath}{\sqrt2}
  = \frac{e}{\sqrt2}
\]
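
The value $\frac{e}{\sqrt2}$ can be sanity-checked numerically. The sketch below evaluates the gradient formula from the solution and compares it with a central difference of $f$ along the unit direction; the step size `h` is an arbitrary choice of ours.

```python
import math

def f(x, y):
    return math.exp(x + y**2)

def grad_f(x, y):
    # (f_x, f_y) = (e^{x+y^2}, 2y e^{x+y^2})
    value = math.exp(x + y**2)
    return (value, 2 * y * value)

direction = (-1.0, 1.0)                       # the direction -i + j
length = math.hypot(*direction)
unit = (direction[0] / length, direction[1] / length)

gx, gy = grad_f(0.0, 1.0)
exact = gx * unit[0] + gy * unit[1]           # grad f(0,1) . unit

h = 1e-6                                      # assumed finite-difference step
numeric = (f(h * unit[0], 1 + h * unit[1])
           - f(-h * unit[0], 1 - h * unit[1])) / (2 * h)

print(exact, math.e / math.sqrt(2), numeric)
```

All three printed numbers should agree to several decimal places.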

 Example 2.7.7

Find the directional derivative of the function w(x, y, z) = xyz + ln(xz) at the point (1, 3, 1) in the direction ⟨1 , 0 , 1⟩ . In
what directions is the directional derivative zero?
Solution
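First, the partial derivatives of $w$ at $(1, 3, 1)$ are

\[
\begin{aligned}
w_x(1, 3, 1) &= \Big[ yz + \frac{1}{x} \Big]_{(1,3,1)} = 3 \times 1 + \frac{1}{1} = 4 \\
w_y(1, 3, 1) &= xz\,\Big|_{(1,3,1)} = 1 \times 1 = 1 \\
w_z(1, 3, 1) &= \Big[ xy + \frac{1}{z} \Big]_{(1,3,1)} = 1 \times 3 + \frac{1}{1} = 4
\end{aligned}
\]

so the gradient of $w$ at $(1, 3, 1)$ is

\[ \nabla w(1, 3, 1) = \langle w_x(1, 3, 1),\ w_y(1, 3, 1),\ w_z(1, 3, 1) \rangle = \langle 4, 1, 4 \rangle \]

and the directional derivative in the direction $\langle 1, 0, 1 \rangle$ is

\[
D_{\frac{\langle 1,0,1 \rangle}{|\langle 1,0,1 \rangle|}} w(1, 3, 1)
  = \nabla w(1, 3, 1) \cdot \frac{\langle 1, 0, 1 \rangle}{|\langle 1, 0, 1 \rangle|}
  = \langle 4, 1, 4 \rangle \cdot \frac{\langle 1, 0, 1 \rangle}{\sqrt2}
  = \frac{8}{\sqrt2} = 4\sqrt2
\]

The directional derivative of $w$ at $(1, 3, 1)$ in the direction $\vec t \ne \vec 0$ is zero if and only if

\[
0 = D_{\frac{\vec t}{|\vec t|}} w(1, 3, 1) = \nabla w(1, 3, 1) \cdot \frac{\vec t}{|\vec t|} = \langle 4, 1, 4 \rangle \cdot \frac{\vec t}{|\vec t|}
\]

which is the case if and only if $\vec t$ is perpendicular to $\langle 4, 1, 4 \rangle$. So if we walk in the direction of any vector in the plane
$4x + y + 4z = 0$ (which has normal vector $\langle 4, 1, 4 \rangle$) then the directional derivative is zero.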
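
A short sketch of the same computation follows. It evaluates the gradient of $w$ from the partial derivatives above, then tries the direction $\langle 1, 0, 1 \rangle$ and a few vectors that we picked by hand to lie in the plane $4x + y + 4z = 0$.

```python
import math

def grad_w(x, y, z):
    # Partial derivatives of w = xyz + ln(xz): (yz + 1/x, xz, xy + 1/z)
    return (y * z + 1 / x, x * z, x * y + 1 / z)

def directional_derivative(grad, d):
    # grad . d / |d| for a nonzero direction vector d
    length = math.sqrt(sum(c * c for c in d))
    return sum(g * c for g, c in zip(grad, d)) / length

g = grad_w(1.0, 3.0, 1.0)
print(g)
print(directional_derivative(g, (1, 0, 1)), 4 * math.sqrt(2))

# Hand-picked vectors satisfying 4x + y + 4z = 0.
in_plane = [(1, -4, 0), (0, 4, -1), (1, 0, -1)]
for t in in_plane:
    print(t, directional_derivative(g, t))
```

The first directional derivative matches $4\sqrt2$, and each in-plane direction gives zero.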

 Example 2.7.8
Let

\[ f(x, y) = 5 - x^2 - 2y^2 \qquad (a, b) = (-1, -1) \]

In this example, we'll explore the behaviour of the function f (x, y) near the point (a, b).
Note that for any fixed $f_0 < 5$, $f(x, y) = f_0$ is the ellipse $x^2 + 2y^2 = 5 - f_0$. So the graph $z = f(x, y)$ consists of a bunch of
horizontal ellipses stacked one on top of each other.
Since the ellipse $x^2 + 2y^2 = 5 - f_0$ has $x$-semi-axis $\sqrt{5 - f_0}$ and $y$-semi-axis $\sqrt{\frac{5 - f_0}{2}}$,
the ellipses start with a point on the $z$ axis when $f_0 = 5$ and
increase in size as $f_0$ decreases.
The part of the graph $z = f(x, y)$ in the first octant is sketched in the left hand figure below.
Several level curves, $f(x, y) = f_0$, are sketched in the right hand figure below.

The gradient vector

\[ \nabla f(a, b) = \langle -2x, -4y \rangle \Big|_{(-1,-1)} = \langle 2, 4 \rangle = 2\,\langle 1, 2 \rangle \]

at $(-1, -1)$ is also illustrated in the right hand sketch.


We have that, at $(a, b) = (-1, -1)$,
the unit vector giving the direction of maximum rate of increase is the unit vector in the direction of the gradient vector
$2\,\langle 1, 2 \rangle$, which is $\frac{1}{\sqrt5} \langle 1, 2 \rangle$. The maximum rate of increase is $|\langle 2, 4 \rangle| = 2\sqrt5$.
The unit vector giving the direction of minimum rate of increase is $-\frac{1}{\sqrt5} \langle 1, 2 \rangle$ and that minimum rate is
$-|\langle 2, 4 \rangle| = -2\sqrt5$.
The directions giving zero rate of increase are perpendicular to $\nabla f(a, b)$. One vector perpendicular² to $\langle 1, 2 \rangle$ is $\langle 2, -1 \rangle$.
So the unit vectors giving the directions of zero rate of increase are $\pm\frac{1}{\sqrt5} \langle 2, -1 \rangle$. These are the directions of the tangent
vector at $(a, b)$ to the level curve of $f$ through $(a, b)$, which is the curve $f(x, y) = f(a, b)$.
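
To see these rates emerge from the function itself, the sketch below steps a small distance $t$ from $(-1, -1)$ in the gradient direction and in the tangent direction and prints the change in $f$ divided by $t$. The step sizes are arbitrary choices of ours.

```python
import math

def f(x, y):
    return 5 - x**2 - 2 * y**2

a, b = -1.0, -1.0
grad = (-2 * a, -4 * b)                      # <2, 4>
size = math.hypot(*grad)                     # 2 sqrt(5)
up = (grad[0] / size, grad[1] / size)        # unit vector of steepest increase
along = (2 / math.sqrt(5), -1 / math.sqrt(5))  # unit tangent to the level curve

def change(u, t):
    # Change in f after moving a distance t from (a, b) in the unit direction u.
    return f(a + t * u[0], b + t * u[1]) - f(a, b)

for t in (0.1, 0.01, 0.001):
    # First ratio tends to 2 sqrt(5); second ratio tends to 0.
    print(t, change(up, t) / t, change(along, t) / t)
```

Along the tangent direction the change in $f$ is proportional to $t^2$, so the ratio shrinks with $t$, while in the gradient direction the ratio approaches $2\sqrt5 \approx 4.47$.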

 Example 2.7.9

What is the rate of change of $f(x, y, z) = x^2 + y^2 + z^2$ at $(3, 5, 4)$ moving in the positive $x$-direction along the curve of
intersection of the surfaces $G(x, y, z) = 25$ and $H(x, y, z) = 0$ where

\[ G(x, y, z) = 2x^2 - y^2 + 2z^2 \quad\text{and}\quad H(x, y, z) = x^2 - y^2 + z^2 \]

Solution
As a first check note that $(3, 5, 4)$ really does lie on both surfaces because

\[
\begin{aligned}
G(3, 5, 4) &= 2(3^2) - 5^2 + 2(4^2) = 18 - 25 + 32 = 25 \\
H(3, 5, 4) &= 3^2 - 5^2 + 4^2 = 9 - 25 + 16 = 0
\end{aligned}
\]

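We compute gradients to get the normal vectors to the surfaces $G(x, y, z) = 25$ and $H(x, y, z) = 0$ at $(3, 5, 4)$.

\[
\begin{aligned}
\nabla G(3, 5, 4) &= \big[ 4x\,\hat\imath - 2y\,\hat\jmath + 4z\,\hat k \big]_{(3,5,4)}
  = 12\,\hat\imath - 10\,\hat\jmath + 16\,\hat k = 2\big( 6\,\hat\imath - 5\,\hat\jmath + 8\,\hat k \big) \\
\nabla H(3, 5, 4) &= \big[ 2x\,\hat\imath - 2y\,\hat\jmath + 2z\,\hat k \big]_{(3,5,4)}
  = 6\,\hat\imath - 10\,\hat\jmath + 8\,\hat k = 2\big( 3\,\hat\imath - 5\,\hat\jmath + 4\,\hat k \big)
\end{aligned}
\]

The direction of interest is tangent to the curve of intersection. So the direction of interest is tangent to both surfaces and hence
is perpendicular to both gradients. Consequently one tangent vector to the curve at $(3, 5, 4)$ is

\[
\begin{aligned}
\nabla G(3, 5, 4) \times \nabla H(3, 5, 4)
  &= 4\,\big( 6\,\hat\imath - 5\,\hat\jmath + 8\,\hat k \big) \times \big( 3\,\hat\imath - 5\,\hat\jmath + 4\,\hat k \big) \\
  &= 4 \det \begin{bmatrix} \hat\imath & \hat\jmath & \hat k \\ 6 & -5 & 8 \\ 3 & -5 & 4 \end{bmatrix} \\
  &= 4\,\big( 20\,\hat\imath - 15\,\hat k \big) = 20\,\big( 4\,\hat\imath - 3\,\hat k \big)
\end{aligned}
\]

and the unit tangent vector to the curve at $(3, 5, 4)$ that has positive $x$ component is

\[ \frac{4\,\hat\imath - 3\,\hat k}{|4\,\hat\imath - 3\,\hat k|} = \frac{4}{5}\,\hat\imath - \frac{3}{5}\,\hat k \]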

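The desired rate of change is

\[
\begin{aligned}
D_{\frac45 \hat\imath - \frac35 \hat k} f(3, 5, 4)
  &= \nabla f(3, 5, 4) \cdot \Big( \frac{4}{5}\,\hat\imath - \frac{3}{5}\,\hat k \Big) \\
  &= \big[ 2x\,\hat\imath + 2y\,\hat\jmath + 2z\,\hat k \big]_{(x,y,z)=(3,5,4)} \cdot \Big( \frac{4}{5}\,\hat\imath - \frac{3}{5}\,\hat k \Big) \\
  &= \big( 6\,\hat\imath + 10\,\hat\jmath + 8\,\hat k \big) \cdot \Big( \frac{4}{5}\,\hat\imath - \frac{3}{5}\,\hat k \Big) \\
  &= 0
\end{aligned}
\]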

Actually, we could have known that the rate of change would be zero.
Any point $(x, y, z)$ on the curve obeys both $y^2 = x^2 + z^2$ and $2x^2 - y^2 + 2z^2 = 25$.
Substituting $y^2 = x^2 + z^2$ into $2x^2 - y^2 + 2z^2 = 25$ gives $x^2 + z^2 = 25$.
So, at any point on the curve, $x^2 + z^2 = 25$ and $y^2 = x^2 + z^2 = 25$ so that $x^2 + y^2 + z^2 = 50$.
That is, $f(x, y, z) = x^2 + y^2 + z^2$ takes the value $50$ at every point of the curve.
So of course the rate of change of $f$ along the curve is $0$.
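
The cross product step is easy to get wrong by hand, so here is a minimal sketch that repeats the computation with integer arithmetic. The helper functions `cross` and `dot` are ours.

```python
import math

def cross(u, v):
    # Cross product of two 3-vectors.
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

x, y, z = 3, 5, 4
grad_G = (4 * x, -2 * y, 4 * z)    # gradient of 2x^2 - y^2 + 2z^2
grad_H = (2 * x, -2 * y, 2 * z)    # gradient of x^2 - y^2 + z^2
grad_f = (2 * x, 2 * y, 2 * z)     # gradient of x^2 + y^2 + z^2

tangent = cross(grad_G, grad_H)    # tangent to the curve of intersection
if tangent[0] < 0:                 # pick the orientation with positive x component
    tangent = tuple(-c for c in tangent)

rate = dot(grad_f, tangent) / math.sqrt(dot(tangent, tangent))
print(tangent, rate)
```

The tangent vector comes out as a positive multiple of $4\,\hat\imath - 3\,\hat k$ and the rate of change of $f$ along it is zero, in agreement with both arguments above.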

Let's change things up a little. In the next example, we are told the rates of change in two different directions. From this we are to
determine the rate of change in a third direction.

 Example 2.7.10

The rate of change of a given function $f(x, y)$ at the point $P_0 = (1, 2)$ in the direction towards $P_1 = (2, 3)$ is $2\sqrt2$ and in the
direction towards $P_2 = (1, 0)$ is $-3$. What is the rate of change of $f$ at $P_0$ towards the origin $P_3 = (0, 0)$?

Solution
We can easily determine the rate of change of $f$ at the point $P_0$ in any direction once we know the gradient
$\nabla f(1, 2) = a\,\hat\imath + b\,\hat\jmath$. So we will first use the two given rates of change to determine $a$ and $b$, and then we determine the
rate of change towards $(0, 0)$.

The two rates of change that we are given are those in the directions of the vectors

\[ \overrightarrow{P_0 P_1} = \langle 1, 1 \rangle \qquad \overrightarrow{P_0 P_2} = \langle 0, -2 \rangle \]

As you might guess, the notation $\overrightarrow{PQ}$ means the vector whose tail is at $P$ and whose head is at $Q$. So the given rates of change
tell us that

\[
\begin{aligned}
2\sqrt2 &= D_{\frac{\langle 1,1 \rangle}{|\langle 1,1 \rangle|}} f(1, 2)
  = \nabla f(1, 2) \cdot \frac{\langle 1, 1 \rangle}{|\langle 1, 1 \rangle|}
  = \langle a, b \rangle \cdot \frac{\langle 1, 1 \rangle}{\sqrt2}
  = \frac{a}{\sqrt2} + \frac{b}{\sqrt2} \\
-3 &= D_{\frac{\langle 0,-2 \rangle}{|\langle 0,-2 \rangle|}} f(1, 2)
  = \nabla f(1, 2) \cdot \frac{\langle 0, -2 \rangle}{|\langle 0, -2 \rangle|}
  = \langle a, b \rangle \cdot \frac{\langle 0, -2 \rangle}{2}
  = -b
\end{aligned}
\]

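These two lines give us two linear equations in the two unknowns $a$ and $b$. The second equation directly gives us $b = 3$.
Substituting $b = 3$ into the first equation gives

\[ \frac{a}{\sqrt2} + \frac{3}{\sqrt2} = 2\sqrt2 \implies a + 3 = 4 \implies a = 1 \]

A direction vector from $P_0 = (1, 2)$ towards $P_3 = (0, 0)$ is

\[ \overrightarrow{P_0 P_3} = \langle -1, -2 \rangle \]

and the rate of change (per unit distance) in that direction is

\[
\begin{aligned}
D_{\frac{\langle -1,-2 \rangle}{|\langle -1,-2 \rangle|}} f(1, 2)
  &= \nabla f(1, 2) \cdot \frac{\langle -1, -2 \rangle}{|\langle -1, -2 \rangle|}
  = \langle a, b \rangle \cdot \frac{\langle -1, -2 \rangle}{\sqrt5} \\
  &= \langle 1, 3 \rangle \cdot \frac{\langle -1, -2 \rangle}{\sqrt5} = -\frac{7}{\sqrt5}
\end{aligned}
\]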
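
The same recipe works for any two known rates in non-parallel directions: each one gives a linear equation for the gradient components. The sketch below solves the resulting $2 \times 2$ system by Cramer's rule; the function names are ours, and Cramer's rule is just one convenient way to solve the system.

```python
import math

def unit(v):
    length = math.hypot(*v)
    return (v[0] / length, v[1] / length)

def solve_gradient(u1, r1, u2, r2):
    # Solve a*u1[0] + b*u1[1] = r1 and a*u2[0] + b*u2[1] = r2 by Cramer's rule.
    det = u1[0] * u2[1] - u1[1] * u2[0]
    if det == 0:
        raise ValueError("the two directions must not be parallel")
    a = (r1 * u2[1] - u1[1] * r2) / det
    b = (u1[0] * r2 - r1 * u2[0]) / det
    return a, b

P0 = (1, 2)
u1 = unit((2 - P0[0], 3 - P0[1]))   # towards P1 = (2, 3)
u2 = unit((1 - P0[0], 0 - P0[1]))   # towards P2 = (1, 0)
a, b = solve_gradient(u1, 2 * math.sqrt(2), u2, -3.0)

u3 = unit((0 - P0[0], 0 - P0[1]))   # towards the origin
rate = a * u3[0] + b * u3[1]
print(a, b, rate, -7 / math.sqrt(5))
```

Up to rounding, this reproduces $a = 1$, $b = 3$ and the rate $-\frac{7}{\sqrt5}$.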

 Example 2.7.11. Optional

Find all points $(a, b, c)$ for which the spheres $(x - a)^2 + (y - b)^2 + (z - c)^2 = 1$ and $x^2 + y^2 + z^2 = 1$ intersect
orthogonally. That is, the tangent planes to the two spheres are to be perpendicular at each point of intersection.
Solution
Let $(x_0, y_0, z_0)$ be a point of intersection. That is

\[
\begin{aligned}
(x_0 - a)^2 + (y_0 - b)^2 + (z_0 - c)^2 &= 1 \\
x_0^2 + y_0^2 + z_0^2 &= 1
\end{aligned}
\]

A normal vector to $G(x, y, z) = (x - a)^2 + (y - b)^2 + (z - c)^2 = 1$ at $(x_0, y_0, z_0)$ is

\[ \vec N = \nabla G(x_0, y_0, z_0) = \langle 2(x_0 - a),\ 2(y_0 - b),\ 2(z_0 - c) \rangle \]

A normal vector to $g(x, y, z) = x^2 + y^2 + z^2 = 1$ at $(x_0, y_0, z_0)$ is

\[ \vec n = \nabla g(x_0, y_0, z_0) = \langle 2x_0,\ 2y_0,\ 2z_0 \rangle \]

The two tangent planes are perpendicular if and only if $\vec N$ and $\vec n$ are perpendicular, which is the case if and only if

\[ 0 = \vec N \cdot \vec n = 4x_0(x_0 - a) + 4y_0(y_0 - b) + 4z_0(z_0 - c) \]

or, dividing the equation by $4$,

\[ x_0(x_0 - a) + y_0(y_0 - b) + z_0(z_0 - c) = 0 \]

Let's pause to take stock. We need to find all $(a, b, c)$'s such that the statement

$(x_0, y_0, z_0)$ is a point of intersection of the two spheres

implies the statement

the normal vectors $\vec N$ and $\vec n$ are perpendicular

In equations, we need to find all $(a, b, c)$'s such that the statement

\[ (x_0, y_0, z_0) \text{ obeys } x_0^2 + y_0^2 + z_0^2 = 1 \text{ and } (x_0 - a)^2 + (y_0 - b)^2 + (z_0 - c)^2 = 1 \tag{S1} \]

implies the statement

\[ (x_0, y_0, z_0) \text{ obeys } x_0(x_0 - a) + y_0(y_0 - b) + z_0(z_0 - c) = 0 \tag{S2} \]

Now if we expand (S2) then we can, with a little care, massage it into something that looks more like (S1).

\[
\begin{aligned}
x_0(x_0 - a) + y_0(y_0 - b) + z_0(z_0 - c)
  &= x_0^2 + y_0^2 + z_0^2 - a x_0 - b y_0 - c z_0 \\
  &= \frac{1}{2} \Big\{ \big[ x_0^2 + y_0^2 + z_0^2 \big] + \big[ (x_0 - a)^2 + (y_0 - b)^2 + (z_0 - c)^2 \big] - a^2 - b^2 - c^2 \Big\}
\end{aligned}
\]

If (S1) is true, then $\big[ x_0^2 + y_0^2 + z_0^2 \big] = 1$ and $\big[ (x_0 - a)^2 + (y_0 - b)^2 + (z_0 - c)^2 \big] = 1$ so that

\[ x_0(x_0 - a) + y_0(y_0 - b) + z_0(z_0 - c) = \frac{1}{2} \big\{ 1 + 1 - a^2 - b^2 - c^2 \big\} \]

and statement (S2) is true if and only if

\[ a^2 + b^2 + c^2 = 2 \]

Our conclusion is that the set of allowed points $(a, b, c)$ is the sphere of radius $\sqrt2$ centred on the origin.
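
As a spot check of this conclusion, the sketch below takes one centre with $a^2 + b^2 + c^2 = 2$ and one centre that violates the condition, and evaluates $\vec N \cdot \vec n$ at a few intersection points. The centres and intersection points were worked out by hand for this illustration and are not taken from the text.

```python
import math

def on_both_spheres(p, center, tol=1e-12):
    # True when p lies on the unit sphere and on the unit sphere about `center`.
    d_origin = sum(pi * pi for pi in p)
    d_center = sum((pi - ci) ** 2 for pi, ci in zip(p, center))
    return abs(d_origin - 1) < tol and abs(d_center - 1) < tol

def normals_dot(p, center):
    # N . n with N = grad G = 2(p - center) and n = grad g = 2p.
    n = [2 * pi for pi in p]
    N = [2 * (pi - ci) for pi, ci in zip(p, center)]
    return sum(Ni * ni for Ni, ni in zip(N, n))

good_center = (1.0, 1.0, 0.0)    # a^2 + b^2 + c^2 = 2
good_points = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
               (0.5, 0.5, 1 / math.sqrt(2)), (0.5, 0.5, -1 / math.sqrt(2))]

bad_center = (1.0, 0.0, 0.0)     # a^2 + b^2 + c^2 = 1
bad_point = (0.5, math.sqrt(3) / 2, 0.0)

for p in good_points:
    print(p, on_both_spheres(p, good_center), normals_dot(p, good_center))
print(bad_point, on_both_spheres(bad_point, bad_center), normals_dot(bad_point, bad_center))
```

For the first centre the normals are perpendicular at every sampled intersection point; for the second centre the dot product is nonzero, so those spheres do not meet orthogonally.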

 Example 2.7.12. Optional — The gradient in polar coordinates

What is the gradient of a function in polar coordinates?


Solution
As was the case in Examples 2.4.9 and 2.4.10, figuring out what the question is asking is half the battle. By Definition 2.5.4,
the gradient of a function $g(x, y)$ is the vector $\langle g_x(x, y),\ g_y(x, y) \rangle$. In this question we are told that we are given some
function $f(r, \theta)$ of the polar coordinates³ $r$ and $\theta$. We are supposed to convert this function to Cartesian coordinates.
This means that we are to consider the function

\[ g(x, y) = f\big( r(x, y),\ \theta(x, y) \big) \]

with

\[ r(x, y) = \sqrt{x^2 + y^2} \qquad \theta(x, y) = \arctan\frac{y}{x} \]
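Then we are to compute the gradient of $g(x, y)$ and express the answer in terms of $r$ and $\theta$. By the chain rule,

\[
\begin{aligned}
\frac{\partial g}{\partial x}
  &= \frac{\partial f}{\partial r} \frac{\partial r}{\partial x} + \frac{\partial f}{\partial \theta} \frac{\partial \theta}{\partial x} \\
  &= \frac{\partial f}{\partial r}\, \frac{1}{2} \frac{2x}{\sqrt{x^2 + y^2}} + \frac{\partial f}{\partial \theta}\, \frac{-y/x^2}{1 + (y/x)^2} \\
  &= \frac{\partial f}{\partial r}\, \frac{x}{\sqrt{x^2 + y^2}} - \frac{\partial f}{\partial \theta}\, \frac{y}{x^2 + y^2} \\
  &= \frac{\partial f}{\partial r}\, \frac{r\cos\theta}{r} - \frac{\partial f}{\partial \theta}\, \frac{r\sin\theta}{r^2}
     \qquad \text{since } x = r\cos\theta \text{ and } y = r\sin\theta \\
  &= \frac{\partial f}{\partial r} \cos\theta - \frac{\partial f}{\partial \theta}\, \frac{\sin\theta}{r}
\end{aligned}
\]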

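Similarly

\[
\begin{aligned}
\frac{\partial g}{\partial y}
  &= \frac{\partial f}{\partial r} \frac{\partial r}{\partial y} + \frac{\partial f}{\partial \theta} \frac{\partial \theta}{\partial y} \\
  &= \frac{\partial f}{\partial r}\, \frac{1}{2} \frac{2y}{\sqrt{x^2 + y^2}} + \frac{\partial f}{\partial \theta}\, \frac{1/x}{1 + (y/x)^2} \\
  &= \frac{\partial f}{\partial r}\, \frac{y}{\sqrt{x^2 + y^2}} + \frac{\partial f}{\partial \theta}\, \frac{x}{x^2 + y^2} \\
  &= \frac{\partial f}{\partial r} \sin\theta + \frac{\partial f}{\partial \theta}\, \frac{\cos\theta}{r}
\end{aligned}
\]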

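So

\[ \langle g_x, g_y \rangle = f_r\, \langle \cos\theta, \sin\theta \rangle + \frac{f_\theta}{r}\, \langle -\sin\theta, \cos\theta \rangle \]

or, with all the arguments written explicitly,

\[
\begin{aligned}
\langle g_x(x, y),\ g_y(x, y) \rangle
  &= f_r\big( r(x, y), \theta(x, y) \big)\, \langle \cos\theta(x, y),\ \sin\theta(x, y) \rangle \\
  &\quad + \frac{1}{r(x, y)}\, f_\theta\big( r(x, y), \theta(x, y) \big)\, \langle -\sin\theta(x, y),\ \cos\theta(x, y) \rangle
\end{aligned}
\]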
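
The polar formula can be tested against a direct finite-difference gradient of $g(x, y) = f(r(x, y), \theta(x, y))$. In the sketch below the sample function $f(r, \theta) = r^2 \sin\theta$, the test point and the step size are our own choices, and the point has $x > 0$ so that $\theta = \arctan\frac{y}{x}$ applies as written.

```python
import math

def f(r, theta):
    # Hypothetical function given in polar coordinates.
    return r**2 * math.sin(theta)

def f_r(r, theta):
    return 2 * r * math.sin(theta)

def f_theta(r, theta):
    return r**2 * math.cos(theta)

def polar(x, y):
    # r(x, y) and theta(x, y); arctan(y/x) is the right branch only for x > 0.
    return math.hypot(x, y), math.atan(y / x)

def g(x, y):
    return f(*polar(x, y))

def polar_gradient(x, y):
    # <g_x, g_y> = f_r <cos, sin> + (f_theta / r) <-sin, cos>
    r, theta = polar(x, y)
    c, s = math.cos(theta), math.sin(theta)
    gx = f_r(r, theta) * c - f_theta(r, theta) * s / r
    gy = f_r(r, theta) * s + f_theta(r, theta) * c / r
    return gx, gy

def numeric_gradient(x, y, h=1e-6):
    # Central differences applied directly to g.
    gx = (g(x + h, y) - g(x - h, y)) / (2 * h)
    gy = (g(x, y + h) - g(x, y - h)) / (2 * h)
    return gx, gy

point = (1.2, 0.7)
print(polar_gradient(*point))
print(numeric_gradient(*point))
```

The two printed gradients should agree to several decimal places.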

Exercises
Stage 1

 1✳

Find the directional derivative of $f(x, y, z) = e^{xyz}$ in the $\langle 0, 1, 1 \rangle$ direction at the point $(0, 1, 1)$.

 2✳

Find $\nabla\big( y^2 + \sin(xy) \big)$.

Stage 2

 3

Find the rate of change of the given function at the given point in the given direction.
1. $f(x, y) = 3x - 4y$ at the point $(0, 2)$ in the direction $-2\,\hat\imath$.
2. $f(x, y, z) = x^{-1} + y^{-1} + z^{-1}$ at $(2, -3, 4)$ in the direction $\hat\imath + \hat\jmath + \hat k$.

 4
In what directions at the point (2, 0) does the function f (x, y) = xy have the specified rates of change?
1. −1
2. −2
3. −3

 5

Find $\nabla f(a, b)$ given the directional derivatives

\[ D_{(\hat\imath + \hat\jmath)/\sqrt2}\, f(a, b) = 3\sqrt2 \qquad D_{(3\hat\imath - 4\hat\jmath)/5}\, f(a, b) = 5 \]

 6✳
You are standing at a location where the surface of the earth is smooth. The slope in the southern direction is 4 and the slope in

the south-eastern direction is √2. Find the slope in the eastern direction.

 7✳

Assume that the directional derivative of $w = f(x, y, z)$ at a point $P$ is a maximum in the direction of the vector $2\,\hat\imath - \hat\jmath + \hat k$,
and the value of the directional derivative in that direction is $3\sqrt6$.
1. Find the gradient vector of $w = f(x, y, z)$ at $P$.
2. Find the directional derivative of $w = f(x, y, z)$ at $P$ in the direction of the vector $\hat\imath + \hat\jmath$.

 8✳

A hiker is walking on a mountain with height above the $z = 0$ plane given by

\[ z = f(x, y) = 6 - xy^2 \]

The positive x-axis points east and the positive y -axis points north, and the hiker starts from the point P (2, 1, 4).
1. In what direction should the hiker proceed from P to ascend along the steepest path? What is the slope of the path?
2. Walking north from P , will the hiker start to ascend or descend? What is the slope?
3. In what direction should the hiker walk from P to remain at the same height?

 9

Two hikers are climbing a (small) mountain whose height is $z = 1000 - 2x^2 - 3y^2$. They start at $(1, 1, 995)$ and follow the
path of steepest ascent. Their $(x, y)$ coordinates obey $y = ax^b$ for some constants $a, b$. Determine $a$ and $b$.

 10 ✳

A mosquito is at the location $(3, 2, 1)$ in $\mathbb{R}^3$. She knows that the temperature $T$ near there is given by $T = 2x^2 + y^2 - z^2$.

1. She wishes to stay at the same temperature, but must fly in some initial direction. Find a direction in which the initial rate
of change of the temperature is 0.
2. If you and another student both get correct answers in part (a), must the directions you give be the same? Why or why not?
3. What initial direction or directions would suit the mosquito if she wanted to cool down as fast as possible?

 11 ✳

The air temperature $T(x, y, z)$ at a location $(x, y, z)$ is given by:

\[ T(x, y, z) = 1 + x^2 + yz. \]

1. A bird passes through (2, 1, 3) travelling towards (4, 3, 4) with speed 2. At what rate does the air temperature it
experiences change at this instant?
2. If instead the bird maintains constant altitude (z = 3 ) as it passes through (2, 1, 3) while also keeping at a fixed air
temperature, T = 8, what are its two possible directions of travel?

 12 ✳

Let $f(x, y) = 2x^2 + 3xy + y^2$ be a function of $x$ and $y$.
1. Find the maximum rate of change of $f(x, y)$ at the point $P\big(1, -\frac{4}{3}\big)$.
2. Find the directions in which the directional derivative of $f(x, y)$ at the point $P\big(1, -\frac{4}{3}\big)$ has the value $\frac{1}{5}$.

 13 ✳
The temperature $T(x, y)$ at a point of the $xy$-plane is given by

\[ T(x, y) = y e^{x^2} \]

A bug travels from left to right along the curve $y = x^2$ at a speed of $0.01$ m/sec. The bug monitors $T(x, y)$ continuously. What
is the rate of change of $T$ as the bug passes through the point $(1, 1)$?

 14 ✳

Suppose the function $T = F(x, y, z) = 3 + xy - y^2 + z^2 - x$ describes the temperature at a point $(x, y, z)$ in space, with
$F(3, 2, 1) = 3$.

1. Find the directional derivative of $T$ at $(3, 2, 1)$, in the direction of the point $(0, 1, 2)$.
2. At the point $(3, 2, 1)$, in what direction does the temperature decrease most rapidly?
3. Moving along the curve given by $x = 3e^t$, $y = 2\cos t$, $z = \sqrt{1 + t}$, find $\frac{dT}{dt}$, the rate of change of temperature with
respect to $t$, at $t = 0$.
4. Suppose $\hat\imath + 5\,\hat\jmath + a\,\hat k$ is a vector that is tangent to the temperature level surface $T(x, y, z) = 3$ at $(3, 2, 1)$. What is $a$?

 15 ✳

Let

\[
\begin{aligned}
f(x, y, z) &= (2x + y)\, e^{-(x^2 + y^2 + z^2)} \\
g(x, y, z) &= xz + y^2 + yz + z^2
\end{aligned}
\]

1. Find the gradients of f and g at (0, 1, −1).


2. A bird at (0, 1, −1) flies at speed 6 in the direction in which f (x, y, z) increases most rapidly. As it passes through
(0, 1, −1), how quickly does g(x, y, z) appear (to the bird) to be changing?

3. A bat at (0, 1, −1) flies in the direction in which f (x, y, z) and g(x, y, z) do not change, but z increases. Find a vector in
this direction.

 16 ✳

A bee is flying along the curve of intersection of the surfaces $3z + x^2 + y^2 = 2$ and $z = x^2 - y^2$ in the direction for which $z$
is increasing. At time $t = 2$, the bee passes through the point $(1, 1, 0)$ at speed $6$.
1. Find the velocity (vector) of the bee at time t = 2.
2. The temperature T at position (x, y, z) at time t is given by T = xy − 3x + 2yt + z. Find the rate of change of
temperature experienced by the bee at time t = 2.

 17 ✳
The temperature at a point $(x, y, z)$ is given by $T(x, y, z) = 5e^{-2x^2 - y^2 - 3z^2}$, where $T$ is measured in centigrade and $x, y, z$ in
meters.
1. Find the rate of change of temperature at the point P (1, 2, −1) in the direction toward the point (1, 1, 0).
2. In which direction does the temperature decrease most rapidly?
3. Find the maximum rate of decrease at P .

 18 ✳

The directional derivative of a function $w = f(x, y, z)$ at a point $P$ in the direction of the vector $\hat\imath$ is $2$, in the direction of the
vector $\hat\imath + \hat\jmath$ is $-\sqrt2$, and in the direction of the vector $\hat\imath + \hat\jmath + \hat k$ is $-\frac{5}{\sqrt3}$. Find the direction in which the function
$w = f(x, y, z)$ has the maximum rate of change at the point $P$. What is this maximum rate of change?

 19 ✳

Suppose it is known that the direction of the fastest increase of the function f (x, y) at the origin is given by the vector ⟨1, 2⟩ .
Find a unit vector u that is tangent to the level curve of f (x, y) that passes through the origin.

 20 ✳

The shape of a hill is given by $z = 1000 - 0.02x^2 - 0.01y^2$. Assume that the $x$-axis is pointing East, and the $y$-axis is
pointing North, and all distances are in metres.
1. What is the direction of the steepest ascent at the point $(0, 100, 900)$? (The answer should be in terms of directions of the
compass).
2. What is the slope of the hill at the point $(0, 100, 900)$ in the direction from (a)?
3. If you ride a bicycle on this hill in the direction of the steepest descent at 5 m/s, what is the rate of change of your altitude
(with respect to time) as you pass through the point $(0, 100, 900)$?

 21 ✳

Let the pressure $P$ and temperature $T$ at a point $(x, y, z)$ be

\[ P(x, y, z) = \frac{x^2 + 2y^2}{1 + z^2}, \qquad T(x, y, z) = 5 + xy - z^2 \]

1. If the position of an airplane at time $t$ is

\[ \big( x(t), y(t), z(t) \big) = (2t,\ t^2 - 1,\ \cos t) \]

find $\frac{d}{dt}(PT)^2$ at time $t = 0$ as observed from the airplane.
2. In which direction should a bird at the point $(0, -1, 1)$ fly if it wants to keep both $P$ and $T$ constant? (Give one possible
direction vector. It does not need to be a unit vector.)
3. An ant crawls on the surface $z^3 + zx + y^2 = 2$. When the ant is at the point $(0, -1, 1)$, in which direction should it go for
maximum increase of the temperature $T = 5 + xy - z^2$? Your answer should be a vector $\langle a, b, c \rangle$, not necessarily of unit
length. (Note that the ant cannot crawl in the direction of the gradient because that leads off the surface. The direction
vector $\langle a, b, c \rangle$ has to be on the tangent plane to the surface.)

 22 ✳

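Suppose that $f(x, y, z)$ is a function of three variables and let $\mathbf{u} = \frac{1}{\sqrt6} \langle 1, 1, 2 \rangle$ and $\mathbf{v} = \frac{1}{\sqrt3} \langle 1, -1, -1 \rangle$ and $\mathbf{w} = \frac{1}{\sqrt3} \langle 1, 1, 1 \rangle$.
Suppose that at a point $(a, b, c)$,

\[
\begin{aligned}
D_{\mathbf{u}} f &= 0 \\
D_{\mathbf{v}} f &= 0 \\
D_{\mathbf{w}} f &= 4
\end{aligned}
\]

Find $\nabla f$ at $(a, b, c)$.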

 23 ✳
The elevation of a hill is given by the equation $f(x, y) = x^2 y^2 e^{-x-y}$. An ant sits at the point $(1, 1, e^{-2})$.

1. Find the unit vector $\mathbf{u} = \langle u_1, u_2 \rangle$ that maximizes

\[ \lim_{t \to 0} \frac{f\big( (1, 1) + t\mathbf{u} \big) - f(1, 1)}{t} \]

2. Find a vector $\mathbf{v} = \langle v_1, v_2, v_3 \rangle$ pointing in the direction of the path that the ant could take in order to stay on the same
elevation level $e^{-2}$.
3. Find a vector $\mathbf{v} = \langle v_1, v_2, v_3 \rangle$ pointing in the direction of the path that the ant should take in order to maximize its
instantaneous rate of level increase.

 24 ✳

Let the temperature in a region of space be given by $T(x, y, z) = 3x^2 + \frac{1}{2}y^2 + 2z^2$ degrees.
1. A sparrow is flying along the curve $\vec r(s) = \big( \frac{1}{3}s^3,\ 2s,\ s^2 \big)$ at a constant speed of $3\ \mathrm{m\,s^{-1}}$. What is the velocity of the sparrow
when $s = 1$?
2. At what rate does the sparrow feel the temperature is changing at the point $A\big( \frac{1}{3}, 2, 1 \big)$ for which $s = 1$?
3. At the point $A\big( \frac{1}{3}, 2, 1 \big)$ in what direction will the temperature be decreasing at maximum rate?
4. An eagle crosses the path of the sparrow at $A\big( \frac{1}{3}, 2, 1 \big)$, is moving at right angles to the path of the sparrow, and is also
moving in a direction in which the temperature remains constant. In what directions could the eagle be flying as it passes
through the point $A$?

 25 ✳

Assume that the temperature $T$ at a point $(x, y, z)$ near a flame at the origin is given by

$$T(x, y, z) = \frac{200}{1 + x^2 + y^2 + z^2}$$

where the coordinates are given in meters and the temperature is in degrees Celsius. Suppose that at some moment in time, a moth is at the point (3, 4, 0) and is flying at a constant speed of 1 m/s in the direction of maximum increase of temperature.
1. Find the velocity vector $\vec{v}$ of the moth at this moment.
2. What rate of change of temperature does the moth feel at that moment?

 26 ✳

We say that u is inversely proportional to v if there is a constant k so that u = k/v. Suppose that the temperature T in a metal
ball is inversely proportional to the distance from the centre of the ball, which we take to be the origin. The temperature at the
point (1, 2, 2) is 120°.

1. Find the constant of proportionality.


2. Find the rate of change of T at (1, 2, 2) in the direction towards the point (2, 1, 3).
3. Show that at most points in the ball, the direction of greatest increase is towards the origin.

 27 ✳
The depth of a lake in the xy-plane is equal to $f(x, y) = 32 - x^2 - 4x - 4y^2$ meters.
1. Sketch the shoreline of the lake in the xy-plane.

Your calculus instructor is in the water at the point (−1, 1). Find a unit vector which indicates in which direction he should swim in order to:
2. stay at a constant depth?
3. increase his depth as rapidly as possible (i.e. be most likely to drown)?

Stage 3

 28

The temperature $T(x, y)$ at points of the xy-plane is given by $T(x, y) = x^2 - 2y^2$.

1. Draw a contour diagram for T showing some isotherms (curves of constant temperature).
2. In what direction should an ant at position (2, −1) move if it wishes to cool off as quickly as possible?

3. If the ant moves in that direction at speed v at what rate does its temperature decrease?
4. What would the rate of decrease of temperature of the ant be if it moved from (2, −1) at speed v in direction ⟨−1, −2⟩ ?
5. Along what curve through (2, −1) should the ant move to continue experiencing maximum rate of cooling?

 29 ✳

Consider the function $f(x, y, z) = x^2 + \cos(yz)$.

1. Give the direction in which f is increasing the fastest at the point (1, 0, π/2).
2. Give an equation for the plane T tangent to the surface

S = {(x, y, z)|f (x, y, z) = 1}

at the point (1, 0, π/2).


3. Find the distance between T and the point (0, 1, 0).
4. Find the angle between the plane T and the plane

P = {(x, y, z)|x + z = 0} .

 30 ✳

A function $T(x, y, z)$ at $P = (2, 1, 1)$ is known to have $T(P) = 5$, $T_x(P) = 1$, $T_y(P) = 2$, and $T_z(P) = 3$.

1. A bee starts flying at P and flies along the unit vector pointing towards the point Q = (3, 2, 2). What is the rate of change
of T (x, y, z) in this direction?
2. Use the linear approximation of T at the point P to approximate T (1.9, 1, 1.2).
3. Let S(x, y, z) = x + z. A bee starts flying at P ; along which unit vector direction should the bee fly so that the rate of
change of T (x, y, z) and of S(x, y, z) are both zero in this direction?

 31 ✳

Consider the functions $F(x, y, z) = z^3 + xy^2 + xz$ and $G(x, y, z) = 3x - y + 4z$. You are standing at the point $P(0, 1, 2)$.
1. You jump from $P$ to $Q(0.1, 0.9, 1.8)$. Use the linear approximation to determine approximately the amount by which $F$ changes.
2. You jump from P in the direction along which G increases most rapidly. Will F increase or decrease?
3. You jump from P in a direction ⟨a , b , c⟩ along which the rates of change of F and G are both zero. Give an example of
such a direction (need not be a unit vector).

 32 ✳

A meteor strikes the ground in the heartland of Canada. Using satellite photographs, a model

$$z = f(x, y) = -\frac{100}{x^2 + 2x + 4y^2 + 11}$$

of the resulting crater is made and a plan is drawn up to convert the site into a tourist attraction. A car park is to be built at
(4, 5) and a hiking trail is to be made. The trail is to start at the car park and take the steepest route to the bottom of the crater.

1. Sketch a map of the proposed site clearly marking the car park, a few level curves for the function f and the trail.
2. In which direction does the trail leave the car park?

 33 ✳

You are standing at a lone palm tree in the middle of the Exponential Desert. The height of the sand dunes around you is given in meters by

$$h(x, y) = 100e^{-(x^2 + 2y^2)}$$

where $x$ represents the number of meters east of the palm tree (west if $x$ is negative) and $y$ represents the number of meters north of the palm tree (south if $y$ is negative).
1. Suppose that you walk 3 meters east and 2 meters north. At your new location, (3, 2), in what direction is the sand dune
sloping most steeply downward?
2. If you walk north from the location described in part (a), what is the instantaneous rate of change of height of the sand
dune?
3. If you are standing at (3, 2) in what direction should you walk to ensure that you remain at the same height?
4. Find the equation of the curve through (3, 2) that you should move along in order that you are always pointing in a steepest
descent direction at each point of this curve.

 34 ✳
Let $f(x, y)$ be a differentiable function with $f(1, 2) = 7$. Let

$$\mathbf{u} = \frac{3}{5}\hat{\imath} + \frac{4}{5}\hat{\jmath}, \qquad \mathbf{v} = \frac{3}{5}\hat{\imath} - \frac{4}{5}\hat{\jmath}$$

be unit vectors. Suppose it is known that the directional derivatives $D_{\mathbf{u}} f(1, 2)$ and $D_{\mathbf{v}} f(1, 2)$ are equal to 10 and 2 respectively.
1. Show that the gradient vector $\nabla f$ at (1, 2) is $10\hat{\imath} + 5\hat{\jmath}$.
2. Determine the rate of change of $f$ at (1, 2) in the direction of the vector $\hat{\imath} + 2\hat{\jmath}$.

3. Using the tangent plane approximation, estimate the value of f (1.01, 2.05).

1. Some people require direction vectors to have unit length. We don't.


2. Check this by taking the dot product of ⟨1, 2⟩ and ⟨2, −1⟩ .
3. Polar coordinates were defined in Example 2.1.8.

This page titled 2.7: Directional Derivatives and the Gradient is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or
curated by Joel Feldman, Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts
platform; a detailed edit history is available upon request.

2.8: Optional — Solving the Wave Equation
Many phenomena are modelled by equations that relate the rates of change of various quantities. As rates of change are given by
derivatives the resulting equations contain derivatives and so are called differential equations. We saw a number of such differential
equations in §2.4 of the CLP-1 text.
In this section we consider

$$\frac{\partial^2 w}{\partial x^2}(x, t) - \frac{1}{c^2}\frac{\partial^2 w}{\partial t^2}(x, t) = 0$$

This is an extremely important 1 partial differential equation called the “wave equation” (in one spatial dimension) that is used in
modelling water waves, sound waves, seismic waves, light waves and so on. The reason that we are looking at it here is that we can
use what we have just learned to see that its solutions are waves travelling with speed c.
To start, we'll use gradients and the chain rule to find the solution of the slightly simpler equation

$$\frac{\partial w}{\partial x}(x, t) - \frac{1}{c}\frac{\partial w}{\partial t}(x, t) = 0$$

By way of motivation for what will follow, note that we can rewrite the above equation as

$$\left\langle 1, -\frac{1}{c} \right\rangle \cdot \nabla w(x, t) = 0$$

This equation tells us that the gradient of any solution $w(x, t)$ must always be perpendicular to the constant vector $\langle 1, -\frac{1}{c}\rangle$.

A vector $\langle a, b\rangle$ is perpendicular to $\langle 1, -\frac{1}{c}\rangle$ if and only if

$$\langle a, b\rangle \cdot \left\langle 1, -\frac{1}{c}\right\rangle = 0 \iff a - \frac{b}{c} = 0 \iff b = ac \iff \langle a, b\rangle = a\langle 1, c\rangle$$

That is, a vector is perpendicular to $\langle 1, -\frac{1}{c}\rangle$ if and only if it is parallel to $\langle 1, c\rangle$.

Thus the gradient of any solution $w(x, t)$ must always be parallel to the constant vector $\langle 1, c\rangle$.
Recall that one of our implications following Definition 2.7.5 is that the gradient of w(x, t) must always be perpendicular to the
level curves of w.
So the level curves of w(x, t) are always perpendicular to the constant vector ⟨1 , c⟩. They must be straight lines with
equations of the form

⟨1 , c⟩ ⋅ ⟨x − x0 , t − t0 ⟩ = 0 or x + ct = u with u a constant

That is, for each constant u, w(x, t) takes the same value at each point of the straight line x + ct = u. Call that value U (u). So
w(x, t) = U (u) = U (x + ct) for some function U .

This solution represents a wave packet moving to the left with speed $c$. You can see this by observing that all points $(x, t)$ in space-time for which $x + ct$ takes the same fixed value, say $z$, have the same value of $U(x + ct)$, namely $U(z)$. So if you move so that your position at time $t$ is $x = z - ct$ (i.e. move to the left with speed $c$) you always see the same value of $w$. The figure below illustrates this. It contains the graphs of $U(x)$, $U(x + c) = U(x + ct)\big|_{t=1}$ and $U(x + 2c) = U(x + ct)\big|_{t=2}$ for a bump shaped $U(x)$. In the figure the location of the tick $z$ on the x-axis was chosen so that $U(z) = \max_x U(x)$.

2.8.1 https://math.libretexts.org/@go/page/92241
The above argument that led to the solution $w(x, t) = U(x + ct)$ was somewhat handwavy. But we can easily turn it into a much tighter argument by simply changing variables from $(x, t)$ to $(u, v)$ with $u = x + ct$. It doesn't much matter what we choose (within reason) for the new variable $v$. Let's take $v = x - ct$. Then $x = \frac{u+v}{2}$ and $t = \frac{u-v}{2c}$ and it is easy to translate back and forth between $x, t$ and $u, v$.
Now define the function W (u, v) by

w(x, t) = W (x + ct , x − ct)

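By the chain rule

$$\begin{aligned}
\frac{\partial w}{\partial x}(x, t) &= \frac{\partial}{\partial x}\big[W(x + ct, x - ct)\big] \\
&= \frac{\partial W}{\partial u}(x + ct, x - ct)\,\frac{\partial}{\partial x}(x + ct) + \frac{\partial W}{\partial v}(x + ct, x - ct)\,\frac{\partial}{\partial x}(x - ct) \\
&= \frac{\partial W}{\partial u}(x + ct, x - ct) + \frac{\partial W}{\partial v}(x + ct, x - ct)
\end{aligned}$$

and

$$\begin{aligned}
\frac{\partial w}{\partial t}(x, t) &= \frac{\partial}{\partial t}\big[W(x + ct, x - ct)\big] \\
&= \frac{\partial W}{\partial u}(x + ct, x - ct)\,\frac{\partial}{\partial t}(x + ct) + \frac{\partial W}{\partial v}(x + ct, x - ct)\,\frac{\partial}{\partial t}(x - ct) \\
&= \frac{\partial W}{\partial u}(x + ct, x - ct) \times c + \frac{\partial W}{\partial v}(x + ct, x - ct) \times (-c)
\end{aligned}$$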

Subtracting $\frac{1}{c}$ times the second equation from the first equation gives

$$\frac{\partial w}{\partial x}(x, t) - \frac{1}{c}\frac{\partial w}{\partial t}(x, t) = 2\frac{\partial W}{\partial v}(x + ct, x - ct)$$

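So

$$w(x, t) \text{ obeys the equation } \frac{\partial w}{\partial x}(x, t) - \frac{1}{c}\frac{\partial w}{\partial t}(x, t) = 0 \text{ for all } x, t$$

if and only if

$$W(u, v) \text{ obeys the equation } \frac{\partial W}{\partial v}(x + ct, x - ct) = 0 \text{ for all } x, t,$$

which, substituting in $x = \frac{u+v}{2}$ and $t = \frac{u-v}{2c}$, is the case if and only if

$$W(u, v) \text{ obeys the equation } \frac{\partial W}{\partial v}(u, v) = 0 \text{ for all } u, v$$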

The equation $\frac{\partial W}{\partial v}(u, v) = 0$ means that $W(u, v)$ is independent of $v$, so that $W(u, v)$ is of the form $W(u, v) = U(u)$, for some function $U$, and, so finally,

$$w(x, t) = W(x + ct, x - ct) = U(x + ct)$$
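As a quick sanity check on this conclusion, here is a minimal numerical sketch. It assumes a particular bump profile $U(u) = e^{-u^2}$ and a wave speed $c = 2$ (both are illustrative choices, not taken from the text), and uses central differences to estimate $\frac{\partial w}{\partial x} - \frac{1}{c}\frac{\partial w}{\partial t}$ for $w(x, t) = U(x + ct)$. The function names are hypothetical.

```python
import math

def bump(u):
    """A hypothetical smooth bump profile U(u); any differentiable U would do."""
    return math.exp(-u * u)

def w(x, t, c=2.0):
    """Candidate solution w(x, t) = U(x + ct)."""
    return bump(x + c * t)

def residual(x, t, c=2.0, h=1e-5):
    """Central-difference estimate of w_x - (1/c) w_t at the point (x, t)."""
    w_x = (w(x + h, t, c) - w(x - h, t, c)) / (2 * h)
    w_t = (w(x, t + h, c) - w(x, t - h, c)) / (2 * h)
    return w_x - w_t / c

# The residual should be zero up to finite-difference and rounding error.
for point in [(0.3, 0.0), (-1.0, 0.5), (2.0, -0.7)]:
    print(point, residual(*point))
```

The same functions also illustrate the "moving observer" picture: evaluating `w(z - c*t, t)` for a fixed `z` and several times `t` gives the same value `bump(z)` each time, up to rounding.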

Now that we have solved our toy equation, let's move on to the 1d wave equation.

 Example 2.8.1. Wave Equation

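We'll now expand the above argument to find the general solution to

$$\frac{\partial^2 w}{\partial x^2}(x, t) - \frac{1}{c^2}\frac{\partial^2 w}{\partial t^2}(x, t) = 0$$

We'll again make the change of variables from $(x, t)$ to $(u, v)$ with $u = x + ct$ and $v = x - ct$ and again define the function $W(u, v)$ by

$$w(x, t) = W(x + ct, x - ct)$$

By the chain rule, we still have

$$\begin{aligned}
\frac{\partial w}{\partial x}(x, t) &= \frac{\partial}{\partial x}\big[W(x + ct, x - ct)\big] \\
&= \frac{\partial W}{\partial u}(x + ct, x - ct) + \frac{\partial W}{\partial v}(x + ct, x - ct) \\
\frac{\partial w}{\partial t}(x, t) &= \frac{\partial}{\partial t}\big[W(x + ct, x - ct)\big] \\
&= \frac{\partial W}{\partial u}(x + ct, x - ct) \times c + \frac{\partial W}{\partial v}(x + ct, x - ct) \times (-c)
\end{aligned}$$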

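We now need to differentiate a second time. Write $W_1(u, v) = \frac{\partial W}{\partial u}(u, v)$ and $W_2(u, v) = \frac{\partial W}{\partial v}(u, v)$ so that

$$\begin{aligned}
\frac{\partial w}{\partial x}(x, t) &= W_1(x + ct, x - ct) + W_2(x + ct, x - ct) \\
\frac{\partial w}{\partial t}(x, t) &= c\,W_1(x + ct, x - ct) - c\,W_2(x + ct, x - ct)
\end{aligned}$$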

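Using the chain rule again

$$\begin{aligned}
\frac{\partial^2 w}{\partial x^2}(x, t) &= \frac{\partial}{\partial x}\Big[\frac{\partial w}{\partial x}(x, t)\Big] \\
&= \frac{\partial}{\partial x}\big[W_1(x + ct, x - ct)\big] + \frac{\partial}{\partial x}\big[W_2(x + ct, x - ct)\big] \\
&= \frac{\partial W_1}{\partial u} + \frac{\partial W_1}{\partial v} + \frac{\partial W_2}{\partial u} + \frac{\partial W_2}{\partial v} \\
&= \frac{\partial^2 W}{\partial u^2} + \frac{\partial^2 W}{\partial v\,\partial u} + \frac{\partial^2 W}{\partial u\,\partial v} + \frac{\partial^2 W}{\partial v^2} \\
\frac{\partial^2 w}{\partial t^2}(x, t) &= \frac{\partial}{\partial t}\Big[\frac{\partial w}{\partial t}(x, t)\Big] \\
&= c\,\frac{\partial}{\partial t}\big[W_1(x + ct, x - ct)\big] - c\,\frac{\partial}{\partial t}\big[W_2(x + ct, x - ct)\big] \\
&= c^2\,\frac{\partial W_1}{\partial u} - c^2\,\frac{\partial W_1}{\partial v} - c^2\,\frac{\partial W_2}{\partial u} + c^2\,\frac{\partial W_2}{\partial v} \\
&= c^2\,\frac{\partial^2 W}{\partial u^2} - c^2\,\frac{\partial^2 W}{\partial v\,\partial u} - c^2\,\frac{\partial^2 W}{\partial u\,\partial v} + c^2\,\frac{\partial^2 W}{\partial v^2}
\end{aligned}$$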

with all of the functions on the right hand sides having arguments $(x + ct, x - ct)$. So, subtracting $\frac{1}{c^2}$ times the second from the first, we get

$$\frac{\partial^2 w}{\partial x^2}(x, t) - \frac{1}{c^2}\frac{\partial^2 w}{\partial t^2}(x, t) = 4\,\frac{\partial^2 W}{\partial u\,\partial v}(x + ct, x - ct)$$

and $w(x, t)$ obeys $\frac{\partial^2 w}{\partial x^2}(x, t) - \frac{1}{c^2}\frac{\partial^2 w}{\partial t^2}(x, t) = 0$ for all $x$ and $t$ if and only if

$$\frac{\partial^2 W}{\partial u\,\partial v}(u, v) = 0$$

for all $u$ and $v$.
This tells us that the $u$-derivative of $\frac{\partial W}{\partial v}$ is zero, so that $\frac{\partial W}{\partial v}$ is independent of $u$. That is $\frac{\partial W}{\partial v}(u, v) = \tilde{V}(v)$ for some function $\tilde{V}$. The reason that we have called it $\tilde{V}$ instead of $V$ will become evident shortly.
Recall that to apply $\frac{\partial}{\partial v}$, you treat $u$ as a constant and differentiate with respect to $v$.
So $\frac{\partial W}{\partial v}(u, v) = \tilde{V}(v)$ says that, when $u$ is thought of as a constant, $W$ is an antiderivative of $\tilde{V}$.
That is, $W(u, v) = \int \tilde{V}(v)\,dv + U$, with $U$ being an arbitrary constant. As $u$ is being thought of as a constant, $U$ is allowed to depend on $u$.
So, denoting by $V$ any antiderivative of $\tilde{V}$, we can write our solution in a very neat form.

$$W(u, v) = U(u) + V(v)$$

and the function we want is

w(x, t) = W (x + ct , x − ct) = U (x + ct) + V (x − ct)

As we saw above U (x + ct) represents a wave packet moving to the left with speed c. Similarly, V (x − ct) represents a wave
packet moving to the right with speed c.
This is known as d'Alembert's form of the solution. It is named after Jean le Rond d'Alembert, 1717--1783, who was a French
mathematician, physicist, philosopher and music theorist.
Notice that $w(x, t) = U(x + ct) + V(x - ct)$ is a solution regardless of what $U$ and $V$ are. The differential equation cannot tell us what $U$ and $V$ are. To determine them, we need more information about the system, usually in the form of initial conditions, like $w(x, 0) = \cdots$ and $\frac{\partial w}{\partial t}(x, 0) = \cdots$. General techniques for solving partial differential equations lie beyond this text, but they definitely require a good understanding of multivariable calculus. A good reason to keep on reading!
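The claim that $U(x + ct) + V(x - ct)$ solves the wave equation for any (twice differentiable) $U$ and $V$ can be spot-checked numerically. The sketch below assumes the illustrative choices $U(u) = e^{-u^2}$, $V(v) = \sin v$ and $c = 3$, none of which come from the text, and estimates $\frac{\partial^2 w}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2 w}{\partial t^2}$ with central second differences.

```python
import math

C = 3.0  # hypothetical wave speed, chosen only for illustration

def U(u):
    """Hypothetical left-moving profile."""
    return math.exp(-u * u)

def V(v):
    """Hypothetical right-moving profile."""
    return math.sin(v)

def w(x, t):
    """d'Alembert form w(x, t) = U(x + ct) + V(x - ct)."""
    return U(x + C * t) + V(x - C * t)

def wave_residual(x, t, h=1e-3):
    """Central second-difference estimate of w_xx - (1/c^2) w_tt at (x, t)."""
    w_xx = (w(x + h, t) - 2 * w(x, t) + w(x - h, t)) / h**2
    w_tt = (w(x, t + h) - 2 * w(x, t) + w(x, t - h)) / h**2
    return w_xx - w_tt / C**2

# Each residual should be small compared with the size of w_xx itself.
for point in [(0.0, 0.0), (0.7, 0.2), (-1.3, 1.1)]:
    print(point, wave_residual(*point))
```

Swapping in other smooth choices of `U` and `V` should leave the residual near zero, which is the point of d'Alembert's form: the equation alone does not pin down the two profiles.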

Really Optional — Derivation of the Wave Equation


In this section we derive the wave equation

$$\frac{\partial^2 w}{\partial x^2}(x, t) - \frac{1}{c^2}\frac{\partial^2 w}{\partial t^2}(x, t) = 0$$

in one application. To be precise, we apply Newton's law to an elastic string, and conclude that small amplitude transverse
vibrations of the string obey the wave equation.
Here is a sketch of a tiny element of the string.

The basic notation that we will use (most of which appears in the sketch) is
$w(x, t)$ = vertical displacement of the string from the x axis at position $x$ and time $t$
$\theta(x, t)$ = angle between the string and a horizontal line at position $x$ and time $t$
$T(x, t)$ = tension in the string at position $x$ and time $t$
$\rho(x)$ = mass density (per unit length) of the string at position $x$

The forces acting on the tiny element of string at time t are


1. tension pulling to the right, which has magnitude T (x + Δx, t) and acts at an angle θ(x + Δx, t) above horizontal

2. tension pulling to the left, which has magnitude T (x, t) and acts at an angle θ(x, t) below horizontal and, possibly,
3. various external forces, like gravity. We shall assume that all of the external forces act vertically and we shall denote by
F (x, t)Δx the net magnitude of the external force acting on the element of string.

The length of the element of string is essentially $\sqrt{\Delta x^2 + \Delta w^2}$ so that the mass of the element of string is essentially $\rho(x)\sqrt{\Delta x^2 + \Delta w^2}$ and the vertical component of Newton's law $F = ma$ says that

$$\rho(x)\sqrt{\Delta x^2 + \Delta w^2}\;\frac{\partial^2 w}{\partial t^2}(x, t) = T(x + \Delta x, t)\sin\theta(x + \Delta x, t) - T(x, t)\sin\theta(x, t) + F(x, t)\Delta x$$

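Dividing by $\Delta x$ and taking the limit as $\Delta x \to 0$ gives

$$\begin{aligned}
\rho(x)\sqrt{1 + \Big(\frac{\partial w}{\partial x}\Big)^2}\;\frac{\partial^2 w}{\partial t^2}(x, t) &= \frac{\partial}{\partial x}\big[T(x, t)\sin\theta(x, t)\big] + F(x, t) \\
&= \frac{\partial T}{\partial x}(x, t)\sin\theta(x, t) + T(x, t)\cos\theta(x, t)\,\frac{\partial\theta}{\partial x}(x, t) + F(x, t)
\end{aligned} \tag{E1}$$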

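We can dispose of all the $\theta$'s by observing from the figure above that

$$\tan\theta(x, t) = \lim_{\Delta x \to 0}\frac{\Delta w}{\Delta x} = \frac{\partial w}{\partial x}(x, t)$$

which implies, using the figure below, that

$$\sin\theta(x, t) = \frac{\frac{\partial w}{\partial x}(x, t)}{\sqrt{1 + \big(\frac{\partial w}{\partial x}(x, t)\big)^2}} \qquad \cos\theta(x, t) = \frac{1}{\sqrt{1 + \big(\frac{\partial w}{\partial x}(x, t)\big)^2}}$$

$$\theta(x, t) = \arctan\frac{\partial w}{\partial x}(x, t) \qquad \frac{\partial\theta}{\partial x}(x, t) = \frac{\frac{\partial^2 w}{\partial x^2}(x, t)}{1 + \big(\frac{\partial w}{\partial x}(x, t)\big)^2}$$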

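Substituting these formulae into (E1) gives a horrendous mess. However, we can get considerable simplification by looking only at small vibrations. By a small vibration, we mean that $|\theta(x, t)| \ll 1$ for all $x$ and $t$. This implies that $|\tan\theta(x, t)| \ll 1$, hence that $\big|\frac{\partial w}{\partial x}(x, t)\big| \ll 1$ and hence that

$$\sqrt{1 + \Big(\frac{\partial w}{\partial x}\Big)^2} \approx 1 \qquad \sin\theta(x, t) \approx \frac{\partial w}{\partial x}(x, t) \qquad \cos\theta(x, t) \approx 1 \qquad \frac{\partial\theta}{\partial x}(x, t) \approx \frac{\partial^2 w}{\partial x^2}(x, t) \tag{E2}$$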

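Substituting these into equation (E1) gives

$$\rho(x)\,\frac{\partial^2 w}{\partial t^2}(x, t) = \frac{\partial T}{\partial x}(x, t)\,\frac{\partial w}{\partial x}(x, t) + T(x, t)\,\frac{\partial^2 w}{\partial x^2}(x, t) + F(x, t) \tag{E3}$$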

which is indeed relatively simple, but still exhibits a problem. This is one equation in the two unknowns w and T .
Fortunately there is a second equation lurking in the background, that we haven't used yet. Namely, the horizontal component of
Newton's law of motion. As a second simplification, we assume that there are only transverse vibrations. That is, our tiny string
element moves only vertically. Then the net horizontal force on it must be zero. That is,

T (x + Δx, t) cos θ(x + Δx, t) − T (x, t) cos θ(x, t) = 0

Dividing by $\Delta x$ and taking the limit as $\Delta x$ tends to zero gives

$$\frac{\partial}{\partial x}\big[T(x, t)\cos\theta(x, t)\big] = 0$$

Thus T (x, t) cos θ(x, t) is independent of x. For small amplitude vibrations, cos θ is very close to one, for all x. So T is a function
of t only, which is determined by how hard you are pulling on the ends of the string at time t. So for small, transverse vibrations,
(E3) simplifies further to

$$\rho(x)\,\frac{\partial^2 w}{\partial t^2}(x, t) = T(t)\,\frac{\partial^2 w}{\partial x^2}(x, t) + F(x, t)$$

In the event that the string density $\rho$ is a constant, independent of $x$, the string tension $T(t)$ is a constant independent of $t$ (in other words you are not continually playing with the tuning pegs) and there are no external forces $F$, we end up with the wave equation

$$\frac{\partial^2 w}{\partial t^2}(x, t) = c^2\,\frac{\partial^2 w}{\partial x^2}(x, t) \qquad \text{where } c = \sqrt{\frac{T}{\rho}}$$

as desired.
The equation that is called the wave equation has built into it a lot of approximations. By going through the derivation, we have
seen what those approximations are, and we can get some idea as to when they are applicable.
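To get a feel for the size of the wave speed $c = \sqrt{T/\rho}$ obtained in this derivation, here is a small sketch that simply evaluates the formula. The tension and density values are made up for illustration (they are not from the text), and the function name is hypothetical.

```python
import math

def wave_speed(tension, density):
    """Return c = sqrt(T / rho) for constant tension T and constant linear density rho.

    Units are assumed to be SI: tension in newtons, density in kg per metre,
    so that the result is in metres per second.
    """
    if tension <= 0 or density <= 0:
        raise ValueError("tension and density must be positive")
    return math.sqrt(tension / density)

# Hypothetical string: T = 100 N and rho = 0.01 kg/m gives roughly 100 m/s.
print(wave_speed(100.0, 0.01))
```

Note how the formula behaves: quadrupling the tension doubles the speed, while quadrupling the density halves it.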
1. If you plug “wave equation” into your favourite search engine you will get more than a million hits.

This page titled 2.8: Optional — Solving the Wave Equation is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or
curated by Joel Feldman, Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts
platform; a detailed edit history is available upon request.

2.9: Maximum and Minimum Values
One of the core topics in single variable calculus courses is finding the maxima and minima of functions of one variable. We'll now
extend that discussion to functions of more than one variable 1. Rather than leaping into the deep end, we'll not be too ambitious
and concentrate on functions of two variables. That being said, many of the techniques work more generally. To start, we have the
following natural extensions to some familiar definitions.

 Definition 2.9.1

Let the function $f(x, y)$ be defined for all $(x, y)$ in some subset $R$ of $\mathbb{R}^2$. Let $(a, b)$ be a point in $R$.

(a, b) is a local maximum of f (x, y) if f (x, y) ≤ f (a, b) for all (x, y) close to (a, b). More precisely, (a, b) is a local
maximum of f (x, y) if there is an r > 0 such that f (x, y) ≤ f (a, b) for all points (x, y) within a distance r of (a, b).
(a, b) is a local minimum of f (x, y) if f (x, y) ≥ f (a, b) for all (x, y) close to (a, b).

Local maximum and minimum values are also called extremal values.
(a, b) is an absolute maximum or global maximum of f (x, y) if f (x, y) ≤ f (a, b) for all (x, y) in R.

(a, b) is an absolute minimum or global minimum of f (x, y) if f (x, y) ≥ f (a, b) for all (x, y) in R.

Local Maxima and Minima


One of the first things you did when you were developing the techniques used to find the maximum and minimum values of f (x)
was ask yourself 2
Suppose that the largest value of f (x) is f (a). What does that tell us about a?
After a little thought you answered
If the largest value of $f(x)$ is $f(a)$ and $f$ is differentiable at $a$, then $f'(a) = 0$.

Let's recall why that's true. Suppose that the largest value of $f(x)$ is $f(a)$. Then for all $h > 0$,

$$f(a+h) \le f(a) \implies f(a+h) - f(a) \le 0 \implies \frac{f(a+h) - f(a)}{h} \le 0 \quad \text{if } h > 0$$

Taking the limit $h \to 0$ tells us that $f'(a) \le 0$. Similarly 3, for all $h < 0$,

$$f(a+h) \le f(a) \implies f(a+h) - f(a) \le 0 \implies \frac{f(a+h) - f(a)}{h} \ge 0 \quad \text{if } h < 0$$

Taking the limit $h \to 0$ now tells us that $f'(a) \ge 0$. So we have both $f'(a) \ge 0$ and $f'(a) \le 0$ which forces $f'(a) = 0$.

You also observed at the time that for this argument to work, you only need f (x) ≤ f (a) for all x's close to a, not necessarily for
all x's in the whole world. (In the above inequalities, we only used f (a + h) with h small.) Since we care only about f (x) for x
near a, we can refine the above statement.
If $f(a)$ is a local maximum for $f(x)$ and $f$ is differentiable at $a$, then $f'(a) = 0$.

Precisely the same reasoning applies to minima.

If $f(a)$ is a local minimum for $f(x)$ and $f$ is differentiable at $a$, then $f'(a) = 0$.

Let's use the ideas of the above discourse to extend the study of local maxima and local minima to functions of more than one variable. Suppose that the function $f(x, y)$ is defined for all $(x, y)$ in some subset $R$ of $\mathbb{R}^2$, that $(a, b)$ is a point of $R$ that is not on the boundary of $R$, and that $f$ has a local maximum at $(a, b)$. See the figure below.

2.9.1 https://math.libretexts.org/@go/page/92242
Then the function $f(x, y)$ cannot increase in value as $(x, y)$ moves a short distance away from $(a, b)$ in any direction. No matter which direction $\vec{d}$ we choose, the directional derivative of $f$ at $(a, b)$ in direction $\vec{d}$ must be zero or smaller. Writing this in mathematical symbols, we get

$$D_{\vec{d}} f(a, b) = \nabla f(a, b) \cdot \frac{\vec{d}}{|\vec{d}|} \le 0$$

And the directional derivative of $f$ at $(a, b)$ in the direction $-\vec{d}$ also must be zero or negative.

$$D_{-\vec{d}} f(a, b) = \nabla f(a, b) \cdot \frac{-\vec{d}}{|\vec{d}|} \le 0 \quad \text{which implies that} \quad \nabla f(a, b) \cdot \frac{\vec{d}}{|\vec{d}|} \ge 0$$

As $\nabla f(a, b) \cdot \frac{\vec{d}}{|\vec{d}|}$ must be both positive (or zero) and negative (or zero) at the same time, it must be zero. In particular, choosing $\vec{d} = \hat{\imath}$ forces the $x$ component of $\nabla f(a, b)$ to be zero, and choosing $\vec{d} = \hat{\jmath}$ forces the $y$ component of $\nabla f(a, b)$ to be zero. We have thus shown that $\nabla f(a, b) = 0$. The same argument shows that $\nabla f(a, b) = 0$ when $(a, b)$ is a local minimum too.
This is an important and useful result, so let's theoremise it.

 Theorem 2.9.2

Let the function $f(x, y)$ be defined for all $(x, y)$ in some subset $R$ of $\mathbb{R}^2$. Assume that
$(a, b)$ is a point of $R$ that is not on the boundary of $R$ and
$(a, b)$ is a local maximum or local minimum of $f$ and that
the partial derivatives of $f$ exist at $(a, b)$.
Then

$$\nabla f(a, b) = 0.$$

 Definition 2.9.3
Let f (x, y) be a function and let (a, b) be a point in its domain. Then
if ∇f (a, b) exists and is zero we call (a, b) a critical point (or a stationary point) of the function, and
if ∇f (a, b) does not exist then we call (a, b) a singular point of the function.

 Warning 2.9.4

Note that some people (and texts) combine both of these cases and call (a, b) a critical point when either the gradient is zero or
does not exist.

 Warning 2.9.5

Theorem 2.9.2 tells us that every local maximum or minimum (in the interior of the domain of a function whose partial
derivatives exist) is a critical point. Beware that it does not 4 tell us that every critical point is either a local maximum or a local
minimum.

In fact, we shall see later 5, in Examples 2.9.13 and 2.9.15, critical points that are neither local maxima nor local minima. Nonetheless, Theorem 2.9.2 is very useful because often functions have only a small number of critical points. To find local maxima and minima of such functions, we only need to consider their critical and singular points. We'll return later to the question of how to tell if a critical point is a local maximum, local minimum or neither. For now, we'll just practice finding critical points.

 Example 2.9.6. $f(x, y) = x^2 - 2xy + 2y^2 + 2x - 6y + 12$

Find all critical points of $f(x, y) = x^2 - 2xy + 2y^2 + 2x - 6y + 12$.

Solution
To find the critical points, we need to find the gradient. To find the gradient we need to find the first order partial derivatives.
So, as a preliminary calculation, we find the two first order partial derivatives of f (x, y).

$$f_x(x, y) = 2x - 2y + 2 \qquad f_y(x, y) = -2x + 4y - 6$$

So the critical points are the solutions of the pair of equations

$$2x - 2y + 2 = 0 \qquad -2x + 4y - 6 = 0$$

or equivalently (dividing by two and moving the constants to the right hand side)

$$x - y = -1 \tag{E1}$$

$$-x + 2y = 3 \tag{E2}$$

This is a system of two equations in two unknowns ($x$ and $y$). One strategy for solving a system like this is to
First use one of the equations to solve for one of the unknowns in terms of the other unknown. For example, (E1) tells us
that y = x + 1. This expresses y in terms of x. We say that we have solved for y in terms of x.
Then substitute the result, y = x + 1 in our case, into the other equation, (E2). In our case, this gives
−x + 2(x + 1) = 3 ⟺ x +2 = 3 ⟺ x =1

We have now found that x = 1, y = x + 1 = 2 is the only solution. So the only critical point is (1, 2). Of course it only
takes a moment to verify that ∇f (1, 2) = ⟨0, 0⟩ . It is a good idea to do this as a simple check of our work.
An alternative strategy for solving a system of two equations in two unknowns, like (E1) and (E2), is to
add equations (E1) and (E2) together. This gives

(E1) + (E2) :   (1 − 1)x + (−1 + 2)y = −1 + 3 ⟺ y =2

The point here is that adding equations (E1) and (E2) together eliminates the unknown x, leaving us with one equation in
the unknown y, which is easily solved. For other systems of equations you might have to multiply the equations by some
numbers before adding them together.
We now know that y = 2. Substituting it into (E1) gives us

x − 2 = −1 ⟹ x =1

Once again (thankfully) we have found that the only critical point is (1, 2).
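The check suggested above (that $\nabla f(1, 2) = \langle 0, 0\rangle$) is easy to automate. The following sketch solves (E1) and (E2) by Cramer's rule, which is a different technique from the substitution and elimination strategies described in the text, and then evaluates the gradient at the result. Exact rational arithmetic is used so there is no rounding; the function names are hypothetical.

```python
from fractions import Fraction

def grad_f(x, y):
    """Gradient of f(x, y) = x^2 - 2xy + 2y^2 + 2x - 6y + 12."""
    return (2 * x - 2 * y + 2, -2 * x + 4 * y - 6)

def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*x + a12*y = b1 and a21*x + a22*y = b2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("system is singular")
    x = Fraction(b1 * a22 - a12 * b2, det)
    y = Fraction(a11 * b2 - b1 * a21, det)
    return x, y

# (E1): x - y = -1    (E2): -x + 2y = 3
x, y = solve_2x2(1, -1, -1, -1, 2, 3)
print("critical point:", (x, y))
print("gradient there:", grad_f(x, y))
```

Because the two equations are linear with a nonzero determinant, there is exactly one solution, which agrees with the single critical point found by hand.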

This was pretty easy because we only had to solve linear equations, which in turn was a consequence of the fact that f (x, y) was a
polynomial of degree two. Here is an example with some slightly more challenging algebra.

 Example 2.9.7. $f(x, y) = 2x^3 - 6xy + y^2 + 4y$

Find all critical points of $f(x, y) = 2x^3 - 6xy + y^2 + 4y$.

Solution
As in the last example, we need to find where the gradient is zero, and to find the gradient we need the first order partial derivatives.

$$f_x = 6x^2 - 6y \qquad f_y = -6x + 2y + 4$$

So the critical points are the solutions of

$$6x^2 - 6y = 0 \qquad -6x + 2y + 4 = 0$$

We can rewrite the first equation as $y = x^2$, which expresses $y$ as a function of $x$. We can then substitute $y = x^2$ into the second equation, giving

$$\begin{aligned}
-6x + 2y + 4 = 0 &\iff -6x + 2x^2 + 4 = 0 \iff x^2 - 3x + 2 = 0 \\
&\iff (x - 1)(x - 2) = 0 \\
&\iff x = 1 \text{ or } 2
\end{aligned}$$

When $x = 1$, $y = 1^2 = 1$ and when $x = 2$, $y = 2^2 = 4$. So, there are two critical points: $(1, 1)$, $(2, 4)$.
Alternatively, we could have also used the second equation to write $y = 3x - 2$, and then substituted that into the first equation to get

$$6x^2 - 6(3x - 2) = 0 \iff x^2 - 3x + 2 = 0$$

just as above.
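The substitution argument in this example can be mirrored in a few lines of code. The sketch below finds the roots of $x^2 - 3x + 2 = 0$ with the quadratic formula (rather than by factoring, as in the text), sets $y = x^2$, and confirms that both partial derivatives vanish at each resulting point. The variable and function names are hypothetical.

```python
def grad_f(x, y):
    """Gradient of f(x, y) = 2x^3 - 6xy + y^2 + 4y."""
    return (6 * x**2 - 6 * y, -6 * x + 2 * y + 4)

# Roots of x^2 - 3x + 2 = 0 via the quadratic formula.
a, b, c = 1, -3, 2
disc = b * b - 4 * a * c
roots = sorted(((-b - disc**0.5) / (2 * a), (-b + disc**0.5) / (2 * a)))

# The first critical point equation gives y = x^2 for each root.
critical_points = [(x, x * x) for x in roots]

for point in critical_points:
    print(point, grad_f(*point))
```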

And here is an example for which the algebra requires a bit more thought.

 Example 2.9.8. f (x, y) = xy(5x + y − 15)

Find all critical points of f (x, y) = xy(5x + y − 15).


Solution
The first order partial derivatives of f (x, y) = xy(5x + y − 15) are
fx (x, y)   =  y(5x + y − 15) + xy(5)   =  y(5x + y − 15) + y(5x)

  =  y(10x + y − 15)

fy (x, y)   =  x(5x + y − 15) + xy(1)   =  x(5x + y − 15) + x(y)

  =  x(5x + 2y − 15)

The critical points are the solutions of fx (x, y) = fy (x, y) = 0. That is, we need to find all x, y that satisfy the pair of
equations
y(10x + y − 15) = 0 (E1)

x(5x + 2y − 15) = 0 (E2)

The first equation, y(10x + y − 15) = 0, is satisfied if at least one of the two factors y, (10x + y − 15) is zero. So the first
equation is satisfied if at least one of the two equations
y =0 (E1a)

10x + y = 15 (E1b)

is satisfied. The second equation, x(5x + 2y − 15) = 0, is satisfied if at least one of the two factors x, (5x + 2y − 15) is
zero. So the second equation is satisfied if at least one of the two equations
x =0 (E2a)

5x + 2y = 15 (E2b)

is satisfied.
So both critical point equations (E1) and (E2) are satisfied if and only if at least one of (E1a), (E1b) is satisfied and in addition at least one of (E2a), (E2b) is satisfied. So both critical point equations (E1) and (E2) are satisfied if and only if at least one of the following four possibilities holds.
(E1a) and (E2a) are satisfied if and only if $x = y = 0$
(E1a) and (E2b) are satisfied if and only if $y = 0,\ 5x + 2y = 15 \iff y = 0,\ 5x = 15 \iff y = 0,\ x = 3$
(E1b) and (E2a) are satisfied if and only if $10x + y = 15,\ x = 0 \iff y = 15,\ x = 0$
(E1b) and (E2b) are satisfied if and only if $10x + y = 15,\ 5x + 2y = 15$. We can use, for example, the second of these equations to solve for $x$ in terms of $y$: $x = \frac{1}{5}(15 - 2y)$. When we substitute this into the first equation we get $2(15 - 2y) + y = 15$, which we can solve for $y$. This gives $-3y = 15 - 30$ or $y = 5$ and then $x = \frac{1}{5}(15 - 2 \times 5) = 1$.

In conclusion, the critical points are (0, 0), (3, 0), (0, 15) and (1, 5).
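The case analysis above (one factor from the first equation paired with one factor from the second) is mechanical enough to sketch in code. The snippet below stores each linear factor as hypothetical coefficient triples $(a, b, c)$ meaning $ax + by = c$, solves each of the four pairings exactly with Cramer's rule (a convenience chosen here, not the method used in the text), and checks that the gradient vanishes at every point found.

```python
from fractions import Fraction
from itertools import product

def grad_f(x, y):
    """Gradient of f(x, y) = xy(5x + y - 15)."""
    return (y * (10 * x + y - 15), x * (5 * x + 2 * y - 15))

def solve_pair(eq1, eq2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 exactly (Cramer's rule)."""
    (a1, b1, c1), (a2, b2, c2) = eq1, eq2
    det = a1 * b2 - a2 * b1
    return (Fraction(c1 * b2 - c2 * b1, det), Fraction(a1 * c2 - a2 * c1, det))

# Each linear factor is stored as coefficients (a, b, c) of a*x + b*y = c.
first_factors = [(0, 1, 0), (10, 1, 15)]   # (E1a) y = 0,  (E1b) 10x + y = 15
second_factors = [(1, 0, 0), (5, 2, 15)]   # (E2a) x = 0,  (E2b) 5x + 2y = 15

critical_points = [solve_pair(e1, e2)
                   for e1, e2 in product(first_factors, second_factors)]

for point in critical_points:
    print(point, grad_f(*point))
```

This works here because all four pairings happen to have a nonzero determinant; a pairing of parallel lines would need separate handling.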
A more compact way to write what we have just done is
fx (x, y) = 0 and fy (x, y) = 0

⟺ y(10x + y − 15) = 0 and x(5x + 2y − 15) = 0

⟺ {y = 0 or 10x + y = 15} and {x = 0 or 5x + 2y = 15}

⟺ {y = 0,  x = 0} or {y = 0,  5x + 2y = 15}

 or {10x + y = 15,  x = 0} or {10x + y = 15,  5x + 2y = 15}

⟺ {x = y = 0} or {y = 0,  x = 3}

 or {x = 0,  y = 15} or {x = 1,  y = 5}

Let's try a more practical example, something from the real world. Well, a mathematician's “real world”. The interested reader should search-engine their way to a discussion of “idealisation”, “game theory”, “Cournot models” and “Bertrand models”. But don't spend too long there. A discussion of breweries is about to take place.

 Example 2.9.9

In a certain community, there are two breweries in competition 6, so that sales of each negatively affect the profits of the other. If brewery A produces $x$ litres of beer per month and brewery B produces $y$ litres per month, then the profits of the two breweries are given by

$$P = 2x - \frac{2x^2 + y^2}{10^6} \qquad Q = 2y - \frac{4y^2 + x^2}{2 \times 10^6}$$

respectively. Find the sum of the two profits if each brewery independently sets its own production level to maximize its own
profit and assumes that its competitor does likewise. Then, assuming cartel behaviour, find the sum of the two profits if the two
breweries cooperate so as to maximize that sum 7.
Solution
If A adjusts $x$ to maximize $P$ (for $y$ held fixed) and B adjusts $y$ to maximize $Q$ (for $x$ held fixed) then $x$ and $y$ are determined by the equations

$$P_x = 2 - \frac{4x}{10^6} = 0 \tag{E1}$$

$$Q_y = 2 - \frac{8y}{2 \times 10^6} = 0 \tag{E2}$$

Equation (E1) yields $x = \frac{1}{2}10^6$ and equation (E2) yields $y = \frac{1}{2}10^6$. Knowing $x$ and $y$ we can determine $P$, $Q$ and the total profit

$$\begin{aligned}
P + Q &= 2(x + y) - \frac{1}{10^6}\Big(\frac{5}{2}x^2 + 3y^2\Big) \\
&= 10^6\Big(1 + 1 - \frac{5}{8} - \frac{3}{4}\Big) = \frac{5}{8}10^6
\end{aligned}$$

On the other hand if (A, B) adjust $(x, y)$ to maximize $P + Q = 2(x + y) - \frac{1}{10^6}\big(\frac{5}{2} x^2 + 3y^2\big)$, then $x$ and $y$ are determined by
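\[ (P + Q)_x = 2 - \frac{5x}{10^6} = 0 \tag{E1} \]
\[ (P + Q)_y = 2 - \frac{6y}{10^6} = 0 \tag{E2} \]

Equation (E1) yields $x = \frac{2}{5} 10^6$ and equation (E2) yields $y = \frac{1}{3} 10^6$. Again knowing $x$ and $y$ we can determine the total profit
\[ P + Q = 2(x + y) - \frac{1}{10^6}\Big(\frac{5}{2} x^2 + 3y^2\Big) = 10^6\Big(\frac{4}{5} + \frac{2}{3} - \frac{2}{5} - \frac{1}{3}\Big) = \frac{11}{15}\, 10^6 \]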

So cooperating really does help their profits. Unfortunately, like a very small tea-pot, consumers will be a little poorer 8.
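The two totals can be double-checked with exact rational arithmetic. The following is a minimal sketch, assuming production is measured in units of $10^6$ litres (so $X = x/10^6$, $Y = y/10^6$) and profit in units of $10^6$; the name `total_profit` and the rescaling are choices made here for illustration.

```python
from fractions import Fraction as F

def total_profit(X, Y):
    """P + Q divided by 10**6, with X = x / 10**6 and Y = y / 10**6."""
    return 2 * (X + Y) - F(5, 2) * X**2 - 3 * Y**2

# Each brewery maximizes its own profit: x = y = (1/2) * 10**6.
competitive = total_profit(F(1, 2), F(1, 2))

# The breweries maximize the sum: x = (2/5) * 10**6, y = (1/3) * 10**6.
cooperative = total_profit(F(2, 5), F(1, 3))

print(competitive, cooperative)
```

Using `Fraction` rather than floats keeps the comparison of $\frac{5}{8}$ with $\frac{11}{15}$ exact.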

Moving swiftly away from the last pun, let's do something a little more geometric.

 Example 2.9.10

Equal angle bends are made at equal distances from the two ends of a 100 metre long fence so the resulting three segment
fence can be placed along an existing wall to make an enclosure of trapezoidal shape. What is the largest possible area for such
an enclosure?

Solution
This is a very geometric problem (fenced off from pun opportunities), and as such we should start by drawing a sketch and
introducing some variable names.

The area enclosed by the fence is the area inside the blue rectangle (in the figure on the right above) plus the area inside the
two blue triangles.
\[ A(x, \theta) = (100 - 2x)\, x \sin\theta + 2 \cdot \frac{1}{2} \cdot x \sin\theta \cdot x \cos\theta = (100x - 2x^2) \sin\theta + x^2 \sin\theta \cos\theta \]

To maximize the area, we need to solve


\[ 0 = \frac{\partial A}{\partial x} = (100 - 4x) \sin\theta + 2x \sin\theta \cos\theta \]
\[ 0 = \frac{\partial A}{\partial \theta} = (100x - 2x^2) \cos\theta + x^2 \{\cos^2\theta - \sin^2\theta\} \]

Note that both terms in the first equation contain the factor $\sin\theta$ and all terms in the second equation contain the factor $x$. If
either $\sin\theta$ or $x$ is zero the area $A(x, \theta)$ will also be zero, and so will certainly not be maximal. So we may divide the first
equation by $\sin\theta$ and the second equation by $x$, giving
\[ (100 - 4x) + 2x \cos\theta = 0 \tag{E1} \]
\[ (100 - 2x) \cos\theta + x \{\cos^2\theta - \sin^2\theta\} = 0 \tag{E2} \]

These equations might look a little scary. But there is no need to panic. They are not as bad as they look because $\theta$ enters only
through $\cos\theta$ and $\sin^2\theta$, which we can easily write in terms of $\cos\theta$. Furthermore we can eliminate $\cos\theta$ by observing that
the first equation forces $\cos\theta = -\frac{100 - 4x}{2x}$ and hence $\sin^2\theta = 1 - \cos^2\theta = 1 - \frac{(100 - 4x)^2}{4x^2}$. Substituting these into the second
equation gives

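\[ -(100 - 2x)\, \frac{100 - 4x}{2x} + x \Big[ \frac{(100 - 4x)^2}{2x^2} - 1 \Big] = 0 \]
\[ \implies -(100 - 2x)(100 - 4x) + (100 - 4x)^2 - 2x^2 = 0 \]
\[ \implies 6x^2 - 200x = 0 \]
\[ \implies x = \frac{100}{3}, \qquad \cos\theta = -\frac{-100/3}{200/3} = \frac{1}{2}, \qquad \theta = 60^\circ \]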

and the maximum area enclosed is


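\[ A = \Big( 100 \cdot \frac{100}{3} - 2 \cdot \frac{100^2}{3^2} \Big) \frac{\sqrt{3}}{2} + \frac{1}{2} \cdot \frac{100^2}{3^2} \cdot \frac{\sqrt{3}}{2} = \frac{2500}{\sqrt{3}} \]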
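The value $\frac{2500}{\sqrt{3}} \approx 1443$ square metres can be checked numerically. The sketch below evaluates the area formula at $x = \frac{100}{3}$, $\theta = 60^\circ$ and compares it with a coarse grid search; the grid spacing and the names `area`, `best` and `grid_max` are arbitrary choices made here, and the grid search is only a plausibility check, not a proof of maximality.

```python
import math

def area(x, theta):
    """Enclosed area A(x, theta) from the example, theta in radians."""
    return (100 * x - 2 * x**2) * math.sin(theta) + x**2 * math.sin(theta) * math.cos(theta)

# Value at the critical point x = 100/3, theta = 60 degrees.
best = area(100 / 3, math.pi / 3)

# Coarse grid over 0 < x < 50 and 0 < theta < pi.
grid_max = max(
    area(0.5 * i, j * math.pi / 360)
    for i in range(1, 100)
    for j in range(1, 360)
)

print(best, grid_max)
```

The grid never lands exactly on $x = \frac{100}{3}$, so `grid_max` should come out slightly below `best`.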

Now here is a very useful (even practical!) statistical example — finding the line that best fits a given collection of points.

 Example 2.9.11. Linear regression

An experiment yields $n$ data points $(x_i, y_i)$, $i = 1, 2, \cdots, n$. We wish to find the straight line $y = mx + b$ which “best” fits
the data.

The definition of “best” is “minimizes the root mean square error”, i.e. minimizes
\[ E(m, b) = \sum_{i=1}^{n} (m x_i + b - y_i)^2 \]
(The root mean square error is $\sqrt{E(m, b)/n}$, which is smallest exactly when $E(m, b)$ is smallest.)

Note that
term number $i$ in $E(m, b)$ is the square of the difference between $y_i$, which is the $i^{\rm th}$ measured value of $y$, and $[mx + b]_{x = x_i}$, which is the approximation to $y_i$ given by the line $y = mx + b$.
All terms in the sum are nonnegative, regardless of whether the points $(x_i, y_i)$ are above or below the line.
Our problem is to find the m and b that minimize E(m, b). This technique for drawing a line through a bunch of data points
is called “linear regression”. It is used a lot 9 10. Even in the real world — and not just the real world that you find in
mathematics problems. The actual real world that involves jobs.
Solution
We wish to choose m and b so as to minimize E(m, b). So we need to determine where the partial derivatives of E are zero.
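\[ 0 = \frac{\partial E}{\partial m} = \sum_{i=1}^{n} 2 (m x_i + b - y_i) x_i = m \Big[ \sum_{i=1}^{n} 2 x_i^2 \Big] + b \Big[ \sum_{i=1}^{n} 2 x_i \Big] - \Big[ \sum_{i=1}^{n} 2 x_i y_i \Big] \]
\[ 0 = \frac{\partial E}{\partial b} = \sum_{i=1}^{n} 2 (m x_i + b - y_i) = m \Big[ \sum_{i=1}^{n} 2 x_i \Big] + b \Big[ \sum_{i=1}^{n} 2 \Big] - \Big[ \sum_{i=1}^{n} 2 y_i \Big] \]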

There are a lot of symbols here. But remember that all of the $x_i$'s and $y_i$'s are given constants. They come from, for example,
experimental data. The only unknowns are $m$ and $b$. To emphasize this, and to save some writing, define the constants

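\[ S_x = \sum_{i=1}^{n} x_i \qquad S_y = \sum_{i=1}^{n} y_i \qquad S_{x^2} = \sum_{i=1}^{n} x_i^2 \qquad S_{xy} = \sum_{i=1}^{n} x_i y_i \]

The equations which determine the critical points are (after dividing by two)
\[ S_{x^2}\, m + S_x\, b = S_{xy} \tag{E1} \]
\[ S_x\, m + n\, b = S_y \tag{E2} \]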

These are two linear equations in the unknowns $m$ and $b$. They may be solved in any of the usual ways. One is to use (E2) to
solve for $b$ in terms of $m$
\[ b = \frac{1}{n} (S_y - S_x m) \]
and then substitute this into (E1) to get the equation
\[ S_{x^2}\, m + \frac{1}{n} S_x (S_y - S_x m) = S_{xy} \implies (n S_{x^2} - S_x^2)\, m = n S_{xy} - S_x S_y \]
for $m$. We can then solve this equation for $m$ and substitute back into the formula for $b$ above. This gives
\[ m = \frac{n S_{xy} - S_x S_y}{n S_{x^2} - S_x^2} \]
\[ b = \frac{S_y}{n}\, \frac{n S_{x^2} - S_x^2}{n S_{x^2} - S_x^2} - \frac{S_x}{n}\, \frac{n S_{xy} - S_x S_y}{n S_{x^2} - S_x^2} = \frac{n S_y S_{x^2} - n S_x S_{xy}}{n (n S_{x^2} - S_x^2)} = -\frac{S_x S_{xy} - S_y S_{x^2}}{n S_{x^2} - S_x^2} \]

Another way to solve the system of equations is
\[ n (\text{E1}) - S_x (\text{E2}): \quad [n S_{x^2} - S_x^2]\, m = n S_{xy} - S_x S_y \]
\[ -S_x (\text{E1}) + S_{x^2} (\text{E2}): \quad [n S_{x^2} - S_x^2]\, b = -S_x S_{xy} + S_y S_{x^2} \]

which gives the same solution.


So given a bunch of data points, it only takes a quick bit of arithmetic — no calculus required — to apply the above formulae
and so to find the best fitting line. Of course while you don't need any calculus to apply the formulae, you do need calculus to
understand where they came from. The same technique can be extended to other types of curve fitting problems. For example,
polynomial regression.
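The formulae for $m$ and $b$ translate directly into a few lines of code. The following is a minimal sketch rather than a production routine: the function name `fit_line`, the variable names, and the error raised when all of the $x_i$ coincide (so that $n S_{x^2} - S_x^2 = 0$) are choices made here.

```python
def fit_line(xs, ys):
    """Return (m, b) for the least squares line y = m*x + b."""
    n = len(xs)
    Sx = sum(xs)
    Sy = sum(ys)
    Sx2 = sum(x * x for x in xs)
    Sxy = sum(x * y for x, y in zip(xs, ys))

    # Common denominator n*S_{x^2} - S_x^2 of both formulae.
    denom = n * Sx2 - Sx**2
    if denom == 0:
        raise ValueError("all x values are equal; the best line is not unique")

    m = (n * Sxy - Sx * Sy) / denom
    b = (Sy * Sx2 - Sx * Sxy) / denom
    return m, b

# Data lying exactly on y = 2x + 1 should give back m = 2, b = 1.
m, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(m, b)
```

For large or badly scaled data sets, library routines are usually preferable, because the difference $n S_{x^2} - S_x^2$ can lose accuracy in floating point arithmetic.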

The Second Derivative Test


Now let's start thinking about how to tell if a critical point is a local minimum or maximum. Remember what happens for functions
of one variable. Suppose that $x = a$ is a critical point of the function $f(x)$. Any (sufficiently smooth) function is well
approximated, when $x$ is close to $a$, by the first few terms of its Taylor expansion
\[ f(x) = f(a) + f'(a)\, (x - a) + \frac{1}{2} f''(a)\, (x - a)^2 + \frac{1}{3!} f^{(3)}(a)\, (x - a)^3 + \cdots \]
As $a$ is a critical point, we know that $f'(a) = 0$ and
\[ f(x) = f(a) + \frac{1}{2} f''(a)\, (x - a)^2 + \frac{1}{3!} f^{(3)}(a)\, (x - a)^3 + \cdots \]
If $f''(a) \ne 0$, $f(x)$ is going to look a lot like $f(a) + \frac{1}{2} f''(a)\, (x - a)^2$ when $x$ is really close to $a$. In particular
if $f''(a) > 0$, then we will have $f(x) > f(a)$ when $x$ is close to (but not equal to) $a$, so that $a$ will be a local minimum and
if $f''(a) < 0$, then we will have $f(x) < f(a)$ when $x$ is close to (but not equal to) $a$, so that $a$ will be a local maximum, but
if $f''(a) = 0$, then we cannot draw any conclusions without more work.
A similar, but messier, analysis is possible for functions of two variables. Here are some simple quadratic examples that provide a
warmup for that messier analysis.

 Example 2.9.12. $f(x, y) = x^2 + 3xy + 3y^2 - 6x - 3y - 6$

Consider $f(x, y) = x^2 + 3xy + 3y^2 - 6x - 3y - 6$. The gradient of $f$ is
\[ \nabla f(x, y) = (2x + 3y - 6)\, \hat{\imath} + (3x + 6y - 3)\, \hat{\jmath} \]

So (x, y) is a critical point of f if and only if


2x + 3y = 6 (E1)

3x + 6y = 3 (E2)

Multiplying the first equation by 2 and subtracting the second equation gives

x =9

Then substituting x = 9 back into the first equation gives


2 × 9 + 3y = 6 ⟹ y = −4

So f (x, y) has precisely one critical point, namely (9 , −4).

Now let's try to determine if f (x, y) has a local minimum, or a local maximum, or neither, at (9, −4). A good way to
determine the behaviour of f (x, y) for (x, y) near (9, −4) is to make the change of variables 11

x = 9 + Δx y = −4 + Δy

and study the behaviour of f for Δx and Δy near zero.


\[ f(9 + \Delta x, -4 + \Delta y) = (9 + \Delta x)^2 + 3 (9 + \Delta x)(-4 + \Delta y) + 3 (-4 + \Delta y)^2 - 6 (9 + \Delta x) - 3 (-4 + \Delta y) - 6 \]
\[ = (\Delta x)^2 + 3 \Delta x\, \Delta y + 3 (\Delta y)^2 - 27 \]

And a good way to study the sign of quadratic expressions like $(\Delta x)^2 + 3 \Delta x\, \Delta y + 3 (\Delta y)^2$ is to complete the square. So far
you have probably just completed the square for quadratic expressions that involve only a single variable. For example
\[ x^2 + 3x + 3 = \Big( x + \frac{3}{2} \Big)^2 - \frac{9}{4} + 3 \]
When there are two variables around, like $\Delta x$ and $\Delta y$, you can just pretend that one of them is a constant and complete the
square as before. For example, if you pretend that $\Delta y$ is a constant,
\[ (\Delta x)^2 + 3 \Delta x\, \Delta y + 3 (\Delta y)^2 = \Big( \Delta x + \frac{3}{2} \Delta y \Big)^2 + \Big( 3 - \frac{9}{4} \Big) (\Delta y)^2 = \Big( \Delta x + \frac{3}{2} \Delta y \Big)^2 + \frac{3}{4} (\Delta y)^2 \]

To this point, we have expressed


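\[ f(9 + \Delta x, -4 + \Delta y) = \Big( \Delta x + \frac{3}{2} \Delta y \Big)^2 + \frac{3}{4} (\Delta y)^2 - 27 \]
As the smallest values of $\big( \Delta x + \frac{3}{2} \Delta y \big)^2$ and $\frac{3}{4} (\Delta y)^2$ are both zero, we have that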

f (x, y) = f (9 + Δx , −4 + Δy) ≥ −27 = f (9, −4)

for all (x, y) so that (9, −4) is both a local minimum and a global minimum for f .

You have already encountered single variable functions that have a critical point which is neither a local max nor a local min. See
Example 3.5.9 in the CLP-1 text. Here are a couple of examples which show that this can also happen for functions of two
variables. We'll start with the simplest possible such example.

 Example 2.9.13. $f(x, y) = x^2 - y^2$

The first partial derivatives of $f(x, y) = x^2 - y^2$ are $f_x(x, y) = 2x$ and $f_y(x, y) = -2y$. So the only critical point of this
function is $(0, 0)$. Is this a local minimum or maximum? Well let's start with $(x, y)$ at $(0, 0)$ and then move $(x, y)$ away from
$(0, 0)$ and see if $f(x, y)$ gets bigger or smaller. At the origin $f(0, 0) = 0$. Of course we can move $(x, y)$ away from $(0, 0)$ in
many different directions.

First consider moving $(x, y)$ along the $x$-axis. Then $(x, y) = (x, 0)$ and $f(x, y) = f(x, 0) = x^2$. So when we start with
$x = 0$ and then increase $x$, the value of the function $f$ increases, which means that $(0, 0)$ cannot be a local maximum for
$f$.

Next let's move $(x, y)$ away from $(0, 0)$ along the $y$-axis. Then $(x, y) = (0, y)$ and $f(x, y) = f(0, y) = -y^2$. So when we
start with $y = 0$ and then increase $y$, the value of the function $f$ decreases, which means that $(0, 0)$ cannot be a local
minimum for $f$.
So moving away from $(0, 0)$ in one direction causes the value of $f$ to increase, while moving away from $(0, 0)$ in a second
direction causes the value of $f$ to decrease. Consequently $(0, 0)$ is neither a local minimum nor a local maximum for $f$. It is called a
saddle point, because the graph of $f$ looks like a saddle. (The full definition of “saddle point” is given immediately after this
example.) Here are some figures showing the graph of $f$.

The figure below shows some level curves of f . Observe from the level curves that
f increases as you leave (0, 0) walking along the x axis
f decreases as you leave (0, 0) walking along the y axis

Approximately speaking, if a critical point (a, b) is neither a local minimum nor a local maximum, then it is a saddle point. For
(a, b) to not be a local minimum, f has to take values bigger than f (a, b) at some points nearby (a, b). For (a, b) to not be a local

maximum, f has to take values smaller than f (a, b) at some points nearby (a, b). Writing this more mathematically we get the
following definition.

 Definition 2.9.14

The critical point (a, b) is called a saddle point for the function f (x, y) if, for each r > 0,
there is at least one point (x, y), within a distance r of (a, b), for which f (x, y) > f (a, b) and
there is at least one point (x, y), within a distance r of (a, b), for which f (x, y) < f (a, b).

Here is another example of a saddle point. This time we have to work a bit to see it.

 Example 2.9.15. $f(x, y) = x^2 - 2xy - y^2 + 4y - 2$

Consider $f(x, y) = x^2 - 2xy - y^2 + 4y - 2$. The gradient of $f$ is
\[ \nabla f(x, y) = (2x - 2y)\, \hat{\imath} + (-2x - 2y + 4)\, \hat{\jmath} \]

So (x, y) is a critical point of f if and only if


2x − 2y = 0

−2x − 2y = −4

The first equation gives that x = y. Substituting y = x into the second equation gives

−2y − 2y = −4 ⟹ x =y =1

So f (x, y) has precisely one critical point, namely (1, 1).


To determine if f (x, y) has a local minimum, or a local maximum, or neither, at (1, 1), we proceed as in Example 2.9.12. We
make the change of variables

x = 1 + Δx y = 1 + Δy

to give

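\[ f(1 + \Delta x, 1 + \Delta y) = (1 + \Delta x)^2 - 2 (1 + \Delta x)(1 + \Delta y) - (1 + \Delta y)^2 + 4 (1 + \Delta y) - 2 = (\Delta x)^2 - 2 \Delta x\, \Delta y - (\Delta y)^2 \]

Completing the square,
\[ f(1 + \Delta x, 1 + \Delta y) = (\Delta x)^2 - 2 \Delta x\, \Delta y - (\Delta y)^2 = (\Delta x - \Delta y)^2 - 2 (\Delta y)^2 \]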

Notice that $f$ has now been written as the difference of two squares, much like the $f$ in the saddle point Example 2.9.13.
If $\Delta x$ and $\Delta y$ are such that the first square $(\Delta x - \Delta y)^2$ is nonzero, but the second square $(\Delta y)^2$ is zero, then
$f(1 + \Delta x, 1 + \Delta y) = (\Delta x - \Delta y)^2 > 0 = f(1, 1)$. That is, whenever $\Delta y = 0$ and $\Delta x \ne \Delta y$, then
\[ f(1 + \Delta x, 1 + \Delta y) = (\Delta x - \Delta y)^2 > 0 = f(1, 1). \]
On the other hand, if $\Delta x$ and $\Delta y$ are such that the first square $(\Delta x - \Delta y)^2$ is zero but the second square $(\Delta y)^2$ is
nonzero, then $f(1 + \Delta x, 1 + \Delta y) = -2 (\Delta y)^2 < 0 = f(1, 1)$. That is, whenever $\Delta x = \Delta y \ne 0$, then
\[ f(1 + \Delta x, 1 + \Delta y) = -2 (\Delta y)^2 < 0 = f(1, 1). \]
So
$f(x, y) > f(1, 1)$ at all points on the blue line in the figure above, and
$f(x, y) < f(1, 1)$ at all points on the red line.
We conclude that $(1, 1)$ is the only critical point for $f(x, y)$, and furthermore that it is a saddle point.

The above three examples show that we can find all critical points of quadratic functions of two variables. We can also classify
each critical point as either a minimum, a maximum or a saddle point.
Of course not every function is quadratic. But by using the quadratic approximation 2.6.12 we can apply the same ideas much more
generally. Suppose that (a, b) is a critical point of some function f (x, y). For Δx and Δy small, the quadratic approximation
2.6.12 gives

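\[ f(a + \Delta x, b + \Delta y) \approx f(a, b) + f_x(a, b)\, \Delta x + f_y(a, b)\, \Delta y + \frac{1}{2} \big\{ f_{xx}(a, b)\, \Delta x^2 + 2 f_{xy}(a, b)\, \Delta x \Delta y + f_{yy}(a, b)\, \Delta y^2 \big\} \]
\[ = f(a, b) + \frac{1}{2} \big\{ f_{xx}(a, b)\, \Delta x^2 + 2 f_{xy}(a, b)\, \Delta x \Delta y + f_{yy}(a, b)\, \Delta y^2 \big\} \tag{$*$} \]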

since (a, b) is a critical point so that fx (a, b) = fy (a, b) = 0. Then using the technique of Examples 2.9.12 and 2.9.15, we get
12
(details below).

 Theorem 2.9.16. Second Derivative Test

Let $r > 0$ and assume that all second order derivatives of the function $f(x, y)$ are continuous at all points $(x, y)$ that are within
a distance $r$ of $(a, b)$. Assume that $f_x(a, b) = f_y(a, b) = 0$. Define
\[ D(x, y) = f_{xx}(x, y)\, f_{yy}(x, y) - f_{xy}(x, y)^2 \]
It is called the discriminant of $f$. Then
if $D(a, b) > 0$ and $f_{xx}(a, b) > 0$, then $f(x, y)$ has a local minimum at $(a, b)$,
if $D(a, b) > 0$ and $f_{xx}(a, b) < 0$, then $f(x, y)$ has a local maximum at $(a, b)$,
if $D(a, b) < 0$, then $f(x, y)$ has a saddle point at $(a, b)$, but
if $D(a, b) = 0$, then we cannot draw any conclusions without more work.
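The four cases of the theorem amount to a small decision procedure. The sketch below is one way to write it down, assuming the three second order partial derivatives have already been evaluated at a critical point; the function name `classify` and the returned strings are labels chosen here.

```python
def classify(fxx, fyy, fxy):
    """Apply the second derivative test at a critical point.

    fxx, fyy, fxy are the second order partial derivatives
    evaluated at the critical point (a, b).
    """
    D = fxx * fyy - fxy**2  # the discriminant D(a, b)
    if D > 0:
        # D > 0 forces fxx != 0, so exactly one of these two cases applies.
        return "local minimum" if fxx > 0 else "local maximum"
    if D < 0:
        return "saddle point"
    return "inconclusive"

print(classify(2, 2, 0))    # second derivatives of x^2 + y^2 at (0, 0)
print(classify(2, -2, 0))   # second derivatives of x^2 - y^2 at (0, 0)
```

With floating point inputs an exact comparison of `D` with zero is fragile, so in numerical work one would typically treat very small values of `D` as inconclusive.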

“Proof”
We are putting quotation marks around the word “Proof”, because we are not going to justify the fact that it suffices to analyse
the quadratic approximation in equation $(*)$. Let's temporarily suppress the arguments $(a, b)$. If $f_{xx}(a, b) \ne 0$, then by
completing the square we can write
\[ f_{xx}\, \Delta x^2 + 2 f_{xy}\, \Delta x \Delta y + f_{yy}\, \Delta y^2 = f_{xx} \Big( \Delta x + \frac{f_{xy}}{f_{xx}} \Delta y \Big)^2 + \Big( f_{yy} - \frac{f_{xy}^2}{f_{xx}} \Big) \Delta y^2 \]
\[ = \frac{1}{f_{xx}} \Big\{ (f_{xx}\, \Delta x + f_{xy}\, \Delta y)^2 + (f_{xx} f_{yy} - f_{xy}^2)\, \Delta y^2 \Big\} \]
Similarly, if $f_{yy}(a, b) \ne 0$,
\[ f_{xx}\, \Delta x^2 + 2 f_{xy}\, \Delta x \Delta y + f_{yy}\, \Delta y^2 = \frac{1}{f_{yy}} \Big\{ (f_{xy}\, \Delta x + f_{yy}\, \Delta y)^2 + (f_{xx} f_{yy} - f_{xy}^2)\, \Delta x^2 \Big\} \]
Note that this algebra breaks down if $f_{xx}(a, b) = f_{yy}(a, b) = 0$. We'll deal with that case shortly. More importantly, note that
if $(f_{xx} f_{yy} - f_{xy}^2) > 0$ then both $f_{xx}$ and $f_{yy}$ must be nonzero and of the same sign and furthermore, whenever $\Delta x$ or $\Delta y$
are nonzero,
\[ \Big\{ (f_{xx}\, \Delta x + f_{xy}\, \Delta y)^2 + (f_{xx} f_{yy} - f_{xy}^2)\, \Delta y^2 \Big\} > 0 \quad \text{and} \quad \Big\{ (f_{xy}\, \Delta x + f_{yy}\, \Delta y)^2 + (f_{xx} f_{yy} - f_{xy}^2)\, \Delta x^2 \Big\} > 0 \]
so that, recalling $(*)$,

if $f_{xx}(a, b) > 0$, then $(a, b)$ is a local minimum and
if $f_{xx}(a, b) < 0$, then $(a, b)$ is a local maximum.
If $(f_{xx} f_{yy} - f_{xy}^2) < 0$ and $f_{xx}$ is nonzero then
\[ \Big\{ (f_{xx}\, \Delta x + f_{xy}\, \Delta y)^2 + (f_{xx} f_{yy} - f_{xy}^2)\, \Delta y^2 \Big\} \]
is strictly positive whenever $\Delta x \ne 0$, $\Delta y = 0$ and is strictly negative whenever $f_{xx}\, \Delta x + f_{xy}\, \Delta y = 0$, $\Delta y \ne 0$, so that
$(a, b)$ is a saddle point. Similarly, $(a, b)$ is also a saddle point if $(f_{xx} f_{yy} - f_{xy}^2) < 0$ and $f_{yy}$ is nonzero.

Finally, if $f_{xy} \ne 0$ and $f_{xx} = f_{yy} = 0$, then
\[ f_{xx}\, \Delta x^2 + 2 f_{xy}\, \Delta x\, \Delta y + f_{yy}\, \Delta y^2 = 2 f_{xy}\, \Delta x\, \Delta y \]
is strictly positive for one sign of $\Delta x\, \Delta y$ and is strictly negative for the other sign of $\Delta x\, \Delta y$. So $(a, b)$ is again a saddle
point.

You might wonder why, in the local maximum/local minimum cases of Theorem 2.9.16, $f_{xx}(a, b)$ appears rather than $f_{yy}(a, b)$.
The answer is only that $x$ is before $y$ in the alphabet 13. You can use $f_{yy}(a, b)$ just as well as $f_{xx}(a, b)$. The reason is that if
$D(a, b) > 0$ (as in the first two bullets of the theorem), then because $D(a, b) = f_{xx}(a, b)\, f_{yy}(a, b) - f_{xy}(a, b)^2 > 0$, we
necessarily have $f_{xx}(a, b)\, f_{yy}(a, b) > 0$ so that $f_{xx}(a, b)$ and $f_{yy}(a, b)$ must have the same sign: either both are positive or both
are negative.
You might also wonder why we cannot draw any conclusions when D(a, b) = 0 and what happens then. The second derivative test
for functions of two variables was derived in precisely the same way as the second derivative test for functions of one variable is
derived — you approximate the function by a polynomial that is of degree two in (x − a), (y − b) and then you analyze the
behaviour of the quadratic polynomial near (a, b). For this to work, the contributions to f (x, y) from terms that are of degree two
in (x − a), (y − b) had better be bigger than the contributions to f (x, y) from terms that are of degree three and higher in (x − a),
(y − b) when (x − a), (y − b) are really small. If this is not the case, for example when the terms in f (x, y) that are of degree two

in (x − a), (y − b) all have coefficients that are exactly zero, the analysis will certainly break down. That's exactly what happens
when D(a, b) = 0. Here are some examples. The functions
\[ f_1(x, y) = x^4 + y^4 \qquad f_2(x, y) = -x^4 - y^4 \]
\[ f_3(x, y) = x^3 + y^3 \qquad f_4(x, y) = x^4 - y^4 \]

all have (0, 0) as the only critical point and all have D(0, 0) = 0. The first, f1 has its minimum there. The second, f2 , has its
maximum there. The third and fourth have a saddle point there.
Here are sketches of some level curves for each of these four functions (with all renamed to simply f ).

 Example 2.9.17. $f(x, y) = 2x^3 - 6xy + y^2 + 4y$

Find and classify all critical points of $f(x, y) = 2x^3 - 6xy + y^2 + 4y$.

Solution
Thinking a little way ahead, to find the critical points we will need the gradient and to apply the second derivative test of
Theorem 2.9.16 we will need all second order partial derivatives. So we need all partial derivatives of order up to two. Here
they are.

\[ f = 2x^3 - 6xy + y^2 + 4y \]
\[ f_x = 6x^2 - 6y \qquad f_{xx} = 12x \qquad f_{xy} = -6 \]
\[ f_y = -6x + 2y + 4 \qquad f_{yy} = 2 \qquad f_{yx} = -6 \]

(Of course, $f_{xy}$ and $f_{yx}$ have to be the same. It is still useful to compute both, as a way to catch some mechanical errors.)
We have already found, in Example 2.9.7, that the critical points are (1, 1),  (2, 4). The classification is

| critical point | $f_{xx} f_{yy} - f_{xy}^2$ | $f_{xx}$ | type |
|---|---|---|---|
| $(1, 1)$ | $12 \times 2 - (-6)^2 < 0$ | | saddle point |
| $(2, 4)$ | $24 \times 2 - (-6)^2 > 0$ | $24$ | local min |

We were able to leave the $f_{xx}$ entry in the top row blank, because
we knew that $f_{xx}(1, 1)\, f_{yy}(1, 1) - f_{xy}^2(1, 1) < 0$, and
we knew, from Theorem 2.9.16, that $f_{xx}(1, 1)\, f_{yy}(1, 1) - f_{xy}^2(1, 1) < 0$, by itself, was enough to ensure that $(1, 1)$ was a
saddle point.
Here is a sketch of some level curves of our f (x, y).

They are not needed to answer this question, but can give you some idea as to what the graph of f looks like.

 Example 2.9.18. f (x, y) = xy(5x + y − 15)


Find and classify all critical points of f (x, y) = xy(5x + y − 15).
Solution
We have already computed the first order partial derivatives

fx (x, y) = y(10x + y − 15) fy (x, y) = x(5x + 2y − 15)

of f (x, y) in Example 2.9.8. Again, to classify the critical points we need the second order partial derivatives. They are

fxx (x, y) = 10y

fyy (x, y) = 2x

fxy (x, y) = (1)(10x + y − 15) + y(1) = 10x + 2y − 15

fyx (x, y) = (1)(5x + 2y − 15) + x(5) = 10x + 2y − 15

(Once again, we have computed both $f_{xy}$ and $f_{yx}$ to guard against mechanical errors.) We have already found, in Example
2.9.8, that the critical points are $(0, 0)$, $(0, 15)$, $(3, 0)$ and $(1, 5)$. The classification is

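| critical point | $f_{xx} f_{yy} - f_{xy}^2$ | $f_{xx}$ | type |
|---|---|---|---|
| $(0, 0)$ | $0 \times 0 - (-15)^2 < 0$ | | saddle point |
| $(0, 15)$ | $150 \times 0 - 15^2 < 0$ | | saddle point |
| $(3, 0)$ | $0 \times 6 - 15^2 < 0$ | | saddle point |
| $(1, 5)$ | $50 \times 2 - 5^2 = 75 > 0$ | $50$ | local min |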

Here is a sketch of some level curves of our f (x, y). f is negative in the shaded regions and f is positive in the unshaded
regions.

Again this is not needed to answer this question, but can give you some idea as to what the graph of f looks like.

 Example 2.9.19

Find and classify all of the critical points of $f(x, y) = x^3 + xy^2 - 3x^2 - 4y^2 + 4$.

Solution
We know the drill now. We start by computing all of the partial derivatives of $f$ up to order 2.
\[ f = x^3 + xy^2 - 3x^2 - 4y^2 + 4 \]
\[ f_x = 3x^2 + y^2 - 6x \qquad f_{xx} = 6x - 6 \qquad f_{xy} = 2y \]
\[ f_y = 2xy - 8y \qquad f_{yy} = 2x - 8 \qquad f_{yx} = 2y \]

The critical points are then the solutions of $f_x = 0$, $f_y = 0$. That is
\[ f_x = 3x^2 + y^2 - 6x = 0 \tag{E1} \]
\[ f_y = 2y (x - 4) = 0 \tag{E2} \]

The second equation, $2y (x - 4) = 0$, is satisfied if and only if at least one of the two equations $y = 0$ and $x = 4$ is satisfied.
When $y = 0$, equation (E1) forces $x$ to obey
\[ 0 = 3x^2 + 0^2 - 6x = 3x (x - 2) \]
so that $x = 0$ or $x = 2$.
When $x = 4$, equation (E1) forces $y$ to obey
\[ 0 = 3 \times 4^2 + y^2 - 6 \times 4 = 24 + y^2 \]
which is impossible.
So, there are two critical points: $(0, 0)$, $(2, 0)$. Here is a table that classifies the critical points.

| critical point | $f_{xx} f_{yy} - f_{xy}^2$ | $f_{xx}$ | type |
|---|---|---|---|
| $(0, 0)$ | $(-6) \times (-8) - 0^2 > 0$ | $-6 < 0$ | local max |
| $(2, 0)$ | $6 \times (-4) - 0^2 < 0$ | | saddle point |

 Example 2.9.20

A manufacturer wishes to make an open rectangular box of given volume V using the least possible material. Find the design
specifications.
Solution
Denote by x, y and z, the length, width and height, respectively, of the box.

The box has two sides of area xz, two sides of area yz and a bottom of area xy. So the total surface area of material used is

S = 2xz + 2yz + xy

However the three dimensions x, y and z are not independent. The requirement that the box have volume V imposes the
constraint

xyz = V

We can use this constraint to eliminate one variable. Since $z$ is at the end of the alphabet (poor $z$), we eliminate $z$ by
substituting $z = \frac{V}{xy}$. So we have to find the values of $x$ and $y$ that minimize the function
\[ S(x, y) = \frac{2V}{y} + \frac{2V}{x} + xy \]
Let's start by finding the critical points of $S$. Since
\[ S_x(x, y) = -\frac{2V}{x^2} + y \qquad S_y(x, y) = -\frac{2V}{y^2} + x \]
$(x, y)$ is a critical point if and only if
\[ x^2 y = 2V \tag{E1} \]
\[ x y^2 = 2V \tag{E2} \]
Solving (E1) for $y$ gives $y = \frac{2V}{x^2}$. Substituting this into (E2) gives
\[ x\, \frac{4V^2}{x^4} = 2V \implies x^3 = 2V \implies x = \sqrt[3]{2V} \quad \text{and} \quad y = \frac{2V}{(2V)^{2/3}} = \sqrt[3]{2V} \]

As there is only one critical point, we would expect it to give the minimum 14. But let's use the second derivative test to verify
that at least the critical point is a local minimum. The various second partial derivatives are
\[ S_{xx}(x, y) = \frac{4V}{x^3} \qquad S_{xx}\big( \sqrt[3]{2V}, \sqrt[3]{2V} \big) = 2 \]
\[ S_{xy}(x, y) = 1 \qquad S_{xy}\big( \sqrt[3]{2V}, \sqrt[3]{2V} \big) = 1 \]
\[ S_{yy}(x, y) = \frac{4V}{y^3} \qquad S_{yy}\big( \sqrt[3]{2V}, \sqrt[3]{2V} \big) = 2 \]

So
\[ S_{xx}\big( \sqrt[3]{2V}, \sqrt[3]{2V} \big)\, S_{yy}\big( \sqrt[3]{2V}, \sqrt[3]{2V} \big) - S_{xy}\big( \sqrt[3]{2V}, \sqrt[3]{2V} \big)^2 = 3 > 0 \]
\[ S_{xx}\big( \sqrt[3]{2V}, \sqrt[3]{2V} \big) = 2 > 0 \]
and, by Theorem 2.9.16, $\big( \sqrt[3]{2V}, \sqrt[3]{2V} \big)$ is a local minimum and the desired dimensions are
\[ x = y = \sqrt[3]{2V} \qquad z = \sqrt[3]{\frac{V}{4}} \]

Note that our solution has x = y. That's a good thing — the function S(x, y) is symmetric in x and y. Because the box has no
top, the symmetry does not extend to z.
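The box dimensions can be checked numerically for a sample volume. The sketch below assumes $V = 4$ (a value picked here only because it makes $\sqrt[3]{2V} = 2$ and $\sqrt[3]{V/4} = 1$); the names `surface`, `x_star`, `z_star`, `S_min` and `nearby` are illustrative.

```python
def surface(x, y, V):
    """Surface area S(x, y) of the open box after eliminating z = V/(x*y)."""
    return 2 * V / y + 2 * V / x + x * y

V = 4.0                      # hypothetical sample volume
x_star = (2 * V) ** (1 / 3)  # x = y = cube root of 2V
z_star = (V / 4) ** (1 / 3)  # z = cube root of V/4

S_min = surface(x_star, x_star, V)

# Surface areas at a few nearby points, all of which should be at least S_min.
nearby = [
    surface(x_star + dx, x_star + dy, V)
    for dx in (-0.1, 0.0, 0.1)
    for dy in (-0.1, 0.0, 0.1)
]

print(x_star, z_star, S_min, min(nearby))
```

Checking a handful of neighbouring points is only a spot check of the local minimum, not a replacement for the second derivative test above.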

Absolute Minima and Maxima


Of course a local maximum or minimum of a function need not be the absolute maximum or minimum. We'll now consider how to
find the absolute maximum and minimum. Let's start by reviewing how one finds the absolute maximum and minimum of a
function of one variable on an interval.
For concreteness, let's suppose that we want to find the extremal 15 values of a function f (x) on the interval 0 ≤ x ≤ 1. If an
extremal value is attained at some x = a which is in the interior of the interval, i.e. if 0 < a < 1, then a is also a local maximum
or minimum and so has to be a critical point of f . But if an extremal value is attained at a boundary point a of the interval, i.e. if
$a = 0$ or $a = 1$, then $a$ need not be a critical point of $f$. This happens, for example, when $f(x) = x$. The largest value of $f(x)$ on
the interval $0 \le x \le 1$ is $1$ and is attained at $x = 1$, but $f'(x) = 1$ is never zero, so that $f$ has no critical points.

So to find the maximum and minimum of the function $f(x)$ on the interval $[0, 1]$, you
1. build up a list of all candidate points $0 \le a \le 1$ at which the maximum or minimum could be attained, by finding all $a$'s for
which either
   1. $0 < a < 1$ and $f'(a) = 0$ or
   2. $0 < a < 1$ and $f'(a)$ does not exist 16 or
   3. $a$ is a boundary point, i.e. $a = 0$ or $a = 1$,
2. and then you evaluate $f(a)$ at each $a$ on the list of candidates. The biggest of these candidate values of $f(a)$ is the absolute
maximum and the smallest of these candidate values is the absolute minimum.
The procedure for finding the maximum and minimum of a function of two variables, $f(x, y)$ in a set like, for example, the unit
disk $x^2 + y^2 \le 1$, is similar. You again
1. build up a list of all candidate points $(a, b)$ in the set at which the maximum or minimum could be attained, by finding all
$(a, b)$'s for which either 17
   1. $(a, b)$ is in the interior of the set (for our example, $a^2 + b^2 < 1$) and $f_x(a, b) = f_y(a, b) = 0$ or
   2. $(a, b)$ is in the interior of the set and $f_x(a, b)$ or $f_y(a, b)$ does not exist or
   3. $(a, b)$ is a boundary point 18 (for our example, $a^2 + b^2 = 1$), and could give the maximum or minimum on the boundary (more about this shortly),
2. and then you evaluate $f(a, b)$ at each $(a, b)$ on the list of candidates. The biggest of these candidate values of $f(a, b)$ is the
absolute maximum and the smallest of these candidate values is the absolute minimum.
The boundary of a set, like $x^2 + y^2 \le 1$, in $\mathbb{R}^2$ is a curve, like $x^2 + y^2 = 1$. This curve is a one dimensional set, meaning that it is
like a deformed $x$-axis. We can find the maximum and minimum of $f(x, y)$ on this curve by converting $f(x, y)$ into a function of
one variable (on the curve) and using the standard function of one variable techniques. This is best explained by some examples.

 Example 2.9.21

Find the maximum and minimum of $T(x, y) = (x + y)\, e^{-x^2 - y^2}$ on the region defined by $x^2 + y^2 \le 1$ (i.e. on the unit disk).

Solution
Let's follow our checklist. First critical points, then points where the partial derivatives don't exist, and finally the boundary.
Interior Critical Points: If $T$ takes its maximum or minimum value at a point in the interior, $x^2 + y^2 < 1$, then that point must
be either a critical point of $T$ or a singular point of $T$. To find the critical points we compute the first order derivatives.
\[ T_x(x, y) = (1 - 2x^2 - 2xy)\, e^{-x^2 - y^2} \qquad T_y(x, y) = (1 - 2xy - 2y^2)\, e^{-x^2 - y^2} \]
Because the exponential $e^{-x^2 - y^2}$ is never zero, the critical points are the solutions of
\[ T_x = 0 \iff 2x (x + y) = 1 \]
\[ T_y = 0 \iff 2y (x + y) = 1 \]
As both $2x (x + y)$ and $2y (x + y)$ are nonzero, we may divide the two equations, which gives $\frac{x}{y} = 1$, forcing $x = y$.
Substituting this into either equation gives $2x (2x) = 1$ so that $x = y = \pm \frac{1}{2}$.

So the only critical points are $\big( \frac{1}{2}, \frac{1}{2} \big)$ and $\big( -\frac{1}{2}, -\frac{1}{2} \big)$. Both are in $x^2 + y^2 < 1$.

Singular points: In this problem, there are no singular points.


Boundary: Points on the boundary satisfy $x^2 + y^2 = 1$. That is they lie on a circle. We may use the figure below to express
$x = \cos t$ and $y = \sin t$, in terms of the angle $t$. This will make the formula for $T$ on the boundary quite a bit easier to deal
with. On the boundary,
\[ T = (\cos t + \sin t)\, e^{-\cos^2 t - \sin^2 t} = (\cos t + \sin t)\, e^{-1} \]
As all $t$'s are allowed, this function takes its max and min at zeroes of
\[ \frac{dT}{dt} = (-\sin t + \cos t)\, e^{-1} \]
That is, $(\cos t + \sin t)\, e^{-1}$ takes its max and min
when $\sin t = \cos t$,
that is, when $x = y$ and $x^2 + y^2 = 1$,
which forces $x^2 + x^2 = 1$ and hence $x = y = \pm \frac{1}{\sqrt{2}}$.

All together, we have the following candidates for max and min, with the max and min indicated.
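| point | $\big( \frac{1}{2}, \frac{1}{2} \big)$ | $\big( -\frac{1}{2}, -\frac{1}{2} \big)$ | $\big( \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}} \big)$ | $\big( -\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}} \big)$ |
|---|---|---|---|---|
| value of $T$ | $\frac{1}{\sqrt{e}} \approx 0.61$ | $-\frac{1}{\sqrt{e}}$ | $\frac{\sqrt{2}}{e} \approx 0.52$ | $-\frac{\sqrt{2}}{e}$ |
| | max | min | | |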

The following sketch shows all of the critical points. It is a good idea to make such a sketch so that you don't accidentally
include a critical point that is outside of the allowed region.
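The candidate values can be compared with a brute force sample of $T$ over the disk. This is a rough numerical sketch, assuming a square grid with spacing $0.01$ restricted to $x^2 + y^2 \le 1$; the grid size `N` and the names `T`, `t_max`, `t_min` and `sampled` are choices made here.

```python
import math

def T(x, y):
    return (x + y) * math.exp(-x**2 - y**2)

# Candidate values from the interior critical points.
t_max = T(0.5, 0.5)     # equals 1/sqrt(e)
t_min = T(-0.5, -0.5)   # equals -1/sqrt(e)

# Sample T on a grid of points lying inside the closed unit disk.
N = 200
coords = [-1 + 2 * i / N for i in range(N + 1)]
sampled = [T(x, y) for x in coords for y in coords if x**2 + y**2 <= 1]

print(t_max, max(sampled))
print(t_min, min(sampled))
```

No sampled value should exceed `t_max` or fall below `t_min`, which is consistent with the table above.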

In the last example, we analyzed the behaviour of $f$ on the boundary of the region of interest by using the parametrization
$x = \cos t$, $y = \sin t$ of the circle $x^2 + y^2 = 1$. Sometimes using this parametrization is not so clean. And worse, some curves don't
have such a simple parametrization. In the next problem we'll look at the boundary a little differently.

 Example 2.9.22

Find the maximum and minimum values of $f(x, y) = x^3 + xy^2 - 3x^2 - 4y^2 + 4$ on the disk $x^2 + y^2 \le 1$.

Solution
Again, we first find all critical points, then find all singular points and, finally, analyze the boundary.
Interior Critical Points: If $f$ takes its maximum or minimum value at a point in the interior, $x^2 + y^2 < 1$, then that point must
be either a critical point of $f$ or a singular point of $f$. To find the critical points 19 we compute the first order derivatives.
\[ f_x = 3x^2 + y^2 - 6x \qquad f_y = 2xy - 8y \]
The critical points are the solutions of
\[ f_x = 3x^2 + y^2 - 6x = 0 \tag{E1} \]
\[ f_y = 2y (x - 4) = 0 \tag{E2} \]
The second equation, $2y (x - 4) = 0$, is satisfied if and only if at least one of the two equations $y = 0$ and $x = 4$ is satisfied.
When $y = 0$, equation (E1) forces $x$ to obey
\[ 0 = 3x^2 + 0^2 - 6x = 3x (x - 2) \]
so that $x = 0$ or $x = 2$.
When $x = 4$, equation (E1) forces $y$ to obey
\[ 0 = 3 \times 4^2 + y^2 - 6 \times 4 = 24 + y^2 \]
which is impossible.
So, there are only two critical points: $(0, 0)$, $(2, 0)$.
Singular points: In this problem, there are no singular points.
Boundary: On the boundary, $x^2 + y^2 = 1$, we could again take advantage of having a circle and write $x = \cos t$ and $y = \sin t$.
But, for practice, we'll use another method 20. We know that $(x, y)$ satisfies $x^2 + y^2 = 1$, and hence $y^2 = 1 - x^2$. Examining
the formula for $f(x, y)$, we see that it contains only even 21 powers of $y$, so we can eliminate $y$ by substituting $y^2 = 1 - x^2$
into the formula.
\[ f = x^3 + x (1 - x^2) - 3x^2 - 4 (1 - x^2) + 4 = x + x^2 \]
The max and min of $x + x^2$ for $-1 \le x \le 1$ must occur either
when $x = -1$ ($\Rightarrow y = f = 0$) or
when $x = +1$ ($\Rightarrow y = 0$, $f = 2$) or
when $0 = \frac{d}{dx} (x + x^2) = 1 + 2x$ ($\Rightarrow x = -\frac{1}{2}$, $y = \pm \sqrt{\frac{3}{4}}$, $f = -\frac{1}{4}$).

Here is a sketch showing all of the points that we have identified.

Note that the point $(2, 0)$ is outside the allowed region [22]. So all together, we have the following candidates for max and min, with the max and min indicated.

point      | $(0, 0)$ | $(-1, 0)$ | $(1, 0)$ | $(-\frac{1}{2}, \pm\frac{\sqrt{3}}{2})$
value of f | $4$      | $0$       | $2$      | $-\frac{1}{4}$
           | max      |           |          | min
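As a quick numerical sanity check of the table above, the following Python sketch evaluates $f$ at the candidate points and compares the result with a brute-force scan of the disk. The polar grid and its resolution (`n_r`, `n_t`) are assumptions made purely for illustration; they are not part of the method used in the example.

```python
import math

def f(x, y):
    return x**3 + x * y**2 - 3 * x**2 - 4 * y**2 + 4

# Candidates from the example; the critical point (2, 0) lies outside the disk.
candidates = [
    (0.0, 0.0),
    (-1.0, 0.0),
    (1.0, 0.0),
    (-0.5, math.sqrt(3) / 2),
    (-0.5, -math.sqrt(3) / 2),
]
candidate_values = [f(x, y) for x, y in candidates]

# Brute-force scan of the closed disk x^2 + y^2 <= 1 in polar coordinates.
n_r, n_t = 200, 720  # hypothetical grid resolution
grid_values = []
for i in range(n_r + 1):
    r = i / n_r
    for j in range(n_t):
        t = 2 * math.pi * j / n_t
        grid_values.append(f(r * math.cos(t), r * math.sin(t)))

print("candidates: max", max(candidate_values), "min", min(candidate_values))
print("grid scan:  max", max(grid_values), "min", min(grid_values))
```

If the analysis is right, the scan should never find a value above $4$ or below $-\frac{1}{4}$, up to rounding.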

 Example 2.9.23

Find the maximum and minimum values of $f(x, y) = xy - x^3y^2$ when $(x, y)$ runs over the square $0 \le x \le 1$, $0 \le y \le 1$.
Solution
As usual, let's examine the critical points, singular points and boundary in turn.
Interior Critical Points: If f takes its maximum or minimum value at a point in the interior, 0 < x < 1, 0 < y < 1, then that
point must be either a critical point of f or a singular point of f . To find the critical points we compute the first order
derivatives.
$$f_x(x, y) = y - 3x^2y^2 \qquad f_y(x, y) = x - 2x^3y$$

The critical points are the solutions of


$$f_x = 0 \iff y(1 - 3x^2y) = 0 \iff y = 0 \text{ or } 3x^2y = 1$$
$$f_y = 0 \iff x(1 - 2x^2y) = 0 \iff x = 0 \text{ or } 2x^2y = 1$$

If $y = 0$, we cannot have $2x^2y = 1$, so we must have $x = 0$.
If $3x^2y = 1$, we cannot have $x = 0$, so we must have $2x^2y = 1$. Dividing gives $1 = \frac{3x^2y}{2x^2y} = \frac{3}{2}$, which is impossible.

So the only critical point in the square is (0, 0). There f = 0.

Singular points: Yet again there are no singular points in this problem.
Boundary: The region is a square, so its boundary consists of its four sides.
First, we look at the part of the boundary with x = 0. On that entire side f = 0.
Next, we look at the part of the boundary with y = 0. On that entire side f = 0.
Next, we look at the part of the boundary with $y = 1$. There $f = f(x, 1) = x - x^3$. To find the maximum and minimum of $f(x, y)$ on the part of the boundary with $y = 1$, we must find the maximum and minimum of $x - x^3$ when $0 \le x \le 1$.
Recall that, in general, the maximum and minimum of a function $h(x)$ on the interval $a \le x \le b$ must occur either at $x = a$ or at $x = b$ or at an $x$ for which either $h'(x) = 0$ or $h'(x)$ does not exist. In this case, $\frac{d}{dx}(x - x^3) = 1 - 3x^2$, so the max and min of $x - x^3$ for $0 \le x \le 1$ must occur
either at $x = 0$, where $f = 0$,
or at $x = \frac{1}{\sqrt{3}}$, where $f = \frac{2}{3\sqrt{3}}$,
or at $x = 1$, where $f = 0$.

Finally, we look at the part of the boundary with $x = 1$. There $f = f(1, y) = y - y^2$. As $\frac{d}{dy}(y - y^2) = 1 - 2y$, the only critical point of $y - y^2$ is at $y = \frac{1}{2}$. So the max and min of $y - y^2$ for $0 \le y \le 1$ must occur
either at $y = 0$, where $f = 0$,
or at $y = \frac{1}{2}$, where $f = \frac{1}{4}$,
or at $y = 1$, where $f = 0$.
All together, we have the following candidates for max and min, with the max and min indicated.

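point      | $(0, 0)$ | $(0, 0 \le y \le 1)$ | $(0 \le x \le 1, 0)$ | $(1, 0)$ | $(1, \frac{1}{2})$ | $(1, 1)$ | $(0, 1)$ | $(\frac{1}{\sqrt{3}}, 1)$
value of f | $0$      | $0$                  | $0$                  | $0$      | $\frac{1}{4}$      | $0$      | $0$      | $\frac{2}{3\sqrt{3}} \approx 0.385$
           | min      | min                  | min                  | min      |                    | min      | min      | max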
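The table can be cross-checked numerically. The sketch below scans a uniform grid on the square (the grid size `n` is an arbitrary choice made for illustration) and compares the extreme grid values with the claimed maximum $\frac{2}{3\sqrt{3}}$ and minimum $0$.

```python
import math

def f(x, y):
    return x * y - x**3 * y**2

n = 300  # hypothetical grid resolution
grid_values = [f(i / n, j / n) for i in range(n + 1) for j in range(n + 1)]

claimed_max = 2 / (3 * math.sqrt(3))  # value of f at (1/sqrt(3), 1)
claimed_min = 0.0                     # value of f on the sides x = 0 and y = 0

print("grid max", max(grid_values), "claimed", claimed_max)
print("grid min", min(grid_values), "claimed", claimed_min)
```

The grid maximum can only approach the claimed maximum from below, because $x = \frac{1}{\sqrt{3}}$ is not a grid point.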

 Example 2.9.24

Find the maximum and minimum values of f (x, y) = xy + 2x + y when (x, y) runs over the triangular region with vertices
(0, 0), (1, 0) and (0, 2). The triangular region is sketched in

Solution
As usual, let's examine the critical points, singular points and boundary in turn.
Interior Critical Points: If f takes its maximum or minimum value at a point in the interior, then that point must be either a
critical point of f or a singular point of f . The critical points are the solutions of

fx (x, y) = y + 2 = 0 fy (x, y) = x + 1 = 0

So there is exactly one critical point, namely (−1, −2). This is well outside the triangle and so is not a candidate for the
location of the max and min.
Singular points: Yet again there are no singular points for this f .
Boundary: The region is a triangle, so its boundary consists of its three sides.
First, we look at the side that runs from (0, 0) to (0, 2). On that entire side x = 0, so that f (0, y) = y. The smallest value
of f on that side is f = 0 at (0, 0) and the largest value of f on that side is f = 2 at (0, 2).
Next, we look at the side that runs from (0, 0) to (1, 0). On that entire side y = 0, so that f (x, 0) = 2x. The smallest value
of f on that side is f = 0 at (0, 0) and the largest value of f on that side is f = 2 at (1, 0).
Finally, we look at the side that runs from (0, 2) to (1, 0). Our first job is to find the equation of the line that contains (0, 2)
and (1, 0). By way of review, we'll find the equation using three different methods.
Method 1: You (probably) learned in high school that any line in the $xy$-plane [23] has equation $y = mx + b$ where $b$ is the $y$ intercept and $m$ is the slope. In this case, the line crosses the $y$ axis at $y = 2$ and so has $y$ intercept $b = 2$. The line passes through $(0, 2)$ and $(1, 0)$ and so, as we see in the figure below, has slope $m = \frac{\Delta y}{\Delta x} = \frac{0 - 2}{1 - 0} = -2$. Thus the side of the triangle that runs from $(0, 2)$ to $(1, 0)$ is $y = 2 - 2x$ with $0 \le x \le 1$.

Method 2: Every line in the $xy$-plane has an equation of the form $ax + by = c$. In this case $(0, 0)$ is not on the line so that $c \ne 0$ and we can divide the equation by $c$, giving $\frac{a}{c}x + \frac{b}{c}y = 1$. Rename $\frac{a}{c} = A$ and $\frac{b}{c} = B$. Thus, because the line does not pass through the origin, it has an equation of the form $Ax + By = 1$, for some constants $A$ and $B$. In order for $(0, 2)$ to lie on the line, $x = 0$, $y = 2$ has to be a solution of $Ax + By = 1$. That is, $Ax\big|_{x=0} + By\big|_{y=2} = 1$, so that $B = \frac{1}{2}$. In order for $(1, 0)$ to lie on the line, $x = 1$, $y = 0$ has to be a solution of $Ax + By = 1$. That is, $Ax\big|_{x=1} + By\big|_{y=0} = 1$, so that $A = 1$. Thus the line has equation $x + \frac{1}{2}y = 1$, or equivalently, $y = 2 - 2x$.

Method 3: The vector from (0, 2) to (1, 0) is ⟨1 − 0 , 0 − 2⟩ = ⟨1, −2⟩ . As we see from the figure above, it is a direction
vector for the line. One point on the line is (0, 2). So a parametric equation for the line (see Equation 1.3.1) is

⟨x − 0 , y − 2⟩ = t ⟨1, −2⟩ or x = t,  y = 2 − 2t

By any of these three methods [24], we have that the side of the triangle that runs from $(0, 2)$ to $(1, 0)$ is $y = 2 - 2x$ with $0 \le x \le 1$. On that side of the triangle

$$f(x, 2 - 2x) = x(2 - 2x) + 2x + (2 - 2x) = -2x^2 + 2x + 2$$

Write $g(x) = -2x^2 + 2x + 2$. The maximum and minimum of $g(x)$ for $0 \le x \le 1$, and hence the maximum and minimum values of $f$ on the hypotenuse of the triangle, must be achieved either at
$x = 0$, where $f(0, 2) = g(0) = 2$, or at
$x = 1$, where $f(1, 0) = g(1) = 2$, or when
$0 = g'(x) = -4x + 2$ so that $x = \frac{1}{2}$, $y = 2 - \frac{2}{2} = 1$ and

$$f\left(\tfrac{1}{2}, 1\right) = g\left(\tfrac{1}{2}\right) = -\frac{2}{4} + \frac{2}{2} + 2 = \frac{5}{2}$$

All together, we have the following candidates for max and min, with the max and min indicated.

point      | $(0, 0)$ | $(0, 2)$ | $(1, 0)$ | $(\frac{1}{2}, 1)$
value of f | $0$      | $2$      | $2$      | $\frac{5}{2}$
           | min      |          |          | max
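Here is a small exact-arithmetic check of this table. It is only a sketch: the way the triangle is sampled (a grid in $x$, with $y$ running from $0$ up to the hypotenuse $y = 2 - 2x$) and the resolution `n` are illustrative assumptions.

```python
from fractions import Fraction

def f(x, y):
    return x * y + 2 * x + y

n = 100  # hypothetical grid resolution (even, so that x = 1/2 is sampled)
best_max = best_min = None
for i in range(n + 1):
    x = Fraction(i, n)
    for j in range(n + 1):
        y = (2 - 2 * x) * Fraction(j, n)  # 0 <= y <= 2 - 2x stays inside the triangle
        v = f(x, y)
        if best_max is None or v > best_max[0]:
            best_max = (v, x, y)
        if best_min is None or v < best_min[0]:
            best_min = (v, x, y)

print("max (value, x, y):", best_max)
print("min (value, x, y):", best_min)
```

Because `Fraction` arithmetic is exact, the sampled maximum and minimum can be compared with $\frac{5}{2}$ at $(\frac{1}{2}, 1)$ and $0$ at $(0, 0)$ without rounding concerns.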

 Example 2.9.25
Find the high and low points of the surface $z = \sqrt{x^2 + y^2}$ with $(x, y)$ varying over the square $|x| \le 1$, $|y| \le 1$.
Solution
The function $f(x, y) = \sqrt{x^2 + y^2}$ has a particularly simple geometric interpretation: it is the distance from the point $(x, y)$ to the origin. So

the minimum of f (x, y) is achieved at the point in the square that is nearest the origin — namely the origin itself. So
(0, 0, 0) is the lowest point on the surface and is at height 0.

The maximum of $f(x, y)$ is achieved at the points in the square that are farthest from the origin, namely the four corners of the square $(\pm 1, \pm 1)$. At those four points $z = \sqrt{2}$. So the highest points on the surface are $(\pm 1, \pm 1, \sqrt{2})$.
Even though we have already answered this question, it will be instructive to see what we would have found if we had followed our usual protocol. The partial derivatives of $f(x, y) = \sqrt{x^2 + y^2}$ are defined for $(x, y) \ne (0, 0)$ and are

$$f_x(x, y) = \frac{x}{\sqrt{x^2 + y^2}} \qquad f_y(x, y) = \frac{y}{\sqrt{x^2 + y^2}}$$

There are no critical points because


$f_x = 0$ only for $x = 0$, and
$f_y = 0$ only for $y = 0$, but
$(0, 0)$ is not a critical point because $f_x$ and $f_y$ are not defined there.

There is one singular point — namely (0, 0). The minimum value of f is achieved at the singular point.
The boundary of the square consists of its four sides. One side is

{(x, y)|x = 1,   − 1 ≤ y ≤ 1}

On this side $f = \sqrt{1 + y^2}$. As $\sqrt{1 + y^2}$ increases with $|y|$, the smallest value of $f$ on that side is $1$ (when $y = 0$) and the largest value of $f$ is $\sqrt{2}$ (when $y = \pm 1$). The same thing happens on the other three sides. The maximum value of $f$ is achieved at the four corners. Note that $f_x$ and $f_y$ are both nonzero at all four corners.
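The geometric answer can also be confirmed by a direct scan of the square. In this sketch the grid size `n` is an arbitrary illustrative choice, and `math.hypot` is used for the distance $\sqrt{x^2 + y^2}$.

```python
import math

n = 200  # hypothetical grid resolution (even, so the origin is a grid point)
points = [(-1 + 2 * i / n, -1 + 2 * j / n) for i in range(n + 1) for j in range(n + 1)]
heights = [(math.hypot(x, y), x, y) for x, y in points]

low = min(heights)   # (height, x, y) at a lowest sampled point
high = max(heights)  # (height, x, y) at a highest sampled point

print("lowest: ", low)
print("highest:", high)
```

The lowest sampled point should be the origin, where the gradient does not exist, and the highest should be one of the corners, where the gradient exists but is nonzero.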

Exercises
Stage 1

 1✳

a. Some level curves of a function $f(x, y)$ are plotted in the $xy$-plane below.

For each of the four statements below, circle the letters of all points in the diagram where the situation applies. For example, if the statement were “These points are on the $y$-axis”, you would circle both P and U, but none of the other letters. You may assume that a local maximum occurs at point T.
(i) $\nabla f$ is zero    P R S T U
(ii) $f$ has a saddle point    P R S T U
(iii) the partial derivative $f_y$ is positive    P R S T U
(iv) the directional derivative of $f$ in the direction $\langle 0, -1 \rangle$ is negative    P R S T U

b. The diagram below shows three “$y$ traces” of a graph $z = F(x, y)$ plotted on $xz$-axes (namely the intersections of the surface $z = F(x, y)$ with the three planes $y = 1.9$, $y = 2$, $y = 2.1$). For each statement below, circle the correct word.

(i) the first order partial derivative $F_x(1, 2)$ is    positive/negative/zero
(ii) $F$ has a critical point at $(2, 2)$    true/false
(iii) the second order partial derivative $F_{xy}(1, 2)$ is    positive/negative/zero

 2
Find the high and low points of the surface $z = \sqrt{x^2 + y^2}$ with $(x, y)$ varying over the square $|x| \le 1$, $|y| \le 1$. Discuss the values of $z_x$, $z_y$ there. Do not evaluate any derivatives in answering this question.

 3

If $t_0$ is a local minimum or maximum of the smooth function $f(t)$ of one variable ($t$ runs over all real numbers) then $f'(t_0) = 0$. Derive an analogous necessary condition for $\vec{x}_0$ to be a local minimum or maximum of the smooth function $g(\vec{x})$ restricted to points on the line $\vec{x} = \vec{a} + t\vec{d}$. The test should involve the gradient of $g(\vec{x})$.

Stage 2

 4✳
Let $z = f(x, y) = (y^2 - x^2)^2$.

1. Make a reasonably accurate sketch of the level curves in the $xy$-plane of $z = f(x, y)$ for $z = 0$, $1$ and $16$. Be sure to show the units on the coordinate axes.
2. Verify that $(0, 0)$ is a critical point for $z = f(x, y)$, and determine from part 1 or directly from the formula for $f(x, y)$ whether $(0, 0)$ is a local minimum, a local maximum or a saddle point.
3. Can you use the Second Derivative Test to determine whether the critical point $(0, 0)$ is a local minimum, a local maximum or a saddle point? Give reasons for your answer.
 5✳
Use the Second Derivative Test to find all values of the constant $c$ for which the function $z = x^2 + cxy + y^2$ has a saddle point at $(0, 0)$.

 6✳

Find and classify all critical points of the function


$$f(x, y) = x^3 - y^3 - 2xy + 6.$$

 7✳

Find all critical points for $f(x, y) = x(x^2 + xy + y^2 - 9)$. Also find out which of these points give local maximum values for $f(x, y)$, which give local minimum values, and which give saddle points.

 8✳

Find the largest and smallest values of $x^2y^2z$ in the part of the plane $2x + y + z = 5$ where $x \ge 0$, $y \ge 0$ and $z \ge 0$. Also find all points where those extreme values occur.

 9
Find and classify all the critical points of $f(x, y) = x^2 + y^2 + x^2y + 4$.

 10 ✳
Find all saddle points, local minima and local maxima of the function
$$f(x, y) = x^3 + x^2 - 2xy + y^2 - x.$$

 11 ✳

For the surface


$$z = f(x, y) = x^3 + xy^2 - 3x^2 - 4y^2 + 4$$

Find and classify [as local maxima, local minima, or saddle points] all critical points of f (x, y).

 12
Find the maximum and minimum values of $f(x, y) = xy - x^3y^2$ when $(x, y)$ runs over the square $0 \le x \le 1$, $0 \le y \le 1$.

 13

The temperature at all points in the disc $x^2 + y^2 \le 1$ is given by $T(x, y) = (x + y)e^{-x^2 - y^2}$. Find the maximum and minimum temperatures at points of the disc.

 14 ✳
1. For the function $z = f(x, y) = x^3 + 3xy + 3y^2 - 6x - 3y - 6$, find and classify [as local maxima, local minima, or saddle points] all critical points of $f(x, y)$.
2. The images below depict level sets $f(x, y) = c$ of the functions in the list at heights $c = 0, 0.1, 0.2, \ldots, 1.9, 2$. Label the pictures with the corresponding function and mark the critical points in each picture. (Note that in some cases, the critical points might not be drawn on the images already. In those cases you should add them to the picture.)
   1. $f(x, y) = (x^2 + y^2 - 1)(x - y) + 1$
   2. $f(x, y) = \sqrt{x^2 + y^2}$
   3. $f(x, y) = y(x + y)(x - y) + 1$
   4. $f(x, y) = x^2 + y^2$

 15 ✳

Let the function


$$f(x, y) = x^3 + 3xy + 3y^2 - 6x - 3y - 6$$

Classify [as local maxima, local minima or saddle points] all critical points of $f(x, y)$.

 16 ✳

Let $h(x, y) = y(4 - x^2 - y^2)$.

1. Find and classify the critical points of $h(x, y)$ as local maxima, local minima or saddle points.
2. Find the maximum and minimum values of $h(x, y)$ on the disk $x^2 + y^2 \le 1$.

 17 ✳
Find the absolute maximum and minimum values of the function $f(x, y) = 5 + 2x - x^2 - 4y^2$ on the rectangular region

$$R = \{(x, y) \mid -1 \le x \le 3,\ -1 \le y \le 1\}$$

 18 ✳

Find the minimum of the function $h(x, y) = -4x - 2y + 6$ on the closed bounded domain defined by $x^2 + y^2 \le 1$.

 19 ✳

Let f (x, y) = xy(x + y − 3).


1. Find all critical points of f , and classify each one as a local maximum, a local minimum, or saddle point.
2. Find the location and value of the absolute maximum and minimum of f on the triangular region x ≥ 0, y ≥ 0, x + y ≤ 8.

 20 ✳
Find and classify the critical points of $f(x, y) = 3x^2y + y^3 - 3x^2 - 3y^2 + 4$.

 21 ✳
Consider the function
$$f(x, y) = 2x^3 - 6xy + y^2 + 4y$$

1. Find and classify all of the critical points of f (x, y).


2. Find the maximum and minimum values of f (x, y) in the triangle with vertices (1, 0), (0, 1) and (1, 1).

 22 ✳

Find all critical points of the function $f(x, y) = x^4 + y^4 - 4xy + 2$, and for each determine whether it is a local minimum, maximum or saddle point.

 23 ✳

Let

f (x, y) = xy(x + 2y − 6)

1. Find every critical point of f (x, y) and classify each one.


2. Let D be the region in the plane between the hyperbola xy = 4 and the line x + 2y − 6 = 0. Find the maximum and
minimum values of f (x, y) on D.

 24 ✳

Find all the critical points of the function


$$f(x, y) = x^4 + y^4 - 4xy$$

defined in the xy-plane. Classify each critical point as a local minimum, maximum or saddle point.

 25 ✳

A metal plate is in the form of a semi-circular disc bounded by the $x$-axis and the upper half of $x^2 + y^2 = 4$. The temperature at the point $(x, y)$ is given by $T(x, y) = \ln(1 + x^2 + y^2) - y$. Find the coldest point on the plate, explaining your steps carefully. (Note: $\ln 2 \approx 0.693$, $\ln 5 \approx 1.609$)

 26 ✳

Find all the critical points of the function


$$f(x, y) = x^3 + xy^2 - x$$

defined in the xy-plane. Classify each critical point as a local minimum, maximum or saddle point. Explain your reasoning.

 27 ✳

Consider the function $g(x, y) = x^2 - 10y - y^2$.

1. Find and classify all critical points of $g$.
2. Find the absolute extrema of $g$ on the bounded region given by

$$x^2 + 4y^2 \le 16, \quad y \le 0$$

 28 ✳

Find and classify all critical points of


$$f(x, y) = x^3 - 3xy^2 - 3x^2 - 3y^2$$

 29 ✳

Find the maximum value of


$$f(x, y) = xy\,e^{-(x^2 + y^2)/2}$$

on the quarter-circle $D = \{(x, y) \mid x^2 + y^2 \le 4,\ x \ge 0,\ y \ge 0\}$.

 30

Equal angle bends are made at equal distances from the two ends of a 100 metre long fence, so that the resulting three segment
fence can be placed along an existing wall to make an enclosure of trapezoidal shape. What is the largest possible area for such
an enclosure?

 31

Find the most economical shape of a rectangular box that has a fixed volume V and that has no top.

Stage 3

 32 ✳

The temperature $T(x, y)$ at a point of the $xy$-plane is given by

$$T(x, y) = 20 - 4x^2 - y^2$$

1. Find the maximum and minimum values of $T(x, y)$ on the disk $D$ defined by $x^2 + y^2 \le 4$.
2. Suppose an ant lives on the disk $D$. If the ant is initially at point $(1, 1)$, in which direction should it move so as to increase its temperature as quickly as possible?
3. Suppose that the ant moves at a velocity $\vec{v} = \langle -2, -1 \rangle$. What is its rate of increase of temperature as it passes through $(1, 1)$?
4. Suppose the ant is constrained to stay on the curve $y = 2 - x^2$. Where should the ant go if it wants to be as warm as possible?

 33 ✳

Consider the function


$$f(x, y) = 3kx^2y + y^3 - 3x^2 - 3y^2 + 4$$

where $k > 0$ is a constant. Find and classify all critical points of $f(x, y)$ as local minima, local maxima, saddle points or points of indeterminate type. Carefully distinguish the cases $k < \frac{1}{2}$, $k = \frac{1}{2}$ and $k > \frac{1}{2}$.

 34 ✳
1. Show that the function $f(x, y) = 2x + 4y + \frac{1}{xy}$ has exactly one critical point in the first quadrant $x > 0$, $y > 0$, and find its value at that point.
2. Use the second derivative test to classify the critical point in part 1.
3. Hence explain why the inequality $2x + 4y + \frac{1}{xy} \ge 6$ is valid for all positive real numbers $x$ and $y$.

 35

An experiment yields data points $(x_i, y_i)$, $i = 1, 2, \cdots, n$. We wish to find the straight line $y = mx + b$ which “best” fits the data. The definition of “best” is “minimizes the root mean square error”, i.e. minimizes $\sum_{i=1}^{n} (mx_i + b - y_i)^2$. Find $m$ and $b$.

1. Life is not (always) one-dimensional and sometimes we have to embrace it.


2. Or perhaps your instructor asked you.
3. Recall that if h < 0 and A ≤ B, then hA ≥ hB. This is because the product of any two negative numbers is positive, so that
h < 0,  A ≤ B ⟹ A−B ≤ 0 ⟹ h(A − B) ≥ 0 ⟹ hA ≥ hB.

4. A very common error of logic that people make is “Affirming the consequent”. The fact that “If P then Q” is true does not imply that “If Q then P” is true. The statement “If he is Shakespeare then he is dead” is true. But concluding from “That man is dead” that “He must be Shakespeare” is just silly.
5. And you also saw, for example in Example 3.6.4 of the CLP-1 text, that critical points that are also inflection points are neither
local maxima nor local minima.
6. We have both types of music here — country and western.
7. This sort of thing is generally illegal.
8. Sorry about the pun.
9. Proof by search engine.
10. And has been used for a long time. It was introduced by the French mathematician Adrien-Marie Legendre, 1752–1833, in 1805, and by the German mathematician and physicist Carl Friedrich Gauss, 1777–1855, in 1809.
11. This is equivalent to translating the graph so that the critical point lies at (0, 0).
12. There are analogous results in higher dimensions that are accessible to people who have learned some linear algebra. They are
derived by diagonalizing the matrix of second derivatives, which is called the Hessian matrix.
13. The shackles of convention are not limited to mathematics. Election ballots often have the candidates listed in alphabetic order.
14. Indeed one can use the facts that 0 < x < ∞, that 0 < y < ∞, and that S → ∞ as x → 0 and as y → 0 and as x → ∞ and as
y → ∞ to prove that the single critical point gives the global minimum.

15. Recall that “extremal value” means “either maximum value or minimum value”.
16. Recall that if $f'(a)$ does not exist, then $a$ is called a singular point of $f$.

17. This is probably a good time to review the statement of Theorem 2.9.2.
18. It should be intuitively obvious from a sketch that the boundary of the disk $x^2 + y^2 \le 1$ is the circle $x^2 + y^2 = 1$. But if you really need a formal definition, here it is. A point $(a, b)$ is on the boundary of a set $S$ if there is a sequence of points in $S$ that converges to $(a, b)$ and there is also a sequence of points in the complement of $S$ that converges to $(a, b)$.
19. We actually found the critical points in Example 2.9.19. But, for the convenience of the reader, we'll repeat that here.
20. Even if you don't believe that “you can't have too many tools”, it is pretty dangerous to have to rely on just one tool.
21. If it contained odd powers too, we could consider the cases $y \ge 0$ and $y \le 0$ separately and substitute $y = \sqrt{1 - x^2}$ in the former case and $y = -\sqrt{1 - x^2}$ in the latter case.

22. We found $(2, 0)$ as a solution to the critical point equations (E1), (E2). That's because, in the course of solving those equations, we ignored the constraint that $x^2 + y^2 \le 1$.

23. To be picky, any line in the xy-plane that is not parallel to the y axis.
24. In the third method, x has just been renamed to t.

This page titled 2.9: Maximum and Minimum Values is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by
Joel Feldman, Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a
detailed edit history is available upon request.

2.10: Lagrange Multipliers
In the last section we had to solve a number of problems of the form “What is the maximum value of the function f on the curve
C ?” In those examples, the curve C was simple enough that we could reduce the problem to finding the maximum of a function of

one variable. For more complicated problems this reduction might not be possible. In this section, we introduce another method for
solving such problems. First some nomenclature.

 Definition 2.10.1
A problem of the form
“Find the maximum and minimum values of the function f (x, y) for (x, y) on the curve g(x, y) = 0. ”
is one type of constrained optimization problem. The function being maximized or minimized, f (x, y), is called the objective
function. The function, g(x, y), whose zero set is the curve of interest, is called the constraint function.

Such problems are quite common. As we said above, we have already encountered them in the last section on absolute maxima and
minima, when we were looking for the extreme values of a function on the boundary of a region. In economics “utility functions”
are used to model the relative “usefulness” or “desirability” or “preference” of various economic choices. For example, a utility
function U (w, κ) might specify the relative level of satisfaction a consumer would get from purchasing a quantity w of wine and κ
of coffee. If the consumer wants to spend $100 and wine costs $20 per unit and coffee costs $5 per unit, then the consumer would
like to maximize U (w, κ) subject to the constraint that 20w + 5κ = 100.
To this point we have always solved such constrained optimization problems either by
solving g(x, y) = 0 for y as a function of x (or for x as a function of y ) or by
parametrizing the curve $g(x, y) = 0$. This means writing all points of the curve in the form $(x(t), y(t))$ for some functions $x(t)$ and $y(t)$. For example we used $x(t) = \cos t$, $y(t) = \sin t$ as a parametrization of the circle $x^2 + y^2 = 1$ in Example 2.9.21.

However quite often the function g(x, y) is so complicated that one cannot explicitly solve g(x, y) = 0 for y as a function of x or
for x as a function of y and one also cannot explicitly parametrize g(x, y) = 0. Or sometimes you can, for example, solve
g(x, y) = 0 for y as a function of x, but the resulting solution is so complicated that it is really hard, or even virtually impossible,

to work with. Direct attacks become even harder in higher dimensions when, for example, we wish to optimize a function
f (x, y, z) subject to a constraint g(x, y, z) = 0.

There is another procedure called the method of “Lagrange multipliers” [1] that comes to our rescue in these scenarios. Here is the
three dimensional version of the method. There are obvious analogs in other dimensions.

 Theorem 2.10.2. Lagrange Multipliers

Let $f(x, y, z)$ and $g(x, y, z)$ have continuous first partial derivatives in a region of $\mathbb{R}^3$ that contains the surface $S$ given by the equation $g(x, y, z) = 0$. Further assume that $\nabla g(x, y, z) \ne 0$ on $S$.


If f , restricted to the surface S, has a local extreme value at the point (a, b, c) on S, then there is a real number λ such that

∇f (a, b, c) = λ∇g(a, b, c)

that is
fx (a, b, c) = λ gx (a, b, c)

fy (a, b, c) = λ gy (a, b, c)

fz (a, b, c) = λ gz (a, b, c)

The number λ is called a Lagrange multiplier.

Proof
Suppose that (a, b, c) is a point of S and that f (x, y, z) ≥ f (a, b, c) for all points (x, y, z) on S that are close to (a, b, c). That
is (a, b, c) is a local minimum for f on S. Of course the argument for a local maximum is virtually identical.

2.10.1 https://math.libretexts.org/@go/page/92243
Imagine that we go for a walk on S, with the time t running, say, from t = −1 to t = +1 and that at time t = 0 we happen to
be exactly at (a, b, c). Let's say that our position is (x(t), y(t), z(t)) at time t.
Write

F (t) = f (x(t), y(t), z(t))

So F (t) is the value of f that we see on our walk at time t. Then for all t close to 0, (x(t), y(t), z(t)) is close to
(x(0), y(0), z(0)) = (a, b, c) so that

F (0) = f (x(0), y(0), z(0)) = f (a, b, c) ≤ f (x(t), y(t), z(t)) = F (t)

for all $t$ close to zero. So $F(t)$ has a local minimum at $t = 0$ and consequently $F'(0) = 0$.

By the chain rule, Theorem 2.4.1,


$$F'(0) = \frac{d}{dt} f\big(x(t), y(t), z(t)\big)\Big|_{t=0} = f_x(a, b, c)\,x'(0) + f_y(a, b, c)\,y'(0) + f_z(a, b, c)\,z'(0) = 0 \qquad (*)$$

We may rewrite this as a dot product:

$$0 = F'(0) = \nabla f(a, b, c) \cdot \langle x'(0), y'(0), z'(0) \rangle \implies \nabla f(a, b, c) \perp \langle x'(0), y'(0), z'(0) \rangle$$

This is true for all paths on $S$ that pass through $(a, b, c)$ at time $0$. In particular it is true for all vectors $\langle x'(0), y'(0), z'(0) \rangle$ that are tangent to $S$ at $(a, b, c)$. So $\nabla f(a, b, c)$ is perpendicular to $S$ at $(a, b, c)$.


But we already know, by Theorem 2.5.5.a, that ∇g(a, b, c) is also perpendicular to S at (a, b, c). So ∇f (a, b, c) and ∇g(a, b, c)
have to be parallel vectors. That is,

∇f (a, b, c) = λ∇g(a, b, c)

for some number λ. That's the Lagrange multiplier rule of our theorem.

So to find the maximum and minimum values of f (x, y, z) on a surface g(x, y, z) = 0, assuming that both the objective function
f (x, y, z) and constraint function g(x, y, z) have continuous first partial derivatives and that ∇g(x, y, z) ≠ 0, you

1. build up a list of candidate points (x, y, z) by finding all solutions to the equations
fx (x, y, z) = λ gx (x, y, z)

fy (x, y, z) = λ gy (x, y, z)

fz (x, y, z) = λ gz (x, y, z)

g(x, y, z) = 0

Note that there are four equations and four unknowns, namely x, y, z and λ.
2. Then you evaluate f (x, y, z) at each (x, y, z) on the list of candidates. The biggest of these candidate values is the absolute
maximum and the smallest of these candidate values is the absolute minimum.
Another way to write the system of equations in the first step is

Lx (a, b, c, λ) = Ly (a, b, c, λ) = Lz (a, b, c, λ) = Lλ (a, b, c, λ) = 0

where L(x, y, z, λ) is the auxiliary function [2] [3]

L(x, y, z, λ) = f (x, y, z) − λ g(x, y, z)
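To see why the two formulations agree, one can differentiate the auxiliary function term by term. The short computation below is a sketch that uses only the definition $L = f - \lambda g$; nothing else is assumed.

```latex
\begin{align*}
L_x &= f_x - \lambda\, g_x, & L_y &= f_y - \lambda\, g_y, \\
L_z &= f_z - \lambda\, g_z, & L_\lambda &= -g .
\end{align*}
```

So setting $L_x = L_y = L_z = 0$ reproduces the three gradient equations, and setting $L_\lambda = 0$ reproduces the constraint $g = 0$.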

Now for a bunch of examples.

 Example 2.10.3
Find the maximum and minimum of the function $x^2 - 10x - y^2$ on the ellipse whose equation is $x^2 + 4y^2 = 16$.

Solution

For this problem the objective function is $f(x, y) = x^2 - 10x - y^2$ and the constraint function is $g(x, y) = x^2 + 4y^2 - 16$.

To apply the method of Lagrange multipliers we need ∇f and ∇g. So we start by computing the first order derivatives of these
functions.

fx = 2x − 10 fy = −2y gx = 2x gy = 8y

So, according to the method of Lagrange multipliers, we need to find all solutions to
$$2x - 10 = \lambda(2x)$$
$$-2y = \lambda(8y)$$
$$x^2 + 4y^2 - 16 = 0$$

Rearranging these equations gives


$$(\lambda - 1)x = -5 \qquad \text{(E1)}$$
$$(4\lambda + 1)y = 0 \qquad \text{(E2)}$$
$$x^2 + 4y^2 - 16 = 0 \qquad \text{(E3)}$$

From (E2), we see that we must have either $\lambda = -\frac{1}{4}$ or $y = 0$.
If $\lambda = -\frac{1}{4}$, (E1) gives $-\frac{5}{4}x = -5$, i.e. $x = 4$, and then (E3) gives $y = 0$.
If $y = 0$, then (E3) gives $x = \pm 4$ (and while we could easily use (E1) to solve for $\lambda$, we don't actually need $\lambda$).
So the method of Lagrange multipliers, Theorem 2.10.2 (actually the dimension two version of Theorem 2.10.2), gives that the
only possible locations of the maximum and minimum of the function f are (4, 0) and (−4, 0). To complete the problem, we
only have to compute f at those points.

point (4, 0) (−4, 0)

value of f −24 56

min max

Hence the maximum value of $x^2 - 10x - y^2$ on the ellipse is $56$ and the minimum value is $-24$.
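As an independent check on this example, one can sample the ellipse directly. The sketch below uses the parametrization $x = 4\cos t$, $y = 2\sin t$, which is an assumption introduced here for checking purposes (the example itself does not parametrize the ellipse), along with an arbitrary sample count `n`.

```python
import math

def f(x, y):
    return x**2 - 10 * x - y**2

n = 3600  # hypothetical number of sample points on the ellipse
samples = []
for k in range(n):
    t = 2 * math.pi * k / n
    x, y = 4 * math.cos(t), 2 * math.sin(t)  # satisfies x^2 + 4y^2 = 16
    samples.append((f(x, y), x, y))

f_max, x_max, y_max = max(samples)
f_min, x_min, y_min = min(samples)

print("max", f_max, "at", (x_max, y_max))
print("min", f_min, "at", (x_min, y_min))
```

The sampled extremes should sit, up to rounding, at the two candidate points $(-4, 0)$ and $(4, 0)$ found with the Lagrange multiplier equations.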

In the previous example, the objective function and the constraint were specified explicitly. That will not always be the case. In the
next example, we have to do a little geometry to extract them.

 Example 2.10.4

Find the rectangle of largest area (with sides parallel to the coordinate axes) that can be inscribed in the ellipse $x^2 + 2y^2 = 1$.

Solution
Since this question is so geometric, it is best to start by drawing a picture.

Call the coordinates of the upper right corner of the rectangle $(x, y)$, as in the figure above. The four corners of the rectangle are $(\pm x, \pm y)$ so the rectangle has width $2x$ and height $2y$ and the objective function is $f(x, y) = 4xy$. The constraint function for this problem is $g(x, y) = x^2 + 2y^2 - 1$. Again, to use Lagrange multipliers we need the first order partial derivatives.

$$f_x = 4y \qquad f_y = 4x \qquad g_x = 2x \qquad g_y = 4y$$

So, according to the method of Lagrange multipliers, we need to find all solutions to
$$4y = \lambda(2x) \qquad \text{(E1)}$$
$$4x = \lambda(4y) \qquad \text{(E2)}$$
$$x^2 + 2y^2 - 1 = 0 \qquad \text{(E3)}$$

Equation (E1) gives $y = \frac{1}{2}\lambda x$. Substituting this into equation (E2) gives

$$4x = 2\lambda^2 x \quad \text{or} \quad 2x(2 - \lambda^2) = 0$$

So (E2) is satisfied if either $x = 0$ or $\lambda = \sqrt{2}$ or $\lambda = -\sqrt{2}$.
If x = 0, then (E1) gives y = 0 too. But (0, 0) violates the constraint equation (E3). Note that, to have a solution, all of the
equations (E1), (E2) and (E3) must be satisfied.

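If $\lambda = \sqrt{2}$, then
(E2) gives $x = \sqrt{2}\,y$ and then
(E3) gives $2y^2 + 2y^2 = 1$ or $y^2 = \frac{1}{4}$ so that
$y = \pm\frac{1}{2}$ and $x = \sqrt{2}\,y = \pm\frac{1}{\sqrt{2}}$.

If $\lambda = -\sqrt{2}$, then
(E2) gives $x = -\sqrt{2}\,y$ and then
(E3) gives $2y^2 + 2y^2 = 1$ or $y^2 = \frac{1}{4}$ so that
$y = \pm\frac{1}{2}$ and $x = -\sqrt{2}\,y = \mp\frac{1}{\sqrt{2}}$.

We now have four possible values of $(x, y)$, namely $\left(\frac{1}{\sqrt{2}}, \frac{1}{2}\right)$, $\left(-\frac{1}{\sqrt{2}}, -\frac{1}{2}\right)$, $\left(\frac{1}{\sqrt{2}}, -\frac{1}{2}\right)$ and $\left(-\frac{1}{\sqrt{2}}, \frac{1}{2}\right)$. They are the four corners of a single rectangle. We said that we wanted $(x, y)$ to be the upper right corner, i.e. the corner in the first quadrant. It is $\left(\frac{1}{\sqrt{2}}, \frac{1}{2}\right)$.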
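A direct scan of the first-quadrant arc of the ellipse gives a quick check of this corner. The parametrization $x = \cos t$, $y = \frac{\sin t}{\sqrt{2}}$ and the sample count `n` below are assumptions introduced only for this check.

```python
import math

n = 1000  # hypothetical number of samples on the first-quadrant arc
best = None
for k in range(n + 1):
    t = (math.pi / 2) * k / n
    x, y = math.cos(t), math.sin(t) / math.sqrt(2)  # satisfies x^2 + 2y^2 = 1
    area = 4 * x * y                                # objective f(x, y) = 4xy
    if best is None or area > best[0]:
        best = (area, x, y)

print("best sampled (area, x, y):", best)
print("Lagrange corner (x, y):   ", (1 / math.sqrt(2), 0.5))
```

With this parametrization the area is $4xy = \sqrt{2}\sin 2t$, so the scan should settle near $t = \frac{\pi}{4}$, in agreement with the corner $\left(\frac{1}{\sqrt{2}}, \frac{1}{2}\right)$.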

 Example 2.10.5

Find the ends of the major and minor axes of the ellipse $3x^2 - 2xy + 3y^2 = 4$. They are the points on the ellipse that are farthest from and nearest to the origin.
Solution
Let $(x, y)$ be a point on $3x^2 - 2xy + 3y^2 = 4$. This point is at the end of a major axis when it maximizes its distance from the centre, $(0, 0)$, of the ellipse. It is at the end of a minor axis when it minimizes its distance from $(0, 0)$. So we wish to maximize and minimize the distance $\sqrt{x^2 + y^2}$ subject to the constraint

$$g(x, y) = 3x^2 - 2xy + 3y^2 - 4 = 0$$

Now maximizing/minimizing $\sqrt{x^2 + y^2}$ is equivalent [4] to maximizing/minimizing its square $\left(\sqrt{x^2 + y^2}\right)^2 = x^2 + y^2$. So we are free to choose the objective function

$$f(x, y) = x^2 + y^2$$

which we will do, because it makes the derivatives cleaner. Again, we use Lagrange multipliers to solve this problem, so we
start by finding the partial derivatives.

fx (x, y) = 2x fy (x, y) = 2y gx (x, y) = 6x − 2y gy (x, y) = −2x + 6y

We need to find all solutions to

\[ 2x = \lambda(6x - 2y) \]
\[ 2y = \lambda(-2x + 6y) \]
\[ 3x^2 - 2xy + 3y^2 - 4 = 0 \]

Dividing the first two equations by 2, and then collecting together the x's and the y's gives

\[ (1 - 3\lambda)x + \lambda y = 0 \tag{E1} \]
\[ \lambda x + (1 - 3\lambda)y = 0 \tag{E2} \]
\[ 3x^2 - 2xy + 3y^2 - 4 = 0 \tag{E3} \]

To start, let's concentrate on the first two equations. Pretend, for a couple of minutes, that we already know the value of \(\lambda\) and are trying to find \(x\) and \(y\). Note that \(\lambda\) cannot be zero because if it is, (E1) forces \(x = 0\) and (E2) forces \(y = 0\) and \((0,0)\) is not on the ellipse, i.e. violates (E3). So we may divide by \(\lambda\) and (E1) gives

\[ y = -\frac{1 - 3\lambda}{\lambda}\,x \]

Subbing this into (E2) gives

\[ \lambda x - \frac{(1 - 3\lambda)^2}{\lambda}\,x = 0 \]

Again, \(x\) cannot be zero, since then \(y = -\frac{1-3\lambda}{\lambda}x\) would give \(y = 0\) and \((0,0)\) is still not on the ellipse. So we may divide \(\lambda x - \frac{(1-3\lambda)^2}{\lambda}x = 0\) by \(x\), giving

\[ \lambda - \frac{(1 - 3\lambda)^2}{\lambda} = 0 \iff (1 - 3\lambda)^2 - \lambda^2 = 0 \iff 8\lambda^2 - 6\lambda + 1 = (2\lambda - 1)(4\lambda - 1) = 0 \]

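We now know that \(\lambda\) must be either \(\tfrac{1}{2}\) or \(\tfrac{1}{4}\). Subbing these into either (E1) or (E2) gives

\[ \lambda = \tfrac{1}{2} \implies -\tfrac{1}{2}x + \tfrac{1}{2}y = 0 \implies x = y \overset{\text{E3}}{\implies} 3x^2 - 2x^2 + 3x^2 = 4 \implies x = \pm 1 \]
\[ \lambda = \tfrac{1}{4} \implies \tfrac{1}{4}x + \tfrac{1}{4}y = 0 \implies x = -y \overset{\text{E3}}{\implies} 3x^2 + 2x^2 + 3x^2 = 4 \implies x = \pm\tfrac{1}{\sqrt{2}} \]

Here “\(\overset{\text{E3}}{\implies}\)” indicates that we have just used (E3). We now have \((x,y) = \pm(1,1)\), from \(\lambda = \tfrac{1}{2}\), and \((x,y) = \pm\big(\tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{2}}\big)\) from \(\lambda = \tfrac{1}{4}\). The distance from \((0,0)\) to \(\pm(1,1)\), namely \(\sqrt{2}\), is larger than the distance from \((0,0)\) to \(\pm\big(\tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{2}}\big)\), namely 1. So the ends of the minor axes are \(\pm\big(\tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{2}}\big)\) and the ends of the major axes are \(\pm(1,1)\). Those ends are sketched in the figure on the left below. Once we have the ends, it is an easy matter⁵ to sketch the ellipse as in the figure on the right below.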

Example 2.10.6

Find the values of \(w \ge 0\) and \(\kappa \ge 0\) that maximize the utility function

\[ U(w,\kappa) = 6\,w^{2/3}\kappa^{1/3} \qquad\text{subject to the constraint}\qquad 4w + 2\kappa = 12 \]

Solution
The constraint \(4w + 2\kappa = 12\) is simple enough that we can easily use it to express \(\kappa\) in terms of \(w\), then substitute \(\kappa = 6 - 2w\) into \(U(w,\kappa)\), and then maximize \(U(w, 6 - 2w) = 6\,w^{2/3}(6 - 2w)^{1/3}\) using the techniques of §3.5 in the CLP-1 textbook.
However, for practice purposes, we'll use Lagrange multipliers with the objective function \(U(w,\kappa) = 6\,w^{2/3}\kappa^{1/3}\) and the constraint function \(g(w,\kappa) = 4w + 2\kappa - 12\). The first order derivatives of these functions are

\[ U_w = 4\,w^{-1/3}\kappa^{1/3} \qquad U_\kappa = 2\,w^{2/3}\kappa^{-2/3} \qquad g_w = 4 \qquad g_\kappa = 2 \]

The boundary values \(w = 0\) and \(\kappa = 0\) give utility 0, which is obviously not going to be the maximum utility. So it suffices to consider only local maxima. According to the method of Lagrange multipliers, we need to find all solutions to

\[ 4\,w^{-1/3}\kappa^{1/3} = 4\lambda \tag{E1} \]
\[ 2\,w^{2/3}\kappa^{-2/3} = 2\lambda \tag{E2} \]
\[ 4w + 2\kappa - 12 = 0 \tag{E3} \]

Then
equation (E1) gives \(\lambda = w^{-1/3}\kappa^{1/3}\).
Substituting this into (E2) gives \(w^{2/3}\kappa^{-2/3} = \lambda = w^{-1/3}\kappa^{1/3}\) and hence \(w = \kappa\).
Then substituting \(w = \kappa\) into (E3) gives \(6\kappa = 12\).
So \(w = \kappa = 2\) and the maximum utility is \(U(2,2) = 12\).
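The substitution route mentioned at the start of the solution gives an independent check. The sketch below is an illustration under our own choices (grid size, function names): it substitutes \(\kappa = 6 - 2w\) and scans \(0 \le w \le 3\) for the largest utility.

```python
def utility(w, kappa):
    """U(w, kappa) = 6 w^(2/3) kappa^(1/3)."""
    return 6 * w ** (2 / 3) * kappa ** (1 / 3)

def utility_on_constraint(w):
    # The constraint 4w + 2*kappa = 12 gives kappa = 6 - 2w.
    return utility(w, 6 - 2 * w)

n = 30_000  # grid resolution (an arbitrary choice)
grid = [3 * k / n for k in range(n + 1)]  # kappa >= 0 forces 0 <= w <= 3
best_w = max(grid, key=utility_on_constraint)
best_kappa = 6 - 2 * best_w
best_u = utility_on_constraint(best_w)

print("w =", best_w, " kappa =", best_kappa, " U =", best_u)
```

The grid maximum should land at, or very near, \(w = \kappa = 2\) with utility close to 12, in agreement with the Lagrange multiplier computation.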

Example 2.10.7

Find the point on the sphere \(x^2 + y^2 + z^2 = 1\) that is farthest from \((1,2,3)\).
Solution
As before, we simplify the algebra by maximizing the square of the distance rather than the distance itself. So we are to maximize

\[ f(x,y,z) = (x-1)^2 + (y-2)^2 + (z-3)^2 \]

subject to the constraint

\[ g(x,y,z) = x^2 + y^2 + z^2 - 1 = 0 \]

Since

\[ f_x(x,y,z) = 2(x-1) \qquad f_y(x,y,z) = 2(y-2) \qquad f_z(x,y,z) = 2(z-3) \]
\[ g_x(x,y,z) = 2x \qquad g_y(x,y,z) = 2y \qquad g_z(x,y,z) = 2z \]

we need to find all solutions to

\[ 2(x-1) = \lambda(2x) \iff x = \frac{1}{1-\lambda} \tag{E1} \]
\[ 2(y-2) = \lambda(2y) \iff y = \frac{2}{1-\lambda} \tag{E2} \]
\[ 2(z-3) = \lambda(2z) \iff z = \frac{3}{1-\lambda} \tag{E3} \]
\[ 0 = x^2 + y^2 + z^2 - 1 \tag{E4} \]

Substituting (E1), (E2) and (E3) into (E4) gives

\[ \frac{1 + 4 + 9}{(1-\lambda)^2} - 1 = 0 \implies (1-\lambda)^2 = 14 \implies 1 - \lambda = \pm\sqrt{14} \]

We can then substitute these two values of \(\lambda\) back into the expressions for \(x, y, z\) in terms of \(\lambda\) to get the two points \(\frac{1}{\sqrt{14}}(1,2,3)\) and \(-\frac{1}{\sqrt{14}}(1,2,3)\).

The vector from \(\frac{1}{\sqrt{14}}(1,2,3)\) to \((1,2,3)\), namely \(\big\{1 - \frac{1}{\sqrt{14}}\big\}(1,2,3)\), is obviously shorter than the vector from \(-\frac{1}{\sqrt{14}}(1,2,3)\) to \((1,2,3)\), which is \(\big\{1 + \frac{1}{\sqrt{14}}\big\}(1,2,3)\). So the nearest point is \(\frac{1}{\sqrt{14}}(1,2,3)\) and the farthest point is \(-\frac{1}{\sqrt{14}}(1,2,3)\).
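The comparison of the two candidates is easy to confirm numerically. The following sketch (with illustrative helper names of our own) builds both points from \(1 - \lambda = \pm\sqrt{14}\), checks that they lie on the unit sphere, and computes their distances to \((1,2,3)\).

```python
import math

target = (1.0, 2.0, 3.0)

def point_for(one_minus_lam):
    # (E1)-(E3): (x, y, z) = (1, 2, 3) / (1 - lambda)
    return tuple(c / one_minus_lam for c in target)

def distance(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

root14 = math.sqrt(14)
candidates = [point_for(root14), point_for(-root14)]

for p in candidates:
    norm_sq = sum(c * c for c in p)  # should equal 1 on the unit sphere
    print(p, " |p|^2 =", norm_sq, " distance =", distance(p, target))

nearest = min(candidates, key=lambda p: distance(p, target))
farthest = max(candidates, key=lambda p: distance(p, target))
```

The two distances should come out as \(\sqrt{14} - 1\) and \(\sqrt{14} + 1\), with the larger one belonging to the point with negative coordinates.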

(Optional) An Example with Two Lagrange Multipliers


In this optional section, we consider an example of a problem of the form “maximize (or minimize) f (x, y, z) subject to the two
constraints g(x, y, z) = 0 and h(x, y, z) = 0 ”. We use the following variant of Theorem 2.10.2.

Theorem 2.10.8. Two Lagrange Multipliers

Let \(f(x,y,z)\), \(g(x,y,z)\) and \(h(x,y,z)\) have continuous first partial derivatives in a region of \(\mathbb{R}^3\) that contains the curve \(C\) given by the equations

\[ g(x,y,z) = h(x,y,z) = 0 \]

Assume⁶ that \(\nabla g(x,y,z) \times \nabla h(x,y,z) \ne \vec{0}\) on \(C\). If \(f\), restricted to the curve \(C\), has a local extreme value at the point \((a,b,c)\) on \(C\), then there are real numbers \(\lambda\) and \(\mu\) such that

\[ \nabla f(a,b,c) = \lambda\nabla g(a,b,c) + \mu\nabla h(a,b,c) \]

that is

\[ f_x(a,b,c) = \lambda\,g_x(a,b,c) + \mu\,h_x(a,b,c) \]
\[ f_y(a,b,c) = \lambda\,g_y(a,b,c) + \mu\,h_y(a,b,c) \]
\[ f_z(a,b,c) = \lambda\,g_z(a,b,c) + \mu\,h_z(a,b,c) \]

We can reformulate this theorem in terms of the auxiliary function

L(x, y, z, λ, μ) = f (x, y, z) − λ g(x, y, z) − μ h(x, y, z)

It is a function of five variables — the original variables x, y and z, and two auxiliary variables λ and μ. If there is a local extreme
value at (a, b, c) then (a, b, c) must obey

Equation 2.10.9

\[ 0 = L_x(a,b,c,\lambda,\mu) = f_x(a,b,c) - \lambda\,g_x(a,b,c) - \mu\,h_x(a,b,c) \]
\[ 0 = L_y(a,b,c,\lambda,\mu) = f_y(a,b,c) - \lambda\,g_y(a,b,c) - \mu\,h_y(a,b,c) \]
\[ 0 = L_z(a,b,c,\lambda,\mu) = f_z(a,b,c) - \lambda\,g_z(a,b,c) - \mu\,h_z(a,b,c) \]
\[ 0 = L_\lambda(a,b,c,\lambda,\mu) = g(a,b,c) \]
\[ 0 = L_\mu(a,b,c,\lambda,\mu) = h(a,b,c) \]

for some λ and μ. So solving this system of five equations in five unknowns gives all possible candidates for the locations of local
maxima and minima. We'll go through an example shortly.

Proof of Theorem 2.10.8.


Before we get to the example itself, here is why the above approach works. Assume that a local minimum occurs at \((a,b,c)\), which is the grey point in the schematic figure below. Imagine that you start walking away from \((a,b,c)\) along the curve \(g = h = 0\). Your path is the grey line in the schematic figure below.

Call your velocity vector \(\vec{v}\). It is tangent to the curve \(g(x,y,z) = h(x,y,z) = 0\). Because \(f\) has a local minimum at \((a,b,c)\), \(f\) must be increasing (or constant) as we leave \((a,b,c)\). So the directional derivative

\[ D_{\vec{v}} f(a,b,c) = \nabla f(a,b,c) \cdot \vec{v} \ge 0 \]

Now start over. Again walk away from \((a,b,c)\) along the curve \(g = h = 0\), but this time moving in the opposite direction, with velocity vector \(-\vec{v}\). Again \(f\) must be increasing (or constant) as we leave \((a,b,c)\), so the directional derivative

\[ D_{-\vec{v}} f(a,b,c) = \nabla f(a,b,c) \cdot (-\vec{v}) \ge 0 \]

As both \(\nabla f(a,b,c)\cdot\vec{v}\) and \(-\nabla f(a,b,c)\cdot\vec{v}\) are at least zero, we now have that

\[ \nabla f(a,b,c) \cdot \vec{v} = 0 \tag{$*$} \]

for all vectors \(\vec{v}\) that are tangent to the curve \(g = h = 0\) at \((a,b,c)\). Let's denote by \(T\) the set of all vectors \(\vec{v}\) that are tangent to the curve \(g = h = 0\) at \((a,b,c)\) and let's denote by \(T^\perp\) the set of all vectors that are perpendicular to all vectors in \(T\). So (∗) says that \(\nabla f(a,b,c)\) must be in \(T^\perp\).

We now find all vectors in \(T^\perp\). We can easily guess two such vectors. Since the curve \(g = h = 0\) lies inside the surface \(g = 0\) and \(\nabla g(a,b,c)\) is normal to \(g = 0\) at \((a,b,c)\), we have

\[ \nabla g(a,b,c) \cdot \vec{v} = 0 \tag{E1} \]

Similarly, since the curve \(g = h = 0\) lies inside the surface \(h = 0\) and \(\nabla h(a,b,c)\) is normal to \(h = 0\) at \((a,b,c)\), we have

\[ \nabla h(a,b,c) \cdot \vec{v} = 0 \tag{E2} \]

Picking any two constants \(\lambda\) and \(\mu\), multiplying (E1) by \(\lambda\), multiplying (E2) by \(\mu\) and adding gives that

\[ \big(\lambda\nabla g(a,b,c) + \mu\nabla h(a,b,c)\big) \cdot \vec{v} = 0 \]

for all vectors \(\vec{v}\) in \(T\). Thus, for all \(\lambda\) and \(\mu\), the vector \(\lambda\nabla g(a,b,c) + \mu\nabla h(a,b,c)\) is in \(T^\perp\).

Now the vectors in \(T\) form a line. (They are all tangent to the same curve at the same point.) So \(T^\perp\), the set of all vectors perpendicular to \(T\), forms a plane. As \(\lambda\) and \(\mu\) run over all real numbers, the vectors \(\lambda\nabla g(a,b,c) + \mu\nabla h(a,b,c)\) form a plane. (This is where we use the assumption that \(\nabla g \times \nabla h \ne \vec{0}\), so that \(\nabla g(a,b,c)\) and \(\nabla h(a,b,c)\) are not parallel.) Thus we have found all vectors in \(T^\perp\) and we conclude that \(\nabla f(a,b,c)\) must be of the form \(\lambda\nabla g(a,b,c) + \mu\nabla h(a,b,c)\) for some real numbers \(\lambda\) and \(\mu\). The three components of the equation

\[ \nabla f(a,b,c) = \lambda\nabla g(a,b,c) + \mu\nabla h(a,b,c) \]

are exactly the first three equations of 2.10.9. This completes the explanation of why Lagrange multipliers work in this setting.

Example 2.10.10

Find the distance from the origin to the curve that is the intersection of the two surfaces

\[ z^2 = x^2 + y^2 \qquad x - 2z = 3 \]

Solution
Yet again, we simplify the algebra by minimizing the square of the distance rather than the distance itself. So we are to minimize

\[ f(x,y,z) = x^2 + y^2 + z^2 \]

subject to the constraints

\[ 0 = g(x,y,z) = x^2 + y^2 - z^2 \qquad 0 = h(x,y,z) = x - 2z - 3 \]

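Since

\[ f_x = 2x \qquad f_y = 2y \qquad f_z = 2z \]
\[ g_x = 2x \qquad g_y = 2y \qquad g_z = -2z \]
\[ h_x = 1 \qquad h_y = 0 \qquad h_z = -2 \]

the method of Lagrange multipliers requires us to find all solutions to

\[ 2x = \lambda(2x) + \mu(1) \tag{E1} \]
\[ 2y = \lambda(2y) + \mu(0) \iff (1 - \lambda)y = 0 \tag{E2} \]
\[ 2z = \lambda(-2z) + \mu(-2) \tag{E3} \]
\[ z^2 = x^2 + y^2 \tag{E4} \]
\[ x - 2z = 3 \tag{E5} \]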

Since equation (E2) factors so nicely we start there. It tells us that either \(y = 0\) or \(\lambda = 1\).
Case \(\lambda = 1\): When \(\lambda = 1\) the remaining equations reduce to

\[ 0 = \mu \tag{E1} \]
\[ 0 = 4z + 2\mu \tag{E3} \]
\[ z^2 = x^2 + y^2 \tag{E4} \]
\[ x - 2z = 3 \tag{E5} \]

So
equation (E1) gives \(\mu = 0\).
Then substituting \(\mu = 0\) into (E3) gives \(z = 0\).
Then substituting \(z = 0\) into (E5) gives \(x = 3\).
Then substituting \(z = 0\) and \(x = 3\) into (E4) gives \(0 = 9 + y^2\), which is impossible, since \(9 + y^2 \ge 9 > 0\) for all \(y\).
So we can't have \(\lambda = 1\).
Case \(y = 0\): When \(y = 0\) the remaining equations reduce to

\[ 2(1 - \lambda)x = \mu \tag{E1} \]
\[ (1 + \lambda)z = -\mu \tag{E3} \]
\[ z^2 = x^2 \tag{E4} \]
\[ x - 2z = 3 \tag{E5} \]

These don't clean up quite so nicely as in the \(\lambda = 1\) case. But at least equation (E4) tells us that \(z = \pm x\). So we have to consider those two possibilities.
Subcase \(y = 0\), \(z = x\): When \(y = 0\) and \(z = x\), the remaining equations reduce to

\[ 2(1 - \lambda)x = \mu \tag{E1} \]
\[ (1 + \lambda)x = -\mu \tag{E3} \]
\[ -x = 3 \tag{E5} \]

So equation (E5) now tells us that \(x = -3\) so that \((x,y,z) = (-3, 0, -3)\). (We don't really care what \(\lambda\) and \(\mu\) are. But as they obey \(-6(1 - \lambda) = \mu\), \(-3(1 + \lambda) = -\mu\) we have, adding the two equations together,

\[ -9 + 3\lambda = 0 \implies \lambda = 3 \]

and then, subbing into either equation, \(\mu = 12\).)


Subcase \(y = 0\), \(z = -x\): When \(y = 0\) and \(z = -x\), the remaining equations reduce to

\[ 2(1 - \lambda)x = \mu \tag{E1} \]
\[ (1 + \lambda)x = \mu \tag{E3} \]
\[ 3x = 3 \tag{E5} \]

So equation (E5) now tells us that \(x = 1\) so that \((x,y,z) = (1, 0, -1)\). (Again, we don't really care what \(\lambda\) and \(\mu\) are. But as they obey \(2(1 - \lambda) = \mu\), \((1 + \lambda) = \mu\) we have, subtracting the second equation from the first,

\[ 1 - 3\lambda = 0 \implies \lambda = \tfrac{1}{3} \]

and then, subbing into either equation, \(\mu = \tfrac{4}{3}\).)
Conclusion: We have two candidates for the location of the max and min, namely \((-3, 0, -3)\) and \((1, 0, -1)\). The first is a distance \(3\sqrt{2}\) from the origin, giving the maximum, and the second is a distance \(\sqrt{2}\) from the origin, giving the minimum. In particular, the distance from the origin to the curve is \(\sqrt{2}\).
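Both candidates, together with their multipliers, can be checked against the full system (E1) to (E5). The sketch below also scans the curve directly using a parametrization that is our own derivation rather than part of the solution above: substituting \(x = 3 + 2z\) into \(z^2 = x^2 + y^2\) gives \(3(z+2)^2 + y^2 = 3\), so we may take \(z = -2 + \cos t\), \(y = \sqrt{3}\sin t\), \(x = -1 + 2\cos t\).

```python
import math

def residuals(x, y, z, lam, mu):
    """(E1)-(E5) written as left side minus right side."""
    return (
        2 * x - (lam * 2 * x + mu),
        2 * y - lam * 2 * y,
        2 * z - (lam * (-2 * z) + mu * (-2)),
        z**2 - (x**2 + y**2),
        x - 2 * z - 3,
    )

# (x, y, z, lambda, mu) for the two candidates found above
candidates = [(-3.0, 0.0, -3.0, 3.0, 12.0), (1.0, 0.0, -1.0, 1 / 3, 4 / 3)]
worst = max(abs(r) for c in candidates for r in residuals(*c))
print("largest residual:", worst)

def curve_point(t):
    # Our own parametrization of the intersection curve (see lead-in).
    return (-1 + 2 * math.cos(t), math.sqrt(3) * math.sin(t), -2 + math.cos(t))

n = 100_000  # number of sample points (an arbitrary choice)
dists = [math.sqrt(sum(c * c for c in curve_point(2 * math.pi * k / n)))
         for k in range(n)]
print("min distance:", min(dists), " max distance:", max(dists))
```

The scan should agree with the conclusion: a minimum distance of about \(\sqrt{2}\) and a maximum of about \(3\sqrt{2}\).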

Exercises
Stage 1

1✳
1. Does the function \(f(x,y) = x^2 + y^2\) have a maximum or a minimum on the curve \(xy = 1\)? Explain.
2. Find all maxima and minima of \(f(x,y)\) on the curve \(xy = 1\).

2

The surface \(S\) is given by the equation \(g(x,y,z) = 0\). You are walking on \(S\) measuring the function \(f(x,y,z)\) as you go. You are currently at the point \((x_0, y_0, z_0)\) where \(f\) takes its largest value on \(S\), and are walking in the direction \(\vec{d} \ne \vec{0}\). Because you are walking on \(S\), the vector \(\vec{d}\) is tangent to \(S\) at \((x_0, y_0, z_0)\).

1. What is the directional derivative of \(f\) at \((x_0, y_0, z_0)\) in the direction \(\vec{d}\)? Do not use the method of Lagrange multipliers.
2. What is the directional derivative of \(f\) at \((x_0, y_0, z_0)\) in the direction \(\vec{d}\)? This time use the method of Lagrange multipliers.

Stage 2

3

Find the maximum and minimum values of the function \(f(x,y,z) = x + y - z\) on the sphere \(x^2 + y^2 + z^2 = 1\).

4

Find \(a\), \(b\) and \(c\) so that the volume \(\frac{4\pi}{3}abc\) of an ellipsoid \(\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1\) passing through the point \((1, 2, 1)\) is as small as possible.

5✳

Use the Method of Lagrange Multipliers to find the minimum value of \(z = x^2 + y^2\) subject to \(x^2 y = 1\). At which point or points does the minimum occur?

6✳

Use the Method of Lagrange Multipliers to find the radius of the base and the height of a right circular cylinder of maximum volume which can be fit inside the unit sphere \(x^2 + y^2 + z^2 = 1\).

7✳

Use the method of Lagrange Multipliers to find the maximum and minimum values of

\[ f(x,y) = xy \]

subject to the constraint

\[ x^2 + 2y^2 = 1. \]

8✳

Find the maximum and minimum values of \(f(x,y) = x^2 + y^2\) subject to the constraint \(x^4 + y^4 = 1\).

9✳

Use Lagrange multipliers to find the points on the sphere \(z^2 + x^2 + y^2 - 2y - 10 = 0\) closest to and farthest from the point \((1, -2, 1)\).

10 ✳

Use Lagrange multipliers to find the maximum and minimum values of the function \(f(x,y,z) = x^2 + y^2 - \frac{1}{20}z^2\) on the curve of intersection of the plane \(x + 2y + z = 10\) and the paraboloid \(x^2 + y^2 - z = 0\).

11 ✳

Find the point \(P = (x, y, z)\) (with \(x\), \(y\) and \(z > 0\)) on the surface \(x^3 y^2 z = 6\sqrt{3}\) that is closest to the origin.

12 ✳

Find the maximum value of \(f(x,y,z) = xyz\) on the ellipsoid

\[ g(x,y,z) = x^2 + xy + y^2 + 3z^2 = 9 \]

Specify all points at which this maximum value occurs.

13 ✳
Find the radius of the largest sphere centred at the origin that can be inscribed inside (that is, enclosed inside) the ellipsoid

\[ 2(x+1)^2 + y^2 + 2(z-1)^2 = 8 \]

14 ✳
Let \(C\) be the intersection of the plane \(x + y + z = 2\) and the sphere \(x^2 + y^2 + z^2 = 2\).

1. Use Lagrange multipliers to find the maximum value of \(f(x,y,z) = z\) on \(C\).
2. What are the coordinates of the lowest point on \(C\)?

15 ✳
1. Use Lagrange multipliers to find the extreme values of

\[ f(x,y,z) = (x-2)^2 + (y+2)^2 + (z-4)^2 \]

on the sphere \(x^2 + y^2 + z^2 = 6\).

2. Find the point on the sphere \(x^2 + y^2 + z^2 = 6\) that is farthest from the point \((2, -2, 4)\).

16 ✳
1. Find the minimum of the function

\[ f(x,y,z) = (x-2)^2 + (y-1)^2 + z^2 \]

subject to the constraint \(x^2 + y^2 + z^2 = 1\), using the method of Lagrange multipliers.

2. Give a geometric interpretation of this problem.

17 ✳

Use Lagrange multipliers to find the minimum and maximum values of \((x + z)e^y\) subject to \(x^2 + y^2 + z^2 = 6\).

18 ✳

Find the points on the ellipse \(2x^2 + 4xy + 5y^2 = 30\) which are closest to and farthest from the origin.

19

Find the ends of the major and minor axes of the ellipse \(3x^2 - 2xy + 3y^2 = 4\).

20 ✳

A closed rectangular box with a volume of 96 cubic meters is to be constructed of two materials. The material for the top costs twice as much per square meter as that for the sides and bottom. Use the method of Lagrange multipliers to find the dimensions of the least expensive box.

21 ✳

Consider the unit sphere

\[ S = \{\, (x,y,z) \mid x^2 + y^2 + z^2 = 1 \,\} \]

in \(\mathbb{R}^3\). Assume that the temperature at a point \((x,y,z)\) of \(S\) is

\[ T(x,y,z) = 40\,x\,y^2 z \]

Find the hottest and coldest temperatures on \(S\).

22 ✳
Find the dimensions of the box of maximum volume which has its faces parallel to the coordinate planes and which is contained inside the region \(0 \le z \le 48 - 4x^2 - 3y^2\).

23 ✳

A rectangular bin is to be made of a wooden base and heavy cardboard with no top. If wood is three times more expensive than cardboard, find the dimensions of the cheapest bin which has a volume of \(12\,\mathrm{m}^3\).

24 ✳

A closed rectangular box having a volume of 4 cubic metres is to be built with material that costs $8 per square metre for the sides but $12 per square metre for the top and bottom. Find the least expensive dimensions for the box.

25 ✳

Suppose that \(a, b, c\) are all greater than zero and let \(D\) be the pyramid bounded by the plane \(ax + by + cz = 1\) and the 3 coordinate planes. Use the method of Lagrange multipliers to find the largest possible volume of \(D\) if the plane \(ax + by + cz = 1\) is required to pass through the point \((1, 2, 3)\). (The volume of a pyramid is equal to one-third of the area of its base times the height.)

Stage 3

26 ✳

Use Lagrange multipliers to find the minimum distance from the origin to all points on the intersection of the curves \(g(x,y,z) = x - z - 4 = 0\) and \(h(x,y,z) = x + y + z - 3 = 0\).

27 ✳

Find the largest and smallest values of

\[ f(x,y,z) = 6x + y^2 + xz \]

on the sphere \(x^2 + y^2 + z^2 = 36\). Determine all points at which these values occur.

28 ✳

The temperature in the plane is given by \(T(x,y) = e^y\,(x^2 + y^2)\).

1.
   1. Give the system of equations that must be solved in order to find the warmest and coolest point on the circle \(x^2 + y^2 = 100\) by the method of Lagrange multipliers.
   2. Find the warmest and coolest points on the circle by solving that system.
2.
   1. Give the system of equations that must be solved in order to find the critical points of \(T(x,y)\).
   2. Find the critical points by solving that system.
3. Find the coolest point on the solid disc \(x^2 + y^2 \le 100\).

29 ✳
1. By finding the points of tangency, determine the values of \(c\) for which \(x + y + z = c\) is a tangent plane to the surface \(4x^2 + 4y^2 + z^2 = 96\).
2. Use the method of Lagrange Multipliers to determine the absolute maximum and minimum values of the function \(f(x,y,z) = x + y + z\) along the surface \(g(x,y,z) = 4x^2 + 4y^2 + z^2 = 96\).
3. Why do you get the same answers in parts 1 and 2?

30

Let \(f(x,y)\) have continuous partial derivatives. Consider the problem of finding local minima and maxima of \(f(x,y)\) on the curve \(xy = 1\).
Define \(g(x,y) = xy - 1\). According to the method of Lagrange multipliers, if \((x,y)\) is a local minimum or maximum of \(f(x,y)\) on the curve \(xy = 1\), then there is a real number \(\lambda\) such that

\[ \nabla f(x,y) = \lambda\nabla g(x,y), \qquad g(x,y) = 0 \tag{E1} \]

On the curve \(xy = 1\), we have \(y = \frac{1}{x}\) and \(f(x,y) = f\big(x, \frac{1}{x}\big)\). Define \(F(x) = f\big(x, \frac{1}{x}\big)\). If \(x \ne 0\) is a local minimum or maximum of \(F(x)\), we have that

\[ F'(x) = 0 \tag{E2} \]

Show that (E1) is equivalent to (E2), in the sense that

there is a \(\lambda\) such that \((x, y, \lambda)\) obeys (E1)

if and only if

\(x \ne 0\) obeys (E2) and \(y = \frac{1}{x}\).

1. Joseph-Louis Lagrange was actually born Giuseppe Lodovico Lagrangia in Turin, Italy in 1736. He moved to Berlin in 1766
and then to Paris in 1786. He eventually acquired French citizenship and then the French claimed he was a French
mathematician, while the Italians continued to claim that he was an Italian mathematician.
2. We call L an auxiliary function because, while we use it to help solve the problem, it doesn't actually appear in either the
statement of the question or in the answer itself
3. Some people use L(x, y, z, λ) = f (x, y, z) + λ g(x, y, z) instead. This amounts to renaming λ to −λ. While we care that λ
has a value, we don't care what it is.
4. The function \(S(z) = z^2\) is a strictly increasing function for \(z \ge 0\). So, for \(a, b \ge 0\), the statement “\(a < b\)” is equivalent to the statement “\(S(a) < S(b)\)”.

5. if you tilt your head so that the line through (1, 1) and (−1, −1) appears horizontal
6. This condition says that the normal vectors to g = 0 and h = 0 at (x, y, z) are not parallel. This ensures that the surfaces g = 0
and h = 0 are not tangent to each other at (x, y, z). They intersect in a curve.

This page titled 2.10: Lagrange Multipliers is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Joel
Feldman, Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a detailed
edit history is available upon request.

CHAPTER OVERVIEW
3: Multiple Integrals
In your previous calculus courses you defined and worked with single variable integrals, like \(\int_a^b f(x)\,dx\). In this chapter, we define and work with multivariable integrals, like \(\iint_R f(x,y)\,dx\,dy\) and \(\iiint_V f(x,y,z)\,dx\,dy\,dz\). We start with two variable integrals.
3.1: Double Integrals
3.2: Double Integrals in Polar Coordinates
3.3: Applications of Double Integrals
3.4: Surface Area
3.5: Triple Integrals
3.6: Triple Integrals in Cylindrical Coordinates
3.7: Triple Integrals in Spherical Coordinates
3.8: Optional— Integrals in General Coordinates

This page titled 3: Multiple Integrals is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Joel Feldman,
Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit
history is available upon request.

3.1: Double Integrals
Vertical Slices
Suppose that you want to compute the mass of a plate that fills the region R in the xy-plane. Suppose further that the density of the
plate, say in kilograms per square meter, depends on position. Call the density f (x, y). For simplicity we'll assume that R is the
region between the bottom curve y = B(x) and the top curve y = T (x) with x running from a to b. That is,

\[ R = \{\, (x,y) \mid a \le x \le b,\ B(x) \le y \le T(x) \,\} \]

We'll shortly express that mass as a two dimensional integral. As a warmup, recall the procedure that we used to set up a (one
dimensional) integral representing the area of R in Example 1.5.1 of the CLP-2 text.
Pick a natural number n (that we will later send to infinity), and then
subdivide \(R\) into \(n\) narrow vertical slices, each of width \(\Delta x = \frac{b-a}{n}\). Denote by \(x_i = a + i\,\Delta x\) the x-coordinate of the right hand edge of slice number \(i\).

For each \(i = 1, 2, \dots, n\), slice number \(i\) has \(x\) running from \(x_{i-1}\) to \(x_i\). We approximate its area by the area of a rectangle. We pick a number \(x_i^*\) between \(x_{i-1}\) and \(x_i\) and approximate the slice by a rectangle whose top is at \(y = T(x_i^*)\) and whose bottom is at \(y = B(x_i^*)\). The rectangle is outlined in blue in the figure below.

Thus the area of slice \(i\) is approximately \([T(x_i^*) - B(x_i^*)]\,\Delta x\).

So the Riemann sum approximation of the area of \(R\) is

\[ \text{Area} \approx \sum_{i=1}^{n} \big[T(x_i^*) - B(x_i^*)\big]\,\Delta x \]

By taking the limit as \(n \to \infty\) (i.e. taking the limit as the width of the rectangles goes to zero), we convert the Riemann sum into a definite integral (see Definition 1.1.9 in the CLP-2 text) and at the same time our approximation of the area becomes the exact area:

\[ \text{Area} = \lim_{n\to\infty} \sum_{i=1}^{n} \big[T(x_i^*) - B(x_i^*)\big]\,\Delta x = \int_a^b \big[T(x) - B(x)\big]\,dx \]
Now we can expand that procedure to yield the mass of \(R\) rather than the area of \(R\). We just have to replace our approximation \([T(x_i^*) - B(x_i^*)]\,\Delta x\) of the area of slice \(i\) by an approximation to the mass of slice \(i\). To do so, we

Pick a natural number \(m\) (that we will later send to infinity), and then
subdivide slice number \(i\) into \(m\) tiny rectangles, each of width \(\Delta x\) and of height \(\Delta y = \frac{1}{m}\big[T(x_i^*) - B(x_i^*)\big]\). Denote by \(y_j = B(x_i^*) + j\,\Delta y\) the y-coordinate of the top of rectangle number \(j\).

At this point we approximate the density inside each rectangle by a constant. For each \(j = 1, 2, \dots, m\), rectangle number \(j\) has \(y\) running from \(y_{j-1}\) to \(y_j\). We pick a number \(y_j^*\) between \(y_{j-1}\) and \(y_j\) and approximate the density on rectangle number \(j\) in slice number \(i\) by the constant \(f(x_i^*, y_j^*)\).

Thus the mass of rectangle number \(j\) in slice number \(i\) is approximately \(f(x_i^*, y_j^*)\,\Delta x\,\Delta y\).

So the Riemann sum approximation of the mass of slice number \(i\) is

\[ \text{Mass of slice } i \approx \sum_{j=1}^{m} f(x_i^*, y_j^*)\,\Delta x\,\Delta y \]

Note that the \(y_j^*\)'s depend on \(i\) and \(m\).

By taking the limit as \(m \to \infty\) (i.e. taking the limit as the height of the rectangles goes to zero), we convert the Riemann sum into a definite integral:

\[ \text{Mass of slice } i \approx \Delta x \int_{B(x_i^*)}^{T(x_i^*)} f(x_i^*, y)\,dy = F(x_i^*)\,\Delta x \]

where

\[ F(x) = \int_{B(x)}^{T(x)} f(x,y)\,dy \]

Notice that, while we started with the density \(f(x,y)\) being a function of both \(x\) and \(y\), by taking the limit of this Riemann sum, we have “integrated out” the dependence on \(y\). As a result, \(F(x)\) is a function of \(x\) only, not of \(x\) and \(y\).
Finally taking the limit as \(n \to \infty\) (i.e. taking the limit as the slice width goes to zero), we get

\[ \text{Mass} = \lim_{n\to\infty} \sum_{i=1}^{n} \Delta x \int_{B(x_i^*)}^{T(x_i^*)} f(x_i^*, y)\,dy = \lim_{n\to\infty} \sum_{i=1}^{n} F(x_i^*)\,\Delta x \]

Now we are back in familiar 1-variable territory. The sum \(\sum_{i=1}^{n} F(x_i^*)\,\Delta x\) is a Riemann sum approximation to the integral \(\int_a^b F(x)\,dx\). So

\[ \text{Mass} = \int_a^b F(x)\,dx = \int_a^b \Big[ \int_{B(x)}^{T(x)} f(x,y)\,dy \Big]\,dx \]

This is our first double integral. There are a couple of different standard notations for this integral.

Definition 3.1.1

\[ \iint_R f(x,y)\,dx\,dy = \int_a^b \Big[ \int_{B(x)}^{T(x)} f(x,y)\,dy \Big]\,dx = \int_a^b \int_{B(x)}^{T(x)} f(x,y)\,dy\,dx = \int_a^b dx \int_{B(x)}^{T(x)} dy\ f(x,y) \]

The last three integrals here are called iterated integrals, for obvious reasons.

Note that
to evaluate the integral \(\int_a^b \int_{B(x)}^{T(x)} f(x,y)\,dy\,dx\),
first evaluate the inside integral \(\int_{B(x)}^{T(x)} f(x,y)\,dy\) using the inside limits of integration, and by treating \(x\) as a constant and using standard single variable integration techniques, such as those in the CLP-2 text. The result of the inside integral is a function of \(x\) only. Call it \(F(x)\).
Then evaluate the outside integral \(\int_a^b F(x)\,dx\), whose integrand is the answer to the inside integral. Again, this integral is evaluated using standard single variable integration techniques.

To evaluate the integral \(\int_a^b dx \int_{B(x)}^{T(x)} dy\ f(x,y)\),
first evaluate the inside integral \(\int_{B(x)}^{T(x)} dy\ f(x,y)\) using the limits of integration that are directly beside the \(dy\). Indeed the \(dy\) is written directly beside \(\int_{B(x)}^{T(x)}\) to make it clear that the limits of integration \(B(x)\) and \(T(x)\) are for the y-integral. In the past you probably wrote this integral as \(\int_{B(x)}^{T(x)} f(x,y)\,dy\). The result of the inside integral is again a function of \(x\) only. Call it \(F(x)\).
Then evaluate the outside integral \(\int_a^b dx\ F(x)\), whose integrand is the answer to the inside integral and whose limits of integration are directly beside the \(dx\).

At this point you may be wondering “Do we always have to use vertical slices?” and “Do we always have to integrate with respect
to y first?” The answer is “no”. This brings us to consider “horizontal slices”.

Horizontal Slices
We found, when computing areas of regions in the xy-plane, that it is often advantageous to use horizontal slices, rather than
vertical slices. See, for example, Example 1.5.4 in the CLP-2 text. The same is true when setting up multidimensional integrals. So
we now repeat the setup procedure of the last section, but starting with horizontal slices, rather than vertical slices. This procedure
will be useful when dealing with regions of the form

\[ R = \{\, (x,y) \mid c \le y \le d,\ L(y) \le x \le R(y) \,\} \]

Here \(L(y)\) (“L” stands for “left”) is the smallest¹ allowed value of \(x\), when the y-coordinate is \(y\), and \(R(y)\) (“R” stands for “right”) is the largest allowed value of \(x\), when the y-coordinate is \(y\). Suppose that we wish to evaluate the mass of a plate that fills the region \(R\), and that the density of the plate is \(f(x,y)\). We follow essentially the same procedure as we used with vertical slices, but with the roles of \(x\) and \(y\) swapped.
Pick a natural number n (that we will later send to infinity). Then
subdivide the interval \(c \le y \le d\) into \(n\) narrow subintervals, each of width \(\Delta y = \frac{d-c}{n}\). Each subinterval cuts a thin horizontal slice from the region (see the figure below).

We approximate slice number \(i\) by a thin horizontal rectangle (indicated by the long darker gray rectangle in the figure below). On this slice, the y-coordinate runs over a very narrow range. We pick a number \(y_i^*\), somewhere in that range. We approximate slice \(i\) by a rectangle whose left side is at \(x = L(y_i^*)\) and whose right side is at \(x = R(y_i^*)\).

If we were computing the area of \(R\), we would now approximate the area of slice \(i\) by \([R(y_i^*) - L(y_i^*)]\,\Delta y\), which is the area of the rectangle with width \([R(y_i^*) - L(y_i^*)]\) and height \(\Delta y\).

To get the mass, just as we did above with vertical slices, we

pick another natural number \(m\) (that we will later send to infinity), and then
subdivide slice number \(i\) into \(m\) tiny rectangles, each of height \(\Delta y\) and of width \(\Delta x = \frac{1}{m}\big[R(y_i^*) - L(y_i^*)\big]\).

For each \(j = 1, 2, \dots, m\), rectangle number \(j\) has \(x\) running over a very narrow range. We pick a number \(x_j^*\) somewhere in that range. See the small black rectangle in the figure below.

Here is a magnified sketch of slice number \(i\)

On rectangle number \(j\) in slice number \(i\), we approximate the density by \(f(x_j^*, y_i^*)\), giving us that the mass of rectangle number \(j\) in slice number \(i\) is approximately \(f(x_j^*, y_i^*)\,\Delta x\,\Delta y\).

So the Riemann sum approximation of the mass of (horizontal) slice number \(i\) is

\[ \text{Mass of slice } i \approx \sum_{j=1}^{m} f(x_j^*, y_i^*)\,\Delta x\,\Delta y \]

By taking the limit as \(m \to \infty\) (i.e. taking the limit as the width of the rectangles goes to zero), we convert the Riemann sum into a definite integral:

\[ \text{Mass of slice } i \approx \Delta y \int_{L(y_i^*)}^{R(y_i^*)} f(x, y_i^*)\,dx = F(y_i^*)\,\Delta y \]

where

\[ F(y) = \int_{L(y)}^{R(y)} f(x,y)\,dx \]

Observe that, as \(x\) has been integrated out, \(F(y)\) is a function of \(y\) only, not of \(x\) and \(y\).
Finally taking the limit as \(n \to \infty\) (i.e. taking the limit as the slice width goes to zero), we get

\[ \text{Mass} = \lim_{n\to\infty} \sum_{i=1}^{n} \Delta y \int_{L(y_i^*)}^{R(y_i^*)} f(x, y_i^*)\,dx = \lim_{n\to\infty} \sum_{i=1}^{n} F(y_i^*)\,\Delta y \]

Now \(\sum_{i=1}^{n} F(y_i^*)\,\Delta y\) is a Riemann sum approximation to the integral \(\int_c^d F(y)\,dy\). So

\[ \text{Mass} = \int_c^d F(y)\,dy = \int_c^d \Big[ \int_{L(y)}^{R(y)} f(x,y)\,dx \Big]\,dy \]

The standard notations of Definition 3.1.1 also apply to this integral.

Definition 3.1.2

\[ \iint_R f(x,y)\,dx\,dy = \int_c^d \Big[ \int_{L(y)}^{R(y)} f(x,y)\,dx \Big]\,dy = \int_c^d \int_{L(y)}^{R(y)} f(x,y)\,dx\,dy = \int_c^d dy \int_{L(y)}^{R(y)} dx\ f(x,y) \]

Note that
to evaluate the integral \(\int_c^d \int_{L(y)}^{R(y)} f(x,y)\,dx\,dy\),
first evaluate the inside integral \(\int_{L(y)}^{R(y)} f(x,y)\,dx\) using the inside limits of integration. The result of the inside integral is a function of \(y\) only. Call it \(F(y)\).
Then evaluate the outside integral \(\int_c^d F(y)\,dy\), whose integrand is the answer to the inside integral.

To evaluate the integral \(\int_c^d dy \int_{L(y)}^{R(y)} dx\ f(x,y)\),
first evaluate the inside integral \(\int_{L(y)}^{R(y)} dx\ f(x,y)\) using the limits of integration that are directly beside the \(dx\). Again, the \(dx\) is written directly beside \(\int_{L(y)}^{R(y)}\) to make it clear that the limits of integration \(L(y)\) and \(R(y)\) are for the x-integral. In the past you probably wrote this integral as \(\int_{L(y)}^{R(y)} f(x,y)\,dx\). The result of the inside integral is again a function of \(y\) only. Call it \(F(y)\).
Then evaluate the outside integral \(\int_c^d dy\ F(y)\), whose integrand is the answer to the inside integral and whose limits of integration are directly beside the \(dy\).


By way of summary, we now have two integral representations for the mass of regions in the xy-plane.

 Theorem 3.1.3

Let R be a region in the xy-plane and let the function f (x, y) be defined and continuous on R.
1. If

\[ R = \{ (x, y) \mid a \le x \le b,\ B(x) \le y \le T(x) \} \]

with B(x) and T(x) being continuous, and if the mass density in R is f(x, y), then the mass of R is

\[ \int_a^b \left[\int_{B(x)}^{T(x)} f(x, y)\,dy\right] dx = \int_a^b \int_{B(x)}^{T(x)} f(x, y)\,dy\,dx = \int_a^b dx \int_{B(x)}^{T(x)} dy\, f(x, y) \]

2. If

\[ R = \{ (x, y) \mid c \le y \le d,\ L(y) \le x \le R(y) \} \]

with L(y) and R(y) being continuous, and if the mass density in R is f(x, y), then the mass of R is

\[ \int_c^d \left[\int_{L(y)}^{R(y)} f(x, y)\,dx\right] dy = \int_c^d \int_{L(y)}^{R(y)} f(x, y)\,dx\,dy = \int_c^d dy \int_{L(y)}^{R(y)} dx\, f(x, y) \]

Implicit in Theorem 3.1.3 is the statement that, if

\[ \{ (x, y) \mid a \le x \le b,\ B(x) \le y \le T(x) \} = \{ (x, y) \mid c \le y \le d,\ L(y) \le x \le R(y) \} \]

and if f(x, y) is continuous, then

\[ \int_a^b \int_{B(x)}^{T(x)} f(x, y)\,dy\,dx = \int_c^d \int_{L(y)}^{R(y)} f(x, y)\,dx\,dy \]

This is called Fubini's theorem². It will be discussed more in the optional §3.1.5.

 Definition 3.1.4

The integrals of Theorem 3.1.3 are often denoted

\[ \iint_R f(x, y)\,dx\,dy \qquad\text{or}\qquad \iint_R f(x, y)\,dA \]

The symbol dA represents the area of an “infinitesimal” piece of R.

Here is a simple example. We'll do some more complicated examples in §3.1.4.

 Example 3.1.5

Let R be the triangular region above the x-axis, to the right of the y -axis and to the left of the line x + y = 1. Find the mass of
R if it has density f (x, y) = y.

Solution
We'll do this problem twice — once using vertical strips and once using horizontal strips. First, here is a sketch of R.

Solution using vertical strips. We'll now set up a double integral for the mass using vertical strips. Note, from the figure, that
the leftmost points in R have x = 0 and the rightmost point in R has x = 1 and
for each fixed x between 0 and 1, the point (x, y) in R with the smallest y has y = 0 and the point (x, y) in R with the largest y has y = 1 − x.
Thus

\[ R = \{ (x, y) \mid 0 = a \le x \le b = 1,\ 0 = B(x) \le y \le T(x) = 1 - x \} \]

and, by part 1 of Theorem 3.1.3,

\[ \text{Mass} = \int_a^b dx \int_{B(x)}^{T(x)} dy\, f(x, y) = \int_0^1 dx \int_0^{1-x} dy\, y \]

Now the inside integral is

\[ \int_0^{1-x} y\,dy = \left[\frac{y^2}{2}\right]_0^{1-x} = \frac{1}{2}(1 - x)^2 \]

so that the

\[ \text{Mass} = \int_0^1 dx\, \frac{(1 - x)^2}{2} = \left[-\frac{(1 - x)^3}{6}\right]_0^1 = \frac{1}{6} \]

Solution using horizontal strips. This time we'll set up a double integral for the mass using horizontal strips. Note, from the figure, that
the lowest points in R have y = 0 and the topmost point in R has y = 1 and
for each fixed y between 0 and 1, the point (x, y) in R with the smallest x has x = 0 and the point (x, y) in R with the largest x has x = 1 − y.
Thus

\[ R = \{ (x, y) \mid 0 = c \le y \le d = 1,\ 0 = L(y) \le x \le R(y) = 1 - y \} \]

and, by part 2 of Theorem 3.1.3,

\[ \text{Mass} = \int_c^d dy \int_{L(y)}^{R(y)} dx\, f(x, y) = \int_0^1 dy \int_0^{1-y} dx\, y \]

Now the inside integral is

\[ \int_0^{1-y} y\,dx = \big[xy\big]_0^{1-y} = y - y^2 \]

since the x-integral treats y as a constant. So the

\[ \text{Mass} = \int_0^1 dy\, \big[y - y^2\big] = \left[\frac{y^2}{2} - \frac{y^3}{3}\right]_0^1 = \frac{1}{2} - \frac{1}{3} = \frac{1}{6} \]
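As a sanity check on Example 3.1.5, both orders of integration can be approximated numerically. The sketch below is our own illustration, not part of the text: the helper name `iterated_midpoint` and the grid size are assumptions, and the midpoint rule stands in for the exact antiderivatives used above.

```python
def iterated_midpoint(f, lo, hi, inner_lo, inner_hi, n=300):
    """Midpoint-rule estimate of the iterated integral
    int_lo^hi du int_{inner_lo(u)}^{inner_hi(u)} dv f(u, v)."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        u = lo + (i + 0.5) * h                    # outer sample point
        a, b = inner_lo(u), inner_hi(u)           # inner limits depend on u
        k = (b - a) / n
        total += h * sum(f(u, a + (j + 0.5) * k) for j in range(n)) * k
    return total

# Vertical strips: outer variable x, inner variable y from 0 to 1 - x, density y.
vertical = iterated_midpoint(lambda x, y: y, 0.0, 1.0, lambda x: 0.0, lambda x: 1.0 - x)
# Horizontal strips: outer variable y, inner variable x from 0 to 1 - y, density y.
horizontal = iterated_midpoint(lambda y, x: y, 0.0, 1.0, lambda y: 0.0, lambda y: 1.0 - y)
print(vertical, horizontal, 1.0 / 6.0)
```

Both estimates should land close to 1/6, in agreement with the two hand computations.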

Double integrals share the usual basic properties that we are used to from integrals of functions of one variable. See, for example,
Theorem 1.2.1 and Theorem 1.2.12 in the CLP-2 text. Indeed the following theorems follow from them.

 Theorem 3.1.6. Arithmetic of Integration

Let A, B, C be real numbers. Under the hypotheses of Theorem 3.1.3,

\[ \iint_R \big(f(x, y) + g(x, y)\big)\,dx\,dy = \iint_R f(x, y)\,dx\,dy + \iint_R g(x, y)\,dx\,dy \tag{a} \]

\[ \iint_R \big(f(x, y) - g(x, y)\big)\,dx\,dy = \iint_R f(x, y)\,dx\,dy - \iint_R g(x, y)\,dx\,dy \tag{b} \]

\[ \iint_R C f(x, y)\,dx\,dy = C \iint_R f(x, y)\,dx\,dy \tag{c} \]

Combining these three rules we have

\[ \iint_R \big(A f(x, y) + B g(x, y)\big)\,dx\,dy = A \iint_R f(x, y)\,dx\,dy + B \iint_R g(x, y)\,dx\,dy \tag{d} \]

That is, integrals depend linearly on the integrand.

\[ \iint_R dx\,dy = \text{Area}(R) \tag{e} \]

If the region R in the xy-plane is the union of regions $R_1$ and $R_2$ that do not overlap (except possibly on their boundaries), then

\[ \iint_R f(x, y)\,dx\,dy = \iint_{R_1} f(x, y)\,dx\,dy + \iint_{R_2} f(x, y)\,dx\,dy \tag{f} \]

In the very special (but not that uncommon) case that R is the rectangle

\[ R = \{ (x, y) \mid a \le x \le b,\ c \le y \le d \} \]

and the integrand is the product f(x, y) = g(x)h(y),

\[ \iint_R f(x, y)\,dx\,dy = \int_a^b dx \int_c^d dy\, g(x)h(y) = \int_a^b dx\, g(x) \int_c^d dy\, h(y) \]

since g(x) is a constant as far as the y-integral is concerned

\[ = \left[\int_a^b dx\, g(x)\right] \left[\int_c^d dy\, h(y)\right] \]

since $\int_c^d dy\, h(y)$ is a constant as far as the x-integral is concerned.
This is worth stating as a theorem

 Theorem 3.1.7

If the domain of integration

\[ R = \{ (x, y) \mid a \le x \le b,\ c \le y \le d \} \]

is a rectangle and the integrand is the product f(x, y) = g(x)h(y), then

\[ \iint_R f(x, y)\,dx\,dy = \left[\int_a^b dx\, g(x)\right] \left[\int_c^d dy\, h(y)\right] \]
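The factorization in Theorem 3.1.7 already holds at the level of Riemann sums, which the following Python sketch illustrates. The functions g(x) = x², h(y) = cos y, the rectangle [0, 1] × [0, 2] and the helper name are our own illustrative choices, not taken from the text.

```python
import math

def midpoints(lo, hi, n):
    """Return the n midpoints of a uniform partition of [lo, hi] and the cell width."""
    h = (hi - lo) / n
    return [lo + (i + 0.5) * h for i in range(n)], h

def g(x):
    return x * x

def h_fn(y):
    return math.cos(y)

xs, dx = midpoints(0.0, 1.0, 200)   # a = 0, b = 1
ys, dy = midpoints(0.0, 2.0, 200)   # c = 0, d = 2

# Riemann sum of g(x)h(y) over all small rectangles of R
double_sum = sum(g(x) * h_fn(y) for x in xs for y in ys) * dx * dy
# product of the two single-variable Riemann sums
product = (sum(g(x) for x in xs) * dx) * (sum(h_fn(y) for y in ys) * dy)
# exact value: (int_0^1 x^2 dx)(int_0^2 cos y dy) = (1/3) sin 2
exact = math.sin(2.0) / 3.0
print(double_sum, product, exact)
```

The double sum and the product agree up to rounding, because a finite sum of products g(x)h(y) over a grid factors in the same way as the integral.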

Just as was the case for single variable integrals, sometimes we don't actually need to know the value of a double integral exactly.
We are instead interested in bounds on its value. The following theorem provides some simple tools for generating such bounds.
They are the multivariable analogs of the single variable tools in Theorem 1.2.12 of the CLP-2 text.

 Theorem 3.1.8. Inequalities for Integrals

Under the hypotheses of Theorem 3.1.3,


1. If f(x, y) ≥ 0 for all (x, y) in R, then

\[ \iint_R f(x, y)\,dx\,dy \ge 0 \]

2. If there are constants m and M such that m ≤ f(x, y) ≤ M for all (x, y) in R, then

\[ m\,\text{Area}(R) \le \iint_R f(x, y)\,dx\,dy \le M\,\text{Area}(R) \]

3. If f(x, y) ≤ g(x, y) for all (x, y) in R, then

\[ \iint_R f(x, y)\,dx\,dy \le \iint_R g(x, y)\,dx\,dy \]

4. We have

\[ \left| \iint_R f(x, y)\,dx\,dy \right| \le \iint_R |f(x, y)|\,dx\,dy \]

Volumes
Now that we have defined double integrals, we should start putting them to use. One of the most immediate applications arises
from interpreting f (x, y), not as a density, but rather as the height of the part of a solid above the point (x, y) in the xy-plane. Then
Theorem 3.1.3 gives the volume between the xy-plane and the surface z = f (x, y).
We'll now see how this goes in the case of part 2 of Theorem 3.1.3. The case of part 1 works in the same way. So we assume that the solid V lies above the base region

\[ R = \{ (x, y) \mid c \le y \le d,\ L(y) \le x \le R(y) \} \]

and that

\[ V = \{ (x, y, z) \mid (x, y) \in R,\ 0 \le z \le f(x, y) \} \]

The base region R (which is also the top view of V) is sketched in the figure on the left below and the part of V in the first octant is sketched in the figure on the right below.

To find the volume of V we shall

Pick a natural number n and slice R into strips of width Δy = (d − c)/n.
Subdivide slice number i into m tiny rectangles, each of height Δy and of width Δx = (1/m) ⋯.
Compute, approximately, the volume of the part of V that is above each rectangle.
Take the limit m → ∞ and then the limit n → ∞.
We have just been through this type of argument twice. So we'll abbreviate the argument and just say
slice the base region R into long “infinitesimally” thin strips of width dy.
Subdivide each strip into “infinitesimal” rectangles each of height dy and of width dx. See the figure on the left above.
The volume of the part of V that is above the rectangle centred on (x, y) is essentially f(x, y) dx dy. See the figure on the right above.
So the volume of the part of V that is above the strip centred on y is essentially³ $dy \int_{L(y)}^{R(y)} dx\, f(x, y)$ and
we arrive at the following conclusion.

 Equation 3.1.9

If

\[ V = \{ (x, y, z) \mid (x, y) \in R,\ 0 \le z \le f(x, y) \} \]

where

\[ R = \{ (x, y) \mid c \le y \le d,\ L(y) \le x \le R(y) \} \]

then

\[ \text{Volume}(V) = \int_c^d dy \int_{L(y)}^{R(y)} dx\, f(x, y) \]

Similarly

 Equation 3.1.10

If

\[ V = \{ (x, y, z) \mid (x, y) \in R,\ 0 \le z \le f(x, y) \} \]

where

\[ R = \{ (x, y) \mid a \le x \le b,\ B(x) \le y \le T(x) \} \]

then

\[ \text{Volume}(V) = \int_a^b dx \int_{B(x)}^{T(x)} dy\, f(x, y) \]

Examples
Oof — we have had lots of equations and theory. It's time to put all of this to work. Let's start with a mass example and then move
on to a volume example. You will notice that the mathematics is really very similar. Just the interpretation changes.

 Example 3.1.11. Mass

Let ν > 0 be a constant and let R be the region above the curve $x^2 = 4\nu y$ and to the right of the curve $y^2 = \frac{1}{2}\nu x$. Find the mass of R if it has density f(x, y) = xy.
Solution
For practice, we'll do this problem twice, once using vertical strips and once using horizontal strips. We'll start by sketching R. First note that, since $y \ge \frac{x^2}{4\nu}$ and $x \ge \frac{2y^2}{\nu}$, both x and y are nonnegative throughout R. The two curves intersect at points (x, y) that satisfy both

\[ x = \frac{2y^2}{\nu} \ \text{ and }\ y = \frac{x^2}{4\nu} \implies x = \frac{2y^2}{\nu} = \frac{2}{\nu}\left(\frac{x^2}{4\nu}\right)^2 = \frac{x^4}{8\nu^3} \iff \left(\frac{x^3}{8\nu^3} - 1\right) x = 0 \]

This equation has only two real⁴ solutions: x = 0 and x = 2ν. So the upward opening parabola $y = \frac{x^2}{4\nu}$ and the rightward opening parabola $x = \frac{2y^2}{\nu}$ intersect at (0, 0) and (2ν, ν).

Solution using vertical strips. We'll now set up a double integral for the mass using vertical strips and using the abbreviated argument of the end of the last section (on volumes). Note, from the figure above, that

\[ R = \left\{ (x, y) \,\middle|\, 0 = a \le x \le b = 2\nu,\ \frac{x^2}{4\nu} = B(x) \le y \le T(x) = \sqrt{\frac{\nu x}{2}} \right\} \]

Slice R into long “infinitesimally” thin vertical strips of width dx.
Subdivide each strip into “infinitesimal” rectangles each of height dy and of width dx. See the figure below.
The mass of the rectangle centred on (x, y) is essentially f(x, y) dx dy = xy dx dy.
So the mass of the strip centred on x is essentially $dx \int_{B(x)}^{T(x)} dy\, f(x, y)$ (the integral over y adds up the masses of all of the different rectangles on the single vertical strip in question) and
we conclude that the

\[ \text{Mass}(R) = \int_a^b dx \int_{B(x)}^{T(x)} dy\, f(x, y) = \int_0^{2\nu} dx \int_{x^2/(4\nu)}^{\sqrt{\nu x/2}} dy\, xy \]

Here the integral over x adds up the masses of all of the different strips.
Recall that, when integrating y, x is held constant, so we may factor the constant x out of the inner y integral.

\[ \int_{x^2/(4\nu)}^{\sqrt{\nu x/2}} dy\, xy = x \int_{x^2/(4\nu)}^{\sqrt{\nu x/2}} dy\, y = x \left[\frac{y^2}{2}\right]_{x^2/(4\nu)}^{\sqrt{\nu x/2}} = \frac{\nu x^2}{4} - \frac{x^5}{32\nu^2} \]

and the

\[ \text{Mass}(R) = \int_0^{2\nu} dx \left[\frac{\nu x^2}{4} - \frac{x^5}{32\nu^2}\right] = \frac{\nu (2\nu)^3}{3 \times 4} - \frac{(2\nu)^6}{6 \times 32\nu^2} = \frac{\nu^4}{3} \]

Solution using horizontal strips. We'll now set up a double integral for the mass using horizontal strips, again using the abbreviated argument of the end of the last section (on volumes). Note, from the figure at the beginning of this example, that

\[ R = \left\{ (x, y) \,\middle|\, 0 = c \le y \le d = \nu,\ \frac{2y^2}{\nu} = L(y) \le x \le R(y) = \sqrt{4\nu y} \right\} \]

Slice R into long “infinitesimally” thin horizontal strips of width dy.
Subdivide each strip into “infinitesimal” rectangles each of height dy and of width dx. See the figure below.
The mass of the rectangle centred on (x, y) is essentially f(x, y) dx dy = xy dx dy.
So the mass of the strip centred on y is essentially $dy \int_{L(y)}^{R(y)} dx\, f(x, y)$ (the integral over x adds up the masses of all of the different rectangles on the single horizontal strip in question) and
we conclude that the

\[ \text{Mass}(R) = \int_c^d dy \int_{L(y)}^{R(y)} dx\, f(x, y) = \int_0^{\nu} dy \int_{2y^2/\nu}^{\sqrt{4\nu y}} dx\, xy \]

Here the integral over y adds up the masses of all of the different strips. Recalling that, when integrating x, y is held constant

\[ \text{Mass}(R) = \int_0^{\nu} dy\, y \left[\int_{2y^2/\nu}^{\sqrt{4\nu y}} dx\, x\right] = \int_0^{\nu} dy\, y \left[\frac{x^2}{2}\right]_{2y^2/\nu}^{\sqrt{4\nu y}} = \int_0^{\nu} dy \left[2\nu y^2 - \frac{2y^5}{\nu^2}\right] = \frac{2\nu(\nu)^3}{3} - \frac{2\nu^6}{6\nu^2} = \frac{\nu^4}{3} \]
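The two set-ups of Example 3.1.11 can be cross-checked numerically for a particular ν. In the Python sketch below, the value ν = 1.5, the helper name `iterated_midpoint` and the grid size are illustrative assumptions of ours; the limits of integration are the ones derived above.

```python
import math

def iterated_midpoint(f, lo, hi, inner_lo, inner_hi, n=300):
    """Midpoint-rule estimate of int_lo^hi du int_{inner_lo(u)}^{inner_hi(u)} dv f(u, v)."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        u = lo + (i + 0.5) * h
        a, b = inner_lo(u), inner_hi(u)
        k = (b - a) / n
        total += h * sum(f(u, a + (j + 0.5) * k) for j in range(n)) * k
    return total

nu = 1.5  # any positive constant would do; this value is only for illustration

# Vertical strips: 0 <= x <= 2 nu, x^2/(4 nu) <= y <= sqrt(nu x / 2)
vertical = iterated_midpoint(
    lambda x, y: x * y, 0.0, 2 * nu,
    lambda x: x * x / (4 * nu), lambda x: math.sqrt(nu * x / 2),
)
# Horizontal strips: 0 <= y <= nu, 2 y^2 / nu <= x <= sqrt(4 nu y)
horizontal = iterated_midpoint(
    lambda y, x: x * y, 0.0, nu,
    lambda y: 2 * y * y / nu, lambda y: math.sqrt(4 * nu * y),
)
exact = nu ** 4 / 3
print(vertical, horizontal, exact)
```

Both estimates should be close to ν⁴/3, which is 1.6875 for this choice of ν.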

 Example 3.1.12. Volume

Let R be the part of the xy-plane above the x-axis and below the parabola $y = 1 - x^2$. Find the volume between R and the surface $z = x^2 \sqrt{1 - y}$.

Solution
Yet again, for practice, we'll do this problem twice — once using vertical strips and once using horizontal strips. First, here is a
sketch of R.

Solution using vertical strips. We'll now set up a double integral for the volume using vertical strips. Note, from the figure, that
the leftmost point in R has x = −1 and the rightmost point in R has x = 1 and
for each fixed x between −1 and 1, the point (x, y) in R with the smallest y has y = 0 and the point (x, y) in R with the largest y has $y = 1 - x^2$.
Thus

\[ R = \{ (x, y) \mid -1 = a \le x \le b = 1,\ 0 = B(x) \le y \le T(x) = 1 - x^2 \} \]

and, by Equation 3.1.10,

\[ \text{Volume} = \int_a^b dx \int_{B(x)}^{T(x)} dy\, f(x, y) = \int_{-1}^1 dx \int_0^{1-x^2} dy\, x^2 \sqrt{1 - y} = 2 \int_0^1 dx \int_0^{1-x^2} dy\, x^2 \sqrt{1 - y} \]

since the inside integral $F(x) = \int_0^{1-x^2} dy\, x^2 \sqrt{1 - y}$ is an even function of x. Now, for x ≥ 0, the inside integral is

\[ \int_0^{1-x^2} x^2 \sqrt{1 - y}\,dy = x^2 \int_0^{1-x^2} \sqrt{1 - y}\,dy = x^2 \left[-\frac{2}{3}(1 - y)^{3/2}\right]_0^{1-x^2} = \frac{2}{3} x^2 (1 - x^3) \]

so that the

\[ \text{Volume} = 2 \int_0^1 dx\, \frac{2}{3} x^2 (1 - x^3) = \frac{4}{3} \left[\frac{x^3}{3} - \frac{x^6}{6}\right]_0^1 = \frac{2}{9} \]

Solution using horizontal strips. This time we'll set up a double integral for the volume using horizontal strips. Note, from the figure, that
the lowest points in R have y = 0 and the topmost point in R has y = 1 and
for each fixed y between 0 and 1, the point (x, y) in R with the leftmost x has $x = -\sqrt{1 - y}$ and the point (x, y) in R with the rightmost x has $x = \sqrt{1 - y}$.
Thus

\[ R = \{ (x, y) \mid 0 = c \le y \le d = 1,\ -\sqrt{1 - y} = L(y) \le x \le R(y) = \sqrt{1 - y} \} \]

and, by Equation 3.1.9,

\[ \text{Volume} = \int_c^d dy \int_{L(y)}^{R(y)} dx\, f(x, y) = \int_0^1 dy \int_{-\sqrt{1-y}}^{\sqrt{1-y}} dx\, x^2 \sqrt{1 - y} \]

Now the inside integral has an even integrand (in x) and so is

\[ \int_{-\sqrt{1-y}}^{\sqrt{1-y}} dx\, x^2 \sqrt{1 - y} = 2 \sqrt{1 - y} \int_0^{\sqrt{1-y}} x^2\,dx = 2 \sqrt{1 - y} \left[\frac{x^3}{3}\right]_0^{\sqrt{1-y}} = \frac{2}{3} (1 - y)^2 \]

So the

\[ \text{Volume} = \int_0^1 dy\, \frac{2}{3} (1 - y)^2 = \frac{2}{3} \left[-\frac{(1 - y)^3}{3}\right]_0^1 = \frac{2}{9} \]
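Here is a numerical cross-check of Example 3.1.12, written as a Python sketch under our own assumptions (helper name, midpoint rule, grid size). It integrates over the whole region, without using the evenness shortcut, so it also tests that shortcut.

```python
import math

def iterated_midpoint(f, lo, hi, inner_lo, inner_hi, n=400):
    """Midpoint-rule estimate of int_lo^hi du int_{inner_lo(u)}^{inner_hi(u)} dv f(u, v)."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        u = lo + (i + 0.5) * h
        a, b = inner_lo(u), inner_hi(u)
        k = (b - a) / n
        total += h * sum(f(u, a + (j + 0.5) * k) for j in range(n)) * k
    return total

# Vertical strips: -1 <= x <= 1, 0 <= y <= 1 - x^2, height x^2 sqrt(1 - y)
vertical = iterated_midpoint(
    lambda x, y: x * x * math.sqrt(1.0 - y), -1.0, 1.0,
    lambda x: 0.0, lambda x: 1.0 - x * x,
)
# Horizontal strips: 0 <= y <= 1, -sqrt(1 - y) <= x <= sqrt(1 - y)
horizontal = iterated_midpoint(
    lambda y, x: x * x * math.sqrt(1.0 - y), 0.0, 1.0,
    lambda y: -math.sqrt(1.0 - y), lambda y: math.sqrt(1.0 - y),
)
print(vertical, horizontal, 2.0 / 9.0)
```

Both numbers should be close to 2/9 ≈ 0.2222.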

 Example 3.1.13. Volume

Find the volume common to the two cylinders $x^2 + y^2 = a^2$ and $x^2 + z^2 = a^2$.

Solution
Our first job is to figure out what the specified solid looks like. Note that
The variable z does not appear in the equation $x^2 + y^2 = a^2$. So, for every value of the constant $z_0$, the part of the cylinder $x^2 + y^2 = a^2$ in the plane $z = z_0$ is the circle $x^2 + y^2 = a^2$, $z = z_0$. So the cylinder $x^2 + y^2 = a^2$ consists of many circles stacked vertically, one on top of the other. The part of the cylinder $x^2 + y^2 = a^2$ that lies above the xy-plane is sketched in the figure on the left below.
The variable y does not appear in the equation $x^2 + z^2 = a^2$. So, for every value of the constant $y_0$, the part of the cylinder $x^2 + z^2 = a^2$ in the plane $y = y_0$ is the circle $x^2 + z^2 = a^2$, $y = y_0$. So the cylinder $x^2 + z^2 = a^2$ consists of many circles stacked horizontally, one beside the other. The part of the cylinder $x^2 + z^2 = a^2$ that lies to the right of the xz-plane is sketched in the figure on the right below.

We have to compute the volume common to these two intersecting cylinders.


The equations $x^2 + y^2 = a^2$ and $x^2 + z^2 = a^2$ do not change at all if x is replaced by −x. Consequently both cylinders, and hence our solid, are symmetric about the yz-plane. In particular the volume of the part of the solid in the octant x ≤ 0, y ≥ 0, z ≥ 0 is the same as the volume in the first octant x ≥ 0, y ≥ 0, z ≥ 0. Similarly, the equations do not change at all if y is replaced by −y or if z is replaced by −z. Our solid is also symmetric about both the xz-plane and the xy-plane. Hence the volume of the part of our solid in each of the eight octants is the same.
So we will compute the volume of the part of the solid in the first octant, i.e. with x ≥ 0, y ≥ 0, z ≥ 0. The total volume of the solid is eight times that.
The part of the solid in the first octant is sketched in the figure on the left below. A point (x, y, z) lies in the first cylinder if and only if $x^2 + y^2 \le a^2$. It lies in the second cylinder if and only if $x^2 + z^2 \le a^2$. So the part of the solid in the first octant is

\[ V_1 = \{ (x, y, z) \mid x \ge 0,\ y \ge 0,\ z \ge 0,\ x^2 + y^2 \le a^2,\ x^2 + z^2 \le a^2 \} \]

Notice that, in $V_1$, $z^2 \le a^2 - x^2$ so that $z \le \sqrt{a^2 - x^2}$ and

\[ V_1 = \{ (x, y, z) \mid x \ge 0,\ y \ge 0,\ x^2 + y^2 \le a^2,\ 0 \le z \le \sqrt{a^2 - x^2} \} \]

The top view of the part of the solid in the first octant is sketched in the figure on the right above. In that top view, x runs from 0 to a. For each fixed x, y runs from 0 to $\sqrt{a^2 - x^2}$. So we may rewrite

\[ V_1 = \{ (x, y, z) \mid (x, y) \in R,\ 0 \le z \le f(x, y) \} \]

where

\[ R = \{ (x, y) \mid 0 \le x \le a,\ 0 \le y \le \sqrt{a^2 - x^2} \} \quad\text{and}\quad f(x, y) = \sqrt{a^2 - x^2} \]

and “(x, y) ∈ R” is read “(x, y) is an element of R”. Note that f(x, y) is actually independent of y. This will make things a bit easier below.
We can now compute the volume of $V_1$ using our usual abbreviated protocol.

Slice R into long “infinitesimally” thin horizontal strips of height dx.
Subdivide each strip into “infinitesimal” rectangles each of width dy and of height dx. See the figure below.
The volume of the part of $V_1$ above the rectangle centred on (x, y) is essentially

\[ f(x, y)\,dx\,dy = \sqrt{a^2 - x^2}\,dx\,dy \]

So the volume of the part of $V_1$ above the strip centred on x is essentially

\[ dx \int_0^{\sqrt{a^2 - x^2}} \sqrt{a^2 - x^2}\,dy \]

(the integral over y adds up the volumes over all of the different rectangles on the single horizontal strip in question) and
we conclude that the

\[ \text{Volume}(V_1) = \int_0^a dx \int_0^{\sqrt{a^2 - x^2}} dy\, \sqrt{a^2 - x^2} \]

Here the integral over x adds up the volumes over all of the different strips. Recalling that, when integrating y, x is held constant

\[ \text{Volume}(V_1) = \int_0^a dx\, \sqrt{a^2 - x^2} \left[\int_0^{\sqrt{a^2 - x^2}} dy\right] = \int_0^a dx\, (a^2 - x^2) = \left[a^2 x - \frac{x^3}{3}\right]_0^a = \frac{2a^3}{3} \]

and the total volume of the solid in question is

\[ \text{Volume}(V) = 8\,\text{Volume}(V_1) = \frac{16a^3}{3} \]
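The value 16a³/3 can be checked numerically for a specific radius. The following Python sketch uses a = 2 and a midpoint-rule helper; both are illustrative assumptions of ours rather than anything prescribed in the text.

```python
import math

def iterated_midpoint(f, lo, hi, inner_lo, inner_hi, n=300):
    """Midpoint-rule estimate of int_lo^hi du int_{inner_lo(u)}^{inner_hi(u)} dv f(u, v)."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        u = lo + (i + 0.5) * h
        a_in, b_in = inner_lo(u), inner_hi(u)
        k = (b_in - a_in) / n
        total += h * sum(f(u, a_in + (j + 0.5) * k) for j in range(n)) * k
    return total

a = 2.0  # illustrative radius

def height(x, y):
    """f(x, y) = sqrt(a^2 - x^2); it does not depend on y."""
    return math.sqrt(a * a - x * x)

# First-octant piece: 0 <= x <= a, 0 <= y <= sqrt(a^2 - x^2)
volume_v1 = iterated_midpoint(height, 0.0, a, lambda x: 0.0, lambda x: math.sqrt(a * a - x * x))
total_volume = 8.0 * volume_v1        # eight congruent octant pieces
exact = 16.0 * a ** 3 / 3.0
print(total_volume, exact)
```

For a = 2 both numbers should be close to 128/3 ≈ 42.667.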

 Example 3.1.14. Geometric Interpretation


Evaluate $\int_0^2 \int_0^a \sqrt{a^2 - x^2}\,dx\,dy$.

Solution
This integral represents the volume of a simple geometric figure and so can be evaluated without using any calculus at all. The domain of integration is

\[ R = \{ (x, y) \mid 0 \le y \le 2,\ 0 \le x \le a \} \]

and the integrand is $f(x, y) = \sqrt{a^2 - x^2}$, so the integral represents the volume between the xy-plane and the surface $z = \sqrt{a^2 - x^2}$, with (x, y) running over R. We can rewrite the equation of the surface as $x^2 + z^2 = a^2$, which, as in Example 3.1.13, we recognize as the equation of a cylinder of radius a centred on the y-axis. We want the volume of the part of this cylinder that lies above R. It is sketched in the figure below.

The constant y cross-sections of this volume are quarter circles of radius a and hence of area $\frac{1}{4}\pi a^2$. The inside integral, $\int_0^a \sqrt{a^2 - x^2}\,dx$, is exactly this area. So, as y runs from 0 to 2,

\[ \int_0^2 \int_0^a \sqrt{a^2 - x^2}\,dx\,dy = \frac{1}{4}\pi a^2 \times 2 = \frac{\pi a^2}{2} \]

 Example 3.1.15. Example 3.1.14, the hard way


It is possible, but very tedious, to evaluate the integral $\int_0^2 \int_0^a \sqrt{a^2 - x^2}\,dx\,dy$ of Example 3.1.14, using single variable calculus techniques. We do so now as a review of a couple of those techniques.

The inside integral is $\int_0^a \sqrt{a^2 - x^2}\,dx$. The standard procedure for eliminating square roots like $\sqrt{a^2 - x^2}$ from integrands is the method of trigonometric substitution, that was covered in §1.9 of the CLP-2 text. In this case, the appropriate substitution is

\[ x = a \sin\theta \qquad dx = a \cos\theta\,d\theta \]

The lower limit of integration x = 0, i.e. a sin θ = 0, corresponds to θ = 0, and the upper limit x = a, i.e. a sin θ = a, corresponds to θ = π/2, so that

\[ \int_0^a \sqrt{a^2 - x^2}\,dx = \int_0^{\pi/2} \sqrt{\underbrace{a^2 - a^2 \sin^2\theta}_{a^2 \cos^2\theta}}\ a \cos\theta\,d\theta = a^2 \int_0^{\pi/2} \cos^2\theta\,d\theta \]

The orthodox procedure for evaluating the resulting trigonometric integral $\int_0^{\pi/2} \cos^2\theta\,d\theta$, covered in §1.8 of the CLP-2 text, uses the trigonometric double angle formula

\[ \cos(2\theta) = 2\cos^2\theta - 1 \quad\text{to write}\quad \cos^2\theta = \frac{1 + \cos(2\theta)}{2} \]

and then

\[ \int_0^a \sqrt{a^2 - x^2}\,dx = a^2 \int_0^{\pi/2} \cos^2\theta\,d\theta = \frac{a^2}{2} \int_0^{\pi/2} \big[1 + \cos(2\theta)\big]\,d\theta = \frac{a^2}{2} \left[\theta + \frac{\sin(2\theta)}{2}\right]_0^{\pi/2} = \frac{\pi a^2}{4} \]

However we remark that there is also an efficient, sneaky, way to evaluate definite integrals like $\int_0^{\pi/2} \cos^2\theta\,d\theta$. Looking at the figures

we see that

\[ \int_0^{\pi/2} \cos^2\theta\,d\theta = \int_0^{\pi/2} \sin^2\theta\,d\theta \]

Thus

\[ \int_0^{\pi/2} \cos^2\theta\,d\theta = \int_0^{\pi/2} \sin^2\theta\,d\theta = \int_0^{\pi/2} \frac{1}{2} \big[\sin^2\theta + \cos^2\theta\big]\,d\theta = \frac{1}{2} \int_0^{\pi/2} d\theta = \frac{\pi}{4} \]

In any event, the inside integral

\[ \int_0^a \sqrt{a^2 - x^2}\,dx = \frac{\pi a^2}{4} \]

and the full integral

\[ \int_0^2 \int_0^a \sqrt{a^2 - x^2}\,dx\,dy = \frac{\pi a^2}{4} \int_0^2 dy = \frac{\pi a^2}{2} \]

just as we saw in Example 3.1.14.
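The steps of this calculation can be spot-checked numerically. In the Python sketch below, the radius a = 3, the helper `midpoint_rule` and the number of subintervals are our own illustrative choices. It compares the inside integral before and after the substitution x = a sin θ, and checks the "sneaky" observation that cos² and sin² have the same integral over [0, π/2].

```python
import math

def midpoint_rule(func, lo, hi, n=2000):
    """Midpoint-rule estimate of the single-variable integral of func over [lo, hi]."""
    h = (hi - lo) / n
    return sum(func(lo + (i + 0.5) * h) for i in range(n)) * h

a = 3.0  # illustrative radius

# Inside integral in the original variable x
inner_x = midpoint_rule(lambda x: math.sqrt(a * a - x * x), 0.0, a)
# Same integral after substituting x = a sin(theta)
cos_sq = midpoint_rule(lambda t: math.cos(t) ** 2, 0.0, math.pi / 2)
sin_sq = midpoint_rule(lambda t: math.sin(t) ** 2, 0.0, math.pi / 2)
inner_theta = a * a * cos_sq
# Outer y-integral from 0 to 2 of a constant
full = 2.0 * inner_theta

print(inner_x, inner_theta, math.pi * a * a / 4)
print(cos_sq, sin_sq, math.pi / 4)
print(full, math.pi * a * a / 2)
```

The x-form converges a little more slowly than the θ-form, since the integrand has a vertical tangent at x = a; the substitution removes that feature.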

 Example 3.1.16. Order of Integration


The integral $\int_{-1}^2 \int_{x^2}^{x+2} dy\,dx$ represents the area of a region in the xy-plane. Express the same area as a double integral with the order of integration reversed.

Solution
The critical step in reversing the order of integration is to sketch the region in the xy-plane. Rewrite the given integral as

\[ \int_{-1}^2 \int_{x^2}^{x+2} dy\,dx = \int_{-1}^2 \left[\int_{x^2}^{x+2} dy\right] dx \]

From this we see that, on the domain of integration,
x runs from −1 to 2 and
for each fixed x, y runs from the parabola $y = x^2$ to the straight line y = x + 2.

The given iterated integral corresponds to the (vertical) slicing in the figure on the left below.

To reverse the order of integration we have to switch to horizontal slices as in the figure on the right above.
There we see a new wrinkle: the formula giving the value of x at the left hand end of a slice depends on whether the y coordinate of the slice is bigger than, or smaller than y = 1. Looking at the figure on the right, we see that, on the domain of integration,
y runs from 0 to 4 and
for each fixed 0 ≤ y ≤ 1, x runs from x = −√y to x = +√y.
for each fixed 1 ≤ y ≤ 4, x runs from x = y − 2 to x = +√y.
So

\[ \int_{-1}^2 dx \int_{x^2}^{x+2} dy = \int_0^1 dy \int_{-\sqrt{y}}^{\sqrt{y}} dx + \int_1^4 dy \int_{y-2}^{\sqrt{y}} dx \]
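One way to gain confidence in a reversed order of integration is to evaluate both sides numerically. The Python sketch below does this for Example 3.1.16; the helper name and grid size are our own assumptions. By hand, the original integral is $\int_{-1}^2 (x + 2 - x^2)\,dx = 9/2$.

```python
import math

def iterated_midpoint(f, lo, hi, inner_lo, inner_hi, n=300):
    """Midpoint-rule estimate of int_lo^hi du int_{inner_lo(u)}^{inner_hi(u)} dv f(u, v)."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        u = lo + (i + 0.5) * h
        a, b = inner_lo(u), inner_hi(u)
        k = (b - a) / n
        total += h * sum(f(u, a + (j + 0.5) * k) for j in range(n)) * k
    return total

def one(u, v):
    """Integrand 1, so the iterated integral is an area."""
    return 1.0

# Original order: x from -1 to 2, y from x^2 to x + 2
original = iterated_midpoint(one, -1.0, 2.0, lambda x: x * x, lambda x: x + 2.0)
# Reversed order, lower piece: y from 0 to 1, x from -sqrt(y) to sqrt(y)
lower = iterated_midpoint(one, 0.0, 1.0, lambda y: -math.sqrt(y), lambda y: math.sqrt(y))
# Reversed order, upper piece: y from 1 to 4, x from y - 2 to sqrt(y)
upper = iterated_midpoint(one, 1.0, 4.0, lambda y: y - 2.0, lambda y: math.sqrt(y))
reversed_order = lower + upper
print(original, reversed_order, 4.5)
```

If the limits of one of the two pieces were wrong, the sum would visibly disagree with 9/2.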

There was a moral to the last example. Just because both orders of integration have to give the same answer doesn't mean that they
are equally easy to evaluate. Here is an extreme example illustrating that moral.

 Example 3.1.17

Evaluate the integral of $\frac{\sin x}{x}$ over the region in the xy-plane that is above the x-axis, to the right of the line y = x and to the left of the line x = 1.
Solution
Here is a sketch of the specified domain.

We'll try to evaluate the specified integral twice — once using horizontal strips (the impossibly hard way) and once using
vertical strips (the easy way).
Solution using horizontal strips.    To set up the integral using horizontal strips, as in the figure on the left below, we observe
that, on the domain of integration,
y runs from 0 to 1 and
for each fixed y, x runs from x = y to 1.
So the iterated integral is

\[ \int_0^1 dy \int_y^1 dx\, \frac{\sin x}{x} \]

And we have a problem. The integrand $\frac{\sin x}{x}$ does not have an antiderivative that can be expressed in terms of elementary functions⁵. It is impossible to evaluate $\int_y^1 dx\, \frac{\sin x}{x}$ without resorting to, for example, numerical methods or infinite series⁶.

Solution using vertical strips.    To set up the integral using vertical strips, as in the figure on the right above, we observe that,
on the domain of integration,
x runs from 0 to 1 and
for each fixed x, y runs from 0 to y = x.
So the iterated integral is

\[ \int_0^1 dx \int_0^x dy\, \frac{\sin x}{x} \]

This time, because x is treated as a constant in the inner integral, it is trivial to evaluate the iterated integral.

\[ \int_0^1 dx \int_0^x dy\, \frac{\sin x}{x} = \int_0^1 dx\, \frac{\sin x}{x} \int_0^x dy = \int_0^1 dx\, \sin x = 1 - \cos 1 \]
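Although the horizontal-strip integral has no elementary antiderivative, nothing stops us from approximating it numerically, and the result should agree with 1 − cos 1. The Python sketch below does so with a midpoint rule; the helper name and grid size are illustrative assumptions. Midpoints are never exactly at x = 0, so the quotient sin x / x is always defined at the sample points.

```python
import math

def iterated_midpoint(f, lo, hi, inner_lo, inner_hi, n=300):
    """Midpoint-rule estimate of int_lo^hi du int_{inner_lo(u)}^{inner_hi(u)} dv f(u, v)."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        u = lo + (i + 0.5) * h
        a, b = inner_lo(u), inner_hi(u)
        k = (b - a) / n
        total += h * sum(f(u, a + (j + 0.5) * k) for j in range(n)) * k
    return total

# Horizontal strips (the hard order): y from 0 to 1, x from y to 1
horizontal = iterated_midpoint(
    lambda y, x: math.sin(x) / x, 0.0, 1.0, lambda y: y, lambda y: 1.0
)
# Vertical strips (the easy order): x from 0 to 1, y from 0 to x
vertical = iterated_midpoint(
    lambda x, y: math.sin(x) / x, 0.0, 1.0, lambda x: 0.0, lambda x: x
)
exact = 1.0 - math.cos(1.0)
print(horizontal, vertical, exact)
```

So the two orders differ in how hard they are to evaluate by hand, not in their value.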

Here is an example which is included as an excuse to review some integration technique from CLP-2.

 Example 3.1.18

Find the volume under the surface $z = 1 - 3x^2 - 2y^2$ and above the xy-plane.
Solution
Before leaping into integration, we should try to understand what the surface and volume look like. For each constant $z_0$, the part of the surface $z = 1 - 3x^2 - 2y^2$ that lies in the horizontal plane $z = z_0$ is the ellipse $3x^2 + 2y^2 = 1 - z_0$. The biggest of these ellipses is that in the xy-plane, where $z_0 = 0$. It is the ellipse $3x^2 + 2y^2 = 1$. As $z_0$ increases the ellipse shrinks, degenerating to a single point, namely (0, 0, 1), when $z_0 = 1$. So the surface consists of a stack of ellipses and our solid is

\[ V = \{ (x, y, z) \mid 3x^2 + 2y^2 \le 1,\ 0 \le z \le 1 - 3x^2 - 2y^2 \} \]

This is sketched in the figure below

The top view of the base region

\[ R = \{ (x, y) \mid 3x^2 + 2y^2 \le 1 \} \]

is sketched in the figure below.

is sketched in the figure below.

Considering that the x-dependence in $z = 1 - 3x^2 - 2y^2$ is almost identical to the y-dependence in $z = 1 - 3x^2 - 2y^2$ (only the coefficients 2 and 3 are interchanged), using vertical slices is likely to lead to exactly the same level of difficulty as using horizontal slices. So we'll just pick one, say vertical slices.
The fattest part of R is on the y-axis. The intersection points of the ellipse with the y-axis have x = 0 and y obeying $3(0)^2 + 2y^2 = 1$ or $y = \pm\frac{1}{\sqrt{2}}$. So in R, $-\frac{1}{\sqrt{2}} \le y \le \frac{1}{\sqrt{2}}$ and, for each such y, $3x^2 \le 1 - 2y^2$ or $-\sqrt{\frac{1 - 2y^2}{3}} \le x \le \sqrt{\frac{1 - 2y^2}{3}}$. So using vertical strips as in the figure above

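\[ \text{Volume}(V) = \iint_R (1 - 3x^2 - 2y^2)\,dx\,dy = \int_{-1/\sqrt{2}}^{1/\sqrt{2}} dy \int_{-\sqrt{(1-2y^2)/3}}^{\sqrt{(1-2y^2)/3}} dx\, (1 - 3x^2 - 2y^2) \]

\[ = 4 \int_0^{1/\sqrt{2}} dy \int_0^{\sqrt{(1-2y^2)/3}} dx\, (1 - 3x^2 - 2y^2) = 4 \int_0^{1/\sqrt{2}} dy\, \Big[(1 - 2y^2)x - x^3\Big]_0^{\sqrt{(1-2y^2)/3}} \]

\[ = 4 \int_0^{1/\sqrt{2}} dy\, \sqrt{\frac{1 - 2y^2}{3}} \left[(1 - 2y^2) - \frac{1 - 2y^2}{3}\right] = 8 \int_0^{1/\sqrt{2}} dy \left[\frac{1 - 2y^2}{3}\right]^{3/2} \]

To evaluate this integral, we use the trig substitution⁷ $2y^2 = \sin^2\theta$, or

\[ y = \frac{\sin\theta}{\sqrt{2}} \qquad dy = \frac{\cos\theta}{\sqrt{2}}\,d\theta \]

to give

\[ \text{Volume}(V) = 8 \int_0^{\pi/2} d\theta\, \frac{\cos\theta}{\sqrt{2}} \left[\frac{\cos^2\theta}{3}\right]^{3/2} = \frac{8}{\sqrt{54}} \int_0^{\pi/2} d\theta\, \cos^4\theta \]

Then to integrate $\cos^4\theta$, we use the double angle formula⁸

\[ \cos^2\theta = \frac{\cos(2\theta) + 1}{2} \implies \cos^4\theta = \frac{\big(\cos(2\theta) + 1\big)^2}{4} = \frac{\cos^2(2\theta) + 2\cos(2\theta) + 1}{4} = \frac{\frac{\cos(4\theta) + 1}{2} + 2\cos(2\theta) + 1}{4} = \frac{3}{8} + \frac{1}{2}\cos(2\theta) + \frac{1}{8}\cos(4\theta) \]

Finally, since $\int_0^{\pi/2} \cos(4\theta)\,d\theta = \int_0^{\pi/2} \cos(2\theta)\,d\theta = 0$,

\[ \text{Volume}(V) = \frac{8}{\sqrt{54}} \cdot \frac{3}{8} \cdot \frac{\pi}{2} = \frac{\pi}{2\sqrt{6}} \]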
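Because the hand computation in Example 3.1.18 passes through several substitutions, an independent numerical estimate is a useful check. The Python sketch below integrates directly over the elliptical base region with the limits found above; the helper name and grid size are our own illustrative assumptions.

```python
import math

def iterated_midpoint(f, lo, hi, inner_lo, inner_hi, n=300):
    """Midpoint-rule estimate of int_lo^hi du int_{inner_lo(u)}^{inner_hi(u)} dv f(u, v)."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        u = lo + (i + 0.5) * h
        a, b = inner_lo(u), inner_hi(u)
        k = (b - a) / n
        total += h * sum(f(u, a + (j + 0.5) * k) for j in range(n)) * k
    return total

def half_width(y):
    """Largest x on the ellipse 3x^2 + 2y^2 = 1 at height y."""
    return math.sqrt((1.0 - 2.0 * y * y) / 3.0)

y_max = 1.0 / math.sqrt(2.0)
volume = iterated_midpoint(
    lambda y, x: 1.0 - 3.0 * x * x - 2.0 * y * y,
    -y_max, y_max,
    lambda y: -half_width(y), half_width,
)
exact = math.pi / (2.0 * math.sqrt(6.0))
print(volume, exact)
```

Both values should be close to 0.6413.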

Optional: More about the Definition of $\iint_R f(x, y)\,dx\,dy$

Technically, the integral $\iint_R f(x, y)\,dx\,dy$, where R is a bounded region in $\mathbb{R}^2$, is defined as follows.

Subdivide R by drawing lines parallel to the x and y axes.
Number the resulting rectangles contained in R, 1 through n. Notice that we are numbering all of the rectangles in R, not just those in one particular row or column.
Denote by $\Delta A_i$ the area of rectangle #i.
Select an arbitrary point $(x_i^*, y_i^*)$ in rectangle #i.
Form the sum $\sum_{i=1}^n f(x_i^*, y_i^*)\,\Delta A_i$. Again note that the sum runs over all of the rectangles in R, not just those in one particular row or column.
Now repeat this construction over and over again, using finer and finer grids. If, as the size⁹ of the rectangles approaches zero, this sum approaches a unique limit (independent of the choice of parallel lines and of points $(x_i^*, y_i^*)$), then we define

\[ \iint_R f(x, y)\,dx\,dy = \lim \sum_{i=1}^n f(x_i^*, y_i^*)\,\Delta A_i \]

 Theorem 3.1.19

If f(x, y) is continuous in a region R described by

\[ a \le x \le b, \qquad B(x) \le y \le T(x) \]

for continuous functions B(x), T(x), then

\[ \iint_R f(x, y)\,dx\,dy \quad\text{and}\quad \int_a^b dx \left[\int_{B(x)}^{T(x)} dy\, f(x, y)\right] \]

both exist and are equal. Similarly, if R is described by

\[ c \le y \le d, \qquad L(y) \le x \le R(y) \]

for continuous functions L(y), R(y), then

\[ \iint_R f(x, y)\,dx\,dy \quad\text{and}\quad \int_c^d dy \left[\int_{L(y)}^{R(y)} dx\, f(x, y)\right] \]

both exist and are equal.

The proof of this theorem is not particularly difficult, but is still beyond the scope of this text. The main ideas in the proof can
already be seen in §1.1.6 of the CLP-2 text. An important consequence of this theorem is

 Theorem 3.1.20. Fubini

If f(x, y) is continuous in a region R described by both

\[ \left\{ \begin{array}{c} a \le x \le b \\ B(x) \le y \le T(x) \end{array} \right\} \quad\text{and}\quad \left\{ \begin{array}{c} c \le y \le d \\ L(y) \le x \le R(y) \end{array} \right\} \]

for continuous functions B(x), T(x), L(y), R(y), then both

\[ \int_a^b dx \left[\int_{B(x)}^{T(x)} dy\, f(x, y)\right] \quad\text{and}\quad \int_c^d dy \left[\int_{L(y)}^{R(y)} dx\, f(x, y)\right] \]

exist and are equal.

The hypotheses of both of these theorems can be relaxed a bit, but not too much. For example, if

\[ R = \{ (x, y) \mid 0 \le x \le 1,\ 0 \le y \le 1 \} \qquad f(x, y) = \begin{cases} 1 & \text{if } x, y \text{ are both rational numbers} \\ 0 & \text{otherwise} \end{cases} \]

then the integral $\iint_R f(x, y)\,dx\,dy$ does not exist. This is easy to see. If all of the $x_i^*$'s and $y_i^*$'s are chosen to be rational numbers, then

\[ \sum_{i=1}^n f(x_i^*, y_i^*)\,\Delta A_i = \sum_{i=1}^n \Delta A_i = \text{Area}(R) \]

But if we choose all the $x_i^*$'s and $y_i^*$'s to be irrational numbers, then

\[ \sum_{i=1}^n f(x_i^*, y_i^*)\,\Delta A_i = \sum_{i=1}^n 0\,\Delta A_i = 0 \]

So the limit of $\sum_{i=1}^n f(x_i^*, y_i^*)\,\Delta A_i$, as the maximum diagonal of the rectangles approaches zero, depends on the choice of points $(x_i^*, y_i^*)$. So the integral $\iint_R f(x, y)\,dx\,dy$ does not exist.
Here is an even more pathological¹⁰ example.

 Example 3.1.21

In this example, we relax exactly one of the hypotheses of Fubini's Theorem, namely the continuity of f, and construct an example in which both of the integrals in Fubini's Theorem exist, but are not equal. In fact, we choose R = {(x, y) | 0 ≤ x ≤ 1, 0 ≤ y ≤ 1} and we use a function f(x, y) that is continuous on R, except at exactly one point, the origin.
First, let $\delta_1, \delta_2, \delta_3, \cdots$ be any sequence of real numbers obeying

\[ 1 = \delta_1 > \delta_2 > \delta_3 > \cdots > \delta_n \to 0 \]

For example $\delta_n = \frac{1}{n}$ or $\delta_n = \frac{1}{2^{n-1}}$ are both acceptable. For each positive integer n, let $I_n = (\delta_{n+1}, \delta_n] = \{ t \mid \delta_{n+1} < t \le \delta_n \}$ and let $g_n(t)$ be any nonnegative continuous function obeying

$g_n(t) = 0$ if t is not in $I_n$ and
$\int_{I_n} g_n(t)\,dt = 1$

There are many such functions. For example

\[ g_n(t) = \left(\frac{2}{\delta_n - \delta_{n+1}}\right)^2 \begin{cases} \delta_n - t & \text{if } \frac{1}{2}(\delta_{n+1} + \delta_n) \le t \le \delta_n \\ t - \delta_{n+1} & \text{if } \delta_{n+1} \le t \le \frac{1}{2}(\delta_{n+1} + \delta_n) \\ 0 & \text{otherwise} \end{cases} \]

Here is a summary of what we have done so far.

We subdivided the interval 0 < x ≤ 1 into infinitely many subintervals $I_n$. As n increases, the subinterval $I_n$ gets smaller and smaller and also gets closer and closer to zero.
We defined, for each n, a nonnegative continuous function $g_n$ that is zero everywhere outside of $I_n$ and whose integral over $I_n$ is one.

Now we define the integrand f(x, y) in terms of these subintervals $I_n$ and functions $g_n$.

\[ f(x, y) = \begin{cases} 0 & \text{if } x = 0 \\ 0 & \text{if } y = 0 \\ g_m(x)\,g_n(y) & \text{if } x \in I_m,\ y \in I_n \text{ with } m = n \\ -g_m(x)\,g_n(y) & \text{if } x \in I_m,\ y \in I_n \text{ with } m = n + 1 \\ 0 & \text{otherwise} \end{cases} \]

You should think of (0, 1] × (0, 1] as a union of a bunch of small rectangles $I_m \times I_n$, as in the figure below. On most of these rectangles, f(x, y) is just zero. The exceptions are the darkly shaded rectangles $I_n \times I_n$ on the “diagonal” of the figure and the lightly shaded rectangles $I_{n+1} \times I_n$ just to the left of the “diagonal”.

On each darkly shaded rectangle, f(x, y) ≥ 0 and the graph of f(x, y) is the graph of $g_n(x)\,g_n(y)$ which looks like a pyramid. On each lightly shaded rectangle, f(x, y) ≤ 0 and the graph of f(x, y) is the graph of $-g_{n+1}(x)\,g_n(y)$ which looks like a pyramidal hole in the ground.

Now fix any $0 \le y \le 1$ and let's compute $\int_0^1 f(x, y)\,dx$. That is, we are integrating $f$ along a line that is parallel to the $x$-axis. If $y = 0$, then $f(x, y) = 0$ for all $x$, so $\int_0^1 f(x, y)\,dx = 0$. If $0 < y \le 1$, then there is exactly one positive integer $n$ with $y \in I_n$ and $f(x, y)$ is zero, except for $x$ in $I_n$ or $I_{n+1}$. So for $y \in I_n$

\[ \begin{aligned}
\int_0^1 f(x, y)\,dx &= \sum_{m=n,n+1} \int_{I_m} f(x, y)\,dx \\
&= \int_{I_n} g_n(x)\,g_n(y)\,dx - \int_{I_{n+1}} g_{n+1}(x)\,g_n(y)\,dx \\
&= g_n(y) \int_{I_n} g_n(x)\,dx - g_n(y) \int_{I_{n+1}} g_{n+1}(x)\,dx \\
&= g_n(y) - g_n(y) = 0
\end{aligned} \]

Here we have twice used that $\int_{I_m} g_m(t)\,dt = 1$ for all $m$. Thus $\int_0^1 f(x, y)\,dx = 0$ for all $y$ and hence

\[ \int_0^1 dy \Big[ \int_0^1 dx\, f(x, y) \Big] = 0. \]

Finally, fix any $0 \le x \le 1$ and let's compute $\int_0^1 f(x, y)\,dy$. That is, we are integrating $f$ along a line that is parallel to the $y$-axis. If $x = 0$, then $f(x, y) = 0$ for all $y$, so $\int_0^1 f(x, y)\,dy = 0$. If $0 < x \le 1$, then there is exactly one positive integer $m$ with $x \in I_m$. If $m \ge 2$, then $f(x, y)$ is zero, except for $y$ in $I_m$ and $I_{m-1}$. But, if $m = 1$, then $f(x, y)$ is zero, except for $y$ in $I_1$. (Take another look at the figure above.) So for $x \in I_m$, with $m \ge 2$,

\[ \begin{aligned}
\int_0^1 f(x, y)\,dy &= \sum_{n=m,m-1} \int_{I_n} f(x, y)\,dy \\
&= \int_{I_m} g_m(x)\,g_m(y)\,dy - \int_{I_{m-1}} g_m(x)\,g_{m-1}(y)\,dy \\
&= g_m(x) \int_{I_m} g_m(y)\,dy - g_m(x) \int_{I_{m-1}} g_{m-1}(y)\,dy \\
&= g_m(x) - g_m(x) = 0
\end{aligned} \]

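But for $x \in I_1$,

\[ \int_0^1 f(x, y)\,dy = \int_{I_1} f(x, y)\,dy = \int_{I_1} g_1(x)\,g_1(y)\,dy = g_1(x) \int_{I_1} g_1(y)\,dy = g_1(x) \]

Thus

\[ \int_0^1 f(x, y)\,dy =
\begin{cases}
0 & \text{if } x \le \delta_2 \\
g_1(x) & \text{if } x \in I_1
\end{cases} \]

and hence

\[ \int_0^1 dx \Big[ \int_0^1 dy\, f(x, y) \Big] = \int_{I_1} g_1(x)\,dx = 1 \]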

The conclusion is that for the $f(x, y)$ above, which is defined for all $0 \le x \le 1$, $0 \le y \le 1$ and is continuous except at $(0, 0)$,

\[ \int_0^1 dy \Big[ \int_0^1 dx\, f(x, y) \Big] = 0 \qquad\qquad \int_0^1 dx \Big[ \int_0^1 dy\, f(x, y) \Big] = 1 \]
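The two inner integrals computed above can be spot-checked numerically. The following Python sketch is only an illustration under stated assumptions: it takes $\delta_n = \frac{1}{2^{n-1}}$, uses the triangular $g_n$ from the example, and approximates each inner integral with a midpoint rule restricted to the (at most two) intervals on which $f$ is nonzero. The helper names `delta`, `index_of`, `g`, `f`, `midpoint`, `inner_dx` and `inner_dy` are hypothetical.

```python
def delta(n):
    """Assumed sequence: delta_n = 1 / 2**(n-1)."""
    return 2.0 ** (-(n - 1))


def index_of(t):
    """The unique n with t in I_n = (delta_{n+1}, delta_n], for 0 < t <= 1."""
    n = 1
    while t <= delta(n + 1):
        n += 1
    return n


def g(n, t):
    """Triangular bump supported on I_n with integral one."""
    lo, hi = delta(n + 1), delta(n)
    mid = 0.5 * (lo + hi)
    scale = (2.0 / (hi - lo)) ** 2
    if mid <= t <= hi:
        return scale * (hi - t)
    if lo <= t <= mid:
        return scale * (t - lo)
    return 0.0


def f(x, y):
    """The integrand of the example, for 0 <= x, y <= 1."""
    if x <= 0.0 or y <= 0.0:
        return 0.0
    m, n = index_of(x), index_of(y)
    if m == n:
        return g(m, x) * g(n, y)
    if m == n + 1:
        return -g(m, x) * g(n, y)
    return 0.0


def midpoint(func, a, b, steps=2000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / steps
    return sum(func(a + (k + 0.5) * h) for k in range(steps)) * h


def inner_dx(y):
    """Integral of f(x, y) dx for fixed 0 < y <= 1 (x in I_n or I_{n+1})."""
    n = index_of(y)
    return (midpoint(lambda x: f(x, y), delta(n + 1), delta(n))
            + midpoint(lambda x: f(x, y), delta(n + 2), delta(n + 1)))


def inner_dy(x):
    """Integral of f(x, y) dy for fixed 0 < x <= 1 (y in I_m or I_{m-1})."""
    m = index_of(x)
    total = midpoint(lambda y: f(x, y), delta(m + 1), delta(m))
    if m >= 2:
        total += midpoint(lambda y: f(x, y), delta(m), delta(m - 1))
    return total


print(inner_dx(0.3), inner_dx(0.8))    # both should be close to zero
print(inner_dy(0.3))                   # x in I_2: close to zero
print(inner_dy(0.75), g(1, 0.75))      # x in I_1: the two should agree
```

The sketch only checks the inner integrals at a few sample points. It does not replace the argument in the text, which handles all $x$ and $y$ at once.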

Even and Odd Functions


During the course of our study of integrals of functions of one variable, we found that the evaluation of certain integrals could be
substantially simplified by exploiting symmetry properties of the integrand. Concretely, in Section 1.2.1 of the CLP-2 text, we gave
the

 Definition 3.1.22. (Definition 1.2.8 in the CLP-2 text)

Let f (x) be a function of one variable. Then,


we say that f (x) is even when f (x) = f (−x) for all x, and
we say that f (x) is odd when f (x) = −f (−x) for all x.

We also saw that


$f(x) = |x|$, $f(x) = \cos x$ and $f(x) = x^2$ are even functions and
$f(x) = \sin x$, $f(x) = \tan x$ and $f(x) = x^3$ are odd functions.

In fact, if f (x) is any even power of x, then f (x) is an even function and if f (x) is any odd power of x, then f (x) is an odd
function.
We also learned how to exploit evenness and oddness to simplify integration.

 Theorem 3.1.23. (Theorem 1.2.11 in the CLP-2 text)

Let a > 0.
1. If $f(x)$ is an even function, then

\[ \int_{-a}^{a} f(x)\,dx = 2 \int_0^a f(x)\,dx \]

2. If $f(x)$ is an odd function, then

\[ \int_{-a}^{a} f(x)\,dx = 0 \]
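Both parts of the theorem are easy to test numerically. Here is a minimal Python sketch that uses a midpoint rule. The sample functions $x^2\cos x$ (even) and $x^3 + \sin x$ (odd), the value $a = 1.5$, and the helper name `midpoint` are assumptions chosen for illustration only.

```python
import math


def midpoint(func, a, b, steps=10000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / steps
    return sum(func(a + (k + 0.5) * h) for k in range(steps)) * h


def even_example(x):
    return x * x * math.cos(x)      # even: unchanged under x -> -x


def odd_example(x):
    return x ** 3 + math.sin(x)     # odd: changes sign under x -> -x


a = 1.5
full_even = midpoint(even_example, -a, a)
half_even = midpoint(even_example, 0.0, a)
full_odd = midpoint(odd_example, -a, a)

print(full_even, 2.0 * half_even)   # the two numbers should nearly agree
print(full_odd)                     # should be nearly zero
```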

We will now see that we can similarly exploit evenness and oddness of functions of more than one variable. But for functions of
more than one variable there is also more than one kind of oddness and evenness. In the Definition 3.1.22 (Definition 1.2.8 in the
CLP-2 text) of evenness and oddness of the function f (x), we compared the value of f at x with the value of f at −x. The points
x and −x are the same distance from the origin, 0, and are on opposite sides of 0. The point −x is called the reflection of x across

the origin. To prepare for our definitions of evenness and oddness of functions of two variables, we now define three different
reflections in the two dimensional world of the xy-plane.

 Definition 3.1.24

Let x and y be two real numbers.


The reflection of (x, y) across the y -axis is (−x, y).
The reflection of (x, y) across the x-axis is (x, −y).
The reflection of (x, y) across the origin is (−x, −y).

To get from the point (x, y) to its image reflected across the y -axis, you
start from (x, y), and
walk horizontally straight to the y -axis, and
cross the y -axis, and
continue horizontally the same distance as you have already travelled to (−x, y).
Here are four examples.

To get from the point (x, y) to its image reflected across the x-axis, you
start from (x, y), and
walk vertically straight to the x-axis, and
cross the x-axis, and
continue vertically the same distance as you have already travelled to the reflected image (x, −y).
Here are four examples.

To get from the point (x, y) to its image reflected across the origin, you
start from (x, y), and
walk radially straight to the origin, and
cross the origin, and
continue radially in the same direction the same distance as you have already travelled to the reflected image (−x, −y).
Here are three examples.

For each of these three types of reflection, there is a corresponding kind of oddness and evenness.

 Definition 3.1.25

Let f (x, y) be a function of two variables. Then,


we say that f (x, y) is even (under reflection across the origin) when f (−x, −y) = f (x, y) for all x and y, and
we say that f (x, y) is odd (under reflection across the origin) when f (−x, −y) = −f (x, y) for all x and y
and
we say that f (x, y) is even under x → −x (i.e. under reflection across the y-axis) when f (−x, y) = f (x, y) for all x and y, and
we say that f (x, y) is odd under x → −x (i.e. under reflection across the y-axis) when f (−x, y) = −f (x, y) for all x and y
and
we say that f (x, y) is even under y → −y (i.e. under reflection across the x-axis) when f (x, −y) = f (x, y) for all x and y, and
we say that f (x, y) is odd under y → −y (i.e. under reflection across the x-axis) when f (x, −y) = −f (x, y) for all x and y.

 Example 3.1.26

Let $m$ and $n$ be two integers and set $f(x, y) = x^m y^n$. Then

\[ \begin{aligned}
f(-x, y) &= (-x)^m y^n = (-1)^m x^m y^n = (-1)^m f(x, y) \\
f(x, -y) &= x^m (-y)^n = (-1)^n x^m y^n = (-1)^n f(x, y) \\
f(-x, -y) &= (-x)^m (-y)^n = (-1)^{m+n} x^m y^n = (-1)^{m+n} f(x, y)
\end{aligned} \]

Consequently
if m is even, then f (x, y) is even under x → −x and
if m is odd, then f (x, y) is odd under x → −x and
if n is even, then f (x, y) is even under y → −y and

if n is odd, then f (x, y) is odd under y → −y and
if m + n is even, then f (x, y) is even (under reflection across the origin) and
if m + n is odd, then f (x, y) is odd (under reflection across the origin).
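The parity rules for $x^m y^n$ listed above can be spot-checked with a few lines of Python. This is only a sketch: the function `parity` judges evenness or oddness on a handful of sample points, so it can refute a symmetry but cannot prove one. All names here (`monomial`, `REFLECTIONS`, `SAMPLES`, `parity`) are hypothetical, and the exponents are assumed to be nonnegative.

```python
def monomial(m, n):
    """The function f(x, y) = x**m * y**n for nonnegative integers m, n."""
    return lambda x, y: x ** m * y ** n


# The three reflections of Definition 3.1.24.
REFLECTIONS = {
    "x -> -x": lambda x, y: (-x, y),              # across the y-axis
    "y -> -y": lambda x, y: (x, -y),              # across the x-axis
    "(x, y) -> -(x, y)": lambda x, y: (-x, -y),   # across the origin
}

# Sample points with exactly representable coordinates, none on an axis.
SAMPLES = [(0.5, 1.5), (-1.25, 0.75), (2.0, -0.5)]


def parity(f, reflect, samples=SAMPLES, tol=1e-12):
    """Return 'even', 'odd' or 'neither', judged on the sample points only."""
    even = all(abs(f(*reflect(x, y)) - f(x, y)) <= tol for x, y in samples)
    odd = all(abs(f(*reflect(x, y)) + f(x, y)) <= tol for x, y in samples)
    if even:
        return "even"
    if odd:
        return "odd"
    return "neither"


f = monomial(3, 2)
for name, reflect in REFLECTIONS.items():
    print(name, parity(f, reflect))
```

For $m = 3$, $n = 2$ the rules above predict odd under $x \to -x$, even under $y \to -y$, and odd under reflection across the origin, since $m + n = 5$ is odd.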

Recall from Theorem 3.1.23 (or Theorem 1.2.11 in the CLP-2 text) that we can exploit the evenness or oddness of the integrand, $f(x)$, of the integral $\int_b^a f(x)\,dx$ to simplify the evaluation of the integral when $b = -a$, i.e. when the domain of integration is invariant under reflection across the origin. Similarly, we will be able to simplify the evaluation of the double integral $\iint_R f(x, y)\,dx\,dy$ when the integrand is even or odd and the domain of integration $R$ is invariant under the corresponding reflection, meaning that the reflected $R$ is identical to the original $R$. Here are some details for “reflection across the $y$-axis”. The details for the other reflections are similar.
If R is any subset of the xy-plane,

the reflection of R across the y-axis = {(−x, y)|(x, y) ∈ R}

The set notation on the right hand side means “the set of all points (−x, y) with (x, y) a point of R ”.
In the special case 11 that

R = {(x, y)|c ≤ y ≤ d, L(y) ≤ x ≤ R(y)}

(see §3.1.2 on horizontal slices) then

the reflection of R across the y-axis = {(x, y)|c ≤ y ≤ d, −R(y) ≤ x ≤ −L(y)}

In the sketch below $R_y$ is the reflection of $R$ across the $y$-axis.

A subset $R$ of the $xy$-plane is invariant under reflection across the $y$-axis (also known as “symmetric about the $y$-axis”) when

\[ (-x, y) \text{ is in } R \iff (x, y) \text{ is in } R \]

Recall that the symbol $\iff$ is read “if and only if”. In the special case that

\[ R = \{(x, y) \mid c \le y \le d,\ L(y) \le x \le R(y)\} \]

$R$ is invariant under reflection across the $y$-axis when $L(y) = -R(y)$.


Here are some more sketches. The first sketch is of a rectangle that is invariant under reflection across the y -axis, but is not
invariant under reflection across the x-axis. The remaining three sketches show a triangle and its reflections across the y -axis,
across the x-axis and across the origin.

We are finally ready for the analog of Theorem 3.1.23 (Theorem 1.2.11 in the CLP-2 text) for functions of two variables. By way of motivation for that theorem, consider the integral $\iint_R f(x, y)\,dx\,dy$, with the integrand, $f(x, y)$, odd under $x \to -x$, and the domain of integration, $R$, symmetric about the $y$-axis. Slice up $R$ into tiny (think “infinitesimal”) squares, either by subdividing vertical slices into tiny squares, as in §3.1.1, or by subdividing horizontal slices into tiny squares, as in §3.1.2. Concentrate on any point $(x_0, y_0)$ in $R$.

The contribution to the integral coming from the square that contains $(x_0, y_0)$ is (essentially 12) $f(x_0, y_0)\,\Delta x\,\Delta y$. That contribution is cancelled by the contribution coming from the square containing (the reflected point) $(-x_0, y_0)$, which is

\[ f(-x_0, y_0)\,\Delta x\,\Delta y = -f(x_0, y_0)\,\Delta x\,\Delta y \]

This is the case for all points $(x_0, y_0)$ in $R$. Consequently

\[ \iint_R f(x, y)\,dx\,dy = 0 \]
R

Here is the analog of Theorem 3.1.23 for functions of two variables.

 Theorem 3.1.27. 2d Even and Odd


1. Let $R$ be a subset of the $xy$-plane that is symmetric about the $y$-axis. If $f(x, y)$ is odd under $x \to -x$, then

\[ \iint_R f(x, y)\,dx\,dy = 0 \]

Denote by $R_+$ the set of all points in $R$ that have $x \ge 0$. If $f(x, y)$ is even under $x \to -x$, then

\[ \iint_R f(x, y)\,dx\,dy = 2 \iint_{R_+} f(x, y)\,dx\,dy \]

2. Let $R$ be a subset of the $xy$-plane that is symmetric about the $x$-axis. If $f(x, y)$ is odd under $y \to -y$, then

\[ \iint_R f(x, y)\,dx\,dy = 0 \]

Denote by $R_+$ the set of all points in $R$ that have $y \ge 0$. If $f(x, y)$ is even under $y \to -y$, then

\[ \iint_R f(x, y)\,dx\,dy = 2 \iint_{R_+} f(x, y)\,dx\,dy \]

3. Let $R$ be a subset of the $xy$-plane that is invariant under reflection across the origin. If $f(x, y)$ is odd (under reflection across the origin), then

\[ \iint_R f(x, y)\,dx\,dy = 0 \]

Denote by $R_+$ either the set of all points in $R$ that have $x \ge 0$ or the set of all points in $R$ that have $y \ge 0$. If $f(x, y)$ is even (under reflection across the origin), then

\[ \iint_R f(x, y)\,dx\,dy = 2 \iint_{R_+} f(x, y)\,dx\,dy \]

Proof
We will give only the proof for part (a) in the special case that

R = {(x, y)|c ≤ y ≤ d, L(y) ≤ x ≤ R(y)}

In part (a), we are assuming that $R$ is symmetric about the $y$-axis, so that $L(y) = -R(y)$. So, using horizontal strips, as described in §3.1.2,

\[ \iint_R f(x, y)\,dx\,dy = \int_c^d dy \int_{-R(y)}^{R(y)} dx\, f(x, y) \]

Fix any $c \le y \le d$.
If $f(x, y)$ is odd under $x \to -x$, then $f(-x, y) = -f(x, y)$ for all $-R(y) \le x \le R(y)$ and

\[ \int_{-R(y)}^{R(y)} dx\, f(x, y) = 0 \]

by part (b) of Theorem 3.1.23 (Theorem 1.2.11 in the CLP-2 text).
If $f(x, y)$ is even under $x \to -x$, then $f(-x, y) = f(x, y)$ for all $-R(y) \le x \le R(y)$ and

\[ \int_{-R(y)}^{R(y)} dx\, f(x, y) = 2 \int_0^{R(y)} dx\, f(x, y) \]

by part (a) of Theorem 3.1.23.

As the statements of the two bullets are true for each fixed $c \le y \le d$, we have that
if $f(x, y)$ is odd under $x \to -x$, then

\[ \iint_R f(x, y)\,dx\,dy = \int_c^d dy \int_{-R(y)}^{R(y)} dx\, f(x, y) = \int_c^d dy\ 0 = 0 \]

and if $f(x, y)$ is even under $x \to -x$, then

\[ \iint_R f(x, y)\,dx\,dy = \int_c^d dy \int_{-R(y)}^{R(y)} dx\, f(x, y) = \int_c^d dy\ 2 \int_0^{R(y)} dx\, f(x, y) = 2 \iint_{R_+} f(x, y)\,dx\,dy \]

The proof of part (a) when R is not of the form

R = {(x, y)|c ≤ y ≤ d, L(y) ≤ x ≤ R(y)}

(for example if R has holes in it) is most easily done using the change of variables x = −u, y =v in Theorem 3.8.3, which is
part of the optional §3.8.
The proof of part (b) is similar to the proof of part (a).
The proof of part (c) is most easily done using the change of variables x = −u, y = −v in Theorem 3.8.3, which is part of the
optional §3.8.

 Example 3.1.28. $\iint_R e^x \sin(y + y^3)\,dx\,dy$

Evaluate the integral

\[ \iint_R e^x \sin(y + y^3)\,dx\,dy \]

over the triangular region R in the sketch

Solution
Start by checking the evenness and oddness properties of the integrand $f(x, y) = e^x \sin(y + y^3)$. Since

\[ \begin{aligned}
f(-x, y) &= e^{-x} \sin(y + y^3) \\
f(x, -y) &= e^x \sin\big(-y + (-y)^3\big) = e^x \sin(-y - y^3) = -e^x \sin(y + y^3) = -f(x, y) \\
f(-x, -y) &= -e^{-x} \sin(y + y^3)
\end{aligned} \]

the integrand is odd under $y \to -y$ but is neither even nor odd under $x \to -x$ and $(x, y) \to -(x, y)$. Fortunately (or by rigging), the domain of integration $R$ is invariant under $y \to -y$ (i.e. is symmetric about the $x$-axis) and so

\[ \iint_R e^x \sin(y + y^3)\,dx\,dy = 0 \]

by part (b) of Theorem 3.1.27.
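The cancellation can also be observed numerically. The sketch of $R$ is not reproduced here, so the Python illustration below assumes a stand-in triangle with vertices $(0, -1)$, $(0, 1)$ and $(1, 0)$, which is symmetric about the $x$-axis; the actual triangle in the text may differ. The helper name `integrate_over_triangle` is also hypothetical. The double integral is approximated by a midpoint rule on vertical slices $-(1 - x) \le y \le 1 - x$.

```python
import math


def f(x, y):
    """The integrand of the example."""
    return math.exp(x) * math.sin(y + y ** 3)


def integrate_over_triangle(func, n=200):
    """Midpoint rule over the assumed triangle 0 <= x <= 1, |y| <= 1 - x."""
    hx = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * hx
        top = 1.0 - x                      # the slice runs from -top to top
        hy = 2.0 * top / n
        for j in range(n):
            y = -top + (j + 0.5) * hy      # sample points are symmetric in y
            total += func(x, y) * hx * hy
    return total


print(integrate_over_triangle(f))                     # nearly zero
print(integrate_over_triangle(lambda x, y: 1.0))      # area of the triangle
```

Because the sample points on each vertical slice come in pairs $(x, y)$, $(x, -y)$, the contributions of an integrand that is odd under $y \to -y$ cancel in pairs, just as in the motivation for Theorem 3.1.27.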

 Example 3.1.29. $\iint_R (x e^y + y e^x + x e^{xy} + 7)\,dx\,dy$

Evaluate the integral

\[ \iint_R (x e^y + y e^x + x e^{xy} + 7)\,dx\,dy \]

over the region $R$ whose outer boundary is the ellipse $x^2 + 4y^2 = 1$.

Solution
First, let's sketch the ellipse $x^2 + 4y^2 = 1$. Notice that its $x$-intercepts are the points $(x, 0)$ that obey $x^2 + 4(0)^2 = 1$. So the $x$-intercepts are $(\pm 1, 0)$. Similarly its $y$-intercepts are the points $(0, y)$ that obey $0^2 + 4y^2 = 1$. So the $y$-intercepts are $(0, \pm 1/2)$. Here is a sketch of $R$.

From the sketch, it looks like $R$ is invariant under $x \to -x$ (i.e. is symmetric about the $y$-axis) and is also invariant under $y \to -y$ (i.e. is symmetric about the $x$-axis) and is also invariant under $(x, y) \to -(x, y)$. It is easy to check analytically that this is indeed the case. The point $(x, y)$ is in $R$ if and only if it is inside $x^2 + 4y^2 = 1$. That is the case if and only if $x^2 + 4y^2 \le 1$. Since

\[ (-x)^2 + 4y^2 = x^2 + 4(-y)^2 = (-x)^2 + 4(-y)^2 = x^2 + 4y^2 \]

we have

\[ \begin{aligned}
(x, y) \text{ is in } R &\iff (-x, y) \text{ is in } R \\
&\iff (x, -y) \text{ is in } R \\
&\iff (-x, -y) \text{ is in } R
\end{aligned} \]

Now let's check the evenness and oddness properties of the integrand.

\[ \begin{aligned}
f(x, y) &= x e^y + y e^x + x e^{xy} + 7 \\
f(-x, y) &= -x e^y + y e^{-x} - x e^{-xy} + 7 \\
f(x, -y) &= x e^{-y} - y e^x + x e^{-xy} + 7 \\
f(-x, -y) &= -x e^{-y} - y e^{-x} - x e^{xy} + 7
\end{aligned} \]

So $f(x, y)$ is neither even nor odd under any of $x \to -x$, $y \to -y$, and $(x, y) \to -(x, y)$. BUT, look at the four terms of $f(x, y)$ separately.

The first term of $f(x, y)$, namely $x e^y$, is odd under $x \to -x$.
The second term of $f(x, y)$, namely $y e^x$, is odd under $y \to -y$.
The third term of $f(x, y)$, namely $x e^{xy}$, is odd under $(x, y) \to -(x, y)$.
The fourth term of $f(x, y)$, namely $7$, is even under all of $x \to -x$, $y \to -y$, and $(x, y) \to -(x, y)$.
So, by parts (a), (b) and (c) of Theorem 3.1.27, in order,

\[ \begin{aligned}
\iint_R (x e^y + y e^x + x e^{xy} + 7)\,dx\,dy
&= \iint_R x e^y\,dx\,dy + \iint_R y e^x\,dx\,dy + \iint_R x e^{xy}\,dx\,dy + 7 \iint_R dx\,dy \\
&= 0 + 0 + 0 + 7\,\text{Area}(R)
\end{aligned} \]

Since $R$ is an ellipse with semi-major axis $a = 1$ and semi-minor axis $b = \frac{1}{2}$, it has area $\pi a b = \frac{1}{2}\pi$ and

\[ \iint_R (x e^y + y e^x + x e^{xy} + 7)\,dx\,dy = \frac{7}{2}\pi \]
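As a rough numerical cross-check of the value $\frac{7}{2}\pi \approx 10.9956$, here is a Python sketch that approximates the double integral over the ellipse by a midpoint rule on vertical slices $-\frac{1}{2}\sqrt{1 - x^2} \le y \le \frac{1}{2}\sqrt{1 - x^2}$. The helper name `integrate_over_ellipse` and the grid sizes are assumptions made for this illustration; the result is only approximate because of the square-root behaviour of the slice length near $x = \pm 1$.

```python
import math


def f(x, y):
    """The full integrand of the example."""
    return x * math.exp(y) + y * math.exp(x) + x * math.exp(x * y) + 7.0


def odd_part(x, y):
    """The three terms that the symmetry argument says integrate to zero."""
    return f(x, y) - 7.0


def integrate_over_ellipse(func, nx=400, ny=100):
    """Midpoint rule over x^2 + 4 y^2 <= 1 using vertical slices."""
    hx = 2.0 / nx
    total = 0.0
    for i in range(nx):
        x = -1.0 + (i + 0.5) * hx
        top = 0.5 * math.sqrt(1.0 - x * x)   # the slice runs from -top to top
        hy = 2.0 * top / ny
        for j in range(ny):
            y = -top + (j + 0.5) * hy
            total += func(x, y) * hx * hy
    return total


approx = integrate_over_ellipse(f)
exact = 7.0 * math.pi / 2.0
print(approx, exact)                         # should agree to a few digits
print(integrate_over_ellipse(odd_part))      # should be nearly zero
```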

Exercises
Stage 1

 1

For each of the following, evaluate the given double integral without using iteration. Instead, interpret the integral as, for example, an area or a volume.

1. $\int_{-1}^{3} \int_{-4}^{1} dy\,dx$
2. $\int_0^2 \int_0^{\sqrt{4 - y^2}} dx\,dy$
3. $\int_{-3}^{3} \int_0^{\sqrt{9 - y^2}} \sqrt{9 - x^2 - y^2}\ dx\,dy$

 2

Let $f(x, y) = 12 x^2 y^3$. Evaluate

1. $\int_0^3 f(x, y)\,dx$
2. $\int_0^2 f(x, y)\,dy$
3. $\int_0^2 \int_0^3 f(x, y)\,dx\,dy$
4. $\int_0^3 \int_0^2 f(x, y)\,dy\,dx$
5. $\int_0^3 \int_0^2 f(x, y)\,dx\,dy$

Stage 2
Questions 3.1.7.3 through 3.1.7.8 provide practice with limits of integration for double integrals in Cartesian coordinates.

 3

For each of the following, evaluate the given double integral using iteration.

1. $\iint_R (x^2 + y^2)\,dx\,dy$ where $R$ is the rectangle $0 \le x \le a$, $0 \le y \le b$ where $a > 0$ and $b > 0$.
2. $\iint_T (x - 3y)\,dx\,dy$ where $T$ is the triangle with vertices $(0, 0)$, $(a, 0)$, $(0, b)$.
3. $\iint_R x y^2\,dx\,dy$ where $R$ is the finite region in the first quadrant bounded by the curves $y = x^2$ and $x = y^2$.
4. $\iint_D x \cos y\,dx\,dy$ where $D$ is the finite region in the first quadrant bounded by the coordinate axes and the curve $y = 1 - x^2$.
5. $\iint_R \frac{x}{y} e^y\,dx\,dy$ where $R$ is the region $0 \le x \le 1$, $x^2 \le y \le x$.
6. $\iint_T \frac{xy}{1 + x^4}\,dx\,dy$ where $T$ is the triangle with vertices $(0, 0)$, $(0, 1)$, $(1, 1)$.

 4

For each of the following integrals (i) sketch the region of integration, (ii) write an equivalent double integral with the order of integration reversed and (iii) evaluate both double integrals.

1. $\int_0^2 dx \int_1^{e^x} dy$
2. $\int_0^{\sqrt{2}} dy \int_{-\sqrt{4 - 2y^2}}^{\sqrt{4 - 2y^2}} dx\ y$
3. $\int_{-2}^{1} dx \int_{x^2 + 4x}^{3x + 2} dy$

 5. ✳

Combine the sum of the two iterated double integrals

\[ \int_{y=0}^{y=1} \int_{x=0}^{x=y} f(x, y)\,dx\,dy + \int_{y=1}^{y=2} \int_{x=0}^{x=2-y} f(x, y)\,dx\,dy \]

into a single iterated double integral with the order of integration reversed.

 6. ✳

Consider the integral

\[ \int_0^1 \int_x^1 e^{x/y}\,dy\,dx \]

1. Sketch the domain of integration.
2. Evaluate the integral by reversing the order of integration.

 7. ✳

The integral $I$ is defined as

\[ I = \iint_R f(x, y)\,dA = \int_1^{\sqrt{2}} \int_{1/y}^{\sqrt{y}} f(x, y)\,dx\,dy + \int_{\sqrt{2}}^{4} \int_{y/2}^{\sqrt{y}} f(x, y)\,dx\,dy \]

1. Sketch the region $R$.
2. Rewrite the integral $I$ by reversing the order of integration.
3. Compute the integral $I$ when $f(x, y) = x/y$.

 8. ✳

A region $E$ in the $xy$-plane has the property that for all continuous functions $f$

\[ \iint_E f(x, y)\,dA = \int_{x=-1}^{x=3} \Big[ \int_{y=x^2}^{y=2x+3} f(x, y)\,dy \Big] dx \]

1. Compute $\iint_E x\,dA$.
2. Sketch the region $E$.
3. Set up $\iint_E x\,dA$ as an integral or sum of integrals in the opposite order.

 9. ✳

Calculate the integral:

\[ \iint_D \sin(y^2)\,dA \]

where $D$ is the region bounded by $x + y = 0$, $2x - y = 0$, and $y = 4$.

 10. ✳

Consider the integral

\[ I = \int_0^1 \int_{\sqrt{y}}^{1} \frac{\sin(\pi x^2)}{x}\,dx\,dy \]

1. Sketch the region of integration.
2. Evaluate $I$.

 11. ✳

Let $I$ be the double integral of the function $f(x, y) = y^2 \sin xy$ over the triangle with vertices $(0, 0)$, $(0, 1)$ and $(1, 1)$ in the $xy$-plane.

1. Write $I$ as an iterated integral in two different ways.
2. Evaluate $I$.

 12. ✳
Find the volume ($V$) of the solid bounded above by the surface

\[ z = f(x, y) = e^{-x^2}, \]

below by the plane $z = 0$ and over the triangle in the $xy$-plane formed by the lines $x = 1$, $y = 0$ and $y = x$.

 13. ✳
Consider the integral $I = \int_0^1 \int_y^{2-y} \frac{y}{x}\,dx\,dy$.

1. Sketch the region of integration.
2. Interchange the order of integration.
3. Evaluate $I$.

 14. ✳

For the integral

\[ I = \int_0^1 \int_{\sqrt{x}}^{1} \sqrt{1 + y^3}\ dy\,dx \]

1. Sketch the region of integration.
2. Evaluate $I$.

 15. ✳
1. $D$ is the region bounded by the parabola $y^2 = x$ and the line $y = x - 2$. Sketch $D$ and evaluate $J$ where

\[ J = \iint_D 3y\,dA \]

2. Sketch the region of integration and then evaluate the integral $I$:

\[ I = \int_0^4 \int_{\frac{1}{2}\sqrt{x}}^{1} e^{y^3}\,dy\,dx \]

 16. ✳

Consider the iterated integral

\[ \int_{-4}^{0} \int_{\sqrt{-y}}^{2} \cos(x^3)\,dx\,dy \]

1. Draw the region of integration.
2. Evaluate the integral.

 17. ✳
1. Combine the sum of the iterated integrals

\[ I = \int_0^1 \int_{-\sqrt{y}}^{\sqrt{y}} f(x, y)\,dx\,dy + \int_1^4 \int_{y-2}^{\sqrt{y}} f(x, y)\,dx\,dy \]

into a single iterated integral with the order of integration reversed.
2. Evaluate $I$ if $f(x, y) = \frac{e^x}{2 - x}$.

 18. ✳
Let

\[ I = \int_0^4 \int_{\sqrt{y}}^{\sqrt{8-y}} f(x, y)\,dx\,dy \]

1. Sketch the domain of integration.
2. Reverse the order of integration.
3. Evaluate the integral for $f(x, y) = \frac{1}{(1+y)^2}$.

 19. ✳

Evaluate

\[ \int_{-1}^{0} \int_{-2}^{2x} e^{y^2}\,dy\,dx \]

 20. ✳

Let

\[ I = \int_0^2 \int_0^x f(x, y)\,dy\,dx + \int_2^6 \int_0^{\sqrt{6-x}} f(x, y)\,dy\,dx \]

Express $I$ as an integral where we integrate first with respect to $x$.

 21. ✳

Consider the domain $D$ above the $x$-axis and below the parabola $y = 1 - x^2$ in the $xy$-plane.

1. Sketch $D$.
2. Express

\[ \iint_D f(x, y)\,dA \]

as an iterated integral corresponding to the order $dx\,dy$. Then express this integral as an iterated integral corresponding to the order $dy\,dx$.
3. Compute the integral in the case $f(x, y) = e^{x - (x^3/3)}$.

 22. ✳
Let $I = \int_0^1 \int_{x^2}^{1} x^3 \sin(y^3)\,dy\,dx$.

1. Sketch the region of integration in the $xy$-plane. Label your sketch sufficiently well that one could use it to determine the limits of double integration.
2. Evaluate $I$.

 23. ✳

Consider the solid under the surface z = 6 − xy, bounded by the five planes x = 0, x = 3, y = 0, y = 3, z = 0. Note that no part of the solid lies below the xy-plane.
1. Sketch the base of the solid in the xy-plane. Note that it is not a square!
2. Compute the volume of the solid.

 24. ✳

Evaluate the following integral:

\[ \int_{-2}^{2} \int_{x^2}^{4} \cos\big(y^{3/2}\big)\,dy\,dx \]

 25. ✳

Consider the volume above the $xy$-plane that is inside the circular cylinder $x^2 + y^2 = 2y$ and underneath the surface $z = 8 + 2xy$.

1. Express this volume as a double integral $I$, stating clearly the domain over which $I$ is to be taken.
2. Express in Cartesian coordinates, the double integral $I$ as an iterated integral in two different ways, indicating clearly the limits of integration in each case.
3. How much is this volume?

 26. ✳

Evaluate the following integral:

\[ \int_0^9 \int_{\sqrt{y}}^{3} \sin(\pi x^3)\,dx\,dy \]

 27. ✳

The iterated integral

\[ I = \int_0^1 \Big[ \int_{-\sqrt{x}}^{\sqrt{x}} \sin(y^3 - 3y)\,dy \Big] dx \]

is equal to $\iint_R \sin(y^3 - 3y)\,dA$ for a suitable region $R$ in the $xy$-plane.
1. Sketch the region $R$.
2. Write the integral $I$ with the orders of integration reversed, and with suitable limits of integration.
3. Find $I$.

 28. ✳

Find the double integral of the function $f(x, y) = xy$ over the region bounded by $y = x - 1$ and $y^2 = 2x + 6$.

Stage 3

 29

Find the volume of the solid inside the cylinder $x^2 + 2y^2 = 8$, above the plane $z = y - 4$ and below the plane $z = 8 - x$.

1. By the “smallest” x we mean the x farthest to the left along the number line, not the x closest to 0.
2. This theorem is named after the Italian mathematician Guido Fubini (1879--1943).
3. Think of the part of $V$ that is above the strip as being a thin slice of bread. Then the factor $dy$ in $dy \int_{L(y)}^{R(y)} dx\, f(x, y)$ is the thickness of the slice of bread. The factor $\int_{L(y)}^{R(y)} dx\, f(x, y)$ is the surface area of the constant $y$ cross-section $\{(x, z) \mid L(y) \le x \le R(y),\ 0 \le z \le f(x, y)\}$, i.e. the surface area of the slice of bread.

4. It also has two complex solutions that play no role here.


5. Perhaps the best known function whose antiderivative cannot be expressed in terms of elementary functions is $e^{-x^2}$. It is the integrand of the error function $\text{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\,dt$ that is used in computing “bell curve” probabilities. See Example 3.6.10 in the CLP-2 text.


6. See, for example, Example 3.6.10 in the CLP-2 text.
7. See §1.9 in the CLP-2 text for a general discussion of trigonometric substitution.
8. We weren't joking about this being a good review of single variable integration techniques. See Example 1.8.8 in the CLP-2 text.
9. For example, let $p_i$ be the perimeter of rectangle number $i$ and require that $\max_{1 \le i \le n} p_i$ tends to zero. This way both the heights and widths of all rectangles also tend to zero.
10. For mathematicians, “pathological” is a synonym for “cool”.
11. Here L(y) (“L” stands for “left”) is the leftmost allowed value of x when the y -coordinate is y, and R(y) (“R ” stands for
“right”) is the rightmost allowed value of x, when the y -coordinate is y.
12. In this motivation, we suppress the Δx → 0 and Δy → 0 limits.

This page titled 3.1: Double Integrals is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Joel Feldman,
Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit
history is available upon request.

3.2: Double Integrals in Polar Coordinates
So far, in setting up integrals, we have always cut up the domain of integration into tiny rectangles by drawing in many lines of
constant x and many lines of constant y.

There is no law that says that we must cut up our domains of integration into tiny pieces in that way. Indeed, when the objects of
interest are sort of round and centered on the origin, it is often advantageous 1 to use polar coordinates, rather than Cartesian
coordinates.

Polar Coordinates
It may have been a while since you did anything in polar coordinates. So let's review before we resume integrating.

 Definition 3.2.1
The polar coordinates 2 of any point (x, y) in the xy-plane are

r =  the distance from (0, 0) to (x, y)

θ =  the (counter-clockwise) angle between the x-axis 

 and the line joining (x, y) to (0, 0)

Cartesian and polar coordinates are related, via a quick bit of trigonometry, by

 Equation 3.2.2

\[ \begin{aligned}
x &= r \cos\theta & y &= r \sin\theta \\
r &= \sqrt{x^2 + y^2} & \theta &= \arctan\frac{y}{x}
\end{aligned} \]

The following two figures show a number of lines of constant θ, on the left, and curves of constant r, on the right.

Note that the polar angle $\theta$ is only defined up to integer multiples of $2\pi$. For example, the point $(1, 0)$ on the $x$-axis could have $\theta = 0$, but could also have $\theta = 2\pi$ or $\theta = 4\pi$. It is sometimes convenient to assign $\theta$ negative values. When $\theta < 0$, the counter-clockwise 3 angle $\theta$ refers to the clockwise angle $|\theta|$. For example, the point $(0, -1)$ on the negative $y$-axis can have $\theta = -\frac{\pi}{2}$ and can also have $\theta = \frac{3\pi}{2}$.

3.2.1 https://math.libretexts.org/@go/page/89216
It is also sometimes convenient to extend the above definitions by saying that $x = r \cos\theta$ and $y = r \sin\theta$ even when $r$ is negative. For example, the following figure shows $(x, y)$ for $r = 1$, $\theta = \frac{\pi}{4}$ and for $r = -1$, $\theta = \frac{\pi}{4}$.

Both points lie on the line through the origin that makes an angle of $45^\circ$ with the $x$-axis and both are a distance one from the origin. But they are on opposite sides of the origin.
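The conversions in Equation 3.2.2 translate directly into code. The Python sketch below is an illustration under stated assumptions: the helper names `to_cartesian` and `to_polar` are hypothetical, and `to_polar` uses the standard library function `math.atan2(y, x)` rather than $\arctan\frac{y}{x}$, because $\arctan\frac{y}{x}$ on its own only gives a correct angle when $x > 0$. The angle returned lies in $(-\pi, \pi]$, which is one of the many valid choices of $\theta$.

```python
import math


def to_cartesian(r, theta):
    """(x, y) = (r cos(theta), r sin(theta)); r is allowed to be negative."""
    return (r * math.cos(theta), r * math.sin(theta))


def to_polar(x, y):
    """Return (r, theta) with r >= 0 and -pi < theta <= pi."""
    return (math.hypot(x, y), math.atan2(y, x))


# The point (0, -1): theta = -pi/2 and theta = 3*pi/2 describe the same point.
print(to_polar(0.0, -1.0))
print(to_cartesian(1.0, 1.5 * math.pi))

# r = 1 and r = -1 with the same theta give points on opposite sides of the origin.
print(to_cartesian(1.0, math.pi / 4))
print(to_cartesian(-1.0, math.pi / 4))

# Why arctan(y/x) alone is not enough: for (-1, -1) it returns pi/4.
print(math.atan((-1.0) / (-1.0)), math.atan2(-1.0, -1.0))
```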

Polar Curves
Here are a couple of examples in which we sketch curves specified by equations in terms of polar coordinates.

 Example 3.2.3. The Cardioid

Let's sketch the curve

r = 1 + cos θ

Our starting point will be to understand how 1 + cos θ varies with θ. So it will be helpful to remember what the graph of cos θ
looks like for 0 ≤ θ ≤ 2π.

From this we see that the graph of y = 1 + cos θ is

Now let's pick some easy θ values, find the corresponding r's and sketch them.
When θ = 0, we have r = 1 + cos 0 = 1 + 1 = 2. To sketch the point with θ = 0 and r = 2, we first draw in the half-line
consisting of all points with θ = 0, r > 0. That's the positive x-axis, sketched in gray in the leftmost figure below. Then we
put in a dot on that line a distance 2 from the origin. That's the red dot in the first figure below.
Now increase $\theta$ a bit (to another easy place to evaluate), say to $\theta = \frac{\pi}{6}$. As we do so $r = 1 + \cos\theta$ decreases to $r = 1 + \cos\frac{\pi}{6} = 1 + \frac{\sqrt{3}}{2} \approx 1.87$. To sketch the point with $\theta = \frac{\pi}{6}$ and $r \approx 1.87$, we first draw in the half-line consisting of all points with $\theta = \frac{\pi}{6}$, $r > 0$. That's the upper gray line in the second figure below. Then we put in a dot on that line a distance $1.87$ from the origin. That's the upper red dot in the second figure below.

Now increase $\theta$ still more, say to $\theta = \frac{2\pi}{6} = \frac{\pi}{3}$, followed by $\theta = \frac{3\pi}{6} = \frac{\pi}{2}$, followed by $\theta = \frac{4\pi}{6} = \frac{2\pi}{3}$, followed by $\theta = \frac{5\pi}{6}$, followed by $\theta = \frac{6\pi}{6} = \pi$.

As $\theta$ increases, $r = 1 + \cos\theta$ decreases, hitting $r = 1$ when $\theta = \frac{\pi}{2}$ and ending at $r = 0$ when $\theta = \pi$. For each of these $\theta$'s, we first draw in the half-line consisting of all points with that $\theta$ and $r \ge 0$. Those are the five gray lines in the figure on the right above. Then we put in a dot on each $\theta$-line a distance $r = 1 + \cos\theta$ from the origin. Those are the red dots on the gray lines in the figure on the right above.
We could continue the above procedure for π ≤ θ ≤ 2π. Or we can look at the graph of cos θ above and notice that the
graph of cos θ for π ≤ θ ≤ 2π is exactly the mirror image, about θ = π, of the graph of cos θ for 0 ≤ θ ≤ π.

That is, cos(π + θ) = cos(π − θ) so that r(π + θ) = r(π − θ). So we get the figure.

Finally, we fill in a smooth curve through the dots and we get the graph below. This curve is called a cardioid because it
looks like a heart 4.
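The point-plotting procedure used in this example is easy to mimic in code. The Python sketch below tabulates $r = 1 + \cos\theta$ at the same easy angles $\theta = \frac{k\pi}{6}$ and converts each point to Cartesian coordinates. The function name `cardioid_points` is hypothetical, and the sketch only produces the dots; it does not draw the curve.

```python
import math


def cardioid_points(steps=12):
    """Rows (theta, r, x, y) for r = 1 + cos(theta), theta = 2*pi*k/steps."""
    rows = []
    for k in range(steps + 1):
        theta = 2.0 * math.pi * k / steps
        r = 1.0 + math.cos(theta)
        rows.append((theta, r, r * math.cos(theta), r * math.sin(theta)))
    return rows


rows = cardioid_points()
for theta, r, x, y in rows:
    print(f"theta = {theta:6.3f}   r = {r:5.3f}   (x, y) = ({x:6.3f}, {y:6.3f})")
```

With the default `steps=12` the angles are multiples of $\frac{\pi}{6}$, so the first seven rows reproduce the dots described above, and the remaining rows are their mirror images, reflecting $r(\pi + \theta) = r(\pi - \theta)$.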

 Example 3.2.4. The Three Petal Rose


Now we'll use the same procedure as in the last example to sketch the graph of

r = sin(3θ)

Again it will be useful to remember what the graph of sin(3θ) looks like for 0 ≤ θ ≤ 2π.

We'll first consider $0 \le \theta \le \frac{\pi}{3}$, so that $0 \le 3\theta \le \pi$. On this interval $r(\theta) = \sin(3\theta)$
starts with $r(0) = 0$, and then
increases as $\theta$ increases until $3\theta = \frac{\pi}{2}$, i.e. $\theta = \frac{\pi}{6}$, where $r\big(\frac{\pi}{6}\big) = 1$, and then
decreases as $\theta$ increases until $3\theta = \pi$, i.e. $\theta = \frac{\pi}{3}$, where $r\big(\frac{\pi}{3}\big) = 0$, again.

Here is a table giving a few values of $r(\theta)$ for $0 \le \theta \le \frac{\pi}{3}$. Notice that we have chosen values of $\theta$ for which $\sin(3\theta)$ is easy to compute.

θ         3θ        r(θ)
0         0         0
π/12      π/4       1/√2 ≈ 0.71
2π/12     π/2       1
3π/12     3π/4      1/√2 ≈ 0.71
4π/12     π         0

and here is a sketch exhibiting those values and another sketch of the part of the curve with $0 \le \theta \le \frac{\pi}{3}$.

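Next consider $\frac{\pi}{3} \le \theta \le \frac{2\pi}{3}$, so that $\pi \le 3\theta \le 2\pi$. On this interval $r(\theta) = \sin(3\theta)$
starts with $r\big(\frac{\pi}{3}\big) = 0$, and then
decreases as $\theta$ increases until $3\theta = \frac{3\pi}{2}$, i.e. $\theta = \frac{\pi}{2}$, where $r\big(\frac{\pi}{2}\big) = -1$, and then
increases as $\theta$ increases until $3\theta = 2\pi$, i.e. $\theta = \frac{2\pi}{3}$, where $r\big(\frac{2\pi}{3}\big) = 0$, again.

We are now encountering, for the first time, $r(\theta)$'s that are negative. The figure on the left below contains, for each of $\theta = \frac{4\pi}{12} = \frac{\pi}{3}$, $\frac{5\pi}{12}$, $\frac{6\pi}{12} = \frac{\pi}{2}$, $\frac{7\pi}{12}$ and $\frac{8\pi}{12} = \frac{2\pi}{3}$
the (dashed) half-line consisting of all points with that $\theta$ and $r < 0$ and
the dot with that $\theta$ and $r(\theta) = \sin(3\theta)$.
The figure on the right below provides a sketch of the part of the curve $r = \sin(3\theta)$ with $\frac{\pi}{3} \le \theta \le \frac{2\pi}{3}$.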

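Finally consider $\frac{2\pi}{3} \le \theta \le \pi$ (because $r(\theta + \pi) = \sin(3\theta + 3\pi) = -\sin(3\theta) = -r(\theta)$, the part of the curve with $\pi \le \theta \le 2\pi$ just retraces the part with $0 \le \theta \le \pi$), so that $2\pi \le 3\theta \le 3\pi$. On this interval $r(\theta) = \sin(3\theta)$
starts with $r\big(\frac{2\pi}{3}\big) = 0$, and then
increases as $\theta$ increases until $3\theta = \frac{5\pi}{2}$, i.e. $\theta = \frac{10\pi}{12} = \frac{5\pi}{6}$, where $r\big(\frac{5\pi}{6}\big) = 1$, and then
decreases as $\theta$ increases until $3\theta = 3\pi$, i.e. $\theta = \frac{12\pi}{12} = \pi$, where $r(\pi) = 0$, again.

The figure on the left below contains, for each of $\theta = \frac{8\pi}{12} = \frac{2\pi}{3}$, $\frac{9\pi}{12}$, $\frac{10\pi}{12}$, $\frac{11\pi}{12}$ and $\frac{12\pi}{12} = \pi$
the (solid) half-line consisting of all points with that $\theta$ and $r \ge 0$ and
the dot with that $\theta$ and $r(\theta) = \sin(3\theta)$.
The figure on the right below provides a sketch of the part of the curve $r = \sin(3\theta)$ with $\frac{2\pi}{3} \le \theta \le \pi$.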

Putting the three lobes together gives the full curve, which is called the “three petal rose”.
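The claim that the part of the curve with $\pi \le \theta \le 2\pi$ just retraces the part with $0 \le \theta \le \pi$ can be checked numerically. The Python sketch below (the function name `rose_point` is hypothetical) converts $r = \sin(3\theta)$ to Cartesian coordinates, allowing negative $r$ exactly as described earlier, and compares the points at $\theta$ and $\theta + \pi$.

```python
import math


def rose_point(theta):
    """Cartesian point on r = sin(3 theta); r may be negative."""
    r = math.sin(3.0 * theta)
    return (r * math.cos(theta), r * math.sin(theta))


# The point at theta + pi coincides with the point at theta.
for k in range(12):
    theta = k * math.pi / 12
    x0, y0 = rose_point(theta)
    x1, y1 = rose_point(theta + math.pi)
    print(f"{theta:5.3f}  ({x0:6.3f}, {y0:6.3f})  ({x1:6.3f}, {y1:6.3f})")

# theta = pi/2 has r = -1, so the dot sits on the negative y-axis.
print(rose_point(math.pi / 2))
```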

There is an infinite family of similar rose curves (also called rhodonea 5 curves).

Integrals in Polar Coordinates


We now return to the problem of using polar coordinates to set up double integrals. So far, we have used Cartesian coordinates, in
the sense that we have cut up our domains of integration into tiny rectangles (on which the integrand is essentially constant) by
drawing in many lines of constant x and many lines of constant y. To use polar coordinates, we instead draw in both lines of
constant θ and curves of constant r. This cuts the xy-plane up into approximate rectangles.

Here is an enlarged sketch of one such approximate rectangle.

One side has length dr, the spacing between the curves of constant r. The other side is a portion of a circle of radius r that subtends, at the origin, an angle dθ, the angle between the lines of constant θ. As the circumference of the full circle is 2πr and as dθ is the fraction dθ/(2π) of a full circle 6, the other side of the approximate rectangle has length (dθ/(2π)) 2πr = r dθ. So the shaded region has area approximately

 Equation 3.2.5

dA = r dr dθ

By way of comparison, using Cartesian coordinates we had dA = dx dy.


This intuitive computation has been somewhat handwavy 7. But using it in the usual integral setup procedure, in which we choose dr and dθ to be constants times 1/n and then take the limit n → ∞, gives, in the limit, error exactly zero. A sample argument, in which we see the error going to zero in the limit n → ∞, is provided in the (optional) section §3.2.4.
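As a small numerical illustration of the claim, the sketch below compares the exact area of one polar "approximate rectangle" (the piece of an annulus between radii r and r + dr and angles θ and θ + dθ) with the approximation r dr dθ. The function names are illustrative choices, not notation from the text. The exact area is ½[(r + dr)² − r²] dθ = r dr dθ + ½ dr² dθ, so the relative error should shrink like dr.

```python
import math

def exact_patch_area(r, dr, dtheta):
    """Exact area of the annular sector with radii in [r, r+dr] and angular width dtheta."""
    return 0.5 * ((r + dr) ** 2 - r ** 2) * dtheta

def approx_patch_area(r, dr, dtheta):
    """The approximation dA = r dr dtheta of Equation 3.2.5."""
    return r * dr * dtheta

# Choose dr = dtheta = 1/n and watch the relative error fall as n grows.
for n in (10, 100, 1000):
    dr = dtheta = 1.0 / n
    exact = exact_patch_area(1.0, dr, dtheta)
    approx = approx_patch_area(1.0, dr, dtheta)
    print(n, (exact - approx) / exact)
```

The leftover term ½ dr² dθ is one order smaller than r dr dθ, which is why it disappears in the limit.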

 Example 3.2.6. Mass

Let 0 ≤ a < b ≤ 2π be constants and let R be the region

R = {(r cos θ, r sin θ)|a ≤ θ ≤ b, B(θ) ≤ r ≤ T (θ)}

where the functions T (θ) and B(θ) are continuous and obey B(θ) ≤ T (θ) for all a ≤ θ ≤ b. Find the mass of R if it has
density f (x, y).
Solution
The figure on the left below is a sketch of R. Notice that r = T (θ) is the outer curve while r = B(θ) is the inner curve.

Divide R into wedges (as in wedges of pie 8 or wedges of cheese) by drawing in many lines of constant θ, with the various
values of θ differing by a tiny amount dθ. The figure on the right above shows one such wedge, outlined in blue.
Concentrate on any one wedge. Subdivide the wedge further into approximate rectangles by drawing in many circles of constant r, with the various values of r differing by a tiny amount dr. The figure below shows one such approximate rectangle, in black.

Now concentrate on one such rectangle. Let's say that it contains the point with polar coordinates r and θ. As we saw in 3.2.5
above,
the area of that rectangle is essentially dA = r dr dθ.
As the mass density on the rectangle is essentially f (r cos θ , r sin θ), the mass of the rectangle is essentially
f (r cos θ , r sin θ) r dr dθ.

To get the mass of any one wedge, say the wedge whose polar angle runs from θ to θ + dθ, we just add up the masses of
the approximate rectangles in that wedge, by integrating r from its smallest value on the wedge, namely B(θ), to its largest
value on the wedge, namely T (θ). The mass of the wedge is thus
$$d\theta \int_{B(\theta)}^{T(\theta)} dr\; r\, f(r\cos\theta,\, r\sin\theta)$$

Finally, to get the mass of R, we just add up the masses of all of the different wedges, by integrating θ from its smallest
value on R, namely a, to its largest value on R, namely b.
In conclusion,
$$\text{Mass}(\mathcal{R}) = \int_a^b d\theta \int_{B(\theta)}^{T(\theta)} dr\; r\, f(r\cos\theta,\, r\sin\theta)$$

We have repeatedly used the word “essentially” above to avoid getting into the nitty-gritty details required to prove things
rigorously. The mathematically correct proof of 3.2.7 follows the same intuition, but requires some more careful error bounds,
as in the optional §3.2.4 below.

In the last example, we derived the important formula that the mass of the region

R = {(r cos θ, r sin θ)|a ≤ θ ≤ b, B(θ) ≤ r ≤ T (θ)}

with mass density f (x, y) is

 Equation 3.2.7
$$\text{Mass}(\mathcal{R}) = \int_a^b d\theta \int_{B(\theta)}^{T(\theta)} dr\; r\, f(r\cos\theta,\, r\sin\theta)$$

We can immediately adapt that example to calculate areas and derive the formula that the area of the region

R = {(r cos θ, r sin θ)|a ≤ θ ≤ b, 0 ≤ r ≤ R(θ)}

is

 Equation 3.2.8
$$\text{Area}(\mathcal{R}) = \frac{1}{2}\int_a^b R(\theta)^2\, d\theta$$

We just have to set the density to 1. We do so in the next example.

 Example 3.2.9. Polar Area

Let 0 ≤ a < b ≤ 2π be constants. Find the area of the region

R = {(r cos θ, r sin θ)|a ≤ θ ≤ b, 0 ≤ r ≤ R(θ)}

where the function R(θ) ≥ 0 is continuous.


Solution
To get the area of R we just need to assign it a density one and find the resulting mass. So, by 3.2.7, with f(x, y) = 1, B(θ) = 0 and T(θ) = R(θ),

$$\text{Area}(\mathcal{R}) = \int_a^b d\theta \int_0^{R(\theta)} dr\; r$$

In this case we can easily do the inner r integral, giving

$$\text{Area}(\mathcal{R}) = \frac{1}{2}\int_a^b R(\theta)^2\, d\theta$$

The expression $\frac{1}{2}R(\theta)^2\, d\theta$ in 3.2.8 has a geometric interpretation. It is just the area of a wedge of a circular disk of radius R(θ) (with R(θ) treated as a constant) that subtends the angle dθ.

To see this, note that the area of the wedge is the fraction $\frac{d\theta}{2\pi}$ of the area of the entire disk, which is $\pi R(\theta)^2$. So 3.2.8 just says that the area of R can be computed by cutting R up into tiny wedges and adding up the areas of all of the tiny wedges.

 Example 3.2.10. Polar Area

Find the area of one petal of the three petal rose r = sin(3θ).
Solution
Looking at the last figure in Example 3.2.4, we see that we want the area of

$$\mathcal{R} = \Big\{(r\cos\theta,\, r\sin\theta) \;\Big|\; 0 \le \theta \le \frac{\pi}{3},\ 0 \le r \le \sin(3\theta)\Big\}$$

So, by 3.2.8 with a = 0, b = π/3, and R(θ) = sin(3θ),

$$\begin{aligned}
\text{area}(\mathcal{R}) &= \frac{1}{2}\int_0^{\pi/3} \sin^2(3\theta)\, d\theta \\
&= \frac{1}{4}\int_0^{\pi/3} \big(1 - \cos(6\theta)\big)\, d\theta \\
&= \frac{1}{4}\Big[\theta - \frac{1}{6}\sin(6\theta)\Big]_0^{\pi/3} \\
&= \frac{\pi}{12}
\end{aligned}$$

In the first step we used the double angle formula $\cos(2\phi) = 1 - 2\sin^2(\phi)$. Unsurprisingly, trig identities show up a lot when polar coordinates are used.
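The value π/12 can be checked numerically. The sketch below is an illustration rather than part of the text's method: it approximates the polar area formula 3.2.8 with a midpoint Riemann sum, and the helper names `midpoint_rule` and `polar_area` are made up for this example.

```python
import math

def midpoint_rule(f, a, b, n=10_000):
    """Midpoint Riemann sum of f over [a, b] with n equal subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def polar_area(R, a, b, n=10_000):
    """Approximate (1/2) * integral of R(theta)^2 over [a, b], as in Equation 3.2.8."""
    return midpoint_rule(lambda t: 0.5 * R(t) ** 2, a, b, n)

# One petal of the three petal rose r = sin(3*theta), 0 <= theta <= pi/3.
petal = polar_area(lambda t: math.sin(3 * t), 0.0, math.pi / 3)
print(petal, math.pi / 12)
```

The two printed numbers should agree to many decimal places.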

 Example 3.2.11. Volumes Using Polar Coordinates

A cylindrical hole of radius b is drilled symmetrically (i.e. along a diameter) through a metal sphere of radius a ≥ b. Find the
volume of metal removed.
Solution
Let's use a coordinate system with the sphere centred on (0, 0, 0) and with the centre of the drill hole following the z-axis. In particular, the sphere is $x^2 + y^2 + z^2 \le a^2$.

Here is a sketch of the part of the sphere in the first octant. The hole in the sphere made by the drill is outlined in red. By
symmetry the total amount of metal removed will be eight times the amount from the first octant.

That is, the volume of metal removed will be eight times the volume of the solid

$$\mathcal{V}_1 = \Big\{(x, y, z) \;\Big|\; (x, y) \in \mathcal{R}_1,\ 0 \le z \le \sqrt{a^2 - x^2 - y^2}\Big\}$$

where the base region

$$\mathcal{R}_1 = \big\{(x, y) \;\big|\; x^2 + y^2 \le b^2,\ x \ge 0,\ y \ge 0\big\}$$

In polar coordinates

$$\mathcal{V}_1 = \Big\{(r\cos\theta,\, r\sin\theta,\, z) \;\Big|\; (r\cos\theta,\, r\sin\theta) \in \mathcal{R}_1,\ 0 \le z \le \sqrt{a^2 - r^2}\Big\}$$

$$\mathcal{R}_1 = \Big\{(r\cos\theta,\, r\sin\theta) \;\Big|\; 0 \le r \le b,\ 0 \le \theta \le \frac{\pi}{2}\Big\}$$

We follow our standard divide and sum up strategy. We will cut the base region $\mathcal{R}_1$ into small pieces and sum up the volumes that lie above each small piece.

Divide $\mathcal{R}_1$ into wedges by drawing in many lines of constant θ, with the various values of θ differing by a tiny amount dθ. The figure on the left below shows one such wedge, outlined in blue.

Concentrate on any one wedge. Subdivide the wedge further into approximate rectangles by drawing in many circles of
constant r, with the various values of r differing by a tiny amount dr. The figure on the right above shows one such
approximate rectangle, in black.
Concentrate on one such rectangle. Let's say that it contains the point with polar coordinates r and θ. As we saw in 3.2.5 above,
the area of that rectangle is essentially dA = r dr dθ.
The part of $\mathcal{V}_1$ that is above that rectangle is like an office tower whose height is essentially $\sqrt{a^2 - r^2}$, and whose base has area dA = r dr dθ. It is outlined in black in the figure below. So the volume of the part of $\mathcal{V}_1$ that is above the rectangle is essentially $\sqrt{a^2 - r^2}\; r\, dr\, d\theta$.

To get the volume of the part of $\mathcal{V}_1$ above any one wedge (outlined in blue in the figure below), say the wedge whose polar angle runs from θ to θ + dθ, we just add up the volumes above the approximate rectangles in that wedge, by integrating r from its smallest value on the wedge, namely 0, to its largest value on the wedge, namely b. The volume above the wedge is thus

$$\begin{aligned}
d\theta \int_0^b dr\; r\sqrt{a^2 - r^2}
&= d\theta \int_{a^2}^{a^2 - b^2} \frac{du}{-2}\sqrt{u} \qquad \text{where } u = a^2 - r^2,\ du = -2r\, dr \\
&= d\theta \Big[\frac{u^{3/2}}{-3}\Big]_{a^2}^{a^2 - b^2} \\
&= \frac{1}{3}\, d\theta\, \Big[a^3 - \big(a^2 - b^2\big)^{3/2}\Big]
\end{aligned}$$

Notice that this quantity is independent of θ. If you think about this for a moment, you can see that this is a consequence of
the fact that our solid is invariant under rotations about the z -axis.

Finally, to get the volume of $\mathcal{V}_1$, we just add up the volumes over all of the different wedges, by integrating θ from its smallest value on $\mathcal{R}_1$, namely 0, to its largest value on $\mathcal{R}_1$, namely π/2.

$$\begin{aligned}
\text{Volume}(\mathcal{V}_1) &= \frac{1}{3}\int_0^{\pi/2} d\theta\, \Big[a^3 - \big(a^2 - b^2\big)^{3/2}\Big] \\
&= \frac{\pi}{6}\Big[a^3 - \big(a^2 - b^2\big)^{3/2}\Big]
\end{aligned}$$

In conclusion, the total volume of metal removed is

$$\begin{aligned}
\text{Volume}(\mathcal{V}) &= 8\, \text{Volume}(\mathcal{V}_1) \\
&= \frac{4\pi}{3}\Big[a^3 - \big(a^2 - b^2\big)^{3/2}\Big]
\end{aligned}$$

Note that we can easily apply a couple of sanity checks to our answer.
If the radius of the drill bit b = 0, no metal is removed at all. So the total volume removed should be zero. Our answer does indeed give 0 in this case.
If the radius of the drill bit b = a, the radius of the sphere, then the entire sphere disappears. So the total volume removed should be the volume of a sphere of radius a. Our answer does indeed give $\frac{4}{3}\pi a^3$ in this case.
If the radius, a, of the sphere and the radius, b, of the drill bit are measured in units of meters, then the volume removed, $\frac{4\pi}{3}\big[a^3 - (a^2 - b^2)^{3/2}\big]$, has units meters³, as it should.
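The same sanity checks, plus a direct numerical comparison, can be sketched in a few lines. This is an illustration under stated assumptions: the function names are hypothetical, and the numerical version integrates the full column height 2√(a² − r²) over the disk r ≤ b with a midpoint sum instead of using the eight-fold symmetry of the worked solution.

```python
import math

def removed_volume_formula(a, b):
    """Closed form from Example 3.2.11: (4*pi/3) * [a^3 - (a^2 - b^2)^(3/2)]."""
    return 4 * math.pi / 3 * (a ** 3 - (a * a - b * b) ** 1.5)

def removed_volume_numeric(a, b, n=20_000):
    """Midpoint sum of 2*sqrt(a^2 - r^2) * r dr over 0 <= r <= b, times 2*pi for the theta integral."""
    h = b / n
    inner = h * sum(
        2 * math.sqrt(a * a - r * r) * r
        for r in ((i + 0.5) * h for i in range(n))
    )
    return 2 * math.pi * inner

print(removed_volume_formula(2.0, 1.0), removed_volume_numeric(2.0, 1.0))
print(removed_volume_formula(2.0, 0.0))                          # no drill: nothing removed
print(removed_volume_formula(2.0, 2.0), 4 * math.pi / 3 * 8.0)   # b = a: the whole sphere
```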

The previous two problems were given to us (or nearly given to us) in polar coordinates. We'll now get a little practice converting
integrals into polar coordinates, and recognising when it is helpful to do so.

 Example 3.2.12. Changing to Polar Coordinates


Convert the integral $\int_0^1 \int_0^x y\sqrt{x^2 + y^2}\; dy\, dx$ to polar coordinates and evaluate the result.
Solution
First recall that in polar coordinates x = r cos θ, y = r sin θ and dx dy = dA = r dr dθ so that the integrand (and dA)

$$y\sqrt{x^2 + y^2}\; dy\, dx = (r\sin\theta)\; r\; r\, dr\, d\theta = r^3 \sin\theta\; dr\, d\theta$$

is very simple. So whether or not this integral will be easy to evaluate using polar coordinates will be largely determined by the
domain of integration.
So our main task is to sketch the domain of integration. To prepare for the sketch, note that in the integral
$$\int_0^1 \int_0^x y\sqrt{x^2 + y^2}\; dy\, dx = \int_0^1 dx \Big[\int_0^x dy\; y\sqrt{x^2 + y^2}\Big]$$

the variable x runs from 0 to 1 and


for each fixed 0 ≤ x ≤ 1, y runs from 0 to x.
So the domain of integration is

D = {(x, y)|0 ≤ x ≤ 1,  0 ≤ y ≤ x}

which is sketched in the figure on the left below. It is a right angled triangle.

Next we express the domain of integration in terms of polar coordinates, by expressing the equations of each of the boundary
lines in terms of polar coordinates.

The x-axis, i.e. y = r sin θ = 0, is θ = 0.
The line x = 1 is r cos θ = 1 or r = 1/cos θ.
Finally, (in the first quadrant) the line

$$y = x \iff r\sin\theta = r\cos\theta \iff \tan\theta = \frac{\sin\theta}{\cos\theta} = 1 \iff \theta = \frac{\pi}{4}$$

So, in polar coordinates, we can write the domain of integration as

$$\mathcal{R} = \Big\{(r, \theta) \;\Big|\; 0 \le \theta \le \frac{\pi}{4},\ 0 \le r \le \frac{1}{\cos\theta}\Big\}$$

We can now slice up R using polar coordinates.


Divide R into wedges by drawing in many lines of constant θ, with the various values of θ differing by a tiny amount dθ.
The figure on the right above shows one such wedge.
The first wedge has θ = 0.
The last wedge has θ = π/4.
Concentrate on any one wedge. Subdivide the wedge further into approximate rectangles by drawing in many circles of constant r, with the various values of r differing by a tiny amount dr. The figure on the right above shows one such approximate rectangle, in black.
The rectangle that contains the point with polar coordinates r and θ has area (essentially) r dr dθ.
The first rectangle has r = 0.
The last rectangle has r = 1/cos θ.

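So our integral is

$$\int_0^1 \int_0^x y\sqrt{x^2 + y^2}\; dy\, dx = \int_0^{\pi/4} d\theta \int_0^{\frac{1}{\cos\theta}} dr\; r\, \underbrace{\big(r^2 \sin\theta\big)}_{y\sqrt{x^2 + y^2}}$$

Because the r-integral treats θ as a constant, we can pull the sin θ out of the inner r-integral.

$$\begin{aligned}
\int_0^1 \int_0^x y\sqrt{x^2 + y^2}\; dy\, dx &= \int_0^{\pi/4} d\theta\; \sin\theta \int_0^{\frac{1}{\cos\theta}} dr\; r^3 \\
&= \frac{1}{4}\int_0^{\pi/4} d\theta\; \sin\theta\, \frac{1}{\cos^4\theta}
\end{aligned}$$

Make the substitution

$$u = \cos\theta, \qquad du = -\sin\theta\, d\theta$$

When θ = 0, u = cos θ = 1 and when θ = π/4, u = cos θ = 1/√2. So

$$\begin{aligned}
\int_0^1 \int_0^x y\sqrt{x^2 + y^2}\; dy\, dx &= \frac{1}{4}\int_1^{1/\sqrt{2}} (-du)\, \frac{1}{u^4} \\
&= -\frac{1}{4}\Big[\frac{u^{-3}}{-3}\Big]_1^{1/\sqrt{2}} = \frac{1}{12}\big[2\sqrt{2} - 1\big]
\end{aligned}$$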
1

 Example 3.2.13. Changing to Polar Coordinates



2

Evaluate $\int_0^\infty e^{-x^2}\, dx$.

Solution
This is actually a trick question. In fact it is a famous trick question 9.

The integrand $e^{-x^2}$ does not have an antiderivative that can be expressed in terms of elementary functions 10. So we cannot evaluate this integral using the usual Calculus II methods. However we can evaluate its square

$$\Big[\int_0^\infty e^{-x^2}\, dx\Big]^2 = \int_0^\infty e^{-x^2}\, dx \int_0^\infty e^{-y^2}\, dy = \int_0^\infty dx \int_0^\infty dy\; e^{-x^2 - y^2}$$

precisely because this double integral can be easily evaluated just by changing to polar coordinates! The domain of integration is the first quadrant {(x, y)|x ≥ 0, y ≥ 0}. In polar coordinates, dx dy = r dr dθ and the first quadrant is

$$\Big\{(r\cos\theta,\, r\sin\theta) \;\Big|\; r \ge 0,\ 0 \le \theta \le \frac{\pi}{2}\Big\}$$

So
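$$\Big[\int_0^\infty e^{-x^2}\, dx\Big]^2 = \int_0^\infty dx \int_0^\infty dy\; e^{-x^2 - y^2} = \int_0^{\pi/2} d\theta \int_0^\infty dr\; r\, e^{-r^2}$$

As r runs all the way to +∞, this is an improper integral, so we should be a little bit careful.

$$\begin{aligned}
\Big[\int_0^\infty e^{-x^2}\, dx\Big]^2 &= \lim_{R\to\infty} \int_0^{\pi/2} d\theta \int_0^R dr\; r\, e^{-r^2} \\
&= \lim_{R\to\infty} \int_0^{\pi/2} d\theta \int_0^{R^2} \frac{du}{2}\; e^{-u} \qquad \text{where } u = r^2,\ du = 2r\, dr \\
&= \lim_{R\to\infty} \int_0^{\pi/2} d\theta\, \Big[-\frac{e^{-u}}{2}\Big]_0^{R^2} \\
&= \lim_{R\to\infty} \frac{\pi}{2}\Big[\frac{1}{2} - \frac{e^{-R^2}}{2}\Big] \\
&= \frac{\pi}{4}
\end{aligned}$$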

and so we get the famous result


$$\int_0^\infty e^{-x^2}\, dx = \frac{\sqrt{\pi}}{2}$$

 Example 3.2.14

Find the area of the region that is inside the circle r = 4 cos θ and to the left of the line x = 1.
Solution
First, let's check that r = 4 cos θ really is a circle and figure out what circle it is. To do so, we'll convert the equation
r = 4 cos θ into Cartesian coordinates. Multiplying both sides by r gives

$$r^2 = 4r\cos\theta \iff x^2 + y^2 = 4x \iff (x - 2)^2 + y^2 = 4$$

So r = 4 cos θ is the circle of radius 2 centred on (2, 0). We'll also need the intersection point(s) of x = r cos θ = 1 and
r = 4 cos θ. At such an intersection point
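$$\begin{aligned}
r\cos\theta = 1,\ r = 4\cos\theta &\implies \frac{1}{\cos\theta} = 4\cos\theta \\
&\implies \cos^2\theta = \frac{1}{4} \\
&\implies \cos\theta = \frac{1}{2} \qquad \text{since } r\cos\theta = 1 > 0 \\
&\implies \theta = \pm\frac{\pi}{3}
\end{aligned}$$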

Here is a sketch of the region of interest, which we'll call R.

We could figure out the area of R by using some high school geometry, because R is a circular wedge with a triangle
removed. (See Example 3.2.15, below.)

Instead, we'll treat its computation as an exercise in integration using polar coordinates.
As R is symmetric about the x-axis, the area of R is twice the area of the part that is above the x-axis. We'll denote by $\mathcal{R}_1$ the upper half of R. Note that we can write the equation x = 1 in polar coordinates as r = 1/cos θ. Here is a sketch of $\mathcal{R}_1$.

Observe that, on $\mathcal{R}_1$, for any fixed θ between 0 and π/2,
if θ < π/3, then r runs from 0 to 1/cos θ, while
if θ > π/3, then r runs from 0 to 4 cos θ.
This naturally leads us to split the domain of integration at θ = π/3:

$$\text{Area}(\mathcal{R}_1) = \int_0^{\pi/3} d\theta \int_0^{1/\cos\theta} dr\; r + \int_{\pi/3}^{\pi/2} d\theta \int_0^{4\cos\theta} dr\; r$$

As $\int r\, dr = \frac{r^2}{2} + C$,

$$\begin{aligned}
\text{Area}(\mathcal{R}_1) &= \int_0^{\pi/3} d\theta\; \frac{\sec^2\theta}{2} + \int_{\pi/3}^{\pi/2} d\theta\; 8\cos^2\theta \\
&= \frac{1}{2}\tan\theta\,\Big|_0^{\pi/3} + 4\int_{\pi/3}^{\pi/2} d\theta\, \big[1 + \cos(2\theta)\big] \\
&= \frac{\sqrt{3}}{2} + 4\Big[\theta + \frac{\sin(2\theta)}{2}\Big]_{\pi/3}^{\pi/2} \\
&= \frac{\sqrt{3}}{2} + 4\Big[\frac{\pi}{6} - \frac{\sqrt{3}}{4}\Big] \\
&= \frac{2\pi}{3} - \frac{\sqrt{3}}{2}
\end{aligned}$$

and

$$\text{Area}(\mathcal{R}) = 2\,\text{Area}(\mathcal{R}_1) = \frac{4\pi}{3} - \sqrt{3}$$
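The split at θ = π/3 can be mirrored in a short numerical sketch. After the inner r-integral, each piece is a one-dimensional integral of ½(outer radius)², which is approximated here by a midpoint sum. The function names are assumptions made for the illustration.

```python
import math

def midpoint_rule(f, a, b, n=20_000):
    """Midpoint Riemann sum of f over [a, b] with n equal subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def area_upper_half():
    # For 0 <= theta <= pi/3 the outer boundary is the line x = 1, i.e. r = 1/cos(theta).
    left = midpoint_rule(lambda t: 0.5 / math.cos(t) ** 2, 0.0, math.pi / 3)
    # For pi/3 <= theta <= pi/2 the outer boundary is the circle r = 4 cos(theta).
    right = midpoint_rule(lambda t: 0.5 * (4 * math.cos(t)) ** 2, math.pi / 3, math.pi / 2)
    return left + right

area = 2 * area_upper_half()  # the region is symmetric about the x-axis
print(area, 4 * math.pi / 3 - math.sqrt(3))
```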

 Example 3.2.15. Optional — Example 3.2.14 by high school geometry

We'll now again compute the area of the region R that is inside the circle r = 4 cos θ and to the left of the line x = 1. That was
the region of interest in Example 3.2.14. This time we'll just use some geometry. Think of R as being the wedge W, of the
figure on the left below, with the triangle T , of the figure on the right below, removed.

First we'll get the area of W. The cosine of the angle between the x axis and the radius vector from C to A is 1/2. So that angle is π/3 and W subtends an angle of 2π/3. The entire circle has area π2², so that W, which is the fraction $\frac{2\pi/3}{2\pi} = \frac{1}{3}$ of the full circle, has area 4π/3.

Now we'll get the area of the triangle T. Think of T as having base BD. Then the length of the base of T is 2√3 and the height of T is 1. So T has area ½(2√3)(1) = √3.

All together

$$\text{Area}(\mathcal{R}) = \text{Area}(W) - \text{Area}(T) = \frac{4\pi}{3} - \sqrt{3}$$
3

We used some hand waving in deriving the area formula 3.2.8: the word “essentially” appeared quite a few times. Here is how to do that derivation more rigorously.

Optional— Error Control for the Polar Area Formula>


Let 0 ≤ a < b ≤ 2π. In Examples 3.2.6 and 3.2.9 we derived the formula
$$A = \frac{1}{2}\int_a^b R(\theta)^2\, d\theta$$

for the area of the region

R = {(r cos θ, r sin θ)|a ≤ θ ≤ b,  0 ≤ r ≤ R(θ)}

In the course of that derivation we approximated the area of the shaded region in

by dA = r dr dθ.
We will now justify that approximation, under the assumption that

$$0 \le R(\theta) \le M \qquad |R'(\theta)| \le L$$

for all a ≤ θ ≤ b. That is, R(θ) is bounded and its derivative exists and is bounded too.
Divide the interval a ≤ θ ≤ b into n equal subintervals, each of length $\Delta\theta = \frac{b-a}{n}$. Let $\theta_i^*$ be the midpoint of the $i^{\rm th}$ interval. On the $i^{\rm th}$ interval, θ runs from $\theta_i^* - \frac{1}{2}\Delta\theta$ to $\theta_i^* + \frac{1}{2}\Delta\theta$.

By the mean value theorem


$$R(\theta) - R(\theta_i^*) = R'(c)\,(\theta - \theta_i^*)$$

for some c between θ and $\theta_i^*$. Because $|R'(\theta)| \le L$

$$\big|R(\theta) - R(\theta_i^*)\big| \le L\,\big|\theta - \theta_i^*\big| \qquad (*)$$

This tells us that the difference between R(θ) and $R(\theta_i^*)$ can't be too big compared to $|\theta - \theta_i^*|$.

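On the $i^{\rm th}$ interval, the radius r = R(θ) runs over all values of R(θ) with θ satisfying $|\theta - \theta_i^*| \le \frac{1}{2}\Delta\theta$. By (∗), all of these values of R(θ) lie between $r_i = R(\theta_i^*) - \frac{1}{2}L\Delta\theta$ and $R_i = R(\theta_i^*) + \frac{1}{2}L\Delta\theta$. Consequently the part of $\mathcal{R}$ having θ in the $i^{\rm th}$ subinterval, namely,

$$\mathcal{R}_i = \Big\{(r\cos\theta,\, r\sin\theta) \;\Big|\; \theta_i^* - \tfrac{1}{2}\Delta\theta \le \theta \le \theta_i^* + \tfrac{1}{2}\Delta\theta,\ 0 \le r \le R(\theta)\Big\}$$

must contain all of the circular sector

$$\Big\{(r\cos\theta,\, r\sin\theta) \;\Big|\; \theta_i^* - \tfrac{1}{2}\Delta\theta \le \theta \le \theta_i^* + \tfrac{1}{2}\Delta\theta,\ 0 \le r \le r_i\Big\}$$

and must be completely contained inside the circular sector

$$\Big\{(r\cos\theta,\, r\sin\theta) \;\Big|\; \theta_i^* - \tfrac{1}{2}\Delta\theta \le \theta \le \theta_i^* + \tfrac{1}{2}\Delta\theta,\ 0 \le r \le R_i\Big\}$$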

That is, we have found one circular sector that is bigger than the one we are approximating, and one circular sector that is smaller.
The area of a circular disk of radius ρ is $\pi\rho^2$. A circular sector of radius ρ that subtends an angle Δθ is the fraction $\frac{\Delta\theta}{2\pi}$ of the full disk and so has the area $\frac{\Delta\theta}{2\pi}\,\pi\rho^2 = \frac{\Delta\theta}{2}\rho^2$.

So the area of $\mathcal{R}_i$ must lie between

$$\frac{1}{2}\Delta\theta\, r_i^2 = \frac{1}{2}\Delta\theta\, \Big[R(\theta_i^*) - \frac{1}{2}L\Delta\theta\Big]^2 \quad\text{and}\quad \frac{1}{2}\Delta\theta\, R_i^2 = \frac{1}{2}\Delta\theta\, \Big[R(\theta_i^*) + \frac{1}{2}L\Delta\theta\Big]^2$$

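Observe that

$$\Big[R(\theta_i^*) \pm \frac{1}{2}L\Delta\theta\Big]^2 = R(\theta_i^*)^2 \pm L\,R(\theta_i^*)\,\Delta\theta + \frac{1}{4}L^2\Delta\theta^2$$

implies that, since 0 ≤ R(θ) ≤ M,

$$R(\theta_i^*)^2 - LM\Delta\theta + \frac{1}{4}L^2\Delta\theta^2 \;\le\; \Big[R(\theta_i^*) \pm \frac{1}{2}L\Delta\theta\Big]^2 \;\le\; R(\theta_i^*)^2 + LM\Delta\theta + \frac{1}{4}L^2\Delta\theta^2$$

Hence (multiplying by $\frac{\Delta\theta}{2}$ to turn them into areas)

$$\frac{1}{2}R(\theta_i^*)^2\Delta\theta - \frac{1}{2}LM\Delta\theta^2 + \frac{1}{8}L^2\Delta\theta^3 \;\le\; \text{Area}(\mathcal{R}_i) \;\le\; \frac{1}{2}R(\theta_i^*)^2\Delta\theta + \frac{1}{2}LM\Delta\theta^2 + \frac{1}{8}L^2\Delta\theta^3$$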

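and the total area A obeys the bounds

$$\sum_{i=1}^n \Big[\frac{1}{2}R(\theta_i^*)^2\Delta\theta - \frac{1}{2}LM\Delta\theta^2 + \frac{1}{8}L^2\Delta\theta^3\Big] \;\le\; A \;\le\; \sum_{i=1}^n \Big[\frac{1}{2}R(\theta_i^*)^2\Delta\theta + \frac{1}{2}LM\Delta\theta^2 + \frac{1}{8}L^2\Delta\theta^3\Big]$$

and

$$\sum_{i=1}^n \frac{1}{2}R(\theta_i^*)^2\Delta\theta - \frac{1}{2}nLM\Delta\theta^2 + \frac{1}{8}nL^2\Delta\theta^3 \;\le\; A \;\le\; \sum_{i=1}^n \frac{1}{2}R(\theta_i^*)^2\Delta\theta + \frac{1}{2}nLM\Delta\theta^2 + \frac{1}{8}nL^2\Delta\theta^3$$

Since $\Delta\theta = \frac{b-a}{n}$,

$$\sum_{i=1}^n \frac{1}{2}R(\theta_i^*)^2\Delta\theta - \frac{LM}{2}\frac{(b-a)^2}{n} + \frac{L^2}{8}\frac{(b-a)^3}{n^2} \;\le\; A \;\le\; \sum_{i=1}^n \frac{1}{2}R(\theta_i^*)^2\Delta\theta + \frac{LM}{2}\frac{(b-a)^2}{n} + \frac{L^2}{8}\frac{(b-a)^3}{n^2}$$

Now take the limit as n → ∞. Since

$$\begin{aligned}
\lim_{n\to\infty}\Big[\sum_{i=1}^n \frac{1}{2}R(\theta_i^*)^2\Delta\theta \pm \frac{LM}{2}\frac{(b-a)^2}{n} + \frac{L^2}{8}\frac{(b-a)^3}{n^2}\Big]
&= \frac{1}{2}\int_a^b R(\theta)^2\, d\theta \pm \lim_{n\to\infty}\frac{LM}{2}\frac{(b-a)^2}{n} + \lim_{n\to\infty}\frac{L^2}{8}\frac{(b-a)^3}{n^2} \\
&= \frac{1}{2}\int_a^b R(\theta)^2\, d\theta \qquad \text{(since } L,\ M,\ a \text{ and } b \text{ are all constants)}
\end{aligned}$$

we have that

$$A = \frac{1}{2}\int_a^b R(\theta)^2\, d\theta$$

exactly, as desired.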
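To see the squeeze in action, the sketch below evaluates the lower and upper bounds from the argument above for one concrete curve. The test curve R(θ) = 1 + cos θ on 0 ≤ θ ≤ π is an assumption chosen for this illustration: it satisfies 0 ≤ R ≤ M = 2 and |R′| ≤ L = 1, and its exact area is ½∫(1 + cos θ)² dθ = 3π/4. The function name is hypothetical.

```python
import math

def polar_area_bounds(R, a, b, n, L, M):
    """Return (lower, midpoint_sum, upper) where
    midpoint_sum = sum of (1/2) R(theta_i*)^2 dtheta over the n subintervals and
    lower/upper = midpoint_sum -/+ (LM/2)(b-a)^2/n + (L^2/8)(b-a)^3/n^2."""
    dtheta = (b - a) / n
    s = sum(0.5 * R(a + (i + 0.5) * dtheta) ** 2 * dtheta for i in range(n))
    first = L * M * (b - a) ** 2 / (2 * n)
    second = L ** 2 * (b - a) ** 3 / (8 * n ** 2)
    return s - first + second, s, s + first + second

exact = 3 * math.pi / 4
for n in (4, 40, 400):
    lower, s, upper = polar_area_bounds(lambda t: 1.0 + math.cos(t), 0.0, math.pi, n, L=1.0, M=2.0)
    print(n, lower, s, upper, lower <= exact <= upper)
```

The gap between the two bounds is LM(b − a)²/n, so it shrinks like 1/n, which is the convergence used in the limit above.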

Exercises
Stage 1

 1

Consider the points


$$(x_1, y_1) = (3, 0) \qquad (x_2, y_2) = (1, 1) \qquad (x_3, y_3) = (0, 1)$$

$$(x_4, y_4) = (-1, 1) \qquad (x_5, y_5) = (-2, 0)$$

For each 1 ≤ i ≤ 5,
sketch, in the xy-plane, the point $(x_i, y_i)$ and
find the polar coordinates $r_i$ and $\theta_i$, with $0 \le \theta_i < 2\pi$, for the point $(x_i, y_i)$.

 2
1. Find all pairs (r, θ) such that

(−2, 0) = (r cos θ , r sin θ)

2. Find all pairs (r, θ) such that

(1, 1) = (r cos θ , r sin θ)

3. Find all pairs (r, θ) such that

(−1, −1) = (r cos θ , r sin θ)

 3

Consider the points

$$(x_1, y_1) = (3, 0) \qquad (x_2, y_2) = (1, 1) \qquad (x_3, y_3) = (0, 1)$$

$$(x_4, y_4) = (-1, 1) \qquad (x_5, y_5) = (-2, 0)$$

Also define, for each angle θ, the vectors

$$\mathbf{e}_r(\theta) = \cos\theta\; \hat{\imath} + \sin\theta\; \hat{\jmath} \qquad \mathbf{e}_\theta(\theta) = -\sin\theta\; \hat{\imath} + \cos\theta\; \hat{\jmath}$$

1. Determine, for each angle θ, the lengths of the vectors $\mathbf{e}_r(\theta)$ and $\mathbf{e}_\theta(\theta)$ and the angle between the vectors $\mathbf{e}_r(\theta)$ and $\mathbf{e}_\theta(\theta)$. Compute $\mathbf{e}_r(\theta) \times \mathbf{e}_\theta(\theta)$ (viewing $\mathbf{e}_r(\theta)$ and $\mathbf{e}_\theta(\theta)$ as vectors in three dimensions with zero $\hat{k}$ components).
2. For each 1 ≤ i ≤ 5, sketch, in the xy-plane, the point $(x_i, y_i)$ and the vectors $\mathbf{e}_r(\theta_i)$ and $\mathbf{e}_\theta(\theta_i)$. In your sketch of the vectors, place the tails of the vectors $\mathbf{e}_r(\theta_i)$ and $\mathbf{e}_\theta(\theta_i)$ at $(x_i, y_i)$.

 4

Let ⟨a, b⟩ be a vector. Let r be the length of ⟨a, b⟩ and θ be the angle between ⟨a, b⟩ and the x-axis.
1. Express a and b in terms of r and θ.
2. Let ⟨A, B⟩ be the vector obtained by rotating ⟨a, b⟩ by an angle φ about its tail. Express A and B in terms of a, b and φ.

 5
For each of the regions R sketched below, express $\iint_{\mathcal{R}} f(x, y)\, dx\, dy$ as an iterated integral in polar coordinates in two different ways.

(a) (b) (c) (d)

 6
Sketch the domain of integration in the xy-plane for each of the following polar coordinate integrals.
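1. $\displaystyle\int_1^2 dr \int_{-\pi/4}^{\pi/4} d\theta\; r\, f(r\cos\theta,\, r\sin\theta)$

2. $\displaystyle\int_0^{\pi/4} d\theta \int_0^{\frac{2}{\sin\theta + \cos\theta}} dr\; r\, f(r\cos\theta,\, r\sin\theta)$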

3. $\displaystyle\int_0^{2\pi} d\theta \int_0^{\frac{3}{\sqrt{\cos^2\theta + 9\sin^2\theta}}} dr\; r\, f(r\cos\theta,\, r\sin\theta)$

Stage 2

 7

Use polar coordinates to evaluate each of the following integrals.

1. $\iint_S (x + y)\, dx\, dy$ where S is the region in the first quadrant lying inside the disc $x^2 + y^2 \le a^2$ and under the line $y = \sqrt{3}\,x$.

2. $\iint_S x\, dx\, dy$, where S is the disc segment $x^2 + y^2 \le 2$, $x \ge 1$.

3. $\iint_T (x^2 + y^2)\, dx\, dy$ where T is the triangle with vertices (0, 0), (1, 0) and (1, 1).

4. $\iint_{x^2 + y^2 \le 1} \ln(x^2 + y^2)\, dx\, dy$

 8

Find the volume lying inside the sphere $x^2 + y^2 + z^2 = 2$ and above the paraboloid $z = x^2 + y^2$.

 9

Let a > 0. Find the volume lying inside the cylinder $x^2 + (y - a)^2 = a^2$ and between the upper and lower halves of the cone $z^2 = x^2 + y^2$.

 10

Let a > 0. Find the volume common to the cylinders $x^2 + y^2 \le 2ax$ and $z^2 \le 2ax$.

 11 ✳
Consider the region E in 3-dimensions specified by the inequalities $x^2 + y^2 \le 2y$ and $0 \le z \le \sqrt{x^2 + y^2}$.

1. Draw a reasonably accurate picture of E in 3-dimensions. Be sure to show the units on the coordinate axes.
2. Use polar coordinates to find the volume of E. Note that you will be “using polar coordinates” if you solve this problem by means of cylindrical coordinates.

 12 ✳

Evaluate the iterated double integral


$$\int_{x=0}^{x=2} \int_{y=0}^{y=\sqrt{4 - x^2}} \big(x^2 + y^2\big)^{\frac{3}{2}}\, dy\, dx$$

 13 ✳
1. Sketch the region L (in the first quadrant of the xy-plane) with boundary curves

$$x^2 + y^2 = 2, \quad x^2 + y^2 = 4, \quad y = x, \quad y = 0.$$

The mass of a thin lamina with a density function ρ(x, y) over the region L is given by

$$M = \iint_L \rho(x, y)\, dA$$

2. Find an expression for M as an integral in polar coordinates.

3. Find M when

$$\rho(x, y) = \frac{2xy}{x^2 + y^2}$$

 14 ✳
Evaluate $\displaystyle\iint_{\mathbb{R}^2} \frac{1}{(1 + x^2 + y^2)^2}\, dA$.

 15 ✳

Evaluate the double integral


$$\iint_D y\sqrt{x^2 + y^2}\, dA$$

over the region $D = \{(x, y) \mid x^2 + y^2 \le 2,\ 0 \le y \le x\}$.

 16 ✳

This question is about the integral


$$\int_0^1 \int_{\sqrt{3}\,y}^{\sqrt{4 - y^2}} \ln\big(1 + x^2 + y^2\big)\, dx\, dy$$

1. Sketch the domain of integration.


2. Evaluate the integral by transforming to polar coordinates.

 17 ✳

Let D be the region in the xy-plane bounded on the left by the line x = 2 and on the right by the circle $x^2 + y^2 = 16$.

Evaluate

$$\iint_D \big(x^2 + y^2\big)^{-3/2}\, dA$$

 18 ✳

In the xy-plane, the disk $x^2 + y^2 \le 2x$ is cut into 2 pieces by the line y = x. Let D be the larger piece.
1. Sketch D including an accurate description of the center and radius of the given disk. Then describe D in polar coordinates (r, θ).
2. Find the volume of the solid below $z = \sqrt{x^2 + y^2}$ and above D.

 19 ✳

Let D be the shaded region in the diagram. Find the average distance of points in D from the origin. You may use that

$$\int \cos^n(x)\, dx = \frac{\cos^{n-1}(x)\sin(x)}{n} + \frac{n-1}{n}\int \cos^{n-2}(x)\, dx$$

for all natural numbers n ≥ 2.

Stage 3

 20 ✳

Let G be the region in $\mathbb{R}^2$ given by

$$x^2 + y^2 \le 1, \qquad 0 \le x \le 2y, \qquad y \le 2x$$

1. Sketch the region G.
2. Express the integral $\iint_G f(x, y)\, dA$ as a sum of iterated integrals $\iint f(x, y)\, dx\, dy$.
3. Express the integral $\iint_G f(x, y)\, dA$ as an iterated integral in polar coordinates (r, θ) where x = r cos(θ) and y = r sin(θ).

 21 ✳

Consider

$$J = \int_0^{\sqrt{2}} \int_y^{\sqrt{4 - y^2}} \frac{y}{x}\, e^{x^2 + y^2}\, dx\, dy$$

1. Sketch the region of integration.


2. Reverse the order of integration.
3. Evaluate J by using polar coordinates.

 22

Find the volume of the region in the first octant below the paraboloid
$$z = 1 - \frac{x^2}{a^2} - \frac{y^2}{b^2}$$

 23

A symmetrical coffee percolator holds 24 cups when full. The interior has a circular cross-section which tapers from a radius of
3' at the centre to 2' at the base and top, which are 12' apart. The bounding surface is parabolic. Where should the mark
indicating the 6 cup level be placed?

 24 ✳

Consider the surface S given by $z = e^{x^2 + y^2}$.

1. Compute the volume under S and above the disk $x^2 + y^2 \le 9$ in the xy-plane.

2. The volume under S and above a certain region R in the xy-plane is

$$\int_0^1 \Big(\int_0^y e^{x^2 + y^2}\, dx\Big)\, dy + \int_1^2 \Big(\int_0^{2-y} e^{x^2 + y^2}\, dx\Big)\, dy$$

Sketch R and express the volume as a single iterated integral with the order of integration reversed. Do not compute either integral in this part.

1. The “golden hammer” (also known as Maslow's hammer and as the law of the instrument) refers to a tendency to always use the
same tool, even when it isn't the best tool for the job. It is just as bad in mathematics as it is in carpentry.
2. In the mathematical literature, the angular coordinate is usually denoted θ, as we do here. The symbol ϕ is also often used for
the angular coordinate. In fact there is an ISO standard (#80000 – 2) which specifies that ϕ should be used in the natural
sciences and in technology. See Appendix A.7.
3. or anti-clockwise or widdershins. Yes, widdershins is a real word, though the Oxford English Dictionary lists its frequency of
usage as between 0.01 and 0.1 times per million words. Of course both “counter-clockwise” and “anti-clockwise” assume that
your clock is not a sundial in the southern hemisphere.
4. Well, a mathematician's heart. The name “cardioid” comes from the Greek word καρδια (which anglicizes to kardia) for heart.
5. The name rhodonea first appeared in the 1728 publication Flores geometrici of the Italian monk, theologian, mathematician and
engineer, Guido Grandi (1671– 1742).
6. Recall that θ has to be measured in radians for this to be true.
7. “Handwaving” is sometimes used as a pejorative to refer to an argument that lacks substance. Here we are just using it to indicate that we have left out a bunch of technical details. In mathematics, “nose-following” is sometimes used as the polar opposite of handwaving. It refers to a very narrow, mechanical, line of reasoning.
8. There is a pie/pi/pye pun in there somewhere.
9. The solution is attributed to the French mathematician Siméon Denis Poisson (1781 – 1840) and was published in the textbook Cours d'Analyse de l'École polytechnique by Jacob Karl Franz Sturm (1803 – 1855).
10. On the other hand it is the core of the function $\text{erf}(z) = \frac{2}{\sqrt{\pi}}\int_0^z e^{-t^2}\, dt$, which gives Gaussian (i.e. bell curve) probabilities. “erf” stands for “error function”.

This page titled 3.2: Double Integrals in Polar Coordinates is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or
curated by Joel Feldman, Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts
platform; a detailed edit history is available upon request.

3.3: Applications of Double Integrals
Double integrals are useful for more than just computing areas and volumes. Here are a few other applications that lead to double
integrals.

Averages
In Section 2.2 of the CLP-2 text, we defined the average value of a function of one variable. We'll now extend that discussion to
functions of two variables. First, we recall the definition of the average of a finite set of numbers.

 Definition 3.3.1

The average (mean) of a set of n numbers $f_1, f_2, \cdots, f_n$ is

$$\bar{f} = \langle f \rangle = \frac{f_1 + f_2 + \cdots + f_n}{n}$$

The notations $\bar{f}$ and $\langle f \rangle$ are both commonly used to represent the average.

Now suppose that we want to take the average of a function f (x, y) with (x, y) running continuously over some region R in the
xy-plane. A natural approach to defining what we mean by the average value of f over R is to

First fix any natural number n.


Subdivide the region R into tiny (approximate) squares each of width $\Delta x = \frac{1}{n}$ and height $\Delta y = \frac{1}{n}$. This can be done by, for example, subdividing vertical strips into tiny squares, like in Example 3.1.11.
Name the squares (in any fixed order) $R_1, R_2, \cdots, R_N$, where N is the total number of squares.
Select, for each 1 ≤ i ≤ N, one point in square number i and call it $(x_i^*, y_i^*)$. So $(x_i^*, y_i^*) \in R_i$.

The average value of f at the selected points is


$$\begin{aligned}
\frac{1}{N}\sum_{i=1}^N f(x_i^*, y_i^*) &= \frac{\sum_{i=1}^N f(x_i^*, y_i^*)}{\sum_{i=1}^N 1} \\
&= \frac{\sum_{i=1}^N f(x_i^*, y_i^*)\, \Delta x\, \Delta y}{\sum_{i=1}^N \Delta x\, \Delta y}
\end{aligned}$$

We have transformed the average into a ratio of Riemann sums.

Once we have the Riemann sums it is clear what to do next. Taking the limit n → ∞, we get exactly $\frac{\iint_R f(x, y)\, dx\, dy}{\iint_R dx\, dy}$. That's why we define

 Definition 3.3.2
Let f (x, y) be an integrable function defined on region R in the xy-plane. The average value of f on R is

$$\bar{f} = \langle f \rangle = \frac{\iint_R f(x, y)\, dx\, dy}{\iint_R dx\, dy}$$

 Example 3.3.3. Average


−−−−−−−−− −
Let a > 0. A mountain, call it Half Dome 1, has height z(x, y) = √a 2
− x2 − y 2 above each point (x, y) in the base region
R = {(x, y)| x + y ≤ a , x ≤ 0} . Find its average height.
2 2 2

3.3.1 https://math.libretexts.org/@go/page/89217
Solution
By Definition 3.3.2 the average height is
− −−−−−−−− −
∬ z(x, y) dx dy ∬ √ a2 − x2 − y 2 dx dy
R R
z̄ = =
∬ dx dy ∬ dx dy
R R

The integrals in both the numerator and denominator are easily evaluated by interpreting them geometrically.
The numerator $\iint_R z(x,y)\,dx\,dy = \iint_R \sqrt{a^2 - x^2 - y^2}\,dx\,dy$ can be interpreted as the volume of

$$
\begin{aligned}
&\big\{ (x,y,z) \mid x^2 + y^2 \le a^2,\ x \le 0,\ 0 \le z \le \sqrt{a^2 - x^2 - y^2} \big\} \\
&\qquad= \big\{ (x,y,z) \mid x^2 + y^2 + z^2 \le a^2,\ x \le 0,\ z \ge 0 \big\}
\end{aligned}
$$

which is one quarter of the interior of a sphere of radius $a$. So the numerator is $\frac{1}{3}\pi a^3$.

The denominator $\iint_R dx\,dy$ is the area of one half of a circular disk of radius $a$. So the denominator is $\frac{1}{2}\pi a^2$.

All together, the average height is


$$
\bar z = \frac{\frac{1}{3}\pi a^3}{\frac{1}{2}\pi a^2} = \frac{2}{3}a
$$

Notice that this number is bigger than zero and less than the maximum height, which is $a$. That makes sense.
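
As a numerical sanity check on this value, here is a minimal Python sketch, not part of the original text, that imitates the Riemann sum construction used to motivate Definition 3.3.2. The helper name `average_over_region`, the grid size `n` and the choice $a = 1$ are assumptions made only for illustration.

```python
import math

def average_over_region(f, inside, x_range, y_range, n=400):
    """Estimate the average of f over the region where inside(x, y) is True.

    Uses the midpoints of an n-by-n grid of cells. The factor dx*dy is common
    to the numerator and denominator sums, so it cancels in the ratio.
    """
    (x0, x1), (y0, y1) = x_range, y_range
    dx, dy = (x1 - x0) / n, (y1 - y0) / n
    total, count = 0.0, 0
    for i in range(n):
        x = x0 + (i + 0.5) * dx
        for j in range(n):
            y = y0 + (j + 0.5) * dy
            if inside(x, y):
                total += f(x, y)
                count += 1
    return total / count

a = 1.0  # assumed radius, chosen only for illustration

def height(x, y):
    # z(x, y) = sqrt(a^2 - x^2 - y^2), clamped at 0 to guard against rounding
    return math.sqrt(max(a * a - x * x - y * y, 0.0))

def in_base(x, y):
    # the half disk x^2 + y^2 <= a^2, x <= 0
    return x * x + y * y <= a * a and x <= 0

avg = average_over_region(height, in_base, (-a, 0.0), (-a, a))
print(avg, 2 * a / 3)  # the two numbers should be close
```

Refining the grid (larger `n`) should bring the estimate closer to $\frac{2}{3}a$, at the cost of more function evaluations.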

 Example 3.3.4. Example 3.3.3, the hard way

This last example was relatively easy because we could reinterpret the integrals as geometric quantities. For practice, let's go back and evaluate the numerator $\iint_R \sqrt{a^2 - x^2 - y^2}\,dx\,dy$ of Example 3.3.3 as an iterated integral.


Here is a sketch of the top view of the base region R.

Using the slicing in the figure


$$
\iint_R \sqrt{a^2 - x^2 - y^2}\,dx\,dy = \int_{-a}^{a} dy \int_{-\sqrt{a^2 - y^2}}^{0} dx\,\sqrt{a^2 - x^2 - y^2}
$$

Note that, in the inside integral $\int_{-\sqrt{a^2 - y^2}}^{0} dx\,\sqrt{a^2 - x^2 - y^2}$, the variable $y$ is treated as a constant, so that the integrand $\sqrt{a^2 - y^2 - x^2} = \sqrt{C^2 - x^2}$ with $C$ being the constant $\sqrt{a^2 - y^2}$. The standard protocol for evaluating this integral uses the trigonometric substitution

$$
x = C \sin\theta \quad\text{with } -\frac{\pi}{2} \le \theta \le \frac{\pi}{2}, \qquad dx = C\cos\theta\,d\theta
$$

Trigonometric substitution was discussed in detail in Section 1.9 in the CLP-2 text. Since

$$
\begin{aligned}
x = 0 &\implies C\sin\theta = 0 \implies \theta = 0 \\
x = -\sqrt{a^2 - y^2} = -C &\implies C\sin\theta = -C \implies \theta = -\frac{\pi}{2}
\end{aligned}
$$

and

$$
\sqrt{a^2 - x^2 - y^2} = \sqrt{C^2 - C^2\sin^2\theta} = C\cos\theta
$$

the inner integral


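$$
\begin{aligned}
\int_{-\sqrt{a^2 - y^2}}^{0} dx\,\sqrt{a^2 - x^2 - y^2}
&= \int_{-\pi/2}^{0} C^2 \cos^2\theta\,d\theta \\
&= C^2 \int_{-\pi/2}^{0} \frac{1 + \cos(2\theta)}{2}\,d\theta
 = C^2 \left[ \frac{\theta + \frac{\sin(2\theta)}{2}}{2} \right]_{-\pi/2}^{0} \\
&= \frac{\pi C^2}{4} = \frac{\pi}{4}\,(a^2 - y^2)
\end{aligned}
$$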

and the full integral


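$$
\begin{aligned}
\iint_R \sqrt{a^2 - x^2 - y^2}\,dx\,dy
&= \frac{\pi}{4}\int_{-a}^{a} (a^2 - y^2)\,dy = \frac{\pi}{2}\int_{0}^{a} (a^2 - y^2)\,dy \\
&= \frac{\pi}{2}\left[ a^3 - \frac{a^3}{3} \right] \\
&= \frac{1}{3}\pi a^3
\end{aligned}
$$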

just as we saw in Example 3.3.3.

 Remark 3.3.5
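We remark that there is an efficient, sneaky, way to evaluate definite integrals like $\int_{-\pi/2}^{0} \cos^2\theta\,d\theta$. Looking at the figures

we see that

$$
\int_{-\pi/2}^{0} \cos^2\theta\,d\theta = \int_{-\pi/2}^{0} \sin^2\theta\,d\theta
$$

Thus

$$
\int_{-\pi/2}^{0} \cos^2\theta\,d\theta = \int_{-\pi/2}^{0} \sin^2\theta\,d\theta
= \int_{-\pi/2}^{0} \frac{1}{2}\big[\sin^2\theta + \cos^2\theta\big]\,d\theta
= \frac{1}{2}\int_{-\pi/2}^{0} d\theta = \frac{\pi}{4}
$$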

It is not at all unusual to want to find the average value of some function $f(x,y)$ with $(x,y)$ running over some region $R$, but to also want some $(x,y)$'s to play a greater role in determining the average than other $(x,y)$'s. One common way to do so is to create a “weight function” $w(x,y) > 0$ with $\frac{w(x_1,y_1)}{w(x_2,y_2)}$ giving the relative importance of $(x_1,y_1)$ and $(x_2,y_2)$. That is, $(x_1,y_1)$ is $\frac{w(x_1,y_1)}{w(x_2,y_2)}$ times as important as $(x_2,y_2)$. This leads to the definition

 Definition 3.3.6

$$
\frac{\iint_R f(x,y)\,w(x,y)\,dx\,dy}{\iint_R w(x,y)\,dx\,dy}
$$

is called the weighted average of f over R with weight w(x, y).

Note that if f (x, y) = F , a constant, then the weighted average of f is just F , just as you would want.
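
The definition can be tried out numerically. The following Python sketch is an illustration only: the helper `weighted_average`, the unit square as the region, and the weight $w(x,y) = 1 + x + y$ are all assumptions made here, not taken from the text. It approximates both integrals in Definition 3.3.6 by midpoint Riemann sums and checks the remark about constant functions.

```python
def weighted_average(f, w, inside, x_range, y_range, n=200):
    """Midpoint Riemann-sum estimate of (iint f*w dA) / (iint w dA) over a region."""
    (x0, x1), (y0, y1) = x_range, y_range
    dx, dy = (x1 - x0) / n, (y1 - y0) / n
    num = den = 0.0
    for i in range(n):
        x = x0 + (i + 0.5) * dx
        for j in range(n):
            y = y0 + (j + 0.5) * dy
            if inside(x, y):
                num += f(x, y) * w(x, y) * dx * dy
                den += w(x, y) * dx * dy
    return num / den

def unit_square(x, y):
    return True  # the bounding box [0,1] x [0,1] is already the whole region

def w(x, y):
    return 1.0 + x + y  # hypothetical positive weight function

# A constant function has weighted average equal to that constant.
const_avg = weighted_average(lambda x, y: 5.0, w, unit_square, (0, 1), (0, 1))

# Weighted average of x: points with larger x + y count for more,
# so the result is pulled above the unweighted average 1/2.
x_avg = weighted_average(lambda x, y: x, w, unit_square, (0, 1), (0, 1))
print(const_avg, x_avg)
```

For this particular weight the exact weighted average of $x$ is $\frac{1/2 + 1/3 + 1/4}{2} = \frac{13}{24}$, which the estimate should approach.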

Centre of Mass
One important example of a weighted average is the centre of mass. If you support a body at its centre of mass (in a uniform
gravitational field) it balances perfectly. That's the definition of the centre of mass of the body. In Section 2.3 of the CLP-2 text, we
found that the centre of mass of a body that consists of mass distributed continuously along a straight line, with mass density
$\rho(x)$ kg/m and with $x$ running from $a$ to $b$, is at

$$
\bar x = \frac{\int_a^b x\,\rho(x)\,dx}{\int_a^b \rho(x)\,dx}
$$

That is, the centre of mass is at the average of the x-coordinate weighted by the mass density.
In two dimensions, the centre of mass of a plate that covers the region R in the xy -plane and that has mass density ρ(x, y) is the
point (x̄, ȳ ) where

 Equation 3.3.7. Centre of Mass

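$$
\begin{aligned}
\bar x &= \text{the weighted average of } x \text{ over } R
 = \frac{\iint_R x\,\rho(x,y)\,dx\,dy}{\iint_R \rho(x,y)\,dx\,dy}
 = \frac{\iint_R x\,\rho(x,y)\,dx\,dy}{\text{Mass}(R)} \\
\bar y &= \text{the weighted average of } y \text{ over } R
 = \frac{\iint_R y\,\rho(x,y)\,dx\,dy}{\iint_R \rho(x,y)\,dx\,dy}
 = \frac{\iint_R y\,\rho(x,y)\,dx\,dy}{\text{Mass}(R)}
\end{aligned}
$$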

If the mass density is a constant, the centre of mass is also called the centroid, and is the geometric centre of R. In this case

 Equation 3.3.8. Centroid

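$$
\begin{aligned}
\bar x &= \frac{\iint_R x\,dx\,dy}{\iint_R dx\,dy} = \frac{\iint_R x\,dx\,dy}{\text{Area}(R)} \\
\bar y &= \frac{\iint_R y\,dx\,dy}{\iint_R dx\,dy} = \frac{\iint_R y\,dx\,dy}{\text{Area}(R)}
\end{aligned}
$$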

 Example 3.3.9. Centre of Mass

In Section 2.3 of the CLP-2 text, we did not have access to multivariable integrals, so we used some physical intuition to derive
that the centroid of a body that fills the region

$$
R = \big\{ (x,y) \mid a \le x \le b,\ B(x) \le y \le T(x) \big\}
$$

in the $xy$-plane is $(\bar x, \bar y)$ where

$$
\bar x = \frac{\int_a^b x\,[T(x) - B(x)]\,dx}{A}
\qquad
\bar y = \frac{\int_a^b [T(x)^2 - B(x)^2]\,dx}{2A}
$$

and $A = \int_a^b [T(x) - B(x)]\,dx$ is the area of $R$. Now that we do have access to multivariable integrals, we can derive these formulae directly from 3.3.8. Using vertical slices, as in this figure,

we see that the area of R is


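$$
A = \iint_R dx\,dy = \int_a^b dx \int_{B(x)}^{T(x)} dy = \int_a^b dx\,[T(x) - B(x)]
$$

and that 3.3.8 gives

$$
\begin{aligned}
\bar x &= \frac{1}{A}\iint_R x\,dx\,dy = \frac{1}{A}\int_a^b dx \int_{B(x)}^{T(x)} dy\ x = \frac{1}{A}\int_a^b dx\ x\,[T(x) - B(x)] \\
\bar y &= \frac{1}{A}\iint_R y\,dx\,dy = \frac{1}{A}\int_a^b dx \int_{B(x)}^{T(x)} dy\ y = \frac{1}{A}\int_a^b dx \left[ \frac{T(x)^2}{2} - \frac{B(x)^2}{2} \right]
\end{aligned}
$$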

just as desired.

We'll start with a simple mechanical example.

 Example 3.3.10. Quarter Circle

In Example 2.3.4 of the CLP-2 text, we found the centroid of the quarter circular disk
$$
D = \{ (x,y) \mid x \ge 0,\ y \ge 0,\ x^2 + y^2 \le r^2 \}
$$

by using the formulae of the last example. We'll now find it again using 3.3.8.
Since the area of $D$ is $\frac{1}{4}\pi r^2$, we have

$$
\bar x = \frac{\iint_D x\,dx\,dy}{\frac{1}{4}\pi r^2}
\qquad
\bar y = \frac{\iint_D y\,dx\,dy}{\frac{1}{4}\pi r^2}
$$

We'll evaluate $\iint_D x\,dx\,dy$ by using horizontal slices, as in the figure on the left below.

Looking at that figure, we see that

$y$ runs from $0$ to $r$ and
for each $y$ in that range, $x$ runs from $0$ to $\sqrt{r^2 - y^2}$.

So

$$
\begin{aligned}
\iint_D x\,dx\,dy &= \int_0^r dy \int_0^{\sqrt{r^2 - y^2}} dx\ x = \int_0^r dy \left[ \frac{x^2}{2} \right]_0^{\sqrt{r^2 - y^2}} \\
&= \frac{1}{2}\int_0^r dy\,\big[ r^2 - y^2 \big] = \frac{1}{2}\left[ r^3 - \frac{r^3}{3} \right] \\
&= \frac{r^3}{3}
\end{aligned}
$$

and

$$
\bar x = \frac{4}{\pi r^2}\left[ \frac{r^3}{3} \right] = \frac{4r}{3\pi}
$$

This is the same answer as we got in Example 2.3.4 of the CLP-2 text. But because we were able to use horizontal slices, the
integral in this example was a little easier to evaluate than the integral in CLP-2. Had we used vertical slices, we would have
ended up with exactly the integral of CLP-2.
By symmetry, we should have $\bar y = \bar x$. We'll check that by evaluating $\iint_D y\,dx\,dy$ by using vertical slices, as in the figure on the right above. From that figure, we see that
$x$ runs from $0$ to $r$ and
for each $x$ in that range, $y$ runs from $0$ to $\sqrt{r^2 - x^2}$.

So

$$
\iint_D y\,dx\,dy = \int_0^r dx \int_0^{\sqrt{r^2 - x^2}} dy\ y = \frac{1}{2}\int_0^r dx\,\big[ r^2 - x^2 \big]
$$

This is exactly the integral $\frac{1}{2}\int_0^r dy\,[r^2 - y^2]$ that we evaluated above, with $y$ renamed to $x$. So $\iint_D y\,dx\,dy = \frac{r^3}{3}$ too and

$$
\bar y = \frac{4}{\pi r^2}\left[ \frac{r^3}{3} \right] = \frac{4r}{3\pi} = \bar x
$$

as expected.
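
The centroid formulae 3.3.8 are easy to test numerically. Below is a minimal Python sketch, offered as an illustration rather than as part of the original example; the helper name `centroid`, the grid size and the radius $r = 2$ are assumptions. It approximates the three double integrals by midpoint Riemann sums over the quarter disk.

```python
import math

def centroid(inside, x_range, y_range, n=500):
    """Midpoint Riemann-sum estimate of the centroid (Equation 3.3.8) of a region."""
    (x0, x1), (y0, y1) = x_range, y_range
    dx, dy = (x1 - x0) / n, (y1 - y0) / n
    area = moment_x = moment_y = 0.0
    for i in range(n):
        x = x0 + (i + 0.5) * dx
        for j in range(n):
            y = y0 + (j + 0.5) * dy
            if inside(x, y):
                area += dx * dy          # approximates iint dx dy
                moment_x += x * dx * dy  # approximates iint x dx dy
                moment_y += y * dx * dy  # approximates iint y dx dy
    return moment_x / area, moment_y / area

r = 2.0  # assumed radius, for illustration only

def in_quarter_disk(x, y):
    return x >= 0 and y >= 0 and x * x + y * y <= r * r

xbar, ybar = centroid(in_quarter_disk, (0.0, r), (0.0, r))
expected = 4 * r / (3 * math.pi)
print(xbar, ybar, expected)  # all three should be close
```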

 Example 3.3.11. Example 3.2.14, continued


Find the centroid of the region that is inside the circle r = 4 cos θ and to the left of the line x = 1.
Solution
Recall that we saw in Example 3.2.14 that $r = 4\cos\theta$ was indeed a circle, and in fact is the circle $(x - 2)^2 + y^2 = 4$. Here is a sketch of that circle and of the region of interest, $R$.

From the sketch, we see that $R$ is symmetric about the $x$-axis. So we expect that its centroid, $(\bar x, \bar y)$, has $\bar y = 0$. To see this from the integral definition, note that the integral $\iint_R y\,dx\,dy$

has domain of integration, namely $R$, invariant under $y \to -y$ (i.e. under reflection in the $x$-axis), and
has integrand, namely $y$, that is odd under $y \to -y$.
So $\iint_R y\,dx\,dy = 0$ and consequently $\bar y = 0$.

We now just have to find x̄:

$$
\bar x = \frac{\iint_R x\,dx\,dy}{\iint_R dx\,dy}
$$

We have already found, in Example 3.2.14, that

$$
\iint_R dx\,dy = \frac{4\pi}{3} - \sqrt{3}
$$

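So we just have to compute $\iint_R x\,dx\,dy$. Using $R_1$ to denote the top half of $R$, and using polar coordinates, like we did in Example 3.2.14,

$$
\begin{aligned}
\iint_{R_1} x\,dx\,dy
&= \int_0^{\pi/3} d\theta \int_0^{1/\cos\theta} dr\ r\,\overbrace{(r\cos\theta)}^{x}
 + \int_{\pi/3}^{\pi/2} d\theta \int_0^{4\cos\theta} dr\ r\,\overbrace{(r\cos\theta)}^{x} \\
&= \int_0^{\pi/3} d\theta\ \cos\theta \int_0^{1/\cos\theta} dr\ r^2
 + \int_{\pi/3}^{\pi/2} d\theta\ \cos\theta \int_0^{4\cos\theta} dr\ r^2 \\
&= \int_0^{\pi/3} d\theta\ \frac{\sec^2\theta}{3}
 + \int_{\pi/3}^{\pi/2} d\theta\ \frac{64}{3}\cos^4\theta
\end{aligned}
$$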

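The first integral is easy, provided we remember that $\tan\theta$ is an antiderivative for $\sec^2\theta$. For the second integral, we'll need the double angle formula $\cos^2\theta = \frac{1 + \cos(2\theta)}{2}$:

$$
\begin{aligned}
\cos^4\theta = (\cos^2\theta)^2 &= \left[ \frac{1 + \cos(2\theta)}{2} \right]^2 = \frac{1}{4}\big[ 1 + 2\cos(2\theta) + \cos^2(2\theta) \big] \\
&= \frac{1}{4}\left[ 1 + 2\cos(2\theta) + \frac{1 + \cos(4\theta)}{2} \right] \\
&= \frac{3}{8} + \frac{\cos(2\theta)}{2} + \frac{\cos(4\theta)}{8}
\end{aligned}
$$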

so
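$$
\begin{aligned}
\iint_{R_1} x\,dx\,dy
&= \frac{1}{3}\tan\theta\,\bigg|_0^{\pi/3} + \frac{64}{3}\left[ \frac{3\theta}{8} + \frac{\sin(2\theta)}{4} + \frac{\sin(4\theta)}{32} \right]_{\pi/3}^{\pi/2} \\
&= \frac{1}{3}\times\sqrt{3} + \frac{64}{3}\left[ \frac{3}{8}\times\frac{\pi}{6} - \frac{\sqrt{3}}{4\times 2} + \frac{\sqrt{3}}{32\times 2} \right] \\
&= \frac{4\pi}{3} - 2\sqrt{3}
\end{aligned}
$$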

The integral we want, namely $\iint_R x\,dx\,dy$,

has domain of integration, namely $R$, invariant under $y \to -y$ (i.e. under reflection in the $x$-axis), and
has integrand, namely $x$, that is even under $y \to -y$.
So $\iint_R x\,dx\,dy = 2\iint_{R_1} x\,dx\,dy$ and, all together,

$$
\bar x = \frac{2\left( \frac{4\pi}{3} - 2\sqrt{3} \right)}{\frac{4\pi}{3} - \sqrt{3}}
= \frac{\frac{8\pi}{3} - 4\sqrt{3}}{\frac{4\pi}{3} - \sqrt{3}} \approx 0.59
$$

As a check, note that $0 \le x \le 1$ on $R$ and more of $R$ is closer to $x = 1$ than to $x = 0$. So it makes sense that $\bar x$ is between $\frac{1}{2}$ and $1$.
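
The polar-coordinate setup of this example can also be checked numerically. The Python sketch below is an illustration under stated assumptions: the helper names `r_max` and `polar_integral` and the grid sizes are made up here. It integrates over all of $R$ (both halves), using $dx\,dy = r\,dr\,d\theta$ and the same two boundary pieces as above: the line $x = 1$, i.e. $r = 1/\cos\theta$, for $|\theta| \le \pi/3$, and the circle $r = 4\cos\theta$ for $\pi/3 \le |\theta| \le \pi/2$.

```python
import math

def r_max(theta):
    """Outer boundary of R in polar coordinates."""
    if abs(theta) <= math.pi / 3:
        return 1.0 / math.cos(theta)   # the line x = 1
    return 4.0 * math.cos(theta)       # the circle r = 4 cos(theta)

def polar_integral(g, n_theta=2000, n_r=200):
    """Midpoint rule for the integral of g(r, theta) * r dr dtheta over R."""
    dtheta = math.pi / n_theta
    total = 0.0
    for i in range(n_theta):
        theta = -math.pi / 2 + (i + 0.5) * dtheta
        dr = r_max(theta) / n_r
        for j in range(n_r):
            r = (j + 0.5) * dr
            total += g(r, theta) * r * dr * dtheta
    return total

area = polar_integral(lambda r, theta: 1.0)                     # iint_R dx dy
moment_x = polar_integral(lambda r, theta: r * math.cos(theta))  # iint_R x dx dy
xbar = moment_x / area

exact_area = 4 * math.pi / 3 - math.sqrt(3)
exact_xbar = (8 * math.pi / 3 - 4 * math.sqrt(3)) / exact_area
print(area, exact_area)
print(xbar, exact_xbar)
```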

 Example 3.3.12. Reverse Centre of Mass
Evaluate $\displaystyle\int_0^2 \int_{-\sqrt{2x - x^2}}^{\sqrt{2x - x^2}} (2x + 3y)\,dy\,dx$.

Solution
This is another integral that can be evaluated without using any calculus at all. This time by relating it to a centre of mass. By 3.3.8,

$$
\iint_R x\,dx\,dy = \bar x\ \text{Area}(R)
\qquad
\iint_R y\,dx\,dy = \bar y\ \text{Area}(R)
$$

so that we can easily evaluate $\iint_R x\,dx\,dy$ and $\iint_R y\,dx\,dy$ provided $R$ is sufficiently simple and symmetric that we can easily determine its area and its centroid.
That is the case for the integral in this example. Rewrite
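$$
\begin{aligned}
\int_0^2 \int_{-\sqrt{2x - x^2}}^{\sqrt{2x - x^2}} (2x + 3y)\,dy\,dx
&= 2\int_0^2 dx \left[ \int_{-\sqrt{2x - x^2}}^{\sqrt{2x - x^2}} dy\ x \right] \\
&\quad + 3\int_0^2 dx \left[ \int_{-\sqrt{2x - x^2}}^{\sqrt{2x - x^2}} dy\ y \right]
\end{aligned}
$$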

On the domain of integration


$x$ runs from $0$ to $2$ and
for each fixed $0 \le x \le 2$, $y$ runs from $-\sqrt{2x - x^2}$ to $+\sqrt{2x - x^2}$.

Observe that $y = \pm\sqrt{2x - x^2}$ is equivalent to

$$
y^2 = 2x - x^2 = 1 - (x - 1)^2 \iff (x - 1)^2 + y^2 = 1
$$

Our domain of integration is exactly the disk

$$
R = \{ (x,y) \mid (x - 1)^2 + y^2 \le 1 \}
$$

of radius 1 centred on $(1, 0)$.

So R has area π and centre of mass (x̄, ȳ ) = (1, 0) and


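$$
\begin{aligned}
\int_0^2 \int_{-\sqrt{2x - x^2}}^{\sqrt{2x - x^2}} (2x + 3y)\,dy\,dx
&= 2\iint_R x\,dx\,dy + 3\iint_R y\,dx\,dy \\
&= 2\,\bar x\ \text{Area}(R) + 3\,\bar y\ \text{Area}(R) = 2\pi
\end{aligned}
$$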
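
For comparison, the iterated integral can be approximated directly, without the centre of mass argument. The following Python sketch is only an illustration; the function name `iterated_integral` and the grid sizes are assumptions. It applies the midpoint rule first in $y$, between the limits $\pm\sqrt{2x - x^2}$, and then in $x$.

```python
import math

def iterated_integral(n_x=2000, n_y=200):
    """Midpoint-rule estimate of the iterated integral of (2x + 3y) dy dx
    for 0 <= x <= 2 and -sqrt(2x - x^2) <= y <= sqrt(2x - x^2)."""
    dx = 2.0 / n_x
    total = 0.0
    for i in range(n_x):
        x = (i + 0.5) * dx
        half_width = math.sqrt(2 * x - x * x)  # upper limit of the inner integral
        dy = 2 * half_width / n_y
        for j in range(n_y):
            y = -half_width + (j + 0.5) * dy
            total += (2 * x + 3 * y) * dy * dx
    return total

value = iterated_integral()
print(value, 2 * math.pi)  # the two numbers should be close
```

The $3y$ contributions cancel between the upper and lower halves of the disk, which is the numerical counterpart of $\bar y = 0$.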

Moment of Inertia
Consider a plate that fills the region $R$ in the $xy$-plane, that has mass density $\rho(x,y)$ kg/m$^2$, and that is rotating at $\omega$ rad/s about some axis. Let's call the axis of rotation $A$. We are now going to determine the kinetic energy of that plate. Recall (see footnote 2) that, by definition, the kinetic energy of a point particle of mass $m$ that is moving with speed $v$ is $\frac{1}{2}mv^2$.


To get the kinetic energy of the entire plate, cut it up into tiny rectangles (see footnote 3), say of size $dx \times dy$. Think of each rectangle as being
(essentially) a point particle. If the point (x, y) on the plate is a distance D(x, y) from the axis of rotation A, then as the plate
rotates, the point (x, y) sweeps out a circle of radius D(x, y). The figure on the right below shows that circle as seen from high up
on the axis of rotation.

The circular arc that the point $(x,y)$ sweeps out in one second subtends the angle $\omega$ radians, which is the fraction $\frac{\omega}{2\pi}$ of a full circle and so has length $\frac{\omega}{2\pi}\big(2\pi D(x,y)\big) = \omega\,D(x,y)$. Consequently the rectangle that contains the point $(x,y)$

has speed $\omega\,D(x,y)$, and
has area $dx\,dy$, and so
has mass $\rho(x,y)\,dx\,dy$, and
has kinetic energy

$$
\frac{1}{2}\ \overbrace{\big(\rho(x,y)\,dx\,dy\big)}^{m}\ \overbrace{\big(\omega\,D(x,y)\big)^2}^{v^2}
= \frac{1}{2}\,\omega^2\,D(x,y)^2\,\rho(x,y)\,dx\,dy
$$

So (via our usual Riemann sum limit procedure) the kinetic energy of R is
$$
\iint_R \frac{1}{2}\,\omega^2\,D(x,y)^2\,\rho(x,y)\,dx\,dy
= \frac{1}{2}\,\omega^2 \iint_R D(x,y)^2\,\rho(x,y)\,dx\,dy
= \frac{1}{2}\,I_A\,\omega^2
$$

where

 Definition 3.3.13. Moment of Inertia

$$
I_A = \iint_R D(x,y)^2\,\rho(x,y)\,dx\,dy
$$

is called the moment of inertia of $R$ about the axis $A$. In particular the moment of inertia of $R$ about the $y$-axis is

$$
I_y = \iint_R x^2\,\rho(x,y)\,dx\,dy
$$

and the moment of inertia of $R$ about the $x$-axis is

$$
I_x = \iint_R y^2\,\rho(x,y)\,dx\,dy
$$

Notice that the expression $\frac{1}{2}I_A\,\omega^2$ for the kinetic energy has a very similar form to $\frac{1}{2}mv^2$, just with the velocity $v$ replaced by the angular velocity $\omega$, and with the mass $m$ replaced by $I_A$, which can be thought of as being a bit like a mass.

So far, we have been assuming that the rotation was taking place in the xy-plane — a two dimensional world. Our analysis extends
naturally to three dimensions, though the resulting integral formulae for the moment of inertia will then be triple integrals, which
we have not yet dealt with. We shall soon do so, but let's first do an example in two dimensions.

 Example 3.3.14. Disk
Find the moment of inertia of the interior, $R$, of the circle $x^2 + y^2 = a^2$ about the $x$-axis. Assume that it has density one.
Solution
The distance from any point $(x,y)$ inside the disk to the axis of rotation (i.e. the $x$-axis) is $|y|$. So the moment of inertia of the interior of the disk about the $x$-axis is

$$
I_x = \iint_R y^2\,dx\,dy
$$
R

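Switching to polar coordinates (see footnote 4),

$$
\begin{aligned}
I_x &= \int_0^{2\pi} d\theta \int_0^a dr\ r\,\overbrace{(r\sin\theta)^2}^{y^2}
 = \int_0^{2\pi} d\theta\ \sin^2\theta \int_0^a dr\ r^3 \\
&= \frac{a^4}{4}\int_0^{2\pi} d\theta\ \sin^2\theta
 = \frac{a^4}{4}\int_0^{2\pi} d\theta\ \frac{1 - \cos(2\theta)}{2} \\
&= \frac{a^4}{8}\left[ \theta - \frac{\sin(2\theta)}{2} \right]_0^{2\pi} \\
&= \frac{1}{4}\pi a^4
\end{aligned}
$$

For an efficient, sneaky, way to evaluate $\int_0^{2\pi} \sin^2\theta\,d\theta$, see Remark 3.3.5.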
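
Definition 3.3.13 translates almost line by line into a numerical estimate. In the Python sketch below, which is an illustration and not part of the original solution, the helper name `moment_of_inertia_disk`, the grid sizes and the radius $a = 1.5$ are assumptions. The disk is cut into small polar cells of area $r\,dr\,d\theta$, and each cell contributes $D(x,y)^2\,\rho(x,y)$ times its area.

```python
import math

def moment_of_inertia_disk(a, dist, rho=lambda x, y: 1.0, n_r=400, n_theta=400):
    """Midpoint-rule estimate of I_A = iint D(x,y)^2 rho(x,y) dx dy over the
    disk x^2 + y^2 <= a^2, using polar cells of area r dr dtheta.

    dist(x, y) must return the distance from (x, y) to the axis of rotation.
    """
    dr, dtheta = a / n_r, 2 * math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_theta):
            theta = (j + 0.5) * dtheta
            x, y = r * math.cos(theta), r * math.sin(theta)
            total += dist(x, y) ** 2 * rho(x, y) * r * dr * dtheta
    return total

a = 1.5  # assumed radius, for illustration only

# Rotation about the x-axis: the distance from (x, y) to the axis is |y|.
I_x = moment_of_inertia_disk(a, dist=lambda x, y: abs(y))
print(I_x, math.pi * a ** 4 / 4)  # the two numbers should be close
```

Passing `dist=lambda x, y: abs(x)` instead gives the moment about the $y$-axis, which by symmetry should agree with $I_x$ for a disk of constant density.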

 Example 3.3.15. Cardioid

Find the moment of inertia of the interior, $R$, of the cardioid $r = a(1 + \cos\theta)$ about the $z$-axis. Assume that the cardioid lies in the $xy$-plane and has density one.
Solution
We sketched the cardioid (with $a = 1$) in Example 3.2.3.

As we said above, the formula for $I_A$ in Definition 3.3.13 is valid even when the axis of rotation is not contained in the $xy$-plane. We just have to be sure that our $D(x,y)$ really is the distance from $(x,y)$ to the axis of rotation. In this example the axis of rotation is the $z$-axis so that $D(x,y) = \sqrt{x^2 + y^2}$ and that the moment of inertia is

$$
I_A = \iint_R (x^2 + y^2)\,dx\,dy
$$

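Switching to polar coordinates, using $dx\,dy = r\,dr\,d\theta$ and $x^2 + y^2 = r^2$,

$$
\begin{aligned}
I_A &= \int_0^{2\pi} d\theta \int_0^{a(1+\cos\theta)} dr\ r \times r^2
 = \int_0^{2\pi} d\theta \int_0^{a(1+\cos\theta)} dr\ r^3 \\
&= \frac{a^4}{4}\int_0^{2\pi} d\theta\ (1 + \cos\theta)^4 \\
&= \frac{a^4}{4}\int_0^{2\pi} d\theta\ \big( 1 + 4\cos\theta + 6\cos^2\theta + 4\cos^3\theta + \cos^4\theta \big)
\end{aligned}
$$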

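Now

$$
\begin{aligned}
\int_0^{2\pi} d\theta\ \cos\theta &= \sin\theta\,\Big|_0^{2\pi} = 0 \\
\int_0^{2\pi} d\theta\ \cos^2\theta &= \int_0^{2\pi} d\theta\ \frac{1 + \cos(2\theta)}{2}
 = \frac{1}{2}\left[ \theta + \frac{\sin(2\theta)}{2} \right]_0^{2\pi} = \pi \\
\int_0^{2\pi} d\theta\ \cos^3\theta &= \int_0^{2\pi} d\theta\ \cos\theta\,\big[ 1 - \sin^2\theta \big]
 \overset{u = \sin\theta}{=} \int_0^0 du\ (1 - u^2) = 0
\end{aligned}
$$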

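To integrate $\cos^4\theta$, we use the double angle formula

$$
\begin{aligned}
\cos^2\theta &= \frac{\cos(2\theta) + 1}{2} \\
\implies \cos^4\theta &= \frac{\big(\cos(2\theta) + 1\big)^2}{4} = \frac{\cos^2(2\theta) + 2\cos(2\theta) + 1}{4} \\
&= \frac{\frac{\cos(4\theta) + 1}{2} + 2\cos(2\theta) + 1}{4} \\
&= \frac{3}{8} + \frac{1}{2}\cos(2\theta) + \frac{1}{8}\cos(4\theta)
\end{aligned}
$$

to give

$$
\begin{aligned}
\int_0^{2\pi} d\theta\ \cos^4\theta &= \int_0^{2\pi} d\theta\ \left[ \frac{3}{8} + \frac{1}{2}\cos(2\theta) + \frac{1}{8}\cos(4\theta) \right] \\
&= \frac{3}{8}\times 2\pi + \frac{1}{2}\times 0 + \frac{1}{8}\times 0 = \frac{3}{4}\pi
\end{aligned}
$$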

All together
$$
\begin{aligned}
I_A &= \frac{a^4}{4}\left[ 2\pi + 4\times 0 + 6\times\pi + 4\times 0 + \frac{3}{4}\pi \right] \\
&= \frac{35}{16}\pi a^4
\end{aligned}
$$
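
The bookkeeping with the powers of $\cos\theta$ is easy to get wrong, so a quick numerical cross-check can be reassuring. This Python sketch is only an illustration; the name `cardioid_moment`, the number of sample points and $a = 1$ are assumptions. It starts from the single integral $\frac{a^4}{4}\int_0^{2\pi}(1+\cos\theta)^4\,d\theta$ obtained after the $r$ integration and applies the midpoint rule in $\theta$.

```python
import math

def cardioid_moment(a, n=1000):
    """Midpoint-rule estimate of (a^4 / 4) * integral over [0, 2*pi] of (1 + cos(theta))^4."""
    dtheta = 2 * math.pi / n
    total = 0.0
    for j in range(n):
        theta = (j + 0.5) * dtheta
        # inner integral of r^3 from 0 to a(1 + cos(theta)), done by hand
        total += (a * (1 + math.cos(theta))) ** 4 / 4 * dtheta
    return total

a = 1.0  # assumed scale, for illustration only
I_A = cardioid_moment(a)
print(I_A, 35 * math.pi * a ** 4 / 16)  # the two numbers should be close
```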

Exercises
Stage 1

 1
For each of the following, evaluate the given double integral without using iteration. Instead, interpret the integral in terms of,
for example, areas or average values.
1. $\iint_D (x + 3)\,dx\,dy$, where $D$ is the half disc $0 \le y \le \sqrt{4 - x^2}$
2. $\iint_R (x + y)\,dx\,dy$ where $R$ is the rectangle $0 \le x \le a,\ 0 \le y \le b$

Stage 2

 2. ✳

Find the centre of mass of the region $D$ in the $xy$-plane defined by the inequalities $x^2 \le y \le 1$, assuming that the mass density function is given by $\rho(x,y) = y$.

 3. ✳

Let $R$ be the region bounded on the left by $x = 1$ and on the right by $x^2 + y^2 = 4$. The density in $R$ is

$$
\rho(x,y) = \frac{1}{\sqrt{x^2 + y^2}}
$$

1. Sketch the region $R$.
2. Find the mass of $R$.
3. Find the centre-of-mass of $R$.
Note: You may use the result $\int \sec(\theta)\,d\theta = \ln|\sec\theta + \tan\theta| + C$.

 4. ✳
A thin plate of uniform density 1 is bounded by the positive $x$ and $y$ axes and the cardioid $\sqrt{x^2 + y^2} = r = 1 + \sin\theta$, which is given in polar coordinates. Find the $x$-coordinate of its centre of mass.

 5. ✳

A thin plate of uniform density $k$ is bounded by the positive $x$ and $y$ axes and the circle $x^2 + y^2 = 1$. Find its centre of mass.

 6. ✳

Let $R$ be the triangle with vertices $(0, 2)$, $(1, 0)$, and $(2, 0)$. Let $R$ have density $\rho(x,y) = y^2$. Find $\bar y$, the $y$-coordinate of the center of mass of $R$. You do not need to find $\bar x$.

 7. ✳

The average distance of a point in a plane region $D$ to a point $(a, b)$ is defined by

$$
\frac{1}{A(D)} \iint_D \sqrt{(x - a)^2 + (y - b)^2}\ dx\,dy
$$

where $A(D)$ is the area of the plane region $D$. Let $D$ be the unit disk $1 \ge x^2 + y^2$. Find the average distance of a point in $D$ to the center of $D$.

 8. ✳
A metal crescent is obtained by removing the interior of the circle defined by the equation $x^2 + y^2 = x$ from the metal plate of constant density 1 occupying the unit disc $x^2 + y^2 \le 1$.

1. Find the total mass of the crescent.
2. Find the $x$-coordinate of its center of mass.
You may use the fact that $\int_{-\pi/2}^{\pi/2} \cos^4(\theta)\,d\theta = \frac{3\pi}{8}$.

 9. ✳

Let $D$ be the region in the $xy$-plane which is inside the circle $x^2 + (y - 1)^2 = 1$ but outside the circle $x^2 + y^2 = 2$. Determine the mass of this region if the density is given by

$$
\rho(x,y) = \frac{2}{\sqrt{x^2 + y^2}}
$$

Stage 3

 10. ✳

Let a, b and c be positive numbers, and let T be the triangle whose vertices are (−a, 0), (b, 0) and (0, c).
1. Assuming that the density is constant on T , find the center of mass of T .
2. The medians of T are the line segments which join a vertex of T to the midpoint of the opposite side. It is a well known
fact that the three medians of any triangle meet at a point, which is known as the centroid of T . Show that the centroid of T
is its centre of mass.

1. There is a real Half-Dome mountain in Yosemite National Park. It has a = 1445 m.


2. If you don't recall, don't worry. We wouldn't lie to you. Or check it on Wikipedia. They wouldn't lie to you either.
3. The relatively small number of “rectangles” around the boundary of R won't actually be rectangles. But, as we have seen in the
optional §3.2.4, one can still make things rigorous despite the rectangles being a bit squishy around the edges.
4. See how handy they are!

This page titled 3.3: Applications of Double Integrals is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by
Joel Feldman, Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a
detailed edit history is available upon request.

3.4: Surface Area
Suppose that we wish to find the area of part, $S$, of the surface $z = f(x,y)$. We start by cutting $S$ up into tiny pieces. To do so,
we draw a bunch of curves of constant $x$ (the blue curves in the figure below). Each such curve is the intersection of $S$ with the plane $x = x_0$ for some constant $x_0$. And we also
draw a bunch of curves of constant $y$ (the red curves in the figure below). Each such curve is the intersection of $S$ with the plane $y = y_0$ for some constant $y_0$.

Concentrate on any one of the tiny pieces. Here is a greatly magnified sketch of it, looking at it from above.

We wish to compute its area, which we'll call dS. Now this little piece of surface need not be parallel to the xy-plane, and indeed
need not even be flat. But if the piece is really tiny, it's almost flat. We'll now approximate it by something that is flat, and whose
area we know. To start, we'll determine the corners of the piece. To do so, we first determine the bounding curves of the piece.
Look at the figure above, and recall that, on the surface $z = f(x,y)$.
The upper blue curve was constructed by holding $x$ fixed at the value $x_0$, and sketching the curve swept out by $x_0\,\hat{\imath} + y\,\hat{\jmath} + f(x_0, y)\,\hat{k}$ as $y$ varied, and
the lower blue curve was constructed by holding $x$ fixed at the slightly larger value $x_0 + dx$, and sketching the curve swept out by $(x_0 + dx)\,\hat{\imath} + y\,\hat{\jmath} + f(x_0 + dx, y)\,\hat{k}$ as $y$ varied.
The red curves were constructed similarly, by holding $y$ fixed and varying $x$.
So the four intersection points in the figure are
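$$
\begin{aligned}
P_0 &= x_0\,\hat{\imath} + y_0\,\hat{\jmath} + f(x_0, y_0)\,\hat{k} \\
P_1 &= x_0\,\hat{\imath} + (y_0 + dy)\,\hat{\jmath} + f(x_0, y_0 + dy)\,\hat{k} \\
P_2 &= (x_0 + dx)\,\hat{\imath} + y_0\,\hat{\jmath} + f(x_0 + dx, y_0)\,\hat{k} \\
P_3 &= (x_0 + dx)\,\hat{\imath} + (y_0 + dy)\,\hat{\jmath} + f(x_0 + dx, y_0 + dy)\,\hat{k}
\end{aligned}
$$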

Now, for any small constants $dX$ and $dY$, we have the linear approximation (see footnote 1)

$$
f(x_0 + dX, y_0 + dY) \approx f(x_0, y_0) + \frac{\partial f}{\partial x}(x_0, y_0)\,dX + \frac{\partial f}{\partial y}(x_0, y_0)\,dY
$$

Applying this three times, once with $dX = 0$, $dY = dy$ (to approximate $P_1$), once with $dX = dx$, $dY = 0$ (to approximate $P_2$), and once with $dX = dx$, $dY = dy$ (to approximate $P_3$),

3.4.1 https://math.libretexts.org/@go/page/89218
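$$
\begin{aligned}
P_1 &\approx P_0 + dy\,\hat{\jmath} + \frac{\partial f}{\partial y}(x_0, y_0)\,dy\,\hat{k} \\
P_2 &\approx P_0 + dx\,\hat{\imath} + \frac{\partial f}{\partial x}(x_0, y_0)\,dx\,\hat{k} \\
P_3 &\approx P_0 + dx\,\hat{\imath} + dy\,\hat{\jmath} + \left[ \frac{\partial f}{\partial x}(x_0, y_0)\,dx + \frac{\partial f}{\partial y}(x_0, y_0)\,dy \right]\hat{k}
\end{aligned}
$$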

Of course we have only approximated the positions of the corners and so have introduced errors. However, with more work, one can bound those errors (like we did in the optional §3.2.4) and show that in the limit $dx, dy \to 0$, all of the error terms that we dropped contribute exactly 0 to the integral.
The small piece of our surface with corners $P_0, P_1, P_2, P_3$ is approximately a parallelogram with sides

$$
\begin{aligned}
\overrightarrow{P_0P_1} \approx \overrightarrow{P_2P_3} &\approx dy\,\hat{\jmath} + \frac{\partial f}{\partial y}(x_0, y_0)\,dy\,\hat{k} \\
\overrightarrow{P_0P_2} \approx \overrightarrow{P_1P_3} &\approx dx\,\hat{\imath} + \frac{\partial f}{\partial x}(x_0, y_0)\,dx\,\hat{k}
\end{aligned}
$$

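Denote by $\theta$ the angle between the vectors $\overrightarrow{P_0P_1}$ and $\overrightarrow{P_0P_2}$. The base of the parallelogram, $\overrightarrow{P_0P_1}$, has length $\big|\overrightarrow{P_0P_1}\big|$, and the height of the parallelogram is $\big|\overrightarrow{P_0P_2}\big|\sin\theta$. So the area of the parallelogram is (see footnote 2), by Theorem 1.2.23,

$$
\begin{aligned}
dS = \big|\overrightarrow{P_0P_1}\big|\ \big|\overrightarrow{P_0P_2}\big|\ \sin\theta
&= \big|\overrightarrow{P_0P_1} \times \overrightarrow{P_0P_2}\big| \\
&\approx \left| \left( \hat{\jmath} + \frac{\partial f}{\partial y}(x_0, y_0)\,\hat{k} \right) \times \left( \hat{\imath} + \frac{\partial f}{\partial x}(x_0, y_0)\,\hat{k} \right) \right| dx\,dy
\end{aligned}
$$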

The cross product is easily evaluated:


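$$
\begin{aligned}
\left( \hat{\jmath} + \frac{\partial f}{\partial y}(x_0, y_0)\,\hat{k} \right) \times \left( \hat{\imath} + \frac{\partial f}{\partial x}(x_0, y_0)\,\hat{k} \right)
&= \det \begin{bmatrix}
\hat{\imath} & \hat{\jmath} & \hat{k} \\
0 & 1 & \frac{\partial f}{\partial y}(x_0, y_0) \\
1 & 0 & \frac{\partial f}{\partial x}(x_0, y_0)
\end{bmatrix} \\
&= f_x(x_0, y_0)\,\hat{\imath} + f_y(x_0, y_0)\,\hat{\jmath} - \hat{k}
\end{aligned}
$$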

as is its length:
$$
\left| \left( \hat{\jmath} + \frac{\partial f}{\partial y}(x_0, y_0)\,\hat{k} \right) \times \left( \hat{\imath} + \frac{\partial f}{\partial x}(x_0, y_0)\,\hat{k} \right) \right|
= \sqrt{1 + f_x(x_0, y_0)^2 + f_y(x_0, y_0)^2}
$$

Throughout this computation, $x_0$ and $y_0$ were arbitrary. So we have found the area of each tiny piece of the surface $S$.
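
The cross product and its length can be double-checked with a few lines of Python. This is a small illustrative sketch, not part of the original derivation: the helper `cross` and the sample slope values `fx`, `fy` are assumptions chosen arbitrarily.

```python
import math

def cross(u, v):
    """Cross product of two 3-component vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

fx, fy = 0.3, -1.2  # hypothetical values of f_x(x0, y0) and f_y(x0, y0)

side_y = (0.0, 1.0, fy)  # the vector j + f_y k  (P0P1 divided by dy)
side_x = (1.0, 0.0, fx)  # the vector i + f_x k  (P0P2 divided by dx)

normal = cross(side_y, side_x)
length = math.sqrt(sum(c * c for c in normal))

print(normal)                                   # components f_x, f_y, -1
print(length, math.sqrt(1 + fx ** 2 + fy ** 2))  # these should agree
```

The vector `normal` is perpendicular to both sides of the parallelogram, and its length is the factor that converts the area $dx\,dy$ of the shadow in the $xy$-plane into the area $dS$ of the tilted piece of surface.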

 Equation 3.4.1

For the surface $z = f(x,y)$,

$$
dS = \sqrt{1 + f_x(x,y)^2 + f_y(x,y)^2}\ dx\,dy
$$

Similarly, for the surface $x = g(y,z)$,

$$
dS = \sqrt{1 + g_y(y,z)^2 + g_z(y,z)^2}\ dy\,dz
$$

and for the surface $y = h(x,z)$,

$$
dS = \sqrt{1 + h_x(x,z)^2 + h_z(x,z)^2}\ dx\,dz
$$

Consequently, we have

 Theorem 3.4.2
1. The area of the part of the surface $z = f(x,y)$ with $(x,y)$ running over the region $D$ in the $xy$-plane is

$$
\iint_D \sqrt{1 + f_x(x,y)^2 + f_y(x,y)^2}\ dx\,dy
$$

2. The area of the part of the surface $x = g(y,z)$ with $(y,z)$ running over the region $D$ in the $yz$-plane is

$$
\iint_D \sqrt{1 + g_y(y,z)^2 + g_z(y,z)^2}\ dy\,dz
$$

3. The area of the part of the surface $y = h(x,z)$ with $(x,z)$ running over the region $D$ in the $xz$-plane is

$$
\iint_D \sqrt{1 + h_x(x,z)^2 + h_z(x,z)^2}\ dx\,dz
$$

 Example 3.4.3. Area of a cone

As a first example, we compute the area of the part of the cone

$$
z = \sqrt{x^2 + y^2}
$$

with $0 \le z \le a$ or, equivalently, with $x^2 + y^2 \le a^2$.

Note that $z = \sqrt{x^2 + y^2}$ is the side of the cone. It does not include the top.
To find its area, we will apply 3.4.1 to

$$
z = f(x,y) = \sqrt{x^2 + y^2} \quad\text{with } (x,y) \text{ running over } x^2 + y^2 \le a^2
$$

That forces us to compute the first order partial derivatives

$$
f_x(x,y) = \frac{x}{\sqrt{x^2 + y^2}}
\qquad
f_y(x,y) = \frac{y}{\sqrt{x^2 + y^2}}
$$

Substituting them into the first formula in 3.4.1 yields

$$
\begin{aligned}
dS &= \sqrt{1 + f_x(x,y)^2 + f_y(x,y)^2}\ dx\,dy \\
&= \sqrt{1 + \left( \frac{x}{\sqrt{x^2 + y^2}} \right)^2 + \left( \frac{y}{\sqrt{x^2 + y^2}} \right)^2}\ dx\,dy \\
&= \sqrt{1 + \frac{x^2 + y^2}{x^2 + y^2}}\ dx\,dy \\
&= \sqrt{2}\ dx\,dy
\end{aligned}
$$

So

$$
\text{Area} = \iint_{x^2 + y^2 \le a^2} \sqrt{2}\ dx\,dy = \sqrt{2} \iint_{x^2 + y^2 \le a^2} dx\,dy = \sqrt{2}\,\pi a^2
$$

because $\iint_{x^2 + y^2 \le a^2} dx\,dy$ is exactly the area of a circular disk of radius $a$.
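
Part 1 of Theorem 3.4.2 can be turned into a small numerical routine. The Python sketch below is an illustration under stated assumptions: the helper `surface_area`, the grid size, and $a = 1$ are made up here, and the partial derivatives are supplied by hand rather than computed. It sums $\sqrt{1 + f_x^2 + f_y^2}\,dx\,dy$ over the midpoints of grid cells that lie in the disk $x^2 + y^2 \le a^2$.

```python
import math

def surface_area(fx, fy, inside, x_range, y_range, n=400):
    """Midpoint Riemann-sum estimate of iint_D sqrt(1 + fx^2 + fy^2) dx dy,
    where D is the set of points with inside(x, y) True."""
    (x0, x1), (y0, y1) = x_range, y_range
    dx, dy = (x1 - x0) / n, (y1 - y0) / n
    total = 0.0
    for i in range(n):
        x = x0 + (i + 0.5) * dx
        for j in range(n):
            y = y0 + (j + 0.5) * dy
            if inside(x, y):
                total += math.sqrt(1.0 + fx(x, y) ** 2 + fy(x, y) ** 2) * dx * dy
    return total

a = 1.0  # assumed radius, for illustration only

# Partial derivatives of the cone z = sqrt(x^2 + y^2). They are undefined at
# the origin, but with an even n no cell midpoint lands exactly there.
cone_fx = lambda x, y: x / math.hypot(x, y)
cone_fy = lambda x, y: y / math.hypot(x, y)

def in_disk(x, y):
    return x * x + y * y <= a * a

area = surface_area(cone_fx, cone_fy, in_disk, (-a, a), (-a, a))
print(area, math.sqrt(2) * math.pi * a ** 2)  # the two numbers should be close
```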

 Example 3.4.4. Area of a cylinder

Let a, b > 0. Find the surface area of


$$
S = \{ (x,y,z) \mid x^2 + z^2 = a^2,\ 0 \le y \le b \}
$$

Solution
The intersection of $x^2 + z^2 = a^2$ with any plane of constant $y$ is the circle of radius $a$ centred on $x = z = 0$. So $S$ is a bunch of circles stacked sideways. It is a cylinder on its side (with both ends open). By symmetry, the area of $S$ is four times the area of the part of $S$ that is in the first octant, which is

$$
S_1 = \big\{ (x,y,z) \mid z = f(x,y) = \sqrt{a^2 - x^2},\ 0 \le x \le a,\ 0 \le y \le b \big\}
$$

Since

$$
f_x(x,y) = -\frac{x}{\sqrt{a^2 - x^2}}
\qquad
f_y(x,y) = 0
$$

the first formula in 3.4.1 yields

$$
\begin{aligned}
dS &= \sqrt{1 + f_x(x,y)^2 + f_y(x,y)^2}\ dx\,dy \\
&= \sqrt{1 + \left( -\frac{x}{\sqrt{a^2 - x^2}} \right)^2}\ dx\,dy \\
&= \sqrt{1 + \frac{x^2}{a^2 - x^2}}\ dx\,dy \\
&= \frac{a}{\sqrt{a^2 - x^2}}\ dx\,dy
\end{aligned}
$$

So

$$
\text{Area}(S_1) = \int_0^a dx \int_0^b dy\ \frac{a}{\sqrt{a^2 - x^2}} = ab \int_0^a dx\ \frac{1}{\sqrt{a^2 - x^2}}
$$

The indefinite integral of $\frac{1}{\sqrt{a^2 - x^2}}$ is $\arcsin\frac{x}{a} + C$. (See the table of integrals in Appendix A.4. Alternatively, use the trig substitution $x = a\sin\theta$.) So

$$
\text{Area}(S_1) = ab \left[ \arcsin\frac{x}{a} \right]_0^a = ab\,\big[ \arcsin 1 - \arcsin 0 \big] = \frac{\pi}{2}ab
$$

and

Area(S) = 4Area(S1 ) = 2πab

We could have also come to this conclusion by using a little geometry, rather than using calculus. Cut open the cylinder by
cutting along a line parallel to the y -axis, and then flatten out the cylinder. This gives a rectangle. One side of the rectangle is
just a circle of radius a, straightened out. So the rectangle has sides of lengths 2πa and b and has area 2πab.

 Example 3.4.5. Area of a hemisphere

This time we compute the surface area of the hemisphere


$$
x^2 + y^2 + z^2 = a^2 \qquad z \ge 0
$$

(with $a > 0$). You probably know, from high school, that the answer is $\frac{1}{2} \times 4\pi a^2 = 2\pi a^2$. But you have probably not seen a derivation (see footnote 3) of this answer. Note that, since $x^2 + y^2 = a^2 - z^2$ on the hemisphere, the set of $(x,y)$'s for which there is a $z$ with $(x,y,z)$ on the hemisphere is exactly $\{ (x,y) \in \mathbb{R}^2 \mid x^2 + y^2 \le a^2 \}$. So the hemisphere is

$$
S = \big\{ (x,y,z) \mid z = \sqrt{a^2 - x^2 - y^2},\ x^2 + y^2 \le a^2 \big\}
$$

We will compute the area of $S$ by applying 3.4.1 to

$$
z = f(x,y) = \sqrt{a^2 - x^2 - y^2} \quad\text{with } (x,y) \text{ running over } x^2 + y^2 \le a^2
$$

The first formula in 3.4.1 yields

$$
\begin{aligned}
dS &= \sqrt{1 + f_x(x,y)^2 + f_y(x,y)^2}\ dx\,dy \\
&= \sqrt{1 + \left( \frac{-x}{\sqrt{a^2 - x^2 - y^2}} \right)^2 + \left( \frac{-y}{\sqrt{a^2 - x^2 - y^2}} \right)^2}\ dx\,dy \\
&= \sqrt{1 + \frac{x^2 + y^2}{a^2 - x^2 - y^2}}\ dx\,dy \\
&= \sqrt{\frac{a^2}{a^2 - x^2 - y^2}}\ dx\,dy
\end{aligned}
$$
a −x −y

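So the area is $\iint_{x^2 + y^2 \le a^2} \frac{a}{\sqrt{a^2 - x^2 - y^2}}\ dx\,dy$. To evaluate this integral, we switch to polar coordinates, substituting $x = r\cos\theta$, $y = r\sin\theta$. This gives

$$
\begin{aligned}
\text{area} &= \iint_{x^2 + y^2 \le a^2} \frac{a}{\sqrt{a^2 - x^2 - y^2}}\ dx\,dy
 = \int_0^a dr\ r \int_0^{2\pi} d\theta\ \frac{a}{\sqrt{a^2 - r^2}} \\
&= 2\pi a \int_0^a dr\ \frac{r}{\sqrt{a^2 - r^2}} \\
&= 2\pi a \int_{a^2}^{0} \frac{-du/2}{\sqrt{u}} \qquad\text{with } u = a^2 - r^2,\ du = -2r\,dr \\
&= 2\pi a\,\big[ -\sqrt{u}\, \big]_{a^2}^{0} \\
&= 2\pi a^2
\end{aligned}
$$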

as it should be.

 Example 3.4.6

Find the surface area of the part of the paraboloid $z = 2 - x^2 - y^2$ lying above the $xy$-plane.
Solution
The equation of the surface is of the form $z = f(x,y)$ with $f(x,y) = 2 - x^2 - y^2$. So

$$
f_x(x,y) = -2x \qquad f_y(x,y) = -2y
$$

and, by the first part of 3.4.1,


$$
\begin{aligned}
dS &= \sqrt{1 + f_x(x,y)^2 + f_y(x,y)^2}\ dx\,dy \\
&= \sqrt{1 + 4x^2 + 4y^2}\ dx\,dy
\end{aligned}
$$

The point $(x,y,z)$, with $z = 2 - x^2 - y^2$, lies above the $xy$-plane if and only if $z \ge 0$, or, equivalently, $2 - x^2 - y^2 \ge 0$. So the domain of integration is $\{ (x,y) \mid x^2 + y^2 \le 2 \}$ and

$$
\text{Surface Area} = \iint_{x^2 + y^2 \le 2} \sqrt{1 + 4x^2 + 4y^2}\ dx\,dy
$$

Switching to polar coordinates,

$$
\begin{aligned}
\text{Surface Area} &= \int_0^{2\pi} \int_0^{\sqrt{2}} \sqrt{1 + 4r^2}\ r\,dr\,d\theta \\
&= 2\pi \left[ \frac{1}{12}\big( 1 + 4r^2 \big)^{3/2} \right]_0^{\sqrt{2}} = \frac{\pi}{6}\,[27 - 1] \\
&= \frac{13}{3}\pi
\end{aligned}
$$
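
The polar form of this integral is convenient for a numerical cross-check. The Python sketch below is an illustration only; the function name `paraboloid_area` and the number of subintervals are assumptions. Since the integrand does not depend on $\theta$, the $\theta$ integral contributes a factor of $2\pi$ and only the radial integral $\int_0^{\sqrt{2}} \sqrt{1 + 4r^2}\ r\,dr$ is approximated, by the midpoint rule.

```python
import math

def paraboloid_area(n_r=2000):
    """Midpoint-rule estimate of 2*pi * integral from 0 to sqrt(2) of sqrt(1 + 4 r^2) r dr."""
    r_top = math.sqrt(2.0)  # the paraboloid meets the xy-plane where r^2 = 2
    dr = r_top / n_r
    radial = 0.0
    for k in range(n_r):
        r = (k + 0.5) * dr
        radial += math.sqrt(1.0 + 4.0 * r * r) * r * dr
    return 2.0 * math.pi * radial

area = paraboloid_area()
print(area, 13 * math.pi / 3)  # the two numbers should be close
```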

Exercises
Stage 1

 1

Let $0 < \theta < \frac{\pi}{2}$, and $a, b > 0$. Denote by $S$ the part of the surface $z = y\tan\theta$ with $0 \le x \le a$, $0 \le y \le b$.

1. Find the surface area of S without using any calculus.


2. Find the surface area of S by using Theorem 3.4.2.

 2

Let c > 0. Denote by S the part of the surface ax + by + cz = d with (x, y) running over the region D in the xy-plane. Find
the surface area of S, in terms of a, b, c, d and A(D), the area of the region D.

 3

Let a, b, c > 0. Denote by S the triangle with vertices (a, 0, 0), (0, b, 0) and (0, 0, c).
1. Find the surface area of S in three different ways, each using Theorem 3.4.2.
2. Denote by $T_{xy}$ the projection of $S$ onto the $xy$-plane. (It is the triangle with vertices $(0, 0, 0)$, $(a, 0, 0)$ and $(0, b, 0)$.) Similarly use $T_{xz}$ to denote the projection of $S$ onto the $xz$-plane and $T_{yz}$ to denote the projection of $S$ onto the $yz$-plane. Show that

$$
\text{Area}(S) = \sqrt{\text{Area}(T_{xy})^2 + \text{Area}(T_{xz})^2 + \text{Area}(T_{yz})^2}
$$

Stage 2

 4. ✳

Find the area of the part of the surface $z = y^{3/2}$ that lies above $0 \le x, y \le 1$.

 5. ✳
Find the surface area of the part of the paraboloid $z = a^2 - x^2 - y^2$ which lies above the $xy$-plane.

 6. ✳
Find the area of the portion of the cone $z^2 = x^2 + y^2$ lying between the planes $z = 2$ and $z = 3$.

 7. ✳

Determine the surface area of the surface given by $z = \frac{2}{3}\big( x^{3/2} + y^{3/2} \big)$, over the square $0 \le x \le 1$, $0 \le y \le 1$.

 8. ✳
1. To find the surface area of the surface $z = f(x,y)$ above the region $D$, we integrate $\iint_D F(x,y)\,dA$. What is $F(x,y)$?

2. Consider a “Death Star”, a ball of radius 2 centred at the origin with another ball of radius 2 centred at $(0, 0, 2\sqrt{3})$ cut out of it. The diagram below shows the slice where $y = 0$.

   1. The Rebels want to paint part of the surface of the Death Star hot pink; specifically, the concave part (indicated with a thick line in the diagram). To help them determine how much paint is needed, carefully fill in the missing parts of this integral:

      $$
      \text{surface area} = \int_{\underline{\hspace{2em}}}^{\underline{\hspace{2em}}} \int_{\underline{\hspace{2em}}}^{\underline{\hspace{2em}}} \underline{\hspace{4em}}\ dr\,d\theta
      $$

   2. What is the total surface area of the Death Star?

 9. ✳

Find the area of the cone $z^2 = x^2 + y^2$ between $z = 1$ and $z = 16$.

 10. ✳
Find the surface area of that part of the hemisphere $z = \sqrt{a^2 - x^2 - y^2}$ which lies within the cylinder $\left( x - \frac{a}{2} \right)^2 + y^2 = \left( \frac{a}{2} \right)^2$.

1. Recall 2.6.1.
2. As we mentioned above, the approximation below becomes exact when the limit dx, dy → 0 is taken in the definition of the
integral. See §3.3.5 in the CLP-4 text.
3. There is a pun hidden here, because you can (with a little thought) also get the surface area by differentiating the volume with
respect to the radius.

This page titled 3.4: Surface Area is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Joel Feldman,
Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit
history is available upon request.

3.5: Triple Integrals
Triple integrals, that is integrals over three dimensional regions, are just like double integrals, only more so. We decompose the
domain of integration into tiny cubes, for example, compute the contribution from each cube and then use integrals to add up all of
the different pieces. We'll go through the details now by means of a number of examples.

 Example 3.5.1
Find the mass inside the sphere $x^2 + y^2 + z^2 = 1$ if the density is $\rho(x, y, z) = |xyz|$.
Solution
The absolute values can complicate the computations. We can avoid those complications by exploiting the fact that, by
symmetry, the total mass of the sphere will be eight times the mass in the first octant. We shall cut the first octant part of the
sphere into tiny pieces using Cartesian coordinates. That is, we shall cut it up using planes of constant z, planes of constant y,
and planes of constant x, which we recall look like

First slice (the first octant part of) the sphere into horizontal plates by inserting many planes of constant z, with the
various values of z differing by dz. The figure on the left below shows the part of one plate in the first octant outlined in
red. Each plate
has thickness dz,
has z almost constant throughout the plate (it only varies by dz), and
has (x, y) running over $x \ge 0$, $y \ge 0$, $x^2 + y^2 \le 1 - z^2$.

The bottom plate starts at z = 0 and the top plate ends at z = 1. See the figure on the right below.

Concentrate on any one plate. Subdivide it into long thin “square” beams by inserting many planes of constant y, with the
various values of y differing by dy. The figure on the left below shows the part of one beam in the first octant outlined in
blue. Each beam
has cross-sectional area dy dz,
has z and y essentially constant throughout the beam, and
has x running over $0 \le x \le \sqrt{1 - y^2 - z^2}$.

The leftmost beam has, essentially, $y = 0$ and the rightmost beam has, essentially, $y = \sqrt{1 - z^2}$. See the figure on the
right below.

3.5.1 https://math.libretexts.org/@go/page/89219
Concentrate on any one beam. Subdivide it into tiny approximate cubes by inserting many planes of constant x, with the
various values of x differing by dx. The figure on the left below shows the top of one approximate cube in black. Each
cube
has volume dx dy dz, and
has x, y and z all essentially constant throughout the cube.
The first cube has, essentially, $x = 0$ and the last cube has, essentially, $x = \sqrt{1 - y^2 - z^2}$. See the figure on the right
below.

Now we can build up the mass.


Concentrate on one approximate cube. Let's say that it contains the point (x, y, z).
The cube has volume essentially dV = dx dy dz and
essentially has density ρ(x, y, z) = |xyz| = xyz (the absolute value can be dropped because x, y, z ≥ 0 in the first octant) and so
essentially has mass xyz dx dy dz.
To get the mass of any one beam, say the beam whose y coordinate runs from y to y + dy, we just add up the masses of the
approximate cubes in that beam, by integrating x from its smallest value on the beam, namely 0, to its largest value on the
beam, namely $\sqrt{1 - y^2 - z^2}$. The mass of the beam is thus

$$dy\, dz \int_0^{\sqrt{1-y^2-z^2}} dx\ xyz$$

To get the mass of any one plate, say the plate whose z coordinate runs from z to z + dz, we just add up the masses of the
beams in that plate, by integrating y from its smallest value on the plate, namely 0, to its largest value on the plate, namely
$\sqrt{1 - z^2}$. The mass of the plate is thus

$$dz \int_0^{\sqrt{1-z^2}} dy \int_0^{\sqrt{1-y^2-z^2}} dx\ xyz$$

To get the mass of the part of the sphere in the first octant, we just add up the masses of the plates that it contains, by
integrating z from its smallest value in the octant, namely 0, to its largest value on the sphere, namely 1. The mass in the
first octant is thus

$$\begin{aligned}
\int_0^1 dz \int_0^{\sqrt{1-z^2}} dy \int_0^{\sqrt{1-y^2-z^2}} dx\ xyz
&= \int_0^1 dz \int_0^{\sqrt{1-z^2}} dy\ yz \left[ \int_0^{\sqrt{1-y^2-z^2}} dx\ x \right] \\
&= \int_0^1 dz \int_0^{\sqrt{1-z^2}} dy\ \frac{1}{2}\, yz\,(1 - y^2 - z^2) \\
&= \int_0^1 dz \int_0^{\sqrt{1-z^2}} dy \left[ \frac{z(1-z^2)}{2}\, y - \frac{z}{2}\, y^3 \right] \\
&= \int_0^1 dz \left[ \frac{z(1-z^2)^2}{4} - \frac{z(1-z^2)^2}{8} \right] \\
&= \int_0^1 dz\ \frac{z(1-z^2)^2}{8} \\
&= \int_1^0 \frac{du}{-2}\ \frac{u^2}{8} \qquad \text{with } u = 1 - z^2,\ du = -2z\, dz \\
&= \frac{1}{48}
\end{aligned}$$

So the mass of the total (eight octant) sphere is $8 \times \frac{1}{48} = \frac{1}{6}$.
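As a rough sanity check on the value 1/48, the iterated integral above can be approximated numerically. The sketch below is not part of the original text; it assumes a simple midpoint rule (the helper names `midpoint` and `first_octant_mass` are hypothetical) and nests the three one-dimensional integrals in the same order as the plates, beams and cubes.

```python
import math

def midpoint(f, a, b, n=40):
    """Approximate the integral of f over [a, b] with the n-point midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def first_octant_mass(n=40):
    """Iterated integral of xyz: x innermost, then y, then z outermost."""
    def plate(z):
        # contribution per unit thickness of the plate at height z
        def beam(y):
            # contribution per unit cross-section of the beam at (y, z)
            x_max = math.sqrt(max(0.0, 1 - y * y - z * z))
            return midpoint(lambda x: x * y * z, 0.0, x_max, n)
        return midpoint(beam, 0.0, math.sqrt(1 - z * z), n)
    return midpoint(plate, 0.0, 1.0, n)

mass_octant = first_octant_mass()
print(mass_octant, 1 / 48)      # numerical value next to the exact first-octant mass
print(8 * mass_octant, 1 / 6)   # numerical value next to the exact total mass
```

The limits passed to `midpoint` are exactly the limits read off from the slicing argument, so a mistake in any of them would show up as a mismatch between the two printed numbers.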

Consider, for example, the limits of integration for the integral


$$\int_0^1 dz \int_0^{\sqrt{1-z^2}} dy \int_0^{\sqrt{1-y^2-z^2}} dx\ xyz
= \int_0^1 \left( \int_0^{\sqrt{1-z^2}} \left( \int_0^{\sqrt{1-y^2-z^2}} xyz\, dx \right) dy \right) dz$$

that we have just evaluated in Example 3.5.1.


When we are integrating over the innermost integral, with respect to x, the quantities y and z are treated as constants. In
particular, y and z may appear in the limits of integration for the x-integral, but x may not appear in those limits.
When we are integrating over y, we have already integrated out x; x no longer exists. The quantity z is treated as a constant. In
particular, z, but neither x nor y, may appear in the limits of integration for the y-integral.
Finally, when we are integrating over z, we have already integrated out x and y; they no longer exist. None of x, y or z, may
appear in the limits of integration for the z-integral.

 Example 3.5.2

In practice, often the hardest part of dealing with a triple integral is setting up the limits of integration. In this example, we'll
concentrate on exactly that.
Let V be the solid region in $\mathbb{R}^3$ bounded by the planes $x = 0$, $y = 0$, $z = 0$, $y = 4 - x$, and the surface $z = 4 - x^2$. We are
now going to write $\iiint_V f(x, y, z)\, dV$ as an iterated integral (i.e. find the limits of integration) in two different ways. Here f is
just some general, unspecified, function.


First, we'll figure out what V looks like. The following three figures show
the part of the first octant with y ≤ 4 − x (except that it continues vertically upward)
the part of the first octant with $z \le 4 - x^2$ (except that it continues to the right)
the part of the first octant with both $y \le 4 - x$ and $z \le 4 - x^2$. That's

$$V = \{(x, y, z) \mid x \ge 0,\ y \ge 0,\ z \ge 0,\ x + y \le 4,\ z \le 4 - x^2\}$$

The iterated integral $\iiint_V f(x, y, z)\, dz\, dy\, dx = \int \left( \int \left( \int f(x, y, z)\, dz \right) dy \right) dx$: For this iterated integral, the outside integral is
with respect to x, so we first slice up V using planes of constant x, as in the figure below.

Observe from that figure that, on V,


x runs from 0 to 2, and
for each fixed x in that range, y runs from 0 to 4 − x and
for each fixed (x, y) as above, z runs from 0 to $4 - x^2$.

So

$$\begin{aligned}
\iiint_V f(x, y, z)\, dz\, dy\, dx &= \int_0^2 dx \int_0^{4-x} dy \int_0^{4-x^2} dz\ f(x, y, z) \\
&= \int_0^2 \int_0^{4-x} \int_0^{4-x^2} f(x, y, z)\, dz\, dy\, dx
\end{aligned}$$

The iterated integral $\iiint_V f(x, y, z)\, dy\, dx\, dz = \int \left( \int \left( \int f(x, y, z)\, dy \right) dx \right) dz$: For this iterated integral, the outside integral is
with respect to z, so we first slice up V using planes of constant z, as in the figure below.

Observe from that figure that, on V,


z runs from 0 to 4, and
for each fixed z in that range, x runs from 0 to $\sqrt{4 - z}$ and
for each fixed (x, z) as above, y runs from 0 to 4 − x.
So

$$\begin{aligned}
\iiint_V f(x, y, z)\, dy\, dx\, dz &= \int_0^4 dz \int_0^{\sqrt{4-z}} dx \int_0^{4-x} dy\ f(x, y, z) \\
&= \int_0^4 \int_0^{\sqrt{4-z}} \int_0^{4-x} f(x, y, z)\, dy\, dx\, dz
\end{aligned}$$

 Example 3.5.3

As was said in the last example, in practice, often the hardest parts of dealing with a triple integral concern the limits of
integration. In this example, we'll again concentrate on exactly that. This time, we will consider the integral
$$I = \int_0^2 dy \int_0^{2-y} dz \int_0^{\frac{2-y}{2}} dx\ f(x, y, z)$$

and we will re-express I with the outside integral being over z. We will figure out the limits of integration for both the order
$\int dz \int dx \int dy\ f(x, y, z)$ and for the order $\int dz \int dy \int dx\ f(x, y, z)$.

Our first task is to get a good idea as to what the domain of integration looks like. We start by reading off of the given integral
that
the outside integral says that y runs from 0 to 2, and
the middle integral says that, for each fixed y in that range, z runs from 0 to 2 − y and
the inside integral says that, for each fixed (y, z) as above, x runs from 0 to $\frac{2-y}{2}$.

So the domain of integration is

$$V = \left\{ (x, y, z) \,\middle|\, 0 \le y \le 2,\ 0 \le z \le 2 - y,\ 0 \le x \le \tfrac{2-y}{2} \right\} \tag{$*$}$$

We'll sketch V shortly. Because it is generally easier to make 2d sketches than it is to make 3d sketches, we'll first make a 2d
sketch of the part of V that lies in the vertical plane y = Y . Here Y is any constant between 0 and 2. Looking at the definition
of V , we see that the point (x, Y , z) lies in V if and only if
$$0 \le z \le 2 - Y \qquad 0 \le x \le \frac{2 - Y}{2}$$

Here, on the left, is a (2d) sketch of all (x, z)'s that obey those inequalities, and, on the right, is a (3d) sketch of all (x, Y , z)'s
that obey those inequalities.

So our solid V consists of a bunch of vertical rectangles stacked sideways along the y -axis. The rectangle in the plane y = Y
has side lengths $\frac{2-Y}{2}$ and $2 - Y$. As we move from the plane y = Y = 0, i.e. the xz-plane, to the plane y = Y = 2, the
rectangle decreases in size linearly from a one by two rectangle, when Y = 0, to a zero by zero rectangle, i.e. a point, when
Y = 2. Here is a sketch of V together with a typical y = Y rectangle.

To re-express the given integral with the outside integral being with respect to z, we have to slice up V into horizontal plates
by inserting planes of constant z. So we have to figure out what the part of V that lies in the horizontal plane z = Z looks like.
From the figure above, we see that, in V , the smallest value of z is 0 and the biggest value of z is 2. So Z is any constant
between 0 and 2. Again looking at the definition of V in (∗) above, we see that the point (x, y, Z) lies in V if and only if

$$y \ge 0 \qquad y \le 2 \qquad y \le 2 - Z \qquad x \ge 0 \qquad 2x + y \le 2$$

Here, on the top, is a (2d) sketch showing the top view of all (x, y)'s that obey those inequalities, and, on the bottom, is a (3d)
sketch of all (x, y, Z)'s that obey those inequalities.

To express I as an integral with the order of integration ∫ dz ∫ dy ∫ dx f (x, y, z), we subdivide the plate at height z into
vertical strips as in the figure

Since
y is essentially constant on each strip with the leftmost strip having y = 0 and the rightmost strip having y = 2 − z and
for each fixed y in that range, x runs from 0 to $\frac{2-y}{2}$

we have

$$I = \int_0^2 dz \int_0^{2-z} dy \int_0^{\frac{2-y}{2}} dx\ f(x, y, z)$$

Alternatively, to express I as an integral with the order of integration ∫ dz ∫ dx ∫ dy f (x, y, z), we subdivide the plate at
height z into horizontal strips as in the figure

Since
x is essentially constant on each strip with the first strip having x = 0 and the last strip having x = 1 and
for each fixed x between 0 and z/2, y runs from 0 to 2 − z and
for each fixed x between z/2 and 1, y runs from 0 to 2 − 2x
we have
$$I = \int_0^2 dz \int_0^{z/2} dx \int_0^{2-z} dy\ f(x, y, z) + \int_0^2 dz \int_{z/2}^1 dx \int_0^{2-2x} dy\ f(x, y, z)$$
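The three descriptions of I found in this example can be compared numerically. The sketch below is an illustration added here, not part of the original text; it uses a midpoint rule and a hypothetical test integrand `sample_f`, and all function names are assumptions.

```python
def midpoint(f, a, b, n=40):
    """n-point midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def original_order(f, n=40):
    """y outermost, then z, then x innermost, as in the given integral I."""
    return midpoint(
        lambda y: midpoint(
            lambda z: midpoint(lambda x: f(x, y, z), 0.0, (2 - y) / 2, n),
            0.0, 2 - y, n),
        0.0, 2.0, n)

def order_dz_dy_dx(f, n=40):
    """z outermost, then y, then x innermost."""
    return midpoint(
        lambda z: midpoint(
            lambda y: midpoint(lambda x: f(x, y, z), 0.0, (2 - y) / 2, n),
            0.0, 2 - z, n),
        0.0, 2.0, n)

def order_dz_dx_dy(f, n=40):
    """z outermost, then x, then y innermost; the x-range splits at x = z/2."""
    def plate(z):
        left = midpoint(
            lambda x: midpoint(lambda y: f(x, y, z), 0.0, 2 - z, n),
            0.0, z / 2, n)
        right = midpoint(
            lambda x: midpoint(lambda y: f(x, y, z), 0.0, 2 - 2 * x, n),
            z / 2, 1.0, n)
        return left + right
    return midpoint(plate, 0.0, 2.0, n)

def sample_f(x, y, z):
    # hypothetical test integrand, not taken from the text
    return x + y * z

print(original_order(sample_f), order_dz_dy_dx(sample_f), order_dz_dx_dy(sample_f))
```

All three numbers should agree up to discretization error. With f = 1 each order approximates the volume of V, which is 4/3 by a direct hand computation.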

Exercises
Stage 1

 1
Evaluate the integral
$$\iint_R \sqrt{b^2 - y^2}\ dx\, dy \qquad \text{where } R \text{ is the rectangle } 0 \le x \le a,\ 0 \le y \le b$$

without using iteration. Instead, interpret the integral geometrically.

 2✳

Find the total mass of the rectangular box [0, 1] × [0, 2] × [0, 3] (that is, the box defined by the inequalities 0 ≤ x ≤ 1,

0 ≤ y ≤ 2, 0 ≤ z ≤ 3 ), with density function h(x, y, z) = x.

Stage 2

 3
Evaluate $\iiint_R x\, dV$ where R is the tetrahedron bounded by the coordinate planes and the plane $\frac{x}{a} + \frac{y}{b} + \frac{z}{c} = 1$.

 4

Evaluate $\iiint_R y\, dV$ where R is the portion of the cube $0 \le x, y, z \le 1$ lying above the plane $y + z = 1$ and below the plane
$x + y + z = 2$.

 5
For each of the following, express the given iterated integral as an iterated integral in which the integrations are performed in
the order: first z, then y, then x.
1. $\displaystyle \int_0^1 dz \int_0^{1-z} dy \int_0^{1-z} dx\ f(x, y, z)$

2. $\displaystyle \int_0^1 dz \int_{\sqrt{z}}^1 dy \int_0^y dx\ f(x, y, z)$

 6✳

A triple integral $\iiint_E f\, dV$ is given in iterated form by

$$\int_{y=-1}^{y=1} \int_{z=0}^{z=1-y^2} \int_{x=0}^{2-y-z} f(x, y, z)\, dx\, dz\, dy$$

1. Draw a reasonably accurate picture of E in 3 dimensions. Be sure to show the units on the coordinate axes.
2. Rewrite the triple integral $\iiint_E f\, dV$ as one or more iterated triple integrals in the order

$$\int_{y=}^{y=} \int_{x=}^{x=} \int_{z=}^{z=} f(x, y, z)\, dz\, dx\, dy$$

 7✳

A triple integral $\iiint_E f(x, y, z)\, dV$ is given in the iterated form

$$J = \int_0^1 \int_0^{1-\frac{x}{2}} \int_0^{4-2x-4z} f(x, y, z)\, dy\, dz\, dx$$

1. Sketch the domain E in 3 dimensions. Be sure to show the units.
2. Rewrite the integral as one or more iterated integrals in the form

$$J = \int_{y=}^{y=} \int_{x=}^{x=} \int_{z=}^{z=} f(x, y, z)\, dz\, dx\, dy$$

 8✳

Write the integral given below 5 other ways, each with a different order of integration.
$$I = \int_0^1 \int_{\sqrt{x}}^1 \int_0^{1-y} f(x, y, z)\, dz\, dy\, dx$$

 9✳

Let $I = \iiint_E f(x, y, z)\, dV$ where E is the tetrahedron with vertices (−1, 0, 0), (0, 0, 0), (0, 0, 3) and (0, −2, 0).

1. Rewrite the integral I in the form

$$I = \int_{x=}^{x=} \int_{y=}^{y=} \int_{z=}^{z=} f(x, y, z)\, dz\, dy\, dx$$

2. Rewrite the integral I in the form

$$I = \int_{z=}^{z=} \int_{x=}^{x=} \int_{y=}^{y=} f(x, y, z)\, dy\, dx\, dz$$

 10 ✳

Let T denote the tetrahedron bounded by the coordinate planes x = 0, y = 0, z = 0 and the plane x + y + z = 1. Compute
$$K = \iiint_T \frac{1}{(1 + x + y + z)^4}\, dV$$

 11 ✳

Let E be the portion of the first octant which is above the plane z = x +y and below the plane z = 2. The density in E is
ρ(x, y, z) = z. Find the mass of E.

 12 ✳

Evaluate the triple integral $\iiint_E x\, dV$, where E is the region in the first octant bounded by the parabolic cylinder $y = x^2$ and
the planes $y + z = 1$, $x = 0$, and $z = 0$.

 13 ✳
Let E be the region in the first octant bounded by the coordinate planes, the plane $x + y = 1$ and the surface $z = y^2$. Evaluate
$\iiint_E z\, dV$.

 14 ✳

Evaluate $\iiint_R y z^2 e^{-xyz}\, dV$ over the rectangular box

$$R = \{(x, y, z) \mid 0 \le x \le 1,\ 0 \le y \le 2,\ 0 \le z \le 3\}$$

 15 ✳
1. Sketch the surface given by the equation $z = 1 - x^2$.
2. Let E be the solid bounded by the plane $y = 0$, the cylinder $z = 1 - x^2$, and the plane $y = z$. Set up the integral

$$\iiint_E f(x, y, z)\, dV$$

as an iterated integral.

 16 ✳

Let
$$J = \int_0^1 \int_0^x \int_0^y f(x, y, z)\, dz\, dy\, dx$$

Express J as an integral where the integrations are to be performed in the order x first, then y, then z.

 17 ✳

Let E be the region bounded by $z = 2x$, $z = y^2$, and $x = 3$. The triple integral $\iiint_E f(x, y, z)\, dV$ can be expressed as an
iterated integral in the following three orders of integration. Fill in the limits of integration in each case. No explanation
required.

$$\int_{y=}^{y=} \int_{x=}^{x=} \int_{z=}^{z=} f(x, y, z)\, dz\, dx\, dy$$

$$\int_{y=}^{y=} \int_{z=}^{z=} \int_{x=}^{x=} f(x, y, z)\, dx\, dz\, dy$$

$$\int_{z=}^{z=} \int_{x=}^{x=} \int_{y=}^{y=} f(x, y, z)\, dy\, dx\, dz$$

 18 ✳

Let E be the region inside the cylinder $x^2 + y^2 = 1$, below the plane $z = y$ and above the plane $z = -1$. Express the integral

$$\iiint_E f(x, y, z)\, dV$$

as three different iterated integrals corresponding to the orders of integration: (a) dz dx dy, (b) dx dy dz, and (c) dy dz dx.

 19 ✳

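Let E be the region bounded by the planes $y = 0$, $y = 2$, $y + z = 3$ and the surface $z = x^2$. Consider the integral

$$I = \iiint_E f(x, y, z)\, dV$$

Fill in the blanks below. In each part below, you may need only one integral to express your answer. In that case, leave the
other blank.

1. $\displaystyle I = \int_{\underline{\quad}}^{\underline{\quad}} \int_{\underline{\quad}}^{\underline{\quad}} \int_{\underline{\quad}}^{\underline{\quad}} f(x, y, z)\, dz\, dx\, dy + \int_{\underline{\quad}}^{\underline{\quad}} \int_{\underline{\quad}}^{\underline{\quad}} \int_{\underline{\quad}}^{\underline{\quad}} f(x, y, z)\, dz\, dx\, dy$

2. $\displaystyle I = \int_{\underline{\quad}}^{\underline{\quad}} \int_{\underline{\quad}}^{\underline{\quad}} \int_{\underline{\quad}}^{\underline{\quad}} f(x, y, z)\, dx\, dy\, dz + \int_{\underline{\quad}}^{\underline{\quad}} \int_{\underline{\quad}}^{\underline{\quad}} \int_{\underline{\quad}}^{\underline{\quad}} f(x, y, z)\, dx\, dy\, dz$

3. $\displaystyle I = \int_{\underline{\quad}}^{\underline{\quad}} \int_{\underline{\quad}}^{\underline{\quad}} \int_{\underline{\quad}}^{\underline{\quad}} f(x, y, z)\, dy\, dx\, dz + \int_{\underline{\quad}}^{\underline{\quad}} \int_{\underline{\quad}}^{\underline{\quad}} \int_{\underline{\quad}}^{\underline{\quad}} f(x, y, z)\, dy\, dx\, dz$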

 20 ✳

Evaluate $\iiint_E z\, dV$, where E is the region bounded by the planes $y = 0$, $z = 0$, $x + y = 2$ and the cylinder $y^2 + z^2 = 1$ in the
first octant.

 21 ✳

Find $\iiint_D x\, dV$ where D is the tetrahedron bounded by the planes $x = 1$, $y = 1$, $z = 1$, and $x + y + z = 2$.

 22 ✳

The solid region T is bounded by the planes $x = 0$, $y = 0$, $z = 0$, and $x + y + z = 2$ and the surface $x^2 + z = 1$.

1. Draw the region indicating coordinates of all corners.
2. Calculate $\iiint_T x\, dV$.

This page titled 3.5: Triple Integrals is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Joel Feldman,
Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit
history is available upon request.

3.6: Triple Integrals in Cylindrical Coordinates
Many problems possess natural symmetries. We can make our work easier by using coordinate systems, like polar coordinates, that
are tailored to those symmetries. We will look at two more such coordinate systems — cylindrical and spherical coordinates.

Cylindrical Coordinates
In the event that we wish to compute, for example, the mass of an object that is invariant under rotations about the z-axis¹, it is
advantageous to use a natural generalization of polar coordinates to three dimensions. The coordinate system is called cylindrical
coordinates.

 Definition 3.6.1

Cylindrical coordinates are denoted² r, θ and z and are defined by

$$\begin{aligned}
r &= \text{the distance from } (x, y, 0) \text{ to } (0, 0, 0) \\
  &= \text{the distance from } (x, y, z) \text{ to the } z\text{-axis} \\
\theta &= \text{the angle between the positive } x \text{ axis and the line joining } (x, y, 0) \text{ to } (0, 0, 0) \\
z &= \text{the signed distance from } (x, y, z) \text{ to the } xy\text{-plane}
\end{aligned}$$

That is, r and θ are the usual polar coordinates and z is the usual z.

The Cartesian and cylindrical coordinates are related by³

 Equation 3.6.2

$$\begin{aligned}
x &= r \cos\theta & y &= r \sin\theta & z &= z \\
r &= \sqrt{x^2 + y^2} & \theta &= \arctan\frac{y}{x} & z &= z
\end{aligned}$$
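The conversion formulas of Equation 3.6.2 translate directly into code. The sketch below is an illustration added here, with hypothetical function names. One deliberate deviation is flagged: it uses `math.atan2(y, x)` rather than a literal arctan(y/x), on the assumption that we want the angle in the correct quadrant even when x ≤ 0. The two agree whenever x > 0.

```python
import math

def cylindrical_to_cartesian(r, theta, z):
    """(r, theta, z) -> (x, y, z) using x = r cos(theta), y = r sin(theta)."""
    return (r * math.cos(theta), r * math.sin(theta), z)

def cartesian_to_cylindrical(x, y, z):
    """(x, y, z) -> (r, theta, z) with r >= 0 and theta in (-pi, pi]."""
    r = math.hypot(x, y)        # sqrt(x^2 + y^2)
    theta = math.atan2(y, x)    # equals arctan(y/x) when x > 0
    return (r, theta, z)

# A point in the third quadrant of the xy-plane, where arctan(y/x) alone
# would return pi/4 and so point in the opposite direction.
r, theta, z = cartesian_to_cylindrical(-1.0, -1.0, 2.0)
print(r, theta, z)
print(cylindrical_to_cartesian(r, theta, z))
```

Any other choice of angle range, for example 0 ≤ θ < 2π, describes the same points; only the representative angle changes.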

Here are sketches of surfaces of constant r, constant θ, and constant z.

The Volume Element in Cylindrical Coordinates


Before we can start integrating using these coordinates we need to determine the volume element. Recall that before integrating in
polar coordinates, we had to establish that dA = r dr dθ. In the arguments that follow we establish that dV = r dr dθ dz.
If we cut up a solid by
first slicing it into horizontal plates of thickness dz by using planes of constant z,

3.6.1 https://math.libretexts.org/@go/page/92245
and then subdividing the plates into wedges using surfaces of constant θ, say with the difference between successive θ 's being
dθ,

and then subdividing the wedges into approximate cubes using surfaces of constant r, say with the difference between
successive r's being dr,

we end up with approximate cubes that look like

When we introduced slices using surfaces of constant r, the difference between the successive r's was dr, so the indicated edge
of the cube has length dr.
When we introduced slices using surfaces of constant z, the difference between the successive z 's was dz, so the vertical edges
of the cube have length dz.
When we introduced slices using surfaces of constant θ, the difference between the successive θ's was dθ, so the remaining
edges of the cube are circular arcs of radius essentially⁴ r that subtend an angle dθ, and so have length r dθ. See the derivation
of equation 3.2.5.
So the volume of the approximate cube in cylindrical coordinates is (essentially⁵)

 Equation 3.6.3

dV = r dr dθ dz
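To see in what sense the formula is only "essentially" exact, one can compare it with the exact volume of the piece between radii r and r + dr, angles θ and θ + dθ, and heights z and z + dz, which is ½((r + dr)² − r²) dθ dz. The following sketch is an added illustration with hypothetical function names; it prints the ratio of the two as the piece shrinks.

```python
def exact_cell_volume(r, dr, dtheta, dz):
    """Exact volume of the piece r..r+dr, theta..theta+dtheta, z..z+dz."""
    return 0.5 * ((r + dr) ** 2 - r ** 2) * dtheta * dz

def approx_cell_volume(r, dr, dtheta, dz):
    """The volume element r dr dtheta dz."""
    return r * dr * dtheta * dz

for k in range(1, 5):
    d = 10.0 ** (-k)
    ratio = exact_cell_volume(1.0, d, d, d) / approx_cell_volume(1.0, d, d, d)
    print(d, ratio)   # the ratio is 1 + dr/(2r), which tends to 1 as d shrinks
```

The relative error dr/(2r) vanishes in the limit, which is why the extra term does not survive when the Riemann sums are turned into integrals.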

Sample Integrals in Cylindrical Coordinates


Now we can use 3.6.3 to handle a variant of Example 3.5.1 in which the density is invariant under rotations around the z-axis.
Cylindrical coordinates are tuned to provide easier integrals to evaluate when the integrand is invariant under rotations about the
z-axis, or when the domain of integration is cylindrical.

 Example 3.6.4

Find the mass of the solid body consisting of the inside of the sphere $x^2 + y^2 + z^2 = 1$ if the density is $\rho(x, y, z) = x^2 + y^2$.

Solution
Before we get started, note that $x^2 + y^2$ is the square of the distance from (x, y, z) to the z-axis. Consequently both the
integrand, $x^2 + y^2$, and the domain of integration, $x^2 + y^2 + z^2 \le 1$, and hence our solid, are invariant under rotations about
the z-axis⁶. That makes this integral a good candidate for cylindrical coordinates.
Again, by symmetry the total mass of the sphere will be eight times the mass in the first octant. We shall cut the first octant part
of the sphere into tiny pieces using cylindrical coordinates. That is, we shall cut it up using planes of constant z, planes of
constant θ, and surfaces of constant r.
First slice (the first octant part of) the sphere into horizontal plates by inserting many planes of constant z, with the
various values of z differing by dz. The figure on the left below shows the part of one plate in the first octant outlined in
red. Each plate
has thickness dz,
has z essentially constant on the plate, and
has (x, y) running over $x \ge 0$, $y \ge 0$, $x^2 + y^2 \le 1 - z^2$. In cylindrical coordinates, r runs from 0 to $\sqrt{1 - z^2}$ and θ
runs from 0 to $\frac{\pi}{2}$.

The bottom plate has, essentially, z = 0 and the top plate has, essentially, z = 1. See the figure on the right below.

So far, this looks just like what we did in Example 3.5.1.


Concentrate on any one plate. Subdivide it into wedges by inserting many planes of constant θ, with the various values of θ
differing by dθ. The figure on the left below shows one such wedge outlined in blue. Each wedge
has z and θ essentially constant on the wedge, and
has r running over $0 \le r \le \sqrt{1 - z^2}$.
The leftmost wedge has, essentially, $\theta = 0$ and the rightmost wedge has, essentially, $\theta = \frac{\pi}{2}$. See the figure on the right
below.

Concentrate on any one wedge. Subdivide it into tiny approximate cubes by inserting many surfaces of constant r, with the
various values of r differing by dr. The figure on the left below shows the top of one approximate cube in black. Each cube
has volume r dr dθ dz, by 3.6.3, and
has r, θ and z all essentially constant on the cube.
The first cube has, essentially, $r = 0$ and the last cube has, essentially, $r = \sqrt{1 - z^2}$. See the figure on the right below.
Now we can build up the mass.
Concentrate on one approximate cube. Let's say that it contains the point with cylindrical coordinates r, θ and z.
The cube has volume essentially dV = r dr dθ dz and
essentially has density $\rho(x, y, z) = \rho(r \cos\theta, r \sin\theta, z) = r^2$ and so
essentially has mass $r^3\, dr\, d\theta\, dz$. (See how nice the right coordinate system can be!)

To get the mass of any one wedge, say the wedge whose θ coordinate runs from θ to θ + dθ, we just add up the masses of the
approximate cubes in that wedge, by integrating r from its smallest value on the wedge, namely 0, to its largest value on
the wedge, namely $\sqrt{1 - z^2}$. The mass of the wedge is thus

$$d\theta\, dz \int_0^{\sqrt{1-z^2}} dr\ r^3$$

To get the mass of any one plate, say the plate whose z coordinate runs from z to z + dz, we just add up the masses of the
wedges in that plate, by integrating θ from its smallest value on the plate, namely 0, to its largest value on the plate, namely
$\frac{\pi}{2}$. The mass of the plate is thus

$$dz \int_0^{\pi/2} d\theta \int_0^{\sqrt{1-z^2}} dr\ r^3$$

To get the mass of the part of the sphere in the first octant, we just add up the masses of the plates that it contains, by
integrating z from its smallest value in the octant, namely 0, to its largest value on the sphere, namely 1. The mass in the
first octant is thus
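$$\begin{aligned}
\int_0^1 dz \int_0^{\pi/2} d\theta \int_0^{\sqrt{1-z^2}} dr\ r^3
&= \frac{1}{4} \int_0^1 dz \int_0^{\pi/2} d\theta\ (1 - z^2)^2 \\
&= \frac{\pi}{8} \int_0^1 dz\ (1 - z^2)^2 \\
&= \frac{\pi}{8} \int_0^1 dz\ (1 - 2z^2 + z^4) \\
&= \frac{\pi}{8} \underbrace{\left[ 1 - \frac{2}{3} + \frac{1}{5} \right]}_{8/15} \\
&= \frac{1}{15} \pi
\end{aligned}$$

So the mass of the total (eight octant) sphere is $8 \times \frac{1}{15} \pi = \frac{8}{15} \pi$.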

Just by way of comparison, here is the integral in Cartesian coordinates that gives the mass in the first octant. (We found the
limits of integration in Example 3.5.1.)
$$\int_0^1 dz \int_0^{\sqrt{1-z^2}} dy \int_0^{\sqrt{1-y^2-z^2}} dx\ (x^2 + y^2)$$
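The cylindrical-coordinate computation in this example can be checked numerically in a few lines. The sketch below is an added illustration, not part of the original text. It assumes a midpoint rule and uses hypothetical helper names, with the integrand r³ being the density r² times the factor r from the volume element.

```python
import math

def midpoint(f, a, b, n=40):
    """n-point midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def first_octant_mass(n=40):
    """Integrate r^3 with r innermost, then theta, then z outermost."""
    def plate(z):
        r_max = math.sqrt(1 - z * z)
        def wedge(theta):
            # the integrand does not depend on theta, but the nesting
            # mirrors the plate / wedge / cube decomposition in the text
            return midpoint(lambda r: r ** 3, 0.0, r_max, n)
        return midpoint(wedge, 0.0, math.pi / 2, n)
    return midpoint(plate, 0.0, 1.0, n)

total_mass = 8 * first_octant_mass()
print(total_mass, 8 * math.pi / 15)   # numerical value next to the exact mass
```

Here every limit of integration is either a constant or a simple function of z, whereas the Cartesian version needs two nested square roots.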

In the next example, we compute the moment of inertia of a right circular cone. The Definition 3.3.13 of the moment of inertia was
restricted to two dimensions. However, as was pointed out at the time, the same analysis extends naturally to the definition

 Equation 3.6.5

$$I_A = \iiint_V D(x, y, z)^2\, \rho(x, y, z)\ dx\, dy\, dz$$

of the moment of inertia of a solid V in three dimensions. Here


ρ(x, y, z) is the mass density of the solid at the point (x, y, z) and
D(x, y, z) is the distance from (x, y, z) to the axis of rotation.

 Example 3.6.6

Find the moment of inertia of a right circular cone


of radius a,
of height h, and
of constant density with mass M
about an axis through the vertex (i.e. the tip of the cone) and parallel to the base.
Solution
Here is a sketch of the cone.

Let's pick a coordinate system with


the vertex at the origin,
the cone symmetric about the z -axis and
the axis of rotation being the y -axis.
and call the cone V.

We shall use 3.6.5 to find the moment of inertia. In the current problem, the axis of rotation is the y-axis. The point on the
y-axis that is closest to (x, y, z) is (0, y, 0) so that the distance from (x, y, z) to the axis is just

$$D(x, y, z) = \sqrt{x^2 + z^2}$$

Our solid has constant density and mass M , so

$$\rho(x, y, z) = \frac{M}{\text{Volume}(V)}$$

The formula

$$\text{Volume}(V) = \frac{1}{3} \pi a^2 h$$

for the volume of a cone was derived in Example 1.6.1 of the CLP-2 text and in Appendix B.5.2 of the CLP-1 text. However
because of the similarity between the integral $\text{Volume}(V) = \iiint_V dx\, dy\, dz$ and the integral $\iiint_V (x^2 + z^2)\, dx\, dy\, dz$, that we
need for our computation of $I_A$, it is easy to rederive the volume formula and we shall do so.

We'll evaluate both of the integrals above using cylindrical coordinates.


Start by slicing the cone into horizontal plates by inserting many planes of constant z, with the various values of z differing
by dz.

Each plate
is a circular disk of thickness dz.
By similar triangles, as in the figure on the right below, the disk at height z has radius R obeying
$$\frac{R}{z} = \frac{a}{h} \implies R = \frac{a}{h} z$$

So the disk at height z has the cylindrical coordinates r running from 0 to $\frac{a}{h} z$ and θ running from 0 to 2π.

The bottom plate has, essentially, z = 0 and the top plate has, essentially, z = h.
Now concentrate on any one plate. Subdivide it into wedges by inserting many planes of constant θ, with the various values
of θ differing by dθ.
The first wedge has, essentially θ = 0 and the last wedge has, essentially, θ = 2π.
Concentrate on any one wedge. Subdivide it into tiny approximate cubes⁷ by inserting many surfaces of constant r, with
the various values of r differing by dr. Each cube
has volume r dr dθ dz, by 3.6.3.
The first cube has, essentially, $r = 0$ and the last cube has, essentially, $r = \frac{a}{h} z$.

So the two integrals of interest are


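$$\begin{aligned}
\iiint_V dx\, dy\, dz &= \int_0^h dz \int_0^{2\pi} d\theta \int_0^{\frac{a}{h} z} dr\ r \\
&= \int_0^h dz \int_0^{2\pi} d\theta\ \frac{1}{2} \left( \frac{a}{h} z \right)^2 = \frac{a^2 \pi}{h^2} \int_0^h dz\ z^2 \\
&= \frac{1}{3} \pi a^2 h
\end{aligned}$$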

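as expected, and

$$\begin{aligned}
\iiint_V (x^2 + z^2)\, dx\, dy\, dz &= \int_0^h dz \int_0^{2\pi} d\theta \int_0^{\frac{a}{h} z} dr\ r\, \overbrace{\left( r^2 \cos^2\theta + z^2 \right)}^{x^2 + z^2} \\
&= \int_0^h dz \int_0^{2\pi} d\theta \left[ \frac{1}{4} \left( \frac{a}{h} z \right)^4 \cos^2\theta + \frac{1}{2} \left( \frac{a}{h} z \right)^2 z^2 \right] \\
&= \int_0^h dz \left[ \frac{1}{4} \frac{a^4}{h^4} + \frac{a^2}{h^2} \right] \pi z^4
\qquad \text{since } \int_0^{2\pi} \cos^2\theta\, d\theta = \pi \text{ by Remark 3.3.5} \\
&= \frac{1}{5} \left[ \frac{1}{4} \frac{a^4}{h^4} + \frac{a^2}{h^2} \right] \pi h^5
\end{aligned}$$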

Putting everything together, the moment of inertia is


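$$\begin{aligned}
I_A &= \iiint_V \overbrace{(x^2 + z^2)}^{D(x,y,z)^2}\ \overbrace{\frac{M}{\frac{1}{3} \pi a^2 h}}^{\rho(x,y,z)}\ dx\, dy\, dz \\
&= 3\, \frac{M}{\pi a^2 h}\ \frac{1}{5} \left[ \frac{1}{4} \frac{a^4}{h^4} + \frac{a^2}{h^2} \right] \pi h^5 \\
&= \frac{3}{20} M \left( a^2 + 4 h^2 \right)
\end{aligned}$$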
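The closed form can be compared against a direct numerical evaluation of the two cylindrical-coordinate integrals. The sketch below is an added illustration; the sample values of a, h and M and all function names are hypothetical, and a plain midpoint rule is assumed.

```python
import math

def midpoint(f, a, b, n=40):
    """n-point midpoint rule for the integral of f over [a, b]."""
    step = (b - a) / n
    return step * sum(f(a + (i + 0.5) * step) for i in range(n))

def cone_integral(g, a, h, n=40):
    """Integrate g(r, theta, z) * r over 0 <= z <= h, 0 <= theta <= 2 pi, 0 <= r <= (a/h) z."""
    return midpoint(
        lambda z: midpoint(
            lambda theta: midpoint(lambda r: g(r, theta, z) * r, 0.0, a * z / h, n),
            0.0, 2 * math.pi, n),
        0.0, h, n)

a, h, M = 1.0, 2.0, 1.0   # hypothetical sample radius, height and mass
volume = cone_integral(lambda r, theta, z: 1.0, a, h)
second_moment = cone_integral(
    lambda r, theta, z: (r * math.cos(theta)) ** 2 + z ** 2, a, h)
I_A = M / volume * second_moment

print(volume, math.pi * a ** 2 * h / 3)          # numerical vs exact volume
print(I_A, 3 * M * (a ** 2 + 4 * h ** 2) / 20)   # numerical vs closed-form moment of inertia
```

The factor `* r` inside `cone_integral` is the r from dV = r dr dθ dz; leaving it out is the most common mistake when setting up integrals like these.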

Exercises
Stage 1

 1

Use (r, θ, z) to denote cylindrical coordinates.


1. Draw r = 0.
2. Draw r = 1.
3. Draw θ = 0.
4. Draw θ = . π

 2

Sketch the points with the specified cylindrical coordinates.


1. $r = 1$, $\theta = 0$, $z = 0$
2. $r = 1$, $\theta = \frac{\pi}{4}$, $z = 0$
3. $r = 1$, $\theta = \frac{\pi}{2}$, $z = 0$
4. $r = 0$, $\theta = \pi$, $z = 1$
5. $r = 1$, $\theta = \frac{\pi}{4}$, $z = 1$

 3

Convert from cylindrical to Cartesian coordinates.


1. $r = 1$, $\theta = 0$, $z = 0$
2. $r = 1$, $\theta = \frac{\pi}{4}$, $z = 0$
3. $r = 1$, $\theta = \frac{\pi}{2}$, $z = 0$
4. $r = 0$, $\theta = \pi$, $z = 1$
5. $r = 1$, $\theta = \frac{\pi}{4}$, $z = 1$

 4

Convert from Cartesian to cylindrical coordinates.


1. (1, 1, 2)
2. (−1, −1, 2)

3. (−1, √3, 0)
4. (0, 0, 1)

 5
Rewrite the following equations in cylindrical coordinates.
1. z = 2xy
2. $x^2 + y^2 + z^2 = 1$
3. $(x - 1)^2 + y^2 = 1$

Stage 2

 6

Use cylindrical coordinates to evaluate the volumes of each of the following regions.
1. Above the xy-plane, inside the cone $z = 2a - \sqrt{x^2 + y^2}$ and inside the cylinder $x^2 + y^2 = 2ay$.
2. Above the xy-plane, under the paraboloid $z = 1 - x^2 - y^2$ and in the wedge $-x \le y \le \sqrt{3}\, x$.
3. Above the paraboloid $z = x^2 + y^2$ and below the plane $z = 2y$.

 7✳

Let E be the region bounded between the parabolic surfaces $z = x^2 + y^2$ and $z = 2 - x^2 - y^2$ and within the cylinder
$x^2 + y^2 \le 1$. Calculate the integral of $f(x, y, z) = (x^2 + y^2)^{3/2}$ over the region E.

 8✳

Let E be the region bounded above by the sphere $x^2 + y^2 + z^2 = 2$ and below by the paraboloid $z = x^2 + y^2$. Find the
centroid of E.

 9✳
Let E be the smaller of the two solid regions bounded by the surfaces $z = x^2 + y^2$ and $x^2 + y^2 + z^2 = 6$. Evaluate
$\iiint_E (x^2 + y^2)\, dV$.

 10 ✳

Let $a > 0$ be a fixed positive real number. Consider the solid inside both the cylinder $x^2 + y^2 = ax$ and the sphere
$x^2 + y^2 + z^2 = a^2$. Compute its volume.

You may use that $\int \sin^3(\theta) = \frac{1}{12} \cos(3\theta) - \frac{3}{4} \cos(\theta) + C$

 11 ✳

Let E be the solid lying above the surface $z = y^2$ and below the surface $z = 4 - x^2$. Evaluate

$$\iiint_E y^2\, dV$$

You may use the half angle formulas:

$$\sin^2\theta = \frac{1 - \cos(2\theta)}{2}, \qquad \cos^2\theta = \frac{1 + \cos(2\theta)}{2}$$
 12

The centre of mass (x̄, ȳ , z̄ ) of a body B having density ρ(x, y, z) (units of mass per unit volume) at (x, y, z) is defined to be
$$\bar{x} = \frac{1}{M} \iiint_B x \rho(x, y, z)\, dV \qquad
\bar{y} = \frac{1}{M} \iiint_B y \rho(x, y, z)\, dV \qquad
\bar{z} = \frac{1}{M} \iiint_B z \rho(x, y, z)\, dV$$

where

$$M = \iiint_B \rho(x, y, z)\, dV$$

is the mass of the body. So, for example, x̄ is the weighted average of x over the body. Find the centre of mass of the part of
the solid ball $x^2 + y^2 + z^2 \le a^2$ with $x \ge 0$, $y \ge 0$ and $z \ge 0$, assuming that the density ρ is constant.

 13 ✳

A sphere of radius 2m centred on the origin has variable density $\frac{5}{\sqrt{3}} (z^2 + 1)$ kg/m³. A hole of diameter 1m is drilled through
the sphere along the z-axis.


1. Set up a triple integral in cylindrical coordinates giving the mass of the sphere after the hole has been drilled.
2. Evaluate this integral.

 14 ✳

Consider the finite solid bounded by the three surfaces: $z = e^{-x^2 - y^2}$, $z = 0$ and $x^2 + y^2 = 4$.

1. Set up (but do not evaluate) a triple integral in rectangular coordinates that describes the volume of the solid.
2. Calculate the volume of the solid using any method.

 15 ✳

Find the volume of the solid which is inside $x^2 + y^2 = 4$, above $z = 0$ and below $2z = y$.

Stage 3

 16 ✳

The density of hydrogen gas in a region of space is given by the formula


2
z + 2x
ρ(x, y, z) =
2 2
1 +x +y

1. At (1, 0, −1), in which direction is the density of hydrogen increasing most rapidly?

3.6.9 https://math.libretexts.org/@go/page/92245
2. You are in a spacecraft at the origin. Suppose the spacecraft flies in the direction of ⟨0, 0, 1⟩ . It has a disc of radius 1,
centred on the spacecraft and deployed perpendicular to the direction of travel, to catch hydrogen. How much hydrogen has
been collected by the time that the spacecraft has traveled a distance 2?

You may use the fact that \(\int_0^{2\pi} \cos^2\theta\,d\theta = \pi\).

 17

A torus of mass M is generated by rotating a circle of radius a about an axis in its plane at distance b from the centre (b > a). The torus has constant density. Find the moment of inertia about the axis of rotation. By definition the moment of inertia is \(\iiint r^2\,dm\) where dm is the mass of an infinitesimal piece of the solid and r is its distance from the axis.

1. like a pipe or a can of tuna fish


2. We are using the standard mathematics conventions for the cylindrical coordinates. Under the ISO conventions they are
(ρ, ϕ, z). See Appendix A.7.

3. As was the case for polar coordinates, it is sometimes convenient to extend these definitions by saying that x = r cos θ and
y = r sin θ even when r is negative. See the end of Section 3.2.1.

4. The inner edge has radius r, but the outer edge has radius r + dr. However the error that this generates goes to zero in the limit
dr, dθ, dz → 0.

5. By “essentially”, we mean that the formula for dV works perfectly when we take the limit dr, dθ, dz → 0 of Riemann sums.
6. Imagine that you are looking at the solid from, for example, far out on the x-axis. You close your eyes for a minute. Your evil twin then sneaks in, rotates the solid about the z-axis, and sneaks out. You open your eyes. You will not be able to tell that the solid has been rotated.
7. Again they are wonky cubes, but we can bound the error and show that it goes to zero in the limit dr, dθ, dz → 0.

This page titled 3.6: Triple Integrals in Cylindrical Coordinates is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or
curated by Joel Feldman, Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts
platform; a detailed edit history is available upon request.

3.7: Triple Integrals in Spherical Coordinates
Spherical Coordinates
In the event that we wish to compute, for example, the mass of an object that is invariant under rotations about the origin, it is
advantageous to use another generalization of polar coordinates to three dimensions. The coordinate system is called spherical
coordinates.

 Definition 3.7.1
Spherical coordinates are denoted 1 ρ, θ and φ and are defined by

ρ =  the distance from (0, 0, 0) to (x, y, z)

φ =  the angle between the z axis and the line joining (x, y, z) to (0, 0, 0)

θ =  the angle between the x axis and the line joining (x, y, 0) to (0, 0, 0)

Here are two more figures giving the side and top views of the previous figure.

The spherical coordinate θ is the same as the cylindrical coordinate θ. The spherical coordinate φ is new. It runs from 0 (on the positive z-axis) to π (on the negative z-axis). The Cartesian and spherical coordinates are related by

 Equation 3.7.2

\[ x = \rho\sin\varphi\cos\theta \qquad y = \rho\sin\varphi\sin\theta \qquad z = \rho\cos\varphi \]

\[ \rho = \sqrt{x^2+y^2+z^2} \qquad \theta = \arctan\frac{y}{x} \qquad \varphi = \arctan\frac{\sqrt{x^2+y^2}}{z} \]
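As a hedged illustration of Equation 3.7.2, the following sketch converts between the two coordinate systems in Python. The function names are made up for this example. It assumes that using `math.atan2` in place of a bare arctan is acceptable: `atan2` selects the correct quadrant for θ and keeps φ in [0, π], which the single-argument arctan formulas do not do on their own.

```python
import math

def spherical_to_cartesian(rho, theta, phi):
    """Equation 3.7.2: (rho, theta, phi) -> (x, y, z)."""
    x = rho * math.sin(phi) * math.cos(theta)
    y = rho * math.sin(phi) * math.sin(theta)
    z = rho * math.cos(phi)
    return x, y, z

def cartesian_to_spherical(x, y, z):
    """Inverse map. atan2 resolves the quadrant that arctan(y/x) leaves open."""
    rho = math.sqrt(x * x + y * y + z * z)
    theta = math.atan2(y, x)                 # lies in (-pi, pi]
    phi = math.atan2(math.hypot(x, y), z)    # lies in [0, pi]
    return rho, theta, phi

# Round trip for a point with negative x and negative z.
point = (-1.0, 2.0, -3.0)
rho, theta, phi = cartesian_to_spherical(*point)
back = spherical_to_cartesian(rho, theta, phi)
print(point, back)
```

Note that this sketch returns θ in (−π, π] rather than [0, 2π); either range describes the same points.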

Here are three figures showing


a surface of constant ρ, i.e. a surface \(x^2+y^2+z^2 = \rho^2\) with ρ a constant (which looks like an onion skin),

a surface of constant θ, i.e. a surface \(y = x\tan\theta\) (see footnote 2) with θ a constant (which looks like the page of a book), and
a surface of constant φ, i.e. a surface \(\sqrt{x^2+y^2} = z\tan\varphi\) with φ a constant (which looks like a conical funnel).

3.7.1 https://math.libretexts.org/@go/page/92246
The Volume Element in Spherical Coordinates
If we cut up a solid 3 by
first slicing it into segments (like segments of an orange) by using planes of constant θ, say with the difference between
successive θ 's being dθ,

and then subdividing the segments into “searchlights” (like the searchlight outlined in blue in the figure below) using surfaces
of constant φ, say with the difference between successive φ 's being dφ,

and then subdividing the searchlights into approximate cubes using surfaces of constant ρ, say with the difference between
successive ρ's being dρ,

we end up with approximate cubes that look like

The dimensions of the approximate “cube” in spherical coordinates are (essentially) dρ by ρdφ by ρ sin φ dθ. (These dimensions
are derived in more detail in the next section.) So the approximate cube has volume (essentially)

 Equation 3.7.3
\[ dV = \rho^2\sin\varphi\,d\rho\,d\theta\,d\varphi \]
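One way to build confidence in Equation 3.7.3 is to add up the approximate cubes numerically for a solid whose volume is already known. The sketch below is only an illustration (the function name and the midpoint Riemann sum are choices made here, not part of the text): it sums \(\rho^2\sin\varphi\,d\rho\,d\theta\,d\varphi\) over a ball of radius a and compares the result with \(\frac{4}{3}\pi a^3\).

```python
import math

def ball_volume_spherical(a, n=60):
    """Midpoint Riemann sum of rho^2 sin(phi) over
    0 <= rho <= a, 0 <= theta <= 2*pi, 0 <= phi <= pi."""
    d_rho = a / n
    d_phi = math.pi / n
    total = 0.0
    for i in range(n):
        rho = (i + 0.5) * d_rho
        for j in range(n):
            phi = (j + 0.5) * d_phi
            total += rho**2 * math.sin(phi) * d_rho * d_phi
    # The integrand does not depend on theta, so the theta sum contributes 2*pi.
    return 2 * math.pi * total

approx = ball_volume_spherical(1.0)
exact = 4 * math.pi / 3
print(approx, exact)
```

Increasing n should bring the two printed numbers closer together, which is the numerical counterpart of taking the limit dρ, dθ, dφ → 0.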

The Details
Here is an explanation of the edge lengths given in the above figure. Each of the 12 edges of the cube is formed by holding two of
the three coordinates ρ, θ, φ fixed and varying the third.
Four of the cube edges are formed by holding θ and φ fixed and varying ρ. The intersection of a plane of fixed θ with a cone of
fixed φ is a straight line emanating from the origin. When we introduced slices using spheres of constant ρ, the difference
between the successive ρ's was dρ, so those edges of the cube each have length dρ.

Four of the cube edges are formed by holding θ and ρ fixed and varying φ. The intersection of a plane of fixed θ (which
contains the origin) with a sphere of fixed ρ (which is centred on the origin) is a circle of radius ρ centred on the origin. It is a
line of longitude 4.

When we introduced searchlights using surfaces of constant φ, the difference between the successive φ 's was dφ. Thus those
four edges of the cube are circular arcs of radius essentially ρ that subtend an angle dφ, and so have length ρ dφ.

Four of the cube edges are formed by holding φ and ρ fixed and varying θ. The intersection of a cone of fixed φ with a sphere
of fixed ρ is a circle. As both ρ and φ are fixed, the circle of intersection lies in the plane z = ρ cos φ. It is a line of latitude.
The circle has radius ρ sin φ and is centred on (0, 0, ρ cos φ).

When we introduced segments using surfaces of constant θ, the difference between the successive θ's was dθ. Thus these four edges of the cube are circular arcs of radius essentially ρ sin φ that subtend an angle dθ, and so have length ρ sin φ dθ.
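The three edge lengths dρ, ρ dφ and ρ sin φ dθ can be checked numerically. The following sketch (an illustration with arbitrarily chosen sample values, not part of the original derivation) moves a point by a small step h in each coordinate separately and compares the straight-line distance moved with the predicted edge length.

```python
import math

def to_xyz(rho, theta, phi):
    """Spherical to Cartesian, as in Equation 3.7.2."""
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))

def dist(p, q):
    """Euclidean distance between two points in 3d."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

rho, theta, phi = 2.0, 0.7, 1.1   # arbitrary sample point
h = 1e-4                          # small step standing in for d(rho), d(phi), d(theta)
base = to_xyz(rho, theta, phi)

edge_rho = dist(base, to_xyz(rho + h, theta, phi))     # predicted: h
edge_phi = dist(base, to_xyz(rho, theta, phi + h))     # predicted: rho * h
edge_theta = dist(base, to_xyz(rho, theta + h, phi))   # predicted: rho * sin(phi) * h

print(edge_rho / h, edge_phi / (rho * h), edge_theta / (rho * math.sin(phi) * h))
```

All three printed ratios should be very close to 1; the small discrepancy comes from replacing a circular arc by its chord.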

Sample Integrals in Spherical Coordinates

 Example 3.7.4. Ice Cream Cone


Find the volume of the ice cream 5 cone that consists of the part of the interior of the sphere \(x^2+y^2+z^2 = a^2\) that is above the xy-plane and that is inside the cone \(x^2+y^2 = b^2z^2\). Here a and b are any two strictly positive constants.

Solution
Note that, in spherical coordinates

\[ x^2+y^2 = \rho^2\sin^2\varphi \qquad z^2 = \rho^2\cos^2\varphi \qquad x^2+y^2+z^2 = \rho^2 \]

Consequently, in spherical coordinates, the equation of the sphere is \(\rho = a\), and the equation of the cone is \(\tan^2\varphi = b^2\). Let's write \(\beta = \arctan b\), with \(0 < \beta < \frac{\pi}{2}\). Here is a sketch of the part of the ice cream cone in the first octant. The volume of the full ice cream cone will be four times the volume of the part in the first octant.

We shall cut the first octant part of the ice cream cone into tiny pieces using spherical coordinates. That is, we shall cut it up
using planes of constant θ, cones of constant φ, and spheres of constant ρ.
First slice the (the first octant part of the) ice cream cone into segments by inserting many planes of constant θ, with the
various values of θ differing by dθ. The figure on the left below shows one segment outlined in red. Each segment
has θ essentially constant on the segment, and
has φ running from 0 to β and ρ running from 0 to a.
The leftmost segment has, essentially, \(\theta = 0\) and the rightmost segment has, essentially, \(\theta = \frac{\pi}{2}\). See the figure on the right below.

Concentrate on any one segment. A side view of the segment is sketched in the figure on the left below. Subdivide it into
long thin searchlights by inserting many cones of constant φ, with the various values of φ differing by dφ. The figure on
the left below shows one searchlight outlined in blue. Each searchlight
has θ and φ essentially constant on the searchlight, and
has ρ running over 0 ≤ ρ ≤ a.
The leftmost searchlight has, essentially, φ = 0 and the rightmost searchlight has, essentially, φ = β. See the figure on
the right below.

Concentrate on any one searchlight. Subdivide it into tiny approximate cubes by inserting many spheres of constant ρ, with
the various values of ρ differing by dρ. The figure on the left below shows the side view of one approximate cube in black.
Each cube
has ρ, θ and φ all essentially constant on the cube and
has volume \(\rho^2\sin\varphi\,d\rho\,d\theta\,d\varphi\), by 3.7.3.

The first cube has, essentially, ρ = 0 and the last cube has, essentially, ρ = a. See the figure on the right below.

Now we can build up the volume.
Concentrate on one approximate cube. Let's say that it contains the point with spherical coordinates ρ, θ, φ. The cube has volume essentially \(dV = \rho^2\sin\varphi\,d\rho\,d\theta\,d\varphi\), by 3.7.3.

To get the volume of any one searchlight, say the searchlight whose φ coordinate runs from φ to φ + dφ, we just add up the volumes of the approximate cubes in that searchlight, by integrating ρ from its smallest value on the searchlight, namely 0, to its largest value on the searchlight, namely a. The volume of the searchlight is thus

\[ d\theta\,d\varphi \int_0^a d\rho\ \rho^2\sin\varphi \]

To get the volume of any one segment, say the segment whose θ coordinate runs from θ to θ + dθ, we just add up the volumes of the searchlights in that segment, by integrating φ from its smallest value on the segment, namely 0, to its largest value on the segment, namely β. The volume of the segment is thus

\[ d\theta \int_0^\beta d\varphi\ \sin\varphi \int_0^a d\rho\ \rho^2 \]

To get the volume of \(V_1\), the part of the ice cream cone in the first octant, we just add up the volumes of the segments that it contains, by integrating θ from its smallest value in the octant, namely 0, to its largest value on the octant, namely \(\frac{\pi}{2}\).

The volume in the first octant is thus

\[ \begin{aligned}
\text{Volume}(V_1) &= \int_0^{\pi/2} d\theta \int_0^\beta d\varphi\ \sin\varphi \int_0^a d\rho\ \rho^2 \\
&= \frac{a^3}{3} \int_0^{\pi/2} d\theta \int_0^\beta d\varphi\ \sin\varphi \\
&= \frac{a^3}{3}\,[1-\cos\beta] \int_0^{\pi/2} d\theta \\
&= \frac{\pi a^3}{6}\,[1-\cos\beta]
\end{aligned} \]

So the volume of V, the total (four octant) ice cream cone, is

\[ \text{Volume}(V) = 4\,\text{Volume}(V_1) = \frac{4\pi a^3}{6}\,[1-\cos\beta] \]

We can express β (which was not given in the statement of the original problem) in terms of b (which was in the statement of the original problem), just by looking at the triangle

The right hand and bottom sides of the triangle have been chosen so that \(\tan\beta = b\), which was the definition of β. So \(\cos\beta = \frac{1}{\sqrt{1+b^2}}\) and the volume of the ice cream cone is

\[ \text{Volume}(V) = \frac{2\pi a^3}{3}\left[1 - \frac{1}{\sqrt{1+b^2}}\right] \]

Note that, as in Example 3.2.11, we can easily apply a couple of sanity checks to our answer.

If b = 0, so that the cone is just \(x^2+y^2 = 0\), which is the line x = y = 0, the total volume should be zero. Our answer does indeed give 0 in this case.

In the limit b → ∞, the angle \(\beta \to \frac{\pi}{2}\) and the ice cream cone opens up into a hemisphere of radius a. Our answer does indeed give the volume of the hemisphere, which is \(\frac{1}{2}\times\frac{4}{3}\pi a^3\).
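Beyond the two limiting cases, the formula of Example 3.7.4 can be compared against an independent computation. The sketch below is an illustration under stated assumptions: it does not use spherical coordinates at all, but instead slices the ice cream cone into horizontal discs. At height z the disc has radius equal to the smaller of bz (the cone) and \(\sqrt{a^2-z^2}\) (the sphere). The function names are invented for this example.

```python
import math

def cone_volume_formula(a, b):
    """Closed form from Example 3.7.4."""
    return (2 * math.pi * a**3 / 3) * (1 - 1 / math.sqrt(1 + b * b))

def cone_volume_by_slices(a, b, n=20000):
    """Midpoint sum of disc areas pi * r(z)^2 for 0 <= z <= a, where
    r(z)^2 = min((b z)^2, a^2 - z^2)."""
    dz = a / n
    total = 0.0
    for k in range(n):
        z = (k + 0.5) * dz
        r_squared = min((b * z) ** 2, a * a - z * z)
        total += math.pi * r_squared * dz
    return total

for a, b in [(1.0, 1.0), (2.0, 0.5), (1.5, 3.0)]:
    print(a, b, cone_volume_formula(a, b), cone_volume_by_slices(a, b))
```

For each pair (a, b) the two printed volumes should agree to several decimal places.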

 Example 3.7.5. Cored Apple


A cylindrical hole of radius b is drilled symmetrically through a perfectly spherical apple of radius a ≥ b. Find the volume of
apple that remains.
Solution
In Example 3.2.11 we computed the volume removed, basically using cylindrical coordinates. So we could get the answer to this question just by subtracting the answer of Example 3.2.11 from \(\frac{4}{3}\pi a^3\). Instead, we will evaluate the volume remaining as an exercise in setting up limits of integration when using spherical coordinates.


As in Example 3.2.11, let's use a coordinate system with the sphere centred on (0, 0, 0) and with the centre of the drill hole
following the z -axis. Here is a sketch of the apple that remains in the first octant. It is outlined in red. By symmetry the total
amount of apple remaining will be eight times the amount from the first octant.

First slice the first octant part of the remaining apple into segments by inserting many planes of constant θ, with the various values of θ differing by dθ. The leftmost segment has, essentially, \(\theta = 0\) and the rightmost segment has, essentially, \(\theta = \frac{\pi}{2}\).

Each segment, viewed from the side, looks like

Subdivide it into long thin searchlights by inserting many cones of constant φ, with the various values of φ differing by dφ. The figure below shows one searchlight outlined in blue. Each searchlight

has θ and φ essentially constant on the searchlight.

The top searchlight has, essentially, \(\varphi = \arcsin\frac{b}{a}\) and the bottom searchlight has, essentially, \(\varphi = \frac{\pi}{2}\).

Concentrate on any one searchlight. Subdivide it into tiny approximate cubes by inserting many spheres of constant ρ, with
the various values of ρ differing by dρ. The figure on the left below shows the side view of one approximate cube in black.
Each cube
has ρ, θ and φ all essentially constant on the cube and
has volume \(dV = \rho^2\sin\varphi\,d\rho\,d\theta\,d\varphi\), by 3.7.3.

The figure on the right below gives an expanded view of the searchlight. From it, we see (after a little trig) that the first cube has, essentially, \(\rho = \frac{b}{\sin\varphi}\) and the last cube has, essentially, \(\rho = a\) (the radius of the apple).

Now we can build up the volume.


Concentrate on one approximate cube. Let's say that it contains the point with spherical coordinates ρ, θ, φ. The cube has volume essentially \(dV = \rho^2\sin\varphi\,d\rho\,d\theta\,d\varphi\), by 3.7.3.

To get the volume of any one searchlight, say the searchlight whose φ coordinate runs from φ to φ + dφ, we just add up the volumes of the approximate cubes in that searchlight, by integrating ρ from its smallest value on the searchlight, namely \(\frac{b}{\sin\varphi}\), to its largest value on the searchlight, namely a. The volume of the searchlight is thus

\[ d\theta\,d\varphi \int_{\frac{b}{\sin\varphi}}^{a} d\rho\ \rho^2\sin\varphi \]

To get the volume of any one segment, say the segment whose θ coordinate runs from θ to θ + dθ, we just add up the volumes of the searchlights in that segment, by integrating φ from its smallest value on the segment, namely \(\arcsin\frac{b}{a}\), to its largest value on the segment, namely \(\frac{\pi}{2}\). The volume of the segment is thus

\[ d\theta \int_{\arcsin\frac{b}{a}}^{\frac{\pi}{2}} d\varphi \int_{\frac{b}{\sin\varphi}}^{a} d\rho\ \rho^2\sin\varphi \]

To get the volume of the remaining part of the apple in the first octant, we just add up the volumes of the segments that it contains, by integrating θ from its smallest value in the octant, namely 0, to its largest value on the octant, namely \(\frac{\pi}{2}\). The volume in the first octant is thus

\[ \text{Volume}(V_1) = \int_0^{\pi/2} d\theta \int_{\arcsin\frac{b}{a}}^{\frac{\pi}{2}} d\varphi \int_{\frac{b}{\sin\varphi}}^{a} d\rho\ \rho^2\sin\varphi \]

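Now we just have to integrate

\[ \begin{aligned}
\text{Volume}(V_1) &= \frac{1}{3}\int_0^{\pi/2} d\theta \int_{\arcsin\frac{b}{a}}^{\frac{\pi}{2}} d\varphi\ \sin\varphi\left[a^3 - \frac{b^3}{\sin^3\varphi}\right] \\
&= \frac{1}{3}\int_0^{\pi/2} d\theta \int_{\arcsin\frac{b}{a}}^{\frac{\pi}{2}} d\varphi\ \left[a^3\sin\varphi - b^3\csc^2\varphi\right] \\
&= \frac{1}{3}\int_0^{\pi/2} d\theta\ \Big[-a^3\cos\varphi + b^3\cot\varphi\Big]_{\arcsin\frac{b}{a}}^{\frac{\pi}{2}} \qquad \text{since } \int\csc^2\varphi\,d\varphi = -\cot\varphi + C \\
&= \frac{\pi}{6}\Big[-a^3\cos\varphi + b^3\cot\varphi\Big]_{\arcsin\frac{b}{a}}^{\frac{\pi}{2}}
\end{aligned} \]

Now \(\cos\frac{\pi}{2} = \cot\frac{\pi}{2} = 0\) and, if we write \(\alpha = \arcsin\frac{b}{a}\),

\[ \text{Volume}(V_1) = \frac{\pi}{6}\left[a^3\cos\alpha - b^3\cot\alpha\right] \]

From the triangle below, we have \(\cos\alpha = \frac{\sqrt{a^2-b^2}}{a}\) and \(\cot\alpha = \frac{\sqrt{a^2-b^2}}{b}\).

So

\[ \text{Volume}(V_1) = \frac{\pi}{6}\left[a^2\sqrt{a^2-b^2} - b^2\sqrt{a^2-b^2}\right] = \frac{\pi}{6}\left[a^2-b^2\right]^{3/2} \]

The full (eight octant) volume of the remaining apple is thus

\[ \text{Volume}(V) = 8\,\text{Volume}(V_1) = \frac{4}{3}\pi\left[a^2-b^2\right]^{3/2} \]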

We can, yet again, apply the sanity checks of Example 3.2.11 to our answer.
If the radius of the drill bit b = 0, no apple is removed at all. So the total volume remaining should be \(\frac{4}{3}\pi a^3\). Our answer does indeed give this.

If the radius of the drill bit b = a, the radius of the apple, then the entire apple disappears. So the remaining apple should have volume 0. Again, our answer gives this.
As a final check note that the sum of the answer to Example 3.2.11 and the answer to this Example is \(\frac{4}{3}\pi a^3\), as it should be.
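The cored apple answer can also be checked numerically by a method that avoids spherical coordinates. The sketch below is only an illustration: it assumes the remaining apple is cut into thin cylindrical shells of radius r between b and a, each of height \(2\sqrt{a^2-r^2}\) and thickness dr, and adds up the shell volumes \(2\pi r \cdot 2\sqrt{a^2-r^2}\,dr\). The function names are invented here.

```python
import math

def apple_formula(a, b):
    """Closed form from Example 3.7.5."""
    return 4 * math.pi / 3 * (a * a - b * b) ** 1.5

def apple_by_shells(a, b, n=100000):
    """Midpoint sum over cylindrical shells b <= r <= a.
    Each shell has circumference 2*pi*r and height 2*sqrt(a^2 - r^2)."""
    dr = (a - b) / n
    total = 0.0
    for k in range(n):
        r = b + (k + 0.5) * dr
        total += 2 * math.pi * r * 2 * math.sqrt(a * a - r * r) * dr
    return total

for a, b in [(1.0, 0.5), (2.0, 0.0), (3.0, 2.5)]:
    print(a, b, apple_formula(a, b), apple_by_shells(a, b))
```

The case b = 0 reproduces the volume of the whole apple, matching the first sanity check above.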

Exercises
Stage 1

 1

Use (ρ, θ, φ) to denote spherical coordinates.


1. Draw φ = 0.
2. Draw φ = . π

3. Draw φ = . π

4. Draw φ = . 3π

5. Draw φ = π.

 2

Sketch the point with the specified spherical coordinates.


1. ρ = 0, θ = 0.1π, φ = 0.7π
2. ρ = 1, θ = 0.3π, φ = 0
3. ρ = 1, θ = 0, φ = π

4. ρ = 1, θ = , φ =
π

3
π

5. ρ = 1, θ = , φ =
π

2
π

6. ρ = 1, θ = , φ =
π

3
π

 3

Convert from Cartesian to spherical coordinates.


1. (−2, 0, 0)
2. (0, 3, 0)

3. (0, 0, −4)
4. \(\left(-\frac{1}{\sqrt2},\ \frac{1}{\sqrt2},\ \sqrt3\right)\)

 4

Convert from spherical to Cartesian coordinates.


1. ρ = 1, θ = π

3
, φ =
π

2. ρ = 2, θ = π

2
, φ =
π

 5

Rewrite the following equations in spherical coordinates.


1. \(z^2 = 3x^2 + 3y^2\)
2. \(x^2 + y^2 + (z-1)^2 = 1\)
3. \(x^2 + y^2 = 4\)

 6. ✳

Using spherical coordinates and integration, show that the volume of the sphere of radius 1 centred at the origin is 4π/3.

Stage 2

 7. ✳

Consider the region E in 3-dimensions specified by the spherical inequalities

1 ≤ ρ ≤ 1 + cos φ

1. Draw a reasonably accurate picture of E in 3-dimensions. Be sure to show the units on the coordinates axes.
2. Find the volume of E.

 8. ✳

Use spherical coordinates to evaluate the integral

\[ I = \iiint_D z\,dV \]

where D is the solid enclosed by the cone \(z = \sqrt{x^2+y^2}\) and the sphere \(x^2+y^2+z^2 = 4\). That is, (x, y, z) is in D if and only if \(\sqrt{x^2+y^2} \le z\) and \(x^2+y^2+z^2 \le 4\).

 9
Use spherical coordinates to find
1. The volume inside the cone \(z = \sqrt{x^2+y^2}\) and inside the sphere \(x^2+y^2+z^2 = a^2\).
2. \(\iiint_R x\,dV\) and \(\iiint_R z\,dV\) over the part of the sphere of radius a that lies in the first octant.
3. The mass of a spherical planet of radius a whose density at distance ρ from the center is \(\delta = A/(B+\rho^2)\).
4. The volume enclosed by \(\rho = a(1-\cos\varphi)\). Here ρ and φ refer to the usual spherical coordinates.

 10. ✳

Consider the hemispherical shell bounded by the spherical surfaces


\[ x^2+y^2+z^2 = 9 \qquad\text{and}\qquad x^2+y^2+z^2 = 4 \]

and above the plane z = 0. Let the shell have constant density D.
1. Find the mass of the shell.
2. Find the location of the center of mass of the shell.

 11. ✳

Let

\[ I = \iiint_T xz\,dV \]

where T is the eighth of the sphere \(x^2+y^2+z^2 \le 1\) with \(x, y, z \ge 0\).
1. Sketch the volume T .
2. Express I as a triple integral in spherical coordinates.
3. Evaluate I by any method.

 12. ✳

Evaluate \(W = \iiint_Q xz\,dV\), where Q is an eighth of the sphere \(x^2+y^2+z^2 \le 9\) with \(x, y, z \ge 0\).

 13. ✳
Evaluate \(\iiint_{\mathbb{R}^3} \left[1 + (x^2+y^2+z^2)^3\right]^{-1} dV\).

 14. ✳

Evaluate
\[ \int_{-1}^{1}\int_{-\sqrt{1-x^2}}^{\sqrt{1-x^2}}\int_{1-\sqrt{1-x^2-y^2}}^{1+\sqrt{1-x^2-y^2}} (x^2+y^2+z^2)^{5/2}\,dz\,dy\,dx \]

by changing to spherical coordinates.

 15
Evaluate the volume of a circular cylinder of radius a and height h by means of an integral in spherical coordinates.

 16. ✳
Let B denote the region inside the sphere \(x^2+y^2+z^2 = 4\) and above the cone \(x^2+y^2 = z^2\). Compute the moment of inertia

\[ \iiint_B z^2\,dV \]

 17. ✳

1. Evaluate \(\iiint_\Omega z\,dV\) where Ω is the three dimensional region in the first octant \(x \ge 0\), \(y \ge 0\), \(z \ge 0\), occupying the inside of the sphere \(x^2+y^2+z^2 = 1\).
2. Use the result in part (a) to quickly determine the centroid of a hemispherical ball given by \(z \ge 0\), \(x^2+y^2+z^2 \le 1\).

 18. ✳

Consider the top half of a ball of radius 2 centred at the origin. Suppose that the ball has variable density equal to 9z units of
mass per unit volume.
1. Set up a triple integral giving the mass of this half-ball.
2. Find out what fraction of that mass lies inside the cone \(z = \sqrt{x^2+y^2}\)

Stage 3

 19. ✳

Find the limit or show that it does not exist


\[ \lim_{(x,y,z)\to(0,0,0)} \frac{xy + yz^2 + xz^2}{x^2+y^2+z^4} \]

 20. ✳

A certain solid V is a right-circular cylinder. Its base is the disk of radius 2 centred at the origin in the xy-plane. It has height 2 and density \(\sqrt{x^2+y^2}\).

A smaller solid U is obtained by removing the inverted cone, whose base is the top surface of V and whose vertex is the point
(0, 0, 0).

1. Use cylindrical coordinates to set up an integral giving the mass of U .


2. Use spherical coordinates to set up an integral giving the mass of U .
3. Find that mass.

 21. ✳
A solid is bounded below by the cone \(z = \sqrt{x^2+y^2}\) and above by the sphere \(x^2+y^2+z^2 = 2\). It has density \(\delta(x,y,z) = x^2+y^2\).

1. Express the mass M of the solid as a triple integral, with limits, in cylindrical coordinates.
2. Same as (a) but in spherical coordinates.
3. Evaluate M .

 22. ✳

Let

\[ I = \iiint_E xz\,dV \]

where E is the eighth of the sphere \(x^2+y^2+z^2 \le 1\) with \(x, y, z \ge 0\).
1. Express I as a triple integral in spherical coordinates.
2. Express I as a triple integral in cylindrical coordinates.
3. Evaluate I by any method.

 23. ✳

Let

\[ I = \iiint_T (x^2+y^2)\,dV \]

where T is the solid region bounded below by the cone \(z = \sqrt{3x^2+3y^2}\) and above by the sphere \(x^2+y^2+z^2 = 9\).

1. Express I as a triple integral in spherical coordinates.


2. Express I as a triple integral in cylindrical coordinates.
3. Evaluate I by any method.

 24. ✳

Let E be the “ice cream cone” \(x^2+y^2+z^2 \le 1\), \(x^2+y^2 \le z^2\), \(z \ge 0\). Consider

\[ J = \iiint_E \sqrt{x^2+y^2+z^2}\,dV \]

1. Write J as an iterated integral, with limits, in cylindrical coordinates.


2. Write J as an iterated integral, with limits, in spherical coordinates.
3. Evaluate J.

 25. ✳

The body of a snowman is formed by the snowballs \(x^2+y^2+z^2 = 12\) (this is its body) and \(x^2+y^2+(z-4)^2 = 4\) (this is its head).
1. Find the volume of the snowman by subtracting the intersection of the two snow balls from the sum of the volumes of the snow balls. [Recall that the volume of a sphere of radius r is \(\frac{4\pi}{3}r^3\).]

2. We can also calculate the volume of the snowman as a sum of the following triple integrals:

1. \[ \int_0^{\frac{2\pi}{3}}\int_0^{2\pi}\int_0^{2} \rho^2\sin\varphi\,d\rho\,d\theta\,d\varphi \]

2. \[ \int_0^{2\pi}\int_0^{\sqrt3}\int_{\sqrt3\,r}^{4-\frac{r}{\sqrt3}} r\,dz\,dr\,d\theta \]

3. \[ \int_{\frac{\pi}{6}}^{\pi}\int_0^{2\pi}\int_0^{2\sqrt3} \rho^2\sin\varphi\,d\rho\,d\theta\,d\varphi \]

Circle the right answer from the underlined choices and fill in the blanks in the following descriptions of the region of
integration for each integral. [Note: We have translated the axes in order to write down some of the integrals above. The
equations you specify should be those before the translation is performed.]
1. The region of integration in (1) is a part of the snowman's

 body / head / body and head


–––––––––––––––––––––––––––––

It is the solid enclosed by the

sphere / cone
–––––––––––––

defined by the equation

and the

sphere / cone
–––––––––––––

defined by the equation

2. The region of integration in (2) is a part of the snowman's

 body / head / body and head


–––––––––––––––––––––––––––––

It is the solid enclosed by the

sphere / cone
–––––––––––––

defined by the equation

and the

sphere / cone
–––––––––––––

defined by the equation

3. The region of integration in (3) is a part of the snowman's

 body / head / body and head


–––––––––––––––––––––––––––––

It is the solid enclosed by the

sphere / cone
–––––––––––––

defined by the equation

and the

sphere / cone
–––––––––––––

defined by the equation

 26. ✳
1. Find the volume of the solid inside the surface defined by the equation \(\rho = 8\sin(\varphi)\) in spherical coordinates.
You may use that

\[ \int \sin^4(\varphi)\,d\varphi = \frac{1}{32}\big(12\varphi - 8\sin(2\varphi) + \sin(4\varphi)\big) + C \]

2. Sketch this solid or describe what it looks like.

 27. ✳

Let E be the solid


\[ 0 \le z \le \sqrt{x^2+y^2}, \qquad x^2+y^2 \le 1, \]

and consider the integral

\[ I = \iiint_E z\sqrt{x^2+y^2+z^2}\,dV. \]

1. Write the integral I in cylindrical coordinates.


2. Write the integral I in spherical coordinates.
3. Evaluate the integral I using either form.

 28. ✳

Consider the iterated integral


\[ I = \int_{-a}^{0}\int_{-\sqrt{a^2-x^2}}^{0}\int_0^{\sqrt{a^2-x^2-y^2}} (x^2+y^2+z^2)^{2014}\,dz\,dy\,dx \]

where a is a positive constant.


1. Write I as an iterated integral in cylindrical coordinates.
2. Write I as an iterated integral in spherical coordinates.
3. Evaluate I using whatever method you prefer.

 29. ✳
The solid E is bounded below by the paraboloid \(z = x^2+y^2\) and above by the cone \(z = \sqrt{x^2+y^2}\). Let

\[ I = \iiint_E z\,(x^2+y^2+z^2)\,dV \]

1. Write I in terms of cylindrical coordinates. Do not evaluate.


2. Write I in terms of spherical coordinates. Do not evaluate.
3. Calculate I .

 30. ✳
Let S be the region in the first octant (so that \(x, y, z \ge 0\)) which lies above the cone \(z = \sqrt{x^2+y^2}\) and below the sphere \((z-1)^2 + x^2 + y^2 = 1\). Let V be its volume.
1. Express V as a triple integral in cylindrical coordinates.
2. Express V as a triple integral in spherical coordinates.
3. Calculate V using either of the integrals above.

 31. ✳
A solid is bounded below by the cone \(z = \sqrt{3x^2+3y^2}\) and above by the sphere \(x^2+y^2+z^2 = 9\). It has density \(\delta(x,y,z) = x^2+y^2\).

1. Express the mass m of the solid as a triple integral in cylindrical coordinates.


2. Express the mass m of the solid as a triple integral in spherical coordinates.
3. Evaluate m.

1. We are using the standard mathematics conventions for the spherical coordinates. Under the ISO conventions they are (r, ϕ, θ).
See Appendix A.7.
2. and with the sign of x being the same as the sign of cos θ
3. You know the drill.
4. The problem of finding a practical, reliable method for determining the longitude of a ship at sea was a very big deal for a period of several centuries. Among the scientists who worked on this were Galileo, Edmund Halley (of Halley's comet) and Robert Hooke (of Hooke's law).
5. A very mathematical ice cream. Rocky-rho'd? Choculus?

This page titled 3.7: Triple Integrals in Spherical Coordinates is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or
curated by Joel Feldman, Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts
platform; a detailed edit history is available upon request.

3.8: Optional— Integrals in General Coordinates
One of the most important tools used in dealing with single variable integrals is the change of variable (substitution) rule

 Equation 3.8.1

\[ x = f(u) \qquad dx = f'(u)\,du \]

See Theorems 1.4.2 and 1.4.6 in the CLP-2 text. Expressing a multivariable integral in polar, cylindrical or spherical coordinates is really a multivariable substitution. For example, switching to spherical coordinates amounts to replacing the coordinates x, y, z with the coordinates ρ, θ, φ by using the substitution

\[ \mathbf{X} = \mathbf{r}(\rho,\theta,\varphi) \qquad dx\,dy\,dz = \rho^2\sin\varphi\,d\rho\,d\theta\,d\varphi \]

where

\[ \mathbf{X} = \langle x, y, z\rangle \qquad\text{and}\qquad \mathbf{r}(\rho,\theta,\varphi) = \langle \rho\cos\theta\sin\varphi,\ \rho\sin\theta\sin\varphi,\ \rho\cos\varphi \rangle \]

We'll now derive a generalization of the substitution rule 3.8.1 to two dimensions. It will include polar coordinates as a special
case. Later, we'll state (without proof) its generalization to three dimensions. It will include cylindrical and spherical coordinates as
special cases.
Suppose that we wish to integrate over a region, R, in \(\mathbb{R}^2\) and that we also wish 1 to use two new coordinates, that we'll call u and

2
v, in place of x and y. The new coordinates u, v are related to the old coordinates x, y, by the functions

x = x(u, v)

y = y(u, v)

To make formulae more compact, we'll define the vector valued function r(u, v) by

r(u, v) = ⟨x(u, v) , y(u, v)⟩

As an example, if the new coordinates are polar coordinates, with r renamed to u and θ renamed to v, then x(u, v) = u cos v and y(u, v) = u sin v.

Note that if we hold v fixed and vary u, then r(u, v) sweeps out a curve. For example, if x(u, v) = u cos v and y(u, v) = u sin v, then, if we hold v fixed and vary u, r(u, v) sweeps out a straight line (that makes the angle v with the x-axis), while, if we hold u > 0 fixed and vary v, r(u, v) sweeps out a circle (of radius u centred on the origin).

We start by cutting R (the shaded region in the figure below) up into small pieces by drawing a bunch of curves of constant u (the
blue curves in the figure below) and a bunch of curves of constant v (the red curves in the figure below).
Concentrate on any one of the small pieces. Here is a greatly magnified sketch.

3.8.1 https://math.libretexts.org/@go/page/92247
For example, the lower red curve was constructed by holding v fixed at the value \(v_0\), varying u and sketching \(\mathbf{r}(u, v_0)\), and the upper red curve was constructed by holding v fixed at the slightly larger value \(v_0 + dv\), varying u and sketching \(\mathbf{r}(u, v_0 + dv)\). So the four intersection points in the figure are

\[ \begin{aligned}
P_2 &= \mathbf{r}(u_0, v_0+dv) & P_3 &= \mathbf{r}(u_0+du, v_0+dv) \\
P_0 &= \mathbf{r}(u_0, v_0) & P_1 &= \mathbf{r}(u_0+du, v_0)
\end{aligned} \]

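Now, for any small constants dU and dV, we have the linear approximation 3

\[ \mathbf{r}(u_0+dU, v_0+dV) \approx \mathbf{r}(u_0,v_0) + \frac{\partial\mathbf{r}}{\partial u}(u_0,v_0)\,dU + \frac{\partial\mathbf{r}}{\partial v}(u_0,v_0)\,dV \]

Applying this three times, once with dU = du, dV = 0 (to approximate \(P_1\)), once with dU = 0, dV = dv (to approximate \(P_2\)), and once with dU = du, dV = dv (to approximate \(P_3\)),

\[ \begin{aligned}
P_0 &= \mathbf{r}(u_0,v_0) \\
P_1 &= \mathbf{r}(u_0+du, v_0) \approx \mathbf{r}(u_0,v_0) + \frac{\partial\mathbf{r}}{\partial u}(u_0,v_0)\,du \\
P_2 &= \mathbf{r}(u_0, v_0+dv) \approx \mathbf{r}(u_0,v_0) + \frac{\partial\mathbf{r}}{\partial v}(u_0,v_0)\,dv \\
P_3 &= \mathbf{r}(u_0+du, v_0+dv) \approx \mathbf{r}(u_0,v_0) + \frac{\partial\mathbf{r}}{\partial u}(u_0,v_0)\,du + \frac{\partial\mathbf{r}}{\partial v}(u_0,v_0)\,dv
\end{aligned} \]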

We have dropped all Taylor expansion terms that are of degree two or higher in du, dv. The reason is that, in defining the integral,
we take the limit du, dv → 0. Because of that limit, all of the dropped terms contribute exactly 0 to the integral. We shall not prove
this. But we shall show, in the optional §3.8.1, why this is the case.
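The small piece of R with corners \(P_0, P_1, P_2, P_3\) is approximately a parallelogram with sides

\[ \begin{aligned}
\overrightarrow{P_0P_1} \approx \overrightarrow{P_2P_3} &\approx \frac{\partial\mathbf{r}}{\partial u}(u_0,v_0)\,du = \left\langle \frac{\partial x}{\partial u}(u_0,v_0),\ \frac{\partial y}{\partial u}(u_0,v_0) \right\rangle du \\
\overrightarrow{P_0P_2} \approx \overrightarrow{P_1P_3} &\approx \frac{\partial\mathbf{r}}{\partial v}(u_0,v_0)\,dv = \left\langle \frac{\partial x}{\partial v}(u_0,v_0),\ \frac{\partial y}{\partial v}(u_0,v_0) \right\rangle dv
\end{aligned} \]

Here the notation, for example, \(\overrightarrow{P_0P_1}\) refers to the vector whose tail is at the point \(P_0\) and whose head is at the point \(P_1\). Recall, from 1.2.17 that

\[ \text{area of parallelogram with sides } \langle a, b\rangle \text{ and } \langle c, d\rangle = \left|\det\begin{bmatrix} a & b \\ c & d \end{bmatrix}\right| = |ad - bc| \]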

So the area of our small piece of R is essentially

 Equation 3.8.2
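\[ dA = \left|\det\begin{bmatrix} \frac{\partial x}{\partial u} & \frac{\partial y}{\partial u} \\[4pt] \frac{\partial x}{\partial v} & \frac{\partial y}{\partial v} \end{bmatrix}\right| du\,dv \]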
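To see Equation 3.8.2 at work numerically, the sketch below takes the polar map r(u, v) = ⟨u cos v, u sin v⟩ as a test case, computes the corner points \(P_0, P_1, P_3, P_2\) of one small piece exactly, measures the area of the straight-sided quadrilateral they span with the shoelace formula, and compares it with the prediction |det| du dv = u du dv. The helper names and sample values are choices made for this illustration only.

```python
import math

def r(u, v):
    """Polar coordinates written as r(u, v) = <u cos v, u sin v>."""
    return (u * math.cos(v), u * math.sin(v))

def shoelace(points):
    """Area of a simple polygon from its vertices listed in order."""
    twice_area = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2

u0, v0 = 2.0, 0.5        # arbitrary sample point
du, dv = 1e-3, 1e-3      # small steps

# Corners in order around the piece: P0, P1, P3, P2.
corners = [r(u0, v0), r(u0 + du, v0), r(u0 + du, v0 + dv), r(u0, v0 + dv)]
quad_area = shoelace(corners)

# For this map det [[cos v, sin v], [-u sin v, u cos v]] = u, so dA = u du dv.
predicted = u0 * du * dv

print(quad_area, predicted, quad_area / predicted)
```

The printed ratio differs from 1 only by terms of higher order in du and dv, which are the terms dropped in the linear approximation.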

Recall that det M denotes the determinant of the matrix M . Also recall that we don't really need determinants for this text, though
it does make for nice compact notation.
The formula (3.8.2) is the heart of the following theorem, which tells us how to translate an integral in one coordinate system into
an integral in another coordinate system.

 Theorem 3.8.3

Let the functions x(u, v) and y(u, v) have continuous first partial derivatives and let the function f (x, y) be continuous.
Assume that x = x(u, v), y = y(u, v) provides a one-to-one correspondence between the points (u, v) of the region U in the
uv-plane and the points (x, y) of the region R in the xy-plane. Then

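\[ \iint_R f(x,y)\,dx\,dy = \iint_U f\big(x(u,v),\, y(u,v)\big) \left|\det\begin{bmatrix} \frac{\partial x}{\partial u}(u,v) & \frac{\partial y}{\partial u}(u,v) \\[4pt] \frac{\partial x}{\partial v}(u,v) & \frac{\partial y}{\partial v}(u,v) \end{bmatrix}\right| du\,dv \]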

The determinant
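\[ \det\begin{bmatrix} \dfrac{\partial x}{\partial u}(u,v) & \dfrac{\partial y}{\partial u}(u,v) \\[6pt] \dfrac{\partial x}{\partial v}(u,v) & \dfrac{\partial y}{\partial v}(u,v) \end{bmatrix} \]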

that appears in (3.8.2) and Theorem 3.8.3 is known as the Jacobian 4.

 Example 3.8.4. dA for x ↔ y

We'll start with a pretty trivial example in which we simply rename x to Y and y to X. That is

x(X, Y ) = Y

y(X, Y ) = X

Since
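\[ \frac{\partial x}{\partial X} = 0 \qquad \frac{\partial y}{\partial X} = 1 \]
\[ \frac{\partial x}{\partial Y} = 1 \qquad \frac{\partial y}{\partial Y} = 0 \]

(3.8.2), but with u renamed to X and v renamed to Y, gives

\[ dA = \left|\det\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}\right| dX\,dY = dX\,dY \]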

which should really not be a shock.

 Example 3.8.5. dA for Polar Coordinates

Polar coordinates have

x(r, θ) = r cos θ

y(r, θ) = r sin θ

Since
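\[ \frac{\partial x}{\partial r} = \cos\theta \qquad \frac{\partial y}{\partial r} = \sin\theta \]
\[ \frac{\partial x}{\partial\theta} = -r\sin\theta \qquad \frac{\partial y}{\partial\theta} = r\cos\theta \]

(3.8.2), but with u renamed to r and v renamed to θ, gives

\[ dA = \left|\det\begin{bmatrix} \cos\theta & \sin\theta \\ -r\sin\theta & r\cos\theta \end{bmatrix}\right| dr\,d\theta = \big(r\cos^2\theta + r\sin^2\theta\big)\,dr\,d\theta = r\,dr\,d\theta \]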

which is exactly what we found in 3.2.5.
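
As a rough numerical sanity check of (3.8.2), the following sketch approximates the four partial derivatives by central differences and compares the resulting area factor with r for polar coordinates. The helper name `jacobian_area_factor`, the step size `h` and the sample point are illustrative assumptions, not part of the text.

```python
import math

def jacobian_area_factor(x, y, u, v, h=1e-6):
    """Approximate |det [[x_u, y_u], [x_v, y_v]]| at (u, v) by central differences.

    x and y are functions of (u, v); h is an assumed finite-difference step.
    """
    x_u = (x(u + h, v) - x(u - h, v)) / (2 * h)
    y_u = (y(u + h, v) - y(u - h, v)) / (2 * h)
    x_v = (x(u, v + h) - x(u, v - h)) / (2 * h)
    y_v = (y(u, v + h) - y(u, v - h)) / (2 * h)
    return abs(x_u * y_v - y_u * x_v)

def polar_x(r, theta):
    return r * math.cos(theta)

def polar_y(r, theta):
    return r * math.sin(theta)

# Sample point chosen arbitrarily; the factor should be close to r.
r, theta = 2.0, 0.7
print(jacobian_area_factor(polar_x, polar_y, r, theta), "vs r =", r)
```

The same helper can be pointed at any other pair x(u, v), y(u, v), for example the parabolic coordinates of the next example, where the factor should be close to u² + v².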

 Example 3.8.6. dA for Parabolic Coordinates
Parabolic 5 coordinates are defined by
\[ x(u,v) = \frac{u^2-v^2}{2} \]
\[ y(u,v) = uv \]

Since
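\[ \frac{\partial x}{\partial u} = u \qquad \frac{\partial y}{\partial u} = v \]
\[ \frac{\partial x}{\partial v} = -v \qquad \frac{\partial y}{\partial v} = u \]

(3.8.2) gives
\[ dA = \left|\det\begin{bmatrix} u & v \\ -v & u \end{bmatrix}\right| du\,dv = (u^2+v^2)\,du\,dv \]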

In practice applying the change of variables Theorem 3.8.3 can be quite tricky. Here is just one simple (and rigged) example.

 Example 3.8.7

Evaluate
\[ \iint_R \frac{y}{1+x}\,dx\,dy \qquad\text{where } R = \big\{ (x,y) \,\big|\, 0\le x\le 1,\ 1+x\le y\le 2+2x \big\} \]

Solution
We can simplify the integrand considerably by making the change of variables
\[ \begin{aligned} s &= x & x &= s \\ t &= \frac{y}{1+x} & y &= t(1+x) = t(1+s) \end{aligned} \]

Of course to evaluate the given integral by applying Theorem 3.8.3 we also need to know
[∘] the domain of integration in terms of s and t and
[∘] dx dy in terms of ds dt.
By (3.8.2), recalling that x(s, t) = s and y(s, t) = t(1 + s),
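\[ dx\,dy = \left|\det\begin{bmatrix} \dfrac{\partial x}{\partial s} & \dfrac{\partial y}{\partial s} \\[6pt] \dfrac{\partial x}{\partial t} & \dfrac{\partial y}{\partial t} \end{bmatrix}\right| ds\,dt = \left|\det\begin{bmatrix} 1 & t \\ 0 & 1+s \end{bmatrix}\right| ds\,dt = (1+s)\,ds\,dt \]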

To determine what the change of variables does to the domain of integration, we'll sketch R and then reexpress the boundary
of R in terms of the new coordinates s and t. Here is the sketch of R in the original coordinates (x, y).

The region R is a quadrilateral. It has four sides.
The left side is part of the line x = 0. Recall that x = s. So, in terms of s and t, this line is s = 0.
The right side is part of the line x = 1. In terms of s and t, this line is s = 1.
The bottom side is part of the line y = 1 + x, or y/(1 + x) = 1. Recall that t = y/(1 + x). So, in terms of s and t, this line is t = 1.
The top side is part of the line y = 2(1 + x), or y/(1 + x) = 2. In terms of s and t, this line is t = 2.

Here is another copy of the sketch of R. But this time the equations of its four sides are expressed in terms of s and t.

So, expressed in terms of s and t, the domain of integration R is much simpler:

{(s, t)|0 ≤ s ≤ 1,  1 ≤ t ≤ 2}

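As dx dy = (1 + s) ds dt and the integrand y/(1 + x) = t, the integral is, by Theorem 3.8.3,
\[ \begin{aligned} \iint_R \frac{y}{1+x}\,dx\,dy &= \int_0^1 ds \int_1^2 dt\ (1+s)\,t = \int_0^1 ds\ (1+s)\left[\frac{t^2}{2}\right]_1^2 \\ &= \frac{3}{2}\left[s + \frac{s^2}{2}\right]_0^1 \\ &= \frac{3}{2}\times\frac{3}{2} \\ &= \frac{9}{4} \end{aligned} \]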
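
The value 9/4 can be cross-checked numerically. The sketch below evaluates midpoint Riemann sums for the integral in the new coordinates (s, t), with the factor (1 + s) included, and for the original integral over R in (x, y). The function names and the grid size n are illustrative choices; both sums should come out close to 9/4 = 2.25.

```python
def midpoint_sum_st(n):
    """Midpoint sum of (1 + s) * t over 0 <= s <= 1, 1 <= t <= 2."""
    ds = 1.0 / n
    dt = 1.0 / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * ds
        for j in range(n):
            t = 1.0 + (j + 0.5) * dt
            total += (1.0 + s) * t * ds * dt
    return total

def midpoint_sum_xy(n):
    """Midpoint sum of y / (1 + x) over 0 <= x <= 1, 1 + x <= y <= 2 + 2x."""
    dx = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        y_lo, y_hi = 1.0 + x, 2.0 + 2.0 * x
        dy = (y_hi - y_lo) / n  # the y-range depends on x
        for j in range(n):
            y = y_lo + (j + 0.5) * dy
            total += y / (1.0 + x) * dx * dy
    return total

print(midpoint_sum_st(200), midpoint_sum_xy(200))
```

Note how much simpler the loop bounds are in (s, t): the domain is a fixed rectangle, which is the point of the change of variables.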

There are natural generalizations of (3.8.2) and Theorem 3.8.3 to three (and also to higher) dimensions, that are derived in precisely
the same way as (3.8.2) was derived. The derivation is based on the fact, discussed in the optional Section 1.2.4, that the volume of
the parallelepiped (three dimensional parallelogram)

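determined by the three vectors a = ⟨a1, a2, a3⟩, b = ⟨b1, b2, b3⟩ and c = ⟨c1, c2, c3⟩ is given by the formula
\[ \text{volume of parallelepiped with edges } \mathbf{a}, \mathbf{b}, \mathbf{c} = \left|\det\begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix}\right| \]

where the determinant of a 3 × 3 matrix can be defined in terms of some 2 × 2 determinants by
\[ \det\begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix} = a_1\det\begin{bmatrix} b_2 & b_3 \\ c_2 & c_3 \end{bmatrix} - a_2\det\begin{bmatrix} b_1 & b_3 \\ c_1 & c_3 \end{bmatrix} + a_3\det\begin{bmatrix} b_1 & b_2 \\ c_1 & c_2 \end{bmatrix} \]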

If we use

x = x(u, v, w)

y = y(u, v, w)

z = z(u, v, w)

to change from old coordinates x, y, z to new coordinates u, v, w, then

 Equation 3.8.8
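\[ dV = \left|\det\begin{bmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial y}{\partial u} & \dfrac{\partial z}{\partial u} \\[6pt] \dfrac{\partial x}{\partial v} & \dfrac{\partial y}{\partial v} & \dfrac{\partial z}{\partial v} \\[6pt] \dfrac{\partial x}{\partial w} & \dfrac{\partial y}{\partial w} & \dfrac{\partial z}{\partial w} \end{bmatrix}\right| du\,dv\,dw \]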

 Example 3.8.9. dV for Cylindrical Coordinates

Cylindrical coordinates have

x(r, θ, z) = r cos θ

y(r, θ, z) = r sin θ

z(r, θ, z) = z

Since
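\[ \begin{aligned} \frac{\partial x}{\partial r} &= \cos\theta & \frac{\partial y}{\partial r} &= \sin\theta & \frac{\partial z}{\partial r} &= 0 \\ \frac{\partial x}{\partial\theta} &= -r\sin\theta & \frac{\partial y}{\partial\theta} &= r\cos\theta & \frac{\partial z}{\partial\theta} &= 0 \\ \frac{\partial x}{\partial z} &= 0 & \frac{\partial y}{\partial z} &= 0 & \frac{\partial z}{\partial z} &= 1 \end{aligned} \]

(3.8.8), but with u renamed to r, v renamed to θ and w renamed to z, gives

\[ \begin{aligned} dV &= \left|\det\begin{bmatrix} \cos\theta & \sin\theta & 0 \\ -r\sin\theta & r\cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}\right| dr\,d\theta\,dz \\ &= \left| \cos\theta\,\det\begin{bmatrix} r\cos\theta & 0 \\ 0 & 1 \end{bmatrix} - \sin\theta\,\det\begin{bmatrix} -r\sin\theta & 0 \\ 0 & 1 \end{bmatrix} + 0\,\det\begin{bmatrix} -r\sin\theta & r\cos\theta \\ 0 & 0 \end{bmatrix} \right| dr\,d\theta\,dz \\ &= \big(r\cos^2\theta + r\sin^2\theta\big)\,dr\,d\theta\,dz \\ &= r\,dr\,d\theta\,dz \end{aligned} \]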

which is exactly what we found in (3.6.3).

 Example 3.8.10. dV for Spherical Coordinates
Spherical coordinates have

x(ρ, θ, φ) = ρ cos θ sin φ

y(ρ, θ, φ) = ρ sin θ sin φ

z(ρ, θ, φ) = ρ cos φ

Since
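\[ \begin{aligned} \frac{\partial x}{\partial\rho} &= \cos\theta\sin\varphi & \frac{\partial y}{\partial\rho} &= \sin\theta\sin\varphi & \frac{\partial z}{\partial\rho} &= \cos\varphi \\ \frac{\partial x}{\partial\theta} &= -\rho\sin\theta\sin\varphi & \frac{\partial y}{\partial\theta} &= \rho\cos\theta\sin\varphi & \frac{\partial z}{\partial\theta} &= 0 \\ \frac{\partial x}{\partial\varphi} &= \rho\cos\theta\cos\varphi & \frac{\partial y}{\partial\varphi} &= \rho\sin\theta\cos\varphi & \frac{\partial z}{\partial\varphi} &= -\rho\sin\varphi \end{aligned} \]

(3.8.8), but with u renamed to ρ, v renamed to θ and w renamed to φ, gives

\[ \begin{aligned} dV &= \left|\det\begin{bmatrix} \cos\theta\sin\varphi & \sin\theta\sin\varphi & \cos\varphi \\ -\rho\sin\theta\sin\varphi & \rho\cos\theta\sin\varphi & 0 \\ \rho\cos\theta\cos\varphi & \rho\sin\theta\cos\varphi & -\rho\sin\varphi \end{bmatrix}\right| d\rho\,d\theta\,d\varphi \\ &= \Bigg| \cos\theta\sin\varphi\,\det\begin{bmatrix} \rho\cos\theta\sin\varphi & 0 \\ \rho\sin\theta\cos\varphi & -\rho\sin\varphi \end{bmatrix} - \sin\theta\sin\varphi\,\det\begin{bmatrix} -\rho\sin\theta\sin\varphi & 0 \\ \rho\cos\theta\cos\varphi & -\rho\sin\varphi \end{bmatrix} \\ &\qquad + \cos\varphi\,\det\begin{bmatrix} -\rho\sin\theta\sin\varphi & \rho\cos\theta\sin\varphi \\ \rho\cos\theta\cos\varphi & \rho\sin\theta\cos\varphi \end{bmatrix} \Bigg|\, d\rho\,d\theta\,d\varphi \\ &= \rho^2 \left| -\cos^2\theta\sin^3\varphi - \sin^2\theta\sin^3\varphi - \sin\varphi\cos^2\varphi \right| d\rho\,d\theta\,d\varphi \\ &= \rho^2 \left| -\sin\varphi\sin^2\varphi - \sin\varphi\cos^2\varphi \right| d\rho\,d\theta\,d\varphi \\ &= \rho^2\sin\varphi\ d\rho\,d\theta\,d\varphi \end{aligned} \]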

which is exactly what we found in (3.7.3).
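
The three-dimensional formula (3.8.8) can be spot-checked numerically in the same spirit. This sketch expands a 3 × 3 determinant along its first row, builds the matrix of partial derivatives by central differences, and compares the result with ρ² sin φ for spherical coordinates. The names `det3`, `volume_factor`, the step `h` and the sample point are illustrative assumptions.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix, expanded along the first row."""
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = m
    return (a1 * (b2 * c3 - b3 * c2)
            - a2 * (b1 * c3 - b3 * c1)
            + a3 * (b1 * c2 - b2 * c1))

def spherical(rho, theta, phi):
    """(x, y, z) for spherical coordinates, phi measured from the z axis."""
    return (rho * math.cos(theta) * math.sin(phi),
            rho * math.sin(theta) * math.sin(phi),
            rho * math.cos(phi))

def volume_factor(mapping, point, h=1e-6):
    """|det| of the matrix whose k-th row is d(x, y, z)/d(k-th coordinate)."""
    rows = []
    for k in range(3):
        plus, minus = list(point), list(point)
        plus[k] += h
        minus[k] -= h
        f_plus, f_minus = mapping(*plus), mapping(*minus)
        rows.append([(a - b) / (2 * h) for a, b in zip(f_plus, f_minus)])
    return abs(det3(rows))

rho, theta, phi = 1.5, 0.4, 1.1  # arbitrary sample point
print(volume_factor(spherical, (rho, theta, phi)), "vs", rho**2 * math.sin(phi))
```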

Optional — Dropping Higher Order Terms in du, dv


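In the course of deriving (3.8.2), that is, the dA formula for the small piece of R with corners P0, P1, P2, P3, we approximated, for example, the vectors

\[ \begin{aligned} \overrightarrow{P_0P_1} &= \mathbf{r}(u_0+du,\,v_0) - \mathbf{r}(u_0,v_0) = \frac{\partial\mathbf{r}}{\partial u}(u_0,v_0)\,du + \mathbf{E}_1 \approx \frac{\partial\mathbf{r}}{\partial u}(u_0,v_0)\,du \\ \overrightarrow{P_0P_2} &= \mathbf{r}(u_0,\,v_0+dv) - \mathbf{r}(u_0,v_0) = \frac{\partial\mathbf{r}}{\partial v}(u_0,v_0)\,dv + \mathbf{E}_2 \approx \frac{\partial\mathbf{r}}{\partial v}(u_0,v_0)\,dv \end{aligned} \]

where E1 is bounded 6 by a constant times (du)² and E2 is bounded by a constant times (dv)². That is, we assumed that we could just ignore the errors and drop E1 and E2 by setting them to zero.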

So we approximated

\[ \begin{aligned} \left|\overrightarrow{P_0P_1}\times\overrightarrow{P_0P_2}\right| &= \left| \Big[\frac{\partial\mathbf{r}}{\partial u}(u_0,v_0)\,du + \mathbf{E}_1\Big] \times \Big[\frac{\partial\mathbf{r}}{\partial v}(u_0,v_0)\,dv + \mathbf{E}_2\Big] \right| \\ &= \left| \frac{\partial\mathbf{r}}{\partial u}(u_0,v_0)\,du \times \frac{\partial\mathbf{r}}{\partial v}(u_0,v_0)\,dv + \mathbf{E}_3 \right| \\ &\approx \left| \frac{\partial\mathbf{r}}{\partial u}(u_0,v_0)\,du \times \frac{\partial\mathbf{r}}{\partial v}(u_0,v_0)\,dv \right| \end{aligned} \]

where the length of the vector E3 is bounded by a constant times (du)² dv + du (dv)². We'll now see why dropping terms like E3 does not change the value of the integral at all 7. Suppose that our domain of integration consists of all (u, v)'s in a rectangle of width W and height H, as in the figure below.

Subdivide the rectangle into a grid of n × n small subrectangles by drawing lines of constant v (the red lines in the figure) and lines of constant u (the blue lines in the figure). Each subrectangle has width du = W/n and height dv = H/n. Now suppose that in setting up the integral we make, for each subrectangle, an error that is bounded by some constant times
\[ (du)^2\,dv + du\,(dv)^2 = \Big(\frac{W}{n}\Big)^2 \frac{H}{n} + \frac{W}{n}\Big(\frac{H}{n}\Big)^2 = \frac{WH(W+H)}{n^3} \]

Because there are a total of n² subrectangles, the total error that we have introduced, for all of these subrectangles, is no larger than a constant times
\[ n^2 \times \frac{WH(W+H)}{n^3} = \frac{WH(W+H)}{n} \]

When we define our integral by taking the limit n → ∞ of the Riemann sums, this error converges to exactly 0. As a consequence, it was safe for us to ignore the error terms when we established the change of variables formulae.
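
To make the 1/n decay concrete, here is a small experiment, offered as a sketch rather than a proof. It uses the parabolic coordinates of Example 3.8.6 on the square 0 ≤ u, v ≤ 1 (an arbitrary choice of map and domain), and for each subrectangle compares the exact parallelogram area spanned by P0P1 and P0P2 with the approximation (u² + v²) du dv. The function name `total_error` is hypothetical.

```python
def total_error(n, W=1.0, H=1.0):
    """Sum over an n x n grid of |exact parallelogram area - Jacobian approximation|."""
    du, dv = W / n, H / n

    def r(u, v):
        # Parabolic coordinates: x = (u^2 - v^2)/2, y = u v
        return ((u * u - v * v) / 2.0, u * v)

    err = 0.0
    for i in range(n):
        u = i * du
        for j in range(n):
            v = j * dv
            p0, p1, p2 = r(u, v), r(u + du, v), r(u, v + dv)
            a = (p1[0] - p0[0], p1[1] - p0[1])  # vector P0P1
            b = (p2[0] - p0[0], p2[1] - p0[1])  # vector P0P2
            exact = abs(a[0] * b[1] - a[1] * b[0])
            approx = (u * u + v * v) * du * dv
            err += abs(exact - approx)
    return err

for n in (10, 100, 200):
    print(n, total_error(n), n * total_error(n))
```

For this particular map the product n × (total error) stays bounded as n grows, which is the behaviour the argument above predicts.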

1. We'll keep our third wish in reserve.


2. We are abusing notation a little here by using x and y both as coordinates and as functions. We could write x = f (u, v) and
y = g(u, v), but it is easier to remember x = x(u, v) and y = y(u, v).

3. Recall 2.6.1.
4. It is not named after the Jacobin Club, a political movement of the French revolution. It is not named after the Jacobite
rebellions that took place in Great Britain and Ireland between 1688 and 1746. It is not named after the Jacobean era of English
and Scottish history. It is named after the German mathematician Carl Gustav Jacob Jacobi (1804 – 1851). He died from
smallpox.
5. The name comes from the fact that both the curves of constant u and the curves of constant v are parabolas.
6. Remember the error in the Taylor polynomial approximations. See 2.6.13 and 2.6.14.
7. See the optional § 1.1.6 of the CLP-2 text for an analogous argument concerning Riemann sums.

This page titled 3.8: Optional— Integrals in General Coordinates is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or
curated by Joel Feldman, Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts
platform; a detailed edit history is available upon request.

CHAPTER OVERVIEW
4: Appendices
A: Appendices
A.1: Trigonometry
A.2: Powers and Logarithms
A.3: Table of Derivatives
A.4: Table of Integrals
A.5: Table of Taylor Expansions
A.6: 3-D Coordinate Systems
A.7: ISO Coordinate System Notation
A.8: Conic Sections and Quadric Surfaces
B: Hints for Exercises
C: Answers to Exercises
D: Solutions to Exercises

A: Appendices

A.1 https://math.libretexts.org/@go/page/89221
A.1: Trigonometry
A.1.1 Trigonometry — Graphs

(Figure: graphs of sin θ, cos θ and tan θ.)

A.1.2 Trigonometry — Special Triangles

From the above pair of special triangles we have



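\[ \begin{aligned} \sin\frac{\pi}{4} &= \frac{1}{\sqrt2} & \sin\frac{\pi}{6} &= \frac12 & \sin\frac{\pi}{3} &= \frac{\sqrt3}{2} \\ \cos\frac{\pi}{4} &= \frac{1}{\sqrt2} & \cos\frac{\pi}{6} &= \frac{\sqrt3}{2} & \cos\frac{\pi}{3} &= \frac12 \\ \tan\frac{\pi}{4} &= 1 & \tan\frac{\pi}{6} &= \frac{1}{\sqrt3} & \tan\frac{\pi}{3} &= \sqrt3 \end{aligned} \]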

A.1.3 Trigonometry — Simple Identities


Periodicity

sin(θ + 2π) = sin(θ) cos(θ + 2π) = cos(θ)

Reflection

sin(−θ) = − sin(θ) cos(−θ) = cos(θ)

Reflection around π/4


π π
sin( − θ) = cos θ cos( − θ) = sin θ
2 2

Reflection around π/2

sin(π − θ) = sin θ cos(π − θ) = − cos θ

Rotation by π

sin(θ + π) = − sin θ cos(θ + π) = − cos θ

Pythagoras

A.1.1 https://math.libretexts.org/@go/page/92249
\[ \begin{aligned} \sin^2\theta + \cos^2\theta &= 1 \\ \tan^2\theta + 1 &= \sec^2\theta \\ 1 + \cot^2\theta &= \csc^2\theta \end{aligned} \]

sin and cos building blocks


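\[ \tan\theta = \frac{\sin\theta}{\cos\theta} \qquad \csc\theta = \frac{1}{\sin\theta} \qquad \sec\theta = \frac{1}{\cos\theta} \qquad \cot\theta = \frac{\cos\theta}{\sin\theta} = \frac{1}{\tan\theta} \]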

A.1.4 Trigonometry — Add and Subtract Angles


Sine

sin(α ± β) = sin(α) cos(β) ± cos(α) sin(β)

Cosine

cos(α ± β) = cos(α) cos(β) ∓ sin(α) sin(β)

Tangent
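\[ \tan(\alpha+\beta) = \frac{\tan\alpha + \tan\beta}{1 - \tan\alpha\tan\beta} \qquad \tan(\alpha-\beta) = \frac{\tan\alpha - \tan\beta}{1 + \tan\alpha\tan\beta} \]

Double angle
\[ \begin{aligned} \sin(2\theta) &= 2\sin(\theta)\cos(\theta) \\ \cos(2\theta) &= \cos^2(\theta) - \sin^2(\theta) \\ &= 2\cos^2(\theta) - 1 \\ &= 1 - 2\sin^2(\theta) \\ \tan(2\theta) &= \frac{2\tan(\theta)}{1 - \tan^2\theta} \\ \cos^2\theta &= \frac{1 + \cos(2\theta)}{2} \\ \sin^2\theta &= \frac{1 - \cos(2\theta)}{2} \\ \tan^2\theta &= \frac{1 - \cos(2\theta)}{1 + \cos(2\theta)} \end{aligned} \]

Products to sums
\[ \begin{aligned} \sin(\alpha)\cos(\beta) &= \frac{\sin(\alpha+\beta) + \sin(\alpha-\beta)}{2} \\ \sin(\alpha)\sin(\beta) &= \frac{\cos(\alpha-\beta) - \cos(\alpha+\beta)}{2} \\ \cos(\alpha)\cos(\beta) &= \frac{\cos(\alpha-\beta) + \cos(\alpha+\beta)}{2} \end{aligned} \]

Sums to products
\[ \begin{aligned} \sin\alpha + \sin\beta &= 2\sin\frac{\alpha+\beta}{2}\cos\frac{\alpha-\beta}{2} \\ \sin\alpha - \sin\beta &= 2\cos\frac{\alpha+\beta}{2}\sin\frac{\alpha-\beta}{2} \\ \cos\alpha + \cos\beta &= 2\cos\frac{\alpha+\beta}{2}\cos\frac{\alpha-\beta}{2} \\ \cos\alpha - \cos\beta &= -2\sin\frac{\alpha+\beta}{2}\sin\frac{\alpha-\beta}{2} \end{aligned} \]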

A.1.5 Inverse Trigonometric Functions

arcsin x: Domain −1 ≤ x ≤ 1. Range −π/2 ≤ arcsin x ≤ π/2.
arccos x: Domain −1 ≤ x ≤ 1. Range 0 ≤ arccos x ≤ π.
arctan x: Domain all real numbers. Range −π/2 < arctan x < π/2.

Since these functions are inverses of each other we have


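\[ \begin{aligned} \arcsin(\sin\theta) &= \theta & -\frac{\pi}{2} \le\ &\theta \le \frac{\pi}{2} \\ \arccos(\cos\theta) &= \theta & 0 \le\ &\theta \le \pi \\ \arctan(\tan\theta) &= \theta & -\frac{\pi}{2} <\ &\theta < \frac{\pi}{2} \end{aligned} \]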

and also
sin(arcsin x) = x −1 ≤ x ≤ 1

cos(arccos x) = x −1 ≤ x ≤ 1

tan(arctan x) = x any real x

arccsc x: Domain |x| ≥ 1. Range −π/2 ≤ arccsc x ≤ π/2, arccsc x ≠ 0.
arcsec x: Domain |x| ≥ 1. Range 0 ≤ arcsec x ≤ π, arcsec x ≠ π/2.
arccot x: Domain all real numbers. Range 0 < arccot x < π.

Again
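\[ \begin{aligned} \operatorname{arccsc}(\csc\theta) &= \theta & &-\frac{\pi}{2} \le \theta \le \frac{\pi}{2},\ \theta \ne 0 \\ \operatorname{arcsec}(\sec\theta) &= \theta & &0 \le \theta \le \pi,\ \theta \ne \frac{\pi}{2} \\ \operatorname{arccot}(\cot\theta) &= \theta & &0 < \theta < \pi \end{aligned} \]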

and
csc(arccscx) = x |x| ≥ 1

sec(arcsecx) = x |x| ≥ 1

cot(arccotx) = x any real x

A.2: Powers and Logarithms
A.2.1 Powers
In the following, x and y are arbitrary real numbers, q is an arbitrary constant that is strictly bigger than zero and e is
2.7182818284, to ten decimal places.
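\[ \begin{gathered} e^0 = 1, \qquad q^0 = 1 \\ e^{x+y} = e^x e^y, \quad e^{x-y} = \frac{e^x}{e^y}, \qquad q^{x+y} = q^x q^y, \quad q^{x-y} = \frac{q^x}{q^y} \\ e^{-x} = \frac{1}{e^x}, \qquad q^{-x} = \frac{1}{q^x} \\ (e^x)^y = e^{xy}, \qquad (q^x)^y = q^{xy} \\ \frac{d}{dx}e^x = e^x, \qquad \frac{d}{dx}e^{g(x)} = g'(x)\,e^{g(x)}, \qquad \frac{d}{dx}q^x = (\ln q)\,q^x \\ \int e^x\,dx = e^x + C, \qquad \int e^{ax}\,dx = \frac{1}{a}e^{ax} + C \quad\text{if } a \ne 0 \\ e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!} \\ \lim_{x\to\infty} e^x = \infty, \qquad \lim_{x\to-\infty} e^x = 0 \\ \lim_{x\to\infty} q^x = \infty, \qquad \lim_{x\to-\infty} q^x = 0 \quad\text{if } q > 1 \\ \lim_{x\to\infty} q^x = 0, \qquad \lim_{x\to-\infty} q^x = \infty \quad\text{if } 0 < q < 1 \end{gathered} \]

The graph of 2^x is given below. The graph of q^x, for any q > 1, is similar.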

A.2.2 Logarithms
In the following, x and y are arbitrary real numbers that are strictly bigger than 0 (except where otherwise specified), p and q are arbitrary constants that are strictly bigger than one, and e is 2.7182818284, to ten decimal places. The notation ln x means log_e x. Some people use log x to mean log_10 x, others use it to mean log_e x and still others use it to mean log_2 x.

\[ \begin{gathered} e^{\ln x} = x, \qquad q^{\log_q x} = x \\ \ln(e^x) = x, \qquad \log_q(q^x) = x \quad\text{for all } -\infty < x < \infty \\ \log_q x = \frac{\ln x}{\ln q}, \qquad \ln x = \frac{\log_p x}{\log_p e}, \qquad \log_q x = \frac{\log_p x}{\log_p q} \\ \ln 1 = 0, \qquad \ln e = 1 \\ \log_q 1 = 0, \qquad \log_q q = 1 \\ \ln(xy) = \ln x + \ln y, \qquad \log_q(xy) = \log_q x + \log_q y \\ \ln\Big(\frac{x}{y}\Big) = \ln x - \ln y, \qquad \log_q\Big(\frac{x}{y}\Big) = \log_q x - \log_q y \\ \ln\Big(\frac{1}{y}\Big) = -\ln y, \qquad \log_q\Big(\frac{1}{y}\Big) = -\log_q y \\ \ln(x^y) = y\ln x, \qquad \log_q(x^y) = y\log_q x \\ \frac{d}{dx}\ln x = \frac{1}{x}, \qquad \frac{d}{dx}\log_q x = \frac{1}{x\ln q} \end{gathered} \]

A.2.1 https://math.libretexts.org/@go/page/92250
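\[ \begin{gathered} \int \ln x\,dx = x\ln x - x + C, \qquad \int \log_q x\,dx = x\log_q x - \frac{x}{\ln q} + C \\ \lim_{x\to\infty} \ln x = \infty, \qquad \lim_{x\to 0^+} \ln x = -\infty \\ \lim_{x\to\infty} \log_q x = \infty, \qquad \lim_{x\to 0^+} \log_q x = -\infty \end{gathered} \]

The graph of log_10 x is given below. The graph of log_q x, for any q > 1, is similar.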
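
The change of base formula log_q x = ln x / ln q is the one most often needed in practice, so here is a minimal sketch of it in Python using only the standard `math` module. The helper name `log_base` is an illustrative choice; it assumes x > 0 and q > 1, as in the text.

```python
import math

def log_base(q, x):
    """log_q(x) computed from natural logarithms: ln x / ln q."""
    return math.log(x) / math.log(q)

# A few of the identities from the table, checked on sample values.
print(log_base(2, 8))                                       # close to 3
print(log_base(3, 5 * 7), log_base(3, 5) + log_base(3, 7))  # product rule
print(log_base(10, 2 ** 0.5), 0.5 * log_base(10, 2))        # power rule
```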

A.3: Table of Derivatives
Throughout this table, a and b are constants, independent of x.
| \(F(x)\) | \(F'(x) = \frac{dF}{dx}\) |
|---|---|
| \(af(x) + bg(x)\) | \(af'(x) + bg'(x)\) |
| \(f(x) + g(x)\) | \(f'(x) + g'(x)\) |
| \(f(x) - g(x)\) | \(f'(x) - g'(x)\) |
| \(af(x)\) | \(af'(x)\) |
| \(f(x)g(x)\) | \(f'(x)g(x) + f(x)g'(x)\) |
| \(f(x)g(x)h(x)\) | \(f'(x)g(x)h(x) + f(x)g'(x)h(x) + f(x)g(x)h'(x)\) |
| \(\frac{f(x)}{g(x)}\) | \(\frac{f'(x)g(x) - f(x)g'(x)}{g(x)^2}\) |
| \(\frac{1}{g(x)}\) | \(-\frac{g'(x)}{g(x)^2}\) |
| \(f(g(x))\) | \(f'(g(x))\,g'(x)\) |

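| \(F(x)\) | \(F'(x) = \frac{dF}{dx}\) |
|---|---|
| \(a\) | \(0\) |
| \(x^a\) | \(ax^{a-1}\) |
| \(g(x)^a\) | \(a\,g(x)^{a-1}g'(x)\) |
| \(\sin x\) | \(\cos x\) |
| \(\sin g(x)\) | \(g'(x)\cos g(x)\) |
| \(\cos x\) | \(-\sin x\) |
| \(\cos g(x)\) | \(-g'(x)\sin g(x)\) |
| \(\tan x\) | \(\sec^2 x\) |
| \(\csc x\) | \(-\csc x\cot x\) |
| \(\sec x\) | \(\sec x\tan x\) |
| \(\cot x\) | \(-\csc^2 x\) |
| \(e^x\) | \(e^x\) |
| \(e^{g(x)}\) | \(g'(x)\,e^{g(x)}\) |
| \(a^x\) | \((\ln a)\,a^x\) |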

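| \(F(x)\) | \(F'(x) = \frac{dF}{dx}\) |
|---|---|
| \(\ln x\) | \(\frac{1}{x}\) |
| \(\ln g(x)\) | \(\frac{g'(x)}{g(x)}\) |
| \(\log_a x\) | \(\frac{1}{x\ln a}\) |
| \(\arcsin x\) | \(\frac{1}{\sqrt{1-x^2}}\) |
| \(\arcsin g(x)\) | \(\frac{g'(x)}{\sqrt{1-g(x)^2}}\) |
| \(\arccos x\) | \(-\frac{1}{\sqrt{1-x^2}}\) |
| \(\arctan x\) | \(\frac{1}{1+x^2}\) |
| \(\arctan g(x)\) | \(\frac{g'(x)}{1+g(x)^2}\) |
| \(\operatorname{arccsc} x\) | \(-\frac{1}{\lvert x\rvert\sqrt{x^2-1}}\) |
| \(\operatorname{arcsec} x\) | \(\frac{1}{\lvert x\rvert\sqrt{x^2-1}}\) |
| \(\operatorname{arccot} x\) | \(-\frac{1}{1+x^2}\) |

A.3.1 https://math.libretexts.org/@go/page/92251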

A.4: Table of Integrals
Throughout this table, a and b are given constants, independent of x and C is an arbitrary constant.

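| \(f(x)\) | \(F(x) = \int f(x)\,dx\) |
|---|---|
| \(af(x) + bg(x)\) | \(a\int f(x)\,dx + b\int g(x)\,dx + C\) |
| \(f(x) + g(x)\) | \(\int f(x)\,dx + \int g(x)\,dx + C\) |
| \(f(x) - g(x)\) | \(\int f(x)\,dx - \int g(x)\,dx + C\) |
| \(af(x)\) | \(a\int f(x)\,dx + C\) |
| \(u(x)v'(x)\) | \(u(x)v(x) - \int u'(x)v(x)\,dx + C\) |
| \(f(y(x))\,y'(x)\) | \(F(y(x))\) where \(F(y) = \int f(y)\,dy\) |
| \(a\) | \(ax + C\) |
| \(x^a\) | \(\frac{x^{a+1}}{a+1} + C\) if \(a \ne -1\) |
| \(\frac{1}{x}\) | \(\ln\lvert x\rvert + C\) |
| \(g(x)^a g'(x)\) | \(\frac{g(x)^{a+1}}{a+1} + C\) if \(a \ne -1\) |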

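| \(f(x)\) | \(F(x) = \int f(x)\,dx\) |
|---|---|
| \(\sin x\) | \(-\cos x + C\) |
| \(g'(x)\sin g(x)\) | \(-\cos g(x) + C\) |
| \(\cos x\) | \(\sin x + C\) |
| \(\tan x\) | \(\ln\lvert\sec x\rvert + C\) |
| \(\csc x\) | \(\ln\lvert\csc x - \cot x\rvert + C\) |
| \(\sec x\) | \(\ln\lvert\sec x + \tan x\rvert + C\) |
| \(\cot x\) | \(\ln\lvert\sin x\rvert + C\) |
| \(\sec^2 x\) | \(\tan x + C\) |
| \(\csc^2 x\) | \(-\cot x + C\) |
| \(\sec x\tan x\) | \(\sec x + C\) |
| \(\csc x\cot x\) | \(-\csc x + C\) |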

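| \(f(x)\) | \(F(x) = \int f(x)\,dx\) |
|---|---|
| \(e^x\) | \(e^x + C\) |
| \(e^{g(x)}g'(x)\) | \(e^{g(x)} + C\) |
| \(e^{ax}\) | \(\frac{1}{a}e^{ax} + C\) |
| \(a^x\) | \(\frac{1}{\ln a}a^x + C\) |
| \(\ln x\) | \(x\ln x - x + C\) |
| \(\frac{1}{\sqrt{1-x^2}}\) | \(\arcsin x + C\) |
| \(\frac{g'(x)}{\sqrt{1-g(x)^2}}\) | \(\arcsin g(x) + C\) |
| \(\frac{1}{\sqrt{a^2-x^2}}\) | \(\arcsin\frac{x}{a} + C\) |
| \(\frac{1}{1+x^2}\) | \(\arctan x + C\) |
| \(\frac{g'(x)}{1+g(x)^2}\) | \(\arctan g(x) + C\) |
| \(\frac{1}{a^2+x^2}\) | \(\frac{1}{a}\arctan\frac{x}{a} + C\) |
| \(\frac{1}{x\sqrt{x^2-1}}\) | \(\operatorname{arcsec} x + C \quad (x > 1)\) |

A.4.1 https://math.libretexts.org/@go/page/92252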

A.5: Table of Taylor Expansions
Let n ≥ 0 be an integer. Then if the function f has n + 1 derivatives on an interval that contains both x0 and x, we have the Taylor expansion
\[ \begin{aligned} f(x) &= f(x_0) + f'(x_0)\,(x-x_0) + \frac{1}{2!}f''(x_0)\,(x-x_0)^2 + \cdots + \frac{1}{n!}f^{(n)}(x_0)\,(x-x_0)^n \\ &\qquad + \frac{1}{(n+1)!}f^{(n+1)}(c)\,(x-x_0)^{n+1} \qquad\text{for some } c \text{ between } x_0 \text{ and } x \end{aligned} \]

The limit as n → ∞ gives the Taylor series
\[ f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(x_0)}{n!}\,(x-x_0)^n \]

for f. When x0 = 0 this is also called the Maclaurin series for f. Here are Taylor series expansions of some important functions.

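\[ \begin{aligned} e^x &= \sum_{n=0}^{\infty} \frac{1}{n!}x^n && \text{for } -\infty < x < \infty \\ &= 1 + x + \frac{1}{2}x^2 + \frac{1}{3!}x^3 + \cdots + \frac{1}{n!}x^n + \cdots \\ \sin x &= \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n+1)!}x^{2n+1} && \text{for } -\infty < x < \infty \\ &= x - \frac{1}{3!}x^3 + \frac{1}{5!}x^5 - \cdots + \frac{(-1)^n}{(2n+1)!}x^{2n+1} + \cdots \\ \cos x &= \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n)!}x^{2n} && \text{for } -\infty < x < \infty \\ &= 1 - \frac{1}{2!}x^2 + \frac{1}{4!}x^4 - \cdots + \frac{(-1)^n}{(2n)!}x^{2n} + \cdots \\ \frac{1}{1-x} &= \sum_{n=0}^{\infty} x^n && \text{for } -1 < x < 1 \\ &= 1 + x + x^2 + x^3 + \cdots + x^n + \cdots \\ \frac{1}{1+x} &= \sum_{n=0}^{\infty} (-1)^n x^n && \text{for } -1 < x < 1 \\ &= 1 - x + x^2 - x^3 + \cdots + (-1)^n x^n + \cdots \\ \ln(1-x) &= -\sum_{n=1}^{\infty} \frac{1}{n}x^n && \text{for } -1 \le x < 1 \\ &= -x - \frac{1}{2}x^2 - \frac{1}{3}x^3 - \cdots - \frac{1}{n}x^n - \cdots \\ \ln(1+x) &= -\sum_{n=1}^{\infty} \frac{(-1)^n}{n}x^n && \text{for } -1 < x \le 1 \\ &= x - \frac{1}{2}x^2 + \frac{1}{3}x^3 - \cdots - \frac{(-1)^n}{n}x^n - \cdots \\ (1+x)^p &= 1 + px + \frac{p(p-1)}{2}x^2 + \frac{p(p-1)(p-2)}{3!}x^3 + \cdots \\ &\qquad + \frac{p(p-1)(p-2)\cdots(p-n+1)}{n!}x^n + \cdots \end{aligned} \]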
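
As an illustration of how quickly these series converge inside their intervals of validity, the sketch below sums the first few terms of three of them and compares with the standard library values. The function names and the numbers of terms are arbitrary choices made for this example.

```python
import math

def exp_partial(x, n):
    """Sum of x^k / k! for k = 0..n."""
    return sum(x**k / math.factorial(k) for k in range(n + 1))

def sin_partial(x, n):
    """Sum of (-1)^k x^(2k+1) / (2k+1)! for k = 0..n."""
    return sum((-1)**k * x**(2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(n + 1))

def ln1p_partial(x, n):
    """Sum of -(-1)^k x^k / k for k = 1..n (valid for -1 < x <= 1)."""
    return sum(-((-1)**k) * x**k / k for k in range(1, n + 1))

print(exp_partial(1.0, 20), math.e)
print(sin_partial(0.5, 10), math.sin(0.5))
print(ln1p_partial(0.5, 60), math.log(1.5))
```

Near the endpoints of the interval of convergence, for example ln(1 + x) at x = 1, many more terms are needed for the same accuracy.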


A.5.1 https://math.libretexts.org/@go/page/92253
A.6: 3-D Coordinate Systems
A.6.1 Cartesian Coordinates
Here is a figure showing the definitions of the three Cartesian coordinates (x, y, z)

and here are three figures showing a surface of constant x, a surface of constant y, and a surface of constant z.

Finally here is a figure showing the volume element dV in cartesian coordinates.

A.6.2 Cylindrical Coordinates


Here is a figure showing the definitions of the three cylindrical coordinates
r =  distance from (0, 0, 0) to (x, y, 0)

θ = angle between the x axis and the line joining (x, y, 0) to (0, 0, 0)

z =  signed distance from (x, y, z) to the xy-plane

The cartesian and cylindrical coordinates are related by


\[ \begin{aligned} x &= r\cos\theta & y &= r\sin\theta & z &= z \\ r &= \sqrt{x^2+y^2} & \theta &= \arctan\frac{y}{x} & z &= z \end{aligned} \]

Here are three figures showing a surface of constant r, a surface of constant θ, and a surface of constant z.

A.6.1 https://math.libretexts.org/@go/page/92254
Finally here is a figure showing the volume element dV in cylindrical coordinates.

A.6.3 Spherical Coordinates


Here is a figure showing the definitions of the three spherical coordinates
ρ = distance from (0, 0, 0) to (x, y, z)

φ = angle between the z axis and the line joining (x, y, z) to (0, 0, 0)

θ = angle between the x axis and the line joining (x, y, 0) to (0, 0, 0)

and here are two more figures giving the side and top views of the previous figure.

The cartesian and spherical coordinates are related by


\[ \begin{aligned} x &= \rho\sin\varphi\cos\theta & y &= \rho\sin\varphi\sin\theta & z &= \rho\cos\varphi \\ \rho &= \sqrt{x^2+y^2+z^2} & \theta &= \arctan\frac{y}{x} & \varphi &= \arctan\frac{\sqrt{x^2+y^2}}{z} \end{aligned} \]
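
The relations above translate directly into code. The following is a minimal sketch of the two conversions; it makes one deliberate substitution, which is an assumption of this example rather than something stated in the text: it uses the two-argument `math.atan2` in place of arctan(y/x) and arctan(√(x² + y²)/z), so that the correct quadrant is selected and the cases x = 0 or z ≤ 0 do not need special handling.

```python
import math

def cartesian_to_spherical(x, y, z):
    """Return (rho, theta, phi), with phi measured from the positive z axis."""
    rho = math.sqrt(x * x + y * y + z * z)
    theta = math.atan2(y, x)                         # -pi < theta <= pi
    phi = math.atan2(math.sqrt(x * x + y * y), z)    # 0 <= phi <= pi
    return rho, theta, phi

def spherical_to_cartesian(rho, theta, phi):
    """Inverse conversion, using x = rho sin(phi) cos(theta), etc."""
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))

point = (-1.0, 2.0, -3.0)  # arbitrary sample point
print(spherical_to_cartesian(*cartesian_to_spherical(*point)))
```

With this choice θ is returned in (−π, π] rather than [0, 2π); either range describes the same points.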

Here are three figures showing a surface of constant ρ, a surface of constant θ, and a surface of constant φ.

Finally, here is a figure showing the volume element dV in spherical coordinates

and two extracts of the above figure to make it easier to see how the factors ρ dφ and ρ sin φ dθ arise.

A.7: ISO Coordinate System Notation
In this text we have chosen symbols for the various polar, cylindrical and spherical coordinates that are standard for mathematics.
There is another, different, set of symbols that are commonly used in the physical sciences and engineering. Indeed, there is an
international convention, called ISO 80000-2, that specifies those symbols 1. In this appendix, we summarize the definitions and
standard properties of the polar, cylindrical and spherical coordinate systems using the ISO symbols.

A.7.1 Polar Coordinates


In the ISO convention the symbols ρ and ϕ are used (instead of r and θ ) for polar coordinates.
ρ =  the distance from (0, 0) to (x, y)

ϕ =  the (counter-clockwise) angle between the x-axis 

 and the line joining (x, y) to (0, 0)

Cartesian and polar coordinates are related by

\[ \begin{aligned} x &= \rho\cos\phi & y &= \rho\sin\phi \\ \rho &= \sqrt{x^2+y^2} & \phi &= \arctan\frac{y}{x} \end{aligned} \]

The following two figures show a number of lines of constant ϕ, on the left, and curves of constant ρ, on the right.

Note that the polar angle ϕ is only defined up to integer multiples of 2π. For example, the point (1, 0) on the x-axis could have ϕ = 0, but could also have ϕ = 2π or ϕ = 4π. It is sometimes convenient to assign ϕ negative values. When ϕ < 0, the counter-clockwise angle ϕ refers to the clockwise angle |ϕ|. For example, the point (0, −1) on the negative y-axis can have ϕ = −π/2 and can also have ϕ = 3π/2.

It is also sometimes convenient to extend the above definitions by saying that x = ρ cos ϕ and y = ρ sin ϕ even when ρ is negative. For example, the following figure shows (x, y) for ρ = 1, ϕ = π/4 and for ρ = −1, ϕ = π/4.
A.7.1 https://math.libretexts.org/@go/page/92255
Both points lie on the line through the origin that makes an angle of 45° with the x-axis and both are a distance one from the origin. But they are on opposite sides of the origin.


The area element in polar coordinates is

dA = ρ dρ dϕ

A.7.2 Cylindrical Coordinates


In the ISO convention the symbols ρ, ϕ and z are used (instead of r, θ and z ) for cylindrical coordinates.

ρ =  distance from (0, 0, 0) to (x, y, 0)

ϕ = angle between the x axis and the line joining (x, y, 0) to (0, 0, 0)

z =  signed distance from (x, y, z) to the xy-plane

The cartesian and cylindrical coordinates are related by


\[ \begin{aligned} x &= \rho\cos\phi & y &= \rho\sin\phi & z &= z \\ \rho &= \sqrt{x^2+y^2} & \phi &= \arctan\frac{y}{x} & z &= z \end{aligned} \]

Here are three figures showing a surface of constant ρ, a surface of constant ϕ, and a surface of constant z.

Finally here is a figure showing the volume element dV in cylindrical coordinates.

A.7.3 Spherical Coordinates
In the ISO convention the symbols r (instead of ρ), ϕ (instead of θ ) and θ (instead of ϕ ) are used for spherical coordinates.

r =  distance from (0, 0, 0) to (x, y, z)

θ =  angle between the z axis and the line joining (x, y, z) to (0, 0, 0)

ϕ =  angle between the x axis and the line joining (x, y, 0) to (0, 0, 0)

Here are two more figures giving the side and top views of the previous figure.

The cartesian and spherical coordinates are related by

\[ \begin{aligned} x &= r\sin\theta\cos\phi & y &= r\sin\theta\sin\phi & z &= r\cos\theta \\ r &= \sqrt{x^2+y^2+z^2} & \phi &= \arctan\frac{y}{x} & \theta &= \arctan\frac{\sqrt{x^2+y^2}}{z} \end{aligned} \]
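
Because the ISO convention reuses the letters θ and ϕ with their roles exchanged, it is easy to mix the two up when moving between this text and physics or engineering references. The sketch below, with hypothetical helper names, spells out the relabelling and checks that both conventions describe the same Cartesian point.

```python
import math

def math_to_iso(rho, theta, phi):
    """(rho, theta, phi) in this text's convention -> (r, theta, phi) in ISO.

    The radius is unchanged; the two angle names swap roles.
    """
    return rho, phi, theta

def iso_to_math(r, theta, phi):
    """Inverse relabelling: ISO (r, theta, phi) -> this text's (rho, theta, phi)."""
    return r, phi, theta

def cartesian_from_math(rho, theta, phi):
    # phi is the angle from the z axis, theta the angle in the xy-plane
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))

def cartesian_from_iso(r, theta, phi):
    # theta is the angle from the z axis, phi the angle in the xy-plane
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

p = (2.0, 0.3, 1.2)  # arbitrary (rho, theta, phi) in this text's convention
print(cartesian_from_math(*p))
print(cartesian_from_iso(*math_to_iso(*p)))
```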

Here are three figures showing a surface of constant r, a surface of constant ϕ, and a surface of constant θ.

Finally, here is a figure showing the volume element dV in spherical coordinates

and two extracts of the above figure to make it easier to see how the factors r dθ and r sin θ dϕ arise.

1. It specifies more than just those symbols. See https://en.Wikipedia.org/wiki/ISO_31-11 and


https://en.Wikipedia.org/wiki/ISO/IEC_80000 The full ISO 80000-2 is available at https://www.iso.org/standard/64973.html.

A.8: Conic Sections and Quadric Surfaces
A conic section is the curve of intersection of a cone and a plane that does not pass through the vertex of the cone. This is
illustrated in the figures below.

An equivalent 1 (and often used) definition is that a conic section is the set of all points in the xy-plane that obey Q(x, y) = 0 with
\[ Q(x,y) = Ax^2 + By^2 + Cxy + Dx + Ey + F \]

being a polynomial of degree two 2. By rotating and translating our coordinate system the equation of the conic section can be brought into one of the forms 3
αx² + βy² = γ with α, β, γ > 0, which is an ellipse (or a circle),
αx² − βy² = γ with α, β > 0, γ ≠ 0, which is a hyperbola,
x² = δy, with δ ≠ 0, which is a parabola.

3. This statement can be justified using a linear algebra eigenvalue/eigenvector analysis. It is beyond what we can cover here, but is not too difficult for a standard linear algebra course.

The three dimensional analogs of conic sections, surfaces in three dimensions given by quadratic equations, are called quadrics. An
example is the sphere
2 2 2
x +y +z = 1.

Here are some tables giving all of the quadric surfaces.


Figure A.8.1. Table of conic sections
| name | elliptic cylinder | parabolic cylinder | hyperbolic cylinder | sphere |
|---|---|---|---|---|
| equation in standard form | \(\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1\) | \(y = ax^2\) | \(\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1\) | \(x^2 + y^2 + z^2 = r^2\) |
| x = constant cross-section | two lines | one line | two lines | circle |
| y = constant cross-section | two lines | two lines | two lines | circle |
| z = constant cross-section | ellipse | parabola | hyperbola | circle |

sketch

Figure A.8.2. Table of quadric surfaces-1


| name | ellipsoid | elliptic paraboloid | elliptic cone |
|---|---|---|---|
| equation in standard form | \(\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1\) | \(\frac{x^2}{a^2} + \frac{y^2}{b^2} = \frac{z}{c}\) | \(\frac{x^2}{a^2} + \frac{y^2}{b^2} = \frac{z^2}{c^2}\) |
| x = constant cross-section | ellipse | parabola | two lines if x = 0, hyperbola if x ≠ 0 |
| y = constant cross-section | ellipse | parabola | two lines if y = 0, hyperbola if y ≠ 0 |
| z = constant cross-section | ellipse | ellipse | ellipse |

A.8.1 https://math.libretexts.org/@go/page/92256

sketch

Figure A.8.3. Table of quadric surfaces-2


| name | hyperboloid of one sheet | hyperboloid of two sheets | hyperbolic paraboloid |
|---|---|---|---|
| equation in standard form | \(\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1\) | \(\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = -1\) | \(\frac{y^2}{b^2} - \frac{x^2}{a^2} = \frac{z}{c}\) |
| x = constant cross-section | hyperbola | hyperbola | parabola |
| y = constant cross-section | hyperbola | hyperbola | parabola |
| z = constant cross-section | ellipse | ellipse | two lines if z = 0, hyperbola if z ≠ 0 |

sketch

1. It is outside our scope to prove this equivalence.
2. Technically, we should also require that the constants A, B, C, D, E, F are real numbers, that A, B, C are not all zero, that Q(x, y) = 0 has more than one real solution, and that the polynomial can't be factored into the product of two polynomials of degree one.

B: Hints for Exercises

B.1 https://math.libretexts.org/@go/page/89222
C: Answers to Exercises

C.1 https://math.libretexts.org/@go/page/89223
D: Solutions to Exercises

D.1 https://math.libretexts.org/@go/page/89224
Index
C
Cartesian coordinates
    A.6: 3-D Coordinate Systems
circle
    A.8: Conic Sections and Quadric Surfaces
conic section
    A.8: Conic Sections and Quadric Surfaces
cylindrical coordinates
    3.6: Triple Integrals in Cylindrical Coordinates
    A.6: 3-D Coordinate Systems

D
Double Integrals
    3.1: Double Integrals
    3.3: Applications of Double Integrals

E
ellipse
    A.8: Conic Sections and Quadric Surfaces

H
Hyperbolas
    A.8: Conic Sections and Quadric Surfaces

P
parabola
    A.8: Conic Sections and Quadric Surfaces
polar coordinates
    A.7: ISO Coordinate System Notation

Q
quadratic surfaces
    A.8: Conic Sections and Quadric Surfaces
Quadric Surfaces
    1.9: Quadric Surfaces

S
scalar
    1.2: Vectors
spherical coordinates
    3.7: Triple Integrals in Spherical Coordinates
    A.6: 3-D Coordinate Systems
surface area
    3.4: Surface Area

T
triple integral
    3.5: Triple Integrals
triple integral in cylindrical coordinates
    3.6: Triple Integrals in Cylindrical Coordinates
triple integral in spherical coordinates
    3.7: Triple Integrals in Spherical Coordinates

V
vector
    1.2: Vectors

W
wave equation
    2.8: Optional — Solving the Wave Equation

Detailed Licensing
Overview
Title: CLP-3 Multivariable Calculus (Feldman, Rechnitzer, and Yeager)
Webpages: 56
Applicable Restrictions: Noncommercial
All licenses found:
CC BY-NC-SA 4.0: 94.6% (53 pages)
Undeclared: 5.4% (3 pages)

By Page
CLP-3 Multivariable Calculus (Feldman, Rechnitzer, and Yeager) - CC BY-NC-SA 4.0
Front Matter - CC BY-NC-SA 4.0
TitlePage - CC BY-NC-SA 4.0
InfoPage - CC BY-NC-SA 4.0
Table of Contents - Undeclared
Licensing - Undeclared
Colophon - CC BY-NC-SA 4.0
Feedback about the text - CC BY-NC-SA 4.0
Preface - CC BY-NC-SA 4.0
1: Vectors and Geometry in Two and Three Dimensions - CC BY-NC-SA 4.0
1.1: Points - CC BY-NC-SA 4.0
1.2: Vectors - CC BY-NC-SA 4.0
1.3: Equations of Lines in 2d - CC BY-NC-SA 4.0
1.4: Equations of Planes in 3d - CC BY-NC-SA 4.0
1.5: Equations of Lines in 3d - CC BY-NC-SA 4.0
1.6: Curves and their Tangent Vectors - CC BY-NC-SA 4.0
1.7: Sketching Surfaces in 3d - CC BY-NC-SA 4.0
1.8: Cylinders - CC BY-NC-SA 4.0
1.9: Quadric Surfaces - CC BY-NC-SA 4.0
2: Partial Derivatives - CC BY-NC-SA 4.0
2.1: Limits - CC BY-NC-SA 4.0
2.2: Partial Derivatives - CC BY-NC-SA 4.0
2.3: Higher Order Derivatives - CC BY-NC-SA 4.0
2.4: The Chain Rule - CC BY-NC-SA 4.0
2.5: Tangent Planes and Normal Lines - CC BY-NC-SA 4.0
2.6: Linear Approximations and Error - CC BY-NC-SA 4.0
2.7: Directional Derivatives and the Gradient - CC BY-NC-SA 4.0
2.8: Optional — Solving the Wave Equation - CC BY-NC-SA 4.0
2.9: Maximum and Minimum Values - CC BY-NC-SA 4.0
2.10: Lagrange Multipliers - CC BY-NC-SA 4.0
3: Multiple Integrals - CC BY-NC-SA 4.0
3.1: Double Integrals - CC BY-NC-SA 4.0
3.2: Double Integrals in Polar Coordinates - CC BY-NC-SA 4.0
3.3: Applications of Double Integrals - CC BY-NC-SA 4.0
3.4: Surface Area - CC BY-NC-SA 4.0
3.5: Triple Integrals - CC BY-NC-SA 4.0
3.6: Triple Integrals in Cylindrical Coordinates - CC BY-NC-SA 4.0
3.7: Triple Integrals in Spherical Coordinates - CC BY-NC-SA 4.0
3.8: Optional — Integrals in General Coordinates - CC BY-NC-SA 4.0
4: Appendices - CC BY-NC-SA 4.0
A: Appendices - CC BY-NC-SA 4.0
A.1: Trigonometry - CC BY-NC-SA 4.0
A.2: Powers and Logarithms - CC BY-NC-SA 4.0
A.3: Table of Derivatives - CC BY-NC-SA 4.0
A.4: Table of Integrals - CC BY-NC-SA 4.0
A.5: Table of Taylor Expansions - CC BY-NC-SA 4.0
A.6: 3-D Coordinate Systems - CC BY-NC-SA 4.0
A.7: ISO Coordinate System Notation - CC BY-NC-SA 4.0
A.8: Conic Sections and Quadric Surfaces - CC BY-NC-SA 4.0
B: Hints for Exercises - CC BY-NC-SA 4.0
C: Answers to Exercises - CC BY-NC-SA 4.0
D: Solutions to Exercises - CC BY-NC-SA 4.0
Back Matter - CC BY-NC-SA 4.0
Index - CC BY-NC-SA 4.0
Glossary - CC BY-NC-SA 4.0
Detailed Licensing - Undeclared
