Calculus
The LibreTexts mission is to unite students, faculty and scholars in a cooperative effort to develop an easy-to-use online platform
for the construction, customization, and dissemination of OER content to reduce the burdens of unreasonable textbook costs to our
students and society. The LibreTexts project is a multi-institutional collaborative venture to develop the next generation of open-
access texts to improve postsecondary education at all levels of higher learning by developing an Open Access Resource
environment. The project currently consists of 14 independently operating and interconnected libraries that are constantly being
optimized by students, faculty, and outside experts to supplant conventional paper-based books. These free textbook alternatives are
organized within a central environment that is both vertically (from advanced to basic level) and horizontally (across different fields)
integrated.
The LibreTexts libraries are Powered by MindTouch® and are supported by the Department of Education Open Textbook Pilot
Project, the UC Davis Office of the Provost, the UC Davis Library, the California State University Affordable Learning Solutions
Program, and Merlot. This material is based upon work supported by the National Science Foundation under Grant No. 1246120,
1525057, and 1413739. Unless otherwise noted, LibreTexts content is licensed by CC BY-NC-SA 3.0.
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not
necessarily reflect the views of the National Science Foundation nor the US Department of Education.
Have questions or comments? For information about adoptions or adaptions contact [email protected]. More information on our
activities can be found via Facebook (https://facebook.com/Libretexts), Twitter (https://twitter.com/libretexts), or our blog
(http://Blog.Libretexts.org).
2: Partial Derivatives
2.1: Limits
2.2: Partial Derivatives
2.3: Higher Order Derivatives
2.4: The Chain Rule
2.5: Tangent Planes and Normal Lines
2.6: Linear Approximations and Error
2.7: Directional Derivatives and the Gradient
2.8: Optional — Solving the Wave Equation
2.9: Maximum and Minimum Values
2.10: Lagrange Multipliers
3: Multiple Integrals
3.1: Double Integrals
3.2: Double Integrals in Polar Coordinates
3.3: Applications of Double Integrals
3.4: Surface Area
3.5: Triple Integrals
3.6: Triple Integrals in Cylindrical Coordinates
3.7: Triple Integrals in Spherical Coordinates
3.8: Optional— Integrals in General Coordinates
4: Appendices
A: Appendices
A.1: Trigonometry
A.2: Powers and Logarithms
A.3: Table of Derivatives
1 https://math.libretexts.org/@go/page/91890
A.4: Table of Integrals
A.5: Table of Taylor Expansions
A.6: 3-D Coordinate Systems
A.7: ISO Coordinate System Notation
A.8: Conic Sections and Quadric Surfaces
B: Hints for Exercises
C: Answers to Exercises
D: Solutions to Exercises
Index
Detailed Licensing
Licensing
A detailed breakdown of this resource's licensing can be found in Back Matter/Detailed Licensing.
1 https://math.libretexts.org/@go/page/115427
Colophon
Cover Design Nick Loewen — licensed under the CC-BY-NC-SA 4.0 License.
Source files A link to the source files for this document can be found at the CLP textbook website. The sources are licensed under
the CC-BY-NC-SA 4.0 License.
Edition CLP3 Multivariable Calculus: May 2021
Website CLP-3
©2016 – 2021 Joel Feldman, Andrew Rechnitzer, Elyse Yeager
This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. You can
view a copy of the license here.
1 https://math.libretexts.org/@go/page/92299
Feedback about the text
The CLP-3 Multivariable Calculus text is still undergoing testing and changes. Because of this we request that if you find a
problem or error in the text then:
1. Please check the errata list that can be found at the text webpage.
2. Is the problem in the online version or the PDF version or both?
3. Note the URL of the online version and the page number in the PDF
4. Send an email to [email protected] . Please be sure to include
a description of the error
the URL of the page, if found in the online edition
and if the problem also exists in the PDF, then the page number in the PDF and the compile date on the front page of PDF.
1 https://math.libretexts.org/@go/page/92301
Preface
This text is a merger of the CLP Multivariable Calculus textbook and problembook. It is, at the time that we write this, still a work
in progress; some bits and pieces around the edges still need polish. Consequently we recommend to the student that they still
consult the text webpage for links to the errata, especially if they think there might be a typo or error. We also request that you send
us an email at [email protected].
Additionally, if you are not a student at UBC and are using these texts, please send us an email (again using the feedback button);
we'd love to hear from you.
Joel Feldman, Andrew Rechnitzer and Elyse Yeager
1 https://math.libretexts.org/@go/page/92300
CHAPTER OVERVIEW
1: Vectors and Geometry in Two and Three Dimensions
Before we get started doing calculus in two and three dimensions we need to brush up on some basic geometry that we will use a
lot. We are already familiar with the Cartesian plane¹, but we'll start from the beginning.
1. René Descartes (1596–1650) was a French scientist and philosopher, who lived in the Dutch Republic for roughly twenty years
after serving in the (mercenary) Dutch States Army. He is viewed as the father of analytic geometry, which uses numbers to
study geometry.
1.1: Points
1.2: Vectors
1.3: Equations of Lines in 2d
1.4: Equations of Planes in 3d
1.5: Equations of Lines in 3d
1.6: Curves and their Tangent Vectors
1.7: Sketching Surfaces in 3d
1.8: Cylinders
1.9: Quadric Surfaces
This page titled 1: Vectors and Geometry in Two and Three Dimensions is shared under a CC BY-NC-SA 4.0 license and was authored, remixed,
and/or curated by Joel Feldman, Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the
LibreTexts platform; a detailed edit history is available upon request.
1.1: Points
Each point in two dimensions may be labeled by two coordinates¹ (x, y) which specify the position of the point in some units with
respect to some axes as in the figure below.
The set of all points in two dimensions is denoted² $\mathbb{R}^2$. Observe that
the distance from the point (x, y) to the x-axis is |y|
the distance from the point (x, y) to the y-axis is |x|
the distance from the point (x, y) to the origin (0, 0) is $\sqrt{x^2+y^2}$
Similarly, each point in three dimensions may be labeled by three coordinates (x, y, z), as in the two figures below.
The set of all points in three dimensions is denoted $\mathbb{R}^3$. The plane that contains, for example, the x- and y-axes is called the xy-plane.
The xy-plane is the set of all points (x, y, z) that satisfy z = 0.
The xz-plane is the set of all points (x, y, z) that satisfy y = 0.
The yz-plane is the set of all points (x, y, z) that satisfy x = 0.
More generally,
The set of all points (x, y, z) that obey z = c is a plane that is parallel to the xy-plane and is a distance |c| from it. If c > 0, the
plane z = c is above the xy-plane. If c < 0, the plane z = c is below the xy-plane. We say that the plane z = c is a signed
distance c from the xy-plane.
The set of all points (x, y, z) that obey y = b is a plane that is parallel to the xz-plane and is a signed distance b from it.
The set of all points (x, y, z) that obey x = a is a plane that is parallel to the yz-plane and is a signed distance a from it.
1.1.1 https://math.libretexts.org/@go/page/89203
Observe that our 2d distances extend quite easily to 3d.
the distance from the point (x, y, z) to the xy-plane is |z|
the distance from the point (x, y, z) to the xz-plane is |y|
the distance from the point (x, y, z) to the yz-plane is |x|
the distance from the point (x, y, z) to the origin (0, 0, 0) is $\sqrt{x^2+y^2+z^2}$
To see that the distance from the point (x, y, z) to the origin (0, 0, 0) is indeed $\sqrt{x^2+y^2+z^2}$,
apply Pythagoras to the right-angled triangle with vertices (0, 0, 0), (x, 0, 0) and (x, y, 0) to see that the distance from (0, 0, 0)
to (x, y, 0) is $\sqrt{x^2+y^2}$ and then
apply Pythagoras to the right-angled triangle with vertices (0, 0, 0), (x, y, 0) and (x, y, z) to see that the distance from (0, 0, 0)
to (x, y, z) is $\sqrt{\big(\sqrt{x^2+y^2}\big)^2+z^2} = \sqrt{x^2+y^2+z^2}$.
More generally, the distance from the point (x, y, z) to the point (x′, y′, z′) is
$$\sqrt{(x-x')^2+(y-y')^2+(z-z')^2}$$
Notice that this gives us the equation for a sphere quite directly. All the points on a sphere are equidistant from the centre of the
sphere. So, for example, the equation of the sphere centered on (1, 2, 3) with radius 4, that is, the set of all points (x, y, z) whose
distance from (1, 2, 3) is 4, is
$$(x-1)^2+(y-2)^2+(z-3)^2=16$$
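The distance formula and the sphere equation above can be checked numerically. The following is a minimal sketch, not part of the original text; the helper names `distance` and `on_sphere` and the sample points are our own choices for illustration.

```python
import math

def distance(p, q):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

def on_sphere(point, centre, radius, tol=1e-9):
    """True if `point` is (numerically) at distance `radius` from `centre`."""
    return abs(distance(point, centre) - radius) <= tol

# Distance from (1, 2, 2) to the origin: sqrt(1 + 4 + 4)
origin_distance = distance((1, 2, 2), (0, 0, 0))

# (5, 2, 3) satisfies (x - 1)^2 + (y - 2)^2 + (z - 3)^2 = 16
sphere_check = on_sphere((5, 2, 3), (1, 2, 3), 4)
```

The same `distance` helper works in two dimensions, since it only relies on the points having the same number of coordinates.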
Here is an example in which we sketch a region in the xy-plane that is specified using inequalities.
Example 1.1.1
In this example, we sketch the region
$$\{(x,y)\ \big|\ -12\le x^2-6x+y^2-4y\le -9,\ y\ge 1\}$$
in the xy-plane.
We do so in two steps. In the first step, we sketch the curves $x^2-6x+y^2-4y=-12$, $x^2-6x+y^2-4y=-9$, and $y=1$.
By completing squares, we see that the equation $x^2-6x+y^2-4y=-12$ is equivalent to $(x-3)^2+(y-2)^2=1$,
which is the circle of radius 1 centred on (3, 2). It is sketched in the figure below.
By completing squares, we see that the equation $x^2-6x+y^2-4y=-9$ is equivalent to $(x-3)^2+(y-2)^2=4$,
which is the circle of radius 2 centred on (3, 2). It is sketched in the figure below.
The point (x, y) obeys y = 1 if and only if it is a distance 1 vertically above the x-axis. So y = 1 is the line that is parallel
to the x-axis and is one unit above it. This line is also sketched in the figure below.
In the second step we determine the impact that the inequalities have.
The inequality $x^2-6x+y^2-4y\ge -12$ is equivalent to $(x-3)^2+(y-2)^2\ge 1$ and hence is equivalent to
$\sqrt{(x-3)^2+(y-2)^2}\ge 1$. So the point (x, y) satisfies $x^2-6x+y^2-4y\ge -12$ if and only if the distance from (x, y)
to (3, 2) is at least 1, i.e. if and only if (x, y) is outside (or on) the circle $(x-3)^2+(y-2)^2=1$.
The inequality $x^2-6x+y^2-4y\le -9$ is equivalent to $(x-3)^2+(y-2)^2\le 4$ and hence is equivalent to
$\sqrt{(x-3)^2+(y-2)^2}\le 2$. So the point (x, y) satisfies the inequality $x^2-6x+y^2-4y\le -9$ if and only if the
distance from (x, y) to (3, 2) is at most 2, i.e. if and only if (x, y) is inside (or on) the circle $(x-3)^2+(y-2)^2=4$.
The point (x, y) obeys y ≥ 1 if and only if (x, y) is a vertical distance at least 1 above the x-axis, i.e. is above (or on) the
line y = 1.
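So the region
$$\{(x,y)\ \big|\ -12\le x^2-6x+y^2-4y\le -9,\ y\ge 1\}$$
consists of all points (x, y) that are inside (or on) the circle $(x-3)^2+(y-2)^2=4$, outside (or on) the circle $(x-3)^2+(y-2)^2=1$, and above (or on) the line y = 1.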
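As a sanity check on the completing-the-square step, here is a small sketch that tests membership in the region in two ways: directly from the inequalities, and from the squared distance to (3, 2). The function names and the sample points are our own, chosen only for illustration.

```python
def in_region(x, y):
    """Membership test straight from -12 <= x^2 - 6x + y^2 - 4y <= -9, y >= 1."""
    value = x**2 - 6*x + y**2 - 4*y
    return -12 <= value <= -9 and y >= 1

def in_region_geometric(x, y):
    """Same test, phrased via the squared distance from the centre (3, 2)."""
    d2 = (x - 3)**2 + (y - 2)**2
    return 1 <= d2 <= 4 and y >= 1

# Hand-picked sample points: two inside the region, three outside it
samples = [(3, 3.5), (4.5, 2), (3, 2), (3, 0.5), (6, 2)]
agreement = all(in_region(x, y) == in_region_geometric(x, y) for x, y in samples)
```

The two tests agree because adding 13 to each part of the double inequality turns it into a statement about $(x-3)^2+(y-2)^2$.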
Example 1.1.2
In this example, we are going to find the curve formed by the intersection of the xy-plane and the sphere of radius 5 centred on
(0, 0, 4).
The point (x, y, z) lies on the xy-plane if and only if z = 0, and lies on the sphere of radius 5 centred on (0, 0, 4) if and only if
$x^2+y^2+(z-4)^2=25$. So the point (x, y, z) lies on the curve of intersection if and only if both z = 0 and
$x^2+y^2+(z-4)^2=25$, or equivalently
$$z=0,\ x^2+y^2+(0-4)^2=25 \iff z=0,\ x^2+y^2=9$$
This is the circle in the xy-plane that is centred on the origin and has radius 3. Here is a sketch that shows the parts of the sphere
and the circle of intersection that are in the first octant, that is, that have x ≥ 0, y ≥ 0 and z ≥ 0.
Example 1.1.3
In this example, we are going to find all points (x, y, z) for which the distance from (x, y, z) to (9, −12, 15) is twice the
distance from (x, y, z) to the origin (0, 0, 0).
The distance from (x, y, z) to (9, −12, 15) is $\sqrt{(x-9)^2+(y+12)^2+(z-15)^2}$. The distance from (x, y, z) to (0, 0, 0) is
$\sqrt{x^2+y^2+z^2}$. So we want to find all points (x, y, z) for which
$$\sqrt{(x-9)^2+(y+12)^2+(z-15)^2}=2\sqrt{x^2+y^2+z^2}$$
Squaring both sides, collecting terms and completing squares gives
$$\begin{aligned}
x^2-18x+81+y^2+24y+144+z^2-30z+225 &= 4(x^2+y^2+z^2)\\
3x^2+18x+3y^2-24y+3z^2+30z &= 450\\
x^2+6x+y^2-8y+z^2+10z &= 150\\
(x+3)^2+(y-4)^2+(z+5)^2 &= 200
\end{aligned}$$
This is the sphere of radius $10\sqrt{2}$ centred on (−3, 4, −5).
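The conclusion of this example can be spot-checked numerically. The sketch below takes a few points that we picked by hand to lie on the sphere of radius $10\sqrt{2}$ centred on (−3, 4, −5), and confirms that each is twice as far from (9, −12, 15) as from the origin. The names `dist`, `ratio` and `samples` are our own.

```python
import math

A = (9, -12, 15)        # the given point
CENTRE = (-3, 4, -5)    # centre of the sphere
RADIUS_SQ = 200         # (10 * sqrt(2)) ** 2

def dist(p, q):
    """Euclidean distance between two points in 3d."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def ratio(p):
    """Distance from p to A divided by the distance from p to the origin."""
    return dist(p, A) / dist(p, (0, 0, 0))

# Centre plus the offsets (10, 10, 0), (0, 10, 10) and (10, 0, 10)
samples = [(7, 14, -5), (-3, 14, 5), (7, 4, 5)]
ratios = [ratio(p) for p in samples]
```

A check like this does not prove the result, but it catches sign errors in the completed squares quickly.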
Exercises
Stage 1
1
1. $x^2+y^2+z^2=2x-4y+4$
2. $x^2+y^2+z^2<2x-4y+4$
2
Describe and sketch the set of all points (x, y) in $\mathbb{R}^2$ that satisfy
1. x = y
2. x + y = 1
3. $x^2+y^2=4$
4. $x^2+y^2=2y$
5. $x^2+y^2<2y$
3
Describe the set of all points (x, y, z) in $\mathbb{R}^3$ that satisfy the following conditions. Sketch the part of the set that is in the first
octant.
1. z = x
2. x + y + z = 1
3. $x^2+y^2+z^2=4$
4. $x^2+y^2+z^2=4$, z = 1
5. $x^2+y^2=4$
6. $z=x^2+y^2$
4
Stage 2
5
Consider any triangle. Pick a coordinate system so that one vertex is at the origin and a second vertex is on the positive x-axis.
Call the coordinates of the second vertex (a, 0) and those of the third vertex (b, c). Find the circumscribing circle (the circle
that goes through all three vertices).
6. ✳
A certain surface consists of all points P = (x, y, z) such that the distance from P to the point (0, 0, 1) is equal to the distance
from P to the plane z + 1 = 0. Find an equation for the surface, sketch and describe it verbally.
7
Show that the set of all points P that are twice as far from (3, −2, 3) as from (3/2, 1, 0) is a sphere. Find its centre and radius.
Stage 3
8
The pressure p(x, y) at the point (x, y) is at least zero and is determined by the equation $x^2-2px+y^2=3p^2$. Sketch several
isobars. An isobar is a curve with equation p(x, y) = c for some constant c ≥ 0.
1. This is why the xy-plane is called “two dimensional” — the name of each point consists of two real numbers.
2. Not surprisingly, the 2 in $\mathbb{R}^2$ signifies that each point is labelled by two numbers and the $\mathbb{R}$ in $\mathbb{R}^2$ signifies that the numbers in
question are real numbers. There are more advanced applications (for example in signal analysis and in quantum mechanics)
where complex numbers are used. The space of all pairs $(z_1, z_2)$, with $z_1$ and $z_2$ complex numbers, is denoted $\mathbb{C}^2$.
This page titled 1.1: Points is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Joel Feldman, Andrew
Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is
available upon request.
1.2: Vectors
In many of our applications in 2d and 3d, we will encounter quantities that have both a magnitude (like a distance) and also a
direction. Such quantities are called vectors. That is, a vector is a quantity which has both a direction and a magnitude, like a
velocity. If you are moving, the magnitude (length) of your velocity vector is your speed (distance travelled per unit time) and the
direction of your velocity vector is your direction of motion. To specify a vector in three dimensions you have to give three
components, just as for a point. To draw the vector with components a, b, c you can draw an arrow from the point (0, 0, 0) to the
point (a, b, c).
Similarly, to specify a vector in two dimensions you have to give two components and to draw the vector with components a, b you
can draw an arrow from the point (0, 0) to the point (a, b).
There are many situations in which it is preferable to draw a vector with its tail at some point other than the origin. For example, it
is natural to draw the velocity vector of a moving particle with the tail of the velocity vector at the position of the particle, whether
or not the particle is at the origin. The sketch below shows a moving particle and its velocity vector at two different times.
As a second example, suppose that you are analyzing the motion of a pendulum. There are three forces acting on the pendulum
bob: gravity g, which is pulling the bob straight down, tension t in the rod, which is pulling the bob in the direction of the rod, and
air resistance r, which is pulling the bob in a direction opposite to its direction of motion. All three forces are acting on the bob. So
it is natural to draw all three arrows representing the forces with their tails at the bob.
In this text, we will use bold faced letters, like v, t, g, to designate vectors. In handwriting, it is clearer to use a small overhead
arrow¹, as in $\vec{v}$, $\vec{t}$, $\vec{g}$, instead. Also, when we want to emphasise that some quantity is a number, rather than a vector, we will call
the number a scalar.
Both points and vectors in 2d are specified by two numbers. Until you get used to this, it might confuse you sometimes — does a
given pair of numbers represent a point or a vector? To distinguish² between the components of a vector and the coordinates of the
point at its head, when its tail is at some point other than the origin, we shall use angle brackets rather than round brackets around
the components of a vector. For example, the figure below shows the two-dimensional vector ⟨2, 1⟩ drawn in three different
positions. In each case, when the tail is at the point (u, v) the head is at (2 + u, 1 + v). We warn you that, out in the real world³,
1.2.1 https://math.libretexts.org/@go/page/89204
no one uses notation that distinguishes between components of a vector and the coordinates of its head — usually round brackets
are used for both. It is up to you to keep straight which is being referred to.
By way of summary,
Definition 1.2.1
we use
bold faced letters, like v, t, g, to designate vectors, and
angle brackets, like ⟨2, 1⟩ , around the components of a vector, but use
round brackets, like (2, 1), around the coordinates of a point, and use
“scalar” to emphasise that some quantity is a number, rather than a vector.
Suppose that we take a step which moves us $a_1$ units parallel to the x-axis and $a_2$ units parallel to the y-axis. Then
we say that $\langle a_1, a_2\rangle$ is the displacement vector for the step. Suppose now that we take a second step which moves us an
additional $b_1$ units parallel to the x-axis and an additional $b_2$ units parallel to the y-axis, as in the figure on the left below. So the
displacement vector for the second step is $\langle b_1, b_2\rangle$. All together, we have moved $a_1+b_1$ units parallel to the x-axis and
$a_2+b_2$ units parallel to the y-axis. The displacement vector for the two steps combined is $\langle a_1+b_1, a_2+b_2\rangle$. We shall define
the sum of $\langle a_1, a_2\rangle$ and $\langle b_1, b_2\rangle$ to be
$$\langle a_1, a_2\rangle+\langle b_1, b_2\rangle = \langle a_1+b_1, a_2+b_2\rangle$$
Suppose now that, instead, we decide to step in the same direction as the first step above, but to move twice as far, as in the
figure on the right below. That is, our step will move us $2a_1$ units in the direction of the x-axis and $2a_2$ units in the direction of
the y-axis and the corresponding displacement vector will be $\langle 2a_1, 2a_2\rangle$. We shall define the product of the number 2 and the
vector $\langle a_1, a_2\rangle$ to be
$$2\langle a_1, a_2\rangle = \langle 2a_1, 2a_2\rangle$$
Pictorially, you add the vector b to the vector a by drawing b with its tail at the head of a and then drawing a vector from the tail
of a to the head of b, as in the figure on the left below. For a number s, we can draw the vector sa, by just
changing the vector a's length by the factor |s|, and,
if s < 0, reversing the arrow's direction,
as in the other two figures below.
The special case of multiplication by s = −1 appears so frequently that (−1)a is given the shorter notation −a. That is,
− ⟨a1 , a2 ⟩ = ⟨−a1 , −a2 ⟩
The operations of addition and multiplication by a scalar that we have just defined are quite natural and rarely cause any problems,
because they inherit from the real numbers the properties of addition and multiplication that you are used to.
We have just been introduced to many definitions. Let's see some of them in action.
Example 1.2.4
For example, if
a = ⟨1, 2, 3⟩    b = ⟨3, 2, 1⟩    c = ⟨1, 0, 1⟩
then
2a = 2 ⟨1, 2, 3⟩ = ⟨2, 4, 6⟩
3c = 3 ⟨1, 0, 1⟩ = ⟨3, 0, 3⟩
and
2a − b + 3c = ⟨2, 4, 6⟩ + ⟨−3, −2, −1⟩ + ⟨3, 0, 3⟩
= ⟨2 − 3 + 3 , 4 − 2 + 0 , 6 − 1 + 3⟩
= ⟨2, 2, 8⟩
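The computation in this example can be reproduced with a few lines of code. This is a minimal sketch in which vectors are stored as plain tuples; the helper names `add` and `scale` are our own, and b = ⟨3, 2, 1⟩ is read off from the term −b = ⟨−3, −2, −1⟩ in the computation.

```python
def add(u, v):
    """Componentwise sum of two vectors of the same dimension."""
    return tuple(ui + vi for ui, vi in zip(u, v))

def scale(s, v):
    """Product of the scalar s and the vector v."""
    return tuple(s * vi for vi in v)

a = (1, 2, 3)
b = (3, 2, 1)
c = (1, 0, 1)

# 2a - b + 3c, built from the two operations defined in the text
result = add(add(scale(2, a), scale(-1, b)), scale(3, c))
```

Writing −b as `scale(-1, b)` mirrors the convention that −a is shorthand for (−1)a.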
Definition 1.2.5
There are some vectors that occur sufficiently commonly that they are given special names. One is the vector 0. Some others are
the “standard basis vectors”.
Definition 1.2.6
A sum of numbers times vectors, like $a_1\hat{\imath} + a_2\hat{\jmath}$, is called a linear combination of the vectors. Thus all vectors can be expressed as
linear combinations of the standard basis vectors. This makes basis vectors very helpful in computations. The standard basis
vectors are unit vectors, meaning that they are of length one, where the length of a vector a is denoted⁴ |a| and is defined by
$$\mathbf{a} = \langle a_1, a_2, a_3\rangle \implies |\mathbf{a}| = \sqrt{a_1^2+a_2^2+a_3^2}$$
A unit vector is a vector of length one. We'll sometimes use the accent ^ to emphasise that the vector $\hat{\mathbf{a}}$ is a unit vector. That is,
$|\hat{\mathbf{a}}| = 1$.
Example 1.2.8
Recall that multiplying a vector a by a positive number s changes the length of the vector by a factor s without changing the
direction of the vector. So (assuming that |a| ≠ 0) $\frac{\mathbf{a}}{|\mathbf{a}|}$ is a unit vector that has the same direction as a. For example, $\frac{\langle 1,1,1\rangle}{\sqrt{3}}$ is a
unit vector that points in the same direction as ⟨1, 1, 1⟩.
Example 1.2.9
We go for a walk on a flat Earth. We use a coordinate system with the positive x-axis pointing due east and the positive y-axis
pointing due north. We
start at the origin and
walk due east for 4 units and then
walk northeast for $5\sqrt{2}$ units and then
head towards the point (0, 11), but we only go
one third of the way.
Looking at the figure on the right above, we see that our displacement vector, for the second leg of the walk, has to be in
the same direction as the vector ⟨1, 1⟩. So our displacement vector is the vector of length $5\sqrt{2}$ with the same direction as
⟨1, 1⟩. The vector ⟨1, 1⟩ has length $\sqrt{1^2+1^2}=\sqrt{2}$ and so $\frac{\langle 1,1\rangle}{\sqrt{2}}$ has length one and our displacement vector is
$$5\sqrt{2}\,\frac{\langle 1,1\rangle}{\sqrt{2}} = 5\langle 1,1\rangle = \langle 5,5\rangle$$
If we draw this displacement vector, ⟨5, 5⟩ with its tail at (4, 0), the starting point of the second leg of the walk, then its
head will be at (4 + 5, 0 + 5) = (9, 5) and that is the end point of the second leg of the walk.
On the final leg of our walk, we start at (9, 5) and walk towards (0, 11). The vector from (9, 5) to (0, 11) is
⟨0 − 9 , 11 − 5⟩ = ⟨−9, 6⟩ . As we go only one third of the way, our final displacement vector is
$$\frac{1}{3}\langle -9, 6\rangle = \langle -3, 2\rangle$$
If we draw this displacement vector with its tail at (9, 5), the starting point of the final leg, then its head will be at
(9 − 3, 5 + 2) = (6, 7) and that is the end point of the final leg of the walk, and our final location.
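The whole walk can be retraced by adding displacement vectors one leg at a time. The following sketch is our own illustration; the helpers `add`, `scale` and `unit` are hypothetical names, and the first leg is taken to be the displacement ⟨4, 0⟩ (due east for 4 units, starting at the origin).

```python
import math

def add(p, v):
    """Move the point p by the displacement vector v."""
    return tuple(pi + vi for pi, vi in zip(p, v))

def scale(s, v):
    """Product of the scalar s and the vector v."""
    return tuple(s * vi for vi in v)

def unit(v):
    """Unit vector with the same direction as the nonzero 2d vector v."""
    length = math.hypot(v[0], v[1])
    return tuple(vi / length for vi in v)

position = (0.0, 0.0)
position = add(position, (4.0, 0.0))                             # leg 1: due east
position = add(position, scale(5 * math.sqrt(2), unit((1, 1))))  # leg 2: northeast
after_second_leg = position

target = (0.0, 11.0)
to_target = tuple(t - p for t, p in zip(target, position))
position = add(position, scale(1 / 3, to_target))                # leg 3: one third of the way
```

Because of floating point rounding, the computed positions should be compared with (9, 5) and (6, 7) using a tolerance rather than exact equality.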
“scalar plus scalar”, “scalar plus vector” and “vector plus vector”
“scalar times scalar”, “scalar times vector” and “vector times vector”
We have been using “scalar plus scalar” and “scalar times scalar” since childhood. “vector plus vector” and “scalar times vector”
were just defined above. There is no sensible way to define “scalar plus vector”, so we won't. This leaves “vector times vector”.
There are actually two widely used such products. The first is the dot product, which is the topic of this section, and which is used
to easily determine the angle θ (or more precisely, cos θ) between two vectors. We'll get to the second, the cross product, later.
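Here is a preview of what we will do in this dot product subsection §1.2.2. We are going to give two formulae for the dot product,
a ⋅ b, of the pair of vectors $\mathbf{a} = \langle a_1, a_2, a_3\rangle$ and $\mathbf{b} = \langle b_1, b_2, b_3\rangle$.
The first formula is $\mathbf{a}\cdot\mathbf{b} = a_1b_1 + a_2b_2 + a_3b_3$. We will take it as our official definition of a ⋅ b. This formula provides us
with an easy way to compute dot products.
The second formula is $\mathbf{a}\cdot\mathbf{b} = |\mathbf{a}|\,|\mathbf{b}|\cos\theta$, where θ is the angle between a and b.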
We will show, in Theorem 1.2.11 below, that this second formula always gives the same answer as the first formula. The second
formula provides us with an easy way to determine the angle between two vectors. In particular, it provides us with an easy way
to test whether or not two vectors are perpendicular to each other. For example, the vectors ⟨1, 2, 3⟩ and ⟨−1, −1, 1⟩ have dot
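product
$$\langle 1,2,3\rangle\cdot\langle -1,-1,1\rangle = 1\times(-1)+2\times(-1)+3\times 1 = 0$$
This tells us that the angle θ between the two vectors obeys cos θ = 0, so that $\theta = \frac{\pi}{2}$. That is, the two vectors are perpendicular to
each other.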
After we give our official definition of the dot product in Definition 1.2.10, and give the important properties of the dot product,
including the formula a ⋅ b = |a| |b| cos θ, in Theorem 1.2.11, we'll give some examples. Finally, to see the dot product in action,
we'll define what it means to project one vector on another vector and give an example.
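$$\mathbf{a} = \langle a_1, a_2, a_3\rangle,\ \mathbf{b} = \langle b_1, b_2, b_3\rangle \implies \mathbf{a}\cdot\mathbf{b} = a_1b_1 + a_2b_2 + a_3b_3$$
(1) $\mathbf{a}\cdot\mathbf{a} = |\mathbf{a}|^2$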
(2) a⋅ b = b ⋅ a
(3) a ⋅ (b + c) = a ⋅ b + a ⋅ c, (a + b) ⋅ c = a ⋅ c + b ⋅ c
(5) 0⋅a=0
Proof.
Properties 0 through 5 are almost immediate consequences of the definition. For example, for property 3 (which is called the
distributive law) in dimension 2,
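$$\begin{aligned}
\mathbf{a}\cdot(\mathbf{b}+\mathbf{c}) &= \langle a_1, a_2\rangle\cdot\langle b_1+c_1, b_2+c_2\rangle\\
&= a_1(b_1+c_1)+a_2(b_2+c_2) = a_1b_1+a_1c_1+a_2b_2+a_2c_2\\
&= a_1b_1+a_2b_2+a_1c_1+a_2c_2\\
&= \mathbf{a}\cdot\mathbf{b}+\mathbf{a}\cdot\mathbf{c}
\end{aligned}$$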
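Property 6 is sufficiently important that it is often used as the definition of dot product. It is not at all an obvious consequence of
the definition. To verify it, we just write $|\mathbf{a}-\mathbf{b}|^2$ in two different ways. The first expresses $|\mathbf{a}-\mathbf{b}|^2$ in terms of a ⋅ b. It is
$$\begin{aligned}
|\mathbf{a}-\mathbf{b}|^2 &\overset{1}{=} (\mathbf{a}-\mathbf{b})\cdot(\mathbf{a}-\mathbf{b})\\
&\overset{3}{=} \mathbf{a}\cdot\mathbf{a}-\mathbf{a}\cdot\mathbf{b}-\mathbf{b}\cdot\mathbf{a}+\mathbf{b}\cdot\mathbf{b}\\
&\overset{1,2}{=} |\mathbf{a}|^2+|\mathbf{b}|^2-2\,\mathbf{a}\cdot\mathbf{b}
\end{aligned}$$
Here, $\overset{1}{=}$, for example, means that the equality is a consequence of property 1. The second way we write $|\mathbf{a}-\mathbf{b}|^2$ involves cos θ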
and follows from the cosine law for triangles. Just in case you don't remember the cosine law, we'll derive it right now! Start by
applying Pythagoras to the shaded triangle in the right hand figure of
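That triangle is a right triangle whose hypotenuse has length |a − b| and whose other two sides have lengths (|b| − |a| cos θ)
and |a| sin θ. So Pythagoras gives
$$\begin{aligned}
|\mathbf{a}-\mathbf{b}|^2 &= (|\mathbf{b}|-|\mathbf{a}|\cos\theta)^2+(|\mathbf{a}|\sin\theta)^2\\
&= |\mathbf{b}|^2-2|\mathbf{a}|\,|\mathbf{b}|\cos\theta+|\mathbf{a}|^2\cos^2\theta+|\mathbf{a}|^2\sin^2\theta\\
&= |\mathbf{b}|^2-2|\mathbf{a}|\,|\mathbf{b}|\cos\theta+|\mathbf{a}|^2
\end{aligned}$$
This is the cosine law. Observe that, when $\theta=\frac{\pi}{2}$, this reduces to, (surprise!) Pythagoras' theorem.
Setting our two expressions for $|\mathbf{a}-\mathbf{b}|^2$ equal to each other,
$$|\mathbf{a}-\mathbf{b}|^2 = |\mathbf{a}|^2+|\mathbf{b}|^2-2\,\mathbf{a}\cdot\mathbf{b} = |\mathbf{b}|^2-2|\mathbf{a}|\,|\mathbf{b}|\cos\theta+|\mathbf{a}|^2$$
Cancelling the $|\mathbf{a}|^2$ and $|\mathbf{b}|^2$ common to both sides and dividing by −2 gives $\mathbf{a}\cdot\mathbf{b} = |\mathbf{a}|\,|\mathbf{b}|\cos\theta$, which is Property 6.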
Because of Property 7 of Theorem 1.2.11, the dot product can be used to test whether or not two vectors are perpendicular to each
other. That is, whether or not the angle between the two vectors is 90°. Another name⁶ for “perpendicular” is “orthogonal”.
Testing for orthogonality is one of the main uses of the dot product.
Example 1.2.12
Consider the three vectors
a ⋅ b = ⟨1, 1, 0⟩ ⋅ ⟨1, 0, 1⟩ = 1 ×1 +1 ×0 +0 ×1 =1
tell us that c is perpendicular to both a and b. Since both $|\mathbf{a}| = |\mathbf{b}| = \sqrt{1^2+1^2+0^2} = \sqrt{2}$ the first dot product tells us that
the angle, θ, between a and b obeys
$$\cos\theta = \frac{\mathbf{a}\cdot\mathbf{b}}{|\mathbf{a}|\,|\mathbf{b}|} = \frac{1}{2} \implies \theta = \frac{\pi}{3}$$
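Both uses of the dot product seen so far, testing for perpendicularity and finding an angle, are easy to express in code. This is a sketch under our own naming (`dot`, `norm`, `angle`); it uses the vectors a = ⟨1, 1, 0⟩ and b = ⟨1, 0, 1⟩ from this example and the pair ⟨1, 2, 3⟩, ⟨−1, −1, 1⟩ from the preview earlier in the section.

```python
import math

def dot(u, v):
    """Dot product a1*b1 + a2*b2 + ... of two vectors of the same dimension."""
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(v):
    """Length of v, using the property a . a = |a|^2."""
    return math.sqrt(dot(v, v))

def angle(u, v):
    """Angle in radians between two nonzero vectors, from a . b = |a||b| cos(theta)."""
    cos_theta = dot(u, v) / (norm(u) * norm(v))
    # Clamp to [-1, 1] so that rounding error cannot push acos out of its domain
    return math.acos(max(-1.0, min(1.0, cos_theta)))

a = (1, 1, 0)
b = (1, 0, 1)
theta = angle(a, b)                    # should be close to pi/3

perp = dot((1, 2, 3), (-1, -1, 1))     # zero exactly when the vectors are orthogonal
```

With integer components the orthogonality test `perp == 0` is exact; with floating point components a small tolerance would be the safer comparison.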
Dot products are also used to compute projections. First, here's the definition.
Draw two vectors, a and b, with their tails at a common point and drop a perpendicular from the head of a to the line that
passes through both the head and tail of b. By definition, the projection of the vector a on the vector b is the vector from the
tail of b to the point on the line where the perpendicular hits.
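If the angle θ between a and b is no more than 90°, as in
the figure on the left above, the length of the projection of a on b is |a| cos θ. By Property 6 of Theorem 1.2.11,
|a| cos θ = a ⋅ b/|b|, so the projection is a vector whose length is a ⋅ b/|b| and whose direction is given by the unit vector b/|b|.
Hence
$$\text{projection of } \mathbf{a} \text{ on } \mathbf{b} = \text{proj}_{\mathbf{b}}\,\mathbf{a} = \frac{\mathbf{a}\cdot\mathbf{b}}{|\mathbf{b}|}\,\frac{\mathbf{b}}{|\mathbf{b}|} = \frac{\mathbf{a}\cdot\mathbf{b}}{|\mathbf{b}|^2}\,\mathbf{b}$$
If |θ| is larger than 90°, as in the figure on the right above, the projection has length
$$|\mathbf{a}|\cos(\pi-\theta) = -|\mathbf{a}|\cos\theta = -\mathbf{a}\cdot\mathbf{b}/|\mathbf{b}|$$
and direction −b/|b|, so the two minus signs cancel and again $\text{proj}_{\mathbf{b}}\,\mathbf{a} = \frac{\mathbf{a}\cdot\mathbf{b}}{|\mathbf{b}|^2}\,\mathbf{b}$.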
Equation 1.2.14
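$$\text{proj}_{\mathbf{b}}\,\mathbf{a} = \frac{\mathbf{a}\cdot\mathbf{b}}{|\mathbf{b}|^2}\,\mathbf{b}$$
We may rewrite $\text{proj}_{\mathbf{b}}\,\mathbf{a} = \frac{\mathbf{a}\cdot\mathbf{b}}{|\mathbf{b}|}\,\frac{\mathbf{b}}{|\mathbf{b}|}$. The coefficient, $\frac{\mathbf{a}\cdot\mathbf{b}}{|\mathbf{b}|}$, of the unit vector $\frac{\mathbf{b}}{|\mathbf{b}|}$, is called the
component of a in the direction b. As a special case, if b happens to be a unit vector, which, for emphasis, we'll now write as $\hat{\mathbf{b}}$,
the projection formula simplifies to
Equation 1.2.15
$$\text{proj}_{\hat{\mathbf{b}}}\,\mathbf{a} = (\mathbf{a}\cdot\hat{\mathbf{b}})\,\hat{\mathbf{b}}$$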
Example 1.2.16
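In this example, we will find the projection of the vector ⟨0, 3⟩ on the vector ⟨1, 1⟩, as in the figure. By Equation 1.2.14 with a = ⟨0, 3⟩ and b = ⟨1, 1⟩, that projection is
$$\text{proj}_{\langle 1,1\rangle}\langle 0,3\rangle = \frac{\langle 0,3\rangle\cdot\langle 1,1\rangle}{|\langle 1,1\rangle|^2}\,\langle 1,1\rangle = \frac{0\times 1+3\times 1}{1^2+1^2}\,\langle 1,1\rangle = \Big\langle \frac{3}{2}, \frac{3}{2}\Big\rangle$$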
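Equation 1.2.14 translates directly into a short function. The sketch below is our own illustration; the name `project` is hypothetical, and the second call uses a made-up vector to show the case where the angle between the vectors is larger than 90°.

```python
def dot(u, v):
    """Dot product of two vectors of the same dimension."""
    return sum(ui * vi for ui, vi in zip(u, v))

def project(a, b):
    """Projection of a on b, that is (a . b / |b|^2) b; b must be nonzero."""
    bb = dot(b, b)
    if bb == 0:
        raise ValueError("cannot project onto the zero vector")
    coeff = dot(a, b) / bb
    return tuple(coeff * bi for bi in b)

p = project((0, 3), (1, 1))        # the projection computed in this example
q = project((-1, 0), (1, 1))       # obtuse angle: the result points opposite to b
```

Dividing by b ⋅ b rather than normalising b first avoids taking a square root, since |b|² = b ⋅ b.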
One use of projections is to “resolve forces”. There is an example in the next (optional) section.
If we use a coordinate system centered on the hinge, the (x, y) coordinates of the mass at time t are
$$x(t) = \ell\sin\theta(t)\qquad y(t) = -\ell\cos\theta(t)$$
where θ(t) is the angle between the rod and vertical at time t. We are now going to use Newton's law of motion,
mass × acceleration = total applied force,
to determine how θ evolves in time. By definition, the velocity and acceleration vectors⁸ for the position vector ⟨x(t), y(t)⟩ are
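$$\frac{d}{dt}\langle x(t), y(t)\rangle = \Big\langle \frac{dx}{dt}(t), \frac{dy}{dt}(t)\Big\rangle\qquad
\frac{d^2}{dt^2}\langle x(t), y(t)\rangle = \Big\langle \frac{d^2x}{dt^2}(t), \frac{d^2y}{dt^2}(t)\Big\rangle$$
So the velocity and acceleration vectors of our mass are
$$\begin{aligned}
\mathbf{v}(t) &= \frac{d}{dt}\langle x(t), y(t)\rangle\\
&= \Big\langle \frac{d}{dt}\big(\ell\sin\theta(t)\big), \frac{d}{dt}\big(-\ell\cos\theta(t)\big)\Big\rangle\\
&= \Big\langle \ell\cos\theta(t)\,\frac{d\theta}{dt}(t)\,,\ \ell\sin\theta(t)\,\frac{d\theta}{dt}(t)\Big\rangle\\
&= \ell\,\frac{d\theta}{dt}(t)\,\langle\cos\theta(t), \sin\theta(t)\rangle\\
\mathbf{a}(t) &= \frac{d^2}{dt^2}\langle x(t), y(t)\rangle\\
&= \frac{d}{dt}\Big\{\ell\,\frac{d\theta}{dt}(t)\,\langle\cos\theta(t), \sin\theta(t)\rangle\Big\}\\
&= \ell\,\frac{d^2\theta}{dt^2}(t)\,\langle\cos\theta(t), \sin\theta(t)\rangle + \ell\,\frac{d\theta}{dt}(t)\,\Big\langle \frac{d}{dt}\cos\theta(t), \frac{d}{dt}\sin\theta(t)\Big\rangle\\
&= \ell\,\frac{d^2\theta}{dt^2}(t)\,\langle\cos\theta(t), \sin\theta(t)\rangle + \ell\,\Big(\frac{d\theta}{dt}(t)\Big)^2\langle -\sin\theta(t), \cos\theta(t)\rangle
\end{aligned}$$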
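The velocity vector is $\ell\,\frac{d\theta}{dt}\,\langle\cos\theta, \sin\theta\rangle$, so the total frictional force is
$$-\beta\ell\,\frac{d\theta}{dt}\,\langle\cos\theta, \sin\theta\rangle$$
The force $\tau(t)\,\langle -\sin\theta(t), \cos\theta(t)\rangle$
has magnitude τ(t) and direction parallel to the rod pointing from the mass towards the hinge and so is the force due to tension in
the rod.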
Hence, for this physical system, Newton's law of motion is
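$$\underbrace{m\ell\,\frac{d^2\theta}{dt^2}\,\langle\cos\theta, \sin\theta\rangle + m\ell\,\Big(\frac{d\theta}{dt}\Big)^2\langle -\sin\theta, \cos\theta\rangle}_{\text{mass}\times\text{acceleration}}
= \underbrace{mg\,\langle 0, -1\rangle}_{\text{gravity}} + \underbrace{\tau\,\langle -\sin\theta, \cos\theta\rangle}_{\text{tension}}\ \underbrace{-\,\beta\ell\,\frac{d\theta}{dt}\,\langle\cos\theta, \sin\theta\rangle}_{\text{friction}}\qquad(*)$$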
This is a rather complicated looking equation. Writing out its x- and y -components doesn't help. They also look complicated.
Instead, the equation can be considerably simplified (and consequently better understood) by “taking its components parallel to and
perpendicular to the direction of motion”. From the velocity vector v(t), we see that ⟨cos θ(t), sin θ(t)⟩ is a unit vector parallel to
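the direction of motion at time t. Recall, from 1.2.15, that the projection of any vector b on any unit vector $\hat{\mathbf{d}}$ (with the “hat” on $\hat{\mathbf{d}}$
signifying that it is a unit vector) is $(\mathbf{b}\cdot\hat{\mathbf{d}})\,\hat{\mathbf{d}}$.
The coefficient $\mathbf{b}\cdot\hat{\mathbf{d}}$ is, by definition, the component of b in the direction $\hat{\mathbf{d}}$. So, by dotting both sides of the equation of motion
(∗) with $\hat{\mathbf{d}} = \langle\cos\theta(t), \sin\theta(t)\rangle$, we extract the component parallel to the direction of motion. Since
$$\langle\cos\theta, \sin\theta\rangle\cdot\langle\cos\theta, \sin\theta\rangle = 1\qquad
\langle\cos\theta, \sin\theta\rangle\cdot\langle -\sin\theta, \cos\theta\rangle = 0\qquad
\langle\cos\theta, \sin\theta\rangle\cdot\langle 0, -1\rangle = -\sin\theta$$
this gives
$$m\ell\,\frac{d^2\theta}{dt^2} = -mg\sin\theta - \beta\ell\,\frac{d\theta}{dt}$$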
which is much cleaner than (∗)! When θ is small, we can approximate sin θ ≈ θ and get the equation
$$\frac{d^2\theta}{dt^2} + \frac{\beta}{m}\,\frac{d\theta}{dt} + \frac{g}{\ell}\,\theta = 0$$
which is easily solved. There are systematic procedures for finding the solution, but we'll just guess.
When there is no friction (so that β = 0 ), we would expect the pendulum to just oscillate. So it is natural to guess
θ(t) = A sin(ωt − δ)
which is an oscillation with (unknown) amplitude A, frequency ω (radians per unit time) and phase δ. Substituting this guess into
the left hand side, θ″ + (g/ℓ)θ, yields
\[
-A\omega^2\sin(\omega t - \delta) + A\,\frac{g}{\ell}\,\sin(\omega t - \delta)
\]
which is zero if ω = √(g/ℓ). So θ(t) = A sin(ωt − δ) is a solution for any amplitude A and phase δ, provided the frequency
ω = √(g/ℓ).
When there is some, but not too much, friction, so that β >0 is relatively small, we would expect “oscillation with decaying
amplitude”. So we guess
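\[
\theta(t) = A e^{-\gamma t}\sin(\omega t - \delta)
\]
for some constant decay rate γ, to be determined. With this guess,
\[
\begin{aligned}
\theta(t) &= A e^{-\gamma t}\sin(\omega t - \delta) \\
\theta'(t) &= -\gamma A e^{-\gamma t}\sin(\omega t - \delta) + \omega A e^{-\gamma t}\cos(\omega t - \delta) \\
\theta''(t) &= (\gamma^2 - \omega^2) A e^{-\gamma t}\sin(\omega t - \delta) - 2\gamma\omega A e^{-\gamma t}\cos(\omega t - \delta)
\end{aligned}
\]
and the left hand side
\[
\frac{d^2\theta}{dt^2} + \frac{\beta}{m}\,\frac{d\theta}{dt} + \frac{g}{\ell}\,\theta
= \Big[\gamma^2 - \omega^2 - \frac{\beta}{m}\gamma + \frac{g}{\ell}\Big] A e^{-\gamma t}\sin(\omega t - \delta)
+ \Big[-2\gamma\omega + \frac{\beta}{m}\omega\Big] A e^{-\gamma t}\cos(\omega t - \delta)
\]
vanishes if γ² − ω² − (β/m)γ + g/ℓ = 0 and −2γω + (β/m)ω = 0. The second equation tells us the decay rate γ = β/(2m) and then the first
tells us the frequency
\[
\omega = \sqrt{\gamma^2 - \frac{\beta}{m}\gamma + \frac{g}{\ell}} = \sqrt{\frac{g}{\ell} - \frac{\beta^2}{4m^2}}
\]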
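The decay rate and frequency found above can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the sample values of β, m, g and ℓ are chosen purely for illustration. It substitutes the guess θ(t) = A e^(−γt) sin(ωt − δ), with γ = β/(2m) and ω² = g/ℓ − β²/(4m²), into the left hand side of the small-angle equation and evaluates the result.

```python
import math

def decay_rate_and_frequency(beta, m, g, ell):
    """Return (gamma, omega) for the lightly damped small-angle pendulum."""
    gamma = beta / (2.0 * m)
    omega_squared = g / ell - beta**2 / (4.0 * m**2)
    if omega_squared <= 0.0:
        # Too much friction: the oscillating guess does not apply.
        raise ValueError("omega is not a positive real number")
    return gamma, math.sqrt(omega_squared)

def ode_residual(t, A, delta, beta, m, g, ell):
    """Evaluate theta'' + (beta/m) theta' + (g/ell) theta for the guess."""
    gamma, omega = decay_rate_and_frequency(beta, m, g, ell)
    envelope = A * math.exp(-gamma * t)
    s = math.sin(omega * t - delta)
    c = math.cos(omega * t - delta)
    theta = envelope * s
    theta_p = envelope * (-gamma * s + omega * c)
    theta_pp = envelope * ((gamma**2 - omega**2) * s - 2.0 * gamma * omega * c)
    return theta_pp + (beta / m) * theta_p + (g / ell) * theta

# Illustrative values only (hypothetical pendulum).
for t in (0.0, 0.7, 3.0):
    print(t, ode_residual(t, A=0.2, delta=0.3, beta=0.5, m=1.0, g=9.8, ell=2.0))
```

Each printed residual should be zero up to floating point rounding, whatever amplitude A and phase δ are used.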
When there is a lot of friction (namely when β²/(4m²) > g/ℓ, so that the frequency ω is not a real number), we would expect damping
without oscillation.
To extract the components perpendicular to the direction of motion, we dot with ⟨− sin θ, cos θ⟩ rather than ⟨cos θ, sin θ⟩ . Note
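that, because
\[
\langle -\sin\theta, \cos\theta\rangle\cdot\langle \cos\theta, \sin\theta\rangle = 0
\]
the vector ⟨− sin θ, cos θ⟩ really is perpendicular to the direction of motion. Since
\[
\begin{aligned}
\langle -\sin\theta, \cos\theta\rangle\cdot\langle \cos\theta, \sin\theta\rangle &= 0 \\
\langle -\sin\theta, \cos\theta\rangle\cdot\langle -\sin\theta, \cos\theta\rangle &= 1 \\
\langle -\sin\theta, \cos\theta\rangle\cdot\langle 0, -1\rangle &= -\cos\theta
\end{aligned}
\]
dotting both sides of the equation of motion (∗) with ⟨− sin θ, cos θ⟩ gives
\[
m\ell\Big(\frac{d\theta}{dt}\Big)^2 = -mg\cos\theta + \tau
\]
This equation just determines the tension τ = mℓ(dθ/dt)² + mg cos θ in the rod, once θ(t) is known.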
common point is the origin, you get a picture like the figure below.
Any parallelogram can be constructed like this if you pick the common point and two vectors appropriately. Let's compute the area
of the parallelogram. The area of the large rectangle with vertices (0, 0), (0, b + d), (a + c, 0) and (a + c, b + d) is
(a + c)(b + d). The parallelogram we want can be extracted from the large rectangle by deleting the two small rectangles (each of
area bc), and the two lightly shaded triangles (each of area ½cd), and the two darkly shaded triangles (each of area ½ab). So the
desired
\[
\text{area} = (a+c)(b+d) - (2\times bc) - \big(2\times\tfrac{1}{2}cd\big) - \big(2\times\tfrac{1}{2}ab\big) = ad - bc
\]
In the above figure, we have implicitly assumed that a, b, c, d ≥ 0 and d/c ≥ b/a. In words, we have assumed that both vectors
⟨a, b⟩ , ⟨c, d⟩ lie in the first quadrant and that ⟨c, d⟩ lies above ⟨a, b⟩ . By simply interchanging a ↔ c and b ↔ d in the picture
and throughout the argument, we see that when a, b, c, d ≥ 0 and b/a ≥ d/c, so that the vector ⟨c, d⟩ lies below ⟨a, b⟩ , the area
of the parallelogram is bc − ad. In fact, all cases are covered by the formula
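Equation 1.2.17
\[
\text{area of parallelogram with sides } \langle a, b\rangle \text{ and } \langle c, d\rangle = |ad - bc|
\]
Given two vectors ⟨a, b⟩ and ⟨c, d⟩, the expression ad − bc is generally written
\[
\det\begin{bmatrix} a & b \\ c & d \end{bmatrix} = ad - bc
\]
and is called the determinant of the matrix
\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix}
\]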
with rows ⟨a, b⟩ and ⟨c, d⟩ . The determinant of a 2 × 2 matrix is the product of the diagonal entries minus the product of the off-
diagonal entries.
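As a quick numerical illustration of the area formula, here is a minimal sketch (the function names are ours, not the text's). It compares |ad − bc| with the "large rectangle minus corner pieces" count used in the argument above, for one sample pair of first-quadrant vectors.

```python
def det2(a, b, c, d):
    """Determinant of the 2x2 matrix with rows (a, b) and (c, d)."""
    return a * d - b * c

def parallelogram_area(v, w):
    """Area of the parallelogram with sides v = (a, b) and w = (c, d)."""
    (a, b), (c, d) = v, w
    return abs(det2(a, b, c, d))

def area_by_subtraction(v, w):
    """Rectangle-minus-pieces count; assumes a, b, c, d >= 0 and d/c >= b/a."""
    (a, b), (c, d) = v, w
    return (a + c) * (b + d) - 2 * (b * c) - c * d - a * b

v, w = (3, 1), (1, 2)               # sample vectors, chosen for illustration
print(parallelogram_area(v, w))     # 5
print(area_by_subtraction(v, w))    # 5
print(parallelogram_area(w, v))     # 5, the absolute value handles the swap
```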
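There is a similar formula in three dimensions. Any three vectors a = ⟨a₁, a₂, a₃⟩, b = ⟨b₁, b₂, b₃⟩ and c = ⟨c₁, c₂, c₃⟩ in three
dimensions
determine a parallelepiped (three dimensional parallelogram). Its volume is given by the formula
Equation 1.2.18
\[
\text{volume of parallelepiped with edges } \mathbf{a}, \mathbf{b}, \mathbf{c}
= \left|\det\begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix}\right|
\]
The determinant of a 3 × 3 matrix can be defined in terms of some 2 × 2 determinants by
\[
\begin{aligned}
\det\begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix}
&= a_1\det\begin{bmatrix} b_2 & b_3 \\ c_2 & c_3 \end{bmatrix}
 - a_2\det\begin{bmatrix} b_1 & b_3 \\ c_1 & c_3 \end{bmatrix}
 + a_3\det\begin{bmatrix} b_1 & b_2 \\ c_1 & c_2 \end{bmatrix} \\
&= a_1(b_2c_3 - b_3c_2) - a_2(b_1c_3 - b_3c_1) + a_3(b_1c_2 - b_2c_1)
\end{aligned}
\]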
This formula is called “expansion along the top row”. There is one term in the formula for each entry in the top row of the 3 × 3
matrix. The term is a sign times the entry itself times the determinant of the 2 × 2 matrix gotten by deleting the row and column
that contains the entry. The sign alternates, starting with a “+”.
We shall not prove this formula completely here¹⁰. It gets a little tedious. But, there is one case in which we can easily verify that
the volume of the parallelepiped is really given by the absolute value of the claimed determinant. If the vectors b and c happen to
lie in the xy plane, so that b₃ = c₃ = 0, then
\[
\begin{aligned}
\det\begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & 0 \\ c_1 & c_2 & 0 \end{bmatrix}
&= a_1(b_2\,0 - 0\,c_2) - a_2(b_1\,0 - 0\,c_1) + a_3(b_1c_2 - b_2c_1) \\
&= a_3(b_1c_2 - b_2c_1)
\end{aligned}
\]
The first factor, a₃, is the z-coordinate of the one vector not contained in the xy-plane. It is (up to a sign) the height of the
parallelepiped. The second factor is, up to a sign, the area of the parallelogram determined by b and c. This parallelogram forms
the base of the parallelepiped. The product is indeed, up to a sign, the volume of the parallelepiped. That the formula is true in
general is a consequence of the fact (that we will not prove) that the value of a determinant does not change when one rotates the
coordinate system and that one can always rotate our coordinate axes around so that b and c both lie in the xy-plane.
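Expansion along the top row is easy to mechanise. The sketch below (the helper names det2, det3 and parallelepiped_volume are ours) evaluates a 3 × 3 determinant exactly as described above and checks the special case in which b and c lie in the xy-plane, where the volume should be the base area |b₁c₂ − b₂c₁| times the height |a₃|.

```python
def det2(a, b, c, d):
    """Determinant of the 2x2 matrix with rows (a, b) and (c, d)."""
    return a * d - b * c

def det3(rows):
    """3x3 determinant by expansion along the top row."""
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = rows
    return (a1 * det2(b2, b3, c2, c3)
            - a2 * det2(b1, b3, c1, c3)
            + a3 * det2(b1, b2, c1, c2))

def parallelepiped_volume(a, b, c):
    """Volume of the parallelepiped with edges a, b, c."""
    return abs(det3((a, b, c)))

# Sample edges with b and c in the xy-plane (illustrative values).
a, b, c = (1, 2, 3), (4, 5, 0), (7, 8, 0)
base_area = abs(det2(b[0], b[1], c[0], c[1]))   # |b1 c2 - b2 c1| = 3
height = abs(a[2])                              # |a3| = 3
print(det3((a, b, c)))                  # -9
print(parallelepiped_volume(a, b, c))   # 9
print(base_area * height)               # 9
```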
The Cross Product
We have already seen two different products involving vectors — the multiplication of a vector by a scalar and the dot product of
two vectors. The dot product of two vectors yields a scalar. We now introduce another product of two vectors, called the cross
product. The cross product of two vectors will give a vector. There are applications which have two vectors as inputs and produce
one vector as an output, and which are related to the cross product. Here is a very brief mention of two such applications. We will
look at them in much more detail later.
Consider a parallelogram in three dimensions. A parallelogram is naturally determined by the two vectors that define its sides.
One measure of the size of a parallelogram is its area. One way to specify the orientation of the parallelogram is to give a vector
that is perpendicular to it. A very compact way to encode both the area and the orientation of the parallelogram is to give a
vector whose direction is perpendicular to the plane in which it lies and whose magnitude is its area. We shall see that such a
vector can be easily constructed by taking the cross product (definition coming shortly) of the two vectors that give the sides of
the parallelogram.
Imagine a rigid body which is rotating at a rate Ω radians per second about an axis whose direction is given by the unit vector
â. Let P be any point on the body. We shall see, in the (optional) §1.2.7, that the velocity, v, of the point P is the cross product
(again, definition coming shortly) of the vector Ωâ with the vector r from any point on the axis of rotation to P.
Finally, here is the definition of the cross product. Note that it applies only to vectors in three dimensions.
Definition 1.2.19
\[
\mathbf{a}\times\mathbf{b} = \langle a_2b_3 - a_3b_2,\ a_3b_1 - a_1b_3,\ a_1b_2 - a_2b_1 \rangle
\]
Note that each component has the form aᵢbⱼ − aⱼbᵢ. The index i of the first a in component number k of a × b is just after k in
the list 1, 2, 3, 1, 2, 3, 1, 2, 3, ⋯. The index j of the first b is just before k in the list.
There is a much better way to remember this definition. Recall that a 2 × 2 matrix is an array of numbers having two rows and two
columns and that the determinant of a 2 × 2 matrix is defined by
\[
\det\begin{bmatrix} a & b \\ c & d \end{bmatrix} = ad - bc
\]
It is the product of the entries on the diagonal minus the product of the entries not on the diagonal.
A 3 × 3 matrix is an array of numbers having three rows and three columns.
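\[
\begin{bmatrix} i & j & k \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{bmatrix}
\]
You will shortly see why the entries in the top row have been given the rather peculiar names i, j and k. The determinant of a
3 × 3 matrix can be defined in terms of some 2 × 2 determinants by
\[
\begin{aligned}
\det\begin{bmatrix} i & j & k \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{bmatrix}
&= i\det\begin{bmatrix} a_2 & a_3 \\ b_2 & b_3 \end{bmatrix}
 - j\det\begin{bmatrix} a_1 & a_3 \\ b_1 & b_3 \end{bmatrix}
 + k\det\begin{bmatrix} a_1 & a_2 \\ b_1 & b_2 \end{bmatrix} \\
&= i(a_2b_3 - a_3b_2) - j(a_1b_3 - a_3b_1) + k(a_1b_2 - a_2b_1)
\end{aligned}
\]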
This formula is called “expansion of the determinant along the top row”. There is one term in the formula for each entry in the top
row. The term is a sign times the entry itself times the determinant of the 2 × 2 matrix gotten by deleting the row and column that
contains the entry. The sign alternates, starting with a +. If we now replace i by î, j by ĵ and k by k̂, we get exactly the formula
for a × b of Definition 1.2.19. That is the reason for the peculiar choice of names for the matrix entries. So
\[
\begin{aligned}
\mathbf{a}\times\mathbf{b}
&= \det\begin{bmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{bmatrix} \\
&= \hat{\imath}\,(a_2b_3 - a_3b_2) - \hat{\jmath}\,(a_1b_3 - a_3b_1) + \hat{k}\,(a_1b_2 - a_2b_1)
\end{aligned}
\]
is a mnemonic device for remembering the definition of a × b. It is also good from the point of view of evaluating a × b. Here are
several examples in which we use the determinant mnemonic device to evaluate cross products.
Example 1.2.20
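In this example, we'll use the mnemonic device to compute two very simple cross products. First
\[
\hat{\imath}\times\hat{\jmath}
= \det\begin{bmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}
= \hat{\imath}\,(0\times 0 - 0\times 1) - \hat{\jmath}\,(1\times 0 - 0\times 0) + \hat{k}\,(1\times 1 - 0\times 0)
= \hat{k}
\]
Second
\[
\hat{\jmath}\times\hat{\imath}
= \det\begin{bmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}
= \hat{\imath}\,(1\times 0 - 0\times 0) - \hat{\jmath}\,(0\times 0 - 0\times 1) + \hat{k}\,(0\times 0 - 1\times 1)
= -\hat{k}
\]
Note that, unlike most (or maybe even all) products that you have seen before, î × ĵ is not the same as ĵ × î!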
Example 1.2.21
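In this example, we'll use the mnemonic device to compute two more complicated cross products. Let a = ⟨1, 2, 3⟩ and
b = ⟨1, −1, 2⟩. First
\[
\mathbf{a}\times\mathbf{b}
= \det\begin{bmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ 1 & 2 & 3 \\ 1 & -1 & 2 \end{bmatrix}
= \hat{\imath}\,\big(2\times 2 - 3\times(-1)\big) - \hat{\jmath}\,(1\times 2 - 3\times 1) + \hat{k}\,\big(1\times(-1) - 2\times 1\big)
= 7\,\hat{\imath} + \hat{\jmath} - 3\,\hat{k}
\]
Second
\[
\mathbf{b}\times\mathbf{a}
= \det\begin{bmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ 1 & -1 & 2 \\ 1 & 2 & 3 \end{bmatrix}
= \hat{\imath}\,\big((-1)\times 3 - 2\times 2\big) - \hat{\jmath}\,(1\times 3 - 2\times 1) + \hat{k}\,\big(1\times 2 - (-1)\times 1\big)
= -7\,\hat{\imath} - \hat{\jmath} + 3\,\hat{k}
\]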
Here are some important observations.
The vectors a × b and b × a are not the same! In fact b × a = −a × b. We shall see in Theorem 1.2.23 below that this
was not a fluke.
The vector a × b has dot product zero with both a and b. So the vector a × b is perpendicular to both a and b. We shall
see in Theorem 1.2.23 below that this was also not a fluke.
Example 1.2.22
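Yet again we use the mnemonic device to compute a more complicated cross product. This time let a = ⟨3, 2, 1⟩ and
b = ⟨6, 4, 2⟩. Then
\[
\mathbf{a}\times\mathbf{b}
= \det\begin{bmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ 3 & 2 & 1 \\ 6 & 4 & 2 \end{bmatrix}
= \hat{\imath}\,(2\times 2 - 1\times 4) - \hat{\jmath}\,(3\times 2 - 1\times 6) + \hat{k}\,(3\times 4 - 2\times 6)
= \mathbf{0}
\]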
We shall see in Theorem 1.2.23 below that it is not a fluke that the cross product is 0. It is a consequence of the fact that a and
b = 2a are parallel.
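The examples above are easy to reproduce by machine. Here is a minimal sketch, with vectors represented as plain tuples and helper names (cross, dot) of our own choosing, that implements the component formula of the definition directly.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

def dot(a, b):
    """Dot product of two vectors of the same length."""
    return sum(x * y for x, y in zip(a, b))

a = (1, 2, 3)
b = (1, -1, 2)
print(cross(a, b))                                # (7, 1, -3)
print(cross(b, a))                                # (-7, -1, 3)
print(dot(a, cross(a, b)), dot(b, cross(a, b)))   # 0 0
print(cross((3, 2, 1), (6, 4, 2)))                # (0, 0, 0), parallel vectors
```

The second and third lines of output illustrate the two observations made after Example 1.2.21: swapping the factors flips the sign, and a × b has zero dot product with both a and b.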
We now move on to learning about the properties of the cross product. Our first properties lead up to a more intuitive geometric
definition of a × b, which is better for interpreting a × b. These properties of the cross product, which state that a × b is a vector
and then determine its direction and length, are as follows. We will collect these properties, and a few others, into a theorem
shortly.
(0)
a, b are vectors in three dimensions and a × b is a vector in three dimensions.
(1)
a×b is perpendicular to both a and b.
Proof.
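To check that a and a × b are perpendicular, one just has to check that the dot product a ⋅ (a × b) = 0. The six terms in
\[
\mathbf{a}\cdot(\mathbf{a}\times\mathbf{b}) = a_1(a_2b_3 - a_3b_2) + a_2(a_3b_1 - a_1b_3) + a_3(a_1b_2 - a_2b_1)
\]
cancel pairwise. The computation showing that b ⋅ (a × b) = 0 is similar.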
(2)
|a × b| = |a| |b| sin θ where 0 ≤ θ ≤ π is the angle between a, b
Proof.
The formula |a × b| = |a| |b| sin θ is gotten by verifying that
\[
\begin{aligned}
|\mathbf{a}\times\mathbf{b}|^2 &= (\mathbf{a}\times\mathbf{b})\cdot(\mathbf{a}\times\mathbf{b}) \\
&= (a_2b_3 - a_3b_2)^2 + (a_3b_1 - a_1b_3)^2 + (a_1b_2 - a_2b_1)^2 \\
&= a_2^2b_3^2 - 2a_2b_3a_3b_2 + a_3^2b_2^2 + a_3^2b_1^2 - 2a_3b_1a_1b_3 + a_1^2b_3^2 \\
&\qquad + a_1^2b_2^2 - 2a_1b_2a_2b_1 + a_2^2b_1^2
\end{aligned}
\]
is equal to
\[
\begin{aligned}
|\mathbf{a}|^2|\mathbf{b}|^2\sin^2\theta &= |\mathbf{a}|^2|\mathbf{b}|^2(1 - \cos^2\theta) \\
&= |\mathbf{a}|^2|\mathbf{b}|^2 - (\mathbf{a}\cdot\mathbf{b})^2 \\
&= (a_1^2 + a_2^2 + a_3^2)(b_1^2 + b_2^2 + b_3^2) - (a_1b_1 + a_2b_2 + a_3b_3)^2 \\
&= a_1^2b_2^2 + a_1^2b_3^2 + a_2^2b_1^2 + a_2^2b_3^2 + a_3^2b_1^2 + a_3^2b_2^2 \\
&\qquad - (2a_1b_1a_2b_2 + 2a_1b_1a_3b_3 + 2a_2b_2a_3b_3)
\end{aligned}
\]
To see that |a| |b| sin θ is the area of the parallelogram with sides a and b, just recall that the area of any parallelogram is given
by the length of its base times its height. Think of a as the base of the parallelogram. Then |a| is the length of the base and
|b| sin θ is the height.
These properties almost determine a × b. Property 1 forces the vector a × b to lie on the line perpendicular to the plane
containing a and b. There are precisely two vectors on this line that have the length given by property 2. In the left hand figure,
the two vectors are labeled c and d. Which of these two candidates is correct is determined by the right hand rule¹¹, which says
that if you form your right hand into a fist with your fingers curling from a to b, then when you stick your thumb straight out from
the fist, it points in the direction of a × b. This is illustrated in the figure on the right above¹². The important special cases
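(3)
\[
\begin{aligned}
\hat{\imath}\times\hat{\jmath} &= \hat{k}, & \hat{\jmath}\times\hat{k} &= \hat{\imath}, & \hat{k}\times\hat{\imath} &= \hat{\jmath} \\
\hat{\jmath}\times\hat{\imath} &= -\hat{k}, & \hat{k}\times\hat{\jmath} &= -\hat{\imath}, & \hat{\imath}\times\hat{k} &= -\hat{\jmath}
\end{aligned}
\]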
all follow directly from the definition of the cross product (see, for example, Example 1.2.20) and all obey the right hand
rule. Combining properties 1, 2 and the right hand rule gives the geometric definition of a × b. To remember these three
special cases, just remember this figure.
The product of any two standard basis vectors, taken in the order of the arrows in the figure, is the third standard basis vector.
Going against the arrows introduces a minus sign.
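(4)
a × b = |a| |b| sin θ n̂, where θ is the angle between a and b and n̂ is the unit vector that is perpendicular to both a and b, with (a, b, n̂) obeying the right hand rule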
Outline of Proof.
We have already seen that the right hand side has the correct length and, except possibly for a sign, direction. To check that the
right hand rule holds in general, rotate your coordinate system around¹³ so that a points along the positive x axis and b lies in
the xy-plane with positive y component. That is a = αî and b = βî + γĵ with α, γ ≥ 0. Then
\[
\mathbf{a}\times\mathbf{b} = \alpha\,\hat{\imath}\times(\beta\,\hat{\imath} + \gamma\,\hat{\jmath})
= \alpha\beta\ \hat{\imath}\times\hat{\imath} + \alpha\gamma\ \hat{\imath}\times\hat{\jmath}
\]
The first term vanishes by property 2, because the angle θ between î and î is zero. So, by property 3, a × b = αγ k̂
points along the positive z axis, which is consistent with the right hand rule.
The analog of property 7 of the dot product (which says that a⋅ b is zero if and only if a=0 or b =0 or a⊥b ) follows
immediately from property 2.
(5)
a × b = 0 if and only if a = 0 or b = 0 or a is parallel to b (that is, θ = 0 or θ = π)
The remaining properties are all tools for helping do computations with cross products. Here is a theorem which summarizes the
properties of the cross product. We have already seen the first five.
Theorem 1.2.23
(0)
a, b are vectors in three dimensions and a × b is a vector in three dimensions.
(1)
a×b is perpendicular to both a and b.
(2)
|a × b| = |a| |b| sin θ where 0 ≤ θ ≤ π is the angle between a, b
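(3)
î × ĵ = k̂, ĵ × k̂ = î, k̂ × î = ĵ
(4)
a × b = |a| |b| sin θ n̂, where θ is the angle between a and b and n̂ is the unit vector that is perpendicular to both a and b, with (a, b, n̂) obeying the right hand rule
(5)
a × b = 0 if and only if a = 0 or b = 0 or a is parallel to b (that is, θ = 0 or θ = π)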
(6)
a × b = −b × a
(7)
(sa) × b = a × (sb) = s(a × b) for any scalar s
(8)
a × (b + c) = a × b + a × c
(9)
a ⋅ (b × c) = (a × b) ⋅ c
(10)
a × (b × c) = (c ⋅ a)b − (b ⋅ a)c
Proof.
We have already seen the proofs up to number 5. Numbers 6, 7 and 8 follow immediately from the definition, using a little
algebra. To prove numbers 9 and 10 we just write out the definitions of the left hand sides and the right hand sides and observe
that they are equal.
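(9) The left hand side is
\[
\begin{aligned}
\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c})
&= a_1(b_2c_3 - b_3c_2) + a_2(b_3c_1 - b_1c_3) + a_3(b_1c_2 - b_2c_1) \\
&= a_1b_2c_3 - a_1b_3c_2 + a_2b_3c_1 - a_2b_1c_3 + a_3b_1c_2 - a_3b_2c_1
\end{aligned}
\]
and the right hand side is
\[
\begin{aligned}
(\mathbf{a}\times\mathbf{b})\cdot\mathbf{c}
&= (a_2b_3 - a_3b_2)c_1 + (a_3b_1 - a_1b_3)c_2 + (a_1b_2 - a_2b_1)c_3 \\
&= a_2b_3c_1 - a_3b_2c_1 + a_3b_1c_2 - a_1b_3c_2 + a_1b_2c_3 - a_2b_1c_3
\end{aligned}
\]
The two expressions contain the same six terms.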
Warning 1.2.24
Take particular care with properties 6 and 10. They are counterintuitive and are a frequent source of errors. In particular, for
general vectors a, b, c, the cross product is neither commutative nor associative, meaning that
a×b ≠ b ×a
a × (b × c) ≠ (a × b) × c
For example
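\[
\begin{aligned}
\hat{\imath}\times(\hat{\imath}\times\hat{\jmath}) &= \hat{\imath}\times\hat{k} = -\hat{k}\times\hat{\imath} = -\hat{\jmath} \\
(\hat{\imath}\times\hat{\imath})\times\hat{\jmath} &= \mathbf{0}\times\hat{\jmath} = \mathbf{0}
\end{aligned}
\]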
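The failure of associativity in the warning above can be confirmed with a few lines of code. This is a minimal sketch using tuples for the standard basis vectors; the helper name cross is ours.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

i_hat, j_hat, k_hat = (1, 0, 0), (0, 1, 0), (0, 0, 1)

left = cross(i_hat, cross(i_hat, j_hat))    # i x (i x j)
right = cross(cross(i_hat, i_hat), j_hat)   # (i x i) x j

print(left)    # (0, -1, 0), which is -j
print(right)   # (0, 0, 0)
```

The two groupings give different vectors, so the placement of the brackets in a triple cross product matters.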
Example 1.2.25
As an illustration of the properties of the dot and cross product, we now derive the formula for the volume of the parallelepiped
with edges a = ⟨a₁, a₂, a₃⟩, b = ⟨b₁, b₂, b₃⟩, c = ⟨c₁, c₂, c₃⟩ that was mentioned in §1.2.4.
The volume of the parallelepiped is the area of its base times its height¹⁴. The base is the parallelogram with sides b and c. Its
area is the length of its base, which is |b|, times its height, which is |c| sin θ. (Drop a perpendicular from the head of c to the
line containing b ). Here θ is the angle between b and c. So the area of the base is |b| |c| sin θ = |b × c|, by property 2 of the
cross product.
To get the height of the parallelepiped, we drop a perpendicular from the head of a to the line that passes through the tail of a
and is perpendicular to the base of the parallelepiped. In other words, from the head of a to the line that contains both the head
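and the tail of b × c. So the height of the parallelepiped is |a| |cos φ|, where φ is the angle between a and b × c. (The absolute values have been included because if
the angle between b × c and a happens to be greater than 90°, the cos φ produced by taking the dot product of a and
(b × c) will be negative.)
All together
\[
\begin{aligned}
\text{volume of parallelepiped} &= (\text{area of base})\,(\text{height}) \\
&= |\mathbf{b}\times\mathbf{c}|\ |\mathbf{a}|\,|\cos\varphi| \\
&= \big|\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c})\big| \\
&= \left| a_1\det\begin{bmatrix} b_2 & b_3 \\ c_2 & c_3 \end{bmatrix}
 - a_2\det\begin{bmatrix} b_1 & b_3 \\ c_1 & c_3 \end{bmatrix}
 + a_3\det\begin{bmatrix} b_1 & b_2 \\ c_1 & c_2 \end{bmatrix} \right| \\
&= \left|\det\begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix}\right|
\end{aligned}
\]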
Example 1.2.26
As a concrete example of the computation of the volume of a parallelepiped, we consider the parallelepiped with edges
a = ⟨0, 1, 2⟩
b = ⟨1, 1, 0⟩
c = ⟨0, 1, 0⟩
Here is a sketch.
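The base of the parallelepiped is the parallelogram with sides b and c. It is the shaded parallelogram in the sketch above. As
\[
\begin{aligned}
\mathbf{b}\times\mathbf{c}
&= \det\begin{bmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ 1 & 1 & 0 \\ 0 & 1 & 0 \end{bmatrix} \\
&= \hat{\imath}\det\begin{bmatrix} 1 & 0 \\ 1 & 0 \end{bmatrix}
 - \hat{\jmath}\det\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}
 + \hat{k}\det\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \\
&= \hat{\imath}\,(1\times 0 - 0\times 1) - \hat{\jmath}\,(1\times 0 - 0\times 0) + \hat{k}\,(1\times 1 - 1\times 0) \\
&= \hat{k}
\end{aligned}
\]
the base has area |b × c| = |k̂| = 1 and the volume of the parallelepiped is
\[
|\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c})| = |\langle 0, 1, 2\rangle\cdot\langle 0, 0, 1\rangle| = 2
\]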
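The computation in this example can be repeated in code. The following minimal sketch (helper names are ours) evaluates the triple product |a ⋅ (b × c)| for the three edges given above.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

def dot(a, b):
    """Dot product of two vectors of the same length."""
    return sum(x * y for x, y in zip(a, b))

def parallelepiped_volume(a, b, c):
    """Volume |a . (b x c)| of the parallelepiped with edges a, b, c."""
    return abs(dot(a, cross(b, c)))

a, b, c = (0, 1, 2), (1, 1, 0), (0, 1, 0)
print(cross(b, c))                      # (0, 0, 1)
print(parallelepiped_volume(a, b, c))   # 2
```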
Lemma 1.2.27
(a) a ⋅ (b × c) = (a × b) ⋅ c
(b) a × (b × c) = (c ⋅ a)b − (b ⋅ a)c
(c) a × (b × c) + b × (c × a) + c × (a × b) = 0
Proof of (a).
We proved this in Theorem 1.2.23, by evaluating the left and right hand sides, and observing that they are the same. Here is a
second proof, in which we again write out both sides, but this time we express them in terms of determinants.
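\[
\begin{aligned}
\mathbf{a}\cdot\mathbf{b}\times\mathbf{c}
&= (a_1, a_2, a_3)\cdot\det\begin{bmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix} \\
&= a_1\det\begin{bmatrix} b_2 & b_3 \\ c_2 & c_3 \end{bmatrix}
 - a_2\det\begin{bmatrix} b_1 & b_3 \\ c_1 & c_3 \end{bmatrix}
 + a_3\det\begin{bmatrix} b_1 & b_2 \\ c_1 & c_2 \end{bmatrix} \\
&= \det\begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix}
\end{aligned}
\]
\[
\begin{aligned}
\mathbf{a}\times\mathbf{b}\cdot\mathbf{c}
&= \det\begin{bmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{bmatrix}\cdot(c_1, c_2, c_3) \\
&= c_1\det\begin{bmatrix} a_2 & a_3 \\ b_2 & b_3 \end{bmatrix}
 - c_2\det\begin{bmatrix} a_1 & a_3 \\ b_1 & b_3 \end{bmatrix}
 + c_3\det\begin{bmatrix} a_1 & a_2 \\ b_1 & b_2 \end{bmatrix} \\
&= \det\begin{bmatrix} c_1 & c_2 & c_3 \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{bmatrix}
\end{aligned}
\]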
Exchanging two rows in a determinant changes the sign of the determinant. Moving the top row of a 3 ×3 determinant to the
bottom row requires two exchanges of rows. So the two 3 × 3 determinants are equal.
Proof of (b).
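The proof is not exceptionally difficult: just write out both sides and grind. Substituting in
\[
\mathbf{b}\times\mathbf{c} = (b_2c_3 - b_3c_2)\,\hat{\imath} - (b_1c_3 - b_3c_1)\,\hat{\jmath} + (b_1c_2 - b_2c_1)\,\hat{k}
\]
gives, for the left hand side,
\[
\begin{aligned}
\mathbf{a}\times(\mathbf{b}\times\mathbf{c})
&= \det\begin{bmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ a_1 & a_2 & a_3 \\ b_2c_3 - b_3c_2 & -b_1c_3 + b_3c_1 & b_1c_2 - b_2c_1 \end{bmatrix} \\
&= \hat{\imath}\,\big[a_2(b_1c_2 - b_2c_1) - a_3(-b_1c_3 + b_3c_1)\big] \\
&\quad - \hat{\jmath}\,\big[a_1(b_1c_2 - b_2c_1) - a_3(b_2c_3 - b_3c_2)\big] \\
&\quad + \hat{k}\,\big[a_1(-b_1c_3 + b_3c_1) - a_2(b_2c_3 - b_3c_2)\big]
\end{aligned}
\]
On the other hand, the right hand side is
\[
\begin{aligned}
(\mathbf{a}\cdot\mathbf{c})\mathbf{b} - (\mathbf{a}\cdot\mathbf{b})\mathbf{c}
&= (a_1c_1 + a_2c_2 + a_3c_3)(b_1\hat{\imath} + b_2\hat{\jmath} + b_3\hat{k}) \\
&\quad - (a_1b_1 + a_2b_2 + a_3b_3)(c_1\hat{\imath} + c_2\hat{\jmath} + c_3\hat{k}) \\
&= \hat{\imath}\,\big[a_1b_1c_1 + a_2b_1c_2 + a_3b_1c_3 - a_1b_1c_1 - a_2b_2c_1 - a_3b_3c_1\big] \\
&\quad + \hat{\jmath}\,\big[a_1b_2c_1 + a_2b_2c_2 + a_3b_2c_3 - a_1b_1c_2 - a_2b_2c_2 - a_3b_3c_2\big] \\
&\quad + \hat{k}\,\big[a_1b_3c_1 + a_2b_3c_2 + a_3b_3c_3 - a_1b_1c_3 - a_2b_2c_3 - a_3b_3c_3\big] \\
&= \hat{\imath}\,\big[a_2b_1c_2 + a_3b_1c_3 - a_2b_2c_1 - a_3b_3c_1\big] \\
&\quad + \hat{\jmath}\,\big[a_1b_2c_1 + a_3b_2c_3 - a_1b_1c_2 - a_3b_3c_2\big] \\
&\quad + \hat{k}\,\big[a_1b_3c_1 + a_2b_3c_2 - a_1b_1c_3 - a_2b_2c_3\big]
\end{aligned}
\]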
The last formula that we had for the left hand side is the same as the last formula we had for the right hand side. Oof! This is a
little tedious to do by hand. But any computer algebra system will do it for you in a flash.
Proof of (c).
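We just apply part (b) three times
\[
\begin{aligned}
&\mathbf{a}\times(\mathbf{b}\times\mathbf{c}) + \mathbf{b}\times(\mathbf{c}\times\mathbf{a}) + \mathbf{c}\times(\mathbf{a}\times\mathbf{b}) \\
&\quad = (\mathbf{c}\cdot\mathbf{a})\mathbf{b} - (\mathbf{b}\cdot\mathbf{a})\mathbf{c}
 + (\mathbf{a}\cdot\mathbf{b})\mathbf{c} - (\mathbf{c}\cdot\mathbf{b})\mathbf{a}
 + (\mathbf{b}\cdot\mathbf{c})\mathbf{a} - (\mathbf{a}\cdot\mathbf{c})\mathbf{b} \\
&\quad = \mathbf{0}
\end{aligned}
\]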
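As remarked in the proof of (b), a computer will happily do this kind of grinding. The sketch below is not a proof. It just spot-checks the three parts of the lemma on integer vectors, where the arithmetic is exact. The helper names and the random sampling (seeded for repeatability) are our own choices.

```python
import random

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def scale(s, a):
    return tuple(s * x for x in a)

def check_lemma(a, b, c):
    """Return three booleans, one for each of parts (a), (b), (c)."""
    part_a = dot(a, cross(b, c)) == dot(cross(a, b), c)
    part_b = cross(a, cross(b, c)) == sub(scale(dot(c, a), b), scale(dot(b, a), c))
    total = add(add(cross(a, cross(b, c)), cross(b, cross(c, a))),
                cross(c, cross(a, b)))
    part_c = total == (0, 0, 0)
    return part_a, part_b, part_c

rng = random.Random(0)   # fixed seed so that the run is repeatable

def random_vector():
    return tuple(rng.randint(-9, 9) for _ in range(3))

print(all(check_lemma(random_vector(), random_vector(), random_vector())
          == (True, True, True) for _ in range(1000)))
```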
Pick any point on the axis of rotation and designate it as the origin of our coordinate system. Denote by r the vector from the origin to the
point P. Let θ denote the angle between â and r. As time progresses the point P sweeps out a circle of radius R = |r| sin θ.
In one second P travels along an arc that subtends an angle of Ω radians, which is the fraction Ω/(2π) of a full circle. The length of this
arc is (Ω/(2π)) × 2πR = ΩR = Ω|r| sin θ so P travels the distance Ω|r| sin θ in one second and its speed, which is also the length of
its velocity vector v, is Ω|r| sin θ.
Now consider the direction of v. Imagine that both â and r lie in the plane of the page. Then v is either straight into or straight out of the page
and consequently is perpendicular to both â and r. To distinguish between the “into the page” and “out of the page” cases, let's
impose the conventions that Ω > 0 and the axis of rotation â is chosen to obey the right hand rule, meaning that if the thumb of
your right hand is pointing in the direction â, then your fingers are pointing in the direction of motion of the rigid body. Under
these conventions,
\[
|\mathbf{v}| = \Omega\,|\mathbf{r}|\sin\theta = |\Omega\,\hat{\mathbf{a}}\times\mathbf{r}|
\qquad \mathbf{v}\perp\hat{\mathbf{a}},\ \mathbf{r}
\qquad (\hat{\mathbf{a}}, \mathbf{r}, \mathbf{v}) \text{ obey the right hand rule}
\]
That is, v is exactly Ωâ × r. It is conventional to define the “angular velocity” of a rigid body to be the vector Ω = Ωâ. That is, the
vector with length given by the rate of rotation and direction given by the axis of rotation of the rigid body. In particular, the bigger
the rate of rotation, the longer the angular velocity vector. In terms of this angular velocity vector, the velocity of the point P is
v = Ω×r
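To see the formula v = Ω × r in action, here is a minimal sketch with made-up numbers: a body rotating at 2 radians per second about the z-axis, and a point P at r = ⟨3, 0, 4⟩. The function names are ours. The speed should agree with Ω|r| sin θ, where θ is the angle between the axis and r.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def point_velocity(rate, axis, r):
    """Velocity of the point at r on a body rotating at `rate` rad/s.

    `axis` is assumed to be a unit vector, and r is measured from a
    point on the axis of rotation.
    """
    omega = tuple(rate * x for x in axis)   # angular velocity vector
    return cross(omega, r)

rate, axis, r = 2.0, (0.0, 0.0, 1.0), (3.0, 0.0, 4.0)
v = point_velocity(rate, axis, r)

cos_theta = sum(x * y for x, y in zip(axis, r)) / norm(r)
sin_theta = math.sqrt(1.0 - cos_theta**2)

print(v)                             # velocity vector, along the y direction
print(norm(v))                       # speed
print(rate * norm(r) * sin_theta)    # Omega |r| sin(theta), the same speed
```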
velocity.
The x- and y -axes of the moving observer are painted in red on the merry-go-round. The figure on the right above shows a top
view of the merry-go-round. The x- and y -axes of the moving observer are again red. The X- and Y -axes of the fixed observer are
blue. We are assuming that at time 0, the x-axis of the moving observer and the X-axis of the fixed observer coincide. As the
merry-go-round is rotating at Ω radians per second, the angle between the X-axis and x-axis after t seconds is Ωt.
As an example, suppose that the moving particle is tied to the tip of the moving observer's unit x vector. Then
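\[
\begin{aligned}
x(t) &= 1 & y(t) &= 0 & z(t) &= 0 \\
X(t) &= \cos(\Omega t) & Y(t) &= \sin(\Omega t) & Z(t) &= 0
\end{aligned}
\]
or, if we write r(t) = (x(t), y(t), z(t)) and R(t) = (X(t), Y(t), Z(t)), then
\[
\mathbf{r}(t) = (1, 0, 0) \qquad \mathbf{R}(t) = \big(\cos(\Omega t), \sin(\Omega t), 0\big)
\]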
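In general, denote by î(t) the coordinates of the unit x-vector of the moving observer at time t, as measured by the fixed observer.
Similarly ĵ(t) for the unit y-vector, and k̂(t) for the unit z-vector. As the merry-go-round is rotating about the Z-axis at a rate of Ω
radians per second, the angle between the X-axis and x-axis after t seconds is Ωt, and
\[
\begin{aligned}
\hat{\imath}(t) &= \big(\cos(\Omega t), \sin(\Omega t), 0\big) \\
\hat{\jmath}(t) &= \big(-\sin(\Omega t), \cos(\Omega t), 0\big) \\
\hat{k}(t) &= (0, 0, 1)
\end{aligned}
\]
So if the moving particle has coordinates r(t) = (x(t), y(t), z(t)) as measured by the moving observer, its location as measured by the fixed observer is
\[
\mathbf{R}(t) = x(t)\,\hat{\imath}(t) + y(t)\,\hat{\jmath}(t) + z(t)\,\hat{k}(t)
\]
Differentiating, the velocity of the moving particle, as measured by the fixed observer is
\[
\begin{aligned}
\mathbf{V}(t) = \frac{d\mathbf{R}}{dt}
&= \frac{dx}{dt}(t)\,\hat{\imath}(t) + \frac{dy}{dt}(t)\,\hat{\jmath}(t) + \frac{dz}{dt}(t)\,\hat{k}(t) \\
&\quad + x(t)\,\frac{d}{dt}\hat{\imath}(t) + y(t)\,\frac{d}{dt}\hat{\jmath}(t) + z(t)\,\frac{d}{dt}\hat{k}(t)
\end{aligned}
\]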
We saw, in the last (optional) §1.2.7, that
\[
\frac{d}{dt}\hat{\imath}(t) = \boldsymbol{\Omega}\times\hat{\imath}(t) \qquad
\frac{d}{dt}\hat{\jmath}(t) = \boldsymbol{\Omega}\times\hat{\jmath}(t) \qquad
\frac{d}{dt}\hat{k}(t) = \boldsymbol{\Omega}\times\hat{k}(t)
\]
(You could also verify that these are correct by putting in Ω = (0, 0, Ω) and explicitly computing the cross products.) So
\[
\begin{aligned}
\mathbf{V}(t) &= \Big(\frac{dx}{dt}(t)\,\hat{\imath}(t) + \frac{dy}{dt}(t)\,\hat{\jmath}(t) + \frac{dz}{dt}(t)\,\hat{k}(t)\Big) \\
&\quad + \boldsymbol{\Omega}\times\big(x(t)\,\hat{\imath}(t) + y(t)\,\hat{\jmath}(t) + z(t)\,\hat{k}(t)\big)
\end{aligned}
\]
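Differentiating a second time, the acceleration of the moving particle (which is also F/m, where F is the net force being applied to
the particle and m is the mass of the particle) as measured by the fixed observer is
\[
\begin{aligned}
\frac{\mathbf{F}}{m} = \mathbf{A}(t)
&= \Big(\frac{d^2x}{dt^2}(t)\,\hat{\imath}(t) + \frac{d^2y}{dt^2}(t)\,\hat{\jmath}(t) + \frac{d^2z}{dt^2}(t)\,\hat{k}(t)\Big) \\
&\quad + 2\boldsymbol{\Omega}\times\Big(\frac{dx}{dt}(t)\,\hat{\imath}(t) + \frac{dy}{dt}(t)\,\hat{\jmath}(t) + \frac{dz}{dt}(t)\,\hat{k}(t)\Big) \\
&\quad + \boldsymbol{\Omega}\times\Big(\boldsymbol{\Omega}\times\big[x(t)\,\hat{\imath}(t) + y(t)\,\hat{\jmath}(t) + z(t)\,\hat{k}(t)\big]\Big)
\end{aligned}
\]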
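Recall that the angular velocity Ω = (0, 0, Ω) does not depend on time. The rotating observer sees î(t) as î = (1, 0, 0), sees ĵ(t)
as ĵ = (0, 1, 0), and sees k̂(t) as k̂ = (0, 0, 1) and so sees
\[
\frac{\mathbf{F}}{m} = \mathbf{a}(t) + 2\boldsymbol{\Omega}\times\mathbf{v}(t) + \boldsymbol{\Omega}\times\big[\boldsymbol{\Omega}\times\mathbf{r}(t)\big]
\]
where, as usual,
\[
\begin{aligned}
\mathbf{v}(t) &= \frac{d}{dt}\mathbf{r}(t) = \Big(\frac{dx}{dt}(t), \frac{dy}{dt}(t), \frac{dz}{dt}(t)\Big) \\
\mathbf{a}(t) &= \frac{d^2}{dt^2}\mathbf{r}(t) = \Big(\frac{d^2x}{dt^2}(t), \frac{d^2y}{dt^2}(t), \frac{d^2z}{dt^2}(t)\Big)
\end{aligned}
\]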
Here
F is the sum of all external forces acting on the moving particle,
F_cor = −2Ω × v(t) is called the Coriolis force and
−Ω × [Ω × r(t)] is called the centrifugal force.
As an example, suppose that you are the moving particle and that you are at the edge of the merry-go-round. Let's say t = 0 and
you are at î. Then F is the friction that the surface of the merry-go-round applies to the soles of your shoes. If you are just standing
there, v(t) = 0, so that F_cor = 0, and the friction F exactly cancels the centrifugal force −Ω × [Ω × r(t)] so that you remain at
î(t). Assume that Ω > 0. Now suppose that you start walking around the edge of the merry-go-round. Then, at t = 0, r = î and
if you walk in the direction of rotation (with speed one), as in the figure on the left below, v = ĵ and the Coriolis force
F_cor = −2Ω k̂ × ĵ = 2Ω î tries to push you off of the merry-go-round, while
if you walk opposite to the direction of rotation (with speed one), as in the figure on the right below, v = −ĵ so that the
Coriolis force F_cor = −2Ω k̂ × (−ĵ) = −2Ω î tries to pull you into the centre of the merry-go-round.
On a rotating ball, such as the Earth, the Coriolis force deflects wind to the right in the northern hemisphere and
to the left in the southern hemisphere. In particular, hurricanes/cyclones/typhoons rotate counterclockwise in the
northern hemisphere and clockwise in the southern hemisphere. On the other hand, when it comes to water draining out of, for
example, a toilet, Coriolis force effects are dominated by other factors like asymmetry of the toilet.
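The two walking cases of the merry-go-round example can be reproduced with a short computation. This is a minimal sketch that follows the conventions of the text (Coriolis force −2Ω × v, centrifugal force −Ω × [Ω × r], with Ω = (0, 0, Ω)). The function names and the sample rate Ω = 3 are ours.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

def scale(s, a):
    return tuple(s * x for x in a)

def coriolis(omega_vec, v):
    """Coriolis force -2 Omega x v, in the convention of the text."""
    return scale(-2, cross(omega_vec, v))

def centrifugal(omega_vec, r):
    """Centrifugal force -Omega x (Omega x r), in the convention of the text."""
    return scale(-1, cross(omega_vec, cross(omega_vec, r)))

rate = 3                      # sample rotation rate Omega > 0
omega_vec = (0, 0, rate)      # rotation about the z-axis
r = (1, 0, 0)                 # standing at i-hat at time t = 0

print(coriolis(omega_vec, (0, 1, 0)))    # (6, 0, 0): 2*Omega*i, pushes outward
print(coriolis(omega_vec, (0, -1, 0)))   # (-6, 0, 0): -2*Omega*i, pulls inward
print(centrifugal(omega_vec, r))         # (9, 0, 0): Omega^2 * i, points outward
```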
Exercises
Stage 1
1.
2.
Determine whether or not the given points are collinear (that is, lie on a common straight line)
1. (1, 2, 3), (0, 3, 7), (3, 5, 11)
2. (0, 3, −5), (1, 2, −2), (3, 0, 4)
3.
4.
5.
6.
7.
Does the triangle with vertices (1, 2, 3), (4, 0, 5) and (3, 6, 4) have a right angle?
8.
Show that the area of the parallelogram determined by the vectors a and b is |a × b|.
9.
Show that the volume of the parallelepiped determined by the vectors a, b and c is
|a ⋅ (b × c)|
10.
Verify by direct computation that
1. î × ĵ = k̂, ĵ × k̂ = î, k̂ × î = ĵ
2. a ⋅ (a × b) = b ⋅ (a × b) = 0
11.
Consider the following statement: “If a ≠ 0 and if a ⋅ b = a ⋅ c then b = c.” If the statement is true, prove it. If the statement
is false, give a counterexample.
12.
Consider the following statement: “The vector a × (b × c) is of the form αb + βc for some real numbers α and β.” If the
statement is true, prove it. If the statement is false, give a counterexample.
13.
14.
15.
Consider the three points O = (0, 0), A = (a, 0) and B = (b, c).
1. Sketch, in a single figure,
the triangle with vertices O, A and B, and
the circumscribing circle for the triangle (i.e. the circle that goes through all three vertices), and
the vectors
\(\overrightarrow{OA}\), from O to A,
\(\overrightarrow{OB}\), from O to B,
\(\overrightarrow{OC}\), from O to C, where C is the centre of the circumscribing circle.
Then add to the sketch and evaluate, from the sketch,
the projection of the vector \(\overrightarrow{OC}\) on the vector \(\overrightarrow{OA}\), and
the projection of the vector \(\overrightarrow{OC}\) on the vector \(\overrightarrow{OB}\).
2. Determine C.
3. Evaluate, using the formula 1.2.14,
the projection of the vector \(\overrightarrow{OC}\) on the vector \(\overrightarrow{OA}\), and
the projection of the vector \(\overrightarrow{OC}\) on the vector \(\overrightarrow{OB}\).
Stage 2
16.
Find the equation of a sphere if one of its diameters has end points (2, 1, 4) and (4, 3, 10).
17.
Use vectors to prove that the line joining the midpoints of two sides of a triangle is parallel to the third side and half its length.
18.
19 ✳.
20.
21.
Compute the dot product of the vectors a and b. Find the angle between them.
1. a = ⟨1, 2⟩ , b = ⟨−2, 3⟩
2. a = ⟨−1, 1⟩ , b = ⟨1, 1⟩
3. a = ⟨1, 1⟩ , b = ⟨2, 2⟩
4. a = ⟨1, 2, 1⟩ , b = ⟨−1, 1, 1⟩
5. a = ⟨−1, 2, 3⟩ , b = ⟨3, 0, 1⟩
22.
23.
Determine all values of y for which the given vectors are perpendicular.
1. ⟨2, 4⟩ , ⟨2, y⟩
2. ⟨4, −1⟩ , ⟨y, y²⟩
24.
25.
26.
27.
28.
Let p = ⟨−1, 4, 2⟩ , q = ⟨3, 1, −1⟩ , r = ⟨2, −3, −1⟩ . Check, by direct computation, that
1. p × p = 0
2. p × q = −q × p
3. p × (3r) = 3(p × r)
4. p × (q + r) = p × q + p × r
5. p × (q × r) ≠ (p × q) × r
29.
Calculate the area of the triangle with vertices (0, 0, 0), (1, 2, 3) and (3, 2, 1).
30 ✳.
A particle P of unit mass whose position in space at time t is r(t) has angular momentum L(t) = r(t) × r′(t). If
r″(t) = ρ(t)r(t) for a scalar function ρ, show that L is constant, i.e. does not change with time. Here ′ denotes d/dt.
Stage 3
31.
32.
Consider a cube such that each side has length s. Name, in order, the four vertices on the bottom of the cube A, B, C , D and
the corresponding four vertices on the top of the cube A′, B′, C′, D′.
1. Show that all edges of the tetrahedron A′C′BD have the same length.
2. Let E be the center of the cube. Find the angle between EA and EC .
33.
Find the angle between the diagonal of a cube and the diagonal of one of its faces.
34.
Consider a skier who is sliding without friction on the hill y = h(x) in a two dimensional world. The skier is subject to two
forces. One is gravity. The other acts perpendicularly to the hill. The second force automatically adjusts its magnitude so as to
prevent the skier from burrowing into the hill. Suppose that the skier became airborne at some (x₀, y₀) with y₀ = h(x₀). How
35.
A marble is placed on the plane ax + by + cz = d. The coordinate system has been chosen so that the positive z -axis points
straight up. The coefficient c is nonzero and the coefficients a and b are not both zero. In which direction does the marble roll?
Why were the conditions “c ≠ 0 ” and “a, b not both zero” imposed?
36.
Show that a ⋅ (b × c) = (a × b) ⋅ c.
37.
38.
Derive a formula for (a × b) ⋅ (c × d) that involves dot but not cross products.
39.
1. Verify that three of the faces are parallelograms. Are they rectangular?
2. Find the length of AA′.
40.
(Three dimensional Pythagorean Theorem) A solid body in space with exactly four vertices is called a tetrahedron. Let A, B,
C and D be the areas of the four faces of a tetrahedron. Suppose that the three edges meeting at the vertex opposite the face of
41.
(Three dimensional law of cosines) Let A, B, C and D be the areas of the four faces of a tetrahedron. Let α be the angle
between the faces with areas B and C , β be the angle between the faces with areas A and C and γ be the angle between the
faces with areas A and B. (By definition, the angle between two faces is the angle between the normal vectors to the faces.)
Show that
\[
D^2 = A^2 + B^2 + C^2 - 2BC\cos\alpha - 2AC\cos\beta - 2AB\cos\gamma
\]
12. This figure is a variant of
https://commons.wikimedia.org/wiki/File:Right_hand_rule_simple.png
13. Note that as you translate or rotate the coordinate system, the right hand rule is preserved. If (a, b, n̂) obey the right hand rule
This page titled 1.2: Vectors is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Joel Feldman, Andrew
Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is
available upon request.
1.3: Equations of Lines in 2d
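A line in two dimensions can be specified by giving one point (x0, y0) on the line and one vector d = ⟨dx, dy⟩ whose direction is
parallel to the line. If (x, y) is any point on the line, then the vector ⟨x − x0, y − y0⟩ is parallel to d and so is a scalar multiple of d:
\[
\langle x - x_0,\ y - y_0\rangle = t\,\mathbf{d}
\]
or, writing out in components,
\[
\begin{aligned}
x - x_0 &= t\,d_x \\
y - y_0 &= t\,d_y
\end{aligned}
\]
These are called the parametric equations of the line, because they contain a free parameter, namely t. As t varies from −∞ to ∞,
the point (x0 + tdx, y0 + tdy) traverses the entire line.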
It is easy to eliminate the parameter t from the equations. Just multiply x − x0 = tdx by dy , multiply y − y0 = tdy by dx and
subtract to give
(x − x0 )dy − (y − y0 )dx = 0
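In the event that dx and dy are both nonzero, we can rewrite this as
\[
\frac{x - x_0}{d_x} = \frac{y - y_0}{d_y}
\]
A second way to specify a line in two dimensions is to give one point (x0, y0) on the line and one vector n = ⟨nx, ny⟩ whose direction is perpendicular to that of the line.
If (x, y) is any point on the line then the vector ⟨x − x0, y − y0⟩, whose tail is at (x0, y0) and whose head is at (x, y), must be
perpendicular to n so that
Equation 1.3.3
\[
\mathbf{n}\cdot\langle x - x_0,\ y - y_0\rangle = 0
\]
Writing out in components
\[
n_x(x - x_0) + n_y(y - y_0) = 0 \qquad\text{or}\qquad n_x x + n_y y = n_x x_0 + n_y y_0
\]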
1.3.1 https://math.libretexts.org/@go/page/89205
Observe that the coefficients $n_x, n_y$ of $x$ and $y$ in the equation of the line are the components of a vector $\langle n_x, n_y \rangle$ perpendicular to
the line. This enables us to read off a vector perpendicular to any given line directly from the equation of the line. Such a vector is
called a normal vector for the line.
Example 1.3.4
Consider, for example, the line y = 3x + 7. To rewrite this equation in the form
nx x + ny y = nx x0 + ny y0
we have to move terms around so that x and y are on one side of the equation and 7 is on the other side: 3x − y = −7. Then
$n_x$ is the coefficient of x, namely 3, and $n_y$ is the coefficient of y, namely −1. One normal vector for y = 3x + 7 is ⟨3, −1⟩.
Of course, if ⟨3, −1⟩ is perpendicular to y = 3x + 7, so is −5 ⟨3, −1⟩ = ⟨−15, 5⟩. In fact, if we first multiply the equation
3x − y = −7 by −5 to get −15x + 5y = 35 and then set $n_x$ and $n_y$ to the coefficients of x and y respectively, we get
$\mathbf{n} = \langle -15, 5 \rangle$.
Example 1.3.5
In this example, we find the point on the line y = 6 − 3x (call the line L) that is closest to the point (7, 5).
We'll start by sketching the line. To do so, we guess two points on L and then draw the line that passes through the two points.
If (x, y) is on L and x = 0, then y = 6. So (0, 6) is on L.
If (x, y) is on L and y = 0, then x = 2. So (2, 0) is on L.
Denote by P the point on L that is closest to (7, 5). It is characterized by the property that the line from (7, 5) to P is
perpendicular to L. This is the case just because if Q is any other point on L, then, by Pythagoras, the distance from (7, 5) to
Q is larger than the distance from (7, 5) to P . See the figure on the right above.
Let's use N to denote the line which passes through (7, 5) and which is perpendicular to L.
Since L has the equation 3x + y = 6, one vector perpendicular to L, and hence parallel to N , is ⟨3, 1⟩ . So if (x, y) is any
point on N , the vector ⟨x − 7, y − 5⟩ must be of the form t ⟨3, 1⟩ . So the parametric equations of N are
⟨x − 7, y − 5⟩ = t ⟨3, 1⟩ or x = 7 + 3t, y = 5 + t
Now let (x, y) be the coordinates of P . Since P is on N , we have x = 7 + 3t, y = 5 +t for some t. Since P is also on L, we
also have 3x + y = 6. So
3(7 + 3t) + (5 + t) = 6
⟺ 10t + 26 = 6
⟺ t = −2
⟹ x = 7 + 3 × (−2) = 1, y = 5 + (−2) = 3
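So the closest point is P = (1, 3). The computation can be checked with a minimal Python sketch. The helper name `closest_point_on_line` is an illustrative assumption, not part of the text; it walks from the given point along the normal of the line n_x x + n_y y = c, which amounts to solving n ⋅ (q + t n) = c for t, just as was done above.

```python
from fractions import Fraction

def closest_point_on_line(n, c, q):
    """Closest point to q on the 2d line n[0]*x + n[1]*y = c.

    Points on the perpendicular through q have the form q + t*n.
    Substituting into the line equation and solving for t gives
    t = (c - n.q) / (n.n).
    """
    n_dot_q = n[0] * q[0] + n[1] * q[1]
    n_dot_n = n[0] * n[0] + n[1] * n[1]
    t = Fraction(c - n_dot_q, n_dot_n)
    return t, (q[0] + t * n[0], q[1] + t * n[1])

# Line L: 3x + y = 6 and the point (7, 5), as in Example 1.3.5.
t, P = closest_point_on_line((3, 1), 6, (7, 5))
print(t, P)
```

Exact rational arithmetic is used so that the result can be compared with the hand computation without rounding.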
Exercises
Stage 1
1
Which of the following gives its parametric equation: ⟨x, y⟩ = c + td, or ⟨x, y⟩ = c − td?
2
Which of the following gives its parametric equation: ⟨x, y⟩ = c + td, or ⟨x, y⟩ = −c + td?
3
$$\langle x - 1, y - 9 \rangle = t \langle 8, 4 \rangle$$
and
$$\langle x - 9, y - 13 \rangle = t \left\langle 1, \tfrac{1}{2} \right\rangle$$
describe the same line by finding two different points that lie on both lines.
4
x −3 = 9t
y −5 = 7t
There are many different ways to write the parametric equations of this line. If we rewrite the equations as
x − x0 = dx t
y − y0 = dy t
Stage 2
5
Find the vector parametric, scalar parametric and symmetric equations for the line containing the given point and with the
given direction.
1. point (1, 2), direction ⟨3, 2⟩
2. point (5, 4), direction ⟨2, −1⟩
3. point (−1, 3), direction ⟨−1, 2⟩
6
Find the vector parametric, scalar parametric and symmetric equations for the line containing the given point and with the
given normal.
1. point (1, 2), normal ⟨3, 2⟩
2. point (5, 4), normal ⟨2, −1⟩
3. point (−1, 3), normal ⟨−1, 2⟩
7
Use a projection to find the distance from the point (−2, 3) to the line 3x − 4y = −4.
8
Let a, b and c be the vertices of a triangle. By definition, a median of a triangle is a straight line that passes through a vertex
of the triangle and through the midpoint of the opposite side.
1. Find the parametric equations of the three medians.
2. Do the three medians meet at a common point? If so, which point?
9
Let C be the circle of radius 1 centred at (2, 1). Find an equation for the line tangent to C at the point $\left( \frac{5}{2}, 1 + \frac{\sqrt{3}}{2} \right)$.
This page titled 1.3: Equations of Lines in 2d is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Joel
Feldman, Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a detailed
edit history is available upon request.
1.4: Equations of Planes in 3d
Specifying one point $(x_0, y_0, z_0)$ on a plane and a vector $\mathbf{d}$ parallel to the plane does not uniquely determine the plane, because it
is free to rotate about $\mathbf{d}$. On the other hand, giving one point $(x_0, y_0, z_0)$
on the plane and one vector $\mathbf{n} = \langle n_x, n_y, n_z \rangle$ with direction perpendicular to that of the plane does uniquely determine the plane. If
$(x, y, z)$ is any point on the plane then the vector $\langle x - x_0, y - y_0, z - z_0 \rangle$, whose tail is at $(x_0, y_0, z_0)$ and whose head is at
$(x, y, z)$, lies entirely inside the plane and so must be perpendicular to $\mathbf{n}$. That is,
n ⋅ ⟨x − x0 , y − y0 , z − z0 ⟩ = 0
nx (x − x0 ) + ny (y − y0 ) + nz (z − z0 ) = 0 or nx x + ny y + nz z = d
where $d = n_x x_0 + n_y y_0 + n_z z_0$.
Again, the coefficients $n_x, n_y, n_z$ of $x$, $y$ and $z$ in the equation of the plane are the components of a vector $\langle n_x, n_y, n_z \rangle$
perpendicular to the plane. The vector n is often called a normal vector for the plane. Any nonzero multiple of n will also be
perpendicular to the plane and is also called a normal vector.
Example 1.4.2
We have just seen that if we write the equation of a plane in the standard form
ax + by + cz = d
then it is easy to read off a normal vector for the plane. It is just ⟨a, b, c⟩ . So for example the planes
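$$P:\ x + 2y + 3z = 4 \qquad P':\ 3x + 6y + 9z = 7$$
have normal vectors $\mathbf{n} = \langle 1, 2, 3 \rangle$ and $\mathbf{n}' = \langle 3, 6, 9 \rangle$, respectively. Since $\mathbf{n}' = 3\mathbf{n}$, the two normal vectors
$\mathbf{n}$ and $\mathbf{n}'$ are parallel to each other. This tells us that the planes $P$ and $P'$ are parallel to each other.
When the normal vectors of two planes are perpendicular to each other, we say that the planes are perpendicular to each other.
For example the planes
$$P:\ x + 2y + 3z = 4 \qquad P'':\ 2x - y = 7$$
have normal vectors $\mathbf{n} = \langle 1, 2, 3 \rangle$ and $\mathbf{n}'' = \langle 2, -1, 0 \rangle$, respectively. Since
$\mathbf{n} \cdot \mathbf{n}'' = 1 \times 2 + 2 \times (-1) + 3 \times 0 = 0$, the normal vectors are perpendicular to each other, and so the planes $P$ and $P''$ are
perpendicular to each other.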
Here is an example that illustrates how one can sketch a plane, given the equation of the plane.
1.4.1 https://math.libretexts.org/@go/page/89206
Example 1.4.3
In this example, we sketch the plane
$$P:\ 4x + 3y + 2z = 12$$
A good way to prepare for sketching a plane is to find the intersection points of the plane with the x-, y- and z-axes, just as
you are used to doing when sketching lines in the xy-plane. For example, any point on the x-axis must be of the form (x, 0, 0).
For (x, 0, 0) to also be on P we need $x = \frac{12}{4} = 3$. So P intersects the x-axis at (3, 0, 0). Similarly, P intersects the y-axis at
(0, 4, 0) and the z-axis at (0, 0, 6). Now plot the points (3, 0, 0), (0, 4, 0) and (0, 0, 6). P is the plane containing these three
points. The part of the intersection of P with the xy-plane that is in the first octant is the line segment from (3, 0, 0) to (0, 4, 0), the part of the
intersection of P with the xz-plane that is in the first octant is the line segment from (3, 0, 0) to (0, 0, 6) and the part of the
intersection of P with the yz-plane that is in the first octant is the line segment from (0, 4, 0) to (0, 0, 6). So we just have to
sketch the three line segments joining the three axis intercepts (3, 0, 0), (0, 4, 0) and (0, 0, 6). That's it.
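The intercept computation of Example 1.4.3 can be written as a short Python sketch. The function name `axis_intercepts` is a hypothetical helper introduced for illustration, and it assumes that all three coefficients of the plane ax + by + cz = d are nonzero (otherwise the plane is parallel to, or contains, one of the axes).

```python
from fractions import Fraction

def axis_intercepts(a, b, c, d):
    """Intercepts of the plane a*x + b*y + c*z = d with the x-, y- and z-axes.

    A point on the x-axis has the form (x, 0, 0), so a*x = d, and
    similarly for the other two axes. Assumes a, b, c are all nonzero.
    """
    return [
        (Fraction(d, a), 0, 0),
        (0, Fraction(d, b), 0),
        (0, 0, Fraction(d, c)),
    ]

# The plane P: 4x + 3y + 2z = 12 from Example 1.4.3.
intercepts = axis_intercepts(4, 3, 2, 12)
print(intercepts)
```

Joining the three returned points pairwise gives the three line segments described in the example.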
Here are two examples that illustrate how one can find the distance between a point and a plane.
Example 1.4.4
In this example, we compute the distance between the point $\mathbf{x} = (1, -1, -3)$ and the plane $P:\ x + 2y + 3z = 18$.
By the “distance between $\mathbf{x}$ and the plane $P$” we mean the shortest distance between $\mathbf{x}$ and any point $\mathbf{y}$ on $P$. In fact, we'll
evaluate the distance in two different ways. In the next Example 1.4.5, we'll use projection. In this example, our strategy for
finding the distance will be to
first observe that the vector $\mathbf{n} = \langle 1, 2, 3 \rangle$ is normal to $P$ and then
start walking away from $\mathbf{x}$ in the direction of the normal vector $\mathbf{n}$ and
keep walking until we hit $P$. Call the point on $P$ where we hit, $\mathbf{y}$. Then the desired distance is the distance between $\mathbf{x}$ and
$\mathbf{y}$. From the figure below it does indeed look like the distance between $\mathbf{x}$ and $\mathbf{y}$ is the shortest distance between $\mathbf{x}$ and any
point on $P$.
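So imagine that we start walking, and that we start at time t = 0 at $\mathbf{x}$ and walk in the direction $\mathbf{n}$. Then at time t we might be
at
$$\mathbf{x} + t\mathbf{n} = (1, -1, -3) + t \langle 1, 2, 3 \rangle = (1 + t, -1 + 2t, -3 + 3t)$$
We hit the plane P at exactly the time t for which (1 + t, −1 + 2t, −3 + 3t) satisfies the equation for P, which is
x + 2y + 3z = 18. So we are on P at the unique time t obeying
$$(1 + t) + 2(-1 + 2t) + 3(-3 + 3t) = 18 \iff 14t = 28 \iff t = 2$$
So the point on P which is closest to $\mathbf{x}$ is
$$\mathbf{y} = \big[ \mathbf{x} + t\mathbf{n} \big]_{t=2} = (1 + t, -1 + 2t, -3 + 3t)\Big|_{t=2} = (3, 3, 3)$$
and the distance from $\mathbf{x}$ to P is the distance from $\mathbf{x}$ to $\mathbf{y}$, which is $|\mathbf{y} - \mathbf{x}| = 2|\mathbf{n}| = 2\sqrt{14}$.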
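The walking strategy of Example 1.4.4 can be sketched in Python as follows. The function name `walk_to_plane` is an illustrative assumption; it solves n ⋅ (x + t n) = d for t, which is exactly the "unique time" condition above, and then measures how far we walked.

```python
import math

def walk_to_plane(x, n, d):
    """Start at x and walk along n until reaching the plane n . p = d.

    Returns the hitting time t, the hitting point y = x + t*n and the
    distance |y - x| = |t| * |n|.
    """
    n_dot_x = sum(ni * xi for ni, xi in zip(n, x))
    n_dot_n = sum(ni * ni for ni in n)
    t = (d - n_dot_x) / n_dot_n
    y = tuple(xi + t * ni for xi, ni in zip(x, n))
    return t, y, abs(t) * math.sqrt(n_dot_n)

# x = (1, -1, -3) and P: x + 2y + 3z = 18, so n = <1, 2, 3> and d = 18.
t, y, dist = walk_to_plane((1, -1, -3), (1, 2, 3), 18)
print(t, y, dist)
```

For these inputs the hitting time is t = 2 and the hitting point is (3, 3, 3), so the distance is 2√14.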
Now let's find a point on P . The plane P is given by a single equation, namely
x + 2y + 3z = 18
in the three unknowns, x, y, z. The easiest way to find one solution to this equation is to assign two of the unknowns the value
zero and then solve for the third unknown. For example, if we set x = y = 0, then the equation reduces to 3z = 18. So we
may take z = (0, 0, 6).
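Then $\mathbf{v}$, the vector from $\mathbf{x} = (1, -1, -3)$ to $\mathbf{z} = (0, 0, 6)$, is ⟨0 − 1 , 0 − (−1) , 6 − (−3)⟩ = ⟨−1, 1, 9⟩ so that, by Equation
1.2.14,
$$\begin{aligned}
\mathrm{proj}_{\mathbf{n}}\, \mathbf{v} &= \frac{\mathbf{v} \cdot \mathbf{n}}{|\mathbf{n}|^2}\, \mathbf{n} \\
&= \frac{\langle -1, 1, 9 \rangle \cdot \langle 1, 2, 3 \rangle}{|\langle 1, 2, 3 \rangle|^2}\, \langle 1, 2, 3 \rangle \\
&= \frac{28}{14}\, \langle 1, 2, 3 \rangle \\
&= 2 \langle 1, 2, 3 \rangle
\end{aligned}$$
The distance from $\mathbf{x}$ to P is the length of this projection, $|2 \langle 1, 2, 3 \rangle| = 2\sqrt{14}$,
just as we found in Example 1.4.4.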
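The projection computation can be reproduced with a few lines of Python. This is only a sketch: the helper names `dot` and `proj` are assumptions made for illustration, and `proj` implements the formula proj_n v = (v ⋅ n / |n|²) n quoted from Equation 1.2.14.

```python
import math

def dot(u, v):
    """Dot product of two vectors given as tuples."""
    return sum(a * b for a, b in zip(u, v))

def proj(v, n):
    """Projection of v on n, i.e. (v . n / |n|^2) n."""
    scale = dot(v, n) / dot(n, n)
    return tuple(scale * c for c in n)

x = (1, -1, -3)          # the given point
z = (0, 0, 6)            # any point on P: x + 2y + 3z = 18
n = (1, 2, 3)            # normal vector of P

v = tuple(zi - xi for zi, xi in zip(z, x))   # vector from x to z
p = proj(v, n)
dist = math.sqrt(dot(p, p))                  # length of the projection
print(v, p, dist)
```

Any other point z on P would give a different v but the same projection on n, which is why the choice of z does not matter.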
Example 1.4.6
Now we'll increase the degree of difficulty a tiny bit, and compute the distance between the planes
$$P:\ x + 2y + 2z = 1 \quad \text{and} \quad P':\ 2x + 4y + 4z = 11$$
By the “distance between the planes $P$ and $P'$” we mean the shortest distance between any pair of points $\mathbf{x}$ and $\mathbf{x}'$ with $\mathbf{x}$ in $P$
and $\mathbf{x}'$ in $P'$. First observe that the normal vectors
$$\mathbf{n} = \langle 1, 2, 2 \rangle \quad \text{and} \quad \mathbf{n}' = \langle 2, 4, 4 \rangle = 2\mathbf{n}$$
to $P$ and $P'$ are parallel to each other. So the planes $P$ and $P'$ are parallel to each other. If they had not been parallel, they
would have crossed and the distance between them would have been zero.
Our strategy for finding the distance will be to
first find a point $\mathbf{x}$ on $P$ and then, like we did in Example 1.4.4,
start walking away from $P$ in the direction of the normal vector $\mathbf{n}$ and
keep walking until we hit $P'$. Call the point on $P'$ that we hit $\mathbf{x}'$. Then the desired distance is the distance between $\mathbf{x}$ and
$\mathbf{x}'$. From the figure below it does indeed look like the distance between $\mathbf{x}$ and $\mathbf{x}'$ is the shortest distance between any pair of
points with one point on $P$ and one point on $P'$. Again, this is in fact true, though we won't prove it.
We can find a point on $P$ just as we did in Example 1.4.5. The plane $P$ is given by the single equation
x + 2y + 2z = 1
in the three unknowns, x, y, z. We can find one solution to this equation by assigning two of the unknowns the value zero and
then solving for the third unknown. For example, if we set y = z = 0, then the equation reduces to x = 1. So we may take
x = (1, 0, 0).
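Now imagine that we start walking, and that we start at time t = 0 at $\mathbf{x}$ and walk in the direction $\mathbf{n}$. Then at time t we might
be at
$$\mathbf{x} + t\mathbf{n} = (1, 0, 0) + t \langle 1, 2, 2 \rangle = (1 + t, 2t, 2t)$$
We hit the second plane $P'$ at exactly the time t for which (1 + t, 2t, 2t) satisfies the equation for $P'$, which is 2x + 4y + 4z = 11.
So we are on $P'$ at the unique time t obeying
$$2(1 + t) + 4(2t) + 4(2t) = 11 \iff 18t = 9 \iff t = \frac{1}{2}$$
So the point on $P'$ which is closest to $\mathbf{x}$ is
$$\mathbf{x}' = \big[ \mathbf{x} + t\mathbf{n} \big]_{t=\frac{1}{2}} = (1 + t, 2t, 2t)\Big|_{t=\frac{1}{2}} = \left( \frac{3}{2}, 1, 1 \right)$$
and the distance from $P$ to $P'$ is the distance from $\mathbf{x}$ to $\mathbf{x}'$, which is
$$\sqrt{\left(1 - \frac{3}{2}\right)^2 + (0 - 1)^2 + (0 - 1)^2} = \sqrt{\frac{9}{4}} = \frac{3}{2}$$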
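Here is a small Python sketch of the same computation. It assumes that both planes have first been written with a common normal vector, n ⋅ p = d1 and n ⋅ p = d2; for the planes above that means dividing 2x + 4y + 4z = 11 by 2 to get x + 2y + 2z = 11/2. Walking from the first plane along n then takes time t = (d2 − d1)/|n|², and the distance walked is |t| |n| = |d2 − d1| / |n|. The function name `parallel_plane_distance` is a hypothetical helper, not something defined in the text.

```python
import math

def parallel_plane_distance(n, d1, d2):
    """Distance between the parallel planes n . p = d1 and n . p = d2."""
    return abs(d2 - d1) / math.sqrt(sum(c * c for c in n))

# P: x + 2y + 2z = 1 and P': x + 2y + 2z = 11/2 (the second equation halved).
dist = parallel_plane_distance((1, 2, 2), 1, 11 / 2)
print(dist)
```

The value agrees with the distance 3/2 found by walking from (1, 0, 0) to (3/2, 1, 1).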
Now we'll find the angle between two intersecting planes.
Example 1.4.7
The orientation (i.e. direction) of a plane is determined by its normal vector. So, by definition, the angle between two planes is
the angle between their normal vectors. For example, the normal vectors of the two planes
P1 : 2x + y − z = 3
P2 : x +y +z = 4
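are
$$\mathbf{n}_1 = \langle 2, 1, -1 \rangle \qquad \mathbf{n}_2 = \langle 1, 1, 1 \rangle$$
If we use θ to denote the angle between $\mathbf{n}_1$ and $\mathbf{n}_2$, then
$$\cos\theta = \frac{\mathbf{n}_1 \cdot \mathbf{n}_2}{|\mathbf{n}_1|\,|\mathbf{n}_2|} = \frac{2}{\sqrt{6}\,\sqrt{3}}$$
so that
$$\theta = \arccos \frac{2}{\sqrt{18}} = 1.0799$$
radians. In degrees, that is $1.0799 \times \frac{180}{\pi} = 61.87^\circ$ to two decimal places.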
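The angle can be checked numerically with the following Python sketch. The function name `angle_between` is an assumption made for illustration; it simply evaluates the arccosine of n1 ⋅ n2 / (|n1| |n2|) for the two normal vectors of Example 1.4.7.

```python
import math

def angle_between(u, v):
    """Angle, in radians, between two nonzero vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return math.acos(dot / (norm_u * norm_v))

n1 = (2, 1, -1)   # normal to P1: 2x + y - z = 3
n2 = (1, 1, 1)    # normal to P2: x + y + z = 4

theta = angle_between(n1, n2)
theta_degrees = math.degrees(theta)
print(theta, theta_degrees)
```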
Exercises
Stage 1
1
The vector $\hat{\mathbf{k}}$ is a normal vector (i.e. is perpendicular) to the plane z = 0. Find another nonzero vector that is normal to z = 0.
2
2
y + z = 4.
3
1. Find the equation of the plane that passes through the origin and has normal vector ⟨1, 2, 3⟩ .
2. Find the equation of the plane that passes through the point (0, 0, 1) and has normal vector ⟨1, 1, 3⟩ .
3. Find, if possible, the equation of a plane that passes through both (1, 2, 3) and (1, 0, 0) and has normal vector ⟨4, 5, 6⟩ .
4. Find, if possible, the equation of a plane that passes through both (1, 2, 3) and (0, 3, 4) and has normal vector ⟨2, 1, 1⟩ .
4. ✳
Find the equation of the plane that contains (1, 0, 0), (0, 1, 0) and (0, 0, 1).
5
1. Find the equation of the plane containing the points (1, 0, 1), (1, 1, 0) and (0, 1, 1).
2. Is the point (1, 1, 1) on the plane?
3. Is the origin on the plane?
4. Is the point (4, −1, −1) on the plane?
6
What's wrong with the following exercise? “Find the equation of the plane containing (1, 2, 3), (2, 3, 4) and (3, 4, 5).”
Stage 2
7
8
Find the distance from the given point to the given plane.
1. point (−1, 2, 3), plane x + y + z = 7
2. point (1, −4, 3), plane x − 2y + z = 5
9. ✳
A plane Π passes through the points A = (1, 1, 3), B = (2, 0, 2) and C = (2, 1, 0) in $\mathbb{R}^3$.
10. ✳
Let A = (2, 3, 4) and let L be the line given by the equations x + y = 1 and x + 2y + z = 3.
1. Write an equation for the plane containing A and perpendicular to L.
2. Write an equation for the plane containing A and L.
11. ✳
Consider the plane 4x + 2y − 4z = 3. Find all parallel planes that are distance 2 from the above plane. Your answers should
be in the following form: 4x + 2y − 4z = C .
12. ✳
Find the distance from the point (1, 2, 3) to the plane that passes through the points (0, 1, 1), (1, −1, 3) and (2, 0, −1).
Stage 3
13. ✳
Consider two planes $W_1$, $W_2$, and a line $M$ defined by:
$$W_1:\ -2x + y + z = 7 \qquad W_2:\ -x + 3y + 3z = 6 \qquad M:\ \frac{x}{2} = \frac{2y - 4}{4} = z + 5$$
14
Find the equation of the sphere which has the two planes x + y + z = 3, x + y + z = 9 as tangent planes if the center of the
sphere is on the planes 2x − y = 0, 3x − z = 0.
15
Find the equation of the plane that passes through the point (−2, 0, 1) and through the line of intersection of
2x + 3y − z = 0, x − 4y + 2z = −5.
16
17
Describe the set of points equidistant from (1, 2, 3) and (5, 2, 7).
18
19. ✳
Consider a point P (5, −10, 2) and the triangle with vertices A(0, 1, 1), B(1, 0, 1) and C (1, 3, 0).
1. Compute the area of the triangle ABC .
2. Find the distance from the point P to the plane containing the triangle.
20. ✳
Suppose that you are at the point (2, 2, 0) on S, and you plan to follow the shortest path on S to (2, 1, −1). Express your
initial direction as a cross product.
1. To see why heading in the normal direction gives the shortest walk, revisit Example 1.3.5
2. Now might be a good time to review the Definition 1.2.13 of projection.
This page titled 1.4: Equations of Planes in 3d is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Joel
Feldman, Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a detailed
edit history is available upon request.
1.5: Equations of Lines in 3d
Just as in two dimensions, a line in three dimensions can be specified by giving one point $(x_0, y_0, z_0)$ on the line and one vector
$\mathbf{d} = \langle d_x, d_y, d_z \rangle$ whose direction is parallel to that of the line. If $(x, y, z)$ is any point on the line then the vector
$\langle x - x_0, y - y_0, z - z_0 \rangle$, whose tail is at $(x_0, y_0, z_0)$ and whose head is at $(x, y, z)$, must be parallel to $\mathbf{d}$ and hence a scalar
multiple of $\mathbf{d}$. So
$$\langle x - x_0, y - y_0, z - z_0 \rangle = t\,\mathbf{d}$$
or, writing out the components,
$$x - x_0 = t\,d_x \qquad y - y_0 = t\,d_y \qquad z - z_0 = t\,d_z$$
These are called the parametric equations of the line. Solving all three equations for the parameter t (assuming that $d_x$, $d_y$ and $d_z$
are all nonzero)
$$t = \frac{x - x_0}{d_x} = \frac{y - y_0}{d_y} = \frac{z - z_0}{d_z}$$
and erasing the “t = ” again gives the (so called) symmetric equations for the line.
Here is an example in which we find the parametric equations of a line that is given by the intersection of two planes.
Example 1.5.2
The set of points (x, y, z) that obey x + y + z = 2 form a plane. The set of points (x, y, z) that obey x − y = 0 form a second
plane. The set of points (x, y, z) that obey both x + y + z = 2 and x − y = 0 lie on the intersection of these two planes and
hence form a line. We shall find the parametric equations for that line.
To sketch x + y + z = 2 we observe that if any two of x, y, z are zero, then the third is 2. So all of (0, 0, 2), (0, 2, 0) and
(2, 0, 0) are on x + y + z = 2. The plane x − y = 0 contains all of the z -axis, since (0, 0, z) obeys x − y = 0 for all z. Here
Method 1. Each point on the line has a different value of z. We'll use z as the parameter. (We could just as well use x or y.)
There is no law that requires us to use the parameter name t, but that's what we have done so far, so set t = z. If (x, y, z) is on
the line then z = t and
1.5.1 https://math.libretexts.org/@go/page/89207
x +y +t = 2
x −y =0
The second equation forces y = x. Substituting this into the first equation gives
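$$2x + t = 2 \implies x = y = 1 - \frac{t}{2}$$
So the parametric equations for the line are
$$x = 1 - \frac{t}{2} \qquad y = 1 - \frac{t}{2} \qquad z = t$$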
Method 2. We first find one point on the line. There are lots of them. We'll find the point with z = 0. (We could just as well use
z=123.4, but arguably z = 0 is a little easier.) If (x, y, z) is on the line and z = 0, then
x +y = 2
x −y = 0
The second equation again forces y = x. Substituting this into the first equation gives
2x = 2 ⟹ x =y =1
So (1, 1, 0) is on the line. Now we'll find a direction vector, d, for the line.
Since the line is contained in the plane x + y + z = 2, any vector lying on the line, like d, is also completely contained in
that plane. So d must be perpendicular to the normal vector of x + y + z = 2, which is ⟨1, 1, 1⟩ .
Similarly, since the line is contained in the plane x − y = 0, any vector lying on the line, like d, is also completely
contained in that plane. So d must be perpendicular to the normal vector of x − y = 0, which is ⟨1, −1, 0⟩ .
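So we may choose for $\mathbf{d}$ any vector which is perpendicular to both ⟨1, 1, 1⟩ and ⟨1, −1, 0⟩, like, for example,
$$\begin{aligned}
\mathbf{d} = \langle 1, -1, 0 \rangle \times \langle 1, 1, 1 \rangle
&= \det \begin{bmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ 1 & -1 & 0 \\ 1 & 1 & 1 \end{bmatrix}
= \hat{\imath} \det \begin{bmatrix} -1 & 0 \\ 1 & 1 \end{bmatrix}
- \hat{\jmath} \det \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}
+ \hat{k} \det \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix} \\
&= -\hat{\imath} - \hat{\jmath} + 2\hat{k}
\end{aligned}$$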
We now have both a point on the line (namely (1, 1, 0)) and a direction vector for the line (namely ⟨−1, −1, 2⟩), so, as usual,
the parametric equations for the line are
⟨x − 1, y − 1, z⟩ = t ⟨−1, −1, 2⟩ or x = 1 − t, y = 1 − t, z = 2t
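Method 2 is easy to check with a short Python sketch. The helper names `cross` and `on_both_planes` are assumptions made for illustration. The sketch recomputes the cross product of the two normal vectors and then confirms that several points (1, 1, 0) + t d satisfy both plane equations.

```python
def cross(u, v):
    """Cross product u x v of two 3d vectors given as tuples."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )

def on_both_planes(p):
    """True if p lies on x + y + z = 2 and on x - y = 0."""
    x, y, z = p
    return x + y + z == 2 and x - y == 0

n_first = (1, 1, 1)      # normal to x + y + z = 2
n_second = (1, -1, 0)    # normal to x - y = 0

d = cross(n_second, n_first)   # same order as the determinant in the text
point = (1, 1, 0)              # the point with z = 0 found above

line_ok = all(
    on_both_planes(tuple(p + t * c for p, c in zip(point, d)))
    for t in range(-5, 6)
)
print(d, line_ok)
```

Swapping the order of the two normals would give the opposite direction vector, which describes the same line.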
This looks a little different than the solution from method 1, but we'll see in a moment that they are really the same. Before
that, let's do one more method.
Method 3. We'll find two points on the line. We have already found that (1, 1, 0) is on the line. From the picture above, it looks
like (0, 0, 2) is also on the line. This is indeed the case since (0, 0, 2) obeys both x + y + z = 2 and x − y = 0. Notice that
we could also have guessed (0, 0, 2) by setting x = 0 and then solving y + z = x + y + z = 2, −y = x − y = 0 for y and z.
As both (1, 1, 0) and (0, 0, 2) are on the line, the vector with head at (1, 1, 0) and tail at (0, 0, 2), which is
⟨1 − 0, 1 − 0, 0 − 2⟩ = ⟨1, 1, −2⟩ , is a direction vector for the line. As (0, 0, 2) is a point on the line and ⟨1, 1, −2⟩ is a
direction vector for the line, the parametric equations for the line are
⟨x − 0, y − 0, z − 2⟩ = t ⟨1, 1, −2⟩ or x = t, y = t, z = 2 − 2t
This also looks similar, but not quite identical, to our previous answers. Time for a comparison.
Comparing the answers. The parametric equations given by the three methods are different. That's just because we have really
used different parameters in the three methods, even though we have called the parameter t in each case. To clarify the relation
between the three answers, rename the parameter of method 1 to $t_1$, the parameter of method 2 to $t_2$ and the parameter of
method 3 to $t_3$. The parametric equations then become
$$\begin{array}{llll}
\text{Method 1:} & x = 1 - \frac{t_1}{2} & y = 1 - \frac{t_1}{2} & z = t_1 \\
\text{Method 2:} & x = 1 - t_2 & y = 1 - t_2 & z = 2t_2 \\
\text{Method 3:} & x = t_3 & y = t_3 & z = 2 - 2t_3
\end{array}$$
Substituting $t_1 = 2t_2$ into the Method 1 equations gives the Method 2 equations, and substituting $t_3 = 1 - t_2$ into the Method
3 equations gives the Method 2 equations. So all three really give the same line, just parametrized a little differently.
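The two substitutions can be spot-checked with a small Python sketch. The function names `method1`, `method2` and `method3` are illustrative assumptions; each returns the point (x, y, z) produced by the corresponding parametrization, and exact fractions are used so that the comparisons are not affected by rounding.

```python
from fractions import Fraction

def method1(t1):
    """Method 1: x = 1 - t1/2, y = 1 - t1/2, z = t1."""
    return (1 - Fraction(t1) / 2, 1 - Fraction(t1) / 2, Fraction(t1))

def method2(t2):
    """Method 2: x = 1 - t2, y = 1 - t2, z = 2 t2."""
    return (1 - t2, 1 - t2, 2 * t2)

def method3(t3):
    """Method 3: x = t3, y = t3, z = 2 - 2 t3."""
    return (t3, t3, 2 - 2 * t3)

samples = [Fraction(k, 3) for k in range(-6, 7)]

# t1 = 2 t2 turns Method 1 into Method 2; t3 = 1 - t2 turns Method 3 into Method 2.
same_1_and_2 = all(method1(2 * t2) == method2(t2) for t2 in samples)
same_3_and_2 = all(method3(1 - t2) == method2(t2) for t2 in samples)
print(same_1_and_2, same_3_and_2)
```

A finite sample is of course not a proof, but the algebraic substitution above shows the identities hold for every parameter value.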
Warning 1.5.3. A line in three dimensions has infinitely many normal vectors
For example, the line
$$\langle x - 1, y - 1, z \rangle = t \langle 1, 2, -2 \rangle$$
has direction vector ⟨1, 2, −2⟩. Any vector perpendicular to ⟨1, 2, −2⟩ is perpendicular to the line. The vector $\langle n_1, n_2, n_3 \rangle$ is
perpendicular to ⟨1, 2, −2⟩ if and only if
$$0 = \langle 1, 2, -2 \rangle \cdot \langle n_1, n_2, n_3 \rangle = n_1 + 2n_2 - 2n_3$$
There is a whole plane of $\langle n_1, n_2, n_3 \rangle$'s obeying this condition, of which ⟨2, −1, 0⟩, ⟨0, 1, 1⟩ and ⟨2, 0, 1⟩ are only three
examples.
The next two examples illustrate two different methods for finding the distance between a point and a line.
Example 1.5.4
In this example, we find the distance between the point (2, 3, −1) and the line
L : ⟨x − 1, y − 2, z − 3⟩ = t ⟨1, 1, 2⟩
or, equivalently, x = 1 + t, y = 2 + t, z = 3 + 2t
The vector from (2, 3, −1) to the point (1 + t , 2 + t , 3 + 2t) on L is ⟨t − 1 , t − 1 , 2t + 4⟩ . The square of the distance
between (2, 3, −1) and the point (1 + t , 2 + t , 3 + 2t) on L is the square of the length of that vector, namely
$$d(t)^2 = (t - 1)^2 + (t - 1)^2 + (2t + 4)^2$$
The point on L that is closest to (2, 3, −1) is that whose value of t obeys
$$0 = \frac{d}{dt}\, d(t)^2 = 2(t - 1) + 2(t - 1) + 2(2)(2t + 4) \qquad (*)$$
Before we solve this equation for t and finish off our computation, observe that this equation (divided by 2) says that
$$\langle 1, 1, 2 \rangle \cdot \langle t - 1, t - 1, 2t + 4 \rangle = 0$$
That is, the vector from (2, 3, −1) to the point on L nearest (2, 3, −1) is perpendicular to L's direction vector.
Now back to our computation. The equation (∗) simplifies to 12t + 12 = 0. So the optimal t = −1 and the distance is
$$d(-1) = \sqrt{(-1 - 1)^2 + (-1 - 1)^2 + (-2 + 4)^2} = \sqrt{12}$$
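The minimization can be sketched in Python as follows. The function name `closest_point_on_line_3d` is a hypothetical helper. It uses the perpendicularity observation made above: the vector from q to p0 + t d is perpendicular to d exactly when t = (q − p0) ⋅ d / (d ⋅ d), which is the same condition as setting the derivative of the squared distance to zero.

```python
import math

def dot(u, v):
    """Dot product of two vectors given as tuples."""
    return sum(a * b for a, b in zip(u, v))

def closest_point_on_line_3d(q, p0, d):
    """Return (t, point, distance) for the point p0 + t*d nearest to q."""
    w = tuple(a - b for a, b in zip(q, p0))        # vector from p0 to q
    t = dot(w, d) / dot(d, d)                      # optimal parameter value
    point = tuple(a + t * b for a, b in zip(p0, d))
    diff = tuple(a - b for a, b in zip(point, q))  # vector from q to point
    return t, point, math.sqrt(dot(diff, diff))

# Q = (2, 3, -1) and L: <x - 1, y - 2, z - 3> = t <1, 1, 2>.
t, N, dist = closest_point_on_line_3d((2, 3, -1), (1, 2, 3), (1, 1, 2))
print(t, N, dist)
```

For these inputs the optimal parameter is t = −1, the nearest point is (0, 1, 1) and the distance is √12.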
In this example, we again find the distance between the point (2, 3, −1) and the line
L : ⟨x − 1, y − 2, z − 3⟩ = t ⟨1, 1, 2⟩
but we use a different method. In the figure below, Q is the point (2, 3, −1).
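If we drop a perpendicular from Q to the line L, it hits the line L at the point N, which is the point on L that is nearest Q. So
the distance from Q to L is exactly the distance from Q to N, which is exactly the length of the vector from Q to N. In the
figure above, $\vec{w}$ is the vector from Q to N. Now the vector $\vec{w}$ has to be perpendicular to the direction vector for L. That is, $\vec{w}$
has to be perpendicular to $\vec{d} = \langle 1, 1, 2 \rangle$. However, as we saw in Warning 1.5.3, there are a huge number of vectors in different
directions that are perpendicular to $\vec{d}$. So you might think that it is very hard to even determine the direction of $\vec{w}$.
Fortunately, it isn't. Here is the strategy.
Pick any point on L and call it P.
It is very easy to find the vector from P to N: it is just the projection of the vector from P to Q (called $\vec{v}$ in the figure
above) on $\vec{d}$.
Once we know $\mathrm{proj}_{\vec{d}}\, \vec{v}$, we will be able to compute
$$\vec{w} = \mathrm{proj}_{\vec{d}}\, \vec{v} - \vec{v}$$
and then the distance from Q to L is $|\vec{w}|$.
In this example, setting t = 0 shows that P = (1, 2, 3) is on L, so that
$$\vec{v} = \langle 2 - 1, 3 - 2, -1 - 3 \rangle = \langle 1, 1, -4 \rangle$$
$$\mathrm{proj}_{\vec{d}}\, \vec{v} = \frac{\vec{v} \cdot \vec{d}}{|\vec{d}|^2}\, \vec{d} = \frac{-6}{6} \langle 1, 1, 2 \rangle = \langle -1, -1, -2 \rangle$$
and then
$$\vec{w} = \mathrm{proj}_{\vec{d}}\, \vec{v} - \vec{v} = \langle -1, -1, -2 \rangle - \langle 1, 1, -4 \rangle = \langle -2, -2, 2 \rangle$$
so that the distance from Q to L is $|\vec{w}| = \sqrt{12}$, in agreement with Example 1.5.4.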
The next two (optional) examples illustrate two different methods for finding the distance between two lines.
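In this example, we find the distance between the lines
$$L:\ \langle x - 1, y - 2, z - 3 \rangle = t \langle 1, 0, -1 \rangle \qquad L':\ \langle x - 1, y - 2, z - 1 \rangle = t \langle 1, -2, 1 \rangle$$
Of course the value of t in the parametric equation for $L$ need not be the same as the value of t in the parametric equation for
$L'$. So let us denote by $\vec{x}(s) = (1 + s, 2, 3 - s)$ and $\vec{y}(t) = (1 + t, 2 - 2t, 1 + t)$ the points on $L$ and $L'$, respectively,
that are closest together. Note that the vector from $\vec{x}(s)$ to $\vec{y}(t)$ is $\langle t - s, -2t, -2 + s + t \rangle$. Then, in particular,
$\vec{x}(s)$ is the point on $L$ that is closest to the point $\vec{y}(t)$, and
$\vec{y}(t)$ is the point on $L'$ that is closest to the point $\vec{x}(s)$.
So, as we saw in Example 1.5.4, the vector, $\langle t - s, -2t, -2 + s + t \rangle$, that joins $\vec{x}(s)$ and $\vec{y}(t)$, must be perpendicular to
both the direction vector of $L$ and the direction vector of $L'$. Consequently
$$0 = \langle 1, 0, -1 \rangle \cdot \langle t - s, -2t, -2 + s + t \rangle = 2 - 2s$$
$$0 = \langle 1, -2, 1 \rangle \cdot \langle t - s, -2t, -2 + s + t \rangle = -2 + 6t$$
So $s = 1$ and $t = \frac{1}{3}$ and the distance between $L$ and $L'$ is
$$\Big| \langle t - s, -2t, -2 + s + t \rangle \Big|_{s=1,\ t=1/3} = \Big| \left\langle -\tfrac{2}{3}, -\tfrac{2}{3}, -\tfrac{2}{3} \right\rangle \Big| = \frac{2}{\sqrt{3}}$$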
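In this example, we again find the distance between the lines
$$L:\ \langle x - 1, y - 2, z - 3 \rangle = t \langle 1, 0, -1 \rangle \qquad L':\ \langle x - 1, y - 2, z - 1 \rangle = t \langle 1, -2, 1 \rangle$$
this time using a projection, much as in Example 1.4.5. The procedure, which will be justified below, is
first form a vector $\vec{n}$ that is perpendicular to the direction vectors of both lines by taking the cross product of the two
direction vectors. In this example,
$$\langle 1, 0, -1 \rangle \times \langle 1, -2, 1 \rangle = \det \begin{bmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ 1 & 0 & -1 \\ 1 & -2 & 1 \end{bmatrix} = -2\hat{\imath} - 2\hat{\jmath} - 2\hat{k}$$
Since we just want $\vec{n}$ to be perpendicular to both direction vectors, we may simplify our computations by dividing this
vector by −2, and take $\vec{n} = \langle 1, 1, 1 \rangle$.
Next find one point on $L$ and one point on $L'$ and form the vector $\vec{v}$ whose tail is at one point and whose head is at
the other point. This vector goes from one line to the other line. In this example, the point (1, 2, 3) is on $L$ (just set t = 0 in
the equation for $L$) and the point (1, 2, 1) is on $L'$ (just set t = 0 in the equation for $L'$), so that we may take
$$\vec{v} = \langle 1 - 1, 2 - 2, 3 - 1 \rangle = \langle 0, 0, 2 \rangle$$
The distance between the two lines is the length of the projection of $\vec{v}$ on $\vec{n}$. In this example, by 1.2.14, the distance is
$$\big| \mathrm{proj}_{\vec{n}}\, \vec{v} \big| = \left| \frac{\vec{v} \cdot \vec{n}}{|\vec{n}|^2}\, \vec{n} \right| = \frac{|\vec{v} \cdot \vec{n}|}{|\vec{n}|}
= \frac{|\langle 0, 0, 2 \rangle \cdot \langle 1, 1, 1 \rangle|}{|\langle 1, 1, 1 \rangle|} = \frac{2}{\sqrt{3}}$$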
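The projection procedure for two lines can be sketched in Python. The helper names `cross`, `dot` and `distance_between_lines` are assumptions made for illustration, and the sketch assumes the two direction vectors are not parallel (otherwise the cross product is the zero vector and the formula breaks down). The points (1, 2, 3) and (1, 2, 1) and the direction vectors ⟨1, 0, −1⟩ and ⟨1, −2, 1⟩ are the ones used in the example.

```python
import math

def dot(u, v):
    """Dot product of two vectors given as tuples."""
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    """Cross product u x v of two 3d vectors."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )

def distance_between_lines(p, d, p_prime, d_prime):
    """Distance between the lines p + t*d and p_prime + t*d_prime.

    Assumes d and d_prime are not parallel, so that n = d x d_prime is nonzero.
    """
    n = cross(d, d_prime)
    v = tuple(a - b for a, b in zip(p, p_prime))   # vector between the lines
    return abs(dot(v, n)) / math.sqrt(dot(n, n))   # |proj_n v|

n = cross((1, 0, -1), (1, -2, 1))
dist = distance_between_lines((1, 2, 3), (1, 0, -1), (1, 2, 1), (1, -2, 1))
print(n, dist)
```

The sketch does not rescale n to ⟨1, 1, 1⟩; the length of the projection is unchanged by any nonzero rescaling of n.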
To see why this procedure works, note that, as we observed in Example 1.5.6, the vector from $\vec{x}(s)$ to $\vec{y}(t)$ is perpendicular to the direction vectors of both lines,
and so is parallel to $\vec{n}$.
Denote by $P$ the plane through $\vec{x}(s)$ that is perpendicular to $\vec{n}$. As $\vec{x}(s)$ is on $L$ and the direction vector of $L$ is
perpendicular to $\vec{n}$, the line $L$ is contained in $P$.
Denote by $P'$ the plane through $\vec{y}(t)$ that is perpendicular to $\vec{n}$. As $\vec{y}(t)$ is on $L'$ and the direction vector of $L'$ is
perpendicular to $\vec{n}$, the line $L'$ is contained in $P'$.
The planes $P$ and $P'$ are parallel to each other. As $\vec{x}(s)$ is on $P$ and $\vec{y}(t)$ is on $P'$, and the vector from $\vec{x}(s)$ to $\vec{y}(t)$ is
perpendicular to both $P$ and $P'$, the distance from $P$ to $P'$ is exactly the length of the vector from $\vec{x}(s)$ to $\vec{y}(t)$. That is
also the distance from $L$ to $L'$.
The vector $\vec{v}$ constructed in the procedure above is a vector between $L$ and $L'$ and so is also a vector between $P$ and $P'$.
Looking at the figure below, we see that the vector from $\vec{x}(s)$ to $\vec{y}(t)$ is (up to a sign) the projection of $\vec{v}$ on $\vec{n}$.
Exercises
Stage 1
1
3
,−
1
2
,
1
6
⟩. ”
2
Stage 2
3
Find a vector parametric equation for the line of intersection of the given planes.
1. x − 2z = 3 and y + z = 5 1
2. 2x − y − 2z = −3 and 4x − 3y − 3z = −5
4
5
In each case, determine whether or not the given pair of lines intersect. Also find all planes containing the pair of lines.
1. ⟨x, y, z⟩ = ⟨−3, 2, 4⟩ + t ⟨−4, 2, 1⟩ and ⟨x, y, z⟩ = ⟨2, 1, 2⟩ + t ⟨1, 1, −1⟩
2. ⟨x, y, z⟩ = ⟨−3, 2, 4⟩ + t ⟨−4, 2, 1⟩ and ⟨x, y, z⟩ = ⟨2, 1, −1⟩ + t ⟨1, 1, −1⟩
3. ⟨x, y, z⟩ = ⟨−3, 2, 4⟩ + t ⟨−2, −2, 2⟩ and ⟨x, y, z⟩ = ⟨2, 1, −1⟩ + t ⟨1, 1, −1⟩
4. ⟨x, y, z⟩ = ⟨3, 2, −2⟩ + t ⟨−2, −2, 2⟩ and ⟨x, y, z⟩ = ⟨2, 1, −1⟩ + t ⟨1, 1, −1⟩
6
Find the equation of the line through (2, −1, −1) and parallel to each of the two planes x + y = 0 and x − y + 2z = 0.
Express the equations of the line in vector and scalar parametric forms and in symmetric form.
7. ✳
Let L be the line given by the equations x + y = 1 and x + 2y + z = 3. Write a vector parametric equation for L.
8
1. Find a vector parametric equation for the line x + 2y + 3z = 11, x − 2y + z = −1.
2. Find the distance from (1, 0, 1) to the line x + 2y + 3z = 11, x − 2y + z = −1.
9
Let $L_1$ be the line passing through (1, −2, −5) in the direction of $\vec{d}_1 = \langle 2, 3, 2 \rangle$. Let $L_2$ be the line passing through
(−3, 4, −1) in the direction $\vec{d}_2 = \langle 5, 2, 4 \rangle$.
10. ✳
Let L be a line which is parallel to the plane 2x + y − z = 5 and perpendicular to the line x = 3 − t, y = 1 − 2t and z = 3t.
1. Find a vector parallel to the line L.
2. Find parametric equations for the line L if L passes through a point Q(a, b, c) where a < 0, b > 0, c > 0, and the
distances from Q to the xy-plane, the xz-plane and the yz-plane are 2, 3 and 4 respectively.
11. ✳
12. ✳
13. ✳
Find the parametric equation for the line of intersection of the planes
x + y + z = 11 and x − y − z = 13.
14. ✳
1. Find a point on the y-axis equidistant from (2, 5, −3) and (−3, 6, 1).
2. Find the equation of the plane containing the point (1, 3, 1) and the line $\vec{r}(t) = t\,\hat{\imath} + t\,\hat{\jmath} + (t + 2)\,\hat{k}$.
Stage 3
15. ✳
1. Find the parametric equations for the line which contains A and is perpendicular to the triangle ABC.
2. Find the equation of the set of all points P such that $\overrightarrow{PA}$ is perpendicular to $\overrightarrow{PB}$. This set forms a
Plane/Line/Sphere/Cone/Paraboloid/Hyperboloid (circle one) in space.
3. A light source at the origin shines on the triangle ABC making a shadow on the plane x + 7y + z = 32. (See the
diagram.) Find $\tilde{A}$.
16
Let P, Q, R and S be the vertices of a tetrahedron. Denote by $\vec{p}$, $\vec{q}$, $\vec{r}$ and $\vec{s}$ the vectors from the origin to P, Q, R and S
respectively. A line is drawn from each vertex to the centroid of the opposite face, where the centroid of a triangle with vertices
$\vec{a}$, $\vec{b}$ and $\vec{c}$ is $\frac{1}{3}(\vec{a} + \vec{b} + \vec{c})$. Show that these four lines meet at $\frac{1}{4}(\vec{p} + \vec{q} + \vec{r} + \vec{s})$.
17
Calculate the distance between the lines $\frac{x+2}{3} = \frac{y-7}{-4} = \frac{z-2}{4}$ and $\frac{x-1}{-3} = \frac{y+2}{4} = \frac{z+1}{1}$.
This page titled 1.5: Equations of Lines in 3d is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Joel
Feldman, Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a detailed
edit history is available upon request.
1.6: Curves and their Tangent Vectors
The right hand side of the parametric equation (x, y, z) = (1, 1, 0) + t ⟨1, 2, −2⟩ that we just saw in Warning 1.5.3 is a vector-
valued function of the one real variable t. We are now going to study more general vector-valued functions of one real variable.
That is, we are going to study functions that assign to each real number t (typically in some interval) a vector $\vec{r}(t)$. For example
$$\vec{r}(t) = (x(t), y(t), z(t))$$
While in some applications t will indeed be “time”, it does not have to be. It can be simply a parameter that is used to label the
different points on the curve that $\vec{r}(t)$ sweeps out. We then say that $\vec{r}(t)$ provides a parametrization of the curve.
That is,
$$\vec{r}(\theta) = (a\cos\theta, a\sin\theta) \qquad 0 \le \theta < 2\pi$$
$$\vec{r}(t) = (x(t), y(t)) = a \left( \frac{2}{\pi}\arctan(t),\ \sqrt{1 - \frac{4}{\pi^2}\arctan^2(t)} \right)$$
We can tweak the parametrization of Example 1.6.1 to get a parametrization of the circle of radius a that is centred on (h, k).
One way to do so is to redraw the sketch of Example 1.6.1 with the circle translated so that its centre is at (h, k).
We see from the sketch that
1.6.1 https://math.libretexts.org/@go/page/92234
$$\vec{r}(\theta) = (h + a\cos\theta, k + a\sin\theta) \qquad 0 \le \theta < 2\pi$$
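As a quick numerical sanity check, the following Python sketch samples the parametrization and confirms that each sampled point satisfies (x − h)² + (y − k)² = a² up to rounding. The function name `circle_point` and the particular values of h, k and a are illustrative assumptions, not taken from the text.

```python
import math

def circle_point(h, k, a, theta):
    """The point r(theta) = (h + a cos(theta), k + a sin(theta))."""
    return (h + a * math.cos(theta), k + a * math.sin(theta))

h, k, a = 2.0, 1.0, 3.0                       # illustrative centre and radius
thetas = [2 * math.pi * i / 12 for i in range(12)]

# Residual of the circle equation at each sampled point.
residuals = [
    (x - h) ** 2 + (y - k) ** 2 - a ** 2
    for x, y in (circle_point(h, k, a, th) for th in thetas)
]
max_residual = max(abs(r) for r in residuals)
print(max_residual)
```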
A second way to come up with this parametrization is to observe that we can turn the trig identity $\cos^2 t + \sin^2 t = 1$ into the
equation $(x - h)^2 + (y - k)^2 = a^2$ of the circle by setting $\cos t = \frac{x - h}{a}$ and $\sin t = \frac{y - k}{a}$.
Example 1.6.3. Parametrization of $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$ and of $x^{2/3} + y^{2/3} = a^{2/3}$
We can build parametrizations of the curves $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$ and $x^{2/3} + y^{2/3} = a^{2/3}$ from the trig identity $\cos^2 t + \sin^2 t = 1$.
Setting $\cos t = \frac{x}{a}$ and $\sin t = \frac{y}{b}$ turns $\cos^2 t + \sin^2 t = 1$ into $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$.
Setting $\cos t = \left( \frac{x}{a} \right)^{1/3}$ and $\sin t = \left( \frac{y}{a} \right)^{1/3}$ turns $\cos^2 t + \sin^2 t = 1$ into $\frac{x^{2/3}}{a^{2/3}} + \frac{y^{2/3}}{a^{2/3}} = 1$.
So
$$\vec{r}(t) = (a\cos t, b\sin t) \qquad 0 \le t < 2\pi$$
$$\vec{r}(t) = (a\cos^3 t, a\sin^3 t) \qquad 0 \le t < 2\pi$$
give parametrizations of $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$ and $x^{2/3} + y^{2/3} = a^{2/3}$, respectively. To see that running t from 0 to 2π runs $\vec{r}(t)$ once
The curve \(x^{2/3} + y^{2/3} = a^{2/3}\) is called an astroid. From its equation, we would expect its sketch to look like a deformed
circle. But it is probably not so obvious that it would have the pointy bits of the right hand figure. We will not explain here why
they arise. The astroid is studied in some detail in Example 1.1.7 of the CLP-4 text. In particular, the above sketch is carefully
developed there.
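As a quick sanity check on the two parametrizations of Example 1.6.3, the following sketch samples a few parameter values and substitutes the resulting points back into the two Cartesian equations. The function names and the sample values \(a = 3\), \(b = 2\) are our own choices for illustration, not part of the text.

```python
import math

def ellipse_point(a, b, t):
    """Point (a cos t, b sin t), claimed to lie on x^2/a^2 + y^2/b^2 = 1."""
    return (a * math.cos(t), b * math.sin(t))

def astroid_point(a, t):
    """Point (a cos^3 t, a sin^3 t), claimed to lie on x^(2/3) + y^(2/3) = a^(2/3)."""
    return (a * math.cos(t) ** 3, a * math.sin(t) ** 3)

def ellipse_residual(a, b, x, y):
    return x * x / (a * a) + y * y / (b * b) - 1.0

def astroid_residual(a, x, y):
    # |x|^(2/3) is the real value of x^(2/3) = (x^2)^(1/3), also for negative x
    return abs(x) ** (2 / 3) + abs(y) ** (2 / 3) - a ** (2 / 3)

a, b = 3.0, 2.0  # hypothetical sample values
samples = [2 * math.pi * k / 12 for k in range(12)]
max_ellipse = max(abs(ellipse_residual(a, b, *ellipse_point(a, b, t))) for t in samples)
max_astroid = max(abs(astroid_residual(a, *astroid_point(a, t))) for t in samples)
print(max_ellipse, max_astroid)  # both should be at rounding-error size
```

A check like this does not prove that the parametrization covers the whole curve exactly once. It only confirms that the sampled points satisfy the equation.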
A very easy method that can often create parametrizations for a curve is to use \(x\) or \(y\) as a parameter. Because we can solve
\(e^y = 1 + x^2\) for \(y\) as a function of \(x\), namely \(y = \ln(1 + x^2)\), we can use \(x\) as the parameter simply by setting \(t = x\). This
gives the parametrization
\[ \vec r(t) = (t,\ \ln(1 + t^2)) \qquad -\infty < t < \infty \]
Example 1.6.5. Parametrization of \(x^2 + y^2 = a^2\), again
It is also quite common that one can use either \(x\) or \(y\) to parametrize part of, but not all of, a curve. A simple example is the circle
\(x^2 + y^2 = a^2\). For each \(-a < x < a\), there are two points on the circle with that value of \(x\). So one cannot use \(x\) to
parametrize the whole circle. Similarly, for each \(-a < y < a\), there are two points on the circle with that value of \(y\). So one
cannot use \(y\) to parametrize the whole circle. On the other hand
\[ \vec r(t) = \big(t,\ \sqrt{a^2 - t^2}\big) \qquad -a < t < a \]
\[ \vec r(t) = \big(t,\ -\sqrt{a^2 - t^2}\big) \qquad -a < t < a \]
provide parametrizations of the top half and bottom half, respectively, of the circle using \(x\) as the parameter, and
\[ \vec r(t) = \big(\sqrt{a^2 - t^2},\ t\big) \qquad -a < t < a \]
\[ \vec r(t) = \big(-\sqrt{a^2 - t^2},\ t\big) \qquad -a < t < a \]
provide parametrizations of the right half and left half, respectively, of the circle using \(y\) as the parameter.
\[ x = \cos t \qquad y = 7 - t \]
Note that we can eliminate the parameter \(t\) simply by using the second equation to solve for \(t\) as a function of \(y\). Namely
\(t = 7 - y\). Substituting this into the first equation gives us the Cartesian equation
\[ x = \cos(7 - y) \]
Curves often arise as the intersection of two surfaces. For example, the intersection of the sphere \(x^2 + y^2 + z^2 = 1\) with the plane
\(y = x\) is a circle. The part of that circle that is in the first octant is the red curve in the figure below.
One way to parametrize such curves is to choose one of the three coordinates x, y, z as the parameter, and solve the two given
equations for the remaining two coordinates, as functions of the parameter. Here are two examples.
Example 1.6.7
The set of all \((x, y, z)\) obeying \(x^2 + y^2 + z^2 = 1\) and \(x = y\) is the circle sketched above. We can choose to use \(y\) as the
parameter and think of
\[ x = y \qquad x^2 + z^2 = 1 - y^2 \]
as a system of two equations for the two unknowns \(x\) and \(z\), with \(y\) being treated as a given constant, rather than as an
unknown. We can now (trivially) solve the first equation for \(x\), substitute the result into the second equation, and finally solve
for \(z\).
\[ x = y,\quad x^2 + z^2 = 1 - y^2 \implies z^2 = 1 - 2y^2 \]
If, for example, we are interested in points \((x, y, z)\) on the curve with \(z \ge 0\), we have \(z = \sqrt{1 - 2y^2}\) and
\[ \vec r(y) = \big(y,\ y,\ \sqrt{1 - 2y^2}\big), \qquad -\frac{1}{\sqrt 2} \le y \le \frac{1}{\sqrt 2} \]
is a parametrization for the part of the circle above the \(xy\)-plane. If, on the other hand, we are interested in points \((x, y, z)\) on
the curve with \(z \le 0\), we have \(z = -\sqrt{1 - 2y^2}\) and
\[ \vec r(y) = \big(y,\ y,\ -\sqrt{1 - 2y^2}\big), \qquad -\frac{1}{\sqrt 2} \le y \le \frac{1}{\sqrt 2} \]
is a parametrization for the part of the circle below the \(xy\)-plane.
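The parametrization of Example 1.6.7 can be checked numerically: every point it produces should satisfy both the sphere equation and the plane equation. The sketch below does this for the upper arc. The helper names and the tolerance are our own assumptions.

```python
import math

def upper_arc(y):
    """r(y) = (y, y, sqrt(1 - 2y^2)), valid for |y| <= 1/sqrt(2)."""
    return (y, y, math.sqrt(1.0 - 2.0 * y * y))

def on_sphere_and_plane(p, tol=1e-12):
    """True if p lies (to within tol) on both x^2 + y^2 + z^2 = 1 and x = y."""
    x, y, z = p
    return abs(x * x + y * y + z * z - 1.0) < tol and abs(x - y) < tol

y_max = 1.0 / math.sqrt(2.0)
# stay slightly inside the endpoints so rounding cannot make 1 - 2y^2 negative
ys = [y_max * (k / 10.0) for k in range(-9, 10)]
all_ok = all(on_sphere_and_plane(upper_arc(y)) for y in ys)
print(all_ok)
```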
Example 1.6.8
The previous example was rigged so that it was easy to solve for x and z as functions of y. In practice it is not always easy, or
even possible, to do so. A more realistic example is the set of all (x, y, z) obeying
\[ x^2 + \frac{y^2}{2} + \frac{z^2}{3} = 1 \qquad x^2 + 2y^2 = z \]
(Don't worry about how we make sketches like this. We'll develop some surface sketching technique in §1.7 below.)
Substituting \(x^2 = z - 2y^2\) (from the second equation) into the first equation gives
\[ -\frac{3}{2}y^2 + z + \frac{z^2}{3} = 1 \]
If, for example, we are interested in points \((x, y, z)\) on the curve with \(y \ge 0\), this can be solved to give \(y\) as a function of \(z\).
\[ y = \sqrt{\frac{2}{9}\Big(z + \frac{3}{2}\Big)^2 - \frac{14}{12}} \]
Then \(x^2 = z - 2y^2\) also gives \(x\) as a function of \(z\). If \(x \ge 0\),
\[ \begin{aligned}
x &= \sqrt{z - \frac{4}{9}\Big(z + \frac{3}{2}\Big)^2 + \frac{14}{6}} \\
  &= \sqrt{\frac{4}{3} - \frac{4}{9}z^2 - \frac{1}{3}z}
\end{aligned} \]
The other signs of x and y can be gotten by using the appropriate square roots. In this example, (x, y, z) is on the curve, i.e.
satisfies the two original equations, if and only if all of (±x, ±y, z) are also on the curve.
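Because the algebra in Example 1.6.8 is easy to get wrong, it is worth substituting the formulas for \(x\) and \(y\) back into the two original equations. The following sketch does that for a few heights \(z\). The function names and the sample heights are our own choices, and the sample heights were picked so that both square roots are real.

```python
import math

def point_on_curve(z):
    """Branch of the curve with x >= 0 and y >= 0, parametrized by z."""
    y_sq = (2.0 / 9.0) * (z + 1.5) ** 2 - 14.0 / 12.0
    x_sq = 4.0 / 3.0 - (4.0 / 9.0) * z * z - z / 3.0
    if y_sq < 0.0 or x_sq < 0.0:
        raise ValueError("no point of the curve at this height z")
    return (math.sqrt(x_sq), math.sqrt(y_sq), z)

def residuals(p):
    """How far p is from satisfying each of the two original equations."""
    x, y, z = p
    return (x * x + y * y / 2.0 + z * z / 3.0 - 1.0, x * x + 2.0 * y * y - z)

for z in (0.8, 1.0, 1.2, 1.39):
    print(z, residuals(point_on_curve(z)))  # both residuals should be tiny
```

Heights outside the range where both radicands are nonnegative raise an error, which reflects the fact that the curve only occupies a limited range of \(z\).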
Definition 1.6.9
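The derivative of the vector valued function \(\vec r(t)\) is defined to be
\[ \vec r'(t) = \frac{d\vec r}{dt}(t) = \lim_{h\to 0} \frac{\vec r(t + h) - \vec r(t)}{h} \]
If \(\vec r(t) = x(t)\,\hat\imath + y(t)\,\hat\jmath + z(t)\,\hat k\), then
\[ \vec r'(t) = x'(t)\,\hat\imath + y'(t)\,\hat\jmath + z'(t)\,\hat k \]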
That is, to differentiate a vector valued function of t, just differentiate each of its components.
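To see the definition and the componentwise rule agree, the sketch below compares the difference quotient \(\frac{\vec r(t+h) - \vec r(t)}{h}\) with the componentwise derivative for shrinking \(h\). The curve \(\vec r(t) = (\cos t, \sin t, t)\) is an assumption chosen only for illustration.

```python
import math

def r(t):
    """Example curve (an assumption for illustration): r(t) = (cos t, sin t, t)."""
    return (math.cos(t), math.sin(t), t)

def r_prime_exact(t):
    """Componentwise derivative: r'(t) = (-sin t, cos t, 1)."""
    return (-math.sin(t), math.cos(t), 1.0)

def difference_quotient(f, t, h):
    """(f(t + h) - f(t)) / h, computed component by component."""
    return tuple((p - q) / h for p, q in zip(f(t + h), f(t)))

t = 0.7
for h in (1e-2, 1e-4, 1e-6):
    approx = difference_quotient(r, t, h)
    err = max(abs(u - v) for u, v in zip(approx, r_prime_exact(t)))
    print(h, err)  # the error shrinks roughly in proportion to h
```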
And of course differentiation interacts with arithmetic operations, like addition, in the obvious way. Only a little more thought is
required to see that differentiation interacts quite nicely with dot and cross products too. Here are some examples.
Example 1.6.10
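Let
\[ \begin{aligned}
\vec a(t) &= t^2\,\hat\imath + t^4\,\hat\jmath + t^6\,\hat k \\
\vec b(t) &= e^{-t}\,\hat\imath + e^{-3t}\,\hat\jmath + e^{-5t}\,\hat k \\
\gamma(t) &= t^2 \\
s(t) &= \sin t
\end{aligned} \]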
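We are about to compute some derivatives. Each computation applies the product rule
\[ \frac{d}{dt}\big[f(t)\,g(t)\big] = f'(t)\,g(t) + f(t)\,g'(t) \]
component by component. First,
\[ \gamma(t)\vec b(t) = t^2 e^{-t}\,\hat\imath + t^2 e^{-3t}\,\hat\jmath + t^2 e^{-5t}\,\hat k \]
gives
\[ \begin{aligned}
\frac{d}{dt}\big[\gamma(t)\vec b(t)\big]
 &= \big[2te^{-t} - t^2 e^{-t}\big]\hat\imath + \big[2te^{-3t} - 3t^2 e^{-3t}\big]\hat\jmath + \big[2te^{-5t} - 5t^2 e^{-5t}\big]\hat k \\
 &= 2t\big\{e^{-t}\,\hat\imath + e^{-3t}\,\hat\jmath + e^{-5t}\,\hat k\big\} + t^2\big\{-e^{-t}\,\hat\imath - 3e^{-3t}\,\hat\jmath - 5e^{-5t}\,\hat k\big\} \\
 &= \gamma'(t)\vec b(t) + \gamma(t)\vec b'(t)
\end{aligned} \]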
and
\[ \vec a(t)\cdot\vec b(t) = t^2 e^{-t} + t^4 e^{-3t} + t^6 e^{-5t} \]
gives
\[ \begin{aligned}
\frac{d}{dt}\big[\vec a(t)\cdot\vec b(t)\big]
 &= \big[2te^{-t} - t^2 e^{-t}\big] + \big[4t^3 e^{-3t} - 3t^4 e^{-3t}\big] + \big[6t^5 e^{-5t} - 5t^6 e^{-5t}\big] \\
 &= \big[2te^{-t} + 4t^3 e^{-3t} + 6t^5 e^{-5t}\big] + \big[-t^2 e^{-t} - 3t^4 e^{-3t} - 5t^6 e^{-5t}\big] \\
 &= \big\{2t\,\hat\imath + 4t^3\,\hat\jmath + 6t^5\,\hat k\big\}\cdot\big\{e^{-t}\,\hat\imath + e^{-3t}\,\hat\jmath + e^{-5t}\,\hat k\big\} \\
 &\qquad + \big\{t^2\,\hat\imath + t^4\,\hat\jmath + t^6\,\hat k\big\}\cdot\big\{-e^{-t}\,\hat\imath - 3e^{-3t}\,\hat\jmath - 5e^{-5t}\,\hat k\big\} \\
 &= \vec a'(t)\cdot\vec b(t) + \vec a(t)\cdot\vec b'(t)
\end{aligned} \]
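and
\[ \begin{aligned}
\vec a(t)\times\vec b(t) &= \det\begin{bmatrix} \hat\imath & \hat\jmath & \hat k \\ t^2 & t^4 & t^6 \\ e^{-t} & e^{-3t} & e^{-5t} \end{bmatrix} \\
 &= \hat\imath\,\big(t^4 e^{-5t} - t^6 e^{-3t}\big) - \hat\jmath\,\big(t^2 e^{-5t} - t^6 e^{-t}\big) + \hat k\,\big(t^2 e^{-3t} - t^4 e^{-t}\big)
\end{aligned} \]
gives
\[ \begin{aligned}
\frac{d}{dt}\big[\vec a(t)\times\vec b(t)\big]
 &= \hat\imath\,\big(4t^3 e^{-5t} - 6t^5 e^{-3t}\big) - \hat\jmath\,\big(2te^{-5t} - 6t^5 e^{-t}\big) + \hat k\,\big(2te^{-3t} - 4t^3 e^{-t}\big) \\
 &\qquad + \hat\imath\,\big(-5t^4 e^{-5t} + 3t^6 e^{-3t}\big) - \hat\jmath\,\big(-5t^2 e^{-5t} + t^6 e^{-t}\big) + \hat k\,\big(-3t^2 e^{-3t} + t^4 e^{-t}\big) \\
 &= \big\{2t\,\hat\imath + 4t^3\,\hat\jmath + 6t^5\,\hat k\big\}\times\big\{e^{-t}\,\hat\imath + e^{-3t}\,\hat\jmath + e^{-5t}\,\hat k\big\} \\
 &\qquad + \big\{t^2\,\hat\imath + t^4\,\hat\jmath + t^6\,\hat k\big\}\times\big\{-e^{-t}\,\hat\imath - 3e^{-3t}\,\hat\jmath - 5e^{-5t}\,\hat k\big\} \\
 &= \vec a'(t)\times\vec b(t) + \vec a(t)\times\vec b'(t)
\end{aligned} \]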
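and
\[ \begin{aligned}
\vec a\big(s(t)\big) &= (\sin t)^2\,\hat\imath + (\sin t)^4\,\hat\jmath + (\sin t)^6\,\hat k \\
\implies \frac{d}{dt}\big[\vec a\big(s(t)\big)\big]
 &= 2(\sin t)\cos t\,\hat\imath + 4(\sin t)^3\cos t\,\hat\jmath + 6(\sin t)^5\cos t\,\hat k \\
 &= \big\{2(\sin t)\,\hat\imath + 4(\sin t)^3\,\hat\jmath + 6(\sin t)^5\,\hat k\big\}\cos t \\
 &= \vec a'\big(s(t)\big)\,s'(t)
\end{aligned} \]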
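Let \(\vec a(t)\) and \(\vec b(t)\) be differentiable vector valued functions of \(t\), let \(\alpha, \beta \in \mathbb{R}\) be constants and
let \(\gamma(t)\) and \(s(t)\) be differentiable real valued functions of \(t\).
Then
(a) \(\frac{d}{dt}\big[\alpha\,\vec a(t) + \beta\,\vec b(t)\big] = \alpha\,\vec a'(t) + \beta\,\vec b'(t)\) (linear combination)
(b) \(\frac{d}{dt}\big[\gamma(t)\vec b(t)\big] = \gamma'(t)\vec b(t) + \gamma(t)\vec b'(t)\) (multiplication by scalar function)
(c) \(\frac{d}{dt}\big[\vec a(t)\cdot\vec b(t)\big] = \vec a'(t)\cdot\vec b(t) + \vec a(t)\cdot\vec b'(t)\) (dot product)
(d) \(\frac{d}{dt}\big[\vec a(t)\times\vec b(t)\big] = \vec a'(t)\times\vec b(t) + \vec a(t)\times\vec b'(t)\) (cross product)
(e) \(\frac{d}{dt}\big[\vec a\big(s(t)\big)\big] = \vec a'\big(s(t)\big)\,s'(t)\) (composition)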
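Rules (c) and (d) can be spot-checked numerically with the functions \(\vec a(t)\) and \(\vec b(t)\) of Example 1.6.10. The sketch below compares a central-difference approximation of the left hand side with the right hand side at one value of \(t\). The helper names, the step size and the sample point \(t = 0.9\) are our own assumptions, and a numerical check at one point is of course not a proof.

```python
import math

def a(t):
    return (t**2, t**4, t**6)

def a_prime(t):
    return (2 * t, 4 * t**3, 6 * t**5)

def b(t):
    return (math.exp(-t), math.exp(-3 * t), math.exp(-5 * t))

def b_prime(t):
    return (-math.exp(-t), -3 * math.exp(-3 * t), -5 * math.exp(-5 * t))

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def add(u, v):
    return tuple(p + q for p, q in zip(u, v))

def central_difference(f, t, h=1e-5):
    """Numerical derivative of a tuple-valued function f at t."""
    return tuple((p - q) / (2 * h) for p, q in zip(f(t + h), f(t - h)))

t = 0.9
# rule (c): d/dt [a . b] = a' . b + a . b'
dot_numeric = central_difference(lambda s: (dot(a(s), b(s)),), t)[0]
dot_rule = dot(a_prime(t), b(t)) + dot(a(t), b_prime(t))
# rule (d): d/dt [a x b] = a' x b + a x b'
cross_numeric = central_difference(lambda s: cross(a(s), b(s)), t)
cross_rule = add(cross(a_prime(t), b(t)), cross(a(t), b_prime(t)))
print(dot_numeric - dot_rule)
print([p - q for p, q in zip(cross_numeric, cross_rule)])
```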
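Let's think about the geometric significance of \(\vec r'(t)\). In particular, let's think about the relationship between \(\vec r'(t)\) and distances
along the curve. The derivative \(\vec r'(t)\) is the limit of \(\frac{\vec r(t+h) - \vec r(t)}{h}\) as \(h \to 0\). The numerator,
\(\vec r(t + h) - \vec r(t)\), is the vector with head at \(\vec r(t + h)\) and tail at \(\vec r(t)\). When \(h\) is very small, the length of
this vector is approximately the length \(s(t+h) - s(t)\) of the part of the curve between \(\vec r(t)\) and \(\vec r(t+h)\), where \(s(t)\)
denotes arc length along the curve. Dividing by \(h\) and taking the limit \(h \to 0\) suggests that
\(\frac{ds}{dt}(t) = \big|\frac{d\vec r}{dt}(t)\big|\).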
Lemma 1.6.12
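Let \(\vec r(t)\) be a parametrized curve.
1. Denote by \(\hat T\) the unit tangent vector to the curve at \(\vec r(t)\) pointing in the direction of increasing \(t\). If \(\vec r'(t) \ne \vec 0\) then
\[ \hat T(t) = \frac{\vec r'(t)}{|\vec r'(t)|} \]
2. Denote by \(s(t)\) the length of the part of the curve between \(\vec r(0)\) and \(\vec r(t)\). Then
\[ \frac{ds}{dt}(t) = \Big|\frac{d\vec r}{dt}(t)\Big| \]
\[ s(T) - s(T_0) = \int_{T_0}^{T} \Big|\frac{d\vec r}{dt}(t)\Big|\,dt \]
3. In particular, if the parameter happens to be arc length, i.e. if \(t = s\), so that \(\frac{ds}{ds} = 1\), then
\[ \Big|\frac{d\vec r}{ds}(s)\Big| = 1 \qquad \hat T(s) = \vec r'(s) \]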
Lemma 1.6.13
If \(\vec r(t) = (x(t), y(t), z(t))\) is the position of a particle at time \(t\), then
\[ \begin{aligned}
\text{velocity at time } t &= \vec v(t) = \vec r'(t) = x'(t)\,\hat\imath + y'(t)\,\hat\jmath + z'(t)\,\hat k = \frac{ds}{dt}(t)\,\hat T(t) \\
\text{speed at time } t &= \frac{ds}{dt}(t) = |\vec v(t)| = |\vec r'(t)| = \sqrt{x'(t)^2 + y'(t)^2 + z'(t)^2} \\
\text{acceleration at time } t &= \vec a(t) = \vec r''(t) = \vec v'(t) = x''(t)\,\hat\imath + y''(t)\,\hat\jmath + z''(t)\,\hat k
\end{aligned} \]
and the distance travelled between times \(T_0\) and \(T\) is
\[ s(T) - s(T_0) = \int_{T_0}^{T} \Big|\frac{d\vec r}{dt}(t)\Big|\,dt = \int_{T_0}^{T} \sqrt{x'(t)^2 + y'(t)^2 + z'(t)^2}\,dt \]
Note that the velocity \(\vec v(t) = \vec r'(t)\) is a vector quantity while the speed \(\frac{ds}{dt}(t) = |\vec r'(t)|\) is a scalar quantity.
In general it can be quite difficult to compute arc lengths. So, as an easy warmup example, we will compute the circumference
of the circle\(^{3}\) \(x^2 + y^2 = a^2\). We'll also find a unit tangent to the circle at any point on the circle. We'll use the parametrization
\[ \vec r(\theta) = (a\cos\theta,\ a\sin\theta) \qquad 0 \le \theta \le 2\pi \]
of Example 1.6.1. Using Lemma 1.6.12, but with the parameter \(t\) renamed to \(\theta\)
\[ \begin{aligned}
\vec r'(\theta) &= -a\sin\theta\,\hat\imath + a\cos\theta\,\hat\jmath \\
\hat T(\theta) &= \frac{\vec r'(\theta)}{|\vec r'(\theta)|} = -\sin\theta\,\hat\imath + \cos\theta\,\hat\jmath \\
\frac{ds}{d\theta}(\theta) &= \big|\vec r'(\theta)\big| = a \\
s(\Theta) - s(0) &= \int_0^{\Theta} \big|\vec r'(\theta)\big|\,d\theta = a\Theta
\end{aligned} \]
As\(^{4}\) \(s(\Theta)\) is the arc length of the part of the circle with \(0 \le \theta \le \Theta\), the circumference of the whole circle is
\[ s(2\pi) = 2\pi a \]
which is reassuring, since this formula has been known\(^{5}\) for thousands of years.
The formula \(s(\Theta) - s(0) = a\Theta\) also makes sense: the part of the circle with \(0 \le \theta \le \Theta\) is the fraction
\(\frac{\Theta}{2\pi}\) of the whole circle, and so should have length \(\frac{\Theta}{2\pi} \times 2\pi a\). Also note that
\[ \vec r(\theta)\cdot\hat T(\theta) = (a\cos\theta,\ a\sin\theta)\cdot(-\sin\theta,\ \cos\theta) = 0 \]
so that the tangent to the circle at any point is perpendicular to the radius vector of the circle at that point. This is another
geometric fact that has been known\(^{6}\) for thousands of years.
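The formula \(s(\Theta) - s(0) = a\Theta\) can also be approached from the definition of arc length as a limit of lengths of inscribed polygons. The sketch below adds up the lengths of \(n\) straight chords along the arc and compares the total with \(a\Theta\). The function names and the radius \(a = 2\) are assumptions made only for this illustration.

```python
import math

def circle_point(a, theta):
    return (a * math.cos(theta), a * math.sin(theta))

def chord_length_sum(a, big_theta, n):
    """Total length of n straight chords along the arc 0 <= theta <= big_theta."""
    total = 0.0
    for i in range(n):
        p = circle_point(a, big_theta * i / n)
        q = circle_point(a, big_theta * (i + 1) / n)
        total += math.hypot(q[0] - p[0], q[1] - p[1])
    return total

a = 2.0  # hypothetical radius
for n in (10, 100, 1000):
    # the chord sum approaches the circumference 2*pi*a from below as n grows
    print(n, chord_length_sum(a, 2 * math.pi, n), 2 * math.pi * a)
```

Each chord is shorter than the arc it spans, so the polygon lengths increase towards \(2\pi a\) as \(n\) grows.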
\[ \vec r(t) = 6\sin(2t)\,\hat\imath + 6\cos(2t)\,\hat\jmath + 5t\,\hat k \]
where the standard basis vectors \(\hat\imath = (1, 0, 0)\), \(\hat\jmath = (0, 1, 0)\) and \(\hat k = (0, 0, 1)\). We'll first sketch it, by observing that
\[ x(t)^2 + y(t)^2 = 36\sin^2(2t) + 36\cos^2(2t) = 36 \]
So all points of the curve lie on the cylinder \(x^2 + y^2 = 36\), and \(z(t) = 5t\) increases steadily as \(t\) increases.
We have marked three points of the curve on the above sketch. The first has \(t = 0\) and is \(0\,\hat\imath + 6\,\hat\jmath + 0\,\hat k\). The second
has \(t = \frac{\pi}{2}\) and is \(0\,\hat\imath - 6\,\hat\jmath + \frac{5\pi}{2}\,\hat k\), and the third has \(t = \pi\) and is
\(0\,\hat\imath + 6\,\hat\jmath + 5\pi\,\hat k\). We'll now use Lemma 1.6.12 to find a unit tangent \(\hat T(t)\) to the curve at \(\vec r(t)\) and also
the arclength of the part of curve between \(t = 0\) and \(t = \pi\).
\[ \begin{aligned}
\vec r(t) &= 6\sin(2t)\,\hat\imath + 6\cos(2t)\,\hat\jmath + 5t\,\hat k \\
\vec r'(t) &= 12\cos(2t)\,\hat\imath - 12\sin(2t)\,\hat\jmath + 5\,\hat k \\
\frac{ds}{dt}(t) &= \big|\vec r'(t)\big| = \sqrt{12^2\cos^2(2t) + 12^2\sin^2(2t) + 5^2} = \sqrt{12^2 + 5^2} = 13 \\
\hat T(t) &= \frac{\vec r'(t)}{|\vec r'(t)|} = \frac{12}{13}\cos(2t)\,\hat\imath - \frac{12}{13}\sin(2t)\,\hat\jmath + \frac{5}{13}\,\hat k \\
s(\pi) - s(0) &= \int_0^{\pi} \big|\vec r'(t)\big|\,dt = 13\pi
\end{aligned} \]
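The helix computation lends itself to a direct numerical check of Lemma 1.6.12. The sketch below evaluates the speed \(|\vec r'(t)|\), normalizes \(\vec r'(t)\) to get the unit tangent, and approximates the arc length integral with the midpoint rule. The midpoint rule and the number of subintervals are our own choices, not something prescribed by the text.

```python
import math

def r_prime(t):
    """r'(t) = (12 cos 2t, -12 sin 2t, 5) for r(t) = (6 sin 2t, 6 cos 2t, 5t)."""
    return (12 * math.cos(2 * t), -12 * math.sin(2 * t), 5.0)

def speed(t):
    return math.sqrt(sum(c * c for c in r_prime(t)))

def unit_tangent(t):
    s = speed(t)
    return tuple(c / s for c in r_prime(t))

def arc_length(t0, t1, n=1000):
    """Midpoint-rule approximation of the integral of |r'(t)| from t0 to t1."""
    h = (t1 - t0) / n
    return h * sum(speed(t0 + (i + 0.5) * h) for i in range(n))

print(speed(0.3))                               # constant speed, approximately 13
print(unit_tangent(0.0))                        # approximately (12/13, 0, 5/13)
print(arc_length(0.0, math.pi), 13 * math.pi)   # the two values should agree closely
```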
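Suppose that, at time \(t\), a particle is at
\[ \vec r(t) = \Big[h + a\cos\Big(2\pi\frac{t}{T}\Big)\Big]\hat\imath + \Big[k + a\sin\Big(2\pi\frac{t}{T}\Big)\Big]\hat\jmath \]
As \(|\vec r(t) - h\,\hat\imath - k\,\hat\jmath| = a\), the particle is running around the circle of radius \(a\) centred on \((h, k)\). When \(t\)
increases by \(T\), the argument, \(2\pi\frac{t}{T}\), of \(\cos(2\pi\frac{t}{T})\) and \(\sin(2\pi\frac{t}{T})\) increases by exactly \(2\pi\) and
the particle runs exactly once around the circle. In particular, it travels a distance \(2\pi a\). So it is moving at speed
\(\frac{2\pi a}{T}\). According to Lemma 1.6.13, it has
\[ \begin{aligned}
\text{velocity} &= \vec r'(t) = -\frac{2\pi a}{T}\sin\Big(2\pi\frac{t}{T}\Big)\hat\imath + \frac{2\pi a}{T}\cos\Big(2\pi\frac{t}{T}\Big)\hat\jmath \\
\text{speed} &= \frac{ds}{dt}(t) = |\vec r'(t)| = \frac{2\pi a}{T} \\
\text{acceleration} &= \vec r''(t) = -\frac{4\pi^2 a}{T^2}\cos\Big(2\pi\frac{t}{T}\Big)\hat\imath - \frac{4\pi^2 a}{T^2}\sin\Big(2\pi\frac{t}{T}\Big)\hat\jmath \\
 &= -\frac{4\pi^2}{T^2}\big[\vec r(t) - h\,\hat\imath - k\,\hat\jmath\big]
\end{aligned} \]
The velocity \(\vec r'(t)\) has dot product zero with \(\vec r(t) - h\,\hat\imath - k\,\hat\jmath\), which is the radius vector from the centre of the
circle to the particle. So the velocity is perpendicular to the radius vector, and hence parallel to the tangent vector of the circle at \(\vec r(t)\).
The speed given by Lemma 1.6.13 is exactly the speed we found above, just before we started applying Lemma 1.6.13.
The acceleration \(\vec r''(t)\) points in the direction opposite to the radius vector.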
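The three observations about circular motion (constant speed \(\frac{2\pi a}{T}\), velocity perpendicular to the radius vector, acceleration antiparallel to the radius vector) can be confirmed numerically. In the sketch below the position is taken to be \(\big(h + a\cos(2\pi t/T),\ k + a\sin(2\pi t/T)\big)\), and the particular values of \(h\), \(k\), \(a\), \(T\) are hypothetical.

```python
import math

H, K, A, PERIOD = 1.0, -2.0, 3.0, 4.0   # hypothetical values of h, k, a, T
W = 2 * math.pi / PERIOD                # angular rate 2*pi/T

def position(t):
    return (H + A * math.cos(W * t), K + A * math.sin(W * t))

def velocity(t):
    return (-A * W * math.sin(W * t), A * W * math.cos(W * t))

def acceleration(t):
    return (-A * W * W * math.cos(W * t), -A * W * W * math.sin(W * t))

def radius_vector(t):
    """Vector from the centre (h, k) of the circle to the particle."""
    x, y = position(t)
    return (x - H, y - K)

t = 0.37
rad, vel, acc = radius_vector(t), velocity(t), acceleration(t)
print(math.hypot(*vel), 2 * math.pi * A / PERIOD)     # speed equals 2*pi*a/T
print(rad[0] * vel[0] + rad[1] * vel[1])              # approximately 0: velocity is perpendicular to the radius
print(acc[0] + W * W * rad[0], acc[1] + W * W * rad[1])  # approximately 0: acceleration is -(4 pi^2/T^2) times the radius vector
```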
Exercises
Stage 1
Questions 1.6.2.1 through 1.6.2.5 provide practice with curve parametrization. Being comfortable with the algebra and
interpretation of these descriptions are essential ingredients in working effectively with parametrizations.
1
List the three points \((-1/\sqrt{2}, 0)\), \((1, 25)\), and \((0, 25)\) in chronological order.
2
3
Find the specified parametrization of the first quadrant part of the circle \(x^2 + y^2 = a^2\).
4
A circle of radius a rolls along the x-axis in the positive direction, starting with its centre at (a, a). In that position, we mark
the topmost point on the circle P . As the circle moves, P moves with it. Let θ be the angle the circle has rolled - see the
diagram below.
1. Give the position of the centre of the circle as a function of θ.
2. Give the position of \(P\) as a function of \(\theta\).
5
x + y + z = 0.
6
\[ \vec r(t) = e^{-t}\,\hat\imath + \frac{1}{t}\,\hat\jmath + (t-1)^2(t-3)^2\,\hat k \]
for \(t > 0\).
Let the positive z axis point vertically upwards, as usual. When is the particle moving upwards, and when is it moving
downwards? Is it moving faster at time t = 1 or at time t = 3?
7
Below is the graph of the parametrized function \(\vec r(t)\). Let \(s(t)\) be the arclength along the curve from \(\vec r(0)\) to \(\vec r(t)\).
8
What is the relationship between velocity and speed in a vector-valued function of time?
9✳
Let \(\vec r(t)\) be a vector valued function. Let \(\vec r'\), \(\vec r''\), and \(\vec r'''\) denote \(\frac{d\vec r}{dt}\),
\(\frac{d^2\vec r}{dt^2}\), and \(\frac{d^3\vec r}{dt^3}\), respectively. Express
\[ \frac{d}{dt}\big[(\vec r\times\vec r')\cdot\vec r''\big] \]
in terms of \(\vec r\), \(\vec r'\), \(\vec r''\), and \(\vec r'''\). Select the correct answer.
1. \((\vec r'\times\vec r'')\cdot\vec r'''\)
2. \((\vec r'\times\vec r'')\cdot\vec r + (\vec r\times\vec r')\cdot\vec r'''\)
3. \((\vec r\times\vec r')\cdot\vec r'''\)
4. 0
5. None of the above.
Stage 2
10 ✳
2. \(|\vec v(t)| = \sqrt{10 + 5e^{t} + 5e^{-t}}\)
3. \(|\vec v(t)| = \sqrt{10 + e^{10t} + e^{-10t}}\)
4. \(|\vec v(t)| = 5(e^{5t} + e^{-5t})\)
5. \(|\vec v(t)| = 5(e^{t} + e^{-t})\)
11
Find the velocity, speed and acceleration at time \(t\) of the particle whose position is \(\vec r(t)\). Describe the path of the particle.
1. \(\vec r(t) = a\cos t\,\hat\imath + a\sin t\,\hat\jmath + ct\,\hat k\)
2. \(\vec r(t) = a\cos t\sin t\,\hat\imath + a\sin^2 t\,\hat\jmath + a\cos t\,\hat k\)
12 ✳
1. Let
\[ \vec r(t) = \Big(t^2,\ 3,\ \frac{1}{3}t^3\Big) \]
Find the unit tangent vector to this parametrized curve at \(t = 1\), pointing in the direction of increasing \(t\).
2. Find the arc length of the curve from part 1 between the points \((0, 3, 0)\) and \((1, 3, -\frac{1}{3})\).
13
Using Lemma 1.6.12, find the arclength of \(\vec r(t) = \big(t,\ \sqrt{\tfrac{3}{2}}\,t^2,\ t^3\big)\) from \(t = 0\) to \(t = 1\).
14
A particle's position at time \(t\) is given by \(\vec r(t) = (t + \sin t, \cos t)\). What is the magnitude of the acceleration of the particle
at time \(t\)?
(The particle traces out a cycloid; see Question 1.6.2.4.)
15 ✳
3
3
)
16 ✳
Let \(\vec r(t) = (3\cos t, 3\sin t, 4t)\) be the position vector of a particle as a function of time \(t \ge 0\).
17 ✳
Consider the curve
\[ \vec r(t) = \frac{1}{3}\cos^3 t\,\hat\imath + \frac{1}{3}\sin^3 t\,\hat\jmath + \sin^3 t\,\hat k \]
18 ✳
Let \(\vec r(t) = \big(\frac{1}{3}t^3,\ \frac{1}{2}t^2,\ \frac{1}{2}t\big)\), \(t \ge 0\). Compute \(s(t)\), the arclength of the curve at time \(t\).
19 ✳
20
If a particle has constant mass \(m\), position \(\vec r\), and is moving with velocity \(\vec v\), then its angular momentum is
\(\vec L = m(\vec r\times\vec v)\).
For a particle with mass \(m = 1\) and position function \(\vec r = (\sin t, \cos t, t)\), find \(\left|\frac{\mathrm{d}\vec L}{\mathrm{d}t}\right|\).
21 ✳
This curve starts from the origin and eventually reaches the ellipsoid \(E\) whose equation is \(2x^2 + 2y^2 + z^2 = 24\).
22 ✳
Stage 3
23 ✳
dt
when the particle is at (1, 3, 6).
4. Find \(\frac{d^2u}{dt^2}\) when the particle is at \((1, 3, 6)\).
24 ✳
2
2
^
k and velocity v ⃗ 0 =
π
2
^
ı
ı at time 0. It moves under a force
\[ \vec F(t) = -3t\,\hat\imath + \sin t\,\hat\jmath + 2e^{2t}\,\hat k. \]
25 ✳
Let \(C\) be the curve of intersection of the surfaces \(y = x^2\) and \(z = \frac{2}{3}x^3\). A particle moves along \(C\) with constant speed such
that > 0. The particle is at (0, 0, 0) at time t = 0 and is at (3, 9, 18) at time t =
dx
dt
.
7
1. Find the length of the part of C between (0, 0, 0) and (3, 9, 18).
2. Find the constant speed of the particle.
3. Find the velocity of the particle when it is at \((1, 1, \frac{2}{3})\).
26
A camera mounted to a pole can swivel around in a full circle. It is tracking an object whose position at time t seconds is x(t)
metres east of the pole, and y(t) metres north of the pole.
In order to always be pointing directly at the object, how fast should the camera be programmed to rotate at time t? (Give your
answer in terms of x(t) and y(t) and their derivatives, in the units rad/sec.)
27
A projectile falling under the influence of gravity and slowed by air resistance proportional to its speed has position satisfying
\[ \frac{d^2\vec r}{dt^2} = -g\,\hat k - \alpha\frac{d\vec r}{dt} \]
\(\frac{d\vec r}{dt} = \vec v_0\) at time \(t = 0\), find \(\vec r(t)\). (Hint: Define \(\vec u(t) = e^{\alpha t}\frac{d\vec r}{dt}(t)\) and substitute
\(\frac{d\vec r}{dt}(t) = e^{-\alpha t}\vec u(t)\) into the given differential equation to find a differential equation for \(\vec u\).)
28 ✳
acceleration vector
29 ✳
30 ✳
1. The curves \(\vec r_1(t) = \langle 1 + t, t^2, t^3\rangle\) and \(\vec r_2(t) = \langle \cos t, \sin t, t\rangle\) intersect at the point \(P(1, 0, 0)\). Find the angle of
1. When we say \(\vec r(t) = (x(t), y(t), z(t))\), we mean that \((x(t), y(t), z(t))\) is the point at the head of the vector \(\vec r(t)\) when its tail
is at the origin.
2. We of course assume that the constant a > 0.
3. We of course assume that the constant a > 0.
4. You might guess that Θ is a capital Greek theta. You'd be right.
5. The earliest known written approximations of \(\pi\), in Egypt and Babylon, date from 1900–1600BC. The first recorded algorithm
for rigorously evaluating \(\pi\) was developed by Archimedes around 250 BC. The first use in print of the symbol \(\pi\) for the ratio between
the circumference of a circle and its diameter was in 1706 by William Jones.
6. It is Proposition 18 in Book 3 of Euclid's Elements. It was published around 300BC.
This page titled 1.6: Curves and their Tangent Vectors is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by
Joel Feldman, Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a
detailed edit history is available upon request.
1.7: Sketching Surfaces in 3d
In practice students taking multivariable calculus regularly have great difficulty visualising surfaces in three dimensions, despite
the fact that we all live in three dimensions. We'll now develop some technique to help us sketch surfaces in three dimensions\(^{1}\).
We all have a fair bit of experience drawing curves in two dimensions. Typically the intersection of a surface (in three dimensions)
with a plane is a curve lying in the (two dimensional) plane. Such an intersection is usually called a cross-section. In the special
case that the plane is one of the coordinate planes, the intersection is sometimes called a trace. One can often get a pretty good idea
of what a surface looks like by sketching a bunch of cross-sections. Here are some examples.
Example 1.7.1. \(4x^2 + y^2 - z^2 = 1\)
Solution
We'll start by fixing any number \(z_0\) and sketching the part of the surface that lies in the horizontal plane \(z = z_0\).
The intersection of our surface with that horizontal plane is a horizontal cross-section. Any point \((x, y, z)\) lying on that
horizontal cross-section satisfies both
\[ \begin{aligned}
 & z = z_0 \ \text{ and } \ 4x^2 + y^2 - z^2 = 1 \\
\iff\ & z = z_0 \ \text{ and } \ 4x^2 + y^2 = 1 + z_0^2
\end{aligned} \]
The second equation describes a curve in the plane \(z = z_0\). When \(y = 0\), we have \(x = \pm\frac{1}{2}\sqrt{1 + z_0^2}\) and when \(x = 0\), we have
\(y = \pm\sqrt{1 + z_0^2}\). So the curve is just an ellipse with \(x\) semi-axis \(\frac{1}{2}\sqrt{1 + z_0^2}\) and \(y\) semi-axis \(\sqrt{1 + z_0^2}\).
It's easy to sketch.
Remember that this ellipse is the part of our surface that lies in the plane \(z = z_0\). Imagine that the sketch of the ellipse is on a
single sheet of paper. Lift the sheet of paper up, move it around so that the \(x\)- and \(y\)-axes point in the directions of the three
dimensional \(x\)- and \(y\)-axes and place the sheet of paper into the three dimensional sketch at height \(z_0\). This gives a single
horizontal ellipse in the three dimensional sketch.
1.7.1 https://math.libretexts.org/@go/page/92235
We can build up the full surface by stacking many of these horizontal ellipses, one for each possible height \(z_0\). So we now
draw a few of them as in the figure below. To reduce the amount of clutter in the sketch, we have only drawn the first octant
(i.e. the part of three dimensions that has x ≥ 0, y ≥ 0 and z ≥ 0 ).
Here is why it is OK, in this case, to just sketch the first octant. Replacing \(x\) by \(-x\) in the equation \(4x^2 + y^2 - z^2 = 1\) does
not change the equation. That means that a point \((x, y, z)\) is on the surface if and only if the point \((-x, y, z)\) is on the surface.
So the surface is invariant under reflection in the \(yz\)-plane. Similarly, the equation \(4x^2 + y^2 - z^2 = 1\) does not change when \(y\)
is replaced by \(-y\) or \(z\) is replaced by \(-z\). Our surface is also invariant under reflection in the \(xz\)- and \(xy\)-planes. Once we have the
part in the first octant, the remaining octants can be gotten simply by reflecting about the coordinate planes.
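Both the description of the horizontal cross-sections and the reflection symmetry are easy to test numerically. The sketch below builds a point on the ellipse at height \(z_0\) from its semi-axes and checks that the point, and its three reflections in the coordinate planes, satisfy \(4x^2 + y^2 - z^2 = 1\). The function names and the sample values \(z_0 = 2\), \(t = 0.6\) are assumptions made for illustration.

```python
import math

def F(x, y, z):
    """Left side of the surface equation 4x^2 + y^2 - z^2 = 1."""
    return 4 * x * x + y * y - z * z

def cross_section_point(z0, t):
    """Point on the horizontal ellipse 4x^2 + y^2 = 1 + z0^2 at height z0."""
    y_semi = math.sqrt(1 + z0 * z0)   # y semi-axis
    x_semi = 0.5 * y_semi             # x semi-axis
    return (x_semi * math.cos(t), y_semi * math.sin(t), z0)

p = cross_section_point(2.0, 0.6)
print(F(*p))  # approximately 1, so the point is on the surface
x, y, z = p
reflections = [(-x, y, z), (x, -y, z), (x, y, -z)]
print([F(*q) for q in reflections])  # each approximately 1 as well
```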
We can get a more visually meaningful sketch by adding in some vertical cross-sections. The x = 0 and y = 0 cross-sections
(also called traces — they are the parts of our surface that are in the yz- and xz-planes, respectively) are
\[ x = 0,\ y^2 - z^2 = 1 \qquad\text{and}\qquad y = 0,\ 4x^2 - z^2 = 1 \]
These equations describe hyperbolae\(^{3}\). If you don't remember how to sketch them, don't worry. We'll do it now. We'll first
sketch them in 2d. Observe that
\[ \begin{aligned}
y^2 = 1 + z^2 &\implies |y| \ge 1, \quad y = \pm 1 \text{ when } z = 0, \quad\text{and for large } z,\ y \approx \pm z \\
4x^2 = 1 + z^2 &\implies |x| \ge \tfrac{1}{2}, \quad x = \pm\tfrac{1}{2} \text{ when } z = 0, \quad\text{and for large } z,\ x \approx \pm\tfrac{1}{2}z
\end{aligned} \]
Now we'll incorporate them into the 3d sketch. Once again imagine that each is a single sheet of paper. Pick each up and move
it into the 3d sketch, carefully matching up the axes. The red (blue) parts of the hyperbolas above become the red (blue) parts
of the 3d sketch below (assuming of course that you are looking at this on a colour screen).
Now that we have a pretty good idea of what the surface looks like we can clean up and simplify the sketch. Here are a couple
of possibilities.
Example 1.7.2. \(4x^2 + y^2 - z^2 = -1\)
Solution
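As in the last example, we'll start by fixing any number \(z_0\) and sketching the part of the surface that lies in the horizontal plane
\(z = z_0\). Any point on that horizontal cross-section satisfies
\[ z = z_0 \ \text{ and } \ 4x^2 + y^2 = z_0^2 - 1 \]
Think of \(z_0\) as a constant. If \(|z_0| < 1\), then \(z_0^2 - 1 < 0\) and no point \((x, y)\) obeys \(4x^2 + y^2 = z_0^2 - 1\). If \(|z_0| = 1\), the
only solution is \(x = y = 0\). If \(|z_0| > 1\), the cross-section is an ellipse with \(x\) semi-axis \(\frac{1}{2}\sqrt{z_0^2 - 1}\) and \(y\) semi-axis
\(\sqrt{z_0^2 - 1}\). These semi-axes are small when \(|z_0|\) is close to \(1\) and grow as \(|z_0|\) increases.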
The first octant parts of a few of these horizontal cross-sections are drawn in the figure below.
Next we add in the x = 0 and y = 0 cross-sections (i.e. the parts of our surface that are in the yz- and xz-planes, respectively)
\[ x = 0,\ z^2 = 1 + y^2 \qquad\text{and}\qquad y = 0,\ z^2 = 1 + 4x^2 \]
Now that we have a pretty good idea of what the surface looks like we clean up and simplify the sketch.
Here are two figures created by graphing software.
Example 1.7.3. yz = 1
Sketch the surface yz = 1.
Solution
This surface has a special property that makes it relatively easy to sketch. There are no \(x\)'s in the equation \(yz = 1\). That means
that if some \(y_0\) and \(z_0\) obey \(y_0 z_0 = 1\), then the point \((x, y_0, z_0)\) lies on the surface \(yz = 1\) for all values of \(x\). As \(x\) runs from
\(-\infty\) to \(\infty\), the point \((x, y_0, z_0)\) sweeps out a straight line parallel to the \(x\)-axis. So the surface \(yz = 1\) is a union of lines
parallel to the \(x\)-axis. It is invariant under translations parallel to the \(x\)-axis. To sketch \(yz = 1\), we just need to sketch its
intersection with the \(yz\)-plane and then translate the resulting curve parallel to the \(x\)-axis to sweep out the surface.
We'll start with a sketch of the hyperbola yz = 1 in two dimensions.
Next we'll move this 2d sketch into the yz-plane, i.e. the plane x = 0, in 3d, except that we'll only draw in the part in the first
octant.
Example 1.7.4. xyz = 4
As usual, we start by fixing any number \(z_0\) and sketching the part of the surface that lies in the horizontal plane \(z = z_0\). The
intersection of our surface with that horizontal plane is the hyperbola
\[ z = z_0 \ \text{ and } \ xy = \frac{4}{z_0} \]
Note that \(x \to \infty\) as \(y \to 0\) and that \(y \to \infty\) as \(x \to 0\). So the hyperbola has both the \(x\)-axis and the \(y\)-axis as asymptotes,
when drawn in the \(xy\)-plane. The first octant parts of a few of these horizontal cross-sections (namely, \(z_0 = 4\), \(z_0 = 2\) and
\(z_0 = \frac{1}{2}\)) are drawn in the figure below.
Next we add some vertical cross-sections. We can't use \(x = 0\) or \(y = 0\) because any point on \(xyz = 4\) must have all of \(x, y, z\)
nonzero. So we use
Finally, we clean up and simplify the sketch.
Solution
Fix any real number \(C\). Then, for the specified function \(f\), the level curve \(f(x, y) = C\) is the set of points \((x, y)\) that obey
\[ \begin{aligned}
x^2 + 4y^2 - 2x + 2 = C &\iff x^2 - 2x + 1 + 4y^2 + 1 = C \\
 &\iff (x - 1)^2 + 4y^2 = C - 1
\end{aligned} \]
Now \((x - 1)^2 + 4y^2\) is the sum of two squares, and so is always at least zero. So if \(C - 1 < 0\), i.e. if \(C < 1\), there is no curve
\(f(x, y) = C\). If \(C - 1 = 0\), i.e. if \(C = 1\), then \((x - 1)^2 + 4y^2 = C - 1 = 0\) if and only if both \((x - 1)^2 = 0\) and \(4y^2 = 0\) and so the
level curve consists of the single point \((1, 0)\). If \(C > 1\), then \(f(x, y) = C\) becomes \((x - 1)^2 + 4y^2 = C - 1 > 0\) which
describes an ellipse centred on \((1, 0)\). It intersects the \(x\)-axis when \(y = 0\) and
\[ (x - 1)^2 = C - 1 \iff x - 1 = \pm\sqrt{C - 1} \iff x = 1 \pm \sqrt{C - 1} \]
and it intersects the line \(x = 1\) (i.e. the vertical line through the centre) when
\[ 4y^2 = C - 1 \iff 2y = \pm\sqrt{C - 1} \iff y = \pm\frac{1}{2}\sqrt{C - 1} \]
So, when \(C > 1\), \(f(x, y) = C\) is the ellipse centred on \((1, 0)\) with \(x\) semi-axis \(\sqrt{C - 1}\) and \(y\) semi-axis \(\frac{1}{2}\sqrt{C - 1}\).
Here is a sketch of some representative level curves of \(f(x, y) = x^2 + 4y^2 - 2x + 2\).
It is often easier to develop an understanding of the behaviour of a function f (x, y) by looking at a sketch of its level curves,
than it is by looking at a sketch of its graph. On the other hand, you can also use a sketch of the level curves of f (x, y) as the
first step in building a sketch of the graph z = f (x, y). The next step would be to redraw, for each C , the level curve
f (x, y) = C , in the plane z = C , as we did in Example 1.7.1.
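The case analysis for the level curves of \(f(x, y) = x^2 + 4y^2 - 2x + 2\) translates directly into a small function. The sketch below returns a description of the level curve for a given \(C\) and then spot checks one point of the \(C = 5\) ellipse. The function name `level_curve` and the shape of its return value are our own choices.

```python
import math

def f(x, y):
    return x * x + 4 * y * y - 2 * x + 2

def level_curve(C):
    """Describe the level curve f(x, y) = C of f(x, y) = x^2 + 4y^2 - 2x + 2."""
    if C < 1:
        return ("empty", None)
    if C == 1:
        return ("point", (1.0, 0.0))
    root = math.sqrt(C - 1)
    return ("ellipse", {"centre": (1.0, 0.0), "x_semi_axis": root, "y_semi_axis": 0.5 * root})

for C in (0, 1, 2, 5):
    print(C, level_curve(C))

# spot check: a point on the C = 5 ellipse should have f approximately equal to 5
kind, info = level_curve(5)
x = 1.0 + info["x_semi_axis"] * math.cos(0.4)
y = info["y_semi_axis"] * math.sin(0.4)
print(f(x, y))
```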
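the solution \(z\) of \(e^{x+y+z} = 1\) is \(f(x, y)\). So, for the specified function \(f\) and any fixed real number \(C\), the level curve
\(f(x, y) = C\) is the set of points \((x, y)\) that obey
\[ \begin{aligned}
e^{x+y+C} = 1 &\iff x + y + C = 0 \qquad \text{(by taking the logarithm of both sides)} \\
 &\iff x + y = -C
\end{aligned} \]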
This is of course a straight line. It intersects the x-axis when y = 0 and x = −C and it intersects the y -axis when x = 0 and
y = −C . Here is a sketch of some level curves.
We have just seen that sketching the level curves of a function f (x, y) can help us understand the behaviour of f . We can
generalise this to functions F (x, y, z) of three variables. A level surface of F (x, y, z) is a surface whose equation is of the form
F (x, y, z) = C for some constant C . It is the set of points (x, y, z) at which F takes the value C .
Let \(F(x, y, z) = x^2 + y^2 + z^2\). If \(C > 0\), then the level surface \(F(x, y, z) = C\) is the sphere of radius \(\sqrt{C}\) centred on the
origin. Here is a sketch of the parts of the level surfaces \(F = 1\) (radius \(1\)), \(F = 4\) (radius \(2\)) and \(F = 9\) (radius \(3\)) that are in
the first octant.
Let \(F(x, y, z) = x^2 + z^2\) and \(C > 0\). Consider the level surface \(x^2 + z^2 = C\). The variable \(y\) does not appear in this
equation. So for any fixed \(y_0\), the intersection of our surface \(x^2 + z^2 = C\) with the plane \(y = y_0\) is the circle of radius \(\sqrt{C}\)
centred on \(x = z = 0\). Here is a sketch of the first quadrant part of one such circle.
The full surface is the horizontal stack of all of those circles with \(y_0\) running over \(\mathbb{R}\). It is the cylinder of radius \(\sqrt{C}\) centred on
the \(y\)-axis. Here is a sketch of the parts of the level surfaces \(F = 1\) (radius \(1\)), \(F = 4\) (radius \(2\)) and \(F = 9\) (radius \(3\)) that are
in the first octant.
that contains the intercepts (ln C , 0, 0), (0, ln C , 0) and (0, 0, ln C ). Here is a sketch of the parts of the level surfaces
\(F = e\) (intercepts \((1, 0, 0)\), \((0, 1, 0)\), \((0, 0, 1)\)),
\(F = e^2\) (intercepts \((2, 0, 0)\), \((0, 2, 0)\), \((0, 0, 2)\)) and
\(F = e^3\) (intercepts \((3, 0, 0)\), \((0, 3, 0)\), \((0, 0, 3)\))
that are in the first octant.
Exercises
Stage 1
1. ✳
Match the following equations and expressions with the corresponding pictures. Cartesian coordinates are \((x, y, z)\), cylindrical
coordinates are \((r, \theta, z)\), and spherical coordinates are \((\rho, \theta, \varphi)\).
(a) \(\varphi = \pi/3\) (b) \(r = 2\cos\theta\) (c) \(x^2 + y^2 = z^2 + 1\)
(d) \(y = x^2 + z^2\) (e) \(\rho = 2\cos\varphi\) (f) \(z = x^4 + y^4 - 4xy\)
2
In each of (a) and (b) below, you are provided with a sketch of the first quadrant parts of a few level curves of some function
f (x, y). Sketch the first octant part of the corresponding graph z = f (x, y).
(a) (b)
3
Sketch a few level curves for the function f (x, y) whose graph z = f (x, y) is sketched below.
Stage 2
4
2. f (x, y) = xy
3. \(f(x, y) = xe^{-y}\)
5. ✳
Sketch the level curves of \(f(x, y) = \frac{2y}{x^2 + y^2}\).
6. ✳
Draw a “contour map” of \(f(x, y) = e^{-x^2 + 4y^2}\), showing all types of level curves that occur.
7. ✳
8. ✳
9
Describe the level surfaces of
1. \(f(x, y, z) = x^2 + y^2 + z^2\)
2. \(f(x, y, z) = x + 2y + 3z\)
3. \(f(x, y, z) = x^2 + y^2\)
10
11
2. \(x + y + 2z = 4\)
3. \(\frac{y^2}{9} + \frac{z^2}{4} = 1 + \frac{x^2}{16}\)
4. \(y^2 = x^2 + z^2\)
5. \(\frac{x^2}{9} + \frac{y^2}{12} + \frac{z^2}{9} = 1\)
6. \(x^2 + y^2 + z^2 + 4x - by + 9z - b = 0\) where \(b\) is a constant.
7. \(\frac{x}{4} = \frac{y^2}{4} + \frac{z^2}{9}\)
8. \(z = x^2\)
Stage 3
12
The surface below has circular level curves, centred along the z -axis. The lines given are the intersection of the surface with
the right half of the yz-plane. Give an equation for the surface.
1. Of course you could instead use some fancy graphing software, but part of the point is to build intuition. Not to mention that
you can't use fancy graphing software on your exam.
2. The semi-axes of an ellipse are the line segments from the centre of the ellipse to the farthest points on the ellipse and to the
nearest points on the ellipse. For a circle the lengths of all of these line segments are just the radius.
3. It's not just a figure of speech!
This page titled 1.7: Sketching Surfaces in 3d is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Joel
Feldman, Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a detailed
edit history is available upon request.
1.8: Cylinders
There are some classes of relatively simple, but commonly occurring, surfaces that are given their own names. One such class is
cylindrical surfaces. You are probably used to thinking of a cylinder as being something that looks like \(x^2 + y^2 = 1\).
Example 1.8.2
Here are sketches of three cylinders. The familiar cylinder on the left below
in Example 1.7.3. It is called a hyperbolic cylinder. In this example, the given fixed curve is the hyperbola yz = 1, x =0 and
the given line is the x-axis.
1.8.1 https://math.libretexts.org/@go/page/92236
This page titled 1.8: Cylinders is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Joel Feldman, Andrew
Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is
available upon request.
1.9: Quadric Surfaces
Another named class of relatively simple, but commonly occurring, surfaces is the quadric surfaces.
for some constants A, B, ⋯ , J. Each constant z cross section of a quadric surface has an equation of the form
\[Ax^2 + Dxy + By^2 + gx + hy + j = 0, \qquad z = z_0\]
If A = B = D = 0 but g and h are not both zero, this is a straight line. If A, B, and D are not all zero, then by rotating and translating our coordinate system the equation of the cross section can be brought into one of the forms²
\(\alpha x^2 + \beta y^2 = \gamma\) with α, β > 0, which, if γ > 0, is an ellipse (or a circle),
\(\alpha x^2 - \beta y^2 = \gamma\) with α, β > 0, which, if γ ≠ 0, is a hyperbola, and if γ = 0 is two lines,
\(x^2 = \delta y\), which, if δ ≠ 0 is a parabola, and if δ = 0 is a straight line.
There are similar statements for the constant x cross sections and the constant y cross sections. Hence quadric surfaces are built
by stacking these three types of curves.
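The three cases above can be mirrored in a short Python sketch. The function name `classify_cross_section` is made up for illustration, and the test it uses (the sign of the discriminant D² − 4AB of the quadratic part) is a standard fact from analytic geometry rather than something stated in the text. It reports only the type of curve and makes no attempt to detect the degenerate cases (two lines, a single line, or an empty cross section).

```python
def classify_cross_section(A, B, D, tol=1e-12):
    """Classify A*x^2 + D*x*y + B*y^2 + (lower order terms) = 0 by its quadratic part.

    Hypothetical helper: only the type is reported; degenerate cases are not detected.
    """
    if abs(A) < tol and abs(B) < tol and abs(D) < tol:
        return "no quadratic terms (straight line if g, h are not both zero)"
    disc = D * D - 4 * A * B  # discriminant of the quadratic part
    if disc < -tol:
        return "ellipse type"
    if disc > tol:
        return "hyperbola type"
    return "parabola type"


# Constant z cross sections of 4x^2 + y^2 - z^2 = 1 have quadratic part 4x^2 + y^2.
print(classify_cross_section(4, 1, 0))
# The x = 0 cross section y^2 - z^2 = 1 has quadratic part y^2 - z^2.
print(classify_cross_section(1, -1, 0))
```

Rotating and translating the coordinate system does not change the sign of the discriminant, which is why this test agrees with the three normal forms listed above.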
We have already seen a number of quadric surfaces in the last couple of sections.
We saw the quadric surface \(4x^2 + y^2 - z^2 = 1\) in Example 1.7.1.
Its constant z cross sections are ellipses and its x = 0 and y = 0 cross sections are hyperbolae. It is called a hyperboloid of one
sheet.
We saw the quadric surface \(x^2 + y^2 = 1\) in Example 1.8.2.
Its constant z cross sections are circles and its x =0 and y =0 cross sections are straight lines. It is called a right circular
cylinder.
Appendix A.8 contains other quadric surfaces.
1. Technically, we should also require that the polynomial can't be factored into the product of two polynomials of degree one.
2. This statement can be justified using a linear algebra eigenvalue/eigenvector analysis. It is beyond what we can cover here, but
is not too difficult for a standard linear algebra course.
This page titled 1.9: Quadric Surfaces is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Joel Feldman,
Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit
history is available upon request.
1.9.1 https://math.libretexts.org/@go/page/92237
CHAPTER OVERVIEW
2: Partial Derivatives
In this chapter we are going to generalize the definition of “derivative” to functions of more than one variable and then we are
going to use those derivatives. We will parallel the development in Chapters 1 and 2 of the CLP-1 text. We shall
define limits and continuity of functions of more than one variable (Definitions 2.1.2 and 2.1.3) and then
study the properties of limits in more than one dimension (Theorem 2.1.5) and then
define derivatives of functions of more than one variable (Definition 2.2.1).
We are going to be able to speed things up considerably by recycling what we have already learned in the CLP-1 text.
We start by generalizing the definition of “limit” to functions of more than one variable.
2.1: Limits
2.2: Partial Derivatives
2.3: Higher Order Derivatives
2.4: The Chain Rule
2.5: Tangent Planes and Normal Lines
2.6: Linear Approximations and Error
2.7: Directional Derivatives and the Gradient
2.8: Optional — Solving the Wave Equation
2.9: Maximum and Minimum Values
2.10: Lagrange Multipliers
This page titled 2: Partial Derivatives is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Joel Feldman,
Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit
history is available upon request.
2.1: Limits
Before we really start, let's recall some useful notation.
Definition 2.1.1
N is the set {1, 2, 3, ⋯} of all natural numbers.
R is the set of all real numbers.
∈ is read “is an element of”.
If S is a set and T is a subset of S, then S ∖ T is {x ∈ S|x ∉ T } , the set S with the elements of T removed. In particular,
if S is a set and a is an element of S, then S ∖ {a} = {x ∈ S|x ≠ a} is the set S with the element a removed.
If n is a natural number, \(\mathbb{R}^n\) is used for both the set of n-component vectors \(\langle x_1, x_2, \cdots, x_n\rangle\) and the set of points \((x_1, x_2, \cdots, x_n)\) with n coordinates.
If S and T are sets, then f : S → T means that f is a function which assigns to each element of S an element of T . The set
S is called the domain of f .
The definition of the limit of a function of more than one variable looks just like the definition 1 of the limit of a function of one
variable. Very roughly speaking
\[\lim_{\vec{x}\to\vec{a}} f(\vec{x}) = L\]
if \(f(\vec{x})\) approaches L whenever \(\vec{x}\) approaches \(\vec{a}\). Here is a more careful definition of limit.
the function \(f(\vec{x})\) be defined for all \(\vec{x}\) near³ \(\vec{a}\) and take values in \(\mathbb{R}^n\)
\(L \in \mathbb{R}^n\)
We write
\[\lim_{\vec{x}\to\vec{a}} f(\vec{x}) = L\]
if⁴ the value of the function \(f(\vec{x})\) is sure to be arbitrarily close to L whenever the value of \(\vec{x}\) is close enough to \(\vec{a}\), without⁵ being exactly \(\vec{a}\).
Now that we have extended the definition of limit, we can extend the definition of continuity.
Let
m and n be natural numbers
\(\vec{a} \in \mathbb{R}^m\)
the function \(f(\vec{x})\) be defined for all \(\vec{x}\) near \(\vec{a}\) and take values in \(\mathbb{R}^n\)
2.1.1 https://math.libretexts.org/@go/page/89209
2. The function f is continuous on a set D if it is continuous at every point of D.
Here are a few very simple examples. There will be some more substantial examples later — after, as we did in the CLP-1 text, we
build some tools that can be used to build complicated limits from simpler ones.
Example 2.1.4
1. If f(x, y) is the constant function which always takes the value L, then
\[\lim_{(x,y)\to(a,b)} f(x, y) = L\]
2. If \(f : \mathbb{R}^2 \to \mathbb{R}\) is defined by \(f(x, y) = x\), then
\[\lim_{(x,y)\to(a,b)} f(x, y) = a\]
Similarly, if \(g : \mathbb{R}^2 \to \mathbb{R}\) is defined by \(g(x, y) = y\), then
\[\lim_{(x,y)\to(a,b)} g(x, y) = b\]
Limits of multivariable functions have much the same computational properties as limits of functions of one variable. The
following theorem summarizes a bunch of them. For simplicity, it concerns primarily real valued functions. That is, functions that
output real numbers as opposed to vectors. However it does contain one vector valued function. The function X in the theorem
takes as input an n -component vector and returns an m-component vector. We will not deal with many vector valued functions
here in CLP-3, but we will see a lot in CLP-4.
Let
m and n be natural numbers
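\(\vec{a} \in \mathbb{R}^m\) and \(\vec{b} \in \mathbb{R}^n\)
\(c, F, G \in \mathbb{R}\)
and
\[f, g : D \setminus \{\vec{a}\} \to \mathbb{R} \qquad X : \mathbb{R}^n \setminus \{\vec{b}\} \to D \setminus \{\vec{a}\} \qquad \gamma : \mathbb{R} \to \mathbb{R}\]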
Assume that
Then
1. \(\lim_{\vec{x}\to\vec{a}} [f(\vec{x}) + g(\vec{x})] = F + G\)
\(\lim_{\vec{x}\to\vec{a}} c f(\vec{x}) = cF\)
3. \(\lim_{\vec{x}\to\vec{a}} \frac{f(\vec{x})}{g(\vec{x})} = \frac{F}{G}\) if G ≠ 0
4. \(\lim_{\vec{y}\to\vec{b}} f(X(\vec{y})) = F\)
This shows that multivariable limits interact very nicely with arithmetic, just as single variable limits did. Also recall, from
Theorem 1.6.8 in the CLP-1 text,
Theorem 2.1.6
Example 2.1.7
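as a typical application of Theorem 2.1.5. Here “\(\overset{a}{=}\)” means that part (a) of Theorem 2.1.5 justifies that equality. Start by
computing separately the limits of the numerator and denominator.
\begin{align*}
\lim_{(x,y)\to(2,3)} (x + \sin y) &\overset{a}{=} \lim_{(x,y)\to(2,3)} x + \lim_{(x,y)\to(2,3)} \sin y \\
&\overset{e}{=} \lim_{(x,y)\to(2,3)} x + \sin\Big(\lim_{(x,y)\to(2,3)} y\Big) \\
&= 2 + \sin 3 \\
\lim_{(x,y)\to(2,3)} (x^2y^2 + 1) &\overset{a}{=} \lim_{(x,y)\to(2,3)} x^2y^2 + \lim_{(x,y)\to(2,3)} 1 \\
&\overset{b}{=} \Big(\lim_{(x,y)\to(2,3)} x\Big)\Big(\lim_{(x,y)\to(2,3)} x\Big)\Big(\lim_{(x,y)\to(2,3)} y\Big)\Big(\lim_{(x,y)\to(2,3)} y\Big) + 1 \\
&= 2^2\, 3^2 + 1
\end{align*}
\[\lim_{(x,y)\to(2,3)} \frac{x + \sin y}{x^2y^2 + 1} = \frac{\lim_{(x,y)\to(2,3)} (x + \sin y)}{\lim_{(x,y)\to(2,3)} (x^2y^2 + 1)} = \frac{2 + \sin 3}{37}\]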
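As a rough numerical sanity check of the value (2 + sin 3)/37, one can evaluate the quotient at points close to (2, 3). The sketch below is illustrative only: the helper name `gap` and the particular diagonal direction of approach are arbitrary choices, and a numerical check of this kind supports the computation but does not prove it.

```python
import math


def f(x, y):
    """The quotient from Example 2.1.7."""
    return (x + math.sin(y)) / (x**2 * y**2 + 1)


# Value obtained from the arithmetic of limits
predicted = (2 + math.sin(3)) / 37


def gap(d):
    """|f - predicted| at a point offset by d from (2, 3) along one (arbitrary) direction."""
    return abs(f(2 + d, 3 - d) - predicted)


for d in (1e-1, 1e-2, 1e-3, 1e-4):
    print(f"d = {d:g}: |f - predicted| = {gap(d):.2e}")
```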
While the CLP-1 text's Definition 1.3.3 of the limit of a function of one variable, and our Definition 2.1.2 of the limit of a
multivariable function look virtually identical, there is a substantial practical difference between the two. In dimension one, you
can approach a point from the left or from the right and that's it. There are only two possible directions of approach. In two or more
dimensions there is “much more room” and there are infinitely many possible types of approach. One can even spiral in to a point.
See the middle and right hand figures below.
The next few examples illustrate the impact that the “extra room” in dimensions greater than one has on limits.
Example 2.1.8
As a second example, we consider \(\lim_{(x,y)\to(0,0)} \frac{x^2y}{x^2+y^2}\). In this example, both the numerator, \(x^2y\), and the denominator, \(x^2+y^2\),
y = r sin θ
The points (x, y) that are close to (0, 0) are those with small r, regardless of what θ is. Recall that \(\lim_{(x,y)\to(0,0)} f(x, y) = L\) when
f(x, y) approaches L as (x, y) approaches (0, 0). Substituting x = r cos θ, y = r sin θ into that statement turns it into the
statement that \(\lim_{(x,y)\to(0,0)} f(x, y) = L\) when f(r cos θ, r sin θ) approaches L as r approaches 0. For our current example
\[\frac{x^2y}{x^2+y^2} = \frac{(r\cos\theta)^2 (r\sin\theta)}{r^2} = r\cos^2\theta\,\sin\theta\]
As \(\left|r\cos^2\theta\,\sin\theta\right| \le r\) tends to 0 as r tends to 0 (regardless of what θ does as r tends to 0) we have
\[\lim_{(x,y)\to(0,0)} \frac{x^2y}{x^2+y^2} = 0\]
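The bound |f| ≤ r used above can be spot-checked numerically by sampling f on circles of shrinking radius. This is a minimal sketch: the helper name `max_abs_on_circle` and the number of sample angles are arbitrary choices, and sampling finitely many angles only illustrates the bound, it does not prove it.

```python
import math


def f(x, y):
    """f(x, y) = x^2 y / (x^2 + y^2), defined away from the origin."""
    return x * x * y / (x * x + y * y)


def max_abs_on_circle(r, samples=720):
    """Largest sampled value of |f| on the circle of radius r centred on the origin."""
    angles = (2 * math.pi * k / samples for k in range(samples))
    return max(abs(f(r * math.cos(t), r * math.sin(t))) for t in angles)


for r in (1.0, 0.1, 0.01, 0.001):
    print(f"r = {r:g}: max |f| on the circle is about {max_abs_on_circle(r):.3e}")
```

The sampled maximum shrinks in proportion to r, which is the behaviour the polar form r cos²θ sin θ predicts.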
Example 2.1.9
As a third example, we consider \(\lim_{(x,y)\to(0,0)} \frac{x^2-y^2}{x^2+y^2}\). Once again, the best way to see the behaviour of \(f(x, y) = \frac{x^2-y^2}{x^2+y^2}\) for (x, y)
Note that, this time, f is independent of r but does depend on θ. Here is a greatly magnified sketch of a number of level curves
for f (x, y).
Observe that
as (x, y) approaches (0, 0) along the ray with 2θ = 30°, f(x, y) approaches the value \(\frac{\sqrt{3}}{2}\) (and in fact f(x, y) takes the value \(\cos(30^\circ) = \frac{\sqrt{3}}{2}\) at every point of that ray)
as (x, y) approaches (0, 0) along the ray with 2θ = 60°, f(x, y) approaches the value \(\frac{1}{2}\) (and in fact f(x, y) takes the value \(\cos(60^\circ) = \frac{1}{2}\) at every point of that ray)
and so on
So there is no single number L such that f(x, y) approaches L as r = |(x, y)| → 0, no matter what the direction of approach
is. The limit \(\lim_{(x,y)\to(0,0)} \frac{x^2-y^2}{x^2+y^2}\) does not exist.
Looking at the sketch above, we see that f(x, y) takes the value F along an entire ray θ = const, r > 0. In the case
\(F = \frac{\sqrt{3}}{2}\), the ray is 2θ = 30°, r > 0. In particular, because the ray extends all the way to (0, 0), f takes the value F for
That is true regardless of which really small number you picked. So \(f(x, y) = \frac{x^2-y^2}{x^2+y^2}\) does not approach any single value as
r = |(x, y)| approaches 0 and we conclude that \(\lim_{(x,y)\to(0,0)} \frac{x^2-y^2}{x^2+y^2}\) does not exist.
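The dependence on the direction of approach is easy to see numerically. The following sketch (the helper name `value_on_ray` is hypothetical) evaluates f along two of the rays mentioned above, at several distances from the origin.

```python
import math


def f(x, y):
    """f(x, y) = (x^2 - y^2) / (x^2 + y^2), defined away from the origin."""
    return (x * x - y * y) / (x * x + y * y)


def value_on_ray(theta_degrees, r):
    """Value of f at distance r from the origin along the ray with polar angle theta."""
    t = math.radians(theta_degrees)
    return f(r * math.cos(t), r * math.sin(t))


# The rays with 2*theta = 30 degrees and 2*theta = 60 degrees
for theta in (15, 30):
    values = [value_on_ray(theta, r) for r in (1.0, 1e-3, 1e-6)]
    print(f"theta = {theta} degrees:", values)
```

Along each ray the printed values do not change with r, but they differ from ray to ray, which is exactly why no single limit L can exist.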
Example 2.1.10
0 if x = y
as (x, y) → (0, 0). Here is a graph of the level curve, f (x, y) = −3, for this function.
Here is a larger graph of level curves, f (x, y) = c, for various values of the constant c.
If we approach the origin along any fixed ray θ = const, then f(r cos θ, r sin θ) is the constant \(\frac{(2\cos\theta-\sin\theta)^2}{\cos\theta-\sin\theta}\) (or 0 if
cos θ = sin θ) times r and so approaches zero as r approaches zero. You can see this in the figure below, which shows the
If you move towards the origin on either of those rays, you first cross the f =3 level curve, then the f =2 level curve, then
the f = 1 level curve, then the \(f = \frac{1}{2}\) level curve, and so on.
That f (x, y) → 0 as (x, y) → (0, 0) along any fixed ray is suggestive, but does not imply that the limit exists and is zero.
Recall that to have lim f (x, y) = 0, we need f (x, y) → 0 no matter how (x, y) → (0, 0). It is not sufficient to check
(x,y)→(0,0)
which shows the level curves yet again, with a circle \(x^2 + y^2 = r^2\) superimposed. For every single −∞ < c < ∞, the level
Consequently there is no one number L such that f (x, y) is close to L whenever (x, y) is sufficiently close to (0, 0). The limit
lim f (x, y) does not exist.
(x,y)→(0,0)
Another way to see that f (x, y) does not have any limit as (x, y) → (0, 0) is to show that f (x, y) does not have a limit as
(x, y) approaches (0, 0) along some specific curve. This can be done by picking a curve that makes the denominator, x − y,
tend to zero very quickly. One such curve is \(x - y = x^3\) or, equivalently, \(y = x - x^3\). Along this curve, for x ≠ 0,
\begin{align*}
f(x, x - x^3) &= \frac{(2x - x + x^3)^2}{x - x + x^3} = \frac{(x + x^3)^2}{x^3} \\
&= \frac{(1 + x^2)^2}{x} \longrightarrow
\begin{cases}
+\infty & \text{as } x \to 0 \text{ with } x > 0 \\
-\infty & \text{as } x \to 0 \text{ with } x < 0
\end{cases}
\end{align*}
The choice of the specific power \(x^3\) is not important. Any power \(x^p\) with p > 2 will have the same effect.
This limit depends on the choice of the constant a. Once again, this proves that f (x, y) does not have a limit as
(x, y) → (0, 0).
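The contrast between approaching along rays and approaching along the curve y = x − x³ can be illustrated numerically. The sketch below assumes that the function of this example is f(x, y) = (2x − y)²/(x − y) for x ≠ y and 0 for x = y, as read off from the computations above; the helper names `along_ray` and `along_cubic` are made up for illustration.

```python
import math


def f(x, y):
    """Assumed form of the example's function: (2x - y)^2 / (x - y), and 0 when x = y."""
    return 0.0 if x == y else (2 * x - y) ** 2 / (x - y)


def along_ray(theta, r):
    """Value of f at distance r from the origin along the ray with polar angle theta (radians)."""
    return f(r * math.cos(theta), r * math.sin(theta))


def along_cubic(x):
    """Value of f on the curve y = x - x^3."""
    return f(x, x - x**3)


for r in (1e-1, 1e-3, 1e-5):
    print(f"ray theta = 1.0, r = {r:g}: f = {along_ray(1.0, r):.3e}")
for x in (1e-1, 1e-2, 1e-3):
    print(f"curve y = x - x^3, x = {x:g}: f = {along_cubic(x):.3e}")
```

Along the fixed ray the values shrink with r, while along the cubic curve they grow roughly like 1/x, consistent with the formula (1 + x²)²/x derived above.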
Exercises
Stage 1
1
Suppose f(x, y) is a function such that \(\lim_{(x,y)\to(0,0)} f(x, y) = 10\).
2
A millstone pounds wheat into flour. The wheat sits in a basin, and the millstone pounds up and down.
Samples of wheat are taken from various places along the basin. Their diameters are measured and their position on the basin is
recorded.
Consider this claim: “As the particles get very close to the millstone, the diameters of the particles approach 50 μm.” In this
context, describe the variables below from Definition 2.1.2.
1. x
2. a
3. L
3
Let \(f(x, y) = \frac{x^2}{x^2+y^2}\).
4
Let \(f(x, y) = x^2 - y^2\)
1. Express the function in terms of the polar coordinates r and θ, and simplify.
2. Suppose (x, y) is a distance of 1 from the origin. What are the largest and smallest values of f (x, y)?
3. Let r > 0. Suppose (x, y) is a distance of r from the origin. What are the largest and smallest values of f (x, y)?
4. Let ϵ > 0. Find a positive value of r that guarantees |f (x, y)| < ϵ whenever (x, y) is at most r units from the origin.
5. What did you just show?
5
Stage 2
6
Evaluate, if possible,
1. \(\lim_{(x,y)\to(2,-1)} (xy + x^2)\)
2. \(\lim_{(x,y)\to(0,0)} \frac{x}{x^2+y^2}\)
3. \(\lim_{(x,y)\to(0,0)} \frac{x^2}{x^2+y^2}\)
4. \(\lim_{(x,y)\to(0,0)} \frac{x^3}{x^2+y^2}\)
5. \(\lim_{(x,y)\to(0,0)} \frac{x^2y^2}{x^2+y^4}\)
6. \(\lim_{(x,y)\to(0,0)} \frac{(\sin x)(e^y - 1)}{xy}\)
7. ✳
1. Find the limit: \(\lim_{(x,y)\to(0,0)} \frac{x^8+y^8}{x^4+y^4}\).
2. Prove that the following limit does not exist: \(\lim_{(x,y)\to(0,0)} \frac{xy^5}{x^8+y^{10}}\).
8. ✳
Evaluate each of the following limits or show that it does not exist.
1. \(\lim_{(x,y)\to(0,0)} \frac{x^3-y^3}{x^2+y^2}\)
2. \(\lim_{(x,y)\to(0,0)} \frac{x^2-y^4}{x^2+y^4}\)
Stage 3
9. ✳
Evaluate each of the following limits or show that it does not exist.
1. \(\lim_{(x,y)\to(0,0)} \frac{2x^2 + x^2y - y^2x + 2y^2}{x^2+y^2}\)
2. \(\lim_{(x,y)\to(0,1)} \frac{x^2y^2 - 2x^2y + x^2}{(x^2 + y^2 - 2y + 1)^2}\)
10
Define, for all (x, y) ≠ (0, 0), \(f(x, y) = \frac{x^2y}{x^4+y^2}\).
11. ✳
Compute the following limits or explain why they do not exist.
1. \(\lim_{(x,y)\to(0,0)} \frac{xy}{x^2+y^2}\)
2. \(\lim_{(x,y)\to(0,0)} \frac{\sin(xy)}{x^2+y^2}\)
3. \(\lim_{(x,y)\to(-1,1)} \frac{x^2 + 2xy^2 + y^4}{1+y^4}\)
4. \(\lim_{(x,y)\to(0,0)} |y|^x\)
3. To be precise, there is a number r > 0 such that f (x⃗ ) is defined for all x⃗ obeying |x⃗ − a⃗ | < r.
4. There is a precise, formal version of this definition that looks just like Definition 1.7.1 of the CLP-1 text.
5. You may find the condition “without being exactly \(\vec{a}\)” a little strange, but there is a good reason for it, which we have already
seen in Calculus I. In the definition \(f'(a) = \lim_{x\to a} \frac{f(x)-f(a)}{x-a}\), the function whose limit is being taken, namely \(\frac{f(x)-f(a)}{x-a}\), is not
defined at all at x = a. This will again happen when we define derivatives of functions of more than one variable.
6. Not just a pun.
This page titled 2.1: Limits is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Joel Feldman, Andrew
Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is
available upon request.
2.2: Partial Derivatives
We are now ready to define derivatives of functions of more than one variable. First, recall how we defined the derivative, \(f'(a)\),
of a function of one variable, f (x). We imagined that we were walking along the x-axis, in the positive direction, measuring, for
example, the temperature along the way. We denoted by f (x) the temperature at x. The instantaneous rate of change of temperature
that we observed as we passed through x = a was
\[\frac{df}{dx}(a) = \lim_{h\to 0} \frac{f(a+h) - f(a)}{h} = \lim_{x\to a} \frac{f(x) - f(a)}{x - a}\]
Next suppose that we are walking in the xy-plane and that the temperature at (x, y) is f (x, y). We can pass through the point
(x, y) = (a, b) moving in many different directions, and we cannot expect the measured rate of change of temperature if we walk
parallel to the x-axis, in the direction of increasing x, to be the same as the measured rate of change of temperature if we walk
parallel to the y -axis in the direction of increasing y. We'll start by considering just those two directions. We'll consider other
directions (like walking parallel to the line y = x ) later.
Suppose that we are passing through the point (x, y) = (a, b) and that we are walking parallel to the x-axis (in the positive
direction). Then our y -coordinate will be constant, always taking the value y = b. So we can think of the measured temperature as
the function of one variable B(x) = f(x, b) and we will observe the rate of change of temperature
\[\frac{dB}{dx}(a) = \lim_{h\to 0} \frac{B(a+h) - B(a)}{h} = \lim_{h\to 0} \frac{f(a+h, b) - f(a, b)}{h}\]
This is called the “partial derivative of f with respect to x at (a, b)” and is denoted \(\frac{\partial f}{\partial x}(a, b)\). Here
the symbol ∂, which is read “partial”, indicates that we are dealing with a function of more than one variable, and
the x in \(\frac{\partial f}{\partial x}\) indicates that we are differentiating with respect to x, while y is being held fixed, i.e. being treated as a constant.
\(\frac{\partial f}{\partial x}\) is read “partial dee f dee x”.
\(\frac{\partial}{\partial x}\) is appropriate. We shall later encounter situations when \(\frac{d}{dx} f\) and \(\frac{\partial}{\partial x} f\) are both defined and have
different meanings.
If, instead, we are passing through the point (x, y) = (a, b) and are walking parallel to the y -axis (in the positive direction), then
our x-coordinate will be constant, always taking the value x = a. So we can think of the measured temperature as the function of
one variable A(y) = f (a, y) and we will observe the rate of change of temperature
\[\frac{dA}{dy}(b) = \lim_{h\to 0} \frac{A(b+h) - A(b)}{h} = \lim_{h\to 0} \frac{f(a, b+h) - f(a, b)}{h}\]
This is called the “partial derivative of f with respect to y at (a, b)” and is denoted \(\frac{\partial f}{\partial y}(a, b)\).
Just as was the case for the ordinary derivative \(\frac{df}{dx}(x)\) (see Definition 2.2.6 in the CLP-1 text), it is common to treat the partial
derivatives of f(x, y) as functions of (x, y) simply by evaluating the partial derivatives at (x, y) rather than at (a, b).
\begin{align*}
\frac{\partial f}{\partial x}(x, y) &= \lim_{h\to 0} \frac{f(x+h, y) - f(x, y)}{h} \\
\frac{\partial f}{\partial y}(x, y) &= \lim_{h\to 0} \frac{f(x, y+h) - f(x, y)}{h}
\end{align*}
respectively. The partial derivatives of functions of more than two variables are defined analogously.
Partial derivatives are used a lot. And there are many notations for them.
2.2.1 https://math.libretexts.org/@go/page/89210
Definition 2.2.2
The partial derivative \(\frac{\partial f}{\partial x}(x, y)\) of a function f(x, y) is also denoted
\[\frac{\partial f}{\partial x} \qquad f_x(x, y) \qquad f_x \qquad D_x f(x, y) \qquad D_x f \qquad D_1 f(x, y) \qquad D_1 f\]
The subscript 1 on \(D_1 f\) indicates that f is being differentiated with respect to its first variable. The partial derivative \(\frac{\partial f}{\partial x}(a, b)\)
is also denoted
\[\frac{\partial f}{\partial x}\bigg|_{(a,b)}\]
with the subscript (a, b) indicating that \(\frac{\partial f}{\partial x}\) is being evaluated at (x, y) = (a, b).
in terms of the shape of the graph z = f (x, y) of the function f (x, y). That graph appears in the figure below. It looks like the
part of a deformed sphere that is in the first octant.
The definition of \(\frac{\partial f}{\partial x}(a, b)\) concerns only points on the graph that have y = b. In other words, the curve of intersection of the
surface z = f(x, y) with the plane y = b. That is the red curve in the figure. The two blue vertical line segments in the figure
have heights f(a, b) and f(a + h, b), which are the two numbers in the numerator of \(\frac{f(a+h,b)-f(a,b)}{h}\).
A side view of the curve (looking from the left side of the y-axis) is sketched in the figure below.
Again, the two blue vertical line segments in the figure have heights f(a, b) and f(a + h, b), which are the two numbers in the
numerator of \(\frac{f(a+h,b)-f(a,b)}{h}\). So the numerator f(a + h, b) − f(a, b) and denominator h are the rise and run, respectively, of
the curve z = f(x, b) from x = a to x = a + h. Thus \(\frac{\partial f}{\partial x}(a, b)\) is exactly the slope of (the tangent to) the curve of intersection
of the surface z = f(x, y) and the plane y = b at the point (a, b, f(a, b)). In the same way \(\frac{\partial f}{\partial y}(a, b)\) is exactly the slope of
(the tangent to) the curve of intersection of the surface z = f(x, y) and the plane x = a at the point (a, b, f(a, b)).
by using what we already know about ordinary derivatives \(\frac{d}{dx}\). More precisely,
to evaluate \(\frac{\partial f}{\partial x}(x, y)\), treat the y in f(x, y) as a constant and differentiate the resulting function of x with respect to x.
To evaluate \(\frac{\partial f}{\partial y}(x, y)\), treat the x in f(x, y) as a constant and differentiate the resulting function of y with respect to y.
To evaluate \(\frac{\partial f}{\partial x}(a, b)\), treat the y in f(x, y) as a constant and differentiate the resulting function of x with respect to x. Then
Example 2.2.4
Let
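\[f(x, y) = x^3 + y^2 + 4xy^2\]
Then, since \(\frac{\partial}{\partial x}\) treats y as a constant,
\begin{align*}
\frac{\partial f}{\partial x} &= \frac{\partial}{\partial x}(x^3) + \frac{\partial}{\partial x}(y^2) + \frac{\partial}{\partial x}(4xy^2) \\
&= 3x^2 + 0 + 4y^2 \frac{\partial}{\partial x}(x) \\
&= 3x^2 + 4y^2
\end{align*}
and, since \(\frac{\partial}{\partial y}\) treats x as a constant,
\begin{align*}
\frac{\partial f}{\partial y} &= \frac{\partial}{\partial y}(x^3) + \frac{\partial}{\partial y}(y^2) + \frac{\partial}{\partial y}(4xy^2) \\
&= 0 + 2y + 4x \frac{\partial}{\partial y}(y^2) \\
&= 2y + 8xy
\end{align*}
\begin{align*}
\frac{\partial f}{\partial x}(1, 0) &= 3(1^2) + 4(0^2) = 3 \\
\frac{\partial f}{\partial y}(1, 0) &= 2(0) + 8(1)(0) = 0
\end{align*}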
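The values found in this example can be compared against the difference quotients that appear in the definition of the partial derivatives. The sketch below uses a small but nonzero h, so it only approximates the limits; the helper names `partial_x` and `partial_y` and the step size are illustrative choices.

```python
def partial_x(f, x, y, h=1e-6):
    """Difference quotient (f(x + h, y) - f(x, y)) / h, with y held fixed."""
    return (f(x + h, y) - f(x, y)) / h


def partial_y(f, x, y, h=1e-6):
    """Difference quotient (f(x, y + h) - f(x, y)) / h, with x held fixed."""
    return (f(x, y + h) - f(x, y)) / h


def f(x, y):
    """The function of Example 2.2.4."""
    return x**3 + y**2 + 4 * x * y**2


print("approximate f_x(1, 0):", partial_x(f, 1.0, 0.0))  # compare with 3x^2 + 4y^2 = 3
print("approximate f_y(1, 0):", partial_y(f, 1.0, 0.0))  # compare with 2y + 8xy = 0
```

Shrinking h brings the quotients closer to the exact values, up to the point where floating point rounding starts to dominate.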
Example 2.2.5
Let
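\[f(x, y) = y\cos x + xe^{xy}\]
Then, since \(\frac{\partial}{\partial x}\) treats y as a constant, \(\frac{\partial}{\partial x} e^{yx} = ye^{yx}\) and
\begin{align*}
\frac{\partial f}{\partial x}(x, y) &= y \frac{\partial}{\partial x}(\cos x) + e^{xy} \frac{\partial}{\partial x}(x) + x \frac{\partial}{\partial x}(e^{xy}) \qquad \text{(by the product rule)} \\
&= -y\sin x + e^{xy} + xye^{xy} \\
\frac{\partial f}{\partial y}(x, y) &= \cos x \frac{\partial}{\partial y}(y) + x \frac{\partial}{\partial y}(e^{xy}) \\
&= \cos x + x^2 e^{xy}
\end{align*}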
Let's move up to a function of four variables. Things generalize in a quite straightforward way.
Example 2.2.6
Let
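\[f(x, y, z, t) = x\sin(y + 2z) + t^2 e^{3y} \ln z\]
Then
\begin{align*}
\frac{\partial f}{\partial x}(x, y, z, t) &= \sin(y + 2z) \\
\frac{\partial f}{\partial y}(x, y, z, t) &= x\cos(y + 2z) + 3t^2 e^{3y} \ln z \\
\frac{\partial f}{\partial z}(x, y, z, t) &= 2x\cos(y + 2z) + t^2 e^{3y}/z \\
\frac{\partial f}{\partial t}(x, y, z, t) &= 2t e^{3y} \ln z
\end{align*}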
∂t
Now here is a more complicated example — our function takes a special value at (0, 0). To compute derivatives there we revert to
the definition.
Example 2.2.7
Set
\[f(x, y) = \begin{cases} \frac{\cos x - \cos y}{x - y} & \text{if } x \ne y \\ 0 & \text{if } x = y \end{cases}\]
If b ≠ a, then for all (x, y) sufficiently close to (a, b), \(f(x, y) = \frac{\cos x - \cos y}{x - y}\) and we can compute the partial derivatives of f at
(a, b) using the familiar rules of differentiation. However that is not the case for (a, b) = (0, 0). To evaluate \(f_x(0, 0)\), we need
0 if x = 0
with respect to x at x = 0. As we cannot use the usual differentiation rules, we evaluate the derivative² by applying the
definition
\begin{align*}
f_x(0, 0) &= \lim_{h\to 0} \frac{f(h, 0) - f(0, 0)}{h} \\
&= \lim_{h\to 0} \frac{\frac{\cos h - 1}{h} - 0}{h} \qquad \text{(Recall that } h \ne 0 \text{ in the limit.)} \\
&= \lim_{h\to 0} \frac{\cos h - 1}{h^2} \\
&= \lim_{h\to 0} \frac{-\sin h}{2h} \qquad \text{(By l'Hôpital's rule.)} \\
&= \lim_{h\to 0} \frac{-\cos h}{2} \qquad \text{(By l'Hôpital again.)} \\
&= -\frac{1}{2}
\end{align*}
by substituting in the Taylor expansion
\[\cos h = 1 - \frac{h^2}{2} + \frac{h^4}{4!} - \cdots\]
We can also use Taylor expansions to understand the behaviour of f (x, y) for (x, y) near (0, 0). For x ≠ y,
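\begin{align*}
\frac{\cos x - \cos y}{x - y} &= \frac{\left[1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots\right] - \left[1 - \frac{y^2}{2!} + \frac{y^4}{4!} - \cdots\right]}{x - y} \\
&= \frac{-\frac{x^2 - y^2}{2!} + \frac{x^4 - y^4}{4!} - \cdots}{x - y} \\
&= -\frac{1}{2!} \frac{x^2 - y^2}{x - y} + \frac{1}{4!} \frac{x^4 - y^4}{x - y} - \cdots \\
&= -\frac{x + y}{2!} + \frac{x^3 + x^2y + xy^2 + y^3}{4!} - \cdots
\end{align*}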
0 if x = y
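The value f_x(0, 0) = −1/2 obtained in this example can be sanity-checked by evaluating the difference quotient from the definition at a few small values of h. This is only a numerical illustration; the helper name `fx_quotient` is hypothetical, and h is kept moderately small because for extremely small h the subtraction cos h − 1 loses accuracy in floating point.

```python
import math


def f(x, y):
    """The function of Example 2.2.7."""
    return 0.0 if x == y else (math.cos(x) - math.cos(y)) / (x - y)


def fx_quotient(h):
    """Difference quotient (f(h, 0) - f(0, 0)) / h from the definition of f_x(0, 0)."""
    return (f(h, 0.0) - f(0.0, 0.0)) / h


for h in (1e-1, 1e-2, 1e-3):
    print(f"h = {h:g}: quotient = {fx_quotient(h):.8f}")
```

The Taylor expansion above predicts that the quotient equals −1/2 + h²/24 − ⋯, so the printed values should approach −0.5 from above as h shrinks.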
Example 2.2.8
Again set
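\[f(x, y) = \begin{cases} \frac{\cos x - \cos y}{x - y} & \text{if } x \ne y \\ 0 & \text{if } x = y \end{cases}\]
\begin{align*}
f_y(x, y) &= \frac{\partial}{\partial y} \frac{\cos x - \cos y}{x - y} \\
&= \frac{(x - y) \frac{\partial}{\partial y}(\cos x - \cos y) - (\cos x - \cos y) \frac{\partial}{\partial y}(x - y)}{(x - y)^2}
\end{align*}

\begin{align*}
&= \lim_{h\to 0} \frac{f(x, x + h) - f(x, x)}{h} \\
&= \lim_{h\to 0} \frac{\frac{\cos x - \cos(x + h)}{x - (x + h)} - 0}{h} \qquad \text{(Recall that } h \ne 0 \text{ in the limit.)} \\
&= \lim_{h\to 0} \frac{\cos(x + h) - \cos x}{h^2}
\end{align*}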
Now we apply L'Hôpital's rule, remembering that, in this limit, x is a constant and h is the variable — so we differentiate with
respect to h.
\[f_y(x, y) = \lim_{h\to 0} \frac{-\sin(x + h)}{2h}\]
Note that if x is not an integer multiple of π, then the numerator −sin(x + h) does not tend to zero as h tends to zero, and the
limit giving \(f_y(x, y)\) does not exist. On the other hand, if x is an integer multiple of π, both the numerator and denominator
tend to zero as h tends to zero, and we can apply L'Hôpital's rule a second time. Then
\begin{align*}
f_y(x, y) &= \lim_{h\to 0} \frac{-\cos(x + h)}{2} \\
&= -\frac{\cos x}{2}
\end{align*}
The conclusion:
\[f_y(x, y) = \begin{cases}
\frac{(x - y)\sin y + \cos x - \cos y}{(x - y)^2} & \text{if } x \ne y \\
-\frac{\cos x}{2} & \text{if } x = y \text{ with } x \text{ an integer multiple of } \pi \\
DNE & \text{if } x = y \text{ with } x \text{ not an integer multiple of } \pi
\end{cases}\]
is not continuous at (0, 0) and yet has both partial derivatives \(f_x(0, 0)\) and \(f_y(0, 0)\) perfectly well defined. We'll also see how
that is possible. First let's compute the partial derivatives. By definition,
\begin{align*}
f_x(0, 0) &= \lim_{h\to 0} \frac{f(0 + h, 0) - f(0, 0)}{h} = \lim_{h\to 0} \frac{\frac{h^2}{h - 0} - 0}{h} = \lim_{h\to 0} 1 = 1 \\
f_y(0, 0) &= \lim_{h\to 0} \frac{f(0, 0 + h) - f(0, 0)}{h} = \lim_{h\to 0} \frac{\frac{0^2}{0 - h} - 0}{h} = \lim_{h\to 0} 0 = 0
\end{align*}
\[\lim_{x\to 0} f(x, x - x^3) = \lim_{x\to 0} \frac{x^2}{x - (x - x^3)} = \lim_{x\to 0} \frac{1}{x}\]
approaches +∞, and as x approaches 0
So how is this possible? The answer is that \(f_x(0, 0)\) only involves values of f(x, y) with y = 0. As f(x, 0) = x, for all values
of x, we have that f(x, 0) is a continuous, and indeed a differentiable, function. Similarly, \(f_y(0, 0)\) only involves values of
f(x, y) with x = 0. As f(0, y) = 0, for all values of y, we have that f(0, y) is a continuous, and indeed a differentiable,
function. On the other hand, the bad behaviour of f(x, y) for (x, y) near (0, 0) only happens for x and y both nonzero.
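A short numerical experiment makes the contrast concrete. The sketch assumes that the function of this example is f(x, y) = x²/(x − y) for x ≠ y and 0 for x = y, which is what the difference quotients computed above indicate; the helper names are made up for illustration.

```python
def f(x, y):
    """Assumed form of the example's function: x^2 / (x - y), and 0 when x = y."""
    return 0.0 if x == y else x * x / (x - y)


def fx_quotient(h):
    """Difference quotient for f_x(0, 0): only uses points on the x-axis."""
    return (f(h, 0.0) - f(0.0, 0.0)) / h


def fy_quotient(h):
    """Difference quotient for f_y(0, 0): only uses points on the y-axis."""
    return (f(0.0, h) - f(0.0, 0.0)) / h


def along_curve(x):
    """Value of f on the curve y = x - x^3, which hugs the line y = x near the origin."""
    return f(x, x - x**3)


for h in (1e-1, 1e-2, 1e-3):
    print(f"h = {h:g}: fx quotient = {fx_quotient(h):.6f}, fy quotient = {fy_quotient(h):.6f}")
for x in (1e-1, 1e-2, -1e-2):
    print(f"x = {x:g}: f(x, x - x^3) = {along_curve(x):.3f}")
```

Both difference quotients settle immediately, while the values along the curve grow like 1/x and change sign with x, so f cannot be continuous at the origin.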
Example 2.2.10
The equation
\[z^5 + y^2 e^z + e^{2x} = 0\]
implicitly determines z as a function of x and y. That is, the function z(x, y) obeys
\[z(x, y)^5 + y^2 e^{z(x,y)} + e^{2x} = 0\]
\(\frac{\partial z}{\partial x}(0, 0)\).
We are not going to be able to explicitly solve the equation for z(x, y). All we know is that
\[z(x, y)^5 + y^2 e^{z(x,y)} + e^{2x} = 0\]
\(\frac{\partial z}{\partial x}(0, 0)\) by differentiating⁴ the whole equation with respect to x, giving
\[5z(x, y)^4 \frac{\partial z}{\partial x}(x, y) + y^2 e^{z(x,y)} \frac{\partial z}{\partial x}(x, y) + 2e^{2x} = 0\]
\[5z(0, 0)^4 \frac{\partial z}{\partial x}(0, 0) + 2 = 0\]
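Even though z(x, y) has no explicit formula, it can be computed numerically, which gives an independent check on the implicit differentiation. At x = y = 0 the equation reduces to z⁵ + 1 = 0, so z(0, 0) = −1, and the last equation above then gives ∂z/∂x(0, 0) = −2/5. The sketch below solves for z by bisection (the left hand side is nondecreasing in z, since its z-derivative is 5z⁴ + y²eᶻ ≥ 0) and estimates the partial derivative by a central difference. The helper names, the bracketing interval [−10, 10] and the step size are illustrative assumptions, suitable only for (x, y) near the origin.

```python
import math


def g(z, x, y):
    """Left hand side of the equation z^5 + y^2 e^z + e^(2x) = 0."""
    return z**5 + y**2 * math.exp(z) + math.exp(2 * x)


def solve_z(x, y, lo=-10.0, hi=10.0, iterations=200):
    """Solve g(z, x, y) = 0 for z by bisection (assumes the root lies in [lo, hi])."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid, x, y) > 0:
            hi = mid  # root is to the left, because g is nondecreasing in z
        else:
            lo = mid
    return 0.5 * (lo + hi)


def dz_dx(x, y, h=1e-4):
    """Central difference estimate of the partial derivative of z with respect to x."""
    return (solve_z(x + h, y) - solve_z(x - h, y)) / (2 * h)


print("z(0, 0) is approximately", solve_z(0.0, 0.0))
print("dz/dx(0, 0) is approximately", dz_dx(0.0, 0.0))
```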
Next we have a partial derivative disguised as a limit.
Example 2.2.11
The critical observation is that, in taking the limit z → 0, x and y are fixed. They do not change as z is getting smaller and
smaller. Furthermore this limit is exactly of the form of the limits in the Definition 2.2.1 of partial derivative, disguised by
some obfuscating changes of notation.
Set
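\[f(x, y, z) = \frac{(x + y + z)^3}{(x + y)}\]
Then
\begin{align*}
\lim_{z\to 0} \frac{(x + y + z)^3 - (x + y)^3}{(x + y)z} &= \lim_{z\to 0} \frac{f(x, y, z) - f(x, y, 0)}{z} \\
&= \lim_{h\to 0} \frac{f(x, y, 0 + h) - f(x, y, 0)}{h} \\
&= \frac{\partial f}{\partial z}(x, y, 0) \\
&= \left[\frac{\partial}{\partial z} \frac{(x + y + z)^3}{x + y}\right]_{z=0}
\end{align*}
Recalling that \(\frac{\partial}{\partial z}\) treats x and y as constants, we are evaluating the derivative of a function of the form \(\frac{(\text{const} + z)^3}{\text{const}}\). So
\begin{align*}
\lim_{z\to 0} \frac{(x + y + z)^3 - (x + y)^3}{(x + y)z} &= 3 \frac{(x + y + z)^2}{x + y}\bigg|_{z=0} \\
&= 3(x + y)
\end{align*}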
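The answer 3(x + y) can be checked numerically by fixing x and y and letting z shrink. This is a rough sketch; the helper name `quotient` and the sample point (x, y) = (1, 2) are arbitrary choices.

```python
def quotient(x, y, z):
    """The expression whose limit as z -> 0 is taken in Example 2.2.11."""
    return ((x + y + z) ** 3 - (x + y) ** 3) / ((x + y) * z)


x, y = 1.0, 2.0  # x and y stay fixed while z shrinks
for z in (1e-1, 1e-3, 1e-6):
    print(f"z = {z:g}: quotient = {quotient(x, y, z):.6f}, 3(x + y) = {3 * (x + y)}")
```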
The next example highlights a potentially dangerous difference between ordinary and partial derivatives.
Example 2.2.12
In this example we are going to see that, in contrast to the ordinary derivative case, \(\frac{\partial r}{\partial x}\) is not, in general, the same as \(\left(\frac{\partial x}{\partial r}\right)^{-1}\).
Recall that Cartesian and polar coordinates⁵ (for (x, y) ≠ (0, 0) and r > 0) are related by
\begin{align*}
x &= r\cos\theta \\
y &= r\sin\theta \\
r &= \sqrt{x^2 + y^2} \\
\tan\theta &= \frac{y}{x}
\end{align*}
Fix any point \((x_0, y_0) \ne (0, 0)\) and let \((r_0, \theta_0)\), \(0 \le \theta_0 < 2\pi\), be the corresponding polar coordinates. Then
\[\frac{\partial x}{\partial r}(r, \theta) = \cos\theta \qquad \frac{\partial r}{\partial x}(x, y) = \frac{x}{\sqrt{x^2 + y^2}}\]
so that
\begin{align*}
\frac{\partial x}{\partial r}(r_0, \theta_0) = \left(\frac{\partial r}{\partial x}(x_0, y_0)\right)^{-1}
&\iff \cos\theta_0 = \left(\frac{x_0}{\sqrt{x_0^2 + y_0^2}}\right)^{-1} = (\cos\theta_0)^{-1} \\
&\iff \cos^2\theta_0 = 1 \\
&\iff \theta_0 = 0, \pi
\end{align*}
We can also see pictorially why this happens. By definition, the partial derivatives
\begin{align*}
\frac{\partial x}{\partial r}(r_0, \theta_0) &= \lim_{dr\to 0} \frac{x(r_0 + dr, \theta_0) - x(r_0, \theta_0)}{dr} \\
\frac{\partial r}{\partial x}(x_0, y_0) &= \lim_{dx\to 0} \frac{r(x_0 + dx, y_0) - r(x_0, y_0)}{dx}
\end{align*}
Here we have just renamed the h of Definition 2.2.1 to dr and to dx in the two definitions.
In computing \(\frac{\partial x}{\partial r}(r_0, \theta_0)\), \(\theta_0\) is held fixed, r is changed by a small amount dr and the resulting
\(dx = x(r_0 + dr, \theta_0) - x(r_0, \theta_0)\) is computed. In the figure on the left below, dr is the length of the orange line segment and
On the other hand, in computing \(\frac{\partial r}{\partial x}\), y is held fixed, x is changed by a small amount dx and the resulting
\(dr = r(x_0 + dx, y_0) - r(x_0, y_0)\) is computed. In the figure on the right above, dx is the length of the pink line segment and
Here are the two figures combined together. We have arranged that the same dr is used in both computations. In order for the
dr's to be the same in both computations, the two dx's have to be different (unless \(\theta_0 = 0, \pi\)). So, in general,
\[\frac{\partial x}{\partial r}(r_0, \theta_0) \ne \left(\frac{\partial r}{\partial x}(x_0, y_0)\right)^{-1}.\]
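The mismatch can be seen with concrete numbers. The sketch below simply evaluates the two formulas ∂x/∂r = cos θ and ∂r/∂x = x/√(x² + y²) from this example at a sample point; the function names and the sample point r₀ = 2, θ₀ = π/3 are illustrative choices.

```python
import math


def dx_dr(r, theta):
    """Partial derivative of x = r cos(theta) with respect to r (theta held fixed)."""
    return math.cos(theta)


def dr_dx(x, y):
    """Partial derivative of r = sqrt(x^2 + y^2) with respect to x (y held fixed)."""
    return x / math.hypot(x, y)


r0, theta0 = 2.0, math.pi / 3
x0, y0 = r0 * math.cos(theta0), r0 * math.sin(theta0)

print("dx/dr at (r0, theta0):      ", dx_dr(r0, theta0))
print("1 / (dr/dx) at (x0, y0):    ", 1 / dr_dx(x0, y0))
```

At this point the first number is cos(π/3) and the second is 1/cos(π/3), so they differ; repeating the computation with θ₀ = 0 makes them agree, as the equivalences above predict.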
The inverse function theorem, for functions of one variable, says that, if y(x) and x(y) are inverse functions, meaning that
\(y(x(y)) = y\) and \(x(y(x)) = x\), and are differentiable with \(\frac{dy}{dx} \ne 0\), then
\[\frac{dx}{dy}(y) = \frac{1}{\frac{dy}{dx}(x(y))}\]
To see this, just apply \(\frac{d}{dy}\) to both sides of y(x(y)) = y to get \(\frac{dy}{dx}(x(y))\, \frac{dx}{dy}(y) = 1\), by the chain rule (see Theorem 2.9.3 in
the CLP-1 text). In the CLP-1 text, we used this to compute the derivatives of the logarithm (see Theorem 2.10.1 in the CLP-1
text) and of the inverse trig functions (see Theorem 2.12.7 in the CLP-1 text).
We have just seen, in Example 2.2.12, that we can't be too naive in extending the single variable inverse function theorem to
functions of two (or more) variables. On the other hand, there is such an extension, which we will now illustrate, using
Cartesian and polar coordinates. For simplicity, we'll restrict our attention to x > 0, y > 0, or equivalently, r > 0, \(0 < \theta < \frac{\pi}{2}\).
The functions which convert between Cartesian and polar coordinates are
\begin{align*}
x(r, \theta) &= r\cos\theta & r(x, y) &= \sqrt{x^2 + y^2} \\
y(r, \theta) &= r\sin\theta & \theta(x, y) &= \arctan\left(\frac{y}{x}\right)
\end{align*}
The two functions on the left convert from polar to Cartesian coordinates and the two functions on the right convert from
Cartesian to polar coordinates. The inverse function theorem (for functions of two variables) says that,
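if you form the first order partial derivatives of the left hand functions into the matrix
\[\begin{bmatrix}
\frac{\partial x}{\partial r}(r, \theta) & \frac{\partial x}{\partial \theta}(r, \theta) \\
\frac{\partial y}{\partial r}(r, \theta) & \frac{\partial y}{\partial \theta}(r, \theta)
\end{bmatrix}
= \begin{bmatrix}
\cos\theta & -r\sin\theta \\
\sin\theta & r\cos\theta
\end{bmatrix}\]
and you form the first order partial derivatives of the right hand functions into the matrix
\[\begin{bmatrix}
\frac{\partial r}{\partial x}(x, y) & \frac{\partial r}{\partial y}(x, y) \\
\frac{\partial \theta}{\partial x}(x, y) & \frac{\partial \theta}{\partial y}(x, y)
\end{bmatrix}
= \begin{bmatrix}
\frac{x}{\sqrt{x^2+y^2}} & \frac{y}{\sqrt{x^2+y^2}} \\
\frac{-\frac{y}{x^2}}{1 + \left(\frac{y}{x}\right)^2} & \frac{\frac{1}{x}}{1 + \left(\frac{y}{x}\right)^2}
\end{bmatrix}
= \begin{bmatrix}
\frac{x}{\sqrt{x^2+y^2}} & \frac{y}{\sqrt{x^2+y^2}} \\
\frac{-y}{x^2+y^2} & \frac{x}{x^2+y^2}
\end{bmatrix}\]
and if you evaluate the second matrix at x = x(r, θ), y = y(r, θ),
\[\begin{bmatrix}
\frac{\partial r}{\partial x}(x(r, \theta), y(r, \theta)) & \frac{\partial r}{\partial y}(x(r, \theta), y(r, \theta)) \\
\frac{\partial \theta}{\partial x}(x(r, \theta), y(r, \theta)) & \frac{\partial \theta}{\partial y}(x(r, \theta), y(r, \theta))
\end{bmatrix}
= \begin{bmatrix}
\cos\theta & \sin\theta \\
-\frac{\sin\theta}{r} & \frac{\cos\theta}{r}
\end{bmatrix}\]
and if you multiply⁶ the two matrices together
\begin{align*}
&\begin{bmatrix}
\frac{\partial r}{\partial x}(x(r, \theta), y(r, \theta)) & \frac{\partial r}{\partial y}(x(r, \theta), y(r, \theta)) \\
\frac{\partial \theta}{\partial x}(x(r, \theta), y(r, \theta)) & \frac{\partial \theta}{\partial y}(x(r, \theta), y(r, \theta))
\end{bmatrix}
\begin{bmatrix}
\frac{\partial x}{\partial r}(r, \theta) & \frac{\partial x}{\partial \theta}(r, \theta) \\
\frac{\partial y}{\partial r}(r, \theta) & \frac{\partial y}{\partial \theta}(r, \theta)
\end{bmatrix} \\
&\qquad = \begin{bmatrix}
(\cos\theta)(\cos\theta) + (\sin\theta)(\sin\theta) & (\cos\theta)(-r\sin\theta) + (\sin\theta)(r\cos\theta) \\
\left(-\frac{\sin\theta}{r}\right)(\cos\theta) + \left(\frac{\cos\theta}{r}\right)(\sin\theta) & \left(-\frac{\sin\theta}{r}\right)(-r\sin\theta) + \left(\frac{\cos\theta}{r}\right)(r\cos\theta)
\end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\end{align*}
then the result is the identity matrix.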
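This two variable version of the inverse function theorem can be derived by applying the derivatives \(\frac{\partial}{\partial r}\) and \(\frac{\partial}{\partial\theta}\) to the equations
\[ \begin{align*} r\big(x(r,\theta),y(r,\theta)\big) &= r \\ \theta\big(x(r,\theta),y(r,\theta)\big) &= \theta \end{align*} \]
and using the two variable version of the chain rule, which we will see in §2.4.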
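The matrix statement can be checked at a sample point. The sketch below is an illustration under assumptions: the function names `jac_polar_to_cart`, `jac_cart_to_polar` and `matmul2` are hypothetical, and the point \((r,\theta) = (2, 0.7)\) is arbitrary (it lies in the region \(r>0\), \(0<\theta<\frac{\pi}{2}\)).

```python
import math

def jac_polar_to_cart(r, theta):
    """Matrix of first order partials of x(r,theta), y(r,theta)."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta),  r * math.cos(theta)]]

def jac_cart_to_polar(x, y):
    """Matrix of first order partials of r(x,y), theta(x,y)."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r,   y / r],
            [-y / r2, x / r2]]

def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

r, theta = 2.0, 0.7
x, y = r * math.cos(theta), r * math.sin(theta)

product = matmul2(jac_cart_to_polar(x, y), jac_polar_to_cart(r, theta))
print(product)   # should be close to [[1, 0], [0, 1]]

# The naive scalar guess fails: dx/dr is cos(theta), not 1/(dr/dx) = 1/cos(theta).
dx_dr = jac_polar_to_cart(r, theta)[0][0]
dr_dx = jac_cart_to_polar(x, y)[0][0]
print(dx_dr, 1.0 / dr_dx)
```

The product is the identity up to rounding, while the individual entries \(\frac{\partial x}{\partial r}\) and \(\frac{\partial r}{\partial x}\) are equal to each other rather than reciprocals.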
Exercises
Stage 1
1
Let \(f(x,y) = e^x\cos y\). The following table gives some values of \(f(x,y)\).
x = 0 x = 0.01 x = 0.1
1. Find two different approximate values for \(\frac{\partial f}{\partial x}(0,0)\) using the data in the above table.
2. Find two different approximate values for \(\frac{\partial f}{\partial y}(0,0)\) using the data in the above table.
3. Evaluate \(\frac{\partial f}{\partial x}(0,0)\) and \(\frac{\partial f}{\partial y}(0,0)\) exactly.
2
You are traversing an undulating landscape. Take the z -axis to be straight up towards the sky, the positive x-axis to be due
south, and the positive y -axis to be due east. Then the landscape near you is described by the equation z = f (x, y), with you at
the point (0, 0, f (0, 0)). The function f (x, y) is differentiable.
Suppose \(f_y(0,0) < 0\). Is it possible that you are at a summit? Explain.
3✳
Let
\[ f(x,y) = \begin{cases} \frac{x^2y}{x^2+y^2} & \text{if } (x,y) \ne (0,0) \\ 0 & \text{if } (x,y) = (0,0) \end{cases} \]
Stage 2
4
Find all first partial derivatives of the following functions and evaluate them at the given point.
1. \(f(x,y,z) = x^3y^4z^5\); \((0,-1,-1)\)
2. \(w(x,y,z) = \ln\big(1+e^{xyz}\big)\); \((2,0,-1)\)
3. \(f(x,y) = \frac{1}{\sqrt{x^2+y^2}}\); \((-3,4)\)
5
Show that the function \(z(x,y) = \frac{x+y}{x-y}\) obeys
\[ x\frac{\partial z}{\partial x}(x,y) + y\frac{\partial z}{\partial y}(x,y) = 0 \]
6✳
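\(\frac{\partial z}{\partial x}\), \(\frac{\partial z}{\partial y}\) in terms of \(x\), \(y\), \(z\).
2. Evaluate \(\frac{\partial z}{\partial x}\) and \(\frac{\partial z}{\partial y}\) at \((x,y,z) = (-1,-2,1/2)\).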
7✳
Find \(\frac{\partial U}{\partial T}\) and \(\frac{\partial T}{\partial V}\) at \((1,1,2,4)\) if \((T,U,V,W)\) are related by
\[ (TU-V)^2\,\ln(W-UV) = \ln 2 \]
8✳
Suppose that \(u = x^2 + yz\), \(x = \rho r\cos(\theta)\), \(y = \rho r\sin(\theta)\) and \(z = \rho r\). Find \(\frac{\partial u}{\partial r}\) at the point \((\rho_0, r_0, \theta_0) = (2, 3, \pi/2)\).
9
Use the definition of the derivative to evaluate \(f_x(0,0)\) and \(f_y(0,0)\) for
\[ f(x,y) = \begin{cases} \frac{x^2-2y^2}{x-y} & \text{if } x \ne y \\ 0 & \text{if } x = y \end{cases} \]
Stage 3
10
Let \(f\) be any differentiable function of one variable. Define \(z(x,y) = f(x^2+y^2)\). Is the equation
\[ y\frac{\partial z}{\partial x}(x,y) - x\frac{\partial z}{\partial y}(x,y) = 0 \]
necessarily satisfied?
11
Define the function
\[ f(x,y) = \begin{cases} \frac{(x+2y)^2}{x+y} & \text{if } x+y \ne 0 \\ 0 & \text{if } x+y = 0 \end{cases} \]
1. Evaluate, if possible, \(\frac{\partial f}{\partial x}(0,0)\) and \(\frac{\partial f}{\partial y}(0,0)\).
2. Is \(f(x,y)\) continuous at \((0,0)\)?
12
Consider the cylinder whose base is the radius-1 circle in the xy-plane centred at (0, 0), and which slopes parallel to the line in
the yz-plane given by z = y.
When you stand at the point (0, −1, 0), what is the slope of the surface if you look in the positive y direction? The positive x
direction?
1. There are applications in which there are several variables that cannot be varied independently. For example, the pressure,
volume and temperature of an ideal gas are related by the equation of state P V = (constant)T . In those applications, it may
not be clear from the context which variables are being held fixed.
2. It is also possible to evaluate the derivative by using the technique of the optional Section 2.15 in the CLP-1 text.
3. The only real number \(z\) which obeys \(z^5 = -1\) is \(z = -1\). However there are four other complex numbers which also obey \(z^5 = -1\).
4. You should have already seen this technique, called implicit differentiation, in your first Calculus course. It is covered in
Section 2.11 in the CLP-1 text.
5. If you are not familiar with polar coordinates, don't worry about it. There will be an introduction to them in §3.2.1.
6. Matrix multiplication is usually covered in courses on linear algebra, which you may or may not have taken. That's why this
example is optional.
This page titled 2.2: Partial Derivatives is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Joel Feldman,
Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit
history is available upon request.
2.3: Higher Order Derivatives
You have already observed, in your first Calculus course, that if \(f(x)\) is a function of \(x\), then its derivative, \(\frac{df}{dx}(x)\), is also a function of \(x\), and can be differentiated to give the second order derivative \(\frac{d^2f}{dx^2}(x)\), which can in turn be differentiated yet again to give the third order derivative, \(f^{(3)}(x)\), and so on.
We can do the same for functions of more than one variable. If \(f(x,y)\) is a function of \(x\) and \(y\), then both of its partial derivatives, \(\frac{\partial f}{\partial x}(x,y)\) and \(\frac{\partial f}{\partial y}(x,y)\), are also functions of \(x\) and \(y\). They can both be differentiated with respect to \(x\) and they can both be differentiated with respect to \(y\). So there are four possible second order derivatives. Here they are, together with various alternate notations.
\[ \begin{align*} \frac{\partial}{\partial x}\Big(\frac{\partial f}{\partial x}\Big)(x,y) &= \frac{\partial^2 f}{\partial x^2}(x,y) = f_{xx}(x,y) \\ \frac{\partial}{\partial y}\Big(\frac{\partial f}{\partial x}\Big)(x,y) &= \frac{\partial^2 f}{\partial y\,\partial x}(x,y) = f_{xy}(x,y) \\ \frac{\partial}{\partial x}\Big(\frac{\partial f}{\partial y}\Big)(x,y) &= \frac{\partial^2 f}{\partial x\,\partial y}(x,y) = f_{yx}(x,y) \\ \frac{\partial}{\partial y}\Big(\frac{\partial f}{\partial y}\Big)(x,y) &= \frac{\partial^2 f}{\partial y^2}(x,y) = f_{yy}(x,y) \end{align*} \]
In \(\frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial^2}{\partial y\,\partial x}f\), the derivative closest to \(f\), in this case \(\frac{\partial}{\partial x}\), is applied first.
In \(f_{xy}\), the derivative with respect to the variable closest to \(f\), in this case \(x\), is applied first.
Example 2.3.1
Let \(f(x,y) = e^{my}\cos(nx)\). Then
\[ \begin{align*} f_x &= -n e^{my}\sin(nx) & f_y &= m e^{my}\cos(nx) \\ f_{xx} &= -n^2 e^{my}\cos(nx) & f_{yx} &= -mn e^{my}\sin(nx) \\ f_{xy} &= -mn e^{my}\sin(nx) & f_{yy} &= m^2 e^{my}\cos(nx) \end{align*} \]
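The equality \(f_{xy} = f_{yx}\) in this example can also be seen numerically. The following sketch is an illustration only: the values `M = 2.0`, `N = 3.0` for the constants \(m\), \(n\), the evaluation point, and the helper names `d_dx`, `d_dy` are all assumptions, and the derivatives are approximated by central differences.

```python
import math

M, N = 2.0, 3.0   # assumed sample values for the constants m and n

def f(x, y):
    return math.exp(M * y) * math.cos(N * x)

def d_dx(g, h=1e-4):
    """Return a function approximating the partial derivative of g in x."""
    return lambda x, y: (g(x + h, y) - g(x - h, y)) / (2 * h)

def d_dy(g, h=1e-4):
    """Return a function approximating the partial derivative of g in y."""
    return lambda x, y: (g(x, y + h) - g(x, y - h)) / (2 * h)

x0, y0 = 0.4, 0.1
num_xy = d_dy(d_dx(f))(x0, y0)   # differentiate in x first, then y
num_yx = d_dx(d_dy(f))(x0, y0)   # differentiate in y first, then x
exact = -M * N * math.exp(M * y0) * math.sin(N * x0)
print(num_xy, num_yx, exact)
```

All three printed values should agree to several digits, matching the formula \(-mn\,e^{my}\sin(nx)\).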
Example 2.3.2
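Let \(f(x,y) = e^{\alpha x+\beta y}\). Then
\[ \begin{align*} f_x &= \alpha e^{\alpha x+\beta y} & f_y &= \beta e^{\alpha x+\beta y} \\ f_{xx} &= \alpha^2 e^{\alpha x+\beta y} & f_{yx} &= \beta\alpha e^{\alpha x+\beta y} \\ f_{xy} &= \alpha\beta e^{\alpha x+\beta y} & f_{yy} &= \beta^2 e^{\alpha x+\beta y} \end{align*} \]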
Example 2.3.3
If \(f(x_1,x_2,x_3,x_4) = x_1^4\,x_2^3\,x_3^2\,x_4\), then
2.3.1 https://math.libretexts.org/@go/page/89211
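\[ \begin{align*} \frac{\partial^4 f}{\partial x_1\,\partial x_2\,\partial x_3\,\partial x_4} &= \frac{\partial^3}{\partial x_1\,\partial x_2\,\partial x_3}\big(x_1^4 x_2^3 x_3^2\big) \\ &= \frac{\partial^2}{\partial x_1\,\partial x_2}\big(2x_1^4 x_2^3 x_3\big) \\ &= \frac{\partial}{\partial x_1}\big(6x_1^4 x_2^2 x_3\big) \\ &= 24\,x_1^3 x_2^2 x_3 \end{align*} \]
and
\[ \begin{align*} \frac{\partial^4 f}{\partial x_4\,\partial x_3\,\partial x_2\,\partial x_1} &= \frac{\partial^3}{\partial x_4\,\partial x_3\,\partial x_2}\big(4x_1^3 x_2^3 x_3^2 x_4\big) \\ &= \frac{\partial^2}{\partial x_4\,\partial x_3}\big(12x_1^3 x_2^2 x_3^2 x_4\big) \\ &= \frac{\partial}{\partial x_4}\big(24x_1^3 x_2^2 x_3 x_4\big) \\ &= 24\,x_1^3 x_2^2 x_3 \end{align*} \]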
In all of these examples, it didn't matter what order we took the derivatives in. The following theorem 1 shows that this was no
accident.
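Theorem 2.3.4 (Clairaut's Theorem or Schwarz's Theorem)
If the partial derivatives \(\frac{\partial^2 f}{\partial x\partial y}\) and \(\frac{\partial^2 f}{\partial y\partial x}\) exist and are continuous at \((x_0,y_0)\), then
\[ \frac{\partial^2 f}{\partial x\partial y}(x_0,y_0) = \frac{\partial^2 f}{\partial y\partial x}(x_0,y_0) \]
To prove this, fix \((x_0,y_0)\) and define
\[ F(h,k) = \frac{1}{hk}\big[f(x_0+h,y_0+k) - f(x_0,y_0+k) - f(x_0+h,y_0) + f(x_0,y_0)\big] \]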
We define \(F(h,k)\) in this way because both partial derivatives \(\frac{\partial^2 f}{\partial x\partial y}(x_0,y_0)\) and \(\frac{\partial^2 f}{\partial y\partial x}(x_0,y_0)\) are limits of \(F(h,k)\) as \(h,k\to 0\):
\[ \begin{align*} \frac{\partial}{\partial y}\frac{\partial f}{\partial x}(x_0,y_0) &= \lim_{k\to 0}\lim_{h\to 0}F(h,k) \\ \frac{\partial}{\partial x}\frac{\partial f}{\partial y}(x_0,y_0) &= \lim_{h\to 0}\lim_{k\to 0}F(h,k) \end{align*} \tag{1} \]
Note that the two right hand sides here are identical except for the order in which the limits are taken.
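To get a feel for the double difference quotient \(F(h,k)\), here is a small numerical sketch. It assumes the sample function \(f(x,y) = e^{x+2y}\) (not from the text), for which \(f_{xy}(0,0) = f_{yx}(0,0) = 2\), and evaluates \(F(s,s)\) for shrinking \(s\).

```python
import math

def f(x, y):
    # assumed sample function; its mixed partials at (0, 0) both equal 2
    return math.exp(x + 2 * y)

def F(h, k, x0=0.0, y0=0.0):
    """The double difference quotient F(h, k) defined in the text."""
    return (f(x0 + h, y0 + k) - f(x0, y0 + k)
            - f(x0 + h, y0) + f(x0, y0)) / (h * k)

values = [F(s, s) for s in (1e-1, 1e-2, 1e-3)]
print(values)   # the entries approach 2 as s shrinks
```

For a function with continuous second order partial derivatives, the quotient settles on a single number regardless of how \((h,k)\) tends to \((0,0)\).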
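Now, by the mean value theorem (four times),
\[ \begin{align*} F(h,k) &\overset{(2)}{=} \frac{1}{h}\Big[\frac{\partial f}{\partial y}(x_0+h,y_0+\theta_1k) - \frac{\partial f}{\partial y}(x_0,y_0+\theta_1k)\Big] \\ &\overset{(3)}{=} \frac{\partial}{\partial x}\frac{\partial f}{\partial y}(x_0+\theta_2h,y_0+\theta_1k) \\ F(h,k) &\overset{(4)}{=} \frac{1}{k}\Big[\frac{\partial f}{\partial x}(x_0+\theta_3h,y_0+k) - \frac{\partial f}{\partial x}(x_0+\theta_3h,y_0)\Big] \\ &\overset{(5)}{=} \frac{\partial}{\partial y}\frac{\partial f}{\partial x}(x_0+\theta_3h,y_0+\theta_4k) \end{align*} \]
for some numbers \(0 < \theta_1,\theta_2,\theta_3,\theta_4 < 1\). All of the numbers \(\theta_1,\theta_2,\theta_3,\theta_4\) depend on \(x_0,y_0,h,k\). Hence
\[ \frac{\partial}{\partial x}\frac{\partial f}{\partial y}(x_0+\theta_2h,y_0+\theta_1k) = \frac{\partial}{\partial y}\frac{\partial f}{\partial x}(x_0+\theta_3h,y_0+\theta_4k) \]
for all \(h\) and \(k\). Taking the limit \((h,k)\to(0,0)\) and using the assumed continuity of both partial derivatives at \((x_0,y_0)\) gives
\[ \lim_{(h,k)\to(0,0)}F(h,k) = \frac{\partial}{\partial x}\frac{\partial f}{\partial y}(x_0,y_0) = \frac{\partial}{\partial y}\frac{\partial f}{\partial x}(x_0,y_0) \]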
as desired. To complete the proof we just have to justify the details (1), (2), (3), (4) and (5).
The Details
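1. By definition,
\[ \frac{\partial}{\partial y}\frac{\partial f}{\partial x}(x_0,y_0) = \lim_{k\to 0}\frac{1}{k}\Big[\frac{\partial f}{\partial x}(x_0,y_0+k) - \frac{\partial f}{\partial x}(x_0,y_0)\Big] \]
Similarly,
\[ \frac{\partial}{\partial x}\frac{\partial f}{\partial y}(x_0,y_0) = \lim_{h\to 0}\frac{1}{h}\Big[\frac{\partial f}{\partial y}(x_0+h,y_0) - \frac{\partial f}{\partial y}(x_0,y_0)\Big] \]
Writing each first order partial derivative on the right hand sides as a limit of difference quotients gives the two iterated limits of \(F(h,k)\).
2. The mean value theorem (Theorem 2.13.4 in the CLP-1 text) says that, for any differentiable function \(\varphi(x)\),
the slope of the line joining the points \((x_0,\varphi(x_0))\) and \((x_0+k,\varphi(x_0+k))\) on the graph of \(\varphi\)
is the same as
the slope of the tangent to the graph at some point between \(x_0\) and \(x_0+k\).
Applying this with \(x\) replaced by \(y\) and \(\varphi\) replaced by \(G(y) = f(x_0+h,y) - f(x_0,y)\) gives
\[ \begin{align*} \frac{G(y_0+k) - G(y_0)}{k} &= \frac{dG}{dy}(y_0+\theta_1k) \qquad\text{for some } 0 < \theta_1 < 1 \\ &= \frac{\partial f}{\partial y}(x_0+h,y_0+\theta_1k) - \frac{\partial f}{\partial y}(x_0,y_0+\theta_1k) \end{align*} \]
Hence
\[ \begin{align*} F(h,k) &= \frac{1}{h}\Big[\frac{G(y_0+k) - G(y_0)}{k}\Big] \\ &= \frac{1}{h}\Big[\frac{\partial f}{\partial y}(x_0+h,y_0+\theta_1k) - \frac{\partial f}{\partial y}(x_0,y_0+\theta_1k)\Big] \end{align*} \]
3. Define \(H(x) = \frac{\partial f}{\partial y}(x,y_0+\theta_1k)\). By the mean value theorem,
\[ \begin{align*} F(h,k) &= \frac{1}{h}\big[H(x_0+h) - H(x_0)\big] \\ &= \frac{dH}{dx}(x_0+\theta_2h) \qquad\text{for some } 0 < \theta_2 < 1 \\ &= \frac{\partial}{\partial x}\frac{\partial f}{\partial y}(x_0+\theta_2h,y_0+\theta_1k) \end{align*} \]
4. Define \(A(x) = f(x,y_0+k) - f(x,y_0)\). By the mean value theorem,
\[ \begin{align*} F(h,k) &= \frac{1}{k}\Big[\frac{A(x_0+h) - A(x_0)}{h}\Big] \\ &= \frac{1}{k}\frac{dA}{dx}(x_0+\theta_3h) \qquad\text{for some } 0 < \theta_3 < 1 \\ &= \frac{1}{k}\Big[\frac{\partial f}{\partial x}(x_0+\theta_3h,y_0+k) - \frac{\partial f}{\partial x}(x_0+\theta_3h,y_0)\Big] \end{align*} \]
5. Define \(B(y) = \frac{\partial f}{\partial x}(x_0+\theta_3h,y)\). By the mean value theorem,
\[ \begin{align*} F(h,k) &= \frac{1}{k}\big[B(y_0+k) - B(y_0)\big] \\ &= \frac{dB}{dy}(y_0+\theta_4k) \qquad\text{for some } 0 < \theta_4 < 1 \\ &= \frac{\partial}{\partial y}\frac{\partial f}{\partial x}(x_0+\theta_3h,y_0+\theta_4k) \end{align*} \]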
Optional: An Example of \(\frac{\partial^2 f}{\partial x\partial y}(x_0,y_0) \ne \frac{\partial^2 f}{\partial y\partial x}(x_0,y_0)\)
In Theorem 2.3.4, we showed that \(\frac{\partial^2 f}{\partial x\partial y}(x_0,y_0) = \frac{\partial^2 f}{\partial y\partial x}(x_0,y_0)\) if the partial derivatives \(\frac{\partial^2 f}{\partial x\partial y}\) and \(\frac{\partial^2 f}{\partial y\partial x}\) exist and are continuous at \((x_0,y_0)\). Here is an example which shows that if the partial derivatives \(\frac{\partial^2 f}{\partial x\partial y}\) and \(\frac{\partial^2 f}{\partial y\partial x}\) are not continuous at \((x_0,y_0)\), then it is possible that
\[ \frac{\partial^2 f}{\partial x\partial y}(x_0,y_0) \ne \frac{\partial^2 f}{\partial y\partial x}(x_0,y_0). \]
Define
\[ f(x,y) = \begin{cases} xy\,\frac{x^2-y^2}{x^2+y^2} & \text{if } (x,y) \ne (0,0) \\ 0 & \text{if } (x,y) = (0,0) \end{cases} \]
This function is continuous everywhere. Note that f (x, 0) = 0 for all x and f (0, y) = 0 for all y. We now compute the first order
partial derivatives. For (x, y) ≠ (0, 0),
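\[ \begin{align*} \frac{\partial f}{\partial x}(x,y) &= y\frac{x^2-y^2}{x^2+y^2} + xy\frac{2x}{x^2+y^2} - xy\frac{2x(x^2-y^2)}{(x^2+y^2)^2} \\ &= y\frac{x^2-y^2}{x^2+y^2} + xy\frac{4xy^2}{(x^2+y^2)^2} \\ \frac{\partial f}{\partial y}(x,y) &= x\frac{x^2-y^2}{x^2+y^2} - xy\frac{2y}{x^2+y^2} - xy\frac{2y(x^2-y^2)}{(x^2+y^2)^2} \\ &= x\frac{x^2-y^2}{x^2+y^2} - xy\frac{4yx^2}{(x^2+y^2)^2} \end{align*} \]
For \((x,y) = (0,0)\), we use that \(f(x,0) = 0\) for all \(x\) and \(f(0,y) = 0\) for all \(y\):
\[ \begin{align*} \frac{\partial f}{\partial x}(0,0) &= \Big[\frac{d}{dx}f(x,0)\Big]_{x=0} = \Big[\frac{d}{dx}0\Big]_{x=0} = 0 \\ \frac{\partial f}{\partial y}(0,0) &= \Big[\frac{d}{dy}f(0,y)\Big]_{y=0} = \Big[\frac{d}{dy}0\Big]_{y=0} = 0 \end{align*} \]
In summary, the two first order partial derivatives are
\[ \begin{align*} f_x(x,y) &= \begin{cases} y\frac{x^2-y^2}{x^2+y^2} + \frac{4x^2y^3}{(x^2+y^2)^2} & \text{if } (x,y) \ne (0,0) \\ 0 & \text{if } (x,y) = (0,0) \end{cases} \\ f_y(x,y) &= \begin{cases} x\frac{x^2-y^2}{x^2+y^2} - \frac{4x^3y^2}{(x^2+y^2)^2} & \text{if } (x,y) \ne (0,0) \\ 0 & \text{if } (x,y) = (0,0) \end{cases} \end{align*} \]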
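Both \(\frac{\partial f}{\partial x}(x,y)\) and \(\frac{\partial f}{\partial y}(x,y)\) are continuous. Finally, we compute
\[ \begin{align*} \frac{\partial^2 f}{\partial x\partial y}(0,0) &= \Big[\frac{d}{dx}f_y(x,0)\Big]_{x=0} = \lim_{h\to 0}\frac{1}{h}\big[f_y(h,0) - f_y(0,0)\big] \\ &= \lim_{h\to 0}\frac{1}{h}\Big[h\frac{h^2-0^2}{h^2+0^2} - 0\Big] = 1 \\ \frac{\partial^2 f}{\partial y\partial x}(0,0) &= \Big[\frac{d}{dy}f_x(0,y)\Big]_{y=0} = \lim_{k\to 0}\frac{1}{k}\big[f_x(0,k) - f_x(0,0)\big] \\ &= \lim_{k\to 0}\frac{1}{k}\Big[k\frac{0^2-k^2}{0^2+k^2} - 0\Big] = -1 \end{align*} \]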
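The two different values at the origin show up clearly in a numerical experiment. This is a rough sketch under assumptions: the helper names `f_x`, `f_y` and the step sizes are illustrative choices, with the inner step (for the first derivatives) taken much smaller than the outer step (for the second derivative).

```python
def f(x, y):
    """The function of this example, with f(0, 0) = 0."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y * (x * x - y * y) / (x * x + y * y)

def f_x(x, y, h=1e-6):
    """Central-difference approximation to the partial derivative in x."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def f_y(x, y, h=1e-6):
    """Central-difference approximation to the partial derivative in y."""
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

step = 1e-3   # outer step, deliberately much larger than the inner step h

# d/dx of f_y along the x-axis, and d/dy of f_x along the y-axis, at the origin
d2f_dxdy = (f_y(step, 0.0) - f_y(-step, 0.0)) / (2 * step)
d2f_dydx = (f_x(0.0, step) - f_x(0.0, -step)) / (2 * step)
print(d2f_dxdy, d2f_dydx)   # approximately 1 and -1
```

The approximations land near \(1\) and \(-1\), in agreement with the limits computed above.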
Exercises
Stage 1
1
Let all of the third order partial derivatives of the function f (x, y, z) exist and be continuous. Show that
fxyz (x, y, z) = fxzy (x, y, z) = fyxz (x, y, z) = fyzx (x, y, z)
2
Stage 2
3
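2. \(f(x,y) = e^{xy}\); \(f_{xx}(x,y)\), \(f_{xy}(x,y)\), \(f_{xxy}(x,y)\), \(f_{xyy}(x,y)\)
3. \(f(u,v,w) = \frac{1}{u+2v+3w}\); \(\frac{\partial^3 f}{\partial u\,\partial v\,\partial w}(u,v,w)\), \(\frac{\partial^3 f}{\partial u\,\partial v\,\partial w}(3,2,1)\)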
4
Find all second partial derivatives of \(f(x,y) = \sqrt{x^2+5y^2}\).
5
6✳
Let \(f(r,\theta) = r^m\cos m\theta\) be a function of \(r\) and \(\theta\), where \(m\) is a positive integer.
1. Find the second order partial derivatives \(f_{rr}\), \(f_{r\theta}\), \(f_{\theta\theta}\) and evaluate their respective values at \((r,\theta) = (1,0)\).
2. Determine the value of the real number \(\lambda\) so that \(f(r,\theta)\) satisfies the differential equation
\[ f_{rr} + \frac{\lambda}{r}f_r + \frac{1}{r^2}f_{\theta\theta} = 0 \]
Stage 3
7
1 2 2 2
1. The history of this important theorem is pretty convoluted. See “A note on the history of mixed partial derivatives” by Thomas James Higgins which was published in Scripta Mathematica 7 (1940), 59-62. The Theorem is named for Alexis Clairaut (1713-1765), a French mathematician, astronomer, and geophysicist, and Hermann Schwarz (1843-1921), a German mathematician.
This page titled 2.3: Higher Order Derivatives is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Joel
Feldman, Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a detailed
edit history is available upon request.
2.4: The Chain Rule
You already routinely use the one dimensional chain rule
\[ \frac{d}{dt}f\big(x(t)\big) = \frac{df}{dx}\big(x(t)\big)\,\frac{dx}{dt}(t) \]
We now generalize the chain rule to functions of more than one variable. For concreteness, we concentrate on the case in which all functions are functions of two variables. That is, we find the partial derivatives \(\frac{\partial F}{\partial s}\) and \(\frac{\partial F}{\partial t}\) of a function \(F(s,t)\) that is defined as a composition
\[ F(s,t) = f\big(x(s,t)\,,\,y(s,t)\big) \]
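We are using the name \(F\) for the new function \(F(s,t)\) as a reminder that it is closely related to, though not the same as, the function \(f(x,y)\). The partial derivative \(\frac{\partial F}{\partial s}\) is the rate of change of \(F\) when \(s\) is varied with \(t\) held constant. When \(s\) is varied, both the \(x\)-argument, \(x(s,t)\), and the \(y\)-argument, \(y(s,t)\), in \(f\big(x(s,t)\,,\,y(s,t)\big)\) vary. Consequently, the chain rule for \(f\big(x(s,t)\,,\,y(s,t)\big)\) is a sum of two terms, one resulting from the variation of the \(x\)-argument and the other resulting from the variation of the \(y\)-argument.
Theorem 2.4.1 (The Chain Rule)
\[ \begin{align*} \frac{\partial F}{\partial s}(s,t) &= \frac{\partial f}{\partial x}\big(x(s,t)\,,\,y(s,t)\big)\,\frac{\partial x}{\partial s}(s,t) + \frac{\partial f}{\partial y}\big(x(s,t)\,,\,y(s,t)\big)\,\frac{\partial y}{\partial s}(s,t) \\ \frac{\partial F}{\partial t}(s,t) &= \frac{\partial f}{\partial x}\big(x(s,t)\,,\,y(s,t)\big)\,\frac{\partial x}{\partial t}(s,t) + \frac{\partial f}{\partial y}\big(x(s,t)\,,\,y(s,t)\big)\,\frac{\partial y}{\partial t}(s,t) \end{align*} \]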
We will give the proof of this theorem in §2.4.4, below. It is common to state this chain rule as
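\[ \begin{align*} \frac{\partial F}{\partial s} &= \frac{\partial f}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial s} \\ \frac{\partial F}{\partial t} &= \frac{\partial f}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial t} \end{align*} \]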
That is, it is common to suppress the function arguments. But you should make sure that you understand what the arguments are
before doing so.
Theorem 2.4.1 is given for the case that F is the composition of a function of two variables, f (x, y), with two functions, x(s, t)
and y(s, t), of two variables each. There is nothing magical about the number two. There are obvious variants for any numbers of
variables. For example,
Equation 2.4.2
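If \(F(t) = f\big(x(t),y(t),z(t)\big)\), then
\[ \begin{align*} \frac{dF}{dt}(t) &= \frac{\partial f}{\partial x}\big(x(t),y(t),z(t)\big)\,\frac{dx}{dt}(t) + \frac{\partial f}{\partial y}\big(x(t),y(t),z(t)\big)\,\frac{dy}{dt}(t) \\ &\qquad + \frac{\partial f}{\partial z}\big(x(t),y(t),z(t)\big)\,\frac{dz}{dt}(t) \end{align*} \]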
2.4.1 https://math.libretexts.org/@go/page/89212
and
Equation 2.4.3
There will be a large number of examples shortly. First, here is a memory aid.
While the functions f and F are closely related, they are not the same. One is a function of x and y while the other is a function
of s and t.
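Step 2: Write down the template
\[ \frac{\partial F}{\partial s} = \frac{\partial f}{\partial\,\underline{\ \ \ }}\ \frac{\partial\,\underline{\ \ \ }}{\partial s} \]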
Note that
The function F appears once in the numerator on the left. The function f , from which F is constructed by a change of
variables, appears once in the numerator on the right.
The variable in the denominator on the left appears once in the denominator on the right.
Step 3: Fill in the blanks with every variable that makes sense. In particular, since f is a function of x and y, it may only be
differentiated with respect to x and y. So we add together two copies of our template — one for x and one for y:
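\[ \frac{\partial F}{\partial s} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial s} \]
Note that \(x\) and \(y\) are functions of \(s\) so that the derivatives \(\frac{\partial x}{\partial s}\) and \(\frac{\partial y}{\partial s}\) make sense. The first term, \(\frac{\partial f}{\partial x}\frac{\partial x}{\partial s}\), arises from the variation of \(x\) with respect to \(s\) and the second term, \(\frac{\partial f}{\partial y}\frac{\partial y}{\partial s}\), arises from the variation of \(y\) with respect to \(s\).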
Step 4: Put in the functional dependence explicitly. Fortunately, there is only one functional dependence that makes sense. The
left hand side is a function of s and t. Hence the right hand side must also be a function of s and t. As f is a function of x and
y, this is achieved by evaluating f at x = x(s, t) and y = y(s, t).
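\[ \frac{\partial F}{\partial s}(s,t) = \frac{\partial f}{\partial x}\big(x(s,t),y(s,t)\big)\,\frac{\partial x}{\partial s}(s,t) + \frac{\partial f}{\partial y}\big(x(s,t),y(s,t)\big)\,\frac{\partial y}{\partial s}(s,t) \]
If you fail to put in the arguments, or at least if you fail to remember what the arguments are, you may forget that \(\frac{\partial f}{\partial x}\) and \(\frac{\partial f}{\partial y}\) depend on \(s\) and \(t\). Then, if you have to compute a second derivative of \(F\), you will probably fail to differentiate the factors \(\frac{\partial f}{\partial x}\big(x(s,t),y(s,t)\big)\) and \(\frac{\partial f}{\partial y}\big(x(s,t),y(s,t)\big)\).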
To help remember the formulae of Theorem 2.4.1, it is sometimes also useful to pretend that our variables are physical quantities with \(f\), \(F\) having units of grams, \(x\), \(y\) having units of meters and \(s\), \(t\) having units of seconds. Note that
the left hand side, \(\frac{\partial F}{\partial s}\), has units grams per second.
Each term on the right hand side contains the partial derivative of \(f\) with respect to a different independent variable. That independent variable appears once in the denominator and once in the numerator, so that its units (in this case meters) cancel out. Thus both of the terms \(\frac{\partial f}{\partial x}\frac{\partial x}{\partial s}\) and \(\frac{\partial f}{\partial y}\frac{\partial y}{\partial s}\) on the right hand side also have the units grams per second.
Hence both sides of the equation have the same units.
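Here is a pictorial procedure that uses a tree diagram to help remember the chain rule \(\frac{\partial}{\partial s}f\big(x(s,t),y(s,t)\big) = \frac{\partial f}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial s}\). As in the figure on the left below,
write, on the top row, “\(f\)”.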
Write, on the middle row, each of the variables that the function f (x, y) depends on, namely“x” and “y ”.
Write, on the bottom row,
below x, each of the variables that the function x(s, t) depends on, namely “s ” and “t ”, and
below y, each of the variables that the function y(s, t) depends on, namely “s ” and “t ”.
Draw a line joining each function with each of the variables that it depends on.
Then, as in the figure on the right below, write beside each line, the partial derivative of the function at the top of the line with
respect to the variable at the bottom of the line.
Finally
observe, from the figure below, that there are two paths from f , on the top, to s, on the bottom. One path goes from f at the
top, through x in the middle to s at the bottom. The other path goes from f at the top, through y in the middle to s at the
bottom.
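For each such path, multiply together the partial derivatives beside the lines of the path. In this example, the two products are \(\frac{\partial f}{\partial x}\frac{\partial x}{\partial s}\), for the first path, and \(\frac{\partial f}{\partial y}\frac{\partial y}{\partial s}\), for the second path.
Then add together those products, giving, in this example, \(\frac{\partial f}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial s}\).
Put in the arguments, as in Step 4 above:
\[ \begin{align*} \frac{\partial}{\partial s}f\big(x(s,t),y(s,t)\big) &= \frac{\partial f}{\partial x}\big(x(s,t),y(s,t)\big)\,\frac{\partial x}{\partial s}(s,t) \\ &\qquad + \frac{\partial f}{\partial y}\big(x(s,t),y(s,t)\big)\,\frac{\partial y}{\partial s}(s,t) \end{align*} \]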
Example 2.4.4
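The right hand side of the chain rule
\[ \begin{align*} \frac{d}{dt}f\big(x(t),y(t),z(t)\big) &= \frac{\partial f}{\partial x}\big(x(t),y(t),z(t)\big)\,\frac{dx}{dt}(t) + \frac{\partial f}{\partial y}\big(x(t),y(t),z(t)\big)\,\frac{dy}{dt}(t) \\ &\qquad + \frac{\partial f}{\partial z}\big(x(t),y(t),z(t)\big)\,\frac{dz}{dt}(t) \end{align*} \]
of Equation 2.4.2, without arguments, is \(\frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} + \frac{\partial f}{\partial z}\frac{dz}{dt}\). The corresponding tree diagram is
Because \(x(t)\), \(y(t)\) and \(z(t)\) are each functions of just one variable, the derivatives beside the lower lines in the tree are ordinary, rather than partial, derivatives.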
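Example 2.4.5. \(\frac{\partial}{\partial s}f\big(x(s,t),y(s,t)\big)\)
We find \(\frac{\partial}{\partial s}f\big(x(s,t),y(s,t)\big)\) for
\[ f(x,y) = e^{xy} \qquad x(s,t) = s \qquad y(s,t) = \cos t \]
Define \(F(s,t) = f\big(x(s,t)\,,\,y(s,t)\big)\). The appropriate chain rule for this example is the upper equation of Theorem 2.4.1.
\[ \frac{\partial F}{\partial s}(s,t) = \frac{\partial f}{\partial x}\big(x(s,t)\,,\,y(s,t)\big)\,\frac{\partial x}{\partial s}(s,t) + \frac{\partial f}{\partial y}\big(x(s,t)\,,\,y(s,t)\big)\,\frac{\partial y}{\partial s}(s,t) \]
For the given functions,
\[ \begin{align*} \frac{\partial f}{\partial x}(x,y) &= ye^{xy} & \frac{\partial f}{\partial x}\big(x(s,t),y(s,t)\big) &= y(s,t)\,e^{x(s,t)\,y(s,t)} = \cos t\ e^{s\cos t} \\ \frac{\partial f}{\partial y}(x,y) &= xe^{xy} & \frac{\partial f}{\partial y}\big(x(s,t),y(s,t)\big) &= x(s,t)\,e^{x(s,t)\,y(s,t)} = s\,e^{s\cos t} \\ \frac{\partial x}{\partial s} &= 1 & \frac{\partial y}{\partial s} &= 0 \end{align*} \]
so that
\[ \frac{\partial F}{\partial s}(s,t) = \underbrace{\big\{\cos t\ e^{s\cos t}\big\}}_{\frac{\partial f}{\partial x}}\ \underbrace{(1)}_{\frac{\partial x}{\partial s}} + \underbrace{\big\{s\,e^{s\cos t}\big\}}_{\frac{\partial f}{\partial y}}\ \underbrace{(0)}_{\frac{\partial y}{\partial s}} = \cos t\ e^{s\cos t} \]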
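The result of Example 2.4.5 can be sanity-checked by comparing the chain rule formula with a difference quotient of the composite function. The sketch below is only an illustration; the function names and the sample point \((s_0,t_0) = (0.5, 1.2)\) are assumptions.

```python
import math

def F(s, t):
    # composite F(s, t) = f(x(s, t), y(s, t)) with f = e^{xy}, x = s, y = cos t
    return math.exp(s * math.cos(t))

def chain_rule_dF_ds(s, t):
    """Right hand side of the upper equation of Theorem 2.4.1 for this example."""
    x, y = s, math.cos(t)
    f_x = y * math.exp(x * y)     # partial of f in x, evaluated at (x(s,t), y(s,t))
    f_y = x * math.exp(x * y)     # partial of f in y, evaluated at (x(s,t), y(s,t))
    x_s, y_s = 1.0, 0.0           # partials of x = s and y = cos t with respect to s
    return f_x * x_s + f_y * y_s

s0, t0, h = 0.5, 1.2, 1e-6
numeric = (F(s0 + h, t0) - F(s0 - h, t0)) / (2 * h)   # central difference in s
formula = chain_rule_dF_ds(s0, t0)
print(numeric, formula)
```

The two numbers should agree to many digits, and both equal \(\cos t_0\, e^{s_0\cos t_0}\).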
Example 2.4.6. \(\frac{d}{dt}f\big(x(t),y(t)\big)\)
We find \(\frac{d}{dt}f\big(x(t),y(t)\big)\) for
\[ f(x,y) = x^2-y^2 \qquad x(t) = \cos t \qquad y(t) = \sin t \]
Define \(F(t) = f\big(x(t),y(t)\big)\). Since \(F(t)\) is a function of one variable its derivative is denoted \(\frac{dF}{dt}\) rather than \(\frac{\partial F}{\partial t}\). The appropriate chain rule for this example (see 2.4.2) is
\[ \frac{dF}{dt}(t) = \frac{\partial f}{\partial x}\big(x(t),y(t)\big)\,\frac{dx}{dt}(t) + \frac{\partial f}{\partial y}\big(x(t),y(t)\big)\,\frac{dy}{dt}(t) \]
For the given functions,
\[ \begin{align*} \frac{\partial f}{\partial x}(x,y) &= 2x & \frac{\partial f}{\partial x}\big(x(t),y(t)\big) &= 2x(t) = 2\cos t \\ \frac{\partial f}{\partial y}(x,y) &= -2y & \frac{\partial f}{\partial y}\big(x(t),y(t)\big) &= -2y(t) = -2\sin t \\ \frac{dx}{dt} &= -\sin t & \frac{dy}{dt} &= \cos t \end{align*} \]
so that
\[ \frac{dF}{dt}(t) = (2\cos t)(-\sin t) + (-2\sin t)(\cos t) = -4\sin t\cos t \]
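As a cross-check of this example, one can also substitute first and differentiate afterwards. Using only the double angle identities, the composite function and its derivative are

\[ F(t) = \cos^2 t - \sin^2 t = \cos 2t \qquad\Longrightarrow\qquad \frac{dF}{dt}(t) = -2\sin 2t = -4\sin t\cos t \]

which agrees with the chain rule computation.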
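Example 2.4.7. \(\frac{\partial}{\partial t}f(x+ct)\)
Example 2.4.8. \(\frac{\partial^2}{\partial t^2}f(x+ct)\)
Define \(w(x,t) = f(x+ct)\) and \(W(x,t) = \frac{\partial w}{\partial t}(x,t) = c\,f'(x+ct) = F\big(u(x,t)\big)\) where \(F(u) = c\,f'(u)\) and \(u(x,t) = x+ct\). Then
\[ \begin{align*} \frac{\partial^2}{\partial t^2}f(x+ct) &= \frac{\partial W}{\partial t}(x,t) = \frac{dF}{du}\big(u(x,t)\big)\,\frac{\partial u}{\partial t}(x,t) = c\,f''(x+ct)\,c \\ &= c^2 f''(x+ct) \end{align*} \]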
Example 2.4.9
Suppose that \(F(P,V,T) = 0\). Find \(\frac{\partial P}{\partial T}\).
Before we can find \(\frac{\partial P}{\partial T}\), we first have to decide what it means. This happens regularly in applications. In fact, this particular problem comes from thermodynamics. The variables \(P\), \(V\), \(T\) are the pressure, volume and temperature, respectively, of some gas. These three variables are not independent. They are related by an equation of state, here denoted \(F(P,V,T) = 0\). Given values for any two of \(P\), \(V\), \(T\), the third can be found by solving \(F(P,V,T) = 0\). We are being asked to find \(\frac{\partial P}{\partial T}\). This implicitly instructs us to treat \(P\), in this problem, as the dependent variable. So a careful wording of this problem (which you will never encounter in the “real world”) would be the following. The function \(P(V,T)\) is defined by \(F\big(P(V,T),V,T\big) = 0\). Find \(\big(\frac{\partial P}{\partial T}\big)_V\). That is, find the rate of change of pressure as the temperature is varied, while holding the volume fixed.
Since we are not told explicitly what \(F\) is, we cannot solve explicitly for \(P(V,T)\). So, instead we differentiate both sides of
\[ F\big(P(V,T),V,T\big) = 0 \]
with respect to \(T\), while holding \(V\) fixed. Think of the left hand side, \(F\big(P(V,T),V,T\big)\), as being \(F\big(P(V,T),Q(V,T),R(V,T)\big)\) with \(Q(V,T) = V\) and \(R(V,T) = T\). By the chain rule,
\[ \frac{\partial}{\partial T}F\big(P(V,T),Q(V,T),R(V,T)\big) = F_1\frac{\partial P}{\partial T} + F_2\frac{\partial Q}{\partial T} + F_3\frac{\partial R}{\partial T} = 0 \]
with \(F_j\) referring to the partial derivative of \(F\) with respect to its \(j^{\rm th}\) argument. Experienced chain rule users never introduce \(Q\) and \(R\). Instead, they just write
\[ \frac{\partial F}{\partial P}\frac{\partial P}{\partial T} + \frac{\partial F}{\partial V}\frac{\partial V}{\partial T} + \frac{\partial F}{\partial T}\frac{\partial T}{\partial T} = 0 \]
Recalling that \(V\) and \(T\) are the independent variables and that, in computing \(\frac{\partial}{\partial T}\), \(V\) is to be treated as a constant,
\[ \frac{\partial V}{\partial T} = 0 \qquad \frac{\partial T}{\partial T} = 1 \]
Substituting these in and solving for \(\frac{\partial P}{\partial T}\),
\[ \frac{\partial P}{\partial T}(V,T) = -\frac{\frac{\partial F}{\partial T}\big(P(V,T),V,T\big)}{\frac{\partial F}{\partial P}\big(P(V,T),V,T\big)} \]
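To make the formula concrete, here is a sketch that applies it to one particular equation of state. Everything specific in it is an assumption made for illustration: the ideal-gas form \(F(P,V,T) = PV - CT\), the value `C = 8.314`, the sample state, and the helper `partial`, which estimates a partial derivative by central differences. For this \(F\) one can also solve explicitly, \(P = CT/V\), so \(\big(\frac{\partial P}{\partial T}\big)_V = C/V\).

```python
C = 8.314   # assumed constant in the illustrative equation of state P*V = C*T

def state(P, V, T):
    """F(P, V, T) for the illustrative ideal-gas equation of state."""
    return P * V - C * T

def partial(g, point, i, h=1e-4):
    """Central-difference estimate of the partial derivative of g in argument i."""
    up, dn = list(point), list(point)
    up[i] += h
    dn[i] -= h
    return (g(*up) - g(*dn)) / (2 * h)

V0, T0 = 2.0, 300.0
P0 = C * T0 / V0                  # chosen so that state(P0, V0, T0) = 0
point = (P0, V0, T0)

F_P = partial(state, point, 0)    # dF/dP with V, T held fixed
F_T = partial(state, point, 2)    # dF/dT with P, V held fixed
dP_dT_formula = -F_T / F_P        # the formula derived in this example
dP_dT_explicit = C / V0           # from solving P = C*T/V directly
print(dP_dT_formula, dP_dT_explicit)
```

The implicit formula and the explicit derivative should match to rounding error.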
Example 2.4.10
Suppose that \(f(x,y) = 0\) and that we are to find \(\frac{d^2y}{dx^2}\).
Once again, \(x\) and \(y\) are not independent variables. Given a value for either \(x\) or \(y\), the other is determined by solving \(f(x,y) = 0\). Since we are asked to find \(\frac{d^2y}{dx^2}\), it is \(y\) that is to be viewed as a function of \(x\), rather than the other way around. So \(f(x,y) = 0\) really means that, in this problem, \(f\big(x,y(x)\big) = 0\) for all \(x\). Differentiating both sides of this equation with respect to \(x\),
\[ f\big(x,y(x)\big) = 0 \text{ for all } x \implies \frac{d}{dx}f\big(x,y(x)\big) = 0 \]
Note that \(\frac{d}{dx}f\big(x,y(x)\big)\) is not the same as \(f_x\big(x,y(x)\big)\). The former is, by definition, the rate of change with respect to \(x\) of \(g(x) = f\big(x,y(x)\big)\). Precisely,
\[ \frac{dg}{dx} = \lim_{\Delta x\to 0}\frac{g(x+\Delta x) - g(x)}{\Delta x} = \lim_{\Delta x\to 0}\frac{f\big(x+\Delta x\,,\,y(x+\Delta x)\big) - f\big(x\,,\,y(x)\big)}{\Delta x} \tag{$*$} \]
On the other hand, by definition,
\[ f_x(x,y) = \lim_{\Delta x\to 0}\frac{f(x+\Delta x\,,\,y) - f(x\,,\,y)}{\Delta x} \implies f_x\big(x,y(x)\big) = \lim_{\Delta x\to 0}\frac{f\big(x+\Delta x\,,\,y(x)\big) - f\big(x\,,\,y(x)\big)}{\Delta x} \tag{$**$} \]
The right hand sides of \((*)\) and \((**)\) are not the same. In \(\frac{dg}{dx}\), as \(\Delta x\) varies the value of \(y\) that is substituted into the first \(f(\cdots)\) on the right hand side, namely \(y(x+\Delta x)\), changes as \(\Delta x\) changes. That is, we are computing the rate of change of \(f\) along the (curved) path \(y = y(x)\). In \((**)\), the corresponding value of \(y\) is \(y(x)\) and is independent of \(\Delta x\). That is, we are computing the rate of change of \(f\) along a horizontal straight line. As a concrete example, suppose that \(f(x,y) = x+y\). Then, \(0 = f\big(x\,,\,y(x)\big) = x+y(x)\) gives \(y(x) = -x\) so that
\[ \frac{d}{dx}f\big(x,y(x)\big) = \frac{d}{dx}f(x,-x) = \frac{d}{dx}\big[x+(-x)\big] = \frac{d}{dx}0 = 0 \]
But \(f(x,y) = x+y\) has \(f_x(x,y) = 1\), so that
\[ f_x\big(x,y(x)\big) = f_x(x,y)\Big|_{y=-x} = 1\Big|_{y=-x} = 1 \]
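Now back to
\[ \begin{align*} f\big(x,y(x)\big) = 0 \text{ for all } x &\implies \frac{d}{dx}f\big(x,y(x)\big) = 0 \\ &\implies f_x\big(x,y(x)\big)\frac{dx}{dx} + f_y\big(x,y(x)\big)\frac{dy}{dx}(x) = 0 \qquad\text{by the chain rule} \\ &\implies \frac{dy}{dx}(x) = -\frac{f_x\big(x,y(x)\big)}{f_y\big(x,y(x)\big)} \\ &\implies \frac{d^2y}{dx^2}(x) = -\frac{d}{dx}\left[\frac{f_x\big(x,y(x)\big)}{f_y\big(x,y(x)\big)}\right] \\ &\phantom{\implies \frac{d^2y}{dx^2}(x)} = -\frac{f_y\big(x,y(x)\big)\,\frac{d}{dx}\big[f_x\big(x,y(x)\big)\big] - f_x\big(x,y(x)\big)\,\frac{d}{dx}\big[f_y\big(x,y(x)\big)\big]}{f_y\big(x,y(x)\big)^2} \qquad (\dagger) \end{align*} \]
by the quotient rule. It remains to evaluate \(\frac{d}{dx}\big[f_x\big(x,y(x)\big)\big]\) and \(\frac{d}{dx}\big[f_y\big(x,y(x)\big)\big]\). For the former apply the chain rule to \(h(x) = u\big(x,y(x)\big)\) with \(u(x,y) = f_x(x,y)\).
\[ \begin{align*} \frac{d}{dx}\big[f_x\big(x,y(x)\big)\big] &= \frac{dh}{dx}(x) \\ &= u_x\big(x,y(x)\big)\frac{dx}{dx} + u_y\big(x,y(x)\big)\frac{dy}{dx}(x) \\ &= f_{xx}\big(x,y(x)\big)\frac{dx}{dx} + f_{xy}\big(x,y(x)\big)\frac{dy}{dx}(x) \\ &= f_{xx}\big(x,y(x)\big) - f_{xy}\big(x,y(x)\big)\left[\frac{f_x\big(x,y(x)\big)}{f_y\big(x,y(x)\big)}\right] \end{align*} \]
Similarly, replacing \(f_x\) by \(f_y\) in this computation,
\[ \frac{d}{dx}\big[f_y\big(x,y(x)\big)\big] = f_{yx}\big(x,y(x)\big) - f_{yy}\big(x,y(x)\big)\left[\frac{f_x\big(x,y(x)\big)}{f_y\big(x,y(x)\big)}\right] \]
Substituting these two results into the right hand side of \((\dagger)\) gives the final answer.
\[ \begin{align*} \frac{d^2y}{dx^2}(x) &= -\frac{f_yf_{xx} - f_yf_{xy}\frac{f_x}{f_y} - f_xf_{yx} + f_xf_{yy}\frac{f_x}{f_y}}{f_y^2} \\ &= -\frac{f_y^2f_{xx} - 2f_xf_yf_{xy} + f_x^2f_{yy}}{f_y^3} \end{align*} \]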
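The final formula can be tested on a curve where \(y(x)\) is known explicitly. The sketch below assumes the unit circle \(f(x,y) = x^2+y^2-1 = 0\) with \(y(x) = \sqrt{1-x^2}\) (an example chosen here, not taken from the text), for which \(f_x = 2x\), \(f_y = 2y\), \(f_{xx} = f_{yy} = 2\), \(f_{xy} = 0\) and the formula reduces to \(-1/y^3\). The function name `second_derivative_implicit` is hypothetical.

```python
import math

def second_derivative_implicit(fx, fy, fxx, fxy, fyy):
    """d^2y/dx^2 from the formula above, given the partials of f at (x, y(x))."""
    return -(fy ** 2 * fxx - 2 * fx * fy * fxy + fx ** 2 * fyy) / fy ** 3

def y(x):
    # upper half of the unit circle, the explicit solution of x^2 + y^2 - 1 = 0
    return math.sqrt(1 - x * x)

x0 = 0.6
y0 = y(x0)   # approximately 0.8

formula = second_derivative_implicit(2 * x0, 2 * y0, 2.0, 0.0, 2.0)

h = 1e-4
numeric = (y(x0 + h) - 2 * y(x0) + y(x0 - h)) / h ** 2   # second difference of y(x)
print(formula, numeric, -1 / y0 ** 3)
```

All three printed numbers should be close to each other, illustrating that the implicit formula reproduces the second derivative of the explicit solution.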
We now move on to the proof of Theorem 2.4.1. To give you an idea of how the proof will go, we first review the proof of the
familiar one dimensional chain rule.
Review of the Proof of \(\frac{d}{dt}f\big(x(t)\big) = \frac{df}{dx}\big(x(t)\big)\,\frac{dx}{dt}(t)\)
As a warm up, let's review the proof of the one dimensional chain rule
\[ \frac{d}{dt}f\big(x(t)\big) = \frac{df}{dx}\big(x(t)\big)\,\frac{dx}{dt}(t) \]
assuming that \(\frac{dx}{dt}\) exists and that \(\frac{df}{dx}\) is continuous. We wish to find the derivative of \(F(t) = f\big(x(t)\big)\). By definition
\[ F'(t) = \lim_{h\to 0}\frac{F(t+h) - F(t)}{h} = \lim_{h\to 0}\frac{f\big(x(t+h)\big) - f\big(x(t)\big)}{h} \]
Notice that the numerator is the difference of \(f(x)\) evaluated at two nearby values of \(x\), namely \(x_1 = x(t+h)\) and \(x_0 = x(t)\). The mean value theorem is a good tool for studying the difference in the values of \(f(x)\) at two nearby points. Recall that the mean value theorem says that, for any given \(x_0\) and \(x_1\), there exists an (in general unknown) \(c\) between them so that
\[ f(x_1) - f(x_0) = f'(c)\,(x_1-x_0) \]
For this proof, we choose \(x_0 = x(t)\) and \(x_1 = x(t+h)\). Then the mean value theorem tells us that there exists a \(c_h\) so that
\[ f\big(x(t+h)\big) - f\big(x(t)\big) = f(x_1) - f(x_0) = f'(c_h)\,\big[x(t+h) - x(t)\big] \]
We have put the subscript \(h\) on \(c_h\) to emphasise that \(c_h\), which is between \(x_0 = x(t)\) and \(x_1 = x(t+h)\), may depend on \(h\). Now since \(c_h\) is trapped between \(x(t)\) and \(x(t+h)\) and since \(x(t+h)\to x(t)\) as \(h\to 0\), we have that \(c_h\) must also tend to \(x(t)\) as \(h\to 0\). Plugging this into the definition of \(F'(t)\),
\[ \begin{align*} F'(t) &= \lim_{h\to 0}\frac{f\big(x(t+h)\big) - f\big(x(t)\big)}{h} = \lim_{h\to 0}\frac{f'(c_h)\,\big[x(t+h) - x(t)\big]}{h} \\ &= \lim_{h\to 0}f'(c_h)\ \lim_{h\to 0}\frac{x(t+h) - x(t)}{h} \\ &= f'\big(x(t)\big)\,x'(t) \end{align*} \]
as desired.
Proof of Theorem 2.4.1
We prove the first equation of Theorem 2.4.1, which gives the partial derivative with respect to \(s\) of \(F(s,t) = f\big(x(s,t)\,,\,y(s,t)\big)\). By definition,
\[ \frac{\partial F}{\partial s}(s,t) = \lim_{h\to 0}\frac{F(s+h,t) - F(s,t)}{h} \]
The numerator is the difference of \(f(x,y)\) evaluated at two nearby values of \((x,y)\), namely \((x_1,y_1) = \big(x(s+h,t)\,,\,y(s+h,t)\big)\) and \((x_0,y_0) = \big(x(s,t)\,,\,y(s,t)\big)\). In going from \((x_0,y_0)\) to \((x_1,y_1)\), both the \(x\) and \(y\)-coordinates change. By adding and subtracting we can separate the change in the \(x\)-coordinate from the change in the \(y\)-coordinate.
\[ F(s+h,t) - F(s,t) = f(x_1,y_1) - f(x_0,y_0) = \big\{f(x_1,y_1) - f(x_0,y_1)\big\} + \big\{f(x_0,y_1) - f(x_0,y_0)\big\} \]
The first half, \(\big\{f(x_1,y_1) - f(x_0,y_1)\big\}\), has the same \(y\) argument in both terms and so is the difference of the function of one variable \(g(x) = f(x,y_1)\) (viewing \(y_1\) just as a constant) evaluated at the two nearby values, \(x_0\), \(x_1\), of \(x\). Consequently, we can make use of the mean value theorem as we did in §2.4.3 above. There is a \(c_{x,h}\) between \(x_0 = x(s,t)\) and \(x_1 = x(s+h,t)\) such that
\[ \begin{align*} f(x_1,y_1) - f(x_0,y_1) &= g(x_1) - g(x_0) = g'(c_{x,h})\,[x_1-x_0] = \frac{\partial f}{\partial x}(c_{x,h}\,,\,y_1)\,[x_1-x_0] \\ &= \frac{\partial f}{\partial x}\big(c_{x,h}\,,\,y(s+h,t)\big)\,\big[x(s+h,t) - x(s,t)\big] \end{align*} \]
We have introduced the two subscripts in \(c_{x,h}\) to remind ourselves that it may depend on \(h\) and that it lies between the two \(x\)-values \(x_0\) and \(x_1\).
Similarly, the second half, \(\big\{f(x_0,y_1) - f(x_0,y_0)\big\}\), is the difference of the function of one variable \(h(y) = f(x_0,y)\) (viewing \(x_0\) just as a constant) evaluated at the two nearby values, \(y_0\), \(y_1\), of \(y\). So, by the mean value theorem,
\[ \begin{align*} f(x_0,y_1) - f(x_0,y_0) &= h(y_1) - h(y_0) = h'(c_{y,h})\,[y_1-y_0] = \frac{\partial f}{\partial y}(x_0\,,\,c_{y,h})\,[y_1-y_0] \\ &= \frac{\partial f}{\partial y}\big(x(s,t)\,,\,c_{y,h}\big)\,\big[y(s+h,t) - y(s,t)\big] \end{align*} \]
for some (unknown) \(c_{y,h}\) between \(y_0 = y(s,t)\) and \(y_1 = y(s+h,t)\). Again, the two subscripts in \(c_{y,h}\) remind ourselves that it may depend on \(h\) and that it lies between the two \(y\)-values \(y_0\) and \(y_1\). So, noting that, as \(h\) tends to zero, \(c_{x,h}\), which is trapped between \(x(s,t)\) and \(x(s+h,t)\), must tend to \(x(s,t)\), and \(c_{y,h}\), which is trapped between \(y(s,t)\) and \(y(s+h,t)\), must tend to \(y(s,t)\),
\[ \begin{align*} \frac{\partial F}{\partial s}(s,t) &= \lim_{h\to 0}\frac{F(s+h,t) - F(s,t)}{h} \\ &= \lim_{h\to 0}\frac{\partial f}{\partial x}\big(c_{x,h}\,,\,y(s+h,t)\big)\ \lim_{h\to 0}\frac{x(s+h,t) - x(s,t)}{h} \\ &\qquad + \lim_{h\to 0}\frac{\partial f}{\partial y}\big(x(s,t)\,,\,c_{y,h}\big)\ \lim_{h\to 0}\frac{y(s+h,t) - y(s,t)}{h} \\ &= \frac{\partial f}{\partial x}\big(x(s,t)\,,\,y(s,t)\big)\,\frac{\partial x}{\partial s}(s,t) + \frac{\partial f}{\partial y}\big(x(s,t)\,,\,y(s,t)\big)\,\frac{\partial y}{\partial s}(s,t) \end{align*} \]
We can of course follow the same procedure to evaluate the partial derivative with respect to t. This concludes the proof of
Theorem 2.4.1.
Exercises
Stage 1
1
Write out the chain rule for each of the following functions.
1. \(\frac{\partial h}{\partial x}\) for \(h(x,y) = f\big(x,u(x,y)\big)\)
2. \(\frac{dh}{dx}\) for \(h(x) = f\big(x,u(x),v(x)\big)\)
3. \(\frac{\partial h}{\partial x}\) for \(h(x,y,z) = f\big(u(x,y,z),v(x,y),w(x)\big)\)
2
A piece of the surface \(z = f(x,y)\) is shown below for some continuously differentiable function \(f(x,y)\). The level curve \(f(x,y) = z_1\) is marked with a blue line. The three points \(P_0\), \(P_1\), and \(P_2\) lie on the surface.
On the level curve \(z = z_1\), we can think of \(y\) as a function of \(x\). Let \(w(x) = f\big(x,y(x)\big) = z_1\). We approximate, at \(P_0\), \(f_x(x,y) \approx \frac{\Delta f}{\Delta x}\) and \(\frac{dw}{dx}(x) \approx \frac{\Delta w}{\Delta x}\). Identify the quantities \(\Delta f\), \(\Delta w\), and \(\Delta x\) from the diagram.
3✳
Let \(w = f(x,y,t)\) with \(x\) and \(y\) depending on \(t\). Suppose that at some point \((x,y)\) and at some time \(t\), the partial derivatives \(f_x\), \(f_y\) and \(f_t\) are equal to \(2\), \(-3\) and \(5\) respectively, while \(\frac{dx}{dt} = 1\) and \(\frac{dy}{dt} = 2\). Find and explain the difference between \(\frac{dw}{dt}\) and \(f_t\).
4
5
What is wrong with the following argument? Suppose that w = f (x, y, z) and z = g(x, y). By the chain rule,
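\[ \frac{\partial w}{\partial x} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial x} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial x} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial x} = \frac{\partial w}{\partial x} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial x} \]
Hence \(0 = \frac{\partial w}{\partial z}\frac{\partial z}{\partial x}\) and so \(\frac{\partial w}{\partial z} = 0\) or \(\frac{\partial z}{\partial x} = 0\).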
Stage 2
6
\(\frac{\partial w}{\partial s}\) and \(\frac{\partial w}{\partial t}\) given that the function \(w = x^2+y^2+z^2\), with \(x = st\), \(y = s\cos t\) and \(z = s\sin t\).
7
Evaluate \(\frac{\partial^3}{\partial x\,\partial y^2}f(2x+3y,xy)\) in terms of partial derivatives of \(f\). You may assume that \(f\) is a smooth function so that the Chain Rule and Clairaut's Theorem on the equality of the mixed partial derivatives apply.
8
Find all second order derivatives of g(s, t) = f (2s + 3t, 3s − 2t). You may assume that f (x, y) is a smooth function so that
the Chain Rule and Clairaut's Theorem on the equality of the mixed partial derivatives apply.
9✳
Assume that \(f(x,y)\) satisfies Laplace's equation \(\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = 0\). Show that this is also the case for the composite function \(g(s,t) = f(s-t,s+t)\). That is, show that \(\frac{\partial^2 g}{\partial s^2} + \frac{\partial^2 g}{\partial t^2} = 0\). You may assume that \(f(x,y)\) is a smooth function so that the Chain Rule and Clairaut's Theorem on the equality of the mixed partial derivatives apply.
10 ✳
Let z = f (x, y) where x = 2s + t and y = s − t. Find the values of the constants a, b and c such that
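\[ a\frac{\partial^2 z}{\partial x^2} + b\frac{\partial^2 z}{\partial x\,\partial y} + c\frac{\partial^2 z}{\partial y^2} = \frac{\partial^2 z}{\partial s^2} + \frac{\partial^2 z}{\partial t^2} \]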
You may assume that z = f (x, y) is a smooth function so that the Chain Rule and Clairaut's Theorem on the equality of the
mixed partial derivatives apply.
11 ✳
Let \(F\) be a function on \(\mathbb{R}^2\). Denote points in \(\mathbb{R}^2\) by \((u,v)\) and the corresponding partial derivatives of \(F\) by \(F_u(u,v)\), \(F_v(u,v)\), \(F_{uu}(u,v)\), \(F_{uv}(u,v)\), etc. Assume those derivatives are all continuous. Express
\[ \frac{\partial^2}{\partial x\,\partial y}F(x^2-y^2,2xy) \]
12 ✳
\(u(x,y)\) is defined as
\[ u(x,y) = e^{y}\,F\big(xe^{-y^2}\big) \]
\(\frac{\partial u}{\partial x}\) and \(\frac{\partial u}{\partial y}\).
13 ✳
Let \(f(x)\) and \(g(x)\) be two functions of \(x\) satisfying \(f''(7) = -2\) and \(g''(-4) = -1\). If \(z = h(s,t) = f(2s+3t) + g(s-6t)\), find \(\frac{\partial^2 z}{\partial t^2}\) when \(s = 2\) and \(t = 1\).
14 ✳
Suppose that w = f (xz, yz), where f is a differentiable function. Show that
\[x\,\frac{\partial w}{\partial x} + y\,\frac{\partial w}{\partial y} = z\,\frac{\partial w}{\partial z}\]
15 ✳
Suppose \(z = f(x,y)\) has continuous second order partial derivatives, and \(x = r\cos t\), \(y = r\sin t\). Express the following
partial derivatives in terms of \(r\), \(t\), and partial derivatives of \(f\).
1. \(\frac{\partial z}{\partial t}\)
2. \(\frac{\partial^2 z}{\partial t^2}\)
16 ✳
Let z = f (x, y), where f (x, y) has continuous second-order partial derivatives, and
fx (2, 1) = 5, fy (2, 1) = −2,
Find \(\frac{d^2}{dt^2}\, z\big(x(t), y(t)\big)\) when \(x(t) = 2t^2\), \(y(t) = t^3\) and \(t = 1\).
17 ✳
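Let \(F(x,y,z)\) be a function that satisfies \(\frac{\partial F}{\partial z} = \frac{\partial^2 F}{\partial x^2} + \frac{\partial^2 F}{\partial y^2}\) and the mixed partial derivatives \(\frac{\partial^2 F}{\partial x\,\partial y}\) and \(\frac{\partial^2 F}{\partial y\,\partial x}\)
are equal. Let \(A\) be some constant and let \(G(\gamma, s, t) = F(\gamma + s,\, \gamma - s,\, At)\). Find the value of \(A\) such that
\[\frac{\partial G}{\partial t} = \frac{\partial^2 G}{\partial \gamma^2} + \frac{\partial^2 G}{\partial s^2}.\]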
18 ✳
19 ✳
Let f (u, v) be a differentiable function of two variables, and let z be a differentiable function of x and y defined implicitly by
f (xz, yz) = 0. Show that
\[x\,\frac{\partial z}{\partial x} + y\,\frac{\partial z}{\partial y} = -z\]
20 ✳
Let w(s, t) = u(2s + 3t, 3s − 2t) for some twice differentiable function u = u(x, y).
1. Find \(w_{ss}\) in terms of \(u_{xx}\), \(u_{xy}\), and \(u_{yy}\). You can assume that \(u_{xy} = u_{yx}\).
21 ✳
Suppose that \(f(x,y)\) is twice differentiable (with \(f_{xy} = f_{yx}\)), and \(x = r\cos\theta\) and \(y = r\sin\theta\).
1. Evaluate \(f_\theta\), \(f_r\) and \(f_{r\theta}\) in terms of \(r\), \(\theta\) and partial derivatives of \(f\) with respect to \(x\) and \(y\).
2. Let \(g(x,y)\) be another function satisfying \(g_x = f_y\) and \(g_y = -f_x\). Express \(f_r\) and \(f_\theta\) in terms of \(r\), \(\theta\) and \(g_r\), \(g_\theta\).
22 ✳
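\[\nabla f(x_0,y_0) = \left\langle \frac{\partial f}{\partial x}(x_0,y_0)\,,\, \frac{\partial f}{\partial y}(x_0,y_0) \right\rangle\]
\[\nabla g(1,2) = \langle -1, 4\rangle,\]
and
Assuming \(g(1,2) = 3\), \(h(1,2) = 6\), and \(z(s,t) = f\big(g(s,t), h(s,t)\big)\), find
\[\nabla z(1,2)\]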
23 ✳
1. Let f be an arbitrary differentiable function defined on the entire real line. Show that the function w defined on the entire
plane as
\[w(x,y) = e^{-y} f(x-y)\]
∂x
at the
point (u, v) = (2, 1) which corresponds to the point (x, y) = (2, 11).
24 ✳
The equations
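\[x^2 - y\cos(uv) = v\]
\[x^2 + y^2 - \sin(uv) = \frac{4}{\pi}\,u\]
define \(x\) and \(y\) implicitly as functions of \(u\) and \(v\) (i.e. \(x = x(u,v)\), and \(y = y(u,v)\)) near the point \((x,y) = (1,1)\) at which
\((u,v) = \big(\tfrac{\pi}{2}, 0\big)\).
1. Find \(\frac{\partial x}{\partial u}\) and \(\frac{\partial y}{\partial u}\) at \((u,v) = \big(\tfrac{\pi}{2}, 0\big)\).
2. If \(z = x^4 + y^4\), determine \(\frac{\partial z}{\partial u}\) at the point \((u,v) = \big(\tfrac{\pi}{2}, 0\big)\).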
25 ✳
Let f (u, v) be a differentiable function, and let u = x + y and v = x − y. Find a constant, α, such that
\[(f_x)^2 + (f_y)^2 = \alpha\big((f_u)^2 + (f_v)^2\big)\]
Stage 3
26
arises in many models involving wave-like phenomena. Let u(x, t) and v(ξ, η) be related by the change of variables
ξ(x, t) = x − ct
η(x, t) = x + ct
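1. Show that \(\frac{\partial^2 u}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2 u}{\partial t^2} = 0\) if and only if \(\frac{\partial^2 v}{\partial \xi\,\partial \eta} = 0\).
2. Show that \(\frac{\partial^2 u}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2 u}{\partial t^2} = 0\) if and only if \(u(x,t) = F(x - ct) + G(x + ct)\) for some functions \(F\) and \(G\).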
3. Interpret F (x − ct) + G(x + ct) in terms of travelling waves. Think of u(x, t) as the height, at position x and time t, of
a wave that is travelling along the x-axis.
Remark: Don't be thrown by the strange symbols ξ and η. They are just two harmless letters from the Greek alphabet, called
“xi” and “eta” respectively.
27
Evaluate
1. \(\frac{\partial y}{\partial z}\) if \(e^{yz} - x^2 z \ln y = \pi\)
2. \(\frac{dy}{dx}\) if \(F(x, y, x^2 - y^2) = 0\)
3. \(\left(\frac{\partial y}{\partial x}\right)_u\) if \(xyuv = 1\) and \(x + y + u + v = 0\)
This page titled 2.4: The Chain Rule is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Joel Feldman,
Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit
history is available upon request.
2.5: Tangent Planes and Normal Lines
The tangent line to the curve \(y = f(x)\) at the point \((x_0, f(x_0))\) is the straight line that fits the curve best¹ at that point. Finding
tangent lines was probably one of the first applications of derivatives that you saw. See, for example, Theorem 2.3.2 in the CLP-1
text. The analog of the tangent line one dimension up is the tangent plane. The tangent plane to a surface \(S\) at a point \((x_0, y_0, z_0)\) is
the plane that fits \(S\) best at \((x_0, y_0, z_0)\). For example, the tangent plane to the hemisphere
\[S = \{(x,y,z)\ |\ x^2 + y^2 + (z-1)^2 = 1,\ 0 \le z \le 1\}\]
We are now going to determine, as our first application of partial derivatives, the tangent plane to a general surface \(S\) at a general
point \((x_0, y_0, z_0)\) lying on the surface. We will also determine the line which passes through \((x_0, y_0, z_0)\) and whose direction is
For example, the following figure shows the side view of the tangent plane (in black) and normal line (in blue) to the surface
\(z = x^2 + y^2\) (in red) at the point \((0, 1, 1)\).
Furthermore we can use any (nonzero) vector that is perpendicular to \(S\) at \((x_0, y_0, z_0)\) as both the normal vector to the tangent
form z = f (x, y) and then, more generally, for surfaces of the form G(x, y, z) = 0.
the specified surface at the specified point, and, second, taking the cross product of those two tangent vectors. Consider the red
curve in the figure below. It is the intersection of our surface z = f (x, y)
2.5.1 https://math.libretexts.org/@go/page/89213
with the plane \(y = y_0\). Here is a side view of the red curve.
The vector from the point \((x_0, y_0, f(x_0,y_0))\), on the red curve, to the point \((x_0+h, y_0, f(x_0+h,y_0))\), also on the red curve,
is almost tangent to the red curve, if \(h\) is very small. As \(h\) tends to \(0\), that vector, which is
\[\langle h\,,\, 0\,,\, f(x_0+h,y_0) - f(x_0,y_0) \rangle\]
becomes exactly tangent to the curve. However its length also tends to \(0\). If we divide by \(h\), and then take the limit \(h \to 0\), we get
\[\lim_{h\to 0} \frac{1}{h}\langle h\,,\, 0\,,\, f(x_0+h,y_0) - f(x_0,y_0) \rangle = \lim_{h\to 0} \left\langle 1\,,\, 0\,,\, \frac{f(x_0+h,y_0) - f(x_0,y_0)}{h} \right\rangle\]
Since the limit \(\lim_{h\to 0} \frac{f(x_0+h,y_0) - f(x_0,y_0)}{h}\) is the definition of the partial derivative \(f_x(x_0,y_0)\), we get that
\[\lim_{h\to 0} \frac{1}{h}\langle h\,,\, 0\,,\, f(x_0+h,y_0) - f(x_0,y_0) \rangle = \langle 1\,,\, 0\,,\, f_x(x_0,y_0) \rangle\]
is a nonzero vector that is exactly tangent to the red curve and hence is also tangent to our surface \(z = f(x,y)\) at the point
\((x_0, y_0, f(x_0,y_0))\).
For the second tangent vector, we repeat the process with the blue curve in the figure at the beginning of this subsection. That blue
curve is the intersection of our surface \(z = f(x,y)\) with the plane \(x = x_0\). Here is a front view of the blue curve.
The vector
\[\frac{1}{h}\langle 0\,,\, h\,,\, f(x_0,y_0+h) - f(x_0,y_0) \rangle\]
from the point \((x_0, y_0, f(x_0,y_0))\), on the blue curve, to \((x_0, y_0+h, f(x_0,y_0+h))\), also on the blue curve, (and lengthened
by a factor \(\frac{1}{h}\)) is almost tangent to the blue curve. Taking the limit \(h \to 0\) gives the tangent vector
\[\lim_{h\to 0} \frac{1}{h}\langle 0\,,\, h\,,\, f(x_0,y_0+h) - f(x_0,y_0) \rangle = \lim_{h\to 0} \left\langle 0\,,\, 1\,,\, \frac{f(x_0,y_0+h) - f(x_0,y_0)}{h} \right\rangle = \langle 0\,,\, 1\,,\, f_y(x_0,y_0) \rangle\]
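Now that we have two vectors in the tangent plane to the surface \(z = f(x,y)\) at \((x_0, y_0, f(x_0,y_0))\), we can find a normal vector
to the tangent plane by taking their cross product. Their cross product is
\[\langle 1, 0, f_x(x_0,y_0) \rangle \times \langle 0, 1, f_y(x_0,y_0) \rangle = \det \begin{bmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ 1 & 0 & f_x(x_0,y_0) \\ 0 & 1 & f_y(x_0,y_0) \end{bmatrix} = -f_x(x_0,y_0)\,\hat{\imath} - f_y(x_0,y_0)\,\hat{\jmath} + \hat{k}\]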
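The tangent plane to the surface \(z = f(x,y)\) at \((x_0, y_0, f(x_0,y_0))\) is the plane through \((x_0, y_0, f(x_0,y_0))\) with normal vector
\(-f_x(x_0,y_0)\,\hat{\imath} - f_y(x_0,y_0)\,\hat{\jmath} + \hat{k}\). This plane has equation
\[-f_x(x_0,y_0)\,(x - x_0) - f_y(x_0,y_0)\,(y - y_0) + \big(z - f(x_0,y_0)\big) = 0\]
or, equivalently,
\[z = f(x_0,y_0) + f_x(x_0,y_0)\,(x - x_0) + f_y(x_0,y_0)\,(y - y_0)\]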
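Now that we have the normal vector, finding the equation of the normal line to the surface \(z = f(x,y)\) at the point
\((x_0, y_0, f(x_0,y_0))\) is straightforward. Writing it in parametric form,
\[\langle x, y, z \rangle = \langle x_0, y_0, f(x_0,y_0) \rangle + t\,\langle -f_x(x_0,y_0)\,,\, -f_y(x_0,y_0)\,,\, 1 \rangle\]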
By way of summary
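2. The equation of the tangent plane to the surface \(z = f(x,y)\) at the point \((x_0, y_0, f(x_0,y_0))\) may be written as
\[z = f(x_0,y_0) + f_x(x_0,y_0)\,(x - x_0) + f_y(x_0,y_0)\,(y - y_0)\]
3. The parametric equation of the normal line to the surface \(z = f(x,y)\) at the point \((x_0, y_0, f(x_0,y_0))\) is
\[\langle x, y, z \rangle = \langle x_0, y_0, f(x_0,y_0) \rangle + t\,\langle -f_x(x_0,y_0)\,,\, -f_y(x_0,y_0)\,,\, 1 \rangle\]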
Example 2.5.2
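As a warm-up example, we'll find the tangent plane and normal line to the surface \(z = x^2 + y^2\) at the point \((1, 0, 1)\). To do so,
we just apply Theorem 2.5.1 with \(x_0 = 1\), \(y_0 = 0\) and
\[\begin{aligned} f(x,y) &= x^2 + y^2 & f(1,0) &= 1 \\ f_x(x,y) &= 2x & f_x(1,0) &= 2 \\ f_y(x,y) &= 2y & f_y(1,0) &= 0 \end{aligned}\]
So the tangent plane is
\[z = f(1,0) + f_x(1,0)\,(x-1) + f_y(1,0)\,(y-0) = 1 + 2(x-1) + 0(y-0) = -1 + 2x\]
and the normal line is
\[\langle x, y, z \rangle = \langle 1, 0, f(1,0) \rangle + t\,\langle -f_x(1,0), -f_y(1,0), 1 \rangle = \langle 1, 0, 1 \rangle + t\,\langle -2, 0, 1 \rangle = \langle 1 - 2t\,,\, 0\,,\, 1 + t \rangle\]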
That was pretty simple — find the partial derivatives and substitute in the coordinates. Let's do something a bit more challenging.
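The recipe "find the partial derivatives and substitute in the coordinates" can also be checked numerically. The following is a minimal sketch, not part of the original text: the helper names `partials`, `tangent_plane` and `normal_vector` are hypothetical, and the partial derivatives are estimated by central differences rather than computed symbolically.

```python
def partials(f, x0, y0, h=1e-6):
    """Central-difference estimates of f_x and f_y at (x0, y0)."""
    fx = (f(x0 + h, y0) - f(x0 - h, y0)) / (2 * h)
    fy = (f(x0, y0 + h) - f(x0, y0 - h)) / (2 * h)
    return fx, fy


def tangent_plane(f, x0, y0):
    """Return the function (x, y) -> z describing the tangent plane at (x0, y0)."""
    fx, fy = partials(f, x0, y0)
    z0 = f(x0, y0)
    return lambda x, y: z0 + fx * (x - x0) + fy * (y - y0)


def normal_vector(f, x0, y0):
    """Return the normal vector <-f_x, -f_y, 1> at (x0, y0)."""
    fx, fy = partials(f, x0, y0)
    return (-fx, -fy, 1.0)


def paraboloid(x, y):
    return x**2 + y**2


plane = tangent_plane(paraboloid, 1.0, 0.0)
normal = normal_vector(paraboloid, 1.0, 0.0)

# The text finds z = -1 + 2x, so at x = 2 the plane height should be close to 3.
print(plane(2.0, 5.0))
# The text finds the normal direction <-2, 0, 1>.
print(normal)
```

Because the surface is quadratic, the central differences agree with the exact partial derivatives up to rounding error.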
Solution
Write \(f(x,y) = x^2 + y^2\). Let's denote by \((a, b, f(a,b))\) the point on \(z = f(x,y)\) that is nearest \((0, 3, 0)\). Before we really get
into the problem, let's make a simple sketch and think about what the lines from (0, 3, 0) to the surface look like and, in
particular, the angles between these lines and the surface.
The line from (0, 3, 0) to (a, b, f (a, b)), the point on z = f (x, y) nearest (0, 3, 0), is distinguished from the other lines from
(0, 3, 0) to the surface, by being perpendicular to the surface. We will provide a detailed justification for this claim below.
Let's first exploit the fact that the vector from \((0, 3, 0)\) to \((a, b, f(a,b))\) must be perpendicular to the surface to determine
\((a, b, f(a,b))\), and consequently the distance from \((0, 3, 0)\) to the surface. By Theorem 2.5.1.a, with \(x_0 = a\) and \(y_0 = b\), the
vector
\[-f_x(a,b)\,\hat{\imath} - f_y(a,b)\,\hat{\jmath} + \hat{k} = -2a\,\hat{\imath} - 2b\,\hat{\jmath} + \hat{k} \tag{$*$}\]
is normal to the surface \(z = f(x,y)\) at \((a, b, f(a,b))\). So the vector from \((0, 3, 0)\) to \((a, b, f(a,b))\), namely
\[a\,\hat{\imath} + (b-3)\,\hat{\jmath} + f(a,b)\,\hat{k} = a\,\hat{\imath} + (b-3)\,\hat{\jmath} + (a^2 + b^2)\,\hat{k} \tag{$**$}\]
must be parallel to (∗). This does not force the vector (∗) to equal (∗∗), but it does force the existence of some number \(t\)
obeying
\[a\,\hat{\imath} + (b-3)\,\hat{\jmath} + (a^2 + b^2)\,\hat{k} = t\big( -2a\,\hat{\imath} - 2b\,\hat{\jmath} + \hat{k} \big)\]
or equivalently
\[\begin{cases} a = -2a\,t \\ b - 3 = -2b\,t \\ a^2 + b^2 = t \end{cases}\]
We now have a system of three equations in the three unknowns a, b and t. If we can solve them, we will have found the point
on the surface that we want.
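The first equation is \(a(1 + 2t) = 0\) so that either \(a = 0\) or \(t = -\frac{1}{2}\). If \(t = -\frac{1}{2}\), the third equation would force
\(a^2 + b^2 = -\frac{1}{2}\), which is impossible. So \(a = 0\). Substituting \(a = 0\) and \(t = b^2\) (from the third equation) into the second
equation gives \(b - 3 = -2b^3\), that is, \(2b^3 + b - 3 = 0\).
In general, cubic equations are very hard to solve². But, in this case, we can guess one solution³, namely \(b = 1\). So \((b - 1)\)
must be a factor of \(2b^3 + b - 3\), and indeed
\[0 = 2b^3 + b - 3 = (b - 1)(2b^2 + 2b + 3)\]
We can now find the roots of the quadratic factor by using the high school formula
\[b = \frac{-2 \pm \sqrt{2^2 - 4(2)(3)}}{2(2)}\]
Since \(2^2 - 4(2)(3) < 0\), the factor \(2b^2 + 2b + 3\) has no real roots. So the only real solution to the cubic equation
\(2b^3 + b - 3 = 0\) is \(b = 1\).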
In summary,
\(a = 0\), \(b = 1\) and
the point on \(z = x^2 + y^2\) nearest \((0, 3, 0)\) is \((0, 1, 1)\) and
the distance from \((0, 3, 0)\) to \(z = x^2 + y^2\) is the distance from \((0, 3, 0)\) to \((0, 1, 1)\), which is \(\sqrt{(-2)^2 + 1^2} = \sqrt{5}\).
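As a sanity check on this summary, the cubic \(2b^3 + b - 3 = 0\) can be solved numerically and the resulting distance compared with \(\sqrt{5}\). This is only an illustrative sketch: the helper `bisect_root` is a hypothetical name, and bisection is a swapped-in technique that works here because \(2b^3 + b - 3\) is increasing in \(b\); the text itself factors the cubic by hand.

```python
import math


def cubic(b):
    """Left-hand side of 2b^3 + b - 3 = 0 from the a = 0 case."""
    return 2 * b**3 + b - 3


def bisect_root(g, lo, hi, tol=1e-12):
    """Bisection for an increasing function g with g(lo) < 0 < g(hi)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


a = 0.0
b = bisect_root(cubic, 0.0, 2.0)
nearest = (a, b, a**2 + b**2)            # point on z = x^2 + y^2
distance = math.dist((0.0, 3.0, 0.0), nearest)

print(nearest)    # close to (0, 1, 1)
print(distance)   # close to sqrt(5)
```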
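Finally back to the claim that, because \((a, b, f(a,b))\) is the point on \(z = f(x,y)\) that is nearest⁴ \((0, 3, 0)\), the vector from
\((0, 3, 0)\) to \((a, b, f(a,b))\) must be perpendicular to the surface \(z = f(x,y)\) at \((a, b, f(a,b))\). Note that the square of the
distance from \((0, 3, 0)\) to a general point \((x, y, f(x,y))\) on the surface is \(D(x,y) = x^2 + (y-3)^2 + f(x,y)^2\).
Restricting our attention to the slice \(y = b\) of the surface, \(x = a\) minimizes \(g(x) = D(x,b) = x^2 + (b-3)^2 + f(x,b)^2\) so
that
\[0 = g'(a) = \frac{\partial}{\partial x}\Big[ x^2 + (b-3)^2 + f(x,b)^2 \Big]\Big|_{x=a} = 2\,\langle a\,,\, b-3\,,\, f(a,b) \rangle \cdot \langle 1\,,\, 0\,,\, f_x(a,b) \rangle\]
and
restricting our attention to the slice \(x = a\) of the surface, \(y = b\) minimizes \(h(y) = D(a,y) = a^2 + (y-3)^2 + f(a,y)^2\) so
that
\[0 = h'(b) = \frac{\partial}{\partial y}\Big[ a^2 + (y-3)^2 + f(a,y)^2 \Big]\Big|_{y=b} = 2\,\langle a\,,\, b-3\,,\, f(a,b) \rangle \cdot \langle 0\,,\, 1\,,\, f_y(a,b) \rangle\]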
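We have expressed the final right hand sides of both of the above bullets as the dot product of the vector \(\langle a\,,\, b-3\,,\, f(a,b) \rangle\)
with another vector, because the vanishing of the dot product of two vectors implies that the two vectors are perpendicular.
Thus, that
\[\langle a\,,\, b-3\,,\, f(a,b) \rangle \cdot \langle 1\,,\, 0\,,\, f_x(a,b) \rangle = \langle a\,,\, b-3\,,\, f(a,b) \rangle \cdot \langle 0\,,\, 1\,,\, f_y(a,b) \rangle = 0\]
tells us that the vector \(\langle a\,,\, b-3\,,\, f(a,b) \rangle\) from \((0, 3, 0)\) to \((a, b, f(a,b))\) is perpendicular to both \(\langle 1\,,\, 0\,,\, f_x(a,b) \rangle\) and
\(\langle 0\,,\, 1\,,\, f_y(a,b) \rangle\) and hence is parallel to their cross product \(\langle 1\,,\, 0\,,\, f_x(a,b) \rangle \times \langle 0\,,\, 1\,,\, f_y(a,b) \rangle\), which we already know is a
normal vector to the surface \(z = f(x,y)\) at \((a, b, f(a,b))\).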
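Because you are walking along the surface, we know that \(\vec{r}(t)\) always lies on the surface and so
\[G\big(x(t)\,,\, y(t)\,,\, z(t)\big) = 0\]
for all \(t\). Differentiating this equation with respect to \(t\) gives, by the chain rule,
\[\frac{\partial G}{\partial x}\big(x(t), y(t), z(t)\big)\, x'(t) + \frac{\partial G}{\partial y}\big(x(t), y(t), z(t)\big)\, y'(t) + \frac{\partial G}{\partial z}\big(x(t), y(t), z(t)\big)\, z'(t) = 0\]
Setting \(t = 0\) and expressing this as a dot product allows us to turn this into a statement about vectors.
\[\left\langle \frac{\partial G}{\partial x}(x_0,y_0,z_0)\,,\, \frac{\partial G}{\partial y}(x_0,y_0,z_0)\,,\, \frac{\partial G}{\partial z}(x_0,y_0,z_0) \right\rangle \cdot \vec{r}\,'(0) = 0 \tag{$*$}\]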
The first vector in this dot product is sufficiently important that it is given its own name.
Definition 2.5.4. Gradient
The gradient⁵ of the function \(G(x,y,z)\) at the point \((x_0, y_0, z_0)\) is
\[\left\langle \frac{\partial G}{\partial x}(x_0,y_0,z_0)\,,\, \frac{\partial G}{\partial y}(x_0,y_0,z_0)\,,\, \frac{\partial G}{\partial z}(x_0,y_0,z_0) \right\rangle\]
It is denoted \(\nabla G(x_0, y_0, z_0)\).
So (∗) tells us that the gradient \(\nabla G(x_0, y_0, z_0)\) is perpendicular to the vector \(\vec{r}\,'(0)\). Now \(\vec{r}\,'(0)\) is the velocity at time \(t = 0\) and
is thus exactly tangent to our path, and consequently to the surface \(G(x,y,z) = 0\) at \((x_0, y_0, z_0)\). This is true for all paths on the
surface that pass through \((x_0, y_0, z_0)\) at time \(t = 0\), which tells us that \(\nabla G(x_0, y_0, z_0)\) is perpendicular to the surface at
\((x_0, y_0, z_0)\). We have just found a normal vector!
The above argument goes through unchanged for surfaces of the form⁶ \(G(x,y,z) = K\), for any constant \(K\). So we have
Let \(K\) be a constant and \((x_0, y_0, z_0)\) be a point on the surface \(G(x,y,z) = K\). Assume that the gradient
\[\nabla G(x_0,y_0,z_0) = \left\langle \frac{\partial G}{\partial x}(x_0,y_0,z_0)\,,\, \frac{\partial G}{\partial y}(x_0,y_0,z_0)\,,\, \frac{\partial G}{\partial z}(x_0,y_0,z_0) \right\rangle\]
of \(G\) at \((x_0, y_0, z_0)\) is nonzero.
1. The vector \(\nabla G(x_0, y_0, z_0)\) is normal to the surface \(G(x,y,z) = K\) at \((x_0, y_0, z_0)\).
2. The equation of the tangent plane to the surface \(G(x,y,z) = K\) at \((x_0, y_0, z_0)\) is
\[\nabla G(x_0,y_0,z_0) \cdot \langle x - x_0\,,\, y - y_0\,,\, z - z_0 \rangle = 0\]
Remark 2.5.6
Theorem 2.5.1 about the tangent planes and normal lines to the surface z = f (x, y) is actually a very simple consequence of
Theorem 2.5.5 about the tangent planes and normal lines to the surface G(x, y, z) = 0. This is just because we can always
rewrite the equation z = f (x, y) as z − f (x, y) = 0 and apply Theorem 2.5.5 with G(x, y, z) = z − f (x, y). Since
\[\nabla G(x_0,y_0,z_0) = -f_x(x_0,y_0)\,\hat{\imath} - f_y(x_0,y_0)\,\hat{\jmath} + \hat{k}\]
Example 2.5.7
Find the tangent plane and the normal line to the surface
\[z = x^2 + 5xy - 2y^2\]
at the point \((1, 2, 3)\).
As a preliminary check, note that
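\[1^2 + 5 \times 1 \times 2 - 2(2^2) = 3\]
which verifies that the point \((1, 2, 3)\) is indeed on the surface. This is a good reality check and also increases our confidence
that the question is asking what we think that it is asking. Rewrite the equation of the surface as
\(G(x,y,z) = x^2 + 5xy - 2y^2 - z = 0\). Then the gradient
\[\nabla G(x,y,z) = (2x + 5y)\,\hat{\imath} + (5x - 4y)\,\hat{\jmath} - \hat{k}\]
evaluated at \((1, 2, 3)\), namely \(\vec{n} = \nabla G(1,2,3) = 12\,\hat{\imath} - 3\,\hat{\jmath} - \hat{k}\),
is a normal vector to the surface at \((1, 2, 3)\). Equipped⁸ with the normal, it is easy to work out an equation for the tangent
plane.
\[\vec{n} \cdot \langle x - 1\,,\, y - 2\,,\, z - 3 \rangle = \langle 12\,,\, -3\,,\, -1 \rangle \cdot \langle x - 1\,,\, y - 2\,,\, z - 3 \rangle = 0\]
or
\[12x - 3y - z = 3\]
We can quickly check that the point \((1, 2, 3)\) does indeed lie on the plane:
\[12 \times 1 - 3 \times 2 - 3 = 3\]
The normal line is
\[\langle x - 1\,,\, y - 2\,,\, z - 3 \rangle = t\,\vec{n} = t\,\langle 12\,,\, -3\,,\, -1 \rangle\]
or
\[\frac{x-1}{12} = \frac{y-2}{-3} = \frac{z-3}{-1} \ \big( = t \big)\]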
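The computation in this example follows a fixed pattern: evaluate the gradient at the point, then read off the plane \(\vec{n} \cdot \langle x, y, z \rangle = \vec{n} \cdot \langle x_0, y_0, z_0 \rangle\). Here is a small sketch of that pattern; the function names `grad_G` and `tangent_plane_coeffs` are illustrative choices, not notation from the text, and the gradient is entered by hand.

```python
def grad_G(x, y, z):
    """Gradient of G(x, y, z) = x^2 + 5xy - 2y^2 - z, computed by hand."""
    return (2 * x + 5 * y, 5 * x - 4 * y, -1.0)


def tangent_plane_coeffs(grad, point):
    """Return (n, d) so that the tangent plane at `point` is n . <x, y, z> = d.

    The gradient is evaluated at the point first, as the theorem requires.
    """
    n = grad(*point)
    d = sum(ni * pi for ni, pi in zip(n, point))
    return n, d


n, d = tangent_plane_coeffs(grad_G, (1.0, 2.0, 3.0))
print(n, d)   # normal <12, -3, -1> and right-hand side 3, i.e. 12x - 3y - z = 3


def normal_line(point, n, t):
    """Point on the normal line <x, y, z> = point + t n."""
    return tuple(pi + t * ni for pi, ni in zip(point, n))


print(normal_line((1.0, 2.0, 3.0), n, 1.0))   # (13, -1, 2)
```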
Another warm-up example. This time the surface is a hyperboloid of one sheet.
Example 2.5.8
Find the tangent plane and the normal line to the surface
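\[x^2 + y^2 - z^2 = 4\]
at the point \((2, -3, 3)\).
Write \(G(x,y,z) = x^2 + y^2 - z^2\). Since \(\nabla G(2,-3,3) = \langle 2x\,,\, 2y\,,\, -2z \rangle\big|_{(2,-3,3)} = 2\,\langle 2\,,\, -3\,,\, -3 \rangle\), the vector \(\vec{n} = \langle 2\,,\, -3\,,\, -3 \rangle\)
is a normal vector to the surface at \((2, -3, 3)\). The tangent plane is
\[\vec{n} \cdot \langle x - 2\,,\, y + 3\,,\, z - 3 \rangle = \langle 2\,,\, -3\,,\, -3 \rangle \cdot \langle x - 2\,,\, y + 3\,,\, z - 3 \rangle = 0\]
or
\[2x - 3y - 3z = 4\]
Again, as a check, we can verify that our point \((2, -3, 3)\) is indeed on the plane:
\[2 \times 2 - 3 \times (-3) - 3 \times 3 = 4\]
The normal line is
\[\langle x - 2\,,\, y + 3\,,\, z - 3 \rangle = t\,\vec{n} = t\,\langle 2\,,\, -3\,,\, -3 \rangle\]
or
\[\frac{x-2}{2} = \frac{y+3}{-3} = \frac{z-3}{-3} \ \big( = t \big)\]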
Warning 2.5.9
The vector \(\nabla G(x,y,z)\) is not a normal vector to the surface \(G(x,y,z) = K\) at \((x_0, y_0, z_0)\). The vector
\(\nabla G(x_0,y_0,z_0)\) is a normal vector to \(G(x,y,z) = K\) at \((x_0, y_0, z_0)\) (provided \(G(x_0,y_0,z_0) = K\)).
As an example of the consequences of failing to evaluate \(\nabla G(x,y,z)\) at the point \((x_0, y_0, z_0)\), consider the problem
Find the tangent plane to the surface \(x^2 + y^2 + z^2 = 1\) at the point \((0, 0, 1)\).
Here \(G(x,y,z) = x^2 + y^2 + z^2\) and \(\nabla G(x,y,z) = \langle 2x\,,\, 2y\,,\, 2z \rangle\). Correctly applying part (b) of Theorem 2.5.5, with the
gradient evaluated at \((0, 0, 1)\), gives the tangent plane
\[\nabla G(0,0,1) \cdot \langle x - 0\,,\, y - 0\,,\, z - 1 \rangle = 0 \quad\text{or}\quad 2(z - 1) = 0 \quad\text{or}\quad z = 1\]
This is of course correct: the tangent plane to the unit sphere at the north pole is indeed horizontal.
But if we were to incorrectly apply part (b) of Theorem 2.5.5 by failing to evaluate \(\nabla G(x,y,z)\) at \((0, 0, 1)\), we would find
that the “tangent plane” is
\[\nabla G(x,y,z) \cdot \langle x - 0\,,\, y - 0\,,\, z - 1 \rangle = 0 \quad\text{or}\quad x^2 + y^2 + z^2 - z = 0\]
This is horribly wrong. It is not even a plane, as any plane has an equation of the form ax + by + cz = d, with a, b, c and d
constants.
Example 2.5.10
Suppose that we wish to find the highest and lowest points on the surface \(G(x,y,z) = x^2 - 2x + y^2 - 4y + z^2 - 6z = 2\).
That is, we wish to find the points on the surface with the maximum value of \(z\) and with the minimum⁹ value of \(z\).
Completing three squares,
\[\begin{aligned} G(x,y,z) &= x^2 - 2x + y^2 - 4y + z^2 - 6z \\ &= (x-1)^2 + (y-2)^2 + (z-3)^2 - 14. \end{aligned}\]
So the surface G(x, y, z) = 2 is a sphere, whose highest point is the north pole and whose lowest point is the south pole. But
let's pretend that G(x, y, z) = 2 is some complicated surface that we can't easily picture.
We'll find its highest and lowest points by exploiting the fact that the tangent plane to G = 2 is horizontal at the highest and
lowest points. Equivalently, the normal vector to G = 2 is vertical at the highest and lowest points. To see that this is the case,
look at the figure below. If the tangent plane at \((x_0, y_0, z_0)\) is not horizontal, then the tangent plane contains points near
\((x_0, y_0, z_0)\) with \(z\) bigger than \(z_0\) and points near \((x_0, y_0, z_0)\) with \(z\) smaller than \(z_0\). Near \((x_0, y_0, z_0)\), the tangent plane is a
good approximation to the surface. So the surface also contains¹⁰ such points.
The gradient is
\[\nabla G(x,y,z) = (2x - 2)\,\hat{\imath} + (2y - 4)\,\hat{\jmath} + (2z - 6)\,\hat{k}\]
It is vertical when its \(\hat{\imath}\) and \(\hat{\jmath}\) components vanish, that is, when \(x = 1\) and
\(y = 2\). So the normal vector to the surface \(G = 2\) at the point \((x, y, z)\) is vertical when \(x = 1\), \(y = 2\) and (don't forget that
\((x, y, z)\) has to be on \(G = 2\))
\[\begin{aligned} G(1,2,z) = 1^2 - 2 \times 1 + 2^2 - 4 \times 2 + z^2 - 6z = 2 &\iff z^2 - 6z - 7 = 0 \\ &\iff (z - 7)(z + 1) = 0 \\ &\iff z = 7,\ -1 \end{aligned}\]
The highest point is (1, 2, 7) and the lowest point is (1, 2, −1), as expected.
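The last step, solving for \(z\) once \(x = 1\) and \(y = 2\) are known, is easy to reproduce numerically. The sketch below is illustrative only (the variable names are not from the text); it plugs \(x = 1\), \(y = 2\) into \(G = 2\) and applies the quadratic formula.

```python
import math


def G(x, y, z):
    return x**2 - 2 * x + y**2 - 4 * y + z**2 - 6 * z


x, y = 1.0, 2.0   # where the gradient (2x - 2, 2y - 4, 2z - 6) is vertical

# G(1, 2, z) = z^2 - 6z + G(1, 2, 0), so G = 2 reads A z^2 + B z + C = 0 with:
A, B, C = 1.0, -6.0, G(x, y, 0.0) - 2.0

disc = B * B - 4 * A * C
z_low = (-B - math.sqrt(disc)) / (2 * A)
z_high = (-B + math.sqrt(disc)) / (2 * A)

print((x, y, z_high), (x, y, z_low))   # highest and lowest points
print(G(x, y, z_high), G(x, y, z_low)) # both should equal 2
```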
We could have short-cut the last example by using that the surface was a sphere. Here is an example in the same spirit for which we
don't have an easy short-cut.
Example 2.5.11
In the last example, we found the points on a specified surface having the largest and smallest values of \(z\). We'll now ramp up
the level of difficulty a bit and find the points on the surface \(x^2 + 2y^2 + 3z^2 = 72\) that have the largest and smallest values of
\(x + y + 3z\).
To develop a strategy for tackling this problem, consider the following sketch.
The red ellipse in the sketch is intended to represent (schematically) our surface
\[x^2 + 2y^2 + 3z^2 = 72\]
which is an ellipsoid. The middle diagonal (black) line is intended to represent (schematically) the plane x + y + 3z = C for
some more or less randomly chosen value of the constant C. At each point on that plane, the function, x + y + 3z, (that we are
trying to maximize and minimize) takes the value C . In particular, for the C chosen in the figure, x + y + 3z = C does
intersect our surface, indicating that x + y + 3z does indeed take the value C somewhere on our surface.
To maximize \(x + y + 3z\), imagine slowly increasing the value of \(C\). As we do so, the plane \(x + y + 3z = C\) moves to the
right. We want to stop increasing \(C\) at the biggest value of \(C\) for which the plane \(x + y + 3z = C\) intersects our surface
\(x^2 + 2y^2 + 3z^2 = 72\). For that value of \(C\) the plane \(x + y + 3z = C\), which is represented by the right hand blue line in the
sketch, is again tangent to our surface. The previous Example 2.5.10 was similar, except that the plane was \(z = C\).
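We are now ready to compute. We need to find the points \((a, b, c)\) (in the sketch, they are the black dot points of tangency) for
which
\((a, b, c)\) is on the surface and
the normal vector to the surface at \((a, b, c)\) is parallel to \(\langle 1, 1, 3 \rangle\), which is a normal vector to the plane \(x + y + 3z = C\).
Since the gradient of \(x^2 + 2y^2 + 3z^2\) is \(\langle 2x\,,\, 4y\,,\, 6z \rangle\), these two conditions are
\[a^2 + 2b^2 + 3c^2 = 72 \qquad\text{and}\qquad \langle 2a\,,\, 4b\,,\, 6c \rangle = 2t\,\langle 1\,,\, 1\,,\, 3 \rangle \ \text{ for some number } t\]
The second condition says that \(a = t\), \(b = \frac{t}{2}\) and \(c = t\). Substituting this into the first equation gives
\[t^2 + \frac{1}{2}t^2 + 3t^2 = 72 \iff \frac{9}{2}t^2 = 72 \iff t^2 = 16 \iff t = \pm 4\]
So
the point on the surface \(x^2 + 2y^2 + 3z^2 = 72\) at which \(x + y + 3z\) takes its maximum value is
\[(a, b, c) = \Big(t\,,\, \frac{t}{2}\,,\, t\Big)\Big|_{t=4} = (4, 2, 4)\]
and
the point on the surface at which \(x + y + 3z\) takes its minimum value is
\[(a, b, c) = \Big(t\,,\, \frac{t}{2}\,,\, t\Big)\Big|_{t=-4} = (-4, -2, -4)\]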
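One way to gain confidence in these two points is to compare them against a brute-force scan of the ellipsoid. The sketch below is an illustration under stated assumptions: the parametrization of the ellipsoid by two angles and the grid resolution are choices made here, not part of the text, and a grid scan can only support the answer, not prove it.

```python
import math


def objective(p):
    """The function x + y + 3z being maximized and minimized."""
    return p[0] + p[1] + 3 * p[2]


def surface(p):
    """Left-hand side of x^2 + 2y^2 + 3z^2 = 72."""
    return p[0] ** 2 + 2 * p[1] ** 2 + 3 * p[2] ** 2


# Candidate points (t, t/2, t) with (9/2) t^2 = 72.
t = math.sqrt(72 / 4.5)
candidates = [(s * t, s * t / 2, s * t) for s in (1, -1)]


def sample_ellipsoid(n=60):
    """Grid of points on the ellipsoid, using semi-axes sqrt(72), 6, sqrt(24)."""
    pts = []
    for i in range(n + 1):
        phi = math.pi * i / n
        for j in range(2 * n):
            theta = math.pi * j / n
            pts.append((math.sqrt(72) * math.sin(phi) * math.cos(theta),
                        6 * math.sin(phi) * math.sin(theta),
                        math.sqrt(24) * math.cos(phi)))
    return pts


values = [objective(p) for p in sample_ellipsoid()]
print(candidates)                       # (4, 2, 4) and (-4, -2, -4)
print([objective(p) for p in candidates])
print(max(values), min(values))         # stay within the candidate values
```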
Example 2.5.12
Find the distance from the point (1, 1, 1) to the plane x + 2y + 3z = 20.
Solution 1
First note that the point (1, 1, 1) is not itself on the plane x + 2y + 3z = 20 because
1 + 2 × 1 + 3 × 1 = 6 ≠ 20
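Denote by \((a, b, c)\) the point on the plane \(x + 2y + 3z = 20\) that is nearest \((1, 1, 1)\). Then the vector from \((1, 1, 1)\) to \((a, b, c)\),
namely \(\langle a - 1\,,\, b - 1\,,\, c - 1 \rangle\), must be perpendicular¹¹ to the plane. As the gradient of \(x + 2y + 3z\), namely \(\langle 1\,,\, 2\,,\, 3 \rangle\), is a
normal vector to the plane, \(\langle a - 1\,,\, b - 1\,,\, c - 1 \rangle\) must be parallel to \(\langle 1\,,\, 2\,,\, 3 \rangle\). So there must be some number \(t\) so that
\[\langle a - 1\,,\, b - 1\,,\, c - 1 \rangle = t\,\langle 1\,,\, 2\,,\, 3 \rangle\]
or
\[a = t + 1, \quad b = 2t + 1, \quad c = 3t + 1\]
As \((a, b, c)\) must be on the plane,
\[(t + 1) + 2(2t + 1) + 3(3t + 1) = 20 \iff 14t = 14 \iff t = 1\]
The distance from \((1, 1, 1)\) to the plane \(x + 2y + 3z = 20\) is the length of the vector
\(\langle a - 1\,,\, b - 1\,,\, c - 1 \rangle = t\,\langle 1\,,\, 2\,,\, 3 \rangle = \langle 1\,,\, 2\,,\, 3 \rangle\) which is \(\sqrt{14}\).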
Solution 2
Denote by P = (a, b, c) the point on the plane x + 2y + 3z = 20 that is nearest the point Q = (1, 1, 1). Pick any other point
on the plane and call it R. For example (x, y, z) = (20, 0, 0) obeys x + 2y + 3z = 20 and so R = (20, 0, 0) is a point on the
plane.
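The triangle \(PQR\) is right angled. Denote by \(\theta\) the angle between the hypotenuse \(QR\) and the side \(QP\). The distance from
\(Q = (1, 1, 1)\) to the plane is the length of the line segment \(QP\), which is
\[\text{distance} = |QP| = |QR| \cos\theta\]
Now, the dot product between the vector from \(Q\) to \(R\), which is \(\langle 19, -1, -1 \rangle\), with the vector \(\langle 1, 2, 3 \rangle\), which is normal to the
plane and hence parallel to the side \(QP\) is
\[\langle 19, -1, -1 \rangle \cdot \langle 1, 2, 3 \rangle = |QR|\ |\langle 1, 2, 3 \rangle| \cos\theta = 14\]
so that, finally,
\[\text{distance} = |QR| \cos\theta = \frac{14}{|\langle 1, 2, 3 \rangle|} = \frac{14}{\sqrt{14}} = \sqrt{14}\]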
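Both solutions amount to projecting a vector from \(Q\) to any point of the plane onto the unit normal. A compact sketch of that idea follows; the function name `distance_to_plane` is a hypothetical helper, and the plane is assumed to be given in the form \(\vec{n} \cdot \langle x, y, z \rangle = d\).

```python
import math


def distance_to_plane(q, n, d):
    """Distance from the point q to the plane n . <x, y, z> = d.

    For any point R on the plane, n . (R - q) = d - n . q, so dividing its
    absolute value by |n| gives |QR| cos(theta), as in Solution 2.
    """
    n_dot_q = sum(ni * qi for ni, qi in zip(n, q))
    n_len = math.sqrt(sum(ni * ni for ni in n))
    return abs(d - n_dot_q) / n_len


dist = distance_to_plane((1.0, 1.0, 1.0), (1.0, 2.0, 3.0), 20.0)
print(dist, math.sqrt(14))   # the two numbers should agree
```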
Example 2.5.13
Let \(F(x,y,z) = 0\) and \(G(x,y,z) = 0\) be two surfaces. These two surfaces intersect along a curve. Find a tangent vector to this
curve at the point \((x_0, y_0, z_0)\).
Solution
Call the tangent vector \(\vec{T}\). Then \(\vec{T}\) has to be
tangent to the surface \(F(x,y,z) = 0\) at \((x_0, y_0, z_0)\) and
tangent to the surface \(G(x,y,z) = 0\) at \((x_0, y_0, z_0)\).
Consequently \(\vec{T}\) has to be
perpendicular to the vector \(\nabla F(x_0, y_0, z_0)\), which is normal to \(F(x,y,z) = 0\) at \((x_0, y_0, z_0)\), and at the same time has
to be
perpendicular to the vector \(\nabla G(x_0, y_0, z_0)\), which is normal to \(G(x,y,z) = 0\) at \((x_0, y_0, z_0)\).
Recall that an easy way to construct a vector that is perpendicular to two other vectors is to take their cross product. So we take
\[\vec{T} = \nabla F(x_0,y_0,z_0) \times \nabla G(x_0,y_0,z_0) = \det \begin{bmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ F_x & F_y & F_z \\ G_x & G_y & G_z \end{bmatrix} = (F_y G_z - F_z G_y)\,\hat{\imath} + (F_z G_x - F_x G_z)\,\hat{\jmath} + (F_x G_y - F_y G_x)\,\hat{k}\]
where all partial derivatives are evaluated at \((x_0, y_0, z_0)\).
Example 2.5.14
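Consider the curve that is the intersection of the surfaces \(x^2 + y^2 + z^2 = 5\) and \(x^2 + y^2 = 4z\).
Find a tangent vector to this curve at the point \((\sqrt{3}\,,\, 1\,,\, 1)\).
Solution
As a preliminary check, we verify that the point \((\sqrt{3}\,,\, 1\,,\, 1)\) really is on the curve. To do so, we check that \((\sqrt{3}\,,\, 1\,,\, 1)\)
satisfies both equations: \(3 + 1 + 1 = 5\) and \(3 + 1 = 4 \times 1\).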
We'll find the specified tangent vector by using the strategy of Example 2.5.13.
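Write \(F(x,y,z) = x^2 + y^2 + z^2\) and \(G(x,y,z) = x^2 + y^2 - 4z\). Then
the vector
\[\nabla F(\sqrt{3}, 1, 1) = \langle 2x\,,\, 2y\,,\, 2z \rangle \Big|_{(x,y,z) = (\sqrt{3},1,1)} = 2\,\langle \sqrt{3}\,,\, 1\,,\, 1 \rangle\]
is normal to the surface \(F(x,y,z) = 5\) at \((\sqrt{3}\,,\, 1\,,\, 1)\), and
the vector
\[\nabla G(\sqrt{3}, 1, 1) = \langle 2x\,,\, 2y\,,\, -4 \rangle \Big|_{(x,y,z) = (\sqrt{3},1,1)} = 2\,\langle \sqrt{3}\,,\, 1\,,\, -2 \rangle\]
is normal to the surface \(G(x,y,z) = 0\) at \((\sqrt{3}\,,\, 1\,,\, 1)\).
So a tangent vector is
\[\begin{aligned} \langle \sqrt{3}\,,\, 1\,,\, 1 \rangle \times \langle \sqrt{3}\,,\, 1\,,\, -2 \rangle &= \det \begin{bmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ \sqrt{3} & 1 & 1 \\ \sqrt{3} & 1 & -2 \end{bmatrix} \\ &= (-2 - 1)\,\hat{\imath} + (\sqrt{3} + 2\sqrt{3})\,\hat{\jmath} + (\sqrt{3} - \sqrt{3})\,\hat{k} \\ &= -3\,\hat{\imath} + 3\sqrt{3}\,\hat{\jmath} \end{aligned}\]
There is an easy common factor of \(3\) in both components. So we can create a slightly neater tangent vector by dividing the
length of \(-3\,\hat{\imath} + 3\sqrt{3}\,\hat{\jmath}\) by \(3\), giving \(\langle -1\,,\, \sqrt{3}\,,\, 0 \rangle\).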
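The strategy of Example 2.5.13, taking the cross product of the two gradients, is short enough to check by machine. The following sketch uses hand-entered gradients and a hypothetical helper `cross`; it uses the unscaled gradients, so the result is a multiple of the tangent vector found in the text.

```python
import math


def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def grad_F(x, y, z):
    """Gradient of F = x^2 + y^2 + z^2."""
    return (2 * x, 2 * y, 2 * z)


def grad_G(x, y, z):
    """Gradient of G = x^2 + y^2 - 4z."""
    return (2 * x, 2 * y, -4.0)


p = (math.sqrt(3), 1.0, 1.0)
T = cross(grad_F(*p), grad_G(*p))

print(T)   # a multiple of <-1, sqrt(3), 0>
print(dot(T, grad_F(*p)), dot(T, grad_G(*p)))   # both close to 0
```

The vector `T` is four times \(-3\,\hat{\imath} + 3\sqrt{3}\,\hat{\jmath}\) because each gradient carries the factor of \(2\) that the text drops.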
Recreating this effect in computer generated graphics is called “hidden-surface elimination”. In general, implementing
hidden-surface elimination can be quite complicated. Often a technique called “ray tracing” is used¹². However, it is easy if you know
about vectors and gradients, and you are only looking at a single convex body. By definition, a solid is convex if, whenever
two points are in the solid, then the line segment joining the two points is also contained in the solid.
So suppose that we are looking at a convex solid, that the equation of the surface of the solid is \(G(x,y,z) = 0\), and that our
eye is at \((x_e, y_e, z_e)\).
First consider a light ray that leaves our eye and then just barely nicks the solid at the point \((x, y, z)\), as in the figure on the
left below. The light ray is a tangent line to the surface at \((x, y, z)\). So the direction vector of the light ray,
\(\langle x - x_e\,,\, y - y_e\,,\, z - z_e \rangle\), is tangent to the surface at \((x, y, z)\) and consequently is perpendicular to the normal vector,
\(\nabla G(x,y,z)\), of the surface at \((x, y, z)\). Thus
\[\langle x - x_e\,,\, y - y_e\,,\, z - z_e \rangle \cdot \nabla G(x,y,z) = 0\]
Now consider a light ray that leaves our eye and then passes through the solid, as in the figure on the right above. Call the
point at which the light ray first enters the solid \((x, y, z)\) and the point at which the light ray leaves the solid \((x', y', z')\).
Let \(\vec{v}\) be a vector that has the same direction as, i.e. is a positive multiple of, the vector \(\langle x - x_e\,,\, y - y_e\,,\, z - z_e \rangle\).
Let \(\vec{n}\) be an outward pointing normal to the solid at \((x, y, z)\). It will be either \(\nabla G(x,y,z)\) or \(-\nabla G(x,y,z)\).
Let \(\vec{n}'\) be an outward pointing normal to the solid at \((x', y', z')\). It will be either \(\nabla G(x',y',z')\) or
\(-\nabla G(x',y',z')\).
Then
at the point \((x, y, z)\) where the ray enters the solid, which is a visible point, the direction vector \(\vec{v}\) points into the solid.
The angle \(\theta\) between \(\vec{v}\) and the outward pointing normal \(\vec{n}\) is greater than \(90^\circ\), so that the dot product
\(\vec{v} \cdot \vec{n} = |\vec{v}|\,|\vec{n}| \cos\theta < 0\).
at the point \((x', y', z')\) where the ray leaves the solid, which is a hidden point, the direction vector \(\vec{v}\) points out of the
solid. The angle \(\theta'\) between \(\vec{v}\) and the outward pointing normal \(\vec{n}'\) is less than \(90^\circ\), so that the dot product
\(\vec{v} \cdot \vec{n}' = |\vec{v}|\,|\vec{n}'| \cos\theta' > 0\).
Our conclusion is that, if we are looking in the direction \(\vec{v}\), and if the outward pointing normal¹³ to the surface of the solid at
\((x, y, z)\) is \(\nabla G(x,y,z)\) then the point \((x, y, z)\) is hidden if and only if \(\vec{v} \cdot \nabla G(x,y,z) > 0\).
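The visibility criterion is a one-line test once the gradient is available. Here is a minimal sketch for a single convex body; the unit sphere \(G(x,y,z) = x^2 + y^2 + z^2 - 1 = 0\), the eye position and the function name `is_hidden` are all illustrative assumptions, chosen so that \(\nabla G\) is the outward pointing normal.

```python
def grad_sphere(x, y, z):
    """Gradient of G = x^2 + y^2 + z^2 - 1; it points outward on the unit sphere."""
    return (2 * x, 2 * y, 2 * z)


def is_hidden(eye, point, grad):
    """Return True when the surface point is hidden from the eye.

    v is the viewing direction from the eye to the point; the point is hidden
    exactly when v . grad G(point) > 0, assuming grad G is the outward normal.
    """
    v = tuple(pi - ei for pi, ei in zip(point, eye))
    n = grad(*point)
    return sum(vi * ni for vi, ni in zip(v, n)) > 0


eye = (0.0, 0.0, 5.0)
north_pole = (0.0, 0.0, 1.0)    # faces the eye
south_pole = (0.0, 0.0, -1.0)   # on the far side of the sphere

print(is_hidden(eye, north_pole, grad_sphere))   # False: visible
print(is_hidden(eye, south_pole, grad_sphere))   # True: hidden
```

If the gradient happened to point inward for some other choice of \(G\), the sign of the test would flip, which is why the text suggests replacing \(G\) by \(-G\) in that case.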
This method was used by the computer graphics program that created the shaded figures¹⁴ in Examples 1.7.1 and 1.7.2, which
are reproduced here.
Tangent planes, in addition to being geometric objects, provide a simple but powerful tool for approximating functions of two
variables near a specified point. We saw something very similar in the CLP-1 text where we approximated functions of one variable
by their tangent lines. This brings us to our next topic — approximating functions.
Exercises
Stage 1
1
2
Let the point \(\vec{r}_0 = (x_0, y_0, z_0)\) lie on the surface \(G(x,y,z) = 0\). Assume that \(\nabla G(x_0, y_0, z_0) \ne \vec{0}\). Suppose that the
parametrized curve \(\vec{r}(t) = (x(t), y(t), z(t))\) is contained in the surface and that \(\vec{r}(t_0) = \vec{r}_0\). Show that the tangent line to the
3
Let \(F(x_0, y_0, z_0) = G(x_0, y_0, z_0) = 0\) and let the vectors \(\nabla F(x_0, y_0, z_0)\) and \(\nabla G(x_0, y_0, z_0)\) be nonzero and not be
parallel to each other. Find the equation of the normal plane to the curve of intersection of the surfaces \(F(x,y,z) = 0\) and
\(G(x,y,z) = 0\) at \((x_0, y_0, z_0)\). By definition, that normal plane is the plane through \((x_0, y_0, z_0)\) whose normal vector is the
4
Stage 2
5✳
Let \(f(x,y) = \dfrac{x^2 y}{x^4 + 2y^2}\). Find the tangent plane to the surface \(z = f(x,y)\) at the point \(\big(-1\,,\, 1\,,\, \tfrac{1}{3}\big)\).
6✳
7
Find the equations of the tangent plane and the normal line to the graph of the specified function at the specified point.
1. \(f(x,y) = x^2 - y^2\) at \((-2, 1)\)
2. \(f(x,y) = e^{xy}\) at \((2, 0)\)
8✳
Consider the surface \(z = f(x,y)\) defined implicitly by the equation \(xyz^2 + y^2 z^3 = 3 + x^2\). Use a 3-dimensional gradient
vector to find the equation of the tangent plane to this surface at the point \((-1, 1, 2)\). Write your answer in the form
\(z = ax + by + c\), where \(a\), \(b\) and \(c\) are constants.
9✳
A surface is given by
\[z = x^2 - 2xy + y^2.\]
10 ✳
Find the tangent plane and normal line to the surface \(z = f(x,y) = \dfrac{2y}{x^2 + y^2}\) at \((x, y) = (-1, 2)\).
11 ✳
Find all the points on the surface \(x^2 + 9y^2 + 4z^2 = 17\) where the tangent plane is parallel to the plane \(x - 8z = 0\).
12 ✳
13 ✳
14
Find a vector of length \(\sqrt{3}\) which is tangent to the curve of intersection of the surfaces \(z^2 = 4x^2 + 9y^2\) and
\(6x + 3y + 2z = 5\) at \((2, 1, -5)\).
Stage 3
15
Find all horizontal planes that are tangent to the surface with equation
\[z = xy\,e^{-(x^2 + y^2)/2}\]
16 ✳
Let S be the surface
\[xy - 2x + yz + x^2 + y^2 + z^3 = 7\]
1. Find the tangent plane and normal line to the surface \(S\) at the point \((0, 2, 1)\).
2. The equation defining \(S\) implicitly defines \(z\) as a function of \(x\) and \(y\) for \((x, y, z)\) near \((0, 2, 1)\). Find expressions for \(\frac{\partial z}{\partial x}\)
and \(\frac{\partial z}{\partial y}\). Evaluate \(\frac{\partial z}{\partial y}\) at \((x, y, z) = (0, 2, 1)\).
\(\frac{\partial^2 z}{\partial x\,\partial y}\).
17 ✳
1. Find a vector perpendicular at the point \((1, 1, 3)\) to the surface with equation \(x^2 + z^2 = 10\).
2. Find a vector tangent at the same point to the curve of intersection of the surface in part (a) with surface \(y^2 + z^2 = 10\).
3. Find parametric equations for the line tangent to that curve at that point.
18 ✳
Find the (acute) angle between the curve and the surface at P .
19
Find the distance from the point \((1, 1, 0)\) to the circular paraboloid with equation \(z = x^2 + y^2\).
1. It is possible, but beyond the scope of this text, to give a precise meaning to “fits best”.
2. The method for solving cubics was developed in the 16th century by del Ferro, Cardano and Ferrari (Cardano's student). Ferrari
then went on to discover a formula for the roots of a quartic. Both the cubic and quartic formulae are extremely cumbersome,
and no such formula exists for polynomials of degree 5 and higher. This is the famous Abel-Ruffini theorem.
3. See Appendix A.16 in the CLP-2 text. There it is shown that any integer root of a polynomial with integer coefficients must
divide the constant term exactly. So in this case only ±1 and ±3 could be integer roots. So it is good to check to see if any of
these are solutions before moving on to more sophisticated techniques.
4. Note that we are assuming that (a, b, f (a, b)) is the point on the surface that is nearest (0, 3, 0). That there exists such a point is
intuitively obvious from a sketch of the surface. The mathematical proof that there exists such a point is beyond the scope of
this text.
5. The gradient will also play a big role in Section 2.7.
6. Alternatively, one could rewrite G = K as G − K = 0 and replace G by G − K in the above argument.
7. Indeed we could write Theorem 2.5.1 as a corollary of Theorem 2.5.5. But in a textbook one tries to start with the concrete and
move to the more general.
8. The spelling “equipt” is a bit archaic. There must be a joke here about quips.
9. Recall that “minimum” means the most negative, not the closest to zero.
10. While this is intuitively obvious, proving it is beyond the scope of this text.
11. We saw why this vector must be perpendicular to the plane in Example 2.5.3.
12. You can find out more about it by plugging “ray tracing” into the search engine of your choice.
13. If ∇G(x, y, z) is the inward pointing normal, just replace G by −G.
14. Those figures are not convex. But it was still possible to use the method discussed above because any light ray from our eye
that passes through the figure intersects the figure at most twice. It first enters the figure at a visible point and then exits the
figure at a hidden point.
This page titled 2.5: Tangent Planes and Normal Lines is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by
Joel Feldman, Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a
detailed edit history is available upon request.
2.5.17 https://math.libretexts.org/@go/page/89213
2.6: Linear Approximations and Error
A frequently used, and effective, strategy for building an understanding of the behaviour of a complicated function near a point is
to approximate it by a simple function. The following suite of such approximations is standard fare in Calculus I courses. See, for
example, §3.4 in the CLP-1 text.
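$$\begin{aligned}
g(t_0+\Delta t) &\approx g(t_0) && \text{constant approximation}\\
g(t_0+\Delta t) &\approx g(t_0) + g'(t_0)\,\Delta t && \text{linear, or tangent line, approximation}\\
g(t_0+\Delta t) &\approx g(t_0) + g'(t_0)\,\Delta t + \tfrac{1}{2} g''(t_0)\,\Delta t^2 && \text{quadratic approximation}
\end{aligned}$$
More generally, for any natural number $n$, the approximation
$$g(t_0+\Delta t) \approx g(t_0) + g'(t_0)\,\Delta t + \tfrac{1}{2!} g''(t_0)\,\Delta t^2 + \cdots + \tfrac{1}{n!} g^{(n)}(t_0)\,\Delta t^n$$
is known as the Taylor polynomial of degree $n$. You may have also found a formula for the error introduced in making this approximation. The error $E_n(\Delta t)$ is defined by
$$g(t_0+\Delta t) = g(t_0) + g'(t_0)\,\Delta t + \tfrac{1}{2!} g''(t_0)\,\Delta t^2 + \cdots + \tfrac{1}{n!} g^{(n)}(t_0)\,\Delta t^n + E_n(\Delta t)$$
and obeys¹
$$E_n(\Delta t) = \tfrac{1}{(n+1)!}\, g^{(n+1)}(t_0 + c\,\Delta t)\,\Delta t^{n+1}$$
for some (unknown) $0 \le c \le 1$.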
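To carry these one dimensional approximations over to a function $f(x,y)$ of two variables near the point $(x_0, y_0)$, define $g(t) = f(x_0 + t\,\Delta x\,,\, y_0 + t\,\Delta y)$, so that $g(0) = f(x_0, y_0)$ and $g(1) = f(x_0+\Delta x\,,\,y_0+\Delta y)$. Setting $t_0 = 0$ and $\Delta t = 1$,
$$f(x_0+\Delta x\,,\,y_0+\Delta y) = g(1) = g(t_0+\Delta t) \approx g(t_0) + g'(t_0)\,\Delta t = g(0) + g'(0)$$
By the multivariable chain rule,
$$g'(t) = \frac{\partial f}{\partial x}(x_0 + t\,\Delta x\,,\,y_0 + t\,\Delta y)\,\Delta x + \frac{\partial f}{\partial y}(x_0 + t\,\Delta x\,,\,y_0 + t\,\Delta y)\,\Delta y$$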
so that,
Equation 2.6.1
$$f(x_0+\Delta x\,,\,y_0+\Delta y) \approx f(x_0,y_0) + \frac{\partial f}{\partial x}(x_0,y_0)\,\Delta x + \frac{\partial f}{\partial y}(x_0,y_0)\,\Delta y$$
Of course exactly the same procedure works for functions of three or more variables. In particular
Equation 2.6.2
2.6.1 https://math.libretexts.org/@go/page/92239
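$$\begin{aligned}
f(x_0+\Delta x\,,\,y_0+\Delta y\,,\,z_0+\Delta z)
&\approx f(x_0,y_0,z_0) + \frac{\partial f}{\partial x}(x_0,y_0,z_0)\,\Delta x + \frac{\partial f}{\partial y}(x_0,y_0,z_0)\,\Delta y\\
&\qquad + \frac{\partial f}{\partial z}(x_0,y_0,z_0)\,\Delta z
\end{aligned}$$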
While these linear approximations are quite simple, they tend to be pretty decent provided Δx and Δy are small. See the optional
§2.6.1 for a more precise statement.
Remark 2.6.3
Looking at part (b) of Theorem 2.5.1, we see that this just says that the tangent plane to the surface $z = f(x,y)$ at the point $(x_0, y_0, f(x_0,y_0))$ remains close to the surface when $(x,y)$ is close to $(x_0,y_0)$.
Example 2.6.4
Let
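$$f(x,y) = \sqrt{x^2+y^2}$$
Then
$$\begin{aligned}
\frac{\partial f}{\partial x}(x,y) &= \frac{1}{2}\,\frac{2x}{\sqrt{x^2+y^2}} & f_x(x_0,y_0) &= \frac{x_0}{\sqrt{x_0^2+y_0^2}}\\
\frac{\partial f}{\partial y}(x,y) &= \frac{1}{2}\,\frac{2y}{\sqrt{x^2+y^2}} & f_y(x_0,y_0) &= \frac{y_0}{\sqrt{x_0^2+y_0^2}}
\end{aligned}$$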
Definition 2.6.5
People often write $\Delta f$ for the change $f(x_0+\Delta x\,,\,y_0+\Delta y) - f(x_0,y_0)$ in the value of $f$. Then the linear approximation 2.6.1 becomes
$$\Delta f \approx \frac{\partial f}{\partial x}(x_0,y_0)\,\Delta x + \frac{\partial f}{\partial y}(x_0,y_0)\,\Delta y$$
If they want to emphasize that $\Delta x$, $\Delta y$ and $\Delta f$ are really small (they may even say “infinitesimal”), they'll write² $dx$, $dy$ and $df$ instead. In this notation
$$df \approx \frac{\partial f}{\partial x}(x_0,y_0)\,dx + \frac{\partial f}{\partial y}(x_0,y_0)\,dy$$
Definition 2.6.6
Suppose that we wish to approximate a quantity $Q$ and that the approximation turns out to be $Q + \Delta Q$. Then
the absolute error in the approximation is $|\Delta Q|$ and
the relative error in the approximation is $\left|\frac{\Delta Q}{Q}\right|$ and
the percentage error in the approximation is $100\left|\frac{\Delta Q}{Q}\right|$
In Example 3.4.5 of the CLP-1 text we found an approximate value for the number $\sqrt{4.1}$ by using a linear approximation to the single variable function $f(x) = \sqrt{x}$. We can make similar use of linear approximations to multivariable functions.
Example 2.6.7
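Find an approximate value for $\dfrac{(0.998)^3}{1.003}$.
Solution
Set $f(x,y) = \dfrac{x^3}{y}$. We are to find (approximately) $f(0.998\,,\,1.003)$. We can easily find
$$f(1,1) = \frac{1^3}{1} = 1$$
and since
$$\frac{\partial f}{\partial x} = \frac{3x^2}{y} \qquad\text{and}\qquad \frac{\partial f}{\partial y} = -\frac{x^3}{y^2}$$
we also have $\frac{\partial f}{\partial x}(1,1) = 3$ and $\frac{\partial f}{\partial y}(1,1) = -1$. Setting $\Delta x = -0.002$ and $\Delta y = 0.003$,
$$\begin{aligned}
f(0.998\,,\,1.003) &= f(1+\Delta x\,,\,1+\Delta y)\\
&\approx f(1,1) + \frac{\partial f}{\partial x}(1,1)\,\Delta x + \frac{\partial f}{\partial y}(1,1)\,\Delta y\\
&= 1 + 3(-0.002) - (0.003) = 0.991
\end{aligned}$$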
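As a numerical sanity check on Example 2.6.7, the following minimal sketch evaluates the linear approximation 2.6.1 for $f(x,y) = x^3/y$ about $(1,1)$ with $\Delta x = -0.002$, $\Delta y = 0.003$, and compares it with the directly computed quotient. The helper name `linear_approx` is an illustrative choice, not something taken from the text.

```python
def f(x, y):
    return x**3 / y

def linear_approx(x0, y0, dx, dy):
    """Linear approximation 2.6.1 for f(x, y) = x^3 / y about (x0, y0)."""
    fx = 3 * x0**2 / y0      # partial derivative with respect to x
    fy = -x0**3 / y0**2      # partial derivative with respect to y
    return f(x0, y0) + fx * dx + fy * dy

approx = linear_approx(1.0, 1.0, -0.002, 0.003)
exact = f(0.998, 1.003)
print(f"approx = {approx:.7f}, exact = {exact:.7f}, error = {abs(exact - approx):.2e}")
```

The two values agree to about four decimal places, which is what one would hope for with $\Delta x$ and $\Delta y$ of size $10^{-3}$.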
Example 2.6.8
Solution
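Set $f(x,y,z) = x^{1/2} + y^{1/3} + z^{1/4}$. We are to find (approximately) $f(4.2\,,\,26.7\,,\,256.4)$. We can easily find
$$f(4, 27, 256) = (4)^{1/2} + (27)^{1/3} + (256)^{1/4} = 2 + 3 + 4 = 9$$
and since
$$\frac{\partial f}{\partial x} = \frac{1}{2x^{1/2}} \qquad \frac{\partial f}{\partial y} = \frac{1}{3y^{2/3}} \qquad \frac{\partial f}{\partial z} = \frac{1}{4z^{3/4}}$$
we also have
$$\begin{aligned}
\frac{\partial f}{\partial x}(4,27,256) &= \frac{1}{2(4)^{1/2}} = \frac12\times\frac12\\
\frac{\partial f}{\partial y}(4,27,256) &= \frac{1}{3(27)^{2/3}} = \frac13\times\frac19\\
\frac{\partial f}{\partial z}(4,27,256) &= \frac{1}{4(256)^{3/4}} = \frac14\times\frac1{64}
\end{aligned}$$
Setting $\Delta x = 0.2$, $\Delta y = -0.3$ and $\Delta z = 0.4$,
$$\begin{aligned}
f(4.2\,,\,26.7\,,\,256.4) &= f(4+\Delta x\,,\,27+\Delta y\,,\,256+\Delta z)\\
&\approx f(4,27,256) + \frac{\partial f}{\partial x}(4,27,256)\,\Delta x + \frac{\partial f}{\partial y}(4,27,256)\,\Delta y + \frac{\partial f}{\partial z}(4,27,256)\,\Delta z\\
&= 9 + \frac{0.2}{2\times 2} - \frac{0.3}{3\times 9} + \frac{0.4}{4\times 64} = 9 + \frac1{20} - \frac1{90} + \frac1{640}\\
&= 9.0405
\end{aligned}$$
to four decimal places. The exact answer is 9.03980 to five decimal places. That's a difference of about
$$100\,\frac{9.0405 - 9.0398}{9}\% = 0.008\%$$
Note that we could have used the single variable approximation techniques in the CLP-1 text to separately approximate $(4.2)^{1/2}$, $(26.7)^{1/3}$ and $(256.4)^{1/4}$ and then added the results together. Indeed what we have done here is equivalent.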
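The arithmetic in Example 2.6.8 can be reproduced with a few lines of Python. This is only a sketch: the variable names are illustrative, and the increments $\Delta x = 0.2$, $\Delta y = -0.3$, $\Delta z = 0.4$ are read off from the base point $(4, 27, 256)$ and the target point $(4.2, 26.7, 256.4)$.

```python
def f(x, y, z):
    return x**0.5 + y**(1 / 3) + z**0.25

x0, y0, z0 = 4.0, 27.0, 256.0
dx, dy, dz = 0.2, -0.3, 0.4

# Partial derivatives of f evaluated at the base point (4, 27, 256).
fx = 1 / (2 * x0**0.5)
fy = 1 / (3 * y0**(2 / 3))
fz = 1 / (4 * z0**0.75)

approx = f(x0, y0, z0) + fx * dx + fy * dy + fz * dz
exact = f(x0 + dx, y0 + dy, z0 + dz)
percent_diff = 100 * abs(approx - exact) / exact
print(round(approx, 4), round(exact, 5), round(percent_diff, 3))
```

The printed percentage is computed from unrounded values, so it can differ slightly in the last digit from the figure quoted in the example, which uses rounded inputs.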
Example 2.6.9
A triangle has sides $a = 10.1$ cm and $b = 19.8$ cm which include an angle $35^\circ$. Approximate the area of the triangle.
Solution
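The triangle has height $h = a\sin\theta$ and hence has area
$$A(a,b,\theta) = \frac12 bh = \frac12 ab\sin\theta$$
The $\sin\theta$ in this formula hides a booby trap built into this problem. In preparing the linear approximation we will need to use the derivative of $\sin\theta$. But the standard derivative $\frac{d}{d\theta}\sin\theta = \cos\theta$ only applies when $\theta$ is expressed in radians, not in degrees. So we convert
$$35^\circ = (30+5)\frac{\pi}{180}\ \text{radians} = \Big(\frac{\pi}{6} + \frac{\pi}{36}\Big)\ \text{radians}$$
We need a base point at which $A$ and its partial derivatives are easy to evaluate (which is not the case at $\theta = \frac{\pi}{6}+\frac{\pi}{36}$). We will, of course³, choose
$$\begin{aligned}
a_0 &= 10 & b_0 &= 20 & \theta_0 &= \frac{\pi}{6}\\
\Delta a &= 0.1 & \Delta b &= -0.2 & \Delta\theta &= \frac{\pi}{36}
\end{aligned}$$
Then
$$\begin{aligned}
A(a_0,b_0,\theta_0) &= \frac12 a_0b_0\sin\theta_0 = \frac12(10)(20)\frac12 = 50\\
\frac{\partial A}{\partial a}(a_0,b_0,\theta_0) &= \frac12 b_0\sin\theta_0 = \frac12(20)\frac12 = 5\\
\frac{\partial A}{\partial b}(a_0,b_0,\theta_0) &= \frac12 a_0\sin\theta_0 = \frac12(10)\frac12 = \frac52\\
\frac{\partial A}{\partial \theta}(a_0,b_0,\theta_0) &= \frac12 a_0b_0\cos\theta_0 = \frac12(10)(20)\frac{\sqrt3}{2} = 50\sqrt3
\end{aligned}$$
so that the linear approximation 2.6.2 gives
$$\begin{aligned}
A(a_0+\Delta a\,,\,b_0+\Delta b\,,\,\theta_0+\Delta\theta)
&\approx A(a_0,b_0,\theta_0) + \frac{\partial A}{\partial a}(a_0,b_0,\theta_0)\Delta a + \frac{\partial A}{\partial b}(a_0,b_0,\theta_0)\Delta b + \frac{\partial A}{\partial \theta}(a_0,b_0,\theta_0)\Delta\theta\\
&= 50 + 5\times 0.1 + \frac52\times(-0.2) + 50\sqrt3\,\frac{\pi}{36}\\
&= 50 + \frac5{10} - \frac5{10} + 50\sqrt3\,\frac{\pi}{36}\\
&= 50\Big(1 + \sqrt3\,\frac{\pi}{36}\Big)\\
&\approx 57.56
\end{aligned}$$
to two decimal places. The exact answer is 57.35 to two decimal places. Our approximation has an error of about
$$100\,\frac{57.56 - 57.35}{57.35}\% = 0.37\%$$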
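The following sketch redoes the computation of Example 2.6.9 numerically, working in radians throughout as the example requires. The function name `area` and the variable names are illustrative choices.

```python
import math

def area(a, b, theta):
    """Area of a triangle with sides a, b and included angle theta (radians)."""
    return 0.5 * a * b * math.sin(theta)

a0, b0, theta0 = 10.0, 20.0, math.pi / 6
da, db, dtheta = 0.1, -0.2, math.pi / 36   # 5 degrees expressed in radians

# Partial derivatives of A at the base point (a0, b0, theta0).
A_a = 0.5 * b0 * math.sin(theta0)
A_b = 0.5 * a0 * math.sin(theta0)
A_theta = 0.5 * a0 * b0 * math.cos(theta0)

approx = area(a0, b0, theta0) + A_a * da + A_b * db + A_theta * dtheta
exact = area(10.1, 19.8, math.radians(35))
print(round(approx, 2), round(exact, 2))
```

Replacing `math.pi / 36` by `5` (degrees) in `dtheta` gives a wildly wrong answer, which is the booby trap the example warns about.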
Another practical use of these linear approximations is to quantify how errors made in measured quantities propagate in
computations using those measured quantities. Let's explore this idea a little by recycling the last example.
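Suppose that, as in Example 2.6.9, we are attempting to determine the area of a triangle by measuring the lengths of two of its sides together with the angle between them and then using the formula
$$A(a,b,\theta) = \frac12 ab\sin\theta$$
Of course, in the real world⁴, we cannot measure lengths and angles exactly. So if we need to know the area to within 1%, the question becomes: “How accurately do we have to measure the side lengths and included angle if we want the area that we compute to have an error of no more than about 1%?”
Let's call the exact side lengths and included angle $a_0$, $b_0$ and $\theta_0$, respectively, and the measured side lengths and included angle $a_0+\Delta a$, $b_0+\Delta b$ and $\theta_0+\Delta\theta$. So $\Delta a$, $\Delta b$ and $\Delta\theta$ represent the errors in our measurements. Then, by 2.6.2, the error in our computed area will be approximately
$$\begin{aligned}
\Delta A &\approx \frac{\partial A}{\partial a}(a_0,b_0,\theta_0)\Delta a + \frac{\partial A}{\partial b}(a_0,b_0,\theta_0)\Delta b + \frac{\partial A}{\partial \theta}(a_0,b_0,\theta_0)\Delta\theta\\
&= \frac{\Delta a}{2}\,b_0\sin\theta_0 + \frac{\Delta b}{2}\,a_0\sin\theta_0 + \frac{\Delta\theta}{2}\,a_0b_0\cos\theta_0
\end{aligned}$$
Dividing by $A(a_0,b_0,\theta_0) = \frac12 a_0b_0\sin\theta_0$, the percentage error in our computed area will be approximately
$$100\,\frac{|\Delta A|}{A(a_0,b_0,\theta_0)} \approx \left|100\frac{\Delta a}{a_0} + 100\frac{\Delta b}{b_0} + 100\,\Delta\theta\,\frac{\cos\theta_0}{\sin\theta_0}\right|$$
By the triangle inequality, $|u+v| \le |u| + |v|$, and the fact that $|uv| = |u|\,|v|$,
$$\left|100\frac{\Delta a}{a_0} + 100\frac{\Delta b}{b_0} + 100\,\Delta\theta\,\frac{\cos\theta_0}{\sin\theta_0}\right|
\le 100\left|\frac{\Delta a}{a_0}\right| + 100\left|\frac{\Delta b}{b_0}\right| + 100\,|\Delta\theta|\left|\frac{\cos\theta_0}{\sin\theta_0}\right|$$
Of course we do not know $a_0$, $b_0$ and $\theta_0$ exactly. But suppose that we are confident that $a_0 \ge 10$, $b_0 \ge 10$ and $\frac{\pi}{6} \le \theta_0 \le \frac{\pi}{2}$, so that $\cot\theta_0 \le \cot\frac{\pi}{6} = \sqrt3 \le 2$. Then
$$\begin{aligned}
100\left|\frac{\Delta a}{a_0}\right| &\le 100\left|\frac{\Delta a}{10}\right| = 10\,|\Delta a|\\
100\left|\frac{\Delta b}{b_0}\right| &\le 100\left|\frac{\Delta b}{10}\right| = 10\,|\Delta b|\\
100\,|\Delta\theta|\left|\frac{\cos\theta_0}{\sin\theta_0}\right| &\le 100\,|\Delta\theta|\;2 = 200\,|\Delta\theta|
\end{aligned}$$
and
$$100\,\frac{|\Delta A|}{A(a_0,b_0,\theta_0)} \lesssim 10\,|\Delta a| + 10\,|\Delta b| + 200\,|\Delta\theta|$$
So it will suffice to have measurement errors $|\Delta a|$, $|\Delta b|$ and $|\Delta\theta|$ obey
$$10\,|\Delta a| + 10\,|\Delta b| + 200\,|\Delta\theta| \le 1$$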
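To see the bound $10|\Delta a| + 10|\Delta b| + 200|\Delta\theta|$ in action, the sketch below picks one set of exact values and measurement errors and compares the true percentage error in the area with the bound. All numerical values here are hypothetical, chosen only to satisfy the assumptions $a_0, b_0 \ge 10$ and $\frac{\pi}{6} \le \theta_0 \le \frac{\pi}{2}$; the function names are likewise illustrative.

```python
import math

def area(a, b, theta):
    return 0.5 * a * b * math.sin(theta)

def percent_error_bound(da, db, dtheta):
    """Bound 10|da| + 10|db| + 200|dtheta|, valid (approximately) when
    a0, b0 >= 10 and pi/6 <= theta0 <= pi/2."""
    return 10 * abs(da) + 10 * abs(db) + 200 * abs(dtheta)

a0, b0, theta0 = 10.0, 20.0, math.pi / 6   # hypothetical exact values
da, db, dtheta = 0.02, 0.02, 0.002         # hypothetical measurement errors

exact = area(a0, b0, theta0)
measured = area(a0 + da, b0 + db, theta0 + dtheta)
actual_percent = 100 * abs(measured - exact) / exact
bound = percent_error_bound(da, db, dtheta)
print(f"actual {actual_percent:.3f}% <= bound {bound:.3f}%")
```

Because the bound uses the worst admissible values of $a_0$, $b_0$ and $\theta_0$, the actual error is usually comfortably smaller than the bound.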
Example 2.6.11
A Question
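Suppose that three variables are measured with percentage error $\varepsilon_1$, $\varepsilon_2$ and $\varepsilon_3$ respectively. In other words, if the exact value of variable number $i$ is $x_i$ and the measured value is $x_i + \Delta x_i$, then
$$100\left|\frac{\Delta x_i}{x_i}\right| = \varepsilon_i$$
Suppose further that a quantity $P$ is then computed by taking the product of the three variables. So the exact value of $P$ is
$$P(x_1,x_2,x_3) = x_1x_2x_3$$
and the measured value is $P(x_1+\Delta x_1\,,\,x_2+\Delta x_2\,,\,x_3+\Delta x_3)$. What is the percentage error in this measured value of $P$?
Solution
The percentage error in the measured value $P(x_1+\Delta x_1\,,\,x_2+\Delta x_2\,,\,x_3+\Delta x_3)$ is
$$100\left|\frac{P(x_1+\Delta x_1\,,\,x_2+\Delta x_2\,,\,x_3+\Delta x_3) - P(x_1,x_2,x_3)}{P(x_1,x_2,x_3)}\right|$$
We can get a much simpler approximate expression for this percentage error, which is good enough for virtually all applications, by applying the linear approximation 2.6.2 with
$$\begin{aligned}
P_{x_1}(x_1,x_2,x_3) &= \frac{\partial}{\partial x_1}\big[x_1x_2x_3\big] = x_2x_3\\
P_{x_2}(x_1,x_2,x_3) &= \frac{\partial}{\partial x_2}\big[x_1x_2x_3\big] = x_1x_3\\
P_{x_3}(x_1,x_2,x_3) &= \frac{\partial}{\partial x_3}\big[x_1x_2x_3\big] = x_1x_2
\end{aligned}$$
So
$$P(x_1+\Delta x_1\,,\,x_2+\Delta x_2\,,\,x_3+\Delta x_3) \approx P(x_1,x_2,x_3) + x_2x_3\,\Delta x_1 + x_1x_3\,\Delta x_2 + x_1x_2\,\Delta x_3$$
and the percentage error is approximately
$$100\left|\frac{x_2x_3\,\Delta x_1 + x_1x_3\,\Delta x_2 + x_1x_2\,\Delta x_3}{x_1x_2x_3}\right|
= \left|100\frac{\Delta x_1}{x_1} + 100\frac{\Delta x_2}{x_2} + 100\frac{\Delta x_3}{x_3}\right|
\le \varepsilon_1 + \varepsilon_2 + \varepsilon_3$$
More generally, if we take a product of $n$, rather than three, variables the percentage error in the product becomes at most (approximately) $\sum_{i=1}^n \varepsilon_i$. This is the basis of the experimentalist's rule of thumb that when you take products, percentage errors add.
Still more generally, if we take a “product” $\prod_{i=1}^n x_i^{m_i}$, the percentage error in the “product” becomes at most (approximately) $\sum_{i=1}^n |m_i|\,\varepsilon_i$.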
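The rule of thumb that percentage errors add under multiplication is easy to test numerically. In the sketch below the exact values and errors are hypothetical, and `percent_error_of_product` is an illustrative helper; the point is only that the exact percentage error of the product is close to the sum $\varepsilon_1 + \varepsilon_2 + \varepsilon_3$ when all errors push in the same direction.

```python
def percent_error_of_product(values, errors):
    """Exact percentage error of a product when each factor x_i is
    replaced by x_i + dx_i."""
    exact = 1.0
    measured = 1.0
    for x, dx in zip(values, errors):
        exact *= x
        measured *= x + dx
    return 100 * abs(measured - exact) / abs(exact)

values = [2.0, 5.0, 10.0]      # hypothetical exact values x1, x2, x3
errors = [0.01, 0.02, 0.03]    # hypothetical measurement errors, all positive
eps = [100 * abs(dx / x) for x, dx in zip(values, errors)]

actual = percent_error_of_product(values, errors)
rule_of_thumb = sum(eps)
print(f"actual {actual:.4f}%, sum of percentage errors {rule_of_thumb:.4f}%")
```

The small gap between the two numbers comes from the second order terms that the linear approximation drops, which is why the text says “at most (approximately)”.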
and the corresponding exact formula (see (3.4.32) in the CLP-1 text)
$$g(t_0+\Delta t) = g(t_0) + g'(t_0)\,\Delta t + \frac12 g''(t_0 + c\,\Delta t)\,\Delta t^2 \qquad\text{for some } 0 \le c \le 1$$
tells us about $f$. We have already found, using the chain rule, that
$$g'(t) = \frac{\partial f}{\partial x}(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)\,\Delta x + \frac{\partial f}{\partial y}(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)\,\Delta y$$
We now need to evaluate $g''(t)$. Temporarily write $f_1 = \frac{\partial f}{\partial x}$ and $f_2 = \frac{\partial f}{\partial y}$ so that
$$g'(t) = f_1(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)\,\Delta x + f_2(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)\,\Delta y$$
Then we have, again using the chain rule,
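$$\begin{aligned}
\frac{d}{dt}\big[f_1(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)\big]
&= \frac{\partial f_1}{\partial x}(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)\,\Delta x + \frac{\partial f_1}{\partial y}(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)\,\Delta y\\
&= \frac{\partial^2 f}{\partial x^2}(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)\,\Delta x + \frac{\partial^2 f}{\partial y\,\partial x}(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)\,\Delta y \qquad (*)
\end{aligned}$$
and
$$\begin{aligned}
\frac{d}{dt}\big[f_2(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)\big]
&= \frac{\partial f_2}{\partial x}(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)\,\Delta x + \frac{\partial f_2}{\partial y}(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)\,\Delta y\\
&= \frac{\partial^2 f}{\partial x\,\partial y}(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)\,\Delta x + \frac{\partial^2 f}{\partial y^2}(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)\,\Delta y \qquad (**)
\end{aligned}$$
Adding $\Delta x$ times $(*)$ to $\Delta y$ times $(**)$ and recalling that $\frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial^2 f}{\partial x\,\partial y}$, gives
$$\begin{aligned}
g''(t) &= \frac{\partial^2 f}{\partial x^2}(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)\,\Delta x^2\\
&\qquad + 2\,\frac{\partial^2 f}{\partial x\,\partial y}(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)\,\Delta x\,\Delta y\\
&\qquad + \frac{\partial^2 f}{\partial y^2}(x_0+t\,\Delta x\,,\,y_0+t\,\Delta y)\,\Delta y^2
\end{aligned}$$
So, setting $t_0 = 0$ and $\Delta t = 1$, the quadratic approximation of $f$ about $(x_0, y_0)$ is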
Equation 2.6.12
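$$\begin{aligned}
f(x_0+\Delta x\,,\,y_0+\Delta y)
&\approx f(x_0,y_0) + \frac{\partial f}{\partial x}(x_0,y_0)\,\Delta x + \frac{\partial f}{\partial y}(x_0,y_0)\,\Delta y\\
&\qquad + \frac12\left\{\frac{\partial^2 f}{\partial x^2}(x_0,y_0)\,\Delta x^2 + 2\,\frac{\partial^2 f}{\partial x\,\partial y}(x_0,y_0)\,\Delta x\,\Delta y + \frac{\partial^2 f}{\partial y^2}(x_0,y_0)\,\Delta y^2\right\}
\end{aligned}$$
and the corresponding exact formula is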
Equation 2.6.13
$$\begin{aligned}
f(x_0+\Delta x\,,\,y_0+\Delta y)
&= f(x_0,y_0) + \frac{\partial f}{\partial x}(x_0,y_0)\,\Delta x + \frac{\partial f}{\partial y}(x_0,y_0)\,\Delta y\\
&\qquad + \frac12\left\{\frac{\partial^2 f}{\partial x^2}(\vec r(c))\,\Delta x^2 + 2\,\frac{\partial^2 f}{\partial x\,\partial y}(\vec r(c))\,\Delta x\,\Delta y + \frac{\partial^2 f}{\partial y^2}(\vec r(c))\,\Delta y^2\right\}
\end{aligned}$$
where $\vec r(c) = (x_0 + c\,\Delta x\,,\,y_0 + c\,\Delta y)$ and $c$ is some (unknown) number satisfying $0 \le c \le 1$.
Equation 2.6.14
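If we can bound the second derivatives
$$\left|\frac{\partial^2 f}{\partial x^2}(\vec r(c))\right|\,,\ \left|\frac{\partial^2 f}{\partial x\,\partial y}(\vec r(c))\right|\,,\ \left|\frac{\partial^2 f}{\partial y^2}(\vec r(c))\right| \le M$$
then 2.6.13 gives
$$\left| f(x_0+\Delta x\,,\,y_0+\Delta y) - \left\{ f(x_0,y_0) + \frac{\partial f}{\partial x}(x_0,y_0)\,\Delta x + \frac{\partial f}{\partial y}(x_0,y_0)\,\Delta y \right\}\right|
\le \frac{M}{2}\big(|\Delta x|^2 + 2|\Delta x|\,|\Delta y| + |\Delta y|^2\big)$$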
Why might we want to do this? The left hand side of 2.6.14 is exactly the error in the linear approximation 2.6.1. So the right hand
side is a rigorous bound on the error in the linear approximation.
and
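$$x_0 = 1 \qquad \Delta x = -0.002 \qquad y_0 = 1 \qquad \Delta y = 0.003$$
Then the exact answer is $f(x_0+\Delta x\,,\,y_0+\Delta y)$ and the approximate answer is $f(x_0,y_0) + \frac{\partial f}{\partial x}(x_0,y_0)\,\Delta x + \frac{\partial f}{\partial y}(x_0,y_0)\,\Delta y$, so that, by 2.6.13, the error in the approximation is exactly
$$\frac12\left|\frac{\partial^2 f}{\partial x^2}(\vec r(c))\,\Delta x^2 + 2\,\frac{\partial^2 f}{\partial x\,\partial y}(\vec r(c))\,\Delta x\,\Delta y + \frac{\partial^2 f}{\partial y^2}(\vec r(c))\,\Delta y^2\right|$$
with $\vec r(c) = (1 - 0.002c\,,\,1 + 0.003c)$ for some, unknown, $0 \le c \le 1$. For our function $f$
$$\begin{aligned}
f(x,y) &= \frac{x^3}{y} & \frac{\partial f}{\partial x}(x,y) &= \frac{3x^2}{y} & \frac{\partial f}{\partial y}(x,y) &= -\frac{x^3}{y^2}\\
\frac{\partial^2 f}{\partial x^2}(x,y) &= \frac{6x}{y} & \frac{\partial^2 f}{\partial x\,\partial y}(x,y) &= -\frac{3x^2}{y^2} & \frac{\partial^2 f}{\partial y^2}(x,y) &= \frac{2x^3}{y^3}
\end{aligned}$$
For $0 \le c \le 1$ the point $\vec r(c)$ has $0.998 \le x \le 1$ and $1 \le y \le 1.003$, so the three second derivatives are bounded in absolute value by 6, 3 and 2 respectively,
and
$$\begin{aligned}
\text{error} &\le \frac12\big[6\,\Delta x^2 + 2\times 3\,|\Delta x\,\Delta y| + 2\,\Delta y^2\big]\\
&\le 3(0.002)^2 + 3(0.002)(0.003) + (0.003)^2\\
&= 0.000039
\end{aligned}$$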
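The error bound just obtained can be compared with the true error of the linear approximation. The sketch below assumes, as in the example, $f(x,y) = x^3/y$, the base point $(1,1)$, and the second derivative bounds 6, 3 and 2; the variable names are illustrative.

```python
def f(x, y):
    return x**3 / y

x0, y0 = 1.0, 1.0
dx, dy = -0.002, 0.003

# Linear approximation 2.6.1 about (1, 1): f_x(1, 1) = 3 and f_y(1, 1) = -1.
linear = f(x0, y0) + 3 * dx - 1 * dy
actual_error = abs(f(x0 + dx, y0 + dy) - linear)

# Bound using |f_xx| <= 6, |f_xy| <= 3, |f_yy| <= 2 along the segment r(c).
bound = 0.5 * (6 * dx**2 + 2 * 3 * abs(dx * dy) + 2 * dy**2)
print(f"actual error {actual_error:.3e}, bound {bound:.3e}")
```

In this instance the actual error sits only slightly below the bound, so the bound is far from wasteful here.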
Example 2.6.16
In this example, we find the quadratic approximation of $f(x,y) = \sqrt{1 + 4x^2 + y^2}$ at $(x_0, y_0) = (1, 2)$ and use it to compute approximately $f(1.1\,,\,2.05)$. We know that we will need all partial derivatives up to order 2, so we first compute them and evaluate them at $(x_0, y_0) = (1, 2)$.
$$\begin{aligned}
f(x,y) &= \sqrt{1+4x^2+y^2} & f(x_0,y_0) &= 3\\
f_x(x,y) &= \frac{4x}{\sqrt{1+4x^2+y^2}} & f_x(x_0,y_0) &= \frac43\\
f_y(x,y) &= \frac{y}{\sqrt{1+4x^2+y^2}} & f_y(x_0,y_0) &= \frac23\\
f_{xx}(x,y) &= \frac{4}{\sqrt{1+4x^2+y^2}} - \frac{16x^2}{[1+4x^2+y^2]^{3/2}} & f_{xx}(x_0,y_0) &= \frac43 - \frac{16}{27} = \frac{20}{27}\\
f_{xy}(x,y) &= -\frac{4xy}{[1+4x^2+y^2]^{3/2}} & f_{xy}(x_0,y_0) &= -\frac{8}{27}\\
f_{yy}(x,y) &= \frac{1}{\sqrt{1+4x^2+y^2}} - \frac{y^2}{[1+4x^2+y^2]^{3/2}} & f_{yy}(x_0,y_0) &= \frac13 - \frac{4}{27} = \frac{5}{27}
\end{aligned}$$
We now just substitute them into 2.6.12 to get that the quadratic approximation to $f$ about $(x_0, y_0)$ is
$$\begin{aligned}
f(x_0+\Delta x\,,\,y_0+\Delta y)
&\approx f(x_0,y_0) + f_x(x_0,y_0)\Delta x + f_y(x_0,y_0)\Delta y\\
&\qquad + \frac12\big[f_{xx}(x_0,y_0)\Delta x^2 + 2f_{xy}(x_0,y_0)\Delta x\Delta y + f_{yy}(x_0,y_0)\Delta y^2\big]\\
&= 3 + \frac43\Delta x + \frac23\Delta y + \frac{10}{27}\Delta x^2 - \frac{8}{27}\Delta x\Delta y + \frac{5}{54}\Delta y^2
\end{aligned}$$
In particular, with $\Delta x = 0.1$ and $\Delta y = 0.05$,
$$f(1.1\,,\,2.05) \approx 3 + \frac43(0.1) + \frac23(0.05) + \frac{10}{27}(0.01) - \frac{8}{27}(0.005) + \frac{5}{54}(0.0025) = 3.1691$$
The actual value, to four decimal places, is 3.1690. The percentage error is about 0.004%.
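A short numerical check of Example 2.6.16 follows. It hard-codes the derivative values $3$, $\frac43$, $\frac23$, $\frac{20}{27}$, $-\frac{8}{27}$, $\frac{5}{27}$ found in the example and takes $\Delta x = 0.1$, $\Delta y = 0.05$; `quadratic_approx` is an illustrative name.

```python
import math

def f(x, y):
    return math.sqrt(1 + 4 * x**2 + y**2)

def quadratic_approx(dx, dy):
    """Quadratic approximation 2.6.12 of f about (1, 2), using the
    derivative values computed in the example."""
    return (3 + (4 / 3) * dx + (2 / 3) * dy
            + 0.5 * ((20 / 27) * dx**2
                     + 2 * (-8 / 27) * dx * dy
                     + (5 / 27) * dy**2))

approx = quadratic_approx(0.1, 0.05)
exact = f(1.1, 2.05)
print(f"{approx:.4f} {exact:.4f} {100 * abs(approx - exact) / exact:.4f}%")
```

Dropping the second order bracket from `quadratic_approx` gives the linear approximation, whose error is noticeably larger for these increments.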
Example 2.6.17
With $(x_0, y_0) = (0, 0)$,
$$\begin{aligned}
f(x,y) &= e^{2x}\sin(3y) & f(x_0,y_0) &= 0\\
f_x(x,y) &= 2e^{2x}\sin(3y) & f_x(x_0,y_0) &= 0\\
f_y(x,y) &= 3e^{2x}\cos(3y) & f_y(x_0,y_0) &= 3\\
f_{xx}(x,y) &= 4e^{2x}\sin(3y) & f_{xx}(x_0,y_0) &= 0\\
f_{xy}(x,y) &= 6e^{2x}\cos(3y) & f_{xy}(x_0,y_0) &= 6\\
f_{yy}(x,y) &= -9e^{2x}\sin(3y) & f_{yy}(x_0,y_0) &= 0
\end{aligned}$$
so that
$$\begin{aligned}
f(x,y) &\approx f(0,0) + f_x(0,0)\,x + f_y(0,0)\,y\\
&\qquad + \frac12\big[f_{xx}(0,0)\,x^2 + 2f_{xy}(0,0)\,xy + f_{yy}(0,0)\,y^2\big]\\
&= 3y + 6xy
\end{aligned}$$
That's pretty simple: just compute a bunch of partial derivatives and substitute into the formula 2.6.12.
But there is also a sneakier, and often computationally more efficient, method to get the same result. It exploits the single variable Taylor expansions
$$\begin{aligned}
e^x &= 1 + x + \frac{1}{2!}x^2 + \cdots\\
\sin y &= y - \frac{1}{3!}y^3 + \cdots
\end{aligned}$$
Replacing $x$ by $2x$ in the first and $y$ by $3y$ in the second and multiplying the two together, keeping track only of terms of degree at most two, gives
$$\begin{aligned}
f(x,y) &= e^{2x}\sin(3y)\\
&= \Big[1 + (2x) + \frac{1}{2!}(2x)^2 + \cdots\Big]\Big[(3y) - \frac{1}{3!}(3y)^3 + \cdots\Big]\\
&= \Big[1 + 2x + 2x^2 + \cdots\Big]\Big[3y - \frac92 y^3 + \cdots\Big]\\
&= 3y + 6xy + 6x^2y + \cdots - \frac92 y^3 - 9xy^3 - 9x^2y^3 + \cdots\\
&= 3y + 6xy + \cdots
\end{aligned}$$
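The “sneakier” method is just truncated polynomial multiplication, which is easy to mechanize. The sketch below represents a polynomial in $x$ and $y$ as a dictionary mapping exponent pairs to exact rational coefficients; the helper names `exp_2x`, `sin_3y` and `multiply_truncated` are hypothetical and exist only for this illustration.

```python
from fractions import Fraction
from math import factorial

def exp_2x(max_deg):
    """Taylor coefficients of e^{2x}, keyed by (power of x, power of y)."""
    return {(k, 0): Fraction(2**k, factorial(k)) for k in range(max_deg + 1)}

def sin_3y(max_deg):
    """Taylor coefficients of sin(3y), keyed by (power of x, power of y)."""
    return {(0, k): Fraction((-1)**((k - 1) // 2) * 3**k, factorial(k))
            for k in range(1, max_deg + 1, 2)}

def multiply_truncated(p, q, max_deg):
    """Multiply two polynomials, dropping terms of total degree > max_deg."""
    out = {}
    for (i1, j1), c1 in p.items():
        for (i2, j2), c2 in q.items():
            i, j = i1 + i2, j1 + j2
            if i + j <= max_deg:
                out[(i, j)] = out.get((i, j), Fraction(0)) + c1 * c2
    return {k: v for k, v in out.items() if v != 0}

quadratic = multiply_truncated(exp_2x(2), sin_3y(2), 2)
print(quadratic)
```

The only surviving keys are `(0, 1)` and `(1, 1)`, with coefficients 3 and 6, matching $3y + 6xy$. Raising the three degree arguments together gives higher order Taylor polynomials by the same route.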
found not only linear and quadratic approximations, but in fact a whole hierarchy of approximations. For each integer $n \ge 0$, the $n$th degree Taylor polynomial for $f(x)$ about $x = a$ was defined, in Definition 3.4.11 of the CLP-1 text, to be
$$\sum_{k=0}^n \frac{1}{k!}\, f^{(k)}(a)\cdot(x-a)^k$$
We'll now define, and find, the Taylor polynomial of degree $n$ for the function $f(x,y)$ about $(x,y) = (x_0,y_0)$. It is going to be a polynomial of degree $n$ in $\Delta x$ and $\Delta y$. The most general such polynomial is
$$T_n(\Delta x, \Delta y) = \sum_{\substack{\ell,m\ge 0\\ \ell+m\le n}} a_{\ell,m}\,(\Delta x)^\ell(\Delta y)^m$$
with all of the coefficients $a_{\ell,m}$ being constants. The specific coefficients for the Taylor polynomial are determined by the requirement that all partial derivatives of $T_n(\Delta x, \Delta y)$ at $\Delta x = \Delta y = 0$ are the same as the corresponding partial derivatives of $f(x_0+\Delta x\,,\,y_0+\Delta y)$ at $\Delta x = \Delta y = 0$.
By way of preparation for our computation of the derivatives of $T_n(\Delta x, \Delta y)$, consider
$$\begin{aligned}
\frac{d}{dt}t^4 &= 4t^3 & \frac{d^2}{dt^2}t^4 &= (4)(3)t^2 & \frac{d^3}{dt^3}t^4 &= (4)(3)(2)t\\
\frac{d^4}{dt^4}t^4 &= (4)(3)(2)(1) = 4! & \frac{d^5}{dt^5}t^4 &= 0 & \frac{d^6}{dt^6}t^4 &= 0
\end{aligned}$$
and
$$\begin{aligned}
\frac{d}{dt}t^4\Big|_{t=0} &= 0 & \frac{d^2}{dt^2}t^4\Big|_{t=0} &= 0 & \frac{d^3}{dt^3}t^4\Big|_{t=0} &= 0\\
\frac{d^4}{dt^4}t^4\Big|_{t=0} &= 4! & \frac{d^5}{dt^5}t^4\Big|_{t=0} &= 0 & \frac{d^6}{dt^6}t^4\Big|_{t=0} &= 0
\end{aligned}$$
so that
$$\frac{d^p}{dt^p}t^m\Big|_{t=0} = \begin{cases} m! & \text{if } p = m\\ 0 & \text{if } p \ne m \end{cases}$$
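Consequently
$$\frac{\partial^p}{\partial(\Delta x)^p}\frac{\partial^q}{\partial(\Delta y)^q}(\Delta x)^\ell(\Delta y)^m\Big|_{\Delta x=\Delta y=0} = \begin{cases} \ell!\,m! & \text{if } p = \ell \text{ and } q = m\\ 0 & \text{if } p \ne \ell \text{ or } q \ne m \end{cases}$$
and
$$\begin{aligned}
\frac{\partial^{p+q}T_n}{\partial(\Delta x)^p\,\partial(\Delta y)^q}(0,0)
&= \sum_{\substack{\ell,m\ge 0\\ \ell+m\le n}} a_{\ell,m}\,\frac{\partial^p}{\partial(\Delta x)^p}\frac{\partial^q}{\partial(\Delta y)^q}(\Delta x)^\ell(\Delta y)^m\Big|_{\Delta x=\Delta y=0}\\
&= \begin{cases} p!\,q!\,a_{p,q} & \text{if } p+q \le n\\ 0 & \text{if } p+q > n \end{cases}
\end{aligned}$$
Our requirement that the derivatives of $f$ and $T_n$ match is the requirement that, for all $p + q \le n$,
$$\begin{aligned}
\frac{\partial^{p+q}T_n}{\partial(\Delta x)^p\,\partial(\Delta y)^q}(0,0)
&= \frac{\partial^{p+q}}{\partial(\Delta x)^p\,\partial(\Delta y)^q} f(x_0+\Delta x\,,\,y_0+\Delta y)\Big|_{\Delta x=\Delta y=0}\\
&= \frac{\partial^{p+q}f}{\partial x^p\,\partial y^q}(x_0,y_0)
\end{aligned}$$
that is, $a_{p,q} = \frac{1}{p!\,q!}\frac{\partial^{p+q}f}{\partial x^p\,\partial y^q}(x_0,y_0)$.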
So the Taylor polynomial of degree $n$ for the function $f(x,y)$ about $(x,y) = (x_0,y_0)$ is the right hand side of
Equation 2.6.18
$$f(x_0+\Delta x\,,\,y_0+\Delta y) \approx \sum_{\substack{\ell,m\ge 0\\ \ell+m\le n}} \frac{1}{\ell!\,m!}\,\frac{\partial^{\ell+m}f}{\partial x^\ell\,\partial y^m}(x_0,y_0)\,(\Delta x)^\ell(\Delta y)^m$$
This is for functions, $f(x,y)$, of two variables. There are natural extensions of this for functions of any (finite) number of variables. For example, the Taylor polynomial of degree $n$ for a function, $f(x,y,z)$, of three variables is the right hand side of
$$f(x_0+\Delta x\,,\,y_0+\Delta y\,,\,z_0+\Delta z) \approx \sum_{\substack{k,\ell,m\ge 0\\ k+\ell+m\le n}} \frac{1}{k!\,\ell!\,m!}\,\frac{\partial^{k+\ell+m}f}{\partial x^k\,\partial y^\ell\,\partial z^m}(x_0,y_0,z_0)\,(\Delta x)^k(\Delta y)^\ell(\Delta z)^m$$
Exercises
Stage 1
1
Define $P(x,y) = x^m y^n$.
2. Denote by
$$P_\% = 100\left|\frac{P(x_0+\Delta x, y_0+\Delta y) - P(x_0,y_0)}{P(x_0,y_0)}\right| \qquad x_\% = 100\left|\frac{\Delta x}{x_0}\right| \qquad y_\% = 100\left|\frac{\Delta y}{y_0}\right|$$
the percentage errors in $P$, $x$ and $y$ respectively. Use the linear approximation to find an (approximate) upper bound on $P_\%$ in terms of $m$, $n$, $x_\%$ and $y_\%$.
2
This conclusion is ridiculous. We're saying that the y -coordinate is more than twice the distance from the point to the origin.
What was the mistake?
Stage 2
3
Find an approximate value for $f(x,y) = \sin(\pi xy + \ln y)$ at $(0.01, 1.05)$ without using a calculator or computer.
4✳
Let $f(x,y) = \dfrac{x^2y}{x^4 + 2y^2}$. Find an approximate value for $f(-0.9\,,\,1.1)$ without using a calculator or computer.
5
Four numbers, each at least zero and each at most 50, are rounded to the first decimal place and then multiplied together.
Estimate the maximum possible error in the computed product.
6✳
One side of a right triangle is measured to be 3 with a maximum possible error of ±0.1, and the other side is measured to be 4
with a maximum possible error of ±0.2. Use the linear approximation to estimate the maximum possible error in calculating
the length of the hypotenuse of the right triangle.
7✳
If two resistors of resistance $R_1$ and $R_2$ are wired in parallel, then the resulting resistance $R$ satisfies the equation $\frac1R = \frac1{R_1} + \frac1{R_2}$. Use the linear approximation to estimate the change in $R$ if $R_1$ decreases from 2 to 1.9 ohms and $R_2$
8
If the resistances, measured in Ohms, are $R_1 = 25\,\Omega$, $R_2 = 40\,\Omega$ and $R_3 = 50\,\Omega$, with a possible error of 0.5% in each case,
estimate the maximum error in the calculated value of R.
9
The specific gravity $S$ of an object is given by $S = \frac{A}{A-W}$ where $A$ is the weight of the object in air and $W$ is the weight of the object in water. If $A = 20 \pm .01$ and $W = 12 \pm .02$ find the approximate percentage error in calculating $S$ from the given measurements.
10 ✳
where s is the specific heat and r is the density. We expect to measure (s, r) to be approximately (2, 2) and would like to have
the most accurate value for P . There are two different ways to measure s and r. Method 1 has an error in s of ±0.01 and an
error in r of ±0.1, while method 2 has an error of ±0.02 for both s and r.
Should we use method 1 or method 2? Explain your reasoning carefully.
11
A rectangular beam that is supported at its two ends and is subjected to a uniform load sags by an amount
$$S = C\,\frac{p\,\ell^4}{w\,h^3}$$
where $p$ = load, $\ell$ = length, $h$ = height, $w$ = width and $C$ is a constant. Suppose $p \approx 100$, $\ell \approx 4$, $w \approx .1$ and $h \approx .2$. Will the sag of the beam be more sensitive to changes in the height of the beam or to changes in the width of the beam?
12 ✳
Let $z = f(x,y) = \dfrac{2y}{x^2+y^2}$. Find an approximate value for $f(-0.8, 2.1)$.
13 ✳
1. Find $\frac{\partial z}{\partial x}$.
2. If f (−1, 1) < 0, find the linear approximation of the function z = f (x, y) at (−1, 1).
3. If f (−1, 1) < 0, use the linear approximation in (b) to approximate f (−1.02, 0.97).
14 ✳
15 ✳
Two sides and the enclosed angle of a triangle are measured to be $3 \pm .1$ m, $4 \pm .1$ m and $90 \pm 1^\circ$ respectively. The length of
16 ✳
Use differentials to find a reasonable approximation to the value of $f(x,y) = xy\sqrt{x^2+y^2}$ at $x = 3.02$, $y = 3.96$. Note that $3.02 \approx 3$ and $3.96 \approx 4$.
17 ✳
Use differentials to estimate the volume of metal in a closed metal can with diameter 8cm and height 12cm if the metal is
0.04cm thick.
18 ✳
Stage 3
19 ✳
∂x
,
∂z
∂y
as functions of x, y, z.
2. Evaluate ∂z
∂y
,
∂z
∂y
at (1, 1, 2).
3. Measurements are made with errors, so that x = 1 ± 0.03 and y = 1 ± 0.02. Find the corresponding maximum error in
measuring z.
4. A particle moves over the surface along the path whose projection in the $xy$-plane is given in terms of the angle $\theta$ as
x(θ) = 1 + cos θ, y(θ) = sin θ
dθ
at points A and B.
20 ✳
1. Suppose that x = 1 and y = e, but errors of size 0.1 are made in measuring each of x and y. Estimate the maximum error
that this could cause in f (x, y).
2. The graph of the function f sits in R , and the point (1, e, 1) lies on that graph. Find a nonzero vector that is perpendicular
3
21 ✳
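1. Compute $\frac{\partial z}{\partial x}$, $\frac{\partial z}{\partial y}$ in terms of $x, y, z$.
2. Evaluate $\frac{\partial z}{\partial x}$ and $\frac{\partial z}{\partial y}$ at $(x, y, z) = (2, -1/2, 1)$.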
3. If x decreases from 2 to 1.94, and y increases from −0.5 to −0.4, find the approximate change in z from 1.
4. Find the equation of the tangent plane to the surface at the point (2, −1/2, 1).
22 ✳
A surface $z = f(x,y)$ has derivatives $\frac{\partial f}{\partial x} = 3$ and $\frac{\partial f}{\partial y} = -2$ at $(x, y, z) = (1, 3, 1)$.
1. If x increases from 1 to 1.2, and y decreases from 3 to 2.6, find the change in z using a linear approximation.
2. Find the equation of the tangent plane to the surface at the point (1, 3, 1).
23 ✳
According to van der Waal's equation, a gas satisfies the equation
$$(pV^2 + 16)(V - 1) = TV^2,$$
where p, V and T denote pressure, volume and temperature respectively. Suppose the gas is now at pressure 1, volume 2 and
temperature 5. Find the approximate change in its volume if p is increased by 0.2 and T is increased by 0.3.
24 ✳
1. Find the equation of the tangent plane to the graph z = f (x, y) at the point where (x, y) = (2, 1).
2. Find the tangent plane approximation to the value of $f(1.99, 1.01)$ using the tangent plane from part (a).
25 ✳
1. Use a linear approximation of the function z = f (x, y) at (0, 1) to estimate f (0.1, 1.2).
2. Find a point P (a, b, c) on the graph of z = f (x, y) such that the tangent plane to the graph of z = f (x, y) at the point P is
parallel to the plane 2x + 2y − z = 3.
26 ✳
1. Find the equation of the tangent plane to the surface $x^2z^3 + y\sin(\pi x) = -y^2$ at the point $P = (1, 1, -1)$.
∂x
∂z
3. Let $z$ be the same implicit function as in part (ii), defined by the equation $x^2z^3 + y\sin(\pi x) = -y^2$. Let $x = 0.97$, and
27 ✳
The surface $x^4 + y^4 + z^4 + xyz = 17$ passes through $(0, 1, 2)$, and near this point the surface determines $x$ as a function, $x = F(y,z)$, of $y$ and $z$.
1. Find $F_y$ and $F_z$ at $(x, y, z) = (0, 1, 2)$.
2. Use the tangent plane approximation (also known as linear, first order or differential approximation) to find the approximate value of $x$ (near 0) such that $(x, 1.01, 1.98)$ lies on the surface.
$\frac{1}{(n+1)!}\,g^{(n+1)}(c)\,(x-a)^{n+1}$
2. Don't take the notation dx or the terminology “infinitesimal” too seriously. It is just intended to signal “very small”.
3. There are other choices possible. For example, we could write $35^\circ = 45^\circ - 10^\circ$. To get a good approximation we try to make
This page titled 2.6: Linear Approximations and Error is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by
Joel Feldman, Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a
detailed edit history is available upon request.
2.7: Directional Derivatives and the Gradient
The principal interpretation of $\frac{df}{dx}(a)$ is the rate of change of $f(x)$, per unit change of $x$, at $x = a$. The natural analog of this interpretation for multivariable functions is the directional derivative, which we now introduce through a question.
A Question
Suppose that you are standing at $(a, b)$ near a campfire. The temperature you feel at $(x, y)$ is $f(x,y)$. You start to move with velocity $\vec v = \langle v_1, v_2\rangle$. What rate of change of temperature do you feel?
The Answer
Let's set the beginning of time, $t = 0$, to the time at which you leave $(a, b)$. Then
at time 0 you are at $(a, b)$ and feel the temperature $f(a, b)$ and
at time $t$ you are at $(a + v_1t\,,\,b + v_2t)$ and feel the temperature $f(a + v_1t\,,\,b + v_2t)$. So
the change in temperature between time 0 and time $t$ is $f(a + v_1t\,,\,b + v_2t) - f(a, b)$,
the average rate of change of temperature, per unit time, between time 0 and time $t$ is $\frac{f(a + v_1t\,,\,b + v_2t) - f(a,b)}{t}$ and the
instantaneous rate of change of temperature per unit time as you leave $(a, b)$ is $\lim_{t\to 0}\frac{f(a + v_1t\,,\,b + v_2t) - f(a,b)}{t}$.
Writing $g(t) = f(a + v_1t\,,\,b + v_2t)$, this limit is
$$\lim_{t\to 0}\frac{g(t) - g(0)}{t} = \frac{dg}{dt}(0) = \frac{d}{dt}\Big[f(a + v_1t\,,\,b + v_2t)\Big]\Big|_{t=0}$$
By the chain rule, we can write the right hand side in terms of partial derivatives of f .
$$\frac{d}{dt}\Big[f(a + v_1t\,,\,b + v_2t)\Big] = f_x(a + v_1t\,,\,b + v_2t)\,v_1 + f_y(a + v_1t\,,\,b + v_2t)\,v_2$$
So, the instantaneous rate of change per unit time as you leave $(a, b)$ is
$$\begin{aligned}
\lim_{t\to 0}\frac{f(a + v_1t\,,\,b + v_2t) - f(a,b)}{t}
&= \Big[f_x(a + v_1t\,,\,b + v_2t)\,v_1 + f_y(a + v_1t\,,\,b + v_2t)\,v_2\Big]\Big|_{t=0}\\
&= f_x(a,b)\,v_1 + f_y(a,b)\,v_2
\end{aligned}$$
Notice that we have expressed the rate of change as the dot product of the velocity vector with a vector of partial derivatives of f .
We have seen such a vector of partial derivatives of $f$ before; in Definition 2.5.4, we defined the gradient of the three variable function $G(x,y,z)$ at the point $(x_0,y_0,z_0)$ to be $\langle G_x(x_0,y_0,z_0)\,,\,G_y(x_0,y_0,z_0)\,,\,G_z(x_0,y_0,z_0)\rangle$. Here we see the
Definition 2.7.1
The vector $\langle f_x(a,b)\,,\,f_y(a,b)\rangle$ is denoted $\nabla f(a,b)$ and is called “the gradient of the function $f$ at the point $(a,b)$”.
In general, the gradient of $f$ is a vector with one component for each variable of $f$. The $j$th component is the partial derivative of $f$ with respect to the $j$th variable.
Now because the dot product $\nabla f(a,b)\cdot\vec v$ appears frequently, we introduce some handy notation.
2.7.1 https://math.libretexts.org/@go/page/92240
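Definition 2.7.2
Given any vector $\vec v = \langle v_1, v_2\rangle$, the dot product $\nabla f(a,b)\cdot\vec v = f_x(a,b)\,v_1 + f_y(a,b)\,v_2$ is denoted $D_{\vec v}f(a,b)$.
Armed with this useful notation we can answer our question very succinctly.
Equation 2.7.3
The rate of change of $f$ per unit time as you leave $(a, b)$ moving with velocity $\vec v$ is
$$D_{\vec v}f(a,b) = \nabla f(a,b)\cdot\vec v$$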
We can compute the rate of change of temperature per unit distance (as opposed to per unit time) in a similar way. The change in temperature between time 0 and time $t$ is $f(a + v_1t, b + v_2t) - f(a,b)$. Between time 0 and time $t$, you have travelled a distance $|\vec v|\,t$. So the instantaneous rate of change of temperature per unit distance as you leave $(a, b)$ is
$$\lim_{t\to 0}\frac{f(a + v_1t, b + v_2t) - f(a,b)}{t\,|\vec v|}$$
This is exactly $\frac{1}{|\vec v|}$ times $\lim_{t\to 0}\frac{f(a + v_1t, b + v_2t) - f(a,b)}{t}$, which we computed above to be $D_{\vec v}f(a,b)$. So
Equation 2.7.4
Given any nonzero vector $\vec v$, the rate of change of $f$ per unit distance as you leave $(a, b)$ moving in direction $\vec v$ is
$$\nabla f(a,b)\cdot\frac{\vec v}{|\vec v|} = D_{\frac{\vec v}{|\vec v|}}f(a,b)$$
Definition 2.7.5
$D_{\frac{\vec v}{|\vec v|}}f(a,b)$ is called the directional derivative of the function $f(x,y)$ at the point $(a,b)$ in the direction¹ $\vec v$.
The Implications
We have just seen that the instantaneous rate of change of f per unit distance as we leave (a, b) moving in direction v ⃗ is a dot
product, which we can write as
$$\nabla f(a,b)\cdot\frac{\vec v}{|\vec v|} = |\nabla f(a,b)|\cos\theta$$
where θ is the angle between the gradient vector ∇f (a, b) and the direction vector v.⃗ Writing it in this way allows us to make some
useful observations. Since cos θ is always between −1 and +1
the direction of maximum rate of increase is that having θ = 0. So to get maximum rate of increase per unit distance, as you
leave (a, b), you should move in the same direction as the gradient ∇f (a, b). Then the rate of increase per unit distance is
|∇f (a, b)|.
The direction of minimum (i.e. most negative) rate of increase is that having θ = 180°. To get minimum rate of increase per
unit distance you should move in the direction opposite ∇f (a, b). Then the rate of increase per unit distance is −|∇f (a, b)|.
The directions giving zero rate of increase are those perpendicular to ∇f (a, b). If you move in a direction perpendicular to
∇f (a, b), then f (x, y) remains constant as you leave (a, b). At that instant, you are moving so that f (x, y) remains constant
and consequently you are moving along the level curve f (x, y) = f (a, b). So ∇f (a, b) is perpendicular to the level curve
f (x, y) = f (a, b) at (a, b). The corresponding statement in three dimensions is that ∇F (a, b, c) is perpendicular to the level
surface F (x, y, z) = F (a, b, c) at (a, b, c). Hence a good way to find a vector normal to the surface F (x, y, z) = F (a, b, c) at
the point (a, b, c) is to compute the gradient ∇F (a, b, c). This is precisely what we saw back in Theorem 2.5.5.
Now that we have defined the directional derivative, here are some examples.
Example 2.7.6
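Find the directional derivative of the function $f(x,y) = e^{x+y^2}$ at the point $(0, 1)$ in the direction $-\hat\imath + \hat\jmath$.
Solution
To compute the directional derivative, we need the gradient. To compute the gradient, we need some partial derivatives. So we start with the partial derivatives of $f$ at $(0, 1)$:
$$\begin{aligned}
f_x(0,1) &= e^{x+y^2}\Big|_{x=0,\,y=1} = e\\
f_y(0,1) &= 2y\,e^{x+y^2}\Big|_{x=0,\,y=1} = 2e
\end{aligned}$$
So the gradient of $f$ at $(0,1)$ is
$$\nabla f(0,1) = f_x(0,1)\,\hat\imath + f_y(0,1)\,\hat\jmath = e\,\hat\imath + 2e\,\hat\jmath$$
and the directional derivative in the direction $-\hat\imath + \hat\jmath$ is
$$D_{\frac{-\hat\imath+\hat\jmath}{|-\hat\imath+\hat\jmath|}}f(0,1) = \nabla f(0,1)\cdot\frac{-\hat\imath+\hat\jmath}{|-\hat\imath+\hat\jmath|} = (e\,\hat\imath + 2e\,\hat\jmath)\cdot\frac{-\hat\imath+\hat\jmath}{\sqrt2} = \frac{e}{\sqrt2}$$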
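The value $\frac{e}{\sqrt2}$ can be cross-checked against the limit definition of the rate of change per unit distance. The sketch below uses the gradient $\langle e, 2e\rangle$ from the example and a forward difference with a small step `h`; the step size and the helper name `directional_derivative` are illustrative assumptions.

```python
import math

def f(x, y):
    return math.exp(x + y**2)

def directional_derivative(grad, v):
    """Rate of change per unit distance: grad . v / |v|."""
    norm = math.hypot(*v)
    return sum(g * c for g, c in zip(grad, v)) / norm

grad = (math.e, 2 * math.e)      # gradient of f at (0, 1), from the example
v = (-1.0, 1.0)
exact = directional_derivative(grad, v)

# Finite-difference check: step a small distance h along the unit vector v/|v|.
h = 1e-6
ux, uy = v[0] / math.hypot(*v), v[1] / math.hypot(*v)
numeric = (f(0 + h * ux, 1 + h * uy) - f(0, 1)) / h
print(exact, numeric, math.e / math.sqrt(2))
```

All three printed numbers agree to several decimal places; the finite-difference value differs only by an amount proportional to `h`.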
Example 2.7.7
Find the directional derivative of the function w(x, y, z) = xyz + ln(xz) at the point (1, 3, 1) in the direction ⟨1 , 0 , 1⟩ . In
what directions is the directional derivative zero?
Solution
First, the partial derivatives of w at (1, 3, 1) are
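$$\begin{aligned}
w_x(1,3,1) &= \Big[yz + \frac1x\Big]_{(1,3,1)} = 3\times 1 + \frac11 = 4\\
w_y(1,3,1) &= xz\Big|_{(1,3,1)} = 1\times 1 = 1\\
w_z(1,3,1) &= \Big[xy + \frac1z\Big]_{(1,3,1)} = 1\times 3 + \frac11 = 4
\end{aligned}$$
so that $\nabla w(1,3,1) = \langle 4, 1, 4\rangle$ and the directional derivative in the direction $\langle 1, 0, 1\rangle$ is
$$D_{\frac{\langle 1,0,1\rangle}{|\langle 1,0,1\rangle|}}w(1,3,1) = \langle 4,1,4\rangle\cdot\frac{\langle 1,0,1\rangle}{\sqrt2} = \frac{8}{\sqrt2} = 4\sqrt2$$
The directional derivative of $w$ at $(1,3,1)$ in the direction of a nonzero vector $\vec t$ is zero if and only if $\langle 4,1,4\rangle\cdot\vec t = 0$, which is the case if and only if $\vec t$ is perpendicular to $\langle 4, 1, 4\rangle$. So if we walk in the direction of any vector in the plane, $4x + y + 4z = 0$ (which has normal vector $\langle 4, 1, 4\rangle$) then the directional derivative is zero.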
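Both conclusions of Example 2.7.7 are quick to confirm numerically. In this sketch the gradient $\langle 4, 1, 4\rangle$ is taken from the example, and the two vectors in `in_plane` are hypothetical choices of vectors satisfying $4x + y + 4z = 0$.

```python
import math

grad_w = (4.0, 1.0, 4.0)     # gradient of w at (1, 3, 1), from the example

def directional_derivative(grad, v):
    """Rate of change per unit distance in the direction of v."""
    norm = math.sqrt(sum(c * c for c in v))
    return sum(g * c for g, c in zip(grad, v)) / norm

d_101 = directional_derivative(grad_w, (1.0, 0.0, 1.0))

# Two hypothetical vectors lying in the plane 4x + y + 4z = 0.
in_plane = [(1.0, 0.0, -1.0), (1.0, -4.0, 0.0)]
zeros = [directional_derivative(grad_w, t) for t in in_plane]
print(d_101, zeros)
```

Any linear combination of the two in-plane vectors also gives a zero directional derivative, since the dot product with the gradient is linear.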
Example 2.7.8
Let
$$f(x,y) = 5 - x^2 - 2y^2 \qquad (a,b) = (-1,-1)$$
In this example, we'll explore the behaviour of the function $f(x,y)$ near the point $(a,b)$.
Note that for any fixed $f_0 < 5$, $f(x,y) = f_0$ is the ellipse $x^2 + 2y^2 = 5 - f_0$. So the graph $z = f(x,y)$ consists of a bunch of horizontal ellipses stacked one on top of each other.
Since the ellipse $x^2 + 2y^2 = 5 - f_0$ has $x$-semi-axis $\sqrt{5 - f_0}$ and $y$-semi-axis $\sqrt{\frac{5-f_0}{2}}$,
The part of the graph z = f (x, y) in the first octant is sketched in the left hand figure below.
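Several level curves, $f(x,y) = f_0$, are sketched in the right hand figure below.
The gradient of f at (a, b) = (−1, −1) is $\nabla f(a,b) = \langle -2a, -4b\rangle = \langle 2, 4\rangle = 2\langle 1, 2\rangle$. So the direction of maximum rate of increase is the unit vector $\frac{1}{\sqrt5}\langle 1,2\rangle$ and that maximum rate is $|\langle 2,4\rangle| = 2\sqrt5$. The direction of minimum rate of increase is $-\frac{1}{\sqrt5}\langle 1,2\rangle$ and that minimum rate is
$$-|\langle 2,4\rangle| = -2\sqrt5.$$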
The directions giving zero rate of increase are perpendicular to ∇f (a, b). One vector perpendicular to ⟨1, 2⟩ is ⟨2, −1⟩. So the unit vectors giving the direction of zero rate of increase are $\pm\frac{1}{\sqrt5}\langle 2,-1\rangle$. These are the directions of the tangent vector at (a, b) to the level curve of f through (a, b), which is the curve f (x, y) = f (a, b).
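The three special directions of Example 2.7.8 are easy to test numerically. The following sketch (with a hypothetical helper `rate`, and a step size chosen by us) measures the rate of change of $f(x,y) = 5 - x^2 - 2y^2$ at $(-1,-1)$ along the gradient, against the gradient, and along a vector perpendicular to it.

```python
import math

def f(x, y):
    return 5 - x**2 - 2 * y**2

def rate(direction, point=(-1.0, -1.0), h=1e-6):
    """Central-difference rate of change of f at point along the unit
    vector pointing in the given direction."""
    n = math.hypot(*direction)
    ux, uy = direction[0] / n, direction[1] / n
    x, y = point
    return (f(x + h * ux, y + h * uy) - f(x - h * ux, y - h * uy)) / (2 * h)

steepest_up = rate((1, 2))      # along the gradient <2, 4>
steepest_down = rate((-1, -2))  # against the gradient
level = rate((2, -1))           # perpendicular to the gradient
print(steepest_up, steepest_down, level)
```

We expect values close to $2\sqrt5$, $-2\sqrt5$ and $0$ respectively.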
Example 2.7.9
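What is the rate of change of $f(x,y,z) = x^2 + y^2 + z^2$ at (3, 5, 4) moving in the positive x-direction along the curve of intersection of the surfaces $G(x,y,z) = 25$ and $H(x,y,z) = 0$, where $G(x,y,z) = 2x^2 - y^2 + 2z^2$ and $H(x,y,z) = x^2 - y^2 + z^2$?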
Solution
As a first check note that (3, 5, 4) really does lie on both surfaces because
$$G(3,5,4) = 2(3^2) - 5^2 + 2(4^2) = 18 - 25 + 32 = 25$$
$$H(3,5,4) = 3^2 - 5^2 + 4^2 = 9 - 25 + 16 = 0$$
We compute gradients to get the normal vectors to the surfaces G(x, y, z) = 25 and H (x, y, z) = 0 at (3, 5, 4).
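$$\nabla G(3,5,4) = \big[4x\,\hat\imath - 2y\,\hat\jmath + 4z\,\hat k\big]_{(3,5,4)} = 12\,\hat\imath - 10\,\hat\jmath + 16\,\hat k = 2\big(6\,\hat\imath - 5\,\hat\jmath + 8\,\hat k\big)$$
$$\nabla H(3,5,4) = \big[2x\,\hat\imath - 2y\,\hat\jmath + 2z\,\hat k\big]_{(3,5,4)} = 6\,\hat\imath - 10\,\hat\jmath + 8\,\hat k = 2\big(3\,\hat\imath - 5\,\hat\jmath + 4\,\hat k\big)$$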
The direction of interest is tangent to the curve of intersection. So the direction of interest is tangent to both surfaces and hence
is perpendicular to both gradients. Consequently one tangent vector to the curve at (3, 5, 4) is
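$$\nabla G(3,5,4)\times\nabla H(3,5,4) = 4\big(6\,\hat\imath - 5\,\hat\jmath + 8\,\hat k\big)\times\big(3\,\hat\imath - 5\,\hat\jmath + 4\,\hat k\big) = 4\det\begin{bmatrix}\hat\imath & \hat\jmath & \hat k\\ 6 & -5 & 8\\ 3 & -5 & 4\end{bmatrix} = 4\big(20\,\hat\imath - 15\,\hat k\big) = 20\big(4\,\hat\imath - 3\,\hat k\big)$$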
and the unit tangent vector to the curve at (3, 5, 4) that has positive x component is
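$$\frac{4\,\hat\imath - 3\,\hat k}{|4\,\hat\imath - 3\,\hat k|} = \frac45\,\hat\imath - \frac35\,\hat k$$
The desired rate of change is the directional derivative of f in that direction:
$$\nabla f(3,5,4)\cdot\Big(\frac45\,\hat\imath - \frac35\,\hat k\Big) = \big[2x\,\hat\imath + 2y\,\hat\jmath + 2z\,\hat k\big]_{(x,y,z)=(3,5,4)}\cdot\Big(\frac45\,\hat\imath - \frac35\,\hat k\Big) = \big(6\,\hat\imath + 10\,\hat\jmath + 8\,\hat k\big)\cdot\Big(\frac45\,\hat\imath - \frac35\,\hat k\Big) = 0$$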
Actually, we could have known that the rate of change would be zero.
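Any point (x, y, z) on the curve obeys both $y^2 = x^2 + z^2$ and $2x^2 - y^2 + 2z^2 = 25$.
Substituting $x^2 + z^2 = y^2$ into the second equation gives $y^2 = 25$, and hence also $x^2 + z^2 = 25$, at every point of the curve.
That is, $f(x,y,z) = x^2 + y^2 + z^2$ takes the value 50 at every point of the curve.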
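The vector arithmetic of Example 2.7.9 can be reproduced in a few lines. This is only a sketch: the gradient values are copied from the solution above, and `cross` and `dot` are small helpers written for this illustration.

```python
import math

def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

grad_G = (12, -10, 16)   # gradient of G at (3, 5, 4)
grad_H = (6, -10, 8)     # gradient of H at (3, 5, 4)
grad_f = (6, 10, 8)      # gradient of f at (3, 5, 4)

tangent = cross(grad_G, grad_H)           # tangent to the curve of intersection
if tangent[0] < 0:                        # pick the positive x-direction
    tangent = tuple(-t for t in tangent)
length = math.sqrt(dot(tangent, tangent))
rate = dot(grad_f, tangent) / length      # rate of change of f along the curve
print(tangent, rate)
```

The tangent vector comes out as a positive multiple of $4\,\hat\imath - 3\,\hat k$, and the rate of change is zero, in agreement with the observation that f is constant on the curve.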
Let's change things up a little. In the next example, we are told the rates of change in two different directions. From this we are to
determine the rate of change in a third direction.
Example 2.7.10
The rate of change of a given function f (x, y) at the point $P_0 = (1,2)$ in the direction towards $P_1 = (2,3)$ is $2\sqrt2$ and in the direction towards $P_2 = (1,0)$ is −3. What is the rate of change of f at $P_0$ towards the origin $P_3 = (0,0)$?
Solution
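We can easily determine the rate of change of f at the point $P_0$ in any direction once we know the gradient $\nabla f(1,2) = a\,\hat\imath + b\,\hat\jmath$. So we will first use the two given rates of change to determine a and b, and then we determine the rate of change in the direction towards the origin.
The two given rates of change are in the directions of the vectors $\overrightarrow{P_0P_1} = \langle 1,1\rangle$ and $\overrightarrow{P_0P_2} = \langle 0,-2\rangle$.
As you might guess, the notation $\overrightarrow{PQ}$ means the vector whose tail is at P and whose head is at Q. So the given rates of change tell us that
$$2\sqrt2 = D_{\frac{\langle 1,1\rangle}{|\langle 1,1\rangle|}} f(1,2) = \nabla f(1,2)\cdot\frac{\langle 1,1\rangle}{|\langle 1,1\rangle|} = \langle a,b\rangle\cdot\frac{\langle 1,1\rangle}{\sqrt2} = \frac{a}{\sqrt2} + \frac{b}{\sqrt2}$$
$$-3 = D_{\frac{\langle 0,-2\rangle}{|\langle 0,-2\rangle|}} f(1,2) = \nabla f(1,2)\cdot\frac{\langle 0,-2\rangle}{|\langle 0,-2\rangle|} = \langle a,b\rangle\cdot\frac{\langle 0,-2\rangle}{2} = -b$$
These two lines give us two linear equations in the two unknowns a and b. The second equation directly gives us b = 3. Substituting b = 3 into the first equation gives a = 1. So the rate of change of f at $P_0$ in the direction towards $P_3 = (0,0)$, that is, in the direction of $\overrightarrow{P_0P_3} = \langle -1,-2\rangle$, is
$$\nabla f(1,2)\cdot\frac{\langle -1,-2\rangle}{|\langle -1,-2\rangle|} = \langle 1,3\rangle\cdot\frac{\langle -1,-2\rangle}{\sqrt5} = -\frac{7}{\sqrt5}$$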
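The strategy of Example 2.7.10 is just a small linear solve. Here is a minimal sketch, using Cramer's rule for the 2×2 system; the variable names `m11`, `m12` and so on are ours.

```python
import math

# Each given rate is (gradient) . (unit direction), which is linear in (a, b):
#   (1/sqrt2) a + (1/sqrt2) b = 2 sqrt2    (towards P1, displacement <1, 1>)
#          0 a +       (-1) b = -3         (towards P2, displacement <0, -2>)
s = 1 / math.sqrt(2)
m11, m12, r1 = s, s, 2 * math.sqrt(2)
m21, m22, r2 = 0.0, -1.0, -3.0

# Cramer's rule for the 2x2 system.
det = m11 * m22 - m12 * m21
a = (r1 * m22 - m12 * r2) / det
b = (m11 * r2 - r1 * m21) / det

# Rate of change towards the origin: displacement <-1, -2> from P0 = (1, 2).
rate_to_origin = (a * -1 + b * -2) / math.sqrt(5)
print(a, b, rate_to_origin)
```

Up to rounding, this should reproduce $a = 1$, $b = 3$ and the rate $-7/\sqrt5$.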
$$x_0^2 + y_0^2 + z_0^2 = 1$$
$$\vec n = \nabla g(x_0,y_0,z_0) = \langle 2x_0, 2y_0, 2z_0\rangle$$
Let's pause to take stock. We need to find all (a, b, c)'s such that the statement
$(x_0,y_0,z_0)$ is on both spheres ⟹ the normal vectors $\hat N$ and $\hat n$ are perpendicular
is true. In equations, we need to find all (a, b, c)'s such that the statement
$$(x_0,y_0,z_0)\ \text{obeys}\ x_0^2 + y_0^2 + z_0^2 = 1\ \text{and}\ (x_0-a)^2 + (y_0-b)^2 + (z_0-c)^2 = 1 \tag{S1}$$
implies the statement
$$x_0(x_0-a) + y_0(y_0-b) + z_0(z_0-c) = 0 \tag{S2}$$
Now if we expand (S2) then we can, with a little care, massage it into something that looks more like (S1).
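$$x_0(x_0-a) + y_0(y_0-b) + z_0(z_0-c) = x_0^2 + y_0^2 + z_0^2 - a x_0 - b y_0 - c z_0 = \frac12\Big\{\big[x_0^2 + y_0^2 + z_0^2\big] + \big[(x_0-a)^2 + (y_0-b)^2 + (z_0-c)^2\big] - a^2 - b^2 - c^2\Big\}$$
When (S1) holds, both square brackets equal 1, so the right hand side is $\frac12\{2 - a^2 - b^2 - c^2\}$, which is zero exactly when $a^2 + b^2 + c^2 = 2$.
Our conclusion is that the set of allowed points (a, b, c) is the sphere of radius $\sqrt2$ centred on the origin.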
function f (r, θ) of the polar coordinates r and θ. We are supposed to convert this function to Cartesian coordinates.
This means that we are to consider the function
$$g(x,y) = f\big(r(x,y)\,,\,\theta(x,y)\big)$$
with
$$r(x,y) = \sqrt{x^2+y^2} \qquad \theta(x,y) = \arctan\frac{y}{x}$$
Then we are to compute the gradient of g(x, y) and express the answer in terms of r and θ. By the chain rule,
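$$\begin{aligned}
\frac{\partial g}{\partial x} &= \frac{\partial f}{\partial r}\frac{\partial r}{\partial x} + \frac{\partial f}{\partial\theta}\frac{\partial\theta}{\partial x}\\
&= \frac{\partial f}{\partial r}\,\frac12\frac{2x}{\sqrt{x^2+y^2}} + \frac{\partial f}{\partial\theta}\,\frac{-y/x^2}{1+(y/x)^2}\\
&= \frac{\partial f}{\partial r}\,\frac{x}{\sqrt{x^2+y^2}} - \frac{\partial f}{\partial\theta}\,\frac{y}{x^2+y^2}\\
&= \frac{\partial f}{\partial r}\,\frac{r\cos\theta}{r} - \frac{\partial f}{\partial\theta}\,\frac{r\sin\theta}{r^2}\\
&= \frac{\partial f}{\partial r}\cos\theta - \frac{\partial f}{\partial\theta}\,\frac{\sin\theta}{r}
\end{aligned}$$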
Similarly
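$$\begin{aligned}
\frac{\partial g}{\partial y} &= \frac{\partial f}{\partial r}\frac{\partial r}{\partial y} + \frac{\partial f}{\partial\theta}\frac{\partial\theta}{\partial y}\\
&= \frac{\partial f}{\partial r}\,\frac12\frac{2y}{\sqrt{x^2+y^2}} + \frac{\partial f}{\partial\theta}\,\frac{1/x}{1+(y/x)^2}\\
&= \frac{\partial f}{\partial r}\,\frac{y}{\sqrt{x^2+y^2}} + \frac{\partial f}{\partial\theta}\,\frac{x}{x^2+y^2}\\
&= \frac{\partial f}{\partial r}\sin\theta + \frac{\partial f}{\partial\theta}\,\frac{\cos\theta}{r}
\end{aligned}$$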
So
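$$\langle g_x, g_y\rangle = f_r\,\langle\cos\theta, \sin\theta\rangle + \frac{f_\theta}{r}\,\langle -\sin\theta, \cos\theta\rangle$$
or, with all of the arguments written out,
$$\langle g_x(x,y), g_y(x,y)\rangle = f_r\big(r(x,y),\theta(x,y)\big)\,\langle\cos\theta(x,y), \sin\theta(x,y)\rangle + \frac{1}{r(x,y)}\,f_\theta\big(r(x,y),\theta(x,y)\big)\,\langle -\sin\theta(x,y), \cos\theta(x,y)\rangle$$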
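The polar form of the gradient can be tested on a concrete function. In the sketch below the test function $f(r,\theta) = r^2\sin\theta$ and the sample point are our own, hypothetical choices (with $x > 0$ so that $\theta = \arctan(y/x)$ applies). We compare central-difference values of $g_x$, $g_y$ against the formula above.

```python
import math

def f(r, theta):
    # Hypothetical test function of the polar coordinates.
    return r**2 * math.sin(theta)

def f_r(r, theta):
    return 2 * r * math.sin(theta)

def f_theta(r, theta):
    return r**2 * math.cos(theta)

def g(x, y):
    # The same function expressed in Cartesian coordinates (valid for x > 0).
    return f(math.hypot(x, y), math.atan(y / x))

x, y, h = 1.2, 0.7, 1e-6
r, theta = math.hypot(x, y), math.atan(y / x)

# Partial derivatives of g by central differences.
gx_numeric = (g(x + h, y) - g(x - h, y)) / (2 * h)
gy_numeric = (g(x, y + h) - g(x, y - h)) / (2 * h)

# Partial derivatives of g from the polar formula.
gx_formula = f_r(r, theta) * math.cos(theta) - f_theta(r, theta) * math.sin(theta) / r
gy_formula = f_r(r, theta) * math.sin(theta) + f_theta(r, theta) * math.cos(theta) / r
print(gx_numeric, gx_formula, gy_numeric, gy_formula)
```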
Exercises
Stage 1
1✳
2✳
Find $\nabla\big(y^2 + \sin(xy)\big)$.
Stage 2
3
Find the rate of change of the given function at the given point in the given direction.
1. $f(x,y) = 3x - 4y$ at the point (0, 2) in the direction $-2\,\hat\imath$.
2. $f(x,y,z) = x^{-1} + y^{-1} + z^{-1}$ at (2, −3, 4) in the direction $\hat\imath + \hat\jmath + \hat k$.
4
In what directions at the point (2, 0) does the function f (x, y) = xy have the specified rates of change?
1. −1
2. −2
3. −3
5
6✳
You are standing at a location where the surface of the earth is smooth. The slope in the southern direction is 4 and the slope in the south-eastern direction is $\sqrt2$. Find the slope in the eastern direction.
7✳
Assume that the directional derivative of w = f (x, y, z) at a point P is a maximum in the direction of the vector $2\,\hat\imath - \hat\jmath + \hat k$, and the value of the directional derivative in that direction is $3\sqrt6$.
1. Find the gradient vector of w = f (x, y, z) at P .
2. Find the directional derivative of w = f (x, y, z) at P in the direction of the vector $\hat\imath + \hat\jmath$.
8✳
The positive x-axis points east and the positive y -axis points north, and the hiker starts from the point P (2, 1, 4).
1. In what direction should the hiker proceed from P to ascend along the steepest path? What is the slope of the path?
2. Walking north from P , will the hiker start to ascend or descend? What is the slope?
3. In what direction should the hiker walk from P to remain at the same height?
9
Two hikers are climbing a (small) mountain whose height is $z = 1000 - 2x^2 - 3y^2$. They start at (1, 1, 995) and follow the path of steepest ascent. Their (x, y) coordinates obey $y = ax^b$ for some constants a, b. Determine a and b.
10 ✳
A mosquito is at the location (3, 2, 1) in $\mathbb{R}^3$. She knows that the temperature T near there is given by $T = 2x^2 + y^2 - z^2$.
1. She wishes to stay at the same temperature, but must fly in some initial direction. Find a direction in which the initial rate
of change of the temperature is 0.
2. If you and another student both get correct answers in part (a), must the directions you give be the same? Why or why not?
3. What initial direction or directions would suit the mosquito if she wanted to cool down as fast as possible?
11 ✳
1. A bird passes through (2, 1, 3) travelling towards (4, 3, 4) with speed 2. At what rate does the air temperature it
experiences change at this instant?
2. If instead the bird maintains constant altitude (z = 3 ) as it passes through (2, 1, 3) while also keeping at a fixed air
temperature, T = 8, what are its two possible directions of travel?
12 ✳
Let $f(x,y) = 2x^2 + 3xy + y^2$ be a function of x and y.
1. Find the maximum rate of change of f (x, y) at the point $P\big(1, -\frac43\big)$.
2. Find the directions in which the directional derivative of f (x, y) at the point $P\big(1, -\frac43\big)$ has the value $\frac15$.
13 ✳
The temperature T (x, y) at a point of the xy-plane is given by
$$T(x,y) = y\,e^{x^2}$$
A bug travels from left to right along the curve $y = x^2$ at a speed of 0.01 m/sec. The bug monitors T (x, y) continuously. What is the rate of change of T as the bug passes through the point (1, 1)?
14 ✳
1. Find the directional derivative of T at (3, 2, 1), in the direction of the point (0, 1, 2).
2. At the point (3, 2, 1), in what direction does the temperature decrease most rapidly?
3. Moving along the curve given by $x = 3e^t$, $y = 2\cos t$, $z = \sqrt{1+t}$, find $\frac{dT}{dt}$, the rate of change of temperature with respect to t, at t = 0.
4. Suppose $\hat\imath + 5\,\hat\jmath + a\,\hat k$ is a vector that is tangent to the temperature level surface T (x, y, z) = 3 at (3, 2, 1). What is a?
15 ✳
Let
$$f(x,y,z) = (2x+y)\,e^{-(x^2+y^2+z^2)}$$
$$g(x,y,z) = xz + y^2 + yz + z^2$$
3. A bat at (0, 1, −1) flies in the direction in which f (x, y, z) and g(x, y, z) do not change, but z increases. Find a vector in
this direction.
16 ✳
17 ✳
18 ✳
The directional derivative of a function w = f (x, y, z) at a point P in the direction of the vector $\hat\imath$ is 2, in the direction of the vector $\hat\imath + \hat\jmath$ is $-\sqrt2$, and in the direction of the vector $\hat\imath + \hat\jmath + \hat k$ is $-\frac{5}{\sqrt3}$. Find the direction in which the function w = f (x, y, z) has the maximum rate of change at the point P . What is this maximum rate of change?
19 ✳
Suppose it is known that the direction of the fastest increase of the function f (x, y) at the origin is given by the vector ⟨1, 2⟩ .
Find a unit vector u that is tangent to the level curve of f (x, y) that passes through the origin.
20 ✳
21 ✳
2. In which direction should a bird at the point (0, −1, 1) fly if it wants to keep both P and T constant. (Give one possible
direction vector. It does not need to be a unit vector.)
3. An ant crawls on the surface $z^3 + zx + y^2 = 2$. When the ant is at the point (0, −1, 1), in which direction should it go for maximum increase of the temperature $T = 5 + xy - z^2$? Your answer should be a vector ⟨a, b, c⟩ , not necessarily of unit
length. (Note that the ant cannot crawl in the direction of the gradient because that leads off the surface. The direction
vector ⟨a, b, c⟩ has to be on the tangent plane to the surface.)
22 ✳
$\frac{1}{\sqrt3}\langle 1,-1,-1\rangle$ and $w = \frac{1}{\sqrt3}\langle 1,1,1\rangle$.
$$D_v f = 0 \qquad D_w f = 4$$
23 ✳
The elevation of a hill is given by the equation $f(x,y) = x^2y^2e^{-x-y}$. An ant sits at the point $(1, 1, e^{-2})$.
2. Find a vector $v = \langle v_1, v_2, v_3\rangle$ pointing in the direction of the path that the ant could take in order to stay on the same elevation level $e^{-2}$.
3. Find a vector $v = \langle v_1, v_2, v_3\rangle$ pointing in the direction of the path that the ant should take in order to maximize its
24 ✳
2
y
2
+ 2z
2
degrees.
1. A sparrow is flying along the curve $\vec r(s) = \big(\frac13 s^3, 2s, s^2\big)$ at a constant speed of $3\ \mathrm{m\,s^{-1}}$. What is the velocity of the sparrow when s = 1?
2. At what rate does the sparrow feel the temperature is changing at the point $A\big(\frac13, 2, 1\big)$ for which s = 1?
3. At the point $A\big(\frac13, 2, 1\big)$ in what direction will the temperature be decreasing at maximum rate?
4. An eagle crosses the path of the sparrow at $A\big(\frac13, 2, 1\big)$, is moving at right angles to the path of the sparrow, and is also
moving in a direction in which the temperature remains constant. In what directions could the eagle be flying as it passes
through the point A?
25 ✳
Assume that the temperature T at a point (x, y, z) near a flame at the origin is given by
$$T(x,y,z) = \frac{200}{1 + x^2 + y^2 + z^2}$$
where the coordinates are given in meters and the temperature is in degrees Celsius. Suppose that at some moment in time, a
moth is at the point (3, 4, 0) and is flying at a constant speed of 1m/s in the direction of maximum increase of temperature.
1. Find the velocity vector $\vec v$ of the moth at this moment.
2. What rate of change of temperature does the moth feel at that moment?
26 ✳
We say that u is inversely proportional to v if there is a constant k so that u = k/v. Suppose that the temperature T in a metal
ball is inversely proportional to the distance from the centre of the ball, which we take to be the origin. The temperature at the
point (1, 2, 2) is 120°.
27 ✳
The depth of a lake in the xy-plane is equal to $f(x,y) = 32 - x^2 - 4x - 4y^2$ meters.
1. Sketch the shoreline of the lake in the xy-plane.
Your calculus instructor is in the water at the point (−1, 1). Find a unit vector which indicates in which direction he should swim in order to:
2. stay at a constant depth?
3. increase his depth as rapidly as possible (i.e. be most likely to drown)?
Stage 3
28
1. Draw a contour diagram for T showing some isotherms (curves of constant temperature).
2. In what direction should an ant at position (2, −1) move if it wishes to cool off as quickly as possible?
3. If the ant moves in that direction at speed v at what rate does its temperature decrease?
4. What would the rate of decrease of temperature of the ant be if it moved from (2, −1) at speed v in direction ⟨−1, −2⟩ ?
5. Along what curve through (2, −1) should the ant move to continue experiencing maximum rate of cooling?
29 ✳
1. Give the direction in which f is increasing the fastest at the point (1, 0, π/2).
2. Give an equation for the plane T tangent to the surface
P = {(x, y, z)|x + z = 0} .
30 ✳
1. A bee starts flying at P and flies along the unit vector pointing towards the point Q = (3, 2, 2). What is the rate of change
of T (x, y, z) in this direction?
2. Use the linear approximation of T at the point P to approximate T (1.9, 1, 1.2).
3. Let S(x, y, z) = x + z. A bee starts flying at P ; along which unit vector direction should the bee fly so that the rate of
change of T (x, y, z) and of S(x, y, z) are both zero in this direction?
31 ✳
32 ✳
A meteor strikes the ground in the heartland of Canada. Using satellite photographs, a model
$$z = f(x,y) = -\frac{100}{x^2 + 2x + 4y^2 + 11}$$
of the resulting crater is made and a plan is drawn up to convert the site into a tourist attraction. A car park is to be built at
(4, 5) and a hiking trail is to be made. The trail is to start at the car park and take the steepest route to the bottom of the crater.
1. Sketch a map of the proposed site clearly marking the car park, a few level curves for the function f and the trail.
2. In which direction does the trail leave the car park?
33 ✳
You are standing at a lone palm tree in the middle of the Exponential Desert. The height of the sand dunes around you is given
in meters by
$$h(x,y) = 100\,e^{-(x^2+2y^2)}$$
where x represents the number of meters east of the palm tree (west if x is negative) and y represents the number of meters
north of the palm tree (south if y is negative).
1. Suppose that you walk 3 meters east and 2 meters north. At your new location, (3, 2), in what direction is the sand dune
sloping most steeply downward?
2. If you walk north from the location described in part (a), what is the instantaneous rate of change of height of the sand
dune?
3. If you are standing at (3, 2) in what direction should you walk to ensure that you remain at the same height?
4. Find the equation of the curve through (3, 2) that you should move along in order that you are always pointing in a steepest
descent direction at each point of this curve.
34 ✳
Let f (x, y) be a differentiable function with f (1, 2) = 7. Let
$$u = \frac35\,\hat\imath + \frac45\,\hat\jmath \qquad v = \frac35\,\hat\imath - \frac45\,\hat\jmath$$
be unit vectors. Suppose it is known that the directional derivatives $D_u f(1,2)$ and $D_v f(1,2)$ are equal to 10 and 2 respectively.
1. Show that the gradient vector ∇f at (1, 2) is $10\,\hat\imath + 5\,\hat\jmath$.
3. Using the tangent plane approximation, estimate the value of f (1.01, 2.05).
This page titled 2.7: Directional Derivatives and the Gradient is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or
curated by Joel Feldman, Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts
platform; a detailed edit history is available upon request.
2.8: Optional — Solving the Wave Equation
Many phenomena are modelled by equations that relate the rates of change of various quantities. As rates of change are given by
derivatives the resulting equations contain derivatives and so are called differential equations. We saw a number of such differential
equations in §2.4 of the CLP-1 text.
In this section we consider
$$\frac{\partial^2 w}{\partial x^2}(x,t) - \frac{1}{c^2}\frac{\partial^2 w}{\partial t^2}(x,t) = 0$$
This is an extremely important 1 partial differential equation called the “wave equation” (in one spatial dimension) that is used in
modelling water waves, sound waves, seismic waves, light waves and so on. The reason that we are looking at it here is that we can
use what we have just learned to see that its solutions are waves travelling with speed c.
To start, we'll use gradients and the chain rule to find the solution of the slightly simpler equation
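$$\frac{\partial w}{\partial x}(x,t) - \frac1c\frac{\partial w}{\partial t}(x,t) = 0$$
This equation tells us that the gradient of any solution w(x, t) must always be perpendicular to the constant vector $\langle 1, -\frac1c\rangle$. A vector ⟨a, b⟩ is perpendicular to $\langle 1, -\frac1c\rangle$ if and only if
$$\langle a,b\rangle\cdot\Big\langle 1, -\frac1c\Big\rangle = 0 \iff a - \frac{b}{c} = 0 \iff b = ac \iff \langle a,b\rangle = a\,\langle 1, c\rangle$$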
Thus the gradient of any solution w(x, t) must always be parallel to the constant vector ⟨1 , c⟩ .
Recall that one of our implications following Definition 2.7.5 is that the gradient of w(x, t) must always be perpendicular to the
level curves of w.
So the level curves of w(x, t) are always perpendicular to the constant vector ⟨1 , c⟩. They must be straight lines with
equations of the form
⟨1 , c⟩ ⋅ ⟨x − x0 , t − t0 ⟩ = 0 or x + ct = u with u a constant
That is, for each constant u, w(x, t) takes the same value at each point of the straight line x + ct = u. Call that value U (u). So
w(x, t) = U (u) = U (x + ct) for some function U .
This solution represents a wave packet moving to the left with speed c. You can see this by observing that all points (x, t) in space-
time for which x + ct takes the same fixed value, say z, have the same value of U (x + ct), namely U (z). So if you move so that
your position at time t is x = z − ct (i.e. move to the left with speed c) you always see the same value of w. The figure below illustrates this. It contains the graphs of $U(x)$, $U(x+c) = U(x+ct)\big|_{t=1}$ and $U(x+2c) = U(x+ct)\big|_{t=2}$ for a bump shaped U (x). In the figure the location of the tick z on the x-axis was chosen so that $U(z) = \max_x U(x)$.
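To see the leftward motion concretely, here is a small numerical sketch. The bump profile $U(u) = e^{-u^2}$ and the speed $c = 2$ are assumptions made purely for illustration. The code checks that $w(x,t) = U(x+ct)$ satisfies $\frac{\partial w}{\partial x} - \frac1c\frac{\partial w}{\partial t} = 0$ up to finite-difference error, and locates the peak of the bump at times $t = 0, 1, 2$.

```python
import math

c = 2.0                         # assumed wave speed

def U(u):
    # Assumed bump-shaped profile, with its maximum at u = 0.
    return math.exp(-u * u)

def w(x, t):
    return U(x + c * t)

def residual(x, t, h=1e-5):
    """Central-difference estimate of w_x - (1/c) w_t at (x, t)."""
    w_x = (w(x + h, t) - w(x - h, t)) / (2 * h)
    w_t = (w(x, t + h) - w(x, t - h)) / (2 * h)
    return w_x - w_t / c

# Largest residual over a grid of sample points.
worst = max(abs(residual(x / 10, t / 10))
            for x in range(-30, 31) for t in range(0, 21))

def peak_position(t):
    """Grid point x in [-6, 6] at which w(x, t) is largest."""
    xs = [i / 100 for i in range(-600, 601)]
    return max(xs, key=lambda x: w(x, t))

peaks = [peak_position(t) for t in (0.0, 1.0, 2.0)]
print(worst, peaks)
```

The peak sits at $x = -ct$, so it moves a distance $c$ to the left per unit time.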
2.8.1 https://math.libretexts.org/@go/page/92241
The above argument that led to the solution w(x, t) = U (x + ct) was somewhat handwavy. But we can easily turn it into a much tighter argument by simply changing variables from (x, t) to (u, v) with u = x + ct. It doesn't much matter what we choose (within reason) for the new variable v. Let's take v = x − ct. Then $x = \frac{u+v}{2}$ and $t = \frac{u-v}{2c}$ and it is easy to translate back and forth between x, t and u, v.
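Now define the function W (u, v) by
$$w(x,t) = W(x+ct\,,\,x-ct)$$
By the chain rule,
$$\begin{aligned}
\frac{\partial w}{\partial x}(x,t) &= \frac{\partial}{\partial x}\big[W(x+ct\,,\,x-ct)\big]\\
&= \frac{\partial W}{\partial u}(x+ct\,,\,x-ct)\,\frac{\partial}{\partial x}(x+ct) + \frac{\partial W}{\partial v}(x+ct\,,\,x-ct)\,\frac{\partial}{\partial x}(x-ct)\\
&= \frac{\partial W}{\partial u}(x+ct\,,\,x-ct)\times 1 + \frac{\partial W}{\partial v}(x+ct\,,\,x-ct)\times 1
\end{aligned}$$
and
$$\begin{aligned}
\frac{\partial w}{\partial t}(x,t) &= \frac{\partial}{\partial t}\big[W(x+ct\,,\,x-ct)\big]\\
&= \frac{\partial W}{\partial u}(x+ct\,,\,x-ct)\,\frac{\partial}{\partial t}(x+ct) + \frac{\partial W}{\partial v}(x+ct\,,\,x-ct)\,\frac{\partial}{\partial t}(x-ct)\\
&= \frac{\partial W}{\partial u}(x+ct\,,\,x-ct)\times c + \frac{\partial W}{\partial v}(x+ct\,,\,x-ct)\times(-c)
\end{aligned}$$
Subtracting $\frac1c$ times the second equation from the first equation gives
$$\frac{\partial w}{\partial x}(x,t) - \frac1c\frac{\partial w}{\partial t}(x,t) = 2\,\frac{\partial W}{\partial v}(x+ct\,,\,x-ct)$$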
So
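w(x, t) obeys the equation $\frac{\partial w}{\partial x}(x,t) - \frac1c\frac{\partial w}{\partial t}(x,t) = 0$ for all x, t
if and only if
W (u, v) obeys the equation $\frac{\partial W}{\partial v}(x+ct\,,\,x-ct) = 0$ for all x, t,
which, substituting $x = \frac{u+v}{2}$ and $t = \frac{u-v}{2c}$, is the case if and only if
W (u, v) obeys the equation $\frac{\partial W}{\partial v}(u,v) = 0$ for all u, v
The equation $\frac{\partial W}{\partial v}(u,v) = 0$ means that W (u, v) is independent of v, so that W (u, v) is of the form W (u, v) = U (u), for some function U , and, so finally,
$$w(x,t) = W(x+ct\,,\,x-ct) = U(x+ct)$$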
Now that we have solved our toy equation, let's move on to the 1d wave equation.
Example 2.8.1. Wave Equation
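We'll now expand the above argument to find the general solution to
$$\frac{\partial^2 w}{\partial x^2}(x,t) - \frac{1}{c^2}\frac{\partial^2 w}{\partial t^2}(x,t) = 0$$
We'll again make the change of variables from (x, t) to (u, v) with u = x + ct and v = x − ct and again define the function W (u, v) by
$$w(x,t) = W(x+ct\,,\,x-ct)$$
By the chain rule,
$$\frac{\partial w}{\partial x}(x,t) = \frac{\partial}{\partial x}\big[W(x+ct\,,\,x-ct)\big] = \frac{\partial W}{\partial u}(x+ct\,,\,x-ct) + \frac{\partial W}{\partial v}(x+ct\,,\,x-ct)$$
$$\frac{\partial w}{\partial t}(x,t) = \frac{\partial}{\partial t}\big[W(x+ct\,,\,x-ct)\big] = \frac{\partial W}{\partial u}(x+ct\,,\,x-ct)\times c + \frac{\partial W}{\partial v}(x+ct\,,\,x-ct)\times(-c)$$
We now need to differentiate a second time. Write $W_1(u,v) = \frac{\partial W}{\partial u}(u,v)$ and $W_2(u,v) = \frac{\partial W}{\partial v}(u,v)$ so that
$$\frac{\partial w}{\partial x}(x,t) = W_1(x+ct\,,\,x-ct) + W_2(x+ct\,,\,x-ct)$$
$$\frac{\partial w}{\partial t}(x,t) = c\,W_1(x+ct\,,\,x-ct) - c\,W_2(x+ct\,,\,x-ct)$$
Using the chain rule again, and writing $W_{ij}$ for the partial derivative of $W_i$ with respect to its j-th argument,
$$\frac{\partial^2 w}{\partial x^2}(x,t) = \frac{\partial}{\partial x}\big[W_1(x+ct\,,\,x-ct)\big] + \frac{\partial}{\partial x}\big[W_2(x+ct\,,\,x-ct)\big] = W_{11} + W_{12} + W_{21} + W_{22}$$
$$\frac{\partial^2 w}{\partial t^2}(x,t) = c\,\frac{\partial}{\partial t}\big[W_1(x+ct\,,\,x-ct)\big] - c\,\frac{\partial}{\partial t}\big[W_2(x+ct\,,\,x-ct)\big] = c^2 W_{11} - c^2 W_{12} - c^2 W_{21} + c^2 W_{22}$$
with all of the functions on the right hand sides having arguments (x + ct , x − ct). So, subtracting $\frac{1}{c^2}$ times the second from the first, and using that the mixed partial derivatives $W_{12}$ and $W_{21}$ are equal, we get
$$\frac{\partial^2 w}{\partial x^2}(x,t) - \frac{1}{c^2}\frac{\partial^2 w}{\partial t^2}(x,t) = 4\,\frac{\partial^2 W}{\partial u\,\partial v}(x+ct\,,\,x-ct)$$
So $\frac{\partial^2 w}{\partial x^2}(x,t) - \frac{1}{c^2}\frac{\partial^2 w}{\partial t^2}(x,t) = 0$ for all x and t if and only if
$$\frac{\partial^2 W}{\partial u\,\partial v}(u,v) = 0$$
for all u and v.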
This tells us that the u-derivative of $\frac{\partial W}{\partial v}$ is zero, so that $\frac{\partial W}{\partial v}$ is independent of u. That is $\frac{\partial W}{\partial v}(u,v) = \tilde V(v)$ for some function $\tilde V$. The reason that we have called it $\tilde V$ instead of V will become evident shortly.
Recall that to apply $\frac{\partial}{\partial v}$, you treat u as a constant and differentiate with respect to v.
So $\frac{\partial W}{\partial v}(u,v) = \tilde V(v)$ says that, when u is thought of as a constant, W is an antiderivative of $\tilde V$.
That is, $W(u,v) = \int \tilde V(v)\,dv + U$, with U being an arbitrary constant. As u is being thought of as a constant, U is allowed to depend on u.
So, denoting by V any antiderivative of $\tilde V$, we can write our solution in a very neat form.
$$W(u,v) = U(u) + V(v) \qquad\text{or equivalently}\qquad w(x,t) = U(x+ct) + V(x-ct)$$
As we saw above U (x + ct) represents a wave packet moving to the left with speed c. Similarly, V (x − ct) represents a wave
packet moving to the right with speed c.
This is known as d'Alembert's form of the solution. It is named after Jean le Rond d'Alembert, 1717–1783, who was a French mathematician, physicist, philosopher and music theorist.
Notice that w(x, t) = U (x + ct) + V (x − ct) is a solution regardless of what U and V are. The differential equation cannot
tell us what U and V are. To determine them, we need more information about the system — usually in the form of initial
conditions, like $w(x,0) = \cdots$ and $\frac{\partial w}{\partial t}(x,0) = \cdots$. General techniques for solving partial differential equations lie beyond this text, but definitely require a good understanding of multivariable calculus. A good reason to keep on reading!
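D'Alembert's form is easy to test numerically. In the sketch below the profiles $U(u) = e^{-u^2}$ and $V(v) = \sin v$ and the speed $c = 3$ are arbitrary choices of ours. The code approximates $\frac{\partial^2 w}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2 w}{\partial t^2}$ by second central differences and reports the largest value found on a grid of sample points.

```python
import math

c = 3.0                          # assumed wave speed

def U(u):
    return math.exp(-u * u)      # assumed left-moving profile

def V(v):
    return math.sin(v)           # assumed right-moving profile

def w(x, t):
    return U(x + c * t) + V(x - c * t)

def wave_residual(x, t, h=1e-3):
    """Second central-difference estimate of w_xx - w_tt / c^2 at (x, t)."""
    w_xx = (w(x + h, t) - 2 * w(x, t) + w(x - h, t)) / h**2
    w_tt = (w(x, t + h) - 2 * w(x, t) + w(x, t - h)) / h**2
    return w_xx - w_tt / c**2

worst = max(abs(wave_residual(x / 4, t / 4))
            for x in range(-8, 9) for t in range(0, 5))
print(worst)
```

The printed residual is not exactly zero, but it is tiny compared with the size of the individual second derivatives, and it shrinks as `h` is reduced (until rounding error takes over).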
in one application. To be precise, we apply Newton's law to an elastic string, and conclude that small amplitude transverse
vibrations of the string obey the wave equation.
Here is a sketch of a tiny element of the string.
The basic notation that we will use (most of which appears in the sketch) is
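w(x, t) = vertical displacement of the string from the x axis at position x and time t
θ(x, t) = angle between the string and a horizontal line at position x and time t
T (x, t) = tension in the string at position x and time t
ρ(x) = mass density (per unit length) of the string at position x
The forces acting on the tiny element of string at time t are
1. tension pulling to the right, which has magnitude T (x + Δx, t) and acts at an angle θ(x + Δx, t) above horizontal,
2. tension pulling to the left, which has magnitude T (x, t) and acts at an angle θ(x, t) below horizontal and, possibly,
3. various external forces, like gravity. We shall assume that all of the external forces act vertically and we shall denote by F (x, t)Δx the net magnitude of the external force acting on the element of string.
The length of the element of string is essentially $\sqrt{\Delta x^2 + \Delta w^2}$ so that the mass of the element of string is essentially $\rho(x)\sqrt{\Delta x^2 + \Delta w^2}$ and the vertical component of Newton's law F = ma says that
$$\rho(x)\sqrt{\Delta x^2 + \Delta w^2}\ \frac{\partial^2 w}{\partial t^2}(x,t) = T(x+\Delta x,t)\sin\theta(x+\Delta x,t) - T(x,t)\sin\theta(x,t) + F(x,t)\,\Delta x$$
Dividing by Δx and taking the limit as Δx → 0 gives
$$\rho(x)\sqrt{1 + \Big(\frac{\partial w}{\partial x}\Big)^2}\ \frac{\partial^2 w}{\partial t^2}(x,t) = \frac{\partial T}{\partial x}(x,t)\sin\theta(x,t) + T(x,t)\cos\theta(x,t)\,\frac{\partial\theta}{\partial x}(x,t) + F(x,t) \tag{E1}$$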
We can dispose of all the θ 's by observing from the figure above that
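$$\tan\theta(x,t) = \lim_{\Delta x\to 0}\frac{\Delta w}{\Delta x} = \frac{\partial w}{\partial x}(x,t)$$
so that
$$\theta(x,t) = \arctan\frac{\partial w}{\partial x}(x,t) \qquad \frac{\partial\theta}{\partial x}(x,t) = \frac{\frac{\partial^2 w}{\partial x^2}(x,t)}{1 + \big(\frac{\partial w}{\partial x}(x,t)\big)^2}$$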
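Substituting these formulae into (E1) gives a horrendous mess. However, we can get considerable simplification by looking only at small vibrations. By a small vibration, we mean that |θ(x, t)| ≪ 1 for all x and t. This implies that | tan θ(x, t)| ≪ 1, hence that $\big|\frac{\partial w}{\partial x}(x,t)\big| \ll 1$ and hence that
$$\sqrt{1 + \Big(\frac{\partial w}{\partial x}\Big)^2} \approx 1 \qquad \sin\theta(x,t) \approx \frac{\partial w}{\partial x}(x,t) \qquad \cos\theta(x,t) \approx 1 \qquad \frac{\partial\theta}{\partial x}(x,t) \approx \frac{\partial^2 w}{\partial x^2}(x,t) \tag{E2}$$
Substituting these approximations into (E1) gives
$$\rho(x)\,\frac{\partial^2 w}{\partial t^2}(x,t) = \frac{\partial T}{\partial x}(x,t)\,\frac{\partial w}{\partial x}(x,t) + T(x,t)\,\frac{\partial^2 w}{\partial x^2}(x,t) + F(x,t) \tag{E3}$$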
which is indeed relatively simple, but still exhibits a problem. This is one equation in the two unknowns w and T .
Fortunately there is a second equation lurking in the background, that we haven't used yet. Namely, the horizontal component of
Newton's law of motion. As a second simplification, we assume that there are only transverse vibrations. That is, our tiny string
element moves only vertically. Then the net horizontal force on it must be zero. That is,
$$\frac{\partial}{\partial x}\big[T(x,t)\cos\theta(x,t)\big] = 0$$
Thus T (x, t) cos θ(x, t) is independent of x. For small amplitude vibrations, cos θ is very close to one, for all x. So T is a function
of t only, which is determined by how hard you are pulling on the ends of the string at time t. So for small, transverse vibrations,
(E3) simplifies further to
$$\rho(x)\,\frac{\partial^2 w}{\partial t^2}(x,t) = T(t)\,\frac{\partial^2 w}{\partial x^2}(x,t) + F(x,t)$$
In the event that the string density ρ is a constant, independent of x, the string tension T (t) is a constant independent of t (in other
words you are not continually playing with the tuning pegs) and there are no external forces F we end up with the wave equation
$$\frac{\partial^2 w}{\partial t^2}(x,t) = c^2\,\frac{\partial^2 w}{\partial x^2}(x,t) \qquad\text{where } c = \sqrt{\frac{T}{\rho}}$$
as desired.
The equation that is called the wave equation has built into it a lot of approximations. By going through the derivation, we have
seen what those approximations are, and we can get some idea as to when they are applicable.
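To get a feel for the formula $c = \sqrt{T/\rho}$, here is a tiny sketch. The tension and density values are made up for illustration and do not come from the text.

```python
import math

def wave_speed(tension, density):
    """Wave speed c = sqrt(T / rho) for a string with constant tension T
    (in newtons) and constant linear mass density rho (in kg per metre)."""
    return math.sqrt(tension / density)

c = wave_speed(tension=100.0, density=0.01)        # hypothetical string
c_tighter = wave_speed(tension=200.0, density=0.01)
print(c, c_tighter / c)
```

Doubling the tension multiplies the wave speed by $\sqrt2$, which is why tightening a tuning peg raises the pitch of a string.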
1. If you plug “wave equation” into your favourite search engine you will get more than a million hits.
This page titled 2.8: Optional — Solving the Wave Equation is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or
curated by Joel Feldman, Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts
platform; a detailed edit history is available upon request.
2.9: Maximum and Minimum Values
One of the core topics in single variable calculus courses is finding the maxima and minima of functions of one variable. We'll now
extend that discussion to functions of more than one variable 1. Rather than leaping into the deep end, we'll not be too ambitious
and concentrate on functions of two variables. That being said, many of the techniques work more generally. To start, we have the
following natural extensions to some familiar definitions.
Definition 2.9.1
Let the function f (x, y) be defined for all (x, y) in some subset R of $\mathbb{R}^2$. Let (a, b) be a point in R.
(a, b) is a local maximum of f (x, y) if f (x, y) ≤ f (a, b) for all (x, y) close to (a, b). More precisely, (a, b) is a local
maximum of f (x, y) if there is an r > 0 such that f (x, y) ≤ f (a, b) for all points (x, y) within a distance r of (a, b).
(a, b) is a local minimum of f (x, y) if f (x, y) ≥ f (a, b) for all (x, y) close to (a, b).
Local maximum and minimum values are also called extremal values.
(a, b) is an absolute maximum or global maximum of f (x, y) if f (x, y) ≤ f (a, b) for all (x, y) in R.
(a, b) is an absolute minimum or global minimum of f (x, y) if f (x, y) ≥ f (a, b) for all (x, y) in R.
Let's recall why that's true. Suppose that the largest value of f (x) is f (a). Then for all h > 0,
$$f(a+h) \le f(a) \implies f(a+h) - f(a) \le 0 \implies \frac{f(a+h) - f(a)}{h} \le 0 \quad\text{if } h > 0$$
You also observed at the time that for this argument to work, you only need f (x) ≤ f (a) for all x's close to a, not necessarily for
all x's in the whole world. (In the above inequalities, we only used f (a + h) with h small.) Since we care only about f (x) for x
near a, we can refine the above statement.
If f (a) is a local maximum for f (x) and f is differentiable at a, then $f'(a) = 0$.
Let's use the ideas of the above discourse to extend the study of local maxima and local minima to functions of more than one
variable. Suppose that the function f (x, y) is defined for all (x, y) in some subset R of $\mathbb{R}^2$, that (a, b) is a point of R that is not on the boundary of R, and that f has a local maximum at (a, b). See the figure below.
2.9.1 https://math.libretexts.org/@go/page/92242
Then the function f (x, y) must decrease in value as (x, y) moves away from (a, b) in any direction. No matter which direction $\vec d$ we choose, the directional derivative of f at (a, b) in direction $\vec d$ must be zero or smaller. Writing this in mathematical symbols, we get
$$D_{\vec d} f(a,b) = \nabla f(a,b)\cdot\frac{\vec d}{|\vec d|} \le 0$$
And the directional derivative of f at (a, b) in the direction $-\vec d$ also must be zero or negative.
$$D_{-\vec d} f(a,b) = \nabla f(a,b)\cdot\frac{-\vec d}{|\vec d|} \le 0 \quad\text{which implies that}\quad \nabla f(a,b)\cdot\frac{\vec d}{|\vec d|} \ge 0$$
As $\nabla f(a,b)\cdot\frac{\vec d}{|\vec d|}$ must be both positive (or zero) and negative (or zero) at the same time, it must be zero. In particular, choosing $\vec d = \hat\imath$ forces the x component of ∇f (a, b) to be zero, and choosing $\vec d = \hat\jmath$ forces the y component of ∇f (a, b) to be zero. We have thus shown that ∇f (a, b) = 0. The same argument shows that ∇f (a, b) = 0 when (a, b) is a local minimum too.
This is an important and useful result, so let's theoremise it.
Theorem 2.9.2
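Let the function f (x, y) be defined for all (x, y) in some subset R of $\mathbb{R}^2$. Assume that
(a, b) is a point of R that is not on the boundary of R and
(a, b) is a local maximum or local minimum of f and that
the partial derivatives of f exist at (a, b).
Then
∇f (a, b) = 0.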
Definition 2.9.3
Let f (x, y) be a function and let (a, b) be a point in its domain. Then
if ∇f (a, b) exists and is zero we call (a, b) a critical point (or a stationary point) of the function, and
if ∇f (a, b) does not exist then we call (a, b) a singular point of the function.
Warning 2.9.4
Note that some people (and texts) combine both of these cases and call (a, b) a critical point when either the gradient is zero or
does not exist.
Warning 2.9.5
Theorem 2.9.2 tells us that every local maximum or minimum (in the interior of the domain of a function whose partial
derivatives exist) is a critical point. Beware that it does not 4 tell us that every critical point is either a local maximum or a local
minimum.
In fact, we shall see later 5, in Examples 2.9.13 and 2.9.15, critical points that are neither local maxima nor local minima. Nonetheless, Theorem 2.9.2 is very useful because often functions have only a small number of critical points. To find local maxima and minima of such functions, we only need to consider their critical and singular points. We'll return later to the question of how to
tell if a critical point is a local maximum, local minimum or neither. For now, we'll just practice finding critical points.
Solution
To find the critical points, we need to find the gradient. To find the gradient we need to find the first order partial derivatives.
So, as a preliminary calculation, we find the two first order partial derivatives of f (x, y).
\[ f_x(x, y) = 2x - 2y + 2 \qquad f_y(x, y) = -2x + 4y - 6 \]
So the critical points are the solutions of the pair of equations
\[ 2x - 2y + 2 = 0 \qquad -2x + 4y - 6 = 0 \]
or equivalently (dividing by two and moving the constants to the right hand side)
x − y = −1 (E1)
−x + 2y = 3 (E2)
This is a system of two equations in two unknowns (x and y). One strategy for solving a system like this is to
First use one of the equations to solve for one of the unknowns in terms of the other unknown. For example, (E1) tells us
that y = x + 1. This expresses y in terms of x. We say that we have solved for y in terms of x.
Then substitute the result, y = x + 1 in our case, into the other equation, (E2). In our case, this gives
−x + 2(x + 1) = 3 ⟺ x +2 = 3 ⟺ x =1
We have now found that x = 1, y = x + 1 = 2 is the only solution. So the only critical point is (1, 2). Of course it only
takes a moment to verify that ∇f (1, 2) = ⟨0, 0⟩ . It is a good idea to do this as a simple check of our work.
An alternative strategy for solving a system of two equations in two unknowns, like (E1) and (E2), is to
add equations (E1) and (E2) together. This gives
\[ (E1) + (E2): \qquad y = 2 \]
The point here is that adding equations (E1) and (E2) together eliminates the unknown x, leaving us with one equation in
the unknown y, which is easily solved. For other systems of equations you might have to multiply the equations by some
numbers before adding them together.
We now know that y = 2. Substituting it into (E1) gives us
x − 2 = −1 ⟹ x =1
Once again (thankfully) we have found that the only critical point is (1, 2).
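As a quick machine check of the hand computation, the sketch below solves (E1) and (E2) in exact rational arithmetic and confirms that the gradient vanishes at the solution. It uses Cramer's rule rather than the substitution or elimination strategies described in the text, and the helper names `solve_2x2` and `grad_f` are illustrative choices, not part of the original example.

```python
from fractions import Fraction

def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*x + a12*y = b1 and a21*x + a22*y = b2 by Cramer's rule."""
    det = Fraction(a11 * a22 - a12 * a21)
    if det == 0:
        raise ValueError("the system does not have a unique solution")
    x = Fraction(b1 * a22 - a12 * b2) / det
    y = Fraction(a11 * b2 - b1 * a21) / det
    return x, y

def grad_f(x, y):
    """First order partial derivatives (f_x, f_y) from the example."""
    return (2 * x - 2 * y + 2, -2 * x + 4 * y - 6)

# (E1): x - y = -1    (E2): -x + 2y = 3
critical_point = solve_2x2(1, -1, -1, -1, 2, 3)
print(critical_point)            # the unique solution of (E1), (E2)
print(grad_f(*critical_point))   # both components should vanish
```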
This was pretty easy because we only had to solve linear equations, which in turn was a consequence of the fact that f (x, y) was a
polynomial of degree two. Here is an example with some slightly more challenging algebra.
Example 2.9.7. $f(x, y) = 2x^3 - 6xy + y^2 + 4y$
Find all critical points of $f(x, y) = 2x^3 - 6xy + y^2 + 4y$.
Solution
As in the last example, we need to find where the gradient is zero, and to find the gradient we need the first order partial
derivatives.
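\[ f_x = 6x^2 - 6y \qquad f_y = -6x + 2y + 4 \]
So (x, y) is a critical point if and only if
\[ 6x^2 - 6y = 0 \qquad (E1) \]
\[ -6x + 2y + 4 = 0 \qquad (E2) \]
The first equation gives $y = x^2$. Substituting $y = x^2$ into the second equation gives
\[ -6x + 2x^2 + 4 = 0 \iff x^2 - 3x + 2 = 0 \iff (x - 1)(x - 2) = 0 \iff x = 1 \text{ or } 2 \]
When x = 1, $y = 1^2 = 1$ and when x = 2, $y = 2^2 = 4$. So, there are two critical points: (1, 1), (2, 4).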
Alternatively, we could have also used the second equation to write y = 3x − 2, and then substituted that into the first
equation to get
\[ 6x^2 - 6(3x - 2) = 0 \iff x^2 - 3x + 2 = 0 \]
just as above.
And here is an example for which the algebra requires a bit more thought.
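Example 2.9.8. f (x, y) = xy(5x + y − 15)
Find all critical points of f (x, y) = xy(5x + y − 15).
Solution
The first order partial derivatives of $f(x, y) = xy(5x + y - 15) = 5x^2 y + xy^2 - 15xy$ are
\[ f_x(x, y) = 10xy + y^2 - 15y = y(10x + y - 15) \]
\[ f_y(x, y) = 5x^2 + 2xy - 15x = x(5x + 2y - 15) \]
The critical points are the solutions of $f_x(x, y) = f_y(x, y) = 0$. That is, we need to find all x, y that satisfy the pair of
equations
\[ y(10x + y - 15) = 0 \qquad (E1) \]
\[ x(5x + 2y - 15) = 0 \qquad (E2) \]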
The first equation, y(10x + y − 15) = 0, is satisfied if at least one of the two factors y, (10x + y − 15) is zero. So the first
equation is satisfied if at least one of the two equations
y =0 (E1a)
10x + y = 15 (E1b)
is satisfied. The second equation, x(5x + 2y − 15) = 0, is satisfied if at least one of the two factors x, (5x + 2y − 15) is
zero. So the second equation is satisfied if at least one of the two equations
x =0 (E2a)
5x + 2y = 15 (E2b)
is satisfied.
So both critical point equations (E1) and (E2) are satisfied if and only if at least one of (E1a), (E1b) is satisfied and in addition
at least one of (E2a), (E2b) is satisfied. So both critical point equations (E1) and (E2) are satisfied if and only if at least one of
the following four possibilities hold.
(E1a) and (E2a) are satisfied if and only if x = y = 0
(E1a) and (E2b) are satisfied if and only if y = 0, 5x + 2y = 15 ⟺ y = 0, 5x = 15
(E1b) and (E2a) are satisfied if and only if 10x + y = 15, x = 0 ⟺ y = 15, x = 0
(E1b) and (E2b) are satisfied if and only if 10x + y = 15, 5x + 2y = 15. We can use, for example, the second of these
equations to solve for x in terms of y: $x = \frac{1}{5}(15 - 2y)$. When we substitute this into the first equation we get
2(15 − 2y) + y = 15, which we can solve for y. This gives −3y = 15 − 30 or y = 5 and then $x = \frac{1}{5}(15 - 2 \times 5) = 1$.
In conclusion, the critical points are (0, 0), (3, 0), (0, 15) and (1, 5).
A more compact way to write what we have just done is
fx (x, y) = 0 and fy (x, y) = 0
⟺ {x = y = 0} or {y = 0, x = 3}
or {x = 0, y = 15} or {x = 1, y = 5}
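The case analysis above can be mirrored in a few lines of code. The following is a minimal sketch, assuming each factor equation is encoded as a hypothetical triple (coefficient of x, coefficient of y, right hand side); it pairs every choice from (E1a), (E1b) with every choice from (E2a), (E2b), solves the resulting linear system by Cramer's rule, and checks that both partial derivatives vanish at each solution.

```python
from fractions import Fraction
from itertools import product

def solve_2x2(eq1, eq2):
    """Solve two linear equations given as (coeff of x, coeff of y, rhs)."""
    (a11, a12, b1), (a21, a22, b2) = eq1, eq2
    det = a11 * a22 - a12 * a21
    if det == 0:
        return None  # this pair of equations has no unique solution
    return (Fraction(b1 * a22 - a12 * b2, det),
            Fraction(a11 * b2 - b1 * a21, det))

E1_cases = [(0, 1, 0), (10, 1, 15)]   # (E1a): y = 0     (E1b): 10x + y = 15
E2_cases = [(1, 0, 0), (5, 2, 15)]    # (E2a): x = 0     (E2b): 5x + 2y = 15

# One candidate per combination of a factor of (E1) with a factor of (E2).
critical_points = [solve_2x2(e1, e2) for e1, e2 in product(E1_cases, E2_cases)]

def grad_f(x, y):
    """(f_x, f_y) in the factored form used in the example."""
    return (y * (10 * x + y - 15), x * (5 * x + 2 * y - 15))

for point in critical_points:
    print(point, grad_f(*point))
```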
Let's try a more practical example — something from the real world. Well, a mathematician's “real world”. The interested reader
should search-engine their way to a discussion of “idealisation”, “game theory” “Cournot models” and “Bertrand models”. But
don't spend too long there. A discussion of breweries is about to take place.
Example 2.9.9
In a certain community, there are two breweries in competition 6, so that sales of each negatively affect the profits of the other.
If brewery A produces x litres of beer per month and brewery B produces y litres per month, then the profits of the two
breweries are given by
\[ P = 2x - \frac{2x^2 + y^2}{10^6} \qquad Q = 2y - \frac{4y^2 + x^2}{2 \times 10^6} \]
respectively. Find the sum of the two profits if each brewery independently sets its own production level to maximize its own
profit and assumes that its competitor does likewise. Then, assuming cartel behaviour, find the sum of the two profits if the two
breweries cooperate so as to maximize that sum 7.
Solution
If A adjusts x to maximize P (for y held fixed) and B adjusts y to maximize Q (for x held fixed) then x and y are determined
by the equations
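\[ P_x = 2 - \frac{4x}{10^6} = 0 \qquad (E1) \]
\[ Q_y = 2 - \frac{8y}{2 \times 10^6} = 0 \qquad (E2) \]
Equation (E1) yields $x = \frac{1}{2} 10^6$ and equation (E2) yields $y = \frac{1}{2} 10^6$. Knowing x and y we can determine P, Q and the total
profit
\[ P + Q = 2(x + y) - \frac{1}{10^6}\Big(\frac{5}{2}x^2 + 3y^2\Big) = 10^6\Big(1 + 1 - \frac{5}{8} - \frac{3}{4}\Big) = \frac{5}{8} 10^6 \]
On the other hand, if the two breweries together adjust x and y to maximize $P + Q = 2(x + y) - \frac{1}{10^6}\big(\frac{5}{2}x^2 + 3y^2\big)$,
then x and y are determined by
\[ (P + Q)_x = 2 - \frac{5x}{10^6} = 0 \qquad (E1) \]
\[ (P + Q)_y = 2 - \frac{6y}{10^6} = 0 \qquad (E2) \]
Equation (E1) yields $x = \frac{2}{5} 10^6$ and equation (E2) yields $y = \frac{1}{3} 10^6$. Again knowing x and y we can determine the total profit
\[ P + Q = 2(x + y) - \frac{1}{10^6}\Big(\frac{5}{2}x^2 + 3y^2\Big) = 10^6\Big(\frac{4}{5} + \frac{2}{3} - \frac{2}{5} - \frac{1}{3}\Big) = \frac{11}{15} 10^6 \]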
So cooperating really does help their profits. Unfortunately, like a very small tea-pot, consumers will be a little poorer 8.
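The two totals can be reproduced exactly with rational arithmetic. This is a small sketch under the profit formulas of the example; the function names `profit_A`, `profit_B` and `total` are illustrative, and the production levels are taken from solving the two pairs of first order conditions by hand rather than by a numerical optimizer.

```python
from fractions import Fraction

MILLION = Fraction(10**6)

def profit_A(x, y):
    """P = 2x - (2x^2 + y^2) / 10^6"""
    return 2 * x - (2 * x**2 + y**2) / MILLION

def profit_B(x, y):
    """Q = 2y - (4y^2 + x^2) / (2 * 10^6)"""
    return 2 * y - (4 * y**2 + x**2) / (2 * MILLION)

def total(x, y):
    return profit_A(x, y) + profit_B(x, y)

# Competition: P_x = 2 - 4x/10^6 = 0 and Q_y = 2 - 8y/(2*10^6) = 0.
x_comp, y_comp = MILLION / 2, MILLION / 2
# Cooperation: (P+Q)_x = 2 - 5x/10^6 = 0 and (P+Q)_y = 2 - 6y/10^6 = 0.
x_coop, y_coop = 2 * MILLION / 5, MILLION / 3

competing_total = total(x_comp, y_comp)
cooperating_total = total(x_coop, y_coop)
print(float(competing_total), float(cooperating_total))
```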
Moving swiftly away from the last pun, let's do something a little more geometric.
Example 2.9.10
Equal angle bends are made at equal distances from the two ends of a 100 metre long fence so the resulting three segment
fence can be placed along an existing wall to make an enclosure of trapezoidal shape. What is the largest possible area for such
an enclosure?
Solution
This is a very geometric problem (fenced off from pun opportunities), and as such we should start by drawing a sketch and
introducing some variable names.
The area enclosed by the fence is the area inside the blue rectangle (in the figure on the right above) plus the area inside the
two blue triangles.
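\[ A(x, \theta) = (100 - 2x)\,x \sin\theta + 2 \cdot \frac{1}{2} \cdot x \sin\theta \cdot x \cos\theta = (100x - 2x^2) \sin\theta + x^2 \sin\theta \cos\theta \]
To maximize the area, we need to solve
\[ 0 = \frac{\partial A}{\partial x} = (100 - 4x) \sin\theta + 2x \sin\theta \cos\theta \]
\[ 0 = \frac{\partial A}{\partial \theta} = (100x - 2x^2) \cos\theta + x^2 \{\cos^2\theta - \sin^2\theta\} \]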
Note that both terms in the first equation contain the factor sin θ and all terms in the second equation contain the factor x. If
either sin θ or x are zero the area A(x, θ) will also be zero, and so will certainly not be maximal. So we may divide the first
equation by sin θ and the second equation by x, giving
\[ (100 - 4x) + 2x \cos\theta = 0 \qquad (E1) \]
\[ (100 - 2x) \cos\theta + x\{\cos^2\theta - \sin^2\theta\} = 0 \qquad (E2) \]
These equations might look a little scary. But there is no need to panic. They are not as bad as they look because θ enters only
through cos θ and $\sin^2\theta$, which we can easily write in terms of cos θ. Furthermore we can eliminate cos θ by observing that
the first equation forces $\cos\theta = -\frac{100 - 4x}{2x}$ and hence $\sin^2\theta = 1 - \cos^2\theta = 1 - \frac{(100 - 4x)^2}{4x^2}$. Substituting these into the second
equation gives
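\[ -(100 - 2x)\frac{100 - 4x}{2x} + x\Big[\frac{(100 - 4x)^2}{2x^2} - 1\Big] = 0 \]
\[ \implies -(100 - 2x)(100 - 4x) + (100 - 4x)^2 - 2x^2 = 0 \]
\[ \implies 6x^2 - 200x = 0 \]
\[ \implies x = \frac{100}{3} \qquad \cos\theta = -\frac{-100/3}{200/3} = \frac{1}{2} \qquad \theta = 60^\circ \]
Substituting x = 100/3 and θ = 60° back into A(x, θ) gives the largest possible area, $A = \frac{2500}{\sqrt{3}} \approx 1443$ square metres.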
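A coarse brute-force search is a useful sanity check on a critical point found by algebra. The sketch below assumes the area formula $A(x, \theta) = (100x - 2x^2)\sin\theta + x^2 \sin\theta\cos\theta$ from the example, evaluates it at x = 100/3, θ = 60°, and compares that value with the best point on a grid (half-metre steps in x, one-degree steps in θ). The grid resolution is an arbitrary illustrative choice.

```python
import math

def area(x, theta):
    """Enclosed area for slanted side length x (metres) and bend angle theta (radians)."""
    return (100 * x - 2 * x**2) * math.sin(theta) + x**2 * math.sin(theta) * math.cos(theta)

# Critical point found algebraically in the example.
x_star, theta_star = 100 / 3, math.pi / 3
area_star = area(x_star, theta_star)

# Coarse grid search over 0 < x < 50 and 0 < theta < 90 degrees.
best = max(
    (area(0.5 * i, math.radians(j)), 0.5 * i, j)
    for i in range(1, 100)   # x = 0.5, 1.0, ..., 49.5
    for j in range(1, 90)    # theta = 1, 2, ..., 89 degrees
)

print(area_star)   # area at the critical point
print(best)        # (area, x, theta in degrees) of the best grid point
```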
Now here is a very useful (even practical!) statistical example — finding the line that best fits a given collection of points.
An experiment yields n data points $(x_i, y_i)$, i = 1, 2, ⋯ , n. We wish to find the straight line y = mx + b which “best” fits
the data.
The definition of “best” is “minimizes the root mean square error”, i.e. minimizes
\[ E(m, b) = \sum_{i=1}^{n} (m x_i + b - y_i)^2 \]
Note that
term number i in E(m, b) is the square of the difference between $y_i$, which is the $i^{\rm th}$ measured value of y, and
$[mx + b]_{x = x_i}$, which is the approximation to $y_i$ given by the line y = mx + b.
All terms in the sum are nonnegative, regardless of whether the points $(x_i, y_i)$ are above or below the line.
Our problem is to find the m and b that minimizes E(m, b). This technique for drawing a line through a bunch of data points
is called “linear regression”. It is used a lot 9 10. Even in the real world — and not just the real world that you find in
mathematics problems. The actual real world that involves jobs.
Solution
We wish to choose m and b so as to minimize E(m, b). So we need to determine where the partial derivatives of E are zero.
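\[ 0 = \frac{\partial E}{\partial m} = \sum_{i=1}^{n} 2(m x_i + b - y_i) x_i = m\Big[\sum_{i=1}^{n} 2x_i^2\Big] + b\Big[\sum_{i=1}^{n} 2x_i\Big] - \Big[\sum_{i=1}^{n} 2x_i y_i\Big] \]
\[ 0 = \frac{\partial E}{\partial b} = \sum_{i=1}^{n} 2(m x_i + b - y_i) = m\Big[\sum_{i=1}^{n} 2x_i\Big] + b\Big[\sum_{i=1}^{n} 2\Big] - \Big[\sum_{i=1}^{n} 2y_i\Big] \]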
There are a lot of symbols here. But remember that all of the $x_i$'s and $y_i$'s are given constants. They come from, for example,
experimental data. The only unknowns are m and b. To emphasize this, and to save some writing, define the constants
\[ S_x = \sum_{i=1}^{n} x_i \qquad S_y = \sum_{i=1}^{n} y_i \qquad S_{x^2} = \sum_{i=1}^{n} x_i^2 \qquad S_{xy} = \sum_{i=1}^{n} x_i y_i \]
The equations which determine the critical points are (after dividing by two)
\[ S_{x^2}\, m + S_x\, b = S_{xy} \qquad (E1) \]
\[ S_x\, m + n\, b = S_y \qquad (E2) \]
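These are two linear equations in the unknowns m and b. They may be solved in any of the usual ways. One is to use (E2) to
solve for b in terms of m
\[ b = \frac{1}{n}(S_y - S_x m) \qquad (E3) \]
and then substitute this into (E1) to get the equation
\[ S_{x^2}\, m + \frac{1}{n} S_x (S_y - S_x m) = S_{xy} \implies (n S_{x^2} - S_x^2)\, m = n S_{xy} - S_x S_y \]
for m. We can then solve this equation for m and substitute back into (E3) to get b. This gives
\[ m = \frac{n S_{xy} - S_x S_y}{n S_{x^2} - S_x^2} \]
\[ b = \frac{S_y}{n}\,\frac{n S_{x^2} - S_x^2}{n S_{x^2} - S_x^2} - \frac{S_x}{n}\,\frac{n S_{xy} - S_x S_y}{n S_{x^2} - S_x^2} = \frac{n S_y S_{x^2} - n S_x S_{xy}}{n(n S_{x^2} - S_x^2)} = -\frac{S_x S_{xy} - S_y S_{x^2}}{n S_{x^2} - S_x^2} \]
Another way to solve the system is to take linear combinations of (E1) and (E2) that eliminate one unknown at a time.
\[ n(E1) - S_x(E2): \qquad [n S_{x^2} - S_x^2]\, m = n S_{xy} - S_x S_y \]
\[ -S_x(E1) + S_{x^2}(E2): \qquad [n S_{x^2} - S_x^2]\, b = -S_x S_{xy} + S_y S_{x^2} \]
This gives the same solution.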
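The closed-form solution for m and b translates directly into code. The sketch below is one possible implementation, assuming the data arrive as a list of (x, y) pairs; the function name `fit_line` and the sample data (four points lying exactly on y = 2x + 1) are illustrative and do not come from the text.

```python
def fit_line(points):
    """Least-squares line y = m*x + b through a list of (x, y) pairs.

    Uses the sums S_x, S_y, S_{x^2}, S_{xy} and the formulas
    m = (n*Sxy - Sx*Sy) / (n*Sx2 - Sx^2),  b = (Sy*Sx2 - Sx*Sxy) / (n*Sx2 - Sx^2).
    """
    n = len(points)
    Sx = sum(x for x, _ in points)
    Sy = sum(y for _, y in points)
    Sx2 = sum(x * x for x, _ in points)
    Sxy = sum(x * y for x, y in points)
    denom = n * Sx2 - Sx**2
    if denom == 0:
        # Happens when all x values coincide, so no unique line exists.
        raise ValueError("need at least two distinct x values")
    m = (n * Sxy - Sx * Sy) / denom
    b = (Sy * Sx2 - Sx * Sxy) / denom
    return m, b

# Points lying exactly on y = 2x + 1, so the fit should recover m = 2, b = 1.
m, b = fit_line([(0, 1), (1, 3), (2, 5), (3, 7)])
print(m, b)
```

Writing the formulas in terms of the four sums keeps the code a line-by-line match with the derivation, at the cost of some numerical robustness for badly scaled data.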
Recall how this works for a function f (x) of one variable that has a critical point at x = a, so that f′(a) = 0.
If f″(a) ≠ 0, f (x) is going to look a lot like $f(a) + \frac{1}{2} f''(a)(x - a)^2$ when x is really close to a. In particular
if f″(a) > 0, then we will have f (x) > f (a) when x is close to (but not equal to) a, so that a will be a local minimum and
if f″(a) < 0, then we will have f (x) < f (a) when x is close to (but not equal to) a, so that a will be a local maximum, but
if f″(a) = 0, then we cannot draw any conclusions without more work.
A similar, but messier, analysis is possible for functions of two variables. Here are some simple quadratic examples that provide a
warmup for that messier analysis.
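Example 2.9.12. $f(x, y) = x^2 + 3xy + 3y^2 - 6x - 3y - 6$
Consider $f(x, y) = x^2 + 3xy + 3y^2 - 6x - 3y - 6$. The gradient of f is
\[ \nabla f(x, y) = (2x + 3y - 6)\,\hat{\imath} + (3x + 6y - 3)\,\hat{\jmath} \]
So (x, y) is a critical point of f if and only if
\[ 2x + 3y = 6 \qquad (E1) \]
\[ 3x + 6y = 3 \qquad (E2) \]
Multiplying the first equation by 2 and subtracting the second equation gives
\[ x = 9 \]
Then substituting x = 9 back into the first equation gives 2 × 9 + 3y = 6, so that y = −4. So f (x, y) has precisely one critical point, namely (9, −4).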
Now let's try to determine if f (x, y) has a local minimum, or a local maximum, or neither, at (9, −4). A good way to
determine the behaviour of f (x, y) for (x, y) near (9, −4) is to make the change of variables 11
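\[ x = 9 + \Delta x \qquad y = -4 + \Delta y \]
where Δx and Δy are to be thought of as being small. In terms of these new variables
\[ f(9 + \Delta x, -4 + \Delta y) = (9 + \Delta x)^2 + 3(9 + \Delta x)(-4 + \Delta y) + 3(-4 + \Delta y)^2 - 6(9 + \Delta x) - 3(-4 + \Delta y) - 6 \]
\[ = (\Delta x)^2 + 3\Delta x \Delta y + 3(\Delta y)^2 - 27 \]
And a good way to study the sign of quadratic expressions like $(\Delta x)^2 + 3\Delta x \Delta y + 3(\Delta y)^2$ is to complete the square. So far
you have probably just completed the square for quadratic expressions that involve only a single variable. For example
\[ x^2 + 3x + 3 = \Big(x + \frac{3}{2}\Big)^2 - \frac{9}{4} + 3 \]
When there are two variables around, like Δx and Δy, you can just pretend that one of them is a constant and complete the
square as before. For example, if you pretend that Δy is a constant,
\[ (\Delta x)^2 + 3\Delta x \Delta y + 3(\Delta y)^2 = \Big(\Delta x + \frac{3}{2}\Delta y\Big)^2 + \Big(3 - \frac{9}{4}\Big)(\Delta y)^2 = \Big(\Delta x + \frac{3}{2}\Delta y\Big)^2 + \frac{3}{4}(\Delta y)^2 \]
To this point, we have expressed
\[ f(9 + \Delta x, -4 + \Delta y) = \Big(\Delta x + \frac{3}{2}\Delta y\Big)^2 + \frac{3}{4}(\Delta y)^2 - 27 \]
As the smallest values of $\big(\Delta x + \frac{3}{2}\Delta y\big)^2$ and $\frac{3}{4}(\Delta y)^2$ are both zero, we have that
\[ f(x, y) = f(9 + \Delta x, -4 + \Delta y) \ge -27 = f(9, -4) \]
for all (x, y) so that (9, −4) is both a local minimum and a global minimum for f .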
You have already encountered single variable functions that have a critical point which is neither a local max nor a local min. See
Example 3.5.9 in the CLP-1 text. Here are a couple of examples which show that this can also happen for functions of two
variables. We'll start with the simplest possible such example.
Example 2.9.13. $f(x, y) = x^2 - y^2$
The first partial derivatives of $f(x, y) = x^2 - y^2$ are $f_x(x, y) = 2x$ and $f_y(x, y) = -2y$. So the only critical point of this
function is (0, 0). Is this a local minimum or maximum? Well let's start with (x, y) at (0, 0) and then move (x, y) away from
(0, 0) and see if f (x, y) gets bigger or smaller. At the origin f (0, 0) = 0. Of course we can move (x, y) away from (0, 0) in
many different directions.
First consider moving (x, y) away from (0, 0) along the x-axis. Then (x, y) = (x, 0) and $f(x, y) = f(x, 0) = x^2$. So when we start with
x = 0 and then increase x, the value of the function f increases, which means that (0, 0) cannot be a local maximum for
f.
Next let's move (x, y) away from (0, 0) along the y-axis. Then (x, y) = (0, y) and $f(x, y) = f(0, y) = -y^2$. So when we
start with y = 0 and then increase y, the value of the function f decreases, which means that (0, 0) cannot be a local
minimum for f .
So moving away from (0, 0) in one direction causes the value of f to increase, while moving away from (0, 0) in a second
direction causes the value of f to decrease. Consequently (0, 0) is neither a local minimum nor a local maximum for f . It is called a
saddle point, because the graph of f looks like a saddle. (The full definition of “saddle point” is given immediately after this
example.) Here are some figures showing the graph of f .
The figure below shows some level curves of f . Observe from the level curves that
f increases as you leave (0, 0) walking along the x axis
f decreases as you leave (0, 0) walking along the y axis
Approximately speaking, if a critical point (a, b) is neither a local minimum nor a local maximum, then it is a saddle point. For
(a, b) to not be a local minimum, f has to take values bigger than f (a, b) at some points nearby (a, b). For (a, b) to not be a local
maximum, f has to take values smaller than f (a, b) at some points nearby (a, b). Writing this more mathematically we get the
following definition.
Definition 2.9.14
The critical point (a, b) is called a saddle point for the function f (x, y) if, for each r > 0,
there is at least one point (x, y), within a distance r of (a, b), for which f (x, y) > f (a, b) and
there is at least one point (x, y), within a distance r of (a, b), for which f (x, y) < f (a, b).
Here is another example of a saddle point. This time we have to work a bit to see it.
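Example 2.9.15. $f(x, y) = x^2 - 2xy - y^2 + 4y - 2$
Consider $f(x, y) = x^2 - 2xy - y^2 + 4y - 2$. The gradient of f is
\[ \nabla f(x, y) = (2x - 2y)\,\hat{\imath} + (-2x - 2y + 4)\,\hat{\jmath} \]
So (x, y) is a critical point of f if and only if
\[ 2x - 2y = 0 \]
\[ -2x - 2y = -4 \]
The first equation gives that x = y. Substituting y = x into the second equation gives
\[ -2y - 2y = -4 \implies x = y = 1 \]
So f (x, y) has precisely one critical point, namely (1, 1).
To determine if f (x, y) has a local minimum, or a local maximum, or neither, at (1, 1), we proceed as in Example 2.9.12. We make the change of variables
\[ x = 1 + \Delta x \qquad y = 1 + \Delta y \]
to give
\[ f(1 + \Delta x, 1 + \Delta y) = (1 + \Delta x)^2 - 2(1 + \Delta x)(1 + \Delta y) - (1 + \Delta y)^2 + 4(1 + \Delta y) - 2 \]
\[ = (\Delta x)^2 - 2\Delta x \Delta y - (\Delta y)^2 \]
Completing the square,
\[ f(1 + \Delta x, 1 + \Delta y) = (\Delta x)^2 - 2\Delta x \Delta y - (\Delta y)^2 = (\Delta x - \Delta y)^2 - 2(\Delta y)^2 \]
Notice that f has now been written as the difference of two squares, much like the f in the saddle point Example 2.9.13.
If Δx and Δy are such that the first square $(\Delta x - \Delta y)^2$ is nonzero, but the second square $(\Delta y)^2$ is zero, then
$f(1 + \Delta x, 1 + \Delta y) = (\Delta x)^2 > 0 = f(1, 1)$. That is, whenever Δy = 0 and Δx ≠ Δy, then
$f(1 + \Delta x, 1 + \Delta y) = (\Delta x)^2 > 0 = f(1, 1)$.
On the other hand, if Δx and Δy are such that the first square $(\Delta x - \Delta y)^2$ is zero but the second square $(\Delta y)^2$ is
nonzero, then $f(1 + \Delta x, 1 + \Delta y) = -2(\Delta y)^2 < 0 = f(1, 1)$. That is, whenever Δx = Δy ≠ 0, then
$f(1 + \Delta x, 1 + \Delta y) = -2(\Delta y)^2 < 0 = f(1, 1)$.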
So
f (x, y) > f (1, 1) at all points on the blue line in the figure above, and
f (x, y) < f (1, 1) at all points on the red line.
We conclude that (1, 1) is the only critical point for f (x, y), and furthermore that it is a saddle point.
The above three examples show that we can find all critical points of quadratic functions of two variables. We can also classify
each critical point as either a minimum, a maximum or a saddle point.
Of course not every function is quadratic. But by using the quadratic approximation 2.6.12 we can apply the same ideas much more
generally. Suppose that (a, b) is a critical point of some function f (x, y). For Δx and Δy small, the quadratic approximation
2.6.12 gives
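\[ f(a + \Delta x, b + \Delta y) \approx f(a, b) + f_x(a, b)\,\Delta x + f_y(a, b)\,\Delta y + \frac{1}{2}\big\{ f_{xx}(a, b)\,\Delta x^2 + 2 f_{xy}(a, b)\,\Delta x \Delta y + f_{yy}(a, b)\,\Delta y^2 \big\} \]
\[ = f(a, b) + \frac{1}{2}\big\{ f_{xx}(a, b)\,\Delta x^2 + 2 f_{xy}(a, b)\,\Delta x \Delta y + f_{yy}(a, b)\,\Delta y^2 \big\} \qquad (*) \]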
since (a, b) is a critical point so that fx (a, b) = fy (a, b) = 0. Then using the technique of Examples 2.9.12 and 2.9.15, we get
12
(details below).
Theorem 2.9.16
Let r > 0 and assume that all second order derivatives of the function f (x, y) are continuous at all points (x, y) that are within
a distance r of (a, b). Assume that $f_x(a, b) = f_y(a, b) = 0$. Define
\[ D(x, y) = f_{xx}(x, y)\, f_{yy}(x, y) - f_{xy}(x, y)^2 \]
Then
if D(a, b) > 0 and $f_{xx}(a, b) > 0$, then f (x, y) has a local minimum at (a, b),
if D(a, b) > 0 and $f_{xx}(a, b) < 0$, then f (x, y) has a local maximum at (a, b),
if D(a, b) < 0, then f (x, y) has a saddle point at (a, b), but
if D(a, b) = 0, then we cannot draw any conclusions without more work.
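The test is mechanical enough to write as a small function. This is a minimal sketch, assuming the three second order partial derivatives have already been evaluated at a critical point; the function name and the returned strings are illustrative. The sample inputs are the (constant) second derivatives of the quadratic functions in Examples 2.9.12, 2.9.13 and 2.9.15.

```python
def classify_critical_point(fxx, fyy, fxy):
    """Second derivative test at a critical point, given f_xx, f_yy, f_xy there."""
    D = fxx * fyy - fxy**2
    if D > 0:
        # When D > 0, f_xx and f_yy are nonzero and have the same sign.
        return "local minimum" if fxx > 0 else "local maximum"
    if D < 0:
        return "saddle point"
    return "inconclusive"

# Example 2.9.12: f = x^2 + 3xy + 3y^2 - 6x - 3y - 6 has f_xx = 2, f_yy = 6, f_xy = 3.
print(classify_critical_point(2, 6, 3))
# Example 2.9.13: f = x^2 - y^2 has f_xx = 2, f_yy = -2, f_xy = 0.
print(classify_critical_point(2, -2, 0))
# Example 2.9.15: f = x^2 - 2xy - y^2 + 4y - 2 has f_xx = 2, f_yy = -2, f_xy = -2.
print(classify_critical_point(2, -2, -2))
```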
Proof
We are putting quotation marks around the word “Proof”, because we are not going to justify the fact that it suffices to analyse
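the quadratic approximation in equation (∗). Let's temporarily suppress the arguments (a, b). If $f_{xx}(a, b) \ne 0$, then by
completing the square we can write
\[ f_{xx}\,\Delta x^2 + 2 f_{xy}\,\Delta x \Delta y + f_{yy}\,\Delta y^2 = f_{xx}\Big(\Delta x + \frac{f_{xy}}{f_{xx}}\Delta y\Big)^2 + \Big(f_{yy} - \frac{f_{xy}^2}{f_{xx}}\Big)\Delta y^2 \]
\[ = \frac{1}{f_{xx}}\Big\{ (f_{xx}\,\Delta x + f_{xy}\,\Delta y)^2 + (f_{xx} f_{yy} - f_{xy}^2)\,\Delta y^2 \Big\} \]
Similarly, if $f_{yy}(a, b) \ne 0$,
\[ f_{xx}\,\Delta x^2 + 2 f_{xy}\,\Delta x \Delta y + f_{yy}\,\Delta y^2 = \frac{1}{f_{yy}}\Big\{ (f_{xy}\,\Delta x + f_{yy}\,\Delta y)^2 + (f_{xx} f_{yy} - f_{xy}^2)\,\Delta x^2 \Big\} \]
Note that this algebra breaks down if $f_{xx}(a, b) = f_{yy}(a, b) = 0$. We'll deal with that case shortly. More importantly, note that
if $(f_{xx} f_{yy} - f_{xy}^2) > 0$ then both $f_{xx}$ and $f_{yy}$ must be nonzero and of the same sign and furthermore, whenever Δx or Δy
are nonzero,
\[ \Big\{ (f_{xx}\,\Delta x + f_{xy}\,\Delta y)^2 + (f_{xx} f_{yy} - f_{xy}^2)\,\Delta y^2 \Big\} > 0 \quad\text{and}\quad \Big\{ (f_{xy}\,\Delta x + f_{yy}\,\Delta y)^2 + (f_{xx} f_{yy} - f_{xy}^2)\,\Delta x^2 \Big\} > 0 \]
so that, recalling (∗),
if $f_{xx}(a, b) > 0$, then (a, b) is a local minimum and
if $f_{xx}(a, b) < 0$, then (a, b) is a local maximum.
If $(f_{xx} f_{yy} - f_{xy}^2) < 0$ and $f_{xx}$ is nonzero then
\[ \Big\{ (f_{xx}\,\Delta x + f_{xy}\,\Delta y)^2 + (f_{xx} f_{yy} - f_{xy}^2)\,\Delta y^2 \Big\} \]
is strictly positive whenever Δx ≠ 0, Δy = 0 and is strictly negative whenever $f_{xx}\,\Delta x + f_{xy}\,\Delta y = 0$, Δy ≠ 0, so that
(a, b) is a saddle point. Similarly, (a, b) is also a saddle point if $(f_{xx} f_{yy} - f_{xy}^2) < 0$ and $f_{yy}$ is nonzero.
Finally, if $f_{xy} \ne 0$ and $f_{xx} = f_{yy} = 0$, then
\[ f_{xx}\,\Delta x^2 + 2 f_{xy}\,\Delta x \Delta y + f_{yy}\,\Delta y^2 = 2 f_{xy}\,\Delta x \Delta y \]
is strictly positive for one sign of Δx Δy and is strictly negative for the other sign of Δx Δy. So (a, b) is again a saddle
point.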
You might wonder why, in the local maximum/local minimum cases of Theorem 2.9.16, $f_{xx}(a, b)$ appears rather than $f_{yy}(a, b)$.
The answer is only that x is before y in the alphabet 13 . You can use $f_{yy}(a, b)$ just as well as $f_{xx}(a, b)$. The reason is that if
D(a, b) > 0 (as in the first two bullets of the theorem), then because $D(a, b) = f_{xx}(a, b)\, f_{yy}(a, b) - f_{xy}(a, b)^2 > 0$, we
necessarily have $f_{xx}(a, b)\, f_{yy}(a, b) > 0$ so that $f_{xx}(a, b)$ and $f_{yy}(a, b)$ must have the same sign: either both are positive or both
are negative.
You might also wonder why we cannot draw any conclusions when D(a, b) = 0 and what happens then. The second derivative test
for functions of two variables was derived in precisely the same way as the second derivative test for functions of one variable is
derived — you approximate the function by a polynomial that is of degree two in (x − a), (y − b) and then you analyze the
behaviour of the quadratic polynomial near (a, b). For this to work, the contributions to f (x, y) from terms that are of degree two
in (x − a), (y − b) had better be bigger than the contributions to f (x, y) from terms that are of degree three and higher in (x − a),
(y − b) when (x − a), (y − b) are really small. If this is not the case, for example when the terms in f (x, y) that are of degree two
in (x − a), (y − b) all have coefficients that are exactly zero, the analysis will certainly break down. That's exactly what happens
when D(a, b) = 0. Here are some examples. The functions
\[ f_1(x, y) = x^4 + y^4 \qquad f_2(x, y) = -x^4 - y^4 \]
\[ f_3(x, y) = x^3 + y^3 \qquad f_4(x, y) = x^4 - y^4 \]
all have (0, 0) as the only critical point and all have D(0, 0) = 0. The first, f1 has its minimum there. The second, f2 , has its
maximum there. The third and fourth have a saddle point there.
Here are sketches of some level curves for each of these four functions (with all renamed to simply f ).
Example 2.9.17
Find and classify all critical points of $f(x, y) = 2x^3 - 6xy + y^2 + 4y$.
Solution
Thinking a little way ahead, to find the critical points we will need the gradient and to apply the second derivative test of
Theorem 2.9.16 we will need all second order partial derivatives. So we need all partial derivatives of order up to two. Here
they are.
\[ f = 2x^3 - 6xy + y^2 + 4y \]
\[ f_x = 6x^2 - 6y \qquad f_{xx} = 12x \qquad f_{xy} = -6 \]
\[ f_y = -6x + 2y + 4 \qquad f_{yy} = 2 \qquad f_{yx} = -6 \]
(Of course, $f_{xy}$ and $f_{yx}$ have to be the same. It is still useful to compute both, as a way to catch some mechanical errors.)
We have already found, in Example 2.9.7, that the critical points are (1, 1), (2, 4). The classification is
critical point    $f_{xx} f_{yy} - f_{xy}^2$      $f_{xx}$    type
(1, 1)            $12 \times 2 - (-6)^2 < 0$                  saddle point
(2, 4)            $24 \times 2 - (-6)^2 > 0$      24          local min
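We were able to leave the $f_{xx}$ entry in the top row blank, because
we knew that $f_{xx}(1, 1)\, f_{yy}(1, 1) - f_{xy}(1, 1)^2 < 0$, and
we knew, from Theorem 2.9.16, that this was, by itself, enough to ensure that (1, 1) was a saddle point.
Level curves of f are not needed to answer this question, but a sketch of them can give you some idea as to what the graph of f looks like.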
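Example 2.9.18
Find and classify all critical points of f (x, y) = xy(5x + y − 15).
Solution
We have already computed the first order partial derivatives
\[ f_x(x, y) = y(10x + y - 15) \qquad f_y(x, y) = x(5x + 2y - 15) \]
of f (x, y) in Example 2.9.8. Again, to classify the critical points we need the second order partial derivatives. They are
\[ f_{xx}(x, y) = 10y \qquad f_{xy}(x, y) = 10x + 2y - 15 \]
\[ f_{yx}(x, y) = 10x + 2y - 15 \qquad f_{yy}(x, y) = 2x \]
(Once again, we have computed both $f_{xy}$ and $f_{yx}$ to guard against mechanical errors.) We have already found, in Example
2.9.8, that the critical points are (0, 0), (0, 15), (3, 0) and (1, 5). The classification is
critical point    $f_{xx} f_{yy} - f_{xy}^2$      $f_{xx}$    type
(0, 0)            $0 \times 0 - (-15)^2 < 0$                  saddle point
(0, 15)           $150 \times 0 - 15^2 < 0$                   saddle point
(3, 0)            $0 \times 6 - 15^2 < 0$                     saddle point
(1, 5)            $50 \times 2 - 5^2 > 0$         50          local min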
Here is a sketch of some level curves of our f (x, y). f is negative in the shaded regions and f is positive in the unshaded
regions.
Again this is not needed to answer this question, but can give you some idea as to what the graph of f looks like.
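Example 2.9.19
Find and classify all of the critical points of $f(x, y) = x^3 + xy^2 - 3x^2 - 4y^2 + 4$.
Solution
We know the drill now. We start by computing all of the partial derivatives of f up to order 2.
\[ f = x^3 + xy^2 - 3x^2 - 4y^2 + 4 \]
\[ f_x = 3x^2 + y^2 - 6x \qquad f_{xx} = 6x - 6 \qquad f_{xy} = 2y \]
\[ f_y = 2xy - 8y \qquad f_{yy} = 2x - 8 \qquad f_{yx} = 2y \]
The critical points are then the solutions of $f_x = 0$, $f_y = 0$. That is
\[ f_x = 3x^2 + y^2 - 6x = 0 \qquad (E1) \]
\[ f_y = 2y(x - 4) = 0 \qquad (E2) \]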
The second equation, 2y(x − 4) = 0, is satisfied if and only if at least one of the two equations y = 0 and x = 4 is satisfied.
When y = 0, equation (E1) forces x to obey
\[ 0 = 3x^2 + 0^2 - 6x = 3x(x - 2) \]
so that x = 0 or x = 2.
When x = 4, equation (E1) forces y to obey
\[ 0 = 3 \times 4^2 + y^2 - 6 \times 4 = 24 + y^2 \]
which is impossible.
So, there are two critical points: (0, 0), (2, 0). Here is a table that classifies the critical points.
critical point    $f_{xx} f_{yy} - f_{xy}^2$          $f_{xx}$    type
(0, 0)            $(-6) \times (-8) - 0^2 > 0$        −6          local max
(2, 0)            $6 \times (-4) - 0^2 < 0$                       saddle point
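When second derivatives are tedious to compute by hand, they can be approximated numerically. The sketch below is not the method used in the text: it estimates $f_{xx}$, $f_{yy}$ and $f_{xy}$ by central finite differences (with an arbitrarily chosen step h and tolerance) and then applies the second derivative test, assuming $f(x, y) = x^3 + xy^2 - 3x^2 - 4y^2 + 4$ as in this example. The helper names are illustrative.

```python
def f(x, y):
    return x**3 + x * y**2 - 3 * x**2 - 4 * y**2 + 4

def second_partials(func, x, y, h=1e-3):
    """Central finite-difference estimates of (f_xx, f_yy, f_xy) at (x, y)."""
    fxx = (func(x + h, y) - 2 * func(x, y) + func(x - h, y)) / h**2
    fyy = (func(x, y + h) - 2 * func(x, y) + func(x, y - h)) / h**2
    fxy = (func(x + h, y + h) - func(x + h, y - h)
           - func(x - h, y + h) + func(x - h, y - h)) / (4 * h**2)
    return fxx, fyy, fxy

def classify(func, x, y, tol=1e-6):
    """Second derivative test using the numerical estimates above."""
    fxx, fyy, fxy = second_partials(func, x, y)
    D = fxx * fyy - fxy**2
    if D > tol:
        return "local min" if fxx > 0 else "local max"
    if D < -tol:
        return "saddle point"
    return "inconclusive"

for point in [(0.0, 0.0), (2.0, 0.0)]:
    print(point, classify(f, *point))
```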
Example 2.9.20
A manufacturer wishes to make an open rectangular box of given volume V using the least possible material. Find the design
specifications.
Solution
Denote by x, y and z, the length, width and height, respectively, of the box.
The box has two sides of area xz, two sides of area yz and a bottom of area xy. So the total surface area of material used is
S = 2xz + 2yz + xy
However the three dimensions x, y and z are not independent. The requirement that the box have volume V imposes the
constraint
xyz = V
We can use this constraint to eliminate one variable. Since z is at the end of the alphabet (poor z ), we eliminate z by
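substituting $z = \frac{V}{xy}$. So we have to find the values of x and y that minimize the function
\[ S(x, y) = \frac{2V}{y} + \frac{2V}{x} + xy \]
Let's start by finding the critical points of S. Since
\[ S_x(x, y) = -\frac{2V}{x^2} + y \qquad S_y(x, y) = -\frac{2V}{y^2} + x \]
(x, y) is a critical point if and only if
\[ x^2 y = 2V \qquad (E1) \]
\[ x y^2 = 2V \qquad (E2) \]
Solving (E1) for y gives $y = \frac{2V}{x^2}$. Substituting this into (E2) gives
\[ x \frac{4V^2}{x^4} = 2V \implies x^3 = 2V \implies x = \sqrt[3]{2V} \quad\text{and}\quad y = \frac{2V}{(2V)^{2/3}} = \sqrt[3]{2V} \]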
As there is only one critical point, we would expect it to give the minimum 14 . But let's use the second derivative test to verify
that at least the critical point is a local minimum. The various second partial derivatives are
\[ S_{xx}(x, y) = \frac{4V}{x^3} \qquad S_{xx}\big(\sqrt[3]{2V}, \sqrt[3]{2V}\big) = 2 \]
\[ S_{xy}(x, y) = 1 \qquad S_{xy}\big(\sqrt[3]{2V}, \sqrt[3]{2V}\big) = 1 \]
\[ S_{yy}(x, y) = \frac{4V}{y^3} \qquad S_{yy}\big(\sqrt[3]{2V}, \sqrt[3]{2V}\big) = 2 \]
So
\[ S_{xx}\big(\sqrt[3]{2V}, \sqrt[3]{2V}\big)\, S_{yy}\big(\sqrt[3]{2V}, \sqrt[3]{2V}\big) - S_{xy}\big(\sqrt[3]{2V}, \sqrt[3]{2V}\big)^2 = 3 > 0 \]
\[ S_{xx}\big(\sqrt[3]{2V}, \sqrt[3]{2V}\big) = 2 > 0 \]
and, by Theorem 2.9.16.b, $\big(\sqrt[3]{2V}, \sqrt[3]{2V}\big)$ is a local minimum and the desired dimensions are
\[ x = y = \sqrt[3]{2V} \qquad z = \sqrt[3]{\frac{V}{4}} \]
Note that our solution has x = y. That's a good thing — the function S(x, y) is symmetric in x and y. Because the box has no
top, the symmetry does not extend to z.
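To see the result in numbers, the sketch below evaluates the dimensions for one sample volume and compares the surface area there with the surface area at a few nearby designs of the same volume. The choice V = 4 (which gives a 2 × 2 × 1 box) and the perturbation size 0.1 are illustrative assumptions, as are the function names.

```python
def box_dimensions(V):
    """Dimensions (x, y, z) of the open box of volume V at the critical point."""
    side = (2 * V) ** (1 / 3)   # x = y = cube root of 2V
    height = V / side**2        # z = V / (x*y), which equals the cube root of V/4
    return side, side, height

def surface_area(x, y, V):
    """S(x, y) = 2V/y + 2V/x + xy, after eliminating z = V/(xy)."""
    return 2 * V / y + 2 * V / x + x * y

V = 4.0
x, y, z = box_dimensions(V)
S_min = surface_area(x, y, V)

# Surface area at nearby designs with the same volume.
nearby = [surface_area(x + dx, y + dy, V)
          for dx in (-0.1, 0.0, 0.1) for dy in (-0.1, 0.0, 0.1)]

print(x, y, z, S_min)
print(min(nearby))
```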
For example, the maximum value of f (x) = x on
the interval 0 ≤ x ≤ 1 is 1 and is attained at x = 1, but f′(x) = 1 is never zero, so that f has no critical points.
So to find the maximum and minimum of the function f (x) on the interval [0, 1], you
1. build up a list of all candidate points 0 ≤ a ≤ 1 at which the maximum or minimum could be attained, by finding all a's for
which either
   1. 0 < a < 1 and f′(a) = 0 or
   2. 0 < a < 1 and f′(a) does not exist 16 or
   3. a is a boundary point, i.e. a = 0 or a = 1,
2. and then you evaluate f (a) at each a on the list of candidates. The biggest of these candidate values of f (a) is the absolute
maximum and the smallest of these candidate values is the absolute minimum.
The procedure for finding the maximum and minimum of a function of two variables, f (x, y) in a set like, for example, the unit
disk $x^2 + y^2 \le 1$, is similar. You again
1. build up a list of all candidate points (a, b) in the set at which the maximum or minimum could be attained, by finding all
(a, b)'s for which either 17
   1. (a, b) is in the interior of the set (for our example, $a^2 + b^2 < 1$) and $f_x(a, b) = f_y(a, b) = 0$ or
   2. (a, b) is in the interior of the set and $f_x(a, b)$ or $f_y(a, b)$ does not exist or
   3. (a, b) is a boundary point 18 (for our example, $a^2 + b^2 = 1$), and could give the maximum or minimum on the boundary,
2. and then you evaluate f (a, b) at each (a, b) on the list of candidates. The biggest of these candidate values of f (a, b) is the
absolute maximum and the smallest of these candidate values is the absolute minimum.
The boundary of a set like $x^2 + y^2 \le 1$ is a curve, here the circle $x^2 + y^2 = 1$. A curve is a one dimensional set,
like a deformed x-axis. We can find the maximum and minimum of f (x, y) on this curve by converting f (x, y) into a function of
one variable (on the curve) and using the standard function of one variable techniques. This is best explained by some examples.
Example 2.9.21
Find the maximum and minimum of $T(x, y) = (x + y)e^{-x^2 - y^2}$ on the region defined by $x^2 + y^2 \le 1$ (i.e. on the unit disk).
Solution
Let's follow our checklist. First critical points, then points where the partial derivatives don't exist, and finally the boundary.
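Interior Critical Points: If T takes its maximum or minimum value at a point in the interior, $x^2 + y^2 < 1$, then that point must
be either a critical point of T or a singular point of T . To find the critical points we compute the first order derivatives.
\[ T_x(x, y) = (1 - 2x^2 - 2xy)\,e^{-x^2 - y^2} \qquad T_y(x, y) = (1 - 2xy - 2y^2)\,e^{-x^2 - y^2} \]
Because the exponential $e^{-x^2 - y^2}$ is never zero, the critical points are the solutions of
\[ T_x = 0 \iff 2x(x + y) = 1 \]
\[ T_y = 0 \iff 2y(x + y) = 1 \]
As both 2x(x + y) and 2y(x + y) are nonzero, we may divide the two equations, which gives $\frac{x}{y} = 1$, forcing x = y.
Substituting this into either equation gives 2x(2x) = 1 so that $x = y = \pm\frac{1}{2}$.
So the only critical points are $\big(\frac{1}{2}, \frac{1}{2}\big)$ and $\big(-\frac{1}{2}, -\frac{1}{2}\big)$. Both are in $x^2 + y^2 < 1$.
Singular points: In this problem, there are no singular points.
Boundary: Points on the boundary satisfy $x^2 + y^2 = 1$. That is, they lie on a circle. We may parametrize the circle as
x = cos t and y = sin t, in terms of the angle t. This will make the formula for T on the boundary quite a bit easier to deal
with. On the boundary,
\[ T = (\cos t + \sin t)\,e^{-\cos^2 t - \sin^2 t} = (\cos t + \sin t)\,e^{-1} \]
As all t's are allowed, this function takes its max and min at zeroes of
\[ \frac{dT}{dt} = (-\sin t + \cos t)\,e^{-1} \]
That is, $(\cos t + \sin t)e^{-1}$ takes its max and min when sin t = cos t, that is, when x = y and $x^2 + y^2 = 1$, which forces
$x = y = \pm\frac{1}{\sqrt{2}}$.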
All together, we have the following candidates for max and min, with the max and min indicated.
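point                                                       value of T
$\big(\frac{1}{2}, \frac{1}{2}\big)$                        $\frac{1}{\sqrt{e}} \approx 0.61$      max
$\big(-\frac{1}{2}, -\frac{1}{2}\big)$                      $-\frac{1}{\sqrt{e}}$                  min
$\big(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\big)$          $\frac{\sqrt{2}}{e} \approx 0.52$
$\big(-\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}\big)$        $-\frac{\sqrt{2}}{e}$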
The following sketch shows all of the critical points. It is a good idea to make such a sketch so that you don't accidentally
include a critical point that is outside of the allowed region.
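The final step of the checklist, evaluating the function at every candidate and picking the biggest and smallest values, is easy to automate. The sketch below assumes $T(x, y) = (x + y)e^{-x^2 - y^2}$ and the four candidate points of this example, and adds a brute-force grid check over the disk as an independent sanity test. The grid spacing is an arbitrary illustrative choice.

```python
import math

def T(x, y):
    return (x + y) * math.exp(-x**2 - y**2)

r = 1 / math.sqrt(2)
candidates = [(0.5, 0.5), (-0.5, -0.5), (r, r), (-r, -r)]
values = {point: T(*point) for point in candidates}

max_point = max(values, key=values.get)
min_point = min(values, key=values.get)
print("max", max_point, values[max_point])
print("min", min_point, values[min_point])

# Independent check: largest value of T over a grid of points inside the unit disk.
steps = 200
grid = [-1 + 2 * k / steps for k in range(steps + 1)]
grid_max = max(T(x, y) for x in grid for y in grid if x**2 + y**2 <= 1)
print("grid max", grid_max)
```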
In the last example, we analyzed the behaviour of f on the boundary of the region of interest by using the parametrization
x = cos t, y = sin t of the circle $x^2 + y^2 = 1$. Sometimes using this parametrization is not so clean. And worse, some curves don't
have such a simple parametrization. In the next problem we'll look at the boundary a little differently.
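Example 2.9.22
Find the maximum and minimum values of $f(x, y) = x^3 + xy^2 - 3x^2 - 4y^2 + 4$ on the disk $x^2 + y^2 \le 1$.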
Solution
Again, we first find all critical points, then find all singular points and, finally, analyze the boundary.
Interior Critical Points: If f takes its maximum or minimum value at a point in the interior, $x^2 + y^2 < 1$, then that point must
be either a critical point of f or a singular point of f . To find the critical points 19 we compute the first order derivatives.
\[ f_x = 3x^2 + y^2 - 6x \qquad f_y = 2xy - 8y \]
The critical points are the solutions of
\[ f_x = 3x^2 + y^2 - 6x = 0 \qquad (E1) \]
\[ f_y = 2y(x - 4) = 0 \qquad (E2) \]
The second equation, 2y(x − 4) = 0, is satisfied if and only if at least one of the two equations y = 0 and x = 4 is satisfied.
When y = 0, equation (E1) forces x to obey
\[ 0 = 3x^2 + 0^2 - 6x = 3x(x - 2) \]
so that x = 0 or x = 2.
When x = 4, equation (E1) forces y to obey
\[ 0 = 3 \times 4^2 + y^2 - 6 \times 4 = 24 + y^2 \]
which is impossible.
So, there are only two critical points: (0, 0), (2, 0).
Singular points: In this problem, there are no singular points.
Boundary: On the boundary, $x^2 + y^2 = 1$, we could again take advantage of having a circle and write x = cos t and y = sin t.
But, for practice, we'll use another method 20. We know that (x, y) satisfies $x^2 + y^2 = 1$, and hence $y^2 = 1 - x^2$. Examining
the formula for f (x, y), we see that it contains only even 21 powers of y, so we can eliminate y by substituting $y^2 = 1 - x^2$
into the formula.
\[ f = x^3 + x(1 - x^2) - 3x^2 - 4(1 - x^2) + 4 = x + x^2 \]
The max and min of $x + x^2$ for −1 ≤ x ≤ 1 must occur either
when x = −1 (⇒ y = f = 0 ) or
when x = +1 (⇒ y = 0, f = 2 ) or
when $0 = \frac{d}{dx}(x + x^2) = 1 + 2x$ (⇒ $x = -\frac{1}{2}$, $y = \pm\sqrt{\frac{3}{4}}$, $f = -\frac{1}{4}$ ).
Note that the point (2, 0) is outside the allowed region 22. So all together, we have the following candidates for max and min,
with the max and min indicated.
point                                                   value of f
(0, 0)                                                  4                   max
(−1, 0)                                                 0
(1, 0)                                                  2
$\big(-\frac{1}{2}, \pm\frac{\sqrt{3}}{2}\big)$         $-\frac{1}{4}$      min
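The substitution $y^2 = 1 - x^2$ on the boundary can be checked numerically. The sketch below assumes $f(x, y) = x^3 + xy^2 - 3x^2 - 4y^2 + 4$ as in this example; it measures how far f on the circle is from the one-variable function $x + x^2$ at a handful of sample points (the sample spacing is arbitrary), and then picks the max and min from the candidate list.

```python
import math

def f(x, y):
    return x**3 + x * y**2 - 3 * x**2 - 4 * y**2 + 4

def f_on_boundary(x):
    """f restricted to x^2 + y^2 = 1, after substituting y^2 = 1 - x^2."""
    return x + x**2

# Largest discrepancy between f on the circle and x + x^2 at sample points.
samples = [k / 10 for k in range(-10, 11)]
max_gap = max(
    abs(f(x, sign * math.sqrt(1 - x**2)) - f_on_boundary(x))
    for x in samples for sign in (1, -1)
)

half_root3 = math.sqrt(3) / 2
candidates = [(0, 0), (-1, 0), (1, 0), (-0.5, half_root3), (-0.5, -half_root3)]
values = {p: f(*p) for p in candidates}
f_max = max(values.values())
f_min = min(values.values())
print(max_gap, f_max, f_min)
```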
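Example 2.9.23
Find the maximum and minimum values of $f(x, y) = xy - x^3 y^2$ when (x, y) runs over the square 0 ≤ x ≤ 1, 0 ≤ y ≤ 1.
Solution
As usual, let's examine the critical points, singular points and boundary in turn.
Interior Critical Points: If f takes its maximum or minimum value at a point in the interior, 0 < x < 1, 0 < y < 1, then that point
must be either a critical point of f or a singular point of f . To find the critical points we compute the first order derivatives.
\[ f_x(x, y) = y - 3x^2 y^2 \qquad f_y(x, y) = x - 2x^3 y \]
The critical points are the solutions of
\[ f_x = 0 \iff y(1 - 3x^2 y) = 0 \iff y = 0 \text{ or } 3x^2 y = 1 \]
\[ f_y = 0 \iff x(1 - 2x^2 y) = 0 \iff x = 0 \text{ or } 2x^2 y = 1 \]
If y = 0, we cannot have $2x^2 y = 1$, so we must have x = 0.
If $3x^2 y = 1$, we cannot have x = 0, so we must have $2x^2 y = 1$. Dividing gives $1 = \frac{3x^2 y}{2x^2 y} = \frac{3}{2}$
which is impossible.
So the only critical point is (0, 0). There f = 0.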
Singular points: Yet again there are no singular points in this problem.
Boundary: The region is a square, so its boundary consists of its four sides.
First, we look at the part of the boundary with x = 0. On that entire side f = 0.
Next, we look at the part of the boundary with y = 0. On that entire side f = 0.
Next, we look at the part of the boundary with y = 1. There $f = f(x, 1) = x - x^3$. To find the maximum and minimum of
f (x, y) on the part of the boundary with y = 1, we must find the maximum and minimum of $x - x^3$ when 0 ≤ x ≤ 1.
Recall that, in general, the maximum and minimum of a function h(x) on the interval a ≤ x ≤ b, must occur either at
x = a or at x = b or at an x for which either h′(x) = 0 or h′(x) does not exist. In this case, $\frac{d}{dx}(x - x^3) = 1 - 3x^2$, so the
max and min of $x - x^3$ for 0 ≤ x ≤ 1 must occur
either at x = 0, where f = 0,
or at $x = \frac{1}{\sqrt{3}}$, where $f = \frac{2}{3\sqrt{3}}$,
or at x = 1, where f = 0.
Finally, we look at the part of the boundary with x = 1. There $f = f(1, y) = y - y^2$. As $\frac{d}{dy}(y - y^2) = 1 - 2y$, the only
critical point of $y - y^2$ is at $y = \frac{1}{2}$. So the max and min of $y - y^2$ for 0 ≤ y ≤ 1 must occur
either at y = 0, where f = 0,
or at $y = \frac{1}{2}$, where $f = \frac{1}{4}$,
or at y = 1, where f = 0.
All together, we have the following candidates for max and min, with the max and min indicated.
point        (0, 0)  (0, 0≤y≤1)  (0≤x≤1, 0)  (1, 0)  (1, 1/2)  (1, 1)  (0, 1)  (1/√3, 1)
value of f   0       0           0           0       1/4       0       0       2/(3√3) ≈ 0.385
             min     min         min         min               min     min     max
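The closed-interval recipe used on each side of the square is easy to mechanise. The following sketch assumes the objective is f(x, y) = xy − x^3 y^2 (the formula consistent with the partial derivative and boundary restrictions above); the helper name `side_candidates` is hypothetical. It evaluates each side's restriction at its endpoints and critical point, then compares with a grid scan of the whole square.

```python
import math

def f(x, y):
    # Assumed objective, consistent with f_y = x(1 - 2 x^2 y) and f(x, 1) = x - x^3.
    return x * y - x**3 * y**2

def side_candidates(h, critical_points, a=0.0, b=1.0):
    """Values of h at the endpoints a, b and at critical points strictly inside (a, b)."""
    xs = [a, b] + [c for c in critical_points if a < c < b]
    return {x: h(x) for x in xs}

top = side_candidates(lambda x: f(x, 1.0), [1 / math.sqrt(3)])  # side y = 1
right = side_candidates(lambda y: f(1.0, y), [0.5])             # side x = 1

# f is identically 0 on the sides x = 0 and y = 0, so 0.0 joins the candidate values.
largest = max(list(top.values()) + list(right.values()) + [0.0])
smallest = min(list(top.values()) + list(right.values()) + [0.0])

# Grid scan of the closed unit square as an independent check.
n = 300
grid = [f(i / n, j / n) for i in range(n + 1) for j in range(n + 1)]

print("candidates:", largest, smallest)
print("grid scan: ", max(grid), min(grid))
```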
Example 2.9.24
Find the maximum and minimum values of f (x, y) = xy + 2x + y when (x, y) runs over the triangular region with vertices
(0, 0), (1, 0) and (0, 2).
Solution
As usual, let's examine the critical points, singular points and boundary in turn.
Interior Critical Points: If f takes its maximum or minimum value at a point in the interior, then that point must be either a
critical point of f or a singular point of f . The critical points are the solutions of
fx (x, y) = y + 2 = 0 fy (x, y) = x + 1 = 0
So there is exactly one critical point, namely (−1, −2). This is well outside the triangle and so is not a candidate for the
location of the max and min.
Singular points: Yet again there are no singular points for this f .
Boundary: The region is a triangle, so its boundary consists of its three sides.
First, we look at the side that runs from (0, 0) to (0, 2). On that entire side x = 0, so that f (0, y) = y. The smallest value
of f on that side is f = 0 at (0, 0) and the largest value of f on that side is f = 2 at (0, 2).
Next, we look at the side that runs from (0, 0) to (1, 0). On that entire side y = 0, so that f (x, 0) = 2x. The smallest value
of f on that side is f = 0 at (0, 0) and the largest value of f on that side is f = 2 at (1, 0).
Finally, we look at the side that runs from (0, 2) to (1, 0). Our first job is to find the equation of the line that contains (0, 2)
and (1, 0). By way of review, we'll find the equation using three different methods.
Method 1: You (probably) learned in high school that any line in the xy-plane 23 has equation y = mx + b where b is
the y intercept and m is the slope. In this case, the line crosses the y axis at y = 2 and so has y intercept b = 2. The line
passes through (0, 2) and (1, 0) and so, as we see in the figure below, has slope \(m = \frac{\Delta y}{\Delta x} = \frac{0-2}{1-0} = -2\). Thus the side
of the triangle that runs from (0, 2) to (1, 0) is y = 2 − 2x with 0 ≤ x ≤ 1.
Method 2: Every line in the xy-plane has an equation of the form ax + by = c. In this case (0, 0) is not on the line so that
c ≠ 0 and we can divide the equation by c, giving \(\frac{a}{c}x + \frac{b}{c}y = 1\). Rename \(\frac{a}{c} = A\) and \(\frac{b}{c} = B\). Thus, because the line does
not pass through the origin, it has an equation of the form Ax + By = 1, for some constants A and B. In order for (0, 2)
to lie on the line, x = 0, y = 2 has to be a solution of Ax + By = 1. That is, \(Ax\big|_{x=0} + By\big|_{y=2} = 1\), so that \(B = \frac{1}{2}\). In
order for (1, 0) to lie on the line, x = 1, y = 0 has to be a solution of Ax + By = 1. That is, \(Ax\big|_{x=1} + By\big|_{y=0} = 1\), so
that A = 1. Thus the line has equation \(x + \frac{1}{2}y = 1\), or equivalently, y = 2 − 2x.
Method 3: The vector from (0, 2) to (1, 0) is ⟨1 − 0 , 0 − 2⟩ = ⟨1, −2⟩ . As we see from the figure above, it is a direction
vector for the line. One point on the line is (0, 2). So a parametric equation for the line (see Equation 1.3.1) is
⟨x − 0 , y − 2⟩ = t ⟨1, −2⟩ or x = t, y = 2 − 2t
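By any of these three methods 24 , we have that the side of the triangle that runs from (0, 2) to (1, 0) is y = 2 − 2x with
0 ≤ x ≤ 1. On that side of the triangle
\( f(x, 2-2x) = x(2-2x) + 2x + (2-2x) = -2x^2 + 2x + 2 \)
Write \(g(x) = -2x^2 + 2x + 2\). The max and min of g(x) for 0 ≤ x ≤ 1 must occur either at x = 0, where f (0, 2) = g(0) = 2,
or at x = 1, where f (1, 0) = g(1) = 2, or where \(0 = g'(x) = -4x + 2\), i.e. at \(x = \frac{1}{2}\), y = 1, where
\( f(\tfrac{1}{2}, 1) = g(\tfrac{1}{2}) = -\frac{2}{4} + \frac{2}{2} + 2 = \frac{5}{2} \)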
All together, we have the following candidates for max and min, with the max and min indicated.
point        (0, 0)   (0, 2)   (1, 0)   (1/2, 1)
value of f   0        2        2        5/2
             min                        max
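As a sanity check on the table, here is a minimal sketch that scans f(x, y) = xy + 2x + y over the triangle with vertices (0, 0), (1, 0) and (0, 2) and compares the result with the candidate values. The grid construction (x = i/n, then y a fraction of the allowed height 2 − 2x) is just one convenient choice and is not taken from the text.

```python
def f(x, y):
    return x * y + 2 * x + y

def g(x):
    # f restricted to the slanted side y = 2 - 2x; equals -2x^2 + 2x + 2.
    return f(x, 2 - 2 * x)

candidates = {
    (0.0, 0.0): f(0.0, 0.0),
    (0.0, 2.0): f(0.0, 2.0),
    (1.0, 0.0): f(1.0, 0.0),
    (0.5, 1.0): g(0.5),
}

# Scan the closed triangle: 0 <= x <= 1 and 0 <= y <= 2 - 2x.
n = 200
grid = []
for i in range(n + 1):
    x = i / n
    for j in range(n + 1):
        y = (j / n) * (2 - 2 * x)
        grid.append(f(x, y))

print("candidate max/min:", max(candidates.values()), min(candidates.values()))
print("grid max/min:     ", max(grid), min(grid))
```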
Example 2.9.25
Find the high and low points of the surface \(z = \sqrt{x^2 + y^2}\) with (x, y) varying over the square |x| ≤ 1, |y| ≤ 1.
Solution
The function \(f(x, y) = \sqrt{x^2 + y^2}\) has a particularly simple geometric interpretation: it is the distance from the point
(x, y) to the origin. So
the minimum of f (x, y) is achieved at the point in the square that is nearest the origin — namely the origin itself. So
(0, 0, 0) is the lowest point on the surface and is at height 0.
The maximum of f (x, y) is achieved at the points in the square that are farthest from the origin, namely the four corners
of the square (±1, ±1). At those four points \(z = \sqrt{2}\). So the highest points on the surface are \((\pm 1, \pm 1, \sqrt{2})\).
Even though we have already answered this question, it will be instructive to see what we would have found if we had
followed our usual protocol. The partial derivatives of \(f(x, y) = \sqrt{x^2 + y^2}\) are defined for (x, y) ≠ (0, 0) and are
\( f_x(x, y) = \frac{x}{\sqrt{x^2 + y^2}} \qquad f_y(x, y) = \frac{y}{\sqrt{x^2 + y^2}} \)
There is one singular point — namely (0, 0). The minimum value of f is achieved at the singular point.
The boundary of the square consists of its four sides. One side is
{(x, y)|x = 1, − 1 ≤ y ≤ 1}
On this side \(f = \sqrt{1 + y^2}\). As \(\sqrt{1 + y^2}\) increases with |y|, the smallest value of f on that side is 1 (when y = 0) and the
largest value of f is \(\sqrt{2}\) (when y = ±1). The same thing happens on the other three sides. The maximum value of f is
achieved at the four corners. Note that \(f_x\) and \(f_y\) are both nonzero at all four corners.
Exercises
Stage 1
1✳
a. Some level curves of a function f (x, y) are plotted in the xy-plane below.
For each of the four statements below, circle the letters of all points in the diagram where the situation applies. For example, if
the statement were “These points are on the y-axis”, you would circle both P and U, but none of the other letters. You may
assume that a local maximum occurs at point T.
(i) ∇f is zero    P R S T U
negative
b. The diagram below shows three “y traces” of a graph z = F (x, y) plotted on xz-axes. (Namely the intersections of the
surface z = F (x, y) with the three planes y = 1.9, y = 2, y = 2.1.) For each statement below, circle the correct word.
2
Find the high and low points of the surface \(z = \sqrt{x^2 + y^2}\) with (x, y) varying over the square |x| ≤ 1, |y| ≤ 1. Discuss the
3
If \(t_0\) is a local minimum or maximum of the smooth function f (t) of one variable (t runs over all real numbers) then
\(f'(t_0) = 0\). Derive an analogous necessary condition for \(\vec{x}_0\) to be a local minimum or maximum of the smooth function
Stage 2
4✳
Let \(z = f(x, y) = (y^2 - x^2)^2\).
1. Make a reasonably accurate sketch of the level curves in the xy-plane of z = f (x, y) for z = 0, 1 and 16. Be sure to show
the units on the coordinate axes.
2. Verify that (0, 0) is a critical point for z = f (x, y), and determine from part (a) or directly from the formula for f (x, y)
whether (0, 0) is a local minimum, a local maximum or a saddle point.
3. Can you use the Second Derivative Test to determine whether the critical point (0, 0) is a local minimum, a local maximum
or a saddle point? Give reasons for your answer.
5✳
Use the Second Derivative Test to find all values of the constant c for which the function \(z = x^2 + cxy + y^2\) has a saddle
point at (0, 0).
6✳
7✳
Find all critical points for \(f(x, y) = x(x^2 + xy + y^2 - 9)\). Also find out which of these points give local maximum values for
f (x, y), which give local minimum values, and which give saddle points.
8✳
Find the largest and smallest values of \(x^2 y^2 z\) in the part of the plane 2x + y + z = 5 where x ≥ 0, y ≥ 0 and z ≥ 0. Also
find all points where those extreme values occur.
9
Find and classify all the critical points of \(f(x, y) = x^2 + y^2 + x^2 y + 4\).
10 ✳
Find all saddle points, local minima and local maxima of the function
\( f(x, y) = x^3 + x^2 - 2xy + y^2 - x. \)
11 ✳
Find and classify [as local maxima, local minima, or saddle points] all critical points of f (x, y).
12
Find the maximum and minimum values of \(f(x, y) = xy - x^3 y^2\) when (x, y) runs over the square 0 ≤ x ≤ 1, 0 ≤ y ≤ 1.
13
The temperature at all points in the disc \(x^2 + y^2 \le 1\) is given by \(T(x, y) = (x + y)e^{-x^2 - y^2}\). Find the maximum and minimum
14 ✳
1. For the function \(z = f(x, y) = x^3 + 3xy + 3y^2 - 6x - 3y - 6\). Find and classify as [local maxima, local minima, or
1. 2 2
f (x, y) = (x + y − 1)(x − y) + 1
2. \( f(x, y) = \sqrt{x^2 + y^2} \)
4. \( f(x, y) = x^2 + y^2 \)
15 ✳
Classify as [ local maxima, minima or saddle points] all critical points of f (x, y).
16 ✳
1. Find and classify the critical points of h(x, y) as local maxima, local minima or saddle points.
2. Find the maximum and minimum values of h(x, y) on the disk \(x^2 + y^2 \le 1\).
17 ✳
Find the absolute maximum and minimum values of the function \(f(x, y) = 5 + 2x - x^2 - 4y^2\) on the rectangular region
R = {(x, y) | −1 ≤ x ≤ 3, −1 ≤ y ≤ 1}
18 ✳
Find the minimum of the function h(x, y) = −4x − 2y + 6 on the closed bounded domain defined by \(x^2 + y^2 \le 1\).
19 ✳
20 ✳
Find and classify the critical points of \(f(x, y) = 3x^2 y + y^3 - 3x^2 - 3y^2 + 4\).
21 ✳
Consider the function
\( f(x, y) = 2x^3 - 6xy + y^2 + 4y \)
22 ✳
23 ✳
Let
f (x, y) = xy(x + 2y − 6)
24 ✳
defined in the xy-plane. Classify each critical point as a local minimum, maximum or saddle point.
25 ✳
A metal plate is in the form of a semi-circular disc bounded by the x-axis and the upper half of \(x^2 + y^2 = 4\). The temperature
at the point (x, y) is given by \(T(x, y) = \ln(1 + x^2 + y^2) - y\). Find the coldest point on the plate, explaining your steps
26 ✳
defined in the xy-plane. Classify each critical point as a local minimum, maximum or saddle point. Explain your reasoning.
27 ✳
28 ✳
29 ✳
on the quarter-circle \(D = \{(x, y) \mid x^2 + y^2 \le 4,\ x \ge 0,\ y \ge 0\}\).
30
Equal angle bends are made at equal distances from the two ends of a 100 metre long fence, so that the resulting three segment
fence can be placed along an existing wall to make an enclosure of trapezoidal shape. What is the largest possible area for such
an enclosure?
31
Find the most economical shape of a rectangular box that has a fixed volume V and that has no top.
Stage 3
32 ✳
1. Find the maximum and minimum values of T (x, y) on the disk D defined by \(x^2 + y^2 \le 4\).
2. Suppose an ant lives on the disk D. If the ant is initially at point (1, 1), in which direction should it move so as to increase
its temperature as quickly as possible?
3. Suppose that the ant moves at a velocity v ⃗ = ⟨−2, −1⟩ . What is its rate of increase of temperature as it passes through
(1, 1)?
33 ✳
where k > 0 is a constant. Find and classify all critical points of f (x, y) as local minima, local maxima, saddle points or points
of indeterminate type. Carefully distinguish the cases \(k < \frac{1}{2}\), \(k = \frac{1}{2}\) and \(k > \frac{1}{2}\).
34 ✳
1. Show that the function \(f(x, y) = 2x + 4y + \frac{1}{xy}\) has exactly one critical point in the first quadrant x > 0, y > 0, and find
35
An experiment yields data points \((x_i, y_i)\), i = 1, 2, ⋯ , n. We wish to find the straight line y = mx + b which “best” fits
the data. The definition of “best” is “minimizes the root mean square error”, i.e. minimizes \(\sum_{i=1}^{n} (m x_i + b - y_i)^2\). Find m
and b.
4. A very common error of logic that people make is “Affirming the consequent”. The truth of “If P then Q” does not imply that “If Q
then P” is true. The statement “If he is Shakespeare then he is dead” is true. But concluding from “That man is dead” that “He
must be Shakespeare” is just silly.
5. And you also saw, for example in Example 3.6.4 of the CLP-1 text, that critical points that are also inflection points are neither
local maxima nor local minima.
6. We have both types of music here — country and western.
7. This sort of thing is generally illegal.
8. Sorry about the pun.
9. Proof by search engine.
10. And has been used for a long time. It was introduced by the French mathematician Adrien-Marie Legendre, 1752–1833, in
1805, and by the German mathematician and physicist Carl Friedrich Gauss, 1777–1855, in 1809.
11. This is equivalent to translating the graph so that the critical point lies at (0, 0).
12. There are analogous results in higher dimensions that are accessible to people who have learned some linear algebra. They are
derived by diagonalizing the matrix of second derivatives, which is called the Hessian matrix.
13. The shackles of convention are not limited to mathematics. Election ballots often have the candidates listed in alphabetic order.
14. Indeed one can use the facts that 0 < x < ∞, that 0 < y < ∞, and that S → ∞ as x → 0 and as y → 0 and as x → ∞ and as
y → ∞ to prove that the single critical point gives the global minimum.
15. Recall that “extremal value” means “either maximum value or minimum value”.
16. Recall that if \(f'(a)\) does not exist, then a is called a singular point of f.
17. This is probably a good time to review the statement of Theorem 2.9.2.
18. It should be intuitively obvious from a sketch that the boundary of the disk \(x^2 + y^2 \le 1\) is the circle \(x^2 + y^2 = 1\). But if you
really need a formal definition, here it is. A point (a, b) is on the boundary of a set S if there is a sequence of points in S that
converges to (a, b) and there is also a sequence of points in the complement of S that converges to (a, b).
19. We actually found the critical points in Example 2.9.19. But, for the convenience of the reader, we'll repeat that here.
20. Even if you don't believe that “you can't have too many tools”, it is pretty dangerous to have to rely on just one tool.
21. If it contained odd powers too, we could consider the cases y ≥ 0 and y ≤ 0 separately and substitute \(y = \sqrt{1 - x^2}\) in the
former case and \(y = -\sqrt{1 - x^2}\) in the latter case.
22. We found (2, 0) as a solution to the critical point equations (E1), (E2). That's because, in the course of solving those equations,
we ignored the constraint that \(x^2 + y^2 \le 1\).
23. To be picky, any line in the xy-plane that is not parallel to the y axis.
24. In the third method, x has just been renamed to t.
This page titled 2.9: Maximum and Minimum Values is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by
Joel Feldman, Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a
detailed edit history is available upon request.
2.10: Lagrange Multipliers
In the last section we had to solve a number of problems of the form “What is the maximum value of the function f on the curve
C ?” In those examples, the curve C was simple enough that we could reduce the problem to finding the maximum of a function of
one variable. For more complicated problems this reduction might not be possible. In this section, we introduce another method for
solving such problems. First some nomenclature.
Definition 2.10.1
A problem of the form
“Find the maximum and minimum values of the function f (x, y) for (x, y) on the curve g(x, y) = 0. ”
is one type of constrained optimization problem. The function being maximized or minimized, f (x, y), is called the objective
function. The function, g(x, y), whose zero set is the curve of interest, is called the constraint function.
Such problems are quite common. As we said above, we have already encountered them in the last section on absolute maxima and
minima, when we were looking for the extreme values of a function on the boundary of a region. In economics “utility functions”
are used to model the relative “usefulness” or “desirability” or “preference” of various economic choices. For example, a utility
function U (w, κ) might specify the relative level of satisfaction a consumer would get from purchasing a quantity w of wine and κ
of coffee. If the consumer wants to spend $100 and wine costs $20 per unit and coffee costs $5 per unit, then the consumer would
like to maximize U (w, κ) subject to the constraint that 20w + 5κ = 100.
To this point we have always solved such constrained optimization problems either by
solving g(x, y) = 0 for y as a function of x (or for x as a function of y ) or by
parametrizing the curve g(x, y) = 0. This means writing all points of the curve in the form (x(t), y(t)) for some functions x(t)
and y(t). For example we used x(t) = cos t, y(t) = sin t as a parametrization of the circle \(x^2 + y^2 = 1\) in Example 2.9.21.
However quite often the function g(x, y) is so complicated that one cannot explicitly solve g(x, y) = 0 for y as a function of x or
for x as a function of y and one also cannot explicitly parametrize g(x, y) = 0. Or sometimes you can, for example, solve
g(x, y) = 0 for y as a function of x, but the resulting solution is so complicated that it is really hard, or even virtually impossible,
to work with. Direct attacks become even harder in higher dimensions when, for example, we wish to optimize a function
f (x, y, z) subject to a constraint g(x, y, z) = 0.
There is another procedure called the method of “Lagrange multipliers” 1 that comes to our rescue in these scenarios. Here is the
three dimensional version of the method. There are obvious analogs in other dimensions.
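Theorem 2.10.2
Let f (x, y, z) and g(x, y, z) have continuous first partial derivatives in a region of \(\mathbb{R}^3\) that contains the surface S given by the
equation g(x, y, z) = 0. Further assume that ∇g(x, y, z) ≠ 0 on S. If f, restricted to the surface S, has a local extreme value at
the point (a, b, c) on S, then there is a real number λ such that
∇f (a, b, c) = λ∇g(a, b, c)
that is
\( f_x(a, b, c) = \lambda\, g_x(a, b, c) \)
\( f_y(a, b, c) = \lambda\, g_y(a, b, c) \)
\( f_z(a, b, c) = \lambda\, g_z(a, b, c) \)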
Proof
Suppose that (a, b, c) is a point of S and that f (x, y, z) ≥ f (a, b, c) for all points (x, y, z) on S that are close to (a, b, c). That
is (a, b, c) is a local minimum for f on S. Of course the argument for a local maximum is virtually identical.
2.10.1 https://math.libretexts.org/@go/page/92243
Imagine that we go for a walk on S, with the time t running, say, from t = −1 to t = +1 and that at time t = 0 we happen to
be exactly at (a, b, c). Let's say that our position is (x(t), y(t), z(t)) at time t.
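Write
\( F(t) = f\big(x(t), y(t), z(t)\big) \)
So F (t) is the value of f that we see on our walk at time t. Then for all t close to 0, (x(t), y(t), z(t)) is close to
(x(0), y(0), z(0)) = (a, b, c) so that
\( F(t) = f\big(x(t), y(t), z(t)\big) \ge f(a, b, c) = F(0) \)
for all t close to zero. So F (t) has a local minimum at t = 0 and consequently \(F'(0) = 0\). By the chain rule,
\( F'(0) = f_x(a, b, c)\, x'(0) + f_y(a, b, c)\, y'(0) + f_z(a, b, c)\, z'(0) = \nabla f(a, b, c) \cdot \langle x'(0), y'(0), z'(0) \rangle = 0 \)
This is true for all paths on S that pass through (a, b, c) at time 0. In particular it is true for all vectors \(\langle x'(0), y'(0), z'(0) \rangle\)
that are tangent to S at (a, b, c). So ∇f (a, b, c) is perpendicular to S at (a, b, c). As ∇g(a, b, c) is nonzero and is also
perpendicular to S at (a, b, c), the two vectors must be parallel. That is,
∇f (a, b, c) = λ∇g(a, b, c)
for some number λ. That's the Lagrange multiplier rule of our theorem.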
So to find the maximum and minimum values of f (x, y, z) on a surface g(x, y, z) = 0, assuming that both the objective function
f (x, y, z) and constraint function g(x, y, z) have continuous first partial derivatives and that ∇g(x, y, z) ≠ 0, you
1. build up a list of candidate points (x, y, z) by finding all solutions to the equations
fx (x, y, z) = λ gx (x, y, z)
fy (x, y, z) = λ gy (x, y, z)
fz (x, y, z) = λ gz (x, y, z)
g(x, y, z) = 0
Note that there are four equations and four unknowns, namely x, y, z and λ.
2. Then you evaluate f (x, y, z) at each (x, y, z) on the list of candidates. The biggest of these candidate values is the absolute
maximum and the smallest of these candidate values is the absolute minimum.
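The two steps above can be sketched in code. The example below is purely illustrative and not taken from the text: it assumes the objective f = x + y + z and the constraint g = x^2 + y^2 + z^2 − 1, for which the four equations can be solved by hand to give x = y = z = 1/(2λ) with λ = ±√3/2. The function names are hypothetical. Step 1 is represented by checking that the hand-computed candidates really satisfy the equations, step 2 by evaluating f on the candidates, and a random sample of the sphere provides a rough cross-check.

```python
import math
import random

def f(p):
    x, y, z = p
    return x + y + z

def grad_f(p):
    return (1.0, 1.0, 1.0)

def g(p):
    x, y, z = p
    return x * x + y * y + z * z - 1.0

def grad_g(p):
    x, y, z = p
    return (2 * x, 2 * y, 2 * z)

def satisfies_lagrange(p, lam, tol=1e-9):
    """True when grad f = lam * grad g and g = 0 hold at p (up to tol)."""
    on_surface = abs(g(p)) < tol
    parallel = all(abs(a - lam * b) < tol for a, b in zip(grad_f(p), grad_g(p)))
    return on_surface and parallel

# Step 1: candidates (point, multiplier) found by solving the four equations by hand.
s = 1 / math.sqrt(3)
candidates = [((s, s, s), math.sqrt(3) / 2), ((-s, -s, -s), -math.sqrt(3) / 2)]

# Step 2: evaluate f at each candidate and keep the extremes.
values = [f(p) for p, _ in candidates]
f_max, f_min = max(values), min(values)

def random_point_on_sphere(rng):
    # Normalised Gaussian vector: a standard way to sample the unit sphere.
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-12:
            return tuple(c / norm for c in v)

rng = random.Random(0)
samples = [f(random_point_on_sphere(rng)) for _ in range(10000)]

print(f_max, f_min, max(samples), min(samples))
```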
Another way to write the system of equations in the first step is
Example 2.10.3
Find the maximum and minimum of the function \(x^2 - 10x - y^2\) on the ellipse whose equation is \(x^2 + 4y^2 = 16\).
Solution
For this problem the objective function is \(f(x, y) = x^2 - 10x - y^2\) and the constraint function is \(g(x, y) = x^2 + 4y^2 - 16\).
To apply the method of Lagrange multipliers we need ∇f and ∇g. So we start by computing the first order derivatives of these
functions.
fx = 2x − 10 fy = −2y gx = 2x gy = 8y
So, according to the method of Lagrange multipliers, we need to find all solutions to
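\( 2x - 10 = \lambda(2x) \)
\( -2y = \lambda(8y) \)
\( x^2 + 4y^2 - 16 = 0 \)
Rearranging these equations gives
\( (\lambda - 1)x = -5 \)   (E1)
\( (4\lambda + 1)y = 0 \)   (E2)
\( x^2 + 4y^2 - 16 = 0 \)   (E3)
From (E2), we see that we must have either \(\lambda = -\frac{1}{4}\) or y = 0.
If \(\lambda = -\frac{1}{4}\), (E1) gives \(-\frac{5}{4}x = -5\), i.e. x = 4, and then (E3) gives y = 0.
If y = 0, then (E3) gives x = ±4 (and while we could easily use (E1) to solve for λ, we don't actually need λ).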
So the method of Lagrange multipliers, Theorem 2.10.2 (actually the dimension two version of Theorem 2.10.2), gives that the
only possible locations of the maximum and minimum of the function f are (4, 0) and (−4, 0). To complete the problem, we
only have to compute f at those points.
point        (4, 0)   (−4, 0)
value of f   −24      56
             min      max
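One way to double-check this result without Lagrange multipliers is to parametrize the ellipse and scan. The sketch below assumes the parametrization x = 4 cos t, y = 2 sin t (chosen here because it satisfies x^2 + 4y^2 = 16; it is not part of the worked solution) and records the smallest and largest values of f along the curve.

```python
import math

def f(x, y):
    return x**2 - 10 * x - y**2

def ellipse_point(t):
    # Assumed parametrization of x^2 + 4 y^2 = 16.
    return 4 * math.cos(t), 2 * math.sin(t)

n = 3600
values = []
for k in range(n):
    x, y = ellipse_point(2 * math.pi * k / n)
    values.append((f(x, y), x, y))

f_min, x_min, y_min = min(values)
f_max, x_max, y_max = max(values)

print("min", f_min, "at", (x_min, y_min))
print("max", f_max, "at", (x_max, y_max))
```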
In the previous example, the objective function and the constraint were specified explicitly. That will not always be the case. In the
next example, we have to do a little geometry to extract them.
Example 2.10.4
Find the rectangle of largest area (with sides parallel to the coordinate axes) that can be inscribed in the ellipse \(x^2 + 2y^2 = 1\).
Solution
Since this question is so geometric, it is best to start by drawing a picture.
Call the coordinates of the upper right corner of the rectangle (x, y), as in the figure above. The four corners of the rectangle
are (±x, ±y) so the rectangle has width 2x and height 2y and the objective function is f (x, y) = 4xy. The constraint function
for this problem is \(g(x, y) = x^2 + 2y^2 - 1\). Again, to use Lagrange multipliers we need the first order partial derivatives.
fx = 4y fy = 4x gx = 2x gy = 4y
So, according to the method of Lagrange multipliers, we need to find all solutions to
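\( 4y = \lambda(2x) \)   (E1)
\( 4x = \lambda(4y) \)   (E2)
\( x^2 + 2y^2 - 1 = 0 \)   (E3)
Equation (E1) gives \(y = \frac{1}{2}\lambda x\). Substituting this into equation (E2) gives
\( 4x = 2\lambda^2 x \) or \( 2x(2 - \lambda^2) = 0 \)
So (E2) is satisfied if either x = 0 or \(\lambda = \sqrt{2}\) or \(\lambda = -\sqrt{2}\).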
If x = 0, then (E1) gives y = 0 too. But (0, 0) violates the constraint equation (E3). Note that, to have a solution, all of the
equations (E1), (E2) and (E3) must be satisfied.
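If \(\lambda = \sqrt{2}\), then
(E2) gives \(x = \sqrt{2}\,y\) and then
(E3) gives \(2y^2 + 2y^2 = 1\) or \(y^2 = \frac{1}{4}\) so that
\(y = \pm\frac{1}{2}\) and \(x = \sqrt{2}\,y = \pm\frac{1}{\sqrt{2}}\).
If \(\lambda = -\sqrt{2}\), then
(E2) gives \(x = -\sqrt{2}\,y\) and then
(E3) gives \(2y^2 + 2y^2 = 1\) or \(y^2 = \frac{1}{4}\) so that
\(y = \pm\frac{1}{2}\) and \(x = -\sqrt{2}\,y = \mp\frac{1}{\sqrt{2}}\).
We now have four possible values of (x, y), namely \((\frac{1}{\sqrt{2}}, \frac{1}{2})\), \((-\frac{1}{\sqrt{2}}, -\frac{1}{2})\), \((\frac{1}{\sqrt{2}}, -\frac{1}{2})\) and \((-\frac{1}{\sqrt{2}}, \frac{1}{2})\). They are the four
corners of a single rectangle. We said that we wanted (x, y) to be the upper right corner, i.e. the corner in the first quadrant. It
is \((\frac{1}{\sqrt{2}}, \frac{1}{2})\).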
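A quick numerical check of the corner found above: the sketch below assumes the first-quadrant parametrization x = cos t, y = sin(t)/√2 of the ellipse x^2 + 2y^2 = 1 (an illustrative choice, not part of the worked solution) and scans the area 4xy over 0 ≤ t ≤ π/2.

```python
import math

def area(x, y):
    # Rectangle with corners (±x, ±y).
    return 4 * x * y

def corner(t):
    # Assumed parametrization of the first-quadrant arc of x^2 + 2 y^2 = 1.
    return math.cos(t), math.sin(t) / math.sqrt(2)

n = 1000
scan = []
for k in range(n + 1):
    x, y = corner((math.pi / 2) * k / n)
    scan.append((area(x, y), x, y))

best_area, best_x, best_y = max(scan)
print(best_area, best_x, best_y)
```

The scan should peak at the corner (1/√2, 1/2), where the area is √2.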
Example 2.10.5
Find the ends of the major and minor axes of the ellipse \(3x^2 - 2xy + 3y^2 = 4\). They are the points on the ellipse that are
farthest from and nearest to the origin.
Solution
Let (x, y) be a point on \(3x^2 - 2xy + 3y^2 = 4\). This point is at the end of a major axis when it maximizes its distance from the
centre, (0, 0) of the ellipse. It is at the end of a minor axis when it minimizes its distance from (0, 0). So we wish to maximize
and minimize the distance \(\sqrt{x^2 + y^2}\) subject to the constraint
\( g(x, y) = 3x^2 - 2xy + 3y^2 - 4 = 0 \)
Now maximizing/minimizing \(\sqrt{x^2 + y^2}\) is equivalent 4 to maximizing/minimizing its square \(\big(\sqrt{x^2 + y^2}\big)^2 = x^2 + y^2\). So
we are free to choose the objective function
\( f(x, y) = x^2 + y^2 \)
which we will do, because it makes the derivatives cleaner. Again, we use Lagrange multipliers to solve this problem, so we
start by finding the partial derivatives.
\( f_x = 2x \qquad f_y = 2y \qquad g_x = 6x - 2y \qquad g_y = -2x + 6y \)
We need to find all solutions to
\( 2x = \lambda(6x - 2y) \)
\( 2y = \lambda(-2x + 6y) \)
\( 3x^2 - 2xy + 3y^2 - 4 = 0 \)
Dividing the first two equations by 2, and then collecting together the x's and the y's gives
\( (1 - 3\lambda)x + \lambda y = 0 \)   (E1)
\( \lambda x + (1 - 3\lambda)y = 0 \)   (E2)
\( 3x^2 - 2xy + 3y^2 - 4 = 0 \)   (E3)
To start, let's concentrate on the first two equations. Pretend, for a couple of minutes, that we already know the value of λ and
are trying to find x and y. Note that λ cannot be zero because if it is, (E1) forces x = 0 and (E2) forces y = 0 and (0, 0) is not
on the ellipse, i.e. violates (E3). So we may divide by λ and (E1) gives
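\( y = -\frac{1 - 3\lambda}{\lambda}\, x \)
Substituting this into (E2) gives \(\lambda x - \frac{(1 - 3\lambda)^2}{\lambda} x = 0\). Again, x cannot be zero, since then \(y = -\frac{1 - 3\lambda}{\lambda} x\) would give y = 0 and (0, 0) is still not on the ellipse.
So we may divide \(\lambda x - \frac{(1 - 3\lambda)^2}{\lambda} x = 0\) by x, giving
\( \lambda - \frac{(1 - 3\lambda)^2}{\lambda} = 0 \iff (1 - 3\lambda)^2 - \lambda^2 = 0 \iff 8\lambda^2 - 6\lambda + 1 = (2\lambda - 1)(4\lambda - 1) = 0 \)
We now know that λ must be either \(\frac{1}{2}\) or \(\frac{1}{4}\). Subbing these into either (E1) or (E2) gives
\( \lambda = \frac{1}{2} \implies -\frac{1}{2}x + \frac{1}{2}y = 0 \implies x = y \overset{E3}{\implies} 3x^2 - 2x^2 + 3x^2 = 4 \implies x = \pm 1 \)
\( \lambda = \frac{1}{4} \implies \frac{1}{4}x + \frac{1}{4}y = 0 \implies x = -y \overset{E3}{\implies} 3x^2 + 2x^2 + 3x^2 = 4 \implies x = \pm\frac{1}{\sqrt{2}} \)
Here “\(\overset{E3}{\implies}\)” indicates that we have just used (E3). We now have (x, y) = ±(1, 1), from \(\lambda = \frac{1}{2}\), and \((x, y) = \pm\big(\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}\big)\)
from \(\lambda = \frac{1}{4}\). The distance from (0, 0) to ±(1, 1), namely \(\sqrt{2}\), is larger than the distance from (0, 0) to \(\pm\big(\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}\big)\), namely
1. So the ends of the minor axes are \(\pm\big(\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}\big)\) and the ends of the major axes are ±(1, 1). Those ends are sketched in the
figure on the left below. Once we have the ends, it is an easy matter 5 to sketch the ellipse as in the figure on the right below.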
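The axis ends can be confirmed with a polar scan. Substituting x = r cos θ, y = r sin θ into 3x^2 − 2xy + 3y^2 = 4 gives r^2 (3 − sin 2θ) = 4; the sketch below uses that relation (derived here as a cross-check, not quoted from the solution) to find the largest and smallest distances from the origin.

```python
import math

def on_ellipse(x, y, tol=1e-9):
    return abs(3 * x * x - 2 * x * y + 3 * y * y - 4) < tol

def radius(theta):
    # From r^2 * (3 - sin(2*theta)) = 4.
    return math.sqrt(4.0 / (3.0 - math.sin(2.0 * theta)))

n = 3600
scan = [(radius(2 * math.pi * k / n), 2 * math.pi * k / n) for k in range(n)]
r_max, theta_max = max(scan)
r_min, theta_min = min(scan)

major_end = (r_max * math.cos(theta_max), r_max * math.sin(theta_max))
minor_end = (r_min * math.cos(theta_min), r_min * math.sin(theta_min))

print("major axis end:", major_end, "distance", r_max)
print("minor axis end:", minor_end, "distance", r_min)
```

The largest distance should be √2, along the line y = x, and the smallest should be 1, along the line y = −x.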
Example 2.10.6
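Find the values of w ≥ 0 and κ ≥ 0 that maximize the utility function
\( U(w, \kappa) = 6\, w^{2/3} \kappa^{1/3} \) subject to the constraint 4w + 2κ = 12
Solution
The constraint 4w + 2κ = 12 is simple enough that we can easily use it to express κ in terms of w, then substitute
κ = 6 − 2w into U (w, κ), and then maximize the resulting function of the one variable w.
However, for practice purposes, we'll use Lagrange multipliers with the objective function \(U(w, \kappa) = 6\, w^{2/3} \kappa^{1/3}\) and the
constraint function g(w, κ) = 4w + 2κ − 12. The first order derivatives of these functions are
\( U_w = 4\, w^{-1/3} \kappa^{1/3} \qquad U_\kappa = 2\, w^{2/3} \kappa^{-2/3} \qquad g_w = 4 \qquad g_\kappa = 2 \)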
The boundary values w = 0 and κ = 0 give utility 0, which is obviously not going to be the maximum utility. So it suffices to
consider only local maxima. According to the method of Lagrange multipliers, we need to find all solutions to
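\( 4\, w^{-1/3} \kappa^{1/3} = 4\lambda \)   (E1)
\( 2\, w^{2/3} \kappa^{-2/3} = 2\lambda \)   (E2)
\( 4w + 2\kappa - 12 = 0 \)   (E3)
Then (E1) gives \(\lambda = w^{-1/3} \kappa^{1/3}\). Substituting this into (E2) gives \(w^{2/3} \kappa^{-2/3} = w^{-1/3} \kappa^{1/3}\) and hence w = κ.
Then (E3) gives 6w = 12, so that w = κ = 2 and the maximum utility is U (2, 2) = 12.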
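As a numerical cross-check of the system (E1), (E2), (E3), the sketch below takes the direct route mentioned at the start of the solution: it uses the constraint to write κ = 6 − 2w and scans the resulting one-variable utility over 0 ≤ w ≤ 3. The grid size is an arbitrary choice.

```python
def utility(w, kappa):
    return 6 * w ** (2 / 3) * kappa ** (1 / 3)

def kappa_from_budget(w):
    # Solve the constraint 4w + 2*kappa = 12 for kappa.
    return 6 - 2 * w

n = 3000
scan = []
for k in range(n + 1):
    w = 3 * k / n              # 0 <= w <= 3 keeps kappa >= 0
    kappa = kappa_from_budget(w)
    scan.append((utility(w, kappa), w, kappa))

best_u, best_w, best_kappa = max(scan)

# Lagrange conditions at the maximizer: both ratios U_w/g_w and U_kappa/g_kappa equal lambda.
lam_from_e1 = best_w ** (-1 / 3) * best_kappa ** (1 / 3)
lam_from_e2 = best_w ** (2 / 3) * best_kappa ** (-2 / 3)

print(best_u, best_w, best_kappa, lam_from_e1, lam_from_e2)
```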
Example 2.10.7
Since
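\( 2(x - 1) = \lambda(2x) \iff x = \frac{1}{1 - \lambda} \)   (E1)
\( 2(y - 2) = \lambda(2y) \iff y = \frac{2}{1 - \lambda} \)   (E2)
\( 2(z - 3) = \lambda(2z) \iff z = \frac{3}{1 - \lambda} \)   (E3)
\( 0 = x^2 + y^2 + z^2 - 1 \)   (E4)
Substituting (E1), (E2) and (E3) into (E4) gives
\( \frac{1 + 4 + 9}{(1 - \lambda)^2} - 1 = 0 \implies (1 - \lambda)^2 = 14 \implies 1 - \lambda = \pm\sqrt{14} \)
We can then substitute these two values of λ back into the expressions for x, y, z in terms of λ to get the two points
\(\frac{1}{\sqrt{14}}(1, 2, 3)\) and \(-\frac{1}{\sqrt{14}}(1, 2, 3)\).
The vector from \(\frac{1}{\sqrt{14}}(1, 2, 3)\) to (1, 2, 3), which is \(\big\{1 - \frac{1}{\sqrt{14}}\big\}(1, 2, 3)\), is shorter than the vector from
\(-\frac{1}{\sqrt{14}}(1, 2, 3)\) to (1, 2, 3), which is \(\big\{1 + \frac{1}{\sqrt{14}}\big\}(1, 2, 3)\). So the nearest point is \(\frac{1}{\sqrt{14}}(1, 2, 3)\) and the farthest point is
\(-\frac{1}{\sqrt{14}}(1, 2, 3)\).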
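The two candidate points can be checked directly. The sketch below assumes, as equation (E4) indicates, that the constraint is the unit sphere x^2 + y^2 + z^2 = 1 and that distances are measured to the point (1, 2, 3); it verifies that both candidates lie on the sphere, computes their distances, and compares them with a random sample of points on the sphere.

```python
import math
import random

TARGET = (1.0, 2.0, 3.0)

def dist(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

scale = 1 / math.sqrt(14)
nearest = tuple(scale * c for c in TARGET)
farthest = tuple(-scale * c for c in TARGET)

d_near = dist(nearest, TARGET)   # expected sqrt(14) - 1
d_far = dist(farthest, TARGET)   # expected sqrt(14) + 1

def random_point_on_sphere(rng):
    # Normalised Gaussian vector: a standard way to sample the unit sphere.
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-12:
            return tuple(c / norm for c in v)

rng = random.Random(1)
sample_dists = [dist(random_point_on_sphere(rng), TARGET) for _ in range(10000)]

print(d_near, d_far, min(sample_dists), max(sample_dists))
```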
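Let f (x, y, z), g(x, y, z) and h(x, y, z) have continuous first partial derivatives in a region of \(\mathbb{R}^3\) that contains the curve C
given by the equations
g(x, y, z) = h(x, y, z) = 0
Assume 6 that ∇g(x, y, z) × ∇h(x, y, z) ≠ 0 on C. If f, restricted to the curve C, has a local extreme value at the point
(a, b, c) on C, then there are real numbers λ and μ such that
∇f (a, b, c) = λ∇g(a, b, c) + μ∇h(a, b, c)
that is
\( f_x(a, b, c) = \lambda\, g_x(a, b, c) + \mu\, h_x(a, b, c) \)
\( f_y(a, b, c) = \lambda\, g_y(a, b, c) + \mu\, h_y(a, b, c) \)
\( f_z(a, b, c) = \lambda\, g_z(a, b, c) + \mu\, h_z(a, b, c) \)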
It is a function of five variables — the original variables x, y and z, and two auxiliary variables λ and μ. If there is a local extreme
value at (a, b, c) then (a, b, c) must obey
Equation 2.10.9
0 = Lλ (a, b, c, λ, μ) = g(a, b, c)
0 = Lμ (a, b, c, λ, μ) = h(a, b, c)
for some λ and μ. So solving this system of five equations in five unknowns gives all possible candidates for the locations of local
maxima and minima. We'll go through an example shortly.
Call your velocity vector \(\vec{v}\). It is tangent to the curve g(x, y, z) = h(x, y, z) = 0. Because f has a local minimum at (a, b, c), f
must be increasing (or constant) as we leave (a, b, c). So the directional derivative
\( D_{\vec{v}} f(a, b, c) = \nabla f(a, b, c) \cdot \vec{v} \ge 0 \)
Now start over. Again walk away from (a, b, c) along the curve g = h = 0, but this time moving in the opposite direction, with
velocity vector \(-\vec{v}\). Again f must be increasing (or constant) as we leave (a, b, c), so the directional derivative
\( D_{-\vec{v}} f(a, b, c) = \nabla f(a, b, c) \cdot (-\vec{v}) \ge 0 \)
As both \(\nabla f(a, b, c) \cdot \vec{v}\) and \(-\nabla f(a, b, c) \cdot \vec{v}\) are at least zero, we now have that
\( \nabla f(a, b, c) \cdot \vec{v} = 0 \)   (∗)
for all vectors \(\vec{v}\) that are tangent to the curve g = h = 0 at (a, b, c). Let's denote by T the set of all vectors \(\vec{v}\) that are tangent to
the curve g = h = 0 at (a, b, c) and let's denote by \(T^\perp\) the set of all vectors that are perpendicular to all vectors in T. So (∗)
says that ∇f (a, b, c) must be in \(T^\perp\).
We now find all vectors in \(T^\perp\). We can easily guess two such vectors. Since the curve g = h = 0 lies inside the surface g = 0
and ∇g(a, b, c) is normal to g = 0 at (a, b, c), we have
\( \nabla g(a, b, c) \cdot \vec{v} = 0 \)   (E1)
Similarly, since the curve g = h = 0 lies inside the surface h = 0 and ∇h(a, b, c) is normal to h = 0 at (a, b, c), we have
\( \nabla h(a, b, c) \cdot \vec{v} = 0 \)   (E2)
Picking any two constants λ and μ, multiplying (E1) by λ, multiplying (E2) by μ and adding gives that
\( \big(\lambda \nabla g(a, b, c) + \mu \nabla h(a, b, c)\big) \cdot \vec{v} = 0 \)
for all vectors \(\vec{v}\) in T. Thus, for all λ and μ, the vector λ∇g(a, b, c) + μ∇h(a, b, c) is in \(T^\perp\).
Now the vectors in T form a line. (They are all tangent to the same curve at the same point.) So, \(T^\perp\), the set of all vectors
perpendicular to T, forms a plane. As λ and μ run over all real numbers, the vectors λ∇g(a, b, c) + μ∇h(a, b, c) form a plane.
Thus we have found all vectors in \(T^\perp\) and we conclude that ∇f (a, b, c) must be of the form λ∇g(a, b, c) + μ∇h(a, b, c) for
some real numbers λ and μ. The three components of the equation ∇f (a, b, c) = λ∇g(a, b, c) + μ∇h(a, b, c)
are exactly the first three equations of 2.10.9. This completes the explanation of why Lagrange multipliers work in this setting.
Example 2.10.10
Find the distance from the origin to the curve that is the intersection of the two surfaces
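\( z^2 = x^2 + y^2 \qquad x - 2z = 3 \)
Solution
Yet again, we simplify the algebra by minimizing the square of the distance rather than the distance itself. So we are to
minimize
\( f(x, y, z) = x^2 + y^2 + z^2 \)
subject to the constraints
\( 0 = g(x, y, z) = x^2 + y^2 - z^2 \qquad 0 = h(x, y, z) = x - 2z - 3 \)
Since
\( f_x = 2x \qquad f_y = 2y \qquad f_z = 2z \)
\( g_x = 2x \qquad g_y = 2y \qquad g_z = -2z \)
\( h_x = 1 \qquad h_y = 0 \qquad h_z = -2 \)
the method of Lagrange multipliers requires us to find all solutions to
\( 2x = \lambda(2x) + \mu(1) \)   (E1)
\( 2y = \lambda(2y) + \mu(0) \)   (E2)
\( 2z = \lambda(-2z) + \mu(-2) \)   (E3)
\( z^2 = x^2 + y^2 \)   (E4)
\( x - 2z = 3 \)   (E5)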
Since equation (E2) factors so nicely we start there. It tells us that either y = 0 or λ = 1.
Case λ = 1: When λ = 1 the remaining equations reduce to
0 =μ (E1)
0 = 4z + 2μ (E3)
\( z^2 = x^2 + y^2 \)   (E4)
x − 2z = 3 (E5)
So
equation (E1) gives μ = 0.
Then substituting μ = 0 into (E3) gives z = 0.
Then substituting z = 0 into (E5) gives x = 3.
Then substituting z = 0 and x = 3 into (E4) gives \(0 = 9 + y^2\), which is impossible, since \(9 + y^2 \ge 9 > 0\) for all y.
So we can't have λ = 1.
Case y = 0: When y = 0 the remaining equations reduce to
2(1 − λ)x = μ (E1)
(1 + λ)z = −μ (E3)
\( z^2 = x^2 \)   (E4)
x − 2z = 3 (E5)
These don't clean up quite so nicely as in the λ = 1 case. But at least equation (E4) tells us that z = ±x. So we have to
consider those two possibilities.
Subcase y = 0, z = x: When y = 0 and z = x, the remaining equations reduce to
2(1 − λ)x = μ (E1)
(1 + λ)x = −μ (E3)
−x = 3 (E5)
So equation (E5) now tells us that x = −3 so that (x, y, z) = (−3, 0, −3). (We don't really care what λ and μ are. But as they
obey −6(1 − λ) = μ, −3(1 + λ) = −μ we have, adding the two equations together,
−9 + 3λ = 0 ⟹ λ = 3
and then μ = −6(1 − λ) = 12.)
Subcase y = 0, z = −x: When y = 0 and z = −x, the remaining equations reduce to
2(1 − λ)x = μ (E1)
(1 + λ)x = μ (E3)
3x = 3 (E5)
So equation (E5) now tells us that x = 1 so that (x, y, z) = (1, 0, −1). (Again, we don't really care what λ and μ are. But as
they obey 2(1 − λ) = μ, (1 + λ) = μ we have, subtracting the second equation from the first,
\( 1 - 3\lambda = 0 \implies \lambda = \frac{1}{3} \)
and then \(\mu = 1 + \lambda = \frac{4}{3}\).)
Conclusion: We have two candidates for the location of the max and min, namely (−3, 0, −3) and (1, 0, −1). The first is a distance \(3\sqrt{2}\) from the origin, giving the maximum, and the second is a distance \(\sqrt{2}\) from the origin, giving the minimum. In particular, the distance is \(\sqrt{2}\).
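As a hedged sanity check of this example, the short Python sketch below substitutes the two candidate points into the five Lagrange equations and compares their distances from the origin. The helper name `lagrange_residuals` and the multiplier values μ = 12 and μ = 4/3 (computed here from \(2(1-\lambda)x = \mu\)) are our own additions for illustration, not part of the text.

```python
import math

def lagrange_residuals(x, y, z, lam, mu):
    """Residuals of grad f = lam*grad g + mu*grad h and of both constraints,
    for f = x^2+y^2+z^2, g = x^2+y^2-z^2, h = x-2z-3."""
    return [
        2 * x - (lam * 2 * x + mu * 1),
        2 * y - (lam * 2 * y + mu * 0),
        2 * z - (lam * (-2 * z) + mu * (-2)),
        x * x + y * y - z * z,
        x - 2 * z - 3,
    ]

# (x, y, z, lambda, mu); the mu values are assumptions derived from (E1)
candidates = [
    (-3.0, 0.0, -3.0, 3.0, 12.0),
    (1.0, 0.0, -1.0, 1 / 3, 4 / 3),
]

for x, y, z, lam, mu in candidates:
    # every residual should vanish up to rounding error
    assert all(abs(r) < 1e-9 for r in lagrange_residuals(x, y, z, lam, mu))

distances = [math.sqrt(x * x + y * y + z * z) for x, y, z, _, _ in candidates]
min_distance = min(distances)
max_distance = max(distances)
```

Both candidates pass the residual check, and the smaller of the two distances is the one reported as the distance from the origin to the curve.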
Exercises
Stage 1
1✳
1. Does the function \(f(x, y) = x^2 + y^2\) have a maximum or a minimum on the curve xy = 1? Explain.
2
The surface S is given by the equation g(x, y, z) = 0. You are walking on S measuring the function f (x, y, z) as you go. You
are currently at the point \((x_0, y_0, z_0)\) where f takes its largest value on S, and are walking in the direction \(\vec{d} \ne 0\). Because
1. What is the directional derivative of f at \((x_0, y_0, z_0)\) in the direction \(\vec{d}\)? Do not use the method of Lagrange multipliers.
2. What is the directional derivative of f at \((x_0, y_0, z_0)\) in the direction \(\vec{d}\)? This time use the method of Lagrange multipliers.
Stage 2
3
Find the maximum and minimum values of the function f(x, y, z) = x + y − z on the sphere \(x^2 + y^2 + z^2 = 1\).
4
Find a, b and c so that the volume \(\frac{4\pi}{3}abc\) of an ellipsoid \(\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1\) passing through the point (1, 2, 1) is as small as possible.
5✳
6✳
Use the Method of Lagrange Multipliers to find the radius of the base and the height of a right circular cylinder of maximum
volume which can be fit inside the unit sphere \(x^2 + y^2 + z^2 = 1\).
7✳
Use the method of Lagrange Multipliers to find the maximum and minimum values of
f (x, y) = xy
subject to the constraint
\[ x^2 + 2y^2 = 1. \]
8✳
9✳
10 ✳
Use Lagrange multipliers to find the maximum and minimum values of the function \(f(x, y, z) = x^2 + y^2 - \frac{1}{20}z^2\) on the curve of intersection of the plane x + 2y + z = 10 and the paraboloid \(x^2 + y^2 - z = 0\).
11 ✳
Find the point P = (x, y, z) (with x, y and z > 0) on the surface \(x^3 y^2 z = 6\sqrt{3}\) that is closest to the origin.
12 ✳
13 ✳
Find the radius of the largest sphere centred at the origin that can be inscribed inside (that is, enclosed inside) the ellipsoid
\[ 2(x+1)^2 + y^2 + 2(z-1)^2 = 8 \]
14 ✳
Let C be the intersection of the plane x + y + z = 2 and the sphere \(x^2 + y^2 + z^2 = 2\).
15 ✳
1. Use Lagrange multipliers to find the extreme values of
\[ f(x, y, z) = (x-2)^2 + (y+2)^2 + (z-4)^2 \]
on the sphere \(x^2 + y^2 + z^2 = 6\).
16 ✳
1. Find the minimum of the function
\[ f(x, y, z) = (x-2)^2 + (y-1)^2 + z^2 \]
subject to the constraint \(x^2 + y^2 + z^2 = 1\), using the method of Lagrange multipliers.
17 ✳
Use Lagrange multipliers to find the minimum and maximum values of \((x + z)e^y\) subject to \(x^2 + y^2 + z^2 = 6\).
18 ✳
19
Find the ends of the major and minor axes of the ellipse \(3x^2 - 2xy + 3y^2 = 4\).
20 ✳
A closed rectangular box with a volume of 96 cubic meters is to be constructed of two materials. The material for the top costs
twice as much per square meter as that for the sides and bottom. Use the method of Lagrange multipliers to find the dimensions
of the least expensive box.
21 ✳
\[ T(x, y, z) = 40xy^2z \]
22 ✳
Find the dimensions of the box of maximum volume which has its faces parallel to the coordinate planes and which is
contained inside the region \(0 \le z \le 48 - 4x^2 - 3y^2\).
23 ✳
A rectangular bin is to be made of a wooden base and heavy cardboard with no top. If wood is three times more expensive than
cardboard, find the dimensions of the cheapest bin which has a volume of \(12\,\text{m}^3\).
24 ✳
A closed rectangular box having a volume of 4 cubic metres is to be built with material that costs $8 per square metre for the
sides but $12 per square metre for the top and bottom. Find the least expensive dimensions for the box.
25 ✳
Suppose that a, b, c are all greater than zero and let D be the pyramid bounded by the plane ax + by + cz = 1 and the 3
coordinate planes. Use the method of Lagrange multipliers to find the largest possible volume of D if the plane
ax + by + cz = 1 is required to pass through the point (1, 2, 3). (The volume of a pyramid is equal to one-third of the area of
Stage 3
26 ✳
Use Lagrange multipliers to find the minimum distance from the origin to all points on the intersection of the curves
g(x, y, z) = x − z − 4 = 0
and h(x, y, z) = x +y +z−3 = 0
27 ✳
on the sphere \(x^2 + y^2 + z^2 = 36\). Determine all points at which these values occur.
28 ✳
1. 1. Give the system of equations that must be solved in order to find the warmest and coolest point on the circle \(x^2 + y^2 = 100\) by the method of Lagrange multipliers.
   2. Find the warmest and coolest points on the circle by solving that system.
2. 1. Give the system of equations that must be solved in order to find the critical points of T(x, y).
   2. Find the critical points by solving that system.
3. Find the coolest point on the solid disc \(x^2 + y^2 \le 100\).
29 ✳
1. By finding the points of tangency, determine the values of c for which x + y + z = c is a tangent plane to the surface \(4x^2 + 4y^2 + z^2 = 96\).
2. Use the method of Lagrange Multipliers to determine the absolute maximum and minimum values of the function f(x, y, z) = x + y + z along the surface \(g(x, y, z) = 4x^2 + 4y^2 + z^2 = 96\).
30
Let f (x, y) have continuous partial derivatives. Consider the problem of finding local minima and maxima of f (x, y) on the
curve xy = 1.
Define g(x, y) = xy − 1. According to the method of Lagrange multipliers, if (x, y) is a local minimum or maximum of
f (x, y) on the curve xy = 1, then there is a real number λ such that
∇f (x, y) = λ∇g(x, y), g(x, y) = 0
Define \(F(x) = f\big(x, \tfrac{1}{x}\big)\). If x ≠ 0 is a local minimum or maximum of F(x), we have that
\[ F'(x) = 0 \]
if and only if
x ≠ 0 obeys (E2) and \(y = \frac{1}{x}\).
1. Joseph-Louis Lagrange was actually born Giuseppe Lodovico Lagrangia in Turin, Italy in 1736. He moved to Berlin in 1766
and then to Paris in 1786. He eventually acquired French citizenship and then the French claimed he was a French
mathematician, while the Italians continued to claim that he was an Italian mathematician.
2. We call L an auxiliary function because, while we use it to help solve the problem, it doesn't actually appear in either the
statement of the question or in the answer itself
3. Some people use L(x, y, z, λ) = f (x, y, z) + λ g(x, y, z) instead. This amounts to renaming λ to −λ. While we care that λ
has a value, we don't care what it is.
4. The function \(S(z) = z^2\) is a strictly increasing function for z ≥ 0. So, for a, b ≥ 0, the statement “a < b” is equivalent to the
This page titled 2.10: Lagrange Multipliers is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Joel
Feldman, Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a detailed
edit history is available upon request.
CHAPTER OVERVIEW
3: Multiple Integrals
In your previous calculus courses you defined and worked with single variable integrals, like \(\int_a^b f(x)\,dx\). In this chapter, we define and work with multivariable integrals, like \(\iint_R f(x, y)\,dx\,dy\) and \(\iiint_V f(x, y, z)\,dx\,dy\,dz\). We start with two variable integrals.
3.1: Double Integrals
3.2: Double Integrals in Polar Coordinates
3.3: Applications of Double Integrals
3.4: Surface Area
3.5: Triple Integrals
3.6: Triple Integrals in Cylindrical Coordinates
3.7: Triple Integrals in Spherical Coordinates
3.8: Optional— Integrals in General Coordinates
Thumbnail: A diagram depicting a worked triple integral example. The question is "Find the volume of the region bounded above by the sphere \(x^2 + y^2 + z^2 = a^2\) and below by the cone \(z^2 \sin^2(a) = (x^2 + y^2)\cos^2(a)\) where a is in the interval [0, π] (Public
This page titled 3: Multiple Integrals is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Joel Feldman,
Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit
history is available upon request.
3.1: Double Integrals
Vertical Slices
Suppose that you want to compute the mass of a plate that fills the region R in the xy-plane. Suppose further that the density of the
plate, say in kilograms per square meter, depends on position. Call the density f (x, y). For simplicity we'll assume that R is the
region between the bottom curve y = B(x) and the top curve y = T (x) with x running from a to b. That is,
\[ R = \{\, (x, y) \mid a \le x \le b,\ B(x) \le y \le T(x) \,\} \]
We'll shortly express that mass as a two dimensional integral. As a warmup, recall the procedure that we used to set up a (one
dimensional) integral representing the area of R in Example 1.5.1 of the CLP-2 text.
Pick a natural number n (that we will later send to infinity), and then
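subdivide R into n narrow vertical slices, each of width \(\Delta x = \frac{b-a}{n}\). Denote by \(x_i = a + i\,\Delta x\) the x-coordinate of the right hand edge of slice number i.
For each i = 1, 2, … , n, slice number i has x running from \(x_{i-1}\) to \(x_i\). We approximate its area by the area of a rectangle. We pick a number \(x_i^*\) between \(x_{i-1}\) and \(x_i\) and approximate slice i by a rectangle whose top is at \(y = T(x_i^*)\) and whose bottom is at \(y = B(x_i^*)\), so that the Riemann sum approximation to the area of R is
\[ \text{Area} \approx \sum_{i=1}^{n} \big[T(x_i^*) - B(x_i^*)\big]\Delta x \]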
By taking the limit as n → ∞ (i.e. taking the limit as the width of the rectangles goes to zero), we convert the Riemann sum
into a definite integral (see Definition 1.1.9 in the CLP-2 text) and at the same time our approximation of the area becomes the
exact area:
\[ \text{Area} = \lim_{n\to\infty} \sum_{i=1}^{n} \big[T(x_i^*) - B(x_i^*)\big]\Delta x = \int_a^b \big[T(x) - B(x)\big]\,dx \]
Now we can expand that procedure to yield the mass of R rather than the area of R. We just have to replace our approximation \(\big[T(x_i^*) - B(x_i^*)\big]\Delta x\) of the area of slice i by an approximation to the mass of slice i. To do so, we
Pick a natural number m (that we will later send to infinity), and then
subdivide slice number i into m tiny rectangles, each of width \(\Delta x\) and of height \(\Delta y = \frac{1}{m}\big[T(x_i^*) - B(x_i^*)\big]\). Denote by \(y_j = B(x_i^*) + j\,\Delta y\) the y-coordinate of the top of rectangle number j.
At this point we approximate the density inside each rectangle by a constant. For each j = 1, 2, … , m, rectangle number j has y running from \(y_{j-1}\) to \(y_j\). We pick a number \(y_j^*\) between \(y_{j-1}\) and \(y_j\) and approximate the density on rectangle number j in slice number i by the constant \(f(x_i^*, y_j^*)\). So
\[ \text{Mass of slice } i \approx \sum_{j=1}^{m} f(x_i^*, y_j^*)\,\Delta x\,\Delta y \]
By taking the limit as m → ∞ (i.e. taking the limit as the height of the rectangles goes to zero), we convert the Riemann sum
into a definite integral:
\[ \text{Mass of slice } i \approx \Delta x \int_{B(x_i^*)}^{T(x_i^*)} f(x_i^*, y)\,dy = F(x_i^*)\,\Delta x \]
where
\[ F(x) = \int_{B(x)}^{T(x)} f(x, y)\,dy \]
Notice that, while we started with the density f (x, y) being a function of both x and y, by taking the limit of this Riemann sum,
we have “integrated out” the dependence on y. As a result, F (x) is a function of x only, not of x and y.
Finally taking the limit as n → ∞ (i.e. taking the limit as the slice width goes to zero), we get
\[ \text{Mass} = \lim_{n\to\infty} \sum_{i=1}^{n} \Delta x \int_{B(x_i^*)}^{T(x_i^*)} f(x_i^*, y)\,dy = \lim_{n\to\infty} \sum_{i=1}^{n} F(x_i^*)\,\Delta x \]
This is our first double integral. There are a couple of different standard notations for this integral.
Definition 3.1.1
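The double integral of the function f(x, y) over the region R is denoted
\[ \iint_R f(x, y)\,dx\,dy = \int_a^b \Big[\int_{B(x)}^{T(x)} f(x, y)\,dy\Big]dx = \int_a^b \int_{B(x)}^{T(x)} f(x, y)\,dy\,dx = \int_a^b dx \int_{B(x)}^{T(x)} dy\, f(x, y) \]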
The last three integrals here are called iterated integrals, for obvious reasons.
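The vertical-slice construction can be mimicked numerically. The following is a minimal sketch, not the text's own method: it replaces the two limits by finite midpoint Riemann sums, and the function name `iterated_integral`, its parameter names, and the default values of `n` and `m` are our own choices.

```python
def iterated_integral(f, a, b, bottom, top, n=200, m=200):
    """Approximate int_a^b dx int_{bottom(x)}^{top(x)} dy f(x, y)
    with n vertical slices, each cut into m small rectangles."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x_star = a + (i + 0.5) * dx              # sample point inside slice i
        lo, hi = bottom(x_star), top(x_star)
        dy = (hi - lo) / m
        # inner sum approximates F(x_star), the integral of f(x_star, y) over y
        inner = sum(f(x_star, lo + (j + 0.5) * dy) for j in range(m)) * dy
        total += inner * dx                      # outer sum over the slices
    return total

# Constant density 1 on the unit square: the "mass" is just the area.
area = iterated_integral(lambda x, y: 1.0, 0.0, 1.0, lambda x: 0.0, lambda x: 1.0)
# Density x*y on the unit square: the exact mass is 1/4.
mass = iterated_integral(lambda x, y: x * y, 0.0, 1.0, lambda x: 0.0, lambda x: 1.0)
```

The inner sum plays the role of F(x) and the outer sum plays the role of the integral of F(x) over x, which is the order of evaluation described for iterated integrals.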
Note that
to evaluate the integral \(\int_a^b \int_{B(x)}^{T(x)} f(x, y)\,dy\,dx\),
first evaluate the inside integral \(\int_{B(x)}^{T(x)} f(x, y)\,dy\) using the inside limits of integration, and by treating x as a constant and using standard single variable integration techniques, such as those in the CLP-2 text. The result of the inside integral is a function of x only. Call it F(x).
Then evaluate the outside integral \(\int_a^b F(x)\,dx\), whose integrand is the answer to the inside integral. Again, this integral is
to evaluate the integral \(\int_a^b dx \int_{B(x)}^{T(x)} dy\, f(x, y)\),
first evaluate the inside integral \(\int_{B(x)}^{T(x)} dy\, f(x, y)\) using the limits of integration that are directly beside the dy. Indeed the dy is written directly beside \(\int_{B(x)}^{T(x)}\) to make it clear that the limits of integration B(x) and T(x) are for the y-integral. In the past you probably wrote this integral as \(\int_{B(x)}^{T(x)} f(x, y)\,dy\). The result of the inside integral is again a function of x only. Call it F(x).
Then evaluate the outside integral \(\int_a^b dx\, F(x)\), whose integrand is the answer to the inside integral and whose limits of
Horizontal Slices
We found, when computing areas of regions in the xy-plane, that it is often advantageous to use horizontal slices, rather than
vertical slices. See, for example, Example 1.5.4 in the CLP-2 text. The same is true when setting up multidimensional integrals. So
we now repeat the setup procedure of the last section, but starting with horizontal slices, rather than vertical slices. This procedure
will be useful when dealing with regions of the form
\[ R = \{\, (x, y) \mid c \le y \le d,\ L(y) \le x \le R(y) \,\} \]
Here L(y) (“L” stands for “left”) is the smallest¹ allowed value of x, when the y-coordinate is y, and R(y) (“R” stands for “right”) is the largest allowed value of x, when the y-coordinate is y. Suppose that we wish to evaluate the mass of a plate that fills the region R, and that the density of the plate is f(x, y). We follow essentially the same procedure as we used with vertical slices, but with the roles of x and y swapped.
Pick a natural number n (that we will later send to infinity). Then
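subdivide the interval c ≤ y ≤ d into n narrow subintervals, each of width \(\Delta y = \frac{d-c}{n}\). Each subinterval cuts a thin horizontal slice from the region.
We pick a number \(y_i^*\) in subinterval number i and approximate slice i by a rectangle whose left side is at \(x = L(y_i^*)\) and whose right side is at \(x = R(y_i^*)\).
If we were computing the area of R, we would now approximate the area of slice i by \(\big[R(y_i^*) - L(y_i^*)\big]\Delta y\), which is the area of that rectangle.
To get the mass instead, we subdivide slice number i into m tiny rectangles, each of height \(\Delta y\) and of width \(\Delta x = \frac{1}{m}\big[R(y_i^*) - L(y_i^*)\big]\).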
For each j = 1, 2, … , m, rectangle number j has x running over a very narrow range. We pick a number \(x_j^*\) somewhere in that range. See the small black rectangle in the figure below.
\[ \text{Mass of slice } i \approx \sum_{j=1}^{m} f(x_j^*, y_i^*)\,\Delta x\,\Delta y \]
By taking the limit as m → ∞ (i.e. taking the limit as the width of the rectangles goes to zero), we convert the Riemann
sum into a definite integral:
\[ \text{Mass of slice } i \approx \Delta y \int_{L(y_i^*)}^{R(y_i^*)} f(x, y_i^*)\,dx = F(y_i^*)\,\Delta y \]
where
\[ F(y) = \int_{L(y)}^{R(y)} f(x, y)\,dx \]
Observe that, as x has been integrated out, F (y) is a function of y only, not of x and y.
Finally taking the limit as n → ∞ (i.e. taking the limit as the slice width goes to zero), we get
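\[ \text{Mass} = \lim_{n\to\infty} \sum_{i=1}^{n} \Delta y \int_{L(y_i^*)}^{R(y_i^*)} f(x, y_i^*)\,dx = \lim_{n\to\infty} \sum_{i=1}^{n} F(y_i^*)\,\Delta y \]
Now \(\sum_{i=1}^{n} F(y_i^*)\,\Delta y\) is a Riemann sum approximation to the integral \(\int_c^d F(y)\,dy\). So
\[ \text{Mass} = \int_c^d F(y)\,dy = \int_c^d \Big[\int_{L(y)}^{R(y)} f(x, y)\,dx\Big]dy \]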
Definition 3.1.2
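The double integral of the function f(x, y) over the region R is denoted
\[ \iint_R f(x, y)\,dx\,dy = \int_c^d \Big[\int_{L(y)}^{R(y)} f(x, y)\,dx\Big]dy = \int_c^d \int_{L(y)}^{R(y)} f(x, y)\,dx\,dy = \int_c^d dy \int_{L(y)}^{R(y)} dx\, f(x, y) \]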
Note that
to evaluate the integral \(\int_c^d \int_{L(y)}^{R(y)} f(x, y)\,dx\,dy\),
first evaluate the inside integral \(\int_{L(y)}^{R(y)} f(x, y)\,dx\) using the inside limits of integration. The result of the inside integral is a function of y only. Call it F(y).
Then evaluate the outside integral \(\int_c^d F(y)\,dy\), whose integrand is the answer to the inside integral.
to evaluate the integral \(\int_c^d dy \int_{L(y)}^{R(y)} dx\, f(x, y)\),
first evaluate the inside integral \(\int_{L(y)}^{R(y)} dx\, f(x, y)\) using the limits of integration that are directly beside the dx. Again, the dx is written directly beside \(\int_{L(y)}^{R(y)}\) to make it clear that the limits of integration L(y) and R(y) are for the x-integral. In the past you probably wrote this integral as \(\int_{L(y)}^{R(y)} f(x, y)\,dx\). The result of the inside integral is again a function of y only. Call it F(y).
Then evaluate the outside integral \(\int_c^d dy\, F(y)\), whose integrand is the answer to the inside integral and whose limits of
Theorem 3.1.3
Let R be a region in the xy-plane and let the function f (x, y) be defined and continuous on R.
1. If
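\[ R = \{\, (x, y) \mid a \le x \le b,\ B(x) \le y \le T(x) \,\} \]
with B(x) and T(x) being continuous, and if the mass density in R is f(x, y), then the mass of R is
\[ \int_a^b \Big[\int_{B(x)}^{T(x)} f(x, y)\,dy\Big]dx = \int_a^b \int_{B(x)}^{T(x)} f(x, y)\,dy\,dx = \int_a^b dx \int_{B(x)}^{T(x)} dy\, f(x, y) \]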
2. If
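\[ R = \{\, (x, y) \mid c \le y \le d,\ L(y) \le x \le R(y) \,\} \]
with L(y) and R(y) being continuous, and if the mass density in R is f(x, y), then the mass of R is
\[ \int_c^d \Big[\int_{L(y)}^{R(y)} f(x, y)\,dx\Big]dy = \int_c^d \int_{L(y)}^{R(y)} f(x, y)\,dx\,dy = \int_c^d dy \int_{L(y)}^{R(y)} dx\, f(x, y) \]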
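3. If R can be described in both of these ways, that is, if
\[ \{\, (x, y) \mid a \le x \le b,\ B(x) \le y \le T(x) \,\} = \{\, (x, y) \mid c \le y \le d,\ L(y) \le x \le R(y) \,\} \]
then
\[ \int_a^b \int_{B(x)}^{T(x)} f(x, y)\,dy\,dx = \int_c^d \int_{L(y)}^{R(y)} f(x, y)\,dx\,dy \]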
This is called Fubini's theorem². It will be discussed more in the optional §3.1.5.
Definition 3.1.4
Example 3.1.5
Let R be the triangular region above the x-axis, to the right of the y -axis and to the left of the line x + y = 1. Find the mass of
R if it has density f (x, y) = y.
Solution
We'll do this problem twice — once using vertical strips and once using horizontal strips. First, here is a sketch of R.
Solution using vertical strips. We'll now set up a double integral for the mass using vertical strips. Note, from the figure
that
the leftmost points in R have x = 0 and the rightmost point in R has x = 1 and
for each fixed x between 0 and 1, the point (x, y) in R with the smallest y has y = 0 and the point (x, y) in R with the
largest y has y = 1 − x.
Thus
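\[ \text{Mass} = \int_a^b dx \int_{B(x)}^{T(x)} dy\, f(x, y) = \int_0^1 dx \int_0^{1-x} dy\, y \]
The inside integral is
\[ \int_0^{1-x} y\,dy = \Big[\frac{y^2}{2}\Big]_0^{1-x} = \frac{(1-x)^2}{2} \]
so that the
\[ \text{Mass} = \int_0^1 \frac{(1-x)^2}{2}\,dx = \Big[-\frac{(1-x)^3}{6}\Big]_0^1 = \frac{1}{6} \]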
Solution using horizontal strips. This time we'll set up a double integral for the mass using horizontal strips. Note, from the
figure
that
the lowest points in R have y = 0 and the topmost point in R has y = 1 and
for each fixed y between 0 and 1, the point (x, y) in R with the smallest x has x = 0 and the point (x, y) in R with the
largest x has x = 1 − y.
Thus
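\[ \text{Mass} = \int_c^d dy \int_{L(y)}^{R(y)} dx\, f(x, y) = \int_0^1 dy \int_0^{1-y} dx\, y \]
The inside integral is
\[ \int_0^{1-y} y\,dx = \big[xy\big]_0^{1-y} = y - y^2 \]
so that the
\[ \text{Mass} = \int_0^1 (y - y^2)\,dy = \Big[\frac{y^2}{2} - \frac{y^3}{3}\Big]_0^1 = \frac{1}{6} \]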
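Both slicing orders for this triangle can be checked numerically. The sketch below is only an illustration under our own assumptions: the helper `iterated` (a midpoint Riemann sum with the outer variable passed first) and the grid sizes are not from the text.

```python
def iterated(f, lo_outer, hi_outer, lo_inner, hi_inner, n=400, m=50):
    """Midpoint Riemann sum for an iterated integral of f(outer, inner);
    lo_inner and hi_inner are functions of the outer variable."""
    d_out = (hi_outer - lo_outer) / n
    total = 0.0
    for i in range(n):
        u = lo_outer + (i + 0.5) * d_out
        lo, hi = lo_inner(u), hi_inner(u)
        d_in = (hi - lo) / m
        total += sum(f(u, lo + (j + 0.5) * d_in) for j in range(m)) * d_in * d_out
    return total

# Vertical strips: x is the outer variable, y runs from 0 to 1 - x, density is y.
vertical = iterated(lambda x, y: y, 0.0, 1.0, lambda x: 0.0, lambda x: 1.0 - x)
# Horizontal strips: y is the outer variable, x runs from 0 to 1 - y, density is y.
horizontal = iterated(lambda y, x: y, 0.0, 1.0, lambda y: 0.0, lambda y: 1.0 - y)
```

Both values agree with the exact mass 1/6 up to the discretization error of the sums.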
Double integrals share the usual basic properties that we are used to from integrals of functions of one variable. See, for example,
Theorem 1.2.1 and Theorem 1.2.12 in the CLP-2 text. Indeed the following theorems follow from them.
then
In the very special (but not that uncommon) case that R is the rectangle
\[ R = \{\, (x, y) \mid a \le x \le b,\ c \le y \le d \,\} \]
and the integrand is the product \(f(x, y) = g(x)h(y)\),
\[\begin{aligned}
\iint_R f(x, y)\,dx\,dy &= \int_a^b dx \int_c^d dy\, g(x)h(y) \\
&= \int_a^b dx\, g(x) \int_c^d dy\, h(y) \\
&= \Big[\int_a^b dx\, g(x)\Big]\Big[\int_c^d dy\, h(y)\Big]
\end{aligned}\]
since \(\int_c^d dy\, h(y)\) is a constant as far as the x-integral is concerned.
This is worth stating as a theorem
Theorem 3.1.7
R = {(x, y)|a ≤ x ≤ b, c ≤ y ≤ d}
Just as was the case for single variable integrals, sometimes we don't actually need to know the value of a double integral exactly.
We are instead interested in bounds on its value. The following theorem provides some simple tools for generating such bounds.
They are the multivariable analogs of the single variable tools in Theorem 1.2.12 of the CLP-2 text.
\[ \iint_R f(x, y)\,dx\,dy \ge 0 \]
2. If there are constants m and M such that m ≤ f (x, y) ≤ M for all (x, y) in R, then
4. We have
\[ \Big|\iint_R f(x, y)\,dx\,dy\Big| \le \iint_R |f(x, y)|\,dx\,dy \]
Volumes
Now that we have defined double integrals, we should start putting them to use. One of the most immediate applications arises
from interpreting f (x, y), not as a density, but rather as the height of the part of a solid above the point (x, y) in the xy-plane. Then
Theorem 3.1.3 gives the volume between the xy-plane and the surface z = f (x, y).
We'll now see how this goes in the case of part (b) of Theorem 3.1.3. The case of part (a) works in the same way. So we assume
that the solid V lies above the base region
\[ R = \{\, (x, y) \mid c \le y \le d,\ L(y) \le x \le R(y) \,\} \]
and that
\[ V = \{\, (x, y, z) \mid (x, y) \in R,\ 0 \le z \le f(x, y) \,\} \]
The base region R (which is also the top view of V ) is sketched in the figure on the left below and the part of V in the first octant is
sketched in the figure on the right below.
Subdivide slice number i into m tiny rectangles, each of height \(\Delta y\) and of width \(\Delta x = \frac{1}{m}\cdots\).
Compute, approximately, the volume of the part of V that is above each rectangle.
Take the limit m → ∞ and then the limit n → ∞.
We have just been through this type of argument twice. So we'll abbreviate the argument and just say
slice the base region R into long “infinitesimally” thin strips of width dy.
Subdivide each strip into “infinitesimal” rectangles each of height dy and of width dx. See the figure on the left above.
The volume of the part of V that is above the rectangle centred on (x, y) is essentially f (x, y) dx dy. See the figure on the right
above.
So the volume of the part of V that is above the strip centred on y is essentially³ \(dy \int_{L(y)}^{R(y)} dx\, f(x, y)\) and
we arrive at the following conclusion.
Equation 3.1.9
If
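\[ V = \{\, (x, y, z) \mid (x, y) \in R,\ 0 \le z \le f(x, y) \,\} \]
where
\[ R = \{\, (x, y) \mid c \le y \le d,\ L(y) \le x \le R(y) \,\} \]
then
\[ \text{Volume}(V) = \int_c^d dy \int_{L(y)}^{R(y)} dx\, f(x, y) \]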
Similarly
Equation 3.1.10
If
\[ V = \{\, (x, y, z) \mid (x, y) \in R,\ 0 \le z \le f(x, y) \,\} \]
where
\[ R = \{\, (x, y) \mid a \le x \le b,\ B(x) \le y \le T(x) \,\} \]
then
\[ \text{Volume}(V) = \int_a^b dx \int_{B(x)}^{T(x)} dy\, f(x, y) \]
Examples
Oof — we have had lots of equations and theory. It's time to put all of this to work. Let's start with a mass example and then move
on to a volume example. You will notice that the mathematics is really very similar. Just the interpretation changes.
Let ν > 0 be a constant and let R be the region above the curve \(x^2 = 4\nu y\) and to the right of the curve \(y^2 = \frac{1}{2}\nu x\). Find the mass of R if it has density f(x, y) = xy.
Solution
For practice, we'll do this problem twice — once using vertical strips and once using horizontal strips. We'll start by sketching
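R. First note that, since \(y \ge \frac{x^2}{4\nu}\) and \(x \ge \frac{2y^2}{\nu}\), both x and y are positive throughout R. The two curves intersect at points (x, y) that satisfy both
\[ x = \frac{2y^2}{\nu} \ \text{ and } \ y = \frac{x^2}{4\nu} \implies x = \frac{2y^2}{\nu} = \frac{2}{\nu}\Big(\frac{x^2}{4\nu}\Big)^2 = \frac{x^4}{8\nu^3} \iff \Big(\frac{x^3}{8\nu^3} - 1\Big)x = 0 \]
This equation has only two real solutions, x = 0 and x = 2ν, so the upward opening parabola \(y = \frac{x^2}{4\nu}\) and the rightward opening parabola \(x = \frac{2y^2}{\nu}\) intersect at (0, 0) and (2ν, ν).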
Solution using vertical strips. We'll now set up a double integral for the mass using vertical strips and using the abbreviated
argument of the end of the last section (on volumes). Note, from the figure above, that
\[ R = \Big\{\, (x, y) \ \Big|\ 0 = a \le x \le b = 2\nu,\ \frac{x^2}{4\nu} = B(x) \le y \le T(x) = \sqrt{\frac{\nu x}{2}} \,\Big\} \]
So the mass of the strip centred on x is essentially \(dx \int_{B(x)}^{T(x)} dy\, f(x, y)\) (the integral over y adds up the masses of all of the different rectangles on the single vertical strip in question) and
we conclude that the
\[ \text{Mass}(R) = \int_a^b dx \int_{B(x)}^{T(x)} dy\, f(x, y) = \int_0^{2\nu} dx \int_{x^2/(4\nu)}^{\sqrt{\nu x/2}} dy\, xy \]
Here the integral over x adds up the masses of all of the different strips.
Recall that, when integrating y, x is held constant, so we may factor the constant x out of the inner y integral.
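\[\begin{aligned}
\int_{x^2/(4\nu)}^{\sqrt{\nu x/2}} dy\, xy &= x \int_{x^2/(4\nu)}^{\sqrt{\nu x/2}} dy\, y \\
&= x \Big[\frac{y^2}{2}\Big]_{x^2/(4\nu)}^{\sqrt{\nu x/2}} \\
&= \frac{\nu x^2}{4} - \frac{x^5}{32\nu^2}
\end{aligned}\]
and the
\[\begin{aligned}
\text{Mass}(R) &= \int_0^{2\nu} dx \Big[\frac{\nu x^2}{4} - \frac{x^5}{32\nu^2}\Big] \\
&= \frac{\nu(2\nu)^3}{3 \times 4} - \frac{(2\nu)^6}{6 \times 32\nu^2} = \frac{\nu^4}{3}
\end{aligned}\]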
Solution using horizontal strips. We'll now set up a double integral for the mass using horizontal strips, again using the
abbreviated argument of the end of the last section (on volumes). Note, from the figure at the beginning of this example, that
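\[ R = \Big\{\, (x, y) \ \Big|\ 0 = c \le y \le d = \nu,\ \frac{2y^2}{\nu} = L(y) \le x \le R(y) = \sqrt{4\nu y} \,\Big\} \]
so that
\[ \text{Mass}(R) = \int_c^d dy \int_{L(y)}^{R(y)} dx\, f(x, y) = \int_0^{\nu} dy \int_{2y^2/\nu}^{\sqrt{4\nu y}} dx\, xy \]
Here the integral over y adds up the masses of all of the different strips. Recalling that, when integrating x, y is held constant,
\[\begin{aligned}
\text{Mass}(R) &= \int_0^{\nu} dy\, y \Big[\int_{2y^2/\nu}^{\sqrt{4\nu y}} dx\, x\Big] \\
&= \int_0^{\nu} dy\, y \Big[\frac{x^2}{2}\Big]_{2y^2/\nu}^{\sqrt{4\nu y}} \\
&= \int_0^{\nu} dy \Big[2\nu y^2 - \frac{2y^5}{\nu^2}\Big] \\
&= \frac{2\nu(\nu)^3}{3} - \frac{2\nu^6}{6\nu^2} = \frac{\nu^4}{3}
\end{aligned}\]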
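The closed form \(\nu^4/3\) can be spot-checked for a particular ν. This is a rough numerical sketch under our own assumptions: the value ν = 1.5, the helper `iterated`, and the grid sizes are arbitrary illustrative choices.

```python
import math

def iterated(f, lo_outer, hi_outer, lo_inner, hi_inner, n=400, m=50):
    """Midpoint Riemann sum for an iterated integral of f(outer, inner)."""
    d_out = (hi_outer - lo_outer) / n
    total = 0.0
    for i in range(n):
        u = lo_outer + (i + 0.5) * d_out
        lo, hi = lo_inner(u), hi_inner(u)
        d_in = (hi - lo) / m
        total += sum(f(u, lo + (j + 0.5) * d_in) for j in range(m)) * d_in * d_out
    return total

nu = 1.5  # sample value of the constant; any nu > 0 would do

# Vertical strips: 0 <= x <= 2*nu, x^2/(4*nu) <= y <= sqrt(nu*x/2)
vertical = iterated(lambda x, y: x * y, 0.0, 2 * nu,
                    lambda x: x * x / (4 * nu), lambda x: math.sqrt(nu * x / 2))
# Horizontal strips: 0 <= y <= nu, 2*y^2/nu <= x <= sqrt(4*nu*y)
horizontal = iterated(lambda y, x: x * y, 0.0, nu,
                      lambda y: 2 * y * y / nu, lambda y: math.sqrt(4 * nu * y))
expected = nu ** 4 / 3
```

The two orders of integration give the same number, as both hand computations predict.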
Let R be the part of the xy-plane above the x-axis and below the parabola \(y = 1 - x^2\). Find the volume between R and the surface \(z = x^2\sqrt{1-y}\).
Solution
Yet again, for practice, we'll do this problem twice — once using vertical strips and once using horizontal strips. First, here is a
sketch of R.
Solution using vertical strips. We'll now set up a double integral for the volume using vertical strips. Note, from the figure
that
the leftmost point in R has x = −1 and the rightmost point in R has x = 1 and
for each fixed x between −1 and 1, the point (x, y) in R with the smallest y has y = 0 and the point (x, y) in R with the
largest y has \(y = 1 - x^2\).
Thus
\[ R = \{\, (x, y) \mid -1 = a \le x \le b = 1,\ 0 = B(x) \le y \le T(x) = 1 - x^2 \,\} \]
and, by 3.1.10
\[\begin{aligned}
\text{Volume} &= \int_a^b dx \int_{B(x)}^{T(x)} dy\, f(x, y) = \int_{-1}^{1} dx \int_0^{1-x^2} dy\, x^2\sqrt{1-y} \\
&= 2\int_0^1 dx \int_0^{1-x^2} dy\, x^2\sqrt{1-y}
\end{aligned}\]
since the inside integral \(F(x) = \int_0^{1-x^2} dy\, x^2\sqrt{1-y}\) is an even function of x. Now, for x ≥ 0, the inside integral is
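\[\begin{aligned}
\int_0^{1-x^2} x^2\sqrt{1-y}\,dy &= x^2 \int_0^{1-x^2} \sqrt{1-y}\,dy = x^2 \Big[-\frac{2}{3}(1-y)^{3/2}\Big]_0^{1-x^2} \\
&= \frac{2}{3}x^2(1 - x^3)
\end{aligned}\]
so that the
\[ \text{Volume} = 2\int_0^1 dx\, \frac{2}{3}x^2(1 - x^3) = \frac{4}{3}\Big[\frac{x^3}{3} - \frac{x^6}{6}\Big]_0^1 = \frac{2}{9} \]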
Solution using horizontal strips. This time we'll set up a double integral for the volume using horizontal strips. Note, from
the figure
that
the lowest points in R have y = 0 and the topmost point in R has y = 1 and
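for each fixed y between 0 and 1, the point (x, y) in R with the leftmost x has \(x = -\sqrt{1-y}\) and the point (x, y) in R with the rightmost x has \(x = \sqrt{1-y}\).
Thus
\[ R = \{\, (x, y) \mid 0 = c \le y \le d = 1,\ -\sqrt{1-y} = L(y) \le x \le R(y) = \sqrt{1-y} \,\} \]
and, by 3.1.9
\[ \text{Volume} = \int_c^d dy \int_{L(y)}^{R(y)} dx\, f(x, y) = \int_0^1 dy \int_{-\sqrt{1-y}}^{\sqrt{1-y}} dx\, x^2\sqrt{1-y} \]
Now the inside integral is
\[ \int_{-\sqrt{1-y}}^{\sqrt{1-y}} x^2\sqrt{1-y}\,dx = \sqrt{1-y}\,\Big[\frac{x^3}{3}\Big]_{-\sqrt{1-y}}^{\sqrt{1-y}} = \frac{2}{3}(1-y)^2 \]
So the
\[ \text{Volume} = \int_0^1 dy\, \frac{2}{3}(1-y)^2 = \frac{2}{3}\Big[-\frac{(1-y)^3}{3}\Big]_0^1 = \frac{2}{9} \]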
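A numerical cross-check of the value 2/9 is sketched below. It is an illustration under our own assumptions: the helper `iterated` is a plain midpoint Riemann sum, and the grid sizes were picked only to keep the run time short.

```python
import math

def iterated(f, lo_outer, hi_outer, lo_inner, hi_inner, n=400, m=200):
    """Midpoint Riemann sum for an iterated integral of f(outer, inner)."""
    d_out = (hi_outer - lo_outer) / n
    total = 0.0
    for i in range(n):
        u = lo_outer + (i + 0.5) * d_out
        lo, hi = lo_inner(u), hi_inner(u)
        d_in = (hi - lo) / m
        total += sum(f(u, lo + (j + 0.5) * d_in) for j in range(m)) * d_in * d_out
    return total

# Vertical strips: -1 <= x <= 1, 0 <= y <= 1 - x^2, height x^2 * sqrt(1 - y)
vertical = iterated(lambda x, y: x * x * math.sqrt(1 - y), -1.0, 1.0,
                    lambda x: 0.0, lambda x: 1.0 - x * x)
# Horizontal strips: 0 <= y <= 1, -sqrt(1 - y) <= x <= sqrt(1 - y)
horizontal = iterated(lambda y, x: x * x * math.sqrt(1 - y), 0.0, 1.0,
                      lambda y: -math.sqrt(1 - y), lambda y: math.sqrt(1 - y))
```

Midpoint sampling keeps every sample strictly inside the region, so the square roots are always taken of positive numbers.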
Solution
Our first job is figure out what the specified solid looks like. Note that
The variable z does not appear in the equation \(x^2 + y^2 = a^2\). So, for every value of the constant \(z_0\), the part of the cylinder \(x^2 + y^2 = a^2\) in the plane \(z = z_0\) is the circle \(x^2 + y^2 = a^2\), \(z = z_0\). So the cylinder \(x^2 + y^2 = a^2\) consists of many circles stacked vertically, one on top of the other. The part of the cylinder \(x^2 + y^2 = a^2\) that lies above the xy-plane is
The variable y does not appear in the equation \(x^2 + z^2 = a^2\). So, for every value of the constant \(y_0\), the part of the cylinder \(x^2 + z^2 = a^2\) in the plane \(y = y_0\) is the circle \(x^2 + z^2 = a^2\), \(y = y_0\). So the cylinder \(x^2 + z^2 = a^2\) consists of many circles stacked horizontally, one beside the other. The part of the cylinder \(x^2 + z^2 = a^2\) that lies to the right of the xz-plane
and hence our solid, is symmetric about the yz-plane. In particular the volume of the part of the solid in the octant x ≤ 0,
y ≥ 0, z ≥ 0 is the same as the volume in the first octant x ≥ 0, y ≥ 0, z ≥ 0. Similarly, the equations do not change at all
if y is replaced by −y or if z is replaced by −z. Our solid is also symmetric about both the xz-plane and the xy-plane.
Hence the volume of the part of our solid in each of the eight octants is the same.
So we will compute the volume of the part of the solid in the first octant, i.e. with x ≥ 0, y ≥ 0, z ≥ 0. The total volume of
the solid is eight times that.
The part of the solid in the first octant is sketched in the figure on the left below. A point (x, y, z) lies in the first cylinder if and only if \(x^2 + y^2 \le a^2\). It lies in the second cylinder if and only if \(x^2 + z^2 \le a^2\). So the part of the solid in the first octant is
\[ V_1 = \{\, (x, y, z) \mid x \ge 0,\ y \ge 0,\ z \ge 0,\ x^2 + y^2 \le a^2,\ x^2 + z^2 \le a^2 \,\} \]
Notice that, in \(V_1\), \(z^2 \le a^2 - x^2\) so that \(z \le \sqrt{a^2 - x^2}\) and
\[ V_1 = \{\, (x, y, z) \mid x \ge 0,\ y \ge 0,\ x^2 + y^2 \le a^2,\ 0 \le z \le \sqrt{a^2 - x^2} \,\} \]
The top view of the part of the solid in the first octant is sketched in the figure on the right above. In that top view, x runs from 0 to a. For each fixed x, y runs from 0 to \(\sqrt{a^2 - x^2}\). So we may rewrite
\[ V_1 = \{\, (x, y, z) \mid (x, y) \in R,\ 0 \le z \le f(x, y) \,\} \]
where
\[ R = \{\, (x, y) \mid 0 \le x \le a,\ 0 \le y \le \sqrt{a^2 - x^2} \,\} \quad\text{and}\quad f(x, y) = \sqrt{a^2 - x^2} \]
and “(x, y) ∈ R” is read “(x, y) is an element of R”. Note that f(x, y) is actually independent of y. This will make things a bit easier below.
We can now compute the volume of \(V_1\) using our usual abbreviated protocol.
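The volume of the part of \(V_1\) above the rectangle centred on (x, y) is essentially
\[ f(x, y)\,dx\,dy = \sqrt{a^2 - x^2}\,dx\,dy \]
So the volume of the part of \(V_1\) above the strip centred on x is essentially
\[ dx \int_0^{\sqrt{a^2 - x^2}} \sqrt{a^2 - x^2}\,dy \]
(the integral over y adds up the volumes over all of the different rectangles on the single vertical strip in question) and
we conclude that the
\[ \text{Volume}(V_1) = \int_0^a dx \int_0^{\sqrt{a^2 - x^2}} dy\, \sqrt{a^2 - x^2} \]
Here the integral over x adds up the volumes over all of the different strips. Recalling that, when integrating y, x is held constant,
\[\begin{aligned}
\text{Volume}(V_1) &= \int_0^a dx\, \sqrt{a^2 - x^2}\,\Big[\int_0^{\sqrt{a^2 - x^2}} dy\Big] \\
&= \int_0^a dx\, (a^2 - x^2) \\
&= \Big[a^2 x - \frac{x^3}{3}\Big]_0^a \\
&= \frac{2a^3}{3}
\end{aligned}\]
so the total volume of the solid is \(8 \times \frac{2a^3}{3} = \frac{16a^3}{3}\).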
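The first-octant volume \(2a^3/3\) can be checked numerically for a sample radius. This is a hedged sketch: the radius a = 2, the helper `iterated`, and the grid sizes are our own illustrative assumptions.

```python
import math

def iterated(f, lo_outer, hi_outer, lo_inner, hi_inner, n=400, m=50):
    """Midpoint Riemann sum for an iterated integral of f(outer, inner)."""
    d_out = (hi_outer - lo_outer) / n
    total = 0.0
    for i in range(n):
        u = lo_outer + (i + 0.5) * d_out
        lo, hi = lo_inner(u), hi_inner(u)
        d_in = (hi - lo) / m
        total += sum(f(u, lo + (j + 0.5) * d_in) for j in range(m)) * d_in * d_out
    return total

a = 2.0  # sample cylinder radius

# Height sqrt(a^2 - x^2) over the quarter disc 0 <= x <= a, 0 <= y <= sqrt(a^2 - x^2)
first_octant = iterated(lambda x, y: math.sqrt(a * a - x * x), 0.0, a,
                        lambda x: 0.0, lambda x: math.sqrt(a * a - x * x))
total_volume = 8 * first_octant  # by the symmetry argument in the text
```

Because the height does not depend on y, the inner sum is just the height times the strip length, mirroring the simplification used in the hand computation.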
Evaluate \(\int_0^2 \int_0^a \sqrt{a^2 - x^2}\,dx\,dy\).
Solution
This integral represents the volume of a simple geometric figure and so can be evaluated without using any calculus at all. The
domain of integration is
R = {(x, y)|0 ≤ y ≤ 2, 0 ≤ x ≤ a}
and the integrand is \(f(x, y) = \sqrt{a^2 - x^2}\), so the integral represents the volume between the xy-plane and the surface \(z = \sqrt{a^2 - x^2}\), with (x, y) running over R. We can rewrite the equation of the surface as \(x^2 + z^2 = a^2\), which, as in Example 3.1.13, we recognize as the equation of a cylinder of radius a centred on the y-axis. We want the volume of the part of this cylinder that lies above R. It is sketched in the figure below.
The constant y cross-sections of this volume are quarter circles of radius a and hence of area \(\frac{1}{4}\pi a^2\). The inside integral, \(\int_0^a \sqrt{a^2 - x^2}\,dx\), is exactly this area. So, as y runs from 0 to 2,
\[ \int_0^2 \int_0^a \sqrt{a^2 - x^2}\,dx\,dy = \frac{1}{4}\pi a^2 \times 2 = \frac{\pi a^2}{2} \]
the method of trigonometric substitution, that was covered in §1.9 of the CLP-2 text. In this case, the appropriate substitution is
\[ x = a\sin\theta \qquad dx = a\cos\theta\,d\theta \]
The lower limit of integration x = 0, i.e. a sin θ = 0, corresponds to θ = 0, and the upper limit x = a, i.e. a sin θ = a, corresponds to \(\theta = \frac{\pi}{2}\), so that
\[ \int_0^a \sqrt{a^2 - x^2}\,dx = \int_0^{\pi/2} \sqrt{a^2 - a^2\sin^2\theta}\ a\cos\theta\,d\theta = a^2 \int_0^{\pi/2} \cos^2\theta\,d\theta \]
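The orthodox procedure for evaluating the resulting trigonometric integral \(\int_0^{\pi/2} \cos^2\theta\,d\theta\), covered in §1.8 of the CLP-2 text, uses the trigonometric double angle formula
\[ \cos(2\theta) = 2\cos^2\theta - 1 \quad\text{to write}\quad \cos^2\theta = \frac{1 + \cos(2\theta)}{2} \]
and then
\[\begin{aligned}
\int_0^a \sqrt{a^2 - x^2}\,dx &= a^2 \int_0^{\pi/2} \cos^2\theta\,d\theta = \frac{a^2}{2} \int_0^{\pi/2} \big[1 + \cos(2\theta)\big]\,d\theta \\
&= \frac{a^2}{2} \Big[\theta + \frac{\sin(2\theta)}{2}\Big]_0^{\pi/2} \\
&= \frac{\pi a^2}{4}
\end{aligned}\]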
However we remark that there is also an efficient, sneaky, way to evaluate definite integrals like \(\int_0^{\pi/2} \cos^2\theta\,d\theta\). Looking at the figures
we see that
\[ \int_0^{\pi/2} \cos^2\theta\,d\theta = \int_0^{\pi/2} \sin^2\theta\,d\theta \]
Thus
\[\begin{aligned}
\int_0^{\pi/2} \cos^2\theta\,d\theta = \int_0^{\pi/2} \sin^2\theta\,d\theta &= \int_0^{\pi/2} \frac{1}{2}\big[\sin^2\theta + \cos^2\theta\big]\,d\theta \\
&= \frac{1}{2}\int_0^{\pi/2} d\theta = \frac{\pi}{4}
\end{aligned}\]
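The symmetry argument is easy to test numerically. The following sketch uses a simple midpoint rule; the helper name `midpoint` and the number of subintervals are our own assumptions.

```python
import math

def midpoint(g, a, b, n=1000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (k + 0.5) * h) for k in range(n)) * h

cos_sq = midpoint(lambda t: math.cos(t) ** 2, 0.0, math.pi / 2)
sin_sq = midpoint(lambda t: math.sin(t) ** 2, 0.0, math.pi / 2)
```

The two integrals agree with each other, and since they add up to the integral of 1 over [0, π/2], each is close to π/4.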
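The integral \(\int_{-1}^{2} \int_{x^2}^{x+2} dy\,dx\) represents the area of a region in the xy-plane. Express the same area as a double integral with the order of integration reversed.
Solution
The given integral is
\[ \int_{-1}^{2} \int_{x^2}^{x+2} dy\,dx = \int_{-1}^{2} \Big[\int_{x^2}^{x+2} dy\Big]dx \]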
The given iterated integral corresponds to the (vertical) slicing in the figure on the left below.
To reverse the order of integration we have to switch to horizontal slices as in the figure on the right above.
There we see a new wrinkle: the formula giving the value of x at the left hand end of a slice depends on whether the y
coordinate of the slice is bigger than, or smaller than y = 1. Looking at the figure on the right, we see that, on the domain of
integration,
y runs from 0 to 4 and
for each fixed 0 ≤ y ≤ 1, x runs from x = −√y to x = +√y.
for each fixed 1 ≤ y ≤ 4, x runs from x = y − 2 to x = +√y.
So
\[\int_{-1}^{2} dx\int_{x^2}^{x+2} dy = \int_0^1 dy\int_{-\sqrt{y}}^{\sqrt{y}} dx + \int_1^4 dy\int_{y-2}^{\sqrt{y}} dx\]
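As a check on the reversed limits, both orders of integration can be evaluated numerically. This is a sketch under the assumption that a midpoint rule is accurate enough here; the inner integrals of the constant 1 have been done by hand, so each order reduces to one-dimensional integrals of strip lengths. The names `vertical` and `horizontal` are ours.

```python
import math

def midpoint_integral(f, a, b, n=20_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

# Vertical slices: for each x in [-1, 2] the strip has length (x + 2) - x^2.
vertical = midpoint_integral(lambda x: (x + 2) - x ** 2, -1.0, 2.0)

# Horizontal slices: the left end of the strip changes formula at y = 1.
horizontal = (
    midpoint_integral(lambda y: math.sqrt(y) - (-math.sqrt(y)), 0.0, 1.0)
    + midpoint_integral(lambda y: math.sqrt(y) - (y - 2), 1.0, 4.0)
)

print(vertical, horizontal)
```

Both values should be close to 9/2, the area of the region between the parabola and the line.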
There was a moral to the last example. Just because both orders of integration have to give the same answer doesn't mean that they
are equally easy to evaluate. Here is an extreme example illustrating that moral.
Example 3.1.17
Evaluate the integral of \(\frac{\sin x}{x}\) over the region in the xy-plane that is above the x-axis, to the right of the line y = x and to the left of the line x = 1.
Solution
Here is a sketch of the specified domain.
We'll try to evaluate the specified integral twice — once using horizontal strips (the impossibly hard way) and once using
vertical strips (the easy way).
Solution using horizontal strips. To set up the integral using horizontal strips, as in the figure on the left below, we observe
that, on the domain of integration,
y runs from 0 to 1 and
for each fixed y, x runs from x = y to 1.
So the iterated integral is
\[\int_0^1 dy\int_y^1 dx\ \frac{\sin x}{x}\]
The integrand \(\frac{\sin x}{x}\) does not have an antiderivative that can be expressed in terms of elementary functions 5. It is impossible to evaluate \(\int_y^1 \frac{\sin x}{x}\,dx\) without resorting to, for example, numerical methods or infinite series 6.
Solution using vertical strips. To set up the integral using vertical strips, as in the figure on the right above, we observe that,
on the domain of integration,
x runs from 0 to 1 and
for each fixed x, y runs from 0 to y = x.
So the iterated integral is
\[\int_0^1 dx\int_0^x dy\ \frac{\sin x}{x}\]
This time, because x is treated as a constant in the inner integral, it is trivial to evaluate the iterated integral.
\[\int_0^1 dx\int_0^x dy\ \frac{\sin x}{x} = \int_0^1 dx\ \frac{\sin x}{x}\int_0^x dy = \int_0^1 dx\ \sin x = 1-\cos 1\]
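Although the horizontal-strip order cannot be evaluated in closed form, it can still be approximated numerically, which gives a way to confirm that both orders agree. The sketch below assumes a midpoint rule for both the inner and outer integrals; the names `horizontal`, `vertical` and `integrand` are ours.

```python
import math

def midpoint_integral(f, a, b, n=400):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def integrand(x):
    # Only ever evaluated at midpoints, which are strictly positive here.
    return math.sin(x) / x

# Horizontal strips: y runs from 0 to 1, then x runs from y to 1.
horizontal = midpoint_integral(
    lambda y: midpoint_integral(integrand, y, 1.0), 0.0, 1.0
)

# Vertical strips: x runs from 0 to 1, then y runs from 0 to x.
vertical = midpoint_integral(
    lambda x: midpoint_integral(lambda y: integrand(x), 0.0, x), 0.0, 1.0
)

exact = 1.0 - math.cos(1.0)
print(horizontal, vertical, exact)
```

The two approximations should both be close to 1 − cos 1, even though only the vertical-strip order leads to that closed form by hand.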
Here is an example which is included as an excuse to review some integration techniques from CLP-2.
Example 3.1.18
part of the surface \(z = 1-3x^2-2y^2\) that lies in the horizontal plane \(z = z_0\) is the ellipse \(3x^2+2y^2 = 1-z_0\). The biggest of these ellipses is that in the xy-plane, where \(z_0 = 0\). It is the ellipse \(3x^2+2y^2 = 1\). As \(z_0\) increases the ellipse shrinks, degenerating to a single point, namely (0, 0, 1), when \(z_0 = 1\). So the surface consists of a stack of ellipses and our solid is
\[V = \big\{(x,y,z)\ \big|\ 3x^2+2y^2\le 1,\ 0\le z\le 1-3x^2-2y^2\big\}\]
the coefficients 2 and 3 are interchanged), using vertical slices is likely to lead to exactly the same level of difficulty as using
horizontal slices. So we'll just pick one — say vertical slices.
The fattest part of R is on the y-axis. The intersection points of the ellipse with the y-axis have x = 0 and y obeying \(3(0)^2+2y^2 = 1\) or \(y = \pm\frac{1}{\sqrt{2}}\). So in R, \(-\frac{1}{\sqrt{2}}\le y\le\frac{1}{\sqrt{2}}\) and, for each such y, \(3x^2\le 1-2y^2\) or \(-\sqrt{\frac{1-2y^2}{3}}\le x\le\sqrt{\frac{1-2y^2}{3}}\). So using vertical strips as in the figure above
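\[\begin{aligned}
\text{Volume}(V) &= \iint_R (1-3x^2-2y^2)\,dx\,dy \\
&= \int_{-\frac{1}{\sqrt{2}}}^{\frac{1}{\sqrt{2}}} dy \int_{-\sqrt{\frac{1-2y^2}{3}}}^{\sqrt{\frac{1-2y^2}{3}}} dx\ (1-3x^2-2y^2) \\
&= 4\int_0^{\frac{1}{\sqrt{2}}} dy \int_0^{\sqrt{\frac{1-2y^2}{3}}} dx\ (1-3x^2-2y^2) \\
&= 4\int_0^{\frac{1}{\sqrt{2}}} dy\ \Big[(1-2y^2)x - x^3\Big]_0^{\sqrt{\frac{1-2y^2}{3}}} \\
&= 4\int_0^{\frac{1}{\sqrt{2}}} dy\ \sqrt{\frac{1-2y^2}{3}}\left[(1-2y^2) - \frac{1-2y^2}{3}\right] \\
&= 8\int_0^{\frac{1}{\sqrt{2}}} dy\ \left[\frac{1-2y^2}{3}\right]^{3/2}
\end{aligned}\]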
to give
\[\text{Volume}(V) = 8\int_0^{\pi/2} d\theta\ \frac{\cos\theta}{\sqrt{2}}\left[\frac{\cos^2\theta}{3}\right]^{3/2} = \frac{8}{\sqrt{54}}\int_0^{\pi/2} d\theta\ \cos^4\theta\]
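Finally, since \(\int_0^{\pi/2}\cos(4\theta)\,d\theta = \int_0^{\pi/2}\cos(2\theta)\,d\theta = 0\),
\[\text{Volume}(V) = \frac{8}{\sqrt{54}}\ \frac{3}{8}\ \frac{\pi}{2} = \frac{\pi}{2\sqrt{6}}\]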
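The final value can be cross-checked by integrating the last y-integral from the vertical-strip computation numerically. This is a sketch, assuming a midpoint rule; the names `strip_area` and `volume` are ours.

```python
import math

def midpoint_integral(f, a, b, n=20_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def strip_area(y):
    # Integrand of the last y-integral: 8 * ((1 - 2 y^2) / 3)^(3/2).
    return 8.0 * ((1.0 - 2.0 * y * y) / 3.0) ** 1.5

volume = midpoint_integral(strip_area, 0.0, 1.0 / math.sqrt(2.0))
exact = math.pi / (2.0 * math.sqrt(6.0))
print(volume, exact)
```

The numerical value should agree with π/(2√6) ≈ 0.6413 to several decimal places.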
Number the resulting rectangles contained in R, 1 through n. Notice that we are numbering all of the rectangles in R, not just
those in one particular row or column.
Denote by \(\Delta A_i\) the area of rectangle #i.
row or column.
Now repeat this construction over and over again, using finer and finer grids. If, as the size 9 of the rectangles approaches zero, this sum approaches a unique limit (independent of the choice of parallel lines and of points \((x_i^*, y_i^*)\)), then we define
\[\iint_R f(x,y)\,dx\,dy = \lim \sum_{i=1}^n f(x_i^*, y_i^*)\,\Delta A_i\]
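To make the definition concrete, here is a small sketch of the Riemann sums for one hypothetical case: the unit square as R and f(x, y) = x² + y², with the centre of each rectangle chosen as the sample point. These choices are ours, purely for illustration; the exact value of this integral is 2/3.

```python
def riemann_sum(f, n):
    """Riemann sum of f over the unit square on an n-by-n grid.

    Every rectangle has area dA = h * h and the sample point is its centre.
    """
    h = 1.0 / n
    dA = h * h
    return sum(
        f((i + 0.5) * h, (j + 0.5) * h) * dA
        for i in range(n)
        for j in range(n)
    )

def f(x, y):
    return x * x + y * y

# Finer and finer grids: the sums should settle down towards 2/3.
sums = {n: riemann_sum(f, n) for n in (4, 16, 64)}
for n, value in sums.items():
    print(n, value)
```

As the grid is refined the sums approach the limit, which is the behaviour the definition asks for.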
Theorem 3.1.19
B(x) ≤ y ≤ T (x)
L(y) ≤ x ≤ R(y)
The proof of this theorem is not particularly difficult, but is still beyond the scope of this text. The main ideas in the proof can
already be seen in §1.1.6 of the CLP-2 text. An important consequence of this theorem is
a ≤x ≤b c ≤y ≤d
{ } and { }
B(x) ≤ y ≤ T (x) L(y) ≤ x ≤ R(y)
b T (x) d R(y)
The hypotheses of both of these theorems can be relaxed a bit, but not too much. For example, if
R = {(x, y)|0 ≤ x ≤ 1, 0 ≤ y ≤ 1}
then
\[\sum_{i=1}^n f(x_i^*, y_i^*)\,\Delta A_i = \sum_{i=1}^n \Delta A_i = \text{Area}(R)\]
But if we choose all the \(x_i^*\)'s and \(y_i^*\)'s to be irrational numbers, then
\[\sum_{i=1}^n f(x_i^*, y_i^*)\,\Delta A_i = \sum_{i=1}^n 0\,\Delta A_i = 0\]
So the limit of \(\sum_{i=1}^n f(x_i^*, y_i^*)\,\Delta A_i\), as the maximum diagonal of the rectangles approaches zero, depends on the choice of points \((x_i^*, y_i^*)\). So the integral \(\iint_R f(x,y)\,dx\,dy\) does not exist.
Here is an even more pathological 10 example.
Example 3.1.21
In this example, we relax exactly one of the hypotheses of Fubini's Theorem, namely the continuity of f , and construct an
example in which both of the integrals in Fubini's Theorem exist, but are not equal. In fact, we choose
R = {(x, y)|0 ≤ x ≤ 1, 0 ≤ y ≤ 1} and we use a function f (x, y) that is continuous on R, except at exactly one point —
the origin.
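First, let \(\delta_1, \delta_2, \delta_3, \cdots\) be any sequence of real numbers obeying
For example \(\delta_n = \frac{1}{n}\) or \(\delta_n = \frac{1}{2^{n-1}}\) are both acceptable. For each positive integer n, let \(I_n = (\delta_{n+1}, \delta_n] = \{t\ |\ \delta_{n+1} < t \le \delta_n\}\)
\[\int_{I_n} g_n(t)\,dt = 1\]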
We subdivided the interval 0 < x ≤ 1 into infinitely many subintervals \(I_n\). As n increases, the subinterval \(I_n\) gets smaller
over \(I_n\) is one.
Now we define the integrand f(x, y) in terms of these subintervals \(I_n\) and functions \(g_n\).
⎧ 0 if x = 0
⎪
⎪
⎪
⎪
⎪0 if y = 0
You should think of \((0,1]\times(0,1]\) as a union of a bunch of small rectangles \(I_m\times I_n\), as in the figure below. On most of these rectangles, f(x, y) is just zero. The exceptions are the darkly shaded rectangles \(I_n\times I_n\) on the “diagonal” of the figure and the lightly shaded rectangles \(I_{n+1}\times I_n\).
On each darkly shaded rectangle, f(x, y) ≥ 0 and the graph of f(x, y) is the graph of \(g_n(x)\,g_n(y)\) which looks like a pyramid.
On each lightly shaded rectangle, f(x, y) ≤ 0 and the graph of f(x, y) is the graph of \(-g_{n+1}(x)\,g_n(y)\) which looks like a pyramidal hole in the ground.
Now fix any 0 ≤ y ≤ 1 and let's compute \(\int_0^1 f(x,y)\,dx\). That is, we are integrating f along a line that is parallel to the x-axis. If y = 0, then f(x, y) = 0 for all x, so \(\int_0^1 f(x,y)\,dx = 0\). If 0 < y ≤ 1, then there is exactly one positive integer n with \(y\in I_n\) and f(x, y) is zero, except for x in \(I_n\) or \(I_{n+1}\). So for \(y\in I_n\)
\[\begin{aligned}
\int_0^1 f(x,y)\,dx &= \sum_{m=n,n+1}\int_{I_m} f(x,y)\,dx \\
&= g_n(y)\int_{I_n} g_n(x)\,dx - g_n(y)\int_{I_{n+1}} g_{n+1}(x)\,dx \\
&= g_n(y) - g_n(y) = 0
\end{aligned}\]
Here we have twice used that \(\int_{I_m} g_m(t)\,dt = 1\) for all m. Thus \(\int_0^1 f(x,y)\,dx = 0\) for all y and hence
\[\int_0^1 dy\left[\int_0^1 dx\ f(x,y)\right] = 0.\]
Finally, fix any 0 ≤ x ≤ 1 and let's compute \(\int_0^1 f(x,y)\,dy\). That is, we are integrating f along a line that is parallel to the y-axis. If x = 0, then f(x, y) = 0 for all y, so \(\int_0^1 f(x,y)\,dy = 0\). If 0 < x ≤ 1, then there is exactly one positive integer m with \(x\in I_m\). If m ≥ 2, then f(x, y) is zero, except for y in \(I_m\) and \(I_{m-1}\). But, if m = 1, then f(x, y) is zero, except for y in \(I_1\). (Take another look at the figure above.) So for \(x\in I_m\), with m ≥ 2,
\[\begin{aligned}
\int_0^1 f(x,y)\,dy &= \sum_{n=m,m-1}\int_{I_n} f(x,y)\,dy \\
&= g_m(x)\int_{I_m} g_m(y)\,dy - g_m(x)\int_{I_{m-1}} g_{m-1}(y)\,dy \\
&= g_m(x) - g_m(x) = 0
\end{aligned}\]
But for \(x\in I_1\),
\[\int_0^1 f(x,y)\,dy = \int_{I_1} f(x,y)\,dy = g_1(x)\int_{I_1} g_1(y)\,dy = g_1(x)\]
Thus
\[\int_0^1 f(x,y)\,dy = \begin{cases} 0 & \text{if } x\le\delta_2 \\ g_1(x) & \text{if } x\in I_1 \end{cases}\]
and hence
\[\int_0^1 dx\left[\int_0^1 dy\ f(x,y)\right] = \int_{I_1} g_1(x)\,dx = 1\]
The conclusion is that for the f(x, y) above, which is defined for all 0 ≤ x ≤ 1, 0 ≤ y ≤ 1 and is continuous except at (0, 0),
\[\int_0^1 dy\left[\int_0^1 dx\ f(x,y)\right] = 0 \ne 1 = \int_0^1 dx\left[\int_0^1 dy\ f(x,y)\right]\]
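The construction can be explored numerically. The sketch below makes several assumptions that are ours rather than the text's: it takes \(\delta_n = 1/n\), it takes each \(g_n\) to be a "tent" (triangle) function on \(I_n\) with integral one, and it sets f equal to \(g_n(x)g_n(y)\) on \(I_n\times I_n\), to \(-g_{n+1}(x)g_n(y)\) on \(I_{n+1}\times I_n\), and to zero elsewhere, matching the description of the shaded rectangles.

```python
def interval(n):
    """Endpoints (delta_{n+1}, delta_n) of I_n, assuming delta_n = 1/n."""
    return 1.0 / (n + 1), 1.0 / n

def index_of(t):
    """The unique n with t in I_n, valid for 0 < t <= 1."""
    return int(1.0 / t)

def g(n, t):
    """Tent function supported on I_n whose integral over I_n is 1."""
    lo, hi = interval(n)
    if not lo < t <= hi:
        return 0.0
    half_width = (hi - lo) / 2.0
    centre = (lo + hi) / 2.0
    peak = 2.0 / (hi - lo)  # triangle area = base * peak / 2 = 1
    return peak * (1.0 - abs(t - centre) / half_width)

def f(x, y):
    if x <= 0 or y <= 0:
        return 0.0
    m, n = index_of(x), index_of(y)
    if m == n:
        return g(m, x) * g(n, y)    # darkly shaded rectangle I_n x I_n
    if m == n + 1:
        return -g(m, x) * g(n, y)   # lightly shaded rectangle I_{n+1} x I_n
    return 0.0

def integrate_over(n, func, steps=1000):
    """Midpoint rule over I_n (an even step count puts the tent's kink on a grid line)."""
    lo, hi = interval(n)
    h = (hi - lo) / steps
    return h * sum(func(lo + (k + 0.5) * h) for k in range(steps))

def row_integral(y):
    """Integral of f(x, y) dx over 0 < x <= 1 for a fixed 0 < y <= 1."""
    n = index_of(y)
    return sum(integrate_over(m, lambda x: f(x, y)) for m in (n, n + 1))

def column_integral(x):
    """Integral of f(x, y) dy over 0 < y <= 1 for a fixed 0 < x <= 1."""
    m = index_of(x)
    strips = (m, m - 1) if m >= 2 else (1,)
    return sum(integrate_over(n, lambda y: f(x, y)) for n in strips)

# The column integral vanishes for x <= delta_2, so only I_1 contributes to dx-of-dy.
dx_of_dy = integrate_over(1, column_integral, steps=200)
print(row_integral(0.3), column_integral(0.4), column_integral(0.75), dx_of_dy)
```

Every row integral comes out (numerically) zero, while the column integrals are zero only for x outside \(I_1\); integrating them over x gives one, in line with the conclusion above.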
In fact, if f (x) is any even power of x, then f (x) is an even function and if f (x) is any odd power of x, then f (x) is an odd
function.
We also learned how to exploit evenness and oddness to simplify integration.
Let a > 0.
1. If f (x) is an even function, then
\[\int_{-a}^{a} f(x)\,dx = 2\int_0^a f(x)\,dx\]
2. If f(x) is an odd function, then
\[\int_{-a}^{a} f(x)\,dx = 0\]
We will now see that we can similarly exploit evenness and oddness of functions of more than one variable. But for functions of
more than one variable there is also more than one kind of oddness and evenness. In the Definition 3.1.22 (Definition 1.2.8 in the
CLP-2 text) of evenness and oddness of the function f (x), we compared the value of f at x with the value of f at −x. The points
x and −x are the same distance from the origin, 0, and are on opposite sides of 0. The point −x is called the reflection of x across
the origin. To prepare for our definitions of evenness and oddness of functions of two variables, we now define three different
reflections in the two dimensional world of the xy-plane.
Definition 3.1.24
To get from the point (x, y) to its image reflected across the y -axis, you
start from (x, y), and
walk horizontally straight to the y -axis, and
cross the y -axis, and
continue horizontally the same distance as you have already travelled to (−x, y).
Here are four examples.
To get from the point (x, y) to its image reflected across the x-axis, you
start from (x, y), and
walk vertically straight to the x-axis, and
cross the x-axis, and
continue vertically the same distance as you have already travelled to the reflected image (x, −y).
Here are four examples.
To get from the point (x, y) to its image reflected across the origin, you
start from (x, y), and
walk radially straight to the origin, and
cross the origin, and
continue radially in the same direction the same distance as you have already travelled to the reflected image (−x, −y).
Here are three examples.
For each of these three types of reflection, there is a corresponding kind of oddness and evenness.
Definition 3.1.25
we say that f(x, y) is odd under x → −x (i.e. under reflection across the y-axis) when f(−x, y) = −f(x, y) for all x and y, and
we say that f(x, y) is even under y → −y (i.e. under reflection across the x-axis) when f(x, −y) = f(x, y) for all x and y, and
we say that f(x, y) is odd under y → −y (i.e. under reflection across the x-axis) when f(x, −y) = −f(x, y) for all x and y.
Example 3.1.26
Consequently
if m is even, then f (x, y) is even under x → −x and
if m is odd, then f (x, y) is odd under x → −x and
if n is even, then f (x, y) is even under y → −y and
if n is odd, then f (x, y) is odd under y → −y and
if m + n is even, then f (x, y) is even (under reflection across the origin) and
if m + n is odd, then f (x, y) is odd (under reflection across the origin).
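The rules in the list above can be spot-checked with a few lines of code. This sketch assumes the function in question is the monomial \(x^m y^n\) (the statement of the example is not shown here) and tests the three reflections at a handful of integer sample points; agreement at sample points is an illustration, not a proof. All names are ours.

```python
SAMPLES = [(2, 3), (-1, 5), (4, -7)]

def classify(f, reflect):
    """Return 'even', 'odd' or 'neither' for f under the given reflection,
    judged only at the sample points above."""
    if all(f(*reflect(x, y)) == f(x, y) for x, y in SAMPLES):
        return "even"
    if all(f(*reflect(x, y)) == -f(x, y) for x, y in SAMPLES):
        return "odd"
    return "neither"

def flip_x(x, y):        # reflection across the y-axis
    return -x, y

def flip_y(x, y):        # reflection across the x-axis
    return x, -y

def flip_origin(x, y):   # reflection across the origin
    return -x, -y

def monomial(m, n):
    return lambda x, y: x ** m * y ** n

for m, n in [(2, 3), (1, 1)]:
    f = monomial(m, n)
    print(m, n, classify(f, flip_x), classify(f, flip_y), classify(f, flip_origin))
```

For m = 2, n = 3 this reports even, odd, odd; for m = n = 1 it reports odd, odd, even, as the parities of m, n and m + n predict.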
Recall from Theorem 3.1.23 (or Theorem 1.2.11 in the CLP-2 text) that we can exploit the evenness or oddness of the integrand, f(x), of the integral \(\int_b^a f(x)\,dx\) to simplify the evaluation of the integral when b = −a, i.e. when the domain of integration is invariant under reflection across the origin. Similarly, we will be able to simplify the evaluation of the double integral \(\iint_R f(x,y)\,dx\,dy\) when the integrand is even or odd and the domain of integration R is invariant under the corresponding reflection, meaning that the reflected R is identical to the original R. Here are some details for “reflection across the y-axis”.
The details for the other reflections are similar.
If R is any subset of the xy-plane,
The set notation on the right hand side means “the set of all points (−x, y) with (x, y) a point of R ”.
In the special case 11 that
A subset R of the xy-plane is invariant under reflection across the y-axis (also known as “symmetric about the y-axis”)
when
Recall that the symbol ⟺ is read “if and only if”. In the special case that
R = {(x, y)|c ≤ y ≤ d, L(y) ≤ x ≤ R(y)}
We are finally ready for the analog of Theorem 3.1.23 (Theorem 1.2.11 in the CLP-2 text) for functions of two variables. By way of motivation for that theorem, consider the integral \(\iint_R f(x,y)\,dx\,dy\), with the integrand, f(x, y), odd under x → −x, and the domain of integration, R, symmetric about the y-axis. Slice up R into tiny (think “infinitesimal”) squares, either by subdividing vertical slices into tiny squares, as in §3.1.1, or by subdividing horizontal slices into tiny squares, as in §3.1.2. Concentrate on any point \((x_0, y_0)\) in R.
The contribution to the integral coming from the square that contains \((x_0, y_0)\) is (essentially 12) \(f(x_0, y_0)\,\Delta x\,\Delta y\). That contribution is cancelled by the contribution coming from the square containing (the reflected point) \((-x_0, y_0)\), which is
\[f(-x_0, y_0)\,\Delta x\,\Delta y = -f(x_0, y_0)\,\Delta x\,\Delta y\]
∬ f (x, y) dxdy = 0
R
∬ f (x, y) dxdy = 0
R
Denote by \(R_+\) the set of all points in R that have x ≥ 0. If f(x, y) is even under x → −x, then
2. Let R be a subset of the xy-plane that is symmetric about the x-axis. If f (x, y) is odd under y → −y, then
∬ f (x, y) dxdy = 0
R
Denote by \(R_+\) the set of all points in R that have y ≥ 0. If f(x, y) is even under y → −y, then
3. Let R be a subset of the xy-plane that is invariant under reflection across the origin. If f (x, y) is odd (under reflection
across the origin), then
∬ f (x, y) dxdy = 0
R
Denote by \(R_+\) either the set of all points in R that have x ≥ 0 or the set of all points in R that have y ≥ 0. If f(x, y) is
Proof
We will give only the proof for part (a) in the special case that
In part (a), we are assuming that R is symmetric about the y-axis, so that L(y) = −R(y). So, using horizontal strips, as
described in §3.1.2,
d R(y)
Fix any c ≤ y ≤ d.
If f(x, y) is odd under x → −x, then f(−x, y) = −f(x, y) for all −R(y) ≤ x ≤ R(y) and
\[\int_{-R(y)}^{R(y)} dx\ f(x,y) = 0\]
=0
(for example if R has holes in it) is most easily done using the change of variables x = −u, y =v in Theorem 3.8.3, which is
part of the optional §3.8.
The proof of part (b) is similar to the proof of part (a).
The proof of part (c) is most easily done using the change of variables x = −u, y = −v in Theorem 3.8.3, which is part of the
optional §3.8.
Example 3.1.28. \(\iint_R e^x\sin(y+y^3)\,dx\,dy\)
\[\iint_R e^x\sin(y+y^3)\,dx\,dy\]
Solution
Start by checking the evenness and oddness properties of the integrand \(f(x,y) = e^x\sin(y+y^3)\). Since
\[\begin{aligned}
f(-x,y) &= e^{-x}\sin(y+y^3) \\
f(x,-y) &= e^x\sin\big(-y+(-y)^3\big) = e^x\sin(-y-y^3) = -e^x\sin(y+y^3) \\
&= -f(x,y) \\
f(-x,-y) &= -e^{-x}\sin(y+y^3)
\end{aligned}\]
the integrand is odd under y → −y but is neither even nor odd under x → −x and (x, y) → −(x, y). Fortunately (or by rigging), the domain of integration R is invariant under y → −y (i.e. is symmetric about the x-axis) and so
\[\iint_R e^x\sin(y+y^3)\,dx\,dy = 0\]
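The cancellation can be seen numerically on any region that is symmetric about the x-axis. The region used in this example is not restated here, so the sketch below substitutes a hypothetical stand-in, the rectangle 0 ≤ x ≤ 1, −1 ≤ y ≤ 1, purely for illustration; the helper name `midpoint_double_sum` is also ours.

```python
import math

def f(x, y):
    return math.exp(x) * math.sin(y + y ** 3)

def midpoint_double_sum(f, x0, x1, y0, y1, n=200):
    """Midpoint Riemann sum of f over the rectangle [x0, x1] x [y0, y1]."""
    hx = (x1 - x0) / n
    hy = (y1 - y0) / n
    return hx * hy * sum(
        f(x0 + (i + 0.5) * hx, y0 + (j + 0.5) * hy)
        for i in range(n)
        for j in range(n)
    )

# Hypothetical region, symmetric about the x-axis (not the R of the example).
total = midpoint_double_sum(f, 0.0, 1.0, -1.0, 1.0)
print(total)
```

Each sample point (x, y) is paired with (x, −y), where the integrand takes the opposite value, so the sum is zero up to rounding error.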
Example 3.1.29. \(\iint_R (xe^y + ye^x + xe^{xy} + 7)\,dx\,dy\)
\[\iint_R \big(xe^y + ye^x + xe^{xy} + 7\big)\,dx\,dy\]
Solution
First, let's sketch the ellipse \(x^2+4y^2 = 1\). Notice that its x intercepts are the points (x, 0) that obey \(x^2+4(0)^2 = 1\). So the x-intercepts are (±1, 0). Similarly its y intercepts are the points (0, y) that obey \(0^2+4y^2 = 1\). So the y-intercepts are \((0, \pm\frac{1}{2})\).
From the sketch, it looks like R is invariant under x → −x (i.e. is symmetric about the y-axis) and is also invariant under y → −y (i.e. is symmetric about the x-axis) and is also invariant under (x, y) → −(x, y). It is easy to check analytically that this is indeed the case. The point (x, y) is in R if and only if it is inside \(x^2+4y^2 = 1\). That is the case if and only if \(x^2+4y^2\le 1\). Since
\[(-x)^2+4y^2 = x^2+4(-y)^2 = (-x)^2+4(-y)^2 = x^2+4y^2\]
we have
(x, y) is in R ⟺ (−x, y) is in R
Now let's check the evenness and oddness properties of the integrand.
\[\begin{aligned}
f(x,y) &= xe^y + ye^x + xe^{xy} + 7 \\
f(-x,y) &= -xe^y + ye^{-x} - xe^{-xy} + 7 \\
f(x,-y) &= xe^{-y} - ye^x + xe^{-xy} + 7 \\
f(-x,-y) &= -xe^{-y} - ye^{-x} - xe^{xy} + 7
\end{aligned}\]
The third term of f(x, y), namely \(xe^{xy}\), is odd under (x, y) → −(x, y).
The fourth term of f (x, y), namely 7, is even under all of x → −x, y → −y, and (x, y) → −(x, y).
So, by parts (a), (b) and (c) of Theorem 3.1.27, in order,
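\[\begin{aligned}
\iint_R \big(xe^y + ye^x + xe^{xy} + 7\big)\,dx\,dy
&= \iint_R xe^y\,dx\,dy + \iint_R ye^x\,dx\,dy + \iint_R xe^{xy}\,dx\,dy + 7\iint_R dx\,dy \\
&= 0 + 0 + 0 + 7\,\text{Area}(R)
\end{aligned}\]
As R is an ellipse with semi-axes \(a = 1\) and \(b = \frac{1}{2}\), it has area \(\pi ab = \frac{1}{2}\pi\) and
\[\iint_R \big(xe^y + ye^x + xe^{xy} + 7\big)\,dx\,dy = \frac{7}{2}\pi\]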
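The value 7π/2 can be checked by brute force, without using any symmetry. The following sketch assumes R is the region inside the ellipse \(x^2+4y^2 = 1\) and uses horizontal strips with a midpoint rule in both variables; the helper name `integrate_over_ellipse` is ours.

```python
import math

def f(x, y):
    return x * math.exp(y) + y * math.exp(x) + x * math.exp(x * y) + 7.0

def integrate_over_ellipse(f, n=300):
    """Midpoint-rule integral of f over x^2 + 4 y^2 <= 1 using horizontal strips:
    y runs from -1/2 to 1/2 and, for each y, x runs from -sqrt(1 - 4y^2) to +sqrt(1 - 4y^2)."""
    total = 0.0
    hy = 1.0 / n
    for j in range(n):
        y = -0.5 + (j + 0.5) * hy
        half_width = math.sqrt(1.0 - 4.0 * y * y)
        hx = 2.0 * half_width / n
        for i in range(n):
            x = -half_width + (i + 0.5) * hx
            total += f(x, y) * hx * hy
    return total

value = integrate_over_ellipse(f)
print(value, 3.5 * math.pi)
```

The three odd terms contribute essentially nothing, and what remains is seven times the area of the ellipse.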
Exercises
Stage 1
1
For each of the following, evaluate the given double integral without using iteration. Instead, interpret the integral as, for
example, an area or a volume.
1. \(\int_{-1}^{3}\int_{-4}^{1} dy\,dx\)
2. \(\int_0^2\int_0^{\sqrt{4-y^2}} dx\,dy\)
3. \(\int_{-3}^{3}\int_0^{\sqrt{9-y^2}} \sqrt{9-x^2-y^2}\ dx\,dy\)
2
1. ∫ f (x, y) dx
0
2
2. ∫ f (x, y) dy
0
3. \(\int_0^2\int_0^3 f(x,y)\,dx\,dy\)
4. \(\int_0^3\int_0^2 f(x,y)\,dy\,dx\)
5. \(\int_0^3\int_0^2 f(x,y)\,dx\,dy\)
Stage 2
Questions 3.1.7.3 through 3.1.7.8 provide practice with limits of integration for double integrals in Cartesian coordinates.
3
For each of the following, evaluate the given double integral using iteration.
1. \(\iint_R (x^2+y^2)\,dx\,dy\) where R is the rectangle 0 ≤ x ≤ a, 0 ≤ y ≤ b where a > 0 and b > 0.
2. \(\iint_T (x-3y)\,dx\,dy\) where T is the triangle with vertices (0, 0), (a, 0), (0, b).
3. \(\iint_R xy^2\,dx\,dy\) where R is the finite region in the first quadrant bounded by the curves \(y = x^2\) and \(x = y^2\).
4. \(\iint_D x\cos y\,dx\,dy\) where D is the finite region in the first quadrant bounded by the coordinate axes and the curve \(y = 1-x^2\).
5. \(\iint_R \frac{x}{y}e^y\,dx\,dy\) where R is the region 0 ≤ x ≤ 1, \(x^2\le y\le x\).
6. \(\iint_T \frac{xy}{1+x^4}\,dx\,dy\) where T is the triangle with vertices (0, 0), (0, 1), (1, 1).
4
For each of the following integrals (i) sketch the region of integration, (ii) write an equivalent double integral with the order of
integration reversed and (iii) evaluate both double integrals.
1. \(\int_0^2 dx\int_1^{e^x} dy\)
2. \(\int_0^{\sqrt{2}} dy\int_{-\sqrt{4-2y^2}}^{\sqrt{4-2y^2}} dx\ y\)
3. \(\int_{-2}^{1} dx\int_{x^2+4x}^{3x+2} dy\)
5. ✳
into a single iterated double integral with the order of integration reversed.
6. ✳
7. ✳
8. ✳
A region E in the xy--plane has the property that for all continuous functions f
x=3 y=2x+3
1. Compute ∬ x dA.
E
9. ✳
2
∬ sin(y ) dA
D
10. ✳
11. ✳
12. ✳
Find the volume (V) of the solid bounded above by the surface
\[z = f(x,y) = e^{-x^2},\]
below by the plane z = 0 and over the triangle in the xy-plane formed by the lines x = 1, y = 0 and y = x.
13. ✳
Consider the integral \(I = \int_0^1\int_y^{2-y} \frac{y}{x}\,dx\,dy\).
14. ✳
15. ✳
1. D is the region bounded by the parabola \(y^2 = x\) and the line y = x − 2. Sketch D and evaluate J where
\[J = \iint_D 3y\,dA\]
16. ✳
17. ✳
1. Combine the sum of the iterated integrals
1 √y 4 √y
2. Evaluate I if f (x, y) = e
2−x
.
18. ✳
Let
\[I = \int_0^4\int_{\sqrt{y}}^{\sqrt{8-y}} f(x,y)\,dx\,dy\]
1. Sketch the domain of integration.
2. Reverse the order of integration.
3. Evaluate the integral for \(f(x,y) = \frac{1}{(1+y)^2}\).
19. ✳
Evaluate
\[\int_{-1}^{0}\int_{-2}^{2x} e^{y^2}\,dy\,dx\]
20. ✳
Let
2 x 6 √6−x
21. ✳
Consider the domain D above the x-axis and below parabola \(y = 1-x^2\) in the xy-plane.
1. Sketch D.
2. Express
∬ f (x, y) dA
D
as an iterated integral corresponding to the order dx dy. Then express this integral as an iterated integral corresponding to
the order dy dx.
3. Compute the integral in the case \(f(x,y) = e^{x-(x^3/3)}\).
22. ✳
Let \(I = \int_0^1\int_{x^2}^{1} x^3\sin(y^3)\,dy\,dx\).
1. Sketch the region of integration in the xy--plane. Label your sketch sufficiently well that one could use it to determine the
limits of double integration.
2. Evaluate I .
23. ✳
Consider the solid under the surface z = 6 − xy, bounded by the five planes x = 0, x = 3, y = 0, y = 3, z = 0. Note that no
part of the solid lies below the x--y plane.
1. Sketch the base of the solid in the xy--plane. Note that it is not a square!
2. Compute the volume of the solid.
24. ✳
25. ✳
Consider the volume above the xy-plane that is inside the circular cylinder \(x^2+y^2 = 2y\) and underneath the surface z = 8 + 2xy.
1. Express this volume as a double integral I , stating clearly the domain over which I is to be taken.
2. Express, in Cartesian coordinates, the double integral I as an iterated integral in two different ways, indicating clearly the
limits of integration in each case.
3. How much is this volume?
26. ✳
27. ✳
is equal to \(\iint_R \sin(y^3-3y)\,dA\) for a suitable region R in the xy-plane.
1. Sketch the region R.
2. Write the integral I with the orders of integration reversed, and with suitable limits of integration.
3. Find I .
28. ✳
Find the double integral of the function f(x, y) = xy over the region bounded by y = x − 1 and \(y^2 = 2x+6\).
Stage 3
29
1. By the “smallest” x we mean the x farthest to the left along the number line, not the x closest to 0.
2. This theorem is named after the Italian mathematician Guido Fubini (1879--1943).
3. Think of the part of V that is above the strip as being a thin slice of bread. Then the factor dy in \(dy\int_{L(y)}^{R(y)} dx\ f(x,y)\) is the thickness of the slice of bread. The factor \(\int_{L(y)}^{R(y)} dx\ f(x,y)\) is the surface area of the constant y cross-section \(\{(x,z)\ |\ L(y)\le x\le R(y),\ 0\le z\le f(x,y)\}\), i.e. the surface area of the slice of bread.
integrand of the error function \(\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x e^{-t^2}\,dt\) that is used in computing “bell curve” probabilities. See Example
10. For mathematicians, “pathological” is a synonym for “cool”.
11. Here L(y) (“L” stands for “left”) is the leftmost allowed value of x when the y -coordinate is y, and R(y) (“R ” stands for
“right”) is the rightmost allowed value of x, when the y -coordinate is y.
12. In this motivation, we suppress the Δx → 0 and Δy → 0 limits.
This page titled 3.1: Double Integrals is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Joel Feldman,
Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit
history is available upon request.
3.2: Double Integrals in Polar Coordinates
So far, in setting up integrals, we have always cut up the domain of integration into tiny rectangles by drawing in many lines of
constant x and many lines of constant y.
There is no law that says that we must cut up our domains of integration into tiny pieces in that way. Indeed, when the objects of
interest are sort of round and centered on the origin, it is often advantageous 1 to use polar coordinates, rather than Cartesian
coordinates.
Polar Coordinates
It may have been a while since you did anything in polar coordinates. So let's review before we resume integrating.
Definition 3.2.1
The polar coordinates 2 of any point (x, y) in the xy-plane are
Cartesian and polar coordinates are related, via a quick bit of trigonometry, by
Equation 3.2.2
\[x = r\cos\theta \qquad y = r\sin\theta\]
\[r = \sqrt{x^2+y^2} \qquad \theta = \arctan\frac{y}{x}\]
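Equation 3.2.2 translates directly into code. The sketch below is one possible implementation, with function names of our choosing. One caveat: \(\arctan\frac{y}{x}\) on its own cannot distinguish (x, y) from (−x, −y) and is undefined when x = 0, so this sketch uses the two-argument `math.atan2(y, x)`, which returns an angle in (−π, π] in the correct quadrant.

```python
import math

def to_polar(x, y):
    """Return (r, theta) with r >= 0 and -pi < theta <= pi."""
    r = math.hypot(x, y)        # sqrt(x^2 + y^2)
    theta = math.atan2(y, x)    # quadrant-aware version of arctan(y / x)
    return r, theta

def to_cartesian(r, theta):
    """Return (x, y) = (r cos(theta), r sin(theta))."""
    return r * math.cos(theta), r * math.sin(theta)

print(to_polar(1.0, 1.0))     # r = sqrt(2), theta = pi/4
print(to_polar(-1.0, -1.0))   # same arctan(y/x), but theta = -3*pi/4
print(to_polar(0.0, -1.0))    # on the negative y-axis, theta = -pi/2
```

Converting to polar coordinates and back with `to_cartesian` should return the original point, up to rounding.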
The following two figures show a number of lines of constant θ, on the left, and curves of constant r, on the right.
Note that the polar angle θ is only defined up to integer multiples of 2π. For example, the point (1, 0) on the x-axis could have
θ = 0, but could also have θ = 2π or θ = 4π. It is sometimes convenient to assign θ negative values. When θ < 0, the counter-clockwise 3 angle θ refers to the clockwise angle |θ|. For example, the point (0, −1) on the negative y-axis can have \(\theta = -\frac{\pi}{2}\) and can have \(\theta = \frac{3\pi}{2}\).
It is also sometimes convenient to extend the above definitions by saying that x = r cos θ and y = r sin θ even when r is negative.
For example, the following figure shows (x, y) for \(r = 1\), \(\theta = \frac{\pi}{4}\) and for \(r = -1\), \(\theta = \frac{\pi}{4}\).
Both points lie on the line through the origin that makes an angle of 45° with the x-axis and both are a distance one from the origin.
Polar Curves
Here are a couple of examples in which we sketch curves specified by equations in terms of polar coordinates.
r = 1 + cos θ
Our starting point will be to understand how 1 + cos θ varies with θ. So it will be helpful to remember what the graph of cos θ
looks like for 0 ≤ θ ≤ 2π.
Now let's pick some easy θ values, find the corresponding r's and sketch them.
When θ = 0, we have r = 1 + cos 0 = 1 + 1 = 2. To sketch the point with θ = 0 and r = 2, we first draw in the half-line
consisting of all points with θ = 0, r > 0. That's the positive x-axis, sketched in gray in the leftmost figure below. Then we
put in a dot on that line a distance 2 from the origin. That's the red dot in the first figure below.
Now increase θ a bit (to another easy place to evaluate), say to \(\theta = \frac{\pi}{6}\). As we do so r = 1 + cos θ decreases to \(r = 1+\cos\frac{\pi}{6} = 1+\frac{\sqrt{3}}{2}\approx 1.87\). To sketch the point with \(\theta = \frac{\pi}{6}\) and r ≈ 1.87, we first draw in the half-line consisting of all points with \(\theta = \frac{\pi}{6}\), r > 0. That's the upper gray line in the second figure below. Then we put in a dot on that line a
distance 1.87 from the origin. That's the upper red dot in the second figure below.
followed by \(\theta = \frac{3\pi}{6} = \frac{\pi}{2}\),
followed by \(\theta = \frac{4\pi}{6} = \frac{2\pi}{3}\),
followed by \(\theta = \frac{5\pi}{6}\),
followed by \(\theta = \frac{6\pi}{6} = \pi\).
As θ increases, r = 1 + cos θ decreases, hitting r = 1 when \(\theta = \frac{\pi}{2}\) and ending at r = 0 when θ = π. For each of these θ's,
we first draw in the half-line consisting of all points with that θ and r ≥ 0. Those are the five gray lines in the figure on the
right above. Then we put in a dot on each θ -line a distance r = 1 + cos θ from the origin. Those are the red dots on the
gray lines in the figure on the right above.
We could continue the above procedure for π ≤ θ ≤ 2π. Or we can look at the graph of cos θ above and notice that the
graph of cos θ for π ≤ θ ≤ 2π is exactly the mirror image, about θ = π, of the graph of cos θ for 0 ≤ θ ≤ π.
That is, cos(π + θ) = cos(π − θ) so that r(π + θ) = r(π − θ). So we get the figure.
Finally, we fill in a smooth curve through the dots and we get the graph below. This curve is called a cardioid because it
looks like a heart 4.
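The point-by-point procedure used above is easy to mechanise. This sketch tabulates r = 1 + cos θ at the "easy" angles θ = kπ/6, k = 0, …, 6, together with the corresponding Cartesian points; the names `cardioid_r` and `points` are ours.

```python
import math

def cardioid_r(theta):
    return 1.0 + math.cos(theta)

points = []
for k in range(7):
    theta = k * math.pi / 6
    r = cardioid_r(theta)
    points.append((theta, r, r * math.cos(theta), r * math.sin(theta)))

for theta, r, x, y in points:
    print(f"theta = {theta:.4f}  r = {r:.4f}  (x, y) = ({x:.4f}, {y:.4f})")
```

The same function also confirms the mirror symmetry used for π ≤ θ ≤ 2π: `cardioid_r(math.pi + t)` and `cardioid_r(math.pi - t)` agree for any t.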
r = sin(3θ)
Again it will be useful to remember what the graph of sin(3θ) looks like for 0 ≤ θ ≤ 2π.
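We'll first consider \(0\le\theta\le\frac{\pi}{3}\), so that 0 ≤ 3θ ≤ π. On this interval r(θ) = sin(3θ)
starts with r(0) = 0, and then
increases as θ increases until \(3\theta = \frac{\pi}{2}\), i.e. \(\theta = \frac{\pi}{6}\), where \(r\big(\frac{\pi}{6}\big) = 1\), and then
decreases as θ increases until 3θ = π, i.e. \(\theta = \frac{\pi}{3}\), where \(r\big(\frac{\pi}{3}\big) = 0\).
Here is a table giving a few values of r(θ) for \(0\le\theta\le\frac{\pi}{3}\). Notice that we have chosen values of θ for which sin(3θ) is easy to compute.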
θ        3θ       r(θ)
0        0        0
π/12     π/4      1/√2 ≈ 0.71
2π/12    π/2      1
3π/12    3π/4     1/√2 ≈ 0.71
4π/12    π        0
and here is a sketch exhibiting those values and another sketch of the part of the curve with \(0\le\theta\le\frac{\pi}{3}\).
Next consider \(\frac{\pi}{3}\le\theta\le\frac{2\pi}{3}\), so that π ≤ 3θ ≤ 2π. On this interval r(θ) = sin(3θ)
starts with \(r\big(\frac{\pi}{3}\big) = 0\), and then
We are now encountering, for the first time, r(θ)'s that are negative. The figure on the left below contains, for each of \(\theta = \frac{4\pi}{12} = \frac{\pi}{3}\), \(\frac{5\pi}{12}\), \(\frac{6\pi}{12} = \frac{\pi}{2}\), \(\frac{7\pi}{12}\) and \(\frac{8\pi}{12} = \frac{2\pi}{3}\)
the (dashed) half-line consisting of all points with that θ and r < 0 and
the dot with that θ and r(θ) = sin(3θ).
The figure on the right below provides a sketch of the part of the curve r = sin(3θ) with \(\frac{\pi}{3}\le\theta\le\frac{2\pi}{3}\).
Finally consider \(\frac{2\pi}{3}\le\theta\le\pi\) (because r(θ + π) = sin(3θ + 3π) = −sin(3θ) = −r(θ), the part of the curve with π ≤ θ ≤ 2π just retraces the part with 0 ≤ θ ≤ π), so that 2π ≤ 3θ ≤ 3π. On this interval r(θ) = sin(3θ)
\(\theta = \frac{8\pi}{12} = \frac{2\pi}{3}\), \(\frac{9\pi}{12}\), \(\frac{10\pi}{12}\), \(\frac{11\pi}{12}\) and \(\frac{12\pi}{12} = \pi\)
the (solid) half-line consisting of all points with that θ and r ≥ 0 and
the dot with that θ and r(θ) = sin(3θ).
The figure on the right below provides a sketch of the part of the curve r = sin(3θ) with \(\frac{2\pi}{3}\le\theta\le\pi\).
Putting the three lobes together gives the full curve, which is called the “three petal rose”.
There is an infinite family of similar rose curves (also called rhodonea 5 curves).
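The two features of the rose that are easiest to get wrong, the negative values of r on the middle interval and the retracing for π ≤ θ ≤ 2π, can both be checked in a few lines. This is a sketch; the function names are ours.

```python
import math

def rose_r(theta):
    return math.sin(3.0 * theta)

def rose_point(theta):
    """Cartesian point for r = sin(3 theta), allowing r to be negative."""
    r = rose_r(theta)
    return r * math.cos(theta), r * math.sin(theta)

# Values at multiples of pi/12, as in the sketches above.
for k in range(13):
    theta = k * math.pi / 12
    x, y = rose_point(theta)
    print(f"theta = {k:2d}*pi/12  r = {rose_r(theta):+.3f}  (x, y) = ({x:+.3f}, {y:+.3f})")

# Retracing: theta and theta + pi give r of opposite sign but the same point.
theta = 0.4
print(rose_point(theta), rose_point(theta + math.pi))
```

The table shows r going negative for π/3 < θ < 2π/3, and the last line shows that adding π to θ lands on the same point of the curve.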
Here is an enlarged sketch of one such approximate rectangle.
One side has length dr, the spacing between the curves of constant r. The other side is a portion of a circle of radius r that
subtends, at the origin, an angle dθ, the angle between the lines of constant θ. As the circumference of the full circle is 2πr and as dθ is the fraction \(\frac{d\theta}{2\pi}\) of a full circle 6, the other side of the approximate rectangle has length \(\frac{d\theta}{2\pi}\,2\pi r = r\,d\theta\). So the shaded region has area approximately
Equation 3.2.5
dA = r dr dθ
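The quality of the approximation dA = r dr dθ can be quantified. The region between radii r and r + dr and between angles θ and θ + dθ is a difference of two circular sectors, so its exact area is \(\frac{1}{2}\big[(r+dr)^2 - r^2\big]\,d\theta\). The sketch below (function names are ours) compares the two.

```python
def exact_area(r, dr, dtheta):
    """Exact area of the polar 'rectangle': difference of two circular sectors."""
    return 0.5 * ((r + dr) ** 2 - r ** 2) * dtheta

def approximate_area(r, dr, dtheta):
    """The approximation dA = r dr dtheta."""
    return r * dr * dtheta

r, dtheta = 1.0, 0.01
for dr in (0.1, 0.01, 0.001):
    ratio = exact_area(r, dr, dtheta) / approximate_area(r, dr, dtheta)
    print(dr, ratio)
```

Algebraically the ratio is 1 + dr/(2r), which tends to 1 as dr shrinks, so the relative error of the approximation vanishes in the limit.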
which we see the error going to zero in the limit n → ∞, is provided in the (optional) section §3.2.4.
where the functions T (θ) and B(θ) are continuous and obey B(θ) ≤ T (θ) for all a ≤ θ ≤ b. Find the mass of R if it has
density f (x, y).
Solution
The figure on the left below is a sketch of R. Notice that r = T (θ) is the outer curve while r = B(θ) is the inner curve.
Divide R into wedges (as in wedges of pie 8 or wedges of cheese) by drawing in many lines of constant θ, with the various
values of θ differing by a tiny amount dθ. The figure on the right above shows one such wedge, outlined in blue.
Concentrate on any one wedge. Subdivide the wedge further into approximate rectangles by drawing in many circles of
constant r, with the various values of r differing by a tiny amount dr. The figure below shows one such approximate rectangle,
in black.
Now concentrate on one such rectangle. Let's say that it contains the point with polar coordinates r and θ. As we saw in 3.2.5
above,
the area of that rectangle is essentially dA = r dr dθ.
As the mass density on the rectangle is essentially f (r cos θ , r sin θ), the mass of the rectangle is essentially
f (r cos θ , r sin θ) r dr dθ.
To get the mass of any one wedge, say the wedge whose polar angle runs from θ to θ + dθ, we just add up the masses of
the approximate rectangles in that wedge, by integrating r from its smallest value on the wedge, namely B(θ), to its largest
value on the wedge, namely T (θ). The mass of the wedge is thus
\[d\theta\int_{B(\theta)}^{T(\theta)} dr\ r\, f(r\cos\theta, r\sin\theta)\]
Finally, to get the mass of R, we just add up the masses of all of the different wedges, by integrating θ from its smallest
value on R, namely a, to its largest value on R, namely b.
In conclusion,
\[\text{Mass}(R) = \int_a^b d\theta\int_{B(\theta)}^{T(\theta)} dr\ r\, f(r\cos\theta, r\sin\theta)\]
We have repeatedly used the word “essentially” above to avoid getting into the nitty-gritty details required to prove things
rigorously. The mathematically correct proof of 3.2.7 follows the same intuition, but requires some more careful error bounds,
as in the optional §3.2.4 below.
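In the last example, we derived the important formula that the mass of the region
\[ R = \{ (r\cos\theta, r\sin\theta) \mid a \le \theta \le b,\ B(\theta) \le r \le T(\theta) \} \]
with mass density f (x, y) is
Equation 3.2.7
\[ \text{Mass}(R) = \int_a^b d\theta \int_{B(\theta)}^{T(\theta)} dr\, r\, f(r\cos\theta, r\sin\theta) \]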
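We can immediately adapt that example to calculate areas and derive the formula that the area of the region
\[ R = \{ (r\cos\theta, r\sin\theta) \mid a \le \theta \le b,\ 0 \le r \le R(\theta) \} \]
is
Equation 3.2.8
\[ \text{Area}(R) = \frac{1}{2} \int_a^b R(\theta)^2\, d\theta \]
We just have to set the density to 1. We do so in the next example.
With density 1, B(θ) = 0 and T (θ) = R(θ), the mass formula becomes
\[ \text{Area}(R) = \int_a^b d\theta \int_0^{R(\theta)} dr\, r = \frac{1}{2} \int_a^b R(\theta)^2\, d\theta \]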
The expression \( \frac{1}{2} R(\theta)^2\, d\theta \) in 3.2.8 has a geometric interpretation. It is just the area of a wedge of a circular disk of radius R(θ) (with R(θ) treated as a constant) that subtends the angle dθ.
To see this, note that the area of the wedge is the fraction \( \frac{d\theta}{2\pi} \) of the area of the entire disk, which is \( \pi R(\theta)^2 \). So 3.2.8 just says that the area of R can be computed by cutting R up into tiny wedges and adding up the areas of all of the tiny wedges.
Find the area of one petal of the three petal rose r = sin(3θ).
Solution
Looking at the last figure in Example 3.2.4, we see that we want the area of
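\[ R = \{ (r\cos\theta, r\sin\theta) \mid 0 \le \theta \le \tfrac{\pi}{3},\ 0 \le r \le \sin(3\theta) \} \]
So, by 3.2.8 with a = 0, b = π/3, and R(θ) = sin(3θ),
\[
\begin{aligned}
\text{area}(R) &= \frac{1}{2} \int_0^{\pi/3} \sin^2(3\theta)\, d\theta \\
&= \frac{1}{4} \int_0^{\pi/3} \big(1 - \cos(6\theta)\big)\, d\theta \\
&= \frac{1}{4} \Big[ \theta - \frac{1}{6}\sin(6\theta) \Big]_0^{\pi/3} \\
&= \frac{\pi}{12}
\end{aligned}
\]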
In the first step we used the double angle formula cos(2ϕ) = 1 − 2 sin²(ϕ). Unsurprisingly, trig identities show up a lot when
polar coordinates are used.
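As a numerical sanity check of the petal computation, here is a minimal sketch that approximates the polar area formula 3.2.8 with a midpoint Riemann sum. The function name `polar_area` and the number of subintervals are illustrative choices, not part of the text.

```python
import math

def polar_area(R, a, b, n=20_000):
    """Midpoint-rule approximation of (1/2) * integral_a^b R(theta)^2 d(theta)."""
    d_theta = (b - a) / n
    total = 0.0
    for i in range(n):
        theta = a + (i + 0.5) * d_theta   # midpoint of the i-th wedge
        total += 0.5 * R(theta) ** 2 * d_theta
    return total

# One petal of the rose r = sin(3*theta), for 0 <= theta <= pi/3
petal = polar_area(lambda t: math.sin(3 * t), 0.0, math.pi / 3)
print(petal, math.pi / 12)
```

The two printed numbers should agree to many decimal places, which is consistent with the exact value π/12 found above.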
A cylindrical hole of radius b is drilled symmetrically (i.e. along a diameter) through a metal sphere of radius a ≥ b. Find the
volume of metal removed.
Solution
Let's use a coordinate system with the sphere centred on (0, 0, 0) and with the centre of the drill hole following the z -axis. In
particular, the sphere is x² + y² + z² ≤ a².
Here is a sketch of the part of the sphere in the first octant. The hole in the sphere made by the drill is outlined in red. By
symmetry the total amount of metal removed will be eight times the amount from the first octant.
That is, the volume of metal removed will be eight times the volume of the solid
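\[ V_1 = \{ (x, y, z) \mid (x, y) \in R_1,\ 0 \le z \le \sqrt{a^2 - x^2 - y^2} \} \]
In polar coordinates
\[
\begin{aligned}
V_1 &= \{ (r\cos\theta, r\sin\theta, z) \mid (r\cos\theta, r\sin\theta) \in R_1,\ 0 \le z \le \sqrt{a^2 - r^2} \} \\
R_1 &= \{ (r\cos\theta, r\sin\theta) \mid 0 \le r \le b,\ 0 \le \theta \le \tfrac{\pi}{2} \}
\end{aligned}
\]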
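We follow our standard divide and sum up strategy. We will cut the base region R1 into small pieces and sum up the volumes of the parts of V1 that lie above those pieces. Divide R1 into wedges by drawing in many lines of constant θ, with the various values of θ differing by a tiny amount dθ.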
The figure on the left below shows one such wedge, outlined in blue.
Concentrate on any one wedge. Subdivide the wedge further into approximate rectangles by drawing in many circles of
constant r, with the various values of r differing by a tiny amount dr. The figure on the right above shows one such
approximate rectangle, in black.
Concentrate on one such rectangle. Let's say that it contains the point with polar coordinates r and θ. As we saw in 3.2.5
above,
the area of that rectangle is essentially dA = r dr dθ.
The part of V1 that is above that rectangle is like an office tower whose height is essentially \( \sqrt{a^2 - r^2} \), and whose base has area dA = r dr dθ. It is outlined in black in the figure below. So the volume of the part of V1 that is above the rectangle is essentially \( \sqrt{a^2 - r^2}\; r\, dr\, d\theta \).
To get the volume of the part of V1 above any one wedge (outlined in blue in the figure below), say the wedge whose polar angle runs from θ to θ + dθ, we just add up the volumes above the approximate rectangles in that wedge, by integrating r from its smallest value on the wedge, namely 0, to its largest value on the wedge, namely b. The volume above the wedge is thus
\[
\begin{aligned}
d\theta \int_0^b dr\, r \sqrt{a^2 - r^2} &= d\theta \int_{a^2}^{a^2 - b^2} \frac{du}{-2} \sqrt{u} \qquad \text{where } u = a^2 - r^2,\ du = -2r\, dr \\
&= d\theta \left[ \frac{u^{3/2}}{-3} \right]_{a^2}^{a^2 - b^2} \\
&= \frac{1}{3}\, d\theta \left[ a^3 - (a^2 - b^2)^{3/2} \right]
\end{aligned}
\]
Notice that this quantity is independent of θ. If you think about this for a moment, you can see that this is a consequence of
the fact that our solid is invariant under rotations about the z -axis.
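Finally, to get the volume of V1, we just add up the volumes over all of the different wedges, by integrating θ from its smallest value on R1, namely 0, to its largest value on R1, namely π/2.
\[
\begin{aligned}
\text{Volume}(V_1) &= \frac{1}{3} \int_0^{\pi/2} d\theta \left[ a^3 - (a^2 - b^2)^{3/2} \right] \\
&= \frac{\pi}{6} \left[ a^3 - (a^2 - b^2)^{3/2} \right]
\end{aligned}
\]
The total volume of metal removed is eight times this, namely
\[ 8\, \text{Volume}(V_1) = \frac{4\pi}{3} \left[ a^3 - (a^2 - b^2)^{3/2} \right] \]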
Note that we can easily apply a couple of sanity checks to our answer.
If the radius of the drill bit b = 0, no metal is removed at all. So the total volume removed should be zero. Our answer does
indeed give 0 in this case.
If the radius of the drill bit b = a, the radius of the sphere, then the entire sphere disappears. So the total volume removed should be the volume of a sphere of radius a. Our answer does indeed give \( \frac{4}{3}\pi a^3 \) in this case.
If the radius, a, of the sphere and the radius, b, of the drill bit are measured in units of meters, then the volume removed, \( \frac{4\pi}{3} \left[ a^3 - (a^2 - b^2)^{3/2} \right] \), has units meters³, as it should.
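The sanity checks above can also be run numerically. The following is a minimal sketch, with illustrative function names, that compares the closed-form answer against a midpoint-rule evaluation of eight times the wedge integral. Since the integrand does not depend on θ, the θ-integral is taken to contribute a factor of π/2.

```python
import math

def removed_volume_formula(a, b):
    """Closed form (4*pi/3) * [a^3 - (a^2 - b^2)^(3/2)] derived above."""
    return 4 * math.pi / 3 * (a ** 3 - (a * a - b * b) ** 1.5)

def removed_volume_numeric(a, b, n=200_000):
    """8 * (pi/2) * integral_0^b r*sqrt(a^2 - r^2) dr, by the midpoint rule."""
    dr = b / n
    radial = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        radial += r * math.sqrt(a * a - r * r) * dr
    return 8 * (math.pi / 2) * radial

print(removed_volume_formula(2.0, 1.0), removed_volume_numeric(2.0, 1.0))
print(removed_volume_formula(2.0, 0.0))                          # no hole: nothing removed
print(removed_volume_formula(2.0, 2.0), 4 * math.pi / 3 * 8.0)   # whole sphere removed
```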
The previous two problems were given to us (or nearly given to us) in polar coordinates. We'll now get a little practice converting
integrals into polar coordinates, and recognising when it is helpful to do so.
is very simple. So whether or not this integral will be easy to evaluate using polar coordinates will be largely determined by the
domain of integration.
So our main task is to sketch the domain of integration. To prepare for the sketch, note that in the integral
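\[ \int_0^1 \int_0^x y\sqrt{x^2 + y^2}\; dy\, dx = \int_0^1 dx \left[ \int_0^x dy\; y\sqrt{x^2 + y^2} \right] \]
the variable x runs from 0 to 1 and, for each fixed x in that range, y runs from 0 to x. So the domain of integration is
\[ D = \{ (x, y) \mid 0 \le x \le 1,\ 0 \le y \le x \} \]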
which is sketched in the figure on the left below. It is a right angled triangle.
Next we express the domain of integration in terms of polar coordinates, by expressing the equations of each of the boundary
lines in terms of polar coordinates.
The x-axis, i.e. y = r sin θ = 0, is θ = 0.
The line x = 1 is r cos θ = 1 or \( r = \frac{1}{\cos\theta} \).
Concentrate on any one wedge. Subdivide the wedge further into approximate rectangles by drawing in many circles of
constant r, with the various values of r differing by a tiny amount dr. The figure on the right above shows one such
approximate rectangle, in black.
The rectangle that contains the point with polar coordinates r and θ has area (essentially) r dr dθ.
The first rectangle has r = 0.
The last rectangle has \( r = \frac{1}{\cos\theta} \).
So our integral is
\[ \int_0^1 \int_0^x y\sqrt{x^2 + y^2}\; dy\, dx = \int_0^{\pi/4} d\theta \int_0^{1/\cos\theta} dr\; r\; \overbrace{(r^2 \sin\theta)}^{y\sqrt{x^2 + y^2}} \]
Because the r-integral treats θ as a constant, we can pull the sin θ out of the inner r-integral.
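\[
\begin{aligned}
\int_0^1 \int_0^x y\sqrt{x^2 + y^2}\; dy\, dx &= \int_0^{\pi/4} d\theta\, \sin\theta \int_0^{1/\cos\theta} dr\; r^3 \\
&= \frac{1}{4} \int_0^{\pi/4} d\theta\, \sin\theta\, \frac{1}{\cos^4\theta}
\end{aligned}
\]
Make the substitution u = cos θ, du = − sin θ dθ. When θ = 0, u = cos θ = 1 and when θ = π/4, u = cos θ = \( \frac{1}{\sqrt{2}} \). So
\[
\begin{aligned}
\int_0^1 \int_0^x y\sqrt{x^2 + y^2}\; dy\, dx &= \frac{1}{4} \int_1^{1/\sqrt{2}} (-du)\, \frac{1}{u^4} \\
&= -\frac{1}{4} \left[ \frac{u^{-3}}{-3} \right]_1^{1/\sqrt{2}} = \frac{1}{12} \left[ 2\sqrt{2} - 1 \right]
\end{aligned}
\]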
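To see that the change of variables really does leave the value unchanged, here is a minimal sketch that evaluates the same integral twice with midpoint Riemann sums, once over the triangle in Cartesian coordinates and once over the polar description 0 ≤ θ ≤ π/4, 0 ≤ r ≤ 1/cos θ. The function names and grid sizes are illustrative assumptions.

```python
import math

def cartesian_sum(n=400):
    """Midpoint sum of y*sqrt(x^2 + y^2) over 0 <= x <= 1, 0 <= y <= x."""
    total = 0.0
    dx = 1.0 / n
    for i in range(n):
        x = (i + 0.5) * dx
        dy = x / n
        for j in range(n):
            y = (j + 0.5) * dy
            total += y * math.sqrt(x * x + y * y) * dy * dx
    return total

def polar_sum(n=400):
    """Midpoint sum of (r^2 sin(theta)) * r over 0 <= theta <= pi/4, 0 <= r <= 1/cos(theta)."""
    total = 0.0
    d_theta = (math.pi / 4) / n
    for i in range(n):
        theta = (i + 0.5) * d_theta
        dr = (1.0 / math.cos(theta)) / n
        for j in range(n):
            r = (j + 0.5) * dr
            total += (r * r * math.sin(theta)) * r * dr * d_theta
    return total

exact = (2 * math.sqrt(2) - 1) / 12
print(cartesian_sum(), polar_sum(), exact)
```

Note the extra factor of r in the polar sum: it is the r of dA = r dr dθ, and leaving it out is the most common mistake when converting.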
Evaluate \( \int_0^\infty e^{-x^2}\, dx \).
Solution
This is actually a trick question. In fact it is a famous trick question 9.
The integrand \( e^{-x^2} \) does not have an antiderivative that can be expressed in terms of elementary functions 10. So we cannot evaluate this integral using the usual Calculus II methods. However we can evaluate its square
\[ \left[ \int_0^\infty e^{-x^2}\, dx \right]^2 = \int_0^\infty e^{-x^2}\, dx \int_0^\infty e^{-y^2}\, dy = \int_0^\infty dx \int_0^\infty dy\; e^{-x^2 - y^2} \]
precisely because this double integral can be easily evaluated just by changing to polar coordinates! The domain of integration
is the first quadrant {(x, y)|x ≥ 0, y ≥ 0} . In polar coordinates, dxdy = r drdθ and the first quadrant is
\[ \{ (r\cos\theta, r\sin\theta) \mid r \ge 0,\ 0 \le \theta \le \tfrac{\pi}{2} \} \]
So
\[ \left[ \int_0^\infty e^{-x^2}\, dx \right]^2 = \int_0^\infty dx \int_0^\infty dy\; e^{-x^2 - y^2} = \int_0^{\pi/2} d\theta \int_0^\infty dr\; r\, e^{-r^2} \]
As r runs all the way to +∞, this is an improper integral, so we should be a little bit careful.
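\[
\begin{aligned}
\left[ \int_0^\infty e^{-x^2}\, dx \right]^2 &= \lim_{R\to\infty} \int_0^{\pi/2} d\theta \int_0^R dr\; r\, e^{-r^2} \\
&= \lim_{R\to\infty} \int_0^{\pi/2} d\theta \int_0^{R^2} \frac{du}{2}\, e^{-u} \qquad \text{where } u = r^2,\ du = 2r\, dr \\
&= \lim_{R\to\infty} \int_0^{\pi/2} d\theta \left[ -\frac{e^{-u}}{2} \right]_0^{R^2} \\
&= \lim_{R\to\infty} \frac{\pi}{2} \left[ \frac{1}{2} - \frac{e^{-R^2}}{2} \right] \\
&= \frac{\pi}{4}
\end{aligned}
\]
Since the integrand is positive, taking the positive square root gives \( \int_0^\infty e^{-x^2}\, dx = \frac{\sqrt{\pi}}{2} \).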
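The limiting argument can be watched numerically. This minimal sketch evaluates the truncated polar integral for a finite R with a midpoint sum, compares it with the closed form (π/2)[1/2 − e^{−R²}/2] obtained above, and checks that the square root approaches √π/2. The function names are illustrative.

```python
import math

def truncated_polar(R, n=100_000):
    """(pi/2) * integral_0^R r*exp(-r^2) dr by the midpoint rule."""
    dr = R / n
    radial = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        radial += r * math.exp(-r * r) * dr
    return (math.pi / 2) * radial

def closed_form(R):
    """(pi/2) * [1/2 - exp(-R^2)/2], the value found by substituting u = r^2."""
    return (math.pi / 2) * (0.5 - 0.5 * math.exp(-R * R))

for R in (1.0, 2.0, 3.0, 10.0):
    print(R, truncated_polar(R), closed_form(R))

print(math.sqrt(closed_form(10.0)), math.sqrt(math.pi) / 2)
```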
Example 3.2.14
Find the area of the region that is inside the circle r = 4 cos θ and to the left of the line x = 1.
Solution
First, let's check that r = 4 cos θ really is a circle and figure out what circle it is. To do so, we'll convert the equation
r = 4 cos θ into Cartesian coordinates. Multiplying both sides by r gives
\[ r^2 = 4r\cos\theta \iff x^2 + y^2 = 4x \iff (x - 2)^2 + y^2 = 4 \]
So r = 4 cos θ is the circle of radius 2 centred on (2, 0). We'll also need the intersection point(s) of x = r cos θ = 1 and
r = 4 cos θ. At such an intersection point
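\[
\begin{aligned}
r\cos\theta = 1,\ r = 4\cos\theta &\implies \frac{1}{\cos\theta} = 4\cos\theta \\
&\implies \cos^2\theta = \frac{1}{4} \\
&\implies \cos\theta = \frac{1}{2} \qquad \text{since } r\cos\theta = 1 > 0 \\
&\implies \theta = \pm\frac{\pi}{3}
\end{aligned}
\]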
We could figure out the area of R by using some high school geometry, because R is a circular wedge with a triangle
removed. (See Example 3.2.15, below.)
Instead, we'll treat its computation as an exercise in integration using polar coordinates.
As R is symmetric about the x-axis, the area of R is twice the area of the part that is above the x-axis. We'll denote by R1 the upper half of R. Note that we can write the equation x = 1 in polar coordinates as \( r = \frac{1}{\cos\theta} \). Here is a sketch of R1.
For each fixed θ between 0 and π/2,
if θ < π/3, then r runs from 0 to \( \frac{1}{\cos\theta} \), while
if θ > π/3, then r runs from 0 to 4 cos θ.
This naturally leads us to split the domain of integration at θ = π/3:
\[ \text{Area}(R_1) = \int_0^{\pi/3} d\theta \int_0^{1/\cos\theta} dr\; r + \int_{\pi/3}^{\pi/2} d\theta \int_0^{4\cos\theta} dr\; r \]
As \( \int r\, dr = \frac{r^2}{2} + C \),
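\[
\begin{aligned}
\text{Area}(R_1) &= \int_0^{\pi/3} d\theta\; \frac{\sec^2\theta}{2} + \int_{\pi/3}^{\pi/2} d\theta\; 8\cos^2\theta \\
&= \frac{1}{2} \tan\theta \Big|_0^{\pi/3} + 4 \int_{\pi/3}^{\pi/2} d\theta\; \big[ 1 + \cos(2\theta) \big] \\
&= \frac{\sqrt{3}}{2} + 4 \left[ \theta + \frac{\sin(2\theta)}{2} \right]_{\pi/3}^{\pi/2} \\
&= \frac{\sqrt{3}}{2} + 4 \left[ \frac{\pi}{6} - \frac{\sqrt{3}}{4} \right] \\
&= \frac{2\pi}{3} - \frac{\sqrt{3}}{2}
\end{aligned}
\]
and
\[ \text{Area}(R) = 2\, \text{Area}(R_1) = \frac{4\pi}{3} - \sqrt{3} \]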
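The split at θ = π/3 translates directly into a piecewise outer radius. The sketch below, with illustrative names, integrates ½ r_max(θ)² over the upper half with a midpoint sum and doubles the result, so it can be compared with 4π/3 − √3.

```python
import math

def r_max(theta):
    """Outer boundary of R1: the line x = 1 for theta < pi/3, the circle r = 4*cos(theta) beyond."""
    if theta < math.pi / 3:
        return 1.0 / math.cos(theta)
    return 4.0 * math.cos(theta)

def area_of_R(n=200_000):
    """Twice the midpoint-rule value of integral_0^{pi/2} (1/2) r_max(theta)^2 d(theta)."""
    d_theta = (math.pi / 2) / n
    upper_half = 0.0
    for i in range(n):
        theta = (i + 0.5) * d_theta
        upper_half += 0.5 * r_max(theta) ** 2 * d_theta
    return 2 * upper_half

print(area_of_R(), 4 * math.pi / 3 - math.sqrt(3))
```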
We'll now again compute the area of the region R that is inside the circle r = 4 cos θ and to the left of the line x = 1. That was
the region of interest in Example 3.2.14. This time we'll just use some geometry. Think of R as being the wedge W, of the
figure on the left below, with the triangle T , of the figure on the right below, removed.
First we'll get the area of W. The cosine of the angle between the x axis and the radius vector from C to A is \( \frac{1}{2} \). So that angle is \( \frac{\pi}{3} \) and W subtends an angle of \( \frac{2\pi}{3} \). The entire circle has area \( \pi 2^2 \), so that W, which is the fraction \( \frac{2\pi/3}{2\pi} = \frac{1}{3} \) of the full circle, has area \( \frac{4\pi}{3} \).
Now we'll get the area of the triangle T . Think of T as having base BD. Then the length of the base of T is \( 2\sqrt{3} \) and the height of T is 1. So T has area \( \frac{1}{2}(2\sqrt{3})(1) = \sqrt{3} \).
All together
\[ \text{Area}(R) = \text{Area}(W) - \text{Area}(T) = \frac{4\pi}{3} - \sqrt{3} \]
We used some hand waving in deriving the area formula 3.2.8: the word “essentially” appeared quite a few times. Here is how to do that derivation more rigorously.
for the area of the region
In the course of that derivation we approximated the area of the shaded region in
by dA = r dr dθ.
We will now justify that approximation, under the assumption that
\[ 0 \le R(\theta) \le M \qquad |R'(\theta)| \le L \]
for all a ≤ θ ≤ b. That is, R(θ) is bounded and its derivative exists and is bounded too.
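Divide the interval a ≤ θ ≤ b into n equal subintervals, each of length \( \Delta\theta = \frac{b-a}{n} \). Let \( \theta_i^* \) be the midpoint of the \( i^{\rm th} \) interval. On the \( i^{\rm th} \) interval, θ runs from \( \theta_i^* - \frac{1}{2}\Delta\theta \) to \( \theta_i^* + \frac{1}{2}\Delta\theta \). By the mean value theorem and the bound |R′(θ)| ≤ L,
\[ \big| R(\theta) - R(\theta_i^*) \big| \le L\, \big| \theta - \theta_i^* \big| \tag{$*$} \]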
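On the \( i^{\rm th} \) interval, the radius r = R(θ) runs over all values of R(θ) with θ satisfying \( |\theta - \theta_i^*| \le \frac{1}{2}\Delta\theta \). By (∗), all of these values of R(θ) lie between \( r_i = R(\theta_i^*) - \frac{1}{2} L\Delta\theta \) and \( R_i = R(\theta_i^*) + \frac{1}{2} L\Delta\theta \). Consequently the part of R having θ in the \( i^{\rm th} \) subinterval, namely,
\[ \mathcal{R}_i = \{ (r\cos\theta, r\sin\theta) \mid \theta_i^* - \tfrac{1}{2}\Delta\theta \le \theta \le \theta_i^* + \tfrac{1}{2}\Delta\theta,\ 0 \le r \le R(\theta) \} \]
contains the circular sector of radius \( r_i \), and is contained in the circular sector of radius \( R_i \), that subtend the same angle Δθ.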
That is, we have found one circular sector that is bigger than the one we are approximating, and one circular sector that is smaller.
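The area of a circular disk of radius ρ is πρ². A circular sector of radius ρ that subtends an angle Δθ is the fraction \( \frac{\Delta\theta}{2\pi} \) of the full disk and so has area \( \frac{\Delta\theta}{2\pi}\, \pi\rho^2 = \frac{1}{2}\Delta\theta\, \rho^2 \).
So the area of \( \mathcal{R}_i \) must lie between
\[ \frac{1}{2}\Delta\theta\, r_i^2 = \frac{1}{2}\Delta\theta \Big[ R(\theta_i^*) - \frac{1}{2} L\Delta\theta \Big]^2 \quad\text{and}\quad \frac{1}{2}\Delta\theta\, R_i^2 = \frac{1}{2}\Delta\theta \Big[ R(\theta_i^*) + \frac{1}{2} L\Delta\theta \Big]^2 \]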
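Observe that
\[ \Big[ R(\theta_i^*) \pm \frac{1}{2} L\Delta\theta \Big]^2 = R(\theta_i^*)^2 \pm L R(\theta_i^*)\Delta\theta + \frac{1}{4} L^2 \Delta\theta^2 \]
implies that, since 0 ≤ R(θ) ≤ M,
\[ R(\theta_i^*)^2 - LM\Delta\theta + \frac{1}{4} L^2 \Delta\theta^2 \le \Big[ R(\theta_i^*) \pm \frac{1}{2} L\Delta\theta \Big]^2 \le R(\theta_i^*)^2 + LM\Delta\theta + \frac{1}{4} L^2 \Delta\theta^2 \]
Hence (multiplying by \( \frac{\Delta\theta}{2} \) to turn them into areas)
\[ \frac{1}{2} R(\theta_i^*)^2 \Delta\theta - \frac{1}{2} LM\Delta\theta^2 + \frac{1}{8} L^2 \Delta\theta^3 \le \text{Area}(\mathcal{R}_i) \le \frac{1}{2} R(\theta_i^*)^2 \Delta\theta + \frac{1}{2} LM\Delta\theta^2 + \frac{1}{8} L^2 \Delta\theta^3 \]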
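Summing over i, the total area \( A = \sum_{i=1}^n \text{Area}(\mathcal{R}_i) \) obeys the bounds
\[ \sum_{i=1}^n \Big[ \frac{1}{2} R(\theta_i^*)^2 \Delta\theta - \frac{1}{2} LM\Delta\theta^2 + \frac{1}{8} L^2 \Delta\theta^3 \Big] \le A \le \sum_{i=1}^n \Big[ \frac{1}{2} R(\theta_i^*)^2 \Delta\theta + \frac{1}{2} LM\Delta\theta^2 + \frac{1}{8} L^2 \Delta\theta^3 \Big] \]
and
\[ \sum_{i=1}^n \frac{1}{2} R(\theta_i^*)^2 \Delta\theta - \frac{1}{2} nLM\Delta\theta^2 + \frac{1}{8} nL^2 \Delta\theta^3 \le A \le \sum_{i=1}^n \frac{1}{2} R(\theta_i^*)^2 \Delta\theta + \frac{1}{2} nLM\Delta\theta^2 + \frac{1}{8} nL^2 \Delta\theta^3 \]
Since \( \Delta\theta = \frac{b-a}{n} \),
\[ \sum_{i=1}^n \frac{1}{2} R(\theta_i^*)^2 \Delta\theta - \frac{LM}{2} \frac{(b-a)^2}{n} + \frac{L^2}{8} \frac{(b-a)^3}{n^2} \le A \le \sum_{i=1}^n \frac{1}{2} R(\theta_i^*)^2 \Delta\theta + \frac{LM}{2} \frac{(b-a)^2}{n} + \frac{L^2}{8} \frac{(b-a)^3}{n^2} \]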
Now take the limit as n → ∞. Since
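\[
\begin{aligned}
\lim_{n\to\infty} \Big[ \sum_{i=1}^n \frac{1}{2} R(\theta_i^*)^2 \Delta\theta \pm \frac{LM}{2} \frac{(b-a)^2}{n} + \frac{L^2}{8} \frac{(b-a)^3}{n^2} \Big]
&= \frac{1}{2} \int_a^b R(\theta)^2\, d\theta \pm \lim_{n\to\infty} \frac{LM}{2} \frac{(b-a)^2}{n} + \lim_{n\to\infty} \frac{L^2}{8} \frac{(b-a)^3}{n^2} \\
&= \frac{1}{2} \int_a^b R(\theta)^2\, d\theta \qquad \text{(since } L, M, a \text{ and } b \text{ are all constants)}
\end{aligned}
\]
we have that
\[ A = \frac{1}{2} \int_a^b R(\theta)^2\, d\theta \]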
exactly, as desired.
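The error bound in this argument can be tested on a concrete curve. The sketch below uses the spiral R(θ) = θ on 0 ≤ θ ≤ π as an assumed example, for which M = π, L = 1 and the exact area is π³/6. It compares the actual error of the midpoint sector sum with the bound LM(b−a)²/(2n) + L²(b−a)³/(8n²).

```python
import math

def sector_sum(R, a, b, n):
    """Sum of (1/2) * R(theta_i*)^2 * d_theta over midpoints theta_i* of n equal subintervals."""
    d_theta = (b - a) / n
    return sum(0.5 * R(a + (i + 0.5) * d_theta) ** 2 for i in range(n)) * d_theta

def error_bound(L, M, a, b, n):
    """Upper bound on |A - sector_sum| from the sandwich argument above."""
    return L * M * (b - a) ** 2 / (2 * n) + L * L * (b - a) ** 3 / (8 * n * n)

# Assumed example: the spiral R(theta) = theta, so 0 <= R <= pi = M and |R'| = 1 = L.
a, b, L, M = 0.0, math.pi, 1.0, math.pi
exact_area = math.pi ** 3 / 6

for n in (10, 100, 1000):
    err = abs(sector_sum(lambda t: t, a, b, n) - exact_area)
    print(n, err, error_bound(L, M, a, b, n))
```

In this example the actual error shrinks much faster than the bound, which only needs to go to zero as n → ∞ for the proof to work.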
Exercises
Stage 1
1
For each 1 ≤ i ≤ 5,
sketch, in the xy-plane, the point \( (x_i, y_i) \) and
2
1. Find all pairs (r, θ) such that
3
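\[ \mathbf{e}_r(\theta) = \cos\theta\, \hat{\imath} + \sin\theta\, \hat{\jmath} \qquad \mathbf{e}_\theta(\theta) = -\sin\theta\, \hat{\imath} + \cos\theta\, \hat{\jmath} \]
1. Determine, for each angle θ, the lengths of the vectors \( \mathbf{e}_r(\theta) \) and \( \mathbf{e}_\theta(\theta) \) and the angle between the vectors \( \mathbf{e}_r(\theta) \) and \( \mathbf{e}_\theta(\theta) \). Compute \( \mathbf{e}_r(\theta) \times \mathbf{e}_\theta(\theta) \) (viewing \( \mathbf{e}_r(\theta) \) and \( \mathbf{e}_\theta(\theta) \) as vectors in three dimensions with zero \( \hat{k} \) components).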
2. For each 1 ≤ i ≤ 5, sketch, in the xy-plane, the point \( (x_i, y_i) \) and the vectors \( \mathbf{e}_r(\theta_i) \) and \( \mathbf{e}_\theta(\theta_i) \). In your sketch of the
4
Let ⟨a, b⟩ be a vector. Let r be the length of ⟨a, b⟩ and θ be the angle between ⟨a, b⟩ and the x-axis.
1. Express a and b in terms of r and θ.
2. Let ⟨A, B⟩ be the vector gotten by rotating ⟨a, b⟩ by an angle φ about its tail. Express A and B in terms of a, b and φ.
5
For each of the regions R sketched below, express \( \iint_R f(x, y)\, dx\, dy \) as an iterated integral in polar coordinates in two different ways.
6
Sketch the domain of integration in the xy-plane for each of the following polar coordinate integrals.
π
2
4
4 s in θ+cos θ
3. ∫
√cos 2 θ+9 s in2 θ
Stage 2
7
1. \( \iint_S (x + y)\, dx\, dy \) where S is the region in the first quadrant lying inside the disc \( x^2 + y^2 \le a^2 \) and under the line \( y = \sqrt{3}\, x \).
3. \( \iint_T (x^2 + y^2)\, dx\, dy \) where T is the triangle with vertices (0, 0), (1, 0) and (1, 1).
4. \( \iint_{x^2 + y^2 \le 1} \ln(x^2 + y^2)\, dx\, dy \)
8
9
10
11 ✳
Consider the region E in 3-dimensions specified by the inequalities \( x^2 + y^2 \le 2y \) and \( 0 \le z \le \sqrt{x^2 + y^2} \).
1. Draw a reasonably accurate picture of E in 3-dimensions. Be sure to show the units on the coordinate axes.
2. Use polar coordinates to find the volume of E. Note that you will be “using polar coordinates” if you solve this problem by
means of cylindrical coordinates.
12 ✳
2 2
∫ ∫ (x +y ) 2
dy dx
x=0 y=0
13 ✳
1. Sketch the region L (in the first quadrant of the xy-plane) with boundary curves
\[ x^2 + y^2 = 2, \quad x^2 + y^2 = 4, \quad y = x, \quad y = 0. \]
The mass of a thin lamina with a density function ρ(x, y) over the region L is given by
M =∬ ρ(x, y) dA
L
14 ✳
Evaluate \( \iint_{\mathbb{R}^2} \frac{1}{(1 + x^2 + y^2)^2}\, dA \).
15 ✳
16 ✳
17 ✳
Let D be the region in the xy-plane bounded on the left by the line x = 2 and on the right by the circle \( x^2 + y^2 = 16 \). Evaluate
\[ \iint_D (x^2 + y^2)^{-3/2}\, dA \]
18 ✳
19 ✳
Let D be the shaded region in the diagram. Find the average distance of points in D from the origin. You may use that
\( \int \cos^n(x)\, dx = \frac{\cos^{n-1}(x) \sin(x)}{n} + \frac{n-1}{n} \int \cos^{n-2}(x)\, dx \) for all natural numbers n ≥ 2.
Stage 3
20 ✳
2 2
x +y ≤1
0 ≤ x ≤ 2y
y ≤ 2x
21 ✳
Consider
\[ J = \int_0^{\sqrt{2}} \int_y^{\sqrt{4 - y^2}} \frac{y}{x}\, e^{x^2 + y^2}\, dx\, dy \]
22
Find the volume of the region in the first octant below the paraboloid
\[ z = 1 - \frac{x^2}{a^2} - \frac{y^2}{b^2} \]
23
A symmetrical coffee percolator holds 24 cups when full. The interior has a circular cross-section which tapers from a radius of
3' at the centre to 2' at the base and top, which are 12' apart. The bounding surface is parabolic. Where should the mark
indicating the 6 cup level be placed?
24 ✳
1. Compute the volume under S and above the disk x² + y² ≤ 9 in the xy-plane.
Sketch R and express the volume as a single iterated integral with the order of integration reversed. Do not compute either
integral in part (b).
1. The “golden hammer” (also known as Maslow's hammer and as the law of the instrument) refers to a tendency to always use the
same tool, even when it isn't the best tool for the job. It is just as bad in mathematics as it is in carpentry.
2. In the mathematical literature, the angular coordinate is usually denoted θ, as we do here. The symbol ϕ is also often used for
the angular coordinate. In fact there is an ISO standard (#80000 – 2) which specifies that ϕ should be used in the natural
sciences and in technology. See Appendix A.7.
3. or anti-clockwise or widdershins. Yes, widdershins is a real word, though the Oxford English Dictionary lists its frequency of
usage as between 0.01 and 0.1 times per million words. Of course both “counter-clockwise” and “anti-clockwise” assume that
your clock is not a sundial in the southern hemisphere.
4. Well, a mathematician's heart. The name “cardioid” comes from the Greek word καρδια (which anglicizes to kardia) for heart.
5. The name rhodenea first appeared in the 1728 publication Flores geometrici of the Italian monk, theologian, mathematician and
engineer, Guido Grandi (1671– 1742).
6. Recall that θ has to be measured in radians for this to be true.
7. “Handwaving” is sometimes used as a pejorative to refer to an argument that lacks substance. Here we are just using it to
indicate that we have left out a bunch of technical details. In mathematics, “nose-following” is sometimes used as the polar
opposite of handwaving. It refers to a very narrow, mechanical, line of reasoning.
8. There is a pie/pi/pye pun in there somewhere.
9. The solution is attributed to the French mathematician Siméon Denis Poisson (1781 – 1840) and was published in the textbook Cours d'Analyse de l'École polytechnique by Jacob Karl Franz Sturm (1803 – 1855).
10. On the other hand it is the core of the function \( \text{erf}(z) = \frac{2}{\sqrt{\pi}} \int_0^z e^{-t^2}\, dt \), which gives Gaussian (i.e. bell curve) probabilities.
This page titled 3.2: Double Integrals in Polar Coordinates is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or
curated by Joel Feldman, Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts
platform; a detailed edit history is available upon request.
3.3: Applications of Double Integrals
Double integrals are useful for more than just computing areas and volumes. Here are a few other applications that lead to double
integrals.
Averages
In Section 2.2 of the CLP-2 text, we defined the average value of a function of one variable. We'll now extend that discussion to
functions of two variables. First, we recall the definition of the average of a finite set of numbers.
Definition 3.3.1
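The average (mean) of a set of n numbers f1 , f2 , ⋯ , fn is
\[ \bar{f} = \langle f \rangle = \frac{f_1 + f_2 + \cdots + f_n}{n} \]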
The notations f¯ and ⟨f ⟩ are both commonly used to represent the average.
Now suppose that we want to take the average of a function f (x, y) with (x, y) running continuously over some region R in the
xy-plane. A natural approach to defining what we mean by the average value of f over R is to
n
1
n
. This can be done by, for
example, subdividing vertical strips into tiny squares, like in Example 3.1.11.
Name the squares (in any fixed order) \( R_1, R_2, \cdots, R_N \), where N is the total number of squares.
\[ = \frac{\sum_{i=1}^N f(x_i^*, y_i^*)\, \Delta x\, \Delta y}{\sum_{i=1}^N \Delta x\, \Delta y} \]
define
Definition 3.3.2
Let f (x, y) be an integrable function defined on region R in the xy-plane. The average value of f on R is
\[ \bar{f} = \langle f \rangle = \frac{\iint_R f(x, y)\, dx\, dy}{\iint_R dx\, dy} \]
3.3.1 https://math.libretexts.org/@go/page/89217
Solution
By Definition 3.3.2 the average height is
\[ \bar{z} = \frac{\iint_R z(x, y)\, dx\, dy}{\iint_R dx\, dy} = \frac{\iint_R \sqrt{a^2 - x^2 - y^2}\, dx\, dy}{\iint_R dx\, dy} \]
The integrals in both the numerator and denominator are easily evaluated by interpreting them geometrically.
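The numerator \( \iint_R z(x, y)\, dx\, dy = \iint_R \sqrt{a^2 - x^2 - y^2}\, dx\, dy \) can be interpreted as the volume of
\[
\begin{aligned}
&\{ (x, y, z) \mid x^2 + y^2 \le a^2,\ x \le 0,\ 0 \le z \le \sqrt{a^2 - x^2 - y^2} \} \\
&\qquad = \{ (x, y, z) \mid x^2 + y^2 + z^2 \le a^2,\ x \le 0,\ z \ge 0 \}
\end{aligned}
\]
which is one quarter of the interior of a sphere of radius a. So the numerator is \( \frac{1}{3}\pi a^3 \).
The denominator \( \iint_R dx\, dy \) is the area of one half of a circular disk of radius a. So the denominator is \( \frac{1}{2}\pi a^2 \).
All together, the average height is
\[ \bar{z} = \frac{\frac{1}{3}\pi a^3}{\frac{1}{2}\pi a^2} = \frac{2}{3}\, a \]
Notice that this number is bigger than zero and less than the maximum height, which is a. That makes sense.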
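As a cross-check that does not rely on the geometric interpretation, here is a minimal sketch that computes the ratio in Definition 3.3.2 with midpoint sums in polar coordinates. It assumes the half disk is described by 0 ≤ r ≤ a, π/2 ≤ θ ≤ 3π/2, on which the height is √(a² − r²). The function name and grid size are illustrative.

```python
import math

def average_height(a, n=400):
    """Average of z = sqrt(a^2 - r^2) over the half disk r <= a, pi/2 <= theta <= 3*pi/2."""
    d_theta = math.pi / n
    d_r = a / n
    numerator = 0.0     # approximates the integral of z over R
    denominator = 0.0   # approximates the area of R
    for i in range(n):          # the integrand happens not to depend on theta
        for j in range(n):
            r = (j + 0.5) * d_r
            dA = r * d_r * d_theta
            numerator += math.sqrt(a * a - r * r) * dA
            denominator += dA
    return numerator / denominator

print(average_height(3.0), 2 * 3.0 / 3)
```

The computed average should be close to two thirds of the radius, in line with the ratio of the quarter-ball volume to the half-disk area.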
This last example was relatively easy because we could reinterpret the integrals as geometric quantities. For practice, let's go
back and evaluate the numerator \( \iint_R \sqrt{a^2 - x^2 - y^2}\, dx\, dy \) of Example 3.3.3 as an iterated integral.
Note that, in the inside integral \( \int_{-\sqrt{a^2 - y^2}}^0 dx\, \sqrt{a^2 - x^2 - y^2} \), the variable y is treated as a constant, so that the integrand \( \sqrt{a^2 - y^2 - x^2} = \sqrt{C^2 - x^2} \) with C being the constant \( \sqrt{a^2 - y^2} \). The standard protocol for evaluating this integral uses
the trigonometric substitution
\[ x = C\sin\theta \quad\text{with } -\frac{\pi}{2} \le \theta \le \frac{\pi}{2} \qquad dx = C\cos\theta\, d\theta \]
Trigonometric substitution was discussed in detail in Section 1.9 in the CLP-2 text. Since
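\[
\begin{aligned}
x = 0 &\implies C\sin\theta = 0 \implies \theta = 0 \\
x = -\sqrt{a^2 - y^2} = -C &\implies C\sin\theta = -C \implies \theta = -\frac{\pi}{2}
\end{aligned}
\]
and
\[ \sqrt{a^2 - x^2 - y^2} = \sqrt{C^2 - C^2\sin^2\theta} = C\cos\theta \]
the inner integral is
\[
\begin{aligned}
\int_{-\sqrt{a^2 - y^2}}^0 dx\, \sqrt{a^2 - x^2 - y^2} &= C^2 \int_{-\pi/2}^0 \cos^2\theta\, d\theta \\
&= C^2 \int_{-\pi/2}^0 \frac{1 + \cos(2\theta)}{2}\, d\theta = C^2 \left[ \frac{\theta + \frac{\sin(2\theta)}{2}}{2} \right]_{-\pi/2}^0 \\
&= \frac{\pi C^2}{4} = \frac{\pi}{4} (a^2 - y^2)
\end{aligned}
\]
Integrating over y, which runs from −a to a on R, gives
\[
\begin{aligned}
\iint_R \sqrt{a^2 - x^2 - y^2}\, dx\, dy &= \frac{\pi}{4} \int_{-a}^a (a^2 - y^2)\, dy \\
&= \frac{\pi}{2} \left[ a^3 - \frac{a^3}{3} \right] \\
&= \frac{1}{3}\pi a^3
\end{aligned}
\]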
Remark 3.3.5
We remark that there is an efficient, sneaky, way to evaluate definite integrals like \( \int_{-\pi/2}^0 \cos^2\theta\, d\theta \). Looking at the figures
we see that
\[ \int_{-\pi/2}^0 \cos^2\theta\, d\theta = \int_{-\pi/2}^0 \sin^2\theta\, d\theta \]
Thus
\[
\begin{aligned}
\int_{-\pi/2}^0 \cos^2\theta\, d\theta = \int_{-\pi/2}^0 \sin^2\theta\, d\theta &= \int_{-\pi/2}^0 \frac{1}{2} \big[ \sin^2\theta + \cos^2\theta \big]\, d\theta = \frac{1}{2} \int_{-\pi/2}^0 d\theta \\
&= \frac{\pi}{4}
\end{aligned}
\]
It is not at all unusual to want to find the average value of some function f (x, y) with (x, y) running over some region R, but to
also want some (x, y)'s to play a greater role in determining the average than other (x, y)'s. One common way to do so is to create
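a “weight function” w(x, y) > 0 with \( \frac{w(x_1, y_1)}{w(x_2, y_2)} \) giving the relative importance of \( (x_1, y_1) \) and \( (x_2, y_2) \). That is, \( (x_1, y_1) \) is \( \frac{w(x_1, y_1)}{w(x_2, y_2)} \) times as important as \( (x_2, y_2) \).
Definition 3.3.6
\[ \frac{\iint_R f(x, y)\, w(x, y)\, dx\, dy}{\iint_R w(x, y)\, dx\, dy} \]
is called the weighted average of f over R with weight w(x, y).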
Note that if f (x, y) = F , a constant, then the weighted average of f is just F , just as you would want.
Centre of Mass
One important example of a weighted average is the centre of mass. If you support a body at its centre of mass (in a uniform
gravitational field) it balances perfectly. That's the definition of the centre of mass of the body. In Section 2.3 of the CLP-2 text, we
found that the centre of mass of a body that consists of mass distributed continuously along a straight line, with mass density
ρ(x) kg/m and with x running from a to b, is at
\[ \bar{x} = \frac{\int_a^b x\, \rho(x)\, dx}{\int_a^b \rho(x)\, dx} \]
That is, the centre of mass is at the average of the x-coordinate weighted by the mass density.
In two dimensions, the centre of mass of a plate that covers the region R in the xy -plane and that has mass density ρ(x, y) is the
point (x̄, ȳ ) where
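\[
\begin{aligned}
\bar{x} &= \frac{\iint_R x\, \rho(x, y)\, dx\, dy}{\iint_R \rho(x, y)\, dx\, dy} = \frac{\iint_R x\, \rho(x, y)\, dx\, dy}{\text{Mass}(R)} \\
\bar{y} &= \frac{\iint_R y\, \rho(x, y)\, dx\, dy}{\iint_R \rho(x, y)\, dx\, dy} = \frac{\iint_R y\, \rho(x, y)\, dx\, dy}{\text{Mass}(R)}
\end{aligned}
\]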
If the mass density is a constant, the centre of mass is also called the centroid, and is the geometric centre of R. In this case
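\[
\begin{aligned}
\bar{x} &= \frac{\iint_R x\, dx\, dy}{\iint_R dx\, dy} = \frac{\iint_R x\, dx\, dy}{\text{Area}(R)} \\
\bar{y} &= \frac{\iint_R y\, dx\, dy}{\iint_R dx\, dy} = \frac{\iint_R y\, dx\, dy}{\text{Area}(R)}
\end{aligned}
\]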
In Section 2.3 of the CLP-2 text, we did not have access to multivariable integrals, so we used some physical intuition to derive
that the centroid of a body that fills the region
\[ R = \{ (x, y) \mid a \le x \le b,\ B(x) \le y \le T(x) \} \]
in the xy-plane is (x̄, ȳ ) where
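\[
\begin{aligned}
\bar{x} &= \frac{\int_a^b x\, [T(x) - B(x)]\, dx}{A} \\
\bar{y} &= \frac{\int_a^b [T(x)^2 - B(x)^2]\, dx}{2A}
\end{aligned}
\]
and \( A = \int_a^b [T(x) - B(x)]\, dx \) is the area of R. Now that we do have access to multivariable integrals, we can derive these formulae directly. For example,
\[ \bar{y} = \frac{1}{A} \iint_R y\, dx\, dy = \frac{1}{A} \int_a^b dx \int_{B(x)}^{T(x)} dy\; y = \frac{1}{A} \int_a^b dx \left[ \frac{T(x)^2}{2} - \frac{B(x)^2}{2} \right] \]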
just as desired.
In Example 2.3.4 of the CLP-2 text, we found the centroid of the quarter circular disk
\[ D = \{ (x, y) \mid x \ge 0,\ y \ge 0,\ x^2 + y^2 \le r^2 \} \]
by using the formulae of the last example. We'll now find it again using 3.3.8.
Since the area of D is \( \frac{1}{4}\pi r^2 \), we have
\[ \bar{x} = \frac{\iint_D x\, dx\, dy}{\frac{1}{4}\pi r^2} \qquad \bar{y} = \frac{\iint_D y\, dx\, dy}{\frac{1}{4}\pi r^2} \]
We'll evaluate \( \iint_D x\, dx\, dy \) by using horizontal slices, as in the figure on the left below.
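From that figure, we see that
y runs from 0 to r and
for each y in that range, x runs from 0 to \( \sqrt{r^2 - y^2} \).
So
\[
\begin{aligned}
\iint_D x\, dx\, dy &= \int_0^r dy \int_0^{\sqrt{r^2 - y^2}} dx\; x = \int_0^r dy \left[ \frac{x^2}{2} \right]_0^{\sqrt{r^2 - y^2}} \\
&= \frac{1}{2} \int_0^r dy\, \big[ r^2 - y^2 \big] = \frac{1}{2} \left[ r^3 - \frac{r^3}{3} \right] \\
&= \frac{r^3}{3}
\end{aligned}
\]
and
\[ \bar{x} = \frac{4}{\pi r^2} \left[ \frac{r^3}{3} \right] = \frac{4r}{3\pi} \]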
This is the same answer as we got in Example 2.3.4 of the CLP-2 text. But because we were able to use horizontal slices, the
integral in this example was a little easier to evaluate than the integral in CLP-2. Had we used vertical slices, we would have
ended up with exactly the integral of CLP-2.
By symmetry, we should have ȳ = x̄. We'll check that by evaluating \( \iint_D y\, dx\, dy \) by using vertical slices, as in the figure on the right above. From that figure, we see that
x runs from 0 to r and
for each x in that range, y runs from 0 to \( \sqrt{r^2 - x^2} \).
So
\[ \iint_D y\, dx\, dy = \int_0^r dx \int_0^{\sqrt{r^2 - x^2}} dy\; y = \frac{1}{2} \int_0^r dx\, \big[ r^2 - x^2 \big] \]
This is exactly the integral \( \frac{1}{2} \int_0^r dy\, [r^2 - y^2] \) that we evaluated above, with y renamed to x. So \( \iint_D y\, dx\, dy = \frac{r^3}{3} \) too and
\[ \bar{y} = \frac{4}{\pi r^2} \left[ \frac{r^3}{3} \right] = \frac{4r}{3\pi} = \bar{x} \]
as expected.
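The centroid formulae 3.3.8 are easy to approximate for any region given in polar form. The following minimal sketch, with a hypothetical helper `centroid_polar`, sums x dA, y dA and dA over small polar rectangles (dA = r dr dθ) and applies it to the quarter disk.

```python
import math

def centroid_polar(r_max, theta_min, theta_max, n=400):
    """Centroid of {0 <= r <= r_max(theta), theta_min <= theta <= theta_max} via midpoint sums."""
    area = moment_x = moment_y = 0.0
    d_theta = (theta_max - theta_min) / n
    for i in range(n):
        theta = theta_min + (i + 0.5) * d_theta
        d_r = r_max(theta) / n
        for j in range(n):
            r = (j + 0.5) * d_r
            dA = r * d_r * d_theta            # area of one approximate rectangle
            area += dA
            moment_x += r * math.cos(theta) * dA
            moment_y += r * math.sin(theta) * dA
    return moment_x / area, moment_y / area

radius = 3.0
x_bar, y_bar = centroid_polar(lambda theta: radius, 0.0, math.pi / 2)
print(x_bar, y_bar, 4 * radius / (3 * math.pi))
```

Both coordinates should come out close to 4r/(3π), matching the two slice computations above.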
From the sketch, we see that R is symmetric about the x-axis. So we expect that its centroid, (x̄, ȳ ), has ȳ = 0. To see this from the integral definition, note that the integral \( \iint_R y\, dx\, dy \)
has domain of integration, namely R, invariant under y → −y (i.e. under reflection in the x-axis), and
has integrand, namely y, that is odd under y → −y.
So \( \iint_R y\, dx\, dy = 0 \) and consequently ȳ = 0.
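\[ \bar{x} = \frac{\iint_R x\, dx\, dy}{\iint_R dx\, dy} \]
With R1 denoting the upper half of R, as in Example 3.2.14, and using polar coordinates,
\[
\begin{aligned}
\iint_{R_1} x\, dx\, dy &= \int_0^{\pi/3} d\theta \int_0^{1/\cos\theta} dr\; r\, (r\cos\theta) + \int_{\pi/3}^{\pi/2} d\theta \int_0^{4\cos\theta} dr\; r\, (r\cos\theta) \\
&= \int_0^{\pi/3} d\theta\; \frac{\sec^2\theta}{3} + \int_{\pi/3}^{\pi/2} d\theta\; \frac{64}{3} \cos^4\theta
\end{aligned}
\]
The first integral is easy, provided we remember that tan θ is an antiderivative for sec²θ. For the second integral, we'll need the double angle formula \( \cos^2\theta = \frac{1 + \cos(2\theta)}{2} \):
\[
\begin{aligned}
\cos^4\theta = \big( \cos^2\theta \big)^2 &= \left[ \frac{1 + \cos(2\theta)}{2} \right]^2 = \frac{1}{4} \big[ 1 + 2\cos(2\theta) + \cos^2(2\theta) \big] \\
&= \frac{1}{4} \left[ 1 + 2\cos(2\theta) + \frac{1 + \cos(4\theta)}{2} \right] \\
&= \frac{3}{8} + \frac{\cos(2\theta)}{2} + \frac{\cos(4\theta)}{8}
\end{aligned}
\]
so
\[
\begin{aligned}
\iint_{R_1} x\, dx\, dy &= \frac{1}{3} \tan\theta \Big|_0^{\pi/3} + \frac{64}{3} \left[ \frac{3\theta}{8} + \frac{\sin(2\theta)}{4} + \frac{\sin(4\theta)}{32} \right]_{\pi/3}^{\pi/2} \\
&= \frac{1}{3} \times \sqrt{3} + \frac{64}{3} \left[ \frac{3}{8} \times \frac{\pi}{6} - \frac{\sqrt{3}}{4 \times 2} + \frac{\sqrt{3}}{32 \times 2} \right] \\
&= \frac{4\pi}{3} - 2\sqrt{3}
\end{aligned}
\]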
has domain of integration, namely R, invariant under y → −y (i.e. under reflection in the x-axis), and
has integrand, namely x, that is even under y → −y.
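So \( \iint_R x\, dx\, dy = 2 \iint_{R_1} x\, dx\, dy \) and, all together,
\[ \bar{x} = \frac{2 \big( \frac{4\pi}{3} - 2\sqrt{3} \big)}{\frac{4\pi}{3} - \sqrt{3}} = \frac{\frac{8\pi}{3} - 4\sqrt{3}}{\frac{4\pi}{3} - \sqrt{3}} \approx 0.59 \]
As a check, note that 0 ≤ x ≤ 1 on R and more of R is closer to x = 1 than to x = 0. So it makes sense that x̄ is between \( \frac{1}{2} \) and 1.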
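An independent check of both the area and x̄ is to avoid polar coordinates altogether. Solving (x − 2)² + y² ≤ 4 for y shows that, for each 0 ≤ x ≤ 1, the region contains a vertical strip of height 2√(4x − x²). The sketch below, with illustrative names, sums those strips with the midpoint rule.

```python
import math

def strip_integrals(n=200_000):
    """Area of R and the integral of x over R, using vertical strips of height 2*sqrt(4x - x^2)."""
    dx = 1.0 / n
    area = 0.0
    moment = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        height = 2.0 * math.sqrt(4.0 * x - x * x)
        area += height * dx
        moment += x * height * dx
    return area, moment

area, moment = strip_integrals()
x_bar = moment / area
print(area, 4 * math.pi / 3 - math.sqrt(3))
print(moment, 8 * math.pi / 3 - 4 * math.sqrt(3))
print(x_bar)
```

The strip sums should reproduce the polar-coordinate values for the area and for the integral of x, and hence the value x̄ ≈ 0.59.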
Example 3.3.12. Reverse Centre of Mass
Evaluate \( \int_0^2 \int_{-\sqrt{2x - x^2}}^{\sqrt{2x - x^2}} (2x + 3y)\, dy\, dx \).
Solution
This is another integral that can be evaluated without using any calculus at all. This time by relating it to a centre of mass. By
3.3.8,
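\[ \iint_R x\, dx\, dy = \bar{x}\, \text{Area}(R) \qquad \iint_R y\, dx\, dy = \bar{y}\, \text{Area}(R) \]
The integral of interest is
\[ \int_0^2 dx \left[ \int_{-\sqrt{2x - x^2}}^{\sqrt{2x - x^2}} dy\, (2x + 3y) \right] = 2 \int_0^2 dx \left[ \int_{-\sqrt{2x - x^2}}^{\sqrt{2x - x^2}} dy\; x \right] + 3 \int_0^2 dx \left[ \int_{-\sqrt{2x - x^2}}^{\sqrt{2x - x^2}} dy\; y \right] \]
Observe that \( y = \pm\sqrt{2x - x^2} \) is equivalent to
\[ y^2 = 2x - x^2 = 1 - (x - 1)^2 \iff (x - 1)^2 + y^2 = 1 \]
So the domain of integration R is the disk of radius 1 centred on (1, 0), which has area π and centroid (x̄, ȳ ) = (1, 0), and the integral is
\[ 2\, \bar{x}\, \text{Area}(R) + 3\, \bar{y}\, \text{Area}(R) = 2\pi \]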
Moment of Inertia
Consider a plate that fills the region R in the xy-plane, that has mass density ρ(x, y) kg/m², and that is rotating at ω rad/s about some axis. Let's call the axis of rotation A. We are now going to determine the kinetic energy of that plate. Recall 2 that, by definition, the kinetic energy of a point particle of mass m that is moving with speed v is \( \frac{1}{2} m v^2 \).
To get the kinetic energy of the entire plate, cut it up into tiny rectangles 3, say of size dx × dy. Think of each rectangle as being
(essentially) a point particle. If the point (x, y) on the plate is a distance D(x, y) from the axis of rotation A, then as the plate
rotates, the point (x, y) sweeps out a circle of radius D(x, y). The figure on the right below shows that circle as seen from high up
on the axis of rotation.
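The circular arc that the point (x, y) sweeps out in one second subtends the angle ω radians, which is the fraction \( \frac{\omega}{2\pi} \) of a full circle and so has length \( \frac{\omega}{2\pi} \big( 2\pi D(x, y) \big) = \omega\, D(x, y) \). Consequently the rectangle that contains the point (x, y) has speed ω D(x, y), mass (essentially) ρ(x, y) dx dy, and kinetic energy (essentially) \( \frac{1}{2} \big( \rho(x, y)\, dx\, dy \big) \big( \omega\, D(x, y) \big)^2 = \frac{1}{2} \omega^2 D(x, y)^2 \rho(x, y)\, dx\, dy \).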
So (via our usual Riemann sum limit procedure) the kinetic energy of R is
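\[ \iint_R \frac{1}{2} \omega^2 D(x, y)^2 \rho(x, y)\, dx\, dy = \frac{1}{2} \omega^2 \iint_R D(x, y)^2 \rho(x, y)\, dx\, dy = \frac{1}{2} I_A\, \omega^2 \]
where
\[ I_A = \iint_R D(x, y)^2 \rho(x, y)\, dx\, dy \]
is called the moment of inertia of R about the axis A. In particular the moment of inertia of R about the y-axis is
\[ I_y = \iint_R x^2 \rho(x, y)\, dx\, dy \]
and the moment of inertia of R about the x-axis is
\[ I_x = \iint_R y^2 \rho(x, y)\, dx\, dy \]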
Notice that the expression \( \frac{1}{2} I_A\, \omega^2 \) for the kinetic energy has a very similar form to \( \frac{1}{2} m v^2 \), just with the velocity v replaced by the angular velocity ω, and with the mass m replaced by \( I_A \), which can be thought of as being a bit like a mass.
So far, we have been assuming that the rotation was taking place in the xy-plane — a two dimensional world. Our analysis extends
naturally to three dimensions, though the resulting integral formulae for the moment of inertia will then be triple integrals, which
we have not yet dealt with. We shall soon do so, but let's first do an example in two dimensions.
Example 3.3.14. Disk
Find the moment of inertia of the interior, R, of the circle x² + y² = a² about the x-axis. Assume that it has density one.
Solution
The distance from any point (x, y) inside the disk to the axis of
rotation (i.e. the x-axis) is |y|. So the moment of inertia of the interior of the disk about the x-axis is
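\[ I_x = \iint_R y^2\, dx\, dy \]
Switching to polar coordinates,
\[
\begin{aligned}
I_x &= \int_0^{2\pi} d\theta \int_0^a dr\; r\, (r\sin\theta)^2 = \int_0^{2\pi} d\theta\, \sin^2\theta \int_0^a dr\; r^3 \\
&= \frac{a^4}{4} \int_0^{2\pi} d\theta\, \sin^2\theta = \frac{a^4}{4} \int_0^{2\pi} d\theta\; \frac{1 - \cos(2\theta)}{2} \\
&= \frac{a^4}{8} \left[ \theta - \frac{\sin(2\theta)}{2} \right]_0^{2\pi} \\
&= \frac{1}{4}\pi a^4
\end{aligned}
\]
For an efficient, sneaky, way to evaluate \( \int_0^{2\pi} \sin^2\theta\, d\theta \), see Remark 3.3.5.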
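The moment of inertia integral can be approximated directly from its definition. This minimal sketch, with an illustrative function name, sums y² dA over small polar rectangles of the disk, taking the density to be one as in the example.

```python
import math

def disk_moment_about_x_axis(a, n=400):
    """Midpoint sum of y^2 dA over the disk of radius a, with y = r*sin(theta), dA = r dr d(theta)."""
    d_theta = 2 * math.pi / n
    d_r = a / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * d_theta
        for j in range(n):
            r = (j + 0.5) * d_r
            y = r * math.sin(theta)
            total += y * y * r * d_r * d_theta
    return total

a = 2.0
print(disk_moment_about_x_axis(a), math.pi * a ** 4 / 4)
```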
Find the moment of inertia of the interior, R, of the cardiod r = a(1 + cos θ) about the z -axis. Assume that the cardiod lies in
the xy-plane and has density one.
Solution
We sketched the cardioid (with a = 1) in Example 3.2.3.
As we said above, the formula for $I_A$ in Definition 3.3.13 is valid even when the axis of rotation is not contained in the xy-plane. We just have to be sure that our D(x, y) really is the distance from (x, y) to the axis of rotation. In this example the axis of rotation is the z-axis, so that $D(x,y)=\sqrt{x^2+y^2}$ and the moment of inertia is
$$I_A = \iint_R (x^2+y^2)\,dx\,dy$$
Switching to polar coordinates, using $dx\,dy = r\,dr\,d\theta$ and $x^2+y^2=r^2$,
$$\begin{aligned}
I_A &= \int_0^{2\pi} d\theta\int_0^{a(1+\cos\theta)} dr\ r\times r^2 = \int_0^{2\pi} d\theta\int_0^{a(1+\cos\theta)} dr\ r^3\\
&= \frac{a^4}{4}\int_0^{2\pi} d\theta\ (1+\cos\theta)^4\\
&= \frac{a^4}{4}\int_0^{2\pi} d\theta\ \big(1+4\cos\theta+6\cos^2\theta+4\cos^3\theta+\cos^4\theta\big)
\end{aligned}$$
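Now
$$\int_0^{2\pi} d\theta\ \cos\theta = \sin\theta\,\Big|_0^{2\pi} = 0$$
$$\int_0^{2\pi} d\theta\ \cos^2\theta = \int_0^{2\pi} d\theta\ \frac{1+\cos(2\theta)}{2} = \frac12\Big[\theta+\frac{\sin(2\theta)}{2}\Big]_0^{2\pi} = \pi$$
$$\int_0^{2\pi} d\theta\ \cos^3\theta = \int_0^{2\pi} d\theta\ \cos\theta\,\big[1-\sin^2\theta\big] \overset{u=\sin\theta}{=} \int_0^0 du\ (1-u^2) = 0$$
To integrate $\cos^4\theta$, we use the double angle formula
$$\begin{aligned}
\cos^2\theta = \frac{\cos(2\theta)+1}{2}
\implies \cos^4\theta &= \frac{\big(\cos(2\theta)+1\big)^2}{4} = \frac{\cos^2(2\theta)+2\cos(2\theta)+1}{4}\\
&= \frac{\frac{\cos(4\theta)+1}{2}+2\cos(2\theta)+1}{4}\\
&= \frac38+\frac12\cos(2\theta)+\frac18\cos(4\theta)
\end{aligned}$$
to give
$$\begin{aligned}
\int_0^{2\pi} d\theta\ \cos^4\theta &= \int_0^{2\pi} d\theta\ \Big[\frac38+\frac12\cos(2\theta)+\frac18\cos(4\theta)\Big]\\
&= \frac38\times 2\pi+\frac12\times 0+\frac18\times 0 = \frac34\pi
\end{aligned}$$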
All together
$$\begin{aligned}
I_A &= \frac{a^4}{4}\Big[2\pi+4\times 0+6\times\pi+4\times 0+\frac34\pi\Big]\\
&= \frac{35}{16}\,\pi a^4
\end{aligned}$$
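The trigonometric bookkeeping above is easy to get wrong, so a quick numerical cross-check can be reassuring. The sketch below evaluates the polar iterated integral directly with nested midpoint rules. The helper `midpoint`, the function `cardioid_moment_about_z` and the sample value a = 1.5 are assumptions made for illustration only.

```python
import math

def midpoint(g, lo, hi, n=200):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

def cardioid_moment_about_z(a):
    """Approximate the integral of r^2 * r dr dtheta over 0 <= r <= a(1 + cos(theta))."""
    def inner(theta):
        # inner integral over r, for one fixed theta
        return midpoint(lambda r: r ** 3, 0.0, a * (1 + math.cos(theta)))
    return midpoint(inner, 0.0, 2 * math.pi)

a = 1.5                                   # sample size parameter (assumption)
approx = cardioid_moment_about_z(a)
exact = 35 * math.pi * a ** 4 / 16        # closed form derived in the example
print(approx, exact)
```

The upper limit of the inner integral depends on θ, which is exactly how the iterated integral in the example is structured.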
Exercises
Stage 1
1
For each of the following, evaluate the given double integral without using iteration. Instead, interpret the integral in terms of,
for example, areas or average values.
1. $\iint_D (x+3)\,dx\,dy$, where D is the half disc $0\le y\le\sqrt{4-x^2}$
2. $\iint_R (x+y)\,dx\,dy$ where R is the rectangle $0\le x\le a$, $0\le y\le b$
Stage 2
2. ✳
Find the centre of mass of the region D in the xy-plane defined by the inequalities $x^2\le y\le 1$, assuming that the mass density function is given by ρ(x, y) = y.
3. ✳
4. ✳
A thin plate of uniform density 1 is bounded by the positive x and y axes and the cardioid $\sqrt{x^2+y^2} = r = 1+\sin\theta$, which is given in polar coordinates. Find the x-coordinate of its centre of mass.
5. ✳
A thin plate of uniform density k is bounded by the positive x and y axes and the circle $x^2+y^2=1$. Find its centre of mass.
6. ✳
Let R be the triangle with vertices (0, 2), (1, 0), and (2, 0). Let R have density $\rho(x,y)=y^2$. Find ȳ, the y-coordinate of the center of mass of R. You do not need to find x̄.
7. ✳
where A(D) is the area of the plane region D. Let D be the unit disk $1\ge x^2+y^2$. Find the average distance of a point in D to the center of D.
8. ✳
A metal crescent is obtained by removing the interior of the circle defined by the equation $x^2+y^2=x$ from the metal plate of constant density 1 occupying the unit disc $x^2+y^2\le 1$.
8
.
9. ✳
Stage 3
10. ✳
Let a, b and c be positive numbers, and let T be the triangle whose vertices are (−a, 0), (b, 0) and (0, c).
1. Assuming that the density is constant on T , find the center of mass of T .
2. The medians of T are the line segments which join a vertex of T to the midpoint of the opposite side. It is a well known
fact that the three medians of any triangle meet at a point, which is known as the centroid of T . Show that the centroid of T
is its centre of mass.
This page titled 3.3: Applications of Double Integrals is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by
Joel Feldman, Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a
detailed edit history is available upon request.
3.4: Surface Area
Suppose that we wish to find the area of part, S, of the surface z = f(x, y). We start by cutting S up into tiny pieces. To do so,
we draw a bunch of curves of constant x (the blue curves in the figure below). Each such curve is the intersection of S with the plane $x = x_0$ for some constant $x_0$. And we also
draw a bunch of curves of constant y (the red curves in the figure below). Each such curve is the intersection of S with the plane $y = y_0$ for some constant $y_0$.
Concentrate on any one of the tiny pieces. Here is a greatly magnified sketch of it, looking at it from above.
We wish to compute its area, which we'll call dS. Now this little piece of surface need not be parallel to the xy-plane, and indeed
need not even be flat. But if the piece is really tiny, it's almost flat. We'll now approximate it by something that is flat, and whose
area we know. To start, we'll determine the corners of the piece. To do so, we first determine the bounding curves of the piece.
Look at the figure above, and recall that, on the surface, z = f(x, y).
The upper blue curve was constructed by holding x fixed at the value $x_0$, and sketching the curve swept out by $x_0\,\hat{\imath}+y\,\hat{\jmath}+f(x_0,y)\,\hat{k}$ as y varied, and
the lower blue curve was constructed by holding x fixed at the slightly larger value $x_0+dx$, and sketching the curve swept out by $(x_0+dx)\,\hat{\imath}+y\,\hat{\jmath}+f(x_0+dx,y)\,\hat{k}$ as y varied.
The red curves were constructed similarly, by holding y fixed and varying x.
So the four intersection points in the figure are
$$\begin{aligned}
P_0 &= x_0\,\hat{\imath}+y_0\,\hat{\jmath}+f(x_0,y_0)\,\hat{k}\\
P_1 &= x_0\,\hat{\imath}+(y_0+dy)\,\hat{\jmath}+f(x_0,y_0+dy)\,\hat{k}\\
P_2 &= (x_0+dx)\,\hat{\imath}+y_0\,\hat{\jmath}+f(x_0+dx,y_0)\,\hat{k}\\
P_3 &= (x_0+dx)\,\hat{\imath}+(y_0+dy)\,\hat{\jmath}+f(x_0+dx,y_0+dy)\,\hat{k}
\end{aligned}$$
Now, for any small constants dX and dY, we have the linear approximation¹
$$f(x_0+dX,y_0+dY)\approx f(x_0,y_0)+\frac{\partial f}{\partial x}(x_0,y_0)\,dX+\frac{\partial f}{\partial y}(x_0,y_0)\,dY$$
Applying this three times, once with dX = 0, dY = dy (to approximate $P_1$), once with dX = dx, dY = 0 (to approximate $P_2$), and once with dX = dx, dY = dy (to approximate $P_3$),
$$\begin{aligned}
P_1 &\approx P_0+dy\,\hat{\jmath}+\frac{\partial f}{\partial y}(x_0,y_0)\,dy\,\hat{k}\\
P_2 &\approx P_0+dx\,\hat{\imath}+\frac{\partial f}{\partial x}(x_0,y_0)\,dx\,\hat{k}\\
P_3 &\approx P_0+dx\,\hat{\imath}+dy\,\hat{\jmath}+\Big[\frac{\partial f}{\partial x}(x_0,y_0)\,dx+\frac{\partial f}{\partial y}(x_0,y_0)\,dy\Big]\hat{k}
\end{aligned}$$
Of course we have only approximated the positions of the corners and so have introduced errors. However, with more work, one can bound those errors (like we did in the optional §3.2.4) and show that in the limit dx, dy → 0, all of the error terms that we dropped contribute exactly 0 to the integral.
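The small piece of our surface with corners $P_0, P_1, P_2, P_3$ is approximately a parallelogram with sides
$$\overrightarrow{P_0P_1}\approx\overrightarrow{P_2P_3}\approx dy\,\hat{\jmath}+\frac{\partial f}{\partial y}(x_0,y_0)\,dy\,\hat{k}$$
$$\overrightarrow{P_0P_2}\approx\overrightarrow{P_1P_3}\approx dx\,\hat{\imath}+\frac{\partial f}{\partial x}(x_0,y_0)\,dx\,\hat{k}$$
The area of a parallelogram is the length of the cross product of two adjacent sides. So the area of our piece is
$$\begin{aligned}
dS = \Big|\overrightarrow{P_0P_1}\times\overrightarrow{P_0P_2}\Big|
\approx \Big|\Big(\hat{\jmath}+\frac{\partial f}{\partial y}(x_0,y_0)\,\hat{k}\Big)\times\Big(\hat{\imath}+\frac{\partial f}{\partial x}(x_0,y_0)\,\hat{k}\Big)\Big|\,dx\,dy
\end{aligned}$$
The cross product is easily evaluated:
$$\Big(\hat{\jmath}+\frac{\partial f}{\partial y}(x_0,y_0)\,\hat{k}\Big)\times\Big(\hat{\imath}+\frac{\partial f}{\partial x}(x_0,y_0)\,\hat{k}\Big) = f_x(x_0,y_0)\,\hat{\imath}+f_y(x_0,y_0)\,\hat{\jmath}-\hat{k}$$
as is its length:
$$\Big|\Big(\hat{\jmath}+\frac{\partial f}{\partial y}(x_0,y_0)\,\hat{k}\Big)\times\Big(\hat{\imath}+\frac{\partial f}{\partial x}(x_0,y_0)\,\hat{k}\Big)\Big| = \sqrt{1+f_x(x_0,y_0)^2+f_y(x_0,y_0)^2}$$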
Throughout this computation, $x_0$ and $y_0$ were arbitrary. So we have found the area of each tiny piece of the surface S.
Equation 3.4.1
For the surface z = f(x, y),
$$dS = \sqrt{1+f_x(x,y)^2+f_y(x,y)^2}\ dx\,dy$$
Similarly, for the surface x = g(y, z),
$$dS = \sqrt{1+g_y(y,z)^2+g_z(y,z)^2}\ dy\,dz$$
and for the surface y = h(x, z),
$$dS = \sqrt{1+h_x(x,z)^2+h_z(x,z)^2}\ dx\,dz$$
Consequently, we have
Theorem 3.4.2
1. The area of the part of the surface z = f(x, y) with (x, y) running over the region D in the xy-plane is
$$\iint_D\sqrt{1+f_x(x,y)^2+f_y(x,y)^2}\ dx\,dy$$
2. The area of the part of the surface x = g(y, z) with (y, z) running over the region D in the yz-plane is
$$\iint_D\sqrt{1+g_y(y,z)^2+g_z(y,z)^2}\ dy\,dz$$
3. The area of the part of the surface y = h(x, z) with (x, z) running over the region D in the xz-plane is
$$\iint_D\sqrt{1+h_x(x,z)^2+h_z(x,z)^2}\ dx\,dz$$
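To see the first formula of the theorem in action, here is a minimal numerical sketch for a surface z = f(x, y) over a rectangle. It is only an illustration under stated assumptions: the function name `surface_area`, the central-difference step `eps` and the test surface (the plane z = 2x + 3y over the unit square, for which the integrand is the constant $\sqrt{1+2^2+3^2}=\sqrt{14}$) are choices made here, not part of the text.

```python
import math

def surface_area(f, x0, x1, y0, y1, n=100, eps=1e-6):
    """Midpoint Riemann sum for the integral of sqrt(1 + f_x^2 + f_y^2)
    over the rectangle [x0, x1] x [y0, y1]. The partial derivatives are
    estimated with central differences of step eps."""
    hx = (x1 - x0) / n
    hy = (y1 - y0) / n
    total = 0.0
    for i in range(n):
        x = x0 + (i + 0.5) * hx
        for j in range(n):
            y = y0 + (j + 0.5) * hy
            fx = (f(x + eps, y) - f(x - eps, y)) / (2 * eps)
            fy = (f(x, y + eps) - f(x, y - eps)) / (2 * eps)
            total += math.sqrt(1 + fx * fx + fy * fy) * hx * hy
    return total

# A tilted plane over the unit square: the exact area is sqrt(14).
plane_area = surface_area(lambda x, y: 2 * x + 3 * y, 0.0, 1.0, 0.0, 1.0)
print(plane_area, math.sqrt(14))
```

A flat surface (f constant) gives back the area of the rectangle itself, which is a useful second check that the integrand reduces to 1 when both partial derivatives vanish.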
Note that $z=\sqrt{x^2+y^2}$ is the side of the cone. It does not include the top.
To find its area, we will apply 3.4.1 to
$$z=f(x,y)=\sqrt{x^2+y^2}\qquad\text{with } (x,y) \text{ running over } x^2+y^2\le a^2$$
Since
$$f_x(x,y)=\frac{x}{\sqrt{x^2+y^2}}\qquad f_y(x,y)=\frac{y}{\sqrt{x^2+y^2}}$$
we have
$$\begin{aligned}
dS &= \sqrt{1+f_x(x,y)^2+f_y(x,y)^2}\ dx\,dy\\
&= \sqrt{1+\Big(\frac{x}{\sqrt{x^2+y^2}}\Big)^2+\Big(\frac{y}{\sqrt{x^2+y^2}}\Big)^2}\ dx\,dy\\
&= \sqrt{1+\frac{x^2+y^2}{x^2+y^2}}\ dx\,dy\\
&= \sqrt2\ dx\,dy
\end{aligned}$$
So
$$\text{Area}=\iint_{x^2+y^2\le a^2}\sqrt2\ dx\,dy=\sqrt2\iint_{x^2+y^2\le a^2}dx\,dy=\sqrt2\,\pi a^2$$
because $\iint_{x^2+y^2\le a^2}dx\,dy$ is exactly the area of a circular disk of radius a.
Solution
The intersection of $x^2+z^2=a^2$ with any plane of constant y is the circle of radius a centred on x = z = 0. So S is a bunch of circles stacked sideways. It is a cylinder on its side (with both ends open). By symmetry, the area of S is four times the area of the part of S that is in the first octant, which is
$$S_1=\Big\{(x,y,z)\ \Big|\ z=f(x,y)=\sqrt{a^2-x^2},\ 0\le x\le a,\ 0\le y\le b\Big\}$$
Since
$$f_x(x,y)=-\frac{x}{\sqrt{a^2-x^2}}\qquad f_y(x,y)=0$$
we have
$$\begin{aligned}
dS &= \sqrt{1+f_x(x,y)^2+f_y(x,y)^2}\ dx\,dy\\
&= \sqrt{1+\Big(-\frac{x}{\sqrt{a^2-x^2}}\Big)^2}\ dx\,dy\\
&= \sqrt{1+\frac{x^2}{a^2-x^2}}\ dx\,dy\\
&= \frac{a}{\sqrt{a^2-x^2}}\ dx\,dy
\end{aligned}$$
So
$$\text{Area}(S_1)=\int_0^a dx\int_0^b dy\ \frac{a}{\sqrt{a^2-x^2}}=ab\int_0^a dx\ \frac{1}{\sqrt{a^2-x^2}}$$
The indefinite integral of $\frac{1}{\sqrt{a^2-x^2}}$ is $\arcsin\frac{x}{a}+C$. (See the table of integrals in Appendix A.4. Alternatively, use the trig substitution x = a sin θ.) So
$$\text{Area}(S_1)=ab\Big[\arcsin\frac{x}{a}\Big]_0^a=ab\big[\arcsin 1-\arcsin 0\big]=\frac{\pi}{2}\,ab$$
and
$$\text{Area}(S)=4\,\text{Area}(S_1)=2\pi ab$$
We could have also come to this conclusion by using a little geometry, rather than using calculus. Cut open the cylinder by
cutting along a line parallel to the y -axis, and then flatten out the cylinder. This gives a rectangle. One side of the rectangle is
just a circle of radius a, straightened out. So the rectangle has sides of lengths 2πa and b and has area 2πab.
Example 3.4.5. Area of a hemisphere
(with a > 0). You probably know, from high school, that the answer is $\frac12\times4\pi a^2=2\pi a^2$. But you have probably not seen a derivation³ of this answer. Note that, since $x^2+y^2=a^2-z^2$ on the hemisphere, the set of (x, y)'s for which there is a z with (x, y, z) on the hemisphere is exactly the disk $x^2+y^2\le a^2$. So the hemisphere is
$$S=\Big\{(x,y,z)\ \Big|\ z=\sqrt{a^2-x^2-y^2},\ x^2+y^2\le a^2\Big\}$$
and, by 3.4.1,
$$\begin{aligned}
dS &= \sqrt{1+\Big(\frac{-x}{\sqrt{a^2-x^2-y^2}}\Big)^2+\Big(\frac{-y}{\sqrt{a^2-x^2-y^2}}\Big)^2}\ dx\,dy\\
&= \sqrt{1+\frac{x^2+y^2}{a^2-x^2-y^2}}\ dx\,dy\\
&= \sqrt{\frac{a^2}{a^2-x^2-y^2}}\ dx\,dy
\end{aligned}$$
So the area is $\iint_{x^2+y^2\le a^2}\frac{a}{\sqrt{a^2-x^2-y^2}}\,dx\,dy$. To evaluate this integral, we switch to polar coordinates, substituting x = r cos θ, y = r sin θ. This gives
$$\begin{aligned}
\text{area} &= \iint_{x^2+y^2\le a^2}\frac{a}{\sqrt{a^2-x^2-y^2}}\ dx\,dy=\int_0^a dr\ r\int_0^{2\pi}d\theta\ \frac{a}{\sqrt{a^2-r^2}}\\
&= 2\pi a\int_0^a dr\ \frac{r}{\sqrt{a^2-r^2}}\\
&= 2\pi a\int_{a^2}^0\frac{-du/2}{\sqrt u}\qquad\text{with } u=a^2-r^2,\ du=-2r\,dr\\
&= 2\pi a\Big[-\sqrt u\Big]_{a^2}^0\\
&= 2\pi a^2
\end{aligned}$$
as it should be.
Example 3.4.6
$$\begin{aligned}
dS &= \sqrt{1+f_x(x,y)^2+f_y(x,y)^2}\ dx\,dy\\
&= \sqrt{1+4x^2+4y^2}\ dx\,dy
\end{aligned}$$
The point (x, y, z), with $z=2-x^2-y^2$, lies above the xy-plane if and only if z ≥ 0, or, equivalently, $2-x^2-y^2\ge 0$. So the domain of integration is $\{(x,y)\ |\ x^2+y^2\le 2\}$ and
$$\text{Surface Area}=\iint_{x^2+y^2\le 2}\sqrt{1+4x^2+4y^2}\ dx\,dy$$
Switching to polar coordinates,
$$\begin{aligned}
\text{Surface Area} &= \int_0^{2\pi}d\theta\int_0^{\sqrt2}dr\ r\sqrt{1+4r^2}\\
&= 2\pi\Big[\frac{1}{12}\big(1+4r^2\big)^{3/2}\Big]_0^{\sqrt2}=\frac{\pi}{6}\big[27-1\big]\\
&= \frac{13}{3}\pi
\end{aligned}$$
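The value $\frac{13}{3}\pi$ can be cross-checked numerically. The following minimal sketch applies a midpoint rule to the one-dimensional polar integral $2\pi\int_0^{\sqrt2} r\sqrt{1+4r^2}\,dr$ that arises for the surface $z=2-x^2-y^2$ above the xy-plane. The function name `paraboloid_area` and the number of subintervals are illustrative assumptions.

```python
import math

def paraboloid_area(n=2000):
    """Midpoint rule for 2*pi * integral_0^sqrt(2) of r*sqrt(1 + 4 r^2) dr."""
    r_max = math.sqrt(2.0)
    dr = r_max / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * dr                 # midpoint of the k-th subinterval
        total += r * math.sqrt(1 + 4 * r * r) * dr
    return 2 * math.pi * total

approx = paraboloid_area()
exact = 13 * math.pi / 3                   # value derived in the example
print(approx, exact)
```

The θ integral contributes only the factor 2π because the integrand does not depend on θ.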
Exercises
Stage 1
1
2
, and a, b > 0. Denote by S the part of the surface z = y tan θ with 0 ≤ x ≤ a, 0 ≤ y ≤ b.
2
Let c > 0. Denote by S the part of the surface ax + by + cz = d with (x, y) running over the region D in the xy-plane. Find
the surface area of S, in terms of a, b, c, d and A(D), the area of the region D.
3
Let a, b, c > 0. Denote by S the triangle with vertices (a, 0, 0), (0, b, 0) and (0, 0, c).
1. Find the surface area of S in three different ways, each using Theorem 3.4.2.
2. Denote by $T_{xy}$ the projection of S onto the xy-plane. (It is the triangle with vertices (0, 0, 0), (a, 0, 0) and (0, b, 0).) Similarly use $T_{xz}$ to denote the projection of S onto the xz-plane and $T_{yz}$ to denote the projection of S onto the yz-plane. Show that
$$\text{Area}(S)=\sqrt{\text{Area}(T_{xy})^2+\text{Area}(T_{xz})^2+\text{Area}(T_{yz})^2}$$
Stage 2
4. ✳
5. ✳
Find the surface area of the part of the paraboloid $z=a^2-x^2-y^2$ which lies above the xy-plane.
6. ✳
Find the area of the portion of the cone $z^2=x^2+y^2$ lying between the planes z = 2 and z = 3.
7. ✳
3
3/2
(x +y
3/2
), over the square 0 ≤ x ≤ 1, 0 ≤ y ≤ 1.
8. ✳
1. To find the surface area of the surface z = f(x, y) above the region D, we integrate $\iint_D F(x,y)\,dA$. What is F(x, y)?
2. Consider a “Death Star”, a ball of radius 2 centred at the origin with another ball of radius 2 centred at $(0,0,2\sqrt3)$ cut out of it. The diagram below shows the slice where y = 0.
1. The Rebels want to paint part of the surface of Death Star hot pink; specifically, the concave part (indicated with a thick
line in the diagram). To help them determine how much paint is needed, carefully fill in the missing parts of this integral:
––
– ––
–
9. ✳
10. ✳
Find the surface area of that part of the hemisphere $z=\sqrt{a^2-x^2-y^2}$ which lies within the cylinder $\left(x-\frac a2\right)^2+y^2=\left(\frac a2\right)^2$.
1. Recall 2.6.1.
2. As we mentioned above, the approximation below becomes exact when the limit dx, dy → 0 is taken in the definition of the
integral. See §3.3.5 in the CLP-4 text.
3. There is a pun hidden here, because you can (with a little thought) also get the surface area by differentiating the volume with
respect to the radius.
This page titled 3.4: Surface Area is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Joel Feldman,
Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit
history is available upon request.
3.5: Triple Integrals
Triple integrals, that is integrals over three dimensional regions, are just like double integrals, only more so. We decompose the
domain of integration into tiny cubes, for example, compute the contribution from each cube and then use integrals to add up all of
the different pieces. We'll go through the details now by means of a number of examples.
Example 3.5.1
Find the mass inside the sphere $x^2+y^2+z^2=1$ if the density is ρ(x, y, z) = |xyz|.
Solution
The absolute values can complicate the computations. We can avoid those complications by exploiting the fact that, by
symmetry, the total mass of the sphere will be eight times the mass in the first octant. We shall cut the first octant part of the
sphere into tiny pieces using Cartesian coordinates. That is, we shall cut it up using planes of constant z, planes of constant y,
and planes of constant x, which we recall look like
First slice the (the first octant part of the) sphere into horizontal plates by inserting many planes of constant z, with the
various values of z differing by dz. The figure on the left below shows the part of one plate in the first octant outlined in
red. Each plate
has thickness dz,
has z almost constant throughout the plate (it only varies by dz ), and
has (x, y) running over $x\ge0$, $y\ge0$, $x^2+y^2\le1-z^2$.
The bottom plate starts at z = 0 and the top plate ends at z = 1. See the figure on the right below.
Concentrate on any one plate. Subdivide it into long thin “square” beams by inserting many planes of constant y, with the
various values of y differing by dy. The figure on the left below shows the part of one beam in the first octant outlined in
blue. Each beam
has cross-sectional area dy dz,
has z and y essentially constant throughout the beam, and
has x running over $0\le x\le\sqrt{1-y^2-z^2}$.
The leftmost beam has, essentially, y = 0 and the rightmost beam has, essentially, $y=\sqrt{1-z^2}$. See the figure on the right below.
Concentrate on any one beam. Subdivide it into tiny approximate cubes by inserting many planes of constant x, with the
various values of x differing by dx. The figure on the left below shows the top of one approximate cube in black. Each
cube
has volume dx dy dz, and
has x, y and z all essentially constant throughout the cube.
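The first cube has, essentially, x = 0 and the last cube has, essentially, $x=\sqrt{1-y^2-z^2}$. See the figure on the right below.
Each approximate cube has volume dx dy dz and essentially has density xyz, and so essentially has mass xyz dx dy dz. To get the mass of any one beam, we just add up the masses of the approximate cubes in that beam, by integrating x from its smallest value on the beam, namely 0, to its largest value on the beam, namely $\sqrt{1-y^2-z^2}$. The mass of the beam is thus
$$dy\,dz\int_0^{\sqrt{1-y^2-z^2}}dx\ xyz$$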
To get the mass of any one plate, say the plate whose z coordinate runs from z to z + dz, we just add up the masses of the beams in that plate, by integrating y from its smallest value on the plate, namely 0, to its largest value on the plate, namely $\sqrt{1-z^2}$. The mass of the plate is thus
$$dz\int_0^{\sqrt{1-z^2}}dy\int_0^{\sqrt{1-y^2-z^2}}dx\ xyz$$
To get the mass of the part of the sphere in the first octant, we just add up the masses of the plates that it contains, by integrating z from its smallest value in the octant, namely 0, to its largest value on the sphere, namely 1. The mass in the first octant is thus
$$\begin{aligned}
\int_0^1dz\int_0^{\sqrt{1-z^2}}dy\int_0^{\sqrt{1-y^2-z^2}}dx\ xyz
&= \int_0^1dz\int_0^{\sqrt{1-z^2}}dy\ yz\Big[\int_0^{\sqrt{1-y^2-z^2}}dx\ x\Big]\\
&= \int_0^1dz\int_0^{\sqrt{1-z^2}}dy\ \frac12\,yz\big(1-y^2-z^2\big)\\
&= \int_0^1dz\int_0^{\sqrt{1-z^2}}dy\ \Big[\frac{z(1-z^2)}{2}\,y-\frac z2\,y^3\Big]\\
&= \int_0^1dz\ \Big[\frac{z(1-z^2)^2}{4}-\frac{z(1-z^2)^2}{8}\Big]\\
&= \int_0^1dz\ \frac{z(1-z^2)^2}{8}\\
&= \int_1^0\frac{du}{-2}\ \frac{u^2}{8}\qquad\text{with } u=1-z^2,\ du=-2z\,dz\\
&= \frac{1}{48}
\end{aligned}$$
So the mass of the total (interior of the) sphere is $\frac{8}{48}=\frac16$.
In the notation used here, each integral sign is paired with the differential that immediately follows it, so that
$$\int_0^1dz\int_0^{\sqrt{1-z^2}}dy\int_0^{\sqrt{1-y^2-z^2}}dx\ xyz=\int_0^1\bigg(\int_0^{\sqrt{1-z^2}}\Big(\int_0^{\sqrt{1-y^2-z^2}}xyz\ dx\Big)dy\bigg)dz$$
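The cube, beam, plate construction above maps directly onto nested one-dimensional integrals, which makes it easy to check numerically. The sketch below is only an illustration: the helper `midpoint` and the function `first_octant_mass` are names invented here, and the midpoint rule is a stand-in for the exact integration done in the example.

```python
import math

def midpoint(g, lo, hi, n=60):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

def first_octant_mass():
    """Mass of the first-octant part of the unit ball with density xyz."""
    def plate(z):                                  # one horizontal plate
        def beam(y):                               # one beam inside that plate
            x_max = math.sqrt(max(0.0, 1 - y * y - z * z))
            return midpoint(lambda x: x * y * z, 0.0, x_max)
        return midpoint(beam, 0.0, math.sqrt(1 - z * z))
    return midpoint(plate, 0.0, 1.0)

octant = first_octant_mass()
total = 8 * octant                                 # eight octants by symmetry
print(octant, 1 / 48)
print(total, 1 / 6)
```

Each nested function corresponds to one stage of the construction: `beam` adds up cubes along x, `plate` adds up beams along y, and the outermost call adds up plates along z.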
Example 3.5.2
In practice, often the hardest part of dealing with a triple integral is setting up the limits of integration. In this example, we'll
concentrate on exactly that.
Let V be the solid region in $\mathbb{R}^3$ bounded by the planes x = 0, y = 0, z = 0, y = 4 − x, and the surface $z=4-x^2$. We are now going to write $\iiint_V f(x,y,z)\,dV$ as an iterated integral (i.e. find the limits of integration) in two different ways. Here f is an unspecified function, and
$$V=\big\{(x,y,z)\ \big|\ x\ge0,\ y\ge0,\ z\ge0,\ x+y\le4,\ z\le4-x^2\big\}$$
The iterated integral $\iiint_V f(x,y,z)\,dz\,dy\,dx=\int\big(\int\big(\int f(x,y,z)\,dz\big)dy\big)dx$: For this iterated integral, the outside integral is with respect to x, so we first slice up V using planes of constant x, as in the figure below.
On V, x runs from 0 to 2 (because $0\le z\le4-x^2$ forces $x\le2$). For each fixed x in that range, y runs from 0 to 4 − x and z runs from 0 to $4-x^2$.
So
$$\iiint_V f(x,y,z)\,dV=\int_0^2\int_0^{4-x}\int_0^{4-x^2}f(x,y,z)\,dz\,dy\,dx$$
The iterated integral $\iiint_V f(x,y,z)\,dy\,dx\,dz=\int\big(\int\big(\int f(x,y,z)\,dy\big)dx\big)dz$: For this iterated integral, the outside integral is with respect to z, so we first slice up V using planes of constant z, as in the figure below.
On V, z runs from 0 to 4. For each fixed z in that range, x runs from 0 to $\sqrt{4-z}$ and, for each fixed (x, z), y runs from 0 to 4 − x. So
$$\iiint_V f(x,y,z)\,dV=\int_0^4\int_0^{\sqrt{4-z}}\int_0^{4-x}f(x,y,z)\,dy\,dx\,dz$$
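One way to gain confidence in a set of limits of integration is to check that two different orders give the same number for a concrete integrand. The sketch below does this numerically for the region V of this example. The helper `midpoint`, the two `order_...` functions and the sample integrand xy + z are all illustrative assumptions. For f = 1 both orders should approximate the volume of V, which is $\int_0^2(4-x)(4-x^2)\,dx=\frac{52}{3}$.

```python
import math

def midpoint(g, lo, hi, n=40):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

def order_dz_dy_dx(f):
    # x outermost: 0 <= x <= 2, then 0 <= y <= 4 - x, then 0 <= z <= 4 - x^2
    return midpoint(
        lambda x: midpoint(
            lambda y: midpoint(lambda z: f(x, y, z), 0.0, 4 - x * x),
            0.0, 4 - x),
        0.0, 2.0)

def order_dy_dx_dz(f):
    # z outermost: 0 <= z <= 4, then 0 <= x <= sqrt(4 - z), then 0 <= y <= 4 - x
    return midpoint(
        lambda z: midpoint(
            lambda x: midpoint(lambda y: f(x, y, z), 0.0, 4 - x),
            0.0, math.sqrt(4 - z)),
        0.0, 4.0)

def one(x, y, z):
    return 1.0            # integrating 1 gives the volume of V

def sample(x, y, z):
    return x * y + z      # arbitrary test integrand (assumption)

vol_a, vol_b = order_dz_dy_dx(one), order_dy_dx_dz(one)
val_a, val_b = order_dz_dy_dx(sample), order_dy_dx_dz(sample)
print(vol_a, vol_b)       # both should be close to 52/3
print(val_a, val_b)       # the two orders should agree closely
```

Agreement between the two orders does not prove the limits are right, but a disagreement is a reliable sign that one set of limits describes a different region.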
Example 3.5.3
As was said in the last example, in practice, often the hardest parts of dealing with a triple integral concern the limits of integration. In this example, we'll again concentrate on exactly that. This time, we will consider the integral
$$I=\int_0^2dy\int_0^{2-y}dz\int_0^{\frac{2-y}{2}}dx\ f(x,y,z)$$
and we will re-express I with the outside integral being over z. We will figure out the limits of integration for both the order $\int dz\int dx\int dy\ f(x,y,z)$ and for the order $\int dz\int dy\int dx\ f(x,y,z)$.
Our first task is to get a good idea as to what the domain of integration looks like. We start by reading off of the given integral that
the outside integral says that y runs from 0 to 2, and
the middle integral says that, for each fixed y in that range, z runs from 0 to 2 − y and
the inside integral says that, for each fixed (y, z) as above, x runs from 0 to $\frac{2-y}{2}$.
So the domain of integration is
$$V=\Big\{(x,y,z)\ \Big|\ 0\le y\le2,\ 0\le z\le2-y,\ 0\le x\le\frac{2-y}{2}\Big\}\tag{∗}$$
We'll sketch V shortly. Because it is generally easier to make 2d sketches than it is to make 3d sketches, we'll first make a 2d
sketch of the part of V that lies in the vertical plane y = Y . Here Y is any constant between 0 and 2. Looking at the definition
of V , we see that the point (x, Y , z) lies in V if and only if
$$0\le z\le2-Y\qquad0\le x\le\frac{2-Y}{2}$$
Here, on the left, is a (2d) sketch of all (x, z)'s that obey those inequalities, and, on the right, is a (3d) sketch of all (x, Y , z)'s
that obey those inequalities.
So our solid V consists of a bunch of vertical rectangles stacked sideways along the y-axis. The rectangle in the plane y = Y has side lengths $\frac{2-Y}{2}$ and 2 − Y. As we move from the plane y = Y = 0, i.e. the xz-plane, to the plane y = Y = 2, the rectangle decreases in size linearly from a one by two rectangle, when Y = 0, to a zero by zero rectangle, i.e. a point, when Y = 2. Here is a sketch of V together with a typical y = Y rectangle.
To re-express the given integral with the outside integral being with respect to z, we have to slice up V into horizontal plates
by inserting planes of constant z. So we have to figure out what the part of V that lies in the horizontal plane z = Z looks like.
From the figure above, we see that, in V , the smallest value of z is 0 and the biggest value of z is 2. So Z is any constant
between 0 and 2. Again looking at the definition of V in (∗) above, we see that the point (x, y, Z) lies in V if and only if
$$y\ge0\qquad y\le2\qquad y\le2-Z\qquad x\ge0\qquad 2x+y\le2$$
Here, on the top, is a (2d) sketch showing the top view of all (x, y)'s that obey those inequalities, and, on the bottom, is a (3d)
sketch of all (x, y, Z)'s that obey those inequalities.
To express I as an integral with the order of integration ∫ dz ∫ dy ∫ dx f (x, y, z), we subdivide the plate at height z into
vertical strips as in the figure
Since
y is essentially constant on each strip with the leftmost strip having y = 0 and the rightmost strip having y = 2 − z and
for each fixed y in that range, x runs from 0 to $\frac{2-y}{2}$
we have
$$I=\int_0^2dz\int_0^{2-z}dy\int_0^{\frac{2-y}{2}}dx\ f(x,y,z)$$
Alternatively, to express I as an integral with the order of integration ∫ dz ∫ dx ∫ dy f (x, y, z), we subdivide the plate at
height z into horizontal strips as in the figure
Since
x is essentially constant on each strip with the first strip having x = 0 and the last strip having x = 1 and
for each fixed x between 0 and z/2, y runs from 0 to 2 − z and
for each fixed x between z/2 and 1, y runs from 0 to 2 − 2x
we have
$$I=\int_0^2dz\int_0^{z/2}dx\int_0^{2-z}dy\ f(x,y,z)+\int_0^2dz\int_{z/2}^1dx\int_0^{2-2x}dy\ f(x,y,z)$$
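The three descriptions of I found in this example (the given order, the order dz dy dx, and the two-piece order dz dx dy) can be compared numerically. This is a sketch under stated assumptions: `midpoint`, the three order functions and the test integrand x + yz are names and choices made here for illustration. With f = 1 each order should approximate the volume of V, which is $\int_0^2\frac{(2-y)^2}{2}\,dy=\frac43$.

```python
def midpoint(g, lo, hi, n=40):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

def given_order(f):
    # y outermost, then z, then x, exactly as in the original integral
    return midpoint(
        lambda y: midpoint(
            lambda z: midpoint(lambda x: f(x, y, z), 0.0, (2 - y) / 2),
            0.0, 2 - y),
        0.0, 2.0)

def order_dz_dy_dx(f):
    # z outermost, then 0 <= y <= 2 - z, then 0 <= x <= (2 - y)/2
    return midpoint(
        lambda z: midpoint(
            lambda y: midpoint(lambda x: f(x, y, z), 0.0, (2 - y) / 2),
            0.0, 2 - z),
        0.0, 2.0)

def order_dz_dx_dy(f):
    # z outermost; the x-range splits at x = z/2 into two pieces
    def plate(z):
        left = midpoint(
            lambda x: midpoint(lambda y: f(x, y, z), 0.0, 2 - z),
            0.0, z / 2)
        right = midpoint(
            lambda x: midpoint(lambda y: f(x, y, z), 0.0, 2 - 2 * x),
            z / 2, 1.0)
        return left + right
    return midpoint(plate, 0.0, 2.0)

def one(x, y, z):
    return 1.0            # integrating 1 gives the volume of V

def sample(x, y, z):
    return x + y * z      # arbitrary test integrand (assumption)

volumes = [order(one) for order in (given_order, order_dz_dy_dx, order_dz_dx_dy)]
values = [order(sample) for order in (given_order, order_dz_dy_dx, order_dz_dx_dy)]
print(volumes)            # each entry should be close to 4/3
print(values)             # the three entries should agree closely
```

The split in `order_dz_dx_dy` mirrors the two regimes in the top-view sketch: for x below z/2 the strip is cut off by y = 2 − z, and for x above z/2 it is cut off by 2x + y = 2.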
Exercises
Stage 1
1
Evaluate the integral
$$\iint_R\sqrt{b^2-y^2}\ dx\,dy\qquad\text{where R is the rectangle } 0\le x\le a,\ 0\le y\le b$$
2✳
Find the total mass of the rectangular box [0, 1] × [0, 2] × [0, 3] (that is, the box defined by the inequalities 0 ≤ x ≤ 1,
Stage 2
3
Evaluate $\iiint_R x\,dV$ where R is the tetrahedron bounded by the coordinate planes and the plane $\frac xa+\frac yb+\frac zc=1$.
4
Evaluate $\iiint_R y\,dV$ where R is the portion of the cube 0 ≤ x, y, z ≤ 1 lying above the plane y + z = 1 and below the plane x + y + z = 2.
5
For each of the following, express the given iterated integral as an iterated integral in which the integrations are performed in
the order: first z, then y, then x.
1. $\int_0^1dz\int_0^{1-z}dy\int_0^{1-z}dx\ f(x,y,z)$
2. $\int_0^1dz\int_{\sqrt z}^1dy\int_0^ydx\ f(x,y,z)$
6✳
$$\int_{y=-1}^{y=1}\int_{z=0}^{z=1-y^2}\int_{x=0}^{x=2-y-z}f(x,y,z)\,dx\,dz\,dy$$
1. Draw a reasonably accurate picture of E in 3--dimensions. Be sure to show the units on the coordinate axes.
2. Rewrite the triple integral ∭ f dV as one or more iterated triple integrals in the order
E
y= x= z=
∫ ∫ ∫ f (x, y, z) dz dx dy
y= x= z=
7✳
A triple integral $\iiint_E f(x,y,z)\,dV$ is given in the iterated form
$$J=\int_0^1\int_0^{1-\frac x2}\int_0^{4-2x-4z}f(x,y,z)\,dy\,dz\,dx$$
J =∫ ∫ ∫ f (x, y, z) dz dx dy
y= x= z=
8✳
Write the integral given below 5 other ways, each with a different order of integration.
$$I=\int_0^1\int_{\sqrt x}^1\int_0^{1-y}f(x,y,z)\,dz\,dy\,dx$$
9✳
Let I =∭ f (x, y, z) dV where E is the tetrahedron with vertices (−1, 0, 0), (0, 0, 0), (0, 0, 3) and (0, −2, 0).
E
I =∫ ∫ ∫ f (x, y, z) dz dy dx
x= y= z=
I =∫ ∫ ∫ f (x, y, z) dy dx dz
z= x= y=
10 ✳
Let T denote the tetrahedron bounded by the coordinate planes x = 0, y = 0, z = 0 and the plane x + y + z = 1. Compute
$$K=\iiint_T\frac{1}{(1+x+y+z)^4}\,dV$$
11 ✳
Let E be the portion of the first octant which is above the plane z = x +y and below the plane z = 2. The density in E is
ρ(x, y, z) = z. Find the mass of E.
12 ✳
Evaluate the triple integral $\iiint_E x\,dV$, where E is the region in the first octant bounded by the parabolic cylinder $y=x^2$ and
13 ✳
Let E be the region in the first octant bounded by the coordinate planes, the plane x + y = 1 and the surface $z=y^2$. Evaluate $\iiint_E z\,dV$.
14 ✳
Evaluate $\iiint_R yz^2e^{-xyz}\,dV$ over the rectangular box
$$R=\big\{(x,y,z)\ \big|\ 0\le x\le1,\ 0\le y\le2,\ 0\le z\le3\big\}$$
15 ✳
1. Sketch the surface given by the equation $z=1-x^2$.
∭ f (x, y, z) dV
E
as an iterated integral.
16 ✳
Let
$$J=\int_0^1\int_0^x\int_0^yf(x,y,z)\,dz\,dy\,dx$$
Express J as an integral where the integrations are to be performed in the order x first, then y, then z.
17 ✳
Let E be the region bounded by z = 2x, $z=y^2$, and x = 3. The triple integral $\iiint f(x,y,z)\,dV$ can be expressed as an iterated integral in the following three orders of integration. Fill in the limits of integration in each case. No explanation required.
y= x= z=
∫ ∫ ∫ f (x, y, z) dz dx dy
y= x= z=
y= z= x=
∫ ∫ ∫ f (x, y, z) dx dz dy
y= z= x=
z= x= y=
∫ ∫ ∫ f (x, y, z) dy dx dz
z= x= y=
18 ✳
∭ f (x, y, z) dV
E
as three different iterated integrals corresponding to the orders of integration: (a) dz dx dy, (b) dx dy dz, and (c) dy dz dx.
19 ✳
I =∭ f (x, y, z) dV
E
Fill in the blanks below. In each part below, you may need only one integral to express your answer. In that case, leave the
other blank.
––
– ––
– ––
– ––
– ––
– ––
–
20 ✳
Evaluate $\iiint_E z\,dV$, where E is the region bounded by the planes y = 0, z = 0, x + y = 2 and the cylinder $y^2+z^2=1$ in the first octant.
21 ✳
22 ✳
The solid region T is bounded by the planes x = 0, y = 0, z = 0, and x + y + z = 2 and the surface $x^2+z=1$.
This page titled 3.5: Triple Integrals is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Joel Feldman,
Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit
history is available upon request.
3.6: Triple Integrals in Cylindrical Coordinates
Many problems possess natural symmetries. We can make our work easier by using coordinate systems, like polar coordinates, that
are tailored to those symmetries. We will look at two more such coordinate systems — cylindrical and spherical coordinates.
Cylindrical Coordinates
In the event that we wish to compute, for example, the mass of an object that is invariant under rotations about the z -axis 1, it is
advantageous to use a natural generalization of polar coordinates to three dimensions. The coordinate system is called cylindrical
coordinates.
Definition 3.6.1
That is, r and θ are the usual polar coordinates and z is the usual z.
Equation 3.6.2
$$x=r\cos\theta\qquad y=r\sin\theta\qquad z=z$$
$$r=\sqrt{x^2+y^2}\qquad\theta=\arctan\frac yx\qquad z=z$$
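The conversion formulas translate into a few lines of code. The sketch below is a minimal illustration, and the function names `to_cylindrical` and `to_cartesian` are hypothetical. One design choice is worth flagging: the formula θ = arctan(y/x) on its own only determines θ up to a multiple of π and fails when x = 0, so the code uses the two-argument arctangent `math.atan2(y, x)`, which picks the angle in the correct quadrant and returns a value in (−π, π].

```python
import math

def to_cylindrical(x, y, z):
    """Convert Cartesian (x, y, z) to cylindrical (r, theta, z)."""
    r = math.hypot(x, y)          # r = sqrt(x^2 + y^2)
    theta = math.atan2(y, x)      # quadrant-aware version of arctan(y/x)
    return r, theta, z

def to_cartesian(r, theta, z):
    """Convert cylindrical (r, theta, z) to Cartesian (x, y, z)."""
    return r * math.cos(theta), r * math.sin(theta), z

# A point in the second quadrant of the xy-plane, where arctan(y/x)
# alone would return the wrong angle.
r, theta, z = to_cylindrical(-1.0, 1.0, 2.0)
print(r, theta, z)
print(to_cartesian(r, theta, z))
```

Converting to cylindrical coordinates and back should reproduce the original point up to rounding.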
and then subdividing the plates into wedges using surfaces of constant θ, say with the difference between successive θ 's being
dθ,
and then subdividing the wedges into approximate cubes using surfaces of constant r, say with the difference between
successive r's being dr,
When we introduced slices using surfaces of constant r, the difference between the successive r's was dr, so the indicated edge
of the cube has length dr.
When we introduced slices using surfaces of constant z, the difference between the successive z 's was dz, so the vertical edges
of the cube have length dz.
When we introduced slices using surfaces of constant θ, the difference between the successive θ's was dθ, so the remaining edges of the cube are circular arcs of radius essentially⁴ r that subtend an angle dθ, and so have length r dθ. See the derivation of equation 3.2.5.
So the volume of the approximate cube in cylindrical coordinates is (essentially 5)
Equation 3.6.3
dV = r dr dθ dz
Example 3.6.4
Find the mass of the solid body consisting of the inside of the sphere $x^2+y^2+z^2=1$ if the density is $\rho(x,y,z)=x^2+y^2$.
Solution
Before we get started, note that $x^2+y^2$ is the square of the distance from (x, y, z) to the z-axis. Consequently both the integrand, $x^2+y^2$, and the domain of integration, $x^2+y^2+z^2\le1$, and hence our solid, are invariant under rotations about the z-axis⁶. That makes this integral a good candidate for cylindrical coordinates.
Again, by symmetry the total mass of the sphere will be eight times the mass in the first octant. We shall cut the first octant part
of the sphere into tiny pieces using cylindrical coordinates. That is, we shall cut it up using planes of constant z, planes of
constant θ, and surfaces of constant r.
First slice the (the first octant part of the) sphere into horizontal plates by inserting many planes of constant z, with the
various values of z differing by dz. The figure on the left below shows the part of one plate in the first octant outlined in
red. Each plate
has thickness dz,
has z essentially constant on the plate, and
has (x, y) running over $x\ge0$, $y\ge0$, $x^2+y^2\le1-z^2$. In cylindrical coordinates, r runs from 0 to $\sqrt{1-z^2}$ and θ runs from 0 to $\frac\pi2$.
The bottom plate has, essentially, z = 0 and the top plate has, essentially, z = 1. See the figure on the right below.
The leftmost wedge has, essentially, θ = 0 and the rightmost wedge has, essentially, $\theta=\frac\pi2$. See the figure on the right below.
Concentrate on any one wedge. Subdivide it into tiny approximate cubes by inserting many surfaces of constant r, with the
various values of r differing by dr. The figure on the left below shows the top of one approximate cube in black. Each cube
has volume r dr dθ dz, by 3.6.3, and
has r, θ and z all essentially constant on the cube.
The first cube has, essentially, r = 0 and the last cube has, essentially, $r=\sqrt{1-z^2}$. See the figure on the right below.
Now we can build up the mass.
Concentrate on one approximate cube. Let's say that it contains the point with cylindrical coordinates r, θ and z.
The cube has volume essentially $dV=r\,dr\,d\theta\,dz$ and
essentially has density $\rho(x,y,z)=\rho(r\cos\theta,r\sin\theta,z)=r^2$ and so
essentially has mass $r^3\,dr\,d\theta\,dz$. (See how nice the right coordinate system can be!)
To get the mass of any one wedge, say the wedge whose θ coordinate runs from θ to θ + dθ, we just add up the masses of the approximate cubes in that wedge, by integrating r from its smallest value on the wedge, namely 0, to its largest value on the wedge, namely $\sqrt{1-z^2}$. The mass of the wedge is thus
$$d\theta\,dz\int_0^{\sqrt{1-z^2}}dr\ r^3$$
To get the mass of any one plate, say the plate whose z coordinate runs from z to z + dz, we just add up the masses of the wedges in that plate, by integrating θ from its smallest value on the plate, namely 0, to its largest value on the plate, namely $\frac\pi2$. The mass of the plate is thus
$$dz\int_0^{\pi/2}d\theta\int_0^{\sqrt{1-z^2}}dr\ r^3$$
To get the mass of the part of the sphere in the first octant, we just add up the masses of the plates that it contains, by
integrating z from its smallest value in the octant, namely 0, to its largest value on the sphere, namely 1. The mass in the
first octant is thus
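\[
\begin{aligned}
\int_0^1 dz \int_0^{\pi/2} d\theta \int_0^{\sqrt{1-z^2}} dr\; r^3
&= \frac{1}{4}\int_0^1 dz \int_0^{\pi/2} d\theta\; (1-z^2)^2 \\
&= \frac{\pi}{8}\int_0^1 dz\; (1-z^2)^2 \\
&= \frac{\pi}{8}\int_0^1 dz\; (1-2z^2+z^4) \\
&= \frac{\pi}{8}\underbrace{\Big[1-\frac{2}{3}+\frac{1}{5}\Big]}_{8/15} \\
&= \frac{1}{15}\pi
\end{aligned}
\]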
Multiplying by 8, one factor for each octant, gives 8 × (1/15)π = (8/15)π for the whole sphere.
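As a rough numerical sanity check of the first octant computation above, here is a minimal sketch that approximates the iterated integral of r³ dr dθ dz with midpoint Riemann sums. The function name `octant_mass` and the grid size `n` are illustrative choices, not part of the text.

```python
import math

def octant_mass(n=200):
    """Midpoint Riemann sum of r^3 dr dtheta dz over the first-octant part of the unit ball."""
    total = 0.0
    dz = 1.0 / n
    for i in range(n):
        z = (i + 0.5) * dz
        r_max = math.sqrt(1.0 - z * z)      # largest r on the plate at height z
        dr = r_max / n
        # sum over the cubes in one wedge: r^3 dr
        inner = sum(((j + 0.5) * dr) ** 3 for j in range(n)) * dr
        # the integrand does not depend on theta, so the wedges contribute a factor pi/2
        total += inner * (math.pi / 2) * dz
    return total

print(octant_mass(), math.pi / 15)
```

The two printed numbers should agree to several decimal places, which is consistent with the value π/15 found for the first octant.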
Just by way of comparison, here is the integral in Cartesian coordinates that gives the mass in the first octant. (We found the
limits of integration in Example 3.5.1.)
\[ \int_0^1 dz \int_0^{\sqrt{1-z^2}} dy \int_0^{\sqrt{1-y^2-z^2}} dx\; (x^2+y^2) \]
In the next example, we compute the moment of inertia of a right circular cone. The Definition 3.3.13 of the moment of inertia was
restricted to two dimensions. However, as was pointed out at the time, the same analysis extends naturally to the definition
Equation 3.6.5
\[ I_A = \iiint_V D(x,y,z)^2\, \rho(x,y,z)\; dx\, dy\, dz \]
Example 3.6.6
We shall use 3.6.5 to find the moment of inertia. In the current problem, the axis of rotation is the y-axis. The point on the
y-axis that is closest to (x, y, z) is (0, y, 0) so that the distance from (x, y, z) to the axis is just
\[ D(x,y,z) = \sqrt{x^2+z^2} \]
and the (constant) density is
\[ \rho(x,y,z) = \frac{M}{\text{Volume}(V)} \]
The formula
\[ \text{Volume}(V) = \frac{1}{3}\pi a^2 h \]
for the volume of a cone was derived in Example 1.6.1 of the CLP-2 text and in Appendix B.5.2 of the CLP-1 text. However
because of the similarity between the integral Volume(V) = ∭_V dx dy dz and the integral ∭_V (x² + z²) dx dy dz, that we
need for our computation of I_A, it is easy to rederive the volume formula and we shall do so.
Each plate
is a circular disk of thickness dz.
By similar triangles, as in the figure on the right below, the disk at height z has radius R obeying
\[ \frac{R}{z} = \frac{a}{h} \implies R = \frac{a}{h} z \]
So the disk at height z has the cylindrical coordinates r running from 0 to (a/h)z and θ running from 0 to 2π.
The bottom plate has, essentially, z = 0 and the top plate has, essentially, z = h.
Now concentrate on any one plate. Subdivide it into wedges by inserting many planes of constant θ, with the various values
of θ differing by dθ.
The first wedge has, essentially θ = 0 and the last wedge has, essentially, θ = 2π.
Concentrate on any one wedge. Subdivide it into tiny approximate cubes 7 by inserting many surfaces of constant r, with
the various values of r differing by dr. Each cube
has volume r dr dθ dz, by 3.6.3.
The first cube has, essentially, r = 0 and the last cube has, essentially, r = (a/h)z.
\[
\begin{aligned}
\iiint_V dx\, dy\, dz &= \int_0^h dz \int_0^{2\pi} d\theta \int_0^{\frac{a}{h}z} dr\; r \\
&= \int_0^h dz \int_0^{2\pi} d\theta\; \frac{1}{2}\Big(\frac{a}{h}z\Big)^2 = \frac{a^2\pi}{h^2}\int_0^h dz\; z^2 \\
&= \frac{1}{3}\pi a^2 h
\end{aligned}
\]
as expected, and
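\[
\begin{aligned}
\iiint_V (x^2+z^2)\, dx\, dy\, dz
&= \int_0^h dz \int_0^{2\pi} d\theta \int_0^{\frac{a}{h}z} dr\; r\, \overbrace{(r^2\cos^2\theta + z^2)}^{x^2+z^2} \\
&= \int_0^h dz \int_0^{2\pi} d\theta\, \Big[\frac{1}{4}\Big(\frac{a}{h}z\Big)^4\cos^2\theta + \frac{1}{2}\Big(\frac{a}{h}z\Big)^2 z^2\Big] \\
&= \int_0^h dz\, \Big[\frac{1}{4}\frac{a^4}{h^4} + \frac{a^2}{h^2}\Big]\pi z^4
\qquad \text{since } \int_0^{2\pi}\cos^2\theta\, d\theta = \pi \text{ by Remark 3.3.5} \\
&= \frac{1}{5}\Big[\frac{1}{4}\frac{a^4}{h^4} + \frac{a^2}{h^2}\Big]\pi h^5
\end{aligned}
\]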
\[
I_A = 3\,\frac{M}{\pi a^2 h}\; \frac{1}{5}\Big[\frac{1}{4}\frac{a^4}{h^4} + \frac{a^2}{h^2}\Big]\pi h^5
= \frac{3}{20} M (a^2 + 4h^2)
\]
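The closed form (3/20) M (a² + 4h²) can be spot-checked numerically. The sketch below assumes, as in the computation above, a cone of base radius a and height h with its tip at the origin, constant density M / Volume(V), and rotation about the y-axis. The name `cone_moment` and the grid size are illustrative.

```python
import math

def cone_moment(a, h, M, n=50):
    """Midpoint Riemann sum of (x^2 + z^2) * density * r dr dtheta dz over the cone."""
    density = M / (math.pi * a * a * h / 3.0)   # constant density = mass / volume
    dz = h / n
    dth = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        z = (i + 0.5) * dz
        dr = (a * z / h) / n                    # the disk at height z has radius (a/h) z
        for j in range(n):
            th = (j + 0.5) * dth
            for k in range(n):
                r = (k + 0.5) * dr
                x = r * math.cos(th)
                total += (x * x + z * z) * density * r * dr * dth * dz
    return total

a, h, M = 1.0, 2.0, 3.0
print(cone_moment(a, h, M), 3.0 / 20.0 * M * (a * a + 4.0 * h * h))
```

With these sample values the two numbers should agree to about three significant figures; a finer grid tightens the agreement at the cost of run time.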
Exercises
Stage 1
1
2
2. r = 1, θ = π/4, z = 0
3. r = 1, θ = π/2, z = 0
4. r = 0, θ = π, z = 1
5. r = 1, θ = π/4, z = 1
3
2. r = 1, θ = π/4, z = 0
3. r = 1, θ = π/2, z = 0
4. r = 0, θ = π, z = 1
5. r = 1, θ = π/4, z = 1
4
5
Rewrite the following equations in cylindrical coordinates.
1. z = 2xy
2. x² + y² + z² = 1
3. (x − 1)² + y² = 1
Stage 2
6
Use cylindrical coordinates to evaluate the volumes of each of the following regions.
1. Above the xy-plane, inside the cone z = 2a − √(x² + y²) and inside the cylinder x² + y² = 2ay.
2. Above the xy-plane, under the paraboloid z = 1 − x² − y² and in the wedge −x ≤ y ≤ √3 x.
7✳
8✳
9✳
Let E be the smaller of the two solid regions bounded by the surfaces z = x² + y² and x² + y² + z² = 6. Evaluate
\[ \iiint_E (x^2+y^2)\, dV \]
10 ✳
Let a > 0 be a fixed positive real number. Consider the solid inside both the cylinder x² + y² = ax and the sphere
x² + y² + z² = a². Compute its volume. You may use that
\[ \int \sin^3(\theta)\, d\theta = \frac{1}{12}\cos(3\theta) - \frac{3}{4}\cos(\theta) + C \]
11 ✳
Let E be the solid lying above the surface z = y² and below the surface z = 4 − x². Evaluate
\[ \iiint_E y^2\, dV \]
12
The centre of mass (x̄, ȳ , z̄ ) of a body B having density ρ(x, y, z) (units of mass per unit volume) at (x, y, z) is defined to be
\[
\bar x = \frac{1}{M}\iiint_B x\,\rho(x,y,z)\, dV \qquad
\bar y = \frac{1}{M}\iiint_B y\,\rho(x,y,z)\, dV \qquad
\bar z = \frac{1}{M}\iiint_B z\,\rho(x,y,z)\, dV
\]
where
\[ M = \iiint_B \rho(x,y,z)\, dV \]
is the mass of the body. So, for example, x̄ is the weighted average of x over the body. Find the centre of mass of the part of
the solid ball x² + y² + z² ≤ a² with x ≥ 0, y ≥ 0 and z ≥ 0, assuming that the density ρ is constant.
13 ✳
14 ✳
1. Set up (but do not evaluate) a triple integral in rectangular coordinates that describes the volume of the solid.
2. Calculate the volume of the solid using any method.
15 ✳
Stage 3
16 ✳
1. At (1, 0, −1), in which direction is the density of hydrogen increasing most rapidly?
2. You are in a spacecraft at the origin. Suppose the spacecraft flies in the direction of ⟨0, 0, 1⟩ . It has a disc of radius 1,
centred on the spacecraft and deployed perpendicular to the direction of travel, to catch hydrogen. How much hydrogen has
been collected by the time that the spacecraft has traveled a distance 2?
You may use the fact that \(\int_0^{2\pi} \cos^2\theta\, d\theta = \pi\).
17
A torus of mass M is generated by rotating a circle of radius a about an axis in its plane at distance b from the centre (b > a).
The torus has constant density. Find the moment of inertia about the axis of rotation. By definition the moment of inertia is
∭ r² dm where dm is the mass of an infinitesimal piece of the solid and r is its distance from the axis.
3. As was the case for polar coordinates, it is sometimes convenient to extend these definitions by saying that x = r cos θ and
y = r sin θ even when r is negative. See the end of Section 3.2.1.
4. The inner edge has radius r, but the outer edge has radius r + dr. However the error that this generates goes to zero in the limit
dr, dθ, dz → 0.
5. By “essentially”, we mean that the formula for dV works perfectly when we take the limit dr, dθ, dz → 0 of Riemann sums.
6. Imagine that you are looking at the solid from, for example, far out on the x-axis. You close your eyes for a minute. Your evil
twin then sneaks in, rotates the solid about the z -axis, and sneaks out. You open your eyes. You will not be able to tell that the
solid has been rotated.
7. Again they are wonky cubes, but we can bound the error and show that it goes to zero in the limit dr, dθ, dz → 0.
This page titled 3.6: Triple Integrals in Cylindrical Coordinates is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or
curated by Joel Feldman, Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts
platform; a detailed edit history is available upon request.
3.7: Triple Integrals in Spherical Coordinates
Spherical Coordinates
In the event that we wish to compute, for example, the mass of an object that is invariant under rotations about the origin, it is
advantageous to use another generalization of polar coordinates to three dimensions. The coordinate system is called spherical
coordinates.
Definition 3.7.1
Spherical coordinates are denoted ρ, θ and φ (see footnote 1) and are defined by
Here are two more figures giving the side and top views of the previous figure.
The spherical coordinate θ is the same as the cylindrical coordinate θ. The spherical coordinate φ is new. It runs from 0 (on
the positive z-axis) to π (on the negative z-axis). The Cartesian and spherical coordinates are related by
Equation 3.7.2
\[ x = \rho\sin\varphi\cos\theta \qquad y = \rho\sin\varphi\sin\theta \qquad z = \rho\cos\varphi \]
a surface of constant θ, i.e. a surface y = x tan θ (see footnote 2) with θ a constant (which looks like the page of a book), and
a surface of constant φ, i.e. a surface √(x² + y²) = z tan φ with φ a constant (which looks like a conical funnel).
3.7.1 https://math.libretexts.org/@go/page/92246
The Volume Element in Spherical Coordinates
If we cut up a solid 3 by
first slicing it into segments (like segments of an orange) by using planes of constant θ, say with the difference between
successive θ 's being dθ,
and then subdividing the segments into “searchlights” (like the searchlight outlined in blue in the figure below) using surfaces
of constant φ, say with the difference between successive φ 's being dφ,
and then subdividing the searchlights into approximate cubes using surfaces of constant ρ, say with the difference between
successive ρ's being dρ,
The dimensions of the approximate “cube” in spherical coordinates are (essentially) dρ by ρdφ by ρ sin φ dθ. (These dimensions
are derived in more detail in the next section.) So the approximate cube has volume (essentially)
Equation 3.7.3
\[ dV = \rho^2 \sin\varphi\; d\rho\, d\theta\, d\varphi \]
The Details
Here is an explanation of the edge lengths given in the above figure. Each of the 12 edges of the cube is formed by holding two of
the three coordinates ρ, θ, φ fixed and varying the third.
Four of the cube edges are formed by holding θ and φ fixed and varying ρ. The intersection of a plane of fixed θ with a cone of
fixed φ is a straight line emanating from the origin. When we introduced slices using spheres of constant ρ, the difference
between the successive ρ's was dρ, so those edges of the cube each have length dρ.
Four of the cube edges are formed by holding θ and ρ fixed and varying φ. The intersection of a plane of fixed θ (which
contains the origin) with a sphere of fixed ρ (which is centred on the origin) is a circle of radius ρ centred on the origin. It is a
line of longitude 4.
When we introduced searchlights using surfaces of constant φ, the difference between the successive φ 's was dφ. Thus those
four edges of the cube are circular arcs of radius essentially ρ that subtend an angle dφ, and so have length ρ dφ.
Four of the cube edges are formed by holding φ and ρ fixed and varying θ. The intersection of a cone of fixed φ with a sphere
of fixed ρ is a circle. As both ρ and φ are fixed, the circle of intersection lies in the plane z = ρ cos φ. It is a line of latitude.
The circle has radius ρ sin φ and is centred on (0, 0, ρ cos φ).
When we introduced segments using surfaces of constant θ, the difference between the successive θ 's was dθ. Thus these four
edges of the cube are circular arcs of radius essentially ρ sin φ that subtend an angle dθ, and so have length ρ sin φ dθ.
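One way to gain confidence in the edge lengths dρ, ρ dφ and ρ sin φ dθ is to add up their product over a whole ball and compare with the familiar volume (4/3)πa³. The sketch below does this with midpoint sums; the function name `ball_volume` is a hypothetical helper, and the factorisation into separate ρ, φ and θ sums relies on the integrand ρ² sin φ being a product.

```python
import math

def ball_volume(a, n=400):
    """Sum (d rho) * (rho d phi) * (rho sin(phi) d theta) over the ball of radius a."""
    drho = a / n
    dphi = math.pi / n
    # radial part: sum of rho^2 d rho for 0 <= rho <= a
    rho_part = sum(((i + 0.5) * drho) ** 2 for i in range(n)) * drho
    # polar part: sum of sin(phi) d phi for 0 <= phi <= pi
    phi_part = sum(math.sin((j + 0.5) * dphi) for j in range(n)) * dphi
    # nothing depends on theta, so 0 <= theta <= 2 pi contributes a factor 2 pi
    return rho_part * phi_part * 2.0 * math.pi

print(ball_volume(1.0), 4.0 * math.pi / 3.0)
```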
Solution
Note that, in spherical coordinates
\[ x^2+y^2 = \rho^2\sin^2\varphi \qquad z^2 = \rho^2\cos^2\varphi \qquad x^2+y^2+z^2 = \rho^2 \]
Consequently, in spherical coordinates, the equation of the sphere is ρ = a, and the equation of the cone is tan²φ = b². Let's
write β = arctan b, with 0 < β < π/2. Here is a sketch of the part of the ice cream cone in the first octant. The volume of the
full ice cream cone will be four times the volume of the part in the first octant.
We shall cut the first octant part of the ice cream cone into tiny pieces using spherical coordinates. That is, we shall cut it up
using planes of constant θ, cones of constant φ, and spheres of constant ρ.
First slice (the first octant part of) the ice cream cone into segments by inserting many planes of constant θ, with the
various values of θ differing by dθ. The figure on the left below shows one segment outlined in red. Each segment
has θ essentially constant on the segment, and
has φ running from 0 to β and ρ running from 0 to a.
The leftmost segment has, essentially, θ = 0 and the rightmost segment has, essentially, θ = π/2. See the figure on the
right below.
Concentrate on any one segment. A side view of the segment is sketched in the figure on the left below. Subdivide it into
long thin searchlights by inserting many cones of constant φ, with the various values of φ differing by dφ. The figure on
the left below shows one searchlight outlined in blue. Each searchlight
has θ and φ essentially constant on the searchlight, and
has ρ running over 0 ≤ ρ ≤ a.
The leftmost searchlight has, essentially, φ = 0 and the rightmost searchlight has, essentially, φ = β. See the figure on
the right below.
Concentrate on any one searchlight. Subdivide it into tiny approximate cubes by inserting many spheres of constant ρ, with
the various values of ρ differing by dρ. The figure on the left below shows the side view of one approximate cube in black.
Each cube
has ρ, θ and φ all essentially constant on the cube and
has volume ρ² sin φ dρ dθ dφ, by 3.7.3.
The first cube has, essentially, ρ = 0 and the last cube has, essentially, ρ = a. See the figure on the right below.
Now we can build up the volume.
Concentrate on one approximate cube. Let's say that it contains the point with spherical coordinates ρ, θ, φ. The cube has
volume essentially dV = ρ² sin φ dρ dθ dφ, by 3.7.3.
To get the volume of any one searchlight, say the searchlight whose φ coordinate runs from φ to φ + dφ, we just add up the
volumes of the approximate cubes in that searchlight, by integrating ρ from its smallest value on the searchlight, namely 0,
to its largest value on the searchlight, namely a. The volume of the searchlight is thus
\[ d\theta\, d\varphi \int_0^a d\rho\; \rho^2 \sin\varphi \]
To get the volume of any one segment, say the segment whose θ coordinate runs from θ to θ + dθ, we just add up the
volumes of the searchlights in that segment, by integrating φ from its smallest value on the segment, namely 0, to its largest
value on the segment, namely β. The volume of the segment is thus
\[ d\theta \int_0^{\beta} d\varphi\; \sin\varphi \int_0^a d\rho\; \rho^2 \]
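To get the volume of V₁, the part of the ice cream cone in the first octant, we just add up the volumes of the segments that it
contains, by integrating θ from its smallest value in the octant, namely 0, to its largest value on the octant, namely π/2.
\[
\begin{aligned}
\text{Volume}(V_1) &= \int_0^{\pi/2} d\theta \int_0^{\beta} d\varphi\; \sin\varphi \int_0^a d\rho\; \rho^2 \\
&= \frac{a^3}{3} \int_0^{\pi/2} d\theta \int_0^{\beta} d\varphi\; \sin\varphi \\
&= \frac{a^3}{3}\,[1-\cos\beta] \int_0^{\pi/2} d\theta \\
&= \frac{\pi a^3}{6}\,[1-\cos\beta]
\end{aligned}
\]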
We can express β (which was not given in the statement of the original problem) in terms of b (which was in the statement of
the original problem), just by looking at the triangle
The right hand and bottom sides of the triangle have been chosen so that tan β = b, which was the definition of β. So
cos β = 1/√(1 + b²) and the volume of the ice cream cone is
\[ \text{Volume}(V) = \frac{2\pi a^3}{3}\Big[1 - \frac{1}{\sqrt{1+b^2}}\Big] \]
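The formula can also be cross-checked without spherical coordinates. The sketch below assumes the ice cream cone is the region above the cone z = √(x² + y²)/b (which is what tan φ ≤ b means for z > 0) and inside the sphere of radius a, and it requires b > 0. For each point (x, y) of a grid on the first quadrant it adds up the height of the solid above that point. The names `ice_cream_volume` and `formula` are illustrative.

```python
import math

def ice_cream_volume(a, b, n=200):
    """Approximate the volume by summing (height of solid above (x, y)) * dx * dy."""
    h = a / n
    quarter = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            s2 = x * x + y * y
            if s2 >= a * a:
                continue                      # outside the sphere's shadow
            s = math.sqrt(s2)
            # top is the sphere, bottom is the cone z = s / b
            height = math.sqrt(a * a - s2) - s / b
            if height > 0.0:
                quarter += height * h * h
    return 4.0 * quarter                      # four quadrants by symmetry

def formula(a, b):
    return 2.0 * math.pi * a ** 3 / 3.0 * (1.0 - 1.0 / math.sqrt(1.0 + b * b))

print(ice_cream_volume(1.0, 1.0), formula(1.0, 1.0))
```

The grid sum and the closed form should agree to roughly three decimal places for these sample values.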
Note that, as in Example 3.2.11, we can easily apply a couple of sanity checks to our answer.
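If b = 0, so that the cone is just x² + y² = 0, which is the line x = y = 0, the total volume should be zero. Our answer
does indeed give 0 in this case.
In the limit b → ∞ the cone opens up into the half ball z ≥ 0, whose volume is (1/2)(4/3)πa³ = (2/3)πa³. Our answer
does indeed tend to (2/3)πa³ in this limit.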
First slice the first octant part of the remaining apple into segments by inserting many planes of constant θ, with the various
values of θ differing by dθ. The leftmost segment has, essentially, θ = 0 and the rightmost segment has, essentially, θ = π/2.
Subdivide it into long thin searchlights by inserting many cones of constant φ, with the various values of φ differing by
dφ. The figure below shows one searchlight outlined in blue. Each searchlight
has θ and φ essentially constant on the searchlight.
The first searchlight has, essentially, φ = arcsin(b/a) and the last searchlight has, essentially, φ = π/2.
Concentrate on any one searchlight. Subdivide it into tiny approximate cubes by inserting many spheres of constant ρ, with
the various values of ρ differing by dρ. The figure on the left below shows the side view of one approximate cube in black.
Each cube
has ρ, θ and φ all essentially constant on the cube and
has volume dV = ρ² sin φ dρ dθ dφ, by 3.7.3.
The figure on the right below gives an expanded view of the searchlight. From it, we see (after a little trig) that the first
cube has, essentially, ρ = b/sin φ and the last cube has, essentially, ρ = a (the radius of the apple).
To get the volume of any one searchlight, say the searchlight whose φ coordinate runs from φ to φ + dφ, we just add up the
volumes of the approximate cubes in that searchlight, by integrating ρ from its smallest value on the searchlight, namely
b/sin φ, to its largest value on the searchlight, namely a. The volume of the searchlight is thus
\[ d\theta\, d\varphi \int_{b/\sin\varphi}^{a} d\rho\; \rho^2 \sin\varphi \]
To get the volume of any one segment, say the segment whose θ coordinate runs from θ to θ + dθ, we just add up the
volumes of the searchlights in that segment, by integrating φ from its smallest value on the segment, namely arcsin(b/a), to
its largest value on the segment, namely π/2. The volume of the segment is thus
\[ d\theta \int_{\arcsin\frac{b}{a}}^{\pi/2} d\varphi \int_{b/\sin\varphi}^{a} d\rho\; \rho^2 \sin\varphi \]
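To get the volume of the remaining part of the apple in the first octant, we just add up the volumes of the segments that it
contains, by integrating θ from its smallest value in the octant, namely 0, to its largest value on the octant, namely π/2. The
volume in the first octant is thus
\[
\begin{aligned}
\text{Volume}(V_1) &= \int_0^{\pi/2} d\theta \int_{\arcsin\frac{b}{a}}^{\pi/2} d\varphi \int_{b/\sin\varphi}^{a} d\rho\; \rho^2 \sin\varphi \\
&= \frac{1}{3}\int_0^{\pi/2} d\theta \int_{\arcsin\frac{b}{a}}^{\pi/2} d\varphi\; \big[a^3\sin\varphi - b^3\csc^2\varphi\big] \\
&= \frac{1}{3}\int_0^{\pi/2} d\theta\; \Big[-a^3\cos\varphi + b^3\cot\varphi\Big]_{\arcsin\frac{b}{a}}^{\pi/2}
\qquad \text{since } \int \csc^2\varphi\, d\varphi = -\cot\varphi + C \\
&= \frac{\pi}{6}\Big[-a^3\cos\varphi + b^3\cot\varphi\Big]_{\arcsin\frac{b}{a}}^{\pi/2}
\end{aligned}
\]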
Now cos(π/2) = cot(π/2) = 0 and, if we write α = arcsin(b/a),
\[ \text{Volume}(V_1) = \frac{\pi}{6}\big[a^3\cos\alpha - b^3\cot\alpha\big] \]
Since sin α = b/a, we have cos α = √(a² − b²)/a and cot α = √(a² − b²)/b. So
\[
\text{Volume}(V_1) = \frac{\pi}{6}\Big[a^2\sqrt{a^2-b^2} - b^2\sqrt{a^2-b^2}\Big] = \frac{\pi}{6}\big[a^2-b^2\big]^{3/2}
\]
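As an independent check, the same first octant volume can be computed in cylindrical coordinates. The sketch below assumes the hole of radius b is drilled along the z-axis of an apple of radius a, so that the remaining first octant part is b ≤ r ≤ a, 0 ≤ θ ≤ π/2, 0 ≤ z ≤ √(a² − r²). The function name `apple_octant_volume` is illustrative.

```python
import math

def apple_octant_volume(a, b, n=2000):
    """Midpoint sum of sqrt(a^2 - r^2) * r dr over b <= r <= a, times pi/2 for theta."""
    dr = (a - b) / n
    total = 0.0
    for k in range(n):
        r = b + (k + 0.5) * dr
        # the column above the ring of radius r has height sqrt(a^2 - r^2)
        total += math.sqrt(a * a - r * r) * r * dr
    return total * math.pi / 2.0

a, b = 2.0, 1.0
print(apple_octant_volume(a, b), math.pi / 6.0 * (a * a - b * b) ** 1.5)
```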
We can, yet again, apply the sanity checks of Example 3.2.11 to our answer.
If the radius of the drill bit b = 0, no apple is removed at all. So the total volume remaining should be (4/3)πa³. Our answer
gives Volume(V₁) = (π/6)a³ for the first octant, and eight times that is (4/3)πa³, as it should be.
Exercises
Stage 1
1
3. Draw φ = . π
4. Draw φ = . 3π
5. Draw φ = π.
2
4. ρ = 1, θ = , φ =
π
3
π
5. ρ = 1, θ = , φ =
π
2
π
6. ρ = 1, θ = , φ =
π
3
π
3
3. (0, 0, −4)
4. (−1/√2, 1/√2, √3)
4
3
, φ =
π
2. ρ = 2, θ = π
2
, φ =
π
5
2. x² + y² + (z − 1)² = 1
3. x² + y² = 4
6. ✳
Using spherical coordinates and integration, show that the volume of the sphere of radius 1 centred at the origin is 4π/3.
Stage 2
7. ✳
1 ≤ ρ ≤ 1 + cos φ
1. Draw a reasonably accurate picture of E in 3-dimensions. Be sure to show the units on the coordinate axes.
2. Find the volume of E.
8. ✳
\[ I = \iiint_D z\, dV \]
where D is the solid enclosed by the cone z = √(x² + y²) and the sphere x² + y² + z² = 4. That is, (x, y, z) is in D if and
only if √(x² + y²) ≤ z and x² + y² + z² ≤ 4.
9
Use spherical coordinates to find
1. The volume inside the cone z = √(x² + y²) and inside the sphere x² + y² + z² = a².
2. ∭_R x dV and ∭_R z dV over the part of the sphere of radius a that lies in the first octant.
3. The mass of a spherical planet of radius a whose density at distance ρ from the center is δ = A/(B + ρ²).
4. The volume enclosed by ρ = a(1 − cos φ). Here ρ and φ refer to the usual spherical coordinates.
10. ✳
and above the plane z = 0. Let the shell have constant density D.
1. Find the mass of the shell.
2. Find the location of the center of mass of the shell.
11. ✳
Let
\[ I = \iiint_T xz\, dV \]
12. ✳
Evaluate W = ∭_Q xz dV, where Q is an eighth of the sphere x² + y² + z² ≤ 9 with x, y, z ≥ 0.
13. ✳
Evaluate \(\iiint_{\mathbb{R}^3} \big[1 + (x^2+y^2+z^2)^3\big]^{-1}\, dV\).
14. ✳
Evaluate
\[
\int_{-1}^{1} \int_{-\sqrt{1-x^2}}^{\sqrt{1-x^2}} \int_{1-\sqrt{1-x^2-y^2}}^{1+\sqrt{1-x^2-y^2}} (x^2+y^2+z^2)^{5/2}\; dz\, dy\, dx
\]
15
Evaluate the volume of a circular cylinder of radius a and height h by means of an integral in spherical coordinates.
16. ✳
Let B denote the region inside the sphere x² + y² + z² = 4 and above the cone x² + y² = z². Compute the moment of
inertia
\[ \iiint_B z^2\, dV \]
17. ✳
1. Evaluate ∭_Ω z dV where Ω is the three dimensional region in the first octant x ≥ 0, y ≥ 0, z ≥ 0, occupying the inside
of the sphere x² + y² + z² = 1.
2. Use the result in part (a) to quickly determine the centroid of a hemispherical ball given by z ≥ 0, x² + y² + z² ≤ 1.
18. ✳
Consider the top half of a ball of radius 2 centred at the origin. Suppose that the ball has variable density equal to 9z units of
mass per unit volume.
1. Set up a triple integral giving the mass of this half-ball.
2. Find out what fraction of that mass lies inside the cone z = √(x² + y²).
Stage 3
19. ✳
20. ✳
A certain solid V is a right-circular cylinder. Its base is the disk of radius 2 centred at the origin in the xy-plane. It has height 2
and density √(x² + y²).
A smaller solid U is obtained by removing the inverted cone, whose base is the top surface of V and whose vertex is the point
(0, 0, 0).
21. ✳
A solid is bounded below by the cone z = √(x² + y²) and above by the sphere x² + y² + z² = 2. It has density
δ(x, y, z) = x² + y².
1. Express the mass M of the solid as a triple integral, with limits, in cylindrical coordinates.
2. Same as (a) but in spherical coordinates.
3. Evaluate M .
22. ✳
Let
\[ I = \iiint_E xz\, dV \]
3. Evaluate I by any method.
23. ✳
Let
\[ I = \iiint_T (x^2+y^2)\, dV \]
where T is the solid region bounded below by the cone z = √(3x² + 3y²) and above by the sphere x² + y² + z² = 9.
24. ✳
25. ✳
3
3
2. We can also calculate the volume of the snowman as a sum of the following triple integrals:
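1. \(\displaystyle \int_0^{2\pi/3} \int_0^{2\pi} \int_0^{2} \rho^2 \sin\varphi\; d\rho\, d\theta\, d\varphi\)
2. \(\displaystyle \int_0^{2\pi} \int_0^{\sqrt{3}} \int_{\sqrt{3}\,r}^{4-\frac{r}{\sqrt{3}}} r\; dz\, dr\, d\theta\)
3. \(\displaystyle \int_{\pi/6}^{\pi} \int_0^{2\pi} \int_0^{2\sqrt{3}} \rho^2 \sin(\varphi)\; d\rho\, d\theta\, d\varphi\)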
Circle the right answer from the underlined choices and fill in the blanks in the following descriptions of the region of
integration for each integral. [Note: We have translated the axes in order to write down some of the integrals above. The
equations you specify should be those before the translation is performed.]
1. The region of integration in (1) is a part of the snowman's
sphere / cone
–––––––––––––
and the
sphere / cone
–––––––––––––
sphere / cone
–––––––––––––
and the
sphere / cone
–––––––––––––
sphere / cone
–––––––––––––
and the
sphere / cone
–––––––––––––
26. ✳
1. Find the volume of the solid inside the surface defined by the equation ρ = 8 sin(φ) in spherical coordinates.
You may use that
\[ \int \sin^4(\varphi)\, d\varphi = \frac{1}{32}\big(12\varphi - 8\sin(2\varphi) + \sin(4\varphi)\big) + C \]
27. ✳
and consider the integral
\[ I = \iiint_E z\sqrt{x^2+y^2+z^2}\; dV. \]
28. ✳
29. ✳
The solid E is bounded below by the paraboloid z = x² + y² and above by the cone z = √(x² + y²). Let
\[ I = \iiint_E z\,(x^2+y^2+z^2)\, dV \]
30. ✳
Let S be the region in the first octant (so that x, y, z ≥ 0) which lies above the cone z = √(x² + y²) and below the sphere
(z − 1)² + x² + y² = 1.
Let V be its volume.
1. Express V as a triple integral in cylindrical coordinates.
2. Express V as a triple integral in spherical coordinates.
3. Calculate V using either of the integrals above.
31. ✳
A solid is bounded below by the cone z = √(3x² + 3y²) and above by the sphere x² + y² + z² = 9. It has density
δ(x, y, z) = x² + y².
1. We are using the standard mathematics conventions for the spherical coordinates. Under the ISO conventions they are (r, ϕ, θ).
See Appendix A.7.
2. and with the sign of x being the same as the sign of cos θ
3. You know the drill.
4. The problem of finding a practical, reliable method for determining the longitude of a ship at sea was a very big deal for a
period of several centuries. Among the scientists who worked on this were Galileo, Edmund Halley (of Halley's comet) and
Robert Hooke (of Hooke's law).
5. A very mathematical ice cream. Rocky-rho'd? Choculus?
This page titled 3.7: Triple Integrals in Spherical Coordinates is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or
curated by Joel Feldman, Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts
platform; a detailed edit history is available upon request.
3.8: Optional— Integrals in General Coordinates
One of the most important tools used in dealing with single variable integrals is the change of variable (substitution) rule
Equation 3.8.1
\[ x = f(u) \qquad dx = f'(u)\, du \]
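See Theorems 1.4.2 and 1.4.6 in the CLP-2 text. Expressing multivariable integrals using polar or cylindrical or spherical
coordinates is really a multivariable substitution. For example, switching to spherical coordinates amounts to replacing the
coordinates x, y, z with the coordinates ρ, θ, φ by using the substitution
\[ \mathbf{X} = \mathbf{r}(\rho,\theta,\varphi) \qquad dx\, dy\, dz = \rho^2 \sin\varphi\; d\rho\, d\theta\, d\varphi \]
where
\[ \mathbf{X} = \langle x, y, z \rangle \qquad \mathbf{r}(\rho,\theta,\varphi) = \langle \rho\cos\theta\sin\varphi\,,\ \rho\sin\theta\sin\varphi\,,\ \rho\cos\varphi \rangle \]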
We'll now derive a generalization of the substitution rule 3.8.1 to two dimensions. It will include polar coordinates as a special
case. Later, we'll state (without proof) its generalization to three dimensions. It will include cylindrical and spherical coordinates as
special cases.
Suppose that we wish to integrate over a region, R, in ℝ² and that we also wish (see footnote 1) to use two new coordinates, that we'll call u and
v, in place of x and y. The new coordinates u, v are related to the old coordinates x, y, by the functions
x = x(u, v)
y = y(u, v)
To make formulae more compact, we'll define the vector valued function r(u, v) by
\[ \mathbf{r}(u,v) = \big( x(u,v)\,,\ y(u,v) \big) \]
As an example, if the new coordinates are polar coordinates, with r renamed to u and θ renamed to v, then x(u, v) = u cos v and
y(u, v) = u sin v.
Note that if we hold v fixed and vary u, then r(u, v) sweeps out a curve. For example, if x(u, v) = u cos v and y(u, v) = u sin v, then, if
we hold v fixed and vary u, r(u, v) sweeps out a straight line (that makes the angle v with the x-axis), while, if we hold u > 0
fixed and vary v, r(u, v) sweeps out a circle (of radius u centred on the origin).
We start by cutting R (the shaded region in the figure below) up into small pieces by drawing a bunch of curves of constant u (the
blue curves in the figure below) and a bunch of curves of constant v (the red curves in the figure below).
Concentrate on any one of the small pieces. Here is a greatly magnified sketch.
3.8.1 https://math.libretexts.org/@go/page/92247
For example, the lower red curve was constructed by holding v fixed at the value v₀, varying u and sketching r(u, v₀), and the
upper red curve was constructed by holding v fixed at the slightly larger value v₀ + dv, varying u and sketching r(u, v₀ + dv). So
the four intersection points in the figure are
\[
\begin{aligned}
P_0 &= \mathbf{r}(u_0, v_0) & P_1 &= \mathbf{r}(u_0+du, v_0) \\
P_2 &= \mathbf{r}(u_0, v_0+dv) & P_3 &= \mathbf{r}(u_0+du, v_0+dv)
\end{aligned}
\]
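Now, for any small constants dU and dV, we have the linear approximation (see footnote 3)
\[
\mathbf{r}(u_0+dU, v_0+dV) \approx \mathbf{r}(u_0,v_0) + \frac{\partial \mathbf{r}}{\partial u}(u_0,v_0)\, dU + \frac{\partial \mathbf{r}}{\partial v}(u_0,v_0)\, dV
\]
Applying this three times, once with dU = du, dV = 0 (to approximate P₁), once with dU = 0, dV = dv (to approximate P₂),
and once with dU = du, dV = dv (to approximate P₃),
\[
\begin{aligned}
P_0 &= \mathbf{r}(u_0,v_0) \\
P_1 &= \mathbf{r}(u_0+du, v_0) \approx \mathbf{r}(u_0,v_0) + \frac{\partial \mathbf{r}}{\partial u}(u_0,v_0)\, du \\
P_2 &= \mathbf{r}(u_0, v_0+dv) \approx \mathbf{r}(u_0,v_0) + \frac{\partial \mathbf{r}}{\partial v}(u_0,v_0)\, dv \\
P_3 &= \mathbf{r}(u_0+du, v_0+dv) \approx \mathbf{r}(u_0,v_0) + \frac{\partial \mathbf{r}}{\partial u}(u_0,v_0)\, du + \frac{\partial \mathbf{r}}{\partial v}(u_0,v_0)\, dv
\end{aligned}
\]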
We have dropped all Taylor expansion terms that are of degree two or higher in du, dv. The reason is that, in defining the integral,
we take the limit du, dv → 0. Because of that limit, all of the dropped terms contribute exactly 0 to the integral. We shall not prove
this. But we shall show, in the optional §3.8.1, why this is the case.
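The small piece of R with corners P₀, P₁, P₂, P₃ is approximately a parallelogram with sides
\[
\begin{aligned}
\overrightarrow{P_0P_1} \approx \overrightarrow{P_2P_3} &\approx \frac{\partial \mathbf{r}}{\partial u}(u_0,v_0)\, du
= \Big\langle \frac{\partial x}{\partial u}(u_0,v_0)\,,\ \frac{\partial y}{\partial u}(u_0,v_0) \Big\rangle\, du \\
\overrightarrow{P_0P_2} \approx \overrightarrow{P_1P_3} &\approx \frac{\partial \mathbf{r}}{\partial v}(u_0,v_0)\, dv
= \Big\langle \frac{\partial x}{\partial v}(u_0,v_0)\,,\ \frac{\partial y}{\partial v}(u_0,v_0) \Big\rangle\, dv
\end{aligned}
\]
Here the notation, for example, \(\overrightarrow{P_0P_1}\) refers to the vector whose tail is at the point P₀ and whose head is at the point P₁. Recall,
from 1.2.17 that
\[
\text{area of parallelogram with sides } \langle a,b\rangle \text{ and } \langle c,d\rangle
= \left| \det\begin{bmatrix} a & b \\ c & d \end{bmatrix} \right| = |ad - bc|
\]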
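Equation 3.8.2
\[
dA = \left| \det\begin{bmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial y}{\partial u} \\[2ex] \dfrac{\partial x}{\partial v} & \dfrac{\partial y}{\partial v} \end{bmatrix} \right| du\, dv
\]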
Recall that det M denotes the determinant of the matrix M . Also recall that we don't really need determinants for this text, though
it does make for nice compact notation.
The formula (3.8.2) is the heart of the following theorem, which tells us how to translate an integral in one coordinate system into
an integral in another coordinate system.
Theorem 3.8.3
Let the functions x(u, v) and y(u, v) have continuous first partial derivatives and let the function f (x, y) be continuous.
Assume that x = x(u, v), y = y(u, v) provides a one-to-one correspondence between the points (u, v) of the region U in the
uv-plane and the points (x, y) of the region R in the xy-plane. Then
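\[
\iint_R f(x,y)\, dx\, dy = \iint_U f\big(x(u,v)\,,\ y(u,v)\big)
\left| \det\begin{bmatrix} \dfrac{\partial x}{\partial u}(u,v) & \dfrac{\partial y}{\partial u}(u,v) \\[2ex] \dfrac{\partial x}{\partial v}(u,v) & \dfrac{\partial y}{\partial v}(u,v) \end{bmatrix} \right| du\, dv
\]
The determinant
\[
\det\begin{bmatrix} \dfrac{\partial x}{\partial u}(u,v) & \dfrac{\partial y}{\partial u}(u,v) \\[2ex] \dfrac{\partial x}{\partial v}(u,v) & \dfrac{\partial y}{\partial v}(u,v) \end{bmatrix}
\]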
We'll start with a pretty trivial example in which we simply rename x to Y and y to X. That is
x(X, Y ) = Y
y(X, Y ) = X
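Since
\[
\begin{aligned}
\frac{\partial x}{\partial X} &= 0 & \frac{\partial y}{\partial X} &= 1 \\
\frac{\partial x}{\partial Y} &= 1 & \frac{\partial y}{\partial Y} &= 0
\end{aligned}
\]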
x(r, θ) = r cos θ
y(r, θ) = r sin θ
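Since
\[
\begin{aligned}
\frac{\partial x}{\partial r} &= \cos\theta & \frac{\partial y}{\partial r} &= \sin\theta \\
\frac{\partial x}{\partial \theta} &= -r\sin\theta & \frac{\partial y}{\partial \theta} &= r\cos\theta
\end{aligned}
\]
(3.8.2) gives
\[
dA = \left| \det\begin{bmatrix} \cos\theta & \sin\theta \\ -r\sin\theta & r\cos\theta \end{bmatrix} \right| dr\, d\theta
= (r\cos^2\theta + r\sin^2\theta)\, dr\, d\theta
= r\, dr\, d\theta
\]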
Example 3.8.6. dA for Parabolic Coordinates
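Parabolic coordinates (see footnote 5) are defined by
\[ x(u,v) = \frac{u^2 - v^2}{2} \qquad y(u,v) = uv \]
Since
\[
\begin{aligned}
\frac{\partial x}{\partial u} &= u & \frac{\partial y}{\partial u} &= v \\
\frac{\partial x}{\partial v} &= -v & \frac{\partial y}{\partial v} &= u
\end{aligned}
\]
(3.8.2) gives
\[
dA = \left| \det\begin{bmatrix} u & v \\ -v & u \end{bmatrix} \right| du\, dv = (u^2+v^2)\, du\, dv
\]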
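The area factor in (3.8.2) can be estimated numerically for any smooth change of variables, which gives a quick way to catch algebra slips. The sketch below approximates the four partial derivatives by central differences and compares the result with r for polar coordinates and with u² + v² for parabolic coordinates. The helper name `area_factor`, the step size `eps` and the sample points are illustrative assumptions.

```python
import math

def area_factor(x, y, u, v, eps=1e-6):
    """Approximate |det [[x_u, y_u], [x_v, y_v]]| at (u, v) using central differences."""
    x_u = (x(u + eps, v) - x(u - eps, v)) / (2 * eps)
    y_u = (y(u + eps, v) - y(u - eps, v)) / (2 * eps)
    x_v = (x(u, v + eps) - x(u, v - eps)) / (2 * eps)
    y_v = (y(u, v + eps) - y(u, v - eps)) / (2 * eps)
    return abs(x_u * y_v - y_u * x_v)

# polar coordinates at r = 2, theta = 0.7: the factor should be close to r = 2
polar = area_factor(lambda r, t: r * math.cos(t), lambda r, t: r * math.sin(t), 2.0, 0.7)

# parabolic coordinates at u = 1.5, v = -0.5: the factor should be close to u^2 + v^2 = 2.5
parabolic = area_factor(lambda u, v: (u * u - v * v) / 2, lambda u, v: u * v, 1.5, -0.5)

print(polar, parabolic)
```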
In practice applying the change of variables Theorem 3.8.3 can be quite tricky. Here is just one simple (and rigged) example.
Example 3.8.7
Evaluate
\[ \iint_R \frac{y}{1+x}\, dx\, dy \qquad \text{where } R = \{ (x,y)\ |\ 0 \le x \le 1,\ 1+x \le y \le 2+2x \} \]
Solution
We can simplify the integrand considerably by making the change of variables
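\[
\begin{aligned}
s &= x & x &= s \\
t &= \frac{y}{1+x} & y &= t(1+x) = t(1+s)
\end{aligned}
\]
Of course to evaluate the given integral by applying Theorem 3.8.3 we also need to know
the domain of integration in terms of s and t and
dx dy in terms of ds dt.
By (3.8.2), recalling that x(s, t) = s and y(s, t) = t(1 + s),
\[
dx\, dy = \left| \det\begin{bmatrix} \dfrac{\partial x}{\partial s} & \dfrac{\partial y}{\partial s} \\[2ex] \dfrac{\partial x}{\partial t} & \dfrac{\partial y}{\partial t} \end{bmatrix} \right| ds\, dt
= \left| \det\begin{bmatrix} 1 & t \\ 0 & 1+s \end{bmatrix} \right| ds\, dt
= (1+s)\, ds\, dt
\]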
To determine what the change of variables does to the domain of integration, we'll sketch R and then reexpress the boundary
of R in terms of the new coordinates s and t. Here is the sketch of R in the original coordinates (x, y).
The region R is a quadrilateral. It has four sides.
The left side is part of the line x = 0. Recall that x = s. So, in terms of s and t, this line is s = 0.
The right side is part of the line x = 1. In terms of s and t, this line is s = 1.
The bottom side is part of the line y = 1 + x, or y/(1+x) = 1. Recall that t = y/(1+x). So, in terms of s and t, this line is t = 1.
The top side is part of the line y = 2(1 + x), or y/(1+x) = 2. In terms of s and t, this line is t = 2.
Here is another copy of the sketch of R. But this time the equations of its four sides are expressed in terms of s and t.
{(s, t)|0 ≤ s ≤ 1, 1 ≤ t ≤ 2}
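As dx dy = (1 + s) ds dt and the integrand y/(1 + x) = t, the integral is, by Theorem 3.8.3,
\[
\begin{aligned}
\iint_R \frac{y}{1+x}\,dx\,dy &= \int_0^1 ds \int_1^2 dt\,(1+s)\,t = \int_0^1 ds\,(1+s)\left[\frac{t^2}{2}\right]_1^2 \\
&= \frac{3}{2}\left[s + \frac{s^2}{2}\right]_0^1 \\
&= \frac{3}{2}\times\frac{3}{2} \\
&= \frac{9}{4}
\end{aligned}
\]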
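The value 9/4 can be cross-checked by evaluating the integral numerically in both coordinate systems. This is only a sketch under stated assumptions: the function names `integral_xy` and `integral_st` are hypothetical, and the midpoint rule is used because it is easy to write down, not because the text prescribes it.

```python
def integral_xy(n=200):
    """Midpoint rule for y/(1+x) over 0 <= x <= 1, 1+x <= y <= 2+2x."""
    total = 0.0
    dx = 1.0 / n
    for i in range(n):
        x = (i + 0.5) * dx
        y_lo, y_hi = 1 + x, 2 + 2 * x
        dy = (y_hi - y_lo) / n
        for j in range(n):
            y = y_lo + (j + 0.5) * dy
            total += y / (1 + x) * dx * dy
    return total


def integral_st(n=200):
    """Midpoint rule for t*(1+s) over 0 <= s <= 1, 1 <= t <= 2."""
    total = 0.0
    ds = 1.0 / n
    dt = 1.0 / n
    for i in range(n):
        s = (i + 0.5) * ds
        for j in range(n):
            t = 1 + (j + 0.5) * dt
            # integrand t times the area factor (1 + s)
            total += t * (1 + s) * ds * dt
    return total


print(integral_xy(), integral_st())
```

Both sums should come out very close to 2.25, which illustrates that the factor (1 + s) is exactly what is needed to make the two integrals agree.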
There are natural generalizations of (3.8.2) and Theorem 3.8.3 to three (and also to higher) dimensions, that are derived in precisely
the same way as (3.8.2) was derived. The derivation is based on the fact, discussed in the optional Section 1.2.4, that the volume of
the parallelepiped (three dimensional parallelogram)
determined by the three vectors a = ⟨a1, a2, a3⟩, b = ⟨b1, b2, b3⟩ and c = ⟨c1, c2, c3⟩ is given by the formula
\[ \text{volume of parallelepiped with edges } \mathbf{a}, \mathbf{b}, \mathbf{c} = \left|\det\begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix}\right| \]
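If we use
\[ x = x(u,v,w) \qquad y = y(u,v,w) \qquad z = z(u,v,w) \]
Equation 3.8.8
\[ dV = \left|\det\begin{bmatrix} \frac{\partial x}{\partial u} & \frac{\partial y}{\partial u} & \frac{\partial z}{\partial u} \\ \frac{\partial x}{\partial v} & \frac{\partial y}{\partial v} & \frac{\partial z}{\partial v} \\ \frac{\partial x}{\partial w} & \frac{\partial y}{\partial w} & \frac{\partial z}{\partial w} \end{bmatrix}\right| du\,dv\,dw \]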
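\[ x(r,\theta,z) = r\cos\theta \qquad y(r,\theta,z) = r\sin\theta \qquad z(r,\theta,z) = z \]
Since
\[
\begin{array}{lll}
\dfrac{\partial x}{\partial r} = \cos\theta & \dfrac{\partial y}{\partial r} = \sin\theta & \dfrac{\partial z}{\partial r} = 0 \\[2ex]
\dfrac{\partial x}{\partial \theta} = -r\sin\theta & \dfrac{\partial y}{\partial \theta} = r\cos\theta & \dfrac{\partial z}{\partial \theta} = 0 \\[2ex]
\dfrac{\partial x}{\partial z} = 0 & \dfrac{\partial y}{\partial z} = 0 & \dfrac{\partial z}{\partial z} = 1
\end{array}
\]
\[
\begin{aligned}
dV &= \left|\cos\theta\,\det\begin{bmatrix} r\cos\theta & 0 \\ 0 & 1 \end{bmatrix} - \sin\theta\,\det\begin{bmatrix} -r\sin\theta & 0 \\ 0 & 1 \end{bmatrix} + 0\,\det\begin{bmatrix} -r\sin\theta & r\cos\theta \\ 0 & 0 \end{bmatrix}\right| dr\,d\theta\,dz \\
&= (r\cos^2\theta + r\sin^2\theta)\,dr\,d\theta\,dz \\
&= r\,dr\,d\theta\,dz
\end{aligned}
\]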
Example 3.8.10. dV for Spherical Coordinates
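Spherical coordinates have
\[ z(\rho,\theta,\varphi) = \rho\cos\varphi \]
Since
\[
\begin{array}{lll}
\dfrac{\partial x}{\partial \rho} = \cos\theta\sin\varphi & \dfrac{\partial y}{\partial \rho} = \sin\theta\sin\varphi & \dfrac{\partial z}{\partial \rho} = \cos\varphi \\[2ex]
\dfrac{\partial x}{\partial \theta} = -\rho\sin\theta\sin\varphi & \dfrac{\partial y}{\partial \theta} = \rho\cos\theta\sin\varphi & \dfrac{\partial z}{\partial \theta} = 0 \\[2ex]
\dfrac{\partial x}{\partial \varphi} = \rho\cos\theta\cos\varphi & \dfrac{\partial y}{\partial \varphi} = \rho\sin\theta\cos\varphi & \dfrac{\partial z}{\partial \varphi} = -\rho\sin\varphi
\end{array}
\]
\[
\begin{aligned}
dV &= \left|\cos\theta\sin\varphi\,\det\begin{bmatrix} \rho\cos\theta\sin\varphi & 0 \\ \rho\sin\theta\cos\varphi & -\rho\sin\varphi \end{bmatrix} + \cdots\right| d\rho\,d\theta\,d\varphi \\
&= \rho^2\sin\varphi\,d\rho\,d\theta\,d\varphi
\end{aligned}
\]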
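Both volume factors can be sanity-checked numerically from Equation 3.8.8. The sketch below is illustrative only. The helper names are hypothetical, the partial derivatives are approximated by central differences, and the formulas x = ρ cos θ sin φ, y = ρ sin θ sin φ used for spherical coordinates are an assumption inferred from the partial derivatives listed above.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def jacobian_rows(mapping, point, step=1e-6):
    """Row k holds the partial derivatives of (x, y, z) with respect to the
    k-th coordinate, estimated by central differences."""
    rows = []
    for k in range(3):
        plus, minus = list(point), list(point)
        plus[k] += step
        minus[k] -= step
        f_plus, f_minus = mapping(*plus), mapping(*minus)
        rows.append(tuple((p - m) / (2 * step) for p, m in zip(f_plus, f_minus)))
    return rows


def cylindrical(r, theta, z):
    return r * math.cos(theta), r * math.sin(theta), z


def spherical(rho, theta, phi):
    return (rho * math.cos(theta) * math.sin(phi),
            rho * math.sin(theta) * math.sin(phi),
            rho * math.cos(phi))


# Expect r for cylindrical and rho^2 sin(phi) for spherical coordinates.
print(abs(det3(jacobian_rows(cylindrical, (1.7, 0.9, -2.0)))), 1.7)
print(abs(det3(jacobian_rows(spherical, (2.0, 0.4, 1.1)))), 4.0 * math.sin(1.1))
```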
where E1 is bounded⁶ by a constant times (du)² and E2 is bounded by a constant times (dv)². That is, we assumed that we could just ignore the errors and drop E1 and E2 by setting them to zero.
So we approximated
\[
\begin{aligned}
\left|\overrightarrow{P_0P_1}\times\overrightarrow{P_0P_2}\right|
&= \left|\left[\frac{\partial\mathbf{r}}{\partial u}(u_0,v_0)\,du + \mathbf{E}_1\right]\times\left[\frac{\partial\mathbf{r}}{\partial v}(u_0,v_0)\,dv + \mathbf{E}_2\right]\right| \\
&= \left|\frac{\partial\mathbf{r}}{\partial u}(u_0,v_0)\,du\times\frac{\partial\mathbf{r}}{\partial v}(u_0,v_0)\,dv + \mathbf{E}_3\right| \\
&\approx \left|\frac{\partial\mathbf{r}}{\partial u}(u_0,v_0)\,du\times\frac{\partial\mathbf{r}}{\partial v}(u_0,v_0)\,dv\right|
\end{aligned}
\]
where the length of the vector E3 is bounded by a constant times (du)² dv + du (dv)². We'll now see why dropping terms like E3 does not change the value of the integral at all⁷. Suppose that our domain of integration consists of all (u, v)'s in a rectangle of width W and height H, as in the figure below.
Subdivide the rectangle into a grid of n × n small subrectangles by drawing lines of constant v (the red lines in the figure) and lines of constant u (the blue lines in the figure). Each subrectangle has width du = W/n and height dv = H/n. Now suppose that in setting up the integral we make, for each subrectangle, an error that is bounded by some constant times
\[ (du)^2\,dv + du\,(dv)^2 = \left(\frac{W}{n}\right)^2\frac{H}{n} + \frac{W}{n}\left(\frac{H}{n}\right)^2 = \frac{WH(W+H)}{n^3} \]
Because there are a total of n² subrectangles, the total error that we have introduced, for all of these subrectangles, is no larger than a constant times
\[ n^2\times\frac{WH(W+H)}{n^3} = \frac{WH(W+H)}{n} \]
When we define our integral by taking the limit n → ∞ of the Riemann sums, this error converges to exactly 0. As a consequence, it was safe for us to ignore the error terms when we established the change of variables formulae.
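The arithmetic behind this error estimate is easy to tabulate. The following sketch uses a hypothetical helper `total_error_bound` and arbitrarily chosen sample values W = 2 and H = 3. It multiplies the per-subrectangle bound by the number of subrectangles and shows the total shrinking like 1/n as the grid is refined.

```python
def total_error_bound(W, H, n):
    """n^2 subrectangles, each contributing at most (du)^2 dv + du (dv)^2."""
    du, dv = W / n, H / n
    per_cell = du ** 2 * dv + du * dv ** 2
    return n * n * per_cell


for n in (10, 100, 1000, 10000):
    # Second and third columns agree: W*H*(W+H)/n
    print(n, total_error_bound(2.0, 3.0, n), 2.0 * 3.0 * (2.0 + 3.0) / n)
```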
3. Recall 2.6.1.
4. It is not named after the Jacobin Club, a political movement of the French revolution. It is not named after the Jacobite
rebellions that took place in Great Britain and Ireland between 1688 and 1746. It is not named after the Jacobean era of English
and Scottish history. It is named after the German mathematician Carl Gustav Jacob Jacobi (1804 – 1851). He died from
smallpox.
5. The name comes from the fact that both the curves of constant u and the curves of constant v are parabolas.
6. Remember the error in the Taylor polynomial approximations. See 2.6.13 and 2.6.14.
7. See the optional § 1.1.6 of the CLP-2 text for an analogous argument concerning Riemann sums.
This page titled 3.8: Optional— Integrals in General Coordinates is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or
curated by Joel Feldman, Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts
platform; a detailed edit history is available upon request.
CHAPTER OVERVIEW
4: Appendices
A: Appendices
A.1: Trigonometry
A.2: Powers and Logarithms
A.3: Table of Derivatives
A.4: Table of Integrals
A.5: Table of Taylor Expansions
A.6: 3-D Coordinate Systems
A.7: ISO Coordinate System Notation
A.8: Conic Sections and Quadric Surfaces
B: Hints for Exercises
C: Answers to Exercises
D: Solutions to Exercises
A: Appendices
A.1: Trigonometry
A.1.1 Trigonometry — Graphs
\[ \tan\frac{\pi}{4} = 1 \qquad \tan\frac{\pi}{6} = \frac{1}{\sqrt{3}} \qquad \tan\frac{\pi}{3} = \sqrt{3} \]
Reflection
Rotation by π
Pythagoras
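\[ \sin^2\theta + \cos^2\theta = 1 \]
\[ \tan^2\theta + 1 = \sec^2\theta \]
\[ 1 + \cot^2\theta = \csc^2\theta \]
Cosine
Tangent
\[ \tan(\alpha+\beta) = \frac{\tan\alpha + \tan\beta}{1 - \tan\alpha\tan\beta} \]
\[ \tan(\alpha-\beta) = \frac{\tan\alpha - \tan\beta}{1 + \tan\alpha\tan\beta} \]
Double angle
\[ \sin(2\theta) = 2\sin(\theta)\cos(\theta) \]
\[
\begin{aligned}
\cos(2\theta) &= \cos^2(\theta) - \sin^2(\theta) \\
&= 2\cos^2(\theta) - 1 \\
&= 1 - 2\sin^2(\theta)
\end{aligned}
\]
\[ \tan(2\theta) = \frac{2\tan(\theta)}{1 - \tan^2\theta} \]
\[ \cos^2\theta = \frac{1 + \cos(2\theta)}{2} \]
\[ \sin^2\theta = \frac{1 - \cos(2\theta)}{2} \]
\[ \tan^2\theta = \frac{1 - \cos(2\theta)}{1 + \cos(2\theta)} \]
Products to sums
\[ \sin(\alpha)\cos(\beta) = \frac{\sin(\alpha+\beta) + \sin(\alpha-\beta)}{2} \]
\[ \sin(\alpha)\sin(\beta) = \frac{\cos(\alpha-\beta) - \cos(\alpha+\beta)}{2} \]
\[ \cos(\alpha)\cos(\beta) = \frac{\cos(\alpha-\beta) + \cos(\alpha+\beta)}{2} \]
Sums to products
\[ \sin\alpha + \sin\beta = 2\sin\frac{\alpha+\beta}{2}\cos\frac{\alpha-\beta}{2} \]
\[ \sin\alpha - \sin\beta = 2\cos\frac{\alpha+\beta}{2}\sin\frac{\alpha-\beta}{2} \]
\[ \cos\alpha + \cos\beta = 2\cos\frac{\alpha+\beta}{2}\cos\frac{\alpha-\beta}{2} \]
\[ \cos\alpha - \cos\beta = -2\sin\frac{\alpha+\beta}{2}\sin\frac{\alpha-\beta}{2} \]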
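Identities like these are easy to mistype, so a quick numerical spot check can be reassuring. The sketch below is not from the original appendix and the function name `max_identity_gap` is hypothetical. It evaluates both sides of the products-to-sums and sums-to-products identities at sample angles and reports the largest discrepancy, which should be at the level of floating-point rounding.

```python
import math


def max_identity_gap(alpha, beta):
    """Largest |left side - right side| over the listed identities."""
    s, c = math.sin, math.cos
    half_sum, half_diff = (alpha + beta) / 2, (alpha - beta) / 2
    pairs = [
        # products to sums
        (s(alpha) * c(beta), (s(alpha + beta) + s(alpha - beta)) / 2),
        (s(alpha) * s(beta), (c(alpha - beta) - c(alpha + beta)) / 2),
        (c(alpha) * c(beta), (c(alpha - beta) + c(alpha + beta)) / 2),
        # sums to products
        (s(alpha) + s(beta), 2 * s(half_sum) * c(half_diff)),
        (s(alpha) - s(beta), 2 * c(half_sum) * s(half_diff)),
        (c(alpha) + c(beta), 2 * c(half_sum) * c(half_diff)),
        (c(alpha) - c(beta), -2 * s(half_sum) * s(half_diff)),
    ]
    return max(abs(left - right) for left, right in pairs)


print(max_identity_gap(0.7, -1.3))
```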
A.1.5 Inverse Trigonometric Functions
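[Graphs of arcsin x, arccos x and arctan x]
Range: −π/2 ≤ arcsin x ≤ π/2
Range: 0 ≤ arccos x ≤ π
Range: −π/2 < arctan x < π/2
\[ \arccos(\cos\theta) = \theta \qquad 0 \le \theta \le \pi \]
\[ \arctan(\tan\theta) = \theta \qquad -\frac{\pi}{2} < \theta < \frac{\pi}{2} \]
and also
\[ \sin(\arcsin x) = x \qquad -1 \le x \le 1 \]
\[ \cos(\arccos x) = x \qquad -1 \le x \le 1 \]
[Graphs of arccsc x, arcsec x and arccot x]
Range: −π/2 ≤ arccsc x ≤ π/2
Again
\[ \operatorname{arccsc}(\csc\theta) = \theta \qquad -\frac{\pi}{2} \le \theta \le \frac{\pi}{2},\ \theta \ne 0 \]
\[ \operatorname{arcsec}(\sec\theta) = \theta \qquad 0 \le \theta \le \pi,\ \theta \ne \frac{\pi}{2} \]
and
\[ \csc(\operatorname{arccsc} x) = x \qquad |x| \ge 1 \]
\[ \sec(\operatorname{arcsec} x) = x \qquad |x| \ge 1 \]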
A.2: Powers and Logarithms
A.2.1 Powers
In the following, x and y are arbitrary real numbers, q is an arbitrary constant that is strictly bigger than zero and e is
2.7182818284, to ten decimal places.
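\[ e^0 = 1, \qquad q^0 = 1 \]
\[ e^{x+y} = e^x e^y, \qquad e^{x-y} = \frac{e^x}{e^y}, \qquad q^{x+y} = q^x q^y, \qquad q^{x-y} = \frac{q^x}{q^y} \]
\[ e^{-x} = \frac{1}{e^x}, \qquad q^{-x} = \frac{1}{q^x} \]
\[ (e^x)^y = e^{xy}, \qquad (q^x)^y = q^{xy} \]
\[ \frac{d}{dx}e^x = e^x, \qquad \frac{d}{dx}e^{g(x)} = g'(x)\,e^{g(x)}, \qquad \frac{d}{dx}q^x = (\ln q)\,q^x \]
\[ \int e^x\,dx = e^x + C, \qquad \int e^{ax}\,dx = \frac{1}{a}e^{ax} + C \quad\text{if } a \ne 0 \]
\[ e^x = \sum_{n=0}^{\infty}\frac{x^n}{n!} \]
\[ \lim_{x\to\infty} e^x = \infty, \qquad \lim_{x\to-\infty} e^x = 0 \]
\[ \lim_{x\to\infty} q^x = \infty, \qquad \lim_{x\to-\infty} q^x = 0 \quad\text{if } q > 1 \]
\[ \lim_{x\to\infty} q^x = 0, \qquad \lim_{x\to-\infty} q^x = \infty \quad\text{if } 0 < q < 1 \]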
A.2.2 Logarithms
In the following, x and y are arbitrary real numbers that are strictly bigger than 0 (except where otherwise specified), p and q are arbitrary constants that are strictly bigger than one, and e is 2.7182818284, to ten decimal places. The notation ln x means $\log_e x$. Some people use log x to mean $\log_{10} x$, others use it to mean $\log_e x$ and still others use it to mean $\log_2 x$.
\[ e^{\ln x} = x, \qquad q^{\log_q x} = x \]
\[ \ln(e^x) = x, \qquad \log_q(q^x) = x \quad\text{for all } -\infty < x < \infty \]
\[ \log_q x = \frac{\ln x}{\ln q}, \qquad \ln x = \frac{\log_p x}{\log_p e}, \qquad \log_q x = \frac{\log_p x}{\log_p q} \]
\[ \ln 1 = 0, \qquad \ln e = 1 \]
\[ \log_q 1 = 0, \qquad \log_q q = 1 \]
\[ \frac{d}{dx}\ln x = \frac{1}{x}, \qquad \frac{d}{dx}\log_q x = \frac{1}{x\ln q} \]
\[ \int \ln x\,dx = x\ln x - x + C, \qquad \int \log_q x\,dx = x\log_q x - \frac{x}{\ln q} + C \]
\[ \lim_{x\to\infty}\ln x = \infty, \qquad \lim_{x\to 0^+}\ln x = -\infty \]
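The change of base and derivative formulas for logarithms can be illustrated with a few lines of standard-library Python. This is a minimal sketch, not part of the original table: `log_base` and `central_diff` are hypothetical helper names, and the derivative is only approximated by a central difference.

```python
import math


def log_base(q, x):
    """log_q(x) computed from natural logarithms as ln x / ln q."""
    return math.log(x) / math.log(q)


def central_diff(f, x, h=1e-6):
    """Central-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


x, p, q = 5.0, 10.0, 2.0

# Change of base: log_q x = log_p x / log_p q
print(log_base(q, x), log_base(p, x) / log_base(p, q))

# Derivative: d/dx log_q x = 1 / (x ln q)
print(central_diff(lambda t: log_base(q, t), x), 1 / (x * math.log(q)))
```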
A.3: Table of Derivatives
Throughout this table, a and b are constants, independent of x.
\[
\begin{array}{ll}
F(x) & F'(x) = \dfrac{dF}{dx} \\[1ex]
\hline
af(x) + bg(x) & af'(x) + bg'(x) \\
f(x) + g(x) & f'(x) + g'(x) \\
f(x) - g(x) & f'(x) - g'(x) \\
af(x) & af'(x) \\
f(x)g(x) & f'(x)g(x) + f(x)g'(x) \\
f(x)g(x)h(x) & f'(x)g(x)h(x) + f(x)g'(x)h(x) + f(x)g(x)h'(x) \\
\dfrac{f(x)}{g(x)} & \dfrac{f'(x)g(x) - f(x)g'(x)}{g(x)^2} \\[1ex]
\dfrac{1}{g(x)} & -\dfrac{g'(x)}{g(x)^2} \\[1ex]
f(g(x)) & f'(g(x))\,g'(x) \\
\hline
a & 0 \\
x^a & ax^{a-1} \\
g(x)^a & a\,g(x)^{a-1}g'(x) \\
\sin x & \cos x \\
\sin g(x) & g'(x)\cos g(x) \\
\cos x & -\sin x \\
\cos g(x) & -g'(x)\sin g(x) \\
\tan x & \sec^2 x \\
\cot x & -\csc^2 x \\
e^x & e^x \\
e^{g(x)} & g'(x)\,e^{g(x)} \\
a^x & (\ln a)\,a^x \\
\hline
\ln x & \dfrac{1}{x} \\[1ex]
\ln g(x) & \dfrac{g'(x)}{g(x)} \\[1ex]
\log_a x & \dfrac{1}{x\ln a} \\[1ex]
\arcsin x & \dfrac{1}{\sqrt{1-x^2}} \\[1ex]
\arcsin g(x) & \dfrac{g'(x)}{\sqrt{1-g(x)^2}} \\[1ex]
\arccos x & -\dfrac{1}{\sqrt{1-x^2}} \\[1ex]
\arctan x & \dfrac{1}{1+x^2} \\[1ex]
\arctan g(x) & \dfrac{g'(x)}{1+g(x)^2} \\[1ex]
\operatorname{arccsc} x & -\dfrac{1}{|x|\sqrt{x^2-1}} \\[1ex]
\operatorname{arcsec} x & \dfrac{1}{|x|\sqrt{x^2-1}} \\[1ex]
\operatorname{arccot} x & -\dfrac{1}{1+x^2}
\end{array}
\]
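A table entry can be spot-checked by comparing the listed derivative with a finite-difference estimate. The following sketch is illustrative and makes a few assumptions: the names `central_diff`, `TABLE` and `max_table_error` are hypothetical, only a handful of rows are included, and arcsec x is computed as arccos(1/x), which matches the range 0 ≤ arcsec x ≤ π used in Appendix A.1.

```python
import math


def central_diff(f, x, h=1e-6):
    """Central-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


# (label, F, claimed F', sample point)
TABLE = [
    ("arcsin x", math.asin, lambda x: 1 / math.sqrt(1 - x * x), 0.3),
    ("arctan x", math.atan, lambda x: 1 / (1 + x * x), 1.7),
    ("arcsec x", lambda x: math.acos(1 / x),
     lambda x: 1 / (abs(x) * math.sqrt(x * x - 1)), -2.0),
    ("tan x", math.tan, lambda x: 1 / math.cos(x) ** 2, 0.5),
    ("3^x", lambda x: 3.0 ** x, lambda x: math.log(3.0) * 3.0 ** x, 1.2),
]


def max_table_error():
    """Largest gap between the numerical and the tabulated derivative."""
    return max(abs(central_diff(F, x) - dF(x)) for _, F, dF, x in TABLE)


for label, F, dF, x in TABLE:
    print(label, central_diff(F, x), dF(x))
```

The arcsec row is evaluated at a negative point on purpose, since that is where the absolute value in the table entry matters.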
A.4: Table of Integrals
Throughout this table, a and b are given constants, independent of x and C is an arbitrary constant.
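\[
\begin{array}{ll}
f(x) & \displaystyle\int f(x)\,dx \\[1ex]
\hline
af(x) & a\displaystyle\int f(x)\,dx + C \\[1ex]
u(x)v'(x) & u(x)v(x) - \displaystyle\int u'(x)v(x)\,dx + C \\[1ex]
f(y(x))\,y'(x) & F(y(x)) \text{ where } F(y) = \displaystyle\int f(y)\,dy \\[1ex]
a & ax + C \\
x^a & \dfrac{x^{a+1}}{a+1} + C \text{ if } a \ne -1 \\[1ex]
\dfrac{1}{x} & \ln|x| + C \\[1ex]
g(x)^a g'(x) & \dfrac{g(x)^{a+1}}{a+1} + C \text{ if } a \ne -1 \\[1ex]
\sin x & -\cos x + C \\
g'(x)\sin g(x) & -\cos g(x) + C \\
\cos x & \sin x + C \\
\tan x & \ln|\sec x| + C \\
\cot x & \ln|\sin x| + C \\
\sec^2 x & \tan x + C \\
\csc^2 x & -\cot x + C \\
e^x & e^x + C \\
e^{g(x)}g'(x) & e^{g(x)} + C \\
e^{ax} & \dfrac{1}{a}e^{ax} + C \\[1ex]
a^x & \dfrac{1}{\ln a}a^x + C \\[1ex]
\ln x & x\ln x - x + C \\
\dfrac{1}{\sqrt{1-x^2}} & \arcsin x + C \\[1ex]
\dfrac{g'(x)}{\sqrt{1-g(x)^2}} & \arcsin g(x) + C \\[1ex]
\dfrac{1}{\sqrt{a^2-x^2}} & \arcsin\dfrac{x}{a} + C \\[1ex]
\dfrac{1}{1+x^2} & \arctan x + C \\[1ex]
\dfrac{g'(x)}{1+g(x)^2} & \arctan g(x) + C \\[1ex]
\dfrac{1}{a^2+x^2} & \dfrac{1}{a}\arctan\dfrac{x}{a} + C \\[1ex]
\dfrac{1}{x\sqrt{x^2-1}} & \operatorname{arcsec} x + C \quad (x > 1)
\end{array}
\]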
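An antiderivative from the table can be checked by comparing a numerically computed definite integral with the difference of the tabulated antiderivative at the endpoints. The sketch below uses composite Simpson's rule. The function name `simpson` and the sample limits are assumptions made for illustration.

```python
import math


def simpson(f, lo, hi, n=1000):
    """Composite Simpson's rule with n subintervals (n must be even)."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3


a = 2.0

# Row 1/(a^2 + x^2): antiderivative (1/a) arctan(x/a)
numeric = simpson(lambda x: 1 / (a * a + x * x), 0.0, 1.5)
exact = math.atan(1.5 / a) / a - math.atan(0.0) / a
print(numeric, exact)

# Row ln x: antiderivative x ln x - x
numeric = simpson(math.log, 1.0, 3.0)
exact = (3 * math.log(3) - 3) - (1 * math.log(1) - 1)
print(numeric, exact)
```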
A.5: Table of Taylor Expansions
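Let n ≥ 0 be an integer. Then if the function f has n + 1 derivatives on an interval that contains both $x_0$ and x, we have the Taylor expansion
\[
\begin{aligned}
f(x) &= f(x_0) + f'(x_0)\,(x-x_0) + \frac{1}{2!}f''(x_0)\,(x-x_0)^2 + \cdots + \frac{1}{n!}f^{(n)}(x_0)\,(x-x_0)^n \\
&\quad + \frac{1}{(n+1)!}f^{(n+1)}(c)\,(x-x_0)^{n+1} \qquad\text{for some } c \text{ between } x_0 \text{ and } x
\end{aligned}
\]
for f. When $x_0 = 0$ this is also called the Maclaurin series for f. Here are Taylor series expansions of some important functions.
\[
\begin{aligned}
e^x &= \sum_{n=0}^{\infty}\frac{1}{n!}x^n \qquad\text{for } -\infty < x < \infty \\
&= 1 + x + \frac{1}{2}x^2 + \frac{1}{3!}x^3 + \cdots + \frac{1}{n!}x^n + \cdots
\end{aligned}
\]
\[
\begin{aligned}
\sin x &= \sum_{n=0}^{\infty}\frac{(-1)^n}{(2n+1)!}x^{2n+1} \qquad\text{for } -\infty < x < \infty \\
&= x - \frac{1}{3!}x^3 + \frac{1}{5!}x^5 - \cdots + \frac{(-1)^n}{(2n+1)!}x^{2n+1} + \cdots
\end{aligned}
\]
\[
\begin{aligned}
\cos x &= \sum_{n=0}^{\infty}\frac{(-1)^n}{(2n)!}x^{2n} \qquad\text{for } -\infty < x < \infty \\
&= 1 - \frac{1}{2!}x^2 + \frac{1}{4!}x^4 - \cdots + \frac{(-1)^n}{(2n)!}x^{2n} + \cdots
\end{aligned}
\]
\[
\begin{aligned}
\frac{1}{1-x} &= \sum_{n=0}^{\infty}x^n \qquad\text{for } -1 < x < 1 \\
&= 1 + x + x^2 + x^3 + \cdots + x^n + \cdots
\end{aligned}
\]
\[
\begin{aligned}
\frac{1}{1+x} &= \sum_{n=0}^{\infty}(-1)^n x^n \qquad\text{for } -1 < x < 1 \\
&= 1 - x + x^2 - x^3 + \cdots + (-1)^n x^n + \cdots
\end{aligned}
\]
\[
\begin{aligned}
\ln(1-x) &= -\sum_{n=1}^{\infty}\frac{1}{n}x^n \qquad\text{for } -1 \le x < 1 \\
&= -x - \frac{1}{2}x^2 - \frac{1}{3}x^3 - \cdots - \frac{1}{n}x^n - \cdots
\end{aligned}
\]
\[
\begin{aligned}
\ln(1+x) &= -\sum_{n=1}^{\infty}\frac{(-1)^n}{n}x^n \qquad\text{for } -1 < x \le 1 \\
&= x - \frac{1}{2}x^2 + \frac{1}{3}x^3 - \cdots - \frac{(-1)^n}{n}x^n - \cdots
\end{aligned}
\]
\[ \cdots + \frac{p(p-1)(p-2)\cdots(p-n+1)}{n!}x^n + \cdots \]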
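Partial sums make the convergence behaviour of these series concrete. The following is a small sketch with hypothetical function names. It sums the first few terms of the series for e^x and ln(1 + x) and compares them with the standard-library values, including the endpoint x = 1 of the logarithm series, where convergence is noticeably slower.

```python
import math


def exp_partial(x, terms):
    """Sum of x^n / n! for n = 0, ..., terms - 1."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= x / (n + 1)  # turns x^n/n! into x^(n+1)/(n+1)!
    return total


def ln1p_partial(x, terms):
    """Sum of -(-1)^n x^n / n for n = 1, ..., terms."""
    return sum(-((-1) ** n) * x ** n / n for n in range(1, terms + 1))


print(exp_partial(1.0, 20), math.e)
print(ln1p_partial(0.5, 60), math.log(1.5))
# At the endpoint x = 1 the series still converges, but slowly.
print(ln1p_partial(1.0, 1000), math.log(2.0))
```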
A.6: 3-D Coordinate Systems
A.6.1 Cartesian Coordinates
Here is a figure showing the definitions of the three Cartesian coordinates (x, y, z)
and here are three figures showing a surface of constant x, a surface of constant y, and a surface of constant z.
Here are three figures showing a surface of constant r, a surface of constant θ, and a surface of constant z.
Finally here is a figure showing the volume element dV in cylindrical coordinates.
and here are two more figures giving the side and top views of the previous figure.
Here are three figures showing a surface of constant ρ, a surface of constant θ, and a surface of constant φ.
Finally, here is a figure showing the volume element dV in spherical coordinates
and two extracts of the above figure to make it easier to see how the factors ρ dφ and ρ sin φ dθ arise.
A.7: ISO Coordinate System Notation
In this text we have chosen symbols for the various polar, cylindrical and spherical coordinates that are standard for mathematics.
There is another, different, set of symbols that are commonly used in the physical sciences and engineering. Indeed, there is an
international convention, called ISO 80000-2, that specifies those symbols 1. In this appendix, we summarize the definitions and
standard properties of the polar, cylindrical and spherical coordinate systems using the ISO symbols.
\[ x = \rho\cos\phi \qquad y = \rho\sin\phi \]
\[ \rho = \sqrt{x^2+y^2} \qquad \phi = \arctan\frac{y}{x} \]
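When converting in code, note that arctan(y/x) by itself only returns angles between −π/2 and π/2, so it identifies the correct ϕ only for points with x > 0. The sketch below, with hypothetical function names `to_polar` and `to_cartesian`, uses the standard-library `math.atan2`, which takes the signs of both coordinates into account and returns an angle in (−π, π].

```python
import math


def to_polar(x, y):
    """Return (rho, phi) with -pi < phi <= pi, in the ISO symbols."""
    return math.hypot(x, y), math.atan2(y, x)


def to_cartesian(rho, phi):
    """Return (x, y) = (rho cos phi, rho sin phi); rho may be negative."""
    return rho * math.cos(phi), rho * math.sin(phi)


# A point on the negative y-axis gets a negative angle.
print(to_polar(0.0, -1.0))

# In the second quadrant atan2 and arctan(y/x) disagree.
print(to_polar(-1.0, 1.0)[1], math.atan(1.0 / -1.0))
```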
The following two figures show a number of lines of constant ϕ, on the left, and curves of constant ρ, on the right.
Note that the polar angle ϕ is only defined up to integer multiples of 2π. For example, the point (1, 0) on the x-axis could have
ϕ = 0, but could also have ϕ = 2π or ϕ = 4π. It is sometimes convenient to assign ϕ negative values. When ϕ < 0, the counter-
clockwise angle ϕ refers to the clockwise angle |ϕ|. For example, the point (0, −1) on the negative y -axis can have ϕ = − and π
It is also sometimes convenient to extend the above definitions by saying that x = ρ cos ϕ and y = ρ sin ϕ even when ρ is
negative. For example, the following figure shows (x, y) for ρ = 1, ϕ = π/4 and for ρ = −1, ϕ = π/4.
Both points lie on the line through the origin that makes an angle of 45° with the x-axis and both are a distance one from the origin.
dA = ρ dρ dϕ
Here are three figures showing a surface of constant ρ, a surface of constant ϕ, and a surface of constant z.
A.7.3 Spherical Coordinates
In the ISO convention the symbols r (instead of ρ), ϕ (instead of θ ) and θ (instead of ϕ ) are used for spherical coordinates.
Here are two more figures giving the side and top views of the previous figure.
Here are three figures showing a surface of constant r, a surface of constant ϕ, and a surface of constant θ.
and two extracts of the above figure to make it easier to see how the factors r dθ and r sin θ dϕ arise.
A.8: Conic Sections and Quadric Surfaces
A conic section is the curve of intersection of a cone and a plane that does not pass through the vertex of the cone. This is
illustrated in the figures below.
An equivalent¹ (and often used) definition is that a conic section is the set of all points in the xy-plane that obey Q(x, y) = 0 with
\[ Q(x,y) = Ax^2 + By^2 + Cxy + Dx + Ey + F = 0 \]
being a polynomial of degree two². By rotating and translating our coordinate system the equation of the conic section can be brought into one of the forms³
αx² + βy² = γ with α, β, γ > 0, which is an ellipse (or a circle),
αx² − βy² = γ with α, β > 0, γ ≠ 0, which is a hyperbola,
x² = δy, with δ ≠ 0 which is a parabola.
3. This statement can be justified using a linear algebra eigenvalue/eigenvector analysis. It is beyond what we can cover here, but is not too difficult for a standard linear algebra course.
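For a conic that has not yet been rotated into one of these forms, a standard shortcut (not stated in the text above, so treat it as a swapped-in technique) is to look at the sign of the discriminant C² − 4AB of the quadratic part of Q. The sketch below uses the coefficient names from Q(x, y). The function name `classify_conic` is hypothetical, and it assumes the conic is non-degenerate in the sense of the technical conditions in the footnotes.

```python
def classify_conic(A, B, C):
    """Classify Ax^2 + By^2 + Cxy + Dx + Ey + F = 0 by the sign of C^2 - 4AB.

    Assumes a non-degenerate conic; D, E, F do not affect the type.
    """
    disc = C * C - 4 * A * B
    if disc < 0:
        return "ellipse"
    if disc > 0:
        return "hyperbola"
    return "parabola"


print(classify_conic(1, 2, 0))   # x^2 + 2y^2 = gamma
print(classify_conic(1, -2, 0))  # x^2 - 2y^2 = gamma
print(classify_conic(1, 0, 0))   # x^2 = delta * y
print(classify_conic(0, 0, 1))   # xy = 1, a rotated hyperbola
```

The first three calls correspond to the three standard forms listed above, so the classification is consistent with them.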
The three dimensional analogs of conic sections, surfaces in three dimensions given by quadratic equations, are called quadrics. An
example is the sphere
\[ x^2 + y^2 + z^2 = 1. \]
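[Tables of sample conic sections and quadric surfaces; the sketches are omitted here.]
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 \qquad y = ax^2 \qquad \frac{x^2}{a^2} - \frac{y^2}{b^2} = 1 \qquad x^2 + y^2 + z^2 = r^2 \]
\[
\begin{array}{lll}
\text{ellipsoid} & \text{elliptic paraboloid} & \text{elliptic cone} \\[1ex]
\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} + \dfrac{z^2}{c^2} = 1 & \dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} = \dfrac{z}{c} & \dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} = \dfrac{z^2}{c^2}
\end{array}
\]
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1 \qquad \frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = -1 \qquad \frac{y^2}{b^2} - \frac{x^2}{a^2} = \frac{z}{c} \]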
It is outside our scope to prove this equivalence. Technically, we should also require that the constants A, B, C , D, E, F , are real
numbers, that A, B, C are not all zero, that Q(x, y) = 0 has more than one real solution, and that the polynomial can't be factored
into the product of two polynomials of degree one.
B: Hints for Exercises
C: Answers to Exercises
D: Solutions to Exercises
Index
C
Cartesian coordinates
    A.6: 3-D Coordinate Systems
circle
    A.8: Conic Sections and Quadric Surfaces
conic section
    A.8: Conic Sections and Quadric Surfaces
cylindrical coordinates
    3.6: Triple Integrals in Cylindrical Coordinates
    A.6: 3-D Coordinate Systems
D
Double Integrals
    3.1: Double Integrals
    3.3: Applications of Double Integrals
E
ellipse
    A.8: Conic Sections and Quadric Surfaces
H
Hyperbolas
    A.8: Conic Sections and Quadric Surfaces
P
parabola
    A.8: Conic Sections and Quadric Surfaces
polar coordinates
    A.7: ISO Coordinate System Notation
Q
quadratic surfaces
    A.8: Conic Sections and Quadric Surfaces
Quadric Surfaces
    1.9: Quadric Surfaces
S
scalar
    1.2: Vectors
spherical coordinates
    3.7: Triple Integrals in Spherical Coordinates
    A.6: 3-D Coordinate Systems
surface area
    3.4: Surface Area
T
triple integral
    3.5: Triple Integrals
triple integral in cylindrical coordinates
    3.6: Triple Integrals in Cylindrical Coordinates
triple integral in spherical coordinates
    3.7: Triple Integrals in Spherical Coordinates
V
vector
    1.2: Vectors
W
wave equation
    2.8: Optional — Solving the Wave Equation
Detailed Licensing
Overview
Title: CLP-3 Multivariable Calculus (Feldman, Rechnitzer, and Yeager)
Webpages: 56
Applicable Restrictions: Noncommercial
All licenses found:
CC BY-NC-SA 4.0: 94.6% (53 pages)
Undeclared: 5.4% (3 pages)
By Page
CLP-3 Multivariable Calculus (Feldman, Rechnitzer, and Yeager) - CC BY-NC-SA 4.0
    Front Matter - CC BY-NC-SA 4.0
        TitlePage - CC BY-NC-SA 4.0
        InfoPage - CC BY-NC-SA 4.0
        Table of Contents - Undeclared
        Licensing - Undeclared
        Colophon - CC BY-NC-SA 4.0
        Feedback about the text - CC BY-NC-SA 4.0
        Preface - CC BY-NC-SA 4.0
    1: Vectors and Geometry in Two and Three Dimensions - CC BY-NC-SA 4.0
        1.1: Points - CC BY-NC-SA 4.0
        1.2: Vectors - CC BY-NC-SA 4.0
        1.3: Equations of Lines in 2d - CC BY-NC-SA 4.0
        1.4: Equations of Planes in 3d - CC BY-NC-SA 4.0
        1.5: Equations of Lines in 3d - CC BY-NC-SA 4.0
        1.6: Curves and their Tangent Vectors - CC BY-NC-SA 4.0
        1.7: Sketching Surfaces in 3d - CC BY-NC-SA 4.0
        1.8: Cylinders - CC BY-NC-SA 4.0
        1.9: Quadric Surfaces - CC BY-NC-SA 4.0
    2: Partial Derivatives - CC BY-NC-SA 4.0
        2.1: Limits - CC BY-NC-SA 4.0
        2.2: Partial Derivatives - CC BY-NC-SA 4.0
        2.3: Higher Order Derivatives - CC BY-NC-SA 4.0
        2.4: The Chain Rule - CC BY-NC-SA 4.0
        2.5: Tangent Planes and Normal Lines - CC BY-NC-SA 4.0
        2.6: Linear Approximations and Error - CC BY-NC-SA 4.0
        2.7: Directional Derivatives and the Gradient - CC BY-NC-SA 4.0
        2.8: Optional — Solving the Wave Equation - CC BY-NC-SA 4.0
        2.9: Maximum and Minimum Values - CC BY-NC-SA 4.0
        2.10: Lagrange Multipliers - CC BY-NC-SA 4.0
    3: Multiple Integrals - CC BY-NC-SA 4.0
        3.1: Double Integrals - CC BY-NC-SA 4.0
        3.2: Double Integrals in Polar Coordinates - CC BY-NC-SA 4.0
        3.3: Applications of Double Integrals - CC BY-NC-SA 4.0
        3.4: Surface Area - CC BY-NC-SA 4.0
        3.5: Triple Integrals - CC BY-NC-SA 4.0
        3.6: Triple Integrals in Cylindrical Coordinates - CC BY-NC-SA 4.0
        3.7: Triple Integrals in Spherical Coordinates - CC BY-NC-SA 4.0
        3.8: Optional— Integrals in General Coordinates - CC BY-NC-SA 4.0
    4: Appendices - CC BY-NC-SA 4.0
        A: Appendices - CC BY-NC-SA 4.0
            A.1: Trigonometry - CC BY-NC-SA 4.0
            A.2: Powers and Logarithms - CC BY-NC-SA 4.0
            A.3: Table of Derivatives - CC BY-NC-SA 4.0
            A.4: Table of Integrals - CC BY-NC-SA 4.0
            A.5: Table of Taylor Expansions - CC BY-NC-SA 4.0
            A.6: 3-D Coordinate Systems - CC BY-NC-SA 4.0
            A.7: ISO Coordinate System Notation - CC BY-NC-SA 4.0
            A.8: Conic Sections and Quadric Surfaces - CC BY-NC-SA 4.0
        B: Hints for Exercises - CC BY-NC-SA 4.0
        C: Answers to Exercises - CC BY-NC-SA 4.0
        D: Solutions to Exercises - CC BY-NC-SA 4.0
    Back Matter - CC BY-NC-SA 4.0
        Index - CC BY-NC-SA 4.0
        Glossary - CC BY-NC-SA 4.0
        Detailed Licensing - Undeclared