Second Derivatives
As we have seen, a function $f(x, y)$ of two variables has four different second partial derivatives: $f_{xx}$, $f_{xy}$, $f_{yx}$, and $f_{yy}$.
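The Hessian of $f(x, y)$
The Hessian matrix for a twice differentiable function $f(x, y)$ is the matrix
\[
Hf =
\begin{bmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{bmatrix}
=
\begin{bmatrix}
\dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x\,\partial y} \\[2ex]
\dfrac{\partial^2 f}{\partial y\,\partial x} & \dfrac{\partial^2 f}{\partial y^2}
\end{bmatrix}.
\]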
Note that the four entries of the Hessian matrix are actually functions of $x$ and $y$. Thus the Hessian is itself a function
\[
Hf(x, y) =
\begin{bmatrix} f_{xx}(x, y) & f_{xy}(x, y) \\ f_{yx}(x, y) & f_{yy}(x, y) \end{bmatrix}.
\]
The Hessian $Hf$ is the first example we have seen of a matrix-valued function, i.e. a function whose output is a matrix.
EXAMPLE 1
Compute the Hessian of the function $f(x, y) = x^4 y^2$.
SOLUTION We must compute all of the second partial derivatives of $f$. The first partial derivatives are
\[
f_x(x, y) = 4x^3 y^2 \quad\text{and}\quad f_y(x, y) = 2x^4 y,
\]
so the second partial derivatives are
\[
f_{xx}(x, y) = 12x^2 y^2, \quad f_{xy}(x, y) = 8x^3 y, \quad f_{yx}(x, y) = 8x^3 y, \quad f_{yy}(x, y) = 2x^4.
\]
Thus
\[
Hf(x, y) = \begin{bmatrix} 12x^2 y^2 & 8x^3 y \\ 8x^3 y & 2x^4 \end{bmatrix}.
\]
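As a sanity check on Example 1, the second partial derivatives can be approximated numerically. The following is a minimal Python sketch, not part of the original notes: the helper names `hessian_2d` and `exact_hessian`, the step size `h`, and the sample point $(1, 2)$ are all illustrative assumptions. It estimates $Hf$ with central differences and compares the result with the formula above.

```python
def f(x, y):
    # The function from Example 1.
    return x**4 * y**2

def hessian_2d(f, x, y, h=1e-3):
    """Approximate the 2x2 Hessian of f at (x, y) with central differences.

    Assumes f_xy = f_yx, so a single mixed estimate fills both
    off-diagonal entries.
    """
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2
    # Mixed partial: difference f in both variables at once.
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h**2)
    return [[fxx, fxy], [fxy, fyy]]

def exact_hessian(x, y):
    # Formula derived in Example 1.
    return [[12 * x**2 * y**2, 8 * x**3 * y],
            [8 * x**3 * y, 2 * x**4]]

approx = hessian_2d(f, 1.0, 2.0)
exact = exact_hessian(1.0, 2.0)   # [[48.0, 16.0], [16.0, 2.0]]
print(approx)
print(exact)
```

The two printed matrices should agree to several decimal places; the small discrepancy comes from the finite step size.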
The Hessian of $f(x, y, z)$
The Hessian matrix for a twice differentiable function $f(x, y, z)$ is the matrix
\[
Hf =
\begin{bmatrix}
f_{xx} & f_{xy} & f_{xz} \\
f_{yx} & f_{yy} & f_{yz} \\
f_{zx} & f_{zy} & f_{zz}
\end{bmatrix}.
\]
EXAMPLE 2
Compute $Hf(1, 2, 3)$ if $f(x, y, z) = x^3 z + yz^2$.
SOLUTION The first partial derivatives are
\[
f_x(x, y, z) = 3x^2 z, \quad f_y(x, y, z) = z^2, \quad f_z(x, y, z) = x^3 + 2yz.
\]
Thus
\[
Hf(x, y, z) =
\begin{bmatrix}
6xz & 0 & 3x^2 \\
0 & 0 & 2z \\
3x^2 & 2z & 2y
\end{bmatrix}.
\]
Here we have simply placed each derivative in the correct location. For example, $f_{xx}(x, y, z) = 6xz$, so this should be the upper-left entry of the Hessian matrix.
Substituting in $x = 1$, $y = 2$, and $z = 3$ gives
\[
Hf(1, 2, 3) =
\begin{bmatrix}
18 & 0 & 3 \\
0 & 0 & 6 \\
3 & 6 & 4
\end{bmatrix}.
\]
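Because the Hessian is a matrix-valued function, it translates naturally into a function that returns a nested list. The sketch below is an illustration only (the names `hessian_f` and `is_symmetric` are hypothetical): it encodes the formula from Example 2, evaluates it at $(1, 2, 3)$, and also checks that the result equals its own transpose.

```python
def hessian_f(x, y, z):
    """Hessian of f(x, y, z) = x^3 z + y z^2, using the formula from Example 2."""
    return [
        [6 * x * z, 0,     3 * x**2],
        [0,         0,     2 * z],
        [3 * x**2,  2 * z, 2 * y],
    ]

def is_symmetric(m):
    # A square matrix is symmetric when m[i][j] == m[j][i] for all i, j.
    n = len(m)
    return all(m[i][j] == m[j][i] for i in range(n) for j in range(n))

H = hessian_f(1, 2, 3)
print(H)                # [[18, 0, 3], [0, 0, 6], [3, 6, 4]]
print(is_symmetric(H))  # True
```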
The Hessian can be thought of as an analog of the gradient vector for second derivatives. In the same way that the gradient $\nabla f$ combines all of the first partial derivatives of $f$ into a single vector, the Hessian $Hf$ combines all of the second partial derivatives of $f$ into a single matrix.
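Note that the Hessian is always a symmetric matrix when the second partial derivatives of $f$ are continuous. This means that the entries of the Hessian are symmetric across its main diagonal. Equivalently, a square matrix $A$ is symmetric if $A = A^T$, where $A^T$ denotes the transpose of $A$. For example, in the Hessian of a two-variable function $f(x, y)$, the two off-diagonal entries are always equal:
\[
\begin{bmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{bmatrix},
\qquad f_{xy} = f_{yx}.
\]
Similarly, in the Hessian of a three-variable function $f(x, y, z)$, each entry below the main diagonal is equal to the corresponding entry above it:
\[
\begin{bmatrix}
f_{xx} & f_{xy} & f_{xz} \\
f_{yx} & f_{yy} & f_{yz} \\
f_{zx} & f_{zy} & f_{zz}
\end{bmatrix},
\qquad f_{yx} = f_{xy}, \quad f_{zx} = f_{xz}, \quad f_{zy} = f_{yz}.
\]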
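Recall that the directional derivative of a function $f$ in the direction of a unit vector $\mathbf{u}$ is
\[
D_{\mathbf{u}} f = \mathbf{u} \cdot \nabla f.
\]
For a function $f(x, y)$ of two variables, the directional derivative is itself a function of $x$ and $y$:
\[
D_{\mathbf{u}} f(x, y) = \mathbf{u} \cdot \nabla f(x, y).
\]
This function takes $x$ and $y$ as input and outputs the directional derivative of $f$ in the direction of $\mathbf{u}$ at the point $(x, y)$.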
The second directional derivative of $f$ in the direction of $\mathbf{u}$ is the directional derivative of the directional derivative:
\[
D_{\mathbf{u}}^2 f = D_{\mathbf{u}}\bigl(D_{\mathbf{u}} f\bigr).
\]
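In the special case where $\mathbf{u}$ is either $\mathbf{i} = \langle 1, 0 \rangle$ or $\mathbf{j} = \langle 0, 1 \rangle$, the second directional derivative is the same as a second partial derivative:
\[
D_{\mathbf{i}}^2 f = \frac{\partial^2 f}{\partial x^2}, \qquad D_{\mathbf{j}}^2 f = \frac{\partial^2 f}{\partial y^2}.
\]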
EXAMPLE 3
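Find the second directional derivative of the function $f(x, y) = 25x^2 y$ in the direction of the unit vector $\mathbf{u} = \langle 3/5, 4/5 \rangle$.
SOLUTION The directional derivative is
\[
D_{\mathbf{u}} f(x, y) = \left\langle \tfrac{3}{5}, \tfrac{4}{5} \right\rangle \cdot \left\langle 50xy,\; 25x^2 \right\rangle = 30xy + 20x^2,
\]
so the second directional derivative is
\[
D_{\mathbf{u}}^2 f(x, y) = \left\langle \tfrac{3}{5}, \tfrac{4}{5} \right\rangle \cdot \left\langle 30y + 40x,\; 30x \right\rangle = 48x + 18y.
\]
Here $\langle 30y + 40x,\; 30x \rangle$ is the gradient of $30xy + 20x^2$.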
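The result of Example 3 can be checked numerically by restricting $f$ to the line through a point in the direction $\mathbf{u}$ and taking an ordinary second difference along that line. This is a rough sketch rather than part of the notes: the helper name `second_directional`, the step size `h`, and the sample point $(1, 2)$ are assumptions chosen for illustration.

```python
def f(x, y):
    # The function from Example 3.
    return 25 * x**2 * y

def second_directional(f, x, y, u, h=1e-3):
    """Approximate D_u^2 f at (x, y) for a unit vector u = (a, b).

    Restricts f to the line through (x, y) in the direction u and applies
    a central second difference to that one-variable function.
    """
    a, b = u
    g = lambda t: f(x + t * a, y + t * b)
    return (g(h) - 2 * g(0.0) + g(-h)) / h**2

u = (3 / 5, 4 / 5)
x0, y0 = 1.0, 2.0            # arbitrary sample point (an assumption)
approx = second_directional(f, x0, y0, u)
exact = 48 * x0 + 18 * y0    # formula from Example 3
print(approx, exact)
```

Both printed values should be close to 84, up to floating-point rounding.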
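To relate the second directional derivative to the Hessian, let $\mathbf{u} = \langle a, b \rangle$ be a unit vector. Then
\[
D_{\mathbf{u}} f = \langle a, b \rangle \cdot \langle f_x, f_y \rangle = a f_x + b f_y.
\]
Taking the directional derivative again and using $f_{yx} = f_{xy}$ gives
\[
D_{\mathbf{u}}^2 f = \langle a, b \rangle \cdot \langle a f_{xx} + b f_{xy},\; a f_{xy} + b f_{yy} \rangle = a^2 f_{xx} + 2ab f_{xy} + b^2 f_{yy}.
\]
But
\[
\begin{bmatrix} a & b \end{bmatrix}
\begin{bmatrix} f_{xx} & f_{xy} \\ f_{xy} & f_{yy} \end{bmatrix}
\begin{bmatrix} a \\ b \end{bmatrix}
=
\begin{bmatrix} a & b \end{bmatrix}
\begin{bmatrix} a f_{xx} + b f_{xy} \\ a f_{xy} + b f_{yy} \end{bmatrix}
= a^2 f_{xx} + 2ab f_{xy} + b^2 f_{yy},
\]
so the second directional derivative is equal to this matrix product:
\[
D_{\mathbf{u}}^2 f =
\begin{bmatrix} a & b \end{bmatrix}
Hf
\begin{bmatrix} a \\ b \end{bmatrix}.
\]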
EXAMPLE 4
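Let $f$ be a twice differentiable function, and suppose that
\[
Hf(2, 3) = \begin{bmatrix} 4 & 7 \\ 7 & 5 \end{bmatrix}.
\]
Compute the second directional derivative of $f$ at the point $(2, 3)$ in the direction of the vector $\mathbf{u} = \langle 0.6, -0.8 \rangle$.
SOLUTION Multiplying out the matrix product gives
\[
D_{\mathbf{u}}^2 f(2, 3) =
\begin{bmatrix} 0.6 & -0.8 \end{bmatrix}
\begin{bmatrix} 4 & 7 \\ 7 & 5 \end{bmatrix}
\begin{bmatrix} 0.6 \\ -0.8 \end{bmatrix}
=
\begin{bmatrix} 0.6 & -0.8 \end{bmatrix}
\begin{bmatrix} -3.2 \\ 0.2 \end{bmatrix}
= -2.08.
\]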
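The arithmetic in Example 4 is easy to reproduce in a few lines of Python. The sketch below is illustrative (the function name `quadratic_form` is hypothetical): it evaluates the matrix product directly and also evaluates the expanded expression $a^2 f_{xx} + 2ab f_{xy} + b^2 f_{yy}$, so the two can be compared.

```python
def quadratic_form(H, u):
    """Return u^T H u for a 2x2 matrix H (list of rows) and a vector u."""
    # First compute the matrix-vector product H u ...
    Hu = [sum(H[i][j] * u[j] for j in range(2)) for i in range(2)]
    # ... then take the dot product of u with the result.
    return sum(u[i] * Hu[i] for i in range(2))

H = [[4, 7], [7, 5]]      # Hf(2, 3) from Example 4
u = [0.6, -0.8]           # unit vector from Example 4

# Expanded form a^2 f_xx + 2ab f_xy + b^2 f_yy, which relies on f_xy = f_yx.
a, b = u
expanded = a**2 * H[0][0] + 2 * a * b * H[0][1] + b**2 * H[1][1]

print(quadratic_form(H, u))  # approximately -2.08
print(expanded)              # approximately -2.08
```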
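If we regard $\mathbf{u}$ as a column vector and write $\mathbf{u}^T$ for its transpose, we can write our Hessian formula for $D_{\mathbf{u}}^2 f$ as follows:
\[
D_{\mathbf{u}}^2 f = \mathbf{u}^T \, Hf \, \mathbf{u}.
\]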
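Second Derivative Test
Let $f(x, y)$ be a twice differentiable function, and let $(x_0, y_0)$ be a critical point for $f$.
1. If all of the eigenvalues of $Hf(x_0, y_0)$ are positive, then $f$ has a local minimum at $(x_0, y_0)$.
2. If all of the eigenvalues of $Hf(x_0, y_0)$ are negative, then $f$ has a local maximum at $(x_0, y_0)$.
3. If $Hf(x_0, y_0)$ has both positive and negative eigenvalues, then $f$ has a saddle point at $(x_0, y_0)$.
Though we are only stating this test for the two-variable case, it works for any number of variables.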
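The reason that this test works is that the eigenvalues of the Hessian $H = Hf(x_0, y_0)$ are related to the second directional derivatives of $f$ at $(x_0, y_0)$. In particular, if $\mathbf{u}$ is a unit eigenvector for $H$ with eigenvalue $\lambda$, then
\[
D_{\mathbf{u}}^2 f(x_0, y_0) = \mathbf{u}^T H \mathbf{u} = \mathbf{u}^T (\lambda \mathbf{u}) = \lambda \, \mathbf{u}^T \mathbf{u} = \lambda.
\]
Here $\mathbf{u}^T \mathbf{u} = 1$ since $\mathbf{u}$ is a unit vector.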
That is, the second directional derivative of $f$ in the direction of a unit eigenvector $\mathbf{u}$ of the Hessian is equal to the corresponding eigenvalue. Thus we expect the eigenvalues of the Hessian to be positive at a local minimum and negative at a local maximum. Moreover, if the Hessian has both positive and negative eigenvalues, the corresponding point must be a saddle point.
It is less obvious that a critical point must be a local minimum just because all of the eigenvalues of the Hessian are positive. This argument requires some additional linear algebra that we will not pursue here.
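The identity $\mathbf{u}^T H \mathbf{u} = \lambda$ for a unit eigenvector can be seen concretely on a small matrix. The following sketch is not from the notes: the matrix `H`, its eigenpairs, and the helper `quadratic_form` are assumptions chosen so that the arithmetic is easy to follow by hand.

```python
import math

def quadratic_form(H, u):
    # u^T H u for a 2x2 matrix H and a 2-vector u.
    return sum(u[i] * H[i][j] * u[j] for i in range(2) for j in range(2))

# Illustrative symmetric matrix (an assumption, not from the notes).
# Its eigenvalues are 3 and 1, with unit eigenvectors (1, 1)/sqrt(2)
# and (1, -1)/sqrt(2).
H = [[2, 1], [1, 2]]
s = 1 / math.sqrt(2)
eigenpairs = [(3, (s, s)), (1, (s, -s))]

for lam, u in eigenpairs:
    # Confirm H u = lam * u, i.e. u really is an eigenvector.
    Hu = [H[i][0] * u[0] + H[i][1] * u[1] for i in range(2)]
    assert all(math.isclose(Hu[i], lam * u[i]) for i in range(2))
    # The quadratic form at a unit eigenvector returns the eigenvalue.
    print(lam, quadratic_form(H, u))
```

Each printed pair should show the eigenvalue next to a value of the quadratic form that matches it up to rounding.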
EXAMPLE 5
The function $f(x, y) = x^3 + 2(x - y)^2 - 3x$ has a critical point at $(1, 1)$. Classify this critical point as a local maximum, a local minimum, or a saddle point.
SOLUTION The Hessian of $f$ is
\[
Hf(x, y) = \begin{bmatrix} 6x + 4 & -4 \\ -4 & 4 \end{bmatrix},
\]
and in particular
\[
Hf(1, 1) = \begin{bmatrix} 10 & -4 \\ -4 & 4 \end{bmatrix}.
\]
The eigenvalues of this matrix add to 14 (the trace) and multiply to 24 (the determinant), so they must be 2 and 12. Since both eigenvalues are positive, $(1, 1)$ is a local minimum.
EXAMPLE 6
The function $f(x, y) = 6\cos x + 4x \sin y$ has a critical point at $(0, 0)$. Classify this critical point as a local maximum, a local minimum, or a saddle point.
SOLUTION The Hessian of $f$ is
\[
Hf(x, y) = \begin{bmatrix} -6\cos x & 4\cos y \\ 4\cos y & -4x \sin y \end{bmatrix},
\]
and in particular
\[
Hf(0, 0) = \begin{bmatrix} -6 & 4 \\ 4 & 0 \end{bmatrix}.
\]
The eigenvalues of this matrix add to $-6$ (the trace) and multiply to $-16$ (the determinant), so they must be $-8$ and $2$. Since one eigenvalue is negative and the other is positive, $(0, 0)$ is a saddle point.
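The trace-and-determinant shortcut used in Examples 5 and 6 amounts to solving the quadratic $\lambda^2 - (\text{trace})\,\lambda + (\text{determinant}) = 0$. The sketch below illustrates that idea for symmetric $2 \times 2$ matrices only; the function names `eigenvalues_2x2` and `classify` are hypothetical, and the "inconclusive" branch covers the case of a zero eigenvalue, which the examples above do not encounter.

```python
import math

def eigenvalues_2x2(H):
    """Eigenvalues of a symmetric 2x2 matrix, from its trace and determinant."""
    trace = H[0][0] + H[1][1]
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    # Roots of lambda^2 - trace*lambda + det = 0. The discriminant is
    # nonnegative for a symmetric matrix; max() guards against rounding.
    root = math.sqrt(max(trace**2 - 4 * det, 0.0))
    return (trace - root) / 2, (trace + root) / 2

def classify(H):
    """Apply the second derivative test to the Hessian H at a critical point."""
    lo, hi = eigenvalues_2x2(H)
    if lo > 0:
        return "local minimum"
    if hi < 0:
        return "local maximum"
    if lo < 0 < hi:
        return "saddle point"
    return "inconclusive"   # at least one eigenvalue is zero

H5 = [[10, -4], [-4, 4]]   # Hf(1, 1) from Example 5
H6 = [[-6, 4], [4, 0]]     # Hf(0, 0) from Example 6
print(eigenvalues_2x2(H5), classify(H5))  # (2.0, 12.0) local minimum
print(eigenvalues_2x2(H6), classify(H6))  # (-8.0, 2.0) saddle point
```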
EXERCISES
1. $f(x, y) = x^2 \sin y$
2. $f(x, y, z) = x^2 y^3 z^4$
3–4 Compute the Hessian matrix for the given function f at the given point P.
3. $f(x, y) = x^3 + 4xy^2$; $P = (2, 3)$
4. $f(x, y, z) = \dfrac{16z}{\sqrt{xy}}$; $P = (4, 1, 8)$
Compute $D_{\mathbf{u}}^2 f(2, 3)$, where $\mathbf{u}$ is the unit vector $\mathbf{u} = \dfrac{1}{\sqrt{5}} \langle 1, 2 \rangle$.
\[
Hf(x, y) = \begin{bmatrix} 0 & \sin(e^y) \\ \sin(e^y) & x e^y \cos(e^y) \end{bmatrix}.
\]
Find a formula for $D_{\mathbf{u}}^2 f(x, y)$, where $\mathbf{u}$ is the unit vector $\left\langle \tfrac{4}{5}, \tfrac{3}{5} \right\rangle$.
9–12 Find all critical points of the given function. (See Section 11.7 of the textbook.)
9. $f(x, y) = x^4 + y^4 - 4xy + 2$
10. $f(x, y) = x^3 - 12xy + 8y^3$
13–18 A function and one of its critical points are given. Use the second derivative
test to determine whether the critical point is a local maximum, a local minimum, or a
saddle point.
19. Let $f(x, y) = x^3 - 3x^2 - 2y^2$. Find the critical points of $f$, and classify each critical point as a local maximum, a local minimum, or a saddle point.