Mat 322-1-2
M.E. Egwe
Department of Mathematics,
University of Ibadan,
Ibadan, NIGERIA.
3 Introduction to IRn
3.1 Vector Spaces Revisited
3.2 Derivatives
3.2.1 Directional Derivatives
6 The Mapping Theorems
6.1 Inverse Function Theorem
6.2 Implicit Function Theorem
Course synopsis
Introduction to limits of functions of several variables. Differentiation or derivatives.
Directional derivatives, partial derivatives and higher order derivatives. Taylor's theorem.
Inverse function theorem. Implicit function theorem. Extrema and the method of
Lagrange multipliers. Riemann integrals, Riemann-Stieltjes integral. Functions of
bounded variation. Partial integration formula. Mean value theorems. Integration
of functions of several variables. Semester 2; LH 60; PH 0; 4 Units; Status R
Mode of Assessments
There shall be two standard tests and one assignment, together with any further
assessment the course lecturer may determine, depending on the response of the
students to the course lectures.
Chapter 1
Remark 1.1.3. The concept is easily extended. Thus, $F(x, y, z)$ denotes the value
of a function at $(x, y, z)$, a point in three-dimensional space.
give some illustrations and examples taking the number of independent variables to
be two.
This function is defined only when $x^2 + y^2 < 1$, since otherwise the logarithm is
undefined. The region of definition is therefore the interior of the unit circle centred at the
origin, as shown below.
[Figure: the region $x^2 + y^2 < 1$ in the $xy$-plane.]
Example 1.2.2. $F(x, y) = \sqrt{x^2 + y^2 - 1} + \log(4 - x^2 - y^2)$.
Here, we must have $x^2 + y^2 \ge 1$ in order for the square root to be real, while we
must have $x^2 + y^2 < 4$ for the logarithm to be defined.
The region of definition of $F(x, y)$ is therefore the annular region between the circles
$x^2 + y^2 = 1$ and $x^2 + y^2 = 4$. The inner circumference is part of the region of
definition while the outer circumference is not. This is shown below.
[Figure: the annulus $1 \le x^2 + y^2 < 4$, with the point $(2, 0)$ marked on the outer circle.]
Example 1.2.3. $g(x, y) = \dfrac{x}{y^2 - 4x}$.
This function is defined except when the denominator is zero, i.e., everywhere except
at the points of the parabola $y^2 = 4x$. This is shown below.
[Figure: the parabola $y^2 = 4x$ in the $xy$-plane.]
Example 1.2.4. $G(x, y) = \sqrt{x^2 - y^2} + \sqrt{x^2 + y^2 - 1}$.
Remark 1.2.5. Similar examples might be given for functions of three or more
independent variables. The region of definition might be the interior of a cube, the
interior and boundary of an ellipsoid, the space between concentric spheres, etc.
Definition 1.2.6. The set of all points $(x, y)$ such that $|x - x_\circ| < \delta$, $|y - y_\circ| < \delta$,
where $\delta > 0$, is called a rectangular $\delta$-neighbourhood of $(x_\circ, y_\circ)$.
The set $0 < |x - x_\circ| < \delta$, $0 < |y - y_\circ| < \delta$, which excludes $(x_\circ, y_\circ)$, is called
the rectangular deleted $\delta$-neighbourhood of $(x_\circ, y_\circ)$. Other neighbourhoods include
the circular $\delta$-neighbourhood of $(x_\circ, y_\circ)$, e.g.,
\[ (x - x_\circ)^2 + (y - y_\circ)^2 < \delta^2. \]
For example, every bounded infinite set has at least one limit point (Bolzano-
Weierstrass). The limit point need not be a member of the set.
Definition 1.2.8. A set is said to be closed if it contains all its limit points. A set
S is called open if each point p ∈ S has some circular neighbourhood which belongs
entirely to the set S.
Example 1.2.9.
(1) The set of all points in the plane not on the parabola $y^2 = 4x$ is an open set
(Verify!)
(2) The set of all points on or inside the circle $x^2 + y^2 = 1$ is not open (check!)
Exercise 1.2.10.
(2) Let f (x, y) = log sin x + y −1/2 . Determine the set of points (x, y) where f is
defined. Is the set open, closed or neither?
1 −1
(3) Also f (x, y) = y − sin
x
Chapter 2
\[ \lim_{\substack{x \to 1 \\ y \to 2}} f(x, y) = 6. \]
Note that $\lim_{\substack{x \to 1 \\ y \to 2}} f(x, y) \ne f(1, 2)$ since $f(1, 2) = 0$. The limit would in fact be 6 even if
$f(x, y)$ were not defined at $(1, 2)$. Thus, the existence of the limit as $(x, y) \to (x_\circ, y_\circ)$
is in no way dependent on the existence of a value of $f(x, y)$ at $(x_\circ, y_\circ)$.
Remark 2.1.3. If the limit exists, then it is unique. The concept of one-sided limits
of functions of one variable is easily extended to functions of several variables.
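Example 2.1.4. $\lim_{\substack{x \to 0^+ \\ y \to 1}} \tan^{-1}(y/x) = \dfrac{\pi}{2}$, and $\lim_{\substack{x \to 0^- \\ y \to 1}} \tan^{-1}(y/x) = -\dfrac{\pi}{2}$.
But $\lim_{\substack{x \to 0 \\ y \to 1}} \tan^{-1}(y/x)$ does not exist, since the two one-sided limits given above are not
equal.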
In general, the theorems on limits, concepts of infinity, etc. for functions of one
variable apply with appropriate modifications to functions of two or more variables.
The iterated limits $\lim_{x \to x_\circ}\{\lim_{y \to y_\circ} f(x, y)\}$ and $\lim_{y \to y_\circ}\{\lim_{x \to x_\circ} f(x, y)\}$, also denoted by
$\lim_{x \to x_\circ}\lim_{y \to y_\circ} f(x, y)$ and $\lim_{y \to y_\circ}\lim_{x \to x_\circ} f(x, y)$ respectively, are not necessarily equal.
Although they must be equal if $\lim_{\substack{x \to x_\circ \\ y \to y_\circ}} f(x, y)$ exists, their equality does not
guarantee the existence of this limit.
Example 2.1.5. If $f(x, y) = \dfrac{x - y}{x + y}$, then
\[ \lim_{x \to 0}\lim_{y \to 0} \frac{x - y}{x + y} = \lim_{x \to 0}(1) = 1 \]
and
\[ \lim_{y \to 0}\lim_{x \to 0} \frac{x - y}{x + y} = \lim_{y \to 0}(-1) = -1. \]
Since the iterated limits are not equal, $\lim_{\substack{x \to 0 \\ y \to 0}} f(x, y)$ cannot exist.
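The dependence of $(x - y)/(x + y)$ on the path of approach can be seen numerically. The sketch below is only an illustration: the helper `along_line` and the chosen slopes are assumptions made for this example, not part of the notes. Along the line $y = sx$ the function is constant and equal to $(1 - s)/(1 + s)$, so different lines through the origin give different limiting values.

```python
def f(x, y):
    """f(x, y) = (x - y) / (x + y), defined wherever x + y != 0."""
    return (x - y) / (x + y)


def along_line(slope, t):
    """Value of f at the point (t, slope * t) on the line y = slope * x."""
    return f(t, slope * t)


# Approach the origin along three different lines (illustrative slopes).
for slope in (0.0, 0.5, 2.0):
    values = [along_line(slope, t) for t in (1e-1, 1e-3, 1e-6)]
    print(f"slope {slope}: {values}")

# Approach along the y-axis (x = 0), where f is identically -1.
print("x = 0:", [f(0.0, t) for t in (1e-1, 1e-3, 1e-6)])
```

Since the values along distinct lines do not agree, no single number can serve as the limit at the origin, which is what the unequal iterated limits already indicate.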
2.2 Some solved problems
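(1) If $f(x, y) = x^3 - 2xy + 3y^2$, find
(a) $f\left(\dfrac{1}{x}, \dfrac{2}{y}\right)$, and (b) $\dfrac{f(x, y + k) - f(x, y)}{k}$.
Solution:
(a)
\[ f\left(\frac{1}{x}, \frac{2}{y}\right) = \left(\frac{1}{x}\right)^3 - 2\left(\frac{1}{x}\right)\left(\frac{2}{y}\right) + 3\left(\frac{2}{y}\right)^2 = \frac{1}{x^3} - \frac{4}{xy} + \frac{12}{y^2}. \]
(b)
\[
\begin{aligned}
\frac{f(x, y + k) - f(x, y)}{k} &= \frac{1}{k}\left\{[x^3 - 2x(y + k) + 3(y + k)^2] - [x^3 - 2xy + 3y^2]\right\} \\
&= \frac{1}{k}\left(x^3 - 2xy - 2kx + 3y^2 + 6ky + 3k^2 - x^3 + 2xy - 3y^2\right) \\
&= \frac{1}{k}\left(-2kx + 6ky + 3k^2\right) \\
&= -2x + 6y + 3k.
\end{aligned}
\]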
(b) $f(x, y) = \sqrt{6 - (2x + 3y)}$
Solution:
$\mathrm{Dom}(f) = \{(x, y) : 2x + 3y \le 6\}$.
2.3 Exercises
1. (a) Compute the following limits
i. $\displaystyle\lim_{(x, y) \to (0, 0)} \frac{(x + y)^2 - (x - y)^2}{xy}$
ii. $\displaystyle\lim_{(x, y) \to (0, 0)} \frac{x^3 - y^3}{x^2 + y^2}$
iii. $\displaystyle\lim_{(x, y) \to (0, 0)} \frac{xy}{x^2 + y^2 + 2}$
iv. $\displaystyle\lim_{(x, y) \to (0, 0)} \frac{xy^3}{x^2 + y^6}$
(b) Determine the domain of existence and continuity of the following functions
i. $f(x, y, z) = \dfrac{1}{x^2 + y^2 + z^2 - 4}$
ii. $f(x, y) = \dfrac{1}{x^2 + y^2}$
iii. $f(x, y) = 3x^2 + 3y^2 \log(x^2 + y^2)$
iv. $f(x, y, z) = \dfrac{e^{x + y}}{1 + z^2}$
Chapter 3
Introduction to IRn
$e_2 = (0, 1, \dots, 0)$
$e_3 = (0, 0, 1, \dots, 0)$
$e_i = (0, 0, \dots, 0, 1, 0, \dots, 0)$, with the 1 in the $i$-th place.
$\{e_i\}_{i=1}^{n}$ form a basis for $\mathbb{R}^n$.
Consider the figure below. The distance between the point $x = (\xi_1, \xi_2, \xi_3) \in \mathbb{R}^3$ and the origin is
given by
\[ d(x, 0) = \sqrt{\xi_1^2 + \xi_2^2 + \xi_3^2}. \]
Thus, given $\xi, \eta \in \mathbb{R}^n$, the distance between them is given by the metric distance
\[ d(\xi, \eta) = \left(\sum_{i=1}^{n} |\xi_i - \eta_i|^2\right)^{1/2}. \tag{3.1} \]
Thus
⟨x, x⟩ = ||x||2 (3.3)
(i) ||x|| ≥ 0
(ii) ||x|| = 0 ⇐⇒ x = 0
Definition 3.1.2. A mapping f with Dom(f ) = IRn and Ran(f ) ⊆ IRm ,
f : IRn → IRm , is said to be linear if for all x, y ∈ IRn , α ∈ IR, we have
1. f (x + y) = f (x) + f (y)
2. f (αx) = αf (x)
f (0) = f (0 · x) = 0f (x) = 0.
That is, if f : IRn → IRm is linear, then ∃ a constant M > 0 such that
||f (x)|| ≤ M ||x|| ∀x ∈ IRn .
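Proof:
Let $\{e_1, e_2, \dots, e_n\}$ be the standard basis of $\mathbb{R}^n$ and let $x \in \mathbb{R}^n$. Then
\[ x = \sum_{i=1}^{n} x_i e_i, \qquad f(x) = f\left(\sum_{i=1}^{n} x_i e_i\right). \]
By linearity of $f$, we have
\[ f(x) = \sum_{i=1}^{n} x_i f(e_i). \]
Therefore,
\[
\begin{aligned}
\|f(x)\| &= \left\|\sum_{i=1}^{n} x_i f(e_i)\right\| \\
&\le \sum_{i=1}^{n} |x_i|\,\|f(e_i)\| \\
&\le \max_i |x_i| \sum_{i=1}^{n} \|f(e_i)\| \\
&\le M\|x\|,
\end{aligned}
\]
where $M = \sum_{i=1}^{n} \|f(e_i)\|$, since $\max_i |x_i| \le \|x\|$. □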
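The bound $\|f(x)\| \le M\|x\|$ with $M = \sum_i \|f(e_i)\|$ can be checked numerically. The following is a minimal sketch, assuming the linear map is given by a matrix `A` stored as a list of rows, so that $f(e_i)$ is the $i$-th column of `A`; the random matrix, the seed and the helper names are illustrative choices only.

```python
import math
import random


def apply(A, x):
    """Matrix-vector product for a matrix A given as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def norm(v):
    """Euclidean norm of a vector given as a list of numbers."""
    return math.sqrt(sum(c * c for c in v))


def lemma_constant(A):
    """M = sum_i ||f(e_i)||, where f(e_i) is the i-th column of A."""
    n = len(A[0])
    columns = [[row[i] for row in A] for i in range(n)]
    return sum(norm(col) for col in columns)


random.seed(0)
A = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]  # f: R^3 -> R^2
M = lemma_constant(A)

# Largest observed ratio ||f(x)|| / ||x|| over random sample points.
worst = 0.0
for _ in range(1000):
    x = [random.uniform(-5, 5) for _ in range(3)]
    worst = max(worst, norm(apply(A, x)) / norm(x))

print("M =", M, " largest sampled ratio =", worst)
```

The constant $M$ from the proof is generally not the smallest possible bound, so the sampled ratio is expected to stay strictly below it.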
3.2 Derivatives
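Definition 3.2.1. Let $f : \mathrm{Dom}(f) \subset \mathbb{R}^n \to \mathbb{R}$ and let $x_\circ \in \mathrm{Dom}(f)$. $f$ is said
to be continuous at $x_\circ$ if and only if given $\varepsilon > 0$ $\exists\, \delta = \delta(\varepsilon) > 0$ such that
\[ \forall\, x \in \mathrm{Dom}(f), \quad \|x - x_\circ\| < \delta \Rightarrow \|f(x) - f(x_\circ)\| < \varepsilon. \]
Then $f$ is continuous on $\mathrm{Dom}(f) \subseteq \mathbb{R}^n$ if and only if it is continuous at every point
of $\mathrm{Dom}(f)$.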
Theorem 3.2.2. If f is a linear map of IRn into IRm , then f is continuous on IRn .
≤ M ||x − x◦ ||.
The linear map $T_{x_\circ}$ is called the derivative of $f$ at $x_\circ$ and is usually denoted
by $Df(x_\circ)$.
Note that $h$ is a point in $\mathbb{R}^n$ and $f(x_\circ + h) - f(x_\circ) - T_{x_\circ}(h)$ is a point in $\mathbb{R}^m$.
Remark 3.2.4. If for x◦ + h ∈ U , rx◦ (h) = f (x◦ + h) − f (x◦ ) − Tx◦ (h), then
equation (1) can be re-written as
Theorem 3.2.6. Let U ⊂ IRn be an open set on IRn and let f : U → IRm be
differentiable at x◦ ∈ U . Then Df (x◦ ) is unique.
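Proof:
Let $T_{x_\circ}$ and $T'_{x_\circ}$ be two linear maps of $\mathbb{R}^n$ into $\mathbb{R}^m$ such that, for $x_\circ$ and $x_\circ + h$ in the
open set $U$,
\[ f(x_\circ + h) = f(x_\circ) + T_{x_\circ}(h) + r_{x_\circ}(h), \]
where
\[ \lim_{h \to 0} \frac{\|r_{x_\circ}(h)\|}{\|h\|} = 0, \]
and
\[ f(x_\circ + h) = f(x_\circ) + T'_{x_\circ}(h) + S_{x_\circ}(h), \]
where
\[ \lim_{h \to 0} \frac{\|S_{x_\circ}(h)\|}{\|h\|} = 0. \]
Then,
\[ T_{x_\circ}(h) - T'_{x_\circ}(h) = S_{x_\circ}(h) - r_{x_\circ}(h) \]
and hence
\[
\frac{\|T_{x_\circ}(h) - T'_{x_\circ}(h)\|}{\|h\|} = \frac{\|S_{x_\circ}(h) - r_{x_\circ}(h)\|}{\|h\|}
\le \frac{\|S_{x_\circ}(h)\|}{\|h\|} + \frac{\|r_{x_\circ}(h)\|}{\|h\|} \longrightarrow 0 \quad \text{as } h \to 0.
\]
So that
\[ \lim_{h \to 0} \frac{\|T_{x_\circ}(h) - T'_{x_\circ}(h)\|}{\|h\|} = 0. \]
Thus, for each $\varepsilon > 0$, $\exists\, \delta > 0$, such that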
Proof:
First, we show that for each $x_\circ \in U$, there are constants $\delta > 0$ and $M > 0$ such
that
\[ \|x - x_\circ\| < \delta \Rightarrow \|f(x) - f(x_\circ)\| \le M\|x - x_\circ\|. \]
Now, if $L : \mathbb{R}^n \to \mathbb{R}^m$ is a linear map, then (by Lemma 3.1.3) $\exists\, M_0 > 0$ such that $\|L(x)\| \le
M_0\|x\|$ $\forall\, x \in \mathbb{R}^n$.
Take $L := Df(x_\circ)$ with $x_\circ \in U$. Let $\varepsilon = 1$; then by the definition of differentiability, $\exists\, \delta > 0$ such that
\[ \|x - x_\circ\| < \delta \Rightarrow \|f(x) - f(x_\circ) - Df(x_\circ)(x - x_\circ)\| < \|x - x_\circ\|. \]
But
\[
\begin{aligned}
\|f(x) - f(x_\circ)\| &= \|f(x) - f(x_\circ) - Df(x_\circ)(x - x_\circ) + Df(x_\circ)(x - x_\circ)\| \\
&< (1 + M_0)\|x - x_\circ\|.
\end{aligned}
\]
Set $M := 1 + M_0$. Then, given $\varepsilon > 0$, choose $\delta_0 = \min\{\delta, \varepsilon/M\}$.
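\[
\begin{aligned}
&= f(1 + h_1, 1 + h_2) - f(1, 1) \\
&= (1 + h_1)^2 + (1 + h_2)^2 - (1^2 + 1^2) \qquad (*)
\end{aligned}
\]
$T_{x_\circ}(h) = T_{x_\circ}(h_1, h_2)$ is the linear part of (*), i.e., $2h_1 + 2h_2$. We write this as a linear
map $Df : \mathbb{R}^2 \to \mathbb{R}$, i.e.,
\[ 2h_1 + 2h_2 = (h_1, h_2)\begin{pmatrix} 2 \\ 2 \end{pmatrix}, \]
provided the condition of differentiability in its definition is satisfied. Now we know
that $\|h\| = \|(h_1, h_2)\| = \sqrt{h_1^2 + h_2^2}$, or $\|(h_1, h_2)\|^2 = h_1^2 + h_2^2$.
Then for arbitrary $\varepsilon > 0$,
\[ \frac{|f(x_\circ + h) - f(x_\circ) - T_{x_\circ}(h)|}{\|h\|} = \frac{h_1^2 + h_2^2}{\|h\|} = \|h\| < \varepsilon \]
provided
$\|(h_1, h_2)\| < \delta$ and $\delta = \varepsilon$. □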
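As a numerical illustration of the differentiability condition, the sketch below assumes (as the expansion (*) suggests) that the function in this example is $f(x, y) = x^2 + y^2$ at the point $(1, 1)$, with candidate derivative $T(h) = 2h_1 + 2h_2$. The function names and the chosen increments are illustrative. The printed ratio $|f(x_\circ + h) - f(x_\circ) - T(h)|/\|h\|$ should shrink in proportion to $\|h\|$.

```python
import math


def f(x, y):
    """Assumed function of the example: f(x, y) = x^2 + y^2."""
    return x * x + y * y


def T(h1, h2):
    """Candidate derivative at (1, 1): the linear map h -> 2*h1 + 2*h2."""
    return 2 * h1 + 2 * h2


def remainder_ratio(h1, h2):
    """|f(x0 + h) - f(x0) - T(h)| / ||h|| at x0 = (1, 1)."""
    r = f(1 + h1, 1 + h2) - f(1, 1) - T(h1, h2)
    return abs(r) / math.hypot(h1, h2)


# Shrink h along an arbitrary direction and watch the ratio tend to 0.
for s in (1e-1, 1e-2, 1e-3):
    print(s, remainder_ratio(s, -2 * s), math.hypot(s, 2 * s))
```

Here the remainder equals $h_1^2 + h_2^2 = \|h\|^2$ exactly, so the ratio coincides with $\|h\|$ up to rounding, which matches the choice $\delta = \varepsilon$.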
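Example 3.2.11. Let $L : \mathbb{R}^n \to \mathbb{R}^m$ be a linear map. Prove that $DL(x_\circ) = L$ for every $x_\circ \in \mathbb{R}^n$, i.e., $DL(x_\circ)(h) = L(h)$.
Proof:
Given $\varepsilon > 0$ and $x_\circ \in \mathrm{Dom}(L)$, we find $\delta > 0$ such that $\|x - x_\circ\| < \delta$ but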
3.2.1 Directional Derivatives
Definition 3.2.13. Let $f : U \subseteq \mathbb{R}^n \to \mathbb{R}^m$, where $U$ is an open set, let $x_\circ \in U$ and let $u$
be a vector in $\mathbb{R}^n$. Then the directional derivative of $f$ at $x_\circ$ in the direction of $u$
is denoted by $D_u f(x_\circ)$ and is defined by
\[ D_u f(x_\circ) = \lim_{\tau \to 0} \frac{f(x_\circ + \tau u) - f(x_\circ)}{\tau}, \]
provided the limit exists.
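Solution:
Notice that $u$ as given above is a unit vector, i.e., $\sqrt{(1/\sqrt{2})^2 + (1/\sqrt{2})^2} = 1$.
Therefore, the direction makes an angle of $45^\circ$ with the $x$-axis.
\[
\begin{aligned}
f(\xi_0 + \tau u) &= f\left((2, 0) + \tau\left(\tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{2}}\right)\right) \\
&= f\left(2 + \tfrac{\tau}{\sqrt{2}}, -\tfrac{\tau}{\sqrt{2}}\right) \\
&= \left(2 + \tfrac{\tau}{\sqrt{2}}\right)^2 + 3\left(2 + \tfrac{\tau}{\sqrt{2}}\right)\left(-\tfrac{\tau}{\sqrt{2}}\right) \\
&= 4 + \tfrac{4\tau}{\sqrt{2}} + \tfrac{\tau^2}{2} + 3\left(-\tfrac{2\tau}{\sqrt{2}} - \tfrac{\tau^2}{2}\right) \\
&= 4 + \tfrac{4\tau}{\sqrt{2}} + \tfrac{\tau^2}{2} - \tfrac{6\tau}{\sqrt{2}} - \tfrac{3\tau^2}{2} \\
&= 4 - \tfrac{2\tau}{\sqrt{2}} - \tfrac{2\tau^2}{2} = 4 - \tfrac{2\tau}{\sqrt{2}} - \tau^2,
\end{aligned}
\]
\[ f(\xi_0) = f(2, 0) = 2^2 + 3(2)(0) = 4, \]
\[ \therefore\ f(\xi_0 + \tau u) - f(\xi_0) = -\tfrac{2\tau}{\sqrt{2}} - \tau^2. \]
Hence,
\[ D_u f(\xi_0) = \lim_{\tau \to 0} \frac{-\tfrac{2\tau}{\sqrt{2}} - \tau^2}{\tau} = -\frac{2}{\sqrt{2}} = -\sqrt{2}. \quad □ \]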
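The limit in this solution can be watched numerically. The sketch below assumes that the function of the example is $f(x, y) = x^2 + 3xy$ (as the evaluation $f(2, 0) = 2^2 + 3(2)(0)$ suggests), with $\xi_0 = (2, 0)$ and $u = (1/\sqrt{2}, -1/\sqrt{2})$; the step sizes are arbitrary illustrative values.

```python
import math


def f(x, y):
    """Assumed function of the example: f(x, y) = x^2 + 3xy."""
    return x * x + 3 * x * y


def difference_quotient(tau):
    """(f(xi0 + tau*u) - f(xi0)) / tau at xi0 = (2, 0), u = (1/sqrt2, -1/sqrt2)."""
    s = 1 / math.sqrt(2)
    return (f(2 + tau * s, -tau * s) - f(2, 0)) / tau


# The quotient equals -sqrt(2) - tau, so it approaches -sqrt(2) as tau -> 0.
for tau in (1e-1, 1e-2, 1e-4):
    print(tau, difference_quotient(tau))

print("-sqrt(2) =", -math.sqrt(2))
```

The difference between the quotient and $-\sqrt{2}$ is exactly the term $-\tau$ that disappears in the limit.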
Theorem 3.2.16. If U ⊆ IRn is an open set and if f : U → IRm is differentiable
at x◦ ∈ U and u is a unit vector in IRn , then Du f (x◦ ) exists and
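Proof: Let $f : U \subseteq \mathbb{R}^n \to \mathbb{R}^m$, where $U$ is an open set in $\mathbb{R}^n$, and let $x_\circ \in U$.
If the derivative $Df(x_\circ)$ exists, then each of the partial derivatives $D_i f(x_\circ)$, $1 \le i \le n$,
exists. Let $u = (u_1, \dots, u_n) \in \mathbb{R}^n$. If $u = 0$, then it is clear that $D_0 f(x_\circ) = 0 = Df(x_\circ)(0)$:
$Df(x_\circ)$ is linear, therefore $Df(x_\circ)(0) = 0$. Hence, the result is true in this degenerate case.
Next, let $u \ne 0$. Then by the definition of differentiability we have, for any $\varepsilon >
0$, $\exists\, \delta > 0$ such that
\[ \|h\| < \delta \Rightarrow \|f(x_\circ + h) - f(x_\circ) - Df(x_\circ)(h)\| \le \varepsilon\|h\|. \]
Taking $h = \tau u$ with $\tau \ne 0$, the condition $\|\tau u\| < \delta$ reads
\[ 0 < |\tau| < \frac{\delta}{\|u\|}. \]
Hence, if $0 < |\tau| < \dfrac{\delta}{\|u\|}$, then
\[ \left\|\frac{f(x_\circ + \tau u) - f(x_\circ)}{\tau} - Df(x_\circ)(u)\right\| \le \varepsilon\|u\|. \]
This shows that the directional derivative exists in the direction of $u$ and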
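Example 3.2.17. Find the directional derivative of $f : \mathbb{R}^2 \to \mathbb{R} : (x, y) \mapsto
4 - x^2 - \tfrac{1}{4}y^2$ at $(1, 2)$ in the direction of $u = (\cos \pi/3, \sin \pi/3)$.
\[ D_u f(x, y) = Df(x, y) \cdot u = (-2x, -y/2) \cdot (\cos \pi/3, \sin \pi/3) = -2x\cos(\pi/3) - \tfrac{y}{2}\sin(\pi/3). \]
At $(1, 2)$ this gives $D_u f(1, 2) = -1 - \tfrac{\sqrt{3}}{2}$.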
Example 3.2.18. Find the directional derivative of f (x, y) = x2 sin 2y at (1, π/2)
in the direction of v = (3, −4).
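Solution:
$Df(x, y) = (2x\sin 2y,\ 2x^2\cos 2y)$ is continuous. Thus $f$ is differentiable. We thus
obtain the unit vector as
\[ u = \frac{v}{\|v\|} = \left(\frac{3}{5}, -\frac{4}{5}\right). \]
Hence,
\[
\begin{aligned}
D_u f(x, y) &= Df(x, y) \cdot u \\
&= (2x\sin 2y,\ 2x^2\cos 2y) \cdot \left(\tfrac{3}{5}, -\tfrac{4}{5}\right) \\
&= \tfrac{6}{5}x\sin 2y - \tfrac{8}{5}x^2\cos 2y, \\
D_u f(1, \pi/2) &= \tfrac{6}{5}\sin\pi - \tfrac{8}{5}\cos\pi \\
&= \tfrac{6}{5}(0) - \tfrac{8}{5}(-1) \\
&= \tfrac{8}{5}.
\end{aligned}
\]
□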
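The formula $D_u f = Df \cdot u$ used in Example 3.2.18 can be cross-checked against a finite-difference approximation. This is a minimal sketch; the helper names `directional_derivative` and `central_difference` and the step size `tau` are assumptions made for the illustration.

```python
import math


def f(x, y):
    """f(x, y) = x^2 sin(2y), the function of Example 3.2.18."""
    return x * x * math.sin(2 * y)


def grad_f(x, y):
    """Df(x, y) = (2x sin 2y, 2x^2 cos 2y)."""
    return (2 * x * math.sin(2 * y), 2 * x * x * math.cos(2 * y))


def unit(v):
    """Normalise a 2-vector v to unit length."""
    length = math.hypot(v[0], v[1])
    return (v[0] / length, v[1] / length)


def directional_derivative(point, v):
    """Df(point) . u, where u = v / ||v||."""
    u = unit(v)
    g = grad_f(*point)
    return g[0] * u[0] + g[1] * u[1]


def central_difference(point, v, tau=1e-6):
    """Symmetric difference quotient of f along u = v / ||v||."""
    u = unit(v)
    x, y = point
    forward = f(x + tau * u[0], y + tau * u[1])
    backward = f(x - tau * u[0], y - tau * u[1])
    return (forward - backward) / (2 * tau)


p = (1.0, math.pi / 2)
print(directional_derivative(p, (3, -4)))  # close to 8/5
print(central_difference(p, (3, -4)))      # numerical approximation of the same value
```

Both values agree with $8/5$ to within rounding and discretisation error.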
Exercise 3.2.19. Give an example of a function $f$ such that $D_u f(0, 0)$ exists for
all vectors $u$ but $f$ is not differentiable at $(0, 0)$.
Hints: Consider the function defined by
\[
f(x, y) =
\begin{cases}
\dfrac{x^2 y}{x^2 + y^2} & \text{if } (x, y) \ne (0, 0) \\[2mm]
0 & \text{if } (x, y) = (0, 0)
\end{cases}
\]
The existence of all partial derivatives at a point does not ensure differentiability
at that point.
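For the function in the hint, $f(x, y) = x^2 y/(x^2 + y^2)$ with $f(0, 0) = 0$, the directional derivatives at the origin can be computed directly from the definition. The sketch below is only an illustration (the helper name and the value of `tau` are assumptions). If $f$ were differentiable at the origin, $D_u f(0, 0)$ would have to equal $u_1 D_1 f(0, 0) + u_2 D_2 f(0, 0)$, so the comparison in the last lines is the point of the example.

```python
import math


def f(x, y):
    """f(x, y) = x^2 y / (x^2 + y^2) away from the origin, and 0 at the origin."""
    if x == 0 and y == 0:
        return 0.0
    return x * x * y / (x * x + y * y)


def directional_derivative_at_origin(u, tau=1e-8):
    """Difference quotient (f(tau*u) - f(0, 0)) / tau for a small tau."""
    return (f(tau * u[0], tau * u[1]) - f(0.0, 0.0)) / tau


e1, e2 = (1.0, 0.0), (0.0, 1.0)
u = (1 / math.sqrt(2), 1 / math.sqrt(2))

d1 = directional_derivative_at_origin(e1)  # partial derivative in x
d2 = directional_derivative_at_origin(e2)  # partial derivative in y
du = directional_derivative_at_origin(u)   # derivative along the diagonal

print("D1 f(0,0) =", d1, " D2 f(0,0) =", d2)
print("Du f(0,0) =", du, " but u1*D1 + u2*D2 =", u[0] * d1 + u[1] * d2)
```

For a unit vector $u = (a, b)$ the quotient equals $a^2 b$ for every $\tau \ne 0$, so all directional derivatives exist, yet they are not a linear function of $u$.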
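\[ \frac{\partial f(x, y)}{\partial x} = 1, \qquad \frac{\partial f(x, y)}{\partial y} = 1 \]
\[ \frac{\partial f(0, 0)}{\partial x} = 1, \qquad \frac{\partial f(0, 0)}{\partial y} = 1 \]
But if $x \ne 0$ and $y \ne 0$, then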
Problems:
2.
xy 2 if (x, y) ̸= (0, 0)
f (x, y) =
0 if (x, y) = (0, 0)
Show that Du f (0, 0) exists in a direction of a vector u ∈ IR2 but that f is not
differentiable at (0,0).
\[
\begin{aligned}
D_i f_j(x) &= \lim_{\tau \to 0} \frac{f_j(x + \tau e_i) - f_j(x)}{\tau} \\
&= \lim_{\tau \to 0} \frac{f_j(x_1, \dots, x_i + \tau, \dots, x_n) - f_j(x_1, \dots, x_n)}{\tau},
\end{aligned}
\]
provided the limit exists. These are the partial derivatives of $f_j$ with respect to $x_i$, keeping all other
components fixed. $D_i f$ and $D_i f_j$ may be denoted by $\dfrac{\partial f}{\partial x_i}$ and $\dfrac{\partial f_j}{\partial x_i}$ respectively.
Moreover,
If $x = (x_1, x_2, \dots, x_n) \in \mathbb{R}^n$, then
\[
\begin{aligned}
D_i f(x) &= \lim_{\tau \to 0} \frac{f(x + \tau e_i) - f(x)}{\tau} \\
&= \lim_{\tau \to 0} \frac{f(x_1, \dots, x_i + \tau, \dots, x_n) - f(x_1, \dots, x_n)}{\tau}.
\end{aligned}
\]
N.B. $x = (x_1, x_2, \dots, x_n)$,
$\tau e_i = (0, 0, \dots, \tau, \dots, 0)$,
$x + \tau e_i = (x_1, x_2, \dots, x_i + \tau, \dots, x_n)$. And
\[ Df(x)(u) = \sum_{i=1}^{n} u_i D_i f(x). \]
Proof:
Suppose $Df(x)$ exists. Then by the theorem which states that $D_u f(x) =
Df(x)(u)$, we have that for each of $e_1, e_2, \dots, e_n$, the partial derivatives
$D_1 f(x), D_2 f(x), \dots, D_n f(x)$ exist and are equal to $Df(x)e_1, Df(x)e_2, \dots, Df(x)e_n$
respectively.
Since $Df(x)$ is linear and $u = \sum_{i=1}^{n} u_i e_i$, we have that
\[ Df(x)u = \sum_{i=1}^{n} u_i Df(x)e_i = \sum_{i=1}^{n} u_i D_i f(x). \quad □ \]
\[ (D_i f_j(x))_{j=1}^{m}, \qquad i = 1, 2, \dots, n \]
Proof:
That $Df(x)$ exists
\[ = \lim_{\tau \to 0} \frac{f(x + \tau e_i) - f(x)}{\tau} \]
\[ D_i f(x) = Df(x)e_i \qquad (*) \]
If $(e_1^*, e_2^*, \dots, e_m^*)$ is the basis for $\mathbb{R}^m$, then since $f = (f_1, f_2, \dots, f_m)$, we have
\[ f(x) = \sum_{j=1}^{m} f_j(x) e_j^*. \]
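If $m = 1$, we have
\[ Df(x) = \left(\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \cdots, \frac{\partial f}{\partial x_n}\right). \]
The matrix $(D_i f_j(x))_{j=1}^{m}$, $i = 1, 2, \dots, n$, is called the Jacobian matrix.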
Definition 3.2.25. For m = 1, Df (x) is a 1×n-matrix given by (D1 f (x), D2 f (x), . . . , Dn f (x)).
The vector whose components are the same as Df (x) is called the gradient
of f and is denoted by ▽f or grad(f ).
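\[
Df(x, y) =
\begin{pmatrix}
\dfrac{\partial f_1}{\partial x} & \dfrac{\partial f_1}{\partial y} \\[2mm]
\dfrac{\partial f_2}{\partial x} & \dfrac{\partial f_2}{\partial y} \\[2mm]
\dfrac{\partial f_3}{\partial x} & \dfrac{\partial f_3}{\partial y}
\end{pmatrix}
=
\begin{pmatrix}
3x^2 y & x^3 \\
4x^3 y^2 & 2x^4 y \\
3x^2 & 0
\end{pmatrix},
\qquad
Df(1, 1) =
\begin{pmatrix}
3 & 1 \\
4 & 2 \\
3 & 0
\end{pmatrix}.
\]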
Exercise 3.2.27. 1. If f is a function f : IR2 → IR3 defined by
2. Find the gradient, at each point at which it exists, of the functions defined by
(a) $f(x, y) = x^2 + y^2(\sin xy)$
(b) $f(x, y) = e^x \cos y$
(c) $f(x, y, z) = \log(x^2 + 2y^2 - 3z^2)$
Theorem 3.2.28. Let $U \subseteq \mathbb{R}^n$ be an open set and let $f : U \to \mathbb{R}^m$. Suppose
$f = (f_1, f_2, \dots, f_m)$. If all the first partial derivatives of $f$ exist and are continuous
on $U$, then $f$ is differentiable.
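Proof: We prove the result for $m = 1$. We show that $Df(x_\circ)$ exists for an
arbitrary $x_\circ \in U$. Take $\varepsilon > 0$. Since $U$ is open and the partial derivatives are continuous, $\exists\, \delta > 0$ such that $B(x_\circ, \delta) \subset U$,
and such that
\[ |D_i f(x) - D_i f(x_\circ)| < \frac{\varepsilon}{n} \quad \text{for all } x \in B(x_\circ, \delta),\ 1 \le i \le n. \]
Let $h$ be sufficiently small, say $\|h\| < \delta$, and let
$x_\circ = (a_1, a_2, \dots, a_n)$. Then
\[
\begin{aligned}
f(x_\circ + h) - f(x_\circ) ={}& [f(a_1 + h_1, a_2, \dots, a_n) - f(a_1, a_2, \dots, a_n)] \\
&+ [f(a_1 + h_1, a_2 + h_2, a_3, \dots, a_n) - f(a_1 + h_1, a_2, \dots, a_n)] \\
&+ \cdots + [f(a_1 + h_1, \dots, a_n + h_n) - f(a_1 + h_1, \dots, a_{n-1} + h_{n-1}, a_n)].
\end{aligned}
\]
By the mean value theorem for functions of one variable, the $i$-th bracket equals $h_i D_i f(c_i)$
for some point $c_i \in B(x_\circ, \delta)$, so that
\[ f(x_\circ + h) - f(x_\circ) = \sum_{i=1}^{n} D_i f(c_i) h_i. \]
Since the candidate for $Df(x_\circ)h$ is $\sum_{i=1}^{n} D_i f(x_\circ) h_i$, we have
\[ f(x_\circ + h) - f(x_\circ) - \sum_{i=1}^{n} D_i f(x_\circ) h_i = \sum_{i=1}^{n} [D_i f(c_i) - D_i f(x_\circ)] h_i. \]
Therefore,
\[
\begin{aligned}
\left|f(x_\circ + h) - f(x_\circ) - \sum_{i=1}^{n} D_i f(x_\circ) h_i\right| &\le \sum_{i=1}^{n} |D_i f(c_i) - D_i f(x_\circ)|\,|h_i| \\
&\le \frac{1}{n}\sum_{i=1}^{n} |h_i|\,\varepsilon \qquad (|h_i| \le \|h\| \ \ \forall i) \\
&\le \frac{1}{n}\|h\|\,n\varepsilon \\
&= \varepsilon\|h\|
\end{aligned}
\]
whenever $\|h\| < \delta$. This shows that $f$ is differentiable. □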
Definition 3.2.30. A function $f : U \subseteq \mathbb{R}^n \to \mathbb{R}^m$ is said to be continuously
differentiable on $U$ if $Df$ is a continuous map of $U$ into $L(\mathbb{R}^n, \mathbb{R}^m)$, where $L(\mathbb{R}^n, \mathbb{R}^m)$
is the linear space of all linear maps from $\mathbb{R}^n$ to $\mathbb{R}^m$.
More explicitly, it is required that to every $x \in U$ and to every $\varepsilon > 0$, $\exists\, \delta > 0$
such that $\|Df(y) - Df(x)\| < \varepsilon$ if $y \in U$ and $\|x - y\| < \delta$. It can be proved
that $f : U \subset \mathbb{R}^n \to \mathbb{R}^m$ is continuously differentiable if and only if the first partial
derivatives $D_i f_j$ $(i = 1, 2, \dots, n;\ j = 1, 2, \dots, m)$ exist and are continuous on $U$. If $f$ is
continuously differentiable, we say that $f \in C^1(U)$.
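Exercise 3.2.31. 1. Let
\[
f(x, y) =
\begin{cases}
\dfrac{x^2 y^4}{x^2 + 6y^8} & \text{if } (x, y) \ne (0, 0) \\[2mm]
0 & \text{if } (x, y) = (0, 0).
\end{cases}
\]
2. Show that $\dfrac{\partial f}{\partial x}(0, 0)$ and $\dfrac{\partial f}{\partial y}(0, 0)$ exist.
(i) Is $\dfrac{\partial^2 f}{\partial x\,\partial y} = \dfrac{\partial^2 f}{\partial y\,\partial x}$ true?
(ii) Is $f$ differentiable at $(0, 0)$ or continuous at $(0, 0)$?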
Chapter 4
4.1 Introduction
First, we prove the Chain rule otherwise known as the Composite mapping
theorem (CMT).
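\[
\begin{aligned}
= \big\| & g[f(x)] - g[f(x_\circ)] - Dg(f(x_\circ))[f(x) - f(x_\circ)] \\
& + Dg(f(x_\circ))[f(x) - f(x_\circ)] - Dg(f(x_\circ))\,Df(x_\circ)(x - x_\circ) \big\|.
\end{aligned}
\]
Now, given $\varepsilon > 0$, by the definition of the derivative of $g$, there exists a $\delta_1 > 0$ such that
\[ \|y - f(x_\circ)\| < \delta_1 \Rightarrow \|g(y) - g(f(x_\circ)) - Dg(f(x_\circ))(y - f(x_\circ))\| < \frac{\varepsilon}{2M}\|y - f(x_\circ)\|, \]
\[ \|g[f(x)] - g[f(x_\circ)] - Dg(f(x_\circ))[f(x) - f(x_\circ)]\| < \frac{\varepsilon}{2M}\|f(x) - f(x_\circ)\| \le \frac{\varepsilon}{2}\|x - x_\circ\|, \]
\[ \|x - x_\circ\| < \delta_2 \Rightarrow \frac{\|g(f(x)) - g(f(x_\circ)) - Dg(f(x_\circ))[f(x) - f(x_\circ)]\|}{\|x - x_\circ\|} < \frac{\varepsilon}{2}. \]
But $Dg(f(x_\circ))$ is linear. Therefore, there exists $K > 0$ such that
Therefore,
or
\[ \frac{\|Dg(f(x_\circ))[f(x) - f(x_\circ) - Df(x_\circ)(x - x_\circ)]\|}{\|x - x_\circ\|} < \frac{\varepsilon}{2}. \]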
Therefore, if ||x − x◦ || < δ = min(δ2 , δ3 ), we have
Therefore,
D(g ◦ f )(x◦ ) = (Dg)f (x◦ ).Df (x◦ ). □
g ◦ f : IRn → IRp
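\[ a_{11} = \frac{\partial h_1}{\partial x_1} = \sum_{k=1}^{m} \frac{\partial g_1}{\partial y_k}\frac{\partial f_k}{\partial x_1}, \]
\[ a_{12} = \frac{\partial h_1}{\partial x_2} = \sum_{k=1}^{m} \frac{\partial g_1}{\partial y_k}\frac{\partial f_k}{\partial x_2}, \]
and so on.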
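Example 4.1.2. Let $f = f(x, y)$, where $x = u(r, \theta)$, $y = v(r, \theta)$. Choose $u(r, \theta) =
r\cos\theta$, $v(r, \theta) = r\sin\theta$. Set $h(r, \theta) = f(r\cos\theta, r\sin\theta)$.
\[ f(x, y) = re^{i\theta} = r\cos\theta + ir\sin\theta \]
\[ w := f(u(r, \theta), v(r, \theta)), \qquad w = u + iv \]
Using $h(r, \theta) = f(r\cos\theta, r\sin\theta)$,
\[ \frac{\partial h}{\partial r} = \frac{\partial f}{\partial x}\cdot\frac{\partial x}{\partial r} + \frac{\partial f}{\partial y}\cdot\frac{\partial y}{\partial r}, \qquad
\frac{\partial h}{\partial \theta} = \frac{\partial f}{\partial x}\cdot\frac{\partial x}{\partial \theta} + \frac{\partial f}{\partial y}\cdot\frac{\partial y}{\partial \theta}. \]
But
\[ \frac{\partial x}{\partial r} = \cos\theta, \quad \frac{\partial x}{\partial \theta} = -r\sin\theta, \quad
\frac{\partial y}{\partial r} = \sin\theta, \quad \frac{\partial y}{\partial \theta} = r\cos\theta, \]
so that
\[ \frac{\partial h}{\partial r} = \frac{\partial f}{\partial x}\cos\theta + \frac{\partial f}{\partial y}\sin\theta, \qquad
\frac{\partial h}{\partial \theta} = -r\sin\theta\,\frac{\partial f}{\partial x} + r\cos\theta\,\frac{\partial f}{\partial y}. \]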
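The two polar-coordinate formulas can be tested on a concrete function. In the sketch below, $f(x, y) = x^2 y + y$ is a hypothetical sample function chosen only for illustration (it does not come from the notes), and the chain-rule expressions are compared with central-difference approximations of $\partial h/\partial r$ and $\partial h/\partial\theta$.

```python
import math


def f(x, y):
    """Hypothetical sample function, used only to test the formulas."""
    return x * x * y + y


def fx(x, y):
    """Partial derivative of the sample f with respect to x."""
    return 2 * x * y


def fy(x, y):
    """Partial derivative of the sample f with respect to y."""
    return x * x + 1


def h(r, theta):
    """h(r, theta) = f(r cos(theta), r sin(theta))."""
    return f(r * math.cos(theta), r * math.sin(theta))


def chain_rule(r, theta):
    """(dh/dr, dh/dtheta) from the chain-rule formulas of Example 4.1.2."""
    x, y = r * math.cos(theta), r * math.sin(theta)
    dh_dr = fx(x, y) * math.cos(theta) + fy(x, y) * math.sin(theta)
    dh_dtheta = -r * math.sin(theta) * fx(x, y) + r * math.cos(theta) * fy(x, y)
    return dh_dr, dh_dtheta


def numeric(r, theta, eps=1e-6):
    """(dh/dr, dh/dtheta) by central differences."""
    dh_dr = (h(r + eps, theta) - h(r - eps, theta)) / (2 * eps)
    dh_dtheta = (h(r, theta + eps) - h(r, theta - eps)) / (2 * eps)
    return dh_dr, dh_dtheta


print(chain_rule(2.0, 0.7))
print(numeric(2.0, 0.7))
```

The two printed pairs should agree to several decimal places; any other differentiable sample function could be substituted for `f`, provided `fx` and `fy` are updated to match.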
Example 4.1.3. Find the Jacobian of f (x, y) = (sin(x sin y), (x + y)2 ) if
f : IR2 → IR2 .
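Solution:
\[ \frac{\partial f_1}{\partial x} = \cos(x\sin y)\,\frac{\partial}{\partial x}(x\sin y) = \sin y\cos(x\sin y) \]
\[ \frac{\partial f_2}{\partial x} = 2(x + y)\,\frac{\partial}{\partial x}(x + y) = 2(x + y) \]
\[ \frac{\partial f_1}{\partial y} = \cos(x\sin y)\,\frac{\partial}{\partial y}(x\sin y) = x\cos y\cos(x\sin y) \]
\[ \frac{\partial f_2}{\partial y} = 2(x + y) \]
By the general rule for derivatives, we have the Jacobian matrix (where $x = x_1$ and
$y = x_2$) as
\[
\begin{pmatrix}
\sin y\cos(x\sin y) & x\cos y\cos(x\sin y) \\
2(x + y) & 2(x + y)
\end{pmatrix}.
\]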
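The Jacobian of Example 4.1.3 can be compared with a finite-difference approximation. This is a minimal sketch: the function names, the test point and the step size `eps` are illustrative assumptions.

```python
import math


def f(x, y):
    """f(x, y) = (sin(x sin y), (x + y)^2), as in Example 4.1.3."""
    return (math.sin(x * math.sin(y)), (x + y) ** 2)


def jacobian_formula(x, y):
    """Jacobian matrix from the partial derivatives computed by hand."""
    c = math.cos(x * math.sin(y))
    return [[math.sin(y) * c, x * math.cos(y) * c],
            [2 * (x + y), 2 * (x + y)]]


def jacobian_numeric(x, y, eps=1e-6):
    """Jacobian matrix by central differences; row j holds the partials of f_j."""
    fxp, fxm = f(x + eps, y), f(x - eps, y)
    fyp, fym = f(x, y + eps), f(x, y - eps)
    return [[(fxp[j] - fxm[j]) / (2 * eps), (fyp[j] - fym[j]) / (2 * eps)]
            for j in range(2)]


print(jacobian_formula(0.5, 1.2))
print(jacobian_numeric(0.5, 1.2))
```

Note that the second row has two equal entries while the first does not, which illustrates that a Jacobian matrix need not be symmetric.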
Remark 4.1.4. The Jacobian matrices generally are not symmetric and indeed,
need not be square. Symmetry is only a property of the second derivatives.
(a) Let $f(u, v, w) = \left(e^{u - w}, \cos(u + v) + \sin(u + v + w)\right)$ and $g(x, y) = \left(e^x, \cos(y - x), e^{-y}\right)$.
(d) Let $h(x, y) = f(u(x, y), v(x, y))$ and $f(u, v) = \dfrac{u^2 + v^2}{u^2 - v^2}$, where $u(x, y) =
e^{-x - y}$, $v(x, y) = e^{xy}$. Find $\dfrac{\partial h}{\partial x}$, $\dfrac{\partial h}{\partial y}$.
Definition 4.1.6. A subset E of a linear space X is said to be convex if and only
if, for each pair of points x, y ∈ E, the line segment joining x and y lies in E
= (1 − t)||x|| + tδ = δ,
Theorem 4.1.7. (Mean Value Theorem) Let U be an open set in IRn and consider
f : U → IR. Suppose the set U contains the points x, y and the line segment L[x, y]
joining them and that f is differentiable at every point of this segment . There exists
a point ξ ∈ L[x, y] such that
F : (−r, r + 1) → IR by
By the chain rule,
By applying the MVT for function of single variables, ∃t ∈ (0, 1) such that
= Df (ξ)(y − x). □
Remark 4.1.8. The theorem above is not valid for vector-valued functions as the
following example shows
Example 4.1.9. The function f : IR → IR2 is given by f (x) = (cos x, sin x). Prove
that there are points u, v ∈ IR such that
Solution:
If v = u + 2π, then v − u = 2π
Also
But, for all $\xi \in \mathbb{R}$,
\[ Df(\xi)(v - u) = Df(\xi)(2\pi) \]
Hence,
= 2π. □
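The failure of the mean value theorem for $f(x) = (\cos x, \sin x)$ can be seen numerically. The sketch below uses an arbitrary starting point `u = 0.3` (an illustrative choice) and a few sample values of $\xi$: the left-hand side $\|f(v) - f(u)\|$ vanishes, while $\|Df(\xi)(v - u)\|$ equals $2\pi$ for every $\xi$.

```python
import math


def f(x):
    """f(x) = (cos x, sin x), a map from R into R^2."""
    return (math.cos(x), math.sin(x))


def Df(xi):
    """Derivative of f at xi, written as the vector (-sin xi, cos xi)."""
    return (-math.sin(xi), math.cos(xi))


u = 0.3                      # arbitrary starting point (illustrative)
v = u + 2 * math.pi

# ||f(v) - f(u)|| is zero up to rounding, because f is 2*pi-periodic.
lhs = math.hypot(f(v)[0] - f(u)[0], f(v)[1] - f(u)[1])

# ||Df(xi)(v - u)|| = ||Df(xi)|| * (v - u) for several sample points xi.
rhs = [math.hypot(*Df(xi)) * (v - u) for xi in (u, 1.0, 2.5, v)]

print("||f(v) - f(u)|| =", lhs)
print("||Df(xi)(v - u)|| =", rhs)
```

Since $\|Df(\xi)\| = 1$ for all $\xi$, no choice of $\xi$ can make $Df(\xi)(v - u)$ equal to $f(v) - f(u) = 0$.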
The $r$-th partial derivative of this function at $x$, i.e., $D_r(D_s f)(x)$, is often denoted
by $D_{rs} f(x)$. The function $D_{rs} f$ is called the second order or mixed partial
derivative of $f$.
\[
f(x, y) =
\begin{cases}
\dfrac{xy(x^2 - y^2)}{x^2 + y^2} & \text{if } (x, y) \ne (0, 0) \\[2mm]
0 & \text{if } (x, y) = (0, 0)
\end{cases}
\]
(a) Show that $D_2 f(x, 0) = x$ for all $x$ and $D_1 f(0, y) = -y$ for all $y$.
(b) Show that $D_{12} f(0, 0) \ne D_{21} f(0, 0)$.
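Solution:
(a) Note that $D_{rs} f = \dfrac{\partial^2 f}{\partial x_r\,\partial x_s}$.
Thus, if $(x, y) \ne (0, 0)$, we have
\[
\begin{aligned}
D_1 f(x, y) = \frac{\partial f(x, y)}{\partial x}
&= \frac{(x^2 + y^2)(3x^2 y - y^3) - xy(x^2 - y^2)(2x)}{(x^2 + y^2)^2} \\
&= \frac{y(x^4 + 4x^2 y^2 - y^4)}{(x^2 + y^2)^2}.
\end{aligned}
\]
Hence
\[ D_1 f(0, y) = -y, \qquad D_1 f(0, 0) = 0. \]
Since $D_1 f(0, 0) = 0$, then
\[ D_2 D_1 f(0, 0) = -1. \]
Also, differentiating with respect to $y$,
\[ D_2 f(x, y) = \frac{\partial f}{\partial y}(x, y) = \frac{(x^2 + y^2)(x^3 - 3xy^2) - xy(x^2 - y^2)(2y)}{(x^2 + y^2)^2}. \]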
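The two mixed partial derivatives at the origin can be approximated directly from the first-order partials. In the sketch below, `D1f` is the formula from the solution and `D2f` is the displayed quotient for $D_2 f$ after simplifying its numerator to $x(x^4 - 4x^2 y^2 - y^4)$; the step `k` is an arbitrary small value chosen for illustration.

```python
def D1f(x, y):
    """Partial derivative of f with respect to x (0 at the origin)."""
    if x == 0 and y == 0:
        return 0.0
    return y * (x**4 + 4 * x**2 * y**2 - y**4) / (x**2 + y**2) ** 2


def D2f(x, y):
    """Partial derivative of f with respect to y (0 at the origin)."""
    if x == 0 and y == 0:
        return 0.0
    return x * (x**4 - 4 * x**2 * y**2 - y**4) / (x**2 + y**2) ** 2


k = 1e-4  # small increment (illustrative)

# Differentiate D1 f in the y-direction at the origin.
D2D1 = (D1f(0.0, k) - D1f(0.0, 0.0)) / k
# Differentiate D2 f in the x-direction at the origin.
D1D2 = (D2f(k, 0.0) - D2f(0.0, 0.0)) / k

print("D2 D1 f(0,0) is approximately", D2D1)
print("D1 D2 f(0,0) is approximately", D1D2)
```

Because $D_1 f(0, y) = -y$ and $D_2 f(x, 0) = x$, the two quotients are $-1$ and $+1$ respectively for every nonzero `k`, so the mixed partials at the origin differ.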
Theorem 4.2.4. Let $U \subset \mathbb{R}^n$ be an open set and let $f$ be a continuous function
defined on $U$ into $\mathbb{R}$. Suppose that the partial derivatives $D_1 f$, $D_2 f$, $D_1 D_2 f$ and
$D_2 D_1 f$ exist and are continuous. Then $D_1 D_2 f = D_2 D_1 f$.
Then
We observe that
g1 (x + h) − g1 (x) = g2 (y + k) − g2 (y).
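\[ D_1 f(x + \theta_1 h, y + k) - D_1 f(x + \theta_1 h, y) = k(D_1 f)'(x + \theta_1 h, \eta) \]
\[ D_1 f(x + \theta_1 h, y + k) - D_1 f(x + \theta_1 h, y) = kD_2 D_1 f(x + \theta_1 h, y + \phi_1 k), \]
where $0 < \theta_1 < 1$ and $0 < \phi_1 < 1$.
Similarly, for some $0 < \theta_2 < 1$ and $0 < \phi_2 < 1$, we have
\[ D_2 f(x + h, y + \phi_2 k) - D_2 f(x, y + \phi_2 k) = h(D_2 f)'(x + \theta_2 h, y + \phi_2 k). \]
Therefore,
\[ g_2(y + k) - g_2(y) = khD_1 D_2 f(x + \theta_2 h, y + \phi_2 k). \]
Hence,
\[ hkD_2 D_1 f(x + \theta_1 h, y + \phi_1 k) = khD_1 D_2 f(x + \theta_2 h, y + \phi_2 k), \]
\[ D_2 D_1 f(x + \theta_1 h, y + \phi_1 k) \to D_2 D_1 f(x, y) \quad \text{as } h, k \to 0. \]
Also,
\[ D_1 D_2 f(x + \theta_2 h, y + \phi_2 k) \to D_1 D_2 f(x, y) \quad \text{as } h, k \to 0. \]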
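\[ f(x, y) = xy^2\cos x^2. \]
Show that
\[ \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^2 f}{\partial y\,\partial x}. \]
Solution:
\[
\begin{aligned}
\frac{\partial f}{\partial x} &= y^2[-2x^2\sin x^2 + \cos x^2] \\
&= -2x^2 y^2\sin x^2 + y^2\cos x^2, \\
\frac{\partial^2 f}{\partial x\,\partial y} &= -4x^2 y\sin x^2 + \cos x^2 \cdot 2y \\
&= -4x^2 y\sin x^2 + 2y\cos x^2, \\
\frac{\partial f}{\partial y} &= x\cos x^2\,[2y] \\
&= 2xy\cos x^2, \\
\frac{\partial^2 f}{\partial y\,\partial x} &= 2xy(-2x\sin x^2) + \cos x^2\,[2y] \\
&= -4x^2 y\sin x^2 + 2y\cos x^2.
\end{aligned}
\]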
Definition 4.2.6. A function $f : U \subseteq \mathbb{R}^n \to \mathbb{R}$, where $U$ is an open set, is
said to be of class $C^k(U)$ if the $k$-th derivative of $f$ exists and is continuous.
Equivalently,
$f$ is of class $C^k(U)$ if all the $k$-th order partial derivatives of $f$ exist and are continuous.
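Definition 4.2.7. Let $f$ be a real-valued function defined on $\mathbb{R}^n$. Then the second-order
differential $d^2 f$ is a function of $n$-dimensional variables defined for $x \in \mathbb{R}^n$
where $f$ has continuous second order partial derivatives and for $t \in \mathbb{R}^n$,
\[ d^2 f(x; t) = \sum_{i=1}^{n}\sum_{j=1}^{n} D_{ij} f(x)\,t_j t_i, \]
if all the third order partial derivatives exist at $x$ and are continuous.
if all the $n$th partial derivatives exist and are continuous. Now, consider $f : \mathbb{R}^2 \to
\mathbb{R}$. If we denote an element of $\mathbb{R}^2$ by $(x, y)$ and if $t = (dx, dy)$, then
\[
\begin{aligned}
d^2 f((x, y), (dx, dy)) &= \frac{\partial^2 f}{\partial x^2}(dx)^2 + \frac{\partial^2 f}{\partial x\,\partial y}dx\,dy + \frac{\partial^2 f}{\partial y\,\partial x}dy\,dx + \frac{\partial^2 f}{\partial y^2}(dy)^2 \\
&= \frac{\partial^2 f}{\partial x^2}(dx)^2 + 2\frac{\partial^2 f}{\partial x\,\partial y}dx\,dy + \frac{\partial^2 f}{\partial y^2}(dy)^2.
\end{aligned}
\]
Also,
\[ d^3 f(x, t) = \frac{\partial^3 f}{\partial x^3}(dx)^3 + 3\frac{\partial^3 f}{\partial x^2\,\partial y}(dx)^2 dy + 3\frac{\partial^3 f}{\partial x\,\partial y^2}dx(dy)^2 + \frac{\partial^3 f}{\partial y^3}(dy)^3, \]
\[
\begin{aligned}
d^4 f(x, t) ={}& \frac{\partial^4 f}{\partial x^4}(dx)^4 + 4\frac{\partial^4 f}{\partial x^3\,\partial y}(dx)^3 dy + 6\frac{\partial^4 f}{\partial x^2\,\partial y^2}(dx)^2(dy)^2 \\
&+ 4\frac{\partial^4 f}{\partial x\,\partial y^3}dx(dy)^3 + \frac{\partial^4 f}{\partial y^4}(dy)^4,
\end{aligned}
\]
\[
\begin{aligned}
d^m f(x, t) ={}& \frac{\partial^m f}{\partial x^m}(dx)^m + \binom{m}{1}\frac{\partial^m f}{\partial x^{m-1}\,\partial y}(dx)^{m-1}dy \\
&+ \binom{m}{2}\frac{\partial^m f}{\partial x^{m-2}\,\partial y^2}(dx)^{m-2}(dy)^2 + \cdots + \frac{\partial^m f}{\partial y^m}(dy)^m.
\end{aligned}
\]
We are now ready to state and prove Taylor's theorem.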
Theorem 4.2.8. (Taylor's Theorem) Let $m$ be a positive integer, and let $U \subset \mathbb{R}^n$
be an open set. Suppose that $f : U \to \mathbb{R}$ has continuous partial derivatives up to
and including order $m$ at each point of $U$. If $a \in U$, $b \in U$ are such that $L[a, b] \subset U$,
then $\exists$ a point $\xi \in L[a, b]$ such that
\[
\begin{aligned}
f(b) &= f(a) + df(a, b - a) + \frac{1}{2!}d^2 f(a, b - a) + \cdots + \frac{1}{(m - 1)!}d^{m-1} f(a, b - a) + \frac{1}{m!}d^m f(\xi, b - a) \\
&= f(a) + \sum_{k=1}^{m-1}\frac{1}{k!}d^k f(a, b - a) + \frac{1}{m!}d^m f(\xi, b - a).
\end{aligned}
\]
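Proof:
Define a new function of a single variable by the equation
\[ g(t) = f(\psi(t)) = f[a + t(b - a)]. \]
Then
\[ g(0) = f(a), \qquad g(1) = f(b), \]
where
\[ \psi(t) = tb + (1 - t)a = a + t(b - a). \]
By applying the chain rule to $g$, we see that $g'(t)$ exists in the interval $(0, 1)$
and is given by
\[ g'(t) = \sum_{j=1}^{n} D_j f(\psi(t))(b_j - a_j) = df(\psi(t), b - a). \]
Similarly,
\[ g''(t) = d^2 f(\psi(t), b - a), \qquad \dots, \qquad g^{(m)}(t) = d^m f(\psi(t), b - a). \]
By Taylor's theorem for functions of one variable, there exists $\theta \in (0, 1)$ such that
\[ g(1) = g(0) + g'(0) + \frac{1}{2!}g''(0) + \frac{1}{3!}g'''(0) + \cdots + \frac{1}{(m - 1)!}g^{(m-1)}(0) + \frac{1}{m!}g^{(m)}(\theta). \]
Since $g(0) = f(a)$ and $g(1) = f(b)$, this reads
\[ f(b) = f(a) + \sum_{j=1}^{n} D_j f(a)(b_j - a_j) + \frac{1}{2!}\sum_{i=1}^{n}\sum_{j=1}^{n} D_{ij} f(a)(b_j - a_j)(b_i - a_i) + \cdots + \frac{1}{m!}g^{(m)}(\theta), \]
where $g^{(m)}(\theta) = d^m f(\xi, b - a)$ with $\xi = \psi(\theta) \in L[a, b]$. □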
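Example 4.2.9. Let $f : \mathbb{R}^2 \to \mathbb{R}$ be defined by $f(x, y) = \sin(xy)$. Obtain the
Taylor expansion of $f$ at the point $(0, 0)$ up to order 2.
Solution:
$a = (0, 0)$, $b = (x, y)$,
$f(a) = f(0, 0) = \sin 0 = 0$,
\[
f(x, y) \approx f(0, 0) + \frac{\partial f(0, 0)}{\partial x}x + \frac{\partial f(0, 0)}{\partial y}y
+ \frac{1}{2!}\left[\frac{\partial^2 f(0, 0)}{\partial x^2}x^2 + 2\frac{\partial^2 f(0, 0)}{\partial x\,\partial y}xy + \frac{\partial^2 f(0, 0)}{\partial y^2}y^2\right].
\]
Here,
$f(a) = f(0, 0)$, $dx = x - 0$, $dy = y - 0$, $b = (x, y)$, and
\[
\begin{aligned}
\frac{\partial f}{\partial x} &= y\cos xy, & \frac{\partial f(0, 0)}{\partial x} &= 0, \\
\frac{\partial f}{\partial y} &= x\cos xy, & \frac{\partial f(0, 0)}{\partial y} &= 0, \\
\frac{\partial^2 f}{\partial x^2} &= -y^2\sin(xy), & \frac{\partial^2 f(0, 0)}{\partial x^2} &= 0, \\
\frac{\partial^2 f}{\partial y^2} &= -x^2\sin(xy), & \frac{\partial^2 f(0, 0)}{\partial y^2} &= 0, \\
\frac{\partial^2 f}{\partial x\,\partial y} &= -xy\sin(xy) + \cos(xy), & \frac{\partial^2 f(0, 0)}{\partial x\,\partial y} &= 1.
\end{aligned}
\]
$\therefore\ \sin(xy) \approx xy$ up to order 2.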
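The quality of the second-order approximation $\sin(xy) \approx xy$ near the origin can be checked numerically. In this sketch the sample points are arbitrary illustrative choices; the comparison bound $|xy|^3/6$ comes from the standard one-variable estimate $|\sin t - t| \le |t|^3/6$.

```python
import math


def taylor2(x, y):
    """Second-order Taylor polynomial of sin(xy) at (0, 0)."""
    return x * y


# Compare the exact value with the approximation at a few sample points.
for x, y in ((0.1, 0.2), (0.3, -0.4), (0.5, 0.5)):
    exact = math.sin(x * y)
    error = abs(exact - taylor2(x, y))
    bound = abs(x * y) ** 3 / 6
    print((x, y), exact, taylor2(x, y), error, bound)
```

The error stays below the bound at every sample point and decreases rapidly as $(x, y)$ approaches the origin.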
Problems 4.2
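\[ f(x) = P_{m, x_\circ}(x) + \frac{1}{m!}\left[d^m f(\tilde{x}, x - x_\circ) - d^m f(x_\circ, x - x_\circ)\right] \]
\[ d^m f(\tilde{x}, x - x_\circ) = \sum_{i_1, \dots, i_m = 1}^{n} \frac{\partial^m f(\tilde{x})}{\partial x_{i_m}\,\partial x_{i_{m-1}} \cdots \partial x_{i_1}}(x_{i_1} - x^0_{i_1})(x_{i_2} - x^0_{i_2}) \cdots (x_{i_m} - x^0_{i_m}) \]
\[ d^m f(x_\circ, x - x_\circ) = \sum_{i_1, \dots, i_m = 1}^{n} \frac{\partial^m f(x_\circ)}{\partial x_{i_m}\,\partial x_{i_{m-1}} \cdots \partial x_{i_1}}(x_{i_1} - x^0_{i_1})(x_{i_2} - x^0_{i_2}) \cdots (x_{i_m} - x^0_{i_m}) \]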
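\[
\begin{aligned}
\frac{\partial f}{\partial x} &= \tfrac{1}{2}(1 + x + 4y)^{-1/2}, & \frac{\partial f(\xi_\circ)}{\partial x} &= \frac{1}{4}, \\
\frac{\partial f}{\partial y} &= \tfrac{1}{2}(1 + x + 4y)^{-1/2} \cdot 4, & \frac{\partial f(\xi_\circ)}{\partial y} &= 1, \\
\frac{\partial^2 f}{\partial x^2} &= -\tfrac{1}{4}(1 + x + 4y)^{-3/2} \cdot 1, & \frac{\partial^2 f(\xi_\circ)}{\partial x^2} &= -\frac{1}{32}, \\
\frac{\partial^2 f}{\partial x\,\partial y} &= 2\left[-\tfrac{1}{2}(1 + x + 4y)^{-3/2} \cdot (1)\right], & \frac{\partial^2 f(\xi_\circ)}{\partial x\,\partial y} &= -\frac{1}{8}, \\
\frac{\partial^2 f}{\partial y^2} &= 2\left[-\tfrac{1}{2}(1 + x + 4y)^{-3/2}\right] \cdot 4, & \frac{\partial^2 f(\xi_\circ)}{\partial y^2} &= -\frac{1}{2}.
\end{aligned}
\]
Thus,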
Example 4.2.10. 1. Obtain the Taylor expansion, up to and including order 3, near the
point $(1, -2)$ of the function $f(x, y) = x^2 y + 3y - 2$.
5. Obtain the third Taylor polynomial of $f(x, y) = \cos(x^2 y)$ in powers of $(x - 2)$
and $(y - \pi/6)$.
Chapter 5
Applications to Extremum
Problems and the method of
Lagrange Multipliers
Example 5.1.2. Find the critical point of the function $f : \mathbb{R}^2 \to \mathbb{R}$ defined by
\[ f(x, y) = e^{-(x^2 + y^2)}. \]
such that f (x) ≤ f (x◦ ) for every x ∈ B(x◦ , δ).
f is said to have absolute maximum at the point x◦ ∈ U if f (x) ≤ f (x◦ ) for
every x ∈ U . The concepts of local minimum and absolute minimum are defined
analogously.
Definition 5.1.4. A point at which $f$ has either a local maximum or a local minimum
is called an EXTREMUM point of $f$.
Remark 5.1.5. The following result gives a necessary condition for a point to
be a local maximum: it must be a critical point.
Proof: Let $u$ be a non-zero element of $\mathbb{R}^n$. Then for small values of $t$, $x_\circ + tu \in
U$ and $f(x_\circ + tu)$ is defined. (Recall the proof from one-variable calculus: because
$g(0)$ is a local maximum, $g(t) \le g(0)$ for small $t > 0$, so $g(t) - g(0) \le 0$, and hence
$g'(0) = \lim_{t \to 0^+} \dfrac{g(t) - g(0)}{t} \le 0$, where $\lim_{t \to 0^+}$ means the limit as $t \to 0$, $t > 0$. For
small $t < 0$, we similarly have $g'(0) = \lim_{t \to 0^-} \dfrac{g(t) - g(0)}{t} \ge 0$. Therefore, $g'(0) = 0$.)
Furthermore, for small values of $t$, $tu$ is small and hence $x_\circ + tu \in B(x_\circ, \delta)$, so that
$f(x_\circ + tu) \le f(x_\circ)$. Therefore, the function of a single variable $g(t) = f(x_\circ + tu)$
has a local maximum at $t = 0$. Hence, its derivative satisfies $g'(0) = 0$.
By the chain rule, $g'(0) = [Df(x_\circ)]u = \nabla f(x_\circ) \cdot u$, so $\nabla f(x_\circ) \cdot u = 0$. Since $u$ is arbitrary, $D_1 f(x_\circ) = 0, \cdots, D_n f(x_\circ) = 0$.
This completes the proof. □
Definition 5.1.7. A real-valued function $Q$ defined on $\mathbb{R}^n$ by an equation of the
type
\[ Q(x) = \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij} x_i x_j, \]
where $x = (x_1, \cdots, x_n)$ and the $a_{ij}$'s are real numbers, is called a quadratic form. The
quadratic form is called
2. Positive definite if x ̸= 0 ⇒ Q(x) > 0
The last theorem gave the necessary condition for a local maximum point of a
C 1 -function to be a critical point. The next result gives a sufficient condition for a
critical point to be a local maximum or local minimum.
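(a) if $\dfrac{\partial^2 f(x_\circ, y_\circ)}{\partial x^2} > 0$ and $\dfrac{\partial^2 f(x_\circ, y_\circ)}{\partial x^2}\dfrac{\partial^2 f(x_\circ, y_\circ)}{\partial y^2} - \left(\dfrac{\partial^2 f(x_\circ, y_\circ)}{\partial x\,\partial y}\right)^2 > 0$,
it follows that $(x_\circ, y_\circ)$ is a local minimum of $f$.
(b) if $\dfrac{\partial^2 f(x_\circ, y_\circ)}{\partial x^2} < 0$ and $\dfrac{\partial^2 f(x_\circ, y_\circ)}{\partial x^2}\dfrac{\partial^2 f(x_\circ, y_\circ)}{\partial y^2} - \left(\dfrac{\partial^2 f(x_\circ, y_\circ)}{\partial x\,\partial y}\right)^2 > 0$,
it follows that $(x_\circ, y_\circ)$ is a local maximum of $f$.
(c) If
\[ \frac{\partial^2 f(x_\circ, y_\circ)}{\partial x^2}\frac{\partial^2 f(x_\circ, y_\circ)}{\partial y^2} - \left(\frac{\partial^2 f(x_\circ, y_\circ)}{\partial x\,\partial y}\right)^2 < 0, \]
then $(x_\circ, y_\circ)$ is neither a local maximum nor a local minimum. In this case, $(x_\circ, y_\circ)$
is a saddle point.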
Before giving the proof of the above theorem, we shall need the following.
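Definition 5.1.11. The Hessian matrix is the derivative of the Jacobian matrix,
given by
\[
H =
\begin{pmatrix}
\dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x\,\partial y} \\[2mm]
\dfrac{\partial^2 f}{\partial x\,\partial y} & \dfrac{\partial^2 f}{\partial y^2}
\end{pmatrix},
\qquad
\det(H) =
\begin{vmatrix}
\dfrac{\partial^2 f(x_\circ, y_\circ)}{\partial x^2} & \dfrac{\partial^2 f(x_\circ, y_\circ)}{\partial x\,\partial y} \\[2mm]
\dfrac{\partial^2 f(x_\circ, y_\circ)}{\partial y\,\partial x} & \dfrac{\partial^2 f(x_\circ, y_\circ)}{\partial y^2}
\end{vmatrix}.
\]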
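Proof:
Let $A = \dfrac{\partial^2 f(x_\circ, y_\circ)}{\partial x^2}$, $B = \dfrac{\partial^2 f(x_\circ, y_\circ)}{\partial x\,\partial y}$ and $C = \dfrac{\partial^2 f(x_\circ, y_\circ)}{\partial y^2}$.
Then $D = \begin{vmatrix} A & B \\ B & C \end{vmatrix} = AC - B^2$.
We only need to look at the quadratic form $Q(x, y) = Ax^2 + 2Bxy + Cy^2$. Suppose
$A \ne 0$. Specifically, let $A > 0$. Then we write
\[
\begin{aligned}
Q(x, y) &= A\left[x^2 + \frac{2B}{A}xy + \frac{B^2}{A^2}y^2 + \left(\frac{C}{A} - \frac{B^2}{A^2}\right)y^2\right] \\
&= A\left(x^2 + \frac{2B}{A}xy + \frac{B^2}{A^2}y^2\right) + \left(C - \frac{B^2}{A}\right)y^2 \\
&= A\left(x + \frac{B}{A}y\right)^2 + \frac{1}{A}(CA - B^2)y^2 \\
&= A\left(x + \frac{B}{A}y\right)^2 + \frac{D}{A}y^2.
\end{aligned}
\]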
The quadratic form Q is positive definite if D > 0. Therefore, by the last theorem,
the point (x◦ , y◦ ) is a local minimum.
(b) Suppose $A < 0$, then $\dfrac{D}{A}$ is negative. Hence the quadratic form is negative
definite.
(iii) A = C = 0, then B ̸= 0, and Q(1, 1) = −Q(1, −1). In each case, the two given
values of Q differ in sign and so the point (x◦ , y◦ ) is neither a local maximum
nor a local minimum. Therefore, it follows that the point (x◦ , y◦ ) is a saddle
point of f .
Example 5.1.12. Investigate the nature of the critical points of the function
f (x, y) = y 3 − 2x2 − 2y 2 + y and determine whether each is a local minimum, local
maximum or saddle point.
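Solution:
\[ \frac{\partial f}{\partial x} = -4x, \qquad \frac{\partial f}{\partial y} = 3y^2 - 4y + 1. \]
At critical points,
$-4x = 0 \Rightarrow x = 0$ and
$3y^2 - 4y + 1 = 0 \Rightarrow y = 1$ or $y = 1/3$.
The critical points are $(0, 1)$, $(0, 1/3)$.
\[ A = \frac{\partial^2 f(x_\circ, y_\circ)}{\partial x^2} = -4, \qquad B = \frac{\partial^2 f(x_\circ, y_\circ)}{\partial x\,\partial y} = 0, \qquad C = \frac{\partial^2 f(x_\circ, y_\circ)}{\partial y^2} = 6y - 4. \]
\[
\begin{array}{c|cc}
 & (0, 1) & (0, 1/3) \\ \hline
A & -4 & -4 \\
B & 0 & 0 \\
C & 2 & -2 \\
D & -8 & 8
\end{array}
\]
We see from the table that:
1. For the point $(0, 1)$, we have $D < 0$. Therefore, $(0, 1)$ is a saddle point.
2. For the point $(0, 1/3)$, we have $A < 0$, $D > 0$. Therefore, $(0, 1/3)$ is a local
maximum.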
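The second-derivative test used in Example 5.1.12 is easy to mechanise. The sketch below is an illustration only: the function names `classify` and `second_partials` are assumptions, and the rule implemented is the standard test in terms of $A$, $B$, $C$ and $D = AC - B^2$.

```python
def classify(A, B, C):
    """Second-derivative test with A = f_xx, B = f_xy, C = f_yy at a critical point."""
    D = A * C - B * B
    if D > 0:
        # D > 0 forces A != 0, so the sign of A decides the type of extremum.
        return "local minimum" if A > 0 else "local maximum"
    if D < 0:
        return "saddle point"
    return "inconclusive"


def second_partials(x, y):
    """A, B, C for f(x, y) = y^3 - 2x^2 - 2y^2 + y."""
    return -4.0, 0.0, 6.0 * y - 4.0


for point in ((0.0, 1.0), (0.0, 1.0 / 3.0)):
    print(point, classify(*second_partials(*point)))
```

When $D = 0$ the test gives no information, which is why the function reports "inconclusive" rather than guessing.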
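Example: Discuss the critical points of the function $f(x, y) = 2x^4 + y^4 - 2x^2 - 2y^2$
in the open set $B := \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 < 1\}$.
\[ \frac{\partial f}{\partial x} = 8x^3 - 4x, \qquad \frac{\partial f}{\partial y} = 4y^3 - 4y. \]
The necessary condition is that both partial derivatives vanish. At a critical point,
\[ \frac{\partial f}{\partial x} = 0 \Rightarrow 8x^3 - 4x = 0 \Rightarrow x = 0,\ 1/\sqrt{2},\ -1/\sqrt{2}, \]
\[ \frac{\partial f}{\partial y} = 0 \Rightarrow 4y^3 - 4y = 0 \Rightarrow y = 0,\ -1,\ 1. \]
The candidate points are therefore
\[
\begin{array}{c|ccc}
y \backslash x & 0 & 1/\sqrt{2} & -1/\sqrt{2} \\ \hline
0 & (0, 0) & (1/\sqrt{2}, 0) & (-1/\sqrt{2}, 0) \\
-1 & (0, -1) & (1/\sqrt{2}, -1) & (-1/\sqrt{2}, -1) \\
1 & (0, 1) & (1/\sqrt{2}, 1) & (-1/\sqrt{2}, 1)
\end{array}
\]
Since we are in the open set $B := \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 < 1\}$,
the critical points that lie in the open set are $(0, 0)$, $(1/\sqrt{2}, 0)$, $(-1/\sqrt{2}, 0)$.
\[ A = \frac{\partial^2 f}{\partial x^2} = 24x^2 - 4, \qquad B = \frac{\partial^2 f}{\partial x\,\partial y} = 0, \qquad C = \frac{\partial^2 f}{\partial y^2} = 12y^2 - 4. \]
\[
\begin{array}{c|ccc}
 & (0, 0) & (1/\sqrt{2}, 0) & (-1/\sqrt{2}, 0) \\ \hline
A & -4 & 8 & 8 \\
B & 0 & 0 & 0 \\
C & -4 & -4 & -4 \\
D & 16 & -32 & -32
\end{array}
\]
Hence $(0, 0)$ is a local maximum ($A < 0$, $D > 0$), while $(1/\sqrt{2}, 0)$ and $(-1/\sqrt{2}, 0)$ are saddle points ($D < 0$).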
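Example 5.2.1. Find the extremum points of the function $f : \mathbb{R}^2 \to \mathbb{R}$ defined by
$f(x, y) = x^2 - xy + y^2 - \dfrac{x^3}{3}$ if $x$ and $y$ satisfy the equation
\[ 2y - x = 4, \]
that is, $g(x, y) = 0$ with $g(x, y) = 2y - x - 4$.
Solution:
We form a new function
\[ F(x, y, \lambda) = f(x, y) + \lambda g(x, y) = x^2 - xy + y^2 - \frac{x^3}{3} + \lambda(2y - x - 4), \]
where $g(x, y) = 2y - x - 4$. Then
\[ \frac{\partial F(x, y, \lambda)}{\partial x} = 2x - y - x^2 - \lambda, \qquad
\frac{\partial F(x, y, \lambda)}{\partial y} = -x + 2y + 2\lambda, \qquad
\frac{\partial F(x, y, \lambda)}{\partial \lambda} = 2y - x - 4. \]
Equate $\dfrac{\partial F}{\partial x}$, $\dfrac{\partial F}{\partial y}$ and $\dfrac{\partial F}{\partial \lambda}$ to zero and solve the resulting simultaneous equations
\[
\begin{aligned}
2x - y - x^2 - \lambda &= 0 & (1) \\
-x + 2y + 2\lambda &= 0 & (2) \\
2y - x - 4 &= 0 & (3)
\end{aligned}
\]
Subtracting (3) from (2) gives $4 + 2\lambda = 0 \Rightarrow \lambda = -2$. Substituting into (1) and keeping (3),
\[
\begin{aligned}
2x - y - x^2 + 2 &= 0 & (4) \times 2 \\
-x + 2y - 4 &= 0 & (5) \times 1
\end{aligned}
\]
that is,
\[ 4x - 2y - 2x^2 + 4 = 0, \qquad -x + 2y - 4 = 0. \]
Adding,
\[ 3x - 2x^2 = 0 \Rightarrow x = 0,\ 3/2. \]
From (5), $y = (x + 4)/2$, so the candidate points are $(0, 2)$ and $(3/2, 11/4)$.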
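The solutions of the Lagrange system in Example 5.2.1 can be verified by substitution. The sketch below evaluates the three partial derivatives of $F(x, y, \lambda) = x^2 - xy + y^2 - x^3/3 + \lambda(2y - x - 4)$ at the two candidates, where the $y$-values are recovered from the constraint $2y - x = 4$ and $\lambda = -2$; the function name `grad_F` is an illustrative choice.

```python
def grad_F(x, y, lam):
    """Partial derivatives of F = x^2 - x*y + y^2 - x^3/3 + lam*(2y - x - 4)."""
    return (2 * x - y - x * x - lam,   # dF/dx
            -x + 2 * y + 2 * lam,      # dF/dy
            2 * y - x - 4)             # dF/dlambda, i.e. the constraint


# Candidates (x, y, lambda): x = 0 or 3/2, y = (x + 4) / 2, lambda = -2.
candidates = [(0.0, 2.0, -2.0), (1.5, 2.75, -2.0)]

for c in candidates:
    print(c, grad_F(*c))
```

All three components vanish at both candidates, confirming that they satisfy the Lagrange equations; deciding which is a constrained maximum or minimum requires a further comparison of the values of $f$ along the constraint.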
This example, a typical Lagrange multiplier problem, can be described
generally as follows:
Obtain the extremum points of the function $f(x, y)$ subject to the constraint
condition $g(x, y) = 0$. This is known as an extremum problem with constraints.
Solution Procedures
Form an auxiliary function
\[ F(x, y, \lambda) = f(x, y) + \lambda g(x, y). \]
Differentiate partially w.r.t. $x$, $y$ and $\lambda$ in turn and equate the resulting expressions
to zero. Thus, we have
\[ \frac{\partial f(x, y)}{\partial x} + \lambda\frac{\partial g(x, y)}{\partial x} = 0, \qquad
\frac{\partial f(x, y)}{\partial y} + \lambda\frac{\partial g(x, y)}{\partial y} = 0, \qquad
g(x, y) = 0. \]
This system of equations (called the Lagrange equations) is then solved simultaneously.
The problem thus reduces to that of finding the critical points of $F(x, y, \lambda)$.
The general situation is given in the following theorem, which establishes the
validity of Lagrange’s method.
gi (x) = 0, 1 ≤ i ≤ m. (L)
\[ \left. \frac{\partial(g_1, \ldots, g_m)}{\partial(x_{i_1}, x_{i_2}, \ldots, x_{i_m})} \right|_{x_\circ} \qquad (1 \le i_1 \le i_2 \le \cdots \le i_m \le n) \]
Solution:
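Here, there are two constraints
\[ g_1(x, y, z) = x^2 + y^2 - 2 = 0, \]
\[ g_2(x, y, z) = x + z - 1 = 0. \]
We must solve the Lagrange equations together with $g_1(x, y, z) = 0$ and $g_2(x, y, z) = 0$, i.e.
\[ 1 + 2x\lambda_1 + \lambda_2 \cdot 1 = 0 \qquad (1) \]
\[ 1 + 2y\lambda_1 + \lambda_2 \cdot 0 = 0 \qquad (2) \]
\[ 1 + 0 \cdot \lambda_1 + \lambda_2 = 0 \qquad (3) \]
\[ x^2 + y^2 - 2 = 0, \]
\[ x + z - 1 = 0. \]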
These equations are then solvable for $x, y, z, \lambda_1$ and $\lambda_2$. From (3), $\lambda_2 = -1$, and then (1) and (2) become
\[ 2x\lambda_1 = 0, \]
\[ 2y\lambda_1 = -1. \]
Since the second implies $\lambda_1 \ne 0$, the first gives $x = 0$, and the constraints then give $y = \pm\sqrt{2}$ and $z = 1$. Hence, our points are
\[ (0, -\sqrt{2}, 1) \quad\text{and}\quad (0, +\sqrt{2}, 1). \]
Since $f(x, y, z) = x + y + z$, we see by inspection that $(0, \sqrt{2}, 1)$ is the maximum point, with $f = 1 + \sqrt{2}$, and $(0, -\sqrt{2}, 1)$ is the minimum point, with $f = 1 - \sqrt{2}$. □
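The two candidate points can be verified numerically. This is a small sketch, assuming the objective is f(x, y, z) = x + y + z and the five equations are as numbered in the solution; λ₁ is recovered from 2yλ₁ = −1, and the function names are illustrative.

```python
from math import sqrt, isclose

def residuals(x, y, z, lam1, lam2):
    """Left-hand sides of equations (1)-(3) and of the two constraints."""
    return (
        1 + 2 * x * lam1 + lam2,   # equation (1)
        1 + 2 * y * lam1,          # equation (2)
        1 + lam2,                  # equation (3)
        x**2 + y**2 - 2,           # g1 = 0
        x + z - 1,                 # g2 = 0
    )

def f(x, y, z):
    """Objective function x + y + z."""
    return x + y + z

solutions = []
for y in (sqrt(2), -sqrt(2)):
    lam1 = -1 / (2 * y)            # from 2*y*lam1 = -1
    solutions.append((0.0, y, 1.0, lam1, -1.0))

for x, y, z, lam1, lam2 in solutions:
    ok = all(isclose(r, 0.0, abs_tol=1e-12) for r in residuals(x, y, z, lam1, lam2))
    print((x, y, z), ok, f(x, y, z))
```

Comparing the two printed values of f reproduces the "by inspection" step: the larger value belongs to the point with y = +√2.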
Example 5.2.4. Find the largest volume a rectangular box can have, subject to the constraint that the surface area be fixed at 10 square meters.
Solution:
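Here $x, y, z$ are the lengths of the sides, and the volume is $f(x, y, z) = xyz$. The constraint is given by
\[ 2(xy + xz + yz) = 10, \]
i.e.,
\[ xy + xz + yz = 5. \]
Thus, the Lagrange equations are
\[ yz + \lambda(y + z) = 0, \]
\[ xz + \lambda(x + z) = 0, \]
\[ xy + \lambda(y + x) = 0, \]
\[ xy + xz + yz = 5. \]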
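A quick numerical check of this example is sketched below. It assumes the full Lagrange system is the symmetric one (the equations for the partial derivatives in x and y written by analogy with the one for z), tries the symmetric candidate x = y = z, and compares its volume with randomly sampled boxes that satisfy the constraint. The sampling routine and all names are illustrative only.

```python
from math import sqrt, isclose
import random

AREA_HALF = 5.0                  # the constraint xy + xz + yz = 5

s = sqrt(AREA_HALF / 3)          # side length of the candidate x = y = z
lam = -s / 2                     # from yz + lam*(y + z) = 0 with x = y = z = s
best_volume = s**3

def residuals(x, y, z, lam):
    """Left-hand sides of the three Lagrange equations and the constraint."""
    return (
        y * z + lam * (y + z),
        x * z + lam * (x + z),
        x * y + lam * (x + y),
        x * y + x * z + y * z - AREA_HALF,
    )

def random_feasible_box(rng):
    """Pick x, y with xy < 5, then solve the constraint for z > 0."""
    while True:
        x = rng.uniform(0.1, 5.0)
        y = rng.uniform(0.1, 5.0)
        if x * y < AREA_HALF:
            z = (AREA_HALF - x * y) / (x + y)
            return x, y, z

rng = random.Random(0)
boxes = (random_feasible_box(rng) for _ in range(10000))
sampled_max = max(x * y * z for x, y, z in boxes)
print(best_volume, sampled_max)
```

The candidate is the cube with side √(5/3), giving volume (5/3)^{3/2}; no sampled feasible box should exceed it, which is consistent with the arithmetic-geometric mean inequality applied to xy, xz and yz.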
Exercise 5.2.5. 1. Obtain the largest and least values of $2(x + y + z) - xyz$ on the closed ball
\[ \bar{B} = \{(x, y, z) : x^2 + y^2 + z^2 \le 9\}. \]
2. A rectangular box without a top is to have a volume of 18 m³. By using the method of Lagrange multipliers, determine the dimensions that will give it minimum surface area.
Chapter 6
Here, we shall consider two mapping theorems, namely the inverse function theorem and the implicit function theorem.
\[ (Df^{-1})(Df) = I_n, \]
\[ (Df)(Df^{-1}) = I_m, \]
where $I_j$ is the identity map on $\mathbb{R}^j$. Therefore, for any $x \in U$, the Jacobian matrix $Df(x)$ has both right and left inverses. So it must be a square matrix (i.e. $m = n$).
We conclude two things from this brief discussion:
* First, we need to concern ourselves only with the case of a differentiable function $f : U \to \mathbb{R}^m$.
** Secondly, if $f$ is to have a differentiable inverse, then it is necessary that the Jacobian matrix $Df(x)$ be invertible for each $x \in U$.
Proof: For the sake of clarity we will now break up the proof of the theorem
into a number of steps as follows:
Step 1: Simplification to a special case
We will prove the theorem below for the case when $Df(x_\circ)$ is the identity transformation. Here we show that this is indeed sufficient to prove the general case.
Let $\lambda = Df(x_\circ)$; then $\lambda^{-1}$ exists and, since $\lambda^{-1}$ is linear, the chain rule gives
\[ D(\lambda^{-1} \circ f)(x_\circ) = D(\lambda^{-1})(f(x_\circ)) \circ Df(x_\circ) = \lambda^{-1} \circ Df(x_\circ) = \text{identity transformation}. \]
Now if the theorem is true for $\lambda^{-1} \circ f$, then the theorem is also true for $f$. Indeed, if $g$ is an inverse for $\lambda^{-1} \circ f$, the inverse for $f$ will be $g \circ \lambda^{-1}$.
We can make one further simplifying assumption, namely, that $x_\circ = 0$ and $f(x_\circ) = 0$. To see this, let us suppose we have proven the theorem for the special case $x_\circ = 0$ and $f(x_\circ) = 0$. We now prove the general case from this. Let $h(x) = f(x + x_\circ) - f(x_\circ)$. Then $h(0) = 0$ and $Dh(0) = Df(x_\circ)$, so $Dh(0)$ is invertible. Then if $h$ has an inverse near $x = 0$, the required inverse for $f$ near $x_\circ$ is given by
\[ f^{-1}(y) = h^{-1}(y - f(x_\circ)) + x_\circ. \]
Step 1 demonstrates that it is sufficient to prove the theorem under the assumptions $x_\circ = 0$, $f(x_\circ) = 0$ and $Df(0)$ is the identity. This will be assumed in the remaining parts of the proof.
Step 2:
To get a local inverse, we would like to choose two neighborhoods of 0 such that, given any $y$ from the first neighborhood of 0, there is a unique $x$ from the second neighborhood such that $f(x) = y$. To do this, consider the function $g_y$ defined by $g_y(x) = y + x - f(x)$. If for some closed neighborhood of zero this is a contracting mapping, then it has a unique fixed point, say $x$, and so $x = y + x - f(x)$; that is, $x$ is the unique point belonging to the neighborhood such that $f(x) = y$.
Now construct this neighborhood: define $g(x) = x - f(x)$; then $Dg(0) = 0$. Since $f$ is of class $C^p$ with $p \ge 1$, so is $g$. This means in particular that $Dg$ is a continuous function, and so by continuity at 0 there exists an $r > 0$ such that $\|x\| < r$ implies $\|Dg_i(x)\| < 1/(2n)$ for each $i$, where $g = (g_1, g_2, \ldots, g_n)$. By the mean-value theorem, given $x \in D(0, r)$, there exist points $c_1, c_2, \ldots, c_n$ in $D(0, r)$ such that $g_i(x) = g_i(x) - g_i(0) = Dg_i(c_i)(x - 0) = Dg_i(c_i)(x)$. Therefore,
\[ \|g(x)\| \le \sum_{i=1}^{n} |g_i(x)| = \sum_{i=1}^{n} |Dg_i(c_i)(x)| \le \sum_{i=1}^{n} \|Dg_i(c_i)\|\,\|x\| \le \frac{\|x\|}{2} < \frac{r}{2}. \]
By continuity, $\|g(x)\| \le r/2$ on the closed ball $\bar{D}(0, r)$, so for $\|y\| \le r/2$ we get $\|g_y(x)\| \le \|y\| + \|g(x)\| \le r$; that is, $g_y$ maps $\bar{D}(0, r)$ into itself.
Let $x_1$ and $x_2$ be any two points in $\bar{D}(0, r)$. Then $\|g_y(x_1) - g_y(x_2)\| = \|g(x_1) - g(x_2)\|$ and, by the mean-value theorem as above, $\|g(x_1) - g(x_2)\| \le (1/2)\|x_1 - x_2\|$, and so $g_y$ is a contracting map (with constant $K = 1/2$). This implies that there is a unique fixed point $x \in \bar{D}(0, r)$ for $g_y$, and this implies $f(x) = y$. This means that $f$ has an inverse $f^{-1} : \bar{D}(0, r/2) \subset \mathbb{R}^n \to \bar{D}(0, r) \subset \mathbb{R}^n$.
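The fixed-point construction of Step 2 can be tried out numerically. The sketch below is only an illustration: the map f is a hypothetical example chosen so that f(0) = 0 and Df(0) is the identity, and the inverse is approximated by iterating g_y(x) = y + x − f(x) starting from x = 0.

```python
def f(x):
    """Hypothetical sample map with f(0) = 0 and Df(0) = identity."""
    x1, x2 = x
    return (x1 + x2**2, x2 + x1**2)

def g_y(x, y):
    """g_y(x) = y + x - f(x); its fixed points are the solutions of f(x) = y."""
    fx = f(x)
    return (y[0] + x[0] - fx[0], y[1] + x[1] - fx[1])

def local_inverse(y, iterations=60):
    """Approximate the local inverse of f at y by iterating g_y from x = 0."""
    x = (0.0, 0.0)
    for _ in range(iterations):
        x = g_y(x, y)
    return x

y = (0.05, -0.03)
x = local_inverse(y)
print(x, f(x))   # f(x) should reproduce y up to rounding
```

For this sample map, g(x) = x − f(x) has a derivative that is small near 0, so the iteration contracts quickly for small y; for y far from 0 the iteration need not converge, which matches the local nature of the theorem.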
Step 3: The inverse is continuous.
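Let $x_1, x_2 \in \bar{D}(0, r)$; then, recalling the definition of $g$ (so that $x = g(x) + f(x)$), we get
\[ \|x_1 - x_2\| \le \|g(x_1) - g(x_2)\| + \|f(x_1) - f(x_2)\| \le \tfrac{1}{2}\|x_1 - x_2\| + \|f(x_1) - f(x_2)\|, \]
and hence $\|x_1 - x_2\| \le 2\|f(x_1) - f(x_2)\|$. Therefore, if $y_1, y_2 \in \bar{D}(0, r/2)$, we get $\|f^{-1}(y_1) - f^{-1}(y_2)\| \le 2\|y_1 - y_2\|$, so $f^{-1}$ is continuous.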
Step 4: Inverse is differentiable.
For suitably small $r$, the inverse is differentiable on $\bar{D}(0, r/2)$. We were given that $Df(0)$ is invertible; this, together with the fact that $Df : A \subset \mathbb{R}^n \to \mathbb{R}^{n^2}$ is continuous, shows that for all $x$ in some neighborhood around 0, $[Df(x)]^{-1}$ exists. If this neighborhood does not contain $\bar{D}(0, r/2)$, $r$ is restricted further until this is the case. Hence we can assume $[Df(x)]^{-1}$ exists for all $x \in \bar{D}(0, r/2)$. Moreover, we can assume $\|[Df(x)]^{-1} y\| \le M\|y\|$ for all $x \in \bar{D}(0, r/2)$ and $y \in \mathbb{R}^n$ by the continuity of $[Df(x)]^{-1}$.
Now, for $y_1, y_2 \in \bar{D}(0, r/2)$, with $x_1 = f^{-1}(y_1)$ and $x_2 = f^{-1}(y_2)$, consider the quotient
\[ \frac{\|f^{-1}(y_1) - f^{-1}(y_2) - [Df(x_2)]^{-1}(y_1 - y_2)\|}{\|y_1 - y_2\|}. \]
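One standard way to estimate this quotient is sketched below. It is offered as a sketch under the bounds already stated, namely $\|[Df(x)]^{-1} y\| \le M\|y\|$ and the Step 3 inequality $\|x_1 - x_2\| \le 2\|y_1 - y_2\|$, and is not necessarily the exact argument intended in the notes.

```latex
\begin{align*}
\frac{\|f^{-1}(y_1) - f^{-1}(y_2) - [Df(x_2)]^{-1}(y_1 - y_2)\|}{\|y_1 - y_2\|}
  &= \frac{\|x_1 - x_2 - [Df(x_2)]^{-1}\bigl(f(x_1) - f(x_2)\bigr)\|}{\|y_1 - y_2\|} \\
  &= \frac{\bigl\|[Df(x_2)]^{-1}\bigl(Df(x_2)(x_1 - x_2) - (f(x_1) - f(x_2))\bigr)\bigr\|}{\|y_1 - y_2\|} \\
  &\le M \, \frac{\|f(x_1) - f(x_2) - Df(x_2)(x_1 - x_2)\|}{\|x_1 - x_2\|}
        \cdot \frac{\|x_1 - x_2\|}{\|y_1 - y_2\|} \\
  &\le 2M \, \frac{\|f(x_1) - f(x_2) - Df(x_2)(x_1 - x_2)\|}{\|x_1 - x_2\|}.
\end{align*}
```

As $y_1 \to y_2$, continuity of $f^{-1}$ gives $x_1 \to x_2$, and the last expression tends to 0 because $f$ is differentiable at $x_2$. Under these assumptions, $f^{-1}$ is differentiable at $y_2$ with derivative $[Df(x_2)]^{-1}$.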
Example 6.1.2. Let $f = (u, v)$, where
\[ u(x, y) = e^x \cos y, \]
\[ v(x, y) = e^x \sin y. \]
Solution:
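\[ \frac{\partial(u, v)}{\partial(x, y)} = \begin{vmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} \\[2mm] \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} \end{vmatrix} = \begin{vmatrix} e^x \cos y & -e^x \sin y \\ e^x \sin y & e^x \cos y \end{vmatrix} = e^{2x}[\cos^2 y + \sin^2 y] = e^{2x}, \]
\[ \left. \frac{\partial(u, v)}{\partial(x, y)} \right|_{(0, \pi/3)} = e^0 = 1 \ne 0. \]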
By the inverse mapping theorem, the function $f$ is locally invertible in a neighbourhood of $(0, \pi/3)$. To examine global invertibility, let $x, y \in \mathbb{R}$ be arbitrary. Since cosine and sine have period $2\pi$, $f(x, y + 2\pi) = f(x, y)$. In particular,
\[ f(0, \tfrac{\pi}{3}) = \bigl(e^0 \cos(\tfrac{\pi}{3}),\ e^0 \sin(\tfrac{\pi}{3})\bigr). \]
Now $(0, \tfrac{\pi}{3}) \ne (0, \tfrac{7\pi}{3})$, but $f(0, \tfrac{\pi}{3}) = f(0, \tfrac{7\pi}{3})$.
Therefore, $f$ is not one-to-one. Hence, $f$ is not globally invertible. Note that a neighbourhood of $(0, \tfrac{\pi}{3})$ on which $f$ is invertible must not include $(0, \tfrac{7\pi}{3})$.
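To find the local inverse, recall that $u^2 + v^2 = e^{2x}$, so that
\[ \log(u^2 + v^2) = 2x \]
\[ \therefore\; x = \frac{1}{2}\log(u^2 + v^2). \]
Also, to obtain $y$,
\[ \frac{v}{u} = \tan y \;\Rightarrow\; y = \tan^{-1}\!\left(\frac{v}{u}\right). \]
$\therefore$ if $g$ is a local inverse of $f$ in the neighbourhood of $(0, \tfrac{\pi}{3})$, then
\[ g(u, v) = \left(\frac{1}{2}\log(u^2 + v^2),\ \tan^{-1}\!\left(\frac{v}{u}\right)\right). \quad □ \]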
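The local inverse and the failure of global injectivity can both be checked numerically. The sketch below assumes f = (u, v) with u = eˣ cos y and v = eˣ sin y, and uses the principal branch of arctan, so g is only expected to invert f where u > 0 and y lies in (−π/2, π/2); the test point is an arbitrary choice near (0, π/3).

```python
from math import exp, cos, sin, log, atan, pi, isclose

def f(x, y):
    """The map f(x, y) = (e^x cos y, e^x sin y)."""
    return (exp(x) * cos(y), exp(x) * sin(y))

def g(u, v):
    """Candidate local inverse near (0, pi/3); requires u > 0."""
    return (0.5 * log(u**2 + v**2), atan(v / u))

# Round trip at a point close to (0, pi/3).
x0, y0 = 0.1, pi / 3 + 0.05
u, v = f(x0, y0)
x_back, y_back = g(u, v)
print((x_back, y_back))

# f takes the same value at (0, pi/3) and (0, 7*pi/3), so it is not one-to-one.
a = f(0.0, pi / 3)
b = f(0.0, 7 * pi / 3)
print(a, b, g(*b))
```

Applying g to f(0, 7π/3) returns a point near (0, π/3) rather than (0, 7π/3), which illustrates why the inverse is only local.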
\[ ax^2 + bx + cy = u \qquad \text{(i)} \]
\[ \alpha x^2 + \beta x + \gamma y = v \qquad \text{(ii)} \]
Solution:
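\[ \frac{\partial(u, v)}{\partial(x, y)} = \begin{vmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} \\[2mm] \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} \end{vmatrix} = \begin{vmatrix} 2ax + b & c \\ 2\alpha x + \beta & \gamma \end{vmatrix} = 2x(a\gamma - \alpha c) + b\gamma - \beta c, \]
which at $x = 0$ equals $b\gamma - \beta c \ne 0$.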
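This shows that the given system for $(u, v)$ in (i) and (ii) is locally invertible. Substituting $x = \dfrac{\gamma u - cv}{b\gamma - \beta c}$ into (i),
\[ ax^2 + bx + cy = u, \]
gives
\[ a\left(\frac{\gamma u - cv}{b\gamma - \beta c}\right)^2 + b\,\frac{\gamma u - cv}{b\gamma - \beta c} + cy = u, \]
so that
\[ y = \frac{1}{c}\left[u - a\left(\frac{\gamma u - cv}{b\gamma - \beta c}\right)^2 - b\,\frac{\gamma u - cv}{b\gamma - \beta c}\right]. \]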
6.2 Implicit function Theorem
Consider a function $F : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^m$. Let $x$ and $y$ be related by the equation $F(x, y) = 0$. We want to solve $F(x, y) = 0$ so that we obtain $y = f(x)$. Such a function $f$ is called an explicit function. We also want to compute $Df(x)$. In general, given $F(x, y) = 0$, one may not be able to solve for $y$ in terms of $x$. Therefore, it becomes important to verify whether such a relation exists or not. Consider the following example. Suppose $F : \mathbb{R}^2 \to \mathbb{R}$ is defined by $F(x, y) = x^2 + y^2 - 1$. Let us find $x$ and $y$ such that $F(x, y) = 0$. A function $f(x)$ is a solution $\iff F(x, f(x)) = 0$.
The solution of the equation is
\[ f(x) = \pm\sqrt{1 - x^2}, \]
i.e. $f(x) = +\sqrt{1 - x^2}$ and $f(x) = -\sqrt{1 - x^2}$. Thus, $y$ is defined for $|x| \le 1$. Therefore, $f$, if it exists, is not necessarily unique. Let $(x_\circ, y_\circ)$ be such that $F(x_\circ, y_\circ) = 0$. Can we find $f(x)$ such that $f$ is differentiable near $x_\circ$? In this example, $f$ is not differentiable at $x_\circ = \pm 1$.
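For the circle example, the derivative of the explicit function can be obtained without solving for y, by implicit differentiation of x² + y² − 1 = 0. The sketch below compares that slope with a finite-difference slope of the upper branch √(1 − x²); the sample point x = 0.6 and the step size are arbitrary choices, and the function names are illustrative.

```python
from math import sqrt, isclose

def F(x, y):
    """The relation x^2 + y^2 - 1 whose zero set is the unit circle."""
    return x**2 + y**2 - 1

def upper_branch(x):
    """Explicit solution y = +sqrt(1 - x^2), defined for |x| <= 1."""
    return sqrt(1 - x**2)

def implicit_slope(x, y):
    """dy/dx = -F_x / F_y = -(2x) / (2y), valid only where y != 0."""
    return -(2 * x) / (2 * y)

def numeric_slope(x, h=1e-6):
    """Central finite-difference slope of the upper branch."""
    return (upper_branch(x + h) - upper_branch(x - h)) / (2 * h)

x0 = 0.6
y0 = upper_branch(x0)
print(F(x0, y0), implicit_slope(x0, y0), numeric_slope(x0))
```

The formula breaks down exactly where y = 0, that is at x = ±1, which is where the explicit solution fails to be differentiable.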
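Now consider $F : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^m$ and consider the system of equations
\[ F_j(x_1, \ldots, x_n, y_1, \ldots, y_m) = 0, \qquad 1 \le j \le m. \]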
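Theorem 6.2.1. (Implicit function theorem): Let $\Omega \subset \mathbb{R}^n \times \mathbb{R}^m$ be an open set and let $F : \Omega \to \mathbb{R}^m$ be a function of class $C^p$, $p \in \mathbb{Z}$, $p \ge 1$. Suppose $(x_\circ, y_\circ) \in \Omega$ and $F(x_\circ, y_\circ) = 0$. Form the determinant of the matrix $(\partial F_j / \partial y_i)$, $i, j = 1, 2, \ldots, m$, i.e.
\[ \triangle = \begin{vmatrix} \dfrac{\partial F_1}{\partial y_1} & \cdots & \dfrac{\partial F_1}{\partial y_m} \\ \vdots & & \vdots \\ \dfrac{\partial F_m}{\partial y_1} & \cdots & \dfrac{\partial F_m}{\partial y_m} \end{vmatrix} \]
evaluated at $(x_\circ, y_\circ)$, where $F = (F_1, \ldots, F_m)$. Suppose $\triangle \ne 0$. Then there exist an open neighbourhood $U \subset \mathbb{R}^n$ of $x_\circ$, a neighbourhood $V \subset \mathbb{R}^m$ of $y_\circ$, and a unique function $f : U \to V$ such that $F(x, f(x)) = 0$ for all $x \in U$. Moreover, $f$ is of class $C^p$.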
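The theorem also answers the earlier question of how to compute $Df(x)$. The following is a sketch of the standard computation, assuming $f$ is the function given by the theorem; here $D_x F$ and $D_y F$ are shorthand (not notation from the notes) for the blocks of partial derivatives of $F$ with respect to the $x$-variables and the $y$-variables.

```latex
\begin{align*}
F(x, f(x)) = 0 \ \text{for all } x \in U
  &\;\Longrightarrow\; D_x F(x, f(x)) + D_y F(x, f(x))\, Df(x) = 0
     \quad\text{(chain rule)} \\
  &\;\Longrightarrow\; Df(x) = -\bigl[D_y F(x, f(x))\bigr]^{-1} D_x F(x, f(x)).
\end{align*}
```

The inverse exists for $x$ near $x_\circ$ because $\det D_y F = \triangle \ne 0$ at $(x_\circ, y_\circ)$ and the determinant is continuous. For $F(x, y) = x^2 + y^2 - 1$ this reduces to $f'(x) = -x/y$ wherever $y \ne 0$.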
$(x_\circ, y_\circ)$ such that $G(S) = W$, and this $G$ has an inverse $G^{-1} : W \to S$ of class $C^p$. Since $S$ is open, there exist open sets $U$ and $V$ such that $x_\circ \in U \subset \mathbb{R}^n$, $y_\circ \in V \subset \mathbb{R}^m$ and $U \times V \subset S$. Let $G(U \times V) = Y \subset W$. Then $G : U \times V \to Y$ is of class $C^p$ and has inverse $G^{-1} : Y \to U \times V$ of class $C^p$. Hence, $G^{-1}(x, w) = (x, H(x, w))$, where $H : Y \to V$ is a $C^p$-map.
Define a map $\pi : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^m$ by $\pi(x, y) = y$, so that $\pi \circ G = F$ and
\[ F(x, H(x, w)) = \pi \circ G(x, H(x, w)) = \pi \circ G \circ G^{-1}(x, w) = w. \]
Since $G^{-1}(x, w) = (x, H(x, w))$, it follows that if $(x, w) \in Y$, then $x \in U$. Now define $f : U \to V$ by $f(x) = H(x, 0)$. Since $F(x, H(x, w)) = w$, we have $F(x, f(x)) = 0$.
Since $H$ is of class $C^p$, $f$ must be of class $C^p$. By the inverse mapping theorem, $H(x, w)$ is uniquely determined, and hence $f(x) = H(x, 0)$ is also uniquely determined, and the proof is complete. □