
Analysis in Many Variables II Notes

Lecturer: Dr Ian Vernon


[email protected]
Errors or corrections to [email protected]

October 22, 2018


Review of Partial Differentiation

If f(x) is a real-valued function of a single real variable x, i.e. f : R → R, then its derivative
with respect to (wrt) x is defined as

df/dx = lim_{h→0} [f(x + h) − f(x)]/h

if it exists.

If f(x, y) is a real-valued function of two real variables x and y, f : R² → R, then we define the
partial derivative as

∂f/∂x = lim_{h→0} [f(x + h, y) − f(x, y)]/h

i.e. ∂f/∂x is defined by differentiating wrt x whilst treating y as a constant.

This can sometimes be written as (∂f/∂x)_y to make it explicit that y is being fixed.

Eg.
f(x, y) = cx^y   then   ∂f/∂x = cyx^(y−1)

Similarly, ∂f/∂y is defined as

∂f/∂y = lim_{h→0} [f(x, y + h) − f(x, y)]/h = (∂f/∂y)_x

Eg.
f(x, y) = cx^y = ce^(log x^y) = ce^(y log x)

∂f/∂y = ce^(y log x) · log x = cx^y · log x
If we had a function of three real variables f(x, y, z) we could have (∂f/∂z)_{x,y}, etc., and
similarly for 4, 5, ..., n variables.

Eg.

If (r, θ) are polar co-ordinates and (x, y) are Cartesian co-ordinates then we can view x and y
as functions of two variables

x = x(r, θ)
y = y(r, θ)

which produces

x(r, θ) = r cos θ
y(r, θ) = r sin θ

So partial differentiation gives


∂x/∂r = cos θ        ∂y/∂r = sin θ
∂x/∂θ = −r sin θ     ∂y/∂θ = r cos θ
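These four partial derivatives are easy to confirm with a computer algebra system. Below is a minimal sketch using the sympy library (my choice of tool, not part of the notes); sp.diff differentiates with respect to one symbol while treating the others as constants:

    import sympy as sp

    r, theta = sp.symbols('r theta', positive=True)
    x = r * sp.cos(theta)   # x(r, theta)
    y = r * sp.sin(theta)   # y(r, theta)

    # Each partial derivative holds the other variable fixed
    print(sp.diff(x, r))       # cos(theta)
    print(sp.diff(y, r))       # sin(theta)
    print(sp.diff(x, theta))   # -r*sin(theta)
    print(sp.diff(y, theta))   # r*cos(theta)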

End of Lecture 1

Continuing from where Lecture 1 finished off, we now seek the partial derivatives of r and θ with
respect to x and y, where x = r cos θ and y = r sin θ.
By algebraic manipulation we can easily see that r = √(x² + y²) and tan θ = y/x, which we
will call (∗) to make the algebra less messy.
∂r/∂x = x/√(x² + y²)        ∂r/∂y = y/√(x² + y²)

Implicit differentiation of (∗) wrt y gives

sec²θ · ∂θ/∂y = 1/x
⟹ ∂θ/∂y = cos²θ/x = (x/r)²/x = x/r² = x/(x² + y²)

and likewise by a similar argument we can deduce

∂θ/∂x = −y/(x² + y²)

Note:
Unlike with ordinary derivatives, where dx/dy = 1/(dy/dx), the corresponding relationship does not
hold for partial derivatives:

∂x/∂y ≠ 1/(∂y/∂x)
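As a concrete illustration (my own example using the polar-coordinate results above, not a calculation from the notes): ∂x/∂r = cos θ while 1/(∂r/∂x) = 1/cos θ, because a different variable is held fixed in each derivative. A short sympy sketch:

    import sympy as sp

    r, theta, x, y = sp.symbols('r theta x y', positive=True)

    dx_dr = sp.diff(r * sp.cos(theta), r)        # ∂x/∂r with θ fixed: cos(theta)
    dr_dx = sp.diff(sp.sqrt(x**2 + y**2), x)     # ∂r/∂x with y fixed: x/sqrt(x^2 + y^2)

    # Rewrite ∂r/∂x in polar form (x = r cos θ, y = r sin θ): it is also cos(theta)
    dr_dx_polar = sp.simplify(dr_dx.subs({x: r * sp.cos(theta), y: r * sp.sin(theta)}))

    print(dx_dr)                          # cos(theta)
    print(sp.simplify(1 / dr_dx_polar))   # 1/cos(theta) -- not equal to ∂x/∂r in general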

Second Partial Derivatives

If f (u, v) is a function of u, v we can calculate

∂/∂u (∂f/∂v) = ∂²f/∂u∂v,        ∂/∂v (∂f/∂u) = ∂²f/∂v∂u

If these two results are continuous functions of u and v then they are always equal.

Example:

With x = r cos θ then


∂x/∂r = cos θ,    ∂x/∂θ = −r sin θ

So

∂/∂θ (∂x/∂r) = ∂/∂θ (cos θ) = −sin θ,    ∂/∂r (∂x/∂θ) = ∂/∂r (−r sin θ) = −sin θ

and it can be easily seen that

∂²x/∂θ∂r = ∂²x/∂r∂θ
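A quick symbolic check of this symmetry of the mixed second partial derivatives for x = r cos θ (a minimal sketch assuming sympy is available, not part of the notes):

    import sympy as sp

    r, theta = sp.symbols('r theta')
    x = r * sp.cos(theta)

    # Mixed second partial derivatives taken in both orders
    d2_theta_r = sp.diff(sp.diff(x, r), theta)   # ∂/∂θ (∂x/∂r)
    d2_r_theta = sp.diff(sp.diff(x, theta), r)   # ∂/∂r (∂x/∂θ)

    print(d2_theta_r, d2_r_theta)                # both -sin(theta)
    assert sp.simplify(d2_theta_r - d2_r_theta) == 0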

Note

This doesn’t work if you mix up the sets of variables, i.e.


   
∂/∂x (∂x/∂r) ≠ ∂/∂r (∂x/∂x)

Chain Rules
Example:

For normal differentiation the chain rule (to differentiate a function of a function) looks like the
following:

d/dt (sin(e^t)) = e^t cos(e^t)

So here we have set

F(t) = sin(e^t) = f(x(t)),   with f(x) = sin x, x(t) = e^t

so that dF/dt = f′(x(t)) · dx/dt = cos(e^t) · e^t.

For a function f (x, y) of two functions x(t), y(t) giving F (t) = f (x(t), y(t)) the chain rule looks
slightly different.

Example:

F(t) = cos t / sin t = f(x(t), y(t))   where x(t) = cos t, y(t) = sin t, f(x, y) = x/y

The chain rule here is that


 
dF/dt = (∂f/∂x)(dx/dt) + (∂f/∂y)(dy/dt)
      = (dx/dt · ∂/∂x + dy/dt · ∂/∂y) f

So to check
∂f/∂x = 1/y,    ∂f/∂y = −x/y²
dx/dt = −sin t,    dy/dt = cos t

So
 
(∂f/∂x)(dx/dt) + (∂f/∂y)(dy/dt) = −sin t/sin t − cos t · cos t/sin²t
                                = −1 − cos²t/sin²t
                                = −1/sin²t = −csc²t
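The same answer comes from differentiating F(t) = cot t directly, which is a useful sanity check. A minimal sympy sketch (my code, not from the notes):

    import sympy as sp

    t, x, y = sp.symbols('t x y')

    # Direct differentiation of F(t) = cos(t)/sin(t)
    direct = sp.diff(sp.cos(t) / sp.sin(t), t)

    # Chain rule: (df/dx)(dx/dt) + (df/dy)(dy/dt) with f = x/y, x = cos t, y = sin t
    f = x / y
    chain = (sp.diff(f, x) * sp.diff(sp.cos(t), t)
             + sp.diff(f, y) * sp.diff(sp.sin(t), t)).subs({x: sp.cos(t), y: sp.sin(t)})

    print(sp.simplify(direct))            # -1/sin(t)**2, i.e. -csc^2(t)
    assert sp.simplify(direct - chain) == 0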

If F depends on two variables u and v in functions x and y, i.e. F (u, v) = f (x(u, v), y(u, v))
then
∂F/∂u = (∂x/∂u)(∂f/∂x) + (∂y/∂u)(∂f/∂y)

and similarly

∂F/∂v = (∂x/∂v)(∂f/∂x) + (∂y/∂v)(∂f/∂y)
Also, if u and v can be written as functions of x and y then f (x, y) = F (u(x, y), v(x, y)) so

∂f/∂x = (∂u/∂x)(∂F/∂u) + (∂v/∂x)(∂F/∂v)

and similarly

∂f/∂y = (∂u/∂y)(∂F/∂u) + (∂v/∂y)(∂F/∂v)

Example:

Suppose (r, θ) are polar co-ordinates and (x, y) are Cartesian co-ordinates.
Then set F (r, θ) = r cos θ = x = f (x, y).
Then
∂f/∂x = ∂(x)/∂x = 1 (a),   and   ∂f/∂y = 0 (b)   (the easier route)

Or (via the chain rule)

∂f/∂x = (∂r/∂x)(∂F/∂r) + (∂θ/∂x)(∂F/∂θ)

End Lecture 2

(Picking up where Lecture 2 left off)

We could use f (x, y) = F (r(x, y), θ(x, y)).


Then

∂f/∂x = (∂r/∂x)(∂F/∂r) + (∂θ/∂x)(∂F/∂θ)
      = x/√(x² + y²) · ∂F/∂r − y/(x² + y²) · ∂F/∂θ        (∗)
      = x/√(x² + y²) · cos θ − y/(x² + y²) · (−r sin θ)
      = cos²θ + sin²θ = 1,   which agrees with (a)
[solving for (b) is left as an exercise]

(∗) is often written as


∂/∂x = x/√(x² + y²) · ∂/∂r − y/(x² + y²) · ∂/∂θ
showing how partial derivatives of Cartesian and polar co-ordinates are related.

Chapter 1

Notation

• R - the set of real numbers.


• Rn can be thought of as the set of ordered n-tuples, (x1 , x2 , ..., xn ), where each xi ∈ R, i.e. the
Cartesian co-ordinates in n-dimensional space.

The position vector of a point in Rn can be written in terms of the standard basis {e1 , e2 , ... , en }
so

x = x1 e1 + x2 e2 + ... + xn en = Σ_{i=1}^{n} xi ei = xi ei

(the second form of the sum is an example of the Einstein Summation Convention: when an index
appears twice in a product, it is implied that the product is summed over that index from i = 1 to
i = n)

The ei vectors are all orthonormal wrt the scalar (dot) product:

ei · ej = 1 if i = j,   0 if i ≠ j

This is often denoted as ei · ej = δij where δij is called the Kronecker delta, defined as

δij := 1 if i = j,   0 if i ≠ j

For low values of n (i.e. R2 or R3 ) instead of (x1 , x2 ) or (x1 , x2 , x3 ) it is convention to use (x, y)
or (x, y, z)

For two vectors u, v ∈ Rn given by

u = u1 e1 + u2 e2 + ... + un en = ui ei
v = v1 e1 + v2 e2 + ... + vn en = vi ei

their scalar (dot) product is given by

u · v = u1 v1 + u2 v2 + ... + un vn = ui vi

The length of u is given by |u| = √(u · u).
If θ is the angle between u and v then u · v = |u||v| cos θ.
If x is the position vector of a point then |x| is sometimes denoted by r.
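The index form u · v = ui vi can be checked numerically. The sketch below (my own example, using numpy, not part of the notes) computes the same dot product three ways, including np.einsum, which implements the Einstein summation convention directly:

    import numpy as np

    u = np.array([1.0, 2.0, 3.0])
    v = np.array([4.0, -1.0, 2.0])

    # u . v = u_i v_i, summed over the repeated index i
    dot_explicit = sum(u[i] * v[i] for i in range(len(u)))
    dot_einsum = np.einsum('i,i->', u, v)
    dot_builtin = np.dot(u, v)
    print(dot_explicit, dot_einsum, dot_builtin)      # all 8.0

    # |u| = sqrt(u . u)
    print(np.sqrt(np.dot(u, u)), np.linalg.norm(u))   # equal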

Scalar fields are real valued functions on Rⁿ, i.e. they map Rⁿ → R, x ↦ f(x)

Example:
(n = 3)   f(x) = x1 x2 / tan x3 = xy/tan z = f(x, y, z)

Vector fields are vector valued functions on Rⁿ, i.e. Rⁿ → Rⁿ, x ↦ f(x)

Example:

f (x) = x(a · x), for some fixed a ∈ Rn


or f (x) = a

Vector valued functions can be written component wise:

f (x) = x(a · x)

could be written as
fi = xi (aj xj )

End of Lecture 3

(H/W set: problem sheet 1 Q1,6 due Fri 19th 12noon to lockers in CM117)

Curves in Rn are given parametrically by specifying x as a function x(t) of some parameter t,


i.e. a map R → Rⁿ, t ↦ x(t)

Example:

x(t) = a + tb ⇐⇒ xi (t) = ai + tbi , i = 1, 2, ..., n


This is a straight line parallel to b through a

If x(t) is differentiable then dx/dt is tangent to the curve (if non-zero).
”Proof”

[Diagram: the points x(t) and x(t + h) on the curve, the chord x(t + h) − x(t) joining them, and
the arc length δs between them]

dx(t)/dt = lim_{h→0} [x(t + h) − x(t)]/h

Example:

x(t) = cos(t)e1 + sin(t)e2 + te3


dx(t)/dt = −sin(t) e1 + cos(t) e2 + e3

Note
If t is taken to be the arc length s along the curve from a fixed point on it (so t = s and h = δs),
then |dx/ds| = 1,

as |x(s + δs) − x(s)| ≈ δs

⟹ lim_{δs→0} |x(s + δs) − x(s)|/δs = 1

⟹ |dx/ds| = 1
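A numerical illustration (my own example, not from the notes): for the helix x(t) = cos(t)e1 + sin(t)e2 + te3 the speed is |dx/dt| = √2, so the arc length is s = √2 t; after reparametrising by s the tangent has length 1, as claimed.

    import numpy as np

    # Helix x(t) = (cos t, sin t, t); |dx/dt| = sqrt(2), so s = sqrt(2) * t
    def x_of_t(t):
        return np.array([np.cos(t), np.sin(t), t])

    def x_of_s(s):
        return x_of_t(s / np.sqrt(2.0))   # reparametrised by arc length

    h, s0 = 1e-6, 1.3                     # step size and an arbitrary sample point

    # Finite-difference tangent wrt arc length: its length should be ~1
    tangent = (x_of_s(s0 + h) - x_of_s(s0)) / h
    print(np.linalg.norm(tangent))        # ~1.0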

Chapter 2

Differential Operators and ∇

2.1 Vector notation for partial differentiation


For f(x, y) : R² → R we had

∂f/∂x = lim_{h→0} [f(x + h, y) − f(x, y)]/h

and

∂f/∂y = lim_{h→0} [f(x, y + h) − f(x, y)]/h

which works for f : R² → R and, by using z, this notation can be extended to f : R³ → R; but for
larger values of n, for f : Rⁿ → R this notation becomes impractical as we run out of letters.
However, if we change the notation to use x1, x2, x3 instead of x, y, z, so that x = x1 e1 + x2 e2 + x3 e3
and we write f(x) instead of f(x, y, z), then we can write

∂f/∂x = ∂f/∂x1 := lim_{h→0} [f(x + h e1) − f(x)]/h

etc., which can be generalised to n dimensions with f(x) : Rⁿ → R as

∂f(x)/∂xa := lim_{h→0} [f(x + h ea) − f(x)]/h

Sometimes as a form of shorthand we may write ∂f/∂xa as ∂a f.
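The defining limit translates directly into a finite-difference approximation. Below is a minimal sketch (my code; the function name and test field are illustrative only), where a is a 0-based component index:

    import numpy as np

    def partial(f, x, a, h=1e-6):
        """Finite-difference approximation of df/dx_a at the point x,
        using the definition with the standard basis vector e_a."""
        e_a = np.zeros_like(x, dtype=float)
        e_a[a] = 1.0
        return (f(x + h * e_a) - f(x)) / h

    # Example scalar field on R^3: f(x) = x1*x2 + x3^2
    f = lambda x: x[0] * x[1] + x[2] ** 2

    x0 = np.array([1.0, 2.0, 3.0])
    print(partial(f, x0, 0))   # ~2.0 (= x2)
    print(partial(f, x0, 2))   # ~6.0 (= 2*x3)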

End Lecture 4

(Lecture began with problem class material, hence less content covered)

2.2 The chain rule in vector notation

The restriction of f(x, y) : R² → R to the parametric curve C : (x(t), y(t)) gives a function of
t of the form F(t) = f(x(t), y(t)), and

dF/dt = (dx/dt)(∂f/∂x) + (dy/dt)(∂f/∂y)

We want to put this in a more ’vector’ notation:

C : x(t) = x1(t) e1 + x2(t) e2

where x1(t) = x(t), x2(t) = y(t)

giving f(x, y) = f(x)
So

d f(x(t))/dt = (dx1/dt)(∂f/∂x1) + (dx2/dt)(∂f/∂x2)

or as an operator

d/dt = (dx1/dt) ∂/∂x1 + (dx2/dt) ∂/∂x2

End Lecture 5

Alternatively, we could write all of this as a scalar product:
   
dF/dt = (dx1/dt e1 + dx2/dt e2) · (∂f/∂x1 e1 + ∂f/∂x2 e2)

The first bracket is dx/dt, the tangent to the curve C.
The second bracket is called ∇f, "grad f" or the gradient of f.

dF/dt = (dx/dt) · ∇f

Or as an operator:

d/dt = (dx/dt) · ∇

∇ is called ’del’ or ’nabla’.

In n dimensions it’s the same story

Example:

C : x(t) = x1(t) e1 + ... + xn(t) en a parametric curve, f : Rⁿ → R a scalar field, F(t) = f(x(t))

Then

dF/dt = d f(x(t))/dt = (dx1/dt)(∂f/∂x1) + ... + (dxn/dt)(∂f/∂xn)

dF/dt = (dx/dt) · ∇f

Chapter 3

The gradient of a scalar field

In n dimensions, we define the vector operator ’nabla’ by


∇ := e1 ∂/∂x1 + ... + en ∂/∂xn = ei ∂/∂xi

If f(x) : Rⁿ → R is a scalar field then we define the gradient ("grad f") to be given by the action
of ∇ on f:

∇f = grad f := e1 ∂f/∂x1 + ... + en ∂f/∂xn = ei ∂f/∂xi = ei ∂i f

This is a vector field with components ∂f/∂xi.

3.1 Examples
Example 1
(In R²)   x = e1 x + e2 y,   f(x) = (x² + y²)/4

⟹ ∂f/∂x = x/2,   ∂f/∂y = y/2

⟹ ∇f = (x/2) e1 + (y/2) e2 = (1/2) x

The vector field ∇f can be drawn as arrows of length ||∇f|| and direction parallel to ∇f, starting
at a variety of sample points.
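One way to produce such a picture is with a quiver plot; the sketch below (my choice of numpy and matplotlib, not part of the notes) draws ∇f = x/2 at a grid of sample points:

    import numpy as np
    import matplotlib.pyplot as plt

    # Grid of sample points in R^2
    xs, ys = np.meshgrid(np.linspace(-2, 2, 9), np.linspace(-2, 2, 9))

    # f(x, y) = (x^2 + y^2)/4, so grad f = (x/2, y/2) = x/2
    grad_x, grad_y = xs / 2.0, ys / 2.0

    plt.quiver(xs, ys, grad_x, grad_y)      # arrows of length |grad f|, parallel to grad f
    plt.gca().set_aspect('equal')
    plt.title('Gradient field of f = (x^2 + y^2)/4')
    plt.show()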

End Lecture 6

Reminder: u · v = ||u|| ||v|| cos θ

Note
∇f is always at right angles (i.e. normal) to the curves of constant f; for the example above these are the circles ||x|| = a.

Example
(in R2 )
f (x) = x · e1 , x = x1 e1 + x2 e2
so f (x) = x1
then ∇f = e1

3.2 Properties of the gradient

If f, g : Rn → R (they are scalar fields) and a, b ∈ R are constants then;


i) ∇(af + bg) = a∇f + b∇g
ii) ∇(f g) = (∇f )g + f (∇g )

Check for n = 2

i)

∇(af + bg) = e1 ∂/∂x1 (af + bg) + e2 ∂/∂x2 (af + bg)
           = e1 (a ∂f/∂x1 + b ∂g/∂x1) + e2 (a ∂f/∂x2 + b ∂g/∂x2)
           = a (e1 ∂f/∂x1 + e2 ∂f/∂x2) + b (e1 ∂g/∂x1 + e2 ∂g/∂x2)
           = a∇f + b∇g

as required

ii)
∇(fg) = e1 ∂/∂x1 (fg) + e2 ∂/∂x2 (fg)
      = e1 (g ∂f/∂x1 + f ∂g/∂x1) + e2 (g ∂f/∂x2 + f ∂g/∂x2)
      = (e1 ∂f/∂x1 + e2 ∂f/∂x2) g + f (e1 ∂g/∂x1 + e2 ∂g/∂x2)
      = (∇f)g + f(∇g)

as required
While these arguments are only for functions on R², because they work component-wise it is
straightforward to see how they generalise to Rⁿ, and so we can conclude that i) and ii) hold for all
f, g : Rⁿ → R.

iii) If Φ is a function R → R and f : Rⁿ → R is a scalar field then Φ(f) is another scalar field

and ∇Φ(f) = (dΦ/df) ∇f
Example
If Φ(f) = f² and f(x, y) = x sin y then Φ(f) = x² sin²y and by direct calculation

∇Φ(f) = ∇(x² sin²y) = e1 2x sin²y + e2 2x² sin y cos y

or, using the formula,

∇f = ∇(x sin y) = e1 sin y + e2 x cos y
dΦ/df = 2f = 2x sin y

∇Φ(f) = (e1 sin y + e2 x cos y)(2x sin y)
      = e1 2x sin²y + e2 2x² sin y cos y
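Both routes can be compared symbolically. A small sympy sketch (my code, not from the notes) checks that the direct gradient of Φ(f) matches (dΦ/df)∇f = 2f ∇f for this example:

    import sympy as sp

    x, y = sp.symbols('x y')
    f = x * sp.sin(y)
    Phi = f ** 2                  # Phi(f) = f^2

    # Direct gradient of Phi(f)
    direct = sp.Matrix([sp.diff(Phi, x), sp.diff(Phi, y)])

    # Formula: grad Phi(f) = (dPhi/df) grad f = 2f grad f
    formula = 2 * f * sp.Matrix([sp.diff(f, x), sp.diff(f, y)])

    print(sp.simplify(direct.T))                            # [2x sin^2 y, 2x^2 sin y cos y]
    assert sp.simplify(direct - formula) == sp.zeros(2, 1)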

In general
∇Φ = e1 ∂Φ/∂x1 + ... + en ∂Φ/∂xn
   = e1 (∂f/∂x1)(dΦ/df) + ... + en (∂f/∂xn)(dΦ/df)
   = (∇f) dΦ/df

Example
(in R3 )

f (x) = a · x − x · x, where a ∈ R3 constant, (a = a1 e1 + a2 e2 + a3 e3 )

so

f(x) = f(x1, x2, x3) = a1 x1 + a2 x2 + a3 x3 − (x1² + x2² + x3²)

⟹ ∂f/∂x1 = a1 − 2x1,   ∂f/∂x2 = a2 − 2x2,   ∂f/∂x3 = a3 − 2x3

So

∇f = e1(a1 − 2x1) + e2(a2 − 2x2) + e3(a3 − 2x3)
   = a − 2x
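The result ∇f = a − 2x is easy to confirm component-wise; a minimal sympy sketch (my code, not part of the notes):

    import sympy as sp

    a = sp.Matrix(sp.symbols('a1 a2 a3'))
    xv = sp.Matrix(sp.symbols('x1 x2 x3'))

    f = a.dot(xv) - xv.dot(xv)                    # f(x) = a.x - x.x

    grad = sp.Matrix([sp.diff(f, xi) for xi in xv])
    print(grad.T)                                 # [a1 - 2*x1, a2 - 2*x2, a3 - 2*x3]
    assert sp.simplify(grad - (a - 2 * xv)) == sp.zeros(3, 1)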

3.3 Directional Derivatives


Let C : x = x(t) be a curve in Rⁿ (i.e. a function R → Rⁿ) and f : Rⁿ → R be a scalar field.
Then f(x(t)) : R → R gives the value of f restricted to C, and d f(x(t))/dt = (dx/dt) · ∇f (by the chain
rule)

d f(x(t))/dt = (∂f/∂x1)(dx1/dt) + ... + (∂f/∂xn)(dxn/dt)
             = (∂f/∂x1 e1 + ... + ∂f/∂xn en) · (dx1/dt e1 + ... + dxn/dt en)
             = (∇f) · dx/dt

If we use arc length s to parametrise C, then dx/ds = n̂ is a unit vector tangent to C, and d f(x(s))/ds = n̂ · ∇f
is the rate of change of f wrt distance (arc length) in the direction n̂.
This is called the directional derivative of f in direction n̂, sometimes written df/dn̂.
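A numerical illustration (my own example field and direction, not from the notes): the directional derivative n̂ · ∇f agrees with the finite-difference rate of change of f per unit distance along n̂.

    import numpy as np

    # Sample scalar field on R^3 and its gradient, computed by hand
    f = lambda x: x[0] ** 2 + x[1] * x[2]
    grad_f = lambda x: np.array([2 * x[0], x[2], x[1]])

    x0 = np.array([1.0, 2.0, 3.0])
    n_hat = np.array([1.0, 1.0, 0.0]) / np.sqrt(2.0)   # unit direction

    # Directional derivative df/dn = n_hat . grad f
    exact = n_hat @ grad_f(x0)

    # Rate of change of f with distance along n_hat (finite difference)
    h = 1e-6
    approx = (f(x0 + h * n_hat) - f(x0)) / h

    print(exact, approx)   # both ~ (2 + 3)/sqrt(2) ~ 3.536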

End Lecture 7

