M832 Approximation Theory Course Notes
Mathematics and Computing: Taught MSc M832 CN1
M832
APPROXIMATION THEORY AND METHODS
(PART 1)
Course Notes
(Chapters 1–8 and 10)
Prepared by
P. J. Rippon
Second edition
Copyright © 2007 The Open University SUP 01168 7
Contents
Introduction
Reading List
Chapter 1  The approximation problem and existence of best approximations
  Study Session 1: Approximation in a metric space
  Study Session 2: Approximation in a normed linear space
Chapter 2  The uniqueness of best approximation
  Study Session 1: Convexity
  Study Session 2: Best approximation operators
Chapter 3  Approximation operators and some approximating functions
  Study Session 1: 'Good' versus 'best' approximation
  Study Session 2: Types of approximating functions
Chapter 4  Polynomial interpolation
  Study Session 1: Polynomial interpolation
  Study Session 2: Chebyshev interpolation
Chapter 5  Divided differences
  Study Session 1: Basic properties of divided differences
  Study Session 2: Numerical considerations and Hermite interpolation
Chapter 6  The uniform convergence of polynomial approximations
  Study Session 1: Monotone operators
  Study Session 2: The Bernstein operator
Chapter 7  The theory of minimax approximation
  Study Session 1: The extreme values of the error function
  Study Session 2: Characterising best minimax approximations
Chapter 8  The exchange algorithm
  Study Session 1: Using the exchange algorithm
  Study Session 2: Matters relating to the exchange algorithm
Chapter 10  Rational approximation by the exchange algorithm
  Study Session 1: The exchange algorithm for rational approximation
  Study Session 2: Some convergence properties of the exchange algorithm
Introduction
The subject of Approximation Theory lies at the frontier between Applied
Mathematics and Pure Mathematics. Practical problems, such as the computer
calculation of special functions like e^x, lead naturally to theoretical problems,
such as ‘how well can we approximate by a given method?’ or ‘how fast does a
given algorithm converge?’.
Powell’s book Approximation Theory and Methods (hereafter referred to as
‘Powell’) provides an excellent introduction to these theoretical problems, covering
the basic theory of a wide range of approximation methods. Professor Powell is an
expert on both pure and applied approximation theory, and the book contains a
very detailed list of references to and discussion of the research literature.
This course is based on a treatment of fifteen chapters of Powell. Do not be
misled by the statement that this is an undergraduate textbook. Much of the
material can be taught at that level, but when looked at in detail many parts of it
are quite demanding. These course notes will guide you through the book telling
you which sections to read, explaining difficult parts, correcting errors (mercifully
few!) and setting SAQs and Problems to test your understanding of the material.
You should attempt all the SAQs and as many Problems as you have time for:
full solutions are given at the end of the notes for each chapter.
You will find the exercises in Powell quite varied. Many are routine, but others
are rather hard and some are very hard (particularly those which contain the
word ‘investigate’). I have resisted the temptation to attach ‘stars’ to harder
exercises and instead tried to provide ‘hints’, where appropriate. In general I feel
that, at this level, it is a good idea for you to try and make your own judgement
about the difficulty of a given problem.
Many of the exercises require the use of a good scientific calculator (one with
special functions, including hyperbolics, and a memory). Some require the
solution of non-linear equations of the form f(x) = 0 by using, for example:
the bisection method (finding an interval [a, b] such that f(a), f(b) have
opposite signs, testing f(c), where c = (a + b)/2, and then repeating the process
with either [a, c] or [c, b]);
Newton's method (making a good initial guess x₀ at a solution and then
calculating the sequence xₙ given by
x_{n+1} = xₙ − f(xₙ)/f′(xₙ), n = 0, 1, 2, …).
Such methods can be implemented on a basic scientific calculator (especially if
only a rough answer is needed), but it will obviously save time if you have access
to a computer. You will not be expected to determine accurate solutions by such
methods in the examination. On the matter of accuracy, I have tended to present
calculations as they appeared on my own calculator, and have sometimes given
final answers correct to only three significant digits.
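If you do have access to a computer, the following Python sketch shows one way to implement the two methods described above. The example equation f(x) = x³ − 2, the tolerance and the iteration count are my own illustrative choices, not part of the course materials.

    def bisect(f, a, b, tol=1e-10):
        # Bisection: keep halving [a, b], retaining the half with a sign change.
        fa = f(a)
        assert fa * f(b) < 0, "f(a) and f(b) must have opposite signs"
        while b - a > tol:
            c = 0.5 * (a + b)
            if fa * f(c) <= 0:
                b = c
            else:
                a, fa = c, f(c)
        return 0.5 * (a + b)

    def newton(f, df, x0, steps=20):
        # Newton's method: x_{n+1} = x_n - f(x_n)/f'(x_n).
        x = x0
        for _ in range(steps):
            x = x - f(x) / df(x)
        return x

    f = lambda x: x**3 - 2
    print(bisect(f, 1.0, 2.0))                 # ~1.2599210498948732
    print(newton(f, lambda x: 3 * x**2, 1.5))  # the cube root of 2 again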
In order to pace you through the course, there are four Tutor-Marked
Assignments (TMAs). These are compulsory in that you cannot pass the course
without obtaining a reasonable average grade on them. Your three best TMAs
carry 50% of the total marks for the course, the remaining 50% coming from the
three-hour examination at the end of the course. Please note that TMAs cannot
be accepted after their cut-off dates, other than in exceptional circumstances.
Although you should have plenty to do reading Powell and these course notes, I
have added a reading list after this introduction. This splits into books covering
the background material which is assumed in Powell (Linear Algebra, Metric
Spaces, etc.) and other books on Approximation Theory.
I should be grateful to receive any comments you may have on the course notes
and on the set book. The course notes have already benefited greatly from close
reading by Mick Bromilow here at the OU and Martin Stynes of University
College, Cork. Their help has been invaluable. Finally, I should like to thank all
those who have helped prepare these Course Notes, including members of the
Desktop Publishing Unit, Alison Cadle who edited them, and the many M832
students who have supplied corrections to earlier versions.
Phil Rippon
Milton Keynes, August 2006
Reading List
Background
W. Rudin, Principles of Mathematical Analysis, McGraw–Hill, 1976.
(A concise introduction to real analysis, including metric spaces,
integration and functions of several variables, as well as basic linear
algebra — available in a paperback International Student Edition.)
V. Bryant, Metric Spaces, Cambridge University Press, 1985.
(An introduction to metric spaces, emphasising the importance of
iteration. Plenty of explanation.)
D. Kreider, R. Kuller, D. Ostberg and F. Perkins, An Introduction to Linear
Analysis, Addison–Wesley, 1966.
(An introduction to the use of vector spaces of functions in solving
linear differential equations — lots of worked exercises.)
M203 Introduction to Pure Mathematics
MST204 Mathematical Models and Methods
M386 Metric and Topological Spaces (now part of M435)
Approximation Theory
T. J. Rivlin, An Introduction to the Approximation of Functions, Dover, 1981.
(Cheap and covers very similar material to Powell, with less on
splines and more on rational approximation.)
P. J. Davis, Interpolation and Approximation, Dover, 1976.
(Cheap, but a classic text which overlaps Powell considerably,
though with a much greater emphasis on complex approximation.)
D. Braess, Nonlinear Approximation Theory, Springer, 1986.
(Recent and sophisticated, this book examines the more difficult
non-linear theory which Powell largely avoids.)
Chapter 1 The approximation problem
and existence of best approximations
The book begins with a discussion of the types of problems which are to be solved
and several fundamental results. Powell assumes that the reader is quite familiar
with metric spaces and so the commentary below includes a short refresher course
on these, in case you are rusty on this subject.
This chapter splits into TWO study sessions:
Study session 1: Sections 1.1–1.2.
Study session 2: Sections 1.3–1.5.
Commentary
1. [Figure: a function f and a set A of possible approximations; which g ∈ A should be chosen to approximate f?]
2. A metric space (B, d) is a set B and a metric (or distance function) d(a, b),
a, b ∈ B, such that for all a, b, c ∈ B:
(M1) d(a, b) ≥ 0, with equality if and only if a = b;
(M2) d(a, b) = d(b, a);
(M3) d(a, c) ≤ d(a, b) + d(b, c).
The most familiar metric spaces are R with the metric d(a, b) = |a − b| and
R² with the metric
d(a, b) = ((a₁ − b₁)² + (a₂ − b₂)²)^{1/2}, a = (a₁, a₂), b = (b₁, b₂).
[Figure: the points a = (a₁, a₂) and b = (b₁, b₂) in the plane, and the distance d(a, b) between them.]
For general n it is not quite so obvious that (M3) holds. The proof is given
later when we introduce a large family of metric spaces. Before that we recall
a number of definitions and results for future reference. No proofs are given
as these results are quite standard.
Convergence A sequence an , n = 1, 2, . . ., in B is convergent with limit
a∗ if d(an , a∗ ) → 0 as n → ∞.
Closed set A subset F of B is closed if every convergent sequence an ,
n = 1, 2, . . ., in F has its limit in F .
For example, the closed ball
{b ∈ B : d(a, b) ≤ r}, r > 0,
is a closed set.
Open set A subset E of B is open if B\E is closed.
For example, the open ball
{b ∈ B : d(a, b) < r}, r > 0,
is an open set.
Compact set A subset K of B is compact if every sequence an ,
n = 1, 2, . . ., in K has a convergent subsequence ank , k = 1, 2, . . ., whose limit
a is in K.
For example, every finite set is compact. In Rⁿ with the metric d given by
equation (1), every closed set which is also bounded (i.e. lies inside some
fixed closed ball) is compact. Note that every compact set is closed.
Continuous function A function φ : (B, d) → (B′, d′) is continuous at
a ∈ B if for each ε > 0 there is a δ > 0 such that
d(a, b) < δ ⇒ d′(φ(a), φ(b)) < ε
(equivalently: for each sequence aₙ → a in B, we have φ(aₙ) → φ(a)). We
say that φ : (B, d) → (B′, d′) is continuous if φ is continuous at each a ∈ B.
Uniformly continuous function A function φ : (B, d) → (B′, d′) is
uniformly continuous on B if for each ε > 0 there is a δ > 0 such that, for
all a, b ∈ B,
d(a, b) < δ ⇒ d′(φ(a), φ(b)) < ε.
Extreme Value Theorem If φ : (B, d) → (R, d′) is continuous (where
d′(a, b) = |a − b|), then φ attains a maximum value and a minimum value on
any compact subset K of B.
Uniform Continuity Theorem If φ : (B, d) → (B′, d′) is continuous, then
φ is uniformly continuous on any compact subset K of B.
3. Theorem 1.1. The proof can be shortened. You can omit the second
sentence and the word ‘Otherwise’ from the third sentence, and then use the
notation a∗ in place of a+ . Note that Powell uses ‘limitpoint’ to mean the
limit of a convergent subsequence.
The following picture may be helpful.
[Figure: the points a₁, a₂, a₃, a₄, … of a sequence in A ⊆ B, a subsequence converging to a* ∈ A, and the point f.]
Self-assessment questions
S1 Consider the problem of fitting the data in Figure 1.2 by a straight line. Show
that the set A of vectors (p(x₁), …, p(x₅)), arising from functions
p(x) = c₀ + c₁x, forms a 2-dimensional subspace of R⁵.
Study Session 2: Approximation in a normed linear
space
Commentary
1. Almost every metric space in Powell arises as a normed linear space
(n.l.s.). This is a linear space B (also called a vector space) with an
associated norm ‖a‖, a ∈ B, such that, for all a, b ∈ B and λ ∈ R:
(N1) ‖a‖ ≥ 0, with equality if and only if a = 0;
(N2) ‖λa‖ = |λ| ‖a‖;
(N3) ‖a + b‖ ≤ ‖a‖ + ‖b‖.
Roughly speaking, the norm measures how large the element a is, that is,
how far a lies from the zero element of the space.
By defining
d(a, b) = ‖a − b‖,
we find that (B, d) is a metric space. Properties (M1) and (M2) are
immediate, as is (M3), since
‖a − c‖ = ‖(a − b) + (b − c)‖ (by linearity)
        ≤ ‖a − b‖ + ‖b − c‖. (by (N3))
For this reason, (N3) is also called the triangle inequality.
Powell gives some important examples of norms in Section 1.4. Two of these
have useful geometric interpretations.
[Figures: the graph of y = f(x) on [a, b], illustrating
‖f‖∞ = max_{a≤x≤b} |f(x)| (the maximum vertical separation of the graph from the x-axis) and
‖f‖₁ = ∫_a^b |f(x)| dx (the total shaded area between the graph and the x-axis);
and the graphs of f and g together, illustrating
‖f − g‖∞ = max_{a≤x≤b} |f(x) − g(x)| (the maximum vertical separation between the graphs) and
‖f − g‖₁ = ∫_a^b |f(x) − g(x)| dx (the total shaded area between them).]
The 2-norm
‖f‖₂ = ( ∫_a^b f(x)² dx )^{1/2}
satisfies the Cauchy–Schwarz inequality
∫_a^b f(x)g(x) dx ≤ ‖f‖₂ ‖g‖₂.
To see this, apply the elementary inequality st ≤ ½(s² + t²) to s = |f(x)|/‖f‖₂ and t = |g(x)|/‖g‖₂, and integrate:
(1/(‖f‖₂‖g‖₂)) ∫_a^b |f(x)g(x)| dx ≤ (1/(2‖f‖₂²)) ∫_a^b f(x)² dx + (1/(2‖g‖₂²)) ∫_a^b g(x)² dx = 1.
The desired inequality now follows from
∫_a^b f(x)g(x) dx ≤ ∫_a^b |f(x)g(x)| dx.
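A quick numerical check is possible if you have a computer. The Python sketch below verifies the Cauchy–Schwarz inequality by a crude midpoint rule, and also the norms of e(x) = 1 − x^λ computed in the solution to SAQ S7 later in this chapter; the test functions, the grid size and the value λ = 2 are my own illustrative choices.

    import numpy as np

    n = 100000
    x = (np.arange(n) + 0.5) / n                 # midpoints of [0, 1]

    def norm2(v):
        # ||f||_2 = (integral of f^2)^(1/2), by the midpoint rule on [0, 1]
        return np.sqrt(np.mean(v**2))

    f, g = np.exp(x), np.cos(3 * x)
    print(abs(np.mean(f * g)) <= norm2(f) * norm2(g))   # Cauchy-Schwarz: True

    # The norms of e(x) = 1 - x**lam from SAQ S7:
    lam = 2.0
    e = 1 - x**lam
    print(np.mean(np.abs(e)), lam / (lam + 1))                    # ||e||_1
    print(norm2(e)**2, 2 * lam**2 / ((lam + 1) * (2 * lam + 1)))  # ||e||_2^2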
The metric on Rⁿ which arises from the discrete 2-norm is precisely that given by equation (1).
∫_a^b f(x)g(x) dx ≤ ‖f‖_p ‖g‖_q,
[Figure: the set A in B, the closed ball {a ∈ B : ‖a‖ ≤ M}, and their intersection A_M.]
Self-assessment questions
S3 Prove that the function φ in the above proof is continuous.
S8 Let A and f be as in Figure 1.4. Determine inf_{a∈A} ‖f − a‖ for the 1-norm,
the 2-norm and the ∞-norm.
P5 Powell Exercise 1.1 (Hint: choose a suitable compact subset of A1 and apply
SAQ S2.)
with a₁* ∈ A₁ and a₂* ∈ A₂ (first choose a convergent subsequence a¹_{n_k} of a¹_n
and then, if necessary, a subsequence of a²_{n_k}).
Now, by the triangle inequality,
d(a₁*, a₂*) ≤ d(a₁*, a¹_{n_k}) + d(a¹_{n_k}, a²_{n_k}) + d(a²_{n_k}, a₂*).
Letting k → ∞, we deduce that d(a₁*, a₂*) ≤ d*, so that d(a₁*, a₂*) = d*, as
required.
that is,
‖f + g‖∞ ≤ ‖f‖∞ + ‖g‖∞,
as required.
S6 The set A₀ is compact, being the intersection of a closed ball in B with A and
hence a closed subset of a compact set. Thus we can, by Theorem 1.1, choose
a* ∈ A₀ such that
‖a − f‖ ≥ ‖a* − f‖, a ∈ A₀.
To see that
‖a − f‖ ≥ ‖a* − f‖, a ∈ A,
note that if a ∈ A\A₀, then
‖a − f‖ > ‖0 − f‖ ≥ ‖a* − f‖,
since 0 ∈ A₀.
S7 Since λ > 0, 1 − x^λ ≥ 0 for 0 ≤ x ≤ 1, so that
‖e‖₁ = ∫₀¹ (1 − x^λ) dx = [x − x^{λ+1}/(λ + 1)]₀¹ = λ/(λ + 1),
‖e‖₂² = ∫₀¹ (1 − x^λ)² dx = ∫₀¹ (1 − 2x^λ + x^{2λ}) dx
      = [x − 2x^{λ+1}/(λ + 1) + x^{2λ+1}/(2λ + 1)]₀¹ = 2λ²/((λ + 1)(2λ + 1)),
‖e‖∞ = max_{0≤x≤1} |1 − x^λ| = 1.
S8 inf_{a∈A} ‖f − a‖₁ = 1; inf_{a∈A} ‖f − a‖₂ = 1/√2; inf_{a∈A} ‖f − a‖∞ = 1/2.
Now if x = 1, then
|x² − ax| = |1 − a| = 1 − a,
and if x = a/2, then
|x² − ax| = (a/2)(a − a/2) = a²/4.
[Figure: the graphs of y = f(x) and y = g(x) on [a, b], with the points a = x₀, x₁, x₂, …, b = xₙ marked.]
P3 Let us take [a, b] = [0, 1] for simplicity. The example can always be adapted to
[a, b] by a translation.
Consider first the example e(x) = 1 − x^λ, 0 ≤ x ≤ 1. Equations (1.24) and
(1.25) give
‖e‖₂²/‖e‖₁² = (2λ + 2)/(2λ + 1),
which shows that
1 ≤ ‖e‖₂/‖e‖₁ ≤ √2, 0 ≤ λ < ∞.
However, if we allow negative values of λ then ‖e‖₂/‖e‖₁ becomes unbounded
as λ tends to −1/2 from above. Of course, f(x) = x^λ is not continuous on [0, 1]
for negative values of λ, but this observation suggests a possible 'shape' for
our example.
Consider instead the continuous function
fε(x) = 1 − x/ε, 0 ≤ x ≤ ε,
fε(x) = 0,       ε < x ≤ 1,
where 0 < ε < 1. We have
‖fε‖₁ = ∫₀^ε (1 − x/ε) dx = [x − x²/(2ε)]₀^ε = ε/2,
‖fε‖₂² = ∫₀^ε (1 − x/ε)² dx = [x − x²/ε + x³/(3ε²)]₀^ε = ε/3.
Hence ‖fε‖₂/‖fε‖₁ = 2/√(3ε) → ∞ as ε → 0.
P4 (i) The unit ball for the 1-norm in R³ is a regular octahedron centred at the
origin. The part of its boundary in the first octant has equation
x + y + z = 1. Thus, as r increases,
{a : ‖a‖₁ ≤ r}
first meets 3x + 2y + z = 6 at the point (2, 0, 0), which is the closest point
to the origin with respect to the 1-norm.
(ii) The unit ball for the 2-norm in R³ is the ordinary ball centred at the
origin. As r increases,
{a : ‖a‖₂ ≤ r}
first meets 3x + 2y + z = 6 at a point (x, y, z) whose normal (to the
plane) passes through the origin. Since the line {(3k, 2k, k) : k ∈ R} is
normal to the plane, we solve
3(3k) + 2(2k) + k = 6 ⇒ k = 3/7.
Thus (9/7, 6/7, 3/7) is the closest point to the origin with respect to the
2-norm.
(iii) The unit ball for the ∞-norm in R³ is the cube with vertices
(±1, ±1, ±1). As r increases,
{a : ‖a‖∞ ≤ r}
first meets 3x + 2y + z = 6 at a point of the form (k, k, k), k > 0. Thus
(1, 1, 1) is the closest point to the origin with respect to the ∞-norm.
P5 The idea, as in Theorem 1.2, is to choose a compact subset A2 of A1 which
must contain the point of A1 which is closest to A0 . For example, we can
choose
A₂ = A₁ ∩ {a : ‖a‖ ≤ 2M},
where M is so large that
A₀ ⊆ {a : ‖a‖ ≤ M}.
Then choose a₀* ∈ A₀ and a₁* ∈ A₂ (see SAQ S2) such that
‖a₀* − a₁*‖ ≤ ‖a₀ − a₁‖, a₀ ∈ A₀, a₁ ∈ A₂.
To prove that
‖a₀* − a₁*‖ ≤ ‖a₀ − a₁‖, a₀ ∈ A₀, a₁ ∈ A₁,
note that if a₀ ∈ A₀ and a₁ ∈ A₁\A₂, then
‖a₀ − a₁‖ ≥ ‖a₁‖ − ‖a₀‖
         > 2M − M
         ≥ ‖a₀*‖
         = ‖a₀* − 0‖
         ≥ ‖a₀* − a₁*‖,
as required.
Chapter 2 The uniqueness of best
approximation
Ideally the method used to choose an approximation from a set A to a given
function f should give a unique answer. This chapter is devoted to the study of
those conditions under which a best approximation from A to f is unique.
Important new concepts introduced include ‘convexity’ and ‘scalar product’.
This chapter splits into TWO study sessions:
Study session 1: Sections 2.1 and 2.2.
Study session 2: Sections 2.3 and 2.4.
Commentary
1. The following diagrams illustrate the notion of a convex set and a strictly
convex set.
2. In the proof of Theorem 2.1, there is no need for modulus signs around θ and
1 − θ, since both quantities are positive.
3. [Figure: the points s₀, s₁ and ½(s₀ + s₁) in A ⊆ B, the point f, and the point
s = ½(s₀ + s₁) + λ(f − ½(s₀ + s₁)).]
Note that the number λ ≥ 0, which appears in (2.6), does not need to be
maximal. All that is required is λ > 0 and s ∈ A.
4. The following diagram illustrates the proof of Theorem 2.4.
[Figure: the set A in B, the points s₀, s₁ and ½(s₀ + s₁), the best approximation h*, the point f, and the ball N(f, h*).]
Self-assessment questions
S1 Which of the unit balls in Figure 1.5 (page 10) are strictly convex?
S2 Prove that
(a) the intersection of two convex sets is convex;
(b) the intersection of two strictly convex sets is strictly convex.
S3 Show that the norms (a) ‖·‖₁ and (b) ‖·‖∞ are not strictly convex on C[0, 1].
Commentary
1. [Figure: the best approximation operator X maps each f ∈ B to a point a* = X(f) in A.]
A projection operator is one for which X(X(f)) = X(f), that is, the best
approximation from A to a point a ∈ A is a itself.
2. The final comment in Section 2.3 relates to the earlier comment on the
importance of the continuity of the best approximation operator to computer
calculations.
3. A scalar product (or inner product) on a linear space B is a real-valued
function (a, b), a, b ∈ B, such that for all a, b, c ∈ B and λ, µ ∈ R:
(S1) (a, a) ≥ 0, with equality if and only if a = 0;
(S2) (a, b) = (b, a);
(S3) (a, λb + µc) = λ(a, b) + µ(a, c).
Two important scalar products are given in Section 2.4.
In any linear space with a scalar product we can define a norm by the
equation
‖a‖ = (a, a)^{1/2}.
As usual, only the proof of the triangle inequality requires any work; it
follows from a version of the Cauchy–Schwarz inequality (see SAQ S6):
|(a, b)| ≤ ‖a‖ ‖b‖, a, b ∈ B,
together with the identity
‖a + b‖² = ‖a‖² + 2(a, b) + ‖b‖²,
which you can easily verify.
We shall meet other examples of scalar products in Chapter 11.
5. The norms
‖f‖_p = ( ∫_a^b |f(x)|^p dx )^{1/p}, 1 < p < ∞,
are all strictly convex. The proof (for p = 2) depends on a careful study of
the possibility of equality in Hölder's inequality.
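Before the SAQs, here is a small numerical check of these facts for the weighted scalar product (f, g) = ∫_a^b w(x)f(x)g(x) dx of Section 2.4 (see also SAQ S5). The weight, the interval and the test functions in this Python sketch are illustrative assumptions of mine.

    import numpy as np

    a, b, n = -1.0, 1.0, 100000
    x = a + (b - a) * (np.arange(n) + 0.5) / n   # midpoints of [a, b]
    w = 1.0 + x**2                               # any positive weight will do

    def sp(f, g):
        # (f, g) = integral of w f g over [a, b], by the midpoint rule
        return (b - a) * np.mean(w * f * g)

    f, g = np.sin(np.pi * x), x**3
    nf, ng = np.sqrt(sp(f, f)), np.sqrt(sp(g, g))   # induced norms (a, a)^(1/2)
    print(abs(sp(f, g)) <= nf * ng)                 # Cauchy-Schwarz: True
    print(np.sqrt(sp(f + g, f + g)) <= nf + ng)     # triangle inequality: True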
Self-assessment questions
S4 Prove Theorem 2.6. (Hint: consider A₀ = {a ∈ A : ‖a‖ ≤ 4‖f‖}.)
Problems for Chapter 2
P1 Powell Exercise 2.4
S3 (a) Consider f(x) = 2x and g(x) = 2(1 − x) on [0, 1]. Clearly ‖f‖₁ = 1 and
‖g‖₁ = 1, but
(½(f + g))(x) = 1, 0 ≤ x ≤ 1,
so that ‖½(f + g)‖₁ = 1. Hence ‖·‖₁ is not strictly convex.
(b) Consider f(x) = 1 and g(x) = x on [0, 1]. Clearly ‖f‖∞ = 1 and
‖g‖∞ = 1, but
(½(f + g))(x) = ½(1 + x), 0 ≤ x ≤ 1,
so that ‖½(f + g)‖∞ = 1. Hence ‖·‖∞ is not strictly convex.
S5 (S1) (f, f) = ∫_a^b w(x)f(x)² dx ≥ 0, since w(x)f(x)² ≥ 0, a ≤ x ≤ b.
Equality can occur only if f(x)² = 0, a ≤ x ≤ b, since w(x) > 0,
a ≤ x ≤ b.
(S2) Obvious, by definition.
(S3) (f, λg + µh) = ∫_a^b w(x)f(x)(λg(x) + µh(x)) dx
                  = λ ∫_a^b w(x)f(x)g(x) dx + µ ∫_a^b w(x)f(x)h(x) dx
                  = λ(f, g) + µ(f, h).
[Figures: the graphs of f(x) = 1 and a(x) = λx on [−1, 1], and the corresponding vectors
f = (1, 1, 1, 1, 1) and a = (−λ, −λ/2, 0, λ/2, λ); in each case ‖f − a‖∞ = 1 for 0 ≤ λ ≤ 1.]
Solutions to Problems in Chapter 2
P1 Since the unit ball in the ∞-norm is a square with sides parallel to the axes,
the best approximation in A to a point f ∈ R2 \A is found as follows.
(a) If f lies in one of the shaded sets, then X(f) lies on the circle
{a : ‖a‖₂ = 1} and on a (projection) line through f at 45° to the axes.
(b) If f does not lie in one of the shaded sets, then X(f ) is the nearest of the
four points (±1, 0), (0, ±1).
[Figure: the unit disc A, points f and g outside A, and their best approximations X(f) and X(g).]
To prove directly (that is, without the help of Theorem 2.6) that X(f ) is
continuous, suppose first that f1 , f2 lie in the shaded set in the first quadrant.
Then
‖f₁ − f₂‖∞ ≥ d/√2,
where d is the (ordinary) distance between the 45◦ projection lines through f1
and f2 .
[Figure: the points f₁ and f₂ with their projection lines, the ball N(f₂, d/√2), the set A and the point X(f₂).]
Furthermore,
‖X(f₁) − X(f₂)‖∞ ≤ √2 d,
since the line segment joining X(f₁) to X(f₂) makes an angle of more than
45° with the projection lines from f₁, f₂. Hence
‖X(f₁) − X(f₂)‖∞ ≤ 2 ‖f₁ − f₂‖∞.
It follows that X is continuous on the shaded sets. Since X is constant on the
four unshaded sets in R2 \A (and these constant values agree with the values
of X on the boundaries between the shaded and unshaded sets in R2 \A) and
X is the identity on A itself, we deduce that X is continuous on the whole
of R2 .
P2 To prove that X(f) = f/‖f‖, if ‖f‖ > 1, we have to show that
‖f − g‖ ≥ ‖ f − f/‖f‖ ‖, g ∈ A.
But, by the 'backwards' form of the triangle inequality,
‖f − g‖ ≥ ‖f‖ − ‖g‖
       ≥ ‖f‖ − 1 (since g ∈ A)
       = ‖f‖(1 − 1/‖f‖)
       = ‖ f − f/‖f‖ ‖,
as required. (Where did we use the fact that ‖f‖ > 1?)
To prove that
‖X(f₁) − X(f₂)‖ ≤ 2‖f₁ − f₂‖, f₁, f₂ ∈ B,
it is sufficient to consider three cases.
Case 1 ‖f₁‖ ≤ 1, ‖f₂‖ ≤ 1.
In this case X(f₁) = f₁ and X(f₂) = f₂, so that
‖X(f₁) − X(f₂)‖ = ‖f₁ − f₂‖.
Case 2 ‖f₁‖ ≤ 1, ‖f₂‖ > 1.
In this case X(f₁) = f₁ and X(f₂) = f₂/‖f₂‖, so that
‖X(f₁) − X(f₂)‖ = ‖ f₁ − f₂/‖f₂‖ ‖
              = ‖ f₁ − f₂ + f₂(1 − 1/‖f₂‖) ‖
              ≤ ‖f₁ − f₂‖ + ‖f₂‖(1 − 1/‖f₂‖) (since ‖f₂‖ > 1)
              = ‖f₁ − f₂‖ + ‖f₂‖ − 1
              ≤ ‖f₁ − f₂‖ + ‖f₂‖ − ‖f₁‖ (since ‖f₁‖ ≤ 1)
              ≤ 2‖f₁ − f₂‖.
Case 3 ‖f₁‖ > 1, ‖f₂‖ ≥ ‖f₁‖.
In this case X(f₁) = f₁/‖f₁‖ and X(f₂) = f₂/‖f₂‖, so that
‖X(f₁) − X(f₂)‖ = ‖ f₁/‖f₁‖ − f₂/‖f₂‖ ‖
              = (1/‖f₁‖) ‖ f₁ − f₂ + f₂(1 − ‖f₁‖/‖f₂‖) ‖
              ≤ ‖f₁ − f₂‖ + ‖f₂‖ − ‖f₁‖ (since ‖f₁‖ > 1)
              ≤ 2‖f₁ − f₂‖.
P3 First we remark that the sum of two norms on a linear space is also a norm
on that space.
To prove that
‖f‖ = ‖f‖₁ + ‖f‖∞, f ∈ C[−π, π],
is not strictly convex, let A be the 1-dimensional subspace of functions of the
form
g(x) = λ sin²x, −π ≤ x ≤ π,
where λ ∈ R, and let f(x) = x, −π ≤ x ≤ π.
For |λ| ≤ 1, the graph y = λ sin²x meets y = x only at the origin, since
|λ sin²x| ≤ |sin x| < |x|, for x ≠ 0.
[Figure: the graphs of y = x and y = λ sin²x on [−π, π].]
Thus
‖f − g‖ = π² + π,
for g(x) = λ sin²x, |λ| ≤ 1.
Since it is also clear that
‖f − g‖ ≥ π² + π, λ ∈ R,
we deduce that f does not have a unique best approximation in A. Hence, by
Theorem 2.4, this norm is not strictly convex.
P4 Recall that the unit ball of R³ in the 1-norm is the regular octahedron whose
face in the first octant lies on x + y + z = 1, and the unit ball of R³ in the
∞-norm is the cube with vertices (±1, ±1, ±1).
Thus the plane x + y = 1 meets the boundary of the unit ball in the 1-norm
in a line segment and also meets the boundary of the ball of radius 1/2 in the
∞-norm in a line segment.
[Figures: the octahedron ‖a‖₁ ≤ 1 and the cube ‖a‖∞ ≤ 1/2 in R³, each meeting the plane x + y = 1 in a line segment.]
P5 This question shows that any closed, bounded, convex subset A of a linear
space B, with the property that f ∈ A ⇒ −f ∈ A, is the unit ball of some
norm, namely that given by
‖f‖ = 0, if f = 0, and
‖f‖ = min {r ∈ (0, ∞) : f/r ∈ A}, if f ≠ 0.
First note that the minimum in this definition is attained. Indeed, if we first
define
‖f‖ = inf {r ∈ (0, ∞) : f/r ∈ A}, f ≠ 0,
then ‖f‖ > 0, otherwise A is unbounded. Also, there is a sequence rₙ → ‖f‖,
such that f/rₙ ∈ A, and since f/rₙ → f/‖f‖ we deduce that f/‖f‖ ∈ A, as
required.
(N1) Certainly ‖f‖ ≥ 0, by definition, and we have just seen that ‖f‖ > 0
for f ≠ 0.
(N2) ‖λf‖ = |λ| ‖f‖, for λ ∈ R.
It is clear that this holds if f = 0 or λ = 0.
If f ≠ 0 and λ ≠ 0, then
‖λf‖ = min {r ∈ (0, ∞) : λf/r ∈ A}
     = min {r ∈ (0, ∞) : |λ|f/r ∈ A} (since f ∈ A ⇔ −f ∈ A)
     = min {|λ|r ∈ (0, ∞) : |λ|f/(r|λ|) ∈ A}
     = |λ| min {r ∈ (0, ∞) : f/r ∈ A}
     = |λ| ‖f‖,
as required.
(N3) ‖f + g‖ ≤ ‖f‖ + ‖g‖.
At first sight this looks difficult. However, by the definition of ‖f + g‖,
it is sufficient to prove that
(f + g)/(‖f‖ + ‖g‖) ∈ A. (1)
We know that f/‖f‖ ∈ A and g/‖g‖ ∈ A so that, by the convexity of A,
θ f/‖f‖ + (1 − θ) g/‖g‖ ∈ A, 0 < θ < 1.
If we now choose
θ = ‖f‖/(‖f‖ + ‖g‖), so that 1 − θ = ‖g‖/(‖f‖ + ‖g‖),
then we obtain (1).
The above argument breaks down if ‖f‖ = ‖g‖ = 0, but in this case
f = g = 0, and so ‖f + g‖ = 0 also.
Hence ‖f‖, f ∈ B, is indeed a norm on B.
Chapter 3 Approximation operators and
some approximating functions
Calculating a best approximation from a subspace A of B to f , with respect to
some norm on B, may not be as easy as calculating an approximation using some
other operator, such as an interpolation operator. To judge how good an
approximation is obtained by such an operator X, we use the 'norm' ‖X‖ of the
operator, which is exploited in Sections 3.1 and 3.2. The other two sections of the
chapter contain a discussion of the problems involved in approximating by
polynomials and an introduction to piecewise polynomial approximation.
This chapter splits into TWO study sessions:
Study session 1: Sections 3.1 and 3.2.
Study session 2: Sections 3.3 and 3.4.
Commentary
1. Powell defines ‖X‖ to be the smallest real number such that
‖X(f)‖ ≤ ‖X‖ ‖f‖, f ∈ B.
Otherwise stated,
‖X‖ = sup {‖X(f)‖/‖f‖ : f ∈ B, f ≠ 0}.
Thus to determine the value of ‖X‖, we must find a number M such that
‖X(f)‖ ≤ M ‖f‖, f ∈ B,
and such that, whenever M′ < M, there exists f ∈ B with
‖X(f)‖ > M′ ‖f‖.
If ‖X(f)‖/‖f‖ is unbounded on B, then we say that the operator X is
unbounded.
Notice that if X is a linear operator, then
‖X(λf)‖/‖λf‖ = ‖X(f)‖/‖f‖, f ≠ 0, λ ≠ 0,
so that
‖X‖ = sup {‖X(f)‖ : ‖f‖ = 1}.
In general, the supremum may not be attained because the unit sphere
{f ∈ B : ‖f‖ = 1} need not be compact (see, for example, Problem P1).
2. The following diagram may help you with Theorem 3.1.
[Figure: the set A in B, the points X(f) and p* in A, the point f, and the distance d* = ‖f − p*‖.]
[Note the word ‘a’ in the first line of the proof of Theorem 3.1.]
3. Page 25, line 13. The reason why p*(x) = x − 1/8 is the best L∞
approximation by a linear polynomial to f(x) = x² on [0, 1] will become clear
in Chapter 7.
4. The point of the final paragraph of Section 3.2 is that algorithms for
calculating best L∞ approximations from Pₙ are more involved than those
for applying linear (projection) operators X : B → A, such as interpolation.
If we determine Xf and compute f(x) − Xf(x) at various points, finding an
x for which
|f(x) − Xf(x)| > (1 + ‖X‖)ε,
then, by Theorem 3.1, the best approximation p* will satisfy ‖f − p*‖ > ε.
Thus a larger value of n may be required.
Self-assessment questions
S1 Show that the interpolation operator X defined at the bottom of page 23 is
unbounded with respect to (a) the 1-norm, (b) the 2-norm.
Commentary
1. Page 26, line 9. The promised technique appears in equation (3.23).
2. The space C⁽ᵏ⁾[a, b]. An example of a function f which belongs to C⁽ᵏ⁾[a, b],
but not to C⁽ᵏ⁺¹⁾[a, b], is
f(x) = x^{k+1},  x ≥ 0,
f(x) = −x^{k+1}, x < 0.
For this function,
f⁽ᵏ⁾(x) = (k + 1)! |x|, x ∈ R,
which is continuous but not differentiable at 0.
3. The identity (3.23) holds because q ∈ Pn so that, as p varies over the whole
space Pn , so q + p varies over the whole of Pn .
4. Table 3.1. For k = 1, the terms d*ₙ(f) scale by a factor of approximately 1/2 as
n doubles, whereas, for k = 3, they scale by approximately 1/8. This gives
d*_{2ⁿ}(f) ≈ C₁/2ⁿ, k = 1,
d*_{2ⁿ}(f) ≈ C₃/8ⁿ = C₃/(2ⁿ)³, k = 3,
which suggests that d*ₙ(f) ≈ Cₖ/nᵏ in both cases.
Notice that in (3.20), for a fixed value of k,
(n − k)!/n! = 1/(n(n − 1)⋯(n − k + 1)) ∼ 1/nᵏ as n → ∞.
(We say that aₙ ∼ bₙ as n → ∞ if lim_{n→∞} aₙ/bₙ = 1.)
5. Page 29, line 8. An analytic function is one which has a power series
expansion about each point of its domain of definition. Such functions are
completely determined by their values on any given open interval, no matter
how short.
6. Page 29, line 6−. The spline function s is a piecewise polynomial on [a, b],
such that
s(x) = pⱼ(x), ξ_{j−1} ≤ x ≤ ξⱼ, j = 1, …, n,
where each pⱼ ∈ Pₖ, and
pⱼ⁽ⁱ⁾(ξⱼ) = p_{j+1}⁽ⁱ⁾(ξⱼ), i = 0, 1, …, k − 1, j = 1, …, n − 1.
[Figure: the polynomial pieces p₁, …, pⱼ, p_{j+1}, …, pₙ of the spline s.]
Hence
s(x) = p₂(x) = p₁(x) + q₁(x), ξ₁ ≤ x ≤ ξ₂,
and so
s(x) = p₁(x) + (d₁/k!)(x − ξ₁)₊ᵏ, ξ₀ ≤ x ≤ ξ₂.
Here
(x − ξ₁)₊ = 0,      x < ξ₁,
(x − ξ₁)₊ = x − ξ₁, x ≥ ξ₁.
Now put
q₂(x) = p₃(x) − p₂(x), x ∈ R,
and continue in this manner to obtain
s(x) = p₁(x) + (1/k!) Σ_{j=1}^{n−1} dⱼ (x − ξⱼ)₊ᵏ, a ≤ x ≤ b,
where dⱼ = qⱼ⁽ᵏ⁾(ξⱼ) and qⱼ = p_{j+1} − pⱼ. Thus dⱼ is the jump in s⁽ᵏ⁾ at ξⱼ.
7. Page 30, line 9−. The ‘big oh’ notation used here needs some explanation.
We say that
f(x) = O(g(x)), x ∈ S,
for some subset S of R, if
|f(x)| ≤ M |g(x)|, x ∈ S,
where the constant M does not depend on x. For example,
x² + x = O(x²), x ≥ 1, whereas x² + x = O(x), 0 ≤ x ≤ 1.
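The truncated power representation in comment 6 above is easy to evaluate directly on a computer. Here is a Python sketch; the function and argument names are my own, and it is checked against the spline of SAQ S4 below.

    import numpy as np
    from math import factorial

    def spline_value(x, p1, knots, jumps, k):
        # s(x) = p1(x) + (1/k!) * sum_j d_j (x - xi_j)_+^k,
        # where jumps[j] = d_j is the jump in s^(k) at knots[j];
        # p1 holds the coefficients of p1 in descending powers (np.polyval order).
        s = np.polyval(p1, x)
        for xi, d in zip(knots, jumps):
            s = s + (d / factorial(k)) * np.maximum(x - xi, 0.0) ** k
        return s

    # The quadratic spline of SAQ S4: s(x) = -x + (x)_+^2 + 2(x - 1)_+^2 on
    # [-1, 2]; with k = 2 the jumps are k! times the coefficients 1 and 2.
    x = np.linspace(-1.0, 2.0, 7)
    print(spline_value(x, p1=[-1.0, 0.0], knots=[0.0, 1.0], jumps=[2.0, 4.0], k=2))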
Self-assessment questions
S3 The best L∞ approximation from P₂ to f(x) = |x| on [−1, 1] is
p*(x) = x² + 1/8 (this will become clear in Chapter 7). Verify that
‖f − p*‖∞ = 1/8, thus confirming one of the entries in Table 3.1.
P4 Powell Exercise 3.6 (Use Theorem 3.1 and be content to get the lower
estimate 0.048.)
Solutions to SAQs in Chapter 3
S1 Consider
fε(x) = 1 − x/ε,       0 ≤ x ≤ ε,
fε(x) = 0,             ε < x < 1 − ε,
fε(x) = 1 + (x − 1)/ε, 1 − ε ≤ x ≤ 1.
(Remember Problem P3, Chapter 1.)
Since fε(0) = fε(1) = 1, the interpolating function p = X(fε) is simply
p(x) = 1, 0 ≤ x ≤ 1, and so ‖p‖₁ = ‖p‖₂ = 1. However,
‖fε‖₁ = ∫₀^ε (1 − x/ε) dx + ∫_{1−ε}^1 (1 + (x − 1)/ε) dx = ε,
so that
(a) ‖X(fε)‖₁/‖fε‖₁ = 1/ε is unbounded.
Also
‖fε‖₂² = ∫₀^ε (1 − x/ε)² dx + ∫_{1−ε}^1 (1 + (x − 1)/ε)² dx = 2ε/3,
so that
(b) ‖X(fε)‖₂/‖fε‖₂ = √(3/(2ε)) is unbounded.
[Figure: the graphs of y = x² and y = x − 1/8 on [0, 1]; the error has equal extreme values at x = 0, 1/2 and 1.]
S3 If f(x) = |x|, −1 ≤ x ≤ 1, and p*(x) = x² + 1/8, −1 ≤ x ≤ 1, then there are 5
candidates for the point x ∈ [−1, 1] such that ‖f − p*‖∞ = |f(x) − p*(x)|.
These are 0, ±1, and the points ±x where
e(x) = f(x) − p*(x) = |x| − (x² + 1/8)
is a maximum. As in SAQ S2, we find that x = ±1/2, so that
‖f − p*‖∞ = 1/8 = 0.125,
which confirms the first entry in the k = 1 column of Table 3.1.
S4 Following the proof of (3.31) given in the commentary, put p₁(x) = −x,
p₂(x) = x² − x, p₃(x) = 3x² − 5x + 2. Then
q₁(x) = p₂(x) − p₁(x) = x² − x − (−x) = x²,
q₂(x) = p₃(x) − p₂(x) = 3x² − 5x + 2 − (x² − x) = 2x² − 4x + 2 = 2(x − 1)².
Hence s(x) = −x + (x)₊² + 2(x − 1)₊², −1 ≤ x ≤ 2.
so that
‖Xf‖∞ ≤ ( max_{a≤x≤b} ∫_a^b |K(x, y)| dy ) ‖f‖∞.
Hence
‖X‖∞ ≤ max_{a≤x≤b} ∫_a^b |K(x, y)| dy = ∫_a^b |K(x₀, y)| dy,
say. To prove that ‖X‖∞ can be no less than this, we should like to find a
function f ∈ C[a, b] such that ‖f‖∞ = 1 and
‖Xf‖∞ = ∫_a^b |K(x₀, y)| dy.
The ideal function would be
f(y) = sgn(K(x₀, y)) = 1,  if K(x₀, y) > 0,
f(y) = sgn(K(x₀, y)) = −1, if K(x₀, y) < 0,
so that
(Xf)(x₀) = ∫_a^b K(x₀, y)f(y) dy = ∫_a^b |K(x₀, y)| dy,
which implies that ‖X‖∞ ≥ ∫_a^b |K(x₀, y)| dy. Unfortunately, however, this
function f is not continuous (unless K(x₀, y) never vanishes).
Instead we take a continuous approximation fε, ε > 0, which differs from
sgn(K(x₀, y)) only on a set of length less than ε.
[Figure: the graph of fε(y) on [a, b], equal to ±1 except on a set of total length less than ε, together with the graph of K(x₀, y).]
It follows that
| Xfε(x₀) − ∫_a^b |K(x₀, y)| dy | = | ∫_a^b K(x₀, y)(fε(y) − sgn(K(x₀, y))) dy |
                                 ≤ K ∫_a^b |fε(y) − sgn(K(x₀, y))| dy
                                 ≤ Kε,
where K = max_{a≤y≤b} |K(x₀, y)|. Hence
‖Xfε‖∞ ≥ ∫_a^b |K(x₀, y)| dy − Kε.
Now, by Problem P1,
‖X‖∞ = max_{−1≤x≤1} (1/2) ∫_{−1}^1 |1 + 3xy| dy
     = (1/2) ∫_{−1}^1 |1 + 3y| dy  (or (1/2) ∫_{−1}^1 |1 − 3y| dy).
[Figure: the graph of y ↦ 1 + 3xy on [−1, 1] for three values of x.]
Hence
‖X‖∞ = (1/2)((1/2) · (4/3) · 4 + (1/2) · (2/3) · 2) = 5/3.
P3 First note that X is a projection (it is clearly linear), since if f(x) = a + bx,
then
Xf(x) = 2 ∫₀^{1/2} (a + bt) dt + (x − 1/4)(a + b − a)
      = a + b/4 + b(x − 1/4)
      = a + bx,
as required.
Now, for f ∈ C[0, 1],
|Xf(x)| ≤ 2 | ∫₀^{1/2} f(t) dt | + |x − 1/4| |f(1) − f(0)|
        ≤ 2 ∫₀^{1/2} |f(t)| dt + (3/4)(|f(1)| + |f(0)|)
        ≤ ‖f‖∞ + (3/4) · 2‖f‖∞
        = (5/2) ‖f‖∞.
Hence
‖Xf‖∞ ≤ (5/2)‖f‖∞ ⇒ ‖X‖∞ ≤ 5/2.
Thus, by Theorem 3.1,
‖f − Xf‖∞ ≤ (1 + 5/2) ‖f − p*‖∞,
where p* is the best L∞ approximation to f from P₁, and so
‖f − Xf‖∞ ≤ (7/2) ‖f − p‖∞, for p ∈ P₁.
Remark In fact ‖X‖∞ = 5/2 in this problem, as you can see by
considering, for 0 < ε < 1,
fε(x) = −1 + 2x/ε, 0 ≤ x ≤ ε,
fε(x) = 1,         ε < x ≤ 1.
P4 There is rather more to this question than meets the eye! First, if
p(x) = a + bx + cx², then
p(0) = a, p(1) = a + b + c, p(3) = a + 3b + 9c, p(4) = a + 4b + 16c,
and it is true that
a + 3b + 9c = −(1/2)a + (a + b + c) + (1/2)(a + 4b + 16c).
Now
min_{p∈P₂} max_{0≤x≤4} |f(x) − p(x)| = ‖f − p*‖∞,
[Figure: sketches (a), (b) and (c) of the critical quadratics p on [0, 4] with p(0), p(1), p(4) = ±1.]
As you can easily check, case (c) gives the largest value of ‖p‖∞. In this case
p(x) = 17/8 − (1/2)(x − 5/2)² ⇒ ‖p‖∞ = |p(5/2)| = 17/8.
Hence ‖X‖∞ = 17/8, so that
‖f − p*‖∞ ≥ 0.15/(1 + 17/8) = 0.048.
Two questions remain.
(I) How do we justify taking p(0), p(1), p(4) to be ±1?
(II) Can we in fact obtain the better estimate 0.05?
There are various ways to answer Question (I). For example, we could argue
from basic principles, examining the effects of taking p(0) = 1, |p(1)| < 1,
|p(4)| < 1, and so on. This would be tedious, and would not generalise to
other problems.
More generally, we can use a linear programming argument. This sounds very
grand, but it is really quite a simple idea. We want to maximise
|p(x)| = |a + bx + cx²|, 0 ≤ x ≤ 4,
subject to the constraints
|p(xᵢ)| = |a + bxᵢ + cxᵢ²| ≤ 1,
where x₁ = 0, x₂ = 1, x₃ = 4. Now any equation of the form
Xa + Yb + Zc = k, where X, Y, Z, k are constant, defines a plane in R³.
Hence the above 3 constraints define a parallelepiped P, centred at the origin,
of possible values of (a, b, c) in R³.
The required maximum M of |p(x)| occurs for some (a, b, c) ∈ P and
x₀ ∈ [0, 4], so that
M = max_{(a,b,c)∈P} |a + bx₀ + cx₀²|.
Since x₀ is now fixed, we can find M by moving the plane a + bx₀ + cx₀² = k
as far as possible from the origin, while still meeting P; at this point M = |k|.
Now, however, the plane must pass through at least one vertex of P, so that
|p(xᵢ)| = 1, for i = 1, 2, 3, as required.
We shall see another approach in Chapter 4 which contains a formula for
X ∞ , where X is an interpolation operator from C[a, b] to Pn .
To answer Question (II) we look again at the proof of Theorem 3.1. Using
equation (3.12), we have
0.15 = |f(3) − (X(f))(3)|
     = |(f − p*)(3) − (X(f − p*))(3)|
     ≤ ‖f − p*‖∞ + |(X(f − p*))(3)|.
Now consider the problem of maximising |(Xg)(3)|, for g ∈ C[0, 4], while
keeping ‖g‖∞ constant. Once again this is a linear programming problem, so
that the maximum occurs for g(0) = ±‖g‖∞, g(1) = ±‖g‖∞, g(4) = ±‖g‖∞.
Examining cases (a), (b), (c) given earlier, we find that
|(Xg)(3)| ≤ 2‖g‖∞, g ∈ C[0, 4]
((c) is again the extreme case). Hence, with g = f − p*,
0.15 ≤ ‖f − p*‖∞ + 2‖f − p*‖∞ = 3‖f − p*‖∞,
and so ‖f − p*‖∞ ≥ 0.05, as required.
P5 This one is a little easier! First, since every quadratic spline is differentiable
at points of (−1, 1), we cannot have ‖f − s‖∞ = 0, that is, s(x) = f(x),
−1 ≤ x ≤ 1, because f is not differentiable at 0.
However, we can make ‖f − s‖∞ < ε by defining
s(x) = −x,             −1 ≤ x ≤ −ε,
s(x) = p(x) = a + bx², −ε < x < ε,
s(x) = x,              ε ≤ x ≤ 1.
To guarantee that s is a quadratic spline, we require s and s′ to be continuous
at ±ε; by symmetry it is enough that a + bε² = ε and 2bε = 1, that is,
b = 1/(2ε) and a = ε/2. Then ‖f − s‖∞ = s(0) − f(0) = ε/2 < ε.
[Figure: the graph of y = s(x), which smooths the corner of y = |x| on (−ε, ε).]
Chapter 4 Polynomial interpolation
This chapter begins a detailed investigation of the interpolation of continuous
functions by polynomials. It turns out that the choice of interpolation points
makes a considerable difference to the accuracy of the interpolating
approximation; for example, we see in this chapter that equally-spaced
interpolation points make a rather poor choice. This investigation of interpolation
continues in Chapter 5.
This chapter splits into TWO study sessions:
Study session 1: Sections 4.1 and 4.2.
Study session 2: Sections 4.3 and 4.4.
Commentary
1. Equation (4.2) represents (n + 1) linear equations (one for each interpolation
point) with n + 1 unknowns (the coefficients of the required polynomial).
Theorem 4.1 shows that the corresponding (n + 1) × (n + 1) matrix
( 1  x₀  x₀²  …  x₀ⁿ )
( 1  x₁  x₁²  …  x₁ⁿ )
( ⋮               ⋮  )
( 1  xₙ  xₙ²  …  xₙⁿ )
must be non-singular. In fact, this Vandermonde matrix, as it is called, has
determinant
∏_{0≤i<j≤n} (xⱼ − xᵢ).
2. [Figure: the graph of the Lagrange basis function y = ℓₖ(x), with n = 10 and k = 6, for equally spaced points x₀, …, x₁₀.]
Note that if all the xᵢ are kept fixed except for xₖ and x_{k+1}, which both tend
to a number c, then ℓₖ(x) → ∞ for any x ≠ x₀, x₁, …, x_{k−1}, c, x_{k+2}, …, xₙ.
3. The useful symbol δki in equation (4.11) is called the Kronecker delta.
4. The remarks before the statement of Theorem 4.2 provide a way of
remembering that it is the (n + 1)th derivative of f which appears in the
error formula (4.13).
Self-assessment questions
S1 Powell Exercise 4.1
Commentary
1. Table 4.1. Here is the graph of the Runge example together with its
Lagrange interpolating polynomial p of degree 10.
[Figure: the graphs of y = 1/(1 + x²) and its degree-10 interpolating polynomial y = p(x) at 11 equally spaced points on [−5, 5].]
Notice that p(4.5) ≈ 1.6, as indicated by the 5th entry in the middle column
of Table 4.1. Here is the graph of the corresponding function
prod(x) = ∏_{j=0}^{10} (x − xⱼ).
[Figure: the graph of y = prod(x) on [−5, 5]; the vertical scale is of order 10⁵.]
As you can see, there is a close relationship between prod(x) and the size of
the error function in the above interpolation.
2. Chebyshev polynomials (pronounced Cheby‘shov’ in Russian).
We have
cos θ = cos θ ⇒ T₁(x) = x,
cos 2θ = 2cos²θ − 1 ⇒ T₂(x) = 2x² − 1,
and cos 3θ = 4cos³θ − 3cos θ ⇒ T₃(x) = 4x³ − 3x.
The graphs of these Chebyshev polynomials appear below.
[Figure: the graphs of y = x, y = 2x² − 1 and y = 4x³ − 3x on [−1, 1].]
[Figures: the graph of y = T₁₁(x) on [−1, 1], with the Chebyshev points x₀, …, x₁₀ (x₅ = 0) marked, and the graph of y = cos⁻¹x, which maps [−1, 1] onto [0, π].]
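The cosine addition formulae give the three-term recurrence T_{n+1}(x) = 2xTₙ(x) − T_{n−1}(x), since cos(n + 1)θ + cos(n − 1)θ = 2 cos θ cos nθ. As a sketch, the following Python code generates the coefficients this way (the choice to store them in ascending powers is mine):

    import numpy as np

    def chebyshev_coeffs(n):
        # Coefficients of T_0, ..., T_n in ascending powers of x,
        # via T_{k+1} = 2x T_k - T_{k-1} with T_0 = 1, T_1 = x.
        T = [np.array([1.0]), np.array([0.0, 1.0])]
        for _ in range(2, n + 1):
            a = 2.0 * np.concatenate(([0.0], T[-1]))   # multiply T_k by 2x
            a[: len(T[-2])] -= T[-2]                   # ... and subtract T_{k-1}
            T.append(a)
        return T[: n + 1]

    for k, c in enumerate(chebyshev_coeffs(3)):
        print(k, c)   # recovers T_2 = 2x^2 - 1 and T_3 = 4x^3 - 3x, as above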
If xi are the Chebyshev points for the interval [a, b], defined by (4.28) and
(4.30), and ti are the Chebyshev points for [−1, 1], defined by (4.27), then,
for a ≤ x ≤ b with x = λ + µt,
prod(x) = ∏_{i=0}^n (x − xᵢ)
        = ∏_{i=0}^n ((λ + µt) − (λ + µtᵢ))
        = µⁿ⁺¹ ∏_{i=0}^n (t − tᵢ)
        = µⁿ⁺¹ prod(t),
where the latter product is defined with respect to the tᵢ. Thus
max_{a≤x≤b} |prod(x)| = µⁿ⁺¹ max_{−1≤t≤1} |prod(t)|
                     = ((b − a)/2)ⁿ⁺¹ · (1/2ⁿ)
                     = 2((b − a)/4)ⁿ⁺¹.
For example, with n = 10 and [a, b] = [−5, 5], this maximum is 47 683.7,
which is considerably smaller than the corresponding maximum for
equally-spaced points. Finally, we plot the interpolating polynomial of degree
10 to the Runge example using these Chebyshev interpolation points.
[Figure: the degree-10 interpolating polynomial to the Runge example at the Chebyshev points for [−5, 5].]
This graph confirms the fifth entry in the third column of Table 4.4, which
gives the maximum error in the above interpolation of approximately 0.1.
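The whole comparison is easy to reproduce. The following Python sketch interpolates the Runge example at 11 equally spaced points and at the 11 Chebyshev points on [−5, 5] and prints the maximum error of each; np.polyfit is used only as a convenient interpolation routine, and the evaluation grid is an arbitrary choice.

    import numpy as np

    f = lambda x: 1.0 / (1.0 + x**2)
    n = 10
    x_plot = np.linspace(-5.0, 5.0, 2001)

    def max_error(nodes):
        c = np.polyfit(nodes, f(nodes), n)   # degree-n interpolant (n+1 points)
        return np.max(np.abs(f(x_plot) - np.polyval(c, x_plot)))

    equal = np.linspace(-5.0, 5.0, n + 1)
    i = np.arange(n + 1)
    cheb = 5.0 * np.cos((2 * (n - i) + 1) * np.pi / (2 * (n + 1)))  # (4.27)-(4.30)
    print(max_error(equal))   # roughly 1.9: the large wiggles of Table 4.1
    print(max_error(cheb))    # roughly 0.1, as in Table 4.4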
for any real-valued function φ(x, y), x ∈ X, y ∈ Y. Indeed, for any fixed
x ∈ X, y ∈ Y,
φ(x, y) ≤ sup_{ξ∈X} φ(ξ, y),
so that
sup_{y∈Y} φ(x, y) ≤ sup_{y∈Y} sup_{ξ∈X} φ(ξ, y).
Thus
sup_{x∈X} sup_{y∈Y} φ(x, y) ≤ sup_{y∈Y} sup_{x∈X} φ(x, y),
and equality is seen to hold by taking f(xₖ) = sgn(ℓₖ(x*)), where
Σ_{k=0}^n |ℓₖ(x*)| = max_{a≤x≤b} Σ_{k=0}^n |ℓₖ(x)|.
Note that all the norms in Theorem 4.3 are the ∞-norm ‖·‖∞.
4. Table 4.5. It is natural to ask at what rate the norms in the right-hand
column are increasing. It can be shown that these grow like (2/π) logₑ n.
5. The ‘optimal nodes problem’, to find the interpolating points which minimise
X (see Powell Exercise 4.10) was solved only comparatively recently; see
the remarks in Appendix B. Note that the two independent papers [28] and
[89], referred to in Appendix B, appeared ‘back-to-back’ in the Journal of
Approximation Theory in 1978.
Self-assessment questions
S4 By calculating the interpolating polynomial from P2 to the Runge example,
with suitable interpolation points x0 , x1 , x2 , confirm the first entry in each
column of Table 4.1.
P2 Powell Exercise 4.3 (Hint: find the maxima of |(x − a)(x − b)| and
|(x − a)(x − ½(a + b))(x − b)| on [a, b].)
P3 Powell Exercise 4.4 (Hint: try to find a substitute for the function g of
(4.14).)
P4 Powell Exercise 4.5 (Hint: decide first where the maximum and minimum
gaps occur.)
Solutions to SAQs in Chapter 4
S1 Using equation (4.7),
p(x) = f(0) (x − 1)(x − 2)(x − 3)/((0 − 1)(0 − 2)(0 − 3)) + f(1) (x − 0)(x − 2)(x − 3)/((1 − 0)(1 − 2)(1 − 3))
     + f(2) (x − 0)(x − 1)(x − 3)/((2 − 0)(2 − 1)(2 − 3)) + f(3) (x − 0)(x − 1)(x − 2)/((3 − 0)(3 − 1)(3 − 2))
     = −(1/6) f(0)(x − 1)(x − 2)(x − 3) + (1/2) f(1)x(x − 2)(x − 3)
       − (1/2) f(2)x(x − 1)(x − 3) + (1/6) f(3)x(x − 1)(x − 2).
Hence
p(6) = −10f(0) + 36f(1) − 45f(2) + 20f(3).
If f(x) = (x − 3)³, then f(0) = −27, f(1) = −8, f(2) = −1, f(3) = 0, and so
p(6) = −10(−27) + 36(−8) − 45(−1) + 20(0) = 27.
This is correct since f(6) = 27 and f is a cubic, so the interpolation is exact.
The uncertainty of p(6), if that of each function value is ±ε, is
± Σ_{k=0}^3 ε|ℓₖ(6)| = ±(10ε + 36ε + 45ε + 20ε) = ±111ε.
depends only on the data points and the function values, and so can be
calculated beforehand. Hence
p(x) = ∏_{j=0}^n (x − xⱼ) Σ_{k=0}^n µₖ/(x − xₖ).
S4 First calculate
f(−5) = 1/26, f(0) = 1, f(5) = 1/26.
The unique quadratic p(x) taking these values is of the form p(x) = 1 − ax²,
a > 0. To find a we use
1/26 = 1 − a · 5² ⇒ a = 1/26.
Now, with n = 2 we have x_{3/2} = 5/2 and
f(5/2) = 1/(1 + 25/4) = 4/29 = 0.137 931 034,
p(5/2) = 1 − (1/26)(25/4) = 79/104 = 0.759 615 384.
Thus
f(5/2) − p(5/2) = −0.621 684 35,
and the verification is complete.
Alternatively, Theorem 4.3 can be used. Here is the calculation for
equally-spaced points x₀ = −5, x₁ = 0, x₂ = 5. First
ℓ₀(x) = (x − 0)(x − 5)/((−5 − 0)(−5 − 5)) = (1/50) x(x − 5),
ℓ₁(x) = (x − (−5))(x − 5)/((0 − (−5))(0 − 5)) = −(1/25)(x² − 25),
ℓ₂(x) = (x − (−5))(x − 0)/((5 − (−5))(5 − 0)) = (1/50) x(x + 5).
Now, for 0 ≤ x ≤ 5,
Σ_{k=0}^2 |ℓₖ(x)| = (1/50) x(5 − x) + (1/25)(25 − x²) + (1/50) x(x + 5)
                 = (1/25)(25 + 5x − x²),
and the maximum of this expression occurs when x = 5/2. Hence
max_{0≤x≤5} Σ_{k=0}^2 |ℓₖ(x)| = (1/25)(25 + 5 · 5/2 − (5/2)²) = 5/4.
By symmetry, the maximum of this sum will be the same for −5 ≤ x ≤ 0 and
so ‖X‖∞ = 5/4, as required.
P2 According to Theorem 4.2, the error in interpolating f(x) = cos x over
[kπ/n₁, (k + 1)π/n₁] by p₁ ∈ P₁, such that p₁(kπ/n₁) = f(kπ/n₁) and
p₁((k + 1)π/n₁) = f((k + 1)π/n₁), is at most
max_{kπ/n₁ ≤ x ≤ (k+1)π/n₁} | (1/2)(x − kπ/n₁)(x − (k + 1)π/n₁) | ‖f⁽²⁾‖∞.
Since ‖f⁽²⁾‖∞ ≤ 1 and the maximum of |(x − a)(x − b)| on [a, b] is
((b − a)/2)², we deduce that
‖f − p₁‖∞ ≤ (1/2)(π/(2n₁))² = π²/(8n₁²).
To guarantee that this error is less than 10⁻⁶ it is, therefore, sufficient for n₁
to satisfy
n₁ > 10³π/√8 = 1110.7… .
Again by Theorem 4.2, the error in interpolating f(x) = cos x over
[kπ/n₂, (k + 2)π/n₂] by p₂ ∈ P₂, such that p₂(kπ/n₂) = f(kπ/n₂),
p₂((k + 1)π/n₂) = f((k + 1)π/n₂), p₂((k + 2)π/n₂) = f((k + 2)π/n₂), is at
most
max_{kπ/n₂ ≤ x ≤ (k+2)π/n₂} | (1/6)(x − kπ/n₂)(x − (k + 1)π/n₂)(x − (k + 2)π/n₂) | ‖f⁽³⁾‖∞.
Since ‖f⁽³⁾‖∞ ≤ 1 and the maximum of |(x − a)(x − ½(a + b))(x − b)| on
[a, b] is (2√3/9)((b − a)/2)³, we deduce that
‖f − p₂‖∞ ≤ (1/6) · (2√3/9)(π/n₂)³ = π³/(3^{5/2} n₂³).
To guarantee that this error is less than 10⁻⁶ it is, therefore, sufficient for n₂
to satisfy
n₂ > 10²π/3^{5/6} = 125.7… .
P4 Since
xᵢ = cos([2(n − i) + 1]π/(2(n + 1))), i = 0, 1, …, n,
and the function f(x) = cos x is concave on [0, π/2] and convex on [π/2, π],
the maximum gap occurs in the middle of the range and the minimum gap
occurs at the ends.
If n is even, then the maximum gap (i = n/2, i + 1 = n/2 + 1) is
cos((n − 1)π/(2(n + 1))) − cos((n + 1)π/(2(n + 1))) = 2 sin(nπ/(2(n + 1))) sin(π/(2(n + 1)))
                                                    < 2 sin(π/(2(n + 1)))
                                                    < π/(n + 1),
since sin x < x, for x > 0.
If n is odd, then the maximum gap (i = (n − 1)/2, i + 1 = (n + 1)/2) is
cos(nπ/(2(n + 1))) − cos((n + 2)π/(2(n + 1))) = 2 sin((n + 1)π/(2(n + 1))) sin(π/(2(n + 1)))
                                              = 2 sin(π/(2(n + 1)))
                                              < π/(n + 1).
Since the gap for n + 1 equally-spaced points is 2/n, the desired factor is
indeed less than π/2 in both cases.
The minimum gap (i = n − 1, i + 1 = n) is
cos(π/(2(n + 1))) − cos(3π/(2(n + 1))) = 2 sin(π/(2(n + 1))) sin(π/(n + 1)).
Thus the ratio of the maximum to the minimum gap is
sin(nπ/(2(n + 1)))/sin(π/(n + 1)), n even,
1/sin(π/(n + 1)),                  n odd.
It is evident that
1/sin(π/(n + 1)) > 1/(π/(n + 1)) = (n + 1)/π,
so the required lower estimate clearly holds for n odd. For n even, there is a
little more work to do, since sin(nπ/(2(n + 1))) < 1. However,
sin(nπ/(2(n + 1))) = sin(π/2 − π/(2(n + 1))) = cos(π/(2(n + 1)))
and
sin(π/(n + 1)) = 2 sin(π/(2(n + 1))) cos(π/(2(n + 1))),
so that
sin(nπ/(2(n + 1)))/sin(π/(n + 1)) = 1/(2 sin(π/(2(n + 1)))) > 1/(2 · π/(2(n + 1))) = (n + 1)/π,
as required.
P5 We give a solution along the lines of that used to find ‖X‖∞ in Problem P4,
Chapter 3. Note that Theorem 4.3 cannot be used because we are not
interpolating by a general element of P₃. To find ‖X‖∞, we must find the
maximum on [0, 3] of |p(x)| = |c₀ + c₁x + c₃x³|, where |p(0)| ≤ 1, |p(2)| ≤ 1
and |p(3)| ≤ 1; once again, by the linear programming argument, we need
consider only the cases p(0), p(2), p(3) = ±1. The critical cases are sketched
below.
[Figure: sketches (a), (b) and (c) of the critical cases on [0, 3].]
In fact, case (c) gives the greatest value of |p(x)| in [0, 3]. In this case we have
p(0) = 1, p(2) = 1 and p(3) = −1, so that
p(x) = 1 + (8/15)x − (2/15)x³ ⇒ ‖p‖∞ = p(2/√3) = 1 + 32/(45√3),
as required.
Chapter 5 Divided differences
In Chapter 4 we found that interpolation can provide a good method of
determining a polynomial approximation to a given function. This chapter is
devoted to a good method of calculating such an interpolating polynomial using a
formula due to Newton which involves divided differences.
This chapter splits into TWO study sessions:
Study session 1: Sections 5.1, 5.2 and 5.3.
Study session 2: Sections 5.4 and 5.5.
Commentary
1. The definition of the divided difference given in Section 5.1 makes it clear
that f [x0 , x1 , . . . , xn ] is independent of the order in which the points
x0 , x1 , . . . , xn appear. For example
f [x0 , x1 , x2 , x3 ] = f [x1 , x3 , x0 , x2 ].
2. The remarks at the bottom of page 47 will make more sense after you have
read how to calculate divided differences in Section 5.3.
3. The key features of the Newton formula (5.12) are that, for
k = 0, 1, . . . , n − 1,
(a) the first k + 1 terms comprise the polynomial pk ∈ Pk which interpolates
f at x0 , x1 , . . . , xk ;
(b) the (k + 2)th term is an estimate for the error in the approximation of f
by pk .
If a large number of function values are available, therefore, Newton’s
formula should give better and better approximations to f by choosing more
and more interpolation points. By checking the size of each additional term
calculated, one can decide when further interpolation points are of no help.
The following diagram may help to interpret (5.14) in general.
xj f (xj )
xj+1 f (xj+1 )
..
. f [xj , . . . , xj+k ]
f [xj , . . . , xj+k+1 ]
..
. f [xj+1 , . . . , xj+k+1 ]
xj+k f (xj+k )
xj+k+1 f (xj+k+1 )
The (k + 1)th divided difference is found using the two adjacent terms in the
previous column and the corresponding x values at the ends of the diagonals.
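The scheme of (5.14) is very short to program. This Python sketch builds the divided differences needed in (5.12) and evaluates Newton's formula by nested multiplication; the data are those of SAQ S1 below, so you can check the output against the table there.

    def divided_differences(xs, fs):
        # Returns f[x0], f[x0,x1], ..., f[x0,...,xn], using (5.14);
        # the working column is overwritten in place, order by order.
        n, table = len(xs), list(fs)
        coeffs = [table[0]]
        for k in range(1, n):
            for j in range(n - 1, k - 1, -1):
                table[j] = (table[j] - table[j - 1]) / (xs[j] - xs[j - k])
            coeffs.append(table[k])
        return coeffs

    def newton_eval(xs, coeffs, x):
        # Nested multiplication, as in SAQ S4 of this chapter.
        p = coeffs[-1]
        for c, xk in zip(coeffs[-2::-1], xs[len(coeffs) - 2::-1]):
            p = c + (x - xk) * p
        return p

    xs = [-2.0, -1.0, 2.0, 3.0, 4.0]
    fs = [3.28, 17.36, 14.96, 19.28, 36.16]
    c = divided_differences(xs, fs)
    print(c)                        # [3.28, 14.08, -3.72, 1.0, 0.0]
    print(newton_eval(xs, c, 4.0))  # 36.16, agreeing with the S1 table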
Self-assessment questions
S1 Powell Exercise 5.1
Study Session 2: Numerical considerations and
Hermite interpolation
Commentary
1. The method of interpolation by calculating the coefficients cᵢ, i = 0, 1, …, n,
in p(x) = Σ_{i=0}^n cᵢxⁱ is, of course, convenient for interpolating by very low
degree polynomials, where the coefficients can be found exactly. For higher
degree polynomials, however, it is difficult to calculate the coefficients with
sufficient accuracy because the corresponding matrix equation may be
ill-conditioned.
2. The assertion at the bottom of page 54 that p is a multiple of
∏_{i=0}^m (x − xᵢ)^{ℓᵢ+1} is true because
p⁽ʲ⁾(xᵢ) = 0, j = 0, 1, …, ℓᵢ, i = 0, 1, …, m.
This implies that the Taylor expansion of p about each xᵢ begins
p(x) = [p^{(ℓᵢ+1)}(xᵢ)/(ℓᵢ + 1)!] (x − xᵢ)^{ℓᵢ+1} + ⋯,
so that (x − xᵢ)^{ℓᵢ+1} is a factor of p(x), for each i.
3. The word ‘suitable’ at the top of page 56 can be interpreted to mean ‘valid’.
4. The proof of Theorem 5.5 shows that Hermite interpolation is the limiting
case of Newton’s formula (5.12), which is obtained when various adjacent
interpolation points merge together.
Self-assessment questions
S3 Verify equation (5.19).
S4 Calculate the value p(1.8) given by (5.20) and confirm the value p(1.8) given
by (5.21). (Evaluate these polynomials by nested multiplication:
a0 + a1 x + a2 x2 + · · · + an xn = a0 + x(a1 + · · · + x(an−1 + an x) · · ·).)
S5 Verify that the polynomial (5.29) satisfies the last two interpolation
conditions in (5.28).
Solutions to SAQs in Chapter 5
S1 The table is as follows.
xᵢ     f(xᵢ)    Order 1    Order 2    Order 3    Order 4
−2     3.28
                14.08
−1     17.36               −3.72
                −0.8                  1.0
 2     14.96                1.28                 0
                4.32                  1.0
 3     19.28                6.28
                16.88
 4     36.16
Thus
p4 (x) = 3.28 + 14.08(x + 2) − 3.72(x + 2)(x + 1) + (x + 2)(x + 1)(x − 2)
and so
p4 (4) = 3.28 + 14.08 × 6 − 3.72 × 6 × 5 + 6 × 5 × 2 = 36.16,
as expected. Note that p4 (x) = p3 (x) in this example.
S2 The required formula for pₙ′(x₀) follows from (5.12) by noting that, for
k = 1, 2, …, n − 1,
d/dx [(x − x₀)⋯(x − xₖ)] = (x − x₁)⋯(x − xₖ) + (x − x₀) d/dx [(x − x₁)⋯(x − xₖ)]
and so
d/dx [(x − x₀)⋯(x − xₖ)] |_{x=x₀} = (x₀ − x₁)⋯(x₀ − xₖ).
Thus
p′(2) = f[2, 3] + (2 − 3)f[2, 3, 4] + (2 − 3)(2 − 4)f[2, 3, 4, −1]
      + (2 − 3)(2 − 4)(2 + 1)f[2, 3, 4, −1, −2].
By Comment 1 on page 47 and the above table,
f [2, 3] = f [3, 2] = 4.32,
f [2, 3, 4] = f [4, 3, 2] = 6.28,
f [2, 3, 4, −1] = f [4, 3, 2, −1] = 1,
f [2, 3, 4, −1, −2] = f [4, 3, 2, −1, −2] = 0.
Thus
p′(2) = 4.32 + (2 − 3) × 6.28 + (2 − 3)(2 − 4) × 1 + 0 = 0.04.
The ordering x0 = 2, x1 = −1, x2 = 3, x3 = −2, x4 = 4 also allows us to
obtain the divided differences
f [2, −1], f [2, −1, 3], f [2, −1, 3, −2], f [2, −1, 3, −2, 4]
without compiling a fresh table. With this ordering,
p′(2) = f[2, −1] + (2 + 1)f[2, −1, 3] + (2 + 1)(2 − 3)f[2, −1, 3, −2]
      + (2 + 1)(2 − 3)(2 + 2)f[2, −1, 3, −2, 4]
      = −0.8 + 3 × 1.28 + 3 × (−1) × 1 + 0
      = 0.04,
once again.
S3 In exact arithmetic
p(1.8) = 0.0823 − 0.2 × 0.236 33 + 0.2 × 0.17 × 0.329
− 0.2 × 0.17 × 0.1 × 0.328 87 + 0.2 × 0.17 × 0.1 × 0.04 × 0.5008
= 0.0823 − 0.047 266 + 0.011 186 − 0.001 118 158 + 0.000 068 108 8
= 0.045 169 950 8.
(Note that using nested multiplication here may lose you a couple of digits at
the end.)
S4 Using (5.20),
p(1.8) = 6.700 98 + 1.8(−13.360 21 + 1.8(10.3856 + 1.8(−3.692 41 + 1.8 × 0.502 72)))
= 0.045 164 35,
which agrees with the data to 4 places of decimals. Using (5.21),
p(1.8) = 6.701 + 1.8(−13.36 + 1.8(10.386 + 1.8(−3.6924 + 1.8 × 0.502 72)))
= 0.046 916 672,
which agrees with the data to only 2 places of decimals.
S5 Since 1.8 − 1.6 = 0.2, 1.8 − 1.7 = 0.1 and 1.8 − 1.8 = 0,
p(1.8) = 0.082 297 + 0.2(−0.246 892 + 0.2(0.335 92 + 0.1 × (−0.297 35)))
= 0.045 166,
as required. Now
d/dx (x − 1.6)² = 2(x − 1.6),
d/dx (x − 1.6)²(x − 1.7) = 2(x − 1.6)(x − 1.7) + (x − 1.6)²,
d/dx (x − 1.6)²(x − 1.7)(x − 1.8) = 2(x − 1.6)(x − 1.7)(x − 1.8) + (x − 1.6)²(2x − 3.5).
Hence
p′(1.8) = −0.246 892 + 0.335 92 × 2 × 0.2
        − 0.297 35 × (2 × 0.2 × 0.1 + 0.2 × 0.2)
        + 0.203 75 × (0.2 × 0.2 × 0.1)
        = −0.135 497,
as required.
Remark In more complicated examples, one might use a more systematic
approach to calculate p′(x), where
p(x) = a₀ + (x − x₀)(a₁ + (x − x₁)(a₂ + ⋯ (x − x_{n−1})(aₙ + a_{n+1}(x − xₙ)) ⋯)).
Put
p(x) = q₀(x) = a₀ + (x − x₀)q₁(x) = a₀ + (x − x₀)(a₁ + (x − x₁)q₂(x)) = ⋯,
so that
qₙ(x) = aₙ + a_{n+1}(x − xₙ) and q_{n+1}(x) = a_{n+1}.
Then
qₖ(x) = aₖ + (x − xₖ)q_{k+1}(x)
and so
qₖ′(x) = q_{k+1}(x) + (x − xₖ)q_{k+1}′(x).
Hence, by induction,
p′(x) = q₁(x) + (x − x₀)(q₂(x) + ⋯ (qₙ(x) + (x − x_{n−1})q_{n+1}(x)) ⋯).
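In Python the scheme of this Remark amounts to running the nested multiplication once for p and once for p′, in step. A sketch (the data are those of SAQs S1 and S2 of this chapter):

    def newton_value_and_derivative(xs, a, x):
        # p(x) = a0 + (x - x0)(a1 + (x - x1)(a2 + ...)); returns (p(x), p'(x)),
        # using q_k' = q_{k+1} + (x - x_k) q_{k+1}' from the Remark above.
        p, dp = a[-1], 0.0
        for ak, xk in zip(a[-2::-1], xs[len(a) - 2::-1]):
            dp = p + (x - xk) * dp
            p = ak + (x - xk) * p
        return p, dp

    xs = [-2.0, -1.0, 2.0, 3.0, 4.0]
    a = [3.28, 14.08, -3.72, 1.0, 0.0]   # divided differences from SAQ S1
    print(newton_value_and_derivative(xs, a, 2.0))  # (14.96, 0.04), as in SAQ S2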
Solutions to Problems in Chapter 5
P1 First note that with the given values of xᵢ, we have
∏_{j=0, j≠k}^n (xₖ − xⱼ) = k! hᵏ (n − k)! (−h)^{n−k} = (−1)^{n−k} hⁿ k!(n − k)!,
so that
f[x₀, x₁, …, xₙ] = h⁻ⁿ Σ_{k=0}^n (−1)^{n−k} f(xₖ)/(k!(n − k)!).
To verify that this formula is consistent with Theorem 5.3 we note that
f[xⱼ, …, x_{j+k+1}] = h^{−k−1} Σ_{i=0}^{k+1} (−1)^{k+1−i} f(x_{i+j})/(i!(k + 1 − i)!),
f[xⱼ, …, x_{j+k}] = h⁻ᵏ Σ_{i=0}^k (−1)^{k−i} f(x_{i+j})/(i!(k − i)!),
f[x_{j+1}, …, x_{j+k+1}] = h⁻ᵏ Σ_{i=0}^k (−1)^{k−i} f(x_{i+j+1})/(i!(k − i)!)
and
x_{j+k+1} − xⱼ = (k + 1)h.
Now, the coefficient of f(x_{i+j}) in
(f[x_{j+1}, …, x_{j+k+1}] − f[xⱼ, …, x_{j+k}]) / (x_{j+k+1} − xⱼ)
is
(1/((k + 1)h^{k+1})) [ (−1)^{k−(i−1)}/((i − 1)!(k − (i − 1))!) − (−1)^{k−i}/(i!(k − i)!) ]
= ((−1)^{k+1−i}/((k + 1)h^{k+1})) [ 1/((i − 1)!(k + 1 − i)!) + 1/(i!(k − i)!) ]
= (−1)^{k+1−i}/(h^{k+1} i!(k + 1 − i)!),
which is the coefficient of f(x_{i+j}) in f[xⱼ, …, x_{j+k+1}]. Hence the recurrence
relation (5.14) does indeed hold.
P2 The table can be reconstructed from the first entry in each column using
(5.14) in the form
f [xj+1 , . . . , xj+k+1 ] = f [xj , . . . , xj+k ] + (xj+k+1 − xj )f [xj , . . . , xj+k+1 ].
For example,
f [1.8, 1.76, 1.7, 1.63] = f [1.76, 1.7, 1.63, 1.6] + (1.8 − 1.6)f [1.8, 1.76, 1.7, 1.63, 1.6]
= −0.328 87 + 0.2 × 0.500 80
= −0.228 71,
f [1.76, 1.7, 1.63] = f [1.7, 1.63, 1.6] + (1.76 − 1.6)f [1.76, 1.7, 1.63, 1.6]
= 0.329 + 0.16 × (−0.328 87)
= 0.276 380 8,
f [1.8, 1.76, 1.7] = f [1.76, 1.7, 1.63] + (1.8 − 1.63)f [1.8, 1.76, 1.7, 1.63]
= 0.276 380 8 + 0.17 × (−0.228 71)
= 0.237 500 1.
Continuing in this manner, we find
f [1.7, 1.63] = −0.236 33 + (1.7 − 1.6) × 0.329 = −0.203 43,
f [1.76, 1.7] = −0.203 43 + (1.76 − 1.63) × 0.276 380 8 = −0.167 500 49,
f [1.8, 1.76] = −0.167 500 49 + (1.8 − 1.7) × 0.237 500 1 = −0.143 750 48,
f [1.63] = 0.082 30 + (1.63 − 1.6) × (−0.236 33) = 0.075 210 1,
f [1.7] = 0.075 210 1 + (1.7 − 1.63) × (−0.203 43) = 0.060 97,
f [1.76] = 0.060 97 + (1.76 − 1.7) × (−0.167 500 49) = 0.050 919 97,
f [1.8] = 0.050 919 97 + (1.8 − 1.76) × (−0.143 750 48) = 0.045 169 951.
P5 The difference table is as follows.
xi f (xi ) Order 1 Order 2 Order 3
0.0 0.0
0.119 778
0.1 0.119 778 0.009 57
0.129 348 0.000 018
0.2 0.249 126 0.009 588
0.138 936 −0.002 982
0.3 0.388 062 0.006 606
0.145 542 0.009 015
0.4 0.533 604 0.015 621
0.161 163 −0.008 982
0.5 0.694 767 0.006 639
0.167 802 0.003 013
0.6 0.862 569 0.009 652
0.177 454 0.000 005
0.7 1.040 023 0.009 657
0.187 111 0.000 041
0.8 1.227 134 0.009 698
0.196 809 −0.000 015
0.9 1.423 943 0.009 683
0.206 492
1.0 1.630 435
The second-order differences are irregular. Most noticeably, the numbers
0.006 606, 0.015 621, 0.006 639 are substantially different from the other
entries, and this can be traced to an error in the value of f (0.4). Indeed, if the
above value of f (0.4) is increased by ε, then the increases in the second-order
differences are ε, −2ε, ε respectively, so that their average remains constant at
(0.006 606 + 0.015 621 + 0.006 639)/3 = 0.009 622.
For the middle one of these three differences to equal 0.009 622, we require
2ε = 0.015 621 − 0.009 622, that is, ε = 0.002 999 5. It appears likely,
therefore, that f (0.4) should actually be 0.536 604. (Alternatively: note that
the increases in the corresponding third-order differences are ε, −3ε, 3ε, −ε,
so that ε ≈ 0.003.)
Once this error is corrected, notice that the second-order differences are
increasing, apart from the last one, and that the increase from 0.009 657 to
0.009 698 is abnormally large. This can be traced to an error in the value of
f (0.8). Once again an increase of ε in the above value of f (0.8) leads to
increases in the last three second-order differences of ε, −2ε, ε respectively.
The average of these differences is 0.009 679 3̇, which suggests that
2ε = 0.009 698 − 0.009 679 3̇, that is, ε = 0.000 009 3̇. It appears likely,
therefore, that f (0.8) should actually be 1.227 143.
Once these corrections are made, the third-order differences are (giving only
the significant digits)
18, 18, 15, 18, 13, 14, 14, 12,
which is quite regular, since the final digit of the data is probably rounded.
Chapter 6 The uniform convergence of
polynomial approximations
Chapter 6 begins the detailed study of uniform approximation, that is,
approximation of functions in the ∞-norm. The chapter is almost entirely
devoted to the proof of Weierstrass’ approximation theorem, which states that if f
is continuous on [a, b] and ε > 0 is given, then there is a polynomial p such that
‖p − f‖∞ ≤ ε. The degree of p is not fixed in advance and will in general depend
on ε and f.
In this chapter we make frequent use of the Binomial Theorem
(x + y)ⁿ = Σ_{k=0}^n C(n, k) xᵏ y^{n−k},
especially the cases x + y = 1 and x = y = 1. We also require Stirling's
asymptotic formula, n! ∼ √(2πn)(n/e)ⁿ, in the problems.
This chapter splits into TWO study sessions:
Study session 1: Sections 6.1 and 6.2.
Study session 2: Sections 6.3 and 6.4.
Commentary
1. Weierstrass’ Theorem (proved in 1885) is probably the best known result in
approximation theory. Nevertheless, it is still a little surprising, since a
function f can be very badly behaved and yet be continuous. For example,
there are many functions f which are continuous but nowhere differentiable.
One such function (called the ‘blancmange function’ on account of the shape
of its graph) is given by
f(x) = Σ_{n=0}^∞ (1/2ⁿ) φ(2ⁿx), 0 ≤ x ≤ 1,
where φ(x) = |x|, for |x| ≤ 1/2, and φ(x + n) = φ(x), for n = 0, ±1, ±2, … .
The first example of such a continuous, nowhere differentiable function is due
to Weierstrass himself (1872).
3. Note that the assertion in (6.9) uses the fact that a continuous function on a
closed interval [a, b] is uniformly continuous (see commentary on Chapter 1).
Self-assessment questions
S1 Verify the simpler form of the definition of a monotone linear operator L,
given in Section 6.2 (first paragraph).
Study Session 2: The Bernstein operator
Commentary
1. The expression n!/(k!(n − k)!) is the binomial coefficient, commonly written
C(n, k) or ⁿCₖ.
Self-assessment questions
S4 Verify the identity (6.27).
S6 Prove that if f is continuous on [a, b], then its modulus of continuity ω satisfies
(a) ω is increasing;
(b) \lim_{δ→0} ω(δ) = 0;
(c) ω(δ1 + δ2 ) ≤ ω(δ1 ) + ω(δ2 ).
P3 Powell Exercise 6.5 (Hint: you will need to use Stirling's formula
n! ∼ \sqrt{2πn} (n/e)^n.)
Solutions to SAQs in Chapter 6
S1 If L is linear and satisfies
(Lf )(x) ≥ 0, a ≤ x ≤ b,
whenever
f (x) ≥ 0, a ≤ x ≤ b,
then
(Lf )(x) − (Lg)(x) = (L(f − g))(x) ≥ 0, a ≤ x ≤ b,
whenever
f (x) ≥ g(x), a ≤ x ≤ b,
as required.
S2 Case 1 If |x − ξ| ≤ δ, then
q_u(x) ≥ f(ξ) + ε ≥ f(x),
by (6.9).
Case 2 If |x − ξ| > δ, then
q_u(x) ≥ f(ξ) + ε + 2‖f‖∞
≥ ε + ‖f‖∞   (since ‖f‖∞ ≥ −f(ξ))
≥ f(x)   (since ‖f‖∞ ≥ f(x)).
Thus (6.11) holds in either case.
S5 (a) By inspection, the largest value of |x^2 − y^2| when |x − y| ≤ δ, x, y ∈ [0, 1],
occurs for x = 1, y = 1 − δ. Thus
ω(δ) = 1 − (1 − δ)^2 = 2δ − δ^2.
(b) By inspection, the largest value of ||x − 1/2| − |y − 1/2|| when
|x − y| ≤ δ ≤ 1/2, x, y ∈ [0, 1], occurs for x = 1/2, y = 1/2 + δ. When
1/2 < δ ≤ 1, the largest value occurs for x = 1/2, y = 1. Thus
ω(δ) = δ for 0 < δ ≤ 1/2, and ω(δ) = 1/2 for 1/2 < δ ≤ 1.
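These formulas are easy to check by brute force. The following Python sketch
(the grid size is an arbitrary accuracy choice) estimates ω(δ) for f(x) = x^2 on
[0, 1] and compares it with 2δ − δ^2.

    import numpy as np

    x = np.linspace(0, 1, 1001)
    f = x**2

    def omega(delta):
        # Largest |f(s) - f(t)| over grid pairs with |s - t| <= delta.
        diffs = np.abs(f[None, :] - f[:, None])
        close = np.abs(x[None, :] - x[:, None]) <= delta
        return diffs[close].max()

    for d in (0.1, 0.25, 0.5):
        print(d, omega(d), 2*d - d*d)   # the two values agree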
Solutions to Problems in Chapter 6
P1 The proof is a generalisation of that of (6.27):
\sum_{k=0}^{n} \binom{n}{k} \left(\frac{k}{n}\right)^3 x^k (1 − x)^{n−k}
= \sum_{k=1}^{n} \binom{n−1}{k−1} \left(\frac{k}{n}\right)^2 x^k (1 − x)^{n−k}
= \frac{1}{n^2} \sum_{k=1}^{n} \binom{n−1}{k−1} x^k (1 − x)^{n−k} ((k − 1)(k − 2) + 3(k − 1) + 1)
= \frac{(n−1)(n−2)}{n^2} x^3 \sum_{k=3}^{n} \binom{n−3}{k−3} x^{k−3}(1 − x)^{n−k}
   + \frac{3(n−1)}{n^2} x^2 \sum_{k=2}^{n} \binom{n−2}{k−2} x^{k−2}(1 − x)^{n−k}
   + \frac{1}{n^2} x \sum_{k=1}^{n} \binom{n−1}{k−1} x^{k−1}(1 − x)^{n−k}
= \frac{(n−1)(n−2)}{n^2} x^3 + \frac{3(n−1)}{n^2} x^2 + \frac{1}{n^2} x,
by the Binomial Theorem.
The method generalises to x^r, for n > r, because we can always write k^{r−1} as
a linear combination of the expressions
(k − 1) . . . (k − r + 1), (k − 1) . . . (k − r + 2), . . . , (k − 1)(k − 2), (k − 1), 1.
Note that if n ≤ r then B_n f, being a polynomial of degree at most n, is automatically in P_r.
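The identity just proved can be tested numerically; in the following Python
sketch the degree n = 12 and the grid of x-values are arbitrary choices.

    import numpy as np
    from math import comb

    n = 12
    x = np.linspace(0, 1, 7)

    # Left-hand side: (B_n f)(x) for f(t) = t**3.
    lhs = sum(comb(n, k) * (k / n)**3 * x**k * (1 - x)**(n - k)
              for k in range(n + 1))
    # Right-hand side of the identity established in P1.
    rhs = ((n - 1) * (n - 2) * x**3 + 3 * (n - 1) * x**2 + x) / n**2

    print(np.allclose(lhs, rhs))    # True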
P2 By (6.23),
p(j/6) = \sum_{k=0}^{6} \binom{6}{k} (j/6)^k (1 − j/6)^{6−k} f_k,
where f_k = f(k/6), k = 0, 1, . . . , 6.
Thus, with the given values of p(j/6), j = 0, 1, . . . , 6, we have 0 = p(0) = f_0
and 0 = p(1) = f_6, so that
0 = p(1/6) = \frac{1}{6^6} (6·5^5 f_1 + 15·5^4 f_2 + 20·5^3 f_3 + 15·5^2 f_4 + 6·5 f_5),
0 = p(1/3) = \frac{1}{6^6} (6·4^5·2 f_1 + 15·4^4·2^2 f_2 + 20·4^3·2^3 f_3 + 15·4^2·2^4 f_4 + 6·4·2^5 f_5),
1 = p(1/2) = \frac{1}{2^6} (6 f_1 + 15 f_2 + 20 f_3 + 15 f_4 + 6 f_5),
0 = p(2/3) = \frac{1}{6^6} (6·2^5·4 f_1 + 15·2^4·4^2 f_2 + 20·2^3·4^3 f_3 + 15·2^2·4^4 f_4 + 6·2·4^5 f_5),
0 = p(5/6) = \frac{1}{6^6} (6·5 f_1 + 15·5^2 f_2 + 20·5^3 f_3 + 15·5^4 f_4 + 6·5^5 f_5),
which reduce to
3750f1 + 1875f2 + 500f3 + 75f4 + 6f5 = 0, (1)
48f1 + 60f2 + 40f3 + 15f4 + 3f5 = 0, (2)
6f1 + 15f2 + 20f3 + 15f4 + 6f5 = 64, (3)
3f1 + 15f2 + 40f3 + 60f4 + 48f5 = 0, (4)
6f1 + 75f2 + 500f3 + 1875f4 + 3750f5 = 0. (5)
By considering (1)–(5) and (2)–(4), we find that f1 − f5 = 0 and f2 − f4 = 0.
Thus equations (1) to (5) further reduce to
3756f1 + 1950f2 + 500f3 = 0, (6)
12f1 + 30f2 + 20f3 = 64, (7)
51f1 + 75f2 + 40f3 = 0. (8)
Eliminating f3 first from (6), (7) and then from (7), (8) gives
3456f1 + 1200f2 = −1600,
27f1 + 15f2 = −128,
and so
1296f1 = 8640 ⇒ f1 = f5 = 20/3.
Substituting back, we obtain f2 = f4 = −308/15 and f3 = 30, as required.
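Alternatively, equations (1)–(5) can be handed directly to a linear solver, which
confirms the elimination above.

    import numpy as np

    M = np.array([[3750, 1875, 500,   75,    6],
                  [  48,   60,  40,   15,    3],
                  [   6,   15,  20,   15,    6],
                  [   3,   15,  40,   60,   48],
                  [   6,   75, 500, 1875, 3750]], dtype=float)
    rhs = np.array([0, 0, 64, 0, 0], dtype=float)

    f1, f2, f3, f4, f5 = np.linalg.solve(M, rhs)
    print(f1, f2, f3)    # 20/3, -308/15, 30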
P3 Since f(1/2) = 0, the error in question is
(B_n f)(1/2) − f(1/2) = \sum_{k=0}^{n} \binom{n}{k} \left(\frac{1}{2}\right)^n \left|\frac{k}{n} − \frac{1}{2}\right|,
and
2 \sum_{k=0}^{n/2−1} \binom{n−1}{k} = \sum_{k=0}^{n−1} \binom{n−1}{k} = 2^{n−1},
P4 Each of the functions
φ_{nk}(x) = \binom{n}{k} x^k (1 − x)^{n−k},   0 ≤ x ≤ 1,
is positive for 0 < x < 1 and vanishes for x = 0, 1 (unless k = 0 or k = n,
respectively). Also, for 0 < k < n,
φ′_{nk}(x) = \binom{n}{k} (k x^{k−1}(1 − x)^{n−k} − (n − k) x^k (1 − x)^{n−k−1})
= \binom{n}{k} x^{k−1}(1 − x)^{n−k−1} (k(1 − x) − (n − k)x)
= \binom{n}{k} x^{k−1}(1 − x)^{n−k−1} (k − nx),
so that φ_{nk} has a unique turning point in (0, 1), namely, a maximum for
k − nx = 0, that is, x = k/n. This maximum value is
φ_{nk}(k/n) = \binom{n}{k} \left(\frac{k}{n}\right)^k \left(1 − \frac{k}{n}\right)^{n−k}.
If we keep k fixed while n → ∞, then by Stirling's formula,
φ_{nk}(k/n) ∼ \frac{\sqrt{2πn}(n/e)^n}{k!\,\sqrt{2π(n−k)}\,((n−k)/e)^{n−k}} \left(\frac{k}{n}\right)^k \left(\frac{n−k}{n}\right)^{n−k}
= \sqrt{\frac{n}{n−k}}\,\frac{k^k}{k!\,e^k}
∼ \frac{k^k}{k!\,e^k}   as n → ∞.
Note, however, that k/n → 0 as n → ∞, so in this case the peak of height
e^{−k} k^k / k! moves towards the y-axis.
On the other hand, if ξ = k/n remains fixed while n (and hence k) tends to
infinity, then the width of the peak becomes narrower. Indeed, if η ≠ ξ, then
\frac{φ_{nk}(η)}{φ_{nk}(ξ)} = \frac{η^k (1 − η)^{n−k}}{ξ^k (1 − ξ)^{n−k}} = α,
say, where 0 < α < 1, because ξ is the maximum point of φ_{nk}. Now consider
the sequences n_p = pn and k_p = pk, where p is a positive integer. Then
\frac{φ_{n_p k_p}(η)}{φ_{n_p k_p}(ξ)} = \frac{η^{pk} (1 − η)^{pn−pk}}{ξ^{pk} (1 − ξ)^{pn−pk}} = α^p.
Thus
\lim_{p→∞} \frac{φ_{n_p k_p}(η)}{φ_{n_p k_p}(ξ)} = 0,
and so the width of the peak at ξ = k/n must tend to 0 as p → ∞. The
height of this peak is in fact
φ_{n_p k_p}(ξ) = \binom{pn}{pk} ξ^{pk} (1 − ξ)^{pn−pk} ∼ \frac{1}{\sqrt{2πpk(1 − ξ)}}   as p → ∞,
by a further application of Stirling's formula.
These properties of the graphs y = φnk (x) are illustrated below in various
special cases.
[Figure: graphs of y = φ_{nk}(x) on [0, 1] for (n, k) = (3, 1), (6, 1), (6, 2), (9, 1) and (9, 3), with maxima at x = k/n = 1/9, 1/6 and 1/3.]
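Both limiting regimes described above can be observed numerically; in the
following Python sketch the values of n, k and p are arbitrary test choices.

    from math import comb, exp, pi, sqrt

    def phi_nk(n, k, x):
        return comb(n, k) * x**k * (1 - x)**(n - k)

    # k = 1 fixed, n large: the peak height tends to k**k/(k! e**k) = 1/e.
    for n in (10, 100, 1000):
        print(n, phi_nk(n, 1, 1 / n), 1 / exp(1))

    # xi = k/n = 1/3 fixed: the height behaves like 1/sqrt(2*pi*p*k*(1 - xi)).
    n, k = 3, 1
    for p in (1, 10, 100):
        print(p, phi_nk(p * n, p * k, k / n),
              1 / sqrt(2 * pi * p * k * (1 - k / n)))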
Chapter 7 The theory of minimax
approximation
In Chapter 7 we consider the problem of approximating a given function
f ∈ C[a, b] by polynomials of fixed degree n in the ∞-norm. The polynomial
which best approximates f in this respect can be characterised rather elegantly
and is in fact unique. The theory can be extended to other linear spaces of
approximating functions which satisfy a criterion known as the ‘Haar condition’.
For conciseness we shall use the abbreviation b.m.a. for ‘best minimax
approximation’.
This chapter splits into TWO study sessions:
Study session 1: Sections 7.1 and 7.2.
Study session 2: Sections 7.3 and 7.4.
Commentary
1. The parameter θ in (7.2) may appear superfluous at first sight, but its rôle
becomes clear in Section 7.2.
2. The following diagram may clarify the final paragraph of Section 7.1.
[Figure: graphs of f + g and of the straight line p∗.]
The function p∗ is the b.m.a. from P1 both to f and to f + g, but the b.m.a.
from P1 to g is not the zero function.
3. The letter used in Section 7.2 to denote the set where the extreme values of
the error function occur is a script Z (Z) with subscript M . Here one must
interpret ‘extreme values of e∗ ’ to mean ‘maximum values of |e∗ |’.
4. The result in the first paragraph of Section 7.2 can be summarised as follows:
if p∗ is not a b.m.a. from A to f , then there exists p in A such that
sgn(e∗ (x)) = sgn(p(x)), x ∈ ZM ,
that is,
e∗ (x)p(x) > 0, x ∈ ZM .
The converse result is:
if p∗ is in A, e∗ = f − p∗ and there exists p in A such that
e∗ (x)p(x) > 0, x ∈ ZM ,
then there exists θ > 0 such that
‖f − (p∗ + θp)‖∞ < ‖f − p∗‖∞,
so that p∗ is not a b.m.a. from A to f .
This converse result is the special case of Theorem 7.1 in which Z = [a, b].
5. The proof of Theorem 7.1 is quite subtle. At a first reading you would do
well to assume that Z = [a, b]. The following diagram (based on the third
part of Figure 7.1) may clarify the rôles played by p, ZM , Z0 and d.
[Figure: graphs of f, p∗ and p, and of the error e∗, with the sets ZM and Z0 and the quantity d marked.]
Self-assessment questions
S1 Sketch a diagram like the above, which corresponds to the second part of
Figure 7.1.
S2 Can the constant 1/2 in (7.13) be replaced by 1?
Commentary
1. The relationship between conditions (1), (2), (3) and (4) is examined in
detail in Appendix A, which will not be assessed. Note, however, that the
equivalence of (1) and (4) is straightforward. Indeed, if {φi : i = 0, 1, . . . , n}
is a basis of A, then:
(a) the function f = \sum_{i=0}^{n} λ_i φ_i in A is identically zero if and only if
λ = (λ_0, λ_1, . . . , λ_n) = (0, . . . , 0);
(b) the function f = \sum_{i=0}^{n} λ_i φ_i in A has zeros at ξ_j, j = 0, 1, . . . , n, if and only
if
\sum_{i=0}^{n} λ_i φ_i(ξ_j) = 0,   j = 0, 1, . . . , n,
that is,
Pλ = 0, (∗)
where P is the matrix with entries φ_i(ξ_j);
(c) equation (∗) has the unique solution λ = 0 if and only if P is
non-singular.
To verify that (4) holds for a given space A we need to check that the matrix
P is non-singular for every set {ξj : j = 0, 1, . . . , n} of distinct points in [a, b]
where {φi : i = 0, 1, . . . , n} is some basis of A. (See SAQ S4 and Powell
Exercise 7.8.)
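For a concrete space A this check can at least be explored numerically. The
following Python sketch samples random pairs of distinct points for the space of
SAQ S4(a), spanned by 1 and cos x on [0, π]; such a random search is evidence,
of course, not a proof.

    import numpy as np

    rng = np.random.default_rng(0)
    worst = np.inf
    for _ in range(10000):
        xi = np.sort(rng.uniform(0.0, np.pi, 2))
        P = np.array([[1.0, np.cos(xi[0])],
                      [1.0, np.cos(xi[1])]])
        # Normalise by the separation so near-coincident points do not dominate.
        worst = min(worst, abs(np.linalg.det(P)) / (xi[1] - xi[0]))
    print(worst)    # strictly positive: det vanishes only as the points merge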
5. In the sentence following the proof of Theorem 7.3, ‘C[a, b]’ should be ‘[a, b]’.
6. Theorem 7.4, and the discussion following it, indicate how to find the b.m.a.
from Pn to the discrete data {(ξi , f (ξi )) : i = 0, 1, . . . , n}. The equations
(7.27) are fundamental to the exchange algorithm, which is discussed in
Chapter 8. The matrix of the system (7.27) is non-singular because if a linear
mapping from R^{n+1} to R^{n+1} is onto, then it is also one–one.
7. Theorem 7.6 can be proved without the rather awkward Theorem 7.5 and
condition (3), using the more direct method of Powell Exercise 7.6. However,
Theorem 7.5 is needed for Theorem 7.7 (see also page 113).
Self-assessment questions
S3 Explain why Pn satisfies Haar conditions (1) and (2).
S4 Determine whether the following linear spaces satisfy the Haar condition:
(a) the space A spanned by φ0 (x) = 1, φ1 (x) = cos x on [0, π];
(b) the space A spanned by φ0 (x) = 1, φ1 (x) = cos x on [π/2, 3π/2].
S5 Verify that p∗(x) = x^2 + 1/8 is the b.m.a. from P3 to f(x) = |x| on [−1, 1]
(cf. Chapter 3, SAQ S3).
[Figure: graphs of p∗ and of the error e∗, with the sets ZM and Z0 marked.]
has degree k ≤ n and changes sign precisely at the points ζj . Moreover, the
function p(x) = 1 lies in Pn and has no zeros in [a, b].
S4 (a) To verify Haar condition (4) we need to show that, for distinct ξ_1, ξ_2 in
[0, π], the matrix
( 1   cos ξ_1 )
( 1   cos ξ_2 )
is non-singular, that is, cos ξ_1 ≠ cos ξ_2. But this is evident, since cos is
one–one on [0, π], because it is strictly decreasing on [0, π].
(b) Condition (4) fails in this case because the above matrix is singular if, for
example, ξ0 = π/2, ξ1 = 3π/2, so cos ξ0 = 0 = cos ξ1 . Also, consideration
of φ1 (x) = cos x at ξ0 , ξ1 shows that condition (1) is false.
S5 By Theorem 7.2, it is sufficient to note that {−1, −1/2, 0, 1/2, 1} is an alternating
set of length 5 (= 3 + 2) for f(x) − p∗(x), with h = −1/8. Note that p∗ is also
a b.m.a. from P2 to f (suitable alternating sets are either {−1, −1/2, 0, 1/2} or
{−1/2, 0, 1/2, 1}).
[Figure: graphs of f and p∗ on [0, 1], with the interior extreme point α marked.]
Thus we have
f(0) − p∗(0) = sin 0 − a = h, (1)
f(α) − p∗(α) = sin(πα/2) − a − bα = −h, (2)
f(1) − p∗(1) = sin(π/2) − a − b = h, (3)
where α is a solution of
e′(x) = \frac{π}{2} cos\frac{πx}{2} − b = 0.
Since (1) and (3) imply that b = 1, we deduce that
α = \frac{2}{π} cos^{−1}\frac{2}{π} = 0.560 664 18.
Thus, from (1) and (2),
2h = α − sin(πα/2) = −0.210 513 662 ⇒ h = −0.105 256 831.
Since a = −h, we deduce that p∗(x) = 0.105 + x.
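The computation above is easily reproduced numerically; in the following
Python sketch the bisection bracket [0.4, 0.7] is an illustrative choice read off
the figure.

    import math

    f = lambda x: math.sin(math.pi * x / 2)
    b = 1.0
    g = lambda x: (math.pi / 2) * math.cos(math.pi * x / 2) - b   # e'(x)

    lo, hi = 0.4, 0.7              # g(lo) > 0 > g(hi); g is decreasing
    for _ in range(60):            # bisection for the root alpha
        mid = (lo + hi) / 2
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    alpha = (lo + hi) / 2
    h = (alpha - f(alpha)) / 2
    a = -h
    print(alpha, h)                # 0.56066418..., -0.105256831...

    e = lambda x: f(x) - (a + b * x)
    print(e(0.0), e(alpha), e(1.0))    # h, -h, h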
S7 By Theorem 7.2, the error function e = f − p∗ must have an alternating set of
length 4. The required quadratic p∗ must surely, therefore, be of the following
form.
[Figure: graphs of f and the quadratic p∗ on [−1, 1], with the extreme points −1, −1/2, α, 1 marked.]
If p∗(x) = a + bx + cx^2, then for {−1, −1/2, α, 1} to be an alternating set, we
want
f(−1) − p∗(−1) = 1/2 − (a − b + c) = h, (4)
f(−1/2) − p∗(−1/2) = 0 − (a − b/2 + c/4) = −h, (5)
f(α) − p∗(α) = α + 1/2 − (a + αb + α^2 c) = h, (6)
f(1) − p∗(1) = 3/2 − (a + b + c) = −h. (7)
Here α is a solution of
e′(x) = 1 − b − 2cx = 0 ⇒ α = (1 − b)/(2c).
Equations (5) and (7) imply that 3/2 − 3b/2 − 3c/4 = 0, and hence that
α = 1/4. Equations (4) and (7) imply that a + c = 1 and also that
2b = 1 + 2h. Substituting in (5), we find that a = 2h, and then in (6), that
h = 9/50, a = 9/25, b = 17/25, c = 16/25.
Hence
p∗(x) = \frac{1}{25}(9 + 17x + 16x^2).
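Since the values in (4)–(7) indicate that f(x) = |x + 1/2|, a quick grid check in
Python confirms both the alternating set and the fact that the error never
exceeds 9/50 on [−1, 1].

    import numpy as np

    x = np.linspace(-1, 1, 100001)
    e = np.abs(x + 0.5) - (9 + 17 * x + 16 * x**2) / 25
    print(np.max(np.abs(e)))       # 0.18 = 9/50
    for t in (-1.0, -0.5, 0.25, 1.0):
        print(t, np.abs(t + 0.5) - (9 + 17 * t + 16 * t**2) / 25)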
P2 The following special case may help to illuminate this rather slippery problem.
Suppose that all functions in A vanish at a particular point ξ1 . If it turns out
that ξ1 belongs to the set ZM for some approximation p∗ to f , then p∗ is a
b.m.a. to f , since we can obtain no better approximation at the point ξ1 .
In general, the condition
\sum_{j=1}^{r} σ_j φ(ξ_j) = 0,   φ ∈ A, (8)
gives a linear dependence among the values taken at the ξj , by any member φ
of A. The condition
σj e∗ (ξj ) ≥ 0, j = 1, 2, . . . , r, (9)
where e∗ = f − p∗ , implies that σj and e∗ (ξj ), which are both non-zero, have
the same sign. Thus if φ is any member of A, then
e∗ (ξj )φ(ξj ) ≤ 0,
for at least one of the j, since otherwise (8) and (9) lead to a contradiction.
Since ξj ∈ ZM we deduce, by Theorem 7.1, that p∗ is a b.m.a. from A to f .
Evaluating f − p∗ at these points gives ‖f − p∗‖∞ = 0.031 877. Thus, we
deduce that
0.03 ≤ \min_{p ∈ A} ‖f − p‖∞ ≤ 0.031 877.
Since the sine function maps [−π/6, π/2] one–one onto [−1/2, 1], it is equivalent
to prove that the matrix
( 1   1 − 2t_0^2   3t_0 − 4t_0^3 )
( 1   1 − 2t_1^2   3t_1 − 4t_1^3 )    (16)
( 1   1 − 2t_2^2   3t_2 − 4t_2^3 )
is non-singular for all distinct t_0, t_1, t_2 in [−1/2, 1]. Here t_i = sin ξ_i, i = 0, 1, 2.
In this form, we can evaluate the corresponding determinant and (with some
effort) find a fairly simple factorisation:
det = (t_0 − t_1)(t_1 − t_2)(t_2 − t_0)(3 + 4(t_0 t_1 + t_1 t_2 + t_2 t_0)).
The first three factors of this product are non-zero (since t_0, t_1, t_2 are
distinct) but it remains to show that
3 + 4(t_0 t_1 + t_1 t_2 + t_2 t_0) ≠ 0,
for distinct t_0, t_1, t_2 in [−1/2, 1]. This can be done, but it is rather tricky, and
we omit the details.
Instead, we show that if α, β, γ are not all zero, then the cubic equation
p(t) = α + β(1 − 2t^2) + γ(3t − 4t^3) = 0
cannot have 3 distinct roots in [−1/2, 1], which shows that the matrix (16) is
non-singular and hence so is the matrix (14). Once again, an approach via
Rolle's theorem turns out to be unsuccessful, so we try the following approach
which uses knowledge about the overall shape of a cubic graph.
The equation can certainly have at most 2 roots if γ = 0. We can then
assume, for example, that γ < 0. Since the coefficient of t^3 is then positive (so
p(t) → ±∞ as t → ±∞), there can be 3 distinct roots in [−1/2, 1] only if
p(1) = α − β − γ ≥ 0, (17)
p′(1) = −4β + (3 − 12)γ = −4β − 9γ > 0, (18)
p(−1/2) = α + β/2 − γ ≤ 0, (19)
p′(−1/2) = 2β + (3 − 3)γ = 2β > 0. (20)
According to (20), we have β > 0 and so (17) and (19) yield the contradictory
inequalities α > γ and α < γ, respectively. Hence the above matrices are
indeed non-singular and A does satisfy Haar condition (4).
To prove the final part note that if
φ(x) = α + β cos(2x) + γ sin(3x)
vanishes at x = −π/6, then
α + β/2 − γ = 0. (21)
Now
φ′(−π/6) = −2β sin(−π/3) + 3γ cos(−π/2) = √3 β,
and, by equation (21),
φ(π/2) = α − β − γ = −3β/2.
If β = 0, then φ(π/2) = φ′(−π/6) = 0. On the other hand, if β > 0, then
φ′(−π/6) > 0 and φ(π/2) < 0, so φ has a zero in [−π/6, π/2]. A similar
argument applies if β < 0 and so each function φ in A which vanishes at −π/6
also vanishes at some other point of [−π/6, π/2].
Remark The final part could also have been proved by considering the
cubic function p defined earlier.
Chapter 8 The exchange algorithm
This chapter contains a detailed account of the exchange algorithm, which is an
iteration process for determining the b.m.a. from a finite-dimensional subspace A
of C[a, b] to a function f ∈ C[a, b]. The space A must satisfy the Haar condition,
since the algorithm is based on the theory developed in Chapter 7.
The exchange algorithm is analysed in Chapter 9, which will not be assessed. Two
proofs are given there that the algorithm converges. The first, in Sections 9.1 and
9.2, is fairly straightforward, but does not give an estimate for the rate of
convergence of the algorithm. The second proof, in Sections 9.3 and 9.4, is very
involved, but it serves to show that the algorithm converges remarkably quickly.
This chapter splits into TWO study sessions:
Study session 1: Sections 8.1, 8.2 and 8.3.
Study session 2: Sections 8.4 and 8.5.
Commentary
1. Although Powell does not mention it in the text, the version of the exchange
algorithm in which all points of the reference are changed at each iteration is
often called the Remes algorithm (see page 338).
[Figure: two sketches of the error e(x) on a reference ξ0 < ξ1 < ξ2 < ξ3 < ξ4, oscillating between −|h| and |h|, with a new extreme point η to the left of ξ0.]
On the left, e(η) has the same sign as e(ξ0 ), so ξ0 leaves the reference; on the
right, e(η) has the opposite sign to e(ξ0 ), so ξ4 leaves the reference.
Step 1 Choose an initial reference: a ≤ ξ0 < ξ1 < . . . < ξn+1 ≤ b.
Step 2 Determine p ∈ A and h ∈ R, such that
f(ξ_i) − p(ξ_i) = (−1)^i h,   i = 0, 1, . . . , n + 1.
Thus, by Theorem 7.4,
|h| = \min_{p ∈ A} \max_{i=0,1,...,n+1} |f(ξ_i) − p(ξ_i)|.
Step 3 To determine ‖f − p‖∞, we need to identify the extreme points of
e = f − p on [−1, 1], which occur either at ±1 or at solutions of
e′(x) = e^x − b − 2cx = 0.
The graphs y = e^x and y = b + 2cx (with b = 1.130 86, c = 0.553 93)
indicate that this non-linear equation has two solutions η1 , η2 at
approximately ±0.5. One can use the bisection method or Newton’s
method to obtain the accurate values
η1 = −0.438 62 and η2 = 0.560 94.
Since e(η1) = 0.045 233 and e(η2) = −0.045 468, we have
‖f − p‖∞ = |e(η2)| = 0.045 468.
Note that |e(−1)| = |e(1)| = |h| < |e(η2)|.
Remark Recall Powell's comment on page 86 that ‖f − p‖∞
would in practice be obtained by computing many values of f − p
on [a, b] and approximating this function locally by quadratics.
Step 4 Since
δ = |f (η2 ) − p(η2 )| − |h| = 0.045 468 − 0.044 337 = 0.001 131,
the polynomial p is already fairly close to the best minimax
approximation from P2 to f(x) = e^x on [−1, 1].
Step 5 The error function e = f − p has the following form.
[Figure: graph of e = f − p on [−1, 1], oscillating between −|h| and |h| near the reference points, with the largest errors at the interior points η1 and η2.]
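Steps 2 and 3 fit into a few lines of Python. The following minimal sketch
reproduces the numbers quoted above; the grid search in Step 3 stands in for the
local quadratic approximation that Powell recommends.

    import numpy as np

    f = np.exp
    ref = np.array([-1.0, -0.5, 0.5, 1.0])   # reference suggested by Theorem 8.1

    # Step 2: solve f(xi_i) - (a + b*xi_i + c*xi_i**2) = (-1)**i * h.
    m = len(ref)
    M = np.column_stack([np.ones(m), ref, ref**2, (-1.0)**np.arange(m)])
    a, b, c, h = np.linalg.solve(M, f(ref))
    print(b, c, abs(h))            # 1.13086..., 0.55393..., 0.044337...

    # Step 3: locate the largest error on a fine grid.
    x = np.linspace(-1, 1, 200001)
    e = f(x) - (a + b * x + c * x**2)
    i = np.argmax(np.abs(e))
    print(x[i], e[i])              # 0.56094..., -0.045468...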
Self-assessment questions
S1 Justify the statement in the second paragraph of page 86 that the error
function e has at least n turning points.
S2 Justify the statement in the second paragraph of page 88 that the case when
|h| = 0 can occur only on the first iteration, and then any value of q gives the
increase (8.11).
Study Session 2: Matters relating to the exchange
algorithm
Commentary
1. Theorem 8.1 explains the choice of reference that we made when finding a
minimax approximation from P2 to f(x) = e^x on [−1, 1]. The Chebyshev
polynomial T3(x) = 4x^3 − 3x takes its extreme values at {−1, −1/2, 1/2, 1}.
Notice also the bearing that Theorem 8.1 has on Powell Exercise 7.7. If we
map the above reference from [−1, 1] to [0, 1], then it becomes {0, 1/4, 3/4, 1},
and Theorem 8.1 implies that if this reference had been used in Powell
Exercise 7.7, then the calculated polynomial p(x) would have been the b.m.a.
from P2 to f(x) = x^3 on [0, 1]. Since the given reference {0, 0.3, 0.8, 1} was
close to this ideal reference, the resulting polynomial was close to the b.m.a.
Self-assessment questions
S4 Verify that the points ξi in (8.17) satisfy (8.18).
P4 Powell Exercise 8.6 (Hint: express the remainder for the Taylor
approximation as an integral.)
Solutions to SAQs in Chapter 8
S1 Let ξk−1 , ξk , ξk+1 be consecutive points of the reference. If e(ξk ) = |h| > 0,
then the error function e has a local maximum inside [ξk−1 , ξk+1 ], with value
at least |h|. On the other hand, if e(ξk ) = −|h| < 0, then the function e has a
local minimum inside [ξk−1 , ξk+1 ], with value at most −|h|. Each of these
local extreme values is clearly distinct and so, since there are n intervals of
the form [ξk−1 , ξk+1 ], there are at least n local extreme points.
[Figure: graph of y = e(x) on a reference a = ξ0 < ξ1 < ξ2 = b, with values ±|h| at the reference points and a turning point at η.]
S4 Since
T_{n+1}(x) = cos((n + 1) cos^{−1} x),   −1 ≤ x ≤ 1
(see (4.23)), we have
T_{n+1}(ξ_i) = cos((n + 1)(n + 1 − i)π/(n + 1))
= cos((n + 1 − i)π)
= (−1)^{n+1−i},
because cos(nπ) = (−1)^n.
S5 According to Theorem 8.1 and the discussion at the bottom of page 91, we
should use the reference {0, 0.25, 0.75, 1}. With f(x) = x^3 and
p(x) = a + bx + cx^2, this gives
f (0) − p(0) = −a = h, (5)
f (1/4) − p(1/4) = 1/64 − (a + b/4 + c/16) = −h, (6)
f (3/4) − p(3/4) = 27/64 − (a + 3b/4 + 9c/16) = h, (7)
f (1) − p(1) = 1 − (a + b + c) = −h. (8)
Considering (7) − (5) and (8) − (6) gives
P2 The extreme values of the error function
e∗(x) = f(x) − p∗(x) = \frac{144}{x + 2} − (69 − 20x + 2x^2)
occur when x = 0, 6, or x satisfies
e∗′(x) = \frac{−144}{(x + 2)^2} + 20 − 4x = 0.
This equation reduces to
(x − 1)(x^2 − 16) = 0,
which has solutions x = 1, ±4. Since
e∗(0) = 3,  e∗(1) = −3,  e∗(4) = 3,  e∗(6) = −3,
we deduce that p∗ is indeed the b.m.a. from P2 to f on [0, 6].
Next we determine the function p(x) = a + bx + cx^2 which satisfies (8.4) with
the reference {0, 1 + α, 4 + β, 6}:
f(0) − p(0) = 72 − a = h, (9)
f(1 + α) − p(1 + α) = \frac{144}{3 + α} − (a + b(1 + α) + c(1 + α)^2) = −h, (10)
f(4 + β) − p(4 + β) = \frac{144}{6 + β} − (a + b(4 + β) + c(4 + β)^2) = h, (11)
f(6) − p(6) = 18 − (a + 6b + 36c) = −h. (12)
Now, the binomial expansion for (1 + x)^{−1} gives
\frac{1}{3 + α} = \frac{1}{3(1 + α/3)} = \frac{1}{3}(1 − α/3) + O(α^2)
and
\frac{1}{6 + β} = \frac{1}{6(1 + β/6)} = \frac{1}{6}(1 − β/6) + O(β^2),
so that, if α, β are small enough for α^2, β^2 to be neglected, then (10), (11)
reduce to
48 − 16α − (a + b(1 + α) + c(1 + 2α)) = −h, (13)
24 − 4β − (a + b(4 + β) + c(16 + 8β)) = h. (14)
Now we observe that the values a = 69, b = −20, c = 2 and h = 3 must satisfy
(9), (13), (14), (12) in the case α = β = 0, because p∗ is the b.m.a. from P2 to
f , with the alternating set being {0, 1, 4, 6}. Thus, these values for a, b, c and
h will satisfy (9), (13), (14), (12) when α, β = 0 also, provided that
−16α − bα − 2cα = (−16 − b − 2c)α = 0, (15)
−4β − bβ − 8cβ = (−4 − b − 8c)β = 0. (16)
Since both these equations do hold with b = −20 and c = 2, we deduce that
a, b, c, h do satisfy (9), (13), (14), (12), and so the function given by (8.4) in
this case is p∗ again.
Remark It is not, of course, pure chance that equations (15) and (16) hold.
In fact, if ξ is a point at which
f(ξ) − p(ξ) = h and f′(ξ) − p′(ξ) = 0,
then for small ε (small enough for ε^2 to be neglected) we have the Taylor
approximation
f(ξ + ε) − p(ξ + ε) = f(ξ) − p(ξ) + ε(f′(ξ) − p′(ξ))
= f(ξ) − p(ξ)
= h.
In practice this means that if we choose a reference which is fairly close to an
alternating set for f − p∗ , then the polynomial given by (8.4) is very close
to p∗ .
Since T5 has no term in x^4, the coefficient of x^4 in p̃4(x) is −1/64 and so the
b.m.a. from P3 to p̃4 on [−1, 1] is
p̃3(x) = p̃4(x) + \frac{1}{64} · \frac{1}{2^3} T4(x),
and the ∞-norm error made by this second approximation is
1/(64 × 2^3) = 1/512. Thus
‖f − p̃3‖∞ ≤ ‖f − p5‖∞ + ‖p5 − p̃4‖∞ + ‖p̃4 − p̃3‖∞
≤ \frac{1}{192} + \frac{1}{2560} + \frac{1}{512}
= 0.007 55.
Since p̃3 is a cubic function and ‖f − p̃3‖∞ < 0.01, we are done.
For the record:
p̃4(x) = \frac{255}{512} x − \frac{1}{8} x^2 + \frac{19}{384} x^3 − \frac{1}{64} x^4
and
p̃3(x) = \frac{1}{512} + \frac{255}{512} x − \frac{9}{64} x^2 + \frac{19}{384} x^3.
Iteration 5 {1, 2, 4}
f(1) − p(1) = 4.2 − a − b = h
f(2) − p(2) = 0.1 − a − 2b = −h
f(4) − p(4) = 5.7 − a − 4b = h
⇒ 4.3 − 2a − 3b = 0 and 5.8 − 2a − 6b = 0 ⇒ a = 1.4, b = 0.5, h = 2.3.
Thus p5 (x) = 1.4 + 0.5x, with |h| = 2.3. The maximum error of p5 is also 2.3
and so the algorithm ends.
Hence the b.m.a. from P1 to f on {0, 1, 2, 3, 4, 5, 6} is p∗ (x) = 1.4 + 0.5x. The
following figure illustrates the 5 approximations calculated by the algorithm.
[Figure: the data on {0, 1, . . . , 6} and the five approximations p1, . . . , p5 on [0, 6].]
Remark Notice that the maximum errors in the above process are not
decreasing, and so this example serves to answer Powell Exercise 8.4.
Chapter 10 Rational approximation by
the exchange algorithm
The approximation of continuous functions by rational functions, that is, ratios of
polynomials, is of great practical importance. For example, it is common for
computers to use rational functions to approximate special functions such as e^x,
sin x etc. As an example, we mention the function
r(x) = 1 + \frac{40x}{x^2 − 20x + 138 − 4116/(x^2 + 42)},
which approximates f(x) = e^x to within 1.11 × 10^{−7} on [−1, 1]. The theory of
minimax rational approximation is rather like that for minimax polynomial
approximation, but is more difficult because rational functions do not depend
linearly on their coefficients.
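The quoted accuracy is easily checked numerically; the following Python sketch
evaluates the continued fraction on a fine grid (the grid size is an arbitrary
choice).

    import numpy as np

    x = np.linspace(-1, 1, 100001)
    r = 1 + 40 * x / (x**2 - 20 * x + 138 - 4116 / (x**2 + 42))
    print(np.max(np.abs(np.exp(x) - r)))    # about 1.1e-07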
The first three sections of Chapter 10 are devoted to a version of the exchange
algorithm for rational functions, including a discussion of its possible failure. The
existence and uniqueness of best minimax rational approximations is not proved
and the characterisation of best minimax rational approximation in terms of
alternating sets is left to a series of (rather hard) exercises (10.1, 10.2 and 10.6).
Section 10.4 gives a brief description of an alternative algorithm for calculating
best minimax rational approximations, called the ‘differential correction
algorithm’ (see the book by Braess for more details). This section will not be
assessed in any way.
This chapter splits into TWO study sessions:
Study session 1: Sections 10.1 and 10.2.
Study session 2: Section 10.3.
Commentary
2. Powell uses the letter k in (10.5) to denote the kth approximation to the
b.m.a. from Amn to f . Unfortunately, this makes some of the formulas, such
as (10.12), look even more involved than they are already. This commentary
will suppress the letter k.
3. A proof that each f in C[a, b] has a unique b.m.a. from Amn can be found in
Achieser [2] or Rivlin [138]. Note that Theorem 1.2 does not apply because
Amn is not a linear space.
4. The process described in Section 10.2 for solving the non-linear equations
(10.10) to find aj , j = 0, 1, . . . , m, bj , j = 0, 1, . . . , n, and h, is very ingenious,
but rather hard to follow in the abstract. We illustrate the process here with
the concrete example f(x) = e^x on [−1, 1], r(x) = (a0 + a1x)/(b0 + b1x) in
A11, and initial reference {ξ0, ξ1, ξ2, ξ3} = {−1, −1/2, 1/2, 1}. The calculation
should be compared with that given in the commentary for Chapter 8 to find
the b.m.a. from P2 to f on [−1, 1].
In this case equations (10.10) are
a0 + a1ξ0 = (e^{ξ0} − h)(b0 + b1ξ0), (1)
a0 + a1ξ1 = (e^{ξ1} + h)(b0 + b1ξ1), (2)
a0 + a1ξ2 = (e^{ξ2} − h)(b0 + b1ξ2), (3)
a0 + a1ξ3 = (e^{ξ3} + h)(b0 + b1ξ3). (4)
These are 4 non-linear equations for the 5 unknowns a0 , a1 , b0 , b1 , h, but
remember that we are free to scale a0 , a1 , b0 , b1 . Even so, it does not look
easy to solve (1), (2), (3), (4).
The method of solution in Section 10.2 involves (10.11), which are special
cases of (4.11). It is helpful here to use the notation
Π_i = \prod_{j=0, j≠i}^{m+n+1} \frac{1}{ξ_j − ξ_i},   i = 0, 1, . . . , m + n + 1, (5)
that is, (A − hB)b = 0, where b = (b0, b1),
A_{lj} = \sum_{i=0}^{3} e^{ξ_i} ξ_i^{l+j} Π_i,   l, j = 0, 1, (9)
B_{lj} = \sum_{i=0}^{3} (−1)^i ξ_i^{l+j} Π_i,   l, j = 0, 1. (10)
In our case
A00 = (2/3)e^{−1} − (4/3)e^{−1/2} + (4/3)e^{1/2} − (2/3)e = (4/3)(2 sinh(1/2) − sinh 1),
A10 = −(2/3)e^{−1} + (2/3)e^{−1/2} + (2/3)e^{1/2} − (2/3)e = (4/3)(cosh(1/2) − cosh 1),
A11 = (2/3)e^{−1} − (1/3)e^{−1/2} + (1/3)e^{1/2} − (2/3)e = (2/3)(sinh(1/2) − 2 sinh 1),
and
B00 = 2/3 + 4/3 + 4/3 + 2/3 = 4,
B10 = −2/3 − 2/3 + 2/3 + 2/3 = 0,
B11 = 2/3 + 1/3 + 1/3 + 2/3 = 2.
Thus
A = ( −0.177 347 44  −0.553 939 55 ; −0.553 939 55  −1.219 538 05 ),   B = ( 4  0 ; 0  2 ).
Now we wish to determine those h such that det(A − hB) = 0, that is,
(det B)h^2 − (A00 B11 − 2A01 B01 + A11 B00)h + det A = 0, (11)
in view of the symmetry of A and B. This quadratic is
8h^2 + 5.232 847 08 h − 0.090 567 07 = 0,
which has solutions
h1 = 0.016 872 21 and h2 = −0.670 978 095.
Recalling that |h| denotes the levelled reference error, we try h1 first (since it
is smaller) and seek a solution b = (b0 , b1 ) to
(A − h1B)b = ( −0.244 836 28  −0.553 939 55 ; −0.553 939 55  −1.253 282 47 ) ( b0 ; b1 ) = ( 0 ; 0 ).
Choosing b1 = 1, we obtain b0 = −2.262 489 65 from the first equation, and
we note that, with these values, q(x) = b0 + b1 x has no zeros in [−1, 1]. From
equations (1) and (4), say, we obtain a0 = −2.299 130 562,
a1 = −1.153 973 103, so that
r(x) = \frac{2.299 130 562 + 1.153 973 103 x}{2.262 489 65 − x},
and the levelled reference error is |h1 | = 0.016 872 21. The argument on
page 115 of Powell shows that the value h2 leads to a rational function r with
a singularity in [−1, 1], as you can easily check.
To continue the exchange algorithm, we seek a number η in [−1, 1] such that
(10.5) holds. The extreme values occur either at ±1 or at solutions of
0 = \frac{d}{dx}\left(e^x − \frac{a0 + a1x}{b0 + b1x}\right) = e^x − \frac{a1b0 − a0b1}{(b0 + b1x)^2}.
With the calculated values of a0, a1, b0, b1, we thus have to solve
e^x − \frac{4.909 982 764}{(2.262 489 65 − x)^2} = 0
and use of Newton’s method or the bisection method gives
η1 = −0.256 234 678, η2 = 0.704 578 163.
Evaluating f(x) − r(x) at these points we find that
‖f − r‖∞ = 0.025 321 995 4. Thus we deduce from (10.8) that, to 3 significant
figures,
0.0169 ≤ ‖f − r∗‖∞ ≤ 0.0253,
where r∗ is the b.m.a. from A11 to f(x) = e^x on [−1, 1]. It would appear,
therefore, that the least maximum error in approximation from A11 to f is
about half that obtained in approximating from P2 to f (see the commentary
on Section 8.2).
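The levelled-error computation just carried out can be scripted as a check on
the arithmetic. The following minimal Python sketch follows equations (5) and
(9)–(11) above.

    import numpy as np

    xi = np.array([-1.0, -0.5, 0.5, 1.0])
    Pi = np.array([1 / np.prod([xi[j] - xi[i] for j in range(4) if j != i])
                   for i in range(4)])

    A = np.array([[np.sum(np.exp(xi) * xi**(l + j) * Pi) for j in (0, 1)]
                  for l in (0, 1)])
    B = np.array([[np.sum((-1.0)**np.arange(4) * xi**(l + j) * Pi)
                   for j in (0, 1)] for l in (0, 1)])

    # The quadratic (11) for h; take the root of smaller modulus.
    quad = [np.linalg.det(B),
            -(A[0, 0] * B[1, 1] - 2 * A[0, 1] * B[0, 1] + A[1, 1] * B[0, 0]),
            np.linalg.det(A)]
    h = min(np.roots(quad), key=abs)
    print(h)                       # 0.01687221...

    C = A - h * B
    b0, b1 = -C[0, 1] / C[0, 0], 1.0    # first equation of (A - hB)b = 0
    a0, a1 = np.linalg.solve([[1.0, xi[0]], [1.0, xi[3]]],
                             [(np.exp(xi[0]) - h) * (b0 + b1 * xi[0]),
                              (np.exp(xi[3]) + h) * (b0 + b1 * xi[3])])
    print(a0, a1, b0)              # matches the values found above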
The error function e = f − r has the form shown below.
[Figure: graph of e = f − r on [−1, 1], oscillating between −|h| and |h|, with the interior extreme points η1 and η2 marked.]
5. Theorem 10.2 and the discussion which follows it show that all values of h
which satisfy (10.16) are real. Note that if B is symmetric and positive
definite then B is conjugate to a diagonal matrix D with the (positive)
eigenvalues of B lying on the main diagonal. In fact, P^{−1}BP = D, where P
is the transition matrix from an eigenvector basis to the standard basis
(obtained by writing these eigenvectors as column vectors). Thus
B = PDP^{−1} ⇒ B^{1/2} = PD^{1/2}P^{−1},
since (B^{1/2})^2 = PD^{1/2}P^{−1}PD^{1/2}P^{−1} = PDP^{−1} = B, and so
B^{−1/2} = (PD^{1/2}P^{−1})^{−1} = PD^{−1/2}P^{−1}.
Self-assessment questions
S1 Justify the fact, used in equation (10.19), that
(−1)^i \prod_{s=0, s≠i}^{m+n+1} \frac{1}{ξ_s − ξ_i} = \prod_{s=0, s≠i}^{m+n+1} \frac{1}{|ξ_s − ξ_i|},   i = 0, 1, . . . , m + n + 1.
S2 Let B = ( 3  −2 ; −2  3 ). Prove that B is positive definite (i.e. B has strictly
positive eigenvalues), and determine B^{−1/2}.
Study Session 2: Some convergence properties of
the exchange algorithm
Commentary
1. Theorem 10.3 shows that in the discrete case the exchange algorithm
converges. As pointed out in this section, the exchange algorithm may fail,
but if it does converge, then the rate of convergence is very rapid — hence
the algorithm’s importance.
2. Equation (10.29). More precisely, the difficulty occurs when every alternating
set which satisfies (10.29) has length less than m + n + 2. In the example at
the bottom of page 117, where m = n = 1, the longest alternating set for
f − r has length 3, whereas m + n + 2 = 4 in this case (cf. the solution to
Problem P5).
Self-assessment questions
S3 Explain how the method of proof of Theorem 10.1 implies that r∗ − r is the
ratio of two cubic functions with four zeros (page 117, bottom).
S4 Confirm that the functions in A11 which satisfy (10.6) with the data
f (−4) = 0, f (−1) = 1, f (1) = 1, f (4) = 0, are given by (1.6 − 0.2x)/(2 − x)
and (1.6 + 0.2x)/(2 + x).
P2 Powell Exercise 10.2 (Hint: try to mimic the proof of Theorem 7.1.)
P4 Powell Exercise 10.4 (Note that you need not find the b.m.a. from A12 , in
the second part.)
P5 Powell Exercise 10.6 (Hint: try to use the dimension theorem for linear
mappings.)
S2 The characteristic equation of B is λ^2 − 6λ + 5 = 0, so that the eigenvalues
are λ = 1, 5. Since these are both positive, B is positive definite.
Corresponding eigenvectors are (1, 1) for λ = 1 and (1, −1) for λ = 5, so that
the transition matrix from the basis {(1, 1), (1, −1)} to the standard basis is
P = ( 1  1 ; 1  −1 )  ⇒  P^{−1}BP = ( 1  0 ; 0  5 ) = D.
Now
D^{−1/2} = ( 1  0 ; 0  1/√5 )  ⇒  B^{−1/2} = PD^{−1/2}P^{−1}
= ( (1 + 1/√5)/2   (1 − 1/√5)/2 ; (1 − 1/√5)/2   (1 + 1/√5)/2 ).
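This computation is also immediate in Python, using an orthonormal
eigendecomposition in place of the transition matrix above.

    import numpy as np

    B = np.array([[3.0, -2.0], [-2.0, 3.0]])
    w, Q = np.linalg.eigh(B)       # eigenvalues [1, 5], orthonormal columns Q
    B_inv_half = Q @ np.diag(w**-0.5) @ Q.T
    print(B_inv_half)              # matches the matrix computed above
    print(np.allclose(B_inv_half @ B_inv_half @ B, np.eye(2)))   # True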
strength of Theorem 7.5 is not needed in this case), and since r ∈ A22 ,
r∗ ∈ A11 , the function r − r∗ is indeed the ratio of two cubic functions, whose
denominator has no zeros. Hence r − r∗ = 0, as required (see solution to
Problem P5 for a more general result).
S4 To solve the equations (10.6) we need to consider equation (11), where the
coefficients of A and B are given by (9) and (10) (adapted to the present
case) and Πi , i = 0, 1, 2, 3, are given by (5).
First,
Π0 = 1/(3·5·8) = 1/120,   Π1 = 1/((−3)·2·5) = −1/30,
Π2 = 1/((−5)·(−2)·3) = 1/30,   Π3 = 1/((−8)·(−5)·(−3)) = −1/120,
so that
A00 = 0 − 1/30 + 1/30 + 0 = 0,
A10 = 0 + 1/30 + 1/30 + 0 = 1/15,
A11 = 0 − 1/30 + 1/30 + 0 = 0,
B00 = 1/120 + 1/30 + 1/30 + 1/120 = 1/12,
B10 = (−4)·(1/120) − (−1)·(−1/30) + 1·(1/30) − 4·(−1/120) = 0,
B11 = 16·(1/120) − 1·(−1/30) + 1·(1/30) − 16·(−1/120) = 1/3.
Hence
A = ( 0  1/15 ; 1/15  0 )   and   B = ( 1/12  0 ; 0  1/3 ),
so that equation (11) is h^2/36 − 1/225 = 0 ⇒ h = ±0.4. It follows that the
equations (A − hB)b = 0 reduce to −hb0/12 + b1/15 = 0, that is,
b1 = 5hb0/4. Thus, we find from equations (1) and (4) that a0 = 5h^2 b0,
a1 = hb0/4, so that the corresponding rational functions are
(1.6 − 0.2x)/(2 − x)   (h = −0.4)   and   (1.6 + 0.2x)/(2 + x)   (h = 0.4).
Solutions to Problems in Chapter 10
P1 The key observation here is that, for x ∈ [a, b], the value r_θ(x) of the function
r_θ = (p∗ + θp)/(q∗ + θq) lies between r∗(x) and r(x). Indeed, by the hint, we
deduce that, for x ∈ [a, b],
\frac{p∗(x)}{q∗(x)} < \frac{p(x)}{q(x)} ⇒ \frac{p∗(x)}{q∗(x)} < \frac{p∗(x) + θp(x)}{q∗(x) + θq(x)} < \frac{p(x)}{q(x)},
since q∗, q > 0 on [a, b], and similarly
\frac{p∗(x)}{q∗(x)} > \frac{p(x)}{q(x)} ⇒ \frac{p∗(x)}{q∗(x)} > \frac{p∗(x) + θp(x)}{q∗(x) + θq(x)} > \frac{p(x)}{q(x)}.
Also, if p∗(x)/q∗(x) = p(x)/q(x), then r_θ(x) = p∗(x)/q∗(x) = p(x)/q(x).
Now suppose that η ∈ [a, b] satisfies ‖f − r_θ‖∞ = |f(η) − r_θ(η)|. If r_θ(η) lies
strictly between r∗(η) and r(η), then it is clear that
|f(η) − r_θ(η)| < max{|f(η) − r∗(η)|, |f(η) − r(η)|} ≤ ‖f − r∗‖∞,
and so ‖f − r_θ‖∞ < ‖f − r∗‖∞, as required.
Otherwise, r_θ(η) = r∗(η) = r(η), so that
‖f − r_θ‖∞ = |f(η) − r_θ(η)| = |f(η) − r(η)| < ‖f − r∗‖∞,
once again.
On the other hand, if ξ ∈ [a, b]\Z0, then f(ξ) − r∗(ξ) and r∗(ξ) − r_θ(ξ) have
opposite signs (neither is zero!), so that
‖f − r_θ‖∞ = |f(ξ) − r_θ(ξ)|
= |f(ξ) − r∗(ξ) + r∗(ξ) − r_θ(ξ)|
< max{|f(ξ) − r∗(ξ)|, |r∗(ξ) − r_θ(ξ)|}
≤ ‖e∗‖∞.
In either case, the desired reduction holds.
We have thus proved the ‘only if’ part of the following analogue of
Theorem 7.1.
Theorem Let f ∈ C[a, b]. Then r∗ is a b.m.a. from Amn to f if and only if
there is no function r in Amn such that
(f (x) − r∗ (x))(r(x) − r∗ (x)) > 0, x ∈ ZM ,
where
ZM = {x ∈ [a, b] : |f(x) − r∗(x)| = ‖f − r∗‖∞}.
The proof of the ‘if’ part goes exactly as the beginning of Section 7.2.
[Figure: graphs of y = (1 + 4.9x)/(10 + x) and y = (−90 + 29x)/(−150 + 35x) on [0, 6].]
P4 It is a straightforward matter to check that ‖f − r∗‖∞ = 1/4 and that
{−1, −1/2, 1/2, 1} is an alternating set of length 4 for f − r∗. Since
m + n + 2 = 5 in this case, we can apply the argument at the bottom of page
117 (see also SAQ S3) to show that r∗ is the b.m.a. to f from A21.
To prove that r∗ is not the b.m.a. from A12 to f, we shall find a function of
the form
r(x) = \frac{x}{a + bx^2},   −1 ≤ x ≤ 1,
such that ‖f − r‖∞ < 1/4. The above form for r is appropriate because the
function f itself is odd (i.e. f(−x) = −f(x)). There are many reasonable
choices for a and b, but a = 2, b = −1 seems a good idea since then we have
r′(0) = 1/2 and r(1) = f(1) (and also r′(1) = f′(1)).
[Figure: graphs of y = x/(2 − x^2) and y = x^3 on [−1, 1].]
If r = p/q lies in Amn, then
r − r∗ = \frac{p}{q} − \frac{p∗}{q∗} = \frac{pq∗ − p∗q}{qq∗}.
Thus we need to find polynomials p ∈ Pm, q ∈ Pn such that
pq∗ − p∗q = \prod_{i=1}^{k} (x − ξ_i), (19)
Printed in the United Kingdom.