 Mathematics and Computing: Taught MSc M832 CN1

M832
APPROXIMATION THEORY AND METHODS
(PART 1)

Course Notes
(Chapters 1–8 and 10)

Prepared by

P. J. Rippon

Second edition

Copyright © 2007 The Open University SUP 01168 7
Contents
Introduction
Reading List
Chapter 1  The approximation problem and existence of best approximations
  Study Session 1: Approximation in a metric space
  Study Session 2: Approximation in a normed linear space
Chapter 2  The uniqueness of best approximation
  Study Session 1: Convexity
  Study Session 2: Best approximation operators
Chapter 3  Approximation operators and some approximating functions
  Study Session 1: ‘Good’ versus ‘best’ approximation
  Study Session 2: Types of approximating functions
Chapter 4  Polynomial interpolation
  Study Session 1: Polynomial interpolation
  Study Session 2: Chebyshev interpolation
Chapter 5  Divided differences
  Study Session 1: Basic properties of divided differences
  Study Session 2: Numerical considerations and Hermite interpolation
Chapter 6  The uniform convergence of polynomial approximations
  Study Session 1: Monotone operators
  Study Session 2: The Bernstein operator
Chapter 7  The theory of minimax approximation
  Study Session 1: The extreme values of the error function
  Study Session 2: Characterising best minimax approximations
Chapter 8  The exchange algorithm
  Study Session 1: Using the exchange algorithm
  Study Session 2: Matters relating to the exchange algorithm
Chapter 10  Rational approximation by the exchange algorithm
  Study Session 1: The exchange algorithm for rational approximation
  Study Session 2: Some convergence properties of the exchange algorithm
Introduction
The subject of Approximation Theory lies at the frontier between Applied
Mathematics and Pure Mathematics. Practical problems, such as the computer
calculation of special functions like $e^x$, lead naturally to theoretical problems,
such as ‘how well can we approximate by a given method?’ or ‘how fast does a
given algorithm converge?’.
Powell’s book Approximation Theory and Methods (hereafter referred to as
‘Powell’) provides an excellent introduction to these theoretical problems, covering
the basic theory of a wide range of approximation methods. Professor Powell is an
expert on both pure and applied approximation theory, and the book contains a
very detailed list of references to and discussion of the research literature.
This course is based on a treatment of fifteen chapters of Powell. Do not be
misled by the statement that this is an undergraduate textbook. Much of the
material can be taught at that level, but when looked at in detail many parts of it
are quite demanding. These course notes will guide you through the book telling
you which sections to read, explaining difficult parts, correcting errors (mercifully
few!) and setting SAQs and Problems to test your understanding of the material.
You should attempt all the SAQs and as many Problems as you have time for:
full solutions are given at the end of the notes for each chapter.
You will find the exercises in Powell quite varied. Many are routine, but others
are rather hard and some are very hard (particularly those which contain the
word ‘investigate’). I have resisted the temptation to attach ‘stars’ to harder
exercises and instead tried to provide ‘hints’, where appropriate. In general I feel
that, at this level, it is a good idea for you to try and make your own judgement
about the difficulty of a given problem.
Many of the exercises require the use of a good scientific calculator (one with
special functions, including hyperbolics, and a memory). Some require the
solution of non-linear equations of the form $f(x) = 0$ by using, for example:
the bisection method (finding an interval $[a,b]$ such that $f(a)$, $f(b)$ have opposite signs, testing $f(c)$, where $c = \frac12(a+b)$, and then repeating the process with either $[a,c]$ or $[c,b]$);
Newton's method (making a good initial guess $x_0$ at a solution and then calculating the sequence $x_n$ given by $x_{n+1} = x_n - f(x_n)/f'(x_n)$, $n = 0, 1, 2, \dots$).
Such methods can be implemented on a basic scientific calculator (especially if
only a rough answer is needed), but it will obviously save time if you have access
to a computer. You will not be expected to determine accurate solutions by such
methods in the examination. On the matter of accuracy, I have tended to present
calculations as they appeared on my own calculator, and have sometimes given
final answers correct to only three significant digits.
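If you do have access to a computer, both root-finding methods take only a few lines. Here is a minimal sketch (in Python, an arbitrary choice, since the course does not prescribe any particular language); the sample equation $\cosh x = 2x$ and the tolerance are placeholders for whatever an exercise produces.

    import math

    def bisect(f, a, b, tol=1e-10):
        # Assumes f(a) and f(b) have opposite signs.
        while b - a > tol:
            c = 0.5 * (a + b)
            if f(a) * f(c) <= 0:
                b = c          # root lies in [a, c]
            else:
                a = c          # root lies in [c, b]
        return 0.5 * (a + b)

    def newton(f, fprime, x0, steps=20):
        x = x0
        for _ in range(steps):
            x = x - f(x) / fprime(x)   # x_{n+1} = x_n - f(x_n)/f'(x_n)
        return x

    # Example: solve cosh(x) = 2x, i.e. f(x) = cosh(x) - 2x = 0.
    f = lambda x: math.cosh(x) - 2 * x
    fp = lambda x: math.sinh(x) - 2
    print(bisect(f, 0, 1), newton(f, fp, 0.5))

Both calls should agree to many digits (the root is near 0.589); in practice Newton's method needs far fewer iterations when the initial guess is good.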
In order to pace you through the course, there are four Tutor-Marked
Assignments (TMAs). These are compulsory in that you cannot pass the course
without obtaining a reasonable average grade on them. Your three best TMAs
carry 50% of the total marks for the course, the remaining 50% coming from the
three-hour examination at the end of the course. Please note that TMAs cannot
be accepted after their cut-off dates, other than in exceptional circumstances.
Although you should have plenty to do reading Powell and these course notes, I
have added a reading list after this introduction. This splits into books covering
the background material which is assumed in Powell (Linear Algebra, Metric
Spaces, etc.) and other books on Approximation Theory.

I should be grateful to receive any comments you may have on the course notes
and on the set book. The course notes have already benefited greatly from close
reading by Mick Bromilow here at the OU and Martin Stynes of University
College, Cork. Their help has been invaluable. Finally, I should like to thank all
those who have helped prepare these Course Notes, including members of the
Desktop Publishing Unit, Alison Cadle who edited them, and the many M832
students who have supplied corrections to earlier versions.

Phil Rippon
Milton Keynes, August 2006

Reading List

Background
W. Rudin, Principles of Mathematical Analysis, McGraw–Hill, 1976.
(A concise introduction to real analysis, including metric spaces,
integration and functions of several variables, as well as basic linear
algebra — available in a paperback International Student Edition.)
V. Bryant, Metric Spaces, Cambridge University Press, 1985.
(An introduction to metric spaces, emphasising the importance of
iteration. Plenty of explanation.)
D. Kreider, R. Kuller, D. Ostberg and F. Perkins, An Introduction to Linear
Analysis, Addison–Wesley, 1966.
(An introduction to the use of vector spaces of functions in solving
linear differential equations — lots of worked exercises.)
M203 Introduction to Pure Mathematics
MST204 Mathematical Models and Methods
M386 Metric and Topological Spaces (now part of M435)

Approximation Theory
T. J. Rivlin, An Introduction to the Approximation of Functions, Dover, 1981.
(Cheap and covers very similar material to Powell, with less on
splines and more on rational approximation.)
P. J. Davis, Interpolation and Approximation, Dover, 1976.
(Cheap, but a classic text which overlaps Powell considerably,
though with a much greater emphasis on complex approximation.)
D. Braess, Nonlinear Approximation Theory, Springer, 1986.
(Recent and sophisticated, this book examines the more difficult
non-linear theory which Powell largely avoids.)

Chapter 1 The approximation problem
and existence of best approximations
The book begins with a discussion of the types of problems which are to be solved
and several fundamental results. Powell assumes that the reader is quite familiar
with metric spaces and so the commentary below includes a short refresher course
on these, in case you are rusty on this subject.
This chapter splits into TWO study sessions:
Study session 1: Sections 1.1–1.2.
Study session 2: Sections 1.3–1.5.

Study Session 1: Approximation in a metric space

Read Sections 1.1 and 1.2

Commentary

1. Section 1.1 describes the three ingredients of an approximation problem:
(a) a function $f$ to be approximated, lying in some underlying (background) set $B$;
(b) a set of functions $A \subseteq B$ from which we wish to choose an approximation $g$ to $f$;
(c) a means of measuring how close together $g$ and $f$ are.
[Diagram: $f$ in $B$, with an approximation $g$ to be chosen from the subset $A$.]
In a continuous approximation problem $f$ is typically a real function, such as $f(x) = e^x$, and the set $A$ is a finite-dimensional vector space of real functions, such as the set $P_n$ of polynomials of degree at most $n$. One measure of how closely a function $g$ approximates to $f$ on an interval $[a,b]$ is
$$\max_{a\le x\le b}|g(x) - f(x)|.$$
In a discrete approximation problem $f$ is typically a vector of function values $(f(x_1), \dots, f(x_n))$, where $g$ belongs to some set of approximating functions. Note that $f$ and $g$ are used here to represent both a function and the corresponding vector of function values. One measure of how closely $g$ approximates to $f$ is
$$\max_{1\le i\le n}|g(x_i) - f(x_i)| \quad\text{or}\quad \max_{1\le i\le n}|g(x_i) - f_i|.$$

2. A metric space (B, d) is a set B and a metric (or distance function) d(a, b),
a, b ∈ B, such that for all a, b, c ∈ B:
(M1) d(a, b) ≥ 0, with equality if and only if a = b;
(M2) d(a, b) = d(b, a);
(M3) d(a, c) ≤ d(a, b) + d(b, c).
The most familiar metric spaces are $\mathbb{R}$ with the metric $d(a,b) = |a-b|$ and $\mathbb{R}^2$ with the metric
$$d(a,b) = \left((a_1-b_1)^2 + (a_2-b_2)^2\right)^{1/2}, \quad a = (a_1,a_2),\ b = (b_1,b_2).$$
[Diagram: the points $a = (a_1,a_2)$ and $b = (b_1,b_2)$ in the plane, with the distance $d(a,b)$ between them.]
This example explains why (M3) is called the triangle inequality.


More generally, $\mathbb{R}^n$ is a metric space with
$$d(a,b) = \left(\sum_{i=1}^{n}(a_i-b_i)^2\right)^{1/2}, \quad a = (a_1,\dots,a_n),\ b = (b_1,\dots,b_n). \tag{1}$$

For general n it is not quite so obvious that (M3) holds. The proof is given
later when we introduce a large family of metric spaces. Before that we recall
a number of definitions and results for future reference. No proofs are given
as these results are quite standard.
Convergence A sequence $a_n$, $n = 1, 2, \dots$, in $B$ is convergent with limit $a^*$ if $d(a_n, a^*) \to 0$ as $n \to \infty$.
Closed set A subset $F$ of $B$ is closed if every convergent sequence $a_n$, $n = 1, 2, \dots$, in $F$ has its limit in $F$. For example, the closed ball
$$\{b \in B : d(a,b) \le r\}, \quad r > 0,$$
is a closed set.
Open set A subset $E$ of $B$ is open if $B\setminus E$ is closed. For example, the open ball
$$\{b \in B : d(a,b) < r\}, \quad r > 0,$$
is an open set.
Compact set A subset $K$ of $B$ is compact if every sequence $a_n$, $n = 1, 2, \dots$, in $K$ has a convergent subsequence $a_{n_k}$, $k = 1, 2, \dots$, whose limit $a$ is in $K$. For example, every finite set is compact. In $\mathbb{R}^n$ with the metric $d$ given by equation (1), every closed set which is also bounded (i.e. lies inside some fixed closed ball) is compact. Note that every compact set is closed.

Continuous function A function $\phi : (B,d) \to (B',d')$ is continuous at $a \in B$ if for each $\varepsilon > 0$ there is a $\delta > 0$ such that
$$d(a,b) < \delta \implies d'(\phi(a), \phi(b)) < \varepsilon$$
(equivalently: for each sequence $a_n \to a$ in $B$, we have $\phi(a_n) \to \phi(a)$). We say that $\phi : (B,d) \to (B',d')$ is continuous if $\phi$ is continuous at each $a \in B$.
Uniformly continuous function A function $\phi : (B,d) \to (B',d')$ is uniformly continuous on $B$ if for each $\varepsilon > 0$ there is a $\delta > 0$ such that, for all $a, b \in B$,
$$d(a,b) < \delta \implies d'(\phi(a), \phi(b)) < \varepsilon.$$
Extreme Value Theorem If $\phi : (B,d) \to (\mathbb{R},d')$ is continuous (where $d'(a,b) = |a-b|$), then $\phi$ attains a maximum value and a minimum value on any compact subset $K$ of $B$.
Uniform Continuity Theorem If $\phi : (B,d) \to (B',d')$ is continuous, then $\phi$ is uniformly continuous on any compact subset $K$ of $B$.

3. Theorem 1.1. The proof can be shortened. You can omit the second sentence and the word ‘Otherwise’ from the third sentence, and then use the notation $a^*$ in place of $a^+$. Note that Powell uses ‘limitpoint’ to mean the limit of a convergent subsequence.
The following picture may be helpful.
[Diagram: a sequence $a_1, a_2, a_3, a_4, \dots$ in $A$ clustering at $a^*$, the point of $A$ closest to $f$ in $B$.]

4. The set $A$ discussed after Theorem 1.1 is not compact; for example, the sequence $\left(1 - \frac1n, 0\right)$ lies in $A$ but has no subsequence which converges to a limit in $A$. If $f = (2,0)$, say, then for each $a \in A$ we can find an $a' \in A$ which is closer to $f$ than $a$.

Self-assessment questions
S1 Consider the problem of fitting the data in Figure 1.2 by a straight line. Show that the set $A$ of vectors $(p(x_1), \dots, p(x_5))$, arising from functions $p(x) = c_0 + c_1 x$, forms a 2-dimensional subspace of $\mathbb{R}^5$.

S2 Prove the following generalisation of Theorem 1.1. If $A_1$ and $A_2$ are compact subsets of $B$, then there exist $a_1^*$ in $A_1$ and $a_2^*$ in $A_2$ such that
$$d(a_1^*, a_2^*) = \inf\{d(a_1, a_2) : a_1 \in A_1,\ a_2 \in A_2\}.$$

Study Session 2: Approximation in a normed linear
space

Read Sections 1.3, 1.4 and 1.5

Commentary
1. Almost every metric space in Powell arises as a normed linear space (n.l.s.). This is a linear space $B$ (also called a vector space) with an associated norm $\|a\|$, $a \in B$, such that, for all $a, b \in B$ and $\lambda \in \mathbb{R}$:
(N1) $\|a\| \ge 0$, with equality if and only if $a = 0$;
(N2) $\|\lambda a\| = |\lambda|\,\|a\|$;
(N3) $\|a + b\| \le \|a\| + \|b\|$.
Roughly speaking, the norm measures how large the element $a$ is, that is, how far $a$ lies from the zero element of the space.
By defining
$$d(a,b) = \|a - b\|,$$
we find that $(B,d)$ is a metric space. Properties (M1) and (M2) are immediate, as is (M3), since, by linearity and (N3),
$$\|a - c\| = \|(a-b) + (b-c)\| \le \|a-b\| + \|b-c\|.$$
For this reason, (N3) is also called the triangle inequality.
Powell gives some important examples of norms in Section 1.4. Two of these have useful geometric interpretations:
$$\|f\|_\infty = \max_{a\le x\le b}|f(x)| \quad\text{(the maximum vertical separation of the graph from the $x$-axis)},$$
$$\|f\|_1 = \int_a^b |f(x)|\,dx \quad\text{(the total area between the graph and the $x$-axis)}.$$
The corresponding metrics have similar geometric interpretations:
$$\|f-g\|_\infty = \max_{a\le x\le b}|f(x)-g(x)| \quad\text{(maximum vertical separation between the graphs)},$$
$$\|f-g\|_1 = \int_a^b |f(x)-g(x)|\,dx \quad\text{(total shaded area between the graphs)}.$$
The 2-norm
$$\|f\|_2 = \left(\int_a^b f(x)^2\,dx\right)^{1/2}$$
has no convenient geometric interpretation.


For each of these norms, properties (N1) and (N2) are evident, but (N3) requires some work. We give here the argument for the continuous 2-norm above. The proof is in two stages.
(I) Cauchy–Schwarz Inequality
$$\left|\int_a^b f(x)g(x)\,dx\right| \le \|f\|_2\,\|g\|_2, \quad\text{where } f, g \in C[a,b].$$
Proof If $\|f\|_2 = 0$ or $\|g\|_2 = 0$, then the result is clear. Otherwise, we use the inequality
$$\sqrt{AB} \le \frac{A+B}{2}, \quad\text{for } A, B \ge 0, \tag{2}$$
with $A = f(x)^2/\|f\|_2^2$ and $B = g(x)^2/\|g\|_2^2$. Integration gives
$$\frac{1}{\|f\|_2\,\|g\|_2}\int_a^b |f(x)g(x)|\,dx \le \frac{1}{2\|f\|_2^2}\int_a^b f(x)^2\,dx + \frac{1}{2\|g\|_2^2}\int_a^b g(x)^2\,dx = 1.$$
The desired inequality now follows from
$$\left|\int_a^b f(x)g(x)\,dx\right| \le \int_a^b |f(x)g(x)|\,dx.$$
Remark Another proof of the Cauchy–Schwarz inequality appears in the notes for Chapter 2.
(II) Minkowski's Inequality
$$\|f+g\|_2 \le \|f\|_2 + \|g\|_2, \quad\text{where } f, g \in C[a,b].$$
Proof If $\|f+g\|_2 = 0$, then the result is clear. Otherwise, note that
$$\|f+g\|_2^2 = \int_a^b (f(x)+g(x))^2\,dx \le \int_a^b |f(x)+g(x)|\,|f(x)|\,dx + \int_a^b |f(x)+g(x)|\,|g(x)|\,dx \le \|f+g\|_2\,\|f\|_2 + \|f+g\|_2\,\|g\|_2,$$
by the Cauchy–Schwarz inequality. The desired inequality now follows on dividing by $\|f+g\|_2$.
As you should have realised, Minkowski's inequality is just (N3) for the 2-norm $\|f\|_2$. The proof of (N3) for the discrete 2-norm is similar, proceeding via the discrete version of the Cauchy–Schwarz inequality:
$$\left|\sum_{i=1}^{m} a_i b_i\right| \le \left(\sum_{i=1}^{m} a_i^2\right)^{1/2}\left(\sum_{i=1}^{m} b_i^2\right)^{1/2}.$$
The metric on $\mathbb{R}^n$ which arises from the discrete 2-norm is precisely that defined in equation (1) of the commentary for Study Session 1.


The argument to prove property (N3) for
$$\|f\|_p = \left(\int_a^b |f(x)|^p\,dx\right)^{1/p},$$
if $1 < p < \infty$, is similar to the case $p = 2$. Instead of the Cauchy–Schwarz inequality, we need the more general Hölder inequality
$$\left|\int_a^b f(x)g(x)\,dx\right| \le \|f\|_p\,\|g\|_q,$$
where $p^{-1} + q^{-1} = 1$, whose proof is based on the inequality
$$A^{1/p}B^{1/q} \le \frac{A}{p} + \frac{B}{q}, \quad A, B \ge 0.$$
Since Powell uses only the 1-norm, 2-norm and ∞-norm, we omit the details.

2. Theorem 1.2 is of fundamental importance, and the proof looks straightforward (note the use of the ‘backwards’ form of the triangle inequality $\|a - f\| \ge \|a\| - \|f\|$ in (1.15), which is equivalent to $\|a\| \le \|a-f\| + \|f\|$). However, the second sentence is not quite so transparent as it may appear. A closed ball in $\mathbb{R}^n$ is certainly compact, but this does not immediately imply that a closed ball in a finite-dimensional subspace of an n.l.s. is compact. The proof of this fact is a little tricky but it illustrates how ‘analysis’ and ‘linear algebra’ interact in this subject.
We prove that if $A$ is a finite-dimensional subspace of an n.l.s. and $M \ge 0$, then
$$A_M = \{a \in A : \|a\| \le M\}$$
is compact, by showing that any sequence $a_m$, $m = 1, 2, \dots$, in $A_M$ has a convergent subsequence, whose limit must be in $A_M$ (since $A_M$ is closed).
[Diagram: $A_M$ is the intersection of the subspace $A$ with the ball $\{a \in B : \|a\| \le M\}$.]
Let $b_1, \dots, b_n$ be a basis for $A$ and write each $a_m$ in the form
$$a_m = \lambda_{1m}b_1 + \dots + \lambda_{nm}b_n, \quad \lambda_{1m}, \dots, \lambda_{nm} \in \mathbb{R}.$$
The result follows if we can show that the sequence $\lambda_m = (\lambda_{1m}, \dots, \lambda_{nm})$ of coefficient vectors has a convergent subsequence in $\mathbb{R}^n$, since the function $\phi : \mathbb{R}^n \to A$ given by
$$\phi(\lambda) = \lambda_1 b_1 + \dots + \lambda_n b_n, \quad (\lambda_1, \dots, \lambda_n) \in \mathbb{R}^n,$$
is continuous.
The normalised coefficient vectors $\mu_m = \lambda_m/\|\lambda_m\|_2$ lie on the unit sphere $\{\mu \in \mathbb{R}^n : \|\mu\|_2 = 1\}$ and so have a convergent subsequence $\mu_{m_k}$ with limit $\mu^*$, say, where $\|\mu^*\|_2 = 1$. By the continuity of $\phi$, once again,
$$\mu_1^* b_1 + \dots + \mu_n^* b_n = \lim_{k\to\infty}\left(\mu_{1m_k}b_1 + \dots + \mu_{nm_k}b_n\right) = \lim_{k\to\infty} a_{m_k}/\|\lambda_{m_k}\|_2.$$
Hence $\|\lambda_{m_k}\|_2 \not\to \infty$ as $k \to \infty$ (for otherwise, since $\|a_{m_k}\| \le M$, the above limit is 0, which contradicts the linear independence of $b_1, \dots, b_n$). We may assume, therefore (by taking a further subsequence), that $\|\lambda_{m_k}\|_2$ is convergent, and deduce that $\lambda_{m_k} = \|\lambda_{m_k}\|_2\,\mu_{m_k}$ is convergent, as required.

3. In the proof of Theorem 1.3, the Cauchy–Schwarz inequality is applied with $f = |e|$ and $g = 1$.

Self-assessment questions
S3 Prove that the function $\phi$ in the above proof is continuous.

S4 Prove that $\|f\|_1$, $f \in C[a,b]$, satisfies (N3).

S5 Prove that $\|f\|_\infty$, $f \in C[a,b]$, satisfies (N3).

S6 Give an alternative proof of Theorem 1.2 by considering
$$A_0 = \{a \in A : \|a - f\| \le \|f\|\}.$$

S7 Verify equations (1.24), (1.25) and (1.26).

S8 Let $A$ and $f$ be as in Figure 1.4. Determine $\inf_{a\in A}\|f - a\|$ for the 1-norm, the 2-norm and the ∞-norm.

Problems for Chapter 1


P1 Use first principles to find:
(a) the best approximations to f (x) = ex on [0, 1] by a constant, in L1 , L2
and L∞ ;
(b) the best approximation to f (x) = x2 on [0, 1] by a linear function
p(x) = ax, in L∞ .

P2 Powell Exercise 1.5

P3 Powell Exercise 1.6

P4 Powell Exercise 1.7

P5 Powell Exercise 1.1 (Hint: choose a suitable compact subset of A1 and apply
SAQ S2.)

Solutions to SAQs in Chapter 1


S1 Since
$$p(x_i) = c_0 + c_1 x_i, \quad i = 1, 2, \dots, 5,$$
we have
$$(p(x_1), p(x_2), p(x_3), p(x_4), p(x_5)) = c_0(1,1,1,1,1) + c_1(x_1, x_2, x_3, x_4, x_5).$$
Now $(1,1,1,1,1)$ and $(x_1, x_2, x_3, x_4, x_5)$ are fixed vectors in $\mathbb{R}^5$, which are not linearly dependent, and $c_0$, $c_1$ can take any real values. Hence the set of vectors $(p(x_1), p(x_2), p(x_3), p(x_4), p(x_5))$ forms a 2-dimensional subspace of $\mathbb{R}^5$.

S2 Let $d^* = \inf\{d(a_1, a_2) : a_1 \in A_1,\ a_2 \in A_2\}$ and choose sequences $a_{1n} \in A_1$, $a_{2n} \in A_2$, $n = 1, 2, \dots$, such that
$$\lim_{n\to\infty} d(a_{1n}, a_{2n}) = d^*.$$
By the compactness of $A_1$ and $A_2$, we can choose common subsequences $a_{1n_k}$, $a_{2n_k}$, $k = 1, 2, \dots$, such that
$$\lim_{k\to\infty} a_{1n_k} = a_1^* \quad\text{and}\quad \lim_{k\to\infty} a_{2n_k} = a_2^*,$$
with $a_1^* \in A_1$ and $a_2^* \in A_2$ (first choose a convergent subsequence $a_{1n_k}$ of $a_{1n}$ and then, if necessary, a subsequence of $a_{2n_k}$).
Now, by the triangle inequality,
$$d(a_1^*, a_2^*) \le d(a_1^*, a_{1n_k}) + d(a_{1n_k}, a_{2n_k}) + d(a_{2n_k}, a_2^*).$$
Letting $k \to \infty$, we deduce that $d(a_1^*, a_2^*) \le d^*$, so that $d(a_1^*, a_2^*) = d^*$, as required.

S3 If $\lambda = (\lambda_1, \dots, \lambda_n)$, $\mu = (\mu_1, \dots, \mu_n) \in \mathbb{R}^n$, then
$$\phi(\lambda) - \phi(\mu) = (\lambda_1 - \mu_1)b_1 + \dots + (\lambda_n - \mu_n)b_n.$$
Hence, since $|\lambda_i - \mu_i| \le \|\lambda - \mu\|_2$ for each $i$,
$$\|\phi(\lambda) - \phi(\mu)\| \le |\lambda_1 - \mu_1|\,\|b_1\| + \dots + |\lambda_n - \mu_n|\,\|b_n\| \le \|\lambda - \mu\|_2\, n\max_{1\le i\le n}\|b_i\| = K\|\lambda - \mu\|_2,$$
say. Hence
$$\|\lambda - \mu\|_2 < \varepsilon/K \implies \|\phi(\lambda) - \phi(\mu)\| < \varepsilon,$$
which proves that $\phi$ is (uniformly) continuous on $\mathbb{R}^n$.

S4 Let $f, g \in C[a,b]$. Since
$$|f(x) + g(x)| \le |f(x)| + |g(x)|,$$
for all $x \in [a,b]$, we deduce that
$$\int_a^b |f(x)+g(x)|\,dx \le \int_a^b |f(x)|\,dx + \int_a^b |g(x)|\,dx,$$
that is,
$$\|f+g\|_1 \le \|f\|_1 + \|g\|_1,$$
as required.

S5 Let $f, g \in C[a,b]$ and suppose that
$$\max_{a\le x\le b}|f(x)+g(x)| = |f(c)+g(c)|,$$
where $c \in [a,b]$ (the maximum is attained because the function $x \mapsto |f(x)+g(x)|$ is continuous on $[a,b]$). Then
$$|f(c)+g(c)| \le |f(c)| + |g(c)| \le \max_{a\le x\le b}|f(x)| + \max_{a\le x\le b}|g(x)|,$$
that is,
$$\|f+g\|_\infty \le \|f\|_\infty + \|g\|_\infty,$$
as required.

S6 The set $A_0$ is compact, being the intersection of a closed ball in $B$ with $A$ and hence a closed subset of a compact set. Thus we can, by Theorem 1.1, choose $a^* \in A_0$ such that
$$\|a - f\| \ge \|a^* - f\|, \quad a \in A_0.$$
To see that
$$\|a - f\| \ge \|a^* - f\|, \quad a \in A,$$
note that if $a \in A\setminus A_0$, then
$$\|a - f\| > \|0 - f\| \ge \|a^* - f\|,$$
since $0 \in A_0$.
S7 Since $\lambda > 0$, $1 - x^\lambda \ge 0$ for $0 \le x \le 1$, so that
$$\|e\|_1 = \int_0^1 (1 - x^\lambda)\,dx = \left[x - \frac{x^{\lambda+1}}{\lambda+1}\right]_0^1 = \frac{\lambda}{\lambda+1},$$
$$\|e\|_2^2 = \int_0^1 (1-x^\lambda)^2\,dx = \int_0^1 \left(1 - 2x^\lambda + x^{2\lambda}\right)dx = \left[x - \frac{2x^{\lambda+1}}{\lambda+1} + \frac{x^{2\lambda+1}}{2\lambda+1}\right]_0^1 = \frac{2\lambda^2}{(\lambda+1)(2\lambda+1)},$$
$$\|e\|_\infty = \max_{0\le x\le 1}|1 - x^\lambda| = 1.$$

S8 $\inf_{a\in A}\|f-a\|_1 = 1$; $\inf_{a\in A}\|f-a\|_2 = 1/\sqrt{2}$; $\inf_{a\in A}\|f-a\|_\infty = \frac12$.

Solutions to Problems in Chapter 1


P1 (a) Let $p(x) = c$ be a constant approximation to $f(x) = e^x$. It is clear that the minimum of $\|f-p\|_1$ occurs when $1 \le c \le e$. Since $e^x - c = 0$ when $x = \log c$, with $e^x - c < 0$ for $x < \log c$, we have
$$\|f-p\|_1 = \int_0^1 |e^x - c|\,dx = \int_0^{\log c}(c - e^x)\,dx + \int_{\log c}^1 (e^x - c)\,dx = 2c\log c - 3c + e + 1.$$
The minimum of this expression occurs when
$$2(\log c + 1) - 3 = 0 \implies c = \sqrt{e}.$$
Hence $p(x) = \sqrt{e}$ is the best $L_1$ approximation, with $\|f-p\|_1 = (\sqrt{e} - 1)^2$.
To find the best approximation in $L_2$ we minimise
$$\|f-p\|_2^2 = \int_0^1 (e^x - c)^2\,dx = c^2 - 2c(e-1) + \frac{e^2-1}{2} = (c - (e-1))^2 + \frac{e^2-1}{2} - (e-1)^2.$$
Evidently the minimum occurs for $c = e-1$, and so $p(x) = e-1$ is the best $L_2$ approximation, with $\|f-p\|_2 = \sqrt{\tfrac12(3-e)(e-1)}$.
The best $L_\infty$ approximation again occurs when $1 \le c \le e$, and the maximum error occurs at the ends of the interval, so that
$$\max_{x\in[0,1]}|e^x - c| = c - 1 = e - c \implies c = \tfrac12(e+1).$$
Hence $p(x) = \tfrac12(e+1)$ is the best $L_\infty$ approximation, with $\|f-p\|_\infty = \tfrac12(e-1)$.
(b) The best $L_\infty$ approximation to $f(x) = x^2$ on $[0,1]$ by $p(x) = ax$ occurs when $0 < a < 1$. For a given value of $a$, $0 < a < 1$, there are two candidates for the point $x$ which maximises $|x^2 - ax|$, namely the point $x = 1$ and the point where $x^2 - ax$ is at a minimum, which satisfies
$$2x - a = 0 \implies x = a/2.$$
Now if $x = 1$, then
$$|x^2 - ax| = |1 - a| = 1 - a,$$
and if $x = \tfrac12 a$, then
$$|x^2 - ax| = \left|\tfrac12 a\left(\tfrac12 a - a\right)\right| = \tfrac14 a^2.$$
Since $1 - a$ is decreasing and $\tfrac14 a^2$ is increasing, we deduce that the best $L_\infty$ approximation occurs when
$$1 - a = \frac{a^2}{4} \implies a = 2(\sqrt{2} - 1) \approx 0.8284.$$
Hence the best $L_\infty$ approximation is $p(x) = 2(\sqrt{2} - 1)x$, with $\|f-p\|_\infty = 3 - 2\sqrt{2}$.
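The three best constants in part (a) are easy to check numerically. The sketch below (Python, using a crude grid search and Riemann sums; all grid sizes are arbitrary choices, not part of the course) recovers each claimed optimum to about three decimal places.

    import math

    xs = [i / 1000 for i in range(1001)]          # grid on [0, 1]
    fvals = [math.exp(x) for x in xs]

    def err(c, p):
        # Discretised L_p error of the constant c (p = 1, 2 or 'inf').
        d = [abs(fx - c) for fx in fvals]
        if p == 'inf':
            return max(d)
        return (sum(v ** p for v in d) / 1000) ** (1 / p)

    for p, claimed in [(1, math.sqrt(math.e)),
                       (2, math.e - 1),
                       ('inf', (math.e + 1) / 2)]:
        best = min((err(c / 1000, p), c / 1000) for c in range(1000, 2719))
        print(p, claimed, best[1])   # grid optimum should be close to the claim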

P2 Here $A$ is the set of continuous piecewise-linear functions on $[a,b]$. Given $f \in C[a,b]$ and $\varepsilon > 0$ we must construct such a piecewise-linear function $g$, with
$$\|f - g\|_\infty < \varepsilon.$$
(Note that the letter $a$ has two meanings in the question.)
Since $f$ must be uniformly continuous on $[a,b]$, there exists $\delta > 0$ such that
$$|x - y| < \delta \implies |f(x) - f(y)| < \varepsilon. \tag{3}$$
Now choose $n > (b-a)/\delta$ and define $x_k = a + (b-a)(k/n)$, for $k = 0, 1, \dots, n$. Notice that $x_{k+1} - x_k = (b-a)/n < \delta$.
Next define $g(x_k) = f(x_k)$, for $k = 0, 1, \dots, n$, and extend the function $g$ linearly to each interval $[x_k, x_{k+1}]$ by the formula
$$g(x) = \frac{(x_{k+1} - x)f(x_k) + (x - x_k)f(x_{k+1})}{x_{k+1} - x_k}.$$
[Diagram: the graph $y = f(x)$ and its piecewise-linear interpolant $y = g(x)$ on $a = x_0 < x_1 < \dots < x_n = b$.]
We claim that $\|f - g\|_\infty < \varepsilon$, that is,
$$|f(x) - g(x)| < \varepsilon, \quad x_k \le x \le x_{k+1},\ k = 0, 1, \dots, n-1.$$
This holds because
$$|f(x) - f(x_k)| < \varepsilon \quad\text{and}\quad |f(x) - f(x_{k+1})| < \varepsilon, \quad x_k \le x \le x_{k+1},$$
by condition (3), and the values of $g(x)$, $x_k \le x \le x_{k+1}$, lie between $f(x_k)$ and $f(x_{k+1})$. Thus if $f \in C[a,b]$ but $f \notin A$, then there is no best approximation to $f$ from $A$ because for any $\varepsilon > 0$ there exists $g \in A$ such that $\|f - g\|_\infty < \varepsilon$. This example shows that Theorem 1.2 is false if we drop the hypothesis that $A$ be finite-dimensional.
P3 Let us take $[a,b] = [0,1]$ for simplicity. The example can always be adapted to $[a,b]$ by a translation.
Consider first the example $e(x) = 1 - x^\lambda$, $0 \le x \le 1$. Equations (1.24) and (1.25) give
$$\frac{\|e\|_2}{\|e\|_1} = \sqrt{\frac{2\lambda+2}{2\lambda+1}},$$
which shows that
$$1 \le \frac{\|e\|_2}{\|e\|_1} \le \sqrt{2}, \quad 0 \le \lambda < \infty.$$
However, if we allow negative values of $\lambda$ then $\|e\|_2/\|e\|_1$ becomes unbounded as $\lambda$ tends to $-\frac12$ from above. Of course, $x^\lambda$ is not continuous on $[0,1]$ for negative values of $\lambda$, but this observation suggests a possible ‘shape’ for our example.
Consider instead the continuous function
$$f_\varepsilon(x) = \begin{cases} 1 - x/\varepsilon, & 0 \le x \le \varepsilon,\\ 0, & \varepsilon < x \le 1,\end{cases}$$
where $0 < \varepsilon < 1$. We have
$$\|f_\varepsilon\|_1 = \int_0^\varepsilon (1 - x/\varepsilon)\,dx = \left[x - \frac{x^2}{2\varepsilon}\right]_0^\varepsilon = \varepsilon/2,$$
$$\|f_\varepsilon\|_2^2 = \int_0^\varepsilon (1 - x/\varepsilon)^2\,dx = \left[x - \frac{x^2}{\varepsilon} + \frac{x^3}{3\varepsilon^2}\right]_0^\varepsilon = \varepsilon/3.$$
Hence $\|f_\varepsilon\|_2/\|f_\varepsilon\|_1 = 2/\sqrt{3\varepsilon} \to \infty$ as $\varepsilon \to 0$.

P4 (i) The unit ball for the 1-norm in $\mathbb{R}^3$ is a regular octahedron centred at the origin. The part of its boundary in the first octant has equation $x + y + z = 1$. Thus, as $r$ increases, $\{a : \|a\|_1 \le r\}$ first meets $3x + 2y + z = 6$ at the point $(2, 0, 0)$, which is the closest point to the origin with respect to the 1-norm.
(ii) The unit ball for the 2-norm in $\mathbb{R}^3$ is the ordinary ball centred at the origin. As $r$ increases, $\{a : \|a\|_2 \le r\}$ first meets $3x + 2y + z = 6$ at a point $(x, y, z)$ whose normal (to the plane) passes through the origin. Since the line $\{(3k, 2k, k) : k \in \mathbb{R}\}$ is normal to the plane, we solve
$$3(3k) + 2(2k) + k = 6 \implies k = 3/7.$$
Thus $(9/7, 6/7, 3/7)$ is the closest point to the origin with respect to the 2-norm.
(iii) The unit ball for the ∞-norm in $\mathbb{R}^3$ is the cube with vertices $(\pm1, \pm1, \pm1)$. As $r$ increases, $\{a : \|a\|_\infty \le r\}$ first meets $3x + 2y + z = 6$ at a point of the form $(k, k, k)$, $k > 0$. Thus $(1, 1, 1)$ is the closest point to the origin with respect to the ∞-norm.
P5 The idea, as in Theorem 1.2, is to choose a compact subset $A_2$ of $A_1$ which must contain the point of $A_1$ which is closest to $A_0$. For example, we can choose
$$A_2 = A_1 \cap \{a : \|a\| \le 2M\},$$
where $M$ is so large that
$$A_0 \subseteq \{a : \|a\| \le M\}.$$
Then choose $a_0^* \in A_0$ and $a_1^* \in A_2$ (see SAQ S2) such that
$$\|a_0^* - a_1^*\| \le \|a_0 - a_1\|, \quad a_0 \in A_0,\ a_1 \in A_2.$$
To prove that
$$\|a_0^* - a_1^*\| \le \|a_0 - a_1\|, \quad a_0 \in A_0,\ a_1 \in A_1,$$
note that if $a_0 \in A_0$ and $a_1 \in A_1\setminus A_2$, then
$$\|a_0 - a_1\| \ge \|a_1\| - \|a_0\| > 2M - M \ge \|a_0^*\| = \|a_0^* - 0\| \ge \|a_0^* - a_1^*\|,$$
as required.
Chapter 2 The uniqueness of best
approximation
Ideally the method used to choose an approximation from a set A to a given
function f should give a unique answer. This chapter is devoted to the study of
those conditions under which a best approximation from A to f is unique.
Important new concepts introduced include ‘convexity’ and ‘scalar product’.
This chapter splits into TWO study sessions:
Study session 1: Sections 2.1 and 2.2.
Study session 2: Sections 2.3 and 2.4.

Study Session 1: Convexity

Read Sections 2.1 and 2.2

Commentary
1. The following diagrams illustrate the notion of a convex set and a strictly convex set.
[Diagrams: a set that is not convex, a set that is convex (but not strictly convex), and a strictly convex set.]
A point $a$ is interior to a set $A$ in a metric space if the open ball $\{b : d(a,b) < r\}$ is contained in $A$, for some $r > 0$.

2. In the proof of Theorem 2.1, there is no need for modulus signs around $\theta$ and $1 - \theta$, since both quantities are positive.

3. The following diagram illustrates the proof of Theorem 2.3.
[Diagram: points $s_0$, $s_1$ in $A$, their midpoint $\frac12(s_0+s_1)$, and the point $s = \frac12(s_0+s_1) + \lambda\left(f - \frac12(s_0+s_1)\right)$.]

Note that the number $\lambda \ge 0$, which appears in (2.6), does not need to be maximal. All that is required is $\lambda > 0$ and $s \in A$.
4. The following diagram illustrates the proof of Theorem 2.4.
[Diagram: the neighbourhood $N(f, h^*)$ of $f$ in $B$ meeting the set $A$, with the points $s_0$, $s_1$ and their midpoint $\frac12(s_0+s_1)$.]

Note that any finite-dimensional subspace $A$ is convex, but not compact (consider the sequence $na$, $n = 1, 2, \dots$, where $a \ne 0$).

Self-assessment questions
S1 Which of the unit balls in Figure 1.5 (page 10) are strictly convex?

S2 Prove that
(a) the intersection of two convex sets is convex;
(b) the intersection of two strictly convex sets is strictly convex.

S3 Show that the norms (a) $\|\cdot\|_1$ and (b) $\|\cdot\|_\infty$ are not strictly convex on $C[0,1]$.

Study Session 2: Best approximation operators

Read Sections 2.3 and 2.4

Commentary

1. The following diagram illustrates the definition of the best approximation operator $X$.
[Diagram: $f$ in $B$ mapped to its best approximation $a^* = X(f)$ in $A$.]

A projection operator is one for which X(X(f )) = X(f ), that is, the best
approximation from A to a point a ∈ A is a itself.

2. The final comment in Section 2.3 relates to the earlier comment on the
importance of the continuity of the best approximation operator to computer
calculations.

3. A scalar product (or inner product) on a linear space B is a real-valued
function (a, b), a, b ∈ B, such that for all a, b, c ∈ B and λ, µ ∈ R:
(S1) (a, a) ≥ 0, with equality if and only if a = 0;
(S2) (a, b) = (b, a);
(S3) (a, λb + µc) = λ(a, b) + µ(a, c).
Two important scalar products are given in Section 2.4.
In any linear space with a scalar product we can define a norm by the equation
$$\|a\| = \sqrt{(a,a)}.$$
As usual, only the proof of the triangle inequality requires any work; it follows from a version of the Cauchy–Schwarz inequality (see SAQ S6):
$$|(a,b)| \le \|a\|\,\|b\|, \quad a, b \in B,$$
together with the identity
$$\|a+b\|^2 = \|a\|^2 + 2(a,b) + \|b\|^2,$$
which you can easily verify.
We shall meet other examples of scalar products in Chapter 11.
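A quick numerical illustration of SAQs S5 and S6 below may be helpful. The sketch (Python; the weight $w$, the functions $f$, $g$ and the midpoint quadrature rule are all arbitrary choices made for this example) approximates the weighted scalar product of S5 and confirms the Cauchy–Schwarz inequality for one pair of functions.

    import math

    def scalar(f, g, w, a=0.0, b=1.0, n=1000):
        # Approximate (f, g) = integral of w(x) f(x) g(x) over [a, b]
        # by the midpoint rule with n subintervals.
        h = (b - a) / n
        return sum(w(a + (i + 0.5) * h) * f(a + (i + 0.5) * h) * g(a + (i + 0.5) * h)
                   for i in range(n)) * h

    w = lambda x: 1 + x            # a positive weight on [0, 1]
    f = lambda x: math.sin(x)
    g = lambda x: math.exp(x)

    lhs = abs(scalar(f, g, w))
    rhs = math.sqrt(scalar(f, f, w)) * math.sqrt(scalar(g, g, w))
    print(lhs <= rhs, lhs, rhs)    # Cauchy-Schwarz: |(f,g)| <= ||f|| ||g||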

4. The proof of Theorem 2.7 is perhaps more clearly written as follows.


If $f \ne g$ and $\|f\|_2 = \|g\|_2 = 1$, then we have
$$\|f - g\|_2^2 = 1 - 2(f,g) + 1 > 0 \implies (f,g) < 1,$$
and hence
$$\|\theta f + (1-\theta)g\|_2^2 = \theta^2 + 2\theta(1-\theta)(f,g) + (1-\theta)^2 < \theta^2 + 2\theta(1-\theta) + (1-\theta)^2 = 1.$$

5. The norms
$$\|f\|_p = \left(\int_a^b |f(x)|^p\,dx\right)^{1/p}, \quad 1 < p < \infty,$$
are all strictly convex. The proof (for $p \ne 2$) depends on a careful study of the possibility of equality in Hölder's inequality.

Self-assessment questions
S4 Prove Theorem 2.6. (Hint: consider $A_0 = \{a \in A : \|a\| \le 4\|f\|\}$.)

S5 Let $w$ be a positive function in $C[a,b]$. Prove that
$$(f,g) = \int_a^b w(x)f(x)g(x)\,dx$$
is a scalar product on $C[a,b]$.

S6 Prove the Cauchy–Schwarz inequality for scalar products by considering the


discriminant of the quadratic expression
(a + λb, a + λb), λ ∈ R.

S7 Draw figures to illustrate the non-uniqueness of best approximation in the


four examples on pages 18–19.

Problems for Chapter 2
P1 Powell Exercise 2.4

P2 Powell Exercise 2.5

P3 Powell Exercise 2.6

P4 Powell Exercise 2.8

P5 Powell Exercise 2.1

Solutions to SAQs in Chapter 2


S1 Only the unit ball in the 2-norm is strictly convex.

S2 (a) Let S and T be convex sets. If s0 , s1 ∈ S ∩ T , then the points


s = θs0 + (1 − θ)s1 , 0 < θ < 1,
also lie in both S and T (since S and T are convex) and hence in S ∩ T .
Thus S ∩ T is convex.
(b) The proof is as above, but in addition we observe that if s is an interior
point of both S and T then s is an interior point of S ∩ T .

S3 (a) Consider $f(x) = 2x$ and $g(x) = 2(1-x)$ on $[0,1]$. Clearly $\|f\|_1 = \|g\|_1 = 1$, but
$$\left(\tfrac12(f+g)\right)(x) = 1, \quad 0 \le x \le 1,$$
so that $\left\|\tfrac12(f+g)\right\|_1 = 1$. Hence $\|\cdot\|_1$ is not strictly convex.
(b) Consider $f(x) = 1$ and $g(x) = x$ on $[0,1]$. Clearly $\|f\|_\infty = 1$ and $\|g\|_\infty = 1$, but
$$\left(\tfrac12(f+g)\right)(x) = \tfrac12(1+x), \quad 0 \le x \le 1,$$
so that $\left\|\tfrac12(f+g)\right\|_\infty = 1$. Hence $\|\cdot\|_\infty$ is not strictly convex.

S4 The idea, as in Theorem 1.2, is to consider a compact subset of $A$ which is large enough to contain best approximations to all points in a neighbourhood of $f$, and then apply Theorem 2.5.
For example, if we take
$$A_0 = \{a \in A : \|a\| \le 4\|f\|\},$$
then $A_0$ contains the best approximations (from $A$) to all points $g$ such that
$$\|g\| \le 2\|f\|$$
(because $A_0 \supseteq \{a \in A : \|a\| \le 2\|g\|\}$, for all such $g$).
Thus if $\|g\| \le 2\|f\|$, then the best approximation operator $X_0$ with respect to $A_0$ coincides with the best approximation operator $X$ with respect to $A$. Applying Theorem 2.5, we find that $X_0$ is continuous at $f$, and so therefore is $X$.

S5 (S1) $(f,f) = \int_a^b w(x)f(x)^2\,dx \ge 0$, since $w(x)f(x)^2 \ge 0$, $a \le x \le b$. Equality can occur only if $f(x)^2 = 0$, $a \le x \le b$, since $w(x) > 0$, $a \le x \le b$.
(S2) Obvious, by definition.
(S3)
$$(f, \lambda g + \mu h) = \int_a^b w(x)f(x)(\lambda g(x) + \mu h(x))\,dx = \lambda\int_a^b w(x)f(x)g(x)\,dx + \mu\int_a^b w(x)f(x)h(x)\,dx = \lambda(f,g) + \mu(f,h).$$

S6 Since, for all $\lambda \in \mathbb{R}$,
$$0 \le (a + \lambda b, a + \lambda b) = \|a\|^2 + 2\lambda(a,b) + \lambda^2\|b\|^2,$$
we must have $B^2 - 4AC \le 0$, where
$$A = \|b\|^2, \quad B = 2(a,b), \quad C = \|a\|^2.$$
Thus
$$4(a,b)^2 \le 4\|a\|^2\|b\|^2 \implies |(a,b)| \le \|a\|\,\|b\|,$$
as required.

S7 The figures are as follows.
[Diagrams for the four examples on pages 18–19: a continuous example with $f(x) = 1$ and $a(x) = \lambda x$, where $\|f-a\|_1 = 2$ for $|\lambda| \le 1$; a discrete example with $f = (1,1,1,1,1)$ and $a = \left(-\lambda, -\frac{\lambda}{2}, 0, \frac{\lambda}{2}, \lambda\right)$, where $\|f-a\|_1 = 5$ for $|\lambda| \le 1$; a continuous example with $a(x) = \lambda(1+x)$, where $\|f-a\|_\infty = 1$ for $0 \le \lambda \le 1$; and a discrete example with $a = \left(0, \frac{\lambda}{2}, \lambda, \frac{3\lambda}{2}, 2\lambda\right)$, where $\|f-a\|_\infty = 1$ for $0 \le \lambda \le 1$.]
Solutions to Problems in Chapter 2
P1 Since the unit ball in the ∞-norm is a square with sides parallel to the axes, the best approximation in $A$ to a point $f \in \mathbb{R}^2\setminus A$ is found as follows.
(a) If $f$ lies in one of the shaded sets, then $X(f)$ lies on the circle $\{a : \|a\|_2 = 1\}$ and on a (projection) line through $f$ at $45^\circ$ to the axes.
(b) If $f$ does not lie in one of the shaded sets, then $X(f)$ is the nearest of the four points $(\pm1, 0)$, $(0, \pm1)$.
[Diagram: the unit disc $A$, the four shaded diagonal regions, and the projections $X(f)$ and $X(g)$ of two points $f$, $g$.]
To prove directly (that is, without the help of Theorem 2.6) that $X$ is continuous, suppose first that $f_1$, $f_2$ lie in the shaded set in the first quadrant. Then
$$\|f_1 - f_2\|_\infty \ge d/\sqrt{2},$$
where $d$ is the (ordinary) distance between the $45^\circ$ projection lines through $f_1$ and $f_2$.
[Diagram: the projection lines through $f_1$ and $f_2$, the neighbourhoods $N(f_2, d/\sqrt{2})$ and $N(f_2, \|f_2 - f_1\|_\infty)$, and the images $X(f_1)$, $X(f_2)$ on the circle.]
Furthermore,
$$\|X(f_1) - X(f_2)\|_\infty \le \sqrt{2}\,d,$$
since the line segment joining $X(f_1)$ to $X(f_2)$ makes an angle of more than $45^\circ$ with the projection lines from $f_1$, $f_2$. Hence
$$\|X(f_1) - X(f_2)\|_\infty \le 2\|f_1 - f_2\|_\infty.$$
It follows that $X$ is continuous on the shaded sets. Since $X$ is constant on the four unshaded sets in $\mathbb{R}^2\setminus A$ (and these constant values agree with the values of $X$ on the boundaries between the shaded and unshaded sets in $\mathbb{R}^2\setminus A$) and $X$ is the identity on $A$ itself, we deduce that $X$ is continuous on the whole of $\mathbb{R}^2$.
P2 To prove that $X(f) = f/\|f\|$, if $\|f\| > 1$, we have to show that
$$\|f - g\| \ge \left\|f - f/\|f\|\right\|, \quad g \in A.$$
But, by the ‘backwards’ form of the triangle inequality,
$$\|f - g\| \ge \|f\| - \|g\| \ge \|f\| - 1 \quad (\text{since } g \in A)$$
$$= \|f\|\left(1 - 1/\|f\|\right) = \left\|f - f/\|f\|\right\|,$$
as required. (Where did we use the fact that $\|f\| > 1$?)
To prove that
$$\|X(f_1) - X(f_2)\| \le 2\|f_1 - f_2\|, \quad f_1, f_2 \in B,$$
it is sufficient to consider three cases.
Case 1 $\|f_1\| \le 1$, $\|f_2\| \le 1$. In this case $X(f_1) = f_1$ and $X(f_2) = f_2$, so that
$$\|X(f_1) - X(f_2)\| = \|f_1 - f_2\|.$$
Case 2 $\|f_1\| \le 1$, $\|f_2\| > 1$. In this case $X(f_1) = f_1$ and $X(f_2) = f_2/\|f_2\|$, so that
$$\begin{aligned}
\|X(f_1) - X(f_2)\| &= \left\|f_1 - \frac{f_2}{\|f_2\|}\right\| = \left\|f_1 - f_2 + f_2\left(1 - \frac{1}{\|f_2\|}\right)\right\|\\
&\le \|f_1 - f_2\| + \|f_2\|\left(1 - \frac{1}{\|f_2\|}\right) \quad (\text{since } \|f_2\| > 1)\\
&= \|f_1 - f_2\| + \|f_2\| - 1\\
&\le \|f_1 - f_2\| + \|f_2\| - \|f_1\| \quad (\text{since } \|f_1\| \le 1)\\
&\le 2\|f_1 - f_2\|.
\end{aligned}$$
Case 3 $\|f_1\| > 1$, $\|f_2\| \ge \|f_1\|$. In this case $X(f_1) = f_1/\|f_1\|$ and $X(f_2) = f_2/\|f_2\|$, so that
$$\begin{aligned}
\|X(f_1) - X(f_2)\| &= \left\|\frac{f_1}{\|f_1\|} - \frac{f_2}{\|f_2\|}\right\| = \frac{1}{\|f_1\|}\left\|f_1 - f_2 + f_2\left(1 - \frac{\|f_1\|}{\|f_2\|}\right)\right\|\\
&\le \|f_1 - f_2\| + \|f_2\| - \|f_1\| \quad (\text{since } \|f_1\| > 1)\\
&\le 2\|f_1 - f_2\|.
\end{aligned}$$

P3 First we remark that the sum of two norms on a linear space is also a norm on that space.
To prove that
$$\|f\| = \|f\|_1 + \|f\|_\infty, \quad f \in C[-\pi, \pi],$$
is not strictly convex, let $A$ be the 1-dimensional subspace of functions of the form
$$g(x) = \lambda\sin^2 x, \quad -\pi \le x \le \pi,$$
where $\lambda \in \mathbb{R}$, and let $f(x) = x$, $-\pi \le x \le \pi$.
For $|\lambda| \le 1$, the graph $y = \lambda\sin^2 x$ meets $y = x$ only at the origin, since
$$|\lambda\sin^2 x| \le |\sin x| < |x|, \quad\text{for } x \ne 0.$$
[Graph: $y = x$ and $y = \lambda\sin^2 x$ on $[-\pi, \pi]$, meeting only at the origin.]
Hence, by the evenness of $\sin^2 x$, $\|f - g\|_1 = \pi^2$, for $|\lambda| \le 1$.
Also, for $|\lambda| < 1$, the function
$$e(x) = x - \lambda\sin^2 x$$
has no local maximum or minimum on $\mathbb{R}$, since
$$e'(x) = 1 - \lambda\sin 2x > 0, \quad x \in \mathbb{R}.$$
Hence
$$\|f - g\|_\infty = \max_{-\pi\le x\le\pi}|e(x)| = \max\{e(\pi), -e(-\pi)\} = \pi.$$
Thus
$$\|f - g\| = \pi^2 + \pi,$$
for $g(x) = \lambda\sin^2 x$, $|\lambda| \le 1$.
Since it is also clear that
$$\|f - g\| \ge \pi^2 + \pi, \quad \lambda \in \mathbb{R},$$
we deduce that $f$ does not have a unique best approximation in $A$. Hence, by Theorem 2.4, this norm is not strictly convex.

P4 Recall that the unit ball of $\mathbb{R}^3$ in the 1-norm is the regular octahedron whose face in the first octant lies on $x + y + z = 1$, and the unit ball of $\mathbb{R}^3$ in the ∞-norm is the cube with vertices $(\pm1, \pm1, \pm1)$.
Thus the plane $x + y = 1$ meets the boundary of the unit ball in the 1-norm in a line segment, and also meets the boundary of the ball of radius $\frac12$ in the ∞-norm in a line segment.
[Diagrams: the plane $x + y = 1$ cutting the octahedral unit ball of the 1-norm, and cutting the cube of radius $\frac12$ in the ∞-norm, each in a line segment.]
P5 This question shows that any closed, bounded, convex subset $A$ of a linear space $B$, with the property that $f \in A \Rightarrow -f \in A$, is the unit ball of some norm, namely that given by
$$\|f\| = \begin{cases} 0, & \text{if } f = 0,\\ \min\{r \in (0,\infty) : f/r \in A\}, & \text{if } f \ne 0.\end{cases}$$
First note that the minimum in this definition is attained. Indeed, if we first define
$$\|f\| = \inf\{r \in (0,\infty) : f/r \in A\}, \quad f \ne 0,$$
then $\|f\| > 0$, otherwise $A$ is unbounded. Also, there is a sequence $r_n \to \|f\|$ such that $f/r_n \in A$, and since $f/r_n \to f/\|f\|$ we deduce that $f/\|f\| \in A$, as required.
(N1) Certainly $\|f\| \ge 0$, by definition, and we have just seen that $\|f\| > 0$ for $f \ne 0$.
(N2) $\|\lambda f\| = |\lambda|\,\|f\|$, for $\lambda \in \mathbb{R}$. It is clear that this holds if $f = 0$ or $\lambda = 0$. If $f \ne 0$ and $\lambda \ne 0$, then
$$\begin{aligned}
\|\lambda f\| &= \min\{r \in (0,\infty) : \lambda f/r \in A\}\\
&= \min\{r \in (0,\infty) : |\lambda| f/r \in A\} \quad (\text{since } f \in A \Leftrightarrow -f \in A)\\
&= \min\{|\lambda| r \in (0,\infty) : |\lambda| f/(r|\lambda|) \in A\}\\
&= |\lambda|\min\{r \in (0,\infty) : f/r \in A\}\\
&= |\lambda|\,\|f\|,
\end{aligned}$$
as required.
(N3) $\|f + g\| \le \|f\| + \|g\|$.
At first sight this looks difficult. However, by the definition of $\|f+g\|$, it is sufficient to prove that
$$\frac{f+g}{\|f\| + \|g\|} \in A. \tag{1}$$
We know that $f/\|f\| \in A$ and $g/\|g\| \in A$ so that, by the convexity of $A$,
$$\theta\,\frac{f}{\|f\|} + (1-\theta)\,\frac{g}{\|g\|} \in A, \quad 0 < \theta < 1.$$
If we now choose
$$\theta = \frac{\|f\|}{\|f\| + \|g\|} \quad\text{so that}\quad 1 - \theta = \frac{\|g\|}{\|f\| + \|g\|},$$
then we obtain (1).
The above argument breaks down if $\|f\| = \|g\| = 0$, but in this case $f = g = 0$, and so $\|f + g\| = 0$ also.
Hence $\|f\|$, $f \in B$, is indeed a norm on $B$.
Chapter 3 Approximation operators and
some approximating functions
Calculating a best approximation from a subspace $A$ of $B$ to $f$, with respect to some norm on $B$, may not be as easy as calculating an approximation using some other operator, such as an interpolation operator. To judge how good an approximation is obtained by such an operator $X$, we use the ‘norm’ $\|X\|$ of the operator, which is exploited in Sections 3.1 and 3.2. The other two sections of the chapter contain a discussion of the problems involved in approximating by polynomials and an introduction to piecewise polynomial approximation.
This chapter splits into TWO study sessions:
Study session 1: Sections 3.1 and 3.2.
Study session 2: Sections 3.3 and 3.4.

Study Session 1: ‘Good’ versus ‘best’ approximation

Read Sections 3.1 and 3.2

Commentary
1. Powell defines $\|X\|$ to be the smallest real number such that
$$\|X(f)\| \le \|X\|\,\|f\|, \quad f \in B.$$
Otherwise stated,
$$\|X\| = \sup\{\|X(f)\|/\|f\| : f \in B,\ f \ne 0\}.$$
Thus to determine the value of $\|X\|$, we must find a number $M$ such that
$$\|X(f)\| \le M\|f\|, \quad f \in B,$$
and such that, whenever $M' < M$, there exists $f \in B$ with
$$\|X(f)\| > M'\|f\|.$$
If $\|X(f)\|/\|f\|$ is unbounded on $B$, then we say that the operator $X$ is unbounded.
Notice that if $X$ is a linear operator, then
$$\frac{\|X(\lambda f)\|}{\|\lambda f\|} = \frac{\|X(f)\|}{\|f\|}, \quad f \ne 0,\ \lambda \ne 0,$$
so that
$$\|X\| = \sup\{\|X(f)\| : \|f\| = 1\}.$$
In general, the supremum may not be attained because the unit sphere $\{f \in B : \|f\| = 1\}$ need not be compact (see, for example, Problem P1).
2. The following diagram may help you with Theorem 3.1.
[Diagram: $f$ in $B$, its best approximation $p^*$ in $A$ at distance $d^*$, and the approximation $X(f)$ in $A$.]
[Note the word ‘a’ in the first line of the proof of Theorem 3.1.]

3. Page 25, line 13. The reason why $p^*(x) = x - \frac18$ is the best $L_\infty$ approximation by a linear polynomial to $f(x) = x^2$ on $[0,1]$ will become clear in Chapter 7.

4. The point of the final paragraph of Section 3.2 is that algorithms for calculating best $L_\infty$ approximations from $P_n$ are more involved than those for applying linear (projection) operators $X : B \to A$, such as interpolation. If we determine $Xf$ and compute $f(x) - Xf(x)$ at various points, finding an $x$ for which
$$|f(x) - Xf(x)| > (1 + \|X\|)\varepsilon,$$
then, by Theorem 3.1, the best approximation $p^*$ will satisfy $\|f - p^*\| > \varepsilon$. Thus a larger value of $n$ may be required.

Self-assessment questions
S1 Show that the interpolation operator X defined at the bottom of page 23 is
unbounded with respect to (a) the 1-norm, (b) the 2-norm.

S2 Explain why the application of this operator X to f (x) = x2 (see


equation (3.15)) shows that we can have equality in (3.11).

Study Session 2: Types of approximating functions

Read Sections 3.3 and 3.4

Commentary
1. Page 26, line 9. The promised technique appears in equation (3.23).

2. The space $C^{(k)}[a,b]$. An example of a function $f$ which belongs to $C^{(k)}[a,b]$, but not to $C^{(k+1)}[a,b]$, is
$$f(x) = \begin{cases} x^{k+1}, & x \ge 0,\\ -x^{k+1}, & x < 0.\end{cases}$$
For this function,
$$f^{(k)}(x) = (k+1)!\,|x|, \quad x \in \mathbb{R},$$
which is continuous but not differentiable at 0.
3. The identity (3.23) holds because $q \in P_n$, so that, as $p$ varies over the whole space $P_n$, so $q + p$ varies over the whole of $P_n$.

4. Table 3.1. For $k = 1$, the terms $d_n^*(f)$ scale by a factor of approximately $\frac12$ as $n$ doubles, whereas, for $k = 3$, they scale by approximately $\frac18$. This gives
$$d_{2^n}^*(f) \approx \frac{C_1}{2^n}, \quad k = 1, \qquad d_{2^n}^*(f) \approx \frac{C_3}{8^n} = \frac{C_3}{(2^n)^3}, \quad k = 3,$$
which suggests that $d_n^*(f) \approx C_k/n^k$ in both cases.
Notice that in (3.20), for a fixed value of $k$,
$$\frac{(n-k)!}{n!} = \frac{1}{n(n-1)\cdots(n-k+1)} \sim \frac{1}{n^k} \quad\text{as } n \to \infty.$$
(We say that $a_n \sim b_n$ as $n \to \infty$ if $\lim_{n\to\infty} a_n/b_n = 1$.)

5. Page 29, line 8. An analytic function is one which has a power series
expansion about each point of its domain of definition. Such functions are
completely determined by their values on any given open interval, no matter
how short.

6. Page 29, line 6−. The spline function $s$ is a piecewise polynomial on $[a,b]$, such that
$$s(x) = p_j(x), \quad \xi_{j-1} \le x \le \xi_j,\ j = 1, \dots, n,$$
where each $p_j \in P_k$, and
$$p_j^{(i)}(\xi_j) = p_{j+1}^{(i)}(\xi_j), \quad i = 0, 1, \dots, k-1,\ j = 1, \dots, n-1.$$
[Diagram: the pieces $p_1, \dots, p_n$ on the knot intervals $a = \xi_0 < \xi_1 < \dots < \xi_n = b$.]
Formula (3.31) can be explained as follows. First note that
$$s(x) = p_1(x), \quad \xi_0 \le x \le \xi_1.$$
Next put
$$q_1(x) = p_2(x) - p_1(x), \quad x \in \mathbb{R},$$
so that
$$q_1(\xi_1) = q_1'(\xi_1) = \dots = q_1^{(k-1)}(\xi_1) = 0.$$
Since $q_1$ is a polynomial of degree $k$,
$$q_1(x) = \frac{d_1}{k!}(x - \xi_1)^k, \quad\text{where } d_1 = q_1^{(k)}(\xi_1).$$
Hence
$$s(x) = p_2(x) = p_1(x) + q_1(x), \quad \xi_1 \le x \le \xi_2,$$
and so
$$s(x) = p_1(x) + \frac{d_1}{k!}(x - \xi_1)_+^k, \quad \xi_0 \le x \le \xi_2.$$
Here
$$(x - \xi_1)_+ = \begin{cases} 0, & x < \xi_1,\\ x - \xi_1, & x \ge \xi_1.\end{cases}$$
Now put
$$q_2(x) = p_3(x) - p_2(x), \quad x \in \mathbb{R},$$
and continue in this manner to obtain
$$s(x) = p_1(x) + \frac{1}{k!}\sum_{j=1}^{n-1} d_j(x - \xi_j)_+^k, \quad a \le x \le b,$$
where $d_j = q_j^{(k)}(\xi_j)$ and $q_j = p_{j+1} - p_j$. Thus $d_j$ is the jump in $s^{(k)}$ at $\xi_j$.
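The truncated power form (3.31) is easy to evaluate directly. As a concrete check, the following sketch (Python; the particular spline and its jump coefficients are taken from SAQ S4 below and its solution) compares the truncated power form with the original piecewise definition at a few sample points.

    def tpow(x, xi, k):
        # Truncated power (x - xi)_+^k
        return (x - xi) ** k if x >= xi else 0.0

    def s_truncated(x):
        # s(x) = p1(x) + (1/k!) * sum_j d_j (x - xi_j)_+^k with k = 2,
        # p1(x) = -x, knots 0 and 1, jumps d_1 = 2, d_2 = 4 in s'' (SAQ S4).
        return -x + 0.5 * (2 * tpow(x, 0, 2) + 4 * tpow(x, 1, 2))

    def s_piecewise(x):
        if x <= 0:
            return -x
        if x <= 1:
            return x * x - x
        return 3 * x * x - 5 * x + 2

    for x in [-1, -0.3, 0.2, 0.9, 1.5, 2]:
        assert abs(s_truncated(x) - s_piecewise(x)) < 1e-12
    print("truncated power form agrees with the piecewise definition")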

7. Page 30, line 9−. The ‘big oh’ notation used here needs some explanation. We say that
$$f(x) = O(g(x)), \quad x \in S,$$
for some subset $S$ of $\mathbb{R}$, if
$$|f(x)| \le M|g(x)|, \quad x \in S,$$
where the constant $M$ does not depend on $x$. For example,
$$x^2 + x = O(x^2),\ x \ge 1, \quad\text{whereas}\quad x^2 + x = O(x),\ 0 \le x \le 1.$$

Self-assessment questions
S3 The best $L_\infty$ approximation from $P_2$ to $f(x) = |x|$ on $[-1,1]$ is $p^*(x) = x^2 + \frac18$ (this will become clear in Chapter 7). Verify that $\|f - p^*\|_\infty = \frac18$, thus confirming one of the entries in Table 3.1.

S4 Express the quadratic spline
$$s(x) = \begin{cases} -x, & -1 \le x \le 0,\\ x^2 - x, & 0 \le x \le 1,\\ 3x^2 - 5x + 2, & 1 \le x \le 2,\end{cases}$$
in the form of equation (3.31).

Problems for Chapter 3


P1 Powell Exercise 3.2 (The last part is rather hard and can safely be ignored!)

P2 Powell Exercise 3.3

P3 Powell Exercise 3.4

P4 Powell Exercise 3.6 (Use Theorem 3.1 and be content to get the lower
estimate 0.048.)

P5 Powell Exercise 3.8

Solutions to SAQs in Chapter 3
S1 Consider
$$f_\varepsilon(x) = \begin{cases} 1 - x/\varepsilon, & 0 \le x \le \varepsilon,\\ 0, & \varepsilon < x < 1-\varepsilon,\\ 1 + (x-1)/\varepsilon, & 1-\varepsilon \le x \le 1.\end{cases}$$
(Remember Problem P3, Chapter 1.)
Since $f_\varepsilon(0) = f_\varepsilon(1) = 1$, the interpolating function $p = X(f_\varepsilon)$ is simply $p(x) = 1$, $0 \le x \le 1$, and so $\|p\|_1 = \|p\|_2 = 1$. However,
$$\|f_\varepsilon\|_1 = \int_0^\varepsilon (1 - x/\varepsilon)\,dx + \int_{1-\varepsilon}^1 (1 + (x-1)/\varepsilon)\,dx = \varepsilon,$$
so that
(a) $\dfrac{\|X(f_\varepsilon)\|_1}{\|f_\varepsilon\|_1} = \dfrac{1}{\varepsilon}$ is unbounded.
Also
$$\|f_\varepsilon\|_2^2 = \int_0^\varepsilon (1 - x/\varepsilon)^2\,dx + \int_{1-\varepsilon}^1 (1 + (x-1)/\varepsilon)^2\,dx = 2\varepsilon/3,$$
so that
(b) $\dfrac{\|X(f_\varepsilon)\|_2}{\|f_\varepsilon\|_2} = \sqrt{\dfrac{3}{2\varepsilon}}$ is unbounded.

S2 If $f(x) = x^2$, $0 \le x \le 1$, and $p^*(x) = x - \frac18$, $0 \le x \le 1$, then there are 3 candidates for the point $x \in [0,1]$ such that $\|f - p^*\|_\infty = |f(x) - p^*(x)|$.
[Graph: $y = x^2$ and the best linear approximation $y = x - \frac18$ on $[0,1]$.]
These are 0, 1, and the point $x$ where
$$e(x) = f(x) - p^*(x) = x^2 - \left(x - \tfrac18\right)$$
is a minimum, which satisfies
$$e'(x) = 2x - 1 = 0 \implies x = \tfrac12.$$
Thus
$$\|f - p^*\|_\infty = \max\left\{|e(0)|, \left|e\left(\tfrac12\right)\right|, |e(1)|\right\} = \tfrac18.$$
(The fact that $f(x) - p^*(x)$ has the same absolute value, but alternating signs, at each of these 3 points is characteristic of best $L_\infty$ approximation from certain subspaces; see Chapter 7.)
The corresponding interpolating polynomial is $p(x) = x$, $0 \le x \le 1$, and, in this case,
$$\|f - p\|_\infty = \max\{|x^2 - x| : 0 \le x \le 1\} = \tfrac14,$$
since the maximum is taken at $x = \frac12$. Hence $\|f - p\|_\infty = 2\|f - p^*\|_\infty$ and since $\|X\|_\infty = 1$ we do have equality in (3.11).
S3 If $f(x) = |x|$, $-1 \le x \le 1$, and $p^*(x) = x^2 + \frac18$, $-1 \le x \le 1$, then there are 5 candidates for the point $x \in [-1,1]$ such that $\|f - p^*\|_\infty = |f(x) - p^*(x)|$. These are $0$, $\pm1$, and the points $\pm x$ where
$$e(x) = f(x) - p^*(x) = |x| - \left(x^2 + \tfrac18\right)$$
is a maximum. As in SAQ S2, we find that $x = \pm\frac12$, so that
$$\|f - p^*\|_\infty = \tfrac18 = 0.125,$$
which confirms the first entry in the $k = 1$ column of Table 3.1.

S4 Following the proof of (3.31) given in the commentary, put $p_1(x) = -x$, $p_2(x) = x^2 - x$, $p_3(x) = 3x^2 - 5x + 2$. Then
$$q_1(x) = p_2(x) - p_1(x) = x^2 - x - (-x) = x^2,$$
$$q_2(x) = p_3(x) - p_2(x) = 3x^2 - 5x + 2 - (x^2 - x) = 2x^2 - 4x + 2 = 2(x-1)^2.$$
Hence $s(x) = -x + (x)_+^2 + 2(x-1)_+^2$, $-1 \le x \le 2$.

Solutions to Problems in Chapter 3


P1 First we seek an expression $M$, depending on $K$ but not on $x$, such that
$$|Xf(x)| \le M\|f\|_\infty, \quad a \le x \le b.$$
For example,
$$|Xf(x)| \le \int_a^b |K(x,y)|\,|f(y)|\,dy \le \left(\int_a^b |K(x,y)|\,dy\right)\|f\|_\infty,$$
so that
$$\|Xf\|_\infty \le \left(\max_{a\le x\le b}\int_a^b |K(x,y)|\,dy\right)\|f\|_\infty.$$
Hence
$$\|X\|_\infty \le \max_{a\le x\le b}\int_a^b |K(x,y)|\,dy = \int_a^b |K(x_0,y)|\,dy,$$
say. To prove that $\|X\|_\infty$ can be no less than this, we should like to find a function $f \in C[a,b]$ such that $\|f\|_\infty = 1$ and
$$\|Xf\|_\infty = \int_a^b |K(x_0,y)|\,dy.$$
The ideal function would be
$$f(y) = \operatorname{sgn}(K(x_0,y)) = \begin{cases} 1, & \text{if } K(x_0,y) > 0,\\ -1, & \text{if } K(x_0,y) < 0,\end{cases}$$
so that
$$(Xf)(x_0) = \int_a^b K(x_0,y)f(y)\,dy = \int_a^b |K(x_0,y)|\,dy,$$
which implies that $\|X\|_\infty \ge \int_a^b |K(x_0,y)|\,dy$. Unfortunately, however, this function $f$ is not continuous (unless $K(x_0,y)$ never vanishes).
Instead we take a continuous approximation $f_\varepsilon$, $\varepsilon > 0$, which differs from $\operatorname{sgn}(K(x_0,y))$ only on a set of length less than $\varepsilon$.
[Diagram: $f_\varepsilon$ equal to $\pm1$ except near the sign changes of $K(x_0,y)$, on a set of total length less than $\varepsilon$.]
It follows that
$$\left|Xf_\varepsilon(x_0) - \int_a^b |K(x_0,y)|\,dy\right| = \left|\int_a^b K(x_0,y)\bigl(f_\varepsilon(y) - \operatorname{sgn}(K(x_0,y))\bigr)\,dy\right| \le K\int_a^b |f_\varepsilon(y) - \operatorname{sgn}(K(x_0,y))|\,dy \le K\varepsilon,$$
where $K = \max_{a\le y\le b}|K(x_0,y)|$. Hence
$$\|Xf_\varepsilon\|_\infty \ge \int_a^b |K(x_0,y)|\,dy - K\varepsilon.$$
Since $\|f_\varepsilon\|_\infty = 1$ and $K\varepsilon$ can be taken arbitrarily small, we deduce that
$$\|X\|_\infty = \int_a^b |K(x_0,y)|\,dy.$$
Remark If $\|X\|_\infty = 1$ and $Xf = f$, then $f$ need not be a constant. For example, if
$$K(x,y) = \begin{cases} 4(1-2y), & 0 \le x \le \frac12,\ 0 \le y \le \frac12,\\ 8(1-x)(1-2y), & \frac12 \le x \le 1,\ 0 \le y \le \frac12,\\ 0, & \frac12 \le y \le 1,\end{cases}$$
and
$$f(x) = \int_0^1 K(x,y)\,dy = \begin{cases} 1, & 0 \le x \le \frac12,\\ 2(1-x), & \frac12 \le x \le 1,\end{cases}$$
then $\|X\|_\infty = 1$ and
$$f(x) = \int_0^1 K(x,y)f(y)\,dy, \quad 0 \le x \le 1.$$
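The claims in this Remark can be checked numerically. A sketch (Python, midpoint quadrature; the grid size and sample points are arbitrary choices) confirms that $\int_0^1 |K(x,y)|\,dy \le 1$ and that $Xf = f$ for the given $f$, up to quadrature error.

    def K(x, y):
        if y > 0.5:
            return 0.0
        if x <= 0.5:
            return 4 * (1 - 2 * y)
        return 8 * (1 - x) * (1 - 2 * y)

    def f(x):
        return 1.0 if x <= 0.5 else 2 * (1 - x)

    n = 2000
    h = 1.0 / n
    ys = [(i + 0.5) * h for i in range(n)]

    for x in [0.1, 0.3, 0.5, 0.7, 0.9]:
        row = sum(abs(K(x, y)) for y in ys) * h        # approximates integral of |K(x, .)|
        Xf  = sum(K(x, y) * f(y) for y in ys) * h      # approximates (Xf)(x)
        print(x, round(row, 4), round(Xf - f(x), 6))   # row <= 1, and Xf - f is about 0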

P2 To prove that $X$ is a projection (it is obviously linear) we have to show that $Xf(x) = f(x)$ when $f(x) = a + bx$. But
$$Xf(x) = \int_{-1}^1 \tfrac12(1 + 3xy)f(y)\,dy = \tfrac12\int_{-1}^1 f(y)\,dy + \frac{3x}{2}\int_{-1}^1 yf(y)\,dy = \tfrac12\int_{-1}^1 (a + by)\,dy + \frac{3x}{2}\int_{-1}^1 (ay + by^2)\,dy = \tfrac12(2a) + \frac{3x}{2}\cdot\frac{2b}{3} = a + bx,$$
as required.
Now, by Problem P1,
$$\|X\|_\infty = \max_{-1\le x\le 1}\frac12\int_{-1}^1 |1 + 3xy|\,dy = \frac12\int_{-1}^1 |1 + 3y|\,dy \ \left(= \frac12\int_{-1}^1 |1 - 3y|\,dy\right).$$
To see this, consider the change in the graph of $1 + 3xy$, $-1 \le y \le 1$, as $x$ increases from 0 to 1.
[Diagrams: the graphs of $1 + 3xy$ against $y$ for $x = 0$, $x = \frac12$ and $x = 1$, the last running from $-2$ at $y = -1$ up to $4$ at $y = 1$.]
Hence
$$\|X\|_\infty = \frac12\left(\frac12\cdot\frac43\cdot 4 + \frac12\cdot\frac23\cdot 2\right) = \frac53.$$

P3 First note that $X$ is a projection (it is clearly linear), since if $f(x) = a + bx$, then
$$Xf(x) = 2\int_0^{1/2}(a + bt)\,dt + \left(x - \tfrac14\right)\bigl((a + b) - a\bigr) = a + \tfrac{b}{4} + b\left(x - \tfrac14\right) = a + bx,$$
as required.
Now, for $f \in C[0,1]$,
$$|Xf(x)| \le \left|2\int_0^{1/2} f(t)\,dt\right| + \left|x - \tfrac14\right||f(1) - f(0)| \le 2\int_0^{1/2}|f(t)|\,dt + \tfrac34\bigl(|f(1)| + |f(0)|\bigr) \le \|f\|_\infty + \tfrac34\cdot 2\|f\|_\infty = \tfrac52\|f\|_\infty.$$
Hence
$$\|Xf\|_\infty \le \tfrac52\|f\|_\infty \implies \|X\|_\infty \le \tfrac52.$$
Thus, by Theorem 3.1,
$$\|f - Xf\|_\infty \le \left(1 + \tfrac52\right)\|f - p^*\|_\infty,$$
where $p^*$ is the best $L_\infty$ approximation to $f$ from $P_1$, and so
$$\|f - Xf\|_\infty \le \tfrac72\|f - p\|_\infty, \quad\text{for } p \in P_1.$$
Remark In fact $\|X\|_\infty = \frac52$ in this problem, as you can see by considering, for $0 < \varepsilon < 1$,
$$f_\varepsilon(x) = \begin{cases} -1 + 2x/\varepsilon, & 0 \le x \le \varepsilon,\\ 1, & \varepsilon < x \le 1.\end{cases}$$
P4 There is rather more to this question than meets the eye! First, if $p(x) = a + bx + cx^2$, then
$$p(0) = a, \quad p(1) = a + b + c, \quad p(3) = a + 3b + 9c, \quad p(4) = a + 4b + 16c,$$
and it is true that
$$a + 3b + 9c = -\tfrac12 a + (a + b + c) + \tfrac12(a + 4b + 16c).$$
Now
$$\min_{p\in P_2}\max_{0\le x\le 4}|f(x) - p(x)| = \|f - p^*\|_\infty,$$
where $p^*$ is the best $L_\infty$ approximation to $f$ from $P_2$. Thus, by Theorem 3.1,
$$\|f - p^*\|_\infty \ge \frac{\|f - Xf\|_\infty}{1 + \|X\|_\infty},$$
where $X$ is any linear projection from $C[0,4]$ to $P_2$.
Given the first part of the question, it seems a good idea to let $X$ be the interpolation operator $X(f) = p$, defined by $p(0) = f(0)$, $p(1) = f(1)$, $p(4) = f(4)$. This is certainly a linear projection operator. Since
$$f(3) = -\tfrac12 f(0) + f(1) + \tfrac12 f(4) \pm 0.15,$$
we have
$$\|f - Xf\|_\infty \ge |f(3) - p(3)| = \left|\pm 0.15 + \left(-\tfrac12 f(0) + f(1) + \tfrac12 f(4)\right) - \left(-\tfrac12 p(0) + p(1) + \tfrac12 p(4)\right)\right| = 0.15.$$
Thus
$$\|f - p^*\|_\infty \ge \frac{0.15}{1 + \|X\|_\infty}.$$
To obtain the desired lower estimate for $\|f - p^*\|_\infty$, we need to show, therefore, that $\|X\|_\infty \le 2$. Unfortunately, it turns out that $\|X\|_\infty = \frac{17}{8}$. Nevertheless, we indicate the argument.
Recall that, since $X$ is linear,
$$\|X\|_\infty = \max\{\|Xf\|_\infty : \|f\|_\infty = 1\}.$$
Thus we seek to maximise the $L_\infty$ norm of $X(f) = p$, subject to the constraints $|p(0)| \le 1$, $|p(1)| \le 1$, $|p(4)| \le 1$. It seems likely that $\|p\|_\infty$ will be largest if we choose $p(0)$, $p(1)$ and $p(4)$ to be $\pm1$. Taking this for granted, there remains only an analysis of the (non-trivial) cases.
[Diagrams: the three non-trivial sign patterns (a), (b), (c) for the values $\pm1$ taken at $x = 0$, $1$, $4$.]
As you can easily check, case (c) gives the largest value of $\|p\|_\infty$. In this case
$$p(x) = \tfrac{17}{8} - \tfrac12\left(x - \tfrac52\right)^2 \implies \|p\|_\infty = p\left(\tfrac52\right) = \tfrac{17}{8}.$$
Hence $\|X\|_\infty = \frac{17}{8}$, so that
$$\|f - p^*\|_\infty \ge \frac{0.15}{1 + \frac{17}{8}} = 0.048.$$
Two questions remain.
(I) How do we justify taking $p(0)$, $p(1)$, $p(4)$ to be $\pm1$?
(II) Can we in fact obtain the better estimate 0.05?
There are various ways to answer Question (I). For example, we could argue from basic principles, examining the effects of taking $p(0) = 1$, $|p(1)| < 1$, $|p(4)| < 1$, and so on. This would be tedious, and would not generalise to other problems.
More generally, we can use a linear programming argument. This sounds very grand, but it is really quite a simple idea. We want to maximise
$$|p(x)| = |a + bx + cx^2|, \quad 0 \le x \le 4,$$
subject to the constraints
$$|p(x_i)| = |a + bx_i + cx_i^2| \le 1,$$
where $x_1 = 0$, $x_2 = 1$, $x_3 = 4$. Now any equation of the form $Xa + Yb + Zc = k$, where $X$, $Y$, $Z$, $k$ are constant, defines a plane in $\mathbb{R}^3$. Hence the above 3 constraints define a parallelepiped $P$, centred at the origin, of possible values of $(a, b, c)$ in $\mathbb{R}^3$.
The required maximum $M$ of $|p(x)|$ occurs for some $(a, b, c) \in P$ and $x_0 \in [0,4]$, so that
$$M = \max_{(a,b,c)\in P}\left|a + bx_0 + cx_0^2\right|.$$
Since $x_0$ is now fixed, we can find $M$ by moving the plane $a + bx_0 + cx_0^2 = k$ as far as possible from the origin, while still meeting $P$; at this point $M = |k|$. Now, however, the plane must pass through at least one vertex of $P$, so that $|p(x_i)| = 1$, for $i = 1, 2, 3$, as required.
We shall see another approach in Chapter 4 which contains a formula for $\|X\|_\infty$, where $X$ is an interpolation operator from $C[a,b]$ to $P_n$.
To answer Question (II) we look again at the proof of Theorem 3.1. Using equation (3.12), we have
$$0.15 = |f(3) - (X(f))(3)| = |(f - p^*)(3) - (X(f - p^*))(3)| \le \|f - p^*\|_\infty + |(X(f - p^*))(3)|.$$
Now consider the problem of maximising $|(Xg)(3)|$, for $g \in C[0,4]$, while keeping $\|g\|_\infty$ constant. Once again this is a linear programming problem, so the maximum occurs for $g(0) = \pm\|g\|_\infty$, $g(1) = \pm\|g\|_\infty$, $g(4) = \pm\|g\|_\infty$. Examining cases (a), (b), (c) given earlier, we find that
$$|(Xg)(3)| \le 2\|g\|_\infty, \quad g \in C[0,4]$$
((c) is again the extreme case). Hence, with $g = f - p^*$,
$$0.15 \le \|f - p^*\|_\infty + 2\|f - p^*\|_\infty = 3\|f - p^*\|_\infty,$$
and so $\|f - p^*\|_\infty \ge 0.05$, as required.
P5 This one is a little easier! First, since every quadratic spline is differentiable at points of $(-1,1)$, we cannot have $\|f - s\|_\infty = 0$, that is $s(x) = f(x)$, $-1 \le x \le 1$, because $f$ is not differentiable at 0.
However, we can make $\|f - s\|_\infty < \varepsilon$ by defining
$$s(x) = \begin{cases} -x, & -1 \le x \le -\varepsilon,\\ p(x) = a + bx^2, & -\varepsilon < x < \varepsilon,\\ x, & \varepsilon \le x \le 1.\end{cases}$$
To guarantee that $s$ is a quadratic spline, we require
$$p(\pm\varepsilon) = \varepsilon\ \ (\text{that is, } a + b\varepsilon^2 = \varepsilon) \quad\text{and}\quad p'(\pm\varepsilon) = \pm1\ \ (\text{that is, } 2b\varepsilon = 1) \implies b = \frac{1}{2\varepsilon},\ a = \frac{\varepsilon}{2}.$$
[Graph: $y = s(x)$, equal to $|x|$ outside $(-\varepsilon, \varepsilon)$ and parabolic inside.]
Since the worst error occurs at $x = 0$, we have
$$\|f - s\|_\infty = |f(0) - s(0)| = \frac{\varepsilon}{2} < \varepsilon,$$
as required.
Chapter 4 Polynomial interpolation
This chapter begins a detailed investigation of the interpolation of continuous
functions by polynomials. It turns out that the choice of interpolation points
makes a considerable difference to the accuracy of the interpolating
approximation; for example, we see in this chapter that equally-spaced
interpolation points make a rather poor choice. This investigation of interpolation
continues in Chapter 5.
This chapter splits into TWO study sessions:
Study session 1: Sections 4.1 and 4.2.
Study session 2: Sections 4.3 and 4.4.

Study Session 1: Polynomial interpolation

Read Sections 4.1 and 4.2

Commentary
1. Equation (4.2) represents $n+1$ linear equations (one for each interpolation point) with $n+1$ unknowns (the coefficients of the required polynomial). Theorem 4.1 shows that the corresponding $(n+1)\times(n+1)$ matrix
$$\begin{pmatrix} 1 & x_0 & x_0^2 & \cdots & x_0^n\\ 1 & x_1 & x_1^2 & \cdots & x_1^n\\ \vdots & & & & \vdots\\ 1 & x_n & x_n^2 & \cdots & x_n^n \end{pmatrix}$$
must be non-singular. In fact, this Vandermonde matrix, as it is called, has determinant
$$\prod_{0\le i<j\le n}(x_j - x_i),$$
which is clearly non-zero if and only if the $x_i$ are distinct points.
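For small $n$ the Vandermonde system can be solved directly. A minimal sketch follows (Python, with Gaussian elimination written out so that no library is assumed; the interpolation data are an arbitrary example, chosen so that the exact answer is known).

    def interp_coeffs(xs, ys):
        # Solve the Vandermonde system V c = y by Gaussian elimination,
        # giving the coefficients of p(x) = c0 + c1 x + ... + cn x^n.
        n = len(xs)
        A = [[x ** j for j in range(n)] + [y] for x, y in zip(xs, ys)]
        for col in range(n):
            piv = max(range(col, n), key=lambda r: abs(A[r][col]))
            A[col], A[piv] = A[piv], A[col]          # partial pivoting
            for r in range(col + 1, n):
                m = A[r][col] / A[col][col]
                for j in range(col, n + 1):
                    A[r][j] -= m * A[col][j]
        c = [0.0] * n
        for r in range(n - 1, -1, -1):               # back substitution
            c[r] = (A[r][n] - sum(A[r][j] * c[j] for j in range(r + 1, n))) / A[r][r]
        return c

    # Interpolate f(x) = x^3 at 4 points: coefficients should be (0, 0, 0, 1).
    print(interp_coeffs([0, 1, 2, 3], [0, 1, 8, 27]))

In practice the Vandermonde matrix becomes badly conditioned as $n$ grows, which is one reason the Lagrange and divided-difference forms of later sections are preferred.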

2. Below is the graph of a typical Lagrange function
$$\ell_k(x) = \prod_{\substack{j=0\\ j\ne k}}^{n}\frac{x - x_j}{x_k - x_j},$$
with $n = 10$ and $k = 6$.
[Graph: $y = \ell_6(x)$, equal to 1 at $x_6$ and 0 at the other interpolation points $x_0, \dots, x_{10}$.]
Note that if all the $x_i$ are kept fixed except for $x_k$ and $x_{k+1}$, which both tend to a number $c$, then $\ell_k(x) \to \infty$ for any $x \ne x_0, x_1, \dots, x_{k-1}, c, x_{k+2}, \dots, x_n$.

3. The useful symbol δki in equation (4.11) is called the Kronecker delta.

4. The remarks before the statement of Theorem 4.2 provide a way of
remembering that it is the (n + 1)th derivative of f which appears in the
error formula (4.13).

5. Equation (4.15) can be written as
$$g^{(n+1)}(\xi) = f^{(n+1)}(\xi) - e(x)\,(n+1)!\prod_{i=0}^{n}\frac{1}{x - x_i} = 0.$$

Self-assessment questions
S1 Powell Exercise 4.1

S2 Verify equation (4.11). (This identity is required in Chapter 10 and in


Chapter 19.)

S3 Powell Exercise 4.8

Study Session 2: Chebyshev interpolation

Read Sections 4.3 and 4.4

Commentary
1. Table 4.1. Here is the graph of the Runge example together with its Lagrange interpolating polynomial $p$ of degree 10.
[Graph: $y = 1/(1+x^2)$ on $[-5,5]$ and the degree-10 interpolant $y = p(x)$ at equally-spaced points, oscillating wildly near the ends of the interval.]
Notice that $p(4.5) \approx 1.6$, as indicated by the 5th entry in the middle column of Table 4.1. Here is the graph of the corresponding function
$$\mathrm{prod}(x) = \prod_{j=0}^{10}(x - x_j).$$
[Graph: $y = \mathrm{prod}(x)$, with $|\mathrm{prod}(x)|$ reaching about $10^5$ near the ends of $[-5,5]$.]
As you can see, there is a close relationship between $\mathrm{prod}(x)$ and the size of the error function in the above interpolation.

2. Chebyshev polynomials (pronounced Cheby‘shov’ in Russian).
We have
$$\cos\theta = \cos\theta \implies T_1(x) = x,$$
$$\cos 2\theta = 2\cos^2\theta - 1 \implies T_2(x) = 2x^2 - 1,$$
$$\cos 3\theta = 4\cos^3\theta - 3\cos\theta \implies T_3(x) = 4x^3 - 3x.$$
The graphs of these Chebyshev polynomials appear below.
[Graph: $y = x$, $y = 2x^2 - 1$ and $y = 4x^3 - 3x$ on $[-1,1]$, each oscillating between $-1$ and $1$.]
Formula (4.25) shows that $T_n(x)$ is a polynomial of degree $n$ in which the coefficient of $x^n$ is $2^{n-1}$. It has zeros at the points $\cos\left(\frac{(2i-1)\pi}{2n}\right)$, $i = 1, \dots, n$, since
$$T_n\left(\cos\left(\frac{(2i-1)\pi}{2n}\right)\right) = \cos\left(n\cos^{-1}\left(\cos\left(\frac{(2i-1)\pi}{2n}\right)\right)\right) = \cos\left((2i-1)\frac{\pi}{2}\right) = 0$$
(note that $0 < (2i-1)\pi/(2n) < \pi$, for $i = 1, \dots, n$).
Thus the function $T_{n+1}$ has $n+1$ (simple) zeros which are, in increasing order,
$$x_i = \cos\left(\frac{[2(n-i)+1]\pi}{2(n+1)}\right), \quad i = 0, 1, \dots, n,$$
and we deduce that
$$T_{n+1}(x) = 2^n\prod_{i=0}^{n}(x - x_i) \implies \max_{-1\le x\le 1}|\mathrm{prod}(x)| = \frac{1}{2^n}.$$
The graphs below show the points $x_i$ in the case $n = 10$ and the graph $y = T_{11}(x)$.
[Diagrams: the zeros $x_0 < \dots < x_{10}$ of $T_{11}$, obtained by projecting equally-spaced angles under $y = \cos^{-1}x$, and the graph of $y = T_{11}(x)$ oscillating between $\pm1$ on $[-1,1]$.]
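These Chebyshev points, and the bound $\max_{-1\le x\le 1}|\mathrm{prod}(x)| = 2^{-n}$, are easy to check numerically. A sketch (Python; the grid size is an arbitrary choice):

    import math

    n = 10
    # Zeros of T_{n+1} in increasing order, as in (4.27).
    xi = [math.cos((2 * (n - i) + 1) * math.pi / (2 * (n + 1))) for i in range(n + 1)]

    def prod(x):
        p = 1.0
        for t in xi:
            p *= (x - t)
        return p

    grid = [-1 + 2 * k / 100000 for k in range(100001)]
    print(max(abs(prod(x)) for x in grid), 1 / 2 ** n)   # both about 0.000977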
If $x_i$ are the Chebyshev points for the interval $[a,b]$, defined by (4.28) and (4.30), and $t_i$ are the Chebyshev points for $[-1,1]$, defined by (4.27), then, for $a \le x \le b$ with $x = \lambda + \mu t$,
$$\mathrm{prod}(x) = \prod_{i=0}^{n}(x - x_i) = \prod_{i=0}^{n}\bigl((\lambda + \mu t) - (\lambda + \mu t_i)\bigr) = \mu^{n+1}\prod_{i=0}^{n}(t - t_i) = \mu^{n+1}\,\mathrm{prod}(t),$$
where the latter product is defined with respect to the $t_i$. Thus
$$\max_{a\le x\le b}|\mathrm{prod}(x)| = \mu^{n+1}\max_{-1\le t\le 1}|\mathrm{prod}(t)| = \left(\frac{b-a}{2}\right)^{n+1}\cdot\frac{1}{2^n} = 2\left(\frac{b-a}{4}\right)^{n+1}.$$
For example, with n = 10 and [a, b] = [−5, 5], this maximum is 47 683.7,
which is considerably smaller than the corresponding maximum for
equally-spaced points. Finally, we plot the interpolating polynomial of degree
10 to the Runge example using these Chebyshev interpolation points.

[Figure: the Runge function and its degree-10 interpolating polynomial at the Chebyshev points, plotted on $[-5, 5]$.]

This graph confirms the fifth entry in the third column of Table 4.4, which
gives the maximum error in the above interpolation as approximately 0.1.
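As a quick check of these numbers, here is a minimal numerical sketch (ours, not Powell's) comparing $\max|\mathrm{prod}(x)|$ on $[-5, 5]$ for equally-spaced and Chebyshev interpolation points with n = 10:

```python
import numpy as np

def max_abs_prod(nodes, a=-5.0, b=5.0, samples=20001):
    """Estimate max over [a, b] of |prod(x)| = |(x - x_0)...(x - x_n)|."""
    x = np.linspace(a, b, samples)
    return np.max(np.abs(np.prod(x[:, None] - nodes[None, :], axis=1)))

n = 10
equal = np.linspace(-5.0, 5.0, n + 1)
i = np.arange(n + 1)
cheb = 5.0 * np.cos((2 * (n - i) + 1) * np.pi / (2 * (n + 1)))  # (4.27), scaled

print(max_abs_prod(equal))   # about 4.2e5 for equally-spaced points
print(max_abs_prod(cheb))    # about 4.77e4 = 2 * (10/4)**11
```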

3. Theorem 4.3 provides an alternative method of calculating $\|X\|_\infty$ in the
solution of Problem P4, Chapter 3. Also, the formula for $\|X\|_\infty$ should remind
you of Problem P1, Chapter 3. In the proof, some comments are required
about the last two equalities of (4.32).
First, it is legitimate to change the order of the sup and the max because,
quite generally, we have
$$\sup_{x \in X} \sup_{y \in Y} \phi(x, y) = \sup_{y \in Y} \sup_{x \in X} \phi(x, y),$$
for any real-valued function $\phi(x, y)$, $x \in X$, $y \in Y$. Indeed, for any fixed
$x \in X$, $y \in Y$,
$$\phi(x, y) \le \sup_{\xi \in X} \phi(\xi, y),$$
so that
$$\sup_{y \in Y} \phi(x, y) \le \sup_{y \in Y} \sup_{\xi \in X} \phi(\xi, y).$$

Thus
$$\sup_{x \in X} \sup_{y \in Y} \phi(x, y) \le \sup_{y \in Y} \sup_{x \in X} \phi(x, y),$$
and the reverse inequality is proved similarly.
Next, note that in the final step of (4.32) it is clear that
$$\max_{a \le x \le b}\; \sup_{\|f\| \le 1} \left| \sum_{k=0}^{n} f(x_k) \ell_k(x) \right| \le \max_{a \le x \le b} \sum_{k=0}^{n} |\ell_k(x)|,$$
and equality is seen to hold by taking $f(x_k) = \mathrm{sgn}(\ell_k(x^*))$, where
$$\sum_{k=0}^{n} |\ell_k(x^*)| = \max_{a \le x \le b} \sum_{k=0}^{n} |\ell_k(x)|.$$
Note that all the norms in Theorem 4.3 are the ∞-norm $\|\cdot\|_\infty$.

4. Table 4.5. It is natural to ask at what rate the norms in the right-hand
column are increasing. It can be shown that these grow like $\tfrac{2}{\pi}\log_e n$.

5. The ‘optimal nodes problem’, to find the interpolating points which minimise
$\|X\|$ (see Powell Exercise 4.10), was solved only comparatively recently; see
the remarks in Appendix B. Note that the two independent papers [28] and
[89], referred to in Appendix B, appeared ‘back-to-back’ in the Journal of
Approximation Theory in 1978.

Self-assessment questions
S4 By calculating the interpolating polynomial from P2 to the Runge example,
with suitable interpolation points x0 , x1 , x2 , confirm the first entry in each
column of Table 4.1.

S5 (a) Use formula (4.25) to calculate $T_4(x)$ and $T_5(x)$.
(b) Prove that $T_{2n}(x) = 2T_n(x)^2 - 1$.

S6 Check the first entry in both columns of Table 4.5.

Problems for Chapter 4


P1 Powell Exercise 4.2

P2 Powell Exercise 4.3 (Hint: find the maxima of $|(x-a)(x-b)|$ and
$|(x-a)(x-\tfrac{1}{2}(a+b))(x-b)|$ on [a, b].)

P3 Powell Exercise 4.4 (Hint: try to find a substitute for the function g of
(4.14).)

P4 Powell Exercise 4.5 (Hint: decide first where the maximum and minimum
gaps occur.)

P5 Powell Exercise 4.6

Solutions to SAQs in Chapter 4
S1 Using equation (4.7),
$$p(x) = f(0)\frac{(x-1)(x-2)(x-3)}{(0-1)(0-2)(0-3)} + f(1)\frac{(x-0)(x-2)(x-3)}{(1-0)(1-2)(1-3)} + f(2)\frac{(x-0)(x-1)(x-3)}{(2-0)(2-1)(2-3)} + f(3)\frac{(x-0)(x-1)(x-2)}{(3-0)(3-1)(3-2)}$$
$$= -\tfrac{1}{6} f(0)(x-1)(x-2)(x-3) + \tfrac{1}{2} f(1)x(x-2)(x-3) - \tfrac{1}{2} f(2)x(x-1)(x-3) + \tfrac{1}{6} f(3)x(x-1)(x-2).$$
Hence
$$p(6) = -10f(0) + 36f(1) - 45f(2) + 20f(3).$$
If $f(x) = (x-3)^3$, then f(0) = -27, f(1) = -8, f(2) = -1, f(3) = 0, and so
$$p(6) = -10(-27) + 36(-8) - 45(-1) + 20(0) = 27.$$
This is correct since f(6) = 27 and f is a cubic, so the interpolation is exact.
The uncertainty of p(6), if that of each function value is ±ε, is
$$\pm \sum_{k=0}^{3} \varepsilon\,|\ell_k(6)| = \pm(10\varepsilon + 36\varepsilon + 45\varepsilon + 20\varepsilon) = \pm 111\varepsilon.$$

S2 Substituting (4.3) in (4.9) gives
$$\sum_{k=0}^{n} x_k^i \prod_{\substack{j=0 \\ j \neq k}}^{n} \frac{x - x_j}{x_k - x_j} = x^i, \quad i = 0, 1, \dots, n,$$
since the interpolation is exact for polynomials of degree $i \le n$. The coefficient
of $x^n$ on the right is 1 if i = n and 0 otherwise, whereas on the left it is
$$\sum_{k=0}^{n} x_k^i \prod_{\substack{j=0 \\ j \neq k}}^{n} \frac{1}{x_k - x_j}.$$
Hence (4.11) follows.

S3 The Lagrange interpolation formula can be written as follows:
$$\sum_{k=0}^{n} f(x_k)\,\ell_k(x) = \sum_{k=0}^{n} f(x_k) \prod_{\substack{j=0 \\ j \neq k}}^{n} \frac{x - x_j}{x_k - x_j}
= \prod_{j=0}^{n} (x - x_j) \sum_{k=0}^{n} \left( \frac{f(x_k)}{\prod_{\substack{j=0 \\ j \neq k}}^{n} (x_k - x_j)} \right) \frac{1}{x - x_k}.$$
Each of the quantities
$$\mu_k = \frac{f(x_k)}{\prod_{\substack{j=0 \\ j \neq k}}^{n} (x_k - x_j)}$$
depends only on the data points and the function values, and so can be
calculated beforehand. Hence
$$p(x) = \prod_{j=0}^{n} (x - x_j) \sum_{k=0}^{n} \frac{\mu_k}{x - x_k}$$
can be evaluated using n + 1 multiplications, n additions, n + 1 divisions and
n + 1 subtractions. Altogether this is 4n + 3 ≤ 5n arithmetic operations
(n ≥ 3).
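As a computational aside (our own sketch, using the data of SAQ S1 above), the precomputation of the $\mu_k$ and the subsequent cheap evaluation might look as follows:

```python
import numpy as np

def precompute_mu(nodes, fvals):
    """mu_k = f(x_k) / prod_{j != k} (x_k - x_j), computed once from the data."""
    mu = np.empty(len(nodes))
    for k, xk in enumerate(nodes):
        mu[k] = fvals[k] / np.prod([xk - xj for j, xj in enumerate(nodes) if j != k])
    return mu

def eval_p(x, nodes, mu):
    """p(x) = prod_j (x - x_j) * sum_k mu_k / (x - x_k), for x not a node."""
    return np.prod(x - nodes) * np.sum(mu / (x - nodes))

nodes = np.array([0.0, 1.0, 2.0, 3.0])
fvals = (nodes - 3.0)**3                  # f(x) = (x - 3)^3, as in SAQ S1
mu = precompute_mu(nodes, fvals)
print(eval_p(6.0, nodes, mu))             # 27.0, agreeing with solution S1
```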

S4 First calculate
$$f(-5) = \tfrac{1}{26}, \quad f(0) = 1, \quad f(5) = \tfrac{1}{26}.$$
The unique quadratic p(x) taking these values is of the form $p(x) = 1 - ax^2$,
a > 0. To find a we use
$$\tfrac{1}{26} = 1 - a \cdot 5^2 \;\Rightarrow\; a = \tfrac{1}{26}.$$
Now, with n = 2 we have $x_{3/2} = \tfrac{5}{2}$ and
$$f\left(\tfrac{5}{2}\right) = \frac{1}{1 + \tfrac{25}{4}} = \frac{4}{29} = 0.137\,931\,034,$$
$$p\left(\tfrac{5}{2}\right) = 1 - \tfrac{1}{26} \cdot \tfrac{25}{4} = \tfrac{79}{104} = 0.759\,615\,384.$$
Thus
$$f\left(\tfrac{5}{2}\right) - p\left(\tfrac{5}{2}\right) = -0.621\,684\,35,$$
and the verification is complete.

S5 (a) $T_4(x) = 2xT_3(x) - T_2(x) = 2x(4x^3 - 3x) - (2x^2 - 1) = 8x^4 - 8x^2 + 1$,
$T_5(x) = 2x(8x^4 - 8x^2 + 1) - (4x^3 - 3x) = 16x^5 - 20x^3 + 5x$.
(b) $T_{2n}(x) = \cos(2n\cos^{-1}x) = 2\cos^2(n\cos^{-1}x) - 1 = 2T_n(x)^2 - 1$.

S6 Let X be the interpolation operator for the equally-spaced points $x_0 = -5$,
$x_1 = 0$, $x_2 = 5$. Then
$$\|X\|_\infty = \max\{\|Xf\|_\infty : \|f\|_\infty = 1\} = \max\{\|p\|_\infty : p \in P_2,\ |p(x_i)| \le 1,\ i = 0, 1, 2\}.$$
As in Problem P4, Chapter 3, the maximum is attained when $p(x_0)$, $p(x_1)$,
$p(x_2) = \pm 1$, and so we need consider only these cases. It is easy to see that
the maximum must occur for $p(x_0) = -1$, $p(x_1) = 1$, $p(x_2) = 1$, with the
polynomial
$$p(x) = \tfrac{5}{4} - \tfrac{1}{25}\left(x - \tfrac{5}{2}\right)^2 \;\Rightarrow\; \|p\|_\infty = p\left(\tfrac{5}{2}\right) = 1.25.$$
This confirms the first entry in the left-hand column of Table 4.5.
In the right-hand column, Powell uses the Chebyshev interpolation points
given by (4.27), scaled so that the initial and final points map to -5 and 5,
respectively (see the discussion before Table 4.5). In the first entry the points
given by (4.27) are $-\sqrt{3}/2$, 0, $\sqrt{3}/2$, so $\|X\|_\infty$ is calculated using $x_0 = -5$,
$x_1 = 0$, $x_2 = 5$ again. Hence $\|X\|_\infty = 1.25$ again, as required.
Note that if $\|X\|_\infty$ is calculated on [-1, 1] using the points in (4.27), then the
values obtained are different from those in the right-hand column. This is
because the points in (4.27) do not include ±1. A formula for $\|X\|_\infty$,
calculated in the latter way, is given in Powell Exercise 12.6.

Alternatively, Theorem 4.3 can be used. Here is the calculation for
equally-spaced points $x_0 = -5$, $x_1 = 0$, $x_2 = 5$. First
$$\ell_0(x) = \frac{(x-0)(x-5)}{(-5-0)(-5-5)} = \tfrac{1}{50}x(x-5),$$
$$\ell_1(x) = \frac{(x-(-5))(x-5)}{(0-(-5))(0-5)} = -\tfrac{1}{25}(x^2 - 25),$$
$$\ell_2(x) = \frac{(x-(-5))(x-0)}{(5-(-5))(5-0)} = \tfrac{1}{50}x(x+5).$$
Now, for $0 \le x \le 5$,
$$\sum_{k=0}^{2} |\ell_k(x)| = \tfrac{1}{50}x(5-x) + \tfrac{1}{25}(25 - x^2) + \tfrac{1}{50}x(x+5) = \tfrac{1}{25}(25 + 5x - x^2),$$
and the maximum of this expression occurs when $x = \tfrac{5}{2}$. Hence
$$\max_{0 \le x \le 5} \sum_{k=0}^{2} |\ell_k(x)| = \tfrac{1}{25}\left(25 + 5 \cdot \tfrac{5}{2} - \left(\tfrac{5}{2}\right)^2\right) = \tfrac{5}{4}.$$
By symmetry, the maximum of this sum will be the same for $-5 \le x \le 0$ and
so $\|X\|_\infty = 5/4$, as required.

Solutions to Problems in Chapter 4


P1 By Theorem 4.2, applied on [0, 1] to the interpolation points {0, 0.7} and
{0.7, 1}, there exist $\xi_0, \xi_1 \in [0, 1]$ such that
$$e_0(x) = f(x) - p_0(x) = \tfrac{1}{2}x(x - 0.7)f^{(2)}(\xi_0), \quad x \in [0, 1],$$
$$e_1(x) = f(x) - p_1(x) = \tfrac{1}{2}(x - 0.7)(x - 1)f^{(2)}(\xi_1), \quad x \in [0, 1],$$
where $p_0, p_1 \in P_1$ with
$$p_0(0) = 0, \quad p_0(0.7) = p_1(0.7) = 0.7, \quad p_1(1) = 0.1.$$
Now note that
$$|x(x - 0.7)| \le |(x - 0.7)(x - 1)| \;\Leftrightarrow\; |x| \le |x - 1| \;\Leftrightarrow\; x \le \tfrac{1}{2}.$$
Therefore (assuming that we have no information about $f^{(2)}$) it is best to use
$p_0(x)$ for $0 < x < \tfrac{1}{2}$ and $p_1(x)$ for $\tfrac{1}{2} < x < 1$.
Applying these error estimates at x = 0.5 itself, where $p_0(0.5) = 0.5$ and
$p_1(0.5) = 1.1$, we obtain
$$f(0.5) - 0.5 = -0.05 f^{(2)}(\xi_0) \le 0.05\,\|f^{(2)}\|_\infty,$$
$$f(0.5) - 1.1 = 0.05 f^{(2)}(\xi_1) \ge -0.05\,\|f^{(2)}\|_\infty.$$
Hence
$$1.1 - 0.05\,\|f^{(2)}\|_\infty \le f(0.5) \le 0.5 + 0.05\,\|f^{(2)}\|_\infty,$$
as required. Thus
$$\|f^{(2)}\|_\infty \ge \frac{1.1 - 0.5}{0.05 + 0.05} = 6.$$

P2 According to Theorem 4.2, the error in interpolating $f(x) = \cos x$ over
$[k\pi/n_1, (k+1)\pi/n_1]$ by $p_1 \in P_1$, such that $p_1(k\pi/n_1) = f(k\pi/n_1)$ and
$p_1((k+1)\pi/n_1) = f((k+1)\pi/n_1)$, is at most
$$\max_{\frac{k\pi}{n_1} \le x \le \frac{(k+1)\pi}{n_1}} \left| \tfrac{1}{2}\left(x - \frac{k\pi}{n_1}\right)\left(x - \frac{(k+1)\pi}{n_1}\right) \right|\,\|f^{(2)}\|_\infty.$$
Since $\|f^{(2)}\|_\infty \le 1$ and the maximum of $|(x-a)(x-b)|$ on [a, b] is
$((b-a)/2)^2$, we deduce that
$$\|f - p_1\|_\infty \le \frac{1}{2}\left(\frac{\pi}{2n_1}\right)^2 = \frac{\pi^2}{8n_1^2}.$$
To guarantee that this error is less than $10^{-6}$ it is, therefore, sufficient for $n_1$
to satisfy
$$n_1 > \frac{10^3 \pi}{\sqrt{8}} = 1110.7\ldots.$$
Again by Theorem 4.2, the error in interpolating $f(x) = \cos x$ over
$[k\pi/n_2, (k+2)\pi/n_2]$ by $p_2 \in P_2$, such that $p_2(k\pi/n_2) = f(k\pi/n_2)$,
$p_2((k+1)\pi/n_2) = f((k+1)\pi/n_2)$, $p_2((k+2)\pi/n_2) = f((k+2)\pi/n_2)$, is at
most
$$\max_{\frac{k\pi}{n_2} \le x \le \frac{(k+2)\pi}{n_2}} \left| \tfrac{1}{6}\left(x - \frac{k\pi}{n_2}\right)\left(x - \frac{(k+1)\pi}{n_2}\right)\left(x - \frac{(k+2)\pi}{n_2}\right) \right|\,\|f^{(3)}\|_\infty.$$
Since $\|f^{(3)}\|_\infty \le 1$ and the maximum of $|(x-a)(x-\tfrac{1}{2}(a+b))(x-b)|$ on
[a, b] is $(2\sqrt{3}/9)((b-a)/2)^3$, we deduce that
$$\|f - p_2\|_\infty \le \frac{1}{6} \cdot \frac{2\sqrt{3}}{9}\left(\frac{\pi}{n_2}\right)^3 = \frac{\pi^3}{3^{5/2}\,n_2^3}.$$
To guarantee that this error is less than $10^{-6}$ it is, therefore, sufficient for $n_2$
to satisfy
$$n_2 > \frac{10^2 \pi}{3^{5/6}} = 125.7\ldots.$$

P3 Clearly this exercise is related to Theorem 4.2. To solve it we must find a
substitute for the function g of (4.14). It is natural to consider the function
$$g(t) = f(t) - p(t) - e(x)\,\frac{t^n(t-1)^n}{x^n(x-1)^n}, \quad 0 \le t \le 1,$$
and to try and show that $g^{(2n)}(\xi) = 0$, for some $\xi \in [0, 1]$, since this would give
$$0 = f^{(2n)}(\xi) - e(x)\,\frac{(2n)!}{x^n(x-1)^n},$$
as required.
To prove that $g^{(2n)}(\xi) = 0$ for some ξ it is sufficient to prove that $g^{(n)} = 0$ at
n + 1 distinct points in [0, 1], since we can then apply Rolle's theorem n
times, as in the proof of Theorem 4.2.
By the definition of g we have g(x) = 0, and
$$g^{(k)}(0) = g^{(k)}(1) = 0, \quad k = 0, 1, \dots, n-1.$$
If we apply Rolle's theorem to g on [0, x] and on [x, 1], then we deduce that
there are two distinct points $x_0 \in (0, x)$ and $x_1 \in (x, 1)$ such that
$$g'(x_0) = g'(x_1) = 0.$$
Now apply Rolle's theorem to g' on $[0, x_0]$, $[x_0, x_1]$, $[x_1, 1]$, to deduce that
there are three distinct points in (0, 1) at which $g^{(2)} = 0$. Continuing in this
way, we apply Rolle's theorem n times to deduce that there are indeed n + 1
distinct points in (0, 1) at which $g^{(n)} = 0$, as required.

P4 Since
$$x_i = \cos\left(\frac{[2(n-i)+1]\pi}{2(n+1)}\right), \quad i = 0, 1, \dots, n,$$
and the function $f(x) = \cos x$ is concave on $[0, \pi/2]$ and convex on $[\pi/2, \pi]$,
the maximum gap occurs in the middle of the range and the minimum gap
occurs at the ends.
If n is even, then the maximum gap ($i = \tfrac{1}{2}n$, $i + 1 = \tfrac{1}{2}n + 1$) is
$$\cos\left(\frac{(n-1)\pi}{2(n+1)}\right) - \cos\left(\frac{(n+1)\pi}{2(n+1)}\right) = 2\sin\left(\frac{n\pi}{2(n+1)}\right)\sin\left(\frac{\pi}{2(n+1)}\right) < 2\sin\left(\frac{\pi}{2(n+1)}\right) < \frac{\pi}{n+1},$$
since $\sin x < x$, for $x > 0$.
If n is odd, then the maximum gap ($i = \tfrac{1}{2}(n-1)$, $i + 1 = \tfrac{1}{2}(n+1)$) is
$$\cos\left(\frac{n\pi}{2(n+1)}\right) - \cos\left(\frac{(n+2)\pi}{2(n+1)}\right) = 2\sin\left(\frac{(n+1)\pi}{2(n+1)}\right)\sin\left(\frac{\pi}{2(n+1)}\right) = 2\sin\left(\frac{\pi}{2(n+1)}\right) < \frac{\pi}{n+1}.$$
Since the gap for n + 1 equally-spaced points is 2/n, the desired factor is
indeed less than π/2 in both cases.
The minimum gap ($i = n - 1$, $i + 1 = n$) is
$$\cos\left(\frac{\pi}{2(n+1)}\right) - \cos\left(\frac{3\pi}{2(n+1)}\right) = 2\sin\left(\frac{\pi}{2(n+1)}\right)\sin\left(\frac{\pi}{n+1}\right).$$
Thus the ratio of the maximum to the minimum gap is
$$\begin{cases} \sin\left(\dfrac{n\pi}{2(n+1)}\right) \Big/ \sin\left(\dfrac{\pi}{n+1}\right), & n \text{ even}, \\[2ex] 1 \Big/ \sin\left(\dfrac{\pi}{n+1}\right), & n \text{ odd}. \end{cases}$$
It is evident that
$$\frac{1}{\sin\left(\frac{\pi}{n+1}\right)} > \frac{1}{\frac{\pi}{n+1}} = \frac{n+1}{\pi},$$
so the required lower estimate clearly holds for n odd. For n even, there is a
little more work to do, since $\sin(n\pi/2(n+1)) < 1$. However,
$$\sin\left(\frac{n\pi}{2(n+1)}\right) = \sin\left(\frac{\pi}{2} - \frac{\pi}{2(n+1)}\right) = \cos\left(\frac{\pi}{2(n+1)}\right)$$
and
$$\sin\left(\frac{\pi}{n+1}\right) = 2\sin\left(\frac{\pi}{2(n+1)}\right)\cos\left(\frac{\pi}{2(n+1)}\right),$$
so that
$$\frac{\sin\left(\frac{n\pi}{2(n+1)}\right)}{\sin\left(\frac{\pi}{n+1}\right)} = \frac{1}{2\sin\left(\frac{\pi}{2(n+1)}\right)} > \frac{1}{2 \cdot \frac{\pi}{2(n+1)}} = \frac{n+1}{\pi},$$
which completes the solution.

P5 We give a solution along the lines of that used to find $\|X\|_\infty$ in Problem P4,
Chapter 3. Note that Theorem 4.3 cannot be used because we are not
interpolating by a general element of $P_3$. To find $\|X\|_\infty$, we must find the
maximum on [0, 3] of $|p(x)| = |c_0 + c_1 x + c_3 x^3|$, where $|p(0)| \le 1$, $|p(2)| \le 1$
and $|p(3)| \le 1$; once again, by the linear programming argument, we need
consider only the cases $p(0), p(2), p(3) = \pm 1$. The critical cases are sketched
below.

[Figure: sketches of the three critical cases (a), (b), (c) on [0, 3], each taking the values ±1 at the points 0, 2, 3.]

In fact, case (c) gives the greatest value of |p(x)| in [0, 3]. In this case we have
$p(0) = 1$, $p(2) = 1$ and $p(3) = -1$, so that
$$p(x) = 1 + \tfrac{8}{15}x - \tfrac{2}{15}x^3 \;\Rightarrow\; \|p\|_\infty = p(2/\sqrt{3}) = 1 + \tfrac{32}{45\sqrt{3}},$$
as required.

Chapter 5 Divided differences
In Chapter 4 we found that interpolation can provide a good method of
determining a polynomial approximation to a given function. This chapter is
devoted to a good method of calculating such an interpolating polynomial using a
formula due to Newton which involves divided differences.
This chapter splits into TWO study sessions:
Study session 1: Sections 5.1, 5.2 and 5.3.
Study session 2: Sections 5.4 and 5.5.

Study Session 1: Basic properties of divided differences

Read Sections 5.1, 5.2 and 5.3

Commentary
1. The definition of the divided difference given in Section 5.1 makes it clear
that f [x0 , x1 , . . . , xn ] is independent of the order in which the points
x0 , x1 , . . . , xn appear. For example
f [x0 , x1 , x2 , x3 ] = f [x1 , x3 , x0 , x2 ].

2. The remarks at the bottom of page 47 will make more sense after you have
read how to calculate divided differences in Section 5.3.

3. The key features of the Newton formula (5.12) are that, for
k = 0, 1, . . . , n − 1,
(a) the first k + 1 terms comprise the polynomial pk ∈ Pk which interpolates
f at x0 , x1 , . . . , xk ;
(b) the (k + 2)th term is an estimate for the error in the approximation of f
by pk .
If a large number of function values are available, therefore, Newton’s
formula should give better and better approximations to f by choosing more
and more interpolation points. By checking the size of each additional term
calculated, one can decide when further interpolation points are of no help.

4. Some special cases of (5.14) are
$$f[x_j, x_{j+1}] = \frac{f[x_{j+1}] - f[x_j]}{x_{j+1} - x_j}, \qquad
f[x_j, x_{j+1}, x_{j+2}] = \frac{f[x_{j+1}, x_{j+2}] - f[x_j, x_{j+1}]}{x_{j+2} - x_j}.$$

The following diagram may help to interpret (5.14) in general.

    x_j        f(x_j)
    x_{j+1}    f(x_{j+1})
      ...                   f[x_j, ..., x_{j+k}]
                                                   f[x_j, ..., x_{j+k+1}]
      ...                   f[x_{j+1}, ..., x_{j+k+1}]
    x_{j+k}    f(x_{j+k})
    x_{j+k+1}  f(x_{j+k+1})

The (k + 1)th divided difference is found using the two adjacent terms in the
previous column and the corresponding x values at the ends of the diagonals.

5. The method of calculating divided differences given in Theorem 5.3 explains
the remarks at the bottom of page 47. For example, if the data
$f(x_0), f(x_1), \dots, f(x_n)$ is given and $h_i = x_i - x_{i-1}$, $i = 1, 2, \dots, n$, then
adding ε to $f(x_2)$ has the following effect on the table (note how the errors
grow rapidly if the h values are small).

    x_0    0
                  0
    x_1    0                +ε/(h_1 h_2 + h_2^2)
                  +ε/h_2
    x_2    +ε               -ε/(h_2 h_3)             ...
                  -ε/h_3
    x_3    0                +ε/(h_3^2 + h_3 h_4)
                  0
    x_4    0

The pattern which emerges in the case of equally-spaced data is investigated
in Powell Exercise 5.3 and exploited in Powell Exercise 5.8.

6. When evaluating (5.12) it is sometimes convenient to use the nested form
$$p_n(x) = f(x_0) + (x - x_0)\bigl(f[x_0, x_1] + (x - x_1)\bigl(f[x_0, x_1, x_2] + \cdots + (x - x_{n-1})f[x_0, x_1, \dots, x_n]\bigr) \cdots \bigr).$$
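As an illustration (our own sketch, not Powell's code), the recurrence (5.14) and this nested form can be combined into a few lines; the data below are those of SAQ S1.

```python
import numpy as np

def newton_coeffs(x, f):
    """Return f[x_0], f[x_0,x_1], ..., f[x_0,...,x_n] via the recurrence (5.14)."""
    c = np.array(f, dtype=float)
    for k in range(1, len(x)):
        # After pass k, c[i] holds f[x_{i-k}, ..., x_i] for i >= k; the entries
        # c[0..k-1] are already the leading divided differences f[x_0,...,x_i].
        c[k:] = (c[k:] - c[k-1:-1]) / (x[k:] - x[:-k])
    return c

def newton_eval(t, x, c):
    """Evaluate (5.12) by nested multiplication, innermost coefficient first."""
    p = c[-1]
    for xk, ck in zip(x[-2::-1], c[-2::-1]):
        p = ck + (t - xk) * p
    return p

x = np.array([-2.0, -1.0, 2.0, 3.0, 4.0])        # data of SAQ S1
f = np.array([3.28, 17.36, 14.96, 19.28, 36.16])
c = newton_coeffs(x, f)                          # [3.28, 14.08, -3.72, 1.0, 0.0]
print(newton_eval(4.0, x, c))                    # 36.16, as expected
```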

Self-assessment questions
S1 Powell Exercise 5.1

S2 Powell Exercise 5.2

Study Session 2: Numerical considerations and
Hermite interpolation

Read Sections 5.4 and 5.5

Commentary
1. The method of interpolation by calculating the coefficients $c_i$, $i = 0, 1, \dots, n$,
in $p(x) = \sum_{i=0}^{n} c_i x^i$ is, of course, convenient for interpolating by very low
degree polynomials, where the coefficients can be found exactly. For higher
degree polynomials, however, it is difficult to calculate the coefficients with
sufficient accuracy because the corresponding matrix equation may be
ill-conditioned.
2. The assertion at the bottom of page 54 that p is a multiple of $\prod_{i=0}^{m} (x - x_i)^{l_i + 1}$
is true because
$$p^{(j)}(x_i) = f^{(j)}(x_i) = 0, \quad j = 0, 1, \dots, l_i, \quad i = 0, 1, \dots, m.$$
This implies that the Taylor expansion of p about each $x_i$ begins
$$p(x) = \frac{p^{(l_i + 1)}(x_i)}{(l_i + 1)!}(x - x_i)^{l_i + 1} + \cdots,$$
so that $(x - x_i)^{l_i + 1}$ is a factor of p(x), for each i.

3. The word ‘suitable’ at the top of page 56 can be interpreted to mean ‘valid’.

4. The proof of Theorem 5.5 shows that Hermite interpolation is the limiting
case of Newton’s formula (5.12), which is obtained when various adjacent
interpolation points merge together.

Self-assessment questions
S3 Verify equation (5.19).

S4 Calculate the value p(1.8) given by (5.20) and confirm the value p(1.8) given
by (5.21). (Evaluate these polynomials by nested multiplication:
$a_0 + a_1x + a_2x^2 + \cdots + a_nx^n = a_0 + x(a_1 + \cdots + x(a_{n-1} + a_nx) \cdots)$.)

S5 Verify that the polynomial (5.29) satisfies the last two interpolation
conditions in (5.28).

Problems for Chapter 5


P1 Powell Exercise 5.3

P2 Powell Exercise 5.4

P3 Powell Exercise 5.5

P4 Powell Exercise 5.7

P5 Powell Exercise 5.8

Solutions to SAQs in Chapter 5
S1 The table is as follows.

    x_i   f(x_i)   Order 1   Order 2   Order 3   Order 4
    -2     3.28
                    14.08
    -1    17.36              -3.72
                    -0.8                 1.0
     2    14.96               1.28                   0
                     4.32                1.0
     3    19.28               6.28
                    16.88
     4    36.16

Thus
$$p_4(x) = 3.28 + 14.08(x + 2) - 3.72(x + 2)(x + 1) + (x + 2)(x + 1)(x - 2)$$
and so
$$p_4(4) = 3.28 + 14.08 \times 6 - 3.72 \times 6 \times 5 + 6 \times 5 \times 2 = 36.16,$$
as expected. Note that $p_4(x) = p_3(x)$ in this example.

S2 The required formula for $p_n'(x_0)$ follows from (5.12) by noting that, for
$k = 1, 2, \dots, n-1$,
$$\frac{d}{dx}(x - x_0)\cdots(x - x_k) = (x - x_1)\cdots(x - x_k) + (x - x_0)\frac{d}{dx}(x - x_1)\cdots(x - x_k)$$
and so
$$\frac{d}{dx}(x - x_0)\cdots(x - x_k)\Big|_{x = x_0} = (x_0 - x_1)\cdots(x_0 - x_k).$$
Thus
$$p'(2) = f[2, 3] + (2 - 3)f[2, 3, 4] + (2 - 3)(2 - 4)f[2, 3, 4, -1] + (2 - 3)(2 - 4)(2 + 1)f[2, 3, 4, -1, -2].$$
By Comment 1 on page 47 and the above table,
$$f[2, 3] = f[3, 2] = 4.32,$$
$$f[2, 3, 4] = f[4, 3, 2] = 6.28,$$
$$f[2, 3, 4, -1] = f[4, 3, 2, -1] = 1,$$
$$f[2, 3, 4, -1, -2] = f[4, 3, 2, -1, -2] = 0.$$
Thus
$$p'(2) = 4.32 + (2 - 3) \times 6.28 + (2 - 3)(2 - 4) = 0.04.$$
The ordering $x_0 = 2$, $x_1 = -1$, $x_2 = 3$, $x_3 = -2$, $x_4 = 4$ also allows us to
obtain the divided differences
$$f[2, -1], \quad f[2, -1, 3], \quad f[2, -1, 3, -2], \quad f[2, -1, 3, -2, 4]$$
without compiling a fresh table. With this ordering,
$$p'(2) = f[2, -1] + (2 + 1)f[2, -1, 3] + (2 + 1)(2 - 3)f[2, -1, 3, -2] + (2 + 1)(2 - 3)(2 + 2)f[2, -1, 3, -2, 4]$$
$$= -0.8 + 3 \times 1.28 + 3 \times (-1) \times 1 + 0 = 0.04,$$
once again.

S3 In exact arithmetic
p(1.8) = 0.0823 − 0.2 × 0.236 33 + 0.2 × 0.17 × 0.329
− 0.2 × 0.17 × 0.1 × 0.328 87 + 0.2 × 0.17 × 0.1 × 0.04 × 0.5008
= 0.0823 − 0.047 266 + 0.011 186 − 0.001 118 158 + 0.000 068 108 8
= 0.045 169 950 8.
(Note that using nested multiplication here may lose you a couple of digits at
the end.)

S4 Using (5.20),
p(1.8) = 6.700 98 + 1.8(−13.360 21 + 1.8(10.3856 + 1.8(−3.692 41 + 1.8 × 0.502 72)))
= 0.045 164 35,
which agrees with the data to 4 places of decimals. Using (5.21),
p(1.8) = 6.701 + 1.8(−13.36 + 1.8(10.386 + 1.8(−3.6924 + 1.8 × 0.502 72)))
= 0.046 916 672,
which agrees with the data to only 2 places of decimals.

S5 Since 1.8 − 1.6 = 0.2, 1.8 − 1.7 = 0.1 and 1.8 − 1.8 = 0,
$$p(1.8) = 0.082\,297 + 0.2\bigl(-0.246\,892 + 0.2(0.335\,92 + 0.1 \times (-0.297\,35))\bigr) = 0.045\,166,$$
as required. Now
$$\frac{d}{dx}(x - 1.6)^2 = 2(x - 1.6),$$
$$\frac{d}{dx}(x - 1.6)^2(x - 1.7) = 2(x - 1.6)(x - 1.7) + (x - 1.6)^2,$$
$$\frac{d}{dx}(x - 1.6)^2(x - 1.7)(x - 1.8) = 2(x - 1.6)(x - 1.7)(x - 1.8) + (x - 1.6)^2(2x - 3.5).$$
Hence
$$p'(1.8) = -0.246\,892 + 0.335\,92 \times 2 \times 0.2 - 0.297\,35(2 \times 0.2 \times 0.1 + 0.2 \times 0.2) + 0.203\,75(0.2 \times 0.2 \times 0.1) = -0.135\,497,$$
as required.
Remark In more complicated examples, one might use a more systematic
approach to calculate p'(x), where
$$p(x) = a_0 + (x - x_0)(a_1 + (x - x_1)(a_2 + \cdots + (x - x_{n-1})(a_n + a_{n+1}(x - x_n)) \cdots)).$$
Put
$$p(x) = q_0(x) = a_0 + (x - x_0)q_1(x) = a_0 + (x - x_0)(a_1 + (x - x_1)q_2(x)) = \cdots,$$
so that
$$q_n(x) = a_n + a_{n+1}(x - x_n) \quad \text{and} \quad q_{n+1}(x) = a_{n+1}.$$
Then
$$q_k(x) = a_k + (x - x_k)q_{k+1}(x)$$
and so
$$q_k'(x) = q_{k+1}(x) + (x - x_k)q_{k+1}'(x).$$
Hence, by induction,
$$p'(x) = q_1(x) + (x - x_0)\bigl(q_2(x) + \cdots + (x - x_{n-2})\bigl(q_n(x) + (x - x_{n-1})q_{n+1}(x)\bigr) \cdots \bigr).$$
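In code this systematic approach is only a few lines; below is a minimal sketch (ours), checked against the value $p'(2) = 0.04$ found in solution S2 of this chapter.

```python
def newton_eval_with_derivative(t, x, a):
    """Evaluate p(t) and p'(t) together, using the recurrences
    q_k = a_k + (t - x_k) q_{k+1}  and  q_k' = q_{k+1} + (t - x_k) q_{k+1}'."""
    p, dp = a[-1], 0.0
    for k in range(len(a) - 2, -1, -1):
        dp = p + (t - x[k]) * dp    # uses the old p, which equals q_{k+1}(t)
        p = a[k] + (t - x[k]) * p
    return p, dp

# Data of SAQ S2, with the nodes ordered 2, -1, 3, -2, 4.
x = [2.0, -1.0, 3.0, -2.0, 4.0]
a = [14.96, -0.8, 1.28, 1.0, 0.0]   # divided differences in this ordering
print(newton_eval_with_derivative(2.0, x, a))   # (14.96, 0.04)
```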

Solutions to Problems in Chapter 5
P1 First note that with the given values of $x_i$, we have
$$\prod_{\substack{j=0 \\ j \neq k}}^{n} (x_k - x_j) = k!h^k \cdot (n-k)!(-h)^{n-k} = (-1)^{n-k}h^n\,k!(n-k)!,$$
so that
$$f[x_0, x_1, \dots, x_n] = h^{-n} \sum_{k=0}^{n} (-1)^{n-k}\,\frac{f(x_k)}{k!(n-k)!}.$$
To verify that this formula is consistent with Theorem 5.3 we note that
$$f[x_j, \dots, x_{j+k+1}] = h^{-k-1} \sum_{i=0}^{k+1} (-1)^{k+1-i}\,\frac{f(x_{i+j})}{i!(k+1-i)!},$$
$$f[x_j, \dots, x_{j+k}] = h^{-k} \sum_{i=0}^{k} (-1)^{k-i}\,\frac{f(x_{i+j})}{i!(k-i)!},$$
$$f[x_{j+1}, \dots, x_{j+k+1}] = h^{-k} \sum_{i=0}^{k} (-1)^{k-i}\,\frac{f(x_{i+j+1})}{i!(k-i)!}$$
and
$$x_{j+k+1} - x_j = (k+1)h.$$
Now, the coefficient of $f(x_{i+j})$ in
$$\frac{f[x_{j+1}, \dots, x_{j+k+1}] - f[x_j, \dots, x_{j+k}]}{x_{j+k+1} - x_j}$$
is
$$\frac{1}{(k+1)h^{k+1}} \left( \frac{(-1)^{k-(i-1)}}{(i-1)!(k-(i-1))!} - \frac{(-1)^{k-i}}{i!(k-i)!} \right)
= \frac{(-1)^{k+1-i}}{(k+1)h^{k+1}} \left( \frac{1}{(i-1)!(k+1-i)!} + \frac{1}{i!(k-i)!} \right)
= \frac{(-1)^{k+1-i}}{h^{k+1}\,i!(k+1-i)!},$$
which is the coefficient of $f(x_{i+j})$ in $f[x_j, \dots, x_{j+k+1}]$. Hence the recurrence
relation (5.14) does indeed hold.

P2 The table can be reconstructed from the first entry in each column using
(5.14) in the form
f [xj+1 , . . . , xj+k+1 ] = f [xj , . . . , xj+k ] + (xj+k+1 − xj )f [xj , . . . , xj+k+1 ].
For example,
f [1.8, 1.76, 1.7, 1.63] = f [1.76, 1.7, 1.63, 1.6] + (1.8 − 1.6)f [1.8, 1.76, 1.7, 1.63, 1.6]
= −0.328 87 + 0.2 × 0.500 80
= −0.228 71,

f [1.76, 1.7, 1.63] = f [1.7, 1.63, 1.6] + (1.76 − 1.6)f [1.76, 1.7, 1.63, 1.6]
= 0.329 + 0.16 × (−0.328 87)
= 0.276 380 8,

f [1.8, 1.76, 1.7] = f [1.76, 1.7, 1.63] + (1.8 − 1.63)f [1.8, 1.76, 1.7, 1.63]
= 0.276 380 8 + 0.17 × (−0.228 71)
= 0.237 500 1.

Continuing in this manner, we find
f [1.7, 1.63] = −0.236 33 + (1.7 − 1.6) × 0.329 = −0.203 43,
f [1.76, 1.7] = −0.203 43 + (1.76 − 1.63) × 0.276 380 8 = −0.167 500 49,
f [1.8, 1.76] = −0.167 500 49 + (1.8 − 1.7) × 0.237 500 1 = −0.143 750 48,
f [1.63] = 0.082 30 + (1.63 − 1.6) × (−0.236 33) = 0.075 210 1,
f [1.7] = 0.075 210 1 + (1.7 − 1.63) × (−0.203 43) = 0.060 97,
f [1.76] = 0.060 97 + (1.76 − 1.7) × (−0.167 500 49) = 0.050 919 97,
f [1.8] = 0.050 919 97 + (1.8 − 1.76) × (−0.143 750 48) = 0.045 169 951.

P3 The required table is as follows.

    0   f(0)
               f'(0)
    0   f(0)                (1/2)f''(0)
               f'(0)                            f[1,0] - f'(0) - (1/2)f''(0)
    0   f(0)                f[1,0] - f'(0)                                      f'(1) - 3f[1,0] + 2f'(0) + (1/2)f''(0)
               f[1,0]                           f'(1) - 2f[1,0] + f'(0)
    1   f(1)                f'(1) - f[1,0]
               f'(1)
    1   f(1)

Hence
$$p(x) = f(0) + xf'(0) + \tfrac{1}{2}x^2 f''(0) + x^3\bigl(f[1,0] - f'(0) - \tfrac{1}{2}f''(0)\bigr) + x^3(x-1)\bigl(f'(1) - 3f[1,0] + 2f'(0) + \tfrac{1}{2}f''(0)\bigr).$$
If $f(x) = (x+1)^4$, then f(0) = 1, f'(0) = 4, f''(0) = 12, f(1) = 16,
f'(1) = 32, so that
$$p(x) = 1 + 4x + 6x^2 + x^3(15 - 4 - 6) + x^3(x-1)(32 - 45 + 8 + 6)$$
$$= 1 + 4x + 6x^2 + 5x^3 + x^3(x-1) = 1 + 4x + 6x^2 + 4x^3 + x^4 = (1+x)^4,$$
as required.

P4 It is easy to prove that if $f^{(k)}$ is strictly increasing, then the kth-order
differences are increasing. Indeed, by Theorem 5.1 and Theorem 5.3,
$$f[x_{j+1}, \dots, x_{j+k+1}] - f[x_j, \dots, x_{j+k}] = (x_{j+k+1} - x_j)\,f[x_j, \dots, x_{j+k+1}]
= (x_{j+k+1} - x_j)\,\frac{f^{(k+1)}(\xi)}{(k+1)!},$$
for some ξ in $[x_j, x_{j+k+1}]$. Since $f^{(k)}$ is strictly increasing, we have
$f^{(k+1)}(\xi) \ge 0$. Thus
$$f[x_{j+1}, \dots, x_{j+k+1}] \ge f[x_j, \dots, x_{j+k}].$$
Some extra work is required to prove that this inequality must be strict. If it
is not, then $f[x_j, \dots, x_{j+k+1}] = 0$, and so the polynomial p in $P_{k+1}$ which
interpolates f at $x_j, \dots, x_{j+k+1}$ actually lies in $P_k$. Therefore, $e = f - p$ has
(at least) k + 2 zeros and so, by Rolle's Theorem, $e^{(k)}$ has (at least) 2 zeros.
Hence $f^{(k)} = p^{(k)}$ at (at least) 2 points. But $p^{(k)}$ is a constant and $f^{(k)}$ is
strictly increasing — a contradiction. Hence the kth-order differences are
strictly increasing.

P5 The difference table is as follows.
xi f (xi ) Order 1 Order 2 Order 3
0.0 0.0
0.119 778
0.1 0.119 778 0.009 57
0.129 348 0.000 018
0.2 0.249 126 0.009 588
0.138 936 −0.002 982
0.3 0.388 062 0.006 606
0.145 542 0.009 015
0.4 0.533 604 0.015 621
0.161 163 −0.008 982
0.5 0.694 767 0.006 639
0.167 802 0.003 013
0.6 0.862 569 0.009 652
0.177 454 0.000 005
0.7 1.040 023 0.009 657
0.187 111 0.000 041
0.8 1.227 134 0.009 698
0.196 809 −0.000 015
0.9 1.423 943 0.009 683
0.206 492
1.0 1.630 435
The second-order differences are irregular. Most noticeably, the numbers
0.006 606, 0.015 621, 0.006 639 are substantially different from the other
entries, and this can be traced to an error in the value of f (0.4). Indeed, if the
above value of f (0.4) is increased by ε, then the increases in the second-order
differences are ε, −2ε, ε respectively, so that their average remains constant at
(0.006 606 + 0.015 621 + 0.006 639)/3 = 0.009 622.
For the middle one of these three differences to equal 0.009 622, we require
2ε = 0.015 621 − 0.009 622, that is, ε = 0.002 999 5. It appears likely,
therefore, that f (0.4) should actually be 0.536 604. (Alternatively: note that
the increases in the corresponding third-order differences are ε, −3ε, 3ε, −ε,
so that ε ≈ 0.003.)
Once this error is corrected, notice that the second-order differences are
increasing, apart from the last one, and that the increase from 0.009 657 to
0.009 698 is abnormally large. This can be traced to an error in the value of
f (0.8). Once again an increase of ε in the above value of f (0.8) leads to
increases in the last three second-order differences of ε, −2ε, ε respectively.
The average of these differences is 0.009 679 3̇, which suggests that
2ε = 0.009 698 − 0.009 679 3̇, that is, ε = 0.000 009 3̇. It appears likely,
therefore, that f (0.8) should actually be 1.227 143.
Once these corrections are made, the third-order differences are (giving only
the significant digits)
18, 18, 15, 18, 13, 14, 14, 12,
which is quite regular, since the final digit of the data is probably rounded.

Chapter 6 The uniform convergence of
polynomial approximations
Chapter 6 begins the detailed study of uniform approximation, that is,
approximation of functions in the ∞-norm. The chapter is almost entirely
devoted to the proof of Weierstrass’ approximation theorem, which states that if f
is continuous on [a, b] and ε > 0 is given, then there is a polynomial p such that
$\|p - f\|_\infty \le \varepsilon$. The degree of p is not fixed in advance and will in general depend
on ε and f.
In this chapter we make frequent use of the Binomial Theorem
$$(x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k},$$
especially the cases x + y = 1 and x = y = 1. We also require Stirling's
asymptotic formula, $n! \sim \sqrt{2\pi n}\,(n/e)^n$, in the problems.
This chapter splits into TWO study sessions:
Study session 1: Sections 6.1 and 6.2.
Study session 2: Sections 6.3 and 6.4.

Study Session 1: Monotone operators

Read Sections 6.1 and 6.2

Commentary
1. Weierstrass’ Theorem (proved in 1885) is probably the best known result in
approximation theory. Nevertheless, it is still a little surprising, since a
function f can be very badly behaved and yet be continuous. For example,
there are many functions f which are continuous but nowhere differentiable.
One such function (called the ‘blancmange function’ on account of the shape
of its graph) is given by
$$f(x) = \sum_{n=0}^{\infty} \frac{1}{2^n}\,\phi(2^n x), \quad 0 \le x \le 1,$$
where $\phi(x) = |x|$, for $|x| \le \tfrac{1}{2}$, and $\phi(x + n) = \phi(x)$, for $n = 0, \pm 1, \pm 2, \dots$.
The first example of such a continuous, nowhere differentiable function is due
to Weierstrass himself (1872).
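For interest, here is a minimal sketch (ours) which evaluates a truncation of this series; note that φ(x) is just the distance from x to the nearest integer, and the neglected tail of the series after N terms is at most $2^{-N}$.

```python
import numpy as np

def phi(x):
    """Distance from x to the nearest integer (period 1, equal to |x| near 0)."""
    return np.abs(x - np.round(x))

def blancmange(x, terms=50):
    """Partial sum of sum_{n >= 0} phi(2^n x) / 2^n."""
    return sum(phi(2.0**n * x) / 2.0**n for n in range(terms))

print(blancmange(np.linspace(0.0, 1.0, 5)))   # [0.  0.5 0.5 0.5 0. ]
```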

2. Powell proves Weierstrass’ Theorem using Theorem 6.2, which is due to


Korovkin (1953) and Bohman (1952).

3. Note that the assertion in (6.9) uses the fact that a continuous function on a
closed interval [a, b] is uniformly continuous (see commentary on Chapter 1).

Self-assessment questions
S1 Verify the simpler form of the definition of a monotone linear operator L,
given in Section 6.2 (first paragraph).

S2 Verify that (6.11) holds for all x ∈ [a, b].

S3 Powell Exercise 6.1

Study Session 2: The Bernstein operator

Read Sections 6.3 and 6.4

Commentary
 
1. The expression $n!/(k!(n-k)!)$ is commonly referred to as $\binom{n}{k}$ or ${}^n C_k$.

2. The function $B_n f$ can be thought of as a weighted average of the function
values $f(k/n)$, $k = 0, 1, \dots, n$. The weights $\binom{n}{k} x^k (1-x)^{n-k}$ are smooth
functions of x which are relatively large for x near k/n and relatively small
elsewhere (see Powell Exercise 6.6).
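A minimal sketch (ours) of the operator, used here to reproduce the slow convergence examined in Powell Exercise 6.5 (cf. Problem P3 below) for $f(x) = |x - \tfrac{1}{2}|$ at $x = \tfrac{1}{2}$:

```python
import numpy as np
from math import comb

def bernstein(f, n, x):
    """(B_n f)(x) = sum_k f(k/n) C(n,k) x^k (1 - x)^(n - k)."""
    k = np.arange(n + 1)
    weights = np.array([comb(n, j) for j in k]) * x**k * (1.0 - x)**(n - k)
    return np.sum(weights * f(k / n))

f = lambda t: np.abs(t - 0.5)
for n in (10, 50, 200):
    err = bernstein(f, n, 0.5) - f(0.5)
    print(n, err, 1.0 / np.sqrt(2 * np.pi * n))   # err is close to 1/sqrt(2 pi n)
```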

3. A more careful analysis of the Bernstein operator shows that
$$\|B_n f - f\|_\infty \le \tfrac{3}{2}\,\omega(1/\sqrt{n}),$$
where $\omega(\delta) = \sup\{|f(x) - f(y)| : |x - y| \le \delta\}$ is the modulus of continuity
of f. Powell Exercise 6.5 shows that we cannot hope for a significantly
smaller estimate of the error in approximating a continuous function by the
Bernstein operator (compare Powell Exercise 6.8, however, for an
improvement if $f \in C^{(2)}[0, 1]$).

Self-assessment questions
S4 Verify the identity (6.27).

S5 Determine the modulus of continuity ω of the functions
(a) $f(x) = x^2$ $(0 \le x \le 1)$;
(b) $f(x) = |x - \tfrac{1}{2}|$ $(0 \le x \le 1)$.

S6 Prove that if f is continuous on [a, b], then its modulus of continuity ω satisfies
(a) ω is increasing;
(b) lim ω(δ) = 0;
δ→0
(c) ω(δ1 + δ2 ) ≤ ω(δ1 ) + ω(δ2 ).

Problems for Chapter 6


P1 Powell Exercise 6.2

P2 Powell Exercise 6.3

P3 Powell Exercise 6.5 (Hint: you will need to use Stirling's formula
$n! \sim \sqrt{2\pi n}\,(n/e)^n$.)

P4 Powell Exercise 6.6

P5 Powell Exercise 6.8

Solutions to SAQs in Chapter 6
S1 If L is linear and satisfies
(Lf )(x) ≥ 0, a ≤ x ≤ b,
whenever
f (x) ≥ 0, a ≤ x ≤ b,
then
(Lf )(x) − (Lg)(x) = (L(f − g))(x) ≥ 0, a ≤ x ≤ b,
whenever
f (x) ≥ g(x), a ≤ x ≤ b,
as required.

S2 Case 1 If $|x - \xi| \le \delta$, then
$$q_u(x) \ge f(\xi) + \varepsilon \ge f(x),$$
by (6.9).
Case 2 If $|x - \xi| > \delta$, then
$$q_u(x) \ge f(\xi) + \varepsilon + 2\|f\|_\infty \ge \varepsilon + \|f\|_\infty \quad (\text{since } \|f\|_\infty \ge -f(\xi))$$
$$\ge f(x) \quad (\text{since } \|f\|_\infty \ge f(x)).$$
Thus (6.11) holds in either case.

S3 We know that X is a linear operator, so we can use the simpler definition of
monotone operator discussed in SAQ S1.
Suppose first that $x_0 = a$, $x_1 = b$. If $f(x) \ge 0$, $a \le x \le b$, then $f(a), f(b) \ge 0$,
so that
$$(Xf)(x) = \frac{(b - x)f(a) + (x - a)f(b)}{b - a} \ge 0, \quad a \le x \le b,$$
and hence X is monotone in this case.
Now suppose that x0 > a. We show that X is not monotone by choosing a
non-negative function f on [a, b] such that Xf fails to be non-negative on
[a, b]. A simple example is f (x) = |x − x0 |, since f (x0 ) = 0, f (x1 ) > 0 implies
that the linear interpolating function Xf (x) is negative for a < x < x0 .
A similar counterexample can be given if x1 < b.
S4
$$\sum_{k=0}^{n} \binom{n}{k} \left(\frac{k}{n}\right)^2 x^k (1-x)^{n-k}
= \sum_{k=1}^{n} \binom{n-1}{k-1} \frac{k}{n}\, x^k (1-x)^{n-k}$$
$$= \frac{1}{n} \sum_{k=1}^{n} \binom{n-1}{k-1} x^k (1-x)^{n-k} \bigl((k-1) + 1\bigr)$$
$$= \left(\frac{n-1}{n}\right) x^2 \sum_{k=2}^{n} \binom{n-2}{k-2} x^{k-2} (1-x)^{n-k}
+ \frac{1}{n}\, x \sum_{k=1}^{n} \binom{n-1}{k-1} x^{k-1} (1-x)^{n-k}$$
$$= \left(\frac{n-1}{n}\right) x^2 + \frac{1}{n}\, x,$$
by the Binomial Theorem.

 
S5 (a) By inspection, the largest value of $|x^2 - y^2|$ when $|x - y| \le \delta$, $x, y \in [0, 1]$,
occurs for x = 1, y = 1 - δ. Thus
$$\omega(\delta) = 1 - (1 - \delta)^2 = 2\delta - \delta^2.$$
(b) By inspection, the largest value of $\bigl|\,|x - \tfrac{1}{2}| - |y - \tfrac{1}{2}|\,\bigr|$ when
$|x - y| \le \delta \le \tfrac{1}{2}$, $x, y \in [0, 1]$, occurs for $x = \tfrac{1}{2}$, $y = \tfrac{1}{2} + \delta$. When
$\tfrac{1}{2} < \delta \le 1$, the largest value occurs for $x = \tfrac{1}{2}$, $y = 1$. Thus
$$\omega(\delta) = \begin{cases} \delta, & 0 < \delta \le \tfrac{1}{2}, \\ \tfrac{1}{2}, & \tfrac{1}{2} < \delta \le 1. \end{cases}$$

S6 (a) If 0 < δ1 < δ2 , then


ω(δ1 ) = sup{|f (x) − f (y)| : |x − y| ≤ δ1 }
≤ sup{|f (x) − f (y)| : |x − y| ≤ δ2 }
= ω(δ2 ),
since the second supremum is taken over a larger set.
(b) For each ε > 0, there exists δ > 0 such that
|x − y| ≤ δ ⇒ |f (x) − f (y)| ≤ ε,
and so
0 ≤ ω(δ) ≤ ε.
Hence
lim ω(δ) = 0.
δ→0

(c) If x, y satisfy |x − y| ≤ δ1 + δ2 , then we can choose z such that


|x − z| ≤ δ1 , |z − y| ≤ δ2 . Since
|f (x) − f (y)| ≤ |f (x) − f (z)| + |f (z) − f (y)|
≤ ω(δ1 ) + ω(δ2 ),
we deduce that
ω(δ1 + δ2 ) ≤ ω(δ1 ) + ω(δ2 ),
as required.

Solutions to Problems in Chapter 6
P1 The proof is a generalisation of that of (6.27):
$$\sum_{k=0}^{n} \binom{n}{k} \left(\frac{k}{n}\right)^3 x^k (1-x)^{n-k}
= \sum_{k=1}^{n} \binom{n-1}{k-1} \left(\frac{k}{n}\right)^2 x^k (1-x)^{n-k}$$
$$= \frac{1}{n^2} \sum_{k=1}^{n} \binom{n-1}{k-1} x^k (1-x)^{n-k} \bigl((k-1)(k-2) + 3(k-1) + 1\bigr)$$
$$= \frac{(n-1)(n-2)}{n^2}\, x^3 \sum_{k=3}^{n} \binom{n-3}{k-3} x^{k-3} (1-x)^{n-k}
+ \frac{3(n-1)}{n^2}\, x^2 \sum_{k=2}^{n} \binom{n-2}{k-2} x^{k-2} (1-x)^{n-k}
+ \frac{1}{n^2}\, x \sum_{k=1}^{n} \binom{n-1}{k-1} x^{k-1} (1-x)^{n-k}$$
$$= \frac{(n-1)(n-2)}{n^2}\, x^3 + \frac{3(n-1)}{n^2}\, x^2 + \frac{1}{n^2}\, x,$$
by the Binomial Theorem.
The method generalises to $x^r$, for n > r, because we can always write $k^{r-1}$ as
a linear combination of the expressions
$$(k-1)\cdots(k-r+1),\ (k-1)\cdots(k-r+2),\ \dots,\ (k-1)(k-2),\ (k-1),\ 1.$$
Note that if n = r then $B_n f$ is automatically in $P_r = P_n$.

P2 By (6.23),
$$p(j/6) = \sum_{k=0}^{6} \binom{6}{k} (j/6)^k (1 - j/6)^{6-k} f_k,$$
where $f_k = f(k/6)$, $k = 0, 1, \dots, 6$.
Thus, with the given values of $p(j/6)$, $j = 0, 1, \dots, 6$, we have $0 = p(0) = f_0$
and $0 = p(1) = f_6$, so that
$$0 = p(1/6) = \tfrac{1}{6^6}\bigl(6 \cdot 5^5 f_1 + 15 \cdot 5^4 f_2 + 20 \cdot 5^3 f_3 + 15 \cdot 5^2 f_4 + 6 \cdot 5 f_5\bigr),$$
$$0 = p(1/3) = \tfrac{1}{6^6}\bigl(6 \cdot 4^5 \cdot 2 f_1 + 15 \cdot 4^4 \cdot 2^2 f_2 + 20 \cdot 4^3 \cdot 2^3 f_3 + 15 \cdot 4^2 \cdot 2^4 f_4 + 6 \cdot 4 \cdot 2^5 f_5\bigr),$$
$$1 = p(1/2) = \tfrac{1}{2^6}\bigl(6 f_1 + 15 f_2 + 20 f_3 + 15 f_4 + 6 f_5\bigr),$$
$$0 = p(2/3) = \tfrac{1}{6^6}\bigl(6 \cdot 2^5 \cdot 4 f_1 + 15 \cdot 2^4 \cdot 4^2 f_2 + 20 \cdot 2^3 \cdot 4^3 f_3 + 15 \cdot 2^2 \cdot 4^4 f_4 + 6 \cdot 2 \cdot 4^5 f_5\bigr),$$
$$0 = p(5/6) = \tfrac{1}{6^6}\bigl(6 \cdot 5 f_1 + 15 \cdot 5^2 f_2 + 20 \cdot 5^3 f_3 + 15 \cdot 5^4 f_4 + 6 \cdot 5^5 f_5\bigr),$$
which reduce to
$$3750 f_1 + 1875 f_2 + 500 f_3 + 75 f_4 + 6 f_5 = 0, \qquad (1)$$
$$48 f_1 + 60 f_2 + 40 f_3 + 15 f_4 + 3 f_5 = 0, \qquad (2)$$
$$6 f_1 + 15 f_2 + 20 f_3 + 15 f_4 + 6 f_5 = 64, \qquad (3)$$
$$3 f_1 + 15 f_2 + 40 f_3 + 60 f_4 + 48 f_5 = 0, \qquad (4)$$
$$6 f_1 + 75 f_2 + 500 f_3 + 1875 f_4 + 3750 f_5 = 0. \qquad (5)$$
By considering (1)-(5) and (2)-(4), we find that $f_1 - f_5 = 0$ and $f_2 - f_4 = 0$.
Thus equations (1) to (5) further reduce to
$$3756 f_1 + 1950 f_2 + 500 f_3 = 0, \qquad (6)$$
$$12 f_1 + 30 f_2 + 20 f_3 = 64, \qquad (7)$$
$$51 f_1 + 75 f_2 + 40 f_3 = 0. \qquad (8)$$
Eliminating $f_3$ first from (6), (7) and then from (7), (8) gives
$$3456 f_1 + 1200 f_2 = -1600,$$
$$27 f_1 + 15 f_2 = -128,$$
and so
$$1296 f_1 = 8640 \;\Rightarrow\; f_1 = f_5 = 20/3.$$
Substituting back, we obtain $f_2 = f_4 = -308/15$ and $f_3 = 30$, as required.
P3 Since $f(\tfrac{1}{2}) = 0$, the error in question is
$$(B_n f)(\tfrac{1}{2}) - f(\tfrac{1}{2}) = \frac{1}{2^n} \sum_{k=0}^{n} \binom{n}{k} \left|\frac{k}{n} - \frac{1}{2}\right|.$$
Now, if n is even, then
$$\sum_{k=0}^{n} \binom{n}{k} \left|\frac{k}{n} - \frac{1}{2}\right| = 2 \sum_{k=0}^{n/2} \binom{n}{k} \left(\frac{1}{2} - \frac{k}{n}\right)
= \sum_{k=0}^{n/2} \binom{n}{k} - 2\sum_{k=1}^{n/2} \binom{n-1}{k-1}
= \sum_{k=0}^{n/2} \binom{n}{k} - 2\sum_{k=0}^{n/2-1} \binom{n-1}{k}.$$
Since n is even, n - 1 is odd and so
$$\sum_{k=0}^{n/2} \binom{n}{k} = \frac{1}{2}\left(\sum_{k=0}^{n} \binom{n}{k} + \binom{n}{n/2}\right) = 2^{n-1} + \frac{1}{2}\binom{n}{n/2}$$
and
$$2\sum_{k=0}^{n/2-1} \binom{n-1}{k} = \sum_{k=0}^{n-1} \binom{n-1}{k} = 2^{n-1},$$
by the Binomial Theorem. Hence
$$(B_n f)(\tfrac{1}{2}) - f(\tfrac{1}{2}) = \frac{1}{2^{n+1}} \binom{n}{n/2}.$$
To proceed further we need to use Stirling's formula,
$$n! \sim \sqrt{2\pi n}\,(n/e)^n \quad \text{as } n \to \infty,$$
which gives
$$\binom{n}{n/2} \sim \frac{\sqrt{2\pi n}\,(n/e)^n}{\bigl(\sqrt{\pi n}\,(n/2e)^{n/2}\bigr)^2} = \frac{2^{n+1}}{\sqrt{2\pi n}} \quad \text{as } n \to \infty.$$
Hence
$$(B_n f)(\tfrac{1}{2}) - f(\tfrac{1}{2}) \sim \frac{1}{\sqrt{2\pi n}} \quad \text{as } n \to \infty,$$
as required.
If n is odd, then a similar argument gives
$$(B_n f)(\tfrac{1}{2}) - f(\tfrac{1}{2}) = \frac{1}{2^n} \binom{n-1}{\tfrac{1}{2}(n-1)} \sim \frac{1}{\sqrt{2\pi(n-1)}},$$
again using Stirling's formula.

P4 Each of the functions
$$\phi_{nk}(x) = \binom{n}{k} x^k (1-x)^{n-k}, \quad 0 \le x \le 1,$$
is positive for 0 < x < 1 and vanishes for x = 0, 1 (unless k = 0 or k = n,
respectively). Also, for 0 < k < n,
$$\phi_{nk}'(x) = \binom{n}{k}\bigl(kx^{k-1}(1-x)^{n-k} - x^k(n-k)(1-x)^{n-k-1}\bigr)
= \binom{n}{k} x^{k-1}(1-x)^{n-k-1}\bigl(k(1-x) - (n-k)x\bigr)
= \binom{n}{k} x^{k-1}(1-x)^{n-k-1}(k - nx),$$
so that $\phi_{nk}$ has a unique turning point in (0, 1), namely, a maximum for
$k - nx = 0$, that is, $x = k/n$. This maximum value is
$$\phi_{nk}(k/n) = \binom{n}{k}\left(\frac{k}{n}\right)^k \left(1 - \frac{k}{n}\right)^{n-k}.$$
If we keep k fixed while $n \to \infty$, then by Stirling's formula,
$$\phi_{nk}(k/n) \sim \frac{\sqrt{2\pi n}\,(n/e)^n}{k!\,\sqrt{2\pi(n-k)}\,((n-k)/e)^{n-k}}\left(\frac{k}{n}\right)^k \left(\frac{n-k}{n}\right)^{n-k}
= \sqrt{\frac{n}{n-k}}\;\frac{k^k}{k!\,e^k} \sim \frac{k^k}{k!\,e^k} \quad \text{as } n \to \infty.$$
Note, however, that $k/n \to 0$ as $n \to \infty$, so in this case the peak of height
$e^{-k}k^k/k!$ moves towards the y-axis.
On the other hand, if $\xi = k/n$ remains fixed while n (and hence k) tends to
infinity, then the width of the peak becomes narrower. Indeed, if $\eta \neq \xi$, then
$$\frac{\phi_{nk}(\eta)}{\phi_{nk}(\xi)} = \frac{\eta^k (1-\eta)^{n-k}}{\xi^k (1-\xi)^{n-k}} = \alpha,$$
say, where $0 < \alpha < 1$, because ξ is the maximum of $\phi_{nk}$. Now consider the
sequences $n_p = pn$ and $k_p = pk$, where p is a positive integer. Then
$$\frac{\phi_{n_p k_p}(\eta)}{\phi_{n_p k_p}(\xi)} = \frac{\eta^{pk} (1-\eta)^{pn-pk}}{\xi^{pk} (1-\xi)^{pn-pk}} = \alpha^p.$$
Thus
$$\lim_{p \to \infty} \frac{\phi_{n_p k_p}(\eta)}{\phi_{n_p k_p}(\xi)} = 0,$$
and so the width of the peak at ξ = k/n must tend to 0 as $p \to \infty$. The
height of this peak is in fact
$$\phi_{n_p k_p}(\xi) = \binom{pn}{pk}\,\xi^{pk}(1-\xi)^{pn-pk} \sim \frac{1}{\sqrt{2\pi p k (1-\xi)}} \quad \text{as } p \to \infty,$$
by a further application of Stirling's formula.
These properties of the graphs $y = \phi_{nk}(x)$ are illustrated below in various
special cases.

[Figure: the graphs of $\phi_{31}$, $\phi_{61}$, $\phi_{91}$, $\phi_{62}$ and $\phi_{93}$ on [0, 1], with peaks at $x = 1/9, 1/6, 1/3$ and heights up to about 0.5.]

P5 In the proof of Theorem 6.3, we found that if $q(x) = a + bx + cx^2$, then
$$B_n q - q = B_n p - p,$$
where $p(x) = cx^2$, and that
$$\|B_n q - q\|_\infty = \|B_n p - p\|_\infty = \frac{c}{4n} = \frac{1}{8n}\|q''\|_\infty.$$
The technique used in Theorem 6.2 was to approximate f from above and
below at a point ξ by quadratic functions, and then use:
(a) the convergence of $L_n q$ to q, if q is quadratic;
(b) the fact that the operators $L_n$ are monotone.
In this problem we replace (a) by the explicit estimate for $\|B_n q - q\|_\infty$, given
above. We wish, therefore, to approximate a given function $f \in C^{(2)}[0, 1]$
above and below by quadratics, the approximation being particularly good
near ξ. By Taylor's Theorem,
$$f(x) = f(\xi) + f'(\xi)(x - \xi) + \tfrac{1}{2}f''(\eta)(x - \xi)^2,$$
where η lies between x and ξ. Thus, if
$$q_u(x) = f(\xi) + f'(\xi)(x - \xi) + \tfrac{1}{2}\|f''\|_\infty(x - \xi)^2,$$
$$q_\ell(x) = f(\xi) + f'(\xi)(x - \xi) - \tfrac{1}{2}\|f''\|_\infty(x - \xi)^2,$$
then
$$q_\ell(x) \le f(x) \le q_u(x), \quad x \in [0, 1],$$
so that
$$(B_n q_\ell)(x) \le (B_n f)(x) \le (B_n q_u)(x), \quad x \in [0, 1],$$
since $B_n$ is a monotone operator.
Now $q_u$ and $q_\ell$ are quadratic functions, so that
$$|(B_n q_u)(\xi) - q_u(\xi)| \le \frac{1}{8n}\|q_u''\|_\infty
\quad \text{and} \quad
|(B_n q_\ell)(\xi) - q_\ell(\xi)| \le \frac{1}{8n}\|q_\ell''\|_\infty.$$
Furthermore, by the definitions of $q_\ell$ and $q_u$, $q_\ell(\xi) = f(\xi) = q_u(\xi)$ and
$\|q_\ell''\|_\infty = \|f''\|_\infty = \|q_u''\|_\infty$, so that
$$|(B_n f)(\xi) - f(\xi)| \le \frac{1}{8n}\|f''\|_\infty.$$
Since the expression on the right of this inequality is independent of ξ, we
deduce that
$$\|B_n f - f\|_\infty \le \frac{1}{8n}\|f''\|_\infty,$$
as required.

Chapter 7 The theory of minimax
approximation
In Chapter 7 we consider the problem of approximating a given function
f ∈ C[a, b] by polynomials of fixed degree n in the ∞-norm. The polynomial
which best approximates f in this respect can be characterised rather elegantly
and is in fact unique. The theory can be extended to other linear spaces of
approximating functions which satisfy a criterion known as the ‘Haar condition’.
For conciseness we shall use the abbreviation b.m.a. for ‘best minimax
approximation’.
This chapter splits into TWO study sessions:
Study session 1: Sections 7.1 and 7.2.
Study session 2: Sections 7.3 and 7.4.

Study Session 1: The extreme values of the error function

Read Sections 7.1 and 7.2

Commentary
1. The parameter θ in (7.2) may appear superfluous at first sight, but its rôle
becomes clear in Section 7.2.

2. The following diagram may clarify the final paragraph of Section 7.1.

[Figure: the graphs of f + g and the line $p_3^*$.]

The function $p_3^*$ is the b.m.a. from $P_1$ to both f and to f + g, but the b.m.a.
from $P_1$ to g is not the zero function.

3. The letter used in Section 7.2 to denote the set where the extreme values of
the error function occur is a script Z (Z) with subscript M . Here one must
interpret ‘extreme values of e∗ ’ to mean ‘maximum values of |e∗ |’.

4. The result in the first paragraph of Section 7.2 can be summarised as follows:
if $p^*$ is not a b.m.a. from A to f, then there exists p in A such that
$$\mathrm{sgn}(e^*(x)) = \mathrm{sgn}(p(x)), \quad x \in Z_M,$$
that is,
$$e^*(x)\,p(x) > 0, \quad x \in Z_M.$$
The converse result is:
if $p^*$ is in A, $e^* = f - p^*$ and there exists p in A such that
$$e^*(x)\,p(x) > 0, \quad x \in Z_M,$$
then there exists θ > 0 such that
$$\|f - (p^* + \theta p)\|_\infty < \|f - p^*\|_\infty,$$
so that $p^*$ is not a b.m.a. from A to f.
This converse result is the special case of Theorem 7.1 in which Z = [a, b].

5. The proof of Theorem 7.1 is quite subtle. At a first reading you would do
well to assume that Z = [a, b]. The following diagram (based on the third
part of Figure 7.1) may clarify the rôles played by p, $Z_M$, $Z_0$ and d.

[Figure: the graphs of f, $p^*$ and p, together with the error $e^*$, the sets $Z_M$ and $Z_0$, and the level d.]

Self-assessment questions
S1 Sketch a diagram like the above, which corresponds to the second part of
Figure 7.1.
S2 Can the constant $\tfrac{1}{2}$ in (7.13) be replaced by 1?

Study Session 2: Characterising best minimax approximations

Read Sections 7.3 and 7.4

Commentary
1. The relationship between conditions (1), (2), (3) and (4) is examined in
detail in Appendix A, which will not be assessed. Note, however, that the
equivalence of (1) and (4) is straightforward. Indeed, if $\{\phi_i : i = 0, 1, \dots, n\}$
is a basis of A, then:
(a) the function $f = \sum_{i=0}^{n} \lambda_i \phi_i$ in A is identically zero if and only if
$\lambda = (\lambda_0, \lambda_1, \dots, \lambda_n) = (0, \dots, 0)$;
(b) the function $f = \sum_{i=0}^{n} \lambda_i \phi_i$ in A has zeros at $\xi_j$, $j = 0, 1, \dots, n$, if and only if
$$\sum_{i=0}^{n} \lambda_i \phi_i(\xi_j) = 0, \quad j = 0, 1, \dots, n,$$
that is,
$$P\lambda = 0, \qquad (*)$$
where P is the matrix with entries $\phi_i(\xi_j)$;
(c) equation (∗) has the unique solution λ = 0 if and only if P is
non-singular.
To verify that (4) holds for a given space A we need to check that the matrix
P is non-singular for every set $\{\xi_j : j = 0, 1, \dots, n\}$ of distinct points in [a, b],
where $\{\phi_i : i = 0, 1, \dots, n\}$ is some basis of A. (See SAQ S4 and Powell
Exercise 7.8.)

2. We remark that Haar condition (2) is in fact equivalent to Haar conditions


(1), (3) and (4), contrary to the assertion at the end of Powell Exercise 7.4.
This result came to light during the preparation of these notes, after
Professor M. Stynes (University College, Cork) had pointed out that the
space A in Powell Exercise 7.4 does not in fact provide a counterexample to
this equivalence.
 
3. Points $\{\xi_0^*, \dots, \xi_{n+1}^*\}$ which satisfy (7.17), (7.18) and (7.19) are often called
an alternating set (of length n + 2) for the error function $f - p^*$.
4. The key observation in Theorem 7.3 is that the function
$p^*(x) = x^n - T_n(x)/2^{n-1}$ has the following properties:
(a) $p^* \in P_{n-1}$;
(b) if $f(x) = x^n$, then $f(x) - p^*(x) = T_n(x)/2^{n-1}$ has an alternating set of
length n + 1 in [-1, 1].
Thus $p^*$ is the b.m.a. from $P_{n-1}$ to f on [-1, 1].
Similarly, if $f(x) = c_0 + c_1 x + \cdots + c_n x^n$, then $p^*(x) = f(x) - c_n T_n(x)/2^{n-1}$
is the b.m.a. from $P_{n-1}$ to f on [-1, 1]. Furthermore, the b.m.a. from $P_{n-1}$
to f on an arbitrary closed interval [a, b] can be found by applying the above
technique to the polynomial $f(\phi(x))$, where φ is the linear map from [-1, 1]
onto [a, b] (see Powell Exercise 7.7).
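This construction is easy to carry out numerically; here is a minimal sketch (ours) using NumPy's polynomial classes to recover the b.m.a. from $P_2$ to $x^3$ on [-1, 1]:

```python
import numpy as np
from numpy.polynomial import Chebyshev, Polynomial

n = 3
# T_n has leading coefficient 2^(n-1), so x^n - T_n(x)/2^(n-1) lies in P_(n-1),
# and the error T_n(x)/2^(n-1) equioscillates at n + 1 points of [-1, 1].
Tn = Chebyshev.basis(n).convert(kind=Polynomial)
p_star = Polynomial.basis(n) - Tn / 2**(n - 1)
print(p_star.coef)   # [0.   0.75] -> p*(x) = 3x/4; the minimax error is 1/4
```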

5. In the sentence following the proof of Theorem 7.3, ‘C[a, b]’ should be ‘[a, b]’.

6. Theorem 7.4, and the discussion following it, indicate how to find the b.m.a.
from Pn to the discrete data {(ξi , f (ξi )) : i = 0, 1, . . . , n}. The equations
(7.27) are fundamental to the exchange algorithm, which is discussed in
Chapter 8. The matrix of the system (7.27) is non-singular because if a linear
mapping from Rn+1 to Rn+1 is onto, then it is also one–one.

7. Theorem 7.6 can be proved without the rather awkward Theorem 7.5 and
condition (3), using the more direct method of Powell Exercise 7.6. However,
Theorem 7.5 is needed for Theorem 7.7 (see also page 113).

8. Theorem 7.7 is quite hard to grasp fully at a first reading, but it is an


essential part of the exchange algorithm. The best way to understand the
inequalities (7.32) is to attempt Powell Exercise 7.7.

Self-assessment questions
S3 Explain why Pn satisfies Haar conditions (1) and (2).

S4 Determine whether the following linear spaces satisfy the Haar condition:
(a) the space A spanned by φ0 (x) = 1, φ1 (x) = cos x on [0, π];
(b) the space A spanned by φ0 (x) = 1, φ1 (x) = cos x on [π/2, 3π/2].

S5 Verify that p∗ (x) = x2 + 1/8 is the b.m.a. from P3 to f (x) = |x| on [−1, 1]
(cf. Chapter 3, SAQ S3).

S6 Determine the b.m.a. from P1 to f (x) = sin(πx/2) on [0, 1].

S7 Powell Exercise 7.5

Problems for Chapter 7


P1 Powell Exercise 7.2

P2 Powell Exercise 7.3

P3 Powell Exercise 7.6

P4 Powell Exercise 7.7

P5 Powell Exercise 7.8

Solutions to SAQs in Chapter 7


S1 The diagram is as follows.

[Figure: the graphs of $p^*$ and the error $e^*$, with the sets $Z_M$ and $Z_0$ marked, corresponding to the second part of Figure 7.1.]

S2 No, because if θ = max{|e∗ (x)| : x ∈ Z} − d, then we must have


d + θ = max{|e∗ (x)| : x ∈ Z} and (7.15) would not yield the required strict
inequality in (7.8).

S3 $P_n$ satisfies condition (1) because a polynomial of degree n can have at most
n zeros. $P_n$ satisfies condition (2) because the function
$$p(x) = \prod_{j=1}^{k} (x - \zeta_j)$$
has degree k ≤ n and changes sign precisely at the points $\zeta_j$. Moreover, the
function p(x) = 1 lies in $P_n$ and has no zeros in [a, b].

S4 (a) To verify Haar condition (4) we need to show that, for distinct $\xi_1, \xi_2$ in
[0, π], the matrix
$$\begin{pmatrix} 1 & \cos\xi_1 \\ 1 & \cos\xi_2 \end{pmatrix}$$
is non-singular, that is, $\cos\xi_1 \neq \cos\xi_2$. But this is evident, since cos is
one-one on [0, π], because it is strictly decreasing on [0, π].
(b) Condition (4) fails in this case because the above matrix is singular if, for
example, $\xi_0 = \pi/2$, $\xi_1 = 3\pi/2$, so $\cos\xi_0 = 0 = \cos\xi_1$. Also, consideration
of $\phi_1(x) = \cos x$ at $\xi_0$, $\xi_1$ shows that condition (1) is false.
 
S5 By Theorem 7.2, it is sufficient to note that $\{-1, -\tfrac{1}{2}, 0, \tfrac{1}{2}, 1\}$ is an alternating
set of length 5 (= 3 + 2) for $f(x) - p^*(x)$ with h = -1/8. Note that $p^*$ is also
a b.m.a. from $P_2$ to f (suitable alternating sets are either $\{-1, -\tfrac{1}{2}, 0, \tfrac{1}{2}\}$ or
$\{-\tfrac{1}{2}, 0, \tfrac{1}{2}, 1\}$).

S6 By Theorem 7.2, the error function $e = f - p^*$ must have an alternating set of
length 3. Since f is concave, $p^*(x) = a + bx$ must look as follows.

[Figure: the graphs of $f(x) = \sin(\pi x/2)$ and the line $p^*$ on [0, 1], with the interior extreme point α marked.]

Thus we have
$$f(0) - p^*(0) = \sin 0 - a = h, \qquad (1)$$
$$f(\alpha) - p^*(\alpha) = \sin(\pi\alpha/2) - a - b\alpha = -h, \qquad (2)$$
$$f(1) - p^*(1) = \sin(\pi/2) - a - b = h, \qquad (3)$$
where α is a solution of
$$e'(x) = \frac{\pi}{2}\cos\frac{\pi x}{2} - b = 0.$$
Since (1) and (3) imply that b = 1, we deduce that
$$\alpha = \frac{2}{\pi}\cos^{-1}\frac{2}{\pi} = 0.560\,664\,18.$$
Thus, from (1) and (2),
$$2h = \alpha - \sin(\pi\alpha/2) = -0.210\,513\,662 \;\Rightarrow\; h = -0.105\,256\,831.$$
Since a = -h, we deduce that $p^*(x) = 0.105 + x$.

S7 By Theorem 7.2, the error function $e = f - p^*$ must have an alternating set of
length 4. The required quadratic $p^*$ must surely, therefore, be of the following
form.

[Figure: the graphs of f and the quadratic $p^*$ on [-1, 1], with the extreme points $-1, -\tfrac{1}{2}, \alpha, 1$ marked.]

If $p^*(x) = a + bx + cx^2$, then for $\{-1, -\tfrac{1}{2}, \alpha, 1\}$ to be an alternating set, we
want
$$f(-1) - p^*(-1) = \tfrac{1}{2} - (a - b + c) = h, \qquad (4)$$
$$f(-\tfrac{1}{2}) - p^*(-\tfrac{1}{2}) = 0 - (a - \tfrac{1}{2}b + \tfrac{1}{4}c) = -h, \qquad (5)$$
$$f(\alpha) - p^*(\alpha) = \alpha + \tfrac{1}{2} - (a + \alpha b + \alpha^2 c) = h, \qquad (6)$$
$$f(1) - p^*(1) = \tfrac{3}{2} - (a + b + c) = -h. \qquad (7)$$
Here α is a solution of
$$e'(x) = 1 - b - 2cx = 0 \;\Rightarrow\; \alpha = (1 - b)/2c.$$
Equations (5) and (7) imply that $3/2 - 3b/2 - 3c/4 = 0$, and hence that
α = 1/4. Equations (4) and (7) imply that a + c = 1 and also that
2b = 1 + 2h. Substituting in (5), we find that a = 2h, and then in (6), that
$$h = 9/50, \quad a = 9/25, \quad b = 17/25, \quad c = 16/25.$$
Hence
$$p^*(x) = \tfrac{1}{25}\bigl(9 + 17x + 16x^2\bigr).$$

Solutions to Problems in Chapter 7


P1 According to Theorem 7.1, for $p^*$ to be a b.m.a. from A to f there must be no
p in A such that
$$\bigl(f(\xi_j) - p^*(\xi_j)\bigr)\,p(\xi_j) > 0, \quad j = 1, 2, \dots, r.$$
This implies, for instance, that there is no p in A such that
$$p(\xi_j) = f(\xi_j) - p^*(\xi_j), \quad j = 1, 2, \dots, r,$$
and hence that the linear equations
$$\sum_{i=0}^{n} \lambda_i \phi_i(\xi_j) = f(\xi_j) - p^*(\xi_j), \quad j = 1, 2, \dots, r,$$
have no solutions $\lambda_0, \lambda_1, \dots, \lambda_n$. But this means that the matrix H with
entries $\phi_i(\xi_j)$ fails to have full rank r.

P2 The following special case may help to illuminate this rather slippery problem.
Suppose that all functions in A vanish at a particular point ξ1 . If it turns out
that ξ1 belongs to the set ZM for some approximation p∗ to f , then p∗ is a
b.m.a. to f , since we can obtain no better approximation at the point ξ1 .
In general, the condition
$$\sum_{j=1}^{r} \sigma_j \phi(\xi_j) = 0, \quad \phi \in A, \qquad (8)$$
gives a linear dependence among the values taken at the $\xi_j$ by any member φ
of A. The condition
$$\sigma_j e^*(\xi_j) \ge 0, \quad j = 1, 2, \dots, r, \qquad (9)$$
where $e^* = f - p^*$, implies that $\sigma_j$ and $e^*(\xi_j)$, which are both non-zero, have
the same sign. Thus if φ is any member of A, then
$$e^*(\xi_j)\,\phi(\xi_j) \le 0,$$
for at least one of the j, since otherwise (8) and (9) lead to a contradiction.
Since $\xi_j \in Z_M$ we deduce, by Theorem 7.1, that $p^*$ is a b.m.a. from A to f.

P3 If $q^*$ and $r^*$ are b.m.a.s from A to f, then so is $p^* = \tfrac{1}{2}(q^* + r^*)$, by Theorem
2.2. Thus there is an alternating set $\{\xi_i : i = 0, 1, \dots, n+1\}$ for $f - p^*$. Since,
for each $i = 0, 1, \dots, n+1$,
$$\|f - p^*\|_\infty = |f(\xi_i) - p^*(\xi_i)|
= \bigl|\tfrac{1}{2}(f(\xi_i) - q^*(\xi_i)) + \tfrac{1}{2}(f(\xi_i) - r^*(\xi_i))\bigr|
\le \tfrac{1}{2}\bigl(\|f - q^*\|_\infty + \|f - r^*\|_\infty\bigr)
= \|f - p^*\|_\infty,$$
we deduce that
$$f(\xi_i) - q^*(\xi_i) = f(\xi_i) - r^*(\xi_i), \quad i = 0, 1, \dots, n+1.$$
Thus $(q^* - r^*)(\xi_i) = 0$, for $i = 0, 1, \dots, n+1$, and so $q^* = r^*$ by Haar
condition (1).

P4 According to Theorem 7.4, we need to solve the linear equations (7.27). If
$f(x) = x^3$ and $p(x) = a + bx + cx^2$, these equations are
$$f(0) - p^*(0) = -a = h, \qquad (10)$$
$$f(0.3) - p^*(0.3) = 0.027 - a - 0.3b - 0.09c = -h, \qquad (11)$$
$$f(0.8) - p^*(0.8) = 0.512 - a - 0.8b - 0.64c = h, \qquad (12)$$
$$f(1) - p^*(1) = 1 - a - b - c = -h. \qquad (13)$$
Considering (12) - (10) and (13) - (11) gives
$$0.512 - 0.8b - 0.64c = 0,$$
$$0.973 - 0.7b - 0.91c = 0,$$
and we can now eliminate b to give
$$4.2 - 2.8c = 0 \;\Rightarrow\; c = 1.5.$$
It quickly follows that b = -0.56, a = 0.03 and h = -0.03. In particular, the
first line of (7.32) is equal to 0.03.
To determine the final line of (7.32), we need to find the extreme values of
$$f(x) - p^*(x) = x^3 - (0.03 - 0.56x + 1.5x^2),$$
which occur when
$$3x^2 - 3x + 0.56 = 0 \;\Rightarrow\; x = 0.248\,34,\ 0.751\,67.$$
Evaluating $f - p^*$ at these points gives $\|f - p^*\|_\infty = 0.031\,877$. Thus, we
deduce that
$$0.03 \le \min_{p \in A} \|f - p\|_\infty \le 0.031\,877.$$

In fact, we can determine $\min_{p \in A} \|f - p\|_\infty$ by considering $f(\phi(x))$, where
$\phi(x) = \tfrac{1}{2}(1 + x)$ maps [-1, 1] onto [0, 1]. The b.m.a. from $P_2$ to
$$f(\phi(x)) = \tfrac{1}{8} + \tfrac{3}{8}x + \tfrac{3}{8}x^2 + \tfrac{1}{8}x^3$$
on [-1, 1] is (see commentary on Theorem 7.3) given by
$$f(\phi(x)) - \tfrac{1}{8} \cdot \tfrac{1}{4}T_3(x) = \tfrac{1}{8} + \tfrac{3}{8}x + \tfrac{3}{8}x^2 + \tfrac{1}{8}x^3 - \tfrac{1}{32}\bigl(4x^3 - 3x\bigr)
= \tfrac{1}{8} + \tfrac{15}{32}x + \tfrac{3}{8}x^2.$$
Hence the b.m.a. from $P_2$ to f on [0, 1] is
$$p(x) = \tfrac{1}{8} + \tfrac{15}{32}\phi^{-1}(x) + \tfrac{3}{8}\phi^{-1}(x)^2
= \tfrac{1}{8} + \tfrac{15}{32}(2x - 1) + \tfrac{3}{8}(2x - 1)^2
= \tfrac{1}{32} - \tfrac{9}{16}x + \tfrac{3}{2}x^2.$$
Since the extreme values of $f - p$ occur at the points of [0, 1] which
correspond to the extreme values of $4x^3 - 3x$ on [-1, 1], namely at 0, 0.25,
0.75 and 1, we find that
$$\min_{p \in A} \|f - p\|_\infty = \tfrac{1}{32} = 0.031\,25.$$
Note that 0.031 25 does indeed lie in (0.03, 0.031 877).

P5 The space A that is spanned by
$$\phi_0(x) = 1, \quad \phi_1(x) = \cos 2x, \quad \phi_2(x) = \sin 3x,$$
on $[-\pi/6, \pi/2]$ satisfies Haar condition (4) if the matrix
$$\begin{pmatrix} 1 & \cos 2\xi_0 & \sin 3\xi_0 \\ 1 & \cos 2\xi_1 & \sin 3\xi_1 \\ 1 & \cos 2\xi_2 & \sin 3\xi_2 \end{pmatrix} \qquad (14)$$
is non-singular for all distinct $\xi_0, \xi_1, \xi_2$ in $[-\pi/6, \pi/2]$. To prove this, we must
either show that the determinant
$$\begin{vmatrix} 1 & \cos 2\xi_0 & \sin 3\xi_0 \\ 1 & \cos 2\xi_1 & \sin 3\xi_1 \\ 1 & \cos 2\xi_2 & \sin 3\xi_2 \end{vmatrix}$$
is non-zero for such $\xi_0, \xi_1, \xi_2$, or show that if α, β, γ are not all zero, then the
equation
$$f(x) = \alpha + \beta\cos 2x + \gamma\sin 3x = 0 \qquad (15)$$
cannot have 3 distinct solutions in $[-\pi/6, \pi/2]$ (since this shows that the
columns of the above matrix are linearly independent).
To prove this fact about the above determinant, we would need to express the
determinant as a product with fairly simple factors, each of which can be
shown to be non-zero. This seems rather difficult in this case.
To prove that an equation such as (15) has at most 2 distinct solutions in
$[-\pi/6, \pi/2]$, it is sufficient (by Rolle's theorem) to prove that the equation
$f'(x) = 0$ has at most one solution in $[-\pi/6, \pi/2]$. This approach is often
effective, but it does not work in this case, since
$$f'(x) = -2\beta\sin 2x + 3\gamma\cos 3x = 0$$
can have 3 solutions (take β = 0, γ = 1, $x = -\pi/6, \pi/6, \pi/2$).
We can convert this into a problem about polynomials by using the identities
$$\cos 2x = 1 - 2\sin^2 x, \qquad \sin 3x = 3\sin x - 4\sin^3 x.$$
Since the sine function maps $[-\pi/6, \pi/2]$ one-one onto $[-\tfrac{1}{2}, 1]$, it is equivalent
to prove that the matrix
$$\begin{pmatrix} 1 & 1 - 2t_0^2 & 3t_0 - 4t_0^3 \\ 1 & 1 - 2t_1^2 & 3t_1 - 4t_1^3 \\ 1 & 1 - 2t_2^2 & 3t_2 - 4t_2^3 \end{pmatrix} \qquad (16)$$
is non-singular for all distinct $t_0, t_1, t_2$ in $[-\tfrac{1}{2}, 1]$. Here $t_i = \sin\xi_i$, i = 0, 1, 2.
In this form, we can evaluate the corresponding determinant and (with some
effort) find a fairly simple factorisation:
$$\begin{vmatrix} 1 & 1 - 2t_0^2 & 3t_0 - 4t_0^3 \\ 1 & 1 - 2t_1^2 & 3t_1 - 4t_1^3 \\ 1 & 1 - 2t_2^2 & 3t_2 - 4t_2^3 \end{vmatrix}
= (t_0 - t_1)(t_1 - t_2)(t_2 - t_0)\bigl(3 + 4(t_0 t_1 + t_1 t_2 + t_2 t_0)\bigr).$$
The first three factors of this product are non-zero (since $t_0, t_1, t_2$ are
distinct) but it remains to show that
$$3 + 4(t_0 t_1 + t_1 t_2 + t_2 t_0) \neq 0,$$
for distinct $t_0, t_1, t_2$ in $[-\tfrac{1}{2}, 1]$. This can be done, but it is rather tricky, and
we omit the details.
Instead, we show that if α, β, γ are not all zero, then the cubic equation
$$p(t) = \alpha + \beta(1 - 2t^2) + \gamma(3t - 4t^3) = 0$$
cannot have 3 distinct roots in $[-\tfrac{1}{2}, 1]$, which shows that the matrix (16) is
non-singular and hence so is the matrix (14). Once again, an approach via
Rolle's theorem turns out to be unsuccessful, so we try the following approach
which uses knowledge about the overall shape of a cubic graph.
The equation can certainly have at most 2 roots if γ = 0. We can then
assume, for example, that γ < 0. Since the coefficient of $t^3$ is then positive (so
$p(t) \to \pm\infty$ as $t \to \pm\infty$), there can be 3 distinct roots in [-1/2, 1] only if
$$p(1) = \alpha - \beta - \gamma \ge 0, \qquad (17)$$
$$p'(1) = -4\beta + 3\gamma - 12\gamma > 0, \qquad (18)$$
$$p(-\tfrac{1}{2}) = \alpha + \tfrac{1}{2}\beta - \gamma \le 0, \qquad (19)$$
$$p'(-\tfrac{1}{2}) = 2\beta + 3\gamma - 3\gamma > 0. \qquad (20)$$
According to (20), we have β > 0 and so (17) and (19) yield the contradictory
inequalities α > γ and α < γ, respectively. Hence the above matrices are
indeed non-singular and A does satisfy Haar condition (4).
To prove the final part note that if
$$\phi(x) = \alpha + \beta\cos 2x + \gamma\sin 3x$$
vanishes at $x = -\pi/6$, then
$$\alpha + \tfrac{1}{2}\beta - \gamma = 0. \qquad (21)$$
Now
$$\phi'(-\pi/6) = -2\beta\sin(-\pi/3) + 3\gamma\cos(-\pi/2) = \sqrt{3}\,\beta,$$
and, by equation (21),
$$\phi(\pi/2) = \alpha - \beta - \gamma = -3\beta/2.$$
If β = 0, then $\phi(\pi/2) = \phi'(-\pi/6) = 0$. On the other hand, if β > 0, then
$\phi'(-\pi/6) > 0$ and $\phi(\pi/2) < 0$, so φ has a zero in $[-\pi/6, \pi/2]$. A similar
argument applies if β < 0 and so each function φ in A which vanishes at $-\pi/6$
also vanishes at some other point of $[-\pi/6, \pi/2]$.
Remark The final part could also have been proved by considering the
cubic function p defined earlier.

Chapter 8 The exchange algorithm
This chapter contains a detailed account of the exchange algorithm, which is an
iteration process for determining the b.m.a. from a finite-dimensional subspace A
of C[a, b] to a function f ∈ C[a, b]. The space A must satisfy the Haar condition,
since the algorithm is based on the theory developed in Chapter 7.
The exchange algorithm is analysed in Chapter 9, which will not be assessed. Two
proofs are given there that the algorithm converges. The first, in Sections 9.1 and
9.2, is fairly straightforward, but does not give an estimate for the rate of
convergence of the algorithm. The second proof, in Sections 9.3 and 9.4, is very
involved, but it serves to show that the algorithm converges remarkably quickly.
This chapter splits into TWO study sessions:
Study session 1: Sections 8.1, 8.2 and 8.3.
Study session 2: Sections 8.4 and 8.5.

Study Session 1: Using the exchange algorithm

Read Sections 8.1, 8.2 and 8.3

Commentary
1. Although Powell does not mention it in the text, the version of the exchange
algorithm in which all points of the reference are changed at each iteration is
often called the Remes algorithm (see page 338).

2. The choice of the point $\xi_q$ to be replaced (see page 88) is easy if
$\xi_0 < \eta < \xi_{n+1}$. If $\eta < \xi_0$, however, there are two possibilities, illustrated
below.

[Figure: two sketches of the error function on a reference $\xi_0, \dots, \xi_4$ with $\eta < \xi_0$, and the levels ±|h| marked.]

On the left, e(η) has the same sign as $e(\xi_0)$, so $\xi_0$ leaves the reference; on the
right, e(η) has the opposite sign to $e(\xi_0)$, so $\xi_4$ leaves the reference.

3. The following summary of the one-point exchange algorithm may prove
helpful; a short computational sketch follows the steps below. Recall that we
are given a function f in C[a, b] and an (n + 1)-dimensional subspace A of
C[a, b], which satisfies the Haar condition, and that we are trying to find the
function $p^*$ in A such that
$$\|f - p^*\|_\infty = \min_{p \in A} \|f - p\|_\infty.$$

Step 1 Choose an initial reference: $a \le \xi_0 < \xi_1 < \dots < \xi_{n+1} \le b$.
Step 2 Determine $p \in A$ and $h \in \mathbb{R}$, such that
$$f(\xi_i) - p(\xi_i) = (-1)^i h, \quad i = 0, 1, \dots, n+1.$$
Thus, by Theorem 7.4,
$$|h| = \min_{p \in A}\; \max_{i=0,1,\dots,n+1} |f(\xi_i) - p(\xi_i)|.$$
Step 3 Determine $\eta \in [a, b]$, such that
$$|f(\eta) - p(\eta)| = \|f - p\|_\infty.$$
Step 4 By Theorem 7.7,
$$|h| \le \|f - p^*\|_\infty \le |f(\eta) - p(\eta)|,$$
so stop if
$$\delta = |f(\eta) - p(\eta)| - |h|$$
is small enough. Otherwise, continue to Step 5.
Step 5 Choose a new reference: $a \le \xi_0^+ < \xi_1^+ < \dots < \xi_{n+1}^+ \le b$, replacing
one $\xi_q$ by η, in such a way that the numbers
$$f(\xi_i^+) - p(\xi_i^+), \quad i = 0, 1, \dots, n+1,$$
have alternating signs. Then return to Step 2.
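The following minimal sketch (our own illustration for the polynomial case $A = P_n$, not Powell's code) implements Steps 1-5, locating η on a fine grid as Powell suggests; its first iteration reproduces the numbers of the worked example below.

```python
import numpy as np

def one_point_exchange(f, n, a, b, tol=1e-10, grid=4097):
    # Step 1: initial reference -- extreme points of a Chebyshev polynomial
    # mapped to [a, b], as recommended by Theorem 8.1 (Section 8.4).
    i = np.arange(n + 2)
    xi = 0.5 * (a + b) + 0.5 * (b - a) * np.cos(np.pi * i / (n + 1))[::-1]
    xs = np.linspace(a, b, grid)
    while True:
        # Step 2: solve for p in P_n and h with f(xi_i) - p(xi_i) = (-1)^i h.
        A = np.hstack([np.vander(xi, n + 1, increasing=True),
                       ((-1.0) ** i)[:, None]])
        sol = np.linalg.solve(A, f(xi))
        c, h = sol[:-1], sol[-1]
        # Step 3: locate eta where |f - p| is largest (on a fine grid).
        e = f(xs) - np.polyval(c[::-1], xs)
        q = np.argmax(np.abs(e))
        eta, e_eta = xs[q], e[q]
        # Step 4: stop when the levelled reference error |h| matches ||f - p||.
        if abs(e_eta) - abs(h) <= tol:
            return c, abs(h)
        # Step 5: exchange one reference point for eta, keeping the signs
        # of the reference errors (-1)^i h alternating.
        e_ref = f(xi) - np.polyval(c[::-1], xi)
        j = np.searchsorted(xi, eta)
        if 0 < j < n + 2:
            # eta lies between two reference points; replace the neighbour
            # whose error has the same sign as e(eta).
            xi[j - 1 if e_ref[j - 1] * e_eta > 0 else j] = eta
        elif j == 0:
            if e_ref[0] * e_eta > 0:
                xi[0] = eta
            else:
                xi = np.concatenate([[eta], xi[:-1]])
        else:
            if e_ref[-1] * e_eta > 0:
                xi[-1] = eta
            else:
                xi = np.concatenate([xi[1:], [eta]])

c, h = one_point_exchange(np.exp, 2, -1.0, 1.0)
print(c, h)   # coefficients near 0.989, 1.130, 0.554; minimax error about 0.045
```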

4. Here is an example of the one-point exchange algorithm in action, being used


to find the best minimax approximation from P2 to f (x) = ex on [−1, 1]. We
perform only the first iteration of the algorithm.
 
Step 1 Choose the initial reference to be −1, − 21 , 12 , 1 . The reason for
this choice will be explained in Theorem 8.1.
Step 2 Determine p(x) = a + bx + cx2 ∈ P2 such that
f (−1) − p(−1) = e−1 − (a − b + c) = h, (1)
     
f − 12 − p − 21 = e−1/2 − a − 12 b + 14 c = −h, (2)
     
f 12 − p 12 = e1/2 − a + 12 b + 14 c = h, (3)
f (1) − p(1) = e − (a + b + c) = −h. (4)
Considering (1) + (4) and (2) + (3), we find that
a + c = cosh 1   and   a + c/4 = cosh(1/2),
so that c = (4/3)(cosh 1 − cosh(1/2)) = 0.553 93 and a = cosh 1 − c = 0.989 14.
Considering (4) − (1) and (3) − (2), we find that
b − h = sinh 1   and   b/2 + h = sinh(1/2),
so that b = (2/3)(sinh 1 + sinh(1/2)) = 1.130 86 and h = b − sinh 1 = −0.044 337.
Thus the first approximation to p∗ is
p(x) = 0.989 14 + 1.130 86x + 0.553 93x²,
and the levelled reference error is |h| = 0.044 337.
Step 3 To determine ‖f − p‖∞, we need to identify the extreme points of
e = f − p on [−1, 1], which occur either at ±1 or at solutions of
e′(x) = e^x − b − 2cx = 0.
The graphs y = e^x and y = b + 2cx (with b = 1.130 86, c = 0.553 93)
indicate that this non-linear equation has two solutions η1 , η2 at
approximately ±0.5. One can use the bisection method or Newton’s
method to obtain the accurate values
η1 = −0.438 62 and η2 = 0.560 94.
Since e(η1) = 0.045 233 and e(η2) = −0.045 468, we have
‖f − p‖∞ = |e(η2)| = 0.045 468.
Note that |e(−1)| = |e(1)| = |h| < |e(η2 )|.
Remark Recall Powell’s comment on page 86 that ‖f − p‖∞
would in practice be obtained by computing many values of f − p
on [a, b] and approximating this function locally by quadratics.
Step 4 Since
δ = |f (η2 ) − p(η2 )| − |h| = 0.045 468 − 0.044 337 = 0.001 131,
the polynomial p is already fairly close to the best minimax
approximation from P2 to f (x) = ex on [−1, 1].
Step 5 The error function e = f − p has the following form.
[Figure: graph of e = f − p on [−1, 1], with levels ±|h| attained near −1, −1/2, 1/2, 1 and extreme points η1 ≈ −0.44, η2 ≈ 0.56.]
The point of the initial reference to be replaced by η2 is clearly 1/2,
and so the new reference is {−1, −1/2, η2, 1}. The linear equations to
be solved in Step 2 will not work out quite so simply with this
reference, owing to the lack of symmetry.
Remark Notice how well the calculated polynomial p(x)
approximates f(x) = e^x on [−1, 1] by comparison with the Taylor
polynomial q(x) = 1 + x + x²/2 of f. Indeed,
‖f − q‖∞ = f(1) − q(1) = e − 2.5 = 0.218 28,
so that ‖f − q‖∞ ≈ 4.8 ‖f − p‖∞.
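This whole first iteration is easy to reproduce numerically. The short script below is our own check (it uses scipy’s brentq root finder where Newton’s method or bisection was used above):

    import numpy as np
    from scipy.optimize import brentq

    xi = np.array([-1.0, -0.5, 0.5, 1.0])          # initial reference
    signs = np.array([1.0, -1.0, 1.0, -1.0])       # (-1)^i
    # Step 2: solve a + b*xi + c*xi**2 + (-1)^i h = exp(xi) for (a, b, c, h).
    M = np.column_stack([np.ones(4), xi, xi**2, signs])
    a, b, c, h = np.linalg.solve(M, np.exp(xi))
    print(a, b, c, h)            # 0.98914, 1.13086, 0.55393, -0.044337

    e = lambda x: np.exp(x) - (a + b*x + c*x**2)   # error function
    de = lambda x: np.exp(x) - b - 2*c*x           # e'(x)
    eta1, eta2 = brentq(de, -1, 0), brentq(de, 0, 1)   # -0.43862, 0.56094
    print(abs(e(eta2)) - abs(h))                   # delta = 0.001131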
Self-assessment questions
S1 Justify the statement in the second paragraph of page 86 that the error
function e has at least n turning points.

S2 Justify the statement in the second paragraph of page 88 that the case when
|h| = 0 can occur only on the first iteration, and then any value of q gives the
increase (8.11).
S3 Powell Exercise 8.2
Study Session 2: Matters relating to the exchange
algorithm
Read Sections 8.4 and 8.5
Commentary
1. Theorem 8.1 explains the choice of reference that we made when finding a
minimax approximation from P2 to f(x) = e^x on [−1, 1]. The Chebyshev
polynomial T3(x) = 4x³ − 3x takes its extreme values at {−1, −1/2, 1/2, 1}.
Notice also the bearing that Theorem 8.1 has on Powell Exercise 7.7. If we
map the above reference from [−1, 1] to [0, 1], then it becomes {0, 1/4, 3/4, 1},
and Theorem 8.1 implies that if this reference had been used in Powell
Exercise 7.7, then the calculated polynomial p(x) would have been the b.m.a.
from P2 to f (x) = x3 on [0, 1]. Since the given reference {0, 0.3, 0.8, 1} was
close to this ideal reference, the resulting polynomial was close to the b.m.a.
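The references of Theorem 8.1 are easy to generate; here is a small helper of our own (the name chebyshev_reference is not Powell’s):

    import numpy as np

    def chebyshev_reference(n, a=-1.0, b=1.0):
        # Extreme points of T_{n+1}, increasing, mapped from [-1, 1] to [a, b].
        i = np.arange(n + 2)
        t = np.cos((n + 1 - i) * np.pi / (n + 1))
        return (a + b) / 2 + (b - a) / 2 * t

    print(chebyshev_reference(2))         # [-1. -0.5  0.5  1. ]
    print(chebyshev_reference(2, 0, 1))   # [ 0.  0.25  0.75  1. ]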
2. The linear operator from C[−1, 1] to Pn, described after Theorem 8.1, is
investigated in a special case in Powell Exercise 8.5. The asymptotic estimate
ln n for the norm of this operator is not obvious.
3. The discussion of ‘telescoping’ on page 92 is closely related to our
commentary on Theorem 7.3 (see Problem P4).
4. In the linear programming problem described on page 94 the aim is to
minimise θ subject to the 2m linear constraints on (θ, λ0, λ1, …, λn) given by
(8.34).
Self-assessment questions
S4 Verify that the points ξi in (8.17) satisfy (8.18).
S5 Calculate the b.m.a. from P2 to f(x) = x³ on [0, 1] by using Theorem 8.1.
Compare your answer with the one obtained for Powell Exercise 7.7.
Problems for Chapter 8
P1 Powell Exercise 8.1 (Hint: consider q ∈ Pn+1 which interpolates f at ξi .)
P2 Powell Exercise 8.3
P3 Powell Exercise 8.5
P4 Powell Exercise 8.6 (Hint: express the remainder for the Taylor
approximation as an integral.)
P5 Powell Exercise 8.7
Solutions to SAQs in Chapter 8
S1 Let ξk−1 , ξk , ξk+1 be consecutive points of the reference. If e(ξk ) = |h| > 0,
then the error function e has a local maximum inside [ξk−1 , ξk+1 ], with value
at least |h|. On the other hand, if e(ξk ) = −|h| < 0, then the function e has a
local minimum inside [ξk−1, ξk+1], with value at most −|h|. These local
extreme points are clearly distinct from one another and so, since there are n intervals of
the form [ξk−1, ξk+1], the error function e has at least n turning points.
S2 The value h = 0 occurs if and only if the numbers {f(ξi) : i = 0, 1, …, n + 1}
happen to be the values taken at ξi, i = 0, 1, …, n + 1, by some p in A. If f
itself is in A, then we also have δ = 0 on the first iteration, so the algorithm
terminates. If f is not in A, then h may be zero on the first iteration because
f (ξi ) = p(ξi ), i = 0, 1, . . . , n + 1, for some p in A. But then
‖f − p‖∞ = |f(η) − p(η)| > 0 and on replacing any ξq by η we obtain a
reference {ξi⁺ : i = 0, 1, …, n + 1} such that no p in A interpolates f at ξi⁺,
i = 0, 1, . . . , n + 1 (because such a p is determined by its values at n + 1
points). Hence the new levelled reference error is positive.
S3 The initial reference is of the form {ξ0, ξ1, ξ2}, where ξ0 = a, ξ2 = b, and the
error function e = f − p on the first iteration is convex.
[Figure: graph of the convex function y = e(x) on [a, b], with e(a) = e(b) = |h|, e(ξ1) = −|h|, and minimum at η.]
Since e = f − p is convex, it has a unique extreme point in (a, b), namely a
minimum at η. If η = ξ1, then the algorithm terminates. Otherwise,
the new reference is {a, η, b} and the new approximation is
p̃ = p − (1/2)(|e(η)| − |h|), since this gives an error function
ẽ = e + (1/2)(|e(η)| − |h|) with
ẽ(a) = −ẽ(η) = ẽ(b) = (1/2)(|h| + |e(η)|) = |h̃|,
say. The algorithm now terminates since ‖ẽ‖∞ = |h̃|.
S4 Since
Tn+1(x) = cos((n + 1) cos⁻¹ x), −1 ≤ x ≤ 1
(see (4.23)), we have
Tn+1(ξi) = cos((n + 1)(n + 1 − i)π/(n + 1))
= cos((n + 1 − i)π)
= (−1)^{n+1−i},
because cos(kπ) = (−1)^k for any integer k.
S5 According to Theorem 8.1 and the discussion at the bottom of page 91, we
should use the reference {0, 0.25, 0.75, 1}. With f(x) = x³ and
p(x) = a + bx + cx², this gives
f (0) − p(0) = −a = h, (5)
f (1/4) − p(1/4) = 1/64 − (a + b/4 + c/16) = −h, (6)
f (3/4) − p(3/4) = 27/64 − (a + 3b/4 + 9c/16) = h, (7)
f (1) − p(1) = 1 − (a + b + c) = −h. (8)
Considering (7) − (5) and (8) − (6) gives
27/64 − 3b/4 − 9c/16 = 0   and   63/64 − 3b/4 − 15c/16 = 0,
so that c = 3/2 and b = −9/16.
Adding (5) and (8) gives
1 − 2a − b − c = 0 ⇒ a = 1/32, h = −1/32.
By Theorem 8.1, the b.m.a. from P2 to f(x) = x³ on [0, 1] is
p∗(x) = 1/32 − (9/16)x + (3/2)x².
This agrees with the answer to Powell Exercise 7.7.
Solutions to Problems in Chapter 8
P1 The levelled reference error |h| is found by solving
f(ξi) − p(ξi) = (−1)^i h, i = 0, 1, …, n + 1,
where p ∈ Pn. Now let q ∈ Pn+1 interpolate f at {ξi : i = 0, 1, …, n + 1}, so
that
q(ξi) − p(ξi) = (−1)^i h, i = 0, 1, …, n + 1.
The coefficient of x^{n+1} in q, which is identical to the coefficient of x^{n+1} in
q − p, is equal to f[ξ0, ξ1, …, ξn+1] (see Section 5.1) and so
f[ξ0, ξ1, …, ξn+1]/h is the coefficient of x^{n+1} in that function r ∈ Pn+1 which
satisfies
r(ξi) = (−1)^i, i = 0, 1, …, n + 1.
Hence
r[ξ0, ξ1, …, ξn+1] = f[ξ0, ξ1, …, ξn+1]/h,
so that
h = f[ξ0, ξ1, …, ξn+1]/r[ξ0, ξ1, …, ξn+1],
where r[ξ0, ξ1, …, ξn+1] is independent of f.
For n = 1, using equation (5.14),
r[ξ0, ξ1, ξ2] = (2/(ξ2 − ξ1) + 2/(ξ1 − ξ0))/(ξ2 − ξ0) = 2/((ξ2 − ξ1)(ξ1 − ξ0)),
and so
|h| = (1/2)(ξ2 − ξ1)(ξ1 − ξ0)|f[ξ0, ξ1, ξ2]|,
as required.
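This formula is easy to test numerically; the script below is our own check for n = 1, comparing |h| from the levelled equations with the divided-difference expression:

    import numpy as np

    def divided_difference(xs, ys):
        # Highest-order divided difference f[x_0, ..., x_k], by the recursion.
        d = np.array(ys, dtype=float)
        for k in range(1, len(xs)):
            d[k:] = (d[k:] - d[k-1:-1]) / (xs[k:] - xs[:-k])
        return d[-1]

    f = np.exp
    xs = np.array([0.0, 0.4, 1.0])           # any reference xi_0 < xi_1 < xi_2
    # Levelled equations: a + b*xi_i + (-1)^i h = f(xi_i).
    M = np.column_stack([np.ones(3), xs, [1.0, -1.0, 1.0]])
    a, b, h = np.linalg.solve(M, f(xs))
    print(abs(h))
    print(0.5 * (xs[2] - xs[1]) * (xs[1] - xs[0])
          * abs(divided_difference(xs, f(xs))))   # the same value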
P2 The extreme values of the error function
e∗(x) = f(x) − p∗(x) = 144/(x + 2) − (69 − 20x + 2x²)
occur when x = 0, 6, or x satisfies
e∗′(x) = −144/(x + 2)² + 20 − 4x = 0.
This equation reduces to
(x − 1)(x² − 16) = 0,
which has solutions x = 1, ±4. Since
e∗ (0) = 3, e∗ (1) = −3, e∗ (4) = 3, e∗ (6) = −3,
we deduce that p∗ is indeed the b.m.a. from P2 to f on [0, 6].
Next we determine the function p(x) = a + bx + cx² which satisfies (8.4) with
the reference {0, 1 + α, 4 + β, 6}:
f(0) − p(0) = 72 − a = h, (9)
f(1 + α) − p(1 + α) = 144/(3 + α) − (a + b(1 + α) + c(1 + α)²) = −h, (10)
f(4 + β) − p(4 + β) = 144/(6 + β) − (a + b(4 + β) + c(4 + β)²) = h, (11)
f(6) − p(6) = 18 − (a + 6b + 36c) = −h. (12)
Now, the binomial expansion for (1 + x)⁻¹ gives
1/(3 + α) = 1/(3(1 + α/3)) = (1/3)(1 − α/3) + O(α²)
and
1/(6 + β) = 1/(6(1 + β/6)) = (1/6)(1 − β/6) + O(β²),
so that, if α, β are small enough for α², β² to be neglected, then (10), (11)
reduce to
48 − 16α − (a + b(1 + α) + c(1 + 2α)) = −h, (13)
24 − 4β − (a + b(4 + β) + c(16 + 8β)) = h. (14)
Now we observe that the values a = 69, b = −20, c = 2 and h = 3 must satisfy
(9), (13), (14), (12) in the case α = β = 0, because p∗ is the b.m.a. from P2 to
f , with the alternating set being {0, 1, 4, 6}. Thus, these values for a, b, c and
h will satisfy (9), (13), (14), (12) when α, β ≠ 0 also, provided that
−16α − bα − 2cα = (−16 − b − 2c)α = 0, (15)
−4β − bβ − 8cβ = (−4 − b − 8c)β = 0. (16)
Since both these equations do hold with b = −20 and c = 2, we deduce that
a, b, c, h do satisfy (9), (13), (14), (12), and so the function given by (8.4) in
this case is p∗ again.
Remark It is not, of course, pure chance that equations (15) and (16) hold.
In fact, if ξ is a point at which
f(ξ) − p(ξ) = h and f′(ξ) − p′(ξ) = 0,
then for small ε (small enough for ε² to be neglected) we have the Taylor
approximation
f(ξ + ε) − p(ξ + ε) = f(ξ) − p(ξ) + ε(f′(ξ) − p′(ξ))
= f(ξ) − p(ξ)
= h.
In practice this means that if we choose a reference which is fairly close to an
alternating set for f − p∗ , then the polynomial given by (8.4) is very close
to p∗ .
P3 Since the reference ξi, i = 0, 1, …, n + 1, given by (8.17) is fixed, the
coefficients λj of p (cf. equation (7.27)) and h are linear functions of f(ξi),
i = 0, 1, . . . , n + 1. Hence the operator
X : f → p = Σ_{j=0}^{n} λj φj,
where {φj } is the standard basis for Pn , is linear.
For n = 2, the reference (8.17) is {−1, −1/2, 1/2, 1} and we have to find the
quadratic function p(x) of largest ∞-norm on [−1, 1] subject to the four
constraints |p(±1)| ≤ 1, |p(±1/2)| ≤ 1. By the linear programming argument
from Chapter 3, it is sufficient to consider the cases p(±1) = ±1,
p(±1/2) = ±1, and it turns out that the largest ∞-norm is achieved for
p(x) = −5/3 + (8/3)x². In this case ‖p‖∞ = 5/3, so that ‖X‖∞ = 5/3, as
required.
P4 The nth Taylor polynomial of f(x) = ln(1 + x/2) at 0 is
pn(x) = x/2 − (1/2)(x/2)² + ⋯ + ((−1)^{n+1}/n)(x/2)^n.
The usual form of the remainder in Taylor’s Theorem is rather poor for the
function ln. It is better to consider the derivatives
f′(x) − pn′(x) = 1/(2 + x) − (1/2)(1 − x/2 + ⋯ + (−1)^{n+1}(x/2)^{n−1})
= 1/(2 + x) − (1/2)·(1 − (−x/2)^n)/(1 + x/2)
= (−x/2)^n/(2 + x),
and then integrate to give
|f(x) − pn(x)| = |∫0^x (f′(t) − pn′(t)) dt|   (since f(0) = pn(0) = 0)
= |∫0^x (−t/2)^n/(2 + t) dt|
≤ ∫0^{|x|} (t/2)^n/(2 − t) dt
≤ (1/2^n) ∫0^1 t^n dt
= 1/((n + 1)2^n),
for |x| ≤ 1.
To obtain a Taylor polynomial pn with ‖f − pn‖∞ < 0.01 on [−1, 1] we must
therefore take n = 5:
‖f − p5‖∞ ≤ 1/(6 × 2⁵) = 1/192 = 0.005 208 3̇.
Now we apply telescoping to
p5(x) = x/2 − x²/8 + x³/24 − x⁴/64 + x⁵/160
on [−1, 1]. The b.m.a. from P4 to p5 on [−1, 1] is
p̃4(x) = p5(x) − (1/160)·(1/2⁴)·T5(x),
and the ∞-norm error made by this approximation is 1/(160 × 2⁴) = 1/2560.
Since T5 has no term in x⁴, the coefficient of x⁴ in p̃4(x) is −1/64 and so the
b.m.a. from P3 to p̃4 on [−1, 1] is
p̃3(x) = p̃4(x) + (1/64)·(1/2³)·T4(x),
and the ∞-norm error made by this second approximation is
1/(64 × 2³) = 1/512. Thus
‖f − p̃3‖∞ ≤ ‖f − p5‖∞ + ‖p5 − p̃4‖∞ + ‖p̃4 − p̃3‖∞
≤ 1/192 + 1/2560 + 1/512
= 0.007 55.
Since p̃3 is a cubic function and ‖f − p̃3‖∞ < 0.01, we are done.
For the record:
p̃4(x) = (255/512)x − (1/8)x² + (19/384)x³ − (1/64)x⁴
and
p̃3(x) = 1/512 + (255/512)x − (9/64)x² + (19/384)x³.

Note, however, that the error in the next telescoping step is


1
4 (19/384) = 0.012 . . ., so no further telescoping is possible.
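The telescoping steps can be automated with numpy’s polynomial classes; the sketch below is our own (the helper name telescope is an assumption) and reproduces the coefficients and error bounds found above:

    import numpy as np
    from numpy.polynomial import Polynomial, Chebyshev

    p5 = Polynomial([0, 1/2, -1/8, 1/24, -1/64, 1/160])    # Taylor polynomial

    def telescope(p):
        # Remove the leading term of p using T_n; the sup-norm error
        # incurred on [-1, 1] is |leading coefficient of p| / 2^(n-1).
        n = p.degree()
        Tn = Chebyshev.basis(n).convert(kind=Polynomial)   # T_n in power form
        lead = p.coef[n] / Tn.coef[n]                      # Tn.coef[n] = 2^(n-1)
        return (p - lead * Tn).trim(), abs(lead)

    p4t, e1 = telescope(p5)      # e1 = 1/2560
    p3t, e2 = telescope(p4t)     # e2 = 1/512
    print(p4t.coef)              # [0, 255/512, -1/8, 19/384, -1/64]
    print(p3t.coef)              # [1/512, 255/512, -9/64, 19/384]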
P5 We denote a typical element of P1 by p(x) = a + bx.
Iteration 1 {0, 3, 6}
f(0) − p(0) = 0.3 − a = h,
f(3) − p(3) = 3.4 − a − 3b = −h,
f(6) − p(6) = 5.7 − a − 6b = h,
so that 3.1 − 3b = −2h and 5.4 − 6b = 0, giving b = 0.9, h = −0.2, a = 0.5.
Thus p1 (x) = 0.5 + 0.9x, with |h| = 0.2. The maximum error of p1 is
|f (1) − p1 (1)| = 2.8 > |h|, with f (1) − p1 (1) > 0, and so 1 replaces 3 in the
reference.
Iteration 2 {0, 1, 6}
f(0) − p(0) = 0.3 − a = h,
f(1) − p(1) = 4.2 − a − b = −h,
f(6) − p(6) = 5.7 − a − 6b = h,
so that 3.9 − b = −2h and 5.4 − 6b = 0, giving b = 0.9, h = −1.5, a = 1.8.
Thus p2 (x) = 1.8 + 0.9x, with |h| = 1.5. The maximum error of p2 is
|f (2) − p2 (2)| = 3.5 > |h|, with f (2) − p2 (2) < 0, and so 2 replaces 6 in the
reference.
Iteration 3 {0, 1, 2}
f(0) − p(0) = 0.3 − a = h,
f(1) − p(1) = 4.2 − a − b = −h,
f(2) − p(2) = 0.1 − a − 2b = h,
so that 3.9 − b = −2h and −0.2 − 2b = 0, giving b = −0.1, h = −2, a = 2.3.
Thus p3 (x) = 2.3 − 0.1x, with |h| = 2. The maximum error of p3 is
|f (6) − p3 (6)| = 4 > |h|, with f (6) − p3 (6) > 0, and so 6 replaces 0 in the
reference. (Care needed here to ensure that f − p3 alternates in sign on the
new reference!)
Iteration 4 {1, 2, 6}
f(1) − p(1) = 4.2 − a − b = h,
f(2) − p(2) = 0.1 − a − 2b = −h,
f(6) − p(6) = 5.7 − a − 6b = h,
so that 4.3 − 2a − 3b = 0 and 5.8 − 2a − 8b = 0, giving a = 1.7, b = 0.3, h = 2.2.
Thus p4 (x) = 1.7 + 0.3x, with |h| = 2.2. The maximum error of p4 is
|f (4) − p4 (4)| = 2.8 > |h|, with f (4) − p4 (4) > 0, and so 4 replaces 6 in the
reference.
Iteration 5 {1, 2, 4}
f(1) − p(1) = 4.2 − a − b = h,
f(2) − p(2) = 0.1 − a − 2b = −h,
f(4) − p(4) = 5.7 − a − 4b = h,
so that 4.3 − 2a − 3b = 0 and 5.8 − 2a − 6b = 0, giving a = 1.4, b = 0.5, h = 2.3.
Thus p5 (x) = 1.4 + 0.5x, with |h| = 2.3. The maximum error of p5 is also 2.3
and so the algorithm ends.
Hence the b.m.a. from P1 to f on {0, 1, 2, 3, 4, 5, 6} is p∗ (x) = 1.4 + 0.5x. The
following figure illustrates the 5 approximations calculated by the algorithm.
[Figure: the data points and the five linear approximations p1, …, p5 on [0, 6].]
Remark Notice that the maximum errors in the above process (2.8, 3.5, 4, 2.8, 2.3)
do not decrease monotonically, and so this example serves to answer Powell Exercise 8.4.
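As a quick check of the final answer (our own script; the data values below are read off from the working above, and f(5), which is never queried by the algorithm here, is omitted):

    import numpy as np

    x = np.array([0, 1, 2, 3, 4, 6], dtype=float)
    y = np.array([0.3, 4.2, 0.1, 3.4, 5.7, 5.7])   # values used above
    err = y - (1.4 + 0.5 * x)                      # error of p5 = p*
    print(err)                   # [-1.1, 2.3, -2.3, 0.5, 2.3, 1.3]
    print(np.max(np.abs(err)))   # 2.3 = |h|, attained with alternating
                                 # signs at x = 1, 2, 4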
Chapter 10 Rational approximation by
the exchange algorithm
The approximation of continuous functions by rational functions, that is, ratios of
polynomials, is of great practical importance. For example, it is common for
computers to use rational functions to approximate special functions such as e^x,
sin x etc. As an example, we mention the function
r(x) = 1 + 40x/(x² − 20x + 138 − 4116/(x² + 42)),
which approximates f(x) = e^x to within 1.11 × 10⁻⁷ on [−1, 1]. The theory of
minimax rational approximation is rather like that for minimax polynomial
approximation, but is more difficult because rational functions do not depend
linearly on their coefficients.
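A quick computation confirms the error quoted for this approximation; the script below is our own check on a fine grid:

    import numpy as np

    def r(x):
        return 1 + 40*x / (x**2 - 20*x + 138 - 4116 / (x**2 + 42))

    xs = np.linspace(-1, 1, 100001)
    print(np.max(np.abs(np.exp(xs) - r(xs))))   # about 1.11e-7, as stated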
The first three sections of Chapter 10 are devoted to a version of the exchange
algorithm for rational functions, including a discussion of its possible failure. The
existence and uniqueness of best minimax rational approximations is not proved
and the characterisation of best minimax rational approximation in terms of
alternating sets is left to a series of (rather hard) exercises (10.1, 10.2 and 10.6).
Section 10.4 gives a brief description of an alternative algorithm for calculating
best minimax rational approximations, called the ‘differential correction
algorithm’ (see the book by Braess for more details). This section will not be
assessed in any way.
This chapter splits into TWO study sessions:
Study session 1: Sections 10.1 and 10.2.
Study session 2: Section 10.3.
Study Session 1: The exchange algorithm for rational approximation
Read Sections 10.1 and 10.2
Commentary
1. Notice that the evaluation of a polynomial in Pm+n requires m + n additions
and m + n multiplications (using nested multiplication), whereas the
evaluation of a function in Amn requires m + n additions, m + n − 1
multiplications and one division (we may assume that the coefficient of x^n in
the denominator is 1). Thus a function in Amn takes hardly any longer to
evaluate than one in Pm+n.
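For m = n = 2, say, the operation counts can be read off from nested evaluation; a small illustration of our own (the denominator is taken monic, as above):

    def eval_poly(c, x):
        # Nested multiplication for degree m + n = 4:
        # m + n = 4 multiplications and 4 additions.
        y = c[-1]
        for ck in reversed(c[:-1]):
            y = y * x + ck
        return y

    def eval_rational(a, b, x):
        # Numerator in P_2: 2 multiplications, 2 additions.
        num = (a[2] * x + a[1]) * x + a[0]
        # Monic denominator x^2 + b1*x + b0: 1 multiplication, 2 additions.
        den = (x + b[1]) * x + b[0]
        return num / den               # plus one division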
2. Powell uses the letter k in (10.5) to denote the kth approximation to the
b.m.a. from Amn to f . Unfortunately, this makes some of the formulas, such
as (10.12), look even more involved than they are already. This commentary
will suppress the letter k.
3. A proof that each f in C[a, b] has a unique b.m.a. from Amn can be found in
Achieser [2] or Rivlin [138]. Note that Theorem 1.2 does not apply because
Amn is not a linear space.
4. The process described in Section 10.2 for solving the non-linear equations
(10.10) to find aj , j = 0, 1, . . . , m, bj , j = 0, 1, . . . , n, and h, is very ingenious,
but rather hard to follow in the abstract. We illustrate the process here with
the concrete example f(x) = e^x on [−1, 1], r(x) = (a0 + a1x)/(b0 + b1x) in
A11, and initial reference {ξ0, ξ1, ξ2, ξ3} = {−1, −1/2, 1/2, 1}. The calculation
should be compared with that given in the commentary for Chapter 8 to find
the b.m.a. from P2 to f on [−1, 1].
In this case equations (10.10) are
a0 + a1ξ0 = (e^{ξ0} − h)(b0 + b1ξ0), (1)
a0 + a1ξ1 = (e^{ξ1} + h)(b0 + b1ξ1), (2)
a0 + a1ξ2 = (e^{ξ2} − h)(b0 + b1ξ2), (3)
a0 + a1ξ3 = (e^{ξ3} + h)(b0 + b1ξ3). (4)
These are 4 non-linear equations for the 5 unknowns a0 , a1 , b0 , b1 , h, but
remember that we are free to scale a0 , a1 , b0 , b1 . Even so, it does not look
easy to solve (1), (2), (3), (4).
The method of solution in Section 10.2 involves (10.11), which are special
cases of (4.11). It is helpful here to use the notation
Πi = ∏_{j=0, j≠i}^{m+n+1} 1/(ξj − ξi),   i = 0, 1, …, m + n + 1, (5)
so that (10.11) becomes
Σ_{i=0}^{m+n+1} ξi^ℓ Πi = 0,   ℓ = 0, 1, …, m + n. (6)
In our example, where ξ0 = −1, ξ1 = −1/2, ξ2 = 1/2, ξ3 = 1, we have
Π0 = 1/((1/2)(3/2)(2)) = 2/3,   Π1 = 1/((−1/2)(1)(3/2)) = −4/3,
Π2 = 1/((−3/2)(−1)(1/2)) = 4/3,   Π3 = 1/((−2)(−3/2)(−1/2)) = −2/3.
Thus equation (6) states that
(−1)^ℓ (2/3) + (−1/2)^ℓ (−4/3) + (1/2)^ℓ (4/3) + (1)^ℓ (−2/3) = 0,   ℓ = 0, 1, 2,
which you can readily check.
Thus, if we multiply equations (1), (2), (3), (4) by Π0, Π1, Π2, Π3,
respectively, then the left-hand sides sum to zero, giving
Σ_{i=0}^{3} (e^{ξi} − (−1)^i h)(b0 + b1ξi)Πi = 0. (7)
Similarly, if we multiply equations (1), (2), (3), (4) by ξ0Π0, ξ1Π1, ξ2Π2,
ξ3Π3, respectively, and sum, then we obtain
Σ_{i=0}^{3} (e^{ξi} − (−1)^i h)(b0ξi + b1ξi²)Πi = 0. (8)
Equations (7) and (8) can be written together as
0 = Σ_{i=0}^{3} (e^{ξi} − (−1)^i h)(Σ_{j=0}^{1} bj ξi^{j+ℓ})Πi
= Σ_{j=0}^{1} (Σ_{i=0}^{3} (e^{ξi} − (−1)^i h) ξi^{j+ℓ} Πi) bj,   ℓ = 0, 1,
that is, (A − hB)b = 0, where b = (b0, b1)ᵀ,
Aℓj = Σ_{i=0}^{3} e^{ξi} ξi^{ℓ+j} Πi,   ℓ, j = 0, 1, (9)
Bℓj = Σ_{i=0}^{3} (−1)^i ξi^{ℓ+j} Πi,   ℓ, j = 0, 1. (10)
In our case
A00 = (2/3)e⁻¹ − (4/3)e^{−1/2} + (4/3)e^{1/2} − (2/3)e = (4/3)(2 sinh(1/2) − sinh 1),
A10 = −(2/3)e⁻¹ + (2/3)e^{−1/2} + (2/3)e^{1/2} − (2/3)e = (4/3)(cosh(1/2) − cosh 1),
A11 = (2/3)e⁻¹ − (1/3)e^{−1/2} + (1/3)e^{1/2} − (2/3)e = (2/3)(sinh(1/2) − 2 sinh 1),
and
B00 = 2/3 + 4/3 + 4/3 + 2/3 = 4,
B10 = −2/3 − 2/3 + 2/3 + 2/3 = 0,
B11 = 2/3 + 1/3 + 1/3 + 2/3 = 2.
Thus
A = [ −0.177 347 44  −0.553 939 55 ]      B = [ 4  0 ]
    [ −0.553 939 55  −1.219 538 05 ],         [ 0  2 ].
Now we wish to determine those h such that det(A − hB) = 0, that is,
(det B)h² − (A00B11 − 2A01B01 + A11B00)h + det A = 0, (11)
in view of the symmetry of A and B. This quadratic is
8h² + 5.232 847 08h − 0.090 567 07 = 0,
which has solutions
h1 = 0.016 872 21 and h2 = −0.670 978 095.
Recalling that |h| denotes the levelled reference error, we try h1 first (since it
is smaller in modulus) and seek a solution b = (b0, b1)ᵀ to
(A − h1B)b = [ −0.244 836 28  −0.553 939 55 ] [ b0 ]  =  [ 0 ]
             [ −0.553 939 55  −1.253 282 47 ] [ b1 ]     [ 0 ].
Choosing b1 = 1, we obtain b0 = −2.262 489 65 from the first equation, and
we note that, with these values, q(x) = b0 + b1 x has no zeros in [−1, 1]. From
equations (1) and (4), say, we obtain a0 = −2.299 130 562,
a1 = −1.153 973 103, so that
r(x) = (2.299 130 562 + 1.153 973 103x)/(2.262 489 65 − x),
and the levelled reference error is |h1 | = 0.016 872 21. The argument on
page 115 of Powell shows that the value h2 leads to a rational function r with
a singularity in [−1, 1], as you can easily check.
To continue the exchange algorithm, we seek a number η in [−1, 1] such that
(10.5) holds. The extreme values occur either at ±1 or at solutions of
0 = (d/dx)(e^x − (a0 + a1x)/(b0 + b1x)) = e^x − (a1b0 − a0b1)/(b0 + b1x)².
With the calculated values of a0, a1, b0, b1, we thus have to solve
e^x − 4.909 982 764/(2.262 489 65 − x)² = 0,
and use of Newton’s method or the bisection method gives
η1 = −0.256 234 678, η2 = 0.704 578 163.
Evaluating f(x) − r(x) at these points we find that
‖f − r‖∞ = 0.025 321 995 4. Thus we deduce from (10.8) that, to 3 significant
figures,
0.0169 ≤ ‖f − r∗‖∞ ≤ 0.0253,
where r∗ is the b.m.a. from A11 to f(x) = e^x on [−1, 1]. It would appear,
therefore, that the least maximum error in approximation from A11 to f is
about half that obtained in approximating from P2 to f (see the commentary
on Section 8.2).
The error function e = f − r has the form shown below.
[Figure: graph of e = f − r on [−1, 1], with levels ±|h| near −1, −1/2, 1/2, 1 and extreme points η1 ≈ −0.26, η2 ≈ 0.70.]
Hence the new reference for the second iteration of the one-point exchange
algorithm is {−1, −1/2, η2, 1}.
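The whole of this calculation is just a 2 × 2 generalized eigenvalue problem, and can be checked with scipy; the script below is our own (all names are ours):

    import numpy as np
    from scipy.linalg import eig

    xi = np.array([-1.0, -0.5, 0.5, 1.0])
    Pi = np.array([1 / np.prod(np.delete(xi, i) - xi[i]) for i in range(4)])
    sgn = np.array([1.0, -1.0, 1.0, -1.0])                  # (-1)^i
    A = np.array([[np.sum(np.exp(xi) * xi**(l + j) * Pi)    # (9)
                   for j in range(2)] for l in range(2)])
    B = np.array([[np.sum(sgn * xi**(l + j) * Pi)           # (10)
                   for j in range(2)] for l in range(2)])
    h_vals, b_vecs = eig(A, B)                              # det(A - hB) = 0
    h_vals = h_vals.real
    k = np.argmin(np.abs(h_vals))        # try the h of smaller modulus first
    h, b = h_vals[k], b_vecs[:, k].real
    b = b / b[1]                         # scale so that b1 = 1
    print(h, b)                          # 0.0168722, [-2.2624897, 1]
    # Recover a0, a1 from equations (1) and (4):
    rhs = [(np.exp(-1) - h) * (b[0] - 1), (np.exp(1) + h) * (b[0] + 1)]
    a0, a1 = np.linalg.solve([[1.0, -1.0], [1.0, 1.0]], rhs)
    print(a0, a1)                        # -2.2991306, -1.1539731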
5. Theorem 10.2 and the discussion which follows it show that all values of h
which satisfy (10.16) are real. Note that if B is symmetric and positive
definite then B is conjugate to a diagonal matrix D with the (positive)
eigenvalues of B lying on the main diagonal. In fact, P⁻¹BP = D, where P
is the transition matrix from an eigenvector basis to the standard basis
(obtained by writing these eigenvectors as column vectors). Thus
B = PDP⁻¹ ⇒ B^{1/2} = PD^{1/2}P⁻¹,
since (B^{1/2})² = PD^{1/2}P⁻¹PD^{1/2}P⁻¹ = PDP⁻¹ = B, and so
B^{−1/2} = (PD^{1/2}P⁻¹)⁻¹ = PD^{−1/2}P⁻¹.
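In floating point the same construction reads as follows (our own illustration; numpy’s eigh gives the orthogonal diagonalisation of a symmetric matrix, so P⁻¹ = Pᵀ):

    import numpy as np

    def inv_sqrt(B):
        # B must be symmetric positive definite: B = P diag(w) P^T, w > 0.
        w, P = np.linalg.eigh(B)
        return P @ np.diag(w ** -0.5) @ P.T

    B = np.array([[3.0, -2.0], [-2.0, 3.0]])   # the matrix of SAQ S2 below
    print(inv_sqrt(B))     # entries (1 ± 1/sqrt 5)/2, as found in the solution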
Self-assessment questions
S1 Justify the fact, used in equation (10.19), that
(−1)^i ∏_{s=0, s≠i}^{m+n+1} 1/(ξs − ξi) = ∏_{s=0, s≠i}^{m+n+1} 1/|ξs − ξi|,   i = 0, 1, …, m + n + 1.

S2 Let
B = [ 3  −2 ]
    [ −2  3 ].
Prove that B is positive definite (i.e. B has strictly positive eigenvalues), and determine B^{−1/2}.
Study Session 2: Some convergence properties of
the exchange algorithm
Read Section 10.3
Commentary
1. Theorem 10.3 shows that in the discrete case the exchange algorithm
converges. As pointed out in this section, the exchange algorithm may fail,
but if it does converge, then the rate of convergence is very rapid — hence
the algorithm’s importance.
2. Equation (10.29). More precisely, the difficulty occurs when every alternating
set which satisfies (10.29) has length less than m + n + 2. In the example at
the bottom of page 117, where m = n = 1, the longest alternating set for
f − r has length 3, whereas m + n + 2 = 4 in this case (cf. the solution to
Problem P5).
Self-assessment questions
S3 Explain how the method of proof of Theorem 10.1 implies that r∗ − r is the
ratio of two cubic functions with four zeros (page 117, bottom).
S4 Confirm that the functions in A11 which satisfy (10.6) with the data
f (−4) = 0, f (−1) = 1, f (1) = 1, f (4) = 0, are given by (1.6 − 0.2x)/(2 − x)
and (1.6 + 0.2x)/(2 + x).
Problems for Chapter 10
P1 Powell Exercise 10.1 (Hint: if b, d > 0, then a/b < c/d ⇔ (a + c)/(b + d) < c/d.)
P2 Powell Exercise 10.2 (Hint: try to mimic the proof of Theorem 7.1.)
P3 Powell Exercise 10.3
P4 Powell Exercise 10.4 (Note that you need not find the b.m.a. from A12 , in
the second part.)
P5 Powell Exercise 10.6 (Hint: try to use the dimension theorem for linear
mappings.)
Solutions to SAQs in Chapter 10
S1 If i is even (odd), then there are an even (odd) number of points ξ0 , . . . , ξi−1
lying to the left of ξi and hence an even (odd) number of the factors (ξs − ξi )
are negative. Thus (−1)i and the product are either both positive or both
negative. The result follows.
S2 The characteristic equation of B is λ2 − 6λ + 5 = 0, so that the eigenvalues
are λ = 1, 5. Since these are both positive, B is positive definite.
Corresponding eigenvectors are (1, 1) for λ = 1 and (1, −1) for λ = 5, so that
the transition matrix from the basis {(1, 1), (1, −1)} to the standard basis is
P = [ 1   1 ]   ⇒   P⁻¹BP = [ 1  0 ] = D.
    [ 1  −1 ]               [ 0  5 ]
Now
D^{−1/2} = [ 1  0    ],
           [ 0  1/√5 ]
so
B^{−1/2} = PD^{−1/2}P⁻¹ = [ (1 + 1/√5)/2   (1 − 1/√5)/2 ]
                          [ (1 − 1/√5)/2   (1 + 1/√5)/2 ].
S3 We know that ξi, i = 0, 1, 2, 3, 4, is an alternating set for f − r∗, that is,
f(ξi) − r∗(ξi) = (−1)^i h, i = 0, 1, 2, 3, 4,
for some h, with ‖f − r∗‖∞ = |h|. If r ∈ A22 and ‖f − r‖∞ < ‖f − r∗‖∞, then
|f(ξi) − r(ξi)| < |f(ξi) − r∗(ξi)|, i = 0, 1, 2, 3, 4,
and so each of the numbers
r(ξi) − r∗(ξi) = (f(ξi) − r∗(ξi)) − (f(ξi) − r(ξi)), i = 0, 1, 2, 3, 4,
has the same sign as (−1)^i h. Hence r − r∗ has at least 4 zeros (the full

strength of Theorem 7.5 is not needed in this case), and since r ∈ A22 ,
r∗ ∈ A11 , the function r − r∗ is indeed the ratio of two cubic functions, whose
denominator has no zeros. Hence r − r∗ = 0, as required (see solution to
Problem P5 for a more general result).
S4 To solve the equations (10.6) we need to consider equation (11), where the
coefficients of A and B are given by (9) and (10) (adapted to the present
case) and Πi , i = 0, 1, 2, 3, are given by (5).
First,
Π0 = 1/(3·5·8) = 1/120,   Π1 = 1/((−3)·2·5) = −1/30,
Π2 = 1/((−5)·(−2)·3) = 1/30,   Π3 = 1/((−8)·(−5)·(−3)) = −1/120,
so that
A00 = 0 − 1/30 + 1/30 + 0 = 0,
A10 = 0 + 1/30 + 1/30 + 0 = 1/15,
A11 = 0 − 1/30 + 1/30 + 0 = 0,
B00 = 1/120 + 1/30 + 1/30 + 1/120 = 1/12,
B10 = (−4)·(1/120) − (−1)·(−1/30) + 1·(1/30) − 4·(−1/120) = 0,
B11 = 16·(1/120) − 1·(−1/30) + 1·(1/30) − 16·(−1/120) = 1/3.
Hence
A = [ 0     1/15 ]   and   B = [ 1/12  0   ],
    [ 1/15  0    ]             [ 0     1/3 ]
so that equation (11) is h²/36 − 1/225 = 0 ⇒ h = ±0.4. It follows that the
equations (A − hB)b = 0 reduce to −hb0/12 + b1/15 = 0, that is,
b1 = 5hb0/4. Thus, we find from equations (1) and (4) that a0 = 5h²b0,
a1 = hb0/4, so that the corresponding rational functions are
(1.6 − 0.2x)/(2 − x)   (h = −0.4)   and   (1.6 + 0.2x)/(2 + x)   (h = 0.4).
Solutions to Problems in Chapter 10
P1 The key observation here is that, for x ∈ [a, b] and θ > 0, the function
rθ(x) = (p∗(x) + θp(x))/(q∗(x) + θq(x)) lies between r∗(x) and r(x).
Indeed, by the hint, since q∗, q > 0 on [a, b],
p∗(x)/q∗(x) < p(x)/q(x) ⇒ p∗(x)/q∗(x) < (p∗(x) + θp(x))/(q∗(x) + θq(x)) < p(x)/q(x),
and similarly
p∗(x)/q∗(x) > p(x)/q(x) ⇒ p∗(x)/q∗(x) > (p∗(x) + θp(x))/(q∗(x) + θq(x)) > p(x)/q(x).
Also, if p∗(x)/q∗(x) = p(x)/q(x), then rθ(x) = p∗(x)/q∗(x) = p(x)/q(x).
Now suppose that η ∈ [a, b] satisfies ‖f − rθ‖∞ = |f(η) − rθ(η)|. If rθ(η) lies
strictly between r∗(η) and r(η), then it is clear that
|f(η) − rθ(η)| < max{|f(η) − r∗(η)|, |f(η) − r(η)|} ≤ ‖f − r∗‖∞,
and so ‖f − rθ‖∞ < ‖f − r∗‖∞, as required.
Otherwise, rθ(η) = r∗(η) = r(η), so that
‖f − rθ‖∞ = |f(η) − rθ(η)| = |f(η) − r(η)| < ‖f − r∗‖∞,
once again.
P2 The aim here is to prove that if
(f(x) − r∗(x))(r(x) − r∗(x)) > 0, x ∈ ZM,
for some r ∈ Amn, where
ZM = {x ∈ [a, b] : |f(x) − r∗(x)| = ‖f − r∗‖∞},
then r∗ is not a b.m.a. from Amn to f, because ‖f − rθ‖∞ < ‖f − r∗‖∞ for
some θ > 0, where
rθ = (p∗ + θp)/(q∗ + θq).
Following the proof of Theorem 7.1, we put e∗ = f − r∗ and define
Z0 = {x ∈ [a, b] : (r(x) − r∗ (x)) e∗ (x) ≤ 0},
so that d = max_{x∈Z0} |e∗(x)| < ‖e∗‖∞, because Z0 and ZM are disjoint.
Next observe that ‖r∗ − rθ‖∞ can be made arbitrarily small by choosing θ
small, since
r∗ − rθ = p∗/q∗ − (p∗ + θp)/(q∗ + θq) = −θ(pq∗ − p∗q)/(q∗(q∗ + θq)).
In particular, we can choose θ so small that
‖r∗ − rθ‖∞ < ‖e∗‖∞ − d.
Also recall from Problem P1 that rθ (x) lies between r∗ (x) and r(x), so that
r∗ (x) − rθ (x) and r∗ (x) − r(x) have the same sign (unless both are zero).
Finally, choose ξ ∈ [a, b] such that |f(ξ) − rθ(ξ)| = ‖f − rθ‖∞. If ξ ∈ Z0, then
f(ξ) − r∗(ξ) and r∗(ξ) − rθ(ξ) have the same sign, so that
‖f − rθ‖∞ = |f(ξ) − rθ(ξ)|
= |f(ξ) − r∗(ξ) + r∗(ξ) − rθ(ξ)|
= |f(ξ) − r∗(ξ)| + |r∗(ξ) − rθ(ξ)|
< d + (‖e∗‖∞ − d)
= ‖e∗‖∞.
On the other hand, if ξ ∈ [a, b]\Z0, then f(ξ) − r∗(ξ) and r∗(ξ) − rθ(ξ) have
opposite signs (neither is zero!), so that
‖f − rθ‖∞ = |f(ξ) − rθ(ξ)|
= |f(ξ) − r∗(ξ) + r∗(ξ) − rθ(ξ)|
< max{|f(ξ) − r∗(ξ)|, |r∗(ξ) − rθ(ξ)|}
≤ ‖e∗‖∞.
In either case, the desired reduction holds.
We have thus proved the ‘only if’ part of the following analogue of
Theorem 7.1.
Theorem Let f ∈ C[a, b]. Then r∗ is a b.m.a. from Amn to f if and only if
there is no function r in Amn such that
(f (x) − r∗ (x))(r(x) − r∗ (x)) > 0, x ∈ ZM ,
where
ZM = {x ∈ [a, b] : |f(x) − r∗(x)| = ‖f − r∗‖∞}.
The proof of the ‘if’ part goes exactly as the beginning of Section 7.2.
P3 As in the solution to SAQ S4 we could obtain equation (11) by determining
Πi, i = 0, 1, 2, 3, from (5) and the coefficients of A and B from (9) and (10).
However, the fact that ξ0 = 0 and f (0) = 0 makes it fairly easy to solve
equations (10.6) directly:
a0 + 0a1 = (0 − h)(b0 + 0b1 ), (12)
a0 + 2a1 = (1 + h)(b0 + 2b1 ), (13)
a0 + 5a1 = (1.6 − h)(b0 + 5b1 ), (14)
a0 + 6a1 = (2 + h)(b0 + 6b1 ). (15)
Substituting for a0 from (12) into (13), (14), (15) and choosing b1 = 1 gives
2a1 = b0 + 2 + 2hb0 + 2h, (16)
5a1 = 1.6b0 + 8 − 5h, (17)
6a1 = 2b0 + 12 + 2hb0 + 6h. (18)
Using (16) to eliminate a1 from (17) and (18) gives
0 = 0.9b0 − 3 + 5hb0 + 10h,
0 = b0 − 6 + 4hb0 ,
and, after eliminating b0 ,
0 = 2.4 + 28h + 40h2 = (h + 0.1)(40h + 24),
so that h = −0.1, −0.6.
The solution h = −0.1 gives b0 = 10, a1 = 4.9, a0 = 1, that is,
r1(x) = (1 + 4.9x)/(10 + x).
The solution h = −0.6 gives b0 = −30/7, a1 = 29/35, a0 = −18/7, that is,
r2(x) = (−18/7 + (29/35)x)/(−30/7 + x) = (−90 + 29x)/(−150 + 35x).
The two functions have the following graphs.
[Figure: graphs of y = (1 + 4.9x)/(10 + x) and y = (−90 + 29x)/(−150 + 35x) against the data points on [0, 6].]
Note that only the first approximation r1 is in A11 (r2 has a pole at x = 30/7 in [0, 6]).
P4 It is a straightforward matter to check that ‖f − r∗‖∞ = 1/4 and that
{−1, −1/2, 1/2, 1} is an alternating set of length 4 for f − r∗. Since
m + n + 2 = 5 in this case, we can apply the argument at the bottom of page
117 (see also SAQ S3) to show that r∗ is the b.m.a. to f from A21.
To prove that r∗ is not the b.m.a. from A12 to f, we shall find a function of
the form
r(x) = x/(a + bx²), −1 ≤ x ≤ 1,
such that ‖f − r‖∞ < 1/4. The above form for r is appropriate because the
function f itself is odd (i.e. f(−x) = −f(x)). There are many reasonable
choices for a and b, but a = 2, b = −1 seems a good idea since then we have
r′(0) = 1/2 and r(1) = f(1) (and also r′(1) = f′(1)).
[Figure: graphs of y = x/(2 − x²) and y = x³ on [−1, 1].]
The maximum value of |f(x) − r(x)| in [−1, 1] occurs at a zero α of
f′(x) − r′(x) = 3x² − (2 + x²)/(2 − x²)²
= (3x²(4 − 4x² + x⁴) − (2 + x²))/(2 − x²)²
= (3x⁶ − 12x⁴ + 11x² − 2)/(2 − x²)²
= (x² − 1)(3x⁴ − 9x² + 2)/(2 − x²)².
Thus
α² = (9 − √(81 − 24))/6   ⇒   α = ±0.491 624 105.
Hence ‖f − r‖∞ = |f(α) − r(α)| = 0.160 778 31 < 0.25, as required.
P5 It is useful in this problem to denote the degree of a polynomial p by ∂p.
Thus d = min{m − ∂p∗, n − ∂q∗}.
If r = p/q lies in Amn, then
r − r∗ = p/q − p∗/q∗ = (pq∗ − p∗q)/(qq∗).
Thus we need to find polynomials p ∈ Pm, q ∈ Pn such that
pq∗ − p∗q = ∏_{i=1}^{k} (x − ξi), (19)
where 0 ≤ k ≤ m + n − d. Also, we require q > 0 on [a, b], but we note that
this can be arranged (once a solution to (19) is available) by considering
(p∗ + εp)/(q∗ + εq), which also satisfies (19) up to a positive constant
multiple, for a suitably small positive constant ε.
Notice first that the degree of pq∗ − p∗q is at most
max{m + ∂q∗, n + ∂p∗} = m + n − d,
and that the polynomials p, q have altogether m + n + 2 unknown coefficients
a0, …, am, b0, …, bn, say. Equating coefficients of x^j, j = 0, 1, …, m + n − d,
in (19) gives m + n + 1 − d linear equations for the unknown coefficients,
which we regard as a linear mapping
t : (a0, …, am, b0, …, bn) → (c0, c1, …, cm+n−d),
from R^{m+n+2} to R^{m+n+1−d}.
In order that we can solve for a0 , . . . , bn given any c0 , . . . , cm+n−d , we require
the mapping t to be onto, that is, dim Im(t) = m + n + 1 − d. By the
dimension theorem,
dim Ker(t) + dim Im(t) = m + n + 2,
and so we require dim Ker(t) = d + 1.
Suppose, then, that p/q ∈ Amn and that the coefficient vector of (p, q) lies in
Ker(t). Then
pq∗ − p∗q = 0, so p/q = p∗/q∗, (20)
and since p∗ , q ∗ have no non-constant common factors, we deduce that
p = λp∗ , q = λq ∗ , where λ is a polynomial such that
∂λ = ∂p − ∂p∗ = ∂q − ∂q ∗ ≤ d. It is also clear that if λ is any polynomial
with ∂λ ≤ d, then p/q = λp∗ /λq ∗ lies in Amn and satisfies (20). Hence Ker(t)
consists of arbitrary linear combinations of the d + 1 coefficient vectors of
(p∗(x), q∗(x)), (xp∗(x), xq∗(x)), …, (x^d p∗(x), x^d q∗(x)).
These coefficient vectors are clearly linearly independent, and so
dim Ker(t) = d + 1, as required.
The result of this problem should be compared to Haar condition (2), Powell,
page 77. Using this result, the theorem stated in the solution to Problem P2,
and the fact that a polynomial of degree m + n − d has at most m + n − d
sign changes, we deduce by the argument at the beginning of Section 7.3 the
following analogue of Theorem 7.2.
Theorem Let f ∈ C[a, b]. Then r∗ = p∗ /q ∗ is a b.m.a. from Amn to f on
[a, b] if and only if f − r∗ has an alternating set of length m + n + 2 − d,
where d = min{m − ∂p∗ , n − ∂q ∗ }.
Thus a b.m.a. r∗ from Amn to f is defective if and only if f − r∗ does not have
an alternating set of full length m + n + 2. A function f ∈ C[a, b] is called
hypernormal if its b.m.a. from each Amn is not defective. See Rivlin [138]
for a discussion of this concept and a proof that f (x) = ex is hypernormal.