Notes Ending 21 Feb 2024
Numerical analysis forms the heart of the fields known as 'scientific computing' or 'computational
science and engineering.' Although numerical analysis has flourished in the past seventy years, it
remains one of the fastest growing areas of applied mathematics, with roots stretching back centuries
to wherever approximations were needed. We model our world with continuous mathematics.
Numerical methods are used to solve problems such as modeling chemical or biological processes,
designing ecologically sound heating systems, or computing the trajectories of spacecraft and
satellites. Whether our interest is natural science, engineering, or even finance and economics, the
models we most often employ are functions of real variables. The greater availability of computing
power and of the global internet has led to the availability of good software implementations of
numerical methods. However, the availability of quality software does not relieve you of the effort
and responsibility of first understanding numerical analysis.
Numerical analysis has a distinct flavor from basic calculus, from solving ODEs algebraically, and
from other non-numeric areas. In numerical analysis, the equations that arise can be linear or
nonlinear and may involve derivatives, integrals, combinations of these, and more; there are many
options for solving such problems, and the answers are typically given as tables of values (numbers)
or as graphs. In calculus and in ODEs, by contrast, only a limited range of problems can be solved,
and most often the answer is an algebraic expression. Thus, the tricks and techniques one learns in
algebra and calculus cannot fully tackle the complexities that arise in serious applications. Numerical
analysis therefore extends your ability to solve problems that are difficult or impossible to solve
analytically.
We want computational schemes that both minimize the number of calculations and make effective
use of computers. In carrying out large-scale computations, it is important to follow guidelines
designed to save time while ensuring the quality of the resulting approximations. For the analysis
component of 'numerical analysis', we rely on tools of classical real analysis, such as continuity,
differentiability, Taylor expansion, and convergence of sequences and series. Matrix computations
play a fundamental role in numerical analysis: discretization of continuous variables turns calculus
into algebra.
The importance of numerical analysis can be best appreciated by realizing the impact its disappear-
ance would have on our world. The space program would suffer; aircraft design would be hobbled;
weather forecasting would again become the stuff of soothsaying and almanacs. The ultrasound
technology that uncovers cancer and illuminates the womb would vanish. Google couldn’t rank web
pages.
Everybody is a genius. But if you judge a fish by its ability to climb a tree, it will live its
whole life believing that it is stupid.
—Albert Einstein
CHAPTER 1
LEAST-SQUARES APPROXIMATION
1.1.1 Introduction
Suppose we want to determine the weight of any female individual, given her height. Consider
the following initial data:
It can be observed that these points do not quite lie in a straight line. Although we could use a
random pair of these data points to approximate the weight of any individual, it would seem more
reasonable to find the line that best approximates all the data points to determine the weight of
any female individual. We will consider this type of approximation in this chapter.
We shall consider two general types of problems. In one, a function is given explicitly (and in most
cases is not easy to work with), but we wish to find a "simpler" (easier to work with) type of
function, such as a polynomial, to approximate values of the given function. The other problem is
concerned with fitting functions to given data and finding the "best" function in a certain class
(for instance, a line of best fit) to represent the data.
Both problems have been touched upon in "Interpolation and Polynomial Approximation." The
nth Taylor polynomial about the number x_0 is an excellent approximation to an (n + 1)-times
differentiable function f in a small neighborhood of x_0. The Lagrange interpolating polynomials,
or, more generally, osculatory polynomials, were discussed both as approximating polynomials and,
together with cubic splines, as polynomials to fit certain data.
In this chapter, limitations to these techniques are considered, and other avenues of approach
are discussed.
Consider the problem of estimating the values of a function at non-tabulated points, given the
experimental data in Table 1.1 below.
Table 1.1:

    x_i |  1    2    3    4    5    6     7     8     9    10
    y_i | 1.3  3.5  4.2  5.0  7.0  8.8  10.1  12.5  13.0  15.6
Graphing the values in Table 1.1 above, it appears that the actual relationship between x and y is
linear. Because of errors in the data, it is unlikely that an algebraic equation of a line will be found
to precisely fit the data. So it is unreasonable to require that the approximating function agree
exactly with the data. In fact, such a function would introduce oscillations that were not originally
present. For example, the graph of the ninth-degree polynomial that interpolates the data in
Table 1.1 can be obtained using Maple. This polynomial is clearly a poor predictor of information
between a number of the data points.
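For readers without Maple, the same experiment can be reproduced with a short script. The
following is a minimal sketch, assuming NumPy and SciPy are available, that constructs the
ninth-degree interpolating polynomial for the data in Table 1.1 and evaluates it between the data
points so the behaviour away from the nodes can be inspected.

    import numpy as np
    from scipy.interpolate import lagrange

    # Data from Table 1.1
    x = np.arange(1, 11)
    y = np.array([1.3, 3.5, 4.2, 5.0, 7.0, 8.8, 10.1, 12.5, 13.0, 15.6])

    # Ninth-degree polynomial interpolating all ten points
    p9 = lagrange(x, y)

    # Evaluate between the data points; compare the printed values with the
    # roughly linear trend of the data to see how the interpolant behaves
    # between the nodes.
    for t in np.arange(1.5, 10.0, 1.0):
        print(f"P9({t:4.1f}) = {p9(t):10.4f}")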
A better approach would be to find the “best” (in some sense) approximating line, even if it
does not agree precisely with the data at any point.
Let a_1 x_i + a_0 denote the ith value on the approximating line and y_i the ith given y-value.
We assume throughout that the independent variables, the x_i's, are exact, and that it is the
dependent variables, the y_i's, that are suspect. This is a reasonable assumption in most experimental
situations.
The problem of finding the equation of the best linear approximation in the absolute sense requires
that values of a_0 and a_1 be found to minimize

    E_\infty(a_0, a_1) = \max_{1 \le i \le 10} |y_i - (a_1 x_i + a_0)|.

This is commonly called a minimax problem and cannot be handled by elementary techniques.
Another approach to determining the best linear approximation involves finding values of a_0 and
a_1 to minimize

    E_1(a_0, a_1) = \sum_{i=1}^{10} |y_i - (a_1 x_i + a_0)|.
This quantity is called the absolute deviation. To minimize a function of two variables, we need
to set its partial derivatives to zero and simultaneously solve the resulting equations. In the case
of the absolute deviation, we need to find a_0 and a_1 with

    0 = \frac{\partial}{\partial a_0} \sum_{i=1}^{10} |y_i - (a_1 x_i + a_0)|
    and
    0 = \frac{\partial}{\partial a_1} \sum_{i=1}^{10} |y_i - (a_1 x_i + a_0)|.
The problem is that the absolute-value function is not differentiable at zero, and we might not be
able to find solutions to this pair of equations. We now resort to the linear least squares approach.
The least squares approach to this problem involves determining the best approximating line when
the error involved is the sum of the squares of the differences between the y-values on the
approximating line and the given y-values. Hence, constants a_0 and a_1 must be found that
minimize the least squares error

    E_2(a_0, a_1) = \sum_{i=1}^{10} [y_i - (a_1 x_i + a_0)]^2.
The least squares method is the most convenient procedure for determining best linear approximations,
and there are also important theoretical considerations that favor it over the other procedures
outlined above. The minimax approach generally assigns too much weight to a single piece of data
that is badly in error, whereas the absolute deviation method does not give sufficient weight to a
point that is considerably out of line with the approximation. The least squares approach puts
substantially more weight on a point that is out of line with the rest of the data, but does not
permit that point to completely dominate the approximation. In addition, the least squares error
is differentiable, which allows the minimization to be carried out with calculus. A further reason
for considering the least squares approach involves the study of the statistical distribution of error.
The general least squares problem of finding a line of best fit to a collection of data
\{(x_i, y_i)\}_{i=1}^{m} involves minimizing the total error,

    E \equiv E_2(a_0, a_1) = \sum_{i=1}^{m} [y_i - (a_1 x_i + a_0)]^2,
with respect to the parameters a_0 and a_1. For a minimum to occur, we need both

    \frac{\partial E}{\partial a_0} = 0   and   \frac{\partial E}{\partial a_1} = 0,

that is,

    0 = \frac{\partial}{\partial a_0} \sum_{i=1}^{m} [y_i - (a_1 x_i + a_0)]^2
      = -2 \sum_{i=1}^{m} (y_i - a_1 x_i - a_0)

and

    0 = \frac{\partial}{\partial a_1} \sum_{i=1}^{m} [y_i - (a_1 x_i + a_0)]^2
      = -2 \sum_{i=1}^{m} (y_i - a_1 x_i - a_0) x_i.
These equations simplify to the normal equations:
    a_0 \cdot m + a_1 \sum_{i=1}^{m} x_i = \sum_{i=1}^{m} y_i,
                                                                          (1.1)
    a_0 \sum_{i=1}^{m} x_i + a_1 \sum_{i=1}^{m} x_i^2 = \sum_{i=1}^{m} x_i y_i.
Exercise 1.1.1. TRY: Find the least squares line approximating the data in Table 1.1 above.
Expected result: The resulting line of best fit is P(x) = 1.538x − 0.360.
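As a quick check of the expected result, the normal equations (1.1) can be set up and solved
directly. The following is a minimal Python sketch; the helper name least_squares_line is ours,
introduced only for illustration.

    import numpy as np

    def least_squares_line(x, y):
        """Solve the 2x2 normal equations (1.1) for the line a1*x + a0."""
        x = np.asarray(x, dtype=float)
        y = np.asarray(y, dtype=float)
        m = len(x)
        # Coefficient matrix and right-hand side of the normal equations
        A = np.array([[m,       x.sum()],
                      [x.sum(), (x**2).sum()]])
        b = np.array([y.sum(), (x * y).sum()])
        a0, a1 = np.linalg.solve(A, b)
        return a0, a1

    # Data from Table 1.1
    x = range(1, 11)
    y = [1.3, 3.5, 4.2, 5.0, 7.0, 8.8, 10.1, 12.5, 13.0, 15.6]
    a0, a1 = least_squares_line(x, y)
    print(f"P(x) = {a1:.3f}x + {a0:.3f}")   # approximately 1.538x - 0.360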
The general problem of approximating a set of data \{(x_i, y_i)\}_{i=1}^{m} with an algebraic
polynomial

    P_n(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0,

of degree n < m − 1, using the least squares procedure is handled similarly. We choose the constants
a_0, a_1, \ldots, a_n to minimize the least squares error E = E_2(a_0, a_1, \ldots, a_n), where
    E = \sum_{i=1}^{m} (y_i - P_n(x_i))^2
      = \sum_{i=1}^{m} y_i^2 - 2 \sum_{i=1}^{m} P_n(x_i) y_i + \sum_{i=1}^{m} (P_n(x_i))^2
      = \sum_{i=1}^{m} y_i^2 - 2 \sum_{i=1}^{m} \left( \sum_{j=0}^{n} a_j x_i^j \right) y_i
        + \sum_{i=1}^{m} \left( \sum_{j=0}^{n} a_j x_i^j \right)^2
      = \sum_{i=1}^{m} y_i^2 - 2 \sum_{j=0}^{n} a_j \left( \sum_{i=1}^{m} y_i x_i^j \right)
        + \sum_{j=0}^{n} \sum_{k=0}^{n} a_j a_k \left( \sum_{i=1}^{m} x_i^{j+k} \right).
As in the linear case, in order for E to be minimized it is necessary that \partial E / \partial a_j = 0
for each j = 0, 1, \ldots, n. Thus, for each j, we must have
    0 = \frac{\partial E}{\partial a_j}
      = -2 \sum_{i=1}^{m} y_i x_i^j + 2 \sum_{k=0}^{n} a_k \sum_{i=1}^{m} x_i^{j+k}.
This gives n + 1 normal equations in the n + 1 unknowns a_j. These are

    \sum_{k=0}^{n} a_k \sum_{i=1}^{m} x_i^{j+k} = \sum_{i=1}^{m} y_i x_i^j,
    for each j = 0, 1, \ldots, n.                                             (1.3)
These normal equations have a unique solution provided that the x_i are distinct.
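To make the construction concrete, here is a minimal Python sketch that assembles the normal
equations (1.3) for a general degree n and solves them with a standard linear solver; the helper
name least_squares_poly is ours, introduced only for illustration.

    import numpy as np

    def least_squares_poly(x, y, n):
        """Return [a_0, ..., a_n] for the degree-n least squares polynomial,
        obtained by assembling and solving the normal equations (1.3)."""
        x = np.asarray(x, dtype=float)
        y = np.asarray(y, dtype=float)
        # A[j, k] = sum_i x_i^(j+k),  b[j] = sum_i y_i * x_i^j
        A = np.array([[np.sum(x**(j + k)) for k in range(n + 1)]
                      for j in range(n + 1)])
        b = np.array([np.sum(y * x**j) for j in range(n + 1)])
        return np.linalg.solve(A, b)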
Exercise 1.1.2. Fit the data in Table 1.2 below with the discrete least squares polynomial of
degree at most 2. Hint: For this problem, n = 2, m = 5, and so there are three normal equations.
To solve the system of equations, try using Mathematica or MATLAB. Determine the total error E
obtained by using a polynomial of degree at most 2.
Table 1.2:

    i   |   1        2        3        4        5
    x_i |   0.001    0.25     0.50     0.75     1.00
    y_i |   1.0000   1.2840   1.6487   2.1170   2.7183
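For readers who prefer Python to Mathematica or MATLAB, the following minimal sketch carries
out the fit for Exercise 1.1.2 using numpy.polyfit, which performs the same least squares
polynomial fit as solving the normal equations by hand, and then evaluates the total error E so
the printed values can be compared with a hand computation.

    import numpy as np

    # Data from Table 1.2
    x = np.array([0.001, 0.25, 0.50, 0.75, 1.00])
    y = np.array([1.0000, 1.2840, 1.6487, 2.1170, 2.7183])

    # Degree-2 least squares fit; np.polyfit returns coefficients from
    # highest to lowest degree, i.e. [a_2, a_1, a_0].
    a2, a1, a0 = np.polyfit(x, y, 2)
    print(f"P2(x) = {a2:.5f} x^2 + {a1:.5f} x + {a0:.5f}")

    # Total least squares error E = sum_i (y_i - P2(x_i))^2
    p2_at_x = np.polyval([a2, a1, a0], x)
    E = np.sum((y - p2_at_x)**2)
    print(f"E = {E:.6e}")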