Numerical Computations
CSC 2702
Required textbook:
Numerical Analysis, Burden & Faires, 8th edition, Thomson Brooks/Cole
Dr Azeddine M
KICT, CS
IIUM
October 12, 2009
Contents
1 Mathematical Preliminaries
1.1 Review of Calculus
1.1.1 Exercises
1.2 Round-off Errors
1.2.1 Exercises
4 Accelerating Convergence
4.1 Aitken's ∆² method
4.2 Steffensen's Method
4.3 Zeros of Polynomials
4.4 Horner's Method
4.5 Deflation
4.6 Müller's method
4.7 Exercises
6 Numerical Differentiation and Integration
6.1 Numerical Differentiation
6.2 Richardson's Extrapolation
6.3 Elements of Numerical Integration
6.3.1 Trapezoidal rule
6.3.2 Simpson's rule
6.3.3 Degree of precision
6.3.4 Newton-Cotes Formula
6.4 Composite Numerical Integration
6.5 Adaptive Quadrature Methods
6.6 Gaussian Quadrature
6.7 Improper Integrals
11 Exams
11.1 Exam 1
11.2 Exam 2
11.3 Exam 3
Chapter 1
Mathematical Preliminaries
f is continuous at x₀ if

    lim_{x→x₀} f(x) = f(x₀)                                          (1.3)

Differentiable functions:
The function f is differentiable at x₀ if

    f′(x₀) = lim_{x→x₀} (f(x) − f(x₀))/(x − x₀)                      (1.5)

exists. This limit is called the derivative of f at x₀.
The set of all functions that have n continuous derivatives on X is denoted by Cⁿ(X).
Rolle’s Theorem:
Suppose f ∈ C[a, b] and f is differentiable on (a, b). If f (a) = f (b), then a number c ∈ (a, b) exists
with f ′ (c) = 0.
Mean value theorem:
Suppose f ∈ C[a, b] and f is differentiable on (a, b). Then a number c ∈ (a, b) exists with

    f′(c) = (f(b) − f(a))/(b − a)                                    (1.6)
Riemann integral:
The Riemann integral of a function f on the interval [a, b] is the limit (provided it exists)

    ∫_a^b f(x) dx = lim_{max ∆xᵢ → 0} Σ_{i=1}^n f(zᵢ) ∆xᵢ,           (1.7)

where the numbers xᵢ satisfy a = x₀ ≤ x₁ ≤ ... ≤ xₙ = b, ∆xᵢ = xᵢ − xᵢ₋₁, and zᵢ is arbitrarily
chosen in [xᵢ₋₁, xᵢ]. If the points are equally spaced and we choose zᵢ = xᵢ, the Riemann
integral of f on [a, b] becomes

    ∫_a^b f(x) dx = lim_{n→∞} (b − a)/n Σ_{i=1}^n f(xᵢ).             (1.8)
Weighted Mean Value Theorem for Integrals: Suppose f ∈ C[a, b], the Riemann integral of g
exists on [a, b], and g(x) does not change sign on [a, b]. Then there exists a number c in (a, b) with

    ∫_a^b f(x) g(x) dx = f(c) ∫_a^b g(x) dx.                         (1.9)

When g(x) = 1, this gives the average value of the function f over the interval [a, b]:

    f(c) = 1/(b − a) ∫_a^b f(x) dx.                                  (1.10)
Generalized Rolle's Theorem: Suppose that f ∈ C[a, b] is n times differentiable on (a, b). If f(x)
is zero at the n + 1 distinct numbers x₀, ..., xₙ in [a, b], then a number c in (a, b) exists with f⁽ⁿ⁾(c) = 0.
Intermediate Value Theorem: If f ∈ C[a, b] and K is any number between f (a) and f (b), then
there exists c in (a, b) for which f (c) = K.
Taylor's Theorem: Suppose f ∈ Cⁿ[a, b], f⁽ⁿ⁺¹⁾ exists on [a, b], and x₀ ∈ [a, b]. For every x ∈ [a, b],
there exists a number ξ(x) between x₀ and x with f(x) = Pₙ(x) + Rₙ(x), where

    Pₙ(x) = Σ_{k=0}^n f⁽ᵏ⁾(x₀)/k! (x − x₀)ᵏ                           (1.11)

    Rₙ(x) = f⁽ⁿ⁺¹⁾(ξ(x))/(n + 1)! (x − x₀)ⁿ⁺¹                        (1.12)
Pn (x) is called the nth Taylor polynomial for f about x0 , and Rn (x) is called the remainder
term (or truncation error). In the case when x0 = 0, the Taylor polynomial is often called a
Maclaurin polynomial.
If we let n → ∞, the Taylor polynomial becomes the Taylor series for f about x₀. For x₀ = 0, the
Taylor series is called the Maclaurin series.
Example:
We want to determine an approximate value of cos(0.01) using the second Maclaurin polynomial

    cos x = 1 − (1/2)x² + (1/6)x³ sin(ξ)                             (1.13)

where ξ is a number between 0 and x. This gives

    cos(0.01) ≈ 1 − (1/2)(0.01)² = 0.99995,

with truncation error bounded, using |sin ξ| ≤ 1, by

    |R₂(0.01)| ≤ (1/6)(0.01)³ = 0.16̄ × 10⁻⁶,

where we use the bar over 6 to indicate that this digit repeats indefinitely.
The error bound is much larger than the actual error. This is due in part to the poor bound we used
for sin ξ. It can be shown that |sin x| ≤ |x|. Since 0 ≤ ξ ≤ 0.01, we find the sharper bound
(1/6)(0.01)³ (0.01) = 0.16̄ × 10⁻⁸.
Note that eq. (1.13) can also be written as cos x = 1 − (1/2)x² + (1/24)x⁴ cos(ξ′), and then the error
is no more than (1/24) × 10⁻⁸ = 0.416 × 10⁻⁹.
This example illustrates two objectives of numerical analysis:
find an approximation to the solution, and determine a bound for its error.
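As a check, the following short Fortran program (an illustration of ours, in the style of the testeps
program used later in these notes; the program and variable names are not from the textbook) compares
P₂(0.01) with the library cosine and with the error bound above:

program maclaurin_cos
implicit none
double precision :: x, p2, bound
x = 0.01d0
! second Maclaurin polynomial of cos x
p2 = 1.0d0 - 0.5d0*x**2
! remainder bound |R2| <= |x|**3/6, using |sin(xi)| <= 1
bound = x**3/6.0d0
print *, 'P2(x)        =', p2
print *, 'cos(x)       =', cos(x)
print *, 'actual error =', abs(cos(x) - p2)
print *, 'error bound  =', bound
end program maclaurin_cos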
1.1.1 Exercises
The exercises are from the textbook sec 1.1 pages 14-16.
Tutor: Exercises 1,2,3,4,15,23
Students: All odd exercises except 17
Assignment 1: Exercises 15 and 26
For the exercise on f(x) = x e^{x²}, whose fourth Maclaurin polynomial is P₄(x) = x + x³,
the remainder is given by

    R₄(x) = f⁽⁵⁾(ξ(x))/5! x⁵
          = (1/30) e^{ξ²} (15 + 90ξ² + 60ξ⁴ + 8ξ⁶) x⁵
          ≤ 1.211406197 x⁵
          ≤ 0.01240479946

where we find the last two inequalities by substituting ξ = x = 0.4.
The integral can be approximated by

    ∫₀^0.4 f(x) dx ≈ ∫₀^0.4 (x + x³) dx = 0.086400
For the exercise on approximating cos 42° by a Taylor polynomial about π/4 (i.e. 45°), the remainder
bounds are:

    n = 1 , E = 0.001370778390
    n = 2 , E = 0.00002392459621
    n = 3 , E = 3.131722321 × 10⁻⁷
    n = 4 , E = 3.279531946 × 10⁻⁹

So n = 3 is sufficient to get the value accurate to within 10⁻⁶. In this case P₃ agrees with the actual
value cos 42° = 0.7431448255... to the digits shown; the error is of the order of 0.2239 × 10⁻⁶.
Multiplying m ≤ f(x₁) ≤ M and m ≤ f(x₂) ≤ M by the positive constants c₁ and c₂ gives

    c₁ m ≤ c₁ f(x₁) ≤ c₁ M
    c₂ m ≤ c₂ f(x₂) ≤ c₂ M

which lead to

    (c₁ + c₂) m ≤ c₁ f(x₁) + c₂ f(x₂) ≤ (c₁ + c₂) M

and therefore

    m ≤ (c₁ f(x₁) + c₂ f(x₂))/(c₁ + c₂) ≤ M.

Without loss of generality, let us assume that m = f(x₁) and M = f(x₂); then the last equation gives

    f(x₁) ≤ (c₁ f(x₁) + c₂ f(x₂))/(c₁ + c₂) ≤ f(x₂).

According to the intermediate value theorem there exists ξ between x₁ and x₂ such that

    f(ξ) = (c₁ f(x₁) + c₂ f(x₂))/(c₁ + c₂).

A floating-point number has the form ±(.d₁d₂...dₙ)_β × βᵉ, where (.d₁d₂...dₙ)_β is a β-fraction
called the mantissa, and e is an integer called the exponent. Such a floating-point number is said to
be normalized in case d₁ ≠ 0, or else d₁ = d₂ = ... = dₙ = 0.
To save storage and provide a unique representation for each floating-point number we use a
normalized form

    (−1)ˢ 2^{c−1023} (1 + f)                                         (1.18)

Example: consider the machine number

    0 10000000011 1011100100010000000000000000000000000000000000000000

The leftmost bit is zero, so the number is positive.
The next eleven bits give the characteristic, (10000000011)₂ = 1 + 2 + 2¹⁰ = 1027.
The exponent part is 2^{1027−1023} = 2⁴. The final 52 bits specify the mantissa

    f = (.101110010001)₂ = 1/2 + 1/2³ + 1/2⁴ + 1/2⁵ + 1/2⁸ + 1/2¹²

This number represents

    2⁴ (1 + f) = 27.56640625
Figure 1.1: 27.56640625 and its next smallest and next largest machine-number neighbors.
The original machine number represents not only 27.56640625, but also half of the real numbers that
lie between 27.56640625 and each of its two nearest machine-number neighbors (see Fig. 1.1).
Round-off errors:
Round-off errors arise because it is impossible to represent all real numbers exactly on a finite-state
machine (which is what all practical digital computers are).
On a pocket calculator, if one enters 0.0000000000001 (or the maximum number of zeros possible),
then a ’+’, and then 100000000000000 (again, the maximum number of zeros possible), one will obtain
the number 100000000000000 again, and not 100000000000000.0000000000001. The calculator’s
answer is incorrect because of round-off in the calculation.
Round-off errors in a computer¹
The most basic source of error in a computer is the error made in representing a real number
with a limited number of bits.
The machine epsilon, ε, is the interval between 1 and the next number greater than 1 that is
distinguishable from 1. This means that no number between 1 and 1 + ε can be represented in
the computer.
Machine epsilon can be found by the following program:
10 E=1
20 IF E+1>1 THEN PRINT E ELSE STOP
30 E=E/2: GOTO 20
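The same loop in the modern Fortran used elsewhere in these notes (a sketch; note that on some
machines extended-precision registers can make the printed value differ from the storage precision):

program macheps
implicit none
real :: e
e = 1.0
! halve e until 1 + e is no longer distinguishable from 1
do while (1.0 + e > 1.0)
print *, e
e = e/2.0
end do
end program macheps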
When numbers are added or subtracted, an accurate representation of the result may require many
more digits than either operand needs on its own. Serious amounts of round-off
error occur in two situations:
1. when adding (or subtracting) a very small number to (or from) a large number
2. when a number is subtracted from another that is very close
To test the first case on the computer, let us add 0.00001 to unity ten thousand times. The program
to do this job would be:
10 sum=1
20 for i=1 to 10000
30 sum=sum+0.00001
40 next
50 print*, sum
In binary, the exact sum 1 + 0.00001 is

    (0.10000 0000 0000 0000 1010 0111 1100 0101 1010 1100 . . .)₂ × 2¹

If only 24 bits are available for the mantissa, this is chopped to

    (0.10000 0000 0000 0000 1010 011)₂ × 2¹

so each addition of 0.00001 carries a small representation error, and over ten thousand additions
these errors accumulate.

¹ Applied Numerical Methods with Software, Shoichiro Nakamura.
Round-off error can be reduced by several techniques:
1. Double precision
2. Grouping
3. Taylor expansion
4. Changing the definition of a variable
5. Rewriting the equation to avoid subtractions
Example:
We want to add 0.00001 ten thousand times to unity by using:
(a) double precision
(b) the grouping method
Double precision:
10 SUM=1.0D0
20 DO I=1,10000
30 SUM=SUM+0.00001D0
40 END DO
50 PRINT *, SUM
Grouping method:
SUM=1
DO 47 I=1,100
TOTAL=0
DO 40 K=1,100
TOTAL=TOTAL+0.00001
40 CONTINUE
SUM=SUM+TOTAL
47 CONTINUE
PRINT *, SUM
Example: for small θ, the difference-quotient approximation to the derivative of sin x at x = 1,

    d = (sin(1 + θ) − sin(1))/θ,

becomes very poor because of round-off errors. By using the Taylor expansion we can write

    sin(1 + θ) = sin(1) + θ cos(1) − (θ²/2) sin(1) + O(θ³).

Therefore,

    d ≈ cos(1) − (θ/2) sin(1).
program testeps
implicit none
real :: d,da,t=1.0e0,h=10.0e0
integer :: i
do i=1,7
t=t/h
da=cos(1.0e0)-0.5e0*t*sin(1.0e0)
d=(sin(1.0e0+t)-sin(1.0e0))/t
print*,t,d,da
end do
end
   theta (t)      d (computed)    da (Taylor)
-------------------------------------
0.10000000E-00 0.49736413 0.49822873
9.99999978E-03 0.53608829 0.53609490
9.99999931E-04 0.53993475 0.53988153
9.99999902E-05 0.54062998 0.54026020
9.99999884E-06 0.54383242 0.54029804
9.99999884E-07 0.54327762 0.54030186
Laws of Arithmetic: Due to errors introduced in floating-point arithmetic, the associative and
distributive laws of arithmetic are not always satisfied; that is,

    x + (y + z) ≠ (x + y) + z
    x × (y × z) ≠ (x × y) × z
    x × (y + z) ≠ (x × y) + (x × z)
A number of the form ±0.d₁d₂...d_k × 10ⁿ, where 1 ≤ d₁ ≤ 9 and 0 ≤ dᵢ ≤ 9, is called a k-digit
decimal machine number. Any positive real number within the numerical range of the machine can be
normalized to this form.
Example: the number π = 0.314159... × 10¹. The floating-point form of π using five-digit chopping
is fl(π) = 0.31415 × 10¹ = 3.1415. The floating-point form of π using five-digit rounding is 3.1416,
because the sixth digit of the expansion of π is 9 ≥ 5.
Suppose p = 0.d₁d₂... × 10ⁿ and p* = 0.e₁e₂... × 10ⁿ agree in their first k digits. We say
that p and p* agree to k significant digits if |d_{k+1} − e_{k+1}| < 5; otherwise, we say they agree
to k − 1 significant digits.
Example: Let the true value p = 10/3 and the approximate value p* = 3.333.
The absolute error is |10/3 − 3.333| = 1/3000.
The relative error is (1/3000)/(10/3) = 1/10000 = 10⁻⁴ < 5 × 10⁻⁴.
The number of significant digits is 4.
Assume that the floating-point representations fl(x) and fl(y) are given for the real numbers x and y,
and that the symbols ⊕, ⊖, ⊗, ⊘ represent machine addition, subtraction, multiplication, and division,
respectively. The finite-digit arithmetic is given by, e.g., x ⊕ y = fl(fl(x) + fl(y)).
One of the most common errors involves the cancellation of significant digits due to the subtraction
of two nearly equal numbers. Suppose two nearly equal numbers x > y have the k-digit representations

    fl(x) = 0.α₁α₂...α_p α_{p+1}...α_k × 10ⁿ,
    fl(y) = 0.α₁α₂...α_p β_{p+1}...β_k × 10ⁿ,

whose leading p digits agree. Then

    fl(fl(x) − fl(y)) = (0.α_{p+1}...α_k − 0.β_{p+1}...β_k) × 10^{n−p}
                      = 0.σ_{p+1}σ_{p+2}...σ_k × 10^{n−p}.

The floating-point number used to represent x − y has at most k − p digits of significance. Any
further calculation involving x − y retains the problem of having only k − p digits of significance.
Loss of significance: Consider, for example, x* = 0.76545421 × 10¹ and y* = 0.76544200 × 10¹,
approximations to x and y, respectively, correct to seven significant digits. Then, in eight-digit
floating-point arithmetic, the difference is z* = x* − y* = 0.12210000 × 10⁻³. But as an approximation
to z = x − y it is good only to three digits, since from the fourth significant digit on, z* is derived
from the eighth digits of x* and y*, both possibly in error. Hence, while the error in z* is at most the
sum of the errors in x* and y*, the relative error in z* is possibly 10000 times the relative error in
x* and y*. Loss of significant digits is therefore dangerous only if we wish to keep the relative error
small.
We can also introduce error when dividing by a small number (or, equivalently, multiplying by a large
number). Suppose that the number z has a finite-digit approximation z + δ, where the error δ is
introduced by representation or by a previous calculation. If we divide by ε = 10⁻ⁿ, where n > 0, then

    z/ε ≈ fl( fl(z)/fl(ε) ) = (z + δ) × 10ⁿ,

so the absolute error in this approximation, |δ| × 10ⁿ, is the original absolute error |δ| multiplied
by the factor 10ⁿ.
Example:
Let p = 0.54617 and q = 0.54601. The exact value of r = p − q is 0.16 × 10⁻³. If we perform the
subtraction using 4-digit rounding we find p* = 0.5462 and q* = 0.5460, so r* = p* − q* = 0.2 × 10⁻³.
The relative error is

    |r − r*| / |r| = 0.25,

so r* has only one significant digit, whereas p* and q* were accurate to four and five significant
digits, respectively.
Example:
The quadratic formula states that the roots of ax² + bx + c = 0, when a ≠ 0, are

    x± = (−b ± √(b² − 4ac)) / (2a)                                   (1.24)

Using four-digit rounding arithmetic, consider this formula applied to x² + 62.1x + 1 = 0, whose
roots are approximately x₊ = −0.01610723 and x₋ = −62.08390. We can see that b² ≫ 4ac, so
the numerator of x₊ involves the subtraction of two nearly equal numbers: √(b² − 4ac) = 62.06, and we
get fl(x₊) = −0.02, which is a poor approximation to x₊ = −0.01611, with a relative error of about
2.4 × 10⁻¹. The other root fl(x₋) = −62.10 has a small relative error of around 3.2 × 10⁻⁴.
To obtain a more accurate approximation we can rationalize the numerator:

    x₊ = (−b + √(b² − 4ac)) / (2a)
       = (−b + √(b² − 4ac)) (b + √(b² − 4ac)) / (2a (b + √(b² − 4ac)))
       = −2c / (b + √(b² − 4ac)),

so we get fl(x₊) = −0.01610, which has the small relative error 6.2 × 10⁻⁴. We can also derive a
formula for x₋,

    x₋ = −2c / (b − √(b² − 4ac)).

In this case, however, fl(x₋) = −50.00, which has the large relative error 1.9 × 10⁻¹.
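A sketch of this example in single-precision Fortran (our own illustration, not from the textbook;
single precision here plays the role of the four-digit arithmetic and shows the same cancellation):

program quad_roots
implicit none
real :: b, c, d, x_naive, x_stable
b = 62.1
c = 1.0
d = sqrt(b*b - 4.0*c)           ! a = 1
! naive formula: subtracts two nearly equal numbers
x_naive = (-b + d)/2.0
! rationalized formula: no cancellation
x_stable = -2.0*c/(b + d)
print *, 'naive  x+ =', x_naive    ! suffers cancellation
print *, 'stable x+ =', x_stable   ! exact root is -0.01610723
end program quad_roots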
Example:
This example shows how we can avoid loss of significance. We want to evaluate f(x) = 1 − cos(x)
near zero in six-digit arithmetic. Since cos(x) ≈ 1 for x near zero, there will be loss of significant
digits if we first compute cos(x) and then subtract it from 1. Without loss of generality, assume that
x is close to zero with x > 0. For such x, cos(x) agrees with 1 in its first six digits, so if we use
rounding and the seventh significant digit a₇ of cos(x) satisfies a₇ ≥ 5, we cannot calculate the value
of 1 − cos(x) in six-digit arithmetic at all for x below such an x₀, because the rounded value of
1 − cos(x) is zero. For example, six-digit arithmetic gives 1 − cos(0.001) = 0.000000, while the true
value is 0.500000 × 10⁻⁶. To overcome this we can use another formula:

    1 − cos(x) = (1 − cos(x)) (1 + cos(x))/(1 + cos(x))
               = sin²(x)/(1 + cos(x)).

If we use this last equation we find, for x = 0.001,

    1 − cos(0.001) = sin²(0.001)/(1 + cos(0.001))
                   = (0.1 × 10⁻⁵)/2
                   = 0.5 × 10⁻⁶.

Alternatively, we can use the Maclaurin expansion

    1 − cos x ≈ x²/2 − x⁴/24 + ...

which gives

    1 − cos(0.001) ≈ (0.001)²/2 − (0.001)⁴/24 + ...
                   ≈ 0.5 × 10⁻⁶ − (0.1 × 10⁻¹¹)/24 + ...
                   = 0.5 × 10⁻⁶ − 0.416667 × 10⁻¹³ + ...
                   ≈ 0.5 × 10⁻⁶.
Example:
The value of the polynomial p(x) = 2x³ − 3x² + 5x − 4 at x = 3 can be calculated as:
⋆ x² = 9, x³ = 27, then we put everything together: p(3) = 54 − 27 + 15 − 4 = 38.
We have five multiplications (x², x³, 2x³, 3x², 5x) and
one addition and two subtractions: 8 operations in total.
⋆ The polynomial can be arranged in nested manner as p(x) = [(2x − 3)x + 5]x − 4.
We then need three multiplications, one addition, and two subtractions: six operations in total.
– In general, for a polynomial of degree n the first approach needs (n − 1) + n = 2n − 1 multiplications
(n − 1 for xⁿ, xⁿ⁻¹, ..., x², and n for the multiplications by coefficients, aₙ × xⁿ, aₙ₋₁ × xⁿ⁻¹, ..., a₁ × x),
whereas the nested form needs only n multiplications.
– Both need n addition/subtraction operations.
1.2.1 Exercises
Assignment: odd exercises from section 1.2, pages 26-29.
Tutorial: 1,
Exercise 1: p = π = 3.1415926..., and p* = 22/7 = 3.142857. The absolute error is

    |p − p*| = 3.142857 − 3.1415926...
    0.0012644 < |p − p*| < 0.0012645

If we round, the absolute error is 0.00126. The relative error satisfies

    0.0012644/3.1415927 < |p − p*|/|p| < 0.0012645/3.1415926

    4.0247 × 10⁻⁴ < |p − p*|/|p| < 4.0250 × 10⁻⁴

If we round, the relative error is 4.025 × 10⁻⁴.
The relative error in p*, as an approximation to p, is defined by α = |p − p*|/|p|. Note that this
number is close to |p − p*|/|p*| if α ≪ 1. One can show that

    |p − p*|/|p| = α  =⇒  |p − p*|/|p*| = α/|1 ± α| ≈ α              (1.25)
Assignment
Suppose two points (x₀, y₀) and (x₁, y₁) lie on a straight line with y₀ ≠ y₁. The x-intercept of the
line is given by

    x = (x₀y₁ − x₁y₀)/(y₁ − y₀)

or

    x = x₀ − (x₁ − x₀)y₀/(y₁ − y₀)

Group 1: Use the data (x₀, y₀) = (1.31, 3.24) and (x₁, y₁) = (1.93, 4.76) and three-digit rounding
arithmetic to compute the x-intercept both ways. Which method is better, and why?
Group 2: Use the data (x₀, y₀) = (0.2, 0.2) and (x₁, y₁) = (1.2, 1.01) and three-digit rounding
arithmetic to compute the x-intercept both ways. Which method is better, and why?
Solution
The two formulas, written as in the computer-algebra session,

    X1 := (−y1·x0 + x1·y0)/(−y1 + y0)
    X2 := x0 + y0·(x1 − x0)/(−y1 + y0)

give:

    Group 1                                  Group 2
    -----------------------------------      -----------------------------------
    x0 := 1.31                               x0 := 0.2
    y0 := 3.24                               y0 := 0.2
    x1 := 1.93                               x1 := 1.2
    y1 := 4.76                               y1 := 1.01
    Actual solution is −0.01157894737        Actual solution is −0.04691358025
    X1 := −0.00658                           X1 := −0.0469
    X2 := −0.01                              X2 := −0.047
Chapter 2

Solutions of Equations in One Variable

Finding the roots of a function f is very important in science and engineering, and it is not always
simple. Consider a polynomial f whose only real root is x = 1, evaluated in its expanded form. The
graph of the computed values of f is given in Fig. 2.1: it appears to show many roots of f, because
the computed values of f(x) oscillate between small positive and negative numbers (of the order of
10⁻¹⁴) near x = 1.

[Figure 2.1 plotted the computed values of f(x) for 0.97 ≤ x ≤ 1.03.]
Figure 2.1: The strange behavior of f(x) near x = 1 is due to round-off errors in the computation of
the expanded form of f(x).
2.1 Bisection Method
Definition: The first technique, based on the intermediate value theorem, is called the bisection
method. Suppose f ∈ C[a, b] with f(a) and f(b) of opposite signs.
To begin, set a₁ = a and b₁ = b, and let p₁ be the midpoint of the interval [a, b]; that is,

    p₁ = a₁ + (b₁ − a₁)/2 = (a₁ + b₁)/2                              (2.1)

If f(p₁) = 0, then the root of f(x) = 0 is p = p₁. If f(p₁) ≠ 0, then f(p₁) has the same sign as
either f(a₁) or f(b₁). When f(p₁) and f(a₁) have the same sign, p ∈ (p₁, b₁), and we set a₂ = p₁ and
b₂ = b₁. When f(p₁) and f(b₁) have the same sign, p ∈ (a₁, p₁), and we set a₂ = a₁ and b₂ = p₁. We
then reapply the process to the interval [a₂, b₂].
Algorithm:
INPUT: endpoints a, b; tolerance TOL; maximum number of iterations N₀.
OUTPUT: approximate solution p or message of failure.
Step 1: Set i = 1; FA = f(a).
In the implementation we just test sign(f(a)) · sign(f(b)) instead of f(a) · f(b), which gives the same
answer while avoiding possible overflow or underflow in the multiplication.
It is good practice to set an upper bound on the number of iterations; this eliminates the possibility
of entering an infinite loop.
It is good to choose the interval [a, b] as small as possible, to reduce the number of iterations.
Bisection is slow to converge: N may become quite large for a small tolerance.
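A sketch of the full algorithm in Fortran (our own implementation of the steps above, applied to
f(x) = x³ − 25 from Exercise 13 below):

program bisect
implicit none
real(8) :: a, b, p, fa, fp, tol
integer :: i, n0
a = 2.0d0; b = 3.0d0            ! f(2) = -17 < 0 < 2 = f(3)
tol = 1.0d-4; n0 = 100
fa = f(a)
do i = 1, n0
p = a + (b - a)/2.0d0           ! midpoint as a + correction, cf. eq. (2.11)
fp = f(p)
if (fp == 0.0d0 .or. (b - a)/2.0d0 < tol) exit
if (sign(1.0d0, fa) == sign(1.0d0, fp)) then
a = p; fa = fp                  ! root lies in (p, b)
else
b = p                           ! root lies in (a, p)
end if
end do
print *, 'root =', p, ' after', i, 'iterations'
contains
real(8) function f(x)
real(8), intent(in) :: x
f = x**3 - 25.0d0               ! Exercise 13: cube root of 25
end function f
end program bisect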
Theorem: Suppose that f ∈ C[a, b] and f(a) · f(b) < 0. The bisection method generates a sequence
{pₙ} approximating a zero p of f with

    |pₙ − p| ≤ (b − a)/2ⁿ ,  n ≥ 1.                                  (2.6)

Proof: For each n ≥ 1, we have

    b₁ = b ,  a₁ = a                                                 (2.7)

and

    bₙ − aₙ = (b − a)/2ⁿ⁻¹                                           (2.9)

Since pₙ is the midpoint of [aₙ, bₙ], we also have

    |p − pₙ| ≤ (bₙ − aₙ)/2 = (b − a)/2ⁿ                              (2.10)
which shows that the sequence pn converges to p.
The bound on the number of iterations assumes the calculations are performed in infinite-digit
arithmetic. When implementing the method on a computer, we have to consider round-off error. For
example, the midpoint of [aₙ, bₙ] should be computed as

    pₙ = aₙ + (bₙ − aₙ)/2                                            (2.11)

instead of by the algebraically equivalent equation

    pₙ = (aₙ + bₙ)/2                                                 (2.12)

The first equation adds a small correction, (bₙ − aₙ)/2, to the known value aₙ; if bₙ − aₙ is near the
maximum precision of the machine, this correction will not significantly perturb pₙ. However,
(aₙ + bₙ)/2 may return a midpoint that is not even in the interval [aₙ, bₙ].
Exercises:
Odd numbers of sec. 2.1, pages 51-52.
– Ex 13:
We want an approximate value of ∛25 correct to within 10⁻⁴.
Consider the function f(x) = x³ − 25 on the interval [2, 3]. We have f(2) = −17
and f(3) = 2. The two values have different signs, so we can apply the bisection method.
 n   aₙ          bₙ          pₙ          bₙ − aₙ       f(pₙ)
 1   2           3           2.5         1             −9.3750
 2   2.5         3           2.75        0.5           −4.2031
 3   2.75        3           2.8750      0.25          −1.2363
 4   2.875       3           2.93750     0.125         +0.34741
 5   2.8750      2.93750     2.906250    0.0625        −0.45297
 6   2.90625     2.93750     2.921875    0.031250      −0.05492
 7   2.921875    2.937500    2.929688    0.015625      +0.145710
 8   2.9218750   2.9296875   2.9257812   0.0078125     +0.0452607
 9   2.9218750   2.9257812   2.9238281   0.0039062     −0.0048632
10   2.9238      2.9258      2.9248      1.9531E−03    +2.0190E−02
11   2.9238      2.9248      2.9243      9.7656E−04    +7.6615E−03
12   2.9238      2.9243      2.9241      4.8828E−04    +1.3986E−03
13   2.9238      2.9241      2.9240      2.4414E−04    −1.7324E−03
14   2.9240      2.9241      2.9240      1.2207E−04    −1.6692E−04
So, the approximate value of ∛25 is p₁₄ = 2.9240, because (b₁₄ − a₁₄)/2 = 6.1035E−05 and
(b₁₃ − a₁₃)/2 = 1.2207E−04. If we use the Theorem:

    |pₙ − p| ≤ (b − a)/2ⁿ < 10⁻⁴                                     (2.13)

we find that

    1/2ⁿ < 10⁻⁴  ⇒  −n log 2 < −4  ⇒  n > 4/log 2 = 13.288

So n should be at least 14.
– Ex 18:
The function f(x) = sin(πx) has zeros at every integer. We want to show that when −1 < a < 0
and 2 < b < 3, the bisection method converges to:

    0  if a + b < 2
    2  if a + b > 2
    1  if a + b = 2

We have to check the signs of sin(aπ), sin((a + b)π/2), and sin(bπ) at each iteration.
For the starting interval we have:
for a ∈ (−1, 0), sin(aπ) ∈ [−1, 0),
and for b ∈ (2, 3), sin(bπ) ∈ (0, 1].
So the bisection method can be applied on the interval [a, b].
The only roots we can obtain are 0, 1, or 2, because these are the only integers belonging to (−1, 3).
Next, we check the sign of sin((a + b)π/2). We know that a + b ∈ (1, 3).

* If a + b < 2 we have 0.5 < p = (a + b)/2 < 1 and sin(pπ) > 0,
  so sin(aπ) < 0 and sin(pπ) > 0 have different signs, and the only root between a and p is 0.
  Therefore the bisection method converges to 0.
* If a + b > 2 we have 1 < p = (a + b)/2 < 1.5 and sin(pπ) < 0,
  so sin(pπ) < 0 and sin(bπ) > 0 have different signs, and the only root between p and b is 2.
  Therefore the bisection method converges to 2.
* If a + b = 2 we have p = 1 and sin(pπ) = 0,
  which of course is the root 1.
2.2 Fixed-Point Iteration

A number p is a fixed point of a function g if

    g(p) = p                                                         (2.14)

Theorem: If g ∈ C[a, b] and g(x) ∈ [a, b] for all x ∈ [a, b], then g has a fixed point in [a, b].
If, in addition, g′(x) exists on (a, b) and a positive constant k < 1 exists with |g′(x)| ≤ k for all
x ∈ (a, b), then the fixed point is unique.
Proof sketch: consider f(x) = g(x) − x. Then f(a) = g(a) − a ≥ 0 and f(b) = g(b) − b ≤ 0, so,
according to the intermediate value theorem, f(x) = 0 has a root. Thus g(x) = x has a solution;
therefore g has a fixed point.
Assume now that there is more than one fixed point, say p and q with p ≠ q. We know from the
mean value theorem that there exists ξ between p and q such that

    g′(ξ) = (g(p) − g(q))/(p − q) = (p − q)/(p − q) = 1,

which contradicts the fact that |g′(x)| ≤ k < 1. So there is only one fixed point.
What about the case |g′(x)| > 1: is there such a function (still mapping [a, b] into itself)?
Indeed, if |g′(x)| ≠ 1 for all x, the only possible case is |g′(x)| < 1.
Proof: Suppose that |g′(x)| ≠ 1 for all x ∈ (a, b). Because g(x) ∈ [a, b] and g′(x) exists, one of
g′(x) < −1 everywhere, −1 < g′(x) < 1 everywhere, or g′(x) > 1 everywhere must hold.
The first and last cases are impossible because, by the mean value theorem and g(a), g(b) ∈ [a, b],

    −1 ≤ (g(b) − g(a))/(b − a) ≤ 1                                   (2.17)

So for the case |g′(x)| > 1 no such function exists.
Algorithm:
INPUT: initial approximation p₀; tolerance TOL; maximum number of iterations N₀.
OUTPUT: approximate solution p or message of failure.
Step 1: Set i = 1.
Step 2: While i ≤ N₀ do Steps 3-6.
Fixed-point theorem: Let g ∈ C[a, b] with g(x) ∈ [a, b] for all x ∈ [a, b]. Suppose, in addition, that
g′(x) exists on (a, b) and a constant 0 < k < 1 exists with |g′(x)| ≤ k for all x ∈ (a, b).
Then for any p₀ ∈ [a, b], the sequence pₙ = g(pₙ₋₁) converges to the unique fixed point p ∈ [a, b].
Proof:
If g satisfies the fixed-point theorem, then bounds for the error involved in using pₙ to approximate
p are given by

    |p − pₙ| ≤ kⁿ max{p₀ − a, b − p₀}                                (2.19)

    |p − pₙ| ≤ kⁿ/(1 − k) |p₁ − p₀|                                  (2.20)
proof:
Exercises:
– Ex 5: We use the fixed-point iteration method to determine a solution accurate to within 10⁻² for
x⁴ − 3x² − 3 = 0 on [1, 2], with p₀ = 1.
Solution:
To use fixed-point iteration we have to find a function g(x) which satisfies the fixed-point theorem.
The equation x⁴ − 3x² − 3 = 0 leads to x⁴ = 3x² + 3, which in turn leads to x = (3x² + 3)^{1/4}.
Now we check whether the function g(x) = (3x² + 3)^{1/4} satisfies the fixed-point theorem.

    g′(x) = (3/2) x/(3x² + 3)^{3/4}

The derivative is always positive in the region [1, 2], so g is increasing, with g(1) = 1.565084580 and
g(2) = 1.967989671. Therefore g(x) ∈ [1, 2]. Moreover,

    g′(x) = (3/2) x/(3x² + 3)^{3/4} ≤ (3/2) · 2/(3 · 1² + 3)^{3/4} ≤ 0.7825422900 < 1

So the function g(x) satisfies the fixed-point theorem. More precisely, the second derivative
is given by

    g″(x) = −(3/4) (x² − 2)/((x² + 1)(3x² + 3)^{3/4})

which vanishes at x = √2 in [1, 2]. We have g′(1) = 0.3912711450, g′(2) = 0.3935979342, and
g′(√2) = 0.4082482906, so g′(x) ≤ 0.4082482906 < 1. Therefore we can take k = 0.4082482906.
According to the theorem we have

    |p − pₙ| ≤ kⁿ max{p₀ − a, b − p₀} = 0.4082482906ⁿ · 1 ≤ 10⁻²

which implies that n ≥ 6. The answer is p₆ = 1.943316930, accurate to within 10⁻².
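A sketch of this iteration in Fortran (an illustration of ours; it reproduces p₁ = 1.565084580 and
p₆ = 1.943316930 above):

program fixedpoint
implicit none
real(8) :: p0, p
integer :: n
p0 = 1.0d0                      ! starting point of Exercise 5
do n = 1, 6
p = (3.0d0*p0**2 + 3.0d0)**0.25d0   ! g(x) = (3x^2 + 3)^(1/4)
print '(a,i1,a,f13.9)', ' p', n, ' =', p
p0 = p
end do
end program fixedpoint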
– Ex 7:
We want to show that the function g(x) = π + 0.5 sin(x/2) has a unique fixed point on [0, 2π].
We know that

    0 < π − 0.5 ≤ g(x) ≤ π + 0.5 < 2π                                (2.21)

The derivative of the function g is given by

    g′(x) = cos(x/2)/4 ≤ 1/4                                         (2.22)

Therefore the function has a unique fixed point. To find an approximation to the fixed point
accurate to within 10⁻², let us estimate the number of iterations, using p₀ = π:

    |p − pₙ| ≤ (1/4)ⁿ max{p₀ − 0, 2π − p₀} = (1/4)ⁿ π ≤ 10⁻²         (2.23)

We have p₁ = g(π) = π + 1/2, so the sharper bound (2.20) becomes

    |p − pₙ| ≤ (1/4)ⁿ/(3/4) · (1/2) = (2/3)(1/4)ⁿ ≤ 10⁻²             (2.25)

This leads to n ≥ 4 iterations. Therefore:
p0 = 3.141592654
p1 = 3.641592654
p2 = 3.626048865
p3 = 3.626995623
p4 = 3.626938795
p5 = 3.626942209
The theorem states that, under reasonable assumptions, Newton's method converges provided a
sufficiently accurate initial approximation is chosen. In practice, the theorem does not tell us how to
calculate δ. In general, either the method converges quickly or it will be clear that convergence is
unlikely.
Newton's method is a powerful technique, but it has a major weakness: the need for the first
derivative. The calculation of the first derivative f′(x) needs more arithmetic operations than f(x).
To avoid the derivative, start from

    f′(pₙ₋₁) = lim_{x→pₙ₋₁} (f(x) − f(pₙ₋₁))/(x − pₙ₋₁)              (2.30)

Letting x = pₙ₋₂, we have

    f′(pₙ₋₁) ≈ (f(pₙ₋₂) − f(pₙ₋₁))/(pₙ₋₂ − pₙ₋₁)                     (2.31)

Using this last equation in Newton's method we get

    pₙ = pₙ₋₁ − f(pₙ₋₁) (pₙ₋₁ − pₙ₋₂)/(f(pₙ₋₁) − f(pₙ₋₂))            (2.32)

This is called the Secant Method. Starting with two initial approximations p₀ and p₁, the
approximation p₂ is the x-intercept of the line joining the two points (p₀, f(p₀)) and (p₁, f(p₁)).
The approximation p₃ is the x-intercept of the line joining (p₁, f(p₁)) and (p₂, f(p₂)) (see Fig. 2.2).
[Figure 2.2 showed the iterates p₀, p₁, p₂, p₃, p₄ for the two methods.]
Figure 2.2: Secant method and false position method for finding the root of f(x) = 0.
False position method: this generates approximations in the same way as the secant method, but
includes a test to ensure that the root is always bracketed between successive iterations. The
method is generally not recommended; it is presented only to illustrate how bracketing can be
incorporated.
First we choose two approximations p₀ and p₁ such that f(p₀) · f(p₁) < 0. The approximation p₂ is
chosen in the same manner as in the secant method, as the x-intercept of the line joining (p₀, f(p₀))
and (p₁, f(p₁)). To decide which secant line to use to compute p₃, we check the sign of f(p₁) · f(p₂).
If it is negative, then p₁ and p₂ bracket a root, and we choose p₃ as the x-intercept of the line
joining (p₁, f(p₁)) and (p₂, f(p₂)). If not, we choose p₃ as the x-intercept of the line joining
(p₀, f(p₀)) and (p₂, f(p₂)). In a similar manner we find pₙ for n ≥ 4.
Exercises
Ex 5: We want to use Newton's method to find an approximate solution, accurate to within 10⁻⁴, of the
equation x³ − 2x² − 5 = 0 (the iterates below are consistent with this polynomial, which is revisited
in Section 4.7).
Solution:
Newton's method is:

    pₙ = pₙ₋₁ − f(pₙ₋₁)/f′(pₙ₋₁) ,  n ≥ 1                            (2.33)

The approximation is accurate to the places in which pₙ₋₁ and pₙ agree.
With a suitable starting point, Newton's method gives p₄ = 2.690647448;
if we use p₀ = 2 we get p₅ = 2.690647448.
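A sketch of Newton's method in Fortran for this equation (our own implementation, not the textbook's
algorithm listing):

program newton
implicit none
real(8) :: p0, p
integer :: n
p0 = 2.0d0
do n = 1, 20
p = p0 - f(p0)/fp(p0)           ! Newton step
print '(a,i2,a,f13.9)', ' p', n, ' =', p
if (abs(p - p0) < 1.0d-4) exit  ! successive iterates agree to 10^-4
p0 = p
end do
contains
real(8) function f(x)
real(8), intent(in) :: x
f = x**3 - 2.0d0*x**2 - 5.0d0
end function f
real(8) function fp(x)
real(8), intent(in) :: x
fp = 3.0d0*x**2 - 4.0d0*x       ! f'(x)
end function fp
end program newton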
Chapter 3

Order of Convergence

Recall that a sequence {pₙ} that converges to p has order of convergence α, with asymptotic error
constant λ, if

    lim_{n→∞} |pₙ₊₁ − p|/|pₙ − p|^α = λ.

In general, a sequence with a high order of convergence converges more rapidly than a sequence with
a lower order. The asymptotic constant affects the speed of convergence but is not as important as
the order.
Example: compare a linearly convergent sequence with |pₙ| ≈ (0.5)ⁿ to a quadratically convergent one
with |p̃ₙ| ≈ (0.5)^{2ⁿ−1}. After seven iterations we get

    |p₇| ≈ 0.78125 × 10⁻²    |p̃₇| ≈ 0.58775 × 10⁻³⁸

For the linearly convergent sequence to reach the same accuracy as the quadratically convergent one we
would need

    (0.5)ⁿ = (0.5)^{2⁷−1}  ⇒  n = 2⁷ − 1 = 127
Theorem: Let g ∈ C[a, b] be such that g(x) ∈ [a, b] for all x ∈ [a, b]. Suppose, in addition, that g′
is continuous on (a, b) and that a positive constant k < 1 exists with |g′(x)| ≤ k for all x ∈ (a, b).
If g′(p) ≠ 0, then for any p₀ ∈ [a, b], the sequence

    pₙ = g(pₙ₋₁) ,  n ≥ 1,                                           (3.2)

converges only linearly to the unique fixed point p in [a, b].
proof:
Theorem: Let p be a solution of the equation x = g(x). Suppose that g′(p) = 0 and g″ is continuous
with |g″(x)| < M on an open interval I containing p. Then there exists a number δ > 0 such that, for
p₀ ∈ [p − δ, p + δ], the sequence defined by pₙ = g(pₙ₋₁), n ≥ 1, converges at least quadratically
to p. Moreover, for sufficiently large values of n,

    |pₙ₊₁ − p| < (M/2) |pₙ − p|².                                    (3.3)
Proof:
The easiest way to construct a fixed-point problem associated with a root-finding problem f(x) = 0
is to subtract a multiple of f(x) from x:

    pₙ = g(pₙ₋₁), with g(x) = x − φ(x) f(x),

where φ(x) is a differentiable function that will be chosen later.
If p satisfies f(p) = 0, then it is clear that g(p) = p.
For the iteration procedure derived from g to be quadratically convergent, we need g′(p) = 0 when
f(p) = 0. Since

    g′(x) = 1 − φ′(x) f(x) − φ(x) f′(x)

we have

    g′(p) = 1 − φ′(p) f(p) − φ(p) f′(p) = 1 − φ(p) f′(p)

which implies

    φ(p) = 1/f′(p)

If we let φ(x) = 1/f′(x), we ensure that φ(p) = 1/f′(p) and produce the quadratically convergent
procedure

    pₙ = g(pₙ₋₁) = pₙ₋₁ − f(pₙ₋₁)/f′(pₙ₋₁)                           (3.4)

This is of course Newton's method, which is quadratically convergent provided that f′(pₙ₋₁) ≠ 0.
3.2 Zero multiplicity
Definition: A solution p of f(x) = 0 is a zero of multiplicity m of f if for x ≠ p we can write
f(x) = (x − p)ᵐ q(x), where lim_{x→p} q(x) ≠ 0.
Theorem: f ∈ C¹[a, b] has a simple zero at p in (a, b) if and only if f(p) = 0 but f′(p) ≠ 0.
Proof:
If p is a simple root of f, then Newton's method converges quadratically. If p is not a simple root,
then Newton's method may not converge quadratically (see Example 2, page 79).
Theorem: The function f ∈ Cᵐ[a, b] has a zero of multiplicity m at p in (a, b) if and only if

    0 = f(p) = f′(p) = ... = f⁽ᵐ⁻¹⁾(p) ,  but  f⁽ᵐ⁾(p) ≠ 0.
3.3 Exercises
Ex 6: Show that the following sequences converge linearly to p = 0.
How large must n be before |p − pₙ| ≤ 5 × 10⁻²?
a) pₙ = 1/n.  b) qₙ = 1/n².
For (a), 1/n ≤ 5 × 10⁻² implies n ≥ 20. For (b), 1/n² ≤ 5 × 10⁻² implies
n² ≥ 20, which in turn gives n ≥ 5.
Ex 8a: We want to show that the sequence pₙ = 10^{−2ⁿ} converges quadratically to 0. Indeed,

    |pₙ₊₁ − 0|/|pₙ − 0|² = 10^{−2ⁿ⁺¹}/(10^{−2ⁿ})² = 10^{−2ⁿ⁺¹}/10^{−2ⁿ⁺¹} = 1,

so the sequence converges quadratically with λ = 1. For a sequence of the form pₙ = 10^{−nᵏ}, by
contrast, the corresponding ratio 10^{2nᵏ−(n+1)ᵏ} grows without bound:
we cannot find a positive number λ; therefore such a sequence does not converge quadratically.
Chapter 4

Accelerating Convergence

4.1 Aitken's ∆² method
If {pₙ} converges linearly to p, the sequence

    p̂ₙ = pₙ − (pₙ₊₁ − pₙ)²/(pₙ₊₂ − 2pₙ₊₁ + pₙ)
        = pₙ − (∆pₙ)²/∆²pₙ   for n ≥ 0                               (4.1)

converges more rapidly to p than does the original sequence {pₙ}. The symbol ∆pₙ is the forward
difference, defined by

    ∆pₙ = pₙ₊₁ − pₙ ,  ∆ᵏpₙ = ∆(∆ᵏ⁻¹pₙ) for k ≥ 2.

For example,

    ∆²pₙ = ∆(∆pₙ) = ∆(pₙ₊₁ − pₙ) = (pₙ₊₂ − pₙ₊₁) − (pₙ₊₁ − pₙ) = pₙ₊₂ − 2pₙ₊₁ + pₙ   (4.6)
Theorem: Suppose that {pₙ} is a sequence that converges linearly to the limit p, and that

    lim_{n→∞} (pₙ₊₁ − p)/(pₙ − p) < 1                                (4.7)

Then the Aitken sequence

    p̂ₙ = pₙ − (∆pₙ)²/∆²pₙ                                           (4.8)

converges to p faster than {pₙ}, in the sense that

    lim_{n→∞} (p̂ₙ − p)/(pₙ − p) = 0                                 (4.9)
Example:
Consider pₙ = cos(1/n). This sequence converges linearly to p = 1: treating n as a continuous
variable and applying l'Hôpital's rule twice,

    lim_{n→∞} (cos(1/(n+1)) − 1)/(cos(1/n) − 1)
        = lim_{n→∞} [n² sin(1/(n+1))] / [(n+1)² sin(1/n)]
        = lim_{n→∞} sin(1/(n+1))/sin(1/n)
        = lim_{n→∞} [n² cos(1/(n+1))] / [(n+1)² cos(1/n)]
        = 1

Applying Aitken's method,

    p̂ₙ = pₙ − (∆pₙ)²/∆²pₙ = pₙ − (pₙ₊₁ − pₙ)²/(pₙ₊₂ − 2pₙ₊₁ + pₙ)   (4.10)

we obtain:
n pn p̂n
1 0.5403023059 0.9617750599
2 0.8775825619 0.9821293535
3 0.9449569463 0.9897855148
4 0.9689124217 0.9934156481
5 0.9800665778 0.9954099422
6 0.9861432316 0.9966199575
7 0.9898132604 0.9974083190
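A sketch in Fortran (ours, not from the textbook) that generates pₙ = cos(1/n) and the accelerated
values p̂ₙ of the table above:

program aitken
implicit none
real(8) :: p(1:9), phat
integer :: n
do n = 1, 9
p(n) = cos(1.0d0/dble(n))       ! linearly convergent p_n = cos(1/n)
end do
do n = 1, 7
! Aitken's Delta^2 formula, eq. (4.10)
phat = p(n) - (p(n+1) - p(n))**2/(p(n+2) - 2.0d0*p(n+1) + p(n))
print '(i3,2f15.10)', n, p(n), phat
end do
end program aitken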
Example:
The function f(x) = x³ − 3x + 2 = (x − 1)²(x + 2) has a double root p = 1. If Newton's method
converges to p = 1, it converges only linearly. We choose p₀ = 2. Newton's method produces the
following sequence:

    p₀ = 2 ,  pₙ₊₁ = pₙ − (pₙ³ − 3pₙ + 2)/(3pₙ² − 3)
     n   pₙ                      (pₙ − 1)/(pₙ₋₁ − 1)
1 1.555555555555556 0.5555555560
2 1.297906602254429 0.5362318832
3 1.155390199213768 0.5216071009
4 1.079562210414361 0.5120156259
5 1.040288435171017 0.5063765197
6 1.020276809786733 0.5032910809
7 1.010172323431420 0.5016727483
8 1.005094741093272 0.5008434160
9 1.002549528082823 0.5004234759
10 1.001275305026243 0.5002121961
11 1.000637787960288 0.5001062491
12 1.000318927867152 0.5000533092
13 1.000159472408516 0.5000250840
14 1.000079738323218 0.5000125414
15 1.000039869690520 0.5000125411
It is clear that Newton's method converges only linearly (slowly) to p = 1. Let us apply
Aitken's acceleration process to the sequence pₙ of iterates generated by Newton's method,

    p̂ₙ = pₙ − (pₙ₊₁ − pₙ)²/(pₙ₊₂ − 2pₙ₊₁ + pₙ)                      (4.11)

which gives:
 n   pₙ             p̂ₙ             pₙ − 1         p̂ₙ − 1
 0   2.0            0.9425287356   1.0            −0.0574712644
 1   1.555555556    0.9789767949   0.555555556    −0.0210232051
 2   1.297906602    0.9933420783   0.297906602    −0.0066579217
 3   1.155390199    0.9980927682   0.155390199    −0.0019072318
 4   1.079562210    0.9994865474   0.079562210    −0.0005134526
 5   1.040288435    0.9998665586   0.040288435    −0.0001334414
 6   1.020276810    0.9999659695   0.020276810    −0.0000340305
 7   1.010172323    0.9999914062   0.010172323    −0.0000085938
 8   1.005094741    0.9999978406   0.005094741    −0.0000021594
 9   1.002549528    0.9999994588   0.002549528    −5.412 × 10⁻⁷
10   1.001275305    0.9999998645   0.001275305    −1.355 × 10⁻⁷
11   1.000637788    0.9999999661   0.000637788    −3.39 × 10⁻⁸
12   1.000318928    0.9999999915   0.000318928    −8.5 × 10⁻⁹
13   1.000159472    0.9999999979   0.000159472    −2.1 × 10⁻⁹
14   1.000079738    **             0.000079738    **
15   1.000039870    **             0.000039870    **
4.2 Steffensen's Method
Applying Aitken's ∆² method directly to a linearly convergent fixed-point iteration constructs the
terms in the order

    p₀, p₁ = g(p₀), p₂ = g(p₁), p̂₀ = {∆²}(p₀), p₃ = g(p₂), p̂₁ = {∆²}(p₁), ...          (4.12)

where {∆²} indicates that Aitken's method, eq. (4.10), is used. Steffensen's method constructs
the same first four terms p₀, p₁, p₂, and p̂₀. However, at this step it assumes that p̂₀ is a better
approximation to p than p₂, and applies fixed-point iteration to p̂₀ instead of p₂. This leads to the
following sequence:

    p₀⁽⁰⁾, p₁⁽⁰⁾ = g(p₀⁽⁰⁾), p₂⁽⁰⁾ = g(p₁⁽⁰⁾), p₀⁽¹⁾ = {∆²}(p₀⁽⁰⁾), p₁⁽¹⁾ = g(p₀⁽¹⁾), ...   (4.13)

Note that the denominator ∆²pₙ can be zero at some iteration. If this occurs, we terminate the
sequence and select the last approximation computed before the zero denominator¹.
Ex 3, page 86:
Let g(x) = cos(x − 1) and p₀⁽⁰⁾ = 2. We want to use Steffensen's method to get p₀⁽¹⁾:

    p₀⁽⁰⁾ = 2
    p₁⁽⁰⁾ = cos(2 − 1) = 0.5403023059
    p₂⁽⁰⁾ = cos(0.5403023059 − 1) = 0.8961866647

    p₀⁽¹⁾ = p₀⁽⁰⁾ − (p₁⁽⁰⁾ − p₀⁽⁰⁾)²/(p₂⁽⁰⁾ − 2p₁⁽⁰⁾ + p₀⁽⁰⁾) = 0.826427396
Ex 4, page 86:
Let g(x) = 1 + (sin x)² and p₀⁽⁰⁾ = 1. We want to use Steffensen's method to get p₀⁽¹⁾ and p₀⁽²⁾:

    p₀⁽⁰⁾ = 1
    p₁⁽⁰⁾ = 1 + (sin 1)² = 1.708073418
    p₂⁽⁰⁾ = 1 + (sin 1.708073418)² = 1.981273081

    p₀⁽¹⁾ = p₀⁽⁰⁾ − (p₁⁽⁰⁾ − p₀⁽⁰⁾)²/(p₂⁽⁰⁾ − 2p₁⁽⁰⁾ + p₀⁽⁰⁾) = 2.152904629

To calculate p₀⁽²⁾ we start with:

    p₀⁽¹⁾ = 2.152904629
    p₁⁽¹⁾ = 1 + (sin 2.152904629)² = 1.697735097
    p₂⁽¹⁾ = 1 + (sin 1.697735097)² = 1.983972911

    p₀⁽²⁾ = p₀⁽¹⁾ − (p₁⁽¹⁾ − p₀⁽¹⁾)²/(p₂⁽¹⁾ − 2p₁⁽¹⁾ + p₀⁽¹⁾) = 1.873464043
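A sketch of Steffensen's method in Fortran for this exercise (our own implementation; it reproduces
2.152904629 and 1.873464043 above):

program steffensen
implicit none
real(8) :: p0, p1, p2
integer :: k
p0 = 1.0d0                      ! Ex 4: g(x) = 1 + sin(x)**2
do k = 1, 5
p1 = g(p0)
p2 = g(p1)
! one Aitken step, then restart the fixed-point iteration from it
p0 = p0 - (p1 - p0)**2/(p2 - 2.0d0*p1 + p0)
print '(a,i1,a,f13.9)', ' p0(', k, ') =', p0
end do
contains
real(8) function g(x)
real(8), intent(in) :: x
g = 1.0d0 + sin(x)**2
end function g
end program steffensen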
¹ See page 85 of the textbook.
4.3 Zeros of Polynomials
A polynomial of degree n has the form

    P(x) = aₙxⁿ + aₙ₋₁xⁿ⁻¹ + ... + a₁x + a₀ ,  aₙ ≠ 0.

Fundamental Theorem of Algebra: If P(x) is a polynomial of degree n ≥ 1, then P(x) = 0 has at least
one root (possibly complex).
If P(x) is a polynomial of degree n ≥ 1, then there exist unique constants x₁, x₂, ..., x_k, possibly
complex, and unique positive integers m₁, m₂, ..., m_k with Σ_{i=1}^k mᵢ = n, such that

    P(x) = aₙ(x − x₁)^{m₁}(x − x₂)^{m₂}...(x − x_k)^{m_k}

Let P(x) and Q(x) be polynomials of degree at most n. If x₁, x₂, ..., x_k, with k > n, are distinct
numbers with P(xᵢ) = Q(xᵢ) for i = 1, 2, ..., k, then P(x) = Q(x) for all values of x.
4.4 Horner's Method
Divide the polynomial Pₙ(x) by (x − x₁), giving a reduced polynomial Qₙ₋₁(x) of degree n − 1 and a
remainder R:

    Pₙ(x) = (x − x₁)Qₙ₋₁(x) + R                                      (4.16)

We can see that Pₙ(x₁) = R. If we differentiate Pₙ(x) we get

    Pₙ′(x) = Qₙ₋₁(x) + (x − x₁)Qₙ₋₁′(x)                              (4.17)

thus,

    Pₙ′(x₁) = Qₙ₋₁(x₁)                                               (4.18)

We can evaluate Qₙ₋₁(x₁) by a second division, whose remainder equals Qₙ₋₁(x₁), and so on. Now we
can write

    Pₙ(x) = aₙxⁿ + aₙ₋₁xⁿ⁻¹ + ... + a₁x + a₀
          = (x − x₁)Qₙ₋₁(x) + R
          = (x − x₁)(bₙ₋₁xⁿ⁻¹ + bₙ₋₂xⁿ⁻² + ... + b₁x + b₀) + R

Collecting the terms,

    Pₙ(x) = bₙ₋₁xⁿ + [bₙ₋₂ − x₁bₙ₋₁]xⁿ⁻¹ + [bₙ₋₃ − x₁bₙ₋₂]xⁿ⁻² + ...
            + [b₀ − x₁b₁]x + [R − x₁b₀]

By comparison we get:

    bₙ₋₁ = aₙ
    bₙ₋₂ = aₙ₋₁ + x₁bₙ₋₁
    ...
    bᵢ   = aᵢ₊₁ + x₁bᵢ₊₁
    ...
    b₀   = a₁ + x₁b₁

So the remainder can be evaluated from

    R = a₀ + x₁b₀

With a shifted indexing: if bₙ = aₙ and

    bₖ = aₖ + bₖ₊₁x₀ ,  for k = n − 1, n − 2, ..., 1, 0               (4.20)

then b₀ = P(x₀). Since k runs from n − 1 down to 0, only n multiplications and n additions are
needed to get P(x₀). Moreover, if

    Q(x) = bₙxⁿ⁻¹ + bₙ₋₁xⁿ⁻² + ... + b₂x + b₁                        (4.21)

then

    P(x) = (x − x₀)Q(x) + b₀                                         (4.22)

Proof:
Example:
We want to evaluate P(x) = 2x⁴ − 3x² + 3x − 4 at x₀ = −2 using Horner's method, and then find an
approximation to one of the zeros of P(x) using Newton's method, with synthetic division to evaluate
P(xₙ) and P′(xₙ) at each iterate xₙ.
At x₀ = −2 we use bₙ = aₙ and bₖ = aₖ + bₖ₊₁x₀ for k = n − 1 down to k = 0:

    x₀ = −2 |   2    0   −3    3   −4
            |       −4    8  −10   14
            |   2   −4    5   −7   10

so P(−2) = 10. Using the theorem P′(x₀) = Q(x₀), a second synthetic division of
Q(x) = 2x³ − 4x² + 5x − 7 gives

    x₀ = −2 |   2   −4    5   −7
            |       −4   16  −42
            |   2   −8   21  −49

so P′(−2) = Q(−2) = −49, and

    x₁ = x₀ − P(x₀)/Q(x₀) = −2 − 10/(−49) ≈ −1.796

Repeating the procedure with x₁ = −1.796 gives P(x₁) = 1.742 and P′(x₁) = −32.565, so
x₂ ≈ −1.7424. In a similar manner we get x₃ ≈ −1.73897 and x₄ ≈ −1.73896. An actual zero to five
decimal places is −1.73896.
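A sketch of Horner's method in Fortran (our own illustration), carrying both synthetic divisions at
once so that a single pass yields P(x₀) and P′(x₀):

program horner
implicit none
real(8) :: a(0:4), x0, p, q
integer :: k
a = [ -4.0d0, 3.0d0, -3.0d0, 0.0d0, 2.0d0 ]   ! a(k) multiplies x**k
x0 = -2.0d0
p = a(4)                        ! accumulates P(x0)
q = 0.0d0                       ! accumulates Q(x0) = P'(x0)
do k = 3, 0, -1
q = p + x0*q                    ! second synthetic division
p = a(k) + x0*p                 ! first synthetic division, eq. (4.20)
end do
print *, 'P(-2)  =', p          ! expected: 10
print *, "P'(-2) =", q          ! expected: -49
end program horner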
4.5 Deflation
If the N-th iterate, x_N, in Newton's method is an approximate zero of the polynomial P(x), then

    P(x) = (x − x_N)Q(x) + b₀ = (x − x_N)Q(x) + P(x_N) ≈ (x − x_N)Q(x)   (4.25)

so x − x_N is an approximate factor of P(x). Letting x̂₁ = x_N be the approximate zero of P and
Q₁(x) = Q(x) the approximate factor, we have

    P(x) ≈ (x − x̂₁)Q₁(x)                                            (4.26)

We can find a second approximate zero of P by applying Newton's method to Q₁(x). If P(x) is of
degree n, we can apply the procedure repeatedly to find x̂₂ and Q₂(x), ..., x̂ₙ₋₂ and Qₙ₋₂(x). After
finding n − 2 roots we are left with a quadratic, which we can solve to get the last two approximate
roots. This procedure is called deflation.
The accuracy difficulty with deflation is due to the fact that, when obtaining an approximate zero of
P(x), Newton's method is actually applied to the reduced polynomial Qₖ(x).
An approximate zero x̂ₖ₊₁ of Qₖ(x) will generally approximate a root of Qₖ(x), not of P(x). To
eliminate this, we can use the reduced equations to find the approximations x̂ᵢ, and then apply
Newton's method to the original polynomial P(x), using the x̂ᵢ as starting points.
One problem with applying the secant, false position, or Newton's methods to polynomials is the
possibility of complex roots. If the initial approximation is real, all subsequent approximations will
also be real. To overcome this problem we start with a complex initial approximation.
4.6 Müller's Method
Müller's method determines the next approximation p₃ from three previous ones by fitting a quadratic
P(x) = a(x − p₂)² + b(x − p₂) + c through (p₀, f(p₀)), (p₁, f(p₁)), (p₂, f(p₂)) and taking p₃ to be
the zero of P nearest p₂:

    p₃ − p₂ = −2c/(b ± √(b² − 4ac))                                  (4.29)

This form of the root has no problem with subtracting nearly equal numbers (see Example 5, Section 1.2).
The formula gives two roots; in Müller's method, the sign is chosen to agree with the sign of b, so

    p₃ = p₂ − 2c/(b + sgn(b)√(b² − 4ac))                             (4.30)

The method involves a square root, which means that complex approximations arise naturally, and
complex roots can be found using Müller's method.
4.7 Exercises
We want to find, accurate to within 10⁻⁴, all real zeros of the polynomial

    P(x) = x³ − 2x² − 5

using Newton's method.
Solution:
Descartes' rule of signs states that the number nₚ of positive zeros of a polynomial P(x)
is less than or equal to the number v of variations in sign of the coefficients of P(x); moreover, the
difference v − nₚ is a nonnegative even integer.
For our example, the number of variations in sign of the coefficients of P(x) is v = 1.
There is at most one positive root; moreover, 1 − nₚ ≥ 0 and even, which implies nₚ = 1. Therefore
there is exactly one positive root.
Now we change x → −x: P(−x) = −x³ − 2x² − 5 has no sign variations, so there are no negative roots.
We apply Newton's method with

    f(x) = x³ − 2x² − 5 ,  f′(x) = 3x² − 4x ,  p₀ = 2,

    pₙ₊₁ = pₙ − f(pₙ)/f′(pₙ) ,  n ≥ 0
n pn |pn − pn−1 |
1 3.250000000 1.250000000
2 2.811036789 0.438963211
3 2.697989503 0.113047286
4 2.690677153 0.007312350
5 2.690647448 0.000029705
6 2.690647448 0.000000001
Next we want to find all zeros of a quartic polynomial, by first finding the real zeros using Newton's
method and then deflating to polynomials of lower degree to determine the complex zeros. The
iterates below are consistent with P(x) = x⁴ + 5x³ − 9x² − 85x − 136.
According to Descartes' rule we have:
1. For positive zeros: the number of variations of sign is 1. Thus there is exactly one positive zero.
2. For negative zeros: the number of variations of sign of P(−x) is 3. Thus there are one or three
negative zeros.
Newton's method started at p₀ = 0 gives (columns: n, pₙ, |pₙ − pₙ₋₁|):

    1   −1.600000000   1.600000000
    2   −2.681394805   1.081394805
    3   −5.595348023   2.913953218
    4   −4.842605061   0.752742962
    5   −4.377210956   0.465394105
    6   −4.167343093   0.209867863
    7   −4.124721017   0.042622076
    8   −4.123107873   0.001613144
    9   −4.123105624   0.000002249

so one real zero is −4.123106. The positive zero is found in the same way; to within 10⁻⁵ it is
4.123106. Deflating P(x) by this positive zero with Horner's method gives

    b₄ = 1
    b₃ = 9.123105625
    b₂ = 28.61552812
    b₁ = 32.9848450
    b₀ = 0

that is, Q₁(x) = x³ + 9.123105625x² + 28.61552812x + 32.9848450. Newton's method applied to
Q₁(x) = 0 recovers the negative zero −4.123106, and Horner's method then gives the reduced polynomial

    b₃ = 1
    b₂ = 5
    b₁ = 8
    b₀ = 0

that is, Q₂(x) = x² + 5x + 8, whose roots are the two complex zeros (−5 ± i√7)/2.
Chapter 5

Interpolation

Consider data given as two columns, x and y. We plot the data and ask whether we can fit it with a
function y = f(x); this is what we call interpolation. We will study the case where we fit the data
with a polynomial.
Weierstrass approximation theorem: given f continuous on [a, b] and ε > 0, there exists a polynomial
P(x) with |f(x) − P(x)| < ε for all x in [a, b]. The proof of this theorem can be found in most
elementary textbooks on real analysis.
Taylor polynomials approximate a function well only near a specified point, whereas a good
interpolating polynomial needs to provide a relatively accurate approximation over an entire interval,
so the Taylor polynomial is not always appropriate for interpolation. As an example, approximating
f(x) = 1/x at x = 3 using Taylor polynomials Pₙ expanded about x = 1 leads to wildly inaccurate
results:

    n        0    1    2    3    4     5     6     7
    Pₙ(3)    1   −1    3   −5   11   −21    43   −85
Given two points (x₀, f(x₀)) and (x₁, f(x₁)), define

    P(x) = (x − x₁)/(x₀ − x₁) f(x₀) + (x − x₀)/(x₁ − x₀) f(x₁)

It is clear that the polynomial P(x) coincides with f(x) at x₀ and x₁, and it is the unique linear
function passing through (x₀, f(x₀)) and (x₁, f(x₁)).
Theorem: If x₀, x₁, ..., xₙ are n + 1 distinct numbers and f is a function whose values are given at
these numbers, then a unique polynomial P(x) of degree at most n exists with

    P(x) = Σ_{k=0}^n f(xₖ) Lₙ,ₖ(x)                                   (5.5)

with

    Lₙ,ₖ(x) = Π_{i=0, i≠k}^n (x − xᵢ)/(xₖ − xᵢ)                      (5.6)
Proof:
Theorem: Suppose x₀, x₁, ..., xₙ are n + 1 distinct numbers in [a, b] and f ∈ Cⁿ⁺¹[a, b]. Then

    f(x) = P(x) + f⁽ⁿ⁺¹⁾(ξ(x))/(n + 1)! (x − x₀)(x − x₁)...(x − xₙ)   (5.7)

where P(x) is the interpolating polynomial given by eq. (5.5) and ξ(x) ∈ (a, b).
Proof:
Definition: Let f be a function defined at x₀, x₁, ..., xₙ, and suppose that m₁, m₂, ..., mₖ are k
distinct integers with 0 ≤ mᵢ ≤ n for each i. The Lagrange polynomial that agrees with f(x) at the k
points x_{m₁}, x_{m₂}, ..., x_{mₖ} is denoted by P_{m₁m₂...mₖ}(x).
Theorem: Let f be defined at x₀, x₁, ..., xₖ, and let xⱼ ≠ xᵢ be two numbers from this set. Then

    P(x) = [(x − xⱼ) P_{0,1,...,j−1,j+1,...,k}(x) − (x − xᵢ) P_{0,1,...,i−1,i+1,...,k}(x)] / (xᵢ − xⱼ)

is the interpolating polynomial on all k + 1 points.
5.3 Neville's Method
This theorem implies that the interpolating polynomials can be generated recursively.
To avoid multiple subscripts, let Q_{i,j}, for 0 ≤ j ≤ i, denote the interpolating polynomial of
degree j on the (j + 1) numbers x_{i−j}, x_{i−j+1}, ..., x_{i−1}, xᵢ; that is, Q_{i,j} = P_{i−j,...,i}.
The computation proceeds row by row:

    x₀   P₀ = Q₀,₀
    x₁   P₁ = Q₁,₀   P₀,₁ = Q₁,₁
    x₂   P₂ = Q₂,₀   P₁,₂ = Q₂,₁   P₀,₁,₂ = Q₂,₂                     (5.10)
    x₃   P₃ = Q₃,₀   P₂,₃ = Q₃,₁   P₁,₂,₃ = Q₃,₂   P₀,₁,₂,₃ = Q₃,₃
    x₄   P₄ = Q₄,₀   P₃,₄ = Q₄,₁   P₂,₃,₄ = Q₄,₂   P₁,₂,₃,₄ = Q₄,₃   P₀,₁,₂,₃,₄ = Q₄,₄
Example:
Suppose function f is given for the following values:
x f (x)
x0 = 1.0 0.7651977
x1 = 1.3 0.6200860
x2 = 1.6 0.4554022
x3 = 1.9 0.2818186
x4 = 2.2 0.1103623
We want to approximate f(1.5) using various interpolating polynomials. By Neville's method,
eq. (5.12), we can calculate the Q_{i,j} at x = 1.5:

    Q₀,₀ = P₀ = 0.7651977
    Q₁,₀ = P₁ = 0.6200860
    Q₁,₁ = P₀,₁ = [(x − x₀)Q₁,₀ − (x − x₁)Q₀,₀]/(x₁ − x₀) = 0.5233449
    Q₂,₀ = P₂ = 0.4554022
    Q₂,₁ = P₁,₂ = [(x − x₁)Q₂,₀ − (x − x₂)Q₁,₀]/(x₂ − x₁) = 0.5102968
    Q₂,₂ = P₀,₁,₂ = [(x − x₀)Q₂,₁ − (x − x₂)Q₁,₁]/(x₂ − x₀) = 0.5124715
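A sketch of Neville's method in Fortran for this table (our own implementation of the recurrence
above; the full triangular tableau is printed):

program neville
implicit none
real(8) :: x(0:4), q(0:4,0:4), xx
integer :: i, j
x = [1.0d0, 1.3d0, 1.6d0, 1.9d0, 2.2d0]
q(:,0) = [0.7651977d0, 0.6200860d0, 0.4554022d0, 0.2818186d0, 0.1103623d0]
xx = 1.5d0
do i = 1, 4
do j = 1, i
! Q(i,j) = [(x - x_{i-j}) Q(i,j-1) - (x - x_i) Q(i-1,j-1)] / (x_i - x_{i-j})
q(i,j) = ((xx - x(i-j))*q(i,j-1) - (xx - x(i))*q(i-1,j-1))/(x(i) - x(i-j))
end do
end do
do i = 0, 4
print '(5f13.7)', (q(i,j), j = 0, i)
end do
end program neville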
assignments:
study example 6 page 113.
5.4 Newton Interpolating Polynomial
Suppose there is a known polynomial Pₙ₋₁(x) that interpolates the data set (xᵢ, yᵢ), i = 0, 1, ..., n − 1.
When one more data point (xₙ, yₙ), distinct from all the other data points, is added to the data set,
we can construct a new polynomial Pₙ(x) that interpolates the new data set. To do so, consider the
polynomial

    Pₙ(x) = Pₙ₋₁(x) + cₙ Π_{i=0}^{n−1} (x − xᵢ)                       (5.13)

So, for any given data set (xᵢ, yᵢ), i = 0, 1, ..., n, we can obtain the interpolating polynomial by a
recursive process that starts from P₀(x) and uses the above construction to get P₁(x), P₂(x), ..., Pₙ(x).
We demonstrate this process through the following example:

    i     0    1     2    3     4
    xᵢ    0    0.25  0.5  0.75  1
    yᵢ   −1    0     1    0     1

    P₀(x) = y₀ = −1
The constant c₁ is given by

    c₁ = (y₁ − P₀(x₁)) / Π_{i=0}^0 (x₁ − xᵢ) = (0 − (−1))/(0.25 − 0) = 4

Thus,

    P₁(x) = −1 + 4x

Next,

    P₂(x) = P₁(x) + c₂(x − x₀)(x − x₁) = (−1 + 4x) + c₂ x(x − 0.25)

The constant c₂ is given by

    c₂ = (y₂ − P₁(x₂)) / Π_{i=0}^1 (x₂ − xᵢ) = (1 − (−1 + 4 × 0.5)) / ((0.5 − 0)(0.5 − 0.25)) = 0

Thus,

    P₂(x) = −1 + 4x

Continuing the calculations we find

    c₃ = −64/3 ,   P₃(x) = −1 + 4x − (64/3) x (x − 1/4)(x − 1/2)

    c₄ = 64 ,      P₄(x) = −1 + 4x − (64/3) x (x − 1/4)(x − 1/2) + 64 x (x − 1/4)(x − 1/2)(x − 3/4)
Divided differences: the divided-difference form is a convenient way to generate Newton
interpolation polynomials.
The first-order divided difference of f at xᵢ is given by

    f[xᵢ, xᵢ₊₁] = (f(xᵢ₊₁) − f(xᵢ))/(xᵢ₊₁ − xᵢ)                       (5.18)

The second-order divided difference of f at xᵢ is given by

    f[xᵢ, xᵢ₊₁, xᵢ₊₂] = (f[xᵢ₊₁, xᵢ₊₂] − f[xᵢ, xᵢ₊₁])/(xᵢ₊₂ − xᵢ)     (5.19)

We can generalize this to higher order:

    f[x₀, x₁, ..., xₙ] = (f[x₁, ..., xₙ] − f[x₀, ..., xₙ₋₁])/(xₙ − x₀)   (5.20)

With these definitions the interpolation polynomial is

    Pₙ(x) = f[x₀] + f[x₀, x₁](x − x₀) + ... + f[x₀, ..., xₙ] Π_{i=0}^{n−1} (x − xᵢ)
          = f[x₀] + Σ_{i=1}^n f[x₀, ..., xᵢ] Π_{j=0}^{i−1} (x − xⱼ)      (5.21)
Example:

    i     0    1     2    3     4
    xᵢ    0    0.25  0.5  0.75  1
    yᵢ   −1    0     1    0     1

Let us find the interpolation polynomial for the above table. The divided differences are:

    f[x₀] = −1,  f[x₁] = 0,  f[x₂] = 1,  f[x₃] = 0,  f[x₄] = 1

    1st order:  f[x₀, x₁] = 4,   f[x₁, x₂] = 4,   f[x₂, x₃] = −4,   f[x₃, x₄] = 4
    2nd order:  f[x₀, x₁, x₂] = 0,   f[x₁, x₂, x₃] = −16,   f[x₂, x₃, x₄] = 16
    3rd order:  f[x₀, x₁, x₂, x₃] = −64/3,   f[x₁, x₂, x₃, x₄] = 128/3
    4th order:  f[x₀, x₁, x₂, x₃, x₄] = 64

in agreement with the coefficients c₀, ..., c₄ found above.
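A sketch in Fortran (ours) that builds the divided-difference coefficients of this table in place and
then evaluates P₄ in nested form:

program divdiff
implicit none
real(8) :: x(0:4), c(0:4), px, xx
integer :: i, j
x = [0.0d0, 0.25d0, 0.5d0, 0.75d0, 1.0d0]
c = [-1.0d0, 0.0d0, 1.0d0, 0.0d0, 1.0d0]   ! starts as the values f(x_i)
! build the divided differences in place; c(j) ends as f[x_0,...,x_j]
do j = 1, 4
do i = 4, j, -1
c(i) = (c(i) - c(i-1))/(x(i) - x(i-j))
end do
end do
print *, 'Newton coefficients:', c          ! -1, 4, 0, -64/3, 64
! evaluate P4(xx) by nested multiplication, cf. eq. (5.21)
xx = 0.3d0
px = c(4)
do i = 3, 0, -1
px = c(i) + (xx - x(i))*px
end do
print *, 'P4(0.3) =', px
end program divdiff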
5.5 Polynomial Forms
We follow the book Introduction to Numerical Analysis, Alastair Wood, Addison-Wesley.
Power form:

    Pₙ(x) = a₀ + a₁x + ... + aₙxⁿ = Σ_{k=0}^n aₖxᵏ                    (5.22)

This form is convenient for analysis but may lead to loss of significance. For example, consider

    P₁(x) = 18001/3 − x

This polynomial takes the value 1/3 at x = 6000 and −2/3 at x = 6001. On a finite-precision machine
with 5 decimal digits the coefficients are stored as a₀* = 6000.3 and a₁* = −1, and hence

    P₁*(6000) = 6000.3 − 6000 = 0.3 ,   P₁*(6001) = 6000.3 − 6001 = −0.7

Only one digit of each exact value is recovered, yet the coefficients are accurate to 5 digits! Four
significant digits have been lost to the subtraction of two nearly equal large numbers.
Shifted power form: the drawback seen in the previous example can be alleviated by shifting the
origin of x to a non-zero value c and writing the polynomial (5.22) as

    Pₙ(x) = b₀ + b₁(x − c) + ... + bₙ(x − c)ⁿ = Σ_{k=0}^n bₖ(x − c)ᵏ   (5.23)

This is called the shifted power form; c is a centre and the bₖ are constant coefficients. The previous
example can be written as

    P₁(x) = 18001/3 − x = 1/3 − (x − 6000)

So we get

    P₁(6000) = 1/3 − (6000 − 6000) = 1/3 = 0.33333
    P₁(6001) = 1/3 − (6001 − 6000) = 1/3 − 1 = −0.66667
These values are accurate to 5 digits and there is no loss of significance.
We can find the coefficients bₖ by using the Taylor polynomial at x = c. This gives

    Pₙ(x) = Σ_{k=0}^n Pₙ⁽ᵏ⁾(c)/k! (x − c)ᵏ                            (5.24)

where Pₙ⁽ᵏ⁾(c) is the k-th derivative of Pₙ(x) at x = c. Thus,

    bₖ = Pₙ⁽ᵏ⁾(c)/k!

Newton form: more generally, with centres c₁, c₂, ..., cₙ,

    Pₙ(x) = d₀ + d₁(x − c₁) + d₂(x − c₁)(x − c₂) + ...
            + dₙ(x − c₁)(x − c₂)...(x − cₙ)                           (5.25)
          = d₀ + Σ_{k=1}^n dₖ Π_{j=1}^k (x − cⱼ)
5.6 Cubic Spline Interpolation¹
A consequence of the definition of the cubic spline is that the individual interpolants can no longer
be constructed in isolation: the piecewise interpolants s₃,₁, ..., s₃,ₙ are interdependent through the
derivative-continuity conditions.

¹ Alastair Wood, Introduction to Numerical Analysis.
On the interval [xᵢ₋₁, xᵢ], for i = 1, 2, ..., n, we write

    s₃,ᵢ(x) = f(xᵢ₋₁) + aᵢ(x − xᵢ₋₁) + bᵢ(x − xᵢ₋₁)² + cᵢ(x − xᵢ₋₁)³   (5.26)

so there are 3n constants to be determined: aᵢ, bᵢ, cᵢ, i = 1, ..., n. Continuity at the knots
requires s₃,ᵢ(xᵢ) = s₃,ᵢ₊₁(xᵢ) = f(xᵢ). Since

    s₃,ᵢ(xᵢ)   = f(xᵢ₋₁) + aᵢ(xᵢ − xᵢ₋₁) + bᵢ(xᵢ − xᵢ₋₁)² + cᵢ(xᵢ − xᵢ₋₁)³
    s₃,ᵢ₊₁(xᵢ) = f(xᵢ)

this leads to

    f(xᵢ₋₁) + aᵢhᵢ + bᵢhᵢ² + cᵢhᵢ³ = f(xᵢ) ,  i = 1, ..., n,

where hᵢ = xᵢ − xᵢ₋₁. For the first derivative we have

    s₃,ᵢ′(x) = aᵢ + 2bᵢ(x − xᵢ₋₁) + 3cᵢ(x − xᵢ₋₁)² ,  s₃,ᵢ₊₁′(xᵢ) = aᵢ₊₁

and continuity of s′ leads to

    aᵢ + 2bᵢhᵢ + 3cᵢhᵢ² = aᵢ₊₁ ,  i = 1, ..., n − 1.

For the second derivative we get

    s₃,ᵢ″(x) = 2bᵢ + 6cᵢ(x − xᵢ₋₁) ,  s₃,ᵢ₊₁″(xᵢ) = 2bᵢ₊₁

and continuity of s″ leads to

    bᵢ + 3cᵢhᵢ = bᵢ₊₁ ,  i = 1, ..., n − 1.

The natural cubic spline is defined by

    s₃,₁″(x₀) = s₃,ₙ″(xₙ) = 0                                         (5.27)

in other words,

    b₁ = 0 ,  and  bₙ + 3cₙhₙ = 0                                     (5.28)

In total we have 3n constants to determine, and n + (n − 1) + (n − 1) + 2 = 3n conditions, so the
spline is determined.
5.7 Parametric Curves
In some cases curves cannot be expressed as a function of one coordinate variable y in terms of the other
variable x. A straightforward method to represent such curves is to use parametric technique. We choose
a parameter t on the interval [t0 , tn ], with t0 < t1 < ... < tn and construct approximation functions with
xi = x(ti ) and yi = y(ti ) (5.29)
Consider the curve given in Figure 3.14, page 158, of the textbook. From the curve we can extract
the following table
i 0 1 2 3 4
ti 0 0.25 0.5 0.75 1
xi −1 0 1 0 1
yi 0 1 0.5 0 −1
Please refer to page 158
Example:
The first part of the graph
x = [10, 6, 2, 1, 2, 6, 10]
y = [3, 1, 1, 4, 6, 7, 6]
t = [0, 1/6, 1/3, 1/2, 2/3, 5/6, 1]
The second part of the graph
x = [2, 6, 10, 10, 13]
y = [10, 12, 10, 1, 1]
t = [0, 1/4, 2/4, 3/4, 1]
The cubic spline polynomials for the first part of the graph are:

    fx(u) =
      10 − (297/13)u − (540/13)u³                                   u < 1/6
      115/13 − (27/13)u − (1620/13)u² + (2700/13)u³                 u < 1/3
      283/13 − (1539/13)u + (2916/13)u² − (1836/13)u³               u < 1/2
      −176/13 + (1215/13)u − (2592/13)u² + (1836/13)u³              u < 2/3     (5.30)
      1168/13 − (4833/13)u + (6480/13)u² − (2700/13)u³              u < 5/6
      −707/13 + (1917/13)u − (1620/13)u² + (540/13)u³               otherwise

    fy(u) =
      3 − (1797/130)u + (4266/65)u³                                 u < 1/6
      367/130 − (1383/130)u − (1242/65)u² + (1350/13)u³             u < 1/3
      2143/130 − (17367/130)u + (22734/65)u² − (17226/65)u³         u < 1/2
      −1831/65 + (17463/130)u − (12096/65)u² + (5994/65)u³          u < 2/3     (5.32)
      389/13 − (16521/130)u + (13392/65)u² − (1350/13)u³            u < 5/6
      −2397/26 + (40629/130)u − (20898/65)u² + (6966/65)u³          otherwise

Merging this with the spline for the second part of the graph gives the complete figure.
Example:
Given the data points

    i     0   1   2
    xᵢ    4   9   16
    fᵢ    2   3   4

we fit a natural cubic spline with pieces

    f₁(x) = a₀ + a₁(x − x₀) + a₂(x − x₀)² + a₃(x − x₀)³
    f₂(x) = b₀ + b₁(x − x₁) + b₂(x − x₁)² + b₃(x − x₁)³

The interpolation conditions (with a₀ = 2, b₀ = 3) become

    3 = 2 + a₁(9 − 4) + a₂(9 − 4)² + a₃(9 − 4)³   ⇒  1 = 5a₁ + 25a₂ + 125a₃
    4 = 3 + b₁(16 − 9) + b₂(16 − 9)² + b₃(16 − 9)³  ⇒  1 = 7b₁ + 49b₂ + 343b₃

The continuity of the first derivatives leads to

    b₁ = a₁ + 10a₂ + 75a₃
The continuity of the second derivatives leads to

    b₂ = a₂ + 15a₃

and the natural end conditions give

    0 = a₂ ,   0 = b₂ + 21b₃
Using a₂ = 0, the system reduces to

    1 = 5a₁ + 125a₃
    1 = 7b₁ + 49b₂ + 343b₃
    b₁ = a₁ + 75a₃
    b₂ = 15a₃
    0 = b₂ + 21b₃  ⇒  b₃ = −15a₃/21

Then we get:

    1 = 5a₁ + 125a₃

and

    1 = 7(a₁ + 75a₃) + 49(15a₃) + 343(−15a₃/21)  ⇒  1 = 7a₁ + 1015a₃

Solving these two equations gives a₃ = −1/2100 and a₁ = 89/420, and then b₁ = 37/210, b₂ = −1/140,
b₃ = 1/2940.
Chapter 6

Numerical Differentiation and Integration

6.1 Numerical Differentiation
From Taylor's theorem,

    f(x + h) = f(x) + h f′(x) + (h²/2) f″(ξ)                          (6.1)

where ξ is a real number between x and x + h. Hence

    f′(x) = (f(x + h) − f(x))/h − (h/2) f″(ξ)                         (6.2)

We can get the same formula using the linear Lagrange polynomial. With the two points x₀ and
x₁ = x₀ + h we have

    f(x) = f(x₀)(x − x₁)/(x₀ − x₁) + f(x₁)(x − x₀)/(x₁ − x₀) + f″(ξ)/2! (x − x₀)(x − x₁)
         = −f(x₀)(x − x₁)/h + f(x₁)(x − x₀)/h + f″(ξ)/2! (x − x₀)(x − x₁)

Differentiating and evaluating at x = x₀ — where the term containing the derivative of f″(ξ(x))
vanishes, because (x − x₀)(x − x₁) = 0 there — gives

    f′(x₀) = (f(x₀ + h) − f(x₀))/h − (h/2) f″(ξ)

This formula is known as the forward-difference formula if h > 0 and the backward-difference formula
if h < 0.
(n+1)-point formulas: To obtain general derivative approximation formulas, suppose that
{x₀, x₁, ..., xₙ} are n + 1 distinct numbers in some interval I and that f ∈ Cⁿ⁺¹(I). Using the
Lagrange polynomials,

    f(x) = Σ_{k=0}^n f(xₖ)Lₖ(x) + f⁽ⁿ⁺¹⁾(ξ(x))/(n + 1)! Π_{k=0}^n (x − xₖ)   (6.4)
In general, using more evaluation points produces greater accuracy, although the number of functional
evaluations and growth of round-off error discourages this somewhat.
Three-point formulas:

    f′(x₀) = [−3f(x₀) + 4f(x₀ + h) − f(x₀ + 2h)]/(2h) + (h²/3) f⁽³⁾(ξ₀)      (6.6)

    f′(x₀) = [f(x₀ + h) − f(x₀ − h)]/(2h) − (h²/6) f⁽³⁾(ξ₁)                  (6.7)

where ξ₀ lies between x₀ and x₀ + 2h, and ξ₁ lies between x₀ − h and x₀ + h. Although the errors in
both formulas are O(h²), the error in the second equation is approximately half the error in the
first. This is because it uses data from both sides of x₀.
Five-point formulas:

    f′(x₀) = [f(x₀ − 2h) − 8f(x₀ − h) + 8f(x₀ + h) − f(x₀ + 2h)]/(12h) + (h⁴/30) f⁽⁵⁾(ξ)   (6.8)

where ξ lies between x₀ − 2h and x₀ + 2h. The other five-point formula is useful for end-point
approximations. It is given by

    f′(x₀) = [−25f(x₀) + 48f(x₀ + h) − 36f(x₀ + 2h) + 16f(x₀ + 3h) − 3f(x₀ + 4h)]/(12h)
             + (h⁴/5) f⁽⁵⁾(ξ)                                                              (6.9)

where ξ lies between x₀ and x₀ + 4h. Left-endpoint approximations are found using h > 0 and
right-endpoint approximations using h < 0.
Example:
Let f (x) = x exp (x). The values of f at different x are given
x f (x)
1.8 10.889465
1.9 12.703199
2.0 14.778112
2.1 17.148957
2.2 19.855030
Since f′(x) = (x + 1)eˣ, we have f′(2) = 22.167168. Let us approximate f′(2) using the three-point formulas.
f′(x0) ≈ (1/2h)[−3f(x0) + 4f(x0 + h) − f(x0 + 2h)]
f′(2) ≈ (1/(2 × 0.1))[−3f(2.0) + 4f(2.1) − f(2.2)] ≈ 22.032310,   h = 0.1
f′(2) ≈ (1/(−2 × 0.1))[−3f(2.0) + 4f(1.9) − f(1.8)] ≈ 22.054525,   h = −0.1
f′(x0) ≈ (1/2h)[f(x0 + h) − f(x0 − h)]
f′(2) ≈ (1/(2 × 0.1))[f(2.1) − f(1.9)] ≈ 22.228790
The errors are 22.167168 − 22.032310 = 0.134858, 22.167168 − 22.054525 = 0.112643, and 22.167168 − 22.228790 = −0.61622 × 10⁻¹, respectively.
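For comparison, a sketch of the five-point midpoint formula (6.8) applied to the same data:
> f := x -> x*exp(x):
> h := 0.1:
> (f(1.8) - 8*f(1.9) + 8*f(2.1) - f(2.2))/(12*h);    # approx. 22.16700
The error, about 1.7 × 10⁻⁴, is consistent with the O(h⁴) term (h⁴/30)f⁽⁵⁾(ξ) and is far smaller than the three-point errors above.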
It is also possible to find approximations to higher derivatives of a function using only tabulated values of the function at various points.
Expand a function f in a third-degree Taylor polynomial about x0 and evaluate at x0 ± h. Then
f(x) = f(x0) + (x − x0) f′(x0) + [(x − x0)²/2] f″(x0) + [(x − x0)³/6] f⁽³⁾(x0) + [(x − x0)⁴/24] f⁽⁴⁾(ξ),   (6.10)
f(x0 + h) = f(x0) + h f′(x0) + (h²/2) f″(x0) + (h³/6) f⁽³⁾(x0) + (h⁴/24) f⁽⁴⁾(ξ+),   (6.11)
f(x0 − h) = f(x0) − h f′(x0) + (h²/2) f″(x0) − (h³/6) f⁽³⁾(x0) + (h⁴/24) f⁽⁴⁾(ξ−),   (6.12)
where x0 − h < ξ− < x0 < ξ+ < x0 + h. Adding the two last equations we get
[f(x0 + h) + f(x0 − h)]/2 = f(x0) + (h²/2) f″(x0) + (h⁴/48)[f⁽⁴⁾(ξ+) + f⁽⁴⁾(ξ−)],   (6.13)
and solving this last equation for f″(x0) we find that
f″(x0) = (1/h²)[f(x0 − h) − 2f(x0) + f(x0 + h)] − (h²/24)[f⁽⁴⁾(ξ+) + f⁽⁴⁾(ξ−)].   (6.14)
Suppose that f⁽⁴⁾ is continuous on [x0 − h, x0 + h]. Since (1/2)[f⁽⁴⁾(ξ+) + f⁽⁴⁾(ξ−)] is between f⁽⁴⁾(ξ+) and f⁽⁴⁾(ξ−), the intermediate value theorem implies that a number ξ exists between ξ+ and ξ−, and hence in (x0 − h, x0 + h), with
f⁽⁴⁾(ξ) = [f⁽⁴⁾(ξ+) + f⁽⁴⁾(ξ−)]/2.
This leads to
f″(x0) = (1/h²)[f(x0 − h) − 2f(x0) + f(x0 + h)] − (h²/12) f⁽⁴⁾(ξ),   (6.15)
where ξ ∈ (x0 − h, x0 + h).
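The second-derivative formula (6.15) can be tried on the same function as before (a sketch; the exact value is f″(2) = 4e² = 29.556224):
> f := x -> x*exp(x):
> h := 0.1:
> (f(1.9) - 2*f(2.0) + f(2.1))/h^2;    # approx. 29.5932
The error, about 0.037, matches the (h²/12)f⁽⁴⁾(ξ) term, since f⁽⁴⁾(2) = 6e² ≈ 44.33.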
It is important to pay attention to round-off error when approximating derivatives. Let us illustrate this with the two-point formula
f′(x) ≈ [f(x + h) − f(x)]/h = (f1 − f0)/h.   (6.16)
If we denote the round-off errors in f0 and f1 by e0 and e1, respectively, then
f′(x) ≈ (f1 + e1 − f0 − e0)/h = (f1 − f0)/h + (e1 − e0)/h.   (6.17)
If the errors are of magnitude e, we can at worst get
round-off error = 2e/h.   (6.18)
We know that the truncation error is given by |−(h/2) f″(ξ)| ≤ Mh/2, where M = max |f″(ξ)| for x ≤ ξ ≤ x + h. The bound for the total error is then
E(h) = 2e/h + Mh/2.
Three-point formula:
f′(x0) = (1/2h)[f(x0 + h) − f(x0 − h)] − (h²/6) f⁽³⁾(ξ).   (6.22)
Suppose that in evaluating f(x0 ± h) we encounter round-off errors e(x0 ± h). Then our computed values f̃(x0 ± h) are related to f(x0 ± h) by
f(x0 ± h) = f̃(x0 ± h) + e(x0 ± h).
To reduce the truncation error, h²M/6, we must reduce h. But if h is reduced, the round-off error ε/h grows. In practice, then, it is seldom advantageous to let h be too small, since the round-off error will dominate the calculations. The minimum total error occurs at the optimal value of h given by
h = ∛(3ε/M).   (6.27)
Numerical differentiation is unstable: small values of h needed to reduce truncation error also cause
the round-off error to grow.
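This trade-off is easy to observe numerically. In the following Maple sketch (using f(x) = sin x, whose exact derivative at 1 is cos 1, and the default precision Digits = 10), the error of the central-difference formula first decreases like h² and then grows again once round-off dominates:
> f := x -> sin(x):
> for k from 1 to 8 do
>   h := 10.0^(-k);
>   err := abs((f(1 + h) - f(1 - h))/(2*h) - cos(1.0));
>   printf("h = %.1e   error = %.3e\n", h, err);
> end do: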
Example:
Show that the differentiation rule
f′(x0) ≈ a0f0 + a1f1 + a2f2
is exact for all f ∈ C³ if, and only if, it is exact for
f(x) = 1, f(x) = x, f(x) = x²,
and find the values of a0, a1, and a2.
Solution: According to the formula
f(x) = Σ_{k=0}^{n} f(xk) Lk(x) + [f^{(n+1)}(ξ(x))/(n + 1)!] Π_{k=0}^{n} (x − xk),   (6.28)
with
Lk(x) = Π_{i=0, i≠k}^{n} (x − xi)/(xk − xi),   (6.29)
we get for n = 2
f′(x0) = f0 L′0(x0) + f1 L′1(x0) + f2 L′2(x0) + [f⁽³⁾(ξ0)/6] Π_{k=1}^{2} (x0 − xk).   (6.31)
– For f(x) = 1, x, x², the formula f′(x0) = a0f0 + a1f1 + a2f2 with ai = L′i(x0) is exact, since f⁽³⁾(ξ0) = 0 for all ξ0, so the error term vanishes.
– Conversely, suppose the formula is exact for every f whose third derivative vanishes identically, that is, for every f(x) = α + βx + γx². Choosing (α, β, γ) = (1, 0, 0), (0, 1, 0), and (0, 0, 1) shows in particular that it is exact for f(x) = 1, x, x².
The coefficients are ai = L′i(x0); for equally spaced nodes xi = x0 + ih this gives a0 = −3/(2h), a1 = 2/h, a2 = −1/(2h), which recovers the three-point formula (6.6).
Exercises section 4.1: 1, 3, 5, 9, 13, 15, 19
6.2 Richardson’s Extrapolation
Richardson’s extrapolation is used to generate high-accuracy results while using low-order formulas. Ex-
trapolation can be applied whenever it is known that an approximation technique has an error term with
predictable form like
M − N(h) = K1h + K2h² + · · · ,   (6.32)
for some collection of unknown constants Kj, where N(h) approximates the unknown value M. The error M − N(h) is O(h) and, unless there is a large variation in magnitude among the constants Kj, M − N(h) ≈ K1h. If M can be written in the form
M = N(h) + Σ_{j=1}^{m−1} Kj h^j + O(h^m),   (6.33)
then higher-order approximations N2(h), N3(h), . . . can be generated from N(h), N(h/2), N(h/4), . . . (see eq. (6.38) below).
These approximations are generated by rows in the order indicated by the numbered entries in the following
table
O(h) O(h2 ) O(h3 ) O(h4 )
1 : N1 (h/1) ≡ N (h/1)
2 : N1 (h/2) ≡ N (h/2) 3 : N2 (h) (6.35)
4 : N1 (h/4) ≡ N (h/4) 5 : N2 (h/2) 6 : N3 (h)
7 : N1 (h/8) ≡ N (h/8) 8 : N2 (h/4) 9 : N3 (h/2) 10 : N4 (h)
Example:
We want to determine an approximation to f′(1.0) with h = 0.4, where f(x) = ln x. We use Richardson's extrapolation to compute N3(h). We have
f′(x0) = (1/2h)[f(x0 + h) − f(x0 − h)] − (h²/6) f⁽³⁾(x0) − (h⁴/120) f⁽⁵⁾(ξ) − · · ·   (6.37)
In the case h = 0.4 and x0 = 1, we can calculate N3(h) using
N1(h) = (1/2h)[ln(x0 + h) − ln(x0 − h)]
N1 (0.4) = 1.059122326
N1 (0.2) = 1.013662770
N1 (0.1) = 1.003353478
We then use
Nj(h) = Nj−1(h/2) + [Nj−1(h/2) − Nj−1(h)]/(4^{j−1} − 1)   (6.38)
to get
N2(0.4) = N1(0.2) + [N1(0.2) − N1(0.4)]/(4¹ − 1) = 0.9985095847
N2(0.2) = N1(0.1) + [N1(0.1) − N1(0.2)]/(4¹ − 1) = 0.9999170473
N3(0.4) = N2(0.2) + [N2(0.2) − N2(0.4)]/(4² − 1) = 1.000010878
> a1:=1.1;f1:=exp(2*a1);
> a2:=1.2;f2:=exp(2*a2);
> a3:=1.3;f3:=exp(2*a3);
> a4:=1.4;f4:=exp(2*a4);
a1 := 1.1
f1 := 9.025013499
a2 := 1.2
f2 := 11.02317638
a3 := 1.3
f3 := 13.46373804
a4 := 1.4
f4 := 16.44464677
> h:=0.1;
> f1p:=1/2/h*(-3*f1+4*f2-f3);
> f2p:=1/2/h*(f3-f1);
> f3p:=1/2/h*(f4-f2);
> f4p:=-1/2/h*(-3*f4+4*f3-f2);
h := 0.1
f1p := 17.76963490
f2p := 22.19362270
f3p := 27.10735195
f4p := 32.51082265
> evalf(subs(x=1.3,h^2/3*diff(exp(2*x),x$3)));
> evalf(subs(x=1.3,h^2/6*diff(exp(2*x),x$3)));
> evalf(subs(x=1.4,h^2/6*diff(exp(2*x),x$3)));
> evalf(subs(x=1.4,h^2/3*diff(exp(2*x),x$3)));
0.3590330144
0.1795165072
0.2192619569
0.4385239139
6.3.2 Simpson's rule
Simpson's rule uses the second Lagrange polynomial with nodes at x0 = a, x1 = a + h, x2 = b, where h = (b − a)/2. There are several ways to derive the formula (see Section 4.3 of the textbook and Exercise 24 of that section):
∫_{x0}^{x2} f(x) dx = (h/3)[f(x0) + 4f(x1) + f(x2)] − (h⁵/90) f⁽⁴⁾(ξ).   (6.41)
Example: applying the rule with h = 1 on [−1, 1] gives
∫_{−1}^{1} f(x) dx = (1/3)[f(−1) + 4f(0) + f(1)] − (1/90) f⁽⁴⁾(ξ).   (6.43)
For f(x) = eˣ we get
∫_{−1}^{1} eˣ dx ≈ (1/3)[e⁻¹ + 4 + e] − (1/90) e^ξ = 2.36205 − (1/90) e^ξ.
The maximum error is
|Emax| = max (1/90) e^ξ = (1/90) e = 0.03020.
The exact value of the integral is
I = ∫_{−1}^{1} eˣ dx = e − 1/e = 2.35040,   (6.45)
so the absolute error is |2.36205 − 2.35040| = 0.01165, which is less than the maximum error 0.03020.
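A sketch of this computation in Maple:
> f := x -> exp(x):
> S := evalf((f(-1) + 4*f(0) + f(1))/3);
                         S := 2.362053756
> exact := evalf(exp(1) - exp(-1));
                         exact := 2.350402387
> abs(S - exact);
                         0.011651369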
6.3.3 Degree of precision
Example: consider the two-point rule ∫_{−1}^{1} f(x) dx ≈ f(−1/√3) + f(1/√3). The integral
∫_{−1}^{1} x^k dx = [1 + (−1)^k]/(1 + k),
while the rule gives
(−1/√3)^k + (1/√3)^k = (1/√3)^k [1 + (−1)^k].
The formula has degree of precision n if it is exact for x^k, k = 0, . . . , n. We therefore need
[1 + (−1)^k]/(1 + k) = (1/√3)^k [1 + (−1)^k].
This holds whenever k is odd (both sides are 0). For even k it requires
1/(1 + k) = (1/√3)^k,
which is true for k = 0, 2 but fails for k = 4. Therefore, the formula is exact for k = 0, 1, 2, 3, and the degree of precision is 3. We can conclude from this example that the formula is exact for all polynomials of degree at most 3.
6.3.4 Newton-Cotes Formula
For an (n + 1)-point Newton-Cotes formula ∫_a^b f(x) dx ≈ Σ_{i=0}^{n} ai f(xi), the coefficients are
ai = ∫_a^b Li(x) dx.   (6.47)
Theorem 4.2
Theorem 4.3
6.4 Composite Numerical Integration
Composite Simpson's rule:
Let f ∈ C⁴[a, b], let n be an even integer, h = (b − a)/n, and xj = a + jh for each j = 0, 1, . . . , n. There exists a µ ∈ (a, b) for which the composite Simpson's rule for n subintervals can be written as
∫_a^b f(x) dx = (h/3)[f(a) + 2 Σ_{j=1}^{(n/2)−1} f(x2j) + 4 Σ_{j=1}^{n/2} f(x2j−1) + f(b)] − [(b − a)/180] h⁴ f⁽⁴⁾(µ).   (6.48)
An important property shared by all these composite integration techniques is stability with respect to round-off errors. Let us demonstrate this property for the composite Simpson's rule with n subintervals applied to a function f on [a, b]. Assume that
f(xi) = f̃(xi) + ei,   (6.51)
where ei is the round-off error and f̃(xi) is the computed approximation to f(xi). Substituting into the composite Simpson's rule, the accumulated round-off error is
e(h) = (h/3)[e0 + 2 Σ_{j=1}^{(n/2)−1} e2j + 4 Σ_{j=1}^{n/2} e2j−1 + en].
If the round-off errors are uniformly bounded by ε, then
|e(h)| ≤ (h/3)[ε + 2(n/2 − 1)ε + 4(n/2)ε + ε] = (h/3) 3nε = nhε = (b − a)ε,
so the bound is independent of n and h: the procedure is stable as n increases.
Exercises section 4.4.
6.5 Adaptive Quadrature Methods
Simpson's rule applied to a subinterval [ak, bk] gives
S(ak, bk) = (h/3)[f(ak) + 4f(ck) + f(bk)],   (6.52)
where ck is the center of [ak, bk] and h = (bk − ak)/2. Furthermore, if f ∈ C⁴[ak, bk], then there exists ξk ∈ [ak, bk] so that
∫_{ak}^{bk} f(x) dx = S(ak, bk) − (h⁵/90) f⁽⁴⁾(ξk).   (6.53)
A composite Simpson's rule using four subintervals of [ak, bk] can be performed by bisecting this interval into two equal subintervals [ak, ck] = [ak1, bk1] and [ck, bk] = [ak2, bk2]. We then write
S(ak1, bk1) + S(ak2, bk2) = (h/(3 × 2))[f(ak1) + 4f(ck1) + f(bk1)] + (h/(3 × 2))[f(ak2) + 4f(ck2) + f(bk2)],   (6.54, 6.55)
where only two additional evaluations of f(x) are needed, at ck1 and ck2, the midpoints of the intervals [ak1, bk1] and [ak2, bk2], respectively.
Furthermore, if f ∈ C⁴[ak, bk], then there exists ξk1 ∈ [ak, bk] so that
∫_{ak}^{bk} f(x) dx = S(ak1, bk1) + S(ak2, bk2) − (1/16)(h⁵/90) f⁽⁴⁾(ξk1).   (6.56)
Assuming f⁽⁴⁾(ξk) ≈ f⁽⁴⁾(ξk1) and equating the two expressions for the integral,
S(ak, bk) − (h⁵/90) f⁽⁴⁾(ξk) ≈ S(ak1, bk1) + S(ak2, bk2) − [h⁵/(16 × 90)] f⁽⁴⁾(ξk1),
which can be written as
−(h⁵/90) f⁽⁴⁾(ξk1) ≈ (16/15)[S(ak1, bk1) + S(ak2, bk2) − S(ak, bk)].
Thus, we find the error estimate
|∫_{ak}^{bk} f(x) dx − S(ak1, bk1) − S(ak2, bk2)| ≈ (1/15)|S(ak1, bk1) + S(ak2, bk2) − S(ak, bk)|.   (6.57)
Because the assumption f⁽⁴⁾(ξk) ≈ f⁽⁴⁾(ξk1) is only approximate, the fraction 1/15 is replaced with the more conservative 1/10 when implementing the method in a program. Assume that we want the tolerance to be εk > 0 for the interval [ak, bk]. If
(1/10)|S(ak1, bk1) + S(ak2, bk2) − S(ak, bk)| < εk,   (6.58)
we infer that
|∫_{ak}^{bk} f(x) dx − S(ak1, bk1) − S(ak2, bk2)| < εk,   (6.59)
and the error bound for this approximation over the interval [ak, bk] is εk.
The adaptive quadrature is implemented by applying Simpson's rule in this way:
1. Compute S(a, b) over the whole interval with tolerance ε.
2. Bisect the interval and apply the accuracy test (6.58) to S(a, b), S(a, c), and S(c, b).
3. If the test fails, the interval is refined into two subintervals, labeled [a1, b1] and [a2, b2], over which the tolerances are halved: ε1 = ε/2, ε2 = ε/2.
4. Repeat these steps for the two subintervals with the new tolerances, refining further where needed.
5. Add the quadrature results over all the subintervals on which the accuracy test passed.
Example: We apply the adaptive quadrature algorithm, with initial tolerance ε0 = 0.001, to approximate
∫_0^1 (3/2)√x dx = 1.
S(0, 1) = 0.9571067813
S(0, 0.5) = 0.3383883477
S(0.5, 1) = 0.6464010497
|S(0, 0.5) + S(0.5, 1) − S(0, 1)| − 10ε0 = 0.0176826161 > 0
The test fails, so we refine the interval [0, 1] into [0, 0.5] and [0.5, 1];
then [0, 0.5] into [0, 0.25] and [0.25, 0.5];
then [0, 0.25] into [0, 0.125] and [0.125, 0.25].
The accuracy test for [0, 0.125] passes, so we can keep the interval [0, 0.125] with
S(0, 0.0625) + S(0.0625, 0.125) = 0.04352195381.
We now go back with ε3.
Accuracy test for [0.125, 0.25]:
S(0.125, 0.25) = 0.08080013118
S(0.125, 0.1875) = 0.03699538942
S(0.1875, 0.25) = 0.04381002180
|S(0.125, 0.1875) + S(0.1875, 0.25) − S(0.125, 0.25)| − 10ε3 = −0.001244719960 < 0
The test passes, so we keep the interval [0.125, 0.25] with
S(0.125, 0.1875) + S(0.1875, 0.25) = 0.08080541122.
We now go back with ε2.
Accuracy test for [0.25, 0.5]:
S(0.25, 0.5) = 0.2285372827
S(0.25, 0.375) = 0.1046387629
S(0.375, 0.5) = 0.1239134540
|S(0.25, 0.375) + S(0.375, 0.5) − S(0.25, 0.5)| − 10ε2 = −0.002485065800 < 0
The test passes, so we keep the interval [0.25, 0.5] with
S(0.25, 0.375) + S(0.375, 0.5) = 0.2285522169.
We now go back with ε1.
Accuracy test for [0.5, 1]:
S(0.5, 1) = 0.6464010497
S(0.5, 0.75) = 0.2959631153
S(0.75, 1) = 0.3504801743
|S(0.5, 0.75) + S(0.75, 1) − S(0.5, 1)| − 10ε1 = −0.004957760100 < 0
The test passes, so we keep the interval [0.5, 1] with
S(0.5, 0.75) + S(0.75, 1) = 0.6464432896.
Now we can add all the accepted pieces:
∫_0^1 (3/2)√x dx ≈ S(0, 0.0625) + S(0.0625, 0.125) + S(0.125, 0.1875) + S(0.1875, 0.25) + S(0.25, 0.375) + S(0.375, 0.5) + S(0.5, 0.75) + S(0.75, 1) = 0.9993228715
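The whole procedure fits in a short recursive Maple sketch (Adapt is a name introduced here; the test uses the 10ε form of (6.58) and halves the tolerance on each refinement):
> S := (f, a, b) -> (b - a)/6*(f(a) + 4*f((a + b)/2) + f(b)):
> Adapt := proc(f, a, b, eps)
>   local c, whole, halves;
>   c := (a + b)/2;
>   whole := S(f, a, b);
>   halves := S(f, a, c) + S(f, c, b);
>   if abs(halves - whole) < 10*eps then   # accuracy test (6.58)
>     halves;
>   else                                   # refine, halving the tolerance
>     Adapt(f, a, c, eps/2) + Adapt(f, c, b, eps/2);
>   end if;
> end proc:
> evalf(Adapt(x -> 1.5*sqrt(x), 0.0, 1.0, 0.001));
                         0.9993228715
With ε = 0.001 this reproduces exactly the subdivision pattern and the sum obtained by hand above.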
[Figure: the integrand (3/2)√x on [0, 1], with the adaptive subdivision points 0.125, 0.25, and 0.5 marked on the x-axis.]
6.6 Gaussian Quadrature
Let us find the Gaussian quadrature formula for n = 2. In this case the rule ∫_{−1}^{1} f(x) dx ≈ a1 f(x1) + a2 f(x2) should give exact results for the functions 1, x, x², and x³:
f(x) = 1  ⟹  a1 + a2 = ∫_{−1}^{1} dx = 2
f(x) = x  ⟹  a1x1 + a2x2 = ∫_{−1}^{1} x dx = 0
f(x) = x² ⟹  a1x1² + a2x2² = ∫_{−1}^{1} x² dx = 2/3
f(x) = x³ ⟹  a1x1³ + a2x2³ = ∫_{−1}^{1} x³ dx = 0
Solving these four equations gives a1 = a2 = 1 and x1 = −1/√3, x2 = 1/√3, recovering the rule whose degree of precision was checked above.
This method can be generalized to more than two nodes, but there is an alternative way to obtain the nodes and coefficients more easily. This alternative uses what we call the Legendre polynomials Pn, defined by two properties:
For each n, Pn is a monic polynomial of degree n, that is, a polynomial xⁿ + a_{n−1}x^{n−1} + · · · + a1x + a0 in which the coefficient of the highest-order term is 1.
Whenever P(x) is a polynomial of degree less than n, we have
∫_{−1}^{1} P(x) Pn(x) dx = 0.   (6.65)
The nodes xi, i = 1, . . . , n, are determined by the roots of Pn(x), and the coefficients ai are defined by
ai = ∫_{−1}^{1} Π_{j=1, j≠i}^{n} (x − xj)/(xi − xj) dx.   (6.67)
Note that the Gaussian formula imposes a restriction on the limits of integration to be from −1 to 1. It is possible to overcome this restriction by a change of variables:
we seek a transformation for which
∫_a^b f(x) dx = ∫_{−1}^{1} g(z) dz.   (6.69)
We define
z = Ax + B,   (6.70)
x = (z − B)/A,   (6.71)
and require that
−1 = Aa + B,
1 = Ab + B.
Therefore
∫_a^b f(x) dx = (1/A) ∫_{−1}^{1} g(z) dz,   (6.72)
where
g(z) = f((z − B)/A).   (6.73)
Example:
Convert the integral
I = ∫_{−2}^{2} e^{−x/2} dx
to the interval [−1, 1]. We have
A = 2/(b − a) = 1/2,
B = (a + b)/(a − b) = 0.
Thus
I = ∫_{−2}^{2} e^{−x/2} dx = 2 ∫_{−1}^{1} f(2z) dz = 2 ∫_{−1}^{1} e^{−z} dz.
6.7 Improper Integrals
In an improper integral, the notion of integration is extended to an infinite interval of integration, or to an interval on which the integrand is unbounded.
Let us consider the first case, when the integrand f(x) is unbounded; the second case can be reduced to the first. It is well known that the improper integral
∫_a^b dx/(x − a)^p   (6.74)
converges if and only if 0 < p < 1, in which case
∫_a^b dx/(x − a)^p = (b − a)^{1−p}/(1 − p).   (6.75)
If the function f(x) can be written as
f(x) = g(x)/(x − a)^p,   (6.76)
where g is continuous on [a, b], the improper integral
∫_a^b f(x) dx   (6.77)
can be handled by splitting off the fourth-degree Taylor polynomial P4 of g about a: the integral of P4(x)/(x − a)^p is evaluated exactly term by term, and the remainder
G(x) = [g(x) − P4(x)]/(x − a)^p for a < x ≤ b, with G(a) = 0,
is integrated numerically. Since 0 < p < 1 and P4^{(k)}(a) agrees with g^{(k)}(a) for each k = 0, . . . , 4, we have G ∈ C⁴[a, b]. This implies that the composite Simpson's rule can be applied to approximate the integral of G on [a, b].
Example:
We want to approximate
∫_0^1 sin(x)/x^{1/4} dx   (6.82)
using the composite Simpson's rule with n = 4 (here p = 1/4 and g(x) = sin x).
1. We find the fourth-degree Taylor polynomial for sin(x) at x = 0:
sin(x) ≈ P4(x) = x − (1/6)x³.   (6.83)
2. The polynomial part is integrated exactly:
∫_0^1 P4(x)/x^{1/4} dx = ∫_0^1 (x^{3/4} − (1/6)x^{11/4}) dx = 4/7 − 2/45 = 0.5269841270.
3. Define G(x) = [sin(x) − P4(x)]/x^{1/4} for 0 < x ≤ 1 and G(0) = 0.
4. Apply the composite Simpson's rule with h = 0.25:
∫_0^1 G(x) dx ≈ (1/12)[G(0) + 4G(0.25) + 2G(0.5) + 4G(0.75) + G(1)] = 0.001432198742.
5. Now we get
∫_0^1 sin(x)/x^{1/4} dx ≈ 0.001432198742 + 0.5269841270 = 0.5284163257.
To approximate an improper integral with a singularity at the right endpoint, we can apply the technique used above after the transformation
∫_a^b f(x) dx = ∫_{−b}^{−a} f(−z) dz,   (6.84)
which has its singularity at the left endpoint. An improper integral with a singularity at c, where a < c < b, is treated as the sum
∫_a^b f(x) dx = ∫_a^c f(x) dx + ∫_c^b f(x) dx.   (6.85)
The other type of improper integral, involving an infinite limit of integration, can be treated using
∫_a^∞ f(x) dx = ∫_0^{1/a} t⁻² f(1/t) dt   (6.86)
for a > 0. For a ≤ 0 we can split the integral into two parts, one from a to some c > 0 and the other from c to ∞.
Chapter 7
Initial-Value Problems for Ordinary Differential Equations
7.1 Introduction
An Ordinary Differential Equation (ODE) is an equation containing one or more derivatives of a function y. Differential equations are classified according to their order. The order of a differential equation is the order of the highest derivative that appears in the equation. When the equation contains only a first derivative, it is called a first-order differential equation. A first-order differential equation can be expressed as
dy/dt = f(t, y).   (7.1)
The degree of a differential equation is the power of the highest-order derivative. For example, ty″ + 3t² + 2 = 0 is a second-order equation of first degree. A differential equation is linear when it does not contain terms involving products of the dependent variable y and its derivatives. For example, y″ + 2y′ + t² = 0 is linear but y″ + 2y′y + t² = 0 is not.
If the order of the equation is n, we need n conditions in order to obtain a unique solution. When all the conditions are specified at a particular value of the independent variable t, the problem is called an initial-value problem. It is also possible to specify the conditions at different values of t; such problems are called boundary-value problems.
All numerical techniques for solving differential equations involve a series of estimates of y(t) starting
from the given conditions. There are two basic approaches, one-step and multistep methods.
In one-step methods, we use information from only one preceding point: to estimate yi we only need yi−1.
Multistep methods use information at two or more previous steps to estimate a value.
7.2 Elementary Theory of Initial-Value Problems
Lipschitz condition: A function f(t, y) is said to satisfy a Lipschitz condition in the variable y on a set D ⊂ R² if a constant L > 0 exists with
|f(t, y1) − f(t, y2)| ≤ L|y1 − y2|
whenever (t, y1) and (t, y2) are in D. The constant L is called a Lipschitz constant for f.
Convex set: A set D ⊂ R² is said to be convex if whenever (t1, y1) and (t2, y2) belong to D and λ is in [0, 1], the point ((1 − λ)t1 + λt2, (1 − λ)y1 + λy2) also belongs to D. This means that the entire straight-line segment between the two points belongs to the set D.
[Figure: a non-convex set and a convex set.]
Theorem:
Suppose that f(t, y) is defined on a convex set D ⊂ R². If a constant L > 0 exists with
|∂f/∂y (t, y)| ≤ L for all (t, y) ∈ D,
then f satisfies a Lipschitz condition on D in the variable y with Lipschitz constant L.
Theorem:
Suppose that D = {(t, y)|a ≤ t ≤ b, y ∈ R} and that f (t, y) is continuous on D. If f satisfies a
Lipschitz condition on D in the variable y, then the initial-value problem
y ′ (t) = f (t, y), a ≤ t ≤ b, y(a) = α,
has a unique solution y(t) for a ≤ t ≤ b.
Example: let us check that
f(t, y) = y cos(t), 0 ≤ t ≤ 1,
satisfies a Lipschitz condition. We have
|y1 cos(t) − y2 cos(t)| = |cos(t)| |y1 − y2| ≤ |y1 − y2|,
so f satisfies the Lipschitz condition with L = 1. Therefore, the initial-value problem y′ = y cos(t), y(0) = 1, has a unique solution.
7.3 Euler’s Method
The objective of Euler’s method is to obtain an approximate solution to the well-posed initial-value
problem
dy/dt = f(t, y), a ≤ t ≤ b, y(a) = α.   (7.3)
We obtain approximate solutions at fixed points, called mesh points. Let us assume that the mesh points are equally distributed throughout the interval [a, b]; so we choose ti = a + ih for i = 0, 1, . . . , N. The common distance between the points, h = (b − a)/N, is called the step size. To derive Euler's method we use Taylor's Theorem:
y(ti+1) = y(ti) + (ti+1 − ti) y′(ti) + [(ti+1 − ti)²/2] y″(ξi)
        = y(ti) + h y′(ti) + (h²/2) y″(ξi)
        = y(ti) + h f(ti, y(ti)) + (h²/2) y″(ξi).   (7.5)
Euler's method constructs wi ≈ y(ti), for each i = 1, 2, . . . , N, by dropping the remainder term. Thus, it is given by
w0 = α,
wi+1 = wi + h f(ti, wi), for each i = 0, 1, . . . , N − 1.   (7.6)
This last equation is called the difference equation associated with Euler's method.
Example: For the initial-value problem y′ = y − t² + 1, 0 ≤ t ≤ 2, y(0) = 0.5, Euler's method with h = 0.2 gives
w0 = 0.5,
wi+1 = wi + 0.2 (wi − ti² + 1), for each i = 0, 1, . . . , 9.
ti     wi
0.0   0.50000
0.2   0.80000
0.4   1.15200
0.6   1.55040
...    ...
2.0   4.86580
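The table is reproduced by a direct Maple loop (a sketch):
> f := (t, w) -> w - t^2 + 1:
> w := 0.5: t := 0.0: h := 0.2:
> for i from 1 to 10 do
>   w := w + h*f(t, w);
>   t := t + h;
>   printf("%4.1f  %8.5f\n", t, w);
> end do: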
Theorem
Suppose that f is continuous and satisfies a Lipschitz condition with constant L on D = {(t, y) | a ≤ t ≤ b, y ∈ R}, and that a constant M exists with |y″(t)| ≤ M for all t ∈ [a, b], where y(t) denotes the unique solution of the initial-value problem y′ = f(t, y), a ≤ t ≤ b, y(a) = α. Let wi, i = 0, . . . , N, be the approximations generated by Euler's method for some positive integer N. Then, for each i,
|y(ti) − wi| ≤ (hM/2L)[e^{L(ti − a)} − 1].   (7.10)
Theorem
Assume that the hypotheses of the previous theorem hold, and let ui, i = 0, . . . , N, be the approximations obtained from
u0 = α + δ0,
ui+1 = ui + h f(ti, ui) + δi+1,   (7.11)
where the perturbations satisfy |δi| ≤ δ. Then
|y(ti) − ui| ≤ (1/L)(hM/2 + δ/h)[e^{L(ti − a)} − 1] + |δ0| e^{L(ti − a)}.   (7.12)
This error bound is no longer linear in h; in fact, it goes to infinity as h goes to zero:
lim_{h→0⁺} (hM/2 + δ/h) = ∞.   (7.13)
It can be shown that the minimum value of the error occurs when
h = √(2δ/M).   (7.14)
Example (Exercise 9):
Given the initial-value problem
y′ = (2/t) y + t² eᵗ, 1 ≤ t ≤ 2, y(1) = 0,
with the exact solution y(t) = t²(eᵗ − e), Euler's method with h = 0.1 gives
t w(t) y(t) y(t) − w(t)
1. 0 0. 0.
1.1 0.2718281828 0.345919877 0.0740916942
1.2 0.6847555777 0.866642537 0.1818869593
1.3 1.276978344 1.607215080 0.330236736
1.4 2.093547688 2.620359552 0.526811864
1.5 3.187445123 3.967666297 0.780221174
1.6 4.620817847 5.720961530 1.100143683
1.7 6.466396379 7.963873477 1.497477098
1.8 8.809119690 10.79362466 1.984504970
1.9 11.74799654 14.32308154 2.57508500
2.0 15.39823565 18.68309709 3.28486144
A linear interpolation to approximate y(1.04) can be found as follows. Using x0 = 1 and x1 = 1.1 and the values w(x0) and w(x1), we get the Lagrange polynomial
P(x) = 2.718281828 x − 2.718281828.
Thus, P(1.04) = 0.108731273. The exact value is y(1.04) = 0.119987497, and the error is 0.011256224.
Example:
We want to approximate the solution of the initial-value problem
y′ = t² + y², 0 ≤ t ≤ 0.4, y(0) = 0,
with h = 0.2 using Taylor's method of order 4.
We calculate the derivatives
y′ = t2 + y 2
y ′′ = 2t + 2yy ′
y ′′′ = 2 + 2y ′2 + 2yy ′′
y (4) = 6y ′ y ′′ + 2yy ′′′
we find
y(0) = 0.0
y(0.2) = 0.00266666667
y(0.4) = 0.02135325469
If we use step size h = 0.4 we get y(0.4) = 0.02133333333. The correct answer is y(0.4) = 0.021359.
It shows that the accuracy has been improved by using subintervals, i.e., decreasing the step size.
Runge-Kutta methods refer to a family of one-step methods. They are all based on the general form of the extrapolation equation
yi+1 = yi + m h,
where m is a slope that is a weighted average of the slopes at various points in the interval. If we estimate m using slopes at r points in the interval (ti, ti+1), then m can be written as m = w1m1 + w2m2 + · · · + wrmr, where the wi are the weights of the slopes at the various points.
Runge-Kutta methods are known by their order. For instance, a Runge-Kutta method is called the
r-order Runge-Kutta method when slopes at r points are used to construct the weighted average
slope m. In Euler’s method we use only one point slope at (ti , yi ) to estimate yi+1 , and therefore,
Euler’s method is a first-order Runge-Kutta method.
The second-order Runge-Kutta methods take the form
yi+1 = yi + (w1m1 + w2m2)h,   (7.20)
with
m1 = f(ti, yi),   (7.21)
m2 = f(ti + a1h, yi + b1m1h).   (7.22)
To analyze them, we use the second Taylor polynomial in two variables for the function f(t, y) near the point (ti, yi).
The weights w1 and w2 and the constants a1 and b1 are to be determined. The principle of the Runge-Kutta method is that these parameters are chosen such that the power series expansion of the right side of eq. (7.20) agrees with the Taylor series expansion of yi+1 in terms of yi and f(ti, yi).
The second-order Taylor series expansion of yi+1 about yi is given by
yi+1 = yi + y′h + (y″/2)h².   (7.23)
We know that
y′i = f(ti, yi),
y″i = dy′/dt = ∂f/∂t + (∂f/∂y) f(ti, yi),
so we get
yi+1 = yi + f h + (h²/2)(ft + fy f).
Now consider the right side of eq. (7.20). Using the Taylor series in two variables we can write
yi+1 = yi + (w1m1 + w2m2)h
     = yi + [w1 f + w2 f(ti + a1h, yi + b1m1h)]h
     = yi + [w1 f + w2 (f + a1h ft + b1m1h fy + O(h²))]h
     = yi + (w1 + w2)h f + w2(a1 ft + b1m1 fy)h² + O(h³).
Matching this with the Taylor expansion yi+1 = yi + f h + (h²/2)(ft + fy f), we find
w1 + w2 = 1, w2a1 = w2b1 = 1/2.
Note that we have only three equations but four unknowns, so this set of equations has no unique solution. In all the methods below, the index runs over i = 0, 1, . . . , N − 1.
If we choose w1 = 0, w2 = 1 and a1 = b1 = 1/2, we get the Midpoint Method:
m1 = f(ti, yi)
m2 = f(ti + h/2, yi + (m1/2)h)
yi+1 = yi + m2h
If we choose w1 = w2 = 1/2 and a1 = b1 = 1, we get what we call the Modified Euler Method:
m1 = f(ti, yi)
m2 = f(ti + h, yi + m1h)
yi+1 = yi + (1/2)(m1 + m2)h
If we choose w1 = 1/4, w2 = 3/4 and a1 = b1 = 2/3, we get what we call Heun's Method:
m1 = f(ti, yi)
m2 = f(ti + (2/3)h, yi + (2/3)m1h)
yi+1 = yi + (1/4)(m1 + 3m2)h
The derivation of the Runge-Kutta method of order four is too long, so we just give it here without details:
m1 = f(ti, yi)
m2 = f(ti + h/2, yi + (1/2)m1h)
m3 = f(ti + h/2, yi + (1/2)m2h)   (7.24)
m4 = f(ti + h, yi + m3h)
yi+1 = yi + (1/6)(m1 + 2m2 + 2m3 + m4)h
Examples:
We want to approximate the solution of the initial-value problem
y′ = t e³ᵗ − 2y, 0 ≤ t ≤ 1, y(0) = 0,
using the modified Euler method with h = 0.5. Here
N = (1 − 0)/h = 2,
y0 = 0,
ti = 0 + ih, i = 0, 1, . . . , N − 1,
and
m1 = f(ti, yi)
m2 = f(ti + h, yi + m1h)
yi+1 = yi + (h/2)(m1 + m2).
We get
ti    wi
0.0   0.0
0.5   0.560211134
1.0   5.301489796
Example:
We want to show that the midpoint method, the modified Euler method, and Heun's method all give the same approximations to the initial-value problem
y′ = −y + t + 1, 0 ≤ t ≤ 1, y(0) = 1.
For this equation ft = 1 and fy = −1, so ft + fy f = 1 − f = 1 − m1, and for any second-order method with w1 + w2 = 1 and w2a1 = w2b1 = 1/2,
yi+1 = yi + f h + w2a1(ft + fy f)h² = yi + f h + (h²/2)(1 − m1).
– Midpoint method: w2 = 1, a1 = b1 = 1/2, giving yi+1 = yi + f h + (h²/2)(1 − m1).
– Modified Euler method: w2 = 1/2, a1 = b1 = 1, giving yi+1 = yi + f h + (h²/2)(1 − m1).
– Heun's method: w2 = 3/4, a1 = b1 = 2/3, giving yi+1 = yi + f h + (h²/2)(1 − m1).
Therefore, all three methods give the same approximations.
Example:
We want to find y(0.2) using the Runge-Kutta fourth-order method with h = 0.2 for
y′ = 1 + y², y(0) = 0.
m1 = f (t0 , y0 ) = 1
m2 = f (t0 + h/2, y0 + hm1 /2) = 1.01000000
m3 = f (t0 + h/2, y0 + hm2 /2) = 1.01020100
m4 = f (t0 + h, y0 + m3 h) = 1.040820242
y(0.2) = y0 + h(m1 + 2m2 + 2m3 + m4 )/6 = 0.2027074080
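The same computation as a Maple sketch:
> f := (t, y) -> 1 + y^2:
> t0 := 0.0: y0 := 0.0: h := 0.2:
> m1 := f(t0, y0):
> m2 := f(t0 + h/2, y0 + h*m1/2):
> m3 := f(t0 + h/2, y0 + h*m2/2):
> m4 := f(t0 + h, y0 + h*m3):
> y0 + h*(m1 + 2*m2 + 2*m3 + m4)/6;    # approx. 0.2027074081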
7.6 Predictor-Corrector Methods
The previous methods (Euler, Heun, Taylor, and Runge-Kutta) are called one-step methods because only the value at the beginning of each step is required: they use information from one previous point to compute the successive point, that is, yi is needed to compute yi+1. Multistep methods make use of information about the solution at more than one previous point. A desirable feature of a multistep method is that the local truncation error can be determined and a correction term can be included, which improves the accuracy of the answer at each step.
Definition: An m-step multistep method for solving the initial-value problem
y′ = f(t, y), a ≤ t ≤ b, y(a) = α,   (7.25)
has a difference equation for finding the approximation wi+1 at the mesh point ti+1, represented by the following equation, where the integer m > 1:
wi+1 = am−1wi + am−2wi−1 + · · · + a0wi+1−m + h[bm f(ti+1, wi+1) + bm−1 f(ti, wi) + · · · + b0 f(ti+1−m, wi+1−m)],   (7.26)
for i = m − 1, . . . , N − 1, where h = (b − a)/N, the a0, a1, . . . , am−1 and b0, . . . , bm are constants, and the starting values are w0 = α, w1 = α1, . . . , wm−1 = αm−1. When bm = 0 the method is called explicit, or open, since in eq. (7.26) wi+1 is explicitly given in terms of previously determined values. When bm ≠ 0 the method is called implicit, or closed, since wi+1 occurs on both sides. Euler's method gives
yi+1 = yi + h f(t0 + ih, yi), i = 0, 1, . . .   (7.27)
The modified Euler method can be written as
yi+1 = yi + (h/2)[f(ti, yi) + f(ti+1, yi+1)].   (7.28)
The value of yi+1 is first estimated by Euler's method, eq. (7.27), and then used in the right-hand side of the modified Euler method, eq. (7.28), giving a better approximation of yi+1. The value of yi+1 is again substituted in eq. (7.28) to find a still better approximation. This procedure is repeated until two consecutive iterated values of yi+1 agree. Equation (7.27) is therefore called the predictor, while eq. (7.28) is called the corrector.
We will describe only the multistep method called the Adams-Bashforth-Moulton method. It can be derived from the fundamental theorem of calculus:
yi+1 = yi + ∫_{ti}^{ti+1} f(t, y) dt.   (7.29)
The predictor uses the Lagrange polynomial approximation for f(t, y) based on the four points ti−3, ti−2, ti−1, and ti. Writing fj = f(tj, yj),
P4(t) = fi−3 (t − ti−2)(t − ti−1)(t − ti) / [(ti−3 − ti−2)(ti−3 − ti−1)(ti−3 − ti)]
      + fi−2 (t − ti−3)(t − ti−1)(t − ti) / [(ti−2 − ti−3)(ti−2 − ti−1)(ti−2 − ti)]
      + fi−1 (t − ti−3)(t − ti−2)(t − ti) / [(ti−1 − ti−3)(ti−1 − ti−2)(ti−1 − ti)]
      + fi (t − ti−3)(t − ti−2)(t − ti−1) / [(ti − ti−3)(ti − ti−2)(ti − ti−1)].
It is integrated over the interval [ti, ti+1]:
∫_{ti}^{ti+1} P4(t) dt = (h/24)(55fi − 59fi−1 + 37fi−2 − 9fi−3).   (7.30)
The corrector is developed in a similar way. A second Lagrange polynomial for f(t, y) is constructed based on the four points (ti−2, yi−2), (ti−1, yi−1), (ti, yi), and the new point (ti+1, y^{(0)}_{i+1}) just calculated by the predictor (7.34).
Algorithm
– COMPUTE the predictor
y^{(0)}_{i+1} = yi + (h/24)[55fi − 59fi−1 + 37fi−2 − 9fi−3]   (7.34)
– COMPUTE f^{(0)}_{i+1} = f(ti+1, y^{(0)}_{i+1})
– COMPUTE y^{(k)}_{i+1} from the corrector equation
y^{(k)}_{i+1} = yi + (h/24)[9f(ti+1, y^{(k−1)}_{i+1}) + 19fi − 5fi−1 + fi−2]   (7.35)
– ITERATE ON k UNTIL
|y^{(k)}_{i+1} − y^{(k−1)}_{i+1}| / |y^{(k)}_{i+1}| < ε   (7.36)
Example:
Consider the initial-value problem
y′ = 1 + y², 0 ≤ t ≤ 0.8, y(0) = 0.
The first step is to calculate the four starting values w0, w1, w2, and w3. To do this we can use, for example, the fourth-order Runge-Kutta method with t0 = 0, w0 = 0, and h = 0.2:
k1 = h f(t0, y0)
k2 = h f(t0 + h/2, y0 + k1/2)
k3 = h f(t0 + h/2, y0 + k2/2)
k4 = h f(t0 + h, y0 + k3)
w1 = w0 + (k1 + 2k2 + 2k3 + k4)/6 = 0.2027074081
In the same way we get
w2 = 0.4227889928
w3 = 0.6841334020
With t1 = 0.2, t2 = 0.4, t3 = 0.6, t4 = 0.8, the predictor gives
w4^{(0)} = w3 + (h/24)[55f(t3, w3) − 59f(t2, w2) + 37f(t1, w1) − 9f(t0, w0)] = 1.023434882.
Now we correct the predicted value using the corrector formula:
w4^{(1)} = w3 + (h/24)[9f(t4, w4^{(0)}) + 19f(t3, w3) − 5f(t2, w2) + f(t1, w1)] = 1.029690402
w4^{(2)} = w3 + (h/24)[9f(t4, w4^{(1)}) + 19f(t3, w3) − 5f(t2, w2) + f(t1, w1)] = 1.030653654
w4^{(3)} = w3 + (h/24)[9f(t4, w4^{(2)}) + 19f(t3, w3) − 5f(t2, w2) + f(t1, w1)] = 1.030653654
So the predictor-corrector method gives the approximate solution 1.030653654. The actual solution of the ODE is y(t) = tan(t), so y(0.8) = 1.029638557. The errors are
|w4^{(0)} − tan(0.8)| = 0.006203675,
|w4^{(3)} − tan(0.8)| = 0.001015097.
Chapter 8
Systems of Linear Equations
A linear system may have no solution, or no unique solution, or it may be ill conditioned. A 2 × 2 system is ill conditioned when the two equations represent two nearly parallel lines.
Definition: An n × m matrix can be represented by
A = (aij) =
  [ a11  a12  . . .  a1m ]
  [ a21  a22  . . .  a2m ]
  [  .    .           .  ]
  [ an1  an2  . . .  anm ]
Linear equations can be represented by a matrix, and solving a linear system can be done using three main row operations:
– Row Ei can be multiplied by any nonzero constant λ: (λEi) → (Ei).
– Row Ej multiplied by any constant λ can be added to row Ei: (Ei + λEj) → (Ei).
– Rows Ei and Ej can be interchanged: (Ei) ↔ (Ej).
Example: consider the augmented matrix
   1   1   0   3 |   4
   2   1  −1   1 |   1
   3  −1  −1   2 |  −3
  −1   2   3  −1 |   4
After eliminating the first column it becomes
   1   1   0   3 |   4
   0  −1  −1  −5 |  −7
   0  −4  −1  −7 | −15
   0   3   3   2 |   8
and after eliminating the second column,
   1   1   0    3 |   4
   0  −1  −1   −5 |  −7
   0   0   3   13 |  13
   0   0   0  −13 | −13
The matrix is now triangular, so the linear system can be solved by a backward-substitution process: x4 = 1, then 3x3 + 13 = 13 gives x3 = 0, −x2 − 0 − 5 = −7 gives x2 = 2, and x1 + 2 + 0 + 3 = 4 gives x1 = −1.
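The same solution can be checked with Maple's LinearAlgebra package (a sketch):
> with(LinearAlgebra):
> A := Matrix([[1, 1, 0, 3], [2, 1, -1, 1], [3, -1, -1, 2], [-1, 2, 3, -1]]):
> b := Vector([4, 1, -3, 4]):
> LinearSolve(A, b);    # returns the column vector (-1, 2, 0, 1)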
If the final row of the reduced matrix has the form
0 0 . . . 0 | 0,
the system does not have a unique solution; if it has the form 0 0 . . . 0 | α with α ≠ 0, the system has no solution.
Chapter 9
Chapter 10
Another relationship suggests an interval containing the roots: all real roots are within the interval
[ −1 − (1/|an|) Max{|a0|, . . . , |an−1|},  1 + (1/|an|) Max{|a0|, . . . , |an−1|} ].   (10.4)
10.3 Convergence of False Position Method
The false position formula is based on the linear interpolation model. One of the starting points is fixed
while the other moves towards the solution. Assume that the initial points bracketing the solution are a
and b and that a moves towards the solution and b is fixed.
Let p0 = a and let p be the solution. Then
E0 = p − p0, E1 = p − p1,   (10.7)
that is,
Ei = p − pi.   (10.8)
It can be shown that..
For Newton's method,
pn+1 = pn − f(pn)/f′(pn)  ⟹  f(pn) = (pn − pn+1) f′(pn),   (10.11)
0 = (p − pn+1) f′(pn) + (p − pn)² f″(ξ)/2.   (10.12)
The errors in the estimates are
En+1 = p − pn+1,
En = p − pn.
Newton's method may fail:
1. If f′(pn) = 0.
2. If the initial guess is too far away from the required root, the process may converge to some other root.
3. A particular value in the iteration sequence may repeat, resulting in an infinite loop. This occurs when the tangent to the curve f(x) at pn+1 cuts the x-axis again at pn.
This means that
(α + 1)/α = α  ⟹  α = (1 ± √5)/2.   (10.23)
Since α must be positive, the order of convergence of the secant method is α = (1 + √5)/2 ≈ 1.618, and the convergence is referred to as superlinear convergence.
Chapter 11
Exams
11.1 exam 1
Answer all questions.
1. Let x = 0.456 × 10⁻², y = 0.134, and z = 0.920.
Use three-digit rounding arithmetic to evaluate:
(a) (x + y) + z.
(b) x + (y + z).
(10 Marks)
(6 Marks)
3. We want to evaluate the square root of 5 using the equation x² − 5 = 0 by applying the fixed-point iteration algorithm.
(a) Use algebraic manipulation to show that g(x) = (1/2)x + 5/(2x) has a fixed point exactly where x² − 5 = 0.
(b) Use the fixed-point theorem to show that the function g(x) converges to the unique fixed point for any initial p0 ∈ [2, 5].
(c) Use p0 = 3 to evaluate p2.
(8 Marks)
4. (a) Evaluate exactly the integral ∫_0^4 eˣ dx.
(b) Find an approximation to ∫_0^4 eˣ dx using Simpson's rule with h = 2.
(c) Find an approximation to ∫_0^4 eˣ dx using the composite Simpson's rule with h = 1.
(d) Does the composite Simpson's rule improve the approximation?
(8 Marks)
5. Given the equation y′ = 3x² + 1, with y(1) = 2, estimate y(2) by Euler's method using h = 0.25.
(8 Marks)
END
11.2 exam 2
Answer all questions.
1. (a) Evaluate f(x) = x³ − 6.1x² + 3.2x + 1.5 at x = 4.71 using three-digit rounding arithmetic.
(b) Find the relative error in (a).
(c) Use Horner's method to evaluate f(x) at x = 4.71 using three-digit rounding arithmetic.
(d) Find the relative error in (c).
(e) Why is the relative error in (c) less than the relative error in (a)?
(10 Marks)
2. Let f (x) = −x3 − cos x and p0 = −1. Use Newton’s method to find p2 .
(6 Marks)
3. (a) Let A be a given positive constant and g(x) = 2x − Ax². Show that if the fixed-point iteration converges to a nonzero limit, then the limit is p = 1/A, so the reciprocal of a number can be found using only multiplications and subtractions.
(b) Use fixed-point iteration with p0 = 0.1 to find p2, which approximates 1/11.
(8 Marks)
4. Use the forward-difference and backward-difference formulas to determine each of the missing entries in the following table:
x f (x) f ′ (x)
1.0 1.0000 ....
1.2 1.2625 ....
1.4 1.6595 ....
(8 Marks)
5. Use Euler’s method to approximate the solution for the following initial-value problem.
y ′ = et−y , 0 ≤ t ≤ 1, y(0) = 1 with h = 0.5.
(8 Marks)
END
Useful Formulas and Theorems
Horner's method: To evaluate P(x) = anxⁿ + · · · + a1x + a0 at x0, set bn = an and bk = ak + bk+1x0 for k = n − 1, . . . , 0; then b0 = P(x0).
Fixed-Point Theorem:
Let g ∈ C[a, b] be such that g(x) ∈ [a, b] for all x in [a, b]. Suppose, in addition, that g′ exists on (a, b) and that a constant 0 < k < 1 exists with |g′(x)| ≤ k for all x ∈ (a, b). Then, for any number p0 in [a, b], the sequence defined by
pn = g(pn−1), n ≥ 1,
converges to the unique fixed point p in [a, b].
Euler’s method:
To approximate the solution of the initial-value problem dy/dt = f(t, y), a ≤ t ≤ b, y(a) = α, at (N + 1) equally spaced numbers in the interval [a, b], we construct approximations wi ≈ y(ti) for i = 0, 1, . . . , N using
w0 = α,
t0 = a,
wi+1 = wi + hf (ti , wi ),
ti = a + ih,
where h = (b − a)/N
11.3 exam 3
Answer all questions.
1. Let
f(x) = (eˣ − e⁻ˣ)/x.
(a) Find lim_{x→0} f(x).
(8 Marks)
(8 Marks)
3. Use the forward-difference and backward-difference formulas to determine each of the missing entry
in the following table
x f (x) f ′ (x)
1.0 1.0000 ....
1.2 1.2625 ....
1.4 1.6595 ....
(8 Marks)
4. (a) Find the actual value of ∫_0^2 1/(x + 4) dx.
(b) Use the Trapezoidal rule to approximate ∫_0^2 1/(x + 4) dx, and find the actual error.
(c) Determine the values of n and h required for the composite Trapezoidal rule to approximate ∫_0^2 1/(x + 4) dx to within 10⁻⁶.
(8 Marks)
5. Use Euler’s method to approximate the solution for the following initial-value problem.
y ′ = et−y , 0 ≤ t ≤ 1, y(0) = 1 with h = 0.5.
(8 Marks)
END
Simpson's rule
With nodes at x0 = a, x1 = a + h, x2 = b, where h = (b − a)/2, Simpson's rule is
∫_{x0}^{x2} f(x) dx = (h/3)[f(x0) + 4f(x1) + f(x2)] − (h⁵/90) f⁽⁴⁾(ξ).   (11.6)
Euler's method
To approximate the solution of the initial-value problem dy/dt = f(t, y), a ≤ t ≤ b, y(a) = α, at (N + 1) equally spaced numbers in the interval [a, b], we construct approximations wi ≈ y(ti) for i = 0, 1, . . . , N using
w0 = α, t0 = a,
wi+1 = wi + h f(ti, wi), ti = a + ih,
where h = (b − a)/N.