Error Analysis: Lectures 18–19

These lecture notes discuss the rounding errors that occur when performing floating point operations in a finite precision system. They introduce backward error analysis, which assesses the stability of an algorithm by asking whether the computed solution can be viewed as the exact solution of a slightly perturbed problem, rather than by bounding the error at each step of the computation. Backward stable (or simply stable) algorithms are those for which the relative perturbations to the original data are of the order of the unit roundoff. This viewpoint separates the analysis of an algorithm's properties from the sensitivity of the problem to errors in its inputs.


Rounding Errors and Stability of Algorithms

Rounding Errors

Let Nmin, Nmax be the floating point numbers of smallest and largest magnitude in a finite precision system.

For any normalized number x (which is not necessarily a floating point number in the system) such that Nmin ≤ |x| ≤ Nmax,

    fl(x) = x(1 + δ),  |δ| ≤ u.  (∗)

This will be called the fundamental representation/rounding rule.

For floating point numbers x, y such that Nmin ≤ |x op y| ≤ Nmax,

    fl(x op y) = (x op y)(1 + δ),  |δ| ≤ u,  (1)

where 'op' is any of the operations +, −, ×, / and the unit roundoff u ≈ 10⁻¹⁶ in IEEE double precision.

This is desirable as it implies that the maximum relative error in the operation is of the order of the unit roundoff.
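As an illustration (not part of the lecture), a minimal Python sketch can check rule (1) empirically in IEEE double precision, where round to nearest gives u = machine epsilon / 2; exact rational arithmetic via `fractions` measures δ exactly:

```python
# A minimal sketch checking fl(x op y) = (x op y)(1 + delta), |delta| <= u,
# for op = * in IEEE double precision, with u = machine epsilon / 2
# (round to nearest).
import sys
import random
from fractions import Fraction

u = Fraction(sys.float_info.epsilon) / 2     # unit roundoff, ~1.1e-16

for _ in range(10_000):
    x = random.uniform(-1e3, 1e3)
    y = random.uniform(-1e3, 1e3)
    exact = Fraction(x) * Fraction(y)        # exact rational product
    if exact == 0:
        continue
    delta = (Fraction(x * y) - exact) / exact  # x * y is the rounded flop
    assert abs(delta) <= u                   # (1) holds on every trial
print("model (1) verified on 10,000 random products")
```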
Rounding Errors

Do similar statements hold for normalized numbers x and y that are not floating point numbers?

Suppose Nmin ≤ |x|, |y|, |x op y| ≤ Nmax. In such a case, x and y need to be rounded before performing the operation, and

    fl(x op y) = fl(fl(x) op fl(y)) = ((x(1 + ε1)) op (y(1 + ε2)))(1 + ε3),

where |εj| ≤ u, j = 1, 2, 3.

The answer depends on what 'op', x and y are.

In the following:
O(u) → quantities whose absolute values are small multiples of the unit roundoff u;
O(u²) → quantities whose absolute values are small multiples of u² and can be ignored.
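A small sketch of the three-rounding formula above, taking x = 1/3 and y = 1/7 (an arbitrary illustrative choice) as exact rationals that are not machine numbers:

```python
# Sketch of fl(x op y) = fl(fl(x) op fl(y)) when the real inputs are not
# floating point numbers: x = 1/3 and y = 1/7 are kept as exact rationals
# and rounded to double precision before the operation.
from fractions import Fraction

x, y = Fraction(1, 3), Fraction(1, 7)
flx, fly = float(x), float(y)       # fl(x) = x(1 + eps1), fl(y) = y(1 + eps2)
result = flx + fly                  # the sum is rounded once more: (1 + eps3)
exact = x + y
rel_err = abs(Fraction(result) - exact) / exact
print(f"relative error = {float(rel_err):.2e}")  # a small multiple of u
```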
Rounding Errors

If op = × or op = /,

    fl(x op y) = (x op y)(1 + ε),  |ε| ≤ 3u + O(u²).  (2)

If op = + or op = −,

    fl(x op y) = (x op y)(1 + (x ε1)/(x op y) op (y ε2)/(x op y)),  (3)

where |εj| ≤ 2u + O(u²).

However, (x ε1)/(x op y) op (y ε2)/(x op y) need not be O(u) if |x op y| ≪ |x| or |x op y| ≪ |y|.

This is called catastrophic cancellation as it can result in a sudden loss of accuracy.
Catastrophic Cancellation

Example 1: In the finite precision system (10, 3, −1, 2) with round to nearest rounding, fl(100 − 99.95) = 0.

Example 2: The computation z = 1 − √(1 − x²) is prone to catastrophic cancellation for x ≈ 0. The errors magnify when the computed value is multiplied by a large number.

This may be avoided by using the equivalent formulation z = x²/(1 + √(1 − x²)).

    x              10²⁰ (1 − √(1 − x²))    10²⁰ x²/(1 + √(1 − x²))
    5.3105e−007    14100000                14101000
    4.7895e−007    11469000                11470000
    4.2684e−007    9114900                 9109700
    3.7474e−007    7027700                 7021400
    3.2263e−007    5206900                 5204600
    2.7053e−007    3663700                 3659200
    2.1842e−007    2387000                 2385400
    1.6632e−007    1387800                 1383000
    1.1421e−007    655030                  652200
    6.2105e−008    199840                  192850
    1e−008         11102                   5000
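The following sketch reproduces Example 2 in IEEE double precision (the table above was generated in a particular system, so the digits need not match exactly):

```python
# Sketch of Example 2: for x near 0 the direct formula 1 - sqrt(1 - x^2)
# suffers catastrophic cancellation, while the algebraically equivalent
# x^2 / (1 + sqrt(1 - x^2)) remains accurate. True value ~ x^2 / 2.
import math

for x in [5.3105e-07, 1.1421e-07, 1e-08]:
    direct = 1.0 - math.sqrt(1.0 - x * x)            # cancellation here
    stable = x * x / (1.0 + math.sqrt(1.0 - x * x))  # no cancellation
    print(f"x = {x:.4e}:  1e20*direct = {1e20 * direct:.6g},"
          f"  1e20*stable = {1e20 * stable:.6g}")
```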
Swamping → Catastrophic Cancellation

If 0 < fl(a) ≪ fl(b), or fl(b) ≪ 0 < fl(a), then fl(a + b) ≈ fl(b). This is called swamping.

Swamping can lead to catastrophic cancellation.
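A two-line sketch of swamping followed by cancellation in double precision:

```python
# Sketch: the small term is swamped when added to the large one, so the
# subsequent subtraction cancels catastrophically, returning 0.0 not 0.5.
big, small = 1.0e16, 0.5
swamped = big + small        # fl(big + small) == big: `small` is lost
print(swamped - big)         # prints 0.0 instead of 0.5
```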

Exercise: Perform GENP on

    A = [ 0.1   10     9.985
          1     0.05   0.15
          1     0.04   0.19  ]

in the finite precision system (10, 3, −3, 2) with round to nearest as the rounding mode. Identify the computations where
(a) swamping occurs;
(b) catastrophic cancellation occurs.
Find L and U of the computed LU decomposition and compute ‖A − LU‖₁. Repeat the calculations for GEPP.
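As a starting point for the exercise, here is a sketch of GENP (Gaussian elimination with no pivoting) in IEEE double precision. Note it does not answer the exercise: reproducing it requires rounding every intermediate result to the toy system (10, 3, −3, 2), which this sketch deliberately omits.

```python
# Sketch of GENP computing L, U and the residual norm ||A - LU||_1,
# carried out in double precision rather than in the toy system
# (10, 3, -3, 2) of the exercise.
import numpy as np

A = np.array([[0.1, 10.0, 9.985],
              [1.0, 0.05, 0.15],
              [1.0, 0.04, 0.19]])

n = A.shape[0]
L, U = np.eye(n), A.copy()
for k in range(n - 1):
    for i in range(k + 1, n):
        L[i, k] = U[i, k] / U[k, k]      # multiplier
        U[i, k:] -= L[i, k] * U[k, k:]   # eliminate row i below the pivot

print(np.linalg.norm(A - L @ U, 1))      # ~0 in double precision
```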
Backward Error Analysis and Stability of Algorithms

Some background

- In the early years of computing, the approach to understanding the accuracy of solutions produced by algorithms involved bounding the rounding error at every stage of the computation.
- Apart from being practically very difficult in the presence of many computations, this was also not very successful, as the risk of catastrophic cancellation, which was not always possible to predict, loomed over the computations.
- Therefore, there was a general air of pessimism about the extent of errors in solutions computed via algorithms operating in a finite precision environment.
- In the early 1960s, a radically different approach to rounding error analysis was proposed by James Wilkinson.
- His idea was that instead of looking at errors at every stage of the computation, one should view the computed answer as the exact answer of the same algorithm, carried out in exact arithmetic, applied to perturbed data, the perturbations arising from pushing the errors in the computations back into the data.
Backward Errors

- The relative perturbations to the data arising from this pushback are called the backward errors of the computed answer.
- If they are of the order of the unit roundoff, then the algorithm is said to be backward stable, or simply stable.
- Since working in a finite precision environment, and hence incurring rounding errors, is inevitable for algorithms, the most that should be expected of them is that they are backward stable.
- Beyond this, the accuracy of the computed solution depends upon the sensitivity of the problem to small changes in the data.
- The backward error analysis of the algorithm is combined with the sensitivity analysis of the problem to bound the relative errors in the solution (a sketch of this combination follows the list).
- This approach separates the properties of the algorithm from those of the problem.
- To contrast with backward errors, the usual errors in computations are also referred to as forward errors.
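A sketch of how the two analyses combine, using the rule of thumb "relative forward error ≲ condition number × backward error" for a linear solve. The Vandermonde matrix is an arbitrary moderately ill-conditioned example, not from the lecture, and the backward error is measured with the standard residual-based formula ‖r‖/(‖A‖‖x̂‖ + ‖b‖):

```python
# Sketch of "forward error <~ condition number * backward error" for a
# linear solve; the Vandermonde matrix is an arbitrary ill-conditioned
# example, not from the lecture.
import numpy as np

A = np.vander(np.linspace(1.0, 2.0, 8))
x_true = np.ones(8)
b = A @ x_true
xhat = np.linalg.solve(A, b)                 # computed solution

fwd = np.linalg.norm(xhat - x_true, np.inf) / np.linalg.norm(x_true, np.inf)
r = b - A @ xhat
bwd = np.linalg.norm(r, np.inf) / (np.linalg.norm(A, np.inf)
                                   * np.linalg.norm(xhat, np.inf)
                                   + np.linalg.norm(b, np.inf))
cond = np.linalg.cond(A, np.inf)
print(f"forward {fwd:.1e}  vs  cond*backward {cond * bwd:.1e}")
```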
Backward error analysis

[Portrait: James Hardy Wilkinson, FRS (1919–1986)]

Systematic analysis of backward errors and backward stability was introduced by James Wilkinson.
A posteriori versus a priori backward error analysis of GE

- The analysis of backward stability of an algorithm is an analysis of the backward errors in its computations.
- For Gaussian elimination, one such (a posteriori) analysis was already undertaken: after solving a system of equations, the residual vector associated with the computed solution was used to construct perturbed systems of equations of which the computed solution is an exact solution (a sketch of this construction follows the list).
- This approach depends on the computed solution, and can only say whether the algorithm is backward stable with respect to the given problem.
- To know about the backward stability of an algorithm in general, before using it to solve a problem, an a priori backward error analysis is required.
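A sketch of the a posteriori construction referred to above: the residual is pushed back into A as a rank-one perturbation, so the computed solution solves the perturbed system exactly (the random data here is purely illustrative):

```python
# Sketch: from the residual r = b - A @ xhat, build the perturbation
# dA = r xhat^T / (xhat^T xhat), so that (A + dA) @ xhat = b exactly;
# ||dA|| / ||A|| is then a backward error of xhat.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
b = rng.standard_normal(5)
xhat = np.linalg.solve(A, b)                  # computed solution

r = b - A @ xhat
dA = np.outer(r, xhat) / (xhat @ xhat)        # rank-one pushback of the error
print(np.linalg.norm((A + dA) @ xhat - b))    # ~0: xhat solves perturbed system
print(np.linalg.norm(dA, 2) / np.linalg.norm(A, 2))  # O(u): stable on this problem
```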
A priori backward error analysis

- This analysis is made without actually using the algorithm to solve a problem.
- As an initial step, it estimates the maximum possible backward errors that can arise in the standard arithmetic operations.
- These are then used to estimate the maximum possible backward errors in the computed answer arising from the collective backward errors in the computations.
- The algorithm is declared backward stable if these maximum possible backward errors are O(u).
- As the analysis should hold for any problem that the algorithm is designed to solve, it cannot make assumptions about the input data or the computed solution.
- However, sometimes these backward errors are guaranteed to be of the order of the unit roundoff only under certain conditions. In such cases the algorithm is conditionally backward stable.
- If no such conditions are required for backward stability, then the algorithm is unconditionally backward stable.
Backward errors in standard arithmetic operations

    |fl(x op y) − (x op y)| / |x op y|  →  relative (forward) error in (x op y).

In contrast, if

    fl(x op y) = x̂ op ŷ

for some x̂, ŷ, then

    |x̂ − x| / |x|  and  |ŷ − y| / |y|  →  relative backward errors in (x op y).

The operation op is said to be backward stable if |x̂ − x| / |x| and |ŷ − y| / |y| are O(u).
Backward errors in standard arithmetic operations

Since the relative representation error that arises when rounding normalized numbers with absolute values between Nmin and Nmax is very small, these errors may be ignored in the pushback.

Therefore, in backward error analysis it is assumed that all the standard arithmetic operations occur between floating point numbers.

Theorem. Let x, y be floating point numbers. Whenever op is any of the standard operations +, −, ×, /,

    fl(x op y) = x̂ op ŷ,

where x̂ = x(1 + δ1) and ŷ = y(1 + δ2) with |δj| ≤ u for j = 1, 2.

Hence all the standard arithmetic operations between two floating point numbers are backward stable operations.
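A minimal sketch checking the theorem for op = + with exact rationals; for addition one may take the same perturbation for both operands, since fl(x + y) = (x + y)(1 + δ) = x(1 + δ) + y(1 + δ):

```python
# Sketch checking the theorem for op = +: the same relative perturbation
# d = (fl(x+y) - (x+y)) / (x+y) serves for both operands, and |d| <= u.
import sys
import random
from fractions import Fraction

u = Fraction(sys.float_info.epsilon) / 2

for _ in range(10_000):
    x, y = random.random(), random.random()
    s_exact = Fraction(x) + Fraction(y)        # exact sum of the two floats
    if s_exact == 0:
        continue
    d = (Fraction(x + y) - s_exact) / s_exact  # pushback of the rounding error
    assert abs(d) <= u                         # xhat = x(1+d), yhat = y(1+d)
print("theorem verified on 10,000 random additions")
```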
A fundamental result in backward error analysis

Theorem. Let wi, i = 1, . . . , n, be floating point numbers. Then there exist γi, i = 1, . . . , n, satisfying |γi| ≤ (n − 1)u + O(u²), such that

    fl(∑ wi) = ∑ wi (1 + γi),  (4)

irrespective of the order of summation.

Corollary. Let ui, wi, i = 1, . . . , n, be floating point numbers. Then there exist γi, i = 1, . . . , n, satisfying |γi| ≤ nu + O(u²), such that

    fl(∑ ui wi) = ∑ ui wi (1 + γi),  (5)

irrespective of the order of summation.
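A sketch testing a forward consequence of (4) for recursive summation in double precision. The closed-form bound γ = (n−1)u / (1 − (n−1)u) (Higham's γ_{n−1}, not defined in the lecture) absorbs the O(u²) term of (4), so the check below is rigorous:

```python
# Sketch checking a consequence of (4): for recursive summation,
# |fl(sum wi) - sum wi| <= gamma * sum |wi|, where
# gamma = (n-1)u / (1 - (n-1)u) bounds every |gamma_i| in (4).
import sys
import random
from fractions import Fraction

u = Fraction(sys.float_info.epsilon) / 2
w = [random.uniform(-1.0, 1.0) for _ in range(100)]
n = len(w)

s = 0.0
for wi in w:                  # recursive summation: one rounding per addition
    s += wi

exact = sum(Fraction(wi) for wi in w)
gamma = (n - 1) * u / (1 - (n - 1) * u)
bound = gamma * sum(abs(Fraction(wi)) for wi in w)
assert abs(Fraction(s) - exact) <= bound
print("forward bound implied by (4) holds for this sample")
```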
