Error Analysis Lectures 18 19
Rounding Errors
Let $N_{\min}$ and $N_{\max}$ be the floating point numbers of smallest and largest
magnitude in a finite precision system.
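For any normalized number $x$ (not necessarily a floating point number in the
system) such that $N_{\min} \le |x| \le N_{\max}$,
\[ \mathrm{fl}(x) = x(1 + \epsilon), \qquad |\epsilon| \le u, \]
where $u$ is the unit roundoff. Likewise, if $x$ and $y$ are floating point numbers
such that $N_{\min} \le |x \,\mathrm{op}\, y| \le N_{\max}$, then
\[ \mathrm{fl}(x \,\mathrm{op}\, y) = (x \,\mathrm{op}\, y)(1 + \delta), \qquad |\delta| \le u. \]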
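The rounding bound can be checked numerically. The sketch below assumes IEEE double precision with round to nearest (Python's `float`), so that $u = 2^{-53}$; the helper name `relative_rounding_error` is hypothetical and exact rational arithmetic is used only to measure the error.

```python
import sys
from fractions import Fraction

# Unit roundoff for IEEE double precision, round to nearest (assumption).
U = Fraction(1, 2**53)

def relative_rounding_error(x: Fraction) -> Fraction:
    """Return eps with fl(x) = x * (1 + eps), computed exactly (x must be nonzero)."""
    fl_x = Fraction(float(x))  # float(x) rounds x to the nearest double
    return (fl_x - x) / x

N_MIN = sys.float_info.min  # smallest positive normalized double
N_MAX = sys.float_info.max  # largest finite double

for x in (Fraction(1, 10), Fraction(1, 3), Fraction(22, 7)):
    eps = relative_rounding_error(x)
    in_range = N_MIN <= abs(float(x)) <= N_MAX
    print(float(x), float(eps), in_range, abs(eps) <= U)
```

Numbers that are already floating point numbers, such as 1/2, give a rounding error of exactly zero.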
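Do similar statements hold for normalized numbers $x$ and $y$ that
are not floating point numbers?
Suppose $N_{\min} \le |x|, |y|, |x \,\mathrm{op}\, y| \le N_{\max}$. In such a case, $x$ and
$y$ need to be rounded before performing the operation, and
\[ \mathrm{fl}(x \,\mathrm{op}\, y) = \bigl(\mathrm{fl}(x) \,\mathrm{op}\, \mathrm{fl}(y)\bigr)(1 + \epsilon_3) = \bigl(x(1 + \epsilon_1) \,\mathrm{op}\, y(1 + \epsilon_2)\bigr)(1 + \epsilon_3), \]
where $|\epsilon_j| \le u$, $j = 1, 2, 3$.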
The answer depends on what ‘op’, x and y are.
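To make the three sources of rounding concrete, the sketch below measures them separately: the error from rounding $x$, the error from rounding $y$, and the error from rounding the result of the operation on the rounded inputs. It assumes IEEE double precision with $u = 2^{-53}$; `rounding_terms` is a hypothetical helper, and the inputs are chosen so that no quantity involved is zero.

```python
import operator
from fractions import Fraction

U = Fraction(1, 2**53)  # unit roundoff for IEEE double (assumption)

def rounding_terms(x: Fraction, y: Fraction, op):
    """Return (eps1, eps2, eps3) for fl(x op y) computed from rounded inputs."""
    fx, fy = float(x), float(y)
    eps1 = (Fraction(fx) - x) / x          # rounding x
    eps2 = (Fraction(fy) - y) / y          # rounding y
    exact_on_rounded = op(Fraction(fx), Fraction(fy))
    computed = Fraction(op(fx, fy))        # the hardware operation
    eps3 = (computed - exact_on_rounded) / exact_on_rounded  # rounding the result
    return eps1, eps2, eps3

x, y = Fraction(1, 3), Fraction(1, 7)
for name, op in (("*", operator.mul), ("/", operator.truediv),
                 ("+", operator.add), ("-", operator.sub)):
    terms = rounding_terms(x, y, op)
    print(name, [float(e / U) for e in terms])  # each entry lies in [-1, 1]
```

Each individual term is bounded by $u$ whatever the operation is; what differs between operations is how the first two terms propagate into the final result.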
In the following,
$O(u)$ → quantities whose absolute values are small multiples of the unit roundoff $u$;
$O(u^2)$ → quantities whose absolute values are small multiples of $u^2$ and can be
ignored.
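Let $\epsilon_1$ and $\epsilon_2$ be the relative errors from rounding $x$ and $y$, and $\epsilon_3$ the
relative error from rounding the computed result, each at most $u$ in absolute value.
If op = × or op = /, then $(1 + \epsilon_1)(1 + \epsilon_2)^{\pm 1}(1 + \epsilon_3) = 1 + \epsilon_1 \pm \epsilon_2 + \epsilon_3 + O(u^2)$, so
\[ \mathrm{fl}(x \,\mathrm{op}\, y) = (x \,\mathrm{op}\, y)\bigl(1 + O(u)\bigr). \]
If op = + or op = −, then, up to the factor $(1 + \epsilon_3)$ from rounding the computed result,
\[ \mathrm{fl}(x \,\mathrm{op}\, y) = (x \,\mathrm{op}\, y)\left(1 + \frac{x\,\epsilon_1}{x \,\mathrm{op}\, y} \,\mathrm{op}\, \frac{y\,\epsilon_2}{x \,\mathrm{op}\, y}\right). \qquad (3) \]
The ratios $x/(x \,\mathrm{op}\, y)$ and $y/(x \,\mathrm{op}\, y)$ can be arbitrarily large when $x \,\mathrm{op}\, y$ is
close to zero, so the relative error in this case need not be $O(u)$.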
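The difference between the two cases can be seen on a small example. The sketch below assumes IEEE double precision with $u = 2^{-53}$; `relative_error` is a hypothetical helper, and the particular values of `x` and `y` are chosen for illustration so that they agree to about 24 bits.

```python
import operator
from fractions import Fraction

U = Fraction(1, 2**53)  # unit roundoff for IEEE double (assumption)

def relative_error(x: Fraction, y: Fraction, op) -> Fraction:
    """Relative error of op applied to the rounded inputs, measured exactly."""
    computed = Fraction(op(float(x), float(y)))
    exact = op(x, y)
    return abs(computed - exact) / abs(exact)

x = Fraction(1, 3)               # not a floating point number
y = Fraction(5592405, 2**24)     # a floating point number close to 1/3

err_mul = relative_error(x, y, operator.mul)
err_sub = relative_error(x, y, operator.sub)
amplification = abs(x) / abs(x - y)   # the ratio |x / (x - y)|

print("multiplication error in units of u:", float(err_mul / U))
print("subtraction error in units of u:   ", float(err_sub / U))
print("amplification factor |x/(x-y)|:    ", float(amplification))
```

The multiplication error stays within a few multiples of $u$, while the subtraction error is larger by roughly the amplification factor, because the rounding error already present in $x$ is divided by the small quantity $x - y$.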
x              $10^{20}\,(1 - \sqrt{1 - x^2})$    $10^{20}\,x^2/(1 + \sqrt{1 - x^2})$
5.3105e-007    14100000                           14101000
4.7895e-007    11469000                           11470000
4.2684e-007    9114900                            9109700
3.7474e-007    7027700                            7021400
3.2263e-007    5206900                            5204600
2.7053e-007    3663700                            3659200
2.1842e-007    2387000                            2385400
1.6632e-007    1387800                            1383000
1.1421e-007    655030                             652200
6.2105e-008    199840                             192850
1e-008         11102                              5000
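The two columns evaluate the same quantity, since $1 - \sqrt{1 - x^2} = x^2/(1 + \sqrt{1 - x^2})$ after multiplying and dividing by $1 + \sqrt{1 - x^2}$. A minimal sketch that reproduces the comparison is given below; it assumes IEEE double precision, and the function names `naive` and `stable` are hypothetical labels for the two formulas.

```python
import math

def naive(x: float) -> float:
    """1 - sqrt(1 - x^2): subtracts two nearly equal numbers when x is small."""
    return 1.0 - math.sqrt(1.0 - x * x)

def stable(x: float) -> float:
    """Algebraically equivalent form that avoids the subtraction."""
    return x * x / (1.0 + math.sqrt(1.0 - x * x))

for x in (5.3105e-07, 2.7053e-07, 6.2105e-08, 1e-08):
    print(f"{x:.4e}  {1e20 * naive(x):14.5g}  {1e20 * stable(x):14.5g}")
```

For $x = 10^{-8}$ the true value of $10^{20}(1 - \sqrt{1 - x^2})$ is very close to $10^{20} x^2 / 2 = 5000$. The first formula returns about 11102 because $1 - x^2$ rounds to the floating point number just below 1, and the subtraction then exposes that rounding error.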
Swamping → Catastrophic Cancellation
If $0 < \mathrm{fl}(a) \ll \mathrm{fl}(b)$, or $\mathrm{fl}(b) \ll 0 < \mathrm{fl}(a)$, then $\mathrm{fl}(a + b) \approx \mathrm{fl}(b)$.
This is called swamping.
Swamping can lead to catastrophic cancellation.
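A small illustration of how swamping feeds into cancellation is sketched below, assuming IEEE double precision with round to nearest, ties to even; the values `big` and `small` are chosen for illustration only.

```python
big = 1e16     # exactly representable; neighbouring doubles are 2 apart
small = 1.0

swamped = big + small          # small is lost entirely in the sum
print(swamped == big)          # swamping: fl(a + b) equals fl(b)

# Subtracting big afterwards cancels the leading digits and leaves
# only the rounding error: the exact answer is 1, the computed one is 0.
print((big + small) - big)

# Reordering so that the large terms cancel first keeps the small term.
print((big - big) + small)
```

The subtraction itself is exact here; it only reveals the information that was already lost when `small` was swamped.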
Backward errors in standard arithmetic operations
\[ \frac{|\mathrm{fl}(x \,\mathrm{op}\, y) - (x \,\mathrm{op}\, y)|}{|x \,\mathrm{op}\, y|} \]
→ relative (forward) error in $(x \,\mathrm{op}\, y)$.
In contrast, if
\[ \mathrm{fl}(x \,\mathrm{op}\, y) = \hat{x} \,\mathrm{op}\, \hat{y} \]
for some $\hat{x}$, $\hat{y}$, then
\[ \frac{|\hat{x} - x|}{|x|}, \qquad \frac{|\hat{y} - y|}{|y|} \]
→ relative backward errors in $(x \,\mathrm{op}\, y)$.
For the standard arithmetic operations, $\hat{x}$ and $\hat{y}$ can be chosen so that
\[ \frac{|\hat{x} - x|}{|x|} \quad \text{and} \quad \frac{|\hat{y} - y|}{|y|} \]
are $O(u)$.
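The contrast between forward and backward error can be sketched for addition. The code below assumes IEEE double precision with $u = 2^{-53}$; `errors_for_sum` is a hypothetical helper, and the choice of $\hat{x}$, $\hat{y}$ (the rounded inputs, rescaled by the rounding error of the sum) is one possible choice, not the only one. It requires $x$, $y$, $x + y$ and the sum of the rounded inputs to be nonzero.

```python
from fractions import Fraction

U = Fraction(1, 2**53)  # unit roundoff for IEEE double (assumption)

def errors_for_sum(x: Fraction, y: Fraction):
    """Return (forward, backward_x, backward_y) relative errors of fl(x + y)."""
    xf, yf = float(x), float(y)
    computed = Fraction(xf + yf)
    exact = x + y
    forward = abs(computed - exact) / abs(exact)
    # Pick x_hat, y_hat with x_hat + y_hat == computed exactly.
    scale = computed / (Fraction(xf) + Fraction(yf))
    x_hat = Fraction(xf) * scale
    y_hat = Fraction(yf) * scale
    assert x_hat + y_hat == computed
    return forward, abs(x_hat - x) / abs(x), abs(y_hat - y) / abs(y)

x = Fraction(2, 3)                  # not a floating point number
y = -Fraction(11184811, 2**24)      # floating point number close to -2/3

forward, back_x, back_y = errors_for_sum(x, y)
print("forward error in units of u:   ", float(forward / U))
print("backward errors in units of u: ", float(back_x / U), float(back_y / U))
```

Even though cancellation makes the forward error many multiples of $u$, the computed sum is the exact sum of inputs that differ from $x$ and $y$ by relative perturbations of size at most $u$.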