04 Float
Floating Point
15-213: Introduction to Computer Systems
4th Lecture, Sep. 10, 2015
Instructors: Randal E. Bryant and David R. O’Hallaron
¢ Observations
§ Divide by 2 by shifting right (unsigned)
§ Multiply by 2 by shifting left
§ Numbers of the form 0.111111…₂ are just below 1.0
  § $1/2 + 1/4 + 1/8 + \dots + 1/2^{i} + \dots \to 1.0$
  § Use notation 1.0 – ε
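The observations above can be sketched in C. This is a minimal illustration, not code from the lecture: the helper names `half`, `twice`, and `below_one` are hypothetical, and the operands are assumed to be unsigned.

```c
/* Illustrative helpers; the names are assumptions, not from the lecture. */

/* Dividing an unsigned value by 2 is a right shift by one bit. */
unsigned half(unsigned x) { return x >> 1; }

/* Multiplying by 2 is a left shift by one bit (the top bit is lost on overflow). */
unsigned twice(unsigned x) { return x << 1; }

/* Sum 1/2 + 1/4 + ... + 1/2^n, i.e. 0.11...1 in binary with n ones.
   For n <= 52 every partial sum is exactly representable as a double. */
double below_one(int n) {
    double sum = 0.0, term = 0.5;
    for (int i = 0; i < n; i++) {
        sum += term;
        term /= 2.0;
    }
    return sum;
}
```

For example, `below_one(3)` is 0.111₂ = 0.875, and `1.0 - below_one(n)` is the ε of that n-bit approximation.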
Representable Numbers
¢ Limitation #1
§ Can only exactly represent numbers of the form $x/2^{k}$
§ Other rational numbers have repeating bit representations
  Value   Representation
  1/3     0.0101010101[01]…₂
  1/5     0.001100110011[0011]…₂
  1/10    0.0001100110011[0011]…₂
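The repeating patterns in the table can be reproduced with a short C sketch that generates the binary fraction digits of p/q by repeated doubling. The function name `frac_bits` and its signature are hypothetical, and the sketch assumes 0 ≤ p < q with q small enough that 2·p does not overflow.

```c
/* Write the first n binary fraction digits of p/q (with 0 <= p < q)
   into out, followed by a terminating '\0'. out must hold n + 1 chars.
   Hypothetical helper for illustration only. */
void frac_bits(unsigned p, unsigned q, int n, char *out) {
    for (int i = 0; i < n; i++) {
        p *= 2;                 /* shift the fraction left by one bit */
        if (p >= q) {           /* a 1 moved past the binary point */
            out[i] = '1';
            p -= q;
        } else {
            out[i] = '0';
        }
    }
    out[n] = '\0';
}
```

A fraction of the form $x/2^{k}$ reaches remainder 0 and the digits end in zeros; for 1/3, 1/5, and 1/10 the remainder cycles, so the digits repeat forever.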
¢ Encoding
§ MSB s is sign bit s
§ exp field encodes E (but is not equal to E)
§ frac field encodes M (but is not equal to M)
  s | exp | frac
Precision options
¢ Single precision: 32 bits
  s (1 bit) | exp (8 bits) | frac (23 bits)
¢ Double precision: 64 bits
  s (1 bit) | exp (11 bits) | frac (52 bits)
¢ Extended precision: 80 bits (Intel only)
  s (1 bit) | exp (15 bits) | frac (63 or 64 bits)
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
Carnegie Mellon
¢ Significand
  M    = 1.1101101101101₂
  frac = 11011011011010000000000₂
¢ Exponent
  E    = 13
  Bias = 127
  Exp  = 140 = 10001100₂
¢ Result:
  0 10001100 11011011011010000000000
  s exp      frac
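The encoding above can be checked in C by pulling the three fields out of a `float`. This is a sketch under the assumption that `float` is a 32-bit IEEE single-precision value; the struct `fields` and the function `float_fields` are hypothetical names. The worked example corresponds to the value 15213.0, since 1.1101101101101₂ × 2^13 = 15213.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the three fields of a single-precision float. */
typedef struct {
    uint32_t s;     /* 1-bit sign */
    uint32_t exp;   /* 8-bit biased exponent */
    uint32_t frac;  /* 23-bit fraction */
} fields;

/* Assumes float is a 32-bit IEEE single-precision value. */
fields float_fields(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* reinterpret the bytes, no conversion */
    fields r;
    r.s    = bits >> 31;
    r.exp  = (bits >> 23) & 0xFF;
    r.frac = bits & 0x7FFFFF;
    return r;
}
```

For `15213.0f` this gives s = 0, exp = 140 (E = 140 − 127 = 13), and frac = 0x6DB400, which is 11011011011010000000000₂.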
Special Values
¢ Condition: exp = 111…1
[Figure: number line, left to right: NaN, −∞, −Normalized, −Denorm, −0, +0, +Denorm, +Normalized, +∞, NaN]
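The regions of the number line can be told apart from the exp and frac fields alone. The sketch below assumes a 32-bit IEEE single-precision `float`; the function names `classify_bits` and `classify` are hypothetical. It relies on the standard IEEE rules that exp = 111…1 with frac = 0 is infinity, exp = 111…1 with frac ≠ 0 is NaN, and exp = 000…0 covers zero and denormalized values.

```c
#include <stdint.h>
#include <string.h>

/* Classify a single-precision bit pattern by its exp and frac fields.
   Hypothetical helper; assumes the IEEE 32-bit layout. */
const char *classify_bits(uint32_t bits) {
    uint32_t exp  = (bits >> 23) & 0xFF;
    uint32_t frac = bits & 0x7FFFFF;
    if (exp == 0xFF)                      /* exp = 111...1: special values */
        return frac == 0 ? "infinity" : "NaN";
    if (exp == 0)                         /* exp = 000...0: zero or denorm */
        return frac == 0 ? "zero" : "denormalized";
    return "normalized";
}

/* Same classification, starting from a float value. */
const char *classify(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return classify_bits(bits);
}
```

The sign bit is ignored, so +0 and −0 both report "zero" and ±∞ both report "infinity".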
s (1 bit) | exp (4 bits) | frac (3 bits)
[Figure: distribution of representable values on number lines from −15 to 15 and from −1 to 1; legend: Denormalized, Normalized, Infinity]
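The values plotted for the 8-bit format can be enumerated with a small decoder. This is a sketch that assumes the usual IEEE conventions applied to a 4-bit exp field: Bias = 2^(4−1) − 1 = 7, exp = 0000 is denormalized with E = 1 − Bias, and exp = 1111 is infinity or NaN. The function name `tiny_to_double` is hypothetical.

```c
#include <math.h>     /* only for the INFINITY and NAN macros */
#include <stdint.h>

/* Decode an 8-bit float: 1 sign bit, 4 exp bits, 3 frac bits, Bias = 7.
   Hypothetical helper following the IEEE conventions stated above. */
double tiny_to_double(uint8_t b) {
    int s    = b >> 7;
    int exp  = (b >> 3) & 0xF;
    int frac = b & 0x7;
    double sign = s ? -1.0 : 1.0;

    if (exp == 0xF)                       /* special values */
        return frac ? NAN : sign * INFINITY;

    int E;
    double M;
    if (exp == 0) {                       /* denormalized: M = 0.frac */
        E = 1 - 7;
        M = frac / 8.0;
    } else {                              /* normalized: M = 1.frac */
        E = exp - 7;
        M = 1.0 + frac / 8.0;
    }

    double v = M;                         /* scale by 2^E without libm */
    for (; E > 0; E--) v *= 2.0;
    for (; E < 0; E++) v /= 2.0;
    return sign * v;
}
```

Looping `b` from 0 to 255 and printing `tiny_to_double(b)` shows the dense spacing near zero and the widening gaps toward ±240, the largest finite magnitude.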
¢ $x \times_f y = \mathrm{Round}(x \times y)$
Rounding
¢ Rounding Modes (illustrated with dollar rounding)
¢ Examples
§ Round to nearest 1/4 (2 bits right of binary point)
  Value    Binary      Rounded   Action                   Rounded Value
  2 3/32   10.00011₂   10.00₂    (< 1/2: down)            2
  2 3/16   10.00110₂   10.01₂    (> 1/2: up)              2 1/4
  2 7/8    10.11100₂   11.00₂    (= 1/2: up, to even)     3
  2 5/8    10.10100₂   10.10₂    (= 1/2: down, to even)   2 1/2
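The four rows above can be reproduced with integer arithmetic. The sketch below is an illustration with a hypothetical function name, `round_quarters`: it takes a value expressed in 32nds (so 2 3/32 is 67) and returns the rounded result in quarters, breaking exact ties toward the even quarter count.

```c
/* Round n32/32 to the nearest multiple of 1/4, ties to even.
   Returns the result as a count of quarters. Hypothetical helper. */
unsigned round_quarters(unsigned n32) {
    unsigned q = n32 / 8;    /* quarters kept (truncated) */
    unsigned r = n32 % 8;    /* removed part, in 32nds; half way is 4 */
    if (r > 4 || (r == 4 && (q & 1)))
        q++;                 /* above half, or a tie with an odd last bit */
    return q;
}
```

With this encoding, 2 7/8 is 92/32 and rounds to 12 quarters (3), while 2 5/8 is 84/32 and rounds to 10 quarters (2 1/2): both are exact ties, and each moves to the neighbor whose last kept bit is 0.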
FP Multiplication
¢ $(-1)^{s_1} M_1\, 2^{E_1} \times (-1)^{s_2} M_2\, 2^{E_2}$
¢ Exact Result: $(-1)^{s} M\, 2^{E}$
§ Sign s: s1 ^ s2
§ Significand M: M1 × M2
§ Exponent E: E1 + E2
¢ Fixing
§ If M ≥ 2, shift M right, increment E
§ If E out of range, overflow
§ Round M to fit frac precision
¢ Implementation
§ Biggest chore is multiplying significands
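The steps above can be sketched in C on the bit fields of single-precision values. This is a simplified illustration, not a full IEEE implementation: it assumes a 32-bit IEEE `float`, normalized inputs, and a normalized result, so zeros, denormalized values, infinities, NaNs, overflow, and underflow are deliberately not handled. The names `fmul_sketch`, `float_bits`, and `bits_float` are hypothetical. Rounding is to nearest even, the default mode.

```c
#include <stdint.h>
#include <string.h>

static uint32_t float_bits(float f) { uint32_t u; memcpy(&u, &f, sizeof u); return u; }
static float bits_float(uint32_t u) { float f; memcpy(&f, &u, sizeof f); return f; }

/* Multiply two normalized floats whose product is also normalized.
   Special cases and out-of-range exponents are NOT handled (assumption). */
float fmul_sketch(float x, float y) {
    uint32_t bx = float_bits(x), by = float_bits(y);

    uint32_t s = (bx ^ by) >> 31;                       /* s = s1 ^ s2 */
    int32_t  E = (int32_t)((bx >> 23) & 0xFF) - 127     /* E = E1 + E2 */
               + (int32_t)((by >> 23) & 0xFF) - 127;

    /* Significands 1.frac as 24-bit integers (23 fraction bits each). */
    uint64_t m1 = (bx & 0x7FFFFF) | 0x800000;
    uint64_t m2 = (by & 0x7FFFFF) | 0x800000;
    uint64_t m  = m1 * m2;            /* exact product, 46 fraction bits */

    int shift = 23;                   /* bits to drop to get back to 23 */
    if (m >> 47) {                    /* M >= 2: shift right, increment E */
        shift = 24;
        E++;
    }

    uint64_t kept = m >> shift;
    uint64_t rem  = m & ((1ULL << shift) - 1);
    uint64_t half = 1ULL << (shift - 1);
    if (rem > half || (rem == half && (kept & 1)))
        kept++;                       /* round to nearest even */
    if (kept >> 24) {                 /* rounding carried out to 10.00...0 */
        kept >>= 1;
        E++;
    }

    return bits_float((s << 31) | ((uint32_t)(E + 127) << 23)
                      | ((uint32_t)kept & 0x7FFFFF));
}
```

For inputs within those assumptions the result should match the hardware product, for example `fmul_sketch(1.5f, 2.5f)` gives 3.75. The 64-bit multiply of the two 24-bit significands is the "biggest chore" the slide refers to.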
FP Addition
¢ $(-1)^{s_1} M_1\, 2^{E_1} + (-1)^{s_2} M_2\, 2^{E_2}$
¢ Exact Result: $(-1)^{s} M\, 2^{E}$
¢ Fixing
§ If M ≥ 2, shift M right, increment E
§ If M < 1, shift M left k positions, decrement E by k
§ Overflow if E out of range
§ Round M to fit frac precision
¢ Monotonicity
§ a ≥ b & c ≥ 0 ⇒ a * c ≥ b * c? Almost
§ Except for infinities & NaNs
¢ Conversions/Casting
§ Casting between int, float, and double changes bit representation
§ double/float → int
  § Truncates fractional part
  § Like rounding toward zero
  § Not defined when out of range or NaN: generally sets to TMin
§ int → double
  § Exact conversion, as long as int has ≤ 53 bit word size
§ int → float
  § Will round according to rounding mode
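The conversion rules above can be observed with a few casts. The sketch assumes a 32-bit `int`, IEEE `float` and `double`, and the default round-to-nearest-even mode; the wrapper names are hypothetical and exist only to make each cast explicit.

```c
/* Hypothetical wrappers; each one isolates a single conversion rule. */

/* double -> int: the fractional part is truncated (round toward zero). */
int trunc_to_int(double d) { return (int)d; }

/* int -> double -> int: exact, since a 32-bit int fits in 53 bits. */
int via_double(int x) { return (int)(double)x; }

/* int -> float -> int: a float has only 24 significand bits, so large
   ints are rounded according to the current rounding mode. */
int via_float(int x) { return (int)(float)x; }
```

For example, `trunc_to_int(-2.9)` is −2 rather than −3, and `via_float(16777217)` is 16777216 because 2^24 + 1 needs 25 significant bits and the tie rounds to the even neighbor. Out-of-range and NaN inputs to `trunc_to_int` are left out because their result is not defined.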
Summary
¢ IEEE Floating Point has clear mathematical properties
¢ Represents numbers of form $M \times 2^{E}$
¢ One can reason about operations independent of implementation
§ As if computed with perfect precision and then rounded
¢ Not the same as real arithmetic
§ Violates associativity/distributivity
§ Makes life difficult for compilers & serious numerical applications programmers
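The loss of associativity can be seen with three single-precision operands. This is a sketch with hypothetical function names; the intermediate sum is stored in a `volatile float` so that it is rounded to single precision before the second addition, whatever the compiler's evaluation method.

```c
/* (a + b) + c, with the intermediate rounded to float. */
float add_left(float a, float b, float c) {
    volatile float t = a + b;
    return t + c;
}

/* a + (b + c), with the intermediate rounded to float. */
float add_right(float a, float b, float c) {
    volatile float t = b + c;
    return a + t;
}
```

With a = 3.14, b = 1e10, c = −1e10, `add_left` returns 0 because 3.14 is smaller than half the spacing between floats near 1e10 and is rounded away, while `add_right` returns 3.14. Each individual addition is still the correctly rounded exact result.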
Additional Slides
s (1 bit) | exp (4 bits) | frac (3 bits)
¢ Requirement
§ Set binary point so that numbers are of the form 1.xxxxx
§ Adjust all to have leading one
  § Decrement exponent as shift left
  Value  Binary    Fraction   Exponent
  128    10000000  1.0000000  7
  13     00001101  1.1010000  3
  17     00010001  1.0001000  4
  19     00010011  1.0011000  4
  138    10001010  1.0001010  7
  63     00111111  1.1111100  5
Rounding
  1.BBGRXXX
§ Guard bit: LSB of result
§ Round bit: 1st bit removed
§ Sticky bit: OR of remaining bits
Postnormalize
¢ Issue
§ Rounding may have caused overflow
§ Handle by shifting right once & incrementing exponent
  Value  Rounded  Exp  Adjusted  Result
  128    1.000    7              128
  13     1.101    3              13
  17     1.000    4              16
  19     1.010    4              20
  138    1.001    7              144
  63     10.000   5    1.000/6   64
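The normalize, round, and postnormalize steps can be combined into one C sketch for 8-bit unsigned inputs and a 3-bit frac field. The function name `round_to_tiny` is hypothetical, and it returns the value of the rounded result as an integer rather than an encoded bit pattern. Rounding is to nearest even, using the guard, round, and sticky bits of the 1.BBGRXXX layout.

```c
/* Round an 8-bit unsigned value to a 1.BBB significand (3 fraction bits)
   and return the numeric value of the result. Hypothetical helper. */
unsigned round_to_tiny(unsigned v) {
    if (v == 0)
        return 0;                       /* nothing to normalize */

    /* Normalize: shift left until the leading one is in bit 7. */
    int e = 7;
    unsigned m = v & 0xFF;
    while (!(m & 0x80)) {
        m <<= 1;
        e--;                            /* decrement exponent as we shift */
    }

    /* m is now 1BBGRXXX: keep 1BBG, inspect R and the sticky XXX. */
    unsigned kept   = m >> 4;
    unsigned guard  = kept & 1;         /* LSB of result */
    unsigned round  = (m >> 3) & 1;     /* first bit removed */
    unsigned sticky = (m & 0x7) != 0;   /* OR of remaining bits */
    if (round && (sticky || guard))
        kept++;                         /* above half, or tie with odd LSB */

    /* Postnormalize: rounding may have produced 10.000. */
    if (kept & 0x10) {
        kept >>= 1;
        e++;
    }

    /* Value is kept * 2^(e - 3). */
    return e >= 3 ? kept << (e - 3) : kept >> (3 - e);
}
```

Applied to 128, 13, 17, 19, 138, and 63 this yields 128, 13, 16, 20, 144, and 64. The last case is the one that needs postnormalization: 1.1111100₂ rounds up to 10.000₂, which becomes 1.000₂ with the exponent raised from 5 to 6.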