Computer Organisation and Architecture

Just a bit of theory on IEEE method of multiplication , division, addition and subtraction of binary numbers.

Uploaded by

ankit gundewar

10/10/2011

Computer Organization & Architecture
Computer Arithmetic - Part II

What are you going to study?

• Multiplication - unsigned - another view
• Multiplication - 2's complement - Booth's algorithm
• Division - unsigned
• Division - 2's complement - algorithm - examples
• Floating-point numbers - representation, IEEE format (single precision and double precision)
• Arithmetic with floating-point numbers (FP addition/subtraction, multiplication/division)

Multiplication of unsigned integers - example - another view

• Multiplication of a binary number by 2^n can be done by shifting that number to the left n bits.
• Partial products can be viewed as 2n-bit numbers generated from the n-bit multiplicand.

        1011
   x    1101
  --------------------------
    00001011    1011 * 1 * 2^0
    00000000    1011 * 0 * 2^1
    00101100    1011 * 1 * 2^2
  + 01011000    1011 * 1 * 2^3
  --------------------------
    10001111

p/s: 1011 * 1 * 2^2 = 1011.00 * 2^2 = 101100

MULTIPLICATION OF TWO UNSIGNED 4-BIT INTEGERS YIELDING AN 8-BIT RESULT

Comparison of Multiplication of Unsigned and Twos Complement Integers

a) Unsigned integers

            1 0 0 1   (9)
   x        0 0 1 1   (3)
    0 0 0 0 1 0 0 1   1001 x 1 x 2^0
    0 0 0 1 0 0 1 0   1001 x 1 x 2^1
    0 0 0 0 0 0 0 0   1001 x 0 x 2^2
  + 0 0 0 0 0 0 0 0   1001 x 0 x 2^3
    0 0 0 1 1 0 1 1   (27)

b) Twos complement integers

            1 0 0 1   (-7)  M
   x        0 0 1 1   (3)   Q
    1 1 1 1 1 0 0 1   1001 x 1 x 2^0
    1 1 1 1 0 0 1 0   1001 x 1 x 2^1
    0 0 0 0 0 0 0 0   1001 x 0 x 2^2
  + 0 0 0 0 0 0 0 0   1001 x 0 x 2^3
    1 1 1 0 1 0 1 1   (-21)
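The partial-product view above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `partial_products` and `multiply_unsigned` are made up for this example and do not come from the slides.

```python
def partial_products(multiplicand: int, multiplier: int, n: int = 4) -> list[int]:
    """Return one 2n-bit partial product per multiplier bit (unsigned operands)."""
    products = []
    for i in range(n):
        bit = (multiplier >> i) & 1
        # multiplicand * bit * 2^i is the multiplicand shifted left i places, or zero
        products.append((multiplicand * bit) << i)
    return products


def multiply_unsigned(multiplicand: int, multiplier: int, n: int = 4) -> int:
    """Add the partial products and keep a 2n-bit result."""
    return sum(partial_products(multiplicand, multiplier, n)) & ((1 << (2 * n)) - 1)


for p in partial_products(0b1011, 0b1101):
    print(format(p, "08b"))
print(format(multiply_unsigned(0b1011, 0b1101), "08b"))
```

The four printed partial products correspond to the rows of the 1011 x 1101 example, and the last line is their 8-bit sum.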


Multiplying 2's complement numbers

• If the multiplier (Q) is negative, e.g. 7 x -3, this does not work!

              0 1 1 1   (7)   M
   x          1 1 0 1   (-3)  Q
      1 1 1 1 0 1 1 1   0111 x 1 x 2^0
      0 0 0 0 0 0 0 0   0111 x 0 x 2^1
      1 1 0 1 1 1 0 0   0111 x 1 x 2^2
  +   1 0 1 1 1 0 0 0   0111 x 1 x 2^3
  10  1 0 0 0 1 0 1 1   (-117)   X: when Q is negative it cannot work

Multiplying 2's complement numbers

Solution 1
• Convert both multiplier and multiplicand to positive if required
• Multiply as in unsigned binary
• If the signs of the operands are different, negate the answer (find the 2's complement of the result)

Solution 2
• Booth's algorithm - performs fewer additions and subtractions than a more straightforward algorithm

Solution 1

• To overcome this dilemma, first convert both multiplier and multiplicand to positive numbers, then perform the multiplication and negate the product if the original numbers have different signs

              0 1 1 1   (7)   M
   x          0 0 1 1   (3)   Q
      0 0 0 0 0 1 1 1   0111 x 1 x 2^0
      0 0 0 0 1 1 1 0   0111 x 1 x 2^1
      0 0 0 0 0 0 0 0   0111 x 0 x 2^2
  +   0 0 0 0 0 0 0 0   0111 x 0 x 2^3
      0 0 0 1 0 1 0 1   ----> negate   1 1 1 0 1 0 1 1   (-21)

      1 1 1 0 1 0 1 1 = -128 + 64 + 32 + 8 + 2 + 1 = -21

• This method is tedious as it involves checking the signs of the numbers and performing negation if necessary

Solution 2 - Booth's Algorithm (1)

Registers: M, A, Q, Q-1

START
  A ← 0, Q-1 ← 0
  M ← Multiplicand
  Q ← Multiplier
  Count ← n
Examine Q0, Q-1:
  = 10: A ← A - M
  = 01: A ← A + M
  = 00 or = 11: no arithmetic operation
Arithmetic shift right: A, Q, Q-1
Count ← Count - 1
Count = 0?
  No: examine Q0, Q-1 again
  Yes: END


Booth's Algorithm (2)

• The control logic scans the current multiplier bit (Q0) and the bit to its right (Q-1) at the same time
• If the two bits are
    = 00 or = 11: right shift only (A, Q, Q-1)
    = 01: A ← A + M, then right shift
    = 10: A ← A - M, then right shift
• To preserve the sign of the number in A and Q, an arithmetic shift is done (A(n-1) is not only shifted into A(n-2) but also remains in A(n-1))

Example of Booth's Algorithm (3)

M = 0101, Q = 1010, -M = 1011

• Consider the multiplication of 5 x -6, both represented in 4-bit twos complement notation, to produce an 8-bit product

  M      A      Q      Q-1
  0101   0000   1010   0     Initial value
         0000   0101   0     Shift                     1st cycle
         1011   0101   0     A ← A - M  (+ 1011)       2nd cycle
         1101   1010   1     Shift
         0010   1010   1     A ← A + M  (+ 0101)       3rd cycle
         0001   0101   0     Shift
         1100   0101   0     A ← A - M  (+ 1011)       4th cycle
         1110   0010   1     Shift                     product = 1110 0010

N/B: Negate the product if its sign bit is 1 (negative) to read its magnitude.
Negate 11100010: 1's complement = 00011101, 2's complement = 00011110 (30).
Since the sign bit is 1, the value is negative. Therefore product = -30.

M = 1010, Q = 1001, -M = 0110

• Consider the multiplication of -6 x -7, both represented in 4-bit twos complement notation, to produce an 8-bit product

  M      A      Q      Q-1
  1010   0000   1001   0     Initial value
         0110   1001   0     A ← A - M  (+ 0110)       1st cycle
         0011   0100   1     Shift
         1101   0100   1     A ← A + M  (+ 1010)       2nd cycle
         1110   1010   0     Shift
         1111   0101   0     Shift                     3rd cycle
         0101   0101   0     A ← A - M  (+ 0110)       4th cycle
         0010   1010   1     Shift                     product = 0010 1010

Product = 00101010. Since the sign bit is 0, the product is positive. Therefore the product value is 42.
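The register-level steps of Booth's algorithm can be sketched as follows. This is a minimal sketch, not the slides' own code: the name `booth_multiply` is hypothetical, and A is kept to n bits as in the examples, so the most negative multiplicand (for example -8 when n = 4) is assumed not to occur.

```python
def booth_multiply(m: int, q: int, n: int = 4) -> int:
    """Multiply two n-bit two's complement integers with Booth's algorithm.

    Returns the signed value of the 2n-bit product held in A,Q.
    Assumption: m is not the most negative n-bit value.
    """
    mask = (1 << n) - 1
    M = m & mask
    A, Q, q_minus1 = 0, q & mask, 0
    for _ in range(n):
        pair = (Q & 1, q_minus1)
        if pair == (1, 0):
            A = (A - M) & mask          # A <- A - M
        elif pair == (0, 1):
            A = (A + M) & mask          # A <- A + M
        # Arithmetic shift right of A, Q, Q-1: the sign bit of A is kept
        q_minus1 = Q & 1
        Q = (Q >> 1) | ((A & 1) << (n - 1))
        A = (A >> 1) | (A & (1 << (n - 1)))
    product = (A << n) | Q
    if product & (1 << (2 * n - 1)):    # interpret the 2n-bit pattern as signed
        product -= 1 << (2 * n)
    return product


print(booth_multiply(5, -6))    # the 5 x -6 example
print(booth_multiply(-6, -7))   # the -6 x -7 example
```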


Division

• More complex than multiplication
• The general principle is the same as multiplication.
• The operation involves repetitive shifting and add/sub.
• The basis for the algorithm is the paper-and-pencil approach.

Division of Unsigned Binary Integers

                00001101     Quotient
  Divisor 1011 )10010011     Dividend
                 1011
                001110       Partial remainders
                  1011
                  001111
                    1011
                     100     Remainder

Division of Unsigned Binary Integers

Start
  A ← 0
  M ← Divisor
  Q ← Dividend
  Count ← n
Shift left A, Q
A ← A - M
A < 0?
  No (A > 0 or A = 0): Q0 ← 1
  Yes (A < 0): Q0 ← 0, A ← A + M (restore A)
Count ← Count - 1
Count = 0?
  No: shift left A, Q again
  Yes: End (quotient in Q, remainder in A)

Slides adapted from tan wooi haw's lecture notes (FOE)

• Consider the division of two 4-bit unsigned integers:
  1011₂ (DIVIDEND, 11) ÷ 0100₂ (DIVISOR, 4)
  M = divisor, Q = dividend

  M      A      Q
  0100   0000   1011   Initial value
         0001   0110   Shift                                          1st cycle
         0001   0110   A ← A - M gives A < 0: restore A, Q0 ← 0
         0010   1100   Shift                                          2nd cycle
         0010   1100   A ← A - M gives A < 0: restore A, Q0 ← 0
         0101   1000   Shift                                          3rd cycle
         0001   1001   A ← A - M (- 0100) gives A ≥ 0: Q0 ← 1
         0011   0010   Shift                                          4th cycle
         0011   0010   A ← A - M gives A < 0: restore A, Q0 ← 0

  Remainder in A = 0011 (3), quotient in Q = 0010 (2)
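The unsigned restoring-division flowchart maps directly onto a short loop. The following is a sketch under the same register model (A, Q, M); the function name `restoring_divide` is hypothetical and Python integers stand in for the fixed-width registers.

```python
def restoring_divide(dividend: int, divisor: int, n: int = 4) -> tuple[int, int]:
    """Unsigned restoring division on n-bit operands; returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    mask = (1 << n) - 1
    A, Q, M = 0, dividend & mask, divisor & mask
    for _ in range(n):
        # Shift left A, Q: the top bit of Q moves into the bottom of A
        A = (A << 1) | (Q >> (n - 1))
        Q = (Q << 1) & mask
        A -= M
        if A < 0:
            A += M      # restore A, Q0 stays 0
        else:
            Q |= 1      # Q0 <- 1
    return Q, A


print(restoring_divide(0b1011, 0b0100))            # 11 / 4
print(restoring_divide(0b10010011, 0b1011, n=8))   # 147 / 11
```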


Twos complement Division - Restoring division approach

i. Expand the dividend to 2n bits (e.g. 4-bit 0111 becomes 00000111, and 1001 becomes 11111001).
ii. Load the divisor in M and the dividend in A & Q.
iii. Shift A & Q left by 1 bit.
iv. If M and A have the same sign, perform A ← A - M; otherwise A ← A + M.
v. If the sign of A is the same before and after the operation, or (A = 0 and remaining dividend = 0), set Q0 = 1.
vi. Otherwise, if the sign is different and (A ≠ 0 or remaining dividend ≠ 0), set Q0 = 0 and restore A.
vii. After steps iii to vi have been repeated n times, negate Q if the divisor and dividend have different signs.
viii. Remainder in A, quotient in Q.

Twos complement Division - Algorithm (3)

What is remaining dividend = 0?

For example, the dividend is 1100.
After a left shift by 1 bit it becomes 1000, so the remaining dividend is now 100.
After another left shift it becomes 0000, so the remaining dividend is now 00, which means the remaining dividend is 0.

Twos complement Restoring Division (continue ...)

Example 4.22: -7 ÷ 2
-7 = 1111 1001₂ = A Q
 2 = 0010 = M

  A      Q
  1111   1001   Initial values
  1111   0010   Shift                          1st cycle
  0001          Add (+ 0010)
  1111   0010   Restore
  1110   0100   Shift                          2nd cycle
  0000          Add (+ 0010)
  1110   0100   Restore
  1100   1000   Shift                          3rd cycle
  1110          Add (+ 0010)
  1110   1001   Set Q0 = 1
  1101   0010   Shift                          4th cycle
  1111          Add (+ 0010)
  1111   0011   Set Q0 = 1

Signs of dividend & divisor are different: negate Q
Quotient = -Q = 1101₂ = -3₁₀
Remainder = 1111₂ = -1₁₀

Twos complement Restoring Division (continue ...)

Example 4.23: 6 ÷ -3
M = 1101

  A      Q
  0000   0110   Initial values
  0000   1100   Shift                          1st cycle
  1101          Add (+ 1101)
  0000   1100   Restore
  0001   1000   Shift                          2nd cycle
  1110          Add (+ 1101)
  0001   1000   Restore
  0011   0000   Shift                          3rd cycle
  0000          Add (+ 1101)
  0000   0001   Set Q0 = 1
  0000   0010   Shift                          4th cycle
  1101          Add (+ 1101)
  0000   0010   Restore

Signs of dividend & divisor are different: negate Q
Quotient = -Q = 1110₂ = -2₁₀
Remainder = 0
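The twos complement restoring-division rules, including the "remaining dividend" test, can be sketched as below. This is an illustrative reading of the listed steps rather than a reference implementation: the function name `signed_restoring_divide` is hypothetical, and "remaining dividend" is taken to mean the dividend bits still waiting in the upper part of Q.

```python
def signed_restoring_divide(dividend: int, divisor: int, n: int = 4) -> tuple[int, int]:
    """Two's complement restoring division; returns signed (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    mask = (1 << n) - 1
    sign_bit = 1 << (n - 1)

    def to_signed(value: int) -> int:
        return value - (1 << n) if value & sign_bit else value

    M = divisor & mask
    wide = dividend & ((1 << (2 * n)) - 1)      # dividend sign-extended to 2n bits
    A, Q = wide >> n, wide & mask
    for cycle in range(1, n + 1):
        # Shift left A, Q
        A = ((A << 1) | (Q >> (n - 1))) & mask
        Q = (Q << 1) & mask
        before = A
        if (A & sign_bit) == (M & sign_bit):
            A = (A - M) & mask
        else:
            A = (A + M) & mask
        remaining = Q >> cycle                   # dividend bits not yet shifted out of Q
        same_sign = (A & sign_bit) == (before & sign_bit)
        if same_sign or (A == 0 and remaining == 0):
            Q |= 1                               # Q0 <- 1
        else:
            A = before                           # Q0 <- 0, restore A
    if (dividend < 0) != (divisor < 0):
        Q = (-Q) & mask                          # negate Q when the signs differ
    return to_signed(Q), to_signed(A)


print(signed_restoring_divide(-7, 2))   # Example 4.22
print(signed_restoring_divide(6, -3))   # Example 4.23
```

In each tested case the returned pair satisfies D = Q * V + R, with the remainder taking the sign of the dividend.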


Twos complement Division - Examples (1)

Start
  Expand dividend to 2n bits
  M ← Divisor
  A, Q ← Dividend
  count ← n
Shift left A, Q
A and M same sign?
  yes: A ← A - M
  no:  A ← A + M
Sign of A still the same, or (A = 0 AND B = 0)?
  yes: Q0 ← 1
  no:  Q0 ← 0, restore A
count ← count - 1
count = 0?
  no:  shift left A, Q again
  yes: continue
Divisor and dividend different sign?
  yes: negate Q
  no:  continue
Quotient in Q, remainder in A
End

B represents Q.

Twos complement Division - Examples (2)

From reference book: not valid for -6 divided by 2, where it gives a quotient of -2 and a remainder of -2.

Twos complement Division (3)

• The remainder is defined by
• D = Q * V + R
• D = Dividend, Q = Quotient, V = Divisor, R = Remainder

N/B: find the remainder of 7 / -3 and -7 / 3 by using the formula above and check with the slides on pages 21 and 22. The figures on both slides are consistent with the formula.

Problem (1)

• Given x = 0101 and y = 1010 in twos complement notation (i.e., x = 5, y = -6), compute the product p = x * y with Booth's algorithm


Solution (1)

  A      Q      Q-1   M      Comments
  0000   1010   0     0101   Initial
  0000   0101   0     0101   Q0,Q-1 = 00, arithmetic right shift
  1011   0101   0     0101   Q0,Q-1 = 10, A ← A - M
  1101   1010   1     0101   Arithmetic shift
  0010   1010   1     0101   Q0,Q-1 = 01, A ← A + M
  0001   0101   0     0101   Arithmetic shift
  1100   0101   0     0101   Q0,Q-1 = 10, A ← A - M
  1110   0010   1     0101   Arithmetic shift

Problem (2)

• Verify the validity of the unsigned binary division algorithm by showing the steps involved in calculating the division 10010011 / 1011. Use a presentation similar to the examples used for twos complement arithmetic

Problem (3)

• Divide -145 by 13 in binary twos complement notation, using 12-bit words. Use the restoring division approach.

Real Numbers

• Numbers with fractions
• Could be done in pure binary
    1001.1010 = 2^3 + 2^0 + 2^-1 + 2^-3 = 9.625
• Radix point: fixed or moving?
• Fixed radix point: can't represent very large or very small numbers.
• Dynamically sliding the radix point: a range of very large and very small numbers can be represented.

In mathematics, radix point refers to the symbol used in numerical representations to separate the integral part of the number (to the left of the radix) from its fractional part (to the right of the radix). The radix point is usually a small dot, either placed on the baseline or halfway between the baseline and the top of the numerals. In base 10, the radix point is more commonly called the decimal point.
... From en.wikipedia.org/wiki/Radix_point
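As a quick check of the pure-binary notation for fractions, the positional weights can be summed programmatically. This is only a small sketch; the helper name `binary_fixed_to_float` is made up for the example.

```python
def binary_fixed_to_float(text: str) -> float:
    """Evaluate a binary string with an optional radix point, e.g. '1001.1010'."""
    integer, _, fraction = text.partition(".")
    value = float(int(integer, 2)) if integer else 0.0
    for position, bit in enumerate(fraction, start=1):
        # The bit at position k to the right of the radix point weighs 2^-k
        value += int(bit) * 2.0 ** -position
    return value


print(binary_fixed_to_float("1001.1010"))
```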


Floating Point

  | Sign bit | Biased exponent | Significand or mantissa |

• +/- significand x 2^exponent
• The point is actually fixed between the sign bit and the body of the mantissa
• The exponent indicates place value (point position)

Signs for Floating Point

• The mantissa is stored in 2's complement.
• The exponent is in excess or biased notation.
• Excess (biased exponent) 128 means
    • 8-bit exponent field
    • Pure value range 0-255
    • Subtract 128 to get the correct value (IEEE 754 instead uses a bias of 2^(k-1) - 1 = 127)
    • Range -128 to +127

Normalization

• FP numbers are usually normalized
• i.e. the exponent is adjusted so that the leading bit (MSB) of the mantissa is 1
• Since it is always 1 there is no need to store it
• (cf. scientific notation, where numbers are normalized to give a single digit before the decimal point, e.g. 3.123 x 10^3)
• In FP representation: not representing more individual values, but spreading the numbers.

Expressible Numbers


IEEE 754

• Standard for floating-point storage
• 32 and 64 bit standards
• 8 and 11 bit exponent respectively
• Extended formats (both mantissa and exponent) for intermediate results

Floating-point Format

• Various floating-point formats have been defined, such as the UNIVAC 1100, CDC 3600 and IEEE Standard 754

(a) UNIVAC 1100
    Single precision:  Mantissa 27 bits | Exponent 9 bits
    Double precision:  Mantissa 60 bits | Exponent 12 bits

(b) CDC 3600
    Exponent 10 bits | Mantissa 36 bits
    (plus an exponent sign and a mantissa sign)

IEEE Floating-point Format

• IEEE has introduced a standard floating-point format for arithmetic operations in mini and microcomputers, which is defined in IEEE Standard 754
• In this format, the numbers are normalized so that the significand or mantissa lies in the range 1 ≤ F < 2, which corresponds to an integer part equal to 1
• An IEEE format floating-point number X is formally defined as:

    X = (-1)^S x 2^(E - B) x 1.F

  where S = sign bit [0 → +, 1 → -]
        E = exponent biased by B
        F = fractional mantissa

• Two basic formats are defined in the IEEE Standard 754
• These are the 32-bit single and 64-bit double formats, with 8-bit and 11-bit exponents respectively

(a) Single format
    | Sign bit (1 bit) | Biased exponent (8 bits) | Significand (23 bits) |

(b) Double format
    | Sign bit (1 bit) | Biased exponent (11 bits) | Significand (52 bits) |

• A sign-magnitude representation has been adopted for the mantissa; the mantissa is negative if S = 1, and positive if S = 0


Floating Point Examples

The bias equals (2^(K-1) - 1), so for K = 8: 2^(8-1) - 1 = 127
  exponent  20 is stored as 127 + 20 = 147
  exponent -20 is stored as 127 - 20 = 107

Example
Convert these numbers to IEEE single precision format:

(a) 199.953125₁₀ = 1100 0111.111101₂
                 = 1.100 0111 111101 x 2^7   (normalized)

    sign: +
    biased exponent: 7 + 127 = 134₁₀ = 1000 0110₂
    stored significand (leading 1 dropped): 1000111111101

    sign  biased exponent  significand
    0     1000 0110        1000 1111 1110 1000 0000 000

(b) -77.7₁₀ = -100 1101.10110 0110 ...₂
            = -1.00 1101 101100110 ... x 2^6   (normalized)

    77₁₀ = 100 1101₂
    0.7₁₀:  0.7 x 2 = 1.4
            0.4 x 2 = 0.8
            0.8 x 2 = 1.6
            0.6 x 2 = 1.2
            0.2 x 2 = 0.4
            0.4 x 2 = 0.8
            0.8 x 2 = 1.6
            0.6 x 2 = 1.2
            0.2 x 2 = 0.4
            ...

    sign: negative
    biased exponent: 6 + 127 = 133₁₀ = 1000 0101₂
    stored significand [23 bits] (leading 1 dropped): 00110110110 ...

    sign  biased exponent  significand
    1     1000 0101        0011 0110 1100 1100 1100 110
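The two conversions can be cross-checked with Python's standard `struct` module, which packs a float into the IEEE 754 single format. The helper name `single_fields` is an assumption made for this sketch.

```python
import struct


def single_fields(x: float) -> tuple[int, str, str]:
    """Split x, stored as an IEEE 754 single, into (sign, exponent bits, fraction bits)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))  # 32-bit pattern, big-endian
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF      # 8-bit biased exponent
    fraction = bits & 0x7FFFFF          # 23 stored significand bits
    return sign, format(exponent, "08b"), format(fraction, "023b")


print(single_fields(199.953125))
print(single_fields(-77.7))
```

Note that `struct.pack` rounds to the nearest representable single, so for values such as 77.7 the last stored bits reflect that rounding.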

Convert these IEEE single precision floating-point numbers to their decimal equivalent:

(a) 0100 0101 1001 1100 0100 0001 0000 0000₂

    sign  biased exponent  significand
    0     1000 1011        0011 1000 1000 0010 0000 000

    sign: +
    exponent: 139 - 127 = 12₁₀
    significand: 1.001110001000001₂

    1.001110001000001₂ x 2^12 = 1001110001000.001₂
                              = 5000.125₁₀

(b) 1100 0100 0111 1001 1111 1100 0000 0000₂

    sign  biased exponent  significand
    1     1000 1000        1111 0011 1111 1000 0000 000

    sign: -
    exponent: 136 - 127 = 9₁₀
    significand: 1.1111001111111₂

    -1.1111001111111₂ x 2^9 = -1111100111.1111₂
                            = -999.9375₁₀

FP Arithmetic +/-

• Check for zeros
• Align significands (adjusting exponents)
• Add or subtract significands
• Normalize result
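Decoding follows directly from the definition X = (-1)^S x 2^(E - B) x 1.F with B = 127. The sketch below handles normalized numbers only (zero, subnormals, infinities and NaN are deliberately left out), and the name `decode_single` is hypothetical.

```python
def decode_single(bit_string: str) -> float:
    """Decode a 32-bit IEEE 754 single pattern (normalized numbers only)."""
    bits = bit_string.replace(" ", "")
    if len(bits) != 32:
        raise ValueError("expected 32 bits")
    s = int(bits[0])                    # sign bit
    e = int(bits[1:9], 2)               # biased exponent
    f = int(bits[9:], 2) / 2 ** 23      # fractional part F of the significand
    return (-1) ** s * 2.0 ** (e - 127) * (1 + f)


print(decode_single("0100 0101 1001 1100 0100 0001 0000 0000"))
print(decode_single("1100 0100 0111 1001 1111 1100 0000 0000"))
```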


FP Arithmetic x/÷

• Check for zero
• Add/subtract exponents
• Multiply/divide significands (watch sign)
• Normalize
• Round
• All intermediate results should be in double length storage

Floating-point Arithmetic (cont.)

Some basic floating-point arithmetic operations are shown in the table

Floating-point Arithmetic (cont.)

• For addition and subtraction, it is necessary to ensure that both operand exponents have the same value
• This may involve shifting the radix point of one of the operands to achieve alignment

Floating-point Arithmetic (cont.)

• Some problems that may arise during arithmetic operations are:
  i. Exponent overflow: a positive exponent exceeds the maximum possible exponent value, and this may lead to +∞ or -∞ in some systems
  ii. Exponent underflow: a negative exponent is less than the minimum possible exponent value (e.g. 2^-200); the number is too small to be represented and may be reported as 0
  iii. Significand underflow: in the process of aligning significands, the smaller number may have a significand which is too small to be represented
  iv. Significand overflow: the addition of two significands of the same sign may result in a carry out from the most significant bit


FP Arithmetic +/-

• Unlike integer and fixed-point number representation, floating-point numbers cannot be added in one simple operation
• Consider adding two decimal numbers:
    A = 12345
    B = 567.89
  If these numbers are normalized and added in floating-point format, we will have

      0.12345 x 10^5
    + 0.56789 x 10^3
      ?.????? x 10^?

  Obviously, direct addition cannot take place as the exponents are different

FP Arithmetic +/- (cont.)

• Floating-point addition and subtraction will typically involve the following steps:
  i. Align the significands
  ii. Add or subtract the significands
  iii. Normalize the result
• Since addition and subtraction are identical except for a sign change, the process begins by changing the sign of the subtrahend if it is a subtract operation
• The floating-point numbers can only be added if the two exponents are equal
• This can be done by aligning the smaller number with the bigger number [increasing its exponent] or vice versa, so that both numbers have the same exponent

FP Arithmetic +/- (cont.)

• As the aligning operation may result in the loss of digits, it is the smaller number that is shifted, so that any loss will be relatively insignificant

    Shifting the larger number left (only 8 bits remain, so 1 x 2^9 is lost):
      1.1001 x 2^9   →  110010000 x 2^1
      1.0111 x 2^1   →  1.0111000 x 2^1

• Hence, the smaller number is shifted right by increasing its exponent until the two exponents are the same
• If the two numbers have exponents that differ significantly, the smaller number is lost as a result of shifting

      1.1001001 x 2^9                  1.1001001 x 2^9
      1.0110001 x 2^1   shift right →  0.0000000 x 2^9

FP Arithmetic +/- (cont.)

• After the numbers have been aligned, they are added together taking into account their signs
• There might be a possibility of significand overflow due to a carry out from the most significant bit

      1.1101 x 2^4
    + 0.0101 x 2^4
     10.0010 x 2^4   →  1.0001 x 2^5

• If this occurs, the significand of the result is shifted right and the exponent is incremented
• As the exponent is incremented, it might overflow and the operation will stop
• Lastly, the result is normalized by shifting the significand digits left until the most significant digit is non-zero
• Each shift causes a decrement of the exponent and thus could cause an exponent underflow
• Finally, the result is rounded off and reported


Flowchart: floating-point addition and subtraction

SUBTRACT (X - Y = Z): change the sign of Y, then continue as for ADD
ADD (X + Y = Z):
  1. X = 0?  yes: Z ← Y, RETURN
  2. Y = 0?  yes: Z ← X, RETURN
  3. Exponents equal?
       no: increment the smaller exponent and shift its significand right
           Significand = 0?  yes: put the other number in Z, RETURN
                             no:  repeat step 3
  4. Add the signed significands
  5. Significand = 0?  yes: Z ← 0, RETURN
  6. Significand overflow?
       yes: shift the significand right and increment the exponent
            Exponent overflow?  yes: report overflow, RETURN
  7. Result normalized?
       no: shift the significand left and decrement the exponent
           Exponent underflow?  yes: report underflow, RETURN
                                no:  repeat step 7
  8. Round result, RETURN

Example:
    X = 1.01101 x 2^7
    Y = 1.10101 x 2^6  →  Y = 0.110101 x 2^7   (aligned)

  Add:
      1.01101  x 2^7
    + 0.110101 x 2^7
     10.001111 x 2^7   →  1.0001111 x 2^8   (significand overflow: shift right)

  Subtract:
      1.01101  x 2^7
    - 0.110101 x 2^7
      0.100101 x 2^7   →  1.00101 x 2^6     (normalize: shift left)

FP Arithmetic +/- (cont.)

• Some floating-point arithmetic will lead to an increased number of bits in the mantissa
• For example, consider adding these floating-point numbers with 5 significant bits:
    A = 0.11001 x 2^4
    B = 0.10001 x 2^3

    A = 0.11001  x 2^4
    B = 0.010001 x 2^4
        1.000011 x 2^4   →  normalize  →  0.1000011 x 2^5

• The result has two extra bits of precision which cannot be fitted into the floating-point format
• For simplicity, the number can be truncated to give 0.10000 x 2^5
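The align, add and normalize steps can be sketched with integer significands. This is a simplified model, not the full flowchart: significands are signed integers scaled by 2^frac_bits (so 1.0110100 with frac_bits = 7 is 0b10110100), bits shifted out during alignment are simply dropped, and rounding plus exponent overflow/underflow reporting are omitted. The name `fp_add` is hypothetical.

```python
def fp_add(sig_x: int, exp_x: int, sig_y: int, exp_y: int, frac_bits: int = 7) -> tuple[int, int]:
    """Add two values sig * 2^(exp - frac_bits); returns a normalized (significand, exponent)."""
    if sig_x == 0:
        return sig_y, exp_y
    if sig_y == 0:
        return sig_x, exp_x
    # Align: make x the operand with the larger exponent, shift the other one right
    if exp_x < exp_y:
        sig_x, exp_x, sig_y, exp_y = sig_y, exp_y, sig_x, exp_x
    sign_y = -1 if sig_y < 0 else 1
    sig_y = sign_y * (abs(sig_y) >> (exp_x - exp_y))   # shifted-out bits are lost
    total, exp = sig_x + sig_y, exp_x                  # add the signed significands
    if total == 0:
        return 0, 0
    sign, magnitude = (-1 if total < 0 else 1), abs(total)
    one = 1 << frac_bits
    while magnitude >= 2 * one:    # significand overflow: shift right, increment exponent
        magnitude >>= 1
        exp += 1
    while magnitude < one:         # not normalized: shift left, decrement exponent
        magnitude <<= 1
        exp -= 1
    return sign * magnitude, exp


X = (0b10110100, 7)   # 1.01101 x 2^7
Y = (0b11010100, 6)   # 1.10101 x 2^6
print(fp_add(*X, *Y))                  # X + Y
print(fp_add(*X, -Y[0], Y[1]))         # X - Y: change the sign of Y, then add
```

Subtraction is expressed exactly as in the flowchart, by negating the second significand before adding.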

FP Arithmetic +/- (cont.)

• Truncation is the simplest method, which involves nothing more than taking away the extra bits
• A much better technique is rounding: if the value of the extra bits is greater than half the least significant bit of the retained bits, 1 is added to the LSB of the remaining digits
• For example, consider rounding these numbers to 4 significant bits:

  i. 0.1101101
       extra bits           →  0.0000101
       LSB of retained bits →  0.0001
       more than half: add 1 to the LSB

         0.1101
       +      1
         0.1110

  ii. 0.1101011
       extra bits           →  0.0000011
       LSB of retained bits →  0.0001
       less than half: extra bits are truncated

         0.1101

FP Arithmetic +/- (cont.)

• Truncation always undervalues the result, leading to a systematic error, whereas rounding sometimes reduces the result and sometimes increases it
• Rounding is always preferred to truncation, partly because it is more accurate and partly because it gives rise to an unbiased error
• The major disadvantage of rounding is that it requires a further arithmetic operation on the result
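The difference between truncation and the rounding rule described above can be sketched on strings of fraction bits. The helper names are made up for this example, and the rule is taken literally: 1 is added only when the extra bits are strictly greater than half an LSB, so exact ties are left unrounded.

```python
def truncate_bits(fraction: str, keep: int) -> str:
    """Drop every fraction bit after the first `keep` bits."""
    return fraction[:keep]


def round_bits(fraction: str, keep: int) -> str:
    """Round to `keep` fraction bits: add 1 to the LSB if the extra bits exceed half an LSB.

    If the addition carries out of the retained bits, the result has keep + 1 digits.
    """
    retained = int(fraction[:keep], 2)
    extra = fraction[keep:]
    if extra:
        half = 1 << (len(extra) - 1)    # the pattern 100...0 is exactly half an LSB
        if int(extra, 2) > half:
            retained += 1
    return format(retained, f"0{keep}b")


for fraction in ("1101101", "1101011"):   # 0.1101101 and 0.1101011
    print("0." + truncate_bits(fraction, 4), "0." + round_bits(fraction, 4))
```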


Example

Perform the following arithmetic operations using floating-point arithmetic. In each case, show how the numbers would be stored using IEEE single-precision format.

i. 1150.625₁₀ - 525.25₁₀

   1150.625₁₀ = 100 0111 1110.101₂
              = 1.0001 1111 10101 x 2^10
   sign: +
   biased exponent: 10 + 127 = 137₁₀ = 1000 1001₂
   stored significand (leading 1 dropped): 0001111110101

   sign  biased exponent  significand
   0     1000 1001        0001 1111 1010 1000 0000 000

   525.25₁₀ = 10 0000 1101.01₂
            = 1.0000 0110 101 x 2^9
   sign: +
   biased exponent: 9 + 127 = 136₁₀ = 1000 1000₂
   stored significand (leading 1 dropped): 00000110101

   sign  biased exponent  significand
   0     1000 1000        0000 0110 1010 0000 0000 000

continue ...

As these numbers have different exponents, the smaller number is shifted right to align with the larger number

   exponent   mantissa              exponent   mantissa
   1000 1000  1.00000110101    →    1000 1001  0.100000110101

Subtract the mantissas

     1.0001111110101
   - 0.100000110101
     0.1001110001011

Normalize the result

   exponent   mantissa              exponent   mantissa
   1000 1001  0.1001110001011  →    1000 1000  1.001110001011

   sign: +
   biased exponent: 9 + 127 = 136₁₀
   stored significand (leading 1 dropped): 001110001011

   sign  biased exponent  significand
   0     1000 1000        0011 1000 1011 0000 0000 000

continue ...

ii. 68.3₁₀ + 12.2₁₀

   68.3₁₀ = 100 0100.01001 1001 ...₂
          = 1.00 0100 01001 1001 ... x 2^6

   68₁₀ = 100 0100₂
   0.3₁₀:  0.3 x 2 = 0.6
           0.6 x 2 = 1.2
           0.2 x 2 = 0.4
           0.4 x 2 = 0.8
           0.8 x 2 = 1.6
           0.6 x 2 = 1.2
           ...

   Only 24 bits can be stored (32-bit register shown):
   1 0 0 0 1 0 0 0 1 0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 | 1 0 0 1 1 0 0 1
   The discarded bits are more than half of the LSB, so 1 is added to the LSB.

   sign: +
   biased exponent: 6 + 127 = 133₁₀
   stored significand [23 bits] (leading 1 dropped): 00010001001 ...

   sign  biased exponent  significand
   0     1000 0101        0001 0001 0011 0011 0011 010

continue ...

   12.2₁₀ = 1100.0011 0011 ...₂
          = 1.100 0011 0011 ... x 2^3

   12₁₀ = 1100₂
   0.2₁₀:  0.2 x 2 = 0.4
           0.4 x 2 = 0.8
           0.8 x 2 = 1.6
           0.6 x 2 = 1.2
           0.2 x 2 = 0.4
           ...

   Only 24 bits can be stored (32-bit register shown):
   1 1 0 0 0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1 | 0 0 1 1 0 0 1 1
   The discarded bits are less than half of the LSB, so they are truncated.

   sign: +
   biased exponent: 3 + 127 = 130₁₀
   stored significand [23 bits] (leading 1 dropped): 10000110011 ...

   sign  biased exponent  significand
   0     1000 0010        1000 0110 0110 0110 0110 011


continue ...

Align the smaller number with the larger number by shifting it to the right [increasing the exponent]

   exponent   mantissa                      exponent   mantissa
   1000 0010  1.1000011001100110011    →    1000 0101  0.0011000011001100110011

Add the mantissas

     1.00010001001100110011010
   + 0.00110000110011001100110011
     1.01000010000000000000000011

   The extra bits are less than half of the LSB, so they are truncated.

Store the result in IEEE single-precision format

   sign  biased exponent  significand
   0     1000 0101        0100 0010 0000 0000 0000 000

Floating-point Multiplication

MULTIPLY (X x Y = Z):
  1. X = 0?  yes: Z ← 0, RETURN
  2. Y = 0?  yes: Z ← 0, RETURN
  3. Add exponents
  4. Subtract bias
  5. Exponent overflow?   yes: report overflow
  6. Exponent underflow?  yes: report underflow
  7. Multiply significands
  8. Normalize
  9. Round, RETURN

Example:
   X = 6.25₁₀ = 110.01₂ = 1.1001 x 2^2
   Y = 12.5₁₀ = 1100.1₂ = 1.1001 x 2^3
   E1 = 127 + 2 = 129
   E2 = 127 + 3 = 130

   Add exponents:    E1 + E2 = 259
   Subtract bias:    ET = 259 - 127 = 132

   Multiply significands:
       1.1001₂
     x 1.1001₂
      10.01110001₂

   Normalize:  10.01110001 x 2^5 = 1.001110001 x 2^6
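The multiplication steps can be sketched with integer significands and biased exponents. This sketch assumes normalized inputs of the form 1.xxxx scaled by 2^frac_bits, keeps every product bit (no rounding) and does not report exponent overflow or underflow; the name `fp_multiply` and its return shape are choices made for this illustration.

```python
def fp_multiply(sig_x: int, e_x: int, sig_y: int, e_y: int,
                frac_bits: int, bias: int = 127) -> tuple[int, int, int]:
    """Multiply two normalized values sig * 2^(e - bias - frac_bits).

    Returns (significand, fraction bit count, biased exponent) of the normalized product.
    """
    if sig_x == 0 or sig_y == 0:
        return 0, frac_bits, 0
    exponent = e_x + e_y - bias         # add exponents, subtract the bias once
    product = sig_x * sig_y             # carries 2 * frac_bits fraction bits
    out_bits = 2 * frac_bits
    if product >= (2 << out_bits):      # product in [2, 4): move the point left by one
        out_bits += 1
        exponent += 1
    return product, out_bits, exponent


# X = 1.1001 x 2^2 (E1 = 129), Y = 1.1001 x 2^3 (E2 = 130)
sig, bits, exp = fp_multiply(0b11001, 129, 0b11001, 130, frac_bits=4)
print(format(sig, "b"), bits, exp)
print(sig / 2 ** bits * 2.0 ** (exp - 127))   # value of the product
```

Because two significands in [1, 2) multiply to a value in [1, 4), at most one normalizing shift is needed.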

Floating Point Multiplication

Floating-point Division

DIVIDE (Y ÷ X = Z):
  1. Dividend = 0?  yes: Z ← 0, RETURN
  2. Divisor = 0?   yes: Z ← ∞, RETURN
  3. Subtract exponents
  4. Add bias
  5. Exponent overflow?   yes: report overflow
  6. Exponent underflow?  yes: report underflow
  7. Divide significands
  8. Normalize
  9. Round, RETURN

Example:
   X = 3.75₁₀ = 11.11₂ = 1.111 x 2^1
   Y = 95.625₁₀ = 101 1111.101₂ = 1.011111101 x 2^6
   E1 = 127 + 1 = 128
   E2 = 127 + 6 = 133

   Subtract exponents:  E2 - E1 = 5
   Add bias:            ET = 127 + 5 = 132

   Divide significands:
     1.011111101 ÷ 1.111 = 0.110011

   Normalize:  0.110011 x 2^5 = 1.10011 x 2^4
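The division steps can be sketched in the same spirit. To keep the significand quotient exact, this sketch uses Python's `fractions.Fraction` instead of a fixed-width register, which is a simplification rather than how hardware divides; rounding and overflow/underflow reporting are again omitted, and the name `fp_divide` is hypothetical.

```python
from fractions import Fraction


def fp_divide(sig_y: Fraction, e_y: int, sig_x: Fraction, e_x: int,
              bias: int = 127) -> tuple[Fraction, int]:
    """Divide Y by X, each given as a normalized significand in [1, 2) and a biased exponent."""
    if sig_x == 0:
        raise ZeroDivisionError("division by zero (the flowchart returns infinity here)")
    if sig_y == 0:
        return Fraction(0), 0
    exponent = e_y - e_x + bias         # subtract exponents, add the bias back
    quotient = sig_y / sig_x            # lies in (1/2, 2) for normalized inputs
    while quotient < 1:                 # normalize: shift left, decrement exponent
        quotient *= 2
        exponent -= 1
    return quotient, exponent


# Y = 1.011111101 x 2^6 (E2 = 133), X = 1.111 x 2^1 (E1 = 128)
sig, exp = fp_divide(Fraction(0b1011111101, 2 ** 9), 133, Fraction(0b1111, 2 ** 3), 128)
print(sig, exp, float(sig) * 2.0 ** (exp - 127))
```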


Floating Point Division

PROBLEM (1)

• Express the number -(640.5)₁₀ in IEEE 32-bit and 64-bit floating-point format

SOLUTION (1)

• IEEE 32-BIT FLOATING-POINT FORMAT

    | MSB: sign | Biased exponent (8 bits) | Mantissa/Significand, normalized (23 bits) |

• Step 1: Express the given number in binary form
    (640.5) = 1010000000.1 x 2^0
• Step 2: Normalize the number into the form 1.bbbbbbb
    1010000000.1 x 2^0 = 1.0100000001 x 2^9
  Once normalized, every number has 1 as its leftmost bit, so the IEEE notation does not store this bit. Therefore the significand to be stored is 0100 0000 0100 0000 0000 000 in the allotted 23 bits.

SOLUTION (1) (cont.)

• Step 3: For the 8-bit biased exponent field, the bias used is
    2^(k-1) - 1 = 2^(8-1) - 1 = 127
  Add the bias 127 to the exponent 9 and convert the sum into binary to store it in the 8-bit biased exponent field:
    127 + 9 = 136 (1000 1000)
• Step 4: Since the given number is negative, put the MSB as 1
• Step 5: Pack the result into the proper format (IEEE 32-bit)
    1 1000 1000 0100 0000 0100 0000 0000 000


SOLUTION (1) (cont.)

• IEEE 64-BIT FLOATING-POINT FORMAT

    | MSB: sign | Biased exponent (11 bits) | Mantissa/Significand, normalized (52 bits) |

• Step 1: Express the given number in binary form
    (640.5) = 1010000000.1 x 2^0
• Step 2: Normalize the number into the form 1.bbbbbbb
    1010000000.1 x 2^0 = 1.0100000001 x 2^9
  Once normalized, every number has 1 as its leftmost bit, so the IEEE notation does not store this bit. Therefore the significand to be stored is 0100 0000 0100 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 in the allotted 52 bits.

SOLUTION (1) (cont.)

• Step 3: For the 11-bit biased exponent field, the bias used is
    2^(k-1) - 1 = 2^(11-1) - 1 = 1023
  Add the bias 1023 to the exponent 9 and convert the sum into binary to store it in the 11-bit biased exponent field:
    1023 + 9 = 1032 (1000 0001 000)
• Step 4: Since the given number is negative, put the MSB as 1
• Step 5: Pack the result into the proper format (IEEE 64-bit)
    1 1000 0001 000 0100 0000 0100 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
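Both packed results can be verified with Python's standard `struct` module. The helper name `ieee_bits` is made up for this check; the format codes `>f`/`>I` and `>d`/`>Q` are the standard big-endian single and double codes.

```python
import struct


def ieee_bits(x: float, double: bool = False) -> str:
    """Return the IEEE 754 bit pattern of x as 'sign exponent fraction'."""
    if double:
        (raw,) = struct.unpack(">Q", struct.pack(">d", x))
        text, exp_len = format(raw, "064b"), 11
    else:
        (raw,) = struct.unpack(">I", struct.pack(">f", x))
        text, exp_len = format(raw, "032b"), 8
    return f"{text[0]} {text[1:1 + exp_len]} {text[1 + exp_len:]}"


print(ieee_bits(-640.5))
print(ieee_bits(-640.5, double=True))
```

The fraction field in both outputs starts with 0100000001, matching the normalized form 1.0100000001 x 2^9 from Step 2.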
