Computer Architecture & Organization Unit 2

This document discusses different methods for representing numbers in digital computers. It describes fixed point notation, which reserves a fixed number of bits for the integer and fractional parts of a number. Floating point notation allows a varying number of bits after the decimal point. The key advantages of floating point are flexibility and a wider range of representable values, while fixed point has better performance. The document then explains IEEE floating point standards for single, double and quadruple precision formats.


Unit - 2

By: Namrata Singh


Fixed Point and Floating Point Number Representations
Digital computers use the binary number system to represent all types of information internally. Alphanumeric characters are represented using binary bits (i.e., 0 and 1). Digital representations are easier to design, storage is easy, and accuracy and precision are greater.

There are various number representation techniques for digital systems, for example the binary, octal, decimal, and hexadecimal number systems; but the binary number system is the most relevant and popular for representing numbers in a digital computer system.

Storing Real Numbers

There are two major approaches to storing real numbers (i.e., numbers with a fractional component) in modern computing. These are (i) Fixed-Point Notation and (ii) Floating-Point Notation. In fixed-point notation there is a fixed number of digits after the radix point, whereas floating-point notation allows a varying number of digits after the radix point.

Fixed-Point Representation −
This representation has a fixed number of bits for the integer part and for the fractional part. For example, if the fixed-point representation is IIII.FFFF, then the minimum value you can store is 0000.0001 and the maximum value is 9999.9999. A fixed-point number representation has three parts: the sign field, the integer field, and the fractional field.



We can represent these numbers using:

1) Sign-magnitude representation: range from −(2^(k−1) − 1) to (2^(k−1) − 1), for k bits.
2) 1's complement representation: range from −(2^(k−1) − 1) to (2^(k−1) − 1), for k bits.
3) 2's complement representation: range from −2^(k−1) to (2^(k−1) − 1), for k bits.

2's complement representation is preferred in computer systems because zero has an unambiguous representation and arithmetic operations are easier.

Example − Assume a number uses a 32-bit format which reserves 1 bit for the sign, 15 bits for the integer part, and 16 bits for the fractional part.

Then −43.625 is represented as:

1 | 000000000101011 | 1010000000000000

where 0 represents + and 1 represents −; 000000000101011 is the 15-bit binary value of the integer 43, and 1010000000000000 is the 16-bit binary value of the fraction 0.625.

The advantage of a fixed-point representation is performance; the disadvantage is the relatively limited range of values it can represent. So it is usually inadequate for numerical analysis, as it does not allow enough numbers and accuracy. A number whose representation exceeds 32 bits would have to be stored inexactly.

With the 32-bit format given above, the smallest positive number that can be stored is 2^−16 ≈ 0.000015, and the largest positive number is (2^15 − 1) + (1 − 2^−16) = 2^15 − 2^−16 ≈ 32768. The gap between consecutive representable numbers is always 2^−16; the radix point stays fixed, so range can only be traded for precision by changing how the bits are split between the integer and fractional fields.
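The 1/15/16 layout from the example above can be sketched in Python, scaling the value by 2^16 so the low 16 bits hold the fraction (a minimal sketch; the function names are illustrative, not from the text):

```python
def encode_fixed(x):
    """Pack x into the 1-bit sign / 15-bit integer / 16-bit fraction format."""
    sign = 1 if x < 0 else 0
    raw = round(abs(x) * (1 << 16))       # scale: low 16 bits become the fraction
    assert raw < (1 << 31), "magnitude does not fit in 15 integer bits"
    return (sign << 31) | raw

def decode_fixed(word):
    """Recover the value: magnitude / 2^16, negated if the sign bit is set."""
    mag = (word & 0x7FFFFFFF) / (1 << 16)
    return -mag if (word >> 31) else mag
```

For −43.625 this yields sign bit 1, integer field 43, and fraction field 1010000000000000₂ = 40960, matching the worked example.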



Floating-Point Representation −
This representation does not reserve a specific number of bits for the integer part or the
fractional part. Instead it reserves a certain number of bits for the number (called the
mantissa or significand) and a certain number of bits to say where within that number the
decimal place sits (called the exponent).

The floating-point representation of a number has two parts: the first part represents a signed, fixed-point number called the mantissa; the second part designates the position of the decimal (or binary) point and is called the exponent. The fixed-point mantissa may be a fraction or an integer. A floating-point number is always interpreted to represent a number in the following form: m × r^e.

Only the mantissa m and the exponent e are physically represented in the register (including their signs). A floating-point binary number is represented in a similar manner except that it uses base 2 for the exponent. A floating-point number is said to be normalized if the most significant digit of the mantissa is 1.

So, the actual number is (−1)^s × (1 + m) × 2^(e − Bias), where s is the sign bit, m is the mantissa, e is the exponent value, and Bias is the bias number.

Note that signed integers and exponents can be represented by sign-magnitude, one's complement, or two's complement representation.

The floating-point representation is more flexible. Any non-zero number can be represented in the normalized form ±(1.b1b2b3…)₂ × 2^n. This is the normalized form of a number x.

Example − Suppose a number uses a 32-bit format: 1 sign bit, 8 bits for the signed exponent, and 23 bits for the fractional part. The leading bit 1 is not stored (as it is always 1 for a normalized number) and is referred to as a "hidden bit".

Then −53.5 is normalized as −53.5 = (−110101.1)₂ = (−1.101011)₂ × 2^5, which is represented as:

1 | 00000101 | 10101100000000000000000

where 00000101 is the 8-bit binary value of the exponent +5.


Note that 8-bit exponent field is used to store integer exponents -126 ≤ n ≤ 127.

The smallest normalized positive number that fits into 32 bits is (1.00000000000000000000000)₂ × 2^−126 = 2^−126 ≈ 1.18 × 10^−38, and the largest normalized positive number that fits into 32 bits is (1.11111111111111111111111)₂ × 2^127 = (2^24 − 1) × 2^104 ≈ 3.40 × 10^38. These numbers are represented as shown below.

The precision of a floating-point format is the number of positions reserved for binary digits
plus one (for the hidden bit). In the examples considered here the precision is 23+1=24.

The gap between 1 and the next normalized floating-point number is known as the machine epsilon. For the above example the gap is (1 + 2^−23) − 1 = 2^−23. Note that this is not the same as the smallest positive floating-point number: unlike the fixed-point case, the spacing between floating-point numbers is non-uniform.
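The field layout can be inspected with Python's struct module. Note that actual IEEE 754 hardware stores a biased exponent, so for −53.5 the stored exponent is 5 + 127 = 132 rather than the signed +5 used in the example above (a sketch; the helper name is illustrative):

```python
import struct

def float_bits(x):
    """Return the (sign, biased exponent, fraction) fields of x as a float32."""
    (w,) = struct.unpack('>I', struct.pack('>f', x))   # reinterpret the 32 bits
    return w >> 31, (w >> 23) & 0xFF, w & 0x7FFFFF

# float_bits(-53.5) gives sign 1, exponent 132 (= 5 + 127),
# and fraction 10101100000000000000000 (the hidden leading 1 is dropped).
```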

Note that numbers with a non-terminating binary expansion cannot be represented exactly in floating-point representation; e.g., 1/3 = (0.010101…)₂ cannot be a floating-point number, as its binary representation is non-terminating.

IEEE Floating point Number Representation −

IEEE (Institute of Electrical and Electronics Engineers) has standardized floating-point representation as shown in the following diagram.

So, the actual number is (−1)^s × (1 + m) × 2^(e − Bias), where s is the sign bit, m is the mantissa, e is the exponent value, and Bias is the bias number. The sign bit is 0 for a positive number and 1 for a negative number. Exponents are represented in biased (excess) form.

According to IEEE 754 standard, the floating-point number is represented in following ways:

1) Half Precision (16 bit): 1 sign bit, 5 bit exponent, and 10 bit mantissa
2) Single Precision (32 bit): 1 sign bit, 8 bit exponent, and 23 bit mantissa
3) Double Precision (64 bit): 1 sign bit, 11 bit exponent, and 52 bit mantissa
4) Quadruple Precision (128 bit): 1 sign bit, 15 bit exponent, and 112 bit mantissa
Special Value Representation −

There are some special values depending upon different values of the exponent and mantissa in the IEEE 754 standard.

1) All exponent bits 0 with all mantissa bits 0 represents 0. If the sign bit is 0, then +0, else −0.
2) All exponent bits 1 with all mantissa bits 0 represents infinity. If the sign bit is 0, then +∞, else −∞.
3) All exponent bits 0 with non-zero mantissa bits represents a denormalized number.
4) All exponent bits 1 with non-zero mantissa bits represents NaN (Not a Number), used to signal invalid results.
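The four rules can be sketched as a classifier over the single-precision fields (a minimal sketch; the function name and labels are illustrative):

```python
def classify(sign, exp, frac):
    """Classify a single-precision (sign, 8-bit exponent, 23-bit fraction)."""
    if exp == 0:                                       # rule 1 and rule 3
        return ('+0' if sign == 0 else '-0') if frac == 0 else 'denormalized'
    if exp == 0xFF:                                    # rule 2 and rule 4
        return ('+inf' if sign == 0 else '-inf') if frac == 0 else 'NaN'
    return 'normalized'
```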



HALF-ADDER
Half-adder is a circuit that can add two binary bits. Its outputs are
SUM and CARRY. The following truth table shows various
combinations of inputs and their corresponding outputs of a
half-adder. X and Y denote inputs and C and S denote CARRY
and SUM.

X Y | CARRY (C) | SUM (S)
0 0 | 0 | 0
0 1 | 0 | 1 (X'Y)
1 0 | 0 | 1 (XY')
1 1 | 1 (XY) | 0

Truth Table for a Half-Adder

The minterms for SUM and CARRY are shown in brackets.
The Sum-Of-Products (SOP) equation for SUM is:

S = X'Y + XY' = X ⊕ Y …..………… ( 1 )

Similarly, the SOP equation for the CARRY is:

C = XY …………. ………………………( 2 )

Combining the logic circuits for equations ( 1 ) and ( 2 ) we get the circuit for the Half-Adder as:

Half-Adder Circuit and Symbol
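Equations (1) and (2) can be checked with a small Python sketch (the function name is illustrative):

```python
def half_adder(x, y):
    """Half-adder: CARRY = XY (eq. 2), SUM = X xor Y (eq. 1)."""
    return x & y, x ^ y   # (carry, sum)

# Reproduces every row of the half-adder truth table above.
```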

FULL-ADDER
Full-Adder is a logic circuit that adds three binary bits. Its outputs are SUM and CARRY. In the following truth table X, Y, Z are inputs and C and S are CARRY and SUM.



X Y Z | CARRY (C) | SUM (S)
0 0 0 | 0 | 0
0 0 1 | 0 | 1 (X'Y'Z)
0 1 0 | 0 | 1 (X'YZ')
0 1 1 | 1 (X'YZ) | 0
1 0 0 | 0 | 1 (XY'Z')
1 0 1 | 1 (XY'Z) | 0
1 1 0 | 1 (XYZ') | 0
1 1 1 | 1 (XYZ) | 1 (XYZ)

Truth Table for a Full-Adder


The minterms are written in brackets for each 1 output in the truth table. From these, the SOP equation for SUM can be written as:

S = X'Y'Z + X'YZ' + XY'Z' + XYZ

  = X'(Y'Z + YZ') + X(Y'Z' + YZ)

  = X'S1 + XS1' ……………………. (3)

(the exclusive-OR and equivalence functions are complements of each other). Here S1 = Y ⊕ Z is the SUM of a Half-Adder.

Again, the SOP equation for the Full-Adder CARRY is:

C = X'YZ + XY'Z + XYZ' + XYZ

  = YZ(X' + X) + X(Y'Z + YZ')

  = YZ + XS1

  = C1 + XS1 ................................. (4)

Here also C1 means the CARRY of the half-adder and S1 means the SUM of the half-adder.

Now, using two half-adder circuits and one OR gate, we can implement equations ( 3 ) and ( 4 ) to obtain a full-adder circuit as follows:



Full-Adder Circuit and its Symbol
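The construction in equations (3) and (4) — two half-adders plus an OR gate — can be sketched as (names illustrative):

```python
def half_adder(x, y):
    """CARRY = XY, SUM = X xor Y."""
    return x & y, x ^ y

def full_adder(x, y, z):
    """Two half-adders plus an OR gate, per equations (3) and (4)."""
    c1, s1 = half_adder(y, z)    # first half-adder adds Y and Z: S1, C1
    c2, s2 = half_adder(x, s1)   # second adds X to S1: SUM = X xor S1
    return c1 | c2, s2           # CARRY = C1 + X*S1 (the OR gate)
```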

HALF-SUBTRACTOR

A half-subtractor subtracts one bit from another bit. It has two outputs, viz. DIFFERENCE (D) and BORROW (B).

X Y | BORROW (B) | DIFFERENCE (D)
0 0 | 0 | 0
0 1 | 1 (X'Y) | 1 (X'Y)
1 0 | 0 | 1 (XY')
1 1 | 0 | 0

Truth Table for Half-Subtractor

The minterms are written within parentheses for each 1 output in each column. The SOP equations are:

D = X'Y + XY'
  = X ⊕ Y ................................. (5)

B = X'Y ................................. (6)

The half-subtractor circuit and the symbol

FULL-SUBTRACTOR

A full-subtractor circuit can find the DIFFERENCE and BORROW arising from a subtraction operation involving three binary bits.

X Y Z | BORROW (B') | DIFFERENCE (D')
0 0 0 | 0 | 0
0 0 1 | 1 (X'Y'Z) | 1 (X'Y'Z)
0 1 0 | 1 (X'YZ') | 1 (X'YZ')
0 1 1 | 1 (X'YZ) | 0
1 0 0 | 0 | 1 (XY'Z')
1 0 1 | 0 | 0
1 1 0 | 0 | 0
1 1 1 | 1 (XYZ) | 1 (XYZ)

Truth Table for Full-Subtractor

The SOP equation for the DIFFERENCE is:

D' = X'Y'Z + X'YZ' + XY'Z' + XYZ

   = (X'Y' + XY)Z + (X'Y + XY')Z'

   = (X ⊕ Y)'Z + (X ⊕ Y)Z'

   = D ⊕ Z ................................. (7)

And the SOP equation for BORROW is:

B' = X'Y'Z + X'YZ' + X'YZ + XYZ

   = X'Y(Z' + Z) + (X'Y' + XY)Z

   = X'Y + (X ⊕ Y)'Z

   = B + (X ⊕ Y)'Z ................................. (8)

In equations (7) and (8), D = X ⊕ Y and B = X'Y stand for the DIFFERENCE and BORROW outputs of the half-subtractor, and (X ⊕ Y)' is the complement of D. Now, from equations (7) and (8), we can construct a full-subtractor using two half-subtractors and an OR gate.



Full-Subtractor circuit
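The two-half-subtractor construction can be sketched in Python (names illustrative; the second half-subtractor removes the borrow-in Z from the first difference):

```python
def half_subtractor(x, y):
    """D = X xor Y (eq. 5), B = X'Y (eq. 6); returns (borrow, difference)."""
    return (x ^ 1) & y, x ^ y

def full_subtractor(x, y, z):
    """Two half-subtractors plus an OR gate, per equations (7) and (8)."""
    b1, d1 = half_subtractor(x, y)   # subtract Y from X
    b2, d2 = half_subtractor(d1, z)  # subtract borrow-in Z from D
    return b1 | b2, d2               # (borrow-out, difference)
```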



Carry look ahead adder:
Motivation behind Carry Look-Ahead Adder:

In ripple carry adders, for each adder block the two bits to be added are available instantly. However, each adder block waits for the carry to arrive from its previous block, so it is not possible to generate the sum and carry of any block until the input carry is known.
The ith block waits for the (i−1)th block to produce its carry, so there is a considerable time delay, called the carry propagation delay.

Figure – Digital Logic

Consider the above 4-bit ripple carry adder. The sum S4 is produced by the corresponding full adder as soon as the input signals are applied to it. But the carry input C4 does not reach its final steady-state value until carry C3 is available at its steady-state value. Similarly C3 depends on C2, and C2 on C1. Therefore the carry must propagate through all the stages before the output S3 and carry C4 settle to their final steady-state values. The propagation time is equal to the propagation delay of one adder block multiplied by the number of adder blocks in the circuit. For example, if each full-adder stage has a propagation delay of 20 nanoseconds, then S3 will reach its final correct value only after 60 (20 × 3) nanoseconds. The situation gets worse as we extend the number of stages to add more bits.



Carry Look-ahead Adder:
A carry look-ahead adder reduces the propagation delay by introducing more complex hardware. In this design,
the ripple carry design is suitably transformed such that the carry logic over fixed groups of bits of the adder is
reduced to two-level logic. Let us discuss the design in detail.

Figure - Design.

Figure – Truth table.
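The generate/propagate scheme behind the design can be sketched in Python: Gi = Ai·Bi, Pi = Ai ⊕ Bi, and Ci+1 = Gi + Pi·Ci. Hardware expands this recurrence into two-level logic so all carries appear at once; the loop below is the behavioral equivalent (a sketch, names illustrative):

```python
def cla_4bit(a, b, c0=0):
    """4-bit carry look-ahead adder over 4-bit integers a and b."""
    g = [(a >> i & 1) & (b >> i & 1) for i in range(4)]  # generate  Gi = Ai.Bi
    p = [(a >> i & 1) ^ (b >> i & 1) for i in range(4)]  # propagate Pi = Ai^Bi
    c = [c0]
    for i in range(4):
        c.append(g[i] | (p[i] & c[i]))   # C(i+1) = Gi + Pi.Ci
    s = [p[i] ^ c[i] for i in range(4)]  # Si = Pi ^ Ci
    return sum(s[i] << i for i in range(4)) | (c[4] << 4)
```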



Generally, we perform many mathematical operations in our daily life, such as addition, subtraction, multiplication, division, and so on. The multiplication process can be performed by different methods: different algorithms can be used, such as the grid multiplication method, long multiplication, lattice multiplication, peasant or binary multiplication, and so on.
Binary multiplication is usually performed in digital electronics by an electronic circuit called a binary multiplier. These binary multipliers are implemented using different computer arithmetic techniques. The Booth multiplier, which works on the Booth algorithm, is one of the most frequently used binary multipliers.



Addition and Subtraction
Four basic computer arithmetic operations are addition, subtraction, multiplication, and division. The arithmetic operations in a digital computer manipulate data to produce results. It is necessary to design arithmetic procedures and circuits that implement arithmetic operations as algorithms. An algorithm is a solution to a problem stated as a finite number of well-defined procedural steps. Algorithms can be developed for the following types of data.
1. Fixed point binary data in signed magnitude representation
2. Fixed point binary data in signed 2’s complement representation.
3. Floating point representation
4. Binary Coded Decimal (BCD) data

Addition and Subtraction with signed magnitude


Consider two numbers having magnitudes A and B. When the signed numbers are added or subtracted, there are 8 different conditions depending on the signs and the operation performed, as shown in the table below:
Operation   | Add magnitudes | When A > B | When A < B | When A = B
(+A) + (+B) | +(A + B)       | --         | --         | --
(+A) + (-B) | --             | +(A - B)   | -(B - A)   | +(A - B)
(-A) + (+B) | --             | -(A - B)   | +(B - A)   | +(A - B)
(-A) + (-B) | -(A + B)       | --         | --         | --
(+A) - (+B) | --             | +(A - B)   | -(B - A)   | +(A - B)
(+A) - (-B) | +(A + B)       | --         | --         | --
(-A) - (+B) | -(A + B)       | --         | --         | --
(-A) - (-B) | --             | -(A - B)   | +(B - A)   | +(A - B)
From the table, we can derive an algorithm for addition and subtraction as follows:
Addition (Subtraction) Algorithm:
 When the signs of A and B are identical, add the two magnitudes and attach the sign of A to the result.
 When the signs of A and B are different, compare the magnitudes and subtract the smaller number from the larger one. Choose the sign of the result to be the same as A if A > B, or the complement of the sign of A if A < B. If the two numbers are equal, subtract B from A and make the sign of the result positive.
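The algorithm above can be sketched in Python (sign bits 0 = +, 1 = −; the function name and argument order are illustrative):

```python
def signed_mag_addsub(sa, a, sb, b, subtract=False):
    """Signed-magnitude add/subtract: sa, sb are sign bits; a, b magnitudes."""
    if subtract:
        sb ^= 1                      # (+A) - (+B) is treated as (+A) + (-B)
    if sa == sb:
        return sa, a + b             # identical signs: add, keep sign of A
    if a >= b:                       # different signs: subtract smaller
        return (0 if a == b else sa), a - b   # equal magnitudes give +0
    return sa ^ 1, b - a             # A < B: complement the sign of A
```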



Hardware Implementation

fig: Hardware for signed magnitude addition and subtraction

The hardware consists of two registers A and B to store the magnitudes, and two flip-flops As and Bs to store the corresponding signs. The result is stored in register A and flip-flop As, which act as an accumulator. The subtraction is performed by adding A to the 2's complement of B. The output carry is transferred to flip-flop E, and an overflow during the add operation is stored in the add-overflow flip-flop AVF. When M = 0, the content of B is transferred to the adder without any change along with an input carry of 0.

The output of the parallel adder is then equal to A + B, which is an add operation. When M = 1, the content of register B is complemented and transferred to the parallel adder along with an input carry of 1. Therefore, the output of the parallel adder is equal to A + B' + 1 = A − B, which is a subtract operation.



Hardware Algorithm

fig: flowchart for add and subtract operations

As and Bs are compared by an exclusive-OR gate. If the output is 0 the signs are identical; if 1, the signs are different.
 For an add operation, identical signs dictate that the magnitudes be added; for a subtract operation, different signs dictate that the magnitudes be added. Magnitudes are added with the micro-operation EA ← A + B.
 The two magnitudes are subtracted if the signs are different for an add operation or identical for a subtract operation. Magnitudes are subtracted with the micro-operation EA ← A + B' + 1. If E = 1, then A ≥ B and the number in A is the correct result (this number is checked for 0 to produce a positive zero [As = 0]). If E = 0, then A < B, so we take the 2's complement of A.

Multiplication
Hardware Implementation and Algorithm
Generally, the multiplication of two fixed-point binary numbers in signed-magnitude representation is performed by a process of successive shift and add operations. The process consists of looking at successive bits of the multiplier (least significant bit first). If the multiplier bit is 1, the multiplicand is copied down; otherwise, 0's are copied. The numbers copied down in successive lines are shifted one position to the left, and finally all the numbers are added to get the product.
But in digital computers, an adder for the summation of only two binary numbers is used, and the partial product is accumulated in a register. Similarly, instead of shifting the multiplicand to the left, the partial product is shifted to the right. The hardware for the multiplication of signed-magnitude data is shown in the figure below.

Hardware for multiply operation


Initially, the multiplier is stored in the Q register and the multiplicand in the B register. The A register is used to store the partial product, and the sequence counter (SC) is set to the number of bits in the multiplier. The sum of A and B forms the partial product, and both are shifted to the right using the statement "shr EAQ" as shown in the hardware algorithm. The flip-flops As, Bs and Qs store the signs of A, B and Q respectively. A binary 0 is inserted into flip-flop E during the shift right.
Hardware Algorithm

flowchart for multiply algorithm
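The shift-and-add flowchart can be sketched for magnitudes (signs are handled separately via As ⊕ Bs, as the text describes); E catches the adder carry before each right shift. A sketch with illustrative names:

```python
def shift_add_multiply(b, q, n):
    """Shift-right multiply of n-bit magnitudes: A holds the partial
    product, E the carry, Q the multiplier; SC counts down from n."""
    a, e = 0, 0
    for _ in range(n):                          # SC = n .. 1
        if q & 1:                               # low multiplier bit Qn = 1
            a += b                              # EA <- A + B
            e, a = a >> n, a & ((1 << n) - 1)   # carry out goes into E
        # shr EAQ: shift E, A, Q right one place, 0 into E
        q = (q >> 1) | ((a & 1) << (n - 1))
        a = (a >> 1) | (e << (n - 1))
        e = 0
    return (a << n) | q                         # 2n-bit product in A:Q
```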



 The multiplicand is subtracted from the partial product upon encountering the first least significant 1 in a string of 1's in the multiplier.
 The multiplicand is added to the partial product upon encountering the first 0 (provided that there was a previous 1) in a string of 0's in the multiplier.
 The partial product does not change when the current multiplier bit is identical to the previous multiplier bit.
This algorithm works for both positive and negative numbers in signed 2's-complement form. The hardware implementation of this algorithm is shown in the figure below:

The flowchart for the Booth multiplication algorithm is given below:

flowchart for booth multiplication algorithm

Numerical Example: Booth algorithm


BR=10111(Multiplicand)
QR=10011(Multiplier)
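The example above can be checked with a Python sketch of Booth's algorithm. Assuming 5-bit two's-complement operands, BR = 10111 = −9 and QR = 10011 = −13, so the expected product is 117 (function and variable names are illustrative):

```python
def booth_multiply(br, qr, n):
    """Booth's algorithm on n-bit two's-complement bit patterns.
    a accumulates partial products; q_1 is the appended bit Q(-1)."""
    mask = (1 << n) - 1
    a, q, q_1 = 0, qr & mask, 0
    for _ in range(n):                      # SC = n .. 1
        pair = ((q & 1) << 1) | q_1
        if pair == 0b10:                    # first 1 of a run of 1's
            a = (a - br) & mask             # A <- A - BR
        elif pair == 0b01:                  # first 0 after a run of 1's
            a = (a + br) & mask             # A <- A + BR
        q_1 = q & 1                         # arithmetic shift right (A, Q, Q-1)
        q = ((q >> 1) | ((a & 1) << (n - 1))) & mask
        a = (a >> 1) | (a & (1 << (n - 1)))
    product = (a << n) | q                  # 2n-bit two's-complement result
    return product - (1 << 2 * n) if product >> (2 * n - 1) else product
```

booth_multiply(0b10111, 0b10011, 5) returns 117, confirming (−9) × (−13) = 117.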
Array Multiplier
The multiplication algorithm above checks the bits of the multiplier one at a time and forms partial products. This is a sequential process that requires a sequence of add and shift micro-operations, which is complicated and time consuming. The multiplication of two binary numbers can also be done in one micro-operation by using a combinational circuit that produces the product all at once.
Example.
Consider that the multiplicand bits are b1 and b0 and the multiplier bits are a1 and a0. The partial product is c3c2c1c0. The multiplication of two bits such as a0 and b0 produces a binary 1 if both bits are 1; otherwise it produces a binary 0. This is identical to the AND operation and can be implemented with AND gates, as shown in the figure.

2-bit by 2-bit array multiplier
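The 2-bit by 2-bit array multiplier can be sketched gate-by-gate: AND gates form the partial products and two half-adders combine them, all combinationally (intermediate names are illustrative):

```python
def array_multiply_2x2(a1, a0, b1, b0):
    """Combinational 2x2 array multiplier producing c3 c2 c1 c0."""
    c0 = a0 & b0               # AND gate: lowest product bit
    s, t = a0 & b1, a1 & b0    # middle partial products
    c1 = s ^ t                 # half-adder sum
    k = s & t                  # half-adder carry
    c2 = (a1 & b1) ^ k         # second half-adder combines a1b1 and carry
    c3 = (a1 & b1) & k
    return (c3 << 3) | (c2 << 2) | (c1 << 1) | c0
```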

Division Algorithm
The division of two fixed-point signed numbers can be done by a process of successive compare, shift, and subtract operations. When implemented in digital computers, instead of shifting the divisor to the right, the dividend or the partial remainder is shifted to the left. The subtraction is obtained by adding the number A to the 2's complement of number B. Information about the relative magnitudes of the numbers is obtained from the end carry.
Hardware Implementation
The hardware implementation for the division of signed numbers is shown in the figure.



Division Algorithm
The divisor is stored in register B and a double length dividend is stored in register A and Q.
the dividend is shifted to the left and the divider is subtracted by adding twice complement of
the value. If E = 1, then A >= B. In this case, a quotient bit 1 is inserted into Qn and the partial
remainder is shifted to the left to repeat the process. If E = 0, then A > B. In this case, the
quotient bit Qn remains zero and the value of B is added to restore the partial remainder in A
to the previous value. The partial remainder is shifted to the left and approaches continues
until the sequence counter reaches to 0. The registers E, A & Q are shifted to the left with 0
inserted into Qn and the previous value of E is lost as shown in the flow chart for division
algorithm.

flowchart for division algorithm


This algorithm can be explained with the help of an example.
Consider that the divisor is 10001 and the dividend is 01110.



binary division with digital hardware
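The restoring flowchart can be sketched in Python; the hardware's end-carry test on E becomes a sign test on the partial remainder (a sketch, names illustrative, unsigned operands without overflow assumed):

```python
def restoring_divide(dividend, divisor, n):
    """Restoring division: shift AQ left, subtract the divisor,
    restore on a negative result; quotient bits enter Qn."""
    a, q = dividend >> n, dividend & ((1 << n) - 1)  # double-length dividend
    for _ in range(n):                               # SC = n .. 1
        a = (a << 1) | (q >> (n - 1))                # shl AQ
        q = (q << 1) & ((1 << n) - 1)
        a -= divisor                                 # A <- A + B' + 1
        if a < 0:
            a += divisor                             # E = 0: restore A
        else:
            q |= 1                                   # E = 1: quotient bit 1
    return q, a                                      # (quotient, remainder)
```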
Restoring method
The method described above is the restoring method, in which the partial remainder is restored by adding the divisor back to the negative result. Other methods:
Comparison method: A and B are compared prior to subtraction. If A ≥ B, B is subtracted from A; if A < B, nothing is done. The partial remainder is then shifted left and the numbers are compared again. The comparison inspects the end carry out of the parallel adder before transferring it to E.
Non-restoring method: In contrast to the restoring method, when A − B is negative, B is not added back to restore A; instead, the negative difference is shifted left and then B is added. How is this possible? Let's argue:
 In the flowchart for the restoring method, when A < B, we restore A by the operation A − B + B. Next time in the loop, this number is shifted left (multiplied by 2) and B is subtracted again, which gives: 2(A − B + B) − B = 2A − B.
 In the non-restoring method, we leave A − B as it is. Next time around the loop, the number is shifted left and B is added: 2(A − B) + B = 2A − B (same as above).



Divide Overflow
The division algorithm may produce a quotient overflow, called divide overflow. Overflow can occur if the number of bits in the quotient exceeds the storage capacity of the register. The overflow flip-flop DVF is set to 1 if overflow occurs.
Divide overflow occurs if the value of the most significant half of the bits of the dividend is equal to or greater than the value of the divisor. Similarly, overflow occurs if the dividend is divided by 0. The overflow may cause an error in the result, or it may stop the operation. When the overflow stops the operation of the system, it is called a divide stop.

Arithmetic Operations on Floating-Point Numbers


The rules apply to the single-precision IEEE standard format. These rules specify only the major steps needed to perform the four operations. Intermediate results for both mantissas and exponents might require more than 24 and 8 bits, respectively, and overflow or underflow may occur. These and other aspects of the operations must be carefully considered in designing an arithmetic unit that meets the standard. If their exponents differ, the mantissas of floating-point numbers must be shifted with respect to each other before they are added or subtracted. Consider a decimal example in which we wish to add 2.9400 × 10^2 to 4.3100 × 10^4. We rewrite 2.9400 × 10^2 as 0.0294 × 10^4 and then perform addition of the mantissas to get 4.3394 × 10^4. The rule for addition and subtraction can be stated as follows:

Add/Subtract Rule

The steps in addition (FA) or subtraction (FS) of floating-point numbers (s1, e1, f1) and (s2, e2, f2) are as follows.

1. Unpack the sign, exponent, and fraction fields. Handle special operands such as zero, infinity, or NaN (Not a Number).
2. Shift the significand of the number with the smaller exponent right by |e1 − e2| bits.
3. Set the result exponent er to max(e1, e2).
4. If the instruction is FA and s1 = s2, or if the instruction is FS and s1 ≠ s2, then add the significands; otherwise subtract them.


5. Count the number z of leading zeros. A carry can make z = -1. Shift the result
significand left z bits or right 1 bit if z = -1.
6. Round the result significand, and shift right and adjust z if there is rounding overflow,
which is a carry-out of the leftmost digit upon rounding.
7. Adjust the result exponent by er = er - z, check for overflow or underflow, and pack
the result sign, biased exponent, and fraction bits into the result word.
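Steps 2–4 can be sketched for binary operands, treating each number as an integer significand f scaled by 2^e (a simplified sketch: unpacking, sign handling, rounding, and normalization are omitted, and the function name is illustrative):

```python
def fp_align_add(e1, f1, e2, f2):
    """Align-and-add for values f * 2**e with integer significands."""
    if e1 < e2:                           # make operand 1 the larger exponent
        e1, f1, e2, f2 = e2, f2, e1, f1
    f2 >>= (e1 - e2)                      # step 2: shift the smaller operand right
    return e1, f1 + f2                    # steps 3-4: er = max(e1, e2); add
```

For instance, fp_align_add(4, 43100, 2, 29400) shifts the smaller operand right two places before adding, mirroring the decimal rewrite of 2.9400 × 10^2 as 0.0294 × 10^4 above.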

Multiplication and division are somewhat easier than addition and subtraction, in that
no alignment of mantissas is needed.
