Lecture 3 - Computer Organization and Design
____________________________________________
Character Sets
______________
The ASCII character set is everything you can type on an American standard
keyboard plus some formatting characters. Germans who can't live without
their umlauts can always use Unicode, which needs more than one byte per
character (two bytes per character in the old UCS-2 encoding; one to four
bytes in today's UTF-8). English rarely uses diacritical marks, although
typing "Schrödinger" is fun.
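A quick sketch in Python (the byte values are standard UTF-8, nothing
special to this course) showing why the umlaut costs extra:

    # "Schrödinger" in UTF-8: each ASCII letter takes one byte, but the
    # umlaut 'ö' (code point U+00F6) takes two bytes (0xC3 0xB6).
    s = "Schrödinger"
    data = s.encode("utf-8")
    print(len(s), len(data))   # 11 characters, 12 bytes
    print(data.hex(" "))       # 53 63 68 72 c3 b6 64 69 6e 67 65 72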
Floating-Point Numbers
______________________
All numbers inside computers are _binary rationals_. This, and the finite
size of registers, limits the accuracy with which we can represent arbitrary
rational and real numbers.
A number 'n' has a finite binary expansion iff 'n' is a binary rational,
i.e., a rational whose denominator is a power of 2.
One reason for caring about fixed-point numbers is that every floating-point
number is a scaled version of a (1+f)-bit binary fixed-point number, where
'f' is the number of bits set aside for the fractional part of the significand.
      m-1
x  =  Sigma  x_i * r^i   ==>   (x_{m-1} ... x_0 <radix point> x_{-1} ... x_{-n})_r
      i=-n
The digits to the right of the radix point are given negative indices and
their weights are negative powers of the radix.
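A minimal sketch of this definition in Python (the function name is mine):

    # Evaluate (x_{m-1} ... x_0 <radix point> x_{-1} ... x_{-n})_r as
    # Sigma x_i * r^i, given the digits from most to least significant
    # and the count n of digits to the right of the radix point.
    def positional_value(digits, r, n):
        return sum(d * r ** (len(digits) - 1 - i - n)
                   for i, d in enumerate(digits))

    # (010.11100)_2, which appears below: 2 + 1/2 + 1/4 + 1/8 = 2.875
    print(positional_value([0, 1, 0, 1, 1, 1, 0, 0], 2, 5))   # 2.875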
Now, to convert the fractional part .9 (we are converting 2.9), multiply by
2 repeatedly; the integer part produced at each step is the next binary digit:

.9 * 2 = 1.8   digit 1, keep .8
.8 * 2 = 1.6   digit 1, keep .6
.6 * 2 = 1.2   digit 1, keep .2
.2 * 2 = 0.4   digit 0, keep .4
.4 * 2 = 0.8   digit 0, keep .8   Should we stop here, or
-----------------------------
.8 * 2 = 1.6   digit 1, keep .6   should we compute one more binary digit?
Just taking the first five fractional digits gives (010.11100) = 2.875.
We _could_ do something with the sixth digit: since it is 1, we can add 1
to the last place of the current approximation. This gives (010.11101) =
2.90625, which is closer to 2.9 than is (010.11100). This last refinement
is called _rounding_. We won't use rounding in what follows, even though it
often improves accuracy.
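The doubling procedure above is easy to mechanize. Here is a minimal sketch
(my own helper, not the lecture's) that truncates after k digits, just as
we chose to do:

    # Repeatedly double the fraction; the integer part that pops out
    # at each step is the next binary digit. Truncate (no rounding).
    def frac_to_binary(frac, k):
        bits = []
        for _ in range(k):
            frac *= 2
            bits.append(int(frac))
            frac -= int(frac)
        return bits

    print(frac_to_binary(0.9, 6))   # [1, 1, 1, 0, 0, 1], as on the board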
When we move the binary point left, we increase the exponent; when we move
the binary point right, we decrease the exponent. For example, (10.111)_2 =
(1.0111)_2 x 2^1.
Blackboards have plenty of room, but computer registers often have too few
bits. Therefore, when we pack blackboard floating-point numbers into
registers, we adopt certain tricks to save a bit here or there.
Trick 1: We drop the (thus far explicit) "1.", since it goes without saying.
This allows us to explicitly represent only (a finite prefix of) the
fractional part of the coefficient.
Trick 2: We drop the (thus far explicit) "2^", since it too goes without
saying.
In blackboard notation, there is no need to use either trick. Moreover, in
blackboard notation, we can have coefficients as long as we want.
Consider a 16-bit register. One bit is used to represent the sign, leaving
us with 15 bits. After reflection, we choose to use 4 bits (one hex digit)
to represent the signed exponent in two's complement semantics. That leaves
us with 11 bits to store (an 11-bit prefix of) the _fractional part_ of the
binary coefficient.
To repeat, we use: i) one bit for the sign, ii) 4 bits to represent the
exponent in two's complement, and iii) 11 bits for the first 11 bits to
the _right_ of the binary point in the full coefficient. (Again, these
11 "right-of-point" bits are called (the prefix of) the _fractional part_
of the coefficient, which could easily be infinite).
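A minimal sketch of this packing in Python (the function and its name are
mine; the field widths are the ones we just chose; positive inputs only,
since our examples are positive):

    # Pack a positive value into the toy 16-bit format:
    # 1 sign bit | 4-bit two's complement exponent | 11 fraction bits.
    def pack16(value):
        e = 0
        while value >= 2:           # move the binary point left
            value /= 2
            e += 1
        while value < 1:            # move the binary point right
            value *= 2
            e -= 1
        f = value - 1               # Trick 1: drop the leading "1."
        bits = []
        for _ in range(11):         # keep 11 fraction bits, truncated
            f *= 2
            bits.append(int(f))
            f -= int(f)
        exp = e & 0b1111            # 4-bit two's complement pattern
        return "0 {:04b} {}".format(exp, "".join(map(str, bits)))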
New example: Put 1/5 into a 16-bit register. Here, the true value of the
fractional part of the coefficient is an infinite binary expansion. The
previous examples all had fractional parts that terminated quickly, so no
prefixes were needed.
Notice that the notion of "fixed-point" vanishes from any example with a
scale factor. As normal mathematicians, we simply write: 0.2 = 0.(0011)^*
= 1.(1001)^* x 2^-3. The superscript '*' means infinite repetition.
This is just the infinite binary expansion of 0.2 with and without
normalization (thus, with and without scale factor).
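Running the pack16 sketch above confirms this by machine: pack16(0.2)
returns "0 1101 10011001100", i.e., sign 0, exponent -3 (1101 in 4-bit
two's complement), and the first 11 bits, 10011001100, of the infinite
fractional part (1001)^*.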
There are short (32-bit) and long (64-bit) floating-point formats. Normalized
short-format magnitudes run from about 1.2 * 10^-38 to 3.4 * 10^38, and
normalized long-format magnitudes run from about 2.2 * 10^-308 to
1.8 * 10^308. These are just the familiar single-precision and
double-precision floating-point formats.
Let's say a few more words about the IEEE standard. It has special bit
patterns for 0, Infinity, and Not-a-Number (NaN). Again, we are not
responsible for this.
I should add that there is no free lunch, and that the two most negative
exponents in the exponent range are _removed_ to allow construction of the
special bit patterns used to represent these special values.
A Friendly Example
__________________
Pretend there is no sign bit (all floating-point numbers are positive). Let
the exponent field be 8-bit two's complement, and let the fractional field
be 40 bits in length (register size = 48 bits). Show the register contents
when (an approximation of) 1/12 is stored as a floating-point number.
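A sketch of the answer (the generalized packer below is mine; Fraction
keeps the arithmetic exact across all 40 bits):

    from fractions import Fraction

    # pack16 generalized to arbitrary widths: exp_bits of two's
    # complement exponent and frac_bits of truncated fraction.
    def pack(value, exp_bits, frac_bits):
        value = Fraction(value)
        e = 0
        while value >= 2:
            value /= 2
            e += 1
        while value < 1:
            value *= 2
            e -= 1
        f = value - 1
        bits = []
        for _ in range(frac_bits):
            f *= 2
            d = int(f)
            bits.append(d)
            f -= d
        exp = e & ((1 << exp_bits) - 1)
        return "0 {:0{w}b} {}".format(exp, "".join(map(str, bits)),
                                      w=exp_bits)

    # 1/12 = (1.010101...)_2 x 2^-4, so the exponent field holds -4
    # (11111100) and the fraction field holds forty bits of 0101...01.
    print(pack(Fraction(1, 12), 8, 40))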
Instruction Formats
___________________
We add the 16-bit immediate (here, -8) to 'r1' to compute a number (often
a memory address). Then, we write this number into 'r1'.
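Assuming the MIPS-style I-format (the 6/5/5/16 field split is standard
MIPS; pairing it with this particular addi is my illustration), the
instruction packs into one 32-bit word:

[ addi ] [  r1  ] [  r1  ] [       -8       ]
 6 bits   5 bits   5 bits       16 bits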
For _data accesses_, the offset is the number of _bytes_ forward (positive)
or backward (negative) relative to the base address. In contrast, for
_branch instructions_, the offset is in _instructions_, where we assume
instructions always occupy 32-bit memory words. To interpret the 16-bit
signed integer as a byte-address offset, we must therefore multiply it by 4.
Since offsets can be positive or negative, this allows branching to other
instructions within +/- 2^15 (32,768) instructions of the current instruction.
We compare register 'r1' and register 'r2'. If they are not equal, the new
value of PC becomes the current value of PC plus the word offset derived from
the immediate 'loop'. That is, we add the shifted, sign-extended 16-bit
signed integer 'loop' to the memory address of the branch instruction. We
have already seen how far we can go from the current instruction.
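A minimal sketch of that computation in Python (following the lecture's
convention of adding the scaled offset to the branch instruction's own
address; real MIPS adds it to the address of the next instruction, a
wrinkle we ignore here; the sample address is invented):

    # Sign-extend the 16-bit instruction offset, scale it by 4 bytes
    # per instruction, and add it to the branch's address.
    def branch_target(pc, imm16):
        offset = imm16 - (1 << 16) if imm16 & 0x8000 else imm16
        return pc + (offset << 2)

    # 'loop' encoded as -4 instructions, branch sitting at 0x00400020:
    print(hex(branch_target(0x00400020, 0xFFFC)))   # 0x400010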
[   j    ] [             done              ]
  6 bits               26 bits
Using two tricks (appending two zero bits, since instructions sit at
word-aligned addresses, and borrowing the top four bits of the PC), we expand
the 26-bit partial jump-target address 'done' into a full 32-bit address, and
transfer control there.
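A sketch of that expansion (standard MIPS J-format arithmetic; the sample
addresses are invented):

    # Append two zero bits to the 26-bit field (word alignment) and
    # borrow the top four bits from the PC.
    def jump_target(pc, addr26):
        return (pc & 0xF0000000) | (addr26 << 2)

    print(hex(jump_target(0x00400028, 0x0100003)))   # 0x400000c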
Addressing Modes
________________
addi r1,r1,8
fmul f4,f2,f6
beq r1,r2,found
Again, the 16-bit signed number is multiplied by 4 (converting the
instruction offset into a byte offset) and the result is added to PC. This
allows branching to other instructions within +/- 2^15 words of the current
instruction.
j done
Special Values in IEEE Floating Point (we are not responsible for this)
_____________________________________
I have told you that the exponent field is in two's complement. Suppose we
have a 4-bit exponent field. Then:
0000 0
0001 1
... ...
0111 7
1000 -8
1001 -7
1010 -6
... ...
1111 -1
I haven't taught the five special values (+/- 0, +/- infinity, and NaN), but I
agree that they are necessary in real computers. To represent these values,
we need special codes. But this means we will have to sacrifice certain bit
patterns that we could otherwise have used.
0000 0
0001 1
... ...
0111 7
*******************
1000 -8 <stolen to participate in special codes>
1001 -7 <stolen to participate in special codes>
*******************
1010 -6
... ...
1111 -1
I generally ask for maximum magnitudes, so this problem never arises, but if
I were to ask for minimum magnitudes, I would tell you whether to use -6 or -8.
Why is bias convenient? Well, the normal range is -8 to 7. Take the largest
number (i.e., 7), and add it to each of the two stolen exponents. (We also
add 7 to the exponent bit patterns that have not been stolen; this does _not_
change their true values, only their representations).
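Watch what this addition does to the bit patterns themselves (a small
sketch of my own; the sums are ordinary 4-bit additions with the carry-out
discarded):

    # Adding the bias 7 to each 4-bit two's complement pattern (mod 16)
    # sends the stolen patterns 1000 and 1001 to 1111 and 0000, exactly
    # the all-ones and all-zeros codes reserved for special values.
    for pattern in range(16):
        true_value = pattern - 16 if pattern & 0b1000 else pattern
        biased = (pattern + 7) & 0b1111
        print("{:04b} {:3d} -> {:04b}".format(pattern, true_value, biased))

The unstolen true exponents -6 through 7 land on the stored patterns 0001
through 1110, which is exactly the excess-7 (biased) encoding.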
That is why 32-bit normalized IEEE floating-point numbers have magnitudes that
run from 2^-126 to (approximately) 2^128 (in decimal, 1.2 * 10^-38 to
3.4 * 10^38---roughly). In contrast, 64-bit normalized IEEE floating-point
numbers have magnitudes that run from 2^-1022 to (approximately) 2^1024 (in
decimal, 2.2 * 10^-308 to 1.8 * 10^308---roughly). See Lecture 3.
However, you are _still_ not responsible for special values, special
encodings, exponent stealing, or exponent bias. This appendix is for
information only. Disclosure: I may talk about exponent stealing.