CS61C Final Exam : University of California, Berkeley - College of Engineering
Problem  M1  M2  M3  Ms  F1  F2  F3  F4  Fs   Total
Minutes  20  20  20  60  30  30  30  30  120  180
Points   10  10  10  30  22  22  22  24  90   120
Score
Name: _______________________________ Login: cs61c-____
Midterm Revisited
M1) “Son of a bits…” (10 pts, 20 min)
Recall the quarter definition from the midterm (skip this paragraph if you remember it):
Early processors had no hardware support for floating point numbers. Suppose you are a
game developer for the original 8-bit Nintendo Entertainment System (NES) and wish to
represent fractional numbers. You and your engineering team decide to create a variant on
IEEE floating point numbers you call a quarter (for quarter precision floats). It has all the
properties of IEEE 754 (including denorms, NaNs and ± ∞) just with different ranges, precision
& representations. A quarter is a single byte split into the following fields (1 sign, 3 exponent, 4
mantissa): SEEEMMMM. The bias of the exponent is 3, and the implicit exponent for denorms is -2.
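The quarter layout above can be sketched as a small C decoder. This is only an illustrative sketch: the function name decode_quarter is hypothetical, and it assumes (as in IEEE 754) that the all-ones exponent field is reserved for ± ∞ and NaN.

```c
#include <math.h>    /* INFINITY, NAN */
#include <stdint.h>

/* Decode one quarter byte (SEEEMMMM) into a double.
   Assumption, following IEEE 754: exponent field 7 means infinity or NaN. */
double decode_quarter(uint8_t q) {
    int sign = (q >> 7) & 0x1;   /* S    */
    int exp  = (q >> 4) & 0x7;   /* EEE  */
    int man  = q & 0xF;          /* MMMM */
    double value;

    if (exp == 7) {
        /* reserved exponent: infinity if mantissa is zero, NaN otherwise */
        value = (man == 0) ? INFINITY : NAN;
    } else if (exp == 0) {
        /* denorm: 0.MMMM x 2^-2 (implicit exponent -2) */
        value = (man / 16.0) * 0.25;
    } else {
        /* normalized: 1.MMMM x 2^(exp - 3); (1 << exp) / 8.0 equals 2^(exp - 3) */
        value = ((16 + man) / 16.0) * ((1 << exp) / 8.0);
    }
    return sign ? -value : value;
}
```

Under these assumptions, decode_quarter(0x38) gives 1.5, and decode_quarter(0x01) gives 0.015625, the smallest positive denorm.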
You’re also familiar with the Fixed-Point Representation, where the binary point is always in the same
place so there’s no need to store the exponent. E.g., you could imagine splitting the Nintendo’s byte
into two nibbles with the left nibble representing the unsigned whole number component (W), and the
right representing the fractional component (F): WWWW.FFFF. Thus the bit pattern 0xa8 would be
interpreted as the unsigned fixed-point value 0xa.8 = 0b1010.1000 = 10.5 (decimal). As a systems designer,
you could choose to interpret a byte any way you want, so you could change the point location (e.g.,
WW.FFFFFF or WWWWWWW.F) to suit your needs.
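The fixed-point interpretation described above amounts to dividing the byte by a power of two. A minimal sketch follows; the function name decode_fixed and its frac_bits parameter are hypothetical, introduced here only for illustration.

```c
#include <stdint.h>

/* Interpret byte b as an unsigned fixed-point value with frac_bits
   fractional bits (0 to 8): the binary point sits frac_bits places
   from the right, so the value is b / 2^frac_bits. */
double decode_fixed(uint8_t b, unsigned frac_bits) {
    return b / (double)(1u << frac_bits);
}
```

With the WWWW.FFFF split, decode_fixed(0xa8, 4) gives 10.5, matching the example in the text. The same byte read as WW.FFFFFF (frac_bits = 6) gives 2.625, and as WWWWWWW.F (frac_bits = 1) gives 84.0.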
b) One of your games involves velocities that always fall in the range of [10, 15), i.e., 10 ≤ v < 15.
If you only have a single NES byte, you’re asked to design a novel representation to encode
a velocity (assume the hardware can handle whatever you do). It should be better than a
quarter, fixed-point, and any 8-bit encoding we’ve discussed! What is “better”? You will be
judged on four criteria (check the box to the left of the ones you think you satisfy, listed in
decreasing priority order). Explain your decoding below on the left (how to go from a bit pattern
b to a velocity v), and on the right show the bit patterns that would result from encoding
numbers closest to 10, 12.5 and 15 as well as the velocity each bit pattern actually represents.
______ Most bit patterns encoding the most different numbers in [10, 15)
______ You have bit patterns that are as close as possible to 10, 12.5, and 15
______ Uniform spacing between numbers in [10, 15) is better than non-uniform
______ Simplicity
My scheme has _____ NES byte bit patterns representing the range [10, 15). Here’s how I go from a
bit pattern b to a velocity v (if you’d like, you may write it mathematically... v as a function of b):

Number closest to...   ...and its bit pattern   ...and the velocity it represents
10                     0b
12.5                   0b
15                     0b
M2) “Those are some big numbers you got there…” (10 pts, 20 min)
A bignum is a data structure designed to represent large integers. It does so by abstractly considering
all of the bits in the num array as part of one very large integer. This code is run on a standard 32-bit
MIPS machine, where a word (defined below) is 32 bits wide and a halfword is 16 bits wide.
typedef unsigned int word;
typedef unsigned short halfword;
typedef struct bignum_struct {
    int length; // number of words
    word *num;  // the actual data
} bignum;

This function shows how bignums are used:

void print_bignum(bignum *b) {
    printf("0x"); // Print hex prefix
    for (int i = b->length-1; i>=0; i--)
        printf("%08x", b->num[i]);
}

a) Is the ordering of words in the num array BIG or LITTLE endian? (circle one)
b) How many bytes would be used in the static, stack and heap areas as the result of lines 1, 3
and 4 below? Treat each line independently! E.g., for line 3, don’t count the space allocated
in line 1.

         static   stack   heap
Line 1
Line 3
Line 4

1 bignum biggie;
2 int main(int argc, char *argv[]) {
3 bignum bigTriple[3], *bigArray[4];
4 bigArray[1] = (bignum *) malloc (sizeof(bignum) * 2);
b) Complete the add function for two bignums, which you may assume are the same length. Our C
compiler translates z = x + y (where x,y,z are words) to add (not addu, as is customary) and thus
could generate a hardware (HW) overflow we don’t want, as we’re running on untrusted HW. Your
code should be written so that words never overflow in HW (so we do all adding in the halfword).
void add(bignum *a, bignum *b, bignum *sum, word carry_in, word *carry_out) {
    // reserve space for num array. Remember a and b are the SAME length...
    sum->num =
    for (int i=0; i < a->length; i++) { // word-by-word do addition of lo, hi halfwords
        // add lo halfwords of a,b
        word lo =
        // add hi halfwords of a,b (but in the safe, low halfword area so no HW overflow)
        word hi =
        // combine low and hi halfwords (put back in their places), like a lui-ori
        carry_in =
    }
    sum->length = a->length;
    *carry_out = carry_in;
}
c) What is the most times this function can be called so that it still does what you described in
part (b)? (It can be left as an expression)
d) What will it return (exactly, but it may be left as an expression) if it is called one more time?
e) What will happen if it is called twice as many times as in (d)? Will it crash? Hang forever?
What’s returned, if anything? Describe the effect from the caller’s standpoint; be explicit.
Post-Midterm Questions
F1) “The Datapath less traveled…” (22 pts, 30 min)
[Figure: single-cycle MIPS datapath presented during lecture]
On the right is the single-cycle MIPS datapath presented during lecture. Your job is to modify the
diagram to accommodate a new MIPS instruction. Your modification may use simple adders, shifters,
mux chips, wires, and new control signals. If necessary, you may replace original labels.
We want to add a new MIPS instruction so that the following C statement (p is a pointer to an
int, and CONSTANT is small and can be negative) could be performed in one I-type MAL MIPS
instruction: *p = CONSTANT;
a) Make up the syntax for the I-type MAL MIPS instruction (call it sc for “store constant”) that
does it (show an example if the pointer lives in $v0 and the CONSTANT is 42). On the right,
show the register transfer language (RTL) description of sc.
Syntax: RTL:
b) For a larger CONSTANT (say 0xFAB5BEEF), to what exact TAL instructions would the MAL
above expand?
c) Modify the picture above and list your changes below. You may not need all the boxes.
Please write them in “pipeline stage order” (i.e., changes affecting IF first, MEM next, etc)
(i)
(ii)
(iii)
(iv)
d) We now want to set all the control lines appropriately. List what each signal should be
(an intuitive name or {0, 1, x = don't care}). Include any new control signals you added.
F2) Congressman Mark Foley: “It was the Page’s fault” (22 pts, 30 min)
The specs for a MIPS machine’s memory system that has one level of cache and virtual memory are:
o 1MiB of Physical Address Space
o 4GiB of Virtual Address Space
o 4KiB page size
o 16KiB 8-way set-associative write-through cache, LRU replacement
o 1KiB Cache Block Size
o 2-entry TLB, LRU replacement
The following code is run on the system, which has no other users and process switching turned off.
a) What is the T:I:O bit breakup for the cache (assuming byte addressing)? ____:____:____
b) What is the VPN : PO bit breakup for VM (assuming byte addressing)? ______:______
For the following questions, only consider the line marked “SPECIAL”. Your answer can be a fraction.
a) Count how many cycles will be needed to execute the code below and write out each
instruction’s progress through the pipeline by filling in the table below with pipeline stages
(F, D, E, M, W).
Cycle 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
Inst 1
Inst 2
Inst 3
Inst 4
Inst 5
Inst 6
b) If stores were free (their CPI = 0), how many times faster would our execution be?
c) What instruction would you make twice as fast for the best overall speed boost?
f) How many times faster are we if we parallelize the code over many machines?
g) Match the following items. Some items on the right may be used multiple times, or never.

Makes more efficient use of available disk area
The basis of network abstraction
This guarantees delivery over a network
Work per unit time
Time to complete a single task
Bigger blocks take advantage of this
All caches take advantage of this
“It’s getting harder to build a new chip fab plant!”

A) LRU
B) Temporal Locality
C) Synchronization
D) Write-back
E) Full-duplex
F) Latency
G) Constant Bit Density
H) Amdahl’s Law
I) Spatial Locality
J) Encapsulation
K) Fragmentation
L) Synchronization
M) Throughput
N) Parallelization
O) AMAT
P) Constant angular velocity
Q) Ack
R) Rock’s law
S) Superscalar
T) Pipelining
U) Superparamagnetism
V) Polling (not David)

h) The circuit below has the following specs: t_or, t_clk-to-q, t_setup, t_hold, t_clock (assume no
delay on the wires). If all other times are fixed, what is the valid range for t_or?
Express it in terms of the variables listed above.

_________________ ≤ t_or ≤ ___________________

[Figure: circuit with a register (D, Q) and signals RESET, CLK and OUT]