
Exercise Only

Uploaded by

minhhoang.vuduc
Copyright
© All Rights Reserved

Computer Architecture

Exercises
◼ Arithmetic
◼ Digital logic
◼ Architectures

(Q) Number: Conversion
◼ Convert number
❑ (18)10 = ( )2
❑ (58)10 = ( )2
❑ (0.625)10 = ( )2
❑ (0.3125)10 = ( )2

❑ (1101.101)2 = ( )10

❑ (-12.25)10 = ( )2s complement

◼ Assume 6 bits for the integer part and 3 bits for the fractional part.
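The conversions above can be checked mechanically. The following is a minimal sketch, not part of the original exercise: `to_binary` doubles the fractional part repeatedly to produce fraction bits, and `to_twos_complement_fixed` assumes the 6-bit integer / 3-bit fraction layout applies to the 2s complement question. Both function names are hypothetical.

```python
def to_binary(value, max_frac_bits=8):
    """Convert a non-negative decimal number to a binary digit string."""
    integer = int(value)
    fraction = value - integer
    digits = []
    # Double the fraction repeatedly; each carry into the integer part is the next bit.
    while fraction and len(digits) < max_frac_bits:
        fraction *= 2
        bit = int(fraction)
        digits.append(str(bit))
        fraction -= bit
    text = format(integer, "b")
    return text + "." + "".join(digits) if digits else text


def to_twos_complement_fixed(value, int_bits=6, frac_bits=3):
    """Encode a signed number as fixed-point 2s complement (widths are assumptions)."""
    total = int_bits + frac_bits
    scaled = round(value * (1 << frac_bits))  # move the binary point to the right
    if not -(1 << (total - 1)) <= scaled < (1 << (total - 1)):
        raise ValueError("value does not fit in the given width")
    bits = format(scaled & ((1 << total) - 1), f"0{total}b")
    return bits[:int_bits] + "." + bits[int_bits:]


print(to_binary(18), to_binary(0.625), to_twos_complement_fixed(-12.25))
```

A quick cross-check for the last result: +12.25 is 001100.010, inverting gives 110011.101, and adding one unit in the last place gives 110011.110.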

Number Systems and Codes


(Q) Number: Binary, Octal, and Hexadecimal

◼ Convert numbers
❑ (10 111 011 001 . 101 110)2 = ( )8
❑ (2731.56)8 = ( )2
❑ (101 1101 1001 . 1011 1000)2 = ( )16
❑ (5D9.B8)16 = ( )2
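Binary converts to octal or hexadecimal by grouping bits in threes or fours, outward from the binary point. The sketch below illustrates that grouping; the helper name `regroup` is an assumption made for this example, and it handles only the binary-to-octal/hex direction.

```python
def regroup(binary, group_size):
    """Convert a binary string to base 2**group_size by grouping bits."""
    int_part, _, frac_part = binary.replace(" ", "").partition(".")
    # Pad the integer part on the left and the fraction on the right
    # so that both split evenly into groups.
    int_part = int_part.zfill(-(-len(int_part) // group_size) * group_size)
    frac_part = frac_part.ljust(-(-len(frac_part) // group_size) * group_size, "0")

    def digits(bits):
        return "".join(
            format(int(bits[i:i + group_size], 2), "X")
            for i in range(0, len(bits), group_size)
        )

    return digits(int_part) + ("." + digits(frac_part) if frac_part else "")


print(regroup("10 111 011 001 . 101 110", 3))      # groups of 3 bits: octal
print(regroup("101 1101 1001 . 1011 1000", 4))     # groups of 4 bits: hexadecimal
```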

(Q) Number: Representation

For a 4-bit system, fill in the tables below.

Positive values
Value   Sign-and-Magnitude   1s Complement   2s Complement
+7
+6
+5
+4
+3
+2
+1
+0      0000

Negative values
Value   Sign-and-Magnitude   1s Complement   2s Complement
-0
-1
-2
-3
-4
-5
-6
-7
-8
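To check a filled-in table, the three encodings can be generated programmatically. This is only a sketch; the function names are hypothetical, and the sign is passed separately from the magnitude so that -0 can be expressed.

```python
BITS = 4


def sign_magnitude(magnitude, negative):
    """Sign bit followed by a (BITS-1)-bit magnitude."""
    return ("1" if negative else "0") + format(magnitude, f"0{BITS - 1}b")


def ones_complement(magnitude, negative):
    """Positive pattern, with every bit inverted for negative values."""
    pattern = format(magnitude, f"0{BITS}b")
    if negative:
        pattern = "".join("1" if b == "0" else "0" for b in pattern)
    return pattern


def twos_complement(value):
    """Value modulo 2**BITS; valid for -8..+7 in a 4-bit system."""
    return format(value & ((1 << BITS) - 1), f"0{BITS}b")


for m in range(8):
    print(f"-{m}", sign_magnitude(m, True), ones_complement(m, True), twos_complement(-m))
print("-8", "n/a ", "n/a ", twos_complement(-8))
```

Note that -0 has its own pattern in sign-and-magnitude and 1s complement, while -8 exists only in 2s complement.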

(Q) Number: 2s complement addition and subtraction
◼ 4-bit system
◼ Which of the additions below overflow? Why?
+3 0011 -2 1110
+ +4 + 0100 + -6 + 1010
(a) ---- ------- (b) ---- -------
+7 0111 -8 11000
---- ------- ---- -------

+6 0110 +4 0100
+ -3 + 1101 + -7 + 1001
(c) ---- ------- (d) ---- -------
+3 10011 -3 1101
---- ------- ---- -------

-3 1101 +5 0101
+ -6 + 1010 + +6 + 0110
(e) ---- ------- (f) ---- -------
-9 10111 +11 1011
---- ------- ---- -------
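One common overflow rule for 2s complement addition is: overflow occurs when both operands have the same sign and the result has the opposite sign. A carry out of the most significant bit alone does not indicate overflow. The sketch below applies that rule to the six cases; the function name is hypothetical.

```python
def add_twos_complement(a, b, bits=4):
    """Add two signed integers in a fixed width; return (bit pattern, overflow flag)."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    result = ((a & mask) + (b & mask)) & mask  # discard the carry out of the MSB
    # Overflow: operands share a sign but the result's sign differs.
    same_sign = (a & sign) == (b & sign)
    overflow = same_sign and (result & sign) != (a & sign)
    return format(result, f"0{bits}b"), overflow


cases = {"a": (3, 4), "b": (-2, -6), "c": (6, -3), "d": (4, -7), "e": (-3, -6), "f": (5, 6)}
for label, (x, y) in cases.items():
    print(label, add_twos_complement(x, y))
```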

(Q) Boolean algebra
1. Prove the Boolean equation below.

X(X+Y) = X

(1) Derive the dual of this Boolean equation.


(2) Simplify

2. Prove the Boolean equation above using the truth table.
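A truth-table proof simply enumerates every input combination and compares both sides. The sketch below does this for X(X+Y) = X and for its dual, X + XY = X; the helper `equivalent` is a name invented for this example.

```python
from itertools import product


def equivalent(f, g, n_vars):
    """Return True if f and g agree on every row of the truth table."""
    return all(
        bool(f(*row)) == bool(g(*row)) for row in product((0, 1), repeat=n_vars)
    )


def absorption(x, y):
    return x & (x | y)      # X(X+Y)


def dual(x, y):
    return x | (x & y)      # X + XY


def identity(x, y):
    return x                # right-hand side: X


for x, y in product((0, 1), repeat=2):
    print(x, y, absorption(x, y), dual(x, y))
```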

(Q) Boolean Algebra
◼ Simplify (x+y)(x+y')(x'+z)
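One possible route to the simplification, shown here only as a worked sketch, uses the distributive law and the complement property; other orderings of the steps reach the same result.

```latex
\begin{aligned}
(x+y)(x+y')(x'+z) &= (x + y\,y')(x'+z) && \text{distributive law} \\
                  &= x\,(x'+z)         && y\,y' = 0 \\
                  &= x\,x' + x\,z      && \text{distributive law} \\
                  &= x\,z              && x\,x' = 0
\end{aligned}
```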

Boolean Algebra
Exercises
◼ Arithmetic
◼ Digital logic
◼ Architectures

(Q) Sum-of-Product and NAND gates
◼ Draw the gate-level circuit using a 2-level NAND implementation and NOT gates.
❑ F = AB + C'D + E

Logic Gates and Circuits


(Q) Karnaugh Map
◼ Given the sum-of-minterms form below, draw the Karnaugh map and obtain the simplified Boolean expression.
◼ F(w,x,y,z) = w'∙x∙y'∙z' + w'∙x∙y'∙z + w∙x'∙y∙z'
               + w∙x'∙y∙z + w∙x∙y∙z' + w∙x∙y∙z
             = Σ m(4, 5, 10, 11, 14, 15)
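A candidate simplification can be verified by brute force against the minterm list. The sketch below assumes w is the most significant bit, and the candidate expression w'xy' + wy is one grouping (m4 with m5, and m10, m11, m14, m15) rather than the only acceptable answer.

```python
from itertools import product

MINTERMS = {4, 5, 10, 11, 14, 15}


def f_minterms(w, x, y, z):
    """Reference function: true exactly on the listed minterms (w is the MSB)."""
    return (w << 3 | x << 2 | y << 1 | z) in MINTERMS


def f_simplified(w, x, y, z):
    """Candidate read off the Karnaugh map: w'xy' + wy."""
    return bool((not w and x and not y) or (w and y))


matches = all(
    f_minterms(*row) == f_simplified(*row) for row in product((0, 1), repeat=4)
)
print("candidate matches all 16 rows:", matches)
```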

Karnaugh Map
(Q) Karnaugh Map
◼ An odd parity generator for BCD code, which has 6 unused combinations. Derive the Boolean expression for P.

A B C D   P
0 0 0 0   1
0 0 0 1   0
0 0 1 0   0
0 0 1 1   1
0 1 0 0   0
0 1 0 1   1
0 1 1 0   1
0 1 1 1   0
1 0 0 0   0
1 0 0 1   1
1 0 1 0   X
1 0 1 1   X
1 1 0 0   X
1 1 0 1   X
1 1 1 0   X
1 1 1 1   X

(Q) Combinational Circuit: MUX
◼ Draw the truth table for a 4-to-1 multiplexer.
◼ Design and draw the combinational logic using AND, OR, and NOT gates.
[Figure: 4:1 mux with data inputs I0, I1, I2, I3, select inputs S1, S0, and output Y]
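A gate-level design can be checked against the intended behaviour by simulation. The sketch below models one sum-of-products form, Y = S1'S0'I0 + S1'S0 I1 + S1 S0'I2 + S1 S0 I3, assuming S1 is the more significant select bit; the function name `mux4` is hypothetical.

```python
from itertools import product


def mux4(i0, i1, i2, i3, s1, s0):
    """4-to-1 multiplexer built from NOT, AND, and OR operations."""
    n1, n0 = 1 - s1, 1 - s0  # NOT gates on the select lines
    return (n1 & n0 & i0) | (n1 & s0 & i1) | (s1 & n0 & i2) | (s1 & s0 & i3)


# Compact truth table: for each select value, Y should follow the chosen input.
for s1, s0 in product((0, 1), repeat=2):
    chosen = 2 * s1 + s0
    ok = all(mux4(*data, s1, s0) == data[chosen] for data in product((0, 1), repeat=4))
    print(f"S1={s1} S0={s0} -> Y = I{chosen}: {ok}")
```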

MSI
(Q) Combinational Circuit: Encoder
◼ Draw the truth table.
◼ Design and draw the combinational logic using AND, OR, and NOT gates.
[Figure: 4-to-2 encoder; inputs F0, F1, F2, F3 (selected via switches), outputs D0, D1 (2-bit code)]

(Q) Sequential Circuit

◼ For the sequential circuit above, fill in table (a).
◼ Fill in table (b). Assume the initial state is 0.

(a) State table
Present state   Input x   Input y   Next state
0               0         0
0               0         1
0               1         0
0               1         1
1               0         0
1               0         1
1               1         0
1               1         1

(b) Timing diagram
Cycles   1  2  3  4  5  6  7
x        0  1  1  0  0  1  1
y        0  0  1  0  1  0  0
A        0

Sequential Circuit
(Q) Convolutional encoder
[Figure: convolutional encoder; In feeds Reg1, Reg1 feeds Reg2; the two outputs Out1 and Out2 together form Out]

◼ For the sequential circuit above, fill in table (a).
◼ Fill in table (b). Assume the initial state is 0.
◼ Complete state diagram (c). The numbers on each arrow indicate input / output.

(a) State table
In   Present state [Reg1, Reg2]   Next state [Reg1, Reg2]   Out [Out1, Out2]
0    00
1    00
0    10
1    10
0    01
1    01
0    11
1    11

(b) Timing diagram
Cycle                        1  2  3  4  5  6
In                           0  1  0  1  0  0
Present state [Reg1, Reg2]   00
Out [Out1, Out2]

(c) State diagram
[Figure: state diagram with states 00, 10, 01, and 11; the arrow labels 0 / 00 and 1 / 11 are already given]

(Q) Shift register
[Figure: shift register; In feeds Reg1, Reg1 feeds Reg2, both clocked by Clk; outputs Out[0] and Out[1]]

◼ For the sequential circuit, complete the timing diagram below. Reg1, Reg2, Out[0], and Out[1] operate concurrently.

Cycle    1  2  3  4  5
Clk
In
Reg1
Reg2
Out[0]
Out[1]

(Q) Register
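[Figure: two D registers sharing Clk; input D feeds the first register, its output Q1 feeds the second register, whose output is Q2]

Two registers are connected in series.

◼ Draw the timing diagram for Q1 and Q2.

Cycle    1  2  3  4  5
Clock
Q1
Q2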

Exercises
◼ Arithmetic
◼ Digital logic
◼ Architectures

(Q) Performance: Clock Cycle and Clock Frequency
◼ Program P runs in 10 seconds on computer A, which has a
400 MHz clock.
◼ Suppose we are trying to build a new machine B that will run
this program in 6 seconds. Unfortunately, the increase in
clock frequency has an adverse effect on the rest of the CPU
design, causing machine B to require 1.2 times as many clock
cycles as machine A for the same program. What clock
frequency should we target for machine B to achieve our
goal?
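The question follows from execution time = clock cycles / clock frequency. The sketch below works through that relation with the numbers given; the function and parameter names are made up for this example.

```python
def required_clock_hz(time_a_s, clock_a_hz, cycle_ratio, target_time_s):
    """Clock frequency machine B needs to finish in target_time_s."""
    cycles_a = time_a_s * clock_a_hz      # cycles = time x frequency
    cycles_b = cycle_ratio * cycles_a     # B needs 1.2x as many cycles
    return cycles_b / target_time_s       # frequency = cycles / time


freq_b = required_clock_hz(time_a_s=10, clock_a_hz=400e6, cycle_ratio=1.2, target_time_s=6)
print(f"target clock for B: {freq_b / 1e6:.0f} MHz")
```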

Performance and Benchmarking
(Q) Performance: Impact of Compiler
◼ Given a program P, a compiler can generate two different binaries for a
target machine. On that machine, there are 3 classes of instructions:
Class A, Class B, and Class C, which require 1, 2, and 3 cycles
respectively.
◼ The first binary has 5 instructions: 2 of A, 1 of B, and 2 of C.
The second binary has 6 instructions: 4 of A, 1 of B, and 1 of C.

1. Which code is faster? By how much?


2. What is the (average) CPI for each code?
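Both questions reduce to counting cycles per instruction class. A minimal sketch, with hypothetical names, assuming both binaries run on the same clock so that fewer cycles means faster:

```python
CYCLES_PER_CLASS = {"A": 1, "B": 2, "C": 3}


def total_cycles(mix):
    """Sum of (instruction count x cycles) over all classes."""
    return sum(CYCLES_PER_CLASS[cls] * count for cls, count in mix.items())


def average_cpi(mix):
    """Total cycles divided by total instruction count."""
    return total_cycles(mix) / sum(mix.values())


first = {"A": 2, "B": 1, "C": 2}
second = {"A": 4, "B": 1, "C": 1}
for name, mix in (("first", first), ("second", second)):
    print(name, total_cycles(mix), "cycles, CPI =", average_cpi(mix))
```

The second binary executes more instructions yet needs fewer cycles, which is why instruction count alone is a poor performance metric.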

(Q) Performance: Impact of Machine
◼ Suppose we have 2 implementations of the same ISA, and a
program is run on these 2 machines.
◼ Machine A has a clock cycle time of 10 ns and a CPI of 2.0.
Machine B has a clock cycle time of 20 ns and a CPI of 1.2.

1. Which machine is faster for this program?


2. By how much?
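Because both machines implement the same ISA, the instruction count is identical and cancels out, so comparing cycle time x CPI is enough. A short sketch with a hypothetical helper name:

```python
def time_per_instruction_ns(cycle_time_ns, cpi):
    """Average time per instruction = clock cycle time x CPI."""
    return cycle_time_ns * cpi


t_a = time_per_instruction_ns(10, 2.0)   # machine A
t_b = time_per_instruction_ns(20, 1.2)   # machine B
ratio = t_b / t_a                        # how many times faster A is than B
print(f"A: {t_a:.0f} ns, B: {t_b:.0f} ns, A is {ratio:.2f}x faster")
```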

(Q) Performance: Amdahl’s Law
Suppose a program runs in 100 seconds on a machine, with multiply
operations responsible for 80 seconds of this time. How much do we
have to improve the speed of multiplication if we want the program
to run 4 times faster?
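Amdahl's law can be rearranged to solve for the improvement factor: new time = affected time / n + unaffected time. The sketch below solves for n; the function name is hypothetical, and it raises an error when the target cannot be reached by speeding up that part alone.

```python
def required_improvement(total_s, affected_s, target_speedup):
    """Factor by which the affected part must speed up to hit the overall target."""
    target_time = total_s / target_speedup
    unaffected = total_s - affected_s
    if target_time <= unaffected:
        raise ValueError("target is unreachable by improving this part alone")
    # target_time = affected_s / n + unaffected  =>  n = affected_s / (target_time - unaffected)
    return affected_s / (target_time - unaffected)


print(required_improvement(total_s=100, affected_s=80, target_speedup=4))
```

With the same numbers, a 5x overall speedup would require the program to finish in 20 seconds, which equals the time spent outside multiplication, so no finite improvement achieves it.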

(Q) Performance: Amdahl’s Law
▪ Suppose we enhance a machine making all floating-
point instructions run five times faster. If the execution
time of some benchmark before the floating-point
enhancement is 12 seconds, what will the speedup be if
half of the 12 seconds is spent executing floating-point
instructions?
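In its forward form, Amdahl's law gives the overall speedup from the enhanced fraction f and the enhancement factor s: speedup = 1 / ((1 - f) + f / s). A minimal sketch with the numbers from this question (the function name is hypothetical):

```python
def amdahl_speedup(fraction_enhanced, factor):
    """Overall speedup when a fraction of the time is accelerated by `factor`."""
    return 1 / ((1 - fraction_enhanced) + fraction_enhanced / factor)


speedup = amdahl_speedup(fraction_enhanced=0.5, factor=5)
new_time = 12 / speedup
print(f"speedup = {speedup:.3f}, new execution time = {new_time:.1f} s")
```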

(Q) MIPS Assembly
◼ Write assembly or C code
C Statement               Register mapping                                       MIPS Assembly Code
a = b + c;                $s0 → a, $s1 → b, $s2 → c
a = b - c;                $s0 → a, $s1 → b, $s2 → c
a = b + c - d;            $s0 → a, $s1 → b, $s2 → c, $s3 → d
f = (g + h) - (i + j);    $s0 → f, $s1 → g, $s2 → h, $s3 → i, $s4 → j
a = a + 4;                $s0 → a
f = g;                    $s0 → f, $s1 → g
a = a * 8;                $s0 → a

MIPS
(Q) MIPS Assembly
◼ Write assembly or C code
C Statement               Register mapping                                       MIPS Assembly Code
A[7] = h + A[10];         h ➔ $s2, base of A[] ➔ $s3
A[3] = h + A[1];          h ➔ $s2, base of A[] ➔ $s3
if (i == j)               f ➔ $s0, g ➔ $s1, h ➔ $s2, i ➔ $s3, j ➔ $s4
    f = g + h;

(Q) Datapath single-cycle
Loop: bne  $9, $0, End     # Line 1, PC=1000
      add  $8, $8, $10     # Line 2
      addi $9, $9, -1      # Line 3
      beq  $0, $0, Loop    # Line 4
End:                       # Line 5

Represent each value in binary.
(1) For Line 1, what is value A?
(2) For Line 2, what is value B?
(3) For Line 3, what is value C?
(4) For Line 4, what is value D?
(5) For Line 4, what is value E?

[Figure: single-cycle MIPS datapath (PC, instruction memory, register file, sign extend, 2-bit left shift, branch adder, ALU, data memory, and the RegDst, ALUSrc, MemToReg, and PCSrc multiplexers) with signal points labelled A to E]

Processor: Datapath
(Q) Datapath single-cycle
Loop: bne  $9, $0, End     # Line 1, PC=1000
      add  $8, $8, $10     # Line 2
      addi $9, $9, -1      # Line 3
      beq  $0, $0, Loop    # Line 4
End:                       # Line 5

Fill in the values in the table below. (Initially, all register values are 0.)

Line   1   2   3   4
A
B
C
D
E
F
G
H
I
J
K

[Figure: single-cycle MIPS datapath (PC, instruction memory, register file, sign extend, 2-bit left shift, branch adder, ALU, data memory, and the RegDst, ALUSrc, MemToReg, and PCSrc multiplexers) with signal points labelled A to K]

(Q) Control single-cycle
Loop: bne  $9, $0, End     # Line 1
      add  $8, $8, $10     # Line 2
      addi $9, $9, -1      # Line 3
      beq  $0, $0, Loop    # Line 4
End:                       # Line 5

Fill in the table below. (Initially, all register values are 0.)

Line       1   2   3   4
RegDst
RegWrite
MemWrite
MemRead
MemToReg
PCsrc

[Figure: single-cycle MIPS datapath with the control signals RegDst, RegWrite, ALUSrc, ALUcontrol, MemRead, MemWrite, MemToReg, and PCSrc]

Processor: Control
(Q) General pipelining concept
◼ Suppose a CPU executes the following code. Each line is assumed to compile to two
'load' instructions and a single 'add' instruction.
a = b + c; // load b, load c, add
d = e + f; // load e, load f, add
◼ Suppose the CPU requires 3 steps (Fetch, Decode, and Execution) to execute a single assembly
instruction.
Step               Fetch   Decode   Execution
load instruction   1 ns    1 ns     1 ns
add instruction    1 ns    1 ns     1 ns

◼ Answer the following questions, both with and without pipelining.
❑ What is the latency of a single instruction?
❑ What is the total execution time?
❑ What is the bandwidth in terms of "number of instructions per second"?
❑ Assume the ideal maximum bandwidth of the CPU is 10^9 instructions per second. What is the throughput?
❑ When there are N instructions, what are the total execution time and bandwidth? Assume the
instructions do not depend on each other.
❑ When there are N instructions, what is the maximum speed-up in the steady state if we use
pipelining?
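For the N-instruction case, the usual idealised model is: without pipelining each instruction occupies all stages in turn, while with pipelining the first instruction takes a full pass and each later one completes one stage time afterwards. The sketch below encodes that model under the stated assumptions (3 stages of 1 ns, independent instructions, no stalls); the names are hypothetical.

```python
STAGES = 3      # Fetch, Decode, Execution
STAGE_NS = 1    # every stage takes 1 ns in the table above


def total_time_ns(n, pipelined):
    """Total execution time for n independent instructions."""
    if pipelined:
        return (STAGES + n - 1) * STAGE_NS  # fill the pipe once, then one per stage time
    return n * STAGES * STAGE_NS


def bandwidth_per_s(n, pipelined):
    """Completed instructions per second."""
    return n / (total_time_ns(n, pipelined) * 1e-9)


def speedup(n):
    """Unpipelined time divided by pipelined time."""
    return total_time_ns(n, False) / total_time_ns(n, True)


for n in (6, 1_000_000):
    print(n, total_time_ns(n, False), total_time_ns(n, True), round(speedup(n), 4))
```

As N grows, the speed-up approaches the number of stages, and the pipelined bandwidth approaches one instruction per stage time.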

(Q) Pipelining
◼ Given this code:
add $t0, $s0, $s1
sub $t1, $s0, $s1
sll $t2, $s0, 2
srl $t3, $s1, 2

Instruction   Inst Mem   Reg read   ALU   Data Mem   Reg write   Total
ALU           2          1          2                1           6
lw            2          1          2     2          1           8
sw            2          1          2     2                      7
beq           2          1          2                            5

a) How many cycles will it take to execute the code on a single-cycle datapath?
b) How long will it take to execute the code on a single-cycle datapath, assuming a 100 MHz clock?
c) How many cycles will it take to execute the code on a 5-stage MIPS pipeline?
d) How long will it take to execute the code on a 5-stage MIPS pipeline? Assume a 400 MHz clock frequency.
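The four questions can be explored with a small calculator. This sketch assumes one instruction per cycle on the single-cycle datapath and no stalls in the pipeline, which seems reasonable here because the four instructions only read $s0 and $s1 and write distinct registers; the function names are hypothetical.

```python
def single_cycle_cycles(n_instructions):
    """A single-cycle datapath completes one instruction per clock cycle."""
    return n_instructions


def pipeline_cycles(n_instructions, stages=5, stalls=0):
    """Cycles to drain n instructions through an ideal pipeline, plus any stalls."""
    return stages + n_instructions - 1 + stalls


def cycles_to_ns(cycles, clock_hz):
    """Convert a cycle count to nanoseconds at the given clock frequency."""
    return cycles * 1e9 / clock_hz


n = 4  # add, sub, sll, srl
print(single_cycle_cycles(n), cycles_to_ns(single_cycle_cycles(n), 100e6))
print(pipeline_cycles(n), cycles_to_ns(pipeline_cycles(n), 400e6))
```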

Pipelining
(Q) (Set Associative) Cache Organization and Operation
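Suppose the cache size is 32 bytes and the cache block size is 8 bytes. The memory access address
sequence is 4, 0, 8, 36, 0. Initially, the cache is empty.
[Figure: address fields from bit 31 down to bit 0: Tag (bits 31 to N+M), Index (bits N+M-1 to N), Offset (bits N-1 to 0)]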
(1) Direct-mapped cache
❑ What is the number of bits for Offset, tag, index?
❑ What is hit rate?

(2) 2-way set-associative cache


❑ What is the number of bits for Offset, tag, index?
❑ What is hit rate?

(3) 4-way set-associative cache


❑ What is the number of bits for Offset, tag, index?
❑ What is hit rate?

(4) Fully associative cache


❑ What is the number of bits for Offset, tag?
❑ What is hit rate?
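The four organisations differ only in the number of ways, so a single small simulator can cover them all. The sketch below makes two assumptions that the exercise does not state: addresses are 32 bits wide (suggested by bit 31 in the address figure) and replacement is LRU. The function names are hypothetical.

```python
from collections import OrderedDict


def field_bits(cache_bytes=32, block_bytes=8, ways=1, addr_bits=32):
    """Return (tag, index, offset) widths in bits; sizes must be powers of two."""
    offset = (block_bytes - 1).bit_length()
    index = (cache_bytes // block_bytes // ways - 1).bit_length()
    return addr_bits - offset - index, index, offset


def hit_rate(addresses, cache_bytes=32, block_bytes=8, ways=1):
    """Simulate a set-associative cache with LRU replacement (assumed policy)."""
    n_sets = cache_bytes // block_bytes // ways
    sets = [OrderedDict() for _ in range(n_sets)]
    hits = 0
    for addr in addresses:
        block = addr // block_bytes
        index, tag = block % n_sets, block // n_sets
        lines = sets[index]
        if tag in lines:
            hits += 1
            lines.move_to_end(tag)         # mark as most recently used
        else:
            if len(lines) == ways:
                lines.popitem(last=False)  # evict the least recently used line
            lines[tag] = True
    return hits / len(addresses)


trace = [4, 0, 8, 36, 0]
for ways in (1, 2, 4):  # 4 ways with 4 blocks is fully associative
    print(ways, field_bits(ways=ways), hit_rate(trace, ways=ways))
```

In the direct-mapped case, addresses 0 and 36 map to the same line and evict each other, which is the conflict that associativity removes.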

Cache II
(Q) Architecture general
1. Which is not a main characteristic of Von Neumann architectures?
1) The major components of a computer system are Processor, Memory,
and Devices. Buses transport data between components.
2) The Stored-Memory Concept is used.
3) Data and programs are stored in memory.
4) Instructions and data have separate memories.

2. Why is instruction set architecture necessary?

(Q) Architecture General
◼ Describe main characteristics of RISC architectures
◼ Describe three types of pipeline hazards
◼ Describe the write policies of a cache on a cache hit
◼ Describe three types of assembly instructions
❑ Memory: Move values between memory and register
❑ Calculation: Arithmetic and other operations
❑ Control flow: Changes the sequential execution

Architectures
END

(Q) Carry Lookahead Adder
◼ Design 3-bit carry look-ahead adder

Combinational Circuits
Performance: All Factors (1/5)
◼ You are given 2 machine designs M1 and M2 for performance
benchmarking. Both M1 and M2 have the same ISA, but different
hardware implementations and compilers. Assuming that the clock
cycle times for M1 and M2 are the same, performance study gives
the following measurements for the 2 designs.

                    For M1                                  For M2
Instruction class   CPI   No. of instructions executed      CPI   No. of instructions executed
A                   1     3,000,000,000,000                 2     2,700,000,000,000
B                   2     2,000,000,000,000                 3     1,800,000,000,000
C                   3     2,000,000,000,000                 3     1,800,000,000,000
D                   4     1,000,000,000,000                 2     900,000,000,000

(Q) Performance: All Factors (2/5)
a) What is the CPI for each machine?
b) Which machine is faster? By how much?
c) To further improve the performance of the machines, a new compiler
technique is introduced. The compiler can simply eliminate all class D
instructions from the benchmark program without any side effects. (That is,
there is no change to the number of class A, B, and C instructions executed on
the 2 machines.) With this new technique, which machine is faster? By how
much?
d) Alternatively, to further improve the performance of the machines, a new
hardware technique is introduced. The hardware can execute all class
D instructions in zero time without any side effects. (Class D instructions
are still executed.) With this new technique, which machine is faster?
By how much?
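Since both machines share the same clock cycle time, comparing total cycle counts is enough to rank them. The sketch below tabulates the measurements and recomputes the totals with class D excluded, which models both the compiler change in (c) and, as far as cycle counts go, the zero-time hardware in (d). Note that the average CPI differs between (c) and (d) because the class D instructions still count as executed in (d). All names are hypothetical.

```python
# class -> (CPI, number of instructions executed)
M1 = {"A": (1, 3_000_000_000_000), "B": (2, 2_000_000_000_000),
      "C": (3, 2_000_000_000_000), "D": (4, 1_000_000_000_000)}
M2 = {"A": (2, 2_700_000_000_000), "B": (3, 1_800_000_000_000),
      "C": (3, 1_800_000_000_000), "D": (2, 900_000_000_000)}


def totals(machine, skip=()):
    """Return (total cycles, total instructions), ignoring classes in `skip`."""
    rows = [(cpi, n) for cls, (cpi, n) in machine.items() if cls not in skip]
    return sum(cpi * n for cpi, n in rows), sum(n for _, n in rows)


for label, skip in (("all classes", ()), ("without class D", ("D",))):
    c1, n1 = totals(M1, skip)
    c2, n2 = totals(M2, skip)
    print(f"{label}: M1 CPI={c1 / n1:.3f}, M2 CPI={c2 / n2:.3f}, M2/M1 cycles={c2 / c1:.3f}")
```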

(Q) Pipelining: Without Forwarding
◼ How many cycles will it take to execute the following code on a 5-stage
pipeline without forwarding?
sub $2, $1, $3
and $12, $2, $5
or $13, $6, $2

Cycle   1     2     3      4     5     6   7   8   9   10   11
        IM    REG   ALU    MEM   REG
        (IF)  (ID)  (EXE)  (DM)  (WB)

(Q) Pipelining: With Forwarding
◼ How many cycles will it take to execute the following code on a 5-stage
pipeline with forwarding?
sub $2, $1, $3
and $12, $2, $5
or $13, $6, $2

Cycle   1     2     3      4     5     6   7   8   9   10   11
sub     IM    REG   ALU    MEM   REG
        (IF)  (ID)  (EXE)  (DM)  (WB)
and
or
