Exercise Only
Exercises
◼ Arithmetic
◼ Digital logic
◼ Architectures
(Q) Number: Conversion
◼ Convert number
❑ (18)10 = ( )2
❑ (58)10 = ( )2
❑ (0.625)10 = ( )2
❑ (0.3125)10 = ( )2
❑ (1101.101)2 = ( )10
◼ Convert numbers
❑ (10 111 011 001 . 101 110)2 = ( )8
❑ (2731.56)8 = ( )2
❑ (101 1101 1001 . 1011 1000)2 = ( )16
❑ (5D9.B8)16 = ( )2
[Table: 4-bit signed values +7 down to +0 (+0 = 0000) and -0 down to -8]
        +6      0110             +4      0100
      + -3    + 1101           + -7    + 1001
(c)   ----    ------     (d)   ----    ------
        +3     10011             -3      1101
      ----    ------           ----    ------
        -3      1101             +5      0101
      + -6    + 1010           + +6    + 0110
(e)   ----    ------     (f)   ----    ------
        -9     10111            +11      1011
      ----    ------           ----    ------
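The four sums above can be reproduced with a short Python sketch of 4-bit two's-complement addition. The helper names (`to_twos`, `from_twos`, `add_twos`) are made up for this illustration; the sketch flags overflow whenever the 4-bit result differs from the true sum, which is the case for (e) and (f).

```python
def to_twos(value, bits=4):
    """Encode a signed integer as an unsigned bit pattern of the given width."""
    return value & ((1 << bits) - 1)


def from_twos(pattern, bits=4):
    """Decode an unsigned bit pattern as a signed two's-complement integer."""
    sign_bit = 1 << (bits - 1)
    return pattern - (1 << bits) if pattern & sign_bit else pattern


def add_twos(a, b, bits=4):
    """Add two signed integers in `bits`-bit two's complement.

    Returns (raw sum as a bit string, signed result, overflow flag).
    The raw sum keeps the carry-out bit, as in the worked examples.
    """
    raw = to_twos(a, bits) + to_twos(b, bits)
    result = from_twos(raw & ((1 << bits) - 1), bits)
    overflow = result != a + b  # true sum does not fit in `bits` bits
    return format(raw, "b").zfill(bits), result, overflow


for a, b in [(6, -3), (4, -7), (-3, -6), (5, 6)]:
    print(a, b, add_twos(a, b))
```

In cases (e) and (f) the 4-bit pattern reads as +7 and -5 respectively, so the values -9 and +11 shown in the examples cannot be represented in 4 bits.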
X(X+Y) = X
(Q) Boolean Algebra
◼ Simplify (x+y)(x+y')(x'+z)
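One way to check a candidate answer is to compare truth tables exhaustively. The sketch below assumes the candidate xz, obtained by (x+y)(x+y') = x + yy' = x and then x(x'+z) = xx' + xz = xz; the function names are chosen for this illustration only.

```python
from itertools import product


def original(x, y, z):
    """The expression from the exercise: (x+y)(x+y')(x'+z)."""
    return (x or y) and (x or not y) and ((not x) or z)


def candidate(x, y, z):
    """Candidate simplification (an assumption to be verified): xz."""
    return x and z


# Compare both expressions on all 8 input combinations.
equivalent = all(
    bool(original(x, y, z)) == bool(candidate(x, y, z))
    for x, y, z in product([False, True], repeat=3)
)
print(equivalent)
```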
Boolean Algebra
(Q) Sum-of-Product and NAND gates
◼ Draw the gate-level circuit using 2-level NAND and NOT gates.
❑ F = AB + C'D + E
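A drawn circuit can be sanity-checked in software. The sketch below assumes the usual sum-of-products to NAND-NAND conversion: each product term goes through a first-level NAND, the single literal E is inverted, and a second-level NAND combines them, with NOT gates supplying C' and E'. The function names are illustrative.

```python
from itertools import product


def nand(*inputs):
    """N-input NAND gate on 0/1 values."""
    return int(not all(inputs))


def inv(a):
    """NOT gate on a 0/1 value."""
    return 1 - a


def f_sop(a, b, c, d, e):
    """Reference: F = AB + C'D + E."""
    return int((a and b) or ((not c) and d) or e)


def f_nand(a, b, c, d, e):
    """Two-level NAND form: NAND(NAND(A,B), NAND(C',D), E')."""
    return nand(nand(a, b), nand(inv(c), d), inv(e))


# Exhaustively compare both forms over all 32 input combinations.
matches = all(
    f_sop(*bits) == f_nand(*bits) for bits in product([0, 1], repeat=5)
)
print(matches)
```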
Karnaugh Map
(Q) Karnaugh Map
◼ An odd parity generator for BCD code
which has 6 unused combinations. Derive
A B C D P
0 0 0 0 1
0 0 0 1 0
0 0 1 0 0
0 0 1 1 1
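The full truth table behind the Karnaugh map can be generated with a few lines of Python. This sketch assumes P is chosen so that the five bits A, B, C, D, P together contain an odd number of 1s, and it marks the six unused codes (1010 to 1111) as don't-cares with "X"; the names are illustrative.

```python
def odd_parity_bit(digit):
    """Parity bit that makes the total number of 1s (data + parity) odd."""
    ones = bin(digit).count("1")
    return 1 if ones % 2 == 0 else 0


rows = []
for value in range(16):
    bits = format(value, "04b")  # A B C D as a 4-character string
    # Codes above 9 are not valid BCD digits, so P is a don't-care there.
    p = odd_parity_bit(value) if value <= 9 else "X"
    rows.append((bits, p))

for bits, p in rows:
    print(" ".join(bits), p)
```

The first four rows printed agree with the rows of the table shown in the exercise.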
(Q) Combinational Circuit: MUX
◼ Draw truth table for 4-by-1 multiplexer
◼ Design and draw the combinational logic using AND, OR, NOT gates
[Figure: 4:1 mux with inputs I0, I1, I2, I3, select lines S1 and S0, and output Y]
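A gate-level design can be checked against the behavioural definition of a multiplexer. The sketch below assumes S1 is the more significant select bit, so that select value 2*S1 + S0 picks input I0 to I3; the function name `mux4` is made up for this illustration.

```python
from itertools import product


def mux4(i0, i1, i2, i3, s1, s0):
    """4:1 mux built only from AND, OR, NOT on 0/1 values.

    Y = S1'S0'I0 + S1'S0 I1 + S1 S0'I2 + S1 S0 I3
    """
    n1, n0 = 1 - s1, 1 - s0  # NOT gates on the select lines
    return (n1 & n0 & i0) | (n1 & s0 & i1) | (s1 & n0 & i2) | (s1 & s0 & i3)


# Compare with "pick the input numbered by the select value".
correct = all(
    mux4(i0, i1, i2, i3, s1, s0) == (i0, i1, i2, i3)[2 * s1 + s0]
    for i0, i1, i2, i3, s1, s0 in product([0, 1], repeat=6)
)
print(correct)
```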
MSI
(Q) Combinational Circuit: Encoder
◼ Draw truth table
◼ Design and draw the combinational logic using AND, OR, NOT gates
[Figure: 4-to-2 encoder with inputs F0, F1, F2, F3 (select via switches) and outputs D0, D1 (2-bit code)]
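A possible design can be tried out in Python before drawing it. The sketch below makes two assumptions that the slide does not state: exactly one of F0 to F3 is 1 at a time, and D1 is the more significant output bit. Under those assumptions two OR gates are enough.

```python
def encode4to2(f0, f1, f2, f3):
    """4-to-2 encoder for one-hot inputs (assumption: exactly one input is 1).

    Returns (D1, D0) with D1 as the more significant bit.
    F0 needs no gate: when it is the active input, both outputs are 0.
    """
    d1 = f2 | f3
    d0 = f1 | f3
    return d1, d0


for active in range(4):
    inputs = [1 if i == active else 0 for i in range(4)]
    print(inputs, encode4to2(*inputs))
```

With this simple design the all-zero input is indistinguishable from F0 = 1, which is one reason practical encoders often add a "valid" output.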
(Q) Sequential Circuit
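[Figure: sequential circuit exercise with inputs x and y and state A]
Input sequences: x = 0 1 1 0 0 1 1, y = 0 0 1 0 1 0 0; A starts at 0
[Table to complete: rows 0 1 0, 0 1 1, 1 0 0, 1 0 1, 1 1 0, 1 1 1]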
Sequential Circuit
(Q) Convolutional encoder
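[Figure: convolutional encoder; input In feeds registers Reg1 and Reg2, and outputs Out1 and Out2 form Out = [Out1, Out2]]
[Table to complete: Out for each input value (0 or 1) and each register state (10, 01, 11)]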
(Q) Shift register
[Figure: shift register circuit with outputs Out[0] and Out[1]]
[Timing diagram to complete over clock cycles 1 to 5: Clk, In, Reg1, Reg2, Out[0], Out[1]]
(Q) Register
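[Figure: two D flip-flops driven by Clk, with input D and outputs Q1 and Q2]
[Timing diagram to complete over clock cycles 1 to 5: Clock, Q1, Q2]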
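A timing diagram of this kind can be cross-checked with a tiny simulation. The sketch below assumes the two D flip-flops are connected in series (D feeds the first, Q1 feeds the second), share one clock, are edge-triggered, and start at 0; the input sequence is invented for illustration and is not the one from the slide.

```python
def simulate(d_inputs, q1=0, q2=0):
    """Simulate two cascaded D flip-flops, one list entry per clock edge.

    On each active edge both flip-flops sample at the same instant:
    the first takes D, the second takes the OLD value of Q1.
    """
    trace = []
    for d in d_inputs:
        q1, q2 = d, q1  # simultaneous update, so Q2 lags Q1 by one cycle
        trace.append((q1, q2))
    return trace


# Hypothetical input over 5 clock cycles.
for cycle, (q1, q2) in enumerate(simulate([1, 0, 1, 1, 0]), start=1):
    print(cycle, q1, q2)
```

The key observation is that Q2 repeats Q1 one clock cycle later, which is what turns a chain of flip-flops into a shift register.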
(Q) Performance: Clock Cycle and Clock Frequency
◼ Program P runs in 10 seconds on computer A, which has a
400 MHz clock.
◼ Suppose we are trying to build a new machine B that will run
this program in 6 seconds. Unfortunately, the increase in
clock frequency has an adverse effect on the rest of the CPU
design, causing machine B to require 1.2 times as many clock
cycles as machine A for the same program. What clock
frequency should we target for machine B to achieve our
goal?
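The calculation follows from execution time = clock cycles / clock frequency. A minimal sketch, with a function name chosen for this illustration:

```python
def required_clock_hz(time_a_s, clock_a_hz, cycle_ratio, time_b_s):
    """Clock frequency machine B needs to hit its target run time.

    cycles_A = time_A * f_A
    cycles_B = cycle_ratio * cycles_A
    f_B      = cycles_B / time_B
    """
    cycles_a = time_a_s * clock_a_hz
    cycles_b = cycle_ratio * cycles_a
    return cycles_b / time_b_s


target = required_clock_hz(10, 400e6, 1.2, 6)
print(f"{target / 1e6:.0f} MHz")
```

Machine A executes 4 x 10^9 cycles, machine B therefore needs 4.8 x 10^9 cycles, and finishing those in 6 seconds requires a clock of about 800 MHz.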
a = a + 4; $s0 → a
f = g; $s0 → f, $s1 → g
a = a * 8; $s0 → a
MIPS
(Q) MIPS Assembly
◼ Write assembly or C code
C Statement                  MIPS Assembly Code
A[7] = h + A[10];            h ➔ $s2, base of A[] ➔ $s3
if (i == j)                  f ➔ $s0, g ➔ $s1, h ➔ $s2,
    f = g + h;               i ➔ $s3, j ➔ $s4
(Q) Datapath single-cycle
Loop: bne  $9, $0, End     # Line 1, PC=1000
      add  $8, $8, $10     # Line 2
      addi $9, $9, -1      # Line 3
      beq  $0, $0, Loop    # Line 4
End:                       # Line 5
[Figure: single-cycle datapath with register file (RR1, RR2, WR, WD, RD1, RD2), ALU (ALUcontrol, is0?), data memory (Address, Data, Read), sign extend, instruction fields rs (25:21), 20:16, rd (15:11), shamt (10:6), 5:0, and control signals RegWrite, MemWrite, MemRead, MemToReg]
value C?
(4) For Line 4, what is value D?
(5) For Line 4, what is value E?
Processor: Datapath
(Q) Datapath single-cycle
Loop: bne  $9, $0, End     # Line 1, PC=1000
      add  $8, $8, $10     # Line 2
      addi $9, $9, -1      # Line 3
      beq  $0, $0, Loop    # Line 4
End:                       # Line 5
◼ Fill in the values in the table below (initially, all register values are 0)
Value   Line 1   Line 2   Line 3   Line 4
A
B
C
D
E
F
G
H
I
J
[Figure: single-cycle datapath with labeled points A to K: PC, instruction memory, adders (PC + 4 and branch target after a 2-bit left shift), PCSrc mux, opcode (31:26), register file (RR1, RR2, WR, WD, RD1, RD2), ALU (ALUcontrol, is0?), data memory, sign extend, and control signals RegWrite, MemWrite, MemRead, MemToReg]
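Before filling in datapath values it helps to know which instructions actually execute. The sketch below is a toy interpreter for just these four instructions, not a model of the datapath. It assumes 4-byte instructions starting at PC = 1000 (so End is at 1016), all registers initially 0, and stores branch targets as absolute addresses for simplicity, whereas real MIPS encodes a PC-relative offset.

```python
def run(max_steps=20):
    """Trace the loop; returns (list of executed PCs, final register file)."""
    regs = [0] * 32  # all registers start at 0, as the exercise states
    program = {
        1000: ("bne", 9, 0, 1016),   # Line 1: bne  $9, $0, End
        1004: ("add", 8, 8, 10),     # Line 2: add  $8, $8, $10
        1008: ("addi", 9, 9, -1),    # Line 3: addi $9, $9, -1
        1012: ("beq", 0, 0, 1000),   # Line 4: beq  $0, $0, Loop
    }
    pc, trace = 1000, []
    while pc in program and len(trace) < max_steps:
        op, a, b, c = program[pc]
        trace.append(pc)
        if op == "add":
            regs[a] = regs[b] + regs[c]
            pc += 4
        elif op == "addi":
            regs[a] = regs[b] + c
            pc += 4
        elif op == "bne":
            pc = c if regs[a] != regs[b] else pc + 4
        elif op == "beq":
            pc = c if regs[a] == regs[b] else pc + 4
    return trace, regs


trace, regs = run()
print(trace, regs[8], regs[9])
```

Under these assumptions the first bne falls through because $9 equals $0, Lines 2 to 4 run once, and the second bne is taken because $9 is now -1.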
(Q) Control single-cycle
Loop: bne  $9, $0, End     # Line 1
      add  $8, $8, $10     # Line 2
      addi $9, $9, -1      # Line 3
      beq  $0, $0, Loop    # Line 4
End:                       # Line 5
◼ Fill in table below (initially, all register values are 0)
Signal      Line 1   Line 2   Line 3   Line 4
RegDst
RegWrite
MemWrite
MemRead
[Figure: single-cycle datapath with control signals PCSrc, RegDst, RegWrite, ALUcontrol, MemWrite, MemRead, MemToReg; PC, instruction memory, adders, 2-bit left shift, opcode (31:26), register file, ALU, data memory, sign extend]
Processor: Control
(Q) General pipelining concept
◼ Suppose a CPU executes the following code. Each line is assumed to be executed using two
‘load’ instructions and a single ‘add’ instruction.
a = b + c; // load b, load c, add
d = e + f; // load e, load f, add
◼ Suppose the CPU requires 3 steps (Fetch, Decode, and Execution) to execute a single assembly
instruction.
Step               Fetch   Decode   Execution
load instruction   1 ns    1 ns     1 ns
add instruction    1 ns    1 ns     1 ns
◼ Answer the following questions for both cases: with and without pipelining.
❑ What is the latency to execute a single instruction?
❑ What is the total execution time?
❑ What is the bandwidth in terms of “number of instructions per second”?
❑ Assume the ideal maximum bandwidth of the CPU is 10^9 instructions per second. What is the throughput?
❑ When there are N instructions, what are the total execution time and bandwidth? Assume the
instructions do not depend on each other.
❑ When there are N instructions, what is the maximum speed-up in the steady state if we use
pipelining?
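The timing questions above can be explored numerically. The sketch below assumes an ideal pipeline: three 1 ns stages, no stalls, and one instruction completing per stage time once the pipeline is full. The function names are illustrative.

```python
STAGES = 3        # Fetch, Decode, Execution
STAGE_NS = 1.0    # each stage takes 1 ns for both load and add


def unpipelined_time_ns(n):
    """Each instruction runs all stages before the next one starts."""
    return n * STAGES * STAGE_NS


def pipelined_time_ns(n):
    """First instruction fills the pipeline, then one finishes per stage time."""
    return (STAGES + n - 1) * STAGE_NS


def bandwidth_ips(n, total_ns):
    """Instructions per second for n instructions taking total_ns."""
    return n / (total_ns * 1e-9)


def speedup(n):
    """Unpipelined time divided by pipelined time for n instructions."""
    return unpipelined_time_ns(n) / pipelined_time_ns(n)


n = 6  # the two C statements expand to 6 instructions
print(unpipelined_time_ns(n), pipelined_time_ns(n))
print(bandwidth_ips(n, unpipelined_time_ns(n)), bandwidth_ips(n, pipelined_time_ns(n)))
print(speedup(n), speedup(10**6))
```

Latency per instruction stays at 3 ns in both cases. For N instructions the totals are 3N ns and (N + 2) ns, so the speed-up 3N / (N + 2) approaches the number of stages, 3, as N grows.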
(Q) Pipelining
◼ Given this code:
Instruction   Inst Mem   Reg read   ALU   Data Mem   Reg write   Total
Pipelining
(Q) (Set Associative) Cache Organization and Operation
Suppose the cache size is 32 bytes and the cache block
size is 8 bytes. The memory access address
sequence is 4, 0, 8, 36, 0. Initially, the cache is
empty.
[Figure: address layout with fields Tag, Index, Offset; bit positions 31, N+M-1, N, N-1, 0 are marked]
(1) Direct-mapped cache
❑ What are the numbers of bits for the offset, tag, and index?
❑ What is the hit rate?
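The direct-mapped case can be traced with a small simulation. The sketch assumes byte addresses, a 32-bit address (as the figure's bit 31 suggests), and the standard split of the block number into index (low bits) and tag (remaining bits); the function names are made up for this illustration.

```python
def field_widths(cache_bytes=32, block_bytes=8, address_bits=32):
    """Return (offset_bits, index_bits, tag_bits) for a direct-mapped cache."""
    num_blocks = cache_bytes // block_bytes
    offset_bits = block_bytes.bit_length() - 1   # log2 for powers of two
    index_bits = num_blocks.bit_length() - 1
    return offset_bits, index_bits, address_bits - offset_bits - index_bits


def simulate_direct_mapped(addresses, cache_bytes=32, block_bytes=8):
    """Return a list of booleans, True where the access hits."""
    num_blocks = cache_bytes // block_bytes
    lines = [None] * num_blocks        # tag stored per line; None means empty
    hits = []
    for addr in addresses:
        block = addr // block_bytes    # drop the offset bits
        index = block % num_blocks     # which cache line
        tag = block // num_blocks      # what must match for a hit
        hits.append(lines[index] == tag)
        lines[index] = tag             # on a miss, the block replaces the line
    return hits


hits = simulate_direct_mapped([4, 0, 8, 36, 0])
print(field_widths(), hits, sum(hits) / len(hits))
```

Address 36 maps to the same line as addresses 0 and 4 but carries a different tag, so it evicts that block and the final access to 0 misses again.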
Cache II
(Q) Architecture general
1. Which is not a main characteristic of Von Neumann architectures?
1) Major components of a computer system are Processor, Memory
and Devices. Buses transport data between components.
2) Stored-Memory Concept is used.
3) Data and programs are stored in memory.
4) Instructions and data have separate memories.
(Q) Architecture General
◼ Describe main characteristics of RISC architectures
◼ Describe three types of pipeline hazards
◼ Describe write policies in a cache on a cache hit
◼ Describe three types of assembly instructions
❑ Memory: Move values between memory and register
❑ Calculation: Arithmetic and other operations
❑ Control flow: Changes the sequential execution
Architectures
END
(Q) Carry Lookahead Adder
◼ Design 3-bit carry look-ahead adder
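The lookahead equations can be validated in software before drawing gates. The sketch below uses the standard generate (g = a AND b) and propagate (p = a XOR b) signals and expands each carry directly in terms of g, p, and the carry-in, so no carry depends on another carry. The function name `cla3` is chosen for this illustration.

```python
def cla3(a, b, c0=0):
    """3-bit carry look-ahead adder. Returns (3-bit sum, carry-out)."""
    abits = [(a >> i) & 1 for i in range(3)]
    bbits = [(b >> i) & 1 for i in range(3)]
    g = [x & y for x, y in zip(abits, bbits)]   # generate signals
    p = [x ^ y for x, y in zip(abits, bbits)]   # propagate signals

    # Every carry is a two-level function of g, p and c0 only.
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)

    carries = [c0, c1, c2]
    s = [p[i] ^ carries[i] for i in range(3)]   # sum bit i = p_i XOR c_i
    return s[0] | (s[1] << 1) | (s[2] << 2), c3


# Exhaustive check against ordinary integer addition.
ok = all(
    (cla3(a, b, c)[1] << 3) | cla3(a, b, c)[0] == a + b + c
    for a in range(8) for b in range(8) for c in (0, 1)
)
print(ok)
```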
Combinational Circuits
Performance: All Factors (1/5)
◼ You are given 2 machine designs M1 and M2 for performance
benchmarking. Both M1 and M2 have the same ISA, but different
hardware implementations and compilers. Assuming that the clock
cycle times for M1 and M2 are the same, performance study gives
the following measurements for the 2 designs.
Instruction   For M1                              For M2
class         CPI   No. of instructions executed   CPI   No. of instructions executed
A             1     3,000,000,000,000              2     2,700,000,000,000
B             2     2,000,000,000,000              3     1,800,000,000,000
C             3     2,000,000,000,000              3     1,800,000,000,000
D             4     1,000,000,000,000              2     900,000,000,000
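Since the clock cycle times are equal, the comparison comes down to total clock cycles, the sum over classes of CPI times instruction count. The sketch below computes total cycles and average CPI for both machines from the table; the dictionary layout and function names are illustrative, and the sub-questions of this exercise are not shown in the source.

```python
# (CPI, instructions executed) per instruction class, copied from the table.
m1 = {"A": (1, 3_000_000_000_000), "B": (2, 2_000_000_000_000),
      "C": (3, 2_000_000_000_000), "D": (4, 1_000_000_000_000)}
m2 = {"A": (2, 2_700_000_000_000), "B": (3, 1_800_000_000_000),
      "C": (3, 1_800_000_000_000), "D": (2, 900_000_000_000)}


def total_cycles(mix):
    """Sum of CPI * instruction count over all classes."""
    return sum(cpi * count for cpi, count in mix.values())


def total_instructions(mix):
    return sum(count for _, count in mix.values())


def average_cpi(mix):
    return total_cycles(mix) / total_instructions(mix)


for name, mix in (("M1", m1), ("M2", m2)):
    print(name, total_cycles(mix), total_instructions(mix), average_cpi(mix))

# With equal cycle times, the execution-time ratio equals the cycle ratio.
print(total_cycles(m2) / total_cycles(m1))
```

M2 executes fewer instructions but needs more cycles in total, so with the same clock M1 finishes the benchmark slightly sooner.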
(Q) Pipelining : With Forwarding
◼ How many cycles will it take to execute the following code on a 5-stage
pipeline with forwarding?
sub $2, $1, $3
and $12, $2, $5
or $13, $6, $2
[Pipeline chart to complete: clock cycles 1 to 11 across, one row per instruction (sub, and, or)]
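The cycle count can be estimated with a simple model. The sketch below assumes the classic 5-stage MIPS pipeline (IF, ID, EX, MEM, WB) with full forwarding, where an ALU result can be forwarded to the next instruction without stalling and only a load followed immediately by a dependent instruction costs one stall cycle. The tuple format and function name are made up for this illustration.

```python
def cycles_with_forwarding(instructions, stages=5):
    """Estimate total cycles for a straight-line instruction sequence.

    Each instruction is (opcode, dest, src1, src2) using register numbers.
    Only load-use hazards between adjacent instructions add a stall.
    """
    stalls = 0
    for prev, cur in zip(instructions, instructions[1:]):
        if prev[0] == "lw" and prev[1] in cur[2:]:
            stalls += 1
    # The first instruction takes `stages` cycles, each later one adds 1.
    return stages + len(instructions) - 1 + stalls


code = [
    ("sub", 2, 1, 3),    # sub $2, $1, $3
    ("and", 12, 2, 5),   # and $12, $2, $5   (needs $2 from sub)
    ("or", 13, 6, 2),    # or  $13, $6, $2   (needs $2 from sub)
]
print(cycles_with_forwarding(code))
```

Both dependences on $2 come from an ALU instruction, so under these assumptions forwarding covers them and no stall cycles are added.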