Pipeline Review: Here Is The Example Instruction Sequence Used To Illustrate Pipelining On The Previous Page
[Figure: pipelined datapath with the IF/ID, ID/EX, EX/MEM, and MEM/WB pipeline registers and the control signals PCSrc, RegWrite, ALUSrc, ALUOp, RegDst, MemRead, MemWrite, and MemToReg]
An example with dependences
Read after Write dependences
Data hazards in the pipeline diagram
[Figure: pipeline diagram over clock cycles 1 to 9]
One Solution To Data Hazards
Original sequence:
sub $2, $1, $3
and $12, $2, $5
or $13, $6, $2
add $14, $2, $2
sw $15, 100($2)

With nops (sll $0, $0, $0) inserted:
sub $2, $1, $3
sll $0, $0, $0
sll $0, $0, $0
and $12, $2, $5
or $13, $6, $2
add $14, $2, $2
sw $15, 100($2)
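The nop-insertion idea can be sketched in Python. This is only an illustration, not an assembler pass: the tuple encoding of instructions and the helper name `insert_nops` are hypothetical, and the sketch assumes no forwarding and a register file that writes in the first half of a cycle and reads in the second half, so a consumer must sit at least three slots after its producer.

```python
NOP = ("sll", "$0", ["$0", "$0"])  # sll $0, $0, $0 is the MIPS nop

def insert_nops(program, min_distance=3):
    """Pad the program with nops so that every reader of a register is at
    least `min_distance` slots after the latest writer of that register.
    Assumption: no forwarding; write-then-read register file."""
    out = []
    last_write = {}  # register name -> index in `out` of its latest writer
    for op, dest, srcs in program:
        for reg in srcs:
            if reg != "$0" and reg in last_write:
                # Keep padding until the consumer is far enough away.
                while len(out) - last_write[reg] < min_distance:
                    out.append(NOP)
        out.append((op, dest, srcs))
        if dest is not None and dest != "$0":
            last_write[dest] = len(out) - 1
    return out

# (opcode, destination register or None, source registers)
program = [
    ("sub", "$2", ["$1", "$3"]),
    ("and", "$12", ["$2", "$5"]),
    ("or", "$13", ["$6", "$2"]),
    ("add", "$14", ["$2", "$2"]),
    ("sw", None, ["$15", "$2"]),
]

padded = insert_nops(program)
opcodes = [op for op, _, _ in padded]
print(opcodes)
```

For the example sequence this places two nops between the sub and the and; the later instructions are already far enough from the sub and need no padding.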
[Figure: pipeline diagram of the five instructions (sub, and, or, add, sw), each passing through IM, Reg, DM, Reg]
Forwarding
Since the pipeline registers already contain the ALU result, we could just forward the value to later instructions to prevent data hazards.
• In clock cycle 4, the AND instruction can get the value of $1 - $3 from the EX/MEM pipeline register used by SUB.
• Then in cycle 5, the OR can get that same result from the MEM/WB pipeline register being used by SUB.
[Figure: pipeline diagram over clock cycles 1 to 7 for sub $2, $1, $3; and $12, $2, $5; or $13, $6, $2]
Forwarding Implementation
Forwarding requires:
(a) recognizing when a potential data hazard exists, and
(b) revising the pipeline to introduce forwarding paths.
We’ll do those revisions next time
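As a preview of step (a), hazard recognition can be sketched as a comparison between the source register of the instruction in EX and the destination registers held in the EX/MEM and MEM/WB pipeline registers. This is a sketch under assumptions, not the implementation these slides go on to develop: the `PipeReg` fields and the function name `forward_select` are hypothetical, and the conditions follow the common textbook formulation.

```python
from dataclasses import dataclass

@dataclass
class PipeReg:
    """Hypothetical view of the fields a forwarding check needs."""
    reg_write: bool  # will the instruction in this stage write a register?
    rd: int          # its destination register number

def forward_select(src, ex_mem, mem_wb):
    """Decide where the ALU operand for register `src` should come from.

    Returns "EX/MEM", "MEM/WB", or "REGFILE". Register 0 is never
    forwarded because it is hardwired to zero in MIPS.
    """
    if ex_mem.reg_write and ex_mem.rd != 0 and ex_mem.rd == src:
        return "EX/MEM"   # the most recent result takes priority
    if mem_wb.reg_write and mem_wb.rd != 0 and mem_wb.rd == src:
        return "MEM/WB"
    return "REGFILE"

# Cycle 4: AND is in EX and needs $2; SUB's result sits in EX/MEM.
cycle4 = forward_select(2, PipeReg(True, 2), PipeReg(False, 0))

# Cycle 5: OR is in EX and needs $2; AND ($12) is in EX/MEM, SUB in MEM/WB.
cycle5 = forward_select(2, PipeReg(True, 12), PipeReg(True, 2))

print(cycle4, cycle5)
```

The two calls mirror the cycle 4 and cycle 5 cases described on the Forwarding slide.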
What about stores?
Two “easy” cases:
[Figure: pipeline diagrams over clock cycles 1 to 6 for the two cases, sw $4, 0($1) and sw $1, 0($4)]
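Load/Store Bypassing: Extends Datapath
By cycling the result of … speeds … until there is a cache miss!
[Figure: extended datapath showing the EX/MEM and MEM/WB pipeline registers, the data memory ports (Read data, Write data), and a mux controlled by ForwardC]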
What about loads?
Imagine if the first instruction in the example was LW instead of SUB.
The load data doesn’t come from memory until the end of cycle 4, but the AND needs that value at the beginning of the same cycle!
This is a “true” data hazard: the data is simply not available when it’s needed.
[Figure: pipeline diagram over clock cycles 1 to 6 for lw $2, 20($3); and $12, $2, $5]
Stalling
The easiest solution is to stall the pipeline.
We can delay the AND instruction by introducing a one-cycle delay in the pipeline, often called a bubble.
[Figure: pipeline diagram over clock cycles 1 to 7 for lw $2, 20($3); and $12, $2, $5, with a one-cycle bubble]
Stalling and forwarding
Without forwarding, we’d have to stall for two cycles to
wait for the LW instruction’s writeback stage.
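The stall counts quoted on these slides can be checked with a little cycle arithmetic. The sketch below is an illustration under stated assumptions: a five-stage pipeline in which the producer is fetched in cycle 1, a register file that can be written and then read in the same cycle, and a hypothetical helper name `stall_cycles`.

```python
def stall_cycles(producer_is_load, distance, forwarding):
    """Bubbles needed between a producer and a consumer that is `distance`
    instructions later (1 = immediately after).

    Assumed timing for the producer: IF=1, ID=2, EX=3, MEM=4, WB=5.
    Without stalls the consumer has ID = 2 + distance, EX = 3 + distance.
    """
    if forwarding:
        # The value exists at the end of EX (ALU op) or MEM (load).
        ready_at_end_of = 4 if producer_is_load else 3
        consumer_ex = 3 + distance
        # The consumer's EX must start in a later cycle than the one
        # in which the value is produced.
        return max(0, ready_at_end_of + 1 - consumer_ex)
    # No forwarding: the consumer must read in ID no earlier than the
    # producer's WB cycle (same cycle is fine: write first, then read).
    consumer_id = 2 + distance
    return max(0, 5 - consumer_id)

load_with_fwd = stall_cycles(True, 1, forwarding=True)      # lw then and
load_without_fwd = stall_cycles(True, 1, forwarding=False)
alu_with_fwd = stall_cycles(False, 1, forwarding=True)      # sub then and
print(load_with_fwd, load_without_fwd, alu_with_fwd)
```

Under these assumptions a load followed immediately by its consumer needs one bubble with forwarding and two without, while an ALU result needs none once forwarding is in place.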
[Figure: pipeline diagram over clock cycles 1 to 8 for lw $2, 20($3); and $12, $2, $5]
[Figure: pipeline diagram for lw $2, 20($3); and $12, $2, $5; or $13, $12, $2]
Implementing stalls
To implement a stall, we force the two instructions after LW to remain in their ID and IF stages for one extra cycle.
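The effect of holding instructions in ID and IF can be sketched as a small scheduler that reports which stage each instruction occupies in each cycle. This is a simplified model, not the hardware: it assumes full forwarding, so the only stall is a load followed immediately by a consumer of its result, and the tuple encoding and the name `schedule` are hypothetical.

```python
def schedule(program):
    """Return a list of (text, {cycle: stage}) for an in-order 5-stage
    pipeline with forwarding, stalling only on load-use hazards."""
    rows = []
    prev = None  # timing and hazard info for the previous instruction
    for text, is_load, dest, srcs in program:
        if prev is None:
            if_start, id_start, ex_start = 1, 2, 3
        else:
            if_start = prev["id_start"]  # fetched when the previous one enters ID
            id_start = prev["ex_start"]  # enters ID when the previous one leaves it
            ex_start = id_start + 1
            if prev["is_load"] and prev["dest"] in srcs:
                # Load data arrives at the end of the load's MEM stage,
                # so EX must wait until the cycle after that.
                ex_start = max(ex_start, prev["ex_start"] + 2)
        row = {}
        for cycle in range(if_start, id_start):
            row[cycle] = "IF"            # repeated IF cycles mean "held in IF"
        for cycle in range(id_start, ex_start):
            row[cycle] = "ID"            # repeated ID cycles mean "held in ID"
        row[ex_start] = "EX"
        row[ex_start + 1] = "MEM"
        row[ex_start + 2] = "WB"
        rows.append((text, row))
        prev = {"is_load": is_load, "dest": dest,
                "id_start": id_start, "ex_start": ex_start}
    return rows

# (text, is_load, destination register, source registers)
program = [
    ("lw $2, 20($3)", True, "$2", ["$3"]),
    ("and $12, $2, $5", False, "$12", ["$2", "$5"]),
    ("or $13, $12, $2", False, "$13", ["$12", "$2"]),
]

rows = schedule(program)
for text, row in rows:
    print(f"{text:18}", row)
```

In this model the and spends cycles 3 and 4 in ID, the or spends cycles 3 and 4 in IF, and the last writeback happens in cycle 8.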
[Figure: pipeline diagram over clock cycles 1 to 8 for lw $2, 20($3) and the two stalled instructions that follow it; or $13, $12, $2 spends two cycles in IM]
Stall = Nop conversion
[Figure: pipeline diagram over clock cycles 1 to 8 for lw $2, 20($3); a nop (the converted and); and $12, $2, $5; or $13, $12, $2]
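One way to picture the nop conversion is as a hazard check that, when it fires, freezes the PC and the IF/ID register and sends all-zero control signals into ID/EX, so the instruction slot that moves on does nothing. The sketch below follows the common textbook formulation rather than anything shown on these slides; the function name `hazard_unit` and the `pc_write` and `if_id_write` outputs are assumptions, while the control signal names are taken from the datapath figure.

```python
def hazard_unit(id_ex_mem_read, id_ex_rt, if_id_rs, if_id_rt, control):
    """Return (pc_write, if_id_write, control_for_id_ex).

    A stall is needed when the instruction in EX is a load (MemRead set)
    whose destination (rt) is a source of the instruction in ID.
    """
    stall = (bool(id_ex_mem_read)
             and id_ex_rt != 0                      # $0 is never a real dependence
             and id_ex_rt in (if_id_rs, if_id_rt))
    if stall:
        # Zeroing every control signal turns this slot into a nop (bubble);
        # the PC and IF/ID are held so the same instruction is decoded again.
        bubble = {signal: 0 for signal in control}
        return False, False, bubble
    return True, True, dict(control)

# Control signals an R-type instruction such as AND would normally carry.
and_control = {"RegWrite": 1, "RegDst": 1, "ALUSrc": 0,
               "MemRead": 0, "MemWrite": 0, "MemToReg": 0}

# lw $2, 20($3) is in EX (rt = 2); and $12, $2, $5 is in ID (rs = 2, rt = 5).
pc_write, if_id_write, id_ex_control = hazard_unit(True, 2, 2, 5, and_control)
print(pc_write, if_id_write, id_ex_control)
```

With the load in EX and the dependent and in ID, the sketch holds the PC and IF/ID and emits an all-zero control word, which matches the "and -> nop" row in the diagram.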