Chapter_04_processor_3.5
Chapter_04_processor_3.5
Architecture
CH4 Processor Microarchitecture (III)
Instruction 1
Instruction 2
1 Instruction 3
Instruction 1
Instruction 2
Instruction 3
2
Outline
• Background
• Single-cycle design
• Pipelined design
• Pipeline concepts and MIPS's pipeline
• Cost and issues of pipelining
• Detailed pipelined datapath and control
• Trace the pipeline
• Dependencies, hazards, and forwarding
• Stalls and exceptions
3
Multi-Cycle Datapath
4
Consecutive Instruction Example
# int a;
# int b;
# int c;
… … # … …
80: lw t1, 0(gp) #
84: lw t2, 4(gp) #
88: nop #
92: add t3, t1, t2 # c = a + b;
96: sw t3, 8(gp) #
5
80: lw t1, 0(gp)
84: lw t2, 4(gp)
88: nop
First Cycle 92:
96:
add
sw
t3, t1, t2
t3, 8(gp)
80
6
80: lw t1, 0(gp)
84: lw t2, 4(gp)
88: nop
Second Cycle 92:
96:
add
sw
t3, t1, t2
t3, 8(gp)
84 lw t1, 0(gp)
7
80: lw t1, 0(gp)
84: lw t2, 4(gp)
88: nop
Third Cycle 92:
96:
add
sw
t3, t1, t2
t3, 8(gp)
8
80: lw t1, 0(gp)
84: lw t2, 4(gp)
88: nop
Forth Cycle 92:
96:
add
sw
t3, t1, t2
t3, 8(gp)
9
80: lw t1, 0(gp)
84: lw t2, 4(gp)
88: nop
Fifth Cycle 92:
96:
add
sw
t3, t1, t2
t3, 8(gp)
10
80: lw t1, 0(gp)
84: lw t2, 4(gp)
88: nop
Sixth Cycle 92:
96:
add
sw
t3, t1, t2
t3, 8(gp)
11
Simplified Pipeline Diagram
lw t1, 0(gp)
lw t2, 4(gp)
nop
add t3, t1, t2
sw t3, 8(gp)
12
Resource Usage Diagram
lw t1, 0(gp)
lw t2, 4(gp)
nop
sw t3, 8(gp)
13
Resource Usage Diagram
lw t1, 0(gp)
3rd-cycle ALU produces
result for 4th-cycle DM
lw t2, 4(gp)
nop
sw t3, 8(gp)
14
Real Hardware Diagram
lw t1, 0(gp)
lw t2, 4(gp)
nop
sw t3, 8(gp)
15
Pipeline Control
• Control signals are derived from each instruction
• Just like single-cycle datapath
• Control signals are passed along just like the data
PC
16
•
WB
•
Pipeline Control •
M •
•
What are they?
•
EX
•
PC
17
Datapath without Control
18
Datapath with Control
19
Dependencies and Hazards
• Dependencies
inst''
• Operand of the current instruction
depends on a previous instruction's inst'
outcome
PC: inst
• Hazards
• Current PC cannot control and data
be executed right now dependencies
• Pipeline needs to postpone executing
current PC (i.e., stall) for some cycles
• Dependencies may result in hazards
• Pipeline hardware design can resolve
some hazards
• Compiler can avoid some hazards
hazards
structure
hazards 20
Anatomy of Data Dependencies
• Source operand, rd or rt
• Source instruction type,
e.g., R-type, lw, & branch
• Distance, e.g., 1~5
PC: • Destination instruction type,
e.g., R-type, lw, sw, & branch
• Destination operand, rd
21
Results
ALU lw branch
• Distance == 1
incurs a hazard
No hazards • Distance >= 2
no hazard
ALU lw sw branch
22
Detailed Analysis
• ALU - ALU
23
Detailed Analysis
• ALU - LW
lw rd, imm(r3) IF ID EX M WB
24
Detailed Analysis
• ALU - SW
sw rs2, imm(r3) IF ID EX M WB
25
Detailed Analysis
• ALU - SW
sw r3, imm(rs1) IF ID EX M WB
26
Detailed Analysis
• ALU - Branch
IF ID EX M WB
27
Results
ALU lw branch
ALU lw sw branch
• rs2 can be forwarded (no hazard)
• rs1 dependency causes no hazard if
distance >= 2 (similar to lw-R-type)
28
Analysis
• LW - ALU
lw r3, imm(rs1) IF ID EX M r3 WB
IF ID EX M WB
29
Analysis
• LW - LW
lw r3, imm(rs1) IF ID EX M r3 WB
IF ID EX M WB
lw rd, imm(r3) IF ID EX M WB
lw rd, imm(r3) IF ID EX M WB
30
Analysis
• LW - SW
lw r3, imm(rs1) IF ID EX M r3 WB
IF ID EX M WB
sw rs2, imm(r3) IF ID EX M WB
sw rs2, imm(r3) IF ID EX M WB
31
Analysis
• LW - SW
lw r3, imm(rs1) IF ID EX M r3 WB
sw r3, imm(rs1) IF ID EX M WB
sw r3, imm(rs1) IF ID EX M WB
sw r3, imm(rs1) IF ID EX M WB
32
Analysis
• LW - SW
lw r3, imm(rs1) IF ID EX M r3 WB
IF ID EX M WB
IF ID EX M WB
33
Results
ALU lw branch
• Distance == 1
incurs a control hazard
• Distance >= 2
no hazard
ALU lw sw branch
34
Hazard Example
ALU從register
file拿到的,不
是最新的值
→ solution:
ALU改由他處
得最新的值
35
Forwarding (Bypassing)
• Write-back value
becomes rs1 or rs2
before being written
into the register file
36
Forwarding Paths
• For a 5-stage pipeline
• Forwarding destinations
• rs1 of ALU, beq, and SW
• rs2 of ALU and beq
• rs2 of SW
• Forwarding sources
• rd of the 1-cycle-earlier ALU
• rd of the 2-cycle-earlier ALU
• rd of the 1-cycle-earlier LW
• rd of the 2-cycle-earlier LW
37
• rs1 of ALU, beq, and SW
• rs2 of SW
imm - MU
X
M
U
X
39
Forwarding Control
• Fwd x to rs = f1(inst, inst', inst'')
• Fwd inst'.rd to rs
• inst ∈ {ALU, beq, lw, sw} and
• inst' ∈ {ALU} and
• inst.rs == inst'.rd and
• inst.rs != 0
• Fwd inst''.rd to rs
• inst ∈ {ALU, beq, lw, sw} and
• inst'' ∈ {ALU, lw} and
• inst.rs == inst''.rd and
• inst.rs != 0 and
• Fwd inst'.rd to ALU_in is false // not double dependency
40
Forwarding Control
• Fwd inst''.rd to inst'.rs2 = f2(inst, inst', inst'')
41
Outline
• Background
• Single-cycle design
• Pipelined design
• Pipeline concepts and MIPS's pipeline
• Cost and issues of pipelining
• Detailed pipelined datapath and control
• Trace the pipeline
• Dependencies, hazards, and forwarding
• Stalls and exceptions
42
Data Hazards That Cause Stall(s)
• Two lw cases
43
Data Hazards That Cause Stall(s)
• Three branch cases
ALU r1, r, r IF ID EX MEM WB
beq r1, r2, imm IF ID EX MEM WB
IF ID EX MEM WB
45
Hazard Detect and Stall
nop
46
Hazard Detect and Stall
nop
lw $1, 100($2)
84
nop
nop
nop
80: lw
84: addi
47
Hazard Detect and Stall
nop
addi t3, t1, 123
lw t1, 100(t2)
88
nop
nop
80: lw
84: addi
48
Hazard Detect and Stall
nop
addi t3, t1, 123
lw t1, 100(t2)
88
nop
nop
80: lw
84: addi
49
Hazard Detect and Stall
nop
lw t1, 100(t2)
8C
nop
…
80: lw
84: addi
50
Hazard Detect and Stall
nop
nop
…
…
80: lw
84: addi
51
Hazard Detect and Stall
nop
…
…
…
80: lw
84: addi
52