0% found this document useful (0 votes)
4 views

Chapter_04_processor_3.5

The document discusses the microarchitecture of a 12-stage pipelined processor, highlighting its speedup compared to a 1-cycle processor and detailing the concepts of pipelining, including dependencies, hazards, and forwarding. It includes examples of instruction execution in a pipelined environment, addressing issues such as stalls and control signals. The document also covers the analysis of data hazards and methods to mitigate them through techniques like forwarding and stalling.

Uploaded by

s6i893i7744
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
4 views

Chapter_04_processor_3.5

The document discusses the microarchitecture of a 12-stage pipelined processor, highlighting its speedup compared to a 1-cycle processor and detailing the concepts of pipelining, including dependencies, hazards, and forwarding. It includes examples of instruction execution in a pipelined environment, addressing issues such as stalls and control signals. The document also covers the analysis of data hazards and methods to mitigate them through techniques like forwarding and stalling.

Uploaded by

s6i893i7744
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 52

Computer

Architecture
CH4 Processor Microarchitecture (III)

Prof. Ren-Shuo Liu


NTHU EE
• Assume a 12-stage pipelined processor with 2 stall cycles
per instruction on average can achieve a 6 times higher
clock frequency than a 1-cycle processor does. What is the
speedup over the 1-cycle processor?

Instruction 1
Instruction 2
1 Instruction 3

Instruction 1

Instruction 2
Instruction 3

2
Outline
• Background
• Single-cycle design
• Pipelined design
• Pipeline concepts and MIPS's pipeline
• Cost and issues of pipelining
• Detailed pipelined datapath and control
• Trace the pipeline
• Dependencies, hazards, and forwarding
• Stalls and exceptions

3
Multi-Cycle Datapath

op, control op, control op, control

4
Consecutive Instruction Example

# int a;
# int b;
# int c;
… … # … …
80: lw t1, 0(gp) #
84: lw t2, 4(gp) #
88: nop #
92: add t3, t1, t2 # c = a + b;
96: sw t3, 8(gp) #

5
80: lw t1, 0(gp)
84: lw t2, 4(gp)
88: nop
First Cycle 92:
96:
add
sw
t3, t1, t2
t3, 8(gp)

80

op, control op, control op, control

6
80: lw t1, 0(gp)
84: lw t2, 4(gp)
88: nop
Second Cycle 92:
96:
add
sw
t3, t1, t2
t3, 8(gp)

84 lw t1, 0(gp)

op, control op, control op, control

7
80: lw t1, 0(gp)
84: lw t2, 4(gp)
88: nop
Third Cycle 92:
96:
add
sw
t3, t1, t2
t3, 8(gp)

88 lw t2, 4(gp) lw t1, 0(gp)

op, control op, control op, control

8
80: lw t1, 0(gp)
84: lw t2, 4(gp)
88: nop
Forth Cycle 92:
96:
add
sw
t3, t1, t2
t3, 8(gp)

92 nop lw t2, 4(gp) lw t1, 0(gp)

op, control op, control op, control

9
80: lw t1, 0(gp)
84: lw t2, 4(gp)
88: nop
Fifth Cycle 92:
96:
add
sw
t3, t1, t2
t3, 8(gp)

96 add t3, t1, t2 nop lw t2, 4(gp) lw t1, 0(gp)

op, control op, control op, control

10
80: lw t1, 0(gp)
84: lw t2, 4(gp)
88: nop
Sixth Cycle 92:
96:
add
sw
t3, t1, t2
t3, 8(gp)

100 sw t3, 8(gp) add t3, t1, t2 nop lw t2, 4(gp)

op, control op, control op, control

11
Simplified Pipeline Diagram

lw t1, 0(gp)
lw t2, 4(gp)
nop
add t3, t1, t2
sw t3, 8(gp)

12
Resource Usage Diagram

lw t1, 0(gp)

lw t2, 4(gp)

nop

add t3, t1, t2

sw t3, 8(gp)

13
Resource Usage Diagram

Same ALU at different cycles


Same DM at different cycles

lw t1, 0(gp)
3rd-cycle ALU produces
result for 4th-cycle DM
lw t2, 4(gp)

nop

add t3, t1, t2

sw t3, 8(gp)

14
Real Hardware Diagram

lw t1, 0(gp)

lw t2, 4(gp)

nop

add t3, t1, t2

sw t3, 8(gp)

15
Pipeline Control
• Control signals are derived from each instruction
• Just like single-cycle datapath
• Control signals are passed along just like the data

PC

16

WB

Pipeline Control •
M •

What are they?

EX

PC

17
Datapath without Control

18
Datapath with Control

19
Dependencies and Hazards
• Dependencies
inst''
• Operand of the current instruction
depends on a previous instruction's inst'
outcome
PC: inst
• Hazards
• Current PC cannot control and data
be executed right now dependencies
• Pipeline needs to postpone executing
current PC (i.e., stall) for some cycles
• Dependencies may result in hazards
• Pipeline hardware design can resolve
some hazards
• Compiler can avoid some hazards
hazards
structure
hazards 20
Anatomy of Data Dependencies

• Source operand, rd or rt
• Source instruction type,
e.g., R-type, lw, & branch
• Distance, e.g., 1~5
PC: • Destination instruction type,
e.g., R-type, lw, sw, & branch
• Destination operand, rd

21
Results
ALU lw branch

• Distance == 1
incurs a hazard
No hazards • Distance >= 2
no hazard

ALU lw sw branch

22
Detailed Analysis
• ALU - ALU

alu r3, rs1, rs2 IF ID EX r3 M WB

alu rd, r3, r4 IF ID EX M WB

23
Detailed Analysis
• ALU - LW

alu r3, rs1, rs2 IF ID EX r3 M WB

lw rd, imm(r3) IF ID EX M WB

24
Detailed Analysis
• ALU - SW

alu r3, rs1, rs2 IF ID EX r3 M WB

sw rs2, imm(r3) IF ID EX M WB

25
Detailed Analysis
• ALU - SW

alu r3, rs1, rs2 IF ID EX r3 M WB

sw r3, imm(rs1) IF ID EX M WB

26
Detailed Analysis
• ALU - Branch

alu r3, rs1, rs2 IF ID EX r3 M WB

IF ID EX M WB

beq r3, rs2, imm IF ID EX M WB

beq r3, rs2, imm IF ID EX M WB

27
Results
ALU lw branch

• Distance == 1 • Distance <= 2


incurs a hazard incurs a hazard
• Distance >= 2 • Distance >= 3
no hazard no hazard

ALU lw sw branch
• rs2 can be forwarded (no hazard)
• rs1 dependency causes no hazard if
distance >= 2 (similar to lw-R-type)
28
Analysis
• LW - ALU

lw r3, imm(rs1) IF ID EX M r3 WB

IF ID EX M WB

alu rd, r3, rs2 IF ID EX M WB

alu rd, r3, rs2 IF ID EX M WB

29
Analysis
• LW - LW

lw r3, imm(rs1) IF ID EX M r3 WB

IF ID EX M WB

lw rd, imm(r3) IF ID EX M WB

lw rd, imm(r3) IF ID EX M WB

30
Analysis
• LW - SW

lw r3, imm(rs1) IF ID EX M r3 WB

IF ID EX M WB

sw rs2, imm(r3) IF ID EX M WB

sw rs2, imm(r3) IF ID EX M WB

31
Analysis
• LW - SW

lw r3, imm(rs1) IF ID EX M r3 WB

sw r3, imm(rs1) IF ID EX M WB

sw r3, imm(rs1) IF ID EX M WB

sw r3, imm(rs1) IF ID EX M WB

32
Analysis
• LW - SW

lw r3, imm(rs1) IF ID EX M r3 WB

IF ID EX M WB

IF ID EX M WB

beq r3, rs2, imm IF ID EX M WB

33
Results
ALU lw branch

• Distance == 1
incurs a control hazard
• Distance >= 2
no hazard

ALU lw sw branch

34
Hazard Example

ALU從register
file拿到的,不
是最新的值
→ solution:
ALU改由他處
得最新的值

35
Forwarding (Bypassing)

• Write-back value
becomes rs1 or rs2
before being written
into the register file

36
Forwarding Paths
• For a 5-stage pipeline
• Forwarding destinations
• rs1 of ALU, beq, and SW
• rs2 of ALU and beq
• rs2 of SW

• Forwarding sources
• rd of the 1-cycle-earlier ALU
• rd of the 2-cycle-earlier ALU
• rd of the 1-cycle-earlier LW
• rd of the 2-cycle-earlier LW

37
• rs1 of ALU, beq, and SW

Forwarding Paths • rs2 of R-type and beq

• rs2 of SW

imm - MU
X

M
U
X

• rd of the 2-cycle- • rd of the 1-cycle-


earlier ALU and LW • rd of the 1-cycle- earlier ALU and LW
earlier ALU
38
Forwarding Control
• Take special care of
• Don't forward any result to $0, which is always zero
add $0,$1,$2
add $3,$0,$4
• Forward the latest value for double dependencies
add $1,$1,$2
add $1,$1,$3
add $1,$1,$4

39
Forwarding Control
• Fwd x to rs = f1(inst, inst', inst'')

• Fwd inst'.rd to rs
• inst ∈ {ALU, beq, lw, sw} and
• inst' ∈ {ALU} and
• inst.rs == inst'.rd and
• inst.rs != 0

• Fwd inst''.rd to rs
• inst ∈ {ALU, beq, lw, sw} and
• inst'' ∈ {ALU, lw} and
• inst.rs == inst''.rd and
• inst.rs != 0 and
• Fwd inst'.rd to ALU_in is false // not double dependency

40
Forwarding Control
• Fwd inst''.rd to inst'.rs2 = f2(inst, inst', inst'')

• Fwd inst''.rd to inst'.rs2


• inst' ∈ {sw} and
• inst'' ∈ {lw} and
• inst'.rs2 == inst''.rd and
• inst'.rs2 != 0

41
Outline
• Background
• Single-cycle design
• Pipelined design
• Pipeline concepts and MIPS's pipeline
• Cost and issues of pipelining
• Detailed pipelined datapath and control
• Trace the pipeline
• Dependencies, hazards, and forwarding
• Stalls and exceptions

42
Data Hazards That Cause Stall(s)
• Two lw cases

lw r1, imm(r2) IF ID EX MEM WB


ALU r3, r4, r1 IF ID EX MEM WB
IF ID EX MEM WB

lw r1, imm(r2) IF ID EX MEM WB


sw r3, imm(r1) IF ID EX MEM WB
IF ID EX MEM WB

43
Data Hazards That Cause Stall(s)
• Three branch cases
ALU r1, r, r IF ID EX MEM WB
beq r1, r2, imm IF ID EX MEM WB
IF ID EX MEM WB

lw r1, imm(r2) IF ID EX MEM WB


beq r1, r2, imm IF ID EX MEM WB

lw r1, imm(r2) IF ID EX MEM WB


nop IF ID EX MEM WB
beq r1, r2, imm IF ID EX MEM WB
IF ID EX MEM WB 44
How to Handle
• Detect the situations
• Stall the pipeline
• Example

PC-4 lw r1, imm(r2) IF ID EX MEM WB


PC R-type r3, r4, r1 → nop IF ID EX MEM WB
PC R-type r3, r4, r1 IF ID EX MEM WB

45
Hazard Detect and Stall

keep_PC keep_inst inst inst' inst''

nop

46
Hazard Detect and Stall

keep_PC keep_inst inst inst' inst''

nop
lw $1, 100($2)
84

nop

nop

nop
80: lw
84: addi

47
Hazard Detect and Stall

keep_PC keep_inst inst inst' inst''

nop
addi t3, t1, 123

lw t1, 100(t2)
88

nop

nop
80: lw
84: addi

48
Hazard Detect and Stall

keep_PC keep_inst inst inst' inst''

nop
addi t3, t1, 123

lw t1, 100(t2)
88

nop

nop
80: lw
84: addi

49
Hazard Detect and Stall

keep_PC keep_inst inst inst' inst''

nop

addi t3, t1, 123

lw t1, 100(t2)
8C

nop

80: lw
84: addi

50
Hazard Detect and Stall

keep_PC keep_inst inst inst' inst''

nop

addi t3, t1, 123


90

nop

80: lw
84: addi

51
Hazard Detect and Stall

keep_PC keep_inst inst inst' inst''

nop

addi t3, t1, 123


94



80: lw
84: addi

52

You might also like