CS 162 Computer Architecture

Lecture 3: Pipelining Contd.

Instructor: L.N. Bhuyan

[Link]/~bhuyan/cs162

1 1999 ©UCB
Single Cycle Datapath (From Ch 5)
[Figure: single-cycle MIPS datapath. Components: PC, Imem, Regs, Sign Extend, ALU with ALU control, Dmem, adders, a shift-left-2 unit, and multiplexors. Instruction bits 25:21 and 20:16 select Read Reg1 and Read Reg2, bits 15:11 supply a write register, and bits 15:0 feed Sign Extend. Control signals: PCSrc, RegDst, RegWrite, ALUSrc, ALUOp, MemRead, MemWrite, MemToReg.]

Required Changes to Datapath
° Introduce registers to separate the 5 stages by placing IF/ID, ID/EX, EX/MEM, and MEM/WB registers in the datapath.
° The next PC value is computed in the 3rd step, but the next instruction must be fetched in the next cycle, so move the PCSrc mux to the 1st stage. The PC is incremented unless there is a new branch address.
° The branch address is computed in the 3rd stage. With pipelining, the PC value has changed by then, so the PC value must be carried along with the instruction. Width of IF/ID register = (IR: 32) + (PC: 32) = 64 bits.
Changes to Datapath Contd.
° For the lw instruction, the write register address is needed at stage 5, but by then the IR is occupied by another instruction. So the destination field of the IR must be carried along as the instruction moves through the stages. See the connection in the figure.
Length of ID/EX register = (Reg1: 32) + (Reg2: 32) + (offset: 32) + (PC: 32) + (destination register: 5) = 133 bits
Assignment: What are the lengths of the EX/MEM and MEM/WB registers?
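The register-width arithmetic can be checked with a short sketch. The IF/ID and ID/EX field lists come from the slides; the EX/MEM and MEM/WB field lists are assumptions based on the usual 5-stage MIPS datapath, chosen so that the totals agree with the 102-bit and 69-bit widths annotated on the pipelined datapath figure. Control bits are not counted, and the names `PIPELINE_REGISTER_FIELDS` and `register_width` are made up for this illustration.

```python
# Data-field widths in bits for each pipeline register.
# IF/ID and ID/EX follow the slide text. The EX/MEM and MEM/WB field lists
# are assumptions (typical 5-stage MIPS datapath), not taken from the slides.
PIPELINE_REGISTER_FIELDS = {
    "IF/ID": {"instruction": 32, "pc": 32},
    "ID/EX": {"reg1": 32, "reg2": 32, "offset": 32, "pc": 32, "dest_reg": 5},
    "EX/MEM": {"branch_target": 32, "zero": 1, "alu_result": 32,
               "store_data": 32, "dest_reg": 5},
    "MEM/WB": {"mem_data": 32, "alu_result": 32, "dest_reg": 5},
}


def register_width(name):
    """Sum the data-field widths of one pipeline register (control bits excluded)."""
    return sum(PIPELINE_REGISTER_FIELDS[name].values())


if __name__ == "__main__":
    for name in PIPELINE_REGISTER_FIELDS:
        print(f"{name}: {register_width(name)} bits")
```

Listing the fields explicitly makes it easy to see which values each stage still needs, for example that the 5-bit destination register number travels all the way to MEM/WB.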

Pipelined Datapath (with Pipeline Regs) (6.2)
[Figure: pipelined MIPS datapath with five stages (Fetch, Decode, Execute, Memory, Write Back) separated by the pipeline registers IF/ID, ID/EX, EX/MEM, and MEM/WB. Components: PC, Imem, Regs, sign extend (16 to 32 bits), shift left 2, adders, ALU with Zero output, Dmem, and multiplexors.]
Pipeline register widths: IF/ID = 64 bits, ID/EX = 133 bits, EX/MEM = 102 bits, MEM/WB = 69 bits
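Pipelined Control (6.3)
• Start with single-cycle controller
• Group control lines by pipeline stage needed
• Extend pipeline registers with control bits
[Figure: the Control unit decodes the instruction and produces EX, Mem, and WB groups of control bits, which are carried through the ID/EX, EX/MEM, and MEM/WB registers until the stage that uses them.]
Control lines by stage:
EX: RegDst, ALUOp, ALUSrc
Mem: Branch, MemRead, MemWrite
WB: MemToReg, RegWrite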
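Grouping control lines by stage determines how many control bits each pipeline register has to carry. The sketch below counts them. The signal grouping follows the slide; the bit widths are assumptions (ALUOp taken as 2 bits, every other signal as 1 bit), and the names `CONTROL_BY_STAGE`, `REMAINING_STAGES`, and `control_bits` are hypothetical.

```python
# Control signals grouped by the stage that consumes them, as on the slide.
# Bit widths are assumptions: ALUOp is taken to be 2 bits, the rest 1 bit each.
CONTROL_BY_STAGE = {
    "EX": {"RegDst": 1, "ALUOp": 2, "ALUSrc": 1},
    "MEM": {"Branch": 1, "MemRead": 1, "MemWrite": 1},
    "WB": {"MemToReg": 1, "RegWrite": 1},
}

# Stages that still lie ahead of each pipeline register; their control bits
# must be stored in that register.
REMAINING_STAGES = {
    "ID/EX": ["EX", "MEM", "WB"],
    "EX/MEM": ["MEM", "WB"],
    "MEM/WB": ["WB"],
}


def control_bits(pipeline_register):
    """Number of control bits the given pipeline register must hold."""
    return sum(
        sum(CONTROL_BY_STAGE[stage].values())
        for stage in REMAINING_STAGES[pipeline_register]
    )


if __name__ == "__main__":
    for reg in REMAINING_STAGES:
        print(f"{reg}: {control_bits(reg)} control bits")
```

Under these assumptions the control field shrinks as an instruction advances, because each stage consumes its own group and passes on only what later stages need.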
Pipelined Processor: Datapath + Control
• More work to correctly handle pipeline hazards
[Figure: the pipelined datapath with the Control unit added. The WB, M, and EX control groups travel through the ID/EX, EX/MEM, and MEM/WB registers. Labeled signals: PCSrc, RegWrite, Branch, MemWrite, MemRead, MemToReg, ALUSrc, ALUOp, RegDst. Instruction [15-0] feeds the sign extender (16 to 32 bits) and ALU control; Instruction [20-16] and Instruction [15-11] feed the RegDst mux.]
Recap
° If all pipeline stages can be kept busy, the pipeline can retire (complete) up to one instruction per clock cycle (thereby achieving single-cycle throughput)
° The pipeline paradox (for MIPS): any instruction still takes 5 cycles to execute (even though the pipeline can retire one instruction per cycle)
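The difference between latency (5 cycles per instruction) and throughput (up to one instruction per cycle) can be made concrete with a small calculation. This is a sketch assuming an ideal pipeline in which the first instruction needs `depth` cycles and each later one completes one cycle after its predecessor; `pipelined_cycles` and `average_cpi` are illustrative names, not part of the lecture.

```python
def pipelined_cycles(num_instructions, depth=5, stall_cycles=0):
    """Total cycles to finish a run of instructions on a `depth`-stage pipeline.

    The first instruction takes `depth` cycles; every following instruction
    retires one cycle later. `stall_cycles` adds any bubbles on top.
    """
    if num_instructions == 0:
        return 0
    return depth + (num_instructions - 1) + stall_cycles


def average_cpi(num_instructions, depth=5, stall_cycles=0):
    """Average cycles per instruction over the whole run."""
    return pipelined_cycles(num_instructions, depth, stall_cycles) / num_instructions


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, pipelined_cycles(n), round(average_cpi(n), 3))
```

A single instruction still costs 5 cycles, but the average CPI approaches 1 as the run gets longer, which is the paradox stated above.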

Problems for Pipelining
° Hazards prevent next instruction from
executing during its designated clock
cycle, limiting speedup
• Structural hazards: HW cannot support
this combination of instructions (single
memory for instruction and data)
• Data hazards: Instruction depends on
result of prior instruction still in the
pipeline
• Control hazards: conditional branches &
other instructions may stall the pipeline
delaying later instructions

Single Memory is a Structural Hazard
[Figure: pipeline timing diagram, time (clock cycles) across and instruction order down, for Load, Instr 1, Instr 2, Instr 3, and Instr 4. Each row uses the stages M, Reg, ALU, M, Reg, offset by one cycle per instruction.]
• Can’t read same memory twice in same clock cycle
EX: MIPS multicycle datapath: Structural Hazard in Memory
[Figure: MIPS multicycle datapath in which the PC addresses a single memory holding both instructions and data. Other labeled components: Instruction Register, Memory Data Register, Registers (Read Reg1, Read Reg2, Write Reg, Write Data, Read data 1, Read data 2), the A and B registers, ALU, and ALUOut.]
Structural Hazards limit performance
° Example: if there are 1.3 memory accesses per instruction (30% of instructions execute loads and stores) and only one memory access per cycle, then
• Average CPI ≥ 1.3
• Otherwise the datapath resource would be more than 100% utilized
Structural Hazard Solution: Add more Hardware
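The CPI bound in this example follows from dividing memory accesses per instruction by the number of accesses the hardware can serve per cycle. A minimal sketch, assuming memory is the only contended resource and that the ideal CPI is 1 (`min_cpi` and `mem_ports` are names introduced here for illustration):

```python
def min_cpi(mem_accesses_per_instr, mem_ports=1, ideal_cpi=1.0):
    """Lower bound on average CPI imposed by memory bandwidth.

    Each instruction needs `mem_accesses_per_instr` accesses on average and
    the memory can serve `mem_ports` accesses per cycle, so an instruction
    cannot complete in fewer than accesses / ports cycles on average.
    """
    return max(ideal_cpi, mem_accesses_per_instr / mem_ports)


if __name__ == "__main__":
    print(min_cpi(1.3))               # single memory access per cycle
    print(min_cpi(1.3, mem_ports=2))  # more hardware: two accesses per cycle
```

With a second port (or separate instruction and data memories) the bound falls back to the ideal CPI, which is the "add more hardware" solution.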
Speed Up Equation for Pipelining

\[ \text{CPI}_{\text{pipelined}} = \text{Ideal CPI} + \text{Pipeline stall clock cycles per instruction} \]

\[ \text{Speedup} = \frac{\text{Ideal CPI} \times \text{Pipeline depth}}{\text{Ideal CPI} + \text{Pipeline stall CPI}} \times \frac{\text{Clock Cycle}_{\text{unpipelined}}}{\text{Clock Cycle}_{\text{pipelined}}} \]

For Ideal CPI = 1:

\[ \text{Speedup} = \frac{\text{Pipeline depth}}{1 + \text{Pipeline stall CPI}} \times \frac{\text{Clock Cycle}_{\text{unpipelined}}}{\text{Clock Cycle}_{\text{pipelined}}} \]
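One way to see where the speedup equation comes from is to compare the average time per instruction on the two machines. The derivation below is a sketch that assumes the unpipelined machine needs Ideal CPI × Pipeline depth cycles per instruction; that assumption is not stated on the slide. "Pipeline stall CPI" abbreviates pipeline stall clock cycles per instruction.

```latex
\begin{align*}
\text{Speedup}
  &= \frac{\text{CPI}_{\text{unpipelined}} \times \text{Clock Cycle}_{\text{unpipelined}}}
          {\text{CPI}_{\text{pipelined}} \times \text{Clock Cycle}_{\text{pipelined}}} \\
  &= \frac{\text{Ideal CPI} \times \text{Pipeline depth}}
          {\text{Ideal CPI} + \text{Pipeline stall CPI}}
     \times \frac{\text{Clock Cycle}_{\text{unpipelined}}}
                 {\text{Clock Cycle}_{\text{pipelined}}}
\end{align*}
```

The second line substitutes the assumed unpipelined CPI in the numerator and the definition of the pipelined CPI in the denominator.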
Example: Dual-port vs. Single-port
° Machine A: Dual ported memory
° Machine B: Single ported memory, but its pipelined implementation has a 1.05 times faster clock rate
° Ideal CPI = 1 for both
° Loads are 40% of instructions executed
SpeedUpA = Pipeline Depth/(1 + 0) x (clockunpipe/clockpipe)
         = Pipeline Depth
SpeedUpB = Pipeline Depth/(1 + 0.4 x 1) x (clockunpipe/(clockunpipe/1.05))
         = (Pipeline Depth/1.4) x 1.05
         = 0.75 x Pipeline Depth
SpeedUpA / SpeedUpB = Pipeline Depth/(0.75 x Pipeline Depth) = 1.33

° Machine A is 1.33 times faster
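The arithmetic of this example can be reproduced with a few lines. This is a sketch: `pipeline_speedup` is a hypothetical helper that evaluates the speedup equation, the pipeline depth of 5 is an arbitrary choice (it cancels in the ratio), and each load on Machine B is assumed to cost one stall cycle, matching the 0.4 x 1 term above.

```python
def pipeline_speedup(depth, stall_cpi, clock_ratio=1.0, ideal_cpi=1.0):
    """Speedup over the unpipelined machine.

    clock_ratio is Clock Cycle_unpipelined / Clock Cycle_pipelined.
    """
    return (ideal_cpi * depth) / (ideal_cpi + stall_cpi) * clock_ratio


depth = 5  # arbitrary: the depth cancels when the two machines are compared

# Machine A: dual-ported memory, no structural stalls.
speedup_a = pipeline_speedup(depth, stall_cpi=0.0)

# Machine B: single-ported memory, 40% loads with 1 stall cycle each,
# and a clock that is 1.05 times faster.
speedup_b = pipeline_speedup(depth, stall_cpi=0.4 * 1, clock_ratio=1.05)

if __name__ == "__main__":
    print(f"SpeedUpB / depth = {speedup_b / depth:.2f}")       # 0.75
    print(f"SpeedUpA / SpeedUpB = {speedup_a / speedup_b:.2f}")  # 1.33
```

The faster clock on Machine B recovers only part of what the structural stalls cost.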

add $1, $2, $3
sub $4, $1, $3
and $6, $1, $7
or $8, $1, $9
xor $10, $1, $11
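In this sequence every instruction after the add reads $1, the register the add writes. The sketch below lists those read-after-write dependences mechanically. It handles only three-register instructions of the form shown, and the helper names `parse` and `raw_dependences` are made up for the illustration.

```python
import re

PROGRAM = [
    "add $1, $2, $3",
    "sub $4, $1, $3",
    "and $6, $1, $7",
    "or $8, $1, $9",
    "xor $10, $1, $11",
]


def parse(instr):
    """Split a three-register instruction into (opcode, destination, sources)."""
    op, *regs = re.findall(r"[a-z]+|\$\d+", instr)
    return op, regs[0], regs[1:]


def raw_dependences(program):
    """Return (producer_index, consumer_index, register) for each RAW dependence."""
    deps = []
    last_writer = {}  # register name -> index of the latest instruction writing it
    for i, instr in enumerate(program):
        _, dest, sources = parse(instr)
        for src in dict.fromkeys(sources):  # drop duplicate sources, keep order
            if src in last_writer:
                deps.append((last_writer[src], i, src))
        last_writer[dest] = i
    return deps


if __name__ == "__main__":
    for producer, consumer, reg in raw_dependences(PROGRAM):
        print(f"{PROGRAM[consumer]!r} reads {reg} written by {PROGRAM[producer]!r}")
```

Whether a given dependence becomes a hazard depends on how close the consumer is to the producer in the pipeline.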

Data Hazard Solution:
• “Forward” result from one stage to another
[Figure: pipeline timing diagram, time (clock cycles) across and instruction order down, with stages IF, ID/RF, EX, MEM, WB (drawn as IM, Reg, ALU, DM, Reg) for the sequence:
add $1,$2,$3
sub $4,$1,$3
and $6,$1,$7
or $8,$1,$9
xor $10,$1,$11]
• “or” is OK if the register file is implemented properly (for example, written in the first half of the clock cycle and read in the second half)
Hazard Detection for Forwarding
° A hazard must be detected just before execution so that, in case of a hazard, the data can be forwarded to the input of the ALU.
° It can be detected when a source register (Rs or Rt or both) of the instruction in the EX stage is equal to the destination register (Rd) of an instruction further along in the pipeline (in either the MEM or WB stage).
° Compare the Rs and Rt register numbers in the ID/EX register with Rd in the EX/MEM and MEM/WB registers => the Rs, Rt, and Rd register numbers must be carried from the IF/ID register to the ID/EX register (only Rd was carried before).
° If they match, forward the data to the input of the ALU through the multiplexor.

See Fig. 6.43, p. 488 of the text
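The comparison described above can be sketched as a small selection function for one ALU operand. The register-number comparisons follow the slide. The RegWrite checks, the exclusion of register 0, and the priority given to the EX/MEM result are assumptions borrowed from the usual textbook forwarding unit, and `forward_select` is a hypothetical name.

```python
def forward_select(id_ex_src, ex_mem_rd, ex_mem_regwrite, mem_wb_rd, mem_wb_regwrite):
    """Choose where one ALU operand (Rs or Rt of the instruction in EX) comes from.

    Returns "EX/MEM", "MEM/WB", or "REGFILE".
    Assumptions beyond the slide: a match counts only if the older instruction
    really writes a register (RegWrite), register 0 is never forwarded, and
    the most recent result (EX/MEM) wins when both stages match.
    """
    if ex_mem_regwrite and ex_mem_rd != 0 and ex_mem_rd == id_ex_src:
        return "EX/MEM"
    if mem_wb_regwrite and mem_wb_rd != 0 and mem_wb_rd == id_ex_src:
        return "MEM/WB"
    return "REGFILE"


if __name__ == "__main__":
    # sub $4,$1,$3 in EX while add $1,$2,$3 is in MEM: Rs = 1 matches Rd = 1.
    print(forward_select(1, ex_mem_rd=1, ex_mem_regwrite=True,
                         mem_wb_rd=0, mem_wb_regwrite=False))
```

The function would be evaluated twice per instruction, once with Rs and once with Rt, to drive the two ALU input multiplexors.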

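[Figure: pipeline timing diagram with stages IF, ID/RF, EX, MEM, WB (drawn as IM, Reg, ALU, DM, Reg) for:
lw $1,0($2)
sub $4,$1,$3]
• Can’t solve with forwarding alone: the loaded value is not available until the end of the lw's MEM stage, which is after the sub's EX stage begins
• Must stall instruction dependent on load
• “Load-Use” hazard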
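Detecting a load-use hazard amounts to checking whether the instruction being decoded reads the register that a load in the EX stage is about to write. The sketch below assumes the usual textbook condition (MemRead set in ID/EX and the load's Rt equal to the following instruction's Rs or Rt); the function and parameter names are hypothetical.

```python
def load_use_stall(id_ex_memread, id_ex_rt, if_id_rs, if_id_rt):
    """True when the instruction in ID must wait one cycle for a load in EX.

    id_ex_memread: the instruction in EX is a load.
    id_ex_rt:      the register number that load will write.
    if_id_rs/rt:   source register numbers of the instruction in ID.
    """
    return id_ex_memread and id_ex_rt in (if_id_rs, if_id_rt)


if __name__ == "__main__":
    # lw $1,0($2) in EX, sub $4,$1,$3 in ID: sub reads $1, so it must stall.
    print(load_use_stall(True, 1, 1, 3))
```

When the condition holds, the pipeline holds the dependent instruction for one cycle; after that, forwarding from MEM/WB can supply the loaded value.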
Data Hazard Even with Forwarding
• Must stall pipeline 1 cycle (insert 1 bubble)
[Figure: pipeline timing diagram, time (clock cycles) across, with stages IF, ID/RF, EX, MEM, WB (drawn as IM, Reg, ALU, DM, Reg). One bubble delays each instruction after the load:
lw $1, 0($2)
sub $4,$1,$6 (bubble)
and $6,$1,$7 (bubble)
or $8,$1,$9 (bubble)]
Compiler Schemes to Improve Load Delay
° The compiler detects the data dependency and inserts nop instructions until the data is available
sub $2, $1, $3
nop
and $12, $2, $5
or $13, $6, $2
add $14, $2, $2
sw $15, 100($2)
° The compiler finds independent instructions to fill in the delay slots
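A compiler pass that pads dependent instructions with nops can be sketched as follows. This version assumes a pipeline with forwarding, so only a load followed immediately by a consumer needs a single nop; a pipeline without forwarding would need a different distance check. The operand parsing is simplified, the pass is conservative for `sw`, and the names `regs`, `sources`, and `insert_load_nops` are illustrative.

```python
import re


def regs(instr):
    """All register names appearing in an instruction, in order."""
    return re.findall(r"\$\d+", instr)


def sources(instr):
    """Registers an instruction reads (sw reads all of its registers)."""
    op = instr.split()[0]
    r = regs(instr)
    return r if op == "sw" else r[1:]


def insert_load_nops(program):
    """Insert one nop after each lw whose result the next instruction reads."""
    out = []
    for i, instr in enumerate(program):
        out.append(instr)
        is_load = instr.split()[0] == "lw"
        if is_load and i + 1 < len(program):
            if regs(instr)[0] in sources(program[i + 1]):
                out.append("nop")
    return out


if __name__ == "__main__":
    for line in insert_load_nops(["lw $1, 0($2)", "sub $4, $1, $3"]):
        print(line)
```

A scheduling compiler would first try to move an independent instruction into the slot and fall back to a nop only when none is available.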
Software Scheduling to Avoid Load Hazards
Try producing fast code for
a = b + c;
d = e - f;
assuming a, b, c, d, e, and f are in memory.
Slow code:          Fast code:
LW  Rb,b            LW  Rb,b
LW  Rc,c            LW  Rc,c
ADD Ra,Rb,Rc        LW  Re,e
SW  a,Ra            ADD Ra,Rb,Rc
LW  Re,e            LW  Rf,f
LW  Rf,f            SW  a,Ra
SUB Rd,Re,Rf        SUB Rd,Re,Rf
SW  d,Rd            SW  d,Rd
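The benefit of the reordering can be checked by counting load-use stalls in both versions. The sketch below assumes a pipeline with forwarding, where a stall occurs only when an instruction reads a register loaded by the instruction immediately before it. The operand order follows the slide (`SW a,Ra` stores register Ra), and `parse` and `load_use_stalls` are hypothetical helpers.

```python
SLOW = ["LW Rb,b", "LW Rc,c", "ADD Ra,Rb,Rc", "SW a,Ra",
        "LW Re,e", "LW Rf,f", "SUB Rd,Re,Rf", "SW d,Rd"]

FAST = ["LW Rb,b", "LW Rc,c", "LW Re,e", "ADD Ra,Rb,Rc",
        "LW Rf,f", "SW a,Ra", "SUB Rd,Re,Rf", "SW d,Rd"]


def parse(line):
    """Return (opcode, destination register or None, source registers)."""
    op, operands = line.split(None, 1)
    fields = [f.strip() for f in operands.split(",")]
    if op == "LW":
        return op, fields[0], []       # loads from a named memory variable
    if op == "SW":
        return op, None, [fields[1]]   # stores the register given second
    return op, fields[0], fields[1:]   # ADD / SUB: dest, src1, src2


def load_use_stalls(code):
    """Count instructions that read a register loaded by the previous instruction."""
    stalls = 0
    for prev, cur in zip(code, code[1:]):
        prev_op, prev_dest, _ = parse(prev)
        _, _, cur_sources = parse(cur)
        if prev_op == "LW" and prev_dest in cur_sources:
            stalls += 1
    return stalls


if __name__ == "__main__":
    print("slow:", load_use_stalls(SLOW))
    print("fast:", load_use_stalls(FAST))
```

Both versions execute the same eight instructions; the fast version only changes their order so that no load is followed directly by its consumer.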
