Pipeline Review: Here Is The Example Instruction Sequence Used To Illustrate Pipelining On The Previous Page

The document describes data hazards that can occur in a pipelined processor when instructions have dependencies between register values. An example code sequence is given that contains read-after-write dependencies that would cause problems in a pipelined datapath. Specifically, instructions further down the pipeline rely on register values that have not been written yet. Two solutions are proposed: inserting no-ops to stall the pipeline, or forwarding register values between stages of the pipeline before they are written to the register file.

Uploaded by

bsudheertec


Pipeline Review

[Figure: pipelined datapath with the IF/ID, ID/EX, EX/MEM and MEM/WB pipeline registers. Labeled elements: PC, instruction memory, register file, sign extend, shift left 2, ALU, data memory. Control signals: PCSrc, RegWrite, ALUSrc, ALUOp, RegDst, MemRead, MemWrite, MemToReg.]

Our examples are too simple

Here is the example instruction sequence used to
illustrate pipelining on the previous page
lw $8, 4($29)
sub $2, $4, $5
and $9, $10, $11
or $16, $17, $18
add $13, $14, $0

The instructions in this example are independent
• Each instruction reads and writes completely different registers
• Our datapath handles this sequence easily
But most sequences of instructions are not independent!

An example with dependences
Read after Write dependences

sub $2, $1, $3

and $12, $2, $5
or $13, $6, $2
add $14, $2, $2
sw $15, 100($2)
Dependences are a property of how the
computation is expressed

There are several dependences in this code fragment
• The first instruction, SUB, stores a value into $2
• That register is used as a source in the rest of the instructions
This is no problem for single-cycle and multicycle datapaths
• Each instruction executes completely before the next begins
• This ensures that instructions 2 through 5 above use the new value of $2 (the sub result), just as we expect
How would this code sequence fare in our pipelined datapath?
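
The dependences listed above can also be found mechanically. The following is a minimal sketch, not part of the original slides: `parse` and `raw_dependences` are hypothetical helper names, and the parser only handles the small instruction subset used in these examples (three-register ALU operations, `lw` and `sw`).

```python
import re

def parse(instr):
    """Return (dest, sources) for the small MIPS subset used in the slides."""
    op, rest = instr.split(None, 1)
    regs = re.findall(r"\$\d+", rest)
    if op == "sw":                 # sw rt, off(rs) reads both registers, writes none
        return None, regs
    return regs[0], regs[1:]       # otherwise the first operand is the destination

def raw_dependences(program):
    """List (producer_index, consumer_index, register) read-after-write triples."""
    deps = []
    last_writer = {}               # register -> index of its most recent writer
    for i, instr in enumerate(program):
        dest, sources = parse(instr)
        for reg in dict.fromkeys(sources):      # drop repeats, keep order
            if reg in last_writer:
                deps.append((last_writer[reg], i, reg))
        if dest is not None and dest != "$0":   # $0 is never really written
            last_writer[dest] = i
    return deps

program = [
    "sub $2, $1, $3",
    "and $12, $2, $5",
    "or $13, $6, $2",
    "add $14, $2, $2",
    "sw $15, 100($2)",
]
for producer, consumer, reg in raw_dependences(program):
    print(f"{program[consumer]!r} depends on {program[producer]!r} through {reg}")
```

Run on the independent sequence from the earlier slide, the same function returns an empty list.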
Data hazards in the pipeline diagram
Clock cycle        1    2    3    4    5    6    7    8    9
sub $2, $1, $3     IF   ID   EX   MEM  WB
and $12, $2, $5         IF   ID   EX   MEM  WB
or $13, $6, $2               IF   ID   EX   MEM  WB
add $14, $2, $2                   IF   ID   EX   MEM  WB
sw $15, 100($2)                        IF   ID   EX   MEM  WB

The SUB does not write to register $2 until clock cycle 5, causing 2 data hazards in our pipelined datapath
• The AND reads register $2 in cycle 3. Since SUB hasn’t modified the register yet, this is the old value of $2
• Similarly, the OR instruction uses register $2 in cycle 4, again before it’s actually updated by SUB

Things that are okay

Clock cycle        1    2    3    4    5    6    7    8    9
sub $2, $1, $3     IF   ID   EX   MEM  WB
and $12, $2, $5         IF   ID   EX   MEM  WB
or $13, $6, $2               IF   ID   EX   MEM  WB
add $14, $2, $2                   IF   ID   EX   MEM  WB
sw $15, 100($2)                        IF   ID   EX   MEM  WB

The ADD is okay, because of the register file design
• Registers are written at the beginning of a clock cycle
• The new value will be available by the end of that cycle, so the ADD reading $2 in cycle 5 sees the value SUB writes in cycle 5
The SW is no problem at all, since it reads $2 after the SUB finishes
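
The hazard and non-hazard cases can be checked with a little arithmetic. This is a sketch under stated assumptions rather than anything from the slides: no forwarding, instruction i (counting from 0) is fetched in cycle i + 1, registers are read in ID and written in WB, and a read in the same cycle as the write sees the new value. The function names are hypothetical.

```python
# Assumed timing for the 5-stage pipeline in the diagrams (no forwarding):
# instruction i (0-based) is fetched in cycle i + 1.

def id_cycle(i):
    """Cycle in which instruction i reads its registers (ID stage)."""
    return i + 2

def wb_cycle(i):
    """Cycle in which instruction i writes its result (WB stage)."""
    return i + 5

def reads_stale_value(producer, consumer):
    """True if the consumer reads before the producer has written.

    A read in the same cycle as the write is fine, because the register
    file is written early in the cycle and read afterwards.
    """
    return id_cycle(consumer) < wb_cycle(producer)

names = ["sub", "and", "or", "add", "sw"]
for consumer in range(1, len(names)):
    status = "hazard" if reads_stale_value(0, consumer) else "okay"
    print(f"{names[consumer]:>3} reads $2 in cycle {id_cycle(consumer)}: {status}")
```

Under these assumptions the AND and OR read too early, while the ADD and SW do not.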
One Solution To Data Hazards
Original code          With no-ops inserted
sub $2, $1, $3         sub $2, $1, $3
and $12, $2, $5        sll $0, $0, $0
or $13, $6, $2         sll $0, $0, $0
add $14, $2, $2        and $12, $2, $5
sw $15, 100($2)        or $13, $6, $2
                       add $14, $2, $2
                       sw $15, 100($2)

Since it takes two instruction cycles to get the value stored, one solution is for the assembler to insert no-ops or for compilers to reorder instructions to do useful work while the pipeline proceeds
A software solution to data hazards
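
The no-op insertion an assembler might perform can be sketched as follows. This is an illustration only: `parse` and `insert_nops` are hypothetical names, the parser covers just the instruction subset used here, and `min_distance=3` encodes the assumption that a consumer must sit at least three instruction slots after its producer when there is no forwarding.

```python
import re

NOP = "sll $0, $0, $0"   # the no-op encoding used on the slide

def parse(instr):
    """Return (dest, sources) for the small MIPS subset used in the slides."""
    op, rest = instr.split(None, 1)
    regs = re.findall(r"\$\d+", rest)
    if op == "sw":                 # sw reads both registers and writes none
        return None, regs
    return regs[0], regs[1:]

def insert_nops(program, min_distance=3):
    """Pad the program so every consumer is >= min_distance slots after its producer."""
    out = []
    last_write = {}                # register -> position of its last writer in `out`
    for instr in program:
        dest, sources = parse(instr)
        needed = 0
        for reg in sources:
            if reg in last_write:
                gap = len(out) - last_write[reg]
                needed = max(needed, min_distance - gap)
        out.extend([NOP] * needed)
        out.append(instr)
        if dest is not None and dest != "$0":
            last_write[dest] = len(out) - 1
    return out

program = [
    "sub $2, $1, $3",
    "and $12, $2, $5",
    "or $13, $6, $2",
    "add $14, $2, $2",
    "sw $15, 100($2)",
]
print("\n".join(insert_nops(program)))
```

For the example sequence this inserts two no-ops after the SUB. Reordering independent instructions into those slots, as a compiler might, is not attempted here.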

A fancier pipeline diagram

Clock cycle        1    2    3    4    5    6    7    8    9
sub $2, $1, $3     IM   Reg  ALU  DM   Reg
and $12, $2, $5         IM   Reg  ALU  DM   Reg
or $13, $6, $2               IM   Reg  ALU  DM   Reg
add $14, $2, $2                   IM   Reg  ALU  DM   Reg
sw $15, 100($2)                        IM   Reg  ALU  DM   Reg

Forwarding
Since the pipeline registers already contain the ALU result, we could just forward the value to later instructions, to prevent data hazards
• In clock cycle 4, the AND instruction can get the value of $1 - $3 from the EX/MEM pipeline register used by SUB
• Then in cycle 5, the OR can get that same result from the MEM/WB pipeline register being used by SUB
Clock cycle        1    2    3    4    5    6    7
sub $2, $1, $3     IM   Reg  ALU  DM   Reg
and $12, $2, $5         IM   Reg  ALU  DM   Reg
or $13, $6, $2               IM   Reg  ALU  DM   Reg

Forwarding Implementation
Forwarding requires …
(a) Recognizing when a potential data hazard
exists, and
(b) Revising the pipeline to introduce
forwarding paths …
We’ll do those revisions next time
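
As a preview of part (a), recognizing a hazard amounts to comparing register numbers held in the pipeline registers. The sketch below is an assumption-laden illustration, not the implementation these slides go on to build: the `Stage` class, its field names and `forward_select` are hypothetical, and the rule (prefer the newer EX/MEM value over MEM/WB, never forward register 0) follows the usual textbook forwarding conditions.

```python
from dataclasses import dataclass

@dataclass
class Stage:
    """The fields of a pipeline register that the check needs (hypothetical names)."""
    reg_write: bool = False   # will this instruction write a register?
    rd: int = 0               # destination register number

def forward_select(src, ex_mem, mem_wb):
    """Pick where an ALU operand that reads register `src` should come from."""
    if ex_mem.reg_write and ex_mem.rd != 0 and ex_mem.rd == src:
        return "EX/MEM"       # the newest value wins
    if mem_wb.reg_write and mem_wb.rd != 0 and mem_wb.rd == src:
        return "MEM/WB"
    return "REGFILE"          # no hazard: use the value read in ID

# Cycle 4: AND is in EX, SUB (writes $2) is one stage ahead in EX/MEM.
print(forward_select(2, Stage(True, 2), Stage()))
# Cycle 5: OR is in EX, AND (writes $12) is in EX/MEM, SUB is in MEM/WB.
print(forward_select(2, Stage(True, 12), Stage(True, 2)))
```

The two calls correspond to the AND and OR cases from the forwarding slide and select EX/MEM and MEM/WB respectively.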

What about stores?
Two “easy” cases:
Clock cycle        1    2    3    4    5    6
add $1, $2, $3     IM   Reg  ALU  DM   Reg
sw $4, 0($1)            IM   Reg  ALU  DM   Reg

Clock cycle        1    2    3    4    5    6
add $1, $2, $3     IM   Reg  ALU  DM   Reg
sw $1, 0($4)            IM   Reg  ALU  DM   Reg

What about stores?

A harder case:
Clock cycle        1    2    3    4    5    6
lw $1, 0($2)       IM   Reg  ALU  DM   Reg
sw $1, 0($4)            IM   Reg  ALU  DM   Reg

In what cycle is:
• The load value available?
• The store value needed?
What do we have to add to the datapath?

Load/Store Bypassing: Extends Datapath
By cycling the result of Read data back to be the value for Write data, the combination
lw $1, 0($2)
sw $1, 0($4)
can operate at normal pipeline speeds … until there is a cache miss!
[Figure: data memory (Address, Write data, Read data) between the EX/MEM and MEM/WB pipeline registers, with a multiplexer controlled by ForwardC]
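
The timing behind this bypass can be worked out with a short calculation. The sketch assumes the load is fetched in cycle 1 and the store in cycle 2, and that load data sits in the MEM/WB register from the cycle after the load's MEM stage; `stage_cycle` and `stalls_after_load` are hypothetical names.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def stage_cycle(fetch_cycle, stage):
    """Cycle in which an instruction fetched in `fetch_cycle` occupies `stage`."""
    return fetch_cycle + STAGES.index(stage)

def stalls_after_load(load_fetch, user_fetch, user_stage):
    """Stall cycles before a later instruction can use the loaded value in `user_stage`."""
    # Load data is latched into MEM/WB at the end of the load's MEM cycle,
    # so a forwarding path can deliver it from the following cycle on.
    available = stage_cycle(load_fetch, "MEM") + 1
    needed = stage_cycle(user_fetch, user_stage)
    return max(0, available - needed)

print(stalls_after_load(1, 2, "MEM"))  # sw $1, 0($4): store data is used in MEM
print(stalls_after_load(1, 2, "EX"))   # an instruction that needs the value in EX
```

With these assumptions the store's data is needed in cycle 5, exactly when the loaded value becomes available, so no stall is required. An instruction that needs the value in its EX stage would be one cycle too early.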

Stalls and flushes

We have seen that data hazards can occur in pipelined CPUs when instructions depend upon others still executing
• Many hazards can be resolved by forwarding data from the pipeline registers, instead of waiting for the writeback stage
• The pipeline continues running at full speed, with one instruction beginning on every clock cycle
Now, we’ll see some real limitations of pipelining
• Forwarding may not work for data hazards from load instructions
• Branches affect the instruction fetch for the next clock cycle
In both of these cases we may need to slow down, or stall, the pipeline

What about loads?
Imagine if the first instruction in the example was LW instead of SUB
• The load data doesn’t come from memory until the end of cycle 4
• But the AND needs that value at the beginning of the same cycle!
This is a “true” data hazard: the data is simply not available when it’s needed
Clock cycle        1    2    3    4    5    6
lw $2, 20($3)      IM   Reg  ALU  DM   Reg
and $12, $2, $5         IM   Reg  ALU  DM   Reg

Stalling
The easiest solution is to stall the pipeline
We can delay the AND instruction by introducing a 1-cycle delay in the pipeline, often called a bubble
Clock cycle        1       2       3       4       5       6       7
lw $2, 20($3)      IM      Reg     ALU     DM      Reg
and $12, $2, $5            IM      Reg     bubble  ALU     DM      Reg
Notice that we’re still using forwarding in cycle 5, to get data from the MEM/WB pipeline register to the ALU
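
Deciding when to insert the bubble is again a register-number comparison. The following is a sketch of the usual textbook load-use check, assuming the load sits in the ID/EX pipeline register while the dependent instruction is being decoded; the function and parameter names are hypothetical and not taken from these slides.

```python
def must_stall(id_ex_mem_read, id_ex_rt, if_id_rs, if_id_rt):
    """True if the instruction in ID needs the result of a load that is in EX.

    id_ex_mem_read: the instruction in EX is a load (MemRead asserted)
    id_ex_rt:       register the load will write
    if_id_rs/rt:    source registers of the instruction being decoded
    """
    return bool(
        id_ex_mem_read
        and id_ex_rt != 0                       # $0 is never a real dependence
        and id_ex_rt in (if_id_rs, if_id_rt)
    )

# lw $2, 20($3) is in EX while and $12, $2, $5 is in ID.
print(must_stall(True, 2, 2, 5))
# sub $2, $1, $3 in EX instead: forwarding alone handles it.
print(must_stall(False, 2, 2, 5))
```

Only the first case requires a stall. When the producer is an ALU instruction, the result is already in EX/MEM by the time the consumer reaches EX.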

Stalling and forwarding
Without forwarding, we’d have to stall for two cycles to wait for the LW instruction’s writeback stage.
Clock cycle        1       2       3       4       5       6       7       8
lw $2, 20($3)      IM      Reg     ALU     DM      Reg
and $12, $2, $5            IM      Reg     bubble  bubble  ALU     DM      Reg
In general, you can always stall to avoid hazards, but dependencies are very common in real code, and stalling will often reduce performance significantly
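
The stall counts quoted on the last few slides follow one simple rule, sketched here. The model is an assumption made for illustration: a result can first be used by the instruction `gap` slots behind its producer, where `gap` depends on whether forwarding exists and whether the producer is a load. `stall_cycles` is a hypothetical helper.

```python
def stall_cycles(producer_is_load, distance, forwarding):
    """Stall cycles needed for a consumer `distance` instructions after its producer.

    Assumed model: a value can first be used by the instruction `gap`
    slots behind its producer, where gap is
      1 for an ALU result with forwarding (EX/MEM -> ALU),
      2 for a load with forwarding (MEM/WB -> ALU),
      3 without forwarding (register file written, then read, in one cycle).
    """
    if forwarding:
        gap = 2 if producer_is_load else 1
    else:
        gap = 3
    return max(0, gap - distance)

print(stall_cycles(producer_is_load=True, distance=1, forwarding=True))
print(stall_cycles(producer_is_load=True, distance=1, forwarding=False))
print(stall_cycles(producer_is_load=False, distance=1, forwarding=True))
```

The three calls cover the load followed immediately by its user with forwarding (one stall), the same pair without forwarding (two stalls), and an ALU producer with forwarding (no stall).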

Stalling delays the entire pipeline

If we delay the 2nd instruction, we must delay the 3rd too
• This is necessary to make forwarding work between AND and OR
• It also prevents problems such as two instructions trying to write to the same register in the same cycle
Clock cycle        1       2       3       4       5       6       7       8
lw $2, 20($3)      IM      Reg     ALU     DM      Reg
and $12, $2, $5            IM      Reg     bubble  ALU     DM      Reg
or $13, $12, $2                    IM      bubble  Reg     ALU     DM      Reg

Implementing stalls
To implement a stall we force the two instructions after LW to remain in their ID & IF stages for 1 extra cycle
Clock cycle        1    2    3    4    5    6    7    8
lw $2, 20($3)      IM   Reg  ALU  DM   Reg
and $12, $2, $5         IM   Reg  Reg  ALU  DM   Reg
or $13, $12, $2              IM   IM   Reg  ALU  DM   Reg
This is easily accomplished
• Don’t update the IF/ID register, so the ID stage is repeated
• Don’t update the PC, so the current IF stage is repeated
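
The effect of holding the PC and the IF/ID register can be seen in a small stage-occupancy simulation. This is a sketch under simplifying assumptions: instructions are plain strings, the set of load-use pairs that trigger a stall is supplied by hand rather than detected, and `simulate` is a hypothetical name.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def simulate(program, stalls_on):
    """Return, per instruction, the stage it occupies in each successive cycle.

    stalls_on: set of (load, consumer) pairs; when the load is in EX and the
    consumer is in ID, the next cycle holds IF and ID and sends a bubble to EX.
    """
    occ = dict.fromkeys(STAGES)              # stage -> instruction (None = empty)
    pc = 0
    timeline = {instr: [] for instr in program}
    while True:
        stall = (occ["EX"], occ["ID"]) in stalls_on
        nxt = {"WB": occ["MEM"], "MEM": occ["EX"]}
        if stall:
            nxt["EX"] = None                 # bubble enters EX
            nxt["ID"] = occ["ID"]            # IF/ID register not updated
            nxt["IF"] = occ["IF"]            # PC not updated
        else:
            nxt["EX"] = occ["ID"]
            nxt["ID"] = occ["IF"]
            nxt["IF"] = program[pc] if pc < len(program) else None
            pc += pc < len(program)
        occ = nxt
        if not any(occ.values()):            # pipeline has drained
            return timeline
        for stage, instr in occ.items():
            if instr is not None:
                timeline[instr].append(stage)

lw, and_, or_ = "lw $2, 20($3)", "and $12, $2, $5", "or $13, $12, $2"
for instr, stages in simulate([lw, and_, or_], {(lw, and_)}).items():
    print(f"{instr:<17}", " ".join(stages))
```

In the result the AND repeats its ID stage and the OR repeats its IF stage, matching the diagram above.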

What about EXE, MEM, WB?

But what about the ALU during cycle 4, the data memory in cycle 5, and the register file write in cycle 6?
Clock cycle        1    2    3    4    5    6    7    8
lw $2, 20($3)      IM   Reg  ALU  DM   Reg
and $12, $2, $5         IM   Reg  Reg  ALU  DM   Reg
or $13, $12, $2              IM   IM   Reg  ALU  DM   Reg
Those units aren’t used in those cycles because of the stall, so we can set the EX, MEM and WB control signals to all 0s … the bubble “bubbles” through
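
Zeroing the control signals can be sketched as a selection made when the ID/EX register is written. The signal names below come from the datapath figure at the start of these slides. The `Control` class, the `id_ex_control` helper and the control values shown for AND are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Control:
    # EX-stage signals
    reg_dst: int = 0
    alu_op: int = 0
    alu_src: int = 0
    # MEM-stage signals
    mem_read: int = 0
    mem_write: int = 0
    # WB-stage signals
    reg_write: int = 0
    mem_to_reg: int = 0

BUBBLE = Control()   # every signal 0: writes neither memory nor a register

def id_ex_control(decoded, stall):
    """Control bits written into ID/EX: zeros while stalling, else the decoded bits."""
    return BUBBLE if stall else decoded

# Assumed control values for an R-type instruction such as AND.
AND_CONTROL = Control(reg_dst=1, alu_op=0b10, reg_write=1)

print(id_ex_control(AND_CONTROL, stall=True))
print(id_ex_control(AND_CONTROL, stall=False))
```

Because the zeroed bits travel down the pipeline with the empty slot, the bubble changes no state in the MEM or WB stages.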

Stall = Nop conversion
Clock cycle        1    2    3    4    5    6    7    8
lw $2, 20($3)      IM   Reg  ALU  DM   Reg
and -> nop              IM   Reg  ALU  DM   Reg
and $12, $2, $5              IM   Reg  ALU  DM   Reg
or $13, $12, $2                   IM   Reg  ALU  DM   Reg
The effect of a load stall is to insert an empty or nop instruction into the pipeline
