Pipelining and Pipelining Hazards

This document summarizes a lecture on pipelining and pipelining hazards in computer architecture. It discusses how pipelining improves throughput by allowing multiple instructions to be processed simultaneously across different stages. However, pipelining can introduce hazards such as structural hazards when resources are busy, data hazards due to dependencies between instructions, and control hazards with branches. These hazards are addressed through techniques like stalling the pipeline, forwarding results, code reordering, and dynamic branch prediction.


CE-820 Spring 2023

Advanced Computer Architecture

Lecture # 07
Pipelining and Pipelining Hazards

Muhammad Imran
[email protected]
Acknowledgement

▪ Content from the following has been used in these lectures
▪ Computer Organization and Design (RISC-V Edition), Patterson and Hennessy
▪ Computer Architecture: A Quantitative Approach, 6th Edition, Hennessy and Patterson

Contents

▪ Pipelining to improve throughput
▪ Pipelining hazards

Pipelining to Improve Throughput
A Laundry Analogy

▪ Pipelining helps execute multiple tasks in parallel
▪ Improves throughput …

Improvement by Pipelining

▪ Pipelining only improves throughput
▪ Individual tasks still take the same amount of time
▪ When the number of tasks is large and the pipeline stages are perfectly balanced,
▪ Improvement in performance ≈ Number of pipeline stages
▪ Example
▪ Suppose each stage in the laundry takes 10 minutes and there are 4 stages
▪ Without pipelining: 4 loads take 40×4 = 160 minutes!
▪ With pipelining: 4 loads take 70 minutes!
▪ With a large number of loads, the improvement approaches 4 times!

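
The laundry numbers above can be checked with a small script (a sketch, not from the lecture; the function names are my own):

```python
# Time for n loads through a k-stage pipeline where every stage takes
# the same time t (perfectly balanced stages).

def unpipelined_time(n_loads, n_stages, stage_time):
    # Each load must finish all stages before the next load starts.
    return n_loads * n_stages * stage_time

def pipelined_time(n_loads, n_stages, stage_time):
    # The first load fills the pipeline; each later load finishes
    # one stage time after the previous one.
    return (n_stages + n_loads - 1) * stage_time

print(unpipelined_time(4, 4, 10))  # 160 minutes
print(pipelined_time(4, 4, 10))    # 70 minutes
```

For a very large number of loads the ratio of the two times approaches 4, the number of stages.
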
Pipelining in RISC-V

▪ Instruction execution stages
1. Fetch instruction from memory.
2. Read registers and decode the instruction.
3. Execute the operation or calculate an address.
4. Access an operand in data memory (if necessary).
5. Write the result into a register (if necessary).
▪ Therefore, we can implement a five-stage pipeline for RISC-V!
▪ RISC-V pipeline stages are not perfectly balanced!

Pipelining in RISC-V

▪ Example of RISC-V instruction execution time!
▪ Assume the multiplexors, control unit, PC access, and sign-extension unit have no delay!
▪ Load takes the longest!
▪ For a single-cycle design, the clock period must equal the load's total execution time!
▪ Among individual stages, the longest time is 200 ps!
▪ The clock period in the pipelined design must be at least 200 ps (+ tclk2q + ts), although some stages take only 100 ps!

Pipelining in RISC-V

▪ Time to execute three instructions
▪ Without pipelining: 2400 ps
▪ After pipelining: 1400 ps
▪ Time between the first and third instruction = 2×200 ps = 400 ps

Pipelining in RISC-V

▪ How about adding 1,000,000 more instructions (to the current 3)?
▪ Each added instruction contributes 200 ps; total execution time with pipelining would be 1400 ps + 200×1,000,000 ps = 200,001,400 ps
▪ For the non-pipelined design, execution time = 2400 ps + 1,000,000×800 ps = 800,002,400 ps
▪ Improvement is about 4 times, equal to the reduction in clock period (800 ps down to 200 ps), given the imbalanced stages!

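
The arithmetic on this slide can be reproduced directly (a sketch using the slide's own constants; the function names are mine):

```python
# Pipelined: 1400 ps for the first three instructions, then one more
# instruction completes every 200 ps clock cycle.
# Unpipelined: 800 ps per instruction.

def pipelined_total(extra_instructions):
    return 1400 + 200 * extra_instructions

def unpipelined_total(extra_instructions):
    return 2400 + 800 * extra_instructions

print(pipelined_total(1_000_000))    # 200001400 ps
print(unpipelined_total(1_000_000))  # 800002400 ps
print(unpipelined_total(1_000_000) / pipelined_total(1_000_000))  # ~4.0
```
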
Pipelining in RISC-V

▪ Implementing RISC-V pipelining
▪ RISC-V instructions are easier to pipeline because of
▪ Fixed instruction size
▪ Easy to fetch and decode!
▪ Fixed location of source/destination operands
▪ Memory operands appear only in loads/stores
▪ The execution stage can be used for address calculation!
▪ Access memory in the last stage!
▪ Allowing memory operands in other instructions would increase the number of pipeline stages / imbalance!
▪ Fewer instruction formats!
▪ The x86 architecture has variable instruction sizes
▪ Hard to pipeline
▪ x86 instructions are translated into RISC-like micro-operations to implement pipelining

Pipelining is a programmer-invisible technique for performance improvement …

Pipelining Hazards
What are hazards?

▪ Situations when the next instruction in the pipeline cannot correctly execute in the following cycle
▪ Three types of hazards
▪ Structural hazards
▪ Data hazards
▪ Control hazards

Structural Hazards

▪ When the hardware cannot execute a combination of instructions in the same cycle
▪ The required hardware resource is busy!
▪ Example
▪ If we had a single memory for instructions and data in RISC-V
▪ A memory access from one instruction and the instruction fetch of another couldn't execute in the same cycle
▪ The RISC-V instruction set was designed to be pipelined
▪ Easier to avoid structural hazards in a pipelined implementation

Data Hazards

▪ When an instruction cannot execute because the data it requires is not yet available
▪ Dependence of one instruction on an earlier instruction in the pipeline
▪ Example
▪ add x1, x2, x3
▪ sub x4, x1, x5
▪ The first instruction writes x1 back only in its last stage, after the second instruction has already needed it!
▪ Without addressing the data hazard, an outdated value of x1 will be read!
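
A RAW dependence like the one above can be detected mechanically. The sketch below is illustrative only (the instruction encoding and hazard window are my own assumptions, not from the lecture): in a classic five-stage pipeline without forwarding, a register written by one instruction is not safely readable by the next two instructions.

```python
# Detect read-after-write (RAW) hazards in a short instruction sequence.
# Each instruction is (dest_reg, [source_regs]); dest is None for stores.

def raw_hazards(instrs, window=2):
    """Return (producer, consumer) index pairs within `window` instructions."""
    hazards = []
    for i, (dest, _) in enumerate(instrs):
        if dest is None:
            continue
        for j in range(i + 1, min(i + 1 + window, len(instrs))):
            if dest in instrs[j][1]:
                hazards.append((i, j))
    return hazards

seq = [("x1", ["x2", "x3"]),  # add x1, x2, x3
       ("x4", ["x1", "x5"])]  # sub x4, x1, x5
print(raw_hazards(seq))  # [(0, 1)] -- sub needs x1 before add writes it back
```
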


Addressing Data Hazards

▪ Simple solution
▪ Stall the pipeline!
▪ Wait until the data has been written
▪ Pipeline Stall or Bubble – a pipeline stall (wait) initiated to resolve a hazard
▪ Need to wait 3 cycles, which slows down execution!
▪ The compiler can insert stalls to resolve data hazards!
▪ Can we do better?

Addressing Data Hazards

▪ Forwarding or bypassing
▪ Forward the data to the next instruction as soon as it is available
▪ Do not wait for the write-back stage to complete!
▪ Forwarding only works if the destination stage is later in time than the source stage!

Addressing Data Hazards

▪ Forwarding cannot prevent all pipeline stalls!
▪ Example
▪ ld x1, 0(x2)
▪ sub x4, x1, x5
▪ x1 is not available for forwarding when the second instruction needs it!
▪ Known as a load-use data hazard!

Addressing Data Hazards

▪ Code reordering to prevent stalls
▪ Example
▪ Consider the following C code
▪ a = b + e;
▪ c = b + f;
▪ Compiled to the following RISC-V code
▪ ld x1, 0(x31)   # Load b
▪ ld x2, 8(x31)   # Load e
▪ add x3, x1, x2  # b + e
▪ sd x3, 24(x31)  # Store a
▪ ld x4, 16(x31)  # Load f
▪ add x5, x1, x4  # b + f
▪ sd x5, 32(x31)  # Store c
▪ Which instructions have data hazards?
▪ How many hazards are there with or without forwarding?

Addressing Data Hazards

▪ Code reordering to prevent stalls
▪ Example
▪ Data hazards when we can use forwarding

  Original:          Reordered:
  ld x1, 0(x31)      ld x1, 0(x31)
  ld x2, 8(x31)      ld x2, 8(x31)
  add x3, x1, x2     ld x4, 16(x31)
  sd x3, 24(x31)     add x3, x1, x2
  ld x4, 16(x31)     sd x3, 24(x31)
  add x5, x1, x4     add x5, x1, x4
  sd x5, 32(x31)     sd x5, 32(x31)

▪ How can we reorder the code to avoid stalls?
▪ Simply moving the third ld earlier in the sequence avoids both hazards!
▪ How much does this improve performance?
▪ The new code sequence executes two cycles faster (assuming that forwarding is used)!

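
Counting the load-use stalls before and after reordering can be sketched as follows (assuming forwarding, so only a load immediately followed by a consumer of its result stalls; the dictionary encoding is my own, not from the lecture):

```python
# One bubble per load-use pair: a load whose destination register is a
# source of the very next instruction.

def load_use_stalls(instrs):
    stalls = 0
    for prev, cur in zip(instrs, instrs[1:]):
        if prev["op"] == "ld" and prev["dest"] in cur["srcs"]:
            stalls += 1
    return stalls

original = [
    {"op": "ld",  "dest": "x1", "srcs": ["x31"]},        # ld x1, 0(x31)
    {"op": "ld",  "dest": "x2", "srcs": ["x31"]},        # ld x2, 8(x31)
    {"op": "add", "dest": "x3", "srcs": ["x1", "x2"]},   # add x3, x1, x2
    {"op": "sd",  "dest": None, "srcs": ["x3", "x31"]},  # sd x3, 24(x31)
    {"op": "ld",  "dest": "x4", "srcs": ["x31"]},        # ld x4, 16(x31)
    {"op": "add", "dest": "x5", "srcs": ["x1", "x4"]},   # add x5, x1, x4
    {"op": "sd",  "dest": None, "srcs": ["x5", "x31"]},  # sd x5, 32(x31)
]
# Move the third load (index 4) up, as on the slide.
reordered = original[:2] + [original[4]] + original[2:4] + original[5:]
print(load_use_stalls(original), load_use_stalls(reordered))  # 2 0
```
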
Addressing Data Hazards

▪ Forwarding implementation in RISC-V
▪ RISC-V instructions write at most one result!
▪ The result is written in the last stage!
▪ Forwarding is harder if
▪ More than one result must be forwarded per instruction!

Control Hazards

▪ When a conditional branch instruction must decide which instruction should be next
▪ The next instruction cannot be fetched until the branch decision is finalized!
▪ Also known as branch hazards
▪ Two important considerations
▪ When is the branch target known (computed)?
▪ When is the branch decision known (test evaluated)?
▪ These two factors determine the penalties (additional cycles) for control hazards!

Addressing Control Hazards

▪ The branch decision and target address can be finalized in the second stage by adding extra hardware
▪ Still, the pipeline may need to be stalled!
▪ Example

Addressing Control Hazards

▪ What is the drawback of finalizing the branch decision in the ID (second) stage?
▪ It can introduce new data hazards!
▪ If the branch depends on an earlier instruction!
▪ Forwarding to the second stage resolves fewer of those hazards because the destination stage is too early!

Addressing Control Hazards

▪ Prediction
▪ A better solution to control hazards
▪ Predict the outcome of the branch and fetch the next instruction accordingly!
▪ If the prediction is wrong, fetch the right instruction again!
▪ Prediction can be static or dynamic!


Addressing Control Hazards

▪ Static prediction (compile-time solution)
▪ Predict all branches as taken or all as not taken
▪ Last example using prediction
▪ When the prediction is correct!
▪ When the prediction is wrong!

Addressing Control Hazards

▪ Static prediction (compile-time solution)
▪ Predict some branches as taken and some as not taken
▪ For example, a branch instruction at the end of a loop is usually taken
▪ Predict all branches to earlier addresses (backward branches) as taken!
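
The backward-taken heuristic can be expressed in a couple of lines (an illustrative sketch; the addresses are hypothetical):

```python
# Static heuristic: predict backward branches (loops) as taken and
# forward branches as not taken, by comparing target and branch addresses.

def static_predict_taken(branch_pc, target_pc):
    # Target at an earlier address -> likely a loop-closing branch.
    return target_pc < branch_pc

print(static_predict_taken(0x100, 0x0C0))  # True  (backward, loop branch)
print(static_predict_taken(0x100, 0x140))  # False (forward branch)
```
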
Addressing Control Hazards

▪ Dynamic prediction (runtime solution)
▪ Predict based on the observed behavior of each branch instruction!
▪ Keep a history of branches as taken or not taken
▪ Predict based on the prevalent behavior of each branch!
▪ With enough history, such predictors achieve accuracy above 90%!
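
A common dynamic scheme is a 2-bit saturating counter per branch. The minimal model below is an illustration, not the lecture's design (a real branch history table is indexed by PC bits and holds many counters):

```python
# 2-bit saturating counter: states 0-1 predict not taken, 2-3 predict taken.
# Two consecutive mispredictions are needed to flip the prediction, so a
# single loop exit does not destroy the "taken" prediction.

class TwoBitPredictor:
    def __init__(self):
        self.state = 2  # start weakly taken

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)

# A loop branch taken 9 times then not taken once, repeated 3 times:
p = TwoBitPredictor()
outcomes = ([True] * 9 + [False]) * 3
correct = sum(p.predict() == t or p.update(t) for t in [])  # placeholder removed
correct = 0
for t in outcomes:
    correct += (p.predict() == t)
    p.update(t)
print(correct / len(outcomes))  # 0.9 -- only the loop exits mispredict
```
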

▪ Can you think of any other solution to control hazards?


Addressing Control Hazards

▪ Delayed branch (compile-time solution)
▪ Delay the branch decision!
▪ Fill the delay slot with an instruction that is not affected by the branch!
▪ Effective for one-cycle branch delays!
▪ Example
▪ add x1, x2, x4
▪ beq x5, x6, somewhere
▪ Reorder the instructions as
▪ beq x5, x6, somewhere
▪ add x1, x2, x4
▪ Handled by the assembler!
▪ Invisible to the assembly language programmer!

Instruction sets can make pipelining easier or harder …

Knowledge Check!

▪ Tell whether the following code sequence must stall, can avoid stalls with only forwarding, or can execute without stalls or forwarding …
▪ Example 1
▪ ld x10, 0(x10)
▪ add x11, x10, x10
▪ Answer:
▪ Cannot fully avoid the stall (load-use hazard)!
▪ Forwarding reduces the stall by one cycle!

Knowledge Check!

▪ Tell whether the following code sequence must stall, can avoid stalls with only forwarding, or can execute without stalls or forwarding …
▪ Example 2
▪ add x11, x10, x10
▪ addi x12, x10, 5
▪ addi x14, x11, 5
▪ Answer:
▪ The third instruction needs x11 before the first writes it back!
▪ A read-after-write data hazard!
▪ The stall can be avoided by forwarding!

Knowledge Check!

▪ Tell whether the following code sequence must stall, can avoid stalls with only forwarding, or can execute without stalls or forwarding …
▪ Example 3
▪ addi x11, x10, 1
▪ addi x12, x10, 2
▪ addi x13, x10, 3
▪ addi x14, x10, 4
▪ addi x15, x10, 5
▪ Answer:
▪ No stalls, even without forwarding!

Pipelining performance when considering stalls due to hazards …

Performance without Stalls

▪ In general,

  Speedup = (Avg. time per instruction unpipelined) / (Avg. time per instruction pipelined)

▪ With balanced stages and no stalls,

  Avg. time per instruction pipelined = (Avg. time per instruction unpipelined) / (Number of pipeline stages)

  so Speedup = Number of pipeline stages

Performance with Stalls

▪ In general,

  Speedup = (Avg. time per instruction unpipelined) / (Avg. time per instruction pipelined)
          = (CPI unpipelined × Clock cycle unpipelined) / (CPI pipelined × Clock cycle pipelined)

▪ Stalls increase the CPI above the ideal pipeline CPI!

  CPI pipelined = Ideal CPI + Pipeline stall cycles per instruction

▪ What is the ideal pipeline CPI? It is 1, so

  CPI pipelined = 1 + Pipeline stall cycles per instruction

Performance with Stalls

  Speedup = (CPI unpipelined × Clock cycle unpipelined) / (CPI pipelined × Clock cycle pipelined)

▪ Ignoring clock skew, the cycle time is the same in the pipelined and multi-cycle implementations!

  Speedup = CPI unpipelined / (1 + Pipeline stall cycles per instruction)

▪ If all instructions take the same number of cycles, CPI unpipelined = pipeline depth!

  Speedup = Pipeline depth / (1 + Pipeline stall cycles per instruction)

▪ Without stalls, speedup = pipeline depth!

Performance with Stalls

  Speedup = Pipeline depth / (1 + Pipeline stall cycles per instruction)

▪ Stall cycles depend on the frequency of instructions causing stalls times the penalty for each such instruction
▪ For instance, for branches:

  Pipeline stall cycles for branches = Branch frequency × Branch penalty

▪ With this equation, we can compare different prediction schemes!

Example: Prediction Performance

▪ MIPS R4000 pipeline
▪ Three pipeline stages before the branch target address is known!
▪ Four pipeline stages until the branch comparison is done!
▪ Assume no stalls on the registers used in the conditional comparison!
▪ Find the effective addition to CPI due to branches, assuming
▪ Unconditional branches: 4%
▪ Conditional branches, untaken: 6%
▪ Conditional branches, taken: 10%
▪ Let's compare three branch handling schemes
▪ Flush pipeline
▪ Predicted taken
▪ Predicted untaken

Example: Prediction Performance

▪ Target address → 3rd stage, condition test → 4th stage
▪ Branch penalties

  Branch scheme       Penalty (unconditional)  Penalty (untaken)  Penalty (taken)
  Flush pipeline               2                       3                 3
  Predicted taken              2                       3                 2
  Predicted untaken            2                       0                 3

  Pipeline stall cycles for branches = Branch frequency × Branch penalty

Example: Prediction Performance

▪ If the base CPI = 1 and branches are the only source of stalls
▪ Stalling the pipeline (flush) is 1.56 times slower than the ideal!
▪ Predicted untaken is 1.38 times slower than the ideal!
▪ Predicted untaken is therefore 1.13 (= 1.56/1.38) times better than stalling the pipeline!
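
These figures follow from the penalty table and branch frequencies on the previous slides; a quick check (a sketch, with my own dictionary encoding of the table):

```python
# Effective CPI = 1 + sum over branch classes of (frequency x penalty).

freqs = {"uncond": 0.04, "cond_untaken": 0.06, "cond_taken": 0.10}

penalties = {
    "flush":             {"uncond": 2, "cond_untaken": 3, "cond_taken": 3},
    "predicted_taken":   {"uncond": 2, "cond_untaken": 3, "cond_taken": 2},
    "predicted_untaken": {"uncond": 2, "cond_untaken": 0, "cond_taken": 3},
}

def effective_cpi(scheme):
    stalls = sum(freqs[c] * penalties[scheme][c] for c in freqs)
    return 1 + stalls

for s in penalties:
    print(s, round(effective_cpi(s), 2))
# flush 1.56, predicted_taken 1.46, predicted_untaken 1.38
```
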
Relevant Reading

▪ Computer Organization and Design (RISC-V Edition), Patterson and Hennessy
▪ Chapter 4, Section 4.5!
▪ Computer Architecture: A Quantitative Approach, 6th Edition, Hennessy and Patterson
▪ Appendix C, Sections C.1 and C.2
