Pipelining

The document discusses pipelining in computer architecture, illustrating its benefits through a laundry example and detailing the RISC instruction set's 5-stage pipeline. It covers various types of hazards that can occur during pipelining, including structural, data, and control hazards, and explains how these can impact performance. Additionally, it highlights the importance of throughput over latency and the need for careful design to mitigate hazards.

Uploaded by

Sangita Dutta


Pipelining & Hazards

Pipelining: It's Natural!
🞕 Laundry Example
🞕 Ann, Brian, Cathy, Dave (A, B, C, D) each have one load of clothes to wash, dry, and fold
 Washer takes 30 minutes
 Dryer takes 40 minutes
 “Folder” takes 20 minutes

🞕 One load: 90 minutes

Sequential Laundry
[Figure: timeline from 6 PM to midnight; loads A, B, C, D in task order, each taking 30 + 40 + 20 minutes, one load after another.]
🞕 Sequential laundry takes 6 hours for 4 loads

🞕 If they learned pipelining, how long would laundry take?
Pipelined Laundry: Start Work ASAP
[Figure: timeline from 6 PM to midnight; loads A, B, C, D in task order overlap, giving segments of 30, 40, 40, 40, 40, and 20 minutes.]
🞕 Pipelined laundry takes 3.5 hours for 4 loads
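The two totals can be checked with a small simulation. This is a minimal sketch, assuming one washer, one dryer, and one folder, and that a finished load can wait between stages; the function names are illustrative only.

```python
# Stage times in minutes, taken from the laundry example.
STAGES = [30, 40, 20]   # wash, dry, fold
LOADS = 4

def sequential_minutes(loads, stages):
    """Each load finishes every stage before the next load starts."""
    return loads * sum(stages)

def pipelined_minutes(loads, stages):
    """A load enters a stage once it has left the previous stage
    and that stage's single unit is free."""
    free_at = [0] * len(stages)      # time at which each unit becomes free
    finish = 0
    for _ in range(loads):
        t = 0                        # time this load is ready for its next stage
        for s, duration in enumerate(stages):
            start = max(t, free_at[s])
            t = start + duration
            free_at[s] = t
        finish = t
    return finish

print(sequential_minutes(LOADS, STAGES) / 60)  # 6.0 hours
print(pipelined_minutes(LOADS, STAGES) / 60)   # 3.5 hours
```

The pipelined total is 30 + 4 × 40 + 20 = 210 minutes: the 40-minute dryer is the slowest stage and sets the rate.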

RISC Instruction Set
🞕 Every instruction can be implemented in at most 5 clock cycles (stages):
 Instruction fetch cycle (IF): send the PC to memory, fetch the current instruction from memory, and update the PC to the next sequential PC by adding 4 to the PC.
 Instruction decode/register fetch cycle (ID): decode the instruction and read the registers named by the source register specifiers from the register file.
 Execution/effective address cycle (EX): compute the memory address for a load/store, or perform the ALU operation for a register-register or register-immediate ALU instruction.
 Memory access cycle (MEM): perform the memory access for load/store instructions.
 Write-back cycle (WB): write the result to the destination register for a register-register ALU instruction or a load instruction.
Classic 5-Stage Pipeline for a RISC
🞕 Each cycle the hardware initiates a new instruction and executes some part of five different instructions.
 Simple;
 However, we must ensure that the overlap of instructions in the pipeline cannot cause a conflict (also called a hazard).

                    Clock number
Instruction number  1    2    3    4    5    6    7    8    9
Instruction i       IF   ID   EX   MEM  WB
Instruction i+1          IF   ID   EX   MEM  WB
Instruction i+2               IF   ID   EX   MEM  WB
Instruction i+3                    IF   ID   EX   MEM  WB
Instruction i+4                         IF   ID   EX   MEM  WB
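The staggered table can be generated for any number of instructions. A minimal sketch, assuming an ideal pipeline with no stalls; `pipeline_rows` and `render` are hypothetical helper names.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def pipeline_rows(n_instructions):
    """Row k holds the stage occupied by instruction i+k in each clock cycle."""
    total_cycles = n_instructions + len(STAGES) - 1
    rows = []
    for k in range(n_instructions):
        row = [""] * total_cycles
        for s, stage in enumerate(STAGES):
            row[k + s] = stage       # instruction k is in stage s during cycle k+s+1
        rows.append(row)
    return rows

def render(rows):
    """Format the rows as fixed-width text, one line per instruction."""
    lines = []
    for k, row in enumerate(rows):
        label = "i" if k == 0 else f"i+{k}"
        cells = "".join(f"{cell:<5}" for cell in row)
        lines.append(f"{label:<5}{cells}".rstrip())
    return "\n".join(lines)

print(render(pipeline_rows(5)))
```

With 5 instructions and 5 stages the diagram spans 5 + 5 - 1 = 9 clock cycles, matching the table.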
Computer Pipelines
🞕 Pipeline properties
 Processors execute billions of instructions, so throughput is what matters.
 Pipelining doesn't help the latency of a single task; it helps the throughput of the entire workload;
 Pipeline rate is limited by the slowest pipeline stage;
 Multiple tasks operate simultaneously;
 Potential speedup = number of pipe stages;
 Unbalanced lengths of pipe stages reduce speedup;
 Time to “fill” the pipeline and time to “drain” it reduce speedup.
🞕 The time per instruction on the pipelined processor in ideal conditions is

$$\text{Time per instruction}_{\text{pipelined}} = \frac{\text{Time per instruction on unpipelined machine}}{\text{Number of pipe stages}}$$

† However, the stages may not be perfectly balanced.
† Pipelining yields a reduction in the average execution time per instruction.
Review: Components of a Computer
[Figure: block diagram. The Processor contains Control and a Datapath with the Program Counter (PC), Registers, and Arithmetic & Logic Unit (ALU). Memory holds Program, Data, and Bytes. The Processor-Memory Interface carries Enable?, Read/Write, Address, Write Data, and Read Data. Input and Output connect through the I/O-Memory Interfaces.]

CPU and Datapath vs Control

🞕 Datapath: storage, functional units (FU), and interconnect sufficient to perform the desired functions
 Inputs are control points
 Outputs are signals
🞕 Controller: state machine to orchestrate operation on the datapath
 Based on desired function and signals
Making RISC Pipelining Real
🞕 Function units are used in different cycles
 Hence we can overlap the execution of multiple instructions
🞕 Important things to make it real
 Separate instruction and data memories, e.g. I-cache and D-cache, banking
» Eliminates the conflict for access to a single memory.
 The register file is used in two stages (two reads and one write every cycle)
» Read from registers in ID (second half of CC), and write to a register in WB (first half of CC).
 PC
» Increment and store the PC every clock, during the IF stage.
» A branch does not change the PC until the ID stage (an adder computes the potential branch target).
 Staging data between pipeline stages
» Pipeline registers
Pipeline Datapath
🞕 Register file in the ID and WB stages
 Read from registers in ID (second half of CC), and write to a register in WB (first half of CC).
🞕 Separate instruction memory (IM) and data memory (DM)
Pipelining Performance (1/2)
🞕 Pipelining increases throughput; it does not reduce the execution time of an individual instruction.
 In fact, it slightly increases the execution time of each instruction due to overhead in the control of the pipeline.
 The practical depth of a pipeline is limited by this increase in execution time.
🞕 Pipeline overhead
 Unbalanced pipeline stage;
 Pipeline stage overhead;
 Pipeline register delay;
 Clock skew.
Processor Performance

$$\text{CPU Time} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Time}}{\text{Cycle}}$$

🞕 Instructions per program depends on source code, compiler technology, and ISA
🞕 Cycles per instruction (CPI) depends on ISA and µarchitecture
🞕 Time per cycle depends upon the µarchitecture and base technology
CPI for Different Instructions
Inst 1: 7 cycles, Inst 2: 5 cycles, Inst 3: 10 cycles (executed one after another in time)

Total clock cycles = 7 + 5 + 10 = 22
Total instructions = 3
CPI = 22/3 = 7.33

CPI is always an average over a large number of instructions
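The CPI average and the CPU time product can be written out directly. A minimal sketch; the 1 ns clock cycle in the last line is a hypothetical value chosen only to exercise the formula.

```python
def average_cpi(cycles_per_instruction):
    """CPI is total clock cycles divided by the number of instructions."""
    return sum(cycles_per_instruction) / len(cycles_per_instruction)

def cpu_time(instruction_count, cpi, cycle_time):
    """CPU time = instructions x cycles per instruction x time per cycle."""
    return instruction_count * cpi * cycle_time

cycles = [7, 5, 10]                 # Inst 1, Inst 2, Inst 3
cpi = average_cpi(cycles)
print(f"CPI = {cpi:.2f}")           # CPI = 7.33

# Hypothetical 1 ns clock cycle (not from the slides): 22 cycles take about 22 ns.
total_seconds = cpu_time(len(cycles), cpi, 1e-9)
```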
Pipeline Performance (2/2)
🞕 Example 1 (p.C-10): Consider the unpipelined processor in the previous section. Assume that it has a 1 ns clock cycle and that it uses 4 cycles for ALU operations and branches, and 5 cycles for memory operations. Assume that the relative frequencies of these operations are 40%, 20%, and 40%, respectively. Suppose that due to clock skew and setup, pipelining the processor adds 0.2 ns of overhead to the clock. Ignoring any latency impact, how much speedup in the instruction execution rate will we gain from a pipeline?
🞕 Answer
The average instruction execution time on the unpipelined processor is

$$
\begin{aligned}
\text{Average instruction execution time} &= \text{Clock cycle} \times \text{Average CPI} \\
&= 1\,\text{ns} \times \left((40\% + 20\%) \times 4 + 40\% \times 5\right) \\
&= 1\,\text{ns} \times 4.4 = 4.4\,\text{ns}
\end{aligned}
$$

$$\text{Speedup from pipelining} = \frac{\text{Average instruction time unpipelined}}{\text{Average instruction time pipelined}} = \frac{4.4\,\text{ns}}{1.2\,\text{ns}} = 3.7 \text{ times}$$

† In the pipeline, the clock must run at the speed of the slowest stage plus overhead, which is 1 + 0.2 = 1.2 ns
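The arithmetic of Example 1 can be reproduced in a few lines. A minimal sketch using only the numbers given in the example; the dictionary layout and variable names are illustrative.

```python
# (relative frequency, cycles on the unpipelined processor), from Example 1
MIX = {
    "alu":    (0.40, 4),
    "branch": (0.20, 4),
    "memory": (0.40, 5),
}
CLOCK_NS = 1.0       # unpipelined clock cycle
OVERHEAD_NS = 0.2    # clock skew and setup added by pipelining

average_cpi = sum(freq * cycles for freq, cycles in MIX.values())
unpipelined_ns = CLOCK_NS * average_cpi      # average time per instruction
pipelined_ns = CLOCK_NS + OVERHEAD_NS        # one instruction completes per cycle
speedup = unpipelined_ns / pipelined_ns

print(f"{average_cpi:.1f} {pipelined_ns:.1f} {speedup:.1f}")  # 4.4 1.2 3.7
```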
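Performance with Pipeline Stall (1/2)

$$
\begin{aligned}
\text{Speedup from pipelining} &= \frac{\text{Average instruction time unpipelined}}{\text{Average instruction time pipelined}} \\
&= \frac{\text{CPI unpipelined} \times \text{Clock cycle unpipelined}}{\text{CPI pipelined} \times \text{Clock cycle pipelined}} \\
&= \frac{\text{CPI unpipelined}}{\text{CPI pipelined}} \times \frac{\text{Clock cycle unpipelined}}{\text{Clock cycle pipelined}}
\end{aligned}
$$

$$
\begin{aligned}
\text{CPI pipelined} &= \text{Ideal CPI} + \text{Pipeline stall clock cycles per instruction} \\
&= 1 + \text{Pipeline stall clock cycles per instruction}
\end{aligned}
$$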
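Performance with Pipeline Stall (2/2)

Taking the CPI of the unpipelined processor as 1:

$$
\begin{aligned}
\text{Speedup from pipelining} &= \frac{\text{CPI unpipelined}}{\text{CPI pipelined}} \times \frac{\text{Clock cycle unpipelined}}{\text{Clock cycle pipelined}} \\
&= \frac{1}{1 + \text{Pipeline stall cycles per instruction}} \times \frac{\text{Clock cycle unpipelined}}{\text{Clock cycle pipelined}}
\end{aligned}
$$

$$\text{Clock cycle pipelined} = \frac{\text{Clock cycle unpipelined}}{\text{Pipeline depth}} \;\Rightarrow\; \text{Pipeline depth} = \frac{\text{Clock cycle unpipelined}}{\text{Clock cycle pipelined}}$$

$$
\begin{aligned}
\text{Speedup from pipelining} &= \frac{1}{1 + \text{Pipeline stall cycles per instruction}} \times \frac{\text{Clock cycle unpipelined}}{\text{Clock cycle pipelined}} \\
&= \frac{1}{1 + \text{Pipeline stall cycles per instruction}} \times \text{Pipeline depth}
\end{aligned}
$$

Pipelining speedup is proportional to the pipeline depth and to 1/(1 + stall cycles per instruction)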
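The final speedup formula is easy to evaluate for a few stall rates. A minimal sketch, assuming perfectly balanced stages and no clock overhead; the stall rates used are hypothetical inputs, not measurements.

```python
def pipeline_speedup(depth, stall_cycles_per_instruction):
    """Speedup = pipeline depth / (1 + pipeline stall cycles per instruction)."""
    return depth / (1.0 + stall_cycles_per_instruction)

# Hypothetical stall rates for a 5-stage pipeline.
for stalls in (0.0, 0.25, 1.0):
    print(stalls, pipeline_speedup(5, stalls))
# 0.0 5.0
# 0.25 4.0
# 1.0 2.5
```

With no stalls the speedup equals the pipeline depth; one stall cycle per instruction halves it.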
Pipeline Hazards
🞕 Hazards are situations that prevent the next instruction in the instruction stream from executing during its designated clock cycle.
 Structural hazards: resource conflicts, e.g. two instructions needing the same unit in the same cycle
 Data hazards: an instruction depends on the result of a previous instruction
 Control hazards: arise from the pipelining of branches and other instructions that change the PC.
🞕 Hazards in pipelines can make it necessary to stall the pipeline.
 Stalls reduce pipeline performance.
Structure Hazards
🞕 Structure Hazards
 Arise when some combination of instructions cannot be accommodated because of a resource conflict (overlapped execution requires pipelining of functional units and duplication of resources).
» Occur when some functional unit is not fully pipelined, or
» When some resource has not been duplicated enough.
One Memory Port/Structural Hazards
[Figure: pipeline diagram over clock cycles 1 to 7 for Load, Instr 1, Instr 2, Instr 3, and Instr 4 in instruction order; each instruction passes through Ifetch, Reg, ALU, DMem, Reg, one cycle behind the previous one.]
One Memory Port/Structural Hazards
[Figure: the same pipeline diagram with a stall; a row of bubbles follows Instr 2, and Instr 3 starts its Ifetch one cycle later.]
How do you “bubble” the pipe?
Performance on Structure Hazard
🞕 Example 2 (p.C-14): Let's see how much the load structure hazard might cost. Suppose that data references constitute 40% of the mix, and that the ideal CPI of the pipelined processor, ignoring the structure hazard, is 1. Assume that the processor with the structure hazard has a clock rate that is 1.05 times higher than the clock rate of the processor without the hazard. Disregarding any other performance losses, is the pipeline with or without the structure hazard faster, and by how much?
🞕 Answer
The average instruction execution time on the ideal processor (without the structure hazard) is

$$
\begin{aligned}
\text{Average instruction time}_{\text{ideal}} &= \text{CPI} \times \text{Clock cycle time}_{\text{ideal}} \\
&= 1 \times \text{Clock cycle time}_{\text{ideal}}
\end{aligned}
$$

$$
\begin{aligned}
\text{Average instruction time}_{\text{structure hazard}} &= \text{CPI} \times \text{Clock cycle time} \\
&= (1 + 0.4 \times 1) \times \frac{\text{Clock cycle time}_{\text{ideal}}}{1.05} \\
&= 1.3 \times \text{Clock cycle time}_{\text{ideal}}
\end{aligned}
$$

So the processor without the structure hazard is about 1.3 times faster.
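Example 2 can be checked numerically. A minimal sketch with the ideal clock cycle normalised to 1; only the 40% data-reference share, the one-cycle stall, and the 1.05 clock ratio come from the example.

```python
def avg_instruction_time(cpi, clock_cycle_time):
    """Average instruction time = CPI x clock cycle time."""
    return cpi * clock_cycle_time

IDEAL_CYCLE = 1.0              # normalised clock cycle of the hazard-free processor
DATA_REF_FRACTION = 0.4        # share of instructions that reference data
STALL_PER_DATA_REF = 1         # each data reference costs one stall cycle
CLOCK_RATE_RATIO = 1.05        # the processor with the hazard clocks 1.05x faster

ideal = avg_instruction_time(1.0, IDEAL_CYCLE)
with_hazard = avg_instruction_time(
    1.0 + DATA_REF_FRACTION * STALL_PER_DATA_REF,
    IDEAL_CYCLE / CLOCK_RATE_RATIO,
)
print(f"{with_hazard / ideal:.2f}")   # 1.33
```

The ratio is 1.4 / 1.05, which the slide rounds to 1.3.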
Summary of Structure Hazard
🞕 To avoid this structure hazard, the designer could provide a separate memory access for instructions:
 Split the cache into separate instruction and data caches, or
 Use a set of buffers, usually called instruction buffers, to hold instructions.
🞕 However, this increases cost.
 Ex1: pipelining all functional units or duplicating resources is costly;
 Ex2: supporting both an instruction and a data cache access every cycle requires twice the memory bandwidth, and often higher bandwidth at the pins;
 Ex3: a fully pipelined floating-point multiplier consumes lots of gates.

† If the structure hazard is rare, it may not be worth the cost to avoid it.
Data Hazards
🞕 Data Hazards
 Occur when the pipeline changes the order of read/write accesses to operands so that the order differs from the order seen by sequentially executing instructions on an unpipelined processor.
 An example of pipelined execution:

DADD R1, R2, R3
DSUB R4, R1, R5
AND  R6, R1, R7
Three Generic Data Hazards (1/3)
🞕 Read After Write (RAW)
 Instr J tries to read operand before Instr I writes it

I: ADD R1,R2,R3
J: SUB R4,R1,R3

🞕 Caused by a “true dependence” (in compiler nomenclature). This hazard results from an actual need for communication.
Three Generic Data Hazards (2/3)
🞕 Write After Read (WAR)
 Instr J writes operand before Instr I reads it

I: SUB R4,R1,R3
J: ADD R1,R2,R3
K: MUL R6,R1,R7

🞕 Called an “anti-dependence” by compiler writers. This results from reuse of the name “R1”.
🞕 Can't happen in MIPS 5 stage pipeline because:
 All instructions take 5 stages, and
 Reads are always in stage 2, and
 Writes are always in stage 5
Three Generic Data Hazards (3/3)
🞕 Write After Write (WAW)
 Instr J writes operand before Instr I writes it.

I: SUB R1,R4,R3
J: ADD R1,R2,R3
K: MUL R6,R1,R7

🞕 This hazard also results from the reuse of the name “R1”
🞕 Hazard when writes occur in the wrong order
🞕 Can't happen in our basic 5-stage pipeline because:
 All writes are ordered and take place in stage 5
🞕 WAR and WAW hazards occur in complex pipelines
🞕 Notice that Read After Read (RAR) is NOT a hazard
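The three cases reduce to comparing destination and source registers of an earlier instruction I and a later instruction J. A minimal sketch; the `Instr` tuple and `dependences` function are hypothetical names, and the result lists dependences that could become hazards, not hazards in any particular pipeline.

```python
from typing import NamedTuple, Set, Tuple

class Instr(NamedTuple):
    op: str
    dest: str                 # register written
    srcs: Tuple[str, ...]     # registers read

def dependences(i: Instr, j: Instr) -> Set[str]:
    """Dependences from earlier instruction i to later instruction j."""
    found = set()
    if i.dest in j.srcs:
        found.add("RAW")      # j reads what i writes
    if j.dest in i.srcs:
        found.add("WAR")      # j writes what i reads
    if i.dest == j.dest:
        found.add("WAW")      # both write the same register
    return found

# The I/J pairs from the three slides.
print(dependences(Instr("ADD", "R1", ("R2", "R3")), Instr("SUB", "R4", ("R1", "R3"))))  # {'RAW'}
print(dependences(Instr("SUB", "R4", ("R1", "R3")), Instr("ADD", "R1", ("R2", "R3"))))  # {'WAR'}
print(dependences(Instr("SUB", "R1", ("R4", "R3")), Instr("ADD", "R1", ("R2", "R3"))))  # {'WAW'}
```

Two instructions that only read the same register produce an empty set, which is the RAR case.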
#2: Forwarding (aka bypassing) to Avoid Data Hazard
[Figure: pipeline diagram over clock cycles for, in instruction order, DADD R1,R2,R3; DSUB R4,R1,R5; AND R6,R1,R7; OR R8,R1,R9; XOR. Each instruction passes through Ifetch, Reg, ALU, DMem, Reg, and results are forwarded from the pipeline registers.]
Another Example of a RAW Data Hazard
🞕 Result of sub is needed by and, or, add, & sw instructions
🞕 Instructions and & or will read old value of r2 from reg file
🞕 During CC5, r2 is written and read: the new value is read

Time (cycles)     CC1    CC2    CC3    CC4    CC5    CC6    CC7    CC8
value of r2       10     10     10     10     10/20  20     20     20
sub r2, r1, r3    IM     Reg    ALU    DM     Reg
and r4, r2, r5           IM     Reg    ALU    DM     Reg
or  r6, r3, r2                  IM     Reg    ALU    DM     Reg
add r7, r2, r2                         IM     Reg    ALU    DM     Reg
sw  r8, 10(r2)                                IM     Reg    ALU    DM
Solution #1: Stalling the Pipeline
🞕 The and instruction cannot read r2 until CC5
 The and instruction remains in the IF/ID register until CC5
🞕 Two bubbles are inserted into ID/EX at end of CC3 & CC4
 Bubbles are NOP instructions: do not modify registers or memory
 Bubbles delay instruction execution and waste clock cycles

Time (in cycles): CC1 to CC8
value of r2: 10, 10, 10, 10, 10/20, 20, 20, 20
sub r2, r1, r3: IM, Reg, ALU, DM, Reg (CC1 to CC5)
and r4, r2, r5: IM, bubble, bubble, Reg, ALU, DM, Reg (CC2 to CC8)
or r6, r3, r2: IM, Reg, ALU, DM
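The number of bubbles depends on how far the consumer is from the producer. A minimal sketch, assuming no forwarding and the split-cycle register file described earlier (write in the first half of WB, read in the second half of ID); `stalls_without_forwarding` is a hypothetical helper, and it looks at one producer/consumer pair in isolation.

```python
def stalls_without_forwarding(distance):
    """Bubbles needed before a consumer `distance` instructions after the producer.

    The producer writes the register in WB, its 5th cycle. A consumer issued
    `distance` instructions later would normally read registers in the
    producer's (2 + distance)th cycle, so it must wait until cycle 5.
    """
    if distance < 1:
        raise ValueError("distance must be at least 1")
    return max(0, 3 - distance)

print(stalls_without_forwarding(1))  # 2: the `and` right after `sub` needs two bubbles
print(stalls_without_forwarding(2))  # 1
print(stalls_without_forwarding(3))  # 0: read and write share the same cycle
```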
Solution #2: Forwarding ALU Result
🞕 The ALU result is forwarded (fed back) to the ALU input
 No bubbles are inserted into the pipeline and no cycles are wasted
🞕 ALU result exists in either EX/MEM or MEM/WB register

Time (in cycles)  CC1    CC2    CC3    CC4    CC5    CC6    CC7    CC8
sub r2, r1, r3    IM     Reg    ALU    DM     Reg
and r4, r2, r5           IM     Reg    ALU    DM     Reg
or  r6, r3, r2                  IM     Reg    ALU    DM     Reg
add r7, r2, r2                         IM     Reg    ALU    DM     Reg
sw  r8, 10(r2)                                IM     Reg    ALU    DM
Double Data Hazard
🞕 Consider the sequence:
add r1,r1,r2
sub r1,r1,r3
and r1,r1,r4
🞕 Both hazards occur
 Want to use the most recent
 When executing AND, forward result of SUB
» ForwardA = 01 (from the EX/MEM pipe stage)
Data Hazard Even with Forwarding
[Figure: pipeline diagram over clock cycles for, in instruction order, LD R1,0(R2); DSUB R4,R1,R6; DAND R6,R1,R7; OR R8,R1,R9. Each instruction passes through Ifetch, Reg, ALU, DMem, Reg.]
Data Hazard Even with Forwarding
[Figure: the same sequence with a one-cycle bubble; DSUB R4,R1,R6, AND R6,R1,R7, and OR R8,R1,R9 are each delayed by one cycle after LD R1,0(R2).]
How is this detected?
Load Delay
🞕 Not all RAW data hazards can be forwarded
 Load has a delay that cannot be eliminated by forwarding
🞕 In the example shown below …
 The LW instruction does not have data until end of CC4
 AND wants data at beginning of CC4 - NOT possible
🞕 However, load can forward data to second next instruction

Time (cycles)     CC1    CC2    CC3    CC4    CC5    CC6    CC7    CC8
lw  r2, 20(r1)    IF     Reg    ALU    DM     Reg
and r4, r2, r5           IF     Reg    ALU    DM     Reg
or  r6, r3, r2                  IF     Reg    ALU    DM     Reg
add r7, r2, r2                         IF     Reg    ALU    DM     Reg
Stall the Pipeline for one Cycle
🞕 Freeze the PC and the IF/ID registers
 No new instruction is fetched and instruction after load is stalled
🞕 Allow the Load in ID/EX register to proceed
🞕 Introduce a bubble into the ID/EX register
🞕 Load can forward data after stalling next instruction

Time (cycles): CC1 to CC8
lw r2, 20(r1): IM, Reg, ALU, DM, Reg (CC1 to CC5)
and r4, r2, r5: IM, bubble, Reg, ALU, DM, Reg (CC2 to CC7)
or r6, r3, r2: IM, Reg, ALU, DM, Reg
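One common way to detect the load-use case is to compare the load's destination with the source registers of the instruction behind it. A minimal sketch of that check; the field names `ID/EX.MemRead`, `IF/ID.RegisterRs`, and `IF/ID.RegisterRt` follow the usual MIPS textbook convention and are assumptions here, not names taken from these slides.

```python
def load_use_stall(id_ex_mem_read, id_ex_rt, if_id_rs, if_id_rt):
    """Return True when the pipeline should stall for one cycle.

    id_ex_mem_read: the instruction in ID/EX is a load (assumed control signal)
    id_ex_rt:       register the load will write
    if_id_rs/rt:    source registers of the instruction being decoded
    """
    return bool(id_ex_mem_read) and id_ex_rt in (if_id_rs, if_id_rt)

# lw r2, 20(r1) followed by and r4, r2, r5: the load destination r2 is a source.
print(load_use_stall(True, 2, 2, 5))    # True  -> freeze PC and IF/ID, insert a bubble
# The same register match after a non-load instruction is handled by forwarding.
print(load_use_stall(False, 2, 2, 5))   # False
```

When the check is true, the PC and IF/ID register keep their values and a bubble goes into ID/EX, as listed in the bullets.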
Forwarding to Avoid LW-SW Data Hazard
[Figure: pipeline diagram over clock cycles for, in instruction order, DADD R1,R2,R3; LD R4,0(R1); SD R4,12(R1); OR R8,R6,R9; XOR R10,R9,R11. Each instruction passes through Ifetch, Reg, ALU, DMem, Reg.]
Detecting RAW Hazards
🞕 Pass register numbers along pipeline
 ID/EX.RegisterRs = register number for Rs in ID/EX
 ID/EX.RegisterRt = register number for Rt in ID/EX
 ID/EX.RegisterRd = register number for Rd in ID/EX
🞕 Current instruction being executed is in the ID/EX register
🞕 Previous instruction is in the EX/MEM register
🞕 Second previous is in the MEM/WB register
🞕 RAW data hazards when
1a. EX/MEM.RegisterRd = ID/EX.RegisterRs
1b. EX/MEM.RegisterRd = ID/EX.RegisterRt
 Match with the previous instruction: Fwd from EX/MEM pipeline reg
 Match with the second previous instruction: Fwd from MEM/WB pipeline reg
Detecting the Need to Forward
🞕 But only if forwarding instruction will write to a register!
 EX/MEM.RegWrite, MEM/WB.RegWrite
🞕 And only if Rd for that instruction is not R0
 EX/MEM.RegisterRd ≠ 0
 MEM/WB.RegisterRd ≠ 0
Forwarding Conditions
🞕 Detecting RAW hazard with Previous Instruction
 if (EX/MEM.RegWrite and
     (EX/MEM.RegisterRd ≠ 0) and
     (EX/MEM.RegisterRd = ID/EX.RegisterRs))
     ForwardA = 01 (Forward from EX/MEM pipe stage)
 if (EX/MEM.RegWrite and
     (EX/MEM.RegisterRd ≠ 0) and
     (EX/MEM.RegisterRd = ID/EX.RegisterRt))
     ForwardB = 01 (Forward from EX/MEM pipe stage)
🞕 Detecting RAW hazard with Second
Previous
 if (MEM/WB.RegWrite and
(MEM/WB.RegisterRd ≠ 0) and
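The forwarding conditions can be sketched as one selector function applied once per ALU operand. A minimal sketch: the EX/MEM case and the code 01 follow the conditions above, while the code 10 for MEM/WB and the code 00 for "no forwarding" are assumed encodings, and checking EX/MEM first is one way to pick the most recent result in a double data hazard.

```python
def forward_select(id_ex_reg, ex_mem_reg_write, ex_mem_rd, mem_wb_reg_write, mem_wb_rd):
    """Choose the source for one ALU operand (ID/EX.RegisterRs or ID/EX.RegisterRt)."""
    # Previous instruction: most recent value, so it takes priority.
    if ex_mem_reg_write and ex_mem_rd != 0 and ex_mem_rd == id_ex_reg:
        return 0b01   # forward from EX/MEM pipe stage
    # Second previous instruction.
    if mem_wb_reg_write and mem_wb_rd != 0 and mem_wb_rd == id_ex_reg:
        return 0b10   # forward from MEM/WB pipe stage (assumed encoding)
    return 0b00       # no hazard: use the value read from the register file (assumed encoding)

# Double data hazard: add r1,r1,r2 ; sub r1,r1,r3 ; and r1,r1,r4.
# While `and` is in EX, `sub` is in EX/MEM and `add` is in MEM/WB; both write r1.
forward_a = forward_select(id_ex_reg=1, ex_mem_reg_write=True, ex_mem_rd=1,
                           mem_wb_reg_write=True, mem_wb_rd=1)
print(format(forward_a, "02b"))   # 01 -> result of SUB is forwarded
```

Calling the function with ID/EX.RegisterRs gives ForwardA and with ID/EX.RegisterRt gives ForwardB.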
Control Hazard on Branches: Three Stage Stall
[Figure: pipeline diagram for 10: BEQ R1,R3,36; 14: AND R2,R3,R5; 18: OR R6,R1,R7; 22: ADD R8,R1,R9; 36: XOR R10,R1,R11. Each instruction passes through Ifetch, Reg, ALU, DMem, Reg.]

What do you do with the 3 instructions in between?
How do you do it?
Where is the “commit”?
Branch/Control Hazards
🞕 Branch instructions can cause great performance loss
🞕 Branch instructions need two things:
 Branch Result: Taken or Not Taken
 Branch Target
» PC + 4 if Branch is NOT taken
» PC + 4 + 4 × imm if Branch is Taken
🞕 For our pipeline: 3-cycle branch delay
 PC is updated 3 cycles after fetching branch instruction
 Branch target address is calculated in the ALU stage
 Branch result is also computed in the ALU stage
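The branch target rule and the cost of the 3-cycle delay can be put into numbers. A minimal sketch, assuming every branch stalls the pipeline for the full branch delay; the addresses and the 20% branch frequency are hypothetical inputs, not figures from the slides.

```python
def next_pc(pc, imm, taken):
    """PC + 4 when the branch is not taken, PC + 4 + 4 * imm when it is taken."""
    return pc + 4 + 4 * imm if taken else pc + 4

def cpi_with_branch_stalls(branch_fraction, branch_delay=3, ideal_cpi=1.0):
    """CPI pipelined = ideal CPI + stall cycles per instruction caused by branches."""
    return ideal_cpi + branch_fraction * branch_delay

print(next_pc(100, 5, taken=False))   # 104
print(next_pc(100, 5, taken=True))    # 124

# Hypothetical mix in which 20% of instructions are branches.
cpi = cpi_with_branch_stalls(0.20)
print(f"{cpi:.1f}")                   # 1.6
```

Under these assumptions the CPI rises from 1 to 1.6, which is why reducing the branch delay matters.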
Contents
1. Pipelining Introduction
2. The Major Hurdle of Pipelining: Pipeline Hazards
3. RISC-V ISA and its Implementation

Reading:
 Textbook: Appendix C
 RISC-V ISA
 Chisel Tutorial
