Chapter 4
Pipelining
Basic Computer Architecture
What is computer architecture?
Instruction set architecture: what to do
Computer organization: how to do it (datapath and control)
Chapter 4 — The Processor — 2
Instruction Set Architecture
Instruction set architecture for MIPS
Arithmetic-logical instructions
add R3,R2,R1
Data transfer instructions
lw R2, offset(R1)
sw R2, offset(R1)
Branch instructions
beq rs, rt, offset
Computer Organization
Sequential (single cycle) Execution
One instruction is fetched from instruction memory.
All the steps in instruction execution are completed.
Then the next instruction is fetched.
In other words, a new instruction cannot be fetched from memory until the previous instruction has completed its execution.
(Diagram: Instruction 1, Instruction 2, and Instruction 3 execute strictly one after another.)
Single Cycle
Why a single-cycle implementation is not used today:
Inefficient: a single clock cycle of the same length for every instruction
The longest path determines the clock cycle (which instruction? the slowest one, lw)
CPI is 1, but the clock cycle is too long, so overall performance is poor
We need another implementation technique that is more efficient and has higher throughput: pipelining
Pipelining
The next instruction is fetched from memory before the previous instruction has completed its execution.
In other words, instruction execution is overlapped.
Why is pipelining used?
To improve performance (higher throughput)
Pipelining
Original: each stage = 30 min; total execution time for 4 loads = 8 h
Improved (pipelined): each stage = 30 min; total execution time for 4 loads = 3.5 h
Speedup for four loads = 8 / 3.5 ≈ 2.3
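The laundry arithmetic above can be sketched in code. This is a minimal sketch, not from the slides; it assumes 4 loads, each passing through 4 stages of 30 minutes:

```python
# Sequential vs. pipelined completion time for the 4-load laundry example.
STAGE_MIN = 30   # minutes per stage (assumed 4 stages: wash, dry, fold, put away)
STAGES = 4
LOADS = 4

sequential = LOADS * STAGES * STAGE_MIN        # each load waits for the previous one
pipelined = (STAGES + LOADS - 1) * STAGE_MIN   # stages overlap after the first load

print(sequential / 60, pipelined / 60)   # hours: 8.0 3.5
print(round(sequential / pipelined, 1))  # speedup: 2.3
```

The `STAGES + LOADS - 1` term captures the pipeline fill: the first load takes all 4 stages, and each later load finishes one stage-time after the previous one.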
Pipelining vs Performance
What is performance ?
Latency or response time
How long it takes to do a single task
Throughput
Total work done (all tasks) per unit time
Does pipelining increase latency or throughput?
Only throughput
MIPS Pipeline
Five stages, one step per stage:
1. IF (Fetch): instruction fetch from memory
2. ID (Decode): instruction decode & register read
3. EX (Execute): execute operation or calculate address
4. MEM (Memory): access data memory
5. WB (Writeback): write result back to register
Focus on 8 instructions:
lw, sw, add, sub, AND, OR, slt, beq
Graphical Representation of MIPS 5-stage Pipeline
(Figure: in the pipeline diagram, the register file is shaded on its right half when read, in ID, and on its left half when written, in WB; add's MEM stage has a white background because add does not access memory.)
Pipeline Performance
Assume that:
Time for register read or write is 100 ps
Time for any other stage is 200 ps

Inst.                               Inst. fetch  Reg. read  ALU op  Mem. access  Reg. write  Tot. time
lw                                  200 ps       100 ps     200 ps  200 ps       100 ps      800 ps
sw                                  200 ps       100 ps     200 ps  200 ps      -            700 ps
R-format (add, sub, AND, OR, slt)   200 ps       100 ps     200 ps  -            100 ps      600 ps
Branch (beq)                        200 ps       100 ps     200 ps  -           -            500 ps
Pipeline Performance
Single-cycle (clock cycle time Tc = 800 ps):
Time between 1st and 4th instructions = 3 × 800 = 2400 ps
Pipelined (Tc = 200 ps):
Time between 1st and 4th instructions = 3 × 200 = 600 ps
The clock cycle must allow for the slowest instruction (see the previous table).
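The comparison above reduces to two multiplications; a sketch (not from the slides) makes the two cycle times explicit:

```python
# Time between the 1st and 4th instruction completions in each design.
n = 4
single_cycle_tc = 800   # ps; the cycle must fit the slowest instruction (lw)
pipelined_tc = 200      # ps; the cycle must fit the slowest stage

print((n - 1) * single_cycle_tc)   # 2400 ps
print((n - 1) * pipelined_tc)      # 600 ps
```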
Pipeline Performance
1 instruction: CC = 5
5 instructions: CC = 5 + (5 − 1) = 9
100 instructions: CC = 100 + (5 − 1) = 104
(On a 5-stage pipeline with no stalls, n instructions complete in n + 4 clock cycles.)
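The cycle counts above follow one formula; a sketch (not from the slides):

```python
# Clock cycles to complete n instructions on a 5-stage pipeline with no stalls.
def clock_cycles(n, stages=5):
    # The first instruction takes `stages` cycles; each later one finishes 1 cycle after.
    return n + (stages - 1)

print(clock_cycles(1), clock_cycles(5), clock_cycles(100))   # 5 9 104
```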
Example 1
Consider a nonpipelined machine with 8 execution stages, each of length 20 ns.
The time between two instructions:
20+20+20+20+20+20+20+20 = 160 ns
Suppose we introduce pipelining on this machine.
The time between two instructions = 20 ns
The speedup obtained from pipelining:
Speedup = 160 / 20 = 8
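Example 1 as code (a sketch, not from the slides): with equal stages, the pipelined clock is just one stage time, so the speedup equals the stage count.

```python
# Speedup from pipelining a machine with 8 equal 20 ns stages.
stages = [20] * 8   # ns

nonpipelined = sum(stages)   # time between instructions without pipelining
pipelined = max(stages)      # pipelined clock must fit the slowest stage

print(nonpipelined, pipelined, nonpipelined // pipelined)   # 160 20 8
```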
Example 2
Consider a nonpipelined machine with 10 execution stages of lengths 10, 20, 20, 30, 10, 10, 50, 45, 20, 10 ns.
The time between two instructions on this machine:
10+20+20+30+10+10+50+45+20+10 = 225 ns
Suppose we introduce pipelining on this machine.
The time between two instructions = 50 ns (the clock cycle must allow for the slowest stage).
Speedup = 225 / 50 = 4.5
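Example 2 as code (a sketch, not from the slides): unbalanced stages limit the speedup, because the clock must fit the slowest stage.

```python
# Speedup with unbalanced stages; the clock is set by the slowest stage.
stages = [10, 20, 20, 30, 10, 10, 50, 45, 20, 10]   # ns

nonpipelined = sum(stages)   # 225 ns
pipelined = max(stages)      # 50 ns, the slowest stage
print(nonpipelined / pipelined)   # 4.5 (well below the 10-stage ideal of 10)
```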
Pipeline Speedup
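The formula for this slide did not survive the conversion; what follows is a reconstruction of the standard textbook relation that the surrounding examples rely on, assuming balanced stages and no stalls:

```latex
\text{Time between instructions}_{\text{pipelined}}
  = \frac{\text{Time between instructions}_{\text{nonpipelined}}}{\text{Number of pipe stages}}
```

Under these ideal conditions the speedup equals the number of stages; unbalanced stages (Example 2) or stalls make the actual speedup smaller.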
Example 3
In a nonpipelined machine, the time between instructions is 200 ns. If we use pipelining with four balanced stages, what is the time between instructions after pipelining?
Time between instructions after pipelining = 200 / 4 = 50 ns
Notice: speedup = 200 / 50 = 4 = number of pipeline stages
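Example 3 as code (a sketch, not from the slides), showing that with perfectly balanced stages the ideal speedup equals the stage count:

```python
# Ideal pipelining with balanced stages divides instruction time by the stage count.
time_nonpipelined = 200   # ns
num_stages = 4

time_pipelined = time_nonpipelined / num_stages
print(time_pipelined, time_nonpipelined / time_pipelined)   # 50.0 4.0
```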
Pipelining and ISA Design
MIPS ISA designed for pipelining
All instructions are 32 bits
Easier to fetch and decode in one cycle
cf. x86: 1- to 17-byte instructions
Few and regular instruction formats
Can decode and read registers in one step
Load/store addressing
Can calculate address in 3rd stage, access memory
in 4th stage
Alignment of memory operands
Memory access takes only one cycle
Pipelining Hazards
There are situations in pipelining when the next instruction cannot execute in the following clock cycle. These events are called hazards.
In other words, any condition that causes a pipeline to stall is called a hazard.
There are three types of hazards:
Structural hazards:
A required resource is busy
Data hazards:
Need to wait for previous instruction to complete its data read/write
Control hazards:
Deciding on control action depends on previous instruction
Structural Hazards
Due to a conflict for use of a resource: a required resource is busy
e.g., using a combined washer-dryer
Assume a MIPS pipeline with a single memory:
Load/store requires data access
Instruction fetch would have to stall (wait) for that cycle
Would cause a pipeline "bubble"
Hence, pipelined datapaths require separate instruction/data memories
Data Hazards
Data hazards arise from the dependence of one instruction on an earlier one that is still in the pipeline; we need to stall (wait) for the previous instruction to complete its data read/write.

add $s0, $t0, $t1
sub $t2, $s0, $t3

The value of $s0 is written back by add only in its WB stage, but sub needs it when it reads registers; since the value is not yet available in that stage, sub must wait until it becomes available.
Data Hazards: Example
An instruction depends on completion of data access by a previous instruction:

add $s0, $t0, $t1
sub $t2, $s0, $t3

To resolve this hazard:
(1) Wait (stall) until the hazard is resolved, but this impacts CPI: the result is written only in the fifth (write-back) stage.
Data Hazards: (2) Forwarding (Bypassing)
Use result when it is computed
Don’t wait for it to be stored in a register
Requires extra connections in the datapath (Hardware)
Valid only if the destination stage is later in time than
the source stage
Can’t prevent all pipeline stalls
Load-Use Data Hazard
Can’t always avoid stalls by forwarding
If value not computed when needed
Can’t forward backward in time!
Load-Use Data Hazard
So we have to stall one cycle for a load-use data hazard.
(3) Code Scheduling to Avoid Stalls
Reorder code to avoid use of a load result in the next instruction (a software solution)
C code for A = B + E; C = B + F;

      lw  $t1, 0($t0)
      lw  $t2, 4($t0)
stall
      add $t3, $t1, $t2    (forwarding is adopted here)
      sw  $t3, 12($t0)
      lw  $t4, 8($t0)
stall
      add $t5, $t1, $t4
      sw  $t5, 16($t0)

13 cycles
(3) Code Scheduling to Avoid Stalls
Reorder code to avoid use of a load result in the next instruction (a software solution)
C code for A = B + E; C = B + F; (forwarding is adopted here)

Original (13 cycles):           Reordered (11 cycles):
      lw  $t1, 0($t0)               lw  $t1, 0($t0)
      lw  $t2, 4($t0)               lw  $t2, 4($t0)
stall                               lw  $t4, 8($t0)
      add $t3, $t1, $t2             add $t3, $t1, $t2
      sw  $t3, 12($t0)              sw  $t3, 12($t0)
      lw  $t4, 8($t0)               add $t5, $t1, $t4
stall                               sw  $t5, 16($t0)
      add $t5, $t1, $t4
      sw  $t5, 16($t0)
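The cycle counts can be checked with a small model. This is my own sketch, not from the slides; it assumes a 5-stage pipeline with forwarding, where the only stall is one bubble when a load is immediately followed by an instruction that uses its result:

```python
# Count cycles for the two schedules above (load-use hazards cost 1 bubble each).

def cycles(program):
    """program: list of (dest_reg_or_None, set_of_source_regs, is_load)."""
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        dest, _, is_load = prev
        _, sources, _ = cur
        if is_load and dest in sources:   # load immediately followed by a use
            stalls += 1
    n = len(program)
    return n + 4 + stalls                 # n instructions + 4 fill cycles + stalls

original = [
    ("$t1", {"$t0"}, True),           # lw  $t1, 0($t0)
    ("$t2", {"$t0"}, True),           # lw  $t2, 4($t0)
    ("$t3", {"$t1", "$t2"}, False),   # add $t3, $t1, $t2  (uses $t2 right after its load)
    (None,  {"$t3", "$t0"}, False),   # sw  $t3, 12($t0)
    ("$t4", {"$t0"}, True),           # lw  $t4, 8($t0)
    ("$t5", {"$t1", "$t4"}, False),   # add $t5, $t1, $t4  (uses $t4 right after its load)
    (None,  {"$t5", "$t0"}, False),   # sw  $t5, 16($t0)
]

reordered = [
    ("$t1", {"$t0"}, True),
    ("$t2", {"$t0"}, True),
    ("$t4", {"$t0"}, True),           # lw $t4 moved up, separating both load-use pairs
    ("$t3", {"$t1", "$t2"}, False),
    (None,  {"$t3", "$t0"}, False),
    ("$t5", {"$t1", "$t4"}, False),
    (None,  {"$t5", "$t0"}, False),
]

print(cycles(original), cycles(reordered))   # 13 11
```

Moving the third lw up removes both stalls, matching the slide's 13-cycle vs. 11-cycle counts.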
Control Hazards
Also called branch hazards, because control hazards are due to branch instructions.
A branch determines the flow of control:
Fetching the next instruction depends on the branch outcome
The pipeline can't always fetch the correct instruction
A control hazard occurs when the proper instruction was not fetched.
What is the solution?
Control Hazards: (1) Stall on Branch
Wait until the branch outcome is determined before fetching the next instruction.
This means we have to wait until stage 4.
Advantage: simple for both software and hardware.
Stall 3 cycles
Control Hazards: (2) Adding Extra Hardware
Let's assume we put enough extra hardware into the second pipeline stage (ID) so that we can:
test registers (comparator)
calculate the branch address (adder)
update the PC
Even with this extra hardware, we have to wait until stage 2.
Stall 1 cycle
Control Hazards: (3) Branch Prediction
Longer pipelines can’t determine branch
outcome early
Stall penalty becomes unacceptable
Solution: Predict outcome of branch
Only stall if prediction is wrong
In the MIPS pipeline:
Can predict branches as not taken
Fetch the instruction after the branch, with no delay
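The cost of predict-not-taken can be sketched as an average-CPI calculation. The numbers below are my own assumptions, not from the slides; only the structure of the formula matters:

```python
# Effect of predict-not-taken on average CPI (all numbers are assumed).
base_cpi = 1.0
branch_fraction = 0.2   # assumed fraction of instructions that are branches
taken_fraction = 0.6    # assumed fraction of branches that are taken (mispredicted)
penalty = 1             # assumed cycles lost per misprediction (branch resolved in ID)

cpi = base_cpi + branch_fraction * taken_fraction * penalty
print(cpi)   # 1.12
```

A stall is paid only for mispredicted (taken) branches, so a longer pipeline or a higher taken fraction raises the average CPI.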
MIPS with Predict Not Taken
Prediction correct, i.e., the branch is not taken
Prediction incorrect, i.e., the branch is taken
Check Yourself (HW)
[Link]
[Link]
Pipeline Summary
Pipelining improves performance by
increasing instruction throughput
Executes multiple instructions in parallel
Each instruction has the same latency
Pipeline hazards
Structural, data, control
Instruction set design affects complexity of
pipeline implementation