Principles of Pipelining in Processors

Uploaded by

Parth Lakhera
Copyright
© © All Rights Reserved

PowerPoint Slides

Processor Design
The Language of Bits

Computer Organisation and Architecture


Smruti Ranjan Sarangi,
IIT Delhi

Chapter 9 Principles of Pipelining

PROPRIETARY MATERIAL. © 2014 The McGraw-Hill Companies, Inc. All rights reserved. No part of this PowerPoint slide may be displayed, reproduced or distributed in any form or by any means, without the
prior written permission of the publisher, or used beyond the limited distribution to teachers and educators permitted by McGraw-Hill for their individual course preparation. PowerPoint Slides are being provided only to
authorized professors and instructors for use in preparing for classes using the affiliated textbook. No other use or distribution of this PowerPoint slide is permitted. The PowerPoint slide may not be sold and may not be
distributed or be used by any student or any other third party. No part of the slide may be reproduced, displayed or distributed in any form or by any means, electronic or otherwise, without the prior written permission
of McGraw Hill Education (India) Private Limited.
2nd version

[Link]
Download the pdf of the book

videos

Slides, software, solution manual

Print version (Publisher: WhiteFalcon, 2021)
Available on e-commerce sites.
The pdf version of the book and all the learning resources can be freely downloaded from the website: [Link]
Outline
 Overview of Pipelining
 A Pipelined Data Path
 Pipeline Hazards
 Pipeline with Interlocks
 Forwarding
 Performance Metrics
 Interrupts/Exceptions

Up till now ….
 Single-cycle implementation: We have designed a
processor that can execute all the SimpleRisc
instructions in a single clock-cycle. Performance:
 Longest delay determines clock period
 Critical path: LD (load instruction)
 All 5 stages exercised for it: IF → OF → EX → MA → RW
 Not feasible to vary period for different instructions
 Violates design principle: Make the common case fast
 We will improve performance by pipelining

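The point that the longest delay fixes the clock period can be illustrated with a small calculation. This is only a sketch: the stage delays and the assumption that an add does no work in MA are made-up illustrative values, not figures from the slides.

```python
# Hypothetical stage delays in nanoseconds (illustrative, not from the slides).
STAGE_DELAY_NS = {"IF": 2.0, "OF": 1.0, "EX": 2.0, "MA": 2.0, "RW": 1.0}

# Hypothetical mapping from instruction type to the stages that do real work.
STAGES_USED = {
    "ld": ["IF", "OF", "EX", "MA", "RW"],  # the load exercises all 5 stages
    "add": ["IF", "OF", "EX", "RW"],       # assumed: no memory access needed
}


def instruction_delay(kind):
    """Total delay of one instruction type through the stages it uses."""
    return sum(STAGE_DELAY_NS[stage] for stage in STAGES_USED[kind])


def single_cycle_clock_period():
    """The clock period must cover the slowest instruction (the critical path)."""
    return max(instruction_delay(kind) for kind in STAGES_USED)
```

With these numbers the load needs 8 ns and the add 6 ns, so every instruction, including the add, is charged the load's 8 ns clock period.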
Designing Efficient Processors

 Microprogrammed processors are much slower than hardwired processors
 Even hardwired processors have a lot of waste
 We have 5 stages.
 What is the IF stage doing when the MA stage is active?
 ANSWER: It is idling

The Notion of Pipelining

 Let us go back to the car assembly line
 Is the engine shop idle when the paint shop is painting a car?
 NO: It is building the engine of another car
 When this engine goes to the body shop, the engine shop builds the engine of another car, and so on
 Insight:
 Multiple cars are built at the same time.
 A car proceeds from one stage to the next

§4.6 An Overview of Pipelining
Pipelining Analogy (Laundry Example)
 Pipelined laundry: overlapping execution
 Parallelism improves performance

A, B, C, and D each have dirty clothes to be washed, dried, folded, and put away. The washer, dryer, “folder,” and “storer” each take 30 minutes for their task. Sequential laundry takes 8 hours for four loads of washing, while pipelined laundry takes just 3.5 hours.

 4 people:
 Speedup = 8/3.5 ≈ 2.3
 N people non-stop:
 Speedup = 4*0.5*N/(1.5 + 0.5*N) ≈ 4 as N → ∞
 4 = number of stages
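The laundry numbers can be checked with a few lines of Python. The function names are hypothetical. The 4 stages and 30-minute stage time come from the example above.

```python
STAGES = 4         # washer, dryer, folder, storer
STAGE_HOURS = 0.5  # each task takes 30 minutes


def sequential_hours(loads):
    """Each load runs through all stages before the next load starts."""
    return loads * STAGES * STAGE_HOURS


def pipelined_hours(loads):
    """The first load takes all stages; each later load finishes one stage time after."""
    return STAGES * STAGE_HOURS + (loads - 1) * STAGE_HOURS


def speedup(loads):
    return sequential_hours(loads) / pipelined_hours(loads)
```

For 4 loads this gives 8 hours against 3.5 hours, and for very large load counts the ratio approaches 4, the number of stages.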
Pipelined Processors
[Figure: five instructions in flight, with inst 5, inst 4, inst 3, inst 2, and inst 1 in the IF, OF, EX, MA, and RW stages respectively]

Instruction Fetch (IF) → Operand Fetch (OF) → Execute (EX) → Memory Access (MA) → Register Write (RW)

 The IF, OF, EX, MA, and RW stages process 5 instructions simultaneously
 Each instruction proceeds from one stage to the next
 This is known as pipelining

Advantages of Pipelining

 We keep all parts of the data path busy all the time
 Let us assume that all the 5 stages do the same amount of work
 Without pipelining, an instruction completes its execution every T seconds
 With pipelining, a new instruction completes its execution every T/5 seconds

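The T versus T/5 claim can be sketched numerically. This assumes ideal conditions (equal stages, no latch overhead, no hazards), and the function names are hypothetical.

```python
def unpipelined_time(n, T):
    """n instructions, each taking a full T seconds."""
    return n * T


def pipelined_time(n, T, stages=5):
    """The first instruction needs `stages` cycles of T/stages each;
    every later instruction completes one cycle after its predecessor."""
    return (stages + n - 1) * (T / stages)
```

For 100 instructions with T = 10 this gives 1000 without pipelining and 208 with it. A single instruction still takes the full T, so pipelining improves throughput, not the latency of one instruction.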
Design of a Pipeline
 Splitting the Data Path
 We divide the data path into 5 parts: IF, OF, EX, MA, and RW
 Timing
 We insert latches (registers) between consecutive stages. These are called pipeline registers
 4 latches → IF-OF, OF-EX, EX-MA, and MA-RW
 At the negative edge of the clock, an instruction moves from one stage to the next
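The timing rule can be modelled with a toy simulation. This is a minimal sketch that tracks only which instruction occupies each stage, not the latch contents. The names `clock_edge` and `run` are hypothetical.

```python
STAGES = ["IF", "OF", "EX", "MA", "RW"]


def clock_edge(pipeline, next_instruction=None):
    """On a clock edge every instruction moves one stage forward.
    Returns the instruction that leaves the RW stage (or None)."""
    retired = pipeline[-1]
    for i in range(len(pipeline) - 1, 0, -1):
        pipeline[i] = pipeline[i - 1]  # stage i receives what stage i-1 held
    pipeline[0] = next_instruction
    return retired


def run(program):
    """Feed one instruction per cycle; return (retired order, busy cycles)."""
    pipeline = [None] * len(STAGES)
    pending = list(program)
    retired = []
    cycles = 0  # cycles during which at least one stage holds an instruction
    while True:
        out = clock_edge(pipeline, pending.pop(0) if pending else None)
        if out is not None:
            retired.append(out)
        if all(slot is None for slot in pipeline):
            break
        cycles += 1
    return retired, cycles
```

Running three instructions through this model keeps the pipeline busy for 3 + 4 = 7 cycles and retires them in program order.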
Pipelined Data Path with Latches/Flip-flops

[Figure: the five stages, Instruction Fetch (IF), Operand Fetch (OF), Execute (EX), Memory Access (MA), and Register Write (RW), with a latch between each pair of consecutive stages]

 Add a flip-flop between consecutive stages.
 Triggered by a negative clock edge

The Instruction Packet
 What travels between stages?
 ANSWER: the instruction packet
 Instruction Packet
 Instruction contents
 Program counter
 All intermediate results
 Control signals
 Every instruction moves with its entire state, so there is no interference between instructions
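One way to picture the instruction packet is as a record that is copied from one pipeline register to the next. The sketch below is an illustration only: the field names follow the labels used on the stage slides (pc, A, B, op2, branchTarget, aluResult, ldResult, control), while the class name, the types, and the `execute_add` helper are assumptions.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, Optional


@dataclass
class InstructionPacket:
    instruction: int                                        # instruction contents
    pc: int                                                 # program counter
    control: Dict[str, bool] = field(default_factory=dict)  # control signals
    # Intermediate results, filled in as the packet moves down the pipeline.
    A: Optional[int] = None
    B: Optional[int] = None
    op2: Optional[int] = None
    branchTarget: Optional[int] = None
    aluResult: Optional[int] = None
    ldResult: Optional[int] = None


def execute_add(packet):
    """EX-stage sketch for an add: returns a new packet with aluResult set."""
    return replace(packet, aluResult=packet.A + packet.B)
```

Because each stage produces a fresh packet rather than writing shared state, one instruction's intermediate results cannot overwrite another's.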
IF Stage

[Figure: IF stage; the fetched instruction is written to the IF/OF register]

 Instruction contents are saved in the instruction field

Non-pipelined version, for reference

OF Stage

[Figure: OF stage with the control unit and the immediate and branch target unit; labelled outputs: branchTarget, op2, instruction, control]

Note: The op2/immx mux is pulled into the OF stage in the pipelined design. This is likely because the native op2 (i.e., not muxed with immx) is needed in the MA stage. That is probably why the native op2 is also in the OF-EX pipeline register.

 A, B → ALU operands, op2 (store operand), control (set of all control signals)

EX Stage

[Figure: EX stage. OF-EX register fields: pc, branchTarget, B, A, op2, instruction, control. The ALU (aluSignals) produces flags; the branch unit (isBeq, isBgt, isUBranch, isRet, flags) produces branchPC and isBranchTaken. EX-MA register fields: pc, aluResult, op2, instruction, control]

 aluResult → result of the ALU operation
 op2, control, pc, instruction (passed from OF-EX)

MA Stage

[Figure: MA stage. EX-MA register fields: pc, aluResult, op2, instruction, control. The memory unit (mar, mdr, isLd, isSt) accesses the data memory. MA-RW register fields: pc, ldResult, aluResult, instruction, control]

 ldResult → result of the load operation
 aluResult, control, pc, instruction (passed from EX-MA)

RW Stage

[Figure: RW stage. MA-RW register fields: pc, ldResult, aluResult, instruction, control. Labels shown: constant 4, mux select values 10/01/00, isLd, isCall, isWb, rd, ra(15), and the register file ports (enable, address, data)]

[Figure (DRAFT, © Smruti R. Sarangi <srsarangi@[Link]>): complete pipelined data path. IF: pc, pc + 4, instruction memory. IF-OF register: pc, instruction. OF: register file (rd, rs2, ra(15), rs1; isSt, isRet, isWb), control unit, immediate and branch target unit (immx, isImmediate), op1, op2. OF-EX register: pc, branchTarget, B, A, op2, instruction, control. EX: ALU (aluSignals, flags), branch unit (isBeq, isBgt, isUBranch, isRet, isBranchTaken). EX-MA register: pc, aluResult, op2, instruction, control. MA: data memory, memory unit (mar, mdr, isLd, isSt). MA-RW register: pc, ldResult, aluResult, instruction, control. RW: constant 4, mux select values 10/01/00, isLd, isCall, isWb, rd, ra(15), data]

Abridged Diagram

[Figure: abridged pipeline with the pipeline registers IF-OF, OF-EX, EX-MA, and MA-RW. Units shown: fetch unit, instruction memory, control unit, immediate and branch unit, register file (op1, op2), ALU unit, branch unit, flags, memory unit, data memory, register write unit]

Pipeline Hazards
 Now, let us consider correctness
 Let us introduce a new tool → Pipeline Diagram

[1]: add r1, r2, r3
[2]: sub r4, r5, r6
[3]: mul r8, r9, r10

Clock cycles   1  2  3  4  5  6  7  8  9
IF             1  2  3
OF                1  2  3
EX                   1  2  3
MA                      1  2  3
RW                         1  2  3

Rules for Constructing a Pipeline Diagram
 It has 5 rows
 One per stage
 The rows are named: IF, OF, EX, MA, and RW
 Each column represents a clock cycle
 Each cell represents the execution of an instruction in a stage
 It is annotated with the name (label) of the instruction
 Instructions proceed from one stage to the next across clock cycles
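The rules above translate directly into a small generator. This is a sketch for an ideal pipeline with no stalls, in which instruction k is assumed to enter IF in cycle k. The names `pipeline_diagram` and `render` are hypothetical.

```python
STAGES = ["IF", "OF", "EX", "MA", "RW"]


def pipeline_diagram(num_instructions):
    """Return {stage: {cycle: label}} for an ideal pipeline with no stalls."""
    diagram = {stage: {} for stage in STAGES}
    for label in range(1, num_instructions + 1):
        for depth, stage in enumerate(STAGES):
            # Instruction `label` enters IF in cycle `label` and then
            # advances by one stage per clock cycle.
            diagram[stage][label + depth] = label
    return diagram


def render(diagram, num_cycles=9):
    """Render one row per stage and one column per clock cycle."""
    lines = ["   " + " ".join(str(c) for c in range(1, num_cycles + 1))]
    for stage in STAGES:
        cells = [str(diagram[stage].get(c, " ")) for c in range(1, num_cycles + 1)]
        lines.append(stage + " " + " ".join(cells).rstrip())
    return "\n".join(lines)
```

Calling `print(render(pipeline_diagram(3)))` produces a five-row diagram in which each instruction label shifts one column to the right per row.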
Example

[1]: add r1, r2, r3
[2]: sub r4, r2, r5
[3]: mul r5, r8, r9

Clock cycles   1  2  3  4  5  6  7  8  9
IF             1  2  3
OF                1  2  3
EX                   1  2  3
MA                      1  2  3
RW                         1  2  3

Data Hazards

[1]: add r1, r2, r3
[2]: sub r3, r1, r4

Clock cycles   1  2  3  4  5  6  7  8  9
IF             1  2
OF                1  2
EX                   1  2
MA                      1  2
RW                         1  2

 Instruction 2 will read incorrect values!

Data Hazard
Definition: A hazard is defined as the possibility of erroneous execution of an instruction in a pipeline. A data hazard represents the possibility of erroneous execution because of the unavailability of data, or the availability of incorrect data.

 This situation represents a data hazard
 Specifically, it is a RAW (read after write) hazard
 The earliest we can dispatch instruction 2 is cycle 5
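The cycle-5 figure can be reproduced with a short calculation. Two assumptions are made here and are not stated on the slide: "dispatch" is read as the cycle in which instruction 2 enters IF, and a register written in RW in cycle c is taken to be readable in OF from cycle c + 1 onwards. The function names are hypothetical.

```python
STAGES = ["IF", "OF", "EX", "MA", "RW"]


def earliest_fetch_cycle(producer_fetch_cycle):
    """Earliest IF cycle for a consumer that reads the producer's result."""
    write_cycle = producer_fetch_cycle + STAGES.index("RW")
    # Assumption: a value written in RW in cycle c is first readable
    # by the OF stage in cycle c + 1.
    read_cycle = write_cycle + 1
    return read_cycle - STAGES.index("OF")


def stall_cycles(producer_fetch_cycle):
    """Delay relative to fetching the consumer right behind the producer."""
    return earliest_fetch_cycle(producer_fetch_cycle) - (producer_fetch_cycle + 1)
```

With instruction 1 fetched in cycle 1, it writes r1 in cycle 5, so instruction 2 can read r1 in OF in cycle 6 and must therefore be fetched no earlier than cycle 5, three cycles later than in the hazard-free diagram.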
Other Types of Data Hazards
 Our pipeline is in-order
Definition: In an in-order pipeline (such as ours), a preceding instruction is always ahead of a succeeding instruction in the pipeline. Modern processors, however, use out-of-order pipelines that break this rule. It is possible for later instructions to execute before earlier instructions.

 We will only have RAW hazards in our pipeline.
 Out-of-order pipelines can have WAR and WAW hazards

WAW Hazards

[1]: add r1, r2, r3
[2]: sub r1, r4, r3

 Instruction [2] must not write the value of r1 before instruction [1] writes to it. Otherwise, there is a WAW hazard: r1 ends up with an unintended value

WAR Hazards

[1]: add r1, r2, r3
[2]: add r2, r5, r6

 Instruction [2] must not write the value of r2 before instruction [1] reads it. Otherwise, there is a WAR hazard: instruction [1] reads an incorrect value of r2
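The three kinds of register dependence can be detected mechanically from the destination and source registers of two instructions. The sketch below assumes the three-register format used in the examples (destination first, then two sources). `parse` and `hazards` are hypothetical helper names. It reports potential hazards only: as noted earlier, an in-order pipeline such as ours is exposed only to the RAW case.

```python
def parse(text):
    """Split 'add r1, r2, r3' into (destination, [sources])."""
    _, operands = text.split(None, 1)
    regs = [r.strip() for r in operands.split(",")]
    return regs[0], regs[1:]


def hazards(first, second):
    """Potential data hazards between an earlier and a later instruction."""
    dst1, src1 = parse(first)
    dst2, src2 = parse(second)
    found = set()
    if dst1 in src2:
        found.add("RAW")  # the later instruction reads what the earlier one writes
    if dst2 in src1:
        found.add("WAR")  # the later instruction writes what the earlier one reads
    if dst1 == dst2:
        found.add("WAW")  # both instructions write the same register
    return found
```

Applied to the two examples above, `hazards("add r1, r2, r3", "sub r1, r4, r3")` reports WAW and `hazards("add r1, r2, r3", "add r2, r5, r6")` reports WAR.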
Control Hazards

[1]: beq .foo
[2]: mov r1, 4
[3]: add r2, r4, r3
...
...
.foo:
[100]: add r4, r1, r2

 If the branch is taken, instructions [2] and [3] might get fetched incorrectly
Control Hazard – Pipeline Diagram

[1]: beq .foo
[2]: mov r1, 4
[3]: add r2, r4, r3

Clock cycles   1  2  3  4  5  6  7  8  9
IF             1  2  3
OF                1  2  3
EX                   1  2  3
MA                      1  2  3
RW                         1  2  3

 The two instructions fetched immediately after a branch instruction might have been fetched incorrectly.
Control Hazards
 The two instructions fetched immediately
after a branch instruction might have
been fetched incorrectly.
 These instructions are said to be on the
wrong path
 A control hazard represents the possibility of
erroneous execution in a pipeline because
instructions in the wrong path of a branch can
possibly get executed and save their results in
memory, or in the register file
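The count of two wrong-path instructions follows from where the branch is resolved. The sketch below assumes that the branch outcome becomes known in the EX stage (where the branch unit sits in the data path shown earlier) and that one instruction is fetched per cycle. `wrong_path_labels` is a hypothetical name.

```python
STAGES = ["IF", "OF", "EX", "MA", "RW"]


def wrong_path_labels(branch_label, resolve_stage="EX"):
    """Labels of instructions fetched after the branch and before its outcome is known."""
    # One new instruction is fetched per cycle, so the count equals the
    # number of stages the branch passes through before it resolves.
    count = STAGES.index(resolve_stage)
    return [branch_label + i for i in range(1, count + 1)]
```

For the branch labelled [1], this yields [2] and [3], the two instructions fetched while the branch is in the OF and EX stages.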
Structural Hazards

 A structural hazard may occur when two instructions have a conflict on the same set of resources in a cycle
 Example:
 Assume that we have an add instruction that can read one operand from memory
 add r1, r2, 10[r3]

Structural Hazards - II

[1]: st r4, 20[r5]
[2]: sub r8, r9, r10
[3]: add r1, r2, 10[r3]

 This code will have a structural hazard
 [3] tries to read 10[r3] (MA unit) in cycle 4
 [1] tries to write to 20[r5] (MA unit) in cycle 4
 This does not happen in our SimpleRisc pipeline
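The cycle-4 conflict can be found by listing, for each instruction, the cycles in which it needs the memory unit. This is a sketch under one assumption that the slide implies but does not state: the hypothetical add reads its memory operand during its OF stage. The function name and the tuple layout are illustrative.

```python
from collections import defaultdict

STAGES = ["IF", "OF", "EX", "MA", "RW"]


def memory_unit_conflicts(program):
    """program: list of (label, fetch_cycle, stages_that_access_memory).
    Returns {cycle: [labels]} for cycles in which more than one
    instruction needs the memory unit."""
    users = defaultdict(list)
    for label, fetch_cycle, mem_stages in program:
        for stage in mem_stages:
            users[fetch_cycle + STAGES.index(stage)].append(label)
    return {cycle: labels for cycle, labels in users.items() if len(labels) > 1}


# [1] st accesses memory in MA, [2] sub never does,
# [3] add with a memory operand is assumed to access memory in OF.
EXAMPLE = [(1, 1, ["MA"]), (2, 2, []), (3, 3, ["OF"])]
```

`memory_unit_conflicts(EXAMPLE)` reports that instructions [1] and [3] both need the memory unit in cycle 4.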
