
Cse431 06

IIT Mandi Computer Architecture lecture 6 PPT

Uploaded by

syntaxajju
Copyright
© All Rights Reserved

CSE 431

Computer Architecture
Fall 2005

Lecture 06: Basic MIPS Pipelining Review

Mary Jane Irwin (www.cse.psu.edu/~mji)
www.cse.psu.edu/~cg431

[Adapted from Computer Organization and Design, Patterson & Hennessy, © 2005, UCB]

CSE431 L06 Basic MIPS Pipelining.1 Irwin, PSU, 2005


Review: Single Cycle vs. Multiple Cycle Timing
Single Cycle Implementation:
[Timing diagram: Clk with Cycle 1 and Cycle 2. lw fills Cycle 1; sw finishes partway through Cycle 2 and the rest of that cycle is waste.]

Multiple Cycle Implementation:
[Timing diagram: Clk with Cycle 1 through Cycle 10. lw takes IFetch, Dec, Exec, Mem, WB; sw takes IFetch, Dec, Exec, Mem; the R-type instruction starts with IFetch in Cycle 10.]

Note: the multicycle clock is slower than 1/5th of the single cycle clock due to stage register overhead.



How Can We Make It Even Faster?
• Split the multiple instruction cycle into smaller and smaller steps
  ◦ There is a point of diminishing returns where as much time is spent loading the state registers as doing the work

• Start fetching and executing the next instruction before the current one has completed
  ◦ Pipelining – (all?) modern processors are pipelined for performance
  ◦ Remember the performance equation: CPU time = CPI * CC * IC

• Fetch (and execute) more than one instruction at a time
  ◦ Superscalar processing – stay tuned
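
The performance equation can be tried out with a few lines of Python. This is only a sketch: the instruction count, CPI values, and clock cycle times below are made-up numbers chosen for illustration (they are not from the lecture), and `cpu_time` is a hypothetical helper name.

```python
def cpu_time(cpi, cc_ns, ic):
    """CPU time = CPI * CC * IC, with the clock cycle CC given in nanoseconds."""
    return cpi * cc_ns * ic


# Hypothetical numbers, chosen only to illustrate the trade-off.
IC = 1_000_000  # instructions executed

designs = {
    # name: (CPI, clock cycle in ns)
    "single cycle": (1.0, 10.0),    # one long cycle per instruction
    "multiple cycle": (4.0, 2.5),   # short cycle, several cycles per instruction
    "pipelined": (1.0, 2.5),        # short cycle and, ideally, a CPI of 1
}

times = {name: cpu_time(cpi, cc, IC) for name, (cpi, cc) in designs.items()}

for name, t in times.items():
    print(f"{name:>14}: {t / 1e6:.1f} ms")
```

With these assumed numbers the single cycle and multiple cycle designs tie, and the pipelined design wins because it keeps the short clock cycle while bringing the CPI back toward 1.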



A Pipelined MIPS Processor
• Start the next instruction before the current one has completed
  ◦ improves throughput - total amount of work done in a given time
  ◦ instruction latency (execution time, delay time, response time - time from the start of an instruction to its completion) is not reduced

[Pipeline diagram, Cycle 1 through Cycle 8: lw, sw, and an R-type instruction each pass through IFetch, Dec, Exec, Mem, WB, with each instruction starting one cycle after the previous one.]

- clock cycle (pipeline stage time) is limited by the slowest stage
- for some instructions, some stages are wasted cycles
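
The overlap of instructions can be tabulated with a small script. This is a sketch under the assumption of an ideal five-stage pipeline with no hazards; `schedule` and `pipelined_cycles` are hypothetical helper names, not part of the lecture material.

```python
STAGES = ("IFetch", "Dec", "Exec", "Mem", "WB")


def schedule(instructions, stages=STAGES):
    """Map each instruction to {cycle: stage}, issuing one instruction per cycle."""
    return {
        name: {issue + s + 1: stage for s, stage in enumerate(stages)}
        for issue, name in enumerate(instructions)
    }


def pipelined_cycles(n_instructions, n_stages=len(STAGES)):
    """Cycles needed to finish n instructions when no stalls occur."""
    if n_instructions == 0:
        return 0
    return n_stages + n_instructions - 1


table = schedule(["lw", "sw", "R-type"])
for name, row in table.items():
    print(name, row)

print(pipelined_cycles(3))  # 7
```

Each instruction still spends five cycles in the pipeline (latency is unchanged), but the three instructions finish in 7 cycles rather than 15, which is the throughput gain.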



Single Cycle, Multiple Cycle, vs. Pipeline
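Single Cycle Implementation:
[Timing diagram: Clk with Cycle 1 and Cycle 2. lw fills Cycle 1; sw finishes partway through Cycle 2 and the rest of that cycle is waste.]

Multiple Cycle Implementation:
[Timing diagram: Clk with Cycle 1 through Cycle 10. lw takes IFetch, Dec, Exec, Mem, WB; sw takes IFetch, Dec, Exec, Mem; the R-type instruction starts with IFetch in Cycle 10.]

Pipeline Implementation:
[Timing diagram: lw, sw, and the R-type instruction each pass through IFetch, Dec, Exec, Mem, WB, with each instruction starting one cycle after the previous one.]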


MIPS Pipeline Datapath Modifications
• What do we need to add/modify in our MIPS datapath?
  ◦ State registers between each pipeline stage to isolate them

[Datapath figure: five stages labeled IF:IFetch, ID:Dec, EX:Execute, MEM:MemAccess, WB:WriteBack, separated by the state registers IFetch/Dec, Dec/Exec, Exec/Mem, and Mem/WB. Components shown: PC, Instruction Memory, an adder that adds 4, Register File, Sign Extend (16 to 32 bits), Shift left 2 feeding a second adder, ALU, and Data Memory, all driven by the System Clock.]

Pipelining the MIPS ISA
• What makes it easy
  ◦ all instructions are the same length (32 bits)
    - can fetch in the 1st stage and decode in the 2nd stage
  ◦ few instruction formats (three) with symmetry across formats
    - can begin reading register file in 2nd stage
  ◦ memory operations can occur only in loads and stores
    - can use the execute stage to calculate memory addresses
  ◦ each MIPS instruction writes at most one result (i.e., changes the machine state) and does so near the end of the pipeline (MEM and WB)
• What makes it hard
  ◦ structural hazards: what if we had only one memory?
  ◦ control hazards: what about branches?
  ◦ data hazards: what if an instruction’s input operands depend on the output of a previous instruction?

Graphically Representing MIPS Pipeline

[Figure: one instruction drawn as the row of resources it uses in successive cycles: IM, Reg, ALU, DM, Reg.]

• Can help with answering questions like:
  ◦ How many cycles does it take to execute this code?
  ◦ What is the ALU doing during cycle 4?
  ◦ Is there a hazard, why does it occur, and how can it be fixed?



Why Pipeline? For Performance!
[Pipeline diagram: Time (clock cycles) on the horizontal axis, instruction order (Inst 0 through Inst 4) on the vertical axis. Each instruction passes through IM, Reg, ALU, DM, Reg, starting one cycle after the previous one. The initial cycles are labeled "Time to fill the pipeline".]

Once the pipeline is full, one instruction is completed every cycle, so CPI = 1



Can Pipelining Get Us Into Trouble?
• Yes: Pipeline Hazards
  ◦ structural hazards: attempt to use the same resource by two different instructions at the same time
  ◦ data hazards: attempt to use data before it is ready
    - An instruction’s source operand(s) are produced by a prior instruction still in the pipeline
  ◦ control hazards: attempt to make a decision about program control flow before the condition has been evaluated and the new PC target address calculated
    - branch instructions

• Can always resolve hazards by waiting
  ◦ pipeline control must detect the hazard
  ◦ and take action to resolve hazards



A Single Memory Would Be a Structural Hazard
[Pipeline diagram: Time (clock cycles) on the horizontal axis, instruction order (lw, Inst 1, Inst 2, Inst 3, Inst 4) on the vertical axis. With a single memory every instruction uses Mem, Reg, ALU, Mem, Reg. In the cycle where lw is reading data from memory, a later instruction is reading its instruction from the same memory.]

• Fix with separate instr and data memories (I$ and D$)
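
The conflict can be located mechanically. The sketch below assumes one instruction is issued per cycle, that the fetch happens in an instruction's first cycle and the data access in its fourth, and that only instructions flagged as memory operations touch memory for data. `memory_conflicts` is a hypothetical helper name.

```python
def memory_conflicts(instructions, fetch_stage=0, mem_stage=3):
    """Return (cycle, fetching_instr, data_instr) tuples where a single memory
    would be needed for an instruction fetch and a data access in the same cycle.

    instructions: list of (name, accesses_data_memory) pairs, issued one per cycle.
    """
    conflicts = []
    for i, (name, uses_data_mem) in enumerate(instructions):
        if not uses_data_mem:
            continue
        mem_cycle = i + mem_stage + 1      # 1-based cycle of the data access
        j = mem_cycle - 1 - fetch_stage    # index of the instruction fetched then
        if j < len(instructions):
            conflicts.append((mem_cycle, instructions[j][0], name))
    return conflicts


program = [("lw", True), ("Inst 1", False), ("Inst 2", False),
           ("Inst 3", False), ("Inst 4", False)]
print(memory_conflicts(program))  # [(4, 'Inst 3', 'lw')]
```

Under these assumptions the data access of lw and the fetch of Inst 3 both land in cycle 4, which is the clash that separate instruction and data memories remove.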

How About Register File Access?
[Pipeline diagram: Time (clock cycles) on the horizontal axis, instruction order (add $1, / Inst 1 / Inst 2 / add $2,$1,) on the vertical axis. Each instruction passes through IM, Reg, ALU, DM, Reg. One clock edge is labeled as controlling register writing, the other as controlling loading of the pipeline state registers.]

Fix register file access hazard by doing reads in the second half of the cycle and writes in the first half

Register Usage Can Cause Data Hazards
• Dependencies backward in time cause hazards

[Pipeline diagram: add $1, is followed by sub $4,$1,$5; and $6,$1,$7; or $8,$1,$9; and xor $4,$1,$5. Each instruction passes through IM, Reg, ALU, DM, Reg, one cycle behind the previous instruction.]

• Read before write data hazard


Loads Can Cause Data Hazards
• Dependencies backward in time cause hazards

[Pipeline diagram: lw $1,4($2) is followed by sub $4,$1,$5; and $6,$1,$7; or $8,$1,$9; and xor $4,$1,$5. Each instruction passes through IM, Reg, ALU, DM, Reg, one cycle behind the previous instruction.]

• Load-use data hazard


One Way to “Fix” a Data Hazard

[Pipeline diagram: add $1, is followed by two stall cycles; sub $4,$1,$5 and and $6,$1,$7 then enter the pipeline, each passing through IM, Reg, ALU, DM, Reg.]

Can fix data hazard by waiting – stall – but impacts CPI
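
The number of stall cycles can be counted with a short simulation. This is a sketch, not the pipeline control logic itself: it assumes a five-stage pipeline with no forwarding and a register file that is written in the first half of a cycle and read in the second half, so a value written in WB can be read by Dec in the same cycle. `count_stalls` is a hypothetical helper, and the source registers of the add ($2 and $3) are invented because the slide does not show them.

```python
def count_stalls(program):
    """Count stall cycles for a 5-stage pipeline without forwarding.

    program: list of (dest_register, source_registers) tuples in program order.
    """
    WB_OFFSET, DEC_OFFSET = 4, 1   # stage positions, counting IFetch as 0
    ready = {}                      # register -> cycle in which it is written
    fetch_cycle = 1
    total_stalls = 0
    for dest, sources in program:
        dec_cycle = fetch_cycle + DEC_OFFSET
        # Dec may not happen before every source register has been written.
        needed = max([ready.get(r, 0) for r in sources], default=0)
        stalls = max(0, needed - dec_cycle)
        total_stalls += stalls
        fetch_cycle += stalls
        if dest is not None:
            ready[dest] = fetch_cycle + WB_OFFSET
        fetch_cycle += 1            # the next instruction starts one cycle later
    return total_stalls


program = [
    ("$1", ("$2", "$3")),   # add $1,$2,$3  (sources are assumed)
    ("$4", ("$1", "$5")),   # sub $4,$1,$5
    ("$6", ("$1", "$7")),   # and $6,$1,$7
]
print(count_stalls(program))  # 2
```

The two stalls delay the sub until its Dec stage lines up with the WB stage of the add; the and needs no further stall because $1 has been written by then.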



Another Way to “Fix” a Data Hazard
[Pipeline diagram: add $1, is followed by sub $4,$1,$5; and $6,$1,$7; or $8,$1,$9; and xor $4,$1,$5, each passing through IM, Reg, ALU, DM, Reg with no stall cycles.]

Fix data hazards by forwarding results as soon as they are available to where they are needed



Forwarding with Load-use Data Hazards

[Pipeline diagram: lw $1,4($2) is followed by sub $4,$1,$5; and $6,$1,$7; or $8,$1,$9; and xor $4,$1,$5, each passing through IM, Reg, ALU, DM, Reg.]

• Will still need one stall cycle even with forwarding
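
The load-use rule is simple enough to state as a check. The sketch below assumes full forwarding, so the only remaining data stall is an instruction that reads a register loaded by the instruction immediately before it. `stalls_with_forwarding` is a hypothetical helper name, and the tuple format is an illustrative choice.

```python
def stalls_with_forwarding(program):
    """Count load-use stalls, assuming every other data hazard is forwarded.

    program: list of (opcode, dest_register, source_registers) in program order.
    """
    stalls = 0
    prev_op, prev_dest = None, None
    for op, dest, sources in program:
        # The loaded value leaves data memory too late for the next
        # instruction's ALU stage, so one bubble is needed.
        if prev_op == "lw" and prev_dest in sources:
            stalls += 1
        prev_op, prev_dest = op, dest
    return stalls


program = [
    ("lw",  "$1", ("$2",)),        # lw  $1,4($2)
    ("sub", "$4", ("$1", "$5")),   # sub $4,$1,$5
    ("and", "$6", ("$1", "$7")),   # and $6,$1,$7
    ("or",  "$8", ("$1", "$9")),   # or  $8,$1,$9
    ("xor", "$4", ("$1", "$5")),   # xor $4,$1,$5
]
print(stalls_with_forwarding(program))  # 1
```

Only the sub stalls: it is the one instruction that needs $1 in the cycle right after the load's address calculation, before the data has come back from memory.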



Branch Instructions Cause Control Hazards
• Dependencies backward in time cause hazards

[Pipeline diagram: beq is followed by lw, Inst 3, and Inst 4, each passing through IM, Reg, ALU, DM, Reg, one cycle behind the previous instruction.]



One Way to “Fix” a Control Hazard

[Pipeline diagram: beq is followed by three stall cycles; lw and Inst 3 then enter the pipeline.]

Fix branch hazard by waiting – stall – but affects CPI



Corrected Datapath to Save RegWrite Addr
• Need to preserve the destination register address in the pipeline state registers

[Datapath figure: the pipelined datapath with state registers IF/ID, ID/EX, EX/MEM, and MEM/WB. Components shown: PC, Instruction Memory, an adder that adds 4, Register File, Sign Extend (16 to 32 bits), Shift left 2 feeding a second adder, ALU, and Data Memory.]



MIPS Pipeline Control Path Modifications
• All control signals can be determined during Decode
  ◦ and held in the state registers between pipeline stages

[Datapath figure: the pipelined datapath with state registers IF/ID, ID/EX, EX/MEM, and MEM/WB, plus a Control unit. Components shown: PC, Instruction Memory, an adder that adds 4, Register File, Sign Extend (16 to 32 bits), Shift left 2 feeding a second adder, ALU, and Data Memory.]



Other Pipeline Structures Are Possible
• What about the (slow) multiply operation?
  ◦ Make the clock twice as slow or …
  ◦ let it take two cycles (since it doesn’t use the DM stage)

[Pipeline diagram: IM, Reg, ALU, DM, Reg, with MUL drawn above the ALU stage.]

• What if the data memory access is twice as slow as the instruction memory?
  ◦ make the clock twice as slow or …
  ◦ let data memory access take two cycles (and keep the same clock rate)

[Pipeline diagram: IM, Reg, ALU, DM1, DM2, Reg.]



Sample Pipeline Alternatives
• ARM7: IM, Reg, EX
  ◦ IM: PC update, IM access
  ◦ Reg: decode, reg access
  ◦ EX: ALU op, DM access, shift/rotate, commit result (write back)

• StrongARM-1: IM, Reg, ALU, DM, Reg

• XScale: IM1, IM2, Reg, SHFT, ALU, DM1, DM2
  ◦ IM1: PC update, BTB access, start IM access
  ◦ IM2: IM access
  ◦ Reg: decode, reg 1 access
  ◦ SHFT: shift/rotate, reg 2 access
  ◦ ALU: ALU op
  ◦ DM1: start DM access, exception
  ◦ DM2: DM write, reg write



Summary
• All modern day processors use pipelining
• Pipelining doesn’t help latency of single task, it helps throughput of entire workload
• Potential speedup: a CPI of 1 and a fast CC
• Pipeline rate limited by slowest pipeline stage
  ◦ Unbalanced pipe stages make for inefficiencies
  ◦ The time to “fill” pipeline and time to “drain” it can impact speedup for deep pipelines and short code runs
• Must detect and resolve hazards
  ◦ Stalling negatively affects CPI (makes CPI greater than the ideal of 1)
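
The effect of fill time and stalls on CPI can be estimated with one formula. The sketch below assumes a five-stage pipeline in which the first instruction needs four extra cycles to fill the pipe; the instruction and stall counts are made-up numbers, and `effective_cpi` is a hypothetical helper name.

```python
def effective_cpi(n_instructions, stall_cycles, n_stages=5):
    """Average cycles per instruction, counting pipeline fill time and stalls."""
    total_cycles = (n_stages - 1) + n_instructions + stall_cycles
    return total_cycles / n_instructions


print(effective_cpi(5, 0))                 # 1.8, a short run dominated by fill time
print(effective_cpi(1_000_000, 0))         # very close to the ideal of 1
print(effective_cpi(1_000_000, 200_000))   # stalls push the CPI to about 1.2
```

For long runs the fill time is negligible and the CPI is roughly 1 plus the average number of stall cycles per instruction.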



Next Lecture and Reminders
• Next lecture
  ◦ Overcoming data hazards
    - Reading assignment – PH, Chapter 6.4-6.5

• Reminders
  ◦ HW2 due September 29th
  ◦ SimpleScalar tutorials scheduled
    - Thursday, Sept 22, 5:30-6:30 pm in 218 IST
  ◦ Evening midterm exam scheduled
    - Tuesday, October 18th, 20:15 to 22:15, Location 113 IST
    - You should have let me know by now if you have a conflict

