pipe3

Uploaded by

pedro paulo


6.6 Branch Hazards (Control Hazards)

Control hazards are simple to understand and occur much less frequently than data hazards.

They can't be solved as effectively as data hazards are by forwarding.

There are 2 schemes for resolving these hazards and one optimization to improve these schemes.

cs 152 L13.1 DAP Fa97, U.CB
a) Assume Branch not Taken

Stalling until branches complete is too slow and reduces the potential speedup too much.
Improvement strategy: assume the branch will not be taken and continue execution down the sequential instr stream. If the branch is taken, the instrs that are being fetched and decoded must be discarded, and execution continues at the branch target.
If branches are taken ½ the time, and if it costs little to discard the instrs, this scheme halves the cost of control hazards.
To discard instrs, the original control values are changed to 0's (as in the stall for a load-use hazard). The difference is that the change must also be made for the instrs in the IF, ID, and EX stages when the branch reaches MEM; that is, the instrs in the IF, ID, and EX stages are flushed.
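The "halves the cost" claim can be checked with a small expected-value calculation. This is a minimal sketch, assuming the penalty equals the three instrs flushed from IF, ID, and EX when the branch resolves in MEM; the function name and its parameters are illustrative and not part of the slides.

```python
def expected_branch_penalty(taken_fraction, flushed_instrs=3):
    """Average lost cycles per branch for two schemes.

    Assumption: the branch resolves in MEM, so stalling always loses
    `flushed_instrs` cycles, while assume-not-taken loses them only
    when the branch is actually taken.
    """
    always_stall = float(flushed_instrs)
    assume_not_taken = taken_fraction * flushed_instrs
    return always_stall, assume_not_taken


stall, not_taken = expected_branch_penalty(0.5)
print(f"stall: {stall} cycles, assume not taken: {not_taken} cycles")
```

With branches taken half the time, the average penalty drops from 3 cycles to 1.5 cycles per branch under these assumptions.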


Impact of Branch Instrs on Pipe Efficiency

Program execution order   Time (in clock cycles)
(in instructions)         CC 1  CC 2  CC 3  CC 4  CC 5  CC 6  CC 7  CC 8  CC 9
40 beq $1, $3, 7          IM    Reg   ALU   DM    Reg
44 and $12, $2, $5              IM    Reg   ALU   DM    Reg
48 or $13, $6, $2                     IM    Reg   ALU   DM    Reg
52 add $14, $2, $2                          IM    Reg   ALU   DM    Reg
72 lw $4, 50($7)                                  IM    Reg   ALU   DM    Reg

b) Reducing the Delay of Branches

Reduce the cost of the taken branch.

In the first implementation, the next PC for a branch is selected in the MEM stage.

Move the branch execution to the ID stage. Special HW in ID: an adder for the branch address calculation (moved from the EX stage) and equality-test logic (exclusive-OR of the corresponding register bits, with the results ORed together; all zeros means the registers are equal).

Only one instruction has to be flushed if the branch is taken.

To flush: a control line, IF.Flush, zeros the instr field of the IF/ID pipeline register (the instr becomes a nop).
04/26/99 4
Datapath for Branch (HW to Flush Instr after branch)
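[Figure: pipelined datapath with branch resolution in ID. Shown: the IF.Flush control line, the hazard detection unit, Control, the IF/ID, ID/EX, EX/MEM and MEM/WB pipeline registers, the branch target adder fed by the sign-extended offset shifted left 2, the register equality comparator (=), the ALU, the instruction and data memories, and the forwarding unit.]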

The IF.Flush control signal actually comes from the equality checker, which evaluates the branch condition.
Example: Pipelined Branch Execution

Assume the pipeline is optimized for branches not taken and branch execution is moved to the ID stage.

36 sub $10, $4, $8

40 beq $1, $3, 7 #PC-relative branch to 40+4+7*4 = 72

44 and $12, $2, $5

48 or $13, $2, $6
52 add $14, $4, $2
56 slt $15, $6, $7
…
72 lw $4, 50($7)
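The ID-stage branch decision for this sample code can be sketched as follows. This is an illustrative model only: the function names are hypothetical, register values are passed in as plain integers, and the equality test uses an exclusive-OR in the way the ID-stage comparator is described.

```python
def sign_extend_16(imm):
    """Sign-extend a 16-bit immediate to a Python int."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def resolve_beq_in_id(pc, rs_value, rt_value, imm16):
    """Return (next_pc, if_flush) for a beq evaluated in the ID stage.

    pc is the address of the beq itself; the target is PC + 4 plus the
    sign-extended offset shifted left by 2 (a word offset).
    """
    target = pc + 4 + (sign_extend_16(imm16) << 2)
    registers_equal = ((rs_value ^ rt_value) & 0xFFFFFFFF) == 0
    if registers_equal:
        return target, True   # taken: flush the instr fetched after the branch
    return pc + 4, False      # not taken: sequential fetch continues


# beq $1, $3, 7 at address 40, with $1 == $3
print(resolve_beq_in_id(40, 5, 5, 7))   # (72, True)
```

When the two registers differ, the same call returns the sequential address 44 and no flush.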
Branch Execution of Sample Code
[Figure: the branch datapath annotated for two clock cycles of the sample code.]

Clock 3: IF = and $12, $2, $5; ID = beq $1, $3, 7; EX = sub $10, $4, $8; MEM = before<1>; WB = before<2>. The branch target 72 is computed in ID and selected as the next PC.

Clock 4: IF = lw $4, 50($7); ID = bubble (nop); EX = beq $1, $3, 7; MEM = sub $10, . . .; WB = before<1>. The and instr has been turned into a bubble and fetching continues at 72.


Dynamic Branch Prediction

Assuming a branch is not taken is a crude form of branch prediction. Dynamic prediction tries to detect regularities in branch behavior.

Algorithm: look up the address of the instr to see whether the branch was taken the last time this instr was executed and, if so, begin fetching new instrs from the same place as the last time.

Implementation of Dynamic Branch Prediction

Implementation: a branch prediction buffer or branch history table, a small memory indexed by the lower portion of the address of the branch instr. This memory contains a bit that says whether the branch was recently taken or not.
The branch prediction buffer can be implemented as a small, special buffer accessed with the instr address during the IF stage.
If the instr is predicted as taken, fetching begins from the target as soon as the target PC is known (this can be as early as the ID stage).
Otherwise sequential fetching continues. If the prediction is wrong, the prediction bit must be changed.
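A 1-bit branch history table of this kind can be sketched in a few lines. The class name, the 4-bit index width, and the choice to drop the two byte-offset bits of a word-aligned address are assumptions made for illustration; the slides only say the table is indexed by the low-order address bits.

```python
class OneBitPredictor:
    """Branch history table with one prediction bit per entry."""

    def __init__(self, index_bits=4):
        self.mask = (1 << index_bits) - 1
        # False means "predict not taken"; all entries start that way.
        self.table = [False] * (1 << index_bits)

    def _index(self, branch_addr):
        # Assumption: instrs are word aligned, so skip the 2 offset bits.
        return (branch_addr >> 2) & self.mask

    def predict(self, branch_addr):
        return self.table[self._index(branch_addr)]

    def update(self, branch_addr, taken):
        # The bit simply remembers the most recent outcome.
        self.table[self._index(branch_addr)] = taken


bht = OneBitPredictor()
bht.update(40, True)        # branch at address 40 was taken
print(bht.predict(40))      # True
print(bht.predict(104))     # True: address 104 shares the same low-order bits
```

The last line shows the aliasing effect: a different branch that maps to the same entry inherits a prediction it did not produce, which costs accuracy but not correctness.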


Example Branch Prediction (1 bit history table)

Consider the following loop with i = 1 initially, j = 1, h = 10:

Loop: g = g + A[ i ];
      i = i + j;
      if (i != h) go to Loop;

Loop: add $t1, $s3, $s3 #Temp reg $t1 = 2*i
      add $t1, $t1, $t1 #Temp reg $t1 = 4*i
      add $t1, $t1, $s5 #$t1 = address of A[ i ]
      lw $t0, 0($t1) #Temp reg $t0 = A[ i ]
      add $s1, $s1, $t0 #g = g + A[ i ]
      add $s3, $s3, $s4 #i = i + j
      bne $s3, $s2, Loop #go to Loop if i != h
The algorithm will mispredict the first time through the loop (i = 1, bit = 0) and the last time (i = 10, bit = 1).
Loops and Prediction

The prediction may be wrong (it may have been put there by another branch that has the same low-order address bits). This doesn't affect correctness. If the prediction is wrong, the prediction bit is inverted and stored back, and the proper instr sequence is fetched and executed.
The prediction accuracy for this branch, which is taken 90% of the time, is only 80% (two incorrect predictions and eight correct ones).
For highly regular branches, we want the accuracy of the predictor to match the taken-branch frequency.
Solution: use a 2-bit prediction buffer. A prediction must be wrong twice before it is changed.
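The 80% figure can be reproduced by replaying one run of the loop branch through a 1-bit predictor. This is a small sketch, assuming ten executions of the branch (nine taken, then one not taken) and a prediction bit left at "not taken" by the previous exit from the loop; the function name is illustrative.

```python
def one_bit_accuracy(outcomes, initial_bit=False):
    """Fraction of correct predictions for a 1-bit predictor.

    The bit always holds the most recent outcome of the branch.
    """
    bit = initial_bit
    correct = 0
    for taken in outcomes:
        if bit == taken:
            correct += 1
        bit = taken  # a wrong prediction flips the bit
    return correct / len(outcomes)


loop_run = [True] * 9 + [False]     # taken nine times, then falls through
print(one_bit_accuracy(loop_run))   # 0.8
```

The two misses are the first execution (the bit still says not taken) and the last one (the bit says taken).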

Finite State Machine for 2-bit Prediction Scheme

[Figure 6.53 (Patterson): state diagram with four states, two labelled "Predict taken" (top row) and two labelled "Predict not taken" (bottom row); the arcs between the states are labelled "Taken" and "Not taken".]

A branch that strongly favors taken or not taken will be mispredicted only once. The 2 bits are used to encode the four states in the system.

Assume the code has exited an earlier execution of the loop:

when it enters the loop, bits = 10 (2nd state, top right) and the prediction is correct;
during the other iterations, bits = 11 (1st state, top left);
at the last iteration, bits = 11, the prediction is wrong and the bits change to 10;
after the last prediction, bits = 10 (2nd state, top right).
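The walk through the states above can be replayed in code. This sketch assumes a 2-bit saturating counter, one common encoding in which states 10 and 11 predict taken; the arcs of the figure may differ in detail from this encoding, and the function name is illustrative.

```python
def two_bit_accuracy(outcomes, initial_state=0b10):
    """Replay branch outcomes through a 2-bit saturating counter.

    States 0b10 and 0b11 predict taken; 0b00 and 0b01 predict not taken.
    Returns (accuracy, final_state).
    """
    state = initial_state
    correct = 0
    for taken in outcomes:
        predicted_taken = state >= 0b10
        if predicted_taken == taken:
            correct += 1
        # Move one step toward the observed outcome, saturating at the ends.
        state = min(state + 1, 0b11) if taken else max(state - 1, 0b00)
    return correct / len(outcomes), state


loop_run = [True] * 9 + [False]
accuracy, final_state = two_bit_accuracy(loop_run)
print(accuracy, format(final_state, "02b"))   # 0.9 10
```

Only the final, not-taken execution is mispredicted, and the predictor ends in state 10, so the next run of the loop again starts with a correct prediction.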
Compiler Scheduling for Branch Delays

A delayed branch always executes the instr that follows it; only the 2nd instr following the branch is affected by the branch.
Compilers and assemblers try to place an instr that always executes after the branch in the branch delay slot. The SW tries to make the successor instr valid and useful.
Limitations of scheduling:
a) restrictions on the instrs that can be scheduled in the slot,
b) the ability to predict at compile time whether a branch will be taken or not.
As machines move to longer pipes and to issuing multiple instrs per cycle, a single delay slot is not very useful, so this strategy is losing popularity.
As the number of transistors per chip increases, dynamic predictors become more popular.
04/27/99 12
3 Ways to Schedule Branch Delay Slot
a. From before
    add $s1, $s2, $s3
    if $s2 = 0 then
        [Delay slot]
  Becomes
    if $s2 = 0 then
        add $s1, $s2, $s3

b. From target
    sub $t4, $t5, $t6      (at the branch target)
    …
    add $s1, $s2, $s3
    if $s1 = 0 then
        [Delay slot]
  Becomes
    add $s1, $s2, $s3
    if $s1 = 0 then
        sub $t4, $t5, $t6

c. From fall through
    add $s1, $s2, $s3
    if $s1 = 0 then
        [Delay slot]
    sub $t4, $t5, $t6
  Becomes
    add $s1, $s2, $s3
    if $s1 = 0 then
        sub $t4, $t5, $t6

In b) and c), performance is improved only when execution proceeds in the expected direction.
Comments on Branch Delay Scheduling

In a) the slot is scheduled with an independent instr from before the branch. This is the best choice: it always improves performance.
b) and c) are used when a) is not possible (here $s1 is written by the add and read by the if, so the add cannot be moved into the slot).
In b) the target instr usually has to be copied into the slot, because it can also be reached by another path. This strategy is preferred when the branch is taken with high probability (such as loop branches).
In c) the slot is scheduled with an instr from the not-taken fall-through path.
For b) and c) scheduling to be possible, it must be OK to execute the sub instr when the branch goes in the unexpected direction (the work is wasted but the program still executes correctly, e.g. $t4 is not used when the branch takes the unexpected direction).
Execution Model Performance Comparison

Compare the performance of single-cycle, multicycle, and pipelined control for the instr mix of gcc (fig 4.54).
Instr mix: 22% loads, 11% stores, 49% R-format, 16% branches, 2% jumps.
Operation times for the major functional units: memory access = 2 ns, ALU operation = 2 ns, register file read / write = 1 ns.

Single-cycle: 8 ns / instruction = cycle duration.

Multicycle:
No. of cycles for each instr class in the multicycle implementation: loads = 5, stores = 4, R-format = 4, branches = 3, jumps = 3 cycles.
CPI = 0.22*5 + 0.11*4 + 0.49*4 + 0.16*3 + 0.02*3 = 4.04 cycles / instr
Average time for instr execution = 4.04 * 2 ns = 8.08 ns.
Performance Comparison (Continued)
Pipelined: assume 1/2 of the loads are followed by an instr that uses the load result; the branch delay on misprediction is 1 cycle and 1/4 of the branches are mispredicted; jumps always pay a 1-cycle delay.
loads: 1 cycle when there is no dependency and 2 when there is one, average for loads = 1.5 cycles
stores and R-format: 1 cycle
branches: 1 cycle when the prediction is correct and 2 cycles when it is wrong, average for branches = 1 + 1*0.25 = 1.25 cycles
jumps: 1 cycle + 1 cycle delay = 2 cycles

CPI = 1.5*0.22 + 1*0.11 + 1*0.49 + 1.25*0.16 + 2*0.02 = 1.17 cycles

Average instr execution time = 1.17 * 2 ns = 2.34 ns
Pipelined control is:
8 ns / 2.34 ns = 3.4 times faster than single-cycle control,
8.08 ns / 2.34 ns = 3.5 times faster than multicycle control (the book computes 4.04 ns / 2.34 ns = 1.73 times faster, which is wrong; see the example on pg 397, chapter 5).
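The arithmetic of the whole comparison can be reproduced with a short script. The numbers are the ones given on these two slides; the dictionary and function names are only illustrative.

```python
# gcc instruction mix and timing figures from the slides
MIX = {"load": 0.22, "store": 0.11, "r_format": 0.49, "branch": 0.16, "jump": 0.02}
CYCLE_NS = 2.0          # clock cycle of the multicycle and pipelined designs
SINGLE_CYCLE_NS = 8.0   # one long cycle per instr

MULTICYCLE_CYCLES = {"load": 5, "store": 4, "r_format": 4, "branch": 3, "jump": 3}
PIPELINED_CYCLES = {
    "load": 1 + 0.5 * 1,     # half of the loads stall one cycle
    "store": 1,
    "r_format": 1,
    "branch": 1 + 0.25 * 1,  # a quarter of the branches are mispredicted
    "jump": 1 + 1,           # jumps always pay one delay cycle
}


def average_cpi(cycles_per_class):
    """Weight the per-class cycle counts by the instruction mix."""
    return sum(MIX[name] * cycles_per_class[name] for name in MIX)


multicycle_ns = average_cpi(MULTICYCLE_CYCLES) * CYCLE_NS
pipelined_ns = average_cpi(PIPELINED_CYCLES) * CYCLE_NS

print(f"multicycle CPI = {average_cpi(MULTICYCLE_CYCLES):.2f}")
print(f"pipelined CPI  = {average_cpi(PIPELINED_CYCLES):.2f}")
print(f"vs single-cycle: {SINGLE_CYCLE_NS / pipelined_ns:.1f}x")
print(f"vs multicycle:   {multicycle_ns / pipelined_ns:.1f}x")
```

Changing one assumption, for example the fraction of mispredicted branches, shows how sensitive the pipelined CPI is to control hazards.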
6.7 Exceptions

Exceptions and interrupts are events other than branches or jumps that change the normal flow of instr execution.

Exception: an unexpected event from within the processor. Ex: arithmetic overflow.

Interrupt: a change of control flow caused from outside the processor. Ex: interrupts raised by I/O devices to communicate with the processor.


What about Interrupts, Traps, Faults?
External interrupts:
• Allow the pipeline to drain
• Load the PC with the interrupt address

Faults (within instruction, restartable):
• Force a trap instruction into IF (transfer control to the exception routine)
• Disable writes until the trap hits WB
• Must save multiple PCs or PC + state

Refer to MIPS solution

Example: Arithmetic Overflow Exception

What happens if an overflow exception occurs in the add instr of the following instr sequence? Instr addresses are given in hexadecimal.
40 sub $11, $2, $4
44 and $12, $2, $5
48 or $13, $2, $6
4C add $1, $2, $1
50 slt $15, $6, $7
54 lw $16, 50($7)

Assume the instrs to be invoked on an exception begin:

40000040 sw $25, 1000($0)
40000044 sw $26, 1004($0)


Bubbles are Inserted
[Figure: the exception datapath annotated for two clock cycles of the overflow example.]

Clock 5: IF = lw $16, 50($7); ID = slt $15, $6, $7; EX = add $1, $2, $1; MEM = or $13, . . .; WB = and $12, . . . The overflow is detected in EX; IF.Flush, ID.Flush and EX.Flush are asserted and 40000040 is selected as the next PC.

Clock 6: IF = sw $25, 1000($0); ID = bubble (nop); EX = bubble; MEM = bubble; WB = or $13, . . . The next PC is 40000044.


How to Flush Unwanted Instructions

Flush the instr in IF by turning it into a nop (IF.Flush control signal).
Use the multiplexors already in ID to zero the control signals of the instr in the ID stage. A new ID.Flush control signal is ORed with the stall signal from the Hazard Detection Unit to flush during ID.
A new signal, EX.Flush, causes new multiplexors to zero the control lines in EX.
To start fetching instrs at location 40000040, add an additional input to the PC multiplexor that sends 40000040 to the PC.
If the instr is not stopped in the middle of its execution, the programmer will see a wrong value in $1 (not the original value that caused the overflow). The exception must therefore be detected, and the flush signals set, during the EX stage of the add instr. EX.Flush prevents this instr from writing its result in the WB stage.
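The effect of the three flush signals on the overflow example can be modelled as one pipeline step. This is a behavioral sketch only: the function and constant names are hypothetical, instrs are plain strings, and no control signals are simulated.

```python
EXCEPTION_HANDLER_PC = 0x40000040
BUBBLE = "bubble (nop)"


def flush_on_ex_exception(stages, faulting_addr, handler_first_instr):
    """Advance one clock after an exception is detected in EX.

    stages maps "IF", "ID", "EX", "MEM", "WB" to the instr in each stage
    during the detection cycle. Returns (next_stages, epc, fetch_addr).
    """
    next_stages = {
        "IF": handler_first_instr,  # fetched from EXCEPTION_HANDLER_PC
        "ID": BUBBLE,               # IF.Flush squashed the instr that was in IF
        "EX": BUBBLE,               # ID.Flush squashed the instr that was in ID
        "MEM": BUBBLE,              # EX.Flush squashed the faulting instr
        "WB": stages["MEM"],        # older instrs complete normally
    }
    epc = faulting_addr + 4         # the HW saves address + 4
    return next_stages, epc, EXCEPTION_HANDLER_PC


clock5 = {
    "IF": "lw $16, 50($7)",
    "ID": "slt $15, $6, $7",
    "EX": "add $1, $2, $1",
    "MEM": "or $13, $2, $6",
    "WB": "and $12, $2, $5",
}
clock6, epc, fetch_addr = flush_on_ex_exception(clock5, 0x4C, "sw $25, 1000($0)")
print(clock6)
print(hex(epc), hex(fetch_addr))   # 0x50 0x40000040
```

The resulting stage contents match the Clock 6 picture: the handler's first sw is in IF, three bubbles follow, and the or instr finishes in WB.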


Datapath with Controls to Handle Exceptions
[Figure: pipelined datapath extended for exceptions. Shown: the IF.Flush, ID.Flush and EX.Flush control lines, the 40000040 input to the PC multiplexor, the Cause and Except PC registers, and the multiplexors that zero the control signals in ID and EX, alongside the hazard detection and forwarding units.]

The Cause register records the cause of the exception; the Exception PC (EPC) saves the address of the instr that caused the exception (it actually saves the address + 4). In the example, EPC = 4C + 4 = 50 (in hexadecimal).
Other Causes of Exceptions

° I/O device request
° Invoking an operating system service from a user program
° Using an undefined instr
° HW malfunction
How can an exception be associated with the corresponding instr when there are 5 instrs active in a clock cycle?
Multiple exceptions can occur simultaneously in a single clock cycle.


Exception Problem

° Exceptions/Interrupts: 5 instructions executing in a 5-stage pipeline
• How to stop the pipeline?
• Restart?
• Who caused the interrupt?

Stage   Problem interrupts occurring
IF      Page fault on instruction fetch; misaligned memory access; memory-protection violation
ID      Undefined or illegal opcode
EX      Arithmetic exception
MEM     Page fault on data fetch; misaligned memory access; memory-protection violation; memory error

° Load with data page fault, Add with instruction page fault?
° Solution 1: interrupt vector per instruction. Solution 2: interrupt ASAP, restart everything incomplete


Exception Handling

[Figure: where each exception is detected as lw $2,20($5) moves down the pipe.]
I mem (instruction fetch): detect bad instruction address
Regs (decode): detect bad instruction
alu: detect overflow
D mem: detect bad data address
Regs (write back): allow exception to take effect

Solutions to Exception Problem

I/O device requests and HW malfunctions are not associated with a specific instr. The implementation has some flexibility as to when to interrupt the pipeline. For an I/O request, stop at whichever instr is simplest to interrupt. For a HW malfunction, it is best to stop as soon as possible because the HW is unstable.

Prioritize exceptions so it is easy to determine which is serviced first. In most MIPS implementations the HW sorts exceptions so that the earliest instr is interrupted.
The EPC has the address of the interrupted instr.
The Cause register records all possible exceptions in a clock cycle.


Pipeline Changes Order of Exception Occurrence

instr       1    2    3    4    5    6    7    8    9    10
i-3         IF   ID   EX   MEM  WB
i-2              IF   ID   EX   MEM  WB
i-1                   IF   ID   EX   MEM  WB
i   LW                     IF   ID   EX   MEM  WB
i+1 ADD                         IF   ID   EX   MEM  WB
i+2                                  IF   ID   EX   MEM  WB

HW SW Interface for Exception Handling

By knowing in which stage an exception can occur, the exception SW can match the exception to the instr.
Exceptions are collected in the Cause register so that the HW can interrupt based on later exceptions once the earliest one has been serviced.
Precise interrupts: the pipeline is stopped so that the instrs prior to the faulty instr j complete, and the following instrs are flushed and later restarted.
Implementation: the HW associates an interrupt vector with each instr and disables the effects of the faulty instr j when the fault is detected. The instr continues down the pipe and the exception is treated at the end of the MEM stage of j. This guarantees that the exceptions of instr i are treated before those of instr i+1.
HW SW Interface (Cont...)

The HW saves the address of the faulty instr in the EPC and the cause in the Cause register, and then jumps to a prearranged address of the exception handling routine.
The OS looks at the cause and acts appropriately.
For an undefined instr, a HW malfunction, or an arithmetic overflow, the OS normally kills the program and returns an indicator of the reason.
For an I/O device request or an OS service call, the OS saves the state of the program, performs the desired task, and then restores the program to continue execution.
MIPS and most machines today support precise interrupts or precise exceptions because they simplify the interface with the OS (e.g. to support virtual memory).
Imprecise Interrupts or Imprecise Exceptions

Imprecise interrupts: treat the exception when it occurs.

instr       1    2    3    4    5    6    7    8    9    10
i-3         IF   ID   EX   MEM  WB
i-2              IF   ID   EX   MEM  WB
i-1                   IF   ID   EX   MEM  WB
i   LW                     IF   ID   EX   MEM  WB
i+1 ADD                         IF   ID   EX   MEM  WB
i+2                                  IF   ID   EX   MEM  WB

Treats the ADD exception at the end of the 5th cycle. Stop instrs i-2, i-1, i, and i+1. Restart the pipe at instr i-2 after exception handling. It will then detect the page fault of instr i, stopping instrs i...i+3.
Why use Imprecise Exceptions?

The difficulty of making the HW associate the correct exception with the correct instr in pipelined computers sometimes leads designers to relax precise exception treatment.

Let the OS determine:

° which instr caused the exception (from the type of exception and the stage in which it occurred)
° which instrs to flush
° from which instr to restart
