Pipelining
A useful way of demonstrating pipelining is the laundry analogy. Let's say that
there are four loads of dirty laundry that need to be washed, dried, and
folded. We could put the first load in the washer for 30 minutes, dry it
for 40 minutes, and then take 20 minutes to fold the clothes. Then we could
pick up the second load and wash, dry, and fold it, and repeat for the third
and fourth loads. Supposing we started at 6 PM and worked as efficiently as
possible, we would still be doing laundry until midnight.
Source: http://www.ece.arizona.edu/~ece462/Lec03-pipe/
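To put numbers on this (a quick sketch of my own, not from the source above):
done sequentially, four loads take 4 × (30 + 40 + 20) = 360 minutes, which is
why the job stretches from 6 PM to midnight. If the washer, dryer, and folding
step all work on different loads at the same time, the 40-minute dryer sets
the pace and everything is finished in 210 minutes, around 9:30 PM. The Python
sketch below computes both schedules.

STAGES = [("wash", 30), ("dry", 40), ("fold", 20)]   # minutes per stage
LOADS = 4

# Sequential: each load is washed, dried, and folded before the next begins.
sequential = LOADS * sum(minutes for _, minutes in STAGES)   # 4 * 90 = 360 minutes

# Pipelined: each stage is its own "machine" that handles one load at a time,
# in order.  A load may have to wait for the machine ahead of it to free up.
machine_free = [0] * len(STAGES)   # when each machine next becomes available
load_done = [0] * LOADS            # when each load finished its previous stage
for s, (_, minutes) in enumerate(STAGES):
    for i in range(LOADS):
        start = max(machine_free[s], load_done[i])
        machine_free[s] = load_done[i] = start + minutes

pipelined = max(load_done)         # 210 minutes: finished around 9:30 PM

print(f"sequential: {sequential} min, pipelined: {pipelined} min")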
RISC Pipelines
A RISC processor pipeline operates in much the same way, although the
stages in the pipeline are different. While different processors have different
numbers of steps, they are basically variations of these five, used in the
MIPS R3000 processor:
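1. Fetch the instruction from memory.
2. Read the registers and decode the instruction.
3. Execute the instruction or calculate an address.
4. Access an operand in data memory.
5. Write the result into a register.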
Think back to the laundry example: although the washer finishes in half an
hour, the dryer takes an extra ten minutes, so the wet clothes must wait ten
minutes for the dryer to free up. The speed of a pipeline is therefore set by
its longest stage. Because RISC instructions are simpler than those used in
pre-RISC processors (now called CISC, or Complex Instruction Set Computer),
they are more conducive to pipelining. While CISC instructions vary in length,
RISC instructions are all the same length and can be fetched in a single
operation. Ideally, each stage in a RISC processor pipeline should take one
clock cycle, so that the processor finishes an instruction every clock cycle
and averages one cycle per instruction (a CPI of 1).
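To make the ideal case concrete, here is a small Python sketch (mine, not the
article's) that prints which stage each instruction occupies on every clock
cycle; the labels IF, ID, EX, MEM, and WB are just shorthand for the five
steps listed above. Once the pipeline is full, one instruction completes per
cycle.

STAGES = ["IF", "ID", "EX", "MEM", "WB"]   # shorthand for the five steps above

def show_ideal_pipeline(n_instructions):
    # With no stalls, instruction i (0-based) is in stage s on cycle i + s.
    total_cycles = n_instructions + len(STAGES) - 1
    for i in range(n_instructions):
        cells = []
        for cycle in range(total_cycles):
            s = cycle - i
            cells.append(STAGES[s] if 0 <= s < len(STAGES) else "...")
        print(f"I{i + 1}: " + " ".join(f"{c:>4}" for c in cells))

show_ideal_pipeline(4)   # four instructions finish in 4 + 5 - 1 = 8 cycles, not 20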
Pipeline Problems
In practice, however, RISC processors operate at more than one cycle per
instruction. The processor might occasionally stall as a result of data
dependencies and branch instructions.
For example, consider this pair of instructions, written here in MIPS-style
assembly with the destination register listed first:
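    add r3, r1, r2
    add r5, r3, r4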
In this example, the first instruction tells the processor to add the contents
of registers r1 and r2 and store the result in register r3. The second instructs
it to add r3 and r4 and store the sum in r5. We place this set of instructions
in a pipeline. When the second instruction is in the second stage, the
processor will be attempting to read r3 and r4 from the registers.
Remember, though, that the first instruction is just one step ahead of the
second, so the contents of r1 and r2 are being added, but the result has not
yet been written into register r3. The second instruction therefore cannot
read from the register r3 because it hasn't been written yet and must wait
until the data it needs is stored. Consequently, the pipeline is stalled and a
number of empty instructions (known as bubbles) go into the pipeline. Data
dependency affects long pipelines more than shorter ones since it takes a
longer period of time for an instruction to reach the final register-writing
stage of a long pipeline.
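A quick way to see where the bubbles come from (a sketch of my own, assuming
the five-stage pipeline above with no result-forwarding hardware, which real
MIPS implementations do have): the result is written in stage 5 but needed by
the register read in stage 2, so an instruction that depends on the one
directly ahead of it must wait three cycles.

WRITE_STAGE = 5    # the stage that writes the result into a register
READ_STAGE = 2     # the stage that reads the registers

def bubbles(distance):
    # distance = how many instructions behind the producer the consumer sits
    # (1 means they are back to back).  With no forwarding, the consumer's
    # register read has to wait until the cycle after the producer's write.
    return max(0, (WRITE_STAGE - READ_STAGE + 1) - distance)

print(bubbles(1))   # the two dependent adds above: 3 bubbles
print(bubbles(4))   # four instructions apart: no stall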
Branch instructions are those that tell the processor to make a decision
about what the next instruction to be executed should be, based on the result
of another instruction. Branch instructions can be troublesome in a pipeline
if a branch is conditional on the result of an instruction that has not yet
finished its path through the pipeline.
For example, consider this sequence, again written in MIPS-style assembly
with the destination register first:
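    add r3, r1, r2
    sub r6, r5, r4
    beq r3, r6, Loop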
The example above instructs the processor to add r1 and r2 and put the
result in r3, then subtract r4 from r5, storing the difference in r6. In the
third instruction, beq stands for branch if equal. If the contents of r3 and r6
are equal, the processor should execute the instruction labeled "Loop."
Otherwise, it should continue to the next instruction. In this example, the
processor cannot make a decision about which branch to take because
neither the value of r3 nor that of r6 has been written into the registers yet.
The processor could stall, but a more sophisticated method of dealing with
branch instructions is branch prediction. The processor makes a guess about
which path to take; if the guess is wrong, anything written into the registers
must be cleared, and the pipeline must be started again with the correct
instruction. Some methods of branch prediction depend on stereotypical
behavior. Branches pointing backward are taken about 90% of the time
since backward-pointing branches are often found at the bottom of loops. On
the other hand, branches pointing forward are taken only approximately
50% of the time. Thus, it would be logical for processors to always follow the
branch when it points backward, but not when it points forward. Other
methods of branch prediction are less static: processors that use dynamic
prediction keep a history for each branch and use it to predict future
branches. These processors are correct in their predictions 90% of the time.
Still other processors forgo the entire branch prediction ordeal. The RISC
System/6000 fetches and starts decoding instructions from both sides of the
branch. When it determines which branch should be followed, it then sends
the correct instructions down the pipeline to be executed.
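To make "keeping a history for each branch" concrete, here is a sketch of one
common dynamic scheme, a two-bit saturating counter per branch (my
illustration; the article does not say which scheme these processors use).
The counter drifts toward "taken" or "not taken" with each outcome, so a
single surprise does not flip a well-established prediction.

class TwoBitPredictor:
    def __init__(self):
        self.counters = {}   # branch address -> counter in 0..3

    def predict(self, branch_pc):
        # Counters of 2 or 3 mean "predict taken", 0 or 1 mean "predict not taken".
        return self.counters.get(branch_pc, 0) >= 2

    def update(self, branch_pc, taken):
        c = self.counters.get(branch_pc, 0)
        self.counters[branch_pc] = min(3, c + 1) if taken else max(0, c - 1)

predictor = TwoBitPredictor()
outcomes = [True] * 9 + [False]   # a loop branch: taken nine times, then falls through
hits = 0
for taken in outcomes:
    hits += predictor.predict(0x400) == taken
    predictor.update(0x400, taken)
print(f"{hits}/{len(outcomes)} correct")   # 7/10: wrong twice warming up, once at loop exit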
Pipelining Developments
In order to make processors even faster, various methods of optimizing
pipelines have been devised.
Superpipelining refers to dividing the pipeline into more steps. The more
pipeline stages there are, the faster the pipeline can be clocked, because each stage is then
shorter. Ideally, a pipeline with five stages should be five times faster than a
non-pipelined processor (or rather, a pipeline with one stage). The
instructions are executed at the speed at which each stage is completed, and
each stage takes one fifth of the amount of time that the non-pipelined
instruction takes. Thus, a processor with an 8-step pipeline (the MIPS
R4000) will be even faster than its 5-step counterpart. The MIPS R4000
chops its pipeline into more pieces by dividing some steps into two.
Instruction fetching, for example, is now done in two stages rather than one.
The stages are as shown:
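The eight stages of the R4000 pipeline are:

1. IF: instruction fetch, first half
2. IS: instruction fetch, second half
3. RF: register fetch (decode the instruction and read the registers)
4. EX: execution (ALU operation or address calculation)
5. DF: data fetch, first half
6. DS: data fetch, second half
7. TC: tag check (confirm the data cache access was a hit)
8. WB: write back the result into a register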