COA Unit-2 Notes (P3)
UNIT-2 (Part-3)
Pipelining:
Pipelining is a technique used in Computer Organization to increase the throughput of a computer system by
allowing multiple instructions to be processed simultaneously at different stages of execution. This technique helps to
make better use of the CPU's resources and improve overall performance.
1. Pipeline Stages:
o Fetch (IF): Retrieve the instruction from memory.
o Decode (ID): Decode the instruction to understand what operations are required.
o Execute (EX): Perform the operations, such as arithmetic calculations or memory access.
o Memory Access (MEM): Read from or write to memory (if needed).
o Write Back (WB): Write the result back to the register or memory.
2. Pipelining Stages in Detail:
o Instruction Fetch (IF): The instruction is fetched from memory into the instruction register.
o Instruction Decode (ID): The fetched instruction is decoded to determine the operation and operands.
o Execution (EX): The actual operation is performed based on the decoded instruction.
o Memory Access (MEM): If the instruction involves memory, data is read from or written to memory.
o Write Back (WB): The result of the execution is written back to the register file.
Example of Pipelining:
Cycle 1: IF1
Cycle 2: ID1, IF2
Cycle 3: EX1, ID2, IF3
Cycle 4: MEM1, EX2, ID3, IF4
Cycle 5: WB1, MEM2, EX3, ID4, IF5
Cycle 6: WB2, MEM3, EX4, ID5
Cycle 7: WB3, MEM4, EX5
Cycle 8: WB4, MEM5
Cycle 9: WB5
In this example, five instructions (1-5) flow through the five stages. Once the pipeline is full (cycle 5), the processor is working on five instructions at once, one in each stage, and from then on one instruction completes in every cycle.
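The cycle table above can be reproduced with a short script. The following is a minimal sketch in Python (an illustration only, assuming an ideal five-stage pipeline with no hazards or stalls); it prints which instruction occupies which stage in every cycle and compares the 9 pipelined cycles against the 25 cycles a non-pipelined machine would need for the same five instructions.

```python
# Minimal sketch: ideal 5-stage pipeline schedule (no hazards, no stalls).
STAGES = ["IF", "ID", "EX", "MEM", "WB"]
NUM_INSTRUCTIONS = 5

total_cycles = NUM_INSTRUCTIONS + len(STAGES) - 1   # 5 + 5 - 1 = 9
for cycle in range(1, total_cycles + 1):
    active = []
    for i in range(1, NUM_INSTRUCTIONS + 1):
        stage_index = cycle - i          # instruction i enters IF in cycle i
        if 0 <= stage_index < len(STAGES):
            active.append(f"{STAGES[stage_index]}{i}")
    print(f"Cycle {cycle}: " + ", ".join(active))

# Without pipelining, each instruction would occupy all 5 stages back to back.
print("Non-pipelined cycles:", NUM_INSTRUCTIONS * len(STAGES))   # 25
print("Pipelined cycles:", total_cycles)                         # 9
```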
Pipeline Hazards
1. Data Hazards:
o Read After Write (RAW): An instruction needs data that has not yet been written by a previous instruction.
o Write After Read (WAR): An instruction writes to a location before a previous instruction has read it.
o Write After Write (WAW): A later instruction writes to the same location before an earlier instruction's write has completed, so the writes finish in the wrong order.
Example:
Instruction 1: ADD R1, R2, R3 (R1 = R2 + R3)
Instruction 2: SUB R4, R1, R5 (R4 = R1 - R5)
Instruction 2 needs the result of Instruction 1 before it can proceed.
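A hazard-detection unit (or a compiler) can spot this dependency by checking whether a later instruction reads a register that an earlier, still-in-flight instruction writes. The sketch below is only an illustration; the tuple encoding (opcode, destination, source1, source2) is an assumption, not any particular ISA format.

```python
# Minimal sketch: detecting a RAW (read-after-write) hazard between two
# instructions encoded as (opcode, destination, source1, source2).
i1 = ("ADD", "R1", "R2", "R3")   # R1 = R2 + R3
i2 = ("SUB", "R4", "R1", "R5")   # R4 = R1 - R5

def raw_hazard(earlier, later):
    """True if 'later' reads a register that 'earlier' writes."""
    return earlier[1] in later[2:]

if raw_hazard(i1, i2):
    print("RAW hazard: instruction 2 must stall or use forwarding to get R1")
```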
2. Control Hazards:
o Branch Hazards: Occur when the pipeline fetches instructions after a branch before the branch outcome is known, so the instructions fetched may be based on an incorrect guess.
Example:
Instruction 1: BEQ R1, R2, LABEL (Branch if R1 == R2)
Instruction 2: ADD R3, R4, R5
If the branch prediction is incorrect, Instruction 2 may need to be discarded.
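The cost of control hazards is often summarised as extra cycles added to the average cycles-per-instruction (CPI). A minimal sketch of that calculation, using made-up numbers (20% branches, 10% mispredictions, 2-cycle penalty) purely for illustration:

```python
# Minimal sketch: effect of branch mispredictions on average CPI.
base_cpi = 1.0            # ideal pipelined CPI
branch_fraction = 0.20    # hypothetical: 20% of instructions are branches
mispredict_rate = 0.10    # hypothetical: the predictor is wrong 10% of the time
penalty_cycles = 2        # hypothetical: 2 cycles lost per misprediction

effective_cpi = base_cpi + branch_fraction * mispredict_rate * penalty_cycles
print(f"Effective CPI = {effective_cpi:.2f}")   # 1.0 + 0.2 * 0.1 * 2 = 1.04
```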
3. Structural Hazards:
Occur when hardware resources are insufficient to support all the concurrent instructions.
Example: If there is only one memory unit and both an instruction fetch and data access need to occur simultaneously, a
structural hazard arises.
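This conflict is visible in the cycle table given earlier: in cycle 4 the pipeline needs MEM1 and IF4 at the same time. A minimal sketch, assuming one instruction is issued per cycle so that instruction i reaches its MEM stage in the same cycle in which instruction i + 3 is fetched:

```python
# Minimal sketch: locating IF/MEM conflicts when there is a single memory unit.
NUM_INSTRUCTIONS = 5

for i in range(1, NUM_INSTRUCTIONS + 1):
    j = i + 3                  # instruction fetched in the same cycle as MEM of i
    if j <= NUM_INSTRUCTIONS:
        print(f"Cycle {j}: MEM of instruction {i} and IF of instruction {j} "
              "both need the memory unit -> structural hazard")
```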
Advantages of Pipelining
Increased Throughput: More instructions complete per unit of time because several instructions are in progress at once.
Better Hardware Utilization: Every pipeline stage works on a different instruction in the same cycle.
Disadvantages of Pipelining
Hazards: Data, control and structural hazards can stall the pipeline and reduce the ideal speedup.
Increased Complexity: Extra control logic and stage registers are required.
Instruction Latency: The time to complete a single instruction is not reduced and may even increase slightly.
Consider a simple example where we have an arithmetic pipeline that performs addition and multiplication operations.
Example Operations:
Operation 1 (ADD): add the contents of R2 and R3.
Operation 2 (MUL): multiply the contents of R5 and R6.
Pipelined Execution:
Here's how these operations might be pipelined over several clock cycles:
Cycle 1: OF1
Cycle 2: OD1, OF2
Cycle 3: EX1, OD2
Cycle 4: WB1, EX2
Cycle 5: WB2
(OF = Operand Fetch, OD = Operand Decode, EX = Arithmetic Execution, WB = Write Back; operation 1 is the ADD, operation 2 is the MUL)
1. Operand Fetch:
o Fetch the operands from registers or memory.
2. Operand Decode:
o Decode the instruction to determine the arithmetic operation and operands.
3. Arithmetic Execution:
o Perform the arithmetic operation. For instance, ADD will add the contents of R2 and R3, and MUL will
multiply the contents of R5 and R6.
4. Write Back:
o Write the result of the arithmetic operation back to the destination register or memory.
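To make the stages concrete, the following is a minimal sketch of what the execution and write-back steps compute for the two example operations. The destination registers (R1 and R4) and the register contents are assumptions chosen only for illustration.

```python
# Minimal sketch: the two example operations as plain assignments.
# Register values and destination registers are hypothetical.
regs = {"R2": 10, "R3": 4, "R5": 6, "R6": 7}

regs["R1"] = regs["R2"] + regs["R3"]   # ADD: 10 + 4 = 14, written back to R1
regs["R4"] = regs["R5"] * regs["R6"]   # MUL: 6 * 7 = 42, written back to R4

print(regs)
```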
Advantages:
Increased Throughput: More arithmetic operations can be processed in a shorter time.
Efficient Use of Hardware: Each stage of the pipeline can work on different instructions simultaneously.
Disadvantages:
Pipeline Overhead: Designing and managing the pipeline adds complexity to the hardware.
Latency: The time to get the final result of a single operation may be longer than in a non-pipelined design due to the time taken in each stage.
INSTRUCTION PIPELINE
An Instruction Pipeline is a technique in computer organization that allows for the overlapping execution of multiple
instructions. Instead of executing one instruction at a time, modern processors break the execution process into several
stages, allowing different stages of multiple instructions to be executed simultaneously. This increases the throughput of the
processor, meaning more instructions are executed in a shorter time.
Let’s assume an instruction pipeline with five stages (IF, ID, EX, MEM, WB):
Cycle 1: Fetch instruction 1 (IF stage).
Cycle 2: Decode instruction 1 (ID stage) and fetch instruction 2 (IF stage).
Cycle 3: Execute instruction 1 (EX stage), decode instruction 2 (ID stage), and fetch instruction 3 (IF stage).
This process continues, with multiple instructions being processed at different stages in parallel.
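The benefit of this overlap is usually summarised by the standard ideal-pipeline estimate: with k stages and n instructions, a pipelined processor needs about k + (n - 1) cycles instead of the n x k cycles a non-pipelined one would take. A minimal sketch of that calculation:

```python
# Minimal sketch: ideal pipeline cycle counts and speedup.
def pipeline_speedup(k_stages, n_instructions):
    non_pipelined = k_stages * n_instructions
    pipelined = k_stages + (n_instructions - 1)
    return non_pipelined, pipelined, non_pipelined / pipelined

print(pipeline_speedup(5, 5))     # (25, 9, ~2.78) -- the five-instruction example
print(pipeline_speedup(5, 1000))  # speedup approaches 5, the number of stages
```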
Challenges of Instruction Pipelining:
1. Pipeline Hazards: Managing and mitigating hazards can be complex and may require additional mechanisms and
hardware.
2. Complex Design: Designing and managing an instruction pipeline involves complex control logic and synchronization.
RISC PIPELINE
The RISC Pipeline is an integral feature of Reduced Instruction Set Computer (RISC) architectures. RISC processors are
designed to execute instructions in a single clock cycle, making them ideal candidates for pipelining. A RISC pipeline breaks
down the instruction execution process into smaller stages, which allows for the simultaneous processing of multiple
instructions, increasing throughput and efficiency.
Example (three instructions flowing through a five-stage RISC pipeline):
Cycle 1: IF1
Cycle 2: ID1, IF2
Cycle 3: EX1, ID2, IF3
Cycle 4: MEM1, EX2, ID3
Cycle 5: WB1, MEM2, EX3
Cycle 6: WB2, MEM3
Cycle 7: WB3
VECTORS
A vector is a mathematical entity that has both magnitude and direction. In computer science and data processing, a vector
typically refers to an ordered collection of numbers, often used to represent data points, directions, or coordinates in space.
Vectors are fundamental in fields such as physics, mathematics, and computer graphics.
Vectors are versatile and essential in many applications like physics, machine learning, computer graphics, and optimization
problems.
VECTOR PROCESSING
Vector processing is a technique used in computer organization to handle operations involving vectors, which are essentially
one-dimensional arrays of data. It's particularly useful for tasks that involve large amounts of data or require parallel
processing, such as scientific computations, graphics processing, and machine learning.
Vector Processors:
These are specialized processors designed to perform operations on vector data. They have vector registers and vector
instructions that allow them to handle large datasets efficiently.
Common operations include addition, multiplication, and other mathematical functions performed on vectors.
Example of Vectors
1. Vector Arrays
Consider an array of integers:
Vector A: [4, 8, 15, 16, 23, 42]
This is a one-dimensional vector where each element can be accessed by its index (0 through 5).
Operations such as addition or multiplication can be performed on this array efficiently.
Adding 5 to each element:
Resulting Vector A: [9, 13, 20, 21, 28, 47]
2. Vector Processors
Suppose we have two vectors and a vector processor that can perform operations on vectors:
Vector X: [1, 2, 3, 4]
Vector Y: [5, 6, 7, 8]
Vector Addition Operation:
Result Vector Z = X + Y: [6, 8, 10, 12]
The vector processor will perform this operation in parallel for each element of the vectors.
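Both examples above map directly onto array operations in a high-level language. The sketch below uses NumPy only as a convenient software stand-in; a real vector processor would perform the same work with vector registers and a single vector-add instruction.

```python
import numpy as np

# Example 1: add 5 to every element of vector A.
A = np.array([4, 8, 15, 16, 23, 42])
print(A + 5)        # [ 9 13 20 21 28 47]

# Example 2: element-wise addition of two vectors.
X = np.array([1, 2, 3, 4])
Y = np.array([5, 6, 7, 8])
print(X + Y)        # [ 6  8 10 12]
```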
Vector Instructions:
Vector Load/Store: Instructions to load vectors from memory into vector registers and store vectors from vector registers to
memory.
Vector Arithmetic: Instructions to perform arithmetic operations (e.g., addition, subtraction, multiplication) on entire vectors.
Vector Reduction: Operations that combine elements of a vector into a single result, like summing all elements of a vector.
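As a rough software analogy (again using NumPy as a stand-in), loads/stores correspond to moving arrays into and out of memory, vector arithmetic to element-wise operators, and reduction to operations such as summing all elements:

```python
import numpy as np

A = np.array([4, 8, 15, 16, 23, 42])   # the "loaded" vector

B = A * 2            # vector arithmetic: element-wise multiply
total = np.sum(A)    # vector reduction: combine all elements into one result
print(B, total)      # [ 8 16 30 32 46 84] 108
```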
Vector Registers:
These are special-purpose registers used to store vector data. They are larger than scalar registers and can hold multiple data
elements.
Vector Length:
The length of a vector refers to the number of elements it contains. Vector processors are designed to handle vectors of a
specific length efficiently.
Applications of Vectors:
1. Scientific Computing: Vectors are used to represent and compute scientific data, such as physical simulations and
mathematical modeling.
2. Graphics Processing: In graphics processing, vectors represent coordinates and colors, and vector operations are
used for rendering and transformations.
3. Machine Learning: Vectors represent data points and features in machine learning algorithms, enabling efficient
computation and analysis.
ARRAY PROCESSORS
Array Processors are specialized parallel computing systems that perform the same operation on multiple data points
simultaneously. They consist of multiple processing units that work in parallel to execute vector or matrix operations, making
them ideal for tasks that involve large datasets such as scientific computations, image processing, and simulations.
1. SIMD Architecture:
o Array processors use Single Instruction, Multiple Data (SIMD) architecture, where a single instruction is
applied to multiple data points at the same time.
o The array processor consists of multiple Processing Elements (PEs), each capable of performing operations
on different data elements in parallel.
2. Parallel Processing:
o Array processors are designed to exploit parallelism by performing operations on entire arrays or vectors in
parallel. This increases throughput and efficiency when dealing with large amounts of data.
3. Fixed Control Unit:
o The control unit issues a single instruction, which is broadcast to all the processing elements. Each processing
element then applies that instruction to its corresponding data point.
Components of an Array Processor:
Control Unit (CU): Issues instructions that are broadcast to all processing elements.
Processing Elements (PEs): Small, simple processors that perform the actual computation on individual data
elements. Each PE operates on one element of the array at a time.
Interconnection Network: Connects the PEs and allows data to be transferred between them or to/from memory.
Example:
Suppose A = [1, 2, 3, 4] and B = [5, 6, 7, 8], and the array processor has four Processing Elements (PE1, PE2, PE3, PE4).
The Control Unit (CU) issues a single instruction: ADD A, B.
Each PE then adds the corresponding elements of A and B:
o PE1 adds 1 + 5 = 6
o PE2 adds 2 + 6 = 8
o PE3 adds 3 + 7 = 10
o PE4 adds 4 + 8 = 12
The result C = [6, 8, 10, 12] is produced in parallel.
Since all PEs work in parallel, the operation takes the same time as processing a single element, making array processors
highly efficient for vector and matrix operations.
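The lock-step behaviour of the PEs can be modelled in a few lines of code. This is only a software sketch: each loop iteration stands in for one processing element, whereas in real hardware all four additions happen in the same cycle.

```python
# Minimal sketch: a single broadcast instruction (ADD A, B) applied by four PEs.
A = [1, 2, 3, 4]
B = [5, 6, 7, 8]

# PE k handles element k; in hardware all PEs operate simultaneously.
C = [a + b for a, b in zip(A, B)]
print(C)   # [6, 8, 10, 12]
```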
Applications of Array Processors
1. Scientific Computing: Array processors are used in scientific simulations where large-scale mathematical
computations (e.g., matrix multiplication, solving differential equations) are required.
2. Image and Signal Processing: Array processors are used for real-time image and signal processing tasks, such as
filtering, transformation, and enhancement.
3. Artificial Intelligence and Machine Learning: In AI/ML, array processors can accelerate operations on large datasets,
such as matrix multiplications in neural networks.
4. Weather Forecasting: Large amounts of data are processed simultaneously to predict weather patterns.
Advantages of Array Processors
1. High Throughput: Since operations are performed in parallel on multiple data elements, array processors can process
large datasets much faster than traditional processors.
2. Efficient for SIMD Tasks: Ideal for tasks where the same operation needs to be performed on multiple data points,
such as matrix operations.
3. Scalability: The number of processing elements can be increased to handle larger arrays or more complex tasks,
enhancing performance.
Disadvantages of Array Processors
1. Limited Flexibility: Array processors are best suited for SIMD operations. They may not perform well on tasks that
require different instructions for different data points.
2. Complex Hardware: The design and implementation of an array processor with multiple processing elements and a
control unit can be complex and costly.
3. Memory Bottleneck: Array processors often need fast access to large amounts of memory, which can become a
bottleneck if not efficiently managed.
IMPORTANT QUESTIONS
Q6. What technique can be used to mitigate control hazards in an instruction pipeline?
A) Stalling
B) Branch prediction
C) Data forwarding
D) Loop unrolling
Q9. Which of the following is true for RISC processors compared to CISC processors?
A) Fewer instructions, but they are more complex
B) More instructions and they execute faster
C) Fewer, simpler instructions with uniform execution time
D) No pipelining capabilities
A) Different instructions
B) A single element of data
C) Multiple elements of data simultaneously
D) Multiple memory locations simultaneously
Q15. Which of the following applications benefits most from array processors?