
Pipelining and Vector Processing

Parallel Processing:
• Parallel processing is a term used for a large class of techniques that
provide simultaneous data-processing tasks for the purpose of increasing
the computational speed of a computer system.
• The system may have two or more ALUs so that it can execute two or more
instructions at the same time.
• The system may have two or more processors operating concurrently.
• Parallelism can be achieved by providing multiple functional units that
perform the same or different operations simultaneously.
• Parallel processing is accomplished by distributing the data among the
multiple functional units.
Processor with Multiple Functional Units:
The following figure shows one possible way of separating the execution unit
into eight functional units operating in parallel.
Fig: Processor with Multiple functional units
• The operation performed in each functional unit is indicated in each block
of the diagram.
• The adder and integer multiplier perform arithmetic operations with
integer numbers.
• The floating-point operations are separated into three circuits operating
in parallel.
• The logic, shift, and increment operations can be performed concurrently
on different data.
• All units are independent of each other, so one number can be shifted
while another number is being incremented.
Architectural Classification:
• Flynn's classification considers the organization of a computer system by
the number of instructions and data items that are manipulated
simultaneously.
• It is based on the multiplicity of instruction streams and data streams.
• Instruction stream: the sequence of instructions read from memory.
• Data stream: the operations performed on the data in the processor.
• Parallel processing may occur in the instruction stream, in the data
stream, or in both.
• Flynn's classification divides computers into 4 major groups:
1. SISD (Single Instruction stream, Single Data stream)
2. SIMD (Single Instruction stream, Multiple Data stream)
3. MISD (Multiple Instruction stream, Single Data stream)
4. MIMD (Multiple Instruction stream, Multiple Data stream)

• SISD represents an organization containing a single control unit, a
processor unit, and a memory unit. Instructions are executed sequentially,
and the system may or may not have internal parallel processing
capabilities.
• SIMD represents an organization that includes many processing units under
the supervision of a common control unit.
• The MISD structure is of only theoretical interest, since no practical
system has been constructed using this organization.
• MIMD refers to a computer system capable of processing several programs
at the same time.
• The main difference between a multicomputer system and a multiprocessor
system is that the multiprocessor system is controlled by one operating
system that provides interaction between processors, and all the components
of the system cooperate in the solution of a problem.
• Parallel processing can be discussed under the following topics:
• Pipeline Processing
• Vector Processing
• Array Processors

PIPELINING
• Pipelining is a technique of decomposing a sequential process into
suboperations, with each subprocess being executed in a special dedicated
segment that operates concurrently with all other segments.
• A pipeline is a collection of processing segments.
• Each segment performs partial processing, dictated by the way the task is
partitioned.
• The result obtained from each segment is transferred to the next segment.
• The final result is obtained after the data have passed through all
segments.
• Each suboperation is to be performed in a dedicated segment within the
pipeline.
• Each segment consists of one or two registers and a combinational circuit:
the registers hold the data, and the combinational circuit performs the
suboperation of that particular segment.
• A clock is applied to all registers after enough time has elapsed to
perform all segment activity.
• The pipeline organization will be demonstrated by means of a simple
example.
• To perform the combined multiply and add operations with a stream of
numbers
Ai * Bi + Ci for i = 1, 2, 3, …, 7
• Each suboperation is implemented in a segment within the pipeline by the
following register transfers:
R1 ← Ai,  R2 ← Bi          Input Ai and Bi
R3 ← R1 * R2,  R4 ← Ci     Multiply and input Ci
R5 ← R3 + R4               Add Ci to the product
• Each segment has one or two registers and a combinational circuit as shown
in Fig.
• The five registers are loaded with new data every clock pulse. The effect of
each clock is shown in Table.
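To make the clock-by-clock behavior concrete, here is a small C simulation
of the three-segment pipeline (a sketch only; the sample values in arrays
A, B, and C are invented for illustration):

    #include <stdio.h>

    #define N 7  /* number of (Ai, Bi, Ci) tasks, as in the example */

    int main(void) {
        double A[N] = {1, 2, 3, 4, 5, 6, 7};   /* sample values (invented) */
        double B[N] = {7, 6, 5, 4, 3, 2, 1};
        double C[N] = {1, 1, 1, 1, 1, 1, 1};

        /* Pipeline registers: segment 1 loads (R1, R2), segment 2 fills
           (R3, R4), segment 3 produces R5 = Ai*Bi + Ci.                */
        double R1 = 0, R2 = 0, R3 = 0, R4 = 0, R5 = 0;

        /* n tasks in a k = 3 segment pipeline take k + (n-1) = 9 pulses. */
        for (int clock = 0; clock < N + 2; clock++) {
            /* All transfers happen on the same clock edge, so update the
               later segments first, using the previous register values. */
            if (clock >= 2) {                 /* segment 3: add Ci to product */
                R5 = R3 + R4;
                printf("pulse %d: R5 = %g (task %d complete)\n",
                       clock + 1, R5, clock - 1);
            }
            if (clock >= 1 && clock <= N) {   /* segment 2: multiply, input Ci */
                R3 = R1 * R2;
                R4 = C[clock - 1];
            }
            if (clock < N) {                  /* segment 1: input Ai and Bi */
                R1 = A[clock];
                R2 = B[clock];
            }
        }
        return 0;
    }

The first result appears at clock pulse 3, and a new result appears every
pulse thereafter, matching the table.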

General Considerations:
• Any operation that can be decomposed into a sequence of suboperations of
about the same complexity can be implemented by a pipeline processor.
• The general structure of a four-segment pipeline is illustrated in the
figure below. We define a task as the total operation performed going
through all the segments in the pipeline.

• The behavior of a pipeline can be illustrated with a space-time diagram,
which shows the segment utilization as a function of time.
• The space-time diagram of a four-segment pipeline is demonstrated in the
figure.

• Consider a k-segment pipeline with a clock cycle time tp used to execute
n tasks.
• The first task T1 requires a time equal to k*tp to complete its operation.
• The remaining n-1 tasks emerge from the pipeline at the rate of one task
per clock cycle and are completed after an additional time of (n-1)*tp.
• Therefore, to complete n tasks using a k-segment pipeline requires
k + (n-1) clock cycles.
• Consider a nonpipelined unit that performs the same operation and takes a
time equal to tn to complete each task.
• The total time required for n tasks is n*tn.
• The speedup of pipeline processing over equivalent nonpipeline processing
is defined by the ratio
S = n*tn / ((k + n - 1)*tp)
• As n becomes much larger than k - 1, the speedup approaches
S = tn / tp
• If we assume that the time it takes to process a task is the same in the
pipeline and nonpipeline circuits, i.e., tn = k*tp, the speedup reduces to
S = k*tp / tp = k
• This shows that the theoretical maximum speedup that a pipeline can
provide is k, where k is the number of segments in the pipeline.
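As a quick illustration of these formulas, the C snippet below evaluates the
speedup for a 4-segment pipeline under the assumption tn = k*tp (the segment
count and the 20 ns cycle time are invented for the example):

    #include <stdio.h>

    /* Speedup of a k-segment pipeline over a nonpipelined unit:
       S = (n * tn) / ((k + n - 1) * tp)                          */
    double speedup(int k, int n, double tn, double tp) {
        return (n * tn) / ((k + n - 1.0) * tp);
    }

    int main(void) {
        int    k  = 4;        /* number of pipeline segments (assumed)      */
        double tp = 20.0;     /* pipeline clock cycle time in ns (assumed)  */
        double tn = k * tp;   /* nonpipelined task time, assuming tn = k*tp */

        /* As n grows, S approaches the theoretical maximum of k. */
        for (int n = 1; n <= 10000; n *= 10)
            printf("n = %5d  ->  S = %.3f\n", n, speedup(k, n, tn, tp));
        return 0;
    }

Running this shows S = 1.000 for a single task and S approaching 4 as n
becomes large, confirming the limit derived above.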
• To duplicate the theoretical speed advantage of a pipeline process by means
of multiple functional units, it is necessary to construct k identical units that
will be operating in parallel.
• This is illustrated in the figure below, where four identical circuits
are connected in parallel.
• Instead of operating with the input data in sequence as in a pipeline, the
parallel circuits accept four input data items simultaneously and perform
four tasks at the same time
• There are various reasons why the pipeline cannot operate at its maximum
theoretical rate.
• Different segments may take different times to complete their sub operation.
• It is not always correct to assume that a nonpipelined circuit has the
same time delay as an equivalent pipelined circuit.
• There are three areas of computer design where the pipeline organization is
applicable.
Arithmetic pipeline
Instruction pipeline
RISC pipeline

Arithmetic Pipeline:

• Pipeline arithmetic units are usually found in very high speed computers.
• They implement floating-point operations, multiplication of fixed-point
numbers, and similar computations encountered in scientific problems.
• Floating-point operations are easily decomposed into suboperations, as
demonstrated in Sec. 10-5.
• An example of a pipeline unit for floating-point addition and subtraction
is shown in the following figure.
• The inputs to the floating-point adder pipeline are two normalized
floating-point binary numbers.
• A and B are two fractions that represent the mantissas; a and b are the
exponents.
• The floating-point addition and subtraction can be performed in four
segments, as shown in Fig. 9-6.
• The suboperations that are performed in the four segments are:
1. Compare the exponents
2. Align the mantissas
3. Add or subtract the mantissas
4. Normalize the result

Example: Consider the addition of two floating-point numbers (decimal
values are used for clarity):
X = 0.9504 * 10^3
Y = 0.8200 * 10^2
1. Compare exponents by subtraction:
• The exponents are compared by subtracting them to determine their
difference. The larger exponent is chosen as the exponent of the result.
• The difference of the exponents, i.e., 3 - 2 = 1 determines how many times
the mantissa associated with the smaller exponent must be shifted to the
right.
2. Align the mantissas:
• The next segment shifts the mantissa of Y to the right, giving
X = 0.9504 * 10^3
Y = 0.0820 * 10^3
3. Add the mantissas:
• The two mantissas are added in segment three:
Z = X + Y = 1.0324 * 10^3
4. Normalize the result:
• After normalization, the result is written as:
Z = 0.10324 * 10^4
Flow chart for floating point addition and subtraction using
Pipelining

Pipelining for Floating point Addition and Subtraction


• The larger exponent is chosen as the exponent of the result
• The exponent difference determines how many times the mantissa associated
with the smaller exponent must be shifted to the right.
• When an overflow occurs, the mantissa of the sum or difference is shifted
right and the exponent incremented by one.
• If an underflow occurs, the number of leading zeros in the mantissa
determines the number of left shifts in the mantissa, and the exponent is
decremented by that same number.
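The four suboperations can also be sketched in software. The C fragment
below is a minimal illustration that mirrors the decimal worked example
above (the Fp struct and the base-10 representation are assumptions made
for clarity, not the hardware design):

    #include <stdio.h>

    /* A number is held as mantissa * 10^exp, with the mantissa normalized
       so that 0.1 <= |mantissa| < 1, matching the worked example above.  */
    typedef struct { double mantissa; int exp; } Fp;

    Fp fp_add(Fp x, Fp y) {
        /* Segments 1 and 2: compare the exponents and shift the mantissa
           of the number with the smaller exponent to the right.          */
        while (x.exp > y.exp) { y.mantissa /= 10.0; y.exp++; }
        while (y.exp > x.exp) { x.mantissa /= 10.0; x.exp++; }

        /* Segment 3: add the mantissas. */
        Fp z = { x.mantissa + y.mantissa, x.exp };

        /* Segment 4: normalize. On mantissa overflow, shift right and
           increment the exponent; on leading zeros, shift left and
           decrement the exponent once per shift.                       */
        while (z.mantissa >= 1.0 || z.mantissa <= -1.0) {
            z.mantissa /= 10.0; z.exp++;
        }
        while (z.mantissa != 0.0 && z.mantissa < 0.1 && z.mantissa > -0.1) {
            z.mantissa *= 10.0; z.exp--;
        }
        return z;
    }

    int main(void) {
        Fp x = {0.9504, 3}, y = {0.8200, 2};
        Fp z = fp_add(x, y);
        printf("Z = %.5f * 10^%d\n", z.mantissa, z.exp);  /* 0.10324 * 10^4 */
        return 0;
    }

In the hardware pipeline each of these stages works on a different pair of
operands at the same time; the sequential code only shows the data flow.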

Instruction Pipeline:
• Pipeline processing can occur not only in the data stream but in the
instruction stream as well.
• Consider a computer with an instruction fetch unit and an instruction
execution unit designed to provide a two-segment pipeline.
• Computers with complex instructions require other phases in addition to
fetch and execute to process an instruction completely.
• In the most general case, the computer needs to process each instruction
with the following sequence of steps.
1. Fetch the instruction from memory.
2. Decode the instruction.
3. Calculate the effective address.
4. Fetch the operands from memory.
5. Execute the instruction.
6. Store the result in the proper place.
• There are certain difficulties that will prevent the instruction pipeline from
operating at its maximum rate.
• Different segments may take different times to operate on the incoming
information.
• Some segments are skipped for certain operations.
• Two or more segments may require memory access at the same time,
causing one segment to wait until another is finished with the
memory.
Example: four-segment instruction pipeline:
• Assume that:
• The decoding of the instruction can be combined with the calculation of
the effective address into one segment (DA, segment 2).
• The instruction execution and the storing of the result can be combined
into one segment (EX, segment 4).
• Fig. 9-7 shows how the instruction cycle in the CPU can be processed with
a four-segment pipeline.

• Thus up to four suboperations in the instruction cycle can overlap, and
up to four different instructions can be in progress of being processed at
the same time.
• An instruction in the sequence may cause a branch out of the normal
sequence.
• In that case the pending operations in the last two segments are
completed, and all information stored in the instruction buffer is deleted.
• Similarly, an interrupt request will cause the pipeline to empty and
start again from a new address value.
• Fig. above shows the operation of the instruction pipeline.
• The four segments are represented in the diagram with an abbreviated
symbol.
1. FI is the segment that fetches an instruction.
2. DA is the segment that decodes the instruction and calculates
the effective address.
3. FO is the segment that fetches the operand.
4. EX is the segment that executes the instruction.
Timing of Instruction Pipeline
• The time in the horizontal axis is divided into steps of equal duration.
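As an illustration, the short C program below prints a space-time table
like the one in the figure, assuming an ideal pipeline with no branches or
memory conflicts, so instruction i simply occupies segment s during clock
cycle i + s (the instruction count is invented):

    #include <stdio.h>

    #define SEGMENTS     4
    #define INSTRUCTIONS 6

    int main(void) {
        const char *seg[SEGMENTS] = {"FI", "DA", "FO", "EX"};
        int cycles = INSTRUCTIONS + SEGMENTS - 1;   /* k + (n-1) cycles */

        printf("Instr");
        for (int c = 1; c <= cycles; c++) printf("%4d", c);
        printf("\n");

        /* Instruction i occupies segment s in clock cycle i + s (0-based). */
        for (int i = 0; i < INSTRUCTIONS; i++) {
            printf("%4d ", i + 1);
            for (int c = 0; c < cycles; c++) {
                int s = c - i;
                printf("%4s", (s >= 0 && s < SEGMENTS) ? seg[s] : "");
            }
            printf("\n");
        }
        return 0;
    }

Each row is one instruction; reading down a column shows that all four
segments are busy with different instructions once the pipeline is full.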
Pipeline Hazards:
• A hazard is a conflict that prevents an instruction from executing during
its designated clock cycle.
• In general, there are three major difficulties that cause the instruction
pipeline to deviate from its normal operation:
1. Structural hazards
2. Data hazards
3. Control hazards
1. Structural Hazards:
These are resource conflicts caused by two segments accessing memory at the
same time. They can be resolved by using separate instruction and data
memories.
2. Data Hazards:
These conflicts arise when an instruction depends on the result of a
previous instruction, but this result is not yet available.
3. Control Hazards:
These conflicts arise when a branch instruction changes the value of the PC.

RISC (Reduced Instruction Set Computer) Pipeline:
• The data transfer instructions in RISC are LOAD and STORE.
• To prevent conflicts between a memory access to fetch an instruction and
a memory access to load or store an operand, most RISC machines use two
separate buses with two memories: one for storing instructions and the
other for storing data.
• Example: Three-Segment Instruction Pipeline
• There are three types of instructions:
• The data manipulation instructions: operate on data in processor registers
• The data transfer instructions (load and store)
• The program control instructions (branch instructions)
• The instruction cycle can be divided into three suboperations
and implemented in three segments:
I: Instruction fetch
• Fetches the instruction from program memory
A: ALU operation
• The instruction is decoded and an ALU operation is performed. The ALU
performs the operation for a data manipulation instruction, evaluates the
effective address for a load or store instruction, or calculates the branch
address for a program control instruction.
E: Execute instruction
• Directs the output of the ALU to one of three destinations, depending on
the decoded instruction:
• It transfers the result of the ALU operation into a destination register
in the register file.
• It transfers the effective address to a data memory for loading or storing.
• It transfers the branch address to the program counter.
Delayed Load:
• Consider the operation of the following four instructions:
1. LOAD:  R1 ← M[address 1]
2. LOAD:  R2 ← M[address 2]
3. ADD:   R3 ← R1 + R2
4. STORE: M[address 3] ← R3
• There will be a data conflict in instruction 3 because the operand in R2 is
not yet available in the A segment.
• This can be seen from the timing of the pipeline shown in Fig. 9-9(a).
Pipelining Timing with Delayed load:
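A sketch of the delayed-load fix: if the compiler inserts a no-op after the
second LOAD, the ADD does not reach the A segment until R2 has been loaded,
and the conflict disappears (the instruction numbering here is illustrative):

1. LOAD:  R1 ← M[address 1]
2. LOAD:  R2 ← M[address 2]
3. NOP               (delay slot inserted by the compiler)
4. ADD:   R3 ← R1 + R2
5. STORE: M[address 3] ← R3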
Delayed Branch
• The method used in most RISC processors is to rely on the compiler to
redefine the branches so that they take effect at the proper time in the
pipeline.
• This method is referred to as delayed branch.
• The compiler is designed to analyze the instructions before and after the
branch and rearrange the program sequence by inserting useful instructions
in the delay steps.
• It is up to the compiler to find useful instructions to put after the branch
instruction. Failing that, the compiler can insert no-op instructions.
• An Example of Delayed Branch:
• The program for this example consists of five instructions.
1. Load from memory to R1
2. Increment R2
3. Add R3 to R4
4. Subtract R5 from R6
5. Branch to address X
• In Fig. 9-10(a) the compiler inserts two no-op instructions after the branch.
• The branch address X is transferred to PC in clock cycle 7.
• The program in Fig. 9-10(b) is rearranged by placing the add and subtract
instructions after the branch instruction.
• PC is updated to the value of X in clock cycle 5.

Pipelining Timing with Delayed Branch:
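To make the rearrangement concrete, the two instruction orders can be
listed side by side (a reconstruction of the Fig. 9-10 sequences described
above):

(a) With inserted no-ops:        (b) Rearranged by the compiler:
1. Load                          1. Load
2. Increment                     2. Increment
3. Add                           3. Branch to X
4. Subtract                      4. Add       (executed in delay slot)
5. Branch to X                   5. Subtract  (executed in delay slot)
6. NOP
7. NOP

In (a) the branch takes effect at clock cycle 7; in (b) the two delay slots
do useful work and PC receives X at clock cycle 5.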


Vector processing:
• Normal computational systems are not sufficient for some special
processing requirements.
• In many science and engineering applications, the problems can be formulated
in terms of vectors and matrices that lend themselves to vector processing.
• Computers with vector processing capabilities are in demand in specialized
applications.
Examples:
• Long-range weather forecasting
• Petroleum explorations
• Seismic data analysis
• Medical diagnosis
• Artificial intelligence and expert systems
• Image processing
• Mapping the human genome
• The term vector processing refers to data processing on vectors involving
large amounts of data.
• The large data sets can be organized as very large arrays.
• A vector is treated as a large one-dimensional array of data.
• The vector processing approach can be understood from the example below.
• EX: Consider a program that adds two arrays A and B of length 100 to
produce an array C.
• Machine-level program:
     Initialize I = 0
20   Read A(I)
     Read B(I)
     Store C(I) = A(I) + B(I)
     Increment I = I + 1
     If I <= 100 go to 20
     Continue
• In the above program the two arrays are added element by element in a
loop.
• Starting from I = 0, the loop continues the addition until I reaches 100.
• The loop body contains 5 statements, each executed 100 times, so the CPU
spends about 500 cycles on the task.
• If we use the concept of vector processing, we can eliminate the
unnecessary fetch cycles.
• The same program written as a single vector instruction is:
C(1:100) = A(1:100) + B(1:100)
• When the system creates a vector like this, the source values are fetched
from memory into the vector registers, so the data is readily available.
• When an operation is initiated on the data, it is performed directly,
without waiting for separate fetch cycles.
• So the total number of CPU cycles taken by the above instruction is only
about 100.
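For comparison, the scalar loop can be written in C as below; an
auto-vectorizing compiler (or explicit vector intrinsics) maps exactly this
pattern onto vector instructions. The array length of 100 follows the
example above, and the sample data is invented:

    #include <stdio.h>

    #define N 100   /* array length, following the example above */

    int main(void) {
        float A[N], B[N], C[N];

        for (int i = 0; i < N; i++) {   /* invented sample data */
            A[i] = (float)i;
            B[i] = (float)(N - i);
        }

        /* Scalar form of C(1:100) = A(1:100) + B(1:100). A vectorizing
           compiler can map this loop onto vector hardware, processing
           many elements per instruction instead of one at a time.      */
        for (int i = 0; i < N; i++)
            C[i] = A[i] + B[i];

        printf("C[0] = %g, C[99] = %g\n", C[0], C[99]);
        return 0;
    }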
• Instruction format of a vector instruction:
Matrix Multiplication
• The multiplication of two n x n matrices consists of n^2 inner products
or n^3 multiply-add operations.
• Consider, for example, the multiplication of two 3 x 3 matrices A and B.
The first element of the product matrix is
c11 = a11*b11 + a12*b21 + a13*b31
• This requires three multiplications and (after initializing c11 to 0)
three additions.
• In general, an inner product consists of the sum of k product terms of
the form
C = A1*B1 + A2*B2 + A3*B3 + ... + Ak*Bk
• In a typical application k may be equal to 100 or even 1000.
• The inner product calculation on a pipeline vector processor is shown in
Fig. 9-12.
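The partial-sum idea behind the pipelined inner product can be sketched in
C: because the adder pipeline is four segments deep, the products are
accumulated into four interleaved partial sums that are combined at the end
(the data values and k = 100 are invented for illustration):

    #include <stdio.h>

    #define K        100   /* number of product terms (invented) */
    #define SEGMENTS 4     /* depth of the pipelined adder        */

    int main(void) {
        double A[K], B[K];
        for (int i = 0; i < K; i++) { A[i] = 1.0; B[i] = 2.0; }

        /* Each partial sum accumulates every 4th product term:
           sum[0] = A1*B1 + A5*B5 + ..., sum[1] = A2*B2 + A6*B6 + ...
           In the hardware, the four running sums circulate through the
           four segments of the adder pipeline.                        */
        double sum[SEGMENTS] = {0};
        for (int i = 0; i < K; i++)
            sum[i % SEGMENTS] += A[i] * B[i];

        /* Combine the four partial sums into the final inner product. */
        double C = sum[0] + sum[1] + sum[2] + sum[3];
        printf("C = %g\n", C);   /* 100 terms of 1*2 -> 200 */
        return 0;
    }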
Implementation of the Vector Processing
• Below we can see the implementation of the vector processing concept on
matrix multiplication.
• The diagram shows how the values of vector A and vector B, which
represent the matrices, are multiplied. Here we consider 4 x 4 matrices
A and B.
• While an addition is taking place in the adder pipeline, the next set of
values is brought into the multiplier pipeline, so that all the operations
are performed simultaneously, using parallel processing through pipelining.
Memory Interleaving:
• Pipeline and vector processors often require simultaneous access to memory
from two or more sources.
• An instruction pipeline may require the fetching of an instruction and an
operand at the same time from two different segments.
• An arithmetic pipeline usually requires two or more operands to enter the
pipeline at the same time.
• Instead of using two memory buses for simultaneous access, the memory can
be partitioned into a number of modules connected to common memory address
and data buses.
• A memory module is a memory array together with its own address and
data registers.
• Fig. 9-13 shows a memory unit with four modules.

Multiple module Memory Organization


• The advantage of a modular memory is that it allows the use of a technique
called interleaving.
• In an interleaved memory, different sets of addresses are assigned to
different memory modules.
• By staggering the memory access, the effective memory cycle time can be
reduced by a factor close to the number of modules.
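A minimal sketch of low-order interleaving, assuming four modules as in the
figure: the two low-order address bits select the module, so consecutive
addresses rotate through all modules and can be accessed in staggered,
overlapped fashion:

    #include <stdio.h>

    #define MODULES 4   /* four memory modules, as in the figure */

    int main(void) {
        /* With low-order interleaving, address a resides in module
           a % MODULES at local word a / MODULES, so consecutive
           addresses fall in different modules.                    */
        for (unsigned a = 0; a < 12; a++)
            printf("address %2u -> module %u, word %u\n",
                   a, a % MODULES, a / MODULES);
        return 0;
    }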

Array Processors:
• An array processor is a processor that performs computations on large
arrays of data.
• The term is used to refer to two different types of processors.
Attached array processor:
An auxiliary processor attached to a general-purpose computer, intended to
improve the performance of the host computer in specific numerical
computation tasks.
SIMD array processor:
A processor that has a single-instruction multiple-data organization. It
manipulates vector instructions by means of multiple functional units
responding to a common instruction.

Attached Array Processor
• Its purpose is to enhance the performance of the host computer by
providing vector processing for complex scientific applications.
• It achieves this through parallel processing with multiple functional
units.
• Fig. 9-14 shows the interconnection of an attached array processor to a host
computer.
Attached Array Processor with host computer
• The host computer is a general-purpose commercial computer and the
attached processor is a back-end machine driven by the host computer.
• The array processor is connected through an input-output controller to the
computer and the computer treats it like an external interface.
• The data for the attached processor are transferred from main memory to a
local memory through a high-speed bus.
• The general-purpose computer without the attached processor serves the
users that need conventional data processing.
• The system with the attached processor satisfies the needs for complex
arithmetic applications.
• For example, when attached to a VAX 11 computer, the FPS-164/MAX from
Floating Point Systems increases the computing power of the VAX to
100 megaflops.
• The objective of the attached array processor is to provide vector
manipulation capabilities to a conventional computer at a fraction of the
cost of a supercomputer.
SIMD Array Processor:
• An SIMD array processor is a computer with multiple processing units
operating in parallel.
• A general block diagram of an array processor is shown in Fig. 9-15.

• It contains a set of identical processing elements (PEs), each having a
local memory M.
• Each PE includes an ALU, a floating-point arithmetic unit, and working
registers.
• Vector instructions are broadcast to all PEs simultaneously.
• Masking schemes are used to control the status of each PE during the
execution of vector instructions.
• Each PE has a flag that is set when the PE is active and reset when the PE is
inactive.
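A sketch of the masking scheme in C: one vector instruction is broadcast to
all PEs, and each PE applies it only if its flag is set (the PE count and
the data values are invented for illustration):

    #include <stdio.h>

    #define PES 8   /* number of processing elements (invented) */

    int main(void) {
        double data[PES]   = {1, 2, 3, 4, 5, 6, 7, 8};   /* local memories */
        int    active[PES] = {1, 1, 0, 1, 0, 1, 1, 0};   /* per-PE flags   */

        /* One vector instruction ("multiply by 10") is broadcast to all
           PEs; a PE executes it only while its flag is set, and masked
           PEs remain idle for that instruction.                        */
        for (int pe = 0; pe < PES; pe++)
            if (active[pe])
                data[pe] *= 10.0;

        for (int pe = 0; pe < PES; pe++)
            printf("PE%d: %g\n", pe, data[pe]);
        return 0;
    }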
• An example is the ILLIAC IV computer, developed at the University of
Illinois and manufactured by the Burroughs Corp.
• SIMD array processors are highly specialized computers, suited primarily
for numerical problems that can be expressed in vector or matrix form.
