
ASSIGNMENT-1ST

OF
PARALLEL COMPUTING

SUBMITTED BY: RAKESH KUMAR

ROLL NO: 2100083

COURSE: B. TECH 7TH SEM/CSE

SUBMITTED TO: ER. PAWANPREET MAM


INDEX

SR. NO.  TITLE
Q1.      Define parallel computing?
Q2.      Explain Flynn's classification?
Q3.      Explain SPMD?
Q4.      Give Handler's classification?
Q5.      What are the paradigms of parallel computing?

Q1. Define parallel computing?

Ans.

Parallel computing refers to the process of executing an application or computation on several processors simultaneously. Generally, it is a kind of computing architecture in which large problems are broken into smaller, independent, usually similar parts that can be processed in one go. This is done by multiple CPUs communicating via shared memory, and the results are combined upon completion. It helps in performing large computations, since it divides the large problem between more than one processor.

Parallel computing also helps in faster application processing and task resolution by increasing the available computation power of systems. Most supercomputers employ parallel computing principles to operate. Parallel processing is commonly used in operational scenarios that need massive processing power or computation.

Typically, this infrastructure is housed where various processors are installed in a server rack; the application server distributes the computational requests into small chunks, and the requests are then processed simultaneously on each server. The earliest computer software was written for serial computation, since such programs execute a single instruction at a time; parallel computing is different in that it executes an application or computation on several processors at the same time.

There are many reasons to use parallel computing, such as saving time and money, providing concurrency, and solving larger problems. Furthermore, parallel computing reduces complexity. A real-life example of parallel computing is two queues at a ticket counter: if two cashiers are issuing tickets to two people simultaneously, it saves time as well as reduces complexity.

Types of parallel computing

Across open-source and proprietary parallel computing vendors, there are generally three types of parallel computing available, which are discussed below:

1. Bit-level parallelism: The form of parallel computing in which every task depends on the processor word size. When performing a task on large-sized data, it reduces the number of instructions the processor must execute; otherwise the operation has to be split into a series of instructions. For example, suppose there is an 8-bit processor and you want to perform an operation on 16-bit numbers. The processor must first operate on the 8 lower-order bits and then on the 8 higher-order bits, so two instructions are needed to execute the operation, whereas a 16-bit processor can perform the operation with one instruction (a small C sketch of this idea follows this list).
2. Instruction-level parallelism: Within a single CPU clock cycle, the processor decides in instruction-level parallelism how many instructions are executed at the same time; a processor exploiting instruction-level parallelism can issue more than one instruction per clock cycle. The software approach to instruction-level parallelism relies on static parallelism, where the compiler decides which instructions to execute simultaneously.
3. Task Parallelism: Task parallelism is the form of parallelism in which a task is decomposed into subtasks. Each subtask is then allocated for execution, and the subtasks are executed concurrently by different processors.
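
To make the bit-level example above concrete, here is a minimal C sketch (purely illustrative, not taken from any particular processor manual) that adds two 16-bit numbers using only 8-bit operations, the way an 8-bit processor would, and then with a single 16-bit addition:

```c
#include <stdint.h>
#include <stdio.h>

/* Add two 16-bit values using only 8-bit operations, as an 8-bit CPU would:
 * first the low-order bytes, then the high-order bytes plus the carry. */
static uint16_t add16_on_8bit(uint16_t x, uint16_t y) {
    uint8_t lo = (uint8_t)(x & 0xFF) + (uint8_t)(y & 0xFF);      /* step 1: low bytes  */
    uint8_t carry = (lo < (uint8_t)(x & 0xFF)) ? 1 : 0;          /* detect overflow    */
    uint8_t hi = (uint8_t)(x >> 8) + (uint8_t)(y >> 8) + carry;  /* step 2: high bytes */
    return ((uint16_t)hi << 8) | lo;
}

int main(void) {
    uint16_t a = 0x1234, b = 0x0FCD;
    printf("two 8-bit steps : 0x%04X\n", add16_on_8bit(a, b)); /* 0x2201 */
    printf("one 16-bit step : 0x%04X\n", (uint16_t)(a + b));   /* same result, one add */
    return 0;
}
```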

Applications of Parallel Computing

There are various applications of Parallel Computing, which are as follows:

o One of the primary applications of parallel computing is databases and data mining.
o Real-time simulation of systems is another use of parallel computing.
o Technologies such as networked video and multimedia.
o Science and engineering.
o Collaborative work environments.
o The concept of parallel computing is also used in augmented reality, advanced graphics, and virtual reality.

Advantages of Parallel computing

Parallel computing advantages are discussed below:

o In parallel computing, more resources are used to complete a task, which shortens the time taken and can cut costs. Parallel clusters can also be constructed from cheap components.
o Compared with serial computing, parallel computing can solve larger problems in a shorter time.
o For simulating, modeling, and understanding complex, real-world phenomena, parallel computing is much more appropriate than serial computing.
o When local resources are finite, parallel computing lets you benefit from non-local resources.
o Many problems are so large that it is impractical or impossible to solve them on a single computer; the concept of parallel computing helps remove this kind of limitation.
o One of the best advantages of parallel computing is that it allows you to do several things at a time by using multiple computing resources.
o Furthermore, parallel computing makes better use of the hardware, whereas serial computing wastes its potential computing power.

Disadvantages of Parallel Computing

There are many limitations of parallel computing, which are as follows:

o Parallel architecture can be difficult to achieve.
o In the case of clusters, better cooling technologies are needed for parallel computing.
o It requires carefully managed algorithms that can be handled by a parallel mechanism.
o Multi-core architectures have high power consumption.
o A parallel computing system needs low coupling and high cohesion, which is difficult to create.
o The code for a parallelism-based program can only be written by the most technically skilled and expert programmers.
o Although parallel computing helps you resolve computation- and data-intensive problems by using multiple processors, it sometimes affects the overall behaviour of the system and of some control algorithms, and the parallel option does not always provide good outcomes.
o Due to synchronization, thread creation, data transfers, and more, the extra cost can sometimes be quite large; it may even exceed the gains from parallelization.
o Moreover, to improve performance, a parallel computing system needs different code tweaking for different target architectures.

Why parallel computing?

There are various reasons why we need parallel computing, which are discussed below:

o Parallel computing deals with larger problems. In the real world, many things run at the same time in numerous places, which is difficult to manage. In this case, parallel computing helps to manage this kind of extensively huge data.
o Parallel computing is the key to better data modeling and dynamic simulation, and it is needed to achieve them. Therefore, parallel computing is needed for the real world too.
o Serial computing is not ideal for implementing real-time systems; parallel computing, in contrast, offers concurrency and saves time and money.
o Large, complex datasets and their management can be organized only by using the concept of parallel computing.
o The parallel computing approach ensures the effective use of resources and guarantees the effective use of hardware, whereas in serial computation only some parts of the hardware are used and the rest is rendered idle.

Q2: Explain Flynn’s classification?

Ans. Flynn’s taxonomy is a classification scheme for computer architectures proposed by


Michael Flynn in 1966. The taxonomy is based on the number of instruction streams and data
streams that can be processed simultaneously by a computer architecture.
There are four categories in Flynn’s taxonomy:
1. Single Instruction Single Data (SISD): In a SISD architecture, there is a single
processor that executes a single instruction stream and operates on a single data
stream. This is the simplest type of computer architecture and is used in most
traditional computers.
2. Single Instruction Multiple Data (SIMD): In a SIMD architecture, there is a single
processor that executes the same instruction on multiple data streams in parallel.
This type of architecture is used in applications such as image and signal processing.
3. Multiple Instruction Single Data (MISD): In a MISD architecture, multiple
processors execute different instructions on the same data stream. This type of
architecture is not commonly used in practice, as it is difficult to find applications
that can be decomposed into independent instruction streams.
4. Multiple Instruction Multiple Data (MIMD): In a MIMD architecture, multiple
processors execute different instructions on different data streams. This type of
architecture is used in distributed computing, parallel processing, and other high-
performance computing applications.

Flynn’s taxonomy is a useful tool for understanding different types of computer architectures
and their strengths and weaknesses. The taxonomy highlights the importance of parallelism
in modern computing and shows how different types of parallelism can be exploited to
improve performance.
In more detail, computer systems are classified into the following four major categories under Flynn's classification:

1. Single-instruction, single-data (SISD) systems – An SISD computing system is a


uniprocessor machine that is capable of executing a single instruction, operating on
a single data stream. In SISD, machine instructions are processed in a sequential
manner and computers adopting this model are popularly called sequential
computers. Most conventional computers have SISD architecture. All the
instructions and data to be processed have to be stored in primary memory.

The speed of the processing element in the SISD model is limited by (dependent on) the rate at which the computer can transfer information internally. Dominant representative SISD systems are the IBM PC and workstations.

2. Single-instruction, multiple-data (SIMD) systems – An SIMD system is a multiprocessor machine capable of executing the same instruction on all the CPUs but operating on different data streams. Machines based on the SIMD model are well suited to scientific computing, since it involves many vector and matrix operations. The data elements of vectors can be divided into multiple sets (N sets for an N-PE system) so that the information can be passed to all the processing elements (PEs), and each PE can then process one data set (a short data-parallel sketch appears after this list). Dominant representative SIMD systems are Cray's vector processing machines.
3. Multiple-instruction, single-data (MISD) systems – An MISD computing system is a multiprocessor machine capable of executing different instructions on different PEs, but all of them operate on the same data set. For example, for Z = sin(x) + cos(x) + tan(x), the system performs different operations on the same data. Machines built using the MISD model are not useful in most applications; a few machines have been built, but none of them is available commercially.

4. Multiple-instruction, multiple-data (MIMD) systems – An MIMD system is a multiprocessor machine that is capable of executing multiple instructions on multiple data sets. Each PE in the MIMD model has separate instruction and data streams; therefore, machines built using this model are capable of handling any kind of application. Unlike SIMD and MISD machines, PEs in MIMD machines work asynchronously.
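
As a rough, generic illustration of the SIMD (data-parallel) idea described in category 2 above (not tied to any of the machines named there), the loop below applies the same operation, an element-wise addition, to many data elements; the "#pragma omp simd" hint asks the compiler to use the CPU's vector instructions so that several elements are handled per instruction:

```c
#include <stdio.h>

#define N 8

int main(void) {
    float x[N] = {1, 2, 3, 4, 5, 6, 7, 8};
    float y[N] = {8, 7, 6, 5, 4, 3, 2, 1};
    float z[N];

    /* Same operation applied to every element: the compiler may vectorize this
     * loop so that several additions happen in one SIMD instruction.
     * Compile with e.g. "cc -O2 -fopenmp-simd simd.c". */
    #pragma omp simd
    for (int i = 0; i < N; i++)
        z[i] = x[i] + y[i];

    for (int i = 0; i < N; i++)
        printf("%.1f ", z[i]);   /* prints 9.0 eight times */
    printf("\n");
    return 0;
}
```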
Q3: Explain SPMD?

Ans. Development of the first SPMD version of Radioss started in 1994. The version became a real alternative to the SMP version after a long period of parallelization and optimization of the code. In fact, the scalability of this version is much better than that of the SMP version. The SPMD version allows more processors to be used with better efficiency; it makes it possible to use up to 64 processors. In addition, all Radioss Crash options are available in this version, including the "Arithmetic Parallel" option.
The principle of the program is based on Single Program Multiple Data: the same program runs with different data. Radioss Starter carries out the domain decomposition. Then, Radioss Engine just has to send the data to the different processors in an initialization step. Thereafter, a copy of the program runs over each sub-domain. It is necessary to communicate information between processors to manage the data on the borders of the domains. This is carried out in Radioss using the MPI (Message Passing Interface) library.
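
To illustrate the SPMD principle in general terms, here is a minimal, generic MPI sketch (it is not Radioss code; the "domain decomposition" is just splitting the range 1..N across the processes): every process runs the same program, works on its own slice of the data according to its rank, and the partial results are combined by communication.

```c
#include <mpi.h>
#include <stdio.h>

#define N 1000000

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* which copy of the program am I? */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* how many copies are running?    */

    /* Simple "domain decomposition": each rank sums its own slice of 1..N. */
    long long begin = (long long)N * rank / size + 1;
    long long end   = (long long)N * (rank + 1) / size;
    long long local = 0, total = 0;
    for (long long i = begin; i <= end; i++)
        local += i;

    /* Communication between the sub-domains: combine the partial results. */
    MPI_Reduce(&local, &total, 1, MPI_LONG_LONG, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum over %d processes = %lld\n", size, total);

    MPI_Finalize();
    return 0;
}
```

Built with mpicc and launched with, for example, "mpirun -np 4 ./a.out", the same executable runs four times, once per sub-domain.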
Figure 1 illustrates the architecture of multi-processor computers with distributed memory. The Radioss SPMD version runs independently of the memory architecture: it works equally well on computers with shared memory, on computers with distributed memory, or on a set of workstations in a network.

Figure 1. Architecture with Distributed Memory


Q4: Give the handler’s classification?

Ans. In 1977, Wolfgang Handler proposed a detailed notation for expressing the parallelism and pipelining of computers. Handler's classification addresses the computer at three distinct levels:

o Processor control unit (PCU)
o Bit-level circuit (BLC)
o Arithmetic logic unit (ALU)

The processor control unit corresponds to a processor or CPU, the bit-level circuit corresponds to the logic circuitry needed to perform one-bit operations in the ALU, and the arithmetic logic unit corresponds to a functional unit or a processing element.
Handler's classification uses the following three pairs of integers to describe a computer:

Computer = (p * p', a * a', b * b')

where:
o p  = number of PCUs
o p' = number of PCUs that can be pipelined
o a  = number of ALUs controlled by each PCU
o a' = number of ALUs that can be pipelined
o b  = number of bits in an ALU or processing element (PE) word
o b' = number of pipeline segments in all ALUs or in a single PE
The following operators and rules are used to show the relationship between a variety of
elements of the computer:

o The '*' operator is used to indicate that the units are pipelined or macro-pipelined with
a stream of data running through all the units.
o The '+' operator is used to denote that the units are not pipelined but work on
independent streams of data.
o The 'v' operator is used to denote that the computer hardware can work in one of
numerous modes.
o The '~' symbol is used to specify a range of values for any one of the parameters.
o Peripheral processors are specified before the main processor, using another three pairs of integers. If the value of the second element of any pair is 1, it may be omitted for brevity.
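
As a purely hypothetical illustration of the notation (not part of Handler's work), one can think of each description as a record of six integers; the small C sketch below prints such records, applying the omit-if-1 rule, using the CDC 6600 values discussed in the example that follows:

```c
#include <stdio.h>

/* One Handler description: (p * p', a * a', b * b'). */
struct handler_desc {
    int p, p_pipe;   /* PCUs and pipelined PCUs                 */
    int a, a_pipe;   /* ALUs per PCU and pipelined ALUs         */
    int b, b_pipe;   /* bits per ALU word and pipeline segments */
};

static void print_desc(const char *name, struct handler_desc d) {
    /* Elements whose second value is 1 are omitted for brevity, as in the text. */
    printf("%s = (%d", name, d.p);
    if (d.p_pipe != 1) printf(" * %d", d.p_pipe);
    printf(", %d", d.a);
    if (d.a_pipe != 1) printf(" * %d", d.a_pipe);
    printf(", %d", d.b);
    if (d.b_pipe != 1) printf(" * %d", d.b_pipe);
    printf(")\n");
}

int main(void) {
    /* Values taken from the CDC 6600 example discussed below. */
    print_desc("CDC 6600 I/O ", (struct handler_desc){10, 1, 1, 1, 12, 1});
    print_desc("CDC 6600 main", (struct handler_desc){1, 1, 1, 10, 60, 1});
    return 0;
}
```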

Handler's classification is best illustrated by showing how the operators and rules are used to classify various machines.
The CDC 6600 has a single main processor supported by 10 I/O processors. One control unit manages one ALU with a 60-bit word length. The ALU has 10 functional units, which can be formed into a pipeline. The 10 peripheral I/O processors can work in parallel with the CPU and with each other. Each I/O processor contains one 12-bit ALU. The description of the 10 I/O processors is:
CDC 6600 I/O = (10, 1, 12)
The description of the main processor is:
CDC 6600 main = (1, 1 * 10, 60)
The I/O processors and the main processor can be regarded as forming a macro-pipeline, so the '*' operator is used to join the two structures:
CDC 6600 = (I/O processors) * (central processor) = (10, 1, 12) * (1, 1 * 10, 60)
Texas Instruments' Advanced Scientific Computer (ASC) has one controller coordinating four arithmetic units. Each arithmetic unit is an eight-stage pipeline with 64-bit words. Therefore, we have:
ASC = (1, 4, 64 * 8)
The Cray-1 is a 64-bit single-processor computer whose ALU has twelve functional units, eight of which can be chained together to form a pipeline. Different functional units have from 1 to 14 segments, which can also be pipelined. Handler's description of the Cray-1 is:
Cray-1 = (1, 12 * 8, 64 * (1 ~ 14))
Another example system is Carnegie-Mellon University's C.mmp multiprocessor. This system was designed to facilitate research into parallel computer architectures and consequently can be extensively reconfigured. The system consists of 16 PDP-11 'minicomputers' (each with a 16-bit word length), interconnected by a crossbar switching network. Usually, the C.mmp operates in MIMD mode, for which the description is (16, 1, 16). It can also be operated in SIMD mode, where all the processors are synchronized by a single master controller; the SIMD-mode description is (1, 16, 16). Finally, the system can be rearranged to operate in MISD mode, where the processors are arranged in a chain with a single stream of data passing through all of them; the MISD-mode description is (1 * 16, 1, 16). The 'v' operator is used to join descriptions of the same piece of hardware operating in different modes. Thus, Handler's description of the complete C.mmp is:
C.mmp = (16, 1, 16) v (1, 16, 16) v (1 * 16, 1, 16)

The '+' and '*' operators are used to join several separate pieces of hardware. The 'v' operator is of a different kind from the other two, in that it is used to join the different operating modes of a single piece of hardware.
While Flynn's classification is simple to use, Handler's classification is cumbersome. The direct use of numbers in the nomenclature of Handler's classification makes it much more abstract and hence harder to apply. Handler's classification is heavily geared towards the description of chains and pipelines: while it can describe the parallelism in a single processor well, the range of parallelism in multiprocessor computers is not addressed well.
Q5: What are the paradigms of parallel computing?

Ans. Parallel computing involves various programming and execution paradigms that
determine how tasks are divided, executed, and coordinated across multiple processors or
cores to achieve better performance. Some common paradigms of parallel computing include:

Task Parallelism: In this paradigm, a problem is divided into multiple tasks or sub-tasks, and
each task is executed concurrently on different processing units. Task parallelism is suitable
for applications where different tasks can be performed independently and do not require
frequent synchronization. Examples include parallelizing tasks in a rendering pipeline or
distributed data processing.
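
A minimal sketch of task parallelism in C with OpenMP sections (the two "tasks" here are placeholder loops, assumed to be independent) might look like this:

```c
#include <stdio.h>

int main(void) {
    double a = 0.0, b = 0.0;

    /* Two independent tasks run concurrently on different threads.
     * Compile with e.g. "cc -O2 -fopenmp tasks.c". */
    #pragma omp parallel sections
    {
        #pragma omp section
        {   /* task 1: one piece of independent work */
            for (int i = 0; i < 1000000; i++) a += 1.0;
        }
        #pragma omp section
        {   /* task 2: another, unrelated piece of work */
            for (int i = 0; i < 1000000; i++) b += 2.0;
        }
    }

    printf("a = %.0f, b = %.0f\n", a, b);
    return 0;
}
```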

Data Parallelism: Data parallelism involves applying the same operation to multiple data
elements simultaneously. This paradigm is often seen in SIMD architectures, where a single
instruction operates on multiple data elements. GPUs and vector processing are examples of
data parallelism, where the same operation is performed on multiple data items in parallel.

Pipeline Parallelism: In pipeline parallelism, tasks are divided into stages, and each stage
processes a different part of the data. Each stage operates concurrently, and data moves
through the pipeline from one stage to another. This approach is common in scenarios where
tasks can be decomposed into a sequence of sub-tasks that can be executed in parallel.

Fork-Join Parallelism: In this paradigm, a master task (parent) forks into multiple sub-tasks
(child tasks) that can be executed concurrently. Once the child tasks are completed, the master
task joins or waits for their completion before continuing. This is often used for tasks that can
be divided into smaller independent units of work.
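
A minimal fork-join sketch using POSIX threads (the work each child does is just a placeholder sum): the parent forks two child threads and then joins, i.e. waits for, both of them before continuing.

```c
#include <pthread.h>
#include <stdio.h>

/* Each child task sums the numbers 1..limit passed to it. */
static void *child_task(void *arg) {
    long limit = *(long *)arg;
    long sum = 0;
    for (long i = 1; i <= limit; i++)
        sum += i;
    printf("child summed 1..%ld = %ld\n", limit, sum);
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    long n1 = 1000, n2 = 2000;

    /* Fork: the master creates two child tasks that run concurrently. */
    pthread_create(&t1, NULL, child_task, &n1);
    pthread_create(&t2, NULL, child_task, &n2);

    /* Join: the master waits for both children before continuing. */
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    printf("master continues after both children finished\n");
    return 0;
}
```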

Task Farming: Task farming involves a master task distributing a set of identical sub-tasks to
multiple workers for parallel execution. This is suitable for applications where a large number
of tasks need to be processed independently and in parallel, such as in distributed simulations
or grid computing.
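
On a shared-memory machine, one simple way to sketch the task-farming idea is an OpenMP loop with dynamic scheduling: the runtime plays the role of the master, handing the next unprocessed task to whichever worker thread becomes free (the "task" below is just a placeholder computation).

```c
#include <stdio.h>

#define NTASKS 100

/* Placeholder for one independent unit of work. */
static double do_task(int id) {
    double x = 0.0;
    for (int i = 0; i < 10000; i++)
        x += (double)id * 0.001;
    return x;
}

int main(void) {
    double results[NTASKS];

    /* schedule(dynamic): idle worker threads pull the next task from the pool,
     * which is the essence of a master/worker task farm. */
    #pragma omp parallel for schedule(dynamic)
    for (int t = 0; t < NTASKS; t++)
        results[t] = do_task(t);

    printf("task 0 -> %.2f, task %d -> %.2f\n",
           results[0], NTASKS - 1, results[NTASKS - 1]);
    return 0;
}
```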

Dataflow Parallelism: Dataflow parallelism focuses on the flow of data between tasks. Tasks
execute as soon as their input data becomes available, rather than being explicitly scheduled.
This paradigm is often used in stream processing and real-time systems where tasks depend
on the availability of data.

Hybrid Parallelism: Hybrid parallelism combines multiple parallelism paradigms within a single
application or system. For example, a program might utilize both task parallelism and data
parallelism to optimize performance for different parts of the application. This approach is
common in complex parallel applications.
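
A minimal sketch of hybrid parallelism, combining MPI across processes with OpenMP threads inside each process (assuming a toolchain that supports both; the work is again a placeholder), might look like this:

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int provided, rank;

    /* MPI provides the outer (distributed-memory) level of parallelism. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double local = 0.0;

    /* OpenMP provides the inner (shared-memory) level inside each MPI process. */
    #pragma omp parallel for reduction(+:local)
    for (int i = 0; i < 1000000; i++)
        local += 1.0;

    double total = 0.0;
    MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("grand total across all processes and threads = %.0f\n", total);

    MPI_Finalize();
    return 0;
}
```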
