
PERFORMANCE METRICS AND MEASURES

PARALLELISM PROFILE IN PROGRAMS:


What are the conditions of Parallelism in Computer Architecture?

There are various conditions of parallelism, which are as follows −

 Data and resource dependencies − A program is made up of several segments, so executing program segments in parallel requires that each segment be independent of the other segments. Dependencies between segments may take various forms, such as resource dependence, control dependence, and data dependence. A dependence graph can describe these relations: program statements are represented by nodes, and directed edges with labels show the ordered relations among the statements. After analyzing the dependence graph, it can be shown where opportunities exist for parallelization and vectorization.
Data Dependencies − Relations between statements are shown by data dependences. There are five types of data dependence, as follows (see the code sketch after this list) −
o Antidependency − A statement ST2 is antidependent on statement ST1 if ST2 follows ST1 in program order and if the output of ST2 overlaps the input to ST1.
o Input dependence − Two read or write statements are input-dependent not because the same variables are involved, but because the same file is referenced by both I/O statements.
o Unknown dependence − The dependence relation between two statements cannot be determined in the following situations:
 The subscript of a variable is itself subscripted.
 The subscript does not contain the loop index variable.
 The subscript is nonlinear in the loop index variable.
o Output dependence − Two statements are output-dependent if they produce the same output variable.
o Flow dependence − A statement ST2 is flow-dependent on statement ST1 if an execution path exists from ST1 to ST2 and at least one output of ST1 feeds in as an input to ST2.
 Software Parallelism − Software parallelism is defined by the control and data dependences of programs. The degree of parallelism is revealed in the program profile or the program flow graph. Software parallelism is a function of the algorithm, programming style, and compiler optimization. The program flow graph shows the patterns of simultaneously executable operations. Parallelism in a program varies during the execution period.
 Hardware Parallelism − Hardware parallelism is defined by machine architecture and hardware multiplicity. It is a function of cost and performance trade-offs. It displays the resource utilization patterns of simultaneously executable operations, and it also indicates the peak performance of the processor resources. One way of identifying parallelism in hardware is by the number of instructions issued per machine cycle.
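
As a concrete illustration of the dependence types above, the following minimal Python sketch (a hypothetical example, not part of the original notes) classifies the dependences between simple statements from their per-statement read and write sets:

    # A statement's writes and reads determine its dependences on earlier ones.
    # S1: a = b + c   -> writes {a}, reads {b, c}
    # S2: d = a * 2   -> reads a written by S1   => flow dependence on S1
    # S3: a = e - 1   -> rewrites a              => output dependence on S1,
    #                    and antidependence with S2 (S2 reads a before S3 overwrites it)

    def dependences(writes, reads, i, j):
        """Classify how a later statement j depends on an earlier statement i."""
        kinds = []
        if writes[i] & reads[j]:
            kinds.append("flow")    # i writes what j reads
        if reads[i] & writes[j]:
            kinds.append("anti")    # i reads what j overwrites
        if writes[i] & writes[j]:
            kinds.append("output")  # both write the same variable
        return kinds

    writes = {1: {"a"}, 2: {"d"}, 3: {"a"}}
    reads = {1: {"b", "c"}, 2: {"a"}, 3: {"e"}}
    print(dependences(writes, reads, 1, 2))  # ['flow']
    print(dependences(writes, reads, 2, 3))  # ['anti']
    print(dependences(writes, reads, 1, 3))  # ['output']

Input and unknown dependences are omitted from the sketch, since they involve file references and subscript analysis rather than simple read/write sets.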
 Q. What is parallelism? What are the various conditions of parallelism?
 Ans.
 Parallelism is a major concept used in today's computers. The use of multiple functional units is a form of parallelism within the CPU. In early computers there was only one arithmetic and logic unit, so only one operation could execute at a time. The ALU function can instead be distributed across multiple functional units operating in parallel. H.T. Kung has recognized the need to advance in three areas: computation models for parallel computing, interprocess communication in parallel architectures, and system integration for incorporating parallel systems into general computing environments.
 Conditions of Parallelism:
 1. Data and resource dependencies:
 A program is made up of several parts, so the ability to execute several program segments in parallel requires that each segment be independent of the other segments. Dependencies between segments of a program may take various forms, such as resource dependence, control dependence, and data dependence. A dependence graph is used to describe the relations: program statements are represented by nodes, and directed edges with different labels show the ordered relations among the statements. After analyzing the dependence graph, it can be shown where opportunities exist for parallelization and vectorization.
 Data Dependencies:
 Relations between statements are shown by data dependences. There are 5 types of data dependence, given below:
 (a) Antidependency:
 A statement ST2 is antidependent on statement ST1 if ST2 follows ST1 in program order and if the output of ST2 overlaps the input to ST1.
 (b) Input dependence:
 Two read or write statements are input-dependent not because the same variables are involved, but because the same file is referenced by both I/O statements.
 (c) Unknown dependence:
 The dependence relation between two statements cannot be determined in the following situations:
 The subscript of a variable is itself subscripted.
 The subscript does not contain the loop index variable.
 The subscript is nonlinear in the loop index variable.


 (d) Output dependence:
 Two statements are output-dependent if they produce the same output variable.
 (e) Flow dependence:
 A statement ST2 is flow-dependent on a statement ST1 if an execution path exists from ST1 to ST2 and at least one output of ST1 feeds in as an input to ST2.
 2. Bernstein's condition:
 Bernstein discovered a set of conditions under which two processes can execute in parallel. A process is a program in execution; it is an active entity, corresponding to a program fragment defined at various processing levels. Ii is the input set of process Pi, the set of all input variables needed to execute the process; similarly, the output set Oi consists of all output variables generated after execution of process Pi. Input variables are the operands fetched from memory or registers; output variables are the results to be stored in working registers or memory locations. Let there be two processes P1 and P2, with input sets I1 and I2 and output sets O1 and O2. The two processes P1 and P2 can execute in parallel, denoted P1 || P2, if and only if they are independent and do not create confusing results, that is, if and only if I1 ∩ O2 = ∅, I2 ∩ O1 = ∅, and O1 ∩ O2 = ∅.
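
The set-intersection form of Bernstein's conditions translates directly into code. The sketch below (a hypothetical helper, not part of the original notes) tests whether two processes may run in parallel, given their input and output sets:

    # Bernstein's conditions: P1 || P2 iff
    #   I1 ∩ O2 = ∅,  I2 ∩ O1 = ∅,  and  O1 ∩ O2 = ∅.
    def bernstein_parallel(I1, O1, I2, O2):
        return not (I1 & O2) and not (I2 & O1) and not (O1 & O2)

    # P1: c = a + b  -> I1 = {a, b}, O1 = {c}
    # P2: e = a * d  -> I2 = {a, d}, O2 = {e}
    print(bernstein_parallel({"a", "b"}, {"c"}, {"a", "d"}, {"e"}))  # True: P1 || P2

    # P3: f = c + 1 reads c, which P1 writes, so P1 and P3 must stay ordered.
    print(bernstein_parallel({"a", "b"}, {"c"}, {"c"}, {"f"}))  # False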
 3. Software Parallelism:
 Software parallelism is defined by the control and data dependences of programs. The degree of parallelism is revealed in the program profile or in the program flow graph. Software parallelism is a function of the algorithm, programming style, and compiler optimization. The program flow graph shows the patterns of simultaneously executable operations. Parallelism in a program varies during the execution period.
 4. Hardware Parallelism:
 Hardware parallelism is defined by machine architecture and hardware multiplicity. It is a function of cost and performance trade-offs. It displays the resource utilization patterns of simultaneously executable operations, and it also indicates the peak performance of the processor resources. One way of identifying parallelism in hardware is by the number of instructions issued per machine cycle.
 Q.6. What are the different levels of parallelism?
 Ans. Levels of parallelism are described below:
 1. Instruction Level:
 At the instruction level, a grain consists of fewer than 20 instructions, called fine grain. Fine-grain parallelism at this level may range from two to thousands, depending on the individual program. Single-instruction-stream parallelism is greater than two, but the average parallelism at the instruction level is around five, rarely exceeding seven, in an ordinary program. For scientific applications, the average parallelism is in the range of 500 to 3000 Fortran statements executing concurrently in an idealized environment.
 2. Loop Level:
 This level embraces iterative loop operations. A typical loop contains fewer than 500 instructions. Some loop-independent operations can be vectorized for pipelined execution or for lock-step execution on SIMD machines. Loop-level parallelism is the most optimized program construct to execute on a parallel or vector computer, but recursive loops are difficult to parallelize. Vector processing is mostly exploited at the loop level by a vectorizing compiler.
 3. Procedural Level:
 This level corresponds to medium grain size at the task, procedure, and subroutine levels. A grain at this level has fewer than 2000 instructions. Detection of parallelism at this level is much more difficult than at the finer-grain levels. The communication requirement is much less than that required in MIMD execution mode, but major effort is required of the programmer to reorganize a program at this level.
 4. Subprogram Level:
 The subprogram level corresponds to job steps and related subprograms. Grain size here is typically thousands of instructions. Job steps can overlap across different jobs. Multiprogramming on a uniprocessor or a multiprocessor is conducted at this level.
 5. Job Level:
 This level corresponds to the parallel execution of essentially independent jobs on a parallel computer. The grain size here can be tens of thousands of instructions. It is handled by the program loader and by the operating system. Time-sharing and space-sharing multiprocessors exploit this level of parallelism.
 Q.7. Explain vector supercomputers.
 Ans.
 Program and data are first loaded into the main memory from a host computer. All instructions are first decoded by the scalar control unit. If the decoded instruction is a scalar operation or a program control operation, it is directly executed by the scalar processor using the scalar functional pipelines. If the instruction is decoded as a vector operation, it is sent to the vector control unit, which supervises the flow of vector data between the main memory and the vector functional pipelines and synchronizes the vector data flow. A number of vector functional pipelines may be built into a vector processor. Computers with vector processing capabilities are in demand in specialized applications. The following are representative application areas where vector processing is of utmost importance:
 Long-range weather forecasting


 Petroleum exploration
 Medical diagnosis
 Space flight simulations
 Vector Processor Models
 [Figure: The architecture of a vector supercomputer. A host computer loads program and data into main memory; the scalar control unit issues scalar instructions to the scalar processor's scalar functional pipelines, while vector instructions go to the vector control unit, which coordinates the vector registers and vector functional pipelines with main memory and mass storage.]
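
To make the decode-and-route flow concrete, here is a minimal sketch (the instruction format and opcode names are invented for illustration, not taken from any real machine):

    # Hypothetical dispatcher: the scalar control unit decodes every
    # instruction; vector opcodes (prefixed "V") go to the vector control
    # unit, everything else to the scalar functional pipelines.
    def dispatch(instruction):
        opcode, *operands = instruction
        if opcode.startswith("V"):
            return f"vector control unit <- {opcode} {operands}"
        return f"scalar pipeline     <- {opcode} {operands}"

    program = [("LOAD", "r1", "x"), ("VADD", "v1", "v2", "v3"), ("BRANCH", "loop")]
    for ins in program:
        print(dispatch(ins))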

 Q.8. What are the different shared memory multiprocessor models?
 Ans.
 The most popular parallel computers are those that execute programs in MIMD mode. There are two major classes of parallel computers: shared-memory multiprocessors and message-passing multicomputers. The major distinction between multiprocessors and multicomputers lies in memory sharing and the mechanisms used for interprocessor communication. The processors in a multiprocessor system communicate with each other through shared variables in a common memory. Each computer node in a multicomputer system has a local memory, unshared with other nodes; interprocessor communication is done through message passing among the nodes. There are three shared-memory multiprocessor models:
 1. Uniform memory access (UMA) model
 2. Non-uniform memory access (NUMA) model
 3. Cache-only memory architecture (COMA) model
 These models differ in how the memory and peripheral resources are shared or distributed.
 1. UMA Model:


 18
 For f
 ree study notes log on:
 www.gurukpo.com

 [Figure: The UMA multiprocessor model. Processors P1, P2, …, Pn and shared-memory modules SM1, …, SMm are connected by a system interconnect (bus, crossbar, or multistage network), with shared I/O.]
 In this model the physical memory is uniformly shared by all the processors. All processors have equal access time to all memory words, which is why it is called uniform memory access. Each processor may use a private cache, and peripherals are also shared. Multiprocessors are called tightly coupled systems because of this high degree of resource sharing. The UMA model is suitable for time-sharing applications by multiple users, and it can be used to speed up the execution of a single large program in time-critical applications. When all processors have equal access to all peripheral devices, the system is called a symmetric multiprocessor; in this case, all the processors are equally capable of running executive programs, such as the OS kernel. In an asymmetric multiprocessor, only one processor or a subset of processors is executive-capable. An executive or master processor executes the operating system and handles I/O; the remaining processors, called attached processors (APs), run user code under the supervision of the master processor.
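
Here is a minimal sketch of this shared-variable style of communication, using Python threads as stand-ins for processors (the example is not part of the original notes):

    import threading

    shared_counter = 0          # a shared variable in common memory
    lock = threading.Lock()

    def worker(n):
        global shared_counter
        for _ in range(n):
            with lock:          # serialize access to the shared variable
                shared_counter += 1

    threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(shared_counter)       # 40000: every "processor" sees the same memory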
 2. NUMA model:
 A NUMA multiprocessor is a shared-memory system in which the access time varies with the location of the memory word. Two NUMA machine models are depicted below. In the first, the shared memory is physically distributed among all the processors as local memories, and the collection of all local memories forms a global address space accessible by all processors. It is faster to access a local memory with the local processor; access to remote memory attached to other processors takes longer because of the added delay through the interconnection network, as the simple model after the figure illustrates.

 [Figure: Shared local memories. Processors P1, P2, …, Pn, each with a local memory LM1, LM2, …, LMn, are connected through an interconnection network.]
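
Here is that back-of-envelope model (the latency figures below are illustrative assumptions, not from the original notes): the effective memory access time grows with the fraction of references that go remote.

    def effective_access_time(local_ns, remote_ns, local_fraction):
        # Weighted average of local and remote accesses.
        return local_fraction * local_ns + (1.0 - local_fraction) * remote_ns

    for f in (1.0, 0.9, 0.5):
        t = effective_access_time(local_ns=100, remote_ns=400, local_fraction=f)
        print(f"{f:.0%} local references -> {t:.0f} ns average")

The better a program's data placement keeps references local, the closer a NUMA machine behaves to a UMA one.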
 In the hierarchical cluster model, processors are divided into several clusters, and each cluster may itself be a UMA or NUMA machine. The clusters are connected to global shared-memory modules. All processors of a single cluster uniformly access the cluster shared-memory modules, and all clusters have equal access to the global memory; however, the access time to cluster memory is shorter than that to the global memory.

 [Figure: A hierarchical cluster model. Global shared-memory (GSM) modules sit on a global interconnect network; each cluster contains processors P1, …, Pn and cluster shared-memory (CSM) modules on its own cluster interconnect network (CIN).]
