Unit 5

Uploaded by

goelh6718

COMPUTER ORGANIZATION AND ARCHITECTURE

Unit-5: CPU Performance, Parallel Processing, Multi-processor


Performance
• The most important measure of the performance of a computer is how quickly it
can execute programs. The speed with which a computer executes programs is
affected by the design of its hardware. For best performance, it is necessary to
design the compiler, the machine instruction set, and the hardware in a
coordinated way.

• The total time required to execute a program is called the elapsed time, and it is a
measure of the performance of the entire computer system. It is affected by the
speed of the processor, the disk and the printer. The time during which the
processor is actually executing the program's instructions is called the
processor time.

09/08/2024 2
• Just as the elapsed time for the execution of a program depends on all units in a
computer system, the processor time depends on the hardware involved in the
execution of individual machine instructions. This hardware comprises the
processor and the memory, which are usually connected by a bus.

• Let us examine the flow of program instructions and data between the memory
and the processor. At the start of execution, all program instructions and the
required data are stored in the main memory. As the execution proceeds,
instructions are fetched one by one over the bus into the processor, and a copy is
placed in the cache. Later, if the same instruction or data item is needed a second
time, it is read directly from the cache.

• The processor and relatively small cache memory can be fabricated on a single IC
chip. The internal speed of performing the basic steps of instruction processing
on chip is very high and is considerably faster than the speed at which the
instruction and data can be fetched from the main memory. A program will be
executed faster if the movement of instructions and data between the main
memory and the processor is minimized, which is achieved by using the cache.

• Processor Clock
Processor circuits are controlled by a timing signal called the clock. The clock defines
regular time intervals called clock cycles. To execute a machine instruction, the
processor divides the action to be performed into a sequence of basic steps such that each
step can be completed in one clock cycle. The length of one clock cycle is an important
parameter that affects processor performance.

• Basic performance equation


Let “T” be the processor time required to execute a program that has been prepared
in some high-level language. The compiler generates a machine language object
program that corresponds to the source program. Assume that complete execution of
the program requires the execution of “N” machine language instructions. Some
instructions may be executed more than once, which is the case for instructions inside
a program loop. Others may not be executed at all, depending on the input data used.
Suppose that the average number of basic steps needed to execute one machine
instruction is “S”, where each basic step is completed in one clock cycle. If the
clock rate is “R” cycles per second, the program execution time is given by

T = (N × S) / R
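As a rough illustration, the basic performance equation T = (N × S) / R can be evaluated directly. The sketch below is only an example: the function name and the sample values for N, S and R are assumptions made for illustration and do not come from the slides.

```python
def execution_time(n_instructions: int, steps_per_instruction: float,
                   clock_rate_hz: float) -> float:
    """Processor time T = (N * S) / R, in seconds."""
    return (n_instructions * steps_per_instruction) / clock_rate_hz


# Hypothetical program: N = 2 million instructions, S = 4 basic steps
# per instruction on average, R = 500 MHz clock.
t = execution_time(2_000_000, 4, 500e6)
print(f"T = {t} s")
```

The equation shows the three ways to reduce T: execute fewer instructions (smaller N), use fewer steps per instruction (smaller S), or raise the clock rate (larger R).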

• Clock rate
The clock rate is the number of clock cycles the processor completes in one second.
Generally, the higher the clock rate, the faster the CPU, but clock rate is not the
only factor. Overall speed also depends on
the number of processors, speed of RAM, bus speed, size of cache etc. Some
instructions require more clock cycles than others to complete. Depending upon
the architecture of the CPU, the clock rate can be more or less important.

MIPS - Million instructions per second
A measure of the execution speed of a computer. It gives, approximately, the
number of millions of machine instructions that the computer can execute in one
second.
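A MIPS figure can be sketched in two equivalent ways, assuming the commonly used definition MIPS = N / (T × 10^6), which reduces to R / (S × 10^6) when T = (N × S) / R. The function names and sample numbers below are illustrative assumptions, not values from the slides.

```python
def mips_from_time(n_instructions: int, exec_time_s: float) -> float:
    """MIPS = instruction count / (execution time * 10^6)."""
    return n_instructions / (exec_time_s * 1e6)


def mips_from_clock(clock_rate_hz: float, steps_per_instruction: float) -> float:
    """MIPS = R / (S * 10^6), assuming every basic step takes one clock cycle."""
    return clock_rate_hz / (steps_per_instruction * 1e6)


# Hypothetical machine: 500 MHz clock, 4 basic steps per instruction on average.
rate_a = mips_from_clock(500e6, 4)
# The same machine running 2 million instructions in 0.016 s.
rate_b = mips_from_time(2_000_000, 0.016)
print(rate_a, rate_b)
```

Because S depends on the instruction set and the program, MIPS values are only comparable between machines that run similar instruction mixes.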

Parallel processing and Pipelining
• Parallel processing refers to a large class of techniques that perform
data-processing tasks simultaneously in order to increase the computational speed
of a computer

• Pipelining is a technique of dividing a sequential process into sub-operations, with
each sub-operation being executed in a dedicated segment that works
simultaneously with other segments

• Each segment in a pipeline performs partial processing and the result obtained
from one segment is transferred to the next segment

• The final result is achieved after the data has passed through all the segments
• The simplest example of a pipeline uses an input register followed by a digital
combinational circuit in each segment. The register holds the data and the
circuit performs the sub-operation. The output of the circuit is then fed to
the input register of the next segment.

Space time diagram
A space-time diagram illustrates the behaviour of a pipeline by showing segment
utilization as a function of time.

Assume a k-segment pipeline with clock cycle time Tp is used to execute n tasks.
The first task T1 needs time kTp to be completely executed.
The remaining (n-1) tasks then emerge one per clock cycle, so they are completed after a further (n-1)Tp.
Total number of clock cycles required = k+(n-1)
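The cycle count k+(n-1) can be checked with a short sketch. The speedup function is an addition for illustration: it assumes a non-pipelined unit needs time tn per task, and by default takes tn = k × Tp, which is an assumption rather than something stated in the slides. All names and sample numbers are hypothetical.

```python
from typing import Optional


def pipeline_cycles(k: int, n: int) -> int:
    """Clock cycles for n tasks on a k-segment pipeline: k + (n - 1)."""
    return k + (n - 1)


def pipeline_time(k: int, n: int, tp: float) -> float:
    """Total time for n tasks with clock cycle time tp."""
    return pipeline_cycles(k, n) * tp


def speedup(k: int, n: int, tp: float, tn: Optional[float] = None) -> float:
    """Ratio of non-pipelined time (n * tn) to pipelined time.

    Assumption: if tn is not given, one task takes k * tp without pipelining.
    """
    if tn is None:
        tn = k * tp
    return (n * tn) / pipeline_time(k, n, tp)


# Hypothetical case: 4 segments, 100 tasks, 20 ns clock cycle.
cycles = pipeline_cycles(4, 100)
total_ns = pipeline_time(4, 100, 20)
gain = speedup(4, 100, 20)
print(cycles, total_ns, round(gain, 2))
```

Under the default assumption the speedup approaches k as n grows, because the k-1 cycles needed to fill the pipeline become negligible.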

Each operand needs to pass through all four segments in a fixed sequence.

Each segment has a combinational circuit Si that performs the sub-operation on a
data stream. The segments are separated by registers Ri that hold the intermediate
results between stages.

Arithmetic Pipeline
• Pipelined arithmetic units are found in high speed computers.
• Used to implement floating point operations, multiplication of fixed point
numbers or scientific problems etc.
• e.g. when two floating point numbers need to be added,
the pipeline sub-operations can be broken down as:
❑ Compare the exponents
❑ Align the mantissas
❑ Add the mantissas
❑ Normalize the result
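The four sub-operations for floating point addition can be sketched as four steps of one function. This is only an illustration: it uses decimal mantissas held as exact fractions, assumes normalized mantissas lie in the range 0.1 ≤ |m| < 1, and the function name and operand values are hypothetical.

```python
from fractions import Fraction
from typing import Tuple


def fp_add(m1: Fraction, e1: int, m2: Fraction, e2: int,
           base: int = 10) -> Tuple[Fraction, int]:
    """Add m1 * base**e1 and m2 * base**e2 in four pipeline-style steps."""
    # Segment 1: compare the exponents and keep the larger one.
    diff = e1 - e2
    exp = max(e1, e2)

    # Segment 2: align the mantissa of the operand with the smaller exponent.
    if diff > 0:
        m2 = m2 / base ** diff
    elif diff < 0:
        m1 = m1 / base ** (-diff)

    # Segment 3: add the mantissas.
    m = m1 + m2

    # Segment 4: normalize so that 1/base <= |m| < 1.
    if m == 0:
        return Fraction(0), 0
    while abs(m) >= 1:
        m, exp = m / base, exp + 1
    while abs(m) < Fraction(1, base):
        m, exp = m * base, exp - 1
    return m, exp


# Hypothetical operands: 0.9504 x 10^3 and 0.8200 x 10^2.
m, exp = fp_add(Fraction(9504, 10000), 3, Fraction(8200, 10000), 2)
print(float(m), exp)
```

In hardware each step would be a separate segment with its own register, so four different additions can be in progress at once; the function above only shows the order of the sub-operations.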

Instruction Pipeline
• An instruction pipeline reads consecutive instructions from memory while
previous instructions are being executed in other segments.
This causes the instruction fetch and execute phases to overlap and operate
simultaneously

• Consider a two-segment pipeline with instruction fetch and execution units.
The fetch segment can be implemented using a FIFO queue. Whenever the execution
unit is not using the memory, the control increments the PC and uses its address
to fetch consecutive instructions from memory and store them
in the queue

In the most general case, the steps needed to process each instruction are:
❑ Fetch the instruction from memory
❑ Decode the instruction
❑ Calculate the effective address
❑ Fetch operands from memory
❑ Execute the instruction
❑ Store the results

There are certain difficulties that prevent the instruction pipeline from operating at its
maximum rate:
❑ Different segments take different times to operate on incoming information
❑ Some segments get skipped for certain operations
❑ Two or more segments require memory access at the same time, causing one
segment to go into a wait state

As an example, take a 4-segment pipeline for instruction execution

The four segments of the instruction pipeline can be
❑ FI: Segment that fetches an instruction
❑ DA: Segment that decodes the instruction and calculates the effective address
❑ FO: Segment that fetches the operand
❑ EX: Segment that executes the instruction
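A space-time diagram for the FI, DA, FO and EX segments can be generated with a few lines of code. The sketch below assumes an ideal pipeline with no stalls or branches, in which every segment takes exactly one clock cycle; the function names are hypothetical.

```python
SEGMENTS = ["FI", "DA", "FO", "EX"]


def space_time(n_instructions, segments=SEGMENTS):
    """Return one row per segment; each cell holds the 1-based number of the
    instruction occupying that segment in that cycle, or None if it is idle."""
    k = len(segments)
    total_cycles = k + (n_instructions - 1)
    table = []
    for s in range(k):
        row = []
        for cycle in range(total_cycles):
            instr = cycle - s  # instruction i enters segment s in cycle i + s
            row.append(instr + 1 if 0 <= instr < n_instructions else None)
        table.append(row)
    return table


def render(table, segments=SEGMENTS):
    """Format the table as text, one line per segment."""
    lines = []
    for name, row in zip(segments, table):
        cells = " ".join("--" if i is None else f"I{i}" for i in row)
        lines.append(f"{name}: {cells}")
    return "\n".join(lines)


diagram = space_time(3)
print(render(diagram))
```

Each column of the output is one clock cycle. Reading down a column shows which instructions are being processed at the same time in different segments.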

The major difficulties that cause the instruction pipeline to deviate from normal
operation are:
1) Resource conflict: Two segments access memory at the
same time.
2) Data dependency: An instruction depends on the result of a previous instruction,
but that result is not yet available.
3) Branch difficulties: Arise when branch and other instructions change the
value of the PC

Vector Processing
• Utilized in science and engineering problems where a vast number of calculations
are required, which might take days or weeks to complete on a conventional computer.

• Applications like:
Long-range weather forecasting
Seismic data analysis
Medical diagnosis
Artificial intelligence
Image processing

• In scientific problems, the data is usually formulated as vectors and matrices of
floating point numbers.
• To access each element in these vectors, program loops are introduced.

• A computer capable of vector processing eliminates the overhead associated
with the time taken to fetch and execute the instructions in a program loop.

• A vector instruction includes the initial address of the operands, the length of the vectors
and the operation to be performed, all in one instruction.
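The contents of a vector instruction (initial operand addresses, vector length and operation) can be modelled as a small record. The sketch below is a hypothetical model, not a real instruction format: the field names, the flat list used as memory, and the destination address field are all assumptions made for illustration.

```python
import operator
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class VectorInstruction:
    """Hypothetical vector instruction: everything needed in one record."""
    op: Callable[[float, float], float]  # operation to be performed
    src_a: int                           # initial address of first operand vector
    src_b: int                           # initial address of second operand vector
    dest: int                            # initial address of the result vector
    length: int                          # number of elements


def execute_vector(memory: List[float], instr: VectorInstruction) -> None:
    """Apply instr.op element by element. The Python loop stands in for the
    hardware; no loop instructions are fetched or decoded per element."""
    for i in range(instr.length):
        memory[instr.dest + i] = instr.op(memory[instr.src_a + i],
                                          memory[instr.src_b + i])


# Two 3-element vectors at addresses 0 and 3; the result goes to address 6.
memory = [1.0, 2.0, 3.0, 10.0, 20.0, 30.0, 0.0, 0.0, 0.0]
execute_vector(memory, VectorInstruction(operator.add, 0, 3, 6, 3))
print(memory[6:9])
```

A scalar machine would instead fetch and execute an increment, a compare and a branch instruction for every element, which is the loop overhead that vector processing removes.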

Pipeline Hazards
• Pipeline hazards are situations that prevent the next instruction in the instruction
stream from executing during its designated clock cycles.
• Any condition that causes a stall in the pipeline operations can be called a hazard.
• There are primarily three types of hazards:

Data Hazards
Control Hazards or instruction Hazards
Structural Hazards

• Data Hazard
A data hazard is any condition in which either the source or the destination operands of an
instruction are not available at the time expected in the pipeline. As a result,
some operation has to be delayed and the pipeline stalls. This occurs whenever
one of two instructions depends on data produced by the other.

• Structural Hazard
This situation arises mainly when two instructions require a given hardware
resource at the same time and hence for one of the instructions the pipeline needs
to be stalled.
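Returning to the data hazard case, a dependency between nearby instructions can be detected by comparing register names. The sketch below is a simplified, hypothetical checker: the three-operand instruction syntax, the register names and the window of two following instructions are assumptions, and it only looks for an instruction that reads a register written by an earlier one.

```python
from typing import List, NamedTuple, Tuple


class Instr(NamedTuple):
    text: str                 # assembly text, for display only
    dest: str                 # register written by the instruction
    sources: Tuple[str, ...]  # registers read by the instruction


def read_after_write_hazards(program: List[Instr],
                             window: int = 2) -> List[Tuple[int, int, str]]:
    """Return (producer index, consumer index, register) for every instruction
    that reads a register written by one of the `window` preceding instructions."""
    hazards = []
    for i, producer in enumerate(program):
        last = min(i + 1 + window, len(program))
        for j in range(i + 1, last):
            if producer.dest in program[j].sources:
                hazards.append((i, j, producer.dest))
    return hazards


program = [
    Instr("ADD R1, R2, R3", "R1", ("R2", "R3")),
    Instr("SUB R4, R1, R5", "R4", ("R1", "R5")),  # needs R1 from the ADD
    Instr("AND R6, R7, R8", "R6", ("R7", "R8")),
]
hazards = read_after_write_hazards(program)
print(hazards)
```

Here the SUB instruction would have to wait until the ADD has produced R1, which is the stall described for a data hazard.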

• Control Hazard
The instruction fetch unit of the CPU is responsible for providing a stream of
instructions to the execution unit. The fetch unit normally fetches instructions from
consecutive memory locations, and they are executed in that order. A problem arises
when one of the instructions is a branch to some other memory
location. All the instructions already fetched into the pipeline from the consecutive memory
locations are then invalid and need to be removed. This induces a stall until new
instructions are fetched from the memory address specified in the branch
instruction.
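The cost of such stalls can be estimated with a simple model. The sketch below assumes that every taken branch discards a fixed number of already fetched instructions, and by default that this penalty is k-1 cycles (the branch is resolved in the last segment). Both the model and the sample numbers are assumptions for illustration, not figures from the slides.

```python
from typing import Optional


def cycles_with_branches(k: int, n: int, taken_branches: int,
                         penalty: Optional[int] = None) -> int:
    """Ideal cycle count k + (n - 1) plus a fixed penalty per taken branch."""
    if penalty is None:
        penalty = k - 1  # assumption: branch outcome is known in the last segment
    return k + (n - 1) + taken_branches * penalty


# Hypothetical run: 4 segments, 100 instructions.
ideal = cycles_with_branches(4, 100, taken_branches=0)
with_branches = cycles_with_branches(4, 100, taken_branches=10)
print(ideal, with_branches)
```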

Multi-processors
• A multi-processor system is an interconnection of 2 or more CPUs with memory
and I/O devices

• A processor in a multi-processor system can be either a CPU or an input-output
processor (IOP).

• Multi-processor systems are classified as MIMD systems (multiple instruction stream,
multiple data stream)

• A multi-processor system is controlled by an operating system that provides
interaction between processors and other system components

• A multi-processor system with common shared memory is called a shared-memory
or tightly-coupled multi-processor.
• The alternative is a distributed-memory or loosely-coupled
system, in which each processor element has its own private local memory. The
processors are tied together by a switching scheme designed to route information
from one processor to another through message passing.

• Multi-processing improves the reliability of a system so that failure in one part
has limited effect on rest of the system.

• If a fault causes one processor to fail, a second processor can be assigned to
perform the functions of the disabled processor.

• An overall function can be partitioned into a number of tasks, each handled by a
processor individually

• A program can be decomposed into parallel executable tasks

Interconnection structures
• Physical forms available for establishing an interconnection network between
various components of the computer system.

➢ Time-shared common bus
In a multiprocessor system, a time-shared common bus
provides a common communication path connecting all the functional
units, such as processors, I/O processors and memory units.

• Only one processor can communicate with memory or another processor at any
given time. Transfer operations are conducted by the processor that is in control
of the bus at the time.
• Any other processor wishing to initiate a transfer must first determine the
availability status of the bus and when the bus becomes available, the processor
can address the destination unit to initiate transfer.

• A single bus system is restricted to one transfer at a time, i.e. when one processor
is communicating with the memory, all other processors are either busy with internal
operations or idle waiting for the bus.

• One solution for this is to implement a dual bus structure

➢ Multi-port memory
A multiport memory system employs separate buses between each memory module
and each CPU. Every processor bus is connected to each memory
module.

The processor bus consists of address, data and control lines required to communicate
with the memory.

Each memory module has multiple ports and each port accommodates one of the
buses.

The module must have internal control logic to determine which port will have access
to memory at any given time.

Memory access conflicts are resolved by assigning fixed priorities to each memory port.

A disadvantage of this technique is that it requires expensive memory control logic and
a large number of connectors.
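The fixed-priority rule used by a multiport memory module can be sketched as a small selection function. The port names and the priority order below are hypothetical; the sketch only shows how one module picks a single requester per access.

```python
from typing import Iterable, Optional, Sequence


def grant_port(requests: Iterable[str],
               priority_order: Sequence[str]) -> Optional[str]:
    """Return the requesting port with the highest fixed priority, or None
    if no port is requesting access to this memory module."""
    pending = set(requests)
    for port in priority_order:  # earlier in the list means higher priority
        if port in pending:
            return port
    return None


# Hypothetical module with four ports; CPU1 has the highest priority.
priority = ["CPU1", "CPU2", "CPU3", "CPU4"]
winner = grant_port({"CPU3", "CPU2"}, priority)
print(winner)
```

With fixed priorities a low-priority port can wait a long time if higher-priority ports keep requesting the same module.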
➢ Crossbar switch
This organization consists of a number of crosspoints placed at intersections between
processor buses and memory module paths.

Each crosspoint consists of a switch that determines the path from a processor to a
memory module.

Each crosspoint has control logic to set up the transfer path between processor
and memory. It examines the address that is placed on the bus to determine
whether its particular module is being addressed.

It also resolves multiple requests for access to the same memory module on the basis
of a pre-determined priority.
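One cycle of crossbar arbitration can be sketched as follows. The processor and module names and the priority order are hypothetical, and the model only decides which crosspoints close in a single cycle: requests to different modules proceed together, while competing requests for one module are settled by priority.

```python
from typing import Dict, Sequence


def crossbar_cycle(requests: Dict[str, str],
                   priority: Sequence[str]) -> Dict[str, str]:
    """requests maps processor -> requested memory module.
    Returns module -> processor granted the path in this cycle."""
    granted: Dict[str, str] = {}
    for proc in priority:  # earlier in the list means higher priority
        module = requests.get(proc)
        if module is not None and module not in granted:
            granted[module] = proc
    return granted


# CPU1 and CPU2 both want module M2; CPU3 wants M1.
requests = {"CPU1": "M2", "CPU2": "M2", "CPU3": "M1"}
granted = crossbar_cycle(requests, ["CPU1", "CPU2", "CPU3"])
print(granted)
```

In this run CPU1 and CPU3 transfer at the same time through different crosspoints, while CPU2 has to retry in a later cycle.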
