ACA Lect7n
(0630561)
Lecture 7
ACA- Lecture
Objective:
Performance Metrics:
CPU Performance Equation:
Execution Time (T):
T = Ic * CPI * t
T: CPU time (seconds/program) needed to execute a program.
Ic: Number of Instructions in a given program.
CPI: Cycles per instruction.
t: Cycle time. t=1/f, f=clock rate.
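Splitting the CPI into processor cycles and memory cycles, CPI = p + m * k, gives:
T = Ic * (p + m * k) * t
p: Number of processor cycles needed to decode and execute an instruction.
m: Number of memory references needed per instruction.
k: Ratio of the memory cycle time to the processor cycle time.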
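The equation above can be sketched in a few lines of Python. This is only an illustration: the function name `execution_time` and the numeric values are assumptions, not lecture data, and p, m and k are read in the usual way (processor cycles per instruction, memory references per instruction, and memory-to-processor cycle ratio).

```python
def execution_time(ic, p, m, k, f):
    """Return CPU time T = Ic * (p + m*k) * t in seconds, with t = 1/f."""
    cpi = p + m * k      # effective cycles per instruction
    t = 1.0 / f          # cycle time in seconds
    return ic * cpi * t


# Illustrative values only (assumed, not from the lecture):
# 1,000,000 instructions, p = 2, m = 0.5, k = 4, f = 100 MHz.
T = execution_time(ic=1_000_000, p=2, m=0.5, k=4, f=100e6)
print(f"T = {T * 1e3:.1f} ms")   # CPI = 4, so T = 40.0 ms
```

Raising k (a slower memory relative to the processor) or m (more memory references per instruction) increases T even when the clock rate is unchanged.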
System Attributes:
T = Ic * (p + m * k) * t
The above five performance factors (Ic, p, m, k & t) are influenced by these attributes:
ATTRIBUTE                        Ic   p    m    k    t
Instruction set architecture     X    X
Compiler technology              X    X    X
CPU implementation & control          X              X
Cache & memory hierarchy                        X    X
• The instruction set architecture affects the program length (Ic) and p.
• Compiler design affects the values of Ic, p and m.
• The CPU implementation and control determine the total processor time needed (p * t).
• The memory technology and hierarchy design affect the memory access time (k * t).
MIPS Rate:
• Processor speed is often measured in millions of instructions per second (MIPS).
• MIPS rate varies with respect to:
– Clock rate (f).
– Instruction count (Ic).
– CPI of a given machine.
$$\text{MIPS} = \frac{I_c}{T \times 10^6} = \frac{f}{CPI \times 10^6} = \frac{f \times I_c}{N \times 10^6}$$
Where N is the total number of clock cycles needed to execute a given program.
$$T = I_c \times CPI \times t = \frac{I_c \times 10^{-6}}{\text{MIPS}}$$
The MIPS rate of a given computer is directly proportional to the clock rate and inversely proportional to the CPI.
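The proportionality can be checked numerically. The sketch below uses hypothetical helper names (`mips_rate`, `time_from_mips`) and assumed values (a 40 MHz clock, a CPI of 2, a program of 200,000 instructions) that do not come from the lecture.

```python
def mips_rate(f_hz, cpi):
    """MIPS = f / (CPI * 10^6)."""
    return f_hz / (cpi * 1e6)


def time_from_mips(ic, mips):
    """T = Ic * 10^-6 / MIPS, in seconds."""
    return ic * 1e-6 / mips


# Assumed illustrative machine: 40 MHz clock, CPI of 2.
base = mips_rate(40e6, 2)           # 20 MIPS
faster_clock = mips_rate(80e6, 2)   # doubling f doubles the MIPS rate
higher_cpi = mips_rate(40e6, 4)     # doubling CPI halves the MIPS rate

print(base, faster_clock, higher_cpi)
print(time_from_mips(200_000, base))   # seconds for an assumed 200,000-instruction program
```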
Throughput Rate (Wp):
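• The throughput rate Wp is the number of programs executed per second, the reciprocal of the execution time T:
$$W_p = \frac{1}{T} = \frac{f}{I_c \times CPI}$$
• When a program uses n instruction types, the CPI in these formulas is the weighted average:
$$CPI = \frac{\sum_{i=1}^{n} CPI_i \times I_{ci}}{I_c} = \sum_{i=1}^{n} CPI_i \times \frac{I_{ci}}{I_c}$$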
Where:
CPIi: Average number of clock cycles per instruction for instruction type i.
Ici: Number of times instruction type i is executed in the program.
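As a rough sketch of the weighted-average CPI, the function below takes one (CPIi, Ici) pair per instruction type. The name `average_cpi` and the instruction counts are assumptions chosen for illustration only.

```python
def average_cpi(mix):
    """Weighted-average CPI.

    mix: list of (cpi_i, ic_i) pairs, one per instruction type, where
    ic_i is the number of times that type is executed.
    """
    ic_total = sum(ic_i for _, ic_i in mix)
    return sum(cpi_i * ic_i for cpi_i, ic_i in mix) / ic_total


# Assumed counts: 50 instructions at CPI 1, 30 at CPI 3, 20 at CPI 5.
print(average_cpi([(1, 50), (3, 30), (5, 20)]))   # (50 + 90 + 100) / 100 = 2.4
```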
Example:
Suppose you have made the following measurements:
– Frequency of FP operations (other than FPSQR) = 25%
– Average CPI of FP operations = 4
– Average CPI of other operations = 1.33
– Frequency of FPSQR = 2%
– CPI of FPSQR = 20
• Assume that the TWO design alternatives are to decrease the CPI of FPSQR to 2, or to decrease
the average CPI of all FP operations to 2.5. Compare these two design alternatives.
$$CPI_{original} = \sum_{i=1}^{n} CPI_i \times \frac{I_{ci}}{I_c} = (4 \times 25\%) + (1.33 \times 75\%) \approx 2.0$$
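• CPI (with new FPSQR) = CPIoriginal - 2% * [CPIoldFPSQR - CPInewFPSQR]
  = 2.0 - 2% * [20 - 2] = 1.64 for design 1
• CPI (with new FP) = [75% * 1.33] + [25% * 2.5] = 1.625 for design 2
• Design 2 gives the slightly lower CPI, so improving all FP operations is marginally the better alternative.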
$$\text{Speedup}_{newFP} = \frac{\text{CPU time}_{original}}{\text{CPU time}_{newFP}} = \frac{I_c \times \text{clock cycle} \times CPI_{original}}{I_c \times \text{clock cycle} \times CPI_{newFP}}$$
$$\text{Speedup}_{newFP} = \frac{CPI_{original}}{CPI_{newFP}} = \frac{2.0}{1.625} = 1.23$$
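The comparison can be reproduced with a short script. It is a sketch that follows the same steps as the example, including rounding the original CPI to 2.0; the variable names are illustrative.

```python
# Measurements from the example.
FREQ_FP, CPI_FP = 0.25, 4.0
FREQ_OTHER, CPI_OTHER = 0.75, 1.33
FREQ_FPSQR, CPI_FPSQR_OLD, CPI_FPSQR_NEW = 0.02, 20.0, 2.0
CPI_FP_NEW = 2.5

# 4 * 0.25 + 1.33 * 0.75 = 1.9975, rounded to 2.0 as in the example.
cpi_original = round(FREQ_FP * CPI_FP + FREQ_OTHER * CPI_OTHER, 1)

# Design 1: only FPSQR gets faster.
cpi_new_fpsqr = cpi_original - FREQ_FPSQR * (CPI_FPSQR_OLD - CPI_FPSQR_NEW)

# Design 2: every FP operation gets faster.
cpi_new_fp = FREQ_OTHER * CPI_OTHER + FREQ_FP * CPI_FP_NEW

# Ic and the clock cycle cancel, so speedup is a ratio of CPIs.
speedup_fpsqr = cpi_original / cpi_new_fpsqr
speedup_fp = cpi_original / cpi_new_fp

print(f"design 1: CPI = {cpi_new_fpsqr:.3f}, speedup = {speedup_fpsqr:.2f}")
print(f"design 2: CPI = {cpi_new_fp:.3f}, speedup = {speedup_fp:.2f}")
```

Because the instruction count and clock cycle are the same in every case, the design with the lower CPI is also the one with the higher speedup.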
Example:
Consider the execution of a task with 100000 instructions on a 500 MHz processor. The program
consists of FOUR major types of instructions:
Instruction Type              CPI    Instruction Mix
Integer arithmetic            1      60%
Floating point arithmetic     2      20%
Load/Store                    4      10%
Memory Reference              6      10%
When the task is executed on a uniprocessor:
– Calculate the average CPI.
– Determine the corresponding MIPS rate.
Solution:
Average CPI= 1*0.6+2*0.2+4*0.1+6*0.1= 2 cycles/instruction.
$$\text{MIPS} = \frac{f}{CPI \times 10^6} = \frac{500 \times 10^6}{2 \times 10^6} = 250$$
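The solution can be checked with a few lines of Python. The variable names are illustrative, and the execution time is an extra quantity computed here from T = Ic * CPI * t.

```python
# Instruction mix from the example: (name, CPI, fraction of instructions).
MIX = [
    ("integer arithmetic", 1, 0.60),
    ("floating point arithmetic", 2, 0.20),
    ("load/store", 4, 0.10),
    ("memory reference", 6, 0.10),
]
IC = 100_000    # instructions in the task
F_HZ = 500e6    # 500 MHz clock

avg_cpi = sum(cpi * frac for _, cpi, frac in MIX)
mips = F_HZ / (avg_cpi * 1e6)
exec_time = IC * avg_cpi / F_HZ   # T = Ic * CPI * t, with t = 1/f

print(f"average CPI = {avg_cpi:.2f}")            # 2.00
print(f"MIPS rate   = {mips:.0f}")               # 250
print(f"T           = {exec_time * 1e3:.3f} ms") # 0.400 ms
```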
Example:
• Now suppose the task from the previous example is executed on a FOUR-processor system
with shared memory. Due to the need for synchronization among the FOUR program parts,
2000 extra instructions are added to each part.
– Calculate the average CPI.
– Determine the corresponding MIPS rate.
– Calculate the speedup factor of the FOUR-processor system.
– Calculate the efficiency of the FOUR-processor system.
– Show the interconnection network of this system.
Solution:
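Average CPI = 2 cycles/instruction (assuming the added instructions follow the same mix).
MIPS = (4 * 500 MHz) / 2 = 1000
Speedup = Tex1 / Tex4
Tex1 = Ic / (MIPS * 10^6) = 100000 / (250 * 10^6) = 0.400 msec
Tex4 = Ic / (MIPS * 10^6) = [100000 + 4 * 2000] / (1000 * 10^6) = 0.108 msec
Speedup = 0.4 / 0.108 = 3.704
Efficiency = Speedup / number of processors = 3.704 / 4 = 92.6%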
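The speedup and efficiency figures can be verified as follows. This sketch assumes, as the solution does, that the aggregate MIPS rate scales with the number of processors and that the average CPI stays at 2; the names are illustrative.

```python
def exec_time(ic, mips):
    """T = Ic / (MIPS * 10^6), in seconds."""
    return ic / (mips * 1e6)


N_PROC = 4
AVG_CPI = 2
F_HZ = 500e6

ic_uni = 100_000
ic_multi = ic_uni + N_PROC * 2_000     # 2000 sync instructions per part

mips_uni = F_HZ / (AVG_CPI * 1e6)      # 250 MIPS on one processor
mips_multi = N_PROC * mips_uni         # 1000 MIPS aggregate (assumed linear)

t1 = exec_time(ic_uni, mips_uni)       # 0.400 ms
t4 = exec_time(ic_multi, mips_multi)   # 0.108 ms
speedup = t1 / t4
efficiency = speedup / N_PROC

print(f"speedup = {speedup:.3f}, efficiency = {efficiency:.2%}")
```

The efficiency falls short of 100% only because of the 8000 synchronization instructions; without them the speedup under these assumptions would be exactly 4.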