2_Computer Organization and Architecture

Uploaded by

diptondey
Computer Organization and Architecture
Assessing and Understanding Performance
Introduction

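• Response time / execution time: the time between the start and completion of a task (we try to minimize it).
• Throughput: the total amount of work done in a given time (we try to increase it).
• Performance(X) = 1 / Execution time(X)
• Performance(X) / Performance(Y) = Execution time(Y) / Execution time(X) = n, that is, X is n times faster than Y.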
Continue…
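• If computer X runs a program in 10 seconds and computer Y runs the same program in 15 seconds, how much faster is X than Y?
Solution: We know, Performance(X) / Performance(Y) = Execution time(Y) / Execution time(X) = n
Given, Execution time(X) = 10 seconds
Execution time(Y) = 15 seconds
So the performance ratio is Execution time(Y) / Execution time(X) = n, that is, 15 / 10 = 1.5
That means, Performance(X) / Performance(Y) = 1.5
so, Performance(X) = 1.5 x Performance(Y)
So, X is 1.5 times faster than Y.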
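
The ratio used in this example can be checked with a few lines of Python. This is only a sketch: the function name relative_performance is made up for illustration and is not part of the slides.

```python
def relative_performance(time_x: float, time_y: float) -> float:
    """Return n such that X is n times faster than Y.

    Performance is the reciprocal of execution time, so
    Performance(X) / Performance(Y) = time_y / time_x.
    """
    if time_x <= 0 or time_y <= 0:
        raise ValueError("execution times must be positive")
    return time_y / time_x


# Values from the example: X takes 10 seconds, Y takes 15 seconds.
n = relative_performance(time_x=10.0, time_y=15.0)
print(f"X is {n} times faster than Y")
```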
CPU Performance and its Factors

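For a program:
• CPU execution time = CPU clock cycles x Clock cycle time .....(1)
• Clock cycle time = 1 / Clock rate .....(2)
By putting the value from (2) into (1) we get,
• CPU execution time = CPU clock cycles / Clock rate
• CPU clock cycles = Instructions for a program x Average clock cycles per instruction (CPI)
• CPU time = Instruction count x CPI x Clock cycle time
Or, CPU time = (Instruction count x CPI) / Clock rate
• CPI = CPU clock cycles / Instruction count
• CPI = (CPU time x Clock rate) / Instruction count [ since CPU clock cycles = CPU time x Clock rate ]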
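
The performance equation above can be written as two small Python functions. This is a minimal sketch; the function names and the sample numbers (2 billion instructions, CPI of 2.0, 4 GHz clock) are assumptions chosen for illustration and do not come from the slides.

```python
def cpu_time(instruction_count: float, cpi: float, clock_rate_hz: float) -> float:
    """CPU time = (Instruction count x CPI) / Clock rate, in seconds."""
    return instruction_count * cpi / clock_rate_hz


def cpi_from_time(cpu_time_s: float, clock_rate_hz: float, instruction_count: float) -> float:
    """CPI = (CPU time x Clock rate) / Instruction count."""
    return cpu_time_s * clock_rate_hz / instruction_count


# Hypothetical program: 2e9 instructions, CPI 2.0, 4 GHz clock.
t = cpu_time(instruction_count=2e9, cpi=2.0, clock_rate_hz=4e9)
print(f"CPU time: {t} seconds")
print(f"CPI recovered from time: {cpi_from_time(t, 4e9, 2e9)}")
```

The second function simply inverts the first, which is a quick way to confirm that the two forms of the equation agree.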
Improving performance

• Our favorite program runs in 10 seconds on computer A, which has a 4 GHz clock. We are trying to help a computer designer build a computer B that will run this program in 6 seconds. The designer has determined that a substantial increase in the clock rate is possible, but this increase will affect the rest of the CPU design, causing computer B to require 1.2 times as many clock cycles as computer A for this program. What clock rate should we tell the designer to target?
Solution: Given,
CPU time(A) = 10 seconds; Clock rate(A) = 4 GHz = 4 x 10^9 cycles/second
CPU time(B) = 6 seconds; Clock rate(B) = ? (to be determined)
Continue…
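• For the program on computer A:
CPU time(A) = CPU clock cycles(A) / Clock rate(A)
or, 10 seconds = CPU clock cycles(A) / (4 x 10^9 cycles/second)
So, CPU clock cycles(A) = 10 seconds x 4 x 10^9 cycles/second = 40 x 10^9 cycles
• For the program on computer B:
CPU time(B) = CPU clock cycles(B) / Clock rate(B)
CPU time(B) = (1.2 x CPU clock cycles(A)) / Clock rate(B)
or, 6 seconds = (1.2 x 40 x 10^9 cycles) / Clock rate(B)
so, Clock rate(B) = (1.2 x 40 x 10^9 cycles) / 6 seconds = 8 x 10^9 cycles/second = 8 GHz
• Computer B must therefore have twice the clock rate of A to run the program in 6 seconds.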
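
The same steps can be sketched in Python as a check on the arithmetic. The function name required_clock_rate and its parameter names are hypothetical; only the numbers come from the problem statement.

```python
def required_clock_rate(time_a_s: float, clock_rate_a_hz: float,
                        cycle_factor: float, target_time_b_s: float) -> float:
    """Clock rate B needs in order to finish in target_time_b_s seconds.

    cycle_factor is how many times more clock cycles B needs than A.
    """
    cycles_a = time_a_s * clock_rate_a_hz   # CPU clock cycles(A)
    cycles_b = cycle_factor * cycles_a      # CPU clock cycles(B)
    return cycles_b / target_time_b_s       # Clock rate(B) in Hz


rate_b = required_clock_rate(time_a_s=10.0, clock_rate_a_hz=4e9,
                             cycle_factor=1.2, target_time_b_s=6.0)
print(f"Target clock rate for B: {rate_b / 1e9:.1f} GHz")
```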
Using the performance equation

• Suppose we have two implementations of the same instruction set architecture. Computer A has a clock cycle time of 250 ps and a CPI of 2.0 for some program, and computer B has a clock cycle time of 500 ps and a CPI of 1.2 for the same program. Which computer is faster for this program, and by how much?
Solution: We know that each computer executes the same number of instructions for the program. Let's call this number k.
So, CPU clock cycles(A) = k x 2.0
CPU clock cycles(B) = k x 1.2
So, CPU execution time(A) = CPU clock cycles(A) x Clock cycle time(A)
= k x 2.0 x 250 ps = 500 x k ps
Continue…

CPU execution time(B) = CPU clock cycles(B) x Clock cycle time(B)
= k x 1.2 x 500 ps = 600 x k ps
Comparing 500 x k ps and 600 x k ps, 600 x k ps is clearly greater. So, computer A is faster, as its CPU execution time is smaller.
Performance(A) / Performance(B) = Execution time(B) / Execution time(A) = (600 x k ps) / (500 x k ps) = 1.2
That means Computer A is 1.2 times faster than Computer B.
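
Because the instruction count k cancels in the ratio, the comparison only needs the time per instruction on each machine. A short Python sketch of that observation follows; the helper name time_per_instruction_ps is an assumption made for illustration.

```python
def time_per_instruction_ps(cpi: float, cycle_time_ps: float) -> float:
    """Average time per instruction = CPI x Clock cycle time (picoseconds)."""
    return cpi * cycle_time_ps


per_instr_a = time_per_instruction_ps(cpi=2.0, cycle_time_ps=250.0)  # 500 ps
per_instr_b = time_per_instruction_ps(cpi=1.2, cycle_time_ps=500.0)  # about 600 ps

# Execution time = k x time per instruction, so k cancels in the ratio.
speedup_a_over_b = per_instr_b / per_instr_a
print(f"A is {speedup_a_over_b:.1f} times faster than B")
```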
Comparing code segments
• A compiler designer is trying to decide between two code sequences for a particular computer. The hardware designers have supplied the following facts:

          CPI for this instruction class
          A    B    C
CPI       1    2    3

For a particular high-level language statement, the compiler writer is considering two code sequences that require the following instruction counts:

Code        Instruction counts for instruction class
sequence    A    B    C
1           2    1    2
2           4    1    1

i. Which code sequence executes the most instructions?
ii. Which will be faster?
iii. What is the CPI for each sequence?
Continue…
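• Solution for i.: Sequence 1 executes 2 + 1 + 2 = 5 instructions. Sequence 2 executes 4 + 1 + 1 = 6 instructions. So, sequence 2 executes the most instructions.
• Solution for ii.: We know that CPU clock cycles = sum over all instruction classes i of (CPI(i) x C(i)), where C(i) is the instruction count for class i.
This yields, CPU clock cycles(1) = (2 x 1) + (1 x 2) + (2 x 3) = 2 + 2 + 6 = 10 cycles
CPU clock cycles(2) = (4 x 1) + (1 x 2) + (1 x 3) = 4 + 2 + 3 = 9 cycles
So code sequence 2 is faster, even though it executes one extra instruction.
• Solution for iii.: CPI = CPU clock cycles / Instruction count
CPI(1) = CPU clock cycles(1) / Instruction count(1) = 10 / 5 = 2
CPI(2) = CPU clock cycles(2) / Instruction count(2) = 9 / 6 = 1.5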
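
The three answers can be reproduced with a small Python sketch. The dictionary layout and the function names total_cycles and average_cpi are illustrative assumptions; the CPI values and instruction counts are the ones given in the tables.

```python
# CPI for each instruction class, as supplied by the hardware designers.
CPI_BY_CLASS = {"A": 1, "B": 2, "C": 3}

# Instruction counts per class for the two candidate code sequences.
sequence_1 = {"A": 2, "B": 1, "C": 2}
sequence_2 = {"A": 4, "B": 1, "C": 1}


def total_cycles(counts: dict) -> int:
    """CPU clock cycles = sum of CPI(i) x C(i) over the instruction classes."""
    return sum(CPI_BY_CLASS[cls] * n for cls, n in counts.items())


def average_cpi(counts: dict) -> float:
    """CPI = CPU clock cycles / Instruction count."""
    return total_cycles(counts) / sum(counts.values())


for name, seq in (("Sequence 1", sequence_1), ("Sequence 2", sequence_2)):
    print(name, "instructions:", sum(seq.values()),
          "cycles:", total_cycles(seq), "CPI:", average_cpi(seq))
```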
Evaluating performance

• The set of programs that are run forms a workload.
• To compare computers, compare the execution time of the same workload.
• A benchmark is a set of programs specifically chosen to measure performance.
• Using benchmark programs, we determine and compare response time or throughput (this is not entirely straightforward).
• There is some room for confusion:
• Different programs take different execution times on different computers (see the performance table on the next slide).
Continue…
                        Computer A    Computer B
Program 1 (seconds)              1            10
Program 2 (seconds)           1000           100
Total time (seconds)          1001           110

• A is 10 times faster than B for program 1.
• B is 10 times faster than A for program 2.
• Both statements are true, so it is hard to tell in this way which of computers A and B is better.
• So, for a clearer comparison, we have to use total execution time.
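Total execution time: A consistent summary measure
• The simplest approach is to compare the total execution times:
Performance(B) / Performance(A) = Execution time(A) / Execution time(B) = 1001 / 110 = 9.1
So, Performance(B) = 9.1 x Performance(A)
• That is, B is 9.1 times faster than A for programs 1 and 2 together.
• The average of the execution times that is directly proportional to total execution time is the arithmetic mean (AM):
AM = (1/n) x [Time(1) + Time(2) + ... + Time(n)]
where Time(i) is the execution time of the i-th of the n programs in the workload.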
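
As a quick check, the sketch below computes the total time and the arithmetic mean for both computers and shows that the two ratios agree, since the AM is just the total divided by n. The variable and function names are assumptions for illustration; the timings are those in the table.

```python
# Execution times in seconds, taken from the performance table.
times_a = {"Program 1": 1.0, "Program 2": 1000.0}
times_b = {"Program 1": 10.0, "Program 2": 100.0}


def total_time(times: dict) -> float:
    """Total execution time of the workload."""
    return sum(times.values())


def arithmetic_mean(times: dict) -> float:
    """AM = (1/n) x sum of the n execution times."""
    return total_time(times) / len(times)


ratio_total = total_time(times_a) / total_time(times_b)
ratio_mean = arithmetic_mean(times_a) / arithmetic_mean(times_b)
print(f"B vs A using total time: {ratio_total:.1f}")
print(f"B vs A using arithmetic mean: {ratio_mean:.1f}")
```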
Performance, Power and Energy Efficiency
• Power is increasingly becoming the key limitation and a critical factor in processor performance, and it also affects cost (for example, in laptops).
• To save power, techniques ranging from putting parts of the computer to sleep to reducing the clock rate and voltage have all been used, but they have not been enough.
• In CMOS technology, power can be reduced by lowering the clock frequency, but this also reduces performance.

For more detail, please refer to the textbook:
Computer Organization and Design (3rd edition), Patterson and Hennessy
Page no: 263, 264, 265
Fallacies and Pitfalls

• Pitfall: Expecting the improvement of one aspect of a computer to increase performance by an amount proportional to the size of the improvement.
• Amdahl's law: A rule stating that the performance enhancement possible with a given improvement is limited by the amount that the improved feature is used.
• The law is:
Execution time after improvement = (Execution time affected by improvement / Amount of improvement) + Execution time unaffected
Continue…
• Suppose a program runs in 100 seconds on a computer, with multiply operations responsible for 80 seconds of this time. How much do you have to improve the speed of multiplication if you want your program to run five times faster?
Solution: Putting the values into Amdahl's law, where n is the factor by which multiplication is sped up, we get:
Execution time after improvement = (80 seconds / n) + (100 - 80) seconds .....(i)
The execution time is 100 seconds and we want it to be 5 times faster, so it must become 100 / 5 = 20 seconds. Putting this value in (i),
20 seconds = (80 seconds / n) + 20 seconds
Or, 0 = 80 seconds / n .....(ii)
(ii) indicates that there is no amount by which we can enhance multiply to achieve a fivefold increase in performance if multiply accounts for only 80% of the workload.
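
Amdahl's law and the conclusion of this example can be sketched in Python. The function names are hypothetical, and the 25-second target used at the end is an extra illustrative case that is not in the slides.

```python
from typing import Optional


def time_after_improvement(affected_s: float, unaffected_s: float, speedup: float) -> float:
    """Execution time after improvement = affected / speedup + unaffected."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return affected_s / speedup + unaffected_s


def required_speedup(affected_s: float, unaffected_s: float, target_s: float) -> Optional[float]:
    """Speedup needed on the affected part to reach target_s, or None if impossible."""
    remaining = target_s - unaffected_s  # time budget left for the affected part
    if remaining <= 0:
        return None  # the unaffected time alone already uses up the target
    return affected_s / remaining


# Example from the slide: 80 s of multiply, 20 s of everything else, target 20 s.
print(required_speedup(80.0, 20.0, 20.0))          # None: fivefold speedup is unreachable
print(time_after_improvement(80.0, 20.0, 1000.0))  # still above 20 s even at 1000x

# Illustrative extra case (an assumption, not from the slides): a 25 s target.
print(required_speedup(80.0, 20.0, 25.0))
```

Returning None for an unreachable target mirrors equation (ii), where no finite value of n satisfies 0 = 80 seconds / n.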
MIPS

• One alternative to time as the metric is MIPS.
• MIPS: million instructions per second.
• For a given program, MIPS is simply:
MIPS = Instruction count / (Execution time x 10^6)
MIPS as a performance measure
          CPI for this instruction class
          A    B    C
CPI       1    2    3

Code from     Instruction counts (in billions) for each instruction class
              A     B    C
Compiler 1    5     1    1
Compiler 2    10    1    1

• Assume that the computer's clock rate is 4 GHz. Which code sequence will execute faster according to MIPS? According to execution time?
Continue…

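Solution: We know that
Execution time = CPU clock cycles / Clock rate .....(i)
CPU clock cycles = sum over all instruction classes i of (CPI(i) x C(i)) .....(ii)
where C(i) is the instruction count for class i.
By using (ii):
CPU clock cycles(1) = (5 x 1 + 1 x 2 + 1 x 3) x 10^9 = 10 x 10^9
CPU clock cycles(2) = (10 x 1 + 1 x 2 + 1 x 3) x 10^9 = 15 x 10^9
Now by using (i), with a clock rate of 4 GHz = 4 x 10^9 Hz:
Execution time(1) = (10 x 10^9) / (4 x 10^9) = 2.5 seconds
Execution time(2) = (15 x 10^9) / (4 x 10^9) = 3.75 seconds
So, compiler 1 generates the faster program, according to execution time.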
Continue…

• Let's compute the MIPS rate for each version of the program, using the following equation:
MIPS = Instruction count / (Execution time x 10^6)
MIPS(1) = ((5 + 1 + 1) x 10^9) / (2.5 x 10^6) = 2800
MIPS(2) = ((10 + 1 + 1) x 10^9) / (3.75 x 10^6) = 3200
So, the code from compiler 2 has a higher MIPS rating, but the code from compiler 1 runs faster.
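
The whole comparison can be reproduced with a short Python sketch, which makes the disagreement between the two metrics easy to see. The names execution_time and mips and the dictionary layout are assumptions for illustration; the CPI values, instruction counts, and 4 GHz clock rate come from the problem.

```python
CLOCK_RATE_HZ = 4e9                      # 4 GHz, from the problem statement
CPI_BY_CLASS = {"A": 1, "B": 2, "C": 3}  # CPI per instruction class

# Instruction counts per class (the table gives them in billions).
compiler_1 = {"A": 5e9, "B": 1e9, "C": 1e9}
compiler_2 = {"A": 10e9, "B": 1e9, "C": 1e9}


def execution_time(counts: dict) -> float:
    """Execution time = CPU clock cycles / Clock rate, in seconds."""
    cycles = sum(CPI_BY_CLASS[cls] * n for cls, n in counts.items())
    return cycles / CLOCK_RATE_HZ


def mips(counts: dict) -> float:
    """MIPS = Instruction count / (Execution time x 10^6)."""
    return sum(counts.values()) / (execution_time(counts) * 1e6)


for name, counts in (("Compiler 1", compiler_1), ("Compiler 2", compiler_2)):
    print(f"{name}: {execution_time(counts):.2f} s, {mips(counts):.0f} MIPS")
```

Compiler 2 comes out ahead on MIPS only because it executes more of the cheap class A instructions, while its total execution time is longer.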
End of slide
Thank you
Any question?
