Computer Architecture and Performance
Objectives:
Understand the concepts of computer
architecture
Understand how performance is measured
Know the different ways to measure
computer performance
Computer Architecture
The task the computer designer faces is a
complex one:
Determine what attributes are important
for a new computer, then design a
computer to maximize performance while
staying within cost, power, and
availability constraints.
Implementation
The implementation of a computer has
two components: organization and
hardware.
The term organization includes the high-level aspects of a computer's design, such
as the memory system, the memory
interconnect, and the design of the
internal processor or CPU.
Hardware refers to the specifics of a
computer, including the detailed logic
design and the packaging technology of
the computer.
Goal
Time discovers
truth.
Seneca
Performance
In general, performance describes how
quickly a given system can execute a
program or programs.
Systems that execute programs in less
time are said to have higher performance:
Performance_X = 1 / Execution Time_X
Relative Performance
Ex. If computer A runs a program in
10 seconds and computer B runs the
same program in 15 seconds, how
much faster is A than B?
Ex.
Performance_A / Performance_B = Execution Time_B / Execution Time_A = n
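The ratio above can be checked with a short Python sketch for the A-versus-B example. The function name relative_performance is an illustrative choice, not something from the slides.

```python
def relative_performance(exec_time_a: float, exec_time_b: float) -> float:
    """Return n = Performance_A / Performance_B = Execution Time_B / Execution Time_A."""
    return exec_time_b / exec_time_a


# Computer A takes 10 seconds, computer B takes 15 seconds.
n = relative_performance(10.0, 15.0)
print(f"A is {n:.2f} times faster than B")  # 15 / 10 = 1.50
```

A value of n greater than 1 means A is the faster machine; n less than 1 means B is.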
Measuring Performance
Time is the measure of computer
performance: the computer that
performs the same amount of work in
the least time is the fastest.
Performance Metrics
Cycles per Instruction (CPI)
Average number of clock cycles required to execute each
instruction
CPI = (number of clock cycles required to execute the program)
      / (number of instructions executed in running the program)
Ex.
A given program consists of a 100-instruction loop that is executed 42 times.
If it takes 16,000 cycles to execute the
program on a given system, what are the
system's CPI and IPC values for the
program?
Soln:
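One way to work this out, sketched in Python. It assumes the loop body accounts for every instruction executed (100 x 42 = 4,200 instructions) and takes IPC (instructions per cycle) to be the reciprocal of CPI. The variable names are illustrative.

```python
instructions_per_iteration = 100
iterations = 42
total_cycles = 16_000

# Assumption: only the loop body contributes to the instruction count.
instructions_executed = instructions_per_iteration * iterations  # 4,200

cpi = total_cycles / instructions_executed  # cycles per instruction
ipc = instructions_executed / total_cycles  # instructions per cycle = 1 / CPI

print(f"CPI = {cpi:.2f}")  # 16,000 / 4,200 = 3.81
print(f"IPC = {ipc:.4f}")  # 4,200 / 16,000 = 0.2625
```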
Benchmark Suites
A benchmark suite consists of a set of programs that are
believed to be typical of the programs that
will be run on the system.
Benchmark suites generate estimates of a system's
performance on different types of
applications.
Ex. SPEC (Standard Performance Evaluation Corporation)
is a non-profit corporation formed to establish,
maintain, and endorse a standardized set of relevant
benchmarks that can be applied to the newest
generation of high-performance computers.
Suites: SPEC CPU2006, SPEC CPUv6
Speedup
Used to describe how the performance of
an architecture changes as different
improvements are made to the
architecture
It is the ratio of the execution times before
and after a change is made
Speedup = Execution Time_before / Execution Time_after
Ex.
If a program takes 25 seconds to run on
one version of an architecture and 15
seconds to run on a new version, the
overall speedup
= 25 sec/15 sec = 1.67
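The same calculation as a minimal Python sketch. The helper name speedup is an illustrative choice. Any time unit works as long as both arguments use the same one.

```python
def speedup(time_before: float, time_after: float) -> float:
    """Ratio of execution time before a change to execution time after it."""
    return time_before / time_after


print(f"Speedup = {speedup(25.0, 15.0):.2f}")  # 25 / 15 = 1.67
```

A result above 1 means the change made the program faster; a result below 1 means it made the program slower.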
Amdahl's Law
The most important rule for designing
high-performance computer systems is
make the common case fast.
Qualitatively, this means that the impact
of a given performance improvement on overall
performance depends on both how
much the improvement speeds things up
when it is in use and how
often the improvement is in use.
Amdahl's Law
Execution Time_new =
Execution Time_old x [ Frac_unused + Frac_used / Speedup_used ]
where:
Frac_unused = fraction of time that the
improvement is not in use
Frac_used = fraction of time that the
improvement is in use
Speedup_used = speedup that occurs when
the improvement is used
Note that Frac_used and Frac_unused are
computed using the execution time
before the modification is applied.
Ex.
Suppose that a given architecture does not
have hardware support for multiplication, so
multiplications have to be done through
repeated addition (this was the case on
some early microprocessors). If it takes 200
cycles to perform multiplication in software,
and 4 cycles to perform multiplication in
hardware, what is the overall speedup from
hardware support for multiplication if a
program spends 10% of its time doing
multiplications? What about a program that
spends 40% of its time doing
multiplications?
Soln:
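A possible worked solution, sketched in Python. It takes Speedup_used = 200 / 4 = 50 and applies Amdahl's Law with Execution Time_old normalized to 1, so the overall speedup is 1 / (Frac_unused + Frac_used / Speedup_used). The function name amdahl_speedup is illustrative.

```python
def amdahl_speedup(frac_used: float, speedup_used: float) -> float:
    """Overall speedup when an improvement applies to frac_used of the old run time."""
    frac_unused = 1.0 - frac_used
    # New execution time relative to an old execution time of 1.
    relative_new_time = frac_unused + frac_used / speedup_used
    return 1.0 / relative_new_time


# Hardware multiply takes 4 cycles instead of 200 in software.
multiply_speedup = 200 / 4  # 50

for frac in (0.10, 0.40):
    overall = amdahl_speedup(frac, multiply_speedup)
    print(f"{frac:.0%} of time in multiplication: overall speedup = {overall:.2f}")
# 10%: 1 / (0.90 + 0.10/50) = 1 / 0.902 = 1.11
# 40%: 1 / (0.60 + 0.40/50) = 1 / 0.608 = 1.64
```

Even a 50x faster multiplier gives only about an 11% overall gain when multiplication is 10% of the run time. This is why the common case matters most.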
Seatwork:
1. If the 2011 version of a computer
executes a program in 200 ns and the
2013 version of the computer
executes the same program in 150 ns,
what speedup did the manufacturer
achieve over the two-year period?