CCS 1202 Lecture 2_Computer Evolution and Performance

The document outlines the evolution of computer organization and architecture, detailing five generations from vacuum tubes to advanced microprocessors and embedded systems. It discusses performance metrics, including response time and throughput, and emphasizes the importance of measuring performance to optimize hardware choices. Additionally, it introduces concepts like Amdahl's Law and the significance of making common cases faster in design improvements.


CCS 1202

COMPUTER ORGANIZATION AND ARCHITECTURE

Computer evolution and performance

1
Introduction to Computer Evolution
• Definition: The development and transformation of computer systems over
time.

• Key Milestones:
• First Generation (1946-1954): Vacuum tubes; examples: ENIAC, UNIVAC.
• Second Generation (1955-1965): Transistors; examples: IBM 1401.
• Third Generation (1968-1975): Integrated Circuits (ICs); examples: IBM
System/360.
• Fourth Generation (1976-1980): Microprocessors; examples: Intel 8086.
• Fifth Generation (1980-Present): Advanced microprocessors, AI, parallel
processing.

• Significance: How each generation contributed to more powerful, compact, and efficient computers.
2
Computer Generations
Generation                     | Components Used                              | Characteristics
First Generation (1946-1954)   | Vacuum tubes                                 | Unreliable, very costly, generated a lot of heat, huge size.
Second Generation (1955-1965)  | Transistors                                  | Reliable, faster, still costly, generated less heat, smaller size than first generation.
Third Generation (1968-1975)   | Integrated Circuits (ICs)                    | More reliable, faster, still costly, less maintenance, smaller size than second generation.
Fourth Generation (1976-1980)  | Very Large Scale Integrated Circuits (VLSI)  | Very cheap, portable and reliable, very small size.
Fifth Generation (1980-today)  | Ultra Large Scale Integrated Circuits (ULSI) | Very powerful and compact computers at cheaper rates.

3
Embedded Systems
• The use of electronics and software within a product
• Billions of computer systems are produced each year that are
embedded within larger devices
• Today many devices that use electric power have an embedded
computing system
• Often embedded systems are tightly coupled to their
environment
• This can give rise to real-time constraints imposed by the need to interact
with the environment
• Constraints such as required speeds of motion, required precision of
measurement, and required time durations, dictate the timing of software
operations
• If multiple activities must be managed simultaneously this imposes more
complex real-time constraints

4
Deeply Embedded Systems
• Subset of embedded systems
• Uses a microcontroller rather than a microprocessor
• Is not programmable once the program logic for the device has been
burned into ROM
• Dedicated, single-purpose devices that detect something in the
environment, perform a basic level of processing, and then do
something with the results
• Often have wireless capability and appear in networked configurations,
such as networks of sensors deployed over a large area
• Typically have extreme resource constraints in terms of memory,
processor size, time, and power consumption
• Examples: smart TVs, game consoles, pacemakers, voice assistants
5
The Internet of Things (IoT)
• Term that refers to the expanding interconnection of smart devices, ranging
from appliances to tiny sensors
• Is primarily driven by deeply embedded devices
• Generations of deployment culminating in the IoT:
• Information technology (IT)
• PCs, servers, routers, firewalls, and so on, bought as IT devices by enterprise IT people and
primarily using wired connectivity
• Operational technology (OT)
• Machines/appliances with embedded IT built by non-IT companies, such as medical machinery,
SCADA, process control, and kiosks, bought as appliances by enterprise OT people and
primarily using wired connectivity
• Personal technology
• Smartphones, tablets, and eBook readers bought as IT devices by consumers exclusively using
wireless connectivity and often multiple forms of wireless connectivity
• Sensor/actuator technology
• Single-purpose devices bought by consumers, IT, and OT people exclusively using wireless
connectivity, generally of a single form, as part of larger systems

6
Performance
• Performance is the key to understanding underlying motivation for the
hardware and its organization
• Measure, report, and summarize performance to enable users to
• make intelligent choices
• see through the marketing hype!

• Why is some hardware better than others for different programs?


• What factors of system performance are hardware related?
(e.g., do we need a new machine, or a new operating system?)
• How does the machine's instruction set affect performance?

7
The Role of Performance
 Hardware performance is a key to the effectiveness of the entire
system
 Performance has to be measured and compared to evaluate
various design and technological approaches
 To optimize the performance, major affecting factors have to be
known
 For different types of applications, different performance metrics
may be appropriate and different aspects of a computer system
may be most significant
 Instruction set use and implementation, memory hierarchy, and I/O
handling are among the factors that affect performance
8
Historical Context and Performance Metrics
• Key Factors in Historical Performance:
• Clock Speed (MHz/GHz): Indicates how fast a processor can execute
instructions.
• Instruction Set Architecture (ISA): The set of commands a processor can
execute.
• Memory Size: Evolution from kilobytes to terabytes.

• Performance Benchmarks:
• MIPS (Millions of Instructions Per Second).
• FLOPS (Floating Point Operations Per Second) for high-performance
computing.

• Limitations: Increasing power consumption and heat production.


9
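As a quick illustration of the MIPS benchmark above, here is a minimal Python sketch; the instruction count and run time are invented numbers, not figures from the lecture:

```python
def mips(instruction_count, execution_time_seconds):
    """Millions of Instructions Per Second: IC / (execution time x 10^6)."""
    return instruction_count / (execution_time_seconds * 1e6)

# Hypothetical program: 500 million instructions completing in 2 seconds.
print(mips(500_000_000, 2))  # 250.0 MIPS
```

Note that MIPS counts instructions without regard to how much work each one does, which is one reason it can mislead when comparing different instruction sets.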
Key Principles of Computer Performance
• Performance = Work Done / Time Taken.

• Factors Affecting Performance:


• Clock Rate: Speed of the CPU's clock cycle.
• Instruction-Level Parallelism (ILP): Ability of the processor to execute
multiple instructions simultaneously.
• Cache Size and Hierarchy: Faster access to frequently used data.
• Memory Bandwidth: Speed at which data can be read/written from
memory.
• Pipelining: Overlapping the execution of multiple instructions.

10
Cost-Performance
• Purchasing perspective: from a collection of machines, choose one
which has
• best performance?
• least cost?
• best performance/cost?
• Computer designer perspective: faced with design options, select
one which has
• best performance improvement?
• least cost?
• best performance/cost?
• Both require: basis for comparison and metric for evaluation

11
Two “notions” of performance
• Which computer has better performance?
• User: one which runs a program in less time
• Computer centre manager: one which completes more jobs in a given time

• Users are interested in reducing response time (execution time)
  • the time between the start and the completion of an event

• Managers are interested in increasing throughput (bandwidth)
  • total amount of work done in a given time

12
Computer Performance
• Response Time (elapsed time, latency): individual user concerns
  • how long does it take for my job to run?
  • how long does it take to execute (start to finish) my job?
  • how long must I wait for the database query?

• Throughput: systems manager concerns
  • how many jobs can the machine run at once?
  • what is the average execution rate?
  • how much work is getting done?

• If we upgrade a machine with a new processor what do we increase?


• If we add a new machine to the lab what do we increase?
13
An Example
Plane       | Time DC to Paris [hours] | Top Speed [mph] | Passengers | Throughput [p/h]
Boeing 747  | 6.5                      | 610             | 470        | 72 (= 470/6.5)
Concorde    | 3                        | 1350            | 132        | 44 (= 132/3)

Time to deliver 1 passenger?

 Concorde is 6.5/3 ≈ 2.2 times faster (120%)

Time to deliver 470 passengers?

 Boeing is 72/44 ≈ 1.6 times faster (60%)

14
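The latency-versus-throughput trade-off in the plane table above can be reproduced with a short Python sketch, using the figures from the slide:

```python
# Latency vs. throughput, using the figures from the plane table above.
planes = {
    "Boeing 747": {"time_h": 6.5, "passengers": 470},
    "Concorde":   {"time_h": 3.0, "passengers": 132},
}

for p in planes.values():
    # Throughput in passengers per hour = passengers / trip time
    p["throughput"] = p["passengers"] / p["time_h"]

# One passenger: Concorde wins on latency (response time).
print(6.5 / 3.0)  # ≈ 2.17x faster
# 470 passengers: Boeing 747 wins on throughput.
print(planes["Boeing 747"]["throughput"] / planes["Concorde"]["throughput"])  # ≈ 1.64x
```

The point is that "faster" depends on the metric: each plane wins under one of the two notions of performance.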
Performance Metrics
Response (execution) time:
• The time between the start and the completion of a task
• Measures user perception of the system speed
• Common in reactive and time critical systems, single-user computer, etc.
Throughput:
• The total number of tasks done in a given time
• Most relevant to batch processing (billing, credit card processing, etc.)
• Mainly used for input/output systems (disk access, printer, etc)
Examples:
 Replacing the processor of a computer with a faster version
  enhances BOTH response time and throughput
 Adding additional processors to a system that uses multiple processors
for separate tasks (e.g. handling of an airline reservations system)
  improves throughput ONLY

Decreasing response time almost always improves throughput


15
Response-time Metric
Maximizing performance means minimizing response (execution) time

Performance = 1 / Execution time

Performance of processor P1 is better than P2 if, for a given workload L, P1
takes less time to execute L than P2 does:

Performance(P1) > Performance(P2) w.r.t. L
⇒ Execution time(P1, L) < Execution time(P2, L)

Relative performance captures the performance ratio of processor P1
compared to P2 for the same workload

16
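The definitions above translate directly into code; a minimal Python sketch with made-up timings:

```python
def performance(execution_time):
    """Performance = 1 / Execution time."""
    return 1.0 / execution_time

def relative_performance(time_p1, time_p2):
    """How many times faster P1 is than P2 for the same workload L."""
    return performance(time_p1) / performance(time_p2)  # equals time_p2 / time_p1

# Hypothetical workload: P1 finishes in 10 s, P2 in 15 s.
print(relative_performance(10, 15))  # 1.5, i.e. P1 is 1.5x faster
```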
Designer’s Performance Metrics
• Users and designers measure performance using different metrics
• Designers look at the bottom line of program execution

CPU execution time for a program = CPU clock cycles for a program × Clock cycle time
                                 = CPU clock cycles for a program / Clock rate

• To enhance the hardware performance, designers focus on reducing
the clock cycle time and the number of cycles per program
• Many techniques that decrease the number of clock cycles also increase
the clock cycle time or the average number of cycles per instruction
(CPI)

17
Calculation of CPU Time
CPU time = Instruction count × CPI × Clock cycle time

or

CPU time = (Instruction count × CPI) / Clock rate

CPU time = (Instructions / Program) × (Clock cycles / Instruction) × (Seconds / Clock cycle)

Component of performance            | Units of measure
CPU execution time for a program    | Seconds for the program
Instruction count                   | Instructions executed for the program
Clock cycles per instruction (CPI)  | Average number of clock cycles per instruction
Clock cycle time                    | Seconds per clock cycle

18
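The CPU-time equation above can be sketched in Python; the instruction count, CPI, and clock rate below are illustrative values, not figures from the slides:

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time = (Instruction count x CPI) / Clock rate."""
    return instruction_count * cpi / clock_rate_hz

# Hypothetical program: 1 billion instructions, average CPI of 2, 2 GHz clock.
print(cpu_time(1_000_000_000, 2.0, 2e9))  # 1.0 second
```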
Execution Time
• Elapsed Time
• counts everything (disk and memory accesses, waiting for I/O, running other programs, etc.)
from start to finish
• a useful number, but often not good for comparison purposes
elapsed time = CPU time + wait time (I/O, other programs, etc.)

• CPU time
• doesn't count waiting for I/O or time spent running other programs
• can be divided into user CPU time and system CPU time (OS calls)
CPU time = user CPU time + system CPU time
 elapsed time = user CPU time + system CPU time + wait time

• Our focus: user CPU time (CPU execution time or, simply, execution time)
• time spent executing the lines of code that are in our program

19
CPU Execution Time
CPU time = CPU clock cycles for a program × Clock cycle time

CPU time = CPU clock cycles for a program / Clock rate
Definitions
• Instruction count (IC) = Number of instructions executed
• Clock cycles per instruction (CPI)
CPI = CPU clock cycles for a program / IC
CPI - one way to compare two machines with same instruction set,
since Instruction Count would be the same
20
CPU Time
• CPU execution time can be measured by running the program
• The clock cycle time is usually published by the manufacturer
• Measuring the CPI and instruction count is not trivial
• Instruction counts can be measured by software profiling, using an
architecture simulator, or using hardware counters on some architectures
• The CPI depends on many factors including: processor structure, memory
system, the mix of instruction types, and the implementation of these
instructions

• Designers use the following formula:

CPU clock cycles = Σ (i = 1 to n) CPIi × Ci

Where: Ci is the count of instructions of class i executed
       CPIi is the average number of cycles per instruction for that instruction class
       n is the number of different instruction classes
21
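The summation formula above can be sketched as follows; the instruction classes and counts are invented for illustration:

```python
def total_clock_cycles(instruction_mix):
    """CPU clock cycles = sum over classes i of CPI_i * C_i."""
    return sum(cpi_i * count_i for cpi_i, count_i in instruction_mix)

# Hypothetical mix of (CPI_i, C_i): ALU ops, loads/stores, branches.
mix = [(1, 2_000_000), (3, 1_000_000), (2, 500_000)]
cycles = total_clock_cycles(mix)
print(cycles)  # 6,000,000 cycles
# Dividing by total instruction count gives the average CPI for this mix.
print(cycles / sum(c for _, c in mix))  # ≈ 1.71
```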
Clock Cycles
• Instead of reporting execution time in seconds, we often use cycles. In modern
computers hardware events progress cycle by cycle: in other words, each event, e.g.,
multiplication, addition, etc., is a sequence of cycles

• Clock ticks indicate the start and end of cycles:

seconds / program = (cycles / program) × (seconds / cycle)

• cycle time = time between ticks = seconds per cycle
• clock rate (frequency) = cycles per second (1 Hz = 1 cycle/sec, 1 MHz = 10^6 cycles/sec)

• Example: A 200 MHz clock has a cycle time of 1 / (200 × 10^6) × 10^9 = 5 nanoseconds

22
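The clock rate to cycle time conversion above, including the 200 MHz example, as a minimal Python sketch:

```python
def cycle_time_ns(clock_rate_hz):
    """Cycle time in nanoseconds = 10^9 / clock rate (in cycles per second)."""
    return 1e9 / clock_rate_hz

print(cycle_time_ns(200e6))  # 5.0 ns, matching the 200 MHz example above
```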
How many cycles are required for a program?

• Multiplication takes more time than addition
• Floating point operations take longer than integer ones
• Accessing memory takes more time than accessing registers
• Important point: changing the cycle time often changes the number of cycles
required for various instructions because it means changing the hardware
design.

23
Performance Equation
seconds / program = (cycles / program) × (seconds / cycle)

equivalently

CPU execution time for a program = CPU clock cycles for a program × Clock cycle time

• So, to improve performance one can either:


• reduce the number of cycles for a program, or
• reduce the clock cycle time, or, equivalently,
• increase the clock rate

24
Example
• Our favorite program runs in 10 seconds on computer A, which has a 400 MHz
clock.
• We are trying to help a computer designer build a new machine B, that will run
this program in 6 seconds. The designer can use new (or perhaps more
expensive) technology to substantially increase the clock rate, but has informed
us that this increase will affect the rest of the CPU design, causing machine B to
require 1.2 times as many clock cycles as machine A for the same program.

• What clock rate should we tell the designer to target?

25
Example: Answer
CPU time(A) = CPU clock cycles of the program / Clock rate(A)

10 seconds = CPU clock cycles of the program / (400 × 10^6 cycles/second)

CPU clock cycles of the program = 10 seconds × 400 × 10^6 cycles/second
                                = 4000 × 10^6 cycles

To get the clock rate of the faster computer, we use the same formula:

6 seconds = (1.2 × CPU clock cycles of the program) / clock rate(B)
          = (1.2 × 4000 × 10^6 cycles) / clock rate(B)

clock rate(B) = (1.2 × 4000 × 10^6 cycles) / 6 seconds = 800 × 10^6 cycles/second
26
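The worked answer above can be checked with a few lines of Python:

```python
# Re-deriving the target clock rate for machine B from the example above.
time_a = 10.0    # seconds on machine A
rate_a = 400e6   # machine A clock rate: 400 MHz
cycles_a = time_a * rate_a       # 4000 x 10^6 cycles for the program on A
cycles_b = 1.2 * cycles_a        # machine B needs 1.2x as many cycles
target_time = 6.0                # desired run time on B, in seconds
rate_b = cycles_b / target_time  # required clock rate for B
print(rate_b / 1e6)  # ≈ 800 MHz
```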
Quantitative Principles of Design
• Where to spend time making improvements?
 Make the Common Case Fast
 Most important principle of computer design:
Spend your time on improvements where those improvements will do the
most good
 Example

 Instruction A represents 5% of execution

 Instruction B represents 20% of execution

 Even if you can drive the time for A to 0, the CPU will only be 5% faster

• Key questions
 What is the frequent case?

 How much performance can be improved by making that case faster?

27
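The A-versus-B example above can be quantified with a small sketch: if a fraction of execution time is eliminated entirely, the best possible overall speedup is 1 / (1 - fraction). The function below is an illustration of that bound, not part of the slide:

```python
def speedup_if_eliminated(fraction_of_time):
    """Overall speedup if the given fraction of execution time drops to zero."""
    return 1.0 / (1.0 - fraction_of_time)

print(speedup_if_eliminated(0.05))  # ≈ 1.05: eliminating instruction A gains only ~5%
print(speedup_if_eliminated(0.20))  # 1.25: the common case B has far more headroom
```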
Amdahl’s Law
• Suppose that we make an enhancement to a machine that will
improve its performance; Speedup ratio:
Speedup = ExTime for entire task without enhancement / ExTime for entire task using enhancement

Speedup = Performance for entire task using enhancement / Performance for entire task without enhancement

• Amdahl’s Law states that the performance improvement that can be


gained by a particular enhancement is limited by the amount of time
that enhancement can be used

28
An Example
• Enhancement runs 10 times faster and it affects 40% of the
execution time
• Fractionenhanced = 0.40
• Speedupenhanced = 10
• Speedupoverall = ?

Speedup = 1 / ((1 - Fractionenhanced) + Fractionenhanced / Speedupenhanced)

Speedup = 1 / ((1 - 0.4) + 0.4 / 10) = 1 / 0.64 ≈ 1.56
29
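The calculation above as a reusable Python sketch of Amdahl's Law:

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Amdahl's Law: overall speedup when only part of the execution benefits."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)

# The slide's example: the enhancement is 10x faster and covers 40% of execution.
print(amdahl_speedup(0.40, 10))  # ≈ 1.56
# Even an infinitely fast enhancement is capped at 1 / (1 - 0.40) ≈ 1.67:
print(amdahl_speedup(0.40, 1e12))
```

The second call shows the law's key consequence: the untouched 60% of execution time bounds the overall gain no matter how good the enhancement is.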
Enhancing Performance: Hardware Strategies
• Multi-Core and Many-Core Processors:
• Increased parallelism by incorporating multiple processing units.
• Examples: Intel Core i7, AMD Ryzen.
• Vector Processing and SIMD (Single Instruction, Multiple Data):
• Specialized instructions that process multiple data points simultaneously.
• GPU Acceleration:
• Highly parallelized architecture, ideal for tasks like rendering and AI.

• Advanced Memory Technologies:


• DDR4, DDR5, and emerging technologies like HBM (High Bandwidth
Memory).

30
Enhancing Performance: Software and Algorithmic Strategies
• Optimizing Code:
• Techniques like loop unrolling, using efficient data structures.

• Parallel Programming:
• Using frameworks such as OpenMP, CUDA for multi-threaded processing.
• Cache Optimization:
• Minimizing cache misses through data locality.
• Compiler Optimizations:
• Advanced compiler techniques that restructure code for better
performance.
• High-Level Abstractions:
• Use of parallel algorithms in higher-level programming languages.
31
Challenges in Performance Enhancement
• Power Consumption and Heat Dissipation:
• Challenges with keeping performance high while managing thermal
output.
• Memory Wall:
• The gap between CPU speed and memory access speed.
• Amdahl’s Law Limitation:
• Real-world scenarios often make it impossible to achieve theoretical
maximum speedups.
• Scalability Issues:
• Managing parallelism effectively as processor count increases.

32
