Chapter 1 - Fundamentals of Computer Design
Ping-Liang Lai
Computer Architecture- 1
Outline
1.1 Introduction
1.2 Classes of Computers
1.3 Defining Computer Architecture
1.4 Trends in Technology
1.5 Trends in Power in Integrated Circuits
1.6 Trends in Cost
1.7 Dependability
1.8 Measuring, Reporting, and Summarizing Performance
1.9 Quantitative Principles of Computer Design
1.10 Putting It All Together: Performance and Price-Performance
1.1 Introduction
Crossroads: Uniprocessor Performance
From Hennessy and Patterson, Computer Architecture: A Quantitative Approach, 4th edition, October 2006
VAX: 25%/year 1978 to 1986
RISC + x86: 52%/year 1986 to 2002
RISC + x86: ??%/year 2002 to present
1.2 Classes of Computers
Feature | Desktop | Server | Embedded
Price of system | $500-$5,000 | $5,000-$5,000,000 | $10-$100,000 (including network routers at the high end)
Price of microprocessor module | $50-$500 (per processor) | $200-$10,000 (per processor) | $0.01-$100 (per processor)
Critical system design issues | Price-performance, graphics performance | Throughput, availability, scalability | Price, power consumption, application-specific performance
1.3 Defining Computer Architecture
instruction set
hardware
Class of ISA;
Memory addressing;
Addressing modes;
Types and sizes of operands;
Operations;
Control flow instructions;
Encoding an ISA.
design.
Memory system, the memory interconnect, and the design of the internal
processor or CPU (arithmetic, logic, branching, and data transfer).
For example, the AMD Opteron 64 and the Intel P4 have the same ISA, but they have different internal pipeline and cache organizations.
For example, the P4 and the Mobile P4 have the same ISA and organization, but they have different clock frequencies and memory systems.
1.4 Trends in Technology
mainframe.
Four critical technologies
Bandwidth or throughput: the total amount of work done in a given time.
the x or y dimension.
1.5 Trends in Power in Integrated Circuits
Power in IC (1/3)
Power also poses challenges as devices are scaled.
Dynamic power (watts, W) in a CMOS chip: traditionally, the dominant energy consumption has been in switching transistors.
$$ \text{Power}_{\text{dynamic}} = \frac{1}{2} \times \text{Capacitive load} \times \text{Voltage}^2 \times \text{Frequency switched} $$
For mobile devices, battery life matters more than power, so energy is the proper metric, measured in joules:
$$ \text{Energy}_{\text{dynamic}} = \text{Capacitive load} \times \text{Voltage}^2 $$
Hence, lowering the voltage greatly reduces both Power_dynamic and Energy_dynamic. (Over the past 20 years, the supply voltage has dropped from 5 V to 1 V.)
Power in IC (2/3)
Example 1 (p.22): Some microprocessors today are designed to have adjustable voltage, so that a 15% reduction in voltage may result in a 15% reduction in frequency. What would be the impact on dynamic power?
Answer
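Since the capacitance is unchanged, the answer is the ratio of the voltages and frequencies:
$$ \frac{\text{Power}_{\text{new}}}{\text{Power}_{\text{old}}} = \frac{(\text{Voltage} \times 0.85)^2 \times (\text{Frequency switched} \times 0.85)}{\text{Voltage}^2 \times \text{Frequency switched}} = 0.85^3 = 0.61 $$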
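The scaling argument can be checked numerically. The following is a minimal sketch, assuming the dynamic power and energy formulas from the previous slide; the function names and the baseline values for capacitance, voltage, and frequency are hypothetical and cancel out of the ratios.

```python
def dynamic_power(capacitive_load, voltage, frequency):
    """Dynamic power in watts: 1/2 * C * V^2 * f."""
    return 0.5 * capacitive_load * voltage ** 2 * frequency


def dynamic_energy(capacitive_load, voltage):
    """Dynamic energy in joules: C * V^2."""
    return capacitive_load * voltage ** 2


# Hypothetical baseline values, chosen only for illustration.
C = 1e-9   # capacitive load in farads
V = 1.2    # supply voltage in volts
F = 2e9    # switching frequency in hertz

# Reduce both voltage and frequency by 15%; capacitance is unchanged.
power_ratio = dynamic_power(C, 0.85 * V, 0.85 * F) / dynamic_power(C, V, F)
energy_ratio = dynamic_energy(C, 0.85 * V) / dynamic_energy(C, V)

print(f"new/old dynamic power:  {power_ratio:.2f}")
print(f"new/old dynamic energy: {energy_ratio:.2f}")
```

The power ratio depends on voltage squared times frequency, so it falls with the cube of the scaling factor, while energy depends on voltage only and falls with the square.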
Power in IC (3/3)
As we move from one process to the next, (60 nm or 45 nm)
1.6 Trends in Cost
Wafer
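$$ \text{Cost of die} = \frac{\text{Cost of wafer}}{\text{Dies per wafer} \times \text{Die yield}} $$
$$ \text{Dies per wafer} = \frac{\pi \times (\text{Wafer diameter}/2)^2}{\text{Die area}} - \frac{\pi \times \text{Wafer diameter}}{\sqrt{2 \times \text{Die area}}} $$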
Examples of Cost of an IC
Example 1 (p.22): Find the number of dies per 300 mm (30 cm) wafer for a die that is
1.5 cm on a side.
$$ \text{Dies per wafer} = \frac{\pi \times (30/2)^2}{2.25} - \frac{\pi \times 30}{\sqrt{2 \times 2.25}} = \frac{706.5}{2.25} - \frac{94.2}{2.12} = 270 $$
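The dies-per-wafer estimate is easy to reproduce. This is a minimal sketch, assuming the formula as given on the slide (wafer area divided by die area, minus a correction for partial dies along the edge); the function name `dies_per_wafer` is illustrative.

```python
import math


def dies_per_wafer(wafer_diameter_cm, die_area_cm2):
    """Estimate whole dies on a round wafer.

    First term: wafer area / die area.
    Second term: dies lost along the circumference.
    """
    wafer_area = math.pi * (wafer_diameter_cm / 2) ** 2
    edge_loss = math.pi * wafer_diameter_cm / math.sqrt(2 * die_area_cm2)
    return wafer_area / die_area_cm2 - edge_loss


# 30 cm wafer, die 1.5 cm on a side (area 2.25 cm^2), as in the example.
estimate = dies_per_wafer(30, 1.5 * 1.5)
print(round(estimate))
```

The slide rounds pi to 3.14, which is why its intermediate values (706.5 and 94.2) differ slightly from full-precision arithmetic; the rounded result is the same.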
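Example 2 (p.24): Find the die yield for dies that are 1.5 cm on a side and 1.0 cm on a side, assuming a defect density of 0.4 per cm² and α is 4.
The total die areas are 2.25 cm² and 1.00 cm². For the large die the yield is
$$ \text{Die yield} = \left(1 + \frac{0.4 \times 2.25}{4.0}\right)^{-4} = 0.44 $$
For the small die the yield is
$$ \text{Die yield} = \left(1 + \frac{0.4 \times 1.00}{4.0}\right)^{-4} = 0.68 $$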
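The yield calculation, and how it feeds into die cost, can be sketched as follows. This assumes a wafer yield of 100%, which is what the worked numbers imply; the wafer cost of $5,000 is a hypothetical figure used only to show how the cost formula combines with the yield, and the function names are illustrative.

```python
def die_yield(defects_per_cm2, die_area_cm2, alpha=4.0, wafer_yield=1.0):
    """Die yield = wafer yield * (1 + defect density * die area / alpha)^(-alpha)."""
    return wafer_yield * (1 + defects_per_cm2 * die_area_cm2 / alpha) ** (-alpha)


def cost_per_die(wafer_cost, dies_on_wafer, yield_fraction):
    """Cost of die = cost of wafer / (dies per wafer * die yield)."""
    return wafer_cost / (dies_on_wafer * yield_fraction)


large = die_yield(0.4, 2.25)   # 1.5 cm on a side
small = die_yield(0.4, 1.00)   # 1.0 cm on a side
print(f"large die yield: {large:.2f}")
print(f"small die yield: {small:.2f}")

# Hypothetical wafer cost; 270 dies per wafer comes from Example 1.
HYPOTHETICAL_WAFER_COST = 5000.0
large_die_cost = cost_per_die(HYPOTHETICAL_WAFER_COST, 270, large)
print(f"cost per good large die: ${large_die_cost:.2f}")
```

Yield drops quickly as die area grows, so the cost of a good die rises faster than its area.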
1.8 Measuring, Reporting, and Summarizing Performance
time.
The phrase "X is faster than Y" is used here to mean that the response time or execution time is lower on X than on Y.
In particular, "X is n times faster than Y" or "the throughput of X is n times higher than Y" will mean
$$ \frac{\text{Execution time}_Y}{\text{Execution time}_X} = n $$
Performance Measuring
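Execution time is the reciprocal of performance,
$$ \text{Performance}_X = \frac{1}{\text{Execution time}_X} $$
so
$$ n = \frac{\text{Execution time}_Y}{\text{Execution time}_X} = \frac{1/\text{Performance}_Y}{1/\text{Performance}_X} = \frac{\text{Performance}_X}{\text{Performance}_Y} $$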
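The equivalence between the execution-time ratio and the performance ratio can be illustrated with a small sketch. The execution times below are hypothetical, and the function names are illustrative.

```python
import math


def performance(execution_time):
    """Performance is the reciprocal of execution time."""
    return 1.0 / execution_time


def times_faster(execution_time_y, execution_time_x):
    """n such that X is n times faster than Y."""
    return execution_time_y / execution_time_x


# Hypothetical measurements in seconds.
time_x = 10.0
time_y = 15.0

n_from_time = times_faster(time_y, time_x)
n_from_perf = performance(time_x) / performance(time_y)

print(f"n from execution times: {n_from_time:.2f}")
print(f"n from performance:     {n_from_perf:.2f}")
```

Both routes give the same n, up to floating-point rounding, because the two reciprocals cancel.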
1.9 Quantitative Principles of Computer Design
Principle of Locality
Amdahl's law
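$$ \text{Speedup} = \frac{\text{Performance for entire task using the enhancement when possible}}{\text{Performance for entire task without using the enhancement}} $$
Alternatively,
$$ \text{Speedup} = \frac{\text{Execution time for entire task without using the enhancement}}{\text{Execution time for entire task using the enhancement when possible}} $$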
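of the machine + the time spent using the enhancement, that is,
$$ \text{Execution time}_{\text{new}} = \text{Execution time}_{\text{old}} \times \left( (1 - \text{Fraction}_{\text{enhanced}}) + \frac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}} \right) $$
The overall speedup is the ratio of the execution times:
$$ \text{Speedup}_{\text{overall}} = \frac{\text{Execution time}_{\text{old}}}{\text{Execution time}_{\text{new}}} = \frac{1}{(1 - \text{Fraction}_{\text{enhanced}}) + \frac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}}} $$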
$$ \text{Speedup}_{\text{overall}} = \frac{1}{(1 - 0.4) + \frac{0.4}{10}} = \frac{1}{0.6 + 0.04} = \frac{1}{0.64} \approx 1.56 $$
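$$ \text{Speedup}_{\text{FPSQR}} = \frac{1}{(1 - 0.2) + \frac{0.2}{10}} = \frac{1}{0.82} \approx 1.22 $$
$$ \text{Speedup}_{\text{FP}} = \frac{1}{(1 - 0.5) + \frac{0.5}{1.6}} = \frac{1}{0.8125} \approx 1.23 $$
Improving the performance of the FP operations overall is slightly better because of the higher frequency.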
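Amdahl's law is short enough to express as a single function. The sketch below assumes the overall-speedup formula from this section and applies it to the three sets of numbers worked above; the function name `amdahl_speedup` and the input checks are illustrative additions.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup = 1 / ((1 - F) + F / S)."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if speedup_enhanced <= 0:
        raise ValueError("speedup_enhanced must be positive")
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


# 40% of the time enhanced by a factor of 10.
overall = amdahl_speedup(0.4, 10)

# Two design alternatives: speed up FPSQR (20% of time) by 10,
# or speed up all FP operations (50% of time) by 1.6.
fpsqr = amdahl_speedup(0.2, 10)
fp = amdahl_speedup(0.5, 1.6)

print(f"overall: {overall:.2f}")
print(f"FPSQR:   {fpsqr:.2f}")
print(f"FP:      {fp:.2f}")

# However large the enhancement, the speedup cannot exceed 1 / (1 - F).
limit = 1.0 / (1.0 - 0.4)
print(f"upper bound for F = 0.4: {limit:.2f}")
```

The last line shows the law's corollary: the unenhanced fraction bounds the achievable speedup regardless of how fast the enhanced part becomes.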
Example 5 (p.41): The calculation of the failure rates of the disk subsystem was
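$$ \text{Failure rate}_{\text{system}} = 10 \times \frac{1}{1{,}000{,}000} + \frac{1}{500{,}000} + \frac{1}{200{,}000} + \frac{1}{200{,}000} + \frac{1}{1{,}000{,}000} = \frac{10 + 2 + 5 + 5 + 1}{1{,}000{,}000 \text{ hours}} = \frac{23}{1{,}000{,}000 \text{ hours}} $$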
Therefore, the fraction of the failure rate that could be improved is 5 per million hours
out of 23 for the whole system, or 0.22.
Answer
The reliability improvement would be
$$ \text{Improvement}_{\text{power supply pair}} = \frac{1}{(1 - 0.22) + \frac{0.22}{4150}} = \frac{1}{0.78} = 1.28 $$
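The reliability example can be reproduced in a few lines. This is a sketch under stated assumptions: the component counts and MTTF values below are the ones used in the textbook example this slide cites (10 disks, one controller, one power supply, one fan, one cable) and should be treated as assumptions if your edition differs. The improvement factor of 4150 for the redundant power supply pair is taken from the slide.

```python
# (component, count, MTTF in hours); values assumed from the textbook example.
components = [
    ("disk", 10, 1_000_000),
    ("SCSI controller", 1, 500_000),
    ("power supply", 1, 200_000),
    ("fan", 1, 200_000),
    ("SCSI cable", 1, 1_000_000),
]

# With independent, exponentially distributed lifetimes,
# the system failure rate is the sum of the component failure rates.
failure_rate = sum(count / mttf for _, count, mttf in components)
system_mttf_hours = 1.0 / failure_rate

# Fraction of the failure rate due to the single power supply.
power_supply_fraction = (1 / 200_000) / failure_rate

# Amdahl's law with the slide's rounded fraction (0.22) and factor (4150).
improvement = 1.0 / ((1 - 0.22) + 0.22 / 4150)

print(f"failures per million hours: {failure_rate * 1e6:.0f}")
print(f"system MTTF (hours):        {system_mttf_hours:.0f}")
print(f"power supply fraction:      {power_supply_fraction:.2f}")
print(f"reliability improvement:    {improvement:.2f}")
```

Even a 4150-fold improvement in one component yields only about a 1.28-fold improvement overall, because the power supply accounts for less than a quarter of the failures.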
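Thus, the CPU time for a program can be expressed two ways:
$$ \text{CPU time} = \text{CPU clock cycles for a program} \times \text{Clock cycle time} $$
or
$$ \text{CPU time} = \frac{\text{CPU clock cycles for a program}}{\text{Clock rate}} $$
Clock cycles per instruction (CPI) is the average number of clock cycles each instruction takes:
$$ \text{CPI} = \frac{\text{CPU clock cycles for a program}}{\text{Instruction count}} $$
This figure of merit provides insight into different styles of instruction sets and implementations.
Thus, clock cycles can be defined as IC × CPI, which allows us to use CPI in the execution time formula:
$$ \text{CPU time} = \frac{\text{IC} \times \text{CPI}}{\text{Clock rate}} $$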
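The execution time formula translates directly into code. A minimal sketch, assuming the relation CPU time = IC × CPI / Clock rate; the instruction count, CPI, and clock rate below are hypothetical values chosen for round numbers.

```python
import math


def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time in seconds = IC * CPI / clock rate."""
    return instruction_count * cpi / clock_rate_hz


# Hypothetical program and machine.
IC = 2_000_000_000        # instructions executed
CPI = 1.5                 # average clock cycles per instruction
CLOCK_RATE = 3e9          # 3 GHz

seconds = cpu_time(IC, CPI, CLOCK_RATE)

# Equivalent form using the clock cycle time instead of the clock rate.
clock_cycle_time = 1.0 / CLOCK_RATE
seconds_alt = IC * CPI * clock_cycle_time

print(f"CPU time: {seconds:.3f} s")
```

The three factors are interdependent in practice: a change that lowers CPI may raise the instruction count or lengthen the clock cycle, so all three have to be considered together.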
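$$ \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Clock cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Clock cycle}} = \frac{\text{Seconds}}{\text{Program}} = \text{CPU time} $$
$$ \text{CPU clock cycles} = \sum_{i=1}^{n} \text{IC}_i \times \text{CPI}_i $$
where IC_i is the number of times instruction i is executed in a program and CPI_i is the average number of clock cycles for instruction i.
$$ \text{CPU time} = \left( \sum_{i=1}^{n} \text{IC}_i \times \text{CPI}_i \right) \times \text{Clock cycle time} $$
$$ \text{CPI} = \frac{\sum_{i=1}^{n} \text{IC}_i \times \text{CPI}_i}{\text{Instruction count}} = \sum_{i=1}^{n} \frac{\text{IC}_i}{\text{Instruction count}} \times \text{CPI}_i $$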
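The per-class CPI formula amounts to a weighted average. The sketch below assumes the summation form above; the instruction classes, counts, and per-class CPI values are hypothetical.

```python
import math


def overall_cpi(instruction_mix):
    """Overall CPI from (instruction_count_i, cpi_i) pairs.

    CPI = sum(IC_i * CPI_i) / sum(IC_i)
    """
    total_instructions = sum(count for count, _ in instruction_mix)
    total_cycles = sum(count * cpi for count, cpi in instruction_mix)
    return total_cycles / total_instructions


# Hypothetical instruction mix: (instruction count, CPI for that class).
mix = [
    (50, 1.0),   # ALU operations
    (30, 2.0),   # loads and stores
    (20, 3.0),   # branches
]

cpi = overall_cpi(mix)

# Same result using each class's fraction of the instruction count.
total = sum(count for count, _ in mix)
cpi_from_fractions = sum((count / total) * class_cpi for count, class_cpi in mix)

print(f"overall CPI: {cpi:.2f}")
```

The second computation mirrors the right-hand form of the CPI equation, where each CPI_i is weighted by the fraction of instructions in that class.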
We can compute the CPI for the enhanced FPSQR by subtracting the cycles saved from the original CPI:
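$$ \text{CPI}_{\text{with new FPSQR}} = \text{CPI}_{\text{original}} - 2\% \times (\text{CPI}_{\text{old FPSQR}} - \text{CPI}_{\text{of new FPSQR only}}) = 2.0 - 2\% \times (20 - 2) = 1.64 $$
Summing the non-FP and FP contributions gives the CPI for the alternative that enhances all FP instructions:
$$ \text{CPI}_{\text{new FP}} = (75\% \times 1.33) + (25\% \times 2.5) \approx 1.625 $$
Since this CPI is slightly lower, the FP enhancement performs marginally better:
$$ \text{Speedup}_{\text{new FP}} = \frac{\text{CPU time}_{\text{original}}}{\text{CPU time}_{\text{new FP}}} = \frac{\text{CPI}_{\text{original}}}{\text{CPI}_{\text{new FP}}} = \frac{2.00}{1.625} = 1.23 $$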
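The comparison of the two alternatives can be checked with the numbers on the slide. This sketch assumes that instruction count and clock rate are unchanged by either enhancement, so the speedup is simply the ratio of CPIs; the variable names are illustrative.

```python
CPI_ORIGINAL = 2.0

# Alternative 1: FPSQR is 2% of instructions; its CPI drops from 20 to 2.
cpi_new_fpsqr = CPI_ORIGINAL - 0.02 * (20 - 2)

# Alternative 2: all FP instructions (25%) drop to CPI 2.5,
# while the other 75% keep a CPI of 1.33.
# With the rounded 1.33 this gives 1.6225; the slide's 1.625 uses 4/3.
cpi_new_fp = 0.75 * 1.33 + 0.25 * 2.5

# Same instruction count and clock rate, so speedup = CPI ratio.
speedup_fpsqr = CPI_ORIGINAL / cpi_new_fpsqr
speedup_fp = CPI_ORIGINAL / cpi_new_fp

print(f"CPI with new FPSQR: {cpi_new_fpsqr:.2f}, speedup {speedup_fpsqr:.2f}")
print(f"CPI with new FP:    {cpi_new_fp:.2f}, speedup {speedup_fp:.2f}")
```

These speedups agree with the values obtained from Amdahl's law for the same two alternatives, which is a useful cross-check between the two methods.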
Most new processors include counters for both instructions executed and clock cycles.