Advanced Computer Architecture-II - CS704 Power Point Slides Lecture 02
Lecture 2
Quantitative Principles
A detailed discussion of computer performance, the key to quantitative design and analysis
MAC/VU-Advanced Computer Architecture
Lecture 2 - Performance
Today's Topics
Recap of Lecture 1
Growth in processor performance
Price-performance design
CPU performance metrics
CPU benchmark suites
Summary
Recap of Lecture 1
Computer Systems:
Architecture refers to those attributes of a computer visible to a programmer or compiler writer; e.g. instruction set, addressing techniques, I/O mechanisms etc.
Organization refers to how the features of a computer are implemented; e.g., control signals are generated using a finite state machine (FSM) or microprogramming.
Recap of Lecture 1
Computer Development:
Academically, modern computer development had its infancy in 1944-49.
Commercially, the first machine was built by the Eckert-Mauchly Computer Corporation in 1949.
Technological developments, from vacuum tubes to VLSI circuits, dynamic memory and network technology, gave birth to four different generations of computers.
Microprocessors and PCs were introduced in 1971.
Recap of Lecture 1
Design Perspectives:
Processor: ISA, ILP and cache
Memory hierarchy: multilevel cache and virtual memory
Input/output and storage
Multiprocessors and networks
Recap of Lecture 1
Computer Design Cycle:
Computer design and development has been under the influence of
-technology,
-performance and
-cost;
the decisive factors for rapid change in computer development have been performance enhancements, price reductions and functional improvements.
[Figure omitted; axis label: Year]
Price-Performance Design
Technology improvements are used to lower cost and increase performance.
The relationship between cost and price is a complex one.
The cost is the total amount spent to produce a product.
The price is the amount for which a finished good is sold.
Price-Performance Design
List Price:
The amount for which the finished good is sold; it includes an average discount of 15% to 35% as volume discounts and/or retailer markup.
[Figure omitted; panel labels: W/S, PC]
Cost-effective IC Design:
Price-Performance Design
Feature Size:
Reduction in feature size from 10 microns in 1971 to 0.18 microns in 2001 has resulted in:
Integrated circuit manufacturing passes through many stages:
-Wafer growth and testing
-Chopping the wafer into dies
-Packaging the dies into chips
-Testing the chips
Die: the square area of the wafer containing the integrated circuit.
Note that when dies are fitted onto the wafer, a small wafer area around the periphery goes to waste.
Cost of a die: The cost of a die is determined from the cost of a wafer, the number of dies that fit on a wafer, and the percentage of dies that work, i.e., the die yield.
The cost of an integrated circuit is the ratio of the total cost (the sum of the die cost, die testing cost, packaging cost and final testing cost) to the final test yield.
Cost of IC = (die cost + die testing cost + packaging cost + final testing cost) / final test yield
The cost of a die is the ratio of the cost of the wafer to the product of the dies per wafer and the die yield:
Cost of die = cost of wafer / (dies per wafer x die yield)
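The two cost relations stated so far (die cost from wafer cost, and IC cost from the summed costs over the final test yield) can be sketched in a few lines. The function names and all numeric inputs below are hypothetical, chosen only to exercise the formulas; they do not come from the lecture.

```python
def die_cost(wafer_cost, dies_per_wafer, die_yield):
    """Cost of one working die: wafer cost spread over the dies that work."""
    return wafer_cost / (dies_per_wafer * die_yield)


def ic_cost(die, die_test, packaging, final_test, final_test_yield):
    """Cost of a packaged chip: summed costs divided by the final test yield."""
    return (die + die_test + packaging + final_test) / final_test_yield


# Hypothetical inputs (not from the lecture).
example_die = die_cost(wafer_cost=5000.0, dies_per_wafer=1000, die_yield=0.5)
example_ic = ic_cost(example_die, die_test=1.0, packaging=3.0,
                     final_test=1.0, final_test_yield=0.75)
print(f"die cost = {example_die:.2f}, IC cost = {example_ic:.2f}")
```

Note how a lower yield at either stage raises the cost directly, since yield appears in the denominator of both relations.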
The number of dies per wafer is determined by dividing the wafer area (minus the wasted wafer area near the round periphery) by the die area.
Example
For a die 0.7 cm on a side, find the number of dies per wafer for a wafer of 30 cm diameter.
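The worked answer to this example did not survive in the slides. A minimal sketch follows, assuming the common textbook approximation in which the wasted area at the periphery is estimated as the wafer circumference divided by the diagonal of a square die; that edge-loss term and the function name are assumptions, not something the lecture text states.

```python
import math


def dies_per_wafer(wafer_diameter_cm, die_side_cm):
    """Estimate whole dies per wafer: gross count minus an edge-loss term.

    The edge-loss term (circumference / die diagonal) is an assumed
    textbook approximation for the wasted area near the round periphery.
    """
    die_area = die_side_cm ** 2
    wafer_area = math.pi * (wafer_diameter_cm / 2) ** 2
    edge_loss = math.pi * wafer_diameter_cm / math.sqrt(2 * die_area)
    return int(wafer_area / die_area - edge_loss)


print(dies_per_wafer(30, 0.7))  # 1347 under the assumed approximation
```

Dividing the wafer area by the die area alone gives about 1442 dies; the edge-loss term removes roughly 95 of them.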
The yield of a die 0.7 cm on a side (die area 0.49 cm2), with a defect density of 0.6/cm2, is (1 + [0.6 x 0.49]/4.0)^-4 = 0.75
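The yield calculation can be checked numerically. This is a sketch under stated assumptions: the parameter names `alpha` (the constant 4.0 that also appears as the exponent) and `wafer_yield` (taken as 1.0) are labels introduced here, and the die area is 0.7 x 0.7 = 0.49 cm2.

```python
def die_yield(defects_per_cm2, die_area_cm2, alpha=4.0, wafer_yield=1.0):
    """Die yield = wafer yield x (1 + defect density x die area / alpha)^(-alpha).

    alpha and wafer_yield are assumed names; the slide shows only the
    numbers 4.0 and the exponent -4.
    """
    return wafer_yield * (1 + defects_per_cm2 * die_area_cm2 / alpha) ** (-alpha)


example_yield = die_yield(defects_per_cm2=0.6, die_area_cm2=0.7 * 0.7)
print(f"{example_yield:.2f}")  # 0.75
```

Larger dies or higher defect densities shrink the yield, which is why die cost grows much faster than die area.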
Price-Performance Design
Time to run the task: execution time, response time, latency
Throughput or bandwidth: tasks per day, hour, week, sec, ns
         Trip time    Passengers   Time / person   Cost / person   Cost-performance
Train    4.0 hours    2400         6.0 sec         300 Rs.         300 x 6 = 1,800 Rs-sec/person
Plane    45 min.      300          9.0 sec         3000 Rs.        3000 x 9 = 27,000 Rs-sec/person

The plane is about 5 times faster per trip (45 min versus 4.0 hours) but takes 50% more time to complete the job (9.0 sec versus 6.0 sec per person), i.e., it has lower throughput; thus the performance of the train is 50% better than that of the plane.
The time per person and the cost per person of the train are both less than those of the plane. Thus the cost-performance of the train relative to the plane is 1:15.
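The train versus plane figures can be reproduced with a short calculation. The helper names are hypothetical; the inputs (trip time, passengers, cost per person) are the values listed for the two vehicles.

```python
def time_per_person_sec(trip_minutes, passengers):
    """Seconds of trip time attributable to each passenger (inverse throughput)."""
    return trip_minutes * 60 / passengers


def cost_performance(cost_per_person_rs, seconds_per_person):
    """Cost-performance in Rs-sec/person; lower is better."""
    return cost_per_person_rs * seconds_per_person


train_time = time_per_person_sec(4 * 60, 2400)   # 6.0 sec
plane_time = time_per_person_sec(45, 300)        # 9.0 sec
train_cp = cost_performance(300, train_time)     # 1800.0
plane_cp = cost_performance(3000, plane_time)    # 27000.0
print(plane_time / train_time, plane_cp / train_cp)  # 1.5 15.0
```

The first ratio is the 50% throughput advantage of the train; the second is the 1:15 cost-performance ratio.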
MIPS: millions of instructions per second
MFLOPS: millions of floating-point operations per second
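CPU time = Seconds / Program = (Instructions / Program) x (Cycles / Instruction) x (Seconds / Cycle)
The three terms, Inst Count, CPI and Clock Rate, depend on the Program, Compiler, Inst. Set, Organization and Technology.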
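The CPU time relation (instructions per program x cycles per instruction x seconds per cycle) can be sketched as a one-line function. The function name and the sample workload are hypothetical; seconds per cycle is written as 1 / clock rate.

```python
def cpu_time_seconds(instruction_count, cpi, clock_rate_hz):
    """CPU time = instruction count x CPI x (1 / clock rate)."""
    return instruction_count * cpi / clock_rate_hz


# Hypothetical workload: 2 billion instructions, CPI of 1.5, 1 GHz clock.
example_cpu_time = cpu_time_seconds(2e9, 1.5, 1e9)
print(example_cpu_time)  # 3.0 seconds
```

Halving any one of the three terms (instruction count, CPI, or cycle time) halves the CPU time, which is why each design level listed above matters.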
Cycles per Instruction
CPI = CPU clock cycles for program / Instruction count
    = (CPU time x Clock rate) / Instruction count
Instruction Frequency
For an instruction mix, the relative frequency of occurrence of each type of instruction is given as:
F_i = IC_i / Total instruction count, where IC_i is the instruction count of the ith instruction type
Average Cycles per Instruction
CPI = Σ (i = 1 to n) CPI_i x F_i
CPU time = Cycle time x Σ (i = 1 to n) CPI_i x IC_i
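The average CPI of an instruction mix can be computed by weighting each instruction type's CPI by its relative frequency. The mix below (instruction classes, counts and cycles per class) is a hypothetical illustration, not data from the lecture.

```python
# Hypothetical mix: instruction counts and cycles per instruction for each type.
counts = {"ALU": 50, "Load": 20, "Store": 10, "Branch": 20}
cycles = {"ALU": 1, "Load": 2, "Store": 2, "Branch": 2}


def average_cpi(counts, cycles):
    """CPI = sum over types of CPI_i x F_i, with F_i = IC_i / total count."""
    total = sum(counts.values())
    return sum(cycles[kind] * counts[kind] / total for kind in counts)


mix_cpi = average_cpi(counts, cycles)
print(f"{mix_cpi:.2f}")  # 1.50
```

The per-type products CPI_i x F_i also show where the time is spent, which is where design effort pays off most.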
Performance
Time is the key measurement of performance.
Comparing the performance of two designs X and Y: the ratio
Execution time Y / Execution time X = Performance X / Performance Y
determines how many times faster machine X is than machine Y, since performance is the inverse of execution time.
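As a small illustration of the execution-time ratio, the sketch below uses hypothetical timings for two machines; the function name is an assumption introduced here.

```python
def times_faster(exec_time_y, exec_time_x):
    """Ratio Execution time Y / Execution time X = Performance X / Performance Y."""
    return exec_time_y / exec_time_x


# Hypothetical timings: X finishes in 10 s, Y in 15 s.
ratio = times_faster(exec_time_y=15.0, exec_time_x=10.0)
print(ratio)  # 1.5, i.e. X is 1.5 times faster than Y
```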
MFLOPS is defined for floating-point-intensive programs as millions of floating-point operations per second.
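Both rate metrics follow directly from their definitions: a count of operations divided by execution time, scaled to millions. The function names and sample numbers below are hypothetical.

```python
def mips(instruction_count, exec_time_sec):
    """Millions of instructions per second."""
    return instruction_count / (exec_time_sec * 1e6)


def mflops(fp_operations, exec_time_sec):
    """Millions of floating-point operations per second."""
    return fp_operations / (exec_time_sec * 1e6)


# Hypothetical program: 3 billion instructions, 500 million FP operations, 2 s.
print(mips(3e9, 2.0), mflops(5e8, 2.0))  # 1500.0 250.0
```

Because both depend on the instruction mix of the program being run, a higher rating on one program does not guarantee a shorter execution time on another.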
SPEC:
SPECmarks
Summary
        Size             Speed
CPU     2x in 3 years    2x in 3 years
DRAM    4x in 3 years    2x in 10 years
Disk    4x in 3 years    2x in 10 years
6 yrs to graduate => 16X CPU speed, DRAM/Disk size
Time to run the task: execution time, response time, latency
ExTime(Y) / ExTime(X) = Performance(X) / Performance(Y)
Summary .. Contd
CPI Law:
CPU time = Seconds / Program = (Instructions / Program) x (Cycles / Instruction) x (Seconds / Cycle)
Execution time is the REAL measure of computer performance!
Good products are created when we have good benchmarks and good ways to summarize performance.
Die cost goes roughly with (die area)^4.
Summary .. Contd
For better or worse, benchmarks shape a field.
Good products are created when we have:
-Good benchmarks
-Good ways to summarize performance
Given that sales are in part a function of performance relative to the competition, companies invest in improving the product as reported by the performance summary.
If the benchmarks or summary are inadequate, the choice is between improving the product for real programs and improving the product to get more sales; sales almost always wins!
Execution time is the measure of computer performance!