001-stored program computer

The document discusses computer evolution and performance, detailing the architecture of stored program computers, including components like memory, ALU, and control units. It also covers concepts such as Moore's Law, instruction sets, microprocessors, and the importance of system components in overall performance. Additionally, it addresses memory operations, instruction execution, and the impact of various factors on system speed and efficiency.

Uploaded by

chowresearch22
Copyright
© © All Rights Reserved

CHAPTER 2

COMPUTER EVOLUTION AND PERFORMANCE

ANSWERS TO QUESTIONS
2.1 In a stored program computer, programs are represented in a form suitable for
storing in memory alongside the data. The computer gets its instructions by reading
them from memory, and a program can be set or altered by setting the values of a
portion of memory.

2.2 A main memory, which stores both data and instructions; an arithmetic and logic
unit (ALU) capable of operating on binary data; a control unit, which interprets the
instructions in memory and causes them to be executed; and input and output
(I/O) equipment operated by the control unit.

2.3 Gates, memory cells, and interconnections among gates and memory cells.

2.4 Moore observed that the number of transistors that could be put on a single chip
was doubling every year and correctly predicted that this pace would continue into
the near future.

2.5 Similar or identical instruction set: In many cases, the same set of machine
instructions is supported on all members of the family. Thus, a program that
executes on one machine will also execute on any other. Similar or identical
operating system: The same basic operating system is available for all family
members. Increasing speed: The rate of instruction execution increases in going
from lower to higher family members. Increasing number of I/O ports: The number
of I/O ports increases in going from lower to higher family members. Increasing
memory size: The size of main memory increases in going from lower to higher
family members. Increasing cost: The cost increases in going from lower to higher
family members.

2.6 In a microprocessor, all of the components of the CPU are on a single chip.

ANSWERS TO PROBLEMS
2.1 This program is developed in [HAYE98]. The vectors A, B, and C are each stored in
1,000 contiguous locations in memory, beginning at locations 1001, 2001, and 3001,
respectively. The program begins with the left half of location 3. A counting
variable N is set to 999 and decremented after each step until it reaches –1. Thus,
the vectors are processed from high location to low location.

Location  Instruction        Comments
0         999                Constant (count N)
1         1                  Constant
2         1000               Constant
3L        LOAD M(2000)       Transfer A(I) to AC
3R        ADD M(3000)        Compute A(I) + B(I)
4L        STOR M(4000)       Transfer sum to C(I)
4R        LOAD M(0)          Load count N
5L        SUB M(1)           Decrement N by 1
5R        JUMP+ M(6, 20:39)  Test N and branch to 6R if nonnegative
6L        JUMP M(6, 0:19)    Halt
6R        STOR M(0)          Update N
7L        ADD M(1)           Increment AC by 1
7R        ADD M(2)
8L        STOR M(3, 8:19)    Modify address in 3L
8R        ADD M(2)
9L        STOR M(3, 28:39)   Modify address in 3R
9R        ADD M(2)
10L       STOR M(4, 8:19)    Modify address in 4L
10R       JUMP M(3, 0:19)    Branch to 3L
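
For readers who want to check the loop logic, the following is a minimal Python sketch of the computation the IAS program performs. It is not a translation of the machine code: the self-modifying address fields are replaced by ordinary index arithmetic, and the sample vector contents (A(I) = I, B(I) = 2I) are made up for illustration.

```python
# Sketch of what the IAS program computes: C(I) = A(I) + B(I),
# processed from the highest location down to the lowest.
# Memory layout follows the answer text: A at 1001..2000,
# B at 2001..3000, C at 3001..4000.

def vector_add(memory):
    n = 999                       # location 0: count N
    while n >= 0:                 # the loop ends once N - 1 goes negative
        a_addr = 1001 + n         # address field of 3L (initially 2000)
        b_addr = 2001 + n         # address field of 3R (initially 3000)
        c_addr = 3001 + n         # address field of 4L (initially 4000)
        memory[c_addr] = memory[a_addr] + memory[b_addr]
        n -= 1                    # 5L: SUB M(1)
    return memory

memory = [0] * 4001
for i in range(1000):
    memory[1001 + i] = i          # hypothetical A(I) = I
    memory[2001 + i] = 2 * i      # hypothetical B(I) = 2I
vector_add(memory)
```

With these sample values, each C(I) ends up holding 3I.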

2.2 a.
Opcode Operand
00000001 000000000010

b. First, the CPU must access memory to fetch the instruction. The
instruction contains the address of the data we want to load. During the execute
phase, the CPU accesses memory again to load the data value located at that
address, for a total of two trips to memory.
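
The encoding in part (a) can be reproduced with a short Python sketch. It assumes only what the answer shows: an 8-bit opcode field followed by a 12-bit address field. The function name `encode_ias` is hypothetical.

```python
def encode_ias(opcode, address):
    """Pack an 8-bit opcode and a 12-bit address into a 20-bit bit string."""
    if not 0 <= opcode < 2**8:
        raise ValueError("opcode must fit in 8 bits")
    if not 0 <= address < 2**12:
        raise ValueError("address must fit in 12 bits")
    return format(opcode, "08b") + format(address, "012b")

# Opcode 00000001 with operand address 2, as in part (a).
word = encode_ias(0b00000001, 2)
print(word[:8], word[8:])   # prints "00000001 000000000010"
```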
2.3 To read a value from memory, the CPU puts the address of the value it wants into
the MAR. The CPU then asserts the Read control line to memory and places the
address on the address bus. Memory places the contents of the addressed memory
location on the data bus. This data is then transferred to the MBR. To write a value to
memory, the CPU puts the address of the value it wants to write into the MAR. The
CPU also places the data it wants to write into the MBR. The CPU then asserts the
Write control line to memory and places the address on the address bus and the
data on the data bus. Memory transfers the data on the data bus into the
corresponding memory location.
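
The read and write sequences above can be sketched as a small Python model. The class name `SimpleMemorySystem` and the use of local variables to stand in for the address and data buses are illustrative assumptions; control-line timing is not modeled.

```python
class SimpleMemorySystem:
    """Toy model of a CPU talking to memory through the MAR and MBR."""

    def __init__(self, size):
        self.cells = [0] * size
        self.mar = 0   # memory address register
        self.mbr = 0   # memory buffer register

    def read(self, address):
        self.mar = address                   # CPU puts the address into the MAR
        address_bus = self.mar               # Read asserted, address placed on the bus
        data_bus = self.cells[address_bus]   # memory drives the data bus
        self.mbr = data_bus                  # data is transferred to the MBR
        return self.mbr

    def write(self, address, value):
        self.mar = address                   # CPU puts the address into the MAR
        self.mbr = value                     # CPU puts the data into the MBR
        address_bus, data_bus = self.mar, self.mbr   # Write asserted
        self.cells[address_bus] = data_bus   # memory stores the data

mem = SimpleMemorySystem(16)
mem.write(3, 42)
value = mem.read(3)
```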

2.4
Address   Contents
08A       LOAD M(0FA)
          STOR M(0FB)
08B       LOAD M(0FA)
          JUMP +M(08D)
08C       LOAD –M(0FA)
          STOR M(0FB)
08D

This program stores the absolute value of the contents of memory location 0FA in
memory location 0FB.
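
A step-by-step Python sketch of the same program follows, with one statement per half-word instruction. Memory is modeled as a dictionary keyed by address, which is an illustrative simplification, and the function name `run_abs_program` is hypothetical.

```python
def run_abs_program(memory):
    ac = memory[0x0FA]          # 08A left:  LOAD M(0FA)
    memory[0x0FB] = ac          # 08A right: STOR M(0FB)
    ac = memory[0x0FA]          # 08B left:  LOAD M(0FA)
    if ac >= 0:                 # 08B right: JUMP +M(08D), taken if AC is nonnegative
        return memory           # 08D: continue past the program
    ac = -memory[0x0FA]         # 08C left:  LOAD -M(0FA)
    memory[0x0FB] = ac          # 08C right: STOR M(0FB)
    return memory

negative_case = run_abs_program({0x0FA: -7, 0x0FB: 0})
positive_case = run_abs_program({0x0FA: 5, 0x0FB: 0})
```

In both cases location 0FB ends up holding the magnitude of the value in 0FA.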

2.5 All data paths to/from MBR are 40 bits. All data paths to/from MAR are 12 bits.
Paths to/from AC are 40 bits. Paths to/from MQ are 40 bits.

2.6 The purpose is to increase performance. When an address is presented to a memory
module, there is some time delay before the read or write operation can be
performed. While this is happening, an address can be presented to the other
module. For a series of requests for successive words, the maximum rate is
doubled.

2.7 The discrepancy can be explained by noting that other system components aside from clock
speed make a big difference in overall system speed. In particular, memory systems and
advances in I/O processing contribute to the performance ratio. A system is only as fast as
its slowest link. In recent years, the bottlenecks have been the performance of memory
modules and bus speed.

2.8 As noted in the answer to Problem 2.7, even though the Intel machine may have a
faster clock speed (2.4 GHz vs. 1.2 GHz), that does not necessarily mean the system
will perform faster. Different systems are not comparable on clock speed. Other
factors such as the system components (memory, buses, architecture) and the
instruction sets must also be taken into account. A more accurate measure is to run
both systems on a benchmark. Benchmark programs exist for certain tasks, such as
running office applications, performing floating point operations, graphics
operations, and so on. The systems can be compared to each other on how long
they take to complete these tasks. According to Apple Computer, the G4 is
comparable to or better than a higher-clock-speed Pentium on many benchmarks.

2.9 This representation is wasteful because to represent a single decimal digit from 0
through 9 we need to have ten tubes. If we could have an arbitrary number of these
tubes ON at the same time, then those same tubes could be treated as binary bits.
With ten bits, we can represent 2^10 patterns, or 1024 patterns. For integers, these
patterns could be used to represent the numbers from 0 through 1023.

2.10
                                 Ic    p    m    k    τ
Instruction set architecture     X     X
Compiler technology              X     X    X
Processor implementation               X              X
Cache and memory hierarchy                       X    X

Source: [HWAN93]

2.11 MIPS rate = f/(CPI × 10^6)
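
As a quick numerical check of this formula, here is a minimal Python sketch. The function name `mips_rate` is hypothetical, and the inputs are the clock rates and CPI values that appear in the answer to Problem 2.12(b).

```python
def mips_rate(clock_hz, cpi):
    """MIPS rate = f / (CPI * 10^6), with f in hertz."""
    return clock_hz / (cpi * 1e6)

vax_mips = mips_rate(5e6, 5)             # 5 MHz clock, CPI = 5
rs6000_mips = mips_rate(25e6, 25 / 18)   # 25 MHz clock, CPI = 25/18
print(round(vax_mips, 2), round(rs6000_mips, 2))
```

The two results come out at about 1 MIPS and 18 MIPS, the rates used in Problem 2.12.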

2.12 a. We can express the MIPS rate as: [(MIPS rate) × 10^6] = Ic/T. So that:
Ic = T × [(MIPS rate) × 10^6]. The ratio of the instruction count of the RS/6000 to
the VAX is [x × 18]/[12x × 1] = 1.5.
b. For the VAX, CPI = (5 MHz)/(1 MIPS) = 5.
For the RS/6000, CPI = 25/18 = 1.39.
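
The arithmetic in both parts can be checked with a short Python sketch. The helper names are hypothetical, and the RS/6000 execution time `x` is set to an arbitrary 1 second because it cancels out of the ratio.

```python
def instruction_count(exec_time_s, mips):
    """Ic = T * (MIPS rate * 10^6)."""
    return exec_time_s * mips * 1e6

def cpi(clock_mhz, mips):
    """CPI = clock rate in MHz divided by the MIPS rate."""
    return clock_mhz / mips

x = 1.0  # arbitrary RS/6000 execution time in seconds (assumption)

# Part (a): RS/6000 runs in time x at 18 MIPS, VAX in time 12x at 1 MIPS.
ratio = instruction_count(x, 18) / instruction_count(12 * x, 1)

# Part (b): CPI for each machine.
vax_cpi = cpi(5, 1)
rs6000_cpi = cpi(25, 18)
print(ratio, vax_cpi, round(rs6000_cpi, 2))
```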

2.13 CPI = 1.55; MIPS rate = 25.8; Execution time = 3.87 ms. Source: [HWAN93]

2.14 a. Ultimately, the user is concerned with the execution time of a system, not its
execution rate. If we take the arithmetic mean of the MIPS rates of various
benchmark programs, we get a result that is proportional to the sum of the
inverses of execution times. But this is not inversely proportional to the sum of
execution times. In other words, the arithmetic mean of the MIPS rate does not
cleanly relate to execution time. On the other hand, the harmonic mean MIPS
rate is the inverse of the average execution time.
b.
              Arithmetic mean   Harmonic mean   Rank
Computer A    25.3 MIPS         0.25 MIPS       2
Computer B    2.8 MIPS          0.21 MIPS       3
Computer C    3.25 MIPS         2.1 MIPS        1
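
The relationship described in part (a) can be illustrated with a small Python sketch. The per-program MIPS rates and the instruction count below are hypothetical values chosen for illustration (they are not given in this answer), and every benchmark program is assumed to execute the same number of instructions.

```python
from statistics import mean, harmonic_mean

# Hypothetical inputs: four benchmark programs of equal length.
instructions_per_program = 100e6
mips_rates = [100.0, 0.1, 0.2, 1.0]

# Execution time of each program in seconds: T = Ic / (MIPS * 10^6).
times = [instructions_per_program / (r * 1e6) for r in mips_rates]

arithmetic = mean(mips_rates)
harmonic = harmonic_mean(mips_rates)

# MIPS rate implied by the average execution time.
rate_from_average_time = instructions_per_program / (mean(times) * 1e6)

print(round(arithmetic, 1), round(harmonic, 2), round(rate_from_average_time, 2))
```

With these inputs the harmonic mean matches the rate derived from the average execution time, while the arithmetic mean is dominated by the single fastest program.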

CHAPTER 3

COMPUTER FUNCTION AND INTERCONNECTION

ANSWERS TO QUESTIONS
3.1 Processor-memory: Data may be transferred from processor to memory or from
memory to processor. Processor-I/O: Data may be transferred to or from a
peripheral device by transferring between the processor and an I/O module. Data
processing: The processor may perform some arithmetic or logic operation on data.
Control: An instruction may specify that the sequence of execution be altered.

3.2 Instruction address calculation (iac): Determine the address of the next instruction
to be executed. Instruction fetch (if): Read instruction from its memory location
into the processor. Instruction operation decoding (iod): Analyze instruction to
determine type of operation to be performed and operand(s) to be used. Operand
address calculation (oac): If the operation involves reference to an operand in
memory or available via I/O, then determine the address of the operand. Operand
fetch (of): Fetch the operand from memory or read it in from I/O. Data operation
(do): Perform the operation indicated in the instruction. Operand store (os): Write
the result into memory or out to I/O.

3.3 (1) Disable all interrupts while an interrupt is being processed. (2) Define priorities
for interrupts and allow an interrupt of higher priority to cause a lower-priority
interrupt handler to be interrupted.

3.4 Memory to processor: The processor reads an instruction or a unit of data from
memory. Processor to memory: The processor writes a unit of data to memory. I/O
to processor: The processor reads data from an I/O device via an I/O module.
Processor to I/O: The processor sends data to the I/O device. I/O to or from
memory: For these two cases, an I/O module is allowed to exchange data directly
with memory, without going through the processor, using direct memory access
(DMA).

3.5 With multiple buses, there are fewer devices per bus. This (1) reduces propagation
delay, because each bus can be shorter, and (2) reduces bottleneck effects.

3.6 System pins: Include the clock and reset pins. Address and data pins: Include 32
lines that are time multiplexed for addresses and data. Interface control pins:
Control the timing of transactions and provide coordination among initiators and
targets. Arbitration pins: Unlike the other PCI signal lines, these are not shared
lines. Rather, each PCI master has its own pair of arbitration lines that connect it
directly to the PCI bus arbiter. Error Reporting pins: Used to report parity and
other errors. Interrupt Pins: These are provided for PCI devices that must generate
requests for service. Cache support pins: These pins are needed to support a
memory on PCI that can be cached in the processor or another device. 64-bit Bus
extension pins: Include 32 lines that are time multiplexed for addresses and data
and that are combined with the mandatory address/data lines to form a 64-bit
address/data bus. JTAG/Boundary Scan Pins: These signal lines support testing
procedures defined in IEEE Standard 1149.1.

ANSWERS TO PROBLEMS
3.1 Memory (contents in hex): 300: 3005; 301: 5940; 302: 7006
Step 1: 3005 → IR; Step 2: 3 → AC
Step 3: 5940 → IR; Step 4: 3 + 2 = 5 → AC
Step 5: 7006 → IR; Step 6: AC → Device 6
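
The six steps above can be replayed with a minimal Python sketch of the fetch-execute loop. Several details are assumptions inferred from the steps rather than stated in this answer: a 16-bit word with a 4-bit opcode and a 12-bit address, opcode 3 meaning "load AC from I/O", opcode 5 "add memory to AC", opcode 7 "store AC to I/O", device 5 supplying the value 3, and location 940 holding 2.

```python
def run(memory, devices, pc, count):
    """Execute `count` instructions starting at address `pc`."""
    ac = 0
    for _ in range(count):
        ir = memory[pc]                      # fetch: instruction -> IR
        pc += 1
        opcode, addr = ir >> 12, ir & 0xFFF  # split opcode and address fields
        if opcode == 0x3:                    # assumed: load AC from I/O device
            ac = devices[addr]
        elif opcode == 0x5:                  # assumed: add memory word to AC
            ac += memory[addr]
        elif opcode == 0x7:                  # assumed: store AC to I/O device
            devices[addr] = ac
        else:
            raise ValueError(f"unknown opcode {opcode:X}")
    return ac, devices

memory = {0x300: 0x3005, 0x301: 0x5940, 0x302: 0x7006, 0x940: 0x0002}
devices = {5: 3, 6: None}   # device 5 supplies 3 (assumption), device 6 receives output
ac, devices = run(memory, devices, pc=0x300, count=3)
```

After the three instructions, the AC holds 5 and the same value has been sent to device 6.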

3.2 1. a. The PC contains 300, the address of the first instruction. This value is loaded
       into the MAR.
    b. The value in location 300 (which is the instruction with the value 1940 in
       hexadecimal) is loaded into the MBR, and the PC is incremented. These two
       steps can be done in parallel.
    c. The value in the MBR is loaded into the IR.
 2. a. The address portion of the IR (940) is loaded into the MAR.
    b. The value in location 940 is loaded into the MBR.
    c. The value in the MBR is loaded into the AC.
 3. a. The value in the PC (301) is loaded into the MAR.
    b. The value in location 301 (which is the instruction with the value 5941) is
       loaded into the MBR, and the PC is incremented.
    c. The value in the MBR is loaded into the IR.
 4. a. The address portion of the IR (941) is loaded into the MAR.
    b. The value in location 941 is loaded into the MBR.
    c. The old value of the AC and the value in the MBR are added and the
       result is stored in the AC.
 5. a. The value in the PC (302) is loaded into the MAR.
    b. The value in location 302 (which is the instruction with the value 2941) is
       loaded into the MBR, and the PC is incremented.
    c. The value in the MBR is loaded into the IR.
 6. a. The address portion of the IR (941) is loaded into the MAR.
    b. The value in the AC is loaded into the MBR.
    c. The value in the MBR is stored in location 941.
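
The register transfers listed above can be traced with a short Python sketch. The opcode meanings (1 = load AC from memory, 5 = add memory to AC, 2 = store AC to memory) are inferred from the steps, and the initial contents of locations 940 and 941 (3 and 2) are hypothetical values chosen for illustration.

```python
def step(state, memory):
    """Run one instruction cycle using explicit MAR, MBR, and IR transfers."""
    # Fetch cycle
    state["MAR"] = state["PC"]               # PC -> MAR
    state["MBR"] = memory[state["MAR"]]      # memory -> MBR
    state["PC"] += 1                         # PC incremented
    state["IR"] = state["MBR"]               # MBR -> IR
    # Execute cycle
    opcode = state["IR"] >> 12
    state["MAR"] = state["IR"] & 0xFFF       # address portion of IR -> MAR
    if opcode == 0x1:                        # load AC from memory
        state["MBR"] = memory[state["MAR"]]
        state["AC"] = state["MBR"]
    elif opcode == 0x5:                      # add memory to AC
        state["MBR"] = memory[state["MAR"]]
        state["AC"] += state["MBR"]
    elif opcode == 0x2:                      # store AC to memory
        state["MBR"] = state["AC"]
        memory[state["MAR"]] = state["MBR"]
    else:
        raise ValueError(f"unknown opcode {opcode:X}")

memory = {0x300: 0x1940, 0x301: 0x5941, 0x302: 0x2941,
          0x940: 3, 0x941: 2}                # data values are assumptions
state = {"PC": 0x300, "AC": 0, "MAR": 0, "MBR": 0, "IR": 0}
for _ in range(3):
    step(state, memory)
```

With these sample data values, the AC ends at 5 and location 941 is overwritten with the sum.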

3.3 a. 2^24 = 16 MBytes
b. (1) If the local address bus is 32 bits, the whole address can be transferred at
once and decoded in memory. However, because the data bus is only 16 bits, it
will require 2 cycles to fetch a 32-bit instruction or operand.
(2) The 16 bits of the address placed on the address bus can't access the whole
memory. Thus a more complex memory interface control is needed to latch the
first part of the address and then the second part (because the microprocessor
will send the address in two steps). For a 32-bit address, one may assume the first
half will decode to access a "row" in memory, while the second half is sent later to
access a "column" in memory. In addition to the two-step address operation, the
microprocessor will need 2 cycles to fetch the 32-bit instruction/operand.
c. The program counter must be at least 24 bits. Typically, a 32-bit microprocessor
will have a 32-bit external address bus and a 32-bit program counter, unless on-
chip segment registers are used that may work with a smaller program counter.
If the instruction register is to contain the whole instruction, it will have to be