Module 3 (Ddco)
BASIC CONCEPTS
• Computer Architecture (CA) is concerned with the structure and behaviour of the computer.
• CA includes the information formats, the instruction set and techniques for addressing memory.
• In general, CA covers 3 aspects of computer design, namely:
1) Computer Hardware,
2) Instruction Set Architecture and
3) Computer Organization.
1. Computer Hardware
It consists of electronic circuits, displays, magnetic and optical storage media and
communication facilities.
2. Instruction Set Architecture
It is the programmer-visible machine interface, such as the instruction set, registers, memory
organization and exception handling.
Two main approaches are 1) CISC and 2) RISC.
(CISC-Complex Instruction Set Computer, RISC-Reduced Instruction Set Computer)
3. Computer Organization
It includes the high level aspects of a design, such as
→ memory-system
→ bus-structure &
→ design of the internal CPU.
It refers to the operational units and their interconnections that realize the architectural
specifications.
It describes the function and design of the various units of a digital computer that store and process
information.
FUNCTIONAL UNITS
• A computer consists of 5 functionally independent main parts:
1) Input
2) Memory
3) ALU
4) Output &
5) Control units.
The input unit accepts coded information from human operators using devices such as keyboards, or
from other computers over digital communication lines. The information received is stored in the
computer's memory, either for later use or to be processed immediately by the arithmetic and logic
unit. The processing steps are specified by a program that is also stored in the memory. Finally, the
results are sent back to the outside world through the output unit. All of these actions are coordinated
by the control unit. An interconnection network provides the means for the functional units to exchange
information and coordinate their actions. The arithmetic and logic circuits, in conjunction with the
main control circuits, constitute the processor. Input and output equipment is often collectively referred to as
the input-output (I/O) unit. A program is a list of instructions which performs a task. Programs are
stored in the memory. The processor fetches the program instructions from the memory, one after
another, and performs the desired operations. The computer is controlled by the stored program, except
for possible external interruption by an operator or by I/O devices connected to it. Data are numbers
and characters that are used as operands by the instructions. Data are also stored in the memory. The
instructions and data handled by a computer must be encoded in a suitable format. Each instruction,
number, or character is encoded as a string of binary digits called bits, each having one of two possible
values, 0 or 1, represented by the two stable states.
Input Unit: Computers accept coded information through input units. The most common input device
is the keyboard. Whenever a key is pressed, the corresponding letter or digit is automatically translated
into its corresponding binary code and transmitted to the processor. Many other kinds of input devices
for human-computer interaction are available, including the touchpad, mouse, joystick, and trackball.
These are often used as graphic input devices in conjunction with displays. Microphones can be used
to capture audio input which is then sampled and converted into digital codes for storage and
processing. Similarly, cameras can be used to capture video input. Digital communication facilities,
such as the Internet, can also provide input to a computer from other computers and database servers.
Memory Unit
The function of the memory unit is to store programs and data. There are two classes of storage, called
primary and secondary.
Primary Memory
Primary memory, also called main memory, is a fast memory that operates at electronic speeds.
Programs must be stored in this memory while they are being executed. The memory consists of a
large number of semiconductor storage cells, each capable of storing one bit of information. These
cells are rarely read or written individually. Instead, they are handled in groups of fixed size called
words. The memory is organized so that one word can be stored or retrieved in one basic operation.
The number of bits in each word is referred to as the word length of the computer, typically 16, 32, or
64 bits. To provide easy access to any word in the memory, a distinct address is associated with each
word location. Addresses are consecutive numbers, starting from 0, that identify successive locations.
Instructions and data can be written into or read from the memory under the control of the processor.
A memory in which any location can be accessed in a short and fixed amount of time after specifying
its address is called a random-access memory (RAM). The time required to access one word is called
the memory access time. This time is independent of the location of the word being accessed. It
typically ranges from a few nanoseconds (ns) to about 100 ns for current RAM units.
Cache Memory:
As an adjunct to the main memory, a smaller, faster RAM unit, called a cache, is used to hold sections
of a program that are currently being executed, along with any associated data. The cache is tightly
coupled with the processor and is usually contained on the same integrated-circuit chip. The purpose
of the cache is to facilitate high instruction execution rates. At the start of program execution, the cache
is empty. As execution proceeds, instructions are fetched into the processor chip, and a copy of each
is placed in the cache. When the execution of an instruction requires data, located in the main memory,
the data are fetched and copies are also placed in the cache. Now suppose that a number of
instructions are executed repeatedly, as happens in a program loop. If these instructions are available
in the cache, they can be fetched quickly during the period of repeated use.
Secondary Storage
Although primary memory is essential, it tends to be expensive and does not retain information when
power is turned off. Thus additional, less expensive, permanent secondary storage is used when large
amounts of data and many programs have to be stored, particularly for information that is accessed
infrequently. Access times for secondary storage are longer than for primary memory. The devices
available include magnetic disks, optical disks (DVD and CD), and flash memory devices.
Output Unit
The function of the output unit is to send processed results to the outside world. A familiar example of such a
device is a printer. Most printers employ either photocopying techniques, as in laser printers, or ink jet
streams. Such printers may generate output at speeds of 20 or more pages per minute. However,
printers are mechanical devices, and as such are quite slow compared to the electronic speed of a
processor. Some units, such as graphic displays, provide both an output function, showing text and
graphics, and an input function, through touchscreen capability. The dual role of such units is the
reason for using the single name input/output (I/O) unit in many cases.
Control Unit
The memory, arithmetic and logic, and I/O units store and process information and perform input and
output operations. The operation of these units must be coordinated in some way. This is the
responsibility of the control unit. The control unit is effectively the nerve center that sends control
signals to other units and senses their states. I/O transfers, consisting of input and output operations,
are controlled by program instructions that identify the devices involved and the information to be
transferred. Control circuits are responsible for generating the timing signals that govern the transfers.
They determine when a given action is to take place. Data transfers between the processor and the
memory are also managed by the control unit through timing signals. A large set of control lines (wires)
carries the signals used for timing and synchronization of events in all units.
BUS STRUCTURE
• A bus is a group of lines that serves as a connecting path for several devices.
• A bus may be lines or wires.
• The lines carry data, address or control signals.
• There are 2 types of Bus structures: 1) Single Bus Structure and 2) Multiple Bus Structure.
1) Single Bus Structure
Because the bus can be used for only one transfer at a time, only 2 units can actively use the bus at any
given time.
Bus control lines are used to arbitrate multiple requests for use of the bus.
Advantages:
1) Low cost &
2) Flexibility for attaching peripheral devices.
2) Multiple Bus Structure
PERFORMANCE
• The most important measure of performance of a computer is how quickly it can execute programs.
• The speed of a computer is affected by the design of
1) Instruction-set.
2) Hardware & the technology in which the hardware is implemented.
3) Software including the operating system.
• Because programs are usually written in a HLL, performance is also affected by the compiler that
translates programs into machine language. (HLL High Level Language).
• For best performance, it is necessary to design the compiler, machine instruction set and hardware in
a co-ordinated way.
• Let us examine the flow of program instructions and data between the memory & the processor.
• At the start of execution, all program instructions are stored in the main-memory.
• As execution proceeds, instructions are fetched into the processor, and a copy is placed in the cache.
• Later, if the same instruction is needed a second time, it is read directly from the cache.
• A program will be executed faster
if movement of instruction/data between the main-memory and the processor is minimized
which is achieved by using the cache.
PROCESSOR CLOCK
• Processor circuits are controlled by a timing signal called a Clock.
• The clock defines regular time intervals called Clock Cycles.
• To execute a machine instruction, the processor divides the action to be performed into a sequence
of basic steps such that each step can be completed in one clock cycle.
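• Let P = Length of one clock cycle
R = Clock rate.
• Relation between P and R is given by
$R = \dfrac{1}{P}$
• Let T = Processor time required to execute a program,
N = Number of machine instructions executed,
S = Average number of basic steps needed to execute one machine instruction.
• The program execution time is given by
$T = \dfrac{N \times S}{R}$ ------(1)
• Equ1 is referred to as the basic performance equation.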
• To achieve high performance, the computer designer must reduce the value of T, which means
reducing N and S, and increasing R.
The value of N is reduced if source program is compiled into fewer machine instructions.
The value of S is reduced if instructions have a smaller number of basic steps to perform.
The value of R can be increased by using a higher frequency clock.
• Care has to be taken while modifying values since changes in one parameter may affect the other.
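As a quick numeric check, the sketch below evaluates the basic performance equation in its usual form, T = (N × S) / R. The function name and the sample figures are illustrative assumptions, not measurements from any real machine.

```python
def execution_time(n_instructions: float, steps_per_instruction: float,
                   clock_rate_hz: float) -> float:
    """Return T = (N * S) / R in seconds."""
    return n_instructions * steps_per_instruction / clock_rate_hz


# Hypothetical program: 1 million instructions, 4 basic steps each, 500 MHz clock.
baseline = execution_time(1_000_000, 4, 500e6)

# Doubling the clock rate R halves T only if N and S stay the same.
faster_clock = execution_time(1_000_000, 4, 1000e6)

print(baseline, faster_clock)
```

The second call shows why the parameters cannot be tuned in isolation: the halving holds only under the assumption that a faster clock does not increase S.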
CLOCK RATE
• There are 2 possibilities for increasing the clock rate R:
1) Improving the IC technology makes logic-circuits faster.
This reduces the time needed to compute a basic step. (IC integrated circuits).
This allows the clock period P to be reduced and the clock rate R to be increased.
2) Reducing the amount of processing done in one basic step also reduces the clock period P.
• In presence of a cache, the percentage of accesses to the main-memory is small.
Hence, much of performance-gain expected from the use of faster technology can be realized.
The value of T will be reduced by the same factor as R is increased, because S & N are not affected.
PERFORMANCE MEASUREMENT
• Benchmark refers to standard task used to measure how well a processor operates.
• The Performance Measure is the time taken by a computer to execute a given benchmark.
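• SPEC selects & publishes the standard programs along with their test results for different application
domains. (SPEC => Standard Performance Evaluation Corporation).
• SPEC Rating is given by
$\text{SPEC rating} = \dfrac{\text{Running time on the reference computer}}{\text{Running time on the computer under test}}$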
• SPEC rating = 50 -> The computer under test is 50 times as fast as reference-computer.
• The test is repeated for all the programs in the SPEC suite.
Then, the geometric mean of the results is computed.
• Let SPECi = Rating for program "i" in the suite.
Overall SPEC rating for the computer is given by
$\text{SPEC rating} = \left( \prod_{i=1}^{n} \text{SPEC}_i \right)^{1/n}$
where n = no. of programs in the suite
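The rating calculation can be sketched as follows, assuming the rating of one program is the reference running time divided by the running time on the computer under test, and the overall rating is the geometric mean of the per-program ratings. The function names and the running times are made up for illustration.

```python
import math
from typing import Sequence


def spec_rating(reference_time: float, test_time: float) -> float:
    """Rating of one benchmark program: reference time / time under test."""
    return reference_time / test_time


def overall_spec_rating(ratings: Sequence[float]) -> float:
    """Geometric mean of the per-program ratings."""
    return math.prod(ratings) ** (1.0 / len(ratings))


# Hypothetical (reference seconds, test seconds) pairs for three programs.
timings = [(100.0, 2.0), (80.0, 4.0), (90.0, 9.0)]
ratings = [spec_rating(ref, test) for ref, test in timings]  # 50, 20, 10
print(ratings, overall_spec_rating(ratings))
```

The geometric mean is used rather than the arithmetic mean so that one unusually high rating does not dominate the overall figure.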
Problem 1:
List the steps needed to execute the machine instruction:
Add LOCA, R0
in terms of transfers between the components of processor and some simple control commands.
Assume that the address of the memory-location containing this instruction is initially in register PC.
Solution:
1. Transfer the contents of register PC to register MAR.
2. Issue a Read command to memory,
and then wait until it has transferred the requested word into register MDR.
3. Transfer the instruction from MDR into IR and decode it.
4. Transfer the address LOCA from IR to MAR.
5. Issue a Read command and wait until MDR is loaded.
6. Transfer contents of MDR to the ALU.
7. Transfer contents of R0 to the ALU.
8. Perform addition of the two operands in the ALU and transfer result into R0.
9. Transfer contents of PC to ALU.
10. Add 1 to operand in ALU and transfer incremented address to PC.
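The ten steps in the solution can be traced with a small simulation. This is only a sketch under stated assumptions: memory is a Python dict indexed by word address, an instruction is stored as a tuple (operation, operand address, register), and the addresses 0 and 100 are arbitrary.

```python
def execute_add(memory: dict, regs: dict) -> None:
    """Trace the register transfers for one 'Add <address>, <register>' instruction."""
    regs["MAR"] = regs["PC"]                    # step 1: PC -> MAR
    regs["MDR"] = memory[regs["MAR"]]           # step 2: Read, wait for MDR
    regs["IR"] = regs["MDR"]                    # step 3: MDR -> IR, then decode
    operation, operand_address, register = regs["IR"]
    if operation != "Add":
        raise ValueError(f"unsupported operation: {operation}")
    regs["MAR"] = operand_address               # step 4: address field of IR -> MAR
    regs["MDR"] = memory[regs["MAR"]]           # step 5: Read, wait for MDR
    # steps 6-8: both operands go to the ALU, the sum returns to the register
    regs[register] = regs["MDR"] + regs[register]
    # steps 9-10: PC goes through the ALU and is incremented
    regs["PC"] = regs["PC"] + 1


memory = {0: ("Add", 100, "R0"), 100: 7}        # instruction at 0, operand at 100
regs = {"PC": 0, "MAR": 0, "MDR": 0, "IR": None, "R0": 5}
execute_add(memory, regs)
print(regs["R0"], regs["PC"])  # 12 1
```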
BYTE-ADDRESSABILITY
• In byte-addressable memory, successive addresses refer to successive byte locations in the memory.
• Byte locations have addresses 0, 1, 2, ...
• If the word-length is 32 bits, successive words are located at addresses 0, 4, 8, ... with each word
having 4 bytes.
• There are two ways in which byte-addresses are arranged (Figure 2.3).
1) Big-Endian: Lower byte-addresses are used for the more significant bytes of the word.
2) Little-Endian: Lower byte-addresses are used for the less significant bytes of the word.
• In both cases, byte-addresses 0, 4, 8, ... are taken as the addresses of successive words in the
memory.
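The two byte orderings can be seen with Python's built-in `int.to_bytes`. The helper name `byte_layout` and the sample word 0x12345678 are illustrative assumptions; list index 0 stands for the lowest byte address of the word.

```python
def byte_layout(word: int, byteorder: str, word_bytes: int = 4) -> list:
    """Return the bytes of a word in increasing byte-address order."""
    return list(word.to_bytes(word_bytes, byteorder=byteorder))


word = 0x12345678
big = byte_layout(word, "big")        # most significant byte at the lowest address
little = byte_layout(word, "little")  # least significant byte at the lowest address

print([hex(b) for b in big])     # ['0x12', '0x34', '0x56', '0x78']
print([hex(b) for b in little])  # ['0x78', '0x56', '0x34', '0x12']
```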
WORD ALIGNMENT
• Words are said to be Aligned in memory if they begin at a byte-address that is a multiple of the
number of bytes in a word.
• For example,
If the word length is 16 bits (2 bytes), aligned words begin at byte-addresses 0, 2, 4, ...
If the word length is 64 bits (8 bytes), aligned words begin at byte-addresses 0, 8, 16, ...
• Words are said to have Unaligned Addresses, if they begin at an arbitrary byte-address.
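The alignment rule reduces to a remainder test. A minimal sketch, where the function name `is_aligned` is an assumption made for illustration:

```python
def is_aligned(byte_address: int, word_length_bits: int) -> bool:
    """True if the address is a multiple of the number of bytes in a word."""
    bytes_per_word = word_length_bits // 8
    return byte_address % bytes_per_word == 0


print(is_aligned(6, 16))   # True: 6 is a multiple of 2
print(is_aligned(4, 64))   # False: 4 is not a multiple of 8
print(is_aligned(16, 64))  # True: 16 is a multiple of 8
```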
• The Load operation transfers a copy of the contents of a specific memory-location to the processor.
The memory contents remain unchanged.
• Steps for Load operation:
1) Processor sends the address of the desired location to the memory.
2) Processor issues a 'read' signal to memory to fetch the data.
3) Memory reads the data stored at that address.
4) Memory sends the read data to the processor.
• The Store operation transfers the information from the register to the specified memory-location.
This will destroy the original contents of that memory-location.
• Steps for Store operation are:
1) Processor sends the address of the memory-location where it wants to store data.
2) Processor issues a 'write' signal to memory to store the data.
3) Content of the register (MDR) is written into the specified memory-location.
BASIC INSTRUCTIONS:-
The operation of adding two numbers is a fundamental capability in any computer. The statement
C=A+B
in a high-level language program is a command to the computer to add the current values of the two
variables called A and B, and to assign the sum to a third variable, C. When the program containing
this statement is compiled, the three variables, A, B, and C, are assigned to distinct locations in the
memory. We will use the variable names to refer to the corresponding memory location addresses. The
contents of these locations represent the values of the three variables. Hence, the above high-level
language statement requires the action
C <= [A] + [B]
To carry out this action, the contents of memory locations A and B are fetched from the memory and
transferred into the processor where their sum is computed. This result is then sent back to the memory
and stored in location C.
Let us first assume that this action is to be accomplished by a single machine instruction. Furthermore,
assume that this instruction contains the memory addresses of the three operands – A, B, and C. This
three-address instruction can be represented symbolically as
Add A, B, C
Operands A and B are called the source operands, C is called the destination operand, and Add is the
operation to be performed on the operands. A general instruction of this type has the format.
Operation Source1, Source 2, Destination
If k bits are needed to specify the memory address of each operand, the encoded form of the
above instruction must contain 3k bits for addressing purposes in addition to the bits needed
to denote the Add operation.
An alternative approach is to use a sequence of simpler instructions to perform the same
task, with each instruction having only one or two operands. Suppose that two- address
instructions of the form
Operation Source, Destination
are available. An Add instruction of this type is
Add A, B
which performs the operation B <= [A] + [B].
A single two-address instruction cannot be used to solve our original problem, which
is to add the contents of locations A and B, without destroying either of them,
and to place the sum in location C. The problem can be solved by using another
two-address instruction that copies the contents of one memory location into
another. Such an instruction is
Move B, C
which performs the operation C <= [B], leaving the contents of location B unchanged.
The operation C <= [A] + [B] can then be performed by the two-instruction sequence
Move B, C
Add A, C
Using only one-address instructions, the operation C <= [A] + [B] can
be performed by executing the sequence of instructions
Load A
Add B
Store C
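One-address instructions name a single memory operand; the other operand is implicitly a processor register, usually called the accumulator. The sketch below interprets the Load/Add/Store sequence on a toy model in which memory is a dict keyed by variable name. The interpreter and its names are assumptions for illustration only.

```python
def run_accumulator(program, memory):
    """Execute one-address instructions against a single accumulator."""
    accumulator = 0
    for operation, address in program:
        if operation == "Load":        # accumulator <= [address]
            accumulator = memory[address]
        elif operation == "Add":       # accumulator <= accumulator + [address]
            accumulator += memory[address]
        elif operation == "Store":     # address <= accumulator
            memory[address] = accumulator
        else:
            raise ValueError(f"unknown operation: {operation}")
    return memory


memory = {"A": 3, "B": 4, "C": 0}
run_accumulator([("Load", "A"), ("Add", "B"), ("Store", "C")], memory)
print(memory)  # {'A': 3, 'B': 4, 'C': 7}
```

Note that A and B keep their original values; only C is overwritten.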
Some early computers were designed around a single accumulator
structure. Most modern computers have a number of general-purpose processor registers
– typically 8 to 32, and even considerably more in some cases. Access to data in
these registers is much faster than to data stored in memory locations because the
registers are inside the processor.
Let Ri represent a general-purpose register. The instructions
Load A, Ri
Store Ri, A
and
Add A, Ri
are generalizations of the Load, Store, and Add instructions for the
single-accumulator case, in which register Ri performs the function of the accumulator.
When a processor has several general-purpose registers, many instructions
involve only operands that are in the registers. In fact, in many modern processors,
computations can be performed directly only on data held in processor
registers. Instructions such as
Add Ri, Rj
or
Add Ri, Rj, Rk
are of this type. In both of these instructions, the source operands are the contents of registers
Ri and Rj. In the first instruction, Rj also serves as the destination register, whereas in the
second instruction, a third register, Rk, is used as the destination.
It is often necessary to transfer data between different locations. This is achieved with the
instruction
Move Source, Destination
When data are moved to or from a processor register, the Move instruction
can be used rather than the Load or Store instructions because the order of the source
and destination operands determines which operation is intended.
In processors where arithmetic operations are allowed only on operands that are processor
registers, the C = A + B task can be performed by the instruction sequence
Move A, Ri
Move B, Rj
Add Ri, Rj
Move Rj, C
In processors where one operand may be in the memory but the other must be
in register, an instruction sequence for the required task would be
Move A, Ri
Add B, Ri
Move Ri, C
The speed with which a given task is carried out depends on the time it takes to
transfer instructions from memory into the processor and to access the operands
referenced by these instructions. Transfers that involve the memory are much slower
than transfers within the processor.
We have discussed three-, two-, and one-address instructions. It is also possible
to use instructions in which the locations of all operands are defined implicitly. Such
instructions are found in machines that store operands in a structure called a pushdown
stack. In this case, the instructions are called zero-address instructions.
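A zero-address Add can be sketched on a toy stack machine: the operands are implicitly the top two stack entries and the result replaces them. The Push and Pop instructions used here to move values between memory and the stack still name a memory location; they, and the interpreter itself, are hypothetical names chosen for illustration.

```python
def run_stack_machine(program, memory):
    """Execute a program on a pushdown stack; Add takes no operand addresses."""
    stack = []
    for instruction in program:
        operation = instruction[0]
        if operation == "Push":      # push [address] onto the stack
            stack.append(memory[instruction[1]])
        elif operation == "Add":     # zero-address: operands are implicit
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif operation == "Pop":     # pop the top of the stack into address
            memory[instruction[1]] = stack.pop()
        else:
            raise ValueError(f"unknown operation: {operation}")
    return memory


memory = {"A": 3, "B": 4, "C": 0}
run_stack_machine([("Push", "A"), ("Push", "B"), ("Add",), ("Pop", "C")], memory)
print(memory["C"])  # 7
```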
BRANCHING:-
Consider the task of adding a list of n numbers. Instead of using a long list of add
instructions, it is possible to place a single add instruction in a program loop, as shown in
fig b. The loop is a straight-line sequence of instructions executed as many times
as needed. It starts at location LOOP and ends at the instruction Branch > 0. During
each pass through this loop, the address of the next list entry is determined, and that entry
is fetched and added to R0.
Fig b Using a loop to add n numbers
Assume that the number of entries in the list, n, is stored in memory location
N, as shown. Register R1 is used as a counter to determine the number of time the
loop is executed. Hence, the contents of location N are loaded into register R1 at the
beginning of the program. Then, within the body of the loop, the instruction
Decrement R1
reduces the contents of R1 by 1 each time through the loop.
A branch instruction loads a new value into the program counter. As a
result, the processor fetches and executes the instruction at this new address, called
the branch target, instead of the instruction at the location that follows the branch
instruction in sequential address order. A conditional branch instruction causes a branch
only if a specified condition is satisfied. If the condition is not satisfied, the PC is
incremented in the normal way, and the next instruction in sequential address order is
fetched and executed. The instruction
Branch > 0 LOOP
is a conditional branch instruction that causes a branch to location LOOP if the result of the
immediately preceding instruction, which is the decremented value in register R1, is greater than zero.
CONDITION CODES:-
The processor keeps track of information about the results of various operations
for use by subsequent conditional branch instructions. This is accomplished by recording
the required information in individual bits, often called condition code flags. These flags
are usually grouped together in a special processor register called the condition code
register or status register. Individual condition code flags are set to 1 or cleared to 0,
depending on the outcome of the operation performed.
Four commonly used flags are
N(negative) Set to 1 if the result is negative; otherwise, cleared to 0
Z(zero) Set to 1 if the result is 0; otherwise, cleared to 0
V(overflow) Set to 1 if arithmetic overflow occurs; otherwise, cleared to 0
C(carry) Set to 1 if a carry-out results from the operation; otherwise, cleared to 0
The instruction Branch > 0, discussed in the previous section, is an example of
a branch instruction that tests one or more of the condition flags. It causes a branch if
the value tested is neither negative nor equal to zero. That is, the branch is taken if neither
N nor Z is 1. The conditions are given as logic expressions involving the condition
code flags.
In some computers, the condition code flags are affected automatically by
instructions that perform arithmetic or logic operations. However, this is not
always the case. A number of computers have two versions of an Add instruction: one that
sets the condition code flags and one that leaves them unchanged.
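The four flags can be computed for a fixed-width addition as in the sketch below. It assumes 8-bit two's-complement operands by default; the function names are illustrative and do not correspond to any particular processor.

```python
def add_with_flags(a: int, b: int, bits: int = 8):
    """Add two fixed-width values and return (result, N/Z/V/C flags)."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    total = a + b
    result = total & mask
    flags = {
        "N": int(result & sign_bit != 0),   # sign bit of the result is 1
        "Z": int(result == 0),              # result is zero
        # overflow: operands share a sign that differs from the result's sign
        "V": int((~(a ^ b) & (a ^ result) & sign_bit) != 0),
        "C": int(total > mask),             # carry out of the top bit
    }
    return result, flags


def branch_greater_than_zero(flags: dict) -> bool:
    """Branch > 0 is taken when neither N nor Z is 1."""
    return flags["N"] == 0 and flags["Z"] == 0


print(add_with_flags(0x7F, 0x01))  # signed overflow: N=1, V=1
print(add_with_flags(0xFF, 0x01))  # wraps to zero: Z=1, C=1
```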
GENERATING MEMORY ADDRESSES:-
Let us return to fig b. The purpose of the instruction block at LOOP is to add a
different number from the list during each pass through the loop. Hence, the Add
instruction in the block must refer to a different address during each pass. How are
the addresses to be specified? The memory operand address cannot be given directly
in a single Add instruction in the loop. Otherwise, it would need to be modified on each
pass through the loop.
The instruction set of a computer typically provides a number of methods for specifying where an operand is located,
called addressing modes. While the details differ from one computer to another, the
underlying concepts are the same.
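Register mode - The operand is the contents of a processor register; the name
(address) of the register is given in the instruction.
Absolute mode - The operand is in a memory location; the address of this location is
given explicitly in the instruction. Variables declared in a high-level language, as in
Integer A, B;
are typically accessed this way.
Immediate mode - The operand is given explicitly in the instruction. For example, the
instruction
Move 200(immediate), R0
places the value 200 in register R0. Clearly, the Immediate mode is only
used to specify the value of a source operand. Using a subscript to denote the
Immediate mode is not appropriate in assembly languages. A common
convention is to use the sharp sign (#) in front of the value to indicate that this
value is to be used as an immediate operand. Hence, we write the instruction
above in the form
Move #200, R0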
INDIRECTION AND POINTERS:-
In the addressing modes that follow, the instruction does not give the
operand or its address explicitly. Instead, it provides information from which the
memory address of the operand can be determined. We refer to this address as
the effective address (EA) of the operand.
To execute the Add instruction in fig (a), the processor uses the value which
is in register R1, as the effective address of the operand. It requests a read operation
from the memory to read the contents of location B. The value read is the
desired operand, which the processor adds to the contents of register R0. Indirect
addressing through a memory location is also possible as shown in fig (b). In
this case, the processor first reads the contents of memory location A, then
requests a second read operation using the value B as an address to obtain the
operand.
Fig (b) Indirect addressing through a memory location
     Move N, R1
     Move #NUM, R2
     Clear R0
LOOP Add (R2), R0
     Add #4, R2
     Decrement R1
     Branch > 0 LOOP
     Move R0, SUM
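Index mode - The effective address of the operand is generated by adding a constant
value X to the contents of a register Rj:
EA = X + [Rj]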
The contents of the index register are not changed in the process of
generating the effective address. In an assembly language program, the constant
X may be given either as an explicit number or as a symbolic name representing a
numerical value.
Figs a and b illustrate two ways of using the Index mode. In fig a, the index
register, R1, contains the address of a memory location, and the value X
defines an offset (also called a displacement) from this address to the location where
the operand is found. An alternative use is illustrated in fig b. Here, the constant X
corresponds to a memory address, and the contents of the index register define
the offset to the operand. In either case, the effective address is the sum of two
values; one is given explicitly in the instruction, and the other is stored in a
register.
Fig (a) Offset is given as a constant
The program is to compute the sum of all scores obtained on each of the tests
and to store the result in SUM1, SUM2 and SUM3.
Register R0 is the index register. R0 is set to point to the ID location of the first
student. Hence it contains the address LIST.
On the first pass through the loop, the test scores of the first student are added
to the running sums of registers R1, R2 and R3, which are initially cleared to 0.
These scores are accessed through the Index addressing mode 4(R0), 8(R0) and
12(R0).
The index register is incremented by 16 to point to ID location of second student.
Register R4, initialized with n, is decremented by 1 at the end of each pass of the
loop. When the contents of R4 reach 0, all student records have been accessed
and the loop terminates.
The last three instructions transfer the accumulated sums from registers R1, R2
and R3 to memory locations SUM1, SUM2 and SUM3.
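The loop described above can be mirrored in a short simulation. It assumes the record layout implied by the text: 16 bytes per student, with the ID at offset 0 and the three test scores at offsets 4, 8 and 12. Memory is modelled as a dict keyed by byte address, and the starting address 1000 for LIST and the scores are arbitrary values chosen for illustration.

```python
def sum_test_scores(memory: dict, list_address: int, n: int):
    """Sum each test column using index-mode effective addresses X + [R0]."""
    r0 = list_address          # index register: points at the current ID
    r1 = r2 = r3 = 0           # running sums, cleared to 0
    r4 = n                     # loop counter
    while True:
        r1 += memory[4 + r0]   # Add 4(R0), R1
        r2 += memory[8 + r0]   # Add 8(R0), R2
        r3 += memory[12 + r0]  # Add 12(R0), R3
        r0 += 16               # advance to the next student record
        r4 -= 1                # Decrement R4
        if not r4 > 0:         # Branch > 0 falls through when R4 reaches 0
            break
    return r1, r2, r3


LIST = 1000  # hypothetical address of the first record
records = [(101, 70, 80, 90), (102, 60, 50, 40)]
memory = {}
for i, record in enumerate(records):
    for j, value in enumerate(record):
        memory[LIST + 16 * i + 4 * j] = value

print(sum_test_scores(memory, LIST, len(records)))  # (130, 130, 130)
```

Like the assembly loop, the body runs at least once, so the sketch assumes n is at least 1.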
We have introduced the most basic form of indexed addressing. Several variations of this basic form
provide a very efficient access to memory operands in practical programming
situations. For example, a second register may be used to contain the offset X, in
which case we can write the Index mode as
(Ri, Rj)
The effective address is the sum of the contents of registers Ri and Rj. The
second register is usually called the base register. This form of indexed
addressing provides more flexibility in accessing operands, because both components of
the effective address can be changed.
Another version of the Index mode uses two registers plus a constant, which can be
denoted as
X(Ri, Rj)
In this case, the effective address is the sum of the constant X and the contents of registers
Ri and Rj. This added flexibility is useful in accessing multiple components inside
each item in a record, where the beginning of an item is specified by the (Ri, Rj) part
of the addressing mode. In other words, this mode implements a three-dimensional array.
RELATIVE ADDRESSING:-
We have defined the Index mode using general-purpose processor registers. A
useful version of this mode is obtained if the program counter, PC, is used instead of
a general purpose register. Then, X(PC) can be used to address a memory location that is
X bytes away from the location presently pointed to by the program counter.
Relative mode – The effective address is determined by the Index mode using
the program counter in place of the general-purpose register Ri.
This mode can be used to access data operands. But, its most common use is to specify
the target address in branch instructions. An instruction such as
Branch > 0 LOOP
causes program execution to go to the branch target location identified by the name
LOOP if the branch condition is satisfied. This location can be computed by
specifying it as an offset from the current value of the program counter. Since the branch
target may be either before or after the branch instruction, the offset is given as a signed
number.
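The signed offset can be worked out as in the sketch below. It assumes 4-byte instructions and that the program counter already points to the instruction after the branch when the offset is added; both are assumptions, as are the addresses 1000 and 1012.

```python
def branch_offset(branch_address: int, target_address: int,
                  instruction_bytes: int = 4) -> int:
    """Signed offset X such that target = updated PC + X."""
    updated_pc = branch_address + instruction_bytes
    return target_address - updated_pc


def branch_target(branch_address: int, offset: int,
                  instruction_bytes: int = 4) -> int:
    """Effective address of a relative branch: X + [PC]."""
    return branch_address + instruction_bytes + offset


# Hypothetical layout: LOOP at 1000, the Branch > 0 instruction at 1012.
offset = branch_offset(1012, 1000)
print(offset)                       # -16: the target lies before the branch
print(branch_target(1012, offset))  # 1000
```

A backward branch such as this one yields a negative offset, which is why the offset must be stored as a signed number.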
Autoincrement mode – the effective address of the operand is the contents of a register
specified in the instruction. After accessing the operand, the contents of this register are
automatically incremented to point to the next item in a list.
(Ri)+
Autodecrement mode – the contents of a register specified in the instruction are
first automatically decremented and are then used as the effective address of the operand.
-(Ri)
     Move N, R1
     Move #NUM1, R2
     Clear R0
LOOP Add (R2)+, R0
     Decrement R1
     Branch>0 LOOP
     Move R0, SUM
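The program above can be traced with a small simulation of the Autoincrement mode. It assumes 4-byte list entries, a dict keyed by byte address as memory, and an arbitrary starting address of 2000 for NUM1; the function name is hypothetical.

```python
def sum_list_autoincrement(memory: dict, num1_address: int, n: int,
                           word_bytes: int = 4):
    """Add n list entries, stepping the pointer as (R2)+ would."""
    r1 = n                  # Move N, R1
    r2 = num1_address       # Move #NUM1, R2
    r0 = 0                  # Clear R0
    while True:
        r0 += memory[r2]    # Add (R2)+, R0: operand at EA = [R2] ...
        r2 += word_bytes    # ... then R2 is incremented automatically
        r1 -= 1             # Decrement R1
        if not r1 > 0:      # Branch>0 LOOP
            break
    return r0, r2           # R0 holds the sum that Move R0, SUM would store


memory = {2000: 5, 2004: 7, 2008: -2}
total, final_pointer = sum_list_autoincrement(memory, 2000, 3)
print(total, final_pointer)  # 10 2012
```

Compared with the earlier loop that uses Add #4, R2, the pointer update is folded into the operand access, so the loop body is one instruction shorter.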