Microprocessor BCA Complete
Introduction to Microprocessor
A microprocessor is the controlling unit of a microcomputer, fabricated on a small chip,
capable of performing ALU (Arithmetic Logic Unit) operations and communicating with
the other devices connected to it.
A microprocessor consists of an ALU, a register array, and a control unit. The ALU performs
arithmetic and logical operations on the data received from the memory or an input
device. The register array consists of registers identified by letters such as B, C, D, E, H and L,
and the accumulator (the register in which intermediate arithmetic and logic results are
stored). The control unit controls the flow of data and instructions within the computer.
Features of a Microprocessor
Here is a list of some of the most prominent features of any microprocessor −
a) Cost-effective − Microprocessor chips are available at low prices, which keeps the
cost of the systems built around them low.
b) Size − The microprocessor is a small chip, hence it is portable.
c) Low Power Consumption − Microprocessors are manufactured using
metal-oxide semiconductor technology, which has low power consumption.
d) Versatility − Microprocessors are versatile, as the same chip can be used in a
number of applications by changing the software program.
e) Reliability − The failure rate of microprocessors is very low, hence they are reliable.
Microprocessor Architecture & Operation
Control Unit – The control unit provides the necessary timing and control signals to all
the operations in the microcomputer. It controls the flow of data between the
microprocessor and memory and peripherals.
Input – The input section transfers data and instructions in binary from the outside world
to the microprocessor. It includes such devices as a keyboard, switches, a scanner, and an
analog-to-digital converter.
Output – The output section transfers data from the microprocessor to such output
devices as LED, CRT, printer, magnetic tape, or another computer.
Memory – It stores such binary information as instructions and data, and provides that
information to the microprocessor. To execute programs, the microprocessor reads
instructions and data from memory and performs the computing operations in its ALU
section. Results are either transferred to the output section for display or stored in
memory for later use.
b) Data bus –
It is a group of conducting wires that carries data only. The data bus is bidirectional
because data flows in both directions: from the microprocessor to memory or
input/output devices, and from memory or input/output devices to the
microprocessor.
c) Control bus –
It is a group of wires used to carry the timing and control signals that
control all the associated peripherals. The microprocessor uses the control bus to
indicate what to do with the selected memory location or device. Some control signals are:
- Memory read
- Memory write
- I/O read
- I/O Write
4) Program counter
It is a 16-bit register used to store the memory address of the next
instruction to be executed. The microprocessor increments the program counter whenever an
instruction is being executed, so that the program counter points to the memory
address of the next instruction to be executed.
5) Stack pointer
It is also a 16-bit register. It holds the address of the top of the stack and is
decremented by 2 during a push operation and incremented by 2 during a pop operation.
6) Temporary register
It is an 8-bit register, which holds the temporary data of arithmetic and logical
operations.
7) Flag register
It is an 8-bit register having five 1-bit flip-flops, which holds either 0 or 1
depending upon the result stored in the accumulator.
These are the set of 5 flip-flops –
- Sign (S)
- Zero (Z)
- Auxiliary Carry (AC)
- Parity (P)
- Carry (C)
D7  D6  D5  D4  D3  D2  D1  D0
S   Z   -   AC  -   P   -   CY
(- marks an unused bit)
a) Sign Flag (S) – After any operation, if the result is negative (bit D7 of the result
is 1), the sign flag becomes set, i.e. 1. If the result is positive, the sign flag
becomes reset, i.e. 0.
• Example:
MVI A, 30H (load 30H in register A)
MVI B, 40H (load 40H in register B)
SUB B (A = A – B)
This set of instructions will set the sign flag to 1, as 30H – 40H is a negative number.
MVI A, 40H (load 40H in register A)
MVI B, 30H (load 30H in register B)
SUB B (A = A – B)
This set of instructions will reset the sign flag to 0, as 40H – 30H is a positive
number.
b) Zero Flag (Z) – After any arithmetical or logical operation if the result is 0 (00)H,
the zero flag becomes set i.e. 1, otherwise it becomes reset i.e. 0.
• Example:
MVI A, 10H (load 10H in register A)
SUB A (A = A – A)
This set of instructions will set the zero flag to 1, as 10H – 10H is 00H.
c) Auxiliary Carry Flag (AC) – If a carry is generated from bit D3 into bit D4 (an
intermediate carry), this flag is set to 1, otherwise it is reset to 0.
• Example:
MVI A, 2BH (load 2BH in register A)
MVI B, 39H (load 39H in register B)
ADD B (A = A + B)
This set of instructions will set the auxiliary carry flag to 1, as on adding 2BH and
39H, the addition of the lower-order nibbles B and 9 generates a carry.
d) Parity Flag (P) – If after any arithmetic or logical operation the result has even
parity (an even number of 1 bits), the parity flag becomes set, i.e. 1, otherwise it
becomes reset, i.e. 0.
1 - accumulator has an even number of 1 bits
0 - accumulator has an odd number of 1 bits
e) Carry Flag (CY) – When an n-bit operation produces a result of more than n bits, a
carry is generated and this flag becomes set, i.e. 1, otherwise it becomes
reset, i.e. 0.
During subtraction (A – B), if A ≥ B it becomes reset and if A < B it becomes set.
For this reason the carry flag is also called the borrow flag.
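The flag rules above can be checked with a small model. The following Python sketch is only an illustration, not 8085 code: the function names add8 and sub8 are hypothetical, and the auxiliary carry is modeled for addition only.

```python
def _common_flags(result):
    """Flags that depend only on the 8-bit result."""
    return {
        "S": result >> 7,                            # bit D7 of the result
        "Z": int(result == 0),                       # 1 when the result is 00H
        "P": int(bin(result).count("1") % 2 == 0),   # 1 means even parity
    }


def add8(a, b):
    """Model of ADD on two 8-bit values: returns (result, flags)."""
    total = a + b
    result = total & 0xFF
    flags = _common_flags(result)
    flags["AC"] = int((a & 0x0F) + (b & 0x0F) > 0x0F)  # carry out of the low nibble
    flags["CY"] = int(total > 0xFF)                    # result needs more than 8 bits
    return result, flags


def sub8(a, b):
    """Model of SUB on two 8-bit values: CY acts as the borrow flag."""
    result = (a - b) & 0xFF
    flags = _common_flags(result)
    flags["CY"] = int(a < b)                           # set when a borrow is needed
    return result, flags


print(add8(0x2B, 0x39))   # the auxiliary carry example
print(sub8(0x30, 0x40))   # the sign flag example
print(sub8(0x10, 0x10))   # the zero flag example
```

With these inputs the model reproduces the three examples: AC is 1 for 2BH + 39H, S and CY are 1 for 30H – 40H, and Z is 1 for 10H – 10H.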
Priority of Interrupts
When the microprocessor receives multiple interrupt requests simultaneously, it will
execute the interrupt service routines (ISRs) according to the priority of the
interrupts.
2) Data bus
AD7-AD0 carry the least significant 8 bits of the address, multiplexed with the 8-bit data bus.
a) RD − This signal indicates that the selected I/O or memory device is to be read and
that the data bus is ready to accept the data.
b) WR − This signal indicates that the data on the data bus is to be written into a
selected memory or I/O location.
c) ALE − It is a positive-going pulse generated when a new operation is started by the
microprocessor. When the pulse is high, the AD7-AD0 lines carry an address. When the
pulse is low, they carry data.
IO/M
This signal is used to differentiate between I/O and memory operations: when it is
high it indicates an I/O operation, and when it is low it indicates a memory operation.
S1 & S0
These signals are used to identify the type of current operation.
4) Power supply
There are 2 power supply signals − VCC & VSS. VCC indicates the +5 V power supply
and VSS indicates the ground signal.
5) Clock signals
There are 3 clock signals, i.e. X1, X2, CLK OUT.
Fetch cycle takes four t-states and execution cycle takes three t-states. It is shown below:
Different types of Machine Cycles:
Opcode Fetch Cycle
The first machine cycle of every instruction is opcode fetch cycle in which the 8085 finds
the nature of the instruction to be executed. In this machine cycle, processor places the
contents of the Program Counter on the address lines, and through the read process,
reads the opcode of the instruction. The length of this cycle is not fixed. It varies from 4T
states to 6T states as per the instruction.
Below figure shows timing diagram of opcode fetch:
The method by which the address of source of data or the address of destination of result
is given in the instruction is called Addressing Modes. The term addressing mode refers to
the way in which the operand of the instruction is specified.
Introduction to 8086
The 8086 microprocessor is an enhanced version of the 8085 microprocessor that was designed
by Intel in 1976. It is a 16-bit microprocessor having 20 address lines and 16 data lines,
and it can address up to 1 MB of memory. It has a powerful instruction set, which makes
operations like multiplication and division easy.
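The 1 MB figure follows directly from the number of address lines: n lines can select 2^n distinct byte locations. A minimal Python sketch of that arithmetic, where the function name addressable_bytes is hypothetical:

```python
def addressable_bytes(address_lines):
    """Number of distinct byte addresses selectable with the given address lines."""
    return 2 ** address_lines


print(addressable_bytes(20))           # 8086: 1048576 bytes
print(addressable_bytes(20) // 1024)   # 1024 KB, i.e. 1 MB
print(addressable_bytes(16))           # 16 address lines: 65536 bytes (64 KB)
```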
It supports two modes of operation, i.e. Maximum mode and Minimum mode. Maximum
mode is suitable for systems having multiple processors and Minimum mode is suitable for
systems having a single processor.
Features of 8086
a) It has an instruction queue capable of storing six instruction bytes from
memory, resulting in faster processing.
b) It was Intel's first 16-bit processor, with a 16-bit ALU, 16-bit registers, a 16-bit
internal data bus, and a 16-bit external data bus, resulting in faster processing.
c) It uses two stages of pipelining, i.e. Fetch Stage and Execute Stage, which
improves performance. The fetch stage can prefetch up to 6 bytes of instructions
and store them in the queue. The execute stage executes these instructions.
d) It consists of about 29,000 transistors.
Execution Unit(EU)
The EU receives the opcode of an instruction from the queue, decodes it and then executes
it. While the execution unit decodes or executes an instruction, the BIU fetches
instruction codes from the memory and stores them in the queue.
• General Purpose Registers: There are four 16-bit general purpose registers: AX
(Accumulator Register), BX (Base Register), CX (Counter) and DX (Data Register).
• Pointer and Index Registers: The following four registers form the group of pointer
and index registers:
• Stack Pointer (SP)
• Base Pointer (BP)
• Source Index (SI)
• Destination Index (DI)
• ALU: It handles all arithmetic and logical operations, such as addition, subtraction,
multiplication, division, AND, OR and NOT.
• Flag Register: It is a 16-bit register whose bits behave like flip-flops, i.e. they
change state according to the result of the last operation. It has 9 flags, which
are divided into 2 groups: conditional flags and control flags.
• Conditional Flags: These flags represent the result of the last arithmetic or
logical instruction executed. Conditional flags are:
• Carry Flag
• Auxiliary Flag
• Parity Flag
• Zero Flag
• Sign Flag
• Overflow Flag
• Control Flags: These flags control the operations of the execution unit. Control flags
are:
• Trap Flag
• Interrupt Flag
• Direction Flag
• AD0-AD15 (Address/Data Bus): Bidirectional, multiplexed address/data lines. These carry
the low-order address bits. When these lines are used to transmit a memory address, the
symbol A is used instead of AD, for example A0-A15.
• A16 - A19 (Output): High order address lines. These are multiplexed with status
signals.
• A16/S3, A17/S4: A16 and A17 are multiplexed with segment identifier signals S3
and S4.
• TEST (Input): Wait for test control. When LOW, the microprocessor continues
execution; otherwise it waits.
• GND: Ground.
Unit II Introduction to Assembly Language Programming
Assembly Language Programming Basics
An assembly language is the most basic programming language available for any
processor. With assembly language, a programmer works only with operations that are
implemented directly on the physical CPU.
b) Arithmetic Instructions
These instructions perform the operations like addition, subtraction, increment and
decrement.
Example: ADD, SUB, INR, DCR
c) Logical Instructions
These instructions perform logical operations on data stored in registers and memory.
The logical operations are: AND, OR, XOR, Rotate, Compare and Complement.
Example: ANA, ORA, RAR, RAL, CMP, CMA
d) Branching Instructions
Branching instructions refer to the act of switching execution to a different instruction
sequence as a result of executing a branch instruction. The three types of branching
instructions are: Jump, Call and Return.
e) Control Instructions
The control instructions control the operation of microprocessor. Examples: HLT, NOP, EI
(Enable Interrupt), DI (Disable Interrupt).
Data Transfer Instructions
MOV A, B 78 Register 1
MOV A, C 79 Register 1
MOV A, D 7A Register 1
MOV A, E 7B Register 1
MOV A, H 7C Register 1
MOV A, L 7D Register 1
MOV B, A 47 Register 1
MOV B, B 40 Register 1
MOV B, C 41 Register 1
MOV B, D 42 Register 1
MOV B, E 43 Register 1
MOV B, H 44 Register 1
MOV B, L 45 Register 1
MOV C, A 4F Register 1
MOV C, B 48 Register 1
MOV C, C 49 Register 1
MOV C, D 4A Register 1
MOV C, E 4B Register 1
MOV C, H 4C Register 1
MOV C, L 4D Register 1
MOV D, A 57 Register 1
MOV D, B 50 Register 1
MOV D, C 51 Register 1
MOV D, D 52 Register 1
MOV D, E 53 Register 1
MOV D, H 54 Register 1
MOV D, L 55 Register 1
MOV E, A 5F Register 1
MOV E, B 58 Register 1
MOV E, C 59 Register 1
MOV E, D 5A Register 1
MOV E, E 5B Register 1
MOV E, H 5C Register 1
MOV E, L 5D Register 1
MOV H, A 67 Register 1
MOV H, B 60 Register 1
MOV H, C 61 Register 1
MOV H, D 62 Register 1
MOV H, E 63 Register 1
MOV H, H 64 Register 1
MOV H, L 65 Register 1
MOV L, A 6F Register 1
MOV L, B 68 Register 1
MOV L, C 69 Register 1
MOV L, D 6A Register 1
MOV L, E 6B Register 1
MOV L, H 6C Register 1
MOV L, L 6D Register 1
Total = 84
Arithmetic Instructions
Instruction Opcode Addressing Mode Bytes Description
ADD A 87 Register 1 Adds the content of the given register to the accumulator. The result is stored in the accumulator.
ADD B 80 Register 1
ADD C 81 Register 1
ADD D 82 Register 1
ADD E 83 Register 1
ADD H 84 Register 1
ADD L 85 Register 1
ADI data C6 Immediate 2 Adds the given immediate data to the accumulator. The result is stored in the accumulator.
ADC D 8A Register 1
ADC E 8B Register 1
ADC H 8C Register 1
ADC L 8D Register 1
SUB C 91 Register 1
SUB D 92 Register 1
SUB E 93 Register 1
SUB H 94 Register 1
SUB L 95 Register 1
SBB C 99 Register 1
SBB D 9A Register 1
SBB E 9B Register 1
SBB H 9C Register 1
SBB L 9D Register 1
INR C 0C Register 1
INR D 14 Register 1
INR E 1C Register 1
INR H 24 Register 1
INR L 2C Register 1
DCR B 05 Register 1
DCR C 0D Register 1
DCR D 15 Register 1
DCR E 1D Register 1
DCR H 25 Register 1
DCR L 2D Register 1
Total=65
Logical Instructions
Instruction Opcode Addressing Mode Bytes Description
ANA A A7 Register 1 A=A AND A
ANA C A1 Register 1
ANA D A2 Register 1
ANA E A3 Register 1
ANA H A4 Register 1
ANA L A5 Register 1
ANA M A6 Register Indirect 1
ANI data E6 Immediate 2 A=A AND data
ORA C B1 Register 1
ORA D B2 Register 1
ORA E B3 Register 1
ORA H B4 Register 1
ORA L B5 Register 1
ORA M B6 Register Indirect 1
ORI data F6 Immediate 2
XRA C A9 Register 1
XRA D AA Register 1
XRA E AB Register 1
XRA H AC Register 1
XRA L AD Register 1
XRA M AE Register Indirect 1
XRI data EE Immediate 2
CMP C B9 Register 1
CMP D BA Register 1
CMP E BB Register 1
CMP H BC Register 1
CMP L BD Register 1
CMP M BE Register Indirect 1
CPI data FE Immediate 2 Computes A - data to set the flags (the accumulator remains unchanged)
Total=39
Branching Instructions
Instruction Opcode Addressing Mode Bytes Description
JMP address C3 Immediate 3 Unconditional jump
PUSH D D5 Register Indirect 1
PUSH H E5 Register Indirect 1
POP B C1 Register Indirect 1 Pops two bytes from the top of the stack (addressed by SP) into the BC pair.
POP D D1 Register Indirect 1
POP H E1 Register Indirect 1
NOP 00 Implied/Implicit 1 No operation is performed
DI F3 Implied/Implicit 1
EI FB Implied/Implicit 1
Input:
Memory Location Data
2050H 45H
2051H 53H
Output:
Memory Location Data
2055H 98H
Input:
Memory Location Data
2050H 65H
2051H 53H
Output:
Memory Location Data
2055H 12H
Input:
Memory Location Data
2013H 12H
Output:
Memory Location Data
2052H EDH
Input:
Memory Location Data
2013H 12H
Output:
Memory Location Data
2052H EEH
Input:
Register Pair Data
HL 1124H
DE 2253H
Output:
Memory Location Data
2055H 77H
2056H 33H
Input:
Memory Location Data
2050H 33H
2051H 45H
2052H 24H
2053H 34H
Output:
Memory Location Data
2055H 57H
2056H 79H
Input:
Register Pair Data
HL 4897H
DE 1234H
Output:
Memory Location Data
2055H 63H
2056H 36H
12. Program to subtract two 16-bit numbers.
Statement: Input first number from memory location 2050H & 2051H and second
number from memory location 2052H & 2053H and store result in memory
location 2055H & 2056H.
LHLD 2052H //Load 16-bit number from memory locations 2052H & 2053H into HL pair
XCHG //Exchange contents of HL pair and DE pair
LHLD 2050H //Load 16-bit number from memory locations 2050H & 2051H into HL pair
MOV A,L //Move contents of register L to Accumulator
SUB E //Subtract contents of register E from Accumulator
MOV L,A //Move contents of Accumulator to register L
MOV A,H //Move contents of register H to Accumulator
SBB D //Subtract contents of register D and the borrow from Accumulator
MOV H,A //Move contents of Accumulator to register H
SHLD 2055H //Store contents of HL pair in memory locations 2055H & 2056H
HLT //Terminate the program
Input:
Memory Location Data
2050H 78H
2051H 45H
2052H 24H
2053H 34H
Output:
Memory Location Data
2055H 54H
2056H 11H
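The byte-by-byte subtraction with borrow used by program 12 can be cross-checked with a short model. This Python sketch is only an illustration of the SUB/SBB sequence, not 8085 code, and the function name sub16_bytewise is hypothetical.

```python
def sub16_bytewise(minuend, subtrahend):
    """Subtract two 16-bit values one byte at a time, as SUB then SBB do."""
    low = (minuend & 0xFF) - (subtrahend & 0xFF)          # SUB E on the low bytes
    borrow = int(low < 0)                                 # carry flag after SUB
    high = (minuend >> 8) - (subtrahend >> 8) - borrow    # SBB D on the high bytes
    return ((high & 0xFF) << 8) | (low & 0xFF)


result = sub16_bytewise(0x4578, 0x3424)
print(f"{result & 0xFF:02X}H at 2055H, {result >> 8:02X}H at 2056H")
```

For the input shown above (4578H and 3424H) the model gives 1154H, so 54H is stored at 2055H and 11H at 2056H, matching the output table.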
MVI A,00H
MVI B,06H
MVI C,03H
X: ADD B
DCR C
JNZ X
STA 2055H
HLT
14. Program to divide two 8-bit numbers.
Statement: Divide 08H by 03H and store the quotient in memory location 2055H and the
remainder in memory location 2056H.
MVI A,08H
MVI B,03H
MVI C,00H
X: CMP B
JC Y
SUB B
INR C
JMP X
Y: STA 2056H
MOV A,C
STA 2055H
HLT
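Program 14 divides by repeated subtraction: it keeps subtracting the divisor while the accumulator is not smaller than it, and counts the subtractions. A minimal Python sketch of the same loop, where the function name is hypothetical and the zero-divisor check is an addition that the listing does not have:

```python
def divide_by_subtraction(dividend, divisor):
    """Return (quotient, remainder) using repeated subtraction."""
    if divisor == 0:
        raise ValueError("divisor must be non-zero")  # assumption: not in the listing
    quotient = 0                   # register C
    while dividend >= divisor:     # CMP B / JC Y leaves the loop when A < B
        dividend -= divisor        # SUB B
        quotient += 1              # INR C
    return quotient, dividend      # quotient to 2055H, remainder to 2056H


print(divide_by_subtraction(0x08, 0x03))
```

For 08H divided by 03H the model returns a quotient of 2 and a remainder of 2.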
Y: MOV A,M
INX H
CMP M
JC Z
MOV B,M
MOV M,A
DCX H
MOV M,B
INX H
Z: DCR D
JNZ Y
DCR C
JNZ X
HLT
29. Sort numbers in descending order in array. Length of array is in memory location
2050H.
LDA 2050H
MOV C,A
DCR C
X: MOV D,C
LXI H,2051H
Y: MOV A,M
INX H
CMP M
JNC Z
MOV B,M
MOV M,A
DCX H
MOV M,B
INX H
Z: DCR D
JNZ Y
DCR C
JNZ X
HLT
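Program 29 is a bubble sort: register C counts the passes, register D counts the comparisons in each pass, and adjacent elements are swapped when the first is smaller than the second. The following Python sketch mirrors that loop structure for illustration; the function name is hypothetical and a Python list stands in for the array starting at 2051H.

```python
def sort_descending(values):
    """Bubble sort into descending order, following the loop structure of program 29."""
    data = list(values)
    passes = len(data) - 1                  # register C = length - 1
    while passes > 0:
        for i in range(passes):             # register D counts the comparisons
            if data[i] < data[i + 1]:       # CMP M sets carry when A < M, so JNC Z is not taken
                data[i], data[i + 1] = data[i + 1], data[i]
        passes -= 1                         # DCR C / JNZ X
    return data


print([f"{v:02X}H" for v in sort_descending([0x12, 0x45, 0x03, 0x45, 0x20])])
```

Replacing the `<` comparison with `>` gives the ascending order, which corresponds to using JC instead of JNC in the listing.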
30. Multiply two 8 bit numbers 43H & 07H. Result is stored at address 3050 and
3051.
LXI H,0000H
MVI D,00H
MVI E,43H
MVI C,07H
X: DAD D
DCR C
JNZ X
SHLD 3050H
HLT
31. Multiply two 8 bit numbers stored at address 2050 and 2051. Result is stored
at address 3050 and 3051.
LDA 2050H
MOV E,A
LDA 2051H
MOV C,A
MVI D,00H
LXI H,0000H
X: DAD D
DCR C
JNZ X
SHLD 3050H
HLT
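Programs 30 and 31 multiply by repeated addition: the multiplicand in the DE pair is added to the HL pair once for every count in register C. A short Python model of that loop, given only as an illustration with a hypothetical function name:

```python
def multiply_by_addition(multiplicand, multiplier):
    """Multiply two 8-bit values by adding the multiplicand 'multiplier' times."""
    product = 0                                        # HL pair, cleared by LXI H,0000H
    for _ in range(multiplier):                        # register C counts down to zero
        product = (product + multiplicand) & 0xFFFF    # DAD D, a 16-bit addition
    return product


product = multiply_by_addition(0x43, 0x07)
print(f"low byte {product & 0xFF:02X}H, high byte {product >> 8:02X}H")
```

For 43H and 07H the product is 01D5H; SHLD stores the low byte D5H at the first address and the high byte 01H at the next. Unlike this sketch, the listings do not handle a multiplier of 00H: DCR C wraps from 00H to FFH, so the loop would run 256 times.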
Unit III Basic Computer Architecture
Introduction to Computer Architecture
Computer architecture is a specification detailing how a set of software and hardware
technology standards interact to form a computer system or platform. In short, computer
architecture refers to how a computer system is designed and what technologies it is
compatible with.
As with other contexts and meanings of the word architecture, computer architecture is
likened to the art of determining the needs of the user/system/technology, and creating
a logical design and standards based on those requirements.
A very good example of computer architecture is von Neumann architecture, which is still
used by most types of computers today. This was proposed by the mathematician John
von Neumann in 1945. It describes the design of an electronic computer with its CPU,
which includes the arithmetic logic unit, control unit, registers, memory for data and
instructions, an input/output interface and external storage functions.
Computer organization describes how the computer performs, e.g. circuit design, control
signals, memory types, etc.
1) So, for example, the fact that a multiply instruction is available is a computer
architecture issue. How that multiply is implemented is a computer organization
issue.
Cache Memory
A special very high speed memory called a Cache is sometimes used to increase the speed
of processing by making current programs and data available to the CPU at a rapid rate.
The cache memory is employed in computer systems to compensate for the speed
differential between main memory access time and processor logic. CPU logic is usually
faster than main memory access time, with the result that processing speed is limited
primarily by the speed of main memory. A technique used to compensate for the
mismatch in operating speeds is to employ an extremely fast, small cache between the
CPU and main memory whose access time is close to the processor logic clock cycle time. The
cache is used for storing segments of programs currently being executed in the CPU and
temporary data frequently needed in the present calculations.
If the active portions of the program and data are placed in a fast small memory, the
average memory access time can be reduced, thus reducing the total execution time of
the program. Such a fast small memory is referred to as a cache memory. It is placed
between the CPU and main memory as illustrated in Fig. below. The cache memory
access time is less than the access time of main memory by a factor of 5 to 10. The cache
is the fastest component in the memory hierarchy and approaches the speed of CPU
components.
The basic operation of the cache is as follows. When the CPU needs to access memory,
the cache is examined. If the word is found in the cache, it is read from the fast memory.
If the word addressed by the CPU is not found in the cache, the main memory is accessed
to read the word. A block of words containing the one just accessed is then transferred
from main memory to cache memory. The block size may vary from one word (the one
just accessed) to about 16 words adjacent to the one just accessed. In this manner, some
data are transferred to cache so that future references to memory find the required
words in the fast cache memory.
The main memory can store 32K words of 12 bits each. The cache is capable of storing
512 of these words at any given time. For every word stored in cache, there is a duplicate
copy in main memory. The CPU communicates with both memories. It first sends a 15-bit
address to cache. If there is a hit, the CPU accepts the 12-bit data from cache. If there is a
miss, the CPU reads the word from main memory and the word is then transferred to
cache.
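The hit and miss behaviour described above can be sketched with a small model. The text does not say how cache locations are chosen, so this Python sketch assumes direct mapping: the low 9 bits of the 15-bit address select one of the 512 cache words and the remaining 6 bits are kept as a tag. The class and attribute names are hypothetical.

```python
class DirectMappedCache:
    """512-word cache in front of a 32K-word main memory (direct mapping is an assumption)."""

    INDEX_BITS = 9                       # 2^9 = 512 cache words

    def __init__(self, main_memory):
        self.main_memory = main_memory   # list of 12-bit words, indexed by 15-bit address
        self.lines = {}                  # index -> (tag, word)
        self.hits = 0
        self.misses = 0

    def read(self, address):
        index = address & ((1 << self.INDEX_BITS) - 1)   # low 9 bits of the address
        tag = address >> self.INDEX_BITS                 # high 6 bits of the address
        line = self.lines.get(index)
        if line is not None and line[0] == tag:
            self.hits += 1                               # hit: word comes from the cache
            return line[1]
        self.misses += 1                                 # miss: read main memory
        word = self.main_memory[address]
        self.lines[index] = (tag, word)                  # copy the word into the cache
        return word


memory = [i & 0xFFF for i in range(32 * 1024)]           # 32K words of 12 bits each
cache = DirectMappedCache(memory)
cache.read(0x0123)                                       # first access misses
cache.read(0x0123)                                       # second access hits
print(cache.hits, cache.misses)
```

Every word held in the cache is a copy of a main memory word, so a read returns the same value whether it hits or misses; only the access time differs.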
Magnetic platters - Platters are the round plates in the image above. Each platter holds a
certain amount of information, so a drive with a lot of storage will have more platters
than one with less storage. When information is stored and retrieved from the platters it
is done so in concentric circles, called tracks, which are further broken down into
segments called sectors.
Arm - The arm is the piece sticking out over the platters. The arms will contain read and
write heads which are used to read and store the magnetic information onto the platters.
Each platter will have its own arm which is used to read and write data off of it.
Motor - The motor is used to spin the platters from 4,500 to 15,000 rotations per minute
(RPM). The faster the RPM of a drive, the better performance you will achieve from it.
When a computer wants to retrieve data off of the hard drive, the motor will spin up the
platters and the arm will move itself to the appropriate position above the platter where
the data is stored. The heads on the arm will detect the magnetic bits on the platters and
convert them into the appropriate data that can be used by the computer. Conversely,
when data is sent to the drive, the heads instead send magnetic pulses to the
platters, changing the magnetic properties of the platter and thus storing the
information.
Instruction Code
An instruction code is a group of bits that instruct the computer to perform a specific
operation. It is usually divided into parts, each having its own particular interpretation.
The most basic part of an instruction code is its operation part. The operation code of an
instruction is a group of bits that define such operations as add, subtract, multiply, shift,
and complement. The number of bits required for the operation code of an instruction
depends on the total number of operations available in the computer.
Instruction codes together with data are stored in memory. The computer reads each
instruction from memory and places it in a control register. The control then interprets
the binary code of the instruction and proceeds to execute it by issuing a sequence of
micro operations. Every computer has its own unique instruction set. The ability to store
and execute instructions, the stored program concept, is the most important property of
a general-purpose computer.
The operation part of an instruction code specifies the operation to be performed. This
operation must be performed on some data stored in processor registers or in memory.
An instruction code must therefore specify not only the operation but also the registers
or the memory words where the operands are to be found, as well as the register or
memory word where the result is to be stored.
Stored Program Organization
The simplest way to organize a computer is to have one processor register and an
instruction code format with two parts. The first part specifies the operation to be
performed and the second specifies an address. The memory address tells the control
where to find an operand in memory. This operand is read from memory and used as the
data to be operated on together with the data stored in the processor register.
Below figure depicts this type of organization. Instructions are stored in one section of
memory and data in another. For a memory unit with 4096 words we need 12 bits to
specify an address since 2^12 = 4096. If we store each instruction code in one 16-bit
memory word, we have available four bits for the operation code (abbreviated opcode)
to specify one out of 16 possible operations, and 12 bits to specify the address of an
operand. The control reads a 16-bit instruction from the program portion of memory. It
uses the 12-bit address part of the instruction to read a 16-bit operand from the data
portion of memory. It then executes the operation specified by the operation code.
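The split of a 16-bit instruction word into a 4-bit opcode and a 12-bit address can be written as two bit operations. A minimal Python sketch, where the function name decode and the sample instruction value are hypothetical:

```python
def decode(instruction):
    """Split a 16-bit instruction word into (opcode, address)."""
    opcode = (instruction >> 12) & 0xF    # top 4 bits: one of 16 operations
    address = instruction & 0xFFF         # low 12 bits: one of 4096 memory words
    return opcode, address


opcode, address = decode(0x2ABC)
print(opcode, hex(address))
```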
Indirect Address
It is sometimes convenient to use the address bits of an instruction code not as an
address but as the actual operand. When the second part of an instruction code specifies
an operand, the instruction is said to have an immediate operand. When the second part
specifies the address of an operand, the instruction is said to have a direct address. This is
in contrast to a third possibility called indirect address, where the bits in the second part
of the instruction designate an address of a memory word in which the address of the
operand is found.
In register indirect addressing mode, the data to be operated is available inside a memory
location and that memory location is indirectly specified by a register pair.
Example:
MOV A, M (move the contents of the memory location pointed by the H-L pair to the
accumulator)
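The immediate, direct and indirect possibilities described above differ only in how many memory reads are needed to reach the operand. The following Python sketch illustrates this with a dictionary standing in for memory; the function name, mode names and memory contents are hypothetical.

```python
def fetch_operand(mode, field, memory):
    """Return the operand for the given addressing mode and address field."""
    if mode == "immediate":
        return field                    # the field itself is the operand
    if mode == "direct":
        return memory[field]            # the field is the address of the operand
    if mode == "indirect":
        return memory[memory[field]]    # the field points to the operand's address
    raise ValueError(f"unknown addressing mode: {mode}")


memory = {0x010: 0x020, 0x020: 0x07F}
for mode in ("immediate", "direct", "indirect"):
    print(mode, hex(fetch_operand(mode, 0x010, memory)))
```

Register indirect addressing in the 8085 (MOV A, M) follows the same idea as the direct case here, except that the address comes from the H-L pair rather than from the instruction.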
Computer Registers
Registers are the fastest and smallest type of memory elements available to a processor.
Registers are normally measured by the number of bits they can hold, for example, an "8-
bit register", "32-bit register" or a "64-bit register" (or even with more bits). A processor
often contains several kinds of registers, which can be classified according to their
content or instructions that operate on them.
Four registers, DR, AC, IR, and TR, have 16 bits each. Two registers, AR and PC, have 12
bits each since they hold a memory address. When the contents of AR or PC are applied
to the 16-bit common bus, the four most significant bits are set to 0's. When AR or PC
receives information from the bus, only the 12 least significant bits are transferred into
the register.
The input register INPR and the output register OUTR have 8 bits each and communicate
with the eight least significant bits in the bus. INPR is connected to provide information to
the bus but OUTR can only receive information from the bus. This is because INPR
receives a character from an input device which is then transferred to AC. OUTR receives
a character from AC and delivers it to an output device. There is no transfer from OUTR to
any of the other registers. The 16 lines of the common bus receive information from six
registers and the memory unit. The bus lines are connected to the inputs of six registers
and the memory. Five registers have three control inputs: LD (load), INR (increment), and
CLR (clear).
The input data and output data of the memory are connected to the common bus, but
the memory address is connected to AR. Therefore, AR must always be used to specify a
memory address. By using a single register for the address, we eliminate the need for an
address bus that would have been needed otherwise. The content of any register can be
specified for the memory data input during a write operation. Similarly, any register can
receive the data from memory after a read operation except AC.
Instruction Cycle
The time required to fetch and execute an entire instruction is called the instruction cycle. A
program residing in the memory unit of the computer consists of a sequence of
instructions. The program is executed in the computer by going through a cycle for each
instruction. Each instruction cycle in turn is subdivided into a sequence of sub cycles or
phases. In the basic computer each instruction cycle consists of the following phases:
Terminologies
Control Memory:
Control Memory is the storage in the microprogrammed control unit to store the
microprogram.
Control Word:
The control variables at any given time can be represented by a string of 1's
and 0's called a control word.
Microoperations:
Micro-operations perform basic operations on data stored in one or more registers,
including transferring data between registers or between registers and external buses of
the central processing unit (CPU), and performing arithmetic or logical operations on
registers.
Microcode:
A very low-level instruction set which is stored permanently in a computer or peripheral
controller and controls the operation of the device.
Microinstruction:
A single instruction in microcode. It is the most elementary instruction in the computer,
such as moving the contents of a register to the arithmetic logic unit (ALU).
Microprogram:
A set or sequence of microinstructions.
❖ The gate structure that controls the LD, INR, and CLR inputs of AC is shown in Fig.
❖ The output of the AND gate that generates this control function is connected to
the CLR input of the register.
❖ Similarly, the output of the gate that implements the increment micro operation is
connected to the INR input of the register.
❖ The other seven micro operations are generated in the adder and logic circuit and
are loaded into AC at the proper time.
❖ The outputs of the gates for each control function are marked with a symbolic
name. These outputs are used in the design of the adder and logic circuit.
ALU Organization
❖ Various circuits connected to the microprocessor's ALU are required to process data
or perform arithmetic operations.
❖ The accumulator and data buffer store data temporarily. These data are processed as
per control instructions to solve problems such as addition,
multiplication, etc.
Functions of ALU:
The functions of the ALU (Arithmetic & Logic Unit) fall into the following 3
categories:
1. Arithmetic Operations:
Addition, multiplication, etc. are examples of arithmetic operations. Finding whether one
number is greater than, smaller than or equal to another by using subtraction is also a
form of arithmetic operation.
2. Logical Operations:
Operations like AND, OR, NOR, NOT, etc. using logical circuitry are examples of logical
operations.
3. Data Manipulations:
Flushing a register is an example of data manipulation. Shifting a binary
number is also an example of data manipulation.
Control Memory
❖ A computer that employs a microprogrammed control unit will have two
separate memories: a main memory and a control memory.
❖ The main memory is available to the user for storing the programs. The contents
of main memory may alter when the data are manipulated and every time that
the program is changed. The user's program in main memory consists of machine
instructions and data.
❖ In contrast, the control memory holds a fixed microprogram that cannot be
altered by the occasional user. The microprogram consists of microinstructions
that specify various internal control signals for execution of register
microoperations.
❖ Each machine instruction initiates a series of microinstructions in control memory.
These microinstructions generate the microoperations to fetch the instruction
from main memory, to evaluate the effective address, to execute the operation
specified by the instruction, and to return control to the fetch phase in order to
repeat the cycle for the next instruction.
❖ The control unit initiates a series of sequential steps of microoperations. During
any given time, certain microoperations are to be initiated, while others remain
idle. The control variables at any given time can be represented by a string of 1's
and 0's called a control word.
❖ As such, control words can be programmed to perform various operations on the
components of the system.
❖ The microinstruction specifies one or more microoperations for the system. A
sequence of microinstructions constitutes a microprogram.
Address Sequencing
❖ Microinstructions are stored in control memory in groups, with each group
specifying a routine.
❖ Each computer instruction has its own microprogram routine in control memory
to generate the microoperations that execute the instruction.
❖ To appreciate the address sequencing in a microprogram control unit, let us
enumerate the steps that the control must undergo during the execution of a
single computer instruction.
❖ An initial address is loaded into the control address register when power is turned
on in the computer. This address is usually the address of the first
microinstruction that activates the instruction fetch routine. At the end of the
fetch routine, the instruction is in the instruction register of the computer.
❖ The control memory next must go through the routine that determines the
effective address of the operand. When the effective address computation
routine is completed, the address of the operand is available in the memory
address register.
❖ The next step is to generate the microoperations that execute the instruction
fetched from memory. The microoperation steps to be generated in processor
registers depend on the operation code part of the instruction.
❖ When the execution of the instruction is completed, control must return to the
fetch routine. This is accomplished by executing an unconditional branch
microinstruction to the first address of the fetch routine.
❖ In summary, the address sequencing capabilities required in a control memory
are:
1. Incrementing of the control address register.
2. Unconditional branch or conditional branch, depending on status bit conditions.
3. A mapping process from the bits of the instruction to an address for control
memory.
4. A facility for subroutine call and return.
Conditional Branching
❖ The branch logic provides decision-making capabilities in the control unit.
❖ The status conditions are special bits in the system that provide parameter
information such as the carry-out of an adder, the sign bit of a number, the mode
bits of an instruction, and input or output status conditions.
❖ Information in these bits can be tested and actions initiated based on their
condition: whether their value is 1 or 0.
❖ The status bits, together with the field in the microinstruction that specifies a
branch address, control the conditional branch decisions generated in the branch
logic.
❖ The branch logic hardware may be implemented in a variety of ways. The simplest
way is to test the specified condition and branch to the indicated address if the
condition is met; otherwise, the address register is incremented.
Mapping of Instruction
❖ Each instruction has its own microprogram routine stored in a given location of
control memory. The transformation from the instruction code bits to an address
in control memory where the routine is located is referred to as a mapping
process.
❖ A mapping procedure is a rule that transforms the instruction code into a control
memory address.
❖ For example, a computer with a simple instruction format as shown in Fig. 7-3 has
an operation code of four bits. Assume further that the control memory has 128
words, requiring an address of seven bits. For each operation code there exists a
microprogram routine in control memory that executes the instruction.
❖ One simple mapping process converts the 4-bit operation code to a 7-bit
address for control memory.
❖ This mapping consists of placing a 0 in the most significant bit of the address,
transferring the four operation code bits, and clearing the two least significant bits
of the control address register. This provides for each computer instruction a
microprogram routine with a capacity of four microinstructions.
❖ If the routine needs more than four microinstructions, it can use addresses
1000000 through 1111111.
❖ If it uses fewer than four microinstructions, the unused memory locations would
be available for other routines.
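The mapping rule described above (a 0 in the most significant bit, the four operation code bits, then two cleared bits) can be sketched in a few lines. The function name and the sample opcodes are illustrative only and are not taken from any particular machine.

```python
def map_opcode_to_address(opcode: int) -> int:
    """Map a 4-bit operation code to a 7-bit control memory address (0 xxxx 00)."""
    if not 0 <= opcode <= 0b1111:
        raise ValueError("opcode must fit in four bits")
    # Shifting left by two clears the two least significant bits and leaves
    # the most significant (seventh) bit at 0.
    return opcode << 2


for opcode in (0b0000, 0b0001, 0b1011):
    address = map_opcode_to_address(opcode)
    print(f"opcode {opcode:04b} -> routine starts at address {address:07b}")
```

Consecutive opcodes land four addresses apart, which is why each routine has room for four microinstructions before it runs into the next one.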
Subroutines
❖ Subroutines are programs that are used by other routines to accomplish a
particular task.
❖ A subroutine can be called from any point within the main body of the
microprogram.
❖ Frequently, many microprograms contain identical sections of code.
Microinstructions can be saved by employing subroutines that use common
sections of microcode.
❖ For example, the sequence of microoperations needed to generate the effective
address of the operand for an instruction is common to all memory reference
instructions. This sequence could be a subroutine that is called from within many
other routines to execute the effective address computation.
❖ Microprograms that use subroutines must have a provision for storing the return
address during a subroutine call and restoring the address during a subroutine
return. This may be accomplished by placing the incremented output from the
control address register into a subroutine register and branching to the beginning
of the subroutine. The subroutine register can then become the source for
transferring the address for the return to the main routine.
Microprogram
❖ Microprogram is a sequence of microinstructions that controls the operation of
an arithmetic and logic unit so that machine code instructions are executed.
❖ It is a microinstruction program that controls the functions of a central processing
unit or peripheral controller of a computer.
Microinstruction Format
The microinstruction format for the control memory is shown in Fig. The 20 bits of the
microinstruction are divided into four functional parts. The three fields F1, F2, and F3
specify microoperations for the computer. The CD field selects status bit conditions. The
BR field specifies the type of branch to be used. The AD field contains a branch address.
The address field is seven bits wide, since the control memory has 128 = 2^7 words.
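As a rough illustration of how such a word can be split into fields, the sketch below packs and unpacks a 20-bit microinstruction. The individual widths (three bits for each of F1, F2, F3, two bits each for CD and BR, seven bits for AD) are an assumption chosen so that the total is the 20 bits stated above; only the 7-bit AD width is given explicitly in the text.

```python
# Field layout, most significant field first. Widths other than AD are an
# assumption that makes the total come out to 20 bits.
FIELDS = (("F1", 3), ("F2", 3), ("F3", 3), ("CD", 2), ("BR", 2), ("AD", 7))


def pack(values: dict) -> int:
    """Build a 20-bit microinstruction from field values (missing fields are 0)."""
    word = 0
    for name, width in FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack(word: int) -> dict:
    """Split a 20-bit microinstruction back into its fields."""
    values = {}
    for name, width in reversed(FIELDS):
        values[name] = word & ((1 << width) - 1)
        word >>= width
    return values


word = pack({"F1": 0b001, "CD": 0b01, "AD": 67})
print(f"{word:020b}")
print(unpack(word))
```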
Symbols & Binary Code for Microoperations
❖ Control unit generates timing and control signals for the operations of the
computer. The control unit communicates with ALU and main memory. It also
controls the transmission between processor, memory and the various
peripherals. It also instructs the ALU which operation has to be performed on
data.
❖ Control unit can be designed by two methods:
1. Hardwired control Unit
2. Microprogrammed Control Unit
2. Types of Micro-operation
These operations consist of a sequence of micro operations. All micro instructions fall
into one of the following categories:
❖ Transfer data between registers
❖ Transfer data from register to external
❖ Transfer data from external to register
❖ Perform arithmetic or logical operations
3. Functions of Control Unit
The control unit performs two tasks:
❖ Sequencing: The control unit causes the CPU to step through a series of micro-
operations in proper sequence based on the program being executed.
❖ Execution: The control unit causes each micro-operation to be performed.
Microprogram Sequencer
❖ The basic components of a microprogrammed control unit are the control
memory and the circuits that select the next address. The address selection part is
called a microprogram sequencer.
❖ The purpose of a microprogram sequencer is to present an address to the control
memory so that a microinstruction may be read and executed.
❖ The next-address logic of the sequencer determines the specific address source to
be loaded into the control address register. The choice of the address source is
guided by the next-address information bits that the sequencer receives from the
present microinstruction.
❖ Commercial sequencers include within the unit an internal register stack used for
temporary storage of addresses during microprogram looping and subroutine
calls. Some sequencers provide an output register which can function as the
address register for the control memory.
❖ The control memory is included in the diagram to show the interaction between
the sequencer and the memory attached to it.
❖ There are two multiplexers in the circuit.
❖ The first multiplexer selects an address from one of the four sources and routes it
into the CAR (Control Address Register).
❖ The second multiplexer tests the value of a selected status bit, and the result of
the test is applied to an input logic circuit.
❖ The output from CAR provides the address for the control memory.
❖ The content of CAR is incremented and applied to one of the multiplexer inputs
and to the SBR (Subroutine Register).
❖ The other three inputs come from the address field of the present
microinstruction, from the output of SBR, and from an
external source that maps the instruction.
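The choice among the four address sources can be sketched as follows. This is only a behavioural model under stated assumptions: the selection names JMP, CALL, RET, and MAP are illustrative labels for the four cases, and the control memory is taken to have 128 words so that CAR wraps at 7 bits.

```python
from dataclasses import dataclass


@dataclass
class Sequencer:
    car: int = 0  # control address register
    sbr: int = 0  # subroutine register

    def step(self, select: str, condition: bool, ad: int = 0, mapped: int = 0) -> int:
        """Load CAR from one of four sources and return the new address."""
        incremented = (self.car + 1) % 128  # source 1: incremented CAR
        if select == "JMP":      # source 2: address field, if the tested bit is 1
            self.car = ad if condition else incremented
        elif select == "CALL":   # as JMP, but also save the return address in SBR
            if condition:
                self.sbr = incremented
                self.car = ad
            else:
                self.car = incremented
        elif select == "RET":    # source 3: subroutine register
            self.car = self.sbr
        elif select == "MAP":    # source 4: external mapping of the instruction
            self.car = mapped
        else:
            raise ValueError(f"unknown selection {select!r}")
        return self.car


seq = Sequencer(car=10)
print(seq.step("JMP", condition=False, ad=50))   # condition not met: increment
print(seq.step("CALL", condition=True, ad=64))   # branch, return address saved
print(seq.step("RET", condition=True))           # back to the saved address
```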
Unit V – Central Processing Unit
Introduction to CPU
The part of the computer that performs the bulk of data-processing operations is called
the central processing unit and is referred to as the CPU.
The CPU is made up of three major parts, as shown in Fig.
➢ The register set stores intermediate data used during the execution of the
instructions.
➢ The arithmetic logic unit (ALU) performs the required micro operations for
executing the instructions.
➢ The control unit supervises the transfer of information among the registers and
instructs the ALU as to which operation to perform.
Because all of the registers need these operations, it is necessary to provide a common unit
(the ALU) that can perform all the arithmetic, logic, and shift micro operations in the processor.
In the organization considered here, the CPU has seven general registers.
The register organization shows how registers are selected and how data flow between
the registers and the ALU.
The control unit that operates the CPU bus system directs the information flow through
the registers and ALU by selecting the various components in the system. For example, to
perform the operation
R1 <- R2 + R3
the control must provide binary selection variables to the following selector inputs:
1. MUX A selector (SELA): to place the content of R2 into bus A.
2. MUX B selector (SELB): to place the content of R3 into bus B.
3. ALU operation selector (OPR): to provide the arithmetic addition A + B.
4. Decoder destination selector (SELD): to transfer the content of the output bus into
R1.
Control Word
There are 14 binary selection inputs in the unit, and their combined value specifies a
control word.
1. The three bits of SELA select a source register for the A input of the ALU.
2. The three bits of SELB select a source register for the B input of the ALU.
3. The three bits of SELD select a destination register using the decoder.
4. The five bits of OPR select the operation to be performed by the ALU (3 + 3 + 3 + 5 = 14).
A control word of 14 bits is needed to specify a micro operation in the CPU. The control
word for a given micro operation can be derived from the selection variables.
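A small sketch of how a control word for R1 <- R2 + R3 might be assembled is given here. It rests on several assumptions that are not stated in the text: the 14 bits are split as 3 + 3 + 3 + 5 in the order SELA, SELB, SELD, OPR; register Rn is selected by the binary value n; and the OPR code used for addition is a made-up placeholder.

```python
ADD = 0b00010  # hypothetical OPR code for A + B, chosen only for illustration


def control_word(sela: int, selb: int, seld: int, opr: int) -> int:
    """Pack the four selection fields into one 14-bit control word."""
    for name, value, width in (("SELA", sela, 3), ("SELB", selb, 3),
                               ("SELD", seld, 3), ("OPR", opr, 5)):
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    return (sela << 11) | (selb << 8) | (seld << 5) | opr


# R1 <- R2 + R3: bus A from R2, bus B from R3, destination R1, operation ADD.
word = control_word(sela=2, selb=3, seld=1, opr=ADD)
print(f"{word:014b}")  # 010 011 001 00010 written without the spaces
```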
Register Stack
A stack can be placed in a portion of a large memory or it can be organized as a collection
of a finite number of memory words or registers. Figure 8-3 shows the organization of a
64-word register stack.
The stack pointer register SP contains a binary number whose value is equal to the
address of the word that is currently on top of the stack. Three items are placed in the
stack: A, B, and C, in that order. Item C is on top of the stack so that the content of SP is
now 3.
To remove the top item, the stack is popped by reading the memory word at address 3
and decrementing the content of SP. Item B is now on top of the stack since SP holds
address 2.
To insert a new item, the stack is pushed by incrementing SP and writing a word in the
next-higher location in the stack.
Initially, SP is cleared to 0, EMTY is set to 1, and FULL is cleared to 0, so that SP points to
the word at address 0 and the stack is marked empty and not full. If the stack is not full (if
FULL = 0), a new item is inserted with a push operation. The push operation is
implemented with the following sequence of micro operations:
SP <- SP + 1      Increment stack pointer
M[SP] <- DR       Write item on top of the stack
An item is deleted from the stack if the stack is not empty (if EMTY = 0). The pop
operation consists of the following sequence of micro operations:
DR <- M[SP]       Read item from the top of stack
SP <- SP - 1      Decrement stack pointer
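The push and pop sequences can be modelled with a short class. The text gives only the two micro operations for each; the way FULL and EMTY are updated here (FULL is set when SP wraps around to 0 after a push, EMTY is set when SP returns to 0 after a pop) is an assumption about how a 64-word stack with a 6-bit pointer would behave.

```python
class RegisterStack:
    """Behavioural sketch of a 64-word register stack with FULL and EMTY flags."""

    def __init__(self, size: int = 64):
        self.words = [0] * size
        self.sp = 0
        self.full = False
        self.emty = True

    def push(self, dr):
        if self.full:
            raise OverflowError("stack is full")
        self.sp = (self.sp + 1) % len(self.words)  # SP <- SP + 1 (wraps at size)
        self.words[self.sp] = dr                   # M[SP] <- DR
        self.full = self.sp == 0                   # assumed flag update
        self.emty = False

    def pop(self):
        if self.emty:
            raise IndexError("stack is empty")
        dr = self.words[self.sp]                   # DR <- M[SP]
        self.sp = (self.sp - 1) % len(self.words)  # SP <- SP - 1
        self.emty = self.sp == 0                   # assumed flag update
        self.full = False
        return dr


stack = RegisterStack()
for item in ("A", "B", "C"):
    stack.push(item)
print(stack.sp)     # 3, item C is on top
print(stack.pop())  # C
print(stack.sp)     # 2, item B is now on top
```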
Memory Stack
A stack can be implemented in a random-access memory attached to a CPU.
The implementation of a stack in the CPU is done by assigning a portion of memory to a
stack operation and using a processor register as a stack pointer.
The portion of computer memory is partitioned into three segments: program, data, and
stack.
The program counter PC points at the address of the next instruction in the program. The
address register AR points at an array of data. The stack pointer SP points at the top of
the stack.
PC is used during the fetch phase to read an instruction. AR is used during the execute
phase to read an operand. SP is used to push items onto or pop items from the stack.
As shown in Fig., the initial value of SP is 4001 and the stack grows with decreasing
addresses. Thus the first item stored in the stack is at address 4000, the second item is
stored at address 3999, and the last address that can be used for the stack is 3000.
We assume that the items in the stack communicate with a data register DR. A new item
is inserted with the push operation as follows:
SP <- SP - 1
M[SP] <- DR
The stack pointer is decremented so that it points at the address of the next word. A
memory write operation inserts the word from DR into the top of the stack.
A new item is deleted with a pop operation as follows:
DR <- M[SP]
SP <- SP + 1
The top item is read from the stack into DR. The stack pointer is then incremented to
point at the next item in the stack.
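A minimal sketch of this downward-growing stack follows, using the addresses from the example (SP starts at 4001, lowest usable address 3000). The overflow and underflow checks are an addition for safety; the text does not say how the limits are enforced.

```python
class MemoryStack:
    """Sketch of a stack kept in main memory that grows toward lower addresses."""

    def __init__(self, initial_sp: int = 4001, lowest: int = 3000):
        self.memory = {}            # address -> word, standing in for RAM
        self.initial_sp = initial_sp
        self.lowest = lowest
        self.sp = initial_sp

    def push(self, dr):
        if self.sp - 1 < self.lowest:
            raise OverflowError("stack segment exhausted")  # assumed check
        self.sp -= 1                # SP <- SP - 1
        self.memory[self.sp] = dr   # M[SP] <- DR

    def pop(self):
        if self.sp >= self.initial_sp:
            raise IndexError("stack is empty")              # assumed check
        dr = self.memory[self.sp]   # DR <- M[SP]
        self.sp += 1                # SP <- SP + 1
        return dr


stack = MemoryStack()
stack.push(10)
stack.push(20)
print(stack.sp)     # 3999
print(stack.pop())  # 20
print(stack.sp)     # 4000
```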
Program Interrupt
Program interrupt refers to the transfer of program control from a currently running
program to another service program as a result of an external or internal generated
request.
Control returns to the original program after the service program is executed.
The interrupt procedure is, in principle, quite similar to a subroutine call except for three
variations:
(1) The interrupt is usually initiated by an internal or external signal rather than from
the execution of an instruction (except for software interrupt);
(2) the address of the interrupt service program is determined by the hardware
rather than from the address field of an instruction; and
(3) an interrupt procedure usually stores all the information necessary to define the state of
the CPU rather than storing only the program counter.
Types of Interrupts
There are three major types of interrupts that cause a break in the normal execution of a
program. They can be classified as:
1. External interrupts
2. Internal interrupts
3. Software interrupts
External Interrupts
External interrupts come from input/output (I/O) devices, from a timing device, from a
circuit monitoring the power supply, or from any other external source.
Examples that cause external interrupts are an I/O device requesting transfer of data, an I/O
device finishing transfer of data, elapsed time of an event, or power failure.
Timeout interrupt may result from a program that is in an endless loop and thus
exceeded its time allocation.
Power failure interrupt may have as its service routine a program that transfers the
complete state of the CPU into a nondestructive memory in the few milliseconds before
power ceases.
Internal Interrupts
Internal interrupts arise from illegal or erroneous use of an instruction or data. Internal
interrupts are also called traps.
Examples of interrupts caused by internal error conditions are register overflow, attempt
to divide by zero, an invalid operation code, stack overflow, and protection violation.
These error conditions usually occur as a result of a premature termination of the
instruction execution. The service program that processes the internal interrupt
determines the corrective measure to be taken.
Software Interrupts
External and internal interrupts are initiated from signals that occur in the hardware of
the CPU. A software interrupt is initiated by executing an instruction.
Software interrupt is a special call instruction that behaves like an interrupt rather than a
subroutine call.
The most common use of software interrupt is associated with a supervisor call
instruction. This instruction provides means for switching from a CPU user mode to the
supervisor mode. Certain operations in the computer may be assigned to the supervisor
mode only, as for example, a complex input or output transfer procedure.
A program written by a user must run in the user mode. When an input or output transfer
is required, the supervisor mode is requested by means of a supervisor call instruction.
This instruction causes a software interrupt.
Instruction Format
A computer performs tasks on the basis of the instructions it is given. An instruction format
defines the layout of the bits of an instruction.
An instruction is divided into groups of bits called fields. Because everything in a computer
is encoded in 0s and 1s, each field carries a different kind of information, and the CPU uses
the fields to decide what to perform.
The most common fields are:
• Operation field, which specifies the operation to be performed, such as addition.
• Address field, which contains the location of the operand, i.e., a register or memory
location.
• Mode field, which specifies how the operand is to be found.
3. Stack organization
Computers with stack organization would have PUSH and POP instructions which require
an address field. Thus the instruction
PUSH X
will push the word at address X to the top of the stack. The stack pointer is updated
automatically.
Operation-type instructions do not need an address field in stack-organized computers.
This is because the operation is performed on the two items that are on top of the stack.
The instruction
ADD
in a stack computer consists of an operation code only with no address field. This
operation has the effect of popping the two top numbers from the stack, adding the
numbers, and pushing the sum into the stack. There is no need to specify operands with
an address field since all operands are implied to be in the stack.
Instruction Format
To illustrate the influence of the number of addresses on computer programs, we will
evaluate the arithmetic statement
X = (A + B) * (C + D)
using zero, one, two, or three address instructions. We will use the symbols ADD, SUB,
MUL, and DIV for the four arithmetic operations; MOV for the transfer-type operation;
and LOAD and STORE for transfers to and from memory and AC register. We will assume
that the operands are in memory addresses A, B, C, and D, and the result must be stored
in memory at address X.
1. Three Address Instructions
Computers with three-address instruction formats can use each address field to specify
either a processor register or a memory operand.
The program in assembly language that evaluates X = (A + B) * (C + D) is shown below,
together with comments that explain the register transfer operation of each instruction.
ADD R1, A, B      R1 <- M[A] + M[B]
ADD R2, C, D      R2 <- M[C] + M[D]
MUL X, R1, R2     M[X] <- R1 * R2
It is assumed that the computer has two processor registers, R1 and R2. The symbol M[A]
denotes the operand at memory address symbolized by A.
The advantage of the three-address format is that it results in short programs when
evaluating arithmetic expressions. The disadvantage is that the binary-coded instructions
require too many bits to specify three addresses.
2. Two-Address Instructions
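Two-address instructions are the most common in commercial computers. Here again
each address field can specify either a processor register or a memory word. The program
to evaluate X = (A + B) * (C + D) is as follows:
MOV R1, A         R1 <- M[A]
ADD R1, B         R1 <- R1 + M[B]
MOV R2, C         R2 <- M[C]
ADD R2, D         R2 <- R2 + M[D]
MUL R1, R2        R1 <- R1 * R2
MOV X, R1         M[X] <- R1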
The MOV instruction moves or transfers the operands to and from memory and
processor registers. The first symbol listed in an instruction is assumed to be both a
source and the destination where the result of the operation is transferred.
3. One-Address Instructions
One-address instructions use an accumulator (AC) register for all data manipulation. For
multiplication and division there is a need for a second register. However, here we will
neglect the second register and assume that the AC contains the result of all operations.
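The program to evaluate X = (A + B) * (C + D) is:
LOAD A            AC <- M[A]
ADD B             AC <- AC + M[B]
STORE T           M[T] <- AC
LOAD C            AC <- M[C]
ADD D             AC <- AC + M[D]
MUL T             AC <- AC * M[T]
STORE X           M[X] <- AC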
All operations are done between the AC register and a memory operand. T is the address
of a temporary memory location required for storing the intermediate result.
4. Zero-Address Instructions
A stack-organized computer does not use an address field for the instructions ADD and
MUL. The PUSH and POP instructions, however, need an address field to specify the
operand that communicates with the stack.
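The following program shows how X = (A + B) * (C + D) will be written for a stack
organized computer. (TOS stands for top of stack.)
PUSH A            TOS <- A
PUSH B            TOS <- B
ADD               TOS <- (A + B)
PUSH C            TOS <- C
PUSH D            TOS <- D
ADD               TOS <- (C + D)
MUL               TOS <- (C + D) * (A + B)
POP X             M[X] <- TOS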
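To see the zero-address idea in action, the sketch below interprets a small PUSH / ADD / MUL / POP program for X = (A + B) * (C + D) on a simulated stack. The interpreter, the way instructions are written as strings, and the sample operand values are all illustrative assumptions.

```python
def run(program, memory):
    """Execute a stack-machine program; PUSH and POP carry an address, ADD and MUL do not."""
    stack = []
    for instruction in program:
        op, *operand = instruction.split()
        if op == "PUSH":
            stack.append(memory[operand[0]])   # TOS <- M[address]
        elif op == "POP":
            memory[operand[0]] = stack.pop()   # M[address] <- TOS
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()    # operands are implied: top two items
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown instruction {instruction!r}")
    return memory


program = ["PUSH A", "PUSH B", "ADD", "PUSH C", "PUSH D", "ADD", "MUL", "POP X"]
memory = {"A": 2, "B": 3, "C": 4, "D": 5}  # sample values, chosen arbitrarily
run(program, memory)
print(memory["X"])  # (2 + 3) * (4 + 5) = 45
```

Only PUSH and POP name a memory address; ADD and MUL find their operands on the stack, which is why they need no address field.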
The load instructions transfer the operands from memory to CPU registers. The add and
multiply operations are executed with data in the registers without accessing memory.
The result of the computations is then stored in memory with a store instruction.
Addressing Modes
An example of an instruction format with a distinct addressing mode field is shown in Fig.
8-6.
The operation code specifies the operation to be performed. The mode field is used to
locate the operands needed for the operation.
There may or may not be an address field in the instruction. If there is an address field, it
may designate a memory address or a processor register. Moreover, the instruction may
have more than one address field, and each address field may be associated with its own
particular addressing mode.
1. Implied Mode
Implied addressing mode, also known as "implicit" or "inherent" addressing mode, is the
addressing mode in which no operand (register, memory location, or data) is specified
in the instruction. In this mode the operand is specified implicitly in the definition of
the instruction.
As an example: The instruction: “Complement Accumulator” is an Implied Mode
instruction because the operand in the accumulator register is implied in the definition of
instruction. In assembly language it is written as:
All register reference instructions that use an accumulator are implied-mode instructions.
Zero-address instructions in a stack-organized computer are implied-mode instructions
since the operands are implied to be on top of the stack.
2. Immediate Mode
In this mode the operand is specified in the instruction itself. In other words, an
immediate-mode instruction has an operand field rather than an address field.
The operand field contains the actual operand to be used in conjunction with the
operation specified in the instruction. Immediate-mode instructions are useful for
initializing registers to a constant value.
The address field of an instruction may specify either a memory word or a processor
register. When the address field specifies a processor register, the instruction is said to be
in the register mode.
3. Register Mode
In this mode the operands are in registers that reside within the CPU. The particular
register is selected from a register field in the instruction.
Data transfer instructions cause transfer of data from one location to another without
changing the binary information content.
Data manipulation instructions are those that perform arithmetic, logic, and shift
operations.
Program control instructions provide decision-making capabilities and change the path
taken by the program when executed in the computer.
The instruction set of a particular computer determines the register transfer operations
and control decisions that are available to the user.
1. Arithmetic Instructions
The four basic arithmetic operations are addition, subtraction, multiplication, and
division. Most computers provide instructions for all four operations. Some small
computers have only addition and possibly subtraction instructions.
The increment instruction adds 1 to the value stored in a register or memory word. One
common characteristic of the increment operations when executed in processor registers
is that a binary number of all 1's when incremented produces a result of all 0's.
The decrement instruction subtracts 1 from a value stored in a register or memory word.
A number with all 0's, when decremented, produces a number with all 1's.
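The wrap-around behaviour of increment and decrement can be checked with a short sketch. An 8-bit register width is assumed here purely for illustration.

```python
def increment(value: int, bits: int = 8) -> int:
    """Add 1 and keep only the low `bits` bits, as a fixed-width register would."""
    return (value + 1) & ((1 << bits) - 1)


def decrement(value: int, bits: int = 8) -> int:
    """Subtract 1 and keep only the low `bits` bits."""
    return (value - 1) & ((1 << bits) - 1)


print(f"{increment(0b11111111):08b}")  # all 1's -> 00000000
print(f"{decrement(0b00000000):08b}")  # all 0's -> 11111111
```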
2. Logical and Bit Manipulation Instructions
Logical instructions perform binary operations on strings of bits stored in registers. They
are useful for manipulating individual bits or a group of bits that represent binary-coded
information. The logical instructions consider each bit of the operand separately and
treat it as a Boolean variable.
3. Shift Instructions
Shifts are operations in which the bits of a word are moved to the left or right.
The bit shifted in at the end of the word determines the type of shift used. Shift
instructions may specify logical shifts, arithmetic shifts, or rotate-type operations.
In each case the shift may be to the right or to the left.
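The difference between the shift types lies in which bit enters the vacated position. The sketch below shows this for an 8-bit word; the word width and function names are assumptions made for the example.

```python
BITS = 8
MASK = (1 << BITS) - 1
SIGN = 1 << (BITS - 1)


def logical_shift_right(value: int) -> int:
    return value >> 1                              # 0 enters at the left


def logical_shift_left(value: int) -> int:
    return (value << 1) & MASK                     # 0 enters at the right


def arithmetic_shift_right(value: int) -> int:
    return (value >> 1) | (value & SIGN)           # sign bit is preserved


def rotate_right(value: int) -> int:
    return (value >> 1) | ((value & 1) << (BITS - 1))  # LSB wraps to the left end


def rotate_left(value: int) -> int:
    return ((value << 1) & MASK) | (value >> (BITS - 1))  # MSB wraps to the right end


word = 0b10010110
for shift in (logical_shift_right, logical_shift_left,
              arithmetic_shift_right, rotate_right, rotate_left):
    print(f"{shift.__name__:24s} {word:08b} -> {shift(word):08b}")
```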
Program Control Instructions
Instructions are always stored in successive memory locations. When processed in the
CPU, the instructions are fetched from consecutive memory locations and executed.
Each time an instruction is fetched from memory, the program counter is incremented so
that it contains the address of the next instruction in sequence.
After the execution of a data transfer or data manipulation instruction, control returns to
the fetch cycle with the program counter containing the address of the instruction next in
sequence.
Program control instructions specify conditions for altering the content of the program
counter, while data transfer and manipulation instructions specify conditions for data-
processing operations. The change in value of the program counter as a result of the
execution of a program control instruction causes a break in the sequence of instruction
execution.
This is an important feature in digital computers, as it provides control over the flow of
program execution and a capability for branching to different program segments.
The branch and jump instructions are used interchangeably to mean the same thing, but
sometimes they are used to denote different addressing modes.
The branch is usually a one-address instruction. It is written in assembly language as BR
ADR, where ADR is a symbolic name for an address. When executed, the branch
instruction causes a transfer of the value of ADR into the program counter. Since the
program counter contains the address of the instruction to be executed, the next
instruction will come from location ADR.
Each mnemonic is constructed with the letter B (for branch) and an abbreviation of the
condition name. When the opposite condition state is used, the letter N (for no) is
inserted to define the 0 state.
Thus BC is Branch on Carry, and BNC is Branch on No Carry. If the stated condition is true,
program control is transferred to the address specified by the instruction. If not, control
continues with the instruction that follows. The conditional instructions can be associated
also with the jump, skip, call, or return type of program control instructions.
Subroutine Call and Return
A subroutine is a self-contained sequence of instructions that performs a given
computational task. During the execution of a program, a subroutine may be called to
perform its function many times at various points in the main program.
Each time a subroutine is called, a branch is executed to the beginning of the subroutine
to start executing its set of instructions. After the subroutine has been executed, a branch
is made back to the main program.
The instruction that transfers program control to a subroutine is known by different
names. The most common names used are call subroutine, jump to subroutine, branch
to subroutine, or branch and save address.
A call subroutine instruction consists of an operation code together with an address that
specifies the beginning of the subroutine. The instruction is executed by performing two
operations:
(1) the address of the next instruction available in the program counter (the return
address) is stored in a temporary location so the subroutine knows where to
return, and
(2) control is transferred to the beginning of the subroutine.
The last instruction of every subroutine, commonly called return from subroutine,
transfers the return address from the temporary location into the program counter.
This results in a transfer of program control to the instruction whose address was
originally stored in the temporary location.
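The two steps of a call and the matching return can be traced with a toy program. In this sketch the "temporary location" is modelled as a stack of return addresses, which is one common choice but an assumption here; the instruction names NOP, CALL, RET, and HALT and the addresses are illustrative.

```python
def run(program, start=0):
    """Run a toy program and return the list of addresses that were executed."""
    pc = start
    return_addresses = []  # assumed temporary storage for return addresses
    trace = []
    while True:
        op, *arg = program[pc]
        trace.append(pc)
        pc += 1                          # PC now holds the next instruction's address
        if op == "CALL":
            return_addresses.append(pc)  # (1) save the return address
            pc = arg[0]                  # (2) transfer control to the subroutine
        elif op == "RET":
            pc = return_addresses.pop()  # restore the return address into PC
        elif op == "HALT":
            return trace
        # any other instruction (NOP here) simply falls through to the next one


program = {
    0: ("NOP",),
    1: ("CALL", 10),   # first call
    2: ("CALL", 10),   # the same subroutine called again from another point
    3: ("HALT",),
    10: ("NOP",),      # subroutine body
    11: ("RET",),
}
print(run(program))  # [0, 1, 10, 11, 2, 10, 11, 3]
```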
RISC Characteristics
The concept of RISC architecture involves an attempt to reduce execution time by
simplifying the instruction set of the computer. The major characteristics of a RISC
processor are:
1. Relatively few instructions
2. Relatively few addressing modes
3. Memory access limited to load and store instructions
4. All operations done within the registers of the CPU
5. Fixed-length, easily decoded instruction format
UNIT VI – Pipeline, Vector Processing & Microprocessors
Parallel Processing
• Parallel processing is a term used to denote a large class of techniques that are
used to provide simultaneous data-processing tasks for the purpose of increasing
the computational speed of a computer system.
• Instead of processing each instruction sequentially as in a conventional computer,
a parallel processing system is able to perform concurrent data processing to
achieve faster execution time.
• For example, while an instruction is being executed in the ALU, the next
instruction can be read from memory. The system may have two or more ALUs
and be able to execute two or more instructions at the same time.
• Furthermore, the system may have two or more processors operating
concurrently. The purpose of parallel processing is to speed up the computer
processing capability and increase its throughput, that is, the amount of
processing that can be accomplished during a given interval of time.
• The amount of hardware increases with parallel processing, and with it, the cost
of the system increases. However, technological developments have reduced
hardware costs to the point where parallel processing techniques are
economically feasible.
• Parallel processing is established by distributing the data among the multiple
functional units. For example, the arithmetic, logic, and shift operations can be
separated into three units and the operands diverted to each unit under the
supervision of a control unit.
• The adder and integer multiplier perform the arithmetic operations with integer
numbers.
• The floating-point operations are separated into three circuits operating in
parallel.
• The logic, shift, and increment operations can be performed concurrently on
different data.
• All units are independent of each other, so one number can be shifted while
another number is being incremented.
• A multifunctional organization is usually associated with a complex control unit to
coordinate all the activities among the various components.
Pipelining
• Pipelining is a technique of decomposing a sequential process into sub-
operations, with each sub-process being executed in a special dedicated segment
that operates concurrently with all other segments.
• A pipeline can be visualized as a collection of processing segments through which
binary information flows.
• Each segment performs partial processing dictated by the way the task is
partitioned. The result obtained from the computation in each segment is
transferred to the next segment in the pipeline. The final result is obtained after
the data have passed through all segments.
• It is characteristic of pipelines that several computations can be in progress in
distinct segments at the same time.
• The overlapping of computation is made possible by associating a register with
each segment in the pipeline. The registers provide isolation between each
segment so that each can operate on distinct data simultaneously.
Pipelining Example
The pipeline organization will be demonstrated by means of a simple example. Suppose
that we want to perform the combined multiply and add operations with a stream of
numbers.
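One way to make this example concrete is to assume the operation is A[i] * B[i] + C[i] for every element of the stream, split into three segments separated by registers. The operation itself, the three-way split, and the register names r1 to r5 are illustrative assumptions; the notes only say "combined multiply and add". A minimal sketch:

```python
def multiply_add_pipeline(a, b, c):
    """Simulate a 3-segment pipeline computing a[i] * b[i] + c[i].

    Segment 1 latches a[i] and b[i] into r1 and r2.
    Segment 2 latches the product into r3 and c[i] into r4.
    Segment 3 latches the sum r3 + r4 into r5.
    """
    n = len(a)
    r1 = r2 = r3 = r4 = r5 = None  # all segment registers start empty
    results = []
    # n inputs need n + 2 clock pulses to pass through 3 segments.
    for clock in range(n + 2):
        # Update from the last segment back to the first, so each segment
        # reads what its predecessor latched on the previous clock pulse.
        if r3 is not None:
            r5 = r3 + r4
            results.append(r5)
        if r1 is not None:
            r3, r4 = r1 * r2, c[clock - 1]
        else:
            r3 = r4 = None
        if clock < n:
            r1, r2 = a[clock], b[clock]
        else:
            r1 = r2 = None
    return results


print(multiply_add_pipeline([1, 2, 3], [4, 5, 6], [7, 8, 9]))
```

Once the pipeline is full, one result leaves segment 3 on every clock pulse, even though each individual result still takes three pulses to compute.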
The figure shows the operation of a 4-segment instruction pipeline. The four segments
are:
1. FI: segment 1 that fetches the instruction.
2. DA: segment 2 that decodes the instruction and
calculates the effective address.
3. FO: segment 3 that fetches the operands.
4. EX: segment 4 that executes the instruction.
The space-time diagram for the 4-segment instruction pipeline is given below:
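A space-time diagram of this kind can be generated with a short sketch. It assumes ideal conditions that the notes do not state explicitly: every segment takes exactly one clock cycle and no hazards or branches disturb the flow. The function names are hypothetical.

```python
SEGMENTS = ["FI", "DA", "FO", "EX"]


def space_time_diagram(num_instructions, segments=SEGMENTS):
    """Return one row per instruction; each column is one clock cycle."""
    k = len(segments)
    # With k segments, n instructions finish in n + k - 1 cycles.
    total_cycles = num_instructions + k - 1
    rows = []
    for i in range(num_instructions):
        row = ["--"] * total_cycles
        for s, name in enumerate(segments):
            # Instruction i enters segment s at cycle i + s (0-based).
            row[i + s] = name
        rows.append(row)
    return rows


def format_diagram(rows):
    """Render the rows as text with a header of cycle numbers."""
    header = "step " + " ".join(f"{c:>2}" for c in range(1, len(rows[0]) + 1))
    lines = [header]
    for i, row in enumerate(rows, start=1):
        lines.append(f"I{i:<4}" + " ".join(row))
    return "\n".join(lines)


print(format_diagram(space_time_diagram(4)))
```

Reading down any column shows which instruction occupies each segment during that clock cycle; reading across a row shows one instruction moving through FI, DA, FO and EX.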
Pipeline Conflicts (Hazards)
A pipeline hazard occurs when some operational condition prevents the instruction
pipeline from continuing its normal execution at some phase. In general, there are
three major difficulties that cause the instruction pipeline to deviate from its normal
operation.
1. Resource conflicts are caused by access to memory by two segments at the same
time. Most of these conflicts can be resolved by using separate instruction and
data memories.
2. Data dependency conflicts arise when an instruction depends on the result of a
previous instruction, but this result is not yet available.
3. Branch difficulties arise from branch and other instructions that change the value
of PC.
Data Dependency
It arises when an instruction depends on the result of a previous instruction, but that
result is not yet available.
For example, an instruction in the operand-fetch segment (FO) may need to fetch an
operand that is being generated at the same time by the previous instruction in the
execute segment (EX).
The most common techniques used to resolve data hazards are:
(a) Hardware interlock - A hardware interlock is a circuit that detects instructions
whose source operands are destinations of instructions farther up in the pipeline.
It then delays the execution of such instructions by enough clock cycles to
resolve the conflict.
(b) Operand forwarding - This method uses special hardware to detect conflicts in
instruction execution and then avoids them by routing the data through a special
path between pipeline segments. For example, instead of transferring an ALU result
only into a destination register, the hardware checks the destination operand, and
if it is needed as a source in the next instruction, it passes the result directly
into the ALU input, bypassing the register.
(c) Delayed load - This is a software solution in which the compiler detects the
conflicts and re-orders the instructions as necessary to delay the loading of the
conflicting data, inserting no-operation instructions where needed.
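The detection step shared by these techniques can be sketched in a few lines. The sketch below assumes a deliberately simple model that is not part of the notes: an instruction conflicts only with the instruction immediately before it, and one cycle of delay is enough to clear the conflict. The `Instr` type, the register names and the function names are hypothetical, and the delayed-load function only inserts no-operation instructions; it does not attempt the re-ordering a real compiler would try first.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    op: str
    dest: Optional[str]    # register written by the instruction, if any
    srcs: Tuple[str, ...]  # registers or memory names read by it


NOP = Instr("NOP", None, ())


def depends_on_previous(prev, cur):
    """True when cur reads the register that prev is still producing."""
    return prev.dest is not None and prev.dest in cur.srcs


def interlock_stalls(program):
    """Count the stall cycles a hardware interlock would insert,
    assuming one stall per adjacent dependency."""
    return sum(depends_on_previous(p, c) for p, c in zip(program, program[1:]))


def insert_nops(program):
    """Delayed-load style fix: place a NOP in front of each dependent
    instruction so the hardware never sees the conflict."""
    out = []
    for instr in program:
        if out and depends_on_previous(out[-1], instr):
            out.append(NOP)
        out.append(instr)
    return out


program = [
    Instr("LOAD", "R1", ("M1",)),
    Instr("ADD", "R2", ("R1", "R3")),  # needs R1 from the LOAD just before it
    Instr("SUB", "R4", ("R5", "R6")),
]
print(interlock_stalls(program))
print([i.op for i in insert_nops(program)])
```

Operand forwarding would use the same dependency check but, instead of delaying the dependent instruction, would route the result straight to the ALU input.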
Vector Processing
• Vector processing is a procedure for speeding the processing of information by a
computer, in which pipelined units perform arithmetic operations on uniform,
linear arrays of data values, and a single instruction involves the execution of the
same operation on every element of the array.
• There is a class of computational problems that are beyond the capabilities of a
conventional computer. These problems are characterized by the fact that they
require a vast number of computations that will take a conventional computer
days or even weeks to complete.
• In many science and engineering applications, the problems can be formulated in
terms of vectors and matrices that lend themselves to vector processing.
• To achieve the required level of high performance it is necessary to utilize the
fastest and most reliable hardware and apply innovative procedures from vector
and parallel processing techniques.
❖ Consider a program that adds two vectors A and B of length 100 to produce a vector
C. On a conventional computer this is written as a loop that adds one pair of
elements per iteration.
❖ A computer capable of vector processing eliminates the overhead associated with
the time it takes to fetch and execute the instructions in the program loop. It
allows operations to be specified with a single vector instruction of the form
C(1 : 100) = A(1 : 100) + B(1 : 100)
❖ The vector instruction includes the initial address of the operands, the length of
the vectors, and the operation to be performed, all in one composite instruction.
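The idea of a composite vector instruction can be illustrated with a small software model. The field names (`base_a`, `base_b`, `base_c`, `length`) and the flat `memory` list are assumptions made for this sketch; a real vector processor would stream the elements through pipelined hardware rather than run a Python loop.

```python
import operator
from dataclasses import dataclass
from typing import Callable


@dataclass
class VectorInstruction:
    operation: Callable[[int, int], int]  # operation to perform
    base_a: int                           # initial address of operand A
    base_b: int                           # initial address of operand B
    base_c: int                           # initial address of result C
    length: int                           # number of elements


def execute(instr, memory):
    """Apply the operation to every element pair. The loop stands in for
    the hardware; the program itself issues only one instruction."""
    for offset in range(instr.length):
        memory[instr.base_c + offset] = instr.operation(
            memory[instr.base_a + offset], memory[instr.base_b + offset]
        )


# A occupies addresses 0-99, B occupies 100-199, C occupies 200-299.
memory = list(range(100)) + list(range(100, 200)) + [0] * 100
execute(VectorInstruction(operator.add, 0, 100, 200, 100), memory)
print(memory[200:205])
```

The program fetches and decodes a single instruction instead of repeating the loop-control instructions 100 times, which is where the saving described above comes from.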
Matrix Multiplication
Matrix multiplication is one of the most computationally intensive operations performed
in computers with vector processors. An n x m matrix of numbers has n rows and m
columns and may be considered as constituting a set of n row vectors or a set of m
column vectors. Consider, for example, the multiplication of two 3 x 3 matrices A and B.
Each element of the product matrix C is formed from one row of A and one column of B:
c_ij = a_i1 b_1j + a_i2 b_2j + a_i3 b_3j
For example, the number in the first row and first column of matrix C is calculated by
letting i = 1, j = 1, to obtain
c_11 = a_11 b_11 + a_12 b_21 + a_13 b_31
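The row-by-column rule can be written out as a short sketch in which every element of C is one inner product. The function names and the sample matrices are hypothetical and chosen only for illustration.

```python
def inner_product(row, column):
    """Sum of the element-by-element products of two equal-length vectors."""
    return sum(a * b for a, b in zip(row, column))


def matmul(a, b):
    """Multiply matrices given as lists of rows."""
    columns_of_b = list(zip(*b))  # transpose b to obtain its column vectors
    return [[inner_product(row, col) for col in columns_of_b] for row in a]


A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
B = [[9, 8, 7], [6, 5, 4], [3, 2, 1]]
C = matmul(A, B)
print(C[0][0])  # first row of A times first column of B
```

For two 3 x 3 matrices this evaluates nine inner products of three terms each, which is 27 multiply-add operations in total.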
Inner Product
In general, the inner product consists of the sum of k product terms of the form
C = A1 B1 + A2 B2 + A3 B3 + ... + Ak Bk
In a typical application k may be equal to 100 or even 1000. The inner product calculation
on a pipeline vector processor is shown below:
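One common way to organize this calculation on pipelined hardware is to keep several partial sums in flight at once. The sketch below assumes a four-segment adder pipeline, so a product entering the adder is combined with the partial sum that entered four cycles earlier; the four running sums are added together at the end. The segment count and the function name are assumptions, not details taken from the notes.

```python
def pipelined_inner_product(a, b, adder_segments=4):
    """Accumulate products into interleaved partial sums, then combine them.

    With 4 adder segments the partial sums are
        A1B1 + A5B5 + A9B9 + ...
        A2B2 + A6B6 + ...
        A3B3 + A7B7 + ...
        A4B4 + A8B8 + ...
    """
    partial = [0] * adder_segments
    for i, (x, y) in enumerate(zip(a, b)):
        partial[i % adder_segments] += x * y
    return sum(partial), partial


total, partial = pipelined_inner_product(list(range(1, 9)), [1] * 8)
print(total, partial)
```

Interleaving lets a new product enter the adder on every clock cycle instead of waiting for the previous addition to leave the pipeline.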
Arithmetic Pipeline
Pipeline arithmetic units are usually found in very high speed computers. They are used
to implement floating-point operations, multiplication of fixed-point numbers, and similar
computations encountered in scientific problems.
Let’s take an example of a pipeline unit for floating-point addition and subtraction. The
inputs to the floating-point adder pipeline are two normalized floating-point binary
numbers.
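The sub-operations of such an adder can be sketched as four steps: compare the exponents, align the mantissas, add the mantissas, and normalize the result. This four-way split is a common textbook division and is assumed here; the notes do not list the segments. Each number is modeled as a pair (mantissa, exponent) with value mantissa * 2**exponent, and the mantissa is held as an exact fraction so that the steps are easy to follow.

```python
from fractions import Fraction


def fp_add(x, y):
    """Add two numbers given as (mantissa, exponent) pairs, base 2."""
    (ma, ea), (mb, eb) = x, y

    # Segment 1: compare the exponents.
    diff = ea - eb
    exponent = max(ea, eb)

    # Segment 2: align the mantissa of the number with the smaller
    # exponent by shifting it right.
    if diff > 0:
        mb = mb / 2 ** diff
    elif diff < 0:
        ma = ma / 2 ** (-diff)

    # Segment 3: add the mantissas (a negative mantissa gives subtraction).
    m = ma + mb

    # Segment 4: normalize so that 1/2 <= |mantissa| < 1.
    if m == 0:
        return Fraction(0), 0
    while abs(m) >= 1:
        m /= 2
        exponent += 1
    while abs(m) < Fraction(1, 2):
        m *= 2
        exponent -= 1
    return m, exponent


# 0.75 * 2**3 (= 6) plus 0.5 * 2**2 (= 2)
print(fp_add((Fraction(3, 4), 3), (Fraction(1, 2), 2)))
```

In a hardware pipeline each of the four steps would be a separate segment with a register after it, so four different additions could be in progress at once.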
Multiprocessor System
• A multiprocessor is a computer system with two or more central processing units
(CPUs), with each one sharing the common main memory as well as the
peripherals. This helps in simultaneous processing of programs.
• The key objective of using a multiprocessor is to boost the system’s execution
speed, with other objectives being fault tolerance and application matching.
• A multiprocessor is regarded as a means to improve computing speeds,
performance and cost-effectiveness, as well as to provide enhanced availability
and reliability.
Characteristics:
❖ Consists of more than one CPU.
❖ Fast processing.
❖ Reliability.
❖ Cost-effective.
❖ Simultaneous processing of programs.
2. Multiport Memory
A multiport memory system employs separate buses between each memory module and
each CPU.
3. Crossbar Switch
A crossbar switch consists of a number of crosspoints that are placed at the
intersections between processor buses and memory module paths.