Unit - 1 1. Bus Construction Techniques 2. ALU

The document discusses the key components and operations of a computer's arithmetic logic unit (ALU). It describes how the ALU performs logical and mathematical operations on the contents of registers. Specific operations covered include addition, subtraction, incrementing, and decrementing. It also explains the basic instruction cycle of fetching, decoding, and executing instructions from memory.

Unit - 1

1. Bus construction techniques


2. ALU
 Instead of having individual registers perform the micro-operations, the
computer system provides a number of registers connected to a common unit
called the Arithmetic Logic Unit (ALU).
 The ALU is one of the most important units inside the CPU of a computer.
 All the logical and mathematical operations of the computer are performed here.
 The contents of specific registers are placed at the inputs of the ALU. The ALU
performs the given operation and then transfers the result to the destination
register.

3. Arithmetic micro operations


Some of the basic micro-operations are addition, subtraction, increment and
decrement.
 Add Micro-Operation:
It is defined by the following statement:
R3 ← R1 + R2
The above statement instructs that the contents of register R1 be added to the
contents of register R2 and the sum transferred to register R3.
 Subtract Micro-Operation:
Let us again take an example:
R3 ← R1 + R2′ + 1
In the subtract micro-operation, instead of using a minus operator we take the 1's
complement of the subtrahend and add 1 to it (i.e., its 2's complement),
i.e., R3 ← R1 − R2 is equivalent to R3 ← R1 + R2′ + 1
 Increment/Decrement Micro-Operation:
Increment and decrement micro-operations are performed by adding 1 to and
subtracting 1 from the register, respectively:
R1 ← R1 + 1
R1 ← R1 − 1
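The micro-operations above can be sketched in Python, modelling registers as entries in a dictionary; the 8-bit width and the sample values are assumptions for illustration:

```python
# Registers modelled as a dictionary; values are assumed examples.
R = {"R1": 9, "R2": 4, "R3": 0}
MASK = 0xFF                           # assumed 8-bit register width

R["R3"] = (R["R1"] + R["R2"]) & MASK  # add:       R3 <- R1 + R2
print(R["R3"])                        # 13

# subtract via 2's complement: R3 <- R1 + R2' + 1
R["R3"] = (R["R1"] + (~R["R2"] & MASK) + 1) & MASK
print(R["R3"])                        # 5, i.e. R1 - R2

R["R1"] = (R["R1"] + 1) & MASK        # increment: R1 <- R1 + 1
R["R1"] = (R["R1"] - 1) & MASK        # decrement: R1 <- R1 - 1
print(R["R1"])                        # 9
```

Note that the subtract line never uses a minus operator: adding the 1's complement of R2 plus 1 produces the same result modulo 2^8.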

4. Instruction cycle
 A program residing in the memory unit of the computer consists of a sequence
of instructions.
 The program is executed in the computer by going through a cycle for each
instruction.
 In the basic computer each instruction cycle consists of the following phases:
1. Fetch an instruction from memory.
2. Decode the instruction.
3. Read the effective address from memory if the instruction has
an indirect address.
4. Execute the instruction.
➢ Upon the completion of step 4, the control goes back to step 1 to fetch,
decode, and execute the next instruction.
 Each instruction cycle is in turn subdivided into a sequence of subcycles or
phases.

Fetch and Decode:


 Initially, the program counter PC is loaded with the address of the first
instruction in the program.
 The sequence counter SC is cleared to 0, providing a decoded timing signal T0.
 The micro-operations for the fetch and decode phases can be specified by the
following register transfer statements:
T0: AR ← PC
T1: IR ← M[AR], PC ← PC + 1
 To provide the data path for the transfer of PC to AR we must apply timing
signal T0 to achieve the following connections:
1. Place the content of PC onto the bus by making the bus selection inputs
S2, S1, S0 equal to 010.
2. Transfer the content of the bus to AR by enabling the LD input of AR.
 In order to implement the second statement it is necessary to use timing signal
T1 to provide the following connections in the bus system:
1. Enable the read input of memory.
2. Place the content of memory onto the bus by making S2S1S0 = 111.
3. Transfer the content of the bus to IR by enabling the LD input of IR.
4. Increment PC by enabling the INR input of PC.
 Multiple-input OR gates are included in the diagram because there are other
control functions that will initiate similar operations.
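The T0/T1 fetch sequence can be sketched in Python; memory is a short list of assumed instruction words, and the bus selection appears only as comments:

```python
# Assumed example memory contents; addresses index the list.
memory = [0x2104, 0x7001, 0x0000]
PC, AR, IR = 0, 0, 0

# T0: AR <- PC            (bus select S2S1S0 = 010, LD(AR) enabled)
AR = PC
# T1: IR <- M[AR], PC <- PC + 1
#                         (memory read, S2S1S0 = 111, LD(IR), INR(PC))
IR = memory[AR]
PC = PC + 1

print(hex(IR), PC)   # 0x2104 1
```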
5. Timing and control

 The timing for all registers in the basic computer is controlled by a master clock
generator.
 The clock pulses are applied to all flip-flops and registers in the system,
including the flip-flops and registers in the control unit.
 The clock pulses do not change the state of a register unless the register is
enabled by a control signal.
 The control signals are generated in the control unit and provide control inputs
for the multiplexers in the common bus, control inputs in processor registers,

and microoperations for the accumulator.


 There are two major types of control organization:
1. Hardwired control
2. Microprogrammed control

 The block diagram of the hardwired control unit is shown in Fig.


 It consists of two decoders, a sequence counter, and a number of control logic
gates.
 An instruction read from memory is placed in the instruction register (IR). It is
divided into three parts: The I bit, the operation code, and bits 0 through 11.
 The operation code in bits 12 through 14 is decoded with a 3 × 8 decoder. The
eight outputs of the decoder are designated by the symbols D0 through D7.
 Bit 15 of the instruction is transferred to a flip-flop designated by the symbol I.
 Bits 0 through 11 are applied to the control logic gates.
 The 4-bit sequence counter can count in binary from 0 through 15.
 The outputs of the counter are decoded into 16 timing signals T0 through
T15.
 The sequence counter SC can be incremented or cleared synchronously.
 The counter is incremented to provide the sequence of timing signals out of
the 4 x 16 decoder.
 As an example, consider the case where SC is incremented to provide timing
signals T0, T1, T2, T3 and T4 in sequence. At time T4, SC is cleared to 0 if
decoder output D3 is active.
 This is expressed symbolically by the statement
D3T4: SC ← 0
The timing diagram of Fig. 5-7 shows the time relationship of the control signals.
The sequence counter SC responds to the positive transition of the clock.
Initially, the CLR input of SC is active. The first positive transition of the clock
clears SC to 0, which in turn activates the timing signal T0 out of the decoder. T0 is
active during one clock cycle.
SC is incremented with every positive clock transition, unless its CLR input is
active. This produces the sequence of timing signals T0, T1, T2, T3, T4, and so on,
as shown in the diagram.
The last three waveforms in Fig. 5-7 show how SC is cleared when D3T4 = 1.
Output D3 from the operation decoder becomes active at the end of timing signal
T2. When timing signal T4 becomes active, the output of the AND gate that
implements the control function D3T4 becomes active. This signal is applied to the
CLR input of SC. On the next positive clock transition (the one marked T4 in the
diagram) the counter is cleared to 0.
This causes the timing signal T0 to become active instead of T5, which would have
been active if SC were incremented instead of cleared.
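The counter behaviour just described can be sketched in Python (a simplified model in which the clock and the 4 × 16 decoder are implicit):

```python
# Sketch: SC drives a 4x16 decoder producing T0..T15; the control
# function D3T4 clears SC back to 0 so T0 follows T4.
def run_cycles(n, d3_active=True):
    SC = 0
    timing = []
    for _ in range(n):
        timing.append(f"T{SC}")        # decoder output T<SC> is active
        if d3_active and SC == 4:      # D3T4: SC <- 0
            SC = 0
        else:
            SC = (SC + 1) % 16         # increment on each positive transition
    return timing

print(run_cycles(7))   # ['T0', 'T1', 'T2', 'T3', 'T4', 'T0', 'T1']
```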

6. Instruction codes
 The organization of the computer is defined by its internal registers, the timing and control
structure, and the set of instructions that it uses.
 Internal organization of a computer is defined by the sequence of micro-operations it performs on
data stored in its registers.
 Computer can be instructed about the specific sequence of operations it must perform.
 User controls this process by means of a Program.
 Program: set of instructions that specify the operations, operands, and the sequence by which
processing has to occur.
 Instruction: a binary code that specifies a sequence of micro-operations for the computer.
 The computer reads each instruction from memory and places it in a control register. The control
then interprets the binary code of the instruction and proceeds to execute it by issuing a sequence
of micro-operations. – Instruction Cycle
 Instruction Code: group of bits that instruct the computer to perform specific operation.
 Instruction code is usually divided into two parts: Opcode and address(operand)
Operation Code (opcode):
✓ group of bits that define the operation
✓ e.g., add, subtract, multiply, shift, complement.
✓ The number of bits required for the opcode depends on the number of
operations available in the computer.
✓ An n-bit opcode can specify up to 2^n operations.
Address (operand):
✓ specifies the location of operands (registers or memory words)
✓ Memory words are specified by their address
✓ Registers are specified by their k-bit binary code
✓ A k-bit register address can specify up to 2^k registers.
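The split of an instruction code into opcode and address fields can be sketched in Python; the 16-bit layout (1 I bit, 3-bit opcode, 12-bit address) matches the basic computer described above, while the field values are assumed examples:

```python
# Pack and unpack a 16-bit instruction word: I | opcode(3) | address(12).
def encode(i_bit, opcode, address):
    return (i_bit << 15) | ((opcode & 0b111) << 12) | (address & 0xFFF)

def decode(word):
    return (word >> 15) & 1, (word >> 12) & 0b111, word & 0xFFF

word = encode(0, 0b010, 0x104)   # assumed example: direct ADD-style word
print(hex(word), decode(word))   # 0x2104 (0, 2, 260)
```

A 3-bit opcode field can specify up to 2^3 = 8 operations, consistent with the D0 through D7 decoder outputs mentioned earlier.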

7. Input - Output
Input-Output and Interrupt:
 Instructions and data stored in memory must come from some input device.
 Computational results must be transmitted to the user through some output
device.
 To demonstrate the most basic requirements for input and output
communication, we will use as an illustration a terminal unit with a keyboard
and printer.
Input-Output Configuration:

 The terminal sends and receives serial information.


 Each quantity of information has eight bits of an alphanumeric code.
 The serial information from the keyboard is shifted into the input register INPR.
 The serial information for the printer is stored in the output register OUTR.
 These two registers communicate with a communication interface serially and
with the AC in parallel.
 The input register INPR consists of eight bits and holds alphanumeric input
information.
1. The 1-bit input flag FGI is a control flip-flop.
2. The flag bit is set to 1 when new information is available in the input
device and is cleared to 0 when the information is accepted by the
computer.
 The output register OUTR works similarly but the direction of information flow
is reversed.
 Initially, the output flag FGO is set to 1.
 The computer checks the flag bit; if it is 1, the information from AC is
transferred in parallel to OUTR and FGO is cleared to 0.
 The output device accepts the coded information, prints the corresponding
character, and when the operation is completed, it sets FGO to 1.
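The FGI/FGO handshake described above can be sketched as a small Python class; the method names are assumptions, not part of the basic computer's instruction set:

```python
# Sketch of the programmed-I/O handshake: the computer polls the flags
# before moving a character in either direction.
class Terminal:
    def __init__(self):
        self.INPR, self.FGI = 0, 0     # input register and flag
        self.OUTR, self.FGO = 0, 1     # output register; FGO initially 1

    def key_pressed(self, code):       # input device side
        self.INPR, self.FGI = code, 1  # new information available

    def cpu_read(self):                # computer side: AC <- INPR if FGI = 1
        if self.FGI:
            ac = self.INPR
            self.FGI = 0               # information accepted, clear flag
            return ac

    def cpu_write(self, ac):           # computer side: OUTR <- AC if FGO = 1
        if self.FGO:
            self.OUTR, self.FGO = ac, 0

    def printer_done(self):            # device sets FGO after printing
        self.FGO = 1

t = Terminal()
t.key_pressed(ord("A"))
print(t.cpu_read(), t.FGI)   # 65 0
```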
Input-Output Instructions:
 These instructions are executed with the clock transition associated with
timing signal T3.
a. Each control function needs a Boolean relation D7IT3, which we designate
for convenience by the symbol p.
b. The control function is distinguished by one of the bits in IR (6-11).
c. By assigning the symbol Bi to bit i of IR, all control functions can be
denoted by pBi for i = 6 through 11.
d. The sequence counter SC is cleared to 0 when p = D7IT3 = 1.
e. The last two instructions set and clear an interrupt enable flip-flop IEN.
f. Input and output instructions are needed for transferring information to and
from AC register, for checking the flag bits, and for controlling the interrupt
facility.
g. Input-output instructions have an operation code 1111 and are recognized by
the control when D7 = 1 and I = 1.
h. The remaining bits of the instruction specify the particular operation.
i. The control functions and microoperations for the input-output instructions
are listed in Table 5-5.

Unit – 2
1. Addressing modes *****
The operation field of an instruction specifies the operation to be performed.
 This operation will be executed on some data which is stored in computer
registers or the main memory.
 The way any operand is selected during the program execution depends on
the addressing mode of the instruction.


 The purpose of using addressing modes is as follows:
1. To give the programming versatility to the user.
2. To reduce the number of bits in addressing field of instruction.
Types of Addressing Modes
Below we have discussed different types of addressing modes one by one:
Immediate Mode:
In this mode, the operand is specified in the instruction itself. An immediate mode
instruction has an operand field rather than the address field.
For example: ADD 7, which says Add 7 to contents of accumulator. 7 is the
operand here.
Register Mode:
In this mode the operand is stored in a register inside the CPU.
The instruction holds the address of the register where the operand is stored.
Advantages
1. Shorter instructions and faster instruction fetch.
2. Faster memory access to the operand(s)

Disadvantages
1. Very limited address space
2. Using multiple registers helps performance but it complicates the
instructions.
Register Indirect Mode:
In this mode, the instruction specifies the register whose contents give us the
address of operand which is in memory. Thus, the register contains the address of
operand rather than the operand itself.

Auto Increment/Decrement Mode:


In this the register is incremented or decremented after or before its value is used.
Direct Addressing Mode:
In this mode, effective address of operand is present in instruction itself.
1. Single memory reference to access data.
2. No additional calculations to find the effective address of the operand.
For example: ADD R1, 4000 - here 4000 is the effective address of the operand.
NOTE: The effective address is the location where the operand is present.
Indirect Addressing Mode:
In this, the address field of instruction gives the address where the effective address
is stored in memory. This slows down the execution, as this includes multiple
memory lookups to find the operand.

Displacement Addressing Mode:


In this mode the contents of the index register are added to the address part of the
instruction to obtain the effective address of the operand.
EA = A + (R). Here the instruction holds two values, A (the base value) and
R (a register holding the displacement), or vice versa.
Relative Addressing Mode
It is a version of Displacement addressing mode.
In this mode the contents of the PC (program counter) are added to the address part
of the instruction to obtain the effective address:
EA = A + (PC), where EA is the effective address and PC is the program counter.
The operand is A cells away from the current cell (the one pointed to by the PC).
Base Register Addressing Mode
It is again a version of Displacement addressing mode. This can be defined as EA =
A + (R), where A is displacement and R holds pointer to base address.
Stack Addressing Mode
In this mode, operand is at the top of the stack. For example: ADD, this instruction
will POP top two items from the stack, add them, and will then PUSH the result to
the top of the stack.
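Several of the modes above can be compared side by side in a Python sketch; the register contents and memory map are made-up values chosen so every mode resolves to the same operand:

```python
# Assumed contents: memory maps addresses to words; 77 is the operand.
memory = {4000: 77, 5000: 4000}
R1, PC, BASE = 5000, 100, 3900

ea_direct = 4000                 # Direct:   EA is the address field itself
ea_reg_indirect = R1             # Register indirect: EA = (R1) = 5000
ea_indirect = memory[5000]       # Indirect: address field points at EA
ea_relative = 3900 + PC          # Relative: EA = A + (PC)
ea_base = 100 + BASE             # Base:     EA = A + (R)

print(memory[ea_direct], ea_indirect, ea_relative, ea_base)  # 77 4000 4000 4000
```

Note how the indirect mode needs an extra memory lookup before the operand itself can be fetched, which is exactly why the text calls it slower.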

2. Address sequencer
Address Sequencing:
 Microinstructions are stored in control memory in groups, with each group
specifying a routine.
 To appreciate the address sequencing in a micro-program control unit, let us
specify the steps that the control must undergo during the execution of a single
computer instruction.
Step-1:
 An initial address is loaded into the control address register when power is
turned on in the computer.
 This address is usually the address of the first microinstruction that activates the
instruction fetch routine.
 The fetch routine may be sequenced by incrementing the control address register
through the rest of its microinstructions.
 At the end of the fetch routine, the instruction is in the instruction register of the
computer.
Step-2:
 The control memory next must go through the routine that determines the
effective address of the operand.
 A machine instruction may have bits that specify various addressing modes,
such as indirect address and index registers.
 The effective address computation routine in control memory can be reached
through a branch microinstruction, which is conditioned on the status of the
mode bits of the instruction.
 When the effective address computation routine is completed, the address of the
operand is available in the memory address register.

Step-3:
 The next step is to generate the microoperations that execute the instruction
fetched from memory.
 The microoperation steps to be generated in processor registers depend on the
operation code part of the instruction.
 Each instruction has its own micro-program routine stored in a given location of
control memory.
 The transformation from the instruction code bits to an address in control
memory where the routine is located is referred to as a mapping process.
 A mapping procedure is a rule that transforms the instruction code into a control
memory address.
Step-4:
 Once the required routine is reached, the microinstructions that execute the
instruction may be sequenced by incrementing the control address register.
 Micro-programs that employ subroutines will require an external register for
storing the return address.
 Return addresses cannot be stored in ROM because the unit has no writing
capability.
 When the execution of the instruction is completed, control must return to the
fetch routine.
 This is accomplished by executing an unconditional branch microinstruction to
the first address of the fetch routine.
In summary, the address sequencing capabilities required in a control memory are:
1. Incrementing of the control address register.
2. Unconditional branch or conditional branch, depending on status bit
conditions.
3. A mapping process from the bits of the instruction to an address for control
memory.
4. A facility for subroutine call and return.

 The figure shows the block diagram of a control memory and the associated
hardware needed for selecting the next microinstruction address.
 The microinstruction in control memory contains a set of bits to initiate
microoperations in computer registers and other bits to specify the method by
which the next address is obtained.
 The diagram shows four different paths from which the control address register
(CAR) receives the address.
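The four next-address paths into the CAR can be sketched in Python; the mode names, the subroutine register SBR, and the "four microinstructions per routine" mapping are illustrative assumptions, not a fixed microinstruction format:

```python
# Sketch of next-address selection for the control address register (CAR).
def next_address(car, sbr, opcode, condition, mode, branch_addr):
    if mode == "increment":
        return car + 1                 # sequential micro-program flow
    if mode == "branch":               # conditional or unconditional branch
        return branch_addr if condition else car + 1
    if mode == "map":                  # mapping process: opcode -> routine
        return opcode << 2             # assume 4 microinstructions per routine
    if mode == "return":               # subroutine return address from SBR
        return sbr

# opcode 0b0101 maps to control-memory address 20 under this assumption
print(next_address(10, 64, 0b0101, True, "map", 0))   # 20
```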

3. Design of control units


Design of Control Unit:
 Control unit generates timing and control signals for the operations of the
computer.
 The control unit communicates with ALU and main memory.
 It also controls the transmission between processor, memory and the various
peripherals. It also instructs the ALU which operation has to be performed on
data.
Control unit can be designed by two methods which are given below:
Hardwired Control Unit:
 It is implemented with the help of gates, flip-flops, decoders, etc. in the
hardware.
o The inputs to control unit are the instruction register, flags, timing signals
etc.
o This organization can be very complicated if we have to make the control
unit large.
 If the design has to be modified or changed, all the combinational circuits have
to be modified which is a very difficult task.

Microprogrammed Control Unit:


 It is implemented by using a programming approach.
 A sequence of micro-operations is carried out by executing a program consisting
of micro-instructions.
 In this organization any modifications or changes can be done by updating the
microprogram in the control memory by the programmer.
4. Modes of data transfer (priority, interrupts)

Data transfer instructions move data from one place in the computer to another
without changing the data content.
➢ The most common transfers are between memory and processor registers,
between processor registers and input or output, and between the processor
registers themselves.
➢ The load instruction has been used mostly to designate a transfer from
memory to a processor register, usually an accumulator.
➢ The store instruction designates a transfer from a processor register into
memory.
➢ The move instruction has been used in computers with multiple CPU
registers to designate a transfer from one register to another. It has also been
used for data transfers between CPU registers and memory or between two
memory words.
➢ The exchange instruction swaps information between two registers or a
register and a memory word.
➢ The input and output instructions transfer data among processor registers and
input or output terminals.
Unit – 3
1. Booth multiplication algorithm

 The Booth algorithm gives a procedure for multiplying binary integers in
signed-2's complement representation.
 It operates on the fact that strings of 0's in the multiplier require no addition but
just shifting, and a string of 1's in the multiplier from bit weight 2^k down to
weight 2^m can be treated as 2^(k+1) − 2^m.
 For example, the binary number 001110 (+14) has a string of 1's from 2^3 to
2^1 (k = 3, m = 1). The number can be represented as 2^(k+1) − 2^m =
2^4 − 2^1 = 16 − 2 = 14. Therefore, the multiplication M × 14, where M is the
multiplicand and 14 the multiplier, can be done as M × 2^4 − M × 2^1.
 Thus the product can be obtained by shifting the binary multiplicand M four
times to the left and subtracting M shifted left once.

As in all multiplication schemes, the Booth algorithm requires examination of the
multiplier bits and shifting of the partial product.
Prior to the shifting, the multiplicand may be added to the partial product,
subtracted from the partial product, or left unchanged according to the following
rules:
1. The multiplicand is subtracted from the partial product upon encountering the
first least significant 1 in a string of 1's in the multiplier.
2. The multiplicand is added to the partial product upon encountering the first 0 in
a string of 0's in the multiplier.
3. The partial product does not change when the multiplier bit is identical to the
previous multiplier bit.

The algorithm works for positive or negative multipliers in 2's complement
representation.
This is because a negative multiplier ends with a string of 1's and the last operation
will be a subtraction of the appropriate weight.
The two bits of the multiplier in Qn and Qn+1 are inspected:
 If the two bits are equal to 10, it means that the first 1 in a string of 1's has been
encountered. This requires a subtraction of the multiplicand from the partial
product in AC.
 If the two bits are equal to 01, it means that the first 0 in a string of 0's has been
encountered. This requires the addition of the multiplicand to the partial product
in AC.
 When the two bits are equal, the partial product does not change.
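The rules above can be sketched as a Python routine; the 8-bit default register width is an assumption, and AC, Q and Q(n+1) appear as the variables A, Q and Q_1:

```python
# Booth's algorithm for signed-2's-complement multiplication.
def booth_multiply(multiplicand, multiplier, bits=8):
    mask = (1 << bits) - 1
    A = 0                          # AC: partial product
    Q = multiplier & mask          # QR: multiplier register
    Q_1 = 0                        # extra bit Q(n+1), initially 0
    M = multiplicand & mask        # BR: multiplicand register

    for _ in range(bits):
        pair = (Q & 1, Q_1)
        if pair == (1, 0):         # first 1 in a string of 1's: A <- A - M
            A = (A - M) & mask
        elif pair == (0, 1):       # first 0 in a string of 0's: A <- A + M
            A = (A + M) & mask
        # arithmetic shift right of the combined A, Q, Q(n+1)
        Q_1 = Q & 1
        Q = ((Q >> 1) | ((A & 1) << (bits - 1))) & mask
        sign = A >> (bits - 1)     # replicate the sign bit of A
        A = ((A >> 1) | (sign << (bits - 1))) & mask

    product = (A << bits) | Q      # 2*bits-wide result in A,Q
    if product >= 1 << (2 * bits - 1):
        product -= 1 << (2 * bits) # reinterpret as signed
    return product

print(booth_multiply(7, 3))       # 21
print(booth_multiply(-5, 6))      # -30
```

The `(1, 0)` and `(0, 1)` branches correspond exactly to the Qn Qn+1 = 10 and 01 cases in the text, and the shift replicates the sign bit so negative partial products stay correct.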

2. Fixed point algorithm


3. Floating point arithmetic
 In many high-level programming languages we have a facility for specifying
floating-point numbers.
o The most common way is by a real declaration statement. High level
programming languages must have a provision for handling floating-point
arithmetic operations.
o The operations are generally built into the internal hardware. If no hardware is
available, the compiler must be designed with a package of floating-point
software subroutines.
o Although the hardware method is more expensive, it is much more efficient
than the software method.
o Therefore, floating- point hardware is included in most computers and is
omitted only in very small ones.
 Basic Considerations :
o There are two part of a floating-point number in a computer - a mantissa m
and an exponent e.
o The two parts represent a number generated from multiplying m times a
radix r raised to the value of e. Thus
m × r^e
o The mantissa may be a fraction or an integer.
o The position of the radix point and the value of the radix r are not included in
the registers.
o For example, assume a fraction representation and a radix of 10.
o The decimal number 537.25 is represented in a register with m = 53725 and
e = 3 and is interpreted to represent the floating-point number
.53725 × 10^3
o A floating-point number is said to be normalized if the most significant digit of
the mantissa is nonzero.
o So the mantissa contains the maximum possible number of significant
digits.

o We cannot normalize a zero because it does not have a nonzero digit.
o Zero is represented in floating point by all 0's in the mantissa and exponent.
o Floating-point representation increases the range of numbers for a given
register.
o Consider a computer with 48-bit words. Since one bit must be reserved
for the sign, the range of fixed-point integer numbers will be ±(2^47 − 1),
which is approximately ±10^14.
o The 48 bits can instead be used to represent a floating-point number with
36 bits for the mantissa and 12 bits for the exponent.
o Assuming fraction representation for the mantissa and taking the two sign
bits into consideration, the range of numbers that can be represented is
±(1 − 2^−35) × 2^2047
o This number is derived from a fraction that contains 35 1's, an exponent
of 11 bits (excluding its sign), and the fact that 2^11 − 1 = 2047.
o The largest number that can be accommodated is approximately 10^615.

o The mantissa that can be accommodated is 35 bits (excluding the sign), and if
considered as an integer it can store a number as large as 2^35 − 1.
o This is approximately equal to 10^10, which is equivalent to a decimal
number of 10 digits.
o Computers with shorter word lengths use two or more words to represent
a floating-point number.
o An 8-bit microcomputer may use four words to represent one floating-point
number: one 8-bit word is reserved for the exponent and the 24 bits of the
other three words are used for the mantissa.
o Arithmetic operations with floating-point numbers are more complicated
than with fixed-point numbers.
o Their execution also takes longer time and requires more complex
hardware.
o Adding or subtracting two numbers requires first an alignment of the
radix point since the exponent parts must be made equal before adding or
subtracting the mantissas.
o We do this alignment by shifting one mantissa while its exponent is
adjusted until it becomes equal to the other exponent.
o Consider the sum of the following floating-point numbers:
.5372400 × 10^2
+ .1580000 × 10^−1
o To align the radix points, the second mantissa is shifted right three places
and its exponent incremented to 2, giving
.5372400 × 10^2 + .0001580 × 10^2 = .5373980 × 10^2
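The alignment-and-add step can be sketched in Python for this decimal example, holding each 7-digit fractional mantissa as an integer (an assumed representation for illustration):

```python
# Align the mantissa with the smaller exponent, then add (radix 10).
def fp_add(m1, e1, m2, e2):
    while e2 < e1:                 # shift second mantissa right
        m2, e2 = m2 // 10, e2 + 1
    while e1 < e2:                 # or shift the first one right
        m1, e1 = m1 // 10, e1 + 1
    return m1 + m2, e1

# .5372400 x 10^2  +  .1580000 x 10^-1
m, e = fp_add(5372400, 2, 1580000, -1)
print(m, e)   # 5373980 2  ->  .5373980 x 10^2
```

Note the shifted-out digits of the smaller operand are simply lost, which is why aligning toward the larger exponent preserves the most significant digits.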
o Floating-point multiplication and division need not do an alignment of the
mantissas.
o Multiplying the two mantissas and adding the exponents can form the
product.
o Dividing the mantissas and subtracting the exponents perform division.
o The operations done with the mantissas are the same as in fixed-point
numbers, so the two can share the same registers and circuits.
o The operations performed with the exponents are: compared and
incremented (for aligning the mantissas), added and subtracted (for
multiplication and division), and decremented (to normalize the result).
o We can represent the exponent in any one of the three representations -
signed-magnitude, signed 2’s complement or signed 1’s complement.
o Biased exponents have the advantage that they contain only positive
numbers.
o Now it becomes simpler to compare their relative magnitude without
bothering about their signs.

o Another advantage is that the smallest possible biased exponent contains
all zeros.
o The floating-point representation of zero is then a zero mantissa and the
smallest possible exponent.
o The exponent part of each register is denoted by the corresponding
lower-case letter symbol.
o Each floating-point number is assumed to have a mantissa in signed-
magnitude representation and a biased exponent.
o Thus the AC has a mantissa whose sign is in As and a magnitude that is in A.
o The diagram shows the most significant bit of A, labeled A1.
o The bit in this position must be a 1 to normalize the number. Note that the
symbol AC represents the entire register, that is, the concatenation of As, A
and a.
o In a similar way, register BR is subdivided into Bs, B, and b, and QR into
Qs, Q, and q.
o A parallel adder adds the two mantissas and loads the sum into A and the
carry into E.
o A separate parallel adder can be used for the exponents. The exponents do
not have a distinct sign bit because they are biased but are represented as a
biased positive quantity.
o It is assumed that the exponent registers are large enough that the chance of
an exponent overflow is very remote, and so exponent overflow will be
neglected.

o The exponents are also connected to a magnitude comparator that provides
three binary outputs to indicate their relative magnitude.
o The number in the mantissa is taken as a fraction, so the binary point
is assumed to reside to the left of the magnitude part.
o Integer representation for floating point causes certain scaling problems
during multiplication and division.
o To avoid these problems, we adopt a fraction representation.
o The numbers in the registers should initially be normalized. After each
arithmetic operation, the result will be normalized.
o Thus all floating-point operands are always normalized.

4. Decimal arithmetic unit

Unit – 4
1. DMA

 Removing the CPU from the path and letting the peripheral device manage the
memory buses directly improves the speed of transfer.
 This technique is known as DMA (Direct Memory Access).
 In this, the interface transfers data to and from memory through the memory bus.
 A DMA controller manages the transfer of data between peripherals and the
memory unit.
 Many hardware systems use DMA, such as disk drive controllers, graphics
cards, network cards, sound cards, etc.
 It is also used for intra-chip data transfer in multicore processors.
 In DMA, the CPU initiates the transfer, does other operations while the transfer
is in progress, and receives an interrupt from the DMA controller when the
transfer has been completed.
2. Cache memory mapping techniques
Cache Memory:
 The data or contents of main memory that are used again and again by the
CPU are stored in the cache memory so that the CPU can access that data in a
shorter time.
 Whenever the CPU needs to access memory, it first checks the cache memory.
 If the data is not found in cache memory then the CPU moves on to the main
memory.
 It also transfers blocks of recent data into the cache and keeps deleting old
data in the cache to accommodate the new data.

Hit Ratio:
 The performance of cache memory is measured in terms of a quantity called hit
ratio.
 When the CPU refers to memory and finds the word in cache, it is said to
produce a hit.
 If the word is not found in cache and has to be fetched from main memory, it
counts as a miss.
 The ratio of the number of hits to the total number of CPU references to
memory is called the hit ratio.

Hit Ratio = Hit/(Hit + Miss)
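The hit ratio formula can be sketched over a trace of memory references in Python; the tiny set-style cache and its size are assumptions for illustration, not a particular mapping technique:

```python
# Compute the hit ratio over an address trace; the cache is modelled
# as a small list of recently used addresses (oldest evicted first).
def hit_ratio(trace, cache_size=4):
    cache, hits = [], 0
    for addr in trace:
        if addr in cache:
            hits += 1
            cache.remove(addr)         # will re-append as most recent
        elif len(cache) == cache_size:
            cache.pop(0)               # delete oldest to accommodate new data
        cache.append(addr)
    return hits / len(trace)

print(hit_ratio([1, 2, 1, 3, 1, 2, 4, 1]))   # 0.5
```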

3. Associate memory
 Associative Memory: Content Addressable Memory (CAM).
 The time required to find an item stored in memory can be reduced
considerably if stored data can be identified for access by the content of the
data itself rather than by an address.
 A memory unit accessed by content is called an associative memory or
content addressable memory (CAM).
 This type of memory is accessed simultaneously and in parallel on the basis
of data content rather than by specific address or location.
 It consists of a memory array and logic for m words with n bits per word.
 The argument register A and key register K each have n bits, one for each bit
of a word. The match register M has m bits, one for each memory word.
 Each word in memory is compared in parallel with the content of the
argument register.
 The words that match the bits of the argument register set a corresponding bit
in the match register.
 After the matching process, those bits in the match register that have been set
indicate the fact that their corresponding words have been matched.
 Reading is accomplished by a sequential access to memory for those words
whose corresponding bits in the match register have been set.
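The parallel match just described can be sketched in Python; the key register K here acts as a bit mask selecting which bits of the argument take part, and the word values are assumed examples:

```python
# CAM match: compare argument A against every word, only on bits set in K,
# producing one match bit per word (the match register M).
def cam_match(words, A, K):
    M = []
    for word in words:                     # conceptually done in parallel
        M.append(int((word & K) == (A & K)))
    return M

words = [0b10110010, 0b10100111, 0b00110010]
A = 0b10100000                             # argument: high nibble 1010
K = 0b11110000                             # key: compare only the high nibble
print(cam_match(words, A, K))   # [0, 1, 0]
```

Reading would then visit only the words whose bit in M is set, as the text describes.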

4. Main memory – Paging


The memory unit that communicates directly with the CPU, auxiliary memory
and cache memory is called main memory.
 It is the central storage unit of the computer system. It is a large and fast
memory used to store data during computer operations.
Main memory is made up of RAM and ROM, with RAM integrated-circuit chips
holding the major share.
 DRAM: dynamic RAM is made of capacitors and transistors, and must be
refreshed every 10–100 ms. It is slower and cheaper than SRAM.
 SRAM: static RAM has a six-transistor circuit in each cell and retains data
until power is turned off, with no refresh needed.
 NVRAM: Non-Volatile RAM, retains its data, even when turned off. Example:
Flash memory.

RAM: Random Access Memory


ROM:
 Read Only Memory is non-volatile and serves as permanent storage for
information.
 It also stores the bootstrap loader program, used to load and start the operating
system when the computer is turned on.
 PROM (Programmable ROM), EPROM (Erasable PROM), and EEPROM
(Electrically Erasable PROM) are some commonly used ROMs.
Unit – 5
1. CISC and RISC
RISC Processors: -
 To execute an instruction, a number of steps are required; for each step, the
control unit of the processor generates a number of control signals.
 If the control unit contains separate electronic circuitry for each instruction,
which produces all the necessary signals, this approach to the design of the
control section of the processor is called RISC design. It is a hardware
approach.
 It is also called the hard-wired approach.

Characteristics of RISC:
1. Simple instructions, hence simple instruction decoding.
2. Instructions fit within one word.
3. Instructions take a single clock cycle to execute.
4. A larger number of general-purpose registers.
5. Simple addressing modes.
6. Fewer data types.
7. Pipelining can be achieved.

CISC Processors: -
 If the control unit contains a number of micro-electronic circuits that
generate a set of control signals, and each micro-circuit is activated by
a microcode, this design approach is called CISC design.
 This is a software (microprogrammed) approach to designing the control unit
of the processor.

Characteristics of CISC:
1. Complex instructions, hence complex instruction decoding.
2. Instructions are larger than one word.
3. Instructions may take more than a single clock cycle to execute.
4. Fewer general-purpose registers, as operations are performed in memory
itself.
5. Complex addressing modes.
6. More data types.

2. Arithmetic pipeline *****


o Pipeline arithmetic units are usually found in very high speed
computers.
o They are used to implement floating-point operations,
multiplication of fixed-point numbers, and similar computations
encountered in scientific problems.
o Example: floating-point addition and subtraction.

 Inputs are two normalized floating-point binary numbers:

   X = A x 2^a
   Y = B x 2^b

 A and B are two fractions that represent the mantissas;
a and b are the exponents.
 Four segments are used to perform the following:

 1. Compare the exponents
 2. Align the mantissas
 3. Add or subtract the mantissas
 4. Normalize the result


 Example: X = 0.9504 x 10^3 and Y = 0.8200 x 10^2
 The two exponents are subtracted in the first segment to obtain 3 - 2 = 1.
 The larger exponent, 3, is chosen as the exponent of the result.
 Segment 2 shifts the mantissa of Y to the right to obtain Y = 0.0820 x 10^3.
 The mantissas are now aligned.
 Segment 3 produces the sum Z = 1.0324 x 10^3.
 Segment 4 normalizes the result by shifting the mantissa once to the
right and incrementing the exponent by one to obtain Z = 0.10324 x 10^4.
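The four segments can be modelled in software. The sketch below is a Python illustration using decimal mantissa/exponent pairs (not hardware, and the function name is made up); it reproduces the worked example with X = 0.9504 x 10^3 and Y = 0.8200 x 10^2:

```python
# Software model of the four-segment floating-point addition pipeline.

def fp_add(a_mant, a_exp, b_mant, b_exp):
    # Segment 1: compare the exponents (subtract them) and pick the larger.
    diff = a_exp - b_exp
    exp = max(a_exp, b_exp)

    # Segment 2: align the mantissa of the operand with the smaller exponent.
    if diff > 0:
        b_mant /= 10 ** diff
    elif diff < 0:
        a_mant /= 10 ** (-diff)

    # Segment 3: add the aligned mantissas.
    mant = a_mant + b_mant

    # Segment 4: normalize so that 0.1 <= |mantissa| < 1.
    while abs(mant) >= 1:
        mant /= 10
        exp += 1
    while 0 < abs(mant) < 0.1:
        mant *= 10
        exp -= 1
    return mant, exp

print(fp_add(0.9504, 3, 0.8200, 2))  # roughly (0.10324, 4)
```

In real hardware the four segments operate concurrently on different operand pairs, which is what gives the pipeline its throughput.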

3. Instruction pipeline
 An instruction pipeline reads consecutive instructions from
memory while previous instructions are being executed in other
segments.
 This causes the instruction fetch and execute phases to overlap
and perform simultaneous operations.
 If a branch out of sequence occurs, the pipeline must be emptied and
all the instructions that have been read from memory after the branch
instruction must be discarded.
 Consider a computer with an instruction fetch unit and an
instruction execution unit forming a two-segment pipeline.
 A FIFO buffer can be used for the fetch segment.
 Thus, an instruction stream can be placed in a queue, waiting for
decoding and processing by the execution segment.
 This reduces the average access time to memory for reading instructions.
 Whenever there is space in the buffer, the control unit initiates
the next instruction fetch phase.

 The following steps are needed to process each instruction:
 1. Fetch the instruction from memory
 2. Decode the instruction
 3. Calculate the effective address
 4. Fetch the operands from memory
 5. Execute the instruction
 6. Store the result in the proper place

 The pipeline may not perform at its maximum rate due to:
 o Different segments taking different times to operate
 o Some segments being skipped for certain operations
 o Memory access conflicts

o Example: Four-segment instruction pipeline
o Assume that the decoding can be combined with calculating the
effective address in one segment.
o Assume that most of the instructions store the result in a register, so
that the execution and storing of the result can be combined in one
segment.
 Up to four suboperations in the instruction cycle can overlap, and up to
four different instructions can be in progress of being processed at the
same time.
 It is assumed that the processor has separate instruction and data memories.
 Reasons for the pipeline to deviate from its normal operation are:
 o Resource conflicts, caused by access to memory by two segments at the
same time.
 o Data dependency conflicts, which arise when an instruction depends on
the result of a previous instruction, but this result is not yet
available.
 o Branch difficulties, which arise from program control instructions that may
change the value of the PC.
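The overlap of up to four instructions can be illustrated with a small timetable model (a hypothetical sketch assuming no conflicts; the segment names FI, DA, FO, EX stand for fetch instruction, decode/address, fetch operand, and execute, following the four-segment example above):

```python
# Space-time table for a four-segment instruction pipeline: each clock
# cycle every instruction advances one segment, so four instructions
# can be in progress at once.

def pipeline_timetable(n_instructions, segments=("FI", "DA", "FO", "EX")):
    """Return {cycle: [(instruction, segment), ...]} for n instructions."""
    table = {}
    for i in range(n_instructions):
        for s, name in enumerate(segments):
            cycle = i + s + 1  # instruction i enters segment s at this cycle
            table.setdefault(cycle, []).append((i + 1, name))
    return table

t = pipeline_timetable(4)
print(max(t))   # total cycles: (n - 1) + k = 3 + 4 = 7
print(t[4])     # at cycle 4 all four segments are busy
```

Without pipelining the same four instructions would need 4 x 4 = 16 cycles, which is the speedup the overlap buys.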
4. Array processor
Array processors, also known as vector processors, perform computations on large
arrays of data. Thus, they are used to improve the performance of the computer.

Types of Array Processors


There are basically two types of array processors:

1. Attached Array Processors


2. SIMD Array Processors

Attached Array Processors


An attached array processor is a processor attached to a general-purpose
computer in order to enhance that computer's performance in numerical
computation tasks. It achieves high performance by means of parallel
processing with multiple functional units.

SIMD Array Processors


SIMD is the organization of a single computer containing multiple
processors operating in parallel. The processing units are made to
operate under the control of a common control unit, thus providing a
single instruction stream and multiple data streams.
A general block diagram of an array processor is shown below. It
contains a set of identical processing elements (PEs), each of which
has a local memory M. Each processing element includes
an ALU and registers. The master control unit controls all the
operations of the processing elements. It also decodes the instructions and
determines how each instruction is to be executed.
The main memory is used for storing the program. The control unit is
responsible for fetching the instructions. Vector instructions are sent to
all PEs simultaneously, and the results are returned to memory.
The best-known SIMD array processor is the ILLIAC IV computer
developed by the Burroughs Corporation. SIMD processors are highly
specialized computers. They are only suitable for numerical problems
that can be expressed in vector or matrix form; they are not suitable
for other types of computations.
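The lock-step behaviour of a SIMD array can be mimicked in software. The toy model below (illustrative names only, not how any real SIMD machine is programmed) broadcasts one opcode from the control unit to every processing element, each of which applies it to its own local operands:

```python
# Toy SIMD model: a single instruction stream, multiple data streams.
# Each list position plays the role of one processing element's local memory.

def broadcast(opcode, local_a, local_b):
    ops = {"ADD": lambda x, y: x + y, "MUL": lambda x, y: x * y}
    op = ops[opcode]  # the one instruction selected by the control unit
    # Every PE applies the same operation to its own operand pair.
    return [op(a, b) for a, b in zip(local_a, local_b)]

A = [1, 2, 3, 4]      # operand vector, one element per PE
B = [10, 20, 30, 40]
print(broadcast("ADD", A, B))  # -> [11, 22, 33, 44]
```

The point of the model is that there is exactly one instruction per step but as many data operations as there are processing elements.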
5. Inter processor Communication and Synchronization
The various processors in a multiprocessor system must be provided with a facility
for communicating with each other.

o A communication path can be established through a portion of memory or a
common input-output channel.
o The sending processor structures a request, a message, or a procedure, and
places it in a memory mailbox.
o Status bits residing in common memory indicate the condition of the mailbox.
o The receiving processor can check the mailbox periodically.
o The response time of this procedure can be time consuming.
o A more efficient procedure is for the sending processor to alert the receiving
processor directly by means of an interrupt signal.
o In addition to shared memory, a multiprocessor system may have other shared
resources. e.g., a magnetic disk storage unit.
o To prevent conflicting use of shared resources by several processors, there must
be a provision for assigning resources to processors; this is a function of the
operating system.
o There are three organizations that have been used in the design of operating
system for multiprocessors: master-slave configuration, separate operating
system, and distributed operating system.
o In a master-slave mode, one processor, master, always executes the operating
system functions.
o In the separate operating system organization, each processor can execute the
operating system routines it needs. This organization is more suitable for
loosely coupled systems.
o In the distributed operating system organization, the operating system routines
are distributed among the available processors. However, each particular
operating system function is assigned to only one processor at a time. It is also
referred to as a floating operating system.
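The mailbox scheme described above can be sketched as follows (a simplified Python model with made-up names; a real multiprocessor would use an interrupt signal to alert the receiver rather than busy polling):

```python
# Shared-memory mailbox: the sender writes a message and sets a status
# bit; the receiver polls the status bit and clears it after reading.

import threading

mailbox = {"full": False, "msg": None}  # stands in for common memory
lock = threading.Lock()

def send(msg):
    with lock:
        mailbox["msg"] = msg
        mailbox["full"] = True   # status bit in common memory

def receive_poll():
    while True:                  # receiver checks the mailbox periodically
        with lock:
            if mailbox["full"]:
                mailbox["full"] = False
                return mailbox["msg"]

send("request: read disk block")
print(receive_poll())  # -> request: read disk block
```

The polling loop is exactly the "time consuming" part noted above; replacing it with an interrupt delivered to the receiving processor removes the wasted checks.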

6. Cache coherence
