MODULE 2
Architecture
Course Instructor
Dr. Umadevi V
Department of CSE, BMSCE
16 October 2024 CSE, BMSCE
Unit-2
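Stack push example (R1 is pushed onto the stack)
Before: SP points to address 3024 (contents 28); R1 = 6
After: SP points to address 3020 (contents 6); address 3024 still holds 28; R1 = 6
Stack pop example (the top item is popped into R1)
Before: SP points to address 3020 (contents 6); address 3024 holds 28; R1 = 19
After: SP points to address 3024 (contents 28); address 3020 still holds 6; R1 = 6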
Memory (Address Contents)
3008 00
3012 40
3016 50
3020 60
3024 70
3028 80
3032 90
3036 100
3040
Note:
Push operation can be implemented as
Subtract SP, SP, #4
Store Rj, (SP)
Pop operation can be implemented as
Load Rj, (SP)
Add SP, SP, #4
Answer
Register R5 is used in a program to point to the top of a stack. Each word in the stack is 32 bits long. Write a sequence of
instructions to perform each of the following:
a. Pop the top two items off the stack, add them, and then push the result onto the stack.
b. Copy the fifth item from the top of the stack into register R3.
Contents of memory after executing the set of instructions:
R3 50
Memory (Address Contents)
3008 00
3012 40
3016 50
3020 60
3024 70
3028 80
3032 90
3036 100
3040
3004 20
3008 00
3012 40
3016 50
3020 60
3024 70
3028 80
3032 90
3036 100
R5 3040
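The two tasks in the exercise above can be sketched in Python. This is a minimal simulation under stated assumptions, not the worked answer from the slides: memory is modelled as a dictionary, the helper names (`push`, `pop`, `add_top_two`, `copy_fifth`) and the stack contents are hypothetical, R5 plays the role of the stack pointer, and the stack grows toward lower addresses as in the Push and Pop note. The corresponding instructions appear as comments.

```python
WORD = 4  # each stack item is 32 bits = 4 bytes


def push(mem, regs, value):
    # Subtract R5, R5, #4 ; Store Rj, (R5)
    regs["R5"] -= WORD
    mem[regs["R5"]] = value


def pop(mem, regs):
    # Load Rj, (R5) ; Add R5, R5, #4
    value = mem[regs["R5"]]
    regs["R5"] += WORD
    return value


def add_top_two(mem, regs):
    # Task (a): pop two items, add them, push the sum.
    first = pop(mem, regs)
    second = pop(mem, regs)
    push(mem, regs, first + second)


def copy_fifth(mem, regs):
    # Task (b): Load R3, 16(R5). The fifth item is four words
    # past the top of the stack, at a higher address.
    regs["R3"] = mem[regs["R5"] + 4 * WORD]


# Hypothetical stack contents; the top of the stack is at address 3008.
mem = {3008: 10, 3012: 40, 3016: 50, 3020: 60, 3024: 70, 3028: 80}
regs = {"R5": 3008, "R3": 0}

copy_fifth(mem, regs)   # R3 receives the word at 3008 + 16 = 3024
add_top_two(mem, regs)  # the sum replaces the second item; R5 moves up one word
```

Task (b) does not change R5, because the item is read with an offset rather than popped.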
Subroutines
[Figure: stack region at memory addresses 3012 to 3040, with SP = 3040; registers R2 (no value shown), R3 = 100, R4 = 200, R5 = 300]
Stack Frame
A stack frame refers to the locations that constitute a private work-space for the subroutine.
The work-space is
◼ Created at the time the subroutine is entered and
◼ Freed up when the subroutine returns control to the calling program
The figure shows an example of a commonly used layout for information in a stack
frame. In addition to the Stack Pointer (SP), it is useful to have another pointer register, called
the Frame Pointer (FP), for convenient access to the parameters passed to the subroutine and
to the local memory variables used by the subroutine. In the figure, we assume that four
parameters are passed to the subroutine, three local variables are used within the subroutine,
and registers R2, R3, and R4 need to be saved because they will also be used within the
subroutine.
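The creation and release of such a frame can be sketched in Python. The layout here is an assumption, since the figure itself is not reproduced: the caller pushes the four parameters, the subroutine then saves the old FP, reserves three local words, and saves R2, R3, and R4, and the return address is assumed to live in a link register. The class and function names (`Machine`, `enter`, `leave`, `param`) and all values are hypothetical.

```python
WORD = 4


class Machine:
    def __init__(self, sp):
        self.mem = {}
        self.reg = {"SP": sp, "FP": 0, "R2": 0, "R3": 0, "R4": 0}

    def push(self, value):
        self.reg["SP"] -= WORD
        self.mem[self.reg["SP"]] = value

    def pop(self):
        value = self.mem[self.reg["SP"]]
        self.reg["SP"] += WORD
        return value


def enter(m, n_locals=3):
    m.push(m.reg["FP"])             # save the caller's FP
    m.reg["FP"] = m.reg["SP"]       # FP now marks the base of the new frame
    m.reg["SP"] -= n_locals * WORD  # reserve space for the local variables
    for name in ("R2", "R3", "R4"):  # save registers the subroutine will use
        m.push(m.reg[name])


def leave(m, n_locals=3):
    for name in ("R4", "R3", "R2"):  # restore in reverse order
        m.reg[name] = m.pop()
    m.reg["SP"] += n_locals * WORD  # discard the local variables
    m.reg["FP"] = m.pop()           # restore the caller's FP


def param(m, i):
    # In this assumed layout, parameter i (1-based) is i words above FP.
    return m.mem[m.reg["FP"] + i * WORD]


m = Machine(sp=4000)
m.reg.update(FP=5000, R2=11, R3=22, R4=33)
for p in (40, 30, 20, 10):          # caller pushes param4 down to param1
    m.push(p)
enter(m)
first, fourth = param(m, 1), param(m, 4)
m.reg.update(R2=0, R3=0, R4=0)      # the subroutine body overwrites the registers
leave(m)
```

Because parameters and locals are addressed relative to FP, their offsets stay fixed even while SP moves during the subroutine.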
Next, we will learn a few more instructions that are found in most instruction sets.
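Before executing the instruction:
R3 1 0 0 1 0 1 0 1
R2 1 0 1 0 0 1 1 0
R4 0 0 0 0 0 0 0 0
After executing the instruction:
R3 1 0 0 1 0 1 0 1
R2 1 0 1 0 0 1 1 0
R4 1 0 0 0 0 1 0 0
(The new contents of R4 equal the bitwise AND of R2 and R3.)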
0x43 C
… …
0x59 Y
0x5A Z
Note: Hexadecimal numbers are written with the prefix 0x (e.g., 0x123) or with the suffix h (e.g., 123h).
Logical Arithmetic
Logical Left Shift
General Syntax: LShiftL Ri, Rj, Count
(Bits are listed from MSB to LSB. On each shift, the MSB moves into the carry flag C and a 0 enters at the LSB.)
Example: LShiftL R1, R2, #2
Before executing the instruction: R2 = 1 0 1 0 1 1 0 1, C = 0, R1 = 0 0 0 0 0 0 0 0
After executing the instruction: R2 = 1 0 1 0 1 1 0 1, C = 0, R1 = 1 0 1 1 0 1 0 0
Example: LShiftL R1, R1, #2
Before executing the instruction: C = 0, R1 = 1 0 1 0 1 1 0 1
After 1st shift: C = 1, R1 = 0 1 0 1 1 0 1 0
After 2nd shift: C = 0, R1 = 1 0 1 1 0 1 0 0
Right-shift figure (C drawn to the right of R1), after 2nd shift: R1 = 0 0 1 0 1 0 1 1, C = 0
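The shift examples above can be checked with a short Python sketch. It assumes an 8-bit register, as in the figures, and the function names `lshift_l` and `lshift_r` are illustrative only. As a simplification, the sketch returns C = 0 for a shift count of 0 instead of leaving C unchanged.

```python
def lshift_l(value, count, width=8):
    """Logical shift left: returns (result, carry)."""
    mask = (1 << width) - 1
    carry = 0
    for _ in range(count):
        carry = (value >> (width - 1)) & 1  # the MSB moves into C
        value = (value << 1) & mask         # a 0 enters at the LSB
    return value, carry


def lshift_r(value, count, width=8):
    """Logical shift right: returns (result, carry)."""
    carry = 0
    for _ in range(count):
        carry = value & 1                   # the LSB moves into C
        value >>= 1                         # a 0 enters at the MSB
    return value, carry


after_first = lshift_l(0b10101101, 1)   # one step of LShiftL R1, R1, #2
after_second = lshift_l(0b10101101, 2)  # the complete instruction
right_two = lshift_r(0b10101101, 2)     # the right-shift figure
```

Shifting one bit at a time mirrors the "After 1st shift" and "After 2nd shift" steps, and makes it clear that C holds only the last bit shifted out.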
0x33 3
… …
0x39 9
LOC 0x39
LOC+1 0x32
PACKED 92
Note: Binary Coded Decimal (BCD) system: each decimal digit is represented by a group of 4 bits.
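The digit-packing example can be sketched in Python as follows. The function name `pack_bcd` is hypothetical, and the instruction names in the comments are one plausible way to carry out each step with shift and logic instructions, not a sequence taken from the slides.

```python
def pack_bcd(high_char, low_char):
    """Pack two ASCII digit characters into one BCD byte."""
    high = ord(high_char) & 0x0F  # And: keep the low 4 bits (0x39 gives 9)
    low = ord(low_char) & 0x0F    # And: keep the low 4 bits (0x32 gives 2)
    return (high << 4) | low      # LShiftL by 4, then Or the two digits


packed = pack_bcd("9", "2")       # LOC holds 0x39, LOC+1 holds 0x32
```

The mask works because the ASCII codes of the digits 0 to 9 are 0x30 to 0x39, so the low four bits already hold the BCD value.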
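[Figures: further before/after examples on an 8-bit register R1 with carry flag C; bits are listed from MSB to LSB]
After 2nd shift (C drawn to the right of R1): R1 = 1 1 1 0 1 0 1 1, C = 0
Before: C = 0, R1 = 1 0 1 1 0 1 0 0. After: C = 1, R1 = 0 1 1 0 1 0 0 0
Before: C = 0, R1 = 1 0 1 1 0 1 0 0. After: C = 1, R1 = 0 1 1 0 1 0 0 1
Before: R1 = 1 0 1 1 0 1 0 0, C = 0. After: R1 = 0 1 0 1 1 0 1 0, C = 0
Before: R1 = 1 0 1 1 0 1 0 0, C = 0. After: R1 = 0 1 0 1 1 0 1 0, C = 0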
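Before executing the instruction: R2 = 76, R3 = 10, R4 = 2
After executing the instruction: R2 = 20, R3 = 10, R4 = 2
Before executing the instruction: R1 = 38, R2 = 76, R3 = 10, R4 = 2
After executing the instruction: R1 = 0, R2 = 5, R3 = 10, R4 = 2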
[Figure: processor, memory, and three I/O devices, annotated with their speeds]
Processor speed: many million instructions per second
I/O device speeds: several thousand characters per second, few hundred characters per second, and several characters per second
Data transfer between the central unit and I/O devices can
generally be handled in three modes:
1. Program Controlled I/O
2. Interrupt Initiated I/O
3. Direct Memory Access
Let us consider a simple task of reading a character from a keyboard and displaying that
character on a display screen.
A simple way of performing the task is called program-controlled I/O.
There are two separate blocks of instructions in the I/O program that perform this task:
◼ One block of instructions transfers the character into the processor.
◼ Another block of instructions causes the character to be displayed.
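The two blocks can be sketched in Python with simulated devices. Everything device-specific here is an assumption for illustration: the `Keyboard` and `Display` classes, the status flag names `status_kin` and `status_dout`, and the choice to stop at a carriage return. A real program would read memory-mapped device registers instead.

```python
from collections import deque


class Keyboard:
    """Simulated keyboard: status_kin is 1 while a character is waiting."""

    def __init__(self, typed):
        self._pending = deque(typed)

    @property
    def status_kin(self):
        return 1 if self._pending else 0

    def read_data(self):
        return self._pending.popleft()


class Display:
    """Simulated display: status_dout is 1 when it can accept a character."""

    def __init__(self):
        self.shown = []
        self.status_dout = 1

    def write_data(self, ch):
        self.shown.append(ch)


def echo_line(kbd, disp):
    line = []
    while True:
        # Block 1: wait for the keyboard, then transfer the character in.
        while kbd.status_kin == 0:
            pass
        ch = kbd.read_data()
        # Block 2: wait for the display, then send the character out.
        while disp.status_dout == 0:
            pass
        disp.write_data(ch)
        line.append(ch)
        if ch == "\r":              # carriage return (0x0D) ends the line
            return "".join(line)


display = Display()
line = echo_line(Keyboard("AB\r"), display)
```

The inner `while` loops are the wait loops of program-controlled I/O. In this simulation the input must contain a carriage return, otherwise the first wait loop never ends.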
[Figure: keyboard and monitor connected to the processor (register R5) and the memory]
Character ASCII hex code
A 0x41
B 0x42
C 0x43
D 0x44
E 0x45
F 0x46
…. ….
Enter key or Carriage Return 0x0D
Move R2, #LOC
Under Program Controlled I/O, the program enters a wait loop in which it repeatedly
tests the device status. During this period, the processor is not performing any useful
computation. There are many situations where other tasks can be performed while waiting
for an I/O device to become ready.
One mechanism to avoid the “successive interruptions causing the system to enter an infinite loop” problem is as follows:
The processor automatically disables interrupts before starting the execution of the ISR.
This option, which is suitable for a simple processor with only one interrupt-request line, is to have the processor automatically disable interrupts before
starting the execution of the interrupt-service routine. After saving the contents of the PC and the processor status register (PS) on the stack, the processor
performs the equivalent of executing an Interrupt-disable instruction. It is often the case that one bit in the PS register, called Interrupt-enable, indicates
whether interrupts are enabled. An interrupt request received while this bit is equal to 1 will be accepted. After saving the contents of the PS on the stack,
with the Interrupt-enable bit equal to 1, the processor clears the Interrupt-enable bit in its PS register, thus disabling further interrupts. When a Return-from-interrupt
instruction is executed, the contents of the PS are restored from the stack, setting the Interrupt-enable bit to 1. Hence, interrupts are again enabled.
Let us now consider the specific case of a single interrupt request from one device. When a device activates the
interrupt-request signal, it keeps this signal activated until it learns that the processor has accepted its request. This
means that the interrupt-request signal will be active during execution of the interrupt-service routine, perhaps until an
instruction is reached that accesses the device in question. It is essential to ensure that this active request signal does
not lead to successive interruptions, causing the system to enter an infinite loop from which it cannot recover.
A good choice is to have the processor automatically disable interrupts before starting the execution of the interrupt-
service routine. The processor saves the contents of the program counter and the processor status register. After saving
the contents of the PS register, with the IE bit equal to 1, the processor clears the IE bit in the PS register, thus
disabling further interrupts. Then, it begins execution of the interrupt-service routine. When a Return-from-interrupt
instruction is executed, the saved contents of the PS register are restored, setting the IE bit back to 1. Hence, interrupts
are again enabled.
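The save, clear, and restore steps described above can be sketched in Python. The bit position of IE within PS, the use of a Python list as the save area, and the `Processor` class itself are assumptions made for illustration.

```python
IE = 0x1  # assumed position of the interrupt-enable bit in PS


class Processor:
    def __init__(self):
        self.pc = 0
        self.ps = IE          # interrupts start out enabled
        self.stack = []

    def interrupt_request(self, isr_address):
        """Return True if the request is accepted."""
        if not (self.ps & IE):
            return False              # ignored while interrupts are disabled
        self.stack.append(self.pc)    # save the PC
        self.stack.append(self.ps)    # save PS while IE is still 1
        self.ps &= ~IE                # clear IE: further requests are ignored
        self.pc = isr_address         # start the interrupt-service routine
        return True

    def return_from_interrupt(self):
        self.ps = self.stack.pop()    # the restored PS has IE = 1 again
        self.pc = self.stack.pop()


cpu = Processor()
cpu.pc = 100
first = cpu.interrupt_request(500)    # accepted
second = cpu.interrupt_request(500)   # same signal still active: ignored
cpu.return_from_interrupt()
```

The second request models the device keeping its interrupt-request signal active during the ISR. Because IE is 0 at that point, it cannot cause a second entry into the routine.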
Answer: a
Explanation: The Processor upon receiving the interrupt should let the device know that its
request is received.
Answer: b
Explanation: The delay in servicing an interrupt is due to the time taken to transfer
control from the main program to the ISR.
[Figure: a processor connected to a single I/O device (Device 1), and a processor connected to I/O devices 1, 2, …, n]
Consider a simple arrangement where all devices send their interrupt-requests over a single
control line in the bus.
When the processor receives an interrupt request over this control line, how does it know
which device is requesting an interrupt?
This information is available in the status register of the device requesting an interrupt:
◼ The status register of each device has an IRQ bit which it sets to 1 when it requests an interrupt.
Interrupt service routine can poll the I/O devices connected to the bus. The first device with
IRQ equal to 1 is the one that is serviced.
The polling mechanism is easy to implement, but querying the status bits of all the I/O devices
connected to the bus is time-consuming.
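A minimal Python sketch of the polling step follows. It assumes each device's status register is available as an integer and that the IRQ bit is bit 0; both the bit position and the function name `find_requesting_device` are hypothetical.

```python
def find_requesting_device(status_registers, irq_mask=0x1):
    """Return the index of the first device whose IRQ bit is 1, else None."""
    for index, status in enumerate(status_registers):
        if status & irq_mask:   # test the IRQ bit of this device
            return index        # the first match is the one serviced
    return None


# Devices 1 and 2 (zero-based) both request; the earlier one is found first.
serviced = find_requesting_device([0, 1, 1])
```

The cost of the loop grows with the number of devices, which is the time penalty noted above.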
[Figure: a processor connected to I/O devices 1 to 8 over a shared interrupt-request line. An 8-bit status register holds one interrupt status bit per device, ordered from Device 8 (leftmost bit) to Device 1 (rightmost bit); all bits are shown as 0.]
A device requesting an interrupt can identify itself if it has its own interrupt-request signal, or if it can
send a special code to the processor through the interconnection network.
The processor’s circuits determine the memory address of the required interrupt-service routine.
A commonly used scheme is to allocate permanently an area in the memory to hold the addresses of
interrupt-service routines. These addresses are usually referred to as interrupt vectors, and they are
said to constitute the interrupt-vector table. For example, 128 bytes may be allocated to hold a
table of 32 interrupt vectors.
The interrupt-service routines may be located anywhere in the memory.
When an interrupt request arrives, the information provided by the requesting device is used as a
pointer into the interrupt-vector table, and the address in the corresponding interrupt vector is
automatically loaded into the program counter.
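The table lookup can be sketched in Python using the sizes from the example above: 128 bytes for 32 vectors gives 4 bytes per vector. The table base address of 0, the function names, and the ISR address stored in the demo are assumptions for illustration.

```python
TABLE_BYTES = 128
NUM_VECTORS = 32
VECTOR_SIZE = TABLE_BYTES // NUM_VECTORS  # 4 bytes per interrupt vector
TABLE_BASE = 0                            # assumed start of the table


def vector_address(code):
    """Address of the table entry selected by the device's code."""
    if not 0 <= code < NUM_VECTORS:
        raise ValueError("interrupt code out of range")
    return TABLE_BASE + code * VECTOR_SIZE


def dispatch(memory, code):
    """Return the ISR address to be loaded into the program counter."""
    return memory[vector_address(code)]


memory = {vector_address(5): 0x2000}      # hypothetical ISR location
new_pc = dispatch(memory, 5)
```

Only the table has a fixed location. The routine addresses stored in it can point anywhere in memory.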
Interrupt Nesting: A mechanism is needed so that, even while the processor is executing an ISR, another interrupt
request can be accepted and serviced.
I/O devices are organized in a priority structure: An interrupt request from a high-priority device is accepted while the processor is
executing the interrupt service routine of a low priority device.
A multiple-level priority organization means that during execution of an interrupt service routine, interrupt requests will be accepted
from some devices but not from others, depending upon the device’s priority.
To implement this scheme, we can assign a priority level to the processor that can be changed under program control. The priority
level of the processor is the priority of the program that is currently being executed. The processor accepts interrupts only from
devices that have priorities higher than its own.
At the time that execution of an interrupt-service routine for some device is started, the priority of the processor is raised to that of
the device either automatically or with special instructions. This action disables interrupts from devices that have the same or lower
level of priority.
However, interrupt requests from higher-priority devices will continue to be accepted. The processor’s priority can be encoded in a
few bits of the processor status register.
Finally, we should point out that if nested interrupts are allowed, then each interrupt service routine must save on the stack the
saved contents of the program counter and the status register. This has to be done before the interrupt-service routine enables
nesting by setting the IE bit in the status register to 1.
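The priority rule can be sketched in Python. The `Processor` class, the numeric priority levels, and the use of a list to hold saved priorities are assumptions for illustration. The sketch models only the priority field, not the full saved PC and status register.

```python
class Processor:
    def __init__(self):
        self.priority = 0   # priority of the program currently running
        self.saved = []     # priorities saved on entry to each ISR

    def request(self, device_priority):
        """Accept the request only if the device outranks the processor."""
        if device_priority <= self.priority:
            return False                      # same or lower level: not accepted
        self.saved.append(self.priority)      # remember the interrupted level
        self.priority = device_priority       # raise to the device's level
        return True

    def return_from_interrupt(self):
        self.priority = self.saved.pop()      # drop back to the previous level


cpu = Processor()
low = cpu.request(2)      # accepted: 2 > 0
same = cpu.request(2)     # not accepted: same level as the running ISR
high = cpu.request(5)     # accepted: nested inside the level-2 ISR
cpu.return_from_interrupt()
level_after_return = cpu.priority
```

Each accepted request pushes the previous level, so returns unwind in the reverse order of acceptance, as nesting requires.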
[Figure: nested execution of the ISR of Device 1 and the ISR of Device 2]
III. Simultaneous Requests: Multiple Requests are received over a single interrupt request line at the
same time
We also need to consider the problem of simultaneous arrivals of interrupt requests from two or more devices.
The processor must have some means of deciding which request to service first.
Polling the status registers of the I/O devices is the simplest such mechanism. In this case, priority is determined by the order in
which the devices are polled. When vectored interrupts are used, we must ensure that only one device is selected to send its
interrupt vector code. This is done in hardware, by using arbitration circuits.
It is important to ensure that interrupt requests are generated only by those I/O devices that the processor is currently willing to
recognize. Hence, we need a mechanism in the interface circuits of individual devices to control whether a device is allowed to
interrupt the processor. The control needed is usually provided in the form of an interrupt-enable bit in the device’s interface circuit.
The keyboard status register includes bits KIN and KIRQ. The KIRQ bit is set to 1 if an interrupt request has been raised, but not
yet serviced. The keyboard may raise interrupt requests only when the interrupt-enable bit, KIE, in its control register is set to 1.
Thus, when both KIE and KIN bits are equal to 1, an interrupt request is raised and the KIRQ bit is set to 1. Similarly, the DIRQ bit
in the status register of the display interface indicates whether an interrupt request has been raised. Bit DIE in the control register
of this interface is used to enable interrupts.
Under Interrupt Driven I/O data transfer:
Processor Control Registers
The status register, PS, includes the interrupt-enable bit, IE, in addition to other status information. Recall that the processor will
accept interrupts only when this bit is set to 1. The IPS register is used to automatically save the contents of PS when an interrupt
request is received and accepted. At the end of the interrupt-service routine, the previous state of the processor is automatically
restored by transferring the contents of IPS into PS. Since there is only one register available for storing the previous status
information, it becomes necessary to save the contents of IPS on the stack if nested interrupts are allowed.
The IENABLE register allows the processor to selectively respond to individual I/O devices. A bit may be assigned for each device,
as shown in the figure for the keyboard, display, and a timer circuit that we will use in a later example. When a bit is set to 1, the
processor will accept interrupt requests from the corresponding device.
The IPENDING register indicates the active interrupt requests. This is convenient when multiple devices may raise requests at the
same time.
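How these registers combine can be sketched in Python. The bit positions assigned to the keyboard, display, and timer, and the function name `accepted_requests`, are assumptions, since the register figure is not reproduced here.

```python
KEYBOARD = 0x1  # assumed bit positions within IENABLE and IPENDING
DISPLAY = 0x2
TIMER = 0x4


def accepted_requests(ps_ie, ienable, ipending):
    """Bit mask of requests the processor is willing to accept right now."""
    if not ps_ie:                 # IE = 0 in PS: nothing is accepted
        return 0
    return ienable & ipending     # active requests from enabled devices only


# Keyboard and timer are enabled; keyboard and display are requesting.
mask = accepted_requests(1, KEYBOARD | TIMER, KEYBOARD | DISPLAY)
```

An interrupt-service routine could test the bits of such a mask to decide which device to serve.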
Let us consider again the task of reading a line of characters typed on a keyboard, storing
the characters in the main memory, and displaying them on a display device.
We will use interrupts with the keyboard, but polling with the display.
1. Synchronous scheme
2. Asynchronous scheme
1. Synchronous data transfer: sender and receiver use the same clock signal
◼ Needs clock signal between the sender and the receiver
◼ Supports high data transfer rate
2. Asynchronous data transfer: sender provides a synchronization signal to the
receiver before starting the transfer of each message
◼ Does not need clock signal between the sender and the receiver
◼ Slower data transfer rate
Example (Read): Move R1, LOCA
LOCA is at memory address 2000, which contains 38.
The processor (the master) places the address 2000 on the address bus and issues the Read command.
The memory (the slave) places the contents, 38, on the data bus, and the value is loaded into R1.
The clock pulse width, t1 – t0, must be longer than the maximum propagation delay between
two devices connected to the bus. It also has to be long enough to allow all devices to decode
the address and control signals so that the addressed device (the slave) can respond at time t1.
It is important that slaves take no action or place any data on the bus before t1. The information
on the bus is unreliable during the period t0 to t1 because signals are changing state. The addressed
slave places the requested input data on the data lines at time t1. At the end of the clock cycle, at
time t2, the master strobes the data on the data lines into its input buffer. In this context, “strobe”
means to capture the values of the data at a given instant and store them into a buffer.
Answer:
Minimum clock period = 4+5+6+10+3 = 28 ns
Example (Write): Store R1, LOCA
R1 contains 38, and LOCA is at memory address 2000.
The processor (the master) places the address 2000, the Write command, and the data 38 on the bus.
The memory (the slave) stores 38 at address 2000.
◼ In the case of a Write operation, the master places the data on the bus along with the
address and commands at time t0.
◼ The slave strobes the data into its input buffer at time t2.
Synchronous bus
Data transfer has to be completed within one clock cycle.
◼ The clock period, t2 - t0, must be long enough to accommodate the longest propagation delay on the bus and the slowest device
interface.
◼ This forces all the devices to operate at the speed of the slowest device.
The processor simply assumes that the data are available at t2 in the case of a Read operation, or have been
read by the device in the case of a Write operation.
◼ What if the device has actually failed and never responded?
[Timing diagram: Address, Command, Data, and Slave-ready signals]
Clock changes are seen by all the devices at the same time.
The slave places the data on the bus and asserts the Slave-ready signal.
The master strobes the data into its input buffer.
Solution:
The minimum time for the high phase of the clock is the time for the address
to arrive and be decoded by the slave, which is 1.5 + 4 + 3 = 8.5 ns.
The minimum time for the low phase of the clock is the time for the slave to
place data on the bus and for the master to load the data into a register,
which is 5 + 4 + 1 = 10 ns.
Then, the minimum clock period is 2 × 10 = 20 ns, and
the maximum clock frequency is 50 MHz.
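The arithmetic in this solution can be reproduced with a short Python sketch. It assumes high and low clock phases of equal length, which is what the 2 × 10 step implies; the function name `min_clock_period` is illustrative.

```python
def min_clock_period(high_delays_ns, low_delays_ns):
    """Return (high phase, low phase, period) in ns for a symmetric clock."""
    high = sum(high_delays_ns)    # address arrives and is decoded by the slave
    low = sum(low_delays_ns)      # slave drives data, master loads it
    period = 2 * max(high, low)   # both phases must fit the longer one
    return high, low, period


high_ns, low_ns, period_ns = min_clock_period([1.5, 4, 3], [5, 4, 1])
max_frequency_mhz = 1000 / period_ns   # a period of 1 ns corresponds to 1000 MHz
```

The longer phase sets the period: the high phase needs only 8.5 ns, but it is stretched to 10 ns to match the low phase.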
[Figure: the processor (master) and the memory (slave) connected by the address bus, the data bus, and the Master-Ready and Slave-Ready control signals]
Example: Store R1, LOCA
R1 contains 38, and LOCA is at memory address 2000. After the transfer, location 2000 contains 38.
Bus Arbitration
[Figure: an arbiter circuit connected to the processor, the memory, and a printer]
Bus Arbitration (Contd…)
The figure below illustrates a possible sequence of events for the case of three masters. Assume that master 1 has the
highest priority, followed by the others in increasing numerical order. Master 2 sends a request to use the bus first.
Since there are no other requests, the arbiter grants the bus to this master by asserting BG2. When master 2 completes
its data transfer operation, it releases the bus by deactivating BR2. By that time, both masters 1 and 3 have activated
their request lines. Since master 1 has a higher priority, the arbiter activates BG1 after it deactivates BG2, thus granting
the bus to master 1. Later, when master 1 releases the bus by deactivating BR1, the arbiter deactivates BG1 and
activates BG3 to grant the bus to master 3. Note that the bus is granted to master 1 before master 3 even though
master 3 activated its request line before master 1.
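The grant decisions in this sequence can be sketched in Python as a fixed-priority arbiter. The function name `arbitrate` and the representation of the request lines as a set of master numbers are assumptions for illustration. The sketch models only the decisions, not the signal timing.

```python
def arbitrate(requests, current=None):
    """Return the master that holds the bus, or None if the bus is idle.

    requests: set of master numbers whose BR line is active.
    current:  master that presently holds the bus grant, if any.
    A lower master number means a higher priority.
    """
    if current in requests:
        return current          # the holder keeps the bus until it drops BR
    return min(requests) if requests else None


grant = arbitrate({2})                 # only master 2 requests: BG2
during = arbitrate({1, 2, 3}, grant)   # masters 1 and 3 request while 2 transfers
after_2 = arbitrate({1, 3}, during)    # master 2 releases: BG1
after_1 = arbitrate({3}, after_2)      # master 1 releases: BG3
```

Master 3 waits longer than master 1 even though it asked first, because the choice depends on priority and not on arrival order.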
Example
Time Event
5:00am BR2
5:01am BG2
5:05am BR3
5:10am BR1
5:15am BR2 and BG2 are released
5:16am BG1
5:20am BR1 and BG1 are released
5:16am BG2
Answer:
R3 , R4 , R1 , R2
Answer : d
Minimum = 2 + 5 + 4 = 11. We count the bus driver delay, the propagation delay on the bus, and the fixed with, which is 4.
Maximum = 2 + 10 + 4 = 16
Max Time
2 + 4 + 5 + 25 = 36 (25 because this is the longest that a data fetch would take for this problem)