
Module 6 - Advanced Processors and Buses

Practice Questions
1) Explain Flynn's classification. 10M
2) List and explain the various pipeline hazards.
3) Explain the various bus arbitration methods. 10M
4) What is bus arbitration? Explain any two techniques of bus arbitration. 10M
5) Draw a neat diagram of Flynn's classification.
6) A program having 10 instructions (without branch and call instructions) is executed on non-pipeline and pipeline processors. All instructions are of the same length, there are 4 pipeline stages, and the time required for each stage is 1 ns. (Assume the four stages are Fetch Instruction, Decode Instruction, Execute Instruction, Write Output.)
   i) Calculate the time required to execute the program on a non-pipeline and a pipeline processor.
   ii) Show the pipeline processor with a diagram.
7) Explain the different distributed and centralized bus arbitration methods. 10M
8) Draw and explain 4-stage instruction pipelining and briefly describe the hazards associated with it. 10M

Introduction

• A program consists of a number of instructions.
• These instructions may be executed in the following two ways:
  1. Non-Pipelined Execution
  2. Pipelined Execution

1. Non-Pipelined Execution
• In non-pipelined architecture:
  • All the instructions of a program are executed sequentially, one after the other.
  • A new instruction begins executing only after the previous instruction has executed completely.
  • This style of executing instructions is highly inefficient.
Example:
Consider a program consisting of three instructions. In a non-pipelined architecture, these instructions execute one after the other.
• If the time taken to execute one instruction = t, then the time taken to execute 'n' instructions = n × t.
In non-pipelined architecture, if each instruction takes 4 clock cycles, the time taken to execute three instructions
= 3 × time taken to execute one instruction
= 3 × 4 clock cycles
= 12 clock cycles

Pipelined Execution
• In pipelined architecture:
  • Multiple instructions are executed in parallel.
  • This style of executing instructions is highly efficient.
• Instruction pipelining is a technique that implements a form of parallelism called instruction-level parallelism within a single processor.
• A pipelined processor does not wait until the previous instruction has executed completely. Rather, it fetches the next instruction and begins its execution.

Four-Stage Pipeline
• In a four-stage pipelined architecture, the execution of each instruction is completed in the following 4 stages:
  1. Instruction Fetch (IF)
  2. Instruction Decode (ID)
  3. Instruction Execute (IE)
  4. Write Back (WB)
• To implement the four-stage pipeline:
  • The hardware of the CPU is divided into four functional units.
  • Each functional unit performs a dedicated task.

At stage-01,
• First functional unit performs instruction fetch.
• It fetches the instruction to be executed.
At stage-02,
• Second functional unit performs instruction decode.
• It decodes the instruction to be executed.
At stage-03,
• Third functional unit performs instruction execution.
• It executes the instruction.
At stage-04,
• Fourth functional unit performs write back.
• It writes back the result obtained after executing the instruction.

• In pipelined architecture:
  • Instructions of the program execute in parallel.
  • When one instruction moves from the nth stage to the (n+1)th stage, another instruction moves from the (n-1)th stage to the nth stage.
• Time taken to execute three instructions in a four-stage pipelined architecture = 6 clock cycles, as shown in the diagram below.
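The following space-time diagram is a reconstruction (not copied from the slides) of how the three instructions overlap; each instruction enters the pipeline one cycle after its predecessor:

Cycle:  1    2    3    4    5    6
I1:     IF   ID   IE   WB
I2:          IF   ID   IE   WB
I3:               IF   ID   IE   WB

With k = 4 stages and n = 3 instructions, the total is k + n − 1 = 4 + 3 − 1 = 6 clock cycles.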

Performance of Pipelined Execution
• The following parameters serve as criteria to estimate the performance of pipelined execution:
  • Speed Up
  • Efficiency
  • Throughput
1. Speed Up
• It gives an idea of "how much faster" the pipelined execution is compared to non-pipelined execution.
• It is calculated as:

  Speed Up = Non-pipelined execution time / Pipelined execution time

2. Efficiency
• The efficiency of pipelined execution is calculated as:

  Efficiency = Speed Up / Number of stages in the pipeline
3. Throughput
• Throughput is defined as the number of instructions executed per unit time.
• It is calculated as:

  Throughput = Number of instructions executed / Total time of execution
Let us learn how to calculate certain important parameters of pipelined architecture.
Consider:
• A pipelined architecture consisting of a k-stage pipeline
• Total number of instructions to be executed = n
Point-01: Calculating Cycle Time
There are two possible cases:
• Case-01: All stages offer the same delay.
  • Cycle time = Delay offered by one stage, including the delay due to its register.
• Case-02: The stages do not all offer the same delay.
  • Cycle time = Maximum delay offered by any stage, including the delay due to its register.

Point-02: Calculating Clock Frequency

• Frequency of the clock (f) = 1 / Cycle time

Point-03: Calculating Non-Pipelined Execution Time

In non-pipelined architecture:
• The instructions execute one after the other.
• The execution of a new instruction begins only after the previous instruction has executed completely.
So, the number of clock cycles taken by each instruction = k clock cycles.
Thus, non-pipelined execution time
= Total number of instructions × Time taken to execute one instruction
= n × k clock cycles

Point-04: Calculating Pipelined Execution Time
In pipelined architecture:
• Multiple instructions execute in parallel.
• Number of clock cycles taken by the first instruction = k clock cycles.
• After the first instruction has completely executed, one instruction comes out per clock cycle.
So, the number of clock cycles taken by each remaining instruction = 1 clock cycle.
Thus, pipelined execution time
= Time taken to execute the first instruction + Time taken to execute the remaining instructions
= 1 × k clock cycles + (n − 1) × 1 clock cycle
= (k + n − 1) clock cycles
Point-05: Calculating Speed Up
Speed up
= Non-pipelined execution time / Pipelined execution time
= (n × k clock cycles) / ((k + n − 1) clock cycles)
= (n × k) / (k + n − 1)
Dividing the numerator and denominator by n:
= k / { 1 + (k − 1)/n }
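These formulas are easy to sanity-check in code. Below is a minimal Python sketch (my own, not from the slides; the function names are invented) that evaluates them for the 10-instruction, 4-stage example solved later in this module:

def non_pipelined_time(n, k, cycle_time):
    # Each instruction occupies all k stages before the next one starts.
    return n * k * cycle_time

def pipelined_time(n, k, cycle_time):
    # The first instruction takes k cycles; each of the remaining n - 1
    # instructions completes one cycle later.
    return (k + n - 1) * cycle_time

n, k, t = 10, 4, 1e-9          # 10 instructions, 4 stages, 1 ns per stage
t_np = non_pipelined_time(n, k, t)
t_p = pipelined_time(n, k, t)
print(f"non-pipelined: {t_np * 1e9:.0f} ns")  # 40 ns
print(f"pipelined:     {t_p * 1e9:.0f} ns")   # 13 ns
print(f"speedup:       {t_np / t_p:.2f}")     # 3.08
print(f"ideal speedup as n grows: {k}")       # approaches k = 4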

• For a very large number of instructions (n → ∞), speed up → k.
• Practically, the total number of instructions never tends to infinity.
• Therefore, the speed up is always less than the number of stages in the pipeline.
Under ideal conditions:
• One complete instruction is executed per clock cycle, i.e. CPI (clocks per instruction) = 1.
• Speed up = Number of stages in the pipelined architecture

1) A program having 10 instructions (without branch and call instructions) is executed on non-pipeline and pipeline processors. All instructions are of the same length, there are 4 pipeline stages, and the time required for each stage is 1 ns. (Assume the four stages are Fetch Instruction, Decode Instruction, Execute Instruction, Write Output.)
i) Calculate the time required to execute the program on a non-pipeline and a pipeline processor.
ii) Show the pipeline processor with a diagram.

Given:
Total number of instructions to be executed = n = 10 instructions
Number of pipeline stages = k = 4 stages
Time per stage (cycle time) = t = 1 ns

i. Time required to execute the program on non-pipeline and pipeline processors.

Pipelined execution time
= Time taken to execute the first instruction + Time taken to execute the remaining instructions
= 1 × k clock cycles + (n − 1) × 1 clock cycle
= (k + n − 1) clock cycles
= (4 + 10 − 1) × t
= 13 × 1 ns
= 13 ns

Non-pipelined execution time
= Total number of instructions × Time taken to execute one instruction
= n × k clock cycles
= (n × k) × t
= (10 × 4) × 1 ns
= 40 ns
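ii. Pipeline diagram. The space-time diagram below is a reconstruction (abbreviated; not copied from the slides) using the given stage names. Each instruction enters Fetch Instruction (FI) one cycle after its predecessor, and I10 completes Write Output (WO) in cycle k + n − 1 = 13:

Cycle:  1    2    3    4    5    6   ...  10   11   12   13
I1:     FI   DI   EI   WO
I2:          FI   DI   EI   WO
I3:               FI   DI   EI   WO
...
I10:                                      FI   DI   EI   WO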

Additionally, the speedup:
Speedup
= Non-pipelined execution time / Pipelined execution time
= 40 / 13 ≈ 3.08 times

Pipeline Hazards
• Pipeline hazards are conditions that can occur in a pipelined machine and prevent the execution of a subsequent instruction in a particular cycle, for a variety of reasons.
• A hazard occurs when the pipeline, or some portion of the pipeline, must stall because conditions do not permit continued execution. Such a pipeline stall is also referred to as a pipeline bubble.
Three types of hazards:
1. Structural / Resource Dependency
2. Control Dependency
3. Data Dependency

• These dependencies may introduce stalls in the pipeline.
• Stall: A stall is a cycle in the pipeline without new input.

1) Structural Hazard
This dependency arises due to a resource conflict in the pipeline. A resource conflict is a situation where more than one instruction tries to access the same resource in the same cycle. A resource can be a register, memory, or the ALU.

Example:

Instruction / Cycle   1        2        3        4        5
I1                    IF(Mem)  ID       EX       Mem
I2                             IF(Mem)  ID       EX
I3                                      IF(Mem)  ID       EX
I4                                               IF(Mem)  ID

• In the above scenario, in cycle 4, instructions I1 and I4 are trying to access the same resource (memory), which introduces a resource conflict. To avoid this problem, we have to keep the instruction waiting until the required resource (memory, in our case) becomes available. This wait introduces stalls in the pipeline, as shown below; I4's fetch is delayed until the memory is free in cycle 7:

Instruction / Cycle   1        2        3        4        5        6        7        8
I1                    IF(Mem)  ID       EX       Mem      WB
I2                             IF(Mem)  ID       EX       Mem      WB
I3                                      IF(Mem)  ID       EX       Mem      WB
I4                                               –        –        –        IF(Mem)  ID
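To make the stall pattern concrete, here is a toy Python simulation (my own sketch, not from the slides): stages IF, ID, EX, Mem, WB, with one instruction fetched per cycle and a single memory port shared by IF and Mem, so a fetch must wait out any cycle in which an earlier instruction occupies Mem.

def fetch_cycles(n):
    # fetch[i] is the cycle in which instruction i enters IF; with no other
    # hazards, instruction i then occupies stage s in cycle fetch[i] + s
    # (s = 0 for IF ... s = 4 for WB).
    fetch = []
    for i in range(n):
        c = fetch[-1] + 1 if fetch else 1  # in-order, at most one fetch per cycle
        # Stall while an earlier instruction is in Mem (stage index 3).
        while any(f + 3 == c for f in fetch):
            c += 1
        fetch.append(c)
    return fetch

print(fetch_cycles(4))  # [1, 2, 3, 7] -- I4 is stalled in cycles 4-6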

• Solution for structural dependency
To minimize structural dependency stalls in the pipeline, we use a hardware mechanism called renaming.
Renaming: the memory is divided into two independent modules, called code memory (CM) and data memory (DM), which store instructions and data separately. CM contains all the instructions, and DM contains all the operands required by the instructions.

2) Control Dependency (Branch Hazards)
• This type of dependency occurs with control-transfer instructions such as BRANCH, CALL, JMP, etc. In many instruction set architectures, the processor does not yet know the target address of these instructions when it needs to insert the next instruction into the pipeline. As a result, unwanted instructions are fed into the pipeline.
• Consider the following sequence of instructions in the program:

100: I1
101: I2 (JMP 250)
102: I3
.
.
250: BI1

Instruction / Cycle   1    2    3             4    5    6
I1                    IF   ID   EX            MEM  WB
I2 (JMP 250)               IF   ID (PC: 250)  EX   Mem  WB
I3                              IF            ID   EX   Mem
BI1                                           IF   ID   EX

• Expected output: I1 -> I2 -> BI1
• NOTE: Generally, the target address of the JMP instruction is known only after the ID stage.

• Output sequence: I1 -> I2 -> Delay (stall) -> BI1
• As the delay slot performs no operation, this output sequence is equivalent to the expected output sequence, but the slot introduces a stall in the pipeline.
• Solution for control dependency: Branch prediction is the method through which stalls due to control dependency can be eliminated. The prediction of whether the branch will be taken is made in the first stage; with a correct prediction, the branch penalty is zero.
• Branch penalty: the number of stalls introduced during branch operations in the pipelined processor is known as the branch penalty.
• NOTE: As we saw, the target address is available after the ID stage, so the number of stalls introduced in the pipeline is 1. If the branch target address were only available after the ALU (EX) stage, there would be 2 stalls. Generally, if the target address is available after the kth stage, there will be (k − 1) stalls in the pipeline.
• Total number of stalls introduced in the pipeline due to branch instructions = Branch frequency × Branch penalty
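As an illustration with assumed numbers: if 20% of instructions are branches and the branch penalty is 1 cycle, branches add 0.20 × 1 = 0.2 stall cycles per instruction on average, raising the effective CPI of an otherwise ideal pipeline from 1 to 1.2.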

3) Data Dependency (Data Hazard)
A data hazard occurs when there is a conflict in the access of an operand location.
• E.g. two instructions I1 and I2, where I2 depends on I1:

Consider A = 10
I1: A <-- A + 5
I2: B <-- A × 2

There are 3 types of data hazards:
a) RAW - Read After Write
b) WAR - Write After Read
c) WAW - Write After Write

• A RAW hazard occurs when instruction J tries to read data before instruction I writes it.
Eg:
I: R2 <-- R1 + R3 ---------------(R2 written)
J: R4 <-- R2 + R3 ---------------(R2 read)
• A WAR hazard occurs when instruction J tries to write data before instruction I reads it.
Eg:
I: R2 <-- R1 + R3 ---------------(R3 read)
J: R3 <-- R4 + R5 ---------------(R3 written)
• A WAW hazard occurs when instruction J tries to write its output before instruction I writes it.
Eg:
I: R2 <-- R1 + R3 ---------------(R2 written)
J: R2 <-- R4 + R5 ---------------(R2 written)
• WAR and WAW hazards occur during out-of-order execution of instructions.
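The three cases reduce to simple set intersections. Here is a minimal Python sketch (my own illustration; the function and variable names are invented) that classifies the hazards instruction J (later) has on instruction I (earlier), given the registers each reads and writes:

def classify_hazards(I, J):
    hazards = []
    if I["writes"] & J["reads"]:
        hazards.append("RAW")  # J reads what I writes
    if I["reads"] & J["writes"]:
        hazards.append("WAR")  # J writes what I reads
    if I["writes"] & J["writes"]:
        hazards.append("WAW")  # both write the same location
    return hazards

I = {"reads": {"R1", "R3"}, "writes": {"R2"}}  # I: R2 <-- R1 + R3
J = {"reads": {"R2", "R3"}, "writes": {"R4"}}  # J: R4 <-- R2 + R3
print(classify_hazards(I, J))  # ['RAW']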

• Delayed Branch
• Delayed branch is a technique used in pipelined processors to minimize the
performance penalty caused by branch instructions.
• In most pipelined architectures, instructions are fetched and executed in
stages.
• A branch instruction disrupts this flow because the next instruction to
execute is not known until the branch condition is evaluated, leading to
pipeline stalls.
• To mitigate this, the concept of a "branch delay slot" is introduced.
• In a delayed branch mechanism, the processor fetches and executes the
instruction immediately after the branch, regardless of whether the branch is
taken or not.
• This means that even though a branch is supposed to change the flow of
execution, the instruction right after the branch (in the delay slot) is executed
before the branch is actually taken.
• This allows the pipeline to continue operating smoothly for at least one more
instruction cycle.
• Key points about delayed branch:
• The branch instruction doesn't take effect immediately.
• The instruction in the delay slot is executed whether or not the branch is taken.
• It requires compilers or programmers to carefully place useful instructions in the delay slot, which can complicate coding.

BEQ R1, R2, LABEL   ; Branch if R1 == R2
ADD R3, R4, R5      ; Delay slot: executed even if the branch is taken
LABEL:
...

Branch Prediction
• Branch prediction is a technique used in modern processors to improve the efficiency of pipelining by guessing the outcome of a branch instruction before it is resolved.
• Since branches occur frequently in code, predicting their outcome allows the pipeline to continue fetching and executing instructions without waiting for the branch resolution.
• There are two types of branch prediction:
1. Static Branch Prediction: The prediction is made based on fixed criteria, often defined by the compiler. Common static predictions include:
   • Predicting that backward branches (loops) will always be taken.
   • Predicting that forward branches will not be taken.

2. Dynamic Branch Prediction: The processor uses historical information to make more informed predictions. This involves maintaining a table (Branch History Table, BHT) that stores the outcomes of previous branches.
   • One-bit predictor: The simplest dynamic predictor. It maintains a single bit for each branch, indicating whether the branch was taken last time.
   • Two-bit predictor: A more sophisticated predictor that requires two consecutive mispredictions before changing the prediction, reducing the likelihood of misprediction in short loops (see the sketch below).
   • Branch Target Buffer (BTB): A cache-like structure that stores the target addresses of previously encountered branches to make predictions more efficient.
• Branch predictors are critical in deep pipelines because incorrect predictions can cause long stalls and waste processor cycles.
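The two-bit scheme is small enough to sketch in code. Below is a minimal Python model (my own illustration, not from the slides; the class and names are invented): counter values 0-1 predict "not taken", 2-3 predict "taken", and each actual outcome moves the counter one step, so a single misprediction cannot flip a strongly held prediction.

class TwoBitPredictor:
    def __init__(self):
        self.table = {}  # branch address -> 2-bit saturating counter (0..3)

    def predict(self, pc):
        return self.table.get(pc, 2) >= 2  # default: weakly taken

    def update(self, pc, taken):
        # Step toward the actual outcome, saturating at 0 and 3.
        c = self.table.get(pc, 2)
        self.table[pc] = min(c + 1, 3) if taken else max(c - 1, 0)

p = TwoBitPredictor()
outcomes = [True] * 9 + [False]  # a loop branch: taken 9 times, then falls through
correct = 0
for taken in outcomes:
    correct += (p.predict(0x100) == taken)
    p.update(0x100, taken)
print(f"{correct}/{len(outcomes)} correct")  # 9/10: only the final iteration misses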

• Relationship Between Delayed Branch and Branch Prediction
• Delayed branch is an older technique used in early RISC architectures
to handle branches without sophisticated prediction mechanisms. It
pushes the complexity to the compiler.
• Branch prediction is more common in modern architectures where
pipelines are much deeper, and handling branches dynamically with
hardware predictors is more efficient.
• In summary:
• Delayed branch avoids pipeline stalls by executing one instruction
regardless of the branch outcome.
• Branch prediction tries to guess the outcome of a branch early, so that the pipeline stalls only when a guess turns out to be incorrect.

• Amdahl's Law is a formula used to predict the theoretical maximum speedup of a program when only a portion of the program can be parallelized. It highlights the limits of parallel computing, showing how the non-parallelizable portion of a program bounds the overall speedup that can be achieved through parallelism.
• Amdahl's Law Formula:
• The formula is expressed as:

  Speedup = 1 / ((1 − P) + P / S)

  where P is the fraction of the program that can be parallelized and S is the speedup of the parallelized portion (e.g. the number of processors).
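As a quick worked example (my own numbers, not from the slides): if P = 0.9 of a program parallelizes perfectly across S = 10 processors, Speedup = 1 / ((1 − 0.9) + 0.9/10) = 1 / 0.19 ≈ 5.26, well below the 10× one might naively expect. Even as S → ∞, the speedup is capped at 1 / (1 − P) = 10.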
Flynn's Classification
Flynn's Classification is a system for categorizing computer architectures based on the number of instruction streams and data streams they can handle simultaneously. It was proposed by Michael J. Flynn in 1966.
Flynn's Classification is significant because it helps in understanding the architecture of computers and their potential for parallel processing. It provides a framework for researchers and engineers to design and evaluate different types of computing systems based on their processing capabilities. Each classification has its advantages and is suitable for different types of applications, influencing the choice of hardware and software design in computing.

This classification divides computer architectures into four primary
categories:
1. SISD (Single Instruction stream, Single Data stream)
Description: A traditional uniprocessor architecture where a single
instruction stream is executed on a single data stream.
Example: Most early computers and simple microcontrollers.
Characteristics:
• Each instruction operates on a single data element at a time.
• Limited parallelism.

2. SIMD (Single Instruction stream, Multiple Data streams)
Description: An architecture that allows the same instruction to
be executed simultaneously on multiple data elements.
Example: Vector processors, GPUs, and modern CPUs with
SIMD instructions.
Characteristics:
Useful for data-parallel tasks, such as matrix operations and
image processing.
Efficient for applications with regular data patterns.

3. MISD (Multiple Instruction streams, Single Data stream)
Description: An architecture that executes multiple instructions on
a single data stream.
Example: Very rare in practice; could be found in some fault-
tolerant systems or specialized processors.
Characteristics:
Primarily used for redundant computations or operations that
can benefit from multiple processing paths.
The single data stream can be processed by different
instructions, but this is not common in conventional
architectures.

4. MIMD (Multiple Instruction streams, Multiple Data streams)
Description: An architecture that allows multiple instructions to
operate on multiple data streams simultaneously.
Example: Most modern multicore processors and distributed systems.
Characteristics:
Highly versatile, supporting various parallel computing paradigms.
Each processor can execute different instructions on different data,
making it suitable for complex applications and multitasking.

Multicore Architecture
• Multicore architecture refers to the design of computer processors
that integrate multiple processing units, or cores, on a single chip.
• Each core can independently execute instructions and perform
computations, allowing for increased parallelism and improved
performance for a variety of applications.
• This architecture is commonly used in modern CPUs, GPUs, and
embedded systems.

Key Features of Multicore Architecture
1. Multiple Processing Units:
   • Each core can execute its own instruction stream and process data independently, allowing for simultaneous execution of multiple tasks.
2. Shared Resources:
   • Cores often share resources such as caches (L1, L2, and sometimes L3), memory interfaces, and I/O subsystems. This shared architecture can improve communication between cores but may lead to contention for resources.
3. Parallelism:
   • Multicore architectures support both data-level parallelism (where the same operation is performed on multiple data elements) and task-level parallelism (where different tasks are executed simultaneously). This makes them suitable for a wide range of applications, from high-performance computing to everyday consumer tasks.
4. Power Efficiency:
   • Multicore designs can achieve better performance per watt than single-core designs by allowing lower clock speeds for individual cores while leveraging parallelism. This can lead to reduced power consumption and heat generation.
5. Scalability:
   • Multicore systems can scale by adding more cores to meet increasing performance demands without significantly increasing the chip's complexity.

Advantages of Multicore Architecture
1. Improved Performance:
   • By executing multiple threads or processes simultaneously, multicore processors can significantly improve the performance of multithreaded applications, such as video editing, gaming, and scientific simulations.
2. Enhanced Multitasking:
   • Multicore processors can better handle multiple applications running concurrently, leading to smoother multitasking experiences for users.
3. Increased Throughput:
   • More cores can lead to higher throughput for applications designed to take advantage of parallel processing.
4. Better Resource Utilization:
   • With multiple cores, the system can dynamically allocate workloads to cores based on their current usage and performance needs.

Applications of Multicore Architecture
1. High-Performance Computing (HPC):
   • Used in supercomputers and data centers for scientific simulations, financial modeling, and large-scale data analysis.
2. Consumer Electronics:
   • Found in smartphones, tablets, and personal computers, enhancing performance for gaming, multimedia applications, and multitasking.
3. Embedded Systems:
   • Used in automotive systems, robotics, and IoT devices to perform complex tasks while managing power efficiency.
4. Cloud Computing:
   • Powers virtual machines and services in data centers, allowing for efficient resource allocation and scalability.

• Buses are crucial for enabling communication in computer systems. Three important examples are:
• Instruction Set Architecture (ISA) bus
• Peripheral Component Interconnect (PCI) bus
• Universal Serial Bus (USB)
• The ISA bus facilitates interaction within the CPU, while PCI and USB are essential for connecting and communicating with external devices.
• Each bus type has its own features, benefits, and applications, reflecting the diverse needs of modern computing environments.

1. Instruction Set Architecture (ISA) Bus
Definition: ISA bus refers to the bus architecture used by the instruction set
of a processor to communicate with memory and other components. It
defines the data paths, control signals, and the way instructions are fetched
and executed by the processor.
Key Features:
Data Transfer: Facilitates the transfer of data between the CPU and memory.
Control Signals: Generates control signals needed for instruction execution.
Addressing: Determines how addresses are generated for memory access.
Example: The ISA bus is not a physical bus in the same way as PCI or USB, but
it represents the interface through which the CPU interacts with memory
and I/O devices based on the processor's architecture.

2. Peripheral Component Interconnect (PCI) Bus
Definition: PCI is a local computer bus for attaching hardware devices to a
computer's motherboard. It allows multiple peripherals (such as graphics cards,
network cards, and sound cards) to communicate with the CPU and memory.
Key Features:
Parallel Communication: Uses a parallel communication method to transfer data, allowing
multiple devices to be connected and communicate simultaneously.
Bus Mastering: Supports bus mastering, where devices can take control of the bus to
transfer data without CPU intervention, improving performance.
Flexible Bandwidth: Offers various bandwidths, typically 32-bit or 64-bit data paths, with
speeds ranging from 33 MHz to 66 MHz.
Types:
PCI: Original standard.
PCI-X: An enhanced version for higher performance.
PCI Express (PCIe): A newer, high-speed serial bus standard that has largely replaced PCI.

3. Universal Serial Bus (USB)
Definition: USB is a standardized serial bus interface that allows communication
between computers and peripheral devices. It is widely used for connecting a variety
of devices, including keyboards, mice, printers, external storage, and more.
Key Features:
• Serial Communication: Unlike PCI, USB uses a serial communication method, which means data
is sent one bit at a time over a single channel.
• Hot Swapping: Supports hot swapping, allowing devices to be connected and disconnected
without shutting down the computer.
• Power Supply: Provides power to connected devices, enabling operation without a separate
power source.
• Plug and Play: Facilitates easy installation and configuration of devices.
Versions:
• USB 1.0/1.1: The original versions with low (1.5 Mbps) and full-speed (12 Mbps) data rates.
• USB 2.0: Enhanced version with a maximum speed of 480 Mbps (high-speed).
• USB 3.0: Introduced SuperSpeed (up to 5 Gbps).
• USB 3.1/3.2: Further improvements with higher data transfer rates (up to 20 Gbps).
• USB4: The latest iteration that integrates Thunderbolt 3, offering even higher speeds and
improved data transfer capabilities.

Bus contention and arbitration are important concepts in computer architecture,
particularly in systems with multiple devices sharing a common bus.
Here's a detailed overview of both concepts:
Bus Contention
Bus contention occurs when two or more devices attempt to use the bus at the same
time.
In a shared bus system, when multiple devices try to send data over the bus
simultaneously, it can lead to conflicts, resulting in data corruption or loss.
This contention can arise in various scenarios, such as:
• Multiple CPUs or processors trying to access shared memory.
• Several I/O devices attempting to communicate with the CPU concurrently.

Consequences of Bus Contention:
• Data Collision: When two devices transmit data simultaneously, the
signals may collide, leading to incorrect data being received.
• Increased Latency: Contention can slow down communication, as
devices may need to retry sending data after a collision.
• Resource Wastage: The processor or devices may waste cycles
waiting for the bus to become available, which can degrade system
performance.

Bus Arbitration
Bus arbitration is the process of managing access to the bus in a way that
resolves contention and ensures orderly communication between devices.
It determines which device gets control of the bus when multiple devices
request access simultaneously. There are various arbitration schemes, including:
1. Centralized Arbitration:
• A single bus arbiter (a dedicated controller) manages access to the bus.
• Devices send their requests to the arbiter, which grants access based on a
predefined priority scheme.
Pros:
• Simple to implement.
• Can enforce strict priorities.
Cons:
• The arbiter can become a bottleneck.
• Single point of failure if the arbiter fails.

2. Distributed Arbitration:
• Devices communicate directly with each other to negotiate access to the
bus.
• Each device has an equal opportunity to gain control of the bus, often using
algorithms such as token passing or random selection.
• Pros:
• Reduces the likelihood of bottlenecks.
• More robust, as there is no single point of failure.
• Cons:
• More complex to implement.
• Can introduce delays during arbitration.

Common Arbitration Schemes
1. Fixed Priority Arbitration:
   • Devices are assigned fixed priorities. The device with the highest priority gets access to the bus when multiple requests are made.
2. Round Robin Arbitration:
   • Each device gets an equal chance to use the bus in a rotating manner (see the sketch after this list). This prevents starvation of lower-priority devices but may not be efficient for high-priority tasks.
3. Random Arbitration:
   • Devices compete for bus access randomly. This method is less predictable but can be simple to implement.
4. Time Division Multiple Access (TDMA):
   • Each device is assigned a specific time slot during which it can access the bus. This guarantees access but may waste time if a device has no data to send.
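Round-robin arbitration is easy to express in code. Here is a minimal Python sketch (my own illustration, not from the slides; the function name is invented): each cycle, the search for a grant starts just after the previous winner, so every requester is served within one full rotation.

def round_robin_grant(requests, last_granted, n_devices):
    # requests: set of requesting device ids; returns the granted id or None.
    for offset in range(1, n_devices + 1):
        candidate = (last_granted + offset) % n_devices
        if candidate in requests:
            return candidate
    return None

last = -1  # nothing granted yet
for cycle, requests in enumerate([{0, 2}, {0, 2}, {1}, set()]):
    granted = round_robin_grant(requests, last, 4)
    if granted is not None:
        last = granted
    print(f"cycle {cycle}: requests={sorted(requests)} grant={granted}")
# Device 0 wins cycle 0; device 2 (not 0 again) wins cycle 1; device 1 wins cycle 2.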

Conclusion
• Bus contention and arbitration are critical aspects of designing
effective communication systems in multi-device architectures.
• Contention can lead to performance issues and data corruption, while
arbitration provides a structured way to manage bus access, ensuring
orderly communication among devices.
• Properly implementing arbitration schemes can significantly improve
the efficiency and reliability of bus-based communication systems.

Thank You!
([email protected])
