PDC Notes Complete- Updated

The document provides an overview of parallel and distributed computing, tracing the evolution of computers from early machines to modern architectures. It discusses key concepts such as Moore's Law, the characteristics and advantages of parallel and distributed computing, and the challenges associated with each. Additionally, it covers various computing architectures, including RISC vs. CISC, and introduces important principles like Amdahl's Law and pipelining techniques.


Lecture 01: Introduction to Parallel and Distributed Computing
Introduction to Early Computers
The first computers, such as ENIAC and UNIVAC, emerged in the 1940s and 1950s. These machines were
massive, room-sized structures that relied on vacuum tubes for processing. Compared to modern technology,
their processing capabilities were highly limited.
Evolution to Mainframes
By the mid-20th century, computers transitioned into mainframes. A significant milestone was IBM's
System/360, introduced in the 1960s. These mainframe computers, although still large, had significantly
improved processing power and capabilities.
Personal Computers Era
The late 20th century witnessed the rise of personal computers, making computing more accessible to
individuals. Companies like Apple and Microsoft played pivotal roles in popularizing personal computing.
The introduction of user-friendly interfaces and affordable hardware revolutionized the industry. Key
developments included the Apple II (1977) and IBM PC (1981), which marked the beginning of widespread
computer usage.

Moore's Law
Moore's Law refers to the observation made by Gordon Moore in 1965 that the number of transistors on a
microchip doubles approximately every two years, leading to an exponential increase in computing power
and a reduction in relative cost. This exponential growth in processing power has:
· Driven advancements in computer technology, making devices smaller, faster, and more efficient.
· Influenced technological innovation across various devices.
· Revolutionized modern life, from smartphones to supercomputers.

Serial Computation
Traditionally, software has been designed for serial computations, which run on a single computer with a
single Central Processing Unit (CPU). These computations involve breaking a problem into a set of discrete
instructions that are executed sequentially, meaning only one instruction is processed at any given time.
Introduction to Parallel Computing
Parallel computing is a type of computation where multiple processes are carried out simultaneously to solve
a problem faster. It works by breaking a problem into smaller, independent or interdependent parts that can
be processed concurrently. Each of these parts is further divided into a sequence of instructions, which are
then executed simultaneously across multiple CPUs or processors. This approach enhances performance,
reduces execution time, and optimizes resource utilization, making it essential for handling large-scale
computations in fields such as scientific research, data processing, and artificial intelligence.
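As a minimal sketch of this idea (assuming Python's standard multiprocessing module; the square task and worker count are illustrative, not from the lecture), a problem is split into independent parts that several processes compute at the same time:

# Minimal parallel-computing sketch: split work across several processes.
from multiprocessing import Pool

def square(n):
    return n * n  # the small instruction sequence applied to one part of the problem

if __name__ == "__main__":
    data = range(1, 1_000_001)            # the problem, decomposed into independent parts
    with Pool(processes=4) as pool:       # four worker processes run concurrently
        results = pool.map(square, data)  # parts are executed across the workers
    print(sum(results))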


Characteristics of Parallel Computing


· Multiple Processors: Uses two or more processors (CPUs, GPUs, or specialized hardware) to
perform computations simultaneously.
· Task Decomposition: A problem is broken into smaller tasks that can be executed in parallel.
· Concurrency: Multiple tasks are executed at the same time, reducing execution time.
· Inter-Processor Communication: Processors may need to share data or synchronize their execution.

Examples of Parallel Computing
· Blockchains
· Smartphones
· Laptop computers
· Internet of Things (IoT)
· Artificial intelligence and machine learning
· Space shuttles
· Supercomputers
Advantages of Parallel Computing
1. Increased Speed – Multiple calculations are executed concurrently, reducing the time required for large-
scale computations.
2. Efficient Use of Resources – Fully utilizes all processing units, maximizing computational power.
3. Scalability – More processors can be added to handle increasingly complex problems.
4. Improved Performance for Complex Tasks – Best suited for numerical simulations, scientific analysis,
modeling, and data processing.

Disadvantages of Parallel Computing


1. Complexity in Programming – Writing parallel programs is more challenging than serial programming.
2. Synchronization Issues – Multiple processors operating concurrently can face synchronization
challenges.
3. Hardware Costs – Implementing parallel computing requires multi-core processors, which can be
expensive.
Introduction to Distributed Computing
Distributed computing is a computing paradigm in which multiple autonomous computers work together to
solve a problem while appearing as a single system to the user. Unlike parallel computing, distributed
systems do not share memory; instead, they communicate through message passing to coordinate tasks. In
such systems, a single computational task is divided among multiple computers, each handling a portion of
the workload independently. This approach enhances scalability, fault tolerance, and resource efficiency,
making it suitable for large-scale applications such as cloud computing, big data processing, and
decentralized networks.

Examples of Distributed Computing

· Artificial Intelligence and Machine Learning
· Scientific Research and High-Performance Computing
· Financial Sectors
· Energy and Environmental Sectors
· Internet of Things (IoT)
· Blockchain and Cryptocurrencies

Advantages of Distributed Computing


1. Fault Tolerance – If one node fails, other nodes continue computing, making the system more
reliable.
2. Cost-Effective – Utilizes commodity hardware instead of expensive specialized processors.
3. Scalability – Can expand by adding more machines to handle larger workloads.
4. Geographic Distribution – Allows tasks to be executed at different locations, reducing latency.

Disadvantages of Distributed Computing


1. Complexity in Management – Requires handling network failures, synchronization issues, and
latency.
2. Communication Overhead – Increased inter-node communication can impact overall performance.
3. Security Concerns – More vulnerable to security risks compared to centralized systems due to
network dependencies.
Flynn’s Taxonomy
Flynn’s Taxonomy classifies computing systems based on parallelism in instruction and data streams. A
system with n CPUs has n instruction streams that execute in parallel. The possible combinations are:
SISD (Single Instruction Single Data): A processor that executes one job at a time from start to finish.

SIMD (Single Instruction Multiple Data): A single control unit (CU) with multiple processing elements
(PEs). CU fetches and decodes an instruction, broadcasting control signals to all PEs. All PEs execute the
same instruction synchronously on different data sets.

MISD (Multiple Instructions Single Data): A rare type of parallel computing architecture in which
many functional units perform different operations on the same data. It is seldom used because of its
limited data throughput.

MIMD (Multiple Instructions Multiple Data): Machines using MIMD have a number of processor
cores that function asynchronously and independently. At any time, different processors may be executing
different instructions on different pieces of data.
Communication in Parallel Programs

In parallel computing, communication between tasks is essential for data sharing and synchronization.
Efficient communication strategies impact performance by minimizing delays and optimizing resource
utilization.

1. Importance of Communication

· Most parallel applications require tasks to share data.


· The cost of communication includes:
o Resource Utilization: Data transmission consumes computational resources.
o Synchronization Overhead: Frequent synchronization can cause some tasks to wait.
o Network Bottlenecks: Excessive communication can saturate network bandwidth.

2. Latency vs. Bandwidth


· Latency: The time required to send a minimal message between tasks.
· Bandwidth: The amount of data transmitted per unit time.
· Impact: Sending many small messages increases latency overhead, reducing overall efficiency.

3. Synchronous vs. Asynchronous Communication


· Synchronous (Blocking): Tasks pause execution until communication is complete.
· Asynchronous (Non-blocking): Tasks continue working while communication occurs in the
background.

4. Scope of Communication
· Point-to-Point Communication: Direct data transfer between two tasks.
· Collective Communication: Involves all tasks within a communication group, such as broadcasting
or gathering data.

Feature | Parallel Computing | Distributed Computing
Memory Architecture | Shared memory | Each node has its own memory
Communication | High bandwidth, low latency | Relies on message passing
Task Focus | Solves a single complex problem | Handles multiple tasks simultaneously
Scalability | Limited by processor and memory capacity | Scales horizontally by adding more machines
Coordination | Requires synchronization | Nodes operate independently

Parallel Computer Memory Architecture


1. Shared Memory Systems: In shared memory systems, multiple processors access the same physical
memory. This allows for efficient communication between processors because they directly read
from and write to a common memory space. Shared memory systems are typically easier to program
than distributed memory systems due to the simplicity of memory access.
Advantages:
· Global address space simplifies programming.
· Fast and uniform data sharing.
Disadvantages:
· Scalability issues as CPU count increases.
· Programmers must ensure correct access to shared memory.

2. Distributed Memory Systems: Distributed memory systems consist of multiple processors, each
with its own private memory. Processors communicate by passing messages over a network. This
design can scale more effectively than shared memory systems, as each processor operates
independently, and the network can handle communication between them.
Advantages:
· Scalable memory with increasing CPUs.
· Each CPU can access its local memory quickly.
Disadvantages:
· Programmers must manage data communication.
· Difficult to map global memory structures.

3. Hybrid Systems: Hybrid systems combine elements of shared and distributed memory architectures.
They typically feature nodes that use shared memory, interconnected by a distributed memory
network. Each node operates as a shared memory system, while communication between nodes
follows a distributed memory model. Within a node, tasks can communicate quickly using shared
memory, while inter-node communication uses message passing.

Advantages:
· Improved scalability.
Disadvantages:
· Increased programming complexity.

LECTURE -3
RISC vs. CISC Processors

Reduced Instruction Set Computer (RISC) is a type of computer architecture that utilizes a small, highly
optimized set of instructions. Unlike Complex Instruction Set Computer (CISC) architectures, which have a
large number of specialized instructions, RISC systems focus on executing a limited number of simple
instructions efficiently. This design philosophy leads to faster execution speeds, as RISC processors can
complete most instructions in a single clock cycle. Additionally, RISC architectures emphasize a uniform
instruction format, a large number of general-purpose registers, and a load/store approach, where memory
access is limited to specific instructions. These characteristics enhance performance, power efficiency, and
parallel execution, making RISC processors widely used in modern computing, especially in mobile and
embedded systems.
Advantages of RISC:
· Faster processing due to simple instruction decoding.
· Lower power consumption.
· High efficiency in portable devices.
Disadvantages of RISC:
· Requires more instructions for complex tasks.
· Higher memory usage.

Complex Instruction Set Computer (CISC) is a type of computer architecture that utilizes a large and
diverse set of instructions, allowing a single instruction to perform multiple low-level operations, such as
memory access, arithmetic computations, and complex addressing modes. Unlike Reduced Instruction Set
Computer (RISC) architectures, which focus on executing simple instructions quickly, CISC architectures
aim to reduce the number of instructions per program by using more complex and multi-step instructions.
This approach helps in minimizing the need for multiple instructions to perform a task, reducing the number
of memory accesses and instruction fetches. However, CISC processors often require multiple clock cycles
to execute a single instruction, making them potentially slower than RISC processors for certain operations.
Despite this, CISC remains widely used, especially in x86-based systems, where backward
compatibility is important.
Advantages of CISC:
· Reduced code size due to complex instructions.
· More memory-efficient.
· Long-established with broad software support.
Disadvantages of CISC:
· Slower execution due to complex instruction decoding.
· Higher power consumption.
Comparison of RISC vs. CISC
Feature | RISC | CISC
Focus | Software | Hardware
Control Unit | Hardwired | Hardwired & Microprogrammed
Transistor Usage | More registers | Storing complex instructions
Instruction Size | Fixed | Variable
Execution Type | Register-to-register operations | REG-to-REG, REG-to-MEM, MEM-to-MEM
Register Usage | More registers required | Fewer registers required
Code Size | Large | Small
Execution Speed | Single cycle per instruction | Multiple cycles per instruction
Addressing Modes | Simple | Complex
Pipelining | Highly pipelined | Less pipelined
Power Consumption | Low | High
RAM Requirement | More | Less
LECTURE -7
Amdahl’s Law is a principle in computer architecture that defines the potential speedup of a system
when improving a specific part of it, particularly through parallel computing.
S = T1 / T2 represents speedup, where:
· T1 is the time taken by machine 1 (slower machine).
· T2 is the time taken by machine 2 (faster machine).
· Since T2 is smaller than T1, speedup S quantifies how much faster machine 2 is compared
to machine 1.
Amdahl’s Law and Parallel Processing
Gene Amdahl, in 1967, highlighted the limitations of improving performance by adding more
processors. He pointed out that:
· Speedup is constrained by the portion of the task that cannot be parallelized.
· If a fraction of the computation must still be executed sequentially, the overall performance
gain will be limited, even if other parts run in parallel.
This result says that, no matter how much one type of operation in a system is improved,
the overall performance is inherently limited by the operations that are unaffected by the
improvement. For example, the best speedup that could be obtained in a parallel computing system
with p processors is p.
However, if 10% of a program cannot be executed in parallel (a serial fraction α = 0.1), the overall
speedup when using the parallel machine is at most 1/α = 1/0.1 = 10, even if an infinite number of
processors were available.
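A small illustrative calculation of this limit (the helper below is a sketch, not part of the lecture), using the usual form S = 1 / (α + (1 - α)/p), with α as the serial fraction and p processors:

# Amdahl's Law sketch: alpha is the serial (non-parallelizable) fraction of the program.
def amdahl_speedup(alpha, p):
    return 1.0 / (alpha + (1.0 - alpha) / p)

for p in (2, 8, 64, 1_000_000):
    print(p, round(amdahl_speedup(0.1, p), 2))
# As p grows, the speedup approaches 1/alpha = 10, matching the 10% example above.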

An obvious corollary to Amdahl's law

o any system designer or programmer should concentrate on making the common case fast.

o That is, operations that occur most often will have the largest value of α.

o Thus, improving these operations will have the biggest impact on overall performance.

o Interestingly, the common cases also tend to be the simplest cases.


o As a result, optimizing these cases first tends to be easier than optimizing the more complex,
but rarely used, cases.

Scaling Amdahl’s law

o One of the major criticisms concerning Amdahl’s law has been that it emphasizes the wrong
aspect of the performance potential of parallel-computing systems.

o The argument is that purchasers of parallel systems want to solve larger problems within the
available time.

o Following this line of argument leads to the following “scaled” or “fixed-time” version of
Amdahl's law.

o It is common to judge the performance of an application executing on a parallel system by
comparing the parallel execution time with p processors, Tp, with the time required to execute the
equivalent sequential version of the application program, T1, using the speedup Sp = T1/Tp.

o With the fixed-time interpretation, however, the assumption is that there is no single-processor
system that is capable of executing an equivalent sequential version of the parallel application.

o The single processor may not have a large enough memory, for example, or the time required to
execute the sequential version would be unreasonably long.
LEC 8: Introduction to Pipelining
Pipelining is a technique in computer architecture that overlaps the execution of multiple instruction
stages to boost performance. It allows a processor to work on different instructions simultaneously,
improving efficiency. Pipelining divides the instruction execution process into smaller stages where
each stage processes a different instruction at the same time, enabling parallel execution. This
increases instruction throughput, meaning more instructions are completed in less time, and allows
the CPU to handle multiple tasks in a single clock cycle.

Need for Pipelining:


· Improves Performance & Speed: Without pipelining, the CPU processes one instruction at a
time, wasting resources. Pipelining allows multiple instructions to be processed simultaneously,
increasing throughput.
· Uses Resources Efficiently: Different CPU units work in parallel, reducing idle time.
· Supports Higher Clock Speeds: Smaller tasks per stage allow faster execution, enabling higher
clock frequencies.
· Reduces Execution Delays: Overlapping instructions minimizes the wait time between
instructions.
· Scales Complex Architectures: Pipelining is key for RISC processors and supports advanced
designs.
· Enhances Multi-Core Systems: Pipelining manages multiple instruction flows in multi-core
and superscalar CPUs.

Types of Pipelining
1. Hardware Pipelining: Hardware pipelining is a method used in computers to speed up
processing by dividing tasks into smaller steps and working on different steps at the same time.
There are different types of pipelining
· Instruction pipelining breaks instructions into steps like fetch, decode, and execute, so the
processor can handle many instructions at once. For example, the 5-stage RISC pipeline.
· Arithmetic pipelining helps with solving complex math problems, like floating-point
operations, using special units like FPUs.
· Data pipelining moves data quickly between memory and input/output devices, often using
tools like DMA controllers.
2. Software Pipelining: Software pipelining involves compiler optimizations to rearrange
instructions for better performance without hardware changes.
· Loop Unrolling: Reduces loop overhead by combining iterations. Example: A 10-iteration
loop runs 2 iterations per cycle.
· VLIW Pipelining: Schedules multiple operations in one instruction. Example: DSP
processors.
· Speculative Execution: Executes instructions before conditions are confirmed to avoid
stalls. Example: Branch prediction.
· Software Parallelism: Breaks tasks into threads for multi-core processors. Example:
OpenMP, CUDA.
Tradeoff Between Cost and Performance
Pipelining boosts performance but increases costs due to hardware complexity, power usage, and
design challenges. Balancing cost and performance depends on the system’s needs.
Scenario | Performance Approach | Cost Implication
Simple CPU (e.g., embedded systems) | Few pipeline stages, lower complexity | Reduced hardware cost, lower power usage
High-Speed CPUs (e.g., gaming, AI) | Deep pipelining, out-of-order execution | Expensive fabrication, higher power
Superpipelining | More stages, higher clock speed | Complex hazard handling
Superscalar | Multiple pipelines, more instructions/cycle | More execution units, higher cost
Software Pipelining | Compiler optimizations for performance | No hardware cost, but complex software

· Choose Pipeline Depth: Avoid too many stages to reduce hazards.
· Use Hazard Mitigation: Techniques like forwarding and branch prediction improve
efficiency.
· Optimize Scheduling: Software pipelining reduces hardware needs.
· Select Architecture: RISC uses deeper pipelines; CISC may avoid excessive pipelining.
· For cost-sensitive systems (e.g., IoT), use simpler pipelines. For performance-critical
systems (e.g., AI), use deeper pipelines with advanced hazard handling.

Stages in Pipelining
1. Instruction Fetch (IF)

The Instruction Fetch stage is responsible for retrieving the instruction from memory, which may be
located in RAM or cache. During this stage, the Program Counter (PC) is updated to point to the
next instruction in sequence. However, this stage can face issues such as control hazards due to
branch instructions, which can disrupt the instruction flow. To address these challenges,
optimizations like instruction prefetching and branch prediction are commonly employed to
improve performance and maintain a steady pipeline flow.

2. Instruction Decode (ID)

In the Instruction Decode stage, the fetched instruction is decoded to determine the type of
operation and identify the required operands. This stage also generates control signals that direct
other parts of the CPU, particularly the execution unit. To enhance efficiency and reduce stalls
caused by data hazards, techniques such as register renaming and out-of-order execution are often
applied.

3. Execute (EX)

The Execute stage is where the actual operation takes place. This may involve arithmetic or logical
calculations using the Arithmetic Logic Unit (ALU), or it may involve evaluating branch
conditions. To ensure efficient processing, this stage can benefit from optimizations like operand
forwarding, which minimizes delays in data availability, and the use of multiple execution units to
allow parallel execution of instructions.

4. Memory Access (MEM)

During the Memory Access stage, the processor handles load and store operations, which involve
reading from or writing to memory. This stage is crucial for data retrieval and storage. Performance
enhancements at this stage include the use of cache memory and multi-level caching systems, which
significantly reduce memory access time and improve overall throughput.

5. Write Back (WB)

The final stage, Write Back, involves storing the results of computations into the processor’s
registers so that they can be used in subsequent instructions. To optimize this process, techniques
like register forwarding and write buffers are utilized, which help to reduce write delays and support
faster access to updated data by future instructions.

Example of Instruction Execution


1. ADD R1, R2, R3
2. SUB R4, R5, R6
3. LOAD R7, 0(R8)
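The way these three instructions would overlap in an ideal five-stage pipeline can be sketched as follows (a simplified illustration that ignores hazards and stalls):

# Simplified 5-stage pipeline schedule for the three instructions above (no hazards modeled).
STAGES = ["IF", "ID", "EX", "MEM", "WB"]
instructions = ["ADD R1, R2, R3", "SUB R4, R5, R6", "LOAD R7, 0(R8)"]

total_cycles = len(instructions) + len(STAGES) - 1
for i, instr in enumerate(instructions):
    # Instruction i enters the pipeline at cycle i+1 and moves one stage per cycle.
    row = []
    for cycle in range(1, total_cycles + 1):
        stage = cycle - i - 1
        row.append(STAGES[stage] if 0 <= stage < len(STAGES) else "   ")
    print(f"{instr:<16}", " ".join(f"{s:>3}" for s in row))
# With pipelining the three instructions finish in 7 cycles instead of 15 sequential cycles.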

Characteristics of Pipelining
· Multiple Instructions: Different instructions are processed in parallel.
· Segmented Cycle: Instruction cycle is split into stages (e.g., IF, ID, EX).
· Operation Overlap: Stages work simultaneously on different instructions.
· Increased Throughput: More instructions completed per unit time.
· Reduced Execution Time: Instructions complete faster once the pipeline is full.
· Pipeline Depth: More stages improve throughput but increase complexity.
· Hazards: Structural, data, and control hazards can cause stalls.
· Speedup: Ideally equals the number of stages, but hazards reduce this.
· Dependency Management: Forwarding and branch prediction handle dependencies.
· Latency vs. Throughput: Throughput improves, but individual instruction latency remains.

Hazards in Pipelining
1. Data Hazards:
o Read After Write (RAW): Instruction needs a result not yet written.
o Write After Read (WAR): Later instruction writes before earlier reads.
o Write After Write (WAW): Two instructions write to the same register out of
order.
o Solution: Register renaming, operand forwarding.
2. Structural Hazards: Occur when hardware resources (e.g., memory) are insufficient
3. Control Hazards: Caused by branch instructions disrupting the instruction flow.
o Solutions: Branch prediction, delay slots, speculative execution.
Register Renaming
Register renaming eliminates data hazards by assigning different physical registers to the same
logical register. This prevents conflicts and is used in out-of-order and superscalar processors.

Lecture #09: Shared and Distributed Memory in Parallel Computing
Parallel and distributed computing rely on efficient memory management to enable multiple
processors to collaborate. The two primary memory architectures are shared memory and
distributed memory, each with unique characteristics, advantages, and challenges.

Shared Memory Architecture: Shared memory systems provide a single, unified memory space that all
processors can access.

Hardware Mechanisms
· Memory Bus: A high-bandwidth bus connects processors to centralized DRAM, but contention
can slow performance.
· Cache Coherence: Protocols ensure all processors see consistent memory values.

Synchronization Challenges
· Race Conditions: Locks, semaphores, and monitors are needed to prevent conflicts when
multiple processors access the same data.
· False Sharing: When processors modify different variables in the same cache line, performance
degrades due to unnecessary invalidations.
· Memory Bandwidth: Limited bandwidth can bottleneck processor performance.
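A minimal sketch of the race-condition point above (assuming Python's standard threading module; threads stand in for processors sharing one address space): without the lock, concurrent increments of the shared counter can be lost.

# Shared-memory race condition sketch: two threads update one shared counter.
import threading

counter = 0
lock = threading.Lock()

def increment(n):
    global counter
    for _ in range(n):
        with lock:        # remove the lock and some updates may be lost (race condition)
            counter += 1

threads = [threading.Thread(target=increment, args=(100_000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 200000 with the lock; often less without it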

Advantages
· Simpler Programming: Developers work with a single memory space, making coding easier.
· Faster Communication: Data sharing within shared memory is quick.
· Hardware-Managed Coherence: Cache coherence is handled automatically.

Disadvantages
· Scalability Issues: Bus contention limits the number of processors.
· Limited Memory Capacity: Total memory is constrained by the system.
· Single Point of Failure: A memory failure can affect the entire system.
Applications
· Multiprocessor systems for scientific simulations.
· Operating systems for efficient task management.

Distributed Memory Architecture: Distributed memory systems consist of independent processors, each
with its own private memory. Processors communicate by sending messages over a network.

Hardware Mechanisms
· Processors: Each node has its own CPU and local DRAM.
· Network Interface Controller (NIC): Facilitates message-based communication.
· Interconnection Network: Connects processors using topologies like mesh or torus.

Advantages
· High Scalability: Adding more processors is straightforward.
· Larger Memory Capacity: Aggregates memory across nodes.
· Fault Tolerance: A single processor failure doesn’t crash the system.

Disadvantages
· Complex Programming: Explicit message passing requires careful design.
· Slower Communication: Network latency increases data transfer time.
· Cache Management Complexity: No hardware coherence, increasing software complexity.

Applications

· High-performance computing clusters for large-scale problems.


· Distributed systems with geographically dispersed tasks.

Cache Coherence in Shared Memory: Cache coherence ensures data consistency across local caches in a
shared-memory multiprocessor system. It prevents conflicts from stale or inconsistent data. Each
processor has its own cache to speed up memory access. Without coherence:

· Read Consistency: Processors may see different values for the same memory location.
· Write Propagation: Updates by one processor may not be visible to others.
· Serialization: Writes may occur in an undefined order, causing errors.
Cache Coherence Problems
· Inconsistent Data: One processor updates a variable in its cache, but others read an outdated
value from memory.
· False Sharing: Two processors modify different variables in the same cache line, causing
unnecessary invalidations and performance loss.

Cache Coherence Protocols

1. Directory-Based Protocols: Directory-based protocols use a centralized directory to keep track of
which caches hold copies of each memory block. When a cache updates a block, the directory
ensures that all other caches with copies of that block are updated or invalidated accordingly. This
approach helps maintain memory consistency across the system. Directory-based protocols are
especially useful in large-scale multiprocessor systems where scalability and reduced bus traffic are
important.

2. Snooping Protocols: Snooping protocols work by having each cache monitor, or "snoop on" a
shared communication bus to detect changes made by other caches. When a cache updates a
memory block, it broadcasts the change or an invalidation signal to all other caches. This ensures
that no stale data is used. Snooping protocols are typically employed in smaller systems due to their
simplicity and reliance on a single shared bus for communication.

Example Issues

· Inconsistency:
o CPU 1 and CPU 2 load X = 10 into their caches.
o CPU 1 updates X = 20, but CPU 2 still sees X = 10 without coherence.
· False Sharing:
o Two CPUs modify different variables in the same cache line, triggering unnecessary
invalidations.

MESI Protocol (Modified, Exclusive, Shared, Invalid): The MESI protocol is a snooping-based cache
coherence protocol that ensures consistency in shared-memory systems.

MESI States

State | Meaning
Modified (M) | Cache block is modified, differs from main memory, and is unique to this cache. Must be written back before replacement.
Exclusive (E) | Cache block is clean, matches main memory, and exists only in this cache.
Shared (S) | Cache block matches main memory and exists in multiple caches. Reads are allowed, but writes require invalidation.
Invalid (I) | Cache block is invalid and cannot be used.

MESI Operations

· Read Operation:
o M or E: Read from cache.
o S: Read from cache; other caches may have copies.
o I: Cache miss; fetch from memory or another cache.
· Write Operation:
o M: Write to cache; no other copies exist.
o E: Write to cache, transition to M.
o S: Invalidate other caches, write, transition to M.
o I: Fetch data, invalidate others, write, transition to M.

State Transitions

· From Invalid (I):
o Read → S or E (depends on other caches).
o Write → M (after invalidating others).
· From Shared (S):
o Another processor writes → I.
o This processor writes → M (after invalidating others).
· From Exclusive (E):
o This processor writes → M.
o Another processor reads → S.
· From Modified (M):
o Another processor reads → S or I (may write back to memory).

Example Scenario

1. Processor 1 loads X = 10 (cache miss, state: E).
2. Processor 2 reads X (both caches transition to S).
3. Processor 1 writes X = 20 (invalidates Processor 2’s copy, transitions to M).
4. Processor 2’s cache is now I.
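A toy walk-through of this scenario (illustrative only; real MESI logic lives in cache hardware, not software) can be modelled with a dictionary holding each cache's state for block X:

# Toy MESI walk-through for one block X cached by two processors.
states = {"P1": "I", "P2": "I"}  # both caches start Invalid

def read(cpu):
    others_valid = any(s != "I" for c, s in states.items() if c != cpu)
    states[cpu] = "S" if others_valid else "E"
    for c in states:                      # any other holder of the block downgrades to Shared
        if c != cpu and states[c] in ("E", "M"):
            states[c] = "S"

def write(cpu):
    for c in states:
        if c != cpu:
            states[c] = "I"               # invalidate every other copy
    states[cpu] = "M"

read("P1");  print(states)   # {'P1': 'E', 'P2': 'I'}
read("P2");  print(states)   # {'P1': 'S', 'P2': 'S'}
write("P1"); print(states)   # {'P1': 'M', 'P2': 'I'}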

Advantages of MESI

· Reduces memory traffic by allowing shared read access.
· Ensures data consistency across caches.
· Avoids unnecessary memory writes for efficiency.
Lecture #10: Introduction to Distributed Systems
A distributed system is a collection of independent computers that work together to appear as a
single coherent system to users. These computers, or nodes, communicate and coordinate only by
passing messages. Each node is a complete computer with its own peripherals, operating system,
file system, and administration. Distributed systems are more loosely coupled than multicomputer
systems, meaning nodes operate more independently.
Strengths and Weaknesses of Loose Coupling
· Strength: Nodes can be used for diverse applications due to their independence.
· Weakness: Programming is challenging because there’s no common underlying model.

Significant Characteristics of Distributed Systems


1. Concurrency: Multiple computers execute programs simultaneously, sharing resources like
web pages or files. Coordinating these concurrent programs is a key challenge.
2. No Global Clock: Clocks in different computers cannot be perfectly synchronized. Programs
rely on message passing to coordinate actions.
3. Independent Failures: Failures are unique to distributed systems:
· Network faults can isolate computers, but they continue running.
· Slow networks or crashed programs are hard to distinguish.
· Failures of one component aren’t immediately known to others.
4. Computer Clocks and Timing: Clocks drift from perfect time at different rates (clock drift
rate). Timing differences complicate coordination.

Types of Distributed Systems


1. Synchronous Distributed Systems: A synchronous distributed system, as defined by
Hadzilacos and Toueg, is a system in which the timing of operations is predictable and bounded.
Specifically, it satisfies the following conditions:

1. Each step of a process is completed within known lower and upper time bounds.
2. Messages transmitted over communication channels are guaranteed to be delivered
within a known maximum time.
3. Each process has a local clock whose drift from real time is bounded by a known limit.

Advantages:
One of the main benefits of synchronous systems is that timeouts can be used reliably to detect
process failures. Since the timing of processes and messages is predictable, the system can be
designed with precise failure detection and recovery mechanisms. Such systems are feasible when
processor cycles and network capacity are guaranteed and consistent.

Example:
An example of a synchronous system would be a tightly controlled environment where both
computing and networking resources are dedicated and reserved, such as real-time industrial control
systems or embedded systems in aerospace applications.

2. Asynchronous Distributed Systems: Asynchronous distributed systems do not have guaranteed
bounds on process execution speeds, message transmission delays, or clock drift rates. Each
component operates independently without assumptions about timing, making synchronization and
coordination more complex.

Examples:
The Internet is a classic example of an asynchronous system. Server loads, message transmission
times (such as those in FTP transfers or email delivery), and user interactions vary widely, making
timing unpredictable.

Challenges:
Asynchronous systems pose several design challenges. For instance, tasks like multimedia
streaming that rely on meeting deadlines become difficult to manage reliably. Additionally, because
of unpredictable delays, users often multitask—like browsing other tabs—while waiting for
responses.

In practice, most real-world distributed systems are asynchronous. This is due to the shared nature
of processors and networks, where resource contention and variable conditions prevent the system
from maintaining fixed timing guarantees.

Grid Computing: Grid computing connects multiple, often geographically dispersed computers to act
as a virtual supercomputer, enabling resource sharing across organizations.
· Resource Sharing: Combines computing power, storage, and data across machines.
· Heterogeneous Systems: Supports diverse hardware, OS, and networks.
· Geographical Distribution: Nodes can be anywhere but collaborate seamlessly.
· Decentralized Control: No single authority; different organizations manage parts.
· Parallel Processing: Tasks are divided across nodes for efficiency.
· High Scalability: Easily add more computers to boost performance.

Cluster Computing: Cluster computing involves a group of interconnected computers (nodes) working
as a single system. Unlike grids, clusters are tightly coupled and typically located
in one place.
· Tightly Coupled: Nodes are connected via high-speed local networks.
· Homogeneous Hardware: Nodes usually have similar hardware and OS.
· Single Administration: Managed as one system by a single organization.
· Parallel Processing: Tasks are split across nodes for faster execution.
· High Availability: If one node fails, others take over to ensure uptime.
Types of Clusters
1. High-Performance Computing (HPC) Clusters: Used for scientific simulations, weather
forecasting, AI training. Example: Supercomputers
2. High-Availability (HA) Clusters: Ensures system uptime by handling failures
automatically. Example: Banking systems.
3. Load Balancing Clusters: Distributes workloads across servers for optimal performance.
Example: Web servers.

Event Ordering in Distributed Systems


In distributed systems, actions are modelled as three types of events:
o Internal Events: Actions within a single process.
o Message Send Events: Sending a message to another process.
o Message Receive Events: Receiving a message from another process.
Determining whether an event (send or receive) at one process happened before, after, or
concurrently with another event is critical. Since clocks cannot be perfectly synchronized,
Lamport's logical clocks provide a way to order events across different computers.
Lamport's Logical Clocks
· Assigns a logical timestamp to each event.
· Ensures that if event A causes event B (e.g., via a message), A’s timestamp is less than B’s.
· Helps establish a partial ordering of events, even without synchronized clocks.

Model of Distributed Executions


A distributed execution model describes how processes in a distributed system interact and how
events are ordered. It is typically based on:
1. Processes: A distributed system consists of multiple processes p1,p2,p3 .....pn that execute
independently and communicate via messages.
2. Events: Each process performs a sequence of events. Events within a process are totally ordered,
but globally, there's no inherent total order.
3. Communication: Processes communicate by sending and receiving messages. A message sent
from one process to another introduces a dependency between the sending and receiving events. Cij
denote the channel from process pi to pj and mij denote message sent by pi to pj.
4. Execution Trace: A complete execution can be seen as a set of all events across all processes
and the relationships between them.

Causal Precedence Relationship (→ or "happened-before")


Introduced by Leslie Lamport, the happened-before relation (→) captures causal relationships
between events:

Given two events a and b, we say a → b (a happened-before b) if:

1. Same Process: If a and b are in the same process, and a occurs before b, then a → b.
2. Message Send/Receive: If a is the sending of a message and b is the receipt of that message,
then a → b.
3. Transitivity: If a → b and b → c, then a → c.

If neither a → b nor b → a, then events a and b are concurrent, denoted a || b.

The rules of causal precedence (→)


HB1 (Same process): If two events occur in the same process and one happens before the other: if
∃ process p such that e occurs before e′ within p, then e → e′
HB2 (Message passing): If a message is sent and then received, the send event precedes the receive
event: send(m) → receive(m)
HB3 (Transitivity): If e → e′ and e′ → e″, then e → e″. This builds a partial order of events
Concurrent Events
Two events a and e are concurrent (a || e) if:
· Neither a → e nor e → a.
Event a doesn't causally affect e, and e doesn't causally affect a.
Hence, no information can be guaranteed to flow between them.

Logical vs. Physical Concurrency

· Physical concurrency: Events that occur simultaneously in real (wall-clock) time.


· Logical concurrency: Events that:
o Do not causally affect each other (i.e., there’s no “happened-before” relationship).
o May not occur at the same physical time, but are considered concurrent in the logical
sense.

Logical concurrency is essential for understanding independence of events in distributed systems,
especially when no shared clock exists.

Logical Time and Logical Clocks

Leslie Lamport (1978) introduced the concept of logical clocks to define a partial ordering of events
in a distributed system. Each process pi maintains a logical clock Li, which is a monotonically
increasing counter.

Logical clocks help determine the happened-before relation: if e → e′, then L(e) < L(e′).

Lamport Clock Rules

LC1 (Local Event Rule): Before executing any event at process pi, increment its clock:

Li := Li + 1

LC2 (Message Sending/Receiving Rule):

· a) Sending: When sending message m, send it with timestamp t = Li.
· b) Receiving: When process pj receives a message with timestamp t:

Lj := max(Lj, t)

Then apply LC1 again to timestamp the receive event.


· Incrementing by 1 is a convention; any positive increment works.
· Using induction, it can be proved that if e → e′, then L(e) < L(e′).
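A minimal sketch of rules LC1 and LC2 (illustrative Python; the two processes and the single message are made up for the example):

# Lamport logical clock: LC1 increments before each event, LC2 merges clocks on receive.
class LamportClock:
    def __init__(self):
        self.time = 0

    def local_event(self):      # LC1: increment before the event
        self.time += 1
        return self.time

    def send(self):             # LC2a: the message carries the current timestamp
        return self.local_event()

    def receive(self, t):       # LC2b: take the maximum, then apply LC1 again
        self.time = max(self.time, t)
        return self.local_event()

p1, p2 = LamportClock(), LamportClock()
t = p1.send()          # p1's clock becomes 1; message carries t = 1
p2.local_event()       # p2's clock becomes 1
print(p2.receive(t))   # p2's clock becomes max(1, 1) + 1 = 2, so send (1) < receive (2)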

VECTOR CLOCKS

Vector clocks were developed by Mattern (1989) and Fidge (1991). They solve a problem with
Lamport clocks: L(e) < L(e′) does not imply that e causally precedes e′. A vector clock is:

· An array of N integers for N processes.

· Each process pi maintains its own vector Vi for local timestamps.

· VC1 (Initialization): All elements of all vectors start at 0.
· VC2 (Event timestamping): When pi performs an event, it increments Vi[i].
· VC3 (Sending messages): Messages carry the current vector clock.
· VC4 (Receiving messages): Receiving process merges its vector with the one received.

Each event in a distributed system can be given a vector timestamp, which is an array of integers,
one element per process in the system.

Comparisons between vector timestamps:

· Equality (V = V′): All elements are equal.
· Less than or equal (V ≤ V′): Each element of V is less than or equal to the corresponding
element in V′.
· Strictly less than (V < V′): V ≤ V′ and V ≠ V′.

Example:

· V = (3, 4, 5)
· V′ = (3, 6, 5)
→ V < V′ because all components of V are ≤ V′ and at least one is strictly less.

✅ Important conclusion:

If an event e happens before event e′, denoted e → e′, then the vector clock of e is less than that
of e′:
e → e′ ⇒ V(e) < V(e′)

Merge operation:

· When combining vector clocks, take the component-wise maximum.
· This is how a process updates its clock when receiving a message.
· Vi[i]: Number of events process pi has seen.
· Vi[j] (where j ≠ i): Number of events at pj that have causally affected pi.
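A compact sketch of rules VC1–VC4 together with the comparison and merge operations (illustrative; three processes are assumed):

# Vector clocks for N processes: increment your own slot on events, take the
# component-wise maximum (plus a local increment) when receiving a message.
N = 3

def new_clock():
    return [0] * N                                      # VC1: all elements start at 0

def tick(v, i):
    v[i] += 1                                           # VC2: local event at process i
    return v

def merge(v, received, i):
    v[:] = [max(a, b) for a, b in zip(v, received)]     # VC4: component-wise maximum
    return tick(v, i)                                   # the receive itself is an event

def happened_before(v, w):
    return all(a <= b for a, b in zip(v, w)) and v != w # V < V'

v0, v1 = new_clock(), new_clock()
tick(v0, 0)                     # event at p0: [1, 0, 0]
merge(v1, list(v0), 1)          # p1 receives p0's timestamp: [1, 1, 0]
print(happened_before(v0, v1))  # True: the send causally precedes the receive
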
Communication models
1. FIFO (First In First Out) Ordering: If a process sends multiple messages to another
process, those messages are received in the same order in which they were sent. Example:
If Process A sends messages M1 and then M2 to Process B, then B will receive M1 before M2.

Use case: Useful when message order from a specific sender must be preserved, like updates or
logs.

2. Non-FIFO Ordering: There is no guarantee that messages sent by one process to another
will be received in the same order. Example: If Process A sends M1 then M2 to Process B, B might
receive M2 before M1.
Use case: Suitable when message order is not critical, or when application-level logic handles
reordering.

3. Causal Ordering: If one message causally affects another (e.g., a message is sent in response
to a previously received one), then all processes must see these messages in the same causal order.
Example:

1. A sends M1 to B.
2. B receives M1 and sends M2 to C.
3. Then, C must receive M1 before M2 because M2 was caused by M1.

Use case: Essential in collaborative applications (like Google Docs) where operations depend on
prior ones.

Global State of A Distributed System: The global state of a distributed system refers to a
complete snapshot of the system at a particular point in time. It includes the state of each process
and the state of the communication channels between them. A distributed system is made up of:

· Processes (independent units of computation)
· Channels (used for sending messages between processes)

So, the global state includes:

1. Local state of each process: This might include variables, current tasks, program counters, etc.
2. State of each communication channel: This is the set of messages that have been sent but not
yet received—they're "in transit".

In centralized systems, you can look at memory or process tables to know the system state. But in
distributed systems, there’s no central control, no global clock, and no shared memory. Each
process has only partial knowledge of what’s happening. So, capturing the global state helps with:

· Checkpointing and recovery after a failure
· Debugging or tracing problems
· Deadlock detection
· Consistent snapshots (e.g., for state replication or synchronization)

The challenge: The main difficulty in defining or capturing a global state is the lack of a global
clock. If you try to record states at "the same time," what does "same time" even mean when each
process has its own local clock and no process knows exactly what the others are doing at that
moment?

The solution: Algorithms like the Chandy-Lamport Snapshot Algorithm solve this problem. They
allow the system to record a consistent global state even while processes continue to run and
exchange messages. This consistency means the snapshot could have occurred in a real execution of
the system, even if not all states were recorded at the same real-time instant.
Chandy-Lamport Snapshot

The Chandy-Lamport Snapshot Algorithm is a distributed algorithm used to record a consistent
global state of a distributed system without stopping it. It was proposed by K. Mani Chandy and
Leslie Lamport in 1985. The key idea is to capture the local state of each process and the messages
in transit on the communication channels, in such a way that the combined snapshot reflects a
possible global state of the system. The Chandy-Lamport algorithm gives a way to take a consistent
snapshot without stopping the system, even while it is running and communicating.

Basic Assumptions

· The system is made up of multiple processes that communicate through unidirectional FIFO
channels (i.e., messages arrive in the order they were sent).
· There is no shared memory, and processes do not have a global clock.
· Any process can initiate the snapshot at any time.

Suppose one process (say, P0) initiates the snapshot. The algorithm consists of these key steps:

1. Recording the local state:
When a process initiates or receives a marker for the first time, it records its own local state.
This includes variables, pending tasks, etc.
2. Sending markers on outgoing channels:
After recording its own state, the process sends a special marker message on all of its outgoing
channels to inform others that a snapshot is being taken.
3. Recording channel state:
For each incoming channel, the process records all messages received after it recorded its own
state and before it receives a marker on that channel. These are the messages that were "in
transit" during the snapshot and are part of the global state.
4. Receiving subsequent markers:
If a process receives a marker on a channel after it has already recorded its own state, it stops
recording messages on that channel and saves what it has collected as the state of that channel.

The process completes the snapshot once:

· It has recorded its own state, and
· It has received a marker on all of its incoming channels.

Why is this consistent?

This method guarantees that the snapshot reflects a consistent view of the system. It avoids
situations like recording a message received by one process but not sent by the other, because the
algorithm ensures that such messages are captured as "in-transit" in the channel state.
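A highly simplified sketch of the marker handling at a single process (illustrative Python; the messaging layer, FIFO channels, and class name are assumptions, not part of the lecture):

# Chandy-Lamport sketch: marker handling at one process (transport layer not shown).
class SnapshotProcess:
    def __init__(self, incoming_channels):
        self.recorded_state = None
        self.channel_state = {c: [] for c in incoming_channels}  # messages "in transit"
        self.recording = {c: False for c in incoming_channels}

    def start_snapshot(self, local_state, send_marker):
        self.recorded_state = local_state      # step 1: record own local state
        send_marker()                          # step 2: marker on every outgoing channel
        for c in self.recording:
            self.recording[c] = True           # step 3: start recording incoming channels

    def on_marker(self, channel, local_state, send_marker):
        if self.recorded_state is None:        # first marker seen: take the snapshot now
            self.start_snapshot(local_state, send_marker)
        self.recording[channel] = False        # step 4: this channel's state is final

    def on_message(self, channel, msg):
        if self.recording.get(channel):             # received after our snapshot but before
            self.channel_state[channel].append(msg) # the marker: it was in transit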

Lecture #11: Time And Global States


(review slides too for this lec)

Time is crucial in distributed systems. It's used for practical purposes like timestamping transactions
and for understanding how events happen across different systems. However, physical clocks in
computers aren't perfectly accurate and can't always be synchronized exactly. This topic covers the
problems of clock synchronization, introduces algorithms to minimize these errors, and explains
logical clocks like vector clocks, which help in event ordering in distributed systems.

Physical Clocks and Their Challenges

Physical clocks in computers use oscillating crystals to keep time. These oscillations are counted
and converted into a software clock Ci(t) using this formula:

Ci(t) = α · Hi(t) + β

Where:
· Hi(t): Hardware clock of node i
· α: Scaling factor
· β: Offset factor
However, these clocks are prone to:
· Clock skew: The difference between clocks at a single moment
· Clock drift: Gradual divergence due to hardware differences or environmental factors like
temperature

Typical quartz clocks drift about 10^-6 seconds per second (about 1 second in ~11.6 days). High-precision
clocks are more stable, with drift rates around 10^-7 or 10^-8.

Standards for Accurate Timekeeping

To achieve better accuracy, computers can sync with external sources like International Atomic
Time (TAI). TAI is based on atomic oscillators and is extremely precise (drift of about 1 part in 10^13).

Because the Earth's rotation (astronomical time) varies, Coordinated Universal Time (UTC)
combines TAI with leap seconds to stay in sync. UTC is distributed via:

· Radio (accuracy: 0.1–10 ms)
· Satellites (accuracy: ~1 microsecond)

Synchronizing Physical Clocks


Two types of synchronization:

· External synchronization: Sync with a trusted source like UTC within a bound D
· Internal synchronization: Ensure all system clocks are within D of each other

If external synchronization is within D, internal synchronization will be within 2D.

A clock is correct if:

· Its drift stays within a known bound (e.g., 10^-6 s/s)
· It is monotonic (never jumps backward), which is important for applications like file builds
in UNIX.

Clock Failures and Synchronous Systems

Clocks may fail by:

· Crashing (stopping completely)
· Arbitrary errors, like the Y2K bug (incorrect jumps)

1. Cristian’s Method (Cristian’s Algorithm)

Goal: Synchronize a client’s clock with a time server.

How it works:

1. The client sends a request to a time server asking for the current time.
2. The server replies with its current time t.
3. The client notes the round-trip time (RTT) = time it took to send the request and receive the
reply.
4. It estimates the current server time as: t + RTT/2

(Assumes that the delay is symmetric — same time to go and come back).
Example:

· Client sends request at 10:00:00.
· Server replies with time 10:00:10.
· Round-trip time is 2 seconds.
· Estimated time = 10:00:10 + 1 = 10:00:11.
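The estimate in this example can be reproduced in a couple of lines (illustrative; times are expressed as seconds since midnight for simplicity):

# Cristian's algorithm: estimated time = server time + RTT/2 (assumes symmetric delay).
def cristian_estimate(request_sent, reply_received, server_time):
    rtt = reply_received - request_sent
    return server_time + rtt / 2

# Request sent at 10:00:00, reply (server says 10:00:10) received 2 seconds later.
print(cristian_estimate(request_sent=36000, reply_received=36002, server_time=36010))
# 36011.0 seconds since midnight, i.e., 10:00:11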

Pros:

· Simple and effective in many practical cases.
· Good for unicast (one-to-one) synchronization.

Cons:

· Assumes symmetric delay, which may not always be true.
· Can be inaccurate with network congestion or variable delays.

2. Berkeley Algorithm

Goal: Synchronize clocks of a group of computers (not with a time server).

How it works:

1. One node (called the coordinator) polls other nodes for their local time.
2. All nodes reply with their time.
3. Coordinator:
o Discards outliers (very different times).
o Averages the remaining times.
o Calculates how much each clock should adjust (either forward or backward).
4. Sends adjustment values to all nodes.

Important: Unlike Cristian’s method, there is no central time source. It’s a peer-based
approach.
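A toy sketch of the coordinator's averaging step (illustrative; the outlier rule here is just a fixed threshold relative to the coordinator's clock):

# Berkeley algorithm sketch: average the reported clocks (ignoring outliers) and
# send each node the adjustment it should apply (positive = move clock forward).
def berkeley_adjustments(coordinator_time, times, outlier_threshold=10.0):
    offsets = {n: t - coordinator_time for n, t in times.items()}
    kept = [off for off in offsets.values() if abs(off) <= outlier_threshold]
    target = coordinator_time + sum(kept) / len(kept)
    return {n: round(target - t, 2) for n, t in times.items()}

reported = {"coordinator": 100.0, "node1": 104.0, "node2": 96.0, "node3": 300.0}
print(berkeley_adjustments(100.0, reported))
# node3 (300.0) is discarded as an outlier; the remaining clocks converge on 100.0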

Pros:

· All nodes adjust toward a mean time, avoiding bias.
· Works well in closed systems like LANs.

Cons:

· Needs a coordinator.
· Not suitable for environments with high delay variability.

3. NTP (Network Time Protocol)


Goal: Synchronize clocks over the Internet with high accuracy.

How it works:

· Uses a hierarchical system of time servers:
o Stratum 0: Atomic clocks or GPS (very accurate).
o Stratum 1: Directly connected to Stratum 0.
o Stratum 2, 3, ...: Connected to higher strata.
· Clients query multiple servers.
· They compute time using timestamps from both client and server to estimate delay and
offset.
· Applies statistical filtering and error correction.

Formula (simplified):

· Offset = estimated time difference.
· Delay = round-trip delay estimate.
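Spelling this out (the four-timestamp form below is the standard NTP calculation, given here as a sketch since the lecture does not reproduce it): T1 = client send, T2 = server receive, T3 = server send, T4 = client receive.

# NTP-style offset and delay estimate from the four timestamps of one exchange.
def ntp_offset_delay(t1, t2, t3, t4):
    offset = ((t2 - t1) + (t3 - t4)) / 2   # how far the client's clock lags the server's
    delay = (t4 - t1) - (t3 - t2)          # round-trip network delay, excluding server time
    return offset, delay

# Client sends at 100.0, server receives at 102.0, replies at 102.5, client receives at 101.5.
print(ntp_offset_delay(100.0, 102.0, 102.5, 101.5))  # offset = 1.5 s, delay = 1.0 s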

Pros:

· Very accurate (milliseconds to microseconds).
· Works over noisy networks like the Internet.
· Scalable and fault-tolerant.

Cons:

· Complex algorithm.
· Slight delay depending on network conditions.

Feature | Cristian's Method | Berkeley Algorithm | NTP
Type | Client-server | Distributed peer system | Hierarchical time servers
Accuracy | Medium | Medium | High
Use case | One client to server | Small networks (e.g., LAN) | Internet-wide synchronization
Assumptions | Symmetric delay | Peer cooperation | Varying delay, redundancy
External time source | Yes | No | Yes (e.g., atomic clocks)

Lecture #12: Middleware


Middleware (MW) is a software layer in distributed systems that sits between:

· Higher-level layer: Users and applications.
· Lower-level layer: Operating systems and basic communication facilities.

It acts like an operating system for distributed systems, providing a unified interface across multiple
machines.

Purpose of Middleware

It supports heterogeneous computers and networks while presenting a single-system view to users
and applications. It:

· Masks differences in networks, hardware, operating systems, and programming languages.
· Provides a consistent programming abstraction (data structures and operations).
· Enables communication between components of a single distributed application and between
different applications.

Middleware Architecture

· The middleware layer extends across multiple machines.
· It offers each application the same interface, ensuring uniformity.
· Middleware is implemented over Internet protocols, which hide differences in underlying
network models.
· It handles variations in operating systems and hardware.

Key Functions of Middleware

1. Communication: It facilitates interaction between distributed application components and
supports inter-application communication.
2. Abstraction: It provides a uniform computation model (e.g., remote object invocation, event
notification) and hides the complexity of network message passing.
3. Heterogeneity Management: It abstracts differences in hardware, OS, and programming
languages.

Common Object Request Broker Architecture (CORBA)


CORBA is a client-server middleware system where everything is treated as an object. Client
processes on one machine can invoke operations on objects located on server machines. CORBA
hides network communication details making remote invocations appear as local operations.

Other Middleware Examples

· Java Remote Method Invocation (RMI):
o Supports only Java, allowing remote method calls between Java programs.
o Simplifies distributed programming within a single language.
· Remote SQL access, distributed transaction processing.

Uniform Computation Model

Middleware provides a consistent programming model for servers and distributed applications.
Common models include:

· Remote Object Invocation: Allows a program on one computer to call methods on an
object on another computer (e.g., CORBA).
· Remote Event Notification: Notifies distributed components of events.
· Remote SQL Access: Enables distributed database queries.
· Distributed Transaction Processing: Manages transactions across multiple systems.

How It Works

· Middleware hides network communication details (e.g., sending invocation requests and
replies).
· Programmers work with high-level abstractions rather than low-level network protocols.

Lecture #13: Challenges Facing Distributed Systems


Distributed systems face significant challenges due to their complex, interconnected nature. These
challenges arise from the need to coordinate independent computers across diverse environments
while ensuring performance, security, and scalability.

1. Heterogeneity: Heterogeneity refers to the variety and differences in:


· Networks: Different network types (e.g., Ethernet, Wi-Fi) require specific Internet Protocol (IP)
implementations.
· Computer Hardware: Data representations (e.g., byte ordering for integers) vary across
hardware.
· Operating Systems: Different OS provide distinct APIs for network protocols (e.g., UNIX vs.
Windows message exchange calls).
· Programming Languages: Character and data structure representations (e.g., arrays, records)
differ across languages.
Solution: Middleware and standardized protocols (e.g., IPs) mask these differences to enable
seamless communication.

2. Middleware and Mobile Code


Middleware provides a uniform interface for distributed systems, but challenges arise with mobile
code (e.g., Java applets) that moves and runs on different computers.
· Issue: Executable programs are specific to instruction sets and operating systems. For example,
Windows/x86 executables won’t run on macOS.
· Solution: Virtual Machines (VMs), like the Java VM, interpret code to run on various
platforms. JavaScript in web browsers is another common example of mobile code.

3. Openness
Open distributed systems allow extensions through new services or reimplementations, enabling
resource sharing.
· Requirements:
o Publish key software interfaces using an Interface Definition Language (IDL) to specify function
names, parameters, and return types.
o Define service semantics (what services do) informally using natural language, as IDLs only
capture syntax.
· Benefits:
o Systems can integrate components from different vendors or OS.
o Example: Web caching where browsers allow customizable cache policies (size, document storage,
consistency checks).
· Challenge: Ensuring precise semantic specifications to avoid misinterpretations.

4. Security
Security in distributed systems involves three components:
· Confidentiality: Protecting data from unauthorized access.
· Integrity: Preventing unauthorized data alteration.
· Availability: Ensuring access to resources despite interference.
Challenges:
· Data Transmission: Sending sensitive data (e.g., patient records, credit card numbers) over
networks. Solution: Use encryption techniques.
· User Authentication: Verifying the identity of users sending messages. Solution: Biometric
techniques or verification codes (e.g., via cell phones).
· Internal Threats: Firewalls protect against external threats but not misuse within an
intranet.

5. Scalability
A scalable system remains effective as the number of resources and users grows. Distributed
systems must scale in three dimensions:
· Size: Support more users and resources.
· Geographical: Handle users and resources spread far apart.
· Administrative: Manage systems spanning multiple organizations.
Scalability Problems:
· Size: Centralized services (e.g., single servers for sensitive data like medical records)
become bottlenecks as users grow. Replicating servers may compromise security.
· Geographical: LAN-based systems use synchronous communication (microsecond delays),
but WANs face millisecond delays, complicating interactive applications. WANs rely on
unreliable, point-to-point communication, unlike LANs’ reliable broadcast-based systems.
Example: Locating services in WANs requires special location services, unlike simple LAN
broadcasting.
· Administrative: Conflicting policies on resource usage, management, and security across
organizations. Security measures needed to protect both the system and new domains from
malicious attacks (e.g., restricting access, securing downloaded code like applets). Non-
technical issues (e.g., organizational politics) make administrative scalability the hardest.
Scaling Techniques
To address scalability challenges, distributed systems use three main techniques:
1. Hiding Communication Latencies: Improve geographical scalability by reducing the impact of
network delays.
· Use asynchronous communication to allow the requesting application to perform other
tasks while waiting for a reply.
· Handle replies via interrupts or separate threads to complete requests.
· Example: A client application continues processing while awaiting a server response (see the sketch after this list).
2. Distribution: Goal is to split components into smaller parts and spread them across the system.
· Example: Domain Name System (DNS):
o DNS maps domain names (e.g., www.amazon.com) to IP addresses.
o Organized hierarchically into zones, each managed by a single name server.
o Name resolution (e.g., nl.vu.cs.flits) involves querying servers for each zone, reducing
load on any single server.
· Benefit: Hierarchical structures scale better (O(log n) access time) than linear ones.
3. Replication
· Goal: Create multiple copies of data or services to distribute load and improve availability.
· Example: Caching web content closer to users to reduce server load and latency.
· Challenge: Ensuring consistency across replicas (e.g., using cache coherence protocols).
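
As a concrete illustration of technique 1 (hiding communication latencies), the following sketch uses Java's CompletableFuture to issue a request asynchronously so the client keeps working while the reply is in flight; fetchRecord and its argument are placeholders for a real remote call.

import java.util.concurrent.CompletableFuture;

public class AsyncRequestDemo {
    // Hypothetical remote call; stands in for any request that crosses a WAN.
    static String fetchRecord(String key) {
        try { Thread.sleep(200); } catch (InterruptedException e) { }  // simulated network delay
        return "record-for-" + key;
    }

    public static void main(String[] args) {
        // Issue the request asynchronously so the client is not blocked by the latency.
        CompletableFuture<String> reply =
                CompletableFuture.supplyAsync(() -> fetchRecord("patient-42"));

        // The client keeps doing useful local work while the request is in flight.
        System.out.println("Doing other work while waiting...");

        // The reply is handled when it arrives (here via a callback).
        reply.thenAccept(r -> System.out.println("Reply received: " + r)).join();
    }
}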

Lecture #14: Event Ordering and Architectural Models


External Data Representation
In distributed systems, data in running programs is stored as data structures (e.g., interconnected
objects). However, messages transmitted between systems consist of sequences of bytes. To enable
communication, data structures must be:
· Flattened: Converted into a sequence of bytes for transmission.
· Rebuilt: Reconstructed at the receiving end.
Why External Data Representation is Needed
Computers differ in how they store data:
· Integer Ordering:
o Big-endian: Most significant byte comes first.
o Little-endian: Least significant byte comes first.
· Floating-Point Numbers: Representations vary across architectures.
· Character Encoding:
o ASCII: Common in UNIX, uses 1 byte per character.
o Unicode: Uses 2 bytes per character to support multiple languages.
These differences require a standardized format for data exchange.
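
The byte-ordering difference can be made concrete with a small, purely illustrative Java sketch that writes the same integer in both orders using java.nio.ByteBuffer:

import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

public class EndiannessDemo {
    public static void main(String[] args) {
        int value = 1984;

        // Big-endian: most significant byte first (ByteBuffer's default order).
        byte[] big = ByteBuffer.allocate(4).order(ByteOrder.BIG_ENDIAN).putInt(value).array();

        // Little-endian: least significant byte first.
        byte[] little = ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN).putInt(value).array();

        System.out.println("Big-endian:    " + Arrays.toString(big));     // [0, 0, 7, -64]
        System.out.println("Little-endian: " + Arrays.toString(little));  // [-64, 7, 0, 0]
    }
}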

Methods for Exchanging Binary Data


Two approaches to handle data exchange:
1. Convert to Agreed External Format: Data is converted to a standard format before
sending and back to the local format on receipt. Conversion can be skipped if sender and
receiver use the same architecture.
2. Send in Sender’s Format: Data is sent in the sender’s format with metadata indicating the
format. The receiver converts the data if needed.

External Data Representation and Marshalling


To support Remote Method Invocation (RMI) and Remote Procedure Call (RPC), data types passed
as arguments or results must be flattened.
· External Data Representation (EDR): A standard for representing data structures and
primitive values.
· Marshalling: Converting structured data and primitive values into an external data
representation.
· Unmarshalling: Rebuilding data structures and primitive values from their external
representation.

Approaches to External Data Representation and Marshalling


1. CORBA’s Common Data Representation (CDR):

CORBA’s Common Data Representation (CDR), defined in CORBA 2.0, is a standard format used to
support the data types involved in CORBA remote method invocations, including both arguments
and return values. It supports a wide range of types, such as primitive types like short (16-bit), long
(32-bit), unsigned short, unsigned long, float (32-bit), double (64-bit), char, boolean, octet (8-bit),
and a special type called "any" that can represent any data type. Additionally, it accommodates
composite types such as arrays, structs, unions, and sequences.

CDR comes with several notable features. It supports both big-endian and little-endian byte
orderings, with the sender specifying the ordering in the message. Primitive values are aligned
based on their size (for example, a 4-byte long is placed at byte indices divisible by 4) for efficiency.
Floating-point values adhere to IEEE standards, ensuring consistency across platforms. Characters
are encoded using a mutually agreed code set, such as ASCII or Unicode. Furthermore, composite
types are serialized in a defined sequence; for instance, strings are encoded by first specifying their
length, followed by the character data.

Example: A Person struct {name: "Smith", place: "London", year: 1984} is marshalled as:
Index   Value
0–3     5 (length of "Smith")
4–11    "Smith___" (padded)
12–15   6 (length of "London")
16–23   "London__" (padded)
24–27   1984 (unsigned long)
CDR does not include type information in messages; sender and receiver must know the data order
and types.
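The following Java sketch mimics this CDR-style layout for the Person example: each string is written as a 4-byte length followed by its characters padded to a 4-byte boundary, and the year as a 4-byte integer. It is a simplified illustration (big-endian only, no message header), not a real CDR implementation.

import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class CdrSketch {

    static void putString(ByteArrayOutputStream out, String s) {
        byte[] chars = s.getBytes(StandardCharsets.US_ASCII);
        // Length first, as a 4-byte unsigned long (big-endian here).
        out.writeBytes(ByteBuffer.allocate(4).putInt(chars.length).array());
        out.writeBytes(chars);
        // Pad the character data to the next 4-byte boundary.
        int pad = (4 - chars.length % 4) % 4;
        out.writeBytes(new byte[pad]);
    }

    public static void main(String[] args) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        putString(out, "Smith");                                       // name
        putString(out, "London");                                      // place
        out.writeBytes(ByteBuffer.allocate(4).putInt(1984).array());   // year
        // Prints 28, matching the 0–27 byte layout in the table above.
        System.out.println("Marshalled length: " + out.size() + " bytes");
    }
}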
2. Java Object Serialization
Java serialization is a process that flattens and represents Java objects or entire object trees for the
purpose of transmission or storage. It is specifically limited to use within Java applications. One of
its key features is that it includes type information in the serialized form, which allows for accurate
reconstruction of the object during deserialization. Additionally, it supports complex object graphs,
including the handling of object references. Java serialization is commonly used in scenarios such
as sending Java objects over a network or saving them to disk for later retrieval.
3. XML (Extensible Markup Language)
XML is a textual format for representing structured data, originally designed for web documents but now also used in web services. It represents data textually, resulting in larger sizes compared to binary formats, and it includes type information, often referencing external namespaces for type definitions. It is commonly used for exchanging data between clients and servers in web services.
Issues in Marshalling Design
1. Who Performs Marshalling/Unmarshalling?
In CORBA and Java, middleware automatically handles marshalling/unmarshalling, relieving
programmers of the task. For XML, software libraries are available, though manual
encoding is possible but error-prone due to complex data representations.
2. Compactness:
Binary formats (CORBA, Java) are compact, marshalling data into efficient byte sequences.
Textual formats (XML) are less compact, as text representations (e.g., "4560" vs. binary
4560) require more bytes.
3. Type Information:
o CORBA CDR: Excludes type information, assuming sender and receiver share
knowledge of data structure.
o Java Serialization: Embeds all type information in the serialized form.
o XML: Includes type information, often via external namespace references.
4. General Use: Beyond RMI/RPC, external data representation is used for storing data in files
or transmitting structured documents.
CORBA’s Common Data Representation (CDR) in Detail
· Primitive Types:
o Values are aligned based on size (e.g., 4-byte values start at indices divisible by 4).
o Supports both endian orderings, with the sender’s ordering specified in the message.
· Composite Types:
o Serialized in a defined order (e.g., struct fields are marshalled sequentially).
o Strings include length (unsigned long) followed by characters, padded with zeros for
consistency.
· No Pointers: CDR represents data structures without pointers, ensuring portability.
· Comparison: Similar to Sun XDR (used in Sun NFS), which also omits type information in
messages.
Marshalling in CORBA
· Automation: Marshalling/unmarshalling operations are generated from CORBA Interface
Definition Language (IDL) specifications.
· Example IDL for the Person struct:
struct Person {
    string name;
    string place;
    unsigned long year;
};
The CORBA interface compiler generates marshalling code for remote method arguments and
results.

Lecture #15: Architectural Models


Java Object Serialization
In Java Remote Method Invocation (RMI), both objects and primitive data values can be passed as
arguments or results of method invocations. Java object serialization is the mechanism to convert an
object’s state into a byte stream for transmission or storage, and deserialization reconstructs the
object from that stream.
· Serialization: Converts an object into a byte stream.
· Deserialization: Rebuilds the object from the byte stream.
· Requirement: Objects must implement the java.io.Serializable interface.
Example: Person Class
public class Person implements Serializable {
    private String name;
    private String place;
    private int year;

    public Person(String aName, String aPlace, int aYear) {
        name = aName;
        place = aPlace;
        year = aYear;
    }
    // Methods for accessing instance variables
}

The Serializable interface (from java.io package) enables serialization without requiring additional
methods.
Serialization Process
· Class Information: Includes the class name and a version number (set manually or computed as
a hash of class name, instance variables, methods, and interfaces).
· Object References: If an object references other objects, all referenced objects are serialized to
ensure references can be fulfilled upon deserialization.
· Handles: References are serialized as unique handles (sequential positive integers). Each object
is written once, and subsequent occurrences use the handle.
· Recursive Serialization: It writes class information, followed by types and names of instance
variables. If instance variables belong to new classes, their class information is included,
continuing recursively.
· Primitive Types: Written in a portable binary format using ObjectOutputStream.
· Strings and Characters: Written using writeUTF method in UTF-8 format:
o ASCII characters use 1 byte.
o Unicode characters use multiple bytes.
o Strings are preceded by their byte length.
Example: Serializing a Person Object
For Person p = new Person("Smith", "London", 1984):
· Simplified Serialized Form:
o Class information: Person class name, version number.
o Fields:
 name: Length (5), "Smith" (padded).
 place: Length (6), "London" (padded).
 year: 1984 (int).
· Note: Actual form includes type markers and handles (e.g., h0, h1), omitted in simplified
examples.
Serialization and Deserialization Code
· Serialize:
ObjectOutputStream out = new ObjectOutputStream(...);
out.writeObject(p); // Serializes Person object
· Deserialize:
ObjectInputStream in = new ObjectInputStream(...);
Person p = (Person) in.readObject(); // Reconstructs Person object
· Customization: Programmers can define custom read and write methods. Variables marked
transient (e.g., file handles, sockets) are not serialized.
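A complete round trip can be sketched as follows, using an in-memory byte stream instead of a network connection; it assumes the Person class above implements Serializable as shown.

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

public class SerializationDemo {
    public static void main(String[] args) throws Exception {
        Person p = new Person("Smith", "London", 1984);

        // Serialize: flatten the object (and anything it references) into bytes.
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(p);
        }

        // Deserialize: rebuild an equivalent object from the byte stream.
        try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            Person copy = (Person) in.readObject();
            System.out.println("Reconstructed an object of class " + copy.getClass().getName());
        }
    }
}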

Lecture #16: Network of Workstations and Clusters


REMOTE INVOCATION
Remote invocation refers to the process by which distributed systems enable communication
between processes, objects, or services that reside on different machines. This is managed through
the upper middleware layer, which abstracts away low-level network interactions and provides a
higher-level, more programmer-friendly interface for inter-process communication.

Remote Invocation Paradigms


There are several remote invocation paradigms used in distributed systems. These include request-
reply protocols, Remote Procedure Call (RPC), and Remote Method Invocation (RMI).
o The request-reply protocol is a basic communication model in which a client sends a request
message to a server, which then processes the request and returns a reply message. This protocol
forms the foundation of both RPC and RMI by offering basic, low-level support for remote
operations.
o Remote Procedure Call (RPC) extends the conventional procedure call mechanism to support
distributed computing. In a local procedure call, a function is invoked with arguments and
returns a result. Similarly, in RPC, a client process can invoke procedures on a server process
located on a different machine as if it were a local call. The RPC system handles the marshalling
(serialization) of arguments into a message, sends it over the network to the server, and then
unmarshals the reply when it is received.

o Remote Method Invocation (RMI) evolved in the 1990s as an extension of the object-
oriented programming model. RMI enables an object in one Java Virtual Machine (JVM) to
invoke methods on an object located in another JVM. It builds upon the concept of local
method invocation (LMI) but allows method calls to operate across network boundaries,
thereby facilitating distributed object-oriented computing.
Request-Reply Protocols
The request-reply protocol is designed to facilitate typical client-server interactions. In its most
common form, communication is synchronous, meaning the client process sends a request and then
blocks until a response is received from the server. This mechanism also provides reliability
because the reply from the server serves as an implicit acknowledgment of the client’s request.
However, there are cases where asynchronous communication is preferred. In asynchronous
request-reply communication, the client sends a request and continues processing without waiting
for an immediate response. This is useful when replies can be retrieved later, allowing for more
efficient resource usage in certain applications.
In terms of implementation, request-reply interactions can be constructed using Java’s API for UDP
datagrams. However, many modern systems prefer TCP streams for reliability and ease of use.
Protocols built over UDP datagrams help avoid the overhead associated with TCP, such as
connection establishment and flow control, which are unnecessary in many remote invocations
involving small data exchanges.
Core Operations in Request-Reply Protocol
The request-reply model is typically implemented using three communication primitives:
doOperation, getRequest, and sendReply.
o The doOperation function is used by the client to initiate a remote operation. It takes as input
the server reference, an identifier for the operation, and the necessary arguments, all of which
are marshalled into a byte array. After sending the request, doOperation waits for the server’s
response and then unmarshals the result from the reply byte array.
o On the server side, the getRequest function is responsible for receiving client requests.
o Once the server completes the requested operation, it uses the sendReply function to send the
reply back to the client. The client’s doOperation then resumes execution upon receiving the
reply.
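
In outline, the three primitives have signatures along the following lines (a sketch following the common textbook formulation; the RemoteRef placeholder simply bundles the server's address and port):

import java.net.InetAddress;

// Placeholder for a remote server reference holding its address and port.
class RemoteRef {
    InetAddress host;
    int port;
}

interface RequestReply {
    // Client side: marshal the arguments into a request, send it to server s,
    // block until the reply arrives, and return the reply's byte array.
    byte[] doOperation(RemoteRef s, int operationId, byte[] arguments);

    // Server side: block until a client request arrives and return it.
    byte[] getRequest();

    // Server side: send the reply message back to the requesting client.
    void sendReply(byte[] reply, InetAddress clientHost, int clientPort);
}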

Message Matching and Identification

o To ensure proper pairing of requests and replies, each message is assigned a unique identifier.
This identifier consists of a requestId and a sender identifier (such as an IP address and port
number). The requestId is a sequential integer generated by the client, while the sender
identifier ensures uniqueness across the distributed system. These identifiers help the client
verify that the received reply corresponds to the correct request and is not an outdated or
duplicate message. The server copies the requestId from the request message into the reply,
allowing the client to perform this verification.
o When using datagram-based protocols like UDP, the reply from the server acts as an implicit
acknowledgment of the client’s request. If delivery guarantees are required, the protocol must
ensure that the reply reliably reaches the client, potentially using retransmission strategies if
necessary.

Structure of Request and Reply Messages


Each request message contains several key fields: a message type (indicating whether it is a request
or a reply), a unique message identifier, and an operation identifier that specifies which operation
the server should perform. This operation identifier could be a numeric ID (e.g., 1, 2, 3 for different
operations) or a representation of the method itself when reflection is supported by the
programming language.
The RemoteRef class is often used in such systems to encapsulate a reference to the remote server.
This class provides methods to retrieve the server's IP address and port number, which are needed
for communication. The client uses this reference in doOperation to send the request to the
appropriate server.
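Put together, a request or reply message can be pictured as a simple record like the sketch below; the field names are illustrative rather than a fixed wire format.

// Sketch of the fields a request/reply message carries, per the description above.
public class Message {
    enum Type { REQUEST, REPLY }

    Type messageType;     // request or reply
    int requestId;        // sequential identifier chosen by the client
    String senderId;      // e.g., client IP address and port, for global uniqueness
    int operationId;      // which operation the server should perform
    byte[] arguments;     // marshalled arguments (or the marshalled result in a reply)
}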

Client-Server Architecture: Client-server architecture is a centralized computing model in which multiple clients request and receive services from a single, central server. It forms the basis of most traditional networked applications and is characterized by clear roles and responsibilities between clients and the server.
Structure and Function
In this model, the server is always active and is designed to serve multiple client requests
concurrently. The server is responsible for storing data, handling business logic, and processing
client requests. Clients act as consumers of services and do not communicate directly with one
another. Instead, all communication is routed through the server.
Examples
Typical implementations of client-server architecture include web servers, email servers, and Domain Name System (DNS) servers (which resolve domain names to IP addresses).

Peer-to-Peer (P2P) Architecture: Peer-to-peer architecture is a decentralized model in which each participant, or peer, acts as both a client and a server. This model eliminates the need for a central server and promotes direct communication and resource sharing among peers.
Structure and Function
In P2P systems, peers are autonomous and can initiate or respond to communication with other
peers. They share resources such as files, bandwidth, or processing power, and the system
dynamically balances load and responsibilities among all nodes.
Examples
Well-known examples of peer-to-peer architecture include BitTorrent, where users share file
segments directly with each other, and Skype (in its earlier implementations), which enabled users
to make voice and video calls without relying on a central server.
Lecture 17: Challenges in Software Architecture
Request-reply protocols implemented over UDP datagrams inherit UDP’s unreliability. They are
vulnerable to:
· Omission failures: Messages can be lost.
· Out-of-order delivery: Messages may not arrive in the order they were sent.
· Crash failures of processes: Processes can halt and remain non-functional, but Byzantine
(arbitrary or malicious) behaviour is not assumed.

To handle these limitations, especially in the doOperation primitive, timeouts are used while
waiting for a reply. If a timeout occurs, the action taken depends on the guarantees provided by the
protocol.
Timeout Handling
When a client sends a request, it waits for a reply for a limited period (the timeout). If the timeout expires without a reply, the system can respond in several ways:
· Immediate Failure: The client is told that the operation failed. This is risky, because the server may have performed the operation while only the reply was lost.
· Resend the Request: The client retransmits the same request, retrying until it either receives a reply or gives up and reports an error.
Duplicate Request Management
Because of retransmissions, the server might receive the same request more than once. This can happen if the server is slow and the client assumes the request was lost, or if the client resends the request before the server finishes processing it.
To avoid doing the same task multiple times:
· Each request has a unique ID.
· If the server hasn’t replied yet, it just finishes the current work and replies once.
· If the server already replied, and gets the same request again:
o If the task is safe to repeat (idempotent), it can do it again.
o If not, the server sends the old reply from memory instead of redoing it.
Idempotent Operations
An idempotent operation produces the same result no matter how many times it is performed. These
operations simplify duplicate handling because re-execution has no adverse effect.
Examples:
· Adding an item to a set is idempotent.
· Appending to a list is not idempotent.
Servers with only idempotent operations don’t need elaborate mechanisms to avoid re-execution.
Using History to Handle Lost Replies
To avoid re-executing non-idempotent operations, servers may maintain a history of past requests.
Each entry in this history typically includes the request ID, the corresponding reply message, and
the client ID. When a duplicate request is received, the server can refer to this history and resend the
previously stored reply instead of processing the request again.
However, maintaining such a history comes with memory overhead. To manage this, servers often
treat a new request from a client as an implicit acknowledgment of the previous reply. This allows
the server to safely discard the last stored reply for that client. Despite this strategy, the history can
still grow, especially when many clients are connected or if a client terminates without sending
another request (thus never acknowledging the previous reply). As a result, servers commonly
discard old messages after a predetermined period to prevent unbounded memory usage.
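A minimal sketch of this history mechanism might look as follows; the class, key format, and executeOperation placeholder are illustrative only.

import java.util.HashMap;
import java.util.Map;

public class ServerHistory {
    // (clientId, requestId) -> stored reply
    private final Map<String, byte[]> history = new HashMap<>();

    public byte[] handle(String clientId, int requestId, byte[] request) {
        String key = clientId + ":" + requestId;

        // Duplicate request: resend the previously stored reply instead of
        // re-executing a possibly non-idempotent operation.
        byte[] previous = history.get(key);
        if (previous != null) {
            return previous;
        }

        byte[] reply = executeOperation(request);   // perform the operation once
        history.put(key, reply);                    // keep the reply for retransmission
        return reply;
    }

    // A new request from the same client implicitly acknowledges the previous
    // reply, so the older entry can be discarded to bound memory usage.
    public void acknowledge(String clientId, int previousRequestId) {
        history.remove(clientId + ":" + previousRequestId);
    }

    private byte[] executeOperation(byte[] request) {
        return request;  // placeholder for the real server operation
    }
}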
Exchange Protocol Styles
Three exchange styles define how messages are passed between client and server in the presence of
failures:
· Request (R) Protocol:

In the Request (R) protocol, the client sends a single request without expecting any confirmation or
reply. This approach is suitable in scenarios where no result is needed and confirmation is not
critical. It is a very lightweight communication method but comes with the drawback of being
vulnerable to failures due to the lack of response or acknowledgment mechanisms.
· Request-Reply (RR) Protocol:

The Request-Reply (RR) protocol is commonly used in client-server interactions. In this model, the
server's reply serves as an implicit acknowledgment of the client’s request. Furthermore, subsequent
requests from the client can act as acknowledgments for previous replies. To ensure reliability, this
protocol may involve mechanisms such as retransmissions, duplicate detection, and maintaining a
history of interactions.

· Request-Reply-Acknowledge (RRA) Protocol:


The Request-Reply-Acknowledge (RRA) protocol is more comprehensive, involving three types of
messages: a request, a reply, and an acknowledgment of the reply. The acknowledgment message
includes the request ID, which helps servers clean up historical data related to the request.
Additionally, acknowledging one reply implies acknowledgment of all earlier replies, making the
protocol more efficient in managing past communications. Although this approach adds robustness
and reliability, it also increases message overhead and bandwidth usage. However, it doesn’t block
the client, as the acknowledgment can be sent after processing the reply.

Lecture #19: Remote Invocation & Remote Procedure Calls (RPC)


Remote Procedure Call (RPC): RPC allows procedures on remote machines to be called as if
they were local. This makes distributed systems easier to program by promoting distribution
transparency—programming appears conventional even across different systems. The idea was
introduced by Birrell and Nelson (1984) and marked a significant advancement in distributed
computing.
What RPC Hides
An RPC system abstracts the complexity of distribution by handling:
· Marshalling/unmarshalling of parameters and results
· Message transmission across the network
· Preservation of call semantics to ensure consistent execution
Key Design Concerns in RPC
RPC promotes interface-based programming and raises several design considerations:
· The semantics of the call (e.g., maybe, at-least-once)
· The degree of transparency
· Properly defined service interfaces between clients and servers
Interface-Based Programming
Service interfaces define the available procedures and the types of arguments/results. For example,
a file server might expose procedures for reading or writing files.
Interfaces encapsulate implementation details, so programmers need only interact with the abstract
definition—not the underlying language or platform. This supports easier updates and evolution of
services, as long as interface compatibility is preserved.
Interface Restrictions
· Direct access to server variables by the client is not allowed.
· Interfaces must only describe operations, not internal state.
· Attributes in CORBA IDL appear to violate this but actually use getter/setter methods.
· Only call-by-value is supported. Input parameters are sent with the request; output parameters
are returned in the reply.
· Memory addresses cannot be passed, as they are invalid outside their process space.
Interface Definition Languages (IDLs): IDLs provide a way to define interfaces in a language-
neutral form. This supports interoperability between services written in different programming
languages. Example: CORBA IDL defining a PersonList interface with methods like addPerson()
and getPerson(), along with structured data types.
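As a sketch of that example, the interface might be written in CORBA IDL roughly as follows (member names follow the classic textbook version and are illustrative):

// Illustrative CORBA IDL for the PersonList example mentioned above.
struct Person {
    string name;
    string place;
    unsigned long year;
};

interface PersonList {
    readonly attribute string listname;
    void addPerson(in Person p);
    void getPerson(in string name, out Person p);
    long number();
};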
Call Semantics in RPC

Call semantics in Remote Procedure Call (RPC) systems define how the system handles procedure
execution in the presence of communication failures. These semantics determine how reliably a
remote call is perceived by the client, especially under failure conditions such as lost messages or
crashes.

Maybe Semantics: Under maybe semantics, the remote procedure may execute once or not at all.
There is no fault tolerance mechanism, making this approach susceptible to omission or crash
failures. It is generally suitable for applications where occasional failed calls can be tolerated and do
not critically affect the system's correctness.

At-Least-Once Semantics: With at-least-once semantics, the remote procedure is guaranteed to
execute at least once. However, due to retransmissions in case of failures, the procedure might be
executed multiple times. This can cause incorrect results if the operation is not idempotent. For
instance, if a client retries a request to increase a bank balance, the amount might be added more
than once.
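
The difference can be sketched in a few lines of Java: deposit is not idempotent, so a re-executed duplicate changes the balance again, whereas setBalance can be repeated safely. The class and method names are illustrative.

public class Account {
    private long balance;

    // Non-idempotent: executing a retransmitted duplicate adds the amount twice.
    public void deposit(long amount) {
        balance += amount;
    }

    // Idempotent: repeating the same request leaves the balance unchanged
    // after the first execution.
    public void setBalance(long newBalance) {
        balance = newBalance;
    }
}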

At-Most-Once Semantics: At-most-once semantics ensures that the remote procedure is executed
no more than once. This is achieved through comprehensive fault-tolerance techniques such as
retries combined with duplicate request filtering. It provides both reliability and protection against
duplicating side effects, making it a preferred approach in systems requiring consistency. An
example of a system implementing this semantic is Sun RPC.

Transparency in Remote Procedure Call (RPC)


Transparency in RPC refers to the ability of a system to hide the complexities of remote interaction
from the user, making remote procedure calls appear and behave like local ones. This abstraction
simplifies distributed application development but introduces trade-offs in reliability and
performance.

Location and Access Transparency


RPC aims to provide at least location and access transparency. Location transparency hides the
physical location of the remote procedure, while access transparency allows both local and remote
procedures to be invoked using the same syntax and mechanisms.

Middleware and Transparency


Middleware can enhance transparency by handling various low-level concerns such as
communication, marshalling, and error recovery. However, this abstraction can obscure the fact that
remote calls are inherently more fragile and error-prone than local ones due to their dependency on
networks, remote hosts, and external processes.

Handling Failures
Remote procedure calls are more susceptible to failure than local calls. Therefore, client
applications must be capable of recovering from communication issues, crashes, or timeouts. This
necessitates exception handling and timeout mechanisms to maintain robustness.

Latency Considerations
RPC latency is significantly higher than that of local function calls. Because of this, programs
should minimize the number of remote interactions. The designers of Argus proposed that a remote
call should be abortable without affecting the server, implying that the server must be capable of
rolling back the effects of an incomplete call.

Parameter Passing Constraints


RPC does not support call-by-reference due to the separation of address spaces between the client
and the server. This constraint necessitates a different parameter passing style, typically call-by-
value or call-by-copy-restore, which may affect performance and semantics.

Interface vs. Syntax Transparency


Some argue that the distinction between local and remote operations should be visible at the service
interface level. Waldo et al. [1994] suggested that this distinction helps developers reason about
reliability and failure modes. Languages like Argus adopted this principle by explicitly marking
remote operations in the syntax.

Role of Interface Definition Languages (IDLs)


IDLs offer flexibility to service designers regarding transparency. They can indicate that remote
invocations might throw exceptions on communication failure, forcing client-side handling of such
events. Additionally, IDLs may allow specification of call semantics (e.g., at-least-once or at-most-
once), guiding developers in designing idempotent operations where needed.

Current Consensus on Transparency


The prevailing view is that RPC should be syntactically transparent—allowing remote calls to look
like local ones—but semantically distinguishable. This means that while the calling syntax remains
unified, interfaces should clearly indicate whether a call is remote to inform the programmer of
potential latency and failure risks.
