Distributed Computing

Distributed computing refers to a system where processing and data storage are distributed across multiple devices or systems, rather than being handled by a single central device. In a distributed system, each device or system has its own processing capabilities and may also store and manage its own data. These devices or systems work together to perform tasks and share resources, with no single device serving as the central hub.
One example of a distributed computing system is a cloud computing system, where resources such as computing power, storage, and networking are delivered over the Internet and accessed on demand. In this type of system, users can access and use shared resources through a web browser or other client software.
Components
There are several key components of a Distributed Computing System:
• Devices or Systems: The devices or systems in a distributed
system have their own processing capabilities and may also
store and manage their own data.
• Network: The network connects the devices or systems in the
distributed system, allowing them to communicate and
exchange data.
• Resource Management: Distributed systems often have some
type of resource management system in place to allocate and
manage shared resources such as computing power, storage, and
networking.
The architecture of a Distributed Computing System is typically a
Peer-to-Peer Architecture, where devices or systems can act as both
clients and servers and communicate directly with each other.
Characteristics
There are several characteristics that define a Distributed Computing System:
• Multiple Devices or Systems: Processing and data storage is
distributed across multiple devices or systems.
• Peer-to-Peer Architecture: Devices or systems in a distributed
system can act as both clients and servers, as they can both
request and provide services to other devices or systems in the
network.
• Shared Resources: Resources such as computing power,
storage, and networking are shared among the devices or
systems in the network.
• Horizontal Scaling: Scaling a distributed computing system typically involves adding more devices or systems to the network to increase processing and storage capacity. This can be done through hardware upgrades or by adding devices or systems to the network.

Advantages and Disadvantages


The advantages of a distributed computing system are:
• Scalability: Distributed systems are generally more scalable
than centralized systems, as they can easily add new devices or
systems to the network to increase processing and storage
capacity.
• Reliability: Distributed systems are often more reliable than
centralized systems, as they can continue to operate even if one
device or system fails.
• Flexibility: Distributed systems are generally more flexible than
centralized systems, as they can be configured and reconfigured
more easily to meet changing computing needs.
There are a few limitations of a distributed computing system:
• Complexity: Distributed systems can be more complex than
centralized systems, as they involve multiple devices or systems
that need to be coordinated and managed.
• Security: It can be more challenging to secure a distributed
system, as security measures must be implemented on each
device or system to ensure the security of the entire system.
• Performance: Distributed systems may not offer the same level of performance as centralized systems, as processing and data storage are distributed across multiple devices or systems.

Applications
Distributed Computing Systems have a number of applications,
including:
• Cloud Computing: Cloud Computing systems are a type of
distributed computing system that are used to deliver resources
such as computing power, storage, and networking over the
Internet.
• Peer-to-Peer Networks: Peer-to-Peer Networks are a type of
distributed computing system that is used to share resources
such as files and computing power among users.
• Distributed Architectures: Many modern computing systems,
such as microservices architectures, use distributed architectures
to distribute processing and data storage across multiple devices
or systems.
Parallel computing:
• Refers to using multiple processors or computing units simultaneously to solve large problems by breaking them into smaller, independent parts.
• Involves multiple CPUs communicating via shared memory and combining results upon completion.
• Increases computation power and speeds up application processing.

Types of parallel computing:
1. Bit-level parallelism: Reduces the number of instructions the processor must execute by increasing the processor word size, so operations on large operands no longer have to be split into a series of instructions.
2. Instruction-level parallelism: Executes multiple instructions simultaneously within a single CPU clock cycle.
3. Task parallelism: Decomposes tasks into subtasks and executes them concurrently on multiple processors.
Advantages of parallel computing:
• Utilizes more resources, reducing time and costs.
• Solves larger problems in a shorter time compared to serial
computing.
• Suitable for simulating and modeling complex phenomena.
• Efficiently utilizes local resources.
• Enables the resolution of computationally intensive problems
that are impractical on a single computer.
• Provides concurrency and the ability to perform multiple tasks
simultaneously.
• Optimizes hardware utilization.
Disadvantages of parallel computing:
• Parallel architectures are difficult to design and build.
• Requires efficient cooling technologies for clustered systems.
• Requires specialized algorithms for parallel execution.
• High power consumption in multi-core architectures.
• Requires low coupling and high cohesion, which can be
challenging to achieve.
• Parallel programming requires skilled and knowledgeable
programmers.
• Some control algorithms may not yield desired outcomes in
parallel systems.
• Extra costs due to synchronization, thread creation, and data
transfers may outweigh the benefits.
• Different code tweaking may be necessary for different target
architectures to improve performance.
Applications of parallel computing:
• Databases and data mining.
• Real-time simulation of systems.
• Networked videos and multimedia.
• Science and engineering.
• Collaborative work environments.
• Augmented reality, advanced graphics, and virtual reality.
Motivation for parallelism
• Traditional parallel software development has been perceived as
time and effort intensive due to the complexity of specifying
and coordinating concurrent tasks.
• The rapid development of microprocessors raises questions
about the need for parallelism, as hardware platforms can
become obsolete during the lengthy development process.
• However, hardware design trends indicate that uniprocessor
architectures may struggle to sustain future performance
increases, creating a need for parallelism.
• Standardized hardware interfaces have reduced the time
required to develop parallel machines based on microprocessors.
• Progress in standardizing programming environments has
extended the life-cycle of parallel applications, making them
more practical and worthwhile investments.
• These factors provide strong arguments in favor of utilizing
parallel computing platforms.
Moore’s law
• Moore's Law is an observation and prediction made by Gordon
Moore, co-founder of Intel, in 1965.
• It states that the number of transistors on a microchip doubles
approximately every two years.
• This doubling of transistors leads to increased computational
power and performance while decreasing the cost of computing.
• Moore's Law has been a driving force behind the advancement
of technology, enabling the miniaturization of electronic devices
and the development of more complex software applications.
• It has fueled advancements in fields such as artificial
intelligence, data analysis, and scientific research.
• In recent years, challenges in scaling down transistors and
increasing clock speeds have led to alternative approaches to
continue advancing computing capabilities, and the
sustainability of Moore's Law is a topic of debate.
• Despite potential limitations, Moore's Law symbolizes the
relentless pursuit of technological progress and the exponential
growth of computing power over time.
Grand Challenge problems
Grand Challenge problems are fundamental problems in science and
engineering that have significant economic and scientific impact.
They require the application of high-performance computing to find
solutions. The official Grand Challenge applications recognized by
the Federal High Performance Computing and Communications
(HPCC) program include:
1. Aerospace: Addressing challenges related to aircraft design, aerodynamics, propulsion systems, and space exploration.
2. Computer Science: Tackling problems in areas such as artificial intelligence, machine learning, data analytics, and software optimization.
3. Energy: Solving complex problems related to energy production, storage, distribution, and efficiency, including renewable energy sources and optimization of energy systems.
4. Environmental Monitoring and Prediction: Using computational models and simulations to study and predict environmental phenomena, climate change, natural disasters, and their impact on ecosystems and human societies.
5. Molecular Biology and Biomedical Imaging: Applying high-performance computing to advance research in genomics, proteomics, drug discovery, and medical imaging for improved understanding of biological processes and disease treatments.
6. Product Design and Process Optimization: Utilizing computational methods to optimize product design, manufacturing processes, and supply chain management for increased efficiency, reduced costs, and improved quality.
7. Space Science: Exploring and understanding the universe through simulations and data analysis, studying celestial bodies, cosmology, gravitational waves, and astrophysics.
TYPES OF PARALLELISM
Data Parallelism:
Concurrent execution: Data parallelism involves executing the same
task on multiple computing cores concurrently.
Division of work: The workload is divided among the cores, with
each core operating on a different portion of the data.
Example: In the case of summing an array, each core may handle a
subset of the array, allowing for parallel computation of the sum.
Independent operations: Data parallelism is effective when the
operations on different data subsets can be performed independently.
Increased efficiency: By distributing the workload across multiple
cores, data parallelism can lead to improved performance and faster
execution of tasks.
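As an illustration of the array-summing example above, here is a minimal data-parallelism sketch in Python (illustrative, not code from the original notes): each worker sums a different slice of the array, and the partial results are combined at the end.

```python
# Minimal data-parallelism sketch: each worker sums its own slice.
from multiprocessing import Pool

def partial_sum(chunk):
    # Each core operates on a different portion of the data.
    return sum(chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))
    n_workers = 4
    step = len(data) // n_workers
    chunks = [data[i:i + step] for i in range(0, len(data), step)]
    with Pool(n_workers) as pool:
        total = sum(pool.map(partial_sum, chunks))  # combine partial sums
    print(total)
```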
Task Parallelism:
Concurrent execution: Task parallelism involves executing different
tasks simultaneously on multiple computing cores.
Independent tasks: Each core performs a unique operation or task,
which can be executed independently.
Example: In the case of an array, one core may perform statistical
analysis while another core performs sorting, both operating
concurrently.
Flexibility: Task parallelism allows for efficient utilization of
multiple cores by executing diverse operations simultaneously.
Enhanced scalability: With task parallelism, the system can easily
adapt to varying workloads by assigning tasks to available cores
dynamically.
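Mirroring the statistics-versus-sorting example above, a minimal task-parallelism sketch in Python (illustrative; the function names are hypothetical) runs two different operations on the same data concurrently:

```python
# Minimal task-parallelism sketch: two distinct tasks run on separate workers.
from concurrent.futures import ProcessPoolExecutor
import statistics

def analyze(data):
    # One worker computes statistics...
    return statistics.mean(data), statistics.stdev(data)

def sort_data(data):
    # ...while another worker sorts, concurrently.
    return sorted(data)

if __name__ == "__main__":
    data = [5, 3, 8, 1, 9, 2, 7]
    with ProcessPoolExecutor(max_workers=2) as ex:
        stats_future = ex.submit(analyze, data)
        sorted_future = ex.submit(sort_data, data)
        print(stats_future.result(), sorted_future.result())
```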
Bit-level Parallelism:
Utilization of wider word size: Bit-level parallelism relies on
increasing the word size of a processor to process more bits
simultaneously.
Reduced instruction count: By accommodating larger data sizes
within a single instruction, bit-level parallelism reduces the number
of instructions needed for operations.
Enhanced efficiency: With a wider word size, operations on larger
data sets can be executed more efficiently, reducing processing time.
Example: An 8-bit processor requiring multiple instructions to add 16-bit integers can be replaced by a 16-bit processor that can perform the operation in a single instruction.
Performance improvement: Bit-level parallelism can significantly enhance computational speed for operations involving larger data sizes.
Instruction-level Parallelism:
Simultaneous execution: Instruction-level parallelism involves
executing multiple instructions from a program simultaneously.
Overlapping execution: Pipelining is a common technique used to
exploit instruction-level parallelism by overlapping the execution of
instructions.
Improved performance: By concurrently executing multiple
instructions, the overall execution time of the program can be
reduced.
Independent instructions: Instruction-level parallelism is effective
when instructions can be executed independently without
dependencies.
Compiler and hardware support: Achieving instruction-level
parallelism often requires compiler optimizations and hardware
features such as superscalar processors or out-of-order execution.
Example:
Parallel loop: A loop in which each iteration can be executed concurrently, as shown in the sketch below.
Overlapping iterations: Different iterations of the loop can be executed simultaneously, enhancing overall performance.
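The original snippet is not reproduced in these notes; the following Python loop is a representative stand-in (an assumption): every iteration writes a different element and reads nothing produced by another iteration, so the iterations are independent and the hardware or compiler may overlap their execution.

```python
# Representative parallel loop (assumed stand-in for the missing snippet):
# no cross-iteration dependency, so iterations can be overlapped.
n = 8
a = [1] * n
b = [2] * n
c = [0] * n
for i in range(n):
    c[i] = a[i] + b[i]  # iteration i touches only element i
print(c)
```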
Instruction-Level Parallelism:
Simultaneous instruction execution: Instruction-level parallelism
involves executing multiple instructions from a program
simultaneously.
Pipelining: Pipelining is a common technique used to exploit
instruction-level parallelism by dividing instruction execution into
multiple stages that can overlap.
Instruction dependencies: Careful handling of dependencies is
necessary to ensure correct execution order and maintain program
semantics.
Performance improvement: By overlapping the execution of
multiple instructions, instruction-level parallelism can lead to
improved performance and faster program execution.
Hardware support: Achieving instruction-level parallelism often
requires specialized hardware features, such as superscalar
processors, branch prediction, and out-of-order execution.
Thread-Level Parallelism:
Concurrent execution of threads: Thread-level parallelism involves
executing multiple threads of execution concurrently.
Independent tasks: Threads represent different tasks or portions of
the program that can be executed independently.
Resource sharing: Multiple threads may share common resources,
such as memory or I/O devices, requiring proper synchronization
mechanisms.
Scalability: Thread-level parallelism allows for efficient utilization
of multiple processing cores or processors, enabling scalability.
Examples: Multithreading in applications like web servers, video
encoding, or scientific simulations can exploit thread-level
parallelism to improve performance by dividing the workload across
multiple threads.
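A minimal Python sketch of thread-level parallelism (illustrative, not from the original notes): two threads run concurrently and share a counter, so a lock provides the synchronization mechanism the bullets above mention.

```python
# Minimal thread-level-parallelism sketch with a shared resource.
import threading

counter = 0
lock = threading.Lock()

def worker(increments):
    global counter
    for _ in range(increments):
        with lock:           # synchronize access to the shared counter
            counter += 1

threads = [threading.Thread(target=worker, args=(100_000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # always 200000 thanks to the lock
```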
Data-Level Parallelism:
Simultaneous processing of data elements: Data-level parallelism
involves performing operations on multiple data elements
concurrently.
Vectorization: Vector instructions or SIMD (Single Instruction,
Multiple Data) operations are used to exploit data-level parallelism.
Data dependencies: Dependencies between data elements must be
managed to ensure correct and consistent results.
Parallel processing units: Modern processors often include dedicated
vector processing units to accelerate data-level parallel operations.
Performance gain: Data-level parallelism can significantly enhance
performance for computations involving large arrays or data sets by
processing multiple elements in parallel.
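A small vectorization sketch (assuming NumPy is available; not part of the original notes): a single array expression lets the library apply SIMD-style operations across many elements at once instead of looping in Python.

```python
# Minimal data-level-parallelism sketch via vectorized NumPy operations.
import numpy as np

a = np.arange(1_000_000, dtype=np.float64)
b = np.arange(1_000_000, dtype=np.float64)
c = a + b          # one vectorized operation over all elements
print(c[:5])
```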
Memory-Level Parallelism:
Concurrent memory operations: Memory-level parallelism involves
performing multiple memory operations concurrently.
Cache utilization: Memory-level parallelism takes advantage of the
cache hierarchy to overlap memory accesses and hide latency.
Memory dependencies: Dependencies between memory operations
must be managed to ensure data consistency and avoid conflicts.
Parallel memory systems: Techniques such as multi-channel
memory, interleaved memory access, or memory banks can exploit
memory-level parallelism.
Improved memory throughput: Memory-level parallelism helps
improve the overall memory system throughput by increasing the
utilization of memory resources and reducing memory access
bottlenecks.
TYPES OF GRANULARITY
Granularity in distributed computing refers to the size or scale of
tasks distributed across multiple computing resources. Here's a
summary:
1. Task size determines the level of detail and amount of work in each distributed task.
2. Granularity can be fine-grained (small tasks) or coarse-grained (larger tasks), impacting efficiency, scalability, and coordination requirements in the distributed computing system.
Fine-grained Parallelism:
Granularity of small tasks: Fine-grained parallelism involves
breaking down a program into small, fine-grained tasks that can be
executed concurrently.
High degree of parallelism: Fine-grained parallelism aims to
maximize the number of parallel tasks or threads executing
simultaneously.
Fine-grained synchronization: Fine-grained parallelism often
requires frequent synchronization between tasks to manage
dependencies and maintain data consistency.
Example: Fine-grained parallelism can be seen in applications with
fine-grained operations like matrix multiplications, image
processing, or simulations where numerous small tasks can be
executed concurrently.
Increased overhead: Fine-grained parallelism can introduce higher
synchronization and communication overhead due to frequent
coordination among fine-grained tasks.
Coarse-grained Parallelism:
Granularity of large tasks: Coarse-grained parallelism involves dividing a program into larger, coarse-grained tasks that can be executed independently.
Lower degree of parallelism: Coarse-grained parallelism focuses on a smaller number of parallel tasks or threads, but each task represents a larger unit of work.
Reduced synchronization overhead: Coarse-grained parallelism
requires less frequent synchronization between tasks due to larger
task sizes and reduced data dependencies.
Example: Coarse-grained parallelism can be observed in
applications like parallelizing stages of a pipeline, parallel database
queries, or parallelizing loops with larger iterations.
Enhanced load balancing: Coarse-grained parallelism can provide
better load balancing as larger tasks allow for more efficient
distribution of work among parallel resources.
Medium-grained Parallelism:
Granularity between fine and coarse: Medium-grained parallelism
strikes a balance between fine-grained and coarse-grained
parallelism by dividing a program into tasks of medium size.
Moderate degree of parallelism: Medium-grained parallelism aims
to achieve a moderate number of parallel tasks or threads, striking a
balance between fine-grained and coarse-grained approaches.
Synchronization considerations: Medium-grained parallelism
requires synchronization at appropriate points between tasks to
manage dependencies and ensure correctness.
Example: Applications that exhibit medium-sized tasks, such as
parallel graph algorithms, parallel simulations with moderate-sized
computational units, or parallelizing algorithmic components, can
benefit from medium-grained parallelism.
Trade-off considerations: Medium-grained parallelism seeks to
balance the benefits of fine-grained parallelism in terms of increased
parallelism with the advantages of coarse-grained parallelism in
terms of reduced synchronization overhead and improved load
balancing.
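To make the granularity trade-off concrete, here is a hedged Python sketch (illustrative; the chunksize values are arbitrary): with multiprocessing.Pool, a small chunksize yields fine-grained tasks with more coordination overhead, while a large chunksize yields coarse-grained tasks with less overhead but coarser load balancing.

```python
# Granularity sketch: chunksize controls how much work each task carries.
from multiprocessing import Pool

def square(x):
    return x * x

if __name__ == "__main__":
    with Pool(4) as pool:
        fine = pool.map(square, range(10_000), chunksize=1)      # fine-grained
        coarse = pool.map(square, range(10_000), chunksize=2500) # coarse-grained
    assert fine == coarse  # same result, different scheduling granularity
```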
Performance of Parallel Processors
Speedup: the ratio of the time taken to solve a problem on a single processor to the time taken on p processors, S(p) = T(1)/T(p).
Efficiency: the speedup per processor, E(p) = S(p)/p, which measures how well the processors are utilized.
Speedup Performance Laws
1. Amdahl’s Law
Amdahl’s Law was named after Gene Amdahl, who presented it in
1967. In general terms, Amdahl’s Law states that in parallelization,
if P is the proportion of a system or program that can be made
parallel, and 1-P is the proportion that remains serial, then the
maximum speedup S(N) that can be achieved using N processors is:
S(N)=1/((1-P)+(P/N))
As N grows the speedup tends to 1/(1-P).
Speedup is limited by the total time needed for the sequential (serial) part of the program. For 10 hours of computing, if we can parallelize 9 hours of computing and 1 hour cannot be parallelized, then our maximum speedup is limited to 10 times as fast. If computers get faster, the speedup itself stays the same.
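A quick numeric check of Amdahl's Law for the example above (a minimal sketch, not from the original notes), with P = 0.9:

```python
# Amdahl's Law: speedup approaches 1/(1-P) = 10 as N grows, for P = 0.9.
def amdahl_speedup(P, N):
    return 1.0 / ((1.0 - P) + P / N)

for N in (2, 10, 100, 10_000):
    print(N, round(amdahl_speedup(0.9, N), 2))
# 2 -> 1.82, 10 -> 5.26, 100 -> 9.17, 10000 -> 9.99
```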
2. Gustafson’s Law
This law says that increase of problem size for large machines can
retain scalability with respect to the number of processors. American
computer scientist and businessman, John L. Gustafson (born
January 19, 1955) found out that practical problems show much
better speedup than Amdahl predicted.
Gustafson’s law: the computation time is held constant (instead of the problem size); increasing the number of CPUs lets us solve a bigger problem and get better results in the same time.
• The execution time of the program on a parallel computer is (a + b), where a is the sequential time and b is the parallel time.
• The total amount of work to be done in parallel varies linearly with the number of processors, so b is fixed as the number of processors p is varied.
• The same scaled problem would take (a + p*b) to run on a single processor, so the speedup is S(p) = (a + p*b)/(a + b).
• Define α = a/(a + b), the sequential fraction of the execution time; then S(p) = α + p(1 − α) = p − α(p − 1).
• Any sufficiently large problem can be efficiently parallelized with this speedup, where p is the number of processors and α is the serial portion of the problem.
• Gustafson proposed a fixed-time concept which leads to scaled speedup for larger problem sizes: basically, we use larger systems with more processors to solve larger problems.
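For comparison with Amdahl's Law, a small sketch evaluating Gustafson's scaled speedup S(p) = α + p(1 − α) (illustrative; α = 0.1 is an arbitrary choice):

```python
# Gustafson's scaled speedup keeps growing as processors tackle a
# proportionally larger problem in the same time.
def gustafson_speedup(alpha, p):
    return alpha + p * (1.0 - alpha)

for p in (2, 10, 100):
    print(p, gustafson_speedup(0.1, p))
# 2 -> 1.9, 10 -> 9.1, 100 -> 90.1
```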
UNIPROCESSOR ARCHITECTURE:
Computer architecture encompasses the design and structure of
computer systems, including their components and the
interconnections between them.
It defines the functionality of a computer system by specifying the
operations it can perform, such as arithmetic calculations, data
storage, and control flow.
Computer architecture determines how the various hardware
components, such as the central processing unit (CPU), memory,
input/output (I/O) devices, and peripherals, are organized and
interact with each other.
It involves designing instruction sets and addressing modes that
dictate how the CPU interprets and executes instructions.
Computer architecture also includes the design of memory
hierarchies, which involve different levels of memory (such as
cache, RAM, and secondary storage) that work together to store and
retrieve data efficiently.
Additionally, computer architecture encompasses the design of
communication pathways and protocols that facilitate data transfer
between different components, as well as the overall system
performance and power consumption considerations.

RISC and CISC Architecture


Reduced Instruction Set Computer or RISC Architecture
The fundamental goal of RISC is to make hardware simpler by
employing an instruction set that consists of only a few basic steps
used for evaluating, loading, and storing operations. A load
command loads data but a store command stores data.
Characteristics of RISC:
1. It has simpler instructions and thus simple instruction decoding.
2. More general-purpose registers.
3. Each instruction takes one clock cycle to execute.
4. Each instruction fits within a single word.
5. Pipelining can be easily achieved.
6. Few data types.
7. Simpler addressing modes.
Complex Instruction Set Computer or CISC Architecture
The fundamental goal of CISC is that a single instruction can handle all evaluating, loading, and storing operations; for example, a multiplication command also handles loading its operands and storing the result, which is why the instructions are complex.
Characteristics of CISC:
1. Instructions are complex, and thus instruction decoding is complex.
2. Instructions may take more than one clock cycle to execute.
3. Instructions can be larger than one word.
4. Fewer general-purpose registers, since operations can be performed directly on memory.
5. More data types.
6. Complex addressing modes.
Both CISC and RISC approaches primarily try to increase the
performance of a CPU. Here is how both of these work:
1. CISC: This kind of approach tries to minimize the total number
of instructions per program, and it does so at the cost of increasing
the total number of cycles per instruction.
2. RISC: It reduces the cycles per instruction and does so at the cost
of the total number of instructions per program.

RISC vs. CISC
1. RISC is a reduced instruction set; CISC is a complex instruction set.
2. The number of instructions in RISC is less as compared to CISC; the number of instructions in CISC is more as compared to RISC.
3. RISC has fewer addressing modes; CISC has more addressing modes.
4. RISC works with a fixed instruction format; CISC works with a variable instruction format.
5. RISC consumes low power; CISC consumes high power.
6. RISC processors are highly pipelined; CISC processors are less pipelined.
7. RISC optimizes performance by focusing on software; CISC optimizes performance by focusing on hardware.
8. RISC requires more RAM; CISC requires less RAM.
9. RISC does not support arrays; CISC, with its large number of instructions, supports arrays.
10. RISC does not use condition codes; in CISC, condition codes are used.
11. In RISC, registers are used for procedure arguments and return addresses; in CISC, the stack is used for procedure arguments and return addresses.
Parallel processing in a uniprocessor
Parallel processing in a uniprocessor system involves executing
multiple tasks simultaneously using a single processor.
Hardware parallelism utilizes multiple functional units within the
processor to perform different operations concurrently.
Techniques like pipelining and superscalar execution enable the
overlapping of instruction execution stages, allowing for
simultaneous processing of multiple instructions.
Software parallelism involves designing programs to enable
concurrent execution of multiple tasks through techniques like
multithreading.
Parallel processing in a uniprocessor system enhances efficiency and
reduces processing time by effectively utilizing available hardware
resources.
However, the capabilities and resources of a single processor limit
the extent of parallelism in a uniprocessor system, as true parallel
processing is typically achieved in multiprocessor systems.
Parallelism within CPU:
Parallelism within a CPU refers to the execution of multiple
operations or tasks simultaneously using different functional units
within the processor.
This hardware parallelism allows for concurrent processing of
multiple instructions or data elements, thereby increasing throughput
and performance.
Different functional units within the CPU, such as arithmetic logic
units (ALUs), floating-point units (FPUs), and memory units, can
operate in parallel to handle different types of operations
simultaneously.
Parallelism within the CPU is achieved through techniques like
superscalar execution and out-of-order execution, where multiple
instructions are fetched, decoded, and executed concurrently.
By effectively utilizing available hardware resources and
overlapping instruction execution stages, parallelism within the CPU
improves overall efficiency and reduces processing time.
Pipelining within CPU:
Pipelining is a technique used within the CPU to achieve parallelism
by overlapping the execution of multiple instructions.
It divides the instruction execution process into stages, and different
instructions are processed in a pipeline fashion, with each stage
handling a specific task.
This allows multiple instructions to be in different stages of
execution at the same time, enabling higher instruction throughput.
Pipelining involves stages like instruction fetch, instruction decode,
execution, memory access, and write-back, which operate
concurrently on different instructions.
As one instruction moves to the next stage, the subsequent
instruction can enter the pipeline, allowing for a continuous flow of
instructions through the CPU.
Pipelining improves CPU performance by reducing the overall
instruction execution time and increasing instruction throughput.
Multiprocessor
A multiprocessor is a computer system that consists of two or more
CPUs that have full access to a common RAM (random access
memory).
The primary purpose of a multiprocessor is to enhance the system's
execution speed by allowing multiple CPUs to work on different
tasks simultaneously.
There are two types of multiprocessors: shared memory
multiprocessors and distributed memory multiprocessors.
In a shared memory multiprocessor, all CPUs share a common
memory, enabling them to access and manipulate data stored in that
memory.
In contrast, a distributed memory multiprocessor assigns each CPU
its own private memory, meaning that data must be explicitly
transferred between CPUs if they need to share information.
Benefits of a multiprocessor:
• Enhanced performance.
• Multiple applications.
• Multi-tasking inside an application.
• High throughput and responsiveness.
• Hardware sharing among CPUs.
There are three models that come under a multiprocessor:
UMA (Uniform Memory Access)
UMA (Uniform Memory Access)is a shared memory architecture
used in multiprocessor systems.
In UMA, all processors in the system access a single shared memory
through an interconnection network.
Each processor in a UMA system has equal memory accessing time
and access speed.
UMA can be implemented using different interconnect technologies
such as a single bus, multiple buses, or a crossbar switch.
UMA provides balanced shared memory access, which means that
all processors have equal access to the memory.
Due to its balanced design, UMA is also referred to as SMP
(Symmetric Multiprocessor) systems.

NUMA (Non-Uniform Memory Access)


NUMA (Non-Uniform Memory Access) is a shared memory
multiprocessor model where the access time to memory can vary
based on its location.
In NUMA, each processor is connected to its dedicated memory, but
these memory segments are combined to form a single address
space.
Unlike UMA, the access time in NUMA depends on the distance
between the processor and the memory, resulting in varying memory
access times.
NUMA allows access to any memory location using physical
addresses.
NUMA systems typically employ a distributed memory architecture,
where each processor has its own local memory.
NUMA architectures are designed to optimize memory access for
the processors closest to the memory, reducing latency and
improving performance.

COMA (Cache-Only Memory Architecture)


COMA (Cache-Only Memory Architecture) is a model that
combines multiprocessor and cache memory.
It is a variation of NUMA where distributed memory is replaced by
caches, creating a global address space.
In COMA, each processor has a portion of the shared memory,
which is represented by cache memory.
Data in a COMA system needs to be migrated to the processor that
requests it since there is no memory hierarchy.
COMA systems utilize a cache directory (D) to facilitate remote
cache access.
The Kendall Square Research's KSR-1 machine is an example of a
system that implements the COMA architecture.

Multicomputer
A multicomputer is a system architecture composed of multiple independent computers connected via a network. Each computer in a multicomputer operates autonomously with its own local resources, including memory and processors. Communication between computers in a multicomputer is achieved through message passing over the network. Multicomputers can be highly scalable, allowing for the addition of more computers to expand computational power. Unlike shared memory architectures, multicomputers do not have a single global address space. Multicomputers offer flexibility in terms of heterogeneous hardware configurations and distributed computing models. Load balancing and task distribution are important considerations in multicomputer systems to ensure efficient utilization of resources across the network.
NORMA (No-Remote Memory Access)
NORMA (No-Remote Memory Access) restricts direct remote
memory access in a multiprocessor system.
Each processor in NORMA has its dedicated local memory for direct
access.
NORMA does not provide a shared global address space like UMA
or NUMA.
Communication between processors in NORMA relies on explicit
message passing or inter-processor communication methods.
NORMA emphasizes efficient utilization of local memory to reduce
remote memory access latency.
NORMA offers improved scalability and performance by limiting
remote memory access and promoting local memory usage.
Multiprocessor Model vs. Multicomputer Model
• Memory: A multiprocessor has a shared memory architecture with a single memory accessed by all processors; in a multicomputer, each computer has its own local memory.
• Communication: In a multiprocessor, processors communicate directly through shared memory; in a multicomputer, communication between computers is achieved through message passing over a network.
• Scalability: A multiprocessor has limited scalability due to contention and shared memory access; a multicomputer is highly scalable, as computers can be added to or removed from the network easily.
• Global Address Space: A multiprocessor provides a single global address space accessible by all processors; a multicomputer has no single global address space, as each computer has its own address space.
• Synchronization: In a multiprocessor, synchronization mechanisms rely on shared memory for efficient coordination; a multicomputer requires explicit message passing for synchronization between computers.
• Load Balancing: In a multiprocessor, load balancing can be challenging due to shared memory access and contention; in a multicomputer, load balancing can be implemented by distributing tasks across computers.
• Fault Tolerance: In a multiprocessor, failure of a processor can impact the entire system's functionality and performance; in a multicomputer, individual computer failures have a localized impact.
UMA (Uniform Memory Access) vs. NUMA (Non-Uniform Memory Access) vs. COMA (Cache-Only Memory Architecture) vs. NORMA (No-Remote Memory Access)
• Memory Access: In UMA, a single memory is accessed by all processors; in NUMA, memory access time can vary based on location; in COMA, memory is replaced by cache memory; in NORMA, each processor has its own local memory.
• Shared Memory: UMA and NUMA are shared memory architectures; in COMA, the shared memory consists of cache memory; NORMA has no shared global address space.
• Memory Hierarchy: UMA and NUMA have a memory hierarchy; COMA and NORMA do not.
• Communication: UMA uses a shared bus, multiple buses, or a crossbar switch; NUMA uses an interconnect network; COMA uses a cache directory (D) for remote cache access; NORMA uses explicit message passing.
• Scalability: UMA has limited scalability due to shared memory access; NUMA is scalable with the addition/removal of processors; COMA has limited scalability due to the lack of a memory hierarchy; NORMA is scalable, as processors can have dedicated memory.
• Performance: UMA provides balanced shared memory access among processors; NUMA has varying memory access time based on distance; COMA emphasizes efficient utilization of cache memory; NORMA limits remote memory access, improving performance.
• Example: UMA: Symmetric Multiprocessor (SMP) systems; COMA: Kendall Square Research's KSR-1 machine.
Flynn's taxonomy
Flynn's taxonomy is a classification system used to categorize computer architectures based on the number of instruction streams and data streams that can be processed simultaneously. It was proposed by Michael J. Flynn in 1966 and has four main categories, as follows:
Single Instruction, Single Data (SISD):
This category represents the traditional sequential processing model.
It consists of a single processor executing a single instruction stream
on a single set of data.
Examples include most conventional desktop computers.
Single Instruction, Multiple Data (SIMD):
In this category, a single instruction is applied to multiple data
elements simultaneously.
It involves a single control unit that issues the same instruction to
multiple processing units.
SIMD architectures are commonly used for tasks that involve
parallel processing, such as multimedia applications and scientific
simulations.
Multiple Instruction, Single Data (MISD):
This category involves multiple instructions operating on a single
data stream.
MISD architectures are relatively rare and less commonly used in
practical systems.
They were initially proposed for error checking and fault tolerance
purposes but have limited real-world implementations.
Multiple Instruction, Multiple Data (MIMD):
MIMD architectures have multiple processors that independently execute different instruction streams on different data streams.
Each processor in a MIMD system can operate on its own data and execute its own instructions.
MIMD is the most common category and includes various parallel processing architectures, such as clusters, multiprocessor systems, and distributed systems.
Feng’s classification
Feng’s classification (1972) is based on serial versus parallel processing. Under this classification there are four types:
1. Word Serial and Bit Serial (WSBS):
In WSBS architecture, both data and instructions are processed
serially, one word or bit at a time.
The entire word or bit is processed before moving on to the next
word or bit.
This type of architecture is typically slower compared to parallel
processing types.
2. Word Parallel and Bit Serial (WPBS):
WPBS architecture involves parallel processing of multiple words
simultaneously, but within each word, processing occurs serially.
Multiple words are processed concurrently, but within a word, the
individual bits are processed serially.
This type of architecture can provide some performance
improvement over WSBS by utilizing parallelism at the word level.
3. Word Serial and Bit Parallel (WSBP):
In WSBP architecture, each word is processed serially, but multiple
bits within the word are processed in parallel.
The architecture allows for parallel processing of the bits within a
word, improving performance compared to WSBS.
This type of architecture is commonly used in SIMD (Single
Instruction, Multiple Data) systems.
4. Word Parallel and Bit Parallel (WPBP):
WPBP architecture involves parallel processing of both words and
bits.
Multiple words are processed simultaneously, and within each word,
multiple bits are processed concurrently.
This type of architecture offers the highest level of parallelism and
can provide significant performance improvements.
Distributed Memory Multi-computers
In a distributed-memory multiprocessor, each processor has its own
associated memory module.
Processors can directly access their own memory but require a
message passing mechanism (such as MPI) to access memory
associated with other processors.
Memory access in distributed-memory multiprocessors is non-
uniform, meaning it depends on which memory module a processor
is trying to access. This is known as a Non-Uniform Memory Access
(NUMA) multiprocessor system.
If all processors in a distributed-memory multiprocessor are
identical, it is referred to as a Symmetric Multiprocessor (SMP).
If the processors in a distributed-memory multiprocessor are
heterogeneous, it is called an Asymmetric Multiprocessor (ASMP).
Distributed-memory systems are easier to build but harder to use, as
they consist of multiple shared-memory computers with separate
operating systems and memory.
Distributed-memory multiprocessors are the architecture of choice
for constructing modern supercomputers due to their scalability and
ability to handle large-scale parallel computing tasks.
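Since the notes mention message passing "such as MPI", here is a minimal sketch using the mpi4py binding (an assumption: mpi4py and an MPI runtime are installed; run with, e.g., mpirun -n 2 python demo.py):

```python
# Minimal message-passing sketch between two processors (assumes mpi4py).
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    comm.send({"payload": 42}, dest=1, tag=0)   # rank 0 sends to rank 1
elif rank == 1:
    data = comm.recv(source=0, tag=0)           # rank 1 receives the message
    print("received:", data)
```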
Shared Memory Multi-processors
In a shared-memory multiprocessor, both data and code in a parallel
program are stored in the main memory accessible to all processors.
All processors in a shared-memory system have direct hardware
access to the entire main memory address space.
Shared-memory multiprocessors consist of a limited number of
processors that can concurrently access and modify shared memory.
In this architecture, all CPU cores can access the same memory,
similar to multiple workers sharing a whiteboard, and are controlled
by a single operating system.
Modern processors are often multicore, with multiple CPU cores
integrated on a single chip.
Shared-memory systems are relatively easier to use, as all processors
can access shared memory without explicit message passing.
Shared-memory multiprocessors are well-suited for laptops and
desktops, providing a convenient architecture for general-purpose
computing tasks.
Introduction to Distributed Systems
Distributed System:
A collection of autonomous computer systems connected by a
centralized computer network.
Autonomous systems communicate by sharing resources and files.
Example: Social media with a centralized network as headquarters
and autonomous systems for user access.

Characteristics of Distributed Systems:


Resource Sharing: Ability to use hardware, software, and data
anywhere in the system.
Openness: Concerned with extensions and improvements in the
system.
Concurrency: Multiple users in remote locations performing the
same activities.
Scalability: System can accommodate increased users and improve
responsiveness.
Fault Tolerance: System continues to operate despite hardware or
software failures.
Transparency: Hides complexity of the system from users and
applications.
Heterogeneity: Components can vary in networks, hardware,
operating systems, programming languages, and implementations.
Advantages of Distributed Systems:
Inherently distributed applications.
Information sharing among geographically distributed users.
Resource sharing across autonomous systems.
Better price-performance ratio and flexibility.
Shorter response time and higher throughput.
Higher reliability and availability against component failure.
Extensibility for system expansion.
Disadvantages of Distributed Systems:
Relevant software for distributed systems is lacking.
Security concerns due to shared resources and easy access to data.
Networking saturation can hinder data transfer.
Database management is more complex compared to single-user
systems.
Network overload if all nodes send data simultaneously.
Applications of Distributed Systems:
Finance and commerce (Amazon, eBay, online banking).
Information society (search engines, social networking, cloud
computing).
Cloud technologies (AWS, Salesforce, Microsoft Azure, SAP).
Entertainment (online gaming, music, YouTube).
Healthcare (online patient records, health informatics).
Education (e-learning).
Transport and logistics (GPS, Google Maps).
Environment management (sensor technologies).
Challenges of Distributed Systems:
Network latency affecting system performance.
Distributed coordination among nodes.
Security vulnerabilities due to the distributed nature.
Data consistency maintenance across multiple nodes.
Distributed Systems vs. Centralized Systems
• Architecture: A distributed system consists of multiple interconnected nodes or servers that work together; a centralized system consists of a single server or central node that performs all tasks.
• Resource Location: In a distributed system, resources and data are distributed across multiple nodes; in a centralized system, all resources and data are located in one central server.
• Scalability: A distributed system can scale horizontally by adding more nodes to handle increased workloads; a centralized system has limited scalability based on the capacity of the central server.
• Fault Tolerance: A distributed system is designed to be fault-tolerant, allowing the system to continue functioning even if nodes fail; a centralized system has a single point of failure, where failure of the central server can lead to system downtime.
• Performance: In a distributed system, parallel processing and a distributed workload lead to improved performance; in a centralized system, performance depends on the capacity of the central server.
• Resilience: A distributed system is more resilient to failures, attacks, and natural disasters because data and processing are distributed; a centralized system is vulnerable to failures, as the entire system relies on the central server.
• Control: A distributed system is more complex to manage and control due to its distributed nature; a centralized system is easier to manage and control, as all resources and data are centralized.
• Data Consistency: In a distributed system, achieving consistency across distributed nodes can be challenging; in a centralized system, achieving consistency is relatively simpler due to the central server's control.
• Communication Overhead: A distributed system requires network communication between nodes, leading to potential latency; in a centralized system, communication occurs within the central server, reducing potential latency.
Client-Server Model:
A distributed application structure where tasks or workload are
divided between servers and clients.
Servers provide resources or services, while clients make requests
for those resources or services.
Examples include email and the World Wide Web.

How the Client-Server Model Works:


Clients send requests for data to servers through the internet.
Servers accept the requests and process them.
Servers deliver the requested data packets back to the clients.
Clients do not share their resources.
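A minimal client-server sketch using Python sockets (illustrative; the host, port, and payloads are arbitrary), showing the request/response flow described above:

```python
# Minimal client-server sketch: the server accepts a request and replies,
# and the client sends a request and reads the returned data.
import socket

def run_server(host="127.0.0.1", port=9000):
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind((host, port))
        srv.listen()
        conn, _ = srv.accept()                    # server accepts the request
        with conn:
            request = conn.recv(1024)             # ...processes it...
            conn.sendall(b"data for " + request)  # ...and delivers the reply

def run_client(host="127.0.0.1", port=9000):
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as cli:
        cli.connect((host, port))
        cli.sendall(b"page.html")                 # client sends a request
        print(cli.recv(1024))                     # and receives the data

# Run run_server() in one process, then run_client() in another.
```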
Advantages of Client-Server Model:
Centralized system with data in one place.
Cost-efficient with lower maintenance costs.
Data recovery is possible.
Capacity of clients and servers can be changed separately.
Disadvantages of Client-Server Model:
Clients are vulnerable to viruses, Trojans, and worms.
Servers are prone to Denial of Service (DoS) attacks.
Data packets may be spoofed or modified during transmission.
Phishing attacks and Man-in-the-Middle (MITM) attacks are
common.
Peer-to-Peer (P2P) Systems:
Decentralized distributed architecture where peers act as both clients
and servers.
Peers directly communicate with each other without a central server.
Peers can request and provide resources or services to other peers.
Advantages of P2P Systems:
Decentralization reduces single points of failure and increases
resilience.
Scalability as more peers join, increasing resource availability.
Resource utilization through contribution of peers' own resources.
Robustness as the system can function even if peers join or leave.
Enhanced privacy as data is shared directly between peers.
Disadvantages of P2P Systems:
Lack of centralized control makes policy enforcement and data
consistency challenging.
Vulnerability to security risks such as malware, attacks, and
unauthorized access.
Varying quality and reliability of resources depending on
contributing peers.
Higher network overhead due to direct peer-to-peer communication.
Applications of P2P Systems:
File sharing networks like BitTorrent and eMule.
Decentralized cryptocurrencies such as Bitcoin.
Content delivery networks (CDNs) utilizing P2P technology.
Collaborative computing for distributed processing and scientific
research.
Examples of distributed systems:
Cloud Computing Platforms (e.g., AWS, Azure, GCP)
Distributed File Systems (e.g., HDFS, GFS)
Content Delivery Networks (e.g., Akamai, Cloudflare)
Blockchain Networks (e.g., Bitcoin, Ethereum)
Distributed Database Systems (e.g., Cassandra, MongoDB)
Peer-to-Peer (P2P) Networks (e.g., BitTorrent, Bitmessage)
Distributed Sensor Networks (e.g., Internet of Things applications)
Distributed Web Applications (e.g., social media platforms, e-
commerce websites)
Distributed Machine Learning Systems (e.g., TensorFlow, Apache
Spark)
Distributed Gaming Systems (e.g., multiplayer online games)
Main characteristics of distributed systems:
Decentralization: Distributed systems are composed of multiple
interconnected nodes or servers without a central controlling
authority.
Autonomy: Each node in a distributed system has its own autonomy
and control over its resources, operations, and decision-making.
Concurrency: Distributed systems deal with multiple activities or
tasks concurrently, where separate users or nodes can perform
different operations simultaneously.
Scalability: Distributed systems can scale horizontally by adding
more nodes, allowing them to handle increased workloads and
accommodate growing demands.
Fault Tolerance: Distributed systems are designed to be fault-
tolerant, meaning that if one node fails or goes offline, the system
can continue to operate with minimal disruption.
Communication: Nodes in a distributed system communicate and
exchange information through message passing, remote procedure
calls, or other communication mechanisms.
Heterogeneity: Distributed systems may consist of different
hardware, operating systems, programming languages, and
architectures, which require interoperability and handling of
heterogeneous components.
Resource Sharing: Distributed systems facilitate resource sharing,
allowing nodes to access and utilize shared resources such as
computing power, storage, and data.
Transparency: Distributed systems aim to provide transparency to
users and applications by hiding the complexity of the underlying
network and architecture, making it appear as a single coherent
system.
Security: Distributed systems face security challenges due to the
distributed nature of data and resources, requiring mechanisms for
authentication, access control, encryption, and secure
communication.
Design goals of distributed systems:
Scalability: Ability to handle increasing workloads and
accommodate more users or nodes.
Fault Tolerance: Resilience to hardware and software failures for
high availability.
Performance: Efficient processing and fast response times through
parallelism.
Transparency: Hiding complexities of the network and architecture
from users and applications.
Consistency and Replication: Ensuring data consistency across
distributed copies.
Security: Robust measures for authentication, encryption, and access
control.
Interoperability: Seamless communication and resource sharing
among heterogeneous components.
Manageability: Tools for easy administration, monitoring, and
troubleshooting.
Load Balancing: Evenly distributing workload to optimize resource
utilization.
Extensibility and Flexibility: Adaptability to changing requirements
and technological advancements.
Main problems of distributed systems:
Communication and Latency: Communication between distributed
nodes introduces latency and can lead to delays and performance
issues.
Data Consistency: Ensuring consistency across distributed nodes is
challenging, as updates and changes need to be propagated and
synchronized.
Fault Tolerance: Handling failures and maintaining system stability
in the presence of node failures or network disruptions.
Security and Trust: Distributed systems face security challenges,
including unauthorized access, data breaches, and ensuring trust
among nodes.
Scalability and Load Balancing: Scaling the system to handle
increased workloads and distributing the load evenly across nodes.
Heterogeneity: Dealing with differences in hardware, software, and
platforms among distributed components, requiring interoperability
and integration.
Distributed Coordination: Coordinating activities and maintaining
consistency in a distributed environment where multiple nodes work
independently.
Network Overhead: Managing network resources and mitigating
overhead associated with communication and data transfer.
Debugging and Troubleshooting: Identifying and resolving issues in
distributed systems can be complex due to the distributed nature and
interactions between nodes.
Complexity and Manageability: Distributed systems are inherently
complex, requiring sophisticated management and administration
tools for deployment, configuration, and maintenance.
Resource sharing in Distributed Systems:
Resource sharing in a distributed system involves sharing and
accessing software, hardware, and data across different computer
systems.
Data Migration: The process of transferring data from one location
to another in the system to be accessed by distributed systems.
Computation Migration: The process of transferring computation
instead of data across the system, typically done when transferring
large files or big data.
Advantages of Computation Migration:
Increases computational speed.
Enables load balancing to optimize resource sharing.
Web Challenges in Distributed Systems:
Scalability: Ensuring that system performance is not degraded as the
system load increases.
Heterogeneity: The ability to communicate with different devices,
such as computers, mobile devices, and peripherals.
Security Challenges: Privacy, Authentication, and Availability.
Privacy: Ensuring confidentiality of shared data.
Authentication: Verifying the proper identity of users and preventing
unauthorized access.
Availability: Ensuring that data and resources are consistently
available.
Handling of Failure: Dealing with system failures effectively.
Fault Tolerance: Continuously operating in the presence of errors.
Redundancy: Avoiding duplication or inconsistency in the system.
Exception Handling: Properly handling errors that occur during
system operation.
Models of a distributed system: the Interaction Model, the Failure Model, and the Security Model.
1. The Interaction Model:
• Describes how different components of a distributed system
communicate and interact with each other.
• Defines the protocols, message formats, and communication
patterns used for exchanging information.
• Specifies the roles and responsibilities of each component in the
system and how they collaborate to achieve the desired
functionality.
• Determines the flow of control and data between the various
components, including request-response mechanisms and event-
driven communication.
• Considers factors such as message passing, remote procedure
calls (RPC), message queues, and publish-subscribe patterns.
• Ensures that the interactions are designed to be efficient,
scalable, and reliable to meet the system requirements.
2. The Failure Model:
• Addresses the potential failures and faults that can occur in a
distributed system.
• Identifies different types of failures, such as hardware failures,
network failures, software errors, and human errors.
• Specifies how the system should handle and recover from
failures to ensure fault tolerance and availability.
• Includes mechanisms like redundancy, replication, fault
detection, error recovery, and failure mitigation strategies.
• Considers factors such as fault detection time, failure detection
accuracy, fault isolation, and system resilience.
• Aims to minimize the impact of failures on the overall system
performance and maintain data consistency and integrity.
3. The Security Model:
• Focuses on protecting the distributed system from unauthorized
access, data breaches, and malicious activities.
• Defines the security policies, access controls, and authentication
mechanisms to ensure confidentiality, integrity, and availability
of system resources.
• Specifies encryption techniques, secure communication
protocols, and secure storage mechanisms to protect sensitive
data.
• Considers factors such as user authentication, authorization,
secure channels, secure messaging, and secure data storage.
• Includes measures to prevent attacks like unauthorized access,
denial-of-service (DoS), man-in-the-middle attacks, and data
tampering.
• Aims to provide a secure environment for users and prevent data
loss, privacy violations, and system compromises.
Types of Distributed System: Grid, Cluster, Cloud
1. Grid:
• A grid distributed system is composed of geographically
distributed and heterogeneous resources that are coordinated to
provide a unified computing infrastructure.
• It enables the sharing and aggregation of computational
resources, such as processing power, storage, and network
bandwidth, across multiple administrative domains.
• Grid systems are designed to handle large-scale scientific and
data-intensive applications that require high-performance
computing capabilities.
• They employ middleware software to manage resource
allocation, job scheduling, and data management across the grid
nodes.
• Grid systems are characterized by their ability to support
distributed and parallel computing, allowing multiple tasks to be
executed simultaneously across different nodes.
• They provide fault tolerance and scalability by dynamically
adapting resource allocation based on demand and availability.
• Grid systems often have security mechanisms in place to ensure
data privacy and protect against unauthorized access.
• Examples of grid systems include the World Community Grid,
European Grid Infrastructure, and Open Science Grid.
2. Cluster:
• A cluster distributed system is a group of interconnected
computers or servers that work together as a single integrated
system.
• The cluster is typically located in close proximity and connected
by a high-speed local area network (LAN).
• Each node in the cluster runs its own operating system and
applications, and they communicate and collaborate to achieve a
common goal.
• Cluster systems are designed to provide high availability, fault
tolerance, and load balancing for mission-critical applications.
• They often employ distributed file systems to enable shared
access to data across the cluster.
• Cluster systems use clustering software or middleware to
manage resource allocation, workload distribution, and failover
mechanisms.
• They are commonly used in enterprise environments for tasks
such as database management, web serving, and scientific
simulations.
• Examples of cluster systems include Apache Hadoop for
distributed data processing, Kubernetes for container
orchestration, and high-performance computing (HPC) clusters
for scientific computing.
3. Cloud:
• A cloud distributed system provides on-demand access to
computing resources, such as virtual machines, storage, and
software applications, over the internet.
• Cloud systems are hosted in data centers and offer scalability,
flexibility, and cost efficiency by allowing users to pay for
resources based on usage.
• They provide a range of services, including infrastructure as a
service (IaaS), platform as a service (PaaS), and software as a
service (SaaS).
• Cloud systems utilize virtualization technologies to abstract and
pool physical resources, enabling multi-tenancy and resource
sharing.
• They often employ distributed storage systems to ensure data
availability and durability across multiple data centers.
• Cloud systems are managed by cloud service providers (CSPs),
who handle resource provisioning, monitoring, security, and
maintenance.
• They offer various deployment models, such as public cloud,
private cloud, hybrid cloud, and multi-cloud, to cater to different
needs and requirements.
• Popular examples of cloud platforms include Amazon Web
Services (AWS), Microsoft Azure, Google Cloud Platform
(GCP), and Salesforce.
Comparison of Grid, Cluster, and Cloud:
Resource Sharing:
• Grid: shares geographically distributed and heterogeneous resources across multiple administrative domains.
• Cluster: shares resources within close proximity across interconnected servers.
• Cloud: shares virtualized resources over the internet.
Connectivity:
• Grid: connected via a wide area network (WAN).
• Cluster: connected via a local area network (LAN).
• Cloud: connected via the internet.
Scale:
• Grid: designed for large-scale scientific and data-intensive applications.
• Cluster: typically used for high availability and load balancing of mission-critical applications.
• Cloud: provides scalability and flexibility for various types of applications.
Middleware:
• Grid: utilizes middleware software for resource allocation, job scheduling, and data management.
• Cluster: often employs clustering software or middleware for resource allocation and failover mechanisms.
• Cloud: managed by cloud service providers (CSPs) with various cloud service models.
Fault Tolerance:
• Grid: provides fault tolerance and scalability by dynamically adapting resource allocation based on demand and availability.
• Cluster: offers high availability and fault tolerance through redundant resources and failover mechanisms.
• Cloud: offers redundancy, data replication, and disaster recovery mechanisms.
Data Storage:
• Grid: may employ distributed file systems for shared access to data across the grid.
• Cluster: uses distributed file systems for shared data access within the cluster.
• Cloud: utilizes distributed storage systems for data availability and durability.
Location:
• Grid: geographically distributed resources.
• Cluster: proximity-based resources.
• Cloud: hosted in data centers worldwide.
Examples:
• Grid: World Community Grid, European Grid Infrastructure, Open Science Grid.
• Cluster: Apache Hadoop, Kubernetes, high-performance computing (HPC) clusters.
• Cloud: Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform (GCP), Salesforce.
Introduction to Distributed File System:
A distributed file system (DFS) is a software layer that enables files
and directories to be shared across multiple computers in a network.
It provides a unified and transparent view of files, allowing users to
access and manage them as if they were stored locally. Here are the
key features, advantages, disadvantages, and applications of
distributed file systems:
Working of Distributed File System:
Files are partitioned into blocks for distributed storage across
multiple servers.
Metadata servers store file information, including locations,
attributes, and permissions.
Clients interact with the DFS, retrieve metadata, and directly access
appropriate storage servers.
Data is accessed and transferred between clients and storage servers
over the network.
Consistency mechanisms maintain data integrity and
synchronization across replicas.
Replication provides fault tolerance and high availability through
multiple copies of data blocks.
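The toy Python sketch below illustrates the block-splitting and replication steps just described: it divides a byte string into fixed-size blocks and assigns each block to two servers round-robin. The block size, server names, and placement policy are all invented for readability; real systems use blocks of many megabytes and far more sophisticated placement.

```python
BLOCK_SIZE = 4               # bytes; unrealistically small so output is visible
SERVERS = ["s1", "s2", "s3", "s4"]
REPLICAS = 2                 # copies kept of each block

def place_blocks(data: bytes) -> dict:
    """Map each block index to the servers holding its replicas."""
    placement = {}
    for start in range(0, len(data), BLOCK_SIZE):
        block_id = start // BLOCK_SIZE
        # Place each replica on a distinct server, round-robin.
        placement[block_id] = [SERVERS[(block_id + r) % len(SERVERS)]
                               for r in range(REPLICAS)]
    return placement

print(place_blocks(b"hello distributed world"))
# e.g. {0: ['s1', 's2'], 1: ['s2', 's3'], ...}
```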
Features:
Scalability: Distributed file systems are designed to handle large
amounts of data and can scale horizontally by adding more storage
nodes.
Fault Tolerance: They provide high availability by replicating data
across multiple nodes, ensuring that data remains accessible even if
some nodes fail.
Transparency: Users can access and manage files without needing to
be aware of the underlying distributed nature of the system.
Consistency: Distributed file systems implement various consistency
models to ensure that concurrent operations on files maintain data
integrity.
Security: They offer access control mechanisms to protect data from
unauthorized access and ensure data privacy.
Caching: Distributed file systems often employ caching techniques
to improve performance by storing frequently accessed data closer
to the users.
Advantages:
Data Redundancy: Replication of data provides fault tolerance and
ensures data availability even in the event of node failures.
Scalability: Distributed file systems can seamlessly scale to
accommodate growing amounts of data by adding more storage
nodes.
Data Accessibility: Files can be accessed from anywhere on the
network, allowing for collaboration and remote access.
Load Balancing: Distributed file systems distribute data across
multiple nodes, balancing the workload and improving overall
system performance.
Simplified Data Management: Centralized control and management
of files simplify administration tasks.
Data Backup: Replication and data distribution provide built-in data
backup mechanisms.
Disadvantages:
Complexity: Implementing and managing a distributed file system
can be complex and require specialized knowledge and expertise.
Network Dependency: The performance of a distributed file system
relies heavily on the network infrastructure and its stability.
Data Consistency: Ensuring consistency in a distributed
environment can be challenging, and trade-offs may need to be
made between consistency and performance.
Latency: Accessing files over a network introduces additional
latency compared to local file systems.
Cost: Distributed file systems may require additional hardware,
network infrastructure, and maintenance, leading to increased costs.
Synchronization Overhead: Coordinating data updates across
multiple nodes can introduce overhead and potential bottlenecks.
Applications:
Big Data Analytics: Distributed file systems like Hadoop HDFS are
widely used in processing and analyzing large volumes of data.
Cloud Storage: Distributed file systems provide the backbone for
cloud storage services, enabling users to store and access their data
remotely.
High-Performance Computing: Distributed file systems like Lustre
are used in scientific and research environments for high-
performance computing tasks.
Media Streaming: Distributed file systems can efficiently distribute
and stream media content across multiple servers.
Content Delivery Networks (CDNs): CDNs rely on distributed file
systems to store and serve content closer to end-users, improving
content delivery speed.
Collaborative Work: Distributed file systems facilitate collaborative
work environments by allowing multiple users to access and modify
shared files.
File Service Architecture:
File Service Architecture provides file access by structuring the file service as the following three components:
• A client module
• A flat file service
• A directory service
Flat File Service:
The flat file service is responsible for managing the operations of
individual files in a distributed system.
Each file in the system is assigned a unique file identifier, which is a
long sequence of bits.
These unique file identifiers ensure that no two files in the system
have the same identifier.
When a request to create a new file is received, the flat file service
generates a new unique file identifier for it.
Directory Service:
The directory file service utilizes mappings between file names and
unique file identifiers.
It provides functions to create directories, add new file names to
directories, and retrieve unique file identifiers from directories.
The directory service plays a crucial role in organizing and
managing the file system structure.
By using the directory service, clients can easily locate and access
files based on their names or unique identifiers.
Client Module:
The client module runs on each client computer in the distributed
system.
It interacts with both the flat file service and the directory service to
access files.
The client module stores information about the network location of
the flat file server and directory server processes.
This module is essential for achieving satisfactory performance by
efficiently utilizing the file services and enabling seamless file
access for clients.
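The sketch below models the three components in Python: a flat file service keyed by unique file identifiers (UFIDs), a directory service that maps names to UFIDs, and client code that combines the two. Every class and method name here is invented for illustration; in a real system the two services would run as separate network servers reached via RPC.

```python
import uuid

class FlatFileService:
    """Stores file contents under unique file identifiers (UFIDs)."""
    def __init__(self):
        self._files = {}

    def create(self) -> str:
        ufid = uuid.uuid4().hex   # stands in for the long unique bit sequence
        self._files[ufid] = b""
        return ufid

    def write(self, ufid: str, data: bytes) -> None:
        self._files[ufid] = data

    def read(self, ufid: str) -> bytes:
        return self._files[ufid]

class DirectoryService:
    """Maps human-readable file names to UFIDs."""
    def __init__(self):
        self._names = {}

    def add_name(self, name: str, ufid: str) -> None:
        self._names[name] = ufid

    def lookup(self, name: str) -> str:
        return self._names[name]

# Client module: resolves a name via the directory service, then uses
# the UFID to reach the flat file service.
files, dirs = FlatFileService(), DirectoryService()
ufid = files.create()
dirs.add_name("notes.txt", ufid)
files.write(dirs.lookup("notes.txt"), b"hello")
print(files.read(dirs.lookup("notes.txt")))
```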
Introduction to Name Service:
Name Service is a system or mechanism that maps human-readable
names or identifiers to specific resources or entities in a distributed
computing environment.
It provides a way for users and applications to refer to resources
using logical names instead of physical addresses or locations.
Name Services are essential for simplifying resource access and
management by abstracting the complexity of network addresses and
locations.
They typically involve a naming convention to structure and assign
unique names to resources, ensuring consistency and uniqueness in
the naming scheme.
Name Services perform name resolution, which involves looking up
the logical name to obtain the corresponding network address or
location of the resource.
Examples of Name Services include Domain Name System (DNS)
for translating domain names to IP addresses, Network Information
Service (NIS) for centralized user and host information, and service
discovery mechanisms for locating network services within local
networks.
Name Service:
A name service stores bindings between textual names and attributes
for objects like computers, services, and users.
The main operation of a name service is to resolve names, allowing
users to find the associated objects or resources.
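A name service can be pictured as a table of bindings. In the toy Python sketch below, textual names map to attribute records and resolve performs the lookup; the names, types, and addresses shown are invented.

```python
# name -> attributes binding table (all entries hypothetical)
bindings = {
    "printer-3rd-floor": {"type": "printer", "address": "10.0.3.17:631"},
    "mail":              {"type": "service", "address": "10.0.0.5:25"},
}

def resolve(name: str) -> dict:
    # The core name-service operation: look up a textual name.
    return bindings[name]   # raises KeyError if the name is unbound

print(resolve("printer-3rd-floor")["address"])  # '10.0.3.17:631'
```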
Uniform Resource Identifiers (URIs) were developed to identify resources on the web, as well as other internet resources such as email addresses.
URIs provide a coherent and uniform syntax that encompasses
various types of resource identifiers (URI schemes).
The uniformity of URIs enables compatibility and ease of use across
different software and systems.
URIs facilitate the management of the global namespace of schemes,
allowing for the introduction of new identifier types and expanding
existing ones without disrupting existing usage.
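Python's standard urllib.parse module illustrates this uniform syntax: any URI breaks down into scheme, authority, path, query, and fragment. The URI below is a made-up example.

```python
from urllib.parse import urlparse

parts = urlparse("https://example.com:8080/docs/intro?lang=en#top")
print(parts.scheme)    # 'https'            -- the URI scheme
print(parts.netloc)    # 'example.com:8080' -- host and port
print(parts.path)      # '/docs/intro'
print(parts.query)     # 'lang=en'
print(parts.fragment)  # 'top'
```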
Introduction to Domain Name System (DNS):
The Domain Name System (DNS) is a decentralized hierarchical
naming system that translates human-readable domain names into IP
addresses and vice versa.
It is a fundamental component of the internet infrastructure and
plays a crucial role in enabling the navigation and communication
between devices and services.
DNS operates based on a distributed database that contains
mappings between domain names and their associated IP addresses
or other resource records.
Working Principle of DNS:
DNS follows a client-server model, where DNS clients (such as web
browsers) send queries to DNS servers to resolve domain names.
When a client needs to resolve a domain name, it sends a DNS query
to its configured DNS resolver, which could be a local DNS server
or the ISP's DNS server.
The resolver, if necessary, recursively queries other DNS servers
until it obtains the final answer or the IP address associated with the
domain name.
The resolved IP address is then returned to the client, allowing it to
establish a connection with the intended server or resource.
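In practice, applications rarely speak the DNS protocol themselves; they ask a resolver through the operating system. The Python snippet below uses the standard socket module to resolve a name (it needs network access to run, and example.com is simply a well-known placeholder domain).

```python
import socket

# Forward lookup: domain name -> a single IPv4 address.
print(socket.gethostbyname("example.com"))

# getaddrinfo returns every address record for the name, which is how
# a client sees multiple IPs behind one name (useful for load balancing).
for family, _, _, _, sockaddr in socket.getaddrinfo(
        "example.com", 80, proto=socket.IPPROTO_TCP):
    print(family, sockaddr)
```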
Merits of DNS:
Human-Readable Names: DNS provides a human-friendly way to
access internet resources by associating them with easily
recognizable domain names.
Scalability: DNS is highly scalable, allowing it to handle a vast
number of domain name resolutions efficiently.
Redundancy and Fault Tolerance: DNS uses a distributed
architecture with multiple servers, providing redundancy and
ensuring high availability even in the event of server failures.
Caching: DNS resolvers implement caching mechanisms to store
resolved domain names and their corresponding IP addresses,
improving performance and reducing the load on DNS servers.
Demerits of DNS:
Single Point of Failure: Although DNS is designed with redundancy, failures of critical components, such as a domain's authoritative name servers or its DNS provider, can still cause service disruptions and make resources unreachable.
DNS Spoofing and Attacks: DNS is susceptible to various types of
attacks, such as DNS spoofing and cache poisoning, which can
manipulate or redirect DNS responses, leading to security
vulnerabilities.
Propagation Delays: When changes are made to DNS records, such
as updating IP addresses, it can take time for these changes to
propagate across DNS servers globally, resulting in potential delays
in resolution.
Applications of DNS:
Website Access: DNS is primarily used to translate domain names
(e.g., example.com) into IP addresses, allowing users to access
websites and web services.
Email Routing: DNS is utilized for resolving mail exchanger (MX)
records, which determine the email servers responsible for accepting
incoming email for a domain.
Load Balancing: DNS can be used to distribute incoming traffic
across multiple servers by associating a single domain name with
multiple IP addresses.
Service Discovery: DNS-based service discovery protocols like
DNS Service Discovery (DNS-SD) enable devices and applications
to locate services within local networks.
Importance and Applications of DNS:
Human-Readable Names: DNS allows users to access internet
resources using easy-to-remember domain names instead of numeric
IP addresses.
IP Address Resolution: DNS resolves domain names to their
corresponding IP addresses, enabling devices to locate and connect
to the appropriate servers or services.
Website Access: DNS is essential for accessing websites and web
services by translating domain names into IP addresses. Users can
enter a domain name in their web browser, and DNS resolves it to
the correct IP address.
Email Routing: DNS is used to determine the mail servers
responsible for accepting incoming email for a particular domain. It
resolves mail exchanger (MX) records to ensure proper email
delivery.
Load Balancing: DNS can be utilized to distribute incoming traffic
across multiple servers or data centers. By associating a domain
name with multiple IP addresses, DNS can balance the load and
improve the performance and availability of services.
Redundancy and Failover: DNS enables the setup of redundant
systems by associating multiple IP addresses with a domain name. If
one server or IP address becomes unavailable, DNS can redirect the
traffic to another available IP address.
Caching: DNS resolvers implement caching mechanisms to store
resolved DNS records. This improves performance by reducing the
need for repeated DNS queries and reducing the load on DNS
servers.
Service Discovery: DNS-based service discovery protocols, such as
DNS Service Discovery (DNS-SD), allow devices and applications
to discover and locate services available on a network. This
simplifies the process of finding and connecting to network
resources.
Dynamic IP Address Assignment: DNS supports dynamic IP address
assignment, allowing devices with changing IP addresses, such as
DHCP clients, to update their corresponding DNS records
dynamically.
Infrastructure Management: DNS plays a crucial role in managing
and organizing the internet's infrastructure by maintaining the
hierarchical and distributed structure of domain names and IP
addresses.
Introduction to Google File System (GFS):
Google File System (GFS) is a distributed file system developed by
Google to store and manage large amounts of data across multiple
servers.
It is designed to handle the challenges of scalability, reliability, and
performance required by Google's massive computing infrastructure.
GFS operates on a master-worker architecture, where a single master
node coordinates and manages multiple worker nodes that store the
data.
Working Principle of GFS:
Data Splitting: GFS divides files into fixed-size chunks and
distributes them across multiple worker nodes. Each chunk is
replicated for fault tolerance.
Metadata Management: The master node maintains the metadata for
the file system, including file and chunk metadata, namespace
hierarchy, and chunk locations.
Chunk Distribution: The master assigns chunks to worker nodes and
keeps track of their locations. It ensures data is distributed evenly
across the cluster.
Data Access: Clients access files through the GFS client library,
which communicates with the master node to obtain the metadata
and locate the chunks needed to read or write data.
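The arithmetic behind chunk lookup is straightforward: with fixed 64 MB chunks, a byte offset maps to a chunk index by integer division. The Python sketch below translates a read request into per-chunk requests that a client could send to the master; the function name and tuple layout are invented for illustration.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # GFS uses fixed 64 MB chunks

def chunk_requests(file_offset: int, length: int):
    """Split a byte range into (chunk index, offset in chunk, bytes) triples."""
    requests = []
    pos = file_offset
    while pos < file_offset + length:
        index = pos // CHUNK_SIZE            # which chunk holds this byte
        offset_in_chunk = pos % CHUNK_SIZE   # where the read starts inside it
        take = min(CHUNK_SIZE - offset_in_chunk,
                   file_offset + length - pos)
        requests.append((index, offset_in_chunk, take))
        pos += take
    return requests

# A 10 MB read starting at 60 MB straddles the first chunk boundary:
print(chunk_requests(60 * 1024 * 1024, 10 * 1024 * 1024))
# [(0, 62914560, 4194304), (1, 0, 6291456)]
```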
Merits of GFS:
Scalability: GFS is designed to handle massive amounts of data,
supporting petabyte-scale storage and processing across thousands
of servers.
Fault Tolerance: By replicating data chunks, GFS ensures high
availability and data durability even in the presence of hardware
failures or node crashes.
High Throughput: GFS is optimized for sequential access patterns,
making it suitable for applications that require large data transfers,
such as data analytics and batch processing.
Simplified Management: GFS provides a simplified interface for file
system operations, abstracting away the complexities of data
distribution and replication.
Demerits of GFS:
Latency for Small Files: GFS is optimized for handling large files
and sequential access, which may result in increased latency for
small file operations due to the overhead of chunk management.
Limited Metadata Operations: GFS sacrifices some flexibility in
metadata operations in favor of scalability and performance. It may
not be suitable for applications that require frequent and complex
metadata operations.
Importance and Applications of GFS:
Big Data Processing: GFS plays a critical role in supporting
Google's big data processing frameworks like MapReduce, enabling
efficient storage and processing of large-scale datasets.
Web Indexing: GFS is used to store and manage the vast amounts of
web pages indexed by Google's search engine, allowing for quick
and reliable access to indexed content.
Data Analytics: GFS serves as the underlying storage system for
Google's data analytics platforms, enabling efficient processing of
large datasets for insights and decision-making.
Log Processing: GFS is utilized for log storage and analysis,
enabling real-time monitoring, troubleshooting, and performance
analysis of Google's various services and systems.
Content Delivery: GFS is employed in Google's content delivery
networks (CDNs) to store and distribute large media files, ensuring
fast and reliable content delivery to users worldwide.
Comparison of Different Distributed File Systems:
Google File System (GFS):
• Key Features: large-scale storage, replication.
• Scalability: highly scalable.
• Fault Tolerance: high fault tolerance.
• Performance: high throughput, optimized for sequential access.
• Use Cases: big data processing, web indexing, data analytics.
Hadoop Distributed File System (HDFS):
• Key Features: fault-tolerant, replication, data locality.
• Scalability: highly scalable.
• Fault Tolerance: high fault tolerance.
• Performance: good throughput, optimized for batch processing.
• Use Cases: big data analytics, Hadoop ecosystem.
Apache HBase:
• Key Features: distributed, column-oriented database.
• Scalability: highly scalable.
• Fault Tolerance: high fault tolerance.
• Performance: high read and write performance.
• Use Cases: real-time data processing, NoSQL.
Lustre:
• Key Features: high-performance, parallel I/O.
• Scalability: highly scalable.
• Fault Tolerance: high fault tolerance.
• Performance: high bandwidth, low latency.
• Use Cases: scientific computing, high-performance computing.
Ceph:
• Key Features: distributed, object-based storage.
• Scalability: highly scalable.
• Fault Tolerance: high fault tolerance.
• Performance: good performance, flexible deployment.
• Use Cases: cloud storage, virtualization environments.
Amazon S3:
• Key Features: scalable, object-based storage.
• Scalability: highly scalable.
• Fault Tolerance: high fault tolerance.
• Performance: good performance, pay-as-you-go model.
• Use Cases: cloud storage, backup and archival.
Introduction to CORBA:
CORBA stands for Common Object Request Broker Architecture.
It is a middleware technology that enables communication and
interaction between distributed objects in a networked environment.
CORBA provides a standard interface definition language (IDL) to
define the structure and behavior of objects.
It uses a request/response model, where clients send requests to objects, and the objects perform the requested operations and return the results.
CORBA supports multiple programming languages, allowing
objects implemented in different languages to communicate
seamlessly.
It provides location transparency, allowing objects to be located on
different machines and platforms while still being accessible.
CORBA uses an Object Request Broker (ORB) to handle
communication and manage the interaction between objects.
It has been widely used in enterprise applications to integrate diverse
systems and enable interoperability.
Introduction to Mach:
Mach is an operating system kernel developed at Carnegie Mellon
University.
It provides a microkernel architecture, which separates the kernel
into small, modular components.
Mach emphasizes message passing as the primary means of
communication between components.
It supports a range of features like virtual memory management, task
scheduling, inter-process communication, and file system support.
Mach is known for its performance and scalability, making it
suitable for both small embedded systems and large-scale distributed
environments.
It has influenced the development of other operating systems, such
as macOS, where Mach served as the foundation for its kernel.
Mach provides a flexible and extensible design, allowing developers
to customize and add new features to the kernel.
While Mach has been influential, it is not as widely used as some
other operating systems, and its development has slowed down in
recent years.
Introduction to Jini:
Jini is a network technology developed by Sun Microsystems (now
Oracle) in the late 1990s.
It aims to simplify the connection and integration of devices and
services in a network, forming a dynamic and self-configuring
distributed system.
Jini is based on Java and uses Java Remote Method Invocation
(RMI) for communication between networked devices.
It provides a service-oriented architecture, where devices advertise
their capabilities as services and other devices can dynamically
discover and use those services.
Jini includes features like automatic discovery, join protocols, and
leasing mechanisms to handle dynamic changes in the network
environment.
It enables plug-and-play functionality, allowing new devices to
seamlessly join the network and make their services available.
Jini has been used in various applications, including home
automation, smart environments, and distributed computing.
While Jini had promising concepts, its adoption has been limited,
and its development and support have declined over the years.