DC Module 1
https://docs.google.com/document/d/13PIt2MvHGv2wOMVqVxkydRHUC
yXw60bpNSmHL9b7VVk/edit?usp=sharing
Course Outcomes
On completion of the course, students should be able to:
1) Demonstrate knowledge of the basic elements and concepts related to
distributed system technologies.
2) Illustrate the middleware technologies that support distributed applications
such as RPC, RMI and Object based middleware.
3) Analyze the various techniques used for clock synchronization and mutual
exclusion
4) Demonstrate the concepts of Resource and Process management and
synchronization algorithms
5) Demonstrate the concepts of Consistency and Replication Management
6) Apply the knowledge of Distributed File System to analyze various file
systems like NFS, AFS and the experience in building large-scale distributed
applications
Why distributed computing?
● Economics: Distributed systems allow the pooling of
resources, including CPU cycles, data storage,
input/output devices, and services
● Reliability: Distributed systems allow replication of
resources and/or services, thus reducing service
outage due to failures
● Universality: The Internet has become a universal
platform for distributed computing, i.e., it is available
everywhere with substantially identical standards
The strengths and weaknesses of distributed
computing
➢ In any form of computing, there is always a trade-off between
advantages and disadvantages
➢ Some of the reasons for the popularity of distributed
computing :
● The affordability of computers and availability of network
access
● Resource sharing
● Scalability
● Fault tolerance
The strengths and weaknesses of
distributed computing
➢ The disadvantages of distributed computing:
● Multiple Points of Failures: the failure of one or more
participating computers, or one or more network
links, can spell trouble.
● Security Concerns: In a distributed system, there are
more opportunities for unauthorized attack.
● Programming Difficulty: Complex APIs, many issues
that have to be handled at the same time
A distributed system
● A simplified view
[Figure labels: processor, process, thread, communication channel, communication medium; node: processor/process]
How are distributed systems built?
A distributed system
● Distributed computing refers to a system where processing
and data storage are distributed across multiple devices or
systems, rather than being handled by a single central
device.
● A distributed system is a collection of independent components
located on different machines that exchange messages with each
other in order to achieve common goals.
● A set of computing nodes that cooperate in order to achieve a
well-defined goal
● Nodes cooperate through communication
● Communication is by message passing at the fundamental
level
A distributed system
● A distributed system is one or more applications running on
a collection of independent computers that appears to its
users as a single coherent system
What is a distributed system?
● Hardware is distributed
○ n processing elements (processor + memory), PE
○ Interconnected by some network
○ No shared memory
● Software is distributed
○ No centralized OS, each PE has its own copy of OS
○ No physically centralized file system
○ Means for inter-process communication is message passing at the lowest
level
Why distributed systems?
● Information exchange (collaborative work)
● Resource sharing (e.g. printer, backup storage,
disk units, etc.)
● Resource sharing (applications, information,
media, services)
● Cost reduction
● Increase of availability (partial failure)
● Increase of performance through parallelism, ...
Main characteristics
● No shared memory between nodes
○ Each node has its memory
○ Communication by message passing
● No global clock
○ Each node has its own clock
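The two characteristics above can be illustrated with a minimal Python sketch. It is only an in-process simulation: the `Node` class, its `inbox` queue and its `local_clock` counter are hypothetical names chosen for this example, and threads stand in for separate machines. Each node keeps private memory and its own event counter, and the only way to influence another node is to send it a message.

```python
import queue
import threading


class Node(threading.Thread):
    """A node with private memory; other nodes can only reach it via its inbox."""

    def __init__(self, name):
        super().__init__(name=name)
        self.inbox = queue.Queue()  # message channel into this node
        self.memory = {}            # private state, never touched by other nodes
        self.local_clock = 0        # each node counts its own events (no global clock)

    def send(self, other, key, value):
        # Sending is a local event; the data travels as a message.
        self.local_clock += 1
        other.inbox.put((self.name, key, value))

    def run(self):
        # Handle messages until a None sentinel arrives.
        while True:
            message = self.inbox.get()
            if message is None:
                break
            sender, key, value = message
            self.local_clock += 1
            self.memory[key] = (value, sender)


a = Node("A")
b = Node("B")
b.start()
a.send(b, "x", 42)   # A cannot write b.memory directly; it sends a message
b.inbox.put(None)    # ask B to stop
b.join()
print(b.memory, a.local_clock, b.local_clock)
```

Note that the two counters advance independently; nothing in the sketch keeps them in agreement.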
● Examples
○ Network of workstations with a pool of resources: a single file
system that can be accessed in the same way from all
machines
○ Sales department
○ WWW
Middleware view
● A distributed system is
often organized as a
layer on the top of local
operating systems
Goals of Distributed System
1. Boosting Performance
● A distributed system makes things faster by dividing a
bigger task into small chunks and processing them
simultaneously on different computers.
● It’s just like a group of people working together on a project.
● For example, when we search for anything on the internet, the
search engine distributes the work among several servers, then
retrieves the results and displays the webpage in a few seconds.
2. Enhancing Reliability
● A distributed system ensures reliability by minimizing the impact of
an individual computer failure.
● If one computer fails, the other computers keep
the system running smoothly.
● For example, when we search for something on social media and one server
has a problem, we can still access photos and posts
because the system quickly switches to another server.
Goals of Distributed System
3. Scaling for the Future
● Distributed systems are experts at handling increased
demands.
● They manage the demands by incorporating more and
more computers into the system.
● This way they keep everything running smoothly and can handle
more users.
4. Resource Utilization
● Instead of putting the whole load on one computer, they distribute
the task among the other available resources.
● This ensures that the work is done using every available
resource.
5. Consistency and Transparency
● Distributed systems also provide a seamless experience
to the user. They make sure that when you interact with the
system, it feels like working with a single entity, even with
many computers working behind the scenes.
6. Security and Data Integrity
● Distributed systems use special mechanisms such as locks to protect
data from others. They use well-established techniques for
encryption and authentication to keep information safe
from unauthorized access. Distributed systems prioritize
data security, much as you keep your own secrets safe.
7. Load Balancing
● Distributed systems ensure good resource
utilization and allow the system to handle a high volume of
data without slowing down. This is achieved by load
balancing, which distributes the load evenly across all the
computers available, thus preventing single-machine
overload and bottlenecks.
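As an illustration of the load-balancing goal, here is a minimal Python sketch of one simple policy, round-robin assignment. The class name `RoundRobinBalancer` and the server names are assumptions made for this example; real balancers usually also consider server health and current load.

```python
from collections import Counter
from itertools import cycle


class RoundRobinBalancer:
    """Hands each incoming request to the next server in a fixed rotation."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._rotation = cycle(servers)

    def assign(self, request):
        # Pick the next server in the rotation for this request.
        return next(self._rotation), request


balancer = RoundRobinBalancer(["server-1", "server-2", "server-3"])
assignments = [balancer.assign(f"req-{i}") for i in range(9)]

# Count how many requests each (hypothetical) server received.
load = Counter(server for server, _ in assignments)
print(load)
```

With nine requests and three servers, each server receives three requests, so no single machine is overloaded.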
Goals of a distributed system
● Transparency
○ Hide the fact that processes and resources are physically
distributed
● Scalability
○ Distributed systems should be easy to expand
● Availability
○ Distributed systems should be continuously available
● Openness
○ Adding new users/components into the system
○ Adding new functionality, incrementally and independently by
independent developer teams
Transparency
● Ideally a distributed application (system) should
look like a conventional centralized system, with
no distinction between local and remote
resources
● This is the user view
● The developer view is different
○ Network aware, knows the cost of distribution of programming entities
(e.g. objects)
○ Have means to control the distribution behavior
Transparency
● Access Transparency
○ Hide differences in data representation and how a
resource is accessed
○ Hides heterogeneity of underlying nodes
● Location Transparency
○ Hide where a resource/service is located
● Migration Transparency
○ Hides that resources/services may be moved to another
location without affecting how they are accessed
Transparency
● Relocation Transparency
○ Hides that a resource may be moved to another
location while in use
● Failure Transparency
○ Hide the failure and recovery of a resource
● Concurrency Transparency
○ Hides that a resource may be shared by a
number of competing users/processes
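Location and migration transparency can be sketched with a small name service. This is a hedged illustration only: `NameService`, `read_resource` and the addresses are hypothetical and are not part of any particular system. The client uses a logical name and never sees where the resource lives, so moving the resource does not change the client code.

```python
class NameService:
    """Maps a logical resource name to its current (hypothetical) address."""

    def __init__(self):
        self._locations = {}

    def register(self, name, address):
        self._locations[name] = address

    def resolve(self, name):
        return self._locations[name]


def read_resource(names, name):
    # The caller only knows the logical name, never the address.
    address = names.resolve(name)
    return f"read {name} from {address}"


names = NameService()
names.register("reports/q1", "node-a:9000")
first = read_resource(names, "reports/q1")

# The resource migrates to another node; the client call is unchanged.
names.register("reports/q1", "node-b:9000")
second = read_resource(names, "reports/q1")

print(first)
print(second)
```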
Scalability
● Size
○ Add more users and resources/components
● Distance
○ Cope with geographically separate resources and users
● Management
○ Spanning over independent administrative organizations
○ Local management
1. Client/Server Systems:
● The client-server system is the most basic communication
method: the client sends input to the server and
the server replies to the client with an output.
● The client requests a resource or a task from the server;
the server allocates the resource or performs the
task and sends the result as a response to the
client's request.
● A client-server system can also be built with multiple
servers.
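The request/response exchange described above can be sketched in Python. To keep the example self-contained it uses `socket.socketpair()`, which gives two connected sockets inside one process, rather than a real network connection; the "task" (upper-casing the text) is a hypothetical stand-in for whatever the server does.

```python
import socket
import threading


def serve(conn):
    """Server side: read one request, perform the task, send the response."""
    with conn:
        request_text = conn.recv(1024).decode()
        response = request_text.upper()  # the task itself is a placeholder
        conn.sendall(response.encode())


def request(conn, text):
    """Client side: send a request and wait for the server's reply."""
    with conn:
        conn.sendall(text.encode())
        return conn.recv(1024).decode()


client_end, server_end = socket.socketpair()
server = threading.Thread(target=serve, args=(server_end,))
server.start()
reply = request(client_end, "hello server")
server.join()
print(reply)  # HELLO SERVER
```

In a real deployment the server would listen on an address and accept many clients; the roles and the message flow stay the same.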
How does the browser interact with the servers?
A client follows a few steps to interact with the servers.
2. Peer-to-Peer Systems:
● The peer-to-peer communication model is a
decentralized model in which each node acts as both
client and server.
● Nodes are the basic building blocks of the system.
● Each node performs its task in its local memory
and shares data through the supporting medium; a
node can work as a server or as a client for the system.
● Programs in a peer-to-peer system communicate at
the same level, without any hierarchy.
Types of P2P networks
● Unstructured P2P networks: In this type of P2P network,
each device is able to make an equal contribution. This
network is easy to build as devices can be connected
randomly in the network. But being unstructured, it becomes
difficult to find content. An example is Gnutella.
● Structured P2P networks: These are designed using software that
creates a virtual layer in order to put the nodes in a specific
structure. They are not easy to set up but give users easy
access to the content. Examples are P-Grid and
Kademlia.
● Hybrid P2P networks: These combine the features of both P2P
networks and client-server architecture. An example is a
network that uses a central server to find a node, as Napster did.
Features of P2P network
How Does P2P Network Work?
● If the peer-to-peer software is not already installed, the
user first has to install it on their
computer.
● This creates a virtual network of peer-to-peer application
users.
● The user then downloads a file, which is received in pieces
that come from multiple computers in the network that
already have that file.
● Data is also sent from the user’s computer to other
computers in the network that ask for data that exists on
the user’s computer.
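The steps above can be sketched as a small in-process simulation. The `Peer` class, the peer names and the piece contents are hypothetical; a real P2P client would add peer discovery, networking and integrity checks. The point is that one peer assembles a file from pieces held by several others, and afterwards can serve those pieces itself.

```python
class Peer:
    """A peer that can both serve pieces (server role) and fetch them (client role)."""

    def __init__(self, name, pieces=None):
        self.name = name
        self.pieces = dict(pieces or {})  # piece index -> bytes

    def serve_piece(self, index):
        # Server role: hand out a piece if we hold it, else None.
        return self.pieces.get(index)

    def download(self, total_pieces, peers):
        # Client role: fetch each missing piece from the first peer that has it.
        sources = {}
        for index in range(total_pieces):
            if index in self.pieces:
                continue
            for peer in peers:
                piece = peer.serve_piece(index)
                if piece is not None:
                    self.pieces[index] = piece
                    sources[index] = peer.name
                    break
        return sources

    def assemble(self, total_pieces):
        # Join the pieces in order; assumes every piece was found.
        return b"".join(self.pieces[i] for i in range(total_pieces))


alice = Peer("alice", {0: b"dist", 2: b"sys"})
bob = Peer("bob", {1: b"ributed ", 3: b"tems"})
carol = Peer("carol")

sources = carol.download(4, [alice, bob])
content = carol.assemble(4)
print(sources)
print(content)
```

After the download, `carol.serve_piece(1)` returns a piece, so carol now also acts as a source for other peers.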
Applications of P2P Network
● File sharing: A P2P network is a convenient, cost-efficient method of file
sharing for businesses. With this type of network there is no need for
intermediate servers to transfer the file.
● Blockchain: The P2P architecture is based on the concept of decentralization.
In a blockchain, the peer-to-peer network lets every node maintain
a complete replica of the records, which helps ensure the accuracy of the
data. The peer-to-peer network also contributes to security.
● Direct messaging: A P2P network can provide a secure, quick, and efficient way to
communicate, using encryption at both peers
and easy-to-use messaging tools.
● Collaboration: Easy file sharing also helps peers
in the network collaborate.
● File sharing networks: Many P2P file sharing networks, such as G2 and eDonkey,
have popularized peer-to-peer technologies.
● Content distribution: In a P2P network, unlike the client-server system,
clients can both provide and use resources. Thus, the content serving capacity
of a P2P network can actually increase as more users begin to access the
content.
● IP telephony: Early versions of Skype are a good example of a P2P application in VoIP.
Advantages of P2P Network
● Easy to maintain: The network is easy to maintain because
each node is independent of the other.
● Less costly: Since each node acts as a server, therefore the
cost of the central server is saved. Thus, there is no need to
buy an expensive server.
● No network manager: In a P2P network, each user
manages their own computer, so there is no need for a
network manager.
● Adding nodes is easy: Adding, deleting, and repairing nodes
in this network is easy.
● Less network traffic: In a P2P network, there is less network
traffic than in a client/ server network.
Disadvantages of P2P Network
● Data is vulnerable: Because of no central server, data is
always vulnerable to getting lost because of no backup.
● Less secure: It becomes difficult to secure the complete
network because each node is independent.
● Slow performance: In a P2P network, each computer is
accessed by other computers in the network, which slows down
performance for the user.
● Files hard to locate: In a P2P network, the files are not
centrally stored, rather they are stored on individual computers
which makes it difficult to locate the files.
3. Middleware
Advantages of Middleware in Distributed Systems:
● Middleware is an intermediate layer of software that sits
between the application and the network. It is used in distributed
systems to provide common services, such as authentication,
authorization, compilation for best performance on particular
architectures, input/output translation, and error handling.
● Middleware can be modularized from the application so it has
better potential for reuse with other applications running on
different platforms.
● Application developers can design Middleware so it’s sufficiently
high-level that it becomes independent of specific hardware
environments or operating system platforms which simplifies
porting applications developed on one type of platform onto
another without rewriting code or without resorting to inefficient
and expensive binary compatibility toolsets such as
cross-compilers.
4. Three-tier
Presentation Tier
● It is the user interface and topmost tier in the architecture.
● Its purpose is to take requests from the client and display
information to the client.
● It communicates with other tiers through a web browser, as it
renders its output in the browser.
● Web-based presentation tiers are developed
using languages like HTML, CSS, and JavaScript.
Application Tier
● It is the middle tier of the architecture also known as the
logic tier as the information/request gathered through the
presentation tier is processed in detail here.
● It also interacts with the server that stores the data.
● It processes the client’s request, formats it, and sends it back
to the client.
● It is developed using languages like Python, Java, PHP, etc.
Data Tier
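The separation between the three tiers can be sketched in a few lines of Python. Everything here is a hypothetical stand-in: a dictionary plays the role of the data tier's database, and the product records, tax rate and function names are invented for the example. What matters is that each tier only talks to the tier directly below it.

```python
# Data tier: an in-memory stand-in for a database (hypothetical records).
_PRODUCTS = {
    1: {"name": "keyboard", "price": 25.0},
    2: {"name": "monitor", "price": 140.0},
}


def fetch_product(product_id):
    return _PRODUCTS.get(product_id)


# Application (logic) tier: processes the request using the data tier.
def price_with_tax(product_id, tax_rate=0.10):
    product = fetch_product(product_id)
    if product is None:
        return None
    total = round(product["price"] * (1 + tax_rate), 2)
    return {"name": product["name"], "total": total}


# Presentation tier: turns the result into what the user sees.
def render(product_id):
    result = price_with_tax(product_id)
    if result is None:
        return "Product not found"
    return f"{result['name']}: {result['total']:.2f}"


print(render(1))
print(render(99))
```

Because the presentation tier never touches `_PRODUCTS` directly, the storage could be replaced without changing `render`.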
5. N-tier:
● N-tier is also called a multitier distributed system.
● An N-tier system can contain any number of functions in the
network.
● N-tier systems have a structure similar to three-tier
architecture. They are used when an application needs to send requests to another
application to perform a task or to provide a service.
● N-tier is commonly used in web applications and data systems.
Types of distributed systems
● Physical models
● Architectural models
● Fundamental models
Physical models
● Models that capture the hardware composition of a system
in terms of computers and their interconnecting networks.
● A physical model is basically a representation of the
underlying hardware elements of a distributed system.
● It is primarily used to design, manage, implement and
determine the performance of a distributed system.
1) Nodes
2) Links
3) Middleware
4) Network Topology
5) Communication Protocols
Components of Physical models
1) Nodes – Nodes are the end devices that have the ability of
processing data, executing tasks and communicating with the
other nodes.
● These end devices are generally the computers at the user
end or can be servers, workstations etc.
● Nodes provision the distributed system with an interface in
the presentation layer that enables the user to interact with
other back-end devices, or nodes, that can be used for
storage and database services, or processing, web browsing
etc.
● Each node has an Operating System, execution environment
and different middleware requirements that facilitate
communication and other vital tasks.
Components of Physical models
2) Links – Links are the communication channels between
different nodes and intermediate devices.
● These may be wired or wireless. Wired links or physical
media are implemented using copper wires, fibre optic cables
etc.
● The choice of the medium depends on the environmental
conditions and the requirements. Generally, physical links are
required for high performance and real-time computing.
Different connection types that can be implemented are as
follows:
● Point-to-point links – It establishes a connection and allows
data transfer between only two nodes.
● Broadcast links – It enables a single node to transmit data
to multiple nodes simultaneously.
● Multi-access links – Multiple nodes share the same
communication channel to transfer data. This requires protocols
to coordinate access to the shared channel.
Components of Physical models
3) Middleware – This is the software installed and executed
on the nodes.
● By running middleware on each node, the distributed
computing system achieves decentralised control and
decision-making.
● It handles various tasks like communication with other nodes,
resource management, fault tolerance, synchronisation
of different nodes and security to prevent malicious and
unauthorised access.
Components of Physical models
4) Network Topology – This defines the arrangement of nodes
and links in the distributed computing system.
● The most common network topologies that are implemented
are bus, star, mesh, ring or hybrid.
● The topology is chosen based on the exact use
cases and the requirements.
5) Communication Protocols – Communication protocols are
the set of rules and procedures for transmitting data over the
links.
● Examples of these protocols include TCP, UDP, HTTPS,
MQTT etc.
● These allow the nodes to communicate and interpret the
data.
Architectural Model
● It is the overall design and structure of the system, and
how its different components are organised to interact
with each other and provide the desired functionalities.
● It is an overview of the system, describing how development,
deployment and operations will take place.
● A good architectural model is required for
cost efficiency and improved scalability of the
applications.
● The key aspects of architectural model are –
● Client-Server model
● P2P
● Layered Models
Fundamental model
Middleware: Models of Middleware,
Services offered by middleware
● Middleware acts as a bridge between different software
applications, services, and even devices that may be written
in different computer languages and exist in different formats.
● The software layer allows them to understand each other’s
data and requests, enabling them to work together smoothly.
● For example, middleware can connect different types of
database systems, web servers and applications, ERP
systems, CRM systems, cloud services, mobile applications,
IoT devices, BI tools, and more.
● Let’s say you’re using a travel website to book a flight.
The website needs to interact with various airlines’
systems to check flight availability and prices, and
make reservations.
● Middleware can translate your request into a format
that each airline’s system understands, send the
request to the airlines, receive and collect their
responses, and translate them back into a single format
which you can view on the travel website.
● This information is displayed in a user-friendly way, so
you can easily compare the prices in one place. When
you choose a flight, your booking details are then sent
back through the middleware to the respective airline’s
system to complete the reservation.
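The travel-booking scenario can be sketched as a small adapter layer. All names below are hypothetical: the two "airline" functions imitate back-end systems with different request and response formats, and `TravelMiddleware` translates between them and a single format for the website. It is a sketch of the idea, not of any real airline interface.

```python
def airline_a_search(query):
    # Hypothetical airline A: takes a dict, returns prices in cents.
    if query == {"from": "BOM", "to": "DEL"}:
        return [{"flight": "A100", "cents": 12500}]
    return []


def airline_b_search(origin, destination):
    # Hypothetical airline B: takes positional arguments, returns (code, price) tuples.
    if (origin, destination) == ("BOM", "DEL"):
        return [("B7", 119.0)]
    return []


class TravelMiddleware:
    """Translates one request into each airline's format and merges the replies."""

    def search(self, origin, destination):
        offers = []
        for item in airline_a_search({"from": origin, "to": destination}):
            offers.append({"airline": "A", "flight": item["flight"],
                           "price": item["cents"] / 100})
        for code, price in airline_b_search(origin, destination):
            offers.append({"airline": "B", "flight": code, "price": price})
        # Return a single, uniform list so prices can be compared in one place.
        return sorted(offers, key=lambda offer: offer["price"])


offers = TravelMiddleware().search("BOM", "DEL")
for offer in offers:
    print(offer)
```

The website code only ever sees the uniform offer dictionaries; adding a third airline would mean adding one more translation step inside the middleware.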
1) Message Oriented Middleware
● It is designed for the purpose of transporting messages
between two or more applications and is best suited for
distributed applications that require transaction-oriented
messaging.
● It could be used to monitor network traffic flows or to
monitor the health of a distributed system.
● It enables applications to be dispersed over various
platforms and simplifies the process of creating software
applications spanning many operating systems and network
protocols.
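A minimal sketch of the message-oriented style, assuming an in-process `MessageBroker` class invented for this example (real products add persistence, delivery guarantees and network transport). The sender places messages on a named queue and returns immediately; the receiver picks them up later, so the two never call each other directly.

```python
import queue
import threading


class MessageBroker:
    """Holds named queues so senders and receivers are decoupled."""

    def __init__(self):
        self._queues = {}
        self._lock = threading.Lock()

    def _queue(self, name):
        # Create the queue on first use; the lock keeps this thread-safe.
        with self._lock:
            if name not in self._queues:
                self._queues[name] = queue.Queue()
            return self._queues[name]

    def send(self, name, message):
        self._queue(name).put(message)

    def receive(self, name, timeout=1.0):
        return self._queue(name).get(timeout=timeout)


broker = MessageBroker()
processed = []

# The sender runs first; no receiver needs to be running yet.
broker.send("orders", {"id": 1})
broker.send("orders", {"id": 2})


def billing_service():
    # Hypothetical consumer that handles two queued orders.
    for _ in range(2):
        order = broker.receive("orders")
        processed.append(f"billed {order['id']}")


worker = threading.Thread(target=billing_service)
worker.start()
worker.join()
print(processed)
```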
2) Object Middleware
● Also called an object request broker (ORB), it enables applications
to send objects and request services via an
object-oriented system.
● It is runtime software that enables objects (components) to
work cooperatively with a container program or another
object, even if the software is distributed across multiple
computers.
● In short, it manages the communication between objects.
3) Remote Procedure Call
● It is a software communication protocol that one program
can use to request a service from a program located in
another computer on a network without having to
understand the network's details.
● RPC is used to call procedures on remote systems
as if they were local.
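The idea of calling a remote procedure as if it were local can be sketched with a client stub. This is an illustration under simplifying assumptions: `RpcStub`, `handle_request` and the `add` procedure are hypothetical, JSON is used for marshalling, and the "network" is just a direct function call so the example runs in one process.

```python
import json


# --- Remote side: procedures the server exposes (hypothetical) ---
def add(a, b):
    return a + b


PROCEDURES = {"add": add}


def handle_request(raw):
    """Server skeleton: unmarshal the call, run it, marshal the result."""
    call = json.loads(raw)
    result = PROCEDURES[call["method"]](*call["params"])
    return json.dumps({"result": result})


# --- Local side: a stub that makes the remote call look local ---
class RpcStub:
    def __init__(self, transport):
        # transport: any callable that carries the request text to the server
        self._transport = transport

    def __getattr__(self, method):
        def remote_call(*params):
            raw = json.dumps({"method": method, "params": params})
            reply = self._transport(raw)
            return json.loads(reply)["result"]
        return remote_call


remote = RpcStub(transport=handle_request)
total = remote.add(2, 3)  # reads like a local call
print(total)  # 5
```

The caller never builds a message or parses a reply; the stub hides those network details, which is the property the definition above describes.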
4) Database middleware
● It allows for direct database access, meaning it's possible
to directly interact with your database from your application.
● Mongoose is an example of database middleware that
includes query, aggregate, model, and document
middleware.
5) Transaction middleware
● It includes applications like transaction processing
monitors and web application servers and is becoming
increasingly popular.
6) Embedded Middleware
● It provides communication services and software/firmware
integration interface that operates between embedded
applications, the embedded operating system, and
external applications.
Services offered by middleware
● Security authentication
● Transaction management
● Message queues
● Application servers
● Web servers and directories
● Middleware can also be used for distributed processing with
actions occurring in real time rather than sending data back
and forth.
To avoid the problems of centralization
• Distributed algorithms
• No process has complete information of the system
• Process decisions are based on local information
• Failure of one process does not damage the whole
system
• No assumptions about exactly synchronized clocks (no
global clock)
Scalability problems (distance)
● Long communication delays
● Programming techniques for Local Area Networks (LANs) do not
really work for Wide Area Networks (WANs)
○ Synchronous communication like a Remote Procedure Call
(RPC) is not suitable
○ Asynchronous message passing is more appropriate
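The difference between the two styles can be sketched with Python's `asyncio`. The `remote_call` coroutine and its 0.05 second delay are hypothetical stand-ins for a request over a long-distance link. In the first function each reply is awaited before the next request is sent, so delays add up; in the second all requests are issued first and the delays overlap.

```python
import asyncio
import time


async def remote_call(name, delay):
    # Stand-in for a WAN request; the delay models communication latency.
    await asyncio.sleep(delay)
    return f"{name} done"


async def one_at_a_time(calls):
    # Synchronous style: wait for each reply before issuing the next request.
    return [await remote_call(name, delay) for name, delay in calls]


async def overlapped(calls):
    # Asynchronous style: issue all requests, then collect the replies.
    return await asyncio.gather(*(remote_call(n, d) for n, d in calls))


def timed(coro):
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start


calls = [("a", 0.05), ("b", 0.05), ("c", 0.05)]
sync_result, sync_time = timed(one_at_a_time(calls))
async_result, async_time = timed(overlapped(calls))
print(f"one at a time: {sync_time:.2f}s, overlapped: {async_time:.2f}s")
```

Both versions return the same replies; only the total waiting time differs, which is why asynchronous communication is one way of hiding latency.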
Scalability problems (distance)
● WAN has unreliable communication media
● Cannot exploit broadcast communication
○ Only point-to-point communication
● Locating a service on a WAN is more difficult than on a LAN
○ On a LAN, just broadcast a service identifier and wait for a response
Scalability problems (different administrative
organizations)
● Different and conflicting policies for
○ Resource usage
○ Management of the system
○ Security policies
■ WHO has access to WHAT resources
■ Can I trust a non-local system administrator?
Scaling Techniques
1. Hiding communication latencies
2. Distribution
3. Replication
Scaling techniques (1) Asynchronous communication
[Figures 1.4 and 1.5]
● Replication hides communication latency: the resource is available nearby
● The problem is consistency
Hardware Concepts
● Multiprocessors and multicomputers; homogeneous and heterogeneous systems
[Figures 1.6 and 1.7; annotation: less scalable]
Multiprocessors (2)
a) A crossbar switch
b) An omega switching network
Homogeneous Multicomputer Systems
a) Grid
b) Hypercube
Heterogeneous Multicomputer Systems
● Today’s Internet
● Distributed computer system (DAS)
○ Four clusters of multicomputers, interconnected through a wide-area ATM-switched
backbone
Software Concepts
● An overview of
• DOS (Distributed Operating Systems)
• NOS (Network Operating Systems)
• Middleware
Uniprocessor Operating Systems
Multicomputer Operating Systems (3)
Synchronization point               Send buffer  Reliable comm. guaranteed?
Block sender until buffer not full  Yes          Not necessary
a) Situation if page 10 is read-only and replication is used
Distributed Shared Memory
Systems (2)
● False sharing of a page between two independent
processes
Network Operating System (1)
● General structure of a network operating system
Network Operating System (2)
● Two clients and a server in a network operating
system
Network Operating System (3)
● Different clients may mount the servers in different
places
Positioning Middleware
● General structure of a distributed system as
middleware
Middleware models
● Treating everything as a file
● Distributed file system
● RPC
● Distributed objects
● WWW: distributed documents
Middleware services
● Communication facility: hides message passing
● Naming: URL
● Persistence
● Distributed transactions: multiple reads and writes can take place
● Security
Middleware and Openness
Comparison of systems:
Item                     Multiproc. OS    Multicomp. OS        Network OS  Middleware-based DS
Number of copies of OS   1                N                    N           N
Basis for communication  Shared memory    Messages             Files       Model specific
Resource management      Global, central  Global, distributed  Per node    Per node
● A sample server
An Example Client and Server (3)
Multitiered Architectures (2)
● An example of a server acting as a client
Modern Architectures
● An example of horizontal distribution of a Web
service
Tiered Architecture
Emergence
■ Middle Tier
■ Less flexible Vs More flexible
■ Portability
■ Less portable Vs More portable (Reason: middle layer not abstract
from other layers)
3 Tier with an Application Server
● Most of the application’s business logic is moved to a shared host server
● PC is used only for presentation services
● Approach is similar to X Architecture
○ Both aim at pulling the main body of application logic off the desktop and running it
on a shared host.
3 Tier with an Application Server
Advantages to Application Designer
● Less software on client, hence less to worry about security
● Application is more scalable
● Less software maintenance cost
● Easier to design the application to be DBMS-agnostic
● Allows “after the fact” application partitioning
3-Tier With an Object DBMS
● Using ODBMS as Middle layer
● ODBMS acts as “hot Cache”
● Retrieve, assemble and Store persistent
until required
○ The generalized form of storage in the DBMS
(server) may be inadequate for a specific
application
○ E.g., voice or video is not supported by an RDBMS
Distributed/Collaborative Enterprise
Architectures
● Based on ORB technology
● Goes beyond CORBA by using shared, reusable business
models (not just objects)
● Applications built with “plug & play” components
● Performance tuning can be made, by transferring processes
Distributed/Collaborative Enterprise
Architectures
● same interface can be used for building a desktop, single location
application or a fully distributed application
● application can be developed and tested locally
● technical issues like queuing, timing and protocols aren't an issue for
the application developer