1. Introduction
Distributed Systems
Instructor: Kirema D. Mutembei
to 10 billion bits/sec
2. Distribution Transparency:
It is important for a distributed system to hide the
location of its processes and resources. A distributed
system that can present itself as a single system is
said to be transparent.
The transparencies that need to be considered are
access, location, migration, relocation, replication,
concurrency, failure and persistence.
The degree of distribution transparency should be
weighed against its performance cost.
Distributed System Design Goals cont’d
3. Openness:
Openness is an important goal of a distributed system:
an open system offers services according to standard
rules that describe the syntax and semantics of those
services.
An open distributed system must be flexible, making it
easy to configure and to add new components without
affecting existing components.
An open distributed system must also be extensible.
Distributed System Design Goals cont’d
4. Scalability:
Scalability is one of the most important goals and is
measured along three different dimensions.
First, a system can be scalable with respect to its
size: more users and resources can be added to it.
Second, a system can be geographically scalable: users
and resources may lie far apart.
Third, a system can be administratively scalable: it
remains manageable even if it spans many
administrative organizations.
Distribution Transparency
Transparency
Hiding the fact that a system's processes
and resources are physically distributed
across multiple computers.
A DS that is able to present itself to
users as if it were only a single
computer system is said to be
transparent.
Note
Distribution transparency is a nice goal,
but achieving it is a different story.
Distribution Transparency
Transparency   Description
Access         Hide differences in data representation and how a resource is accessed
Location       Hide where a resource is located
Migration      Hide that a resource may move to another location
Relocation     Hide that a resource may be moved to another location while in use
Replication    Hide that a resource is replicated
Concurrency    Hide that a resource may be shared by several competitive users
Failure        Hide the failure and recovery of a resource
Persistence    Hide whether a (software) resource is in memory or on disk
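As a rough illustration of location and migration transparency, the sketch below lets a client read a resource by its logical name only. The `Directory` class, the node names and the `read` helper are hypothetical names made up for this example; a real system would use a proper naming service.

```python
class Directory:
    """Hypothetical name service: maps a logical name to its current host."""

    def __init__(self):
        self._locations = {}

    def register(self, name, host):
        self._locations[name] = host

    def resolve(self, name):
        return self._locations[name]


def read(directory, stores, name):
    """Client-side read: the caller never mentions a host."""
    host = directory.resolve(name)
    return stores[host][name]


# Two hypothetical storage nodes; the resource starts on node-a.
stores = {"node-a": {"report.txt": "Q1 figures"}, "node-b": {}}
directory = Directory()
directory.register("report.txt", "node-a")

before = read(directory, stores, "report.txt")

# Migrate the resource to node-b and update the directory.
stores["node-b"]["report.txt"] = stores["node-a"].pop("report.txt")
directory.register("report.txt", "node-b")

after = read(directory, stores, "report.txt")
print(before == after)  # the client call is unchanged by the move
```

The client code is identical before and after the move; only the directory entry changes.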
Degree of Transparency
Aiming at full distribution transparency may be too
much:
Users may be located in different continents
Completely hiding failures of networks and nodes is
(theoretically and practically) impossible
You cannot distinguish a slow computer from a failing
one
You can never be sure that a server actually performed
an operation before a crash
Full transparency will cost performance, exposing
distribution of the system
Keeping Web caches exactly up-to-date with the
master
Immediately flushing write operations to disk for fault
tolerance
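The point that a slow computer cannot be distinguished from a failing one can be sketched with a simple timeout-based check. The server names, timestamps and timeout below are illustrative assumptions, not values from the slides.

```python
def is_suspected(last_heartbeat, now, timeout):
    """Timeout-based guess: no heartbeat within `timeout` seconds means 'suspected'."""
    return (now - last_heartbeat) > timeout


# Two hypothetical servers, both last heard from at t = 10.0 s.
# One has crashed; the other is alive, but its next heartbeat is delayed.
last_heartbeat = {"crashed-server": 10.0, "slow-server": 10.0}

now, timeout = 15.0, 3.0
verdicts = {
    name: is_suspected(seen, now, timeout)
    for name, seen in last_heartbeat.items()
}
print(verdicts)
```

The observer holds the same evidence for both servers, so it reaches the same verdict for both. A longer timeout only delays the decision; it does not remove the ambiguity.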
Openness of Distributed
Systems
Goal: Open distributed system -- able to interact
with services from other open systems,
irrespective of the underlying environment:
Standard rules (protocols/interfaces) to describe
services/components
Interface definitions should be complete and vendor
neutral
These help make systems and services
interoperable and portable
Flexibility – ability to integrate multiple components
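One way to picture a vendor-neutral interface is a small sketch in which client code depends only on an interface definition. The `KeyValueService` interface and both implementations are hypothetical names invented for this example.

```python
from typing import Protocol


class KeyValueService(Protocol):
    """Hypothetical interface definition: the syntax every implementation must offer."""

    def put(self, key: str, value: str) -> None: ...

    def get(self, key: str) -> str: ...


class InMemoryStore:
    """One 'vendor': keeps values in a dictionary."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data[key]


class LoggingStore:
    """Another 'vendor': same interface, but also records each operation."""

    def __init__(self):
        self._data = {}
        self.log = []

    def put(self, key, value):
        self.log.append(("put", key))
        self._data[key] = value

    def get(self, key):
        self.log.append(("get", key))
        return self._data[key]


def round_trip(service: KeyValueService, key: str, value: str) -> str:
    """Client code written against the interface only."""
    service.put(key, value)
    return service.get(key)


results = [round_trip(s, "k", "v") for s in (InMemoryStore(), LoggingStore())]
print(results)
```

Because `round_trip` relies only on the interface, either implementation can be swapped in without changing the client, which is the interoperability and portability the slide refers to.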
Achieving openness
Achieving openness: At least make
the distributed system independent
from the underlying environment:
Hardware
Platforms
Languages
Scale in Distributed
Systems
Observation: Many developers of modern
distributed systems readily use the
adjective “scalable” without making
clear why their system actually scales.
Three metrics of a scalable system:
Number of users and/or processes (size
scalability)
Maximum distance between nodes
(geographical scalability)
Number of administrative domains
(administrative scalability): user IDs, user
groups, ACLs, resource managers
Scalability Limitations
Centralized Services: Single server
for all users. Often necessary.
Centralized Data: Single online
telephone book.
Centralized Algorithms: Doing
routing based on complete
information.
Scalability Issues
Decentralized algorithms:
No machine has complete
information about the system state.
Machines make decisions based
only on local information.
Failure of one machine does not
ruin the algorithm
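A minimal sketch of a decentralized computation is neighbour averaging on a ring: each node repeatedly replaces its value with the average of its own value and those of its two neighbours. The ring layout, the starting values and the use of lock-step rounds are simplifying assumptions for illustration; the sketch shows the first two properties above and does not model machine failure.

```python
def local_round(values):
    """One round: every node averages itself with its two ring neighbours.

    Each node reads only local information (its own value and its neighbours').
    """
    n = len(values)
    return [
        (values[(i - 1) % n] + values[i] + values[(i + 1) % n]) / 3
        for i in range(n)
    ]


# Four hypothetical nodes, each holding one local measurement.
values = [10.0, 20.0, 30.0, 40.0]
global_mean = sum(values) / len(values)

for _ in range(50):
    values = local_round(values)

print(values, global_mean)
```

No node ever sees the full list, yet all of them drift toward the global mean.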
Scalability Issues cont’d
Other issues with Geographical scalability
Problems due to synchronous
communication: the sender waits until the
acknowledgment is received from the receiver,
and the receiver waits until the message arrives.
Unreliable WANs.
Scaling the system across multiple
independent administrative domains.
Conflicting policies with respect to resource
usage (payment), management and
security.
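The cost of synchronous communication over a slow link can be sketched by comparing blocking sends with overlapping ones. The 50 ms delay and the `send_and_wait_for_ack` function are assumptions standing in for a real wide-area round trip.

```python
import time
from concurrent.futures import ThreadPoolExecutor

LATENCY = 0.05  # assumed round-trip delay in seconds (illustrative only)


def send_and_wait_for_ack(message):
    """Simulated send: blocks until the (pretend) acknowledgment arrives."""
    time.sleep(LATENCY)
    return f"ack:{message}"


messages = ["m1", "m2", "m3", "m4"]

# Synchronous style: each send blocks before the next one starts.
start = time.perf_counter()
sync_acks = [send_and_wait_for_ack(m) for m in messages]
sync_elapsed = time.perf_counter() - start

# Asynchronous style: all sends are in flight at the same time.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=len(messages)) as pool:
    async_acks = list(pool.map(send_and_wait_for_ack, messages))
async_elapsed = time.perf_counter() - start

print(f"synchronous: {sync_elapsed:.2f}s, overlapped: {async_elapsed:.2f}s")
```

With blocking sends the waiting time grows with the number of messages, while overlapping the sends keeps it close to a single round trip. The larger the latency, the larger the gap, which is why synchronous communication hurts geographical scalability.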
Scaling Techniques
1. Partition data and
computations across multiple
machines
Move computations to clients (Java
applets)
Decentralized naming system (DNS)
Decentralized information systems
(WWW)
2. Make copies of data available at
different machines (Servers)
Replicated file servers (for fault tolerance)
Replicated databases
Mirrored web sites
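The first two techniques, partitioning and replication, can be combined in a small sketch: a hash decides which server owns a key, and a copy is also placed on the next server. The server count, the number of copies and the function names are assumptions chosen for illustration.

```python
import hashlib


def partition(key, n_servers):
    """Map a key to one server index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_servers


def replicas(key, n_servers, copies=2):
    """Primary server plus the next `copies - 1` servers, wrapping around."""
    first = partition(key, n_servers)
    return [(first + i) % n_servers for i in range(copies)]


N_SERVERS = 4  # hypothetical cluster size
keys = ["alice", "bob", "carol", "dave"]
placement = {key: replicas(key, N_SERVERS) for key in keys}

for key, servers in placement.items():
    print(key, "->", servers)
```

Partitioning spreads the load because each server handles only its share of the keys; replication means a key stays available if one of its servers fails. A stable hash (here SHA-256) is used so that every machine computes the same placement.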
DNS Structure
Scaling Techniques cont’d
3. Allow client processes to access local
copies
Attempts to reduce the network traffic of the previous
model by caching the data obtained from the server
node
Web caches (browser/Web proxy)
File caching (at server and client)
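The effect of client-side caching, and the consistency price it carries, can be sketched as follows. The `Server` and `CachingClient` classes are hypothetical and deliberately omit invalidation.

```python
class Server:
    """Hypothetical origin server that counts how often it is contacted."""

    def __init__(self, data):
        self.data = dict(data)
        self.requests = 0

    def fetch(self, key):
        self.requests += 1
        return self.data[key]


class CachingClient:
    """Client that keeps a local copy of everything it has fetched."""

    def __init__(self, server):
        self.server = server
        self.cache = {}

    def get(self, key):
        if key not in self.cache:  # only a miss goes over the network
            self.cache[key] = self.server.fetch(key)
        return self.cache[key]


server = Server({"index.html": "v1"})
client = CachingClient(server)

reads = [client.get("index.html") for _ in range(3)]
requests_after_reads = server.requests

# The master copy changes, but the cached copy is not invalidated.
server.data["index.html"] = "v2"
stale_read = client.get("index.html")

print(reads, requests_after_reads, stale_read)
```

Three reads cost a single server request, which is the traffic reduction. The final read returns the old version, which is the consistency problem: keeping caches exactly up to date with the master costs extra communication.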
networks;
computer hardware;
operating systems;
programming languages;
Transparency
Quality of Service (QoS)
Distributed System
Challenges
6. Transparency
A problem with transparency may
arise with distributed systems due
to the nature of the system's
complexity.
In this context, transparency refers
to the distributed system's ability to
conceal its complexity and give off
the appearance of a single system.
And when we discuss transparency, we must also
control.