Distributed Systems
CONTENT
What is a Distributed System
Types of Distributed Systems
Examples of Distributed Systems
Common Characteristics
Basic Design Issues
Advantages
Disadvantages
Conclusion
WHAT IS A DISTRIBUTED SYSTEM?
A collection of independent computers that appears
to its users as a single coherent system.
Features:
No shared memory – message-based communication
Each runs its own local OS
Heterogeneity
Ideal: to present a single-system image:
The distributed system “looks like” a single computer
rather than a collection of separate computers.
1. WHAT IS A DISTRIBUTED SYSTEM?
Definition: A distributed system is one in which
components located at networked computers
communicate and coordinate their actions only by
passing messages. This definition leads to the
following consequences of distributed systems:
Concurrency of components
Lack of a global ‘clock’
Independent failures of components
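The slides do not say how components can order events without a global clock. One widely used answer is a logical clock in the style of Lamport, sketched below as an illustration only; the class and method names are assumptions made for this example.

```python
class LamportClock:
    """Logical clock: a counter that orders events without a shared wall clock."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Advance the clock for a local event and return the new timestamp."""
        self.time += 1
        return self.time

    def send(self):
        """Timestamp an outgoing message (sending counts as an event)."""
        return self.tick()

    def receive(self, message_time):
        """Merge the sender's timestamp, then count the receive event itself."""
        self.time = max(self.time, message_time) + 1
        return self.time


a, b = LamportClock(), LamportClock()
a.tick()                      # local event on A: A = 1
stamp = a.send()              # A sends a message: A = 2
b.tick()                      # unrelated event on B: B = 1
received = b.receive(stamp)   # B receives it: B = max(1, 2) + 1 = 3
```

The receive event always gets a larger timestamp than the matching send, so causally related events are ordered consistently even though A and B never compare wall-clock times.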
DEFINITION OF A DISTRIBUTED
SYSTEM
Figure 1-1. A distributed system organized as middleware. The
middleware layer runs on all machines, and offers a uniform
interface to the system
ROLE OF MIDDLEWARE (MW)
In some early research systems: MW tried to
provide the illusion that a collection of separate
machines was a single computer.
E.g. NOW project: GLUNIX middleware
Today:
clustering software allows independent computers
to work together closely
MW also supports seamless access to remote
services, doesn’t try to look like a general-purpose
OS
MIDDLEWARE EXAMPLES
CORBA (Common Object Request Broker
Architecture)
DCOM (Distributed Component Object
Model) – being replaced by .NET
Sun’s ONC RPC (Remote Procedure Call)
RMI (Remote Method Invocation)
SOAP (Simple Object Access Protocol)
MIDDLEWARE EXAMPLES
All of the previous examples support
communication across a network:
They provide protocols that allow a program
running on one kind of computer, using one kind
of operating system, to call a program running on
another computer with a different operating
system
The communicating programs must be running the
same middleware.
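As a rough illustration of a program calling a procedure in another program through middleware, the sketch below uses the XML-RPC modules from Python's standard library. XML-RPC is not one of the systems listed above; it stands in for them here. The function names `add`, `start_server` and `remote_add` are assumptions made for the example, and both ends run on one machine only to keep it self-contained.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def add(x, y):
    """The remote procedure: it runs in the server, not in the caller."""
    return x + y


def start_server():
    """Start an XML-RPC server on a free local port; return (server, port)."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(add, "add")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, server.server_address[1]


def remote_add(port, x, y):
    """Call add() through the middleware as if it were a local function."""
    with ServerProxy(f"http://127.0.0.1:{port}/") as proxy:
        return proxy.add(x, y)
```

A caller would run `server, port = start_server()` and then `remote_add(port, 2, 3)`. Both sides speak the same protocol, which matches the requirement that the communicating programs run the same middleware.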
DISTRIBUTED SYSTEM GOALS
Resource Accessibility
Distribution Transparency
Openness
Scalability
GOAL 1 – RESOURCE
ACCESSIBILITY
Support user access to remote resources
(printers, data files, web pages, CPU
cycles) and the fair sharing of the
resources
Economics of sharing expensive
resources
Performance enhancement – due to
multiple processors; also due to ease of
collaboration and info exchange – access
to remote services
Groupware: tools to support collaboration
GOAL 2 – DISTRIBUTION TRANSPARENCY
Software hides some of the details of the
distribution of system resources.
Makes the system more user friendly.
A distributed system that appears to its
users & applications to be a single
computer system is said to be
transparent.
Users & apps should be able to access remote
resources in the same way they access local
resources.
Transparency has several dimensions.
TYPES OF TRANSPARENCY
Transparency   Description
Access         Hide differences in data representation &
               resource access (enables interoperability)
Location       Hide location of resource (can use
               resource without knowing its location)
Migration      Hide possibility that a system may change
               location of resource (no effect on access)
Replication    Hide the possibility that multiple copies of
               the resource exist (for reliability and/or
               availability)
Concurrency    Hide the possibility that the resource may
               be shared concurrently
Failure        Hide failure and recovery of the resource.
               How does one differentiate between slow and
               failed?
Relocation     Hide that a resource may be moved during use
Figure 1-2. Different forms of transparency in a distributed system
(ISO, 1995)
GOAL 3 - OPENNESS
An open distributed system “…offers services
according to standard rules that describe the
syntax and semantics of those services.” In other
words, the interfaces to the system are clearly
specified and freely available.
Compare to network protocols
Not proprietary
Interface Definition/Description Languages
(IDL): used to describe the interfaces between
software components, usually in a distributed
system
Definitions are language & machine independent
Support communication between systems using different
OS/programming languages; e.g. a C++ program running
on Windows communicates with a Java program running
on UNIX
Communication is usually RPC-based.
EXAMPLES OF IDLS
IDL: Interface Description Language
The original
WSDL: Web Services Description Language
Provides machine-readable descriptions of the
services
OMG IDL: used for RPC in CORBA
OMG – Object Management Group
…
Open Systems Support …
Interoperability: the ability of two different
systems or applications to work together
A process that needs a service should be able to talk
to any process that provides the service.
Multiple implementations of the same service may be
provided, as long as the interface is maintained
Portability: an application designed to run on
one distributed system can run on another
system which implements the same interface.
Extensibility: Easy to add new components,
features
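The point that several implementations can coexist as long as the interface is maintained can be sketched in a single language. Here a Python `Protocol` plays the role an IDL definition would play across languages; the `KeyValueStore` interface and both implementations are hypothetical names invented for this example.

```python
from typing import Protocol


class KeyValueStore(Protocol):
    """The published interface: clients depend on this and nothing else."""

    def put(self, key: str, value: str) -> None: ...
    def get(self, key: str) -> str: ...


class InMemoryStore:
    """First implementation: a plain dictionary."""

    def __init__(self) -> None:
        self._data = {}

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> str:
        return self._data[key]


class LoggingStore:
    """Second implementation: an append-only log, searched newest first."""

    def __init__(self) -> None:
        self._log = []

    def put(self, key: str, value: str) -> None:
        self._log.append((key, value))

    def get(self, key: str) -> str:
        for logged_key, value in reversed(self._log):
            if logged_key == key:
                return value
        raise KeyError(key)


def client(store: KeyValueStore) -> str:
    """Works with any implementation that honours the interface."""
    store.put("printer", "room 214")
    return store.get("printer")
```

`client(InMemoryStore())` and `client(LoggingStore())` return the same value, so either implementation can be swapped in without changing the client.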
GOAL 4 - SCALABILITY
Dimensions that may scale:
With respect to size
With respect to geographical distribution
With respect to the number of administrative
organizations spanned
A scalable system still performs well as it scales
up along any of the three dimensions.
SIZE SCALABILITY
Scalability is negatively affected when the
system is based on
Centralized server: one for all users
Centralized data: a single data base for all
users
Centralized algorithms: one site collects all
information, processes it, distributes the results
to all sites.
Complete knowledge: good
Time and network traffic: bad
DECENTRALIZED
ALGORITHMS
No machine has complete information about the
system state
Machines make decisions based only on local
information
Failure of a single machine doesn’t ruin the
algorithm
There is no assumption that a global clock exists.
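A minimal sketch of a decentralized algorithm, assuming nodes arranged in a ring that want to agree on the average of their values (for example, their load). Each node only ever reads its own value and those of its two neighbours. The ring layout and the function names are assumptions made for this example, and the rounds are simulated synchronously in one process for simplicity.

```python
def gossip_round(values):
    """One round: every node averages itself with its two ring neighbours."""
    n = len(values)
    return [
        (values[(i - 1) % n] + values[i] + values[(i + 1) % n]) / 3
        for i in range(n)
    ]


def run(values, rounds):
    """Repeat the local averaging step for a fixed number of rounds."""
    for _ in range(rounds):
        values = gossip_round(values)
    return values
```

Starting from `[10.0, 0.0, 40.0, 20.0, 30.0]`, repeated rounds drive every node toward the global mean of 20 even though no node ever sees the whole system state. The sketch does not model machine failures or message delays.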
2. TYPES OF DISTRIBUTED SYSTEMS
Distributed Computing Systems.
Distributed Information Systems.
Distributed Pervasive Systems.
Distributed Computing Systems: The
distributed computing systems include the
following:
Cluster computing systems
Grid computing systems
Distributed Information Systems: The main
forms are:
Transaction processing systems
Enterprise application integration
Distributed Pervasive Systems: A few examples
of distributed pervasive systems are:
Home systems
Electronic health care systems
Sensor networks
3. EXAMPLES OF DISTRIBUTED SYSTEMS
Local Area Network and Intranet
Database Management System
Automatic Teller Machine Network
Internet/World-Wide Web
Mobile and Ubiquitous Computing
3.1 LOCAL AREA NETWORK
[Figure: a typical intranet. Elements shown: desktop computers, email
servers, print and other servers, a web server and a file server on
local area networks, and a router/firewall leading to the rest of the
Internet.]
3.2 DATABASE MANAGEMENT SYSTEM
3.3 AUTOMATIC TELLER MACHINE
NETWORK
3.4 INTERNET
[Figure: a portion of the Internet. Elements shown: intranets, ISPs,
backbones and a satellite link. Key: desktop computer, server,
network link.]
3.4.1 WORLD-WIDE-WEB
3.4.2 WEB SERVERS AND WEB
BROWSERS
[Figure: browsers reach the web servers www.google.com, www.uu.se and
www.w3c.org across the Internet. Example URLs:
http://www.google.com/search?q=lyu, http://www.uu.se/ and
http://www.w3c.org/Protocols/Activity.html. The last one names the
file Protocols/Activity.html in the file system of www.w3c.org.]
3.5 MOBILE AND UBIQUITOUS
COMPUTING
[Figure: portable and handheld devices in a distributed system.
Elements shown: Internet, host intranet, home intranet, wireless LAN,
GSM/GPRS gateway, mobile phone, laptop, printer, camera, host site.]
4. COMMON CHARACTERISTICS/
ATTRIBUTES
What are we trying to achieve when we construct a
distributed system?
Certain common characteristics can be used to assess
distributed systems
Heterogeneity
Openness
Security
Scalability
Failure Handling
Concurrency
Transparency
4.1 HETEROGENEITY
Variety and differences in
Networks
Computer hardware
Operating systems
Programming languages
Implementations by different developers
Middleware: a software layer that provides a programming
abstraction and masks the heterogeneity of the
underlying networks, hardware, OS, and programming
languages (e.g., CORBA).
Mobile code: code that can be sent from one
computer to another and run at the destination (e.g., Java
applets running on the Java virtual machine).
4.2 OPENNESS
Openness is concerned with extensions and
improvements of distributed systems.
Detailed interfaces of components need to be
published.
New components have to be integrated with
existing components.
Differences in data representation of interface
types on different processors (of different
vendors) have to be resolved.
4.3 SECURITY
In a distributed system, clients send requests to
access data and other network resources managed by
servers:
Doctors requesting records from hospitals
Users purchasing products through electronic commerce
Security is required for:
Concealing the contents of messages: security and
privacy
Identifying a remote user or other agent correctly
(authentication)
To improve security:
Allow only authorized users to access resources
Ensure that information transmitted over the network is
readable only by the intended recipients
Provide mechanisms to protect resources from attack
New challenges:
Denial of service attack
Security of mobile code
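One common building block for identifying the sender of a message is a message authentication code. The sketch below uses HMAC from Python's standard library and assumes the two parties already share a secret key. The key value and the function names are placeholders chosen for this example. Note that this covers authentication and integrity only; concealing message contents would additionally require encryption.

```python
import hashlib
import hmac

# Assumption: both parties obtained this key over some secure channel.
SHARED_KEY = b"demo-key-not-for-production"


def sign(message: bytes, key: bytes = SHARED_KEY) -> bytes:
    """Compute an authentication tag that only a key holder can produce."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(message: bytes, tag: bytes, key: bytes = SHARED_KEY) -> bool:
    """Check the tag in constant time; False if message or key differs."""
    return hmac.compare_digest(sign(message, key), tag)
```

A receiver that calls `verify(message, tag)` rejects both altered messages and messages signed with a different key.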
4.4 SCALABILITY
Adaptation of distributed systems to
accommodate more users
respond faster (this is the hard one)
Usually done by adding more and/or faster
processors.
Components should not need to be changed
when scale of a system increases.
Design components to be scalable!
Allows a distributed system to grow (i.e., add
more machines to the system) without affecting
the existing applications and users
4.5 FAILURE HANDLING (FAULT
TOLERANCE)
Hardware, software and networks fail!
Distributed systems must maintain availability
even at low levels of hardware/software/network
reliability.
Fault tolerance is achieved by
Recovery
Redundancy
Replication
Replication
-Offers users increased reliability and availability over
single-machine implementations
-Designers must provide mechanisms to ensure
consistency among the state information at different
4.6 CONCURRENCY
Components in distributed systems are executed
in concurrent processes.
Components access and update shared resources
(e.g. variables, databases, device drivers).
Integrity of the system may be violated if
concurrent updates are not coordinated.
Lost updates
Inconsistent analysis
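The lost-update problem can be sketched in a few lines. The first function replays the bad interleaving by hand so the outcome is deterministic; the second coordinates concurrent updates with a lock. The bank-account setting and all names are assumptions made for this example.

```python
import threading


def lost_update_demo():
    """Both clients read the balance before either writes it back."""
    balance = 100
    read_a = balance        # client A reads 100
    read_b = balance        # client B reads 100
    balance = read_a + 50   # A writes 150
    balance = read_b + 50   # B also writes 150, so A's update is lost
    return balance


class Account:
    """Shared resource whose updates are serialized by a lock."""

    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.Lock()

    def deposit(self, amount):
        with self._lock:    # read-modify-write happens as one unit
            self.balance += amount


def coordinated_demo(workers=8, deposits_each=1000):
    """Run concurrent deposits and return the final balance."""
    account = Account(0)

    def work():
        for _ in range(deposits_each):
            account.deposit(1)

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return account.balance
```

`lost_update_demo()` ends at 150 rather than the intended 200, while `coordinated_demo()` always ends at `workers * deposits_each`.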
4.7 TRANSPARENCY
Distributed systems should be perceived by users
and application programmers as a whole rather
than as a collection of cooperating components.
Transparency has different aspects.
These represent various properties that
distributed systems should have.
17.2.4 TRANSPARENCY
Access transparency
Hides the details of networking protocols that enable
communication between distributed computers
Location transparency
Builds on access transparency to hide the location of
resources in the distributed system
Failure transparency
Method by which a distributed system provides fault
tolerance
Checkpointing
Periodically stores the state of an object such that it can be
restored if a failure in the distributed system results in the
loss of the object
Replication
A system provides multiple resources that perform the same
function
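Checkpointing as described above can be sketched with the standard `pickle` module: the state is saved periodically and reloaded after a failure. The file layout and the function names are assumptions made for this example. Writing to a temporary file first and then renaming it is a design choice of this sketch, so that a crash during the write does not destroy the previous checkpoint.

```python
import os
import pickle
import tempfile


def checkpoint(state, path):
    """Save state to path; the old checkpoint survives a crash mid-write."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as tmp_file:
        pickle.dump(state, tmp_file)
    os.replace(tmp_path, path)  # swap the new checkpoint into place


def restore(path):
    """Reload the most recently saved state."""
    with open(path, "rb") as checkpoint_file:
        return pickle.load(checkpoint_file)
```

A component would call `checkpoint(state, path)` at intervals and `restore(path)` on restart; any work done after the last checkpoint is lost and has to be redone.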
17.2.4 TRANSPARENCY
Replication transparency
Hides the fact that multiple copies of a resource are available
in the system
Persistence transparency
Hides the information about where the resource is stored—
memory or disk
Migration and relocation transparency
Hide the movement of components of a distributed system
Migration transparency
Masks the movement of an object from one location to another in
the system
Relocation transparency
Masks the relocation of an object from other objects that
communicate with it
Transaction transparency
Allows a system to achieve consistency by masking the
coordination among a set of resources
Hides the implementation of checkpointing and other
consistency mechanisms
5. BASIC DESIGN ISSUES
General software engineering principles
include rigor and formality, separation of
concerns, modularity, abstraction,
anticipation of change, …
Specific issues for distributed systems:
Naming
Communication
Software structure
System architecture
Workload allocation
Consistency maintenance
5.1 NAMING
A name is resolved when translated into an
interpretable form for resource/object reference.
Communication identifier (IP address + port number)
Name resolution involves several translation steps
Design considerations
Choice of name space for each resource type
Name service to resolve resource names to communication identifiers
Name services include naming context resolution,
hierarchical structure, resource protection
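A minimal sketch of a hierarchical name service, assuming path-like names whose intermediate parts are naming contexts and whose last part maps to a communication identifier (host, port). The `NameService` class, the example name and the address are hypothetical; resource protection is not modelled.

```python
class NameService:
    """Resolves hierarchical names to (host, port) communication identifiers."""

    def __init__(self):
        self._root = {}  # nested dicts are contexts, tuples are resources

    def bind(self, name, host, port):
        """Register a resource under a name such as /printers/floor2/laser."""
        *contexts, leaf = name.strip("/").split("/")
        node = self._root
        for context in contexts:
            node = node.setdefault(context, {})
        node[leaf] = (host, port)

    def resolve(self, name):
        """Walk the contexts one translation step at a time."""
        node = self._root
        for part in name.strip("/").split("/"):
            if not isinstance(node, dict) or part not in node:
                raise KeyError(name)
            node = node[part]
        if isinstance(node, dict):
            raise KeyError(f"{name} is a context, not a resource")
        return node
```

After `ns.bind("/printers/floor2/laser", "192.0.2.10", 631)`, a client that knows only the name can obtain the address with `ns.resolve("/printers/floor2/laser")`.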
5.2 COMMUNICATION
Separated components communicate with sending
processes and receiving processes for data transfer
and synchronization.
Message passing: send and receive primitives
synchronous or blocking
asynchronous or non-blocking
Abstractions defined: channels, sockets, ports.
Communication patterns: client-server communication
(e.g., RPC, function shipping) and group multicast
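The difference between a blocking and a non-blocking receive can be sketched with an in-process queue standing in for a channel. The function names are assumptions made for this example, and two threads play the roles of the sending and receiving processes. The send shown here is buffered, so the sender does not wait for the receiver.

```python
import queue
import threading


def blocking_exchange():
    """Receiver waits on the channel until the sender's message arrives."""
    channel = queue.Queue()

    def sender():
        channel.put("hello")  # buffered send: returns immediately

    thread = threading.Thread(target=sender)
    thread.start()
    message = channel.get()   # blocking receive
    thread.join()
    return message


def non_blocking_receive(channel):
    """Return a message if one is waiting, otherwise None without waiting."""
    try:
        return channel.get_nowait()
    except queue.Empty:
        return None
```

A blocking receive also synchronizes the two processes, since the receiver cannot proceed before the sender has sent. A non-blocking receive leaves the receiver free to do other work and poll again later.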
5.3 SOFTWARE STRUCTURE
Layers in centralized computer systems:
Applications
Middleware
Operating system
Computer and Network Hardware
5.3 SOFTWARE STRUCTURE
Layers and dependencies in distributed systems:
Applications
Open services
Distributed programming support
Open system kernel services
Computer and network hardware
5.4 SYSTEM ARCHITECTURES
Client-Server
Peer-to-Peer
Services provided by multiple servers
Proxy servers and caches
Mobile code and mobile agents
Network computers
Thin clients and mobile devices
5.4.1 CLIENTS INVOKE INDIVIDUAL
SERVERS
[Figure: clients send invocations to servers and receive results.
Key: process, computer.]
5.4.2 PEER-TO-PEER SYSTEMS
[Figure: peers 1 to N, each running the application, with sharable
objects distributed among them.]
5.4.3 A SERVICE BY MULTIPLE SERVERS
[Figure: a service made up of several servers, accessed by several
clients.]
5.4.4 WEB PROXY SERVER
[Figure: clients reach web servers through a proxy server.]
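A minimal sketch of the proxy-and-cache idea, with in-process objects standing in for the web server and the proxy. The class names are hypothetical, and the sketch ignores cache expiry and invalidation, which a real proxy has to handle.

```python
class OriginServer:
    """Stand-in for a web server; counts how often it is contacted."""

    def __init__(self, pages):
        self.pages = pages
        self.requests = 0

    def get(self, url):
        self.requests += 1
        return self.pages[url]


class CachingProxy:
    """Sits between clients and the server and remembers earlier responses."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}

    def get(self, url):
        if url not in self.cache:          # miss: forward to the server
            self.cache[url] = self.origin.get(url)
        return self.cache[url]             # hit: answer from the cache
```

When several clients request the same page through one `CachingProxy`, the origin server is contacted only once, which reduces both its load and the network traffic.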
5.4.5 WEB APPLETS
a) Client request results in the downloading of applet code
[Figure: applet code passes from the web server to the client.]
b) Client interacts with the applet
[Figure: the applet runs at the client; the web server is shown alongside.]
5.4.6 THIN CLIENTS AND COMPUTE
SERVERS
[Figure: a thin client on a network computer or PC connects over the
network to an application process on a compute server.]
6. ADVANTAGES
Sharing data: A user at one site can access data
residing at other sites.
Autonomy: Because data is distributed, each site
retains a degree of control over the data stored locally.
Availability: If one site fails, the remaining sites
may be able to continue operating, so the failure of one
site does not necessarily shut down the whole system.
7. DISADVANTAGES
Software Development Cost
Greater Potential for Bugs
Increased Processing Overhead
ISSUES/PITFALLS OF
DISTRIBUTION
Requirement for advanced software to
realize the potential benefits.
Security and privacy concerns regarding
network communication
Replication of data and services
provides fault tolerance and availability,
but at a cost.
Network reliability, security,
heterogeneity, topology
Latency and bandwidth
Administrative domains
THANKS