
Chapter 1-Introduction

The document provides an introduction to distributed systems, defining them as collections of independent computers that coordinate actions through message passing to appear as a single coherent system. It discusses how distributed systems arose from cheaper networked computers and the benefits they provide like scalability, availability and shared resources. Some challenges of distribution like partial failures are also outlined.

Uploaded by

mehari kiros

Chapter 1 – Introduction

Introduction to Distributed Systems

November 6, 2023 Introduction 1


1.1 Introduction and Definition

Before the mid-80s, computers were
• Very expensive (hundreds of thousands or even millions of dollars)
• Very slow (a few thousand instructions per second)
• Not connected to one another
After the mid-80s: two major developments
• Cheap and powerful microprocessor-based computers appeared
• Computer networks appeared
  ◦ LANs at speeds ranging from 10 to 1000 Mbps
  ◦ WANs at speeds ranging from 64 Kbps to gigabits per second
• Consequence: it became feasible to use a large network of computers to work on the same application
• This is in contrast to the old centralized systems, where there was a single computer with its peripherals

A distributed system is a system in which hardware or software components located at networked computers communicate and coordinate their actions only by message passing. This definition has three consequences:
• Concurrency of components
• No global clock
• Independent failures
A second formulation: a collection of two or more independent computers which coordinate their processing through synchronous or asynchronous message passing.

(George Coulouris, Jean Dollimore and Tim Kindberg)

Definition of a Distributed System
• A distributed system is a collection of independent computers that appears to its users as a single coherent system (Tanenbaum & Van Steen)
• This definition has two aspects:
1. Hardware: autonomous machines
2. Software: a single-system view for the users
Other Definitions
• A distributed system is a system designed to support the development of applications and services which can exploit a physical architecture consisting of multiple, autonomous processing elements that do not share primary memory but cooperate by sending asynchronous messages over a communication network (Blair & Stefani)


Why Distributed?
• Resource and data sharing: printers, databases, multimedia servers, ...
• Availability, reliability: the loss of some instances can be hidden
• Scalability, extensibility: the system grows with demand (e.g., extra servers)
• Performance: huge power (CPU, memory, ...) available
• Inherent distribution, communication: organizational distribution, e-mail, video


Problems of Distribution
• Concurrency, security: clients must not disturb each other
• Privacy: unwanted communication such as spam
• Partial failure: we often do not know where the error occurred (e.g., during an RPC)
• Location, migration, replication: clients must be able to find their servers
• Heterogeneity: hardware, platforms, languages, management


Characteristics of Distributed Systems
• Differences between the computers and the way they communicate are hidden from users.
• Users and applications can interact with a distributed system in a consistent and uniform way, regardless of location.
• Distributed systems should be easy to expand and scale.
• A distributed system is normally continuously available, even though some parts may fail.


1.2 Organization and Goals of a Distributed System

• Goal: to support heterogeneous computers and networks while providing a single-system view.
• A distributed system is often organized by means of a layer of software called middleware that extends over multiple machines

Figure: a distributed system organized as middleware (note that the middleware layer extends over multiple machines)

Goals of a distributed system: a distributed system should
• Easily connect users with resources (printers, computers, storage facilities, data, files, Web pages, ...)
  ◦ Reasons: economics, and to collaborate and exchange information
• Be transparent: hide the fact that the resources and processes are distributed across multiple computers
• Be open: resources can be added easily
• Be scalable: extra servers can be added easily

Transparency in a Distributed System

A distributed system that is able to present itself to users and applications as if it were only a single computer system is said to be transparent

• Different forms of transparency in a distributed system:
Transparency   Description
Access         Hide differences in data representation and how a resource is accessed
Location       Hide where a resource is physically located; e.g., where is http://www.prenhall.com/index.html? (naming)
Migration      Hide that a resource may move to another location
Relocation     Hide that a resource may be moved to another location while in use; e.g., mobile users using their wireless laptops
Replication    Hide that a resource is replicated
Concurrency    Hide that a resource may be shared by several competing users; a resource must be left in a consistent state
Failure        Hide the failure and recovery of a resource
Persistence    Hide whether a (software) resource is in memory or on disk


Openness in a Distributed System
A distributed system should be open, which requires well-defined interfaces
• Interoperability: components of different origin can communicate
• Portability: components work on different platforms
Another goal of an open distributed system is that it should be flexible and extensible: it should be easy to configure the system out of different components, and easy to add new components or replace existing ones
An open distributed system is a system that offers services according to standard rules that describe the syntax and semantics of those services; e.g., protocols in networks
• Standards are a necessity
• They should still allow competition in non-normative areas


In distributed systems, such services are often specified through interfaces, which are in turn often described using an Interface Definition Language (IDL)
• An IDL description specifies only syntax: the names of the functions, the types of the parameters, return values, possible exceptions, ...
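
As a rough illustration of what such an interface pins down, the sketch below writes a hypothetical printing service as a Python abstract base class instead of a real IDL. The names `PrintService`, `submit`, and `PrinterOffline` are invented for this example. The interface fixes names, parameter types, the return type, and a possible exception, but says nothing about what the service actually does.

```python
from abc import ABC, abstractmethod


class PrinterOffline(Exception):
    """Hypothetical exception declared as part of the interface."""


class PrintService(ABC):
    """Interface only: function name, parameter types, return type, exceptions."""

    @abstractmethod
    def submit(self, document: bytes, copies: int) -> int:
        """Queue a document and return a job id; may raise PrinterOffline."""


class LocalPrintService(PrintService):
    """One possible implementation; callers depend only on PrintService."""

    def __init__(self) -> None:
        self._next_job = 1

    def submit(self, document: bytes, copies: int) -> int:
        if copies < 1:
            raise ValueError("copies must be at least 1")
        job_id = self._next_job
        self._next_job += 1
        return job_id
```

Any other class that implements `submit` with the same signature can replace `LocalPrintService` without changes to its callers, which is the point of interoperability and portability.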

Scalability in Distributed Systems

A distributed system should be scalable with respect to
• size: more users and resources can be added to the system
• geography: users and resources may lie far apart
• administration: the system should be easy to manage even if it spans many administrative organizations


Scalability problems

Concept                  Example
Centralized services     A single server for all users (mostly for security reasons)
Centralized data         A single on-line telephone book
Centralized algorithms   Doing routing based on complete information

Table: examples of scalability limitations


Scaling Techniques
• How can scaling problems be solved?
• The problem is mainly one of performance, and arises from limitations in the capacity of servers and networks (the latter matters for geographical scalability)
• Three possible solutions: hiding communication latencies, distribution, and replication


a. Hide Communication Latencies
• Try to avoid waiting for responses to remote service requests
• Let the requester do other useful work, i.e., construct requesting applications that use only asynchronous communication instead of synchronous communication; when a reply arrives, the application is interrupted to handle it
• Good for batch processing and parallel applications, but not for interactive applications
• For interactive applications, move part of the job to the client to reduce communication; e.g., filling in a form and checking the entries
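
The asynchronous style described above can be sketched with Python's `asyncio`. This is a minimal sketch, not a real client: `remote_request` is a made-up stand-in in which `asyncio.sleep` plays the role of network latency. Both requests are sent before either reply is awaited, and the client does local work in the meantime.

```python
import asyncio


async def remote_request(name: str, delay: float) -> str:
    """Stand-in for a remote service call; the sleep models network latency."""
    await asyncio.sleep(delay)
    return f"reply to {name}"


async def client() -> list[str]:
    # Issue both requests without waiting for the first reply.
    pending = [
        asyncio.create_task(remote_request("A", 0.05)),
        asyncio.create_task(remote_request("B", 0.05)),
    ]
    # Useful local work, done before the client blocks on any reply.
    local_result = sum(range(1000))
    replies = await asyncio.gather(*pending)
    return [str(local_result), *replies]


def run_client() -> list[str]:
    return asyncio.run(client())
```

With synchronous calls the two latencies would add up; here they overlap with each other.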



Figure: the difference between letting
(a) a server check the correctness of field entries
(b) a client do the job
e.g., shipping code to the client is supported in Web applications, historically through Java applets and today typically through JavaScript


b. Distribution
• e.g., DNS (Domain Name System)
• Divide the name space into zones, so that no single server has to hold all names

Figure: an example of dividing the DNS name space into zones
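
The idea of splitting a name space into zones can be sketched as follows. This is a toy model, not the real DNS protocol: the zone names, the table layout, and the address (taken from a documentation-only range) are assumptions made for the example. Each zone knows only its own hosts and which child zone to delegate to.

```python
# Each zone holds only its own names plus delegations to child zones.
# Zone contents and the address are made up for illustration.
ZONES = {
    ".": {"delegate": {"org": "org-zone"}},
    "org-zone": {"delegate": {"example.org": "example-zone"}},
    "example-zone": {"hosts": {"www.example.org": "192.0.2.10"}},
}


def resolve(name: str) -> tuple[str, list[str]]:
    """Resolve name by walking zones from the root; return (address, zones visited)."""
    zone, visited = ".", []
    while True:
        visited.append(zone)
        data = ZONES[zone]
        if name in data.get("hosts", {}):
            return data["hosts"][name], visited
        # Follow the delegation whose suffix matches the name.
        for suffix, child in data.get("delegate", {}).items():
            if name == suffix or name.endswith("." + suffix):
                zone = child
                break
        else:
            raise KeyError(f"{name} not found")
```

No zone holds the whole name space, so load and data are spread over many servers.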



c. Replication

• Replicate components across a distributed system to increase availability and to balance load, leading to better performance
  ◦ Replication is decided by the owner of a resource
• Caching (a special form of replication) also reduces communication latency; it is decided by the user of a resource
• However, caching and replication may lead to consistency problems (see Chapter 6 - Consistency and Replication)
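
A small sketch of why caching creates consistency problems. The classes `Origin` and `CachingClient` and the explicit `invalidate` call are assumptions made for this example. The cached copy saves communication on repeated reads, but becomes stale as soon as the owner updates the original.

```python
class Origin:
    """The authoritative copy of the data, held by the resource owner."""

    def __init__(self) -> None:
        self.data = {"price": 100}

    def read(self, key: str) -> int:
        return self.data[key]

    def write(self, key: str, value: int) -> None:
        self.data[key] = value


class CachingClient:
    """Client-side cache: fast repeated reads, but copies can become stale."""

    def __init__(self, origin: Origin) -> None:
        self.origin = origin
        self.cache: dict[str, int] = {}

    def read(self, key: str) -> int:
        if key not in self.cache:      # miss: one remote access
            self.cache[key] = self.origin.read(key)
        return self.cache[key]         # hit: no communication

    def invalidate(self, key: str) -> None:
        self.cache.pop(key, None)
```

After `origin.write("price", 120)`, the client keeps returning 100 until its copy is invalidated. Deciding when and how to invalidate or update copies is the consistency problem.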



1.3 Types of Distributed Systems

• Three types: distributed computing systems, distributed information systems, and pervasive/embedded systems
1. Distributed Computing Systems
• Used for high-performance computing tasks
• Two types: cluster computing and grid computing
• Cluster Computing
  ◦ A collection of similar workstations or PCs (homogeneous), closely connected by means of a high-speed LAN
  ◦ Each node runs the same operating system
  ◦ Used for parallel programming, in which a single compute-intensive program is run in parallel on multiple machines


An example of a cluster computing system
• A master node runs middleware (containing libraries for parallel programs) and controls the other compute nodes
• It allocates tasks to the nodes, provides an interface to users, etc.
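
The master/compute-node split can be mimicked on one machine. In this sketch, threads in a single process stand in for compute nodes, which is an assumption made to keep the example self-contained; a real cluster would use separate machines and a parallel programming library. The function names are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def compute(chunk: list[int]) -> int:
    """Work done by one compute node: here, a sum of squares."""
    return sum(x * x for x in chunk)


def master(data: list[int], workers: int = 4) -> int:
    """Split the job, hand chunks to workers, and combine the partial results."""
    size = max(1, len(data) // workers)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(compute, chunks))
    return sum(partials)
```

The master only allocates tasks and collects results; every worker runs the same code on a different part of the data.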



Grid Computing

◦ "Resource sharing and coordinated problem solving in dynamic, multi-institutional virtual organizations" (Ian Foster)
◦ High degree of heterogeneity: no assumptions are made concerning hardware, operating systems, networks, administrative domains, security policies, etc.
◦ Globus is a software system for Grid Computing; read about the Globus Alliance at http://www.globus.org/

Cloud Computing

◦ The use of computing resources (hardware and software) that are delivered as a service over a network (typically the Internet), not as a product.

2. Distributed Information Systems

Problem: many networked applications exist, but interoperability among them is difficult

At the lowest level: wrap a number of requests into a single larger request and have it executed as a distributed transaction; either all or none of the requests are executed

At a higher level: let applications communicate directly with each other, i.e., Enterprise Application Integration (EAI)


Transaction Processing Systems (TPS)
◦ A transaction processing system distributes the work of a transaction over several sub-operations within networked applications
◦ Programming with transactions (typically on a database) requires special primitives, supplied either by the underlying distributed system or by the language runtime system
◦ The exact list of primitives depends on the type of application; procedure calls, ordinary statements, etc. can also be included
◦ e.g., assume the following banking operation
  • withdraw an amount x from account 1
  • deposit the amount x to account 2
◦ What happens if there is a problem after the first activity is carried out?
◦ Group the two operations into one transaction; either both are carried out or neither
◦ We need a way to roll back when a transaction is not completed
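
The banking example can be tried out with Python's built-in `sqlite3` module on a single in-memory database. This is a local sketch, not a distributed transaction; the table layout and the helper names are assumptions made for the example. Using the connection as a context manager commits if the block succeeds and rolls back if an exception escapes it.

```python
import sqlite3


def open_bank() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account (id INTEGER PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
    conn.commit()
    return conn


def transfer(conn: sqlite3.Connection, src: int, dst: int, x: int) -> bool:
    """Withdraw x from src and deposit it to dst as one transaction."""
    try:
        with conn:  # commit on success, roll back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (x, src)
            )
            cur = conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (x, dst)
            )
            if cur.rowcount != 1:
                # The withdrawal already happened; raising undoes it.
                raise LookupError("destination account does not exist")
        return True
    except (sqlite3.IntegrityError, LookupError):
        return False


def balances(conn: sqlite3.Connection) -> list[tuple[int, int]]:
    return conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
```

A transfer to a missing account fails after the withdrawal has been executed, yet the balances are unchanged afterwards because the withdrawal is rolled back.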
Properties of transactions, often referred to as ACID
1. Atomic: to the outside world, the transaction happens indivisibly; a transaction either happens completely or not at all; intermediate states are not seen by other processes
2. Consistent: the transaction does not violate system invariants; e.g., in an internal transfer in a bank, the amount of money in the bank must be the same as it was before the transfer (the law of conservation of money); this may be violated for a brief period of time, but the violation is not visible to other processes
3. Isolated or Serializable: concurrent transactions do not interfere with each other; if two or more transactions are running at the same time, the final result must look as though all transactions ran sequentially in some order
4. Durable: once a transaction commits, the changes are permanent; see later in Chapter 8 - Fault Tolerance

3. Distributed Pervasive Systems
• The distributed systems discussed so far are characterized by their stability: fixed nodes with a high-quality connection to a network
• There are also mobile and embedded computing devices, which are small, battery-powered, mobile, and connected wirelessly
• Three requirements for pervasive applications
  ◦ Embrace contextual changes: a device is aware that its environment may change all the time, e.g., changing its network access point
  ◦ Encourage ad hoc composition: devices are used in different ways by different users
  ◦ Recognize sharing as the default: devices join a system to access or provide information
• Examples of pervasive systems
  ◦ Home systems that integrate consumer electronics
  ◦ Electronic health care systems to monitor the well-being of individuals
  ◦ Sensor networks
  ◦ Read pages 26 - 30


Distributed Systems Examples (The Internet)
The Internet is a vast interconnected collection of computer networks of many types.
Its design enables a program running anywhere to address messages to programs anywhere else.
It allows its users to make use of many services, such as the WWW, e-mail, Web hosting, and file transfer.
Its set of services can be extended by adding new types of service (open-ended services).
Small organizations and individual users can access Internet services through Internet Service Providers (ISPs).
Independent intranets are linked together by high-transmission-capacity circuits called backbones.


Figure: a typical portion of the Internet, showing intranets, ISPs, a backbone, and a satellite link (legend: desktop computer, server, network link)

Distributed Systems Examples (Intranets)

An intranet is a portion of the Internet that is administered separately and whose local security policies are enforced by a configured boundary.

It is composed of several local area networks (LANs) linked by backbone connections, which allow its users to access the provided services.

It is connected to the Internet via a router, which allows its users to make use of Internet services elsewhere.

Many organizations protect their own services from unauthorized use by filtering incoming and outgoing messages using a firewall.


Figure: a typical intranet, showing desktop computers, local area networks, email servers, a Web server, a file server, print and other servers, and a router/firewall connecting to the rest of the Internet

Chapter Two
Architectural Styles
Contents
2.1 Introduction
2.2 Architectural Styles
2.3 System Architectures


Introduction to Architectural Styles

• An architectural style refers to the logical organization of a distributed system into software components
• A component is a modular unit with well-defined required and provided interfaces that is replaceable within its environment; it can be replaced provided that its interfaces are respected
• A connector is a mechanism that mediates communication, coordination, or cooperation among components; e.g., facilities for RPC, message passing, or streaming multimedia data
There are various architectural styles
• Layered architectures
• Object-based architectures
• Data-centered architectures
• Event-based architectures


Layered architectures
◦ Components are organized in a layered fashion, where a component at layer L(i) is allowed to call components at the underlying layer L(i-1), but not the other way around
◦ Requests go down the hierarchy and results flow upward
◦ e.g., network layers

Figure: the layered architectural style


• Object-based architectures
  ◦ Each object corresponds to a component, and these components are connected through a remote procedure call mechanism (matches the client-server paradigm)

Figure: the object-based architectural style


Data-centered architectures
◦ Processes communicate through a common repository; e.g., a shared distributed file system
Event-based architectures
◦ Processes communicate through the propagation of events (which can optionally carry data)
◦ Publish/subscribe systems
  • Processes publish events, and the middleware ensures that only those processes that subscribed to those events will receive them
◦ Processes are loosely coupled; they need not refer to each other explicitly
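
A minimal in-process sketch of publish/subscribe. The `Broker` class here is an invented stand-in for the middleware, and topics are plain strings, which is an assumption of the example. Publishers and subscribers know only the broker and the topic name, never each other.

```python
from collections import defaultdict
from typing import Callable

Handler = Callable[[str, object], None]


class Broker:
    """Minimal in-process stand-in for publish/subscribe middleware."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, data: object = None) -> int:
        """Deliver the event to subscribers of this topic only; return the count."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(topic, data)
        return len(handlers)
```

An event published on a topic with no subscribers is simply not delivered to anyone, which reflects the loose coupling between processes.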



• Shared data spaces
  ◦ Event-based architectures combined with data-centered architectures
  ◦ Processes are decoupled in time: they need not both be active when communication takes place

Figure: the shared data-space architectural style
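
Decoupling in time can be sketched with a tiny tuple space. The class and its `write`/`take` operations are assumptions of this example, loosely modeled on tuple-space systems, and everything lives in one process. A producer writes a tuple and may finish; a consumer started later retrieves it by pattern.

```python
class TupleSpace:
    """Tiny shared data space: writers and readers never call each other."""

    def __init__(self) -> None:
        self._tuples: list[tuple] = []

    def write(self, item: tuple) -> None:
        self._tuples.append(item)

    def take(self, pattern: tuple):
        """Remove and return the first tuple matching pattern (None = wildcard)."""
        for item in self._tuples:
            if len(item) == len(pattern) and all(
                p is None or p == v for p, v in zip(pattern, item)
            ):
                self._tuples.remove(item)
                return item
        return None
```

The consumer describes the data it wants rather than naming the process that produced it.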



System Architectures

• Refers to how processes are organized in a system, i.e., where software components are placed on real machines
• Deciding on software components, their interaction, and their placement is what system architecture is all about.
• Can be centralized, decentralized, or a hybrid


• Centralized: traditional client-server structure
  ◦ Vertical or hierarchical organization of communication and control paths (as in layered software architectures)
  ◦ Logical separation of functions into client (requesting process) and server (responder)
• Decentralized: peer-to-peer
  ◦ Horizontal rather than hierarchical communication and control
• Hybrid: combines elements of C/S and P2P
  ◦ Edge-server systems
  ◦ Collaborative distributed systems
• Classification of a system as centralized or decentralized refers primarily to the organization of communication and control.


Centralized Architectures

• Think in terms of clients requesting services from servers
• A server is a process implementing a specific service
• A client is a process that requests a service from a server by sending a request and waiting for a reply
• This gives request-reply behavior

Figure: server and client interaction


Communication between client and server can be

◦ By a connectionless protocol, if the underlying network is fairly reliable; efficient since there is not much overhead
  • But assuring reliability is difficult
  • We also do not know the source of an error; for instance, was the request or the reply lost?
  • When messages are lost or corrupted, let the client send the request again; applicable only for idempotent operations
  • An operation is idempotent if it can be repeated multiple times without harm; e.g., reading a record in a database
  • But transferring an amount to a bank account is not idempotent
◦ By a reliable connection-oriented protocol, if the underlying network is unreliable
  • Establishing and terminating connections is expensive
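
The retry rule above can be made concrete with a small simulation. `LossyServer` is a hypothetical server whose replies are lost a fixed number of times even though the request was executed, and `call_with_retry` is an invented helper. Retrying a read is harmless, while retrying a deposit applies it twice.

```python
class LossyServer:
    """Hypothetical server whose replies get lost a fixed number of times."""

    def __init__(self, lost_replies: int) -> None:
        self.lost_replies = lost_replies
        self.balance = 100

    def _maybe_lose_reply(self) -> None:
        if self.lost_replies > 0:
            self.lost_replies -= 1
            raise TimeoutError("reply lost")  # the request itself was executed

    def read_balance(self) -> int:            # idempotent
        self._maybe_lose_reply()
        return self.balance

    def deposit(self, amount: int) -> int:    # not idempotent
        self.balance += amount
        self._maybe_lose_reply()
        return self.balance


def call_with_retry(operation, *args, attempts: int = 3):
    """Resend the request after a timeout; safe only for idempotent operations."""
    for _ in range(attempts):
        try:
            return operation(*args)
        except TimeoutError:
            continue
    raise TimeoutError("no reply after retries")
```

With one lost reply, `call_with_retry(server.deposit, 10)` leaves the balance at 120 rather than 110, because the client cannot tell a lost request from a lost reply.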



Client-Server Architectures

• Processes are divided into two groups (clients and servers).
• Synchronous communication: request-reply protocol
• In LANs, often implemented with a connectionless protocol (unreliable)
• In WANs, communication is typically connection-oriented, over TCP/IP (reliable)
  ◦ High likelihood of communication failures


Layered (software) Architecture for Client-Server Systems

• User-interface level: GUIs (usually) for interacting with end users
• Processing level: data processing applications; the core functionality
• Data level: interacts with the database or file system
  ◦ Data is usually persistent: it exists even if no client is accessing it
  ◦ Kept in a file or database system


Examples

• Web search engine
  ◦ Interface: type in a keyword string
  ◦ Processing level: processes to generate DB queries, rank replies, format response
  ◦ Data level: database of web pages
• Stock broker's decision support system
  ◦ Interface: likely more complex than simple search
  ◦ Processing: programs to analyze data; rely on statistics, perhaps AI; may require large simulations
  ◦ Data level: DB of financial information
• Desktop "office suites"
  ◦ Interface: access to various documents, data, ...
  ◦ Processing: word processing, database queries, spreadsheets, ...
  ◦ Data level: file systems and/or databases


Application Layering



System Architecture

• Mapping the software architecture to system hardware
  ◦ Correspondence between logical software modules and actual computers
• Multi-tiered architectures
  ◦ Layer and tier are roughly equivalent terms, but layer typically implies software and tier is more likely to refer to hardware.
  ◦ Two-tier and three-tier are the most common system architectures


Two-tiered C/S Architectures

• Server provides processing and data management; client provides a simple graphical display (thin client)
  ◦ Con: perceived performance loss at the client
  ◦ Pro: easier to manage, more reliable, and client machines don't need to be so large and powerful
• At the other extreme, all application processing and some data reside at the client (fat-client approach)
  ◦ Pro: reduces workload at the server; more scalable
  ◦ Con: harder for system administrators to manage; less secure


• How to physically distribute a client-server application across several machines

Figure: two-tiered architecture, alternative client-server organizations (a) to (e)


(a) Put only the terminal-dependent part of the user interface on the client machine and let the applications remotely control the presentation
(b) Put the entire user-interface software on the client side
(c) Move part of the application to the client; e.g., checking correctness while filling in forms
(a) to (c) are for thin clients
(d) and (e) are for powerful client machines, called fat clients (more popular)
(d) and (e) are difficult to manage, since client-side software is distributed and error-prone; it is also dependent on the client's platform, such as the operating system


Three-tiered Architectures

• In some applications, servers may also need to act as clients, leading to a three-level architecture
  ◦ Distributed transaction processing
  ◦ Web servers that interact with database servers
• Functionality is distributed across three levels of machines instead of two.


Multitiered Architectures (3 Tier Architecture)



Decentralized Architectures
◦ Vertical distribution: refers to the architectures discussed so far, where the different tiers correspond directly to the logical organization of applications; logically different components are placed on different machines
◦ Horizontal distribution: physically split up the client or the server into logically equivalent parts
◦ An example is a peer-to-peer system, where processes are equal and hence each process acts as a client and a server at the same time (a servent)
◦ Read about the different approaches to peer-to-peer architecture and about Architectures versus Middleware


Another example is the horizontal distribution of a Web service
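
Horizontal distribution of a Web service can be sketched as a front end that hands requests to logically equivalent replicas. Round-robin dispatching and the class names are assumptions of this example; other dispatching policies are equally possible.

```python
from itertools import cycle


class Replica:
    """One of several logically equivalent copies of a Web server."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.handled = 0

    def handle(self, path: str) -> str:
        self.handled += 1
        return f"{self.name} served {path}"


class RoundRobinFrontEnd:
    """Front end that spreads incoming requests over the replicas in turn."""

    def __init__(self, replicas: list[Replica]) -> None:
        self._next = cycle(replicas)

    def handle(self, path: str) -> str:
        return next(self._next).handle(path)
```

Because every replica can serve every request, adding a replica raises capacity without changing the clients.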



Centralized vs. Decentralized Architectures

• Traditional client-server architectures exhibit vertical distribution: each level serves a different purpose in the system.
  ◦ Logically different components reside on different nodes
• Horizontal distribution (P2P): each node has roughly the same processing capabilities and stores/manages part of the total system data.
  ◦ Better load balancing and more resistant to denial-of-service attacks, but harder to manage than C/S
  ◦ Communication and control are not hierarchical; all nodes are about equal
