Distributed Systems Chapter 1-Introduction

Uploaded by

eyob daggy Girma

Chapter 1 - Introduction

1.1 Introduction and Definition


 before the mid-80s, computers were
 very expensive (hundreds of thousands or even millions
of dollars)
 very slow (a few thousand instructions per second)
 not connected among themselves
 after the mid-80s: two major developments
 cheap and powerful microprocessor-based computers
appeared
 computer networks
 LANs at speeds ranging from 10 to 1000 Mbps
 WANs at speeds ranging from 64 Kbps to gigabits/sec
 consequence
 feasibility of using a large network of computers to
work for the same application; this is in contrast to the
old centralized systems where there was a single
computer with its peripherals
 Definition of a Distributed System
 a distributed system is:
a collection of independent computers that appears to its
users as a single coherent system - computer (Tanenbaum
& Van Steen)

 this definition has two aspects:


1. hardware: autonomous machines
2. software: a single system view for the users

 Other Definitions
A distributed system is a system designed to support the
development of applications and services which can exploit a
physical architecture consisting of multiple, autonomous
processing elements that do not share primary memory but
cooperate by sending asynchronous messages over a
communication network (Blair & Stefani)

A distributed system is one that stops you getting any work
done when a machine you’ve never even heard of crashes
(Leslie Lamport)

 Why Distributed?
 Resource and Data Sharing
 printers, databases, multimedia servers, ...
 Availability, Reliability
 the loss of some instances can be hidden
 Scalability, Extensibility
 the system grows with demand (e.g., extra servers)
 Performance
 huge power (CPU, memory, ...) available
 Inherent distribution, communication
 organizational distribution, e-mail, video

 Problems of Distribution
 Concurrency, Security
 clients must not disturb each other
 Privacy
 e.g., when building a preference profile
 unwanted communication such as spam
 Partial failure
 we often do not know where the error is (e.g., RPC)
 Location, Migration, Replication
 clients must be able to find their servers
 Heterogeneity
 hardware, platforms, languages, management

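The partial-failure problem listed above can be made concrete with a small simulation. This is only a sketch under stated assumptions: `CounterServer`, `Fault`, and `call_increment` are hypothetical names, and a local call stands in for a real RPC. It shows that a client-side timeout looks the same whether the request or the reply was lost, although the server ends up in different states.

```python
from enum import Enum


class Fault(Enum):
    NONE = "no fault"
    REQUEST_LOST = "request lost"
    REPLY_LOST = "reply lost"


class CounterServer:
    """Hypothetical server holding one piece of state."""

    def __init__(self):
        self.counter = 0

    def increment(self):
        self.counter += 1
        return self.counter


def call_increment(server, fault):
    """Simulated remote call; None stands for a client-side timeout."""
    if fault is Fault.REQUEST_LOST:
        return None                  # the server never saw the request
    result = server.increment()      # the server did the work
    if fault is Fault.REPLY_LOST:
        return None                  # the client never hears about it
    return result


def observe(fault):
    """Return (what the client sees, the server's actual state)."""
    server = CounterServer()
    return call_increment(server, fault), server.counter


for fault in Fault:
    print(fault.value, observe(fault))
```

In both failure cases the client sees only a timeout, so it cannot tell whether retrying would apply the increment twice.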
 Characteristics of Distributed Systems
 differences between the computers and the ways they
communicate are hidden from users
 users and applications can interact with a distributed system
in a consistent and uniform way regardless of location
 distributed systems should be easy to expand and scale
 a distributed system is normally continuously available, even
if there may be partial failures

1.2 Goals of a Distributed System
 to support heterogeneous computers and networks and to
provide a single-system view, a distributed system is
often organized by means of a layer of software called
middleware that extends over multiple machines

a distributed system organized as middleware; note that the
middleware layer extends over multiple machines, and offers each
application the same interface
Ack: most drawings in all slides are taken from the textbook
 Goals of a distributed system: a distributed system should
 easily connect users with resources (printers, computers,
storage facilities, data, files, Web pages, ...)
 reasons: economics, to collaborate and exchange
information
 be transparent: hide the fact that the resources and
processes are distributed across multiple computers
 be open
 be scalable
Transparency in a Distributed System
 a distributed system that is able to present itself to users
and applications as if it were only a single computer
system is said to be transparent

 different forms of transparency in a distributed system
Transparency   Description
Access         Hide differences in data representation (endianness, file naming, ...) and how a resource is accessed
Location       Hide where a resource is physically located; where is http://www.prenhall.com/index.html? (naming)
Migration      Hide that a resource may move to another location
Relocation     Hide that a resource may be moved to another location while in use; e.g., mobile users using their wireless laptops
Replication    Hide that a resource is replicated
Concurrency    Hide that a resource may be shared by several competitive users; a resource must be left ...
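As a small illustration of access transparency, the sketch below marshals an integer in a fixed network byte order, so that neither side needs to know the other's endianness. It uses Python's standard `struct` module; the function names `marshal_u32` and `unmarshal_u32` are assumptions made for this example and do not come from the slides.

```python
import struct


def marshal_u32(value: int) -> bytes:
    """Encode an unsigned 32-bit integer in network (big-endian) byte order."""
    return struct.pack("!I", value)


def unmarshal_u32(data: bytes) -> int:
    """Decode an unsigned 32-bit integer received in network byte order."""
    (value,) = struct.unpack("!I", data)
    return value


wire = marshal_u32(1)
print(wire.hex())            # 00000001 on any host, little- or big-endian
print(unmarshal_u32(wire))   # 1
```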
 Openness in a Distributed System
 a distributed system should be open
 we need well-defined interfaces
 interoperability
 components of different origin can communicate
 portability
 components work on different platforms
 another goal of an open distributed system is that it should
be flexible and extensible; easy to configure the system out
of different components; easy to add new components,
replace existing ones; easier said than done
 an Open Distributed System is a system that offers services
according to standard rules that describe the syntax and
semantics of those services; e.g., protocols in networks
 standards - a necessity
 should allow competition in non-normative areas
 in distributed systems, such services are often specified
through interfaces often described using an Interface
Definition Language (IDL)
 specify only syntax: the names of the functions, types
of parameters, return values, possible exceptions, ...
 Semantics are given in an informal way by means of
natural languages
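Python has no IDL of its own, but `typing.Protocol` can play a similar role in a sketch: it fixes only names, parameter types, and return types. The `FileService` interface and its in-memory implementation below are hypothetical and only illustrate the split between syntax (in the interface) and semantics (described informally).

```python
from typing import Protocol


class FileService(Protocol):
    """Syntax only: method names, parameter types, return types."""

    def read(self, name: str, offset: int, length: int) -> bytes: ...

    def write(self, name: str, offset: int, data: bytes) -> int: ...


class InMemoryFileService:
    """One possible implementation; any class with matching methods conforms."""

    def __init__(self):
        self._files = {}

    def read(self, name: str, offset: int, length: int) -> bytes:
        return bytes(self._files.get(name, bytearray())[offset:offset + length])

    def write(self, name: str, offset: int, data: bytes) -> int:
        buf = self._files.setdefault(name, bytearray())
        if len(buf) < offset:
            # semantic choice NOT visible in the interface: pad gaps with zeros
            buf.extend(b"\x00" * (offset - len(buf)))
        buf[offset:offset + len(data)] = data
        return len(data)


service: FileService = InMemoryFileService()
service.write("notes.txt", 0, b"hello")
print(service.read("notes.txt", 1, 3))   # b'ell'
```

What happens when writing past the end of a file is exactly the kind of semantics the interface cannot express and that would be described in natural language.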

 Scalability in Distributed Systems


 a distributed system should be scalable
 size: adding more users and resources to the system
 geographically: users and resources may be far apart
 administratively: should be easy to manage even if it
spans many administrative organizations
 but a scalable system may exhibit performance problems

 scalability problems

Concept                  Example
Centralized services     Single server for all users, mostly for security reasons
Centralized data         A single on-line telephone book
Centralized algorithms   Doing routing based on complete information

examples of scalability limitations

 Scaling Techniques
 how to solve scaling problems
 the problem is mainly performance, and arises as a result
of limitations in the capacity of servers and networks (for
geographical scalability)
 three possible solutions: hiding communication latencies,
distribution, and replication

a. Hide Communication Latencies
 try to avoid waiting for responses to remote service
requests
 let the requester do other useful work
 i.e., construct requesting applications that use only
asynchronous communication instead of synchronous
communication; when a reply arrives the application is
interrupted
 good for batch processing and parallel applications but
not for interactive applications
 for interactive applications, move part of the job to the
client to reduce communication; e.g. filling a form and
checking the entries
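The asynchronous style described above can be sketched with Python's standard `asyncio` module. This is a minimal sketch, not a real client: `remote_request` is a hypothetical stand-in in which `asyncio.sleep` simulates network and server delay. All requests are started before any reply is awaited, so the waiting times overlap.

```python
import asyncio


async def remote_request(name: str, latency: float) -> str:
    """Hypothetical remote call; the sleep stands in for network and server time."""
    await asyncio.sleep(latency)
    return f"reply to {name}"


async def fetch_all(names, latency=0.05):
    # start every request before waiting for any reply
    tasks = [asyncio.create_task(remote_request(n, latency)) for n in names]
    # the requester is free to do other useful work here
    return await asyncio.gather(*tasks)


replies = asyncio.run(fetch_all(["a", "b", "c"]))
print(replies)   # ['reply to a', 'reply to b', 'reply to c']
```

With synchronous calls the three delays would add up; here the total wait is roughly one delay.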

(a) a server checking the correctness of field entries
(b) a client doing the job
 e.g., checking the completeness of mandatory fields
 shipping code is supported in Web applications, e.g., using Java applets and JavaScript

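Case (b), where the client does the checking, might look like the following sketch. The field names and the `submit` helper are hypothetical; the point is only that an incomplete form is rejected locally, without a round trip to the server.

```python
MANDATORY_FIELDS = ("first_name", "last_name", "email")  # hypothetical form


def missing_fields(form: dict) -> list:
    """Return the mandatory fields that are absent or blank."""
    return [f for f in MANDATORY_FIELDS if not str(form.get(f, "")).strip()]


def submit(form: dict, send) -> list:
    """Send only complete forms; otherwise report what is missing."""
    missing = missing_fields(form)
    if missing:
        return missing      # rejected on the client, nothing is sent
    send(form)              # `send` stands in for the real network call
    return []


sent = []
print(submit({"first_name": "Ada"}, sent.append))   # ['last_name', 'email']
print(len(sent))                                    # 0
```

A real server would still re-check the entries, since client-side checks can be bypassed.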
b. Distribution
 e.g., DNS - Domain Name System
 divide the name space into nonoverlapping zones
 for details, see later in Chapter 5 - Naming

an example of dividing the DNS name space into zones
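The idea of nonoverlapping zones can be sketched as follows. This is a toy model and not how a real DNS resolver is implemented: `ZoneServer`, the names, and the address (taken from a range reserved for documentation) are all assumptions. Each server knows only its own zone and which server is responsible for each child zone, so no single server holds the whole name space.

```python
class ZoneServer:
    """Toy name server responsible for exactly one zone."""

    def __init__(self, zone):
        self.zone = zone          # "" for the root zone
        self.records = {}         # full name -> address, for names in this zone
        self.delegations = {}     # child zone suffix -> ZoneServer


def resolve(root, name):
    """Follow delegations from the root; return (address, zones visited)."""
    server, visited = root, []
    while True:
        visited.append(server.zone or ".")
        if name in server.records:
            return server.records[name], visited
        child = next(
            (s for suffix, s in server.delegations.items()
             if name == suffix or name.endswith("." + suffix)),
            None,
        )
        if child is None:
            raise KeyError(name)
        server = child


root, nl, vu = ZoneServer(""), ZoneServer("nl"), ZoneServer("vu.nl")
root.delegations["nl"] = nl
nl.delegations["vu.nl"] = vu
vu.records["www.vu.nl"] = "192.0.2.10"   # made-up address

print(resolve(root, "www.vu.nl"))   # ('192.0.2.10', ['.', 'nl', 'vu.nl'])
```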


c. Replication
 replicate components across a distributed system to
increase availability and for load balancing, leading to
better performance
 decided by the owner of a resource
 caching (a special form of replication) also reduces
communication latency; decided by the user
 but, caching and replication may lead to consistency
problems (see Chapter 7 - Consistency and Replication)
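The consistency problem mentioned above can be shown with a toy cache. The classes `Origin` and `CachingClient` are hypothetical; the sketch only demonstrates that a cached copy saves remote reads but goes stale once the original changes, until it is invalidated.

```python
class Origin:
    """The resource owner's copy of the data."""

    def __init__(self):
        self.data = {}
        self.reads = 0            # counts remote reads actually served

    def read(self, key):
        self.reads += 1
        return self.data[key]

    def write(self, key, value):
        self.data[key] = value


class CachingClient:
    """Client that keeps local copies to avoid repeated remote reads."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}

    def read(self, key):
        if key not in self.cache:
            self.cache[key] = self.origin.read(key)
        return self.cache[key]

    def invalidate(self, key):
        self.cache.pop(key, None)


origin = Origin()
client = CachingClient(origin)
origin.write("price", 10)
first = client.read("price")      # 10, fetched from the origin
origin.write("price", 12)
stale = client.read("price")      # still 10: the cached copy is out of date
client.invalidate("price")
fresh = client.read("price")      # 12, fetched again
print(first, stale, fresh, origin.reads)   # 10 10 12 2
```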

Pitfalls when Developing Distributed Systems
 False assumptions made by first-time developers
 The network is reliable
 The network is secure
 The network is homogeneous
 The topology does not change
 Latency is zero
 Bandwidth is infinite
 Transport cost is zero
 There is one administrator

1.3 Types of Distributed Systems
 Three types: distributed computing systems, distributed
information systems, and distributed embedded (pervasive) systems
1. Distributed Computing Systems
 Used for high-performance computing tasks
 two types: cluster computing and grid computing
 Cluster Computing
 a collection of similar workstations or PCs
(homogeneous), closely connected by means of a
high-speed LAN
 each node runs the same operating system
 used for parallel programming in which a single
compute intensive program is run in parallel on
multiple machines

an example of a cluster computing system

 Grid Computing
 “Resource sharing and coordinated problem solving
in dynamic, multi-institutional virtual organizations”
(I. Foster)
 high degree of heterogeneity: no assumptions are
made concerning hardware, operating systems,
networks, administrative domains, security policies,
etc.
2. Distributed Information Systems
 problem: many networked applications with a problem of
interoperability
 at the lowest level: wrap a number of requests into a
single larger request and have it executed as a
distributed transaction
 how to let applications communicate directly with each
other, i.e., Enterprise Application Integration (EAI)

 Transaction Processing Systems
 Consider database applications
 special primitives are required to program transactions,
supplied either by the underlying distributed system or
by the language runtime system
 exact list of primitives depends on the type of application

Primitive            Description
BEGIN_TRANSACTION    Mark the start of a transaction
END_TRANSACTION      Terminate the transaction and try to commit
ABORT_TRANSACTION    Kill the transaction and restore the old values
READ                 Read data from a file, a table, or otherwise
WRITE                Write data to a file, a table, or otherwise
 The Transaction Model
 the model for transactions comes from the world of
business
 a supplier and a retailer negotiate on
 price
 delivery date
 quality
 etc.
 until the deal is concluded they can continue
negotiating or one of them can terminate
 but once they have reached an agreement they are
bound by law to carry out their part of the deal
 transactions between processes are similar to this
scenario

 e.g., assume the following banking operation
 withdraw an amount x from account 1
 deposit the amount x to account 2
 what happens if there is a problem after the first activity
is carried out?
 group the two operations into one transaction; either
both are carried out or neither
 we need a way to roll back when a transaction is not
completed
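The banking example can be tried out with Python's standard `sqlite3` module. This is a sketch under stated assumptions: the `account` table and the `transfer` helper are made up for illustration, and a single in-memory database stands in for the bank. The withdrawal and the deposit run inside one transaction, so a failure after the first step rolls both back.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates happen or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM account WHERE id = ?", (src,)
            ).fetchone()
            if balance < 0:
                # a problem detected after the first step: abort everything
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

print(transfer(conn, 1, 2, 30), balances(conn))    # True {1: 70, 2: 80}
print(transfer(conn, 1, 2, 500), balances(conn))   # False {1: 70, 2: 80}
```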

 e.g. reserving a seat from White Plains to Malindi through
JFK and Nairobi airports

(a)                              (b)
BEGIN_TRANSACTION                BEGIN_TRANSACTION
  reserve WP → JFK;                reserve WP → JFK;
  reserve JFK → Nairobi;           reserve JFK → Nairobi;
  reserve Nairobi → Malindi;       reserve Nairobi → Malindi full ⇒
END_TRANSACTION                  ABORT_TRANSACTION

(a) transaction to reserve three flights commits
(b) transaction aborts when third flight is unavailable

 properties of transactions, often referred to as ACID
1. Atomic: to the outside world, the transaction happens
indivisibly; a transaction either happens completely or
not at all; intermediate states are not seen by other
processes
2. Consistent: the transaction does not violate system
invariants; e.g., in an internal transfer in a bank, the
amount of money in the bank must be the same as it
was before the transfer (the law of conservation of
money); this may be violated for a brief period of time,
but the violation is not visible to other processes
3. Isolated or Serializable: concurrent transactions do not
interfere with each other; if two or more transactions
are running at the same time, the final result must look
as though all transactions ran sequentially in some
order
4. Durable: once a transaction commits, the changes are
permanent; see later in Chapter 8
 Classification of Transactions
 a transaction could be flat, nested or distributed
 Flat Transaction
 consists of a series of operations that satisfy the ACID
properties
 simple and widely used but with some limitations
 does not allow partial results to be committed or aborted
 i.e., atomicity is also partly a weakness
 in our airline reservation example, we may want to
accept the first two reservations and find an
alternative one for the last
 some transactions may take too much time

 Nested Transaction
 constructed from a number of subtransactions; it is
logically decomposed into a hierarchy of
subtransactions
 the top-level transaction forks off children that run in
parallel, on different machines; to gain performance or
for programming simplicity
 each may also execute one or more subtransactions
 permanence (durability) applies only to the top-level
transaction; commits by children must be undone if the
parent aborts
 Distributed Transaction
 a flat transaction that operates on data that are
distributed across multiple machines
 problem: separate algorithms are needed to handle the
locking of data and committing the entire transaction;
see later in Chapter 8 for distributed commit

(a) a nested transaction
(b) distributed transaction
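The rule that durability applies only to the top-level transaction can be sketched with private workspaces. This is one possible illustration, not the mechanism of any particular system; the `Transaction` class is hypothetical and ignores locking and concurrency. A child's commit only hands its updates to its parent, so aborting the parent discards them.

```python
class Transaction:
    """Toy transaction with a private workspace (illustrative only)."""

    def __init__(self, store, parent=None):
        self.store = store      # permanent data, changed only by a top-level commit
        self.parent = parent
        self.writes = {}        # tentative updates made by this transaction

    def begin_child(self):
        return Transaction(self.store, parent=self)

    def read(self, key):
        txn = self
        while txn is not None:  # the nearest tentative value wins
            if key in txn.writes:
                return txn.writes[key]
            txn = txn.parent
        return self.store[key]

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # a child's commit becomes visible to its parent only
        target = self.store if self.parent is None else self.parent.writes
        target.update(self.writes)
        self.writes = {}

    def abort(self):
        # dropping the workspace also undoes commits made by children
        self.writes = {}


store = {"seats": 5}
top = Transaction(store)
child = top.begin_child()
child.write("seats", child.read("seats") - 1)
child.commit()
seen_by_parent = top.read("seats")   # 4: the child's commit reached the parent
top.abort()
after_abort = store["seats"]         # 5: nothing became permanent
print(seen_by_parent, after_abort)   # 4 5
```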

 Enterprise Application Integration
 how to integrate applications independent from their
databases
 transaction systems rely on request/reply
 how can applications communicate with each other

middleware as a communication facilitator in enterprise application
integration
 there are different communication models
 RPC (Remote Procedure Call)
 RMI (Remote Method Invocation)
 MOM (Message-Oriented Middleware)
 see later in Chapter 4
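The difference between the call-based models and the message-oriented model can be hinted at with a toy sketch. Everything here is an assumption for illustration: `rpc_style` uses a local call in place of the network, and `MessageQueue` is a hypothetical in-process stand-in for real middleware. In the first case the caller waits for the result; in the second the sender only hands over a message and the receiver may pick it up later.

```python
from collections import deque


def add(a, b):
    return a + b


def rpc_style(a, b):
    """The caller waits for the result (a local call stands in for the network)."""
    return add(a, b)


class MessageQueue:
    """Hypothetical stand-in for message-oriented middleware."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)   # returns at once; no receiver needed yet

    def receive(self):
        return self._messages.popleft() if self._messages else None


print(rpc_style(2, 3))                   # 5

mq = MessageQueue()
mq.send({"op": "add", "args": (2, 3)})   # the sender continues immediately
message = mq.receive()                   # later, possibly in another process
result = add(*message["args"])
print(result)                            # 5
```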
3. Distributed Pervasive Systems
 the distributed systems discussed so far are
characterized by their stability: fixed nodes with a
high-quality connection to a network
 there are also mobile and embedded computing devices
with wireless connections

 three requirements for pervasive applications
 embrace contextual changes: a device is aware that
its environment may change all the time
 encourage ad hoc composition: devices are used in
different ways by different users
 recognize sharing as the default: devices join a
system to access or provide information
 examples of pervasive systems
 Home Systems
 Electronic Health Care Systems
 Sensor Networks
 read pages 27 - 30
