Lect 3
Distributed Systems
Communication
in Distributed Systems
Christoph Kessler
IDA
Linköping University
Sweden
2023
Agenda
Communication Models
and their Layered Implementation
Applications and Services
RMI, RPC
Middleware
Request and Reply
In this chapter:
▪ Communication between distributed objects by means of two models:
▪ Remote Method Invocation (RMI)
▪ Remote Procedure Call (RPC)
▪ Both RMI and RPC are implemented on top of request and reply primitives.
▪ Request and reply are implemented on top of the network protocol
(e.g. TCP or UDP in the case of the Internet).
Network Protocol
▪ Middleware and distributed applications are implemented on top of
a network protocol. Such a protocol is implemented as several
layers.
▪ In the case of the Internet, the layers are (top to bottom):
  Applications and Services
  Middleware
  TCP or UDP
  IP
Request-Reply Primitives
in the Client-Server Model
▪ The system is structured as a group of processes (objects), called
servers, that deliver services to clients.
Client:
  …
  send (request) to server_reference;
  receive (reply);
  …
Server:
  …
  receive (request) from client_reference;
  execute requested operation
  send (reply) to client_reference;
  …
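The request-reply exchange above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two processes are simulated by two threads connected through `socket.socketpair()`, the "requested operation" is a toy upper-casing of the request text, and the names `serve_once`, `do_request` and `demo` are made up for this example.

```python
import socket
import threading

def serve_once(conn: socket.socket) -> None:
    """Server side: receive one request, execute it, send the reply."""
    request = conn.recv(1024).decode()   # receive (request) from client
    reply = request.upper()              # execute requested operation (toy example)
    conn.sendall(reply.encode())         # send (reply) to client

def do_request(conn: socket.socket, request: str) -> str:
    """Client side: send the request, then block until the reply arrives."""
    conn.sendall(request.encode())       # send (request) to server
    return conn.recv(1024).decode()      # receive (reply)

def demo() -> str:
    # A connected pair of sockets stands in for the network link.
    client_end, server_end = socket.socketpair()
    server = threading.Thread(target=serve_once, args=(server_end,))
    server.start()
    try:
        return do_request(client_end, "hello")
    finally:
        server.join()
        client_end.close()
        server_end.close()
```

The client blocks in `recv` until the server has executed the operation, which mirrors the synchronous send/receive pairing on the slide.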
Remote Method Invocation (RMI) and
Remote Procedure Call (RPC)
RMI, RPC
Middleware
Request and Reply
Operating System and Network Protocol
Implementation of RMI
Question 1
▪ What if the two computers use different representations for data
(integers, chars, floating point)?
▪ The most elegant and flexible solution is to have a standard
representation used for all values sent through the network.
▪ The proxy and skeleton convert to/from this representation
during marshalling/unmarshalling.
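As a small illustration of a standard on-the-wire representation, the sketch below marshals three values with Python's `struct` module in network byte order. The wire format (a 32-bit integer, a 64-bit float and a one-byte character) and the function names `marshal` and `unmarshal` are assumptions chosen for this example; real middleware defines its own encoding.

```python
import struct

# Hypothetical wire format: int32, float64, one-byte char.
# "!" selects network byte order (big-endian), standard sizes, no padding.
WIRE_FORMAT = "!idc"

def marshal(account_id: int, amount: float, flag: str) -> bytes:
    """Proxy side: convert local values to the standard representation."""
    return struct.pack(WIRE_FORMAT, account_id, amount, flag.encode("ascii"))

def unmarshal(data: bytes):
    """Skeleton side: convert the standard representation back to local values."""
    account_id, amount, flag = struct.unpack(WIRE_FORMAT, data)
    return account_id, amount, flag.decode("ascii")
```

Because both sides agree on `WIRE_FORMAT`, the byte layout is the same regardless of the native endianness or word size of either machine.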
Question 2
▪ Who generates the classes for proxy and skeleton?
▪ In advanced middleware systems (e.g. CORBA) the classes for
proxies and skeletons can be generated automatically.
Given the specification of the server interface and the
standard data representations, an interface compiler can
generate the (source) code for proxies and skeletons.
Implementation of RMI
Server skeleton
RMI Semantics and Failures
Problem!
What if the request message was not truly lost (but, for example,
the server is too slow) and the server receives it more than once?
▪ We must prevent the server from executing an operation more than once.
▪ Each message must carry a unique identifier,
and copies of the same message have to be filtered out:
▪ If the duplicate arrives and the server has not yet sent the reply
→ simply send the reply.
▪ If the duplicate arrives after the reply has been sent
→ the reply may have been lost or it did not arrive in time.
Lost Reply Message
The client cannot distinguish the loss of a request from that of a reply;
it simply resends the request because no answer has been received!
▪ If the reply really got lost → when the duplicate request arrives at
the server, the server has already executed the operation once!
▪ In order to resend the reply, the server may need to re-execute the
operation in order to get the result.
Danger?!
▪ Some operations can be executed more than once without any
problem; they are called idempotent operations
→ no danger with executing the duplicate request.
▪ There are operations which cannot be executed repeatedly without
changing the effect (e.g. transferring an amount of money between
two accounts)
→ history can be used to avoid re-execution.
History (log): stores a record of reply messages that have been
transmitted, together with the message identifier and the client which it
has been sent to.
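Duplicate filtering combined with a history can be sketched as follows. This is a minimal in-memory illustration, not a definitive implementation: the class name `Server`, the toy withdraw operation, the starting balance and the `executions` counter are all assumptions made for the example.

```python
class Server:
    """Toy server that filters duplicates and keeps a history of replies."""

    def __init__(self):
        self.balance = 100      # hypothetical account state
        self.executions = 0     # how many times the operation really ran
        self.history = {}       # (client_id, message_id) -> reply already sent

    def handle(self, client_id, message_id, amount):
        key = (client_id, message_id)
        if key in self.history:
            # Duplicate request: resend the logged reply, do NOT re-execute.
            return self.history[key]
        # First time this message is seen: execute the non-idempotent operation.
        self.balance -= amount
        self.executions += 1
        reply = self.balance
        self.history[key] = reply   # record the reply for possible retransmission
        return reply
```

If the client resends message 1 because the reply was lost, the second call to `handle` returns the stored reply and the balance is debited only once.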
Conclusion with Lost Messages
▪ Exactly-once semantics can be implemented in the case of lost
(request or reply) messages if both duplicate filtering and history
are provided and the message is resent until an answer arrives:
▪ Eventually a reply arrives at the client and the call has been
executed correctly - exactly one time.
Server Crash
(a) The normal sequence:
▪ When the client got an answer, the RMI has been carried out at least one
time, but possibly more.
▪ If the client got an answer, the RMI has been executed exactly once.
▪ If the client got a failure message, the RMI has been carried out at most
one time, but possibly not at all.
Problems:
▪ waste of server CPU time
▪ locked resources (files, peripherals, etc.)
▪ if the client reboots and repeats the RMI, confusion can be
created.
Summary - RMI Semantics and Failures
In practical applications, servers can survive crashes without loss of
memory.
▪ Transaction-based sophisticated commitment protocols are
implemented in distributed database systems to achieve this goal.
▪ In such systems, history can be used
and duplicates can be filtered out after restart of the server:
The client can keep resending requests without the danger of
operations being executed more than once:
– If no answer is received after a certain number of attempts,
the client is notified, so it knows that the method has been
executed at most once, and possibly not at all.
– If an answer is received, it is forwarded to the client, which
knows that the method has been executed exactly one time.
→ Very Rigid!
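The bounded-retry behaviour described in this summary can be sketched as below. The function `call_with_retries` and the `FlakyTransport` test double are hypothetical names for this illustration; a reply of `None` is assumed to mean "timed out, nothing received".

```python
def call_with_retries(send_request, max_attempts=3):
    """Resend the same request until a reply arrives or attempts run out."""
    for _ in range(max_attempts):
        reply = send_request()        # None models a timeout (lost request or reply)
        if reply is not None:
            return ("ok", reply)      # answer received: forward it to the client
    return ("failed", None)           # give up: notify the client of the failure

class FlakyTransport:
    """Test double (assumption): loses the first `losses` exchanges, then succeeds."""

    def __init__(self, losses):
        self.losses = losses
        self.calls = 0

    def __call__(self):
        self.calls += 1
        if self.calls <= self.losses:
            return None
        return "done"
```

For example, `call_with_retries(FlakyTransport(2))` succeeds on the third attempt, while `call_with_retries(FlakyTransport(5))` reports a failure after three attempts. Resending is only safe here because the server is assumed to filter duplicates using its history.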
Direct vs. Indirect Communication
An alternative: Indirect communication
▪ No direct coupling between sender and receiver(s).
▪ Communication is performed via an intermediary.
☺ Space decoupling:
sender does not know the identity of receiver(s).
☺ Time decoupling:
sender and receiver(s) have independent lifetimes:
they do not need to exist at the same time.
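A toy intermediary makes the two kinds of decoupling concrete. In this sketch (an assumption-laden illustration, with the class name `Mailbox` and the topic name chosen freely), the sender only names a topic, never a receiver, and messages are stored until some receiver asks for them.

```python
from collections import defaultdict, deque

class Mailbox:
    """Toy intermediary: stores messages per topic until a receiver fetches them."""

    def __init__(self):
        self._queues = defaultdict(deque)

    def put(self, topic, message):
        # Space decoupling: the sender names a topic, not a receiver.
        self._queues[topic].append(message)

    def get(self, topic):
        # Time decoupling: the receiver may fetch long after the sender is gone.
        queue = self._queues[topic]
        return queue.popleft() if queue else None
```

A sender can call `put("jobs", "resize image")` and terminate; a receiver started later still obtains the message with `get("jobs")`.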
Group Communication
Why do we need it?
▪ Special applications: interest groups, mailing lists, etc.
▪ Fault tolerance based on replication:
▪ A request is sent to several servers which all execute the
same operation (if one fails, the client still will be served).
▪ Locating a service or object in a distributed system:
▪ The client sends a message to all machines, but only the
one (or those) which holds the server/object responds.
▪ Replicated data (for reliability or performance):
▪ whenever the data changes, the new value has to be sent
to all processes managing replicas.
Group Communication
Group membership management:
▪ maintains the view of group membership,
considering members joining, leaving, or failing.
Services provided by group membership management:
▪ Group membership changes:
▪ create/destroy process groups;
▪ add/withdraw processes to/from group.
▪ Failure detection:
▪ Detects processes that crash or become unavailable (due to e.g.
communication failure);
▪ Excludes processes from membership if crashed or unavailable.
▪ Notification:
▪ Notifies members of events, e.g., processes joining/leaving group.
▪ Group address expansion:
▪ Processes sending to a group specify the group identifier;
▪ address expansion provides the actual addresses for the multicast
operation delivering the message to each group member.
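The four services listed above can be sketched as one small in-memory class. Everything here is a simplifying assumption for illustration: the class and method names are invented, failure detection is reduced to an explicit `report_failure` call, and notifications are only appended to a list instead of being sent to members.

```python
class GroupMembership:
    """Toy group membership service (all names are hypothetical)."""

    def __init__(self):
        self._groups = {}   # group identifier -> set of member addresses
        self.events = []    # notifications that would be sent to members

    def create_group(self, group_id):
        self._groups[group_id] = set()

    def destroy_group(self, group_id):
        del self._groups[group_id]

    def join(self, group_id, address):
        self._groups[group_id].add(address)
        self.events.append(("joined", group_id, address))

    def leave(self, group_id, address):
        self._groups[group_id].discard(address)
        self.events.append(("left", group_id, address))

    def report_failure(self, group_id, address):
        # Failure detection result: exclude the crashed/unreachable process.
        self._groups[group_id].discard(address)
        self.events.append(("failed", group_id, address))

    def expand(self, group_id):
        # Group address expansion: group identifier -> actual member addresses.
        return sorted(self._groups[group_id])
```

A sender would call `expand("replicas")` and multicast to the returned addresses, without tracking membership changes itself.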
Group Communication
Essential features:
▪ Atomicity (all-or-nothing):
▪ when a message is sent to a group, it will either arrive correctly at all
members of the group or at none of them.
▪ Ordering
▪ FIFO-ordering: Messages originating from a given sender are delivered
in the order they have been sent, to all members of the group.
▪ Total-ordering: When several messages, from different senders, are sent
to a group, the messages reach all the members of the group in the same
order.
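FIFO ordering is commonly obtained with per-sender sequence numbers and a hold-back queue at each receiver; the sketch below shows that technique as one possible approach, not as the method prescribed by these slides. The class name `FifoReceiver` and the convention that sequence numbers start at 0 are assumptions.

```python
class FifoReceiver:
    """Delivers each sender's messages in sending order using a hold-back queue."""

    def __init__(self):
        self.next_seq = {}    # sender -> next sequence number expected
        self.held = {}        # sender -> {seq: message} received too early
        self.delivered = []   # messages handed to the application, in order

    def receive(self, sender, seq, message):
        expected = self.next_seq.get(sender, 0)
        if seq < expected:
            return            # already delivered: ignore the duplicate
        pending = self.held.setdefault(sender, {})
        pending[seq] = message
        # Deliver as many consecutive messages as are now available.
        while expected in pending:
            self.delivered.append((sender, pending.pop(expected)))
            expected += 1
        self.next_seq[sender] = expected
```

If message 1 from sender A arrives before message 0, it is held back and both are delivered in order once message 0 arrives. Total ordering needs more than this, for example a sequencer that assigns one global order to messages from all senders.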
Publish-Subscribe Systems
The general objective of publish-subscribe systems is to let
information propagate from publishers to interested subscribers, in
an anonymous, decoupled fashion.
▪ Publishers publish events.
▪ Subscribers subscribe to and receive the events they are interested in.
Subscribers are not targeted directly by publishers but indirectly, via
the notification service:
▪ Subscribers express their interest
by issuing subscriptions for specific notifications,
independently of the publishers that produce them;
▪ they are asynchronously notified of all notifications,
submitted by any publisher, that match their subscription.
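A centralized notification service can be sketched as follows. This is a simplified illustration: the class name `NotificationService`, the predicate-based subscriptions and the callback interface are assumptions, and notifications are delivered by a direct, synchronous function call, whereas a real service would notify subscribers asynchronously.

```python
class NotificationService:
    """Toy centralized notification service (names are hypothetical)."""

    def __init__(self):
        self._subscriptions = []   # list of (predicate, callback) pairs

    def subscribe(self, predicate, callback):
        # A subscription states interest in events, not in a particular publisher.
        self._subscriptions.append((predicate, callback))

    def publish(self, topic, value):
        # Forward the event to every subscriber whose subscription matches.
        for predicate, callback in self._subscriptions:
            if predicate(topic, value):
                callback(topic, value)

# Usage: two subscribers with different interests.
service = NotificationService()
inbox_s2, inbox_s3 = [], []
service.subscribe(lambda t, v: t == "IBM", lambda t, v: inbox_s2.append((t, v)))
service.subscribe(lambda t, v: t == "IBM" and v > 100,
                  lambda t, v: inbox_s3.append((t, v)))
service.publish("IBM", 95)
```

After `publish("IBM", 95)`, only the first subscriber's inbox contains the event, since 95 does not satisfy the second subscription's condition. The publisher never learns who received it.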
Publish-Subscribe Systems
[Figure: publisher P1, subscribers S2 and S3; example event: publish(’IBM’, 95)]
▪ Centralized implementations:
▪ are the simplest; however, scalability is limited by the processing
power of the machine that hosts the service.
▪ Distributed implementations:
▪ The notification service is realised as a network of distributed
processes, called brokers;
the brokers interact among themselves with the common aim
of dispatching notifications to all interested subscribers.
▪ Such a solution is scalable, but more challenging to implement:
it requires complex protocols for the coordination of the
various brokers and the diffusion of the information.
Acknowledgments
▪ Most of the slide content is based on a previous version
by Petru Eles, IDA, Linköping University.