Synchronization in Distributed Systems

Amritha Sampath and C. Tripti

Department of Computer Science


Rajagiri School of Engineering & Technology, Rajagiri valley, Cochin, India

Abstract. There is a growing demand for highly reliable and synchronous systems, and as a result there has been a gradual shift from centralized systems to distributed systems. Distributed systems have some disadvantages, however. The most important is that the different nodes maintain their own time using local clocks, and these time values may differ from node to node; that is, there is no global clock against which the various activities in the distributed environment can be synchronized. Even if the clocks in the system are set to a common time value at some instant, they drift apart for unavoidable reasons. Hence a continuous synchronization mechanism is needed so that the nodes can coordinate and work together to achieve the objectives of the distributed system. Two types of synchronization are possible: external synchronization and internal synchronization. In a real-time scenario, it is important for the nodes of the system to be synchronized with each other and with a common external reference time. This is called external synchronization. In certain systems, it is only necessary for the nodes to be synchronized with each other. This is called internal synchronization. In many applications, the relative ordering of events is more important than actual physical time; there, event ordering is done without clock time values. Hence the clock synchronization technique used depends on the area and type of application.
In certain real-time applications, the system must be both internally and externally synchronized. In such cases a centralized algorithm called Cristian’s algorithm is used for synchronization. This algorithm fails, however, when the time server fails. This paper suggests some methods to make the synchronization process distributed so that the disadvantages of Cristian’s algorithm can be overcome.

Keywords: Synchronization, Centralized Algorithms, Distributed Systems.

N. Meghanathan et al. (Eds.): Advances in Computing & Inform. Technology, AISC 176, pp. 417–424.
springerlink.com © Springer-Verlag Berlin Heidelberg 2012

1 Introduction
The advantages of distributed systems are so attractive that there is a gradual shift from the era of centralized systems to the era of distributed systems. Distributed systems are faster and cheaper than centralized systems. In a distributed system, a set of processors, each with its own built-in hardware clock, communicate by message transmission, but they do not have access to a central global clock. The hardware clocks of the processors tend to drift apart even if they are all set to a common time value. This drift can be due to instabilities inherent in the source oscillators and to environmental conditions such as temperature, air circulation and mechanical stress. For real-time software applications and related processes, highly accurate and synchronized time is a necessity. Clock inaccuracy can cause a number of problems: even a difference of a minute or two may produce outcomes that are unacceptable for the application.
A distributed system is designed to realize some synchronized behavior, especially in real-time processing in areas such as factories, aircraft, space vehicles and the military. Synchronization of individual clocks becomes especially important in hard, safety-critical real-time applications, where predictable performance is a major concern and a total logical or temporal ordering of the tasks in the system must be preserved.
The clocks in the different nodes need to be synchronized to limit these inaccuracies and hence to meet the objectives of the distributed system efficiently. Clock skew and clock drift[8], which form the major sources of clock inaccuracy, therefore need to be monitored continuously. In certain applications it is not enough to synchronize the various processes; the events that constitute them must be synchronized as well. This is called intraprocess synchronization. Intraprocess concurrency is captured by the ‘affects’ (or ‘causally affects’) relation. Bit matrix clocks and hierarchical clocks, which evolved after the logical clocks of Lamport[1] and vector clocks, capture the affects relation between the events of a process. Both, however, have a major disadvantage in terms of increased storage and communication overhead. Difference clocks[2] capture intraprocess and interprocess concurrency at the same time, with reduced storage and communication overheads.
Section 2 describes the various issues in clock synchronization. There are various methods of achieving clock synchronization, depending on the requirements of the situation. In order to behave as a single, unified computing resource, a distributed system needs its clocks to be synchronized, and several algorithms have been proposed for this purpose. Section 3 describes the various clock synchronization algorithms and their advantages and disadvantages. A proposed solution for overcoming the disadvantages of these algorithms is discussed in Section 4. Section 5 gives the conclusion.

2 Issues in Clock Synchronization


Load balancing and resource sharing are the two main objectives of distributed systems. To achieve these objectives, nodes must communicate with each other. In such an environment, the different nodes in the system need a common time on which they can base the ordering of events, so the clocks of the communicating nodes should agree on a common time value. If the system runs real-time applications such as aviation traffic control and position reporting, radio and TV programme launch and monitoring, or multimedia synchronization for real-time teleconferencing, then the clocks should also match Coordinated Universal Time, UTC[10].

Two factors that can cause errors are clock skew and clock drift. Clock skew is the constant difference between the readings of two clocks that run at exactly the same speed. Figure 1 illustrates clock skew. Clock drift is another reason for a mismatch in time values between the various nodes. The clocks may run at different speeds, and this difference in speed, even if small, can accumulate into a value large enough to cause errors in the intended working of the system. The difference in speed is mainly due to the type of quartz crystal used in the clock. Other factors, such as temperature and mechanical effects on the crystal, also change the frequency of oscillation and hence cause the clock speed to drift. Clock drift is caused by effects that cannot be removed permanently. Therefore, the clocks need frequent monitoring and adjustment in order to keep them synchronized.

Fig. 1. Clock Skew

Thus, even after synchronization, clock values differ[3]. If C denotes the time of a perfect clock, then ideally dC/dt = 1. A perfect clock may be regarded as one that provides the value of UTC or of some other external reference clock. For a clock that is fast with respect to a perfect clock, dC/dt > 1, and for a slow clock, dC/dt < 1. Slow and fast clocks drift in opposite directions relative to the perfect clock. In both cases, the clock values are incorrect after a certain time interval and the clocks need to be resynchronized.
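As a rough illustration of why resynchronization is needed, the sketch below models a clock whose rate is dC/dt = 1 + rho for a constant drift rate rho. The drift rate, the tolerance delta and the function names are assumptions chosen for the example, not figures from this paper. Two clocks drifting in opposite directions at rate rho can diverge by up to 2·rho·t after t seconds, so keeping them within delta of each other requires resynchronizing at least every delta/(2·rho) seconds.

```python
def clock_reading(true_elapsed: float, rho: float) -> float:
    """Reading of a clock with constant rate dC/dt = 1 + rho after
    true_elapsed seconds of perfect (reference) time."""
    return (1.0 + rho) * true_elapsed


def max_resync_interval(delta: float, rho: float) -> float:
    """Longest interval between resynchronizations that keeps two clocks,
    drifting in opposite directions at rate rho, within delta seconds."""
    if rho <= 0:
        raise ValueError("rho must be positive")
    return delta / (2.0 * rho)


if __name__ == "__main__":
    rho = 1e-5          # assumed drift rate (hypothetical value)
    one_day = 86_400.0  # seconds of reference time
    fast = clock_reading(one_day, +rho)   # dC/dt > 1
    slow = clock_reading(one_day, -rho)   # dC/dt < 1
    print(f"divergence after one day: {fast - slow:.3f} s")
    print(f"resync interval for 1 ms tolerance: {max_resync_interval(1e-3, rho):.1f} s")
```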
Clock synchronization requires that a node be able to read another node’s time value. Errors occur mainly because of delays in the messages carrying time values between the nodes. The minimum delay can be calculated by adding the time taken to prepare, transmit and receive a message in the absence of traffic and other errors in the system. It is impossible to determine an upper limit, however, because the delay depends on the load in the communication system at the time of transmission[3].
Another issue in synchronization is that a computer clock can usually be adjusted forward but never backward[3]. The time of a fast clock should therefore be corrected gradually rather than set to the correct value at once. This is done with an intelligent interrupt routine, which readjusts the amount of time added to the clock time at each interrupt.
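The gradual correction described above can be sketched as follows. This is a minimal illustration, assuming a timer interrupt that normally adds a fixed tick to a software clock; the class name SlewingClock, the 10 ms tick and the 1 ms maximum slew per interrupt are hypothetical choices, not values from this paper. While a correction is pending, each interrupt adds slightly less than a tick (to slow a fast clock) or slightly more (to catch up a slow one), so the clock value never moves backward.

```python
class SlewingClock:
    """Software clock that absorbs corrections gradually, one interrupt at a time."""

    def __init__(self, tick_ms: float = 10.0, max_slew_ms: float = 1.0):
        if not 0 < max_slew_ms < tick_ms:
            raise ValueError("max_slew_ms must be positive and smaller than tick_ms")
        self.time_ms = 0.0
        self.tick_ms = tick_ms          # nominal time added per interrupt
        self.max_slew_ms = max_slew_ms  # largest correction applied per interrupt
        self.pending_ms = 0.0           # > 0: clock is behind; < 0: clock is ahead

    def request_adjustment(self, offset_ms: float) -> None:
        """Record a correction to be spread over the following interrupts."""
        self.pending_ms += offset_ms

    def on_interrupt(self) -> None:
        """Advance the clock by one tick plus a bounded part of the pending correction."""
        step = max(-self.max_slew_ms, min(self.max_slew_ms, self.pending_ms))
        self.pending_ms -= step
        self.time_ms += self.tick_ms + step  # always positive, so time never goes back


if __name__ == "__main__":
    clock = SlewingClock()
    clock.request_adjustment(-5.0)  # the clock is 5 ms fast
    for _ in range(10):
        clock.on_interrupt()
    print(clock.time_ms, clock.pending_ms)
```

Because the slew per interrupt is bounded below the tick length, every interrupt still moves the clock forward, which is the property the paragraph above asks for.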

3 Related Works
Various algorithms have been proposed for clock synchronization. Centralized algorithms designate one node as a time server, which has a real-time receiver. Cristian’s algorithm[10] is a passive time server centralized algorithm, and the Berkeley algorithm is an example of an active time server algorithm. These algorithms are not scalable and are subject to a single point of failure. Hence distributed algorithms are used. These can be global averaging or localized averaging distributed algorithms[3].

3.1 Cristian’s Algorithm

Cristian’s algorithm[10] relies on the existence of a time server. In this centralized algorithm, the time server is considered to be a perfect clock whose value the other nodes use as a reference for setting their own time, so that all the nodes in the network remain synchronized with each other and with the external reference time; that is, the system is both externally and internally synchronized. A client machine makes a procedure call to the time server and the server replies with its time value. The round-trip time is used to estimate the propagation delay, which is added to the server’s time value to obtain the time value for the client clock.

• Client p sends a request to time server S at local time T0
• S inserts its time t immediately before the reply is returned
• p receives the reply at local time T1 and measures the round trip (TroundTrip = T1 - T0)
• p sets its local clock to t + TroundTrip/2

Cristian’s algorithm takes several measurements of T1 - T0; values that exceed a particular threshold are considered unreliable and are discarded. The average of the remaining values is taken as TroundTrip, and half of it is added to ‘t’ to obtain the value to which the client node must set its clock. In other words, the precision of the passive time server centralized algorithm can be improved by taking several measurements and either using the smallest round trip or using an average after throwing out the large values.
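The steps above can be sketched in a few lines. This is an illustrative sketch rather than the authors’ implementation: the function names, the request_time callable standing in for the procedure call to the server, and the sample values are assumptions. Of the two refinements mentioned, it uses the smallest round trip among the measurements that pass the threshold.

```python
import time
from typing import Callable, List, Tuple

Sample = Tuple[float, float]  # (server time t, round trip T1 - T0)


def take_sample(request_time: Callable[[], float],
                now: Callable[[], float] = time.monotonic) -> Sample:
    """Ask the server for its time and measure the round trip on the client."""
    t0 = now()          # T0: request sent
    t = request_time()  # server's time value t (hypothetical remote call)
    t1 = now()          # T1: reply received
    return t, t1 - t0


def cristian_estimate(samples: List[Sample], threshold: float) -> float:
    """Return t + TroundTrip/2 for the reliable sample with the smallest round trip."""
    reliable = [s for s in samples if s[1] <= threshold]  # discard slow round trips
    if not reliable:
        raise ValueError("no round trip within the threshold")
    t, round_trip = min(reliable, key=lambda s: s[1])
    return t + round_trip / 2.0


if __name__ == "__main__":
    # Made-up measurements: (t reported by the server, measured round trip in seconds).
    samples = [(100.000, 0.020), (100.500, 0.200), (101.000, 0.010)]
    print(cristian_estimate(samples, threshold=0.050))
```

The returned value is the estimated server time at the moment the chosen reply arrived, so a real client would apply it immediately after that reply or convert it into an offset first.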

Fig. 2. Cristian’s Algorithm[10]



3.2 Berkeley Algorithm

The Berkeley algorithm is used in systems without a UTC receiver. One computer is the master and the others are slaves[9].

• The server (master) polls each client.
• Each client responds to the server with its local time.
• The server estimates the clients’ local times (similar to Cristian’s technique) and averages them (including the server’s own reading, but excluding those that may have drifted badly). It then tells each client its offset.

If the master fails, an election is held to choose a new master.
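The averaging step can be illustrated with the short sketch below. It is a simplification under stated assumptions: the function name berkeley_offsets and the max_deviation parameter are hypothetical, readings further than max_deviation from the median are treated as "drifted badly" (the source does not say how outliers are detected), and the round-trip compensation the master applies when estimating each slave’s time is omitted.

```python
from statistics import mean, median
from typing import Dict


def berkeley_offsets(master_time: float,
                     slave_times: Dict[str, float],
                     max_deviation: float) -> Dict[str, float]:
    """Return the offset each node should apply to reach the averaged time."""
    readings = {"master": master_time, **slave_times}
    mid = median(readings.values())
    # Assumed outlier rule: ignore readings too far from the median.
    trusted = [t for t in readings.values() if abs(t - mid) <= max_deviation]
    if not trusted:
        raise ValueError("no reading within max_deviation of the median")
    target = mean(trusted)
    # Every node, including excluded ones, is told how far it is from the target.
    return {name: target - t for name, t in readings.items()}


if __name__ == "__main__":
    # Times in minutes past midnight: master 3:00, slaves 3:25 and 2:50.
    print(berkeley_offsets(180.0, {"a": 205.0, "b": 170.0}, max_deviation=60.0))
```

Sending offsets rather than absolute times means a slave can apply the correction gradually, without having to compensate for the delay of the reply message.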

Fig. 3. Berkeley Algorithm[10]

3.3 Naimi-Trehel Algorithm- Token Based System

Several token-based hierarchical algorithms have been proposed, most of which are tree based[4]. The Naimi-Trehel algorithm[5] is a token-based algorithm for large hierarchical networks. It maintains a logical dynamic tree structure such that the root of the tree is always the last node that will get the token among the current requesting nodes. Various extensions to this algorithm take into account the network topology, especially the latency gap between local and remote clusters of machines. They reduce the number of inter-cluster messages and give a higher priority to local mutual exclusion requests.
In the first extension, a dedicated process called a proxy is introduced on each cluster except the one that initially holds the token. The proxy is in charge of storing the last request sent to remote clusters. Before asking for a token that it believes belongs to a node of a remote cluster, a node ‘i’ first sends a request to its corresponding proxy. If another node ‘j’ of the same cluster has recently asked for the token and the proxy is aware of it, the proxy redirects the request to ‘j’, avoiding transmission to the remote cluster.

The second extension aims at reducing the number of inter-cluster messages by aggregating remote requests. When a request has to be redirected to a probable owner belonging to a remote cluster, the request is not sent to it but stored in a queue. This queue therefore accumulates requests for remote clusters. It is stored in the last node that will enter the critical section within the cluster.
Finally, a local preemption of the token is performed, giving a higher priority to requests originating from the local cluster in order to exploit cluster locality. A threshold that defines the degree of locality and avoids starvation is selected. When the number of local requests is below this threshold, the requesting path is modified so that local requests are served first.

3.4 Token Ring Based System

Latha and Shashidhara[7] propose a token ring method that aims at distributing the job of the centralized time server among all the nodes in the system, so that the effect of a time server failure is reduced to some extent. The nodes in the system are arranged in the form of a ring and a token is circulated among them. The node that possesses the token at a given instant acts as the time server for the system until the token is passed to its neighbor in the ring. It receives the time value from an external reference time source and broadcasts it to all the nodes in the system. Each of the other nodes receives this value and sets its time to the received value if it is within an expected range; otherwise it requests a retransmission. If the node is still unable to get an acceptable value after a particular number of retransmissions, an error message is generated. After a particular amount of time, the token is passed to the neighbor, which then takes over the role of the time server.
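One turn of the token ring scheme can be sketched as below. This is an illustrative model, not the implementation from [7]: the names Node, serve_as_time_server and pass_token, the tolerance used for the "expected range" and the retry limit are assumptions, and message passing is replaced by direct method calls. The sketch also assumes that the token holder adopts the reference value itself and that each retransmission re-reads the reference source.

```python
from typing import Callable, List


class Node:
    def __init__(self, name: str, clock: float):
        self.name = name
        self.clock = clock

    def accept(self, value: float, tolerance: float) -> bool:
        """Adopt the broadcast value only if it lies within the expected range."""
        if abs(value - self.clock) <= tolerance:
            self.clock = value
            return True
        return False  # caller treats this as a retransmission request


def serve_as_time_server(nodes: List[Node], holder: int,
                         read_reference: Callable[[], float],
                         tolerance: float, max_retries: int = 3) -> List[str]:
    """Let the token holder distribute the reference time; return nodes in error."""
    errors = []
    for i, node in enumerate(nodes):
        if i == holder:
            node.clock = read_reference()  # assumption: holder adopts the value too
            continue
        for _ in range(1 + max_retries):   # first transmission plus retransmissions
            if node.accept(read_reference(), tolerance):
                break
        else:
            errors.append(node.name)       # no acceptable value: report an error
    return errors


def pass_token(holder: int, ring_size: int) -> int:
    """Hand the token to the next neighbor in the ring."""
    return (holder + 1) % ring_size


if __name__ == "__main__":
    ring = [Node("n0", 100.0), Node("n1", 100.2), Node("n2", 250.0)]
    print(serve_as_time_server(ring, holder=0, read_reference=lambda: 100.1, tolerance=1.0))
    print(pass_token(0, len(ring)))
```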

4 Proposed Solution
External synchronization techniques are usually implemented using a centralized time server. In a centralized system, a time server is connected to an external reference time source or UTC server and is responsible for synchronizing the other nodes within the system. Such a system has all the disadvantages of a centralized approach: the burden of synchronizing all the nodes in the system lies with the dedicated time server, and if the server fails, the entire system fails.
The proposed solution is based on the token ring method discussed in [7], which reduces the centralized dependencies. It aims at implementing a system that can realize both external and internal synchronization and can hence deal with the two issues in clock synchronization, namely clock drift and clock skew.
Here, all nodes are arranged in the form of a ring and a token is passed from one node to another. Such a token-based mechanism has a limit on the number of nodes it can handle and on the distance between the nodes. Hence, this type of network organization has the disadvantage that it does not allow the system to be scalable. To make the network scalable, a hierarchical organization[6] of the nodes can be incorporated into the above-mentioned scheme. In the first layer of the hierarchy, the external reference time can be broadcast to nodes that may be a large distance apart, and these nodes in turn act as the reference time source for local networks that work on the above-mentioned token ring mechanism. That is, nodes that are closely located are arranged in a ring and work on the token ring mechanism, and these local groups get the external time value from a distant centralized server through broadcasting.
Hence this scheme implements a centralized approach in the first layer of the hierarchy and a distributed approach in the second layer. The first layer helps to make the system more scalable, while the second layer reduces the dependence on a centralized mechanism and hence the impact of node failures. In case of a node failure, only the nodes belonging to that particular ring are affected. Troubleshooting also becomes easier, since the entire system of nodes has been divided into smaller groups of nodes.
The time for which each node holds the token, and hence works as the time server, needs to be decided according to the requirements of the system. The interval between successive broadcasts of the time value also needs to be decided, based on the accuracy of the physical clocks in each node.
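The two-layer organization can be pictured with the following sketch. It is a deliberately simplified, static model under stated assumptions: the names Ring, first_layer_broadcast and second_layer_step are hypothetical, clocks do not advance between steps, and a ring’s loss of contact with the first layer is represented by a plain reachable flag.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Ring:
    clocks: List[float]      # local clock values of the nodes in this ring
    reference: float = 0.0   # last external time received from the first layer
    holder: int = 0          # index of the node currently holding the token
    reachable: bool = True   # False if the first-layer broadcast cannot reach the ring


def first_layer_broadcast(rings: List[Ring], external_time: float) -> None:
    """Layer 1 (centralized): broadcast the external reference time to every ring."""
    for ring in rings:
        if ring.reachable:
            ring.reference = external_time


def second_layer_step(ring: Ring) -> None:
    """Layer 2 (distributed): the token holder distributes the ring's reference
    value to all nodes in the ring, then passes the token to its neighbor."""
    ring.clocks = [ring.reference] * len(ring.clocks)
    ring.holder = (ring.holder + 1) % len(ring.clocks)


if __name__ == "__main__":
    rings = [Ring(clocks=[1.0, 2.0, 3.0]),
             Ring(clocks=[5.0, 6.0], reference=5.5, reachable=False)]
    first_layer_broadcast(rings, external_time=100.0)
    for ring in rings:
        second_layer_step(ring)
    print(rings)
```

In this model a ring that misses the first-layer broadcast still ends up internally synchronized on its last known reference value, and a problem in one ring leaves the other rings untouched.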
The advantages of the system are:

1. It synchronizes the nodes both internally and externally, and is hence useful in real-time systems.
2. It makes the system more scalable.
3. It compensates for clock drift and clock skew.
4. Nodes within the system remain internally synchronized even if the connection to the external reference time fails.
5. The effect of failure of the node possessing the token is minimal, since after a particular amount of time the next node in the ring takes over the job of the time server.
6. Troubleshooting is easier.

The disadvantages are:

1. It requires a broadcasting technique.
2. It is costly.
3. Token loss, node failure and similar events need to be handled.

5 Conclusion
In a distributed system, the clocks of the individual nodes need to be synchronized with each other. If a system is synchronized with a universal reference time, it is both internally and externally synchronized. Centralized synchronization algorithms such as Cristian’s algorithm and the Berkeley algorithm were discussed in Section 3. They are capable of both internal and external synchronization but suffer from the disadvantages of centralized systems. Hence, the proposed solution aims at making the system more distributed by introducing a hierarchical organization together with a token ring based approach in the second layer of the hierarchy. Here, the effect of failure of the node that acts as time server is minimized, and the scheme can deal with clock skew and drift. Above all, it makes the system highly scalable.

References
1. Lamport, L.: Time, clocks, and the ordering of events in a distributed system. Communications of the ACM 21(7), 558–565 (1978)
2. Vaidehi, S., Ram, D.J., Shukla, A.: Difference clocks - A new scheme for logical time in distributed systems. IEE Proc.-Comput. Digit. Tech. 143(6), 426–430 (1996)
3. Sinha, P.K.: Distributed Operating Systems: Concepts and Design, pp. 282–336. PHI Learning Private Limited (2009)
4. Housni, A., Trehel, M.: Distributed mutual exclusion token-permission based by prioritized groups. In: Proceedings of the ACS/IEEE International Conference on Computer Systems and Applications, pp. 253–259 (June 2001)
5. Bertier, M., Arantes, L., Sens, P.: Hierarchical token based mutual exclusion algorithms. In: IEEE International Symposium on Cluster Computing and the Grid, pp. 539–546 (2004)
6. Nishimura, T., Hayashibara, N., Enokido, T., Takizawa, M.: Causally Ordered Delivery with Global Clock in Hierarchical Group. In: Proceedings of the 2005 11th International Conference on Parallel and Distributed Systems, ICPADS 2005 (2005)
7. Latha, C.A., Shashidhara, H.L.: Clock Synchronization in Distributed Systems. In: 5th International Conference on Industrial and Information Systems, pp. 475–480 (July-August 2010)
8. Distributed Systems: Clock Synchronisation, UTC, Clock drift and skew (2010), http://www.krzyzanowski.org/rutgers/lectures/l-clocks.html
9. Distributed Systems: Principles and Paradigms, Physical and logical clocks (2010), http://net.pku.edu.cn/~course/cs501/2008/resource/steen_vrije/courses/ds-slides/2006/notes.06.pdf
10. Applied Computer Science Problems: Clock and State Synchronization, Clock Synchronisation algorithms (2010), http://www.sti-innsbruck.at/fileadmin/documents/teaching_archive/acsp0405/06_Ruff_Ausarbeitung.pdf
