Clock Synchronization

Clock synchronization is crucial for time-based computations across multiple machines, ensuring accurate timestamps and deadlines. The document discusses various synchronization protocols, their goals, and challenges, including the famous example of the Patriot missile system's failure due to clock drift. It also highlights different synchronization methods, including authenticated and probabilistic schemes, while noting that real-world systems often face limitations in achieving optimal synchronization.
Why do clock synchronization?

• Time-based computations on multiple machines
  ◦ Applications that measure elapsed time
  ◦ Agreeing on deadlines
  ◦ Real-time processes may need accurate timestamps
• Many applications require that clocks advance at similar rates
  ◦ Real-time scheduling of events based on the processor clock
  ◦ Setting timeouts and measuring latencies
  ◦ Ability to infer potential causality from timestamps
Famous example

• Scud rockets launched by Iraq towards Israel
  ◦ Ground-based Patriot missiles fire back
  ◦ But missiles always missed the warhead!
  ◦ Why?
• After 72 hours of waiting, the control system was out of sync relative to the Patriot guidance system
  ◦ “Be at (x,y,z) at time t” was misinterpreted!
Goals for clock synchronization?

• We might be concerned with
  ◦ Clock accuracy relative to real time
  ◦ Clock precision, or the degree to which correct clocks agree with one another
  ◦ Rate of possible clock drift
• Would we want the Patriot system to be optimally accurate, or optimally precise, if we can’t have both?
The System Model

• Hardware clocks
  ◦ Physical clock of process q designated $R_q(t)$
  ◦ Clocks have a drift rate $\rho$:
    ▪ $(1+\rho)^{-1}(t_2-t_1) \le R_p(t_2)-R_p(t_1) \le (1+\rho)(t_2-t_1)$
    ▪ Implies that the relative rate of drift between two clocks is bounded by $dr = \rho(2+\rho)/(1+\rho)$
  ◦ For the Byzantine model, assume nothing about the clock
    ▪ May increase or decrease or return a random number
    ▪ May get “stuck” (surprisingly common in real systems)
    ▪ Cannot necessarily be modeled by functions
• There is a limit $t_{del}$ on message latency
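As a rough illustration of the drift bound on this slide, the sketch below evaluates $dr = \rho(2+\rho)/(1+\rho)$ and the worst-case gap that two initially equal clocks could open up over an interval. The function names and the value of `rho` are assumptions chosen for the example, not figures from the slides.

```python
def relative_drift_bound(rho: float) -> float:
    """Upper bound on the relative drift rate between two clocks.

    This is the fastest allowed rate (1 + rho) minus the slowest allowed
    rate 1 / (1 + rho), which simplifies to rho * (2 + rho) / (1 + rho).
    """
    return rho * (2 + rho) / (1 + rho)


def max_divergence(rho: float, elapsed_seconds: float) -> float:
    """Worst-case gap between two clocks that started out equal."""
    return relative_drift_bound(rho) * elapsed_seconds


if __name__ == "__main__":
    rho = 1e-5   # assumed drift rate, for illustration only
    hours = 72
    print(f"dr = {relative_drift_bound(rho):.3e}")
    print(f"gap after {hours} h: {max_divergence(rho, hours * 3600):.3f} s")
```

Even a small drift rate adds up over days of uptime, which is why the protocols that follow resynchronize periodically.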


Clock synchronization goals

• A clock synchronization protocol implements a virtual clock function mapping real time t to $C_p(t)$
• Agreement condition:
  ◦ $|C_p(t) - C_q(t)| \le D_{max}$ for all correct p, q
  ◦ $D_{max}$ bounds the difference between two virtual clocks running on different processors
• Accuracy condition:
  ◦ $(1+\gamma)^{-1}t + a \le C_p(t) \le (1+\gamma)t + b$, for constants a, b, $\gamma$
  ◦ Says that p’s clock must be within a linear envelope of “real time”
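The two conditions can be written as simple predicates. This is a minimal sketch for checking sampled clock values. The function names are hypothetical, and `gamma` is an assumed name for the envelope constant in the accuracy condition.

```python
def agreement_holds(clock_values, d_max):
    """True if every pair of correct virtual clocks differs by at most d_max.

    Comparing the largest and smallest reading covers all pairs at once.
    """
    return max(clock_values) - min(clock_values) <= d_max


def accuracy_holds(c_p, t, gamma, a, b):
    """True if the reading c_p at real time t lies inside the linear envelope."""
    lower = t / (1 + gamma) + a
    upper = (1 + gamma) * t + b
    return lower <= c_p <= upper
```

For example, `agreement_holds([10.0, 10.2, 9.9], 0.5)` is true because the widest gap is about 0.3, while `accuracy_holds(110.0, 100.0, 0.01, -1.0, 1.0)` is false because 110 lies above the upper line at 102.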
Clocks and True Time

[Figure: Clock Time plotted against True Time, showing the ideal clock, the virtual clock $C_p(t)$, and the envelope lines $(1+\gamma)t + b$ and $(1+\gamma)^{-1}t + a$.]
Authenticated Algorithm

• Solution for a system of n processes, at most f of which are faulty
• Let P be the logical time between resynchronizations
• A process expects the k’th resynchronization at time kP
  ◦ When $C_p(t) = kP$, p broadcasts a signed message of the form “round k”
  ◦ When a process q receives f+1 such messages signed by distinct processes, it sets its logical clock $C_q(t) = kP + \alpha$, for some constant $\alpha$ greater than the increase in $C_q$ since q sent its own round k message
  ◦ Also, q relays the round k messages it receives
• Srikanth and Toueg give proofs of correctness. Insight: at least one of the f+1 round k messages is from a correct process
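The receiving side of a resynchronization round can be sketched as follows. This is only an illustration under stated assumptions: signature checking and the network are left out, the class and method names are hypothetical, and `alpha` is an assumed name for the constant added on resynchronization.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    pid: int
    f: int          # maximum number of faulty processes tolerated
    period: float   # P, logical time between resynchronizations
    alpha: float    # constant added on resynchronization (assumed name)
    clock: float = 0.0
    round_done: int = 0
    signers: dict = field(default_factory=dict)  # round k -> set of sender ids

    def on_round_message(self, k: int, sender: int) -> bool:
        """Record an already verified signed 'round k' message.

        Returns True when this message triggers resynchronization; the
        caller should then relay the collected messages to everyone else.
        """
        if k <= self.round_done:
            return False  # already resynchronized for this round
        seen = self.signers.setdefault(k, set())
        seen.add(sender)  # a set ignores repeats from the same sender
        if len(seen) >= self.f + 1:
            self.clock = k * self.period + self.alpha
            self.round_done = k
            return True
        return False
```

With `f=1`, `period=10.0` and `alpha=0.5`, two messages from the same sender do not trigger anything, while a second distinct sender sets the clock to 10.5. Counting distinct signers is what guarantees that at least one of the f+1 messages came from a correct process.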
Overview of proof

• Lemma 1: The k’th resynchronization is bounded in size by some constant $d_{min}$, such that for $k \ge 1$, $end_k - begin_k \le d_{min}$
• Lemma 2: After the k’th resynchronization, correct clocks differ by at most $d_{min}(1+\rho)$
• Lemma 3: No correct process starts its k’th clock until at least some correct process is ready to do so: for $k \ge 1$, $begin_k \ge ready_k$
• Lemma 4: All correct processes start their k’th clock soon after one correct process is ready to do so: $end_k - ready_k \le (1+\rho)D_{max} + t_{del}$
• Lemmas 5, 6, 7: The periods between resynchronizations and maximum deviations between clocks are bounded and do not overlap
• Theorem: the algorithm achieves agreement and accuracy
Optimality

• Bound on accuracy: Srikanth and Toueg show that for any synchronization, accuracy cannot exceed that of the underlying hardware clocks
• And they show that their simple algorithm achieves optimal accuracy
  ◦ Proof is remarkably tricky!
Unauthenticated algorithm

• The algorithm relies on properties of the message system:
  ◦ Correctness: If at least f+1 correct processes broadcast round k messages by time t, then every correct process accepts the message by time $t + t_{del}$
  ◦ Unforgeability: If no correct process broadcasts a round k message by time t, then no correct process accepts the message by time t or earlier
  ◦ Relay: If a correct process accepts the message round k at time t, then every correct process does so by time $t + t_{del}$
Simulating Authentication

• Here they reference a different paper:
  ◦ T.K. Srikanth and S. Toueg. Simulating authenticated broadcasts to derive simple fault-tolerant algorithms. Distributed Computing 2(2): 80-94 (1987).
• Based on an echoing scheme where witnesses to a broadcast effectively “sign it”
• Cost is $O(n^3)$ messages per broadcast round, hence per clock synchronization round
  ◦ Paper claims cost is $O(n^2)$ but this assumes a built-in way of sending one message to n processes in one step
• Realistic cost of resynchronization is something like $O(n^4)$ since each process needs to do one of these broadcasts
Other ways to think about resynchronization

• Cristian: probabilistic clock synchronization
  ◦ Starts with an observation about RPC
  ◦ If I “ping” you in a network
    ▪ Most round-trip times will be small
    ▪ But the distribution may have a heavy tail
• Expressed in terms of expectation: “with probability p a reply to a ping will be received within time $\delta$”
Cristian’s scheme

• His idea: System contains some number of time “authorities” that everyone trusts
  ◦ i.e. they have a GPS receiver (cheap and common…)
  ◦ Periodically, client machine a pings authority b asking “what time is it?”
  ◦ If the round-trip time is less than $\delta$, then a replaces $C_a(t)$ with $(C_a(t) + (C_b(t) - \delta/2))/2$
  ◦ With high probability this scheme gives very good clock synchronization. Not tolerant of faults but can be extended into a fault-tolerant solution
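A minimal sketch of one ping in this style of scheme follows. It uses the commonly stated form of Cristian's estimate (the authority's timestamp plus half the measured round trip) rather than the averaging expression on the slide, and it discards slow replies. The function name and parameters are hypothetical.

```python
from typing import Optional


def cristian_estimate(t_send: float, t_recv: float, server_time: float,
                      max_rtt: float) -> Optional[float]:
    """Estimate the authority's current time from one ping.

    t_send and t_recv are the client's local clock readings when the request
    left and when the reply arrived. Returns None if the round trip was too
    slow to trust, in which case the caller should retry later.
    """
    rtt = t_recv - t_send
    if rtt >= max_rtt:
        return None
    # Assume the reply spent about half the round trip in flight.
    return server_time + rtt / 2
```

Rejecting slow round trips is what makes the scheme probabilistic: most pings are fast, so a retry usually succeeds, and for an accepted reply the error of the estimate is at most half the measured round trip.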
Verissimo and Rodrigues

• They notice that clock synchronization is really bounded not by actual latencies but by uncertainty in latency
• Instead of $\delta$, think of $\delta_{min} + \epsilon$, for some $\epsilon \ge 0$
• Leads to a solution where accuracy is limited by $\epsilon$ rather than by $\delta$
Other practical considerations

• Real systems have
  ◦ Hardware from multiple vendors
  ◦ Operating systems from multiple sources
• Tends to limit our ability to synchronize clocks
• Several widely supported standards but no single solution that everyone uses
• Hence when crossing machine boundaries, expect problems!
Real-world clocks

• Real systems
  ◦ Sometimes stop the clock
  ◦ Sometimes even run the clock backwards!
• Better approach?
  ◦ Pick a constant $\Delta$ and synchronize during periods of time $\Delta$ long
  ◦ If the clock needs to be adjusted by $\theta$, adjust at rate $\theta/\Delta$; over the course of a period, the value catches up
  ◦ Avoids sudden discontinuities or stopping the clock
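The gradual adjustment described on this slide can be sketched as a linear ramp. The function and parameter names below are hypothetical: `period_len` stands for the length of one synchronization period and `correction` for the total adjustment to apply over it.

```python
def slewed_clock(raw_now: float, period_start: float, period_len: float,
                 base_offset: float, correction: float) -> float:
    """Return the adjusted clock reading during one synchronization period.

    The correction is phased in linearly at rate correction / period_len, so
    the reading never jumps; once period_len has elapsed the full correction
    is in place.
    """
    elapsed = min(max(raw_now - period_start, 0.0), period_len)
    return raw_now + base_offset + correction * (elapsed / period_len)
```

With a 60 second period and a correction of -0.3 seconds, the reading is unchanged at the start of the period, 0.15 seconds behind the raw clock halfway through, and 0.3 seconds behind at the end. The adjusted clock keeps moving forward as long as the correction is smaller in magnitude than the period.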
Summary

• We often assume synchronized clocks
• In practice, quality of synchronization remains relatively poor
• At best synchronization will be limited by quality of physical clocks, rates of physical clock drift, and uncertainty in latencies
• Cristian’s probabilistic scheme makes these uncertainties explicit and also works very well
