Lecture 11A - Replication Control
FALL 2022
Dr. Zeshan Iqbal
Lecture 11: Replication Control
Server-side Focus
• Concurrency Control = how to coordinate
multiple concurrent clients executing
operations (or transactions) with a server
Next:
• Replication Control = how to handle
operations (or transactions) when objects are
stored at multiple servers, with or without
replication
Replication: What and Why
• Replication = An object has identical copies, each
maintained by a separate server
– Copies are called “replicas”
• Why replication?
– Fault-tolerance: With k replicas of each object, can
tolerate failure of any (k-1) servers in the system
– Load balancing: Spread read/write operations out over
the k replicas => load lowered by a factor of k compared
to a single replica
– Replication => Higher Availability
Availability
• If each server is down a fraction f of the time
– Server’s failure probability
• With no replication, availability of object
= Probability that single copy is up
= (1 – f)
• With k replicas, availability of object
= Probability that at least one replica is up
= 1 – Probability that all replicas are down
= (1 – f^k), assuming replicas fail independently
Nines Availability
• With no replication, availability of object
= (1 – f)
• With k replicas, availability of object
= (1 – f^k)
Availability Table
f = failure probability | No replication | k=3 replicas | k=5 replicas
0.1                     | 90%            | 99.9%        | 99.999%
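The table row can be reproduced with a few lines of Python. This is a minimal sketch, assuming replicas fail independently; the function name `availability` is illustrative and not from the lecture.

```python
def availability(f, k=1):
    """Probability that at least one of k replicas is up.

    Assumes each replica is down independently with probability f.
    """
    if not 0.0 <= f <= 1.0 or k < 1:
        raise ValueError("need 0 <= f <= 1 and k >= 1")
    # All k replicas are down with probability f**k.
    return 1.0 - f ** k


if __name__ == "__main__":
    # Reproduce the table row for f = 0.1.
    for k in (1, 3, 5):
        print(f"f=0.1, k={k}: {availability(0.1, k):.5%}")
```

Each extra replica multiplies the unavailability by f, so with f = 0.1 every added replica contributes one more "nine".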
Replication Transparency
[Figure: Clients send requests to Front Ends, which forward them to Replicas 1, 2 and 3 of an object O; replies flow in the opposite direction. Front ends provide replication transparency.]
Replication Consistency
• Two ways to forward updates from front-ends (FEs) to
replica group
– Passive Replication: uses a primary replica (master)
– Active Replication: treats all replicas identically
• Both approaches use the concept of “Replicated State
Machines”
– Each replica’s code runs the same state machine
– Multiple copies of the same State Machine begun in the
Start state, and receiving the same Inputs in the same order
will arrive at the same State having generated the same
Outputs. [Schneider 1990]
Passive Replication
• Master => total ordering of all updates
• On master failure, run election
[Figure: Clients, Front Ends, Replica 1 and Replica 3; requests flow from the clients toward the replicas, replies flow opposite.]
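The two bullets above can be sketched in a few lines of Python. This is a toy, in-process model under stated assumptions: the class names `Replica` and `PassiveGroup` are hypothetical, the election rule (highest replica id wins) is a placeholder for a real election algorithm, and forwarding is synchronous with no message loss.

```python
class Replica:
    def __init__(self, rid):
        self.rid = rid
        self.value = 0
        self.log = []  # updates applied so far, indexed by sequence number

    def apply(self, seq, delta):
        # Apply updates strictly in the order chosen by the master.
        if seq != len(self.log):
            raise RuntimeError(f"replica {self.rid}: gap at sequence {seq}")
        self.log.append(delta)
        self.value += delta


class PassiveGroup:
    def __init__(self, replicas):
        self.replicas = list(replicas)
        self.primary = self.elect()

    def elect(self):
        # Placeholder election (assumption): highest id becomes master.
        return max(self.replicas, key=lambda r: r.rid)

    def update(self, delta):
        # The master assigns the next sequence number => total order.
        seq = len(self.primary.log)
        self.primary.apply(seq, delta)
        for r in self.replicas:
            if r is not self.primary:
                r.apply(seq, delta)

    def fail_primary(self):
        # On master failure, drop it and run the election again.
        self.replicas.remove(self.primary)
        self.primary = self.elect()


group = PassiveGroup([Replica(1), Replica(2), Replica(3)])
group.update(5)
group.update(-2)
group.fail_primary()
group.update(4)
```

Because every update passes through a single master, all surviving replicas hold the same log even after the master changes.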
Active Replication
[Figure: Clients send requests to Front Ends, which multicast them inside the replica group (Replica 1 ... Replica 3); replies flow in the opposite direction. Front ends provide replication transparency.]
Active Replication Using Concepts
You’ve Learnt earlier
• Can use any flavor of multicast ordering, depending on
application
– FIFO ordering
– Causal ordering
– Total ordering
– Hybrid ordering
• Total or Hybrid (*-Total) ordering + Replicated State
machines approach
– => all replicas reflect the same sequence of updates to the
object
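A small illustration of why total ordering matters for replicated state machines. This is a sketch, assuming a toy state machine whose state is a single integer and whose inputs are `add` and `mul` operations; the names `step` and `run` are illustrative only.

```python
from functools import reduce


def step(state, op):
    """Deterministic transition function of the toy state machine."""
    kind, arg = op
    if kind == "add":
        return state + arg
    if kind == "mul":
        return state * arg
    raise ValueError(f"unknown operation {kind!r}")


def run(ops, start=0):
    """Feed a sequence of inputs to a machine begun in the start state."""
    return reduce(step, ops, start)


updates = [("add", 10), ("mul", 2)]

# Total ordering: every replica sees the same inputs in the same order.
replica_states = [run(updates) for _ in range(3)]

# A replica that delivered the same updates in a different order.
reordered = run(list(reversed(updates)))
```

Here `replica_states` are all equal, while `reordered` differs because `add` and `mul` do not commute. Weaker orderings (FIFO, causal) are only safe when the application's operations tolerate such reordering.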
Transactions and Replication
• One-copy serializability
– A concurrent execution of transactions in a replicated database
is one-copy-serializable if it is equivalent to a serial execution
of these transactions over a single logical copy of the database.
– (Or) The effect of transactions performed by clients on
replicated objects should be the same as if they had been
performed one at a time on a single set of objects (i.e., 1 replica
per object).
• In a non-replicated system, transactions appear to be
performed one at a time in some order.
– Correctness means serial equivalence of transactions
• When objects are replicated, correctness of a transaction
system requires one-copy serializability
Example 1
Sites and the data items they store:
Site A: x
Site B: x, y
Site C: x, y, z

T1:            T2:            T3:
x ← 20         Read(x)        Read(x)
Write(x)       y ← x+y        Read(y)
Commit         Write(y)       z ← (x∗y)/100
               Commit         Write(z)
                              Commit

Consider the three histories:
HA={W1(xA), C1}
HB={W1(xB), C1, R2(xB), W2(yB), C2}
HC={W1(xC), C1, W2(yC), C2, R3(xC), R3(yC), W3(zC), C3}
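One way to reason about these histories is to check that all sites order conflicting transactions the same way. The sketch below is a simplified check, not a full one-copy serializability test: it treats xA, xB, xC as the single logical item x, collects conflict edges per site, and looks for one global order consistent with all of them. The encoding of the histories and the function names are assumptions made for illustration; commits are omitted.

```python
from graphlib import TopologicalSorter, CycleError  # Python 3.9+

# (operation, transaction, logical item), in the order given per site.
HISTORIES = {
    "A": [("W", 1, "x")],
    "B": [("W", 1, "x"), ("R", 2, "x"), ("W", 2, "y")],
    "C": [("W", 1, "x"), ("W", 2, "y"),
          ("R", 3, "x"), ("R", 3, "y"), ("W", 3, "z")],
}


def conflict_edges(history):
    """Edges Ti -> Tj for conflicting operations where Ti's comes first."""
    edges = set()
    for i, (kind_i, txn_i, item_i) in enumerate(history):
        for kind_j, txn_j, item_j in history[i + 1:]:
            conflicting = "W" in (kind_i, kind_j)
            if item_i == item_j and txn_i != txn_j and conflicting:
                edges.add((txn_i, txn_j))
    return edges


def global_serial_order(histories):
    """Return one serial order agreeing with every site, or None."""
    preds = {}
    for history in histories.values():
        for _, txn, _ in history:
            preds.setdefault(txn, set())
        for before, after in conflict_edges(history):
            preds[after].add(before)
    try:
        return list(TopologicalSorter(preds).static_order())
    except CycleError:
        return None  # sites disagree on the order of some transactions


order = global_serial_order(HISTORIES)
```

For the three histories above, every site agrees with the order T1, T2, T3, so `order` is `[1, 2, 3]`. If two sites applied conflicting writes in opposite orders, the function would return `None`.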
Transactions With Distributed
Servers
• Transaction T may touch objects that reside on
different servers
• When T tries to commit
– Need to ensure all these servers commit their updates
from T => T will commit
– Or none of these servers commit => T will abort
• What problem is this?
One-phase Commit
[Figure: Transaction T runs at the Coordinator Server and issues write(A,1); write(B,2); … write(Y, 25); write(Z, 26); commit. Server 1 holds Objects A and B, …, Server 13 holds Objects Y and Z.]
• Special server called “Coordinator”
initiates atomic commit
• Tells other servers to either
commit or abort
Two-phase Commit
[Figure: The Coordinator Server sends a Prepare message to Server 1 … Server 13.]
Failures in Two-phase Commit
• If server voted Yes, it cannot commit unilaterally before
receiving Commit message
• If server voted No, can abort right away (why?)
• To deal with server crashes
– Each server saves tentative updates into permanent storage, right
before replying Yes/No in first phase. Retrievable after crash
recovery.
• To deal with coordinator crashes
– Coordinator logs all decisions and received/sent messages on disk
– After recovery or new election => new coordinator takes over
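The voting and logging rules above can be sketched as follows. This is a toy, single-process model under stated assumptions: the names `Participant` and `two_phase_commit` are hypothetical, Python lists stand in for permanent storage, and there are no timeouts, real messages, or crash recovery.

```python
class Participant:
    def __init__(self, name, will_vote_yes=True):
        self.name = name
        self.will_vote_yes = will_vote_yes
        self.log = []           # stands in for permanent storage
        self.state = "working"

    def prepare(self, updates):
        # Save tentative updates right before replying Yes/No.
        self.log.append(("tentative", updates))
        if self.will_vote_yes:
            self.state = "prepared"  # must now wait for the decision
            return True
        self.state = "aborted"       # a No voter can abort right away
        return False

    def finish(self, decision):
        self.log.append(("decision", decision))
        self.state = "committed" if decision == "commit" else "aborted"


def two_phase_commit(participants, updates_by_name, coordinator_log):
    # Phase 1: send Prepare and collect votes, logging each one.
    votes = []
    for p in participants:
        vote = p.prepare(updates_by_name.get(p.name, {}))
        coordinator_log.append(("vote", p.name, vote))
        votes.append(vote)
    # Commit only if every server voted Yes; log before announcing.
    decision = "commit" if all(votes) else "abort"
    coordinator_log.append(("decision", decision))
    # Phase 2: tell every server the outcome.
    for p in participants:
        p.finish(decision)
    return decision


servers = [Participant("Server 1"), Participant("Server 13")]
outcome = two_phase_commit(
    servers, {"Server 1": {"A": 1, "B": 2}, "Server 13": {"Y": 25, "Z": 26}}, []
)
```

A server in the `prepared` state has given up the right to abort on its own, which is why the coordinator's logged decision must survive a coordinator crash.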
Using Paxos in Distributed Servers
Atomic Commit
• Can instead use Paxos to decide whether to commit a
transaction or not
• But need to ensure that if any server votes No, everyone
aborts
Ordering updates
• Paxos can also be used by replica group (for an object)
to order all updates – iteratively do:
– Server proposes message for next sequence number
– Group reaches consensus (or not)
Summary
• Multiple servers in cloud
– Replication for Fault-tolerance
– Load balancing across objects
• Replication Flavors using concepts we learnt
earlier
– Active replication
– Passive replication
• Transactions and distributed servers
– Two-phase commit