
Chapter 7 - Consistency and Replication

Introduction
• data are generally replicated to enhance reliability and improve performance
• but replication may create inconsistency
• when one copy is updated, all the other copies must be updated as well
• consistency models for shared data are often hard to implement efficiently in large-scale distributed systems; hence simpler models, such as client-centric consistency models, which are easier to implement, are used
Objectives of the Chapter
• we will discuss
  • why replication is useful and its relation to scalability
  • consistency models for shared data, designed for parallel computers, which are also useful in distributed shared memory systems
  • client-centric consistency models (from the perspective of a single client, possibly mobile)
  • how consistency and replication are implemented
  • caching protocols
7.1 Reasons for Replication
• two major reasons: reliability and performance
• reliability
  • if a file is replicated, we can switch to another replica when ours crashes
  • we can provide better protection against corrupted data; similar to mirroring
• performance
  • if the system has to scale in size and geographical area, place a copy of the data in the proximity of the processes using it, reducing access time and increasing performance; for example, a Web server accessed by thousands of clients from all over the world
• caching is strongly related to replication; it is normally done by clients
• the cost of replication: more bandwidth is consumed to keep replicas consistent, and there is a possibility of inconsistency
• Replication as a Scaling Technique
  • replication and caching are widely applied as scaling techniques
  • processes can use local copies, limiting access time and traffic
  • however, we need to keep the copies consistent; this raises two problems:
  1. it may require more network bandwidth
    • if the copies are refreshed more often than they are used (a low access-to-update ratio), the cost in bandwidth outweighs the benefit; not all updates will have been used
  2. keeping copies consistent may itself be subject to serious scalability problems
    • intuitively, a read operation made on any copy should return the same value (the copies are always the same)
    • thus, when an update operation is performed on one copy, it should be propagated to all copies before a subsequent operation takes place
    • this is sometimes called tight consistency (a write is performed at all copies in a single atomic operation or transaction)
    • atomicity is difficult to implement, since it means that all replicas first need to reach agreement on when exactly an update is to be performed locally, say by deciding a global ordering of operations using Lamport timestamps or by letting a coordinator assign an order; both take a lot of communication time
• dilemma
  • scalability problems can be alleviated by applying replication and caching, leading to better performance
  • but keeping copies consistent requires global synchronization, which is generally costly in terms of performance
• solution: loosen the consistency constraints
  • updates no longer need to be executed as atomic operations (no more instantaneous global synchronization), but copies may not always be the same everywhere
  • to what extent consistency can be loosened depends on the specific application (the purpose of the data as well as its access and update patterns)
7.2 Data-Centric Consistency Models
• consistency has always been discussed in terms of read and write operations on shared data available by means of (distributed) shared memory, a (distributed) shared database, or a (distributed) file system
• we use the broader term data store, which may be physically distributed across multiple machines
• assume also that each process has a local copy of the data store and that write operations are propagated to the other copies

the general organization of a logical data store, physically distributed and replicated across multiple processes
• a consistency model is a contract between processes and the data store
  • if processes agree to obey certain rules, the data store promises to work correctly
• ideally, a process that reads a data item expects a value that shows the result of the last write operation on that data item
• in a distributed system, with several copies and in the absence of a global clock, it is difficult to know which write operation is the last one
• to simplify the implementation, each consistency model restricts what read operations may return
• data-centric consistency models to be discussed
  1. Sequential Consistency
  2. Causal Consistency
  3. Entry Consistency
• the following notation and assumptions will be used
  • Wi(x)a means a write by process Pi to data item x with the value a has been done
  • Ri(x)b means a read by process Pi from data item x returning the value b has been done
  • the index may be omitted when there is no confusion as to which process is accessing the data
  • assume that initially each data item is NIL
  • the time axis is drawn horizontally, with time increasing from left to right

behavior of two processes operating on the same data item; it took some time to propagate the update of x to P2

1. Sequential Consistency
• a data store is said to be sequentially consistent when it satisfies the following condition:
  • the result of any execution is the same as if the (read and write) operations by all processes on the data store were executed in some sequential order, and the operations of each individual process appear in this sequence in the order specified by its program
• i.e., all processes see the same interleaving of operations
• time does not play a role; there is no reference to the "most recent" write operation
• example: four processes operating on the same data item x

a) a sequentially consistent data store
  • the write operation of P2 appears to have taken place before that of P1, and it does so for all processes

b) a data store that is not sequentially consistent
  • to P3, it appears as if the data item has first been changed to b and later to a; but P4 will conclude that the final value is b
  • not all processes see the same interleaving of write operations
• to understand sequential consistency better, consider the following example
  • assume three concurrently executing processes and three data items (integers) stored in a sequentially consistent data store
  • each variable is assumed to be initialized to 0

Process P1        Process P2        Process P3
x ← 1;            y ← 1;            z ← 1;
print(y, z);      print(x, z);      print(x, y);

three concurrently executing processes

• assignments are write operations and prints are read operations; all statements are assumed to be indivisible
• there are 720 (= 6!) possible execution sequences
• of the 120 (= 5!) sequences that begin with x ← 1, some have print(x, z) before y ← 1 and violate program order; some also have print(x, y) before z ← 1 and violate program order
• only 1/4 (= 30) of those 120 sequences are valid
• also considering those that start with y ← 1 and z ← 1, there are a total of 90 valid execution sequences
• if we concatenate the outputs of P1, P2, and P3 in that order, we get a 6-bit signature of the execution

x ← 1;          x ← 1;          y ← 1;          y ← 1;
print(y, z);    y ← 1;          z ← 1;          x ← 1;
y ← 1;          print(x, z);    print(x, y);    z ← 1;
print(x, z);    print(y, z);    print(x, z);    print(x, z);
z ← 1;          z ← 1;          x ← 1;          print(y, z);
print(x, y);    print(x, y);    print(y, z);    print(x, y);

Prints:    001011      101011      010111      111111
Signature: 001011      101011      110101      111111

four valid execution sequences for the processes of the previous slide; the vertical axis is time (the Prints row lists the outputs in execution order, while the Signature concatenates the outputs of P1, P2, and P3)
• there are a total of 64 (= 2^6) possible signatures, where 6 is the number of bits in a signature
• not all 64 signatures are valid; for example
  • 000000 is not valid: it would mean all prints were done before all assignments, violating the requirement that statements are executed in program order
  • 001001 is impossible: the first two bits, 00 (P1 reads y = 0 and z = 0), imply that P1 ran before P2 and P3; the last two bits, 01 (P3 reads x = 0 and y = 1), imply that P3 printed before P1's assignment x ← 1; the two requirements contradict each other
• the 90 valid statement orderings produce a variety of signatures (fewer than 64) that are allowed under sequential consistency
• all processes must accept these as valid results and work correctly; this is the contract between them and the data store
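These counts can be verified mechanically. The following Python sketch (an illustration of the argument above, not part of the original slides) enumerates every interleaving of the six statements that respects program order, simulates it, and collects the resulting signatures:

```python
from itertools import permutations

# The three processes, each as a list of statements in program order.
# A write is ('w', var); a print is ('r', (var1, var2)).
programs = [
    [('w', 'x'), ('r', ('y', 'z'))],   # P1: x <- 1; print(y, z)
    [('w', 'y'), ('r', ('x', 'z'))],   # P2: y <- 1; print(x, z)
    [('w', 'z'), ('r', ('x', 'y'))],   # P3: z <- 1; print(x, y)
]

def signature(order):
    """Run one interleaving; concatenate the outputs of P1, P2, P3."""
    state = {'x': 0, 'y': 0, 'z': 0}
    out = ['', '', '']
    for pid, stmt in order:
        kind, arg = programs[pid][stmt]
        if kind == 'w':
            state[arg] = 1
        else:
            out[pid] = f"{state[arg[0]]}{state[arg[1]]}"
    return ''.join(out)

slots = [(pid, i) for pid in range(3) for i in range(2)]
valid = [p for p in permutations(slots)
         # program order: each process's write must precede its print
         if all(p.index((pid, 0)) < p.index((pid, 1)) for pid in range(3))]

sigs = {signature(order) for order in valid}
print(len(valid))              # 90 valid execution sequences
print(len(sigs))               # number of distinct signatures (< 64)
print('001011' in sigs)        # True: a valid signature
print('000000' in sigs)        # False: all prints before all writes
```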
2. Causal Consistency
• a weakening of sequential consistency: it distinguishes between events that are potentially causally related and those that are not
  • e.g., y = x + 5: a write on y that follows a read on x; the writing of y may have depended on the value read for x
  • otherwise, the two events are concurrent, e.g., two processes writing two different variables
• if event B is caused or influenced by an earlier event A, causality requires that everyone else first sees A, then B
• a data store is said to be causally consistent if it obeys the following condition:
  • writes that are potentially causally related must be seen by all processes in the same order; concurrent writes may be seen in a different order on different machines
• example
  • W2(x)b and W1(x)c are concurrent; there is no requirement for processes to see them in the same order

this sequence is allowed with a causally-consistent store, but not with a sequentially consistent store

a) a violation of a causally-consistent store
b) a correct sequence of events in a causally-consistent store (R(x)a is removed), but not with a sequentially consistent store

• implementing causal consistency requires keeping track of which processes have seen which writes; a dependency graph must be constructed and maintained, say by means of vector timestamps
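The vector-timestamp bookkeeping can be sketched as follows (a minimal illustration, not the slides' protocol): each replica stamps its writes with a vector clock, and a remote write is held back until every write that causally precedes it has been applied locally.

```python
class Replica:
    """Causal delivery of writes via vector clocks (illustrative sketch)."""

    def __init__(self, rid, n_replicas):
        self.rid = rid
        self.vc = [0] * n_replicas   # vc[k] = writes from replica k applied here
        self.store = {}              # local copy of the data store
        self.pending = []            # remote writes that cannot be applied yet

    def write(self, var, value):
        """A local write; returns the message broadcast to the other replicas."""
        self.vc[self.rid] += 1
        self.store[var] = value
        return (self.rid, list(self.vc), var, value)

    def receive(self, msg):
        self.pending.append(msg)
        self._apply_deliverable()

    def _deliverable(self, sender, ts):
        # the sender's next write, and we have already seen everything the
        # sender had seen when it issued the write (its causal dependencies)
        return (ts[sender] == self.vc[sender] + 1 and
                all(ts[k] <= self.vc[k] for k in range(len(ts)) if k != sender))

    def _apply_deliverable(self):
        progress = True
        while progress:
            progress = False
            for msg in list(self.pending):
                sender, ts, var, value = msg
                if self._deliverable(sender, ts):
                    self.store[var] = value
                    self.vc[sender] = ts[sender]
                    self.pending.remove(msg)
                    progress = True
```

With this rule, causally related writes are applied in the same order at every replica, while concurrent writes may be applied in different orders, which is exactly the condition stated above.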
• Grouping Operations
  • sequential and causal consistency are defined at the level of individual read and write operations
  • this fine granularity does not match the granularity provided by applications
  • at the program level, read and write operations are grouped by the pair of operations ENTER_CS and LEAVE_CS
  • there is no need to worry about intermediate results in a critical section, since other processes will not see the data until the process leaves the critical section; only the final result needs to be seen by other processes
• this can be done with shared synchronization variables
  • when a process enters its critical section, it should acquire the relevant synchronization variables
  • when a process leaves its critical section, it releases these variables
• synchronization variable ownership
  • each synchronization variable has a current owner: the process that acquired it last
  • the owner may enter and exit critical sections repeatedly without sending messages
  • other processes must send a message to the current owner asking for ownership and the current values of the data associated with that synchronization variable
  • several processes can also simultaneously own a synchronization variable in nonexclusive mode, i.e., only for reading
3. Entry Consistency
• a data store exhibits entry consistency if it meets all the following conditions (a sketch follows below):
  • an acquire access of a synchronization variable is not allowed to perform with respect to a process until all updates to the guarded shared data have been performed with respect to that process (at an acquire, all remote changes to the guarded data must be made visible)
  • before an exclusive mode access to a synchronization variable by a process is allowed to perform with respect to that process, no other process may hold the synchronization variable, not even in nonexclusive mode
  • after an exclusive mode access to a synchronization variable has been performed, any other process's next nonexclusive mode access to that synchronization variable may not be performed until it has performed with respect to that variable's owner (it must first fetch the most recent copies of the guarded shared data)
• assume locks are associated with individual data items, instead of with the entire shared data

a valid event sequence for entry consistency

• when an acquire is done, only those data items guarded by that synchronization variable are made consistent
• therefore, only a few shared data items have to be synchronized at a release
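To make the acquire/release discipline concrete, here is a minimal Python sketch (an illustration under simplifying assumptions, not the slides' protocol): the guarded data travels with the synchronization variable, so remote changes become visible exactly at an acquire, and intermediate results are never seen elsewhere.

```python
class SyncVariable:
    """A synchronization variable together with the data items it guards."""
    def __init__(self, guarded_items):
        self.owner = None
        self.values = dict(guarded_items)   # last released values of the guarded data

class Process:
    def __init__(self, name):
        self.name = name
        self.local = {}   # this process's local copies of data items

    def acquire(self, sv):
        # at an acquire, all remote changes to the guarded data are made
        # visible: pull the current values released by the previous owner
        self.local.update(sv.values)
        sv.owner = self.name

    def release(self, sv):
        # at a release, push back only the final results of the critical
        # section; intermediate values were never visible to other processes
        for var in sv.values:
            sv.values[var] = self.local[var]

# usage: p1 updates x and y in a critical section; p2 sees only the result
lock_xy = SyncVariable({'x': 0, 'y': 0})
p1, p2 = Process('p1'), Process('p2')
p1.acquire(lock_xy)
p1.local['x'], p1.local['y'] = 1, 2
p1.release(lock_xy)
p2.acquire(lock_xy)
print(p2.local)   # {'x': 1, 'y': 2}
```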
7.3 Client-Centric Consistency Models
• in many applications, updates happen very rarely
• for these applications, data-centric models, which give high importance to updates, are not suitable; very weak consistency is generally sufficient for such systems
• Eventual Consistency
  • in many applications, few processes (or a single process) update the data while many read it, and there are no write-write conflicts; we need to handle only read-write conflicts; e.g., a DNS server, a Web site
  • for such applications, it is even acceptable for readers to see old versions of the data (e.g., cached versions of a Web page) until the new version has propagated
  • with eventual consistency, it is only required that updates are guaranteed to gradually propagate to all replicas
• data stores that are eventually consistent have the property that, in the absence of updates, all replicas converge toward identical copies of each other
• write-write conflicts are rare and are handled separately
• the problem with eventual consistency arises when different replicas are accessed; e.g., a mobile client accessing a distributed database may get an older version of the data when it uses a new replica as a result of changing location

the principle of a mobile user accessing different replicas of a distributed database

• the solution is to introduce client-centric consistency
  • it provides guarantees for a single client concerning the consistency of that client's accesses to the data store; no guarantees are given concerning concurrent accesses by different clients
• assumptions
  • consider a data store that is physically distributed across multiple machines
  • a process reads from and writes to a locally (or nearest) available copy, and updates are propagated
  • assume that each data item has an associated owner, the only process permitted to modify that item; hence write-write conflicts are avoided
• the following notation is used
  • xi[t] denotes the version of data item x at local copy Li at time t
  • version xi[t] is the result of a series of write operations at Li that took place since initialization; denote this set by WS(xi[t])
  • if the operations in WS(xi[t1]) have also been performed at local copy Lj at a later time t2, we write WS(xi[t1]; xj[t2]); it means that WS(xi[t1]) is part of WS(xj[t2])
  • the time index may be omitted if the ordering of operations is clear from context: WS(xi), WS(xj), WS(xi; xj)
• there are four client-centric consistency models

1. Monotonic Reads
• a data store is said to provide monotonic-read consistency if the following condition holds:
  • if a process reads the value of a data item x, any successive read operation on x by that process will always return that same value or a more recent value
• i.e., a process never sees a version of the data older than what it has already seen
• e.g., assume a replicated mail server; if a user moves from location x to location y, s/he should be able to see all the e-mails that were visible at location x

the read operations performed by a single process P at two different local copies of the same data store
a) a monotonic-read consistent data store
b) a data store that does not provide monotonic reads; there is no guarantee that when R(x2) is executed, WS(x2) also contains WS(x1)
2. Monotonic Writes
• it may be required that write operations propagate in the correct order to all copies of the data store
• in a monotonic-write consistent data store, the following condition holds:
  • a write operation by a process on a data item x is completed before any successive write operation on x by the same process
• completing a write operation means that the copy on which a successive operation is performed reflects the effect of the previous write operation by the same process, no matter where that operation was initiated
• monotonic writes may not be necessary if a later write operation completely overwrites the present value:

x = 78;
x = 90;

  • there is no need to make sure that x has first been changed to 78
  • the ordering matters only if part of the state of the data item changes; e.g., a software library in which one or more functions are replaced, leading to a new version

the write operations performed by a single process P at two different local copies of the same data store
a) a monotonic-write consistent data store
b) a data store that does not provide monotonic-write consistency
3. Read Your Writes
• a data store is said to provide read-your-writes consistency if the following condition holds:
  • the effect of a write operation by a process on data item x will always be seen by a successive read operation on x by the same process
• i.e., a write operation is always completed before a successive read operation by the same process, no matter where that read operation takes place
• the absence of read-your-writes consistency is often experienced when a Web page is modified with an editor but the modification is not seen in the browser because of caching; read-your-writes consistency guarantees that the cache is invalidated when the page is updated

a) a data store that provides read-your-writes consistency
b) a data store that does not
4. Writes Follow Reads
• updates are propagated as the result of previous read operations
• a data store is said to provide writes-follow-reads consistency if the following condition holds:
  • a write operation by a process on a data item x following a previous read operation on x by the same process is guaranteed to take place on the same or a more recent value of x than the one that was read
• i.e., any successive write operation by a process on a data item x will be performed on a copy of x that is up to date with the value most recently read by that process
• this guarantees, for example, that users of a newsgroup see a posting of a reaction to an article only after they have seen the original article; if B is a response to message A, writes-follow-reads consistency guarantees that B will be written to any copy only after A has been written

a) a writes-follow-reads consistent data store
b) a data store that does not provide writes-follow-reads consistency
7.4 Replica Management
• a key issue in replication is deciding where, when, and by whom replicas should be placed, and how to keep replicas consistent
• placement refers to two issues: placing replica servers (finding the best locations) and placing content (finding the best servers)

a. Replica-Server Placement
• how to select the best K out of N locations, where K < N; two possibilities (there are more):
  • based on the distance between clients and locations, where distance can be measured in terms of latency or bandwidth
  • or by considering the topology of the Internet as formed by Autonomous Systems; an AS is a network managed by a single organization in which all nodes run the same routing protocol; then place the servers on the K routers with the largest number of network interfaces (or links)
b. Content Replication and Placement
• three types of replicas:
  • permanent replicas
  • server-initiated replicas
  • client-initiated replicas

the logical organization of different kinds of copies of a data store into three concentric rings
i. Permanent Replicas
• the initial set of replicas that constitute a distributed data store; normally a small number of replicas
• e.g., a Web site, in two forms:
  • the files that constitute the site are replicated across a limited number of servers on a LAN; a request is forwarded to one of the servers (e.g., using a round-robin strategy)
  • mirroring: the Web site is copied to a limited number of servers, called mirror sites, which are geographically spread across the Internet; clients choose one of the mirror sites

ii. Server-Initiated Replicas (push caches)
• dynamically (temporarily) placing replicas at new locations in response to sudden increases in requests
• Web hosting companies dynamically create replicas to improve performance (e.g., by creating a replica near hosts that use the Web site very often)

iii. Client-Initiated Replicas (client caches, or simply caches)
• used to improve access time
• a cache is a local storage facility used by a client to temporarily store a copy of the data it has just received
• it is placed on the same machine as its client, or on a machine shared by clients on a LAN
• managing the cache is left entirely to the client; the data store from which the data have been fetched does nothing to keep cached data consistent
• caches may become less useful in the future because of improvements in network and server performance
c. Content Distribution
• updates must propagate to the replicas, ensuring consistency; some design issues in propagating updates:
  • state versus operations
  • pull versus push protocols
  • unicasting versus multicasting

i. State versus Operations
• what is actually to be propagated? three possibilities:
  • send a notification of the update only (used in invalidation protocols); it may specify which part of the data store has been updated
    • other replicas can then fetch the latest version when they receive a read request
    • useful when the read-to-write ratio is small (i.e., there are many update operations compared to read operations)
    • uses little bandwidth
  • transfer the modified data
    • useful when the read-to-write ratio is high
    • instead of propagating the modified data, it is possible to log the changes and transfer only those logs, saving bandwidth
    • transfers can also be aggregated by packing multiple modifications together, saving communication overhead
  • transfer the update operations and the parameters those operations need (also called active replication)
    • it assumes that each machine knows how to perform the operations
    • uses little bandwidth, but more processing power is needed at each replica
ii. Pull versus Push Protocols
• push-based approach (also called server-based protocols)
  • the server propagates updates to other replicas without those replicas even asking for them
  • used when a high degree of consistency is required and the read-to-write ratio is high
  • usually applied for permanent and server-initiated replicas, but can also be used for client caches
  • the server must maintain a list of all replicas and caches
• pull-based approach (also called client-based protocols)
  • a client or a server requests updates from the server whenever needed (used when the read-to-write ratio is low)
  • often used by client caches
  • response time may increase
• an example is the conditional GET in HTTP
  • goal of the conditional GET: the server does not send the object if the cache already has an up-to-date copy
  • cache: specifies the date of its cached copy in the HTTP request with the header If-modified-since: <date>
  • server: the response contains no object if the cached copy is up to date
• two exchanges between cache and server:
  • object not modified: the request carries If-modified-since: <date>; the response is HTTP/1.1 304 Not Modified, with no body
  • object modified: the request carries If-modified-since: <date>; the response is HTTP/1.1 200 OK, followed by <data>
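As a concrete illustration, a conditional GET can be issued from Python with the third-party requests library (this snippet is my own, not from the slides; the URL is a placeholder):

```python
import requests

url = "http://example.com/index.html"   # placeholder URL

# first fetch: the server returns the object plus a Last-Modified date
r1 = requests.get(url)
last_modified = r1.headers.get("Last-Modified")

# revalidation: hand the stored date back to the server; it answers
# 304 Not Modified (no body) if the cached copy is still current,
# or 200 OK with the new object otherwise
if last_modified:
    r2 = requests.get(url, headers={"If-Modified-Since": last_modified})
    if r2.status_code == 304:
        print("cached copy is up to date; reuse it")
    else:
        print(f"object modified; received {len(r2.content)} bytes")
```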
• a comparison between push-based and pull-based protocols; for simplicity, assume multiple clients and a single server (a "client" here refers to replicas and caches)

Issue                      Push-based                                 Pull-based
State of server            List of client replicas and caches         None
Messages sent              Update (and possibly fetch update later)   Poll and update
Response time at client    Immediate (or fetch-update time)           Fetch-update time

• to gain the advantages of both, there is a hybrid form based on leases
  • a lease is a promise by the server that it will push updates to the client for a specified time
  • when the lease expires, the client has to poll the server for updates and pull them, if any, or request a new lease
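The client side of a lease can be modeled very simply. In the sketch below (illustrative; server.fetch and server.renew_lease are hypothetical names), the client trusts its cache while the lease is valid and falls back to polling once it expires:

```python
import time

class Lease:
    """A promise by the server to push updates until expires_at."""
    def __init__(self, duration_s):
        self.expires_at = time.time() + duration_s

    def valid(self):
        return time.time() < self.expires_at

def read(cache, key, server, lease):
    if lease.valid():
        # within the lease the server pushes updates, so the cache is current
        return cache[key]
    # lease expired: pull the latest value, then request a new lease
    cache[key] = server.fetch(key)             # hypothetical server API
    lease.expires_at = server.renew_lease()    # hypothetical server API
    return cache[key]
```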
iii. Unicasting versus Multicasting
• multicasting can be combined with the push-based approach; the underlying network takes care of sending a message to multiple receivers
• unicasting is the only possibility for the pull-based approach; the server sends separate messages to each receiver
7.5 Consistency Protocols
• so far we have concentrated on various consistency models and general design issues
• consistency protocols describe an implementation of a specific consistency model
• there are three types for data-centric consistency models:
  • primary-based protocols
    • remote-write protocols
    • local-write protocols
  • replicated-write protocols
    • active replication
    • quorum-based protocols
  • cache-coherence protocols
1. Primary-Based Protocols
• each data item x in the data store has an associated primary, which is responsible for coordinating write operations on x
• two approaches: remote-write protocols and local-write protocols

a. Remote-Write Protocols
• all write operations are forwarded to a fixed single server (the primary); read operations can be carried out locally
• such schemes are known as primary-backup protocols; the backup servers are updated each time the primary is updated

the principle of the primary-backup protocol

• this may lead to performance problems, since it may take time before the process that initiated the write operation is allowed to continue: updates are blocking
• an alternative is a nonblocking primary-backup protocol: as soon as the primary has updated its local copy, it returns an acknowledgement and propagates the update to the backups afterwards; the problem is fault tolerance
• primary-backup protocols provide a straightforward implementation of sequential consistency
  • the primary can order all incoming writes, so all processes see all write operations in the same order
  • with blocking protocols, processes will always see the effects of their most recent write operation (this cannot be guaranteed for nonblocking protocols)
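A blocking remote-write (primary-backup) protocol can be sketched in a few lines of Python (an illustration only; a real protocol must also handle failures and concurrent writers):

```python
class Backup:
    def __init__(self):
        self.store = {}

    def apply(self, var, value):     # invoked by the primary
        self.store[var] = value
        return "ack"

    def read(self, var):             # reads are served from the local copy
        return self.store.get(var)

class Primary(Backup):
    def __init__(self, backups):
        super().__init__()
        self.backups = backups

    def write(self, var, value):
        # blocking protocol: update the primary's copy, then wait for an
        # acknowledgement from every backup before the client may continue
        self.store[var] = value
        for b in self.backups:
            assert b.apply(var, value) == "ack"
        return "ack"

# usage: writes go to the primary; reads may go to any replica
backups = [Backup(), Backup()]
primary = Primary(backups)
primary.write("x", 42)
print(backups[0].read("x"))   # 42: all copies agree once the write returns
```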
b. Local-Write Protocols
• the primary migrates between processes that wish to perform a write operation
• multiple successive write operations can be carried out locally, while (other) reading processes can still access their local copies
• this improvement is possible only if a nonblocking protocol is followed, i.e., updates are propagated to the replicas after the primary has finished updating locally

the nonblocking primary-backup protocol in which the primary migrates to the process wanting to perform an update
2. Replicated-Write Protocols
• unlike primary-based protocols, write operations can be carried out at multiple replicas; two approaches: active replication and quorum-based protocols

a. Active Replication
• each replica has an associated process that carries out update operations
• updates are generally propagated by means of the write operations (the operation itself is propagated); it is also possible to send the update (the resulting state)
• the operations need to be done in the same order everywhere: a totally-ordered multicast is required
• two possibilities to ensure that this order is followed:
  • Lamport timestamps (scalability problem), or
  • use of a central sequencer that assigns a unique sequence number to each operation; an operation is first sent to the sequencer, which then forwards it to all replicas (still a scalability problem)
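Sequencer-based total ordering can be sketched as follows (a minimal illustration, not a production protocol): the sequencer numbers each operation, and every replica applies operations strictly in sequence-number order, holding back any that arrive early.

```python
import heapq

class Sequencer:
    def __init__(self):
        self.next_seq = 0

    def assign(self, op):
        seq, self.next_seq = self.next_seq, self.next_seq + 1
        return (seq, op)

class ActiveReplica:
    def __init__(self):
        self.state = {}
        self.expected = 0        # next sequence number to apply
        self.held_back = []      # operations that arrived out of order

    def deliver(self, seq, op):
        heapq.heappush(self.held_back, (seq, op))
        # apply operations strictly in sequence-number order
        while self.held_back and self.held_back[0][0] == self.expected:
            _, (var, value) = heapq.heappop(self.held_back)
            self.state[var] = value
            self.expected += 1

# usage: replicas may receive the messages in different orders,
# but both apply the writes in the sequencer's order
seqr = Sequencer()
r1, r2 = ActiveReplica(), ActiveReplica()
m1, m2 = seqr.assign(("x", 1)), seqr.assign(("x", 2))
r1.deliver(*m1); r1.deliver(*m2)   # in order
r2.deliver(*m2); r2.deliver(*m1)   # out of order: m2 is held back
print(r1.state, r2.state)          # both print {'x': 2}
```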
• a problem with active replication is replicated invocations (see pages 475-477 of the textbook for the next three slides)
  • suppose object A invokes B, and B invokes C; if object B is replicated, each replica of B will invoke C independently
  • this may create inconsistency and other undesirable effects; what if the operation on C is to transfer $10?

the problem of replicated invocations

• one solution is to have a replication-aware communication layer that prevents the same invocation from being sent more than once
  • when a replicated object B invokes another replicated object C, the invocation request is first assigned the same unique identifier by each replica of B
  • a coordinator of the replicas of B forwards its request to all replicas of object C, while the other replicas of B hold back; hence only a single request is sent to each replica of C
  • the same mechanism is used to ensure that only a single reply message is returned to the replicas of B

a) forwarding an invocation request from a replicated object to another replicated object
b) returning a reply to a replicated object

• the above solution is sender-based; alternatively, a receiving replica can detect multiple copies of incoming messages belonging to the same invocation and pass only one copy to its associated object
b. Quorum-Based Protocols
• use of voting: clients are required to request and acquire the permission of multiple servers before either reading or writing a replicated data item
• e.g., assume a distributed file system where a file is replicated on N servers
  • to update the file, a client must first contact at least half + 1 of the servers (a majority) and get them to agree to perform the update; the update is then carried out and the file is given a new version number
  • to read the file, a client must also contact at least half + 1 of the servers and ask them to send their version numbers; if all version numbers agree, this must be the most recent version
• a more general approach is to arrange a read quorum (a collection of any NR servers, or more) for reading and a write quorum (of at least NW servers) for updating
• the values of NR and NW are subject to the following two constraints:
  • NR + NW > N, to prevent read-write conflicts
  • NW > N/2, to prevent write-write conflicts

three examples of the voting algorithm (N = 12)
a) a correct choice of read and write sets; any subsequent read quorum of three servers will have to contain at least one member of the write set, which has a higher version number
b) a choice that may lead to write-write conflicts; if one client chooses {A,B,C,E,F,G} as its write set and another client chooses {D,H,I,J,K,L}, the two updates will both be accepted without detecting that they actually conflict
c) a correct choice, known as ROWA (read one, write all)
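The two constraints are easy to check mechanically. The sketch below (my own) tests quorum sizes matching the three N = 12 examples above, which I take to be (NR, NW) = (3, 10), (7, 6), and (1, 12):

```python
def quorum_ok(n, n_read, n_write):
    """Check the two quorum constraints for N replicas."""
    no_read_write_conflict = n_read + n_write > n    # read and write quorums overlap
    no_write_write_conflict = 2 * n_write > n        # any two write quorums overlap
    return no_read_write_conflict and no_write_write_conflict

for nr, nw in [(3, 10), (7, 6), (1, 12)]:
    print(f"NR={nr:2d} NW={nw:2d} valid={quorum_ok(12, nr, nw)}")
# NR= 3 NW=10 valid=True
# NR= 7 NW= 6 valid=False  (NW <= N/2: disjoint write quorums are possible)
# NR= 1 NW=12 valid=True   (ROWA: read one, write all)
```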
3. Cache-Coherence Protocols
• caches form a special case of replication, as they are controlled by clients instead of servers
• cache-coherence protocols ensure that a cache is consistent with the server-initiated replicas
• two design issues in implementing caches: coherence detection and coherence enforcement
• coherence detection strategy: when are inconsistencies actually detected?
  • static solution: prior to execution, a compiler performs an analysis to determine which data may lead to inconsistencies if cached, and inserts instructions that avoid inconsistencies
  • dynamic solution: at runtime, a check is made with the server to see whether the cached data have been modified since they were cached
• coherence enforcement strategy: how are caches kept consistent with the copies stored at the servers?
  • simplest solution: do not allow shared data to be cached at all; this forfeits the performance benefit of caching
  • otherwise, allow shared data to be cached and either
    • let the server send an invalidation to all caches whenever a data item is modified, or
    • let the server propagate the update itself
• Implementing Client-Centric Consistency: a Naive Implementation
  • each write operation is given a globally unique identifier, assigned by the server that accepts the operation for the first time
  • for each client, two sets of identifiers are then tracked:
    • the read set consists of the write identifiers relevant to the read operations performed by the client
    • the write set consists of the identifiers of the writes performed by the client
• monotonic-read consistency is implemented as follows (a sketch of this bookkeeping follows the four cases below)
  • when a client performs a read operation at a server, the server is handed the client's read set to check whether all the identified writes have taken place locally
  • if not, the server contacts other servers to bring itself up to date before carrying out the read operation (or the read operation is forwarded to a server where the write operations have taken place)
  • after the read operation, the relevant write operations that have taken place at the selected server are added to the client's read set
• monotonic-write consistency is implemented as follows
  • when a client initiates a new write operation at a server, the server is handed the client's write set
  • it then ensures that the identified write operations are performed first, and in the correct order (which may increase response time)
  • after performing the new write, that operation's write identifier is added to the write set
• read-your-writes consistency is implemented as follows
  • the server where the read operation is performed must have seen all the write operations in the client's write set
  • the writes can be fetched from other servers before the read operation is performed (which may result in poor response time)
  • alternatively, the client-side software can search for a server where the identified write operations in the client's write set have already been performed
• writes-follow-reads consistency is implemented as follows
  • first bring the selected server up to date with the write operations in the client's read set
  • then add the identifier of the new write operation to the write set, along with the identifiers in the read set (which have now become relevant for the write operation just performed)
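This bookkeeping can be sketched compactly in Python (an illustration under simplifying assumptions: writes are propagated synchronously on demand, and the "relevant" writes added to the read set are taken to be all writes the chosen server has seen):

```python
import itertools

_next_id = itertools.count(1)     # globally unique write identifiers

class Server:
    def __init__(self):
        self.writes = {}          # wid -> (var, value): writes performed here

    def catch_up(self, others, needed):
        """Fetch the identified writes from the other servers (naive sync)."""
        for wid in needed - self.writes.keys():
            for s in others:
                if wid in s.writes:
                    self.writes[wid] = s.writes[wid]
                    break

class Client:
    def __init__(self):
        self.read_set = set()     # identifiers of writes relevant to my reads
        self.write_set = set()    # identifiers of writes I performed

    def read(self, server, others, var):
        # monotonic reads / read your writes: the chosen server must first
        # perform all writes identified in my read set and my write set
        server.catch_up(others, self.read_set | self.write_set)
        self.read_set |= server.writes.keys()
        for wid in sorted(server.writes, reverse=True):   # newest write wins
            if server.writes[wid][0] == var:
                return server.writes[wid][1]

    def write(self, server, others, var, value):
        # monotonic writes / writes follow reads: my earlier writes and the
        # writes I have read must be performed at this server first
        server.catch_up(others, self.write_set | self.read_set)
        wid = next(_next_id)
        server.writes[wid] = (var, value)
        self.write_set.add(wid)
        self.write_set |= self.read_set   # reads become relevant to this write
```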
• problem: in the naive implementation, the read and write sets can become very large
• to improve efficiency, read and write operations can be grouped into sessions, clearing the sets when the session ends
  • a session typically starts when an application begins and is closed when the application exits
