Mod 5

Uploaded by

rar
Consistency and Replication

Reasons for Replication

• Data replication is a common technique in distributed systems.
There are two reasons for data replication:
◼ It increases the reliability of a system.
If one replica is unavailable or crashes, another can be used.
◼ It improves the performance of a system.
The nearest copy can be accessed, reducing latency.

• The key issue is the need to maintain consistency of replicated data.
◼ If one copy is modified, the others become inconsistent.
Replication

Price to be paid for replication:

1. Modifications need to be carried out on all the copies to
ensure consistency.
2. Keeping all replicas up to date consumes more network
bandwidth.
3. A read operation performed at any copy should always
return the same result,
i.e. when an update operation is performed on one copy,
the update should be propagated to all copies before
any subsequent operation takes place NO MATTER AT
WHICH COPY THAT OPERATION IS PERFORMED.
Hence the update operation at all copies should be
viewed as a SINGLE ATOMIC OPERATION.
• This requires global synchronization, which takes a lot
of communication time when replicas are spread
across a wide area network.
• Solution:
Loosen the consistency constraints, i.e.
relax the requirement that updates need to be
executed as atomic operations.
• Price paid:
Copies may not always be the same everywhere.
• To what extent consistency can be loosened depends
on the application.
Hence we have a number of consistency models.

Data Centric consistency models:
Assumptions:
1. Data store: a distributed data store.
2. Each process can access data from its local copy of
the data store.
3. A data operation is classified as a write when it changes data;
otherwise it is classified as a read.
Notations:
• Write: Wi(x)a -- a write by process Pi to data item x with
the value a
• Read: Ri(x)b -- a read from data item x by process Pi
returning b
• A consistency model is a contract between processes
and the data store.
It says that if processes agree to obey certain rules,
then the store promises to work correctly.
1. Strict Consistency:
Any read on a data item x returns a value corresponding to the
result of the most recent write on x. “All writes are
instantaneously visible to all processes.”

Strict consistency is observed in a uniprocessor system:

a=1;
a=2;
print(a);

The value displayed for a is 2.

Data Centric consistency models:

A strictly consistent store

A store that is not strictly consistent.


All writes should be instantaneously visible to all processes,
which is very difficult when copies are spread wide apart.
The problem with strict consistency is that it relies on absolute
global time, which makes it impossible to implement in a
distributed system.
Issues in Design and Implementation of
DSM(consistency)

2. Sequential Consistency:

• Weaker than strict consistency.

• A data store is said to be sequentially consistent if
it satisfies the following condition: the result of any
execution is the same as if the read and write operations
by all processes were executed in some sequential order,
and the operations of each individual process appear in
this sequence in the order specified by its program.

• Any valid interleaving is legal, but all processes must see
the same interleaving.
Issues in Design and Implementation of DSM

For example, to improve query performance, a bank may
place copies of an account database in two different
cities, say New York and San Francisco. A query is always
forwarded to the nearest copy.
Assume a customer in San Francisco wants to add $100
to his account (account number 559), which currently
contains $1000. At the same time, a bank employee in
New York initiates an update by which the customer's
account is to be increased with 1 percent interest. Both
updates should be carried out at both copies of the
database. However, due to communication delays in the
underlying network, the updates may arrive at the two
copies in different orders.
• Example:
The customer's update operation is performed in San Francisco
before the interest update. In contrast, the copy of the account
in New York's replica is first updated with 1 percent interest
and after that with the $100 deposit, so the two copies end up
with different balances ($1111 in San Francisco, $1110 in New York).

• The two updates should have been performed in
the same order at each copy to achieve
(sequential) consistency.
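
The balances in the bank example can be checked with a short sketch. The function names `deposit` and `add_interest` are illustrative and do not come from the slides; `Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction

def deposit(balance, amount=100):
    """Customer update: add a fixed amount to the balance."""
    return balance + amount

def add_interest(balance, rate=Fraction(1, 100)):
    """Bank update: add 1 percent interest to the balance."""
    return balance + balance * rate

initial = Fraction(1000)

# San Francisco replica: the deposit arrives first, then the interest.
san_francisco = add_interest(deposit(initial))

# New York replica: the interest arrives first, then the deposit.
new_york = deposit(add_interest(initial))

print(san_francisco, new_york)  # the two replicas now disagree
```

Applying the same two updates in the same order at both replicas would give the same balance at both, whichever order is chosen.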
Data Centric consistency models:

a) A sequentially consistent data store.

b) A data store that is not sequentially consistent:
P3 and P4 disagree on the order of the writes.
Data Centric consistency models:
3. Causal consistency
• Weaker than sequential consistency, since it makes a
distinction between events that are potentially causally
related and those that are not.

• If event B is caused or influenced by an earlier event A,
then causality requires that everyone sees A before B.

• Events that are not causally related are called concurrent
events.
For a data store to be considered causally
consistent, it is necessary that the store obeys
the following condition:

Writes that are potentially causally related
must be seen by all processes in the same order.

Concurrent writes
may be seen in a different order on different machines.

Causally related writes:
If write1 → read, and read → write2, then write1 → write2.
Causal vs. Concurrent events
• Consider an interaction between processes P1
and P2 operating on replicated data x and y.

Case 1:
P1: W(x)a
P2: R(x)a W(y)b
Events are causally related. Events are not concurrent.
• Computation of y at P2 may have depended on the value of x
written by P1.

Case 2:
P1: W(x)a
P2: W(y)b R(x)a
Events are not causally related. Events are concurrent.
• Computation of y at P2 does not depend on the value of x
written by P1.

Legend: R(x)b = read variable x, result is b; W(x)b = write
variable x, result is b; P1 = process P1; horizontal line =
timeline at P1.
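
The two interactions between P1 and P2 can be told apart mechanically. The sketch below is a minimal illustration, assuming each written value is unique so that a read can be matched to the write that produced it; the helper name `write_depends_on` is hypothetical.

```python
def write_depends_on(history, earlier_write, later_write):
    """Return True if `later_write` may causally depend on `earlier_write`.

    history: one process's operations in program order, each a
             (kind, item, value) tuple such as ("R", "x", "a").
    The dependency is assumed to exist when the process read the value
    produced by `earlier_write` before it issued `later_write`.
    """
    _, item, value = earlier_write
    read = ("R", item, value)
    if read not in history or later_write not in history:
        return False
    return history.index(read) < history.index(later_write)

w_x = ("W", "x", "a")  # written by P1
w_y = ("W", "y", "b")  # written by P2

p2_reads_first = [("R", "x", "a"), w_y]   # P2 reads x, then writes y
p2_writes_first = [w_y, ("R", "x", "a")]  # P2 writes y, then reads x

causal = write_depends_on(p2_reads_first, w_x, w_y)
concurrent = not write_depends_on(p2_writes_first, w_x, w_y)
```

In the first history the two writes must be seen in the same order everywhere; in the second they are concurrent and may be seen in either order.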
Data Centric consistency models:
4. FIFO (PRAM, Pipelined RAM) consistency:
Writes done by a single process are seen by all
other processes in the order in which they were
issued, but writes from different processes may
be seen in a different order by different
processes.
All writes generated by different processes are
treated as concurrent.
Data Centric consistency models:

A valid sequence of events for FIFO consistency


Data Centric consistency models:
P1 : W(x)1
P2 : R(x)1 W(x)2
P3 : R(x)2 R(x)1
P4 : R(x)1 R(x)2

Valid sequence of events for FIFO consistency


Ex: If w11 and w12 are two write operations
performed by process P1 in that order, and
w21 and w22 are two write operations
performed by process P2 in that order, then:

P3 can see them in the order
→ [(w11,w12), (w21,w22)] and
P4 can see them in the order
→ [(w21,w22), (w11,w12)]
Note:
In sequential consistency, all processes agree on
the same order of operations:
either [(w11,w12),(w21,w22)]
or [(w21,w22),(w11,w12)]
is acceptable, but not both.
In FIFO consistency, processes need not agree on the
same order of memory operations, so P3 may see the
first order while P4 sees the second.
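
The FIFO rule can be expressed as a small check on what one process observed. This is a sketch under the assumption that every write has a unique label; `respects_fifo` and the view names are illustrative.

```python
def respects_fifo(observed, issued):
    """Return True if `observed` preserves each process's own write order.

    observed: write labels in the order one reader saw them
    issued:   dict mapping a process name to its writes in program order
    """
    for writes in issued.values():
        # Writes of this process, in the order the reader saw them.
        seen = [w for w in observed if w in writes]
        # The same writes, in the order the process issued them.
        expected = [w for w in writes if w in seen]
        if seen != expected:
            return False
    return True

issued = {"P1": ["w11", "w12"], "P2": ["w21", "w22"]}

p3_view = ["w11", "w12", "w21", "w22"]
p4_view = ["w21", "w22", "w11", "w12"]
bad_view = ["w12", "w11", "w21", "w22"]  # P1's writes seen out of order
```

Both `p3_view` and `p4_view` pass the check even though they differ, which is allowed under FIFO consistency; sequential consistency would additionally require the two views to be identical.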
Data Centric consistency models:
5. Weak consistency
FIFO → all intermediate writes are propagated in
order to all copies.
Alternative → let a process finish its critical
section (its operations on a shared memory item)
and make sure that the final results are sent
everywhere, without worrying too much whether all
intermediate results have been propagated to
all copies in order.
• Use a synchronization variable (S) that has only a
single associated operation, called synchronize.
• The synchronize operation is used to
synchronize memory.
• When a process does a synchronize operation,
all writes done on that machine are propagated
outward (to other machines) and all writes done
on other machines are brought in.
• In other words, all of shared memory is
synchronized.
• In weak consistency, when a process performs
operations on shared data items, no
guarantees are given about when they will be
visible to other processes. Only when explicit
synchronization takes place are changes
propagated.
Data Centric consistency models:

a) A valid sequence of events for weak consistency.


b) An invalid sequence for weak consistency.
• In (a), P1 does two writes to a data item and then
synchronizes.
Since P2 and P3 have not yet synchronized, no
guarantees are given about what they see.

• In (b), P2 has synchronized, which means
its local copy of the data store is brought up to
date.
When P2 reads x, it must get the value b.
Getting a is not permitted under weak
consistency.
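
The behaviour described for (a) and (b) can be mimicked with a toy model. This is only a sketch: the class `Replica` and the shared dictionary standing in for "all other machines" are assumptions made for illustration, not part of the slides.

```python
class Replica:
    """One process's local copy under weak consistency (toy model)."""

    def __init__(self, shared):
        self.shared = shared  # stands in for the state held by other machines
        self.local = {}       # what this process currently sees
        self.pending = {}     # local writes not yet propagated

    def write(self, item, value):
        self.local[item] = value
        self.pending[item] = value

    def read(self, item):
        return self.local.get(item)

    def synchronize(self):
        self.shared.update(self.pending)  # propagate local writes outward
        self.pending.clear()
        self.local.update(self.shared)    # bring in writes done elsewhere

shared = {}
p1, p2 = Replica(shared), Replica(shared)

p1.write("x", "a")
p1.write("x", "b")
p1.synchronize()

before_sync = p2.read("x")  # P2 has not synchronized: no guarantee
p2.synchronize()
after_sync = p2.read("x")   # P2 has synchronized: must see b
```

Only the final value b is pushed out; the intermediate value a is never propagated, which is the saving weak consistency aims for.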
Data Centric consistency models:
6. Release consistency
• If it is possible to tell the difference between
entering a critical region and leaving it, a more
efficient implementation might be possible.
• To do that, two kinds of synchronization
operations are needed:
• acquire operation - to tell that a critical region
is being entered;
• release operation - to tell that a critical
region is being exited.
• A data store that offers release consistency
guarantees that when a process does an
acquire, the store will ensure that all the local
copies of the protected (shared) data are
brought up to date to be consistent with the
remote ones, if need be.
• When a release is done, protected data that
have been changed are propagated out to the
other local copies of the store.
Data Centric consistency models:

A valid event sequence for release consistency.

• P1 does an acquire and changes x twice, then does a release.

• P2 does an acquire and reads x. It is guaranteed to get b.

• P3 does not do an acquire before reading the shared data. Hence the
data store has no obligation to give it the current value of x, so
returning a is allowed.
Data Centric consistency models:
• Release consistency is also called eager release consistency
since -------→

When a release is done all the processes doing the release


pushes out all the modified data to all other processes that
have a copy of the data and thus might potentially need it.
There is no way to tell if they actually will need it, so all of
them get everything that has changed.

• Variation of eager release ---→ Lazy release


At the time of release nothing is sent anywhere. When an
acquire is done, the process trying to do an acquire has to get
the most recent values of the data from the process holding
them.
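
The difference between the two variants can be sketched by counting transfers. Treating each copy as a dictionary and counting one message per transfer is a simplification chosen for this illustration; the function names are hypothetical.

```python
def eager_release(modified, other_copies):
    """Eager: push the modified items to every other copy at release time."""
    for copy in other_copies:
        copy.update(modified)
    return len(other_copies)  # one message per copy

def lazy_acquire(local_copy, holder_copy):
    """Lazy: pull the latest values from the holder only when acquiring."""
    local_copy.update(holder_copy)
    return 1  # one request to the holder

writer = {"x": "b"}  # data modified inside the critical region

eager_copies = [{}, {}, {}]
eager_messages = eager_release(writer, eager_copies)

lazy_copies = [{}, {}, {}]
lazy_messages = lazy_acquire(lazy_copies[0], writer)  # only one process acquires
```

With eager release every copy is updated whether or not it is needed; with lazy release only the process that actually acquires pays for a transfer.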
Data Centric consistency models:

Eager Release consistency


Data Centric consistency models:

Lazy Release consistency


Data Centric consistency models:
7. Entry consistency
• Requires each shared data item to be associated with a
synchronization variable (lock).

• Synchronization variables are used as follows:

1. Each synchronization variable has a current owner, i.e. the
process that last acquired it.

2. The owner may enter and exit critical regions (CR)
repeatedly without having to send any messages on the
network.

3. A process not currently owning a synchronization
variable but wanting to acquire it has to send a message
to the current owner asking for ownership and the
current values of the data associated with that
synchronization variable.

4. It is also possible for several processes to
simultaneously own a synchronization variable in
nonexclusive mode, i.e. they can read but not write the
associated data.
A data store exhibits entry consistency if it meets all of
the following conditions:
• An acquire access of a synchronization variable is not
allowed to perform with respect to a process until all
updates to the guarded shared data have been
performed with respect to that process.
• Before an exclusive mode access to a synchronization
variable by a process is allowed to perform with respect
to that process, no other process may hold the
synchronization variable, not even in nonexclusive mode.
• After an exclusive mode access to a synchronization
variable has been performed, any other process's next
nonexclusive mode access to that synchronization
variable may not be performed until it has performed
with respect to that variable's owner.
Data Centric consistency models:

A valid event sequence for entry consistency.

• P1 does an acquire for x, changes x once, after which it
also does an acquire for y.
• P2 does an acquire for x but not for y, so it will read a
for x but may read NIL for y.
• P3 first does an acquire for y, hence it will read value b
for y.
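
The event sequence for P1, P2 and P3 can be replayed with a toy model in which each data item has its own lock. The class `EntryReplica` and the shared `guarded` dictionary are assumptions for illustration, and mutual exclusion between lock holders is deliberately not modelled.

```python
class EntryReplica:
    """One process's local copy under entry consistency (toy model)."""

    def __init__(self, guarded):
        self.guarded = guarded  # item -> latest value released under its lock
        self.local = {}

    def acquire(self, item):
        # Only the data guarded by this item's lock is brought up to date.
        if item in self.guarded:
            self.local[item] = self.guarded[item]

    def write(self, item, value):
        self.local[item] = value

    def read(self, item):
        return self.local.get(item)  # None plays the role of NIL

    def release(self, item):
        if item in self.local:
            self.guarded[item] = self.local[item]

guarded = {}
p1, p2, p3 = EntryReplica(guarded), EntryReplica(guarded), EntryReplica(guarded)

p1.acquire("x"); p1.write("x", "a")
p1.acquire("y"); p1.write("y", "b")
p1.release("x"); p1.release("y")

p2.acquire("x")
p2_x, p2_y = p2.read("x"), p2.read("y")  # y was never acquired by P2

p3.acquire("y")
p3_y = p3.read("y")
```

Because an acquire only fetches the data tied to that one lock, P2 sees the current x but has no guarantee about y.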
Summary of weak, release , entry
Summary of Consistency Models
a) Consistency models that do not use synchronization operations.
b) Models that do use synchronization operations. (These require additional
programming constructs, and allow programmers to treat the data store
as if it were sequentially consistent, when in fact it is not. They “should” also
offer the best performance.)

(a)
Consistency | Description
Strict | Absolute time ordering of all shared accesses matters.
Linearizability | All processes must see all shared accesses in the same order. Accesses are furthermore ordered according to a (nonunique) global timestamp.
Sequential | All processes see all shared accesses in the same order. Accesses are not ordered in time.
Causal | All processes see causally-related shared accesses in the same order.
FIFO | All processes see writes from each other in the order they were issued. Writes from different processes may not always be seen in that order.

(b)
Consistency | Description
Weak | Shared data can be counted on to be consistent only after a synchronization is done.
Release | Shared data are made consistent when a critical region is exited.
Entry | Shared data pertaining to a critical region are made consistent when a critical region is entered.
Client Centric Consistency Models
Difference between data-centric and client-centric
consistency
• With data-centric consistency, the objective is to specify
system-wide consistency on a set of shared data
items in the presence of concurrent read/write
operations.
• With client-centric consistency, consistency is defined
only in terms of how one specific client experiences
the effects of read/write operations when moving
between locations.
• Data-centric consistency:
Concurrent processes may be simultaneously
updating the data store, and hence it is necessary to
provide consistency in the face of such concurrency.
Characteristics of client-centric consistency:
• Lack of simultaneous updates
• Most operations involve reading data
Client Centric Consistency Models

Eventual consistency
The eventual consistency model states that when no updates
occur for a long period of time, all updates will eventually
propagate through the system and all the replicas will become
consistent.
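
One way to picture this convergence is a pairwise exchange between replicas that runs while no new updates arrive. The sketch assumes each value carries a timestamp and that the newest timestamp wins; that rule, and the names `exchange` and `converge`, are assumptions of this illustration rather than something the slides prescribe.

```python
def exchange(a, b):
    """Let two replicas swap updates; both keep the newest entry per item."""
    for item in set(a) | set(b):
        newest = max(a.get(item, (0, None)), b.get(item, (0, None)),
                     key=lambda entry: entry[0])  # compare timestamps only
        a[item] = newest
        b[item] = newest

def converge(replicas):
    """Repeat neighbour exchanges until every replica holds the same data."""
    rounds = 0
    while any(r != replicas[0] for r in replicas):
        for i in range(len(replicas) - 1):
            exchange(replicas[i], replicas[i + 1])
        rounds += 1
    return rounds

# Each entry is (timestamp, value); the third replica holds the newest write.
replicas = [{"x": (1, "a")}, {}, {"x": (2, "b")}]
rounds = converge(replicas)
```

Between rounds the replicas may disagree; the model only promises that they agree once updates stop and propagation has finished.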
• A mobile user accesses the database by connecting to
one of the replicas in a transparent way.
• Assume the user performs several update
operations and then disconnects again.
• Later the user accesses the database again,
possibly after moving to a different location or by
using a different access device. The user may be
connected to a different replica.
• What if the updates have not propagated? This could
be confusing to the user.
• Solution: use client-centric consistency.
• Client-centric consistency provides guarantees
for a single client concerning the consistency of
accesses to a data store by that client.
• No guarantees are given concerning concurrent
accesses by different clients.
Assumptions for client-centric consistency
• The data store is physically distributed across
multiple machines.
• When a process accesses the data store, it
generally connects to the nearest available copy.
• All read and write operations are performed on
the local copy. Updates are eventually
propagated to other copies.
Client-Centric Consistency

• xi[t]: the version of data item x at copy Li at time t

• WS(x[i,t]): the set of writes to the replica at
node i until time t

• WS(x[i,t], x[j,t’]): the writes in WS(x[i,t]) are
reflected in the version x[j,t’] at the later
time t’
Client Centric Consistency Models
1. Monotonic Reads
• If a process reads the value of a data item x, any successive
read operation on x by that process will always return that
same value or a more recent value.
• Example:
• Consider a distributed e-mail database.
• In such a database, each user’s mailbox may be distributed and
replicated across multiple machines.
• Mail can be inserted in a mailbox at any location.
• Updates are propagated in a lazy (i.e., on demand) fashion.
• Assume that reads don’t change the mailbox.
• Suppose a user reads their e-mail in Vancouver and then flies to
Toronto and reads their e-mail.
• A monotonic read guarantees that the messages that were in the
mailbox in Vancouver will also be in the mailbox in Toronto.
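
A client-side check for the Vancouver and Toronto example might look as follows. It is a minimal sketch that assumes every replica exposes a version counter for the mailbox; the counter and the `Session` class are hypothetical.

```python
class Session:
    """Tracks the newest version this client has read (toy model)."""

    def __init__(self):
        self.last_version = 0

    def read(self, replica):
        # Refuse a copy that is older than something already read.
        if replica["version"] < self.last_version:
            raise RuntimeError("replica is behind an earlier read")
        self.last_version = replica["version"]
        return replica["mailbox"]

vancouver = {"version": 3, "mailbox": ["m1", "m2", "m3"]}
toronto = {"version": 2, "mailbox": ["m1", "m2"]}

session = Session()
session.read(vancouver)

try:
    session.read(toronto)  # Toronto has not received m3 yet
    stale_read_blocked = False
except RuntimeError:
    stale_read_blocked = True

# Once the missing update has propagated, the read is allowed.
toronto = {"version": 3, "mailbox": ["m1", "m2", "m3"]}
after_sync = session.read(toronto)
```

A real store could wait for the update or redirect the read instead of raising an error; the point is only that an older copy is never shown after a newer one.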
Client Centric Consistency Models

a) A monotonic-read consistent data store


b) A data store that does not provide monotonic reads.
The read operations performed by a single process P at two
different local copies of the same data store.
(a)
• Process P first performs a read operation on x at
L1, returning the value of x1 (at that time).
• This value results from the write operations in
WS(x1) performed at L1.
• Later, P performs a read operation on x at L2,
shown as R(x2).
• All operations in WS(x1) have been propagated
to L2 before the second read operation takes
place.
(b)
• Situation in which monotonic-read consistency is
not guaranteed.
• After process P has read x1 at L1, it later
performs the operation R(x2) at L2.
• But only the write operations in WS(x2) have
been performed at L2.
• No guarantees are given that this set also
contains all operations contained in WS(x1).
Client Centric Consistency Models
2. Monotonic Writes
A write by a process to a replica of a data item at a node is
delayed until all previous writes by that process to the same
data item have been executed at that node.
Hence: a write operation on a copy of item x is
performed only if that copy has been brought up
to date by means of any preceding write
operation, which may have taken place on other
copies of x. If need be, the new write must wait
for old ones to finish.
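
The rule can be sketched by having the client carry its own write history and replay anything a copy is missing before the new write is applied. Modelling each copy as a list of applied writes, and the `Client` class itself, are assumptions made for this illustration.

```python
class Client:
    """Remembers this client's writes in issue order (toy model)."""

    def __init__(self):
        self.history = []

    def write(self, replica_log, operation):
        # Bring the copy up to date with this client's earlier writes first.
        for earlier in self.history:
            if earlier not in replica_log:
                replica_log.append(earlier)
        replica_log.append(operation)
        self.history.append(operation)

l1, l2 = [], []  # writes applied at local copies L1 and L2

p = Client()
p.write(l1, "W(x1)")  # first write, performed at L1
p.write(l2, "W(x2)")  # later write at L2: W(x1) is applied there first
```

After the second call, L2 holds W(x1) followed by W(x2), which is the order the process issued them in.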
• Example: maintaining versions of replicated files
in the correct order everywhere (propagate the
previous version to the server where the newest
version is installed).

a) A monotonic-write consistent data store.

b) A data store that does not provide monotonic-write
consistency.
(a)
• Process P performs a write operation on x at local
copy L1, presented as the operation W(x1).
• Later, P performs another write operation on x,
but this time at L2, shown as W(x2).
• To ensure monotonic-write consistency, the
previous write operation at L1 must have been
propagated to L2.
• This explains operation W(x1) at L2, and why it
takes place before W(x2).
(b)
• Situation in which monotonic-write consistency is
not guaranteed.
• Missing is the propagation of W(x1) to copy L2.
• No guarantees can be given that the copy of x on
which the second write is being performed has
the same or a more recent value than the one at
the time W(x1) completed at L1.
Client Centric Consistency Models
3. Read Your Writes
A write to a replica of a data item will be
reflected at any replica of that item that the same
process subsequently reads.
Hence: a write operation is always completed
before a successive read operation by the same
process, no matter where that read operation
takes place.
Example: updating your Web page and
guaranteeing that your Web browser shows the
newest version instead of its cached copy.
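
A session can enforce this by remembering its own writes and checking that a copy contains all of them before reading there. Representing each copy as a set of applied write labels is an assumption of this sketch, as is the `Session` class.

```python
class Session:
    """Remembers which writes this client has issued (toy model)."""

    def __init__(self):
        self.my_writes = set()

    def write(self, replica, write_id):
        replica.add(write_id)
        self.my_writes.add(write_id)

    def can_read(self, replica):
        """A read is safe only where all of this session's writes are present."""
        return self.my_writes <= replica

l1, l2 = set(), set()  # writes applied at local copies L1 and L2

session = Session()
session.write(l1, "W(x1)")

safe_at_l1 = session.can_read(l1)
safe_at_l2_before = session.can_read(l2)  # W(x1) has not reached L2

l2 |= l1  # the update propagates from L1 to L2
safe_at_l2_after = session.can_read(l2)
```

The check says nothing about writes made by other clients, which matches the client-centric scope of the guarantee.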
Client Centric Consistency Models

a) A data store that provides read-your-writes consistency.


b) A data store that does not.
(a)
• Process P performed a write operation W(x1) and later a read
operation at a different local copy.

• Read-your-writes consistency guarantees that the effects of the write
operation can be seen by the succeeding read operation.

• This is expressed by WS(x1;x2), which states that W(x1) is part of
WS(x2).
(b)
• W(x1) has been left out of WS(x2), meaning that the effects of the
previous write operation by process P have not been propagated to
L2.
4. Writes Follow Reads
A write operation by a process on a data item x following
a previous read operation on x by the same process is
guaranteed to take place on the same or a more recent
value of x than the one that was read.

a) A writes-follow-reads consistent data store


b) A data store that does not provide writes-follow-reads
consistency
(a)
A process reads x at local copy L1.
The write operations that led to the value just read also
appear in the write set at L2, where the same process later
performs a write operation.
(Note that other processes at L2 see those write operations
as well.)
(b)
No guarantees are given that the operation performed at L2
is performed on a copy that is consistent with the one
just read at L1.
