
Lecture 7 PDC

This lecture discusses replication and transaction recovery in distributed systems. It explains that replication keeps multiple copies of data across different computers to improve availability, fault tolerance, and load balancing. There are two main types of replication: active replication where all replicas process the same requests, and passive replication where a primary replica handles requests and updates backup replicas. The lecture also discusses transaction recovery using the two-phase commit protocol, which has centralized, linear, and distributed implementations.

Uploaded by

Shahxaad Ahmad
Copyright
© All Rights Reserved

Lecture # 7

7th Semester – BS Computer Science

PARALLEL & DISTRIBUTED COMPUTING
IBRAR AHMAD, Ph.D.
University of Swat (Shangla Sub-Campus)
[email protected]
What is Replication in Distributed System?
■ In a distributed system, data is stored across different computers in a network. Therefore, we need to
make sure that the data is readily available to users. Availability of the data is an important factor,
often achieved through data replication. Replication is the practice of keeping several copies of data
in different places.

Why do we require replication?


■ Replicating nodes makes the system more stable. It is good to have replicas of a node in a
network for the following reasons:
• If a node stops working, the distributed network will still work fine because its replicas remain
available. Thus replication increases the fault tolerance of the system.
• It also helps in load sharing, where the load on a server is shared among different replicas.
• It enhances the availability of the data. If the replicas are created and data is stored near the
consumers, it is easier and faster to fetch the data.
Types of Replication
Ø Active Replication:
• The request of the client goes to all the replicas.
• Every replica must receive the client requests in the same order; otherwise the replicas
become inconsistent.
• Once the requests are delivered in the same order, no further coordination is needed, because
each copy processes the same requests in the same sequence.
• All replicas respond to the client’s request.
■ Advantages:
• It is simple. Every replica runs the same code.
• It is transparent to the client.
• If a node fails, the failure is easily handled by the other replicas of that node.
■ Disadvantages:
• It increases resource consumption. The greater the number of replicas, the greater the memory
needed.
• It increases processing time. Every change made on one replica must also be made on
all the others.
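The ordering requirement of active replication can be illustrated with a small sketch. This is only a toy model under stated assumptions: the `Replica` class, the `broadcast_in_order` helper, and the `set`/`add` operations are hypothetical names introduced for illustration, and the "network" is just a Python loop that delivers each request to every replica in one agreed order.

```python
class Replica:
    """A deterministic state machine holding a small key-value store (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.store = {}

    def apply(self, request):
        # Each request is a tuple (operation, key, value).
        op, key, value = request
        if op == "set":
            self.store[key] = value
        elif op == "add":
            self.store[key] = self.store.get(key, 0) + value
        return self.store[key]


def broadcast_in_order(replicas, requests):
    """Deliver every request to every replica, in the same order for all."""
    responses = []
    for request in requests:
        # All replicas process the request and all of them respond.
        responses.append([replica.apply(request) for replica in replicas])
    return responses


requests = [("set", "x", 1), ("add", "x", 5), ("set", "x", 10), ("add", "x", 2)]

replicas = [Replica(f"r{i}") for i in range(3)]
responses = broadcast_in_order(replicas, requests)

# A replica that sees the same requests in a different order diverges.
out_of_order = Replica("r-bad")
for request in reversed(requests):
    out_of_order.apply(request)

print([r.store for r in replicas])  # identical stores on every replica
print(out_of_order.store)           # a different final value for "x"
```

The three in-order replicas end with the same value for `x`, while the replica that applied the requests in reverse order ends with a different one, which is the inconsistency the slide warns about.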
Types of Replication
Ø Passive Replication:
• The client request goes to the primary replica, also called the main replica.
• There are more replicas that act as backup for the primary replica.
• The primary replica informs all the backup replicas about any modification it makes.
• The response is returned to the client by the primary replica.
• Periodically, the primary replica sends a signal (a heartbeat) to the backup replicas to let them
know that it is still working.
• If the primary replica fails, a backup replica becomes the new primary replica.
■ Advantages:
• Resource consumption is lower, because backup replicas do not process client requests; they only
take over when the primary replica fails.
• Processing time is also lower, because only the primary executes each request and the backups
merely receive the resulting modifications, unlike active replication.
■ Disadvantages:
• If the primary replica fails, responses are delayed until a backup takes over.
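The passive (primary-backup) scheme described above can be sketched as follows. This is a minimal single-process model, not a real implementation: `Node`, `PrimaryBackupGroup`, `heartbeat_ok`, and `failover` are hypothetical names, the heartbeat is reduced to reading a flag, and the choice of the first live backup as the new primary is an assumption.

```python
class Node:
    """One replica with a local key-value store (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.store = {}
        self.alive = True


class PrimaryBackupGroup:
    def __init__(self, names):
        self.nodes = [Node(n) for n in names]
        self.primary = self.nodes[0]

    def backups(self):
        # Every live node other than the current primary.
        return [n for n in self.nodes if n is not self.primary and n.alive]

    def write(self, key, value):
        if not self.primary.alive:
            raise ConnectionError("primary is down")
        self.primary.store[key] = value
        # The primary informs the backups about the modification.
        for backup in self.backups():
            backup.store[key] = value
        # Only the primary answers the client.
        return f"ok from {self.primary.name}"

    def heartbeat_ok(self):
        # Stand-in for the periodic signal the primary sends to its backups.
        return self.primary.alive

    def failover(self):
        # Assumption: promote the first live backup.
        candidates = self.backups()
        if not candidates:
            raise RuntimeError("no backup available")
        self.primary = candidates[0]
        return self.primary.name


group = PrimaryBackupGroup(["a", "b", "c"])
first_reply = group.write("x", 1)

group.primary.alive = False          # simulate a crash of the primary
if not group.heartbeat_ok():
    new_primary = group.failover()   # a backup becomes the primary

second_reply = group.write("y", 2)
print(first_reply, second_reply, group.primary.store)
```

Because the backups already hold the modifications forwarded by the old primary, the new primary can continue from the same state; the cost is the delay between the crash and the failover.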
Transaction Recovery in Distributed System
■ Transactions may be performed effectively using distributed transaction processing.
■ However, a transaction may fail for a variety of reasons. System failure, hardware failure,
network errors, inaccurate or invalid data, and application problems are all possible causes.
■ Transaction failures are impossible to avoid, so they must be handled by the distributed transaction
system. When errors arise, the system must be able to identify and correct them. This procedure is called
Transaction Recovery.
■ Let us consider the following scenario to see how a transaction failure may occur. Suppose we have
two parties, X and Y. X sends a message to Y and expects a response, but no response arrives.

– The following are some of the possible causes in this situation:

• The message from X was not delivered to Y due to a network problem.
• The reply sent by Y was not delivered to X.
• Y itself has failed.
• As a result, locating the source of a problem in a large communication network is extremely challenging.
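The difficulty can be made concrete with a tiny sketch. The function `send_and_wait` and its three flags are hypothetical and exist only for illustration; each flag switches on one of the causes listed above, and the return value is what the sender observes.

```python
def send_and_wait(request_lost=False, receiver_down=False, reply_lost=False):
    """Return the reply the sender observes, or None if it times out."""
    if request_lost:
        return None      # the message never reached the receiver
    if receiver_down:
        return None      # the receiver failed and cannot answer
    reply = "ack"
    if reply_lost:
        return None      # the receiver answered, but the reply was lost
    return reply


observations = [
    send_and_wait(request_lost=True),
    send_and_wait(receiver_down=True),
    send_and_wait(reply_lost=True),
]
print(observations)
```

All three causes produce the same observation at the sender (no reply), so the sender cannot tell from the timeout alone which failure occurred.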
Transaction Recovery in Distributed System
■ One of the best-known methods of Transaction Recovery is the “Two-Phase Commit Protocol” (2PC).

■ The Two-Phase Commit Protocol uses two types of nodes: the coordinator and the subordinates.
The coordinator’s process is linked to the user application, and communication channels are set up
between the coordinator and the subordinates.

■ The two-phase commit protocol has two phases, as the name implies. The first is the
PREPARE phase, in which the transaction’s coordinator sends a PREPARE message to the
subordinates and each subordinate replies with a vote. The second is the decision phase,
in which the coordinator sends a COMMIT message if all of the nodes can complete the
transaction, or an ABORT message if at least one subordinate node cannot. 2PC can be
implemented in three ways: Centralized 2PC, Linear 2PC, and Distributed 2PC.
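The two phases can be sketched in a few lines. This is a simplified, failure-free model under stated assumptions: the `Subordinate` class, the `two_phase_commit` function, the vote names `VOTE-COMMIT` / `VOTE-ABORT`, and the state names are all hypothetical, and messages are modeled as direct method calls rather than network traffic.

```python
class Subordinate:
    """A participant that votes on PREPARE and obeys the final decision."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "INITIAL"

    def on_prepare(self):
        # Phase 1: answer the coordinator's PREPARE message with a vote.
        if self.can_commit:
            self.state = "READY"
            return "VOTE-COMMIT"
        self.state = "ABORTED"
        return "VOTE-ABORT"

    def on_decision(self, decision):
        # Phase 2: apply the coordinator's decision.
        self.state = "COMMITTED" if decision == "COMMIT" else "ABORTED"


def two_phase_commit(subordinates):
    """Coordinator logic: collect all votes, then broadcast one decision."""
    # Phase 1 (PREPARE): ask every subordinate whether it can commit.
    votes = [s.on_prepare() for s in subordinates]

    # Phase 2 (decision): COMMIT only if every vote was positive.
    decision = "COMMIT" if all(v == "VOTE-COMMIT" for v in votes) else "ABORT"
    for s in subordinates:
        s.on_decision(decision)
    return decision


all_ready = [Subordinate("s1"), Subordinate("s2"), Subordinate("s3")]
one_fails = [Subordinate("s1"), Subordinate("s2", can_commit=False)]

print(two_phase_commit(all_ready))   # every subordinate voted to commit
print(two_phase_commit(one_fails))   # one negative vote aborts everyone
```

A single negative vote is enough to abort the transaction at every node, which is what keeps the outcome the same everywhere.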
Transaction Recovery in Distributed System
■ Centralized 2PC: In the centralized 2PC, communication takes place only between the coordinator and each
subordinate; no communication between subordinates is permitted. The coordinator is in charge of sending the
PREPARE message to the subordinates, and once all of the subordinates’ votes have been received and analyzed,
the coordinator decides whether to abort or commit.

■ Linear 2PC: In the linear 2PC, subordinates can communicate with one another. The sites are numbered 1 to N,
with site 1 being the coordinator. The PREPARE message is propagated sequentially from one site to the next,
so the transaction takes longer to complete than with the centralized or distributed approaches.
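The sequential propagation can be sketched as a chain of hops. Note the assumptions, which go beyond what the slide states: in this sketch the running vote travels forward from site 1 to site N, site N makes the final decision, and the decision then travels back along the chain to site 1. The function `linear_2pc` and the hop tuples are hypothetical names used only for illustration.

```python
def linear_2pc(can_commit):
    """Simulate linear 2PC; can_commit[i] says whether site i+1 can commit.

    Returns the decision and the list of hops (from_site, to_site, payload),
    with sites numbered from 1 and site 1 acting as the coordinator.
    """
    n = len(can_commit)
    hops = []

    # Forward pass: site 1 starts the chain; each site passes the running
    # vote to its successor, which then folds in its own vote.
    running_vote = "COMMIT" if can_commit[0] else "ABORT"
    for i in range(1, n):
        hops.append((i, i + 1, running_vote))
        if not can_commit[i]:
            running_vote = "ABORT"

    # Assumption: the last site decides and the decision travels back.
    decision = running_vote
    for i in range(n, 1, -1):
        hops.append((i, i - 1, decision))

    return decision, hops


decision, hops = linear_2pc([True, True, False, True])
print(decision, len(hops))
for hop in hops:
    print(hop)
```

With N sites this sketch needs 2(N - 1) hops, and none of them can overlap in time, which illustrates why the linear variant is the slowest of the three.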

■ Distributed 2PC: All of the nodes in a distributed 2PC communicate with one another. Unlike the other 2PC
techniques, this procedure does not require the second phase. The distributed 2PC starts when the coordinator
sends a PREPARE message to all participating nodes. When a participant receives the PREPARE message, it
sends its vote to all other participants, so each node can reach the decision on its own once it has collected
every vote. For this to work, each node must keep track of the participants of every transaction.
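A minimal sketch of the distributed variant shows why no separate decision message is needed. The function `distributed_2pc`, the vote names, and the `inbox` structure are hypothetical; the coordinator's PREPARE broadcast is not modeled beyond triggering the votes, and failures are ignored.

```python
def distributed_2pc(can_commit):
    """Simulate distributed 2PC; can_commit maps participant name -> bool.

    Returns the decision each participant reaches on its own.
    """
    participants = list(can_commit)

    # On receiving PREPARE, every participant forms its vote.
    votes = {
        name: "VOTE-COMMIT" if ok else "VOTE-ABORT"
        for name, ok in can_commit.items()
    }

    # Every participant sends its vote to all participants, so each
    # inbox ends up holding one vote per participant (its own included).
    inbox = {name: {} for name in participants}
    for sender in participants:
        for receiver in participants:
            inbox[receiver][sender] = votes[sender]

    # No second phase: each node decides locally from the votes it holds.
    decisions = {}
    for name in participants:
        unanimous = all(v == "VOTE-COMMIT" for v in inbox[name].values())
        decisions[name] = "COMMIT" if unanimous else "ABORT"
    return decisions


print(distributed_2pc({"a": True, "b": True, "c": True}))
print(distributed_2pc({"a": True, "b": False, "c": True}))
```

Since every node sees the same set of votes, all nodes arrive at the same decision without waiting for the coordinator, at the cost of many more messages than in the centralized variant.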
