Fault Tolerant Message Passing Systems
Engineering (RA)
COURSE NAME: PARALLEL & DISTRIBUTED COMPUTING
COURSE CODE: 22CS4106 R
Specifically
A component C depends on C∗ if the correctness of C’s behavior depends
on the correctness of C∗’s behavior. (Components are processes or
channels.)
Requirements related to dependability

Requirement       Description
Availability      Readiness for usage
Reliability       Continuity of service delivery
Safety            Very low probability of catastrophes
Maintainability   How easily a failed system can be repaired
Basic concepts
RELIABILITY VERSUS AVAILABILITY
Traditional metrics
• Mean Time To Failure (MTTF): The average time until a component fails.
• Mean Time To Repair (MTTR): The average time needed to repair a
component.
• Mean Time Between Failures (MTBF): Simply MTTF + MTTR.
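These metrics are commonly combined into the steady-state availability A = MTTF / (MTTF + MTTR); this formula is standard in the dependability literature, though not stated on the slide itself. A minimal sketch with hypothetical numbers:

```python
# Steady-state availability estimated from MTTF and MTTR.
# The example figures below are hypothetical, for illustration only.
def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Fraction of time the component is ready for use."""
    return mttf_hours / (mttf_hours + mttr_hours)

# A component that fails on average every 1000 hours and takes
# 2 hours to repair is available about 99.8% of the time.
print(f"{availability(1000, 2):.4f}")   # 0.9980
```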
Observation
Reliability and availability make sense only if we have an accurate notion
of what a failure actually is.
Terminology
Failure, error, fault
Term      Description                                           Example
Failure   A component is not living up to its specifications    Crashed program
Error     Part of a component that can lead to a failure        Programming bug
Fault     Cause of an error                                      Sloppy programmer
Handling faults
Failure models
Types of failures
Commonly distinguished: crash failures, omission failures (send or receive), timing failures, response failures, and arbitrary (Byzantine) failures.
Observation
Note that deliberate failures, be they omission or commission failures, are typically
security problems. Distinguishing between deliberate failures and unintentional
ones is, in general, impossible.
HALTING FAILURES
Scenario
C no longer perceives any activity from C∗ — is this a halting failure? Distinguishing between a crash and an omission/timing failure may be impossible.
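To see why, consider a timeout-based check (a hedged sketch; the threshold and names are assumptions, not from the slides): all it can report is "no activity observed within T seconds", which is equally consistent with a crash, an omission, or mere slowness.

```python
import time

SUSPECT_AFTER = 5.0  # seconds; an arbitrary, assumed threshold

def suspect(last_heard_at, now=None):
    """True if no activity from C* was observed within SUSPECT_AFTER seconds.

    This tells us only that nothing arrived: C* may have crashed, its
    messages may have been lost (omission), or it may simply be slow (timing).
    """
    if now is None:
        now = time.monotonic()
    return (now - last_heard_at) > SUSPECT_AFTER
```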
REDUNDANCY FOR FAILURE MASKING
Types of redundancy
• Information redundancy: Add extra bits to data units so that errors can be detected and recovered from when bits are garbled.
• Time redundancy: Design the system so that an action can be performed again if anything goes wrong. Typically used when faults are transient or intermittent (see the retry sketch after this list).
• Physical redundancy: Add extra equipment or processes so that the failure of one or more components can be tolerated. This type is extensively used in distributed systems.
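Time redundancy in its simplest form is a retry loop; a minimal sketch (function name, attempt count, and delay are assumptions), useful only when the fault is indeed transient or intermittent:

```python
import time

def with_retries(action, attempts=3, delay=0.1):
    """Time redundancy: repeat an action, hoping the fault was transient."""
    for i in range(attempts):
        try:
            return action()
        except Exception:
            if i == attempts - 1:
                raise          # fault apparently persists; give up
            time.sleep(delay)  # back off briefly before retrying

# Usage (hypothetical flaky operation):
# result = with_retries(lambda: read_sensor())
```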
PROCESS RESILIENCE
Basic idea
Protect against malfunctioning processes through process replication,
organizing multiple processes into a process group. Distinguish between
flat groups and hierarchical groups.
Important assumptions
• All members are identical
• All members process commands in the same order
Result: We can now be sure that all processes do exactly the same thing.
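These two assumptions amount to state-machine replication: deterministic replicas that apply the same commands in the same order end up in the same state. A minimal sketch (class and command names are illustrative, not from the slides):

```python
class Replica:
    """Deterministic replica: same commands in the same order -> same state."""
    def __init__(self):
        self.state = 0

    def apply(self, command: str, arg: int) -> None:
        if command == "add":
            self.state += arg
        elif command == "set":
            self.state = arg

log = [("add", 5), ("set", 2), ("add", 1)]   # agreed-upon command order
replicas = [Replica() for _ in range(3)]
for cmd, arg in log:
    for r in replicas:
        r.apply(cmd, arg)

assert all(r.state == 3 for r in replicas)   # all replicas agree
```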
Observations (flooding-based consensus, where P1 crashes while broadcasting its proposal)
• P2 received all proposed commands from all other processes ⇒ it can make a decision.
• P3 may have detected that P1 crashed, but does not know whether P2 received anything from P1, i.e., P3 cannot know whether it has the same information as P2 ⇒ it cannot make a decision (the same holds for P4).
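The accompanying figure is not reproduced here; the following is a rough sketch of one safe decision rule consistent with these observations (names are illustrative): a process decides only when it has heard from every process, since otherwise it cannot be sure it holds the same set of proposed commands as its peers.

```python
def can_decide(all_processes, received_from) -> bool:
    # Decide only when a message arrived from every process: only then are we
    # certain to hold the same proposals as everyone else (P2's situation).
    # P3 misses P1's message, so it must wait, exactly as in the observations.
    return set(all_processes) <= set(received_from)

print(can_decide({"P1", "P2", "P3", "P4"}, {"P1", "P2", "P3", "P4"}))  # True  (P2)
print(can_decide({"P1", "P2", "P3", "P4"}, {"P2", "P3", "P4"}))        # False (P3)
```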
Raft
When submitting an operation
• A client submits a request for operation o.
• The leader appends the request ⟨o, t, length(log)⟩ to its own log (registering the current term t and the current length of its log).
• The log is (conceptually) broadcast to the other servers.
• The others (conceptually) copy the log and acknowledge the
receipt.
• When a majority of acks arrives, the leader commits o.
Note
In practice, only updates are broadcast. At the end, every server has the same view and knows about the committed operations. Note that, effectively, any information at the backups is overwritten by the leader's log.
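A minimal leader-side sketch of the flow above, under strong simplifying assumptions (no elections, no heartbeats, no log repair, and synchronous calls standing in for RPCs); class and method names are illustrative and are not Raft's actual API:

```python
class Follower:
    def __init__(self):
        self.log = []
    def append_entries(self, leader_log):
        self.log = list(leader_log)   # backups simply adopt the leader's log
        return True                   # acknowledge receipt

class Leader:
    def __init__(self, followers, term):
        self.followers = followers
        self.term = term
        self.log = []                 # entries: (operation, term, index)
        self.commit_index = -1

    def submit(self, operation):
        entry = (operation, self.term, len(self.log))
        self.log.append(entry)
        # "Broadcast" the log; count the leader itself plus acknowledging followers.
        acks = 1 + sum(1 for f in self.followers if f.append_entries(self.log))
        if acks > (len(self.followers) + 1) // 2:   # majority of all servers
            self.commit_index = entry[2]            # commit the operation
            return True
        return False

# Usage: one leader, two followers -> any submitted operation commits.
leader = Leader([Follower(), Follower()], term=1)
print(leader.submit("x := 5"))   # True
```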
Crucial observations
• The new leader has the most committed operations in its log.
• Any missing commits will eventually be sent to the other backups.
Problem
Have an operation performed by either every member of a process group or by none at all.
• Reliable multicasting: a message is to be delivered to all recipients.
• Distributed transaction: each local transaction must succeed.
TWO-PHASE COMMIT PROTOCOL (2PC)
Essence
The client that initiated the computation acts as the coordinator; the processes required to commit are the participants.
• Phase 1a: The coordinator sends VOTE-REQUEST to the participants (also called a pre-write).
• Phase 1b: When a participant receives VOTE-REQUEST, it returns either VOTE-COMMIT or VOTE-ABORT to the coordinator. If it sends VOTE-ABORT, it aborts its local computation.
• Phase 2a: The coordinator collects all votes; if all are VOTE-COMMIT, it sends GLOBAL-COMMIT to all participants, otherwise it sends GLOBAL-ABORT.
• Phase 2b: Each participant waits for GLOBAL-COMMIT or GLOBAL-ABORT and handles it accordingly.
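A compact, hedged sketch of the two phases above, with synchronous calls standing in for messages and no failure handling (that is what the following slides address); all names are illustrative:

```python
VOTE_COMMIT, VOTE_ABORT = "VOTE-COMMIT", "VOTE-ABORT"
GLOBAL_COMMIT, GLOBAL_ABORT = "GLOBAL-COMMIT", "GLOBAL-ABORT"

class Participant:
    def __init__(self, can_commit=True):
        self.can_commit = can_commit
        self.decision = None

    def vote(self):                        # Phase 1b: answer the VOTE-REQUEST
        return VOTE_COMMIT if self.can_commit else VOTE_ABORT

    def global_decision(self, decision):   # Phase 2b: act on the outcome
        self.decision = decision

def two_phase_commit(participants):
    # Phase 1a: send VOTE-REQUEST; Phase 2a: collect votes and decide.
    votes = [p.vote() for p in participants]
    decision = GLOBAL_COMMIT if all(v == VOTE_COMMIT for v in votes) else GLOBAL_ABORT
    for p in participants:
        p.global_decision(decision)
    return decision

# Usage: one participant votes abort -> everyone aborts.
print(two_phase_commit([Participant(), Participant(can_commit=False)]))  # GLOBAL-ABORT
```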
2PC – Finite state machines
(Figure: coordinator and participant finite state machines)
2PC – FAILING PARTICIPANT
Analysis: participant crashes in state S and recovers to S
• INIT: No problem: the participant was unaware of the protocol.
• READY: The participant is waiting to either commit or abort. After recovery, it needs to know which state transition to make ⇒ log the coordinator's decision.
• ABORT: Merely make the transition into the abort state idempotent, e.g., by removing the workspace of results.
• COMMIT: Also make the transition into the commit state idempotent, e.g., by copying the workspace to permanent storage.
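A hedged sketch of the per-state recovery actions listed above, assuming the participant wrote its protocol state (and, for READY, the coordinator's decision) to a log on stable storage; names are illustrative:

```python
def recover(logged_state, logged_decision=None):
    """Recovery action per logged 2PC participant state (illustrative)."""
    if logged_state == "INIT":
        return "nothing to do: the protocol had not started for this participant"
    if logged_state == "READY":
        # Replay the coordinator's logged decision (or ask the other
        # participants, as in the alternative on the next slide).
        return f"apply logged decision: {logged_decision}"
    if logged_state == "ABORT":
        return "remove temporary workspace (idempotent)"
    if logged_state == "COMMIT":
        return "copy workspace to permanent storage (idempotent)"

print(recover("READY", "GLOBAL-COMMIT"))
```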
Observation
When distributed commit is required, having participants use temporary
workspaces to keep their results allows for simple recovery in the presence
of failures.
2PC – FAILING PARTICIPANT
Alternative
When recovery to the READY state is needed, check the state of the other participants ⇒ no need to log the coordinator's decision (see the sketch below).
Result
If all participants are in the READY state, the protocol blocks: apparently the coordinator has failed. Note that the protocol prescribes that we need the coordinator's decision.
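A hedged sketch of this cooperative check (peer states and names are illustrative): a recovered participant in READY asks the others; any peer that already saw GLOBAL-COMMIT or GLOBAL-ABORT, or that is still in INIT (and so cannot have voted commit), settles the outcome, but if every peer is in READY the protocol blocks.

```python
def decide_from_peers(peer_states):
    """Try to terminate 2PC from the states reported by other participants."""
    if "COMMIT" in peer_states:
        return "COMMIT"   # someone already received GLOBAL-COMMIT
    if "ABORT" in peer_states or "INIT" in peer_states:
        return "ABORT"    # the decision was, or can safely be, abort
    return None           # all READY: block and wait for the coordinator

print(decide_from_peers(["READY", "READY", "COMMIT"]))  # COMMIT
print(decide_from_peers(["READY", "READY"]))            # None -> blocked
```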
2PC – FAILING COORDINATOR
Observation
The real problem is that the coordinator's final decision may not be available for some time (or may even be lost).
Alternative
Let a participant P in the READY state time out when it has not received the coordinator's decision; P then tries to find out what the other participants know (as discussed).
Observation
The essence of the problem is that a recovering participant cannot make a local decision: it depends on other (possibly failed) processes.
THANK YOU