Fault Tolerant Message Passing Systems

Department of Computer Science and Engineering (RA)
COURSE NAME: PARALLEL & DISTRIBUTED COMPUTING
COURSE CODE: 22CS4106 R

TOPIC: FAULT-TOLERANT MESSAGE-PASSING SYSTEMS


DEPENDABILITY
Basics
A component provides services to clients. To provide these services, the component
may require services from other components ⇒ a component may depend on some
other component.

Specifically
A component C depends on C∗ if the correctness of C’s behavior depends
on the correctness of C∗’s behavior. (Components are processes or
channels.)

Requirements related to dependability

Requirement      Description
Availability     Readiness for usage
Reliability      Continuity of service delivery
Safety           Very low probability of catastrophes
Maintainability  How easily a failed system can be repaired

Basic concepts
RELIABILITY VERSUS AVAILABILITY

Reliability R(t) of component C

Conditional probability that C has been functioning correctly during [0, t) given
that C was functioning correctly at time t = 0.

Traditional metrics
• Mean Time To Failure (MTTF): The average time until a component fails.
• Mean Time To Repair (MTTR): The average time needed to repair a
component.
• Mean Time Between Failures (MTBF): Simply MTTF + MTTR.

Basic concepts
RELIABILITY VERSUS AVAILABILITY

Availability A(t) of component C

Average fraction of time that C has been up-and-running in interval [0, t).
• Long-term availability A: A(∞)
• Note: A = MTTF/MTBF = MTTF/(MTTF + MTTR)
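
As a quick worked example of these metrics (a minimal sketch in Python; the numbers are purely illustrative, not from the slides):

# Illustrative figures: a component that on average runs 1000 hours
# before failing and takes 2 hours to repair.
mttf = 1000.0          # Mean Time To Failure (hours)
mttr = 2.0             # Mean Time To Repair (hours)
mtbf = mttf + mttr     # Mean Time Between Failures = MTTF + MTTR

# Long-term availability: fraction of time the component is up.
availability = mttf / mtbf
print(f"MTBF = {mtbf} h, A = {availability:.4f}")  # prints A = 0.9980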

Observation
Reliability and availability make sense only if we have an accurate notion
of what a failure actually is.

Basic concepts
Terminology
Failure, error, fault
Term     Description                                          Example
Failure  A component is not living up to its specifications   Crashed program
Error    Part of a component that can lead to a failure       Programming bug
Fault    Cause of an error                                    Sloppy programmer

Basic concepts
Terminology
Handling faults

• Fault prevention: prevent the occurrence of a fault (example: don't hire sloppy
  programmers).
• Fault tolerance: build a component such that it can mask the occurrence of a fault
  (example: build each component by two independent programmers).
• Fault removal: reduce the presence, number, or seriousness of a fault (example:
  get rid of sloppy programmers).
• Fault forecasting: estimate the current presence, future incidence, and
  consequences of faults (example: estimate how a recruiter is doing when it comes
  to hiring sloppy programmers).

Basic concepts
Failure models
Types of failures

Type                        Description of server's behavior
Crash failure               Halts, but is working correctly until it halts
Omission failure            Fails to respond to incoming requests
  Receive omission          Fails to receive incoming messages
  Send omission             Fails to send messages
Timing failure              Response lies outside a specified time interval
Response failure            Response is incorrect
  Value failure             The value of the response is wrong
  State-transition failure  Deviates from the correct flow of control
Arbitrary failure           May produce arbitrary responses at arbitrary times
DEPENDABILITY VERSUS SECURITY

Omission versus commission


Arbitrary failures are sometimes qualified as malicious. It is better to make the following
distinction:
• Omission failures: a component fails to take an action that it should have taken
• Commission failure: a component takes an action that it should not have taken

Observation
Note that deliberate failures, be they omission or commission failures, are typically
security problems. Distinguishing between deliberate failures and unintentional
ones is, in general, impossible.
HALTING FAILURES

Scenario
C no longer perceives any activity from C∗ — a halting failure? Distinguishing between a crash
and an omission/timing failure may be impossible.

Asynchronous versus synchronous systems


• Asynchronous system: no assumptions about process execution speeds or message delivery
times → cannot reliably detect crash failures.
• Synchronous system: process execution speeds and message delivery times are bounded
→ we can reliably detect omission and timing failures.
• In practice we have partially synchronous systems: most of the time, we can assume the
system to be synchronous, yet there is no bound on the time that a system is asynchronous
→ can normally reliably detect crash failures.
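
A minimal sketch of timeout-based crash detection under partial synchrony (the ping protocol and the TIMEOUT value are illustrative assumptions, not part of the slides):

import socket

TIMEOUT = 2.0  # assumed bound on the round-trip time; only trustworthy
               # while the system is in a synchronous phase

def probably_crashed(addr):
    """Probe a process; no reply within TIMEOUT suggests a crash,
    but during an asynchronous phase it may just be slowness."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.settimeout(TIMEOUT)
    try:
        s.sendto(b"ping", addr)
        s.recvfrom(1024)          # wait for the peer's "pong"
        return False              # reply arrived: peer is alive
    except socket.timeout:
        return True               # no reply: crash, omission, or delay
    finally:
        s.close()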
Halting failures
Assumptions we can make

Halting type    Description
Fail-stop       Crash failures, but reliably detectable
Fail-noisy      Crash failures, eventually reliably detectable
Fail-silent     Omission or crash failures: clients cannot tell what went wrong
Fail-safe       Arbitrary, yet benign failures (i.e., they cannot do any harm)
Fail-arbitrary  Arbitrary, with malicious failures

Failure models
REDUNDANCY FOR FAILURE MASKING

Types of redundancy
• Information redundancy: Add extra bits to data units so that errors can be
recovered when bits are garbled.
• Time redundancy: Design a system such that an action can be performed again if
anything goes wrong (see the retry sketch below). Typically used when faults are
transient or intermittent.
• Physical redundancy: Add equipment or processes so that the system can tolerate
the failure of one or more components. This type is extensively used in distributed
systems.
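
Time redundancy often amounts to simply retrying the action; a minimal sketch (the attempt count and delay are illustrative):

import time

def with_retries(action, attempts=3, delay=0.1):
    """Time redundancy: re-execute an action whose faults are
    assumed to be transient or intermittent."""
    for i in range(attempts):
        try:
            return action()
        except Exception:
            if i == attempts - 1:
                raise             # fault persisted across all attempts
            time.sleep(delay)     # short pause, then perform it again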
PROCESS RESILIENCE
Basic idea
Protect against malfunctioning processes through process replication,
organizing multiple processes into a process group. Distinguish between
flat groups and hierarchical groups.

Resilience by process groups


GROUPS AND FAILURE MASKING
k-fault tolerant group
When a group can mask any k concurrent member failures (k is called the
degree of fault tolerance).

How large does a k-fault tolerant group need to be?

• With halting failures (crash/omission/timing failures): we need a total of
k + 1 members, as no member will produce an incorrect result, so the
result of one member is good enough.
• With arbitrary failures: we need 2k + 1 members so that the correct
result can be obtained through a majority vote (see the voting sketch below).

Important assumptions
• All members are identical
• All members process commands in the same order
Result: We can now be sure that all processes do exactly the same thing.
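
A sketch of the majority vote that makes 2k + 1 members sufficient under arbitrary failures: with at most k wrong replies, the k + 1 identical correct replies always win (illustrative code, not from the slides):

from collections import Counter

def vote(replies):
    """Majority vote over the replies of a (2k+1)-member group."""
    value, count = Counter(replies).most_common(1)[0]
    # A strict majority is required; otherwise too many members failed.
    assert count > len(replies) // 2, "no majority"
    return value

# k = 1: group of three, one faulty member replies 99.
print(vote([42, 42, 99]))   # -> 42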

Failure masking and replication


Flooding-based consensus
System model
• A process group P = {P1,..., Pn}
• Fail-stop failure semantics, i.e., with reliable failure detection
• A client contacts a Pi requesting it to execute a command

• Every Pi maintains a list of proposed commands

Basic algorithm (based on rounds)


1. In round r, Pi multicasts its known set of commands Ci^r to all others.
2. At the end of round r, each Pi merges all received commands into a new set Ci^(r+1).
3. The next command cmdi is selected through a globally shared, deterministic
function: cmdi ← select(Ci^(r+1)) — a sketch follows below.
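
A minimal sketch of one round at a process Pi (fail-stop semantics; taking the minimum is one possible globally shared, deterministic select function):

def flooding_round(my_commands, received_sets):
    """One round of flooding-based consensus at Pi.
    my_commands:   Ci^r, the commands Pi knows at the start of round r
    received_sets: the command sets multicast by the other processes
    Returns Ci^(r+1) and the tentatively selected next command."""
    merged = set(my_commands)
    for s in received_sets:
        merged |= s               # merge everything received in round r
    return merged, min(merged)    # select must be deterministic everywhere

c_next, cmd = flooding_round({"b"}, [{"a", "c"}, {"b", "d"}])
print(sorted(c_next), cmd)        # ['a', 'b', 'c', 'd'] a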

Consensus in faulty systems with crash failures


Flooding-based consensus: Example

Observations
• P2 received all proposed commands from all other processes ⇒ it can
make a decision.
• P3 may have detected that P1 crashed, but does not know if P2 received
anything, i.e., P3 cannot know whether it has the same information as P2 ⇒
it cannot make a decision (the same holds for P4).

Consensus in faulty systems with crash failures


RAFT
Developed for understandability
• Uses a fairly straightforward leader-election algorithm (see Chp. 5). The
current leader operates during the current term.
• Every server (typically one of five) keeps a log of operations, some of which
have been committed. A backup will not vote for a new leader if its
own log is more up to date.
• All committed operations have the same position in the log of
each respective server.
• The leader decides which pending operation is to be committed next ⇒
a primary-backup approach.

Consensus in faulty systems with crash failures


Fault tolerance

Raft
When submitting an operation
• A client submits a request for operation o.
• The leader appends the request ⟨o, t, n⟩ to its own log (registering
the current term t and the current length n of the log).
• The log is (conceptually) broadcast to the other servers.
• The others (conceptually) copy the log and acknowledge its receipt.
• When a majority of acknowledgments arrives, the leader commits o.

Note
In practice, only updates are broadcast. At the end, every server has the
same view and knows about the committed operations. Note that,
effectively, any conflicting information at the backups is overwritten.
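
A highly simplified sketch of the leader's side (real Raft also handles term changes, log matching, and retries; broadcast is an assumed helper that returns the number of backup acknowledgments):

def leader_submit(log, op, term, broadcast, n_servers):
    """Append <o, t, n> to the leader's log and commit once a
    majority of the n_servers have acknowledged the entry."""
    index = len(log)              # current length of the log
    entry = (op, term, index)
    log.append(entry)
    acks = 1 + broadcast(entry)   # the leader counts itself
    if acks > n_servers // 2:
        return index              # committed at this log position
    return None                   # not committed yet; keep replicating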

Consensus in faulty systems with crash failures


Fault tolerance

Raft: when a leader crashes

Crucial observations
• The new leader has the most committed operations in its log.
• Any missing commits will eventually be sent to the other backups.

Consensus in faulty systems with crash failures


DISTRIBUTED COMMIT PROTOCOLS

Problem
Have an operation performed by each member of a process group,
or by none at all.
• Reliable multicasting: a message is to be delivered to all recipients.
• Distributed transaction: each local transaction must succeed.
TWO-PHASE COMMIT PROTOCOL (2PC)
Essence
The client who initiated the computation acts as coordinator; the
processes required to commit are the participants.
• Phase 1a: Coordinator sends VOTE-REQUEST to participants (also called
a pre-write).
• Phase 1b: When a participant receives VOTE-REQUEST, it returns either
VOTE-COMMIT or VOTE-ABORT to the coordinator. If it sends VOTE-ABORT, it
aborts its local computation.
• Phase 2a: Coordinator collects all votes; if all are VOTE-COMMIT, it sends
GLOBAL-COMMIT to all participants, otherwise it sends GLOBAL-ABORT.
• Phase 2b: Each participant waits for GLOBAL-COMMIT or GLOBAL-ABORT
and handles it accordingly (a coordinator-side sketch follows below).
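
A sketch of the coordinator's side, happy path only (vote_request and the send method on participants are assumed helpers; crash handling is discussed on the following slides):

def run_2pc(participants, vote_request):
    """Coordinator side of two-phase commit.
    vote_request(p) is assumed to return "VOTE-COMMIT" or
    "VOTE-ABORT" from participant p."""
    votes = [vote_request(p) for p in participants]        # phase 1a/1b
    decision = ("GLOBAL-COMMIT"
                if all(v == "VOTE-COMMIT" for v in votes)
                else "GLOBAL-ABORT")
    for p in participants:                                 # phase 2a
        p.send(decision)
    return decision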
2PC – Finite state machines

(Figure: the finite state machines of the coordinator and of a participant.)
2PC – FAILING PARTICIPANT
Analysis: participant crashes in state S and recovers to S
• INIT: No problem: the participant was unaware of the protocol.
• READY: The participant is waiting to either commit or abort. After recovery,
it needs to know which state transition to make ⇒ log the
coordinator's decision.
• ABORT: Merely make the entry into the abort state idempotent, e.g., by
removing the workspace of results.
• COMMIT: Also make the entry into the commit state idempotent, e.g., by
copying the workspace to storage.

Observation
When distributed commit is required, having participants use temporary
workspaces to keep their results allows for simple recovery in the presence
of failures.
2PC – FAILING PARTICIPANT
Alternative
When recovery to the READY state is needed, check the state of the other
participants ⇒ no need to log the coordinator's decision.

Recovering participant P contacts another participant Q


State of Q Action by P
COMMIT Make transition to COMMIT
ABORT Make transition to ABORT
INIT Make transition to ABORT
READY Contact another participant

Result
If all participants are in the READY state, the protocol blocks. Apparently, the
coordinator has failed. Note: the protocol prescribes that we need the
decision from the coordinator.
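
The table above translates directly into a decision function for the recovering participant (a sketch; the state strings are illustrative):

def recovery_action(state_of_q):
    """Action of recovering participant P in READY, given the state
    reported by some other participant Q (see the table above)."""
    if state_of_q == "COMMIT":
        return "COMMIT"
    if state_of_q in ("ABORT", "INIT"):
        return "ABORT"            # Q aborted or never voted: abort safely
    return "ASK-ANOTHER"          # Q is READY too: contact another one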
2PC – FAILING COORDINATOR
Observation
The real problem is that the coordinator's final decision may not be
available for some time (or may actually be lost).

Alternative
Let a participant P in the READY state time out when it has not received the
coordinator's decision; P then tries to find out what the other participants
know (as discussed).

Observation
The essence of the problem is that a recovering participant cannot make a
local decision: it depends on other (possibly failed) processes.
REFERENCES FOR FURTHER LEARNING

Reference Books:

1. Barbara Chapman, Gabriele Jost, Ruud van der Pas, Using OpenMP: Portable Shared
Memory Parallel Programming, MIT Press, 2008.
2. Gadi Taubenfeld, Distributed Computing Pearls, Morgan & Claypool Publishers, 2018.
3. Andrew S. Tanenbaum, Maarten van Steen, Distributed Systems: Principles and
Paradigms, 4th Edition, Pearson.

THANK YOU

A. SANJEEV KUMAR – PARALLEL AND DISTRIBUTED COMPUTING
