Bayes' Theorem is a mathematical formula that helps determine the conditional probability of an event based on prior knowledge and new evidence.
It adjusts probabilities when new information comes in and helps make better decisions in uncertain situations.
For example, suppose a pet is either a cat or a dog, and we learn that it is quiet (new information). Bayes' Theorem lets us calculate the updated probability that the pet is a cat or a dog, based on how likely each animal is to be quiet.
Bayes' Theorem and Conditional Probability
Bayes' theorem (also known as Bayes' Rule or Bayes' Law) is used to determine the conditional probability of event A given that event B has already occurred.
The general statement of Bayes' theorem is: "The conditional probability of an event A, given the occurrence of another event B, is equal to the product of the probability of B given A and the probability of A, divided by the probability of event B", i.e. P(A|B) = P(B|A) · P(A) / P(B).
For example, if we want to find the probability that a white marble drawn at random came from the first bag, given that a white marble has already been drawn, and there are three bags each containing some white and black marbles, then we can use Bayes’ Theorem.
For any two events A and B, the formula for Bayes' theorem is given by:
P(A \mid B) = \frac{P(B \mid A) \cdot P(A)}{P(B)}
Where,
- P(A) and P(B) are the probabilities of events A and B, with P(B) ≠ 0,
- P(A|B) is the probability of event A when event B happens,
- P(B|A) is the probability of event B when A happens.
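The formula can be sketched in Python as a small illustration. The function name `bayes` and all the numbers in the pet example are assumptions chosen for illustration; they do not come from the text above.

```python
def bayes(p_b_given_a: float, p_a: float, p_b: float) -> float:
    """Return P(A|B) = P(B|A) * P(A) / P(B)."""
    if p_b <= 0:
        raise ValueError("P(B) must be greater than zero")
    return p_b_given_a * p_a / p_b


# Hypothetical numbers: half of all pets are cats, 80% of cats are quiet,
# and 30% of dogs are quiet.
p_cat = 0.5
p_quiet_given_cat = 0.8
p_quiet_given_dog = 0.3

# P(quiet), summed over the two kinds of pet
p_quiet = p_quiet_given_cat * p_cat + p_quiet_given_dog * (1 - p_cat)

# Updated probability that a quiet pet is a cat
p_cat_given_quiet = bayes(p_quiet_given_cat, p_cat, p_quiet)
print(round(p_cat_given_quiet, 4))
```

With these assumed numbers, learning that the pet is quiet raises the probability of a cat from 0.5 to about 0.73.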
Bayes Theorem Statement
Bayes' Theorem for n events is defined as follows.
Let E1, E2, …, En be a set of events associated with the sample space S, each with a non-zero probability of occurrence, such that E1, E2, …, En form a partition of S. Let A be any event of S with P(A) > 0. Then, according to Bayes' theorem,
P(E_i \mid A) = \frac{P(E_i) \cdot P(A \mid E_i)}{\sum_{k=1}^{n} P(E_k) \cdot P(A \mid E_k)}
for i = 1, 2, 3, …, n
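A minimal Python sketch of this n-event form is given below. The function name `posteriors` and the three-bag numbers are assumptions made for illustration only.

```python
from typing import List, Sequence


def posteriors(priors: Sequence[float], likelihoods: Sequence[float]) -> List[float]:
    """Return [P(E_i | A)] from priors P(E_i) and likelihoods P(A | E_i)."""
    if len(priors) != len(likelihoods):
        raise ValueError("need one likelihood per event E_i")
    # Numerators P(E_i) * P(A | E_i)
    joint = [p * l for p, l in zip(priors, likelihoods)]
    # Denominator: sum over k of P(E_k) * P(A | E_k)
    evidence = sum(joint)
    if evidence == 0:
        raise ValueError("P(A) must be greater than zero")
    return [j / evidence for j in joint]


# Hypothetical example: three equally likely bags whose chances of
# yielding a white marble are 1/2, 1/4 and 3/4.
result = posteriors([1 / 3, 1 / 3, 1 / 3], [0.5, 0.25, 0.75])
print([round(r, 4) for r in result])
```

The returned values always sum to 1, because the denominator is the sum of all the numerators.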
Bayes Theorem Derivation
The proof of Bayes' Theorem is as follows. According to the conditional probability formula,
P(E_i \mid A) = \frac{P(E_i \cap A)}{P(A)} \quad \text{(i)}
Then, by using the multiplication rule of probability, we get
P(E_i \cap A) = P(E_i) \cdot P(A \mid E_i) \quad \text{(ii)}
Now, by the total probability theorem,
P(A) = \sum_{k=1}^{n} P(E_k) \cdot P(A \mid E_k) \quad \text{(iii)}
Substituting the values of P(Ei ∩ A) and P(A) from eq. (ii) and eq. (iii) into eq. (i), we get,
P(E_i \mid A) = \frac{P(E_i) \cdot P(A \mid E_i)}{\sum_{k=1}^{n} P(E_k) \cdot P(A \mid E_k)}
Bayes' theorem is also known as the formula for the probability of "causes". Since the Ei form a partition of the sample space S, exactly one of the events Ei occurs in any trial. Thus, the Bayes theorem formula gives the probability of a particular Ei, given that event A has occurred.
After learning about Bayes theorem in detail, let us understand some important terms related to the concepts we covered in the formula and derivation.
Hypotheses
- Hypotheses are the possible events or outcomes in the sample space. They are denoted as E1, E2, …, En.
- Each hypothesis represents a distinct scenario that could explain an observed event.
Prior Probability
- The prior probability P(Ei) is the initial probability of an event occurring before any new data is taken into account.
- It reflects existing knowledge or assumptions about the event.
- Example: The probability of a person having a disease before taking a test.
Posterior Probability
- The posterior probability P(Ei|A) is the updated probability of an event after considering new information.
- It is derived using the Bayes Theorem.
- Example: The probability of having a disease given a positive test result.
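The disease-test example used for the prior and posterior can be worked through as a sketch. The prevalence, sensitivity and false positive rate below are hypothetical values chosen for illustration; they are not taken from the text.

```python
from fractions import Fraction

# Hypothetical numbers, for illustration only
prevalence = Fraction(1, 100)          # prior P(disease)
sensitivity = Fraction(95, 100)        # P(positive | disease)
false_positive_rate = Fraction(5, 100) # P(positive | no disease)

# P(positive), summed over the two hypotheses
p_positive = sensitivity * prevalence + false_positive_rate * (1 - prevalence)

# Posterior P(disease | positive) by Bayes' theorem
posterior = sensitivity * prevalence / p_positive
print(posterior, float(posterior))  # 19/118, about 0.161
```

Under these assumed numbers the posterior is far below the test's sensitivity, because the prior (1%) is small.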
Conditional Probability
- The probability of an event A given the occurrence of another event B is termed conditional probability.
- It is denoted as P(A|B) and represents the probability of A when event B has already happened.
Joint Probability
- The probability of two or more events occurring together is called joint probability.
- For two events A and B, the joint probability is denoted as P(A∩B).
Random Variables
- Real-valued variables whose values are determined by the outcomes of random experiments are called random variables.
- The probabilities with which a random variable takes its possible values are described by its probability distribution.
Bayes Theorem Applications
Bayesian inference, which is derived directly from Bayes' theorem, is applied in many fields, including medicine, science, philosophy, engineering, sports, and law.
Some of the key applications are:
- Medical Testing → Finding the real probability of having a disease after a positive test.
- Spam Filters → Checking if an email is spam based on keywords.
- Weather Prediction → Updating the chance of rain based on new data.
- AI & Machine Learning → Used in Naïve Bayes classifiers to predict outcomes.
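As a rough illustration of the spam filter and Naïve Bayes items, the sketch below scores an email from the keywords it contains. All probabilities and the names `P_SPAM`, `WORD_GIVEN_SPAM`, `WORD_GIVEN_HAM` and `spam_posterior` are hypothetical. It uses only the words present in the email and treats them as independent given the class, which is the simplifying "naïve" assumption.

```python
from math import prod
from typing import Iterable

# Hypothetical prior and per-word likelihoods, for illustration only
P_SPAM = 0.4
WORD_GIVEN_SPAM = {"free": 0.6, "meeting": 0.05}
WORD_GIVEN_HAM = {"free": 0.1, "meeting": 0.3}


def spam_posterior(words: Iterable[str]) -> float:
    """Return P(spam | words), treating words as independent given the class."""
    known = [w for w in words if w in WORD_GIVEN_SPAM]
    # Prior times the product of the likelihoods, for each class
    spam_score = P_SPAM * prod(WORD_GIVEN_SPAM[w] for w in known)
    ham_score = (1 - P_SPAM) * prod(WORD_GIVEN_HAM[w] for w in known)
    # Normalise so the two posteriors sum to 1
    return spam_score / (spam_score + ham_score)


print(round(spam_posterior(["free"]), 4))
print(round(spam_posterior(["free", "meeting"]), 4))
```

With these assumed numbers, the word "free" alone pushes the spam probability up from the prior, while adding "meeting" pulls it back down.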
Difference Between Conditional Probability and Bayes Theorem
The difference between Conditional Probability and Bayes' Theorem can be understood with the help of the table given below.
Bayes Theorem | Conditional Probability |
---|---|
Bayes's Theorem is derived using the definition of conditional probability. It is used to find the reverse probability. | Conditional Probability is the probability of event A when event B has already occurred. |
Formula: P(A|B) = [P(B|A)P(A)] / P(B) | Formula: P(A|B) = P(A∩B) / P(B) |
Purpose: To update the probability of an event based on new evidence. | Purpose: To find the probability of one event based on the occurrence of another. |
Focus: Uses prior knowledge and evidence to compute a revised probability. | Focus: Direct relationship between two events. |
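The two formulas in the table give the same value when applied to the same pair of events. The sketch below checks this on one roll of a fair die; the choice of events (A = "even", B = "greater than 3") and the helper `prob` are assumptions made for illustration.

```python
from fractions import Fraction

outcomes = range(1, 7)                    # one roll of a fair die
A = {x for x in outcomes if x % 2 == 0}   # even: {2, 4, 6}
B = {x for x in outcomes if x > 3}        # greater than 3: {4, 5, 6}


def prob(event: set) -> Fraction:
    """Probability of an event when all six faces are equally likely."""
    return Fraction(len(event), 6)


# Conditional probability by definition: P(A|B) = P(A∩B) / P(B)
direct = prob(A & B) / prob(B)

# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
p_b_given_a = prob(A & B) / prob(A)
via_bayes = p_b_given_a * prob(A) / prob(B)

print(direct, via_bayes)  # 2/3 2/3
```

The definition needs the joint probability P(A∩B), while Bayes' theorem needs the reverse conditional P(B|A); which one is easier to use depends on what is known.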
Theorem of Total Probability
Let E1, E2, …, En be mutually exclusive and exhaustive events of a sample space S, each with non-zero probability, and let E be any event that occurs together with some Ei. Then, prove that:
P(E) = \sum_{i=1}^{n} P(E/E_i) \cdot P(E_i)
Proof:
Let S be the sample space.
Since the events E1, E2,…,En are mutually exclusive and exhaustive, we have:
S = E1 ∪ E2 ∪ E3 ∪ . . . ∪ En and Ei ∩ Ej = ∅ for i ≠ j.
Now, consider the event E: E = E ∩ S
Substituting S with the union of Ei's:
⇒ E = E ∩ (E1 ∪ E2 ∪ E3 ∪ . . . ∪ En)
Using distributive law:
⇒ E = (E ∩ E1) ∪ (E ∩ E2) ∪ . . . ∪ (E ∩ En)
Since the events Ei are mutually exclusive, the intersections E∩Ei are also mutually exclusive. Therefore:
P(E) = P{(E ∩ E1) ∪ (E ∩ E2)∪ . . . ∪(E ∩ En)}
⇒ P(E) = P(E ∩ E1) + P(E ∩ E2) + . . . + P(E ∩ En)
[since (E ∩ E1), (E ∩ E2), . . . , (E ∩ En) are pairwise disjoint]
⇒ P(E) = P(E/E1) . P(E1) + P(E/E2) . P(E2) + . . . + P(E/En) . P(En) [by multiplication theorem]
⇒ P(E) = \sum_{i=1}^{n} P(E/E_i) \cdot P(E_i)
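The result can be sketched as a small Python helper. The function name `total_probability` and the three-event numbers are hypothetical; the check that the P(Ei) sum to 1 reflects the requirement that the events be mutually exclusive and exhaustive.

```python
from math import isclose
from typing import Sequence


def total_probability(priors: Sequence[float], conditionals: Sequence[float]) -> float:
    """Return P(E) = sum over i of P(E/Ei) * P(Ei) for a partition E1..En."""
    if len(priors) != len(conditionals):
        raise ValueError("need one conditional probability per event Ei")
    # Mutually exclusive and exhaustive events have probabilities summing to 1
    if not isclose(sum(priors), 1.0):
        raise ValueError("P(Ei) must sum to 1 for a partition")
    return sum(c * p for p, c in zip(priors, conditionals))


# Hypothetical partition with three events
p_e = total_probability([0.2, 0.3, 0.5], [0.1, 0.4, 0.6])
print(round(p_e, 4))  # 0.44
```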
Solved Examples of Bayes's Theorem
Example 1: A person has undertaken a job. The probabilities of completion of the job on time with and without rain are 0.44 and 0.95, respectively. If the probability that it will rain is 0.45, then determine the probability that the job will be completed on time.
Solution:
Let E be the event that the job will be completed on time, A be the event that it rains, and B be the event that it does not rain. We have,
P(A) = 0.45,
P(B) = P(no rain) = 1 − P(A) = 1 − 0.45 = 0.55
From the given data,
P(E|A) = 0.44, and P(E|B) = 0.95
Since events A and B form a partition of the sample space S, by the total probability theorem, we have
P(E) = P(A) P(E|A) + P(B) P(E|B)
⇒ P(E) = 0.45 × 0.44 + 0.55 × 0.95
⇒ P(E) = 0.198 + 0.5225 = 0.7205
So, the probability that the job will be completed on time is 0.7205
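The arithmetic can be checked with a few lines of Python. The variable names are illustrative. The last step, which is not asked for in the example, applies Bayes' theorem to the reverse question: how likely it is that it rained, given that the job was completed on time.

```python
p_rain = 0.45
p_no_rain = 1 - p_rain
p_on_time_given_rain = 0.44
p_on_time_given_no_rain = 0.95

# Total probability of completing the job on time
p_on_time = p_rain * p_on_time_given_rain + p_no_rain * p_on_time_given_no_rain
print(round(p_on_time, 4))  # 0.7205

# Reverse question (an extension of the example): P(rain | on time)
p_rain_given_on_time = p_rain * p_on_time_given_rain / p_on_time
print(round(p_rain_given_on_time, 4))
```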
Example 2: There are three urns containing 3 white and 2 black balls, 2 white and 3 black balls, and 1 black and 4 white balls, respectively. Each urn is equally likely to be chosen. One urn is chosen and a ball is drawn from it at random. What is the probability that a white ball will be drawn?
Solution:
Let E1, E2, and E3 be the events of choosing the first, second, and third urn respectively. Then,
P(E1) = P(E2) = P(E3) = 1/3
Let E be the event that a white ball is drawn. Then,
P(E/E1) = 3/5, P(E/E2) = 2/5, P(E/E3) = 4/5
By theorem of total probability, we have
P(E) = P(E/E1) . P(E1) + P(E/E2) . P(E2) + P(E/E3) . P(E3)
⇒ P(E) = (3/5 × 1/3) + (2/5 × 1/3) + (4/5 × 1/3)
⇒ P(E) = 9/15 = 3/5
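As a sanity check, the same probability can be estimated by simulation. This is a sketch rather than part of the solution; the function name `estimate_white`, the number of trials and the seed are arbitrary choices.

```python
import random

# Urn contents from the example: W = white ball, B = black ball
URNS = [
    ["W"] * 3 + ["B"] * 2,
    ["W"] * 2 + ["B"] * 3,
    ["W"] * 4 + ["B"] * 1,
]


def estimate_white(trials: int, seed: int = 0) -> float:
    """Estimate P(white) by repeatedly picking an urn, then a ball, at random."""
    rng = random.Random(seed)
    white = 0
    for _ in range(trials):
        urn = rng.choice(URNS)        # each urn is equally likely
        if rng.choice(urn) == "W":    # each ball in the urn is equally likely
            white += 1
    return white / trials


estimate = estimate_white(100_000)
print(estimate)  # should be close to 3/5
```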
Example 3: A card from a pack of 52 cards is lost. From the remaining cards of the pack, two cards are drawn and are found to be both hearts. Find the probability of the lost card being a heart.
Solution:
Let E1, E2, E3, and E4 be the events of losing a card of hearts, clubs, spades, and diamonds respectively.
Then P(E1) = P(E2) = P(E3) = P(E4) = 13/52 = 1/4.
Let E be the event of drawing 2 hearts from the remaining 51 cards. Then,
P(E|E1) = probability of drawing 2 hearts, given that a card of hearts is missing
⇒ P(E|E1) = 12C2 / 51C2 = (12 × 11)/2! × 2!/(51 × 50) = 22/425
P(E|E2) = probability of drawing 2 hearts, given that a card of clubs is missing
⇒ P(E|E2) = 13C2 / 51C2 = (13 × 12)/2! × 2!/(51 × 50) = 26/425
P(E|E3) = probability of drawing 2 hearts, given that a card of spades is missing
⇒ P(E|E3) = 13C2 / 51C2 = 26/425
P(E|E4) = probability of drawing 2 hearts, given that a card of diamonds is missing
⇒ P(E|E4) = 13C2 / 51C2 = 26/425
Therefore,
P(E1|E) = probability that the lost card is a heart, given that 2 hearts are drawn from the remaining 51 cards
⇒ P(E1|E) = {P(E1) . P(E|E1)} / {P(E1) . P(E|E1) + P(E2) . P(E|E2) + P(E3) . P(E|E3) + P(E4) . P(E|E4)}
⇒ P(E1|E) = (1/4 × 22/425) / {(1/4 × 22/425) + (1/4 × 26/425) + (1/4 × 26/425) + (1/4 × 26/425)}
⇒ P(E1|E) = 22/100 = 0.22
Hence, the required probability is 0.22.
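The calculation can be reproduced exactly with Python's `fractions.Fraction` and `math.comb`. The variable names are illustrative.

```python
from fractions import Fraction
from math import comb

prior = Fraction(1, 4)     # each suit is equally likely to be the lost one
total_pairs = comb(51, 2)  # ways to draw 2 cards from the remaining 51

# P(2 hearts | lost card is a heart): only 12 hearts remain
like_heart_lost = Fraction(comb(12, 2), total_pairs)
# P(2 hearts | lost card is a club, spade or diamond): all 13 hearts remain
like_other_lost = Fraction(comb(13, 2), total_pairs)

# Denominator of Bayes' theorem: sum over the four suits
evidence = prior * like_heart_lost + 3 * prior * like_other_lost
posterior = prior * like_heart_lost / evidence
print(posterior)  # 11/50
```

The result 11/50 equals 0.22, matching the value obtained above.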
Example 4: Suppose 15 men out of 300 men and 25 women out of 1000 women are good orators. A good orator is chosen at random from this group. Find the probability that the person selected is a man.
Solution:
Given,
- Total Men = 300
- Total Women = 1000
- Good Orators among Men = 15
- Good Orators among Women = 25
Total number of good orators = 15 (from men) + 25 (from women) = 40
Probability of selecting a male orator:
P(Male Orator) = Number of male orators / Total number of orators = 15/40 = 3/8
Example 5: A man is known to lie 1 out of 4 times. He throws a die and reports that it is a six. Find the probability that it is actually a six.
Solution:
In a throw of a die, let
E1 = event of getting a six,
E2 = event of not getting a six and
E = event that the man reports that it is a six.
Then, P(E1) = 1/6, and P(E2) = (1 - 1/6) = 5/6
P(E|E1) = probability that the man reports that six occurs when six has actually occurred
⇒ P(E|E1) = probability that the man speaks the truth
⇒ P(E|E1) = 3/4
P(E|E2) = probability that the man reports that six occurs when six has not actually occurred
⇒ P(E|E2) = probability that the man does not speak the truth
⇒ P(E|E2) = (1 - 3/4) = 1/4
Probability of getting a six, given that the man reports it to be a six:
P(E1|E) = {P(E|E1) × P(E1)} / {P(E|E1) × P(E1) + P(E|E2) × P(E2)} [by Bayes theorem]
⇒ P(E1|E) = (3/4 × 1/6) / {(3/4 × 1/6) + (1/4 × 5/6)}
⇒ P(E1|E) = (1/8) / (1/3) = 3/8
Hence the probability required is 3/8.
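The same steps can be written out in Python with exact fractions. The variable names are illustrative. As in the solution above, the sketch assumes that whenever the man lies about a roll that is not a six, he reports a six.

```python
from fractions import Fraction

p_six = Fraction(1, 6)    # P(E1): the die shows a six
p_truth = Fraction(3, 4)  # the man tells the truth 3 times out of 4

# P(E|E1): he reports a six when a six occurred (he tells the truth)
p_report_given_six = p_truth
# P(E|E2): he reports a six when no six occurred (he lies);
# assumes that a lie about a non-six is always "six"
p_report_given_not_six = 1 - p_truth

# P(E): total probability that he reports a six
p_report = p_report_given_six * p_six + p_report_given_not_six * (1 - p_six)

# P(E1|E) by Bayes' theorem
posterior = p_report_given_six * p_six / p_report
print(posterior)  # 3/8
```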