
CS3491-AI&ML-QB

UNIT II PROBABILISTIC REASONING

Acting under uncertainty – Bayesian inference – naïve bayes models. Probabilistic


reasoning – Bayesian networks – exact inference in BN – approximate inference in BN –
causal networks.

PART – A

1. What is meant by Bayes theorem in probability? Nov/Dec 19

In probability theory, Bayes' theorem is a mathematical formula used to determine the conditional probability of an event. Conditional probability is the likelihood that an event will occur, given that a related event has already occurred.

2. What is MAP hypothesis?

In many learning scenarios, the learner considers some set of candidate


hypotheses H and is interested in finding the most probable hypothesis given the
observed data D (or at least one of the maximally probable if there are several). Any such
maximally probable hypothesis is called a maximum a posteriori (MAP) hypothesis.
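Formally, h_MAP = argmax over h in H of P(h | D) = argmax over h in H of P(D | h) P(h). A minimal sketch of this selection follows; the priors and likelihoods used are made-up values, purely for illustration:

    # Sketch: choosing the MAP hypothesis h_MAP = argmax_h P(D | h) * P(h).
    # The priors P(h) and likelihoods P(D | h) below are hypothetical values.
    priors = {"h1": 0.3, "h2": 0.5, "h3": 0.2}
    likelihoods = {"h1": 0.05, "h2": 0.02, "h3": 0.10}

    scores = {h: likelihoods[h] * priors[h] for h in priors}
    h_map = max(scores, key=scores.get)
    print(h_map, scores[h_map])     # h3 is the MAP hypothesis for these numbers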

3. What are the assumptions in Naïve Bayes algorithms?

The naive Bayes algorithm is based on the following assumptions:


• All the features are independent and are unrelated to each other. Presence or
absence of a feature does not influence the presence or absence of any other
feature.
• The data has class-conditional independence, which means that events are
independent so long as they are conditioned on the same class value.
These assumptions are, in general, not true in many real-world problems; it is because of
these simplifying assumptions that the algorithm is called a naive algorithm.

4. What is Probabilistic reasoning?

Probabilistic reasoning is a way of knowledge representation where we apply the concept


of probability to indicate the uncertainty in knowledge. In probabilistic reasoning, we
combine probability theory with logic to handle the uncertainty

5. How is Bayes theorem different from conditional probability? Nov/Dec 18



Bayes' theorem defines the probability of an event based on prior knowledge of the conditions related to that event. If we know one conditional probability, say P(B|A), we can use Bayes' theorem to find the reverse probability P(A|B).

6. What is uncertainty in knowledge representation?


With rule-based knowledge representation, we might write A→B, which means that if A is true then
B is true. But consider a situation where we are not sure whether A is true or not:
we then cannot express this statement, and this situation is called uncertainty.

7. What is Naive Bayes classifier? Apr/May 20

The naive Bayes classifier assigns the target value v_NB = argmax over vj in V of P(vj) Π_i P(ai | vj), where v_NB denotes the target value output by the naive Bayes classifier. In a naive Bayes classifier the number of distinct P(ai | vj) terms that must be estimated from the training data is just the number of distinct attribute values times the number of distinct target values, a much smaller number than if we were to estimate the P(a1, a2, ..., an | vj) terms as first contemplated.

8. What is Bayesian belief network?

A Bayesian belief network describes the probability distribution governing a set


of variables by specifying a set of conditional independence assumptions along with a set
of conditional probabilities. In contrast to the naive Bayes classifier, which assumes that
all the variables are conditionally independent given the value of the target variable,
Bayesian belief networks allow stating conditional independence assumptions that apply
to subsets of the variables. Thus, Bayesian belief networks provide an intermediate
approach that is less constraining than the global assumption of conditional independence
made by the naive Bayes classifier, but more tractable than avoiding conditional
independence assumptions altogether.

9. How do you represent a Bayesian belief network? Explain with example.

A Bayesian belief network (Bayesian network) represents the joint probability


distribution for a set of variables. For example, the Bayesian network in the figure below represents
the joint probability distribution over the Boolean variables Storm, Lightning, Thunder,
ForestFire, Campfire, and BusTourGroup. In general, a Bayesian network represents the joint
probability distribution by specifying a set of conditional independence assumptions (represented
by a directed acyclic graph), together with sets of local conditional probabilities. Each variable in
the joint space is represented by a node in the Bayesian network.


The network on the left represents a set of conditional independence assumptions.


In particular, each node is asserted to be conditionally independent of its non-descendants, given its immediate parents. Associated with each node is a conditional
probability table, which specifies the conditional distribution for the variable given its
immediate parents in the graph.

The conditional probability table for the Campfire node is shown at the right,
where Campfire is abbreviated to C, Storm to S, and BusTourGroup to B.

10 .What is MLE?

Maximum likelihood estimation (MLE) is a method of estimating the parameters


of a statistical model, given observations. MLE attempts to find the parameter values that
maximize the likelihood function, given the observations. The resulting estimate is called
a maximum likelihood estimate, which is also abbreviated as MLE.

11. What is Bayesian Networks (BN)? Nov/Dec ‘18

 A Bayesian network is used to represent the graphical model for the probability relationships among a set of variables.
 A Bayesian network is a probabilistic graphical model that captures the conditional dependence structure of a set of random variables based on Bayes' theorem.
 A Bayesian network represents a set of variables and their conditional dependencies using a directed acyclic graph.
 It is also called a Bayes network, belief network, decision network, or Bayesian model.
 Bayesian networks are probabilistic because these networks are built from a probability distribution, and they also use probability theory for prediction and anomaly detection.

12. What is Bayesian Inference? Apr/May 19

The Bayesian inference is an application of Bayes' theorem, which is fundamental to Bayesian


statistics.


It is a way to calculate the value of P(B|A) with the knowledge of P(A|B).

Bayes' theorem can be derived using the product rule and the conditional probability of event A with known event B.
From the product rule we can write:
P(A ⋀ B) = P(A|B) P(B)
Similarly, for the probability of event B with known event A:
P(A ⋀ B) = P(B|A) P(A)
Equating the right-hand sides of both equations, we get:

P(A|B) = P(B|A) P(A) / P(B)    ...(a)

The above equation (a) is called Bayes' rule or Bayes' theorem. This equation is the basis of
most modern AI systems for probabilistic inference.

13. How Bayes theorem calculates posterior probability? Nov/Dec 20


A posterior probability is the revised or updated probability of an event occurring after
taking into consideration new information. It is a combination of the prior probability and the new
information. The posterior probability is calculated by updating the prior
probability using Bayes' theorem. In statistical terms, the posterior probability is the
probability of event A occurring given that event B has occurred.
The formula to calculate the posterior probability of A occurring given that B occurred is:

P(A|B) = P(B|A) P(A) / P(B)

where P(A) is the prior probability of A, P(B|A) is the likelihood of the evidence B given A, and P(B) is the probability of the evidence B.

14. What is Probability learning?


Probability is a measure of uncertainty. Probability applies to machine learning because
in the real world, we need to make decisions with incomplete information. Hence, we need
a mechanism to quantify uncertainty – which Probability provides us. Using probability, we
can model elements of uncertainty such as risk in financial transactions and many other
business processes.
15. What are the advantages of Naive Bayes?
It handles both continuous and discrete data. It is highly scalable with the number of
predictors and data points. It is fast and can be used to make real-time predictions. It is not
sensitive to irrelevant features.
16. What is Conditional probability? Apr/May 18

Conditional probability is the probability of an event occurring given that another event has


already happened.
Suppose we want to calculate the probability of event A when event B has already occurred, "the
probability of A under the condition B"; it can be written as:

P(A|B) = P(A ⋀ B) / P(B)

17. What is the inference of Bayesian networks?


Bayesian networks are a type of probabilistic graphical model that uses Bayesian inference for
probability computations. Bayesian networks aim to model conditional dependence, and
therefore causation, by representing conditional dependence by edges in a directed graph.

18. What are the two components of Bayesian network?


There are two components involved in learning a Bayesian network: (i) structure
learning, which involves discovering the DAG that best describes the causal relationships in
the data, and (ii) parameter learning, which involves learning about the conditional
probability distributions.

A simple Bayesian network example.

19. Where is Naive Bayes Used?


o Face Recognition
As a classifier, it is used to identify faces or facial features, such as the nose,
mouth, eyes, etc.
o Weather Prediction
It can be used to predict if the weather will be good or bad.
o Medical Diagnosis


Doctors can diagnose patients by using the information that the classifier
provides. Healthcare professionals can use Naive Bayes to indicate if a patient is at
high risk for certain diseases and conditions, such as heart disease, cancer, and other
ailments.
o News Classification
With the help of a Naive Bayes classifier, Google News recognizes whether the
news is political, world news, and so on.
20. What are the advantages of Naïve Bayes model?

 Easy to work with when using binary or categorical input values.


 Requires only a small amount of training data for estimating the parameters necessary for
classification.
 Handles both continuous and discrete data.
 Fast and reliable for making real-time predictions

PART B

1. a) State and Prove Bayes’ theorem. Nov/Dec 20


Bayes' theorem describes the probability of occurrence of an event related to any
condition, and it also applies in the case of conditional probability. Bayes' theorem is
also known as the formula for the probability of "causes". For example, suppose we have to
calculate the probability of taking a blue ball from the second of three different
bags of balls, where each bag contains balls of three different colours: red, blue and black. In
this case, the probability of occurrence of the event is calculated depending on other
conditions, and this is known as conditional probability.
Bayes Theorem Statement

Let E1, E2, ..., En be a set of events associated with a sample space S, where all the events E1,
E2, ..., En have nonzero probability of occurrence and they form a partition of S. Let A be any
event associated with S. Then, according to Bayes' theorem,

P(Ei | A) = P(Ei) P(A | Ei) / [ P(E1) P(A | E1) + P(E2) P(A | E2) + ... + P(En) P(A | En) ],  for i = 1, 2, ..., n


Note:

The following terminologies are also used when Bayes' theorem is applied:
Hypotheses: The events E1, E2, ..., En are called the hypotheses.
Prior probability: The probability P(Ei) is the prior probability of hypothesis Ei.
Posterior probability: The probability P(Ei|A) is the posterior probability of hypothesis Ei.
Bayes' theorem is also called the formula for the probability of "causes". Since the Ei's form a
partition of the sample space S, one and only one of the events Ei occurs (i.e. one of the events
Ei must occur and only one can occur). Hence, the above formula gives the probability of
a particular Ei (i.e. a "cause"), given that the event A has occurred.
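A small numerical sketch of this statement follows; the three events and their probabilities are assumed values, used only for illustration:

    # Sketch of Bayes' theorem over a partition {E1, E2, E3} of the sample space.
    # The probabilities below are assumed values, not taken from the text.
    prior = {"E1": 0.5, "E2": 0.3, "E3": 0.2}          # P(Ei)
    likelihood = {"E1": 0.1, "E2": 0.4, "E3": 0.7}     # P(A | Ei)

    # Law of total probability: P(A) = sum_i P(Ei) * P(A | Ei)
    p_A = sum(prior[e] * likelihood[e] for e in prior)

    # Posterior probability of each hypothesis Ei given that A occurred
    posterior = {e: prior[e] * likelihood[e] / p_A for e in prior}
    print(posterior)    # the posterior probabilities sum to 1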
b) Explain the axioms of probability.

Axioms of Probability
There are three axioms of probability that make the foundation of probability theory-
Axiom 1: Probability of Event
The first axiom is that the probability of an event is always between 0 and 1. A probability
of 1 indicates that the event is certain to occur, and 0 indicates that no outcome of the event
is possible.
Axiom 2: Probability of Sample Space
For sample space, the probability of the entire sample space is 1. The probability that at
least one of all the possible outcomes of a process (such as rolling a die) will occur is 1.
Axiom 3: Mutually Exclusive Events
The third axiom states that the probability that at least one of two mutually disjoint events
occurs is the sum of their individual probabilities. If two events A and B


are mutually exclusive, then the probability of either A or B occurring is the probability
of A occurring plus the probability of B occurring: P(A ∪ B) = P(A) + P(B).
1. Probability of Event
The first axiom of probability is that the probability of any event is between 0 and 1.

The probability of an event is the number of outcomes in the event divided by the total
number of outcomes in the sample space: P(E) = n(E) / n(S), and therefore 0 ≤ P(E) ≤ 1.

2. Probability of Sample Space


The second axiom is that the probability for the entire sample space equals 1.

Let’s take an example from the dataset. Suppose we need to find out the probability of
churning for the female customers by their occupation type.

In our data-set, we have 4 female customers, one of them is Salaried and three of them
are self-employed. The salaried female is going to churn. Since we have only one salaried
female who is going to churn, the number of salaried female customers who are not going
to churn is 0. Amongst 3 self-employed female customers, two are going to churn and we
can see that one self-employed female is not going to churn. This is the complete dataset:


So, for the churning status of a female customer by profession, the sample space of the problem consists of:
Salaried Churn, Salaried Not churn, Self-employed Churn, Self-employed Not churn
As discussed above, the distribution over this sample space of female customers is:
Salaried Churn = 1
Salaried Not churn = 0
Self-employed Churn = 2
Self-employed Not churn = 1
The corresponding probabilities are 1/4, 0, 2/4 and 1/4, which sum to 1, as the second axiom requires.


3. Mutually Exclusive Event

Recall the union formula P(A ∪ B) = P(A) + P(B) − P(A ∩ B); for mutually exclusive events the
intersection term is absent, which means there is nothing common between A and B. Mutually
exclusive events cannot occur together; in other words, they have no common outcomes, so their
intersection is empty (zero/null). We can represent such events as:

A ∩ B = ∅, so P(A ∩ B) = 0.

For example, if event A is getting a number greater than 4 after rolling a die, the
possible outcomes are 5 and 6.

Event B is getting a number less than 3 on rolling a die; here the possible outcomes
are 1 and 2.

Clearly, these two events cannot have any common outcome. An interesting thing to note here is that
events A and B are not complements of each other, yet they are mutually exclusive.


2. We have prior knowledge that, over the entire population of people, only 0.008 have cancer. The
lab test returns a correct positive result in only 98% of the cases in which the disease is actually
present, and a correct negative result in only 97% of the cases in which the disease is not
present. In other cases, the test returns the opposite result. A patient takes the lab test and
the result comes back positive. Evaluate whether the patient has cancer or not using
Bayes learning. Nov/Dec 20
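The worked solution is not reproduced here; the following sketch shows how the two posteriors would be computed with Bayes' rule from the figures stated in the question:

    # Figures from the question: P(cancer) = 0.008, correct positive rate 98%,
    # correct negative rate 97% (so a false positive occurs with probability 0.03).
    p_cancer = 0.008
    p_no_cancer = 1 - p_cancer                 # 0.992
    p_pos_given_cancer = 0.98
    p_pos_given_no_cancer = 1 - 0.97           # 0.03

    # Unnormalized posteriors for a positive test result
    num_cancer = p_pos_given_cancer * p_cancer            # 0.00784
    num_no_cancer = p_pos_given_no_cancer * p_no_cancer   # 0.02976

    p_pos = num_cancer + num_no_cancer
    print("P(cancer | +)    =", num_cancer / p_pos)       # about 0.21
    print("P(no cancer | +) =", num_no_cancer / p_pos)    # about 0.79
    # The MAP hypothesis is "no cancer", even though the test came back positive.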


3. Given 14 training examples of the target concept PlayTennis with attributes Outlook, Temperature, Humidity and Wind. The frequency of Play tennis = Yes is 9 and the frequency of Play tennis = No is 5. The conditional probabilities are given as:
P(Outlook = rainy | Play = Yes) = 2/9, P(Temp = cool | Play = Yes) = 3/9, P(Humidity = high | Play = Yes) = 3/9, P(Windy = true | Play = Yes) = 3/9,
P(Outlook = rainy | Play = No) = 3/5, P(Temp = cool | Play = No) = 1/5, P(Humidity = high | Play = No) = 4/5, P(Windy = true | Play = No) = 3/5.
Classify the new instance (Outlook = rainy, Temp = hot, Humidity = high, Wind = false) as Play = Yes or No using the Naive Bayes classifier.
Apr/May 21

Solution:
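The worked solution is not reproduced here. Note that the conditional probabilities supplied in the question are those for Outlook = rainy, Temp = cool, Humidity = high and Windy = true; a sketch of the naive Bayes computation using exactly those supplied figures follows (no values beyond the ones stated in the question are assumed):

    # Probabilities as supplied in the question (they correspond to the attribute
    # values outlook = rainy, temp = cool, humidity = high, windy = true).
    p_yes, p_no = 9 / 14, 5 / 14

    cond_yes = [2 / 9, 3 / 9, 3 / 9, 3 / 9]    # P(attribute value | Play = Yes)
    cond_no = [3 / 5, 1 / 5, 4 / 5, 3 / 5]     # P(attribute value | Play = No)

    score_yes = p_yes
    for p in cond_yes:
        score_yes *= p                          # about 0.0053
    score_no = p_no
    for p in cond_no:
        score_no *= p                           # about 0.0206

    print("Play = Yes score:", score_yes)
    print("Play = No  score:", score_no)
    print("Prediction:", "Yes" if score_yes > score_no else "No")   # No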


4. Explain Naive Bayes algorithm

Assumption

The naive Bayes algorithm is based on the following assumptions:


 All the features are independent and are unrelated to each other. Presence or absence
of a feature does not influence the presence or absence of any other feature.
 The data has class-conditional independence, which means that events are
independent so long as they are conditioned on the same class value.
 These assumptions are, in general, not true in many real-world problems; it is because of
these simplifying assumptions that the algorithm is called a naive algorithm.
Basic idea


Computation of probabilities

Remarks

The various probabilities in the above expression are computed as follows:
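The formulas referred to above are not reproduced here. As a minimal sketch (with a tiny made-up categorical dataset, used purely for illustration), the class priors P(c) and the class-conditional probabilities P(xi | c) can be estimated from frequency counts and combined as follows:

    from collections import Counter, defaultdict

    # Tiny hypothetical training set: (feature tuple, class label).
    train = [
        (("sunny", "hot"), "no"),
        (("sunny", "cool"), "yes"),
        (("rain", "cool"), "yes"),
        (("rain", "hot"), "no"),
        (("sunny", "hot"), "no"),
    ]

    class_counts = Counter(label for _, label in train)     # frequency of each class
    cond_counts = defaultdict(Counter)                      # (feature index, class) -> value counts
    for features, label in train:
        for i, value in enumerate(features):
            cond_counts[(i, label)][value] += 1

    def predict(features):
        scores = {}
        for c, n_c in class_counts.items():
            score = n_c / len(train)                        # prior P(c)
            for i, value in enumerate(features):
                score *= cond_counts[(i, c)][value] / n_c   # P(x_i = value | c)
            scores[c] = score
        return max(scores, key=scores.get), scores

    print(predict(("rain", "hot")))                         # -> ('no', ...)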


5. Write a short note on Bayesian network. Or Explain the Bayesian network by taking an
example. How is the Bayesian network representation for uncertainty knowledge?
Apr/May 21

 A Bayesian network is a probabilistic graphical model (PGM) which represents a set of


variables and their conditional dependencies using a directed acyclic graph (DAG).
 These networks are built from a probability distribution, and also use probability theory
for prediction and anomaly detection.
 Each variable is associated with a conditional probability table which gives the
probability of this variable for different values of the variables on which this node depends.
 Using this model, it is possible to perform inference and learning.
 Bayesian networks that model a sequence of variables varying in time are called
dynamic Bayesian networks.
 Bayesian networks with decision nodes, which solve decision problems under
uncertainty, are called influence diagrams.
 A Bayesian network is an explicit representation of conditional independencies: missing arcs encode
conditional independence, giving an efficient representation of the joint probability distribution P(X).
 It is a generative model (not just discriminative), so it allows arbitrary queries to be answered,
e.g. P(lung cancer = yes | smoking = no, positive X-ray = yes) = ?
For example, a Bayesian network could represent the probabilistic relationships between
diseases and symptoms. Given symptoms, the network can be used to compute the probabilities


of the presence of various diseases. Efficient algorithms can perform inference and learning in
Bayesian networks.
Another Example:

Applications:

 Prediction
 Anomaly detection
 Diagnostics
 Automated insight
 Reasoning
 Time series prediction
 Decision making under uncertainty.

6.a. Explain the role of prior probability and posterior probability in Bayesian classification.

Prior Probability:

In Bayesian statistical inference, the prior probability is the probability of an event before new data is
collected. It is the best rational assessment of the probability of an outcome based on current knowledge.

It shows the likelihood of an outcome in a given dataset.


For example,

In the mortgage example, the prior P(Y) is the overall default rate on a home mortgage, which is 2%. P(Y|X) is the
conditional probability, which gives the probability of the outcome given the evidence, that is, when
the value of X is known.

Posterior Probability:

The posterior probability is calculated using Bayes' theorem: the prior probability is updated when new data becomes available, to
produce a more accurate measure of the potential outcome.

A posterior probability can subsequently become a prior for a new updated posterior probability as new
information arises and is incorporated into the analysis.

6.b. Explain the method of handling approximate inference in Bayesian networks.

Answer:

Inference over a Bayesian network can come in two forms. The first is evaluating the joint
probability of a particular assignment of values for each variable (or a subset of them) in the network;
the second is computing the posterior distribution of one or more query variables given observed
values of the evidence variables.

In exact inference, we analytically compute the conditional probability distribution over the variables of
interest.

Methods of handling approximate inference in Bayesian Networks:

Simulation Methods:

It uses the network to generate samples from the conditional probability distribution and estimate
conditional probabilities of interest when the number of samples is sufficiently large.

With machine learning, the inputs are known exactly, but the model is unknown prior to training.
Regarding output, the differences are more subtle. Both give an output, but the source of uncertainty is
different.

Variational Methods:

Variational Bayesian methods are a family of techniques for approximating intractable integrals arising
in Bayesian inference and machine learning.
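As a minimal sketch of the simulation idea, consider rejection sampling on a hypothetical two-node network Rain → WetGrass (the structure and all numbers below are assumptions made only for illustration): samples are drawn from the network, samples inconsistent with the evidence are discarded, and the conditional probability of interest is estimated from the remaining ones.

    import random

    # Hypothetical two-node network Rain -> WetGrass; all numbers are assumptions.
    P_RAIN = 0.2
    P_WET_GIVEN_RAIN = {True: 0.9, False: 0.1}      # P(WetGrass = True | Rain)

    def sample_once():
        rain = random.random() < P_RAIN
        wet = random.random() < P_WET_GIVEN_RAIN[rain]
        return rain, wet

    def estimate_p_rain_given_wet(n_samples=100000):
        accepted = rain_and_wet = 0
        for _ in range(n_samples):
            rain, wet = sample_once()
            if wet:                                  # keep only samples consistent with the evidence
                accepted += 1
                rain_and_wet += rain
        return rain_and_wet / accepted               # estimate of P(Rain = True | WetGrass = True)

    print(estimate_p_rain_given_wet())               # exact value is 0.18 / 0.26, about 0.692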

PART C
1. a) An experiment consists of observing the sum of the outcomes when two fair dice are
thrown. Find the probability that the sum is 7 and find the probability that the sum is greater than
10. May/June 2016
Solution
Sample space for total number of possible outcomes
(1,1),(1,2),(1,3),(1,4),(1,5),(1,6),
(2,1),(2,2),(2,3),(2,4),(2,5),(2,6),
(3,1),(3,2),(3,3),(3,4),(3,5),(3,6),
(4,1),(4,2),(4,3),(4,4),(4,5),(4,6),
(5,1),(5,2),(5,3),(5,4),(5,5),(5,6),
(6,1),(6,2),(6,3),(6,4),(6,5),(6,6)


Total number of outcomes =36


Favorable outcomes for sum is 7
Favorable outcomes = (1,6), (2,5), (3,4), (4,3), (5,2), (6,1)
Number of favorable outcomes = 6
Hence, the probability that the sum is 7 = 6/36 = 1/6

Favorable outcomes for sum greater than 10 are


(5,6), (6,5), (6,6)
Number of favorable outcomes = 3
Hence, the probability of obtaining a sum greater than 10 = 3/36 = 1/12
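A quick enumeration sketch that verifies both answers:

    from itertools import product

    outcomes = list(product(range(1, 7), repeat=2))   # all 36 equally likely outcomes

    p_sum_7 = sum(1 for a, b in outcomes if a + b == 7) / len(outcomes)
    p_sum_gt_10 = sum(1 for a, b in outcomes if a + b > 10) / len(outcomes)

    print(p_sum_7)       # 6/36 = 1/6
    print(p_sum_gt_10)   # 3/36 = 1/12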

b)

Solution

Let A, B, C denote the events that a randomly selected bulb was manufactured in factory
A, B, C respectively. Let D denote the event that a bulb is defective. We have the
following data:

2. Consider a training data set consisting of the fauna of the world. Each unit has three features
named “Swim”, “Fly” and “Crawl”. Let the possible values of these features be as follows:
Swim Fast, Slow, No
Fly Long, Short, Rarely, No
Crawl Yes, No
For simplicity, each unit is classified as “Animal”, “Bird” or “Fish”. Let the training data set be as
in Table. Use naive Bayes algorithm to classify a particular species if its features are (Slow, Rarely,
No)?


Solution
In this example, the features are
F1 = “Swim”; F2 = “Fly”; F3 = “Crawl”:
The class labels are
c1 = “Animal”; c2 = “ Bird”; c3 = “Fish”:
The test instance is (Slow, Rarely, No) and so we have:
x1 = “Slow”; x2 = “Rarely”; x3 = “No”:
We construct the frequency table shown in Table which summarizes the data. (It may be noted that the
construction of the frequency table is not part of the algorithm.)


3. Harry installed a new burglar alarm at his home to detect burglary. The alarm reliably responds to
a burglary but also responds to minor earthquakes. Harry has two neighbors, David and
Sophia, who have taken the responsibility to inform Harry at work when they hear the alarm. David
always calls Harry when he hears the alarm, but sometimes he gets confused with the phone ringing

and calls at that time too. On the other hand, Sophia likes to listen to loud music, so sometimes she
misses hearing the alarm. Here we would like to compute the probability of the burglar alarm. Using the
Bayesian network, calculate the probability that the alarm has sounded, but neither a
burglary nor an earthquake has occurred, and both David and Sophia have called Harry.

Solution:
 The Bayesian network for the above problem is given below. The network structure shows
that Burglary and Earthquake are the parent nodes of Alarm and directly affect the
probability of the alarm going off, whereas David's and Sophia's calls depend only on the alarm.
 The network represents the assumptions that the neighbors do not directly perceive the burglary,
do not notice minor earthquakes, and do not confer before calling.
 The conditional distribution for each node is given as a conditional probability table (CPT).
 Each row in a CPT must sum to 1 because the entries in the row represent an
exhaustive set of cases for the variable.
 In a CPT, a Boolean variable with k Boolean parents has 2^k rows of probabilities. Hence, if
there are two parents, the CPT will contain 4 rows of probability values.
o List of all events occurring in this network:
 Burglary (B)
 Earthquake(E)
 Alarm(A)
 David Calls(D)
 Sophia calls(S)
We can write the events of the problem statement in the form of a probability P[D, S, A, B,
E], and rewrite this probability statement using the joint probability distribution:
P[D, S, A, B, E]= P[D | S, A, B, E]. P[S, A, B, E]
=P[D | S, A, B, E]. P[S | A, B, E]. P[A, B, E]
= P [D| A]. P [ S| A, B, E]. P[ A, B, E]
= P[D | A]. P[ S | A]. P[A| B, E]. P[B, E]
= P[D | A ]. P[S | A]. P[A| B, E]. P[B |E]. P[E]
Let's take the observed probability for the Burglary and earthquake component:


P(B= True) = 0.002, which is the probability of burglary.


P(B= False)= 0.998, which is the probability of no burglary.
P(E= True)= 0.001, which is the probability of a minor earthquake
P(E= False)= 0.999, which is the probability that an earthquake has not occurred.
We can provide the conditional probabilities as per the below tables:
Conditional probability table for Alarm A:
The Conditional probability of Alarm A depends on Burglar and earthquake:

B E P(A= True) P(A= False)

True True 0.94 0.06

True False 0.95 0.05

False True 0.31 0.69

False False 0.001 0.999


Conditional probability table for David Calls:

The Conditional probability of David that he will call depends on the


probability of Alarm.

A P(D= True) P(D= False)

True 0.91 0.09

False 0.05 0.95


Conditional probability table for Sophia Calls:

The conditional probability that Sophia calls depends on its parent node "Alarm".


A P(S= True) P(S= False)

True 0.75 0.25

False 0.02 0.98


From the formula of joint distribution, we can write the problem statement in the form of
probability distribution:

P(S, D, A, ¬B, ¬E) = P (S|A) *P (D|A)*P (A|¬B ^ ¬E) *P (¬B) *P (¬E).

= 0.75* 0.91* 0.001* 0.998*0.999


= 0.00068045.
Hence, a Bayesian network can answer any query about the domain by using the
joint distribution.
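A short sketch that reproduces this computation directly from the CPT values listed above:

    # Joint probability P(S, D, A, ¬B, ¬E) computed from the CPTs given above.
    p_b = {True: 0.002, False: 0.998}                    # P(Burglary)
    p_e = {True: 0.001, False: 0.999}                    # P(Earthquake)
    p_a = {(True, True): 0.94, (True, False): 0.95,      # P(Alarm = True | B, E)
           (False, True): 0.31, (False, False): 0.001}
    p_d = {True: 0.91, False: 0.05}                      # P(David calls = True | Alarm)
    p_s = {True: 0.75, False: 0.02}                      # P(Sophia calls = True | Alarm)

    b, e, a = False, False, True                         # ¬B, ¬E, Alarm = True
    joint = p_s[a] * p_d[a] * p_a[(b, e)] * p_b[b] * p_e[e]
    print(joint)                                         # about 0.00068045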
The semantics of Bayesian Network:
There are two ways to understand the semantics of the Bayesian network, which
is given below:
1. To understand the network as the representation of the Joint probability
distribution.
It is helpful to understand how to construct the network.
2. To understand the network as an encoding of a collection of conditional
independence statements.
It is helpful in designing inference procedure.

4.Given that P(A)=0.3,P(A|B)=0.4 and P(B)=0.5, Compute P(B|A).

To compute P(B∣A), we can use Bayes' theorem:


P(B∣A)= P(A∣B)×P(B)/ P(A)
Given:
 P(A)=0.3
 P(A∣B)=0.4
 P(B)=0.5
Substitute these values into Bayes' theorem
P(B∣A)=0.4×0.5/0.3
P(B∣A)=0.2/0.3
P(B∣A)≈0.67
So, P(B∣A) is approximately 0.67.

5. In a clinic, the probability of a patient having the HIV virus is 0.15. A blood test is done on
patients: if the patient has the virus, then the test is +ve with probability 0.95; if the patient does
not have the virus, then the test is +ve with probability 0.02. Assign labels to events: H =
patient has virus, P = test is +ve. Given: P(H) = 0.15, P(P|H) = 0.95, P(P|¬H) = 0.02. Find: if the
test is +ve, what are the probabilities that the patient i) has the virus, i.e. P(H|P); ii) does not
have the virus, i.e. P(¬H|P). If the test is -ve, what are the probabilities that the patient iii) has the
virus, i.e. P(H|¬P); iv) does not have the virus, i.e. P(¬H|¬P)?


To solve this problem, we can use Bayes' theorem, which allows us to update our
beliefs about the probability of an event given new evidence. Let's denote:
 H: Patient has the virus (HIV).
 ¬H: Patient does not have the virus (HIV).
 P: Test result is positive.
 ¬P: Test result is negative.
Given probabilities:
 P(H)=0.15: Probability of a patient having the virus.
 P(P∣H)=0.95: Probability of a positive test result given the patient has the virus.
 P(P∣¬H)=0.02: Probability of a positive test result given the patient does not have
the virus.
We need to find:
i)P(H∣P): Probability that the patient has the virus given the test is positive.
ii) P(¬H∣P): Probability that the patient does not have the virus given the test is
positive.
iii) P(H∣¬P): Probability that the patient has the virus given the test is negative.
iv) P(¬H∣¬P): Probability that the patient does not have the virus given the test is
negative.
Applying Bayes' Theorem:
i) P(H∣P)= P(P∣H)×P(H)/ P(P)
ii) P(¬H∣P)=P(P∣¬H)×P(¬H)/ P(P)
iii) P(H∣¬P)= P(¬P∣H)×P(H)/ P(¬P)
iv) P(¬H∣¬P)= P(¬P∣¬H)×P(¬H)/ P(¬P)
Where:
 P(P)=P(P∣H)×P(H)+P(P∣¬H)×P(¬H)
 P(¬P)=1−P(P)
 P(¬P∣H)=1−P(P∣H)
 P(¬P∣¬H)=1−P(P∣¬H)
Calculations:
Given:
 P(H)=0.15
 P(¬H)=1−P(H)=0.85
 P(P∣H)=0.95
 P(P∣¬H)=0.02
Calculate:

 P(P)
 P(¬P)
 P(¬P∣H)
 P(¬P∣¬H)
Then, substitute the values into Bayes' theorem to find P(H∣P),P(¬H∣P), P(H∣¬P),
and P(¬H∣¬P).
Let's do the calculations:
Calculations:


1. P(P)=P(P∣H)×P(H)+P(P∣¬H)×P(¬H)
=0.95×0.15+0.02×0.85
=0.1425+0.017
=0.1595
2. P(¬P)=1−P(P)
=1−0.1595
=0.8405
3. P(¬P∣H)=1−P(P∣H)
=1−0.95
=0.05
4. P(¬P∣¬H)=1−P(P∣¬H)
=1−0.02
=0.98
Now, apply Bayes' theorem:
i) P(H∣P)= P(P∣H)×P(H)/ P(P)
=0.95×0.15/0.1595
≈0.1425/0.1595
≈0.893
ii) P(¬H∣P)= P(P∣¬H)×P(¬H)/ P(P)
=0.02×0.85/0.1595
≈0.017/0.1595
≈0.107
iii) P(H∣¬P)= P(¬P∣H)×P(H)/ P(¬P)
=0.05×0.15/0.8405
≈0.0075/0.8405
≈0.009
iv) P(¬H∣¬P)= P(¬P∣¬H)×P(¬H) / P(¬P)
=0.98×0.85/0.8405
≈0.833/0.8405
≈0.991
Results:
i) P(H∣P)≈0.893 or 89.3%
ii) P(¬H∣P)≈0.107 or 10.7%
iii) P(H∣¬P)≈0.009 or 0.9%
iv) P(¬H∣¬P)≈0.991 or 99.1%
These probabilities represent the likelihood of a patient having or not having the virus
given the test results.
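A short sketch that reproduces these four posteriors from the given figures:

    # Posteriors for the HIV test, using the figures given in the question.
    p_h = 0.15                  # P(H): patient has the virus
    p_pos_h = 0.95              # P(P | H)
    p_pos_not_h = 0.02          # P(P | ¬H)

    p_not_h = 1 - p_h
    p_pos = p_pos_h * p_h + p_pos_not_h * p_not_h        # P(P)  = 0.1595
    p_neg = 1 - p_pos                                    # P(¬P) = 0.8405

    print("P(H | P)   =", p_pos_h * p_h / p_pos)                   # about 0.893
    print("P(¬H | P)  =", p_pos_not_h * p_not_h / p_pos)           # about 0.107
    print("P(H | ¬P)  =", (1 - p_pos_h) * p_h / p_neg)             # about 0.009
    print("P(¬H | ¬P) =", (1 - p_pos_not_h) * p_not_h / p_neg)     # about 0.991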

6.Construct a Bayesian Network and define the necessary CPTs for the given scenario.
We have a bag of three biased coins a,b and c with probabilities of coming up heads of
20%, 60% and 80% respectively. One coin is drawn randomly from the bag (with equal
likelihood of drawing each of the three coins) and then the coin is flipped three times to
generate the outcomes X1, X2 and X3.
a. Draw a Bayesian network corresponding to this setup and define the relevant CPTs.
b. Calculate which coin is most likely to have been drawn if the flips come up HHT.
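The worked answer is not reproduced here. For part (a), the network has a single node Coin (taking values a, b, c, each with prior 1/3) as the parent of the three flip nodes X1, X2 and X3, each with P(Xi = H | Coin) equal to the chosen coin's bias. A sketch of part (b) using Bayes' rule follows:

    # Part (b): which coin most likely produced the flip sequence H, H, T?
    p_heads = {"a": 0.2, "b": 0.6, "c": 0.8}             # bias of each coin
    prior = 1 / 3                                        # each coin is drawn with equal likelihood

    # Likelihood of H, H, T for each coin (flips are independent given the coin)
    likelihood = {c: p * p * (1 - p) for c, p in p_heads.items()}   # a: 0.032, b: 0.144, c: 0.128

    # Posterior is proportional to prior * likelihood; normalize to compare.
    unnorm = {c: prior * like for c, like in likelihood.items()}
    total = sum(unnorm.values())
    for c in p_heads:
        print(c, unnorm[c] / total)                      # coin b has the highest posterior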
