BAYESIAN LEARNING

Bayesian reasoning provides a probabilistic approach to inference. It is based on the assumption that
the quantities of interest are governed by probability distributions and that optimal decisions can be
made by reasoning about these probabilities together with observed data. It is important to machine
learning because it provides a quantitative approach to weighing the evidence supporting alternative
hypotheses. Bayesian reasoning provides the basis for learning algorithms that directly manipulate
probabilities, as well as a framework for analyzing the operation of other algorithms that do not
explicitly manipulate probabilities.
INTRODUCTION:
Bayesian learning methods are relevant to our study of machine learning for two different reasons:
1. First, Bayesian learning algorithms that calculate explicit probabilities for hypotheses, such as
the naive Bayes classifier, are among the most practical approaches to certain types of learning
problems.
2. The second reason that Bayesian methods are important to our study of machine learning is that
they provide a useful perspective for understanding many learning algorithms that do not
explicitly manipulate probabilities.
Features of Bayesian learning methods include:
1. Each observed training example can incrementally decrease or increase the estimated
probability that a hypothesis is correct. This provides a more flexible approach to learning than
algorithms that completely eliminate a hypothesis if it is found to be inconsistent with any single
example.
2. Prior knowledge can be combined with observed data to determine the final probability of a
hypothesis. In Bayesian learning, prior knowledge is provided by asserting (1) a prior probability
for each candidate hypothesis, and (2) a probability distribution over observed data for each
possible hypothesis.
3. Bayesian methods can accommodate hypotheses that make probabilistic predictions (e.g.,
hypotheses such as "this pneumonia patient has a 93% chance of complete recovery").
4. New instances can be classified by combining the predictions of multiple hypotheses, weighted
by their probabilities.
5. Even in cases where Bayesian methods prove computationally intractable, they can provide a
standard of optimal decision making against which other practical methods can be measured.
Difficulties in applying Bayesian methods:
1. One practical difficulty is that Bayesian methods typically require initial knowledge of many
probabilities. When these probabilities are not known in advance, they are often estimated based on
background knowledge, previously available data, and assumptions about the form of the
underlying distributions.
2. A second practical difficulty is the significant computational cost required to determine the
Bayes optimal hypothesis in the general case (linear in the number of candidate hypotheses). In
certain specialized situations, this computational cost can be significantly reduced.

BAYES THEOREM:
Bayes theorem provides a way to calculate the probability of a hypothesis based on its prior probability,
the probabilities of observing various data given the hypothesis, and the observed data itself.
Bayes theorem is the cornerstone of Bayesian learning methods because it provides a way to calculate
the posterior probability P(h|D) from the prior probability P(h), together with P(D) and P(D|h):

P(h|D) = P(D|h) P(h) / P(D)

P(h) is often called the prior probability of h and may reflect any background knowledge we have about
the chance that h is a correct hypothesis.
P(D) denotes the prior probability that training data D will be observed (i.e., the probability of D given no
knowledge about which hypothesis holds).
P(D|h) denotes the probability of observing data D given some world in which hypothesis h holds.
In machine learning problems, we are interested in finding the posterior probability P(h|D) that h holds
given the observed training data D. Notice that the posterior probability P(h|D) reflects the influence of the
training data D, in contrast to the prior probability P(h), which is independent of D.
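As a quick aid, the rule can be written as a one-line function. This is a minimal sketch, not part of the original notes; the function name and the example numbers are purely illustrative.

def posterior(prior_h, likelihood, evidence):
    """Bayes theorem: P(h|D) = P(D|h) * P(h) / P(D)."""
    return likelihood * prior_h / evidence

# Illustrative values (not from the notes): P(h) = 0.3, P(D|h) = 0.9, P(D) = 0.45
print(posterior(0.3, 0.9, 0.45))  # prints 0.6 (up to floating-point rounding)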
MAP (Maximum a Posteriori) hypothesis: In many learning scenarios, the learner considers some set of
candidate hypotheses H and is interested in finding the most probable hypothesis h ∈ H given the
observed data D (or at least one of the maximally probable, if there are several). We can determine the
MAP hypotheses by using Bayes theorem to calculate the posterior probability of each candidate
hypothesis. More precisely, we say that hMAP is a MAP hypothesis provided:

hMAP = argmax_{h ∈ H} P(h|D)
     = argmax_{h ∈ H} P(D|h) P(h) / P(D)
     = argmax_{h ∈ H} P(D|h) P(h)

In the final step, P(D) is dropped because it is constant across hypotheses and so does not affect the
argmax.
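The argmax in the MAP rule can be computed directly once each candidate hypothesis has a prior P(h) and a likelihood P(D|h). The sketch below is illustrative; the hypothesis names and all numbers are invented for demonstration.

# MAP hypothesis: h_MAP = argmax over h of P(D|h) * P(h)
# (P(D) is the same for every hypothesis, so it can be dropped.)
hypotheses = {
    # h: (prior P(h), likelihood P(D|h)) -- illustrative values only
    "h1": (0.5, 0.2),
    "h2": (0.3, 0.6),
    "h3": (0.2, 0.4),
}

h_map = max(hypotheses, key=lambda h: hypotheses[h][1] * hypotheses[h][0])
print(h_map)  # "h2", since 0.6 * 0.3 = 0.18 is the largest product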
Maximum Likelihood (ML) hypothesis hML: In some cases, we assume that every hypothesis in H is
equally probable a priori (P(hi) = P(hj) for all hi and hj in H). In this case the MAP rule simplifies
further, and we need only consider the term P(D|h):

hML = argmax_{h ∈ H} P(D|h)

P(D|h) is often called the likelihood of the data D given h, and any hypothesis that maximizes P(D|h) is
called a maximum likelihood (ML) hypothesis, hML.
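Under a uniform prior the previous sketch reduces to maximizing the likelihood alone; again, the values here are illustrative rather than taken from the notes.

# ML hypothesis: with equal priors, h_ML = argmax over h of P(D|h)
likelihoods = {"h1": 0.2, "h2": 0.6, "h3": 0.4}  # P(D|h), illustrative values

h_ml = max(likelihoods, key=likelihoods.get)
print(h_ml)  # "h2", the hypothesis with the largest likelihood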

Bayes theorem can be applied equally well to any set H of mutually exclusive propositions whose
probabilities sum to one (e.g., "the sky is blue" and "the sky is not blue").
Illustration: Consider a medical diagnosis problem in which there are two alternative hypotheses: (1)
that the patient has a particular form of cancer, and (2) that the patient does not. The available data is
from a particular laboratory test with two possible outcomes: + (positive) and - (negative). We have
prior knowledge that over the entire population of people only .008 have this disease. Furthermore, the
lab test is only an imperfect indicator of the disease. The test returns a correct positive result in only
98% of the cases in which the disease is actually present, and a correct negative result in only 97% of the
cases in which the disease is not present. In other cases, the test returns the opposite result. The above
situation can be summarized by the following probabilities:

P(cancer) = .008          P(¬cancer) = .992
P(+|cancer) = .98         P(-|cancer) = .02
P(+|¬cancer) = .03        P(-|¬cancer) = .97

Suppose we now observe a new patient for whom the lab test returns a positive result. Applying the
MAP rule:

P(+|cancer) P(cancer) = .98 × .008 = .0078
P(+|¬cancer) P(¬cancer) = .03 × .992 = .0298

Thus hMAP = ¬cancer: even when the lab test returns a positive result, the most probable hypothesis is
that the patient does not have cancer.
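The arithmetic above can be checked directly. The sketch below also normalizes by P(+) to recover the exact posterior probability of cancer given a positive test, which the argmax comparison alone does not require.

# Probabilities from the cancer diagnosis illustration
p_cancer, p_not_cancer = 0.008, 0.992
p_pos_given_cancer, p_pos_given_not = 0.98, 0.03

# Unnormalized posteriors P(+|h) * P(h) for a positive test result
score_cancer = p_pos_given_cancer * p_cancer  # 0.98 * 0.008 = 0.00784
score_not = p_pos_given_not * p_not_cancer    # 0.03 * 0.992 = 0.02976

print("hMAP:", "cancer" if score_cancer > score_not else "not cancer")

# Normalizing by P(+) = sum of the two scores gives the exact posterior
p_cancer_given_pos = score_cancer / (score_cancer + score_not)
print(round(p_cancer_given_pos, 2))  # ~0.21: cancer is still unlikely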