U4-Naive Bayes Algorithm
X = (x1 , x2 , . . . , xn ).
We are required to determine the most appropriate class label that should be assigned to the test
instance. For this purpose we compute the conditional probabilities
P (c1 ∣X), P (c2 ∣X), . . . , P (cp ∣X)    (6.5)
and choose the maximum among them. Let the maximum probability be P (ci ∣X). Then, we choose
ci as the most appropriate class label for the test instance having X as the feature vector.
The direct computation of the probabilities given in Eq.(6.5) is difficult for a number of reasons.
Bayes’ theorem can be applied to obtain a simpler method. This is explained below.
P (ck ∣X) ∝ P (x1 ∣ck )P (x2 ∣ck )⋯P (xn ∣ck )P (ck ).
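The proportionality above can be reached in two steps. The sketch below assumes the usual "naive" hypothesis that the features are conditionally independent given the class; the denominator P(X) is dropped because it does not depend on k and so does not affect which class attains the maximum.

```latex
\begin{align*}
P(c_k \mid X) &= \frac{P(X \mid c_k)\,P(c_k)}{P(X)}
  && \text{(Bayes' theorem)} \\
P(X \mid c_k) &= P(x_1 \mid c_k)\,P(x_2 \mid c_k)\cdots P(x_n \mid c_k)
  && \text{(conditional independence of the features)}
\end{align*}
```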
Remarks
The various probabilities in the above expression are computed as follows:
P (ck ) = (No. of examples with class label ck ) / (Total number of examples)
P (xj ∣ck ) = (No. of examples with jth feature equal to xj and class label ck ) / (No. of examples with class label ck )
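The two counting rules can be sketched in a few lines of Python. The data set below is a hypothetical toy example invented for illustration (it is not Table 6.1), and the function names `prior` and `conditional` are likewise assumptions of this sketch.

```python
from collections import Counter

# Hypothetical toy data (not Table 6.1): each entry is (feature values, class label).
examples = [
    (("Fast", "No"), "Fish"),
    (("Slow", "No"), "Fish"),
    (("No", "Short"), "Bird"),
    (("No", "Long"), "Bird"),
    (("Slow", "No"), "Animal"),
]

def prior(examples, c):
    """P(c): number of examples with class label c / total number of examples."""
    class_counts = Counter(label for _, label in examples)
    return class_counts[c] / len(examples)

def conditional(examples, j, x, c):
    """P(x_j = x | c): among examples of class c, the fraction whose j-th feature equals x."""
    in_class = [feats for feats, label in examples if label == c]
    matches = sum(1 for feats in in_class if feats[j] == x)
    return matches / len(in_class)  # assumes class c occurs at least once

print(prior(examples, "Fish"))                   # 0.4
print(conditional(examples, 0, "Slow", "Fish"))  # 0.5
```

Here feature indices are 0-based, so `j = 0` corresponds to the first feature.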
Let there be a training data set having n features F1 , . . . , Fn . Let f1 denote an arbitrary value of F1 ,
f2 of F2 , and so on. Let the set of class labels be {c1 , c2 , . . . , cp }. Let there be given a test instance
having the feature vector
X = (x1 , x2 , . . . , xn ).
We are required to determine the most appropriate class label that should be assigned to the test
instance.
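Step 1. Compute the probabilities P (ck ) for k = 1, . . . , p.
Step 2. Form a table showing the conditional probabilities
P (f1 ∣ck ), P (f2 ∣ck ), . . . , P (fn ∣ck )
for all values of f1 , f2 , . . . , fn and for k = 1, . . . , p.
Step 3. Compute the products
qk = P (x1 ∣ck )P (x2 ∣ck )⋯P (xn ∣ck )P (ck )
for k = 1, . . . , p.
Step 4. Find j such that qj = max{q1 , q2 , . . . , qp }.
Step 5. Assign the class label cj to the test instance X.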
Remarks
In the above algorithm, Steps 1 and 2 constitute the learning phase of the algorithm. The remaining
steps constitute the testing phase. For testing purposes, only the table of probabilities is required;
the original data set is not required.
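The split between the learning phase and the testing phase can be illustrated with a minimal sketch. The function names `learn` and `classify` and the toy data are assumptions made for this illustration (the data is not Table 6.1); exact fractions are used so that the probabilities match hand computation. Note that `classify` receives only the two probability tables, not the original data.

```python
from collections import Counter, defaultdict
from fractions import Fraction

def learn(rows, labels):
    """Learning phase: build the prior table and the conditional-probability table."""
    class_counts = Counter(labels)
    priors = {c: Fraction(k, len(labels)) for c, k in class_counts.items()}
    counts = defaultdict(int)  # (feature index, value, class) -> count
    for row, c in zip(rows, labels):
        for j, value in enumerate(row):
            counts[(j, value, c)] += 1
    cond = {key: Fraction(k, class_counts[key[2]]) for key, k in counts.items()}
    return priors, cond

def classify(x, priors, cond):
    """Testing phase: compute q_k for every class and return the class with the largest q_k."""
    q = {}
    for c, p_c in priors.items():
        product = p_c
        for j, value in enumerate(x):
            # A (feature value, class) pair never seen in training has probability 0.
            product *= cond.get((j, value, c), Fraction(0))
        q[c] = product
    best = max(q, key=q.get)
    return best, q

# Hypothetical toy data: two features per example.
rows = [("Fast", "No"), ("Slow", "No"), ("No", "Short"),
        ("No", "Long"), ("Slow", "No"), ("Slow", "No")]
labels = ["Fish", "Fish", "Bird", "Bird", "Animal", "Animal"]

priors, cond = learn(rows, labels)
best, q = classify(("Slow", "No"), priors, cond)
print(best, q[best])  # Animal 1/3
```

If two classes tie for the maximum, this sketch returns whichever was encountered first; the algorithm as stated does not specify a tie-breaking rule.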
6.3.5 Example
Problem
Consider a training data set consisting of the fauna of the world. Each unit has three features named
“Swim”, “Fly” and “Crawl”. Let the possible values of these features be as follows:
Feature   Possible values
Swim      Fast, Slow, No
Fly       Long, Short, Rarely, No
Crawl     Yes, No
For simplicity, each unit is classified as “Animal”, “Bird” or “Fish”. Let the training data set be as in
Table 6.1. Use the naive Bayes algorithm to classify a particular species whose features are (Slow,
Rarely, No).
CHAPTER 6. BAYESIAN CLASSIFIER AND ML ESTIMATION 66
Solution
In this example, the features are
F1 = “Swim”, F2 = “Fly”, F3 = “Crawl”.
The class labels are
c1 = “Animal”, c2 = “Bird”, c3 = “Fish”.
The test instance is (Slow, Rarely, No) and so we have:
x1 = “Slow”, x2 = “Rarely”, x3 = “No”.
We construct the frequency table shown in Table 6.2 which summarises the data. (It may be noted
that the construction of the frequency table is not part of the algorithm.)
Table 6.2: Frequency table

              Swim (F1)         Fly (F2)                  Crawl (F3)
Class         Fast  Slow  No    Long  Short  Rarely  No   Yes  No     Total
Animal (c1)   2     2     1     0     0      1       4    2    3      5
Bird (c2)     1     0     3     1     2      0       1    1    3      4
Fish (c3)     1     2     0     0     0      0       3    0    3      3
Total         4     4     4     1     2      1       8    4    8      12
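Step 1. From the Total column of the frequency table, the class probabilities are
P (c1 ) = 5/12, P (c2 ) = 4/12, P (c3 ) = 3/12.
Step 2. The conditional probabilities P (fj ∣ck ) are as follows:

              Swim (F1)         Fly (F2)                  Crawl (F3)
              f1                f2                        f3
Class         Fast  Slow  No    Long  Short  Rarely  No   Yes  No
Animal (c1)   2/5   2/5   1/5   0/5   0/5    1/5     4/5  2/5  3/5
Bird (c2)     1/4   0/4   3/4   1/4   2/4    0/4     1/4  0/4  4/4
Fish (c3)     1/3   2/3   0/3   0/3   0/3    0/3     3/3  0/3  3/3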
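Step 3. Using the conditional probabilities for x1 = “Slow”, x2 = “Rarely”, x3 = “No” together with
the class probabilities P (ck ), we compute
q1 = P (x1 ∣c1 )P (x2 ∣c1 )P (x3 ∣c1 )P (c1 ) = (2/5)(1/5)(3/5)(5/12) = 0.02,
q2 = 0, since P (x1 ∣c2 ) = 0/4,
q3 = 0, since P (x2 ∣c3 ) = 0/3.
Step 4. Now
max{q1 , q2 , q3 } = q1 = 0.02.
Step 5. The maximum is attained for
c1 = “Animal”.
So we assign the class label “Animal” to the test instance “(Slow, Rarely, No)”.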
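As a check on the arithmetic, the products qk can be recomputed from the fractions in the conditional probability table. The sketch below uses exact fractions; the dictionary names are illustrative only. With the prior P (c1 ) = 5/12 included in the product, as the expression for P (ck ∣X) requires, the value for Animal works out to 1/50 = 0.02, and the other two products vanish because each contains a zero factor. The assigned class is “Animal” either way.

```python
from fractions import Fraction as F

# Priors from the Total column of the frequency table (5, 4 and 3 of 12 examples).
priors = {"Animal": F(5, 12), "Bird": F(4, 12), "Fish": F(3, 12)}

# Conditional probabilities for the test instance (Slow, Rarely, No), read off the
# conditional probability table in the order P(Slow|c), P(Rarely|c), P(Crawl=No|c).
cond = {
    "Animal": [F(2, 5), F(1, 5), F(3, 5)],
    "Bird":   [F(0, 4), F(0, 4), F(4, 4)],
    "Fish":   [F(2, 3), F(0, 3), F(3, 3)],
}

q = {}
for c in priors:
    product = priors[c]
    for p in cond[c]:
        product *= p
    q[c] = product

best = max(q, key=q.get)
print(best, float(q[best]))  # Animal 0.02
```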
2. If there are no obvious cut points, we may discretize the feature using quantiles. We may
divide the data into three bins with tertiles, four bins with quartiles, or five bins with quintiles,
etc.
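Quantile-based discretization can be sketched with Python's standard library. The helper name `quantile_bins` and the sample values are hypothetical; `statistics.quantiles` (Python 3.8 or later) returns the n - 1 cut points that divide the data into n groups of roughly equal size, and each value is then assigned the index of the bin it falls into.

```python
import statistics

def quantile_bins(values, n_bins):
    """Assign each value a bin index in 0 .. n_bins-1 using quantile cut points."""
    cuts = statistics.quantiles(values, n=n_bins)  # n_bins - 1 cut points
    # A value's bin index is the number of cut points lying strictly below it.
    bins = [sum(v > c for c in cuts) for v in values]
    return bins, cuts

# Hypothetical numeric feature, split into three bins with tertiles.
values = [5, 1, 9, 3, 7, 2, 8, 4, 6]
bins, cuts = quantile_bins(values, 3)
print(bins)
```

Passing `n_bins=4` or `n_bins=5` gives quartile or quintile bins in the same way; the resulting bin indices can then be treated as the discrete values of the feature.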