NaiveBayes Algorithm

The document describes the Naive Bayes algorithm and provides an example of its use. It includes:
- An example training dataset with the attributes Education, Age, Gender and Class Label
- Steps to calculate the probability of a new data point belonging to each class
- Consideration of how to handle zero probabilities
- A second example with a numerical Age attribute instead of a categorical one
- Calculation of the class probabilities for a new data point and prediction of the class

Uploaded by

Aysun Güran
Copyright
© All Rights Reserved

NAIVE BAYES ALGORITHM

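h_MAP = argmax_h P(h | D) = argmax_h (P(D | h) * P(h)) / P(D) = argmax_h P(D | h) * P(h)
(P(D) is the same for every hypothesis h, so it does not affect the argmax.)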


Naive Bayes (NB) is a supervised classification algorithm.
The training dataset:
ID  Education         Age          Gender  Class_Label
1   Secondary School  Aged         Male    YES
2   Primary School    Young        Male    NO
3   College           Middle_Aged  Female  NO
4   Secondary School  Middle_Aged  Male    YES
5   Primary School    Middle_Aged  Male    YES
6   College           Aged         Female  YES
7   Primary School    Young        Female  NO
8   Secondary School  Middle_Aged  Female  YES

Using the NB algorithm, determine the class label of the following test instance:
Xtest(College, Middle_Aged, Female) ---> ?
Step 1: Estimate P(attribute value / class) for every attribute value by counting within each class.

                          Class_Labels
Attribute  Value         YES (5)  NO (3)
Education  Primary       1/5      2/3
           Secondary     3/5      0
           College       1/5      1/3

Age        Young         0        2/3
           Middle_Aged   3/5      1/3
           Aged          2/5      0

Gender     Female        2/5      2/3
           Male          3/5      1/3

C1 = YES    P(C1=YES) = 5/8
C2 = NO     P(C2=NO)  = 3/8
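
The counting behind the Step 1 table and the two priors can be sketched in Python. This is a minimal illustration, not code from the original document: the names TRAIN, ATTRIBUTES and conditional_table are hypothetical, and the education values are shortened to Primary, Secondary and College as in the table.

```python
from collections import Counter, defaultdict
from fractions import Fraction

# Training rows copied from the dataset: (Education, Age, Gender, Class_Label)
TRAIN = [
    ("Secondary", "Aged", "Male", "YES"),
    ("Primary", "Young", "Male", "NO"),
    ("College", "Middle_Aged", "Female", "NO"),
    ("Secondary", "Middle_Aged", "Male", "YES"),
    ("Primary", "Middle_Aged", "Male", "YES"),
    ("College", "Aged", "Female", "YES"),
    ("Primary", "Young", "Female", "NO"),
    ("Secondary", "Middle_Aged", "Female", "YES"),
]
ATTRIBUTES = ("Education", "Age", "Gender")


def conditional_table(rows):
    """Return (class_counts, table), with table[(attribute, value, label)] = P(value / label)."""
    class_counts = Counter(row[-1] for row in rows)
    value_counts = Counter()
    for *values, label in rows:
        for attribute, value in zip(ATTRIBUTES, values):
            value_counts[(attribute, value, label)] += 1
    table = defaultdict(Fraction)  # combinations never seen default to 0
    for (attribute, value, label), count in value_counts.items():
        table[(attribute, value, label)] = Fraction(count, class_counts[label])
    return class_counts, table


class_counts, table = conditional_table(TRAIN)
priors = {label: Fraction(n, len(TRAIN)) for label, n in class_counts.items()}

print(table[("Education", "College", "YES")])  # 1/5
print(table[("Age", "Young", "YES")])          # 0
print(priors["YES"], priors["NO"])             # 5/8 3/8
```

Exact fractions are used here only so that the printed values can be compared directly with the table entries.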
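For Xtest(College, Middle_Aged, Female), the posterior of each class Ci is proportional to the likelihood times the prior, where P(X / Ci) denotes the conditional probability of X given Ci:

$$P(C_i \mid X) \propto P(X \mid C_i)\, P(C_i)$$

Naive Bayes assumes that the n attributes are conditionally independent given the class, so the likelihood factorizes:

$$P(X \mid C_i) = \prod_{k=1}^{n} P(x_k \mid C_i)$$

The predicted class is the one with the largest score:

$$\arg\max_{C_i} \{ P(X \mid C_i)\, P(C_i) \} = \arg\max_{C_i} \Big\{ \prod_{k=1}^{n} P(x_k \mid C_i) \cdot P(C_i) \Big\} \quad (i = 1, 2;\ x_k = k\text{-th attribute})$$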

C1: YES

$$\prod_{k=1}^{n} P(x_k \mid C_1) \cdot P(C_1)$$

= P(x1 = College / C1=Yes) * P(x2 = Middle_Aged / C1=Yes) * P(x3 = Female / C1=Yes) * P(C1=Yes)

= (1/5) * (3/5) * (2/5) * (5/8)


= 0.03

C2: NO

$$\prod_{k=1}^{n} P(x_k \mid C_2) \cdot P(C_2)$$

= P(x1 = College / C2=NO) * P(x2 = Middle_Aged / C2=NO) * P(x3 = Female / C2=NO) * P(C2=NO)

= (1/3) * (1/3) * (2/3) * (3/8)


= 0.028

argmax{0.03, 0.028} = 0.03, which comes from the first class (YES), so the class label of the test instance should be YES.
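
The two scores and the final argmax can be reproduced with a short Python sketch. It is an illustration under assumptions, not the document's own code: the helper name score and the tuple layout of TRAIN are hypothetical, and each probability is estimated by plain counting, as in Step 1.

```python
from fractions import Fraction

# Training rows: (Education, Age, Gender, Class_Label)
TRAIN = [
    ("Secondary", "Aged", "Male", "YES"),
    ("Primary", "Young", "Male", "NO"),
    ("College", "Middle_Aged", "Female", "NO"),
    ("Secondary", "Middle_Aged", "Male", "YES"),
    ("Primary", "Middle_Aged", "Male", "YES"),
    ("College", "Aged", "Female", "YES"),
    ("Primary", "Young", "Female", "NO"),
    ("Secondary", "Middle_Aged", "Female", "YES"),
]


def score(rows, x, label):
    """Unnormalised posterior P(X / Ci) * P(Ci), estimated by counting."""
    in_class = [row for row in rows if row[-1] == label]
    result = Fraction(len(in_class), len(rows))  # prior P(Ci)
    for k, value in enumerate(x):
        matches = sum(1 for row in in_class if row[k] == value)
        result *= Fraction(matches, len(in_class))  # P(xk / Ci)
    return result


x_test = ("College", "Middle_Aged", "Female")
scores = {label: score(TRAIN, x_test, label) for label in ("YES", "NO")}
prediction = max(scores, key=scores.get)  # argmax over the two classes

print(scores["YES"], scores["NO"])  # 3/100 1/36
print(prediction)                   # YES
```

The exact values 3/100 = 0.03 and 1/36 ≈ 0.028 match the hand calculation above.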

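ZERO VALUE PROBLEM IN THE NAIVE BAYES ALGORITHM:

Assume that P(x1 = College / C2=NO) = 0, as would happen if no training instance of class NO had Education = College.

Without applying Laplace smoothing, the whole product for class NO becomes zero, regardless of the other attributes: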

= P(x1 = College / C2=NO) * P(x2 = Middle_Aged / C2=NO) * P(x3 = Female / C2=NO) * P(C2=NO)

= (0/3) * (1/3)* (2/3)*(3/8) = 0

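If we apply Laplace smoothing (add-one smoothing, also called the Laplacian correction), 1 is added to every count in the numerator and the number of distinct values of the attribute is added to the denominator, so no estimate is exactly zero:

= P(x1 = College / C2=NO) * P(x2 = Middle_Aged / C2=NO) * P(x3 = Female / C2=NO) * P(C2=NO)

= ((0+1)/(3+3)) * ((1+1)/(3+3)) * ((2+1)/(3+3)) * (3/8)

= (1/6) * (2/6) * (3/6) * (3/8)

≈ 0.0104

The expression above uses 3 in every denominator. Education and Age each have 3 distinct values, but Gender has only 2 (Female, Male), so under the convention just described its term would be (2+1)/(3+2).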
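
The effect of add-one smoothing on this product can be checked with a few lines of Python. This is a sketch under stated assumptions: the function name laplace and its n_values parameter are hypothetical, and the call below passes 3 for every attribute to mirror the expression above. A different number of distinct values would simply be passed as n_values.

```python
from fractions import Fraction


def laplace(count, class_total, n_values):
    """Add-one estimate: (count + 1) / (class_total + n_values)."""
    return Fraction(count + 1, class_total + n_values)


prior_no = Fraction(3, 8)

# Assumed counts within class NO (3 rows): College 0, Middle_Aged 1, Female 2.
unsmoothed = Fraction(0, 3) * Fraction(1, 3) * Fraction(2, 3) * prior_no
smoothed = laplace(0, 3, 3) * laplace(1, 3, 3) * laplace(2, 3, 3) * prior_no

print(unsmoothed)  # 0
print(smoothed)    # 1/96
```

The smoothed score 1/96 ≈ 0.0104 is small but non-zero, so the NO class is no longer ruled out by a single unseen attribute value.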


Ex2: NUMERICAL ATTRIBUTES
ID  Education         Age  Gender  Class_Label
1   Secondary School  60   Male    YES
2   Primary School    22   Male    NO
3   College           38   Female  NO
4   Secondary School  40   Male    YES
5   Primary School    40   Male    YES
6   College           60   Female  YES
7   Primary School    20   Female  NO
8   Secondary School  42   Female  YES

Using the NB algorithm, determine the class label of the following test instance:
Xtest(x1 = College, x2 = 44, x3 = Female) ---> YES or NO?
C1 = YES P(C1=YES)= 5/8
C2 = NO P(C2=NO) = 3/8
P(C1/X) = P(X/C1)*P(C1)
P(C1/X) = P(x1,x2,x3 /C1)*P(C1)
P(C1=YES/Xtest) = P(x1= College /C1=YES)* P(x2= 44/C1=YES)* P(x3=Female /C1=YES)*P(C1=YES)

= (1/5) * (??????) * (2/5) * (5/8)


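Age is numerical, so P(x2 = 44 / C1=YES) is estimated with the Gaussian (normal) density g, using the mean and standard deviation of Age within the class:

P(x2 = 44 / C1=YES) = g(44, mean_of_age_attribute(YES), stdev_of_age_attribute(YES))

YES (Age): the ages of the people who were accepted by the company: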
60
40
40
60
42
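Mean() = 48.4
Stdev() = 10.62 (sample standard deviation)

$$P(x_2 = 44 \mid C_1 = \text{YES}) = \frac{1}{\sqrt{2\pi (10.62)^2}}\, e^{-\frac{1}{2}\left(\frac{44 - 48.4}{10.62}\right)^2} = 0.0344$$

So the missing factor (??????) is 0.0344.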
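
The Gaussian likelihood for the YES class can be recomputed with Python's standard library. This is a minimal sketch, not the document's own code: the helper name gaussian_pdf is hypothetical, and it assumes that Stdev() is the sample standard deviation (divisor n - 1), which is what reproduces 10.62 for these five ages.

```python
import math
import statistics


def gaussian_pdf(x, mean, stdev):
    """Normal density g(x, mean, stdev)."""
    exponent = -0.5 * ((x - mean) / stdev) ** 2
    return math.exp(exponent) / math.sqrt(2 * math.pi * stdev ** 2)


ages_yes = [60, 40, 40, 60, 42]
mean_yes = statistics.mean(ages_yes)
stdev_yes = statistics.stdev(ages_yes)  # sample standard deviation (n - 1)
likelihood_yes = gaussian_pdf(44, mean_yes, stdev_yes)

print(round(mean_yes, 2), round(stdev_yes, 2))  # 48.4 10.62
print(likelihood_yes)
```

With unrounded inputs the density comes out very slightly above the hand-calculated 0.0344; the difference is only rounding.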
P(C1=YES/Xtest) = P(x1= College /C1=YES)* P(x2= 44/C1=YES)* P(x3=Female /C1=YES)*P(C1=YES)
= (1/5) * (0.0344) * (2/5) * (5/8)
= 0.00172

Now let's calculate the corresponding terms for the NO class:


P(C2/X) = P(X/C2)*P(C2)
P(C2/X) = P(x1,x2,x3 /C2)*P(C2)
P(C2=No/Xtest) = P(x1= College /C2=No)* P(x2= 44/ C2=No)* P(x3=Female / C2=No)*P(C2=No)

= (1/3) * (??????) * (2/3) * (3/8)

NO (Age): the ages of the people who were not accepted by the company:
22
38
20
Mean()= 26.66
Stdev()= 9.86

$$P(x_2 = 44 \mid C_2 = \text{NO}) = \frac{1}{\sqrt{2\pi (9.86)^2}}\, e^{-\frac{1}{2}\left(\frac{44 - 26.66}{9.86}\right)^2} = 0.0086$$

P(C2=No/Xtest) = P(x1= College /C2=No)* P(x2= 44/ C2=No)* P(x3=Female / C2=No)*P(C2=No)

= (1/3) * (0.0086) * (2/3) * (3/8)


= 0.00071667

FINAL DECISION:
argmax{YES = 0.00172, NO = 0.00071667} = YES (0.00172)
The test instance is assigned to the YES class.
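
The whole of Ex2 can be put together in one short Python sketch that mixes counted probabilities for Education and Gender with a Gaussian density for Age. It is an illustration under assumptions, not the document's own code: the names TRAIN, gaussian_pdf and score are hypothetical, and the standard deviation is again taken to be the sample standard deviation.

```python
import math
import statistics

# Training rows: (Education, Age, Gender, Class_Label)
TRAIN = [
    ("Secondary", 60, "Male", "YES"),
    ("Primary", 22, "Male", "NO"),
    ("College", 38, "Female", "NO"),
    ("Secondary", 40, "Male", "YES"),
    ("Primary", 40, "Male", "YES"),
    ("College", 60, "Female", "YES"),
    ("Primary", 20, "Female", "NO"),
    ("Secondary", 42, "Female", "YES"),
]


def gaussian_pdf(x, mean, stdev):
    """Normal density used for the numerical Age attribute."""
    exponent = -0.5 * ((x - mean) / stdev) ** 2
    return math.exp(exponent) / math.sqrt(2 * math.pi * stdev ** 2)


def score(rows, x, label):
    """Unnormalised posterior P(X / Ci) * P(Ci) for one class."""
    education, age, gender = x
    in_class = [row for row in rows if row[-1] == label]
    n = len(in_class)
    ages = [row[1] for row in in_class]
    result = n / len(rows)                                   # prior P(Ci)
    result *= sum(row[0] == education for row in in_class) / n  # P(x1 / Ci)
    result *= gaussian_pdf(age, statistics.mean(ages), statistics.stdev(ages))  # P(x2 / Ci)
    result *= sum(row[2] == gender for row in in_class) / n     # P(x3 / Ci)
    return result


x_test = ("College", 44, "Female")
scores = {label: score(TRAIN, x_test, label) for label in ("YES", "NO")}
prediction = max(scores, key=scores.get)

print(scores)
print(prediction)  # YES
```

Because the sketch does not round the mean and standard deviation, its scores differ from the hand-calculated 0.00172 and 0.00071667 only in the last digits, and the predicted class is the same.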
