Bayesian Classification
Bayesian Classifier
Principle of Bayesian classifier
Bayes’ theorem of probability
Bayesian Classifier
A statistical classifier
Foundation
Assumptions
1. The classes are mutually exclusive and exhaustive.
2. The attributes are independent of one another, given the class.
Example: Bayesian Classification
Example 8.2: Air Traffic Data
Air-Traffic Data
Contd. from previous slide…
Days       Season   Fog      Rain     Class
Saturday   Spring   High     Heavy    Cancelled
Weekday    Summer   High     Slight   On Time
Weekday    Winter   Normal   None     Late
Weekday    Summer   High     None     On Time
Weekday    Winter   Normal   Heavy    Very Late
Saturday   Autumn   High     Slight   On Time
Weekday    Autumn   None     Heavy    On Time
Holiday    Spring   Normal   Slight   On Time
Weekday    Spring   Normal   None     On Time
Weekday    Spring   Normal   Heavy    On Time
Air-Traffic Data
In this database, there are four attributes
A = [ Day, Season, Fog, Rain]
with 20 tuples.
The categories of classes are:
C= [On Time, Late, Very Late, Cancelled]
Given this knowledge of the data and classes, we are to find the most likely
classification for any unseen instance, for example:
Bayesian Classifier
In many applications, the relationship between the attributes set and the
class variable is non-deterministic.
Before discussing the Bayesian classifier, let us take a quick look at
Bayes' theorem.
Bayes' Theorem Of Probability
Theorem 8.4: Bayes' Theorem
If A and B are two events with P(B) > 0, then

P(A|B) = P(B|A) · P(A) / P(B)

where P(A) is the prior probability of A and P(A|B) is its posterior probability after observing B.
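As a quick numerical sanity check, the theorem can be applied with a few invented probabilities (all numbers below are illustrative, not taken from the slides):

```python
# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B).
# All numbers here are invented purely for illustration.
p_a = 0.3            # prior P(A)
p_b_given_a = 0.8    # likelihood P(B|A)
p_b_given_not_a = 0.4

# Evidence via the law of total probability:
# P(B) = P(B|A)P(A) + P(B|~A)P(~A)
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Posterior P(A|B)
p_a_given_b = p_b_given_a * p_a / p_b
```

Here the evidence works out to 0.52 and the posterior to 0.24 / 0.52 ≈ 0.46, so observing B raises the probability of A from 0.3 to about 0.46.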
Naïve Bayesian Classifier
Suppose Y is a class variable and X = {X1, X2, …, Xn} is a set of attributes; each training tuple pairs an attribute vector (an instance of X) with a class label (an instance of Y):

INPUT (X)              CLASS (Y)
x11, x12, …, x1n       y1
x21, x22, …, x2n       y2
…                      …
Naïve Bayesian Classifier
The naïve Bayesian classifier calculates the posterior probability using Bayes' theorem, which is as follows:

P(Y|X) = P(X|Y) · P(Y) / P(X)

where, under the naïve assumption that the attributes are conditionally independent given the class,

P(X|Y) = P(X1|Y) · P(X2|Y) · … · P(Xn|Y)

Note: P(X) is called the evidence (also the total probability) and, for a given instance, it is a constant; it can therefore be ignored when comparing classes.
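A minimal sketch of this computation, with made-up priors and class-conditional probabilities (the attribute names and all numbers are placeholders, not the slides' dataset):

```python
# Naive Bayes posterior: P(Y=c | X=x) is proportional to
#   P(Y=c) * product_i P(X_i = x_i | Y=c),
# and dividing by the evidence P(X) normalizes the scores.
# All probabilities below are invented for illustration.
priors = {"On Time": 0.7, "Late": 0.3}
likelihoods = {
    "On Time": {("Day", "Weekday"): 0.64, ("Fog", "High"): 0.29},
    "Late":    {("Day", "Weekday"): 0.50, ("Fog", "High"): 0.50},
}

def posterior(x):
    scores = {}
    for c, prior in priors.items():
        score = prior
        for attr_value in x:
            score *= likelihoods[c][attr_value]   # independence assumption
        scores[c] = score
    evidence = sum(scores.values())               # P(X), the constant evidence
    return {c: s / evidence for c, s in scores.items()}

post = posterior([("Day", "Weekday"), ("Fog", "High")])
```

Because the classes are mutually exclusive and exhaustive, summing the unnormalized scores over the classes recovers the evidence, so the returned posteriors sum to 1.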
Naïve Bayesian Classifier
Suppose, for a given instance of X (say x = (x1, x2, …, xn)), we compare any two class conditional probabilities, namely P(Y = yi | X = x) and P(Y = yj | X = x).
If P(Y = yi | X = x) > P(Y = yj | X = x), then we say that yi is stronger than yj for the instance X = x; the strongest class is taken as the prediction.
Naïve Bayesian Classifier
Example: With reference to the Air-Traffic Dataset mentioned earlier, let us tabulate all the posterior and prior probabilities as shown below. (The denominators are the class counts: 14 On Time, 2 Late, 3 Very Late and 1 Cancelled, out of 20 tuples.)

Day         On Time        Late        Very Late   Cancelled
Weekday     9/14 = 0.64    1/2 = 0.5   3/3 = 1     0/1 = 0
Saturday    2/14 = 0.14    1/2 = 0.5   0/3 = 0     1/1 = 1

Fog         On Time        Late        Very Late   Cancelled
None        5/14 = 0.36    0/2 = 0     0/3 = 0     0/1 = 0
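Such tables can be built mechanically by counting. A sketch over the ten tuples reproduced earlier (the slides' full dataset has 20 tuples, so these frequencies differ from the table's values):

```python
from collections import Counter, defaultdict

# The ten air-traffic tuples shown earlier (the full dataset has 20).
rows = [
    ("Saturday", "Spring", "High",   "Heavy",  "Cancelled"),
    ("Weekday",  "Summer", "High",   "Slight", "On Time"),
    ("Weekday",  "Winter", "Normal", "None",   "Late"),
    ("Weekday",  "Summer", "High",   "None",   "On Time"),
    ("Weekday",  "Winter", "Normal", "Heavy",  "Very Late"),
    ("Saturday", "Autumn", "High",   "Slight", "On Time"),
    ("Weekday",  "Autumn", "None",   "Heavy",  "On Time"),
    ("Holiday",  "Spring", "Normal", "Slight", "On Time"),
    ("Weekday",  "Spring", "Normal", "None",   "On Time"),
    ("Weekday",  "Spring", "Normal", "Heavy",  "On Time"),
]
attrs = ["Day", "Season", "Fog", "Rain"]

class_counts = Counter(r[-1] for r in rows)   # e.g. "On Time" appears 7 times

# cond[attr][(value, cls)] = number of tuples with that value and class
cond = defaultdict(Counter)
for r in rows:
    for attr, value in zip(attrs, r[:-1]):
        cond[attr][(value, r[-1])] += 1

def p(attr, value, cls):
    """Estimate P(attr = value | Class = cls) as a frequency ratio."""
    return cond[attr][(value, cls)] / class_counts[cls]
```

For instance, p("Day", "Weekday", "On Time") is 5/7 over these ten tuples, versus 9/14 over the full 20-tuple dataset.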
Naïve Bayesian Classifier
Instance:
Naïve Bayesian Classifier
Algorithm: Naïve Bayesian Classification
Input: Given a set of k mutually exclusive and exhaustive classes C = {C1, C2, …, Ck}, which have prior probabilities P(C1), P(C2), …, P(Ck).
There is an n-attribute set A = {A1, A2, …, An}, which for a given instance has values A1 = a1, A2 = a2, …, An = an.
Step: Assign the instance to the class Cj for which the score P(Cj) · P(a1|Cj) · P(a2|Cj) · … · P(an|Cj) is maximum.
Note: the scores need not sum to 1, because they are not probabilities but values proportional to the posterior probabilities (the constant evidence P(X) is dropped).
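The algorithm amounts to an argmax over unnormalized class scores. A sketch, with all inputs (class names, attributes, probabilities) hypothetical:

```python
def naive_bayes_classify(instance, priors, cond_prob):
    """Return the class Cj maximizing P(Cj) * product_i P(Ai = ai | Cj).

    The scores are only proportional to the posteriors (the evidence
    P(X) is omitted), so they need not sum to 1.
    """
    best_class, best_score = None, -1.0
    for cls, prior in priors.items():
        score = prior
        for attr, value in instance.items():
            score *= cond_prob[cls].get((attr, value), 0.0)
        if score > best_score:
            best_class, best_score = cls, score
    return best_class, best_score

# Hypothetical two-class inputs, invented for illustration.
priors = {"yes": 0.6, "no": 0.4}
cond_prob = {
    "yes": {("wind", "strong"): 0.3, ("sky", "sunny"): 0.7},
    "no":  {("wind", "strong"): 0.6, ("sky", "sunny"): 0.2},
}
label, score = naive_bayes_classify({"wind": "strong", "sky": "sunny"},
                                    priors, cond_prob)
```

With these numbers the "yes" score is 0.6 · 0.3 · 0.7 = 0.126 against 0.4 · 0.6 · 0.2 = 0.048 for "no", so the instance is labelled "yes".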
Naïve Bayesian Classifier
Pros and Cons
The naïve Bayes approach is very popular and often works well in practice; its main weakness is that the conditional-independence assumption rarely holds exactly.
A Practice Example
Example 8.4
Class:
C1: buys_computer = 'yes'
C2: buys_computer = 'no'
Data instance:
X = (age <= 30, income = medium, student = yes, credit_rating = fair)

age     income   student   credit_rating   buys_computer
<=30    high     no        fair            no
<=30    high     no        excellent       no
31…40   high     no        fair            yes
>40     medium   no        fair            yes
>40     low      yes       fair            yes
>40     low      yes       excellent       no
31…40   low      yes       excellent       yes
<=30    medium   no        fair            no
<=30    low      yes       fair            yes
>40     medium   yes       fair            yes
<=30    medium   yes       excellent       yes
31…40   medium   no        excellent       yes
31…40   high     yes       fair            yes
>40     medium   no        excellent       no
A Practice Example
P(Ci): P(buys_computer = "yes") = 9/14 = 0.643
P(buys_computer = "no") = 5/14 = 0.357
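Continuing Example 8.4, the class scores for X can be computed directly from the 14 tuples. This sketch just automates the counting over the dataset tabulated above:

```python
# The 14 training tuples from Example 8.4:
# (age, income, student, credit_rating, buys_computer)
data = [
    ("<=30",  "high",   "no",  "fair",      "no"),
    ("<=30",  "high",   "no",  "excellent", "no"),
    ("31…40", "high",   "no",  "fair",      "yes"),
    (">40",   "medium", "no",  "fair",      "yes"),
    (">40",   "low",    "yes", "fair",      "yes"),
    (">40",   "low",    "yes", "excellent", "no"),
    ("31…40", "low",    "yes", "excellent", "yes"),
    ("<=30",  "medium", "no",  "fair",      "no"),
    ("<=30",  "low",    "yes", "fair",      "yes"),
    (">40",   "medium", "yes", "fair",      "yes"),
    ("<=30",  "medium", "yes", "excellent", "yes"),
    ("31…40", "medium", "no",  "excellent", "yes"),
    ("31…40", "high",   "yes", "fair",      "yes"),
    (">40",   "medium", "no",  "excellent", "no"),
]
x = ("<=30", "medium", "yes", "fair")   # the query instance X

scores = {}
for cls in ("yes", "no"):
    rows = [r for r in data if r[-1] == cls]
    score = len(rows) / len(data)                        # prior P(Ci)
    for i, value in enumerate(x):
        score *= sum(r[i] == value for r in rows) / len(rows)
    scores[cls] = score
# scores["yes"] ≈ 0.028 > scores["no"] ≈ 0.007, so X is classified
# as buys_computer = "yes".
```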
Bayesian Belief Networks
[Figure: a belief network whose nodes include PositiveXRay and Dyspnea.]
A Bayesian belief network factors the joint probability over the network structure:

P(x1, …, xn) = Π (i = 1..n) P(xi | Parents(Yi))

where Yi is the network node corresponding to the attribute value xi.
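A tiny sketch of this factorization, using the two node names visible on the slide (PositiveXRay, Dyspnea) with a hypothetical Cancer parent; all probabilities are invented:

```python
# Hypothetical network: Cancer -> PositiveXRay, Cancer -> Dyspnea.
# The joint probability factors as P(C, X, D) = P(C) * P(X|C) * P(D|C).
# All numbers are invented for illustration.
p_cancer = {True: 0.01, False: 0.99}
p_xray = {True: 0.90, False: 0.05}      # P(PositiveXRay = T | Cancer)
p_dyspnea = {True: 0.65, False: 0.30}   # P(Dyspnea = T | Cancer)

def joint(cancer, xray, dyspnea):
    prob = p_cancer[cancer]
    prob *= p_xray[cancer] if xray else 1 - p_xray[cancer]
    prob *= p_dyspnea[cancer] if dyspnea else 1 - p_dyspnea[cancer]
    return prob

# The eight joint entries form a proper distribution: they sum to 1.
total = sum(joint(c, x, d) for c in (True, False)
            for x in (True, False)
            for d in (True, False))
```

Storing one conditional table per node, rather than the full joint table, is exactly the economy the factorization formula buys.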
Advantages And Disadvantages:
References
Data Mining: Concepts and Techniques (3rd Edn.), Jiawei Han, Micheline Kamber, Morgan Kaufmann, 2015.
Introduction to Data Mining, Pang-Ning Tan, Michael Steinbach, and Vipin Kumar, Addison-Wesley, 2014.
https://www.tutorialspoint.com/data_mining/dm_bayesian_classification.htm
Thank You!!