0% found this document useful (0 votes)

90 views35 pages

Bayesian Classification: Dr. Navneet Goyal BITS, Pilani

- Bayesian classifiers use Bayes' theorem to predict class membership probabilities based on training data. - Naive Bayesian classifiers assume conditional independence between attributes, allowing simple and efficient probability calculations. - They can outperform more sophisticated classifiers in many cases, despite the conditional independence assumption not always holding. - Bayesian belief networks allow modeling conditional dependencies between attributes and can overcome some limitations of naive Bayes classifiers.

Uploaded by

Siti Omar

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PPT, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

90 views35 pages

Bayesian Classification: Dr. Navneet Goyal BITS, Pilani

Uploaded by

Siti Omar

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PPT, PDF, TXT or read online on Scribd

You are on page 1/ 35

Bayesian Classification

Dr. Navneet Goyal

BITS, Pilani
Bayesian Classification
 What are Bayesian Classifiers?
 Statistical Classifiers
 Predict class membership
probabilities
 Based on Bayes Theorem
 Naïve Bayesian Classifier
 Computationally Simple
 Comparable performance with DT
and NN classifiers
Bayesian Classification

 Probabilistic learning: Calculate explicit

probabilities for hypothesis, among the
most practical approaches to certain
types of learning problems
 Incremental: Each training example can
incrementally increase/decrease the
probability that a hypothesis is correct.
Prior knowledge can be combined with
observed data.
Bayes Theorem
 Let X be a data sample whose class
label is unknown
 Let H be some hypothesis that X
belongs to a class C
 For classification determine P(H/X)
 P(H/X) is the probability that H holds
given the observed data sample X
 P(H/X) is posterior probability
Bayes Theorem
Example: Sample space: All Fruits
X is “round” and “red”
H= hypothesis that X is an Apple
P(H/X) is our confidence that X is an
apple given that X is “round” and “red”
 P(H) is Prior Probability of H, ie, the
probability that any given data sample is
an apple regardless of how it looks
 P(H/X) is based on more information
 Note that P(H) is independent of X
Bayes Theorem
Example: Sample space: All Fruits
 P(X/H) ?
 It is the probability that X is round and
red given that we know that it is true
that X is an apple
 Here P(X) is prior probability =
P(data sample from our set of fruits is
red and round)
Estimating Probabilities
 P(X), P(H), and P(X/H) may be
estimated from given data
 Bayes Theorem

P(H | X )  P( X | H ) P( H )
P( X )

 Use of Bayes Theorem in Naïve Bayesian

Classifier!!
Naïve Bayesian Classification
Also called Simple BC
Why Naïve/Simple??
Class Conditional Independence
Effectof an attribute values on a given class is
independent of the values of other attributes
This assumption simplifies computations
Naïve Bayesian Classification
Steps Involved
1. Each data sample is of the type
X=(xi) i =1(1)n, where xi is the values of X for
attribute Ai
2. Suppose there are m classes Ci, i=1(1)m.
X  Ci iff
P(Ci|X) > P(Cj|X) for 1 j  m, ji
i.e BC assigns X to class Ci having highest
Naïve Bayesian Classification
The class for which P(Ci|X) is maximized is called the
maximum posterior hypothesis.
From Bayes Theorem
P(Ci | X )  P( X | Ci) P(Ci)
P( X )
3. P(X) is constant. Only P( X | Ci)P(Ci) need be
maximized.
 If class prior probabilities not known, then assume
all classes to be equally likely
 Otherwise maximize P( X | Ci)P(Ci)
P(Ci) = Si/S
Problem: computing P(X|Ci) is unfeasible!
(find out how you would find it and why it is infeasible)
Naïve Bayesian Classification
4. Naïve assumption: attribute independence
P( X | Ci) = P(x1,…,xn|C) = P(xk|C)

5. In order to classify an unknown sample

X, evaluate P( X | Ci)P(Ci) for each class Ci.
Sample X is assigned to the class Ci iff
P(X|Ci)P(Ci) > P(X|Cj) P(Cj) for 1 j  m, ji
Naïve Bayesian Classification
EXAMPLE
Age Income Student Credit_rating Class:Buys_comp
<=30 HIGH N FAIR N
<=30 HIGH N EXCELLENT N
31…..40 HIGH N FAIR Y
>40 MEDIUM N FAIR Y
>40 LOW Y FAIR Y
>40 LOW Y EXCELLENT N
31…..40 LOW Y EXCELLENT Y
<=30 MEDIUM N FAIR N
<=30 LOW Y FAIR Y
>40 MEDIUM Y FAIR Y
<=30 MEDIUM Y EXCELLENT Y
31….40 MEDIUM N EXCELLENT Y
31….40 HIGH Y FAIR Y
>40 MEDIUM N EXCELLENT N
Naïve Bayesian Classification
EXAMPLE
X= (<=30,MEDIUM, Y,FAIR, ???)
We need to maximize:
P(X|Ci)P(Ci) for i=1,2.
P(Ci) is computed from training sample
P(buys_comp=Y) = 9/14 = 0.643
P(buys_comp=N) = 5/14 = 0.357
How to calculate P(X|Ci)P(Ci) for i=1,2?
P(X|Ci) = P(x1, x2, x3, x4|C) = P(xk|C)
Naïve Bayesian Classification
EXAMPLE
P(age<=30 | buys_comp=Y)=2/9=0.222
P(age<=30 | buys_comp=N)=3/5=0.600
P(income=medium | buys_comp=Y)=4/9=0.444
P(income=medium | buys_comp=N)=2/5=0.400
P(student=Y | buys_comp=Y)=6/9=0.667
P(student=Y | buys_comp=N)=1/5=0.200
P(credit_rating=FAIR | buys_comp=Y)=6/9=0.667
P(credit_rating=FAIR | buys_comp=N)=2/5=0.400
Naïve Bayesian Classification
EXAMPLE
P(X | buys_comp=Y)=0.222*0.444*0.667*0.667=0.044
P(X | buys_comp=N)=0.600*0.400*0.200*0.400=0.019

P(X | buys_comp=Y)P(buys_comp=Y) = 0.044*0.643=0.028

P(X | buys_comp=N)P(buys_comp=N) = 0.019*0.357=0.007

CONCLUSION: X buys computer

Naïve Bayes Classifier:
Issues
 Probability values ZERO!
 Recall what you observed in WEKA!
 If Ak is continuous valued!
 Recall what you observed in WEKA!

If there are no tuples in the training set corresponding

to students for the class buys-comp=NO
P(student = Y|buys_comp=N)=0
Implications?
Solution?
Naïve Bayes Classifier:
Issues
 Laplacian Correction or Laplace Estimator
 Philosophy – we assume that the training data set is so large
that adding one to each count that we need would only make
a negligible difference in the estimated prob. value.
 Example: D (1000)
 Class: buys_comp=Y

income=low – zero tuples

income=medium – 990 tuples
income=high – 10 tuples
Without Laplacian Correction the probs. are 0, 0.990, and
0.010
With Laplacian correction: 1/1003 = 0.001,
991/1003=0.988, and 11/1003=0.011 respectively.
Naïve Bayes Classifier:
Issues
 Continuous variable: need to do more
work than categorical attributes!
 It is typically assumed to have a
Guassian distribution with a mean 
and a std. dev. .
 Do it yourself! And cross check with
WEKA!
Naïve Bayes (Summary)
 Robust to isolated noise points

 Handle missing values by ignoring the

instance during probability estimate
calculations

 Robust to irrelevant attributes

 Independence assumption may not hold for

some attributes
 Use other techniques such as Bayesian Belief
Networks (BBN)
Probability Calculations
Age Income Student Credit_rating Class:Buys_comp

<=30 HIGH N FAIR N

No. of attributes = 4
<=30 HIGH N EXCELLENT N

Distinct values = 3,3,3,3

31…..40 HIGH N FAIR Y
No. of classes = 2
>40 MEDIUM N FAIR Y

>40 LOW Y GOOD Y

Total no. of probability
>40 LOW Y EXCELLENT N
calculations in NBC
= 4*3*2 = 24!
31…..40 LOW Y EXCELLENT Y

<=30 MEDIUM N FAIR N

<=30 LOW Y GOOD Y

What if conditional ind. was not
>40 MEDIUM Y FAIR Y
assumed?
<=30 MEDIUM Y EXCELLENT Y
O(kp) for p k-valued attributes
31….40 MEDIUM N EXCELLENT Y Multiply by m classes.
31….40 HIGH Y FAIR Y

>40 MEDIUM N EXCELLENT N

Bayesian Belief Networks
 Naïve BC assumes Class Conditional Independence
 This assumption simplifies computations
 When this assumption holds true, Naïve BC is most
accurate compared to all other classifiers
 In real problems, dependencies do exist between
variables
 2 methods to overcome this limitation of NBC
 Bayesian networks, that combine Bayesian
reasoning with causal relationships between
attributes
 Decision trees, that reason on one attribute at the
time, considering ‘most important’ attributes first
Conditional Independence
 Let X, Y, & Z denote three set of random
variables. The variables in X are said to
be conditionally independent of Y, given
Z if
P(X|Y,Z) = P(X|Z)
 Rel. bet. a person’s arm length and
his/her reading skills!!
 One might observe that people with
longer arms tend to have higher levels of
reading skills
 How do you explain this rel.?
Conditional Independence
 Can be explained through a confounding
factor, AGE
 A young child tends to have short arms
and lacks the reading skills of an adult
 If the age of a person is fixed, then the
observed rel. between arm length and
reading skills disappears
 We can this conclude that arm length and
reading skills are conditionally
independent when the age variable is
fixed
P(reading skills| long arms,age) = P(reading skills|age)
Conditional Independence
P(X,Y|Z) = P(X,Y,Z)/P(Z)
= P(X,Y,Z)/P(Y,Z) x P(Y,Z)/P(Z)
= P(X|Y,Z) x P(Y|Z)
= P(X|Z) x P(Y|Z)

This explains the Naïve Bayesian:

P(X|Ci) = P(x1, x2, x3,…,xn|C) = P(xk|C)
Bayesian Belief Networks
 Belief Networks
 Bayesian Networks
 Probabilistic Networks
Bayesian Belief Networks
Conditional Independence (CI) assumption
made by NBC may be too rigid
Specially for classification problems in
which attributes are somewhat correlated
We need a more flexible approach for
modeling the class conditional probabilities
P(X|Ci) = P(x1, x2, x3,…,xn|C)
 instead of requiring that all the attributes
be CI given the class, BBN allows us to
specify which pair of attributes are CI
Bayesian Belief Networks
Belief Networks has 2 components
Directed Acyclic Graph (DAG)
Conditional Probability Table (CPT)
Bayesian Belief Networks
A node in BBN is CI of its non-
descendants, if its parents are known
Bayesian Belief Networks
Family
Smoker
History
(FH, S) (FH, ~S)(~FH, S) (~FH, ~S)

LC 0.8 0.5 0.7 0.1

LungCancer Emphysema ~LC 0.2 0.5 0.3 0.9

The conditional probability table

for the variable LungCancer
PositiveXRay Dyspnea

Bayesian Belief Networks

Bayesian Belief Networks
 6 boolean variables

Arcs allow representation of causal

knowledge
 Having lung cancer is influenced by family
history and smoking
 PositiveXray is ind. of whether the paient has
a FH or if he/she is a smoker given that we
know that the patient has lung cancer
 Once we know the outcome of Lung Cancer,
FH & Smoker do not provide any additional
info. about PositiveXray
Bayesian Belief Networks
 Lung Cancer is CI of Emphysema, given its
parents, FH & Smoker
BBN has a Conditional Probability Table (CPT) for
each variable in the DAG
 CPT for a variable Y specifies the conditional
distribution P(Y|parents(Y))
(FH, S) (FH, ~S)(~FH, S) (~FH, ~S)

LC 0.8 0.5 0.7 0.1 P(LC=Y|FH=Y,S=Y) = 0.8

~LC 0.2 0.5 0.3 0.9 P(LC=N|FH=N,S=N) = 0.9

CPT for LungCancer

Bayesian Belief Networks
 Let X=(x1, x2,…,xn) be a tuple described by variables
or attributes Y1, Y2, …,Yn respectively
 Each variable is CI of its nondescendants given its
parents
 Allows he DAG to provide a complete representation of
the existing Joint Probability Distribution by:
P(x1, x2, x3,…,xn)=P(xi|Parents(Yi))
where P(x1, x2, x3,…,xn) is the prob. of a particular
combination of values of X, and the values for P(xi|
Parents(Yi)) correspond to the entries in CPT for Yi
Bayesian Belief Networks
 A node within the network can selected as an
‘output’ node, representing a class label attribute
 More than one output node
Rather than returning a single class label, the
classification process can return a probability
distribution that gives the probability of each
class
 Training BBN!!
Training BBN
 Number of scenarios possible

 Network topology may be given in advance or

inferred from data
 Variables may be observable or hidden (mising
or incomplete data) in all or some of the training
tuples
 Many algos for learning the network topology
from the training data given observable attibutes
 If network topology is known and the variables
observable, training is straightforward (just
compute CPT entries)
Training BBNs
 Topology given, but some variables are hidden

 Gradient Descent (self study)

 Falls under the class of algos called Adaptive
Probabilistic Networks
 BBNs are computationally expensive
 BBNs provide explicit representation of Causal
structure
 Domain experts can provide prior knowledge to the
training process in the form of topology and/or in
conditional probability values
 This leads to significant improvement in the learning
process

Naïve Bayes Classifier: April 25, 2006
No ratings yet
Naïve Bayes Classifier: April 25, 2006
19 pages
3 - Bayesian Classification
No ratings yet
3 - Bayesian Classification
15 pages
Bayesian Classification: Cse 634 Data Mining - Prof. Anita Wasilewska
No ratings yet
Bayesian Classification: Cse 634 Data Mining - Prof. Anita Wasilewska
66 pages
Unit-3 AML (Bayesian Concept Learning)
No ratings yet
Unit-3 AML (Bayesian Concept Learning)
40 pages
Bayesian Classification Guide
No ratings yet
Bayesian Classification Guide
6 pages
Bayesian Classification, Nearest
No ratings yet
Bayesian Classification, Nearest
46 pages
UNIT - IV
No ratings yet
UNIT - IV
169 pages
Bays Classifier (Machine Learning)
No ratings yet
Bays Classifier (Machine Learning)
16 pages
Class Adv Classification IV
No ratings yet
Class Adv Classification IV
49 pages
Bayesian Classification Insights
No ratings yet
Bayesian Classification Insights
7 pages
Classification Bayes
No ratings yet
Classification Bayes
21 pages
Nayes Bayes Classifier
No ratings yet
Nayes Bayes Classifier
46 pages
A5 PDF
No ratings yet
A5 PDF
9 pages
Bayes' Theorem for Data Science
No ratings yet
Bayes' Theorem for Data Science
10 pages
Bayesian
No ratings yet
Bayesian
23 pages
Naïve Bayes for Data Scientists
No ratings yet
Naïve Bayes for Data Scientists
31 pages
Bayesian Classification Explained
No ratings yet
Bayesian Classification Explained
7 pages
CSC 325 AI Lecture08 Supervised Learning
No ratings yet
CSC 325 AI Lecture08 Supervised Learning
32 pages
ML 09 Naive Bayes Classifier
No ratings yet
ML 09 Naive Bayes Classifier
24 pages
CSC 325 AI Lecture08 Supervised Learning Fall2024 DR Raheel 20022025 034558pm
No ratings yet
CSC 325 AI Lecture08 Supervised Learning Fall2024 DR Raheel 20022025 034558pm
29 pages
Naive Bayes Classifier Guide
No ratings yet
Naive Bayes Classifier Guide
16 pages
Classification With NaiveBayes
No ratings yet
Classification With NaiveBayes
19 pages
NB Classifier & Bayesian Network 2
No ratings yet
NB Classifier & Bayesian Network 2
37 pages
Unit6 - 3 Classification-Bayesian
No ratings yet
Unit6 - 3 Classification-Bayesian
15 pages
Naive by
No ratings yet
Naive by
23 pages
29-Naive Bayes-03-10-2024
No ratings yet
29-Naive Bayes-03-10-2024
48 pages
Naïve Bayes Classifier
No ratings yet
Naïve Bayes Classifier
21 pages
Naive Bayes Classifier Guide
No ratings yet
Naive Bayes Classifier Guide
36 pages
20210913115710D3708 - Session 09-12 Bayes Classifier
No ratings yet
20210913115710D3708 - Session 09-12 Bayes Classifier
30 pages
Chapter 4
No ratings yet
Chapter 4
57 pages
L3 (Week3) Bayesian Classifier
No ratings yet
L3 (Week3) Bayesian Classifier
21 pages
2.3 Bayes Classification
No ratings yet
2.3 Bayes Classification
15 pages
Bayes Classification
No ratings yet
Bayes Classification
9 pages
ML 05 Bayesian Classifier
No ratings yet
ML 05 Bayesian Classifier
19 pages
Naive Bayes
No ratings yet
Naive Bayes
37 pages
Lecture12 Ch8 ClassBasic Part2
No ratings yet
Lecture12 Ch8 ClassBasic Part2
22 pages
K - Nearest Neighbours Classifier / Regressor
No ratings yet
K - Nearest Neighbours Classifier / Regressor
35 pages
9-Decision Tree Induction-23-01-2025
No ratings yet
9-Decision Tree Induction-23-01-2025
40 pages
Lecture 8 - Naive Bayes
No ratings yet
Lecture 8 - Naive Bayes
27 pages
Lesson 3.3 - Supervised Learning Rule Based Classification
No ratings yet
Lesson 3.3 - Supervised Learning Rule Based Classification
43 pages
Statistical Inference INF312 - Is - Lecture 03 - Part 3
No ratings yet
Statistical Inference INF312 - Is - Lecture 03 - Part 3
18 pages
AI Notes
No ratings yet
AI Notes
19 pages
WINSEM2024-25 BCSE334L TH VL2024250502042 2025-03-03 Reference-Material-I
No ratings yet
WINSEM2024-25 BCSE334L TH VL2024250502042 2025-03-03 Reference-Material-I
18 pages
Lecture 7
No ratings yet
Lecture 7
15 pages
Bayesian Classification
No ratings yet
Bayesian Classification
25 pages
Module 3 - Bayesian Classifier
No ratings yet
Module 3 - Bayesian Classifier
17 pages
Lecture 5 Bayesian Classification
No ratings yet
Lecture 5 Bayesian Classification
16 pages
Bayesian Classification: Dr. Navneet Goyal BITS, Pilani
No ratings yet
Bayesian Classification: Dr. Navneet Goyal BITS, Pilani
35 pages
07 Naive Bayes
No ratings yet
07 Naive Bayes
6 pages
8 - Classification NaiveBayes PDF
No ratings yet
8 - Classification NaiveBayes PDF
13 pages
Naive Bayes Classification
No ratings yet
Naive Bayes Classification
47 pages
Ba Yes Naive
No ratings yet
Ba Yes Naive
15 pages
Teaching Reading Skills in A Foreign Language
No ratings yet
Teaching Reading Skills in A Foreign Language
14 pages
Camay Relaunch in Pakistan
100% (1)
Camay Relaunch in Pakistan
26 pages
5G Wireless Technology: Millimeter Wave Health Effects
No ratings yet
5G Wireless Technology: Millimeter Wave Health Effects
5 pages
App Letter 12a PDF
0% (1)
App Letter 12a PDF
2 pages
How To Send or Receive SMS Message Via GSM Module by at Commands
100% (1)
How To Send or Receive SMS Message Via GSM Module by at Commands
6 pages
High Pass Filter
No ratings yet
High Pass Filter
12 pages
Bayes Classifier
No ratings yet
Bayes Classifier
20 pages
Sample Guard House Drawing-Model
No ratings yet
Sample Guard House Drawing-Model
1 page
Engineering Heat Calculations
No ratings yet
Engineering Heat Calculations
6 pages
Class 12 Physics Electricity Experiment
No ratings yet
Class 12 Physics Electricity Experiment
18 pages
Random Variables and Probability Distribution: Purnomo Jurusan Teknik Mesin UGM
No ratings yet
Random Variables and Probability Distribution: Purnomo Jurusan Teknik Mesin UGM
48 pages
Learning Objectives: Introduction W
No ratings yet
Learning Objectives: Introduction W
238 pages
Inspection of The Building Signature by Pinnacle.: (Estructure and Electromechanic Equipment Surveying.)
No ratings yet
Inspection of The Building Signature by Pinnacle.: (Estructure and Electromechanic Equipment Surveying.)
12 pages
Jalali@mshdiua - Ac.ir Jalali - Mshdiau.ac - Ir: Data Mining
No ratings yet
Jalali@mshdiua - Ac.ir Jalali - Mshdiau.ac - Ir: Data Mining
16 pages
Statement of Purpose (Ashok)
No ratings yet
Statement of Purpose (Ashok)
2 pages
Topic-Economic Role For Advertisement Development
No ratings yet
Topic-Economic Role For Advertisement Development
11 pages
Procesos A Color
No ratings yet
Procesos A Color
6 pages
Istqb Advanced Level Test Manager Syllabus v5
No ratings yet
Istqb Advanced Level Test Manager Syllabus v5
126 pages
Naive Bayes
No ratings yet
Naive Bayes
38 pages
Character - Lorian Nod
No ratings yet
Character - Lorian Nod
2 pages
Izadian Leila Thesis 2021
No ratings yet
Izadian Leila Thesis 2021
39 pages
One Minute Manager Notes
No ratings yet
One Minute Manager Notes
8 pages
Saudi Arabia Technician Jobs Listings
No ratings yet
Saudi Arabia Technician Jobs Listings
6 pages
7805BG
No ratings yet
7805BG
28 pages
2023 Reports - Luminate On Diversity
No ratings yet
2023 Reports - Luminate On Diversity
28 pages
Android File Management Guide
No ratings yet
Android File Management Guide
19 pages
Angelica Resume
No ratings yet
Angelica Resume
1 page
EEE229/EEE223/GEE202 - Problem Sheet 1
No ratings yet
EEE229/EEE223/GEE202 - Problem Sheet 1
1 page
Prolegomenon To Geisha As A Cultural Performer: Miyako Odori, The Gion School and Representation of A Traditional" Japan - Mariko Okada
No ratings yet
Prolegomenon To Geisha As A Cultural Performer: Miyako Odori, The Gion School and Representation of A Traditional" Japan - Mariko Okada
7 pages
11 - 8 - 2022 - 9 - 12 - 58 - 189bachelor of Science B.Sc. - ExamForm
No ratings yet
11 - 8 - 2022 - 9 - 12 - 58 - 189bachelor of Science B.Sc. - ExamForm
2 pages
MyEdBC Family Portal Instructional Manual
No ratings yet
MyEdBC Family Portal Instructional Manual
6 pages
Sample CV
No ratings yet
Sample CV
6 pages
AP0070462152019
No ratings yet
AP0070462152019
1 page

Bayesian Classification: Dr. Navneet Goyal BITS, Pilani

Uploaded by

Bayesian Classification: Dr. Navneet Goyal BITS, Pilani

Uploaded by

Bayesian Classification

Dr. Navneet Goyal

 Probabilistic learning: Calculate explicit

 Use of Bayes Theorem in Naïve Bayesian

5. In order to classify an unknown sample

P(X | buys_comp=Y)P(buys_comp=Y) = 0.044*0.643=0.028

CONCLUSION: X buys computer

If there are no tuples in the training set corresponding

income=low – zero tuples

 Handle missing values by ignoring the

 Robust to irrelevant attributes

 Independence assumption may not hold for

<=30 HIGH N FAIR N

Distinct values = 3,3,3,3

>40 LOW Y GOOD Y

<=30 MEDIUM N FAIR N

<=30 LOW Y GOOD Y

>40 MEDIUM N EXCELLENT N

This explains the Naïve Bayesian:

LC 0.8 0.5 0.7 0.1

The conditional probability table

Bayesian Belief Networks

Arcs allow representation of causal

LC 0.8 0.5 0.7 0.1 P(LC=Y|FH=Y,S=Y) = 0.8

CPT for LungCancer

 Network topology may be given in advance or

 Gradient Descent (self study)

You might also like