
Naive Bayes Classifier

It is a classification technique based on Bayes’ Theorem with an independence assumption among predictors. In simple terms, a Naive Bayes classifier assumes that the presence of a particular feature in a class is unrelated to the presence of any other feature.

The Naïve Bayes classifier is a popular supervised machine learning algorithm used for classification tasks such as text classification. It belongs to the family of generative learning algorithms, which means that it models the distribution of inputs for a given class or category. This approach is based on the assumption that the features of the input data are conditionally independent given the class, allowing the algorithm to make predictions quickly and accurately. In statistics, naive Bayes classifiers are considered simple probabilistic classifiers that apply Bayes’ theorem, which relates the probability of a hypothesis to the observed data and some prior knowledge. The naive Bayes classifier assumes that all features in the input data are independent of each other, which is often not true in real-world scenarios. Despite this simplifying assumption, however, the naive Bayes classifier is widely used because of its efficiency and good performance in many real-world applications.

As a result, the naive Bayes classifier is a powerful tool in machine learning, particularly in text classification, spam filtering, and sentiment analysis, among others.

For example, a fruit may be considered an apple if it is red, round, and about 3 inches in diameter. Even if these features depend on each other or on the existence of the other features, all of these properties independently contribute to the probability that this fruit is an apple, and that is why it is known as ‘Naive’.


An NB model is easy to build and particularly useful for very large data sets. Along with simplicity, Naive Bayes is known to outperform even highly sophisticated classification methods.

Bayes’ theorem provides a way of computing the posterior probability P(c|x) from P(c), P(x) and P(x|c). Look at the equation below:

P(c|x) = P(x|c) × P(c) / P(x)

● P(c|x) is the posterior probability of the class (c, target) given the predictor (x, attributes).
● P(c) is the prior probability of the class.
● P(x|c) is the likelihood, which is the probability of the predictor given the class.
● P(x) is the prior probability of the predictor.

Under the naive independence assumption, the likelihood factorizes over the individual features, P(x|c) = P(x1|c) × P(x2|c) × … × P(xn|c), and the predicted class is the one with the highest posterior probability.
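To make this decision rule concrete, here is a minimal sketch in plain Python. The prior and likelihood values are hypothetical numbers chosen only to illustrate the apple example above; they are not taken from the notes.

```python
# Hypothetical priors and per-feature likelihoods for a two-class problem.
priors = {"apple": 0.6, "orange": 0.4}                        # P(c)
likelihoods = {                                               # P(x_i | c)
    "apple":  {"red": 0.8, "round": 0.9, "diameter~3in": 0.7},
    "orange": {"red": 0.1, "round": 0.9, "diameter~3in": 0.6},
}
observed = ["red", "round", "diameter~3in"]                   # the predictor x

scores = {}
for c, prior in priors.items():
    score = prior
    for feature in observed:
        score *= likelihoods[c][feature]   # naive assumption: multiply P(x_i | c)
    scores[c] = score                      # proportional to P(c | x); P(x) is a shared constant

prediction = max(scores, key=scores.get)   # class with the highest posterior wins
print(scores, prediction)
```

Because P(x) is the same for every class, it can be dropped when only the argmax is needed, which is why the loop never divides by the evidence term.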

How Do Naive Bayes Algorithms Work?

Let’s understand it using an example. Below I have a training data set of weather conditions and the corresponding target variable ‘Play’ (suggesting the possibility of playing). Now, we need to classify whether players will play or not based on the weather conditions. Let’s follow the steps below to perform it.

1. Convert the data set into a frequency table

In this first step, the data set is converted into a frequency table of weather conditions against the target variable ‘Play’.

2. Create a likelihood table by finding the probabilities

Create a likelihood table by finding the probabilities, for example the Overcast probability = 0.29 and the probability of playing = 0.64.

3. Use the naive Bayesian equation to calculate the posterior probability

Now, use the naive Bayesian equation to calculate the posterior probability for each class. The class with the highest posterior probability is the outcome of the prediction.

Problem: Players will play if the weather is sunny. Is this statement correct?

We can solve it using the above-discussed method of posterior probability.

P(Yes | Sunny) = P(Sunny | Yes) × P(Yes) / P(Sunny)

Here P(Sunny | Yes) × P(Yes) is the numerator, and P(Sunny) is the denominator.

From the likelihood table, P(Sunny | Yes) = 3/9 = 0.33, P(Sunny) = 5/14 = 0.36, and P(Yes) = 9/14 = 0.64.

Now, P(Yes | Sunny) = 0.33 × 0.64 / 0.36 ≈ 0.60, which is the higher posterior probability (the same calculation gives P(No | Sunny) ≈ 0.40), so the prediction is that players will play when it is sunny.
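The same calculation can be reproduced in a few lines of Python. This is a minimal sketch assuming the classic 14-day weather/‘Play’ data set implied by the counts quoted above (9 ‘Yes’ days, 5 ‘No’ days, 5 sunny days of which 3 are ‘Yes’); the exact split of the Overcast and Rainy days between ‘Yes’ and ‘No’ is a hypothetical reconstruction and does not affect the Sunny posteriors.

```python
from collections import Counter

# Hypothetical reconstruction of the weather data set from the counts above:
# Sunny: 3 Yes / 2 No, Overcast: 4 Yes / 0 No, Rainy: 2 Yes / 3 No
# -> 9 Yes, 5 No, 14 days in total.
data = (
    [("Sunny", "Yes")] * 3 + [("Sunny", "No")] * 2 +
    [("Overcast", "Yes")] * 4 +
    [("Rainy", "Yes")] * 2 + [("Rainy", "No")] * 3
)

n = len(data)                                    # 14 observations
class_counts = Counter(label for _, label in data)   # frequency table of 'Play'
joint_counts = Counter(data)                          # frequency table of (Outlook, Play)

def posterior(outlook, label):
    prior = class_counts[label] / n                                     # P(c)
    likelihood = joint_counts[(outlook, label)] / class_counts[label]   # P(x|c)
    evidence = sum(1 for o, _ in data if o == outlook) / n              # P(x)
    return likelihood * prior / evidence                                # Bayes' theorem

print(posterior("Sunny", "Yes"))   # ~0.60
print(posterior("Sunny", "No"))    # ~0.40
```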
Pros and Cons of Naive Bayes
Pros:

● It is easy and fast to predict the class of a test data set. It also performs well in multi-class prediction.
● When the assumption of independence holds, the classifier performs better than other machine learning models such as logistic regression or decision trees, and requires less training data.
● It performs well with categorical input variables compared to numerical variable(s). For numerical variables, a normal distribution is assumed (bell curve, which is a strong assumption).
Cons:

● If a categorical variable has a category in the test data set that was not observed in the training data set, the model will assign it a zero probability and will be unable to make a prediction. This is often known as the “zero-frequency” problem. To solve it, we can use a smoothing technique; one of the simplest is Laplace smoothing (see the sketch after this list).
● On the other side, Naive Bayes is also known to be a bad estimator, so the probability outputs from predict_proba are not to be taken too seriously.
● Another limitation of this algorithm is the assumption of independent predictors. In real life, it is almost impossible to get a set of predictors that are completely independent.
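A minimal sketch of how Laplace (additive) smoothing avoids the zero-frequency problem. The word counts below are made up purely for illustration; scikit-learn’s MultinomialNB exposes the same idea through its alpha parameter (alpha=1.0 by default).

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Toy word-count features (hypothetical): columns = words, rows = documents.
X_train = np.array([
    [2, 1, 0],   # class "spam": the third word never appears in training
    [3, 0, 0],
    [0, 2, 1],   # class "ham"
    [1, 1, 2],
])
y_train = ["spam", "spam", "ham", "ham"]

# alpha=1.0 adds one pseudo-count to every (word, class) pair, so a word
# unseen for a class gets a small non-zero likelihood instead of zero.
clf = MultinomialNB(alpha=1.0)
clf.fit(X_train, y_train)

# A test document dominated by the word "spam" never saw in training
# still gets a finite (non-zero) spam probability thanks to smoothing.
print(clf.predict_proba(np.array([[0, 0, 3]])))
```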

Applications of Naive Bayes Algorithms

● Real-time Prediction: The naive Bayesian classifier is an eager learning classifier and it is super fast. Thus, it can be used for making predictions in real time.
● Multi-class Prediction: This algorithm is also well known for its multi-class prediction capability. Here we can predict the probabilities of multiple classes of the target variable.
● Text Classification / Spam Filtering / Sentiment Analysis: Naive Bayesian classifiers, mostly used in text classification (due to better results in multi-class problems and the independence rule), have a higher success rate compared to other algorithms. As a result, they are widely used in spam filtering (identifying spam e-mail) and sentiment analysis (in social media analysis, to identify positive and negative customer sentiments).
● Recommendation Systems: A Naive Bayes classifier and collaborative filtering together build a recommendation system that uses machine learning and data mining techniques to filter unseen information and predict whether a user would like a given resource or not.

Naive Bayes uses this same method to predict the probability of different classes based on various attributes. The algorithm is mostly used in text classification (NLP) and with problems having multiple classes.

How to Build a Basic Model Using Naive Bayes in Python?

Again, scikit-learn (a Python library) will help here to build a Naive Bayes model in Python. There are five types of NB models in the scikit-learn library:

● Gaussian Naive Bayes: GaussianNB is used in classification tasks and assumes that feature values follow a Gaussian distribution (see the sketch after this list).
● Multinomial Naive Bayes: It is used for discrete counts, for example in a text classification problem. It goes one step further than Bernoulli trials: instead of “does the word occur in the document”, the features record “how often the word occurs in the document”; you can think of each feature as “the number of times outcome x_i is observed over n trials”.
● Bernoulli Naive Bayes: The binomial model is useful if your feature vectors are boolean (i.e. zeros and ones). One application would be text classification with a ‘bag of words’ model where the 1s and 0s mean “word occurs in the document” and “word does not occur in the document”, respectively.
● Complement Naive Bayes: It is an adaptation of multinomial NB where the complement of each class is used to calculate the model weights. This makes it suitable for imbalanced data sets, and it often outperforms MNB on text classification tasks.
● Categorical Naive Bayes: Categorical Naive Bayes is useful if the features are categorically distributed. We have to encode the categorical variables in a numeric format, for example with an ordinal encoder, before using this algorithm.
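A minimal sketch of building a basic Naive Bayes model with scikit-learn. The iris data set and the train/test split are illustrative choices, not part of the original notes; the same pattern works for MultinomialNB, BernoulliNB, ComplementNB, and CategoricalNB with appropriately encoded features.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

# Load a small numeric data set (illustrative choice).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# GaussianNB assumes each feature is normally distributed within each class.
model = GaussianNB()
model.fit(X_train, y_train)

# Predict classes and class probabilities for unseen data.
y_pred = model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))
print("Posteriors for the first test sample:", model.predict_proba(X_test[:1]))
```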
