MultinomialNB

The document discusses text classification, focusing on the Naïve Bayes method, which uses Bayes' rule for classifying documents into predefined categories. It covers various applications such as spam detection, authorship identification, and sentiment analysis, and explains the bag of words representation and the assumptions behind the Naïve Bayes classifier. Additionally, it addresses challenges like zero probabilities and the importance of precision, recall, and the F measure in evaluating classification performance.


Text Classification and Naïve Bayes

The Task of Text Classification

Is this spam?
Who wrote which Federalist papers?
• 1787–8: anonymous essays try to convince New York to ratify the U.S. Constitution; written by Jay, Madison, and Hamilton
• Authorship of 12 of the letters is in dispute
• 1963: solved by Mosteller and Wallace using Bayesian methods

[Portraits: James Madison, Alexander Hamilton]


Male or female author?
1. By 1925 present-day Vietnam was divided into three parts
under French colonial rule. The southern region embracing
Saigon and the Mekong delta was the colony of Cochin-China;
the central area with its imperial capital at Hue was the
protectorate of Annam…
2. Clara never failed to be astonished by the extraordinary felicity
of her own name. She found it hard to trust herself to the
mercy of fate, which had managed over the years to convert
her greatest shame into one of her greatest assets…
S. Argamon, M. Koppel, J. Fine, and A. R. Shimoni. 2003. "Gender, Genre, and Writing Style in Formal Written Texts." Text 23(3), pp. 321–346.
Positive or negative movie review?
• unbelievably disappointing
• Full of zany characters and richly applied satire, and some
great plot twists
• this is the greatest screwball comedy ever filmed
• It was pathetic. The worst part about it was the boxing
scenes.

What is the subject of this article?

MEDLINE Article → MeSH Subject Category Hierarchy
• Antagonists and Inhibitors
• Blood Supply
• Chemistry
• Drug Therapy
• Embryology
• Epidemiology
• …
Text Classification
• Assigning subject categories, topics, or genres
• Spam detection
• Authorship identification
• Age/gender identification
• Language Identification
• Sentiment analysis
• …
Text Classification: definition
• Input:
  • a document d
  • a fixed set of classes C = {c1, c2, …, cJ}
• Output: a predicted class c ∈ C

Classification Methods: Hand-coded rules
• Rules based on combinations of words or other features
  • spam: black-list-address OR ("dollars" AND "have been selected")
• Accuracy can be high
  • if the rules are carefully refined by an expert
• But building and maintaining these rules is expensive
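
As an illustration, here is a minimal sketch of a hand-coded classifier along the lines of the spam rule above; the blacklist address and the test messages are hypothetical placeholders, not taken from any real system.

# A minimal sketch of the hand-coded spam rule:
# black-list-address OR ("dollars" AND "have been selected").
# The blacklist entry below is a hypothetical placeholder.
BLACKLIST = {"offers@example-spam.biz"}

def is_spam(sender, body):
    text = body.lower()
    if sender.lower() in BLACKLIST:
        return True
    return "dollars" in text and "have been selected" in text

print(is_spam("friend@example.com", "You have been selected to receive a million dollars"))  # True
print(is_spam("friend@example.com", "Lunch tomorrow?"))  # False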
Classification Methods: Supervised Machine Learning
• Input:
  • a document d
  • a fixed set of classes C = {c1, c2, …, cJ}
  • a training set of m hand-labeled documents (d1, c1), …, (dm, cm)
• Output:
  • a learned classifier γ: d → c

Classification Methods: Supervised Machine Learning
• Any kind of classifier
  • Naïve Bayes
  • Logistic regression
  • Support-vector machines
  • k-Nearest Neighbors
  • …
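
A minimal supervised-learning sketch using scikit-learn's CountVectorizer and MultinomialNB (the classifier this deck is named after); the toy training documents and labels are invented for illustration.

# Supervised text classification with bag-of-words counts and multinomial Naive Bayes.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

train_docs = ["full of zany characters and great plot twists",
              "unbelievably disappointing",
              "the greatest screwball comedy ever filmed",
              "it was pathetic, the worst boxing scenes"]
train_labels = ["pos", "neg", "pos", "neg"]

vectorizer = CountVectorizer()                 # bag-of-words feature extraction
X_train = vectorizer.fit_transform(train_docs)
clf = MultinomialNB(alpha=1.0)                 # alpha=1.0 is add-1 (Laplace) smoothing
clf.fit(X_train, train_labels)

X_test = vectorizer.transform(["zany characters and a great comedy"])
print(clf.predict(X_test))                     # expected: ['pos']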
Text Classification and Naïve Bayes

Naïve Bayes (I)

Naïve Bayes Intuition
• Simple ("naïve") classification method based on Bayes' rule
• Relies on a very simple representation of the document
  • Bag of words
The bag of words representation
• A document is reduced to an unordered set of word counts, which the classifier γ maps to a class:

  seen        2
  sweet       1
  whimsical   1
  recommend   1
  happy       1
  ...         ...

  γ( {seen: 2, sweet: 1, whimsical: 1, recommend: 1, happy: 1, ...} ) = c
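
A bag of words is just a multiset of word counts. Here is a minimal sketch using Python's collections.Counter; the abridged review text is made up so that its counts match the example above.

# Reduce a document to unordered word counts (the bag-of-words representation).
from collections import Counter
import re

review = ("I loved it. I have seen it twice, and seen nothing so sweet and whimsical; "
          "I recommend it to anyone who wants a happy film.")
bag = Counter(re.findall(r"[a-z']+", review.lower()))
print(bag["seen"], bag["sweet"], bag["whimsical"], bag["recommend"], bag["happy"])  # 2 1 1 1 1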
Text Classification and Naïve Bayes

Formalizing the Naïve Bayes Classifier
Bayes' Rule Applied to Documents and Classes
• For a document d and a class c:

  P(c | d) = P(d | c) P(c) / P(d)

Naïve Bayes Classifier (I)
• MAP is "maximum a posteriori" = most likely class

  cMAP = argmax_{c ∈ C} P(c | d)

• Bayes' rule:

  cMAP = argmax_{c ∈ C} P(d | c) P(c) / P(d)

• Dropping the denominator (P(d) is identical for every class):

  cMAP = argmax_{c ∈ C} P(d | c) P(c)
Naïve Bayes Classifier (II)

  cMAP = argmax_{c ∈ C} P(x1, x2, …, xn | c) P(c)

• Document d is represented as features x1, …, xn
Naïve Bayes Classifier (IV)

  cMAP = argmax_{c ∈ C} P(x1, x2, …, xn | c) P(c)

• The likelihood term P(x1, x2, …, xn | c) has O(|X|^n · |C|) parameters, which could only be estimated if a very, very large number of training examples was available.
• The prior P(c) answers "how often does this class occur?"; we can just count the relative frequencies in a corpus.
Multinomial Naïve Bayes Independence Assumptions

  P(x1, x2, …, xn | c)

• Bag of Words assumption: assume position doesn't matter
• Conditional Independence: assume the feature probabilities P(xi | cj) are independent given the class cj:

  P(x1, …, xn | c) = P(x1 | c) · P(x2 | c) · … · P(xn | c)
Multinomial Naïve Bayes Classifier

  cMAP = argmax_{c ∈ C} P(x1, x2, …, xn | c) P(c)

  cNB = argmax_{c ∈ C} P(c) Π_i P(xi | c)
Applying Multinomial Naive Bayes Classifiers to Text Classification
• positions ← all word positions in the test document

  cNB = argmax_{c ∈ C} P(c) Π_{i ∈ positions} P(xi | c)

Problems with multiplying lots of probs
• There's a problem with this:

• Multiplying lots of probabilities can result in floating-point underflow!


• .0006 * .0007 * .0009 * .01 * .5 * .000008….
• Idea: Use logs, because log(ab) = log(a) + log(b)
• We'll sum logs of probabilities instead of multiplying
probabilities!
We actually do everything in log space
• Instead of this:

  cNB = argmax_{c ∈ C} P(c) Π_{i ∈ positions} P(xi | c)

• This:

  cNB = argmax_{c ∈ C} [ log P(c) + Σ_{i ∈ positions} log P(xi | c) ]

Notes:
1) Taking the log doesn't change the ranking of classes!
   The class with the highest probability also has the highest log probability.
2) It's a linear model: just a max of a sum of weights, a linear function of the inputs.
   So naive Bayes is a linear classifier.
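
A minimal sketch of this log-space decision rule in Python; the log-prior and log-likelihood tables are assumed to have been estimated already (see the learning slides that follow), and out-of-vocabulary words are skipped, as discussed later.

# Log-space Naive Bayes: argmax over classes of log P(c) + sum_i log P(w_i | c).
import math

def classify(doc_words, log_prior, log_likelihood):
    # log_prior: {class: log P(c)}
    # log_likelihood: {class: {word: log P(w | c)}}
    best_class, best_score = None, -math.inf
    for c in log_prior:
        score = log_prior[c]
        for w in doc_words:
            if w in log_likelihood[c]:   # words outside the vocabulary are ignored
                score += log_likelihood[c][w]
        if score > best_score:
            best_class, best_score = c, score
    return best_class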
Text Classification and Naïve Bayes

Naïve Bayes: Learning
Sec.13.3

Learning the Multinomial Naïve Bayes Model

• First attempt: maximum likelihood estimates


• simply use the frequencies in the data
Parameter estimation
• P̂(wi | cj) = fraction of times word wi appears among all words in documents of topic cj
• Create a mega-document for topic j by concatenating all the docs in this topic
• Use the frequency of wi in the mega-document
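
In standard notation, the two maximum-likelihood estimates described above are:

$$\hat{P}(c_j) = \frac{N_{c_j}}{N_{doc}} \qquad \hat{P}(w_i \mid c_j) = \frac{\mathrm{count}(w_i, c_j)}{\sum_{w \in V} \mathrm{count}(w, c_j)}$$

where N_{c_j} is the number of training documents of class cj, N_doc is the total number of training documents, and count(w, cj) is the number of occurrences of w in the class-cj mega-document.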
Sec. 13.3

Problem with Maximum Likelihood
• What if we have seen no training documents with the word fantastic that are classified in the topic positive (thumbs-up)?

  P̂("fantastic" | positive) = count("fantastic", positive) / Σw count(w, positive) = 0

• Zero probabilities cannot be conditioned away, no matter the other evidence!
Laplace (add-1) smoothing for Naïve Bayes
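
In the same notation, the add-1 (Laplace) smoothed estimate is:

$$\hat{P}(w_i \mid c) = \frac{\mathrm{count}(w_i, c) + 1}{\sum_{w \in V}\bigl(\mathrm{count}(w, c) + 1\bigr)} = \frac{\mathrm{count}(w_i, c) + 1}{\Bigl(\sum_{w \in V}\mathrm{count}(w, c)\Bigr) + |V|}$$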
Multinomial Naïve Bayes: Learning
• From the training corpus, extract the Vocabulary
• Calculate the P(cj) terms:
  • For each cj in C:
    • docsj ← all docs with class = cj
    • P(cj) = |docsj| / |total # of documents|
• Calculate the P(wk | cj) terms:
  • Textj ← single doc containing all docsj
  • For each word wk in Vocabulary:
    • nk ← # of occurrences of wk in Textj
    • P(wk | cj) = (nk + 1) / (n + |Vocabulary|), where n is the total number of word tokens in Textj
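
A minimal training sketch following this recipe, with add-1 smoothing and log-space parameters; a toy tokenized corpus would stand in for the real training set. Paired with the log-space classify sketch earlier, this gives a complete classifier.

# Train multinomial Naive Bayes with add-1 smoothing, in log space.
from collections import Counter
import math

def train_nb(docs, labels):
    # docs: list of token lists; labels: list of class labels, one per doc.
    vocab = {w for d in docs for w in d}
    log_prior, log_likelihood = {}, {}
    for c in set(labels):
        class_docs = [d for d, y in zip(docs, labels) if y == c]
        log_prior[c] = math.log(len(class_docs) / len(docs))
        mega = Counter(w for d in class_docs for w in d)   # the "mega-document" counts
        n = sum(mega.values())                             # total word tokens in Text_j
        log_likelihood[c] = {w: math.log((mega[w] + 1) / (n + len(vocab))) for w in vocab}
    return log_prior, log_likelihood, vocab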
Unknown words
• What about unknown words
• that appear in our test data
• but not in our training data or vocabulary?
• We ignore them
• Remove them from the test document!
• Pretend they weren't there!
• Don't include any probability for them at all!
• Why don't we build an unknown word model?
• It doesn't help: knowing which class has more unknown words is not
generally helpful!
Stop words
• Some systems ignore stop words
• Stop words: very frequent words like the and a.
• Sort the vocabulary by word frequency in training set
• Call the top 10 or 50 words the stopword list.
• Remove all stop words from both training and test sets
• As if they were never there!
• But removing stop words doesn't usually help
• So in practice most NB algorithms use all words and don't use
stopword lists
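
A frequency-based stopword list as described above is easy to build; a sketch, assuming the training documents are already tokenized:

# Take the k most frequent training-set words as the stopword list.
from collections import Counter

def top_k_stopwords(train_token_lists, k=50):
    counts = Counter(w for doc in train_token_lists for w in doc)
    return {w for w, _ in counts.most_common(k)}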
Text Classification and Naïve Bayes

Naïve Bayes: Relationship to Language Modeling
Generative Model for Multinomial Naïve Bayes

[Figure: a class node c = China generates the word sequence X1 = Shanghai, X2 = and, X3 = Shenzhen, X4 = issue, X5 = bonds]
Naïve Bayes and Language Modeling
• Naïve Bayes classifiers can use any sort of feature
  • URL, email address, dictionaries, network features
• But if, as in the previous slides,
  • we use only word features
  • and we use all of the words in the text (not a subset)
• then Naïve Bayes has an important similarity to language modeling.
Sec. 13.2.1

Each class = a unigram language model
• Assign to each word: P(word | c)
• Assign to each sentence: P(s | c) = Π P(word | c)

Class pos:
  I     0.1
  love  0.1
  this  0.01
  fun   0.05
  film  0.1

  s = "I love this fun film"
  P(s | pos) = 0.1 × 0.1 × 0.01 × 0.05 × 0.1 = 0.0000005

Sec. 13.2.1

Naïve Bayes as a Language Model
• Which class assigns the higher probability to s = "I love this fun film"?

  Model pos        Model neg
  I     0.1        I     0.2
  love  0.1        love  0.001
  this  0.01       this  0.01
  fun   0.05       fun   0.005
  film  0.1        film  0.1

  P(s | pos) = 0.1 × 0.1 × 0.01 × 0.05 × 0.1    = 0.0000005
  P(s | neg) = 0.2 × 0.001 × 0.01 × 0.005 × 0.1 = 0.000000001

  P(s | pos) > P(s | neg)
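
As a quick check of the comparison on this slide, using exactly the per-word probabilities from the two tables:

# Which unigram class model gives s = "I love this fun film" the higher probability?
import math

pos = {"I": 0.1, "love": 0.1, "this": 0.01, "fun": 0.05, "film": 0.1}
neg = {"I": 0.2, "love": 0.001, "this": 0.01, "fun": 0.005, "film": 0.1}
s = ["I", "love", "this", "fun", "film"]

p_pos = math.prod(pos[w] for w in s)   # 5e-07
p_neg = math.prod(neg[w] for w in s)   # 1e-09
print(p_pos > p_neg)                   # True: the pos model assigns the higher probability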
Text Classification and Naïve Bayes

Multinomial Naïve Bayes: A Worked Example
            Doc   Words                                   Class
Training    1     Chinese Beijing Chinese                 c
            2     Chinese Chinese Shanghai                c
            3     Chinese Macao                           c
            4     Tokyo Japan Chinese                     j
Test        5     Chinese Chinese Chinese Tokyo Japan     ?

Priors:
  P(c) = 3/4
  P(j) = 1/4

Conditional probabilities (add-1 smoothing):
  P(Chinese | c) = (5 + 1) / (8 + 6) = 6/14 = 3/7
  P(Tokyo | c)   = (0 + 1) / (8 + 6) = 1/14
  P(Japan | c)   = (0 + 1) / (8 + 6) = 1/14
  P(Chinese | j) = (1 + 1) / (3 + 6) = 2/9
  P(Tokyo | j)   = (1 + 1) / (3 + 6) = 2/9
  P(Japan | j)   = (1 + 1) / (3 + 6) = 2/9

Choosing a class:
  P(c | d5) ∝ 3/4 × (3/7)^3 × 1/14 × 1/14 ≈ 0.0003
  P(j | d5) ∝ 1/4 × (2/9)^3 × 2/9 × 2/9 ≈ 0.0001
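
A quick computational check of the two scores above:

# d5 = "Chinese Chinese Chinese Tokyo Japan"
p_c = 3/4 * (3/7)**3 * (1/14) * (1/14)   # class c: ~0.0003
p_j = 1/4 * (2/9)**3 * (2/9) * (2/9)     # class j: ~0.0001
print(p_c > p_j)                          # True: choose class c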
Naïve Bayes in Spam Filtering
• SpamAssassin Features:
• Online Pharmacy
• Mentions millions of dollars ($NN,NNN,NNN.NN)
• Phrase: impress ...
• From: starts with many numbers
• Subject is all capitals
• HTML has a low ratio of text to image area
• One hundred percent guaranteed
• Claims you can be removed from the list
• 'Prestigious Non-Accredited Universities'
• http://spamassassin.apache.org/tests_3_3_x.html
Summary: Naive Bayes is Not So Naive
• Very fast, low storage requirements
• Robust to irrelevant features
  • Irrelevant features cancel each other out without affecting results
• Very good in domains with many equally important features
  • Decision trees suffer from fragmentation in such cases, especially with little data
• Optimal if the independence assumptions hold: if the assumed independence is correct, then it is the Bayes Optimal Classifier for the problem
• A good, dependable baseline for text classification
  • But we will see other classifiers that give better accuracy
Text Classification and Naïve Bayes

Precision, Recall, and the F measure
The 2-by-2 contingency table

                 correct    not correct
  selected       tp         fp
  not selected   fn         tn
Precision and recall
• Precision: % of selected items that are correct
• Recall: % of correct items that are selected

                 correct    not correct
  selected       tp         fp
  not selected   fn         tn
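
In terms of the cells of the table:

$$\text{Precision} = \frac{tp}{tp + fp} \qquad \text{Recall} = \frac{tp}{tp + fn}$$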
A combined measure: F
• A combined measure that assesses the P/R tradeoff is the F measure (a weighted harmonic mean):

  F = 1 / ( α (1/P) + (1 − α) (1/R) ) = (β² + 1) P R / (β² P + R)

• The harmonic mean is a very conservative average
• People usually use the balanced F1 measure
  • i.e., with β = 1 (that is, α = ½): F = 2PR / (P + R)
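
A small sketch computing precision, recall, and balanced F1 from the contingency counts; the example counts are invented for illustration.

# Precision, recall, and F1 from 2-by-2 contingency counts.
def precision_recall_f1(tp, fp, fn):
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    return p, r, 2 * p * r / (p + r)

print(precision_recall_f1(tp=40, fp=10, fn=20))   # (0.8, 0.666..., 0.727...)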
