8 ML
Module-II
EC 19-203-0811
Introduction to Machine Learning
Course Outcomes
1. To understand various machine learning techniques.
2. To acquire knowledge about classification techniques.
3. To understand dimensionality reduction techniques and decision trees.
4. To understand unsupervised machine learning techniques.
Module II
Multilayer Perceptrons: Introduction, The Perceptron, Training a Perceptron, Learning Boolean
Functions, Multilayer Perceptrons, Backpropagation Algorithm, Training Procedures. Classification: Cross-validation
and re-sampling methods (K-fold cross-validation, bootstrapping), Measuring classifier
performance (precision, recall, ROC curves). Bayes Theorem, Bayesian classifier, Maximum Likelihood
estimation, Density Functions.
Module IV
Clustering: Introduction, Mixture Densities, k-Means Clustering, Expectation-
Maximization Algorithm, Mixtures of Latent Variable Models, Supervised Learning after
Clustering, Hierarchical Clustering, Choosing the Number of Clusters.
• Bayes Theorem
• Bayesian Classifier
• Maximum Likelihood Estimation
• Density Functions
MAP classification rule
• MAP: Maximum A Posteriori
– Assign x to c* if P(C = c* | X = x) > P(C = c | X = x) for all c ≠ c*
• Generative classification with the MAP rule
– Apply Bayes' rule to convert: P(C | X) ∝ P(X | C) P(C)
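Written out in full (a standard expansion, using the same notation as the slides), the MAP decision and the Bayes-rule conversion are:

```latex
c^* \;=\; \arg\max_{c_i} P(C = c_i \mid X = x)
    \;=\; \arg\max_{c_i} \frac{P(X = x \mid C = c_i)\,P(C = c_i)}{P(X = x)}
    \;=\; \arg\max_{c_i} P(X = x \mid C = c_i)\,P(C = c_i)
```

The denominator P(X = x) is the same for every class, so it can be dropped from the maximization.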
Bayes Classifier
• Establishing a probabilistic model for classification
– Discriminative model: model P(C | X) directly, where C ∈ {c1, …, cL} and X = (X1, …, Xn)
– Generative model: model P(X | C), where C ∈ {c1, …, cL} and X = (X1, …, Xn)
8-Bayes Classifier
Bayes Classifier
• Bayes classification
P(C | X) ∝ P(X | C) P(C) = P(X1, …, Xn | C) P(C)
Bayes Classifier
• Naïve Bayes Algorithm (for discrete input attributes)
– Learning Phase: Given a training set S,
For each target value ci (ci ∈ {c1, …, cL})
P^(C = ci) ← estimate P(C = ci) with examples in S;
For every attribute value a_jk of each attribute X_j (j = 1, …, n; k = 1, …, N_j)
P^(X_j = a_jk | C = ci) ← estimate P(X_j = a_jk | C = ci) with examples in S;
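The learning phase above reduces to counting. A minimal Python sketch (the four-example dataset at the bottom is an illustrative excerpt invented for the demo, not the full Play Tennis table):

```python
from collections import Counter, defaultdict

def learn_naive_bayes(S):
    """Learning phase: estimate P(C = ci) and P(Xj = ajk | C = ci) by counting in S."""
    class_counts = Counter(c for _, c in S)
    prior = {c: n / len(S) for c, n in class_counts.items()}            # P^(C = ci)
    cond_counts = defaultdict(int)
    for x, c in S:
        for j, a_jk in enumerate(x):                                    # attribute j, value a_jk
            cond_counts[(j, a_jk, c)] += 1
    cond = {k: n / class_counts[k[2]] for k, n in cond_counts.items()}  # P^(Xj = ajk | C = ci)
    return prior, cond

# Illustrative excerpt: (Outlook, Humidity) -> Play
S = [(("Sunny", "High"), "No"), (("Overcast", "High"), "Yes"),
     (("Rain", "High"), "Yes"), (("Sunny", "Normal"), "Yes")]
prior, cond = learn_naive_bayes(S)
# prior["Yes"] = 3/4; cond[(0, "Sunny", "Yes")] = 1/3
```

Each conditional estimate is just (count of examples with Xj = ajk and class ci) divided by (count of examples with class ci), exactly as in the algorithm above.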
Bayes Classifier-Example
• Example: Play Tennis
Bayes Classifier-Example
• Learning Phase
Outlook Play=Yes Play=No
Sunny 2/9 3/5
Overcast 4/9 0/5
Rain 3/9 2/5
Bayes Classifier-Example
• Learning Phase
(The conditional-probability tables for Temperature, Humidity, and Wind, and the priors P(Play=Yes) = 9/14 and P(Play=No) = 5/14, are estimated the same way; the resulting values appear in the test-phase lookup tables.)
Bayes Classifier-Example
• Test Phase
– Given a new instance,
x’=(Outlook=Sunny, Temperature=Cool, Humidity=High, Wind=Strong)
– Look up tables
P(Outlook=Sunny|Play=Yes) = 2/9 P(Outlook=Sunny|Play=No) = 3/5
P(Temperature=Cool|Play=Yes) = 3/9 P(Temperature=Cool|Play=No) = 1/5
P(Humidity=High|Play=Yes) = 3/9 P(Humidity=High|Play=No) = 4/5
P(Wind=Strong|Play=Yes) = 3/9 P(Wind=Strong|Play=No) = 3/5
P(Play=Yes) = 9/14 P(Play=No) = 5/14
– MAP rule
P(Yes|x') ∝ [P(Sunny|Yes)P(Cool|Yes)P(High|Yes)P(Strong|Yes)]P(Play=Yes) = 0.0053
P(No|x') ∝ [P(Sunny|No)P(Cool|No)P(High|No)P(Strong|No)]P(Play=No) = 0.0206
Since 0.0206 > 0.0053, the MAP rule classifies x' as Play=No.
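The two MAP scores can be reproduced directly from the lookup-table values:

```python
# Unnormalized MAP scores for x' = (Sunny, Cool, High, Strong),
# using the conditional probabilities and priors from the lookup tables above.
score_yes = (2/9) * (3/9) * (3/9) * (3/9) * (9/14)   # P(x'|Yes) P(Play=Yes)
score_no  = (3/5) * (1/5) * (4/5) * (3/5) * (5/14)   # P(x'|No)  P(Play=No)

print(round(score_yes, 4), round(score_no, 4))       # 0.0053 0.0206
prediction = "No" if score_no > score_yes else "Yes" # MAP picks Play=No
```

These are unnormalized scores, not probabilities; only their relative order matters for the MAP decision.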
Bayes Classifier
Advantages:
– Simple to implement and fast to train; the probability estimates reduce to counting.
– Works well with high-dimensional data and relatively small training sets.
Disadvantages:
– The conditional-independence assumption between attributes rarely holds exactly in practice.
– An attribute value unseen in training gives a zero probability that wipes out the whole product (commonly handled with Laplace smoothing).
Density Functions
• In probability theory, a probability density function (PDF), or density of
a continuous random variable, is a function whose value at any given
sample (or point) in the sample space (the set of possible values taken by
the random variable) can be interpreted as providing a relative
likelihood that the value of the random variable would be equal to that
sample.
• Probability density is probability per unit length.
• While the absolute likelihood of a continuous random variable taking on any particular value is 0 (since there is an infinite set of possible values to begin with), the value of the PDF at two different samples can be used to infer, in any particular draw of the random variable, how much more likely it is that the random variable would be close to one sample than to the other.
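As a concrete illustration of both points, here is a sketch using the standard normal density (the choice of N(0, 1) and the sample points 0 and 3 are assumptions for the demo, not from the slides):

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) at x: probability per unit length, not a probability."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

# P(X == 0) and P(X == 3) are both exactly 0 for a continuous X, yet the
# ratio of densities shows a draw is far more likely to land near 0 than near 3.
ratio = normal_pdf(0.0) / normal_pdf(3.0)   # = exp(4.5), roughly 90
```

The density value itself can exceed 1 (e.g. for small sigma); only integrals of the PDF over intervals are probabilities.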
Conclusion
• Bayes' Theorem
• Bayes Classifier
• Examples