EE 769 Introduction To Machine Learning: Sheet 4 - 2020-21-2 Linear Classification
1. Assume that we have an augmented input vector x̂ = [xᵀ 1]ᵀ, where the last dimension
is always 1 and there is no bias b, and yi = sign(w · x̂i). When x is 1-dimensional,
show pictorially and mathematically that all x̂i lie on a straight line. Further, show
that the decision threshold is at the point where the normal to w intersects this line
formed by the x̂i.
2. Assume that for a binary classification problem with 2-dimensional input, the training
label is given by:
ti = sign(xi,1) × sign(xi,2), (1)
where xi,1 ≠ 0, xi,2 ≠ 0, ∀i. Suggest a new variable xi,3 that is a function of the two
original variables to make this data linearly separable.
3. Gaussian class-conditionals:
(a) Show that the quadratic term cancels out when the covariance matrices of the two
Gaussian class conditional densities of a binary classification problem are equal,
thus yielding a linear decision boundary, even if the prior probabilities of the two
classes are unequal.
(b) Show that the Bayesian classification decision takes the form of a logistic regres-
sion classifier when the class conditionals are Gaussians with the same covariance
matrix.
(c) Show that the decision boundaries of a Bayesian classifier are still (piece-wise) linear
for a three-class classifier as long as the class conditional densities are Gaussian
distributed with the same covariance matrix.
4. Bayesian classification:
(a) Assume that in a population, the prevalence of a disease is 10 per 100,000. A
company makes a test to detect the disease, but it fails to detect even a single case,
no matter how large the population it is used on. What is the expected accuracy
of such a test (ratio of correct decisions to total decisions) when tested on 100
million people?
(b) A better test is developed for the same disease, which has a sensitivity of 0.99,
and a specificity of 0.95. If a mass screening drive is conducted for 100 million
people, what is the expected number of people with the disease who will be missed
(false negative), and what is the expected number of people who will be sent to a
doctor but do not have the disease (false positive)? What is the expected
accuracy of this classifier in a mass screening scenario? You can read about sensitivity
and specificity here: https://en.wikipedia.org/wiki/Sensitivity_and_
specificity.
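Parts (a) and (b) both reduce to plugging the prevalence, sensitivity, and specificity into expected confusion-matrix counts. A minimal Python sketch of that arithmetic (the function name and structure are illustrative, not part of the problem; part (a)'s "never detects" test is modeled as sensitivity 0 and specificity 1):

```python
def expected_confusion(population, prevalence, sensitivity, specificity):
    """Expected confusion-matrix counts for a mass screening."""
    diseased = population * prevalence
    healthy = population - diseased
    tp = diseased * sensitivity      # patients correctly flagged
    fn = diseased - tp               # patients missed
    tn = healthy * specificity       # healthy people correctly cleared
    fp = healthy - tn                # false alarms
    accuracy = (tp + tn) / population
    return tp, fn, tn, fp, accuracy

# Part (a): a test that never detects the disease behaves like
# sensitivity = 0, specificity = 1.
_, _, _, _, acc_a = expected_confusion(100_000_000, 10 / 100_000, 0.0, 1.0)

# Part (b): sensitivity 0.99, specificity 0.95.
tp, fn, tn, fp, acc_b = expected_confusion(100_000_000, 10 / 100_000, 0.99, 0.95)
```

The same function can be reused for part (d) by changing only the prevalence argument.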
(c) For the previous part, assume that the test technology uses a logistic regression
classifier that computes probability of the disease using the logistic function. If
that probability is above 0.5, then the human subject is assumed to be diseased
and sent to a doctor for further examination. However, we need not use this default
threshold; we can vary it between 0 and 1. If we want to miss approximately
0.1% of diseased subjects in a mass screening, should we move this threshold
closer to 0 or closer to 1, and what will the sensitivity be? Is there any disadvantage
to doing so?
(d) Assume that the classifier is applied to a different population where the disease
burden is 100 patients per 100,000 population. In such a scenario, with the same
decision threshold of p = 0.5, what will the accuracy of the classifier be?
(e) Now, assume that the societal cost (or risk) of missing a patient is $1,000,000, while
that of falsely calling a healthy person a patient is $1,000. Which of the two will
incur a lower expected societal cost (or risk) when used for mass screening?
(i) A decision threshold of 0.5 with a sensitivity of 0.99 and a specificity of 0.95?
(ii) A decision threshold of 0.1 with a sensitivity of 0.999 and a specificity of 0.8?
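The expected cost of each operating point is the expected number of false negatives times the miss cost plus the expected number of false positives times the false-alarm cost. A sketch of that comparison, assuming the population and prevalence from part (b) (the function name is illustrative):

```python
def expected_cost(population, prevalence, sensitivity, specificity,
                  cost_fn=1_000_000, cost_fp=1_000):
    """Expected societal cost of one screening run:
    missed patients plus false alarms, each weighted by its cost."""
    diseased = population * prevalence
    healthy = population - diseased
    misses = diseased * (1 - sensitivity)   # expected false negatives
    alarms = healthy * (1 - specificity)    # expected false positives
    return misses * cost_fn + alarms * cost_fp

pop, prev = 100_000_000, 10 / 100_000
cost_i  = expected_cost(pop, prev, 0.99,  0.95)   # option (i), threshold 0.5
cost_ii = expected_cost(pop, prev, 0.999, 0.80)   # option (ii), threshold 0.1
```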
(f) As is evident from the previous parts, accuracy by itself is an incomplete measure of
a classifier’s performance. We need to know both sensitivity and specificity (or
FNR and FPR, or PPV and NPV, etc., or a confusion matrix). On top of that,
we can vary the decision threshold in certain classifiers, such as logistic regression.
For such classifiers, there is a measure of performance that takes into account the
entire range of thresholds, which is called AUC or AU-ROC (area under receiver
operating characteristic curve). Assume that the classifier yields p of 0.4, 0.55,
0.8, 0.9, 0.91, and 0.99 for positive samples, and 0.01, 0.02, 0.04, 0.11, 0.14, 0.24,
0.3, 0.43, and 0.54 for the negative samples tested. For this scenario, compute the
following:
(i) Sensitivity and specificity for a threshold of 0.5.
(ii) Sensitivity and specificity for a threshold of 0.4.
(iii) AUC.
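These quantities can be sanity-checked numerically. The sketch below takes the scores from the problem and computes sensitivity and specificity at a threshold, plus AUC via the pairwise (Mann–Whitney) formulation, which equals the area under the ROC curve. The convention that a score equal to the threshold counts as positive is an assumption, not stated in the problem:

```python
pos = [0.4, 0.55, 0.8, 0.9, 0.91, 0.99]                       # positive samples
neg = [0.01, 0.02, 0.04, 0.11, 0.14, 0.24, 0.3, 0.43, 0.54]  # negative samples

def sens_spec(pos, neg, threshold):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    sens = sum(p >= threshold for p in pos) / len(pos)
    spec = sum(n < threshold for n in neg) / len(neg)
    return sens, spec

def auc(pos, neg):
    """Probability that a random positive outscores a random negative
    (ties count half) -- equivalent to the area under the ROC curve."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```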
5. Soft-SVM:
(a) Which of the following points are support vectors in a soft-SVM?
(i) Points outside the margin on the correct side of the decision boundary.
(ii) Points on the margin on the correct side of the decision boundary.
(iii) Points inside the margin on the correct side of the decision boundary.
(iv) Points on the decision boundary.
(v) Points inside the margin on the wrong side of the decision boundary.
(vi) Points outside the margin on the wrong side of the decision boundary.
(b) For each of the sub-parts in the previous part, what is the value or range of values
for the slack variable ξi ?
(c) As λ is increased in its loss function Lw,b = λ||w||₂² + Σ_{i=1}^{N} [1 − ti(wᵀxi + b)]₊, do
you expect the number of support vectors to increase or decrease? Give a reason.
(d) Analyze the behavior of a soft-SVM as λ is increased, whose cost function is given
by Lw,b = λ||w||₁ + Σ_{i=1}^{N} [1 − ti(wᵀxi + b)]₊, where the L2-norm of w has been
replaced by the L1-norm. Such an SVM is called an L1-SVM [Ref: “Feature Selection via
Concave Minimization and Support Vector Machines” by P. S. Bradley and O. L.
Mangasarian, ICML 1998, and doi:10.1093/bioinformatics/btp286].
(e) Suggest a way to convert the output of an SVM into a continuous measure that
may be interpreted as a probability measure, instead of simply a discrete binary
decision.
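One standard answer to part (e) is Platt scaling: fit a logistic function p(t = 1 | f) = σ(a·f + b) to the SVM's signed margin f(x) on held-out data. A minimal sketch, fit by plain gradient descent on the logistic loss; the margins and labels below are made up purely for illustration:

```python
import math

def platt_scale(margins, labels, lr=0.1, epochs=2000):
    """Fit p(t=1 | f) = sigmoid(a*f + b) to signed SVM margins f
    by gradient descent on the cross-entropy loss."""
    a, b = 1.0, 0.0
    n = len(margins)
    for _ in range(epochs):
        ga = gb = 0.0
        for f, t in zip(margins, labels):   # labels in {0, 1}
            p = 1.0 / (1.0 + math.exp(-(a * f + b)))
            ga += (p - t) * f / n
            gb += (p - t) / n
        a -= lr * ga
        b -= lr * gb
    return a, b

# Illustrative margins: negatives mostly below 0, positives mostly above.
margins = [-2.0, -1.2, -0.5, 0.3, 1.1, 2.4]
labels  = [0, 0, 0, 1, 1, 1]
a, b = platt_scale(margins, labels)
prob = 1.0 / (1.0 + math.exp(-(a * 2.4 + b)))   # probability for margin 2.4
```

In practice the sigmoid is fit on a validation set rather than the training set, so the calibration is not biased by overfit margins.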
6. Advanced regularization:
(a) Instead of using either L2 or L1 regularization in isolation, the elastic net uses both
in an optimization objective of the form Lerror + λ₂||w||₂² + λ₁||w||₁; λ₂ > 0, λ₁ > 0,
where w are the weights of the model (such as linear or logistic
regression) and Lerror is the unregularized loss (e.g. MSE or cross-entropy).
The reason for using both regularizations is that L1 eliminates some variables,
while L2 groups correlated variables such that two correlated variables either remain
together in the model or get eliminated together, depending on
the values of λ₁ and λ₂ and the strength of the correlation.
Assume two perfectly correlated variables x1 and x2 (you can assume that these
are exactly the same variable) along with other variables, and show that, using
just the L1 penalty, it is possible to have no change in the L1 penalty if we add a
constant to the coefficient of one of those variables and subtract the same constant
from the coefficient of its correlated twin. Then show that adding the L2 penalty
stabilizes the coefficients such that they remain equal to each other, whether they
are zero or non-zero. [Read more at: https://rss.onlinelibrary.wiley.com/
doi/pdfdirect/10.1111/j.1467-9868.2005.00503.x]
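The claim can be checked numerically. When x1 and x2 are identical, only the sum of their coefficients affects the model's predictions; the sketch below (an illustrative setup, assuming both coefficients stay non-negative) shows the L1 penalty is indifferent to how a fixed sum is split, while the L2 penalty is smallest for the equal split:

```python
def penalties(w1, w2):
    """Return (L1, L2) penalties for a pair of coefficients."""
    return abs(w1) + abs(w2), w1**2 + w2**2

# x1 == x2, so only w1 + w2 = 1.0 matters to the fit; both splits
# below give identical predictions.
l1_even, l2_even = penalties(0.5, 0.5)                # equal split
l1_skew, l2_skew = penalties(0.5 + 0.3, 0.5 - 0.3)    # shift 0.3 between them
```

The L1 values coincide, so L1 alone cannot pick between the splits; the L2 term breaks the tie in favor of equal coefficients.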
(b) (Optional) Read about SCAD penalty, and state its advantage and disadvantage.
[This is a quick read: https://andrewcharlesjones.github.io/posts/2020/
03/scad/]