Classification _ DPP 01

The document consists of a series of questions and answers related to machine learning concepts, specifically focusing on linear classification and logistic regression. It covers various topics such as the output of logistic regression, AUC-ROC curves, regression types, hypothesis formulation, and cost functions. Additionally, it includes hints and solutions for selected questions, providing explanations for the correct answers.

Uploaded by Anil Bhaskar

GATE

Machine Learning DPP : 1


Linear Classification & Logistic Regression

Q1 In a logistic regression (linear classifier) problem, what is a possible output for a new instance?
(A) 85 (B) -0.19
(C) 1.20 (D) 89%

Q2 The figure below shows AUC-ROC curves for three logistic regression models. Different colors show curves for different hyperparameter values. Which of the following AUC-ROC curves will give the best result?
[Figure: AUC-ROC curves for the three models; not reproduced here]
(A) Yellow (B) Pink
(C) Black (D) All are same

Q3 In the regression model (y = a + bx) where x̄ = 2.50, ȳ = 5.50 and a = 1.50 (x̄ and ȳ denote mean of variables x and y and a is a constant), which one of the following values of parameter 'b' of the model is correct?
(A) 1.75 (B) 1.60
(C) 2.00 (D) 2.50

Q4 Which of the following is an advantage of linear classification algorithms?
(A) They are highly interpretable
(B) They can capture complex non-linear relationships in the data
(C) They are less sensitive to outliers compared to other algorithms
(D) They require less computational resources for training and prediction

Q5 The learner is trying to predict housing prices based on the size of each house. What type of regression is this?
(A) Multivariate Logistic Regression
(B) Logistic Regression
(C) Linear Regression
(D) Multivariate Linear Regression

Q6 The hypothesis is given by h(x) = t0 + t1x. What is the goal of t0 and t1?
(A) Give negative h(x)
(B) Give h(x) as close to 0 as possible, without themselves being 0
(C) Give h(x) as close to y, in training data, as possible
(D) Give h(x) closer to x than y

Q7 In continuation with question 7, let x = 1 if the server is wearing a black shirt and x = 0 for servers wearing other colored shirts. We know that there are 70 observations with x = 1 and 340 observations with x = 0. The response variable is also an indicator variable given by y = 1 if the customer left a tip and y = 0 if the customer did not leave a tip. Use this data to fit a logistic regression model to compute the log-odds of leaving a tip depending on the color of the server's shirt.
(A) -0.4797 + 0.1249x
(B) 0.2877 + 0.1249x
(C) 0.1249 + 0.4317x
(D) -0.4797 + 0.7674x

Q8 In Simple Logistic regression the predictor . . . ?
(A) is interval/ratio data
(B) must undergo a logarithmic transformation before undergoing logistic regression
(C) be in the range of 0 to 1
(D) represent ranked scores
(E) be a binary variable

Q9 In logistic regression the logit is . . . : (one correct choice)
(A) the natural logarithm of the odds ratio.
(B) an instruction to record the data.
(C) a logarithm of a digit.
(D) the cube root of the sample size.

Q10 Given an example from a dataset (x1, x2) = (4, 1), observed value y = 2 and the initial weights w1, w2, bias b as -0.015, -0.038 and 0. What will be the prediction y'?
(A) 0.01 (B) 0.03
(C) 0.05 (D) 0.1

Q11 For the linear regression model y_i = β_0 + β_1 x_i + u_i, let ŷ_i denote the sample fitted value for the i-th unit from the OLS procedure based on a sample of n observations and β̂_0 and β̂_1 be the OLS estimators of β_0 and β_1, respectively. Which of the following is true?
(A) ∑_{i=1}^{n} u_i = 0
(B) ∑_{i=1}^{n} ŷ_i = ∑_{i=1}^{n} y_i
(C) E(ŷ|x) = β̂_0 + β̂_1 x
(D) E(y|x) = β̂_0 + β̂_1 x

Q12 A classification table: (one correct choice)
(A) helps the researcher assess statistical significance.
(B) indicates how well a model has predicted group membership.
(C) indicates how well the independent variable(s) correlate with the dependent variable.
(D) provides a basis for calculating the exp(b) value

Q13 Suppose you are given the below data, and you want to apply a logistic regression model for classifying it into two given classes.
[Figure: the data to be classified; not reproduced here]
You are using logistic regression with L1 regularization.
∑_{i=1}^{n} log P(y_i | x_i, w0, w1, w2) − C (|w1| + |w2|)
Where C is the regularization parameter, and w1 & w2 are the coefficients of x1 and x2.
Which of the following options is correct when you increase the value of C from zero to a very large value?
(A) First, w2 becomes zero, and then w1 becomes zero
(B) First, w1 becomes zero, and then w2 becomes zero
(C) Both become zero at the same time
(D) Both cannot be zero even after a very large value of C

Q14 Likelihood (In the statistical sense) . . (one correct choice)
(A) Is the same as a p value
(B) Is the probability of observing a particular parameter value given a set of data
(C) attempts to find the parameter value which is the most likely given the observed data.
(D) minimizes the difference between the model and the data

Q15 A Maximum Likelihood Estimator (in the statistical sense) . . (one correct choice)
(A) Is the same as a p value
(B) Is the probability of observing a particular parameter value given a set of data
(C) attempts to find the parameter value which is the most likely given the observed data.
(D) Is the same as R Square

Q16 Why can't the cost function used for linear regression be used for logistic regression?
(A) Linear regression uses mean squared error as its cost function. If this is used for logistic regression, then it will be a non-convex function of its parameters. Gradient descent will converge into the global minimum only if the function is convex.
(B) Linear regression uses mean squared error as its cost function. If this is used for logistic regression, then it will be a convex function of its parameters. Gradient descent will converge into the global minimum only if the function is convex.
(C) Linear regression uses mean squared error as its cost function. If this is used for logistic regression, then it will be a non-convex function of its parameters. Gradient descent will converge into the global minimum only if the function is non-convex.
(D) Linear regression uses mean squared error as its cost function. If this is used for logistic regression, then it will be a convex function of its parameters. Gradient descent will converge into the global minimum only if the function is non-convex.


Answer Key
Q1 (D) Q9 (A)

Q2 (A) Q10 (D)

Q3 (B) Q11 (A)

Q4 (A) Q12 (B)

Q5 (C) Q13 (B)

Q6 (C) Q14 (B)

Q7 (D) Q15 (C)

Q8 (E) Q16 (A)


Hints & Solutions


Q1 Text Solution:
The output of logistic regression is a probability, so it can only be between 0 and 1. Of the listed values, only 89% (that is, 0.89) lies in that range.

Q3 Text Solution:
(y = a + bx)
where,
• x̄ = 2.50
• ȳ = 5.50
• a = 1.50
• (x̄ and ȳ denote mean of variables x and y and a is a constant)
The fitted line passes through the point of means (x̄, ȳ), so ȳ = a + b x̄. Putting values in the formula:
5.50 = 1.50 + b × 2.50
b × 2.50 = 4
b = 4/2.5 = 1.60

Q8 Text Solution:
Logistic regression is commonly used when the outcome or dependent variable is binary (e.g., yes/no, 0/1), and it models the probability of the outcome occurring as a function of the predictor variable.

Q10 Text Solution:
Given
x1 = 4, x2 = 1, w1 = -0.015, w2 = -0.038, y = 2 and b = 0.
Then prediction y' = w1 x1 + w2 x2 + b
= (-0.015 × 4) + (-0.038 × 1) + 0
= -0.06 + (-0.038) + 0
= -0.098
≈ -0.1

Q11 Text Solution:
The equation states that the sum of the residuals (u_i) in a linear regression model is equal to zero. This is a fundamental property of the Ordinary Least Squares (OLS) method used to estimate the coefficients in a linear regression model that includes an intercept.

Q12 Text Solution:
A classification table, also known as a confusion matrix, shows how well a model has predicted group membership or class labels. It compares the actual class labels with the predicted class labels generated by the model.

Q13 Text Solution:
By looking at the image, we see that even by just using x2, we can efficiently perform classification. So at first, w1 will become 0. As the regularization parameter increases further, w2 will come closer and closer to 0.

Q14 Text Solution:
Likelihood, in the statistical sense, is described here as the probability of observing a particular parameter value given a set of data; more precisely, it is the probability of the observed data treated as a function of the parameter value. It represents how well the parameter value fits the observed data.

Q15 Text Solution:
A Maximum Likelihood Estimator (MLE) is a method used to estimate the parameters of a statistical model. It aims to find the parameter values that maximize the likelihood function, making them the most likely given the observed data.

Q16 Text Solution:
In linear regression, the mean squared error (MSE) is used as the cost function. The MSE creates a convex optimization problem, meaning that it has a single minimum and gradient descent can converge to the global minimum efficiently.
However, when the MSE is used for logistic regression, it creates a non-convex optimization problem because the prediction passes through the logistic (sigmoid) function. This non-convexity can cause gradient descent to converge to local minima rather than the global minimum.
Therefore, using the MSE as the cost function for logistic regression is not appropriate, and alternative cost functions like cross-entropy loss are used instead.
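The convexity claim in the Q16 solution can be probed numerically. The sketch below is only an illustration under simplifying assumptions that are not part of the original worksheet: a single training example with x = 1 and y = 1, one scalar weight w, and hypothetical helper names. It estimates the curvature of each loss by a finite second difference; a negative value anywhere means the loss is not convex in w.

```python
import math

def sigmoid(z):
    """Logistic function mapping any real score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def mse_loss(w, x=1.0, y=1.0):
    """Squared error of the sigmoid prediction (assumed single example)."""
    return (sigmoid(w * x) - y) ** 2

def cross_entropy_loss(w, x=1.0, y=1.0):
    """Binary cross-entropy of the sigmoid prediction (same single example)."""
    p = sigmoid(w * x)
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))

def second_difference(f, w, h=1e-3):
    """Finite-difference estimate of the second derivative of f at w."""
    return (f(w + h) - 2.0 * f(w) + f(w - h)) / (h * h)

# Scan weights from -5.0 to 5.0 in steps of 0.1.
grid = [i / 10.0 for i in range(-50, 51)]
mse_curvature = [second_difference(mse_loss, w) for w in grid]
ce_curvature = [second_difference(cross_entropy_loss, w) for w in grid]

# MSE through a sigmoid bends downward for sufficiently negative w here,
# while cross-entropy keeps positive curvature across the whole grid.
mse_has_negative_curvature = any(c < 0 for c in mse_curvature)
ce_is_convex_on_grid = all(c > 0 for c in ce_curvature)

print("MSE has negative curvature somewhere:", mse_has_negative_curvature)
print("Cross-entropy curvature positive on grid:", ce_is_convex_on_grid)
```

A grid check of this kind demonstrates non-convexity of the MSE for this one example; it does not by itself prove convexity of cross-entropy in general, which follows from its second derivative being a product of probabilities.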

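The arithmetic in the Q3 and Q10 solutions can be re-checked in a few lines. This is a minimal sketch using the values quoted in those solutions; the variable names are illustrative. The final sigmoid step is an extra assumption for illustration only: the worksheet's Q10 solution stops at the linear score.

```python
import math

# Q3: the fitted line y = a + b*x passes through the means, so b = (y_bar - a) / x_bar.
x_bar, y_bar, a = 2.50, 5.50, 1.50
b = (y_bar - a) / x_bar

# Q10: linear score w1*x1 + w2*x2 + bias with the given initial weights.
x = (4.0, 1.0)
w = (-0.015, -0.038)
bias = 0.0
linear_score = sum(wi * xi for wi, xi in zip(w, x)) + bias

# Assumption (not in the worksheet's solution): a logistic model would pass
# the linear score through the sigmoid to obtain a probability in (0, 1).
probability = 1.0 / (1.0 + math.exp(-linear_score))

print("Q3 slope b:", round(b, 2))
print("Q10 linear score:", round(linear_score, 3))
print("Q10 score rounded to one decimal:", round(linear_score, 1))
```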

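The property used in the Q11 solution, that OLS residuals sum to zero when the model has an intercept, can be illustrated on a small example. The data below are made up purely for illustration and are not from the worksheet; the fit uses the standard closed-form formulas for simple linear regression.

```python
# Hypothetical sample (assumed values, for illustration only).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(xs)

x_mean = sum(xs) / n
y_mean = sum(ys) / n

# Closed-form OLS estimates for y = beta0 + beta1 * x + u.
sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
sxx = sum((x - x_mean) ** 2 for x in xs)
beta1_hat = sxy / sxx
beta0_hat = y_mean - beta1_hat * x_mean

fitted = [beta0_hat + beta1_hat * x for x in xs]
residuals = [y - f for y, f in zip(ys, fitted)]

# With an intercept in the model, the residuals sum to zero (up to
# floating-point error), so the fitted values and the observed values
# have the same total.
print("sum of residuals:", sum(residuals))
print("sum of fitted - sum of observed:", sum(fitted) - sum(ys))
```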