
GATE

Machine Learning
DPP: 1
Regression
Q1 The parameters acquired through linear regression:
(A) can take any value in the real space
(B) are strictly integers
(C) always lie in the range [0, 1]
(D) can take only non-zero values

Q2 Which of the statements is/are true?
(A) Ridge has a sparsity constraint, and it will drive coefficients with low values to 0.
(B) Lasso has a closed form solution for the optimization problem, but this is not the case for Ridge.
(C) Ridge regression does not reduce the number of variables since it never leads a coefficient to zero but only minimizes it.
(D) If there are two or more highly collinear variables, Lasso will select one of them randomly.

Q3 The relation between studying time (in hours) and grade on the final examination (0-100) in a random sample of students in the Introduction to Machine Learning class was found to be Grade = 30.5 + 15.2(h). How will a student's grade be affected if she studies for four hours?
(A) It will go down by 30.4 points.
(B) It will go down by 30.4 points.
(C) It will go up by 60.8 points.
(D) The grade will remain unchanged.
(E) It cannot be determined from the information given.

Q4 Which of the following statements about principal components in Principal Component Regression (PCR) is true?
(A) Principal components are calculated based on the correlation matrix of the original predictors.
(B) The first principal component explains the largest proportion of the variation in the dependent variable.
(C) Principal components are linear combinations of the original predictors that are uncorrelated with each other.
(D) PCR selects the principal components with the highest p-values for inclusion in the regression model.
(E) PCR always results in a lower model complexity compared to ordinary least squares regression.

Q5 Which statement is true about outliers in linear regression?
(A) Linear regression model is not sensitive to outliers
(B) Linear regression model is sensitive to outliers
(C) Can't say
(D) None of these

Q6 What does the slope coefficient in a linear regression model indicate?
(A) The point where the regression line intersects the y-axis
(B) How much the dependent variable changes for every one-unit change in the independent variable
(C) The average value of the dependent variable
(D) The dispersion of the dependent variable

Q7 Find the mean squared error for the given predictions:
Y     F(X)
1     2
2     3
4     5
8     9
16    15
32    31

Hint: Find the squared error for each prediction and take the mean of those.
(A) 1    (B) 2
(C) 1.5  (D) 0

Q8 What is the primary assumption of linear regression regarding the relationship between the independent and dependent variables?
(A) Non-linearity
(B) Independence of errors
(C) Homoscedasticity
(D) Linearity

Q9 Which of the following statements is true regarding Partial Least Squares (PLS) regression?
(A) PLS is a dimensionality reduction technique that maximizes the covariance between the predictors and the dependent variable.
(B) PLS is only applicable when there is no multicollinearity among the independent variables.
(C) PLS can handle situations where the number of predictors is larger than the number of observations.
(D) PLS estimates the regression coefficients by minimizing the residual sum of squares.
(E) PLS is based on the assumption of normally distributed residuals.
(F) All of the above.
(G) None of the above.

Q10 The confidence interval is an interval which is an estimate of:
(A) The mean value of the dependent variable
(B) The standard deviation value of the dependent variable
(C) The mean value of the independent variable
(D) The standard deviation value of the independent variable

Q11 In the regression model (y = a + bx), where x̄ = 2.50, ȳ = 5.50 and a = 1.50 (x̄ and ȳ denote the means of variables x and y, and a is a constant), which one of the following values of parameter 'b' of the model is correct?
(A) 1.75  (B) 1.60
(C) 2.00  (D) 2.50

Q12 There is no value of x that can simultaneously satisfy both of the given equations. Therefore, find the 'least squares error' solution to the two equations, i.e., find the value of x that minimizes the sum of squares of the errors in the two equations. ____________
2x = 3
4x = 1

Q13 For a bivariate data set on (x, y), if the means, standard deviations and correlation coefficient are
x̄ = 1.0, ȳ = 2.0, sx = 3.0, sy = 9.0, r = 0.8,
then the regression line of y on x is:
(A) y = 1 + 2.4(x - 1)
(B) y = 2 + 0.27(x - 1)
(C) y = 2 + 2.4(x - 1)
(D) y = 1 + 0.27(x - 2)

Q14 What is the purpose of regularization in linear regression?
(A) To make the model more complex
(B) To avoid underfitting
(C) To encourage overfitting
(D) To reduce the complexity of the model

Q15 A set of observations of the independent variable (x) and the corresponding dependent variable (y) is given below:
X    5    2    4    3
Y    16   10   13   12

Based on the data, the coefficient a of the linear regression model y = a + bx is estimated as 6.1.
The coefficient b is ____________. (round off to one decimal place)

Q16 The purpose of using a dummy variable in the regression model is:
(A) Some of the independent variables are categorical data
(B) The dependent variable is categorical data
(C) Both independent and dependent variables may have categorical data values
(D) The dependent and independent variables must be numerical data values

Q17 The random errors e in the multiple linear regression model y = Xb + e are assumed to be identically and independently distributed following the normal distribution with zero mean and constant variance. Here y is an n × 1 vector of observations on the response variable, X is an n × K matrix of n observations on each of the K explanatory variables, b is a K × 1 vector of regression coefficients and e is an n × 1 vector of random errors. The residuals ε̂ = y − ŷ based on the ordinary least squares estimator of b have, in general:
(A) Zero mean, constant variance and are independent
(B) Zero mean, constant variance and are not independent
(C) Zero mean, non-constant variance and are not independent
(D) Non-zero mean, non-constant variance and are not independent

Q18 The linear regression model y = a0 + a1x1 + a2x2 + ... + apxp is to be fitted to a set of N training data points having p attributes each. Let X be the N×(p+1) matrix of input values (augmented by 1's), Y be the N×1 vector of target values, and θ be the (p+1)×1 vector of parameter values (a0, a1, a2, ..., ap). If the sum of squared errors is minimized to obtain the optimal regression model, which of the following equations holds?
(A) XᵀX = XY    (B) Xθ = XᵀY
(C) XᵀXθ = Y    (D) XᵀXθ = XᵀY

Q19 Use the regression equation to predict the glucose level given the age. Consider the following data set for understanding the concept of a linear regression numerical example with one independent variable.

SUBJECT   AGE X   GLUCOSE LEVEL Y
1         43      99
2         21      65
3         25      79
4         42      75
5         57      87
6         59      81
7         55      ?

Q20 In linear regression, what is the primary difference between Lasso (L1 regularization) and Ridge (L2 regularization)?
(A) Lasso tends to produce sparse coefficient vectors, while Ridge does not.
(B) Ridge tends to produce sparse coefficient vectors, while Lasso does not.
(C) Both Lasso and Ridge produce sparse coefficient vectors.
(D) Both Lasso and Ridge tend to produce non-sparse coefficient vectors.

Q21 Consider the following set of points: {(-2, -1), (1, 1), (3, 2)}
(a) Find the least squares regression line for the given data points.
(b) Plot the given points and the regression line in the same rectangular system of axes.

Q22 In the table below, the xi column shows scores on the aptitude test. Similarly, the yi column shows statistics grades. The last two columns show deviation scores, i.e. the difference between the student's score and the average score on each measurement. The last two rows show sums and mean scores. Find the regression equation.

Student   xi    yi    (xi − x̄)²   (yi − ȳ)²
1         95    85    289         64
2         85    95    49          324
3         80    70    4           49
4         70    65    64          144
5         60    70    324         49
Sum       390   385   730         630
Mean      78    77

Q23 What does (x(5), y(5)) represent or imply?
(A) There are 5 training examples
(B) The values of x and y are 5
(C) The fourth training example
(D) The fifth training example

Q24 The values of x and their corresponding values of y are shown in the table below.

X   0   1   2   3   4
y   2   3   5   4   6

(a) Find the least squares regression line y = ax + b.
(b) Estimate the value of y when x = 10.

Q25 In the context of linear regression, what is the purpose of the F-test?
(A) To determine the significance of individual coefficients.
(B) To test the overall significance of the regression model.
(C) To assess the presence of multicollinearity among independent variables.
(D) To evaluate the normality of residuals.

Q26 When performing linear regression, multicollinearity can be problematic. Which of the following statements about multicollinearity is true?
(A) Multicollinearity occurs when there is no correlation between independent variables.
(B) Multicollinearity makes it easier to interpret the individual coefficients in the regression model.
(C) Multicollinearity inflates the standard errors of the regression coefficients.
(D) Multicollinearity always improves the predictive performance of the model.

Q27 What is heteroscedasticity, and how does it affect the assumptions of linear regression?
(A) Heteroscedasticity refers to the presence of outliers in the dataset, violating the assumption of linearity.
(B) Heteroscedasticity refers to the non-constant variance of residuals, violating the assumption of homoscedasticity.
(C) Heteroscedasticity occurs when there is perfect multicollinearity among independent variables, violating the assumption of independence.
(D) Heteroscedasticity refers to the presence of correlated errors, violating the assumption of normality.

Q28 When should one prefer ridge regression over lasso regression?
(A) When the goal is to select a subset of important predictors.
(B) When the coefficients of irrelevant predictors should be exactly zero.
(C) When there is multicollinearity among the independent variables.
(D) When the dataset has a large number of observations.

Answer Key
Q1 (A) Q15 (1.9)

Q2 (C), (D) Q16 (A)

Q3 (C) Q17 (C)

Q4 (C) Q18 (D)

Q5 (B) Q19 86.327

Q6 (B) Q20 (A)

Q7 (A) Q21 y = (23/38)x + 5/19

Q8 (D) Q22 ŷ = 26.768 + 0.644x

Q9 (C) Q23 (D)

Q10 (A) Q24 11.2

Q11 (B) Q25 (B)

Q12 (0.5) Q26 (C)

Q13 (C) Q27 (B)

Q14 (D) Q28 (C)

Hints & Solutions

Q1 Text Solution:
1. "The parameters obtained in linear regression can take any value in the real space": This statement is true. In linear regression, the model's parameters (coefficients) can take any real value, including positive, negative, or zero.
2. "Are strictly integers": This statement is not true in the context of typical linear regression. In linear regression, the parameters are usually real numbers, not strictly integers. However, there are specialized variants of linear regression, such as integer linear regression or integer programming, that can deal with parameters required to be integers.
3. "Always lie in the range [0, 1]": This statement is not true. In standard linear regression, the parameters are not constrained to lie within the range [0, 1]. They can take values anywhere on the real number line.
4. "Can take only non-zero values": This statement is not true. In linear regression, the parameters can take any real value, including zero.

Q2 Text Solution:
1. "Ridge has a sparsity constraint, and it will drive coefficients with low values to 0."
False. Ridge regression adds an L2 regularization term to the linear regression cost function, which imposes a penalty on the magnitude of the coefficients. As the regularization strength increases, coefficients with low values are shrunk closer to zero, but they are not set exactly to zero, so Ridge does not impose a sparsity constraint.
2. "Lasso has a closed form solution for the optimization problem, but this is not the case for Ridge."
False. It is the other way around: Ridge regression has a closed-form solution, b = (XᵀX + λI)⁻¹Xᵀy, while the L1 penalty used by Lasso is not differentiable at zero, so the Lasso problem is generally solved iteratively (for example, by coordinate descent) and has no general closed-form solution.
3. "Ridge regression does not reduce the number of variables since it never leads a coefficient to zero but only minimizes it."
True. Ridge regression tends to shrink the coefficients towards zero but does not set them exactly to zero. Consequently, Ridge does not perform variable selection or reduce the number of variables, as all the predictors remain in the model, albeit with smaller weights.
4. "If there are two or more highly collinear variables, Lasso will select one of them randomly."
True. In situations of high collinearity, Lasso regularization may select one of the correlated variables to include in the model while driving the coefficients of the others to exactly zero. The choice of which variable is kept and which ones are eliminated may vary depending on the algorithm or software implementation used.
In summary, statements (C) and (D) are true.

Q3 Text Solution:
To calculate how a student's grade will be affected if she studies for four hours, we can use the given regression equation:
Grade = 30.5 + 15.2(h)
where "h" is the studying time in hours.
To find the grade for studying four hours (h = 4):
Grade = 30.5 + 15.2(4)
Grade = 30.5 + 60.8
Grade = 91.3
So, if the student studies for four hours, her grade will be 91.3 points.
The correct answer is: "It will go up by 60.8 points."
Q4 Text Solution:
The true statement about principal components in Principal Component Regression (PCR) is: "Principal components are linear combinations of the original predictors that are uncorrelated with each other."
1. Principal components are calculated based on the correlation matrix of the original predictors.
False. Principal components are calculated based on the covariance matrix (or, equivalently, the correlation matrix after standardization) of the original predictors, not the correlation matrix directly.
2. The first principal component explains the largest proportion of the variation in the dependent variable.
False. The first principal component explains the largest proportion of the variation in the predictors, not the dependent variable. It captures the direction of maximum variance in the predictor space.
3. Principal components are linear combinations of the original predictors that are uncorrelated with each other.
True. Principal components are linear combinations of the original predictors that are constructed in such a way that they are uncorrelated with each other. Each principal component represents a unique orthogonal direction in the predictor space.
4. PCR selects the principal components with the highest p-values for inclusion in the regression model.
False. PCR does not involve p-values or hypothesis testing. It is a dimensionality reduction technique that aims to reduce multicollinearity and model complexity by selecting a subset of the principal components that capture most of the variance in the predictors.
5. PCR always results in a lower model complexity compared to ordinary least squares regression.
False. PCR can result in a lower model complexity compared to ordinary least squares (OLS) regression when a small number of principal components are retained. However, if all principal components are used in PCR, the model complexity can be similar to the full OLS regression model.

Q5 Text Solution:
The slope of the regression line will change due to outliers in most of the cases.

Q6 Text Solution:
In a linear regression model, the slope coefficient represents the rate of change in the dependent variable (Y) for each one-unit change in the independent variable (X). Specifically, it indicates how much the predicted value of the dependent variable changes for every one-unit increase (or decrease) in the independent variable, holding all other variables constant.

Q7 Text Solution:
1. Calculate the squared error for each prediction, which is the square of the difference between each predicted value F(x) and the corresponding true value y.
Given predictions: y = [1, 2, 4, 8, 16, 32], F(x) = [2, 3, 5, 9, 15, 31]
Squared error for each prediction:
Prediction 1: (2 − 1)² = 1
Prediction 2: (3 − 2)² = 1
Prediction 3: (5 − 4)² = 1
Prediction 4: (9 − 8)² = 1
Prediction 5: (15 − 16)² = 1
Prediction 6: (31 − 32)² = 1
2. Calculate the mean squared error by taking the sum of squared errors and dividing by the number of predictions (samples):
Mean squared error (MSE) = (1 + 1 + 1 + 1 + 1 + 1) / 6 = 6 / 6 = 1
So, the mean squared error for the given predictions is 1.
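The same computation can be reproduced with a short NumPy sketch (a verification aid, not part of the original solution):

```python
import numpy as np

# True values and predictions from Q7
y = np.array([1, 2, 4, 8, 16, 32])
f_x = np.array([2, 3, 5, 9, 15, 31])

# Mean squared error: average of the squared differences
mse = np.mean((f_x - y) ** 2)
print(mse)  # 1.0
```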
Q8 Text Solution:
The primary assumption of linear regression is that there exists a linear relationship between the independent variables (predictors) and the dependent variable (outcome). This means that the change in the dependent variable is proportional to the change in the independent variables. The model assumes that the relationship can be described by a straight line.

Q9 Text Solution:
1. PLS is a dimensionality reduction technique that maximizes the covariance between the predictors and the dependent variable.
True. PLS aims to find a low-dimensional latent space that maximizes the covariance between the predictors (independent variables) and the dependent variable while considering their relationship.
2. PLS is only applicable when there is no multicollinearity among the independent variables.
False. Unlike traditional multiple linear regression, PLS can handle multicollinearity among the independent variables. It deals with multicollinearity by creating latent variables (components) that are linear combinations of the original predictors.
3. PLS can handle situations where the number of predictors is larger than the number of observations.
True. PLS is particularly useful when dealing with high-dimensional datasets, where the number of predictors (independent variables) is larger than the number of observations. It can effectively reduce the dimensionality and handle the "small n, large p" problem.
4. PLS estimates the regression coefficients by minimizing the residual sum of squares.
False. PLS estimates the regression coefficients by maximizing the covariance between the predictors and the dependent variable, not by minimizing the residual sum of squares as in ordinary least squares (OLS) regression.
5. PLS is based on the assumption of normally distributed residuals.
False. PLS does not assume normally distributed residuals. It is a non-parametric method and makes fewer assumptions about the underlying data distribution compared to linear regression.
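The "more predictors than observations" point (statement 3) can be illustrated with scikit-learn's PLSRegression. The sketch below uses made-up data; the sample sizes, coefficients and number of components are arbitrary assumptions:

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)
n, p = 20, 100                         # fewer observations than predictors (p > n)
X = rng.normal(size=(n, p))
y = X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=n)

# PLS extracts a few latent components that maximize covariance with y,
# so it stays well-defined even though plain OLS is under-determined here.
pls = PLSRegression(n_components=2)
pls.fit(X, y)
print(pls.score(X, y))                 # in-sample R^2 of the two-component fit
```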
Q10 Text Solution:
(A) The mean value of the dependent variable: This is correct. The confidence interval estimates the range within which the true mean of the dependent variable is likely to fall.
(B) The standard deviation value of the dependent variable: This is incorrect. The confidence interval is not typically used to estimate the standard deviation of the dependent variable. Instead, it estimates the mean of the dependent variable.
(C) The mean value of the independent variable: This is incorrect. The confidence interval is related to estimating the mean or other parameters of the dependent variable, not the independent variable.
(D) The standard deviation value of the independent variable: This is incorrect for the same reason as option (B). The confidence interval is not typically used to estimate the standard deviation of the independent variable.

Q11 Text Solution:
y = a + bx
where x̄ = 2.50, ȳ = 5.50 and a = 1.50 (x̄ and ȳ denote the means of variables x and y, and a is a constant).
Since the least squares regression line passes through the point of means (x̄, ȳ), putting the values into the formula:
5.50 = 1.50 + b × 2.50
b × 2.50 = 4
b = 4 / 2.5 = 1.60
Therefore, (B) is the correct answer.

Q12 Text Solution:
The given equations are:
2x = 3 ⇒ 2x − 3 = 0 and 4x = 1 ⇒ 4x − 1 = 0
∴ R = (2x − 3)² + (4x − 1)²
Hence, to minimize the value of R, set dR/dx = 0:
dR/dx = 2 × 2(2x − 3) + 4 × 2(4x − 1) = 0
∴ x = 1/2 and R_min = (2 × 1/2 − 3)² + (4 × 1/2 − 1)² = 5
∴ The value of x that minimizes the sum of squares of the errors in the two equations is 1/2.
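The same value can be checked numerically by treating the two equations as an overdetermined linear system (a small verification sketch, not part of the original solution):

```python
import numpy as np

# Overdetermined system: 2x = 3 and 4x = 1 have no common solution.
A = np.array([[2.0], [4.0]])
b = np.array([3.0, 1.0])

# np.linalg.lstsq returns the x that minimizes ||Ax - b||^2
x, residual, *_ = np.linalg.lstsq(A, b, rcond=None)
print(x[0])         # 0.5
print(residual[0])  # 5.0, the minimum sum of squared errors
```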
Q13 Text Solution:
According to the question, the regression line of y on x is
y − ȳ = r (sy / sx)(x − x̄)
y − 2 = 0.8 × (9 / 3)(x − 1)
⇒ y − 2 = 2.4(x − 1)
y = 2 + 2.4(x − 1)
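A quick numerical check of the slope and intercept (verification only, not part of the original solution):

```python
# Regression of y on x from the summary statistics in Q13
x_bar, y_bar = 1.0, 2.0
s_x, s_y, r = 3.0, 9.0, 0.8

slope = r * s_y / s_x                # 0.8 * 9.0 / 3.0 = 2.4
intercept = y_bar - slope * x_bar    # 2.0 - 2.4 * 1.0 = -0.4
print(f"y = {y_bar} + {round(slope, 2)}(x - {x_bar})")  # matches option (C)
```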
Q14 Text Solution:
The purpose of regularization in linear regression is to prevent overfitting. Regularization achieves this by penalizing overly complex models and encouraging simpler models that generalize better to new data, i.e. it reduces the complexity of the model.

Q15 Text Solution:
Given data and calculation:

x    y    x²    xy
5    16   25    80
2    10   4     20
4    13   16    52
3    12   9     36

Σx = 14, Σy = 51, Σx² = 54, Σxy = 188, n = 4.
So the normal equations are:
51 = 4a + 14b
188 = 14a + 54b
Solving the above two equations gives a = 6.1 and b = 1.9.

Q16 Text Solution:
This is the correct purpose of using a dummy variable in a regression model. When you have categorical variables (variables that represent categories or groups), such as gender, ethnicity, or geographic region, in your dataset, you need to convert them into a numerical format to include them in a regression model. Dummy variables are created to represent the different categories of the categorical variable.

Q17 Text Solution:
ε̂ = y − ŷ
= y − Xb, where b = (X′X)⁻¹X′y
= (I − H)y, where H = X(X′X)⁻¹X′
= (I − H)ε
E(ε̂) = 0
V(ε̂) = σ²(I − H)
Since E(ε̂) = 0, the ε̂ᵢ's have zero mean.
Since I − H is not in general a diagonal matrix, the ε̂ᵢ's do not necessarily have the same variance.
The off-diagonal elements of (I − H) are not zero in general, so the ε̂ᵢ's are not independent.
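Both properties can be seen numerically by building the hat matrix for a small design matrix. The sketch below uses made-up data; the sample size and regressor values are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # n x 2 design with intercept

# Hat matrix H = X (X'X)^{-1} X'; residual covariance is sigma^2 (I - H)
H = X @ np.linalg.inv(X.T @ X) @ X.T
M = np.eye(n) - H

print(np.round(np.diag(M), 3))   # unequal diagonal -> residual variances differ
print(np.round(M[0, 1:], 3))     # non-zero off-diagonals -> residuals are correlated
```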
Q18 Text Solution:
To minimize the sum of squared errors and obtain the optimal linear regression model, we need to solve for the parameter vector θ. The equation that holds in this context is:
XᵀXθ = XᵀY
where
X is the N×(p+1) matrix of input values (augmented by 1's) with N data points and p attributes each,
Y is the N×1 vector of target values (the dependent variable), and
θ is the (p+1)×1 vector of parameter values (a0, a1, a2, ..., ap).
To understand why this equation holds, let's briefly describe the steps of linear regression. The goal is to find the parameter vector θ that minimizes the sum of squared errors (SSE). The SSE is given by:
SSE(θ) = (Y − Xθ)ᵀ(Y − Xθ)
where (Y − Xθ) is the vector of residuals (the difference between the actual target values Y and the predicted values Xθ).
To find the optimal θ, we take the derivative of SSE with respect to θ and set it to zero. The solution that satisfies this condition is:
θ = (XᵀX)⁻¹XᵀY
Rearranging this gives XᵀXθ = XᵀY, which is option (D). Substituting the optimal θ back into the SSE expression gives the minimum value SSE(θ̂) = (Y − X(XᵀX)⁻¹XᵀY)ᵀ(Y − X(XᵀX)⁻¹XᵀY).
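The normal equation can be solved directly in code; the sketch below uses made-up data (the values and shapes are assumptions) and checks the result against np.linalg.lstsq:

```python
import numpy as np

# Made-up training set with N = 5 points and p = 2 attributes
x_raw = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]])
y = np.array([6.0, 5.0, 11.0, 10.0, 15.0])

X = np.column_stack([np.ones(len(x_raw)), x_raw])   # augment with a column of 1's

# Solve the normal equation X^T X theta = X^T y (option D in Q18)
theta = np.linalg.solve(X.T @ X, X.T @ y)
theta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(theta, theta_lstsq))   # True: both give the least squares fit
```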
Q19 Text Solution:
b0 = [(Σy)(Σx²) − (Σx)(Σxy)] / [n(Σx²) − (Σx)²]
b1 = [n(Σxy) − (Σx)(Σy)] / [n(Σx²) − (Σx)²]

SUBJECT   AGE X   GLUCOSE LEVEL Y   XY      X²      Y²
1         43      99                4257    1849    9801
2         21      65                1365    441     4225
3         25      79                1975    625     6241
4         42      75                3150    1764    5625
5         57      87                4959    3249    7569
6         59      81                4779    3481    6561
Σ         247     486               20485   11409   40022

Find b0:
b0 = [(486)(11409) − (247)(20485)] / [6(11409) − (247)²]
b0 = 484979 / 7445 = 65.14
Find b1:
b1 = [6(20485) − (247)(486)] / [6(11409) − (247)²]
b1 = 2868 / 7445 = 0.385225
Insert the values into the equation:
y′ = b0 + b1·x
y′ = 65.14 + 0.385225·x
Prediction, i.e. the value of y for the given value x = 55:
y′ = 65.14 + (0.385225 × 55)
y′ = 86.327
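The same fit can be reproduced directly from the six (age, glucose) pairs (a verification sketch, not part of the original solution):

```python
import numpy as np

age = np.array([43, 21, 25, 42, 57, 59], dtype=float)
glucose = np.array([99, 65, 79, 75, 87, 81], dtype=float)

# Fit glucose = b0 + b1 * age by least squares
b1, b0 = np.polyfit(age, glucose, deg=1)
print(round(b0, 2), round(b1, 6))   # about 65.14 and 0.385225
print(round(b0 + b1 * 55, 3))       # about 86.33 at age 55 (86.327 when the
                                    # rounded coefficients are used, as in the solution)
```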

Q20 Text Solution:
Lasso tends to produce sparse coefficient vectors, while Ridge does not. Lasso regularization includes an L1 penalty term that encourages some coefficients to be exactly zero, leading to sparsity in the coefficient vector, while Ridge regularization (L2 penalty) does not force coefficients to be exactly zero.
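The difference is easy to observe empirically with scikit-learn. The sketch below uses synthetic data; the data, penalty strengths and seed are arbitrary assumptions:

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(42)
X = rng.normal(size=(100, 10))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.5, size=100)  # only 2 informative features

lasso = Lasso(alpha=0.5).fit(X, y)
ridge = Ridge(alpha=0.5).fit(X, y)

# The L1 penalty drives irrelevant coefficients exactly to zero; L2 only shrinks them.
print("Lasso zero coefficients:", int(np.sum(lasso.coef_ == 0)))   # typically 8 of 10
print("Ridge zero coefficients:", int(np.sum(ridge.coef_ == 0)))   # typically 0
```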
Q21 Text Solution:
(a)

x     y     xy    x²
-2    -1    2     4
1     1     1     1
3     2     6     9

Σx = 2, Σy = 2, Σxy = 9, Σx² = 14, n = 3
a = [nΣxy − (Σx)(Σy)] / [nΣx² − (Σx)²] = (27 − 4) / (42 − 4) = 23/38
b = (Σy − aΣx) / n = (2 − 2 × 23/38) / 3 = 5/19
(b) Now graph the regression line given by y = ax + b together with the given points.

Q22 Text Solution:

Student   xi    yi    (xi − x̄)(yi − ȳ)
1         95    85    136
2         85    95    126
3         80    70    -14
4         70    65    96
5         60    70    126
Sum       390   385   470
Mean      78    77

The regression equation is a linear equation of the form ŷ = b0 + b1x. To conduct a regression analysis, we need to solve for b0 and b1.
Computations:
First, we solve for the regression coefficient (b1):
b1 = Σ[(xi − x̄)(yi − ȳ)] / Σ[(xi − x̄)²]
b1 = 470 / 730
b1 = 0.644
Once we know the value of the regression coefficient (b1), we can solve for the regression intercept (b0):
b0 = ȳ − b1·x̄
b0 = 77 − (0.644)(78)
b0 = 26.768
Therefore, the regression equation is ŷ = 26.768 + 0.644x.

Q23 Text Solution:
In a linear regression model, the pair (x(i), y(i)) represents the ith example in the training set: x(i) gives the ith value of x, and y(i) gives the ith value of y. So (x(5), y(5)) is the fifth training example.
Q24 Text Solution:
(a)

x    y    xy    x²
0    2    0     0
1    3    3     1
2    5    10    4
3    4    12    9
4    6    24    16

Σx = 10, Σy = 20, Σxy = 49, Σx² = 30
We now calculate a and b using the least squares regression formulas for a and b:
a = 0.9, b = 2.2
(b) Now that we have the least squares regression line y = 0.9x + 2.2, substitute x = 10 to find the value of the corresponding y:
y = 0.9 × 10 + 2.2 = 11.2
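A quick numerical check of both parts (verification only, not part of the original solution):

```python
import numpy as np

x = np.array([0, 1, 2, 3, 4], dtype=float)
y = np.array([2, 3, 5, 4, 6], dtype=float)

a, b = np.polyfit(x, y, deg=1)   # least squares line y = a*x + b
print(a, b)                      # 0.9 and 2.2
print(a * 10 + b)                # 11.2, the estimate at x = 10
```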
Q25 Text Solution:
To test the overall significance of the regression model. The F-test is used to determine whether the regression model as a whole is statistically significant. It compares the variance explained by the model to the variance not explained by the model.

Q26 Text Solution:
Multicollinearity inflates the standard errors of the regression coefficients. Multicollinearity refers to the situation where independent variables in a regression model are highly correlated with each other. It can cause instability in the estimation of coefficients and inflate their standard errors, making the interpretation of individual coefficients less reliable.

Q27 Text Solution:
Heteroscedasticity refers to the non-constant variance of residuals, violating the assumption of homoscedasticity. In linear regression, homoscedasticity assumes that the variance of the residuals is constant across all levels of the independent variables. Heteroscedasticity violates this assumption by causing the variance of residuals to vary systematically with the independent variables.

Q28 Text Solution:
When there is multicollinearity among the independent variables. Ridge regression is particularly useful when multicollinearity is present among the independent variables because it shrinks the coefficients towards zero without setting them exactly to zero. Lasso regression, on the other hand, can set coefficients exactly to zero, which may not be desirable when multicollinearity is present.
Therefore, ridge regression is preferred in such situations.
